2021-04-19 16:17:58

by Florent Revest

[permalink] [raw]
Subject: [PATCH bpf-next v5 0/6] Add a snprintf eBPF helper

We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/[email protected]/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
string is known at verification time, the format string validation is
done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/[email protected]/T/#u

---
Changes in v5:
- Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
- Added a couple of extra incorrect specifiers tests
- Call test_snprintf_single__destroy unconditionally
- Fixed a C++-style comment

---
Changes in v4:
- Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
- Added negative test cases (various invalid format strings)
- Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
- Fixed a mistake that caused temporary buffers to be unconditionally
freed in bpf_printf_prepare
- Fixed a mistake that caused missing 0 character to be ignored
- Fixed a warning about integer to pointer conversion
- Misc cleanups

---
Changes in v3:
- Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
- Made zero-termination check more consistent
- Allowed NULL output_buffer
- Simplified the BPF_CAST_FMT_ARG macro
- Three new test cases: number padding, simple string with no arg and
string length extraction only with a NULL output buffer
- Clarified helper's description for edge cases (eg: str_size == 0)
- Lots of cosmetic changes

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes

Florent Revest (6):
bpf: Factorize bpf_trace_printk and bpf_seq_printf
bpf: Add a ARG_PTR_TO_CONST_STR argument type
bpf: Add a bpf_snprintf helper
libbpf: Initialize the bpf_seq_printf parameters array field by field
libbpf: Introduce a BPF_SNPRINTF helper macro
selftests/bpf: Add a series of tests for bpf_snprintf

include/linux/bpf.h | 22 ++
include/uapi/linux/bpf.h | 28 ++
kernel/bpf/helpers.c | 306 ++++++++++++++
kernel/bpf/verifier.c | 82 ++++
kernel/trace/bpf_trace.c | 373 ++----------------
tools/include/uapi/linux/bpf.h | 28 ++
tools/lib/bpf/bpf_tracing.h | 58 ++-
.../selftests/bpf/prog_tests/snprintf.c | 125 ++++++
.../selftests/bpf/progs/test_snprintf.c | 73 ++++
.../bpf/progs/test_snprintf_single.c | 20 +
10 files changed, 770 insertions(+), 345 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c

--
2.31.1.368.gbe11c130af-goog


2021-04-19 16:18:08

by Florent Revest

[permalink] [raw]
Subject: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

The "positive" part tests all format specifiers when things go well.

The "negative" part makes sure that incorrect format strings fail at
load time.

Signed-off-by: Florent Revest <[email protected]>
---
.../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
.../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
.../bpf/progs/test_snprintf_single.c | 20 +++
3 files changed, 218 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index 000000000000..a958c22aec75
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include <test_progs.h>
+#include "test_snprintf.skel.h"
+#include "test_snprintf_single.skel.h"
+
+#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
+
+#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
+#define EXP_IP_RET sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT "schedule schedule+0x0/"
+#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot */
+#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT "str1 longstr"
+#define EXP_STR_RET sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT "%over"
+#define EXP_OVER_RET 10
+
+#define EXP_PAD_OUT " 4 000"
+#define EXP_PAD_RET 900007
+
+#define EXP_NO_ARG_OUT "simple case"
+#define EXP_NO_ARG_RET 12
+
+#define EXP_NO_BUF_RET 29
+
+void test_snprintf_positive(void)
+{
+ char exp_addr_out[] = EXP_ADDR_OUT;
+ char exp_sym_out[] = EXP_SYM_OUT;
+ struct test_snprintf *skel;
+
+ skel = test_snprintf__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "skel_open"))
+ return;
+
+ if (!ASSERT_OK(test_snprintf__attach(skel), "skel_attach"))
+ goto cleanup;
+
+ /* trigger tracepoint */
+ usleep(1);
+
+ ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+ ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+ ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+ ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+ ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+ sizeof(exp_sym_out) - 1), "sym_out");
+ ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+ ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+ sizeof(exp_addr_out) - 1), "addr_out");
+ ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+ ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+ ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+ ASSERT_STREQ(skel->bss->over_out, EXP_OVER_OUT, "over_out");
+ ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+ ASSERT_STREQ(skel->bss->pad_out, EXP_PAD_OUT, "pad_out");
+ ASSERT_EQ(skel->bss->pad_ret, EXP_PAD_RET, "pad_ret");
+
+ ASSERT_STREQ(skel->bss->noarg_out, EXP_NO_ARG_OUT, "no_arg_out");
+ ASSERT_EQ(skel->bss->noarg_ret, EXP_NO_ARG_RET, "no_arg_ret");
+
+ ASSERT_EQ(skel->bss->nobuf_ret, EXP_NO_BUF_RET, "no_buf_ret");
+
+cleanup:
+ test_snprintf__destroy(skel);
+}
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+/* Loads an eBPF object calling bpf_snprintf with up to 10 characters of fmt */
+static int load_single_snprintf(char *fmt)
+{
+ struct test_snprintf_single *skel;
+ int ret;
+
+ skel = test_snprintf_single__open();
+ if (!skel)
+ return -EINVAL;
+
+ memcpy(skel->rodata->fmt, fmt, min(strlen(fmt) + 1, 10));
+
+ ret = test_snprintf_single__load(skel);
+ test_snprintf_single__destroy(skel);
+
+ return ret;
+}
+
+void test_snprintf_negative(void)
+{
+ ASSERT_OK(load_single_snprintf("valid %d"), "valid usage");
+
+ ASSERT_ERR(load_single_snprintf("0123456789"), "no terminating zero");
+ ASSERT_ERR(load_single_snprintf("%d %d"), "too many specifiers");
+ ASSERT_ERR(load_single_snprintf("%pi5"), "invalid specifier 1");
+ ASSERT_ERR(load_single_snprintf("%a"), "invalid specifier 2");
+ ASSERT_ERR(load_single_snprintf("%"), "invalid specifier 3");
+ ASSERT_ERR(load_single_snprintf("%12345678"), "invalid specifier 4");
+ ASSERT_ERR(load_single_snprintf("%--------"), "invalid specifier 5");
+ ASSERT_ERR(load_single_snprintf("\x80"), "non ascii character");
+ ASSERT_ERR(load_single_snprintf("\x1"), "non printable character");
+}
+
+void test_snprintf(void)
+{
+ if (test__start_subtest("snprintf_positive"))
+ test_snprintf_positive();
+ if (test__start_subtest("snprintf_negative"))
+ test_snprintf_negative();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_snprintf.c b/tools/testing/selftests/bpf/progs/test_snprintf.c
new file mode 100644
index 000000000000..951a0301c553
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_snprintf.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char num_out[64] = {};
+long num_ret = 0;
+
+char ip_out[64] = {};
+long ip_ret = 0;
+
+char sym_out[64] = {};
+long sym_ret = 0;
+
+char addr_out[64] = {};
+long addr_ret = 0;
+
+char str_out[64] = {};
+long str_ret = 0;
+
+char over_out[6] = {};
+long over_ret = 0;
+
+char pad_out[10] = {};
+long pad_ret = 0;
+
+char noarg_out[64] = {};
+long noarg_ret = 0;
+
+long nobuf_ret = 0;
+
+extern const void schedule __ksym;
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+ /* Convenient values to pretty-print */
+ const __u8 ex_ipv4[] = {127, 0, 0, 1};
+ const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
+ static const char str1[] = "str1";
+ static const char longstr[] = "longstr";
+
+ /* Integer types */
+ num_ret = BPF_SNPRINTF(num_out, sizeof(num_out),
+ "%d %u %x %li %llu %lX",
+ -8, 9, 150, -424242, 1337, 0xDABBAD00);
+ /* IP addresses */
+ ip_ret = BPF_SNPRINTF(ip_out, sizeof(ip_out), "%pi4 %pI6",
+ &ex_ipv4, &ex_ipv6);
+ /* Symbol lookup formatting */
+ sym_ret = BPF_SNPRINTF(sym_out, sizeof(sym_out), "%ps %pS %pB",
+ &schedule, &schedule, &schedule);
+ /* Kernel pointers */
+ addr_ret = BPF_SNPRINTF(addr_out, sizeof(addr_out), "%pK %px %p",
+ 0, 0xFFFF00000ADD4E55, 0xFFFF00000ADD4E55);
+ /* Strings embedding */
+ str_ret = BPF_SNPRINTF(str_out, sizeof(str_out), "%s %+05s",
+ str1, longstr);
+ /* Overflow */
+ over_ret = BPF_SNPRINTF(over_out, sizeof(over_out), "%%overflow");
+ /* Padding of fixed width numbers */
+ pad_ret = BPF_SNPRINTF(pad_out, sizeof(pad_out), "%5d %0900000X", 4, 4);
+ /* No args */
+ noarg_ret = BPF_SNPRINTF(noarg_out, sizeof(noarg_out), "simple case");
+ /* No buffer */
+ nobuf_ret = BPF_SNPRINTF(NULL, 0, "only interested in length %d", 60);
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_snprintf_single.c b/tools/testing/selftests/bpf/progs/test_snprintf_single.c
new file mode 100644
index 000000000000..402adaf344f9
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_snprintf_single.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+/* The format string is filled from the userspace such that loading fails */
+static const char fmt[10];
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+ unsigned long long arg = 42;
+
+ bpf_snprintf(NULL, 0, fmt, &arg, sizeof(arg));
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.31.1.368.gbe11c130af-goog

2021-04-19 16:18:10

by Florent Revest

[permalink] [raw]
Subject: [PATCH bpf-next v5 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

The implementation of the two existing helpers already drifted quite a
bit so unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formating fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

The implementation is also moved from bpf_trace.c to helpers.c because
the upcoming bpf_snprintf helper will be made available to all BPF
programs and will need it.

Signed-off-by: Florent Revest <[email protected]>
---
include/linux/bpf.h | 20 +++
kernel/bpf/helpers.c | 256 +++++++++++++++++++++++++++
kernel/trace/bpf_trace.c | 371 ++++-----------------------------------
3 files changed, 313 insertions(+), 334 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ff8cd68c01b3..77d1d8c65b81 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2077,4 +2077,24 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
struct btf_id_set;
bool btf_id_set_contains(const struct btf_id_set *set, u32 id);

+enum bpf_printf_mod_type {
+ BPF_PRINTF_INT,
+ BPF_PRINTF_LONG,
+ BPF_PRINTF_LONG_LONG,
+};
+
+/* Workaround for getting va_list handling working with different argument type
+ * combinations generically for 32 and 64 bit archs.
+ */
+#define BPF_CAST_FMT_ARG(arg_nb, args, mod) \
+ (mod[arg_nb] == BPF_PRINTF_LONG_LONG || \
+ (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64) \
+ ? (u64)args[arg_nb] \
+ : (u32)args[arg_nb])
+
+int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
+ u64 *final_args, enum bpf_printf_mod_type *mod,
+ u32 num_args);
+void bpf_printf_cleanup(void);
+
#endif /* _LINUX_BPF_H */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f306611c4ddf..9ca57eb1fc0d 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -669,6 +669,262 @@ const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
.arg1_type = ARG_PTR_TO_PERCPU_BTF_ID,
};

+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
+ size_t bufsz)
+{
+ void __user *user_ptr = (__force void __user *)unsafe_ptr;
+
+ buf[0] = 0;
+
+ switch (fmt_ptype) {
+ case 's':
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ if ((unsigned long)unsafe_ptr < TASK_SIZE)
+ return strncpy_from_user_nofault(buf, user_ptr, bufsz);
+ fallthrough;
+#endif
+ case 'k':
+ return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
+ case 'u':
+ return strncpy_from_user_nofault(buf, user_ptr, bufsz);
+ }
+
+ return -EINVAL;
+}
+
+/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+ */
+#define MAX_PRINTF_BUF_LEN 512
+
+struct bpf_printf_buf {
+ char tmp_buf[MAX_PRINTF_BUF_LEN];
+};
+static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
+static DEFINE_PER_CPU(int, bpf_printf_buf_used);
+
+static int try_get_fmt_tmp_buf(char **tmp_buf)
+{
+ struct bpf_printf_buf *bufs;
+ int used;
+
+ if (*tmp_buf)
+ return 0;
+
+ preempt_disable();
+ used = this_cpu_inc_return(bpf_printf_buf_used);
+ if (WARN_ON_ONCE(used > 1)) {
+ this_cpu_dec(bpf_printf_buf_used);
+ preempt_enable();
+ return -EBUSY;
+ }
+ bufs = this_cpu_ptr(&bpf_printf_buf);
+ *tmp_buf = bufs->tmp_buf;
+
+ return 0;
+}
+
+void bpf_printf_cleanup(void)
+{
+ if (this_cpu_read(bpf_printf_buf_used)) {
+ this_cpu_dec(bpf_printf_buf_used);
+ preempt_enable();
+ }
+}
+
+/*
+ * bpf_parse_fmt_str - Generic pass on format strings for printf-like helpers
+ *
+ * Returns a negative value if fmt is an invalid format string or 0 otherwise.
+ *
+ * This can be used in two ways:
+ * - Format string verification only: when final_args and mod are NULL
+ * - Arguments preparation: in addition to the above verification, it writes in
+ * final_args a copy of raw_args where pointers from BPF have been sanitized
+ * into pointers safe to use by snprintf. This also writes in the mod array
+ * the size requirement of each argument, usable by BPF_CAST_FMT_ARG for ex.
+ *
+ * In argument preparation mode, if 0 is returned, safe temporary buffers are
+ * allocated and bpf_printf_cleanup should be called to free them after use.
+ */
+int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
+ u64 *final_args, enum bpf_printf_mod_type *mod,
+ u32 num_args)
+{
+ char *unsafe_ptr = NULL, *tmp_buf = NULL, *fmt_end;
+ size_t tmp_buf_len = MAX_PRINTF_BUF_LEN;
+ int err, i, num_spec = 0, copy_size;
+ enum bpf_printf_mod_type cur_mod;
+ u64 cur_arg;
+ char fmt_ptype;
+
+ if (!!final_args != !!mod)
+ return -EINVAL;
+
+ fmt_end = strnchr(fmt, fmt_size, 0);
+ if (!fmt_end)
+ return -EINVAL;
+ fmt_size = fmt_end - fmt;
+
+ for (i = 0; i < fmt_size; i++) {
+ if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
+ err = -EINVAL;
+ goto cleanup;
+ }
+
+ if (fmt[i] != '%')
+ continue;
+
+ if (fmt[i + 1] == '%') {
+ i++;
+ continue;
+ }
+
+ if (num_spec >= num_args) {
+ err = -EINVAL;
+ goto cleanup;
+ }
+
+ /* The string is zero-terminated so if fmt[i] != 0, we can
+ * always access fmt[i + 1], in the worst case it will be a 0
+ */
+ i++;
+
+ /* skip optional "[0 +-][num]" width formatting field */
+ while (fmt[i] == '0' || fmt[i] == '+' || fmt[i] == '-' ||
+ fmt[i] == ' ')
+ i++;
+ if (fmt[i] >= '1' && fmt[i] <= '9') {
+ i++;
+ while (fmt[i] >= '0' && fmt[i] <= '9')
+ i++;
+ }
+
+ if (fmt[i] == 'p') {
+ cur_mod = BPF_PRINTF_LONG;
+
+ if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') &&
+ fmt[i + 2] == 's') {
+ fmt_ptype = fmt[i + 1];
+ i += 2;
+ goto fmt_str;
+ }
+
+ if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) ||
+ ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' ||
+ fmt[i + 1] == 'x' || fmt[i + 1] == 'B' ||
+ fmt[i + 1] == 's' || fmt[i + 1] == 'S') {
+ /* just kernel pointers */
+ if (final_args)
+ cur_arg = raw_args[num_spec];
+ goto fmt_next;
+ }
+
+ /* only support "%pI4", "%pi4", "%pI6" and "%pi6". */
+ if ((fmt[i + 1] != 'i' && fmt[i + 1] != 'I') ||
+ (fmt[i + 2] != '4' && fmt[i + 2] != '6')) {
+ err = -EINVAL;
+ goto cleanup;
+ }
+
+ i += 2;
+ if (!final_args)
+ goto fmt_next;
+
+ if (try_get_fmt_tmp_buf(&tmp_buf)) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ copy_size = (fmt[i + 2] == '4') ? 4 : 16;
+ if (tmp_buf_len < copy_size) {
+ err = -ENOSPC;
+ goto cleanup;
+ }
+
+ unsafe_ptr = (char *)(long)raw_args[num_spec];
+ err = copy_from_kernel_nofault(tmp_buf, unsafe_ptr,
+ copy_size);
+ if (err < 0)
+ memset(tmp_buf, 0, copy_size);
+ cur_arg = (u64)(long)tmp_buf;
+ tmp_buf += copy_size;
+ tmp_buf_len -= copy_size;
+
+ goto fmt_next;
+ } else if (fmt[i] == 's') {
+ cur_mod = BPF_PRINTF_LONG;
+ fmt_ptype = fmt[i];
+fmt_str:
+ if (fmt[i + 1] != 0 &&
+ !isspace(fmt[i + 1]) &&
+ !ispunct(fmt[i + 1])) {
+ err = -EINVAL;
+ goto cleanup;
+ }
+
+ if (!final_args)
+ goto fmt_next;
+
+ if (try_get_fmt_tmp_buf(&tmp_buf)) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ if (!tmp_buf_len) {
+ err = -ENOSPC;
+ goto cleanup;
+ }
+
+ unsafe_ptr = (char *)(long)raw_args[num_spec];
+ err = bpf_trace_copy_string(tmp_buf, unsafe_ptr,
+ fmt_ptype, tmp_buf_len);
+ if (err < 0) {
+ tmp_buf[0] = '\0';
+ err = 1;
+ }
+
+ cur_arg = (u64)(long)tmp_buf;
+ tmp_buf += err;
+ tmp_buf_len -= err;
+
+ goto fmt_next;
+ }
+
+ cur_mod = BPF_PRINTF_INT;
+
+ if (fmt[i] == 'l') {
+ cur_mod = BPF_PRINTF_LONG;
+ i++;
+ }
+ if (fmt[i] == 'l') {
+ cur_mod = BPF_PRINTF_LONG_LONG;
+ i++;
+ }
+
+ if (fmt[i] != 'i' && fmt[i] != 'd' && fmt[i] != 'u' &&
+ fmt[i] != 'x' && fmt[i] != 'X') {
+ err = -EINVAL;
+ goto cleanup;
+ }
+
+ if (final_args)
+ cur_arg = raw_args[num_spec];
+fmt_next:
+ if (final_args) {
+ mod[num_spec] = cur_mod;
+ final_args[num_spec] = cur_arg;
+ }
+ num_spec++;
+ }
+
+ err = 0;
+cleanup:
+ if (err)
+ bpf_printf_cleanup();
+out:
+ return err;
+}
+
const struct bpf_func_proto bpf_get_current_task_proto __weak;
const struct bpf_func_proto bpf_probe_read_user_proto __weak;
const struct bpf_func_proto bpf_probe_read_user_str_proto __weak;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0d23755c2747..a13f8644b357 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -372,188 +372,38 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
return &bpf_probe_write_user_proto;
}

-static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
- size_t bufsz)
-{
- void __user *user_ptr = (__force void __user *)unsafe_ptr;
-
- buf[0] = 0;
-
- switch (fmt_ptype) {
- case 's':
-#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
- if ((unsigned long)unsafe_ptr < TASK_SIZE) {
- strncpy_from_user_nofault(buf, user_ptr, bufsz);
- break;
- }
- fallthrough;
-#endif
- case 'k':
- strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
- break;
- case 'u':
- strncpy_from_user_nofault(buf, user_ptr, bufsz);
- break;
- }
-}
-
static DEFINE_RAW_SPINLOCK(trace_printk_lock);

-#define BPF_TRACE_PRINTK_SIZE 1024
+#define MAX_TRACE_PRINTK_VARARGS 3
+#define BPF_TRACE_PRINTK_SIZE 1024

-static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
+BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
+ u64, arg2, u64, arg3)
{
+ u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
+ enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
static char buf[BPF_TRACE_PRINTK_SIZE];
unsigned long flags;
- va_list ap;
int ret;

- raw_spin_lock_irqsave(&trace_printk_lock, flags);
- va_start(ap, fmt);
- ret = vsnprintf(buf, sizeof(buf), fmt, ap);
- va_end(ap);
- /* vsnprintf() will not append null for zero-length strings */
+ ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
+ MAX_TRACE_PRINTK_VARARGS);
+ if (ret < 0)
+ return ret;
+
+ ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
+ BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
+ /* snprintf() will not append null for zero-length strings */
if (ret == 0)
buf[0] = '\0';
+
+ raw_spin_lock_irqsave(&trace_printk_lock, flags);
trace_bpf_trace_printk(buf);
raw_spin_unlock_irqrestore(&trace_printk_lock, flags);

- return ret;
-}
-
-/*
- * Only limited trace_printk() conversion specifiers allowed:
- * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s
- */
-BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
- u64, arg2, u64, arg3)
-{
- int i, mod[3] = {}, fmt_cnt = 0;
- char buf[64], fmt_ptype;
- void *unsafe_ptr = NULL;
- bool str_seen = false;
+ bpf_printf_cleanup();

- /*
- * bpf_check()->check_func_arg()->check_stack_boundary()
- * guarantees that fmt points to bpf program stack,
- * fmt_size bytes of it were initialized and fmt_size > 0
- */
- if (fmt[--fmt_size] != 0)
- return -EINVAL;
-
- /* check format string for allowed specifiers */
- for (i = 0; i < fmt_size; i++) {
- if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i]))
- return -EINVAL;
-
- if (fmt[i] != '%')
- continue;
-
- if (fmt_cnt >= 3)
- return -EINVAL;
-
- /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
- i++;
- if (fmt[i] == 'l') {
- mod[fmt_cnt]++;
- i++;
- } else if (fmt[i] == 'p') {
- mod[fmt_cnt]++;
- if ((fmt[i + 1] == 'k' ||
- fmt[i + 1] == 'u') &&
- fmt[i + 2] == 's') {
- fmt_ptype = fmt[i + 1];
- i += 2;
- goto fmt_str;
- }
-
- if (fmt[i + 1] == 'B') {
- i++;
- goto fmt_next;
- }
-
- /* disallow any further format extensions */
- if (fmt[i + 1] != 0 &&
- !isspace(fmt[i + 1]) &&
- !ispunct(fmt[i + 1]))
- return -EINVAL;
-
- goto fmt_next;
- } else if (fmt[i] == 's') {
- mod[fmt_cnt]++;
- fmt_ptype = fmt[i];
-fmt_str:
- if (str_seen)
- /* allow only one '%s' per fmt string */
- return -EINVAL;
- str_seen = true;
-
- if (fmt[i + 1] != 0 &&
- !isspace(fmt[i + 1]) &&
- !ispunct(fmt[i + 1]))
- return -EINVAL;
-
- switch (fmt_cnt) {
- case 0:
- unsafe_ptr = (void *)(long)arg1;
- arg1 = (long)buf;
- break;
- case 1:
- unsafe_ptr = (void *)(long)arg2;
- arg2 = (long)buf;
- break;
- case 2:
- unsafe_ptr = (void *)(long)arg3;
- arg3 = (long)buf;
- break;
- }
-
- bpf_trace_copy_string(buf, unsafe_ptr, fmt_ptype,
- sizeof(buf));
- goto fmt_next;
- }
-
- if (fmt[i] == 'l') {
- mod[fmt_cnt]++;
- i++;
- }
-
- if (fmt[i] != 'i' && fmt[i] != 'd' &&
- fmt[i] != 'u' && fmt[i] != 'x')
- return -EINVAL;
-fmt_next:
- fmt_cnt++;
- }
-
-/* Horrid workaround for getting va_list handling working with different
- * argument type combinations generically for 32 and 64 bit archs.
- */
-#define __BPF_TP_EMIT() __BPF_ARG3_TP()
-#define __BPF_TP(...) \
- bpf_do_trace_printk(fmt, ##__VA_ARGS__)
-
-#define __BPF_ARG1_TP(...) \
- ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64)) \
- ? __BPF_TP(arg1, ##__VA_ARGS__) \
- : ((mod[0] == 1 || (mod[0] == 0 && __BITS_PER_LONG == 32)) \
- ? __BPF_TP((long)arg1, ##__VA_ARGS__) \
- : __BPF_TP((u32)arg1, ##__VA_ARGS__)))
-
-#define __BPF_ARG2_TP(...) \
- ((mod[1] == 2 || (mod[1] == 1 && __BITS_PER_LONG == 64)) \
- ? __BPF_ARG1_TP(arg2, ##__VA_ARGS__) \
- : ((mod[1] == 1 || (mod[1] == 0 && __BITS_PER_LONG == 32)) \
- ? __BPF_ARG1_TP((long)arg2, ##__VA_ARGS__) \
- : __BPF_ARG1_TP((u32)arg2, ##__VA_ARGS__)))
-
-#define __BPF_ARG3_TP(...) \
- ((mod[2] == 2 || (mod[2] == 1 && __BITS_PER_LONG == 64)) \
- ? __BPF_ARG2_TP(arg3, ##__VA_ARGS__) \
- : ((mod[2] == 1 || (mod[2] == 0 && __BITS_PER_LONG == 32)) \
- ? __BPF_ARG2_TP((long)arg3, ##__VA_ARGS__) \
- : __BPF_ARG2_TP((u32)arg3, ##__VA_ARGS__)))
-
- return __BPF_TP_EMIT();
+ return ret;
}

static const struct bpf_func_proto bpf_trace_printk_proto = {
@@ -581,184 +431,37 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void)
}

#define MAX_SEQ_PRINTF_VARARGS 12
-#define MAX_SEQ_PRINTF_MAX_MEMCPY 6
-#define MAX_SEQ_PRINTF_STR_LEN 128
-
-struct bpf_seq_printf_buf {
- char buf[MAX_SEQ_PRINTF_MAX_MEMCPY][MAX_SEQ_PRINTF_STR_LEN];
-};
-static DEFINE_PER_CPU(struct bpf_seq_printf_buf, bpf_seq_printf_buf);
-static DEFINE_PER_CPU(int, bpf_seq_printf_buf_used);

BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
const void *, data, u32, data_len)
{
- int err = -EINVAL, fmt_cnt = 0, memcpy_cnt = 0;
- int i, buf_used, copy_size, num_args;
- u64 params[MAX_SEQ_PRINTF_VARARGS];
- struct bpf_seq_printf_buf *bufs;
- const u64 *args = data;
-
- buf_used = this_cpu_inc_return(bpf_seq_printf_buf_used);
- if (WARN_ON_ONCE(buf_used > 1)) {
- err = -EBUSY;
- goto out;
- }
-
- bufs = this_cpu_ptr(&bpf_seq_printf_buf);
-
- /*
- * bpf_check()->check_func_arg()->check_stack_boundary()
- * guarantees that fmt points to bpf program stack,
- * fmt_size bytes of it were initialized and fmt_size > 0
- */
- if (fmt[--fmt_size] != 0)
- goto out;
-
- if (data_len & 7)
- goto out;
-
- for (i = 0; i < fmt_size; i++) {
- if (fmt[i] == '%') {
- if (fmt[i + 1] == '%')
- i++;
- else if (!data || !data_len)
- goto out;
- }
- }
+ enum bpf_printf_mod_type mod[MAX_SEQ_PRINTF_VARARGS];
+ u64 args[MAX_SEQ_PRINTF_VARARGS];
+ int err, num_args;

+ if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 ||
+ (data_len && !data))
+ return -EINVAL;
num_args = data_len / 8;

- /* check format string for allowed specifiers */
- for (i = 0; i < fmt_size; i++) {
- /* only printable ascii for now. */
- if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
- err = -EINVAL;
- goto out;
- }
-
- if (fmt[i] != '%')
- continue;
-
- if (fmt[i + 1] == '%') {
- i++;
- continue;
- }
-
- if (fmt_cnt >= MAX_SEQ_PRINTF_VARARGS) {
- err = -E2BIG;
- goto out;
- }
-
- if (fmt_cnt >= num_args) {
- err = -EINVAL;
- goto out;
- }
-
- /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
- i++;
-
- /* skip optional "[0 +-][num]" width formating field */
- while (fmt[i] == '0' || fmt[i] == '+' || fmt[i] == '-' ||
- fmt[i] == ' ')
- i++;
- if (fmt[i] >= '1' && fmt[i] <= '9') {
- i++;
- while (fmt[i] >= '0' && fmt[i] <= '9')
- i++;
- }
-
- if (fmt[i] == 's') {
- void *unsafe_ptr;
-
- /* try our best to copy */
- if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) {
- err = -E2BIG;
- goto out;
- }
-
- unsafe_ptr = (void *)(long)args[fmt_cnt];
- err = strncpy_from_kernel_nofault(bufs->buf[memcpy_cnt],
- unsafe_ptr, MAX_SEQ_PRINTF_STR_LEN);
- if (err < 0)
- bufs->buf[memcpy_cnt][0] = '\0';
- params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt];
-
- fmt_cnt++;
- memcpy_cnt++;
- continue;
- }
-
- if (fmt[i] == 'p') {
- if (fmt[i + 1] == 0 ||
- fmt[i + 1] == 'K' ||
- fmt[i + 1] == 'x' ||
- fmt[i + 1] == 'B') {
- /* just kernel pointers */
- params[fmt_cnt] = args[fmt_cnt];
- fmt_cnt++;
- continue;
- }
-
- /* only support "%pI4", "%pi4", "%pI6" and "%pi6". */
- if (fmt[i + 1] != 'i' && fmt[i + 1] != 'I') {
- err = -EINVAL;
- goto out;
- }
- if (fmt[i + 2] != '4' && fmt[i + 2] != '6') {
- err = -EINVAL;
- goto out;
- }
-
- if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) {
- err = -E2BIG;
- goto out;
- }
-
-
- copy_size = (fmt[i + 2] == '4') ? 4 : 16;
-
- err = copy_from_kernel_nofault(bufs->buf[memcpy_cnt],
- (void *) (long) args[fmt_cnt],
- copy_size);
- if (err < 0)
- memset(bufs->buf[memcpy_cnt], 0, copy_size);
- params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt];
-
- i += 2;
- fmt_cnt++;
- memcpy_cnt++;
- continue;
- }
-
- if (fmt[i] == 'l') {
- i++;
- if (fmt[i] == 'l')
- i++;
- }
-
- if (fmt[i] != 'i' && fmt[i] != 'd' &&
- fmt[i] != 'u' && fmt[i] != 'x' &&
- fmt[i] != 'X') {
- err = -EINVAL;
- goto out;
- }
-
- params[fmt_cnt] = args[fmt_cnt];
- fmt_cnt++;
- }
+ err = bpf_printf_prepare(fmt, fmt_size, data, args, mod, num_args);
+ if (err < 0)
+ return err;

/* Maximumly we can have MAX_SEQ_PRINTF_VARARGS parameter, just give
* all of them to seq_printf().
*/
- seq_printf(m, fmt, params[0], params[1], params[2], params[3],
- params[4], params[5], params[6], params[7], params[8],
- params[9], params[10], params[11]);
+ seq_printf(m, fmt, BPF_CAST_FMT_ARG(0, args, mod),
+ BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
+ BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
+ BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
+ BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
+ BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
+ BPF_CAST_FMT_ARG(11, args, mod));

- err = seq_has_overflowed(m) ? -EOVERFLOW : 0;
-out:
- this_cpu_dec(bpf_seq_printf_buf_used);
- return err;
+ bpf_printf_cleanup();
+
+ return seq_has_overflowed(m) ? -EOVERFLOW : 0;
}

BTF_ID_LIST_SINGLE(btf_seq_file_ids, struct, seq_file)
--
2.31.1.368.gbe11c130af-goog

2021-04-19 19:36:54

by Florent Revest

[permalink] [raw]
Subject: [PATCH bpf-next v5 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field

When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
---
tools/lib/bpf/bpf_tracing.h | 40 +++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..1c2e91ee041d 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,20 +413,38 @@ typeof(name(0)) name(struct pt_regs *ctx) \
} \
static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)

+#define ___bpf_fill0(arr, p, x) do {} while (0)
+#define ___bpf_fill1(arr, p, x) arr[p] = x
+#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
+#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
+#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
+#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
+#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
+#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
+#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
+#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
+#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
+#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
+#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
+#define ___bpf_fill(arr, args...) \
+ ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
+
/*
* BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
* in a structure.
*/
-#define BPF_SEQ_PRINTF(seq, fmt, args...) \
- ({ \
- _Pragma("GCC diagnostic push") \
- _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
- static const char ___fmt[] = fmt; \
- unsigned long long ___param[] = { args }; \
- _Pragma("GCC diagnostic pop") \
- int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
- ___param, sizeof(___param)); \
- ___ret; \
- })
+#define BPF_SEQ_PRINTF(seq, fmt, args...) \
+({ \
+ static const char ___fmt[] = fmt; \
+ unsigned long long ___param[___bpf_narg(args)]; \
+ \
+ _Pragma("GCC diagnostic push") \
+ _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
+ ___bpf_fill(___param, args); \
+ _Pragma("GCC diagnostic pop") \
+ \
+ bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
+ ___param, sizeof(___param)); \
+})

#endif
--
2.31.1.368.gbe11c130af-goog

2021-04-19 19:37:01

by Florent Revest

[permalink] [raw]
Subject: [PATCH bpf-next v5 3/6] bpf: Add a bpf_snprintf helper

The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.

Signed-off-by: Florent Revest <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
---
include/linux/bpf.h | 1 +
include/uapi/linux/bpf.h | 28 +++++++++++++++++++
kernel/bpf/helpers.c | 50 ++++++++++++++++++++++++++++++++++
kernel/bpf/verifier.c | 41 ++++++++++++++++++++++++++++
kernel/trace/bpf_trace.c | 2 ++
tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++
6 files changed, 150 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c160526fc8bf..f8a45f109e96 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1953,6 +1953,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
extern const struct bpf_func_proto bpf_copy_from_user_proto;
extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index df164a44bb41..ec6d85a81744 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4708,6 +4708,33 @@ union bpf_attr {
* Return
* The number of traversed map elements for success, **-EINVAL** for
* invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
+ * Description
+ * Outputs a string into the **str** buffer of size **str_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the formatted string, including
+ * the trailing zero character. If the return value is greater than
+ * **str_size**, **str** contains a truncated string, guaranteed to
+ * be zero-terminated except when **str_size** is 0.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -4875,6 +4902,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu), \
FN(for_each_map_elem), \
+ FN(snprintf), \
/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 9ca57eb1fc0d..85b26ca5aacd 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -925,6 +925,54 @@ int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
return err;
}

+#define MAX_SNPRINTF_VARARGS 12
+
+BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
+ const void *, data, u32, data_len)
+{
+ enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
+ u64 args[MAX_SNPRINTF_VARARGS];
+ int err, num_args;
+
+ if (data_len % 8 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
+ (data_len && !data))
+ return -EINVAL;
+ num_args = data_len / 8;
+
+ /* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
+ * can safely give an unbounded size.
+ */
+ err = bpf_printf_prepare(fmt, UINT_MAX, data, args, mod, num_args);
+ if (err < 0)
+ return err;
+
+ /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
+ * all of them to snprintf().
+ */
+ err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
+ BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
+ BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
+ BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
+ BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
+ BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
+ BPF_CAST_FMT_ARG(11, args, mod));
+
+ bpf_printf_cleanup();
+
+ return err + 1;
+}
+
+const struct bpf_func_proto bpf_snprintf_proto = {
+ .func = bpf_snprintf,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_MEM_OR_NULL,
+ .arg2_type = ARG_CONST_SIZE_OR_ZERO,
+ .arg3_type = ARG_PTR_TO_CONST_STR,
+ .arg4_type = ARG_PTR_TO_MEM_OR_NULL,
+ .arg5_type = ARG_CONST_SIZE_OR_ZERO,
+};
+
const struct bpf_func_proto bpf_get_current_task_proto __weak;
const struct bpf_func_proto bpf_probe_read_user_proto __weak;
const struct bpf_func_proto bpf_probe_read_user_str_proto __weak;
@@ -1013,6 +1061,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
return &bpf_probe_read_kernel_str_proto;
case BPF_FUNC_snprintf_btf:
return &bpf_snprintf_btf_proto;
+ case BPF_FUNC_snprintf:
+ return &bpf_snprintf_proto;
default:
return NULL;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5f46dd6f3383..994ef36c5f60 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5918,6 +5918,41 @@ static int check_reference_leak(struct bpf_verifier_env *env)
return state->acquired_refs ? -EINVAL : 0;
}

+static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
+ struct bpf_reg_state *regs)
+{
+ struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
+ struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
+ struct bpf_map *fmt_map = fmt_reg->map_ptr;
+ int err, fmt_map_off, num_args;
+ u64 fmt_addr;
+ char *fmt;
+
+ /* data must be an array of u64 */
+ if (data_len_reg->var_off.value % 8)
+ return -EINVAL;
+ num_args = data_len_reg->var_off.value / 8;
+
+ /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
+ * and map_direct_value_addr is set.
+ */
+ fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
+ err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
+ fmt_map_off);
+ if (err)
+ return err;
+ fmt = (char *)(long)fmt_addr + fmt_map_off;
+
+ /* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
+ * can focus on validating the format specifiers.
+ */
+ err = bpf_printf_prepare(fmt, UINT_MAX, NULL, NULL, NULL, num_args);
+ if (err < 0)
+ verbose(env, "Invalid format string\n");
+
+ return err;
+}
+
static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
int *insn_idx_p)
{
@@ -6032,6 +6067,12 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
return -EINVAL;
}

+ if (func_id == BPF_FUNC_snprintf) {
+ err = check_bpf_snprintf_call(env, regs);
+ if (err < 0)
+ return err;
+ }
+
/* reset caller saved regs */
for (i = 0; i < CALLER_SAVED_REGS; i++) {
mark_reg_not_init(env, regs, caller_saved[i]);
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index a13f8644b357..2a8bcdc927c7 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1076,6 +1076,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return &bpf_task_storage_delete_proto;
case BPF_FUNC_for_each_map_elem:
return &bpf_for_each_map_elem_proto;
+ case BPF_FUNC_snprintf:
+ return &bpf_snprintf_proto;
default:
return NULL;
}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index df164a44bb41..ec6d85a81744 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4708,6 +4708,33 @@ union bpf_attr {
* Return
* The number of traversed map elements for success, **-EINVAL** for
* invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
+ * Description
+ * Outputs a string into the **str** buffer of size **str_size**
+ * based on a format string stored in a read-only map pointed by
+ * **fmt**.
+ *
+ * Each format specifier in **fmt** corresponds to one u64 element
+ * in the **data** array. For strings and pointers where pointees
+ * are accessed, only the pointer values are stored in the *data*
+ * array. The *data_len* is the size of *data* in bytes.
+ *
+ * Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ * memory. Reading kernel memory may fail due to either invalid
+ * address or valid address but requiring a major memory fault. If
+ * reading kernel memory fails, the string for **%s** will be an
+ * empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ * Not returning error to bpf program is consistent with what
+ * **bpf_trace_printk**\ () does for now.
+ *
+ * Return
+ * The strictly positive length of the formatted string, including
+ * the trailing zero character. If the return value is greater than
+ * **str_size**, **str** contains a truncated string, guaranteed to
+ * be zero-terminated except when **str_size** is 0.
+ *
+ * Or **-EBUSY** if the per-CPU memory copy buffer is busy.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -4875,6 +4902,7 @@ union bpf_attr {
FN(sock_from_file), \
FN(check_mtu), \
FN(for_each_map_elem), \
+ FN(snprintf), \
/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
--
2.31.1.368.gbe11c130af-goog

2021-04-19 19:37:54

by Florent Revest

[permalink] [raw]
Subject: [PATCH bpf-next v5 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro

Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
---
tools/lib/bpf/bpf_tracing.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 1c2e91ee041d..8c954ebc0c7c 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
___param, sizeof(___param)); \
})

+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...) \
+({ \
+ static const char ___fmt[] = fmt; \
+ unsigned long long ___param[___bpf_narg(args)]; \
+ \
+ _Pragma("GCC diagnostic push") \
+ _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
+ ___bpf_fill(___param, args); \
+ _Pragma("GCC diagnostic pop") \
+ \
+ bpf_snprintf(out, out_size, ___fmt, \
+ ___param, sizeof(___param)); \
+})
+
#endif
--
2.31.1.368.gbe11c130af-goog

2021-04-19 20:48:39

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 0/6] Add a snprintf eBPF helper

On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
>
> We have a usecase where we want to audit symbol names (if available) in
> callback registration hooks. (ex: fentry/nf_register_net_hook)
>
> A few months back, I proposed a bpf_kallsyms_lookup series but it was
> decided in the reviews that a more generic helper, bpf_snprintf, would
> be more useful.
>
> This series implements the helper according to the feedback received in
> https://lore.kernel.org/bpf/[email protected]/T/#u
>
> - A new arg type guarantees the NULL-termination of string arguments and
> lets us pass format strings in only one arg
> - A new helper is implemented using that guarantee. Because the format
> string is known at verification time, the format string validation is
> done by the verifier
> - To implement a series of tests for bpf_snprintf, the logic for
> marshalling variadic args in a fixed-size array is reworked as per:
> https://lore.kernel.org/bpf/[email protected]/T/#u
>
> ---
> Changes in v5:
> - Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
> - Added a couple of extra incorrect specifiers tests
> - Call test_snprintf_single__destroy unconditionally
> - Fixed a C++-style comment
>
> ---
> Changes in v4:
> - Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
> kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
> - Added negative test cases (various invalid format strings)
> - Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
> - Fixed a mistake that caused temporary buffers to be unconditionally
> freed in bpf_printf_prepare
> - Fixed a mistake that caused missing 0 character to be ignored
> - Fixed a warning about integer to pointer conversion
> - Misc cleanups
>
> ---
> Changes in v3:
> - Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
> - Made zero-termination check more consistent
> - Allowed NULL output_buffer
> - Simplified the BPF_CAST_FMT_ARG macro
> - Three new test cases: number padding, simple string with no arg and
> string length extraction only with a NULL output buffer
> - Clarified helper's description for edge cases (eg: str_size == 0)
> - Lots of cosmetic changes
>
> ---
> Changes in v2:
> - Extracted the format validation/argument sanitization in a generic way
> for all printf-like helpers.
> - bpf_snprintf's str_size can now be 0
> - bpf_snprintf is now exposed to all BPF program types
> - We now preempt_disable when using a per-cpu temporary buffer
> - Addressed a few cosmetic changes
>
> Florent Revest (6):
> bpf: Factorize bpf_trace_printk and bpf_seq_printf
> bpf: Add a ARG_PTR_TO_CONST_STR argument type
> bpf: Add a bpf_snprintf helper
> libbpf: Initialize the bpf_seq_printf parameters array field by field
> libbpf: Introduce a BPF_SNPRINTF helper macro
> selftests/bpf: Add a series of tests for bpf_snprintf
>
> include/linux/bpf.h | 22 ++
> include/uapi/linux/bpf.h | 28 ++
> kernel/bpf/helpers.c | 306 ++++++++++++++
> kernel/bpf/verifier.c | 82 ++++
> kernel/trace/bpf_trace.c | 373 ++----------------
> tools/include/uapi/linux/bpf.h | 28 ++
> tools/lib/bpf/bpf_tracing.h | 58 ++-
> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++
> .../selftests/bpf/progs/test_snprintf.c | 73 ++++
> .../bpf/progs/test_snprintf_single.c | 20 +
> 10 files changed, 770 insertions(+), 345 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
>
> --
> 2.31.1.368.gbe11c130af-goog
>

Looks great, thank you!

For the series:

Acked-by: Andrii Nakryiko <[email protected]>

2021-04-20 12:04:01

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 0/6] Add a snprintf eBPF helper

On Mon, Apr 19, 2021 at 9:34 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
> >
> > We have a usecase where we want to audit symbol names (if available) in
> > callback registration hooks. (ex: fentry/nf_register_net_hook)
> >
> > A few months back, I proposed a bpf_kallsyms_lookup series but it was
> > decided in the reviews that a more generic helper, bpf_snprintf, would
> > be more useful.
> >
> > This series implements the helper according to the feedback received in
> > https://lore.kernel.org/bpf/[email protected]/T/#u
> >
> > - A new arg type guarantees the NULL-termination of string arguments and
> > lets us pass format strings in only one arg
> > - A new helper is implemented using that guarantee. Because the format
> > string is known at verification time, the format string validation is
> > done by the verifier
> > - To implement a series of tests for bpf_snprintf, the logic for
> > marshalling variadic args in a fixed-size array is reworked as per:
> > https://lore.kernel.org/bpf/[email protected]/T/#u
> >
> > ---
> > Changes in v5:
> > - Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
> > - Added a couple of extra incorrect specifiers tests
> > - Call test_snprintf_single__destroy unconditionally
> > - Fixed a C++-style comment
> >
> > ---
> > Changes in v4:
> > - Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
> > kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
> > - Added negative test cases (various invalid format strings)
> > - Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
> > - Fixed a mistake that caused temporary buffers to be unconditionally
> > freed in bpf_printf_prepare
> > - Fixed a mistake that caused missing 0 character to be ignored
> > - Fixed a warning about integer to pointer conversion
> > - Misc cleanups
> >
> > ---
> > Changes in v3:
> > - Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
> > - Made zero-termination check more consistent
> > - Allowed NULL output_buffer
> > - Simplified the BPF_CAST_FMT_ARG macro
> > - Three new test cases: number padding, simple string with no arg and
> > string length extraction only with a NULL output buffer
> > - Clarified helper's description for edge cases (eg: str_size == 0)
> > - Lots of cosmetic changes
> >
> > ---
> > Changes in v2:
> > - Extracted the format validation/argument sanitization in a generic way
> > for all printf-like helpers.
> > - bpf_snprintf's str_size can now be 0
> > - bpf_snprintf is now exposed to all BPF program types
> > - We now preempt_disable when using a per-cpu temporary buffer
> > - Addressed a few cosmetic changes
> >
> > Florent Revest (6):
> > bpf: Factorize bpf_trace_printk and bpf_seq_printf
> > bpf: Add a ARG_PTR_TO_CONST_STR argument type
> > bpf: Add a bpf_snprintf helper
> > libbpf: Initialize the bpf_seq_printf parameters array field by field
> > libbpf: Introduce a BPF_SNPRINTF helper macro
> > selftests/bpf: Add a series of tests for bpf_snprintf
> >
> > include/linux/bpf.h | 22 ++
> > include/uapi/linux/bpf.h | 28 ++
> > kernel/bpf/helpers.c | 306 ++++++++++++++
> > kernel/bpf/verifier.c | 82 ++++
> > kernel/trace/bpf_trace.c | 373 ++----------------
> > tools/include/uapi/linux/bpf.h | 28 ++
> > tools/lib/bpf/bpf_tracing.h | 58 ++-
> > .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++
> > .../selftests/bpf/progs/test_snprintf.c | 73 ++++
> > .../bpf/progs/test_snprintf_single.c | 20 +
> > 10 files changed, 770 insertions(+), 345 deletions(-)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> >
> > --
> > 2.31.1.368.gbe11c130af-goog
> >
>
> Looks great, thank you!
>
> For the series:
>
> Acked-by: Andrii Nakryiko <[email protected]>

Thank you for the all the fast and in-depth reviews Andrii! :)

2021-04-23 22:39:20

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
>
> The "positive" part tests all format specifiers when things go well.
>
> The "negative" part makes sure that incorrect format strings fail at
> load time.
>
> Signed-off-by: Florent Revest <[email protected]>
> ---
> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> .../bpf/progs/test_snprintf_single.c | 20 +++
> 3 files changed, 218 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> new file mode 100644
> index 000000000000..a958c22aec75
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> @@ -0,0 +1,125 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2021 Google LLC. */
> +
> +#include <test_progs.h>
> +#include "test_snprintf.skel.h"
> +#include "test_snprintf_single.skel.h"
> +
> +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> +
> +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> +
> +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> +#define EXP_SYM_OUT "schedule schedule+0x0/"
> +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> +
> +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> +
> +#define EXP_STR_OUT "str1 longstr"
> +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> +
> +#define EXP_OVER_OUT "%over"
> +#define EXP_OVER_RET 10
> +
> +#define EXP_PAD_OUT " 4 000"

Roughly 50% of the time I get failure for this test case:

test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
0000' != expected ' 4 000'

Re-running this test case immediately passes. Running again most
probably fails. Please take a look.

> +#define EXP_PAD_RET 900007
> +
> +#define EXP_NO_ARG_OUT "simple case"
> +#define EXP_NO_ARG_RET 12
> +
> +#define EXP_NO_BUF_RET 29
> +

[...]

2021-04-26 10:11:45

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
<[email protected]> wrote:
>
> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
> >
> > The "positive" part tests all format specifiers when things go well.
> >
> > The "negative" part makes sure that incorrect format strings fail at
> > load time.
> >
> > Signed-off-by: Florent Revest <[email protected]>
> > ---
> > .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> > .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> > .../bpf/progs/test_snprintf_single.c | 20 +++
> > 3 files changed, 218 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > new file mode 100644
> > index 000000000000..a958c22aec75
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > @@ -0,0 +1,125 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2021 Google LLC. */
> > +
> > +#include <test_progs.h>
> > +#include "test_snprintf.skel.h"
> > +#include "test_snprintf_single.skel.h"
> > +
> > +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> > +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> > +
> > +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> > +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> > +
> > +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> > +#define EXP_SYM_OUT "schedule schedule+0x0/"
> > +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> > +
> > +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> > +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> > +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> > +
> > +#define EXP_STR_OUT "str1 longstr"
> > +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> > +
> > +#define EXP_OVER_OUT "%over"
> > +#define EXP_OVER_RET 10
> > +
> > +#define EXP_PAD_OUT " 4 000"
>
> Roughly 50% of the time I get failure for this test case:
>
> test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
> 0000' != expected ' 4 000'
>
> Re-running this test case immediately passes. Running again most
> probably fails. Please take a look.

Do you have more information on how to reproduce this ?
I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:

#!/bin/sh
for i in `seq 1000`
do
./test_progs -t snprintf
if [ $? -ne 0 ];
then
echo FAILURE
exit 1
fi
done

The thousand executions passed.

This is a bit concerning because your unexpected_pad_out seems to have
an extra '0' so it ends up with strlen(pad_out)=11 but
sizeof(pad_out)=10. The actual string writing is not really done by
our helper code but by the snprintf implementation (str and str_size
are only given to snprintf()) so I'd expect the truncation to work
well there. I'm a bit puzzled

2021-04-26 16:21:27

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <[email protected]> wrote:
>
> On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
> <[email protected]> wrote:
> >
> > On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
> > >
> > > The "positive" part tests all format specifiers when things go well.
> > >
> > > The "negative" part makes sure that incorrect format strings fail at
> > > load time.
> > >
> > > Signed-off-by: Florent Revest <[email protected]>
> > > ---
> > > .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> > > .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> > > .../bpf/progs/test_snprintf_single.c | 20 +++
> > > 3 files changed, 218 insertions(+)
> > > create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> > > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> > > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > > new file mode 100644
> > > index 000000000000..a958c22aec75
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > > @@ -0,0 +1,125 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/* Copyright (c) 2021 Google LLC. */
> > > +
> > > +#include <test_progs.h>
> > > +#include "test_snprintf.skel.h"
> > > +#include "test_snprintf_single.skel.h"
> > > +
> > > +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> > > +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> > > +
> > > +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> > > +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> > > +
> > > +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> > > +#define EXP_SYM_OUT "schedule schedule+0x0/"
> > > +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> > > +
> > > +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> > > +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> > > +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> > > +
> > > +#define EXP_STR_OUT "str1 longstr"
> > > +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> > > +
> > > +#define EXP_OVER_OUT "%over"
> > > +#define EXP_OVER_RET 10
> > > +
> > > +#define EXP_PAD_OUT " 4 000"
> >
> > Roughly 50% of the time I get failure for this test case:
> >
> > test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
> > 0000' != expected ' 4 000'
> >
> > Re-running this test case immediately passes. Running again most
> > probably fails. Please take a look.
>
> Do you have more information on how to reproduce this ?
> I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:
>
> #!/bin/sh
> for i in `seq 1000`
> do
> ./test_progs -t snprintf
> if [ $? -ne 0 ];
> then
> echo FAILURE
> exit 1
> fi
> done
>
> The thousand executions passed.
>
> This is a bit concerning because your unexpected_pad_out seems to have
> an extra '0' so it ends up with strlen(pad_out)=11 but
> sizeof(pad_out)=10. The actual string writing is not really done by
> our helper code but by the snprintf implementation (str and str_size
> are only given to snprintf()) so I'd expect the truncation to work
> well there. I'm a bit puzzled

I'm puzzled too, have no idea. I also can't repro this with vmtest.sh.
But I can quite reliably reproduce with my local ArchLinux-based qemu
image with different config (see [0] for config itself). So please try
with my config and see if that helps to repro. If not, I'll have to
debug it on my own later.

[0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048

2021-04-26 21:11:11

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Mon, Apr 26, 2021 at 6:19 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <[email protected]> wrote:
> >
> > On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
> > <[email protected]> wrote:
> > >
> > > On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
> > > >
> > > > The "positive" part tests all format specifiers when things go well.
> > > >
> > > > The "negative" part makes sure that incorrect format strings fail at
> > > > load time.
> > > >
> > > > Signed-off-by: Florent Revest <[email protected]>
> > > > ---
> > > > .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> > > > .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> > > > .../bpf/progs/test_snprintf_single.c | 20 +++
> > > > 3 files changed, 218 insertions(+)
> > > > create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> > > > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> > > > create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > > > new file mode 100644
> > > > index 000000000000..a958c22aec75
> > > > --- /dev/null
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > > > @@ -0,0 +1,125 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/* Copyright (c) 2021 Google LLC. */
> > > > +
> > > > +#include <test_progs.h>
> > > > +#include "test_snprintf.skel.h"
> > > > +#include "test_snprintf_single.skel.h"
> > > > +
> > > > +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> > > > +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> > > > +
> > > > +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> > > > +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> > > > +
> > > > +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> > > > +#define EXP_SYM_OUT "schedule schedule+0x0/"
> > > > +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> > > > +
> > > > +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> > > > +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> > > > +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> > > > +
> > > > +#define EXP_STR_OUT "str1 longstr"
> > > > +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> > > > +
> > > > +#define EXP_OVER_OUT "%over"
> > > > +#define EXP_OVER_RET 10
> > > > +
> > > > +#define EXP_PAD_OUT " 4 000"
> > >
> > > Roughly 50% of the time I get failure for this test case:
> > >
> > > test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
> > > 0000' != expected ' 4 000'
> > >
> > > Re-running this test case immediately passes. Running again most
> > > probably fails. Please take a look.
> >
> > Do you have more information on how to reproduce this ?
> > I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:
> >
> > #!/bin/sh
> > for i in `seq 1000`
> > do
> > ./test_progs -t snprintf
> > if [ $? -ne 0 ];
> > then
> > echo FAILURE
> > exit 1
> > fi
> > done
> >
> > The thousand executions passed.
> >
> > This is a bit concerning because your unexpected_pad_out seems to have
> > an extra '0' so it ends up with strlen(pad_out)=11 but
> > sizeof(pad_out)=10. The actual string writing is not really done by
> > our helper code but by the snprintf implementation (str and str_size
> > are only given to snprintf()) so I'd expect the truncation to work
> > well there. I'm a bit puzzled
>
> I'm puzzled too, have no idea. I also can't repro this with vmtest.sh.
> But I can quite reliably reproduce with my local ArchLinux-based qemu
> image with different config (see [0] for config itself). So please try
> with my config and see if that helps to repro. If not, I'll have to
> debug it on my own later.
>
> [0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048

I tried that config on the same commit 87bd9e602 (bpf-next/master)
with my debian-based qemu image and I still can't reproduce the issue
:| If I can be of any help let me know, I'd be happy to help

2021-04-27 06:36:42

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On 26/04/2021 23.08, Florent Revest wrote:
> On Mon, Apr 26, 2021 at 6:19 PM Andrii Nakryiko
> <[email protected]> wrote:
>>
>> On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <[email protected]> wrote:
>>>
>>> On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
>>> <[email protected]> wrote:
>>>>
>>>> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
>>>>>
>>>>> The "positive" part tests all format specifiers when things go well.
>>>>>
>>>>> The "negative" part makes sure that incorrect format strings fail at
>>>>> load time.
>>>>>
>>>>> Signed-off-by: Florent Revest <[email protected]>
>>>>> ---
>>>>> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
>>>>> .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
>>>>> .../bpf/progs/test_snprintf_single.c | 20 +++
>>>>> 3 files changed, 218 insertions(+)
>>>>> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
>>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
>>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
>>>>>
>>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
>>>>> new file mode 100644
>>>>> index 000000000000..a958c22aec75
>>>>> --- /dev/null
>>>>> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
>>>>> @@ -0,0 +1,125 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +/* Copyright (c) 2021 Google LLC. */
>>>>> +
>>>>> +#include <test_progs.h>
>>>>> +#include "test_snprintf.skel.h"
>>>>> +#include "test_snprintf_single.skel.h"
>>>>> +
>>>>> +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
>>>>> +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
>>>>> +
>>>>> +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
>>>>> +#define EXP_IP_RET sizeof(EXP_IP_OUT)
>>>>> +
>>>>> +/* The third specifier, %pB, depends on compiler inlining so don't check it */
>>>>> +#define EXP_SYM_OUT "schedule schedule+0x0/"
>>>>> +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
>>>>> +
>>>>> +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
>>>>> +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
>>>>> +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
>>>>> +
>>>>> +#define EXP_STR_OUT "str1 longstr"
>>>>> +#define EXP_STR_RET sizeof(EXP_STR_OUT)
>>>>> +
>>>>> +#define EXP_OVER_OUT "%over"
>>>>> +#define EXP_OVER_RET 10
>>>>> +
>>>>> +#define EXP_PAD_OUT " 4 000"
>>>>
>>>> Roughly 50% of the time I get failure for this test case:
>>>>
>>>> test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
>>>> 0000' != expected ' 4 000'
>>>>
>>>> Re-running this test case immediately passes. Running again most
>>>> probably fails. Please take a look.
>>>
>>> Do you have more information on how to reproduce this ?
>>> I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:
>>>
>>> #!/bin/sh
>>> for i in `seq 1000`
>>> do
>>> ./test_progs -t snprintf
>>> if [ $? -ne 0 ];
>>> then
>>> echo FAILURE
>>> exit 1
>>> fi
>>> done
>>>
>>> The thousand executions passed.
>>>
>>> This is a bit concerning because your unexpected_pad_out seems to have
>>> an extra '0' so it ends up with strlen(pad_out)=11 but
>>> sizeof(pad_out)=10. The actual string writing is not really done by
>>> our helper code but by the snprintf implementation (str and str_size
>>> are only given to snprintf()) so I'd expect the truncation to work
>>> well there. I'm a bit puzzled
>>
>> I'm puzzled too, have no idea. I also can't repro this with vmtest.sh.
>> But I can quite reliably reproduce with my local ArchLinux-based qemu
>> image with different config (see [0] for config itself). So please try
>> with my config and see if that helps to repro. If not, I'll have to
>> debug it on my own later.
>>
>> [0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048
>
> I tried that config on the same commit 87bd9e602 (bpf-next/master)
> with my debian-based qemu image and I still can't reproduce the issue
> :| If I can be of any help let me know, I'd be happy to help
>

It's not really clear to me if this is before or after the rewrite to
use bprintf, but regardless, in those two patches this caught my attention:

u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
- enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
+ u32 *bin_args;
static char buf[BPF_TRACE_PRINTK_SIZE];
unsigned long flags;
int ret;

- ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
- MAX_TRACE_PRINTK_VARARGS);
+ ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
+ MAX_TRACE_PRINTK_VARARGS);
if (ret < 0)
return ret;

- ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
- BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
- /* snprintf() will not append null for zero-length strings */
- if (ret == 0)
- buf[0] = '\0';
+ ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);

raw_spin_lock_irqsave(&trace_printk_lock, flags);
trace_bpf_trace_printk(buf);
raw_spin_unlock_irqrestore(&trace_printk_lock, flags);

Why isn't the write to buf[] protected by that spinlock? Or put another
way, what protects buf[] from concurrent writes?

Probably the test cases are not run in parallel, but this is the kind of
thing that would give those symptoms.

Rasmus

2021-04-27 09:52:25

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes
<[email protected]> wrote:
>
> On 26/04/2021 23.08, Florent Revest wrote:
> > On Mon, Apr 26, 2021 at 6:19 PM Andrii Nakryiko
> > <[email protected]> wrote:
> >>
> >> On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <[email protected]> wrote:
> >>>
> >>> On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
> >>> <[email protected]> wrote:
> >>>>
> >>>> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
> >>>>>
> >>>>> The "positive" part tests all format specifiers when things go well.
> >>>>>
> >>>>> The "negative" part makes sure that incorrect format strings fail at
> >>>>> load time.
> >>>>>
> >>>>> Signed-off-by: Florent Revest <[email protected]>
> >>>>> ---
> >>>>> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> >>>>> .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> >>>>> .../bpf/progs/test_snprintf_single.c | 20 +++
> >>>>> 3 files changed, 218 insertions(+)
> >>>>> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> >>>>>
> >>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> >>>>> new file mode 100644
> >>>>> index 000000000000..a958c22aec75
> >>>>> --- /dev/null
> >>>>> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> >>>>> @@ -0,0 +1,125 @@
> >>>>> +// SPDX-License-Identifier: GPL-2.0
> >>>>> +/* Copyright (c) 2021 Google LLC. */
> >>>>> +
> >>>>> +#include <test_progs.h>
> >>>>> +#include "test_snprintf.skel.h"
> >>>>> +#include "test_snprintf_single.skel.h"
> >>>>> +
> >>>>> +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> >>>>> +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> >>>>> +
> >>>>> +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> >>>>> +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> >>>>> +
> >>>>> +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> >>>>> +#define EXP_SYM_OUT "schedule schedule+0x0/"
> >>>>> +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> >>>>> +
> >>>>> +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> >>>>> +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> >>>>> +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> >>>>> +
> >>>>> +#define EXP_STR_OUT "str1 longstr"
> >>>>> +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> >>>>> +
> >>>>> +#define EXP_OVER_OUT "%over"
> >>>>> +#define EXP_OVER_RET 10
> >>>>> +
> >>>>> +#define EXP_PAD_OUT " 4 000"
> >>>>
> >>>> Roughly 50% of the time I get failure for this test case:
> >>>>
> >>>> test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
> >>>> 0000' != expected ' 4 000'
> >>>>
> >>>> Re-running this test case immediately passes. Running again most
> >>>> probably fails. Please take a look.
> >>>
> >>> Do you have more information on how to reproduce this ?
> >>> I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:
> >>>
> >>> #!/bin/sh
> >>> for i in `seq 1000`
> >>> do
> >>> ./test_progs -t snprintf
> >>> if [ $? -ne 0 ];
> >>> then
> >>> echo FAILURE
> >>> exit 1
> >>> fi
> >>> done
> >>>
> >>> The thousand executions passed.
> >>>
> >>> This is a bit concerning because your unexpected_pad_out seems to have
> >>> an extra '0' so it ends up with strlen(pad_out)=11 but
> >>> sizeof(pad_out)=10. The actual string writing is not really done by
> >>> our helper code but by the snprintf implementation (str and str_size
> >>> are only given to snprintf()) so I'd expect the truncation to work
> >>> well there. I'm a bit puzzled
> >>
> >> I'm puzzled too, have no idea. I also can't repro this with vmtest.sh.
> >> But I can quite reliably reproduce with my local ArchLinux-based qemu
> >> image with different config (see [0] for config itself). So please try
> >> with my config and see if that helps to repro. If not, I'll have to
> >> debug it on my own later.
> >>
> >> [0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048
> >
> > I tried that config on the same commit 87bd9e602 (bpf-next/master)
> > with my debian-based qemu image and I still can't reproduce the issue
> > :| If I can be of any help let me know, I'd be happy to help
> >
>
> It's not really clear to me if this is before or after the rewrite to
> use bprintf, but regardless, in those two patches this caught my attention:

I tried to reproduce Andrii's bug both before and after the bprintf
rewrite but I think he meant before.

> u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
> - enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
> + u32 *bin_args;
> static char buf[BPF_TRACE_PRINTK_SIZE];
> unsigned long flags;
> int ret;
>
> - ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
> - MAX_TRACE_PRINTK_VARARGS);
> + ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
> + MAX_TRACE_PRINTK_VARARGS);
> if (ret < 0)
> return ret;
>
> - ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> - BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
> - /* snprintf() will not append null for zero-length strings */
> - if (ret == 0)
> - buf[0] = '\0';
> + ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
>
> raw_spin_lock_irqsave(&trace_printk_lock, flags);
> trace_bpf_trace_printk(buf);
> raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
>
> Why isn't the write to buf[] protected by that spinlock? Or put another
> way, what protects buf[] from concurrent writes?

You're right, that is a bug, I missed that buf was static and thought
it was just on the stack. That snprintf call should be after the
raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my
snprintf series, there was a vsprintf after the raw_spin_lock_irqsave)

> Probably the test cases are not run in parallel, but this is the kind of
> thing that would give those symptoms.

I think it's a separate issue from what Andrii reported though because
the flaky test exercises the bpf_snprintf helper and this buf spinlock
bug you just found only affects the bpf_trace_printk helper.

That being said, it does smell a little bit like a concurrency issue
too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it
attaches to all syscall entries and most likely gets executed many
more times than necessary and probably on parallel CPUs. The "pad_out"
buffer they write to is unique and not locked so maybe the test's
userspace reads pad_out while another CPU is writing on it and if the
string output goes through a stage where it is " 4 0000" before
being " 4 000", we might read at the wrong time. That being said, I
would find it weird that this happens as much as 50% of the time and
always specifically on that test case.

Andrii could you maybe try changing the prog type to
"tp/syscalls/sys_enter_nanosleep" on the machine where you can
reproduce this bug ?

> Rasmus

2021-04-27 18:08:14

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Tue, Apr 27, 2021 at 2:51 AM Florent Revest <[email protected]> wrote:
>
> On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes
> <[email protected]> wrote:
> >
> > On 26/04/2021 23.08, Florent Revest wrote:
> > > On Mon, Apr 26, 2021 at 6:19 PM Andrii Nakryiko
> > > <[email protected]> wrote:
> > >>
> > >> On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <[email protected]> wrote:
> > >>>
> > >>> On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
> > >>> <[email protected]> wrote:
> > >>>>
> > >>>> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <[email protected]> wrote:
> > >>>>>
> > >>>>> The "positive" part tests all format specifiers when things go well.
> > >>>>>
> > >>>>> The "negative" part makes sure that incorrect format strings fail at
> > >>>>> load time.
> > >>>>>
> > >>>>> Signed-off-by: Florent Revest <[email protected]>
> > >>>>> ---
> > >>>>> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> > >>>>> .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> > >>>>> .../bpf/progs/test_snprintf_single.c | 20 +++
> > >>>>> 3 files changed, 218 insertions(+)
> > >>>>> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> > >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> > >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> > >>>>>
> > >>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > >>>>> new file mode 100644
> > >>>>> index 000000000000..a958c22aec75
> > >>>>> --- /dev/null
> > >>>>> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > >>>>> @@ -0,0 +1,125 @@
> > >>>>> +// SPDX-License-Identifier: GPL-2.0
> > >>>>> +/* Copyright (c) 2021 Google LLC. */
> > >>>>> +
> > >>>>> +#include <test_progs.h>
> > >>>>> +#include "test_snprintf.skel.h"
> > >>>>> +#include "test_snprintf_single.skel.h"
> > >>>>> +
> > >>>>> +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> > >>>>> +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> > >>>>> +
> > >>>>> +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> > >>>>> +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> > >>>>> +
> > >>>>> +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> > >>>>> +#define EXP_SYM_OUT "schedule schedule+0x0/"
> > >>>>> +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> > >>>>> +
> > >>>>> +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> > >>>>> +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> > >>>>> +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> > >>>>> +
> > >>>>> +#define EXP_STR_OUT "str1 longstr"
> > >>>>> +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> > >>>>> +
> > >>>>> +#define EXP_OVER_OUT "%over"
> > >>>>> +#define EXP_OVER_RET 10
> > >>>>> +
> > >>>>> +#define EXP_PAD_OUT " 4 000"
> > >>>>
> > >>>> Roughly 50% of the time I get failure for this test case:
> > >>>>
> > >>>> test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
> > >>>> 0000' != expected ' 4 000'
> > >>>>
> > >>>> Re-running this test case immediately passes. Running again most
> > >>>> probably fails. Please take a look.
> > >>>
> > >>> Do you have more information on how to reproduce this ?
> > >>> I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:
> > >>>
> > >>> #!/bin/sh
> > >>> for i in `seq 1000`
> > >>> do
> > >>> ./test_progs -t snprintf
> > >>> if [ $? -ne 0 ];
> > >>> then
> > >>> echo FAILURE
> > >>> exit 1
> > >>> fi
> > >>> done
> > >>>
> > >>> The thousand executions passed.
> > >>>
> > >>> This is a bit concerning because your unexpected_pad_out seems to have
> > >>> an extra '0' so it ends up with strlen(pad_out)=11 but
> > >>> sizeof(pad_out)=10. The actual string writing is not really done by
> > >>> our helper code but by the snprintf implementation (str and str_size
> > >>> are only given to snprintf()) so I'd expect the truncation to work
> > >>> well there. I'm a bit puzzled
> > >>
> > >> I'm puzzled too, have no idea. I also can't repro this with vmtest.sh.
> > >> But I can quite reliably reproduce with my local ArchLinux-based qemu
> > >> image with different config (see [0] for config itself). So please try
> > >> with my config and see if that helps to repro. If not, I'll have to
> > >> debug it on my own later.
> > >>
> > >> [0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048
> > >
> > > I tried that config on the same commit 87bd9e602 (bpf-next/master)
> > > with my debian-based qemu image and I still can't reproduce the issue
> > > :| If I can be of any help let me know, I'd be happy to help
> > >
> >
> > It's not really clear to me if this is before or after the rewrite to
> > use bprintf, but regardless, in those two patches this caught my attention:
>
> I tried to reproduce Andrii's bug both before and after the bprintf
> rewrite but I think he meant before.

I'm running on the latest bpf-next master, but I don't think it's
related to bprintf change.

>
> > u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
> > - enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
> > + u32 *bin_args;
> > static char buf[BPF_TRACE_PRINTK_SIZE];
> > unsigned long flags;
> > int ret;
> >
> > - ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
> > - MAX_TRACE_PRINTK_VARARGS);
> > + ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
> > + MAX_TRACE_PRINTK_VARARGS);
> > if (ret < 0)
> > return ret;
> >
> > - ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > - BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
> > - /* snprintf() will not append null for zero-length strings */
> > - if (ret == 0)
> > - buf[0] = '\0';
> > + ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
> >
> > raw_spin_lock_irqsave(&trace_printk_lock, flags);
> > trace_bpf_trace_printk(buf);
> > raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
> >
> > Why isn't the write to buf[] protected by that spinlock? Or put another
> > way, what protects buf[] from concurrent writes?
>
> You're right, that is a bug, I missed that buf was static and thought
> it was just on the stack. That snprintf call should be after the
> raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my
> snprintf series, there was a vsprintf after the raw_spin_lock_irqsave)

Can you please also clean up unnecessary ()s you added in at least a
few places. Thanks.

>
> > Probably the test cases are not run in parallel, but this is the kind of
> > thing that would give those symptoms.
>
> I think it's a separate issue from what Andrii reported though because
> the flaky test exercises the bpf_snprintf helper and this buf spinlock
> bug you just found only affects the bpf_trace_printk helper.
>
> That being said, it does smell a little bit like a concurrency issue
> too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it
> attaches to all syscall entries and most likely gets executed many
> more times than necessary and probably on parallel CPUs. The "pad_out"
> buffer they write to is unique and not locked so maybe the test's
> userspace reads pad_out while another CPU is writing on it and if the
> string output goes through a stage where it is " 4 0000" before
> being " 4 000", we might read at the wrong time. That being said, I
> would find it weird that this happens as much as 50% of the time and
> always specifically on that test case.
>
> Andrii could you maybe try changing the prog type to
> "tp/syscalls/sys_enter_nanosleep" on the machine where you can
> reproduce this bug ?

Yes, it helps. I can't repro it easily anymore. I think the right fix,
though, should be to filter by tid, not change the tracepoint.

I think what's happening is we see the string right before bstr_printf
does zero-termination with end[-1] = '\0'; So in some cases we see
truncated string, in others we see untruncated one.

>
> > Rasmus

2021-04-28 17:36:51

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Tue, Apr 27, 2021 at 8:03 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Tue, Apr 27, 2021 at 2:51 AM Florent Revest <[email protected]> wrote:
> >
> > On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes
> > <[email protected]> wrote:
> > > u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
> > > - enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
> > > + u32 *bin_args;
> > > static char buf[BPF_TRACE_PRINTK_SIZE];
> > > unsigned long flags;
> > > int ret;
> > >
> > > - ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
> > > - MAX_TRACE_PRINTK_VARARGS);
> > > + ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
> > > + MAX_TRACE_PRINTK_VARARGS);
> > > if (ret < 0)
> > > return ret;
> > >
> > > - ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > > - BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
> > > - /* snprintf() will not append null for zero-length strings */
> > > - if (ret == 0)
> > > - buf[0] = '\0';
> > > + ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
> > >
> > > raw_spin_lock_irqsave(&trace_printk_lock, flags);
> > > trace_bpf_trace_printk(buf);
> > > raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
> > >
> > > Why isn't the write to buf[] protected by that spinlock? Or put another
> > > way, what protects buf[] from concurrent writes?
> >
> > You're right, that is a bug, I missed that buf was static and thought
> > it was just on the stack. That snprintf call should be after the
> > raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my
> > snprintf series, there was a vsprintf after the raw_spin_lock_irqsave)

Solved now

> Can you please also clean up unnecessary ()s you added in at least a
> few places. Thanks.

Alexei said he took care of this .:)

> > > Probably the test cases are not run in parallel, but this is the kind of
> > > thing that would give those symptoms.
> >
> > I think it's a separate issue from what Andrii reported though because
> > the flaky test exercises the bpf_snprintf helper and this buf spinlock
> > bug you just found only affects the bpf_trace_printk helper.
> >
> > That being said, it does smell a little bit like a concurrency issue
> > too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it
> > attaches to all syscall entries and most likely gets executed many
> > more times than necessary and probably on parallel CPUs. The "pad_out"
> > buffer they write to is unique and not locked so maybe the test's
> > userspace reads pad_out while another CPU is writing on it and if the
> > string output goes through a stage where it is " 4 0000" before
> > being " 4 000", we might read at the wrong time. That being said, I
> > would find it weird that this happens as much as 50% of the time and
> > always specifically on that test case.
> >
> > Andrii could you maybe try changing the prog type to
> > "tp/syscalls/sys_enter_nanosleep" on the machine where you can
> > reproduce this bug ?
>
> Yes, it helps. I can't repro it easily anymore.

Good, so it does sound like a concurrency issue indeed

> I think the right fix, though, should be to filter by tid, not change the tracepoint.

Agreed, I'll send a patch for that today. :)

> I think what's happening is we see the string right before bstr_printf
> does zero-termination with end[-1] = '\0'; So in some cases we see
> truncated string, in others we see untruncated one.

Makes sense but I still wonder why it happens so often (50% of the
time is really a lot) and why it is consistently that one test case
that fails and not the "overflow" case for example. But I'm confident
that this is not a bug in the helper now and that the tid filter will
fix the test.

2021-05-05 06:57:21

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On 28/04/2021 16.59, Florent Revest wrote:
> On Tue, Apr 27, 2021 at 8:03 PM Andrii Nakryiko
> <[email protected]> wrote:
>>
>> On Tue, Apr 27, 2021 at 2:51 AM Florent Revest <[email protected]> wrote:
>>>
>>> On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes
>>> <[email protected]> wrote:
>>>> u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
>>>> - enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
>>>> + u32 *bin_args;
>>>> static char buf[BPF_TRACE_PRINTK_SIZE];
>>>> unsigned long flags;
>>>> int ret;
>>>>
>>>> - ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
>>>> - MAX_TRACE_PRINTK_VARARGS);
>>>> + ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
>>>> + MAX_TRACE_PRINTK_VARARGS);
>>>> if (ret < 0)
>>>> return ret;
>>>>
>>>> - ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
>>>> - BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
>>>> - /* snprintf() will not append null for zero-length strings */
>>>> - if (ret == 0)
>>>> - buf[0] = '\0';
>>>> + ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
>>>>
>>>> raw_spin_lock_irqsave(&trace_printk_lock, flags);
>>>> trace_bpf_trace_printk(buf);
>>>> raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
>>>>
>>>> Why isn't the write to buf[] protected by that spinlock? Or put another
>>>> way, what protects buf[] from concurrent writes?
>>>
>>> You're right, that is a bug, I missed that buf was static and thought
>>> it was just on the stack. That snprintf call should be after the
>>> raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my
>>> snprintf series, there was a vsprintf after the raw_spin_lock_irqsave)
>
> Solved now
>
>> Can you please also clean up unnecessary ()s you added in at least a
>> few places. Thanks.
>
> Alexei said he took care of this .:)
>
>>>> Probably the test cases are not run in parallel, but this is the kind of
>>>> thing that would give those symptoms.
>>>
>>> I think it's a separate issue from what Andrii reported though because
>>> the flaky test exercises the bpf_snprintf helper and this buf spinlock
>>> bug you just found only affects the bpf_trace_printk helper.
>>>
>>> That being said, it does smell a little bit like a concurrency issue
>>> too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it
>>> attaches to all syscall entries and most likely gets executed many
>>> more times than necessary and probably on parallel CPUs. The "pad_out"
>>> buffer they write to is unique and not locked so maybe the test's
>>> userspace reads pad_out while another CPU is writing on it and if the
>>> string output goes through a stage where it is " 4 0000" before
>>> being " 4 000", we might read at the wrong time. That being said, I
>>> would find it weird that this happens as much as 50% of the time and
>>> always specifically on that test case.
>>>
>>> Andrii could you maybe try changing the prog type to
>>> "tp/syscalls/sys_enter_nanosleep" on the machine where you can
>>> reproduce this bug ?
>>
>> Yes, it helps. I can't repro it easily anymore.
>
> Good, so it does sound like a concurrency issue indeed
>
>> I think the right fix, though, should be to filter by tid, not change the tracepoint.
>
> Agreed, I'll send a patch for that today. :)
>
>> I think what's happening is we see the string right before bstr_printf
>> does zero-termination with end[-1] = '\0'; So in some cases we see
>> truncated string, in others we see untruncated one.
>
> Makes sense but I still wonder why it happens so often (50% of the
> time is really a lot) and why it is consistently that one test case
> that fails and not the "overflow" case for example. But I'm confident
> that this is not a bug in the helper now and that the tid filter will
> fix the test.
>

If the caller, or one of its sibling threads, inspects the buffer before
(v)snprintf has returned it's very obviously a bug in the caller. As for
why that particular case exposes the race: It seems to be the only one
that "expects" an insanely long result, and it takes a very very long
time (several cycles per byte I'd assume) for the vsnprintf code to very
carefully go through the

if (buf < end)
*buf = /* '0' or ' ' or whatever it is it is emitting here */
buf++;

900k times. So there's simply a very large window where the buffer
contents is " 4 0000" while number() is still 'emitting' 0s until
control returns to vsnprintf() which does that final end[-1] = '\0'.

Rasmus

2021-05-05 14:27:55

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf

On Wed, May 5, 2021 at 8:55 AM Rasmus Villemoes
<[email protected]> wrote:
>
> On 28/04/2021 16.59, Florent Revest wrote:
> > On Tue, Apr 27, 2021 at 8:03 PM Andrii Nakryiko
> > <[email protected]> wrote:
> >>
> >> On Tue, Apr 27, 2021 at 2:51 AM Florent Revest <[email protected]> wrote:
> >>>
> >>> On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes
> >>> <[email protected]> wrote:
> >>>> Probably the test cases are not run in parallel, but this is the kind of
> >>>> thing that would give those symptoms.
> >>>
> >>> I think it's a separate issue from what Andrii reported though because
> >>> the flaky test exercises the bpf_snprintf helper and this buf spinlock
> >>> bug you just found only affects the bpf_trace_printk helper.
> >>>
> >>> That being said, it does smell a little bit like a concurrency issue
> >>> too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it
> >>> attaches to all syscall entries and most likely gets executed many
> >>> more times than necessary and probably on parallel CPUs. The "pad_out"
> >>> buffer they write to is unique and not locked so maybe the test's
> >>> userspace reads pad_out while another CPU is writing on it and if the
> >>> string output goes through a stage where it is " 4 0000" before
> >>> being " 4 000", we might read at the wrong time. That being said, I
> >>> would find it weird that this happens as much as 50% of the time and
> >>> always specifically on that test case.
> >>>
> >>> Andrii could you maybe try changing the prog type to
> >>> "tp/syscalls/sys_enter_nanosleep" on the machine where you can
> >>> reproduce this bug ?
> >>
> >> Yes, it helps. I can't repro it easily anymore.
> >
> > Good, so it does sound like a concurrency issue indeed
> >
> >> I think the right fix, though, should be to filter by tid, not change the tracepoint.
> >
> > Agreed, I'll send a patch for that today. :)
> >
> >> I think what's happening is we see the string right before bstr_printf
> >> does zero-termination with end[-1] = '\0'; So in some cases we see
> >> truncated string, in others we see untruncated one.
> >
> > Makes sense but I still wonder why it happens so often (50% of the
> > time is really a lot) and why it is consistently that one test case
> > that fails and not the "overflow" case for example. But I'm confident
> > that this is not a bug in the helper now and that the tid filter will
> > fix the test.
> >
>
> If the caller, or one of its sibling threads, inspects the buffer before
> (v)snprintf has returned it's very obviously a bug in the caller. As for

Absolutely

> why that particular case exposes the race: It seems to be the only one
> that "expects" an insanely long result, and it takes a very very long
> time (several cycles per byte I'd assume) for the vsnprintf code to very
> carefully go through the
>
> if (buf < end)
> *buf = /* '0' or ' ' or whatever it is it is emitting here */
> buf++;
>
> 900k times. So there's simply a very large window where the buffer
> contents is " 4 0000" while number() is still 'emitting' 0s until
> control returns to vsnprintf() which does that final end[-1] = '\0'.

Aaah, of course, it makes complete sense! :) Thank you Rasmus