2020-12-07 16:12:30

by Brendan Jackman

Subject: [PATCH bpf-next v4 00/11] Atomics for eBPF

Status of the patches
=====================

Thanks for the reviews! Differences from v3->v4 [1]:

* Added one Ack from Yonghong. He acked some other patches but those
have now changed non-trivially, so I didn't add those acks.

* Fixups to commit messages.

* Fixed disassembly and comments: first arg to atomic_fetch_* is a
pointer.

* Improved prog_test efficiency. BPF progs are now all loaded in a
single call, then the skeleton is re-used for each subtest.

* Dropped use of tools/build/feature in favour of a one-liner in the
Makefile.

* Dropped the commit that created an emit_neg helper in the x86
JIT. It's not used any more (it wasn't used in v3 either).

* Combined all the different filter.h macros (used to be
BPF_ATOMIC_ADD, BPF_ATOMIC_FETCH_ADD, BPF_ATOMIC_AND, etc) into
just BPF_ATOMIC32 and BPF_ATOMIC64.

* Removed some references to BPF_STX_XADD from tools/, samples/ and
lib/ that I missed before.

Differences from v2->v3 [1]:

* More minor fixes and naming/comment changes

* Dropped atomic subtract: compilers can implement this by preceding
an atomic add with a NEG instruction (which is what the x86 JIT did
under the hood anyway). A sketch of this lowering follows this list.

* Dropped the use of -mcpu=v4 in the Clang BPF command-line; there is
no longer an architecture version bump. Instead a feature test is
added to Kbuild - it builds a source file to check if Clang
supports BPF atomics.

* Fixed the prog_test so it no longer breaks
test_progs-no_alu32. This requires some ifdef acrobatics to avoid
complicating the prog_tests model where the same userspace code
exercises both the normal and no_alu32 BPF test objects, using the
same skeleton header.
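
A sketch of the subtract lowering mentioned above, in C-like pseudocode
(note the NEG is an ordinary ALU instruction; only the add is atomic):

    src_reg = -src_reg;                                   /* BPF_ALU64 | BPF_NEG */
    src_reg = atomic_fetch_add(dst_reg + off16, src_reg); /* BPF_ADD | BPF_FETCH */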

Differences from v1->v2 [1]:

* Fixed mistakes in the netronome driver

* Added sub, and, or, xor operations

* The above led to some refactors to keep things readable. (Maybe I
should have just waited until I'd implemented these before starting
the review...)

* Replaced BPF_[CMP]SET | BPF_FETCH with just BPF_[CMP]XCHG, both of
which include the BPF_FETCH flag

* Added a bit of documentation. Suggestions welcome for more places
to dump this info...

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in commit 286daafd6512 (was
https://reviews.llvm.org/D72184).

This only includes a JIT implementation for x86_64 - I don't plan to
implement JIT support myself for other architectures.

Operations
==========

This patchset adds atomic operations to the eBPF instruction set. The
use-case that motivated this work was a trivial and efficient way to
generate globally-unique cookies in BPF progs, but I think it's
obvious that these features are pretty widely applicable. The
instructions that are added here can be summarised with this list of
kernel operations:

* atomic[64]_[fetch_]add
* atomic[64]_[fetch_]and
* atomic[64]_[fetch_]or
* atomic[64]_xchg
* atomic[64]_cmpxchg
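
As a concrete illustration of the cookie use-case, in BPF C (a sketch:
it relies on the Clang __sync_* builtins exercised by the selftests
later in this series, assumes the usual BPF selftest headers, and the
names here are illustrative):

    __u64 next_cookie = 0;

    static __u64 gen_cookie(void)
    {
        /* Compiles to a single BPF_ATOMIC | BPF_ADD | BPF_FETCH
         * instruction, safe against concurrent callers on any CPU.
         */
        return __sync_fetch_and_add(&next_cookie, 1);
    }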

The following are left out of scope for this effort:

* 16 and 8 bit operations
* Explicit memory barriers

Encoding
========

I originally planned to add new values for bpf_insn.opcode. This was
rather unpleasant: the opcode space has holes in it, but no entire
instruction class is free[2]. Yonghong Song had a better idea: use the
immediate field of the existing STX XADD instruction to encode the
operation. This works nicely, without breaking existing programs,
because the immediate field is currently reserved-must-be-zero, and
extra-nicely because BPF_ADD happens to be zero.

Note that this of course makes immediate-source atomic operations
impossible. It's hard to imagine a measurable speedup from such
instructions, and if it existed it would certainly not benefit x86,
which has no support for them.

The BPF_OP opcode fields are re-used in the immediate, and an
additional flag BPF_FETCH is used to mark instructions that should
fetch a pre-modification value from memory.

So, BPF_XADD is now called BPF_ATOMIC (the old name is kept to avoid
breaking userspace builds), and where we previously had .imm = 0, we
now have .imm = BPF_ADD (which is 0).
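
To make the encoding concrete: a 64-bit atomic_fetch_add to
*(u64 *)(r10 - 8) would look like this (a sketch based on the filter.h
macros added later in the series; the register choices are
illustrative):

    struct bpf_insn insn = {
        .code    = BPF_STX | BPF_DW | BPF_ATOMIC,
        .dst_reg = BPF_REG_10, /* memory operand is r10 - 8 */
        .src_reg = BPF_REG_1,  /* addend; receives the old value */
        .off     = -8,
        .imm     = BPF_ADD | BPF_FETCH,
    };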

Operands
========

Reg-source eBPF instructions have only two operands, while these
atomic operations have up to four. To avoid needing to encode
additional operands:

- One of the input registers is re-used as an output register
(e.g. atomic_fetch_add both reads from and writes to the source
register).

- Where necessary (i.e. for cmpxchg), R0 is "hard-coded" as one of
the operands.

This approach also allows the new eBPF instructions to map directly
to single x86 instructions.
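
The resulting operand conventions, as also spelled out in the filter.h
comments later in the series:

    BPF_ADD              *(uint *) (dst_reg + off16) += src_reg
    BPF_ADD | BPF_FETCH  src_reg = atomic_fetch_add(dst_reg + off16, src_reg)
    BPF_XCHG             src_reg = atomic_xchg(dst_reg + off16, src_reg)
    BPF_CMPXCHG          r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)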

[1] Previous iterations:
v1: https://lore.kernel.org/bpf/[email protected]/
v2: https://lore.kernel.org/bpf/[email protected]/
v3: https://lore.kernel.org/bpf/[email protected]/

[2] Visualisation of eBPF opcode space:
https://gist.github.com/bjackman/00fdad2d5dfff601c1918bc29b16e778

Brendan Jackman (11):
bpf: x86: Factor out emission of ModR/M for *(reg + off)
bpf: x86: Factor out emission of REX byte
bpf: x86: Factor out a lookup table for some ALU opcodes
bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
bpf: Move BPF_STX reserved field check into BPF_STX verifier code
bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
bpf: Add instructions for atomic_[cmp]xchg
bpf: Pull out a macro for interpreting atomic ALU operations
bpf: Add bitwise atomic instructions
bpf: Add tests for new BPF atomic operations
bpf: Document new atomic instructions

Documentation/networking/filter.rst | 56 +++-
arch/arm/net/bpf_jit_32.c | 7 +-
arch/arm64/net/bpf_jit_comp.c | 16 +-
arch/mips/net/ebpf_jit.c | 11 +-
arch/powerpc/net/bpf_jit_comp64.c | 25 +-
arch/riscv/net/bpf_jit_comp32.c | 20 +-
arch/riscv/net/bpf_jit_comp64.c | 16 +-
arch/s390/net/bpf_jit_comp.c | 27 +-
arch/sparc/net/bpf_jit_comp_64.c | 17 +-
arch/x86/net/bpf_jit_comp.c | 217 ++++++++++-----
arch/x86/net/bpf_jit_comp32.c | 6 +-
drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 +-
drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +-
.../net/ethernet/netronome/nfp/bpf/verifier.c | 15 +-
include/linux/filter.h | 118 ++++++++-
include/uapi/linux/bpf.h | 10 +-
kernel/bpf/core.c | 67 ++++-
kernel/bpf/disasm.c | 43 ++-
kernel/bpf/verifier.c | 75 ++++--
lib/test_bpf.c | 14 +-
samples/bpf/bpf_insn.h | 4 +-
samples/bpf/cookie_uid_helper_example.c | 6 +-
samples/bpf/sock_example.c | 2 +-
samples/bpf/test_cgrp2_attach.c | 5 +-
tools/include/linux/filter.h | 127 ++++++++-
tools/include/uapi/linux/bpf.h | 10 +-
tools/testing/selftests/bpf/Makefile | 10 +
.../selftests/bpf/prog_tests/atomics.c | 246 ++++++++++++++++++
.../bpf/prog_tests/cgroup_attach_multi.c | 4 +-
tools/testing/selftests/bpf/progs/atomics.c | 154 +++++++++++
.../selftests/bpf/test_cgroup_storage.c | 2 +-
.../selftests/bpf/verifier/atomic_and.c | 77 ++++++
.../selftests/bpf/verifier/atomic_cmpxchg.c | 96 +++++++
.../selftests/bpf/verifier/atomic_fetch_add.c | 106 ++++++++
.../selftests/bpf/verifier/atomic_or.c | 77 ++++++
.../selftests/bpf/verifier/atomic_xchg.c | 46 ++++
.../selftests/bpf/verifier/atomic_xor.c | 77 ++++++
tools/testing/selftests/bpf/verifier/ctx.c | 7 +-
.../bpf/verifier/direct_packet_access.c | 4 +-
.../testing/selftests/bpf/verifier/leak_ptr.c | 10 +-
.../selftests/bpf/verifier/meta_access.c | 4 +-
tools/testing/selftests/bpf/verifier/unpriv.c | 3 +-
.../bpf/verifier/value_illegal_alu.c | 2 +-
tools/testing/selftests/bpf/verifier/xadd.c | 18 +-
44 files changed, 1665 insertions(+), 210 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c
create mode 100644 tools/testing/selftests/bpf/progs/atomics.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c


base-commit: 34da87213d3ddd26643aa83deff7ffc6463da0fc
--
2.29.2.576.ga3fc446d84-goog


2020-12-07 16:12:33

by Brendan Jackman

Subject: [PATCH bpf-next v4 05/11] bpf: Move BPF_STX reserved field check into BPF_STX verifier code

I can't find a reason why this code is in resolve_pseudo_ldimm64;
since I'll be modifying it in a subsequent commit, tidy it up.

Signed-off-by: Brendan Jackman <[email protected]>
---
kernel/bpf/verifier.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 615be10abd71..745c53df0485 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9527,6 +9527,12 @@ static int do_check(struct bpf_verifier_env *env)
} else if (class == BPF_STX) {
enum bpf_reg_type *prev_dst_type, dst_reg_type;

+ if (((BPF_MODE(insn->code) != BPF_MEM &&
+ BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) {
+ verbose(env, "BPF_STX uses reserved fields\n");
+ return -EINVAL;
+ }
+
if (BPF_MODE(insn->code) == BPF_ATOMIC) {
err = check_atomic(env, env->insn_idx, insn);
if (err)
@@ -9939,13 +9945,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
return -EINVAL;
}

- if (BPF_CLASS(insn->code) == BPF_STX &&
- ((BPF_MODE(insn->code) != BPF_MEM &&
- BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) {
- verbose(env, "BPF_STX uses reserved fields\n");
- return -EINVAL;
- }
-
if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) {
struct bpf_insn_aux_data *aux;
struct bpf_map *map;
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:12:47

by Brendan Jackman

Subject: [PATCH bpf-next v4 08/11] bpf: Pull out a macro for interpreting atomic ALU operations

Since the atomic operations that are added in subsequent commits are
all isomorphic with BPF_ADD, pull out a macro to avoid the
interpreter becoming dominated by lines of atomic-related code.

Note that this sacrifices interpreter performance (combining
STX_ATOMIC_W and STX_ATOMIC_DW into a single switch case means that we
need an extra conditional branch to differentiate them) in favour of
compact and (relatively!) simple C code.
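
For illustration, the bitwise-ops patch later in this series is then
expected to handle each new operation with a single line per op,
something like:

    ATOMIC_ALU_OP(BPF_ADD, add)
    ATOMIC_ALU_OP(BPF_AND, and)
    ATOMIC_ALU_OP(BPF_OR, or)
    ATOMIC_ALU_OP(BPF_XOR, xor)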

Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Brendan Jackman <[email protected]>
---
kernel/bpf/core.c | 80 +++++++++++++++++++++++------------------------
1 file changed, 39 insertions(+), 41 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 28f960bc2e30..1d9e5dcde03a 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1618,55 +1618,53 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
LDX_PROBE(DW, 8)
#undef LDX_PROBE

- STX_ATOMIC_W:
- switch (IMM) {
- case BPF_ADD:
- /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
- atomic_add((u32) SRC, (atomic_t *)(unsigned long)
- (DST + insn->off));
- break;
- case BPF_ADD | BPF_FETCH:
- SRC = (u32) atomic_fetch_add(
- (u32) SRC,
- (atomic_t *)(unsigned long) (DST + insn->off));
- break;
- case BPF_XCHG:
- SRC = (u32) atomic_xchg(
- (atomic_t *)(unsigned long) (DST + insn->off),
- (u32) SRC);
- break;
- case BPF_CMPXCHG:
- BPF_R0 = (u32) atomic_cmpxchg(
- (atomic_t *)(unsigned long) (DST + insn->off),
- (u32) BPF_R0, (u32) SRC);
+#define ATOMIC_ALU_OP(BOP, KOP) \
+ case BOP: \
+ if (BPF_SIZE(insn->code) == BPF_W) \
+ atomic_##KOP((u32) SRC, (atomic_t *)(unsigned long) \
+ (DST + insn->off)); \
+ else \
+ atomic64_##KOP((u64) SRC, (atomic64_t *)(unsigned long) \
+ (DST + insn->off)); \
+ break; \
+ case BOP | BPF_FETCH: \
+ if (BPF_SIZE(insn->code) == BPF_W) \
+ SRC = (u32) atomic_fetch_##KOP( \
+ (u32) SRC, \
+ (atomic_t *)(unsigned long) (DST + insn->off)); \
+ else \
+ SRC = (u64) atomic64_fetch_##KOP( \
+ (u64) SRC, \
+ (atomic64_t *)(s64) (DST + insn->off)); \
break;
- default:
- goto default_label;
- }
- CONT;

STX_ATOMIC_DW:
+ STX_ATOMIC_W:
switch (IMM) {
- case BPF_ADD:
- /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */
- atomic64_add((u64) SRC, (atomic64_t *)(unsigned long)
- (DST + insn->off));
- break;
- case BPF_ADD | BPF_FETCH:
- SRC = (u64) atomic64_fetch_add(
- (u64) SRC,
- (atomic64_t *)(s64) (DST + insn->off));
- break;
+ ATOMIC_ALU_OP(BPF_ADD, add)
+#undef ATOMIC_ALU_OP
+
case BPF_XCHG:
- SRC = (u64) atomic64_xchg(
- (atomic64_t *)(u64) (DST + insn->off),
- (u64) SRC);
+ if (BPF_SIZE(insn->code) == BPF_W)
+ SRC = (u32) atomic_xchg(
+ (atomic_t *)(unsigned long) (DST + insn->off),
+ (u32) SRC);
+ else
+ SRC = (u64) atomic64_xchg(
+ (atomic64_t *)(u64) (DST + insn->off),
+ (u64) SRC);
break;
case BPF_CMPXCHG:
- BPF_R0 = (u64) atomic64_cmpxchg(
- (atomic64_t *)(u64) (DST + insn->off),
- (u64) BPF_R0, (u64) SRC);
+ if (BPF_SIZE(insn->code) == BPF_W)
+ BPF_R0 = (u32) atomic_cmpxchg(
+ (atomic_t *)(unsigned long) (DST + insn->off),
+ (u32) BPF_R0, (u32) SRC);
+ else
+ BPF_R0 = (u64) atomic64_cmpxchg(
+ (atomic64_t *)(u64) (DST + insn->off),
+ (u64) BPF_R0, (u64) SRC);
break;
+
default:
goto default_label;
}
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:13:01

by Brendan Jackman

Subject: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184).

Note the use of a define called ENABLE_ATOMICS_TESTS: this is used
to:

- Avoid breaking the build for people on old versions of Clang
- Avoid needing separate lists of test objects for no_alu32, where
atomics are not supported even if Clang has the feature.

The atomics.o BPF object is built unconditionally both for
test_progs and test_progs-no_alu32. For test_progs, if Clang supports
atomics, ENABLE_ATOMICS_TESTS is defined, so it includes the proper
test code. Otherwise, progs and global vars are defined anyway, as
stubs; this means that the skeleton user code still builds.

The atomics.o userspace object is built once and used for both
test_progs and test_progs-no_alu32. A variable called skip_tests is
defined in the BPF object's data section, which tells the userspace
object whether to skip the atomics test.

Signed-off-by: Brendan Jackman <[email protected]>
---
tools/testing/selftests/bpf/Makefile | 10 +
.../selftests/bpf/prog_tests/atomics.c | 246 ++++++++++++++++++
tools/testing/selftests/bpf/progs/atomics.c | 154 +++++++++++
.../selftests/bpf/verifier/atomic_and.c | 77 ++++++
.../selftests/bpf/verifier/atomic_cmpxchg.c | 96 +++++++
.../selftests/bpf/verifier/atomic_fetch_add.c | 106 ++++++++
.../selftests/bpf/verifier/atomic_or.c | 77 ++++++
.../selftests/bpf/verifier/atomic_xchg.c | 46 ++++
.../selftests/bpf/verifier/atomic_xor.c | 77 ++++++
9 files changed, 889 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c
create mode 100644 tools/testing/selftests/bpf/progs/atomics.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c
create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index ac25ba5d0d6c..13bc1d736164 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -239,6 +239,12 @@ BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) \
-I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR) \
-I$(abspath $(OUTPUT)/../usr/include)

+# BPF atomics support was added to Clang in llvm-project commit 286daafd6512
+# (release 12.0.0).
+BPF_ATOMICS_SUPPORTED = $(shell \
+ echo "int x = 0; int foo(void) { return __sync_val_compare_and_swap(&x, 1, 2); }" \
+ | $(CLANG) -x cpp-output -S -target bpf -mcpu=v3 - -o /dev/null && echo 1 || echo 0)
+
CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
-Wno-compare-distinct-pointer-types

@@ -399,11 +405,15 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \
$(wildcard progs/btf_dump_test_case_*.c)
TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE
TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
+ifeq ($(BPF_ATOMICS_SUPPORTED),1)
+ TRUNNER_BPF_CFLAGS += -DENABLE_ATOMICS_TESTS
+endif
TRUNNER_BPF_LDFLAGS := -mattr=+alu32
$(eval $(call DEFINE_TEST_RUNNER,test_progs))

# Define test_progs-no_alu32 test runner.
TRUNNER_BPF_BUILD_RULE := CLANG_NOALU32_BPF_BUILD_RULE
+TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
TRUNNER_BPF_LDFLAGS :=
$(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32))

diff --git a/tools/testing/selftests/bpf/prog_tests/atomics.c b/tools/testing/selftests/bpf/prog_tests/atomics.c
new file mode 100644
index 000000000000..c841a3abc2f7
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/atomics.c
@@ -0,0 +1,246 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+
+#include "atomics.skel.h"
+
+static void test_add(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.add);
+ if (CHECK(IS_ERR(link), "attach(add)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.add);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run add",
+ "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->add64_value, 3, "add64_value");
+ ASSERT_EQ(skel->bss->add64_result, 1, "add64_result");
+
+ ASSERT_EQ(skel->data->add32_value, 3, "add32_value");
+ ASSERT_EQ(skel->bss->add32_result, 1, "add32_result");
+
+ ASSERT_EQ(skel->bss->add_stack_value_copy, 3, "add_stack_value");
+ ASSERT_EQ(skel->bss->add_stack_result, 1, "add_stack_result");
+
+ ASSERT_EQ(skel->data->add_noreturn_value, 3, "add_noreturn_value");
+
+cleanup:
+ bpf_link__destroy(link);
+}
+
+static void test_sub(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.sub);
+ if (CHECK(IS_ERR(link), "attach(sub)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.sub);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run sub",
+ "err %d errno %d retval %d duration %d\n",
+ err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->sub64_value, -1, "sub64_value");
+ ASSERT_EQ(skel->bss->sub64_result, 1, "sub64_result");
+
+ ASSERT_EQ(skel->data->sub32_value, -1, "sub32_value");
+ ASSERT_EQ(skel->bss->sub32_result, 1, "sub32_result");
+
+ ASSERT_EQ(skel->bss->sub_stack_value_copy, -1, "sub_stack_value");
+ ASSERT_EQ(skel->bss->sub_stack_result, 1, "sub_stack_result");
+
+ ASSERT_EQ(skel->data->sub_noreturn_value, -1, "sub_noreturn_value");
+
+cleanup:
+ bpf_link__destroy(link);
+}
+
+static void test_and(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.and);
+ if (CHECK(IS_ERR(link), "attach(and)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.and);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run and",
+ "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->and64_value, 0x010ull << 32, "and64_value");
+ ASSERT_EQ(skel->bss->and64_result, 0x110ull << 32, "and64_result");
+
+ ASSERT_EQ(skel->data->and32_value, 0x010, "and32_value");
+ ASSERT_EQ(skel->bss->and32_result, 0x110, "and32_result");
+
+ ASSERT_EQ(skel->data->and_noreturn_value, 0x010ull << 32, "and_noreturn_value");
+cleanup:
+ bpf_link__destroy(link);
+}
+
+static void test_or(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.or);
+ if (CHECK(IS_ERR(link), "attach(or)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.or);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run or",
+ "err %d errno %d retval %d duration %d\n",
+ err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->or64_value, 0x111ull << 32, "or64_value");
+ ASSERT_EQ(skel->bss->or64_result, 0x110ull << 32, "or64_result");
+
+ ASSERT_EQ(skel->data->or32_value, 0x111, "or32_value");
+ ASSERT_EQ(skel->bss->or32_result, 0x110, "or32_result");
+
+ ASSERT_EQ(skel->data->or_noreturn_value, 0x111ull << 32, "or_noreturn_value");
+cleanup:
+ bpf_link__destroy(link);
+}
+
+static void test_xor(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.xor);
+ if (CHECK(IS_ERR(link), "attach(xor)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.xor);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run xor",
+ "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->xor64_value, 0x101ull << 32, "xor64_value");
+ ASSERT_EQ(skel->bss->xor64_result, 0x110ull << 32, "xor64_result");
+
+ ASSERT_EQ(skel->data->xor32_value, 0x101, "xor32_value");
+ ASSERT_EQ(skel->bss->xor32_result, 0x110, "xor32_result");
+
+ ASSERT_EQ(skel->data->xor_noreturn_value, 0x101ull << 32, "xor_noreturn_value");
+cleanup:
+ bpf_link__destroy(link);
+}
+
+static void test_cmpxchg(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.cmpxchg);
+ if (CHECK(IS_ERR(link), "attach(cmpxchg)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.cmpxchg);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run add",
+ "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->cmpxchg64_value, 2, "cmpxchg64_value");
+ ASSERT_EQ(skel->bss->cmpxchg64_result_fail, 1, "cmpxchg64_result_fail");
+ ASSERT_EQ(skel->bss->cmpxchg64_result_succeed, 1, "cmpxchg64_result_succeed");
+
+ ASSERT_EQ(skel->data->cmpxchg32_value, 2, "cmpxchg32_value");
+ ASSERT_EQ(skel->bss->cmpxchg32_result_fail, 1, "cmpxchg32_result_fail");
+ ASSERT_EQ(skel->bss->cmpxchg32_result_succeed, 1, "cmpxchg32_result_succeed");
+
+cleanup:
+ bpf_link__destroy(link);
+}
+
+static void test_xchg(struct atomics *skel)
+{
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ struct bpf_link *link;
+
+ link = bpf_program__attach(skel->progs.xchg);
+ if (CHECK(IS_ERR(link), "attach(xchg)", "err: %ld\n", PTR_ERR(link)))
+ return;
+
+ prog_fd = bpf_program__fd(skel->progs.xchg);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+ NULL, NULL, &retval, &duration);
+ if (CHECK(err || retval, "test_run add",
+ "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
+ goto cleanup;
+
+ ASSERT_EQ(skel->data->xchg64_value, 2, "xchg64_value");
+ ASSERT_EQ(skel->bss->xchg64_result, 1, "xchg64_result");
+
+ ASSERT_EQ(skel->data->xchg32_value, 2, "xchg32_value");
+ ASSERT_EQ(skel->bss->xchg32_result, 1, "xchg32_result");
+
+cleanup:
+ bpf_link__destroy(link);
+}
+
+void test_atomics(void)
+{
+ struct atomics *skel;
+ __u32 duration = 0;
+
+ skel = atomics__open_and_load();
+ if (CHECK(!skel, "skel_load", "atomics skeleton failed\n"))
+ return;
+
+ if (skel->data->skip_tests) {
+ printf("%s:SKIP:no ENABLE_ATOMICS_TESTS (missing Clang BPF atomics support)",
+ __func__);
+ test__skip();
+ goto cleanup;
+ }
+
+ if (test__start_subtest("add"))
+ test_add(skel);
+ if (test__start_subtest("sub"))
+ test_sub(skel);
+ if (test__start_subtest("and"))
+ test_and(skel);
+ if (test__start_subtest("or"))
+ test_or(skel);
+ if (test__start_subtest("xor"))
+ test_xor(skel);
+ if (test__start_subtest("cmpxchg"))
+ test_cmpxchg(skel);
+ if (test__start_subtest("xchg"))
+ test_xchg(skel);
+
+cleanup:
+ atomics__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/atomics.c b/tools/testing/selftests/bpf/progs/atomics.c
new file mode 100644
index 000000000000..d40c93496843
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/atomics.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <stdbool.h>
+
+#ifdef ENABLE_ATOMICS_TESTS
+bool skip_tests __attribute((__section__(".data"))) = false;
+#else
+bool skip_tests = true;
+#endif
+
+__u64 add64_value = 1;
+__u64 add64_result = 0;
+__u32 add32_value = 1;
+__u32 add32_result = 0;
+__u64 add_stack_value_copy = 0;
+__u64 add_stack_result = 0;
+__u64 add_noreturn_value = 1;
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(add, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ __u64 add_stack_value = 1;
+
+ add64_result = __sync_fetch_and_add(&add64_value, 2);
+ add32_result = __sync_fetch_and_add(&add32_value, 2);
+ add_stack_result = __sync_fetch_and_add(&add_stack_value, 2);
+ add_stack_value_copy = add_stack_value;
+ __sync_fetch_and_add(&add_noreturn_value, 2);
+#endif
+
+ return 0;
+}
+
+__s64 sub64_value = 1;
+__s64 sub64_result = 0;
+__s32 sub32_value = 1;
+__s32 sub32_result = 0;
+__s64 sub_stack_value_copy = 0;
+__s64 sub_stack_result = 0;
+__s64 sub_noreturn_value = 1;
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(sub, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ __u64 sub_stack_value = 1;
+
+ sub64_result = __sync_fetch_and_sub(&sub64_value, 2);
+ sub32_result = __sync_fetch_and_sub(&sub32_value, 2);
+ sub_stack_result = __sync_fetch_and_sub(&sub_stack_value, 2);
+ sub_stack_value_copy = sub_stack_value;
+ __sync_fetch_and_sub(&sub_noreturn_value, 2);
+#endif
+
+ return 0;
+}
+
+__u64 and64_value = (0x110ull << 32);
+__u64 and64_result = 0;
+__u32 and32_value = 0x110;
+__u32 and32_result = 0;
+__u64 and_noreturn_value = (0x110ull << 32);
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(and, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+
+ and64_result = __sync_fetch_and_and(&and64_value, 0x011ull << 32);
+ and32_result = __sync_fetch_and_and(&and32_value, 0x011);
+ __sync_fetch_and_and(&and_noreturn_value, 0x011ull << 32);
+#endif
+
+ return 0;
+}
+
+__u64 or64_value = (0x110ull << 32);
+__u64 or64_result = 0;
+__u32 or32_value = 0x110;
+__u32 or32_result = 0;
+__u64 or_noreturn_value = (0x110ull << 32);
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(or, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ or64_result = __sync_fetch_and_or(&or64_value, 0x011ull << 32);
+ or32_result = __sync_fetch_and_or(&or32_value, 0x011);
+ __sync_fetch_and_or(&or_noreturn_value, 0x011ull << 32);
+#endif
+
+ return 0;
+}
+
+__u64 xor64_value = (0x110ull << 32);
+__u64 xor64_result = 0;
+__u32 xor32_value = 0x110;
+__u32 xor32_result = 0;
+__u64 xor_noreturn_value = (0x110ull << 32);
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(xor, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ xor64_result = __sync_fetch_and_xor(&xor64_value, 0x011ull << 32);
+ xor32_result = __sync_fetch_and_xor(&xor32_value, 0x011);
+ __sync_fetch_and_xor(&xor_noreturn_value, 0x011ull << 32);
+#endif
+
+ return 0;
+}
+
+__u64 cmpxchg64_value = 1;
+__u64 cmpxchg64_result_fail = 0;
+__u64 cmpxchg64_result_succeed = 0;
+__u32 cmpxchg32_value = 1;
+__u32 cmpxchg32_result_fail = 0;
+__u32 cmpxchg32_result_succeed = 0;
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(cmpxchg, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ cmpxchg64_result_fail = __sync_val_compare_and_swap(&cmpxchg64_value, 0, 3);
+ cmpxchg64_result_succeed = __sync_val_compare_and_swap(&cmpxchg64_value, 1, 2);
+
+ cmpxchg32_result_fail = __sync_val_compare_and_swap(&cmpxchg32_value, 0, 3);
+ cmpxchg32_result_succeed = __sync_val_compare_and_swap(&cmpxchg32_value, 1, 2);
+#endif
+
+ return 0;
+}
+
+__u64 xchg64_value = 1;
+__u64 xchg64_result = 0;
+__u32 xchg32_value = 1;
+__u32 xchg32_result = 0;
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(xchg, int a)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ __u64 val64 = 2;
+ __u32 val32 = 2;
+
+ __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
+ __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);
+#endif
+
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/verifier/atomic_and.c b/tools/testing/selftests/bpf/verifier/atomic_and.c
new file mode 100644
index 000000000000..7eea6d9dfd7d
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_and.c
@@ -0,0 +1,77 @@
+{
+ "BPF_ATOMIC_AND without fetch",
+ .insns = {
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110),
+ /* atomic_and(&val, 0x011); */
+ BPF_MOV64_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_AND(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (val != 0x010) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0x010, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* r1 should not be clobbered, no BPF_FETCH flag */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x011, 1),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_AND with fetch",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 123),
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110),
+ /* old = atomic_fetch_and(&val, 0x011); */
+ BPF_MOV64_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_FETCH_AND(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (old != 0x110) exit(3); */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* if (val != 0x010) exit(2); */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x010, 2),
+ BPF_MOV64_IMM(BPF_REG_1, 2),
+ BPF_EXIT_INSN(),
+ /* Check R0 wasn't clobbered (for fear of x86 JIT bug) */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 123, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_AND with fetch 32bit",
+ .insns = {
+ /* r0 = (s64) -1 */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1),
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 0x110),
+ /* old = atomic_fetch_and(&val, 0x011); */
+ BPF_MOV32_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_FETCH_AND(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* if (old != 0x110) exit(3); */
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* if (val != 0x010) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4),
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x010, 2),
+ BPF_MOV32_IMM(BPF_REG_1, 2),
+ BPF_EXIT_INSN(),
+ /* Check R0 wasn't clobbered (for fear of x86 JIT bug)
+ * It should be -1 so add 1 to get exit code.
+ */
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
new file mode 100644
index 000000000000..335e12690be7
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
@@ -0,0 +1,96 @@
+{
+ "atomic compare-and-exchange smoketest - 64bit",
+ .insns = {
+ /* val = 3; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3),
+ /* old = atomic_cmpxchg(&val, 2, 4); */
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_ATOMIC_CMPXCHG(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (old != 3) exit(2); */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* if (val != 3) exit(3); */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* old = atomic_cmpxchg(&val, 3, 4); */
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_ATOMIC_CMPXCHG(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (old != 3) exit(4); */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 4),
+ BPF_EXIT_INSN(),
+ /* if (val != 4) exit(5); */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 4, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 5),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "atomic compare-and-exchange smoketest - 32bit",
+ .insns = {
+ /* val = 3; */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 3),
+ /* old = atomic_cmpxchg(&val, 2, 4); */
+ BPF_MOV32_IMM(BPF_REG_1, 4),
+ BPF_MOV32_IMM(BPF_REG_0, 2),
+ BPF_ATOMIC_CMPXCHG(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* if (old != 3) exit(2); */
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* if (val != 3) exit(3); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -4),
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* old = atomic_cmpxchg(&val, 3, 4); */
+ BPF_MOV32_IMM(BPF_REG_1, 4),
+ BPF_MOV32_IMM(BPF_REG_0, 3),
+ BPF_ATOMIC_CMPXCHG(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* if (old != 3) exit(4); */
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 4),
+ BPF_EXIT_INSN(),
+ /* if (val != 4) exit(5); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -4),
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 4, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 5),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "Can't use cmpxchg on uninit src reg",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3),
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_ATOMIC_CMPXCHG(BPF_DW, BPF_REG_10, BPF_REG_2, -8),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "!read_ok",
+},
+{
+ "Can't use cmpxchg on uninit memory",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_ATOMIC_CMPXCHG(BPF_DW, BPF_REG_10, BPF_REG_2, -8),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid read from stack",
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c b/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
new file mode 100644
index 000000000000..7c87bc9a13de
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
@@ -0,0 +1,106 @@
+{
+ "BPF_ATOMIC_FETCH_ADD smoketest - 64bit",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ /* Write 3 to stack */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3),
+ /* Put a 1 in R1, add it to the 3 on the stack, and load the value back into R1 */
+ BPF_MOV64_IMM(BPF_REG_1, 1),
+ BPF_ATOMIC_FETCH_ADD(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* Check the value we loaded back was 3 */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* Load value from stack */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8),
+ /* Check value loaded from stack was 4 */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 4, 1),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_FETCH_ADD smoketest - 32bit",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ /* Write 3 to stack */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 3),
+ /* Put a 1 in R1, add it to the 3 on the stack, and load the value back into R1 */
+ BPF_MOV32_IMM(BPF_REG_1, 1),
+ BPF_ATOMIC_FETCH_ADD(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* Check the value we loaded back was 3 */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* Load value from stack */
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4),
+ /* Check value loaded from stack was 4 */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 4, 1),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "Can't use ATM_FETCH_ADD on frame pointer",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3),
+ BPF_ATOMIC_FETCH_ADD(BPF_DW, BPF_REG_10, BPF_REG_10, -8),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr_unpriv = "R10 leaks addr into mem",
+ .errstr = "frame pointer is read only",
+},
+{
+ "Can't use ATM_FETCH_ADD on uninit src reg",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3),
+ BPF_ATOMIC_FETCH_ADD(BPF_DW, BPF_REG_10, BPF_REG_2, -8),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ /* It happens that the address leak check is first, but it would also
+ * complain about the fact that we're trying to modify R10.
+ */
+ .errstr = "!read_ok",
+},
+{
+ "Can't use ATM_FETCH_ADD on uninit dst reg",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_ATOMIC_FETCH_ADD(BPF_DW, BPF_REG_2, BPF_REG_0, -8),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ /* It happens that the uninitialized register check fires first; the
+ * verifier would also complain about dereferencing the uninitialized
+ * dst_reg (R2).
+ */
+ .errstr = "!read_ok",
+},
+{
+ "Can't use ATM_FETCH_ADD on kernel memory",
+ .insns = {
+ /* This is an fentry prog, context is array of the args of the
+ * kernel function being called. Load first arg into R2.
+ */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 0),
+ /* First arg of bpf_fentry_test7 is a pointer to a struct.
+ * Attempt to modify that struct. Verifier shouldn't let us
+ * because it's kernel memory.
+ */
+ BPF_MOV64_IMM(BPF_REG_3, 1),
+ BPF_ATOMIC_FETCH_ADD(BPF_DW, BPF_REG_2, BPF_REG_3, 0),
+ /* Done */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_TRACING,
+ .expected_attach_type = BPF_TRACE_FENTRY,
+ .kfunc = "bpf_fentry_test7",
+ .result = REJECT,
+ .errstr = "only read is supported",
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_or.c b/tools/testing/selftests/bpf/verifier/atomic_or.c
new file mode 100644
index 000000000000..1b22fb2881f0
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_or.c
@@ -0,0 +1,77 @@
+{
+ "BPF_ATOMIC_OR without fetch",
+ .insns = {
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110),
+ /* atomic_or(&val, 0x011); */
+ BPF_MOV64_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_OR(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (val != 0x111) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0x111, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* r1 should not be clobbered, no BPF_FETCH flag */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x011, 1),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_OR with fetch",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 123),
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110),
+ /* old = atomic_fetch_or(&val, 0x011); */
+ BPF_MOV64_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_FETCH_OR(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (old != 0x110) exit(3); */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* if (val != 0x111) exit(2); */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x111, 2),
+ BPF_MOV64_IMM(BPF_REG_1, 2),
+ BPF_EXIT_INSN(),
+ /* Check R0 wasn't clobbered (for fear of x86 JIT bug) */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 123, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_OR with fetch 32bit",
+ .insns = {
+ /* r0 = (s64) -1 */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1),
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 0x110),
+ /* old = atomic_fetch_or(&val, 0x011); */
+ BPF_MOV32_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_FETCH_OR(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* if (old != 0x110) exit(3); */
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* if (val != 0x111) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4),
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x111, 2),
+ BPF_MOV32_IMM(BPF_REG_1, 2),
+ BPF_EXIT_INSN(),
+ /* Check R0 wasn't clobbered (for fear of x86 JIT bug)
+ * It should be -1 so add 1 to get exit code.
+ */
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_xchg.c b/tools/testing/selftests/bpf/verifier/atomic_xchg.c
new file mode 100644
index 000000000000..9348ac490e24
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_xchg.c
@@ -0,0 +1,46 @@
+{
+ "atomic exchange smoketest - 64bit",
+ .insns = {
+ /* val = 3; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 3),
+ /* old = atomic_xchg(&val, 4); */
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_ATOMIC_XCHG(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (old != 3) exit(1); */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* if (val != 4) exit(2); */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 4, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "atomic exchange smoketest - 32bit",
+ .insns = {
+ /* val = 3; */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 3),
+ /* old = atomic_xchg(&val, 4); */
+ BPF_MOV32_IMM(BPF_REG_1, 4),
+ BPF_ATOMIC_XCHG(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* if (old != 3) exit(1); */
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 3, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* if (val != 4) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -4),
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 4, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
diff --git a/tools/testing/selftests/bpf/verifier/atomic_xor.c b/tools/testing/selftests/bpf/verifier/atomic_xor.c
new file mode 100644
index 000000000000..d1315419a3a8
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_xor.c
@@ -0,0 +1,77 @@
+{
+ "BPF_ATOMIC_XOR without fetch",
+ .insns = {
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110),
+ /* atomic_xor(&val, 0x011); */
+ BPF_MOV64_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_XOR(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (val != 0x101) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0x101, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 2),
+ BPF_EXIT_INSN(),
+ /* r1 should not be clobbered, no BPF_FETCH flag */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x011, 1),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_XOR with fetch",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 123),
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0x110),
+ /* old = atomic_fetch_xor(&val, 0x011); */
+ BPF_MOV64_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_FETCH_XOR(BPF_DW, BPF_REG_10, BPF_REG_1, -8),
+ /* if (old != 0x110) exit(3); */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* if (val != 0x101) exit(2); */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -8),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x101, 2),
+ BPF_MOV64_IMM(BPF_REG_1, 2),
+ BPF_EXIT_INSN(),
+ /* Check R0 wasn't clobbered (for fear of x86 JIT bug) */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 123, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ /* exit(0); */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
+{
+ "BPF_ATOMIC_XOR with fetch 32bit",
+ .insns = {
+ /* r0 = (s64) -1 */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 1),
+ /* val = 0x110; */
+ BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 0x110),
+ /* old = atomic_fetch_xor(&val, 0x011); */
+ BPF_MOV32_IMM(BPF_REG_1, 0x011),
+ BPF_ATOMIC_FETCH_XOR(BPF_W, BPF_REG_10, BPF_REG_1, -4),
+ /* if (old != 0x110) exit(3); */
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x110, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 3),
+ BPF_EXIT_INSN(),
+ /* if (val != 0x101) exit(2); */
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, -4),
+ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_1, 0x101, 2),
+ BPF_MOV32_IMM(BPF_REG_1, 2),
+ BPF_EXIT_INSN(),
+ /* Check R0 wasn't clobbered (for fear of x86 JIT bug)
+ * It should be -1 so add 1 to get exit code.
+ */
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+},
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:13:16

by Brendan Jackman

Subject: [PATCH bpf-next v4 11/11] bpf: Document new atomic instructions

Document new atomic instructions.

Signed-off-by: Brendan Jackman <[email protected]>
---
Documentation/networking/filter.rst | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index 1583d59d806d..26d508a5e038 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -1053,6 +1053,32 @@ encoding.
.imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
.imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg

+The basic atomic operations supported are:
+
+ BPF_ADD
+ BPF_AND
+ BPF_OR
+ BPF_XOR
+
+Each of these has semantics equivalent to the ``BPF_ADD`` example above: the
+memory location addressed by ``dst_reg + off`` is atomically modified, with
+``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
+immediate, then these operations also overwrite ``src_reg`` with the
+value that was in memory before it was modified.
+
+The more specialised operations are:
+
+ BPF_XCHG
+
+This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
+off``.
+
+ BPF_CMPXCHG
+
+This atomically compares the value addressed by ``dst_reg + off`` with
+``R0``. If they match, it is replaced with ``src_reg``. In either case, the
+value that was in memory before the operation is loaded back into ``R0``.
+
Note that 1 and 2 byte atomic operations are not supported.

You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:13:18

by Brendan Jackman

Subject: [PATCH bpf-next v4 03/11] bpf: x86: Factor out a lookup table for some ALU opcodes

A later commit will need to look up a subset of these opcodes. To
avoid duplicating code, pull out a table.

The shift opcodes won't be needed by that later commit, but they're
already duplicated, so fold them into the table anyway.

Signed-off-by: Brendan Jackman <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 33 +++++++++++++++------------------
1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 7106cfd10ba6..f0c98fd275e5 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -205,6 +205,18 @@ static u8 add_2reg(u8 byte, u32 dst_reg, u32 src_reg)
return byte + reg2hex[dst_reg] + (reg2hex[src_reg] << 3);
}

+/* Some 1-byte opcodes for binary ALU operations */
+static u8 simple_alu_opcodes[] = {
+ [BPF_ADD] = 0x01,
+ [BPF_SUB] = 0x29,
+ [BPF_AND] = 0x21,
+ [BPF_OR] = 0x09,
+ [BPF_XOR] = 0x31,
+ [BPF_LSH] = 0xE0,
+ [BPF_RSH] = 0xE8,
+ [BPF_ARSH] = 0xF8,
+};
+
static void jit_fill_hole(void *area, unsigned int size)
{
/* Fill whole space with INT3 instructions */
@@ -862,15 +874,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
case BPF_ALU64 | BPF_AND | BPF_X:
case BPF_ALU64 | BPF_OR | BPF_X:
case BPF_ALU64 | BPF_XOR | BPF_X:
- switch (BPF_OP(insn->code)) {
- case BPF_ADD: b2 = 0x01; break;
- case BPF_SUB: b2 = 0x29; break;
- case BPF_AND: b2 = 0x21; break;
- case BPF_OR: b2 = 0x09; break;
- case BPF_XOR: b2 = 0x31; break;
- }
maybe_emit_mod(&prog, dst_reg, src_reg,
BPF_CLASS(insn->code) == BPF_ALU64);
+ b2 = simple_alu_opcodes[BPF_OP(insn->code)];
EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
break;

@@ -1050,12 +1056,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
else if (is_ereg(dst_reg))
EMIT1(add_1mod(0x40, dst_reg));

- switch (BPF_OP(insn->code)) {
- case BPF_LSH: b3 = 0xE0; break;
- case BPF_RSH: b3 = 0xE8; break;
- case BPF_ARSH: b3 = 0xF8; break;
- }
-
+ b3 = simple_alu_opcodes[BPF_OP(insn->code)];
if (imm32 == 1)
EMIT2(0xD1, add_1reg(b3, dst_reg));
else
@@ -1089,11 +1090,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
else if (is_ereg(dst_reg))
EMIT1(add_1mod(0x40, dst_reg));

- switch (BPF_OP(insn->code)) {
- case BPF_LSH: b3 = 0xE0; break;
- case BPF_RSH: b3 = 0xE8; break;
- case BPF_ARSH: b3 = 0xF8; break;
- }
+ b3 = simple_alu_opcodes[BPF_OP(insn->code)];
EMIT2(0xD3, add_1reg(b3, dst_reg));

if (src_reg != BPF_REG_4)
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:13:27

by Brendan Jackman

Subject: [PATCH bpf-next v4 01/11] bpf: x86: Factor out emission of ModR/M for *(reg + off)

The case for JITing atomics is about to get more complicated. Let's
factor out some common code to make the review and result more
readable.

NB the atomics code doesn't yet use the new helper - a subsequent
patch will add its use as a side-effect of other changes.

Signed-off-by: Brendan Jackman <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 42 +++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 796506dcfc42..cc818ed7c2b9 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -681,6 +681,27 @@ static void emit_mov_reg(u8 **pprog, bool is64, u32 dst_reg, u32 src_reg)
*pprog = prog;
}

+/* Emit the suffix (ModR/M etc) for addressing *(ptr_reg + off) and val_reg */
+static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off)
+{
+ u8 *prog = *pprog;
+ int cnt = 0;
+
+ if (is_imm8(off)) {
+ /* 1-byte signed displacement.
+ *
+ * If off == 0 we could skip this and save one extra byte, but
+ * special case of x86 R13 which always needs an offset is not
+ * worth the hassle
+ */
+ EMIT2(add_2reg(0x40, ptr_reg, val_reg), off);
+ } else {
+ /* 4-byte signed displacement */
+ EMIT1_off32(add_2reg(0x80, ptr_reg, val_reg), off);
+ }
+ *pprog = prog;
+}
+
/* LDX: dst_reg = *(u8*)(src_reg + off) */
static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
{
@@ -708,15 +729,7 @@ static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
EMIT2(add_2mod(0x48, src_reg, dst_reg), 0x8B);
break;
}
- /*
- * If insn->off == 0 we can save one extra byte, but
- * special case of x86 R13 which always needs an offset
- * is not worth the hassle
- */
- if (is_imm8(off))
- EMIT2(add_2reg(0x40, src_reg, dst_reg), off);
- else
- EMIT1_off32(add_2reg(0x80, src_reg, dst_reg), off);
+ emit_insn_suffix(&prog, src_reg, dst_reg, off);
*pprog = prog;
}

@@ -751,10 +764,7 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
EMIT2(add_2mod(0x48, dst_reg, src_reg), 0x89);
break;
}
- if (is_imm8(off))
- EMIT2(add_2reg(0x40, dst_reg, src_reg), off);
- else
- EMIT1_off32(add_2reg(0x80, dst_reg, src_reg), off);
+ emit_insn_suffix(&prog, dst_reg, src_reg, off);
*pprog = prog;
}

@@ -1240,11 +1250,7 @@ st: if (is_imm8(insn->off))
goto xadd;
case BPF_STX | BPF_XADD | BPF_DW:
EMIT3(0xF0, add_2mod(0x48, dst_reg, src_reg), 0x01);
-xadd: if (is_imm8(insn->off))
- EMIT2(add_2reg(0x40, dst_reg, src_reg), insn->off);
- else
- EMIT1_off32(add_2reg(0x80, dst_reg, src_reg),
- insn->off);
+xadd: emit_insn_suffix(&prog, dst_reg, src_reg, insn->off);
break;

/* call */
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:13:36

by Brendan Jackman

Subject: [PATCH bpf-next v4 07/11] bpf: Add instructions for atomic_[cmp]xchg

This adds two atomic opcodes, both of which include the BPF_FETCH
flag. XCHG without the BPF_FETCH flag would naturally encode
atomic_set. This is not supported because it would be of limited
value to userspace (it doesn't imply any barriers). CMPXCHG without
BPF_FETCH would be an atomic compare-and-write. We don't have such
an operation in the kernel so it isn't provided to BPF either.

There are two significant design decisions made for the CMPXCHG
instruction:

- This operation fundamentally has 3 operands, but we only have two
register fields. Therefore the operand we compare against (the
kernel's API calls it 'old') is hard-coded to be R0. x86 has a
similar design (and A64 doesn't have this problem).

A potential alternative might be to encode the other operand's
register number in the immediate field.

- The kernel's atomic_cmpxchg returns the old value, while the C11
userspace APIs return a boolean indicating the comparison
result. Which should BPF do? A64 returns the old value. x86 returns
the old value in the hard-coded register (and also sets a
flag). That means return-old-value is easier to JIT.
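
To spell out the difference in C (a sketch; BPF follows the first,
kernel-style form):

    /* BPF_CMPXCHG / kernel atomic64_cmpxchg: returns the old value */
    old = atomic64_cmpxchg(ptr, expected, new);
    success = (old == expected);

    /* C11-style __atomic_compare_exchange_n: returns a bool, and
     * writes the old value back through 'expected' on failure.
     */
    success = __atomic_compare_exchange_n(ptr, &expected, new, false,
                                          __ATOMIC_SEQ_CST,
                                          __ATOMIC_SEQ_CST);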

Signed-off-by: Brendan Jackman <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 8 ++++++++
include/linux/filter.h | 22 ++++++++++++++++++++++
include/uapi/linux/bpf.h | 4 +++-
kernel/bpf/core.c | 20 ++++++++++++++++++++
kernel/bpf/disasm.c | 15 +++++++++++++++
kernel/bpf/verifier.c | 19 +++++++++++++++++--
tools/include/linux/filter.h | 22 ++++++++++++++++++++++
tools/include/uapi/linux/bpf.h | 4 +++-
8 files changed, 110 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index eea7d8b0bb12..308241187582 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -815,6 +815,14 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
/* src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
EMIT2(0x0F, 0xC1);
break;
+ case BPF_XCHG:
+ /* src_reg = atomic_xchg(dst_reg + off, src_reg); */
+ EMIT1(0x87);
+ break;
+ case BPF_CMPXCHG:
+ /* r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg); */
+ EMIT2(0x0F, 0xB1);
+ break;
default:
pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op);
return -EFAULT;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index b5258bca10d2..e1e1fc946a7c 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -265,6 +265,8 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
*
* BPF_ADD *(uint *) (dst_reg + off16) += src_reg
* BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
+ * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
+ * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
*/

#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
@@ -293,6 +295,26 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
.off = OFF, \
.imm = BPF_ADD })

+/* Atomic exchange, src_reg = atomic_xchg(dst_reg + off, src_reg) */
+
+#define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_XCHG })
+
+/* Atomic compare-exchange, r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg) */
+
+#define BPF_ATOMIC_CMPXCHG(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_CMPXCHG })
+
/* Memory store, *(uint *) (dst_reg + off16) = imm32 */

#define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d5389119291e..b733af50a5b9 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -45,7 +45,9 @@
#define BPF_EXIT 0x90 /* function return */

/* atomic op type fields (stored in immediate) */
-#define BPF_FETCH 0x01 /* fetch previous value into src reg */
+#define BPF_FETCH 0x01 /* not an opcode on its own, used to build others */
+#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
+#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */

/* Register numbers */
enum {
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 61e93eb7d363..28f960bc2e30 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1630,6 +1630,16 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
(u32) SRC,
(atomic_t *)(unsigned long) (DST + insn->off));
break;
+ case BPF_XCHG:
+ SRC = (u32) atomic_xchg(
+ (atomic_t *)(unsigned long) (DST + insn->off),
+ (u32) SRC);
+ break;
+ case BPF_CMPXCHG:
+ BPF_R0 = (u32) atomic_cmpxchg(
+ (atomic_t *)(unsigned long) (DST + insn->off),
+ (u32) BPF_R0, (u32) SRC);
+ break;
default:
goto default_label;
}
@@ -1647,6 +1657,16 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
(u64) SRC,
(atomic64_t *)(s64) (DST + insn->off));
break;
+ case BPF_XCHG:
+ SRC = (u64) atomic64_xchg(
+ (atomic64_t *)(u64) (DST + insn->off),
+ (u64) SRC);
+ break;
+ case BPF_CMPXCHG:
+ BPF_R0 = (u64) atomic64_cmpxchg(
+ (atomic64_t *)(u64) (DST + insn->off),
+ (u64) BPF_R0, (u64) SRC);
+ break;
default:
goto default_label;
}
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index d2e20f6d0516..ee8d1132767b 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -167,6 +167,21 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off, insn->src_reg);
+ } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+ insn->imm == BPF_CMPXCHG) {
+ verbose(cbs->private_data, "(%02x) r0 = atomic%s_cmpxchg((%s *)(r%d %+d), r0, r%d)\n",
+ insn->code,
+ BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+ bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+ insn->dst_reg, insn->off,
+ insn->src_reg);
+ } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+ insn->imm == BPF_XCHG) {
+ verbose(cbs->private_data, "(%02x) r%d = atomic%s_xchg((%s *)(r%d %+d), r%d)\n",
+ insn->code, insn->src_reg,
+ BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+ bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+ insn->dst_reg, insn->off, insn->src_reg);
} else {
verbose(cbs->private_data, "BUG_%02x\n", insn->code);
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f8c4e809297d..f5f4460b3e4e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3608,11 +3608,14 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn

static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
{
+ int load_reg;
int err;

switch (insn->imm) {
case BPF_ADD:
case BPF_ADD | BPF_FETCH:
+ case BPF_XCHG:
+ case BPF_CMPXCHG:
break;
default:
verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
@@ -3634,6 +3637,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
if (err)
return err;

+ if (insn->imm == BPF_CMPXCHG) {
+ /* Check comparison of R0 with memory location */
+ err = check_reg_arg(env, BPF_REG_0, SRC_OP);
+ if (err)
+ return err;
+ }
+
if (is_pointer_value(env, insn->src_reg)) {
verbose(env, "R%d leaks addr into mem\n", insn->src_reg);
return -EACCES;
@@ -3664,8 +3674,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
if (!(insn->imm & BPF_FETCH))
return 0;

- /* check and record load of old value into src reg */
- err = check_reg_arg(env, insn->src_reg, DST_OP);
+ if (insn->imm == BPF_CMPXCHG)
+ load_reg = BPF_REG_0;
+ else
+ load_reg = insn->src_reg;
+
+ /* check and record load of old value */
+ err = check_reg_arg(env, load_reg, DST_OP);
if (err)
return err;

diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
index 4e0100ba52c2..21598053fd40 100644
--- a/tools/include/linux/filter.h
+++ b/tools/include/linux/filter.h
@@ -174,6 +174,8 @@
*
* BPF_ADD *(uint *) (dst_reg + off16) += src_reg
* BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
+ * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
+ * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
*/

#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
@@ -212,6 +214,26 @@
.off = OFF, \
.imm = BPF_ADD | BPF_FETCH })

+/* Atomic exchange, src_reg = atomic_xchg(dst_reg + off, src_reg) */
+
+#define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_XCHG })
+
+/* Atomic compare-exchange, r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg) */
+
+#define BPF_ATOMIC_CMPXCHG(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_CMPXCHG })
+
/* Memory store, *(uint *) (dst_reg + off16) = imm32 */

#define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d5389119291e..b733af50a5b9 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -45,7 +45,9 @@
#define BPF_EXIT 0x90 /* function return */

/* atomic op type fields (stored in immediate) */
-#define BPF_FETCH 0x01 /* fetch previous value into src reg */
+#define BPF_FETCH 0x01 /* not an opcode on its own, used to build others */
+#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
+#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */

/* Register numbers */
enum {
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:14:26

by Brendan Jackman

[permalink] [raw]
Subject: [PATCH bpf-next v4 04/11] bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.

In preparation, rename the instruction mode to BPF_ATOMIC and start
calling the zero immediate BPF_ADD.

This is possible (it doesn't break existing valid BPF progs) because
the immediate field is currently reserved MBZ (must be zero) and
BPF_ADD is zero.
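
To illustrate, a minimal sketch (not part of the patch itself): these
two initializers produce bit-identical instructions, since BPF_XADD
and BPF_ATOMIC share the value 0xc0 and BPF_ADD is 0x00.

    struct bpf_insn legacy = {
            .code    = BPF_STX | BPF_DW | BPF_XADD,   /* imm was reserved-zero */
            .dst_reg = BPF_REG_10,
            .src_reg = BPF_REG_0,
            .off     = -8,
            .imm     = 0,
    };
    struct bpf_insn renamed = {
            .code    = BPF_STX | BPF_DW | BPF_ATOMIC, /* same mode value, 0xc0 */
            .dst_reg = BPF_REG_10,
            .src_reg = BPF_REG_0,
            .off     = -8,
            .imm     = BPF_ADD,                       /* BPF_ADD == 0x00 */
    };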

All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.

Signed-off-by: Brendan Jackman <[email protected]>
---
Documentation/networking/filter.rst | 30 ++++++++-----
arch/arm/net/bpf_jit_32.c | 7 ++-
arch/arm64/net/bpf_jit_comp.c | 16 +++++--
arch/mips/net/ebpf_jit.c | 11 +++--
arch/powerpc/net/bpf_jit_comp64.c | 25 ++++++++---
arch/riscv/net/bpf_jit_comp32.c | 20 +++++++--
arch/riscv/net/bpf_jit_comp64.c | 16 +++++--
arch/s390/net/bpf_jit_comp.c | 27 ++++++-----
arch/sparc/net/bpf_jit_comp_64.c | 17 +++++--
arch/x86/net/bpf_jit_comp.c | 45 ++++++++++++++-----
arch/x86/net/bpf_jit_comp32.c | 6 +--
drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 ++++--
drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +-
.../net/ethernet/netronome/nfp/bpf/verifier.c | 15 ++++---
include/linux/filter.h | 29 ++++++++++--
include/uapi/linux/bpf.h | 5 ++-
kernel/bpf/core.c | 31 +++++++++----
kernel/bpf/disasm.c | 6 ++-
kernel/bpf/verifier.c | 24 +++++-----
lib/test_bpf.c | 14 +++---
samples/bpf/bpf_insn.h | 4 +-
samples/bpf/cookie_uid_helper_example.c | 6 +--
samples/bpf/sock_example.c | 2 +-
samples/bpf/test_cgrp2_attach.c | 5 ++-
tools/include/linux/filter.h | 28 ++++++++++--
tools/include/uapi/linux/bpf.h | 5 ++-
.../bpf/prog_tests/cgroup_attach_multi.c | 4 +-
.../selftests/bpf/test_cgroup_storage.c | 2 +-
tools/testing/selftests/bpf/verifier/ctx.c | 7 ++-
.../bpf/verifier/direct_packet_access.c | 4 +-
.../testing/selftests/bpf/verifier/leak_ptr.c | 10 ++---
.../selftests/bpf/verifier/meta_access.c | 4 +-
tools/testing/selftests/bpf/verifier/unpriv.c | 3 +-
.../bpf/verifier/value_illegal_alu.c | 2 +-
tools/testing/selftests/bpf/verifier/xadd.c | 18 ++++----
35 files changed, 317 insertions(+), 149 deletions(-)

diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index debb59e374de..1583d59d806d 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -1006,13 +1006,13 @@ Size modifier is one of ...

Mode modifier is one of::

- BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
- BPF_ABS 0x20
- BPF_IND 0x40
- BPF_MEM 0x60
- BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */
- BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */
- BPF_XADD 0xc0 /* eBPF only, exclusive add */
+ BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
+ BPF_ABS 0x20
+ BPF_IND 0x40
+ BPF_MEM 0x60
+ BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */
+ BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */
+ BPF_ATOMIC 0xc0 /* eBPF only, atomic operations */

eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
(BPF_IND | <size> | BPF_LD) which are used to access packet data.
@@ -1044,11 +1044,19 @@ Unlike classic BPF instruction set, eBPF has generic load/store operations::
BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg
BPF_MEM | <size> | BPF_ST: *(size *) (dst_reg + off) = imm32
BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *) (src_reg + off)
- BPF_XADD | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
- BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg

-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
-2 byte atomic increments are not supported.
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
+
+It also includes atomic operations, which use the immediate field for extra
+encoding.
+
+ .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
+ .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+
+Note that 1 and 2 byte atomic operations are not supported.
+
+You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to
+the exclusive-add operation encoded when the immediate field is zero.

eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 0207b6ea6e8a..897634d0a67c 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1620,10 +1620,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
}
emit_str_r(dst_lo, tmp2, off, ctx, BPF_SIZE(code));
break;
- /* STX XADD: lock *(u32 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_W:
- /* STX XADD: lock *(u64 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_DW:
+ /* Atomic ops */
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
goto notyet;
/* STX: *(size *)(dst + off) = src */
case BPF_STX | BPF_MEM | BPF_W:
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index ef9f1d5e989d..f7b194878a99 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -875,10 +875,18 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
}
break;

- /* STX XADD: lock *(u32 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_W:
- /* STX XADD: lock *(u64 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_DW:
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
+ if (insn->imm != BPF_ADD) {
+ pr_err_once("unknown atomic op code %02x\n", insn->imm);
+ return -EINVAL;
+ }
+
+ /* STX XADD: lock *(u32 *)(dst + off) += src
+ * and
+ * STX XADD: lock *(u64 *)(dst + off) += src
+ */
+
if (!off) {
reg = dst;
} else {
diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index 561154cbcc40..939dd06764bc 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -1423,8 +1423,8 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
case BPF_STX | BPF_H | BPF_MEM:
case BPF_STX | BPF_W | BPF_MEM:
case BPF_STX | BPF_DW | BPF_MEM:
- case BPF_STX | BPF_W | BPF_XADD:
- case BPF_STX | BPF_DW | BPF_XADD:
+ case BPF_STX | BPF_W | BPF_ATOMIC:
+ case BPF_STX | BPF_DW | BPF_ATOMIC:
if (insn->dst_reg == BPF_REG_10) {
ctx->flags |= EBPF_SEEN_FP;
dst = MIPS_R_SP;
@@ -1438,7 +1438,12 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
if (src < 0)
return src;
- if (BPF_MODE(insn->code) == BPF_XADD) {
+ if (BPF_MODE(insn->code) == BPF_ATOMIC) {
+ if (insn->imm != BPF_ADD) {
+ pr_err("ATOMIC OP %02x NOT HANDLED\n", insn->imm);
+ return -EINVAL;
+ }
+
/*
* If mem_off does not fit within the 9 bit ll/sc
* instruction immediate field, use a temp reg.
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 022103c6a201..aaf1a887f653 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -683,10 +683,18 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
break;

/*
- * BPF_STX XADD (atomic_add)
+ * BPF_STX ATOMIC (atomic ops)
*/
- /* *(u32 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ if (insn->imm != BPF_ADD) {
+ pr_err_ratelimited(
+ "eBPF filter atomic op code %02x (@%d) unsupported\n",
+ code, i);
+ return -ENOTSUPP;
+ }
+
+ /* *(u32 *)(dst + off) += src */
+
/* Get EA into TMP_REG_1 */
EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], dst_reg, off));
tmp_idx = ctx->idx * 4;
@@ -699,8 +707,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
/* we're done if this succeeded */
PPC_BCC_SHORT(COND_NE, tmp_idx);
break;
- /* *(u64 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_DW:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
+ if (insn->imm != BPF_ADD) {
+ pr_err_ratelimited(
+ "eBPF filter atomic op code %02x (@%d) unsupported\n",
+ code, i);
+ return -ENOTSUPP;
+ }
+ /* *(u64 *)(dst + off) += src */
+
EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], dst_reg, off));
tmp_idx = ctx->idx * 4;
EMIT(PPC_RAW_LDARX(b2p[TMP_REG_2], 0, b2p[TMP_REG_1], 0));
diff --git a/arch/riscv/net/bpf_jit_comp32.c b/arch/riscv/net/bpf_jit_comp32.c
index 579575f9cdae..a9ef808b235f 100644
--- a/arch/riscv/net/bpf_jit_comp32.c
+++ b/arch/riscv/net/bpf_jit_comp32.c
@@ -881,7 +881,7 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off,
const s8 *rd = bpf_get_reg64(dst, tmp1, ctx);
const s8 *rs = bpf_get_reg64(src, tmp2, ctx);

- if (mode == BPF_XADD && size != BPF_W)
+ if (mode == BPF_ATOMIC && (size != BPF_W || imm != BPF_ADD))
return -1;

emit_imm(RV_REG_T0, off, ctx);
@@ -899,7 +899,7 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off,
case BPF_MEM:
emit(rv_sw(RV_REG_T0, 0, lo(rs)), ctx);
break;
- case BPF_XADD:
+ case BPF_ATOMIC: /* .imm checked above - only BPF_ADD allowed */
emit(rv_amoadd_w(RV_REG_ZERO, lo(rs), RV_REG_T0, 0, 0),
ctx);
break;
@@ -1260,7 +1260,6 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_W:
case BPF_STX | BPF_MEM | BPF_DW:
- case BPF_STX | BPF_XADD | BPF_W:
if (BPF_CLASS(code) == BPF_ST) {
emit_imm32(tmp2, imm, ctx);
src = tmp2;
@@ -1271,8 +1270,21 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
return -1;
break;

+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ if (insn->imm != BPF_ADD) {
+ pr_info_once(
+ "bpf-jit: not supported: atomic operation %02x ***\n",
+ insn->imm);
+ return -EFAULT;
+ }
+
+ if (emit_store_r64(dst, src, off, ctx, BPF_SIZE(code),
+ BPF_MODE(code)))
+ return -1;
+ break;
+
/* No hardware support for 8-byte atomics in RV32. */
- case BPF_STX | BPF_XADD | BPF_DW:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
/* Fallthrough. */

notsupported:
diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 8a56b5293117..b44ff52f84a6 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -1027,10 +1027,18 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
emit_add(RV_REG_T1, RV_REG_T1, rd, ctx);
emit_sd(RV_REG_T1, 0, rs, ctx);
break;
- /* STX XADD: lock *(u32 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_W:
- /* STX XADD: lock *(u64 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_DW:
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
+ if (insn->imm != BPF_ADD) {
+ pr_err("bpf-jit: not supported: atomic operation %02x ***\n",
+ insn->imm);
+ return -EINVAL;
+ }
+
+ /* atomic_add: lock *(u32 *)(dst + off) += src
+ * atomic_add: lock *(u64 *)(dst + off) += src
+ */
+
if (off) {
if (is_12b_int(off)) {
emit_addi(RV_REG_T1, rd, off, ctx);
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 0a4182792876..f973e2ead197 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1205,18 +1205,23 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
jit->seen |= SEEN_MEM;
break;
/*
- * BPF_STX XADD (atomic_add)
+ * BPF_ATOMIC
*/
- case BPF_STX | BPF_XADD | BPF_W: /* *(u32 *)(dst + off) += src */
- /* laal %w0,%src,off(%dst) */
- EMIT6_DISP_LH(0xeb000000, 0x00fa, REG_W0, src_reg,
- dst_reg, off);
- jit->seen |= SEEN_MEM;
- break;
- case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */
- /* laalg %w0,%src,off(%dst) */
- EMIT6_DISP_LH(0xeb000000, 0x00ea, REG_W0, src_reg,
- dst_reg, off);
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ if (insn->imm != BPF_ADD) {
+ pr_err("Unknown atomic operation %02x\n", insn->imm);
+ return -1;
+ }
+
+ /* *(u32/u64 *)(dst + off) += src
+ *
+ * BPF_W: laal %w0,%src,off(%dst)
+ * BPF_DW: laalg %w0,%src,off(%dst)
+ */
+ EMIT6_DISP_LH(0xeb000000,
+ BPF_SIZE(insn->code) == BPF_W ? 0x00fa : 0x00ea,
+ REG_W0, src_reg, dst_reg, off);
jit->seen |= SEEN_MEM;
break;
/*
diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index 3364e2a00989..4b8d3c65d266 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1366,12 +1366,18 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
break;
}

- /* STX XADD: lock *(u32 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_W: {
+ case BPF_STX | BPF_ATOMIC | BPF_W: {
const u8 tmp = bpf2sparc[TMP_REG_1];
const u8 tmp2 = bpf2sparc[TMP_REG_2];
const u8 tmp3 = bpf2sparc[TMP_REG_3];

+ if (insn->imm != BPF_ADD) {
+ pr_err_once("unknown atomic op %02x\n", insn->imm);
+ return -EINVAL;
+ }
+
+ /* lock *(u32 *)(dst + off) += src */
+
if (insn->dst_reg == BPF_REG_FP)
ctx->saw_frame_pointer = true;

@@ -1390,11 +1396,16 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
break;
}
/* STX XADD: lock *(u64 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_DW: {
+ case BPF_STX | BPF_ATOMIC | BPF_DW: {
const u8 tmp = bpf2sparc[TMP_REG_1];
const u8 tmp2 = bpf2sparc[TMP_REG_2];
const u8 tmp3 = bpf2sparc[TMP_REG_3];

+ if (insn->imm != BPF_ADD) {
+ pr_err_once("unknown atomic op %02x\n", insn->imm);
+ return -EINVAL;
+ }
+
if (insn->dst_reg == BPF_REG_FP)
ctx->saw_frame_pointer = true;

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index f0c98fd275e5..b1829a534da1 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -795,6 +795,33 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
*pprog = prog;
}

+static int emit_atomic(u8 **pprog, u8 atomic_op,
+ u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
+{
+ u8 *prog = *pprog;
+ int cnt = 0;
+
+ EMIT1(0xF0); /* lock prefix */
+
+ maybe_emit_mod(&prog, dst_reg, src_reg, bpf_size == BPF_DW);
+
+ /* emit opcode */
+ switch (atomic_op) {
+ case BPF_ADD:
+ /* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */
+ EMIT1(simple_alu_opcodes[atomic_op]);
+ break;
+ default:
+ pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op);
+ return -EFAULT;
+ }
+
+ emit_insn_suffix(&prog, dst_reg, src_reg, off);
+
+ *pprog = prog;
+ return 0;
+}
+
static bool ex_handler_bpf(const struct exception_table_entry *x,
struct pt_regs *regs, int trapnr,
unsigned long error_code, unsigned long fault_addr)
@@ -839,6 +866,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
int i, cnt = 0, excnt = 0;
int proglen = 0;
u8 *prog = temp;
+ int err;

detect_reg_usage(insn, insn_cnt, callee_regs_used,
&tail_call_seen);
@@ -1250,17 +1278,12 @@ st: if (is_imm8(insn->off))
}
break;

- /* STX XADD: lock *(u32*)(dst_reg + off) += src_reg */
- case BPF_STX | BPF_XADD | BPF_W:
- /* Emit 'lock add dword ptr [rax + off], eax' */
- if (is_ereg(dst_reg) || is_ereg(src_reg))
- EMIT3(0xF0, add_2mod(0x40, dst_reg, src_reg), 0x01);
- else
- EMIT2(0xF0, 0x01);
- goto xadd;
- case BPF_STX | BPF_XADD | BPF_DW:
- EMIT3(0xF0, add_2mod(0x48, dst_reg, src_reg), 0x01);
-xadd: emit_modrm_dstoff(&prog, dst_reg, src_reg, insn->off);
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
+ err = emit_atomic(&prog, insn->imm, dst_reg, src_reg,
+ insn->off, BPF_SIZE(insn->code));
+ if (err)
+ return err;
break;

/* call */
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 96fde03aa987..d17b67c69f89 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2243,10 +2243,8 @@ emit_cond_jmp: jmp_cond = get_cond_jmp_opcode(BPF_OP(code), false);
return -EFAULT;
}
break;
- /* STX XADD: lock *(u32 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_W:
- /* STX XADD: lock *(u64 *)(dst + off) += src */
- case BPF_STX | BPF_XADD | BPF_DW:
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
goto notyet;
case BPF_JMP | BPF_EXIT:
if (seen_exit) {
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 0a721f6e8676..e31f8fbbc696 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -3109,13 +3109,19 @@ mem_xadd(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta, bool is64)
return 0;
}

-static int mem_xadd4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+static int mem_atomic4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
+ if (meta->insn.imm != BPF_ADD)
+ return -EOPNOTSUPP;
+
return mem_xadd(nfp_prog, meta, false);
}

-static int mem_xadd8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+static int mem_atomic8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
+ if (meta->insn.imm != BPF_ADD)
+ return -EOPNOTSUPP;
+
return mem_xadd(nfp_prog, meta, true);
}

@@ -3475,8 +3481,8 @@ static const instr_cb_t instr_cb[256] = {
[BPF_STX | BPF_MEM | BPF_H] = mem_stx2,
[BPF_STX | BPF_MEM | BPF_W] = mem_stx4,
[BPF_STX | BPF_MEM | BPF_DW] = mem_stx8,
- [BPF_STX | BPF_XADD | BPF_W] = mem_xadd4,
- [BPF_STX | BPF_XADD | BPF_DW] = mem_xadd8,
+ [BPF_STX | BPF_ATOMIC | BPF_W] = mem_atomic4,
+ [BPF_STX | BPF_ATOMIC | BPF_DW] = mem_atomic8,
[BPF_ST | BPF_MEM | BPF_B] = mem_st1,
[BPF_ST | BPF_MEM | BPF_H] = mem_st2,
[BPF_ST | BPF_MEM | BPF_W] = mem_st4,
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index fac9c6f9e197..d0e17eebddd9 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -428,9 +428,9 @@ static inline bool is_mbpf_classic_store_pkt(const struct nfp_insn_meta *meta)
return is_mbpf_classic_store(meta) && meta->ptr.type == PTR_TO_PACKET;
}

-static inline bool is_mbpf_xadd(const struct nfp_insn_meta *meta)
+static inline bool is_mbpf_atomic(const struct nfp_insn_meta *meta)
{
- return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_XADD);
+ return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_ATOMIC);
}

static inline bool is_mbpf_mul(const struct nfp_insn_meta *meta)
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
index e92ee510fd52..9d235c0ce46a 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
@@ -479,7 +479,7 @@ nfp_bpf_check_ptr(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
pr_vlog(env, "map writes not supported\n");
return -EOPNOTSUPP;
}
- if (is_mbpf_xadd(meta)) {
+ if (is_mbpf_atomic(meta)) {
err = nfp_bpf_map_mark_used(env, meta, reg,
NFP_MAP_USE_ATOMIC_CNT);
if (err)
@@ -523,12 +523,17 @@ nfp_bpf_check_store(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
}

static int
-nfp_bpf_check_xadd(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
- struct bpf_verifier_env *env)
+nfp_bpf_check_atomic(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ struct bpf_verifier_env *env)
{
const struct bpf_reg_state *sreg = cur_regs(env) + meta->insn.src_reg;
const struct bpf_reg_state *dreg = cur_regs(env) + meta->insn.dst_reg;

+ if (meta->insn.imm != BPF_ADD) {
+ pr_vlog(env, "atomic op not implemented: %d\n", meta->insn.imm);
+ return -EOPNOTSUPP;
+ }
+
if (dreg->type != PTR_TO_MAP_VALUE) {
pr_vlog(env, "atomic add not to a map value pointer: %d\n",
dreg->type);
@@ -655,8 +660,8 @@ int nfp_verify_insn(struct bpf_verifier_env *env, int insn_idx,
if (is_mbpf_store(meta))
return nfp_bpf_check_store(nfp_prog, meta, env);

- if (is_mbpf_xadd(meta))
- return nfp_bpf_check_xadd(nfp_prog, meta, env);
+ if (is_mbpf_atomic(meta))
+ return nfp_bpf_check_atomic(nfp_prog, meta, env);

if (is_mbpf_alu(meta))
return nfp_bpf_check_alu(nfp_prog, meta, env);
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 1b62397bd124..45be19408f68 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -259,15 +259,38 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
.off = OFF, \
.imm = 0 })

-/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */
+
+/*
+ * Atomic operations:
+ *
+ * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
+ */
+
+#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_DW | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = OP })
+
+#define BPF_ATOMIC32(OP, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_W | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = OP })
+
+/* Legacy equivalent of BPF_ATOMIC{64,32}(BPF_ADD, ...) */

#define BPF_STX_XADD(SIZE, DST, SRC, OFF) \
((struct bpf_insn) { \
- .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
.dst_reg = DST, \
.src_reg = SRC, \
.off = OFF, \
- .imm = 0 })
+ .imm = BPF_ADD })

/* Memory store, *(uint *) (dst_reg + off16) = imm32 */

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 30b477a26482..98161e2d389f 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -19,7 +19,8 @@

/* ld/ldx fields */
#define BPF_DW 0x18 /* double word (64-bit) */
-#define BPF_XADD 0xc0 /* exclusive add */
+#define BPF_ATOMIC 0xc0 /* atomic memory ops - op type in immediate */
+#define BPF_XADD 0xc0 /* exclusive add - legacy name */

/* alu/jmp fields */
#define BPF_MOV 0xb0 /* mov reg to reg */
@@ -2448,7 +2449,7 @@ union bpf_attr {
* running simultaneously.
*
* A user should care about the synchronization by himself.
- * For example, by using the **BPF_STX_XADD** instruction to alter
+ * For example, by using the **BPF_ATOMIC64** instructions to alter
* the shared data.
* Return
* A pointer to the local storage area.
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 261f8692d0d2..3abc6b250b18 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1309,8 +1309,8 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
INSN_3(STX, MEM, H), \
INSN_3(STX, MEM, W), \
INSN_3(STX, MEM, DW), \
- INSN_3(STX, XADD, W), \
- INSN_3(STX, XADD, DW), \
+ INSN_3(STX, ATOMIC, W), \
+ INSN_3(STX, ATOMIC, DW), \
/* Immediate based. */ \
INSN_3(ST, MEM, B), \
INSN_3(ST, MEM, H), \
@@ -1618,13 +1618,25 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
LDX_PROBE(DW, 8)
#undef LDX_PROBE

- STX_XADD_W: /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
- atomic_add((u32) SRC, (atomic_t *)(unsigned long)
- (DST + insn->off));
+ STX_ATOMIC_W:
+ switch (IMM) {
+ case BPF_ADD:
+ /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
+ atomic_add((u32) SRC, (atomic_t *)(unsigned long)
+ (DST + insn->off));
+ default:
+ goto default_label;
+ }
CONT;
- STX_XADD_DW: /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */
- atomic64_add((u64) SRC, (atomic64_t *)(unsigned long)
- (DST + insn->off));
+ STX_ATOMIC_DW:
+ switch (IMM) {
+ case BPF_ADD:
+ /* lock xadd *(u64 *)(dst_reg + off16) += src_reg */
+ atomic64_add((u64) SRC, (atomic64_t *)(unsigned long)
+ (DST + insn->off));
+ default:
+ goto default_label;
+ }
CONT;

default_label:
@@ -1634,7 +1646,8 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
*
* Note, verifier whitelists all opcodes in bpf_opcode_in_insntable().
*/
- pr_warn("BPF interpreter: unknown opcode %02x\n", insn->code);
+ pr_warn("BPF interpreter: unknown opcode %02x (imm: 0x%x)\n",
+ insn->code, insn->imm);
BUG_ON(1);
return 0;
}
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index b44d8c447afd..37c8d6e9b4cc 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -153,14 +153,16 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg,
insn->off, insn->src_reg);
- else if (BPF_MODE(insn->code) == BPF_XADD)
+ else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+ insn->imm == BPF_ADD) {
verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) += r%d\n",
insn->code,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off,
insn->src_reg);
- else
+ } else {
verbose(cbs->private_data, "BUG_%02x\n", insn->code);
+ }
} else if (class == BPF_ST) {
if (BPF_MODE(insn->code) != BPF_MEM) {
verbose(cbs->private_data, "BUG_st_%02x\n", insn->code);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 93def76cf32b..615be10abd71 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3606,13 +3606,17 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
return err;
}

-static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
+static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
{
int err;

- if ((BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) ||
- insn->imm != 0) {
- verbose(env, "BPF_XADD uses reserved fields\n");
+ if (insn->imm != BPF_ADD) {
+ verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
+ return -EINVAL;
+ }
+
+ if (BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) {
+ verbose(env, "invalid atomic operand size\n");
return -EINVAL;
}

@@ -3635,19 +3639,19 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins
is_pkt_reg(env, insn->dst_reg) ||
is_flow_key_reg(env, insn->dst_reg) ||
is_sk_reg(env, insn->dst_reg)) {
- verbose(env, "BPF_XADD stores into R%d %s is not allowed\n",
+ verbose(env, "BPF_ATOMIC stores into R%d %s is not allowed\n",
insn->dst_reg,
reg_type_str[reg_state(env, insn->dst_reg)->type]);
return -EACCES;
}

- /* check whether atomic_add can read the memory */
+ /* check whether we can read the memory */
err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
BPF_SIZE(insn->code), BPF_READ, -1, true);
if (err)
return err;

- /* check whether atomic_add can write into the same memory */
+ /* check whether we can write into the same memory */
return check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
BPF_SIZE(insn->code), BPF_WRITE, -1, true);
}
@@ -9523,8 +9527,8 @@ static int do_check(struct bpf_verifier_env *env)
} else if (class == BPF_STX) {
enum bpf_reg_type *prev_dst_type, dst_reg_type;

- if (BPF_MODE(insn->code) == BPF_XADD) {
- err = check_xadd(env, env->insn_idx, insn);
+ if (BPF_MODE(insn->code) == BPF_ATOMIC) {
+ err = check_atomic(env, env->insn_idx, insn);
if (err)
return err;
env->insn_idx++;
@@ -9937,7 +9941,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)

if (BPF_CLASS(insn->code) == BPF_STX &&
((BPF_MODE(insn->code) != BPF_MEM &&
- BPF_MODE(insn->code) != BPF_XADD) || insn->imm != 0)) {
+ BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) {
verbose(env, "BPF_STX uses reserved fields\n");
return -EINVAL;
}
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index ca7d635bccd9..d17e2e5a712f 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -4295,13 +4295,13 @@ static struct bpf_test tests[] = {
{ { 0, 0xffffffff } },
.stack_depth = 40,
},
- /* BPF_STX | BPF_XADD | BPF_W/DW */
+ /* BPF_STX | BPF_ATOMIC | BPF_W/DW */
{
"STX_XADD_W: Test: 0x12 + 0x10 = 0x22",
.u.insns_int = {
BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
BPF_ST_MEM(BPF_W, R10, -40, 0x10),
- BPF_STX_XADD(BPF_W, R10, R0, -40),
+ BPF_ATOMIC32(BPF_ADD, R10, R0, -40),
BPF_LDX_MEM(BPF_W, R0, R10, -40),
BPF_EXIT_INSN(),
},
@@ -4316,7 +4316,7 @@ static struct bpf_test tests[] = {
BPF_ALU64_REG(BPF_MOV, R1, R10),
BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
BPF_ST_MEM(BPF_W, R10, -40, 0x10),
- BPF_STX_XADD(BPF_W, R10, R0, -40),
+ BPF_ATOMIC32(BPF_ADD, R10, R0, -40),
BPF_ALU64_REG(BPF_MOV, R0, R10),
BPF_ALU64_REG(BPF_SUB, R0, R1),
BPF_EXIT_INSN(),
@@ -4331,7 +4331,7 @@ static struct bpf_test tests[] = {
.u.insns_int = {
BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
BPF_ST_MEM(BPF_W, R10, -40, 0x10),
- BPF_STX_XADD(BPF_W, R10, R0, -40),
+ BPF_ATOMIC32(BPF_ADD, R10, R0, -40),
BPF_EXIT_INSN(),
},
INTERNAL,
@@ -4352,7 +4352,7 @@ static struct bpf_test tests[] = {
.u.insns_int = {
BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
- BPF_STX_XADD(BPF_DW, R10, R0, -40),
+ BPF_ATOMIC64(BPF_ADD, R10, R0, -40),
BPF_LDX_MEM(BPF_DW, R0, R10, -40),
BPF_EXIT_INSN(),
},
@@ -4367,7 +4367,7 @@ static struct bpf_test tests[] = {
BPF_ALU64_REG(BPF_MOV, R1, R10),
BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
- BPF_STX_XADD(BPF_DW, R10, R0, -40),
+ BPF_ATOMIC64(BPF_ADD, R10, R0, -40),
BPF_ALU64_REG(BPF_MOV, R0, R10),
BPF_ALU64_REG(BPF_SUB, R0, R1),
BPF_EXIT_INSN(),
@@ -4382,7 +4382,7 @@ static struct bpf_test tests[] = {
.u.insns_int = {
BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
- BPF_STX_XADD(BPF_DW, R10, R0, -40),
+ BPF_ATOMIC64(BPF_ADD, R10, R0, -40),
BPF_EXIT_INSN(),
},
INTERNAL,
diff --git a/samples/bpf/bpf_insn.h b/samples/bpf/bpf_insn.h
index 544237980582..db67a2847395 100644
--- a/samples/bpf/bpf_insn.h
+++ b/samples/bpf/bpf_insn.h
@@ -138,11 +138,11 @@ struct bpf_insn;

#define BPF_STX_XADD(SIZE, DST, SRC, OFF) \
((struct bpf_insn) { \
- .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
.dst_reg = DST, \
.src_reg = SRC, \
.off = OFF, \
- .imm = 0 })
+ .imm = BPF_ADD })

/* Memory store, *(uint *) (dst_reg + off16) = imm32 */

diff --git a/samples/bpf/cookie_uid_helper_example.c b/samples/bpf/cookie_uid_helper_example.c
index deb0e3e0324d..545fd6a8d800 100644
--- a/samples/bpf/cookie_uid_helper_example.c
+++ b/samples/bpf/cookie_uid_helper_example.c
@@ -147,12 +147,10 @@ static void prog_load(void)
*/
BPF_MOV64_REG(BPF_REG_9, BPF_REG_0),
BPF_MOV64_IMM(BPF_REG_1, 1),
- BPF_STX_XADD(BPF_DW, BPF_REG_9, BPF_REG_1,
- offsetof(struct stats, packets)),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_9, BPF_REG_1, offsetof(struct stats, packets)),
BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6,
offsetof(struct __sk_buff, len)),
- BPF_STX_XADD(BPF_DW, BPF_REG_9, BPF_REG_1,
- offsetof(struct stats, bytes)),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_9, BPF_REG_1, offsetof(struct stats, bytes)),
BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6,
offsetof(struct __sk_buff, len)),
BPF_EXIT_INSN(),
diff --git a/samples/bpf/sock_example.c b/samples/bpf/sock_example.c
index 00aae1d33fca..e47bf6764841 100644
--- a/samples/bpf/sock_example.c
+++ b/samples/bpf/sock_example.c
@@ -54,7 +54,7 @@ static int test_sock(void)
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
BPF_EXIT_INSN(),
};
diff --git a/samples/bpf/test_cgrp2_attach.c b/samples/bpf/test_cgrp2_attach.c
index 20fbd1241db3..80b03df23da4 100644
--- a/samples/bpf/test_cgrp2_attach.c
+++ b/samples/bpf/test_cgrp2_attach.c
@@ -53,7 +53,7 @@ static int prog_load(int map_fd, int verdict)
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_0, BPF_REG_1, 0),

/* Count bytes */
BPF_MOV64_IMM(BPF_REG_0, MAP_KEY_BYTES), /* r0 = 1 */
@@ -64,7 +64,8 @@ static int prog_load(int map_fd, int verdict)
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6, offsetof(struct __sk_buff, len)), /* r1 = skb->len */
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
+
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_0, BPF_REG_1, 0),

BPF_MOV64_IMM(BPF_REG_0, verdict), /* r0 = verdict */
BPF_EXIT_INSN(),
diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
index ca28b6ab8db7..f345f12c1ff8 100644
--- a/tools/include/linux/filter.h
+++ b/tools/include/linux/filter.h
@@ -169,15 +169,37 @@
.off = OFF, \
.imm = 0 })

-/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */
+/*
+ * Atomic operations:
+ *
+ * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
+ */
+
+#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_DW | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = OP })
+
+#define BPF_ATOMIC32(OP, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_W | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = OP })
+
+/* Legacy equivalent of BPF_ATOMIC{64,32}(BPF_ADD, DST, SRC, OFF) */

#define BPF_STX_XADD(SIZE, DST, SRC, OFF) \
((struct bpf_insn) { \
- .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
.dst_reg = DST, \
.src_reg = SRC, \
.off = OFF, \
- .imm = 0 })
+ .imm = BPF_ADD })

/* Memory store, *(uint *) (dst_reg + off16) = imm32 */

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 30b477a26482..98161e2d389f 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -19,7 +19,8 @@

/* ld/ldx fields */
#define BPF_DW 0x18 /* double word (64-bit) */
-#define BPF_XADD 0xc0 /* exclusive add */
+#define BPF_ATOMIC 0xc0 /* atomic memory ops - op type in immediate */
+#define BPF_XADD 0xc0 /* exclusive add - legacy name */

/* alu/jmp fields */
#define BPF_MOV 0xb0 /* mov reg to reg */
@@ -2448,7 +2449,7 @@ union bpf_attr {
* running simultaneously.
*
* A user should care about the synchronization by himself.
- * For example, by using the **BPF_STX_XADD** instruction to alter
+ * For example, by using the **BPF_ATOMIC64** instructions to alter
* the shared data.
* Return
* A pointer to the local storage area.
diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c b/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c
index b549fcfacc0b..086527487f83 100644
--- a/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c
+++ b/tools/testing/selftests/bpf/prog_tests/cgroup_attach_multi.c
@@ -45,13 +45,13 @@ static int prog_load_cnt(int verdict, int val)
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
BPF_MOV64_IMM(BPF_REG_1, val), /* r1 = 1 */
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* xadd r0 += r1 */
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_0, BPF_REG_1, 0),

BPF_LD_MAP_FD(BPF_REG_1, cgroup_storage_fd),
BPF_MOV64_IMM(BPF_REG_2, 0),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_local_storage),
BPF_MOV64_IMM(BPF_REG_1, val),
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_W, BPF_REG_0, BPF_REG_1, 0, 0),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_0, BPF_REG_1, 0),

BPF_LD_MAP_FD(BPF_REG_1, percpu_cgroup_storage_fd),
BPF_MOV64_IMM(BPF_REG_2, 0),
diff --git a/tools/testing/selftests/bpf/test_cgroup_storage.c b/tools/testing/selftests/bpf/test_cgroup_storage.c
index d946252a25bb..aa90f219e04c 100644
--- a/tools/testing/selftests/bpf/test_cgroup_storage.c
+++ b/tools/testing/selftests/bpf/test_cgroup_storage.c
@@ -29,7 +29,7 @@ int main(int argc, char **argv)
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
BPF_FUNC_get_local_storage),
BPF_MOV64_IMM(BPF_REG_1, 1),
- BPF_STX_XADD(BPF_DW, BPF_REG_0, BPF_REG_1, 0),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0),
BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0x1),
BPF_MOV64_REG(BPF_REG_0, BPF_REG_1),
diff --git a/tools/testing/selftests/bpf/verifier/ctx.c b/tools/testing/selftests/bpf/verifier/ctx.c
index 93d6b1641481..c8caed32ea91 100644
--- a/tools/testing/selftests/bpf/verifier/ctx.c
+++ b/tools/testing/selftests/bpf/verifier/ctx.c
@@ -10,14 +10,13 @@
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
},
{
- "context stores via XADD",
+ "context stores via BPF_ATOMIC",
.insns = {
BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_W, BPF_REG_1,
- BPF_REG_0, offsetof(struct __sk_buff, mark), 0),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_1, BPF_REG_0, offsetof(struct __sk_buff, mark)),
BPF_EXIT_INSN(),
},
- .errstr = "BPF_XADD stores into R1 ctx is not allowed",
+ .errstr = "BPF_ATOMIC stores into R1 ctx is not allowed",
.result = REJECT,
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
},
diff --git a/tools/testing/selftests/bpf/verifier/direct_packet_access.c b/tools/testing/selftests/bpf/verifier/direct_packet_access.c
index ae72536603fe..783536603100 100644
--- a/tools/testing/selftests/bpf/verifier/direct_packet_access.c
+++ b/tools/testing/selftests/bpf/verifier/direct_packet_access.c
@@ -333,7 +333,7 @@
BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -8),
BPF_STX_MEM(BPF_DW, BPF_REG_4, BPF_REG_2, 0),
- BPF_STX_XADD(BPF_DW, BPF_REG_4, BPF_REG_5, 0),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_4, BPF_REG_5, 0),
BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_4, 0),
BPF_STX_MEM(BPF_W, BPF_REG_2, BPF_REG_5, 0),
BPF_MOV64_IMM(BPF_REG_0, 0),
@@ -488,7 +488,7 @@
BPF_JMP_REG(BPF_JGT, BPF_REG_0, BPF_REG_3, 11),
BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_10, -8),
BPF_MOV64_IMM(BPF_REG_4, 0xffffffff),
- BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_4, -8),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_10, BPF_REG_4, -8),
BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -8),
BPF_ALU64_IMM(BPF_RSH, BPF_REG_4, 49),
BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_2),
diff --git a/tools/testing/selftests/bpf/verifier/leak_ptr.c b/tools/testing/selftests/bpf/verifier/leak_ptr.c
index d6eec17f2cd2..f033c1ca33d0 100644
--- a/tools/testing/selftests/bpf/verifier/leak_ptr.c
+++ b/tools/testing/selftests/bpf/verifier/leak_ptr.c
@@ -5,7 +5,7 @@
BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0,
offsetof(struct __sk_buff, cb[0])),
BPF_LD_MAP_FD(BPF_REG_2, 0),
- BPF_STX_XADD(BPF_DW, BPF_REG_1, BPF_REG_2,
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_1, BPF_REG_2,
offsetof(struct __sk_buff, cb[0])),
BPF_EXIT_INSN(),
},
@@ -13,7 +13,7 @@
.errstr_unpriv = "R2 leaks addr into mem",
.result_unpriv = REJECT,
.result = REJECT,
- .errstr = "BPF_XADD stores into R1 ctx is not allowed",
+ .errstr = "BPF_ATOMIC stores into R1 ctx is not allowed",
},
{
"leak pointer into ctx 2",
@@ -21,14 +21,14 @@
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0,
offsetof(struct __sk_buff, cb[0])),
- BPF_STX_XADD(BPF_DW, BPF_REG_1, BPF_REG_10,
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_1, BPF_REG_10,
offsetof(struct __sk_buff, cb[0])),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "R10 leaks addr into mem",
.result_unpriv = REJECT,
.result = REJECT,
- .errstr = "BPF_XADD stores into R1 ctx is not allowed",
+ .errstr = "BPF_ATOMIC stores into R1 ctx is not allowed",
},
{
"leak pointer into ctx 3",
@@ -56,7 +56,7 @@
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
BPF_MOV64_IMM(BPF_REG_3, 0),
BPF_STX_MEM(BPF_DW, BPF_REG_0, BPF_REG_3, 0),
- BPF_STX_XADD(BPF_DW, BPF_REG_0, BPF_REG_6, 0),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_0, BPF_REG_6, 0),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
diff --git a/tools/testing/selftests/bpf/verifier/meta_access.c b/tools/testing/selftests/bpf/verifier/meta_access.c
index 205292b8dd65..56d38e8244f4 100644
--- a/tools/testing/selftests/bpf/verifier/meta_access.c
+++ b/tools/testing/selftests/bpf/verifier/meta_access.c
@@ -171,7 +171,7 @@
BPF_MOV64_IMM(BPF_REG_5, 42),
BPF_MOV64_IMM(BPF_REG_6, 24),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_5, -8),
- BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_6, -8),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_10, BPF_REG_6, -8),
BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -8),
BPF_JMP_IMM(BPF_JGT, BPF_REG_5, 100, 6),
BPF_ALU64_REG(BPF_ADD, BPF_REG_3, BPF_REG_5),
@@ -196,7 +196,7 @@
BPF_MOV64_IMM(BPF_REG_5, 42),
BPF_MOV64_IMM(BPF_REG_6, 24),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_5, -8),
- BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_6, -8),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_10, BPF_REG_6, -8),
BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -8),
BPF_JMP_IMM(BPF_JGT, BPF_REG_5, 100, 6),
BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_5),
diff --git a/tools/testing/selftests/bpf/verifier/unpriv.c b/tools/testing/selftests/bpf/verifier/unpriv.c
index 91bb77c24a2e..85b5e8b70e5d 100644
--- a/tools/testing/selftests/bpf/verifier/unpriv.c
+++ b/tools/testing/selftests/bpf/verifier/unpriv.c
@@ -206,7 +206,8 @@
BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, -8),
BPF_STX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, 0),
BPF_MOV64_IMM(BPF_REG_0, 1),
- BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_10, BPF_REG_0, -8, 0),
+ BPF_RAW_INSN(BPF_STX | BPF_ATOMIC | BPF_DW,
+ BPF_REG_10, BPF_REG_0, -8, BPF_ADD),
BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, 0),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_hash_recalc),
BPF_EXIT_INSN(),
diff --git a/tools/testing/selftests/bpf/verifier/value_illegal_alu.c b/tools/testing/selftests/bpf/verifier/value_illegal_alu.c
index ed1c2cea1dea..43e621f5da85 100644
--- a/tools/testing/selftests/bpf/verifier/value_illegal_alu.c
+++ b/tools/testing/selftests/bpf/verifier/value_illegal_alu.c
@@ -82,7 +82,7 @@
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_0, 0),
- BPF_STX_XADD(BPF_DW, BPF_REG_2, BPF_REG_3, 0),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_2, BPF_REG_3, 0),
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0),
BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 22),
BPF_EXIT_INSN(),
diff --git a/tools/testing/selftests/bpf/verifier/xadd.c b/tools/testing/selftests/bpf/verifier/xadd.c
index c5de2e62cc8b..aa1a12227018 100644
--- a/tools/testing/selftests/bpf/verifier/xadd.c
+++ b/tools/testing/selftests/bpf/verifier/xadd.c
@@ -3,7 +3,7 @@
.insns = {
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
- BPF_STX_XADD(BPF_W, BPF_REG_10, BPF_REG_0, -7),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_10, BPF_REG_0, -7),
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8),
BPF_EXIT_INSN(),
},
@@ -22,7 +22,7 @@
BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
BPF_EXIT_INSN(),
BPF_MOV64_IMM(BPF_REG_1, 1),
- BPF_STX_XADD(BPF_W, BPF_REG_0, BPF_REG_1, 3),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_0, BPF_REG_1, 3),
BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 3),
BPF_EXIT_INSN(),
},
@@ -45,13 +45,13 @@
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_ST_MEM(BPF_W, BPF_REG_2, 0, 0),
BPF_ST_MEM(BPF_W, BPF_REG_2, 3, 0),
- BPF_STX_XADD(BPF_W, BPF_REG_2, BPF_REG_0, 1),
- BPF_STX_XADD(BPF_W, BPF_REG_2, BPF_REG_0, 2),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_2, BPF_REG_0, 1),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_2, BPF_REG_0, 2),
BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_2, 1),
BPF_EXIT_INSN(),
},
.result = REJECT,
- .errstr = "BPF_XADD stores into R2 pkt is not allowed",
+ .errstr = "BPF_ATOMIC stores into R2 pkt is not allowed",
.prog_type = BPF_PROG_TYPE_XDP,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
},
@@ -62,8 +62,8 @@
BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
BPF_MOV64_REG(BPF_REG_7, BPF_REG_10),
BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
- BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
- BPF_STX_XADD(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_10, BPF_REG_0, -8),
+ BPF_ATOMIC64(BPF_ADD, BPF_REG_10, BPF_REG_0, -8),
BPF_JMP_REG(BPF_JNE, BPF_REG_6, BPF_REG_0, 3),
BPF_JMP_REG(BPF_JNE, BPF_REG_7, BPF_REG_10, 2),
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8),
@@ -82,8 +82,8 @@
BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
BPF_MOV64_REG(BPF_REG_7, BPF_REG_10),
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -8),
- BPF_STX_XADD(BPF_W, BPF_REG_10, BPF_REG_0, -8),
- BPF_STX_XADD(BPF_W, BPF_REG_10, BPF_REG_0, -8),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_10, BPF_REG_0, -8),
+ BPF_ATOMIC32(BPF_ADD, BPF_REG_10, BPF_REG_0, -8),
BPF_JMP_REG(BPF_JNE, BPF_REG_6, BPF_REG_0, 3),
BPF_JMP_REG(BPF_JNE, BPF_REG_7, BPF_REG_10, 2),
BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_10, -8),
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:14:28

by Brendan Jackman

[permalink] [raw]
Subject: [PATCH bpf-next v4 06/11] bpf: Add BPF_FETCH field / create atomic_fetch_add instruction

The BPF_FETCH field can be set in bpf_insn.imm for BPF_ATOMIC
instructions in order to have the previous value of the
atomically-modified memory location loaded into the src register
after the atomic op is carried out.
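
For illustration, a minimal sketch using the BPF_ATOMIC_FETCH_ADD
helper added below; assume r2 points at a u64 that currently holds 5:

    BPF_MOV64_IMM(BPF_REG_1, 3),                          /* r1 = 3 */
    BPF_ATOMIC_FETCH_ADD(BPF_DW, BPF_REG_2, BPF_REG_1, 0),
    /* Atomically, *(u64 *)(r2 + 0) becomes 8 and r1 is loaded with
     * the old value, 5. Without BPF_FETCH, r1 would be untouched.
     */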

Suggested-by: Yonghong Song <[email protected]>
Signed-off-by: Brendan Jackman <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 4 ++++
include/linux/filter.h | 1 +
include/uapi/linux/bpf.h | 3 +++
kernel/bpf/core.c | 13 +++++++++++++
kernel/bpf/disasm.c | 7 +++++++
kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++---------
tools/include/linux/filter.h | 11 +++++++++++
tools/include/uapi/linux/bpf.h | 3 +++
8 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b1829a534da1..eea7d8b0bb12 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -811,6 +811,10 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
/* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */
EMIT1(simple_alu_opcodes[atomic_op]);
break;
+ case BPF_ADD | BPF_FETCH:
+ /* src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
+ EMIT2(0x0F, 0xC1);
+ break;
default:
pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op);
return -EFAULT;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 45be19408f68..b5258bca10d2 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -264,6 +264,7 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
* Atomic operations:
*
* BPF_ADD *(uint *) (dst_reg + off16) += src_reg
+ * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
*/

#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 98161e2d389f..d5389119291e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -44,6 +44,9 @@
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */

+/* atomic op type fields (stored in immediate) */
+#define BPF_FETCH 0x01 /* fetch previous value into src reg */
+
/* Register numbers */
enum {
BPF_REG_0 = 0,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 3abc6b250b18..61e93eb7d363 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1624,16 +1624,29 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
/* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
atomic_add((u32) SRC, (atomic_t *)(unsigned long)
(DST + insn->off));
+ break;
+ case BPF_ADD | BPF_FETCH:
+ SRC = (u32) atomic_fetch_add(
+ (u32) SRC,
+ (atomic_t *)(unsigned long) (DST + insn->off));
+ break;
default:
goto default_label;
}
CONT;
+
STX_ATOMIC_DW:
switch (IMM) {
case BPF_ADD:
/* lock xadd *(u64 *)(dst_reg + off16) += src_reg */
atomic64_add((u64) SRC, (atomic64_t *)(unsigned long)
(DST + insn->off));
+ break;
+ case BPF_ADD | BPF_FETCH:
+ SRC = (u64) atomic64_fetch_add(
+ (u64) SRC,
+ (atomic64_t *)(s64) (DST + insn->off));
+ break;
default:
goto default_label;
}
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 37c8d6e9b4cc..d2e20f6d0516 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -160,6 +160,13 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off,
insn->src_reg);
+ } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
+ insn->imm == (BPF_ADD | BPF_FETCH)) {
+ verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_add((%s *)(r%d %+d), r%d)\n",
+ insn->code, insn->src_reg,
+ BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+ bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
+ insn->dst_reg, insn->off, insn->src_reg);
} else {
verbose(cbs->private_data, "BUG_%02x\n", insn->code);
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 745c53df0485..f8c4e809297d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3610,7 +3610,11 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
{
int err;

- if (insn->imm != BPF_ADD) {
+ switch (insn->imm) {
+ case BPF_ADD:
+ case BPF_ADD | BPF_FETCH:
+ break;
+ default:
verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
return -EINVAL;
}
@@ -3652,8 +3656,20 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
return err;

/* check whether we can write into the same memory */
- return check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
- BPF_SIZE(insn->code), BPF_WRITE, -1, true);
+ err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
+ BPF_SIZE(insn->code), BPF_WRITE, -1, true);
+ if (err)
+ return err;
+
+ if (!(insn->imm & BPF_FETCH))
+ return 0;
+
+ /* check and record load of old value into src reg */
+ err = check_reg_arg(env, insn->src_reg, DST_OP);
+ if (err)
+ return err;
+
+ return 0;
}

static int __check_stack_boundary(struct bpf_verifier_env *env, u32 regno,
@@ -9527,12 +9543,6 @@ static int do_check(struct bpf_verifier_env *env)
} else if (class == BPF_STX) {
enum bpf_reg_type *prev_dst_type, dst_reg_type;

- if (((BPF_MODE(insn->code) != BPF_MEM &&
- BPF_MODE(insn->code) != BPF_ATOMIC) || insn->imm != 0)) {
- verbose(env, "BPF_STX uses reserved fields\n");
- return -EINVAL;
- }
-
if (BPF_MODE(insn->code) == BPF_ATOMIC) {
err = check_atomic(env, env->insn_idx, insn);
if (err)
@@ -9541,6 +9551,11 @@ static int do_check(struct bpf_verifier_env *env)
continue;
}

+ if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) {
+ verbose(env, "BPF_STX uses reserved fields\n");
+ return -EINVAL;
+ }
+
/* check src1 operand */
err = check_reg_arg(env, insn->src_reg, SRC_OP);
if (err)
diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
index f345f12c1ff8..4e0100ba52c2 100644
--- a/tools/include/linux/filter.h
+++ b/tools/include/linux/filter.h
@@ -173,6 +173,7 @@
* Atomic operations:
*
* BPF_ADD *(uint *) (dst_reg + off16) += src_reg
+ * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
*/

#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
@@ -201,6 +202,16 @@
.off = OFF, \
.imm = BPF_ADD })

+/* Atomic memory add with fetch, src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_ADD | BPF_FETCH })
+
/* Memory store, *(uint *) (dst_reg + off16) = imm32 */

#define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 98161e2d389f..d5389119291e 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -44,6 +44,9 @@
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */

+/* atomic op type fields (stored in immediate) */
+#define BPF_FETCH 0x01 /* fetch previous value into src reg */
+
/* Register numbers */
enum {
BPF_REG_0 = 0,
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 16:15:06

by Brendan Jackman

[permalink] [raw]
Subject: [PATCH bpf-next v4 09/11] bpf: Add bitwise atomic instructions

This adds instructions for

atomic[64]_[fetch_]and
atomic[64]_[fetch_]or
atomic[64]_[fetch_]xor

All these operations are isomorphic enough to implement with the same
verifier, interpreter, and x86 JIT code, hence they live in a single
commit.

The main interesting thing here is that x86 doesn't directly support
the fetch_ version of these operations, so we need to generate a
CMPXCHG loop in the JIT. This requires the use of two temporary
registers; IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for
this purpose.
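
In C terms the emitted code behaves roughly like the sketch below
(ptr and src stand in for dst_reg + off and src_reg; this is not the
literal JIT output), shown for a 64-bit fetch_or:

    u64 old, new;

    do {
            old = *ptr;           /* JIT: load old value into R0 (RAX) */
            new = old | src;      /* JIT: apply the op, result in AUX_REG */
    } while (cmpxchg(ptr, old, new) != old); /* lock cmpxchg; jne back on failure */
    src = old;                    /* BPF_FETCH: src_reg gets the old value */

R0 is stashed in BPF_REG_AX before the loop and restored afterwards,
since the CMPXCHG clobbers RAX.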

Signed-off-by: Brendan Jackman <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 50 ++++++++++++++++++++++++++-
include/linux/filter.h | 66 ++++++++++++++++++++++++++++++++++++
kernel/bpf/core.c | 3 ++
kernel/bpf/disasm.c | 21 +++++++++---
kernel/bpf/verifier.c | 6 ++++
tools/include/linux/filter.h | 66 ++++++++++++++++++++++++++++++++++++
6 files changed, 207 insertions(+), 5 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 308241187582..1d4d50199293 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -808,6 +808,10 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
/* emit opcode */
switch (atomic_op) {
case BPF_ADD:
+ case BPF_SUB:
+ case BPF_AND:
+ case BPF_OR:
+ case BPF_XOR:
/* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */
EMIT1(simple_alu_opcodes[atomic_op]);
break;
@@ -1292,8 +1296,52 @@ st: if (is_imm8(insn->off))

case BPF_STX | BPF_ATOMIC | BPF_W:
case BPF_STX | BPF_ATOMIC | BPF_DW:
+ if (insn->imm == (BPF_AND | BPF_FETCH) ||
+ insn->imm == (BPF_OR | BPF_FETCH) ||
+ insn->imm == (BPF_XOR | BPF_FETCH)) {
+ u8 *branch_target;
+ bool is64 = BPF_SIZE(insn->code) == BPF_DW;
+
+ /*
+ * Can't be implemented with a single x86 insn.
+ * Need to do a CMPXCHG loop.
+ */
+
+ /* Will need RAX as a CMPXCHG operand so save R0 */
+ emit_mov_reg(&prog, true, BPF_REG_AX, BPF_REG_0);
+ branch_target = prog;
+ /* Load old value */
+ emit_ldx(&prog, BPF_SIZE(insn->code),
+ BPF_REG_0, dst_reg, insn->off);
+ /*
+ * Perform the (commutative) operation locally,
+ * put the result in the AUX_REG.
+ */
+ emit_mov_reg(&prog, is64, AUX_REG, BPF_REG_0);
+ maybe_emit_mod(&prog, AUX_REG, src_reg, is64);
+ EMIT2(simple_alu_opcodes[BPF_OP(insn->imm)],
+ add_2reg(0xC0, AUX_REG, src_reg));
+ /* Attempt to swap in new value */
+ err = emit_atomic(&prog, BPF_CMPXCHG,
+ dst_reg, AUX_REG, insn->off,
+ BPF_SIZE(insn->code));
+ if (WARN_ON(err))
+ return err;
+ /*
+ * ZF tells us whether we won the race. If it's
+ * cleared we need to try again.
+ */
+ EMIT2(X86_JNE, -(prog - branch_target) - 2);
+ /* Return the pre-modification value */
+ emit_mov_reg(&prog, is64, src_reg, BPF_REG_0);
+ /* Restore R0 after clobbering RAX */
+ emit_mov_reg(&prog, true, BPF_REG_0, BPF_REG_AX);
+ break;
+
+ }
+
err = emit_atomic(&prog, insn->imm, dst_reg, src_reg,
- insn->off, BPF_SIZE(insn->code));
+ insn->off, BPF_SIZE(insn->code));
if (err)
return err;
break;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index e1e1fc946a7c..e100c71555a4 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -264,7 +264,13 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
* Atomic operations:
*
* BPF_ADD *(uint *) (dst_reg + off16) += src_reg
+ * BPF_AND *(uint *) (dst_reg + off16) &= src_reg
+ * BPF_OR *(uint *) (dst_reg + off16) |= src_reg
+ * BPF_XOR *(uint *) (dst_reg + off16) ^= src_reg
* BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
+ * BPF_AND | BPF_FETCH src_reg = atomic_fetch_and(dst_reg + off16, src_reg);
+ * BPF_OR | BPF_FETCH src_reg = atomic_fetch_or(dst_reg + off16, src_reg);
+ * BPF_XOR | BPF_FETCH src_reg = atomic_fetch_xor(dst_reg + off16, src_reg);
* BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
* BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
*/
@@ -295,6 +301,66 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
.off = OFF, \
.imm = BPF_ADD })

+/* Atomic memory and, *(uint *)(dst_reg + off16) &= src_reg */
+
+#define BPF_ATOMIC_AND(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_AND })
+
+/* Atomic memory and with fetch, src_reg = atomic_fetch_and(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_AND(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_AND | BPF_FETCH })
+
+/* Atomic memory or, *(uint *)(dst_reg + off16) |= src_reg */
+
+#define BPF_ATOMIC_OR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_OR })
+
+/* Atomic memory or with fetch, src_reg = atomic_fetch_or(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_OR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_OR | BPF_FETCH })
+
+/* Atomic memory xor, *(uint *)(dst_reg + off16) ^= src_reg */
+
+#define BPF_ATOMIC_XOR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_XOR })
+
+/* Atomic memory xor with fetch, src_reg = atomic_fetch_xor(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_XOR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_XOR | BPF_FETCH })
+
/* Atomic exchange, src_reg = atomic_xchg(dst_reg + off, src_reg) */

#define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF) \
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1d9e5dcde03a..4b78ff89ec91 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1642,6 +1642,9 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
STX_ATOMIC_W:
switch (IMM) {
ATOMIC_ALU_OP(BPF_ADD, add)
+ ATOMIC_ALU_OP(BPF_AND, and)
+ ATOMIC_ALU_OP(BPF_OR, or)
+ ATOMIC_ALU_OP(BPF_XOR, xor)
#undef ATOMIC_ALU_OP

case BPF_XCHG:
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index ee8d1132767b..19ff8fed7f4b 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -80,6 +80,13 @@ const char *const bpf_alu_string[16] = {
[BPF_END >> 4] = "endian",
};

+static const char *const bpf_atomic_alu_string[16] = {
+ [BPF_ADD >> 4] = "add",
+ [BPF_AND >> 4] = "and",
+ [BPF_OR >> 4] = "or",
+ [BPF_XOR >> 4] = "xor",
+};
+
static const char *const bpf_ldst_string[] = {
[BPF_W >> 3] = "u32",
[BPF_H >> 3] = "u16",
@@ -154,17 +161,23 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
insn->dst_reg,
insn->off, insn->src_reg);
else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
- insn->imm == BPF_ADD) {
- verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) += r%d\n",
+ (insn->imm == BPF_ADD || insn->imm == BPF_AND ||
+ insn->imm == BPF_OR || insn->imm == BPF_XOR)) {
+ verbose(cbs->private_data, "(%02x) lock *(%s *)(r%d %+d) %s r%d\n",
insn->code,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off,
+ bpf_alu_string[BPF_OP(insn->imm) >> 4],
insn->src_reg);
} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
- insn->imm == (BPF_ADD | BPF_FETCH)) {
- verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_add((%s *)(r%d %+d), r%d)\n",
+ (insn->imm == (BPF_ADD | BPF_FETCH) ||
+ insn->imm == (BPF_AND | BPF_FETCH) ||
+ insn->imm == (BPF_OR | BPF_FETCH) ||
+ insn->imm == (BPF_XOR | BPF_FETCH))) {
+ verbose(cbs->private_data, "(%02x) r%d = atomic%s_fetch_%s((%s *)(r%d %+d), r%d)\n",
insn->code, insn->src_reg,
BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
+ bpf_atomic_alu_string[BPF_OP(insn->imm) >> 4],
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off, insn->src_reg);
} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f5f4460b3e4e..ec5265e6d91b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3614,6 +3614,12 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
switch (insn->imm) {
case BPF_ADD:
case BPF_ADD | BPF_FETCH:
+ case BPF_AND:
+ case BPF_AND | BPF_FETCH:
+ case BPF_OR:
+ case BPF_OR | BPF_FETCH:
+ case BPF_XOR:
+ case BPF_XOR | BPF_FETCH:
case BPF_XCHG:
case BPF_CMPXCHG:
break;
diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
index 21598053fd40..723c7a485e67 100644
--- a/tools/include/linux/filter.h
+++ b/tools/include/linux/filter.h
@@ -173,7 +173,13 @@
* Atomic operations:
*
* BPF_ADD *(uint *) (dst_reg + off16) += src_reg
+ * BPF_AND *(uint *) (dst_reg + off16) &= src_reg
+ * BPF_OR *(uint *) (dst_reg + off16) |= src_reg
+ * BPF_XOR *(uint *) (dst_reg + off16) ^= src_reg
* BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
+ * BPF_AND | BPF_FETCH src_reg = atomic_fetch_and(dst_reg + off16, src_reg);
+ * BPF_OR | BPF_FETCH src_reg = atomic_fetch_or(dst_reg + off16, src_reg);
+ * BPF_XOR | BPF_FETCH src_reg = atomic_fetch_xor(dst_reg + off16, src_reg);
* BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
* BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
*/
@@ -214,6 +220,66 @@
.off = OFF, \
.imm = BPF_ADD | BPF_FETCH })

+/* Atomic memory and, *(uint *)(dst_reg + off16) &= src_reg */
+
+#define BPF_ATOMIC_AND(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_AND })
+
+/* Atomic memory and with fetch, src_reg = atomic_fetch_and(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_AND(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_AND | BPF_FETCH })
+
+/* Atomic memory or, *(uint *)(dst_reg + off16) |= src_reg */
+
+#define BPF_ATOMIC_OR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_OR })
+
+/* Atomic memory or with fetch, src_reg = atomic_fetch_or(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_OR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_OR | BPF_FETCH })
+
+/* Atomic memory xor, *(uint *)(dst_reg + off16) ^= src_reg */
+
+#define BPF_ATOMIC_XOR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_XOR })
+
+/* Atomic memory xor with fetch, src_reg = atomic_fetch_xor(dst_reg + off, src_reg); */
+
+#define BPF_ATOMIC_FETCH_XOR(SIZE, DST, SRC, OFF) \
+ ((struct bpf_insn) { \
+ .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+ .dst_reg = DST, \
+ .src_reg = SRC, \
+ .off = OFF, \
+ .imm = BPF_XOR | BPF_FETCH })
+
/* Atomic exchange, src_reg = atomic_xchg(dst_reg + off, src_reg) */

#define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF) \
--
2.29.2.576.ga3fc446d84-goog

2020-12-07 21:08:18

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 01/11] bpf: x86: Factor out emission of ModR/M for *(reg + off)

Brendan Jackman wrote:
> The case for JITing atomics is about to get more complicated. Let's
> factor out some common code to make the review and result more
> readable.
>
> NB the atomics code doesn't yet use the new helper - a subsequent
> patch will add its use as a side-effect of other changes.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---

Small nit on style preference below.

Acked-by: John Fastabend <[email protected]>

[...]

>
> @@ -1240,11 +1250,7 @@ st: if (is_imm8(insn->off))
> goto xadd;
> case BPF_STX | BPF_XADD | BPF_DW:
> EMIT3(0xF0, add_2mod(0x48, dst_reg, src_reg), 0x01);
> -xadd: if (is_imm8(insn->off))
> - EMIT2(add_2reg(0x40, dst_reg, src_reg), insn->off);
> - else
> - EMIT1_off32(add_2reg(0x80, dst_reg, src_reg),
> - insn->off);
> +xadd: emit_modrm_dstoff(&prog, dst_reg, src_reg, insn->off);

I at least prefer the xadd on its own line above the emit_*(). That seems
more consistent with the rest of the code in this file. The only other
example like this is st:.
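
i.e. something like:

xadd:
	emit_modrm_dstoff(&prog, dst_reg, src_reg, insn->off);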

2020-12-07 21:14:22

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 03/11] bpf: x86: Factor out a lookup table for some ALU opcodes

Brendan Jackman wrote:
> A later commit will need to lookup a subset of these opcodes. To
> avoid duplicating code, pull out a table.
>
> The shift opcodes won't be needed by that later commit, but they're
> already duplicated, so fold them into the table anyway.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---

Acked-by: John Fastabend <[email protected]>

2020-12-07 22:01:30

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 04/11] bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

Brendan Jackman wrote:
> A subsequent patch will add additional atomic operations. These new
> operations will use the same opcode field as the existing XADD, with
> the immediate discriminating different operations.
>
> In preparation, rename the instruction mode BPF_ATOMIC and start
> calling the zero immediate BPF_ADD.
>
> This is possible (doesn't break existing valid BPF progs) because the
> immediate field is currently reserved MBZ and BPF_ADD is zero.
>
> All uses are removed from the tree but the BPF_XADD definition is
> kept around to avoid breaking builds for people including kernel
> headers.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---
> Documentation/networking/filter.rst | 30 ++++++++-----
> arch/arm/net/bpf_jit_32.c | 7 ++-
> arch/arm64/net/bpf_jit_comp.c | 16 +++++--
> arch/mips/net/ebpf_jit.c | 11 +++--
> arch/powerpc/net/bpf_jit_comp64.c | 25 ++++++++---
> arch/riscv/net/bpf_jit_comp32.c | 20 +++++++--
> arch/riscv/net/bpf_jit_comp64.c | 16 +++++--
> arch/s390/net/bpf_jit_comp.c | 27 ++++++-----
> arch/sparc/net/bpf_jit_comp_64.c | 17 +++++--
> arch/x86/net/bpf_jit_comp.c | 45 ++++++++++++++-----
> arch/x86/net/bpf_jit_comp32.c | 6 +--
> drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 ++++--
> drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +-
> .../net/ethernet/netronome/nfp/bpf/verifier.c | 15 ++++---
> include/linux/filter.h | 29 ++++++++++--
> include/uapi/linux/bpf.h | 5 ++-
> kernel/bpf/core.c | 31 +++++++++----
> kernel/bpf/disasm.c | 6 ++-
> kernel/bpf/verifier.c | 24 +++++-----
> lib/test_bpf.c | 14 +++---
> samples/bpf/bpf_insn.h | 4 +-
> samples/bpf/cookie_uid_helper_example.c | 6 +--
> samples/bpf/sock_example.c | 2 +-
> samples/bpf/test_cgrp2_attach.c | 5 ++-
> tools/include/linux/filter.h | 28 ++++++++++--
> tools/include/uapi/linux/bpf.h | 5 ++-
> .../bpf/prog_tests/cgroup_attach_multi.c | 4 +-
> .../selftests/bpf/test_cgroup_storage.c | 2 +-
> tools/testing/selftests/bpf/verifier/ctx.c | 7 ++-
> .../bpf/verifier/direct_packet_access.c | 4 +-
> .../testing/selftests/bpf/verifier/leak_ptr.c | 10 ++---
> .../selftests/bpf/verifier/meta_access.c | 4 +-
> tools/testing/selftests/bpf/verifier/unpriv.c | 3 +-
> .../bpf/verifier/value_illegal_alu.c | 2 +-
> tools/testing/selftests/bpf/verifier/xadd.c | 18 ++++----
> 35 files changed, 317 insertions(+), 149 deletions(-)
>

[...]

> +++ a/arch/mips/net/ebpf_jit.c

[...]

> - if (BPF_MODE(insn->code) == BPF_XADD) {
> + if (BPF_MODE(insn->code) == BPF_ATOMIC) {
> + if (insn->imm != BPF_ADD) {
> + pr_err("ATOMIC OP %02x NOT HANDLED\n", insn->imm);
> + return -EINVAL;
> + }
> +
> /*
[...]
> +++ b/arch/powerpc/net/bpf_jit_comp64.c

> - case BPF_STX | BPF_XADD | BPF_W:
> + case BPF_STX | BPF_ATOMIC | BPF_W:
> + if (insn->imm != BPF_ADD) {
> + pr_err_ratelimited(
> + "eBPF filter atomic op code %02x (@%d) unsupported\n",
> + code, i);
> + return -ENOTSUPP;
> + }
[...]
> @@ -699,8 +707,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
> - case BPF_STX | BPF_XADD | BPF_DW:
> + case BPF_STX | BPF_ATOMIC | BPF_DW:
> + if (insn->imm != BPF_ADD) {
> + pr_err_ratelimited(
> + "eBPF filter atomic op code %02x (@%d) unsupported\n",
> + code, i);
> + return -ENOTSUPP;
> + }
[...]
> + case BPF_STX | BPF_ATOMIC | BPF_W:
> + if (insn->imm != BPF_ADD) {
> + pr_info_once(
> + "bpf-jit: not supported: atomic operation %02x ***\n",
> + insn->imm);
> + return -EFAULT;
> + }
[...]
> + case BPF_STX | BPF_ATOMIC | BPF_W:
> + case BPF_STX | BPF_ATOMIC | BPF_DW:
> + if (insn->imm != BPF_ADD) {
> + pr_err("bpf-jit: not supported: atomic operation %02x ***\n",
> + insn->imm);
> + return -EINVAL;
> + }

Can we standardize the error across jits and the error return code? It seems
odd that we use pr_err, pr_info_once, pr_err_ratelimited and then return
ENOTSUPP, EFAULT or EINVAL.

granted the error codes might not propagate all the way out at the moment but
still shouldn't hurt.
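
Something like the below in every JIT would do (just a sketch, the
exact message and errno are up for debate):

	if (insn->imm != BPF_ADD) {
		pr_err_once("bpf-jit: unsupported atomic op %02x\n", insn->imm);
		return -EOPNOTSUPP;
	}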

> diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> index 0a4182792876..f973e2ead197 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -1205,18 +1205,23 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,

For example this will return -1 regardless of the error, including for
the insn->imm != BPF_ADD case.
[...]
> + case BPF_STX | BPF_ATOMIC | BPF_DW:
> + case BPF_STX | BPF_ATOMIC | BPF_W:
> + if (insn->imm != BPF_ADD) {
> + pr_err("Unknown atomic operation %02x\n", insn->imm);
> + return -1;
> + }
> +
[...]

> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -259,15 +259,38 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
> .off = OFF, \
> .imm = 0 })
>
> -/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */
> +
> +/*
> + * Atomic operations:
> + *
> + * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
> + */
> +
> +#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_DW | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = OP })
> +
> +#define BPF_ATOMIC32(OP, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_W | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = OP })
> +
> +/* Legacy equivalent of BPF_ATOMIC{64,32}(BPF_ADD, ...) */

Not sure I care too much. Does seem more natural to follow
the pattern below and use,

BPF_ATOMIC(OP, SIZE, DST, SRC, OFF)
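
i.e. (sketch, mirroring the existing BPF_ATOMIC64/32 definitions):

#define BPF_ATOMIC(OP, SIZE, DST, SRC, OFF)			\
	((struct bpf_insn) {					\
		.code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC,	\
		.dst_reg = DST,					\
		.src_reg = SRC,					\
		.off = OFF,					\
		.imm = OP })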

>
> #define BPF_STX_XADD(SIZE, DST, SRC, OFF) \
> ((struct bpf_insn) { \
> - .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> .dst_reg = DST, \
> .src_reg = SRC, \
> .off = OFF, \
> - .imm = 0 })
> + .imm = BPF_ADD })
>
> /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
>

[...]

Otherwise LGTM, I'll try to get the remaining patches reviewed tonight
I need to jump onto something else this afternoon. Thanks!

2020-12-08 01:40:29

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 05/11] bpf: Move BPF_STX reserved field check into BPF_STX verifier code



On 12/7/20 8:07 AM, Brendan Jackman wrote:
> I can't find a reason why this code is in resolve_pseudo_ldimm64;
> since I'll be modifying it in a subsequent commit, tidy it up.
>
> Signed-off-by: Brendan Jackman <[email protected]>

Acked-by: Yonghong Song <[email protected]>

2020-12-08 01:48:43

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 06/11] bpf: Add BPF_FETCH field / create atomic_fetch_add instruction



On 12/7/20 8:07 AM, Brendan Jackman wrote:
> The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
> instructions, in order to have the previous value of the
> atomically-modified memory location loaded into the src register
> after an atomic op is carried out.
>
> Suggested-by: Yonghong Song <[email protected]>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---
> arch/x86/net/bpf_jit_comp.c | 4 ++++
> include/linux/filter.h | 1 +
> include/uapi/linux/bpf.h | 3 +++
> kernel/bpf/core.c | 13 +++++++++++++
> kernel/bpf/disasm.c | 7 +++++++
> kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++---------
> tools/include/linux/filter.h | 11 +++++++++++
> tools/include/uapi/linux/bpf.h | 3 +++
> 8 files changed, 66 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
[...]

> index f345f12c1ff8..4e0100ba52c2 100644
> --- a/tools/include/linux/filter.h
> +++ b/tools/include/linux/filter.h
> @@ -173,6 +173,7 @@
> * Atomic operations:
> *
> * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
> + * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
> */
>
> #define BPF_ATOMIC64(OP, DST, SRC, OFF) \
> @@ -201,6 +202,16 @@
> .off = OFF, \
> .imm = BPF_ADD })
>
> +/* Atomic memory add with fetch, src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
> +
> +#define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_ADD | BPF_FETCH })

Not sure whether it is a good idea or not to fold this into BPF_ATOMIC
macro. At least you can define BPF_ATOMIC macro and
#define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
BPF_ATOMIC(SIZE, DST, SRC, OFF, BPF_ADD | BPF_FETCH)

to avoid too many code duplications?

> +
> /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
>
> #define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 98161e2d389f..d5389119291e 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -44,6 +44,9 @@
> #define BPF_CALL 0x80 /* function call */
> #define BPF_EXIT 0x90 /* function return */
>
> +/* atomic op type fields (stored in immediate) */
> +#define BPF_FETCH 0x01 /* fetch previous value into src reg */
> +
> /* Register numbers */
> enum {
> BPF_REG_0 = 0,
>

2020-12-08 01:48:48

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 07/11] bpf: Add instructions for atomic_[cmp]xchg



On 12/7/20 8:07 AM, Brendan Jackman wrote:
> This adds two atomic opcodes, both of which include the BPF_FETCH
> flag. XCHG without the BPF_FETCH flag would naturally encode
> atomic_set. This is not supported because it would be of limited
> value to userspace (it doesn't imply any barriers). CMPXCHG without
> BPF_FETCH would be an atomic compare-and-write. We don't have such
> an operation in the kernel so it isn't provided to BPF either.
>
> There are two significant design decisions made for the CMPXCHG
> instruction:
>
> - To solve the issue that this operation fundamentally has 3
> operands, but we only have two register fields. Therefore the
> operand we compare against (the kernel's API calls it 'old') is
> hard-coded to be R0. x86 has similar design (and A64 doesn't
> have this problem).
>
> A potential alternative might be to encode the other operand's
> register number in the immediate field.
>
> - The kernel's atomic_cmpxchg returns the old value, while the C11
> userspace APIs return a boolean indicating the comparison
> result. Which should BPF do? A64 returns the old value. x86 returns
> the old value in the hard-coded register (and also sets a
> flag). That means return-old-value is easier to JIT.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---
> arch/x86/net/bpf_jit_comp.c | 8 ++++++++
> include/linux/filter.h | 22 ++++++++++++++++++++++
> include/uapi/linux/bpf.h | 4 +++-
> kernel/bpf/core.c | 20 ++++++++++++++++++++
> kernel/bpf/disasm.c | 15 +++++++++++++++
> kernel/bpf/verifier.c | 19 +++++++++++++++++--
> tools/include/linux/filter.h | 22 ++++++++++++++++++++++
> tools/include/uapi/linux/bpf.h | 4 +++-
> 8 files changed, 110 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index eea7d8b0bb12..308241187582 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -815,6 +815,14 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
> /* src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
> EMIT2(0x0F, 0xC1);
> break;
> + case BPF_XCHG:
> + /* src_reg = atomic_xchg(dst_reg + off, src_reg); */
> + EMIT1(0x87);
> + break;
> + case BPF_CMPXCHG:
> + /* r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg); */
> + EMIT2(0x0F, 0xB1);
> + break;
> default:
> pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op);
> return -EFAULT;
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index b5258bca10d2..e1e1fc946a7c 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -265,6 +265,8 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
> *
> * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
> * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
> + * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
> + * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
> */
>
> #define BPF_ATOMIC64(OP, DST, SRC, OFF) \
> @@ -293,6 +295,26 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
> .off = OFF, \
> .imm = BPF_ADD })
>
> +/* Atomic exchange, src_reg = atomic_xchg(dst_reg + off, src_reg) */
> +
> +#define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_XCHG })
> +
> +/* Atomic compare-exchange, r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg) */
> +
> +#define BPF_ATOMIC_CMPXCHG(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_CMPXCHG })

Define BPF_ATOMIC_{XCHG, CMPXCHG} based on BPF_ATOMIC macro?

> +
> /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
>
> #define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index d5389119291e..b733af50a5b9 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -45,7 +45,9 @@
> #define BPF_EXIT 0x90 /* function return */
>
> /* atomic op type fields (stored in immediate) */
> -#define BPF_FETCH 0x01 /* fetch previous value into src reg */
> +#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
> +#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
> +#define BPF_FETCH 0x01 /* not an opcode on its own, used to build others */
>
> /* Register numbers */
> enum {
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 61e93eb7d363..28f960bc2e30 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -1630,6 +1630,16 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
> (u32) SRC,
> (atomic_t *)(unsigned long) (DST + insn->off));
> break;
> + case BPF_XCHG:
> + SRC = (u32) atomic_xchg(
> + (atomic_t *)(unsigned long) (DST + insn->off),
> + (u32) SRC);
> + break;
> + case BPF_CMPXCHG:
> + BPF_R0 = (u32) atomic_cmpxchg(
> + (atomic_t *)(unsigned long) (DST + insn->off),
> + (u32) BPF_R0, (u32) SRC);
> + break;
> default:
> goto default_label;
> }
[...]

2020-12-08 04:11:40

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations



On 12/7/20 8:07 AM, Brendan Jackman wrote:
> The prog_test that's added depends on Clang/LLVM features added by
> Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184 ).
>
> Note the use of a define called ENABLE_ATOMICS_TESTS: this is used
> to:
>
> - Avoid breaking the build for people on old versions of Clang
> - Avoid needing separate lists of test objects for no_alu32, where
> atomics are not supported even if Clang has the feature.
>
> The atomics_test.o BPF object is built unconditionally both for
> test_progs and test_progs-no_alu32. For test_progs, if Clang supports
> atomics, ENABLE_ATOMICS_TESTS is defined, so it includes the proper
> test code. Otherwise, progs and global vars are defined anyway, as
> stubs; this means that the skeleton user code still builds.
>
> The atomics_test.o userspace object is built once and used for both
> test_progs and test_progs-no_alu32. A variable called skip_tests is
> defined in the BPF object's data section, which tells the userspace
> object whether to skip the atomics test.
>
> Signed-off-by: Brendan Jackman <[email protected]>

Ack with minor comments below.

Acked-by: Yonghong Song <[email protected]>

> ---
> tools/testing/selftests/bpf/Makefile | 10 +
> .../selftests/bpf/prog_tests/atomics.c | 246 ++++++++++++++++++
> tools/testing/selftests/bpf/progs/atomics.c | 154 +++++++++++
> .../selftests/bpf/verifier/atomic_and.c | 77 ++++++
> .../selftests/bpf/verifier/atomic_cmpxchg.c | 96 +++++++
> .../selftests/bpf/verifier/atomic_fetch_add.c | 106 ++++++++
> .../selftests/bpf/verifier/atomic_or.c | 77 ++++++
> .../selftests/bpf/verifier/atomic_xchg.c | 46 ++++
> .../selftests/bpf/verifier/atomic_xor.c | 77 ++++++
> 9 files changed, 889 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c
> create mode 100644 tools/testing/selftests/bpf/progs/atomics.c
> create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c
> create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
> create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c
> create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c
> create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index ac25ba5d0d6c..13bc1d736164 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -239,6 +239,12 @@ BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) \
> -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR) \
> -I$(abspath $(OUTPUT)/../usr/include)
>
> +# BPF atomics support was added to Clang in llvm-project commit 286daafd6512
> +# (release 12.0.0).
> +BPF_ATOMICS_SUPPORTED = $(shell \
> + echo "int x = 0; int foo(void) { return __sync_val_compare_and_swap(&x, 1, 2); }" \
> + | $(CLANG) -x cpp-output -S -target bpf -mcpu=v3 - -o /dev/null && echo 1 || echo 0)

'-x c' here more intuitive?
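
i.e. (untested):

BPF_ATOMICS_SUPPORTED = $(shell \
	echo "int x = 0; int foo(void) { return __sync_val_compare_and_swap(&x, 1, 2); }" \
	| $(CLANG) -x c -S -target bpf -mcpu=v3 - -o /dev/null && echo 1 || echo 0)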

> +
> CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
> -Wno-compare-distinct-pointer-types
>
> @@ -399,11 +405,15 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \
> $(wildcard progs/btf_dump_test_case_*.c)
> TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE
> TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
> +ifeq ($(BPF_ATOMICS_SUPPORTED),1)
> + TRUNNER_BPF_CFLAGS += -DENABLE_ATOMICS_TESTS
> +endif
> TRUNNER_BPF_LDFLAGS := -mattr=+alu32
> $(eval $(call DEFINE_TEST_RUNNER,test_progs))
>
> # Define test_progs-no_alu32 test runner.
> TRUNNER_BPF_BUILD_RULE := CLANG_NOALU32_BPF_BUILD_RULE
> +TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
> TRUNNER_BPF_LDFLAGS :=
> $(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32))
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/atomics.c b/tools/testing/selftests/bpf/prog_tests/atomics.c
> new file mode 100644
> index 000000000000..c841a3abc2f7
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/atomics.c
> @@ -0,0 +1,246 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <test_progs.h>
> +
> +#include "atomics.skel.h"
> +
> +static void test_add(struct atomics *skel)
> +{
> + int err, prog_fd;
> + __u32 duration = 0, retval;
> + struct bpf_link *link;
> +
> + link = bpf_program__attach(skel->progs.add);
> + if (CHECK(IS_ERR(link), "attach(add)", "err: %ld\n", PTR_ERR(link)))
> + return;
> +
> + prog_fd = bpf_program__fd(skel->progs.add);
> + err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> + NULL, NULL, &retval, &duration);
> + if (CHECK(err || retval, "test_run add",
> + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
> + goto cleanup;
> +
> + ASSERT_EQ(skel->data->add64_value, 3, "add64_value");
> + ASSERT_EQ(skel->bss->add64_result, 1, "add64_result");
> +
> + ASSERT_EQ(skel->data->add32_value, 3, "add32_value");
> + ASSERT_EQ(skel->bss->add32_result, 1, "add32_result");
> +
> + ASSERT_EQ(skel->bss->add_stack_value_copy, 3, "add_stack_value");
> + ASSERT_EQ(skel->bss->add_stack_result, 1, "add_stack_result");
> +
> + ASSERT_EQ(skel->data->add_noreturn_value, 3, "add_noreturn_value");
> +
> +cleanup:
> + bpf_link__destroy(link);
> +}
> +
[...]
> +
> +__u64 xchg64_value = 1;
> +__u64 xchg64_result = 0;
> +__u32 xchg32_value = 1;
> +__u32 xchg32_result = 0;
> +
> +SEC("fentry/bpf_fentry_test1")
> +int BPF_PROG(xchg, int a)
> +{
> +#ifdef ENABLE_ATOMICS_TESTS
> + __u64 val64 = 2;
> + __u32 val32 = 2;
> +
> + __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
> + __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);

Interesting to see this also works. I guess we probably won't advertise
this, right? Currently for LLVM, the memory ordering parameter is ignored.

> +#endif
> +
> + return 0;
> +}
[...]

2020-12-08 04:11:51

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 11/11] bpf: Document new atomic instructions



On 12/7/20 8:07 AM, Brendan Jackman wrote:
> Document new atomic instructions.
>
> Signed-off-by: Brendan Jackman <[email protected]>

Ack with minor comments below.

Acked-by: Yonghong Song <[email protected]>

> ---
> Documentation/networking/filter.rst | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
> index 1583d59d806d..26d508a5e038 100644
> --- a/Documentation/networking/filter.rst
> +++ b/Documentation/networking/filter.rst
> @@ -1053,6 +1053,32 @@ encoding.
> .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
> .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
>
> +The basic atomic operations supported (from architecture v4 onwards) are:

No "v4" any more. Just say
The basic atomic operations supported are:

> +
> + BPF_ADD
> + BPF_AND
> + BPF_OR
> + BPF_XOR
> +
> +Each has semantics equivalent to the ``BPF_ADD`` example above: the
> +memory location addressed by ``dst_reg + off`` is atomically modified, with
> +``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
> +immediate, then these operations also overwrite ``src_reg`` with the
> +value that was in memory before it was modified.

For 4-byte operations, except BPF_ADD, alu32 mode is required.
alu32 is implied with -mcpu=v3.

> +
> +The more special operations are:
> +
> + BPF_XCHG
> +
> +This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
> +off``.
> +
> + BPF_CMPXCHG
> +
> +This atomically compares the value addressed by ``dst_reg + off`` with
> +``R0``. If they match, it is replaced with ``src_reg``. The value that was there
> +before is loaded back to ``R0``.
> +
> Note that 1 and 2 byte atomic operations are not supported.
>
> You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to
>

2020-12-08 04:41:07

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 09/11] bpf: Add bitwise atomic instructions



On 12/7/20 8:07 AM, Brendan Jackman wrote:
> This adds instructions for
>
> atomic[64]_[fetch_]and
> atomic[64]_[fetch_]or
> atomic[64]_[fetch_]xor
>
> All these operations are isomorphic enough to implement with the same
> verifier, interpreter, and x86 JIT code, hence a single commit.
>
> The main interesting thing here is that x86 doesn't directly support
> the fetch_ version of these operations, so we need to generate a CMPXCHG
> loop in the JIT. This requires the use of two temporary registers;
> IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for this purpose.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---
> arch/x86/net/bpf_jit_comp.c | 50 ++++++++++++++++++++++++++-
> include/linux/filter.h | 66 ++++++++++++++++++++++++++++++++++++
> kernel/bpf/core.c | 3 ++
> kernel/bpf/disasm.c | 21 +++++++++---
> kernel/bpf/verifier.c | 6 ++++
> tools/include/linux/filter.h | 66 ++++++++++++++++++++++++++++++++++++
> 6 files changed, 207 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 308241187582..1d4d50199293 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -808,6 +808,10 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
> /* emit opcode */
> switch (atomic_op) {
> case BPF_ADD:
> + case BPF_SUB:
> + case BPF_AND:
> + case BPF_OR:
> + case BPF_XOR:
> /* lock *(u32/u64*)(dst_reg + off) <op>= src_reg */
> EMIT1(simple_alu_opcodes[atomic_op]);
> break;
[...]
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index e1e1fc946a7c..e100c71555a4 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -264,7 +264,13 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
> * Atomic operations:
> *
> * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
> + * BPF_AND *(uint *) (dst_reg + off16) &= src_reg
> + * BPF_OR *(uint *) (dst_reg + off16) |= src_reg
> + * BPF_XOR *(uint *) (dst_reg + off16) ^= src_reg
> * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
> + * BPF_AND | BPF_FETCH src_reg = atomic_fetch_and(dst_reg + off16, src_reg);
> + * BPF_OR | BPF_FETCH src_reg = atomic_fetch_or(dst_reg + off16, src_reg);
> + * BPF_XOR | BPF_FETCH src_reg = atomic_fetch_xor(dst_reg + off16, src_reg);
> * BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
> * BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
> */
> @@ -295,6 +301,66 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
> .off = OFF, \
> .imm = BPF_ADD })
>
> +/* Atomic memory and, *(uint *)(dst_reg + off16) &= src_reg */
> +
> +#define BPF_ATOMIC_AND(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_AND })
> +
> +/* Atomic memory and with fetch, src_reg = atomic_fetch_and(dst_reg + off, src_reg); */
> +
> +#define BPF_ATOMIC_FETCH_AND(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_AND | BPF_FETCH })
> +
> +/* Atomic memory or, *(uint *)(dst_reg + off16) |= src_reg */
> +
> +#define BPF_ATOMIC_OR(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_OR })
> +
> +/* Atomic memory or with fetch, src_reg = atomic_fetch_or(dst_reg + off, src_reg); */
> +
> +#define BPF_ATOMIC_FETCH_OR(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_OR | BPF_FETCH })
> +
> +/* Atomic memory xor, *(uint *)(dst_reg + off16) ^= src_reg */
> +
> +#define BPF_ATOMIC_XOR(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_XOR })
> +
> +/* Atomic memory xor with fetch, src_reg = atomic_fetch_xor(dst_reg + off, src_reg); */
> +
> +#define BPF_ATOMIC_FETCH_XOR(SIZE, DST, SRC, OFF) \
> + ((struct bpf_insn) { \
> + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> + .dst_reg = DST, \
> + .src_reg = SRC, \
> + .off = OFF, \
> + .imm = BPF_XOR | BPF_FETCH })

Use BPF_ATOMIC macro to define all the above macros?
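
e.g. (sketch):

#define BPF_ATOMIC_FETCH_AND(SIZE, DST, SRC, OFF) \
	BPF_ATOMIC(SIZE, DST, SRC, OFF, BPF_AND | BPF_FETCH)

and similarly for the OR/XOR variants.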

> +
> /* Atomic exchange, src_reg = atomic_xchg(dst_reg + off, src_reg) */
>
> #define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF) \
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 1d9e5dcde03a..4b78ff89ec91 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -1642,6 +1642,9 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
> STX_ATOMIC_W:
> switch (IMM) {
> ATOMIC_ALU_OP(BPF_ADD, add)
> + ATOMIC_ALU_OP(BPF_AND, and)
> + ATOMIC_ALU_OP(BPF_OR, or)
> + ATOMIC_ALU_OP(BPF_XOR, xor)
> #undef ATOMIC_ALU_OP
>
> case BPF_XCHG:
[...]

2020-12-08 05:34:45

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 06/11] bpf: Add BPF_FETCH field / create atomic_fetch_add instruction

Brendan Jackman wrote:
> The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
> instructions, in order to have the previous value of the
> atomically-modified memory location loaded into the src register
> after an atomic op is carried out.
>
> Suggested-by: Yonghong Song <[email protected]>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---

I like Yonghong suggestion

#define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
BPF_ATOMIC(SIZE, DST, SRC, OFF, BPF_ADD | BPF_FETCH)

otherwise LGTM. One observation to consider below.

Acked-by: John Fastabend <[email protected]>

> arch/x86/net/bpf_jit_comp.c | 4 ++++
> include/linux/filter.h | 1 +
> include/uapi/linux/bpf.h | 3 +++
> kernel/bpf/core.c | 13 +++++++++++++
> kernel/bpf/disasm.c | 7 +++++++
> kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++---------
> tools/include/linux/filter.h | 11 +++++++++++
> tools/include/uapi/linux/bpf.h | 3 +++
> 8 files changed, 66 insertions(+), 9 deletions(-)

[...]

> @@ -3652,8 +3656,20 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> return err;
>
> /* check whether we can write into the same memory */
> - return check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
> - BPF_SIZE(insn->code), BPF_WRITE, -1, true);
> + err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
> + BPF_SIZE(insn->code), BPF_WRITE, -1, true);
> + if (err)
> + return err;
> +
> + if (!(insn->imm & BPF_FETCH))
> + return 0;
> +
> + /* check and record load of old value into src reg */
> + err = check_reg_arg(env, insn->src_reg, DST_OP);

This will mark the reg unknown. I think this is fine here. Might be nice
to carry bounds through though if possible

> + if (err)
> + return err;
> +
> + return 0;
> }
>

2020-12-08 06:22:03

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 05/11] bpf: Move BPF_STX reserved field check into BPF_STX verifier code

Brendan Jackman wrote:
> I can't find a reason why this code is in resolve_pseudo_ldimm64;
> since I'll be modifying it in a subsequent commit, tidy it up.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---
> kernel/bpf/verifier.c | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)

Acked-by: John Fastabend <[email protected]>

2020-12-08 06:42:58

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 07/11] bpf: Add instructions for atomic_[cmp]xchg

Brendan Jackman wrote:
> This adds two atomic opcodes, both of which include the BPF_FETCH
> flag. XCHG without the BPF_FETCH flag would naturally encode
> atomic_set. This is not supported because it would be of limited
> value to userspace (it doesn't imply any barriers). CMPXCHG without
> BPF_FETCH would be an atomic compare-and-write. We don't have such
> an operation in the kernel so it isn't provided to BPF either.
>
> There are two significant design decisions made for the CMPXCHG
> instruction:
>
> - To solve the issue that this operation fundamentally has 3
> operands, but we only have two register fields. Therefore the
> operand we compare against (the kernel's API calls it 'old') is
> hard-coded to be R0. x86 has similar design (and A64 doesn't
> have this problem).
>
> A potential alternative might be to encode the other operand's
> register number in the immediate field.
>
> - The kernel's atomic_cmpxchg returns the old value, while the C11
> userspace APIs return a boolean indicating the comparison
> result. Which should BPF do? A64 returns the old value. x86 returns
> the old value in the hard-coded register (and also sets a
> flag). That means return-old-value is easier to JIT.

Just a nit as it looks like perhaps we get one more spin here. Would
be nice to be explicit about what the code does here. The above reads
like it could go either way. Just an extra line "So we use ..." would
be enough.

>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---

One question below.

> arch/x86/net/bpf_jit_comp.c | 8 ++++++++
> include/linux/filter.h | 22 ++++++++++++++++++++++
> include/uapi/linux/bpf.h | 4 +++-
> kernel/bpf/core.c | 20 ++++++++++++++++++++
> kernel/bpf/disasm.c | 15 +++++++++++++++
> kernel/bpf/verifier.c | 19 +++++++++++++++++--
> tools/include/linux/filter.h | 22 ++++++++++++++++++++++
> tools/include/uapi/linux/bpf.h | 4 +++-
> 8 files changed, 110 insertions(+), 4 deletions(-)
>

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f8c4e809297d..f5f4460b3e4e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3608,11 +3608,14 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>
> static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
> {
> + int load_reg;
> int err;
>
> switch (insn->imm) {
> case BPF_ADD:
> case BPF_ADD | BPF_FETCH:
> + case BPF_XCHG:
> + case BPF_CMPXCHG:
> break;
> default:
> verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
> @@ -3634,6 +3637,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> if (err)
> return err;
>
> + if (insn->imm == BPF_CMPXCHG) {
> + /* Check comparison of R0 with memory location */
> + err = check_reg_arg(env, BPF_REG_0, SRC_OP);
> + if (err)
> + return err;
> + }
> +

I need to think a bit more about it, but do we need to update is_reg64()
at all for these?

> if (is_pointer_value(env, insn->src_reg)) {
> verbose(env, "R%d leaks addr into mem\n", insn->src_reg);
> return -EACCES;
> @@ -3664,8 +3674,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> if (!(insn->imm & BPF_FETCH))
> return 0;
>
> - /* check and record load of old value into src reg */
> - err = check_reg_arg(env, insn->src_reg, DST_OP);
> + if (insn->imm == BPF_CMPXCHG)
> + load_reg = BPF_REG_0;
> + else
> + load_reg = insn->src_reg;
> +
> + /* check and record load of old value */
> + err = check_reg_arg(env, load_reg, DST_OP);
> if (err)
> return err;

2020-12-08 06:47:15

by John Fastabend

[permalink] [raw]
Subject: RE: [PATCH bpf-next v4 07/11] bpf: Add instructions for atomic_[cmp]xchg

Brendan Jackman wrote:
> This adds two atomic opcodes, both of which include the BPF_FETCH
> flag. XCHG without the BPF_FETCH flag would naturally encode
> atomic_set. This is not supported because it would be of limited
> value to userspace (it doesn't imply any barriers). CMPXCHG without
> BPF_FETCH would be an atomic compare-and-write. We don't have such
> an operation in the kernel so it isn't provided to BPF either.
>
> There are two significant design decisions made for the CMPXCHG
> instruction:
>
> - To solve the issue that this operation fundamentally has 3
> operands, but we only have two register fields. Therefore the
> operand we compare against (the kernel's API calls it 'old') is
> hard-coded to be R0. x86 has similar design (and A64 doesn't
> have this problem).
>
> A potential alternative might be to encode the other operand's
> register number in the immediate field.
>
> - The kernel's atomic_cmpxchg returns the old value, while the C11
> userspace APIs return a boolean indicating the comparison
> result. Which should BPF do? A64 returns the old value. x86 returns
> the old value in the hard-coded register (and also sets a
> flag). That means return-old-value is easier to JIT.
>
> Signed-off-by: Brendan Jackman <[email protected]>
> ---

Sorry if this is a dup; my client crashed while I sent the previous
version and I don't see it on the list.

> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3608,11 +3608,14 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>
> static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
> {
> + int load_reg;
> int err;
>
> switch (insn->imm) {
> case BPF_ADD:
> case BPF_ADD | BPF_FETCH:
> + case BPF_XCHG:
> + case BPF_CMPXCHG:
> break;
> default:
> verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
> @@ -3634,6 +3637,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> if (err)
> return err;
>
> + if (insn->imm == BPF_CMPXCHG) {
> + /* Check comparison of R0 with memory location */
> + err = check_reg_arg(env, BPF_REG_0, SRC_OP);
> + if (err)
> + return err;
> + }
> +

Need to think a bit more on this, but do we need to update is_reg64() here
as well?

> if (is_pointer_value(env, insn->src_reg)) {
> verbose(env, "R%d leaks addr into mem\n", insn->src_reg);
> return -EACCES;
> @@ -3664,8 +3674,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> if (!(insn->imm & BPF_FETCH))
> return 0;
>
> - /* check and record load of old value into src reg */
> - err = check_reg_arg(env, insn->src_reg, DST_OP);
> + if (insn->imm == BPF_CMPXCHG)
> + load_reg = BPF_REG_0;
> + else
> + load_reg = insn->src_reg;
> +
> + /* check and record load of old value */
> + err = check_reg_arg(env, load_reg, DST_OP);
> if (err)
> return err;
>

Thanks,
John

2020-12-08 09:32:33

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 04/11] bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

Hi John, thanks a lot for the reviews!

On Mon, Dec 07, 2020 at 01:56:53PM -0800, John Fastabend wrote:
> Brendan Jackman wrote:
> > A subsequent patch will add additional atomic operations. These new
> > operations will use the same opcode field as the existing XADD, with
> > the immediate discriminating different operations.
> >
> > In preparation, rename the instruction mode BPF_ATOMIC and start
> > calling the zero immediate BPF_ADD.
> >
> > This is possible (doesn't break existing valid BPF progs) because the
> > immediate field is currently reserved MBZ and BPF_ADD is zero.
> >
> > All uses are removed from the tree but the BPF_XADD definition is
> > kept around to avoid breaking builds for people including kernel
> > headers.
> >
> > Signed-off-by: Brendan Jackman <[email protected]>
> > ---
> > Documentation/networking/filter.rst | 30 ++++++++-----
> > arch/arm/net/bpf_jit_32.c | 7 ++-
> > arch/arm64/net/bpf_jit_comp.c | 16 +++++--
> > arch/mips/net/ebpf_jit.c | 11 +++--
> > arch/powerpc/net/bpf_jit_comp64.c | 25 ++++++++---
> > arch/riscv/net/bpf_jit_comp32.c | 20 +++++++--
> > arch/riscv/net/bpf_jit_comp64.c | 16 +++++--
> > arch/s390/net/bpf_jit_comp.c | 27 ++++++-----
> > arch/sparc/net/bpf_jit_comp_64.c | 17 +++++--
> > arch/x86/net/bpf_jit_comp.c | 45 ++++++++++++++-----
> > arch/x86/net/bpf_jit_comp32.c | 6 +--
> > drivers/net/ethernet/netronome/nfp/bpf/jit.c | 14 ++++--
> > drivers/net/ethernet/netronome/nfp/bpf/main.h | 4 +-
> > .../net/ethernet/netronome/nfp/bpf/verifier.c | 15 ++++---
> > include/linux/filter.h | 29 ++++++++++--
> > include/uapi/linux/bpf.h | 5 ++-
> > kernel/bpf/core.c | 31 +++++++++----
> > kernel/bpf/disasm.c | 6 ++-
> > kernel/bpf/verifier.c | 24 +++++-----
> > lib/test_bpf.c | 14 +++---
> > samples/bpf/bpf_insn.h | 4 +-
> > samples/bpf/cookie_uid_helper_example.c | 6 +--
> > samples/bpf/sock_example.c | 2 +-
> > samples/bpf/test_cgrp2_attach.c | 5 ++-
> > tools/include/linux/filter.h | 28 ++++++++++--
> > tools/include/uapi/linux/bpf.h | 5 ++-
> > .../bpf/prog_tests/cgroup_attach_multi.c | 4 +-
> > .../selftests/bpf/test_cgroup_storage.c | 2 +-
> > tools/testing/selftests/bpf/verifier/ctx.c | 7 ++-
> > .../bpf/verifier/direct_packet_access.c | 4 +-
> > .../testing/selftests/bpf/verifier/leak_ptr.c | 10 ++---
> > .../selftests/bpf/verifier/meta_access.c | 4 +-
> > tools/testing/selftests/bpf/verifier/unpriv.c | 3 +-
> > .../bpf/verifier/value_illegal_alu.c | 2 +-
> > tools/testing/selftests/bpf/verifier/xadd.c | 18 ++++----
> > 35 files changed, 317 insertions(+), 149 deletions(-)
> >
>
> [...]
>
> > +++ a/arch/mips/net/ebpf_jit.c
>
> [...]
>
> > - if (BPF_MODE(insn->code) == BPF_XADD) {
> > + if (BPF_MODE(insn->code) == BPF_ATOMIC) {
> > + if (insn->imm != BPF_ADD) {
> > + pr_err("ATOMIC OP %02x NOT HANDLED\n", insn->imm);
> > + return -EINVAL;
> > + }
> > +
> > /*
> [...]
> > +++ b/arch/powerpc/net/bpf_jit_comp64.c
>
> > - case BPF_STX | BPF_XADD | BPF_W:
> > + case BPF_STX | BPF_ATOMIC | BPF_W:
> > + if (insn->imm != BPF_ADD) {
> > + pr_err_ratelimited(
> > + "eBPF filter atomic op code %02x (@%d) unsupported\n",
> > + code, i);
> > + return -ENOTSUPP;
> > + }
> [...]
> > @@ -699,8 +707,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
> > - case BPF_STX | BPF_XADD | BPF_DW:
> > + case BPF_STX | BPF_ATOMIC | BPF_DW:
> > + if (insn->imm != BPF_ADD) {
> > + pr_err_ratelimited(
> > + "eBPF filter atomic op code %02x (@%d) unsupported\n",
> > + code, i);
> > + return -ENOTSUPP;
> > + }
> [...]
> > + case BPF_STX | BPF_ATOMIC | BPF_W:
> > + if (insn->imm != BPF_ADD) {
> > + pr_info_once(
> > + "bpf-jit: not supported: atomic operation %02x ***\n",
> > + insn->imm);
> > + return -EFAULT;
> > + }
> [...]
> > + case BPF_STX | BPF_ATOMIC | BPF_W:
> > + case BPF_STX | BPF_ATOMIC | BPF_DW:
> > + if (insn->imm != BPF_ADD) {
> > + pr_err("bpf-jit: not supported: atomic operation %02x ***\n",
> > + insn->imm);
> > + return -EINVAL;
> > + }
>
> Can we standardize the error across jits and the error return code? It seems
> odd that we use pr_err, pr_info_once, pr_err_ratelimited and then return
> ENOTSUPP, EFAULT or EINVAL.

That would be a noble cause but I don't think it makes sense in this
patchset: they are already inconsistent, so here I've gone for intra-JIT
consistency over inter-JIT consistency.

I think it would be more annoying, for example, if the s390 JIT returned
-EOPNOTSUPP for a bad atomic but -1 for other unsupported ops, than it
is already that the s390 JIT returns -1 where the MIPS returns -EINVAL.

> granted the error codes might not propagate all the way out at the moment but
> still shouldn't hurt.
>
> > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > index 0a4182792876..f973e2ead197 100644
> > --- a/arch/s390/net/bpf_jit_comp.c
> > +++ b/arch/s390/net/bpf_jit_comp.c
> > @@ -1205,18 +1205,23 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
>
> For example this will return -1 regardless of error from insn->imm != BPF_ADD.
> [...]
> > + case BPF_STX | BPF_ATOMIC | BPF_DW:
> > + case BPF_STX | BPF_ATOMIC | BPF_W:
> > + if (insn->imm != BPF_ADD) {
> > + pr_err("Unknown atomic operation %02x\n", insn->imm);
> > + return -1;
> > + }
> > +
> [...]
>
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -259,15 +259,38 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
> > .off = OFF, \
> > .imm = 0 })
> >
> > -/* Atomic memory add, *(uint *)(dst_reg + off16) += src_reg */
> > +
> > +/*
> > + * Atomic operations:
> > + *
> > + * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
> > + */
> > +
> > +#define BPF_ATOMIC64(OP, DST, SRC, OFF) \
> > + ((struct bpf_insn) { \
> > + .code = BPF_STX | BPF_DW | BPF_ATOMIC, \
> > + .dst_reg = DST, \
> > + .src_reg = SRC, \
> > + .off = OFF, \
> > + .imm = OP })
> > +
> > +#define BPF_ATOMIC32(OP, DST, SRC, OFF) \
> > + ((struct bpf_insn) { \
> > + .code = BPF_STX | BPF_W | BPF_ATOMIC, \
> > + .dst_reg = DST, \
> > + .src_reg = SRC, \
> > + .off = OFF, \
> > + .imm = OP })
> > +
> > +/* Legacy equivalent of BPF_ATOMIC{64,32}(BPF_ADD, ...) */
>
> Not sure I care too much. Does seem more natural to follow
> the pattern below and use,
>
> BPF_ATOMIC(OP, SIZE, DST, SRC, OFF)
>
> >
> > #define BPF_STX_XADD(SIZE, DST, SRC, OFF) \
> > ((struct bpf_insn) { \
> > - .code = BPF_STX | BPF_SIZE(SIZE) | BPF_XADD, \
> > + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> > .dst_reg = DST, \
> > .src_reg = SRC, \
> > .off = OFF, \
> > - .imm = 0 })
> > + .imm = BPF_ADD })
> >
> > /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
> >
>
> [...]
>
> Otherwise LGTM, I'll try to get the remaining patches reviewed tonight
> I need to jump onto something else this afternoon. Thanks!

2020-12-08 10:03:33

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 06/11] bpf: Add BPF_FETCH field / create atomic_fetch_add instruction

On Mon, Dec 07, 2020 at 09:31:40PM -0800, John Fastabend wrote:
> Brendan Jackman wrote:
> > The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
> > instructions, in order to have the previous value of the
> > atomically-modified memory location loaded into the src register
> > after an atomic op is carried out.
> >
> > Suggested-by: Yonghong Song <[email protected]>
> > Signed-off-by: Brendan Jackman <[email protected]>
> > ---
>
> I like Yonghong suggestion
>
> #define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
> BPF_ATOMIC(SIZE, DST, SRC, OFF, BPF_ADD | BPF_FETCH)
>
> otherwise LGTM. One observation to consider below.
>
> Acked-by: John Fastabend <[email protected]>
>
> > arch/x86/net/bpf_jit_comp.c | 4 ++++
> > include/linux/filter.h | 1 +
> > include/uapi/linux/bpf.h | 3 +++
> > kernel/bpf/core.c | 13 +++++++++++++
> > kernel/bpf/disasm.c | 7 +++++++
> > kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++---------
> > tools/include/linux/filter.h | 11 +++++++++++
> > tools/include/uapi/linux/bpf.h | 3 +++
> > 8 files changed, 66 insertions(+), 9 deletions(-)
>
> [...]
>
> > @@ -3652,8 +3656,20 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> > return err;
> >
> > /* check whether we can write into the same memory */
> > - return check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
> > - BPF_SIZE(insn->code), BPF_WRITE, -1, true);
> > + err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
> > + BPF_SIZE(insn->code), BPF_WRITE, -1, true);
> > + if (err)
> > + return err;
> > +
> > + if (!(insn->imm & BPF_FETCH))
> > + return 0;
> > +
> > + /* check and record load of old value into src reg */
> > + err = check_reg_arg(env, insn->src_reg, DST_OP);
>
> This will mark the reg unknown. I think this is fine here. Might be nice
> to carry bounds through though if possible

Ah, I hadn't thought of this. I think if I move this check_reg_arg to be
before the first check_mem_access, and then (when BPF_FETCH) set the
val_regno arg to load_reg, then the bounds from memory would get
propagated back to the register:

	if (insn->imm & BPF_FETCH) {
		if (insn->imm == BPF_CMPXCHG)
			load_reg = BPF_REG_0;
		else
			load_reg = insn->src_reg;
		err = check_reg_arg(env, load_reg, DST_OP);
		if (err)
			return err;
	} else {
		load_reg = -1;
	}

	/* check whether we can read the memory */
	err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
			       BPF_SIZE(insn->code), BPF_READ,
			       load_reg, /* <-- */
			       true);

Is that the kind of thing you had in mind?

> > + if (err)
> > + return err;
> > +
> > + return 0;
> > }
> >

2020-12-08 12:59:53

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations

On Mon, Dec 07, 2020 at 07:18:57PM -0800, Yonghong Song wrote:
>
>
> On 12/7/20 8:07 AM, Brendan Jackman wrote:
> > The prog_test that's added depends on Clang/LLVM features added by
> > Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184 ).
> >
> > Note the use of a define called ENABLE_ATOMICS_TESTS: this is used
> > to:
> >
> > - Avoid breaking the build for people on old versions of Clang
> > - Avoid needing separate lists of test objects for no_alu32, where
> > atomics are not supported even if Clang has the feature.
> >
> > The atomics_test.o BPF object is built unconditionally both for
> > test_progs and test_progs-no_alu32. For test_progs, if Clang supports
> > atomics, ENABLE_ATOMICS_TESTS is defined, so it includes the proper
> > test code. Otherwise, progs and global vars are defined anyway, as
> > stubs; this means that the skeleton user code still builds.
> >
> > The atomics_test.o userspace object is built once and used for both
> > test_progs and test_progs-no_alu32. A variable called skip_tests is
> > defined in the BPF object's data section, which tells the userspace
> > object whether to skip the atomics test.
> >
> > Signed-off-by: Brendan Jackman <[email protected]>
>
> Ack with minor comments below.
>
> Acked-by: Yonghong Song <[email protected]>
>
> > ---
> > tools/testing/selftests/bpf/Makefile | 10 +
> > .../selftests/bpf/prog_tests/atomics.c | 246 ++++++++++++++++++
> > tools/testing/selftests/bpf/progs/atomics.c | 154 +++++++++++
> > .../selftests/bpf/verifier/atomic_and.c | 77 ++++++
> > .../selftests/bpf/verifier/atomic_cmpxchg.c | 96 +++++++
> > .../selftests/bpf/verifier/atomic_fetch_add.c | 106 ++++++++
> > .../selftests/bpf/verifier/atomic_or.c | 77 ++++++
> > .../selftests/bpf/verifier/atomic_xchg.c | 46 ++++
> > .../selftests/bpf/verifier/atomic_xor.c | 77 ++++++
> > 9 files changed, 889 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics.c
> > create mode 100644 tools/testing/selftests/bpf/progs/atomics.c
> > create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c
> > create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
> > create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
> > create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c
> > create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c
> > create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c
> >
> > diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> > index ac25ba5d0d6c..13bc1d736164 100644
> > --- a/tools/testing/selftests/bpf/Makefile
> > +++ b/tools/testing/selftests/bpf/Makefile
> > @@ -239,6 +239,12 @@ BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) \
> > -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR) \
> > -I$(abspath $(OUTPUT)/../usr/include)
> > +# BPF atomics support was added to Clang in llvm-project commit 286daafd6512
> > +# (release 12.0.0).
> > +BPF_ATOMICS_SUPPORTED = $(shell \
> > + echo "int x = 0; int foo(void) { return __sync_val_compare_and_swap(&x, 1, 2); }" \
> > + | $(CLANG) -x cpp-output -S -target bpf -mcpu=v3 - -o /dev/null && echo 1 || echo 0)
>
> '-x c' here more intuitive?
>
> > +
> > CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
> > -Wno-compare-distinct-pointer-types
> > @@ -399,11 +405,15 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \
> > $(wildcard progs/btf_dump_test_case_*.c)
> > TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE
> > TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
> > +ifeq ($(BPF_ATOMICS_SUPPORTED),1)
> > + TRUNNER_BPF_CFLAGS += -DENABLE_ATOMICS_TESTS
> > +endif
> > TRUNNER_BPF_LDFLAGS := -mattr=+alu32
> > $(eval $(call DEFINE_TEST_RUNNER,test_progs))
> > # Define test_progs-no_alu32 test runner.
> > TRUNNER_BPF_BUILD_RULE := CLANG_NOALU32_BPF_BUILD_RULE
> > +TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
> > TRUNNER_BPF_LDFLAGS :=
> > $(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32))
> > diff --git a/tools/testing/selftests/bpf/prog_tests/atomics.c b/tools/testing/selftests/bpf/prog_tests/atomics.c
> > new file mode 100644
> > index 000000000000..c841a3abc2f7
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/atomics.c
> > @@ -0,0 +1,246 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <test_progs.h>
> > +
> > +#include "atomics.skel.h"
> > +
> > +static void test_add(struct atomics *skel)
> > +{
> > + int err, prog_fd;
> > + __u32 duration = 0, retval;
> > + struct bpf_link *link;
> > +
> > + link = bpf_program__attach(skel->progs.add);
> > + if (CHECK(IS_ERR(link), "attach(add)", "err: %ld\n", PTR_ERR(link)))
> > + return;
> > +
> > + prog_fd = bpf_program__fd(skel->progs.add);
> > + err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> > + NULL, NULL, &retval, &duration);
> > + if (CHECK(err || retval, "test_run add",
> > + "err %d errno %d retval %d duration %d\n", err, errno, retval, duration))
> > + goto cleanup;
> > +
> > + ASSERT_EQ(skel->data->add64_value, 3, "add64_value");
> > + ASSERT_EQ(skel->bss->add64_result, 1, "add64_result");
> > +
> > + ASSERT_EQ(skel->data->add32_value, 3, "add32_value");
> > + ASSERT_EQ(skel->bss->add32_result, 1, "add32_result");
> > +
> > + ASSERT_EQ(skel->bss->add_stack_value_copy, 3, "add_stack_value");
> > + ASSERT_EQ(skel->bss->add_stack_result, 1, "add_stack_result");
> > +
> > + ASSERT_EQ(skel->data->add_noreturn_value, 3, "add_noreturn_value");
> > +
> > +cleanup:
> > + bpf_link__destroy(link);
> > +}
> > +
> [...]
> > +
> > +__u64 xchg64_value = 1;
> > +__u64 xchg64_result = 0;
> > +__u32 xchg32_value = 1;
> > +__u32 xchg32_result = 0;
> > +
> > +SEC("fentry/bpf_fentry_test1")
> > +int BPF_PROG(xchg, int a)
> > +{
> > +#ifdef ENABLE_ATOMICS_TESTS
> > + __u64 val64 = 2;
> > + __u32 val32 = 2;
> > +
> > + __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
> > + __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);
>
> Interesting to see this also works. I guess we probably won't advertise
> this, right? Currently for LLVM, the memory ordering parameter is ignored.

Well IIUC this specific case is fine: the ordering that you get with
BPF_[CMP]XCHG (via kernel atomic_[cmpxchg]) is sequential consistency,
and it's fine to provide a stronger ordering than the one requested. I
should change it to say __ATOMIC_SEQ_CST to avoid confusing readers,
though.

(I wrote it this way because I didn't see a __sync* function for
unconditional atomic exchange, and I didn't see an __atomic* function
where you don't need to specify the ordering).

However... this led me to double-check the semantics and realise that we
do have a problem with ordering: The kernel's atomic_{add,and,or,xor} do
not imply memory barriers and therefore neither do the corresponding BPF
instructions. That means Clang can compile this:

(void)__atomic_fetch_add(&val, 1, __ATOMIC_SEQ_CST)

to a {.code = (BPF_STX | BPF_DW | BPF_ATOMIC), .imm = BPF_ADD},
which is implemented with atomic_add, which doesn't actually satisfy
__ATOMIC_SEQ_CST.

In fact... I think this is a pre-existing issue with BPF_XADD.

If all I've written here is correct, the fix is to use
(void)atomic_fetch_add etc (these imply barriers) even when BPF_FETCH is
not set. And that change ought to be backported to fix BPF_XADD.
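
To make the distinction concrete, here's a minimal userspace sketch of
the two flavours, using only the GCC/Clang builtins (this is just an
illustration of the semantics, not the kernel patch itself):

	#include <stdint.h>

	static uint64_t counter;

	void add_with_barrier(uint64_t v)
	{
		/* Fetch-style builtin: full barrier implied, result
		 * discarded. This mirrors the proposed fix of calling
		 * atomic_fetch_add() in the kernel and ignoring the
		 * return value.
		 */
		(void)__sync_fetch_and_add(&counter, v);
	}

	void add_relaxed(uint64_t v)
	{
		/* Relaxed atomic add: atomicity only, no ordering
		 * guarantee beyond that. */
		__atomic_fetch_add(&counter, v, __ATOMIC_RELAXED);
	}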

2020-12-08 16:42:41

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations



On 12/8/20 4:41 AM, Brendan Jackman wrote:
> On Mon, Dec 07, 2020 at 07:18:57PM -0800, Yonghong Song wrote:
>>
>>
>> On 12/7/20 8:07 AM, Brendan Jackman wrote:
>>> [...]
>>> +__u64 xchg64_value = 1;
>>> +__u64 xchg64_result = 0;
>>> +__u32 xchg32_value = 1;
>>> +__u32 xchg32_result = 0;
>>> +
>>> +SEC("fentry/bpf_fentry_test1")
>>> +int BPF_PROG(xchg, int a)
>>> +{
>>> +#ifdef ENABLE_ATOMICS_TESTS
>>> + __u64 val64 = 2;
>>> + __u32 val32 = 2;
>>> +
>>> + __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
>>> + __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);
>>
>> Interesting to see this also works. I guess we probably won't advertise
>> this, right? Currently for LLVM, the memory ordering parameter is ignored.
>
> Well IIUC this specific case is fine: the ordering that you get with
> BPF_[CMP]XCHG (via kernel atomic_[cmpxchg]) is sequential consistency,
> and it's fine to provide a stronger ordering than the one requested. I
> should change it to say __ATOMIC_SEQ_CST to avoid confusing readers,
> though.
>
> (I wrote it this way because I didn't see a __sync* function for
> unconditional atomic exchange, and I didn't see an __atomic* function
> where you don't need to specify the ordering).

For the above code,
__atomic_exchange(&xchg64_value, &val64, &xchg64_result,
__ATOMIC_RELAXED);
It does an atomic exchange between &xchg64_value and &val64, and
stores the old value of xchg64_value to &xchg64_result. So it is
equivalent to
xchg64_result = __sync_lock_test_and_set(&xchg64_value, val64);

So I think this test case can be dropped.

>
> However... this led me to double-check the semantics and realise that we
> do have a problem with ordering: The kernel's atomic_{add,and,or,xor} do
> not imply memory barriers and therefore neither do the corresponding BPF
> instructions. That means Clang can compile this:
>
> (void)__atomic_fetch_add(&val, 1, __ATOMIC_SEQ_CST)
>
> to a {.code = (BPF_STX | BPF_DW | BPF_ATOMIC), .imm = BPF_ADD},
> which is implemented with atomic_add, which doesn't actually satisfy
> __ATOMIC_SEQ_CST.

This is the main reason in all my llvm selftests I did not use
__atomic_* intrinsics because we cannot handle *different* memory
ordering properly.

>
> In fact... I think this is a pre-existing issue with BPF_XADD.
>
> If all I've written here is correct, the fix is to use
> (void)atomic_fetch_add etc (these imply barriers) even when BPF_FETCH is
> not set. And that change ought to be backported to fix BPF_XADD.

We cannot change BPF_XADD behavior. If we change BPF_XADD to use
atomic_fetch_add, then old code compiled with llvm12 will suddenly
require the latest kernel, which will break userland very badly.

2020-12-08 17:04:09

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations

On Tue, Dec 08, 2020 at 08:38:04AM -0800, Yonghong Song wrote:
>
>
> On 12/8/20 4:41 AM, Brendan Jackman wrote:
> > On Mon, Dec 07, 2020 at 07:18:57PM -0800, Yonghong Song wrote:
> > >
> > >
> > > On 12/7/20 8:07 AM, Brendan Jackman wrote:
> > > > [...]
> > > > +
> > > > +__u64 xchg64_value = 1;
> > > > +__u64 xchg64_result = 0;
> > > > +__u32 xchg32_value = 1;
> > > > +__u32 xchg32_result = 0;
> > > > +
> > > > +SEC("fentry/bpf_fentry_test1")
> > > > +int BPF_PROG(xchg, int a)
> > > > +{
> > > > +#ifdef ENABLE_ATOMICS_TESTS
> > > > + __u64 val64 = 2;
> > > > + __u32 val32 = 2;
> > > > +
> > > > + __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
> > > > + __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);
> > >
> > > Interesting to see this also works. I guess we probably won't advertise
> > > this, right? Currently for LLVM, the memory ordering parameter is ignored.
> >
> > Well IIUC this specific case is fine: the ordering that you get with
> > BPF_[CMP]XCHG (via kernel atomic_[cmpxchg]) is sequential consistency,
> > and it's fine to provide a stronger ordering than the one requested. I
> > should change it to say __ATOMIC_SEQ_CST to avoid confusing readers,
> > though.
> >
> > (I wrote it this way because I didn't see a __sync* function for
> > unconditional atomic exchange, and I didn't see an __atomic* function
> > where you don't need to specify the ordering).
>
> For the above code,
> __atomic_exchange(&xchg64_value, &val64, &xchg64_result,
> __ATOMIC_RELAXED);
> It does an atomic exchange between &xchg64_value and &val64, and
> stores the old value of xchg64_value to &xchg64_result. So it is
> equivalent to
> xchg64_result = __sync_lock_test_and_set(&xchg64_value, val64);
>
> So I think this test case can be dropped.

Ah nice, I didn't know about __sync_lock_test_and_set, let's switch to
that I think.
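
Something like this, perhaps (just a sketch, keeping the existing
variable names from the test prog):

	SEC("fentry/bpf_fentry_test1")
	int BPF_PROG(xchg, int a)
	{
	#ifdef ENABLE_ATOMICS_TESTS
		__u64 val64 = 2;
		__u32 val32 = 2;

		/* __sync_lock_test_and_set() stores the new value and
		 * returns the old one; no explicit memory-ordering
		 * argument needed.
		 */
		xchg64_result = __sync_lock_test_and_set(&xchg64_value, val64);
		xchg32_result = __sync_lock_test_and_set(&xchg32_value, val32);
	#endif

		return 0;
	}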

> > However... this led me to double-check the semantics and realise that we
> > do have a problem with ordering: The kernel's atomic_{add,and,or,xor} do
> > not imply memory barriers and therefore neither do the corresponding BPF
> > instructions. That means Clang can compile this:
> >
> > (void)__atomic_fetch_add(&val, 1, __ATOMIC_SEQ_CST)
> >
> > to a {.code = (BPF_STX | BPF_DW | BPF_ATOMIC), .imm = BPF_ADD},
> > which is implemented with atomic_add, which doesn't actually satisfy
> > __ATOMIC_SEQ_CST.
>
> This is the main reason in all my llvm selftests I did not use
> __atomic_* intrinsics because we cannot handle *different* memory
> ordering properly.
>
> >
> > In fact... I think this is a pre-existing issue with BPF_XADD.
> >
> > If all I've written here is correct, the fix is to use
> > (void)atomic_fetch_add etc (these imply barriers) even when BPF_FETCH is
> > not set. And that change ought to be backported to fix BPF_XADD.
>
> We cannot change BPF_XADD behavior. If we change BPF_XADD to use
> atomic_fetch_add, then suddenly old code compiled with llvm12 will
> suddenly requires latest kernel, which will break userland very badly.

Sorry I should have been more explicit: I meant that the fix would be to
call atomic_fetch_add but discard the return value, purely for the
implied barrier. The x86 JIT would stay the same. It would not break any
existing code, only add ordering guarantees that the user probably
already expected (since these builtins come from GCC originally and the
GCC docs say "these builtins are considered a full barrier" [1])

[1] https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

2020-12-08 18:20:51

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations



On 12/8/20 8:59 AM, Brendan Jackman wrote:
> On Tue, Dec 08, 2020 at 08:38:04AM -0800, Yonghong Song wrote:
>>
>>
>> On 12/8/20 4:41 AM, Brendan Jackman wrote:
>>> On Mon, Dec 07, 2020 at 07:18:57PM -0800, Yonghong Song wrote:
>>>>
>>>>
>>>> On 12/7/20 8:07 AM, Brendan Jackman wrote:
>>>>> [...]
>>>>> +
>>>>> +__u64 xchg64_value = 1;
>>>>> +__u64 xchg64_result = 0;
>>>>> +__u32 xchg32_value = 1;
>>>>> +__u32 xchg32_result = 0;
>>>>> +
>>>>> +SEC("fentry/bpf_fentry_test1")
>>>>> +int BPF_PROG(xchg, int a)
>>>>> +{
>>>>> +#ifdef ENABLE_ATOMICS_TESTS
>>>>> + __u64 val64 = 2;
>>>>> + __u32 val32 = 2;
>>>>> +
>>>>> + __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
>>>>> + __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);
>>>>
>>>> Interesting to see this also works. I guess we probably won't advertise
>>>> this, right? Currently for LLVM, the memory ordering parameter is ignored.
>>>
>>> Well IIUC this specific case is fine: the ordering that you get with
>>> BPF_[CMP]XCHG (via kernel atomic_[cmpxchg]) is sequential consistency,
>>> and it's fine to provide a stronger ordering than the one requested. I
>>> should change it to say __ATOMIC_SEQ_CST to avoid confusing readers,
>>> though.
>>>
>>> (I wrote it this way because I didn't see a __sync* function for
>>> unconditional atomic exchange, and I didn't see an __atomic* function
>>> where you don't need to specify the ordering).
>>
>> For the above code,
>> __atomic_exchange(&xchg64_value, &val64, &xchg64_result,
>> __ATOMIC_RELAXED);
>> It does an atomic exchange between &xchg64_value and &val64, and
>> stores the old value of xchg64_value to &xchg64_result. So it is
>> equivalent to
>> xchg64_result = __sync_lock_test_and_set(&xchg64_value, val64);
>>
>> So I think this test case can be dropped.
>
> Ah nice, I didn't know about __sync_lock_test_and_set, let's switch to
> that I think.
>
>>> However... this led me to double-check the semantics and realise that we
>>> do have a problem with ordering: The kernel's atomic_{add,and,or,xor} do
>>> not imply memory barriers and therefore neither do the corresponding BPF
>>> instructions. That means Clang can compile this:
>>>
>>> (void)__atomic_fetch_add(&val, 1, __ATOMIC_SEQ_CST)
>>>
>>> to a {.code = (BPF_STX | BPF_DW | BPF_ATOMIC), .imm = BPF_ADD},
>>> which is implemented with atomic_add, which doesn't actually satisfy
>>> __ATOMIC_SEQ_CST.
>>
>> This is the main reason in all my llvm selftests I did not use
>> __atomic_* intrinsics because we cannot handle *different* memory
>> ordering properly.
>>
>>>
>>> In fact... I think this is a pre-existing issue with BPF_XADD.
>>>
>>> If all I've written here is correct, the fix is to use
>>> (void)atomic_fetch_add etc (these imply barriers) even when BPF_FETCH is
>>> not set. And that change ought to be backported to fix BPF_XADD.
>>
>> We cannot change BPF_XADD behavior. If we change BPF_XADD to use
>> atomic_fetch_add, then old code compiled with llvm12 will suddenly
>> require the latest kernel, which will break userland very badly.
>
> Sorry I should have been more explicit: I meant that the fix would be to
> call atomic_fetch_add but discard the return value, purely for the
> implied barrier. The x86 JIT would stay the same. It would not break any
> existing code, only add ordering guarantees that the user probably
> already expected (since these builtins come from GCC originally and the
> GCC docs say "these builtins are considered a full barrier" [1])
>
> [1] https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

This is indeed the issue. In the past, people have already used gcc
__sync_fetch_and_add() for the xadd instruction, for which the
generated code does not imply a barrier.

The new atomics support has the following logic:
 - if the return value is used, it is atomic_fetch_add
 - if the return value is not used, it is xadd
The reason to do this is to preserve backward compatibility,
and this way we can get rid of -mcpu=v4.

The barrier issue is tricky and, as we discussed earlier, let us
delay it until after basic atomics support has landed. We may not
conform 100% to gcc __sync_fetch_and_add() or __atomic_*()
semantics. We do need to clearly document what is expected
in llvm and the kernel.
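
To illustrate that mapping in BPF prog C (a sketch; the comments state
the expected instruction selection per the rule above):

	__u64 val = 0;

	__u64 add_both_forms(void)
	{
		/* Result discarded: LLVM can select the legacy xadd
		 * encoding, i.e. BPF_ATOMIC with imm = BPF_ADD and no
		 * BPF_FETCH. */
		(void)__sync_fetch_and_add(&val, 1);

		/* Result used: LLVM selects the fetching encoding,
		 * imm = BPF_ADD | BPF_FETCH; the old value is loaded
		 * into the source register. */
		return __sync_fetch_and_add(&val, 1);
	}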

2020-12-09 00:56:25

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 06/11] bpf: Add BPF_FETCH field / create atomic_fetch_add instruction

On Mon, Dec 07, 2020 at 05:41:05PM -0800, Yonghong Song wrote:
>
>
> On 12/7/20 8:07 AM, Brendan Jackman wrote:
> > The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
> > instructions, in order to have the previous value of the
> > atomically-modified memory location loaded into the src register
> > after an atomic op is carried out.
> >
> > Suggested-by: Yonghong Song <[email protected]>
> > Signed-off-by: Brendan Jackman <[email protected]>
> > ---
> > arch/x86/net/bpf_jit_comp.c | 4 ++++
> > include/linux/filter.h | 1 +
> > include/uapi/linux/bpf.h | 3 +++
> > kernel/bpf/core.c | 13 +++++++++++++
> > kernel/bpf/disasm.c | 7 +++++++
> > kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++---------
> > tools/include/linux/filter.h | 11 +++++++++++
> > tools/include/uapi/linux/bpf.h | 3 +++
> > 8 files changed, 66 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> [...]
>
> > index f345f12c1ff8..4e0100ba52c2 100644
> > --- a/tools/include/linux/filter.h
> > +++ b/tools/include/linux/filter.h
> > @@ -173,6 +173,7 @@
> > * Atomic operations:
> > *
> > * BPF_ADD *(uint *) (dst_reg + off16) += src_reg
> > + * BPF_ADD | BPF_FETCH src_reg = atomic_fetch_add(dst_reg + off16, src_reg);
> > */
> > #define BPF_ATOMIC64(OP, DST, SRC, OFF) \
> > @@ -201,6 +202,16 @@
> > .off = OFF, \
> > .imm = BPF_ADD })
> > +/* Atomic memory add with fetch, src_reg = atomic_fetch_add(dst_reg + off, src_reg); */
> > +
> > +#define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
> > + ((struct bpf_insn) { \
> > + .code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
> > + .dst_reg = DST, \
> > + .src_reg = SRC, \
> > + .off = OFF, \
> > + .imm = BPF_ADD | BPF_FETCH })
>
> Not sure whether it is a good idea or not to fold this into BPF_ATOMIC
> macro. At least you can define BPF_ATOMIC macro and
> #define BPF_ATOMIC_FETCH_ADD(SIZE, DST, SRC, OFF) \
> BPF_ATOMIC(SIZE, DST, SRC, OFF, BPF_ADD | BPF_FETCH)
>
> to avoid too many code duplications?

Oops... I intended to totally get rid of these and fold them into
BPF_ATOMIC{64,32}! OK, let's combine all of them into a single macro.
It will have to be called something slightly awkward like
BPF_ATOMIC_INSN because BPF_ATOMIC is the name of the BPF_OP.
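
Roughly like this, maybe (a sketch of the shape, not necessarily the
final spelling):

	/* One macro for every BPF_ATOMIC instruction; OP is e.g.
	 * BPF_ADD, BPF_ADD | BPF_FETCH, BPF_XCHG or BPF_CMPXCHG.
	 */
	#define BPF_ATOMIC_INSN(SIZE, OP, DST, SRC, OFF)		\
		((struct bpf_insn) {					\
			.code = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC,	\
			.dst_reg = DST,					\
			.src_reg = SRC,					\
			.off = OFF,					\
			.imm = OP })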

>
> > +
> > /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
> > #define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 98161e2d389f..d5389119291e 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -44,6 +44,9 @@
> > #define BPF_CALL 0x80 /* function call */
> > #define BPF_EXIT 0x90 /* function return */
> > +/* atomic op type fields (stored in immediate) */
> > +#define BPF_FETCH 0x01 /* fetch previous value into src reg */
> > +
> > /* Register numbers */
> > enum {
> > BPF_REG_0 = 0,
> >

2020-12-09 06:14:28

by John Fastabend

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 04/11] bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

Brendan Jackman wrote:
> Hi John, thanks a lot for the reviews!
>
> On Mon, Dec 07, 2020 at 01:56:53PM -0800, John Fastabend wrote:
> > Brendan Jackman wrote:
> > > A subsequent patch will add additional atomic operations. These new
> > > operations will use the same opcode field as the existing XADD, with
> > > the immediate discriminating different operations.
> > >
> > > In preparation, rename the instruction mode BPF_ATOMIC and start
> > > calling the zero immediate BPF_ADD.
> > >
> > > This is possible (doesn't break existing valid BPF progs) because the
> > > immediate field is currently reserved MBZ and BPF_ADD is zero.
> > >
> > > All uses are removed from the tree but the BPF_XADD definition is
> > > kept around to avoid breaking builds for people including kernel
> > > headers.
> > >
> > > Signed-off-by: Brendan Jackman <[email protected]>
> > > ---

[...]

> > > + case BPF_STX | BPF_ATOMIC | BPF_W:
> > > + case BPF_STX | BPF_ATOMIC | BPF_DW:
> > > + if (insn->imm != BPF_ADD) {
> > > + pr_err("bpf-jit: not supported: atomic operation %02x ***\n",
> > > + insn->imm);
> > > + return -EINVAL;
> > > + }
> >
> > Can we standardize the error across jits and the error return code? It seems
> > odd that we use pr_err, pr_info_once, pr_err_ratelimited and then return
> > ENOTSUPP, EFAULT or EINVAL.
>
> That would be a noble cause but I don't think it makes sense in this
> patchset: they are already inconsistent, so here I've gone for intra-JIT
> consistency over inter-JIT consistency.
>
> I think it would be more annoying, for example, if the s390 JIT returned
> -EOPNOTSUPP for a bad atomic but -1 for other unsupported ops, than it
> already is that the s390 JIT returns -1 where the MIPS JIT returns -EINVAL.

ok works for me thanks for the explanation.

2020-12-10 00:28:02

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 09/11] bpf: Add bitwise atomic instructions

Hi Brendan,

I love your patch! Yet something to improve:

[auto build test ERROR on 34da87213d3ddd26643aa83deff7ffc6463da0fc]

url: https://github.com/0day-ci/linux/commits/Brendan-Jackman/Atomics-for-eBPF/20201208-001343
base: 34da87213d3ddd26643aa83deff7ffc6463da0fc
config: m68k-randconfig-r022-20201209 (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/2a65bda50b756e76e985b1d2bba80b3023a9cdc3
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Brendan-Jackman/Atomics-for-eBPF/20201208-001343
git checkout 2a65bda50b756e76e985b1d2bba80b3023a9cdc3
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

kernel/bpf/core.c:1350:12: warning: no previous prototype for 'bpf_probe_read_kernel' [-Wmissing-prototypes]
1350 | u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
| ^~~~~~~~~~~~~~~~~~~~~
In file included from kernel/bpf/core.c:21:
kernel/bpf/core.c: In function '___bpf_prog_run':
include/linux/filter.h:1000:3: warning: cast between incompatible function types from 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} to 'u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, const struct bpf_insn *)'} [-Wcast-function-type]
1000 | ((u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)) \
| ^
kernel/bpf/core.c:1518:13: note: in expansion of macro '__bpf_call_base_args'
1518 | BPF_R0 = (__bpf_call_base_args + insn->imm)(BPF_R1, BPF_R2,
| ^~~~~~~~~~~~~~~~~~~~
kernel/bpf/core.c:1638:6: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
1638 | (atomic64_t *)(s64) (DST + insn->off)); \
| ^
kernel/bpf/core.c:1644:3: note: in expansion of macro 'ATOMIC_ALU_OP'
1644 | ATOMIC_ALU_OP(BPF_ADD, add)
| ^~~~~~~~~~~~~
kernel/bpf/core.c:1638:6: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
1638 | (atomic64_t *)(s64) (DST + insn->off)); \
| ^
kernel/bpf/core.c:1645:3: note: in expansion of macro 'ATOMIC_ALU_OP'
1645 | ATOMIC_ALU_OP(BPF_AND, and)
| ^~~~~~~~~~~~~
kernel/bpf/core.c:1638:6: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
1638 | (atomic64_t *)(s64) (DST + insn->off)); \
| ^
kernel/bpf/core.c:1646:3: note: in expansion of macro 'ATOMIC_ALU_OP'
1646 | ATOMIC_ALU_OP(BPF_OR, or)
| ^~~~~~~~~~~~~
kernel/bpf/core.c:1638:6: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
1638 | (atomic64_t *)(s64) (DST + insn->off)); \
| ^
kernel/bpf/core.c:1647:3: note: in expansion of macro 'ATOMIC_ALU_OP'
1647 | ATOMIC_ALU_OP(BPF_XOR, xor)
| ^~~~~~~~~~~~~
kernel/bpf/core.c:1657:6: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
1657 | (atomic64_t *)(u64) (DST + insn->off),
| ^
kernel/bpf/core.c:1667:6: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
1667 | (atomic64_t *)(u64) (DST + insn->off),
| ^
In file included from kernel/bpf/core.c:21:
kernel/bpf/core.c: In function 'bpf_patch_call_args':
include/linux/filter.h:1000:3: warning: cast between incompatible function types from 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} to 'u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, const struct bpf_insn *)'} [-Wcast-function-type]
1000 | ((u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)) \
| ^
kernel/bpf/core.c:1756:3: note: in expansion of macro '__bpf_call_base_args'
1756 | __bpf_call_base_args;
| ^~~~~~~~~~~~~~~~~~~~
{standard input}: Assembler messages:
>> {standard input}:3068: Error: operands mismatch -- statement `orl %a1,%d0' ignored
{standard input}:3068: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d4,%d0,(%a6)' ignored
{standard input}:3116: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d1,%d5,(%a6,%d0.l)' ignored
{standard input}:3163: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d4,%d0,(%a6)' ignored
>> {standard input}:3225: Error: operands mismatch -- statement `andl %a1,%d0' ignored
{standard input}:3225: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d4,%d0,(%a6)' ignored
>> {standard input}:3290: Error: operands mismatch -- statement `eorl %a1,%d0' ignored
{standard input}:3290: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d4,%d0,(%a6)' ignored
{standard input}:3316: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d0,%d5,(%a0)' ignored

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2020-12-14 15:45:16

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 07/11] bpf: Add instructions for atomic_[cmp]xchg

Seems I never replied to this, thanks for the reviews!

On Mon, Dec 07, 2020 at 10:37:32PM -0800, John Fastabend wrote:
> Brendan Jackman wrote:
> > This adds two atomic opcodes, both of which include the BPF_FETCH
> > flag. XCHG without the BPF_FETCH flag would naturally encode
> > atomic_set. This is not supported because it would be of limited
> > value to userspace (it doesn't imply any barriers). CMPXCHG without
> > BPF_FETCH would be an atomic compare-and-write. We don't have such
> > an operation in the kernel so it isn't provided to BPF either.
> >
> > There are two significant design decisions made for the CMPXCHG
> > instruction:
> >
> > - To solve the issue that this operation fundamentally has 3
> > operands, but we only have two register fields. Therefore the
> > operand we compare against (the kernel's API calls it 'old') is
> > hard-coded to be R0. x86 has similar design (and A64 doesn't
> > have this problem).
> >
> > A potential alternative might be to encode the other operand's
> > register number in the immediate field.
> >
> > - The kernel's atomic_cmpxchg returns the old value, while the C11
> > userspace APIs return a boolean indicating the comparison
> > result. Which should BPF do? A64 returns the old value. x86 returns
> > the old value in the hard-coded register (and also sets a
> > flag). That means return-old-value is easier to JIT.
>
> Just a nit as it looks like perhaps we get one more spin here. Would
> be nice to be explicit about what the code does here. The above reads
> like it could go either way. Just an extra line "So we use ..." would
> be enough.

Ack, adding the note.
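
For illustration, using the BPF_ATOMIC64 macro from this series (a
sketch, not a real test): R0 carries the expected old value going in,
and receives whatever value was actually found coming out.

	/* r0 = 3 (expected old value), r1 = 4 (new value), r2 = &mem.
	 * If *(u64 *)(r2 + 0) == r0, write r1 there; either way r0
	 * ends up holding the old value of the memory location.
	 */
	BPF_MOV64_IMM(BPF_REG_0, 3),
	BPF_MOV64_IMM(BPF_REG_1, 4),
	BPF_ATOMIC64(BPF_CMPXCHG, BPF_REG_2, BPF_REG_1, 0),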

> > Signed-off-by: Brendan Jackman <[email protected]>
> > ---
>
> One question below.
>
> > arch/x86/net/bpf_jit_comp.c | 8 ++++++++
> > include/linux/filter.h | 22 ++++++++++++++++++++++
> > include/uapi/linux/bpf.h | 4 +++-
> > kernel/bpf/core.c | 20 ++++++++++++++++++++
> > kernel/bpf/disasm.c | 15 +++++++++++++++
> > kernel/bpf/verifier.c | 19 +++++++++++++++++--
> > tools/include/linux/filter.h | 22 ++++++++++++++++++++++
> > tools/include/uapi/linux/bpf.h | 4 +++-
> > 8 files changed, 110 insertions(+), 4 deletions(-)
> >
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index f8c4e809297d..f5f4460b3e4e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3608,11 +3608,14 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
> >
> > static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
> > {
> > + int load_reg;
> > int err;
> >
> > switch (insn->imm) {
> > case BPF_ADD:
> > case BPF_ADD | BPF_FETCH:
> > + case BPF_XCHG:
> > + case BPF_CMPXCHG:
> > break;
> > default:
> > verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
> > @@ -3634,6 +3637,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> > if (err)
> > return err;
> >
> > + if (insn->imm == BPF_CMPXCHG) {
> > + /* Check comparison of R0 with memory location */
> > + err = check_reg_arg(env, BPF_REG_0, SRC_OP);
> > + if (err)
> > + return err;
> > + }
> > +
>
> I need to think a bit more about it, but do we need to update is_reg64()
> at all for these?

I don't think so - this all falls into the same
`if (class == BPF_STX)` case as the existing BPF_STX_XADD instruction.

> > if (is_pointer_value(env, insn->src_reg)) {
> > verbose(env, "R%d leaks addr into mem\n", insn->src_reg);
> > return -EACCES;
> > @@ -3664,8 +3674,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> > if (!(insn->imm & BPF_FETCH))
> > return 0;
> >
> > - /* check and record load of old value into src reg */
> > - err = check_reg_arg(env, insn->src_reg, DST_OP);
> > + if (insn->imm == BPF_CMPXCHG)
> > + load_reg = BPF_REG_0;
> > + else
> > + load_reg = insn->src_reg;
> > +
> > + /* check and record load of old value */
> > + err = check_reg_arg(env, load_reg, DST_OP);
> > if (err)
> > return err;

2020-12-15 11:17:29

by Brendan Jackman

[permalink] [raw]
Subject: Re: [PATCH bpf-next v4 10/11] bpf: Add tests for new BPF atomic operations

On Tue, Dec 08, 2020 at 10:15:35AM -0800, Yonghong Song wrote:
>
>
> On 12/8/20 8:59 AM, Brendan Jackman wrote:
> > On Tue, Dec 08, 2020 at 08:38:04AM -0800, Yonghong Song wrote:
> > >
> > >
> > > On 12/8/20 4:41 AM, Brendan Jackman wrote:
> > > > On Mon, Dec 07, 2020 at 07:18:57PM -0800, Yonghong Song wrote:
> > > > >
> > > > >
> > > > > On 12/7/20 8:07 AM, Brendan Jackman wrote:
> > > > > > [...]
> > > > > > +
> > > > > > +__u64 xchg64_value = 1;
> > > > > > +__u64 xchg64_result = 0;
> > > > > > +__u32 xchg32_value = 1;
> > > > > > +__u32 xchg32_result = 0;
> > > > > > +
> > > > > > +SEC("fentry/bpf_fentry_test1")
> > > > > > +int BPF_PROG(xchg, int a)
> > > > > > +{
> > > > > > +#ifdef ENABLE_ATOMICS_TESTS
> > > > > > + __u64 val64 = 2;
> > > > > > + __u32 val32 = 2;
> > > > > > +
> > > > > > + __atomic_exchange(&xchg64_value, &val64, &xchg64_result, __ATOMIC_RELAXED);
> > > > > > + __atomic_exchange(&xchg32_value, &val32, &xchg32_result, __ATOMIC_RELAXED);
> > > > >
> > > > > Interesting to see this also works. I guess we probably won't advertise
> > > > > this, right? Currently for LLVM, the memory ordering parameter is ignored.
> > > >
> > > > Well IIUC this specific case is fine: the ordering that you get with
> > > > BPF_[CMP]XCHG (via the kernel's atomic_[cmp]xchg) is sequential
> > > > consistency, and it's fine to provide a stronger ordering than the
> > > > one requested. I should change it to say __ATOMIC_SEQ_CST to avoid
> > > > confusing readers, though.
> > > >
> > > > (I wrote it this way because I didn't see a __sync* function for
> > > > unconditional atomic exchange, and I didn't see an __atomic* function
> > > > where you don't need to specify the ordering).
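> > > >
> > > > The change itself is trivial, e.g.:
> > > >
> > > >     __atomic_exchange(&xchg64_value, &val64, &xchg64_result,
> > > >                       __ATOMIC_SEQ_CST);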
> > >
> > > The above code,
> > >
> > >     __atomic_exchange(&xchg64_value, &val64, &xchg64_result,
> > >                       __ATOMIC_RELAXED);
> > >
> > > does an atomic exchange between &xchg64_value and &val64, and stores
> > > the old value of xchg64_value to &xchg64_result. So it is
> > > equivalent to
> > >
> > >     xchg64_result = __sync_lock_test_and_set(&xchg64_value, val64);
> > >
> > > So I think this test case can be dropped.
> >
> > Ah nice, I didn't know about __sync_lock_test_and_set, let's switch to
> > that I think.
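> >
> > E.g. (untested sketch of the reworked prog):
> >
> >     SEC("fentry/bpf_fentry_test1")
> >     int BPF_PROG(xchg, int a)
> >     {
> >     #ifdef ENABLE_ATOMICS_TESTS
> >             xchg64_result = __sync_lock_test_and_set(&xchg64_value, 2);
> >             xchg32_result = __sync_lock_test_and_set(&xchg32_value, 2);
> >     #endif
> >             return 0;
> >     }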
> >
> > > > However... this led me to double-check the semantics and realise that we
> > > > do have a problem with ordering: The kernel's atomic_{add,and,or,xor} do
> > > > not imply memory barriers and therefore neither do the corresponding BPF
> > > > instructions. That means Clang can compile this:
> > > >
> > > > (void)__atomic_fetch_add(&val, 1, __ATOMIC_SEQ_CST)
> > > >
> > > > to a {.code = (BPF_STX | BPF_DW | BPF_ATOMIC), .imm = BPF_ADD},
> > > > which is implemented with atomic_add, which doesn't actually satisfy
> > > > __ATOMIC_SEQ_CST.
> > >
> > > This is the main reason that, in all my llvm selftests, I did not use
> > > the __atomic_* intrinsics: we cannot handle *different* memory
> > > orderings properly.
> > >
> > > >
> > > > In fact... I think this is a pre-existing issue with BPF_XADD.
> > > >
> > > > If all I've written here is correct, the fix is to use
> > > > (void)atomic_fetch_add etc (these imply barriers) even when BPF_FETCH is
> > > > not set. And that change ought to be backported to fix BPF_XADD.
> > >
> > > We cannot change BPF_XADD behavior. If we change BPF_XADD to use
> > > atomic_fetch_add, then old code compiled with llvm12 will suddenly
> > > require the latest kernel, which will break userland very badly.
> >
> > Sorry I should have been more explicit: I meant that the fix would be to
> > call atomic_fetch_add but discard the return value, purely for the
> > implied barrier. The x86 JIT would stay the same. It would not break any
> > existing code, only add ordering guarantees that the user probably
> > already expected (since these builtins come from GCC originally and the
> > GCC docs say "these builtins are considered a full barrier" [1])
> >
> > [1] https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html
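> >
> > Concretely, the BPF_XADD case in the interpreter would change roughly
> > like this (sketch, not a tested patch):
> >
> >     STX_XADD_W: /* lock xadd *(u32 *)(dst_reg + off16) += src_reg */
> > -           atomic_add((u32) SRC, (atomic_t *)(unsigned long)
> > -                      (DST + insn->off));
> > +           /* Discard the fetched value: we only want the full
> > +            * barrier implied by the _fetch_ variant.
> > +            */
> > +           (void) atomic_fetch_add((u32) SRC, (atomic_t *)(unsigned long)
> > +                                   (DST + insn->off));
> >             CONT;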
>
> This is indeed the issue. In the past, people already used gcc's
> __sync_fetch_and_add() for the xadd instruction, for which the
> generated code does not imply a barrier.
>
> The new atomics support has the following logic (see the example
> below):
>   . if the return value is used, it is atomic_fetch_add
>   . if the return value is not used, it is xadd
> The reason to do this is to preserve backward compatibility,
> and this way, we can get rid of -mcpu=v4.
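>
> For example (illustrative only):
>
>     __u64 v = 0, old;
>
>     (void)__sync_fetch_and_add(&v, 1); /* result unused: plain xadd */
>     old = __sync_fetch_and_add(&v, 1); /* result used: BPF_ADD | BPF_FETCH */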
>
> The barrier issue is tricky and, as we discussed earlier, let us
> delay it until after basic atomics support has landed. We may not
> 100% conform to gcc __sync_fetch_and_add() or __atomic_*()
> semantics. We do need to clearly document what is expected
> in llvm and the kernel.

OK, then I think we can probably justify not conforming to the
__sync_fetch_and_add() semantics since that API is under-specified
anyway.

However IMO it's unambiguously a bug for

(void)__atomic_fetch_add(&x, y, __ATOMIC_SEQ_CST);

to compile down to a kernel atomic_add. I think for that specific API
Clang really ought to always use BPF_FETCH | BPF_ADD when
anything stronger than __ATOMIC_RELAXED is requested, or even just
refuse to compile when the return value is ignored and a
non-relaxed memory ordering is specified.
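
I.e. the contrast I would want is roughly (illustrative, not what Clang
emits today):

    (void)__atomic_fetch_add(&x, y, __ATOMIC_RELAXED); /* plain BPF_ADD (xadd) is fine */
    (void)__atomic_fetch_add(&x, y, __ATOMIC_SEQ_CST); /* should be BPF_ADD | BPF_FETCH,
                                                        * with the result discarded */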