Date: Thu, 3 Dec 2020 16:02:31 +0000
Message-Id: <20201203160245.1014867-1-jackmanb@google.com>
Subject: [PATCH bpf-next v3 00/14] Atomics for eBPF
From: Brendan Jackman
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Yonghong Song, Daniel Borkmann, KP Singh,
    Florent Revest, linux-kernel@vger.kernel.org, Jann Horn,
    Brendan Jackman

Status of the patches
=====================

Thanks for the reviews! Differences from v2->v3 [1]:

* More minor fixes and naming/comment changes

* Dropped atomic subtract: compilers can implement this by preceding
  an atomic add with a NEG instruction (which is what the x86 JIT did
  under the hood anyway).

* Dropped the use of -mcpu=v4 in the Clang BPF command-line; there is
  no longer an architecture version bump. Instead a feature test is
  added to Kbuild - it builds a source file to check if Clang
  supports BPF atomics.

* Fixed the prog_test so it no longer breaks test_progs-no_alu32.
  This requires some ifdef acrobatics to avoid complicating the
  prog_tests model where the same userspace code exercises both the
  normal and no_alu32 BPF test objects, using the same skeleton
  header.

Differences from v1->v2 [1]:

* Fixed mistakes in the netronome driver

* Added sub, add, or, xor operations

* The above led to some refactors to keep things readable. (Maybe I
  should have just waited until I'd implemented these before starting
  the review...)

* Replaced BPF_[CMP]SET | BPF_FETCH with just BPF_[CMP]XCHG, which
  include the BPF_FETCH flag

* Added a bit of documentation. Suggestions welcome for more places
  to dump this info...

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in https://reviews.llvm.org/D72184

This only includes a JIT implementation for x86_64 - I don't plan to
implement JIT support myself for other architectures.

Operations
==========

This patchset adds atomic operations to the eBPF instruction set. The
use-case that motivated this work was a trivial and efficient way to
generate globally-unique cookies in BPF progs, but I think it's
obvious that these features are pretty widely applicable. The
instructions that are added here can be summarised with this list of
kernel operations:

* atomic[64]_[fetch_]add
* atomic[64]_[fetch_]and
* atomic[64]_[fetch_]or
* atomic[64]_xchg
* atomic[64]_cmpxchg

The following are left out of scope for this effort:

* 16 and 8 bit operations
* Explicit memory barriers

Encoding
========

I originally planned to add new values for bpf_insn.opcode. This was
rather unpleasant: the opcode space has holes in it but no entire
instruction classes[2]. Yonghong Song had a better idea: use the
immediate field of the existing STX XADD instruction to encode the
operation. This works nicely, without breaking existing programs,
because the immediate field is currently reserved-must-be-zero, and
extra-nicely because BPF_ADD happens to be zero.
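
To make the encoding concrete, here is a minimal userspace sketch (not
code from the series) of how a consumer might read the operation out
of the immediate. The struct mirrors the layout of struct bpf_insn and
the constants are the long-standing uapi values; atomic_op_name() is a
hypothetical helper invented for illustration, and the assumption that
the ALU op values BPF_AND, BPF_OR and BPF_XOR are what lands in .imm
is taken from the rest of this letter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Long-standing values from the uapi BPF headers. */
#define BPF_STX   0x03  /* instruction class: store from register */
#define BPF_W     0x00  /* operand size: 32 bit */
#define BPF_DW    0x18  /* operand size: 64 bit */
#define BPF_XADD  0xc0  /* mode: legacy name of the atomic mode */
#define BPF_ADD   0x00  /* ALU op values, re-used in .imm */
#define BPF_OR    0x40
#define BPF_AND   0x50
#define BPF_XOR   0xa0

/* Same field layout as struct bpf_insn. */
struct insn {
	uint8_t code;       /* class | size | mode */
	uint8_t dst_reg:4;  /* register holding the base address */
	uint8_t src_reg:4;  /* register holding the value operand */
	int16_t off;        /* signed offset added to dst_reg */
	int32_t imm;        /* selects the atomic operation */
};

/* Hypothetical helper: name the operation selected by .imm. */
static const char *atomic_op_name(const struct insn *insn)
{
	/* Class is the low 3 bits, mode is the high 3 bits of .code. */
	if ((insn->code & 0x07) != BPF_STX || (insn->code & 0xe0) != BPF_XADD)
		return "not atomic";

	switch (insn->imm) {
	case BPF_ADD: return "add";  /* also matches legacy XADD, imm == 0 */
	case BPF_AND: return "and";
	case BPF_OR:  return "or";
	case BPF_XOR: return "xor";
	default:      return "unknown";
	}
}
```

A program built before this series, with BPF_STX | BPF_XADD | BPF_DW
and .imm = 0, decodes as "add" here, which is the compatibility
property described above.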

Note that this of course makes immediate-source atomic operations
impossible. It's hard to imagine a measurable speedup from such
instructions, and if it existed it would certainly not benefit x86,
which has no support for them.

The BPF_OP opcode fields are re-used in the immediate, and an
additional flag BPF_FETCH is used to mark instructions that should
fetch a pre-modification value from memory.

So, BPF_XADD is now called BPF_ATOMIC (the old name is kept to avoid
breaking userspace builds), and where we previously had .imm = 0, we
now have .imm = BPF_ADD (which is 0).

Operands
========

Reg-source eBPF instructions only have two operands, while these
atomic operations have up to four. To avoid needing to encode
additional operands, then:

- One of the input registers is re-used as an output register
  (e.g. atomic_fetch_add both reads from and writes to the source
  register).

- Where necessary (i.e. for cmpxchg), R0 is "hard-coded" as one of
  the operands.

This approach also allows the new eBPF instructions to map directly
to single x86 instructions.
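
A rough userspace model of the operand convention, written with C11
atomics. This is a sketch under assumptions, not the interpreter or
JIT code: struct regs, model_fetch_add() and model_cmpxchg() are
hypothetical names, and the cmpxchg behaviour shown (R0 holds the
expected value and receives the old value) is assumed by analogy with
x86 CMPXCHG's use of RAX. The text above only says that R0 is
hard-coded as one of the operands.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Toy register file: r[0] stands in for R0, r[1] for R1, and so on. */
struct regs {
	uint64_t r[11];
};

/*
 * Model of atomic_fetch_add: the source register supplies the addend
 * and is then overwritten with the value that was in memory before.
 */
static void model_fetch_add(struct regs *regs, int src,
			    _Atomic uint64_t *mem)
{
	regs->r[src] = atomic_fetch_add(mem, regs->r[src]);
}

/*
 * Model of cmpxchg (assumed semantics): compare *mem with R0 and, if
 * they are equal, store the source register. R0 ends up holding the
 * value that was in memory before the operation either way.
 */
static void model_cmpxchg(struct regs *regs, int src,
			  _Atomic uint64_t *mem)
{
	uint64_t expected = regs->r[0];

	/* On failure, C11 writes the current value into 'expected'. */
	atomic_compare_exchange_strong(mem, &expected, regs->r[src]);
	regs->r[0] = expected;
}
```

With memory holding 5 and R1 holding 3, model_fetch_add() leaves 8 in
memory and 5 in R1, so a two-operand instruction is enough to carry
both the input and the fetched result.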

[1] Previous patchset:
    https://lore.kernel.org/bpf/20201123173202.1335708-1-jackmanb@google.com/

[2] Visualisation of eBPF opcode space:
    https://gist.github.com/bjackman/00fdad2d5dfff601c1918bc29b16e778

Brendan Jackman (14):
  bpf: x86: Factor out emission of ModR/M for *(reg + off)
  bpf: x86: Factor out emission of REX byte
  bpf: x86: Factor out function to emit NEG
  bpf: x86: Factor out a lookup table for some ALU opcodes
  bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
  bpf: Move BPF_STX reserved field check into BPF_STX verifier code
  bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
  bpf: Add instructions for atomic_[cmp]xchg
  bpf: Pull out a macro for interpreting atomic ALU operations
  bpf: Add bitwise atomic instructions
  tools build: Implement feature check for BPF atomics in Clang
  bpf: Pull tools/build/feature biz into selftests Makefile
  bpf: Add tests for new BPF atomic operations
  bpf: Document new atomic instructions

 Documentation/networking/filter.rst           |  56 +++-
 arch/arm/net/bpf_jit_32.c                     |   7 +-
 arch/arm64/net/bpf_jit_comp.c                 |  16 +-
 arch/mips/net/ebpf_jit.c                      |  11 +-
 arch/powerpc/net/bpf_jit_comp64.c             |  25 +-
 arch/riscv/net/bpf_jit_comp32.c               |  20 +-
 arch/riscv/net/bpf_jit_comp64.c               |  16 +-
 arch/s390/net/bpf_jit_comp.c                  |  27 +-
 arch/sparc/net/bpf_jit_comp_64.c              |  17 +-
 arch/x86/net/bpf_jit_comp.c                   | 241 +++++++++++-----
 arch/x86/net/bpf_jit_comp32.c                 |   6 +-
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  |  14 +-
 drivers/net/ethernet/netronome/nfp/bpf/main.h |   4 +-
 .../net/ethernet/netronome/nfp/bpf/verifier.c |  15 +-
 include/linux/filter.h                        |  97 ++++++-
 include/uapi/linux/bpf.h                      |   8 +-
 kernel/bpf/core.c                             |  66 ++++-
 kernel/bpf/disasm.c                           |  43 ++-
 kernel/bpf/verifier.c                         |  75 +++--
 lib/test_bpf.c                                |   2 +-
 samples/bpf/bpf_insn.h                        |   4 +-
 samples/bpf/sock_example.c                    |   2 +-
 samples/bpf/test_cgrp2_attach.c               |   4 +-
 tools/build/feature/Makefile                  |   4 +
 tools/build/feature/test-clang-bpf-atomics.c  |   9 +
 tools/include/linux/filter.h                  |  97 ++++++-
 tools/include/uapi/linux/bpf.h                |   8 +-
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/Makefile          |  42 +++
 .../selftests/bpf/prog_tests/atomics_test.c   | 262 ++++++++++++++++++
 .../bpf/prog_tests/cgroup_attach_multi.c      |   4 +-
 .../selftests/bpf/progs/atomics_test.c        | 154 ++++++++++
 .../selftests/bpf/verifier/atomic_and.c       |  77 +++++
 .../selftests/bpf/verifier/atomic_cmpxchg.c   |  96 +++++++
 .../selftests/bpf/verifier/atomic_fetch_add.c | 106 +++++++
 .../selftests/bpf/verifier/atomic_or.c        |  77 +++++
 .../selftests/bpf/verifier/atomic_xchg.c      |  46 +++
 .../selftests/bpf/verifier/atomic_xor.c       |  77 +++++
 tools/testing/selftests/bpf/verifier/ctx.c    |   7 +-
 .../testing/selftests/bpf/verifier/leak_ptr.c |   4 +-
 tools/testing/selftests/bpf/verifier/unpriv.c |   3 +-
 tools/testing/selftests/bpf/verifier/xadd.c   |   2 +-
 42 files changed, 1666 insertions(+), 186 deletions(-)
 create mode 100644 tools/build/feature/test-clang-bpf-atomics.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/atomics_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/atomics_test.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_and.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_or.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xchg.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_xor.c

base-commit: 97306be45fbe7a02461c3c2a57e666cf662b1aaf
--
2.29.2.454.gaff20da3a2-goog