Message-Id: <20231120154948.708762225@infradead.org>
User-Agent: quilt/0.65
Date: Mon, 20 Nov 2023 15:46:44 +0100
From: Peter Zijlstra
To: peterz@infradead.org
Cc: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
 davem@davemloft.net, dsahern@kernel.org, ast@kernel.org,
 daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
 song@kernel.org, yonghong.song@linux.dev, john.fastabend@gmail.com,
 kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
 Arnd Bergmann, samitolvanen@google.com, keescook@chromium.org,
 nathan@kernel.org, ndesaulniers@google.com,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 netdev@vger.kernel.org, bpf@vger.kernel.org, linux-arch@vger.kernel.org,
 llvm@lists.linux.dev, jpoimboe@kernel.org, joao@overdrivepizza.com,
 mark.rutland@arm.com
Subject: [PATCH 2/2] x86/cfi,bpf: Fix BPF JIT call
References: <20231120144642.591358648@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The current BPF call convention is __nocfi, except when it calls !JIT things,
then it calls regular C functions.

It so happens that with FineIBT the __nocfi and C calling conventions are
incompatible. Specifically, __nocfi will call at func+0, while FineIBT will
have endbr-poison there, which is not a valid indirect target, causing #CP.

Notably, this only triggers on IBT-enabled hardware, which is probably why
this hasn't been reported (also, most people will have JIT on anyway).

Implement proper CFI prologues for the BPF JIT codegen and drop __nocfi for
x86.

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/cfi.h    |   12 +++++
 arch/x86/kernel/alternative.c |   41 ++++++++++++++---
 arch/x86/net/bpf_jit_comp.c   |   96 +++++++++++++++++++++++++++++++++++++-----
 include/linux/bpf.h           |    9 +++
 4 files changed, 137 insertions(+), 21 deletions(-)

--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -9,15 +9,27 @@
  */
 
 #include
+enum cfi_mode {
+	CFI_DEFAULT,
+	CFI_OFF,
+	CFI_KCFI,
+	CFI_FINEIBT,
+};
+
+extern enum cfi_mode cfi_mode;
+
 struct pt_regs;
 
 #ifdef CONFIG_CFI_CLANG
 enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
+#define __bpfcall
+extern u32 cfi_bpf_hash;
 #else
 static inline enum bug_trap_type handle_cfi_failure(struct pt_regs *regs)
 {
 	return BUG_TRAP_TYPE_NONE;
 }
+#define cfi_bpf_hash 0U
 #endif /* CONFIG_CFI_CLANG */
 
 #endif /* _ASM_X86_CFI_H */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 
 int __read_mostly alternatives_patched;
 
@@ -832,15 +833,37 @@ void __init_or_module apply_seal_endbr(s
 #endif /* CONFIG_X86_KERNEL_IBT */
 
 #ifdef CONFIG_FINEIBT
+#define __CFI_DEFAULT	CFI_DEFAULT
+#elif defined(CONFIG_CFI_CLANG)
+#define __CFI_DEFAULT	CFI_KCFI
+#else
+#define __CFI_DEFAULT	CFI_OFF
+#endif
 
-enum cfi_mode {
-	CFI_DEFAULT,
-	CFI_OFF,
-	CFI_KCFI,
-	CFI_FINEIBT,
-};
+enum cfi_mode cfi_mode __ro_after_init = __CFI_DEFAULT;
+
+#ifdef CONFIG_CFI_CLANG
+struct bpf_insn;
+
+extern unsigned int bpf_func_proto(const void *ctx,
+				   const struct bpf_insn *insn);
+
+__ADDRESSABLE(bpf_func_proto);
+
+asm (
+" .pushsection .data..ro_after_init,\"aw\",@progbits \n"
+" .type cfi_bpf_hash,@object \n"
+" .globl cfi_bpf_hash \n"
+" .p2align 2, 0x0 \n"
+"cfi_bpf_hash: \n"
+" .long __kcfi_typeid_bpf_func_proto \n"
+" .size cfi_bpf_hash, 4 \n"
+" .popsection \n"
+);
+#endif
+
+#ifdef CONFIG_FINEIBT
 
-static enum cfi_mode cfi_mode __ro_after_init = CFI_DEFAULT;
 static bool cfi_rand __ro_after_init = true;
 static u32 cfi_seed __ro_after_init;
 
@@ -1149,8 +1172,10 @@ static void __apply_fineibt(s32 *start_r
 		goto err;
 
 	if (cfi_rand) {
-		if (builtin)
+		if (builtin) {
 			cfi_seed = get_random_u32();
+			cfi_bpf_hash = cfi_rehash(cfi_bpf_hash);
+		}
 
 		ret = cfi_rand_preamble(start_cfi, end_cfi);
 		if (ret)
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 
 static bool all_callee_regs_used[4] = {true, true, true, true};
 
@@ -51,9 +52,11 @@ static u8 *emit_code(u8 *ptr, u32 bytes,
 	do { EMIT4(b1, b2, b3, b4); EMIT(off, 4); } while (0)
 
 #ifdef CONFIG_X86_KERNEL_IBT
-#define EMIT_ENDBR()	EMIT(gen_endbr(), 4)
+#define EMIT_ENDBR()		EMIT(gen_endbr(), 4)
+#define EMIT_ENDBR_POISON()	EMIT(gen_endbr_poison(), 4);
 #else
 #define EMIT_ENDBR()
+#define EMIT_ENDBR_POISON()
 #endif
 
 static bool is_imm8(int value)
@@ -247,6 +250,7 @@ struct jit_context {
 	 */
 	int tail_call_direct_label;
 	int tail_call_indirect_label;
+	int prog_offset;
 };
 
 /* Maximum number of bytes emitted while JITing one eBPF insn */
@@ -304,21 +308,86 @@ static void pop_callee_regs(u8 **pprog,
 	*pprog = prog;
 }
 
+static int emit_fineibt(u8 **pprog)
+{
+	u8 *prog = *pprog;
+
+	EMIT_ENDBR();
+	EMIT3_off32(0x41, 0x81, 0xea, cfi_bpf_hash);
+	EMIT2(0x74, 0x07);
+	EMIT2(0x0f, 0x0b);
+	EMIT1(0x90);
+	EMIT_ENDBR_POISON();
+
+	*pprog = prog;
+	return 16;
+}
+
+static int emit_kcfi(u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int offset = 5;
+
+	EMIT1_off32(0xb8, cfi_bpf_hash);
+#ifdef CONFIG_CALL_PADDING
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	EMIT1(0x90);
+	offset += 11;
+#endif
+	EMIT_ENDBR();
+
+	*pprog = prog;
+	return offset;
+}
+
+static int emit_cfi(u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int offset = 0;
+
+	switch (cfi_mode) {
+	case CFI_FINEIBT:
+		offset = emit_fineibt(&prog);
+		break;
+
+	case CFI_KCFI:
+		offset = emit_kcfi(&prog);
+		break;
+
+	default:
+		EMIT_ENDBR();
+		break;
+	}
+
+	*pprog = prog;
+	return offset;
+}
+
 /*
  * Emit x86-64 prologue code for BPF program.
  * bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes
  * while jumping to another program
  */
-static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf,
-			  bool tail_call_reachable, bool is_subprog,
-			  bool is_exception_cb)
+static int emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf,
+			 bool tail_call_reachable, bool is_subprog,
+			 bool is_exception_cb)
 {
 	u8 *prog = *pprog;
+	int offset;
 
+	offset = emit_cfi(&prog);
 	/* BPF trampoline can be made to work without these nops,
 	 * but let's waste 5 bytes for now and optimize later
 	 */
-	EMIT_ENDBR();
 	memcpy(prog, x86_nops[5], X86_PATCH_SIZE);
 	prog += X86_PATCH_SIZE;
 	if (!ebpf_from_cbpf) {
@@ -357,6 +426,8 @@ static void emit_prologue(u8 **pprog, u3
 	if (tail_call_reachable)
 		EMIT1(0x50);         /* push rax */
 	*pprog = prog;
+
+	return offset;
 }
 
 static int emit_patch(u8 **pprog, void *func, void *ip, u8 opcode)
@@ -1083,8 +1154,8 @@ static int do_jit(struct bpf_prog *bpf_p
 	bool tail_call_seen = false;
 	bool seen_exit = false;
 	u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
-	int i, excnt = 0;
 	int ilen, proglen = 0;
+	int i, excnt = 0;
 	u8 *prog = temp;
 	int err;
 
@@ -1094,9 +1165,12 @@ static int do_jit(struct bpf_prog *bpf_p
 	/* tail call's presence in current prog implies it is reachable */
 	tail_call_reachable |= tail_call_seen;
 
-	emit_prologue(&prog, bpf_prog->aux->stack_depth,
-		      bpf_prog_was_classic(bpf_prog), tail_call_reachable,
-		      bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb);
+	ctx->prog_offset = emit_prologue(&prog, bpf_prog->aux->stack_depth,
+					 bpf_prog_was_classic(bpf_prog),
+					 tail_call_reachable,
+					 bpf_is_subprog(bpf_prog),
+					 bpf_prog->aux->exception_cb);
+
 	/* Exception callback will clobber callee regs for its own use, and
 	 * restore the original callee regs from main prog's stack frame.
 	 */
@@ -2935,9 +3009,9 @@ struct bpf_prog *bpf_int_jit_compile(str
 			jit_data->header = header;
 			jit_data->rw_header = rw_header;
 		}
-		prog->bpf_func = (void *)image;
+		prog->bpf_func = (void *)image + ctx.prog_offset;
 		prog->jited = 1;
-		prog->jited_len = proglen;
+		prog->jited_len = proglen - ctx.prog_offset; // XXX?
 	} else {
 		prog = orig_prog;
 	}
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
@@ -1188,7 +1189,11 @@ struct bpf_dispatcher {
 #endif
 };
 
-static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
+#ifndef __bpfcall
+#define __bpfcall __nocfi
+#endif
+
+static __always_inline __bpfcall unsigned int bpf_dispatcher_nop_func(
 	const void *ctx,
 	const struct bpf_insn *insnsi,
 	bpf_func_t bpf_func)
@@ -1278,7 +1283,7 @@ int arch_prepare_bpf_dispatcher(void *im
 
 #define DEFINE_BPF_DISPATCHER(name)					\
 	__BPF_DISPATCHER_SC(name);					\
-	noinline __nocfi unsigned int bpf_dispatcher_##name##_func(	\
+	noinline __bpfcall unsigned int bpf_dispatcher_##name##_func(	\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
 		bpf_func_t bpf_func)					\
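
For readers checking the offsets returned by emit_fineibt() and emit_kcfi(),
here is a rough userspace sketch of the two preamble layouts. It only lays out
bytes in a buffer and reports where the entry point (prog_offset) would fall;
it is not kernel code. The names sketch_fineibt(), sketch_kcfi(), put() and
put_imm32() are made up for illustration, the call_padding argument stands in
for CONFIG_CALL_PADDING, and the endbr-poison encoding (66 0f 1f 00) is an
assumption about what gen_endbr_poison() produces, not something stated in
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* endbr64 encoding; the poison bytes are an assumed value (see lead-in). */
static const uint8_t ENDBR64[4]      = { 0xf3, 0x0f, 0x1e, 0xfa };
static const uint8_t ENDBR_POISON[4] = { 0x66, 0x0f, 0x1f, 0x00 };

/* Append n raw bytes and return the advanced cursor. */
static uint8_t *put(uint8_t *p, const void *src, size_t n)
{
	memcpy(p, src, n);
	return p + n;
}

/* Append a 32-bit immediate in little-endian byte order. */
static uint8_t *put_imm32(uint8_t *p, uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24),
	};
	return put(p, b, sizeof(b));
}

/* FineIBT-style preamble: returns the offset of the entry point. */
static int sketch_fineibt(uint8_t *buf, uint32_t hash)
{
	uint8_t *p = buf;
	int entry;

	p = put(p, ENDBR64, 4);                          /* endbr64         */
	p = put(p, (uint8_t[]){ 0x41, 0x81, 0xea }, 3);  /* sub r10d, imm32 */
	p = put_imm32(p, hash);
	p = put(p, (uint8_t[]){ 0x74, 0x07 }, 2);        /* jz +7           */
	p = put(p, (uint8_t[]){ 0x0f, 0x0b }, 2);        /* ud2             */
	p = put(p, (uint8_t[]){ 0x90 }, 1);              /* nop             */
	entry = (int)(p - buf);
	assert(entry == 16);            /* 4 + 7 + 2 + 2 + 1 bytes */
	(void)put(p, ENDBR_POISON, 4);  /* func+0 is not a valid target */
	return entry;
}

/* kCFI-style preamble: returns the offset of the entry point. */
static int sketch_kcfi(uint8_t *buf, uint32_t hash, int call_padding)
{
	uint8_t *p = buf;
	int entry;

	p = put(p, (uint8_t[]){ 0xb8 }, 1);              /* mov eax, imm32  */
	p = put_imm32(p, hash);
	if (call_padding) {                              /* 11 one-byte nops */
		memset(p, 0x90, 11);
		p += 11;
	}
	entry = (int)(p - buf);
	(void)put(p, ENDBR64, 4);       /* func+0 stays a valid target */
	return entry;
}
```

With a 32-byte buffer, sketch_fineibt() reports 16, and sketch_kcfi() reports
5 without padding and 16 with it, matching the values the patch returns. Note
that the jz displacement of 7 skips ud2, the nop and the 4-byte poison, so a
successful hash check continues just past func+0.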