Received: by 2002:a05:6358:16cc:b0:ea:6187:17c9 with SMTP id r12csp12203694rwl; Tue, 3 Jan 2023 10:30:58 -0800 (PST) X-Google-Smtp-Source: AMrXdXtvKBoRJzIBlCK+TCJ8NLG9+1LXBsfYfmUOI0CPN8QYUDXG4y+9G7cJyx7gBcF2sY/AhVa7 X-Received: by 2002:a17:907:3445:b0:7ad:c5b7:1c79 with SMTP id ac5-20020a170907344500b007adc5b71c79mr32382610ejc.45.1672770658201; Tue, 03 Jan 2023 10:30:58 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1672770658; cv=none; d=google.com; s=arc-20160816; b=foFwtqE6JWmBn396NK2IDxKeMl4iWrKpkA8uo6jxN/GLGmmOrcpaoXUv+OxFNNJ1qF xiuJU21fBa6sp8xQGWB7QjLfOwadW9QnvoHDqaP/lfkq6cYTrpX9y/wO64OrTqJKIyKA 2xymMX/XbX7uHJo35w3kLplgBIYGMan8kR8+tnWTFgx8YhssiBatzp84wmN5tV/DbbY7 sIyNPWR8wzEUFTr0yCis0bZw2cWL/qcwDn+6qsTNEEt3I52Hg6CqUKDL1zi/rx7R0J26 iqFk+DrH4iDIzAwxrVBolFeToWHBvEsGEPKhlfdm7VrhzzRBlJjpl9pnR9bVpcBo+1Eu AgMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:references:in-reply-to:subject:cc:to:from :dkim-signature; bh=lhHrDU5nn9m0Z+h26IqmQBQnT6YsVxJMkjIlQFroTiY=; b=H17ordVjmyop9LqVs+U2E/nES8vXJ2RSI7oOwLboX6o012vmME/BgdAGjJ9PJH4XRJ 9jsxEwulmEn8y0iTyXFvxsueI9MI0DM5nSuXjvdDgZqVCMVpHBQtQv9a7hxAFsX8Wao5 ZCcAa5mqyo+2p6kdaDW78FnD6OxKiPO1R4NDfnZzx7XG0qFf0q3ttyDLaeb2JnJQnfTo o76mtx++e8spWmYLTzKGxQPN0emX/n8qg/2az9W1dBfmytgp3kvgOUFZLfl32OyYrs4U xmNu5qZeVqruSbdNZkDWdUJYnhJEanhWop+LzKS5KvQ9w0PREYdMfiJuhbMUg9roWrQR 6NgQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=UitcWki6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id wv1-20020a170907080100b007ae10525573si27026277ejb.671.2023.01.03.10.30.43; Tue, 03 Jan 2023 10:30:58 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=UitcWki6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238733AbjACSMM (ORCPT + 60 others); Tue, 3 Jan 2023 13:12:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238742AbjACSMH (ORCPT ); Tue, 3 Jan 2023 13:12:07 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 653FAE7F; Tue, 3 Jan 2023 10:12:06 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0199A614B8; Tue, 3 Jan 2023 18:12:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB502C433EF; Tue, 3 Jan 2023 18:12:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672769525; bh=xPTONxYTtJqAfLr329AYZvkB7QNlHmA83t7R/LUR6tk=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=UitcWki67SW8Yaw+gzxlB+xfxHuJUYr5//DsFeyELyyhtzpfQBr293EID2MYOCC2+ Cx8YmX5Tx52hkCAAZgpV5lcUztlxLcqY6egDV6miPF391jPIDXarRKeBOtHn+LhLGz oea6rMnLEU3BWg7EZffrInAHyFn4o5mc63+hLBPwTn6y6HHKTJVtugJq1IKHv2Nf5G YjYTIj5xQSaMBZSYNoEQ0wxhYlvFup/ouPUmIuNxsX6zraGPyJpFFDWLzPF9QMdaln YX5swWnvzQ+sKFfNkANzymsEBX/099LLHhpwWXsZWSVHBGESdCjbYbBtaq2dAe4Xfg LPFrkOKtuht1g== From: =?utf-8?B?QmrDtnJuIFTDtnBlbA==?= To: Pu Lehui , bpf@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Luke Nelson , Xi Wang , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Paul Walmsley , Palmer Dabbelt , Albert Ou , Pu Lehui , Pu Lehui Subject: Re: [RFC PATCH RESEND bpf-next 3/4] riscv, bpf: Add bpf_arch_text_poke support for RV64 In-Reply-To: <20221220021319.1655871-4-pulehui@huaweicloud.com> References: <20221220021319.1655871-1-pulehui@huaweicloud.com> <20221220021319.1655871-4-pulehui@huaweicloud.com> Date: Tue, 03 Jan 2023 19:12:02 +0100 Message-ID: <871qobqyh9.fsf@all.your.base.are.belong.to.us> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Pu Lehui writes: > From: Pu Lehui > > Implement bpf_arch_text_poke for RV64. For call scenario, > ftrace framework reserve 4 nops for RV64 kernel function > as function entry, and use auipc+jalr instructions to call > kernel or module functions. However, since the auipc+jalr > call instructions is non-atomic operation, we need to use > stop-machine to make sure instruction patching in atomic > context. As for jump scenario, since we only jump inside > the trampoline, a jal instruction is sufficient. Hmm, is that really true? More below! > > Signed-off-by: Pu Lehui > --- > arch/riscv/net/bpf_jit.h | 5 ++ > arch/riscv/net/bpf_jit_comp64.c | 131 +++++++++++++++++++++++++++++++- > 2 files changed, 134 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h > index d926e0f7ef57..bf9802a63061 100644 > --- a/arch/riscv/net/bpf_jit.h > +++ b/arch/riscv/net/bpf_jit.h > @@ -573,6 +573,11 @@ static inline u32 rv_fence(u8 pred, u8 succ) > return rv_i_insn(imm11_0, 0, 0, 0, 0xf); > } >=20=20 > +static inline u32 rv_nop(void) > +{ > + return rv_i_insn(0, 0, 0, 0, 0x13); > +} > + > /* RVC instrutions. */ >=20=20 > static inline u16 rvc_addi4spn(u8 rd, u32 imm10) > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_com= p64.c > index bf4721a99a09..fa8b03c52463 100644 > --- a/arch/riscv/net/bpf_jit_comp64.c > +++ b/arch/riscv/net/bpf_jit_comp64.c > @@ -8,6 +8,8 @@ > #include > #include > #include > +#include > +#include > #include "bpf_jit.h" >=20=20 > #define RV_REG_TCC RV_REG_A6 > @@ -238,7 +240,7 @@ static void __build_epilogue(bool is_tail_call, struc= t rv_jit_context *ctx) > if (!is_tail_call) > emit_mv(RV_REG_A0, RV_REG_A5, ctx); > emit_jalr(RV_REG_ZERO, is_tail_call ? RV_REG_T3 : RV_REG_RA, > - is_tail_call ? 4 : 0, /* skip TCC init */ > + is_tail_call ? 20 : 0, /* skip reserved nops and TCC init */ > ctx); > } >=20=20 > @@ -615,6 +617,127 @@ static int add_exception_handler(const struct bpf_i= nsn *insn, > return 0; > } >=20=20 > +struct text_poke_args { > + void *addr; > + const void *insns; > + size_t len; > + atomic_t cpu_count; > +}; > + > +static int do_text_poke(void *data) > +{ > + int ret =3D 0; > + struct text_poke_args *patch =3D data; > + > + if (atomic_inc_return(&patch->cpu_count) =3D=3D num_online_cpus()) { > + ret =3D patch_text_nosync(patch->addr, patch->insns, patch->len); > + atomic_inc(&patch->cpu_count); > + } else { > + while (atomic_read(&patch->cpu_count) <=3D num_online_cpus()) > + cpu_relax(); > + smp_mb(); > + } > + > + return ret; > +} > + > +static int bpf_text_poke_stop_machine(void *addr, const void *insns, siz= e_t len) > +{ > + struct text_poke_args patch =3D { > + .addr =3D addr, > + .insns =3D insns, > + .len =3D len, > + .cpu_count =3D ATOMIC_INIT(0), > + }; > + > + return stop_machine(do_text_poke, &patch, cpu_online_mask); > +} > + > +static int gen_call_or_nops(void *target, void *ip, u32 *insns) > +{ > + int i, ret; > + s64 rvoff; > + struct rv_jit_context ctx; > + > + ctx.ninsns =3D 0; > + ctx.insns =3D (u16 *)insns; > + > + if (!target) { > + for (i =3D 0; i < 4; i++) > + emit(rv_nop(), &ctx); > + return 0; > + } > + > + rvoff =3D (s64)(target - ip); > + emit(rv_sd(RV_REG_SP, -8, RV_REG_RA), &ctx); > + ret =3D emit_jump_and_link(RV_REG_RA, rvoff, false, &ctx); > + if (ret) > + return ret; > + emit(rv_ld(RV_REG_RA, -8, RV_REG_SP), &ctx); > + > + return 0; > + > +} > + > +static int bpf_text_poke_call(void *ip, void *old_addr, void *new_addr) > +{ > + int ret; > + u32 old_insns[4], new_insns[4]; > + > + ret =3D gen_call_or_nops(old_addr, ip + 4, old_insns); > + if (ret) > + return ret; > + > + ret =3D gen_call_or_nops(new_addr, ip + 4, new_insns); > + if (ret) > + return ret; > + > + mutex_lock(&text_mutex); > + if (memcmp(ip, old_insns, sizeof(old_insns))) { > + ret =3D -EFAULT; > + goto out; > + } > + > + if (memcmp(ip, new_insns, sizeof(new_insns))) > + ret =3D bpf_text_poke_stop_machine(ip, new_insns, > sizeof(new_insns)); I'd rather see that you added a patch_text variant to arch/riscv/kernel/patch.c (something like your bpf_text_poke_stop_machine()), and use that here. Might be other users of that as well -- Andy's ftrace patch maybe? :-) > +out: > + mutex_unlock(&text_mutex); > + return ret; > +} > + > +static int bpf_text_poke_jump(void *ip, void *old_addr, void *new_addr) > +{ > + int ret; > + u32 old_insn, new_insn; > + > + old_insn =3D old_addr ? rv_jal(RV_REG_ZERO, (s64)(old_addr - ip) >> 1) = : rv_nop(); > + new_insn =3D new_addr ? rv_jal(RV_REG_ZERO, (s64)(new_addr - ip) >> 1) = : rv_nop(); > + > + mutex_lock(&text_mutex); > + if (memcmp(ip, &old_insn, sizeof(old_insn))) { > + ret =3D -EFAULT; > + goto out; > + } > + > + if (memcmp(ip, &new_insn, sizeof(new_insn))) > + ret =3D patch_text_nosync(ip, &new_insn, sizeof(new_insn)); > +out: > + mutex_unlock(&text_mutex); > + return ret; > +} > + > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type, > + void *old_addr, void *new_addr) AFAIU there's nothing in the bpf_arch_text_poke() API that say that BPF_MOD_JUMP is jumps within the trampoline. That is one usage, but not the only one. In general, the jal might not have enough reach. I believe that this needs to be an auipc/jalr pair similar to BPF_MOD_CALL (w/o linked register).=20 And again, thanks for working on the RV trampoline! Bj=C3=B6rn