LinuxLists.cc - [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO

2022-12-02 10:11:25

Subject: [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC

From: Pu Lehui <[email protected]>

For BPF_PSEUDO_FUNC instruction, verifier will refill imm with
correct addresses of bpf_calls and then run last pass of JIT.
Since the emit_imm of RV64 is variable-length, which will emit
appropriate length instructions accorroding to the imm, it may
broke ctx->offset, and lead to unpredictable problem, such as
inaccurate jump. So let's fix it with fixed-length instructions.

Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Signed-off-by: Pu Lehui <[email protected]>
Suggested-by: Björn Töpel <[email protected]>
---
arch/riscv/net/bpf_jit_comp64.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index eb99df41fa33..9723f34f7a06 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -139,6 +139,19 @@ static bool in_auipc_jalr_range(s64 val)
val < ((1L << 31) - (1L << 11));
}

+/* Emit fixed-length instructions for address */
+static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
+{
+ u64 ip = (u64)(ctx->insns + ctx->ninsns);
+ s64 off = addr - ip;
+ s64 upper = (off + (1 << 11)) >> 12;
+ s64 lower = ((off & 0xfff) << 52) >> 52;
+
+ emit(rv_auipc(rd, upper), ctx);
+ emit(rv_addi(rd, rd, lower), ctx);
+}
+
+/* Emit variable-length instructions for 32-bit and 64-bit imm */
static void emit_imm(u8 rd, s64 val, struct rv_jit_context *ctx)
{
/* Note that the immediate from the add is sign-extended,
@@ -1053,7 +1066,12 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
u64 imm64;

imm64 = (u64)insn1.imm << 32 | (u32)imm;
- emit_imm(rd, imm64, ctx);
+ if (bpf_pseudo_func(insn))
+ /* fixed-length insns for extra jit pass */
+ emit_addr(rd, imm64, ctx);
+ else
+ emit_imm(rd, imm64, ctx);
+
return 1;
}

--
2.25.1

2022-12-02 11:54:21

by Björn Töpel

[permalink] [raw]

Subject: Re: [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC

Pu Lehui <[email protected]> writes:

> From: Pu Lehui <[email protected]>
>
> For BPF_PSEUDO_FUNC instruction, verifier will refill imm with
> correct addresses of bpf_calls and then run last pass of JIT.
> Since the emit_imm of RV64 is variable-length, which will emit
> appropriate length instructions accorroding to the imm, it may
> broke ctx->offset, and lead to unpredictable problem, such as
> inaccurate jump. So let's fix it with fixed-length instructions.
>
> Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
> Signed-off-by: Pu Lehui <[email protected]>
> Suggested-by: Björn Töpel <[email protected]>
> ---
> arch/riscv/net/bpf_jit_comp64.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
> index eb99df41fa33..9723f34f7a06 100644
> --- a/arch/riscv/net/bpf_jit_comp64.c
> +++ b/arch/riscv/net/bpf_jit_comp64.c
> @@ -139,6 +139,19 @@ static bool in_auipc_jalr_range(s64 val)
> val < ((1L << 31) - (1L << 11));
> }
>
> +/* Emit fixed-length instructions for address */
> +static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
> +{
> + u64 ip = (u64)(ctx->insns + ctx->ninsns);
> + s64 off = addr - ip;
> + s64 upper = (off + (1 << 11)) >> 12;
> + s64 lower = ((off & 0xfff) << 52) >> 52;
> +
> + emit(rv_auipc(rd, upper), ctx);
> + emit(rv_addi(rd, rd, lower), ctx);
> +}

Nice! Two instructions are better than 6! :-)

One final thing. Please add a sanity check, that the range is correct,
e.g.:

if (!(addr && in_auipc_addi_range(off)))
return -1;

Have a look at emit_jump_and_link().

Thanks!
Björn

2022-12-06 04:13:48

by Pu Lehui

[permalink] [raw]

Subject: Re: [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC

On 2022/12/2 18:54, Björn Töpel wrote:
> Pu Lehui <[email protected]> writes:
>
>> From: Pu Lehui <[email protected]>
>>
>> For BPF_PSEUDO_FUNC instruction, verifier will refill imm with
>> correct addresses of bpf_calls and then run last pass of JIT.
>> Since the emit_imm of RV64 is variable-length, which will emit
>> appropriate length instructions accorroding to the imm, it may
>> broke ctx->offset, and lead to unpredictable problem, such as
>> inaccurate jump. So let's fix it with fixed-length instructions.
>>
>> Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
>> Signed-off-by: Pu Lehui <[email protected]>
>> Suggested-by: Björn Töpel <[email protected]>
>> ---
>> arch/riscv/net/bpf_jit_comp64.c | 20 +++++++++++++++++++-
>> 1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
>> index eb99df41fa33..9723f34f7a06 100644
>> --- a/arch/riscv/net/bpf_jit_comp64.c
>> +++ b/arch/riscv/net/bpf_jit_comp64.c
>> @@ -139,6 +139,19 @@ static bool in_auipc_jalr_range(s64 val)
>> val < ((1L << 31) - (1L << 11));
>> }
>>
>> +/* Emit fixed-length instructions for address */
>> +static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
>> +{
>> + u64 ip = (u64)(ctx->insns + ctx->ninsns);
>> + s64 off = addr - ip;
>> + s64 upper = (off + (1 << 11)) >> 12;
>> + s64 lower = ((off & 0xfff) << 52) >> 52;
>> +
>> + emit(rv_auipc(rd, upper), ctx);
>> + emit(rv_addi(rd, rd, lower), ctx);
>> +}
>
> Nice! Two instructions are better than 6! :-)
>
> One final thing. Please add a sanity check, that the range is correct,
> e.g.:
>
> if (!(addr && in_auipc_addi_range(off)))
> return -1;
>

Hi Björn,

Sorry for replying so late. For BPF_PSEUDO_FUNC instruction, verifier
will set insn[0].imm and insn[1].imm to 1 that make addr to 0x100000001
before extra pass, and also ctx->insns is NULL in iteration stage, all
of these make off out of range of AUIPC-ADDI range, and return failed.
We could add some special handling at different stages, but that seems a
little weird. By the way, I do not really like emit_addr function with
return value.

While a proper address is at least 2B alignment, and the valid address
is from 0xffffffff00000000 to 0xffffffffffffffff, we can make address
shifed 1 place to right, and addr >> 1 will always in the range of
AUIPC-ADDI range. We can get rid of the range detection. The
implementation is as follows:

static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
{
s64 imm = addr >> 1;
s64 upper = (imm + (1 << 11)) >> 12;
s64 lower = imm & 0xfff;

emit(rv_lui(rd, upper), ctx);
emit(rv_addi(rd, rd, lower), ctx);
emit(rv_slli(rd, rd, 1), ctx);
}

What do you think?

Regards,
Lehui

> Have a look at emit_jump_and_link().
>
>
> Thanks!
> Björn

2022-12-06 08:03:17

by Björn Töpel

[permalink] [raw]

Subject: Re: [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC

Pu Lehui <[email protected]> writes:

> Sorry for replying so late. For BPF_PSEUDO_FUNC instruction, verifier
> will set insn[0].imm and insn[1].imm to 1 that make addr to 0x100000001
> before extra pass, and also ctx->insns is NULL in iteration stage, all
> of these make off out of range of AUIPC-ADDI range, and return failed.
> We could add some special handling at different stages, but that seems a
> little weird. By the way, I do not really like emit_addr function with
> return value.

My rational is that *if* for some reason the jit is passed an address
that auipc/addi can't represent, we'd like to catch that and not emit
broken code.

> While a proper address is at least 2B alignment, and the valid address
> is from 0xffffffff00000000 to 0xffffffffffffffff, we can make address
> shifed 1 place to right, and addr >> 1 will always in the range of
> AUIPC-ADDI range. We can get rid of the range detection. The
> implementation is as follows:
>
> static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
> {
> s64 imm = addr >> 1;
> s64 upper = (imm + (1 << 11)) >> 12;
> s64 lower = imm & 0xfff;
>
> emit(rv_lui(rd, upper), ctx);
> emit(rv_addi(rd, rd, lower), ctx);
> emit(rv_slli(rd, rd, 1), ctx);
> }
>
> What do you think?

That's a code generation penalty, instead of catching it at code
gen. Don't like! :-) I much prefer the auipc/addi version.

What do you think about the diff (on-top of your work) below?

--8<--
diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index aa9410eef77c..7acaf28cb3be 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -137,15 +137,21 @@ static bool in_auipc_jalr_range(s64 val)
}

/* Emit fixed-length instructions for address */
-static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
+static int emit_addr(u8 rd, u64 addr, bool extra_pass, struct rv_jit_context *ctx)
{
u64 ip = (u64)(ctx->insns + ctx->ninsns);
s64 off = addr - ip;
s64 upper = (off + (1 << 11)) >> 12;
s64 lower = ((off & 0xfff) << 52) >> 52;

+ if (extra_pass && !in_auipc_jalr_range(off)) {
+ pr_err("bpf-jit: target offset 0x%llx is out of range\n", off);
+ return -ERANGE;
+ }
+
emit(rv_auipc(rd, upper), ctx);
emit(rv_addi(rd, rd, lower), ctx);
+ return 0;
}

/* Emit variable-length instructions for 32-bit and 64-bit imm */
@@ -1061,13 +1067,17 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
{
struct bpf_insn insn1 = insn[1];
u64 imm64;
+ int ret;

imm64 = (u64)insn1.imm << 32 | (u32)imm;
- if (bpf_pseudo_func(insn))
+ if (bpf_pseudo_func(insn)) {
/* fixed-length insns for extra jit pass */
- emit_addr(rd, imm64, ctx);
- else
+ ret = emit_addr(rd, imm64, extra_pass, ctx);
+ if (ret)
+ return ret;
+ } else {
emit_imm(rd, imm64, ctx);
+ }

return 1;
}

--8<--

Wouldn't that work?

Björn

2022-12-06 08:34:47

by Pu Lehui

[permalink] [raw]

Subject: Re: [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC

On 2022/12/6 15:55, Björn Töpel wrote:
> Pu Lehui <[email protected]> writes:
>
>> Sorry for replying so late. For BPF_PSEUDO_FUNC instruction, verifier
>> will set insn[0].imm and insn[1].imm to 1 that make addr to 0x100000001
>> before extra pass, and also ctx->insns is NULL in iteration stage, all
>> of these make off out of range of AUIPC-ADDI range, and return failed.
>> We could add some special handling at different stages, but that seems a
>> little weird. By the way, I do not really like emit_addr function with
>> return value.
>
> My rational is that *if* for some reason the jit is passed an address
> that auipc/addi can't represent, we'd like to catch that and not emit
> broken code.
>
>> While a proper address is at least 2B alignment, and the valid address
>> is from 0xffffffff00000000 to 0xffffffffffffffff, we can make address
>> shifed 1 place to right, and addr >> 1 will always in the range of
>> AUIPC-ADDI range. We can get rid of the range detection. The
>> implementation is as follows:
>>
>> static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
>> {
>> s64 imm = addr >> 1;
>> s64 upper = (imm + (1 << 11)) >> 12;
>> s64 lower = imm & 0xfff;
>>
>> emit(rv_lui(rd, upper), ctx);
>> emit(rv_addi(rd, rd, lower), ctx);
>> emit(rv_slli(rd, rd, 1), ctx);
>> }
>>
>> What do you think?
>
> That's a code generation penalty, instead of catching it at code
> gen. Don't like! :-) I much prefer the auipc/addi version.
>
> What do you think about the diff (on-top of your work) below?
>
> --8<--
> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
> index aa9410eef77c..7acaf28cb3be 100644
> --- a/arch/riscv/net/bpf_jit_comp64.c
> +++ b/arch/riscv/net/bpf_jit_comp64.c
> @@ -137,15 +137,21 @@ static bool in_auipc_jalr_range(s64 val)
> }
>
> /* Emit fixed-length instructions for address */
> -static void emit_addr(u8 rd, u64 addr, struct rv_jit_context *ctx)
> +static int emit_addr(u8 rd, u64 addr, bool extra_pass, struct rv_jit_context *ctx)
> {
> u64 ip = (u64)(ctx->insns + ctx->ninsns);
> s64 off = addr - ip;
> s64 upper = (off + (1 << 11)) >> 12;
> s64 lower = ((off & 0xfff) << 52) >> 52;
>
> + if (extra_pass && !in_auipc_jalr_range(off)) {
> + pr_err("bpf-jit: target offset 0x%llx is out of range\n", off);
> + return -ERANGE;
> + }
> +
> emit(rv_auipc(rd, upper), ctx);
> emit(rv_addi(rd, rd, lower), ctx);
> + return 0;
> }
>
> /* Emit variable-length instructions for 32-bit and 64-bit imm */
> @@ -1061,13 +1067,17 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
> {
> struct bpf_insn insn1 = insn[1];
> u64 imm64;
> + int ret;
>
> imm64 = (u64)insn1.imm << 32 | (u32)imm;
> - if (bpf_pseudo_func(insn))
> + if (bpf_pseudo_func(insn)) {
> /* fixed-length insns for extra jit pass */
> - emit_addr(rd, imm64, ctx);
> - else
> + ret = emit_addr(rd, imm64, extra_pass, ctx);
> + if (ret)
> + return ret;
> + } else {
> emit_imm(rd, imm64, ctx);
> + }
>
> return 1;
> }
>
> --8<--
>
> Wouldn't that work?
>

It definitely works. But auipc+addi may be some holes, while
lui+addi+slli support all the address of kernel and module. And this
might be help for the future feature porting.

>
> Björn

2022-12-06 10:16:41

by Björn Töpel

[permalink] [raw]

Subject: Re: [PATCH bpf v2] riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC

Pu Lehui <[email protected]> writes:

>> Wouldn't that work?
>>
>
> It definitely works. But auipc+addi may be some holes, while
> lui+addi+slli support all the address of kernel and module. And this
> might be help for the future feature porting.

We're already using auipc/jalr for calls, and I'd say it *very* unlikely
that we'll hit the non-covered range. I'd say go with auipc/addi +
error, and we can change if this really is a problem.