2023-12-07 03:30:57

by Tiezhu Yang

[permalink] [raw]
Subject: [PATCH v1 0/2] Fix failed test cases in test_bpf.ko for LoongArch

There are 5 failed test cases in test_bpf.ko if enable jit on LoongArch,
one of them is related with the test case itself and has been fixed in
bpf-next tree [1], the other failed test cases are related with bpf jit
and can be fixed with this patch series.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=5fa201f37c2e

Tiezhu Yang (2):
LoongArch: BPF: Fix sign-extension mov instructions
LoongArch: BPF: Fix unconditional bswap instructions

arch/loongarch/net/bpf_jit.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--
2.42.0


2023-12-07 03:31:10

by Tiezhu Yang

[permalink] [raw]
Subject: [PATCH v1 1/2] LoongArch: BPF: Fix sign-extension mov instructions

We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
in include/linux/filter.h, additionally, for BPF_ALU64 the value of
the destination register is unchanged whereas for BPF_ALU the upper
32 bits of the destination register are zeroed, so it should clear
the upper 32 bits for BPF_ALU.

[root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
[root@linux fedora]# modprobe test_bpf

Before:
test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)

After:
test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS

By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
also be fixed with this patch.

Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
Signed-off-by: Tiezhu Yang <[email protected]>
---
arch/loongarch/net/bpf_jit.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 169ff8b3915e..8c907c2c42f7 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
case 8:
move_reg(ctx, t1, src);
emit_insn(ctx, extwb, dst, t1);
+ emit_zext_32(ctx, dst, is32);
break;
case 16:
move_reg(ctx, t1, src);
emit_insn(ctx, extwh, dst, t1);
+ emit_zext_32(ctx, dst, is32);
break;
case 32:
emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
--
2.42.0

2023-12-07 04:12:51

by Huacai Chen

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] LoongArch: BPF: Fix sign-extension mov instructions

On Thu, Dec 7, 2023 at 11:30 AM Tiezhu Yang <[email protected]> wrote:
>
> We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
> in include/linux/filter.h, additionally, for BPF_ALU64 the value of
> the destination register is unchanged whereas for BPF_ALU the upper
> 32 bits of the destination register are zeroed, so it should clear
> the upper 32 bits for BPF_ALU.
>
> [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
> [root@linux fedora]# modprobe test_bpf
>
> Before:
> test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
>
> After:
> test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
> test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
>
> By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
> also be fixed with this patch.
Hmmm, it is a little strange that privileged verifier_movsx has no problem.

Huacai

>
> Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> arch/loongarch/net/bpf_jit.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 169ff8b3915e..8c907c2c42f7 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
> case 8:
> move_reg(ctx, t1, src);
> emit_insn(ctx, extwb, dst, t1);
> + emit_zext_32(ctx, dst, is32);
> break;
> case 16:
> move_reg(ctx, t1, src);
> emit_insn(ctx, extwh, dst, t1);
> + emit_zext_32(ctx, dst, is32);
> break;
> case 32:
> emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
> --
> 2.42.0
>
>

2023-12-09 06:04:10

by Hengqi Chen

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] LoongArch: BPF: Fix sign-extension mov instructions

On Thu, Dec 7, 2023 at 12:12 PM Huacai Chen <[email protected]> wrote:
>
> On Thu, Dec 7, 2023 at 11:30 AM Tiezhu Yang <[email protected]> wrote:
> >
> > We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
> > in include/linux/filter.h, additionally, for BPF_ALU64 the value of
> > the destination register is unchanged whereas for BPF_ALU the upper
> > 32 bits of the destination register are zeroed, so it should clear
> > the upper 32 bits for BPF_ALU.
> >
> > [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
> > [root@linux fedora]# modprobe test_bpf
> >
> > Before:
> > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> >
> > After:
> > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
> > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
> >
> > By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
> > also be fixed with this patch.
> Hmmm, it is a little strange that privileged verifier_movsx has no problem.
>

I have found the differences between priv and unpriv mode.
The BPF verifier performs different optimizations for priv and upriv progs.
See the following commits:

https://github.com/torvalds/linux/commit/e2ae4ca266a1c
https://github.com/torvalds/linux/commit/52875a04f4b26
https://github.com/torvalds/linux/commit/a1b14abc009d9

As a result, with unprivileged_bpf_disabled on, we have:

# bpftool p d x i 55
void mov32sx_s16_range_2():
; asm volatile (" \
0: (b7) r1 = 65535
1: (bc) w2 = w1
2: (77) r2 >>= 1
3: (b7) r0 = 1
4: (95) exit

# bpftool p d j i 55
void mov32sx_s16_range_2():
0xffff800002416238:
; asm volatile (" \
0: addi.d $a6, $zero, 33(0x21)
4: addi.d $sp, $sp, -64(0xfc0)
8: st.d $ra, $sp, 56(0x38)
c: st.d $fp, $sp, 48(0x30)
10: st.d $s0, $sp, 40(0x28)
14: st.d $s1, $sp, 32(0x20)
18: st.d $s2, $sp, 24(0x18)
1c: st.d $s3, $sp, 16(0x10)
20: st.d $s4, $sp, 8(0x8)
24: st.d $s5, $sp, 0
28: addi.d $fp, $sp, 64(0x40)
2c: lu12i.w $a0, 15(0xf)
30: ori $a0, $a0, 0xfff
34: move $t1, $a0
38: ext.w.h $a1, $t1
3c: srli.d $a1, $a1, 0x1
40: addi.w $a5, $zero, 1(0x1)
44: ld.d $ra, $sp, 56(0x38)
48: ld.d $fp, $sp, 48(0x30)
4c: ld.d $s0, $sp, 40(0x28)
50: ld.d $s1, $sp, 32(0x20)
54: ld.d $s2, $sp, 24(0x18)
58: ld.d $s3, $sp, 16(0x10)
5c: ld.d $s4, $sp, 8(0x8)
60: ld.d $s5, $sp, 0
64: addi.d $sp, $sp, 64(0x40)
68: move $a0, $a5
6c: jirl $zero, $ra, 0

With unprivileged_bpf_disabled off, we have:

# bpftool p d x i 59
0: (b7) r1 = 65535
1: (bc) w2 = w1
2: (77) r2 >>= 1
3: (55) if r2 != 0x7fffffff goto pc+2
4: (b7) r0 = 1
5: (95) exit
6: (05) goto pc-1
7: (05) goto pc-1

# bpftool p d j i 59
0xffff8000024146a0:
0: addi.d $a6, $zero, 33(0x21)
4: addi.d $sp, $sp, -64(0xfc0)
8: st.d $ra, $sp, 56(0x38)
c: st.d $fp, $sp, 48(0x30)
10: st.d $s0, $sp, 40(0x28)
14: st.d $s1, $sp, 32(0x20)
18: st.d $s2, $sp, 24(0x18)
1c: st.d $s3, $sp, 16(0x10)
20: st.d $s4, $sp, 8(0x8)
24: st.d $s5, $sp, 0
28: addi.d $fp, $sp, 64(0x40)
2c: lu12i.w $a0, 15(0xf)
30: ori $a0, $a0, 0xfff
34: move $t1, $a0
38: ext.w.h $a1, $t1
3c: srli.d $a1, $a1, 0x1
40: lu12i.w $t1, 524287(0x7ffff)
44: ori $t1, $t1, 0xfff
48: move $t2, $a1
4c: beq $t2, $t1, 8(0x8) # 0x0000000000000054
50: b 12(0xc) # 0x000000000000005c
54: addi.w $a5, $zero, 1(0x1)
58: b 12(0xc) # 0x0000000000000064
5c: b 0 # 0x000000000000005c
60: b 0 # 0x0000000000000060
64: ld.d $ra, $sp, 56(0x38)
68: ld.d $fp, $sp, 48(0x30)
6c: ld.d $s0, $sp, 40(0x28)
70: ld.d $s1, $sp, 32(0x20)
74: ld.d $s2, $sp, 24(0x18)
78: ld.d $s3, $sp, 16(0x10)
7c: ld.d $s4, $sp, 8(0x8)
80: ld.d $s5, $sp, 0
84: addi.d $sp, $sp, 64(0x40)
88: move $a0, $a5
8c: jirl $zero, $ra, 0

Without this fix, it seems like the prog is trapped in an infinite loop.

This patch looks good to me, so I am going to offer:

Acked-by: Hengqi Chen <[email protected]>

Cheers,
--
Hengqi

> Huacai
>
> >
> > Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
> > Signed-off-by: Tiezhu Yang <[email protected]>
> > ---
> > arch/loongarch/net/bpf_jit.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 169ff8b3915e..8c907c2c42f7 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
> > case 8:
> > move_reg(ctx, t1, src);
> > emit_insn(ctx, extwb, dst, t1);
> > + emit_zext_32(ctx, dst, is32);
> > break;
> > case 16:
> > move_reg(ctx, t1, src);
> > emit_insn(ctx, extwh, dst, t1);
> > + emit_zext_32(ctx, dst, is32);
> > break;
> > case 32:
> > emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
> > --
> > 2.42.0
> >
> >

2023-12-09 12:05:16

by Huacai Chen

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] LoongArch: BPF: Fix sign-extension mov instructions

Applied, thanks.

Huacai

On Sat, Dec 9, 2023 at 2:04 PM Hengqi Chen <[email protected]> wrote:
>
> On Thu, Dec 7, 2023 at 12:12 PM Huacai Chen <[email protected]> wrote:
> >
> > On Thu, Dec 7, 2023 at 11:30 AM Tiezhu Yang <[email protected]> wrote:
> > >
> > > We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg"
> > > in include/linux/filter.h, additionally, for BPF_ALU64 the value of
> > > the destination register is unchanged whereas for BPF_ALU the upper
> > > 32 bits of the destination register are zeroed, so it should clear
> > > the upper 32 bits for BPF_ALU.
> > >
> > > [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
> > > [root@linux fedora]# modprobe test_bpf
> > >
> > > Before:
> > > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> > > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
> > >
> > > After:
> > > test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
> > > test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
> > >
> > > By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
> > > also be fixed with this patch.
> > Hmmm, it is a little strange that privileged verifier_movsx has no problem.
> >
>
> I have found the differences between priv and unpriv mode.
> The BPF verifier performs different optimizations for priv and upriv progs.
> See the following commits:
>
> https://github.com/torvalds/linux/commit/e2ae4ca266a1c
> https://github.com/torvalds/linux/commit/52875a04f4b26
> https://github.com/torvalds/linux/commit/a1b14abc009d9
>
> As a result, with unprivileged_bpf_disabled on, we have:
>
> # bpftool p d x i 55
> void mov32sx_s16_range_2():
> ; asm volatile (" \
> 0: (b7) r1 = 65535
> 1: (bc) w2 = w1
> 2: (77) r2 >>= 1
> 3: (b7) r0 = 1
> 4: (95) exit
>
> # bpftool p d j i 55
> void mov32sx_s16_range_2():
> 0xffff800002416238:
> ; asm volatile (" \
> 0: addi.d $a6, $zero, 33(0x21)
> 4: addi.d $sp, $sp, -64(0xfc0)
> 8: st.d $ra, $sp, 56(0x38)
> c: st.d $fp, $sp, 48(0x30)
> 10: st.d $s0, $sp, 40(0x28)
> 14: st.d $s1, $sp, 32(0x20)
> 18: st.d $s2, $sp, 24(0x18)
> 1c: st.d $s3, $sp, 16(0x10)
> 20: st.d $s4, $sp, 8(0x8)
> 24: st.d $s5, $sp, 0
> 28: addi.d $fp, $sp, 64(0x40)
> 2c: lu12i.w $a0, 15(0xf)
> 30: ori $a0, $a0, 0xfff
> 34: move $t1, $a0
> 38: ext.w.h $a1, $t1
> 3c: srli.d $a1, $a1, 0x1
> 40: addi.w $a5, $zero, 1(0x1)
> 44: ld.d $ra, $sp, 56(0x38)
> 48: ld.d $fp, $sp, 48(0x30)
> 4c: ld.d $s0, $sp, 40(0x28)
> 50: ld.d $s1, $sp, 32(0x20)
> 54: ld.d $s2, $sp, 24(0x18)
> 58: ld.d $s3, $sp, 16(0x10)
> 5c: ld.d $s4, $sp, 8(0x8)
> 60: ld.d $s5, $sp, 0
> 64: addi.d $sp, $sp, 64(0x40)
> 68: move $a0, $a5
> 6c: jirl $zero, $ra, 0
>
> With unprivileged_bpf_disabled off, we have:
>
> # bpftool p d x i 59
> 0: (b7) r1 = 65535
> 1: (bc) w2 = w1
> 2: (77) r2 >>= 1
> 3: (55) if r2 != 0x7fffffff goto pc+2
> 4: (b7) r0 = 1
> 5: (95) exit
> 6: (05) goto pc-1
> 7: (05) goto pc-1
>
> # bpftool p d j i 59
> 0xffff8000024146a0:
> 0: addi.d $a6, $zero, 33(0x21)
> 4: addi.d $sp, $sp, -64(0xfc0)
> 8: st.d $ra, $sp, 56(0x38)
> c: st.d $fp, $sp, 48(0x30)
> 10: st.d $s0, $sp, 40(0x28)
> 14: st.d $s1, $sp, 32(0x20)
> 18: st.d $s2, $sp, 24(0x18)
> 1c: st.d $s3, $sp, 16(0x10)
> 20: st.d $s4, $sp, 8(0x8)
> 24: st.d $s5, $sp, 0
> 28: addi.d $fp, $sp, 64(0x40)
> 2c: lu12i.w $a0, 15(0xf)
> 30: ori $a0, $a0, 0xfff
> 34: move $t1, $a0
> 38: ext.w.h $a1, $t1
> 3c: srli.d $a1, $a1, 0x1
> 40: lu12i.w $t1, 524287(0x7ffff)
> 44: ori $t1, $t1, 0xfff
> 48: move $t2, $a1
> 4c: beq $t2, $t1, 8(0x8) # 0x0000000000000054
> 50: b 12(0xc) # 0x000000000000005c
> 54: addi.w $a5, $zero, 1(0x1)
> 58: b 12(0xc) # 0x0000000000000064
> 5c: b 0 # 0x000000000000005c
> 60: b 0 # 0x0000000000000060
> 64: ld.d $ra, $sp, 56(0x38)
> 68: ld.d $fp, $sp, 48(0x30)
> 6c: ld.d $s0, $sp, 40(0x28)
> 70: ld.d $s1, $sp, 32(0x20)
> 74: ld.d $s2, $sp, 24(0x18)
> 78: ld.d $s3, $sp, 16(0x10)
> 7c: ld.d $s4, $sp, 8(0x8)
> 80: ld.d $s5, $sp, 0
> 84: addi.d $sp, $sp, 64(0x40)
> 88: move $a0, $a5
> 8c: jirl $zero, $ra, 0
>
> Without this fix, it seems like the prog is trapped in an infinite loop.
>
> This patch looks good to me, so I am going to offer:
>
> Acked-by: Hengqi Chen <[email protected]>
>
> Cheers,
> --
> Hengqi
>
> > Huacai
> >
> > >
> > > Fixes: f48012f16150 ("LoongArch: BPF: Support sign-extension mov instructions")
> > > Signed-off-by: Tiezhu Yang <[email protected]>
> > > ---
> > > arch/loongarch/net/bpf_jit.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > index 169ff8b3915e..8c907c2c42f7 100644
> > > --- a/arch/loongarch/net/bpf_jit.c
> > > +++ b/arch/loongarch/net/bpf_jit.c
> > > @@ -480,10 +480,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
> > > case 8:
> > > move_reg(ctx, t1, src);
> > > emit_insn(ctx, extwb, dst, t1);
> > > + emit_zext_32(ctx, dst, is32);
> > > break;
> > > case 16:
> > > move_reg(ctx, t1, src);
> > > emit_insn(ctx, extwh, dst, t1);
> > > + emit_zext_32(ctx, dst, is32);
> > > break;
> > > case 32:
> > > emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
> > > --
> > > 2.42.0
> > >
> > >
>