2020-07-21 02:54:47

by Luke Nelson

[permalink] [raw]
Subject: [PATCH bpf-next v1 0/3] bpf, riscv: Add compressed instructions to rv64 JIT

This patch series enables using compressed riscv (RVC) instructions
in the rv64 BPF JIT.

RVC is a standard riscv extension that adds a set of compressed,
2-byte instructions that can replace some regular 4-byte instructions
for improved code density.

This series first modifies the JIT to support using 2-byte instructions
(e.g., in jump offset computations), then adds RVC encoding and
helper functions, and finally uses the helper functions to optimize
the rv64 JIT.

I used our formal verification framework, Serval, to verify the
correctness of the RVC encodings and their uses in the rv64 JIT.

The JIT continues to pass all tests in lib/test_bpf.c, and introduces
no new failures to test_verifier; both with and without RVC being enabled.

The following are examples of the JITed code for the verifier selftest
"direct packet read test#3 for CGROUP_SKB OK", without and with RVC
enabled, respectively. The former uses 178 bytes, and the latter uses 112,
for a ~37% reduction in code size for this example.

Without RVC:

0: 02000813 addi a6,zero,32
4: fd010113 addi sp,sp,-48
8: 02813423 sd s0,40(sp)
c: 02913023 sd s1,32(sp)
10: 01213c23 sd s2,24(sp)
14: 01313823 sd s3,16(sp)
18: 01413423 sd s4,8(sp)
1c: 03010413 addi s0,sp,48
20: 03056683 lwu a3,48(a0)
24: 02069693 slli a3,a3,0x20
28: 0206d693 srli a3,a3,0x20
2c: 03456703 lwu a4,52(a0)
30: 02071713 slli a4,a4,0x20
34: 02075713 srli a4,a4,0x20
38: 03856483 lwu s1,56(a0)
3c: 02049493 slli s1,s1,0x20
40: 0204d493 srli s1,s1,0x20
44: 03c56903 lwu s2,60(a0)
48: 02091913 slli s2,s2,0x20
4c: 02095913 srli s2,s2,0x20
50: 04056983 lwu s3,64(a0)
54: 02099993 slli s3,s3,0x20
58: 0209d993 srli s3,s3,0x20
5c: 09056a03 lwu s4,144(a0)
60: 020a1a13 slli s4,s4,0x20
64: 020a5a13 srli s4,s4,0x20
68: 00900313 addi t1,zero,9
6c: 006a7463 bgeu s4,t1,0x74
70: 00000a13 addi s4,zero,0
74: 02d52823 sw a3,48(a0)
78: 02e52a23 sw a4,52(a0)
7c: 02952c23 sw s1,56(a0)
80: 03252e23 sw s2,60(a0)
84: 05352023 sw s3,64(a0)
88: 00000793 addi a5,zero,0
8c: 02813403 ld s0,40(sp)
90: 02013483 ld s1,32(sp)
94: 01813903 ld s2,24(sp)
98: 01013983 ld s3,16(sp)
9c: 00813a03 ld s4,8(sp)
a0: 03010113 addi sp,sp,48
a4: 00078513 addi a0,a5,0
a8: 00008067 jalr zero,0(ra)

With RVC:

0: 02000813 addi a6,zero,32
4: 7179 c.addi16sp sp,-48
6: f422 c.sdsp s0,40(sp)
8: f026 c.sdsp s1,32(sp)
a: ec4a c.sdsp s2,24(sp)
c: e84e c.sdsp s3,16(sp)
e: e452 c.sdsp s4,8(sp)
10: 1800 c.addi4spn s0,sp,48
12: 03056683 lwu a3,48(a0)
16: 1682 c.slli a3,0x20
18: 9281 c.srli a3,0x20
1a: 03456703 lwu a4,52(a0)
1e: 1702 c.slli a4,0x20
20: 9301 c.srli a4,0x20
22: 03856483 lwu s1,56(a0)
26: 1482 c.slli s1,0x20
28: 9081 c.srli s1,0x20
2a: 03c56903 lwu s2,60(a0)
2e: 1902 c.slli s2,0x20
30: 02095913 srli s2,s2,0x20
34: 04056983 lwu s3,64(a0)
38: 1982 c.slli s3,0x20
3a: 0209d993 srli s3,s3,0x20
3e: 09056a03 lwu s4,144(a0)
42: 1a02 c.slli s4,0x20
44: 020a5a13 srli s4,s4,0x20
48: 4325 c.li t1,9
4a: 006a7363 bgeu s4,t1,0x50
4e: 4a01 c.li s4,0
50: d914 c.sw a3,48(a0)
52: d958 c.sw a4,52(a0)
54: dd04 c.sw s1,56(a0)
56: 03252e23 sw s2,60(a0)
5a: 05352023 sw s3,64(a0)
5e: 4781 c.li a5,0
60: 7422 c.ldsp s0,40(sp)
62: 7482 c.ldsp s1,32(sp)
64: 6962 c.ldsp s2,24(sp)
66: 69c2 c.ldsp s3,16(sp)
68: 6a22 c.ldsp s4,8(sp)
6a: 6145 c.addi16sp sp,48
6c: 853e c.mv a0,a5
6e: 8082 c.jr ra

RFC -> v1:
- From Björn Töpel:
* Changed RVOFF macro to static inline "ninsns_rvoff".
* Changed return type of rvc_ functions from u32 to u16.
* Changed sizeof(u16) to sizeof(*ctx->insns).
* Factored unsigned immediate checks into helper functions
(is_8b_uint, etc.)
* Changed to use IS_ENABLED instead of #ifdef to check if RVC is
enabled.
* Changed type of immediate arguments to rvc_* encoding to u32
to avoid issues from promotion of u16 to signed int.
* Cleaned up RVC checks in emit_{addi,slli,srli,srai}.
+ Wrapped lines at 100 instead of 80 columns for increased clarity.
+ Move !imm checks into each branch instead of checking
separately.
+ Strengthed checks for c.{slli,srli,srai} to check that
imm < XLEN. Otherwise, imm could be non-zero but the lower
XLEN bits could all be zero, leading to invalid RVC encoding.
* Changed emit_imm to sign-extend the 12-bit value in "lower"
+ The immediate checks for emit_{addiw,li,addi} use signed
comparisons, so this enables the RVC variants to be used
more often (e.g., if val == -1, then lower should be -1
as opposed to 4095).

Luke Nelson (3):
bpf, riscv: Modify JIT ctx to support compressed instructions
bpf, riscv: Add encodings for compressed instructions
bpf, riscv: Use compressed instructions in the rv64 JIT

arch/riscv/net/bpf_jit.h | 483 +++++++++++++++++++++++++++++++-
arch/riscv/net/bpf_jit_comp32.c | 14 +-
arch/riscv/net/bpf_jit_comp64.c | 293 ++++++++++---------
arch/riscv/net/bpf_jit_core.c | 6 +-
4 files changed, 643 insertions(+), 153 deletions(-)

--
2.25.1


2020-07-21 07:28:08

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH bpf-next v1 0/3] bpf, riscv: Add compressed instructions to rv64 JIT

On Tue, 21 Jul 2020 at 04:52, Luke Nelson <[email protected]> wrote:
>
> This patch series enables using compressed riscv (RVC) instructions
> in the rv64 BPF JIT.
>
> RVC is a standard riscv extension that adds a set of compressed,
> 2-byte instructions that can replace some regular 4-byte instructions
> for improved code density.
>
> This series first modifies the JIT to support using 2-byte instructions
> (e.g., in jump offset computations), then adds RVC encoding and
> helper functions, and finally uses the helper functions to optimize
> the rv64 JIT.
>
> I used our formal verification framework, Serval, to verify the
> correctness of the RVC encodings and their uses in the rv64 JIT.
>
> The JIT continues to pass all tests in lib/test_bpf.c, and introduces
> no new failures to test_verifier; both with and without RVC being enabled.
>
> The following are examples of the JITed code for the verifier selftest
> "direct packet read test#3 for CGROUP_SKB OK", without and with RVC
> enabled, respectively. The former uses 178 bytes, and the latter uses 112,
> for a ~37% reduction in code size for this example.
>
> Without RVC:
>
> 0: 02000813 addi a6,zero,32
> 4: fd010113 addi sp,sp,-48
> 8: 02813423 sd s0,40(sp)
> c: 02913023 sd s1,32(sp)
> 10: 01213c23 sd s2,24(sp)
> 14: 01313823 sd s3,16(sp)
> 18: 01413423 sd s4,8(sp)
> 1c: 03010413 addi s0,sp,48
> 20: 03056683 lwu a3,48(a0)
> 24: 02069693 slli a3,a3,0x20
> 28: 0206d693 srli a3,a3,0x20
> 2c: 03456703 lwu a4,52(a0)
> 30: 02071713 slli a4,a4,0x20
> 34: 02075713 srli a4,a4,0x20
> 38: 03856483 lwu s1,56(a0)
> 3c: 02049493 slli s1,s1,0x20
> 40: 0204d493 srli s1,s1,0x20
> 44: 03c56903 lwu s2,60(a0)
> 48: 02091913 slli s2,s2,0x20
> 4c: 02095913 srli s2,s2,0x20
> 50: 04056983 lwu s3,64(a0)
> 54: 02099993 slli s3,s3,0x20
> 58: 0209d993 srli s3,s3,0x20
> 5c: 09056a03 lwu s4,144(a0)
> 60: 020a1a13 slli s4,s4,0x20
> 64: 020a5a13 srli s4,s4,0x20
> 68: 00900313 addi t1,zero,9
> 6c: 006a7463 bgeu s4,t1,0x74
> 70: 00000a13 addi s4,zero,0
> 74: 02d52823 sw a3,48(a0)
> 78: 02e52a23 sw a4,52(a0)
> 7c: 02952c23 sw s1,56(a0)
> 80: 03252e23 sw s2,60(a0)
> 84: 05352023 sw s3,64(a0)
> 88: 00000793 addi a5,zero,0
> 8c: 02813403 ld s0,40(sp)
> 90: 02013483 ld s1,32(sp)
> 94: 01813903 ld s2,24(sp)
> 98: 01013983 ld s3,16(sp)
> 9c: 00813a03 ld s4,8(sp)
> a0: 03010113 addi sp,sp,48
> a4: 00078513 addi a0,a5,0
> a8: 00008067 jalr zero,0(ra)
>
> With RVC:
>
> 0: 02000813 addi a6,zero,32
> 4: 7179 c.addi16sp sp,-48
> 6: f422 c.sdsp s0,40(sp)
> 8: f026 c.sdsp s1,32(sp)
> a: ec4a c.sdsp s2,24(sp)
> c: e84e c.sdsp s3,16(sp)
> e: e452 c.sdsp s4,8(sp)
> 10: 1800 c.addi4spn s0,sp,48
> 12: 03056683 lwu a3,48(a0)
> 16: 1682 c.slli a3,0x20
> 18: 9281 c.srli a3,0x20
> 1a: 03456703 lwu a4,52(a0)
> 1e: 1702 c.slli a4,0x20
> 20: 9301 c.srli a4,0x20
> 22: 03856483 lwu s1,56(a0)
> 26: 1482 c.slli s1,0x20
> 28: 9081 c.srli s1,0x20
> 2a: 03c56903 lwu s2,60(a0)
> 2e: 1902 c.slli s2,0x20
> 30: 02095913 srli s2,s2,0x20
> 34: 04056983 lwu s3,64(a0)
> 38: 1982 c.slli s3,0x20
> 3a: 0209d993 srli s3,s3,0x20
> 3e: 09056a03 lwu s4,144(a0)
> 42: 1a02 c.slli s4,0x20
> 44: 020a5a13 srli s4,s4,0x20
> 48: 4325 c.li t1,9
> 4a: 006a7363 bgeu s4,t1,0x50
> 4e: 4a01 c.li s4,0
> 50: d914 c.sw a3,48(a0)
> 52: d958 c.sw a4,52(a0)
> 54: dd04 c.sw s1,56(a0)
> 56: 03252e23 sw s2,60(a0)
> 5a: 05352023 sw s3,64(a0)
> 5e: 4781 c.li a5,0
> 60: 7422 c.ldsp s0,40(sp)
> 62: 7482 c.ldsp s1,32(sp)
> 64: 6962 c.ldsp s2,24(sp)
> 66: 69c2 c.ldsp s3,16(sp)
> 68: 6a22 c.ldsp s4,8(sp)
> 6a: 6145 c.addi16sp sp,48
> 6c: 853e c.mv a0,a5
> 6e: 8082 c.jr ra
>
> RFC -> v1:
> - From Björn Töpel:
> * Changed RVOFF macro to static inline "ninsns_rvoff".
> * Changed return type of rvc_ functions from u32 to u16.
> * Changed sizeof(u16) to sizeof(*ctx->insns).
> * Factored unsigned immediate checks into helper functions
> (is_8b_uint, etc.)
> * Changed to use IS_ENABLED instead of #ifdef to check if RVC is
> enabled.
> * Changed type of immediate arguments to rvc_* encoding to u32
> to avoid issues from promotion of u16 to signed int.
> * Cleaned up RVC checks in emit_{addi,slli,srli,srai}.
> + Wrapped lines at 100 instead of 80 columns for increased clarity.
> + Move !imm checks into each branch instead of checking
> separately.
> + Strengthed checks for c.{slli,srli,srai} to check that
> imm < XLEN. Otherwise, imm could be non-zero but the lower
> XLEN bits could all be zero, leading to invalid RVC encoding.
> * Changed emit_imm to sign-extend the 12-bit value in "lower"
> + The immediate checks for emit_{addiw,li,addi} use signed
> comparisons, so this enables the RVC variants to be used
> more often (e.g., if val == -1, then lower should be -1
> as opposed to 4095).
>

Finally RVC support! Thank you!

For the series:
Reviewed-by: Björn Töpel <[email protected]>
Acked-by: Björn Töpel <[email protected]>

> Luke Nelson (3):
> bpf, riscv: Modify JIT ctx to support compressed instructions
> bpf, riscv: Add encodings for compressed instructions
> bpf, riscv: Use compressed instructions in the rv64 JIT
>
> arch/riscv/net/bpf_jit.h | 483 +++++++++++++++++++++++++++++++-
> arch/riscv/net/bpf_jit_comp32.c | 14 +-
> arch/riscv/net/bpf_jit_comp64.c | 293 ++++++++++---------
> arch/riscv/net/bpf_jit_core.c | 6 +-
> 4 files changed, 643 insertions(+), 153 deletions(-)
>
> --
> 2.25.1
>

2020-07-21 19:43:16

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH bpf-next v1 0/3] bpf, riscv: Add compressed instructions to rv64 JIT

On Tue, Jul 21, 2020 at 12:27 AM Björn Töpel <[email protected]> wrote:
>
> On Tue, 21 Jul 2020 at 04:52, Luke Nelson <[email protected]> wrote:
> >
> > This patch series enables using compressed riscv (RVC) instructions
> > in the rv64 BPF JIT.
> >
> > RVC is a standard riscv extension that adds a set of compressed,
> > 2-byte instructions that can replace some regular 4-byte instructions
> > for improved code density.
> >
> > This series first modifies the JIT to support using 2-byte instructions
> > (e.g., in jump offset computations), then adds RVC encoding and
> > helper functions, and finally uses the helper functions to optimize
> > the rv64 JIT.
> >
> > I used our formal verification framework, Serval, to verify the
> > correctness of the RVC encodings and their uses in the rv64 JIT.
> >
> > The JIT continues to pass all tests in lib/test_bpf.c, and introduces
> > no new failures to test_verifier; both with and without RVC being enabled.
> >
> > The following are examples of the JITed code for the verifier selftest
> > "direct packet read test#3 for CGROUP_SKB OK", without and with RVC
> > enabled, respectively. The former uses 178 bytes, and the latter uses 112,
> > for a ~37% reduction in code size for this example.
> >
> > Without RVC:
> >
> > 0: 02000813 addi a6,zero,32
> > 4: fd010113 addi sp,sp,-48
> > 8: 02813423 sd s0,40(sp)
> > c: 02913023 sd s1,32(sp)
> > 10: 01213c23 sd s2,24(sp)
> > 14: 01313823 sd s3,16(sp)
> > 18: 01413423 sd s4,8(sp)
> > 1c: 03010413 addi s0,sp,48
> > 20: 03056683 lwu a3,48(a0)
> > 24: 02069693 slli a3,a3,0x20
> > 28: 0206d693 srli a3,a3,0x20
> > 2c: 03456703 lwu a4,52(a0)
> > 30: 02071713 slli a4,a4,0x20
> > 34: 02075713 srli a4,a4,0x20
> > 38: 03856483 lwu s1,56(a0)
> > 3c: 02049493 slli s1,s1,0x20
> > 40: 0204d493 srli s1,s1,0x20
> > 44: 03c56903 lwu s2,60(a0)
> > 48: 02091913 slli s2,s2,0x20
> > 4c: 02095913 srli s2,s2,0x20
> > 50: 04056983 lwu s3,64(a0)
> > 54: 02099993 slli s3,s3,0x20
> > 58: 0209d993 srli s3,s3,0x20
> > 5c: 09056a03 lwu s4,144(a0)
> > 60: 020a1a13 slli s4,s4,0x20
> > 64: 020a5a13 srli s4,s4,0x20
> > 68: 00900313 addi t1,zero,9
> > 6c: 006a7463 bgeu s4,t1,0x74
> > 70: 00000a13 addi s4,zero,0
> > 74: 02d52823 sw a3,48(a0)
> > 78: 02e52a23 sw a4,52(a0)
> > 7c: 02952c23 sw s1,56(a0)
> > 80: 03252e23 sw s2,60(a0)
> > 84: 05352023 sw s3,64(a0)
> > 88: 00000793 addi a5,zero,0
> > 8c: 02813403 ld s0,40(sp)
> > 90: 02013483 ld s1,32(sp)
> > 94: 01813903 ld s2,24(sp)
> > 98: 01013983 ld s3,16(sp)
> > 9c: 00813a03 ld s4,8(sp)
> > a0: 03010113 addi sp,sp,48
> > a4: 00078513 addi a0,a5,0
> > a8: 00008067 jalr zero,0(ra)
> >
> > With RVC:
> >
> > 0: 02000813 addi a6,zero,32
> > 4: 7179 c.addi16sp sp,-48
> > 6: f422 c.sdsp s0,40(sp)
> > 8: f026 c.sdsp s1,32(sp)
> > a: ec4a c.sdsp s2,24(sp)
> > c: e84e c.sdsp s3,16(sp)
> > e: e452 c.sdsp s4,8(sp)
> > 10: 1800 c.addi4spn s0,sp,48
> > 12: 03056683 lwu a3,48(a0)
> > 16: 1682 c.slli a3,0x20
> > 18: 9281 c.srli a3,0x20
> > 1a: 03456703 lwu a4,52(a0)
> > 1e: 1702 c.slli a4,0x20
> > 20: 9301 c.srli a4,0x20
> > 22: 03856483 lwu s1,56(a0)
> > 26: 1482 c.slli s1,0x20
> > 28: 9081 c.srli s1,0x20
> > 2a: 03c56903 lwu s2,60(a0)
> > 2e: 1902 c.slli s2,0x20
> > 30: 02095913 srli s2,s2,0x20
> > 34: 04056983 lwu s3,64(a0)
> > 38: 1982 c.slli s3,0x20
> > 3a: 0209d993 srli s3,s3,0x20
> > 3e: 09056a03 lwu s4,144(a0)
> > 42: 1a02 c.slli s4,0x20
> > 44: 020a5a13 srli s4,s4,0x20
> > 48: 4325 c.li t1,9
> > 4a: 006a7363 bgeu s4,t1,0x50
> > 4e: 4a01 c.li s4,0
> > 50: d914 c.sw a3,48(a0)
> > 52: d958 c.sw a4,52(a0)
> > 54: dd04 c.sw s1,56(a0)
> > 56: 03252e23 sw s2,60(a0)
> > 5a: 05352023 sw s3,64(a0)
> > 5e: 4781 c.li a5,0
> > 60: 7422 c.ldsp s0,40(sp)
> > 62: 7482 c.ldsp s1,32(sp)
> > 64: 6962 c.ldsp s2,24(sp)
> > 66: 69c2 c.ldsp s3,16(sp)
> > 68: 6a22 c.ldsp s4,8(sp)
> > 6a: 6145 c.addi16sp sp,48
> > 6c: 853e c.mv a0,a5
> > 6e: 8082 c.jr ra
> >
> > RFC -> v1:
> > - From Björn Töpel:
> > * Changed RVOFF macro to static inline "ninsns_rvoff".
> > * Changed return type of rvc_ functions from u32 to u16.
> > * Changed sizeof(u16) to sizeof(*ctx->insns).
> > * Factored unsigned immediate checks into helper functions
> > (is_8b_uint, etc.)
> > * Changed to use IS_ENABLED instead of #ifdef to check if RVC is
> > enabled.
> > * Changed type of immediate arguments to rvc_* encoding to u32
> > to avoid issues from promotion of u16 to signed int.
> > * Cleaned up RVC checks in emit_{addi,slli,srli,srai}.
> > + Wrapped lines at 100 instead of 80 columns for increased clarity.
> > + Move !imm checks into each branch instead of checking
> > separately.
> > + Strengthed checks for c.{slli,srli,srai} to check that
> > imm < XLEN. Otherwise, imm could be non-zero but the lower
> > XLEN bits could all be zero, leading to invalid RVC encoding.
> > * Changed emit_imm to sign-extend the 12-bit value in "lower"
> > + The immediate checks for emit_{addiw,li,addi} use signed
> > comparisons, so this enables the RVC variants to be used
> > more often (e.g., if val == -1, then lower should be -1
> > as opposed to 4095).
> >
>
> Finally RVC support! Thank you!
>
> For the series:
> Reviewed-by: Björn Töpel <[email protected]>
> Acked-by: Björn Töpel <[email protected]>

Applied. Thanks