---------- Forwarded message ---------
From: Kuee k1r0a <[email protected]>
Date: Mon, Aug 8, 2022 at 11:11 PM
Subject: Re: [PATCH bpf] bpf: Do more tight ALU bounds tracking
To: Daniel Borkmann <[email protected]>
On Mon, Aug 8, 2022 at 9:25 PM Daniel Borkmann <[email protected]> wrote:
>
> On 7/30/22 12:48 AM, Hao Luo wrote:
> > On Fri, Jul 29, 2022 at 3:43 PM Youlin Li <[email protected]> wrote:
> >>
> >> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
> >> to get more tight bounds tracking. Similar operation can be found in
> >> reg_set_min_max().
> >>
> >> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
> >>
> >> Before:
> >>
> >> func#0 @0
> >> 0: R1=ctx(off=0,imm=0) R10=fp0
> >> 0: (b7) r0 = 0 ; R0_w=0
> >> 1: (b7) r1 = 0 ; R1_w=0
> >> 2: (87) r1 = -r1 ; R1_w=scalar()
> >> 3: (87) r1 = -r1 ; R1_w=scalar()
> >> 4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
> >> 5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff)) <--- [*]
> >> 6: (95) exit
> >>
> >> It can be seen that even if the 64bit bounds is clear here, the 32bit
> >> bounds is still in the state of 'UNKNOWN'.
> >>
> >> After:
> >>
> >> func#0 @0
> >> 0: R1=ctx(off=0,imm=0) R10=fp0
> >> 0: (b7) r0 = 0 ; R0_w=0
> >> 1: (b7) r1 = 0 ; R1_w=0
> >> 2: (87) r1 = -r1 ; R1_w=scalar()
> >> 3: (87) r1 = -r1 ; R1_w=scalar()
> >> 4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
> >> 5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3)) <--- [*]
> >> 6: (95) exit
> >>
> >> Signed-off-by: Youlin Li <[email protected]>
> >
> > Looks good to me. Thanks Youlin.
> >
> > Acked-by: Hao Luo <[email protected]>
>
> Thanks Youlin! Looks like the patch breaks CI [0] e.g.:
>
> #142/p bounds check after truncation of non-boundary-crossing range FAIL
> Failed to load prog 'Permission denied'!
> invalid access to map value, value_size=8 off=16777215 size=1
> R0 max value is outside of the allowed memory range
> verification time 296 usec
> stack depth 8
> processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>
> Please take a look. Also it would be great to add a test_verifier selftest to
> assert above case from commit log against future changes.
>
> Thanks,
> Daniel
>
> [0] https://github.com/kernel-patches/bpf/runs/7696324041?check_suite_focus=true
This test case fails because the 32bit boundary information is lost
after the 11th instruction is executed:
Before:
11: (07) r1 += 2147483647 ;
R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000;
0xffffffff),u32_min=2147483647,u32_max=-2147483394)
After:
11: (07) r1 += 2147483647 ;
R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000;
0xffffffff))
This may be because, in previous versions of the code, when
__reg_combine_64_into_32() was called, the 32bit boundary was
completely deduced from the 64bit boundary, so there was a call to
__mark_reg32_unbounded() in __reg_combine_64_into_32().
But now, before adjust_scalar_min_max_vals() calls
__reg_combine_64_into_32() , the 32bit bounds are already calculated
to some extent, and __mark_reg32_unbounded() will eliminate these
information.
Simply copying a code without __mark_reg32_unbounded() should work,
perhaps it would be more elegant to introduce a flag into
__reg_combine_64_into_32()?
Sorry for not completing the tests because I did not 'make selftests'
successfully, and uploaded the code that caused the error.
On 8/8/22 5:14 PM, Kuee k1r0a wrote:
> ---------- Forwarded message ---------
> From: Kuee k1r0a <[email protected]>
> Date: Mon, Aug 8, 2022 at 11:11 PM
> Subject: Re: [PATCH bpf] bpf: Do more tight ALU bounds tracking
> To: Daniel Borkmann <[email protected]>
>
>
> On Mon, Aug 8, 2022 at 9:25 PM Daniel Borkmann <[email protected]> wrote:
>>
>> On 7/30/22 12:48 AM, Hao Luo wrote:
>>> On Fri, Jul 29, 2022 at 3:43 PM Youlin Li <[email protected]> wrote:
>>>>
>>>> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
>>>> to get more tight bounds tracking. Similar operation can be found in
>>>> reg_set_min_max().
>>>>
>>>> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
>>>>
>>>> Before:
>>>>
>>>> func#0 @0
>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>> 0: (b7) r0 = 0 ; R0_w=0
>>>> 1: (b7) r1 = 0 ; R1_w=0
>>>> 2: (87) r1 = -r1 ; R1_w=scalar()
>>>> 3: (87) r1 = -r1 ; R1_w=scalar()
>>>> 4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
>>>> 5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff)) <--- [*]
>>>> 6: (95) exit
>>>>
>>>> It can be seen that even if the 64bit bounds is clear here, the 32bit
>>>> bounds is still in the state of 'UNKNOWN'.
>>>>
>>>> After:
>>>>
>>>> func#0 @0
>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>> 0: (b7) r0 = 0 ; R0_w=0
>>>> 1: (b7) r1 = 0 ; R1_w=0
>>>> 2: (87) r1 = -r1 ; R1_w=scalar()
>>>> 3: (87) r1 = -r1 ; R1_w=scalar()
>>>> 4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
>>>> 5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3)) <--- [*]
>>>> 6: (95) exit
>>>>
>>>> Signed-off-by: Youlin Li <[email protected]>
>>>
>>> Looks good to me. Thanks Youlin.
>>>
>>> Acked-by: Hao Luo <[email protected]>
>>
>> Thanks Youlin! Looks like the patch breaks CI [0] e.g.:
>>
>> #142/p bounds check after truncation of non-boundary-crossing range FAIL
>> Failed to load prog 'Permission denied'!
>> invalid access to map value, value_size=8 off=16777215 size=1
>> R0 max value is outside of the allowed memory range
>> verification time 296 usec
>> stack depth 8
>> processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>
>> Please take a look. Also it would be great to add a test_verifier selftest to
>> assert above case from commit log against future changes.
>>
>> Thanks,
>> Daniel
>>
>> [0] https://github.com/kernel-patches/bpf/runs/7696324041?check_suite_focus=true
>
> This test case fails because the 32bit boundary information is lost
> after the 11th instruction is executed:
> Before:
> 11: (07) r1 += 2147483647 ;
> R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000;
> 0xffffffff),u32_min=2147483647,u32_max=-2147483394)
> After:
> 11: (07) r1 += 2147483647 ;
> R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000;
> 0xffffffff))
>
> This may be because, in previous versions of the code, when
> __reg_combine_64_into_32() was called, the 32bit boundary was
> completely deduced from the 64bit boundary, so there was a call to
> __mark_reg32_unbounded() in __reg_combine_64_into_32().
>
> But now, before adjust_scalar_min_max_vals() calls
> __reg_combine_64_into_32() , the 32bit bounds are already calculated
> to some extent, and __mark_reg32_unbounded() will eliminate these
> information.
>
> Simply copying a code without __mark_reg32_unbounded() should work,
> perhaps it would be more elegant to introduce a flag into
> __reg_combine_64_into_32()?
>
> Sorry for not completing the tests because I did not 'make selftests'
> successfully, and uploaded the code that caused the error.
Under tools/testing/selftests/bpf/, you can run test_progs and test_verifier
through the vmtest script, e.g. `./vmtest.sh -- ./test_progs` should ease
running it. The whole `make selftests` is not necessary given here we care
about BPF, CI is running these where 2 failed and need investigation:
test_progs: PASS
test_progs-no_alu32: FAIL (returned 1)
test_maps: PASS
test_verifier: FAIL (returned 1)
Fwiw, for the test_verifier failure case at least, we should then adapt it
in a separate commit with an analysis explaining why it is okay to alter the
test; plus a 3rd commit adding new test cases as mentioned earlier.
Thanks a lot, Kuee!
Daniel
Add a test case to ensure that 32-bit bounds can be learned from 64-bit
bounds when performing 64-bit ALU operations.
Make use of dead code elimination, so that we can see the verifier
bailing out on unmodified kernels.
Before:
./test_verifier 165
#165/p 32-bit bounds update in ALU64 FAIL
Failed to load prog 'Permission denied'!
R2 !read_ok
verification time 49 usec
stack depth 0
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states
0 peak_states 0 mark_read 0
Summary: 0 PASSED, 1 SKIPPED, 1 FAILED
After:
./test_verifier 165
#165/p 32-bit bounds update in ALU64 OK
Summary: 1 PASSED, 1 SKIPPED, 0 FAILED
Signed-off-by: Youlin Li <[email protected]>
---
tools/testing/selftests/bpf/verifier/bounds.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/tools/testing/selftests/bpf/verifier/bounds.c b/tools/testing/selftests/bpf/verifier/bounds.c
index 33125d5f6772..b9aee2f2c66e 100644
--- a/tools/testing/selftests/bpf/verifier/bounds.c
+++ b/tools/testing/selftests/bpf/verifier/bounds.c
@@ -753,3 +753,20 @@
.result_unpriv = REJECT,
.result = ACCEPT,
},
+{
+ "32-bit bounds update in ALU64",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_MOV64_IMM(BPF_REG_1, 0),
+ BPF_ALU64_IMM(BPF_NEG, BPF_REG_1, 0),
+ BPF_ALU64_IMM(BPF_NEG, BPF_REG_1, 0),
+ BPF_ALU64_IMM(BPF_ARSH, BPF_REG_1, 63),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 2),
+ BPF_JMP32_IMM(BPF_JGE, BPF_REG_1, 1, 1),
+ BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+ BPF_JMP32_IMM(BPF_JLE, BPF_REG_1, 2, 1),
+ BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+ BPF_EXIT_INSN()
+ },
+ .result = ACCEPT,
+},
--
2.25.1
The commit ("bpf: Do more tight ALU bounds tracking") introduces a bug
that fails some selftests.
in previous versions of the code, when
__reg_combine_64_into_32() was called, the 32bit boundary was
completely deduced from the 64bit boundary, so there was a call to
__mark_reg32_unbounded() in __reg_combine_64_into_32(). But before
adjust_scalar_min_max_vals() calls
__reg_combine_64_into_32() , the 32bit bounds are already calculated
to some extent, and __mark_reg32_unbounded() will eliminate these
information.
Simply remove the call to __reg_combine_64_into_32() and copying a code
without __mark_reg32_unbounded() should work.
Before:
./test_verifier 142
#142/p bounds check after truncation of non-boundary-crossing range FAIL
Failed to load prog 'Permission denied'!
invalid access to map value, value_size=8 off=16777215 size=1
R0 max value is outside of the allowed memory range
verification time 149 usec
stack depth 8
processed 15 insns (limit 1000000) max_states_per_insn 0
total_states 0 peak_states 0 mark_read 0
Summary: 0 PASSED, 1 SKIPPED, 1 FAILED
After:
./test_verifier 142
#142/p bounds check after truncation of non-boundary-crossing range OK
Summary: 1 PASSED, 1 SKIPPED, 0 FAILED
Signed-off-by: Youlin Li <[email protected]>
---
kernel/bpf/verifier.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 11d8bb54ba6b..7ea6e0372d62 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9014,7 +9014,17 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
/* ALU32 ops are zero extended into 64bit register */
zext_32_to_64(dst_reg);
} else {
- __reg_combine_64_into_32(dst_reg);
+ if (__reg64_bound_s32(dst_reg->smin_value) &&
+ __reg64_bound_s32(dst_reg->smax_value)) {
+ dst_reg->s32_min_value = (s32)dst_reg->smin_value;
+ dst_reg->s32_max_value = (s32)dst_reg->smax_value;
+ }
+ if (__reg64_bound_u32(dst_reg->umin_value) &&
+ __reg64_bound_u32(dst_reg->umax_value)) {
+ dst_reg->u32_min_value = (u32)dst_reg->umin_value;
+ dst_reg->u32_max_value = (u32)dst_reg->umax_value;
+ }
+ reg_bounds_sync(dst_reg);
}
return 0;
}
--
2.25.1
On 8/10/22 12:08 PM, Youlin Li wrote:
> The commit ("bpf: Do more tight ALU bounds tracking") introduces a bug
> that fails some selftests.
>
> in previous versions of the code, when
> __reg_combine_64_into_32() was called, the 32bit boundary was
> completely deduced from the 64bit boundary, so there was a call to
> __mark_reg32_unbounded() in __reg_combine_64_into_32(). But before
> adjust_scalar_min_max_vals() calls
> __reg_combine_64_into_32() , the 32bit bounds are already calculated
> to some extent, and __mark_reg32_unbounded() will eliminate these
> information.
>
> Simply remove the call to __reg_combine_64_into_32() and copying a code
> without __mark_reg32_unbounded() should work.
>
> Before:
> ./test_verifier 142
> #142/p bounds check after truncation of non-boundary-crossing range FAIL
> Failed to load prog 'Permission denied'!
> invalid access to map value, value_size=8 off=16777215 size=1
> R0 max value is outside of the allowed memory range
> verification time 149 usec
> stack depth 8
> processed 15 insns (limit 1000000) max_states_per_insn 0
> total_states 0 peak_states 0 mark_read 0
> Summary: 0 PASSED, 1 SKIPPED, 1 FAILED
>
> After:
> ./test_verifier 142
> #142/p bounds check after truncation of non-boundary-crossing range OK
> Summary: 1 PASSED, 1 SKIPPED, 0 FAILED
>
> Signed-off-by: Youlin Li <[email protected]>
> ---
> kernel/bpf/verifier.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 11d8bb54ba6b..7ea6e0372d62 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -9014,7 +9014,17 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
> /* ALU32 ops are zero extended into 64bit register */
> zext_32_to_64(dst_reg);
> } else {
> - __reg_combine_64_into_32(dst_reg);
> + if (__reg64_bound_s32(dst_reg->smin_value) &&
> + __reg64_bound_s32(dst_reg->smax_value)) {
> + dst_reg->s32_min_value = (s32)dst_reg->smin_value;
> + dst_reg->s32_max_value = (s32)dst_reg->smax_value;
> + }
> + if (__reg64_bound_u32(dst_reg->umin_value) &&
> + __reg64_bound_u32(dst_reg->umax_value)) {
> + dst_reg->u32_min_value = (u32)dst_reg->umin_value;
> + dst_reg->u32_max_value = (u32)dst_reg->umax_value;
> + }
> + reg_bounds_sync(dst_reg);
Hm, this doesn't apply to the bpf tree. Is this on top of your previous patch [0]?
Please squash both together in that case and resubmit your previous one as a v2.
[0] https://lore.kernel.org/bpf/[email protected]/
Thanks,
Daniel
In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
to get more tight bounds tracking. Similar operation can be found in
reg_set_min_max().
Note that we cannot simply add a call to __reg_combine_64_into_32(). In
previous versions of the code, when __reg_combine_64_into_32() was
called, the 32bit boundary was completely deduced from the 64bit
boundary, so there was a call to __mark_reg32_unbounded() in
__reg_combine_64_into_32(). But in adjust_scalar_min_max_vals(), the 32bit
bounds are already calculated to some extent, and __mark_reg32_unbounded()
will eliminate these information.
Simply copying a code without __mark_reg32_unbounded() should work.
Also, we can now fold reg_bounds_sync() into zext_32_to_64().
Before:
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r0 = 0 ; R0_w=0
1: (b7) r1 = 0 ; R1_w=0
2: (87) r1 = -r1 ; R1_w=scalar()
3: (87) r1 = -r1 ; R1_w=scalar()
4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff)) <--- [*]
6: (95) exit
It can be seen that even if the 64bit bounds is clear here, the 32bit
bounds is still in the state of 'UNKNOWN'.
After:
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r0 = 0 ; R0_w=0
1: (b7) r1 = 0 ; R1_w=0
2: (87) r1 = -r1 ; R1_w=scalar()
3: (87) r1 = -r1 ; R1_w=scalar()
4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3)) <--- [*]
6: (95) exit
Signed-off-by: Youlin Li <[email protected]>
---
v1 -> v2:
Replaced the call to __reg_combine_64_into_32() with the code in
__reg_combine_64_into_32(), and removed the call to
__mark_reg32_unbounded().
Sorry for the delay, I've been busy looking for a job recently :)
kernel/bpf/verifier.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3eadb14e090b..b7403773e834 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4383,6 +4383,7 @@ static void zext_32_to_64(struct bpf_reg_state *reg)
{
reg->var_off = tnum_subreg(reg->var_off);
__reg_assign_32_into_64(reg);
+ reg_bounds_sync(reg);
}
/* truncate register to smaller size (in bytes)
@@ -9010,10 +9011,22 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
break;
}
- /* ALU32 ops are zero extended into 64bit register */
- if (alu32)
+ if (alu32) {
+ /* ALU32 ops are zero extended into 64bit register */
zext_32_to_64(dst_reg);
- reg_bounds_sync(dst_reg);
+ } else {
+ if (__reg64_bound_s32(dst_reg->smin_value) &&
+ __reg64_bound_s32(dst_reg->smax_value)) {
+ dst_reg->s32_min_value = (s32)dst_reg->smin_value;
+ dst_reg->s32_max_value = (s32)dst_reg->smax_value;
+ }
+ if (__reg64_bound_u32(dst_reg->umin_value) &&
+ __reg64_bound_u32(dst_reg->umax_value)) {
+ dst_reg->u32_min_value = (u32)dst_reg->umin_value;
+ dst_reg->u32_max_value = (u32)dst_reg->umax_value;
+ }
+ reg_bounds_sync(dst_reg);
+ }
return 0;
}
@@ -9202,7 +9215,6 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
insn->dst_reg);
}
zext_32_to_64(dst_reg);
- reg_bounds_sync(dst_reg);
}
} else {
/* case: R = imm
--
2.25.1
Hi Youlin,
On Sat, Aug 27, 2022 at 6:57 AM Youlin Li <[email protected]> wrote:
>
> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
> to get more tight bounds tracking. Similar operation can be found in
> reg_set_min_max().
>
> Note that we cannot simply add a call to __reg_combine_64_into_32(). In
> previous versions of the code, when __reg_combine_64_into_32() was
> called, the 32bit boundary was completely deduced from the 64bit
> boundary, so there was a call to __mark_reg32_unbounded() in
> __reg_combine_64_into_32(). But in adjust_scalar_min_max_vals(), the 32bit
> bounds are already calculated to some extent, and __mark_reg32_unbounded()
> will eliminate these information.
>
> Simply copying a code without __mark_reg32_unbounded() should work.
>
> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
>
> Before:
>
> func#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> 0: (b7) r0 = 0 ; R0_w=0
> 1: (b7) r1 = 0 ; R1_w=0
> 2: (87) r1 = -r1 ; R1_w=scalar()
> 3: (87) r1 = -r1 ; R1_w=scalar()
> 4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
> 5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff)) <--- [*]
> 6: (95) exit
>
> It can be seen that even if the 64bit bounds is clear here, the 32bit
> bounds is still in the state of 'UNKNOWN'.
>
> After:
>
> func#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> 0: (b7) r0 = 0 ; R0_w=0
> 1: (b7) r1 = 0 ; R1_w=0
> 2: (87) r1 = -r1 ; R1_w=scalar()
> 3: (87) r1 = -r1 ; R1_w=scalar()
> 4: (c7) r1 s>>= 63 ; R1_w=scalar(smin=-1,smax=0)
> 5: (07) r1 += 2 ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3)) <--- [*]
> 6: (95) exit
>
> Signed-off-by: Youlin Li <[email protected]>
> ---
It might be better to put the code that performs the actual bounds
deduction into a helper function. It avoids code duplication. But the
current version looks fine to me. Thanks for the patch!
Acked-by: Hao Luo <[email protected]>