2023-10-24 12:40:39

by Hao Sun

[permalink] [raw]
Subject: bpf: shift-out-of-bounds in tnum_rshift()

Hi,

The following program can trigger a shift-out-of-bounds in
tnum_rshift(), called by scalar32_min_max_rsh():

0: (bc) w0 = w1
1: (bf) r2 = r0
2: (18) r3 = 0xd
4: (bc) w4 = w0
5: (bf) r5 = r0
6: (bf) r7 = r3
7: (bf) r8 = r4
8: (2f) r8 *= r5
9: (cf) r5 s>>= r5
10: (a6) if w8 < 0xfffffffb goto pc+10
11: (1f) r7 -= r5
12: (71) r6 = *(u8 *)(r1 +17)
13: (5f) r3 &= r8
14: (74) w2 >>= 30
15: (1f) r7 -= r5
16: (5d) if r8 != r6 goto pc+4
17: (c7) r8 s>>= 5
18: (cf) r0 s>>= r0
19: (7f) r0 >>= r0
20: (7c) w5 >>= w8 # shift-out-bounds here
21: exit

After load:
================================================================================
UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
shift exponent 255 is too large for 64-bit type 'long long unsigned int'
CPU: 2 PID: 8574 Comm: bpf-test Not tainted
6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:217 [inline]
__ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
do_check kernel/bpf/verifier.c:16890 [inline]
do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
do_check_main kernel/bpf/verifier.c:19626 [inline]
bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
__sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
__do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
__x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x5610511e23cd
Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
</TASK>

If remove insn #20, the verifier gives:
-------- Verifier Log --------
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (bc) w0 = w1 ;
R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
R1=ctx(off=0,
imm=0)
1: (bf) r2 = r0 ;
R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
0xffffffff))
R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
2: (18) r3 = 0xd ; R3_w=13
4: (bc) w4 = w0 ;
R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
0xffffffff))
R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
5: (bf) r5 = r0 ;
R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
0xffffffff))
R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
6: (bf) r7 = r3 ; R3_w=13 R7_w=13
7: (bf) r8 = r4 ;
R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
0xffffffff))
R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
8: (2f) r8 *= r5 ;
R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
0xffffffff))
R8_w=scalar()
9: (cf) r5 s>>= r5 ; R5_w=scalar()
10: (a6) if w8 < 0xfffffffb goto pc+9 ;
R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
var_off=(0x0; 0xff))
13: (5f) r3 &= r8 ;
R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
14: (74) w2 >>= 30 ;
R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
0x3))
15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
16: (5d) if r8 != r6 goto pc+3 ;
R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
var_off=(0xfffffff8; 0x7))
R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
17: (c7) r8 s>>= 5 ; R8_w=134217727
18: (cf) r0 s>>= r0 ; R0_w=scalar()
19: (7f) r0 >>= r0 ; R0=scalar()
20: (95) exit

from 16 to 20: safe

from 10 to 20: safe
processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
1 peak_states 1 mark_read 1
-------- End of Verifier Log --------

In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
the check here:
if (umax_val >= insn_bitness) {
/* Shifts greater than 31 or 63 are undefined.
* This includes shifts by a negative number.
*/
mark_reg_unknown(env, regs, insn->dst_reg);
break;
}

However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.

Should we check if(src_reg->u32_max_value < insn_bitness) before calling
scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
because issues somewhere else, incorrectly setting u32_min_value to
34217727

Best
Hao Sun


2023-10-25 12:31:52

by Hao Sun

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
>
> Hi,
>
> The following program can trigger a shift-out-of-bounds in
> tnum_rshift(), called by scalar32_min_max_rsh():
>
> 0: (bc) w0 = w1
> 1: (bf) r2 = r0
> 2: (18) r3 = 0xd
> 4: (bc) w4 = w0
> 5: (bf) r5 = r0
> 6: (bf) r7 = r3
> 7: (bf) r8 = r4
> 8: (2f) r8 *= r5
> 9: (cf) r5 s>>= r5
> 10: (a6) if w8 < 0xfffffffb goto pc+10
> 11: (1f) r7 -= r5
> 12: (71) r6 = *(u8 *)(r1 +17)
> 13: (5f) r3 &= r8
> 14: (74) w2 >>= 30
> 15: (1f) r7 -= r5
> 16: (5d) if r8 != r6 goto pc+4
> 17: (c7) r8 s>>= 5
> 18: (cf) r0 s>>= r0
> 19: (7f) r0 >>= r0
> 20: (7c) w5 >>= w8 # shift-out-bounds here
> 21: exit
>

Here are the c macros for the above program in case anyone needs this:

// 0: (bc) w0 = w1
BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
// 1: (bf) r2 = r0
BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
// 2: (18) r3 = 0xd
BPF_LD_IMM64(BPF_REG_3, 0xd),
// 4: (bc) w4 = w0
BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
// 5: (bf) r5 = r0
BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
// 6: (bf) r7 = r3
BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
// 7: (bf) r8 = r4
BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
// 8: (2f) r8 *= r5
BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
// 9: (cf) r5 s>>= r5
BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
// 10: (a6) if w8 < 0xfffffffb goto pc+10
BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
// 11: (1f) r7 -= r5
BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
// 12: (71) r6 = *(u8 *)(r1 +17)
BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
// 13: (5f) r3 &= r8
BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
// 14: (74) w2 >>= 30
BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
// 15: (1f) r7 -= r5
BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
// 16: (5d) if r8 != r6 goto pc+4
BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
// 17: (c7) r8 s>>= 5
BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
// 18: (cf) r0 s>>= r0
BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
// 19: (7f) r0 >>= r0
BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
// 20: (7c) w5 >>= w8
BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
BPF_EXIT_INSN()

> After load:
> ================================================================================
> UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> ubsan_epilogue lib/ubsan.c:217 [inline]
> __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> do_check kernel/bpf/verifier.c:16890 [inline]
> do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> do_check_main kernel/bpf/verifier.c:19626 [inline]
> bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x5610511e23cd
> Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> </TASK>
>
> If remove insn #20, the verifier gives:
> -------- Verifier Log --------
> func#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> 0: (bc) w0 = w1 ;
> R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> R1=ctx(off=0,
> imm=0)
> 1: (bf) r2 = r0 ;
> R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 2: (18) r3 = 0xd ; R3_w=13
> 4: (bc) w4 = w0 ;
> R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 5: (bf) r5 = r0 ;
> R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> 7: (bf) r8 = r4 ;
> R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 8: (2f) r8 *= r5 ;
> R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R8_w=scalar()
> 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> var_off=(0x0; 0xff))
> 13: (5f) r3 &= r8 ;
> R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> 14: (74) w2 >>= 30 ;
> R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> 0x3))
> 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> 16: (5d) if r8 != r6 goto pc+3 ;
> R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> var_off=(0xfffffff8; 0x7))
> R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> 17: (c7) r8 s>>= 5 ; R8_w=134217727
> 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> 19: (7f) r0 >>= r0 ; R0=scalar()
> 20: (95) exit
>
> from 16 to 20: safe
>
> from 10 to 20: safe
> processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> 1 peak_states 1 mark_read 1
> -------- End of Verifier Log --------
>
> In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> the check here:
> if (umax_val >= insn_bitness) {
> /* Shifts greater than 31 or 63 are undefined.
> * This includes shifts by a negative number.
> */
> mark_reg_unknown(env, regs, insn->dst_reg);
> break;
> }
>
> However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
>
> Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> because issues somewhere else, incorrectly setting u32_min_value to
> 34217727
>
> Best
> Hao Sun

2023-10-25 13:21:18

by Hao Sun

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, Oct 25, 2023 at 2:31 PM Hao Sun <[email protected]> wrote:
>
> On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
> >
> > Hi,
> >
> > The following program can trigger a shift-out-of-bounds in
> > tnum_rshift(), called by scalar32_min_max_rsh():
> >
> > 0: (bc) w0 = w1
> > 1: (bf) r2 = r0
> > 2: (18) r3 = 0xd
> > 4: (bc) w4 = w0
> > 5: (bf) r5 = r0
> > 6: (bf) r7 = r3
> > 7: (bf) r8 = r4
> > 8: (2f) r8 *= r5
> > 9: (cf) r5 s>>= r5
> > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > 11: (1f) r7 -= r5
> > 12: (71) r6 = *(u8 *)(r1 +17)
> > 13: (5f) r3 &= r8
> > 14: (74) w2 >>= 30
> > 15: (1f) r7 -= r5
> > 16: (5d) if r8 != r6 goto pc+4
> > 17: (c7) r8 s>>= 5
> > 18: (cf) r0 s>>= r0
> > 19: (7f) r0 >>= r0
> > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > 21: exit
> >
>
> Here are the c macros for the above program in case anyone needs this:
>
> // 0: (bc) w0 = w1
> BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
> // 1: (bf) r2 = r0
> BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
> // 2: (18) r3 = 0xd
> BPF_LD_IMM64(BPF_REG_3, 0xd),
> // 4: (bc) w4 = w0
> BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
> // 5: (bf) r5 = r0
> BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
> // 6: (bf) r7 = r3
> BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
> // 7: (bf) r8 = r4
> BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
> // 8: (2f) r8 *= r5
> BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
> // 9: (cf) r5 s>>= r5
> BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
> // 10: (a6) if w8 < 0xfffffffb goto pc+10
> BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
> // 11: (1f) r7 -= r5
> BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> // 12: (71) r6 = *(u8 *)(r1 +17)
> BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
> // 13: (5f) r3 &= r8
> BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
> // 14: (74) w2 >>= 30
> BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
> // 15: (1f) r7 -= r5
> BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> // 16: (5d) if r8 != r6 goto pc+4
> BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
> // 17: (c7) r8 s>>= 5
> BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
> // 18: (cf) r0 s>>= r0
> BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
> // 19: (7f) r0 >>= r0
> BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
> // 20: (7c) w5 >>= w8
> BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
> BPF_EXIT_INSN()
>
> > After load:
> > ================================================================================
> > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > ubsan_epilogue lib/ubsan.c:217 [inline]
> > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > do_check kernel/bpf/verifier.c:16890 [inline]
> > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > RIP: 0033:0x5610511e23cd
> > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > </TASK>
> >

Here is another similar one, which can probably be fixed in the same
patch. Build the kernel with UBSAN and run the following repro can
easily reproduce this.

C reproducer: https://pastebin.com/raw/zNfHaBnj

================================================================================
UBSAN: shift-out-of-bounds in kernel/bpf/verifier.c:13049:63
shift exponent 55 is too large for 32-bit type 'int'
CPU: 3 PID: 8614 Comm: poc Not tainted 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #22
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:217 [inline]
__ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
scalar32_min_max_arsh kernel/bpf/verifier.c:13049 [inline]
adjust_scalar_min_max_vals kernel/bpf/verifier.c:13237 [inline]
adjust_reg_min_max_vals.cold+0x116/0x353 kernel/bpf/verifier.c:13338
do_check kernel/bpf/verifier.c:16890 [inline]
do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
do_check_main kernel/bpf/verifier.c:19626 [inline]
bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
__sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
__do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
__x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x43b0a9
Code: 48 83 c4 28 c3 e8 17 1a 00 00 0f 1f 80 00 00 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48
RSP: 002b:00007fffec705b18 EFLAGS: 00000202 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007fffec705ff8 RCX: 000000000043b0a9
RDX: 0000000000000080 RSI: 00007fffec705b30 RDI: 0000000000000005
RBP: 00007fffec705c00 R08: 0000000400000002 R09: 0000003e00000000
R10: 00000000000000fc R11: 0000000000000202 R12: 0000000000000001
R13: 00007fffec705fe8 R14: 0000000000000001 R15: 0000000000000001
</TASK>

2023-10-25 14:11:24

by Shung-Hsi Yu

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

Hi Hao,

On Wed, Oct 25, 2023 at 02:31:02PM +0200, Hao Sun wrote:
> On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
> >
> > Hi,
> >
> > The following program can trigger a shift-out-of-bounds in
> > tnum_rshift(), called by scalar32_min_max_rsh():
> >
> > 0: (bc) w0.
= w1
> > 1: (bf) r2 = r0
> > 2: (18) r3 = 0xd
> > 4: (bc) w4 = w0
> > 5: (bf) r5 = r0
> > 6: (bf) r7 = r3
> > 7: (bf) r8 = r4
> > 8: (2f) r8 *= r5
> > 9: (cf) r5 s>>= r5
> > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > 11: (1f) r7 -= r5
> > 12: (71) r6 = *(u8 *)(r1 +17)
> > 13: (5f) r3 &= r8
> > 14: (74) w2 >>= 30
> > 15: (1f) r7 -= r5
> > 16: (5d) if r8 != r6 goto pc+4
> > 17: (c7) r8 s>>= 5
> > 18: (cf) r0 s>>= r0
> > 19: (7f) r0 >>= r0
> > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > 21: exit
> >
>
> Here are the c macros for the above program in case anyone needs this:
>
> // 0: (bc) w0 = w1
> BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
> // 1: (bf) r2 = r0
> BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
> // 2: (18) r3 = 0xd
> BPF_LD_IMM64(BPF_REG_3, 0xd),
> // 4: (bc) w4 = w0
> BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
> // 5: (bf) r5 = r0
> BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
> // 6: (bf) r7 = r3
> BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
> // 7: (bf) r8 = r4
> BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
> // 8: (2f) r8 *= r5
> BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
> // 9: (cf) r5 s>>= r5
> BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
> // 10: (a6) if w8 < 0xfffffffb goto pc+10
> BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
> // 11: (1f) r7 -= r5
> BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> // 12: (71) r6 = *(u8 *)(r1 +17)
> BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
> // 13: (5f) r3 &= r8
> BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
> // 14: (74) w2 >>= 30
> BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
> // 15: (1f) r7 -= r5
> BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> // 16: (5d) if r8 != r6 goto pc+4
> BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
> // 17: (c7) r8 s>>= 5
> BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
> // 18: (cf) r0 s>>= r0
> BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
> // 19: (7f) r0 >>= r0
> BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
> // 20: (7c) w5 >>= w8
> BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
> BPF_EXIT_INSN()
>
> > After load:
> > ================================================================================
> > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > ubsan_epilogue lib/ubsan.c:217 [inline]
> > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > do_check kernel/bpf/verifier.c:16890 [inline]
> > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > RIP: 0033:0x5610511e23cd
> > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > </TASK>
> >
> > If remove insn #20, the verifier gives:
> > -------- Verifier Log --------
> > func#0 @0
> > 0: R1=ctx(off=0,imm=0) R10=fp0
> > 0: (bc) w0 = w1 ;
> > R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > R1=ctx(off=0,
> > imm=0)
> > 1: (bf) r2 = r0 ;
> > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 2: (18) r3 = 0xd ; R3_w=13
> > 4: (bc) w4 = w0 ;
> > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 5: (bf) r5 = r0 ;
> > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> > 7: (bf) r8 = r4 ;
> > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 8: (2f) r8 *= r5 ;
> > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R8_w=scalar()
> > 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> > 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> > umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> > 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> > var_off=(0x0; 0xff))
> > 13: (5f) r3 &= r8 ;
> > R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> > 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > 14: (74) w2 >>= 30 ;
> > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> > 0x3))
> > 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > 16: (5d) if r8 != r6 goto pc+3 ;
> > R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> > var_off=(0xfffffff8; 0x7))
> > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)

Seems like the root cause is a bug with range tracking, before instruction
16, R8_w was

R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)

But after instruction 16 it becomes

R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)

Where smin_value > smax_value, and umin_value > umax_value (among other
things). This should be the main problem.

The verifier operates on the assumption that smin_value <= smax_value and
umin_value <= umax_value, and if that assumption is not upheld then all kind
of things can go wrong.

Maybe Andrii may already has this worked out in the range-vs-range that he
has mentioned[1] he'll be sending soon.

1: https://lore.kernel.org/bpf/CAEf4BzbJ3hZCSt4nLCZCV4cxV60+kddiSMsy7-9ou_RaQV7B8A@mail.gmail.com/

> > 17: (c7) r8 s>>= 5 ; R8_w=134217727
> > 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> > 19: (7f) r0 >>= r0 ; R0=scalar()
> > 20: (95) exit
> >
> > from 16 to 20: safe
> >
> > from 10 to 20: safe
> > processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> > 1 peak_states 1 mark_read 1
> > -------- End of Verifier Log --------
> >
> > In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> > the check here:
> > if (umax_val >= insn_bitness) {
> > /* Shifts greater than 31 or 63 are undefined.
> > * This includes shifts by a negative number.
> > */
> > mark_reg_unknown(env, regs, insn->dst_reg);
> > break;
> > }
> >
> > However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> > src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
> >
> > Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> > scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> > because issues somewhere else, incorrectly setting u32_min_value to
> > 34217727

Checking umax_val alone is be enough and we don't need to add a check for
u32_max_value, because (when we have correct range tracking) u32_max_value
should always be smaller than u32_value. So the fix needed here is to have
correct range tracking.

> > Best
> > Hao Sun

2023-10-25 14:19:02

by Shung-Hsi Yu

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, Oct 25, 2023 at 10:09:27PM +0800, Shung-Hsi Yu wrote:
> Hi Hao,
>
> On Wed, Oct 25, 2023 at 02:31:02PM +0200, Hao Sun wrote:
> > On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > The following program can trigger a shift-out-of-bounds in
> > > tnum_rshift(), called by scalar32_min_max_rsh():
> > >
> > > 0: (bc) w0.
> = w1
> > > 1: (bf) r2 = r0
> > > 2: (18) r3 = 0xd
> > > 4: (bc) w4 = w0
> > > 5: (bf) r5 = r0
> > > 6: (bf) r7 = r3
> > > 7: (bf) r8 = r4
> > > 8: (2f) r8 *= r5
> > > 9: (cf) r5 s>>= r5
> > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > 11: (1f) r7 -= r5
> > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > 13: (5f) r3 &= r8
> > > 14: (74) w2 >>= 30
> > > 15: (1f) r7 -= r5
> > > 16: (5d) if r8 != r6 goto pc+4
> > > 17: (c7) r8 s>>= 5
> > > 18: (cf) r0 s>>= r0
> > > 19: (7f) r0 >>= r0
> > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > 21: exit
> > >
> >
> > Here are the c macros for the above program in case anyone needs this:
> >
> > // 0: (bc) w0 = w1
> > BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
> > // 1: (bf) r2 = r0
> > BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
> > // 2: (18) r3 = 0xd
> > BPF_LD_IMM64(BPF_REG_3, 0xd),
> > // 4: (bc) w4 = w0
> > BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
> > // 5: (bf) r5 = r0
> > BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
> > // 6: (bf) r7 = r3
> > BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
> > // 7: (bf) r8 = r4
> > BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
> > // 8: (2f) r8 *= r5
> > BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
> > // 9: (cf) r5 s>>= r5
> > BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
> > // 10: (a6) if w8 < 0xfffffffb goto pc+10
> > BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
> > // 11: (1f) r7 -= r5
> > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > // 12: (71) r6 = *(u8 *)(r1 +17)
> > BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
> > // 13: (5f) r3 &= r8
> > BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
> > // 14: (74) w2 >>= 30
> > BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
> > // 15: (1f) r7 -= r5
> > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > // 16: (5d) if r8 != r6 goto pc+4
> > BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
> > // 17: (c7) r8 s>>= 5
> > BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
> > // 18: (cf) r0 s>>= r0
> > BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
> > // 19: (7f) r0 >>= r0
> > BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
> > // 20: (7c) w5 >>= w8
> > BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
> > BPF_EXIT_INSN()
> >
> > > After load:
> > > ================================================================================
> > > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > > Call Trace:
> > > <TASK>
> > > __dump_stack lib/dump_stack.c:88 [inline]
> > > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > > ubsan_epilogue lib/ubsan.c:217 [inline]
> > > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > > do_check kernel/bpf/verifier.c:16890 [inline]
> > > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > RIP: 0033:0x5610511e23cd
> > > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > > </TASK>
> > >
> > > If remove insn #20, the verifier gives:
> > > -------- Verifier Log --------
> > > func#0 @0
> > > 0: R1=ctx(off=0,imm=0) R10=fp0
> > > 0: (bc) w0 = w1 ;
> > > R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > R1=ctx(off=0,
> > > imm=0)
> > > 1: (bf) r2 = r0 ;
> > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 2: (18) r3 = 0xd ; R3_w=13
> > > 4: (bc) w4 = w0 ;
> > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 5: (bf) r5 = r0 ;
> > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> > > 7: (bf) r8 = r4 ;
> > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 8: (2f) r8 *= r5 ;
> > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R8_w=scalar()
> > > 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> > > 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> > > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> > > umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> > > 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> > > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> > > var_off=(0x0; 0xff))
> > > 13: (5f) r3 &= r8 ;
> > > R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> > > 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > > 14: (74) w2 >>= 30 ;
> > > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> > > 0x3))
> > > 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > 16: (5d) if r8 != r6 goto pc+3 ;
> > > R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> > > var_off=(0xfffffff8; 0x7))
> > > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
>
> Seems like the root cause is a bug with range tracking, before instruction
> 16, R8_w was
>
> R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
>
> But after instruction 16 it becomes
>
> R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
>
> Where smin_value > smax_value, and umin_value > umax_value (among other
> things). This should be the main problem.
>
> The verifier operates on the assumption that smin_value <= smax_value and
> umin_value <= umax_value, and if that assumption is not upheld then all kind
> of things can go wrong.
>
> Maybe Andrii may already has this worked out in the range-vs-range that he
> has mentioned[1] he'll be sending soon.
>
> 1: https://lore.kernel.org/bpf/CAEf4BzbJ3hZCSt4nLCZCV4cxV60+kddiSMsy7-9ou_RaQV7B8A@mail.gmail.com/
>
> > > 17: (c7) r8 s>>= 5 ; R8_w=134217727
> > > 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> > > 19: (7f) r0 >>= r0 ; R0=scalar()
> > > 20: (95) exit
> > >
> > > from 16 to 20: safe
> > >
> > > from 10 to 20: safe
> > > processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> > > 1 peak_states 1 mark_read 1
> > > -------- End of Verifier Log --------
> > >
> > > In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> > > the check here:
> > > if (umax_val >= insn_bitness) {
> > > /* Shifts greater than 31 or 63 are undefined.
> > > * This includes shifts by a negative number.
> > > */
> > > mark_reg_unknown(env, regs, insn->dst_reg);
> > > break;
> > > }
> > >
> > > However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> > > src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
> > >
> > > Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> > > scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> > > because issues somewhere else, incorrectly setting u32_min_value to
> > > 34217727
>
> Checking umax_val alone is be enough and we don't need to add a check for
> u32_max_value, because (when we have correct range tracking) u32_max_value
> should always be smaller than u32_value. So the fix needed here is to have
> correct range tracking.

Since you are running fuzzer on BPF verifier, it may be be helpful to check
that the following conditions are always met:
- umin <= umax
- smin <= smax
- u32_min <= u32_max
- s32_min <= s32_max
- u32_max <= umax
- smin <= s32_min
- s32_max <= smax

If any of these condition is not upheld then it should be a range tracking
bug.

> > > Best
> > > Hao Sun

2023-10-25 17:35:08

by Eduard Zingerman

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Tue, 2023-10-24 at 14:40 +0200, Hao Sun wrote:
> Hi,
>
> The following program can trigger a shift-out-of-bounds in
> tnum_rshift(), called by scalar32_min_max_rsh():
>
> 0: (bc) w0 = w1
> 1: (bf) r2 = r0
> 2: (18) r3 = 0xd
> 4: (bc) w4 = w0
> 5: (bf) r5 = r0
> 6: (bf) r7 = r3
> 7: (bf) r8 = r4
> 8: (2f) r8 *= r5
> 9: (cf) r5 s>>= r5
> 10: (a6) if w8 < 0xfffffffb goto pc+10
> 11: (1f) r7 -= r5
> 12: (71) r6 = *(u8 *)(r1 +17)
> 13: (5f) r3 &= r8
> 14: (74) w2 >>= 30
> 15: (1f) r7 -= r5
> 16: (5d) if r8 != r6 goto pc+4
> 17: (c7) r8 s>>= 5
> 18: (cf) r0 s>>= r0
> 19: (7f) r0 >>= r0
> 20: (7c) w5 >>= w8 # shift-out-bounds here
> 21: exit

Here is a simplified example:

SEC("?tp")
__success __retval(0)
__naked void large_shifts(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \n\
r8 = r0; \n\
r6 = r0; \n\
r6 &= 0xf; \n\
if w8 < 0xffffffff goto +2; \n\
if r8 != r6 goto +1; \n\
w0 >>= w8; /* shift-out-bounds here */ \n\
exit; \n\
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}

The issue is caused by an invalid range assigned to R8 after R8 != R6
check, here is GDB log:

(gdb) bt
#0 scalar32_min_max_rsh ... at kernel/bpf/verifier.c:13368
#1 0xffffffff81295236 in adjust_scalar_min_max_vals ... at kernel/bpf/verifier.c:13592
#2 adjust_reg_min_max_vals .... at kernel/bpf/verifier.c:13706
#3 0xffffffff8128701f in check_alu_op ... at kernel/bpf/verifier.c:13938
#4 do_check ... at kernel/bpf/verifier.c:17327
(gdb) p *src_reg
$2 = {
type = SCALAR_VALUE,
...
smin_value = 4294967295,
smax_value = 15,
umin_value = 4294967295,
umax_value = 15,
s32_min_value = -1,
s32_max_value = -1,
u32_min_value = 4294967295,
u32_max_value = 4294967295,
...
}

The invalid range is assigned within reg_combine_min_max() function in
BPF_JNE branch. The following diff removes the error:

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 857d76694517..3d140bf85282 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14485,7 +14485,7 @@ static void reg_combine_min_max(struct bpf_reg_state *true_src,
__reg_combine_min_max(true_src, true_dst);
break;
case BPF_JNE:
- __reg_combine_min_max(false_src, false_dst);
+ //__reg_combine_min_max(false_src, false_dst);
break;
}
}

I do not understand what BPF_JNE branch logically means in
reg_combine_min_max(), does anyone has any insight?

> After load:
> ================================================================================
> UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> ubsan_epilogue lib/ubsan.c:217 [inline]
> __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> do_check kernel/bpf/verifier.c:16890 [inline]
> do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> do_check_main kernel/bpf/verifier.c:19626 [inline]
> bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x5610511e23cd
> Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> </TASK>
>
> If remove insn #20, the verifier gives:
> -------- Verifier Log --------
> func#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> 0: (bc) w0 = w1 ;
> R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> R1=ctx(off=0,
> imm=0)
> 1: (bf) r2 = r0 ;
> R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 2: (18) r3 = 0xd ; R3_w=13
> 4: (bc) w4 = w0 ;
> R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 5: (bf) r5 = r0 ;
> R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> 7: (bf) r8 = r4 ;
> R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> 8: (2f) r8 *= r5 ;
> R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> 0xffffffff))
> R8_w=scalar()
> 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> var_off=(0x0; 0xff))
> 13: (5f) r3 &= r8 ;
> R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> 14: (74) w2 >>= 30 ;
> R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> 0x3))
> 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> 16: (5d) if r8 != r6 goto pc+3 ;
> R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> var_off=(0xfffffff8; 0x7))
> R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> 17: (c7) r8 s>>= 5 ; R8_w=134217727
> 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> 19: (7f) r0 >>= r0 ; R0=scalar()
> 20: (95) exit
>
> from 16 to 20: safe
>
> from 10 to 20: safe
> processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> 1 peak_states 1 mark_read 1
> -------- End of Verifier Log --------
>
> In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> the check here:
> if (umax_val >= insn_bitness) {
> /* Shifts greater than 31 or 63 are undefined.
> * This includes shifts by a negative number.
> */
> mark_reg_unknown(env, regs, insn->dst_reg);
> break;
> }
>
> However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
>
> Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> because issues somewhere else, incorrectly setting u32_min_value to
> 34217727
>
> Best
> Hao Sun
>

2023-10-25 21:59:51

by Eduard Zingerman

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, 2023-10-25 at 22:09 +0800, Shung-Hsi Yu wrote:
> Hi Hao,
>
> On Wed, Oct 25, 2023 at 02:31:02PM +0200, Hao Sun wrote:
> > On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > The following program can trigger a shift-out-of-bounds in
> > > tnum_rshift(), called by scalar32_min_max_rsh():
> > >
> > > 0: (bc) w0.
> = w1
> > > 1: (bf) r2 = r0
> > > 2: (18) r3 = 0xd
> > > 4: (bc) w4 = w0
> > > 5: (bf) r5 = r0
> > > 6: (bf) r7 = r3
> > > 7: (bf) r8 = r4
> > > 8: (2f) r8 *= r5
> > > 9: (cf) r5 s>>= r5
> > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > 11: (1f) r7 -= r5
> > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > 13: (5f) r3 &= r8
> > > 14: (74) w2 >>= 30
> > > 15: (1f) r7 -= r5
> > > 16: (5d) if r8 != r6 goto pc+4
> > > 17: (c7) r8 s>>= 5
> > > 18: (cf) r0 s>>= r0
> > > 19: (7f) r0 >>= r0
> > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > 21: exit
> > >
> >
> > Here are the c macros for the above program in case anyone needs this:
> >
> > // 0: (bc) w0 = w1
> > BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
> > // 1: (bf) r2 = r0
> > BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
> > // 2: (18) r3 = 0xd
> > BPF_LD_IMM64(BPF_REG_3, 0xd),
> > // 4: (bc) w4 = w0
> > BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
> > // 5: (bf) r5 = r0
> > BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
> > // 6: (bf) r7 = r3
> > BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
> > // 7: (bf) r8 = r4
> > BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
> > // 8: (2f) r8 *= r5
> > BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
> > // 9: (cf) r5 s>>= r5
> > BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
> > // 10: (a6) if w8 < 0xfffffffb goto pc+10
> > BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
> > // 11: (1f) r7 -= r5
> > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > // 12: (71) r6 = *(u8 *)(r1 +17)
> > BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
> > // 13: (5f) r3 &= r8
> > BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
> > // 14: (74) w2 >>= 30
> > BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
> > // 15: (1f) r7 -= r5
> > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > // 16: (5d) if r8 != r6 goto pc+4
> > BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
> > // 17: (c7) r8 s>>= 5
> > BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
> > // 18: (cf) r0 s>>= r0
> > BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
> > // 19: (7f) r0 >>= r0
> > BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
> > // 20: (7c) w5 >>= w8
> > BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
> > BPF_EXIT_INSN()
> >
> > > After load:
> > > ================================================================================
> > > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > > Call Trace:
> > > <TASK>
> > > __dump_stack lib/dump_stack.c:88 [inline]
> > > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > > ubsan_epilogue lib/ubsan.c:217 [inline]
> > > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > > do_check kernel/bpf/verifier.c:16890 [inline]
> > > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > RIP: 0033:0x5610511e23cd
> > > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > > </TASK>
> > >
> > > If remove insn #20, the verifier gives:
> > > -------- Verifier Log --------
> > > func#0 @0
> > > 0: R1=ctx(off=0,imm=0) R10=fp0
> > > 0: (bc) w0 = w1 ;
> > > R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > R1=ctx(off=0,
> > > imm=0)
> > > 1: (bf) r2 = r0 ;
> > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 2: (18) r3 = 0xd ; R3_w=13
> > > 4: (bc) w4 = w0 ;
> > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 5: (bf) r5 = r0 ;
> > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> > > 7: (bf) r8 = r4 ;
> > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > 8: (2f) r8 *= r5 ;
> > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > 0xffffffff))
> > > R8_w=scalar()
> > > 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> > > 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> > > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> > > umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> > > 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> > > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> > > var_off=(0x0; 0xff))
> > > 13: (5f) r3 &= r8 ;
> > > R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> > > 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > > 14: (74) w2 >>= 30 ;
> > > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> > > 0x3))
> > > 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > 16: (5d) if r8 != r6 goto pc+3 ;
> > > R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> > > var_off=(0xfffffff8; 0x7))
> > > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
>
> Seems like the root cause is a bug with range tracking, before instruction
> 16, R8_w was
>
> R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
>
> But after instruction 16 it becomes
>
> R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
>
> Where smin_value > smax_value, and umin_value > umax_value (among other
> things). This should be the main problem.
>
> The verifier operates on the assumption that smin_value <= smax_value and
> umin_value <= umax_value, and if that assumption is not upheld then all kind
> of things can go wrong.
>
> Maybe Andrii may already has this worked out in the range-vs-range that he
> has mentioned[1] he'll be sending soon.
>
> 1: https://lore.kernel.org/bpf/CAEf4BzbJ3hZCSt4nLCZCV4cxV60+kddiSMsy7-9ou_RaQV7B8A@mail.gmail.com/
>
> > > 17: (c7) r8 s>>= 5 ; R8_w=134217727
> > > 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> > > 19: (7f) r0 >>= r0 ; R0=scalar()
> > > 20: (95) exit
> > >
> > > from 16 to 20: safe
> > >
> > > from 10 to 20: safe
> > > processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> > > 1 peak_states 1 mark_read 1
> > > -------- End of Verifier Log --------
> > >
> > > In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> > > the check here:
> > > if (umax_val >= insn_bitness) {
> > > /* Shifts greater than 31 or 63 are undefined.
> > > * This includes shifts by a negative number.
> > > */
> > > mark_reg_unknown(env, regs, insn->dst_reg);
> > > break;
> > > }
> > >
> > > However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> > > src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
> > >
> > > Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> > > scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> > > because issues somewhere else, incorrectly setting u32_min_value to
> > > 34217727
>
> Checking umax_val alone is be enough and we don't need to add a check for
> u32_max_value, because (when we have correct range tracking) u32_max_value
> should always be smaller than u32_value. So the fix needed here is to have
> correct range tracking.

Hello,

Sorry, I haven't noticed your reply when replying in a sibling thread.
I agree with your analysis, I think the culprit here is inability of
__reg_combine_min_max() to deal with non-overlapping ranges.

Consider example below:

SEC("?tp")
__success __retval(0)
__naked void large_shifts(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
r8 = r0; \
r6 = r0; \
r6 &= 0x00f; \
r8 &= 0xf00; \
r8 |= 0x0ff; \
if r8 != r6 goto +1; \
w0 >>= w8; /* shift-out-bounds here */ \
exit; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}

Here the ranges before 'if' are {0,15} for R6 and {255,4095} for R8.
And here is the code of __reg_combine_min_max():

...
src_reg->umin_value = dst_reg->umin_value = max(src_reg->umin_value,
dst_reg->umin_value);
src_reg->umax_value = dst_reg->umax_value = min(src_reg->umax_value,
dst_reg->umax_value);
...

This code would be executed when 'if' is processed from the following call-chain:
- check_cond_jmp_op
- reg_combine_min_max
- __reg_combine_min_max

The src_reg is R6 and dst_reg is R8, the min/max assignments above
would produce umin_value > umax_value for any ranges {a,b}, {c,d}
where a < b < c < d.

Non-overlapping ranges can get to reg_combine_min_max() because
check_cond_jmp_op() does predictions only when one of the operands of
the comparison is constant.

I think the way to fix this bug is to:
- teach check_cond_jmp_op() to do predictions when ranges of operands
do not overlap;
- add assertion to __reg_combine_min_max() to make sure that only
operands with overlapping ranges are passed as arguments.

wdyt?

2023-10-26 06:45:05

by Shung-Hsi Yu

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

Hi Andrii,

The fix needed here to address the "non-overlapping ranges in
__reg_combine_min_max()" issue sounds like something you'd already have in
the range-vs-range series you mentioned previously. Maybe you've already got
a patch for this?

On Thu, Oct 26, 2023 at 12:59:19AM +0300, Eduard Zingerman wrote:
> On Wed, 2023-10-25 at 22:09 +0800, Shung-Hsi Yu wrote:
> > Hi Hao,
> > On Wed, Oct 25, 2023 at 02:31:02PM +0200, Hao Sun wrote:
> > > On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
> > > >
> > > > Hi,
> > > >
> > > > The following program can trigger a shift-out-of-bounds in
> > > > tnum_rshift(), called by scalar32_min_max_rsh():
> > > >
> > > > 0: (bc) w0.
> > = w1
> > > > 1: (bf) r2 = r0
> > > > 2: (18) r3 = 0xd
> > > > 4: (bc) w4 = w0
> > > > 5: (bf) r5 = r0
> > > > 6: (bf) r7 = r3
> > > > 7: (bf) r8 = r4
> > > > 8: (2f) r8 *= r5
> > > > 9: (cf) r5 s>>= r5
> > > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > > 11: (1f) r7 -= r5
> > > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > > 13: (5f) r3 &= r8
> > > > 14: (74) w2 >>= 30
> > > > 15: (1f) r7 -= r5
> > > > 16: (5d) if r8 != r6 goto pc+4
> > > > 17: (c7) r8 s>>= 5
> > > > 18: (cf) r0 s>>= r0
> > > > 19: (7f) r0 >>= r0
> > > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > > 21: exit
> > > >
> > >
> > > Here are the c macros for the above program in case anyone needs this:
> > >
> > > // 0: (bc) w0 = w1
> > > BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
> > > // 1: (bf) r2 = r0
> > > BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
> > > // 2: (18) r3 = 0xd
> > > BPF_LD_IMM64(BPF_REG_3, 0xd),
> > > // 4: (bc) w4 = w0
> > > BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
> > > // 5: (bf) r5 = r0
> > > BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
> > > // 6: (bf) r7 = r3
> > > BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
> > > // 7: (bf) r8 = r4
> > > BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
> > > // 8: (2f) r8 *= r5
> > > BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
> > > // 9: (cf) r5 s>>= r5
> > > BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
> > > // 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
> > > // 11: (1f) r7 -= r5
> > > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > > // 12: (71) r6 = *(u8 *)(r1 +17)
> > > BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
> > > // 13: (5f) r3 &= r8
> > > BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
> > > // 14: (74) w2 >>= 30
> > > BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
> > > // 15: (1f) r7 -= r5
> > > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > > // 16: (5d) if r8 != r6 goto pc+4
> > > BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
> > > // 17: (c7) r8 s>>= 5
> > > BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
> > > // 18: (cf) r0 s>>= r0
> > > BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
> > > // 19: (7f) r0 >>= r0
> > > BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
> > > // 20: (7c) w5 >>= w8
> > > BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
> > > BPF_EXIT_INSN()
> > >
> > > > After load:
> > > > ================================================================================
> > > > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > > > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > > > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > > > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > > > Call Trace:
> > > > <TASK>
> > > > __dump_stack lib/dump_stack.c:88 [inline]
> > > > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > > > ubsan_epilogue lib/ubsan.c:217 [inline]
> > > > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > > > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > > > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > > > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > > > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > > > do_check kernel/bpf/verifier.c:16890 [inline]
> > > > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > > > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > > > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > > > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > > > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > > > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > > > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > > > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > RIP: 0033:0x5610511e23cd
> > > > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > > > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > > > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > > > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > > > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > > > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > > > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > > > </TASK>
> > > >
> > > > If remove insn #20, the verifier gives:
> > > > -------- Verifier Log --------
> > > > func#0 @0
> > > > 0: R1=ctx(off=0,imm=0) R10=fp0
> > > > 0: (bc) w0 = w1 ;
> > > > R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > R1=ctx(off=0,
> > > > imm=0)
> > > > 1: (bf) r2 = r0 ;
> > > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > 0xffffffff))
> > > > R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > 2: (18) r3 = 0xd ; R3_w=13
> > > > 4: (bc) w4 = w0 ;
> > > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > 0xffffffff))
> > > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > 5: (bf) r5 = r0 ;
> > > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > 0xffffffff))
> > > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> > > > 7: (bf) r8 = r4 ;
> > > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > 0xffffffff))
> > > > R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > 8: (2f) r8 *= r5 ;
> > > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > 0xffffffff))
> > > > R8_w=scalar()
> > > > 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> > > > 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> > > > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> > > > umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> > > > 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > > 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> > > > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> > > > var_off=(0x0; 0xff))
> > > > 13: (5f) r3 &= r8 ;
> > > > R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> > > > 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > > > 14: (74) w2 >>= 30 ;
> > > > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> > > > 0x3))
> > > > 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > > 16: (5d) if r8 != r6 goto pc+3 ;
> > > > R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> > > > var_off=(0xfffffff8; 0x7))
> > > > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> >
> > Seems like the root cause is a bug with range tracking, before instruction
> > 16, R8_w was
> >
> > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> >
> > But after instruction 16 it becomes
> >
> > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> >
> > Where smin_value > smax_value, and umin_value > umax_value (among other
> > things). This should be the main problem.
> >
> > The verifier operates on the assumption that smin_value <= smax_value and
> > umin_value <= umax_value, and if that assumption is not upheld then all kind
> > of things can go wrong.
> >
> > Maybe Andrii may already has this worked out in the range-vs-range that he
> > has mentioned[1] he'll be sending soon.
> >
> > 1: https://lore.kernel.org/bpf/CAEf4BzbJ3hZCSt4nLCZCV4cxV60+kddiSMsy7-9ou_RaQV7B8A@mail.gmail.com/
> >
> > > > 17: (c7) r8 s>>= 5 ; R8_w=134217727
> > > > 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> > > > 19: (7f) r0 >>= r0 ; R0=scalar()
> > > > 20: (95) exit
> > > >
> > > > from 16 to 20: safe
> > > >
> > > > from 10 to 20: safe
> > > > processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> > > > 1 peak_states 1 mark_read 1
> > > > -------- End of Verifier Log --------
> > > >
> > > > In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> > > > the check here:
> > > > if (umax_val >= insn_bitness) {
> > > > /* Shifts greater than 31 or 63 are undefined.
> > > > * This includes shifts by a negative number.
> > > > */
> > > > mark_reg_unknown(env, regs, insn->dst_reg);
> > > > break;
> > > > }
> > > >
> > > > However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> > > > src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
> > > >
> > > > Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> > > > scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> > > > because issues somewhere else, incorrectly setting u32_min_value to
> > > > 34217727
> >
> > Checking umax_val alone is be enough and we don't need to add a check for
> > u32_max_value, because (when we have correct range tracking) u32_max_value
> > should always be smaller than u32_value. So the fix needed here is to have
> > correct range tracking.
>
> Hello,
>
> Sorry, I haven't noticed your reply when replying in a sibling thread.

That's something I struggle with too :)

> I agree with your analysis, I think the culprit here is inability of
> __reg_combine_min_max() to deal with non-overlapping ranges.
>
> Consider example below:
>
> SEC("?tp")
> __success __retval(0)
> __naked void large_shifts(void)
> {
> asm volatile (" \
> call %[bpf_get_prandom_u32]; \
> r8 = r0; \
> r6 = r0; \
> r6 &= 0x00f; \
> r8 &= 0xf00; \
> r8 |= 0x0ff; \
> if r8 != r6 goto +1; \
> w0 >>= w8; /* shift-out-bounds here */ \
> exit; \
> " :
> : __imm(bpf_get_prandom_u32)
> : __clobber_all);
> }
>
> Here the ranges before 'if' are {0,15} for R6 and {255,4095} for R8.
>
> And here is the code of __reg_combine_min_max():
>
> ...
> src_reg->umin_value = dst_reg->umin_value = max(src_reg->umin_value,
> dst_reg->umin_value);
> src_reg->umax_value = dst_reg->umax_value = min(src_reg->umax_value,
> dst_reg->umax_value);
> ...
>
> This code would be executed when 'if' is processed from the following call-chain:
> - check_cond_jmp_op
> - reg_combine_min_max
> - __reg_combine_min_max
>
> The src_reg is R6 and dst_reg is R8, the min/max assignments above
> would produce umin_value > umax_value for any ranges {a,b}, {c,d}
> where a < b < c < d.
>
> Non-overlapping ranges can get to reg_combine_min_max() because
> check_cond_jmp_op() does predictions only when one of the operands of
> the comparison is constant.
>
> I think the way to fix this bug is to:
> - teach check_cond_jmp_op() to do predictions when ranges of operands
> do not overlap;
> - add assertion to __reg_combine_min_max() to make sure that only
> operands with overlapping ranges are passed as arguments.
>
> wdyt?

Agree on both points above. For the assertion in __reg_combine_min_max() I
think verbose("BUG...") plus __mark_reg_unknown() on both src_reg and
dst_reg should be enough.

2023-10-26 21:53:54

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, Oct 25, 2023 at 11:44 PM Shung-Hsi Yu <[email protected]> wrote:
>
> Hi Andrii,
>
> The fix needed here to address the "non-overlapping ranges in
> __reg_combine_min_max()" issue sounds like something you'd already have in
> the range-vs-range series you mentioned previously. Maybe you've already got
> a patch for this?
>
> On Thu, Oct 26, 2023 at 12:59:19AM +0300, Eduard Zingerman wrote:
> > On Wed, 2023-10-25 at 22:09 +0800, Shung-Hsi Yu wrote:
> > > Hi Hao,
> > > On Wed, Oct 25, 2023 at 02:31:02PM +0200, Hao Sun wrote:
> > > > On Tue, Oct 24, 2023 at 2:40 PM Hao Sun <[email protected]> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > The following program can trigger a shift-out-of-bounds in
> > > > > tnum_rshift(), called by scalar32_min_max_rsh():
> > > > >
> > > > > 0: (bc) w0.
> > > = w1
> > > > > 1: (bf) r2 = r0
> > > > > 2: (18) r3 = 0xd
> > > > > 4: (bc) w4 = w0
> > > > > 5: (bf) r5 = r0
> > > > > 6: (bf) r7 = r3
> > > > > 7: (bf) r8 = r4
> > > > > 8: (2f) r8 *= r5
> > > > > 9: (cf) r5 s>>= r5
> > > > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > > > 11: (1f) r7 -= r5
> > > > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > > > 13: (5f) r3 &= r8
> > > > > 14: (74) w2 >>= 30
> > > > > 15: (1f) r7 -= r5
> > > > > 16: (5d) if r8 != r6 goto pc+4
> > > > > 17: (c7) r8 s>>= 5
> > > > > 18: (cf) r0 s>>= r0
> > > > > 19: (7f) r0 >>= r0
> > > > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > > > 21: exit
> > > > >
> > > >
> > > > Here are the c macros for the above program in case anyone needs this:
> > > >
> > > > // 0: (bc) w0 = w1
> > > > BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
> > > > // 1: (bf) r2 = r0
> > > > BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
> > > > // 2: (18) r3 = 0xd
> > > > BPF_LD_IMM64(BPF_REG_3, 0xd),
> > > > // 4: (bc) w4 = w0
> > > > BPF_MOV32_REG(BPF_REG_4, BPF_REG_0),
> > > > // 5: (bf) r5 = r0
> > > > BPF_MOV64_REG(BPF_REG_5, BPF_REG_0),
> > > > // 6: (bf) r7 = r3
> > > > BPF_MOV64_REG(BPF_REG_7, BPF_REG_3),
> > > > // 7: (bf) r8 = r4
> > > > BPF_MOV64_REG(BPF_REG_8, BPF_REG_4),
> > > > // 8: (2f) r8 *= r5
> > > > BPF_ALU64_REG(BPF_MUL, BPF_REG_8, BPF_REG_5),
> > > > // 9: (cf) r5 s>>= r5
> > > > BPF_ALU64_REG(BPF_ARSH, BPF_REG_5, BPF_REG_5),
> > > > // 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > > BPF_JMP32_IMM(BPF_JLT, BPF_REG_8, 0xfffffffb, 10),
> > > > // 11: (1f) r7 -= r5
> > > > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > > > // 12: (71) r6 = *(u8 *)(r1 +17)
> > > > BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, 17),
> > > > // 13: (5f) r3 &= r8
> > > > BPF_ALU64_REG(BPF_AND, BPF_REG_3, BPF_REG_8),
> > > > // 14: (74) w2 >>= 30
> > > > BPF_ALU32_IMM(BPF_RSH, BPF_REG_2, 30),
> > > > // 15: (1f) r7 -= r5
> > > > BPF_ALU64_REG(BPF_SUB, BPF_REG_7, BPF_REG_5),
> > > > // 16: (5d) if r8 != r6 goto pc+4
> > > > BPF_JMP_REG(BPF_JNE, BPF_REG_8, BPF_REG_6, 4),
> > > > // 17: (c7) r8 s>>= 5
> > > > BPF_ALU64_IMM(BPF_ARSH, BPF_REG_8, 5),
> > > > // 18: (cf) r0 s>>= r0
> > > > BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_0),
> > > > // 19: (7f) r0 >>= r0
> > > > BPF_ALU64_REG(BPF_RSH, BPF_REG_0, BPF_REG_0),
> > > > // 20: (7c) w5 >>= w8
> > > > BPF_ALU32_REG(BPF_RSH, BPF_REG_5, BPF_REG_8),
> > > > BPF_EXIT_INSN()
> > > >
> > > > > After load:
> > > > > ================================================================================
> > > > > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > > > > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > > > > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > > > > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > > > > Call Trace:
> > > > > <TASK>
> > > > > __dump_stack lib/dump_stack.c:88 [inline]
> > > > > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > > > > ubsan_epilogue lib/ubsan.c:217 [inline]
> > > > > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > > > > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > > > > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > > > > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > > > > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > > > > do_check kernel/bpf/verifier.c:16890 [inline]
> > > > > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > > > > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > > > > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > > > > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > > > > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > > > > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > > > > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > > > > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > > > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > > > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > > RIP: 0033:0x5610511e23cd
> > > > > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > > > > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > > > > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > > > > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > > > > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > > > > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > > > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > > > > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > > > > </TASK>
> > > > >
> > > > > If remove insn #20, the verifier gives:
> > > > > -------- Verifier Log --------
> > > > > func#0 @0
> > > > > 0: R1=ctx(off=0,imm=0) R10=fp0
> > > > > 0: (bc) w0 = w1 ;
> > > > > R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > > R1=ctx(off=0,
> > > > > imm=0)
> > > > > 1: (bf) r2 = r0 ;
> > > > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > > 0xffffffff))
> > > > > R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > > 2: (18) r3 = 0xd ; R3_w=13
> > > > > 4: (bc) w4 = w0 ;
> > > > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > > 0xffffffff))
> > > > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > > 5: (bf) r5 = r0 ;
> > > > > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > > 0xffffffff))
> > > > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > > 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> > > > > 7: (bf) r8 = r4 ;
> > > > > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > > 0xffffffff))
> > > > > R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > > > > 8: (2f) r8 *= r5 ;
> > > > > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > > > > 0xffffffff))
> > > > > R8_w=scalar()
> > > > > 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> > > > > 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> > > > > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> > > > > umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> > > > > 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > > > 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> > > > > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> > > > > var_off=(0x0; 0xff))
> > > > > 13: (5f) r3 &= r8 ;
> > > > > R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> > > > > 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > > > > 14: (74) w2 >>= 30 ;
> > > > > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> > > > > 0x3))
> > > > > 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > > > > 16: (5d) if r8 != r6 goto pc+3 ;
> > > > > R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> > > > > var_off=(0xfffffff8; 0x7))
> > > > > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> > >
> > > Seems like the root cause is a bug with range tracking, before instruction
> > > 16, R8_w was
> > >
> > > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > >
> > > But after instruction 16 it becomes
> > >
> > > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> > >
> > > Where smin_value > smax_value, and umin_value > umax_value (among other
> > > things). This should be the main problem.
> > >
> > > The verifier operates on the assumption that smin_value <= smax_value and
> > > umin_value <= umax_value, and if that assumption is not upheld then all kind
> > > of things can go wrong.
> > >
> > > Maybe Andrii may already has this worked out in the range-vs-range that he
> > > has mentioned[1] he'll be sending soon.

Yes, I think so. I did add few more conditions to BPF_JEQ/BPF_JNE to
prevent situations where we have to "combine" non-overlapping regions.
I don't know if my changes fix this specific condition, so having a
nice and short repro from Eduard is great.

I intend to submit a full series for reg bounds improvements in the
next few working days.

> > >
> > > 1: https://lore.kernel.org/bpf/CAEf4BzbJ3hZCSt4nLCZCV4cxV60+kddiSMsy7-9ou_RaQV7B8A@mail.gmail.com/
> > >
> > > > > 17: (c7) r8 s>>= 5 ; R8_w=134217727
> > > > > 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> > > > > 19: (7f) r0 >>= r0 ; R0=scalar()
> > > > > 20: (95) exit
> > > > >
> > > > > from 16 to 20: safe
> > > > >
> > > > > from 10 to 20: safe
> > > > > processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> > > > > 1 peak_states 1 mark_read 1
> > > > > -------- End of Verifier Log --------
> > > > >
> > > > > In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> > > > > the check here:
> > > > > if (umax_val >= insn_bitness) {
> > > > > /* Shifts greater than 31 or 63 are undefined.
> > > > > * This includes shifts by a negative number.
> > > > > */
> > > > > mark_reg_unknown(env, regs, insn->dst_reg);
> > > > > break;
> > > > > }
> > > > >
> > > > > However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> > > > > src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
> > > > >
> > > > > Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> > > > > scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> > > > > because issues somewhere else, incorrectly setting u32_min_value to
> > > > > 34217727
> > >
> > > Checking umax_val alone is be enough and we don't need to add a check for
> > > u32_max_value, because (when we have correct range tracking) u32_max_value
> > > should always be smaller than u32_value. So the fix needed here is to have
> > > correct range tracking.
> >
> > Hello,
> >
> > Sorry, I haven't noticed your reply when replying in a sibling thread.
>
> That's something I struggle with too :)
>
> > I agree with your analysis, I think the culprit here is inability of
> > __reg_combine_min_max() to deal with non-overlapping ranges.
> >
> > Consider example below:
> >
> > SEC("?tp")
> > __success __retval(0)
> > __naked void large_shifts(void)
> > {
> > asm volatile (" \
> > call %[bpf_get_prandom_u32]; \
> > r8 = r0; \
> > r6 = r0; \
> > r6 &= 0x00f; \
> > r8 &= 0xf00; \
> > r8 |= 0x0ff; \
> > if r8 != r6 goto +1; \
> > w0 >>= w8; /* shift-out-bounds here */ \
> > exit; \
> > " :
> > : __imm(bpf_get_prandom_u32)
> > : __clobber_all);
> > }
> >
> > Here the ranges before 'if' are {0,15} for R6 and {255,4095} for R8.
> >
> > And here is the code of __reg_combine_min_max():
> >
> > ...
> > src_reg->umin_value = dst_reg->umin_value = max(src_reg->umin_value,
> > dst_reg->umin_value);
> > src_reg->umax_value = dst_reg->umax_value = min(src_reg->umax_value,
> > dst_reg->umax_value);
> > ...
> >
> > This code would be executed when 'if' is processed from the following call-chain:
> > - check_cond_jmp_op
> > - reg_combine_min_max
> > - __reg_combine_min_max
> >
> > The src_reg is R6 and dst_reg is R8, the min/max assignments above
> > would produce umin_value > umax_value for any ranges {a,b}, {c,d}
> > where a < b < c < d.
> >
> > Non-overlapping ranges can get to reg_combine_min_max() because
> > check_cond_jmp_op() does predictions only when one of the operands of
> > the comparison is constant.
> >
> > I think the way to fix this bug is to:
> > - teach check_cond_jmp_op() to do predictions when ranges of operands
> > do not overlap;
> > - add assertion to __reg_combine_min_max() to make sure that only
> > operands with overlapping ranges are passed as arguments.
> >
> > wdyt?
>
> Agree on both points above. For the assertion in __reg_combine_min_max() I
> think verbose("BUG...") plus __mark_reg_unknown() on both src_reg and
> dst_reg should be enough.

2023-10-27 17:51:55

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, Oct 25, 2023 at 10:34 AM Eduard Zingerman <[email protected]> wrote:
>
> On Tue, 2023-10-24 at 14:40 +0200, Hao Sun wrote:
> > Hi,
> >
> > The following program can trigger a shift-out-of-bounds in
> > tnum_rshift(), called by scalar32_min_max_rsh():
> >
> > 0: (bc) w0 = w1
> > 1: (bf) r2 = r0
> > 2: (18) r3 = 0xd
> > 4: (bc) w4 = w0
> > 5: (bf) r5 = r0
> > 6: (bf) r7 = r3
> > 7: (bf) r8 = r4
> > 8: (2f) r8 *= r5
> > 9: (cf) r5 s>>= r5
> > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > 11: (1f) r7 -= r5
> > 12: (71) r6 = *(u8 *)(r1 +17)
> > 13: (5f) r3 &= r8
> > 14: (74) w2 >>= 30
> > 15: (1f) r7 -= r5
> > 16: (5d) if r8 != r6 goto pc+4
> > 17: (c7) r8 s>>= 5
> > 18: (cf) r0 s>>= r0
> > 19: (7f) r0 >>= r0
> > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > 21: exit
>
> Here is a simplified example:
>
> SEC("?tp")
> __success __retval(0)
> __naked void large_shifts(void)
> {
> asm volatile (" \
> call %[bpf_get_prandom_u32]; \n\
> r8 = r0; \n\
> r6 = r0; \n\
> r6 &= 0xf; \n\
> if w8 < 0xffffffff goto +2; \n\
> if r8 != r6 goto +1; \n\
> w0 >>= w8; /* shift-out-bounds here */ \n\
> exit; \n\
> " :
> : __imm(bpf_get_prandom_u32)
> : __clobber_all);
> }
>

With my changes the verifier does correctly derive that r8 != r6 will
always happen, and thus skips w0 >>= w8. But the test itself with
__retval(0) is not a valid test, so it would be good to construct
something that will correctly return 0 at runtime (or use some other
check). So I won't put this test into my patch set and will live it as
a follow up for someone. But here's the log for anyone curious:

VERIFIER LOG:
=============
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
; asm volatile (" \
0: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar()
1: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
2: (bf) r6 = r0 ; R0_w=scalar(id=1) R6_w=scalar(id=1)
3: (57) r6 &= 15 ;
R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0;
0xf))
4: (a6) if w8 < 0xffffffff goto pc+2 ;
R8_w=scalar(id=1,smin=-9223372032559808513,umin=umin32=4294967295,smin32=-1,smax32=-1,var_off=(0xffffffff;
0xffffffff00000000))
5: (5d) if r8 != r6 goto pc+1
mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r0,r8 stack= before 4: (a6) if w8 <
0xffffffff goto pc+2
mark_precise: frame0: regs=r0,r8 stack= before 3: (57) r6 &= 15
mark_precise: frame0: regs=r0,r8 stack= before 2: (bf) r6 = r0
mark_precise: frame0: regs=r0,r8 stack= before 1: (bf) r8 = r0
mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7
mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r6 stack= before 4: (a6) if w8 < 0xffffffff goto pc+2
mark_precise: frame0: regs=r6 stack= before 3: (57) r6 &= 15
mark_precise: frame0: regs=r6 stack= before 2: (bf) r6 = r0
mark_precise: frame0: regs=r0 stack= before 1: (bf) r8 = r0
mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7
5: R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0;
0xf)) R8_w=scalar(id=1,smin=-9223372032559808513,umin=umin32=4294967295,smin32=-1,smax32=-1,var_off=(0xffffffff;
0xffffffff00000000))
7: (95) exit

from 4 to 7: R0=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
R6=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
R8=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
R10=fp0
7: R0=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
R6=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
R8=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
R10=fp0
7: (95) exit
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1
peak_states 1 mark_read 1
=============

at insn #4, simulating a FALSE condition, verifier knows that r6 is
[0, 15], while w8 is exactly 0xffffffff, so at insn #5 it can tell
that 0xffffffff can never be equal to a value in [0, 15] range, and
thus skips the shift instruction.


> The issue is caused by an invalid range assigned to R8 after R8 != R6
> check, here is GDB log:
>
> (gdb) bt
> #0 scalar32_min_max_rsh ... at kernel/bpf/verifier.c:13368
> #1 0xffffffff81295236 in adjust_scalar_min_max_vals ... at kernel/bpf/verifier.c:13592
> #2 adjust_reg_min_max_vals .... at kernel/bpf/verifier.c:13706
> #3 0xffffffff8128701f in check_alu_op ... at kernel/bpf/verifier.c:13938
> #4 do_check ... at kernel/bpf/verifier.c:17327
> (gdb) p *src_reg
> $2 = {
> type = SCALAR_VALUE,
> ...
> smin_value = 4294967295,
> smax_value = 15,
> umin_value = 4294967295,
> umax_value = 15,
> s32_min_value = -1,
> s32_max_value = -1,
> u32_min_value = 4294967295,
> u32_max_value = 4294967295,
> ...
> }
>
> The invalid range is assigned within reg_combine_min_max() function in
> BPF_JNE branch. The following diff removes the error:
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 857d76694517..3d140bf85282 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -14485,7 +14485,7 @@ static void reg_combine_min_max(struct bpf_reg_state *true_src,
> __reg_combine_min_max(true_src, true_dst);
> break;
> case BPF_JNE:
> - __reg_combine_min_max(false_src, false_dst);
> + //__reg_combine_min_max(false_src, false_dst);
> break;
> }
> }
>
> I do not understand what BPF_JNE branch logically means in
> reg_combine_min_max(), does anyone has any insight?

Not equal check? When you have either `if r1 == r2 goto ` or `if r1 !=
r2`, verifier simulates both TRUE and FALSE conditions, so basically
both BPF_JEQ and BPF_JNE, depending on the branch.

>
> > After load:
> > ================================================================================
> > UBSAN: shift-out-of-bounds in kernel/bpf/tnum.c:44:9
> > shift exponent 255 is too large for 64-bit type 'long long unsigned int'
> > CPU: 2 PID: 8574 Comm: bpf-test Not tainted
> > 6.6.0-rc5-01400-g7c2f6c9fb91f-dirty #21
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0x8e/0xb0 lib/dump_stack.c:106
> > ubsan_epilogue lib/ubsan.c:217 [inline]
> > __ubsan_handle_shift_out_of_bounds+0x15a/0x2f0 lib/ubsan.c:387
> > tnum_rshift.cold+0x17/0x32 kernel/bpf/tnum.c:44
> > scalar32_min_max_rsh kernel/bpf/verifier.c:12999 [inline]
> > adjust_scalar_min_max_vals kernel/bpf/verifier.c:13224 [inline]
> > adjust_reg_min_max_vals+0x1936/0x5d50 kernel/bpf/verifier.c:13338
> > do_check kernel/bpf/verifier.c:16890 [inline]
> > do_check_common+0x2f64/0xbb80 kernel/bpf/verifier.c:19563
> > do_check_main kernel/bpf/verifier.c:19626 [inline]
> > bpf_check+0x65cf/0xa9e0 kernel/bpf/verifier.c:20263
> > bpf_prog_load+0x110e/0x1b20 kernel/bpf/syscall.c:2717
> > __sys_bpf+0xfcf/0x4380 kernel/bpf/syscall.c:5365
> > __do_sys_bpf kernel/bpf/syscall.c:5469 [inline]
> > __se_sys_bpf kernel/bpf/syscall.c:5467 [inline]
> > __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5467
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > RIP: 0033:0x5610511e23cd
> > Code: 24 80 00 00 00 48 0f 42 d0 48 89 94 24 68 0c 00 00 b8 41 01 00
> > 00 bf 05 00 00 00 ba 90 00 00 00 48 8d b44
> > RSP: 002b:00007f5357fc7820 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
> > RAX: ffffffffffffffda RBX: 0000000000000095 RCX: 00005610511e23cd
> > RDX: 0000000000000090 RSI: 00007f5357fc8410 RDI: 0000000000000005
> > RBP: 0000000000000000 R08: 00007f5357fca458 R09: 00007f5350005520
> > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002b
> > R13: 0000000d00000000 R14: 000000000000002b R15: 000000000000002b
> > </TASK>
> >
> > If remove insn #20, the verifier gives:
> > -------- Verifier Log --------
> > func#0 @0
> > 0: R1=ctx(off=0,imm=0) R10=fp0
> > 0: (bc) w0 = w1 ;
> > R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > R1=ctx(off=0,
> > imm=0)
> > 1: (bf) r2 = r0 ;
> > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R2_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 2: (18) r3 = 0xd ; R3_w=13
> > 4: (bc) w4 = w0 ;
> > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 5: (bf) r5 = r0 ;
> > R0_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 6: (bf) r7 = r3 ; R3_w=13 R7_w=13
> > 7: (bf) r8 = r4 ;
> > R4_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R8_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
> > 8: (2f) r8 *= r5 ;
> > R5_w=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0;
> > 0xffffffff))
> > R8_w=scalar()
> > 9: (cf) r5 s>>= r5 ; R5_w=scalar()
> > 10: (a6) if w8 < 0xfffffffb goto pc+9 ;
> > R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,
> > umin32=4294967291,var_off=(0xfffffff8; 0xffffffff00000007))
> > 11: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > 12: (71) r6 = *(u8 *)(r1 +17) ; R1=ctx(off=0,imm=0)
> > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,
> > var_off=(0x0; 0xff))
> > 13: (5f) r3 &= r8 ;
> > R3_w=scalar(smin=umin=smin32=umin32=8,smax=umax=smax32=umax32=13,var_off=(0x8;
> > 0x5)) R8_w=scalar(smin=-9223372032559808520,umin=4294967288,smin32=-5,smax32=-1,umin32=4294967291,var_off=(0xffff)
> > 14: (74) w2 >>= 30 ;
> > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0;
> > 0x3))
> > 15: (1f) r7 -= r5 ; R5_w=scalar() R7_w=scalar()
> > 16: (5d) if r8 != r6 goto pc+3 ;
> > R6_w=scalar(smin=umin=umin32=4294967288,smax=umax=umax32=255,smin32=-8,smax32=-1,
> > var_off=(0xfffffff8; 0x7))
> > R8_w=scalar(smin=umin=4294967288,smax=umax=255,smin32=-5,smax32=-1,umin32=4294967291)
> > 17: (c7) r8 s>>= 5 ; R8_w=134217727
> > 18: (cf) r0 s>>= r0 ; R0_w=scalar()
> > 19: (7f) r0 >>= r0 ; R0=scalar()
> > 20: (95) exit
> >
> > from 16 to 20: safe
> >
> > from 10 to 20: safe
> > processed 22 insns (limit 1000000) max_states_per_insn 0 total_states
> > 1 peak_states 1 mark_read 1
> > -------- End of Verifier Log --------
> >
> > In adjust_scalar_min_max_vals(), src_reg.umax_value is 7, thus pass
> > the check here:
> > if (umax_val >= insn_bitness) {
> > /* Shifts greater than 31 or 63 are undefined.
> > * This includes shifts by a negative number.
> > */
> > mark_reg_unknown(env, regs, insn->dst_reg);
> > break;
> > }
> >
> > However in scalar32_min_max_rsh(), both src_reg->u32_min_value and
> > src_reg->u32_max_value is 134217727, causing tnum_rsh() shit by 255.
> >
> > Should we check if(src_reg->u32_max_value < insn_bitness) before calling
> > scalar32_min_max_rsh(), rather than only checking umax_val? Or, is it
> > because issues somewhere else, incorrectly setting u32_min_value to
> > 34217727
> >
> > Best
> > Hao Sun
> >
>

2023-11-01 09:53:26

by Hao Sun

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Fri, Oct 27, 2023 at 7:51 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Wed, Oct 25, 2023 at 10:34 AM Eduard Zingerman <[email protected]> wrote:
> >
> > On Tue, 2023-10-24 at 14:40 +0200, Hao Sun wrote:
> > > Hi,
> > >
> > > The following program can trigger a shift-out-of-bounds in
> > > tnum_rshift(), called by scalar32_min_max_rsh():
> > >
> > > 0: (bc) w0 = w1
> > > 1: (bf) r2 = r0
> > > 2: (18) r3 = 0xd
> > > 4: (bc) w4 = w0
> > > 5: (bf) r5 = r0
> > > 6: (bf) r7 = r3
> > > 7: (bf) r8 = r4
> > > 8: (2f) r8 *= r5
> > > 9: (cf) r5 s>>= r5
> > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > 11: (1f) r7 -= r5
> > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > 13: (5f) r3 &= r8
> > > 14: (74) w2 >>= 30
> > > 15: (1f) r7 -= r5
> > > 16: (5d) if r8 != r6 goto pc+4
> > > 17: (c7) r8 s>>= 5
> > > 18: (cf) r0 s>>= r0
> > > 19: (7f) r0 >>= r0
> > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > 21: exit
> >
> > Here is a simplified example:
> >
> > SEC("?tp")
> > __success __retval(0)
> > __naked void large_shifts(void)
> > {
> > asm volatile (" \
> > call %[bpf_get_prandom_u32]; \n\
> > r8 = r0; \n\
> > r6 = r0; \n\
> > r6 &= 0xf; \n\
> > if w8 < 0xffffffff goto +2; \n\
> > if r8 != r6 goto +1; \n\
> > w0 >>= w8; /* shift-out-bounds here */ \n\
> > exit; \n\
> > " :
> > : __imm(bpf_get_prandom_u32)
> > : __clobber_all);
> > }
> >
>
> With my changes the verifier does correctly derive that r8 != r6 will
> always happen, and thus skips w0 >>= w8. But the test itself with

A similar issue can be triggered after your patch for JNE/JEQ.

For the following case, the verifier would shift out of bound:
// 0: r0 = -2
BPF_MOV64_IMM(BPF_REG_0, -2),
// 1: r0 /= 1
BPF_ALU64_IMM(BPF_DIV, BPF_REG_0, 1),
// 2: r8 = r0
BPF_MOV64_REG(BPF_REG_8, BPF_REG_0),
// 3: if w8 != 0xfffffffe goto+4
BPF_JMP32_IMM(BPF_JNE, BPF_REG_8, 0xfffffffe, 4),
// 4: if r8 s> 0xd goto+3
BPF_JMP_IMM(BPF_JSGT, BPF_REG_8, 0xd, 3),
// 5: r4 = 0x2
BPF_MOV64_IMM(BPF_REG_4, 0x2),
// 6: if r8 s<= r4 goto+1
BPF_JMP_REG(BPF_JSLE, BPF_REG_8, BPF_REG_4, 1),
// 7: w8 s>>= w0 # shift out of bound here
BPF_ALU32_REG(BPF_ARSH, BPF_REG_8, BPF_REG_0),
// 8: exit
BPF_EXIT_INSN(),

-------- Verifier Log --------
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r0 = -2 ; R0_w=-2
1: (37) r0 /= 1 ; R0_w=scalar()
2: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
3: (56) if w8 != 0xfffffffe goto pc+4 ;
R8_w=scalar(id=1,smin=-9223372032559808514,smax=9223372036854775806,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,
umax32=4294967294,var_off=(0xfffffffe; 0xffffffff00000000))
4: (65) if r8 s> 0xd goto pc+3 ;
R8_w=scalar(id=1,smin=-9223372032559808514,smax=13,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,umax32=4294967294,
var_off=(0xfffffffe; 0xffffffff00000000))
5: (b7) r4 = 2 ; R4_w=2
6: (dd) if r8 s<= r4 goto pc+1 ; R4_w=2 R8_w=4294967294
7: (cc) w8 s>>= w0 ; R0=4294967294 R8=4294967295
8: (95) exit

Here, after #6, reg range is incorrect, seems to be an issue in JSLE case
in is_branch_taken(). Is this issue fixed in your patch series?

> __retval(0) is not a valid test, so it would be good to construct
> something that will correctly return 0 at runtime (or use some other
> check). So I won't put this test into my patch set and will live it as
> a follow up for someone. But here's the log for anyone curious:
>
> VERIFIER LOG:
> =============
> func#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> ; asm volatile (" \
> 0: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar()
> 1: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
> 2: (bf) r6 = r0 ; R0_w=scalar(id=1) R6_w=scalar(id=1)
> 3: (57) r6 &= 15 ;
> R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0;
> 0xf))
> 4: (a6) if w8 < 0xffffffff goto pc+2 ;
> R8_w=scalar(id=1,smin=-9223372032559808513,umin=umin32=4294967295,smin32=-1,smax32=-1,var_off=(0xffffffff;
> 0xffffffff00000000))
> 5: (5d) if r8 != r6 goto pc+1
> mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
> mark_precise: frame0: regs=r0,r8 stack= before 4: (a6) if w8 <
> 0xffffffff goto pc+2
> mark_precise: frame0: regs=r0,r8 stack= before 3: (57) r6 &= 15
> mark_precise: frame0: regs=r0,r8 stack= before 2: (bf) r6 = r0
> mark_precise: frame0: regs=r0,r8 stack= before 1: (bf) r8 = r0
> mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7
> mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
> mark_precise: frame0: regs=r6 stack= before 4: (a6) if w8 < 0xffffffff goto pc+2
> mark_precise: frame0: regs=r6 stack= before 3: (57) r6 &= 15
> mark_precise: frame0: regs=r6 stack= before 2: (bf) r6 = r0
> mark_precise: frame0: regs=r0 stack= before 1: (bf) r8 = r0
> mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7
> 5: R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0;
> 0xf)) R8_w=scalar(id=1,smin=-9223372032559808513,umin=umin32=4294967295,smin32=-1,smax32=-1,var_off=(0xffffffff;
> 0xffffffff00000000))
> 7: (95) exit
>
> from 4 to 7: R0=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> R6=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
> R8=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> R10=fp0
> 7: R0=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> R6=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
> R8=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> R10=fp0
> 7: (95) exit
> processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1
> peak_states 1 mark_read 1
> =============
>
> at insn #4, simulating a FALSE condition, verifier knows that r6 is
> [0, 15], while w8 is exactly 0xffffffff, so at insn #5 it can tell
> that 0xffffffff can never be equal to a value in [0, 15] range, and
> thus skips the shift instruction.
>

2023-11-01 23:16:17

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, Nov 1, 2023 at 2:53 AM Hao Sun <[email protected]> wrote:
>
> On Fri, Oct 27, 2023 at 7:51 PM Andrii Nakryiko
> <[email protected]> wrote:
> >
> > On Wed, Oct 25, 2023 at 10:34 AM Eduard Zingerman <[email protected]> wrote:
> > >
> > > On Tue, 2023-10-24 at 14:40 +0200, Hao Sun wrote:
> > > > Hi,
> > > >
> > > > The following program can trigger a shift-out-of-bounds in
> > > > tnum_rshift(), called by scalar32_min_max_rsh():
> > > >
> > > > 0: (bc) w0 = w1
> > > > 1: (bf) r2 = r0
> > > > 2: (18) r3 = 0xd
> > > > 4: (bc) w4 = w0
> > > > 5: (bf) r5 = r0
> > > > 6: (bf) r7 = r3
> > > > 7: (bf) r8 = r4
> > > > 8: (2f) r8 *= r5
> > > > 9: (cf) r5 s>>= r5
> > > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > > 11: (1f) r7 -= r5
> > > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > > 13: (5f) r3 &= r8
> > > > 14: (74) w2 >>= 30
> > > > 15: (1f) r7 -= r5
> > > > 16: (5d) if r8 != r6 goto pc+4
> > > > 17: (c7) r8 s>>= 5
> > > > 18: (cf) r0 s>>= r0
> > > > 19: (7f) r0 >>= r0
> > > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > > 21: exit
> > >
> > > Here is a simplified example:
> > >
> > > SEC("?tp")
> > > __success __retval(0)
> > > __naked void large_shifts(void)
> > > {
> > > asm volatile (" \
> > > call %[bpf_get_prandom_u32]; \n\
> > > r8 = r0; \n\
> > > r6 = r0; \n\
> > > r6 &= 0xf; \n\
> > > if w8 < 0xffffffff goto +2; \n\
> > > if r8 != r6 goto +1; \n\
> > > w0 >>= w8; /* shift-out-bounds here */ \n\
> > > exit; \n\
> > > " :
> > > : __imm(bpf_get_prandom_u32)
> > > : __clobber_all);
> > > }
> > >
> >
> > With my changes the verifier does correctly derive that r8 != r6 will
> > always happen, and thus skips w0 >>= w8. But the test itself with
>
> A similar issue can be triggered after your patch for JNE/JEQ.
>
> For the following case, the verifier would shift out of bound:
> // 0: r0 = -2
> BPF_MOV64_IMM(BPF_REG_0, -2),
> // 1: r0 /= 1
> BPF_ALU64_IMM(BPF_DIV, BPF_REG_0, 1),
> // 2: r8 = r0
> BPF_MOV64_REG(BPF_REG_8, BPF_REG_0),
> // 3: if w8 != 0xfffffffe goto+4
> BPF_JMP32_IMM(BPF_JNE, BPF_REG_8, 0xfffffffe, 4),
> // 4: if r8 s> 0xd goto+3
> BPF_JMP_IMM(BPF_JSGT, BPF_REG_8, 0xd, 3),
> // 5: r4 = 0x2
> BPF_MOV64_IMM(BPF_REG_4, 0x2),
> // 6: if r8 s<= r4 goto+1
> BPF_JMP_REG(BPF_JSLE, BPF_REG_8, BPF_REG_4, 1),
> // 7: w8 s>>= w0 # shift out of bound here
> BPF_ALU32_REG(BPF_ARSH, BPF_REG_8, BPF_REG_0),
> // 8: exit
> BPF_EXIT_INSN(),
>
> -------- Verifier Log --------
> func#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> 0: (b7) r0 = -2 ; R0_w=-2
> 1: (37) r0 /= 1 ; R0_w=scalar()
> 2: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
> 3: (56) if w8 != 0xfffffffe goto pc+4 ;
> R8_w=scalar(id=1,smin=-9223372032559808514,smax=9223372036854775806,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,
> umax32=4294967294,var_off=(0xfffffffe; 0xffffffff00000000))
> 4: (65) if r8 s> 0xd goto pc+3 ;
> R8_w=scalar(id=1,smin=-9223372032559808514,smax=13,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,umax32=4294967294,
> var_off=(0xfffffffe; 0xffffffff00000000))
> 5: (b7) r4 = 2 ; R4_w=2
> 6: (dd) if r8 s<= r4 goto pc+1 ; R4_w=2 R8_w=4294967294
> 7: (cc) w8 s>>= w0 ; R0=4294967294 R8=4294967295
> 8: (95) exit
>
> Here, after #6, reg range is incorrect, seems to be an issue in JSLE case
> in is_branch_taken(). Is this issue fixed in your patch series?

I don't know, but you can easily check by applying my patches on top
of bpf-next and then trying your change.

>
> > __retval(0) is not a valid test, so it would be good to construct
> > something that will correctly return 0 at runtime (or use some other
> > check). So I won't put this test into my patch set and will live it as
> > a follow up for someone. But here's the log for anyone curious:
> >
> > VERIFIER LOG:
> > =============
> > func#0 @0
> > 0: R1=ctx(off=0,imm=0) R10=fp0
> > ; asm volatile (" \
> > 0: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar()
> > 1: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
> > 2: (bf) r6 = r0 ; R0_w=scalar(id=1) R6_w=scalar(id=1)
> > 3: (57) r6 &= 15 ;
> > R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0;
> > 0xf))
> > 4: (a6) if w8 < 0xffffffff goto pc+2 ;
> > R8_w=scalar(id=1,smin=-9223372032559808513,umin=umin32=4294967295,smin32=-1,smax32=-1,var_off=(0xffffffff;
> > 0xffffffff00000000))
> > 5: (5d) if r8 != r6 goto pc+1
> > mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
> > mark_precise: frame0: regs=r0,r8 stack= before 4: (a6) if w8 <
> > 0xffffffff goto pc+2
> > mark_precise: frame0: regs=r0,r8 stack= before 3: (57) r6 &= 15
> > mark_precise: frame0: regs=r0,r8 stack= before 2: (bf) r6 = r0
> > mark_precise: frame0: regs=r0,r8 stack= before 1: (bf) r8 = r0
> > mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7
> > mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
> > mark_precise: frame0: regs=r6 stack= before 4: (a6) if w8 < 0xffffffff goto pc+2
> > mark_precise: frame0: regs=r6 stack= before 3: (57) r6 &= 15
> > mark_precise: frame0: regs=r6 stack= before 2: (bf) r6 = r0
> > mark_precise: frame0: regs=r0 stack= before 1: (bf) r8 = r0
> > mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7
> > 5: R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0;
> > 0xf)) R8_w=scalar(id=1,smin=-9223372032559808513,umin=umin32=4294967295,smin32=-1,smax32=-1,var_off=(0xffffffff;
> > 0xffffffff00000000))
> > 7: (95) exit
> >
> > from 4 to 7: R0=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> > R6=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
> > R8=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> > R10=fp0
> > 7: R0=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> > R6=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf))
> > R8=scalar(id=1,smax=9223372036854775806,umax=18446744073709551614,umax32=4294967294)
> > R10=fp0
> > 7: (95) exit
> > processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1
> > peak_states 1 mark_read 1
> > =============
> >
> > at insn #4, simulating a FALSE condition, verifier knows that r6 is
> > [0, 15], while w8 is exactly 0xffffffff, so at insn #5 it can tell
> > that 0xffffffff can never be equal to a value in [0, 15] range, and
> > thus skips the shift instruction.
> >

2023-11-02 14:18:05

by Hao Sun

[permalink] [raw]
Subject: Re: bpf: shift-out-of-bounds in tnum_rshift()

On Wed, Nov 1, 2023 at 11:45 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Wed, Nov 1, 2023 at 2:53 AM Hao Sun <[email protected]> wrote:
> >
> > On Fri, Oct 27, 2023 at 7:51 PM Andrii Nakryiko
> > <[email protected]> wrote:
> > >
> > > On Wed, Oct 25, 2023 at 10:34 AM Eduard Zingerman <[email protected]> wrote:
> > > >
> > > > On Tue, 2023-10-24 at 14:40 +0200, Hao Sun wrote:
> > > > > Hi,
> > > > >
> > > > > The following program can trigger a shift-out-of-bounds in
> > > > > tnum_rshift(), called by scalar32_min_max_rsh():
> > > > >
> > > > > 0: (bc) w0 = w1
> > > > > 1: (bf) r2 = r0
> > > > > 2: (18) r3 = 0xd
> > > > > 4: (bc) w4 = w0
> > > > > 5: (bf) r5 = r0
> > > > > 6: (bf) r7 = r3
> > > > > 7: (bf) r8 = r4
> > > > > 8: (2f) r8 *= r5
> > > > > 9: (cf) r5 s>>= r5
> > > > > 10: (a6) if w8 < 0xfffffffb goto pc+10
> > > > > 11: (1f) r7 -= r5
> > > > > 12: (71) r6 = *(u8 *)(r1 +17)
> > > > > 13: (5f) r3 &= r8
> > > > > 14: (74) w2 >>= 30
> > > > > 15: (1f) r7 -= r5
> > > > > 16: (5d) if r8 != r6 goto pc+4
> > > > > 17: (c7) r8 s>>= 5
> > > > > 18: (cf) r0 s>>= r0
> > > > > 19: (7f) r0 >>= r0
> > > > > 20: (7c) w5 >>= w8 # shift-out-bounds here
> > > > > 21: exit
> > > >
> > > > Here is a simplified example:
> > > >
> > > > SEC("?tp")
> > > > __success __retval(0)
> > > > __naked void large_shifts(void)
> > > > {
> > > > asm volatile (" \
> > > > call %[bpf_get_prandom_u32]; \n\
> > > > r8 = r0; \n\
> > > > r6 = r0; \n\
> > > > r6 &= 0xf; \n\
> > > > if w8 < 0xffffffff goto +2; \n\
> > > > if r8 != r6 goto +1; \n\
> > > > w0 >>= w8; /* shift-out-bounds here */ \n\
> > > > exit; \n\
> > > > " :
> > > > : __imm(bpf_get_prandom_u32)
> > > > : __clobber_all);
> > > > }
> > > >
> > >
> > > With my changes the verifier does correctly derive that r8 != r6 will
> > > always happen, and thus skips w0 >>= w8. But the test itself with
> >
> > A similar issue can be triggered after your patch for JNE/JEQ.
> >
> > For the following case, the verifier would shift out of bound:
> > // 0: r0 = -2
> > BPF_MOV64_IMM(BPF_REG_0, -2),
> > // 1: r0 /= 1
> > BPF_ALU64_IMM(BPF_DIV, BPF_REG_0, 1),
> > // 2: r8 = r0
> > BPF_MOV64_REG(BPF_REG_8, BPF_REG_0),
> > // 3: if w8 != 0xfffffffe goto+4
> > BPF_JMP32_IMM(BPF_JNE, BPF_REG_8, 0xfffffffe, 4),
> > // 4: if r8 s> 0xd goto+3
> > BPF_JMP_IMM(BPF_JSGT, BPF_REG_8, 0xd, 3),
> > // 5: r4 = 0x2
> > BPF_MOV64_IMM(BPF_REG_4, 0x2),
> > // 6: if r8 s<= r4 goto+1
> > BPF_JMP_REG(BPF_JSLE, BPF_REG_8, BPF_REG_4, 1),
> > // 7: w8 s>>= w0 # shift out of bound here
> > BPF_ALU32_REG(BPF_ARSH, BPF_REG_8, BPF_REG_0),
> > // 8: exit
> > BPF_EXIT_INSN(),
> >
> > -------- Verifier Log --------
> > func#0 @0
> > 0: R1=ctx(off=0,imm=0) R10=fp0
> > 0: (b7) r0 = -2 ; R0_w=-2
> > 1: (37) r0 /= 1 ; R0_w=scalar()
> > 2: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
> > 3: (56) if w8 != 0xfffffffe goto pc+4 ;
> > R8_w=scalar(id=1,smin=-9223372032559808514,smax=9223372036854775806,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,
> > umax32=4294967294,var_off=(0xfffffffe; 0xffffffff00000000))
> > 4: (65) if r8 s> 0xd goto pc+3 ;
> > R8_w=scalar(id=1,smin=-9223372032559808514,smax=13,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,umax32=4294967294,
> > var_off=(0xfffffffe; 0xffffffff00000000))
> > 5: (b7) r4 = 2 ; R4_w=2
> > 6: (dd) if r8 s<= r4 goto pc+1 ; R4_w=2 R8_w=4294967294
> > 7: (cc) w8 s>>= w0 ; R0=4294967294 R8=4294967295
> > 8: (95) exit
> >
> > Here, after #6, reg range is incorrect, seems to be an issue in JSLE case
> > in is_branch_taken(). Is this issue fixed in your patch series?
>
> I don't know, but you can easily check by applying my patches on top
> of bpf-next and then trying your change.
>

After applying your change, the verifier does not shift out of bound,
but the range
is still not correct. See this verifier log:

-------- Verifier Log --------
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r0 = -2 ; R0_w=-2
1: (37) r0 /= 1 ; R0_w=scalar()
2: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1)
3: (56) if w8 != 0xfffffffe goto pc+4 ;
R8_w=scalar(id=1,smin=-9223372032559808514,smax=9223372036854775806,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,umax32=4294967294,var_off=(0xfffffffe;
0xffffffff00000000))
4: (65) if r8 s> 0xd goto pc+3 ;
R8_w=scalar(id=1,smin=-9223372032559808514,smax=13,umin=umin32=4294967294,umax=18446744073709551614,smin32=-2,smax32=-2,umax32=4294967294,var_off=(0xfffffffe;
0xffffffff00000000))
5: (b7) r4 = 2 ; R4_w=2
6: (dd) if r8 s<= r4 goto pc+1 ; R4_w=2 R8_w=4294967294
7: (cc) w8 s>>= w0 ; R0=4294967294 R8=scalar()
8: (77) r0 >>= 32 ; R0_w=0
9: (57) r0 &= 1 ; R0_w=0
10: (95) exit

from 6 to 8: safe

from 4 to 8: safe

from 3 to 8: safe
processed 14 insns (limit 1000000) max_states_per_insn 0 total_states
1 peak_states 1 mark_read 1

-------- Test Run Log --------
RetVal: 1

At #9, the verifier thinks the value of R0 is 0, but it's 1, because
r0 is -2 all the time before
#8. The test run also shows return value is 1.

This program can reproduce the above: https://pastebin.com/raw/a0WuXaKh