2021-12-17 18:50:22

by George Kennedy

Subject: [PATCH] bpf: check size before calling kvmalloc

ZERO_SIZE_PTR ((void *)16) is returned by kvmalloc() instead of NULL
if size is zero. Currently, return values from kvmalloc() are only
checked for NULL. Before calling kvmalloc() check for size of zero
and return error if size is zero to avoid the following crash.

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 1030bd067 P4D 1030bd067 PUD 103497067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 PID: 15094 Comm: syz-executor344 Not tainted 5.16.0-rc1-syzk #1
Hardware name: Red Hat KVM, BIOS
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffff888017627b78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880215d0780 RCX: ffffffff81b63c60
RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff8881035db400
RBP: ffff888017627f08 R08: ffffed1003697209 R09: ffffed1003697209
R10: ffff88801b4b9043 R11: ffffed1003697208 R12: ffffffff8f15d580
R13: 1ffff11002ec4f77 R14: ffff8881035db400 R15: 0000000000000000
FS: 00007f62bca78740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002282a000 CR4: 00000000000006e0
Call Trace:
<TASK>
map_get_next_key kernel/bpf/syscall.c:1279 [inline]
__sys_bpf+0x384d/0x5b30 kernel/bpf/syscall.c:4612
__do_sys_bpf kernel/bpf/syscall.c:4722 [inline]
__se_sys_bpf kernel/bpf/syscall.c:4720 [inline]
__x64_sys_bpf+0x7a/0xc0 kernel/bpf/syscall.c:4720
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae

Reported-by: syzkaller <[email protected]>
Signed-off-by: George Kennedy <[email protected]>
---
 kernel/bpf/syscall.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1033ee8..9873723 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1278,10 +1278,18 @@ static int map_get_next_key(union bpf_attr *attr)
 		key = NULL;
 	}

+	if (!map->key_size) {
+		err = -EINVAL;
+		goto err_put;
+	}
+
 	err = -ENOMEM;
 	next_key = kvmalloc(map->key_size, GFP_USER);
-	if (!next_key)
-		goto free_key;
+	if (!next_key) {
+		if (key)
+			goto free_key;
+		goto err_put;
+	}

 	if (bpf_map_is_dev_bound(map)) {
 		err = bpf_map_offload_get_next_key(map, key, next_key);
@@ -1331,6 +1339,8 @@ int generic_map_delete_batch(struct bpf_map *map,
 	if (!max_count)
 		return 0;

+	if (!map->key_size)
+		return -EINVAL;
 	key = kvmalloc(map->key_size, GFP_USER | __GFP_NOWARN);
 	if (!key)
 		return -ENOMEM;
@@ -1388,6 +1398,8 @@ int generic_map_update_batch(struct bpf_map *map,
 	if (!max_count)
 		return 0;

+	if (!map->key_size)
+		return -EINVAL;
 	key = kvmalloc(map->key_size, GFP_USER | __GFP_NOWARN);
 	if (!key)
 		return -ENOMEM;
@@ -1452,6 +1464,8 @@ int generic_map_lookup_batch(struct bpf_map *map,
 	if (put_user(0, &uattr->batch.count))
 		return -EFAULT;

+	if (!map->key_size)
+		return -EINVAL;
 	buf_prevkey = kvmalloc(map->key_size, GFP_USER | __GFP_NOWARN);
 	if (!buf_prevkey)
 		return -ENOMEM;
--
1.8.3.1
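
The ZERO_SIZE_PTR behaviour described in the commit message can be sketched in userspace. This is only an illustration, not kernel code: the two macros are copied from include/linux/slab.h, while mock_kvmalloc(), mock_kvfree() and alloc_key() are hypothetical stand-ins that model nothing beyond the size == 0 case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Same definitions as include/linux/slab.h. */
#define ZERO_SIZE_PTR ((void *)16)
#define ZERO_OR_NULL_PTR(x) \
	((unsigned long)(x) <= (unsigned long)ZERO_SIZE_PTR)

/* Hypothetical stand-in for kvmalloc(): a zero size yields the
 * ZERO_SIZE_PTR sentinel, any other size falls through to malloc(). */
static void *mock_kvmalloc(size_t size)
{
	if (size == 0)
		return ZERO_SIZE_PTR;
	return malloc(size);
}

/* Hypothetical stand-in for kvfree(): the sentinel must not reach free(). */
static void mock_kvfree(void *p)
{
	if (!ZERO_OR_NULL_PTR(p))
		free(p);
}

/* Guard in the style of the patch: reject a zero key size up front,
 * so the later NULL check only has to deal with real allocation failure. */
static int alloc_key(size_t key_size, void **out)
{
	if (!key_size)
		return -EINVAL;
	*out = mock_kvmalloc(key_size);
	return *out ? 0 : -ENOMEM;
}
```

With these helpers, mock_kvmalloc(0) passes a plain `if (!ptr)` test even though it points at no usable memory, whereas alloc_key(0, &key) fails early with -EINVAL.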



2021-12-17 22:46:01

by Daniel Borkmann

Subject: Re: [PATCH] bpf: check size before calling kvmalloc

On 12/17/21 7:48 PM, George Kennedy wrote:
> ZERO_SIZE_PTR ((void *)16) is returned by kvmalloc() instead of NULL
> if size is zero. Currently, return values from kvmalloc() are only
> checked for NULL. Before calling kvmalloc() check for size of zero
> and return error if size is zero to avoid the following crash.
>
> Reported-by: syzkaller <[email protected]>
> Signed-off-by: George Kennedy <[email protected]>

Could you provide some more details, e.g. which map type is this where we
have to assume zero-sized keys everywhere?

(Alternatively, a link to the syzkaller report would also work, if it is public.)

Thanks,
Daniel

2021-12-20 13:50:24

by George Kennedy

Subject: Re: [PATCH] bpf: check size before calling kvmalloc



On 12/17/2021 5:45 PM, Daniel Borkmann wrote:
> On 12/17/21 7:48 PM, George Kennedy wrote:
>> ZERO_SIZE_PTR ((void *)16) is returned by kvmalloc() instead of NULL
>> if size is zero. Currently, return values from kvmalloc() are only
>> checked for NULL. Before calling kvmalloc() check for size of zero
>> and return error if size is zero to avoid the following crash.
>>
>> Reported-by: syzkaller <[email protected]>
>> Signed-off-by: George Kennedy <[email protected]>
>
> Could you provide some more details, e.g. which map type is this where we
> have to assume zero-sized keys everywhere?
>
> (Alternatively, a link to the syzkaller report would also work, if it is public.)

I don't think the report is public. Here's the report and C reproducer:

#ifdef REF
Syzkaller hit 'BUG: unable to handle kernel NULL pointer dereference in bpf' bug.

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 1030bd067 P4D 1030bd067 PUD 103497067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 PID: 15094 Comm: syz-executor344 Not tainted 5.16.0-rc1-syzk #1
Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffff888017627b78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880215d0780 RCX: ffffffff81b63c60
RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff8881035db400
RBP: ffff888017627f08 R08: ffffed1003697209 R09: ffffed1003697209
R10: ffff88801b4b9043 R11: ffffed1003697208 R12: ffffffff8f15d580
R13: 1ffff11002ec4f77 R14: ffff8881035db400 R15: 0000000000000000
FS:  00007f62bca78740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002282a000 CR4: 00000000000006e0
Call Trace:
 <TASK>
 map_get_next_key kernel/bpf/syscall.c:1279 [inline]
 __sys_bpf+0x384d/0x5b30 kernel/bpf/syscall.c:4612
 __do_sys_bpf kernel/bpf/syscall.c:4722 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:4720 [inline]
 __x64_sys_bpf+0x7a/0xc0 kernel/bpf/syscall.c:4720
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f62bc36f289
Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 db 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffccaa211e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f62bc36f289
RDX: 0000000000000020 RSI: 0000000020000080 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004006d0
R13: 00007ffccaa212d0 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace d203e5a1836d64aa ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffff888017627b78 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880215d0780 RCX: ffffffff81b63c60
RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff8881035db400
RBP: ffff888017627f08 R08: ffffed1003697209 R09: ffffed1003697209
R10: ffff88801b4b9043 R11: ffffed1003697208 R12: ffffffff8f15d580
R13: 1ffff11002ec4f77 R14: ffff8881035db400 R15: 0000000000000000
FS:  00007f62bca78740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002282a000 CR4: 00000000000006e0


Syzkaller reproducer:
# {Threaded:false Collide:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:false HandleSegv:false Repro:false Trace:false}
r0 = bpf$MAP_CREATE(0x0, &(0x7f0000001480)={0x1e, 0x0, 0x2, 0x2, 0x0, 0x1}, 0x40)
bpf$MAP_GET_NEXT_KEY(0x4, &(0x7f0000000080)={r0, 0x0, 0x0}, 0x20)


C reproducer:
#endif /* REF */
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_bpf
#define __NR_bpf 321
#endif

uint64_t r[1] = {0xffffffffffffffff};

int main(void)
{
    syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
    intptr_t res = 0;
    *(uint32_t*)0x20001480 = 0x1e;
    *(uint32_t*)0x20001484 = 0;
    *(uint32_t*)0x20001488 = 2;
    *(uint32_t*)0x2000148c = 2;
    *(uint32_t*)0x20001490 = 0;
    *(uint32_t*)0x20001494 = 1;
    *(uint32_t*)0x20001498 = 0;
    memset((void*)0x2000149c, 0, 16);
    *(uint32_t*)0x200014ac = 0;
    *(uint32_t*)0x200014b0 = -1;
    *(uint32_t*)0x200014b4 = 0;
    *(uint32_t*)0x200014b8 = 0;
    *(uint32_t*)0x200014bc = 0;
    res = syscall(__NR_bpf, 0ul, 0x20001480ul, 0x40ul);
    if (res != -1)
        r[0] = res;
    *(uint32_t*)0x20000080 = r[0];
    *(uint64_t*)0x20000088 = 0;
    *(uint64_t*)0x20000090 = 0;
    *(uint64_t*)0x20000098 = 0;
    syscall(__NR_bpf, 4ul, 0x20000080ul, 0x20ul);
    return 0;
}
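
To make the raw stores in the reproducer easier to relate to the map-type question, the sketch below names the leading BPF_MAP_CREATE fields. struct map_create_attr and repro_attr() are hypothetical local helpers that mirror the first six 32-bit members of union bpf_attr; the real definition lives in <linux/bpf.h>. The mapping of 0x1e (30) to BPF_MAP_TYPE_BLOOM_FILTER is an assumption based on the 5.16 uapi enum and should be checked against the tree under test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the first BPF_MAP_CREATE members of
 * union bpf_attr, with the byte offset of each field. */
struct map_create_attr {
	uint32_t map_type;	/* 0x00 */
	uint32_t key_size;	/* 0x04 */
	uint32_t value_size;	/* 0x08 */
	uint32_t max_entries;	/* 0x0c */
	uint32_t map_flags;	/* 0x10 */
	uint32_t inner_map_fd;	/* 0x14 */
};

/* Fill the fields with the values the reproducer writes at
 * 0x20001480..0x20001494. */
static struct map_create_attr repro_attr(void)
{
	struct map_create_attr a;

	memset(&a, 0, sizeof(a));
	a.map_type = 0x1e;	/* 30; assumed BPF_MAP_TYPE_BLOOM_FILTER */
	a.key_size = 0;		/* the zero-sized key under discussion */
	a.value_size = 2;
	a.max_entries = 2;
	a.map_flags = 0;
	a.inner_map_fd = 1;
	return a;
}
```

Read this way, the reproducer creates a map with key_size == 0 and then issues command 4 (BPF_MAP_GET_NEXT_KEY) with the key and next_key pointers both set to zero.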

George

>
> Thanks,
> Daniel


2021-12-21 18:17:49

by George Kennedy

Subject: Re: [PATCH] bpf: check size before calling kvmalloc



On 12/20/2021 8:50 AM, George Kennedy wrote:
>
>
> On 12/17/2021 5:45 PM, Daniel Borkmann wrote:
>> On 12/17/21 7:48 PM, George Kennedy wrote:
>>> ZERO_SIZE_PTR ((void *)16) is returned by kvmalloc() instead of NULL
>>> if size is zero. Currently, return values from kvmalloc() are only
>>> checked for NULL. Before calling kvmalloc() check for size of zero
>>> and return error if size is zero to avoid the following crash.
>>>
>>> Reported-by: syzkaller <[email protected]>
>>> Signed-off-by: George Kennedy <[email protected]>
>>
>> Could you provide some more details, e.g. which map type is this
>> where we
>> have to assume zero-sized keys everywhere?
>>
>> (Alternatively, a link to the syzkaller report would also work, if it is public.)
>
> I don't think the report is public. Here's the report and C reproducer:
>
> George
>
Hi Daniel,

I missed another set of kvmalloc() calls. Here's another report and reproducer:

Syzkaller hit 'WARNING: kmalloc bug in bpf' bug.

------------[ cut here ]------------
WARNING: CPU: 1 PID: 15091 at mm/util.c:597 kvmalloc_node+0x11d/0x130 mm/util.c:597
Modules linked in:
CPU: 1 PID: 15091 Comm: syz-executor949 Not tainted 5.16.0-rc5-syzk #1
Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014
RIP: 0010:kvmalloc_node+0x11d/0x130 mm/util.c:597
Code: 01 00 00 00 48 89 df e8 01 4f 0c 00 49 89 c5 e9 68 ff ff ff e8 b4 82 ca ff 45 89 e5 41 81 cd 00 20 01 00 eb 95 e8 a3 82 ca ff <0f> 0b e9 4b ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44
RSP: 0018:ffff888017687b50 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000080000001 RCX: ffffffff81b63b8a
RDX: 0000000000000000 RSI: ffff888101916500 RDI: 0000000000000002
RBP: ffff888017687b70 R08: 0000000000112cc0 R09: 00000000ffffffff
R10: 0000000000000000 R11: ffffed1004a71db0 R12: 0000000000102cc0
R13: 0000000000000000 R14: 00000000ffffffff R15: ffff888025092800
FS: 00007f0794bc3740(0000) GS:ffff888107880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000500 CR3: 00000000299d0000 CR4: 00000000000006e0
Call Trace:
<TASK>
kvmalloc include/linux/slab.h:741 [inline]
map_lookup_elem kernel/bpf/syscall.c:1099 [inline]
__sys_bpf+0x415b/0x5a80 kernel/bpf/syscall.c:4618
__do_sys_bpf kernel/bpf/syscall.c:4737 [inline]
__se_sys_bpf kernel/bpf/syscall.c:4735 [inline]
__x64_sys_bpf+0x7a/0xc0 kernel/bpf/syscall.c:4735
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f07944ba289
Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 db 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc3a07dcd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f07944ba289
RDX: 0000000000000020 RSI: 0000000020000240 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004006d0
R13: 00007ffc3a07ddc0 R14: 0000000000000000 R15: 0000000000000000
</TASK>
---[ end trace 67ed3be15b904c13 ]---


Syzkaller reproducer:
# {Threaded:false Collide:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:false HandleSegv:false Repro:false Trace:false}
r0 = bpf$MAP_CREATE(0x0, &(0x7f0000000500)={0x1e, 0x0, 0x80000001, 0x1, 0x0, 0x1}, 0x40)
bpf$MAP_LOOKUP_ELEM(0x1, &(0x7f0000000240)={r0, 0x0, 0x0}, 0x20)


C reproducer:
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_bpf
#define __NR_bpf 321
#endif

uint64_t r[1] = {0xffffffffffffffff};

int main(void)
{
    syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
    intptr_t res = 0;
    *(uint32_t*)0x20000500 = 0x1e;
    *(uint32_t*)0x20000504 = 0;
    *(uint32_t*)0x20000508 = 0x80000001;
    *(uint32_t*)0x2000050c = 1;
    *(uint32_t*)0x20000510 = 0;
    *(uint32_t*)0x20000514 = 1;
    *(uint32_t*)0x20000518 = 0;
    memset((void*)0x2000051c, 0, 16);
    *(uint32_t*)0x2000052c = 0;
    *(uint32_t*)0x20000530 = -1;
    *(uint32_t*)0x20000534 = 0;
    *(uint32_t*)0x20000538 = 0;
    *(uint32_t*)0x2000053c = 0;
    res = syscall(__NR_bpf, 0ul, 0x20000500ul, 0x40ul);
    if (res != -1)
        r[0] = res;
    *(uint32_t*)0x20000240 = r[0];
    *(uint64_t*)0x20000248 = 0;
    *(uint64_t*)0x20000250 = 0;
    *(uint64_t*)0x20000258 = 0;
    syscall(__NR_bpf, 1ul, 0x20000240ul, 0x20ul);
    return 0;
}


It seems that kvmalloc() and friends are called without a size check
throughout the kernel, so perhaps the commit that made them return
ZERO_SIZE_PTR ((void *)16) should be backed out.

Should I send out a v2 of the patch that covers the other kvmalloc()
calls, or do you have a suggested fix?

Thanks,
George

>>
>> Thanks,
>> Daniel
>
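
The two reports suggest two different bad sizes reaching kvmalloc(): a key_size of 0 in the first, and a value_size of 0x80000001 in the second. A caller-side check covering both might look like the sketch below. check_alloc_size() is a hypothetical helper, not an existing kernel API, and the INT_MAX bound is an assumption that the WARNING at mm/util.c:597 comes from kvmalloc_node() rejecting sizes above INT_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: validate a user-controlled size before it is
 * handed to an allocator. Returns 0 if the size is usable. */
static int check_alloc_size(size_t size)
{
	if (size == 0)
		return -EINVAL;	/* would produce ZERO_SIZE_PTR, not NULL */
	if (size > INT_MAX)
		return -E2BIG;	/* assumed to be what trips the kvmalloc_node() warning */
	return 0;
}
```

Under these assumptions, check_alloc_size(0) and check_alloc_size(0x80000001) both fail before any allocation is attempted, while the value_size of 2 from the first reproducer passes.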