2015-11-29 13:18:53

by Dmitry Vyukov

[permalink] [raw]
Subject: user-controllable kmalloc size in bpf syscall

Hello,

The following program triggers a WARNING in kmalloc:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <syscall.h>
#include <string.h>
#include <stdint.h>

#define SYS_bpf 321
#define BPF_MAP_CREATE 0
#define BPF_MAP_UPDATE_ELEM 2

union bpf_attr {
struct {
uint32_t map_type; /* one of enum bpf_map_type */
uint32_t key_size; /* size of key in bytes */
uint32_t value_size; /* size of value in bytes */
uint32_t max_entries; /* max number of
entries in a map */
};

struct {
uint32_t map_fd;
uint64_t key;
uint64_t val;
uint64_t flags;
};
};

int main()
{
union bpf_attr ca;
memset(&ca, 0, sizeof(ca));
ca.map_type = 1;
ca.key_size = 1;
ca.value_size = 0xfffffff9;
ca.max_entries = 10;
int fd = syscall(SYS_bpf, BPF_MAP_CREATE, &ca, sizeof(ca));
memset(&ca, 0, sizeof(ca));
ca.map_fd = fd;
ca.key = &ca;
ca.val = &ca;
ca.flags = 0;
syscall(SYS_bpf, BPF_MAP_UPDATE_ELEM, &ca, sizeof(ca));
return 0;
}


------------[ cut here ]------------
WARNING: CPU: 2 PID: 11122 at mm/page_alloc.c:2989
__alloc_pages_nodemask+0x695/0x14e0()
Modules linked in:
CPU: 2 PID: 11122 Comm: syzkaller_execu Tainted: G B 4.4.0-rc2+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
00000000ffffffff ffff880039aefa30 ffffffff82743b56 0000000000000000
ffff88003b2a0000 ffffffff85a8d0e0 ffff880039aefa70 ffffffff81244ec9
ffffffff81554e95 ffffffff85a8d0e0 0000000000000bad 0000000000000000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82743b56>] dump_stack+0x68/0x92 lib/dump_stack.c:50
[<ffffffff81244ec9>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:460
[<ffffffff812450f9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:493
[< inline >] __alloc_pages_slowpath mm/page_alloc.c:2989
[<ffffffff81554e95>] __alloc_pages_nodemask+0x695/0x14e0 mm/page_alloc.c:3235
[<ffffffff816188fe>] alloc_pages_current+0xee/0x340 mm/mempolicy.c:2055
[< inline >] alloc_pages include/linux/gfp.h:451
[<ffffffff81550706>] alloc_kmem_pages+0x16/0xf0 mm/page_alloc.c:3414
[<ffffffff815a1c89>] kmalloc_order+0x19/0x60 mm/slab_common.c:1007
[<ffffffff815a1cef>] kmalloc_order_trace+0x1f/0xa0 mm/slab_common.c:1018
[< inline >] kmalloc_large include/linux/slab.h:390
[<ffffffff81627784>] __kmalloc+0x234/0x250 mm/slub.c:3525
[< inline >] kmalloc include/linux/slab.h:463
[< inline >] map_update_elem kernel/bpf/syscall.c:288
[< inline >] SYSC_bpf kernel/bpf/syscall.c:744
[<ffffffff814f07c4>] SyS_bpf+0xfd4/0x1a20 kernel/bpf/syscall.c:695
[<ffffffff8597eeb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
---[ end trace f00640b4d448df95 ]---


On commit 78c4a49a69e910a162b05e4e8727b9bdbf948f13 (Now 25).


2015-11-29 18:22:00

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: user-controllable kmalloc size in bpf syscall

On Sun, Nov 29, 2015 at 02:18:29PM +0100, Dmitry Vyukov wrote:
> ca.key_size = 1;
> ca.value_size = 0xfffffff9;
> ca.max_entries = 10;
> int fd = syscall(SYS_bpf, BPF_MAP_CREATE, &ca, sizeof(ca));
...
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 11122 at mm/page_alloc.c:2989
> __alloc_pages_nodemask+0x695/0x14e0()

thanks for the report. That's an integer overflow :(
working on the fix.

2015-11-30 00:59:44

by Alexei Starovoitov

[permalink] [raw]
Subject: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

From: Alexei Starovoitov <[email protected]>

For large map->value_size the user space can trigger memory allocation warnings like:
WARNING: CPU: 2 PID: 11122 at mm/page_alloc.c:2989
__alloc_pages_nodemask+0x695/0x14e0()
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82743b56>] dump_stack+0x68/0x92 lib/dump_stack.c:50
[<ffffffff81244ec9>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:460
[<ffffffff812450f9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:493
[< inline >] __alloc_pages_slowpath mm/page_alloc.c:2989
[<ffffffff81554e95>] __alloc_pages_nodemask+0x695/0x14e0 mm/page_alloc.c:3235
[<ffffffff816188fe>] alloc_pages_current+0xee/0x340 mm/mempolicy.c:2055
[< inline >] alloc_pages include/linux/gfp.h:451
[<ffffffff81550706>] alloc_kmem_pages+0x16/0xf0 mm/page_alloc.c:3414
[<ffffffff815a1c89>] kmalloc_order+0x19/0x60 mm/slab_common.c:1007
[<ffffffff815a1cef>] kmalloc_order_trace+0x1f/0xa0 mm/slab_common.c:1018
[< inline >] kmalloc_large include/linux/slab.h:390
[<ffffffff81627784>] __kmalloc+0x234/0x250 mm/slub.c:3525
[< inline >] kmalloc include/linux/slab.h:463
[< inline >] map_update_elem kernel/bpf/syscall.c:288
[< inline >] SYSC_bpf kernel/bpf/syscall.c:744

To avoid never succeeding kmalloc with order >= MAX_ORDER check that
elem->value_size and computed elem_size are within limits for both hash and
array type maps.
Also add __GFP_NOWARN to kmalloc(value_size | elem_size) to avoid OOM warnings.
Note kmalloc(key_size) is highly unlikely to trigger OOM, since key_size <= 512,
so keep those kmalloc-s as-is.

Large value_size can cause integer overflows in elem_size and map.pages
formulas, so check for that as well.

Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
---
The bug is serious in 4.4 where unprivileged bpf was introduced.
Also would be great to back port further all the way to the beginning of maps:
commit 0f8e4bd8a1fc ("bpf: add hashtable type of eBPF maps").
Only hunk about htab->map.pages is not applicable.
---
kernel/bpf/arraymap.c | 8 +++++++-
kernel/bpf/hashtab.c | 34 +++++++++++++++++++++++++---------
kernel/bpf/syscall.c | 4 ++--
3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 3f4c99e06c6b..b1e53b79c586 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
attr->value_size == 0)
return ERR_PTR(-EINVAL);

+ if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
+ /* if value_size is bigger, the user space won't be able to
+ * access the elements.
+ */
+ return ERR_PTR(-E2BIG);
+
elem_size = round_up(attr->value_size, 8);

/* check round_up into zero and u32 overflow */
if (elem_size == 0 ||
- attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
+ attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
return ERR_PTR(-ENOMEM);

array_size = sizeof(*array) + attr->max_entries * elem_size;
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 19909b22b4f8..34777b3746fa 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -64,12 +64,35 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
*/
goto free_htab;

- err = -ENOMEM;
+ if (htab->map.value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) -
+ MAX_BPF_STACK - sizeof(struct htab_elem))
+ /* if value_size is bigger, the user space won't be able to
+ * access the elements via bpf syscall. This check also makes
+ * sure that the elem_size doesn't overflow and it's
+ * kmalloc-able later in htab_map_update_elem()
+ */
+ goto free_htab;
+
+ htab->elem_size = sizeof(struct htab_elem) +
+ round_up(htab->map.key_size, 8) +
+ htab->map.value_size;
+
/* prevent zero size kmalloc and check for u32 overflow */
if (htab->n_buckets == 0 ||
htab->n_buckets > U32_MAX / sizeof(struct hlist_head))
goto free_htab;

+ if ((u64) htab->n_buckets * sizeof(struct hlist_head) +
+ (u64) htab->elem_size * htab->map.max_entries >=
+ U32_MAX - PAGE_SIZE)
+ /* make sure page count doesn't overflow */
+ goto free_htab;
+
+ htab->map.pages = round_up(htab->n_buckets * sizeof(struct hlist_head) +
+ htab->elem_size * htab->map.max_entries,
+ PAGE_SIZE) >> PAGE_SHIFT;
+
+ err = -ENOMEM;
htab->buckets = kmalloc_array(htab->n_buckets, sizeof(struct hlist_head),
GFP_USER | __GFP_NOWARN);

@@ -85,13 +108,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
raw_spin_lock_init(&htab->lock);
htab->count = 0;

- htab->elem_size = sizeof(struct htab_elem) +
- round_up(htab->map.key_size, 8) +
- htab->map.value_size;
-
- htab->map.pages = round_up(htab->n_buckets * sizeof(struct hlist_head) +
- htab->elem_size * htab->map.max_entries,
- PAGE_SIZE) >> PAGE_SHIFT;
return &htab->map;

free_htab:
@@ -222,7 +238,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
WARN_ON_ONCE(!rcu_read_lock_held());

/* allocate new element outside of lock */
- l_new = kmalloc(htab->elem_size, GFP_ATOMIC);
+ l_new = kmalloc(htab->elem_size, GFP_ATOMIC | __GFP_NOWARN);
if (!l_new)
return -ENOMEM;

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4a8f3c1d7da6..3b39550d8485 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -240,7 +240,7 @@ static int map_lookup_elem(union bpf_attr *attr)
goto free_key;

err = -ENOMEM;
- value = kmalloc(map->value_size, GFP_USER);
+ value = kmalloc(map->value_size, GFP_USER | __GFP_NOWARN);
if (!value)
goto free_key;

@@ -299,7 +299,7 @@ static int map_update_elem(union bpf_attr *attr)
goto free_key;

err = -ENOMEM;
- value = kmalloc(map->value_size, GFP_USER);
+ value = kmalloc(map->value_size, GFP_USER | __GFP_NOWARN);
if (!value)
goto free_key;

--
2.4.6

2015-11-30 13:52:37

by Daniel Borkmann

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On 11/30/2015 01:59 AM, Alexei Starovoitov wrote:
[...]
> For large map->value_size the user space can trigger memory allocation warnings like:
[...]

> To avoid never succeeding kmalloc with order >= MAX_ORDER check that
> elem->value_size and computed elem_size are within limits for both hash and
> array type maps.
[...]

> Large value_size can cause integer overflows in elem_size and map.pages
> formulas, so check for that as well.
[...]

> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 3f4c99e06c6b..b1e53b79c586 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
> attr->value_size == 0)
> return ERR_PTR(-EINVAL);
>
> + if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
> + /* if value_size is bigger, the user space won't be able to
> + * access the elements.
> + */
> + return ERR_PTR(-E2BIG);
> +

Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN already
and if that fails, we fall back to vzalloc(), it shouldn't trigger memory allocation
warnings here ...

Then, integer overflow in elem_size with round_up(attr->value_size, 8) could only
result in 0, which is already tested below.

> elem_size = round_up(attr->value_size, 8);
>
> /* check round_up into zero and u32 overflow */
> if (elem_size == 0 ||
> - attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
> + attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
> return ERR_PTR(-ENOMEM);

... and this change seems to be needed for the integer overflow in map.pages?

So if the first check above intends to check for some size overflow (?), how is it
then related to KMALLOC_SHIFT_MAX?

Thanks,
Daniel

2015-11-30 13:58:03

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On Mon, Nov 30, 2015 at 2:52 PM, Daniel Borkmann <[email protected]> wrote:
> On 11/30/2015 01:59 AM, Alexei Starovoitov wrote:
> [...]
>>
>> For large map->value_size the user space can trigger memory allocation
>> warnings like:
>
> [...]
>
>> To avoid never succeeding kmalloc with order >= MAX_ORDER check that
>> elem->value_size and computed elem_size are within limits for both hash
>> and
>> array type maps.
>
> [...]
>
>> Large value_size can cause integer overflows in elem_size and map.pages
>> formulas, so check for that as well.
>
> [...]
>
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 3f4c99e06c6b..b1e53b79c586 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr
>> *attr)
>> attr->value_size == 0)
>> return ERR_PTR(-EINVAL);
>>
>> + if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
>> + /* if value_size is bigger, the user space won't be able
>> to
>> + * access the elements.
>> + */
>> + return ERR_PTR(-E2BIG);
>> +
>
>
> Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN
> already
> and if that fails, we fall back to vzalloc(), it shouldn't trigger memory
> allocation
> warnings here ...
>
> Then, integer overflow in elem_size with round_up(attr->value_size, 8) could
> only
> result in 0, which is already tested below.
>
>> elem_size = round_up(attr->value_size, 8);
>>
>> /* check round_up into zero and u32 overflow */
>> if (elem_size == 0 ||
>> - attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
>> + attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) /
>> elem_size)
>> return ERR_PTR(-ENOMEM);
>
>
> ... and this change seems to be needed for the integer overflow in
> map.pages?
>
> So if the first check above intends to check for some size overflow (?), how
> is it
> then related to KMALLOC_SHIFT_MAX?


kamlloc produces a WARNING if you try to allocate more than it ever
possibly can (KMALLOC_SHIFT_MAX).

2015-11-30 14:13:56

by Daniel Borkmann

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On 11/30/2015 02:57 PM, Dmitry Vyukov wrote:
...
> kamlloc produces a WARNING if you try to allocate more than it ever
> possibly can (KMALLOC_SHIFT_MAX).

Sure, I understand that.

The kzalloc() in array_map_alloc() is however with __GFP_NOWARN flag already.
The warning only triggers in mm if:

WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));

Your test case is using ca.map_type = 1, which is BPF_MAP_TYPE_HASH. So on
update you're triggering the kmalloc() in htab_map_update_elem().

I'm just asking about the added change in array map.

Thanks,
Daniel

2015-11-30 14:16:51

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On Mon, Nov 30, 2015 at 3:13 PM, Daniel Borkmann <[email protected]> wrote:
> On 11/30/2015 02:57 PM, Dmitry Vyukov wrote:
> ...
>>
>> kamlloc produces a WARNING if you try to allocate more than it ever
>> possibly can (KMALLOC_SHIFT_MAX).
>
>
> Sure, I understand that.
>
> The kzalloc() in array_map_alloc() is however with __GFP_NOWARN flag
> already.
> The warning only triggers in mm if:
>
> WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
>
> Your test case is using ca.map_type = 1, which is BPF_MAP_TYPE_HASH. So on
> update you're triggering the kmalloc() in htab_map_update_elem().
>
> I'm just asking about the added change in array map.


Then, sorry.
Let's wait for Alexei.

2015-11-30 14:34:42

by Daniel Borkmann

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On 11/30/2015 02:52 PM, Daniel Borkmann wrote:
> On 11/30/2015 01:59 AM, Alexei Starovoitov wrote:
> [...]
>> For large map->value_size the user space can trigger memory allocation warnings like:
> [...]
>
>> To avoid never succeeding kmalloc with order >= MAX_ORDER check that
>> elem->value_size and computed elem_size are within limits for both hash and
>> array type maps.
> [...]
>
>> Large value_size can cause integer overflows in elem_size and map.pages
>> formulas, so check for that as well.
> [...]
>
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 3f4c99e06c6b..b1e53b79c586 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>> attr->value_size == 0)
>> return ERR_PTR(-EINVAL);
>>
>> + if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
>> + /* if value_size is bigger, the user space won't be able to
>> + * access the elements.
>> + */
>> + return ERR_PTR(-E2BIG);
>> +
>
> Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN already
> and if that fails, we fall back to vzalloc(), it shouldn't trigger memory allocation
> warnings here ...
>
> Then, integer overflow in elem_size with round_up(attr->value_size, 8) could only
> result in 0, which is already tested below.
>
>> elem_size = round_up(attr->value_size, 8);
>>
>> /* check round_up into zero and u32 overflow */
>> if (elem_size == 0 ||
>> - attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
>> + attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
>> return ERR_PTR(-ENOMEM);
>
> ... and this change seems to be needed for the integer overflow in map.pages?
>
> So if the first check above intends to check for some size overflow (?), how is it
> then related to KMALLOC_SHIFT_MAX?

Ok, I see. The check and comment is related to the fact that when we do bpf(2)
syscall to lookup an element:

We call map_lookup_elem(), which does kmalloc() on the value_size.

So an individual entry lookup could fail with kmalloc() there, unrelated to an
individual map implementation.

Hmm, seems this patch fixes many things at once, maybe makes sense to split it?

2015-11-30 18:13:17

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On Mon, Nov 30, 2015 at 03:34:35PM +0100, Daniel Borkmann wrote:
> >>diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> >>index 3f4c99e06c6b..b1e53b79c586 100644
> >>--- a/kernel/bpf/arraymap.c
> >>+++ b/kernel/bpf/arraymap.c
> >>@@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
> >> attr->value_size == 0)
> >> return ERR_PTR(-EINVAL);
> >>
> >>+ if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
> >>+ /* if value_size is bigger, the user space won't be able to
> >>+ * access the elements.
> >>+ */
> >>+ return ERR_PTR(-E2BIG);
> >>+
> >
> >Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN already
> >and if that fails, we fall back to vzalloc(), it shouldn't trigger memory allocation
> >warnings here ...

not quite, the above check is for kmalloc-s in syscall.c

> Ok, I see. The check and comment is related to the fact that when we do bpf(2)
> syscall to lookup an element:
>
> We call map_lookup_elem(), which does kmalloc() on the value_size.
>
> So an individual entry lookup could fail with kmalloc() there, unrelated to an
> individual map implementation.

kmalloc with order >= MAX_ORDER warning can be seen in syscall for update/lookup
commands regardless of map implememtation.
So the maps with "value_size >= 1 << (KMALLOC_SHIFT_MAX - 1)" were not accessible
from user space anyway.
This check in arraymap.c fixes the warning and prevents creation of such
maps in the first place as the comment right below it says.
Similar check in hashmap.c fixes warning, prevents abnormal map creation and fixes
integer overflow which is the most dangerous of them all.

The check in arraymap.c
- attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
+ attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
fixes potential integer overflow in map.pages computation.

and similar check in hashtab.c:
(u64) htab->elem_size * htab->map.max_entries >= U32_MAX - PAGE_SIZE
fixes integer overflow in map.pages as well.

the 'value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) - MAX_BPF_STACK - sizeof(struct htab_elem)'
check in hashmap.c fixes integer overflow in elem_size and
makes elem_size kmalloc-able later in htab_map_update_elem().
Since it wasn't obvious that this one 'if' addresses these multiple issues,
I've added a comment there.

Addition of __GFP_NOWARN only fixes OOM warning as commit log says.

> Hmm, seems this patch fixes many things at once, maybe makes sense to split it?

hmm I don't see a point of changing the same single line over multipe patches.
The split won't help backporting, but rather makes for more patches to deal with.

2015-11-30 22:16:55

by Daniel Borkmann

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On 11/30/2015 07:13 PM, Alexei Starovoitov wrote:
> On Mon, Nov 30, 2015 at 03:34:35PM +0100, Daniel Borkmann wrote:
>>>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>>>> index 3f4c99e06c6b..b1e53b79c586 100644
>>>> --- a/kernel/bpf/arraymap.c
>>>> +++ b/kernel/bpf/arraymap.c
>>>> @@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>>>> attr->value_size == 0)
>>>> return ERR_PTR(-EINVAL);
>>>>
>>>> + if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
>>>> + /* if value_size is bigger, the user space won't be able to
>>>> + * access the elements.
>>>> + */
>>>> + return ERR_PTR(-E2BIG);
>>>> +
>>>
>>> Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN already
>>> and if that fails, we fall back to vzalloc(), it shouldn't trigger memory allocation
>>> warnings here ...
>
> not quite, the above check is for kmalloc-s in syscall.c
>
>> Ok, I see. The check and comment is related to the fact that when we do bpf(2)
>> syscall to lookup an element:
>>
>> We call map_lookup_elem(), which does kmalloc() on the value_size.
>>
>> So an individual entry lookup could fail with kmalloc() there, unrelated to an
>> individual map implementation.
>
> kmalloc with order >= MAX_ORDER warning can be seen in syscall for update/lookup
> commands regardless of map implememtation.
> So the maps with "value_size >= 1 << (KMALLOC_SHIFT_MAX - 1)" were not accessible
> from user space anyway.
> This check in arraymap.c fixes the warning and prevents creation of such
> maps in the first place as the comment right below it says.

Yeah, right. Noticed that later on. It was a bit confusing at first as I didn't
parse that clearly from the commit message itself.

> Similar check in hashmap.c fixes warning, prevents abnormal map creation and fixes
> integer overflow which is the most dangerous of them all.
>
> The check in arraymap.c
> - attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
> + attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
> fixes potential integer overflow in map.pages computation.
>
> and similar check in hashtab.c:
> (u64) htab->elem_size * htab->map.max_entries >= U32_MAX - PAGE_SIZE
> fixes integer overflow in map.pages as well.

Yep, got that part.

> the 'value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) - MAX_BPF_STACK - sizeof(struct htab_elem)'
> check in hashmap.c fixes integer overflow in elem_size and
> makes elem_size kmalloc-able later in htab_map_update_elem().
> Since it wasn't obvious that this one 'if' addresses these multiple issues,
> I've added a comment there.

... and the MAX_BPF_STACK stands for the maximum key part here, okay.

So, when creating a sufficiently large map where map->key_size + map->value_size
would be > MAX_BPF_STACK (but map->key_size still <= MAX_BPF_STACK), we can only
read the map from an eBPF program, but not update it. In such cases, updates could
only happen from user space application.

> Addition of __GFP_NOWARN only fixes OOM warning as commit log says.

That's obvious, too.

>> Hmm, seems this patch fixes many things at once, maybe makes sense to split it?
>
> hmm I don't see a point of changing the same single line over multipe patches.
> The split won't help backporting, but rather makes for more patches to deal with.

2015-11-30 23:30:40

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

On Mon, Nov 30, 2015 at 11:16:46PM +0100, Daniel Borkmann wrote:
>
> So, when creating a sufficiently large map where map->key_size + map->value_size
> would be > MAX_BPF_STACK (but map->key_size still <= MAX_BPF_STACK), we can only
> read the map from an eBPF program, but not update it. In such cases, updates could
> only happen from user space application.

yes and no.
If both key_size + value_size > MAX_BPF_STACK, the program cannot technically
call bpf_map_update_elem() helper, but the user space can still populate large map
elements and the program can update it, since it can have a pointer via
bpf_map_lookup_elem(). So depends on definition of 'update'.

btw, the large-ish key support is actually needed too, since on tracing side
we need to be able to do map[kernel_stack_and_user_stack]++ and key is multipage long.
On userspace side it will be consumed by flamegraphs.

2015-12-03 04:36:34

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

From: Alexei Starovoitov <[email protected]>
Date: Sun, 29 Nov 2015 16:59:35 -0800

> From: Alexei Starovoitov <[email protected]>
>
> For large map->value_size the user space can trigger memory allocation warnings like:
...
> To avoid never succeeding kmalloc with order >= MAX_ORDER check that
> elem->value_size and computed elem_size are within limits for both hash and
> array type maps.
> Also add __GFP_NOWARN to kmalloc(value_size | elem_size) to avoid OOM warnings.
> Note kmalloc(key_size) is highly unlikely to trigger OOM, since key_size <= 512,
> so keep those kmalloc-s as-is.
>
> Large value_size can cause integer overflows in elem_size and map.pages
> formulas, so check for that as well.
>
> Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> Reported-by: Dmitry Vyukov <[email protected]>
> Signed-off-by: Alexei Starovoitov <[email protected]>

Applied and queued up for -stable, thanks.