Most allocation sites in the kernel want an explicitly sized allocation
(and not "more"), and that dynamic runtime analysis tools (e.g. KASAN,
UBSAN_BOUNDS, FORTIFY_SOURCE, etc) are looking for precise bounds checking
(i.e. not something that is rounded up). A tiny handful of allocations
were doing an implicit alloc/realloc loop that actually depended on
ksize(), and didn't actually always call realloc. This has created a
long series of bugs and problems over many years related to the runtime
bounds checking, so these callers are finally being adjusted to _not_
depend on the ksize() side-effect, by doing one of several things:
- tracking the allocation size precisely and just never calling ksize()
at all[1].
- always calling realloc and not using ksize() at all. (This solution
ends up actually be a subset of the next solution.)
- using kmalloc_size_roundup() to explicitly round up the desired
allocation size immediately[2].
The bpf/verifier case is this another of this latter case, and is the
last outstanding case to be fixed in the kernel.
Because some of the dynamic bounds checking depends on the size being an
_argument_ to an allocator function (i.e. see the __alloc_size attribute),
the ksize() users are rare, and it could waste local variables, it
was been deemed better to explicitly separate the rounding up from the
allocation itself[3].
Round up allocations with kmalloc_size_roundup() so that the verifier's
use of ksize() is always accurate.
[1] e.g.:
https://git.kernel.org/linus/712f210a457d
https://git.kernel.org/linus/72c08d9f4c72
[2] e.g.:
https://git.kernel.org/netdev/net-next/c/12d6c1d3a2ad
https://git.kernel.org/netdev/net-next/c/ab3f7828c979
https://git.kernel.org/netdev/net-next/c/d6dd508080a3
[3] https://lore.kernel.org/lkml/[email protected]/
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
---
v3:
- memory leak already taken into -next (daniel)
- improve commit log (daniel)
- drop optimization patch for now (sdf)
v2: https://lore.kernel.org/lkml/[email protected]/
v1: https://lore.kernel.org/lkml/[email protected]/
---
kernel/bpf/verifier.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index beed7e03addc..c596c7c75d25 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1010,9 +1010,9 @@ static void *copy_array(void *dst, const void *src, size_t n, size_t size, gfp_t
if (unlikely(check_mul_overflow(n, size, &bytes)))
return NULL;
- if (ksize(dst) < bytes) {
+ if (ksize(dst) < ksize(src)) {
kfree(dst);
- dst = kmalloc_track_caller(bytes, flags);
+ dst = kmalloc_track_caller(kmalloc_size_roundup(bytes), flags);
if (!dst)
return NULL;
}
@@ -1029,12 +1029,14 @@ static void *copy_array(void *dst, const void *src, size_t n, size_t size, gfp_t
*/
static void *realloc_array(void *arr, size_t old_n, size_t new_n, size_t size)
{
+ size_t alloc_size;
void *new_arr;
if (!new_n || old_n == new_n)
goto out;
- new_arr = krealloc_array(arr, new_n, size, GFP_KERNEL);
+ alloc_size = kmalloc_size_roundup(size_mul(new_n, size));
+ new_arr = krealloc(arr, alloc_size, GFP_KERNEL);
if (!new_arr) {
kfree(arr);
return NULL;
@@ -2506,9 +2508,11 @@ static int push_jmp_history(struct bpf_verifier_env *env,
{
u32 cnt = cur->jmp_history_cnt;
struct bpf_idx_pair *p;
+ size_t alloc_size;
cnt++;
- p = krealloc(cur->jmp_history, cnt * sizeof(*p), GFP_USER);
+ alloc_size = kmalloc_size_roundup(size_mul(cnt, sizeof(*p)));
+ p = krealloc(cur->jmp_history, alloc_size, GFP_USER);
if (!p)
return -ENOMEM;
p[cnt - 1].idx = env->insn_idx;
--
2.34.1
On Fri, Nov 18, 2022 at 10:34 AM Kees Cook <[email protected]> wrote:
>
> Most allocation sites in the kernel want an explicitly sized allocation
> (and not "more"), and that dynamic runtime analysis tools (e.g. KASAN,
> UBSAN_BOUNDS, FORTIFY_SOURCE, etc) are looking for precise bounds checking
> (i.e. not something that is rounded up). A tiny handful of allocations
> were doing an implicit alloc/realloc loop that actually depended on
> ksize(), and didn't actually always call realloc. This has created a
> long series of bugs and problems over many years related to the runtime
> bounds checking, so these callers are finally being adjusted to _not_
> depend on the ksize() side-effect, by doing one of several things:
>
> - tracking the allocation size precisely and just never calling ksize()
> at all[1].
>
> - always calling realloc and not using ksize() at all. (This solution
> ends up actually be a subset of the next solution.)
>
> - using kmalloc_size_roundup() to explicitly round up the desired
> allocation size immediately[2].
>
> The bpf/verifier case is this another of this latter case, and is the
> last outstanding case to be fixed in the kernel.
>
> Because some of the dynamic bounds checking depends on the size being an
> _argument_ to an allocator function (i.e. see the __alloc_size attribute),
> the ksize() users are rare, and it could waste local variables, it
> was been deemed better to explicitly separate the rounding up from the
> allocation itself[3].
>
> Round up allocations with kmalloc_size_roundup() so that the verifier's
> use of ksize() is always accurate.
>
> [1] e.g.:
> https://git.kernel.org/linus/712f210a457d
> https://git.kernel.org/linus/72c08d9f4c72
>
> [2] e.g.:
> https://git.kernel.org/netdev/net-next/c/12d6c1d3a2ad
> https://git.kernel.org/netdev/net-next/c/ab3f7828c979
> https://git.kernel.org/netdev/net-next/c/d6dd508080a3
>
> [3] https://lore.kernel.org/lkml/[email protected]/
>
> Cc: Alexei Starovoitov <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: John Fastabend <[email protected]>
> Cc: Andrii Nakryiko <[email protected]>
> Cc: Martin KaFai Lau <[email protected]>
> Cc: Song Liu <[email protected]>
> Cc: Yonghong Song <[email protected]>
> Cc: KP Singh <[email protected]>
> Cc: Stanislav Fomichev <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
> Cc: Hao Luo <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: [email protected]
> Signed-off-by: Kees Cook <[email protected]>
> ---
> v3:
> - memory leak already taken into -next (daniel)
> - improve commit log (daniel)
> - drop optimization patch for now (sdf)
> v2: https://lore.kernel.org/lkml/[email protected]/
> v1: https://lore.kernel.org/lkml/[email protected]/
> ---
> kernel/bpf/verifier.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index beed7e03addc..c596c7c75d25 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1010,9 +1010,9 @@ static void *copy_array(void *dst, const void *src, size_t n, size_t size, gfp_t
> if (unlikely(check_mul_overflow(n, size, &bytes)))
> return NULL;
>
> - if (ksize(dst) < bytes) {
> + if (ksize(dst) < ksize(src)) {
> kfree(dst);
> - dst = kmalloc_track_caller(bytes, flags);
> + dst = kmalloc_track_caller(kmalloc_size_roundup(bytes), flags);
> if (!dst)
> return NULL;
> }
> @@ -1029,12 +1029,14 @@ static void *copy_array(void *dst, const void *src, size_t n, size_t size, gfp_t
> */
> static void *realloc_array(void *arr, size_t old_n, size_t new_n, size_t size)
> {
> + size_t alloc_size;
> void *new_arr;
>
> if (!new_n || old_n == new_n)
> goto out;
>
> - new_arr = krealloc_array(arr, new_n, size, GFP_KERNEL);
> + alloc_size = kmalloc_size_roundup(size_mul(new_n, size));
> + new_arr = krealloc(arr, alloc_size, GFP_KERNEL);
> if (!new_arr) {
> kfree(arr);
> return NULL;
> @@ -2506,9 +2508,11 @@ static int push_jmp_history(struct bpf_verifier_env *env,
> {
> u32 cnt = cur->jmp_history_cnt;
> struct bpf_idx_pair *p;
> + size_t alloc_size;
>
> cnt++;
> - p = krealloc(cur->jmp_history, cnt * sizeof(*p), GFP_USER);
> + alloc_size = kmalloc_size_roundup(size_mul(cnt, sizeof(*p)));
> + p = krealloc(cur->jmp_history, alloc_size, GFP_USER);
> if (!p)
> return -ENOMEM;
> p[cnt - 1].idx = env->insn_idx;
> --
> 2.34.1
>
Hello:
This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <[email protected]>:
On Fri, 18 Nov 2022 10:34:14 -0800 you wrote:
> Most allocation sites in the kernel want an explicitly sized allocation
> (and not "more"), and that dynamic runtime analysis tools (e.g. KASAN,
> UBSAN_BOUNDS, FORTIFY_SOURCE, etc) are looking for precise bounds checking
> (i.e. not something that is rounded up). A tiny handful of allocations
> were doing an implicit alloc/realloc loop that actually depended on
> ksize(), and didn't actually always call realloc. This has created a
> long series of bugs and problems over many years related to the runtime
> bounds checking, so these callers are finally being adjusted to _not_
> depend on the ksize() side-effect, by doing one of several things:
>
> [...]
Here is the summary with links:
- [bpf-next,v3] bpf/verifier: Use kmalloc_size_roundup() to match ksize() usage
https://git.kernel.org/bpf/bpf-next/c/ceb35b666d42
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html