The kstack_offset variable was really only ever using the low bits for
kernel stack offset entropy. Add a ror32() to increase bit diffusion.
Suggested-by: Arnd Bergmann <[email protected]>
Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Signed-off-by: Kees Cook <[email protected]>
---
Cc: Jeremy Linton <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: [email protected]
---
include/linux/randomize_kstack.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
index 5d868505a94e..6d92b68efbf6 100644
--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -80,7 +80,7 @@ DECLARE_PER_CPU(u32, kstack_offset);
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
u32 offset = raw_cpu_read(kstack_offset); \
- offset ^= (rand); \
+ offset = ror32(offset, 5) ^ (rand); \
raw_cpu_write(kstack_offset, offset); \
} \
} while (0)
--
2.34.1
On Sat, 09 Mar 2024 12:24:48 -0800, Kees Cook wrote:
> The kstack_offset variable was really only ever using the low bits for
> kernel stack offset entropy. Add a ror32() to increase bit diffusion.
>
>
Applied to for-next/hardening:
[1/1] randomize_kstack: Improve entropy diffusion
https://git.kernel.org/kees/c/9c573cd31343
--
Kees Cook
Kees Cook <[email protected]> writes:
>
> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> index 5d868505a94e..6d92b68efbf6 100644
> --- a/include/linux/randomize_kstack.h
> +++ b/include/linux/randomize_kstack.h
> @@ -80,7 +80,7 @@ DECLARE_PER_CPU(u32, kstack_offset);
> if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
> &randomize_kstack_offset)) { \
> u32 offset = raw_cpu_read(kstack_offset); \
> - offset ^= (rand); \
> + offset = ror32(offset, 5) ^ (rand); \
Hi Kees,
I'm wondering whether this renders the per-arch mask applied to 'rand'
at the respective choose_random_kstack_offset() invocations ineffective?
Like e.g. on x86 there is
choose_random_kstack_offset(rdtsc() & 0xFF);
I would argue that while before the patch kstack_offset had been
guaranteed to stay within the bounds of 0xFF, it's now effectively
unlimited (well, <= (u32)-1) and only capped to 0x3ff when subsequently
applying the KSTACK_OFFSET_MAX().
Or am I simply missing something?
Thanks!
Nicolai
> raw_cpu_write(kstack_offset, offset); \
> } \
> } while (0)
--
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nürnberg, Germany
GF: Ivo Totev, Andrew McDonald, Werner Knoblich
(HRB 36809, AG Nürnberg)
On Wed, May 22, 2024, at 08:35, Nicolai Stange wrote:
> Kees Cook <[email protected]> writes:
>>
>> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
>> index 5d868505a94e..6d92b68efbf6 100644
>> --- a/include/linux/randomize_kstack.h
>> +++ b/include/linux/randomize_kstack.h
>> @@ -80,7 +80,7 @@ DECLARE_PER_CPU(u32, kstack_offset);
>> if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
>> &randomize_kstack_offset)) { \
>> u32 offset = raw_cpu_read(kstack_offset); \
>> - offset ^= (rand); \
>> + offset = ror32(offset, 5) ^ (rand); \
>
> Hi Kees,
>
> I'm wondering whether this renders the per-arch mask applied to 'rand'
> at the respective choose_random_kstack_offset() invocations ineffective?
>
> Like e.g. on x86 there is
>
> choose_random_kstack_offset(rdtsc() & 0xFF);
>
> I would argue that while before the patch kstack_offset had been
> guaranteed to stay within the bounds of 0xFF, it's now effectively
> unlimited (well, <= (u32)-1) and only capped to 0x3ff when subsequently
> applying the KSTACK_OFFSET_MAX().
>
> Or am I simply missing something?
Hi Nicolai,
I think you are correct and this is an unintended side-effect
of this patch. We could either restore the previous limits
or try to come up with a cross platform policy here, which
may be better in the end.
I see that out of the five architectures that have randomized
kstacks, only powerpc and riscv actually use the default
1kb range of offsets, so those are unaffected by the unintential
change.
arm64 uses a 512 byte while x86 and s390 use a 256 byte range.
As far as I can tell, there should be nothing architecture
specific about that limit, though we might want to reconsider
the total size of the stack. On architectures with 4KB
pages and CONFIG_VMAP_STACK, we should be able to have
arbitrarily sized stacks (e.g. 12kb or 20kb instead of the
default 16kb) as a compile-time selection.
If we increase the stack size by 4KB, it would be trivial
to set the random offset to the same 4KB range instead of
256/512/1024 bytes. On the other hand, I always feel
uneasy about enabling kstack randomization without
CONFIG_VMAP_STACK, so we may want to also tie it to that.
Arnd