The randomize_kstack_offset feature is unconditionally compiled in when
the architecture supports it.
To add constraints on compiler versions, we require a dedicated Kconfig
variable. Therefore, introduce RANDOMIZE_KSTACK_OFFSET.
Furthermore, this option is now also configurable by EXPERT kernels:
while the feature is supposed to have zero performance overhead when
disabled, due to its use of static branches, there are few cases where
giving a distribution the option to disable the feature entirely makes
sense. For example, in very resource constrained environments, which
would never enable the feature to begin with, in which case the
additional kernel code size increase would be redundant.
Signed-off-by: Marco Elver <[email protected]>
---
arch/Kconfig | 23 ++++++++++++++++++-----
include/linux/randomize_kstack.h | 5 +++++
init/main.c | 2 +-
3 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 678a80713b21..2cde48d9b77c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1159,16 +1159,29 @@ config HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
to the compiler, so it will attempt to add canary checks regardless
of the static branch state.
-config RANDOMIZE_KSTACK_OFFSET_DEFAULT
- bool "Randomize kernel stack offset on syscall entry"
+config RANDOMIZE_KSTACK_OFFSET
+ bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
+ default y
depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
help
The kernel stack offset can be randomized (after pt_regs) by
roughly 5 bits of entropy, frustrating memory corruption
attacks that depend on stack address determinism or
- cross-syscall address exposures. This feature is controlled
- by kernel boot param "randomize_kstack_offset=on/off", and this
- config chooses the default boot state.
+ cross-syscall address exposures.
+
+ The feature is controlled via the "randomize_kstack_offset=on/off"
+ kernel boot param, and if turned off has zero overhead due to its use
+ of static branches (see JUMP_LABEL).
+
+ If unsure, say Y.
+
+config RANDOMIZE_KSTACK_OFFSET_DEFAULT
+ bool "Default state of kernel stack offset randomization"
+ depends on RANDOMIZE_KSTACK_OFFSET
+ help
+ Kernel stack offset randomization is controlled by kernel boot param
+ "randomize_kstack_offset=on/off", and this config chooses the default
+ boot state.
config ARCH_OPTIONAL_KERNEL_RWX
def_bool n
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
index bebc911161b6..91f1b990a3c3 100644
--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -2,6 +2,7 @@
#ifndef _LINUX_RANDOMIZE_KSTACK_H
#define _LINUX_RANDOMIZE_KSTACK_H
+#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
#include <linux/kernel.h>
#include <linux/jump_label.h>
#include <linux/percpu-defs.h>
@@ -50,5 +51,9 @@ void *__builtin_alloca(size_t size);
raw_cpu_write(kstack_offset, offset); \
} \
} while (0)
+#else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
+#define add_random_kstack_offset() do { } while (0)
+#define choose_random_kstack_offset(rand) do { } while (0)
+#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#endif
diff --git a/init/main.c b/init/main.c
index 65fa2e41a9c0..560f45c27ffe 100644
--- a/init/main.c
+++ b/init/main.c
@@ -853,7 +853,7 @@ static void __init mm_init(void)
pti_init();
}
-#ifdef CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
+#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
DEFINE_PER_CPU(u32, kstack_offset);
--
2.35.0.rc0.227.g00780c9af4-goog
All supported versions of Clang perform auto-init of __builtin_alloca()
when stack auto-init is on (CONFIG_INIT_STACK_ALL_{ZERO,PATTERN}).
add_random_kstack_offset() uses __builtin_alloca() to add a stack
offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
enabled, add_random_kstack_offset() will auto-init that unused portion
of the stack used to add an offset.
There are several problems with this:
1. These offsets can be as large as 1023 bytes. Performing
memset() on them isn't exactly cheap, and this is done on
every syscall entry.
2. Architectures adding add_random_kstack_offset() to syscall
entry implemented in C require them to be 'noinstr' (e.g. see
x86 and s390). The potential problem here is that a call to
memset may occur, which is not noinstr.
A x86_64 defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
| vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
| vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
| vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
| vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
Clang 14 (unreleased) will introduce a way to skip alloca initialization
via __builtin_alloca_uninitialized() (https://reviews.llvm.org/D115440).
Constrain RANDOMIZE_KSTACK_OFFSET to only be enabled if no stack
auto-init is enabled, the compiler is GCC, or Clang is version 14+. Use
__builtin_alloca_uninitialized() if the compiler provides it, as is done
by Clang 14.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Signed-off-by: Marco Elver <[email protected]>
---
arch/Kconfig | 1 +
include/linux/randomize_kstack.h | 14 ++++++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 2cde48d9b77c..c5b50bfe31c1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1163,6 +1163,7 @@ config RANDOMIZE_KSTACK_OFFSET
bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
default y
depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
+ depends on INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION >= 140000
help
The kernel stack offset can be randomized (after pt_regs) by
roughly 5 bits of entropy, frustrating memory corruption
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
index 91f1b990a3c3..5c711d73ed10 100644
--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -17,8 +17,18 @@ DECLARE_PER_CPU(u32, kstack_offset);
* alignment. Also, since this use is being explicitly masked to a max of
* 10 bits, stack-clash style attacks are unlikely. For more details see
* "VLAs" in Documentation/process/deprecated.rst
+ *
+ * The normal alloca() can be initialized with INIT_STACK_ALL. Initializing the
+ * unused area on each syscall entry is expensive, and generating an implicit
+ * call to memset() may also be problematic (such as in noinstr functions).
+ * Therefore, if the compiler provides it, use the "uninitialized" variant.
*/
-void *__builtin_alloca(size_t size);
+#if __has_builtin(__builtin_alloca_uninitialized)
+#define __kstack_alloca __builtin_alloca_uninitialized
+#else
+#define __kstack_alloca __builtin_alloca
+#endif
+
/*
* Use, at most, 10 bits of entropy. We explicitly cap this to keep the
* "VLA" from being unbounded (see above). 10 bits leaves enough room for
@@ -37,7 +47,7 @@ void *__builtin_alloca(size_t size);
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
u32 offset = raw_cpu_read(kstack_offset); \
- u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset)); \
+ u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
/* Keep allocation even after "ptr" loses scope. */ \
asm volatile("" :: "r"(ptr) : "memory"); \
} \
--
2.35.0.rc0.227.g00780c9af4-goog
On Fri, Jan 28, 2022 at 12:44:45PM +0100, Marco Elver wrote:
> The randomize_kstack_offset feature is unconditionally compiled in when
> the architecture supports it.
>
> To add constraints on compiler versions, we require a dedicated Kconfig
> variable. Therefore, introduce RANDOMIZE_KSTACK_OFFSET.
>
> Furthermore, this option is now also configurable by EXPERT kernels:
> while the feature is supposed to have zero performance overhead when
> disabled, due to its use of static branches, there are few cases where
> giving a distribution the option to disable the feature entirely makes
> sense. For example, in very resource constrained environments, which
> would never enable the feature to begin with, in which case the
> additional kernel code size increase would be redundant.
>
> Signed-off-by: Marco Elver <[email protected]>
From a Kconfig perspective:
Reviewed-by: Nathan Chancellor <[email protected]>
> ---
> arch/Kconfig | 23 ++++++++++++++++++-----
> include/linux/randomize_kstack.h | 5 +++++
> init/main.c | 2 +-
> 3 files changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 678a80713b21..2cde48d9b77c 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1159,16 +1159,29 @@ config HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> to the compiler, so it will attempt to add canary checks regardless
> of the static branch state.
>
> -config RANDOMIZE_KSTACK_OFFSET_DEFAULT
> - bool "Randomize kernel stack offset on syscall entry"
> +config RANDOMIZE_KSTACK_OFFSET
> + bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> + default y
> depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> help
> The kernel stack offset can be randomized (after pt_regs) by
> roughly 5 bits of entropy, frustrating memory corruption
> attacks that depend on stack address determinism or
> - cross-syscall address exposures. This feature is controlled
> - by kernel boot param "randomize_kstack_offset=on/off", and this
> - config chooses the default boot state.
> + cross-syscall address exposures.
> +
> + The feature is controlled via the "randomize_kstack_offset=on/off"
> + kernel boot param, and if turned off has zero overhead due to its use
> + of static branches (see JUMP_LABEL).
> +
> + If unsure, say Y.
> +
> +config RANDOMIZE_KSTACK_OFFSET_DEFAULT
> + bool "Default state of kernel stack offset randomization"
> + depends on RANDOMIZE_KSTACK_OFFSET
> + help
> + Kernel stack offset randomization is controlled by kernel boot param
> + "randomize_kstack_offset=on/off", and this config chooses the default
> + boot state.
>
> config ARCH_OPTIONAL_KERNEL_RWX
> def_bool n
> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> index bebc911161b6..91f1b990a3c3 100644
> --- a/include/linux/randomize_kstack.h
> +++ b/include/linux/randomize_kstack.h
> @@ -2,6 +2,7 @@
> #ifndef _LINUX_RANDOMIZE_KSTACK_H
> #define _LINUX_RANDOMIZE_KSTACK_H
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> #include <linux/kernel.h>
> #include <linux/jump_label.h>
> #include <linux/percpu-defs.h>
> @@ -50,5 +51,9 @@ void *__builtin_alloca(size_t size);
> raw_cpu_write(kstack_offset, offset); \
> } \
> } while (0)
> +#else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
> +#define add_random_kstack_offset() do { } while (0)
> +#define choose_random_kstack_offset(rand) do { } while (0)
> +#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
>
> #endif
> diff --git a/init/main.c b/init/main.c
> index 65fa2e41a9c0..560f45c27ffe 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -853,7 +853,7 @@ static void __init mm_init(void)
> pti_init();
> }
>
> -#ifdef CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
> randomize_kstack_offset);
> DEFINE_PER_CPU(u32, kstack_offset);
> --
> 2.35.0.rc0.227.g00780c9af4-goog
>
On Fri, Jan 28, 2022 at 12:44:46PM +0100, Marco Elver wrote:
> All supported versions of Clang perform auto-init of __builtin_alloca()
> when stack auto-init is on (CONFIG_INIT_STACK_ALL_{ZERO,PATTERN}).
>
> add_random_kstack_offset() uses __builtin_alloca() to add a stack
> offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> enabled, add_random_kstack_offset() will auto-init that unused portion
> of the stack used to add an offset.
>
> There are several problems with this:
>
> 1. These offsets can be as large as 1023 bytes. Performing
> memset() on them isn't exactly cheap, and this is done on
> every syscall entry.
>
> 2. Architectures adding add_random_kstack_offset() to syscall
> entry implemented in C require them to be 'noinstr' (e.g. see
> x86 and s390). The potential problem here is that a call to
> memset may occur, which is not noinstr.
>
> A x86_64 defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
>
> | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
> | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
> | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
> | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
>
> Clang 14 (unreleased) will introduce a way to skip alloca initialization
> via __builtin_alloca_uninitialized() (https://reviews.llvm.org/D115440).
>
> Constrain RANDOMIZE_KSTACK_OFFSET to only be enabled if no stack
> auto-init is enabled, the compiler is GCC, or Clang is version 14+. Use
> __builtin_alloca_uninitialized() if the compiler provides it, as is done
> by Clang 14.
>
> Link: https://lkml.kernel.org/r/[email protected]
> Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
> Signed-off-by: Marco Elver <[email protected]>
> ---
> arch/Kconfig | 1 +
> include/linux/randomize_kstack.h | 14 ++++++++++++--
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 2cde48d9b77c..c5b50bfe31c1 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1163,6 +1163,7 @@ config RANDOMIZE_KSTACK_OFFSET
> bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> default y
> depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> + depends on INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION >= 140000
This makes it _unavailable_ for folks with Clang < 14, which seems
too strong, especially since it's run-time off by default. I'd prefer
dropping this hunk and adding some language to the _DEFAULT help noting
the specific performance impact on Clang < 14.
> help
> The kernel stack offset can be randomized (after pt_regs) by
> roughly 5 bits of entropy, frustrating memory corruption
> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> index 91f1b990a3c3..5c711d73ed10 100644
> --- a/include/linux/randomize_kstack.h
> +++ b/include/linux/randomize_kstack.h
> @@ -17,8 +17,18 @@ DECLARE_PER_CPU(u32, kstack_offset);
> * alignment. Also, since this use is being explicitly masked to a max of
> * 10 bits, stack-clash style attacks are unlikely. For more details see
> * "VLAs" in Documentation/process/deprecated.rst
> + *
> + * The normal alloca() can be initialized with INIT_STACK_ALL. Initializing the
> + * unused area on each syscall entry is expensive, and generating an implicit
> + * call to memset() may also be problematic (such as in noinstr functions).
> + * Therefore, if the compiler provides it, use the "uninitialized" variant.
Can you include the note that GCC doesn't initialize its alloca()?
Otherwise, yeah, looks good to me.
-Kees
> */
> -void *__builtin_alloca(size_t size);
> +#if __has_builtin(__builtin_alloca_uninitialized)
> +#define __kstack_alloca __builtin_alloca_uninitialized
> +#else
> +#define __kstack_alloca __builtin_alloca
> +#endif
> +
> /*
> * Use, at most, 10 bits of entropy. We explicitly cap this to keep the
> * "VLA" from being unbounded (see above). 10 bits leaves enough room for
> @@ -37,7 +47,7 @@ void *__builtin_alloca(size_t size);
> if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
> &randomize_kstack_offset)) { \
> u32 offset = raw_cpu_read(kstack_offset); \
> - u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset)); \
> + u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
> /* Keep allocation even after "ptr" loses scope. */ \
> asm volatile("" :: "r"(ptr) : "memory"); \
> } \
> --
> 2.35.0.rc0.227.g00780c9af4-goog
>
--
Kees Cook
On Fri, 28 Jan 2022 at 20:10, Kees Cook <[email protected]> wrote:
[...]
> > 2. Architectures adding add_random_kstack_offset() to syscall
> > entry implemented in C require them to be 'noinstr' (e.g. see
> > x86 and s390). The potential problem here is that a call to
> > memset may occur, which is not noinstr.
[...]
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -1163,6 +1163,7 @@ config RANDOMIZE_KSTACK_OFFSET
> > bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> > default y
> > depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> > + depends on INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION >= 140000
>
> This makes it _unavailable_ for folks with Clang < 14, which seems
> too strong, especially since it's run-time off by default. I'd prefer
> dropping this hunk and adding some language to the _DEFAULT help noting
> the specific performance impact on Clang < 14.
You're right, if it was only about performance. But there's the
correctness issue with ARCH_WANTS_NOINSTR architectures, where we
really shouldn't emit a call. In those cases, even if compiled in,
enabling the feature may cause trouble.
That's how this got on my radar in the first place (the objtool warnings).
So my proposal is to add another "|| !ARCH_WANTS_NO_INSTR", and add
the performance note to the help text for the !ARCH_WANTS_NO_INSTR
case if Clang < 14.
Is that reasonable?
Sadly both arm64 and x86 are ARCH_WANTS_NO_INSTR. :-/
> > help
> > The kernel stack offset can be randomized (after pt_regs) by
> > roughly 5 bits of entropy, frustrating memory corruption
> > diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> > index 91f1b990a3c3..5c711d73ed10 100644
> > --- a/include/linux/randomize_kstack.h
> > +++ b/include/linux/randomize_kstack.h
> > @@ -17,8 +17,18 @@ DECLARE_PER_CPU(u32, kstack_offset);
> > * alignment. Also, since this use is being explicitly masked to a max of
> > * 10 bits, stack-clash style attacks are unlikely. For more details see
> > * "VLAs" in Documentation/process/deprecated.rst
> > + *
> > + * The normal alloca() can be initialized with INIT_STACK_ALL. Initializing the
> > + * unused area on each syscall entry is expensive, and generating an implicit
> > + * call to memset() may also be problematic (such as in noinstr functions).
> > + * Therefore, if the compiler provides it, use the "uninitialized" variant.
>
> Can you include the note that GCC doesn't initialize its alloca()?
I'm guessing this won't change any time soon, so probably adding it in
the code comment is ok.
Thanks,
-- Marco
On Fri, 28 Jan 2022 at 19:55, Nathan Chancellor <[email protected]> wrote:
[...]
>
> Reviewed-by: Nathan Chancellor <[email protected]>
>
> One comment below.
Thanks!
Though with Kees's requested changes I'll probably let you re-review it.
> > ---
> > arch/Kconfig | 1 +
> > include/linux/randomize_kstack.h | 14 ++++++++++++--
> > 2 files changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index 2cde48d9b77c..c5b50bfe31c1 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -1163,6 +1163,7 @@ config RANDOMIZE_KSTACK_OFFSET
> > bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> > default y
> > depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> > + depends on INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION >= 140000
> > help
> > The kernel stack offset can be randomized (after pt_regs) by
> > roughly 5 bits of entropy, frustrating memory corruption
> > diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> > index 91f1b990a3c3..5c711d73ed10 100644
> > --- a/include/linux/randomize_kstack.h
> > +++ b/include/linux/randomize_kstack.h
> > @@ -17,8 +17,18 @@ DECLARE_PER_CPU(u32, kstack_offset);
> > * alignment. Also, since this use is being explicitly masked to a max of
> > * 10 bits, stack-clash style attacks are unlikely. For more details see
> > * "VLAs" in Documentation/process/deprecated.rst
> > + *
> > + * The normal alloca() can be initialized with INIT_STACK_ALL. Initializing the
> > + * unused area on each syscall entry is expensive, and generating an implicit
> > + * call to memset() may also be problematic (such as in noinstr functions).
> > + * Therefore, if the compiler provides it, use the "uninitialized" variant.
> > */
> > -void *__builtin_alloca(size_t size);
>
> Is it okay to remove the declaration? Why was it even added in the first
> place (Kees)?
Declaring __builtins is redundant for as long as I remember.
On Fri, Jan 28, 2022 at 12:44:46PM +0100, Marco Elver wrote:
> All supported versions of Clang perform auto-init of __builtin_alloca()
> when stack auto-init is on (CONFIG_INIT_STACK_ALL_{ZERO,PATTERN}).
>
> add_random_kstack_offset() uses __builtin_alloca() to add a stack
> offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> enabled, add_random_kstack_offset() will auto-init that unused portion
> of the stack used to add an offset.
>
> There are several problems with this:
>
> 1. These offsets can be as large as 1023 bytes. Performing
> memset() on them isn't exactly cheap, and this is done on
> every syscall entry.
>
> 2. Architectures adding add_random_kstack_offset() to syscall
> entry implemented in C require them to be 'noinstr' (e.g. see
> x86 and s390). The potential problem here is that a call to
> memset may occur, which is not noinstr.
>
> A x86_64 defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
>
> | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
> | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
> | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
> | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
>
> Clang 14 (unreleased) will introduce a way to skip alloca initialization
> via __builtin_alloca_uninitialized() (https://reviews.llvm.org/D115440).
>
> Constrain RANDOMIZE_KSTACK_OFFSET to only be enabled if no stack
> auto-init is enabled, the compiler is GCC, or Clang is version 14+. Use
> __builtin_alloca_uninitialized() if the compiler provides it, as is done
> by Clang 14.
>
> Link: https://lkml.kernel.org/r/[email protected]
> Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
> Signed-off-by: Marco Elver <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
One comment below.
> ---
> arch/Kconfig | 1 +
> include/linux/randomize_kstack.h | 14 ++++++++++++--
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 2cde48d9b77c..c5b50bfe31c1 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1163,6 +1163,7 @@ config RANDOMIZE_KSTACK_OFFSET
> bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> default y
> depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> + depends on INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION >= 140000
> help
> The kernel stack offset can be randomized (after pt_regs) by
> roughly 5 bits of entropy, frustrating memory corruption
> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> index 91f1b990a3c3..5c711d73ed10 100644
> --- a/include/linux/randomize_kstack.h
> +++ b/include/linux/randomize_kstack.h
> @@ -17,8 +17,18 @@ DECLARE_PER_CPU(u32, kstack_offset);
> * alignment. Also, since this use is being explicitly masked to a max of
> * 10 bits, stack-clash style attacks are unlikely. For more details see
> * "VLAs" in Documentation/process/deprecated.rst
> + *
> + * The normal alloca() can be initialized with INIT_STACK_ALL. Initializing the
> + * unused area on each syscall entry is expensive, and generating an implicit
> + * call to memset() may also be problematic (such as in noinstr functions).
> + * Therefore, if the compiler provides it, use the "uninitialized" variant.
> */
> -void *__builtin_alloca(size_t size);
Is it okay to remove the declaration? Why was it even added in the first
place (Kees)?
> +#if __has_builtin(__builtin_alloca_uninitialized)
> +#define __kstack_alloca __builtin_alloca_uninitialized
> +#else
> +#define __kstack_alloca __builtin_alloca
> +#endif
> +
> /*
> * Use, at most, 10 bits of entropy. We explicitly cap this to keep the
> * "VLA" from being unbounded (see above). 10 bits leaves enough room for
> @@ -37,7 +47,7 @@ void *__builtin_alloca(size_t size);
> if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
> &randomize_kstack_offset)) { \
> u32 offset = raw_cpu_read(kstack_offset); \
> - u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset)); \
> + u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
> /* Keep allocation even after "ptr" loses scope. */ \
> asm volatile("" :: "r"(ptr) : "memory"); \
> } \
> --
> 2.35.0.rc0.227.g00780c9af4-goog
>
>
On Fri, Jan 28, 2022 at 08:23:02PM +0100, Marco Elver wrote:
> On Fri, 28 Jan 2022 at 20:10, Kees Cook <[email protected]> wrote:
> [...]
> > > 2. Architectures adding add_random_kstack_offset() to syscall
> > > entry implemented in C require them to be 'noinstr' (e.g. see
> > > x86 and s390). The potential problem here is that a call to
> > > memset may occur, which is not noinstr.
> [...]
> > > --- a/arch/Kconfig
> > > +++ b/arch/Kconfig
> > > @@ -1163,6 +1163,7 @@ config RANDOMIZE_KSTACK_OFFSET
> > > bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> > > default y
> > > depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> > > + depends on INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION >= 140000
> >
> > This makes it _unavailable_ for folks with Clang < 14, which seems
> > too strong, especially since it's run-time off by default. I'd prefer
> > dropping this hunk and adding some language to the _DEFAULT help noting
> > the specific performance impact on Clang < 14.
>
> You're right, if it was only about performance. But there's the
> correctness issue with ARCH_WANTS_NOINSTR architectures, where we
> really shouldn't emit a call. In those cases, even if compiled in,
> enabling the feature may cause trouble.
Hrm. While I suspect instrumentation of memset() from a C function that is
about to turn on instrumentation is likely quite safe, I guess the size
of the venn diagram overlap of folks wanting to use kstack randomization
and an older Clang quickly approaches zero. But everyone building with
an older Clang gets warnings spewed, so I agree: let's opt for complete
correctness here, and make this >= 14 as you have done.
--
Kees Cook