2021-03-30 21:02:44

by Kees Cook

[permalink] [raw]
Subject: [PATCH v8 0/6] Optionally randomize kernel stack offset each syscall


v8:
- switch to __this_cpu_*() (tglx)
- improve commit log details, comments, and masking (ingo, tglx)
v7: https://lore.kernel.org/lkml/[email protected]/
v6: https://lore.kernel.org/lkml/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
v3: https://lore.kernel.org/lkml/[email protected]/
v2: https://lore.kernel.org/lkml/[email protected]/
rfc: https://lore.kernel.org/kernel-hardening/[email protected]/

Hi,

This is a continuation and refactoring of Elena's earlier effort to add
kernel stack base offset randomization. In the time since the earlier
discussions, two attacks[1][2] were made public that depended on stack
determinism, so we're no longer in the position of "this is a good idea
but we have no examples of attacks". :)

Earlier discussions also devolved into debates on entropy sources, which
is mostly a red herring, given the already low entropy available due
to stack size. Regardless, entropy can be changed/improved separately
from this series as needed.

Earlier discussions also got stuck debating how much syscall overhead
was too much, but this is also a red herring since the feature itself
needs to be selectable at boot with no cost for those that don't want it:
this is solved here with static branches.

So, here is the latest improved version, made as arch-agnostic as
possible, with usage added for x86 and arm64. It also includes some small
static branch clean ups, and addresses some surprise performance issues
due to the stack canary[3].

At the very least, the first two patches can land separately (already
Acked and Reviewed), since they're kind of "separate", but introduce
macros that are used in the core stack changes.

If I can get an Ack from an arm64 maintainer, I think this could all
land via -tip to make merging easiest.

Thanks!

-Kees

[1] https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
[2] https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf
[3] https://lore.kernel.org/lkml/202003281520.A9BFF461@keescook/


Kees Cook (6):
jump_label: Provide CONFIG-driven build state defaults
init_on_alloc: Optimize static branches
stack: Optionally randomize kernel stack offset each syscall
x86/entry: Enable random_kstack_offset support
arm64: entry: Enable random_kstack_offset support
lkdtm: Add REPORT_STACK for checking stack offsets

.../admin-guide/kernel-parameters.txt | 11 ++++
Makefile | 4 ++
arch/Kconfig | 23 ++++++++
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/Makefile | 5 ++
arch/arm64/kernel/syscall.c | 16 ++++++
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 3 +
arch/x86/include/asm/entry-common.h | 16 ++++++
drivers/misc/lkdtm/bugs.c | 17 ++++++
drivers/misc/lkdtm/core.c | 1 +
drivers/misc/lkdtm/lkdtm.h | 1 +
include/linux/jump_label.h | 19 +++++++
include/linux/mm.h | 10 ++--
include/linux/randomize_kstack.h | 55 +++++++++++++++++++
init/main.c | 23 ++++++++
mm/page_alloc.c | 4 +-
mm/slab.h | 6 +-
18 files changed, 208 insertions(+), 8 deletions(-)
create mode 100644 include/linux/randomize_kstack.h

--
2.25.1


2021-03-30 21:03:01

by Kees Cook

[permalink] [raw]
Subject: [PATCH v8 6/6] lkdtm: Add REPORT_STACK for checking stack offsets

For validating the stack offset behavior, report the offset from a given
process's first seen stack address. A quick way to measure the entropy:

for i in $(seq 1 1000); do
echo "REPORT_STACK" >/sys/kernel/debug/provoke-crash/DIRECT
done
offsets=$(dmesg | grep 'Stack offset' | cut -d: -f3 | sort | uniq -c | sort -n | wc -l)
echo "$(uname -m) bits of stack entropy: $(echo "obase=2; $offsets" | bc | wc -L)"

Signed-off-by: Kees Cook <[email protected]>
---
drivers/misc/lkdtm/bugs.c | 17 +++++++++++++++++
drivers/misc/lkdtm/core.c | 1 +
drivers/misc/lkdtm/lkdtm.h | 1 +
3 files changed, 19 insertions(+)

diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index 110f5a8538e9..0e8254d0cf0b 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -134,6 +134,23 @@ noinline void lkdtm_CORRUPT_STACK_STRONG(void)
__lkdtm_CORRUPT_STACK((void *)&data);
}

+static pid_t stack_pid;
+static unsigned long stack_addr;
+
+void lkdtm_REPORT_STACK(void)
+{
+ volatile uintptr_t magic;
+ pid_t pid = task_pid_nr(current);
+
+ if (pid != stack_pid) {
+ pr_info("Starting stack offset tracking for pid %d\n", pid);
+ stack_pid = pid;
+ stack_addr = (uintptr_t)&magic;
+ }
+
+ pr_info("Stack offset: %d\n", (int)(stack_addr - (uintptr_t)&magic));
+}
+
void lkdtm_UNALIGNED_LOAD_STORE_WRITE(void)
{
static u8 data[5] __attribute__((aligned(4))) = {1, 2, 3, 4, 5};
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index b2aff4d87c01..8024b6a5cc7f 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -110,6 +110,7 @@ static const struct crashtype crashtypes[] = {
CRASHTYPE(EXHAUST_STACK),
CRASHTYPE(CORRUPT_STACK),
CRASHTYPE(CORRUPT_STACK_STRONG),
+ CRASHTYPE(REPORT_STACK),
CRASHTYPE(CORRUPT_LIST_ADD),
CRASHTYPE(CORRUPT_LIST_DEL),
CRASHTYPE(STACK_GUARD_PAGE_LEADING),
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 5ae48c64df24..99f90d3e5e9c 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -17,6 +17,7 @@ void lkdtm_LOOP(void);
void lkdtm_EXHAUST_STACK(void);
void lkdtm_CORRUPT_STACK(void);
void lkdtm_CORRUPT_STACK_STRONG(void);
+void lkdtm_REPORT_STACK(void);
void lkdtm_UNALIGNED_LOAD_STORE_WRITE(void);
void lkdtm_SOFTLOCKUP(void);
void lkdtm_HARDLOCKUP(void);
--
2.25.1

2021-03-30 21:03:10

by Kees Cook

[permalink] [raw]
Subject: [PATCH v8 5/6] arm64: entry: Enable random_kstack_offset support

Allow for a randomized stack offset on a per-syscall basis, with roughly
5 bits of entropy. (And include AAPCS rationale AAPCS thanks to Mark
Rutland.)

In order to avoid unconditional stack canaries on syscall entry (due to
the use of alloca()), also disable stack protector to avoid triggering
needless checks and slowing down the entry path. As there is no general
way to control stack protector coverage with a function attribute[1],
this must be disabled at the compilation unit level. This isn't a problem
here, though, since stack protector was not triggered before: examining
the resulting syscall.o, there are no changes in canary coverage (none
before, none now).

[1] a working __attribute__((no_stack_protector)) has been added to GCC
and Clang but has not been released in any version yet:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=346b302d09c1e6db56d9fe69048acb32fbb97845
https://reviews.llvm.org/rG4fbf84c1732fca596ad1d6e96015e19760eb8a9b

Signed-off-by: Kees Cook <[email protected]>
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/Makefile | 5 +++++
arch/arm64/kernel/syscall.c | 16 ++++++++++++++++
3 files changed, 22 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1f212b47a48a..2d0e5f544429 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -146,6 +146,7 @@ config ARM64
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_PFN_VALID
select HAVE_ARCH_PREL32_RELOCATIONS
+ select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_STACKLEAK
select HAVE_ARCH_THREAD_STRUCT_WHITELIST
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ed65576ce710..6cc97730790e 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -9,6 +9,11 @@ CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_insn.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_return_address.o = $(CC_FLAGS_FTRACE)

+# Remove stack protector to avoid triggering unneeded stack canary
+# checks due to randomize_kstack_offset.
+CFLAGS_REMOVE_syscall.o = -fstack-protector -fstack-protector-strong
+CFLAGS_syscall.o += -fno-stack-protector
+
# Object file lists.
obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
entry-common.o entry-fpsimd.o process.o ptrace.o \
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index b9cf12b271d7..263d6c1a525f 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -5,6 +5,7 @@
#include <linux/errno.h>
#include <linux/nospec.h>
#include <linux/ptrace.h>
+#include <linux/randomize_kstack.h>
#include <linux/syscalls.h>

#include <asm/daifflags.h>
@@ -43,6 +44,8 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
{
long ret;

+ add_random_kstack_offset();
+
if (scno < sc_nr) {
syscall_fn_t syscall_fn;
syscall_fn = syscall_table[array_index_nospec(scno, sc_nr)];
@@ -55,6 +58,19 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = lower_32_bits(ret);

regs->regs[0] = ret;
+
+ /*
+ * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
+ * but not enough for arm64 stack utilization comfort. To keep
+ * reasonable stack head room, reduce the maximum offset to 9 bits.
+ *
+ * The actual entropy will be further reduced by the compiler when
+ * applying stack alignment constraints: the AAPCS mandates a
+ * 16-byte (i.e. 4-bit) aligned SP at function boundaries.
+ *
+ * The resulting 5 bits of entropy is seen in SP[8:4].
+ */
+ choose_random_kstack_offset(get_random_int() & 0x1FF);
}

static inline bool has_syscall_work(unsigned long flags)
--
2.25.1

2021-04-01 19:30:09

by Roy Yang

[permalink] [raw]
Subject: [PATCH] Where we are for this patch?

Both Android and Chrome OS really want this feature; For Container-Optimized OS, we have customers
interested in the defense too.

Thank you very much.

Change-Id: I1eb1b726007aa8f9c374b934cc1c690fb4924aa3
--
2.31.0.208.g409f899ff0-goog

2021-04-01 19:50:06

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] Where we are for this patch?

On Thu, Apr 01, 2021 at 12:17:44PM -0700, Roy Yang wrote:
> Both Android and Chrome OS really want this feature; For Container-Optimized OS, we have customers
> interested in the defense too.
>
> Thank you very much.
>
> Change-Id: I1eb1b726007aa8f9c374b934cc1c690fb4924aa3

You forgot to tell what patch you are refering to. Your
Change-Id (whatever the hell that is) doesn't help at all. Don't
assume that keys in your internal database make sense for the
rest of the world, especially when they appear to contain a hash
of something...

2021-04-01 20:15:38

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] Where we are for this patch?

On Thu, Apr 01, 2021 at 07:48:30PM +0000, Al Viro wrote:
> On Thu, Apr 01, 2021 at 12:17:44PM -0700, Roy Yang wrote:
> > Both Android and Chrome OS really want this feature; For Container-Optimized OS, we have customers
> > interested in the defense too.
> >
> > Thank you very much.
> >
> > Change-Id: I1eb1b726007aa8f9c374b934cc1c690fb4924aa3
>
> You forgot to tell what patch you are refering to. Your
> Change-Id (whatever the hell that is) doesn't help at all. Don't
> assume that keys in your internal database make sense for the
> rest of the world, especially when they appear to contain a hash
> of something...

The Change-Id fails to have any direct search hits at lore.kernel.org.
However, it turn up Roy's original patch, and clicking on the
message-Id in the "In-Reply-Field", it apperas Roy was replying to
this message:

https://lore.kernel.org/lkml/[email protected]/

which is the head of this patch series:

Subject: [PATCH v8 0/6] Optionally randomize kernel stack offset each syscall

That being said, it would have been better if the original subject
line had been preserved, and it's yet another example of how the
lore.kernel.org URL is infinitely better than the Change-Id. :-)

- Ted

2021-04-01 21:47:03

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v8 0/6] Optionally randomize kernel stack offset each syscall

On Thu, Apr 01, 2021 at 12:17:44PM -0700, Roy Yang wrote:
> Both Android and Chrome OS really want this feature; For Container-Optimized OS, we have customers
> interested in the defense too.

It's pretty close! There are a couple recent comments that need to be
addressed, but hopefully it can land if x86 and arm64 maintainers are
happy v10.

> Change-Id: I1eb1b726007aa8f9c374b934cc1c690fb4924aa3
> --
> 2.31.0.208.g409f899ff0-goog

And to let other folks know, I'm guessing this email got sent with git
send-email to try to get a valid In-Reply-To header, but I guess git
trashed the Subject and ran hooks to generate a Change-Id UUID.

I assume it's from following the "Reply instructions" at the bottom of:
https://lore.kernel.org/lkml/[email protected]/
(It seems those need clarification about Subject handling.)

--
Kees Cook