2022-05-28 18:17:50

by David Gow

[permalink] [raw]
Subject: [PATCH v2 2/2] UML: add support for KASAN under x86_64

From: Patricia Alfonso <[email protected]>

Make KASAN run on User Mode Linux on x86_64.

The UML-specific KASAN initializer uses mmap to map the roughly 2.25TB
of shadow memory to the location defined by KASAN_SHADOW_OFFSET.
kasan_init() utilizes constructors to initialize KASAN before main().

The location of the KASAN shadow memory, starting at
KASAN_SHADOW_OFFSET, can be configured using the KASAN_SHADOW_OFFSET
option. UML uses roughly 18TB of address space, and KASAN requires 1/8th
of this. The default location of this offset is 0x100000000000, which
keeps it out-of-the-way even on UML setups with more "physical" memory.

For low-memory setups, 0x7fff8000 can be used instead, which fits in an
immediate and is therefore faster, as suggested by Dmitry Vyukov. There
is usually enough free space at this location; however, it is a config
option so that it can be easily changed if needed.

Note that, unlike KASAN on other architectures, vmalloc allocations
still use the shadow memory allocated upfront, rather than allocating
and free-ing it per-vmalloc allocation.

Also note that, while UML supports both KASAN in inline mode
(CONFIG_KASAN_INLINE) and static linking (CONFIG_STATIC_LINK), it does
not support both at the same time.

Signed-off-by: Patricia Alfonso <[email protected]>
Co-developed-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: David Gow <[email protected]>
---

This is v2 of the KASAN/UML port. It should be ready to go.

It does benefit significantly from the following patches:
- Bugfix for memory corruption, needed for KASAN_STACK support:
https://lore.kernel.org/lkml/[email protected]/
- Fix CONFIG_CONSTRUCTORS on static linked kernels:
https://lore.kernel.org/all/[email protected]/
- Improve callstack reporting for Tiny RCU:
https://lore.kernel.org/all/20220527170743.04c21d235467.I1f79da0f90fb9b557ec34932136c656bc64b8fbf@changeid/

Changes since RFC v3:
https://lore.kernel.org/all/[email protected]/
- No longer print "KernelAddressSanitizer initialized" (Johannes)
- Document the reason for the CONFIG_UML checks in shadow.c (Dmitry)
- Support static builds via kasan_arch_is_ready() (Dmitry)
- Get rid of a redundant call to kasam_mem_to_shadow() (Dmitry)
- Use PAGE_ALIGN and the new PAGE_ALIGN_DOWN macros (Dmitry)
- Reinstate missing arch/um/include/asm/kasan.h file (Johannes)

Changes since v1:
https://lore.kernel.org/all/[email protected]/
- Include several fixes from Vincent Whitchurch:
https://lore.kernel.org/all/[email protected]/
- Support for KASAN_VMALLOC, by changing the way
kasan_{populate,release}_vmalloc work to update existing shadow
memory, rather than allocating anything new.
- A similar fix for modules' shadow memory.
- Support for KASAN_STACK
- This requires the bugfix here:
https://lore.kernel.org/lkml/[email protected]/
- Plus a couple of files excluded from KASAN.
- Revert the default shadow offset to 0x100000000000
- This was breaking when mem=1G for me, at least.
- A few minor fixes to linker sections and scripts.
- I've added one to dyn.lds.S on top of the ones Vincent added.


---
arch/um/Kconfig | 15 +++++++++++++
arch/um/Makefile | 6 ++++++
arch/um/include/asm/common.lds.S | 2 ++
arch/um/include/asm/kasan.h | 37 ++++++++++++++++++++++++++++++++
arch/um/kernel/Makefile | 3 +++
arch/um/kernel/dyn.lds.S | 6 +++++-
arch/um/kernel/mem.c | 19 ++++++++++++++++
arch/um/os-Linux/mem.c | 22 +++++++++++++++++++
arch/um/os-Linux/user_syms.c | 4 ++--
arch/x86/um/Makefile | 3 ++-
arch/x86/um/vdso/Makefile | 3 +++
mm/kasan/shadow.c | 36 +++++++++++++++++++++++++++++--
12 files changed, 150 insertions(+), 6 deletions(-)
create mode 100644 arch/um/include/asm/kasan.h

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 4d398b80aea8..c28ea5c89381 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -11,6 +11,8 @@ config UML
select ARCH_HAS_STRNLEN_USER
select ARCH_NO_PREEMPT
select HAVE_ARCH_AUDITSYSCALL
+ select HAVE_ARCH_KASAN if X86_64
+ select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ASM_MODVERSIONS
select HAVE_UID16
@@ -219,6 +221,19 @@ config UML_TIME_TRAVEL_SUPPORT

It is safe to say Y, but you probably don't need this.

+config KASAN_SHADOW_OFFSET
+ hex
+ depends on KASAN
+ default 0x100000000000
+ help
+ This is the offset at which the ~2.25TB of shadow memory is
+ mapped and used by KASAN for memory debugging. This can be any
+ address that has at least KASAN_SHADOW_SIZE(total address space divided
+ by 8) amount of space so that the KASAN shadow memory does not conflict
+ with anything. The default is 0x100000000000, which works even if mem is
+ set to a large value. On low-memory systems, try 0x7fff8000, as it fits
+ into the immediate of most instructions, improving performance.
+
endmenu

source "arch/um/drivers/Kconfig"
diff --git a/arch/um/Makefile b/arch/um/Makefile
index f2fe63bfd819..a98405f4ecb8 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -75,6 +75,12 @@ USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
-idirafter $(objtree)/include -D__KERNEL__ -D__UM_HOST__

+# Kernel config options are not included in USER_CFLAGS, but the option for KASAN
+# should be included if the KASAN config option was set.
+ifdef CONFIG_KASAN
+ USER_CFLAGS+=-DCONFIG_KASAN=y
+endif
+
#This will adjust *FLAGS accordingly to the platform.
include $(srctree)/$(ARCH_DIR)/Makefile-os-$(OS)

diff --git a/arch/um/include/asm/common.lds.S b/arch/um/include/asm/common.lds.S
index eca6c452a41b..fd481ac371de 100644
--- a/arch/um/include/asm/common.lds.S
+++ b/arch/um/include/asm/common.lds.S
@@ -83,6 +83,8 @@
}
.init_array : {
__init_array_start = .;
+ *(.kasan_init)
+ *(.init_array.*)
*(.init_array)
__init_array_end = .;
}
diff --git a/arch/um/include/asm/kasan.h b/arch/um/include/asm/kasan.h
new file mode 100644
index 000000000000..0d6547f4ec85
--- /dev/null
+++ b/arch/um/include/asm/kasan.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_UM_KASAN_H
+#define __ASM_UM_KASAN_H
+
+#include <linux/init.h>
+#include <linux/const.h>
+
+#define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
+
+/* used in kasan_mem_to_shadow to divide by 8 */
+#define KASAN_SHADOW_SCALE_SHIFT 3
+
+#ifdef CONFIG_X86_64
+#define KASAN_HOST_USER_SPACE_END_ADDR 0x00007fffffffffffUL
+/* KASAN_SHADOW_SIZE is the size of total address space divided by 8 */
+#define KASAN_SHADOW_SIZE ((KASAN_HOST_USER_SPACE_END_ADDR + 1) >> \
+ KASAN_SHADOW_SCALE_SHIFT)
+#else
+#error "KASAN_SHADOW_SIZE is not defined for this sub-architecture"
+#endif /* CONFIG_X86_64 */
+
+#define KASAN_SHADOW_START (KASAN_SHADOW_OFFSET)
+#define KASAN_SHADOW_END (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
+
+#ifdef CONFIG_KASAN
+void kasan_init(void);
+void kasan_map_memory(void *start, unsigned long len);
+extern int kasan_um_is_ready;
+
+#ifdef CONFIG_STATIC_LINK
+#define kasan_arch_is_ready() (kasan_um_is_ready)
+#endif
+#else
+static inline void kasan_init(void) { }
+#endif /* CONFIG_KASAN */
+
+#endif /* __ASM_UM_KASAN_H */
diff --git a/arch/um/kernel/Makefile b/arch/um/kernel/Makefile
index 1c2d4b29a3d4..a089217e2f0e 100644
--- a/arch/um/kernel/Makefile
+++ b/arch/um/kernel/Makefile
@@ -27,6 +27,9 @@ obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-$(CONFIG_GENERIC_PCI_IOMAP) += ioport.o

+KASAN_SANITIZE_stacktrace.o := n
+KASAN_SANITIZE_sysrq.o := n
+
USER_OBJS := config.o

include arch/um/scripts/Makefile.rules
diff --git a/arch/um/kernel/dyn.lds.S b/arch/um/kernel/dyn.lds.S
index 2f2a8ce92f1e..2b7fc5b54164 100644
--- a/arch/um/kernel/dyn.lds.S
+++ b/arch/um/kernel/dyn.lds.S
@@ -109,7 +109,11 @@ SECTIONS
be empty, which isn't pretty. */
. = ALIGN(32 / 8);
.preinit_array : { *(.preinit_array) }
- .init_array : { *(.init_array) }
+ .init_array : {
+ *(.kasan_init)
+ *(.init_array.*)
+ *(.init_array)
+ }
.fini_array : { *(.fini_array) }
.data : {
INIT_TASK_DATA(KERNEL_STACK_SIZE)
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 15295c3237a0..276a1f0b91f1 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -18,6 +18,25 @@
#include <kern_util.h>
#include <mem_user.h>
#include <os.h>
+#include <linux/sched/task.h>
+
+#ifdef CONFIG_KASAN
+int kasan_um_is_ready;
+void kasan_init(void)
+{
+ /*
+ * kasan_map_memory will map all of the required address space and
+ * the host machine will allocate physical memory as necessary.
+ */
+ kasan_map_memory((void *)KASAN_SHADOW_START, KASAN_SHADOW_SIZE);
+ init_task.kasan_depth = 0;
+ kasan_um_is_ready = true;
+}
+
+static void (*kasan_init_ptr)(void)
+__section(".kasan_init") __used
+= kasan_init;
+#endif

/* allocated in paging_init, zeroed in mem_init, and unchanged thereafter */
unsigned long *empty_zero_page = NULL;
diff --git a/arch/um/os-Linux/mem.c b/arch/um/os-Linux/mem.c
index 3c1b77474d2d..8530b2e08604 100644
--- a/arch/um/os-Linux/mem.c
+++ b/arch/um/os-Linux/mem.c
@@ -17,6 +17,28 @@
#include <init.h>
#include <os.h>

+/*
+ * kasan_map_memory - maps memory from @start with a size of @len.
+ * The allocated memory is filled with zeroes upon success.
+ * @start: the start address of the memory to be mapped
+ * @len: the length of the memory to be mapped
+ *
+ * This function is used to map shadow memory for KASAN in uml
+ */
+void kasan_map_memory(void *start, size_t len)
+{
+ if (mmap(start,
+ len,
+ PROT_READ|PROT_WRITE,
+ MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE,
+ -1,
+ 0) == MAP_FAILED) {
+ os_info("Couldn't allocate shadow memory: %s\n.",
+ strerror(errno));
+ exit(1);
+ }
+}
+
/* Set by make_tempfile() during early boot. */
static char *tempdir = NULL;

diff --git a/arch/um/os-Linux/user_syms.c b/arch/um/os-Linux/user_syms.c
index 715594fe5719..cb667c9225ab 100644
--- a/arch/um/os-Linux/user_syms.c
+++ b/arch/um/os-Linux/user_syms.c
@@ -27,10 +27,10 @@ EXPORT_SYMBOL(strstr);
#ifndef __x86_64__
extern void *memcpy(void *, const void *, size_t);
EXPORT_SYMBOL(memcpy);
-#endif
-
EXPORT_SYMBOL(memmove);
EXPORT_SYMBOL(memset);
+#endif
+
EXPORT_SYMBOL(printf);

/* Here, instead, I can provide a fake prototype. Yes, someone cares: genksyms.
diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
index ba5789c35809..f778e37494ba 100644
--- a/arch/x86/um/Makefile
+++ b/arch/x86/um/Makefile
@@ -28,7 +28,8 @@ else

obj-y += syscalls_64.o vdso/

-subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o
+subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o \
+ ../lib/memmove_64.o ../lib/memset_64.o

endif

diff --git a/arch/x86/um/vdso/Makefile b/arch/x86/um/vdso/Makefile
index 5943387e3f35..8c0396fd0e6f 100644
--- a/arch/x86/um/vdso/Makefile
+++ b/arch/x86/um/vdso/Makefile
@@ -3,6 +3,9 @@
# Building vDSO images for x86.
#

+# do not instrument on vdso because KASAN is not compatible with user mode
+KASAN_SANITIZE := n
+
# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
KCOV_INSTRUMENT := n

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index a4f07de21771..c993d99116f2 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -295,9 +295,29 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
return 0;

shadow_start = (unsigned long)kasan_mem_to_shadow((void *)addr);
- shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
shadow_end = (unsigned long)kasan_mem_to_shadow((void *)addr + size);
- shadow_end = ALIGN(shadow_end, PAGE_SIZE);
+
+ /*
+ * User Mode Linux maps enough shadow memory for all of physical memory
+ * at boot, so doesn't need to allocate more on vmalloc, just clear it.
+ *
+ * If another architecture chooses to go down the same path, we should
+ * replace this check for CONFIG_UML with something more generic, such
+ * as:
+ * - A CONFIG_KASAN_NO_SHADOW_ALLOC option, which architectures could set
+ * - or, a way of having architecture-specific versions of these vmalloc
+ * and module shadow memory allocation options.
+ *
+ * For the time being, though, this check works. The remaining CONFIG_UML
+ * checks in this file exist for the same reason.
+ */
+ if (IS_ENABLED(CONFIG_UML)) {
+ __memset((void *)shadow_start, KASAN_VMALLOC_INVALID, shadow_end - shadow_start);
+ return 0;
+ }
+
+ shadow_start = PAGE_ALIGN_DOWN(shadow_start);
+ shadow_end = PAGE_ALIGN(shadow_end);

ret = apply_to_page_range(&init_mm, shadow_start,
shadow_end - shadow_start,
@@ -466,6 +486,10 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,

if (shadow_end > shadow_start) {
size = shadow_end - shadow_start;
+ if (IS_ENABLED(CONFIG_UML)) {
+ __memset(shadow_start, KASAN_SHADOW_INIT, shadow_end - shadow_start);
+ return;
+ }
apply_to_existing_page_range(&init_mm,
(unsigned long)shadow_start,
size, kasan_depopulate_vmalloc_pte,
@@ -531,6 +555,11 @@ int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
if (WARN_ON(!PAGE_ALIGNED(shadow_start)))
return -EINVAL;

+ if (IS_ENABLED(CONFIG_UML)) {
+ __memset((void *)shadow_start, KASAN_SHADOW_INIT, shadow_size);
+ return 0;
+ }
+
ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
shadow_start + shadow_size,
GFP_KERNEL,
@@ -554,6 +583,9 @@ int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)

void kasan_free_module_shadow(const struct vm_struct *vm)
{
+ if (IS_ENABLED(CONFIG_UML))
+ return;
+
if (vm->flags & VM_KASAN)
vfree(kasan_mem_to_shadow(vm->addr));
}
--
2.36.1.124.g0e6072fb45-goog



2022-05-28 18:30:09

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Fri, 2022-05-27 at 11:56 -0700, David Gow wrote:
>
> This is v2 of the KASAN/UML port. It should be ready to go.

Nice, thanks a lot! :)

> It does benefit significantly from the following patches:
> - Bugfix for memory corruption, needed for KASAN_STACK support:
> https://lore.kernel.org/lkml/[email protected]/

Btw, oddly enough, I don't seem to actually see this (tried gcc 10.3 and
11.3 so far) - is there anything you know about compiler versions
related to this perhaps? Or clang only?

The kasan_stack_oob test passes though, and generally 45 tests pass and
10 are skipped.


> +# Kernel config options are not included in USER_CFLAGS, but the
> option for KASAN
> +# should be included if the KASAN config option was set.
> +ifdef CONFIG_KASAN
> + USER_CFLAGS+=-DCONFIG_KASAN=y
> +endif
>

I'm not sure that's (still?) necessary - you don't #ifdef on it anywhere
in the user code; perhaps the original intent had been to #ifdef
kasan_map_memory()?


> +++ b/arch/um/os-Linux/user_syms.c
> @@ -27,10 +27,10 @@ EXPORT_SYMBOL(strstr);
> #ifndef __x86_64__
> extern void *memcpy(void *, const void *, size_t);
> EXPORT_SYMBOL(memcpy);
> -#endif
> -
> EXPORT_SYMBOL(memmove);
> EXPORT_SYMBOL(memset);
> +#endif
> +
> EXPORT_SYMBOL(printf);
>
> /* Here, instead, I can provide a fake prototype. Yes, someone cares: genksyms.
> diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
> index ba5789c35809..f778e37494ba 100644
> --- a/arch/x86/um/Makefile
> +++ b/arch/x86/um/Makefile
> @@ -28,7 +28,8 @@ else
>
> obj-y += syscalls_64.o vdso/
>
> -subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o
> +subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o \
> + ../lib/memmove_64.o ../lib/memset_64.o

I wonder if we should make these two changes contingent on KASAN too, I
seem to remember that we had some patches from Anton flying around at
some point to use glibc string routines, since they can be even more
optimised (we're in user space, after all).

But I suppose for now this doesn't really matter, and even if we did use
them, they'd come from libasan anyway?


Anyway, looks good to me, not sure the little not above about the user
cflags matters.

Reviewed-by: Johannes Berg <[email protected]>

johannes

2022-05-29 19:20:02

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Fri, 2022-05-27 at 22:14 +0200, Johannes Berg wrote:
> On Fri, 2022-05-27 at 11:56 -0700, David Gow wrote:
> >
> > This is v2 of the KASAN/UML port. It should be ready to go.
>
> Nice, thanks a lot! :)
>
> > It does benefit significantly from the following patches:
> > - Bugfix for memory corruption, needed for KASAN_STACK support:
> > https://lore.kernel.org/lkml/[email protected]/
>
> Btw, oddly enough, I don't seem to actually see this (tried gcc 10.3 and
> 11.3 so far) - is there anything you know about compiler versions
> related to this perhaps? Or clang only?
>

I do see it on gcc (Debian 10.2.1-6) 10.2.1 20210110, but that was a
different .config too ... strange.

johannes

2022-05-30 00:18:44

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Fri, 2022-05-27 at 11:56 -0700, David Gow wrote:
>
> The UML-specific KASAN initializer uses mmap to map the roughly 2.25TB

You say 2.25TB here, and

> +config KASAN_SHADOW_OFFSET
> + hex
> + depends on KASAN
> + default 0x100000000000
> + help
> + This is the offset at which the ~2.25TB of shadow memory is

here too, of course.

But I notice that I get ~16TB address space use when running,

> +/* used in kasan_mem_to_shadow to divide by 8 */
> +#define KASAN_SHADOW_SCALE_SHIFT 3
> +
> +#ifdef CONFIG_X86_64
> +#define KASAN_HOST_USER_SPACE_END_ADDR 0x00007fffffffffffUL
> +/* KASAN_SHADOW_SIZE is the size of total address space divided by 8 */
> +#define KASAN_SHADOW_SIZE ((KASAN_HOST_USER_SPACE_END_ADDR + 1) >> \
> + KASAN_SHADOW_SCALE_SHIFT)

because this ends up being 0x100000000000, i.e. 16 TiB.

Is that intentional? Was something missed? Maybe
KASAN_HOST_USER_SPACE_END_ADDR was too big?

It doesn't really matter, but I guess then the documentation should be
updated.

johannes

2022-06-01 17:33:16

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Fri, May 27, 2022 at 8:56 PM David Gow <[email protected]> wrote:
>
> From: Patricia Alfonso <[email protected]>
>
> Make KASAN run on User Mode Linux on x86_64.
>
> The UML-specific KASAN initializer uses mmap to map the roughly 2.25TB
> of shadow memory to the location defined by KASAN_SHADOW_OFFSET.
> kasan_init() utilizes constructors to initialize KASAN before main().
>
> The location of the KASAN shadow memory, starting at
> KASAN_SHADOW_OFFSET, can be configured using the KASAN_SHADOW_OFFSET
> option. UML uses roughly 18TB of address space, and KASAN requires 1/8th
> of this. The default location of this offset is 0x100000000000, which
> keeps it out-of-the-way even on UML setups with more "physical" memory.
>
> For low-memory setups, 0x7fff8000 can be used instead, which fits in an
> immediate and is therefore faster, as suggested by Dmitry Vyukov. There
> is usually enough free space at this location; however, it is a config
> option so that it can be easily changed if needed.
>
> Note that, unlike KASAN on other architectures, vmalloc allocations
> still use the shadow memory allocated upfront, rather than allocating
> and free-ing it per-vmalloc allocation.
>
> Also note that, while UML supports both KASAN in inline mode
> (CONFIG_KASAN_INLINE) and static linking (CONFIG_STATIC_LINK), it does
> not support both at the same time.
>
> Signed-off-by: Patricia Alfonso <[email protected]>
> Co-developed-by: Vincent Whitchurch <[email protected]>
> Signed-off-by: Vincent Whitchurch <[email protected]>
> Signed-off-by: David Gow <[email protected]>

Hi David,

Thanks for working on this!

> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index a4f07de21771..c993d99116f2 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -295,9 +295,29 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
> return 0;
>
> shadow_start = (unsigned long)kasan_mem_to_shadow((void *)addr);
> - shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
> shadow_end = (unsigned long)kasan_mem_to_shadow((void *)addr + size);
> - shadow_end = ALIGN(shadow_end, PAGE_SIZE);
> +
> + /*
> + * User Mode Linux maps enough shadow memory for all of physical memory
> + * at boot, so doesn't need to allocate more on vmalloc, just clear it.

Should this say "for all of _virtual_ memory"?

Otherwise, this is confusing. All KASAN-enabled architectures map
shadow for physical memory. And they still need map shadow for
vmalloc() separately. This is what kasan_populate_vmalloc() is for.

> + *
> + * If another architecture chooses to go down the same path, we should
> + * replace this check for CONFIG_UML with something more generic, such
> + * as:
> + * - A CONFIG_KASAN_NO_SHADOW_ALLOC option, which architectures could set
> + * - or, a way of having architecture-specific versions of these vmalloc
> + * and module shadow memory allocation options.

I think this part above and the first sentence below belong to the
commit changelog, not to a comment.

> + *
> + * For the time being, though, this check works. The remaining CONFIG_UML
> + * checks in this file exist for the same reason.
> + */
> + if (IS_ENABLED(CONFIG_UML)) {
> + __memset((void *)shadow_start, KASAN_VMALLOC_INVALID, shadow_end - shadow_start);
> + return 0;
> + }
> +
> + shadow_start = PAGE_ALIGN_DOWN(shadow_start);
> + shadow_end = PAGE_ALIGN(shadow_end);
>
> ret = apply_to_page_range(&init_mm, shadow_start,
> shadow_end - shadow_start,
> @@ -466,6 +486,10 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
>
> if (shadow_end > shadow_start) {
> size = shadow_end - shadow_start;
> + if (IS_ENABLED(CONFIG_UML)) {
> + __memset(shadow_start, KASAN_SHADOW_INIT, shadow_end - shadow_start);
> + return;
> + }
> apply_to_existing_page_range(&init_mm,
> (unsigned long)shadow_start,
> size, kasan_depopulate_vmalloc_pte,
> @@ -531,6 +555,11 @@ int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
> if (WARN_ON(!PAGE_ALIGNED(shadow_start)))
> return -EINVAL;
>
> + if (IS_ENABLED(CONFIG_UML)) {
> + __memset((void *)shadow_start, KASAN_SHADOW_INIT, shadow_size);
> + return 0;
> + }
> +
> ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
> shadow_start + shadow_size,
> GFP_KERNEL,
> @@ -554,6 +583,9 @@ int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
>
> void kasan_free_module_shadow(const struct vm_struct *vm)
> {
> + if (IS_ENABLED(CONFIG_UML))
> + return;
> +
> if (vm->flags & VM_KASAN)
> vfree(kasan_mem_to_shadow(vm->addr));
> }
> --
> 2.36.1.124.g0e6072fb45-goog
>

Thanks!

2022-06-30 08:00:55

by David Gow

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Tue, May 31, 2022 at 2:03 AM Andrey Konovalov <[email protected]> wrote:
>
> On Fri, May 27, 2022 at 8:56 PM David Gow <[email protected]> wrote:
> >
> > From: Patricia Alfonso <[email protected]>
> >
> > Make KASAN run on User Mode Linux on x86_64.
> >
> > The UML-specific KASAN initializer uses mmap to map the roughly 2.25TB
> > of shadow memory to the location defined by KASAN_SHADOW_OFFSET.
> > kasan_init() utilizes constructors to initialize KASAN before main().
> >
> > The location of the KASAN shadow memory, starting at
> > KASAN_SHADOW_OFFSET, can be configured using the KASAN_SHADOW_OFFSET
> > option. UML uses roughly 18TB of address space, and KASAN requires 1/8th
> > of this. The default location of this offset is 0x100000000000, which
> > keeps it out-of-the-way even on UML setups with more "physical" memory.
> >
> > For low-memory setups, 0x7fff8000 can be used instead, which fits in an
> > immediate and is therefore faster, as suggested by Dmitry Vyukov. There
> > is usually enough free space at this location; however, it is a config
> > option so that it can be easily changed if needed.
> >
> > Note that, unlike KASAN on other architectures, vmalloc allocations
> > still use the shadow memory allocated upfront, rather than allocating
> > and free-ing it per-vmalloc allocation.
> >
> > Also note that, while UML supports both KASAN in inline mode
> > (CONFIG_KASAN_INLINE) and static linking (CONFIG_STATIC_LINK), it does
> > not support both at the same time.
> >
> > Signed-off-by: Patricia Alfonso <[email protected]>
> > Co-developed-by: Vincent Whitchurch <[email protected]>
> > Signed-off-by: Vincent Whitchurch <[email protected]>
> > Signed-off-by: David Gow <[email protected]>
>
> Hi David,
>
> Thanks for working on this!
>
> > diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> > index a4f07de21771..c993d99116f2 100644
> > --- a/mm/kasan/shadow.c
> > +++ b/mm/kasan/shadow.c
> > @@ -295,9 +295,29 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
> > return 0;
> >
> > shadow_start = (unsigned long)kasan_mem_to_shadow((void *)addr);
> > - shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
> > shadow_end = (unsigned long)kasan_mem_to_shadow((void *)addr + size);
> > - shadow_end = ALIGN(shadow_end, PAGE_SIZE);
> > +
> > + /*
> > + * User Mode Linux maps enough shadow memory for all of physical memory
> > + * at boot, so doesn't need to allocate more on vmalloc, just clear it.
>
> Should this say "for all of _virtual_ memory"?
>
> Otherwise, this is confusing. All KASAN-enabled architectures map
> shadow for physical memory. And they still need map shadow for
> vmalloc() separately. This is what kasan_populate_vmalloc() is for.
>

Yup, this was a mistake on my part: the original RFC for KASAN/UML
only allocated enough shadow memory to cover physical memory, but it
was changed in v1 (which I'd forgotten).

I've updated the comment in v3:
https://lore.kernel.org/lkml/[email protected]/

> > + *
> > + * If another architecture chooses to go down the same path, we should
> > + * replace this check for CONFIG_UML with something more generic, such
> > + * as:
> > + * - A CONFIG_KASAN_NO_SHADOW_ALLOC option, which architectures could set
> > + * - or, a way of having architecture-specific versions of these vmalloc
> > + * and module shadow memory allocation options.
>
> I think this part above and the first sentence below belong to the
> commit changelog, not to a comment.
>

While I think there's _some_ sense in leaving this in the comment (as
a bit of a reminder / TODO), given that the commit changelog is more
ephemeral, I've moved it to the commit message for v3. This will be
easy to find via git blame, while not cluttering the actual file, so
seems an okay spot for it.

Cheers,
-- David



> > + *
> > + * For the time being, though, this check works. The remaining CONFIG_UML
> > + * checks in this file exist for the same reason.
> > + */
> > + if (IS_ENABLED(CONFIG_UML)) {
> > + __memset((void *)shadow_start, KASAN_VMALLOC_INVALID, shadow_end - shadow_start);
> > + return 0;
> > + }
> > +
> > + shadow_start = PAGE_ALIGN_DOWN(shadow_start);
> > + shadow_end = PAGE_ALIGN(shadow_end);
> >
> > ret = apply_to_page_range(&init_mm, shadow_start,
> > shadow_end - shadow_start,
> > @@ -466,6 +486,10 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
> >
> > if (shadow_end > shadow_start) {
> > size = shadow_end - shadow_start;
> > + if (IS_ENABLED(CONFIG_UML)) {
> > + __memset(shadow_start, KASAN_SHADOW_INIT, shadow_end - shadow_start);
> > + return;
> > + }
> > apply_to_existing_page_range(&init_mm,
> > (unsigned long)shadow_start,
> > size, kasan_depopulate_vmalloc_pte,
> > @@ -531,6 +555,11 @@ int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
> > if (WARN_ON(!PAGE_ALIGNED(shadow_start)))
> > return -EINVAL;
> >
> > + if (IS_ENABLED(CONFIG_UML)) {
> > + __memset((void *)shadow_start, KASAN_SHADOW_INIT, shadow_size);
> > + return 0;
> > + }
> > +
> > ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
> > shadow_start + shadow_size,
> > GFP_KERNEL,
> > @@ -554,6 +583,9 @@ int kasan_alloc_module_shadow(void *addr, size_t size, gfp_t gfp_mask)
> >
> > void kasan_free_module_shadow(const struct vm_struct *vm)
> > {
> > + if (IS_ENABLED(CONFIG_UML))
> > + return;
> > +
> > if (vm->flags & VM_KASAN)
> > vfree(kasan_mem_to_shadow(vm->addr));
> > }
> > --
> > 2.36.1.124.g0e6072fb45-goog
> >
>
> Thanks!


Attachments:
smime.p7s (3.91 kB)
S/MIME Cryptographic Signature

2022-06-30 08:08:47

by David Gow

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Mon, May 30, 2022 at 1:04 AM Johannes Berg <[email protected]> wrote:
>
> On Fri, 2022-05-27 at 11:56 -0700, David Gow wrote:
> >
> > The UML-specific KASAN initializer uses mmap to map the roughly 2.25TB
>
> You say 2.25TB here, and
>
> > +config KASAN_SHADOW_OFFSET
> > + hex
> > + depends on KASAN
> > + default 0x100000000000
> > + help
> > + This is the offset at which the ~2.25TB of shadow memory is
>
> here too, of course.
>
> But I notice that I get ~16TB address space use when running,
>
> > +/* used in kasan_mem_to_shadow to divide by 8 */
> > +#define KASAN_SHADOW_SCALE_SHIFT 3
> > +
> > +#ifdef CONFIG_X86_64
> > +#define KASAN_HOST_USER_SPACE_END_ADDR 0x00007fffffffffffUL
> > +/* KASAN_SHADOW_SIZE is the size of total address space divided by 8 */
> > +#define KASAN_SHADOW_SIZE ((KASAN_HOST_USER_SPACE_END_ADDR + 1) >> \
> > + KASAN_SHADOW_SCALE_SHIFT)
>
> because this ends up being 0x100000000000, i.e. 16 TiB.
>
> Is that intentional? Was something missed? Maybe
> KASAN_HOST_USER_SPACE_END_ADDR was too big?
>
> It doesn't really matter, but I guess then the documentation should be
> updated.

Whoops, the amount of shadow memory allocated was changed for v1 of
this patch (it was ~2.25 TB in the original RFC). I've updated these
comments in v3:
https://lore.kernel.org/lkml/[email protected]/

-- David


Attachments:
smime.p7s (3.91 kB)
S/MIME Cryptographic Signature

2022-06-30 08:20:11

by David Gow

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] UML: add support for KASAN under x86_64

On Sat, May 28, 2022 at 4:14 AM Johannes Berg <[email protected]> wrote:
>
> On Fri, 2022-05-27 at 11:56 -0700, David Gow wrote:
> >
> > This is v2 of the KASAN/UML port. It should be ready to go.
>
> Nice, thanks a lot! :)
>

Thanks for looking at this: I've finally had the time to go through
this in detail again, and have sent out v3:
https://lore.kernel.org/lkml/[email protected]/

> > It does benefit significantly from the following patches:
> > - Bugfix for memory corruption, needed for KASAN_STACK support:
> > https://lore.kernel.org/lkml/[email protected]/
>
> Btw, oddly enough, I don't seem to actually see this (tried gcc 10.3 and
> 11.3 so far) - is there anything you know about compiler versions
> related to this perhaps? Or clang only?
>
> The kasan_stack_oob test passes though, and generally 45 tests pass and
> 10 are skipped.
>

Given this patch has already been accepted, I dropped this comment
from v3. As you note, the issue didn't reproduce totally
consistently.

> > +# Kernel config options are not included in USER_CFLAGS, but the
> > option for KASAN
> > +# should be included if the KASAN config option was set.
> > +ifdef CONFIG_KASAN
> > + USER_CFLAGS+=-DCONFIG_KASAN=y
> > +endif
> >
>
> I'm not sure that's (still?) necessary - you don't #ifdef on it anywhere
> in the user code; perhaps the original intent had been to #ifdef
> kasan_map_memory()?
>

I've got rid of this for v3, thanks.

> > +++ b/arch/um/os-Linux/user_syms.c
> > @@ -27,10 +27,10 @@ EXPORT_SYMBOL(strstr);
> > #ifndef __x86_64__
> > extern void *memcpy(void *, const void *, size_t);
> > EXPORT_SYMBOL(memcpy);
> > -#endif
> > -
> > EXPORT_SYMBOL(memmove);
> > EXPORT_SYMBOL(memset);
> > +#endif
> > +
> > EXPORT_SYMBOL(printf);
> >
> > /* Here, instead, I can provide a fake prototype. Yes, someone cares: genksyms.
> > diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
> > index ba5789c35809..f778e37494ba 100644
> > --- a/arch/x86/um/Makefile
> > +++ b/arch/x86/um/Makefile
> > @@ -28,7 +28,8 @@ else
> >
> > obj-y += syscalls_64.o vdso/
> >
> > -subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o
> > +subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o \
> > + ../lib/memmove_64.o ../lib/memset_64.o
>
> I wonder if we should make these two changes contingent on KASAN too, I
> seem to remember that we had some patches from Anton flying around at
> some point to use glibc string routines, since they can be even more
> optimised (we're in user space, after all).
>
> But I suppose for now this doesn't really matter, and even if we did use
> them, they'd come from libasan anyway?

I had a quick look into this, and think it's probably best left as-is.
I think it's better to have the same implementation of these
functions, regardless of whether KASAN is enabled. And given that we
need the explicit, separate instrumented and uninstrumented versions,
we'd need some way of having one copy which came from libasan and one
which was totally uninstrumented.

But if the performance difference is really significant, we could
always revisit it.


>
> Anyway, looks good to me, not sure the little not above about the user
> cflags matters.
>
> Reviewed-by: Johannes Berg <[email protected]>
>

Cheers,
-- David