2018-06-26 13:17:55

by Andrey Konovalov

Subject: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

This patchset adds a new mode to KASAN [1], which is called KHWASAN
(Kernel HardWare assisted Address SANitizer).

The plan is to implement HWASan [2] for the kernel, with the incentive
that it will have performance comparable to KASAN while consuming much
less memory, trading that off for somewhat imprecise bug detection and
for being supported only on arm64.

The overall idea of the approach used by KHWASAN is the following:

1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
tags in the top byte of each kernel pointer.

2. Using shadow memory, we can store memory tags for each chunk of kernel
memory.

3. On each memory allocation, we can generate a random tag, embed it into
the returned pointer and set the memory tags that correspond to this
chunk of memory to the same value.

4. By using compiler instrumentation, before each memory access we can add
a check that the pointer tag matches the tag of the memory that is being
accessed.

5. On a tag mismatch we report an error.
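
The five steps above can be sketched as a user-space model (all names,
sizes and the toy shadow layout here are hypothetical illustrations, not
the kernel implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAG_SHIFT 56	/* tags live in the top byte of 64-bit pointers */
#define GRANULE 16	/* one shadow byte covers 16 bytes of memory */

static char heap[1024];				/* toy "kernel" memory */
static uint8_t shadow[sizeof(heap) / GRANULE];	/* its shadow */

/* Steps 1 and 3: embed a tag in the pointer's top byte. */
static void *set_tag(void *ptr, uint8_t tag)
{
	return (void *)(((uintptr_t)ptr & ~((uintptr_t)0xff << TAG_SHIFT)) |
			((uintptr_t)tag << TAG_SHIFT));
}

static uint8_t get_tag(const void *ptr)
{
	return (uintptr_t)ptr >> TAG_SHIFT;
}

static void *reset_tag(const void *ptr)
{
	return (void *)((uintptr_t)ptr & ~((uintptr_t)0xff << TAG_SHIFT));
}

/* Steps 2 and 3: on allocation, fill the chunk's shadow with its tag. */
static void poison_shadow(size_t offset, size_t size, uint8_t tag)
{
	memset(&shadow[offset / GRANULE], tag, size / GRANULE);
}

/* Steps 4 and 5: the check instrumentation inserts before each access. */
static int check_access(const void *tagged_ptr, size_t size)
{
	uint8_t tag = get_tag(tagged_ptr);
	size_t offset = (char *)reset_tag(tagged_ptr) - heap;
	size_t i;

	for (i = offset / GRANULE; i <= (offset + size - 1) / GRANULE; i++)
		if (shadow[i] != tag)
			return 0;	/* tag mismatch: report a bug */
	return 1;
}
```

An access through a pointer carrying the allocation's tag passes the
check; an access through a pointer with any other tag fails it. On real
hardware the dereference itself still works because Top Byte Ignore makes
the CPU disregard the tag bits during translation.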

[1] https://www.kernel.org/doc/html/latest/dev-tools/kasan.html

[2] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html


====== Technical details

KHWASAN is implemented in a very similar way to KASAN. This patchset
essentially does the following:

1. TCR_TBI1 is set to enable Top Byte Ignore.

2. Shadow memory is used (with a different scale, 1:16, so each shadow
byte corresponds to 16 bytes of kernel memory) to store memory tags.

3. All slab objects are aligned to the shadow scale, which is 16 bytes.

4. All pointers returned from the slab allocator are tagged with a random
tag and the corresponding shadow memory is poisoned with the same value.

5. Compiler instrumentation is used to insert tag checks, either by
calling callbacks or by inlining them (the CONFIG_KASAN_OUTLINE and
CONFIG_KASAN_INLINE flags are reused).

6. When a tag mismatch is detected in callback instrumentation mode,
KHWASAN simply prints a bug report. In case of inline instrumentation,
clang inserts a brk instruction, and KHWASAN has its own brk handler,
which reports the bug.

7. The memory in between slab objects is marked with a reserved tag, and
acts as a redzone.

8. When a slab object is freed, it's marked with a reserved tag.
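
Steps 7 and 8 can be modeled in user space as follows (the function name
and shadow layout are hypothetical; the 0xFE reserved-tag value matches
the one described in the RFC v2 changelog below):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16
#define KHWASAN_TAG_INVALID 0xFE /* reserved: redzones and freed objects */

static uint8_t shadow[64];	/* toy shadow covering a 1 KB slab page */

/* Hypothetical free hook: repoison the freed object's shadow with the
 * reserved tag, so a later access through a stale pointer, whose tag is
 * some random non-reserved value, fails the tag check. The same marking
 * applied to the gaps between objects acts as a redzone. */
static void khwasan_poison_free(size_t offset, size_t size)
{
	memset(&shadow[offset / GRANULE], KHWASAN_TAG_INVALID,
	       size / GRANULE);
}
```

Since random allocation tags never take the reserved values, any access
to a freed object or a redzone is guaranteed to mismatch.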

Bug detection is imprecise for two reasons:

1. We won't catch some small out-of-bounds accesses that fall into the
same shadow cell as the last byte of a slab object.

2. We only have 1 byte to store tags, which means we have a 1/256
probability of a tag match for an incorrect access (actually even
slightly less due to reserved tag values).
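
Reason 1 can be illustrated with a toy calculation (a hypothetical
user-space sketch; the macro value matches the 1:16 shadow scale):

```c
#include <assert.h>
#include <stddef.h>

#define KASAN_SHADOW_SCALE_SIZE 16	/* bytes covered by one shadow byte */

/* Shadow cell index for a byte at a given offset into a slab page. */
static size_t shadow_index(size_t offset)
{
	return offset / KASAN_SHADOW_SCALE_SIZE;
}
```

For a 40-byte object starting at offset 0, the last valid byte (39) and
the first out-of-bounds byte (40) share shadow cell 2, whose shadow byte
carries the object's tag, so accesses to bytes 40..47 pass the tag check
undetected; an access at offset 48 lands in the next cell and is caught.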

Despite that, there's a particular type of bug that KHWASAN can detect
compared to KASAN: a use-after-free after the object has been allocated
by someone else.


====== Benchmarks

The following numbers were collected on Odroid C2 board. Both KASAN and
KHWASAN were used in inline instrumentation mode.

Boot time [1]:
* ~1.7 sec for clean kernel
* ~5.0 sec for KASAN
* ~5.0 sec for KHWASAN

Slab memory usage after boot [2]:
* ~40 kb for clean kernel
* ~105 kb + 1/8th shadow ~= 118 kb for KASAN
* ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN

Network performance [3]:
* 8.33 Gbits/sec for clean kernel
* 3.17 Gbits/sec for KASAN
* 2.85 Gbits/sec for KHWASAN

Note that KHWASAN (compared to KASAN) doesn't require quarantine.

[1] Time before the ext4 driver is initialized.
[2] Measured as `cat /proc/meminfo | grep Slab`.
[3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.


====== Some notes

A few notes:

1. The patchset can be found here:
https://github.com/xairy/kasan-prototype/tree/khwasan

2. Building requires a recent LLVM version (r330044 or later).

3. Stack instrumentation is not supported yet and will be added later.


====== Changes

Changes in v4:
- Fixed SPDX comment style in mm/kasan/kasan.h.
- Fixed mm/kasan/kasan.h changes being included in a wrong patch.
- Swapped “khwasan, arm64: fix up fault handling logic” and “khwasan: add
tag related helper functions” patches order.
- Rebased onto 6f0d349d (4.18-rc2+).

Changes in v3:
- Minor documentation fixes.
- Fixed CFLAGS variable name in KASAN makefile.
- Added a "SPDX-License-Identifier: GPL-2.0" line to all source files
under mm/kasan.
- Rebased onto 81e97f013 (4.18-rc1+).

Changes in v2:
- Changed kmalloc_large_node_hook to return tagged pointer instead of
using an output argument.
- Fix checking whether -fsanitize=hwaddress is supported by the compiler.
- Removed duplication of -fno-builtin for KASAN and KHWASAN.
- Removed {} block for one line for_each_possible_cpu loop.
- Made set_track() static inline as it is used only in common.c.
- Moved optimal_redzone() to common.c.
- Fixed using tagged pointer for shadow calculation in
kasan_unpoison_shadow().
- Restored setting cache->align in kasan_cache_create(), which was
accidentally lost.
- Simplified __kasan_slab_free(), kasan_alloc_pages() and kasan_kmalloc().
- Removed tagging from kasan_kmalloc_large().
- Added page_kasan_tag_reset() to kasan_poison_slab() and removed
!PageSlab() check from page_to_virt.
- Reset pointer tag in _virt_addr_is_linear.
- Set page tag for each page when multiple pages are allocated or freed.
- Added a comment as to why we ignore cma allocated pages.

Changes in v1:
- Rebased onto 4.17-rc4.
- Updated benchmarking stats.
- Documented compiler version requirements, memory usage and slowdown.
- Dropped kvm patches, as clang + arm64 + kvm is completely broken [1].

Changes in RFC v3:
- Renamed CONFIG_KASAN_CLASSIC and CONFIG_KASAN_TAGS to
CONFIG_KASAN_GENERIC and CONFIG_KASAN_HW respectively.
- Switch to -fsanitize=kernel-hwaddress instead of -fsanitize=hwaddress.
- Removed unnecessary excessive shadow initialization.
- Removed khwasan_enabled flag (it’s not needed since KHWASAN is
initialized before any slab caches are used).
- Split out kasan_report.c and khwasan_report.c from report.c.
- Moved more common KASAN and KHWASAN functions to common.c.
- Added tagging to pagealloc.
- Rebased onto 4.17-rc1.
- Temporarily dropped patch that adds kvm support (arm64 + kvm + clang
combo is broken right now [1]).

Changes in RFC v2:
- Removed explicit casts to u8 * for kasan_mem_to_shadow() calls.
- Introduced KASAN_TCR_FLAGS for setting the TCR_TBI1 flag.
- Added a comment regarding the non-atomic RMW sequence in
khwasan_random_tag().
- Made all tag related functions accept const void *.
- Untagged pointers in __kimg_to_phys, which is used by virt_to_phys.
- Untagged pointers in show_ptr in fault handling logic.
- Untagged pointers passed to KVM.
- Added two reserved tag values: 0xFF and 0xFE.
- Used the reserved tag 0xFF to disable validity checking (to resolve the
issue with pointer tag being lost after page_address + kmap usage).
- Used the reserved tag 0xFE to mark redzones and freed objects.
- Added mnemonics for esr manipulation in KHWASAN brk handler.
- Added a comment about the -recover flag.
- Some minor cleanups and fixes.
- Rebased onto 3215b9d5 (4.16-rc6+).
- Tested on real hardware (Odroid C2 board).
- Added better benchmarks.

[1] https://lkml.org/lkml/2018/4/19/775

Andrey Konovalov (17):
khwasan, mm: change kasan hooks signatures
khwasan: move common kasan and khwasan code to common.c
khwasan: add CONFIG_KASAN_GENERIC and CONFIG_KASAN_HW
khwasan, arm64: adjust shadow size for CONFIG_KASAN_HW
khwasan: initialize shadow to 0xff
khwasan, arm64: untag virt address in __kimg_to_phys and
_virt_addr_is_linear
khwasan: add tag related helper functions
khwasan, arm64: fix up fault handling logic
khwasan, arm64: enable top byte ignore for the kernel
khwasan, mm: perform untagged pointers comparison in krealloc
khwasan: split out kasan_report.c from report.c
khwasan: add bug reporting routines
khwasan: add hooks implementation
khwasan, arm64: add brk handler for inline instrumentation
khwasan, mm, arm64: tag non slab memory allocated via pagealloc
khwasan: update kasan documentation
kasan: add SPDX-License-Identifier mark to source files

Documentation/dev-tools/kasan.rst | 213 ++++----
arch/arm64/Kconfig | 1 +
arch/arm64/Makefile | 2 +-
arch/arm64/include/asm/brk-imm.h | 2 +
arch/arm64/include/asm/memory.h | 41 +-
arch/arm64/include/asm/pgtable-hwdef.h | 1 +
arch/arm64/kernel/traps.c | 69 ++-
arch/arm64/mm/fault.c | 3 +
arch/arm64/mm/kasan_init.c | 18 +-
arch/arm64/mm/proc.S | 8 +-
include/linux/compiler-clang.h | 5 +-
include/linux/compiler-gcc.h | 4 +
include/linux/compiler.h | 3 +-
include/linux/kasan.h | 84 +++-
include/linux/mm.h | 29 ++
include/linux/page-flags-layout.h | 10 +
lib/Kconfig.kasan | 76 ++-
mm/cma.c | 11 +
mm/kasan/Makefile | 9 +-
mm/kasan/common.c | 659 +++++++++++++++++++++++++
mm/kasan/kasan.c | 564 +--------------------
mm/kasan/kasan.h | 85 +++-
mm/kasan/kasan_init.c | 1 +
mm/kasan/kasan_report.c | 156 ++++++
mm/kasan/khwasan.c | 163 ++++++
mm/kasan/khwasan_report.c | 61 +++
mm/kasan/quarantine.c | 1 +
mm/kasan/report.c | 272 +++-------
mm/page_alloc.c | 1 +
mm/slab.c | 12 +-
mm/slab.h | 2 +-
mm/slab_common.c | 6 +-
mm/slub.c | 17 +-
scripts/Makefile.kasan | 27 +-
34 files changed, 1698 insertions(+), 918 deletions(-)
create mode 100644 mm/kasan/common.c
create mode 100644 mm/kasan/kasan_report.c
create mode 100644 mm/kasan/khwasan.c
create mode 100644 mm/kasan/khwasan_report.c

--
2.18.0.rc2.346.g013aa6912e-goog



2018-06-26 13:17:55

by Andrey Konovalov

Subject: [PATCH v4 08/17] khwasan, arm64: fix up fault handling logic

show_pte in arm64 fault handling relies on the fact that the top byte of
a kernel pointer is 0xff, which isn't always the case with KHWASAN enabled.
Reset the top byte.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/mm/fault.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index b8eecc7b9531..b7b152783d54 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -32,6 +32,7 @@
#include <linux/perf_event.h>
#include <linux/preempt.h>
#include <linux/hugetlb.h>
+#include <linux/kasan.h>

#include <asm/bug.h>
#include <asm/cmpxchg.h>
@@ -134,6 +135,8 @@ void show_pte(unsigned long addr)
pgd_t *pgdp;
pgd_t pgd;

+ addr = (unsigned long)khwasan_reset_tag((void *)addr);
+
if (addr < TASK_SIZE) {
/* TTBR0 */
mm = current->active_mm;
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:17:55

by Andrey Konovalov

Subject: [PATCH v4 12/17] khwasan: add bug reporting routines

This commit adds routines that print KHWASAN error reports. They are
quite similar to KASAN's; the differences are:

1. The way KHWASAN finds the first bad shadow cell (with a mismatching
tag). KHWASAN compares memory tags from the shadow memory to the pointer
tag.

2. KHWASAN reports all bugs with the "KASAN: invalid-access" header. This
is done so that various external tools that already parse the kernel logs
looking for KASAN reports don't need to be changed.

Signed-off-by: Andrey Konovalov <[email protected]>
---
include/linux/kasan.h | 3 +++
mm/kasan/kasan.h | 7 +++++
mm/kasan/kasan_report.c | 7 ++---
mm/kasan/khwasan_report.c | 21 +++++++++++++++
mm/kasan/report.c | 57 +++++++++++++++++++++------------------
5 files changed, 64 insertions(+), 31 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index d7624b879d86..e209027f3b52 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -161,6 +161,9 @@ void *khwasan_set_tag(const void *addr, u8 tag);
u8 khwasan_get_tag(const void *addr);
void *khwasan_reset_tag(const void *ptr);

+void kasan_report(unsigned long addr, size_t size,
+ bool write, unsigned long ip);
+
#else /* CONFIG_KASAN_HW */

static inline void khwasan_init(void) { }
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 82672473740c..d60859d26be7 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -119,8 +119,15 @@ void kasan_poison_shadow(const void *address, size_t size, u8 value);
void check_memory_region(unsigned long addr, size_t size, bool write,
unsigned long ret_ip);

+void *find_first_bad_addr(void *addr, size_t size);
const char *get_bug_type(struct kasan_access_info *info);

+#ifdef CONFIG_KASAN_HW
+void print_tags(u8 addr_tag, const void *addr);
+#else
+static inline void print_tags(u8 addr_tag, const void *addr) { }
+#endif
+
void kasan_report(unsigned long addr, size_t size,
bool is_write, unsigned long ip);
void kasan_report_invalid_free(void *object, unsigned long ip);
diff --git a/mm/kasan/kasan_report.c b/mm/kasan/kasan_report.c
index 2d8decbecbd5..fdf2d77e3125 100644
--- a/mm/kasan/kasan_report.c
+++ b/mm/kasan/kasan_report.c
@@ -33,10 +33,10 @@
#include "kasan.h"
#include "../slab.h"

-static const void *find_first_bad_addr(const void *addr, size_t size)
+void *find_first_bad_addr(void *addr, size_t size)
{
u8 shadow_val = *(u8 *)kasan_mem_to_shadow(addr);
- const void *first_bad_addr = addr;
+ void *first_bad_addr = addr;

while (!shadow_val && first_bad_addr < addr + size) {
first_bad_addr += KASAN_SHADOW_SCALE_SIZE;
@@ -50,9 +50,6 @@ static const char *get_shadow_bug_type(struct kasan_access_info *info)
const char *bug_type = "unknown-crash";
u8 *shadow_addr;

- info->first_bad_addr = find_first_bad_addr(info->access_addr,
- info->access_size);
-
shadow_addr = (u8 *)kasan_mem_to_shadow(info->first_bad_addr);

/*
diff --git a/mm/kasan/khwasan_report.c b/mm/kasan/khwasan_report.c
index 2edbc3c76be5..51238b404b08 100644
--- a/mm/kasan/khwasan_report.c
+++ b/mm/kasan/khwasan_report.c
@@ -37,3 +37,24 @@ const char *get_bug_type(struct kasan_access_info *info)
{
return "invalid-access";
}
+
+void *find_first_bad_addr(void *addr, size_t size)
+{
+ u8 tag = get_tag(addr);
+ void *untagged_addr = reset_tag(addr);
+ u8 *shadow = (u8 *)kasan_mem_to_shadow(untagged_addr);
+ void *first_bad_addr = untagged_addr;
+
+ while (*shadow == tag && first_bad_addr < untagged_addr + size) {
+ first_bad_addr += KASAN_SHADOW_SCALE_SIZE;
+ shadow = (u8 *)kasan_mem_to_shadow(first_bad_addr);
+ }
+ return first_bad_addr;
+}
+
+void print_tags(u8 addr_tag, const void *addr)
+{
+ u8 *shadow = (u8 *)kasan_mem_to_shadow(addr);
+
+ pr_err("Pointer tag: [%02x], memory tag: [%02x]\n", addr_tag, *shadow);
+}
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 155247a6f8a8..e031c78f2e52 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -64,11 +64,10 @@ static int __init kasan_set_multi_shot(char *str)
}
__setup("kasan_multi_shot", kasan_set_multi_shot);

-static void print_error_description(struct kasan_access_info *info,
- const char *bug_type)
+static void print_error_description(struct kasan_access_info *info)
{
pr_err("BUG: KASAN: %s in %pS\n",
- bug_type, (void *)info->ip);
+ get_bug_type(info), (void *)info->ip);
pr_err("%s of size %zu at addr %px by task %s/%d\n",
info->is_write ? "Write" : "Read", info->access_size,
info->access_addr, current->comm, task_pid_nr(current));
@@ -272,6 +271,8 @@ void kasan_report_invalid_free(void *object, unsigned long ip)

start_report(&flags);
pr_err("BUG: KASAN: double-free or invalid-free in %pS\n", (void *)ip);
+ print_tags(get_tag(object), reset_tag(object));
+ object = reset_tag(object);
pr_err("\n");
print_address_description(object);
pr_err("\n");
@@ -279,41 +280,45 @@ void kasan_report_invalid_free(void *object, unsigned long ip)
end_report(&flags);
}

-static void kasan_report_error(struct kasan_access_info *info)
-{
- unsigned long flags;
-
- start_report(&flags);
-
- print_error_description(info, get_bug_type(info));
- pr_err("\n");
-
- if (!addr_has_shadow(info->access_addr)) {
- dump_stack();
- } else {
- print_address_description((void *)info->access_addr);
- pr_err("\n");
- print_shadow_for_address(info->first_bad_addr);
- }
-
- end_report(&flags);
-}
-
void kasan_report(unsigned long addr, size_t size,
bool is_write, unsigned long ip)
{
struct kasan_access_info info;
+ void *tagged_addr;
+ void *untagged_addr;
+ unsigned long flags;

if (likely(!report_enabled()))
return;

disable_trace_on_warning();

- info.access_addr = (void *)addr;
- info.first_bad_addr = (void *)addr;
+ tagged_addr = (void *)addr;
+ untagged_addr = reset_tag(tagged_addr);
+
+ info.access_addr = tagged_addr;
+ if (addr_has_shadow(untagged_addr))
+ info.first_bad_addr = find_first_bad_addr(tagged_addr, size);
+ else
+ info.first_bad_addr = untagged_addr;
info.access_size = size;
info.is_write = is_write;
info.ip = ip;

- kasan_report_error(&info);
+ start_report(&flags);
+
+ print_error_description(&info);
+ if (addr_has_shadow(untagged_addr))
+ print_tags(get_tag(tagged_addr), info.first_bad_addr);
+ pr_err("\n");
+
+ if (addr_has_shadow(untagged_addr)) {
+ print_address_description(untagged_addr);
+ pr_err("\n");
+ print_shadow_for_address(info.first_bad_addr);
+ } else {
+ dump_stack();
+ }
+
+ end_report(&flags);
}
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:18:56

by Andrey Konovalov

Subject: [PATCH v4 17/17] kasan: add SPDX-License-Identifier mark to source files

This patch adds a "SPDX-License-Identifier: GPL-2.0" mark to all source
files under mm/kasan.

Signed-off-by: Andrey Konovalov <[email protected]>
---
mm/kasan/common.c | 1 +
mm/kasan/kasan.c | 1 +
mm/kasan/kasan_init.c | 1 +
mm/kasan/kasan_report.c | 1 +
mm/kasan/khwasan.c | 1 +
mm/kasan/khwasan_report.c | 1 +
mm/kasan/quarantine.c | 1 +
mm/kasan/report.c | 1 +
8 files changed, 8 insertions(+)

diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 6cf7dec0b765..955e0ebdc644 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains common KASAN and KHWASAN code.
*
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index 44ec228de0a2..128a865c9e05 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains core KASAN code.
*
diff --git a/mm/kasan/kasan_init.c b/mm/kasan/kasan_init.c
index f436246ccc79..2dfa730a9d43 100644
--- a/mm/kasan/kasan_init.c
+++ b/mm/kasan/kasan_init.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains some kasan initialization code.
*
diff --git a/mm/kasan/kasan_report.c b/mm/kasan/kasan_report.c
index fdf2d77e3125..48da73f4ef7c 100644
--- a/mm/kasan/kasan_report.c
+++ b/mm/kasan/kasan_report.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains KASAN specific error reporting code.
*
diff --git a/mm/kasan/khwasan.c b/mm/kasan/khwasan.c
index fd1725022794..f0d528a8c3f3 100644
--- a/mm/kasan/khwasan.c
+++ b/mm/kasan/khwasan.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains core KHWASAN code.
*
diff --git a/mm/kasan/khwasan_report.c b/mm/kasan/khwasan_report.c
index 51238b404b08..4e193546d94e 100644
--- a/mm/kasan/khwasan_report.c
+++ b/mm/kasan/khwasan_report.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains KHWASAN specific error reporting code.
*
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index 3a8ddf8baf7d..0e4dc1a22615 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* KASAN quarantine.
*
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index e031c78f2e52..633b4b245798 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* This file contains common KASAN and KHWASAN error reporting code.
*
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:19:37

by Andrey Konovalov

Subject: [PATCH v4 11/17] khwasan: split out kasan_report.c from report.c

This patch moves KASAN specific error reporting routines to kasan_report.c
without any functional changes, leaving common error reporting code in
report.c to be later reused by KHWASAN.

Signed-off-by: Andrey Konovalov <[email protected]>
---
mm/kasan/Makefile | 4 +-
mm/kasan/kasan.h | 7 ++
mm/kasan/kasan_report.c | 158 +++++++++++++++++++++++++
mm/kasan/khwasan_report.c | 39 +++++++
mm/kasan/report.c | 234 +++++++++-----------------------------
5 files changed, 257 insertions(+), 185 deletions(-)
create mode 100644 mm/kasan/kasan_report.c
create mode 100644 mm/kasan/khwasan_report.c

diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index 14955add96d3..7ef536390365 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -14,5 +14,5 @@ CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
CFLAGS_khwasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)

obj-$(CONFIG_KASAN) := common.o kasan_init.o report.o
-obj-$(CONFIG_KASAN_GENERIC) += kasan.o quarantine.o
-obj-$(CONFIG_KASAN_HW) += khwasan.o
+obj-$(CONFIG_KASAN_GENERIC) += kasan.o kasan_report.o quarantine.o
+obj-$(CONFIG_KASAN_HW) += khwasan.o khwasan_report.o
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index a7cc27d96608..82672473740c 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -109,11 +109,18 @@ static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
<< KASAN_SHADOW_SCALE_SHIFT);
}

+static inline bool addr_has_shadow(const void *addr)
+{
+ return (addr >= kasan_shadow_to_mem((void *)KASAN_SHADOW_START));
+}
+
void kasan_poison_shadow(const void *address, size_t size, u8 value);

void check_memory_region(unsigned long addr, size_t size, bool write,
unsigned long ret_ip);

+const char *get_bug_type(struct kasan_access_info *info);
+
void kasan_report(unsigned long addr, size_t size,
bool is_write, unsigned long ip);
void kasan_report_invalid_free(void *object, unsigned long ip);
diff --git a/mm/kasan/kasan_report.c b/mm/kasan/kasan_report.c
new file mode 100644
index 000000000000..2d8decbecbd5
--- /dev/null
+++ b/mm/kasan/kasan_report.c
@@ -0,0 +1,158 @@
+/*
+ * This file contains KASAN specific error reporting code.
+ *
+ * Copyright (c) 2014 Samsung Electronics Co., Ltd.
+ * Author: Andrey Ryabinin <[email protected]>
+ *
+ * Some code borrowed from https://github.com/xairy/kasan-prototype by
+ * Andrey Konovalov <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/bitops.h>
+#include <linux/ftrace.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/printk.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/stackdepot.h>
+#include <linux/stacktrace.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/kasan.h>
+#include <linux/module.h>
+
+#include <asm/sections.h>
+
+#include "kasan.h"
+#include "../slab.h"
+
+static const void *find_first_bad_addr(const void *addr, size_t size)
+{
+ u8 shadow_val = *(u8 *)kasan_mem_to_shadow(addr);
+ const void *first_bad_addr = addr;
+
+ while (!shadow_val && first_bad_addr < addr + size) {
+ first_bad_addr += KASAN_SHADOW_SCALE_SIZE;
+ shadow_val = *(u8 *)kasan_mem_to_shadow(first_bad_addr);
+ }
+ return first_bad_addr;
+}
+
+static const char *get_shadow_bug_type(struct kasan_access_info *info)
+{
+ const char *bug_type = "unknown-crash";
+ u8 *shadow_addr;
+
+ info->first_bad_addr = find_first_bad_addr(info->access_addr,
+ info->access_size);
+
+ shadow_addr = (u8 *)kasan_mem_to_shadow(info->first_bad_addr);
+
+ /*
+ * If shadow byte value is in [0, KASAN_SHADOW_SCALE_SIZE) we can look
+ * at the next shadow byte to determine the type of the bad access.
+ */
+ if (*shadow_addr > 0 && *shadow_addr <= KASAN_SHADOW_SCALE_SIZE - 1)
+ shadow_addr++;
+
+ switch (*shadow_addr) {
+ case 0 ... KASAN_SHADOW_SCALE_SIZE - 1:
+ /*
+ * In theory it's still possible to see these shadow values
+ * due to a data race in the kernel code.
+ */
+ bug_type = "out-of-bounds";
+ break;
+ case KASAN_PAGE_REDZONE:
+ case KASAN_KMALLOC_REDZONE:
+ bug_type = "slab-out-of-bounds";
+ break;
+ case KASAN_GLOBAL_REDZONE:
+ bug_type = "global-out-of-bounds";
+ break;
+ case KASAN_STACK_LEFT:
+ case KASAN_STACK_MID:
+ case KASAN_STACK_RIGHT:
+ case KASAN_STACK_PARTIAL:
+ bug_type = "stack-out-of-bounds";
+ break;
+ case KASAN_FREE_PAGE:
+ case KASAN_KMALLOC_FREE:
+ bug_type = "use-after-free";
+ break;
+ case KASAN_USE_AFTER_SCOPE:
+ bug_type = "use-after-scope";
+ break;
+ case KASAN_ALLOCA_LEFT:
+ case KASAN_ALLOCA_RIGHT:
+ bug_type = "alloca-out-of-bounds";
+ break;
+ }
+
+ return bug_type;
+}
+
+static const char *get_wild_bug_type(struct kasan_access_info *info)
+{
+ const char *bug_type = "unknown-crash";
+
+ if ((unsigned long)info->access_addr < PAGE_SIZE)
+ bug_type = "null-ptr-deref";
+ else if ((unsigned long)info->access_addr < TASK_SIZE)
+ bug_type = "user-memory-access";
+ else
+ bug_type = "wild-memory-access";
+
+ return bug_type;
+}
+
+const char *get_bug_type(struct kasan_access_info *info)
+{
+ if (addr_has_shadow(info->access_addr))
+ return get_shadow_bug_type(info);
+ return get_wild_bug_type(info);
+}
+
+#define DEFINE_ASAN_REPORT_LOAD(size) \
+void __asan_report_load##size##_noabort(unsigned long addr) \
+{ \
+ kasan_report(addr, size, false, _RET_IP_); \
+} \
+EXPORT_SYMBOL(__asan_report_load##size##_noabort)
+
+#define DEFINE_ASAN_REPORT_STORE(size) \
+void __asan_report_store##size##_noabort(unsigned long addr) \
+{ \
+ kasan_report(addr, size, true, _RET_IP_); \
+} \
+EXPORT_SYMBOL(__asan_report_store##size##_noabort)
+
+DEFINE_ASAN_REPORT_LOAD(1);
+DEFINE_ASAN_REPORT_LOAD(2);
+DEFINE_ASAN_REPORT_LOAD(4);
+DEFINE_ASAN_REPORT_LOAD(8);
+DEFINE_ASAN_REPORT_LOAD(16);
+DEFINE_ASAN_REPORT_STORE(1);
+DEFINE_ASAN_REPORT_STORE(2);
+DEFINE_ASAN_REPORT_STORE(4);
+DEFINE_ASAN_REPORT_STORE(8);
+DEFINE_ASAN_REPORT_STORE(16);
+
+void __asan_report_load_n_noabort(unsigned long addr, size_t size)
+{
+ kasan_report(addr, size, false, _RET_IP_);
+}
+EXPORT_SYMBOL(__asan_report_load_n_noabort);
+
+void __asan_report_store_n_noabort(unsigned long addr, size_t size)
+{
+ kasan_report(addr, size, true, _RET_IP_);
+}
+EXPORT_SYMBOL(__asan_report_store_n_noabort);
diff --git a/mm/kasan/khwasan_report.c b/mm/kasan/khwasan_report.c
new file mode 100644
index 000000000000..2edbc3c76be5
--- /dev/null
+++ b/mm/kasan/khwasan_report.c
@@ -0,0 +1,39 @@
+/*
+ * This file contains KHWASAN specific error reporting code.
+ *
+ * Copyright (c) 2014 Samsung Electronics Co., Ltd.
+ * Author: Andrey Ryabinin <[email protected]>
+ *
+ * Some code borrowed from https://github.com/xairy/kasan-prototype by
+ * Andrey Konovalov <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/bitops.h>
+#include <linux/ftrace.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/printk.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/stackdepot.h>
+#include <linux/stacktrace.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/kasan.h>
+#include <linux/module.h>
+
+#include <asm/sections.h>
+
+#include "kasan.h"
+#include "../slab.h"
+
+const char *get_bug_type(struct kasan_access_info *info)
+{
+ return "invalid-access";
+}
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 5c169aa688fd..155247a6f8a8 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -1,5 +1,5 @@
/*
- * This file contains error reporting code.
+ * This file contains common KASAN and KHWASAN error reporting code.
*
* Copyright (c) 2014 Samsung Electronics Co., Ltd.
* Author: Andrey Ryabinin <[email protected]>
@@ -39,103 +39,34 @@
#define SHADOW_BYTES_PER_ROW (SHADOW_BLOCKS_PER_ROW * SHADOW_BYTES_PER_BLOCK)
#define SHADOW_ROWS_AROUND_ADDR 2

-static const void *find_first_bad_addr(const void *addr, size_t size)
-{
- u8 shadow_val = *(u8 *)kasan_mem_to_shadow(addr);
- const void *first_bad_addr = addr;
-
- while (!shadow_val && first_bad_addr < addr + size) {
- first_bad_addr += KASAN_SHADOW_SCALE_SIZE;
- shadow_val = *(u8 *)kasan_mem_to_shadow(first_bad_addr);
- }
- return first_bad_addr;
-}
+static unsigned long kasan_flags;

-static bool addr_has_shadow(struct kasan_access_info *info)
-{
- return (info->access_addr >=
- kasan_shadow_to_mem((void *)KASAN_SHADOW_START));
-}
+#define KASAN_BIT_REPORTED 0
+#define KASAN_BIT_MULTI_SHOT 1

-static const char *get_shadow_bug_type(struct kasan_access_info *info)
+bool kasan_save_enable_multi_shot(void)
{
- const char *bug_type = "unknown-crash";
- u8 *shadow_addr;
-
- info->first_bad_addr = find_first_bad_addr(info->access_addr,
- info->access_size);
-
- shadow_addr = (u8 *)kasan_mem_to_shadow(info->first_bad_addr);
-
- /*
- * If shadow byte value is in [0, KASAN_SHADOW_SCALE_SIZE) we can look
- * at the next shadow byte to determine the type of the bad access.
- */
- if (*shadow_addr > 0 && *shadow_addr <= KASAN_SHADOW_SCALE_SIZE - 1)
- shadow_addr++;
-
- switch (*shadow_addr) {
- case 0 ... KASAN_SHADOW_SCALE_SIZE - 1:
- /*
- * In theory it's still possible to see these shadow values
- * due to a data race in the kernel code.
- */
- bug_type = "out-of-bounds";
- break;
- case KASAN_PAGE_REDZONE:
- case KASAN_KMALLOC_REDZONE:
- bug_type = "slab-out-of-bounds";
- break;
- case KASAN_GLOBAL_REDZONE:
- bug_type = "global-out-of-bounds";
- break;
- case KASAN_STACK_LEFT:
- case KASAN_STACK_MID:
- case KASAN_STACK_RIGHT:
- case KASAN_STACK_PARTIAL:
- bug_type = "stack-out-of-bounds";
- break;
- case KASAN_FREE_PAGE:
- case KASAN_KMALLOC_FREE:
- bug_type = "use-after-free";
- break;
- case KASAN_USE_AFTER_SCOPE:
- bug_type = "use-after-scope";
- break;
- case KASAN_ALLOCA_LEFT:
- case KASAN_ALLOCA_RIGHT:
- bug_type = "alloca-out-of-bounds";
- break;
- }
-
- return bug_type;
+ return test_and_set_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags);
}
+EXPORT_SYMBOL_GPL(kasan_save_enable_multi_shot);

-static const char *get_wild_bug_type(struct kasan_access_info *info)
+void kasan_restore_multi_shot(bool enabled)
{
- const char *bug_type = "unknown-crash";
-
- if ((unsigned long)info->access_addr < PAGE_SIZE)
- bug_type = "null-ptr-deref";
- else if ((unsigned long)info->access_addr < TASK_SIZE)
- bug_type = "user-memory-access";
- else
- bug_type = "wild-memory-access";
-
- return bug_type;
+ if (!enabled)
+ clear_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags);
}
+EXPORT_SYMBOL_GPL(kasan_restore_multi_shot);

-static const char *get_bug_type(struct kasan_access_info *info)
+static int __init kasan_set_multi_shot(char *str)
{
- if (addr_has_shadow(info))
- return get_shadow_bug_type(info);
- return get_wild_bug_type(info);
+ set_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags);
+ return 1;
}
+__setup("kasan_multi_shot", kasan_set_multi_shot);

-static void print_error_description(struct kasan_access_info *info)
+static void print_error_description(struct kasan_access_info *info,
+ const char *bug_type)
{
- const char *bug_type = get_bug_type(info);
-
pr_err("BUG: KASAN: %s in %pS\n",
bug_type, (void *)info->ip);
pr_err("%s of size %zu at addr %px by task %s/%d\n",
@@ -143,25 +74,9 @@ static void print_error_description(struct kasan_access_info *info)
info->access_addr, current->comm, task_pid_nr(current));
}

-static inline bool kernel_or_module_addr(const void *addr)
-{
- if (addr >= (void *)_stext && addr < (void *)_end)
- return true;
- if (is_module_address((unsigned long)addr))
- return true;
- return false;
-}
-
-static inline bool init_task_stack_addr(const void *addr)
-{
- return addr >= (void *)&init_thread_union.stack &&
- (addr <= (void *)&init_thread_union.stack +
- sizeof(init_thread_union.stack));
-}
-
static DEFINE_SPINLOCK(report_lock);

-static void kasan_start_report(unsigned long *flags)
+static void start_report(unsigned long *flags)
{
/*
* Make sure we don't end up in loop.
@@ -171,7 +86,7 @@ static void kasan_start_report(unsigned long *flags)
pr_err("==================================================================\n");
}

-static void kasan_end_report(unsigned long *flags)
+static void end_report(unsigned long *flags)
{
pr_err("==================================================================\n");
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
@@ -249,6 +164,22 @@ static void describe_object(struct kmem_cache *cache, void *object,
describe_object_addr(cache, object, addr);
}

+static inline bool kernel_or_module_addr(const void *addr)
+{
+ if (addr >= (void *)_stext && addr < (void *)_end)
+ return true;
+ if (is_module_address((unsigned long)addr))
+ return true;
+ return false;
+}
+
+static inline bool init_task_stack_addr(const void *addr)
+{
+ return addr >= (void *)&init_thread_union.stack &&
+ (addr <= (void *)&init_thread_union.stack +
+ sizeof(init_thread_union.stack));
+}
+
static void print_address_description(void *addr)
{
struct page *page = addr_to_page(addr);
@@ -326,29 +257,38 @@ static void print_shadow_for_address(const void *addr)
}
}

+static bool report_enabled(void)
+{
+ if (current->kasan_depth)
+ return false;
+ if (test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
+ return true;
+ return !test_and_set_bit(KASAN_BIT_REPORTED, &kasan_flags);
+}
+
void kasan_report_invalid_free(void *object, unsigned long ip)
{
unsigned long flags;

- kasan_start_report(&flags);
+ start_report(&flags);
pr_err("BUG: KASAN: double-free or invalid-free in %pS\n", (void *)ip);
pr_err("\n");
print_address_description(object);
pr_err("\n");
print_shadow_for_address(object);
- kasan_end_report(&flags);
+ end_report(&flags);
}

static void kasan_report_error(struct kasan_access_info *info)
{
unsigned long flags;

- kasan_start_report(&flags);
+ start_report(&flags);

- print_error_description(info);
+ print_error_description(info, get_bug_type(info));
pr_err("\n");

- if (!addr_has_shadow(info)) {
+ if (!addr_has_shadow(info->access_addr)) {
dump_stack();
} else {
print_address_description((void *)info->access_addr);
@@ -356,41 +296,7 @@ static void kasan_report_error(struct kasan_access_info *info)
print_shadow_for_address(info->first_bad_addr);
}

- kasan_end_report(&flags);
-}
-
-static unsigned long kasan_flags;
-
-#define KASAN_BIT_REPORTED 0
-#define KASAN_BIT_MULTI_SHOT 1
-
-bool kasan_save_enable_multi_shot(void)
-{
- return test_and_set_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags);
-}
-EXPORT_SYMBOL_GPL(kasan_save_enable_multi_shot);
-
-void kasan_restore_multi_shot(bool enabled)
-{
- if (!enabled)
- clear_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags);
-}
-EXPORT_SYMBOL_GPL(kasan_restore_multi_shot);
-
-static int __init kasan_set_multi_shot(char *str)
-{
- set_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags);
- return 1;
-}
-__setup("kasan_multi_shot", kasan_set_multi_shot);
-
-static inline bool kasan_report_enabled(void)
-{
- if (current->kasan_depth)
- return false;
- if (test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
- return true;
- return !test_and_set_bit(KASAN_BIT_REPORTED, &kasan_flags);
+ end_report(&flags);
}

void kasan_report(unsigned long addr, size_t size,
@@ -398,7 +304,7 @@ void kasan_report(unsigned long addr, size_t size,
{
struct kasan_access_info info;

- if (likely(!kasan_report_enabled()))
+ if (likely(!report_enabled()))
return;

disable_trace_on_warning();
@@ -411,41 +317,3 @@ void kasan_report(unsigned long addr, size_t size,

kasan_report_error(&info);
}
-
-
-#define DEFINE_ASAN_REPORT_LOAD(size) \
-void __asan_report_load##size##_noabort(unsigned long addr) \
-{ \
- kasan_report(addr, size, false, _RET_IP_); \
-} \
-EXPORT_SYMBOL(__asan_report_load##size##_noabort)
-
-#define DEFINE_ASAN_REPORT_STORE(size) \
-void __asan_report_store##size##_noabort(unsigned long addr) \
-{ \
- kasan_report(addr, size, true, _RET_IP_); \
-} \
-EXPORT_SYMBOL(__asan_report_store##size##_noabort)
-
-DEFINE_ASAN_REPORT_LOAD(1);
-DEFINE_ASAN_REPORT_LOAD(2);
-DEFINE_ASAN_REPORT_LOAD(4);
-DEFINE_ASAN_REPORT_LOAD(8);
-DEFINE_ASAN_REPORT_LOAD(16);
-DEFINE_ASAN_REPORT_STORE(1);
-DEFINE_ASAN_REPORT_STORE(2);
-DEFINE_ASAN_REPORT_STORE(4);
-DEFINE_ASAN_REPORT_STORE(8);
-DEFINE_ASAN_REPORT_STORE(16);
-
-void __asan_report_load_n_noabort(unsigned long addr, size_t size)
-{
- kasan_report(addr, size, false, _RET_IP_);
-}
-EXPORT_SYMBOL(__asan_report_load_n_noabort);
-
-void __asan_report_store_n_noabort(unsigned long addr, size_t size)
-{
- kasan_report(addr, size, true, _RET_IP_);
-}
-EXPORT_SYMBOL(__asan_report_store_n_noabort);
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:20:49

by Andrey Konovalov

Subject: [PATCH v4 09/17] khwasan, arm64: enable top byte ignore for the kernel

KHWASAN uses the Top Byte Ignore feature of arm64 CPUs to store a pointer
tag in the top byte of each pointer. This commit sets the TCR_TBI1 bit,
which turns on Top Byte Ignore for kernel addresses, when KHWASAN is used.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/include/asm/pgtable-hwdef.h | 1 +
arch/arm64/mm/proc.S | 8 +++++++-
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..483aceedad76 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -289,6 +289,7 @@
#define TCR_A1 (UL(1) << 22)
#define TCR_ASID16 (UL(1) << 36)
#define TCR_TBI0 (UL(1) << 37)
+#define TCR_TBI1 (UL(1) << 38)
#define TCR_HA (UL(1) << 39)
#define TCR_HD (UL(1) << 40)
#define TCR_NFD1 (UL(1) << 54)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 03646e6a2ef4..c5175e098d02 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -47,6 +47,12 @@
/* PTWs cacheable, inner/outer WBWA */
#define TCR_CACHE_FLAGS TCR_IRGN_WBWA | TCR_ORGN_WBWA

+#ifdef CONFIG_KASAN_HW
+#define TCR_KASAN_FLAGS TCR_TBI1
+#else
+#define TCR_KASAN_FLAGS 0
+#endif
+
#define MAIR(attr, mt) ((attr) << ((mt) * 8))

/*
@@ -440,7 +446,7 @@ ENTRY(__cpu_setup)
*/
ldr x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
- TCR_TBI0 | TCR_A1
+ TCR_TBI0 | TCR_A1 | TCR_KASAN_FLAGS
tcr_set_idmap_t0sz x10, x9

/*
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:21:32

by Andrey Konovalov

Subject: [PATCH v4 06/17] khwasan, arm64: untag virt address in __kimg_to_phys and _virt_addr_is_linear

__kimg_to_phys (which is used by virt_to_phys) and _virt_addr_is_linear
(which is used by virt_addr_valid) assume that the top byte of the address
is 0xff, which isn't always the case with KHWASAN enabled.

The solution is to reset the tag in those macros.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/include/asm/memory.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 6d084431b7f7..e9e054dfb1fc 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -92,6 +92,13 @@
#define KASAN_THREAD_SHIFT 0
#endif

+#ifdef CONFIG_KASAN_HW
+#define KASAN_TAG_SHIFTED(tag) ((unsigned long)(tag) << 56)
+#define KASAN_SET_TAG(addr, tag) (((addr) & ~KASAN_TAG_SHIFTED(0xff)) | \
+ KASAN_TAG_SHIFTED(tag))
+#define KASAN_RESET_TAG(addr) KASAN_SET_TAG(addr, 0xff)
+#endif
+
#define MIN_THREAD_SHIFT (14 + KASAN_THREAD_SHIFT)

/*
@@ -225,7 +232,12 @@ static inline unsigned long kaslr_offset(void)
#define __is_lm_address(addr) (!!((addr) & BIT(VA_BITS - 1)))

#define __lm_to_phys(addr) (((addr) & ~PAGE_OFFSET) + PHYS_OFFSET)
+
+#ifdef CONFIG_KASAN_HW
+#define __kimg_to_phys(addr) (KASAN_RESET_TAG(addr) - kimage_voffset)
+#else
#define __kimg_to_phys(addr) ((addr) - kimage_voffset)
+#endif

#define __virt_to_phys_nodebug(x) ({ \
phys_addr_t __x = (phys_addr_t)(x); \
@@ -301,7 +313,13 @@ static inline void *phys_to_virt(phys_addr_t x)
#endif
#endif

+#ifdef CONFIG_KASAN_HW
+#define _virt_addr_is_linear(kaddr) (KASAN_RESET_TAG((u64)(kaddr)) >= \
+ PAGE_OFFSET)
+#else
#define _virt_addr_is_linear(kaddr) (((u64)(kaddr)) >= PAGE_OFFSET)
+#endif
+
#define virt_addr_valid(kaddr) (_virt_addr_is_linear(kaddr) && \
_virt_addr_valid(kaddr))

--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:22:22

by Andrey Konovalov

Subject: [PATCH v4 04/17] khwasan, arm64: adjust shadow size for CONFIG_KASAN_HW

KHWASAN uses 1 shadow byte for 16 bytes of kernel memory, so it requires
1/16th of the kernel virtual address space for the shadow memory.

This commit sets KASAN_SHADOW_SCALE_SHIFT to 4 when KHWASAN is enabled.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/Makefile | 2 +-
arch/arm64/include/asm/memory.h | 13 +++++++++----
2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 45272266dafb..191a649f34e7 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -93,7 +93,7 @@ endif
# KASAN_SHADOW_OFFSET = VA_START + (1 << (VA_BITS - KASAN_SHADOW_SCALE_SHIFT))
# - (1 << (64 - KASAN_SHADOW_SCALE_SHIFT))
# in 32-bit arithmetic
-KASAN_SHADOW_SCALE_SHIFT := 3
+KASAN_SHADOW_SCALE_SHIFT := $(if $(CONFIG_KASAN_HW), 4, 3)
KASAN_SHADOW_OFFSET := $(shell printf "0x%08x00000000\n" $$(( \
(0xffffffff & (-1 << ($(CONFIG_ARM64_VA_BITS) - 32))) \
+ (1 << ($(CONFIG_ARM64_VA_BITS) - 32 - $(KASAN_SHADOW_SCALE_SHIFT))) \
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 49d99214f43c..6d084431b7f7 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -74,12 +74,17 @@
#define KERNEL_END _end

/*
- * KASAN requires 1/8th of the kernel virtual address space for the shadow
- * region. KASAN can bloat the stack significantly, so double the (minimum)
- * stack size when KASAN is in use.
+ * KASAN and KHWASAN require 1/8th and 1/16th of the kernel virtual address
+ * space for the shadow region respectively. They can bloat the stack
+ * significantly, so double the (minimum) stack size when they are in use.
*/
-#ifdef CONFIG_KASAN
+#ifdef CONFIG_KASAN_GENERIC
#define KASAN_SHADOW_SCALE_SHIFT 3
+#endif
+#ifdef CONFIG_KASAN_HW
+#define KASAN_SHADOW_SCALE_SHIFT 4
+#endif
+#ifdef CONFIG_KASAN
#define KASAN_SHADOW_SIZE (UL(1) << (VA_BITS - KASAN_SHADOW_SCALE_SHIFT))
#define KASAN_THREAD_SHIFT 1
#else
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:22:44

by Andrey Konovalov

Subject: [PATCH v4 01/17] khwasan, mm: change kasan hooks signatures

KHWASAN will change the value of the top byte of pointers returned from
the kernel allocation functions (such as kmalloc). This patch updates the
KASAN hook signatures and their usage in the SLAB and SLUB code to
reflect that.

Signed-off-by: Andrey Konovalov <[email protected]>
---
include/linux/kasan.h | 34 +++++++++++++++++++++++-----------
mm/kasan/kasan.c | 24 ++++++++++++++----------
mm/slab.c | 12 ++++++------
mm/slab.h | 2 +-
mm/slab_common.c | 4 ++--
mm/slub.c | 15 +++++++--------
6 files changed, 53 insertions(+), 38 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index de784fd11d12..cbdc54543803 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -53,14 +53,14 @@ void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
void kasan_poison_object_data(struct kmem_cache *cache, void *object);
void kasan_init_slab_obj(struct kmem_cache *cache, const void *object);

-void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags);
+void *kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags);
void kasan_kfree_large(void *ptr, unsigned long ip);
void kasan_poison_kfree(void *ptr, unsigned long ip);
-void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size,
+void *kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size,
gfp_t flags);
-void kasan_krealloc(const void *object, size_t new_size, gfp_t flags);
+void *kasan_krealloc(const void *object, size_t new_size, gfp_t flags);

-void kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags);
+void *kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags);
bool kasan_slab_free(struct kmem_cache *s, void *object, unsigned long ip);

struct kasan_cache {
@@ -105,16 +105,28 @@ static inline void kasan_poison_object_data(struct kmem_cache *cache,
static inline void kasan_init_slab_obj(struct kmem_cache *cache,
const void *object) {}

-static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {}
+static inline void *kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags)
+{
+ return ptr;
+}
static inline void kasan_kfree_large(void *ptr, unsigned long ip) {}
static inline void kasan_poison_kfree(void *ptr, unsigned long ip) {}
-static inline void kasan_kmalloc(struct kmem_cache *s, const void *object,
- size_t size, gfp_t flags) {}
-static inline void kasan_krealloc(const void *object, size_t new_size,
- gfp_t flags) {}
+static inline void *kasan_kmalloc(struct kmem_cache *s, const void *object,
+ size_t size, gfp_t flags)
+{
+ return (void *)object;
+}
+static inline void *kasan_krealloc(const void *object, size_t new_size,
+ gfp_t flags)
+{
+ return (void *)object;
+}

-static inline void kasan_slab_alloc(struct kmem_cache *s, void *object,
- gfp_t flags) {}
+static inline void *kasan_slab_alloc(struct kmem_cache *s, void *object,
+ gfp_t flags)
+{
+ return object;
+}
static inline bool kasan_slab_free(struct kmem_cache *s, void *object,
unsigned long ip)
{
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index f185455b3406..f04aa1e0ba48 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -485,9 +485,9 @@ void kasan_init_slab_obj(struct kmem_cache *cache, const void *object)
__memset(alloc_info, 0, sizeof(*alloc_info));
}

-void kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
+void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
{
- kasan_kmalloc(cache, object, cache->object_size, flags);
+ return kasan_kmalloc(cache, object, cache->object_size, flags);
}

static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
@@ -528,7 +528,7 @@ bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip)
return __kasan_slab_free(cache, object, ip, true);
}

-void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
+void *kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
gfp_t flags)
{
unsigned long redzone_start;
@@ -538,7 +538,7 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
quarantine_reduce();

if (unlikely(object == NULL))
- return;
+ return NULL;

redzone_start = round_up((unsigned long)(object + size),
KASAN_SHADOW_SCALE_SIZE);
@@ -551,10 +551,12 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,

if (cache->flags & SLAB_KASAN)
set_track(&get_alloc_info(cache, object)->alloc_track, flags);
+
+ return (void *)object;
}
EXPORT_SYMBOL(kasan_kmalloc);

-void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
+void *kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
{
struct page *page;
unsigned long redzone_start;
@@ -564,7 +566,7 @@ void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
quarantine_reduce();

if (unlikely(ptr == NULL))
- return;
+ return NULL;

page = virt_to_page(ptr);
redzone_start = round_up((unsigned long)(ptr + size),
@@ -574,21 +576,23 @@ void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
kasan_unpoison_shadow(ptr, size);
kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
KASAN_PAGE_REDZONE);
+
+ return (void *)ptr;
}

-void kasan_krealloc(const void *object, size_t size, gfp_t flags)
+void *kasan_krealloc(const void *object, size_t size, gfp_t flags)
{
struct page *page;

if (unlikely(object == ZERO_SIZE_PTR))
- return;
+ return ZERO_SIZE_PTR;

page = virt_to_head_page(object);

if (unlikely(!PageSlab(page)))
- kasan_kmalloc_large(object, size, flags);
+ return kasan_kmalloc_large(object, size, flags);
else
- kasan_kmalloc(page->slab_cache, object, size, flags);
+ return kasan_kmalloc(page->slab_cache, object, size, flags);
}

void kasan_poison_kfree(void *ptr, unsigned long ip)
diff --git a/mm/slab.c b/mm/slab.c
index aa76a70e087e..6fdca9ec2ea4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3551,7 +3551,7 @@ void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
{
void *ret = slab_alloc(cachep, flags, _RET_IP_);

- kasan_slab_alloc(cachep, ret, flags);
+ ret = kasan_slab_alloc(cachep, ret, flags);
trace_kmem_cache_alloc(_RET_IP_, ret,
cachep->object_size, cachep->size, flags);

@@ -3617,7 +3617,7 @@ kmem_cache_alloc_trace(struct kmem_cache *cachep, gfp_t flags, size_t size)

ret = slab_alloc(cachep, flags, _RET_IP_);

- kasan_kmalloc(cachep, ret, size, flags);
+ ret = kasan_kmalloc(cachep, ret, size, flags);
trace_kmalloc(_RET_IP_, ret,
size, cachep->size, flags);
return ret;
@@ -3641,7 +3641,7 @@ void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
{
void *ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);

- kasan_slab_alloc(cachep, ret, flags);
+ ret = kasan_slab_alloc(cachep, ret, flags);
trace_kmem_cache_alloc_node(_RET_IP_, ret,
cachep->object_size, cachep->size,
flags, nodeid);
@@ -3660,7 +3660,7 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,

ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);

- kasan_kmalloc(cachep, ret, size, flags);
+ ret = kasan_kmalloc(cachep, ret, size, flags);
trace_kmalloc_node(_RET_IP_, ret,
size, cachep->size,
flags, nodeid);
@@ -3679,7 +3679,7 @@ __do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
if (unlikely(ZERO_OR_NULL_PTR(cachep)))
return cachep;
ret = kmem_cache_alloc_node_trace(cachep, flags, node, size);
- kasan_kmalloc(cachep, ret, size, flags);
+ ret = kasan_kmalloc(cachep, ret, size, flags);

return ret;
}
@@ -3715,7 +3715,7 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
return cachep;
ret = slab_alloc(cachep, flags, caller);

- kasan_kmalloc(cachep, ret, size, flags);
+ ret = kasan_kmalloc(cachep, ret, size, flags);
trace_kmalloc(caller, ret,
size, cachep->size, flags);

diff --git a/mm/slab.h b/mm/slab.h
index 68bdf498da3b..15ef6a0d9c16 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -441,7 +441,7 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,

kmemleak_alloc_recursive(object, s->object_size, 1,
s->flags, flags);
- kasan_slab_alloc(s, object, flags);
+ p[i] = kasan_slab_alloc(s, object, flags);
}

if (memcg_kmem_enabled())
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 890b1f04a03a..c279b52c7565 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1179,7 +1179,7 @@ void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
page = alloc_pages(flags, order);
ret = page ? page_address(page) : NULL;
kmemleak_alloc(ret, size, 1, flags);
- kasan_kmalloc_large(ret, size, flags);
+ ret = kasan_kmalloc_large(ret, size, flags);
return ret;
}
EXPORT_SYMBOL(kmalloc_order);
@@ -1457,7 +1457,7 @@ static __always_inline void *__do_krealloc(const void *p, size_t new_size,
ks = ksize(p);

if (ks >= new_size) {
- kasan_krealloc((void *)p, new_size, flags);
+ p = kasan_krealloc((void *)p, new_size, flags);
return (void *)p;
}

diff --git a/mm/slub.c b/mm/slub.c
index a3b8467c14af..a60887938c19 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1336,10 +1336,10 @@ static inline void dec_slabs_node(struct kmem_cache *s, int node,
* Hooks for other subsystems that check memory allocations. In a typical
* production configuration these hooks all should produce no code at all.
*/
-static inline void kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
+static inline void *kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
{
kmemleak_alloc(ptr, size, 1, flags);
- kasan_kmalloc_large(ptr, size, flags);
+ return kasan_kmalloc_large(ptr, size, flags);
}

static __always_inline void kfree_hook(void *x)
@@ -2732,7 +2732,7 @@ void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
{
void *ret = slab_alloc(s, gfpflags, _RET_IP_);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags);
- kasan_kmalloc(s, ret, size, gfpflags);
+ ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
EXPORT_SYMBOL(kmem_cache_alloc_trace);
@@ -2760,7 +2760,7 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
trace_kmalloc_node(_RET_IP_, ret,
size, s->size, gfpflags, node);

- kasan_kmalloc(s, ret, size, gfpflags);
+ ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
@@ -3750,7 +3750,7 @@ void *__kmalloc(size_t size, gfp_t flags)

trace_kmalloc(_RET_IP_, ret, size, s->size, flags);

- kasan_kmalloc(s, ret, size, flags);
+ ret = kasan_kmalloc(s, ret, size, flags);

return ret;
}
@@ -3767,8 +3767,7 @@ static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
if (page)
ptr = page_address(page);

- kmalloc_large_node_hook(ptr, size, flags);
- return ptr;
+ return kmalloc_large_node_hook(ptr, size, flags);
}

void *__kmalloc_node(size_t size, gfp_t flags, int node)
@@ -3795,7 +3794,7 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node)

trace_kmalloc_node(_RET_IP_, ret, size, s->size, flags, node);

- kasan_kmalloc(s, ret, size, flags);
+ ret = kasan_kmalloc(s, ret, size, flags);

return ret;
}
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:23:05

by Andrey Konovalov

Subject: [PATCH v4 05/17] khwasan: initialize shadow to 0xff

A KHWASAN shadow memory cell contains the memory tag that corresponds to
the tag in the top byte of the pointer that points to that memory. The
native top byte value of kernel pointers is 0xff, so with KHWASAN we
need to initialize shadow memory to 0xff. This commit does that.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/mm/kasan_init.c | 16 ++++++++++++++--
include/linux/kasan.h | 8 ++++++++
mm/kasan/common.c | 3 ++-
3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 12145874c02b..7a31e8ccbad2 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -44,6 +44,15 @@ static phys_addr_t __init kasan_alloc_zeroed_page(int node)
return __pa(p);
}

+static phys_addr_t __init kasan_alloc_raw_page(int node)
+{
+ void *p = memblock_virt_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
+ __pa(MAX_DMA_ADDRESS),
+ MEMBLOCK_ALLOC_ACCESSIBLE,
+ node);
+ return __pa(p);
+}
+
static pte_t *__init kasan_pte_offset(pmd_t *pmdp, unsigned long addr, int node,
bool early)
{
@@ -89,7 +98,9 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,

do {
phys_addr_t page_phys = early ? __pa_symbol(kasan_zero_page)
- : kasan_alloc_zeroed_page(node);
+ : kasan_alloc_raw_page(node);
+ if (!early)
+ memset(__va(page_phys), KASAN_SHADOW_INIT, PAGE_SIZE);
next = addr + PAGE_SIZE;
set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
} while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)));
@@ -139,6 +150,7 @@ asmlinkage void __init kasan_early_init(void)
KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT)));
BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_START, PGDIR_SIZE));
BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE));
+
kasan_pgd_populate(KASAN_SHADOW_START, KASAN_SHADOW_END, NUMA_NO_NODE,
true);
}
@@ -235,7 +247,7 @@ void __init kasan_init(void)
set_pte(&kasan_zero_pte[i],
pfn_pte(sym_to_pfn(kasan_zero_page), PAGE_KERNEL_RO));

- memset(kasan_zero_page, 0, PAGE_SIZE);
+ memset(kasan_zero_page, KASAN_SHADOW_INIT, PAGE_SIZE);
cpu_replace_ttbr1(lm_alias(swapper_pg_dir));

/* At this point kasan is fully initialized. Enable error messages */
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 6608aa9b35ac..336385baf926 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -139,6 +139,8 @@ static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }

#ifdef CONFIG_KASAN_GENERIC

+#define KASAN_SHADOW_INIT 0
+
void kasan_cache_shrink(struct kmem_cache *cache);
void kasan_cache_shutdown(struct kmem_cache *cache);

@@ -149,4 +151,10 @@ static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}

#endif /* CONFIG_KASAN_GENERIC */

+#ifdef CONFIG_KASAN_HW
+
+#define KASAN_SHADOW_INIT 0xFF
+
+#endif /* CONFIG_KASAN_HW */
+
#endif /* LINUX_KASAN_H */
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index e78ebeff1f4c..656baa8984c7 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -473,11 +473,12 @@ int kasan_module_alloc(void *addr, size_t size)

ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
shadow_start + shadow_size,
- GFP_KERNEL | __GFP_ZERO,
+ GFP_KERNEL,
PAGE_KERNEL, VM_NO_GUARD, NUMA_NO_NODE,
__builtin_return_address(0));

if (ret) {
+ __memset(ret, KASAN_SHADOW_INIT, shadow_size);
find_vm_area(addr)->flags |= VM_KASAN;
kmemleak_ignore(ret);
return 0;
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:48:43

by Andrey Konovalov

Subject: [PATCH v4 07/17] khwasan: add tag related helper functions

This commit adds a few helper functions that are meant to be used to
work with tags embedded in the top byte of kernel pointers: to set, to
get, or to reset (set to 0xff) the top byte.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/mm/kasan_init.c | 2 ++
include/linux/kasan.h | 23 ++++++++++++++++
mm/kasan/kasan.h | 55 ++++++++++++++++++++++++++++++++++++++
mm/kasan/khwasan.c | 47 ++++++++++++++++++++++++++++++++
4 files changed, 127 insertions(+)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 7a31e8ccbad2..e7f37c0b7e14 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -250,6 +250,8 @@ void __init kasan_init(void)
memset(kasan_zero_page, KASAN_SHADOW_INIT, PAGE_SIZE);
cpu_replace_ttbr1(lm_alias(swapper_pg_dir));

+ khwasan_init();
+
/* At this point kasan is fully initialized. Enable error messages */
init_task.kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized\n");
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 336385baf926..d7624b879d86 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -155,6 +155,29 @@ static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}

#define KASAN_SHADOW_INIT 0xFF

+void khwasan_init(void);
+
+void *khwasan_set_tag(const void *addr, u8 tag);
+u8 khwasan_get_tag(const void *addr);
+void *khwasan_reset_tag(const void *ptr);
+
+#else /* CONFIG_KASAN_HW */
+
+static inline void khwasan_init(void) { }
+
+static inline void *khwasan_set_tag(const void *addr, u8 tag)
+{
+ return (void *)addr;
+}
+static inline u8 khwasan_get_tag(const void *addr)
+{
+ return 0xFF;
+}
+static inline void *khwasan_reset_tag(const void *ptr)
+{
+ return (void *)ptr;
+}
+
#endif /* CONFIG_KASAN_HW */

#endif /* LINUX_KASAN_H */
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 19b950eaccff..a7cc27d96608 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -8,6 +8,10 @@
#define KASAN_SHADOW_SCALE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
#define KASAN_SHADOW_MASK (KASAN_SHADOW_SCALE_SIZE - 1)

+#define KHWASAN_TAG_KERNEL 0xFF /* native kernel pointers tag */
+#define KHWASAN_TAG_INVALID 0xFE /* inaccessible memory tag */
+#define KHWASAN_TAG_MAX 0xFD /* maximum value for random tags */
+
#define KASAN_FREE_PAGE 0xFF /* page was freed */
#define KASAN_PAGE_REDZONE 0xFE /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE 0xFC /* redzone inside slub object */
@@ -126,6 +130,57 @@ static inline void quarantine_reduce(void) { }
static inline void quarantine_remove_cache(struct kmem_cache *cache) { }
#endif

+#ifdef CONFIG_KASAN_HW
+
+#define KHWASAN_TAG_SHIFT 56
+#define KHWASAN_TAG_MASK (0xFFUL << KHWASAN_TAG_SHIFT)
+
+u8 random_tag(void);
+
+static inline void *set_tag(const void *addr, u8 tag)
+{
+ u64 a = (u64)addr;
+
+ a &= ~KHWASAN_TAG_MASK;
+ a |= ((u64)tag << KHWASAN_TAG_SHIFT);
+
+ return (void *)a;
+}
+
+static inline u8 get_tag(const void *addr)
+{
+ return (u8)((u64)addr >> KHWASAN_TAG_SHIFT);
+}
+
+static inline void *reset_tag(const void *addr)
+{
+ return set_tag(addr, KHWASAN_TAG_KERNEL);
+}
+
+#else /* CONFIG_KASAN_HW */
+
+static inline u8 random_tag(void)
+{
+ return 0;
+}
+
+static inline void *set_tag(const void *addr, u8 tag)
+{
+ return (void *)addr;
+}
+
+static inline u8 get_tag(const void *addr)
+{
+ return 0;
+}
+
+static inline void *reset_tag(const void *addr)
+{
+ return (void *)addr;
+}
+
+#endif /* CONFIG_KASAN_HW */
+
/*
* Exported functions for interfaces called from assembly or from generated
* code. Declarations here to avoid warning about missing declarations.
diff --git a/mm/kasan/khwasan.c b/mm/kasan/khwasan.c
index e2c3a7f7fd1f..d34679b8f8c7 100644
--- a/mm/kasan/khwasan.c
+++ b/mm/kasan/khwasan.c
@@ -38,6 +38,53 @@
#include "kasan.h"
#include "../slab.h"

+static DEFINE_PER_CPU(u32, prng_state);
+
+void khwasan_init(void)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu)
+ per_cpu(prng_state, cpu) = get_random_u32();
+}
+
+/*
+ * If a preemption happens between this_cpu_read and this_cpu_write, the only
+ * side effect is that we'll give the same tag to a few objects allocated in
+ * different contexts. Since KHWASAN is meant to be used as a probabilistic
+ * bug-detection debug feature, this doesn't have a significant negative impact.
+ *
+ * Ideally the tags use strong randomness to prevent any attempts to predict
+ * them during explicit exploit attempts. But strong randomness is expensive,
+ * and we made an intentional trade-off to use a PRNG. This non-atomic RMW
+ * sequence in fact has a positive effect, since interrupts that skew the
+ * PRNG state at unpredictable points only add entropy.
+ */
+u8 random_tag(void)
+{
+ u32 state = this_cpu_read(prng_state);
+
+ state = 1664525 * state + 1013904223;
+ this_cpu_write(prng_state, state);
+
+ return (u8)(state % (KHWASAN_TAG_MAX + 1));
+}
+
+void *khwasan_set_tag(const void *addr, u8 tag)
+{
+ return set_tag(addr, tag);
+}
+
+u8 khwasan_get_tag(const void *addr)
+{
+ return get_tag(addr);
+}
+
+void *khwasan_reset_tag(const void *addr)
+{
+ return reset_tag(addr);
+}
+
void check_memory_region(unsigned long addr, size_t size, bool write,
unsigned long ret_ip)
{
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:50:44

by Andrey Konovalov

Subject: [PATCH v4 15/17] khwasan, mm, arm64: tag non slab memory allocated via pagealloc

KHWASAN doesn't check memory accesses through pointers tagged with 0xff.
When page_address is used to get a pointer to the memory that corresponds
to some page, the tag of the resulting pointer is set to 0xff, even though
the allocated memory might have been tagged differently.

For slab pages it's impossible to recover the correct tag to return from
page_address, since the page might contain multiple slab objects tagged
with different values, and we can't know in advance which one of them is
going to get accessed. For non-slab pages, however, we can recover the tag
in page_address, since the whole page was marked with the same tag.

This patch adds tagging of non-slab memory allocated via pagealloc. To
set the tag of the pointer returned from page_address, the tag is stored
in page->flags when the memory gets allocated.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/include/asm/memory.h | 10 ++++++++++
include/linux/mm.h | 29 +++++++++++++++++++++++++++++
include/linux/page-flags-layout.h | 10 ++++++++++
mm/cma.c | 11 +++++++++++
mm/kasan/common.c | 15 +++++++++++++--
mm/page_alloc.c | 1 +
6 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index e9e054dfb1fc..3352a65b8312 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -305,7 +305,17 @@ static inline void *phys_to_virt(phys_addr_t x)
#define __virt_to_pgoff(kaddr) (((u64)(kaddr) & ~PAGE_OFFSET) / PAGE_SIZE * sizeof(struct page))
#define __page_to_voff(kaddr) (((u64)(kaddr) & ~VMEMMAP_START) * PAGE_SIZE / sizeof(struct page))

+#ifndef CONFIG_KASAN_HW
#define page_to_virt(page) ((void *)((__page_to_voff(page)) | PAGE_OFFSET))
+#else
+#define page_to_virt(page) ({ \
+ unsigned long __addr = \
+ ((__page_to_voff(page)) | PAGE_OFFSET); \
+ __addr = KASAN_SET_TAG(__addr, page_kasan_tag(page)); \
+ ((void *)__addr); \
+})
+#endif
+
#define virt_to_page(vaddr) ((struct page *)((__virt_to_pgoff(vaddr)) | VMEMMAP_START))

#define _virt_addr_valid(kaddr) pfn_valid((((u64)(kaddr) & ~PAGE_OFFSET) \
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a0fbb9ffe380..46afadf4f48c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -784,6 +784,7 @@ int finish_mkwrite_fault(struct vm_fault *vmf);
#define NODES_PGOFF (SECTIONS_PGOFF - NODES_WIDTH)
#define ZONES_PGOFF (NODES_PGOFF - ZONES_WIDTH)
#define LAST_CPUPID_PGOFF (ZONES_PGOFF - LAST_CPUPID_WIDTH)
+#define KASAN_TAG_PGOFF (LAST_CPUPID_PGOFF - KASAN_TAG_WIDTH)

/*
* Define the bit shifts to access each section. For non-existent
@@ -794,6 +795,7 @@ int finish_mkwrite_fault(struct vm_fault *vmf);
#define NODES_PGSHIFT (NODES_PGOFF * (NODES_WIDTH != 0))
#define ZONES_PGSHIFT (ZONES_PGOFF * (ZONES_WIDTH != 0))
#define LAST_CPUPID_PGSHIFT (LAST_CPUPID_PGOFF * (LAST_CPUPID_WIDTH != 0))
+#define KASAN_TAG_PGSHIFT (KASAN_TAG_PGOFF * (KASAN_TAG_WIDTH != 0))

/* NODE:ZONE or SECTION:ZONE is used to ID a zone for the buddy allocator */
#ifdef NODE_NOT_IN_PAGE_FLAGS
@@ -816,6 +818,7 @@ int finish_mkwrite_fault(struct vm_fault *vmf);
#define NODES_MASK ((1UL << NODES_WIDTH) - 1)
#define SECTIONS_MASK ((1UL << SECTIONS_WIDTH) - 1)
#define LAST_CPUPID_MASK ((1UL << LAST_CPUPID_SHIFT) - 1)
+#define KASAN_TAG_MASK ((1UL << KASAN_TAG_WIDTH) - 1)
#define ZONEID_MASK ((1UL << ZONEID_SHIFT) - 1)

static inline enum zone_type page_zonenum(const struct page *page)
@@ -1070,6 +1073,32 @@ static inline bool cpupid_match_pid(struct task_struct *task, int cpupid)
}
#endif /* CONFIG_NUMA_BALANCING */

+#ifdef CONFIG_KASAN_HW
+static inline u8 page_kasan_tag(const struct page *page)
+{
+ return (page->flags >> KASAN_TAG_PGSHIFT) & KASAN_TAG_MASK;
+}
+
+static inline void page_kasan_tag_set(struct page *page, u8 tag)
+{
+ page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+ page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+}
+
+static inline void page_kasan_tag_reset(struct page *page)
+{
+ page_kasan_tag_set(page, 0xff);
+}
+#else
+static inline u8 page_kasan_tag(const struct page *page)
+{
+ return 0xff;
+}
+
+static inline void page_kasan_tag_set(struct page *page, u8 tag) { }
+static inline void page_kasan_tag_reset(struct page *page) { }
+#endif
+
static inline struct zone *page_zone(const struct page *page)
{
return &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)];
diff --git a/include/linux/page-flags-layout.h b/include/linux/page-flags-layout.h
index 7ec86bf31ce4..8dbad17664c2 100644
--- a/include/linux/page-flags-layout.h
+++ b/include/linux/page-flags-layout.h
@@ -82,6 +82,16 @@
#define LAST_CPUPID_WIDTH 0
#endif

+#ifdef CONFIG_KASAN_HW
+#define KASAN_TAG_WIDTH 8
+#if SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH+LAST_CPUPID_WIDTH+KASAN_TAG_WIDTH \
+ > BITS_PER_LONG - NR_PAGEFLAGS
+#error "KASAN: not enough bits in page flags for tag"
+#endif
+#else
+#define KASAN_TAG_WIDTH 0
+#endif
+
/*
* We are going to use the flags for the page to node mapping if its in
* there. This includes the case where there is no node, so it is implicit.
diff --git a/mm/cma.c b/mm/cma.c
index 5809bbe360d7..fdad7ad0d9c4 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -407,6 +407,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
unsigned long pfn = -1;
unsigned long start = 0;
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
+ size_t i;
struct page *page = NULL;
int ret = -ENOMEM;

@@ -466,6 +467,16 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,

trace_cma_alloc(pfn, page, count, align);

+ /*
+ * CMA can allocate multiple page blocks, which results in different
+ * blocks being marked with different tags. Reset the tags to ignore
+ * those page blocks.
+ */
+ if (page) {
+ for (i = 0; i < count; i++)
+ page_kasan_tag_reset(page + i);
+ }
+
if (ret && !(gfp_mask & __GFP_NOWARN)) {
pr_err("%s: alloc failed, req-size: %zu pages, ret: %d\n",
__func__, count, ret);
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 1e96ca050c75..6cf7dec0b765 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -212,8 +212,15 @@ void kasan_unpoison_stack_above_sp_to(const void *watermark)

void kasan_alloc_pages(struct page *page, unsigned int order)
{
+ u8 tag;
+ unsigned long i;
+
if (unlikely(PageHighMem(page)))
return;
+
+ tag = random_tag();
+ for (i = 0; i < (1 << order); i++)
+ page_kasan_tag_set(page + i, tag);
kasan_unpoison_shadow(page_address(page), PAGE_SIZE << order);
}

@@ -311,6 +318,10 @@ struct kasan_free_meta *get_free_info(struct kmem_cache *cache,

void kasan_poison_slab(struct page *page)
{
+ unsigned long i;
+
+ for (i = 0; i < (1 << compound_order(page)); i++)
+ page_kasan_tag_reset(page + i);
kasan_poison_shadow(page_address(page),
PAGE_SIZE << compound_order(page),
KASAN_KMALLOC_REDZONE);
@@ -483,7 +494,7 @@ void kasan_poison_kfree(void *ptr, unsigned long ip)
page = virt_to_head_page(ptr);

if (unlikely(!PageSlab(page))) {
- if (reset_tag(ptr) != page_address(page)) {
+ if (ptr != page_address(page)) {
kasan_report_invalid_free(ptr, ip);
return;
}
@@ -496,7 +507,7 @@ void kasan_poison_kfree(void *ptr, unsigned long ip)

void kasan_kfree_large(void *ptr, unsigned long ip)
{
- if (reset_tag(ptr) != page_address(virt_to_head_page(ptr)))
+ if (ptr != page_address(virt_to_head_page(ptr)))
kasan_report_invalid_free(ptr, ip);
/* The object will be poisoned by page_alloc. */
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..266e86323d73 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1176,6 +1176,7 @@ static void __meminit __init_single_page(struct page *page, unsigned long pfn,
init_page_count(page);
page_mapcount_reset(page);
page_cpupid_reset_last(page);
+ page_kasan_tag_reset(page);

INIT_LIST_HEAD(&page->lru);
#ifdef WANT_PAGE_VIRTUAL
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 13:53:50

by Andrey Konovalov

Subject: [PATCH v4 10/17] khwasan, mm: perform untagged pointers comparison in krealloc

The krealloc function checks whether the same buffer was reused or a new
one was allocated by comparing kernel pointers. KHWASAN changes the memory
tag on the krealloc'ed chunk of memory and therefore also changes the
pointer tag of the returned pointer. Therefore we need to perform the
comparison on untagged (with tags reset) pointers to check whether it's
the same memory region or not.

Signed-off-by: Andrey Konovalov <[email protected]>
---
mm/slab_common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index c279b52c7565..7746d2ef5d45 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1509,7 +1509,7 @@ void *krealloc(const void *p, size_t new_size, gfp_t flags)
}

ret = __do_krealloc(p, new_size, flags);
- if (ret && p != ret)
+ if (ret && khwasan_reset_tag(p) != khwasan_reset_tag(ret))
kfree(p);

return ret;
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 14:12:41

by Andrey Konovalov

Subject: [PATCH v4 02/17] khwasan: move common kasan and khwasan code to common.c

KHWASAN will reuse a significant part of KASAN code, so move the common
parts to common.c without any functional changes.

Signed-off-by: Andrey Konovalov <[email protected]>
---
mm/kasan/Makefile | 5 +-
mm/kasan/common.c | 603 ++++++++++++++++++++++++++++++++++++++++++++++
mm/kasan/kasan.c | 567 +------------------------------------------
mm/kasan/kasan.h | 5 +
4 files changed, 614 insertions(+), 566 deletions(-)
create mode 100644 mm/kasan/common.c

diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index 3289db38bc87..a6df14bffb6b 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -1,11 +1,14 @@
# SPDX-License-Identifier: GPL-2.0
KASAN_SANITIZE := n
+UBSAN_SANITIZE_common.o := n
UBSAN_SANITIZE_kasan.o := n
KCOV_INSTRUMENT := n

CFLAGS_REMOVE_kasan.o = -pg
# Function splitter causes unnecessary splits in __asan_load1/__asan_store1
# see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63533
+
+CFLAGS_common.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)

-obj-y := kasan.o report.o kasan_init.o quarantine.o
+obj-y := common.o kasan.o report.o kasan_init.o quarantine.o
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
new file mode 100644
index 000000000000..e78ebeff1f4c
--- /dev/null
+++ b/mm/kasan/common.c
@@ -0,0 +1,603 @@
+/*
+ * This file contains common KASAN and KHWASAN code.
+ *
+ * Copyright (c) 2014 Samsung Electronics Co., Ltd.
+ * Author: Andrey Ryabinin <[email protected]>
+ *
+ * Some code borrowed from https://github.com/xairy/kasan-prototype by
+ * Andrey Konovalov <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/export.h>
+#include <linux/interrupt.h>
+#include <linux/init.h>
+#include <linux/kasan.h>
+#include <linux/kernel.h>
+#include <linux/kmemleak.h>
+#include <linux/linkage.h>
+#include <linux/memblock.h>
+#include <linux/memory.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+#include <linux/slab.h>
+#include <linux/stacktrace.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+#include <linux/bug.h>
+
+#include "kasan.h"
+#include "../slab.h"
+
+static inline int in_irqentry_text(unsigned long ptr)
+{
+ return (ptr >= (unsigned long)&__irqentry_text_start &&
+ ptr < (unsigned long)&__irqentry_text_end) ||
+ (ptr >= (unsigned long)&__softirqentry_text_start &&
+ ptr < (unsigned long)&__softirqentry_text_end);
+}
+
+static inline void filter_irq_stacks(struct stack_trace *trace)
+{
+ int i;
+
+ if (!trace->nr_entries)
+ return;
+ for (i = 0; i < trace->nr_entries; i++)
+ if (in_irqentry_text(trace->entries[i])) {
+ /* Include the irqentry function into the stack. */
+ trace->nr_entries = i + 1;
+ break;
+ }
+}
+
+static inline depot_stack_handle_t save_stack(gfp_t flags)
+{
+ unsigned long entries[KASAN_STACK_DEPTH];
+ struct stack_trace trace = {
+ .nr_entries = 0,
+ .entries = entries,
+ .max_entries = KASAN_STACK_DEPTH,
+ .skip = 0
+ };
+
+ save_stack_trace(&trace);
+ filter_irq_stacks(&trace);
+ if (trace.nr_entries != 0 &&
+ trace.entries[trace.nr_entries-1] == ULONG_MAX)
+ trace.nr_entries--;
+
+ return depot_save_stack(&trace, flags);
+}
+
+static inline void set_track(struct kasan_track *track, gfp_t flags)
+{
+ track->pid = current->pid;
+ track->stack = save_stack(flags);
+}
+
+void kasan_enable_current(void)
+{
+ current->kasan_depth++;
+}
+
+void kasan_disable_current(void)
+{
+ current->kasan_depth--;
+}
+
+void kasan_check_read(const volatile void *p, unsigned int size)
+{
+ check_memory_region((unsigned long)p, size, false, _RET_IP_);
+}
+EXPORT_SYMBOL(kasan_check_read);
+
+void kasan_check_write(const volatile void *p, unsigned int size)
+{
+ check_memory_region((unsigned long)p, size, true, _RET_IP_);
+}
+EXPORT_SYMBOL(kasan_check_write);
+
+#undef memset
+void *memset(void *addr, int c, size_t len)
+{
+ check_memory_region((unsigned long)addr, len, true, _RET_IP_);
+
+ return __memset(addr, c, len);
+}
+
+#undef memmove
+void *memmove(void *dest, const void *src, size_t len)
+{
+ check_memory_region((unsigned long)src, len, false, _RET_IP_);
+ check_memory_region((unsigned long)dest, len, true, _RET_IP_);
+
+ return __memmove(dest, src, len);
+}
+
+#undef memcpy
+void *memcpy(void *dest, const void *src, size_t len)
+{
+ check_memory_region((unsigned long)src, len, false, _RET_IP_);
+ check_memory_region((unsigned long)dest, len, true, _RET_IP_);
+
+ return __memcpy(dest, src, len);
+}
+
+/*
+ * Poisons the shadow memory for 'size' bytes starting from 'addr'.
+ * Memory addresses should be aligned to KASAN_SHADOW_SCALE_SIZE.
+ */
+void kasan_poison_shadow(const void *address, size_t size, u8 value)
+{
+ void *shadow_start, *shadow_end;
+
+ shadow_start = kasan_mem_to_shadow(address);
+ shadow_end = kasan_mem_to_shadow(address + size);
+
+ __memset(shadow_start, value, shadow_end - shadow_start);
+}
+
+void kasan_unpoison_shadow(const void *address, size_t size)
+{
+ kasan_poison_shadow(address, size, 0);
+
+ if (size & KASAN_SHADOW_MASK) {
+ u8 *shadow = (u8 *)kasan_mem_to_shadow(address + size);
+ *shadow = size & KASAN_SHADOW_MASK;
+ }
+}
+
+static void __kasan_unpoison_stack(struct task_struct *task, const void *sp)
+{
+ void *base = task_stack_page(task);
+ size_t size = sp - base;
+
+ kasan_unpoison_shadow(base, size);
+}
+
+/* Unpoison the entire stack for a task. */
+void kasan_unpoison_task_stack(struct task_struct *task)
+{
+ __kasan_unpoison_stack(task, task_stack_page(task) + THREAD_SIZE);
+}
+
+/* Unpoison the stack for the current task beyond a watermark sp value. */
+asmlinkage void kasan_unpoison_task_stack_below(const void *watermark)
+{
+ /*
+ * Calculate the task stack base address. Avoid using 'current'
+ * because this function is called by early resume code which hasn't
+ * yet set up the percpu register (%gs).
+ */
+ void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1));
+
+ kasan_unpoison_shadow(base, watermark - base);
+}
+
+/*
+ * Clear all poison for the region between the current SP and a provided
+ * watermark value, as is sometimes required prior to hand-crafted asm function
+ * returns in the middle of functions.
+ */
+void kasan_unpoison_stack_above_sp_to(const void *watermark)
+{
+ const void *sp = __builtin_frame_address(0);
+ size_t size = watermark - sp;
+
+ if (WARN_ON(sp > watermark))
+ return;
+ kasan_unpoison_shadow(sp, size);
+}
+
+void kasan_alloc_pages(struct page *page, unsigned int order)
+{
+ if (likely(!PageHighMem(page)))
+ kasan_unpoison_shadow(page_address(page), PAGE_SIZE << order);
+}
+
+void kasan_free_pages(struct page *page, unsigned int order)
+{
+ if (likely(!PageHighMem(page)))
+ kasan_poison_shadow(page_address(page),
+ PAGE_SIZE << order,
+ KASAN_FREE_PAGE);
+}
+
+/*
+ * Adaptive redzone policy taken from the userspace AddressSanitizer runtime.
+ * For larger allocations larger redzones are used.
+ */
+static inline unsigned int optimal_redzone(unsigned int object_size)
+{
+ if (IS_ENABLED(CONFIG_KASAN_HW))
+ return 0;
+
+ return
+ object_size <= 64 - 16 ? 16 :
+ object_size <= 128 - 32 ? 32 :
+ object_size <= 512 - 64 ? 64 :
+ object_size <= 4096 - 128 ? 128 :
+ object_size <= (1 << 14) - 256 ? 256 :
+ object_size <= (1 << 15) - 512 ? 512 :
+ object_size <= (1 << 16) - 1024 ? 1024 : 2048;
+}
+
+void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
+ slab_flags_t *flags)
+{
+ unsigned int orig_size = *size;
+ int redzone_adjust;
+
+ /* Add alloc meta. */
+ cache->kasan_info.alloc_meta_offset = *size;
+ *size += sizeof(struct kasan_alloc_meta);
+
+ /* Add free meta. */
+ if (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
+ cache->object_size < sizeof(struct kasan_free_meta)) {
+ cache->kasan_info.free_meta_offset = *size;
+ *size += sizeof(struct kasan_free_meta);
+ }
+ redzone_adjust = optimal_redzone(cache->object_size) -
+ (*size - cache->object_size);
+
+ if (redzone_adjust > 0)
+ *size += redzone_adjust;
+
+ *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
+ max(*size, cache->object_size +
+ optimal_redzone(cache->object_size)));
+
+ /*
+ * If the metadata doesn't fit, don't enable KASAN at all.
+ */
+ if (*size <= cache->kasan_info.alloc_meta_offset ||
+ *size <= cache->kasan_info.free_meta_offset) {
+ cache->kasan_info.alloc_meta_offset = 0;
+ cache->kasan_info.free_meta_offset = 0;
+ *size = orig_size;
+ return;
+ }
+
+ *flags |= SLAB_KASAN;
+}
+
+size_t kasan_metadata_size(struct kmem_cache *cache)
+{
+ return (cache->kasan_info.alloc_meta_offset ?
+ sizeof(struct kasan_alloc_meta) : 0) +
+ (cache->kasan_info.free_meta_offset ?
+ sizeof(struct kasan_free_meta) : 0);
+}
+
+struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
+ const void *object)
+{
+ BUILD_BUG_ON(sizeof(struct kasan_alloc_meta) > 32);
+ return (void *)object + cache->kasan_info.alloc_meta_offset;
+}
+
+struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
+ const void *object)
+{
+ BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
+ return (void *)object + cache->kasan_info.free_meta_offset;
+}
+
+void kasan_poison_slab(struct page *page)
+{
+ kasan_poison_shadow(page_address(page),
+ PAGE_SIZE << compound_order(page),
+ KASAN_KMALLOC_REDZONE);
+}
+
+void kasan_unpoison_object_data(struct kmem_cache *cache, void *object)
+{
+ kasan_unpoison_shadow(object, cache->object_size);
+}
+
+void kasan_poison_object_data(struct kmem_cache *cache, void *object)
+{
+ kasan_poison_shadow(object,
+ round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
+ KASAN_KMALLOC_REDZONE);
+}
+
+void kasan_init_slab_obj(struct kmem_cache *cache, const void *object)
+{
+ struct kasan_alloc_meta *alloc_info;
+
+ if (!(cache->flags & SLAB_KASAN))
+ return;
+
+ alloc_info = get_alloc_info(cache, object);
+ __memset(alloc_info, 0, sizeof(*alloc_info));
+}
+
+void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
+{
+ return kasan_kmalloc(cache, object, cache->object_size, flags);
+}
+
+static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
+ unsigned long ip, bool quarantine)
+{
+ s8 shadow_byte;
+ unsigned long rounded_up_size;
+
+ if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
+ object)) {
+ kasan_report_invalid_free(object, ip);
+ return true;
+ }
+
+ /* RCU slabs could be legally used after free within the RCU period */
+ if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
+ return false;
+
+ shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(object));
+ if (shadow_byte < 0 || shadow_byte >= KASAN_SHADOW_SCALE_SIZE) {
+ kasan_report_invalid_free(object, ip);
+ return true;
+ }
+
+ rounded_up_size = round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE);
+ kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
+
+ if (!quarantine || unlikely(!(cache->flags & SLAB_KASAN)))
+ return false;
+
+ set_track(&get_alloc_info(cache, object)->free_track, GFP_NOWAIT);
+ quarantine_put(get_free_info(cache, object), cache);
+ return true;
+}
+
+bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip)
+{
+ return __kasan_slab_free(cache, object, ip, true);
+}
+
+void *kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
+ gfp_t flags)
+{
+ unsigned long redzone_start;
+ unsigned long redzone_end;
+
+ if (gfpflags_allow_blocking(flags))
+ quarantine_reduce();
+
+ if (unlikely(object == NULL))
+ return NULL;
+
+ redzone_start = round_up((unsigned long)(object + size),
+ KASAN_SHADOW_SCALE_SIZE);
+ redzone_end = round_up((unsigned long)object + cache->object_size,
+ KASAN_SHADOW_SCALE_SIZE);
+
+ kasan_unpoison_shadow(object, size);
+ kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
+ KASAN_KMALLOC_REDZONE);
+
+ if (cache->flags & SLAB_KASAN)
+ set_track(&get_alloc_info(cache, object)->alloc_track, flags);
+
+ return (void *)object;
+}
+EXPORT_SYMBOL(kasan_kmalloc);
+
+void *kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
+{
+ struct page *page;
+ unsigned long redzone_start;
+ unsigned long redzone_end;
+
+ if (gfpflags_allow_blocking(flags))
+ quarantine_reduce();
+
+ if (unlikely(ptr == NULL))
+ return NULL;
+
+ page = virt_to_page(ptr);
+ redzone_start = round_up((unsigned long)(ptr + size),
+ KASAN_SHADOW_SCALE_SIZE);
+ redzone_end = (unsigned long)ptr + (PAGE_SIZE << compound_order(page));
+
+ kasan_unpoison_shadow(ptr, size);
+ kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
+ KASAN_PAGE_REDZONE);
+
+ return (void *)ptr;
+}
+
+void *kasan_krealloc(const void *object, size_t size, gfp_t flags)
+{
+ struct page *page;
+
+ if (unlikely(object == ZERO_SIZE_PTR))
+ return (void *)object;
+
+ page = virt_to_head_page(object);
+
+ if (unlikely(!PageSlab(page)))
+ return kasan_kmalloc_large(object, size, flags);
+ else
+ return kasan_kmalloc(page->slab_cache, object, size, flags);
+}
+
+void kasan_poison_kfree(void *ptr, unsigned long ip)
+{
+ struct page *page;
+
+ page = virt_to_head_page(ptr);
+
+ if (unlikely(!PageSlab(page))) {
+ if (ptr != page_address(page)) {
+ kasan_report_invalid_free(ptr, ip);
+ return;
+ }
+ kasan_poison_shadow(ptr, PAGE_SIZE << compound_order(page),
+ KASAN_FREE_PAGE);
+ } else {
+ __kasan_slab_free(page->slab_cache, ptr, ip, false);
+ }
+}
+
+void kasan_kfree_large(void *ptr, unsigned long ip)
+{
+ if (ptr != page_address(virt_to_head_page(ptr)))
+ kasan_report_invalid_free(ptr, ip);
+ /* The object will be poisoned by page_alloc. */
+}
+
+int kasan_module_alloc(void *addr, size_t size)
+{
+ void *ret;
+ size_t shadow_size;
+ unsigned long shadow_start;
+
+ shadow_start = (unsigned long)kasan_mem_to_shadow(addr);
+ shadow_size = round_up(size >> KASAN_SHADOW_SCALE_SHIFT,
+ PAGE_SIZE);
+
+ if (WARN_ON(!PAGE_ALIGNED(shadow_start)))
+ return -EINVAL;
+
+ ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
+ shadow_start + shadow_size,
+ GFP_KERNEL | __GFP_ZERO,
+ PAGE_KERNEL, VM_NO_GUARD, NUMA_NO_NODE,
+ __builtin_return_address(0));
+
+ if (ret) {
+ find_vm_area(addr)->flags |= VM_KASAN;
+ kmemleak_ignore(ret);
+ return 0;
+ }
+
+ return -ENOMEM;
+}
+
+void kasan_free_shadow(const struct vm_struct *vm)
+{
+ if (vm->flags & VM_KASAN)
+ vfree(kasan_mem_to_shadow(vm->addr));
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static bool shadow_mapped(unsigned long addr)
+{
+ pgd_t *pgd = pgd_offset_k(addr);
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ if (pgd_none(*pgd))
+ return false;
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d))
+ return false;
+ pud = pud_offset(p4d, addr);
+ if (pud_none(*pud))
+ return false;
+
+ /*
+ * We can't use pud_large() or pud_huge(), the first one is
+ * arch-specific, the last one depends on HUGETLB_PAGE. So let's abuse
+ * pud_bad(), if pud is bad then it's bad because it's huge.
+ */
+ if (pud_bad(*pud))
+ return true;
+ pmd = pmd_offset(pud, addr);
+ if (pmd_none(*pmd))
+ return false;
+
+ if (pmd_bad(*pmd))
+ return true;
+ pte = pte_offset_kernel(pmd, addr);
+ return !pte_none(*pte);
+}
+
+static int __meminit kasan_mem_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct memory_notify *mem_data = data;
+ unsigned long nr_shadow_pages, start_kaddr, shadow_start;
+ unsigned long shadow_end, shadow_size;
+
+ nr_shadow_pages = mem_data->nr_pages >> KASAN_SHADOW_SCALE_SHIFT;
+ start_kaddr = (unsigned long)pfn_to_kaddr(mem_data->start_pfn);
+ shadow_start = (unsigned long)kasan_mem_to_shadow((void *)start_kaddr);
+ shadow_size = nr_shadow_pages << PAGE_SHIFT;
+ shadow_end = shadow_start + shadow_size;
+
+ if (WARN_ON(mem_data->nr_pages % KASAN_SHADOW_SCALE_SIZE) ||
+ WARN_ON(start_kaddr % (KASAN_SHADOW_SCALE_SIZE << PAGE_SHIFT)))
+ return NOTIFY_BAD;
+
+ switch (action) {
+ case MEM_GOING_ONLINE: {
+ void *ret;
+
+ /*
+ * If shadow is mapped already then it must have been mapped
+ * during boot. This could happen if we are onlining previously
+ * offlined memory.
+ */
+ if (shadow_mapped(shadow_start))
+ return NOTIFY_OK;
+
+ ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
+ shadow_end, GFP_KERNEL,
+ PAGE_KERNEL, VM_NO_GUARD,
+ pfn_to_nid(mem_data->start_pfn),
+ __builtin_return_address(0));
+ if (!ret)
+ return NOTIFY_BAD;
+
+ kmemleak_ignore(ret);
+ return NOTIFY_OK;
+ }
+ case MEM_CANCEL_ONLINE:
+ case MEM_OFFLINE: {
+ struct vm_struct *vm;
+
+ /*
+ * shadow_start was either mapped during boot by kasan_init()
+ * or during memory online by __vmalloc_node_range().
+ * In the latter case we can use vfree() to free shadow.
+ * Non-NULL result of the find_vm_area() will tell us if
+ * that was the second case.
+ *
+ * Currently it's not possible to free shadow mapped
+ * during boot by kasan_init(). It's because the code
+ * to do that hasn't been written yet. So we'll just
+ * leak the memory.
+ */
+ vm = find_vm_area((void *)shadow_start);
+ if (vm)
+ vfree((void *)shadow_start);
+ }
+ }
+
+ return NOTIFY_OK;
+}
+
+static int __init kasan_memhotplug_init(void)
+{
+ hotplug_memory_notifier(kasan_mem_notifier, 0);
+
+ return 0;
+}
+
+core_initcall(kasan_memhotplug_init);
+#endif
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index f04aa1e0ba48..44ec228de0a2 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -1,5 +1,5 @@
/*
- * This file contains shadow memory manipulation code.
+ * This file contains core KASAN code.
*
* Copyright (c) 2014 Samsung Electronics Co., Ltd.
* Author: Andrey Ryabinin <[email protected]>
@@ -40,82 +40,6 @@
#include "kasan.h"
#include "../slab.h"

-void kasan_enable_current(void)
-{
- current->kasan_depth++;
-}
-
-void kasan_disable_current(void)
-{
- current->kasan_depth--;
-}
-
-/*
- * Poisons the shadow memory for 'size' bytes starting from 'addr'.
- * Memory addresses should be aligned to KASAN_SHADOW_SCALE_SIZE.
- */
-static void kasan_poison_shadow(const void *address, size_t size, u8 value)
-{
- void *shadow_start, *shadow_end;
-
- shadow_start = kasan_mem_to_shadow(address);
- shadow_end = kasan_mem_to_shadow(address + size);
-
- memset(shadow_start, value, shadow_end - shadow_start);
-}
-
-void kasan_unpoison_shadow(const void *address, size_t size)
-{
- kasan_poison_shadow(address, size, 0);
-
- if (size & KASAN_SHADOW_MASK) {
- u8 *shadow = (u8 *)kasan_mem_to_shadow(address + size);
- *shadow = size & KASAN_SHADOW_MASK;
- }
-}
-
-static void __kasan_unpoison_stack(struct task_struct *task, const void *sp)
-{
- void *base = task_stack_page(task);
- size_t size = sp - base;
-
- kasan_unpoison_shadow(base, size);
-}
-
-/* Unpoison the entire stack for a task. */
-void kasan_unpoison_task_stack(struct task_struct *task)
-{
- __kasan_unpoison_stack(task, task_stack_page(task) + THREAD_SIZE);
-}
-
-/* Unpoison the stack for the current task beyond a watermark sp value. */
-asmlinkage void kasan_unpoison_task_stack_below(const void *watermark)
-{
- /*
- * Calculate the task stack base address. Avoid using 'current'
- * because this function is called by early resume code which hasn't
- * yet set up the percpu register (%gs).
- */
- void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1));
-
- kasan_unpoison_shadow(base, watermark - base);
-}
-
-/*
- * Clear all poison for the region between the current SP and a provided
- * watermark value, as is sometimes required prior to hand-crafted asm function
- * returns in the middle of functions.
- */
-void kasan_unpoison_stack_above_sp_to(const void *watermark)
-{
- const void *sp = __builtin_frame_address(0);
- size_t size = watermark - sp;
-
- if (WARN_ON(sp > watermark))
- return;
- kasan_unpoison_shadow(sp, size);
-}
-
/*
* All functions below always inlined so compiler could
* perform better optimizations in each of __asan_loadX/__assn_storeX
@@ -260,121 +184,12 @@ static __always_inline void check_memory_region_inline(unsigned long addr,
kasan_report(addr, size, write, ret_ip);
}

-static void check_memory_region(unsigned long addr,
- size_t size, bool write,
+void check_memory_region(unsigned long addr, size_t size, bool write,
unsigned long ret_ip)
{
check_memory_region_inline(addr, size, write, ret_ip);
}

-void kasan_check_read(const volatile void *p, unsigned int size)
-{
- check_memory_region((unsigned long)p, size, false, _RET_IP_);
-}
-EXPORT_SYMBOL(kasan_check_read);
-
-void kasan_check_write(const volatile void *p, unsigned int size)
-{
- check_memory_region((unsigned long)p, size, true, _RET_IP_);
-}
-EXPORT_SYMBOL(kasan_check_write);
-
-#undef memset
-void *memset(void *addr, int c, size_t len)
-{
- check_memory_region((unsigned long)addr, len, true, _RET_IP_);
-
- return __memset(addr, c, len);
-}
-
-#undef memmove
-void *memmove(void *dest, const void *src, size_t len)
-{
- check_memory_region((unsigned long)src, len, false, _RET_IP_);
- check_memory_region((unsigned long)dest, len, true, _RET_IP_);
-
- return __memmove(dest, src, len);
-}
-
-#undef memcpy
-void *memcpy(void *dest, const void *src, size_t len)
-{
- check_memory_region((unsigned long)src, len, false, _RET_IP_);
- check_memory_region((unsigned long)dest, len, true, _RET_IP_);
-
- return __memcpy(dest, src, len);
-}
-
-void kasan_alloc_pages(struct page *page, unsigned int order)
-{
- if (likely(!PageHighMem(page)))
- kasan_unpoison_shadow(page_address(page), PAGE_SIZE << order);
-}
-
-void kasan_free_pages(struct page *page, unsigned int order)
-{
- if (likely(!PageHighMem(page)))
- kasan_poison_shadow(page_address(page),
- PAGE_SIZE << order,
- KASAN_FREE_PAGE);
-}
-
-/*
- * Adaptive redzone policy taken from the userspace AddressSanitizer runtime.
- * For larger allocations larger redzones are used.
- */
-static unsigned int optimal_redzone(unsigned int object_size)
-{
- return
- object_size <= 64 - 16 ? 16 :
- object_size <= 128 - 32 ? 32 :
- object_size <= 512 - 64 ? 64 :
- object_size <= 4096 - 128 ? 128 :
- object_size <= (1 << 14) - 256 ? 256 :
- object_size <= (1 << 15) - 512 ? 512 :
- object_size <= (1 << 16) - 1024 ? 1024 : 2048;
-}
-
-void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
- slab_flags_t *flags)
-{
- unsigned int orig_size = *size;
- int redzone_adjust;
-
- /* Add alloc meta. */
- cache->kasan_info.alloc_meta_offset = *size;
- *size += sizeof(struct kasan_alloc_meta);
-
- /* Add free meta. */
- if (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
- cache->object_size < sizeof(struct kasan_free_meta)) {
- cache->kasan_info.free_meta_offset = *size;
- *size += sizeof(struct kasan_free_meta);
- }
- redzone_adjust = optimal_redzone(cache->object_size) -
- (*size - cache->object_size);
-
- if (redzone_adjust > 0)
- *size += redzone_adjust;
-
- *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
- max(*size, cache->object_size +
- optimal_redzone(cache->object_size)));
-
- /*
- * If the metadata doesn't fit, don't enable KASAN at all.
- */
- if (*size <= cache->kasan_info.alloc_meta_offset ||
- *size <= cache->kasan_info.free_meta_offset) {
- cache->kasan_info.alloc_meta_offset = 0;
- cache->kasan_info.free_meta_offset = 0;
- *size = orig_size;
- return;
- }
-
- *flags |= SLAB_KASAN;
-}
-
void kasan_cache_shrink(struct kmem_cache *cache)
{
quarantine_remove_cache(cache);
@@ -386,274 +201,6 @@ void kasan_cache_shutdown(struct kmem_cache *cache)
quarantine_remove_cache(cache);
}

-size_t kasan_metadata_size(struct kmem_cache *cache)
-{
- return (cache->kasan_info.alloc_meta_offset ?
- sizeof(struct kasan_alloc_meta) : 0) +
- (cache->kasan_info.free_meta_offset ?
- sizeof(struct kasan_free_meta) : 0);
-}
-
-void kasan_poison_slab(struct page *page)
-{
- kasan_poison_shadow(page_address(page),
- PAGE_SIZE << compound_order(page),
- KASAN_KMALLOC_REDZONE);
-}
-
-void kasan_unpoison_object_data(struct kmem_cache *cache, void *object)
-{
- kasan_unpoison_shadow(object, cache->object_size);
-}
-
-void kasan_poison_object_data(struct kmem_cache *cache, void *object)
-{
- kasan_poison_shadow(object,
- round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
- KASAN_KMALLOC_REDZONE);
-}
-
-static inline int in_irqentry_text(unsigned long ptr)
-{
- return (ptr >= (unsigned long)&__irqentry_text_start &&
- ptr < (unsigned long)&__irqentry_text_end) ||
- (ptr >= (unsigned long)&__softirqentry_text_start &&
- ptr < (unsigned long)&__softirqentry_text_end);
-}
-
-static inline void filter_irq_stacks(struct stack_trace *trace)
-{
- int i;
-
- if (!trace->nr_entries)
- return;
- for (i = 0; i < trace->nr_entries; i++)
- if (in_irqentry_text(trace->entries[i])) {
- /* Include the irqentry function into the stack. */
- trace->nr_entries = i + 1;
- break;
- }
-}
-
-static inline depot_stack_handle_t save_stack(gfp_t flags)
-{
- unsigned long entries[KASAN_STACK_DEPTH];
- struct stack_trace trace = {
- .nr_entries = 0,
- .entries = entries,
- .max_entries = KASAN_STACK_DEPTH,
- .skip = 0
- };
-
- save_stack_trace(&trace);
- filter_irq_stacks(&trace);
- if (trace.nr_entries != 0 &&
- trace.entries[trace.nr_entries-1] == ULONG_MAX)
- trace.nr_entries--;
-
- return depot_save_stack(&trace, flags);
-}
-
-static inline void set_track(struct kasan_track *track, gfp_t flags)
-{
- track->pid = current->pid;
- track->stack = save_stack(flags);
-}
-
-struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
- const void *object)
-{
- BUILD_BUG_ON(sizeof(struct kasan_alloc_meta) > 32);
- return (void *)object + cache->kasan_info.alloc_meta_offset;
-}
-
-struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
- const void *object)
-{
- BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
- return (void *)object + cache->kasan_info.free_meta_offset;
-}
-
-void kasan_init_slab_obj(struct kmem_cache *cache, const void *object)
-{
- struct kasan_alloc_meta *alloc_info;
-
- if (!(cache->flags & SLAB_KASAN))
- return;
-
- alloc_info = get_alloc_info(cache, object);
- __memset(alloc_info, 0, sizeof(*alloc_info));
-}
-
-void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
-{
- return kasan_kmalloc(cache, object, cache->object_size, flags);
-}
-
-static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
- unsigned long ip, bool quarantine)
-{
- s8 shadow_byte;
- unsigned long rounded_up_size;
-
- if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
- object)) {
- kasan_report_invalid_free(object, ip);
- return true;
- }
-
- /* RCU slabs could be legally used after free within the RCU period */
- if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
- return false;
-
- shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(object));
- if (shadow_byte < 0 || shadow_byte >= KASAN_SHADOW_SCALE_SIZE) {
- kasan_report_invalid_free(object, ip);
- return true;
- }
-
- rounded_up_size = round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE);
- kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
-
- if (!quarantine || unlikely(!(cache->flags & SLAB_KASAN)))
- return false;
-
- set_track(&get_alloc_info(cache, object)->free_track, GFP_NOWAIT);
- quarantine_put(get_free_info(cache, object), cache);
- return true;
-}
-
-bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip)
-{
- return __kasan_slab_free(cache, object, ip, true);
-}
-
-void *kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
- gfp_t flags)
-{
- unsigned long redzone_start;
- unsigned long redzone_end;
-
- if (gfpflags_allow_blocking(flags))
- quarantine_reduce();
-
- if (unlikely(object == NULL))
- return NULL;
-
- redzone_start = round_up((unsigned long)(object + size),
- KASAN_SHADOW_SCALE_SIZE);
- redzone_end = round_up((unsigned long)object + cache->object_size,
- KASAN_SHADOW_SCALE_SIZE);
-
- kasan_unpoison_shadow(object, size);
- kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
- KASAN_KMALLOC_REDZONE);
-
- if (cache->flags & SLAB_KASAN)
- set_track(&get_alloc_info(cache, object)->alloc_track, flags);
-
- return (void *)object;
-}
-EXPORT_SYMBOL(kasan_kmalloc);
-
-void *kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
-{
- struct page *page;
- unsigned long redzone_start;
- unsigned long redzone_end;
-
- if (gfpflags_allow_blocking(flags))
- quarantine_reduce();
-
- if (unlikely(ptr == NULL))
- return NULL;
-
- page = virt_to_page(ptr);
- redzone_start = round_up((unsigned long)(ptr + size),
- KASAN_SHADOW_SCALE_SIZE);
- redzone_end = (unsigned long)ptr + (PAGE_SIZE << compound_order(page));
-
- kasan_unpoison_shadow(ptr, size);
- kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
- KASAN_PAGE_REDZONE);
-
- return (void *)ptr;
-}
-
-void *kasan_krealloc(const void *object, size_t size, gfp_t flags)
-{
- struct page *page;
-
- if (unlikely(object == ZERO_SIZE_PTR))
- return ZERO_SIZE_PTR;
-
- page = virt_to_head_page(object);
-
- if (unlikely(!PageSlab(page)))
- return kasan_kmalloc_large(object, size, flags);
- else
- return kasan_kmalloc(page->slab_cache, object, size, flags);
-}
-
-void kasan_poison_kfree(void *ptr, unsigned long ip)
-{
- struct page *page;
-
- page = virt_to_head_page(ptr);
-
- if (unlikely(!PageSlab(page))) {
- if (ptr != page_address(page)) {
- kasan_report_invalid_free(ptr, ip);
- return;
- }
- kasan_poison_shadow(ptr, PAGE_SIZE << compound_order(page),
- KASAN_FREE_PAGE);
- } else {
- __kasan_slab_free(page->slab_cache, ptr, ip, false);
- }
-}
-
-void kasan_kfree_large(void *ptr, unsigned long ip)
-{
- if (ptr != page_address(virt_to_head_page(ptr)))
- kasan_report_invalid_free(ptr, ip);
- /* The object will be poisoned by page_alloc. */
-}
-
-int kasan_module_alloc(void *addr, size_t size)
-{
- void *ret;
- size_t shadow_size;
- unsigned long shadow_start;
-
- shadow_start = (unsigned long)kasan_mem_to_shadow(addr);
- shadow_size = round_up(size >> KASAN_SHADOW_SCALE_SHIFT,
- PAGE_SIZE);
-
- if (WARN_ON(!PAGE_ALIGNED(shadow_start)))
- return -EINVAL;
-
- ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
- shadow_start + shadow_size,
- GFP_KERNEL | __GFP_ZERO,
- PAGE_KERNEL, VM_NO_GUARD, NUMA_NO_NODE,
- __builtin_return_address(0));
-
- if (ret) {
- find_vm_area(addr)->flags |= VM_KASAN;
- kmemleak_ignore(ret);
- return 0;
- }
-
- return -ENOMEM;
-}
-
-void kasan_free_shadow(const struct vm_struct *vm)
-{
- if (vm->flags & VM_KASAN)
- vfree(kasan_mem_to_shadow(vm->addr));
-}
-
static void register_global(struct kasan_global *global)
{
size_t aligned_size = round_up(global->size, KASAN_SHADOW_SCALE_SIZE);
@@ -794,113 +341,3 @@ DEFINE_ASAN_SET_SHADOW(f2);
DEFINE_ASAN_SET_SHADOW(f3);
DEFINE_ASAN_SET_SHADOW(f5);
DEFINE_ASAN_SET_SHADOW(f8);
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-static bool shadow_mapped(unsigned long addr)
-{
- pgd_t *pgd = pgd_offset_k(addr);
- p4d_t *p4d;
- pud_t *pud;
- pmd_t *pmd;
- pte_t *pte;
-
- if (pgd_none(*pgd))
- return false;
- p4d = p4d_offset(pgd, addr);
- if (p4d_none(*p4d))
- return false;
- pud = pud_offset(p4d, addr);
- if (pud_none(*pud))
- return false;
-
- /*
- * We can't use pud_large() or pud_huge(), the first one is
- * arch-specific, the last one depends on HUGETLB_PAGE. So let's abuse
- * pud_bad(), if pud is bad then it's bad because it's huge.
- */
- if (pud_bad(*pud))
- return true;
- pmd = pmd_offset(pud, addr);
- if (pmd_none(*pmd))
- return false;
-
- if (pmd_bad(*pmd))
- return true;
- pte = pte_offset_kernel(pmd, addr);
- return !pte_none(*pte);
-}
-
-static int __meminit kasan_mem_notifier(struct notifier_block *nb,
- unsigned long action, void *data)
-{
- struct memory_notify *mem_data = data;
- unsigned long nr_shadow_pages, start_kaddr, shadow_start;
- unsigned long shadow_end, shadow_size;
-
- nr_shadow_pages = mem_data->nr_pages >> KASAN_SHADOW_SCALE_SHIFT;
- start_kaddr = (unsigned long)pfn_to_kaddr(mem_data->start_pfn);
- shadow_start = (unsigned long)kasan_mem_to_shadow((void *)start_kaddr);
- shadow_size = nr_shadow_pages << PAGE_SHIFT;
- shadow_end = shadow_start + shadow_size;
-
- if (WARN_ON(mem_data->nr_pages % KASAN_SHADOW_SCALE_SIZE) ||
- WARN_ON(start_kaddr % (KASAN_SHADOW_SCALE_SIZE << PAGE_SHIFT)))
- return NOTIFY_BAD;
-
- switch (action) {
- case MEM_GOING_ONLINE: {
- void *ret;
-
- /*
- * If shadow is mapped already than it must have been mapped
- * during the boot. This could happen if we onlining previously
- * offlined memory.
- */
- if (shadow_mapped(shadow_start))
- return NOTIFY_OK;
-
- ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
- shadow_end, GFP_KERNEL,
- PAGE_KERNEL, VM_NO_GUARD,
- pfn_to_nid(mem_data->start_pfn),
- __builtin_return_address(0));
- if (!ret)
- return NOTIFY_BAD;
-
- kmemleak_ignore(ret);
- return NOTIFY_OK;
- }
- case MEM_CANCEL_ONLINE:
- case MEM_OFFLINE: {
- struct vm_struct *vm;
-
- /*
- * shadow_start was either mapped during boot by kasan_init()
- * or during memory online by __vmalloc_node_range().
- * In the latter case we can use vfree() to free shadow.
- * Non-NULL result of the find_vm_area() will tell us if
- * that was the second case.
- *
- * Currently it's not possible to free shadow mapped
- * during boot by kasan_init(). It's because the code
- * to do that hasn't been written yet. So we'll just
- * leak the memory.
- */
- vm = find_vm_area((void *)shadow_start);
- if (vm)
- vfree((void *)shadow_start);
- }
- }
-
- return NOTIFY_OK;
-}
-
-static int __init kasan_memhotplug_init(void)
-{
- hotplug_memory_notifier(kasan_mem_notifier, 0);
-
- return 0;
-}
-
-core_initcall(kasan_memhotplug_init);
-#endif
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index c12dcfde2ebd..659463800f10 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -105,6 +105,11 @@ static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
<< KASAN_SHADOW_SCALE_SHIFT);
}

+void kasan_poison_shadow(const void *address, size_t size, u8 value);
+
+void check_memory_region(unsigned long addr, size_t size, bool write,
+ unsigned long ret_ip);
+
void kasan_report(unsigned long addr, size_t size,
bool is_write, unsigned long ip);
void kasan_report_invalid_free(void *object, unsigned long ip);
--
2.18.0.rc2.346.g013aa6912e-goog
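The shift-and-offset shadow translation used by both modes (scale 1:8 for
classic KASAN, 1:16 for KHWASAN) can be sketched as below; the offset value
used in the test is a placeholder, not a real per-architecture
KASAN_SHADOW_OFFSET:

```c
#include <assert.h>
#include <stdint.h>

/* Scale shifts: classic KASAN maps 8 bytes of memory per shadow byte
 * (shift 3), KHWASAN maps 16 bytes per shadow byte (shift 4). */
#define KASAN_SHADOW_SCALE_SHIFT   3
#define KHWASAN_SHADOW_SCALE_SHIFT 4

/* Direct shift-and-offset mapping: shadow = (addr >> shift) + offset. */
static uintptr_t mem_to_shadow(uintptr_t addr, uintptr_t offset, int shift)
{
	return (addr >> shift) + offset;
}
```

The same 0x1000-byte region thus needs half as much shadow under KHWASAN
as under classic KASAN.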


2018-06-26 14:13:36

by Andrey Konovalov

[permalink] [raw]
Subject: [PATCH v4 16/17] khwasan: update kasan documentation

This patch updates KASAN documentation to reflect the addition of KHWASAN.

Signed-off-by: Andrey Konovalov <[email protected]>
---
Documentation/dev-tools/kasan.rst | 213 +++++++++++++++++-------------
1 file changed, 123 insertions(+), 90 deletions(-)

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index aabc8738b3d8..842d95af74d3 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -8,11 +8,19 @@ KernelAddressSANitizer (KASAN) is a dynamic memory error detector. It provides
a fast and comprehensive solution for finding use-after-free and out-of-bounds
bugs.

-KASAN uses compile-time instrumentation for checking every memory access,
-therefore you will need a GCC version 4.9.2 or later. GCC 5.0 or later is
-required for detection of out-of-bounds accesses to stack or global variables.
+KASAN has two modes: classic KASAN (a classic version, similar to user space
+ASan) and KHWASAN (a version based on memory tagging, similar to user space
+HWASan).

-Currently KASAN is supported only for the x86_64 and arm64 architectures.
+KASAN uses compile-time instrumentation to insert validity checks before every
+memory access, and therefore requires a compiler version that supports that.
+For classic KASAN you need GCC version 4.9.2 or later. GCC 5.0 or later is
+required for detection of out-of-bounds accesses on stack and global variables.
+KHWASAN in turn is only supported by clang and requires revision 330044 or
+later.
+
+Currently classic KASAN is supported for the x86_64, arm64 and xtensa
+architectures, and KHWASAN is supported only for arm64.

Usage
-----
@@ -21,12 +29,14 @@ To enable KASAN configure kernel with::

CONFIG_KASAN = y

-and choose between CONFIG_KASAN_OUTLINE and CONFIG_KASAN_INLINE. Outline and
-inline are compiler instrumentation types. The former produces smaller binary
-the latter is 1.1 - 2 times faster. Inline instrumentation requires a GCC
+and choose between CONFIG_KASAN_GENERIC (to enable classic KASAN) and
+CONFIG_KASAN_HW (to enable KHWASAN). You also need to choose between
+CONFIG_KASAN_OUTLINE and CONFIG_KASAN_INLINE. Outline and inline are compiler
+instrumentation types. The former produces a smaller binary while the latter
+is 1.1 - 2 times faster. For classic KASAN, inline instrumentation requires GCC
version 5.0 or later.

-KASAN works with both SLUB and SLAB memory allocators.
+Both KASAN modes work with both SLUB and SLAB memory allocators.
For better bug detection and nicer reporting, enable CONFIG_STACKTRACE.

To disable instrumentation for specific files or directories, add a line
@@ -43,85 +53,80 @@ similar to the following to the respective kernel Makefile:
Error reports
~~~~~~~~~~~~~

-A typical out of bounds access report looks like this::
+A typical out-of-bounds access classic KASAN report looks like this::

==================================================================
- BUG: AddressSanitizer: out of bounds access in kmalloc_oob_right+0x65/0x75 [test_kasan] at addr ffff8800693bc5d3
- Write of size 1 by task modprobe/1689
- =============================================================================
- BUG kmalloc-128 (Not tainted): kasan error
- -----------------------------------------------------------------------------
-
- Disabling lock debugging due to kernel taint
- INFO: Allocated in kmalloc_oob_right+0x3d/0x75 [test_kasan] age=0 cpu=0 pid=1689
- __slab_alloc+0x4b4/0x4f0
- kmem_cache_alloc_trace+0x10b/0x190
- kmalloc_oob_right+0x3d/0x75 [test_kasan]
- init_module+0x9/0x47 [test_kasan]
- do_one_initcall+0x99/0x200
- load_module+0x2cb3/0x3b20
- SyS_finit_module+0x76/0x80
- system_call_fastpath+0x12/0x17
- INFO: Slab 0xffffea0001a4ef00 objects=17 used=7 fp=0xffff8800693bd728 flags=0x100000000004080
- INFO: Object 0xffff8800693bc558 @offset=1368 fp=0xffff8800693bc720
-
- Bytes b4 ffff8800693bc548: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
- Object ffff8800693bc558: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc568: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc578: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc588: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc598: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc5a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc5b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
- Object ffff8800693bc5c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
- Redzone ffff8800693bc5d8: cc cc cc cc cc cc cc cc ........
- Padding ffff8800693bc718: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
- CPU: 0 PID: 1689 Comm: modprobe Tainted: G B 3.18.0-rc1-mm1+ #98
- Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
- ffff8800693bc000 0000000000000000 ffff8800693bc558 ffff88006923bb78
- ffffffff81cc68ae 00000000000000f3 ffff88006d407600 ffff88006923bba8
- ffffffff811fd848 ffff88006d407600 ffffea0001a4ef00 ffff8800693bc558
+ BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa8/0xbc [test_kasan]
+ Write of size 1 at addr ffff8800696f3d3b by task insmod/2734
+
+ CPU: 0 PID: 2734 Comm: insmod Not tainted 4.15.0+ #98
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
- [<ffffffff81cc68ae>] dump_stack+0x46/0x58
- [<ffffffff811fd848>] print_trailer+0xf8/0x160
- [<ffffffffa00026a7>] ? kmem_cache_oob+0xc3/0xc3 [test_kasan]
- [<ffffffff811ff0f5>] object_err+0x35/0x40
- [<ffffffffa0002065>] ? kmalloc_oob_right+0x65/0x75 [test_kasan]
- [<ffffffff8120b9fa>] kasan_report_error+0x38a/0x3f0
- [<ffffffff8120a79f>] ? kasan_poison_shadow+0x2f/0x40
- [<ffffffff8120b344>] ? kasan_unpoison_shadow+0x14/0x40
- [<ffffffff8120a79f>] ? kasan_poison_shadow+0x2f/0x40
- [<ffffffffa00026a7>] ? kmem_cache_oob+0xc3/0xc3 [test_kasan]
- [<ffffffff8120a995>] __asan_store1+0x75/0xb0
- [<ffffffffa0002601>] ? kmem_cache_oob+0x1d/0xc3 [test_kasan]
- [<ffffffffa0002065>] ? kmalloc_oob_right+0x65/0x75 [test_kasan]
- [<ffffffffa0002065>] kmalloc_oob_right+0x65/0x75 [test_kasan]
- [<ffffffffa00026b0>] init_module+0x9/0x47 [test_kasan]
- [<ffffffff810002d9>] do_one_initcall+0x99/0x200
- [<ffffffff811e4e5c>] ? __vunmap+0xec/0x160
- [<ffffffff81114f63>] load_module+0x2cb3/0x3b20
- [<ffffffff8110fd70>] ? m_show+0x240/0x240
- [<ffffffff81115f06>] SyS_finit_module+0x76/0x80
- [<ffffffff81cd3129>] system_call_fastpath+0x12/0x17
+ __dump_stack lib/dump_stack.c:17
+ dump_stack+0x83/0xbc lib/dump_stack.c:53
+ print_address_description+0x73/0x280 mm/kasan/report.c:254
+ kasan_report_error mm/kasan/report.c:352
+ kasan_report+0x10e/0x220 mm/kasan/report.c:410
+ __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:505
+ kmalloc_oob_right+0xa8/0xbc [test_kasan] lib/test_kasan.c:42
+ kmalloc_tests_init+0x16/0x769 [test_kasan]
+ do_one_initcall+0x9e/0x240 init/main.c:832
+ do_init_module+0x1b6/0x542 kernel/module.c:3462
+ load_module+0x6042/0x9030 kernel/module.c:3786
+ SYSC_init_module+0x18f/0x1c0 kernel/module.c:3858
+ SyS_init_module+0x9/0x10 kernel/module.c:3841
+ do_syscall_64+0x198/0x480 arch/x86/entry/common.c:287
+ entry_SYSCALL_64_after_hwframe+0x21/0x86 arch/x86/entry/entry_64.S:251
+ RIP: 0033:0x7fdd79df99da
+ RSP: 002b:00007fff2229bdf8 EFLAGS: 00000202 ORIG_RAX: 00000000000000af
+ RAX: ffffffffffffffda RBX: 000055c408121190 RCX: 00007fdd79df99da
+ RDX: 00007fdd7a0b8f88 RSI: 0000000000055670 RDI: 00007fdd7a47e000
+ RBP: 000055c4081200b0 R08: 0000000000000003 R09: 0000000000000000
+ R10: 00007fdd79df5d0a R11: 0000000000000202 R12: 00007fdd7a0b8f88
+ R13: 000055c408120090 R14: 0000000000000000 R15: 0000000000000000
+
+ Allocated by task 2734:
+ save_stack+0x43/0xd0 mm/kasan/common.c:176
+ set_track+0x20/0x30 mm/kasan/common.c:188
+ kasan_kmalloc+0x9a/0xc0 mm/kasan/kasan.c:372
+ kmem_cache_alloc_trace+0xcd/0x1a0 mm/slub.c:2761
+ kmalloc ./include/linux/slab.h:512
+ kmalloc_oob_right+0x56/0xbc [test_kasan] lib/test_kasan.c:36
+ kmalloc_tests_init+0x16/0x769 [test_kasan]
+ do_one_initcall+0x9e/0x240 init/main.c:832
+ do_init_module+0x1b6/0x542 kernel/module.c:3462
+ load_module+0x6042/0x9030 kernel/module.c:3786
+ SYSC_init_module+0x18f/0x1c0 kernel/module.c:3858
+ SyS_init_module+0x9/0x10 kernel/module.c:3841
+ do_syscall_64+0x198/0x480 arch/x86/entry/common.c:287
+ entry_SYSCALL_64_after_hwframe+0x21/0x86 arch/x86/entry/entry_64.S:251
+
+ The buggy address belongs to the object at ffff8800696f3cc0
+ which belongs to the cache kmalloc-128 of size 128
+ The buggy address is located 123 bytes inside of
+ 128-byte region [ffff8800696f3cc0, ffff8800696f3d40)
+ The buggy address belongs to the page:
+ page:ffffea0001a5bcc0 count:1 mapcount:0 mapping: (null) index:0x0
+ flags: 0x100000000000100(slab)
+ raw: 0100000000000100 0000000000000000 0000000000000000 0000000180150015
+ raw: ffffea0001a8ce40 0000000300000003 ffff88006d001640 0000000000000000
+ page dumped because: kasan: bad access detected
+
Memory state around the buggy address:
- ffff8800693bc300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
- ffff8800693bc380: fc fc 00 00 00 00 00 00 00 00 00 00 00 00 00 fc
- ffff8800693bc400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
- ffff8800693bc480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
- ffff8800693bc500: fc fc fc fc fc fc fc fc fc fc fc 00 00 00 00 00
- >ffff8800693bc580: 00 00 00 00 00 00 00 00 00 00 03 fc fc fc fc fc
- ^
- ffff8800693bc600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
- ffff8800693bc680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
- ffff8800693bc700: fc fc fc fc fb fb fb fb fb fb fb fb fb fb fb fb
- ffff8800693bc780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
- ffff8800693bc800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
+ ffff8800696f3c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc
+ ffff8800696f3c80: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
+ >ffff8800696f3d00: 00 00 00 00 00 00 00 03 fc fc fc fc fc fc fc fc
+ ^
+ ffff8800696f3d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
+ ffff8800696f3e00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
==================================================================

-The header of the report discribe what kind of bug happened and what kind of
-access caused it. It's followed by the description of the accessed slub object
-(see 'SLUB Debug output' section in Documentation/vm/slub.rst for details) and
-the description of the accessed memory page.
+The header of the report provides a short summary of what kind of bug happened
+and what kind of access caused it. It's followed by a stack trace of the bad
+access, a stack trace of where the accessed memory was allocated (in case the
+bad access happens on a slab object), and a stack trace of where the object was
+freed (in case of a use-after-free bug report). Next comes a description of
+the accessed slab object and information about the accessed memory page.

In the last section the report shows memory state around the accessed address.
Reading this part requires some understanding of how KASAN works.
@@ -138,18 +143,24 @@ inaccessible memory like redzones or freed memory (see mm/kasan/kasan.h).
In the report above the arrows point to the shadow byte 03, which means that
the accessed address is partially accessible.

+For KHWASAN this last report section shows the memory tags around the accessed
+address (see Implementation details section).
+

Implementation details
----------------------

+Classic KASAN
+~~~~~~~~~~~~~
+
From a high level, our approach to memory error detection is similar to that
of kmemcheck: use shadow memory to record whether each byte of memory is safe
-to access, and use compile-time instrumentation to check shadow memory on each
-memory access.
+to access, and use compile-time instrumentation to insert checks of shadow
+memory on each memory access.

-AddressSanitizer dedicates 1/8 of kernel memory to its shadow memory
-(e.g. 16TB to cover 128TB on x86_64) and uses direct mapping with a scale and
-offset to translate a memory address to its corresponding shadow address.
+Classic KASAN dedicates 1/8th of kernel memory to its shadow memory (e.g. 16TB
+to cover 128TB on x86_64) and uses direct mapping with a scale and offset to
+translate a memory address to its corresponding shadow address.

Here is the function which translates an address to its corresponding shadow
address::
@@ -162,12 +173,34 @@ address::

where ``KASAN_SHADOW_SCALE_SHIFT = 3``.

-Compile-time instrumentation used for checking memory accesses. Compiler inserts
-function calls (__asan_load*(addr), __asan_store*(addr)) before each memory
-access of size 1, 2, 4, 8 or 16. These functions check whether memory access is
-valid or not by checking corresponding shadow memory.
+Compile-time instrumentation is used to insert memory access checks. Compiler
+inserts function calls (__asan_load*(addr), __asan_store*(addr)) before each
+memory access of size 1, 2, 4, 8 or 16. These functions check whether memory
+access is valid or not by checking corresponding shadow memory.

GCC 5.0 has possibility to perform inline instrumentation. Instead of making
function calls GCC directly inserts the code to check the shadow memory.
This option significantly enlarges kernel but it gives x1.1-x2 performance
boost over outline instrumented kernel.
+
+KHWASAN
+~~~~~~~
+
+KHWASAN uses the Top Byte Ignore (TBI) feature of modern arm64 CPUs to store
+a pointer tag in the top byte of kernel pointers. KHWASAN also uses shadow
+memory to store memory tags associated with each 16-byte memory cell (therefore
+it dedicates 1/16th of the kernel memory for shadow memory).
+
+On each memory allocation KHWASAN generates a random tag, tags allocated memory
+with this tag, and embeds this tag into the returned pointer. KHWASAN uses
+compile-time instrumentation to insert checks before each memory access. These
+checks make sure that the tag of the memory that is being accessed is equal to
+the tag of the pointer that is used to access this memory. In case of a tag
+mismatch KHWASAN prints a bug report.
+
+KHWASAN also has two instrumentation modes (outline, which emits callbacks to
+check memory accesses, and inline, which performs the shadow memory checks
+inline). With the outline instrumentation mode, a bug report is simply printed
+from the function that performs the access check. With inline instrumentation,
+a brk instruction is emitted by the compiler, and a dedicated brk handler is
+used to print KHWASAN reports.
--
2.18.0.rc2.346.g013aa6912e-goog
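The tag scheme this documentation describes (a pointer tag kept in the
TBI-ignored top byte, compared against a per-granule shadow tag) can be
sketched as below; these helpers are illustrative, not the kernel's actual
API:

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 56	/* top byte of a 64-bit pointer, ignored by TBI */

/* Extract the tag from the top byte of a pointer value. */
static uint8_t get_tag(uint64_t ptr)
{
	return (uint8_t)(ptr >> TAG_SHIFT);
}

/* Replace the top byte of a pointer value with the given tag. */
static uint64_t set_tag(uint64_t ptr, uint8_t tag)
{
	return (ptr & ~(0xffULL << TAG_SHIFT)) | ((uint64_t)tag << TAG_SHIFT);
}

/* A KHWASAN-style check: the access is valid iff the pointer tag
 * matches the tag stored in shadow for the accessed granule. */
static int tags_match(uint64_t ptr, uint8_t shadow_tag)
{
	return get_tag(ptr) == shadow_tag;
}
```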


2018-06-26 14:16:41

by Andrey Konovalov

[permalink] [raw]
Subject: [PATCH v4 14/17] khwasan, arm64: add brk handler for inline instrumentation

KHWASAN inline instrumentation mode (which embeds checks of shadow memory
into the generated code, instead of inserting a callback) generates a brk
instruction when a tag mismatch is detected.

This commit adds a KHWASAN brk handler that decodes the immediate value
passed to the brk instruction (to extract information about the memory
access that triggered the mismatch), reads the register values (x0 contains
the guilty address), and reports the bug.
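The immediate decoding described here can be sketched as below; the
constants match those introduced in the hunk, while the decode helper and
struct are illustrative:

```c
#include <assert.h>
#include <stddef.h>

/* Encoding of the brk immediate / ESR bits, as defined in the patch. */
#define KHWASAN_ESR_RECOVER   0x20
#define KHWASAN_ESR_WRITE     0x10
#define KHWASAN_ESR_SIZE_MASK 0x0f
#define KHWASAN_ESR_SIZE(esr) (1 << ((esr) & KHWASAN_ESR_SIZE_MASK))

struct khwasan_fault {
	int recover;	/* may execution continue past the fault? */
	int write;	/* was the faulting access a write? */
	size_t size;	/* access size in bytes: 1 << (esr & 0xf) */
};

static struct khwasan_fault decode_khwasan_esr(unsigned int esr)
{
	struct khwasan_fault f;

	f.recover = !!(esr & KHWASAN_ESR_RECOVER);
	f.write = !!(esr & KHWASAN_ESR_WRITE);
	f.size = KHWASAN_ESR_SIZE(esr);
	return f;
}
```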

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/include/asm/brk-imm.h | 2 +
arch/arm64/kernel/traps.c | 69 +++++++++++++++++++++++++++++++-
2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/brk-imm.h b/arch/arm64/include/asm/brk-imm.h
index ed693c5bcec0..e4a7013321dc 100644
--- a/arch/arm64/include/asm/brk-imm.h
+++ b/arch/arm64/include/asm/brk-imm.h
@@ -16,10 +16,12 @@
* 0x400: for dynamic BRK instruction
* 0x401: for compile time BRK instruction
* 0x800: kernel-mode BUG() and WARN() traps
+ * 0x9xx: KHWASAN trap (allowed values 0x900 - 0x9ff)
*/
#define FAULT_BRK_IMM 0x100
#define KGDB_DYN_DBG_BRK_IMM 0x400
#define KGDB_COMPILED_DBG_BRK_IMM 0x401
#define BUG_BRK_IMM 0x800
+#define KHWASAN_BRK_IMM 0x900

#endif
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index d399d459397b..95152a4fd202 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -35,6 +35,7 @@
#include <linux/sizes.h>
#include <linux/syscalls.h>
#include <linux/mm_types.h>
+#include <linux/kasan.h>

#include <asm/atomic.h>
#include <asm/bug.h>
@@ -269,10 +270,14 @@ void arm64_notify_die(const char *str, struct pt_regs *regs,
}
}

-void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
+void __arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
{
regs->pc += size;
+}

+void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
+{
+ __arm64_skip_faulting_instruction(regs, size);
/*
* If we were single stepping, we want to get the step exception after
* we return from the trap.
@@ -791,7 +796,7 @@ static int bug_handler(struct pt_regs *regs, unsigned int esr)
}

/* If thread survives, skip over the BUG instruction and continue: */
- arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
+ __arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
return DBG_HOOK_HANDLED;
}

@@ -801,6 +806,59 @@ static struct break_hook bug_break_hook = {
.fn = bug_handler,
};

+#ifdef CONFIG_KASAN_HW
+
+#define KHWASAN_ESR_RECOVER 0x20
+#define KHWASAN_ESR_WRITE 0x10
+#define KHWASAN_ESR_SIZE_MASK 0x0f
+#define KHWASAN_ESR_SIZE(esr) (1 << ((esr) & KHWASAN_ESR_SIZE_MASK))
+
+static int khwasan_handler(struct pt_regs *regs, unsigned int esr)
+{
+ bool recover = esr & KHWASAN_ESR_RECOVER;
+ bool write = esr & KHWASAN_ESR_WRITE;
+ size_t size = KHWASAN_ESR_SIZE(esr);
+ u64 addr = regs->regs[0];
+ u64 pc = regs->pc;
+
+ if (user_mode(regs))
+ return DBG_HOOK_ERROR;
+
+ kasan_report(addr, size, write, pc);
+
+ /*
+ * The instrumentation allows controlling whether we can proceed after
+ * a crash was detected. This is done by passing the -recover flag to
+ * the compiler. Disabling recovery allows generating more compact
+ * code.
+ *
+ * Unfortunately disabling recovery doesn't work for the kernel right
+ * now. KHWASAN reporting is disabled in some contexts (for example when
+ * the allocator accesses slab object metadata; same is true for KASAN;
+ * this is controlled by current->kasan_depth). All these accesses are
+ * detected by the tool, even though the reports for them are not
+ * printed.
+ *
+ * This is something that might be fixed at some point in the future.
+ */
+ if (!recover)
+ die("Oops - KHWASAN", regs, 0);
+
+ /* If thread survives, skip over the brk instruction and continue: */
+ __arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
+ return DBG_HOOK_HANDLED;
+}
+
+#define KHWASAN_ESR_VAL (0xf2000000 | KHWASAN_BRK_IMM)
+#define KHWASAN_ESR_MASK 0xffffff00
+
+static struct break_hook khwasan_break_hook = {
+ .esr_val = KHWASAN_ESR_VAL,
+ .esr_mask = KHWASAN_ESR_MASK,
+ .fn = khwasan_handler,
+};
+#endif
+
/*
* Initial handler for AArch64 BRK exceptions
* This handler only used until debug_traps_init().
@@ -808,6 +866,10 @@ static struct break_hook bug_break_hook = {
int __init early_brk64(unsigned long addr, unsigned int esr,
struct pt_regs *regs)
{
+#ifdef CONFIG_KASAN_HW
+ if ((esr & KHWASAN_ESR_MASK) == KHWASAN_ESR_VAL)
+ return khwasan_handler(regs, esr) != DBG_HOOK_HANDLED;
+#endif
return bug_handler(regs, esr) != DBG_HOOK_HANDLED;
}

@@ -815,4 +877,7 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
void __init trap_init(void)
{
register_break_hook(&bug_break_hook);
+#ifdef CONFIG_KASAN_HW
+ register_break_hook(&khwasan_break_hook);
+#endif
}
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 14:16:46

by Andrey Konovalov

[permalink] [raw]
Subject: [PATCH v4 13/17] khwasan: add hooks implementation

This commit adds the KHWASAN-specific hook implementations and adjusts the
common KASAN and KHWASAN ones.

1. When a new slab cache is created, KHWASAN rounds up the size of the
objects in this cache to KASAN_SHADOW_SCALE_SIZE (== 16).

2. On each kmalloc KHWASAN generates a random tag, sets the shadow memory
that corresponds to this object to this tag, and embeds this tag value
into the top byte of the returned pointer.

3. On each kfree KHWASAN poisons the shadow memory with a random tag to
allow detection of use-after-free bugs.

The rest of the hook implementation logic is very similar to the one
provided by KASAN. KHWASAN saves allocation and free stack metadata to the
slab object the same way KASAN does.
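The allocation-side steps above can be sketched with a toy shadow array;
the granule size matches KASAN_SHADOW_SCALE_SIZE, everything else here is
illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16	/* KASAN_SHADOW_SCALE_SIZE: one shadow byte per 16 bytes */

/* Toy shadow memory covering a small region starting at offset 0. */
static uint8_t shadow[64];

/* Tag an allocation of `size` bytes at granule-aligned offset `off`:
 * round the size up to the granule, store the tag in each covered
 * shadow byte, and embed the same tag in the returned pointer value. */
static uint64_t tag_alloc(uint64_t off, size_t size, uint8_t tag)
{
	size_t rounded = (size + GRANULE - 1) / GRANULE * GRANULE;

	memset(&shadow[off / GRANULE], tag, rounded / GRANULE);
	return off | ((uint64_t)tag << 56);
}
```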

Signed-off-by: Andrey Konovalov <[email protected]>
---
mm/kasan/common.c | 83 +++++++++++++++++++++++++++++++++++-----------
mm/kasan/kasan.h | 8 +++++
mm/kasan/khwasan.c | 40 ++++++++++++++++++++++
3 files changed, 111 insertions(+), 20 deletions(-)

diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 656baa8984c7..1e96ca050c75 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -140,6 +140,9 @@ void kasan_poison_shadow(const void *address, size_t size, u8 value)
{
void *shadow_start, *shadow_end;

+ /* Perform shadow offset calculation based on untagged address */
+ address = reset_tag(address);
+
shadow_start = kasan_mem_to_shadow(address);
shadow_end = kasan_mem_to_shadow(address + size);

@@ -148,11 +151,20 @@ void kasan_poison_shadow(const void *address, size_t size, u8 value)

void kasan_unpoison_shadow(const void *address, size_t size)
{
- kasan_poison_shadow(address, size, 0);
+ u8 tag = get_tag(address);
+
+ /* Perform shadow offset calculation based on untagged address */
+ address = reset_tag(address);
+
+ kasan_poison_shadow(address, size, tag);

if (size & KASAN_SHADOW_MASK) {
u8 *shadow = (u8 *)kasan_mem_to_shadow(address + size);
- *shadow = size & KASAN_SHADOW_MASK;
+
+ if (IS_ENABLED(CONFIG_KASAN_HW))
+ *shadow = tag;
+ else
+ *shadow = size & KASAN_SHADOW_MASK;
}
}

@@ -200,8 +212,9 @@ void kasan_unpoison_stack_above_sp_to(const void *watermark)

void kasan_alloc_pages(struct page *page, unsigned int order)
{
- if (likely(!PageHighMem(page)))
- kasan_unpoison_shadow(page_address(page), PAGE_SIZE << order);
+ if (unlikely(PageHighMem(page)))
+ return;
+ kasan_unpoison_shadow(page_address(page), PAGE_SIZE << order);
}

void kasan_free_pages(struct page *page, unsigned int order)
@@ -235,6 +248,7 @@ void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
slab_flags_t *flags)
{
unsigned int orig_size = *size;
+ unsigned int redzone_size = 0;
int redzone_adjust;

/* Add alloc meta. */
@@ -242,20 +256,20 @@ void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
*size += sizeof(struct kasan_alloc_meta);

/* Add free meta. */
- if (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
- cache->object_size < sizeof(struct kasan_free_meta)) {
+ if (IS_ENABLED(CONFIG_KASAN_GENERIC) &&
+ (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
+ cache->object_size < sizeof(struct kasan_free_meta))) {
cache->kasan_info.free_meta_offset = *size;
*size += sizeof(struct kasan_free_meta);
}
- redzone_adjust = optimal_redzone(cache->object_size) -
- (*size - cache->object_size);

+ redzone_size = optimal_redzone(cache->object_size);
+ redzone_adjust = redzone_size - (*size - cache->object_size);
if (redzone_adjust > 0)
*size += redzone_adjust;

*size = min_t(unsigned int, KMALLOC_MAX_SIZE,
- max(*size, cache->object_size +
- optimal_redzone(cache->object_size)));
+ max(*size, cache->object_size + redzone_size));

/*
* If the metadata doesn't fit, don't enable KASAN at all.
@@ -268,6 +282,8 @@ void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
return;
}

+ cache->align = round_up(cache->align, KASAN_SHADOW_SCALE_SIZE);
+
*flags |= SLAB_KASAN;
}

@@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache, const void *object)

void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
{
- return kasan_kmalloc(cache, object, cache->object_size, flags);
+ object = kasan_kmalloc(cache, object, cache->object_size, flags);
+ if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
+ /*
+ * Cache constructor might use object's pointer value to
+ * initialize some of its fields.
+ */
+ cache->ctor(object);
+ }
+ return object;
+}
+
+static inline bool shadow_invalid(u8 tag, s8 shadow_byte)
+{
+ if (IS_ENABLED(CONFIG_KASAN_GENERIC))
+ return shadow_byte < 0 ||
+ shadow_byte >= KASAN_SHADOW_SCALE_SIZE;
+ else
+ return tag != (u8)shadow_byte;
}

static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
unsigned long ip, bool quarantine)
{
s8 shadow_byte;
+ u8 tag;
+ void *tagged_object;
unsigned long rounded_up_size;

+ tag = get_tag(object);
+ tagged_object = object;
+ object = reset_tag(object);
+
if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
object)) {
- kasan_report_invalid_free(object, ip);
+ kasan_report_invalid_free(tagged_object, ip);
return true;
}

@@ -345,20 +384,22 @@ static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
return false;

shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(object));
- if (shadow_byte < 0 || shadow_byte >= KASAN_SHADOW_SCALE_SIZE) {
- kasan_report_invalid_free(object, ip);
+ if (shadow_invalid(tag, shadow_byte)) {
+ kasan_report_invalid_free(tagged_object, ip);
return true;
}

rounded_up_size = round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE);
kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);

- if (!quarantine || unlikely(!(cache->flags & SLAB_KASAN)))
+ if ((IS_ENABLED(CONFIG_KASAN_GENERIC) && !quarantine) ||
+ unlikely(!(cache->flags & SLAB_KASAN)))
return false;

set_track(&get_alloc_info(cache, object)->free_track, GFP_NOWAIT);
quarantine_put(get_free_info(cache, object), cache);
- return true;
+
+ return IS_ENABLED(CONFIG_KASAN_GENERIC);
}

bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip)
@@ -371,6 +412,7 @@ void *kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
{
unsigned long redzone_start;
unsigned long redzone_end;
+ u8 tag;

if (gfpflags_allow_blocking(flags))
quarantine_reduce();
@@ -383,14 +425,15 @@ void *kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
redzone_end = round_up((unsigned long)object + cache->object_size,
KASAN_SHADOW_SCALE_SIZE);

- kasan_unpoison_shadow(object, size);
+ tag = random_tag();
+ kasan_unpoison_shadow(set_tag(object, tag), size);
kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
KASAN_KMALLOC_REDZONE);

if (cache->flags & SLAB_KASAN)
set_track(&get_alloc_info(cache, object)->alloc_track, flags);

- return (void *)object;
+ return set_tag(object, tag);
}
EXPORT_SYMBOL(kasan_kmalloc);

@@ -440,7 +483,7 @@ void kasan_poison_kfree(void *ptr, unsigned long ip)
page = virt_to_head_page(ptr);

if (unlikely(!PageSlab(page))) {
- if (ptr != page_address(page)) {
+ if (reset_tag(ptr) != page_address(page)) {
kasan_report_invalid_free(ptr, ip);
return;
}
@@ -453,7 +496,7 @@ void kasan_poison_kfree(void *ptr, unsigned long ip)

void kasan_kfree_large(void *ptr, unsigned long ip)
{
- if (ptr != page_address(virt_to_head_page(ptr)))
+ if (reset_tag(ptr) != page_address(virt_to_head_page(ptr)))
kasan_report_invalid_free(ptr, ip);
/* The object will be poisoned by page_alloc. */
}
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index d60859d26be7..6f4f2ebf5f57 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -12,10 +12,18 @@
#define KHWASAN_TAG_INVALID 0xFE /* inaccessible memory tag */
#define KHWASAN_TAG_MAX 0xFD /* maximum value for random tags */

+#ifdef CONFIG_KASAN_GENERIC
#define KASAN_FREE_PAGE 0xFF /* page was freed */
#define KASAN_PAGE_REDZONE 0xFE /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE 0xFC /* redzone inside slub object */
#define KASAN_KMALLOC_FREE 0xFB /* object was freed (kmem_cache_free/kfree) */
+#else
+#define KASAN_FREE_PAGE KHWASAN_TAG_INVALID
+#define KASAN_PAGE_REDZONE KHWASAN_TAG_INVALID
+#define KASAN_KMALLOC_REDZONE KHWASAN_TAG_INVALID
+#define KASAN_KMALLOC_FREE KHWASAN_TAG_INVALID
+#endif
+
#define KASAN_GLOBAL_REDZONE 0xFA /* redzone for global variable */

/*
diff --git a/mm/kasan/khwasan.c b/mm/kasan/khwasan.c
index d34679b8f8c7..fd1725022794 100644
--- a/mm/kasan/khwasan.c
+++ b/mm/kasan/khwasan.c
@@ -88,15 +88,52 @@ void *khwasan_reset_tag(const void *addr)
void check_memory_region(unsigned long addr, size_t size, bool write,
unsigned long ret_ip)
{
+ u8 tag;
+ u8 *shadow_first, *shadow_last, *shadow;
+ void *untagged_addr;
+
+ tag = get_tag((const void *)addr);
+
+ /* Ignore accesses for pointers tagged with 0xff (native kernel
+ * pointer tag) to suppress false positives caused by kmap.
+ *
+ * Some kernel code was written to account for archs that don't keep
+ * high memory mapped all the time, but rather map and unmap particular
+ * pages when needed. Instead of storing a pointer to the kernel memory,
+ * this code saves the address of the page structure and offset within
+ * that page for later use. Those pages are then mapped and unmapped
+ * with kmap/kunmap when necessary and virt_to_page is used to get the
+ * virtual address of the page. For arm64 (that keeps the high memory
+ * mapped all the time), kmap is turned into a page_address call.
+ *
+ * The issue is that with use of the page_address + virt_to_page
+ * sequence the top byte value of the original pointer gets lost (it
+ * gets set to KHWASAN_TAG_KERNEL (0xFF)).
+ */
+ if (tag == KHWASAN_TAG_KERNEL)
+ return;
+
+ untagged_addr = reset_tag((const void *)addr);
+ shadow_first = kasan_mem_to_shadow(untagged_addr);
+ shadow_last = kasan_mem_to_shadow(untagged_addr + size - 1);
+
+ for (shadow = shadow_first; shadow <= shadow_last; shadow++) {
+ if (*shadow != tag) {
+ kasan_report(addr, size, write, ret_ip);
+ return;
+ }
+ }
}

#define DEFINE_HWASAN_LOAD_STORE(size) \
void __hwasan_load##size##_noabort(unsigned long addr) \
{ \
+ check_memory_region(addr, size, false, _RET_IP_); \
} \
EXPORT_SYMBOL(__hwasan_load##size##_noabort); \
void __hwasan_store##size##_noabort(unsigned long addr) \
{ \
+ check_memory_region(addr, size, true, _RET_IP_); \
} \
EXPORT_SYMBOL(__hwasan_store##size##_noabort)

@@ -108,15 +145,18 @@ DEFINE_HWASAN_LOAD_STORE(16);

void __hwasan_loadN_noabort(unsigned long addr, unsigned long size)
{
+ check_memory_region(addr, size, false, _RET_IP_);
}
EXPORT_SYMBOL(__hwasan_loadN_noabort);

void __hwasan_storeN_noabort(unsigned long addr, unsigned long size)
{
+ check_memory_region(addr, size, true, _RET_IP_);
}
EXPORT_SYMBOL(__hwasan_storeN_noabort);

void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size)
{
+ kasan_poison_shadow((void *)addr, size, tag);
}
EXPORT_SYMBOL(__hwasan_tag_memory);
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-26 14:18:41

by Andrey Konovalov

[permalink] [raw]
Subject: [PATCH v4 03/17] khwasan: add CONFIG_KASAN_GENERIC and CONFIG_KASAN_HW

This commit splits the current CONFIG_KASAN config option into two:
1. CONFIG_KASAN_GENERIC, that enables the generic software-only KASAN
version (the one that exists now);
2. CONFIG_KASAN_HW, that enables KHWASAN.

With CONFIG_KASAN_HW enabled, compiler options are changed to instrument
kernel files with -fsanitize=kernel-hwaddress (except the ones for which
KASAN_SANITIZE := n is set).

Both CONFIG_KASAN_GENERIC and CONFIG_KASAN_HW support both
CONFIG_KASAN_INLINE and CONFIG_KASAN_OUTLINE instrumentation modes.

This commit also adds an empty placeholder (for now) implementation of
the KHWASAN-specific hooks inserted by the compiler and adjusts the
common hooks implementation to compile correctly with each of the config
options.

Signed-off-by: Andrey Konovalov <[email protected]>
---
arch/arm64/Kconfig | 1 +
include/linux/compiler-clang.h | 5 ++-
include/linux/compiler-gcc.h | 4 ++
include/linux/compiler.h | 3 +-
include/linux/kasan.h | 16 +++++--
lib/Kconfig.kasan | 76 ++++++++++++++++++++++++++--------
mm/kasan/Makefile | 6 ++-
mm/kasan/kasan.h | 3 +-
mm/kasan/khwasan.c | 75 +++++++++++++++++++++++++++++++++
mm/slub.c | 2 +-
scripts/Makefile.kasan | 27 +++++++++++-
11 files changed, 189 insertions(+), 29 deletions(-)
create mode 100644 mm/kasan/khwasan.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 42c090cf0292..43d9de526d3a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -92,6 +92,7 @@ config ARM64
select HAVE_ARCH_HUGE_VMAP
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
+ select HAVE_ARCH_KASAN_HW if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 7087446c24c8..6ac1761ff102 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -21,13 +21,16 @@
#define KASAN_ABI_VERSION 5

/* emulate gcc's __SANITIZE_ADDRESS__ flag */
-#if __has_feature(address_sanitizer)
+#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
#define __SANITIZE_ADDRESS__
#endif

#undef __no_sanitize_address
#define __no_sanitize_address __attribute__((no_sanitize("address")))

+#undef __no_sanitize_hwaddress
+#define __no_sanitize_hwaddress __attribute__((no_sanitize("hwaddress")))
+
/* Clang doesn't have a way to turn it off per-function, yet. */
#ifdef __noretpoline
#undef __noretpoline
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index f1a7492a5cc8..24ca815de84b 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -338,6 +338,10 @@
#define __no_sanitize_address
#endif

+#if !defined(__no_sanitize_hwaddress)
+#define __no_sanitize_hwaddress /* gcc doesn't support KHWASAN */
+#endif
+
/*
* A trick to suppress uninitialized variable warning without generating any
* code
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 42506e4d1f53..6439fdd46b4e 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -195,7 +195,8 @@ void __read_once_size(const volatile void *p, void *res, int size)
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
* '__maybe_unused' allows us to avoid defined-but-not-used warnings.
*/
-# define __no_kasan_or_inline __no_sanitize_address __maybe_unused
+# define __no_kasan_or_inline __no_sanitize_address __no_sanitize_hwaddress \
+ __maybe_unused
#else
# define __no_kasan_or_inline __always_inline
#endif
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index cbdc54543803..6608aa9b35ac 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -45,8 +45,6 @@ void kasan_free_pages(struct page *page, unsigned int order);

void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
slab_flags_t *flags);
-void kasan_cache_shrink(struct kmem_cache *cache);
-void kasan_cache_shutdown(struct kmem_cache *cache);

void kasan_poison_slab(struct page *page);
void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
@@ -94,8 +92,6 @@ static inline void kasan_free_pages(struct page *page, unsigned int order) {}
static inline void kasan_cache_create(struct kmem_cache *cache,
unsigned int *size,
slab_flags_t *flags) {}
-static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
-static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}

static inline void kasan_poison_slab(struct page *page) {}
static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
@@ -141,4 +137,16 @@ static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }

#endif /* CONFIG_KASAN */

+#ifdef CONFIG_KASAN_GENERIC
+
+void kasan_cache_shrink(struct kmem_cache *cache);
+void kasan_cache_shutdown(struct kmem_cache *cache);
+
+#else /* CONFIG_KASAN_GENERIC */
+
+static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
+static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
+
+#endif /* CONFIG_KASAN_GENERIC */
+
#endif /* LINUX_KASAN_H */
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index 3d35d062970d..baf2619b7ff4 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -1,33 +1,73 @@
config HAVE_ARCH_KASAN
bool

+config HAVE_ARCH_KASAN_HW
+ bool
+
if HAVE_ARCH_KASAN

config KASAN
- bool "KASan: runtime memory debugger"
+ bool "KASAN: runtime memory debugger"
+ help
+ Enables KASAN (KernelAddressSANitizer) - runtime memory debugger,
+ designed to find out-of-bounds accesses and use-after-free bugs.
+
+choice
+ prompt "KASAN mode"
+ depends on KASAN
+ default KASAN_GENERIC
+ help
+ KASAN has two modes: KASAN (a classic version, similar to userspace
+ ASan, enabled with CONFIG_KASAN_GENERIC) and KHWASAN (a version
+ based on pointer tagging, only for arm64, similar to userspace
+ HWASan, enabled with CONFIG_KASAN_HW).
+
+config KASAN_GENERIC
+ bool "KASAN: the generic mode"
depends on SLUB || (SLAB && !DEBUG_SLAB)
select CONSTRUCTORS
select STACKDEPOT
help
- Enables kernel address sanitizer - runtime memory debugger,
- designed to find out-of-bounds accesses and use-after-free bugs.
- This is strictly a debugging feature and it requires a gcc version
- of 4.9.2 or later. Detection of out of bounds accesses to stack or
- global variables requires gcc 5.0 or later.
- This feature consumes about 1/8 of available memory and brings about
- ~x3 performance slowdown.
+ Enables the generic mode of KASAN.
+ This is strictly a debugging feature and it requires a GCC version
+ of 4.9.2 or later. Detection of out-of-bounds accesses to stack or
+ global variables requires GCC 5.0 or later.
+ This mode consumes about 1/8 of available memory at kernel start
+ and introduces an overhead of ~x1.5 for the rest of the allocations.
+ The performance slowdown is ~x3.
+ For better error detection enable CONFIG_STACKTRACE.
+ Currently CONFIG_KASAN_GENERIC doesn't work with CONFIG_DEBUG_SLAB
+ (the resulting kernel does not boot).
+
+if HAVE_ARCH_KASAN_HW
+
+config KASAN_HW
+ bool "KHWASAN: the hardware assisted mode"
+ depends on SLUB || (SLAB && !DEBUG_SLAB)
+ select CONSTRUCTORS
+ select STACKDEPOT
+ help
+ Enables KHWASAN (a KASAN mode based on pointer tagging).
+ This mode requires Top Byte Ignore support by the CPU and is
+ therefore only supported on arm64.
+ This feature requires clang revision 330044 or later.
+ This mode consumes about 1/16 of available memory at kernel start
+ and introduces an overhead of ~20% for the rest of the allocations.
For better error detection enable CONFIG_STACKTRACE.
- Currently CONFIG_KASAN doesn't work with CONFIG_DEBUG_SLAB
+ Currently CONFIG_KASAN_HW doesn't work with CONFIG_DEBUG_SLAB
(the resulting kernel does not boot).

+endif
+
+endchoice
+
config KASAN_EXTRA
- bool "KAsan: extra checks"
- depends on KASAN && DEBUG_KERNEL && !COMPILE_TEST
+ bool "KASAN: extra checks"
+ depends on KASAN_GENERIC && DEBUG_KERNEL && !COMPILE_TEST
help
- This enables further checks in the kernel address sanitizer, for now
- it only includes the address-use-after-scope check that can lead
- to excessive kernel stack usage, frame size warnings and longer
- compile time.
+ This enables further checks in KASAN, for now it only includes the
+ address-use-after-scope check that can lead to excessive kernel
+ stack usage, frame size warnings and longer compile time.
+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 has more
+ details.


@@ -52,16 +92,16 @@ config KASAN_INLINE
memory accesses. This is faster than outline (in some workloads
it gives about x2 boost over outline instrumentation), but
make kernel's .text size much bigger.
- This requires a gcc version of 5.0 or later.
+ For CONFIG_KASAN_GENERIC this requires GCC 5.0 or later.

endchoice

config TEST_KASAN
- tristate "Module for testing kasan for bug detection"
+ tristate "Module for testing KASAN for bug detection"
depends on m && KASAN
help
This is a test module doing various nasty things like
out of bounds accesses, use after free. It is useful for testing
- kernel debugging features like kernel address sanitizer.
+ kernel debugging features like KASAN.

endif
diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index a6df14bffb6b..14955add96d3 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -2,6 +2,7 @@
KASAN_SANITIZE := n
UBSAN_SANITIZE_common.o := n
UBSAN_SANITIZE_kasan.o := n
+UBSAN_SANITIZE_khwasan.o := n
KCOV_INSTRUMENT := n

CFLAGS_REMOVE_kasan.o = -pg
@@ -10,5 +11,8 @@ CFLAGS_REMOVE_kasan.o = -pg

CFLAGS_common.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
+CFLAGS_khwasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)

-obj-y := common.o kasan.o report.o kasan_init.o quarantine.o
+obj-$(CONFIG_KASAN) := common.o kasan_init.o report.o
+obj-$(CONFIG_KASAN_GENERIC) += kasan.o quarantine.o
+obj-$(CONFIG_KASAN_HW) += khwasan.o
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 659463800f10..19b950eaccff 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -114,7 +114,8 @@ void kasan_report(unsigned long addr, size_t size,
bool is_write, unsigned long ip);
void kasan_report_invalid_free(void *object, unsigned long ip);

-#if defined(CONFIG_SLAB) || defined(CONFIG_SLUB)
+#if defined(CONFIG_KASAN_GENERIC) && \
+ (defined(CONFIG_SLAB) || defined(CONFIG_SLUB))
void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache);
void quarantine_reduce(void);
void quarantine_remove_cache(struct kmem_cache *cache);
diff --git a/mm/kasan/khwasan.c b/mm/kasan/khwasan.c
new file mode 100644
index 000000000000..e2c3a7f7fd1f
--- /dev/null
+++ b/mm/kasan/khwasan.c
@@ -0,0 +1,75 @@
+/*
+ * This file contains core KHWASAN code.
+ *
+ * Copyright (c) 2018 Google, Inc.
+ * Author: Andrey Konovalov <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#define DISABLE_BRANCH_PROFILING
+
+#include <linux/export.h>
+#include <linux/interrupt.h>
+#include <linux/init.h>
+#include <linux/kasan.h>
+#include <linux/kernel.h>
+#include <linux/kmemleak.h>
+#include <linux/linkage.h>
+#include <linux/memblock.h>
+#include <linux/memory.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+#include <linux/random.h>
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+#include <linux/slab.h>
+#include <linux/stacktrace.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+#include <linux/bug.h>
+
+#include "kasan.h"
+#include "../slab.h"
+
+void check_memory_region(unsigned long addr, size_t size, bool write,
+ unsigned long ret_ip)
+{
+}
+
+#define DEFINE_HWASAN_LOAD_STORE(size) \
+ void __hwasan_load##size##_noabort(unsigned long addr) \
+ { \
+ } \
+ EXPORT_SYMBOL(__hwasan_load##size##_noabort); \
+ void __hwasan_store##size##_noabort(unsigned long addr) \
+ { \
+ } \
+ EXPORT_SYMBOL(__hwasan_store##size##_noabort)
+
+DEFINE_HWASAN_LOAD_STORE(1);
+DEFINE_HWASAN_LOAD_STORE(2);
+DEFINE_HWASAN_LOAD_STORE(4);
+DEFINE_HWASAN_LOAD_STORE(8);
+DEFINE_HWASAN_LOAD_STORE(16);
+
+void __hwasan_loadN_noabort(unsigned long addr, unsigned long size)
+{
+}
+EXPORT_SYMBOL(__hwasan_loadN_noabort);
+
+void __hwasan_storeN_noabort(unsigned long addr, unsigned long size)
+{
+}
+EXPORT_SYMBOL(__hwasan_storeN_noabort);
+
+void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size)
+{
+}
+EXPORT_SYMBOL(__hwasan_tag_memory);
diff --git a/mm/slub.c b/mm/slub.c
index a60887938c19..5d1e1dee159b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2956,7 +2956,7 @@ static __always_inline void slab_free(struct kmem_cache *s, struct page *page,
do_slab_free(s, page, head, tail, cnt, addr);
}

-#ifdef CONFIG_KASAN
+#ifdef CONFIG_KASAN_GENERIC
void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
{
do_slab_free(cache, virt_to_head_page(x), x, NULL, 1, addr);
diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan
index 69552a39951d..49c6e056c697 100644
--- a/scripts/Makefile.kasan
+++ b/scripts/Makefile.kasan
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-ifdef CONFIG_KASAN
+ifdef CONFIG_KASAN_GENERIC
ifdef CONFIG_KASAN_INLINE
call_threshold := 10000
else
@@ -42,6 +42,29 @@ ifdef CONFIG_KASAN_EXTRA
CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
endif

-CFLAGS_KASAN_NOSANITIZE := -fno-builtin
+endif
+
+ifdef CONFIG_KASAN_HW
+
+ifdef CONFIG_KASAN_INLINE
+ instrumentation_flags := -mllvm -hwasan-mapping-offset=$(KASAN_SHADOW_OFFSET)
+else
+ instrumentation_flags := -mllvm -hwasan-instrument-with-calls=1
+endif

+CFLAGS_KASAN := -fsanitize=kernel-hwaddress \
+ -mllvm -hwasan-instrument-stack=0 \
+ $(instrumentation_flags)
+
+ifeq ($(call cc-option, $(CFLAGS_KASAN) -Werror),)
+ ifneq ($(CONFIG_COMPILE_TEST),y)
+ $(warning Cannot use CONFIG_KASAN_HW: \
+ -fsanitize=hwaddress is not supported by compiler)
+ endif
+endif
+
+endif
+
+ifdef CONFIG_KASAN
+CFLAGS_KASAN_NOSANITIZE := -fno-builtin
endif
--
2.18.0.rc2.346.g013aa6912e-goog


2018-06-27 23:53:52

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Tue, 26 Jun 2018 15:15:10 +0200 Andrey Konovalov <[email protected]> wrote:

> This patchset adds a new mode to KASAN [1], which is called KHWASAN
> (Kernel HardWare assisted Address SANitizer).
>
> The plan is to implement HWASan [2] for the kernel with the incentive,
> that it's going to have comparable to KASAN performance, but in the same
> time consume much less memory, trading that off for somewhat imprecise
> bug detection and being supported only for arm64.

Why do we consider this to be a worthwhile change?

Is KASAN's memory consumption actually a significant problem? Some
data regarding that would be very useful.

If it is a large problem then we still have that problem on x86, so the
problem remains largely unsolved?

> ====== Benchmarks
>
> The following numbers were collected on Odroid C2 board. Both KASAN and
> KHWASAN were used in inline instrumentation mode.
>
> Boot time [1]:
> * ~1.7 sec for clean kernel
> * ~5.0 sec for KASAN
> * ~5.0 sec for KHWASAN
>
> Slab memory usage after boot [2]:
> * ~40 kb for clean kernel
> * ~105 kb + 1/8th shadow ~= 118 kb for KASAN
> * ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN
>
> Network performance [3]:
> * 8.33 Gbits/sec for clean kernel
> * 3.17 Gbits/sec for KASAN
> * 2.85 Gbits/sec for KHWASAN
>
> Note, that KHWASAN (compared to KASAN) doesn't require quarantine.
>
> [1] Time before the ext4 driver is initialized.
> [2] Measured as `cat /proc/meminfo | grep Slab`.
> [3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.

The above doesn't actually demonstrate the whole point of the
patchset: to reduce KASAN's very high memory consumption?


2018-06-28 00:06:56

by Kostya Serebryany

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Jun 27, 2018 at 4:08 PM Andrew Morton <[email protected]> wrote:
>
> On Tue, 26 Jun 2018 15:15:10 +0200 Andrey Konovalov <[email protected]> wrote:
>
> > This patchset adds a new mode to KASAN [1], which is called KHWASAN
> > (Kernel HardWare assisted Address SANitizer).
> >
> > The plan is to implement HWASan [2] for the kernel with the incentive,
> > that it's going to have comparable to KASAN performance, but in the same
> > time consume much less memory, trading that off for somewhat imprecise
> > bug detection and being supported only for arm64.
>
> Why do we consider this to be a worthwhile change?
>
> Is KASAN's memory consumption actually a significant problem? Some
> data regarding that would be very useful.

On mobile, ASAN's and KASAN's memory usage is a significant problem.
Not sure if I can find scientific evidence of that.
CC-ing Vishwath Mohan who deals with KASAN on Android to provide
anecdotal evidence.

There are several other benefits too:
* HWASAN more reliably detects non-linear-buffer-overflows compared to
ASAN (same for kernel-HWASAN vs kernel-ASAN)
* Same for detecting use-after-free (since HWASAN doesn't rely on quarantine).
* Much easier to implement stack-use-after-return detection (which
IIRC KASAN doesn't have yet, because in KASAN it's too hard)

> If it is a large problem then we still have that problem on x86, so the
> problem remains largely unsolved?

The problem is more significant on mobile devices than on desktop/server.
I'd love to have [K]HWASAN on x86_64 as well, but it's less trivial since x86_64
doesn't have an analog of aarch64's top-byte-ignore hardware feature.


>
> > ====== Benchmarks
> >
> > The following numbers were collected on Odroid C2 board. Both KASAN and
> > KHWASAN were used in inline instrumentation mode.
> >
> > Boot time [1]:
> > * ~1.7 sec for clean kernel
> > * ~5.0 sec for KASAN
> > * ~5.0 sec for KHWASAN
> >
> > Slab memory usage after boot [2]:
> > * ~40 kb for clean kernel
> > * ~105 kb + 1/8th shadow ~= 118 kb for KASAN
> > * ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN
> >
> > Network performance [3]:
> > * 8.33 Gbits/sec for clean kernel
> > * 3.17 Gbits/sec for KASAN
> > * 2.85 Gbits/sec for KHWASAN
> >
> > Note, that KHWASAN (compared to KASAN) doesn't require quarantine.
> >
> > [1] Time before the ext4 driver is initialized.
> > [2] Measured as `cat /proc/meminfo | grep Slab`.
> > [3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
>
> The above doesn't actually demonstrate the whole point of the
> patchset: to reduce KASAN's very high memory consumption?
>

2018-06-28 01:17:39

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, 27 Jun 2018 17:59:00 -0700 Vishwath Mohan <[email protected]> wrote:

> > > > time consume much less memory, trading that off for somewhat imprecise
> > > > bug detection and being supported only for arm64.
> > >
> > > Why do we consider this to be a worthwhile change?
> > >
> > > Is KASAN's memory consumption actually a significant problem? Some
> > > data regarding that would be very useful.
> >
> > On mobile, ASAN's and KASAN's memory usage is a significant problem.
> > Not sure if I can find scientific evidence of that.
> > CC-ing Vishwath Mohan who deals with KASAN on Android to provide
> > anecdotal evidence.
> >
> Yeah, I can confirm that it's an issue. Like Kostya mentioned, I don't have
> data on-hand, but anecdotally both ASAN and KASAN have proven problematic
> to enable for environments that don't tolerate the increased memory
> pressure well. This includes,
> (a) Low-memory form factors - Wear, TV, Things, lower-tier phones like Go
> (c) Connected components like Pixel's visual core
> <https://www.blog.google/products/pixel/pixel-visual-core-image-processing-and-machine-learning-pixel-2/>
>
>
> These are both places I'd love to have a low(er) memory footprint option at
> my disposal.

Thanks.

It really is important that such information be captured in the
changelogs. In as much detail as can be mustered.


2018-06-28 07:51:13

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

Hi Kostya,

On Thu, Jun 28, 2018 at 2:04 AM Kostya Serebryany <[email protected]> wrote:
> On Wed, Jun 27, 2018 at 4:08 PM Andrew Morton <[email protected]> wrote:
> > On Tue, 26 Jun 2018 15:15:10 +0200 Andrey Konovalov <[email protected]> wrote:
> > > This patchset adds a new mode to KASAN [1], which is called KHWASAN
> > > (Kernel HardWare assisted Address SANitizer).
> > >
> > > The plan is to implement HWASan [2] for the kernel with the incentive,
> > > that it's going to have comparable to KASAN performance, but in the same
> > > time consume much less memory, trading that off for somewhat imprecise
> > > bug detection and being supported only for arm64.
> >
> > Why do we consider this to be a worthwhile change?
> >
> > Is KASAN's memory consumption actually a significant problem? Some
> > data regarding that would be very useful.
>
> On mobile, ASAN's and KASAN's memory usage is a significant problem.
> Not sure if I can find scientific evidence of that.
> CC-ing Vishwath Mohan who deals with KASAN on Android to provide
> anecdotal evidence.
>
> There are several other benefits too:
> * HWASAN more reliably detects non-linear-buffer-overflows compared to
> ASAN (same for kernel-HWASAN vs kernel-ASAN)
> * Same for detecting use-after-free (since HWASAN doesn't rely on quarantine).
> * Much easier to implement stack-use-after-return detection (which
> IIRC KASAN doesn't have yet, because in KASAN it's too hard)
>
> > If it is a large problem then we still have that problem on x86, so the
> > problem remains largely unsolved?
>
> The problem is more significant on mobile devices than on desktop/server.
> I'd love to have [K]HWASAN on x86_64 as well, but it's less trivial since x86_64
> doesn't have an analog of aarch64's top-byte-ignore hardware feature.

This depends on your mobile devices and desktops and servers.
There exist mobile devices with more memory than some desktops or servers.

So actual numbers (O(KiB)? O(MiB)? O(GiB)?) would be nice to have.

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2018-06-28 16:25:05

by Dave Martin

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Tue, Jun 26, 2018 at 03:15:10PM +0200, Andrey Konovalov wrote:
> This patchset adds a new mode to KASAN [1], which is called KHWASAN
> (Kernel HardWare assisted Address SANitizer).
>
> The plan is to implement HWASan [2] for the kernel with the expectation
> that it will have performance comparable to KASAN while at the same
> time consuming much less memory, trading that off for somewhat imprecise
> bug detection and arm64-only support.
>
> The overall idea of the approach used by KHWASAN is the following:
>
> 1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
> tags in the top byte of each kernel pointer.

[...]

This is a change from the current situation, so the kernel may be
making implicit assumptions about the top byte of kernel addresses.

Randomising the top bits may cause things like address conversions and
pointer arithmetic to break.

For example, (q - p) will not produce the expected result if q and p
have different tags.

Conversions, such as between pointer and pfn, may also go wrong if not
appropriately masked.

There are also potential pointer comparison and aliasing issues if
the tag bits are ever stripped or modified.


What was your approach to tracking down all the points in the code
where we have a potential issue?

(I haven't dug through the series in detail yet, so this may be
answered somewhere already...)

Cheers
---Dave

2018-06-28 19:46:33

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 1:08 AM, Andrew Morton
<[email protected]> wrote:
> On Tue, 26 Jun 2018 15:15:10 +0200 Andrey Konovalov <[email protected]> wrote:
>> ====== Benchmarks
>>
>> The following numbers were collected on Odroid C2 board. Both KASAN and
>> KHWASAN were used in inline instrumentation mode.
>>
>> Boot time [1]:
>> * ~1.7 sec for clean kernel
>> * ~5.0 sec for KASAN
>> * ~5.0 sec for KHWASAN
>>
>> Slab memory usage after boot [2]:
>> * ~40 kb for clean kernel
>> * ~105 kb + 1/8th shadow ~= 118 kb for KASAN
>> * ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN
>>
>> Network performance [3]:
>> * 8.33 Gbits/sec for clean kernel
>> * 3.17 Gbits/sec for KASAN
>> * 2.85 Gbits/sec for KHWASAN
>>
>> Note, that KHWASAN (compared to KASAN) doesn't require quarantine.
>>
>> [1] Time before the ext4 driver is initialized.
>> [2] Measured as `cat /proc/meminfo | grep Slab`.
>> [3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
>
> The above doesn't actually demonstrate the whole point of the
> patchset: to reduce KASAN's very high memory consumption?

You mean that memory usage numbers collected after boot don't give a
representative picture of actual memory consumption on real workloads?

What kind of memory consumption testing would you like to see?

2018-06-29 02:53:16

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 3:11 AM, Andrew Morton
<[email protected]> wrote:
> On Wed, 27 Jun 2018 17:59:00 -0700 Vishwath Mohan <[email protected]> wrote:
>> Yeah, I can confirm that it's an issue. Like Kostya mentioned, I don't have
>> data on-hand, but anecdotally both ASAN and KASAN have proven problematic
>> to enable for environments that don't tolerate the increased memory
>> pressure well. This includes,
>> (a) Low-memory form factors - Wear, TV, Things, lower-tier phones like Go
>> (c) Connected components like Pixel's visual core
>> <https://www.blog.google/products/pixel/pixel-visual-core-image-processing-and-machine-learning-pixel-2/>
>>
>>
>> These are both places I'd love to have a low(er) memory footprint option at
>> my disposal.
>
> Thanks.
>
> It really is important that such information be captured in the
> changelogs. In as much detail as can be mustered.

I'll add it to the changelog in v5. Thanks!

2018-06-29 03:34:59

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 12:51 PM, Dave Martin <[email protected]> wrote:
> On Tue, Jun 26, 2018 at 03:15:10PM +0200, Andrey Konovalov wrote:
>> 1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
>> tags in the top byte of each kernel pointer.
>
> [...]
>
> This is a change from the current situation, so the kernel may be
> making implicit assumptions about the top byte of kernel addresses.
>
> Randomising the top bits may cause things like address conversions and
> pointer arithmetic to break.
>
> For example, (q - p) will not produce the expected result if q and p
> have different tags.

If q and p have different tags, that means they come from different
allocations. I don't think it would make sense to calculate pointer
difference in this case.

>
> Conversions, such as between pointer and pfn, may also go wrong if not
> appropriately masked.
>
> There are also potential pointer comparison and aliasing issues if
> the tag bits are ever stripped or modified.
>
>
> What was your approach to tracking down all the points in the code
> where we have a potential issue?

I've been fuzzing the kernel built with KHWASAN with syzkaller. This
gives decent coverage, and I was able to find some places where
fixups were required this way. Right now the fuzzer is running without
issues. It doesn't prove that all such places are fixed, but I don't
know a better way to test this.

2018-06-29 04:33:15

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, 28 Jun 2018 20:29:07 +0200 Andrey Konovalov <[email protected]> wrote:

> >> Slab memory usage after boot [2]:
> >> * ~40 kb for clean kernel
> >> * ~105 kb + 1/8th shadow ~= 118 kb for KASAN
> >> * ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN
> >>
> >> Network performance [3]:
> >> * 8.33 Gbits/sec for clean kernel
> >> * 3.17 Gbits/sec for KASAN
> >> * 2.85 Gbits/sec for KHWASAN
> >>
> >> Note, that KHWASAN (compared to KASAN) doesn't require quarantine.
> >>
> >> [1] Time before the ext4 driver is initialized.
> >> [2] Measured as `cat /proc/meminfo | grep Slab`.
> >> [3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
> >
> > The above doesn't actually demonstrate the whole point of the
> > patchset: to reduce KASAN's very high memory consumption?
>
> You mean that memory usage numbers collected after boot don't give a
> representative picture of actual memory consumption on real workloads?
>
> What kind of memory consumption testing would you like to see?

Well, 100kb or so is a teeny amount on virtually any machine. I'm
assuming the savings are (much) more significant once the machine gets
loaded up and doing work?


2018-06-29 10:35:12

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 08:56:41PM +0200, Andrey Konovalov wrote:
> On Thu, Jun 28, 2018 at 12:51 PM, Dave Martin <[email protected]> wrote:
> > On Tue, Jun 26, 2018 at 03:15:10PM +0200, Andrey Konovalov wrote:
> >> 1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
> >> tags in the top byte of each kernel pointer.
> >
> > [...]
> >
> > This is a change from the current situation, so the kernel may be
> > making implicit assumptions about the top byte of kernel addresses.

[...]

> > What was your approach to tracking down all the points in the code
> > where we have a potential issue?
>
> I've been fuzzing the kernel built with KHWASAN with syzkaller. This
> gives decent coverage, and I was able to find some places where
> fixups were required this way. Right now the fuzzer is running without
> issues. It doesn't prove that all such places are fixed, but I don't
> know a better way to test this.

While fuzzing shows that the kernel doesn't crash (and this is very
important), it does not show that it exhibits the expected behaviour,
and there could be a number of soft failures present.

e.g. certain functions might just return an error code if a pointer has
a tag other than 0xff (such that it looks like a kernel pointer) or 0x00
(such that it looks like a user pointer), and this might not result in a
fatal error even though the behaviour is not what we require.

Perhaps it's possible to compare the behaviour of a kernel making use of
tags with one which does not, though I'm not sure at which level the
comparison needs to be performed.

Thanks,
Mark.

2018-06-29 11:24:31

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 08:56:41PM +0200, Andrey Konovalov wrote:
> On Thu, Jun 28, 2018 at 12:51 PM, Dave Martin <[email protected]> wrote:
> > On Tue, Jun 26, 2018 at 03:15:10PM +0200, Andrey Konovalov wrote:
> >> 1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
> >> tags in the top byte of each kernel pointer.
> >
> > [...]
> >
> > This is a change from the current situation, so the kernel may be
> > making implicit assumptions about the top byte of kernel addresses.
> >
> > Randomising the top bits may cause things like address conversions and
> > pointer arithmetic to break.
> >
> > For example, (q - p) will not produce the expected result if q and p
> > have different tags.
>
> If q and p have different tags, that means they come from different
> allocations. I don't think it would make sense to calculate pointer
> difference in this case.

Well, there is a lot of pointer comparison in the kernel which means
pointer difference. Take is_vmalloc_addr() for example: even if your
patchset does not cover (IIUC) vmalloc() at the moment, this function
may be called with slab addresses. Presumably they would all fail the
check with a non-0xff tag, but it's something that needs to be
understood. If you later add support for vmalloc(), this test would
fail (as would the rb tree search in find_vmap_area()). Kmemleak would
probably break as well, as it makes heavy use of rb trees.

Basically you need to be very clear about kernel pointer usage (with an
associated tag) vs the address range it refers to, which is in most
cases converted to an unsigned long. See the other discussion on
sparse; it could potentially be useful if we can detect the places
where a pointer is converted to an ulong and maybe hide such
conversions behind a macro, with the arm64 implementation also clearing
the tag.

--
Catalin

2018-06-29 13:35:34

by Dave Martin

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 08:56:41PM +0200, Andrey Konovalov wrote:
> On Thu, Jun 28, 2018 at 12:51 PM, Dave Martin <[email protected]> wrote:
> > On Tue, Jun 26, 2018 at 03:15:10PM +0200, Andrey Konovalov wrote:
> >> 1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
> >> tags in the top byte of each kernel pointer.
> >
> > [...]
> >
> > This is a change from the current situation, so the kernel may be
> > making implicit assumptions about the top byte of kernel addresses.
> >
> > Randomising the top bits may cause things like address conversions and
> > pointer arithmetic to break.
> >
> > For example, (q - p) will not produce the expected result if q and p
> > have different tags.
>
> If q and p have different tags, that means they come from different
> allocations. I don't think it would make sense to calculate pointer
> difference in this case.

It's not strictly valid to subtract pointers from different allocations
in C, but it's hard to prove statically that two pointers are guaranteed
to point into the same allocation.

It's likely that we're getting away with it in some places today.

> > Conversions, such as between pointer and pfn, may also go wrong if not
> > appropriately masked.
> >
> > There are also potential pointer comparison and aliasing issues if
> > the tag bits are ever stripped or modified.
> >
> >
> > What was your approach to tracking down all the points in the code
> > where we have a potential issue?
>
> I've been fuzzing the kernel built with KHWASAN with syzkaller. This
> gives decent coverage, and I was able to find some places where
> fixups were required this way. Right now the fuzzer is running without
> issues. It doesn't prove that all such places are fixed, but I don't
> know a better way to test this.

Can sparse be hacked to identify pointer subtractions where the pointers
cannot be statically proved to point into the same allocation?

Maybe the number of hits for this wouldn't be outrageously high, though
I expect there would be a fair number.

Tracking pointers that have been cast to integer types is harder.
Ideally we'd want to do that, to flag up potentially problematic
masking and other similar hacks.

Cheers
---Dave

2018-06-29 13:36:24

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 08:56:41PM +0200, Andrey Konovalov wrote:
> On Thu, Jun 28, 2018 at 12:51 PM, Dave Martin <[email protected]> wrote:
> > On Tue, Jun 26, 2018 at 03:15:10PM +0200, Andrey Konovalov wrote:
> >> 1. By using the Top Byte Ignore arm64 CPU feature, we can store pointer
> >> tags in the top byte of each kernel pointer.
> >
> > [...]
> >
> > This is a change from the current situation, so the kernel may be
> > making implicit assumptions about the top byte of kernel addresses.
> >
> > Randomising the top bits may cause things like address conversions and
> > pointer arithmetic to break.
> >
> > For example, (q - p) will not produce the expected result if q and p
> > have different tags.
>
> If q and p have different tags, that means they come from different
> allocations. I don't think it would make sense to calculate pointer
> difference in this case.

It might not seem sensible, but we could still be relying on this in the
kernel and so this change would introduce a regression. I think we need
a way to identify such pointer usage before these patches can seriously be
considered for mainline inclusion. For example use of '>' and '<' to
compare pointers in an rbtree could be affected by the introduction of
tags.

Will

2018-06-29 14:18:44

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Jun 28, 2018 at 9:40 PM, Andrew Morton
<[email protected]> wrote:
> On Thu, 28 Jun 2018 20:29:07 +0200 Andrey Konovalov <[email protected]> wrote:
>
>> >> Slab memory usage after boot [2]:
>> >> * ~40 kb for clean kernel
>> >> * ~105 kb + 1/8th shadow ~= 118 kb for KASAN
>> >> * ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN
>> >>
>> >> Network performance [3]:
>> >> * 8.33 Gbits/sec for clean kernel
>> >> * 3.17 Gbits/sec for KASAN
>> >> * 2.85 Gbits/sec for KHWASAN
>> >>
>> >> Note, that KHWASAN (compared to KASAN) doesn't require quarantine.
>> >>
>> >> [1] Time before the ext4 driver is initialized.
>> >> [2] Measured as `cat /proc/meminfo | grep Slab`.
>> >> [3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
>> >
>> > The above doesn't actually demonstrate the whole point of the
>> > patchset: to reduce KASAN's very high memory consumption?
>>
>> You mean that memory usage numbers collected after boot don't give a
>> representative picture of actual memory consumption on real workloads?
>>
>> What kind of memory consumption testing would you like to see?
>
> Well, 100kb or so is a teeny amount on virtually any machine. I'm
> assuming the savings are (much) more significant once the machine gets
> loaded up and doing work?

So with a clean kernel after boot we get 40 kb memory usage. With KASAN
it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
is 25% overhead. This should scale approximately to any amount of
used slab memory. For example, with 100 mb memory usage we would get
+200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
quarantine for better use-after-free detection.) I can explicitly
mention the overhead percentages in the changelog.

If you think it makes sense, I can also make separate measurements
with some workload. What kind of workload should I use?

2018-06-29 14:22:06

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 02:45:08PM +0200, Andrey Konovalov wrote:
> On Thu, Jun 28, 2018 at 9:40 PM, Andrew Morton
> <[email protected]> wrote:
> > On Thu, 28 Jun 2018 20:29:07 +0200 Andrey Konovalov <[email protected]> wrote:
> >
> >> >> Slab memory usage after boot [2]:
> >> >> * ~40 kb for clean kernel
> >> >> * ~105 kb + 1/8th shadow ~= 118 kb for KASAN
> >> >> * ~47 kb + 1/16th shadow ~= 50 kb for KHWASAN
> >> >>
> >> >> Network performance [3]:
> >> >> * 8.33 Gbits/sec for clean kernel
> >> >> * 3.17 Gbits/sec for KASAN
> >> >> * 2.85 Gbits/sec for KHWASAN
> >> >>
> >> >> Note, that KHWASAN (compared to KASAN) doesn't require quarantine.
> >> >>
> >> >> [1] Time before the ext4 driver is initialized.
> >> >> [2] Measured as `cat /proc/meminfo | grep Slab`.
> >> >> [3] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
> >> >
> >> > The above doesn't actually demonstrate the whole point of the
> >> > patchset: to reduce KASAN's very high memory consumption?
> >>
> >> You mean that memory usage numbers collected after boot don't give a
> >> representative picture of actual memory consumption on real workloads?
> >>
> >> What kind of memory consumption testing would you like to see?
> >
> > Well, 100kb or so is a teeny amount on virtually any machine. I'm
> > assuming the savings are (much) more significant once the machine gets
> > loaded up and doing work?
>
> So with clean kernel after boot we get 40 kb memory usage. With KASAN
> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
> is 25% overhead. This should approximately scale to any amounts of
> used slab memory. For example with 100 mb memory usage we would get
> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
> quarantine for better use-after-free detection). I can explicitly
> mention the overhead in %s in the changelog.

Could you elaborate on where that SLAB overhead comes from?

IIUC that's not for the shadow itself (since it's allocated up-front and
not accounted to SLAB), and that doesn't take into account the
quarantine, so what's eating that space?

Thanks,
Mark.

2018-06-29 14:24:44

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 1:26 PM, Luc Van Oostenryck
<[email protected]> wrote:
> On Fri, Jun 29, 2018 at 12:04:22PM +0100, Dave Martin wrote:
>>
>> Can sparse be hacked to identify pointer subtractions where the pointers
>> cannot be statically proved to point into the same allocation?

Re all the comments about finding all the places where we do pointer
subtraction/comparison:

I might be wrong, but I doubt you can easily do that with static analysis.

What we could do is to try to detect all such subtractions/comparisons
dynamically. The idea is to instrument all pointer/ulong
subtraction/comparison instructions and try to detect tags mismatch.
And then run some workload (e.g. syzkaller) to trigger more kernel
code. The question is how much false positives we would get, since I
imagine there would be a number of cases when we compare some random
ulongs.

2018-06-29 14:30:12

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 01:26:14PM +0200, Luc Van Oostenryck wrote:
> On Fri, Jun 29, 2018 at 12:04:22PM +0100, Dave Martin wrote:
> >
> > Can sparse be hacked to identify pointer subtractions where the pointers
> > cannot be statically proved to point into the same allocation?
>
> sparse only sees the (details of the) function it analyses and all
> visible declarations, nothing more.
>
> It would be more of a job for smatch, which does global analysis.
> But to identify such subtractions you must already have a (good)
> pointer alias analysis, which I don't think smatch does (but I may
> be wrong; Dan & smatch's ml added in CC).

That would be hard to manage. Maybe in a year from now...

Pointer math errors tend to get caught pretty quickly because they're on
the success path, so I don't imagine there are huge numbers of bugs.

regards,
dan carpenter


2018-06-29 14:42:23

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 3:01 PM, Mark Rutland <[email protected]> wrote:
> On Fri, Jun 29, 2018 at 02:45:08PM +0200, Andrey Konovalov wrote:
>> So with clean kernel after boot we get 40 kb memory usage. With KASAN
>> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
>> is 25% overhead. This should approximately scale to any amounts of
>> used slab memory. For example with 100 mb memory usage we would get
>> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
>> quarantine for better use-after-free detection). I can explicitly
>> mention the overhead in %s in the changelog.
>
> Could you elaborate on where that SLAB overhead comes from?
>
> IIUC that's not for the shadow itself (since it's allocated up-front and
> not accounted to SLAB), and that doesn't take into account the
> quarantine, so what's eating that space?

Redzones. KHWASAN doesn't need them since the next slab object is
marked with a different tag (with a high probability) and acts as a
redzone.

2018-06-29 15:10:37

by Luc Van Oostenryck

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 12:04:22PM +0100, Dave Martin wrote:
>
> Can sparse be hacked to identify pointer subtractions where the pointers
> cannot be statically proved to point into the same allocation?

sparse only sees the (details of the) function it analyses and all
visible declarations, nothing more.

It would be more of a job for smatch, which does global analysis.
But to identify such subtractions you must already have a (good)
pointer alias analysis, which I don't think smatch does (but I may
be wrong; Dan & smatch's ml added in CC).

-- Luc Van Oostenryck

2018-06-29 17:34:53

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 1:07 PM, Will Deacon <[email protected]> wrote:
> It might not seem sensible, but we could still be relying on this in the
> kernel and so this change would introduce a regression. I think we need
> a way to identify such pointer usage before these patches can seriously be
> considered for mainline inclusion.

Another point that I have here is that KHWASAN is a debugging tool, not
meant to be used in production. We're not trying to change the ABI or
anything like that (referring to the other HWASAN patchset). We can
fix up the non-obvious places where untagging is needed on a
case-by-case basis with additional patches as testing reveals them.

2018-06-30 03:12:49

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, 29 Jun 2018 14:45:08 +0200 Andrey Konovalov <[email protected]> wrote:

> >> What kind of memory consumption testing would you like to see?
> >
> > Well, 100kb or so is a teeny amount on virtually any machine. I'm
> > assuming the savings are (much) more significant once the machine gets
> > loaded up and doing work?
>
> So with clean kernel after boot we get 40 kb memory usage. With KASAN
> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
> is 25% overhead. This should approximately scale to any amounts of
> used slab memory. For example with 100 mb memory usage we would get
> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
> quarantine for better use-after-free detection). I can explicitly
> mention the overhead in %s in the changelog.
>
> If you think it makes sense, I can also make separate measurements
> with some workload. What kind of workload should I use?

Whatever workload people were running when they encountered problems
with KASAN memory consumption ;)

I dunno, something simple. `find / > /dev/null'?


2018-07-02 19:17:51

by Evgenii Stepanov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

Looking at a live Android device under load, slab (according to
/proc/meminfo) + kernel stack take 8-10% available RAM (~350MB).
Kasan's overhead of 2x - 3x on top of it is not insignificant.

On Fri, Jun 29, 2018 at 7:41 PM, Andrew Morton
<[email protected]> wrote:
> On Fri, 29 Jun 2018 14:45:08 +0200 Andrey Konovalov <[email protected]> wrote:
>
>> >> What kind of memory consumption testing would you like to see?
>> >
>> > Well, 100kb or so is a teeny amount on virtually any machine. I'm
>> > assuming the savings are (much) more significant once the machine gets
>> > loaded up and doing work?
>>
>> So with clean kernel after boot we get 40 kb memory usage. With KASAN
>> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
>> is 25% overhead. This should approximately scale to any amounts of
>> used slab memory. For example with 100 mb memory usage we would get
>> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
>> quarantine for better use-after-free detection). I can explicitly
>> mention the overhead in %s in the changelog.
>>
>> If you think it makes sense, I can also make separate measurements
>> with some workload. What kind of workload should I use?
>
> Whatever workload people were running when they encountered problems
> with KASAN memory consumption ;)
>
> I dunno, something simple. `find / > /dev/null'?
>

2018-07-02 19:22:14

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Mon, 2 Jul 2018 12:16:42 -0700 Evgenii Stepanov <[email protected]> wrote:

> On Fri, Jun 29, 2018 at 7:41 PM, Andrew Morton
> <[email protected]> wrote:
> > On Fri, 29 Jun 2018 14:45:08 +0200 Andrey Konovalov <[email protected]> wrote:
> >
> >> >> What kind of memory consumption testing would you like to see?
> >> >
> >> > Well, 100kb or so is a teeny amount on virtually any machine. I'm
> >> > assuming the savings are (much) more significant once the machine gets
> >> > loaded up and doing work?
> >>
> >> So with clean kernel after boot we get 40 kb memory usage. With KASAN
> >> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
> >> is 25% overhead. This should approximately scale to any amounts of
> >> used slab memory. For example with 100 mb memory usage we would get
> >> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
> >> quarantine for better use-after-free detection). I can explicitly
> >> mention the overhead in %s in the changelog.
> >>
> >> If you think it makes sense, I can also make separate measurements
> >> with some workload. What kind of workload should I use?
> >
> > Whatever workload people were running when they encountered problems
> > with KASAN memory consumption ;)
> >
> > I dunno, something simple. `find / > /dev/null'?
> >
>
> Looking at a live Android device under load, slab (according to
> /proc/meminfo) + kernel stack take 8-10% available RAM (~350MB).
> Kasan's overhead of 2x - 3x on top of it is not insignificant.
>

(top-posting repaired. Please don't)

For a debugging, not-for-production-use feature, that overhead sounds
quite acceptable to me. What problems is it known to cause?


2018-07-02 20:25:03

by Evgenii Stepanov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Mon, Jul 2, 2018 at 12:21 PM, Andrew Morton
<[email protected]> wrote:
> On Mon, 2 Jul 2018 12:16:42 -0700 Evgenii Stepanov <[email protected]> wrote:
>
>> On Fri, Jun 29, 2018 at 7:41 PM, Andrew Morton
>> <[email protected]> wrote:
>> > On Fri, 29 Jun 2018 14:45:08 +0200 Andrey Konovalov <[email protected]> wrote:
>> >
>> >> >> What kind of memory consumption testing would you like to see?
>> >> >
>> >> > Well, 100kb or so is a teeny amount on virtually any machine. I'm
>> >> > assuming the savings are (much) more significant once the machine gets
>> >> > loaded up and doing work?
>> >>
>> >> So with clean kernel after boot we get 40 kb memory usage. With KASAN
>> >> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
>> >> is 25% overhead. This should approximately scale to any amounts of
>> >> used slab memory. For example with 100 mb memory usage we would get
>> >> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
>> >> quarantine for better use-after-free detection). I can explicitly
>> >> mention the overhead in %s in the changelog.
>> >>
>> >> If you think it makes sense, I can also make separate measurements
>> >> with some workload. What kind of workload should I use?
>> >
>> > Whatever workload people were running when they encountered problems
>> > with KASAN memory consumption ;)
>> >
>> > I dunno, something simple. `find / > /dev/null'?
>> >
>>
>> Looking at a live Android device under load, slab (according to
>> /proc/meminfo) + kernel stack take 8-10% available RAM (~350MB).
>> Kasan's overhead of 2x - 3x on top of it is not insignificant.
>>
>
> (top-posting repaired. Please don't)
>
> For a debugging, not-for-production-use feature, that overhead sounds
> quite acceptable to me. What problems is it known to cause?

Not having this overhead enables near-production use - e.g. running a
kasan/khwasan kernel on a personal, daily-use device to catch bugs that
do not reproduce in a test configuration. These are the ones that often
cost the most engineering time to track down.

CPU overhead is bad, but generally tolerable. RAM is critical, in our
experience. Once it gets low enough, OOM-killer makes your life
miserable.

2018-07-02 20:32:21

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Mon, 2 Jul 2018 13:22:23 -0700 Evgenii Stepanov <[email protected]> wrote:

> On Mon, Jul 2, 2018 at 12:21 PM, Andrew Morton
> <[email protected]> wrote:
> > On Mon, 2 Jul 2018 12:16:42 -0700 Evgenii Stepanov <[email protected]> wrote:
> >
> >> On Fri, Jun 29, 2018 at 7:41 PM, Andrew Morton
> >> <[email protected]> wrote:
> >> > On Fri, 29 Jun 2018 14:45:08 +0200 Andrey Konovalov <[email protected]> wrote:
> >> >
> >> >> >> What kind of memory consumption testing would you like to see?
> >> >> >
> >> >> > Well, 100kb or so is a teeny amount on virtually any machine. I'm
> >> >> > assuming the savings are (much) more significant once the machine gets
> >> >> > loaded up and doing work?
> >> >>
> >> >> So with clean kernel after boot we get 40 kb memory usage. With KASAN
> >> >> it is ~120 kb, which is 200% overhead. With KHWASAN it's 50 kb, which
> >> >> is 25% overhead. This should approximately scale to any amounts of
> >> >> used slab memory. For example with 100 mb memory usage we would get
> >> >> +200 mb for KASAN and +25 mb with KHWASAN. (And KASAN also requires
> >> >> quarantine for better use-after-free detection). I can explicitly
> >> >> mention the overhead in %s in the changelog.
> >> >>
> >> >> If you think it makes sense, I can also make separate measurements
> >> >> with some workload. What kind of workload should I use?
> >> >
> >> > Whatever workload people were running when they encountered problems
> >> > with KASAN memory consumption ;)
> >> >
> >> > I dunno, something simple. `find / > /dev/null'?
> >> >
> >>
> >> Looking at a live Android device under load, slab (according to
> >> /proc/meminfo) + kernel stack take 8-10% available RAM (~350MB).
> >> Kasan's overhead of 2x - 3x on top of it is not insignificant.
> >>
> >
> > (top-posting repaired. Please don't)
> >
> > For a debugging, not-for-production-use feature, that overhead sounds
> > quite acceptable to me. What problems is it known to cause?
>
> Not having this overhead enables near-production use - e.g. running a
> kasan/khwasan kernel on a personal, daily-use device to catch bugs that
> do not reproduce in a test configuration. These are the ones that often
> cost the most engineering time to track down.
>
> CPU overhead is bad, but generally tolerable. RAM is critical, in our
> experience. Once it gets low enough, OOM-killer makes your life
> miserable.

OK, anecdotal experience works for me. But this is all stuff that
should have been in the changelog from day zero, please. It describes
the reason for the patchset's existence!


2018-07-02 20:35:07

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Jun 27, 2018 at 05:04:28PM -0700, Kostya Serebryany wrote:
> The problem is more significant on mobile devices than on desktop/server.
> I'd love to have [K]HWASAN on x86_64 as well, but it's less trivial since x86_64
> doesn't have an analog of aarch64's top-byte-ignore hardware feature.

Well, can we emulate it in software?

We've got 48 bits of virtual address space on x86. If we need all 8
bits, then that takes us down to 40 bits (39 bits for user and 39 bits
for kernel). My laptop only has 34 bits of physical memory, so could
we come up with a memory layout which works for me?

2018-07-02 23:40:54

by Evgenii Stepanov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Mon, Jul 2, 2018 at 1:33 PM, Matthew Wilcox <[email protected]> wrote:
> On Wed, Jun 27, 2018 at 05:04:28PM -0700, Kostya Serebryany wrote:
>> The problem is more significant on mobile devices than on desktop/server.
>> I'd love to have [K]HWASAN on x86_64 as well, but it's less trivial since x86_64
>> doesn't have an analog of aarch64's top-byte-ignore hardware feature.
>
> Well, can we emulate it in software?
>
> We've got 48 bits of virtual address space on x86. If we need all 8
> bits, then that takes us down to 40 bits (39 bits for user and 39 bits
> for kernel). My laptop only has 34 bits of physical memory, so could
> we come up with a memory layout which works for me?

Yes, probably.

We've tried this in userspace by mapping a file multiple times, but
that's very slow, likely because of the extra TLB pressure.
It should be possible to achieve better performance in the kernel with
some page table tricks (i.e. if we take top 8 bits out of 48, then
there would be only two second-level tables, and the top-level table
will look like [p1, p2, p1, p2, ...]). I'm not 100% sure if that would
work.

I don't think this should be part of this patchset, but it's good to
keep this in mind.
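
Absent hardware top-byte-ignore, the emulation discussed above would also need an explicit software untagging step wherever the raw address value matters. A minimal illustration (the names and the exact bit layout are illustrative, not taken from the patchset):

```c
/* Illustrative sketch only: store a tag in the top byte of a 64-bit
 * pointer and strip it again in software. On arm64 the hardware TBI
 * feature makes the strip unnecessary for loads/stores. */
#include <stdint.h>

#define TAG_SHIFT 56                      /* tag lives in bits 63:56 */
#define TAG_MASK  (0xffUL << TAG_SHIFT)

static inline uint8_t ptr_tag(const void *p)
{
	return (uint8_t)((uintptr_t)p >> TAG_SHIFT);
}

static inline void *untag_ptr(const void *p)
{
	/* Kernel pointers have their top bits set, so restore them by
	 * sign-extending from bit 55 (similar in spirit to arm64's
	 * untagged_addr()) rather than just clearing the tag byte. */
	int64_t v = (int64_t)((uintptr_t)p << 8) >> 8;

	return (void *)v;
}
```

Every instrumented memory access on an architecture without TBI would have to go through something like `untag_ptr()`, which is part of why the software-only scheme carries extra cost.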

2018-07-03 17:37:08

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Jun 29, 2018 at 06:36:10PM +0200, Andrey Konovalov wrote:
> On Fri, Jun 29, 2018 at 1:07 PM, Will Deacon <[email protected]> wrote:
> > It might not seem sensible, but we could still be relying on this in the
> > kernel and so this change would introduce a regression. I think we need
> > a way to identify such pointer usage before these patches can seriously be
> > considered for mainline inclusion.
>
> Another point that I have here is that KHWASAN is a debugging tool not
> meant to be used in production. We're not trying to change the ABI or
> anything like that (referring to the other HWASAN patchset). We can
> fix up the non-obvious places where untagging is needed on a case-by-case
> basis with additional patches when testing reveals them.

Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
patch set precisely because the lower overhead means it's suitable for
"near-production" use. So I don't think writing this off as a debugging
feature is the right approach, and we instead need to put effort into
analysing the impact of address tags on the kernel as a whole. Playing
whack-a-mole with subtle tag issues sounds like the worst possible outcome
for the long-term.

Will

2018-07-18 17:17:20

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Mon, Jul 2, 2018 at 10:30 PM, Andrew Morton
<[email protected]> wrote:
> OK, anecdotal experience works for me. But this is all stuff that
> should have been in the changelog from day zero, please. It describes
> the reason for the patchset's existence!

I will add all those points to the cover letter in v5.

On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon <[email protected]> wrote:
> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
> patch set precisely because the lower overhead means it's suitable for
> "near-production" use. So I don't think writing this off as a debugging
> feature is the right approach, and we instead need to put effort into
> analysing the impact of address tags on the kernel as a whole. Playing
> whack-a-mole with subtle tag issues sounds like the worst possible outcome
> for the long-term.

I don't see a way to find cases where pointer tags would matter
statically, so I've implemented the dynamic approach that I mentioned
above. I've instrumented all pointer comparisons/subtractions in an
LLVM compiler pass and used a kernel module that would print a bug
report whenever two pointers with different tags are being
compared/subtracted (ignoring comparisons with NULL pointers and with
pointers obtained by casting an error code to a pointer type). Then I
tried booting the kernel in QEMU and on an Odroid C2 board and I ran
syzkaller overnight.
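
A rough sketch of what such an instrumentation callback could look like (the function names and the error-pointer cutoff are illustrative, not taken from the actual compiler pass or kernel module):

```c
/* Illustrative sketch: called by hypothetical instrumentation on every
 * pointer comparison/subtraction; reports when the two operands carry
 * different top-byte tags, skipping NULL and ERR_PTR-style values. */
#include <stdint.h>
#include <stdio.h>

#define MAX_ERRNO 4095UL

static int is_reportable(uintptr_t a, uintptr_t b)
{
	/* Ignore comparisons with NULL pointers. */
	if (a == 0 || b == 0)
		return 0;
	/* Ignore error codes cast to pointers (ERR_PTR-style values). */
	if (a >= (uintptr_t)-MAX_ERRNO || b >= (uintptr_t)-MAX_ERRNO)
		return 0;
	/* Report only if the top-byte tags differ. */
	return (a >> 56) != (b >> 56);
}

void check_ptr_cmp(const void *a, const void *b)
{
	if (is_reportable((uintptr_t)a, (uintptr_t)b))
		fprintf(stderr, "pointers with different tags: %p vs %p\n",
			a, b);
}
```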

This yielded the following results.

======

The two places that look interesting are:

is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
is_kernel_rodata in mm/util.c

Here we compare a pointer against some fixed untagged values to make sure
that the pointer lies in a particular part of the kernel address
space. Since KHWASAN doesn't add tags to pointers that belong to the
rodata or vmalloc regions, this should work as is. To make sure, I've
added debug checks to those two functions that verify that the result
doesn't change whether we operate on pointers with or without
untagging.
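
The invariance being checked can be illustrated with a small sketch (the addresses and names below are made up for illustration, not the real arm64 layout):

```c
/* Illustrative sketch: recompute a range check like is_vmalloc_addr()
 * on the untagged pointer and confirm the result is unchanged. */
#include <stdint.h>

#define VMALLOC_START 0xffff000008000000UL	/* illustrative value */
#define VMALLOC_END   0xffff7dffbfff0000UL	/* illustrative value */

static uintptr_t untag(uintptr_t addr)
{
	/* Strip the top-byte tag by sign-extending from bit 55. */
	return (uintptr_t)(((int64_t)(addr << 8)) >> 8);
}

static int is_vmalloc_addr_sketch(uintptr_t addr)
{
	return addr >= VMALLOC_START && addr < VMALLOC_END;
}

static int is_vmalloc_addr_checked(uintptr_t addr)
{
	int tagged = is_vmalloc_addr_sketch(addr);
	int untagged = is_vmalloc_addr_sketch(untag(addr));

	/* KHWASAN leaves vmalloc/rodata pointers untagged, so the two
	 * results are expected to always agree; -1 flags a mismatch. */
	return tagged == untagged ? tagged : -1;
}
```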

======

A few other cases that don't look that interesting:

Comparing pointers to achieve a unique sorting order of the pointee objects
(e.g. sorting lock addresses before performing a double lock):

tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
pipe_double_lock in fs/pipe.c
unix_state_double_lock in net/unix/af_unix.c
lock_two_nondirectories in fs/inode.c
mutex_lock_double in kernel/events/core.c

ep_cmp_ffd in fs/eventpoll.c
fsnotify_compare_groups fs/notify/mark.c

Nothing needs to be done here, since the tags embedded into pointers
don't change, so the sorting order would still be unique.

Check that a pointer belongs to some particular allocation:

is_sibling_entry lib/radix-tree.c
object_is_on_stack in include/linux/sched/task_stack.h

Nothing needs to be done here either, since two pointers can only belong to
the same allocation if they have the same tag.

======

Will, Catalin, WDYT?

2018-07-25 13:45:25

by Vincenzo Frascino

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On 06/26/2018 02:15 PM, Andrey Konovalov wrote:

> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache, const void *object)
>
> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
> {
> - return kasan_kmalloc(cache, object, cache->object_size, flags);
> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
> + /*
> + * Cache constructor might use object's pointer value to
> + * initialize some of its fields.
> + */
> + cache->ctor(object);
>
This seems to break the kmem_cache_create() contract: "The @ctor is run
when new pages are allocated by the cache."
(https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)

Since there might be preexisting code relying on it, this could lead to
global side effects. Did you verify that this is not the case?

Another concern is performance, if we consider this solution suitable
for "near-production" use: with the current implementation you call the
ctor (where present) on an object multiple times, and this ends up
memsetting and repopulating the memory every time (e.g. inode.c:
inode_init_once). Do you know what the performance impact is?

--
Regards,
Vincenzo


2018-07-31 13:07:02

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On Wed, Jul 25, 2018 at 3:44 PM, Vincenzo Frascino@Foss
<[email protected]> wrote:
> On 06/26/2018 02:15 PM, Andrey Konovalov wrote:
>
>> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache,
>> const void *object)
>> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t
>> flags)
>> {
>> - return kasan_kmalloc(cache, object, cache->object_size, flags);
>> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
>> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
>> + /*
>> + * Cache constructor might use object's pointer value to
>> + * initialize some of its fields.
>> + */
>> + cache->ctor(object);
>>
> This seems to break the kmem_cache_create() contract: "The @ctor is run when
> new pages are allocated by the cache."
> (https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)
>
> Since there might be preexisting code relying on it, this could lead to
> global side effects. Did you verify that this is not the case?
>
> Another concern is performance, if we consider this solution suitable
> for "near-production" use: with the current implementation you call the
> ctor (where present) on an object multiple times, and this ends up memsetting
> and repopulating the memory every time (e.g. inode.c: inode_init_once). Do
> you know what the performance impact is?

We can assign tags to objects with constructors when a slab is
allocated and call the constructors once as usual. The downside is that
such an object would always have the same tag when it is reallocated, so
we won't catch use-after-frees. But that is probably something we'll
have to deal with if we're aiming for "near-production" use. I'll add
this change to v5, thanks!
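
A minimal sketch of that scheme, with hypothetical names standing in for the actual kasan hooks:

```c
/* Illustrative sketch: caches with a constructor get their tag once,
 * when the slab page is set up, so the ctor runs exactly once and sees
 * the final (tagged) pointer value; ctor-less caches get a fresh tag
 * on every allocation. */
#include <stdint.h>
#include <stdlib.h>

struct kmem_cache_sketch {
	void (*ctor)(void *);
};

static void *set_tag(void *p, uint8_t tag)
{
	return (void *)(((uintptr_t)p & ~(0xffUL << 56)) |
			((uintptr_t)tag << 56));
}

/* Called once per object when a new slab page is set up. */
static void *init_slab_obj(struct kmem_cache_sketch *c, void *object)
{
	if (c->ctor) {
		/* Tag first, then construct: the ctor may record the
		 * object's own (tagged) address in one of its fields.
		 * Avoid the reserved all-zeroes tag. */
		object = set_tag(object, (uint8_t)(rand() % 255 + 1));
		c->ctor(object);
	}
	return object;
}

/* Called on every allocation: objects with a ctor keep the tag they
 * were given at slab setup, so reallocations reuse the same tag. */
static void *slab_alloc(struct kmem_cache_sketch *c, void *object,
			uint8_t random_tag)
{
	return c->ctor ? object : set_tag(object, random_tag);
}
```

The trade-off discussed above is visible here: for ctor caches `slab_alloc()` never re-tags, so a use-after-free on such an object goes undetected.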

2018-07-31 13:23:57

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Jul 18, 2018 at 7:16 PM, Andrey Konovalov <[email protected]> wrote:
> On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon <[email protected]> wrote:
>> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
>> patch set precisely because the lower overhead means it's suitable for
>> "near-production" use. So I don't think writing this off as a debugging
>> feature is the right approach, and we instead need to put effort into
>> analysing the impact of address tags on the kernel as a whole. Playing
>> whack-a-mole with subtle tag issues sounds like the worst possible outcome
>> for the long-term.
>
> I don't see a way to find cases where pointer tags would matter
> statically, so I've implemented the dynamic approach that I mentioned
> above. I've instrumented all pointer comparisons/subtractions in an
> LLVM compiler pass and used a kernel module that would print a bug
> report whenever two pointers with different tags are being
> compared/subtracted (ignoring comparisons with NULL pointers and with
> pointers obtained by casting an error code to a pointer type). Then I
> tried booting the kernel in QEMU and on an Odroid C2 board and I ran
> syzkaller overnight.
>
> This yielded the following results.
>
> ======
>
> The two places that look interesting are:
>
> is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
> is_kernel_rodata in mm/util.c
>
> Here we compare a pointer against some fixed untagged values to make sure
> that the pointer lies in a particular part of the kernel address
> space. Since KHWASAN doesn't add tags to pointers that belong to the
> rodata or vmalloc regions, this should work as is. To make sure, I've
> added debug checks to those two functions that verify that the result
> doesn't change whether we operate on pointers with or without
> untagging.
>
> ======
>
> A few other cases that don't look that interesting:
>
> Comparing pointers to achieve a unique sorting order of the pointee objects
> (e.g. sorting lock addresses before performing a double lock):
>
> tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
> pipe_double_lock in fs/pipe.c
> unix_state_double_lock in net/unix/af_unix.c
> lock_two_nondirectories in fs/inode.c
> mutex_lock_double in kernel/events/core.c
>
> ep_cmp_ffd in fs/eventpoll.c
> fsnotify_compare_groups fs/notify/mark.c
>
> Nothing needs to be done here, since the tags embedded into pointers
> don't change, so the sorting order would still be unique.
>
> Check that a pointer belongs to some particular allocation:
>
> is_sibling_entry lib/radix-tree.c
> object_is_on_stack in include/linux/sched/task_stack.h
>
> Nothing needs to be done here either, since two pointers can only belong to
> the same allocation if they have the same tag.
>
> ======
>
> Will, Catalin, WDYT?

ping

2018-07-31 14:52:09

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation



On 07/31/2018 04:05 PM, Andrey Konovalov wrote:
> On Wed, Jul 25, 2018 at 3:44 PM, Vincenzo Frascino@Foss
> <[email protected]> wrote:
>> On 06/26/2018 02:15 PM, Andrey Konovalov wrote:
>>
>>> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache,
>>> const void *object)
>>> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t
>>> flags)
>>> {
>>> - return kasan_kmalloc(cache, object, cache->object_size, flags);
>>> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
>>> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
>>> + /*
>>> + * Cache constructor might use object's pointer value to
>>> + * initialize some of its fields.
>>> + */
>>> + cache->ctor(object);
>>>
>> This seams breaking the kmem_cache_create() contract: "The @ctor is run when
>> new pages are allocated by the cache."
>> (https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)
>>
>> Since there might be preexisting code relying on it, this could lead to
>> global side effects. Did you verify that this is not the case?
>>
>> Another concern is performance related if we consider this solution suitable
>> for "near-production", since with the current implementation you call the
>> ctor (where present) on an object multiple times and this ends up memsetting
>> and repopulating the memory every time (i.e. inode.c: inode_init_once). Do
>> you know what is the performance impact?
>
> We can assign tags to objects with constructors when a slab is
> allocated and call the constructors once as usual. The downside is that
> such an object would always have the same tag when it is reallocated, so
> we won't catch use-after-frees.

Actually, you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they come with ->ctors, but there
are a few without constructors.
We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.

As for non-SLAB_TYPESAFE_BY_RCU caches with constructors, it's probably ok to reinitialize and retag such objects.
I don't see how any code could rely on the current ->ctor() behavior in the non-SLAB_TYPESAFE_BY_RCU case,
unless it does something extremely stupid or weird.
But let's not do it now. If you care, you can do it later, with a separate patch, so we could just revert
it if anything goes wrong.



2018-07-31 15:05:52

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On Tue, Jul 31, 2018 at 4:50 PM, Andrey Ryabinin
<[email protected]> wrote:
>
>
> On 07/31/2018 04:05 PM, Andrey Konovalov wrote:
>> On Wed, Jul 25, 2018 at 3:44 PM, Vincenzo Frascino@Foss
>> <[email protected]> wrote:
>>> On 06/26/2018 02:15 PM, Andrey Konovalov wrote:
>>>
>>>> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache,
>>>> const void *object)
>>>> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t
>>>> flags)
>>>> {
>>>> - return kasan_kmalloc(cache, object, cache->object_size, flags);
>>>> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
>>>> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
>>>> + /*
>>>> + * Cache constructor might use object's pointer value to
>>>> + * initialize some of its fields.
>>>> + */
>>>> + cache->ctor(object);
>>>>
>>> This seams breaking the kmem_cache_create() contract: "The @ctor is run when
>>> new pages are allocated by the cache."
>>> (https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)
>>>
>>> Since there might be preexisting code relying on it, this could lead to
>>> global side effects. Did you verify that this is not the case?
>>>
>>> Another concern is performance related if we consider this solution suitable
>>> for "near-production", since with the current implementation you call the
>>> ctor (where present) on an object multiple times and this ends up memsetting
>>> and repopulating the memory every time (i.e. inode.c: inode_init_once). Do
>>> you know what is the performance impact?
>>
>> We can assign tags to objects with constructors when a slab is
>> allocated and call constructors once as usual. The downside is that
>> such object would always have the same tag when it is reallocated, so
>> we won't catch use-after-frees.
>
> Actually, you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they come with ->ctors, but there
> are a few without constructors.
> We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.

Somewhat offtopic, but I can't understand how SLAB_TYPESAFE_BY_RCU
slabs can be useful without ctors or at least a memset(0). Objects in
such slabs need to be type-stable, but I can't understand how it's
possible to establish type stability without a ctor... Are these bugs?
Or am I missing something subtle? What would be a canonical usage of a
SLAB_TYPESAFE_BY_RCU slab without a ctor?

2018-07-31 15:22:40

by Andrey Konovalov

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On Tue, Jul 31, 2018 at 4:50 PM, Andrey Ryabinin
<[email protected]> wrote:
> On 07/31/2018 04:05 PM, Andrey Konovalov wrote:
>> We can assign tags to objects with constructors when a slab is
>> allocated and call constructors once as usual. The downside is that
>> such object would always have the same tag when it is reallocated, so
>> we won't catch use-after-frees.
>
> Actually, you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they come with ->ctors, but there
> are a few without constructors.
> We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.
>
> As for non-SLAB_TYPESAFE_BY_RCU caches with constructors, it's probably ok to reinitialize and retag such objects.
> I don't see how any code could rely on the current ->ctor() behavior in the non-SLAB_TYPESAFE_BY_RCU case,
> unless it does something extremely stupid or weird.
> But let's not do it now. If you care, you can do it later, with a separate patch, so we could just revert
> it if anything goes wrong.

OK, I'll do it then when there's either a constructor or the slab is
SLAB_TYPESAFE_BY_RCU.

Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On Tue, 31 Jul 2018, Dmitry Vyukov wrote:

> > Actually you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they are with ->ctors but there
> > are few without constructors.
> > We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.
>
> Somewhat offtopic, but I can't understand how SLAB_TYPESAFE_BY_RCU
> slabs can be useful without ctors or at least a memset(0). Objects in
> such slabs need to be type-stable, but I can't understand how it's
> possible to establish type stability without a ctor... Are these bugs?
> Or am I missing something subtle? What would be a canonical usage of a
> SLAB_TYPESAFE_BY_RCU slab without a ctor?

True, that sounds fishy. Would someone post a list of the SLAB_TYPESAFE_BY_RCU
slabs without ctors?


2018-07-31 16:05:45

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On Tue, Jul 31, 2018 at 5:38 PM, Christopher Lameter <[email protected]> wrote:
> On Tue, 31 Jul 2018, Dmitry Vyukov wrote:
>
>> > Actually you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they are with ->ctors but there
>> > are few without constructors.
>> > We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.
>>
>> Somewhat offtopic, but I can't understand how SLAB_TYPESAFE_BY_RCU
>> slabs can be useful without ctors or at least memset(0). Objects in
>> such slabs need to be type-stable, but I can't understand how it's
>> possible to establish type stability without a ctor... Are these bugs?
>> Or I am missing something subtle? What would be a canonical usage of
>> SLAB_TYPESAFE_BY_RCU slab without a ctor?
>
> True, that sounds fishy. Would someone post a list of the SLAB_TYPESAFE_BY_RCU
> slabs without ctors?

https://elixir.bootlin.com/linux/latest/source/fs/jbd2/journal.c#L2395
https://elixir.bootlin.com/linux/latest/source/fs/kernfs/mount.c#L415
https://elixir.bootlin.com/linux/latest/source/net/netfilter/nf_conntrack_core.c#L2065
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/i915/i915_gem.c#L5501
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/i915/selftests/mock_gem_device.c#L212
https://elixir.bootlin.com/linux/latest/source/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c#L1131

Also these in proto structs:
https://elixir.bootlin.com/linux/latest/source/net/dccp/ipv4.c#L959
https://elixir.bootlin.com/linux/latest/source/net/dccp/ipv6.c#L1048
https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_ipv4.c#L2461
https://elixir.bootlin.com/linux/latest/source/net/ipv6/tcp_ipv6.c#L1980
https://elixir.bootlin.com/linux/latest/source/net/llc/af_llc.c#L145
https://elixir.bootlin.com/linux/latest/source/net/smc/af_smc.c#L105

They are later created in net/core/sock.c without a ctor.

2018-07-31 16:06:01

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation



On 07/31/2018 06:03 PM, Dmitry Vyukov wrote:
> On Tue, Jul 31, 2018 at 4:50 PM, Andrey Ryabinin
> <[email protected]> wrote:
>>
>>
>> On 07/31/2018 04:05 PM, Andrey Konovalov wrote:
>>> On Wed, Jul 25, 2018 at 3:44 PM, Vincenzo Frascino@Foss
>>> <[email protected]> wrote:
>>>> On 06/26/2018 02:15 PM, Andrey Konovalov wrote:
>>>>
>>>>> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache,
>>>>> const void *object)
>>>>> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t
>>>>> flags)
>>>>> {
>>>>> - return kasan_kmalloc(cache, object, cache->object_size, flags);
>>>>> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
>>>>> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
>>>>> + /*
>>>>> + * Cache constructor might use object's pointer value to
>>>>> + * initialize some of its fields.
>>>>> + */
>>>>> + cache->ctor(object);
>>>>>
>>>> This seams breaking the kmem_cache_create() contract: "The @ctor is run when
>>>> new pages are allocated by the cache."
>>>> (https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)
>>>>
>>>> Since there might be preexisting code relying on it, this could lead to
>>>> global side effects. Did you verify that this is not the case?
>>>>
>>>> Another concern is performance related if we consider this solution suitable
>>>> for "near-production", since with the current implementation you call the
>>>> ctor (where present) on an object multiple times and this ends up memsetting
>>>> and repopulating the memory every time (i.e. inode.c: inode_init_once). Do
>>>> you know what is the performance impact?
>>>
>>> We can assign tags to objects with constructors when a slab is
>>> allocated and call constructors once as usual. The downside is that
>>> such object would always have the same tag when it is reallocated, so
>>> we won't catch use-after-frees.
>>
>> Actually you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they are with ->ctors but there
>> are few without constructors.
>> We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.
>
> Somewhat offtopic, but I can't understand how SLAB_TYPESAFE_BY_RCU
> slabs can be useful without ctors or at least memset(0). Objects in
> such slabs need to be type-stable, but I can't understand how it's
> possible to establish type stability without a ctor... Are these bugs?

Yeah, I'm puzzled by this too. However, I think it's hard but possible to make it work, at least in theory.
There must be an initializer, which consists of two parts:
a) initialize the object's fields
b) expose the object to the world (add it to a list or something like that)

Part (a) must somehow be ok to race with another cpu which might already be using the object.
Part (b) must use e.g. barriers to make sure that racy users will see the previously initialized fields.
Racy users must have the pairing barrier, of course.

But it sounds fishy, and very easy to fuck up. I won't be surprised if every single SLAB_TYPESAFE_BY_RCU user
without ->ctor is bogus. It certainly would be better to convert those to use ->ctor.
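
The two-part initializer described above could be sketched with C11 atomics standing in for the kernel's barrier primitives (illustrative only, and exactly as fragile as noted: the field writes in part (a) still race with readers on object reuse):

```c
/* Illustrative sketch: (a) initialize fields, then (b) publish with a
 * release store so racing readers that acquire-load the "live" flag
 * observe the initialized fields. */
#include <stdatomic.h>
#include <stdint.h>

struct obj {
	uint64_t   field;
	atomic_int live;	/* publication flag */
};

static void obj_init(struct obj *o, uint64_t val)
{
	o->field = val;				/* (a) init fields */
	atomic_store_explicit(&o->live, 1,	/* (b) publish     */
			      memory_order_release);
}

static int obj_read(struct obj *o, uint64_t *out)
{
	/* Pairing acquire: only trust ->field once ->live is seen set. */
	if (!atomic_load_explicit(&o->live, memory_order_acquire))
		return 0;
	*out = o->field;
	return 1;
}
```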

Such caches seem to be used by the networking subsystem in proto_register():

	prot->slab = kmem_cache_create_usercopy(prot->name,
					prot->obj_size, 0,
					SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT |
					prot->slab_flags,
					prot->useroffset, prot->usersize,
					NULL);

And certain protocols specify SLAB_TYPESAFE_BY_RCU in ->slab_flags, such as:
llc_proto, smc_proto, smc_proto6, tcp_prot, tcpv6_prot, dccp_v6_prot, dccp_v4_prot.


Also nf_conntrack_cachep, kernfs_node_cache, jbd2_journal_head_cache and i915_request cache.



2018-07-31 16:11:44

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation

On Tue, Jul 31, 2018 at 6:04 PM, Andrey Ryabinin
<[email protected]> wrote:
>>>>>> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache,
>>>>>> const void *object)
>>>>>> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t
>>>>>> flags)
>>>>>> {
>>>>>> - return kasan_kmalloc(cache, object, cache->object_size, flags);
>>>>>> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
>>>>>> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
>>>>>> + /*
>>>>>> + * Cache constructor might use object's pointer value to
>>>>>> + * initialize some of its fields.
>>>>>> + */
>>>>>> + cache->ctor(object);
>>>>>>
>>>>> This seams breaking the kmem_cache_create() contract: "The @ctor is run when
>>>>> new pages are allocated by the cache."
>>>>> (https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)
>>>>>
>>>>> Since there might be preexisting code relying on it, this could lead to
>>>>> global side effects. Did you verify that this is not the case?
>>>>>
>>>>> Another concern is performance related if we consider this solution suitable
>>>>> for "near-production", since with the current implementation you call the
>>>>> ctor (where present) on an object multiple times and this ends up memsetting
>>>>> and repopulating the memory every time (i.e. inode.c: inode_init_once). Do
>>>>> you know what is the performance impact?
>>>>
>>>> We can assign tags to objects with constructors when a slab is
>>>> allocated and call constructors once as usual. The downside is that
>>>> such object would always have the same tag when it is reallocated, so
>>>> we won't catch use-after-frees.
>>>
>>> Actually you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they are with ->ctors but there
>>> are few without constructors.
>>> We can't reinitialize or even retag them. The latter will definitely cause false-positive use-after-free reports.
>>
>> Somewhat offtopic, but I can't understand how SLAB_TYPESAFE_BY_RCU
>> slabs can be useful without ctors or at least memset(0). Objects in
>> such slabs need to be type-stable, but I can't understand how it's
>> possible to establish type stability without a ctor... Are these bugs?
>
> Yeah, I'm puzzled by this too. However, I think it's hard but possible to make it work, at least in theory.
> There must be an initializer, which consists of two parts:
> a) initialize the object's fields
> b) expose the object to the world (add it to a list or something like that)
>
> Part (a) must somehow be ok to race with another cpu which might already be using the object.
> Part (b) must use e.g. barriers to make sure that racy users will see the previously initialized fields.
> Racy users must have the pairing barrier, of course.
>
> But it sounds fishy, and very easy to fuck up.


Agree on both fronts: theoretically possible but easy to fuck up. Even
if it works, the complexity of the code must be brain-damaging, and there
are unlikely to be good reasons not to just be more explicit and use a ctor.


> I won't be surprised if every single one SLAB_TYPESAFE_BY_RCU user
> without ->ctor is bogus. It certainly would be better to convert those to use ->ctor.

I have another hypothesis: they are not bogus, just don't need
SLAB_TYPESAFE_BY_RCU :)


> Such caches seem to be used by the networking subsystem in proto_register():
>
>	prot->slab = kmem_cache_create_usercopy(prot->name,
>					prot->obj_size, 0,
>					SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT |
>					prot->slab_flags,
>					prot->useroffset, prot->usersize,
>					NULL);
>
> And certain protocols specify SLAB_TYPESAFE_BY_RCU in ->slab_flags, such as:
> llc_proto, smc_proto, smc_proto6, tcp_prot, tcpv6_prot, dccp_v6_prot, dccp_v4_prot.
>
>
> Also nf_conntrack_cachep, kernfs_node_cache, jbd2_journal_head_cache and i915_request cache.
>
>

2018-07-31 16:20:03

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH v4 13/17] khwasan: add hooks implementation



On 07/31/2018 07:08 PM, Dmitry Vyukov wrote:
> On Tue, Jul 31, 2018 at 6:04 PM, Andrey Ryabinin
> <[email protected]> wrote:
>>>>>>> @@ -325,18 +341,41 @@ void kasan_init_slab_obj(struct kmem_cache *cache,
>>>>>>> const void *object)
>>>>>>> void *kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t
>>>>>>> flags)
>>>>>>> {
>>>>>>> - return kasan_kmalloc(cache, object, cache->object_size, flags);
>>>>>>> + object = kasan_kmalloc(cache, object, cache->object_size, flags);
>>>>>>> + if (IS_ENABLED(CONFIG_KASAN_HW) && unlikely(cache->ctor)) {
>>>>>>> + /*
>>>>>>> + * Cache constructor might use object's pointer value to
>>>>>>> + * initialize some of its fields.
>>>>>>> + */
>>>>>>> + cache->ctor(object);
>>>>>>>
>>>>>> This seams breaking the kmem_cache_create() contract: "The @ctor is run when
>>>>>> new pages are allocated by the cache."
>>>>>> (https://elixir.bootlin.com/linux/v3.7/source/mm/slab_common.c#L83)
>>>>>>
>>>>>> Since there might be preexisting code relying on it, this could lead to
>>>>>> global side effects. Did you verify that this is not the case?
>>>>>>
>>>>>> Another concern is performance related if we consider this solution suitable
>>>>>> for "near-production", since with the current implementation you call the
>>>>>> ctor (where present) on an object multiple times and this ends up memsetting
>>>>>> and repopulating the memory every time (i.e. inode.c: inode_init_once). Do
>>>>>> you know what is the performance impact?
>>>>>
>>>>> We can assign tags to objects with constructors when a slab is
>>>>> allocated and call constructors once as usual. The downside is that
>>>>> such object would always have the same tag when it is reallocated, so
>>>>> we won't catch use-after-frees.
>>>>
>>>> Actually you should do this for SLAB_TYPESAFE_BY_RCU slabs. Usually they come with ->ctors, but there
>>>> are a few without constructors.
>>>> We can't reinitialize or even retag them; the latter would definitely cause false-positive use-after-free reports.
>>>
>>> Somewhat offtopic, but I can't understand how SLAB_TYPESAFE_BY_RCU
>>> slabs can be useful without ctors or at least memset(0). Objects in
>>> such slabs need to be type-stable, but I can't understand how it's
>>> possible to establish type stability without a ctor... Are these bugs?
>>
>> Yeah, I'm puzzled by this too. However, I think it's hard but possible to make it work, at least in theory.
>> There must be an initializer, which consists of two parts:
>> a) initialize the object's fields
>> b) expose the object to the world (add it to a list or something like that)
>>
>> Part (a) must somehow be ok to race with another cpu which might already be using the object.
>> Part (b) must use e.g. barriers to make sure that racy users will see the previously initialized fields.
>> Racy users must have a pairing barrier of course.
>>
>> But it sounds fishy, and very easy to fuck up.
>
>
> Agree on both fronts: theoretically possible but easy to fuck up. Even
> if it works, complexity of the code should be brain damaging and there
> are unlikely good reasons to just not be more explicit and use a ctor.
>
>
>> I won't be surprised if every single SLAB_TYPESAFE_BY_RCU user
>> without ->ctor is bogus. It would certainly be better to convert those to use ->ctor.
>
> I have another hypothesis: they are not bogus, just don't need
> SLAB_TYPESAFE_BY_RCU :)
>

I'd call this a bug too.

2018-08-01 16:37:41

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

Hi Andrey,

On Tue, Jul 31, 2018 at 03:22:13PM +0200, Andrey Konovalov wrote:
> On Wed, Jul 18, 2018 at 7:16 PM, Andrey Konovalov <[email protected]> wrote:
> > On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon <[email protected]> wrote:
> >> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
> >> patch set precisely because the lower overhead means it's suitable for
> >> "near-production" use. So I don't think writing this off as a debugging
> >> feature is the right approach, and we instead need to put effort into
> >> analysing the impact of address tags on the kernel as a whole. Playing
> >> whack-a-mole with subtle tag issues sounds like the worst possible outcome
> >> for the long-term.
> >
> > I don't see a way to find cases where pointer tags would matter
> > statically, so I've implemented the dynamic approach that I mentioned
> > above. I've instrumented all pointer comparisons/subtractions in an
> > LLVM compiler pass and used a kernel module that would print a bug
> > report whenever two pointers with different tags are being
> > compared/subtracted (ignoring comparisons with NULL pointers and with
> > pointers obtained by casting an error code to a pointer type). Then I
> > tried booting the kernel in QEMU and on an Odroid C2 board and I ran
> > syzkaller overnight.
> >
> > This yielded the following results.
> >
> > ======
> >
> > The two places that look interesting are:
> >
> > is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
> > is_kernel_rodata in mm/util.c
> >
> > Here we compare a pointer with some fixed untagged values to make sure
> > that the pointer lies in a particular part of the kernel address
> > space. Since KHWASAN doesn't add tags to pointers that belong to
> > rodata or vmalloc regions, this should work as is. To make sure I've
> > added debug checks to those two functions that check that the result
> > doesn't change whether we operate on pointers with or without
> > untagging.
> >
> > ======
> >
> > A few other cases that don't look that interesting:
> >
> > Comparing pointers to achieve unique sorting order of pointee objects
> > (e.g. sorting locks addresses before performing a double lock):
> >
> > tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
> > pipe_double_lock in fs/pipe.c
> > unix_state_double_lock in net/unix/af_unix.c
> > lock_two_nondirectories in fs/inode.c
> > mutex_lock_double in kernel/events/core.c
> >
> > ep_cmp_ffd in fs/eventpoll.c
> > fsnotify_compare_groups fs/notify/mark.c
> >
> > Nothing needs to be done here, since the tags embedded into pointers
> > don't change, so the sorting order would still be unique.
> >
> > Check that a pointer belongs to some particular allocation:
> >
> > is_sibling_entry lib/radix-tree.c
> > object_is_on_stack in include/linux/sched/task_stack.h
> >
> > Nothing needs to be here either, since two pointers can only belong to
> > the same allocation if they have the same tag.
> >
> > ======
> >
> > Will, Catalin, WDYT?
>
> ping

Thanks for tracking these cases down and going through each of them. The
obvious follow-up question is: how do we ensure that we keep on top of
this in mainline? Are you going to repeat your experiment at every kernel
release or every -rc or something else? I really can't see how we can
maintain this in the long run, especially given that the coverage we have
is only dynamic -- do you have an idea of how much coverage you're actually
getting for, say, a defconfig+modules build?

I'd really like to enable pointer tagging in the kernel, I'm just still
failing to see how we can do it in a controlled manner where we can reason
about the semantic changes using something other than a best-effort,
case-by-case basis which is likely to be fragile and error-prone.
Unfortunately, if that's all we have, then this gets relegated to a
debug feature, which sort of defeats the point in my opinion.

Will

2018-08-01 17:50:02

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
> Hi Andrey,
>
> On Tue, Jul 31, 2018 at 03:22:13PM +0200, Andrey Konovalov wrote:
>> On Wed, Jul 18, 2018 at 7:16 PM, Andrey Konovalov <[email protected]> wrote:
>> > On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon <[email protected]> wrote:
>> >> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
>> >> patch set precisely because the lower overhead means it's suitable for
>> >> "near-production" use. So I don't think writing this off as a debugging
>> >> feature is the right approach, and we instead need to put effort into
>> >> analysing the impact of address tags on the kernel as a whole. Playing
>> >> whack-a-mole with subtle tag issues sounds like the worst possible outcome
>> >> for the long-term.
>> >
>> > I don't see a way to find cases where pointer tags would matter
>> > statically, so I've implemented the dynamic approach that I mentioned
>> > above. I've instrumented all pointer comparisons/subtractions in an
>> > LLVM compiler pass and used a kernel module that would print a bug
>> > report whenever two pointers with different tags are being
>> > compared/subtracted (ignoring comparisons with NULL pointers and with
>> > pointers obtained by casting an error code to a pointer type). Then I
>> > tried booting the kernel in QEMU and on an Odroid C2 board and I ran
>> > syzkaller overnight.
>> >
>> > This yielded the following results.
>> >
>> > ======
>> >
>> > The two places that look interesting are:
>> >
>> > is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
>> > is_kernel_rodata in mm/util.c
>> >
>> > Here we compare a pointer with some fixed untagged values to make sure
>> > that the pointer lies in a particular part of the kernel address
>> > space. Since KHWASAN doesn't add tags to pointers that belong to
>> > rodata or vmalloc regions, this should work as is. To make sure I've
>> > added debug checks to those two functions that check that the result
>> > doesn't change whether we operate on pointers with or without
>> > untagging.
>> >
>> > ======
>> >
>> > A few other cases that don't look that interesting:
>> >
>> > Comparing pointers to achieve unique sorting order of pointee objects
>> > (e.g. sorting locks addresses before performing a double lock):
>> >
>> > tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
>> > pipe_double_lock in fs/pipe.c
>> > unix_state_double_lock in net/unix/af_unix.c
>> > lock_two_nondirectories in fs/inode.c
>> > mutex_lock_double in kernel/events/core.c
>> >
>> > ep_cmp_ffd in fs/eventpoll.c
>> > fsnotify_compare_groups fs/notify/mark.c
>> >
>> > Nothing needs to be done here, since the tags embedded into pointers
>> > don't change, so the sorting order would still be unique.
>> >
>> > Check that a pointer belongs to some particular allocation:
>> >
>> > is_sibling_entry lib/radix-tree.c
>> > object_is_on_stack in include/linux/sched/task_stack.h
>> >
>> > Nothing needs to be here either, since two pointers can only belong to
>> > the same allocation if they have the same tag.
>> >
>> > ======
>> >
>> > Will, Catalin, WDYT?
>>
>> ping
>
> Thanks for tracking these cases down and going through each of them. The
> obvious follow-up question is: how do we ensure that we keep on top of
> this in mainline? Are you going to repeat your experiment at every kernel
> release or every -rc or something else? I really can't see how we can
> maintain this in the long run, especially given that the coverage we have
> is only dynamic -- do you have an idea of how much coverage you're actually
> getting for, say, a defconfig+modules build?
>
> I'd really like to enable pointer tagging in the kernel, I'm just still
> failing to see how we can do it in a controlled manner where we can reason
> about the semantic changes using something other than a best-effort,
> case-by-case basis which is likely to be fragile and error-prone.
> Unfortunately, if that's all we have, then this gets relegated to a
> debug feature, which sort of defeats the point in my opinion.

Well, in some cases there is no other way than resorting to dynamic testing.
How do we ensure that the kernel does not dereference NULL pointers, does
not access objects after free or out of bounds? We don't, and, yes, it's a
constant maintenance burden resolved via dynamic testing.
In some sense HWASAN is better here because it's like, say, LOCKDEP:
it's enabled only when one does dynamic testing, and one collects,
analyzes and fixes everything that pops up. Any false positives will
fail loudly (as opposed to, say, silent memory corruptions due to
use-after-frees), so they will simply be among the first things fixed
when the tool is applied.

2018-08-02 11:11:44

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
> > On Tue, Jul 31, 2018 at 03:22:13PM +0200, Andrey Konovalov wrote:
> >> On Wed, Jul 18, 2018 at 7:16 PM, Andrey Konovalov <[email protected]> wrote:
> >> > On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon <[email protected]> wrote:
> >> >> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
> >> >> patch set precisely because the lower overhead means it's suitable for
> >> >> "near-production" use. So I don't think writing this off as a debugging
> >> >> feature is the right approach, and we instead need to put effort into
> >> >> analysing the impact of address tags on the kernel as a whole. Playing
> >> >> whack-a-mole with subtle tag issues sounds like the worst possible outcome
> >> >> for the long-term.
> >> >
> >> > I don't see a way to find cases where pointer tags would matter
> >> > statically, so I've implemented the dynamic approach that I mentioned
> >> > above. I've instrumented all pointer comparisons/subtractions in an
> >> > LLVM compiler pass and used a kernel module that would print a bug
> >> > report whenever two pointers with different tags are being
> >> > compared/subtracted (ignoring comparisons with NULL pointers and with
> >> > pointers obtained by casting an error code to a pointer type). Then I
> >> > tried booting the kernel in QEMU and on an Odroid C2 board and I ran
> >> > syzkaller overnight.
> >> >
> >> > This yielded the following results.
> >> >
> >> > ======
> >> >
> >> > The two places that look interesting are:
> >> >
> >> > is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
> >> > is_kernel_rodata in mm/util.c
> >> >
> >> > Here we compare a pointer with some fixed untagged values to make sure
> >> > that the pointer lies in a particular part of the kernel address
> >> > space. Since KHWASAN doesn't add tags to pointers that belong to
> >> > rodata or vmalloc regions, this should work as is. To make sure I've
> >> > added debug checks to those two functions that check that the result
> >> > doesn't change whether we operate on pointers with or without
> >> > untagging.
> >> >
> >> > ======
> >> >
> >> > A few other cases that don't look that interesting:
> >> >
> >> > Comparing pointers to achieve unique sorting order of pointee objects
> >> > (e.g. sorting locks addresses before performing a double lock):
> >> >
> >> > tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
> >> > pipe_double_lock in fs/pipe.c
> >> > unix_state_double_lock in net/unix/af_unix.c
> >> > lock_two_nondirectories in fs/inode.c
> >> > mutex_lock_double in kernel/events/core.c
> >> >
> >> > ep_cmp_ffd in fs/eventpoll.c
> >> > fsnotify_compare_groups fs/notify/mark.c
> >> >
> >> > Nothing needs to be done here, since the tags embedded into pointers
> >> > don't change, so the sorting order would still be unique.
> >> >
> >> > Check that a pointer belongs to some particular allocation:
> >> >
> >> > is_sibling_entry lib/radix-tree.c
> >> > object_is_on_stack in include/linux/sched/task_stack.h
> >> >
> >> > Nothing needs to be here either, since two pointers can only belong to
> >> > the same allocation if they have the same tag.
> >> >
> >> > ======
> >> >
> >> > Will, Catalin, WDYT?
> >>
> >> ping
> >
> > Thanks for tracking these cases down and going through each of them. The
> > obvious follow-up question is: how do we ensure that we keep on top of
> > this in mainline? Are you going to repeat your experiment at every kernel
> > release or every -rc or something else? I really can't see how we can
> > maintain this in the long run, especially given that the coverage we have
> > is only dynamic -- do you have an idea of how much coverage you're actually
> > getting for, say, a defconfig+modules build?
> >
> > I'd really like to enable pointer tagging in the kernel, I'm just still
> > failing to see how we can do it in a controlled manner where we can reason
> > about the semantic changes using something other than a best-effort,
> > case-by-case basis which is likely to be fragile and error-prone.
> > Unfortunately, if that's all we have, then this gets relegated to a
> > debug feature, which sort of defeats the point in my opinion.
>
> Well, in some cases there is no other way as resorting to dynamic testing.
> How do we ensure that kernel does not dereference NULL pointers, does
> not access objects after free or out of bounds?

We should not confuse software bugs (like NULL pointer dereference) with
unexpected software behaviour introduced by khwasan where pointers no
longer represent only an address range (e.g. calling find_vmap_area())
but rather an address and a tag. Parts of the kernel rely on pointers
being just address ranges.

It's the latter that we'd like to identify more easily and avoid subtle
bugs or change in behaviour when running correctly written code.

> And, yes, it's
> constant maintenance burden resolved via dynamic testing.
> In some sense HWASAN is better in this regard because it's like, say,
> LOCKDEP in this regard. It's enabled only when one does dynamic
> testing and collect, analyze and fix everything that pops up. Any
> false positives will fail loudly (as opposed to, say, silent memory
> corruptions due to use-after-frees), so any false positives will be
> just first things to fix during the tool application.

Again, you are talking about the bugs that khwasan would discover. We
don't deny its value and false positives are acceptable here.

However, not untagging a pointer when converting to long may have
side-effects in some cases and I consider these bugs introduced by the
khwasan support rather than bugs in the original kernel code. Ideally
we'd need some tooling on top of khwasan to detect such shortcomings but
I'm not sure we can do this statically, as Andrey already mentioned. For
__user pointers, things are slightly better as we can detect the
conversion either with sparse (modified) or some LLVM changes.

--
Catalin

2018-08-02 11:38:26

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Thu, Aug 2, 2018 at 1:10 PM, Catalin Marinas <[email protected]> wrote:
> On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
>> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
>> > On Tue, Jul 31, 2018 at 03:22:13PM +0200, Andrey Konovalov wrote:
>> >> On Wed, Jul 18, 2018 at 7:16 PM, Andrey Konovalov <[email protected]> wrote:
>> >> > On Tue, Jul 3, 2018 at 7:36 PM, Will Deacon <[email protected]> wrote:
>> >> >> Hmm, but elsewhere in this thread, Evgenii is motivating the need for this
>> >> >> patch set precisely because the lower overhead means it's suitable for
>> >> >> "near-production" use. So I don't think writing this off as a debugging
>> >> >> feature is the right approach, and we instead need to put effort into
>> >> >> analysing the impact of address tags on the kernel as a whole. Playing
>> >> >> whack-a-mole with subtle tag issues sounds like the worst possible outcome
>> >> >> for the long-term.
>> >> >
>> >> > I don't see a way to find cases where pointer tags would matter
>> >> > statically, so I've implemented the dynamic approach that I mentioned
>> >> > above. I've instrumented all pointer comparisons/subtractions in an
>> >> > LLVM compiler pass and used a kernel module that would print a bug
>> >> > report whenever two pointers with different tags are being
>> >> > compared/subtracted (ignoring comparisons with NULL pointers and with
>> >> > pointers obtained by casting an error code to a pointer type). Then I
>> >> > tried booting the kernel in QEMU and on an Odroid C2 board and I ran
>> >> > syzkaller overnight.
>> >> >
>> >> > This yielded the following results.
>> >> >
>> >> > ======
>> >> >
>> >> > The two places that look interesting are:
>> >> >
>> >> > is_vmalloc_addr in include/linux/mm.h (already mentioned by Catalin)
>> >> > is_kernel_rodata in mm/util.c
>> >> >
>> >> > Here we compare a pointer with some fixed untagged values to make sure
>> >> > that the pointer lies in a particular part of the kernel address
>> >> > space. Since KHWASAN doesn't add tags to pointers that belong to
>> >> > rodata or vmalloc regions, this should work as is. To make sure I've
>> >> > added debug checks to those two functions that check that the result
>> >> > doesn't change whether we operate on pointers with or without
>> >> > untagging.
>> >> >
>> >> > ======
>> >> >
>> >> > A few other cases that don't look that interesting:
>> >> >
>> >> > Comparing pointers to achieve unique sorting order of pointee objects
>> >> > (e.g. sorting locks addresses before performing a double lock):
>> >> >
>> >> > tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
>> >> > pipe_double_lock in fs/pipe.c
>> >> > unix_state_double_lock in net/unix/af_unix.c
>> >> > lock_two_nondirectories in fs/inode.c
>> >> > mutex_lock_double in kernel/events/core.c
>> >> >
>> >> > ep_cmp_ffd in fs/eventpoll.c
>> >> > fsnotify_compare_groups fs/notify/mark.c
>> >> >
>> >> > Nothing needs to be done here, since the tags embedded into pointers
>> >> > don't change, so the sorting order would still be unique.
>> >> >
>> >> > Check that a pointer belongs to some particular allocation:
>> >> >
>> >> > is_sibling_entry lib/radix-tree.c
>> >> > object_is_on_stack in include/linux/sched/task_stack.h
>> >> >
>> >> > Nothing needs to be here either, since two pointers can only belong to
>> >> > the same allocation if they have the same tag.
>> >> >
>> >> > ======
>> >> >
>> >> > Will, Catalin, WDYT?
>> >>
>> >> ping
>> >
>> > Thanks for tracking these cases down and going through each of them. The
>> > obvious follow-up question is: how do we ensure that we keep on top of
>> > this in mainline? Are you going to repeat your experiment at every kernel
>> > release or every -rc or something else? I really can't see how we can
>> > maintain this in the long run, especially given that the coverage we have
>> > is only dynamic -- do you have an idea of how much coverage you're actually
>> > getting for, say, a defconfig+modules build?
>> >
>> > I'd really like to enable pointer tagging in the kernel, I'm just still
>> > failing to see how we can do it in a controlled manner where we can reason
>> > about the semantic changes using something other than a best-effort,
>> > case-by-case basis which is likely to be fragile and error-prone.
>> > Unfortunately, if that's all we have, then this gets relegated to a
>> > debug feature, which sort of defeats the point in my opinion.
>>
>> Well, in some cases there is no other way as resorting to dynamic testing.
>> How do we ensure that kernel does not dereference NULL pointers, does
>> not access objects after free or out of bounds?
>
> We should not confuse software bugs (like NULL pointer dereference) with
> unexpected software behaviour introduced by khwasan where pointers no
> longer represent only an address range (e.g. calling find_vmap_area())
> but rather an address and a tag.

These are also software bugs, no different from the NULL derefs that we
do not detect statically.

> Parts of the kernel rely on pointers
> being just address ranges.
>
> It's the latter that we'd like to identify more easily and avoid subtle
> bugs or change in behaviour when running correctly written code.

You mean _previously_ correct code; now it's just incorrect code. It's
no different from any other type of incorrect code, and we already have
thousands of types of incorrect code, most of which are not detectable
statically.

>> And, yes, it's
>> constant maintenance burden resolved via dynamic testing.
>> In some sense HWASAN is better in this regard because it's like, say,
>> LOCKDEP in this regard. It's enabled only when one does dynamic
>> testing and collect, analyze and fix everything that pops up. Any
>> false positives will fail loudly (as opposed to, say, silent memory
>> corruptions due to use-after-frees), so any false positives will be
>> just first things to fix during the tool application.
>
> Again, you are talking about the bugs that khwasan would discover. We
> don't deny its value and false positives are acceptable here.

I am talking about the same thing you are talking about. New crashes
or changes in behavior will also pop up and will need to be fixed.

> However, not untagging a pointer when converting to long may have
> side-effects in some cases and I consider these bugs introduced by the
> khwasan support rather than bugs in the original kernel code. Ideally
> we'd need some tooling on top of khwasan to detect such shortcomings but
> I'm not sure we can do this statically, as Andrey already mentioned. For
> __user pointers, things are slightly better as we can detect the
> conversion either with sparse (modified) or some LLVM changes.

I agree. Ideally we would have strict static checking for this type of
bug. Ideally we would have it for all types of bugs: NULL derefs,
use-after-frees, or, say, confusion between a function returning 0/1 for
ok/failure and a function returning true/false. How do we detect those
statically? We have no way to.

For example, LOCKDEP has the same problem. Previously correct code can
become incorrect and require finer-grained lock class annotations.
KMEMLEAK has the same problem: previously correct code that hides a
pointer may now need changes to unhide the pointer.

If somebody has a practical idea how to detect these statically, let's
do it. Otherwise let's go with the traditional solution to this:
dynamic testing. The patch series shows that the problem is not a
disaster and that we won't need to change every single line of kernel
code.

2018-08-02 13:53:14

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

(trimming the quoted text a bit)

On Thu, Aug 02, 2018 at 01:36:25PM +0200, Dmitry Vyukov wrote:
> On Thu, Aug 2, 2018 at 1:10 PM, Catalin Marinas <[email protected]> wrote:
> > On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
> >> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
> >> > I'd really like to enable pointer tagging in the kernel, I'm just still
> >> > failing to see how we can do it in a controlled manner where we can reason
> >> > about the semantic changes using something other than a best-effort,
> >> > case-by-case basis which is likely to be fragile and error-prone.
> >> > Unfortunately, if that's all we have, then this gets relegated to a
> >> > debug feature, which sort of defeats the point in my opinion.
> >>
> >> Well, in some cases there is no other way as resorting to dynamic testing.
> >> How do we ensure that kernel does not dereference NULL pointers, does
> >> not access objects after free or out of bounds?
> >
> > We should not confuse software bugs (like NULL pointer dereference) with
> > unexpected software behaviour introduced by khwasan where pointers no
> > longer represent only an address range (e.g. calling find_vmap_area())
> > but rather an address and a tag.
[...]
> > However, not untagging a pointer when converting to long may have
> > side-effects in some cases and I consider these bugs introduced by the
> > khwasan support rather than bugs in the original kernel code. Ideally
> > we'd need some tooling on top of khwasan to detect such shortcomings but
> > I'm not sure we can do this statically, as Andrey already mentioned. For
> > __user pointers, things are slightly better as we can detect the
> > conversion either with sparse (modified) or some LLVM changes.
[...]
> For example, LOCKDEP has the same problem. Previously correct code can
> become incorrect and require finer-grained lock class annotations.
> KMEMLEAK has the same problem: previously correct code that hides a
> pointer may now need changes to unhide the pointer.

It's not actually the same. Take the kmemleak example, since I'm
familiar with it: previously correct code _continues_ to run correctly
in the presence of kmemleak. The annotation or unhiding is only
necessary to reduce kmemleak's false positives. With khwasan, OTOH,
explicit untagging is necessary for the code to function correctly
again.

IOW, kmemleak only monitors the behaviour of the original code while
khwasan changes such behaviour by tagging the pointers.

> If somebody has a practical idea how to detect these statically, let's
> do it. Otherwise let's go with the traditional solution to this --
> dynamic testing. The patch series show that the problem is not a
> disaster and we won't need to change just every line of kernel code.

It's indeed not a disaster but we had to do this exercise to find out
whether there are better ways of detecting where untagging is necessary.

If you want to enable khwasan in "production", then, since enabling it
could potentially change the behaviour of existing code paths, the
run-time validation space doubles: we'd need to get the same code
coverage with and without the feature enabled. I wouldn't say it's a
blocker for khwasan, more something to be aware of.

The awareness is a bit of a problem as the normal programmer would have
to pay more attention to conversions between pointer and long. Given
that this is an arm64-only feature, we have a risk of khwasan-triggered
bugs being introduced in generic code in the future (hence the
suggestion of some static checker, if possible).

--
Catalin

2018-08-02 14:28:49

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On 08/02/2018 04:52 PM, Catalin Marinas wrote:
>
>> If somebody has a practical idea how to detect these statically, let's
>> do it. Otherwise let's go with the traditional solution to this --
>> dynamic testing. The patch series show that the problem is not a
>> disaster and we won't need to change just every line of kernel code.
>
> It's indeed not a disaster but we had to do this exercise to find out
> whether there are better ways of detecting where untagging is necessary.
>
> If you want to enable khwasan in "production" and since enabling it
> could potentially change the behaviour of existing code paths, the
> run-time validation space doubles as we'd need to get the same code
> coverage with and without the feature being enabled. I wouldn't say it's
> a blocker for khwasan, more like something to be aware of.
>
> The awareness is a bit of a problem as the normal programmer would have
> to pay more attention to conversions between pointer and long. Given
> that this is an arm64-only feature, we have a risk of khwasan-triggered
> bugs being introduced in generic code in the future (hence the
> suggestion of some static checker, if possible).

I don't see how we can implement such a checker. There is no simple rule which defines when we need
to remove the tag and when we can leave it in place.
The cast to long has nothing to do with the need to remove the tag. If pointers are compared for sorting
objects or for lock ordering, then having tags is fine, and it doesn't matter whether the pointers are
compared as 'unsigned long' or as 'void *'.
If a developer needs to check whether a pointer is in the linear mapping, then the tag has to be removed.
But again, it doesn't matter whether the pointer is 'unsigned long' or 'void *'; either way, the tag has to go away.

2018-08-03 09:24:47

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
> > Thanks for tracking these cases down and going through each of them. The
> > obvious follow-up question is: how do we ensure that we keep on top of
> > this in mainline? Are you going to repeat your experiment at every kernel
> > release or every -rc or something else? I really can't see how we can
> > maintain this in the long run, especially given that the coverage we have
> > is only dynamic -- do you have an idea of how much coverage you're actually
> > getting for, say, a defconfig+modules build?
> >
> > I'd really like to enable pointer tagging in the kernel, I'm just still
> > failing to see how we can do it in a controlled manner where we can reason
> > about the semantic changes using something other than a best-effort,
> > case-by-case basis which is likely to be fragile and error-prone.
> > Unfortunately, if that's all we have, then this gets relegated to a
> > debug feature, which sort of defeats the point in my opinion.
>
> Well, in some cases there is no other way as resorting to dynamic testing.
> How do we ensure that kernel does not dereference NULL pointers, does
> not access objects after free or out of bounds? Nohow. And, yes, it's
> constant maintenance burden resolved via dynamic testing.

... and the advantage of NULL pointer issues is that you're likely to see
them as a synchronous exception at runtime, regardless of architecture and
regardless of Kconfig options. With pointer tagging, that's certainly not
the case, and so I don't think we can just treat issues there like we do for
NULL pointers.

Will

2018-08-03 09:44:07

by Dmitry Vyukov

Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Aug 3, 2018 at 11:23 AM, Will Deacon <[email protected]> wrote:
> On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
>> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
>> > Thanks for tracking these cases down and going through each of them. The
>> > obvious follow-up question is: how do we ensure that we keep on top of
>> > this in mainline? Are you going to repeat your experiment at every kernel
>> > release or every -rc or something else? I really can't see how we can
>> > maintain this in the long run, especially given that the coverage we have
>> > is only dynamic -- do you have an idea of how much coverage you're actually
>> > getting for, say, a defconfig+modules build?
>> >
>> > I'd really like to enable pointer tagging in the kernel, I'm just still
>> > failing to see how we can do it in a controlled manner where we can reason
>> > about the semantic changes using something other than a best-effort,
>> > case-by-case basis which is likely to be fragile and error-prone.
>> > Unfortunately, if that's all we have, then this gets relegated to a
>> > debug feature, which sort of defeats the point in my opinion.
>>
>> Well, in some cases there is no other way than resorting to dynamic testing.
>> How do we ensure that the kernel does not dereference NULL pointers, does
>> not access objects after free or out of bounds? We can't. And, yes, it's a
>> constant maintenance burden resolved via dynamic testing.
>
> ... and the advantage of NULL pointer issues is that you're likely to see
> them as a synchronous exception at runtime, regardless of architecture and
> regardless of Kconfig options. With pointer tagging, that's certainly not
> the case, and so I don't think we can just treat issues there like we do for
> NULL pointers.

Well, let's take use-after-frees, out-of-bounds accesses, info leaks, data
races, deadlocks and just logical bugs as examples...

> If you want to enable khwasan in "production" and since enabling it
> could potentially change the behaviour of existing code paths, the
> run-time validation space doubles as we'd need to get the same code
> coverage with and without the feature being enabled.

This is true for just about any change in configs, sysctls or just a
different workload. Any of this can enable new code, existing code
working differently, or just working with data in new states. And we
have tens of thousands of bugs, so blindly deploying anything new to
production without proper testing is a bad idea. It's not specific to
HWASAN in any way. And when you enable HWASAN you actually do mean to
retest everything as hard as possible.

And in the end we do not seem to have any action points here, right?

2018-08-08 16:29:14

by Will Deacon

Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Fri, Aug 03, 2018 at 11:42:32AM +0200, Dmitry Vyukov wrote:
> On Fri, Aug 3, 2018 at 11:23 AM, Will Deacon <[email protected]> wrote:
> > On Wed, Aug 01, 2018 at 06:52:09PM +0200, Dmitry Vyukov wrote:
> >> On Wed, Aug 1, 2018 at 6:35 PM, Will Deacon <[email protected]> wrote:
> >> > Thanks for tracking these cases down and going through each of them. The
> >> > obvious follow-up question is: how do we ensure that we keep on top of
> >> > this in mainline? Are you going to repeat your experiment at every kernel
> >> > release or every -rc or something else? I really can't see how we can
> >> > maintain this in the long run, especially given that the coverage we have
> >> > is only dynamic -- do you have an idea of how much coverage you're actually
> >> > getting for, say, a defconfig+modules build?
> >> >
> >> > I'd really like to enable pointer tagging in the kernel, I'm just still
> >> > failing to see how we can do it in a controlled manner where we can reason
> >> > about the semantic changes using something other than a best-effort,
> >> > case-by-case basis which is likely to be fragile and error-prone.
> >> > Unfortunately, if that's all we have, then this gets relegated to a
> >> > debug feature, which sort of defeats the point in my opinion.
> >>
> >> Well, in some cases there is no other way than resorting to dynamic testing.
> >> How do we ensure that the kernel does not dereference NULL pointers, does
> >> not access objects after free or out of bounds? We can't. And, yes, it's a
> >> constant maintenance burden resolved via dynamic testing.
> >
> > ... and the advantage of NULL pointer issues is that you're likely to see
> > them as a synchronous exception at runtime, regardless of architecture and
> > regardless of Kconfig options. With pointer tagging, that's certainly not
> > the case, and so I don't think we can just treat issues there like we do for
> > NULL pointers.
>
> Well, let's take use-after-frees, out-of-bounds accesses, info leaks, data
> races, deadlocks and just logical bugs as examples...

Ok, but it was you that brought up NULL pointers, so there's some goalpost
moving here. And as with NULL pointers, all of the issues you mention above
apply to other architectures and the majority of their configurations, so my
concerns about this feature remain.

> > If you want to enable khwasan in "production" and since enabling it
> > could potentially change the behaviour of existing code paths, the
> > run-time validation space doubles as we'd need to get the same code
> > coverage with and without the feature being enabled.
>
> This is true for just about any change in configs, sysctls or just a
> different workload. Any of this can enable new code, existing code
> working differently, or just working with data in new states. And we
> have tens of thousands of bugs, so blindly deploying anything new to
> production without proper testing is a bad idea. It's not specific to
> HWASAN in any way. And when you enable HWASAN you actually do mean to
> retest everything as hard as possible.

I suppose I'm trying to understand whether we have to resort to testing, or
whether we can do better. I'm really uncomfortable with testing as our only
means of getting this right because this is a non-standard, arm64-specific
option and I don't think it will get very much testing in mainline at all.
Rather, we'll get spurious bug reports from forks of -stable many releases
later and we'll actually be worse off for it.

> And in the end we do not seem to have any action points here, right?

Right now, it feels like this series trades one set of bugs for another,
so I'd like to get to a position where this new set of bugs is genuinely
more manageable (i.e. detectable, fixable, preventable) than the old set.
Unfortunately, the only suggestion seems to be "testing", which I really
don't find convincing :(

Could we do things like:

- Set up a dedicated arm64 test farm, running mainline and with a public
frontend, aimed at getting maximum coverage of the kernel with KHWASAN
enabled?

- Have an implementation of KHWASAN for other architectures? (Is this even
possible?)

- Have a compiler plugin to clear out the tag for pointer arithmetic?
Could we WARN if two pointers are compared with different tags?
Could we manipulate the tag on cast-to-pointer so that a mismatch would
be a qualifier indicating that the pointer was created via a cast?

- ...

?

Will

2018-08-08 16:55:20

by Dmitry Vyukov

Subject: Re: [PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer

On Wed, Aug 8, 2018 at 6:27 PM, Will Deacon <[email protected]> wrote:
>> >> > Thanks for tracking these cases down and going through each of them. The
>> >> > obvious follow-up question is: how do we ensure that we keep on top of
>> >> > this in mainline? Are you going to repeat your experiment at every kernel
>> >> > release or every -rc or something else? I really can't see how we can
>> >> > maintain this in the long run, especially given that the coverage we have
>> >> > is only dynamic -- do you have an idea of how much coverage you're actually
>> >> > getting for, say, a defconfig+modules build?
>> >> >
>> >> > I'd really like to enable pointer tagging in the kernel, I'm just still
>> >> > failing to see how we can do it in a controlled manner where we can reason
>> >> > about the semantic changes using something other than a best-effort,
>> >> > case-by-case basis which is likely to be fragile and error-prone.
>> >> > Unfortunately, if that's all we have, then this gets relegated to a
>> >> > debug feature, which sort of defeats the point in my opinion.
>> >>
>> >> Well, in some cases there is no other way than resorting to dynamic testing.
>> >> How do we ensure that the kernel does not dereference NULL pointers, does
>> >> not access objects after free or out of bounds? We can't. And, yes, it's a
>> >> constant maintenance burden resolved via dynamic testing.
>> >
>> > ... and the advantage of NULL pointer issues is that you're likely to see
>> > them as a synchronous exception at runtime, regardless of architecture and
>> > regardless of Kconfig options. With pointer tagging, that's certainly not
>> > the case, and so I don't think we can just treat issues there like we do for
>> > NULL pointers.
>>
>> Well, let's take use-after-frees, out-of-bounds accesses, info leaks, data
>> races, deadlocks and just logical bugs as examples...
>
> Ok, but it was you that brought up NULL pointers, so there's some goalpost
> moving here.

I moved it only because our views on bugs seem to be somewhat
different. I would put it all, including NULL derefs, into the same
bucket of bugs. But the point I wanted to make holds if we take NULL
derefs out of the equation too, so I took them out so that we don't
concentrate on "synchronous exceptions" only.

> And as with NULL pointers, all of the issues you mention above
> apply to other architectures and the majority of their configurations, so my
> concerns about this feature remain.
>
>> > If you want to enable khwasan in "production" and since enabling it
>> > could potentially change the behaviour of existing code paths, the
>> > run-time validation space doubles as we'd need to get the same code
>> > coverage with and without the feature being enabled.
>>
>> This is true for just about any change in configs, sysctls or just a
>> different workload. Any of this can enable new code, existing code
>> working differently, or just working with data in new states. And we
>> have tens of thousands of bugs, so blindly deploying anything new to
>> production without proper testing is a bad idea. It's not specific to
>> HWASAN in any way. And when you enable HWASAN you actually do mean to
>> retest everything as hard as possible.
>
> I suppose I'm trying to understand whether we have to resort to testing, or
> whether we can do better. I'm really uncomfortable with testing as our only
> means of getting this right because this is a non-standard, arm64-specific
> option and I don't think it will get very much testing in mainline at all.
> Rather, we'll get spurious bug reports from forks of -stable many releases
> later and we'll actually be worse off for it.
>
>> And in the end we do not seem to have any action points here, right?
>
> Right now, it feels like this series trades one set of bugs for another,
> so I'd like to get to a position where this new set of bugs is genuinely
> more manageable (i.e. detectable, fixable, preventable) than the old set.
> Unfortunately, the only suggestion seems to be "testing", which I really
> don't find convincing :(
>
> Could we do things like:
>
> - Set up a dedicated arm64 test farm, running mainline and with a public
> frontend, aimed at getting maximum coverage of the kernel with KHWASAN
> enabled?

FWIW we could try to set up a syzbot instance with qemu/arm64
emulation. We ran such a combination a few times, but I am not sure how
stable it will be wrt flaky timeouts/stalls/etc. If it works, it will
give instant coverage of about 1MLOC.

> - Have an implementation of KHWASAN for other architectures? (Is this even
> possible?)
>
> - Have a compiler plugin to clear out the tag for pointer arithmetic?
> Could we WARN if two pointers are compared with different tags?
> Could we manipulate the tag on cast-to-pointer so that a mismatch would
> be a qualifier indicating that the pointer was created via a cast?
>
> - ...
>
> ?
>
> Will