This patchset is not complete (see particular TODOs in the last patch),
and I haven't performed any benchmarking yet, but I would like to start the
discussion now and hear people's opinions regarding the questions mentioned
below.
=== Overview
This patchset adopts the existing hardware tag-based KASAN mode [1] for
use in production as a memory corruption mitigation. Hardware tag-based
KASAN relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory
and pointer tagging. Please see [3] and [4] for detailed analysis of how
MTE helps to fight memory safety problems.
The current plan is reuse CONFIG_KASAN_HW_TAGS for production, but add a
boot time switch, that allows to choose between a debugging mode, that
includes all KASAN features as they are, and a production mode, that only
includes the essentials like tag checking.
It is essential that switching between these modes doesn't require
rebuilding the kernel with different configs, as this is required by the
Android GKI initiative [5].
The last patch of this series adds a new boot time parameter called
kasan_mode, which can have the following values:
- "kasan_mode=on" - only production features
- "kasan_mode=debug" - all debug features
- "kasan_mode=off" - no checks at all (not implemented yet)
Currently outlined differences between "on" and "debug":
- "on" doesn't keep track of alloc/free stacks, and therefore doesn't
require the additional memory to store those
- "on" uses asyncronous tag checking (not implemented yet)
=== Questions
The intention with this kind of a high level switch is to hide the
implementation details. Arguably, we could add multiple switches that allow
to separately control each KASAN or MTE feature, but I'm not sure there's
much value in that.
Does this make sense? Any preference regarding the name of the parameter
and its values?
What should be the default when the parameter is not specified? I would
argue that it should be "debug" (for hardware that supports MTE, otherwise
"off"), as it's the implied default for all other KASAN modes.
Should we somehow control whether to panic the kernel on a tag fault?
Another boot time parameter perhaps?
Any ideas as to how properly estimate the slowdown? As there's no
MTE-enabled hardware yet, the only way to test these patches is use an
emulator (like QEMU). The delay that is added by the emulator (for setting
and checking the tags) is different from the hardware delay, and this skews
the results.
A question to KASAN maintainers: what would be the best way to support the
"off" mode? I see two potential approaches: add a check into each kasan
callback (easier to implement, but we still call kasan callbacks, even
though they immediately return), or add inline header wrappers that do the
same.
=== Notes
This patchset is available here:
https://github.com/xairy/linux/tree/up-prod-mte-rfc1
and on Gerrit here:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/3460
This patchset is based on v5 of "kasan: add hardware tag-based mode for
arm64" patchset [1].
For testing in QEMU hardware tag-based KASAN requires:
1. QEMU built from master [6] (use "-machine virt,mte=on -cpu max" arguments
to run).
2. GCC version 10.
[1] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[2] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
[3] https://arxiv.org/pdf/1802.09517.pdf
[4] https://github.com/microsoft/MSRC-Security-Research/blob/master/papers/2020/Security%20analysis%20of%20memory%20tagging.pdf
[5] https://source.android.com/devices/architecture/kernel/generic-kernel-image
[6] https://github.com/qemu/qemu
Andrey Konovalov (8):
kasan: simplify quarantine_put call
kasan: rename get_alloc/free_info
kasan: introduce set_alloc_info
kasan: unpoison stack only with CONFIG_KASAN_STACK
kasan: mark kasan_init_tags as __init
kasan, arm64: move initialization message
arm64: kasan: Add system_supports_tags helper
kasan: add and integrate kasan_mode boot param
arch/arm64/include/asm/memory.h | 1 +
arch/arm64/kernel/sleep.S | 2 +-
arch/arm64/mm/kasan_init.c | 3 ++
arch/x86/kernel/acpi/wakeup_64.S | 2 +-
include/linux/kasan.h | 14 ++---
mm/kasan/common.c | 90 ++++++++++++++++++--------------
mm/kasan/generic.c | 18 ++++---
mm/kasan/hw_tags.c | 63 ++++++++++++++++++++--
mm/kasan/kasan.h | 25 ++++++---
mm/kasan/quarantine.c | 5 +-
mm/kasan/report.c | 22 +++++---
mm/kasan/report_sw_tags.c | 2 +-
mm/kasan/sw_tags.c | 14 +++--
13 files changed, 182 insertions(+), 79 deletions(-)
--
2.28.0.1011.ga647a8990f-goog
Move get_free_info() call into quarantine_put() to simplify the call site.
No functional changes.
Signed-off-by: Andrey Konovalov <[email protected]>
Link: https://linux-review.googlesource.com/id/Iab0f04e7ebf8d83247024b7190c67c3c34c7940f
---
mm/kasan/common.c | 2 +-
mm/kasan/kasan.h | 5 ++---
mm/kasan/quarantine.c | 3 ++-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 2bb0ef6da6bd..5712c66c11c1 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -308,7 +308,7 @@ static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
kasan_set_free_info(cache, object, tag);
- quarantine_put(get_free_info(cache, object), cache);
+ quarantine_put(cache, object);
return IS_ENABLED(CONFIG_KASAN_GENERIC);
}
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 32ddb18541e3..a3bf60ceb5e1 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -214,12 +214,11 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
#if defined(CONFIG_KASAN_GENERIC) && \
(defined(CONFIG_SLAB) || defined(CONFIG_SLUB))
-void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache);
+void quarantine_put(struct kmem_cache *cache, void *object);
void quarantine_reduce(void);
void quarantine_remove_cache(struct kmem_cache *cache);
#else
-static inline void quarantine_put(struct kasan_free_meta *info,
- struct kmem_cache *cache) { }
+static inline void quarantine_put(struct kmem_cache *cache, void *object) { }
static inline void quarantine_reduce(void) { }
static inline void quarantine_remove_cache(struct kmem_cache *cache) { }
#endif
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index 580ff5610fc1..a0792f0d6d0f 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -161,11 +161,12 @@ static void qlist_free_all(struct qlist_head *q, struct kmem_cache *cache)
qlist_init(q);
}
-void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
+void quarantine_put(struct kmem_cache *cache, void *object)
{
unsigned long flags;
struct qlist_head *q;
struct qlist_head temp = QLIST_INIT;
+ struct kasan_free_meta *info = get_free_info(cache, object);
/*
* Note: irq must be disabled until after we move the batch to the
--
2.28.0.1011.ga647a8990f-goog
Tag-based KASAN modes are initialized with kasan_init_tags() instead of
kasan_init() for the generic mode. Move the initialization message for
tag-based modes into kasan_init_tags().
Also fix pr_fmt() usage for KASAN code: generic mode doesn't need it,
tag-based modes should use "kasan:" instead of KBUILD_MODNAME.
Signed-off-by: Andrey Konovalov <[email protected]>
Link: https://linux-review.googlesource.com/id/Idfd1e50625ffdf42dfc3dbf7455b11bd200a0a49
---
arch/arm64/mm/kasan_init.c | 3 +++
mm/kasan/generic.c | 2 --
mm/kasan/hw_tags.c | 4 ++++
mm/kasan/sw_tags.c | 4 +++-
4 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index b6b9d55bb72e..8f17fa834b62 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -290,5 +290,8 @@ void __init kasan_init(void)
{
kasan_init_shadow();
kasan_init_depth();
+#if defined(CONFIG_KASAN_GENERIC)
+ /* CONFIG_KASAN_SW/HW_TAGS also requires kasan_init_tags(). */
pr_info("KernelAddressSanitizer initialized\n");
+#endif
}
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index de6b3f03a023..d259e4c3aefd 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -9,8 +9,6 @@
* Andrey Konovalov <[email protected]>
*/
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
#include <linux/export.h>
#include <linux/interrupt.h>
#include <linux/init.h>
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 0128062320d5..b372421258c8 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -6,6 +6,8 @@
* Author: Andrey Konovalov <[email protected]>
*/
+#define pr_fmt(fmt) "kasan: " fmt
+
#include <linux/kasan.h>
#include <linux/kernel.h>
#include <linux/memory.h>
@@ -18,6 +20,8 @@
void __init kasan_init_tags(void)
{
init_tags(KASAN_TAG_MAX);
+
+ pr_info("KernelAddressSanitizer initialized\n");
}
void *kasan_reset_tag(const void *addr)
diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c
index bf1422282bb5..099af6dc8f7e 100644
--- a/mm/kasan/sw_tags.c
+++ b/mm/kasan/sw_tags.c
@@ -6,7 +6,7 @@
* Author: Andrey Konovalov <[email protected]>
*/
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#define pr_fmt(fmt) "kasan: " fmt
#include <linux/export.h>
#include <linux/interrupt.h>
@@ -41,6 +41,8 @@ void __init kasan_init_tags(void)
for_each_possible_cpu(cpu)
per_cpu(prng_state, cpu) = (u32)get_cycles();
+
+ pr_info("KernelAddressSanitizer initialized\n");
}
/*
--
2.28.0.1011.ga647a8990f-goog
TODO: no meaningful description here yet, please see the cover letter
for this RFC series.
Signed-off-by: Andrey Konovalov <[email protected]>
Link: https://linux-review.googlesource.com/id/If7d37003875b2ed3e0935702c8015c223d6416a4
---
mm/kasan/common.c | 69 +++++++++++++++++++++++++---------------------
mm/kasan/generic.c | 4 +++
mm/kasan/hw_tags.c | 53 +++++++++++++++++++++++++++++++++++
mm/kasan/kasan.h | 8 ++++++
mm/kasan/report.c | 10 +++++--
mm/kasan/sw_tags.c | 4 +++
6 files changed, 115 insertions(+), 33 deletions(-)
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index a3e67d49b893..d642d5fce1e5 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -135,35 +135,37 @@ void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
unsigned int redzone_size;
int redzone_adjust;
- /* Add alloc meta. */
- cache->kasan_info.alloc_meta_offset = *size;
- *size += sizeof(struct kasan_alloc_meta);
-
- /* Add free meta. */
- if (IS_ENABLED(CONFIG_KASAN_GENERIC) &&
- (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
- cache->object_size < sizeof(struct kasan_free_meta))) {
- cache->kasan_info.free_meta_offset = *size;
- *size += sizeof(struct kasan_free_meta);
- }
-
- redzone_size = optimal_redzone(cache->object_size);
- redzone_adjust = redzone_size - (*size - cache->object_size);
- if (redzone_adjust > 0)
- *size += redzone_adjust;
-
- *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
- max(*size, cache->object_size + redzone_size));
+ if (static_branch_unlikely(&kasan_debug)) {
+ /* Add alloc meta. */
+ cache->kasan_info.alloc_meta_offset = *size;
+ *size += sizeof(struct kasan_alloc_meta);
+
+ /* Add free meta. */
+ if (IS_ENABLED(CONFIG_KASAN_GENERIC) &&
+ (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
+ cache->object_size < sizeof(struct kasan_free_meta))) {
+ cache->kasan_info.free_meta_offset = *size;
+ *size += sizeof(struct kasan_free_meta);
+ }
- /*
- * If the metadata doesn't fit, don't enable KASAN at all.
- */
- if (*size <= cache->kasan_info.alloc_meta_offset ||
- *size <= cache->kasan_info.free_meta_offset) {
- cache->kasan_info.alloc_meta_offset = 0;
- cache->kasan_info.free_meta_offset = 0;
- *size = orig_size;
- return;
+ redzone_size = optimal_redzone(cache->object_size);
+ redzone_adjust = redzone_size - (*size - cache->object_size);
+ if (redzone_adjust > 0)
+ *size += redzone_adjust;
+
+ *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
+ max(*size, cache->object_size + redzone_size));
+
+ /*
+ * If the metadata doesn't fit, don't enable KASAN at all.
+ */
+ if (*size <= cache->kasan_info.alloc_meta_offset ||
+ *size <= cache->kasan_info.free_meta_offset) {
+ cache->kasan_info.alloc_meta_offset = 0;
+ cache->kasan_info.free_meta_offset = 0;
+ *size = orig_size;
+ return;
+ }
}
*flags |= SLAB_KASAN;
@@ -180,6 +182,7 @@ size_t kasan_metadata_size(struct kmem_cache *cache)
struct kasan_alloc_meta *kasan_get_alloc_meta(struct kmem_cache *cache,
const void *object)
{
+ WARN_ON(!static_branch_unlikely(&kasan_debug));
return (void *)reset_tag(object) + cache->kasan_info.alloc_meta_offset;
}
@@ -187,6 +190,7 @@ struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
const void *object)
{
BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
+ WARN_ON(!static_branch_unlikely(&kasan_debug));
return (void *)reset_tag(object) + cache->kasan_info.free_meta_offset;
}
@@ -266,8 +270,10 @@ void * __must_check kasan_init_slab_obj(struct kmem_cache *cache,
if (!(cache->flags & SLAB_KASAN))
return (void *)object;
- alloc_meta = kasan_get_alloc_meta(cache, object);
- __memset(alloc_meta, 0, sizeof(*alloc_meta));
+ if (static_branch_unlikely(&kasan_debug)) {
+ alloc_meta = kasan_get_alloc_meta(cache, object);
+ __memset(alloc_meta, 0, sizeof(*alloc_meta));
+ }
if (IS_ENABLED(CONFIG_KASAN_SW_TAGS) || IS_ENABLED(CONFIG_KASAN_HW_TAGS))
object = set_tag(object, assign_tag(cache, object, true, false));
@@ -305,6 +311,7 @@ static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
kasan_poison_memory(object, rounded_up_size, KASAN_KMALLOC_FREE);
if ((IS_ENABLED(CONFIG_KASAN_GENERIC) && !quarantine) ||
+ !static_branch_unlikely(&kasan_debug) ||
unlikely(!(cache->flags & SLAB_KASAN)))
return false;
@@ -351,7 +358,7 @@ static void *__kasan_kmalloc(struct kmem_cache *cache, const void *object,
kasan_poison_memory((void *)redzone_start, redzone_end - redzone_start,
KASAN_KMALLOC_REDZONE);
- if (cache->flags & SLAB_KASAN)
+ if (static_branch_unlikely(&kasan_debug) && cache->flags & SLAB_KASAN)
set_alloc_info(cache, (void *)object, flags);
return set_tag(object, tag);
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index d259e4c3aefd..9d968eaedc98 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -33,6 +33,10 @@
#include "kasan.h"
#include "../slab.h"
+/* See the comments in hw_tags.c */
+DEFINE_STATIC_KEY_TRUE_RO(kasan_enabled);
+DEFINE_STATIC_KEY_TRUE_RO(kasan_debug);
+
/*
* All functions below always inlined so compiler could
* perform better optimizations in each of __asan_loadX/__assn_storeX
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index b372421258c8..fc6ab1c8b155 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -8,6 +8,8 @@
#define pr_fmt(fmt) "kasan: " fmt
+#include <linux/init.h>
+#include <linux/jump_label.h>
#include <linux/kasan.h>
#include <linux/kernel.h>
#include <linux/memory.h>
@@ -17,8 +19,57 @@
#include "kasan.h"
+enum kasan_mode {
+ KASAN_MODE_OFF,
+ KASAN_MODE_ON,
+ KASAN_MODE_DEBUG,
+};
+
+static enum kasan_mode kasan_mode __ro_after_init;
+
+/* Whether KASAN is enabled at all. */
+/* TODO: ideally no KASAN callbacks when this is disabled. */
+DEFINE_STATIC_KEY_FALSE_RO(kasan_enabled);
+
+/* Whether to collect debugging info, e.g. alloc/free stack traces. */
+DEFINE_STATIC_KEY_FALSE_RO(kasan_debug);
+
+/* Whether to use syncronous or asynchronous tag checking. */
+static bool kasan_sync __ro_after_init;
+
+static int __init early_kasan_mode(char *arg)
+{
+ if (!arg)
+ return -EINVAL;
+
+ if (strcmp(arg, "on") == 0)
+ kasan_mode = KASAN_MODE_ON;
+ else if (strcmp(arg, "debug") == 0)
+ kasan_mode = KASAN_MODE_DEBUG;
+ return 0;
+}
+early_param("kasan_mode", early_kasan_mode);
+
void __init kasan_init_tags(void)
{
+ /* TODO: system_supports_tags() always returns 0 here, fix. */
+ if (0 /*!system_supports_tags()*/)
+ return;
+
+ switch (kasan_mode) {
+ case KASAN_MODE_OFF:
+ return;
+ case KASAN_MODE_ON:
+ static_branch_enable(&kasan_enabled);
+ break;
+ case KASAN_MODE_DEBUG:
+ static_branch_enable(&kasan_enabled);
+ static_branch_enable(&kasan_debug);
+ kasan_sync = true;
+ break;
+ }
+
+ /* TODO: choose between sync and async based on kasan_sync. */
init_tags(KASAN_TAG_MAX);
pr_info("KernelAddressSanitizer initialized\n");
@@ -60,6 +111,7 @@ void kasan_set_free_info(struct kmem_cache *cache,
{
struct kasan_alloc_meta *alloc_meta;
+ WARN_ON(!static_branch_unlikely(&kasan_debug));
alloc_meta = kasan_get_alloc_meta(cache, object);
kasan_set_track(&alloc_meta->free_track[0], GFP_NOWAIT);
}
@@ -69,6 +121,7 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
{
struct kasan_alloc_meta *alloc_meta;
+ WARN_ON(!static_branch_unlikely(&kasan_debug));
alloc_meta = kasan_get_alloc_meta(cache, object);
return &alloc_meta->free_track[0];
}
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 47d6074c7958..3712e7a39717 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -279,6 +279,14 @@ static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
#define get_mem_tag(addr) arch_get_mem_tag(addr)
#define set_mem_tag_range(addr, size, tag) arch_set_mem_tag_range((addr), (size), (tag))
+#ifdef CONFIG_KASAN_HW_TAGS
+DECLARE_STATIC_KEY_FALSE(kasan_enabled);
+DECLARE_STATIC_KEY_FALSE(kasan_debug);
+#else
+DECLARE_STATIC_KEY_TRUE(kasan_enabled);
+DECLARE_STATIC_KEY_TRUE(kasan_debug);
+#endif
+
/*
* Exported functions for interfaces called from assembly or from generated
* code. Declarations here to avoid warning about missing declarations.
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index dee5350b459c..ae956a29ad4e 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -159,8 +159,8 @@ static void describe_object_addr(struct kmem_cache *cache, void *object,
(void *)(object_addr + cache->object_size));
}
-static void describe_object(struct kmem_cache *cache, void *object,
- const void *addr, u8 tag)
+static void describe_object_stacks(struct kmem_cache *cache, void *object,
+ const void *addr, u8 tag)
{
struct kasan_alloc_meta *alloc_meta = kasan_get_alloc_meta(cache, object);
@@ -188,7 +188,13 @@ static void describe_object(struct kmem_cache *cache, void *object,
}
#endif
}
+}
+static void describe_object(struct kmem_cache *cache, void *object,
+ const void *addr, u8 tag)
+{
+ if (static_branch_unlikely(&kasan_debug))
+ describe_object_stacks(cache, object, addr, tag);
describe_object_addr(cache, object, addr);
}
diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c
index 099af6dc8f7e..50e797a16e17 100644
--- a/mm/kasan/sw_tags.c
+++ b/mm/kasan/sw_tags.c
@@ -33,6 +33,10 @@
#include "kasan.h"
#include "../slab.h"
+/* See the comments in hw_tags.c */
+DEFINE_STATIC_KEY_TRUE_RO(kasan_enabled);
+DEFINE_STATIC_KEY_TRUE_RO(kasan_debug);
+
static DEFINE_PER_CPU(u32, prng_state);
void __init kasan_init_tags(void)
--
2.28.0.1011.ga647a8990f-goog
Add a helper that exposes information about whether the system supports
memory tagging to be called in generic code.
Signed-off-by: Andrey Konovalov <[email protected]>
Link: https://linux-review.googlesource.com/id/Ib4b56a42c57c6293df29a0cdfee334c3ca7bdab4
---
arch/arm64/include/asm/memory.h | 1 +
mm/kasan/kasan.h | 4 ++++
2 files changed, 5 insertions(+)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b5d6b824c21c..6d2b7c54780e 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -232,6 +232,7 @@ static inline const void *__tag_set(const void *addr, u8 tag)
}
#ifdef CONFIG_KASAN_HW_TAGS
+#define arch_system_supports_tags() system_supports_mte()
#define arch_init_tags(max_tag) mte_init_tags(max_tag)
#define arch_get_random_tag() mte_get_random_tag()
#define arch_get_mem_tag(addr) mte_get_mem_tag(addr)
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index e5b8367a07f2..47d6074c7958 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -257,6 +257,9 @@ static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
#define reset_tag(addr) ((void *)arch_kasan_reset_tag(addr))
#define get_tag(addr) arch_kasan_get_tag(addr)
+#ifndef arch_system_supports_tags
+#define arch_system_supports_tags() (false)
+#endif
#ifndef arch_init_tags
#define arch_init_tags(max_tag)
#endif
@@ -270,6 +273,7 @@ static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
#define arch_set_mem_tag_range(addr, size, tag) ((void *)(addr))
#endif
+#define system_supports_tags() arch_system_supports_tags()
#define init_tags(max_tag) arch_init_tags(max_tag)
#define get_random_tag() arch_get_random_tag()
#define get_mem_tag(addr) arch_get_mem_tag(addr)
--
2.28.0.1011.ga647a8990f-goog
On Wed, 14 Oct 2020 at 22:45, Andrey Konovalov <[email protected]> wrote:
>
> TODO: no meaningful description here yet, please see the cover letter
> for this RFC series.
>
> Signed-off-by: Andrey Konovalov <[email protected]>
> Link: https://linux-review.googlesource.com/id/If7d37003875b2ed3e0935702c8015c223d6416a4
> ---
> mm/kasan/common.c | 69 +++++++++++++++++++++++++---------------------
> mm/kasan/generic.c | 4 +++
> mm/kasan/hw_tags.c | 53 +++++++++++++++++++++++++++++++++++
> mm/kasan/kasan.h | 8 ++++++
> mm/kasan/report.c | 10 +++++--
> mm/kasan/sw_tags.c | 4 +++
> 6 files changed, 115 insertions(+), 33 deletions(-)
>
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index a3e67d49b893..d642d5fce1e5 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -135,35 +135,37 @@ void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
> unsigned int redzone_size;
> int redzone_adjust;
>
> - /* Add alloc meta. */
> - cache->kasan_info.alloc_meta_offset = *size;
> - *size += sizeof(struct kasan_alloc_meta);
> -
> - /* Add free meta. */
> - if (IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> - (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
> - cache->object_size < sizeof(struct kasan_free_meta))) {
> - cache->kasan_info.free_meta_offset = *size;
> - *size += sizeof(struct kasan_free_meta);
> - }
> -
> - redzone_size = optimal_redzone(cache->object_size);
> - redzone_adjust = redzone_size - (*size - cache->object_size);
> - if (redzone_adjust > 0)
> - *size += redzone_adjust;
> -
> - *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
> - max(*size, cache->object_size + redzone_size));
> + if (static_branch_unlikely(&kasan_debug)) {
> + /* Add alloc meta. */
> + cache->kasan_info.alloc_meta_offset = *size;
> + *size += sizeof(struct kasan_alloc_meta);
> +
> + /* Add free meta. */
> + if (IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> + (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
> + cache->object_size < sizeof(struct kasan_free_meta))) {
> + cache->kasan_info.free_meta_offset = *size;
> + *size += sizeof(struct kasan_free_meta);
> + }
>
> - /*
> - * If the metadata doesn't fit, don't enable KASAN at all.
> - */
> - if (*size <= cache->kasan_info.alloc_meta_offset ||
> - *size <= cache->kasan_info.free_meta_offset) {
> - cache->kasan_info.alloc_meta_offset = 0;
> - cache->kasan_info.free_meta_offset = 0;
> - *size = orig_size;
> - return;
> + redzone_size = optimal_redzone(cache->object_size);
> + redzone_adjust = redzone_size - (*size - cache->object_size);
> + if (redzone_adjust > 0)
> + *size += redzone_adjust;
> +
> + *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
> + max(*size, cache->object_size + redzone_size));
> +
> + /*
> + * If the metadata doesn't fit, don't enable KASAN at all.
> + */
> + if (*size <= cache->kasan_info.alloc_meta_offset ||
> + *size <= cache->kasan_info.free_meta_offset) {
> + cache->kasan_info.alloc_meta_offset = 0;
> + cache->kasan_info.free_meta_offset = 0;
> + *size = orig_size;
> + return;
> + }
> }
>
> *flags |= SLAB_KASAN;
> @@ -180,6 +182,7 @@ size_t kasan_metadata_size(struct kmem_cache *cache)
> struct kasan_alloc_meta *kasan_get_alloc_meta(struct kmem_cache *cache,
> const void *object)
> {
> + WARN_ON(!static_branch_unlikely(&kasan_debug));
The WARN_ON condition itself should be unlikely, so that would imply
that the static branch here should be likely since you're negating it.
And AFAIK, this function should only be called if kasan_debug is true.
> return (void *)reset_tag(object) + cache->kasan_info.alloc_meta_offset;
> }
>
> @@ -187,6 +190,7 @@ struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
> const void *object)
> {
> BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
> + WARN_ON(!static_branch_unlikely(&kasan_debug));
Same here.
> return (void *)reset_tag(object) + cache->kasan_info.free_meta_offset;
> }
>
> @@ -266,8 +270,10 @@ void * __must_check kasan_init_slab_obj(struct kmem_cache *cache,
> if (!(cache->flags & SLAB_KASAN))
> return (void *)object;
>
> - alloc_meta = kasan_get_alloc_meta(cache, object);
> - __memset(alloc_meta, 0, sizeof(*alloc_meta));
> + if (static_branch_unlikely(&kasan_debug)) {
> + alloc_meta = kasan_get_alloc_meta(cache, object);
> + __memset(alloc_meta, 0, sizeof(*alloc_meta));
> + }
>
> if (IS_ENABLED(CONFIG_KASAN_SW_TAGS) || IS_ENABLED(CONFIG_KASAN_HW_TAGS))
> object = set_tag(object, assign_tag(cache, object, true, false));
> @@ -305,6 +311,7 @@ static bool __kasan_slab_free(struct kmem_cache *cache, void *object,
> kasan_poison_memory(object, rounded_up_size, KASAN_KMALLOC_FREE);
>
> if ((IS_ENABLED(CONFIG_KASAN_GENERIC) && !quarantine) ||
> + !static_branch_unlikely(&kasan_debug) ||
> unlikely(!(cache->flags & SLAB_KASAN)))
> return false;
>
> @@ -351,7 +358,7 @@ static void *__kasan_kmalloc(struct kmem_cache *cache, const void *object,
> kasan_poison_memory((void *)redzone_start, redzone_end - redzone_start,
> KASAN_KMALLOC_REDZONE);
>
> - if (cache->flags & SLAB_KASAN)
> + if (static_branch_unlikely(&kasan_debug) && cache->flags & SLAB_KASAN)
> set_alloc_info(cache, (void *)object, flags);
>
> return set_tag(object, tag);
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index d259e4c3aefd..9d968eaedc98 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -33,6 +33,10 @@
> #include "kasan.h"
> #include "../slab.h"
>
> +/* See the comments in hw_tags.c */
> +DEFINE_STATIC_KEY_TRUE_RO(kasan_enabled);
> +DEFINE_STATIC_KEY_TRUE_RO(kasan_debug);
> +
> /*
> * All functions below always inlined so compiler could
> * perform better optimizations in each of __asan_loadX/__assn_storeX
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index b372421258c8..fc6ab1c8b155 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -8,6 +8,8 @@
>
> #define pr_fmt(fmt) "kasan: " fmt
>
> +#include <linux/init.h>
> +#include <linux/jump_label.h>
> #include <linux/kasan.h>
> #include <linux/kernel.h>
> #include <linux/memory.h>
> @@ -17,8 +19,57 @@
>
> #include "kasan.h"
>
> +enum kasan_mode {
> + KASAN_MODE_OFF,
> + KASAN_MODE_ON,
> + KASAN_MODE_DEBUG,
> +};
> +
> +static enum kasan_mode kasan_mode __ro_after_init;
> +
> +/* Whether KASAN is enabled at all. */
> +/* TODO: ideally no KASAN callbacks when this is disabled. */
> +DEFINE_STATIC_KEY_FALSE_RO(kasan_enabled);
> +
> +/* Whether to collect debugging info, e.g. alloc/free stack traces. */
> +DEFINE_STATIC_KEY_FALSE_RO(kasan_debug);
> +
> +/* Whether to use syncronous or asynchronous tag checking. */
> +static bool kasan_sync __ro_after_init;
s/syncronous/synchronous/
> +static int __init early_kasan_mode(char *arg)
> +{
> + if (!arg)
> + return -EINVAL;
> +
> + if (strcmp(arg, "on") == 0)
> + kasan_mode = KASAN_MODE_ON;
> + else if (strcmp(arg, "debug") == 0)
s/strcmp(..) == 0/!strcmp(..)/ ?
> + kasan_mode = KASAN_MODE_DEBUG;
> + return 0;
> +}
> +early_param("kasan_mode", early_kasan_mode);
> +
> void __init kasan_init_tags(void)
> {
> + /* TODO: system_supports_tags() always returns 0 here, fix. */
> + if (0 /*!system_supports_tags()*/)
> + return;
> +
> + switch (kasan_mode) {
> + case KASAN_MODE_OFF:
> + return;
> + case KASAN_MODE_ON:
> + static_branch_enable(&kasan_enabled);
> + break;
> + case KASAN_MODE_DEBUG:
> + static_branch_enable(&kasan_enabled);
> + static_branch_enable(&kasan_debug);
> + kasan_sync = true;
> + break;
> + }
> +
> + /* TODO: choose between sync and async based on kasan_sync. */
> init_tags(KASAN_TAG_MAX);
>
> pr_info("KernelAddressSanitizer initialized\n");
> @@ -60,6 +111,7 @@ void kasan_set_free_info(struct kmem_cache *cache,
> {
> struct kasan_alloc_meta *alloc_meta;
>
> + WARN_ON(!static_branch_unlikely(&kasan_debug));
What actually happens if any of these are called with !kasan_debug and
the warning triggers? Is it still valid to execute the below, or
should it bail out? Or possibly even disable KASAN entirely?
> alloc_meta = kasan_get_alloc_meta(cache, object);
> kasan_set_track(&alloc_meta->free_track[0], GFP_NOWAIT);
> }
> @@ -69,6 +121,7 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
> {
> struct kasan_alloc_meta *alloc_meta;
>
> + WARN_ON(!static_branch_unlikely(&kasan_debug));
> alloc_meta = kasan_get_alloc_meta(cache, object);
> return &alloc_meta->free_track[0];
> }
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 47d6074c7958..3712e7a39717 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -279,6 +279,14 @@ static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
> #define get_mem_tag(addr) arch_get_mem_tag(addr)
> #define set_mem_tag_range(addr, size, tag) arch_set_mem_tag_range((addr), (size), (tag))
>
> +#ifdef CONFIG_KASAN_HW_TAGS
> +DECLARE_STATIC_KEY_FALSE(kasan_enabled);
> +DECLARE_STATIC_KEY_FALSE(kasan_debug);
> +#else
> +DECLARE_STATIC_KEY_TRUE(kasan_enabled);
> +DECLARE_STATIC_KEY_TRUE(kasan_debug);
> +#endif
> +
> /*
> * Exported functions for interfaces called from assembly or from generated
> * code. Declarations here to avoid warning about missing declarations.
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index dee5350b459c..ae956a29ad4e 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -159,8 +159,8 @@ static void describe_object_addr(struct kmem_cache *cache, void *object,
> (void *)(object_addr + cache->object_size));
> }
>
> -static void describe_object(struct kmem_cache *cache, void *object,
> - const void *addr, u8 tag)
> +static void describe_object_stacks(struct kmem_cache *cache, void *object,
> + const void *addr, u8 tag)
> {
> struct kasan_alloc_meta *alloc_meta = kasan_get_alloc_meta(cache, object);
>
> @@ -188,7 +188,13 @@ static void describe_object(struct kmem_cache *cache, void *object,
> }
> #endif
> }
> +}
>
> +static void describe_object(struct kmem_cache *cache, void *object,
> + const void *addr, u8 tag)
> +{
> + if (static_branch_unlikely(&kasan_debug))
> + describe_object_stacks(cache, object, addr, tag);
> describe_object_addr(cache, object, addr);
> }
>
> diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c
> index 099af6dc8f7e..50e797a16e17 100644
> --- a/mm/kasan/sw_tags.c
> +++ b/mm/kasan/sw_tags.c
> @@ -33,6 +33,10 @@
> #include "kasan.h"
> #include "../slab.h"
>
> +/* See the comments in hw_tags.c */
> +DEFINE_STATIC_KEY_TRUE_RO(kasan_enabled);
> +DEFINE_STATIC_KEY_TRUE_RO(kasan_debug);
> +
> static DEFINE_PER_CPU(u32, prng_state);
>
> void __init kasan_init_tags(void)
> --
> 2.28.0.1011.ga647a8990f-goog
>
On Wed, 14 Oct 2020 at 22:44, Andrey Konovalov <[email protected]> wrote:
> This patchset is not complete (see particular TODOs in the last patch),
> and I haven't performed any benchmarking yet, but I would like to start the
> discussion now and hear people's opinions regarding the questions mentioned
> below.
>
> === Overview
>
> This patchset adopts the existing hardware tag-based KASAN mode [1] for
> use in production as a memory corruption mitigation. Hardware tag-based
> KASAN relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory
> and pointer tagging. Please see [3] and [4] for detailed analysis of how
> MTE helps to fight memory safety problems.
>
> The current plan is reuse CONFIG_KASAN_HW_TAGS for production, but add a
> boot time switch, that allows to choose between a debugging mode, that
> includes all KASAN features as they are, and a production mode, that only
> includes the essentials like tag checking.
>
> It is essential that switching between these modes doesn't require
> rebuilding the kernel with different configs, as this is required by the
> Android GKI initiative [5].
>
> The last patch of this series adds a new boot time parameter called
> kasan_mode, which can have the following values:
>
> - "kasan_mode=on" - only production features
> - "kasan_mode=debug" - all debug features
> - "kasan_mode=off" - no checks at all (not implemented yet)
>
> Currently outlined differences between "on" and "debug":
>
> - "on" doesn't keep track of alloc/free stacks, and therefore doesn't
> require the additional memory to store those
> - "on" uses asyncronous tag checking (not implemented yet)
>
> === Questions
>
> The intention with this kind of a high level switch is to hide the
> implementation details. Arguably, we could add multiple switches that allow
> to separately control each KASAN or MTE feature, but I'm not sure there's
> much value in that.
>
> Does this make sense? Any preference regarding the name of the parameter
> and its values?
KASAN itself used to be a debugging tool only. So introducing an "on"
mode which no longer follows this convention may be confusing.
Instead, maybe the following might be less confusing:
"full" - current "debug", normal KASAN, all debugging help available.
"opt" - current "on", optimized mode for production.
"on" - automatic selection => chooses "full" if CONFIG_DEBUG_KERNEL,
"opt" otherwise.
"off" - as before.
Also, if there is no other kernel boot parameter named "kasan" yet,
maybe it could just be "kasan=..." ?
> What should be the default when the parameter is not specified? I would
> argue that it should be "debug" (for hardware that supports MTE, otherwise
> "off"), as it's the implied default for all other KASAN modes.
Perhaps we could make this dependent on CONFIG_DEBUG_KERNEL as above.
I do not think that having the full/debug KASAN enabled on production
kernels adds any value because for it to be useful requires somebody
to actually look at the stacktraces; I think that choice should be
made explicitly if it's a production kernel. My guess is that we'll
save explaining performance differences and resulting headaches for
ourselves and others that way.
> Should we somehow control whether to panic the kernel on a tag fault?
> Another boot time parameter perhaps?
It already respects panic_on_warn, correct?
> Any ideas as to how properly estimate the slowdown? As there's no
> MTE-enabled hardware yet, the only way to test these patches is use an
> emulator (like QEMU). The delay that is added by the emulator (for setting
> and checking the tags) is different from the hardware delay, and this skews
> the results.
>
> A question to KASAN maintainers: what would be the best way to support the
> "off" mode? I see two potential approaches: add a check into each kasan
> callback (easier to implement, but we still call kasan callbacks, even
> though they immediately return), or add inline header wrappers that do the
> same.
[...]
Thanks,
-- Marco
On Thu, Oct 15, 2020 at 4:41 PM Marco Elver <[email protected]> wrote:
>
> On Wed, 14 Oct 2020 at 22:44, Andrey Konovalov <[email protected]> wrote:
> > This patchset is not complete (see particular TODOs in the last patch),
> > and I haven't performed any benchmarking yet, but I would like to start the
> > discussion now and hear people's opinions regarding the questions mentioned
> > below.
> >
> > === Overview
> >
> > This patchset adopts the existing hardware tag-based KASAN mode [1] for
> > use in production as a memory corruption mitigation. Hardware tag-based
> > KASAN relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory
> > and pointer tagging. Please see [3] and [4] for detailed analysis of how
> > MTE helps to fight memory safety problems.
> >
> > The current plan is reuse CONFIG_KASAN_HW_TAGS for production, but add a
> > boot time switch, that allows to choose between a debugging mode, that
> > includes all KASAN features as they are, and a production mode, that only
> > includes the essentials like tag checking.
> >
> > It is essential that switching between these modes doesn't require
> > rebuilding the kernel with different configs, as this is required by the
> > Android GKI initiative [5].
> >
> > The last patch of this series adds a new boot time parameter called
> > kasan_mode, which can have the following values:
> >
> > - "kasan_mode=on" - only production features
> > - "kasan_mode=debug" - all debug features
> > - "kasan_mode=off" - no checks at all (not implemented yet)
> >
> > Currently outlined differences between "on" and "debug":
> >
> > - "on" doesn't keep track of alloc/free stacks, and therefore doesn't
> > require the additional memory to store those
> > - "on" uses asyncronous tag checking (not implemented yet)
> >
> > === Questions
> >
> > The intention with this kind of a high level switch is to hide the
> > implementation details. Arguably, we could add multiple switches that allow
> > to separately control each KASAN or MTE feature, but I'm not sure there's
> > much value in that.
> >
> > Does this make sense? Any preference regarding the name of the parameter
> > and its values?
>
> KASAN itself used to be a debugging tool only. So introducing an "on"
> mode which no longer follows this convention may be confusing.
Yeah, perhaps "on" is not the best name here.
> Instead, maybe the following might be less confusing:
>
> "full" - current "debug", normal KASAN, all debugging help available.
> "opt" - current "on", optimized mode for production.
How about "prod" here?
> "on" - automatic selection => chooses "full" if CONFIG_DEBUG_KERNEL,
> "opt" otherwise.
> "off" - as before.
It actually makes sense to depend on CONFIG_DEBUG_KERNEL, I like this idea.
>
> Also, if there is no other kernel boot parameter named "kasan" yet,
> maybe it could just be "kasan=..." ?
Sounds good to me too.
> > What should be the default when the parameter is not specified? I would
> > argue that it should be "debug" (for hardware that supports MTE, otherwise
> > "off"), as it's the implied default for all other KASAN modes.
>
> Perhaps we could make this dependent on CONFIG_DEBUG_KERNEL as above.
> I do not think that having the full/debug KASAN enabled on production
> kernels adds any value because for it to be useful requires somebody
> to actually look at the stacktraces; I think that choice should be
> made explicitly if it's a production kernel. My guess is that we'll
> save explaining performance differences and resulting headaches for
> ourselves and others that way.
Ack.
> > Should we somehow control whether to panic the kernel on a tag fault?
> > Another boot time parameter perhaps?
>
> It already respects panic_on_warn, correct?
Yes, but Android is unlikely to enable panic_on_warn as they have
warnings happening all over. AFAIR Pixel 3/4 kernels actually have a
custom patch that enables kernel panic for KASAN crashes specifically
(even though they don't obviously use KASAN in production), and I
think it's better to provide a similar facility upstream. Maybe call
it panic_on_kasan or something?
On Fri, 16 Oct 2020 at 15:17, 'Andrey Konovalov' via kasan-dev
<[email protected]> wrote:
[...]
> > > The intention with this kind of a high level switch is to hide the
> > > implementation details. Arguably, we could add multiple switches that allow
> > > to separately control each KASAN or MTE feature, but I'm not sure there's
> > > much value in that.
> > >
> > > Does this make sense? Any preference regarding the name of the parameter
> > > and its values?
> >
> > KASAN itself used to be a debugging tool only. So introducing an "on"
> > mode which no longer follows this convention may be confusing.
>
> Yeah, perhaps "on" is not the best name here.
>
> > Instead, maybe the following might be less confusing:
> >
> > "full" - current "debug", normal KASAN, all debugging help available.
> > "opt" - current "on", optimized mode for production.
>
> How about "prod" here?
SGTM.
[...]
>
> > > Should we somehow control whether to panic the kernel on a tag fault?
> > > Another boot time parameter perhaps?
> >
> > It already respects panic_on_warn, correct?
>
> Yes, but Android is unlikely to enable panic_on_warn as they have
> warnings happening all over. AFAIR Pixel 3/4 kernels actually have a
> custom patch that enables kernel panic for KASAN crashes specifically
> (even though they don't obviously use KASAN in production), and I
> think it's better to provide a similar facility upstream. Maybe call
> it panic_on_kasan or something?
Best would be if kasan= can take another option, e.g.
"kasan=prod,panic". I think you can change the strcmp() to a
str_has_prefix() for the checks for full/prod/on/off, and then check
if what comes after it is ",panic".
Thanks,
-- Marco
On Thu, Oct 15, 2020 at 3:56 PM Marco Elver <[email protected]> wrote:
>
> On Wed, 14 Oct 2020 at 22:45, Andrey Konovalov <[email protected]> wrote:
> >
[...]
> > @@ -180,6 +182,7 @@ size_t kasan_metadata_size(struct kmem_cache *cache)
> > struct kasan_alloc_meta *kasan_get_alloc_meta(struct kmem_cache *cache,
> > const void *object)
> > {
> > + WARN_ON(!static_branch_unlikely(&kasan_debug));
>
> The WARN_ON condition itself should be unlikely, so that would imply
> that the static branch here should be likely since you're negating it.
Here I was thinking that we should optimize for the production use
case, which shouldn't have kasan_debug enabled, hence the unlikely.
But technically this function shouldn't be called in production
anyway, so likely will do fine too.
> And AFAIK, this function should only be called if kasan_debug is true.
Yes, this WARN_ON is to make sure this doesn't happen.
[...]
> > +/* Whether to use syncronous or asynchronous tag checking. */
> > +static bool kasan_sync __ro_after_init;
>
> s/syncronous/synchronous/
Ack.
>
> > +static int __init early_kasan_mode(char *arg)
> > +{
> > + if (!arg)
> > + return -EINVAL;
> > +
> > + if (strcmp(arg, "on") == 0)
> > + kasan_mode = KASAN_MODE_ON;
> > + else if (strcmp(arg, "debug") == 0)
>
> s/strcmp(..) == 0/!strcmp(..)/ ?
Sounds good.
[...]
> > @@ -60,6 +111,7 @@ void kasan_set_free_info(struct kmem_cache *cache,
> > {
> > struct kasan_alloc_meta *alloc_meta;
> >
> > + WARN_ON(!static_branch_unlikely(&kasan_debug));
>
> What actually happens if any of these are called with !kasan_debug and
> the warning triggers? Is it still valid to execute the below, or
> should it bail out? Or possibly even disable KASAN entirely?
It shouldn't happen, but if it happens maybe it indeed makes sense to
disable KASAN here is a failsafe. It might be tricky to disable MTE
though, but I'll see what we can do here.
Thank you!
On Thu, Oct 15, 2020 at 4:41 PM Marco Elver <[email protected]> wrote:
>
> On Wed, 14 Oct 2020 at 22:44, Andrey Konovalov <[email protected]> wrote:
> > This patchset is not complete (see particular TODOs in the last patch),
> > and I haven't performed any benchmarking yet, but I would like to start the
> > discussion now and hear people's opinions regarding the questions mentioned
> > below.
> >
> > === Overview
> >
> > This patchset adopts the existing hardware tag-based KASAN mode [1] for
> > use in production as a memory corruption mitigation. Hardware tag-based
> > KASAN relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory
> > and pointer tagging. Please see [3] and [4] for detailed analysis of how
> > MTE helps to fight memory safety problems.
> >
> > The current plan is reuse CONFIG_KASAN_HW_TAGS for production, but add a
> > boot time switch, that allows to choose between a debugging mode, that
> > includes all KASAN features as they are, and a production mode, that only
> > includes the essentials like tag checking.
> >
> > It is essential that switching between these modes doesn't require
> > rebuilding the kernel with different configs, as this is required by the
> > Android GKI initiative [5].
> >
> > The last patch of this series adds a new boot time parameter called
> > kasan_mode, which can have the following values:
> >
> > - "kasan_mode=on" - only production features
> > - "kasan_mode=debug" - all debug features
> > - "kasan_mode=off" - no checks at all (not implemented yet)
> >
> > Currently outlined differences between "on" and "debug":
> >
> > - "on" doesn't keep track of alloc/free stacks, and therefore doesn't
> > require the additional memory to store those
> > - "on" uses asyncronous tag checking (not implemented yet)
> >
> > === Questions
> >
> > The intention with this kind of a high level switch is to hide the
> > implementation details. Arguably, we could add multiple switches that allow
> > to separately control each KASAN or MTE feature, but I'm not sure there's
> > much value in that.
> >
> > Does this make sense? Any preference regarding the name of the parameter
> > and its values?
>
> KASAN itself used to be a debugging tool only. So introducing an "on"
> mode which no longer follows this convention may be confusing.
> Instead, maybe the following might be less confusing:
>
> "full" - current "debug", normal KASAN, all debugging help available.
> "opt" - current "on", optimized mode for production.
> "on" - automatic selection => chooses "full" if CONFIG_DEBUG_KERNEL,
> "opt" otherwise.
> "off" - as before.
>
> Also, if there is no other kernel boot parameter named "kasan" yet,
> maybe it could just be "kasan=..." ?
>
> > What should be the default when the parameter is not specified? I would
> > argue that it should be "debug" (for hardware that supports MTE, otherwise
> > "off"), as it's the implied default for all other KASAN modes.
>
> Perhaps we could make this dependent on CONFIG_DEBUG_KERNEL as above.
> I do not think that having the full/debug KASAN enabled on production
> kernels adds any value because for it to be useful requires somebody
> to actually look at the stacktraces; I think that choice should be
> made explicitly if it's a production kernel. My guess is that we'll
> save explaining performance differences and resulting headaches for
> ourselves and others that way.
>
> > Should we somehow control whether to panic the kernel on a tag fault?
> > Another boot time parameter perhaps?
>
> It already respects panic_on_warn, correct?
>
> > Any ideas as to how properly estimate the slowdown? As there's no
> > MTE-enabled hardware yet, the only way to test these patches is use an
> > emulator (like QEMU). The delay that is added by the emulator (for setting
> > and checking the tags) is different from the hardware delay, and this skews
> > the results.
> >
> > A question to KASAN maintainers: what would be the best way to support the
> > "off" mode? I see two potential approaches: add a check into each kasan
> > callback (easier to implement, but we still call kasan callbacks, even
> > though they immediately return), or add inline header wrappers that do the
> > same.
> [...]
CC Kostya and Serban.
On Wed, Oct 14, 2020 at 10:44 PM Andrey Konovalov <[email protected]> wrote:
>
> This patchset is not complete (see particular TODOs in the last patch),
> and I haven't performed any benchmarking yet, but I would like to start the
> discussion now and hear people's opinions regarding the questions mentioned
> below.
>
> === Overview
>
> This patchset adopts the existing hardware tag-based KASAN mode [1] for
> use in production as a memory corruption mitigation. Hardware tag-based
> KASAN relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory
> and pointer tagging. Please see [3] and [4] for detailed analysis of how
> MTE helps to fight memory safety problems.
>
> The current plan is reuse CONFIG_KASAN_HW_TAGS for production, but add a
> boot time switch, that allows to choose between a debugging mode, that
> includes all KASAN features as they are, and a production mode, that only
> includes the essentials like tag checking.
>
> It is essential that switching between these modes doesn't require
> rebuilding the kernel with different configs, as this is required by the
> Android GKI initiative [5].
>
> The last patch of this series adds a new boot time parameter called
> kasan_mode, which can have the following values:
>
> - "kasan_mode=on" - only production features
> - "kasan_mode=debug" - all debug features
> - "kasan_mode=off" - no checks at all (not implemented yet)
>
> Currently outlined differences between "on" and "debug":
>
> - "on" doesn't keep track of alloc/free stacks, and therefore doesn't
> require the additional memory to store those
> - "on" uses asyncronous tag checking (not implemented yet)
>
> === Questions
>
> The intention with this kind of a high level switch is to hide the
> implementation details. Arguably, we could add multiple switches that allow
> to separately control each KASAN or MTE feature, but I'm not sure there's
> much value in that.
>
> Does this make sense? Any preference regarding the name of the parameter
> and its values?
>
> What should be the default when the parameter is not specified? I would
> argue that it should be "debug" (for hardware that supports MTE, otherwise
> "off"), as it's the implied default for all other KASAN modes.
>
> Should we somehow control whether to panic the kernel on a tag fault?
> Another boot time parameter perhaps?
>
> Any ideas as to how properly estimate the slowdown? As there's no
> MTE-enabled hardware yet, the only way to test these patches is use an
> emulator (like QEMU). The delay that is added by the emulator (for setting
> and checking the tags) is different from the hardware delay, and this skews
> the results.
>
> A question to KASAN maintainers: what would be the best way to support the
> "off" mode? I see two potential approaches: add a check into each kasan
> callback (easier to implement, but we still call kasan callbacks, even
> though they immediately return), or add inline header wrappers that do the
> same.
CC Kostya and Serban.
>
> === Notes
>
> This patchset is available here:
>
> https://github.com/xairy/linux/tree/up-prod-mte-rfc1
>
> and on Gerrit here:
>
> https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/3460
>
> This patchset is based on v5 of "kasan: add hardware tag-based mode for
> arm64" patchset [1].
>
> For testing in QEMU hardware tag-based KASAN requires:
>
> 1. QEMU built from master [6] (use "-machine virt,mte=on -cpu max" arguments
> to run).
> 2. GCC version 10.
>
> [1] https://lore.kernel.org/linux-arm-kernel/[email protected]/
> [2] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
> [3] https://arxiv.org/pdf/1802.09517.pdf
> [4] https://github.com/microsoft/MSRC-Security-Research/blob/master/papers/2020/Security%20analysis%20of%20memory%20tagging.pdf
> [5] https://source.android.com/devices/architecture/kernel/generic-kernel-image
> [6] https://github.com/qemu/qemu
>
> Andrey Konovalov (8):
> kasan: simplify quarantine_put call
> kasan: rename get_alloc/free_info
> kasan: introduce set_alloc_info
> kasan: unpoison stack only with CONFIG_KASAN_STACK
> kasan: mark kasan_init_tags as __init
> kasan, arm64: move initialization message
> arm64: kasan: Add system_supports_tags helper
> kasan: add and integrate kasan_mode boot param
>
> arch/arm64/include/asm/memory.h | 1 +
> arch/arm64/kernel/sleep.S | 2 +-
> arch/arm64/mm/kasan_init.c | 3 ++
> arch/x86/kernel/acpi/wakeup_64.S | 2 +-
> include/linux/kasan.h | 14 ++---
> mm/kasan/common.c | 90 ++++++++++++++++++--------------
> mm/kasan/generic.c | 18 ++++---
> mm/kasan/hw_tags.c | 63 ++++++++++++++++++++--
> mm/kasan/kasan.h | 25 ++++++---
> mm/kasan/quarantine.c | 5 +-
> mm/kasan/report.c | 22 +++++---
> mm/kasan/report_sw_tags.c | 2 +-
> mm/kasan/sw_tags.c | 14 +++--
> 13 files changed, 182 insertions(+), 79 deletions(-)
>
> --
> 2.28.0.1011.ga647a8990f-goog
>
On Fri, Oct 16, 2020 at 3:31 PM Marco Elver <[email protected]> wrote:
>
> On Fri, 16 Oct 2020 at 15:17, 'Andrey Konovalov' via kasan-dev
> <[email protected]> wrote:
> [...]
> > > > The intention with this kind of a high level switch is to hide the
> > > > implementation details. Arguably, we could add multiple switches that allow
> > > > to separately control each KASAN or MTE feature, but I'm not sure there's
> > > > much value in that.
> > > >
> > > > Does this make sense? Any preference regarding the name of the parameter
> > > > and its values?
> > >
> > > KASAN itself used to be a debugging tool only. So introducing an "on"
> > > mode which no longer follows this convention may be confusing.
> >
> > Yeah, perhaps "on" is not the best name here.
> >
> > > Instead, maybe the following might be less confusing:
> > >
> > > "full" - current "debug", normal KASAN, all debugging help available.
> > > "opt" - current "on", optimized mode for production.
> >
> > How about "prod" here?
>
> SGTM.
>
> [...]
> >
> > > > Should we somehow control whether to panic the kernel on a tag fault?
> > > > Another boot time parameter perhaps?
> > >
> > > It already respects panic_on_warn, correct?
> >
> > Yes, but Android is unlikely to enable panic_on_warn as they have
> > warnings happening all over. AFAIR Pixel 3/4 kernels actually have a
> > custom patch that enables kernel panic for KASAN crashes specifically
> > (even though they don't obviously use KASAN in production), and I
> > think it's better to provide a similar facility upstream. Maybe call
> > it panic_on_kasan or something?
>
> Best would be if kasan= can take another option, e.g.
> "kasan=prod,panic". I think you can change the strcmp() to a
> str_has_prefix() for the checks for full/prod/on/off, and then check
> if what comes after it is ",panic".
>
> Thanks,
> -- Marco
CC Kostya and Serban.
On Wed, 14 Oct 2020 at 22:44, Andrey Konovalov <[email protected]> wrote:
[...]
> A question to KASAN maintainers: what would be the best way to support the
> "off" mode? I see two potential approaches: add a check into each kasan
> callback (easier to implement, but we still call kasan callbacks, even
> though they immediately return), or add inline header wrappers that do the
> same.
This is tricky, because we don't know how bad the performance will be
if we keep them as calls. We'd have to understand the performance
impact of keeping them as calls, and if the performance impact is
acceptable or not.
Without understanding the performance impact, the only viable option I
see is to add __always_inline kasan_foo() wrappers, which use the
static branch to guard calls to __kasan_foo().
Thanks,
-- Marco
On Mon, Oct 19, 2020 at 2:23 PM Marco Elver <[email protected]> wrote:
>
> On Wed, 14 Oct 2020 at 22:44, Andrey Konovalov <[email protected]> wrote:
> [...]
> > A question to KASAN maintainers: what would be the best way to support the
> > "off" mode? I see two potential approaches: add a check into each kasan
> > callback (easier to implement, but we still call kasan callbacks, even
> > though they immediately return), or add inline header wrappers that do the
> > same.
>
> This is tricky, because we don't know how bad the performance will be
> if we keep them as calls. We'd have to understand the performance
> impact of keeping them as calls, and if the performance impact is
> acceptable or not.
>
> Without understanding the performance impact, the only viable option I
> see is to add __always_inline kasan_foo() wrappers, which use the
> static branch to guard calls to __kasan_foo().
This sounds reasonable to me.
On Tue, Oct 20, 2020 at 7:34 AM Dmitry Vyukov <[email protected]> wrote:
>
> On Tue, Oct 20, 2020 at 12:51 AM Kostya Serebryany <[email protected]> wrote:
> >
> > Hi,
> > I would like to hear opinions from others in CC on these choices:
> > * Production use of In-kernel MTE should be based on stripped-down
> > KASAN, or implemented independently?
>
> Andrey, what are the fundamental consequences of basing MTE on KASAN?
> I would assume that there are none as we can change KASAN code and
> special case some code paths as necessary.
The main consequence is psychological and manifests in inheriting the name :)
But generally you're right. As we can change KASAN code, we can do
whatever we want, like adding fast paths for MTE, etc. If we Ctrl+C
Ctrl+V KASAN common code, we could potentially do some micro
optimizations (like avoiding a couple of checks), but I doubt that
will make any difference.
> > * Should we aim at a single boot-time flag (with several values) or
> > for several independent flags (OFF/SYNC/ASYNC, Stack traces on/off)
>
> We won't be able to answer this question for several years until we
> have actual hardware/users...
> It's definitely safer to aim at multiple options. I would reuse the fs
> opt parsing code as we seem to have lots of potential things to
> configure so that we can do:
> kasan_options=quarantine=off,fault=panic,trap=async
>
> I am also always confused by the term "debug" when configuring the
> kernel. In some cases it's for debugging of the subsystem (for
> developers of KASAN), in some cases it adds additional checks to catch
> misuses of the subsystem. in some - it just adds more debugging output
> on console. And in this case it's actually neither of these. But I am
> not sure what's a better name ("full"?). Even if we split options into
> multiple, we still can have some kind of presents that just flip all
> other options into reasonable values.
OK, let me try to incorporate the feedback I've heard so far into the
next version.
>
> > Andrey, please give us some idea of the CPU and RAM overheads other
> > than those coming from MTE
> > * stack trace collection and storage
> > * adding redzones to every allocation - not strictly needed for MTE,
> > but convenient to store the stack trace IDs.
> >
> > Andrey: with production MTE we should not be using quarantine, which
> > means storing the stack trace IDs
> > in the deallocated memory doesn't provide good report quality.
> > We may need to consider another approach, e.g. the one used in HWASAN
> > (separate ring buffer, per thread or per core)
My current priority is cleaning up the mode where stack traces are
disabled and estimating the slowdown from KASAN callbacks. Once done
with that, I'll switch to these ones.
Hi,
I would like to hear opinions from others in CC on these choices:
* Production use of In-kernel MTE should be based on stripped-down
KASAN, or implemented independently?
* Should we aim at a single boot-time flag (with several values) or
for several independent flags (OFF/SYNC/ASYNC, Stack traces on/off)
Andrey, please give us some idea of the CPU and RAM overheads other
than those coming from MTE
* stack trace collection and storage
* adding redzones to every allocation - not strictly needed for MTE,
but convenient to store the stack trace IDs.
Andrey: with production MTE we should not be using quarantine, which
means storing the stack trace IDs
in the deallocated memory doesn't provide good report quality.
We may need to consider another approach, e.g. the one used in HWASAN
(separate ring buffer, per thread or per core)
--kcc
On Fri, Oct 16, 2020 at 8:52 AM Andrey Konovalov <[email protected]> wrote:
>
> On Fri, Oct 16, 2020 at 3:31 PM Marco Elver <[email protected]> wrote:
> >
> > On Fri, 16 Oct 2020 at 15:17, 'Andrey Konovalov' via kasan-dev
> > <[email protected]> wrote:
> > [...]
> > > > > The intention with this kind of a high level switch is to hide the
> > > > > implementation details. Arguably, we could add multiple switches that allow
> > > > > to separately control each KASAN or MTE feature, but I'm not sure there's
> > > > > much value in that.
> > > > >
> > > > > Does this make sense? Any preference regarding the name of the parameter
> > > > > and its values?
> > > >
> > > > KASAN itself used to be a debugging tool only. So introducing an "on"
> > > > mode which no longer follows this convention may be confusing.
> > >
> > > Yeah, perhaps "on" is not the best name here.
> > >
> > > > Instead, maybe the following might be less confusing:
> > > >
> > > > "full" - current "debug", normal KASAN, all debugging help available.
> > > > "opt" - current "on", optimized mode for production.
> > >
> > > How about "prod" here?
> >
> > SGTM.
> >
> > [...]
> > >
> > > > > Should we somehow control whether to panic the kernel on a tag fault?
> > > > > Another boot time parameter perhaps?
> > > >
> > > > It already respects panic_on_warn, correct?
> > >
> > > Yes, but Android is unlikely to enable panic_on_warn as they have
> > > warnings happening all over. AFAIR Pixel 3/4 kernels actually have a
> > > custom patch that enables kernel panic for KASAN crashes specifically
> > > (even though they don't obviously use KASAN in production), and I
> > > think it's better to provide a similar facility upstream. Maybe call
> > > it panic_on_kasan or something?
> >
> > Best would be if kasan= can take another option, e.g.
> > "kasan=prod,panic". I think you can change the strcmp() to a
> > str_has_prefix() for the checks for full/prod/on/off, and then check
> > if what comes after it is ",panic".
> >
> > Thanks,
> > -- Marco
>
> CC Kostya and Serban.
On Tue, Oct 20, 2020 at 12:51 AM Kostya Serebryany <[email protected]> wrote:
>
> Hi,
> I would like to hear opinions from others in CC on these choices:
> * Production use of In-kernel MTE should be based on stripped-down
> KASAN, or implemented independently?
Andrey, what are the fundamental consequences of basing MTE on KASAN?
I would assume that there are none as we can change KASAN code and
special case some code paths as necessary.
> * Should we aim at a single boot-time flag (with several values) or
> for several independent flags (OFF/SYNC/ASYNC, Stack traces on/off)
We won't be able to answer this question for several years until we
have actual hardware/users...
It's definitely safer to aim at multiple options. I would reuse the fs
opt parsing code as we seem to have lots of potential things to
configure so that we can do:
kasan_options=quarantine=off,fault=panic,trap=async
I am also always confused by the term "debug" when configuring the
kernel. In some cases it's for debugging of the subsystem (for
developers of KASAN), in some cases it adds additional checks to catch
misuses of the subsystem. in some - it just adds more debugging output
on console. And in this case it's actually neither of these. But I am
not sure what's a better name ("full"?). Even if we split options into
multiple, we still can have some kind of presents that just flip all
other options into reasonable values.
> Andrey, please give us some idea of the CPU and RAM overheads other
> than those coming from MTE
> * stack trace collection and storage
> * adding redzones to every allocation - not strictly needed for MTE,
> but convenient to store the stack trace IDs.
>
> Andrey: with production MTE we should not be using quarantine, which
> means storing the stack trace IDs
> in the deallocated memory doesn't provide good report quality.
> We may need to consider another approach, e.g. the one used in HWASAN
> (separate ring buffer, per thread or per core)
>
> --kcc
>
>
> On Fri, Oct 16, 2020 at 8:52 AM Andrey Konovalov <[email protected]> wrote:
> >
> > On Fri, Oct 16, 2020 at 3:31 PM Marco Elver <[email protected]> wrote:
> > >
> > > On Fri, 16 Oct 2020 at 15:17, 'Andrey Konovalov' via kasan-dev
> > > <[email protected]> wrote:
> > > [...]
> > > > > > The intention with this kind of a high level switch is to hide the
> > > > > > implementation details. Arguably, we could add multiple switches that allow
> > > > > > to separately control each KASAN or MTE feature, but I'm not sure there's
> > > > > > much value in that.
> > > > > >
> > > > > > Does this make sense? Any preference regarding the name of the parameter
> > > > > > and its values?
> > > > >
> > > > > KASAN itself used to be a debugging tool only. So introducing an "on"
> > > > > mode which no longer follows this convention may be confusing.
> > > >
> > > > Yeah, perhaps "on" is not the best name here.
> > > >
> > > > > Instead, maybe the following might be less confusing:
> > > > >
> > > > > "full" - current "debug", normal KASAN, all debugging help available.
> > > > > "opt" - current "on", optimized mode for production.
> > > >
> > > > How about "prod" here?
> > >
> > > SGTM.
> > >
> > > [...]
> > > >
> > > > > > Should we somehow control whether to panic the kernel on a tag fault?
> > > > > > Another boot time parameter perhaps?
> > > > >
> > > > > It already respects panic_on_warn, correct?
> > > >
> > > > Yes, but Android is unlikely to enable panic_on_warn as they have
> > > > warnings happening all over. AFAIR Pixel 3/4 kernels actually have a
> > > > custom patch that enables kernel panic for KASAN crashes specifically
> > > > (even though they don't obviously use KASAN in production), and I
> > > > think it's better to provide a similar facility upstream. Maybe call
> > > > it panic_on_kasan or something?
> > >
> > > Best would be if kasan= can take another option, e.g.
> > > "kasan=prod,panic". I think you can change the strcmp() to a
> > > str_has_prefix() for the checks for full/prod/on/off, and then check
> > > if what comes after it is ",panic".
> > >
> > > Thanks,
> > > -- Marco
> >
> > CC Kostya and Serban.
On Tue, Oct 20, 2020 at 8:23 AM Hillf Danton <[email protected]> wrote:
>
> On Wed, 14 Oct 2020 22:44:35 +0200
> >
> > #ifdef CONFIG_KASAN_HW_TAGS
> > +#define arch_system_supports_tags() system_supports_mte()
>
> s/system_supports/support/ in order to look more like the brother of
>
> > #define arch_init_tags(max_tag) mte_init_tags(max_tag)
Well, init_tags() does initialize tags, but supports_tags() doesn't
not enable support for tags, and rather returns its status. So using
"support" here would be wrong from the English language standpoint.