2021-02-08 18:50:27

by Vincenzo Frascino

Subject: [PATCH v12 0/7] arm64: ARMv8.5-A: MTE: Add async mode support

This patchset implements asynchronous mode support for the ARMv8.5-A
Memory Tagging Extension (MTE), a debugging feature that uses
architectural support to detect C and C++ memory errors such as
buffer overflow, use-after-free, use-after-return, etc.

MTE is built on top of the AArch64 v8.0 virtual address tagging TBI
(Top Byte Ignore) feature and allows a task to set a 4-bit tag on any
subset of its address space that is a multiple of the 16-byte granule.
MTE is based on a lock-key mechanism where the lock is the tag associated
with the physical memory and the key is the tag associated with the
virtual address.
When MTE is enabled and tags are set for ranges of the address space of a
task, the PE (Processing Element) compares the tag of the physical memory
with the tag of the virtual address (tag check operation). Access to the
memory is granted only if the two tags match. In case of a mismatch the PE
raises an exception.
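
As an illustration only, the lock-key check described above can be modelled
in plain user-space C. This is a sketch, not kernel or hardware code: the
array lock_tags, the helper names and MODEL_GRANULES are hypothetical and
exist only for this example. The granule size and the 4-bit tag come from
the text above; the tag position in address bits [59:56] is the
architectural location of the MTE logical tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE   16u      /* one tag per 16-byte granule */
#define TAG_SHIFT      56       /* logical tag sits in address bits [59:56] */
#define TAG_MASK       0xfull   /* 4-bit tag */
#define MODEL_GRANULES 64u      /* hypothetical size of the modelled memory */

/* The "lock": one allocation tag per granule of modelled memory. */
static uint8_t lock_tags[MODEL_GRANULES];

/* Place a 4-bit "key" in the top byte of an address. */
static uint64_t ptr_set_tag(uint64_t addr, uint8_t tag)
{
	addr &= ~(TAG_MASK << TAG_SHIFT);
	return addr | (((uint64_t)tag & TAG_MASK) << TAG_SHIFT);
}

/* Extract the 4-bit "key" from a tagged address. */
static uint8_t ptr_get_tag(uint64_t addr)
{
	return (uint8_t)((addr >> TAG_SHIFT) & TAG_MASK);
}

/* Tag every granule overlapped by [offset, offset + size). */
static void mem_set_tag_range(uint64_t offset, size_t size, uint8_t tag)
{
	uint64_t first = offset / GRANULE_SIZE;
	uint64_t end = (offset + size + GRANULE_SIZE - 1) / GRANULE_SIZE;

	if (size == 0)
		return;

	assert(end <= MODEL_GRANULES);
	for (uint64_t g = first; g < end; g++)
		lock_tags[g] = (uint8_t)(tag & TAG_MASK);
}

/* Tag check: the access is allowed only when key and lock match. */
static bool tag_check(uint64_t tagged_addr)
{
	/* Drop the top byte, as TBI does for address translation. */
	uint64_t offset = tagged_addr & ~(0xffull << TAG_SHIFT);
	uint64_t granule = offset / GRANULE_SIZE;

	assert(granule < MODEL_GRANULES);
	return lock_tags[granule] == ptr_get_tag(tagged_addr);
}
```

In this model a pointer tagged 0x5 passes the check anywhere inside a
granule tagged 0x5 and fails one byte past it, which is the behaviour that
lets MTE catch a buffer overflow into a neighbouring, differently tagged
object.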

The exception can be handled synchronously or asynchronously. When the
asynchronous mode is enabled:
 - Upon fault the PE updates the TFSR_EL1 register.
 - The kernel detects the change during one of the following:
   - Context switching
   - Return to user/EL0
   - Kernel entry from EL1
   - Kernel exit to EL1
 - If the register has been updated by the PE the kernel clears it and
   reports the error.
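
The flow above can be sketched as a small user-space model. It is not the
code from this series: model_tfsr stands in for TFSR_EL1, and every
function name here is hypothetical. The only architectural detail assumed
is that TF1 is bit 1 of TFSR_EL1, the bit that records faults on accesses
made through kernel (TTBR1) addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TFSR_TF1 (1ull << 1)	/* TF1 is bit 1 of TFSR_EL1 */

static uint64_t model_tfsr;	/* stand-in for the TFSR_EL1 register */
static unsigned int reports;	/* number of async reports emitted */

/* What the PE does on a tag mismatch in async mode: record, do not trap. */
static void model_pe_tag_fault(void)
{
	model_tfsr |= MODEL_TFSR_TF1;
}

/*
 * What the kernel does at each checkpoint (context switch, return to
 * user, kernel entry/exit): if the bit is set, clear the register and
 * report once. Returns true when a report was emitted.
 */
static bool model_check_tfsr(void)
{
	if (!(model_tfsr & MODEL_TFSR_TF1))
		return false;

	model_tfsr = 0;
	reports++;
	return true;
}
```

Because the bit is sticky, several faults that occur between two
checkpoints collapse into a single report, and the report carries no
faulting address. That is the trade-off of asynchronous mode.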

The series is based on linux-next/akpm.

To simplify the testing a tree with the new patches on top has been made
available at [1].

[1] https://git.gitlab.arm.com/linux-arm/linux-vf.git mte/v11.async.akpm

Changes:
--------
v12:
- Fixed a bug affecting kernel functions allowed to read
beyond buffer boundaries.
- Added support for save/restore of TFSR_EL1 register
during suspend/resume operations.
- Rebased on latest linux-next/akpm.
v11:
- Added patch that disables KUNIT tests in async mode
v10:
- Rebase on the latest linux-next/akpm
- Address review comments.
v9:
- Rebase on the latest linux-next/akpm
- Address review comments.
v8:
- Address review comments.
v7:
- Fix a warning reported by kernel test robot. This
time for real.
v6:
- Drop patches that forbid KASAN KUNIT tests when async
mode is enabled.
- Fix a warning reported by kernel test robot.
- Address review comments.
v5:
- Rebase the series on linux-next/akpm.
- Forbid execution for KASAN KUNIT tests when async
mode is enabled.
- Dropped patch to inline mte_assign_mem_tag_range().
- Address review comments.
v4:
- Added support for kasan.mode (sync/async) kernel
command line parameter.
- Addressed review comments.
v3:
- Exposed kasan_hw_tags_mode to convert the internal
KASAN representation.
- Added dsb() for kernel exit paths in arm64.
- Addressed review comments.
v2:
- Fixed a compilation issue reported by krobot.
- General cleanup.

Cc: Andrew Morton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Evgenii Stepanov <[email protected]>
Cc: Branislav Rankov <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>

Andrey Konovalov (1):
kasan: don't run tests in async mode

Vincenzo Frascino (6):
arm64: mte: Add asynchronous mode support
kasan: Add KASAN mode kernel parameter
kasan: Add report for async mode
arm64: mte: Enable TCO in functions that can read beyond buffer limits
arm64: mte: Enable async tag check fault
arm64: mte: Save/Restore TFSR_EL1 during suspend

Documentation/dev-tools/kasan.rst | 9 +++
arch/arm64/include/asm/memory.h | 3 +-
arch/arm64/include/asm/mte-kasan.h | 9 ++-
arch/arm64/include/asm/mte.h | 36 +++++++++++
arch/arm64/include/asm/uaccess.h | 19 ++++++
arch/arm64/include/asm/word-at-a-time.h | 4 ++
arch/arm64/kernel/entry-common.c | 6 ++
arch/arm64/kernel/mte.c | 84 ++++++++++++++++++++++++-
arch/arm64/kernel/suspend.c | 3 +
include/linux/kasan.h | 6 ++
lib/test_kasan.c | 6 +-
mm/kasan/hw_tags.c | 52 ++++++++++++++-
mm/kasan/kasan.h | 7 ++-
mm/kasan/report.c | 17 ++++-
14 files changed, 251 insertions(+), 10 deletions(-)

--
2.30.0


2021-02-08 18:50:37

by Vincenzo Frascino

Subject: [PATCH v12 3/7] kasan: Add report for async mode

KASAN provides an asynchronous mode of execution.

Add reporting functionality for this mode.

Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Andrey Konovalov <[email protected]>
---
include/linux/kasan.h | 6 ++++++
mm/kasan/report.c | 17 ++++++++++++++++-
2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 1011e4f30284..6d8f3227c264 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -367,6 +367,12 @@ static inline void *kasan_reset_tag(const void *addr)

#endif /* CONFIG_KASAN_SW_TAGS || CONFIG_KASAN_HW_TAGS*/

+#ifdef CONFIG_KASAN_HW_TAGS
+
+void kasan_report_async(void);
+
+#endif /* CONFIG_KASAN_HW_TAGS */
+
#ifdef CONFIG_KASAN_SW_TAGS
void __init kasan_init_sw_tags(void);
#else
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 87b271206163..f147633f1f2b 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -87,7 +87,8 @@ static void start_report(unsigned long *flags)

static void end_report(unsigned long *flags, unsigned long addr)
{
- trace_error_report_end(ERROR_DETECTOR_KASAN, addr);
+ if (!kasan_flag_async)
+ trace_error_report_end(ERROR_DETECTOR_KASAN, addr);
pr_err("==================================================================\n");
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
spin_unlock_irqrestore(&report_lock, *flags);
@@ -360,6 +361,20 @@ void kasan_report_invalid_free(void *object, unsigned long ip)
end_report(&flags, (unsigned long)object);
}

+#ifdef CONFIG_KASAN_HW_TAGS
+void kasan_report_async(void)
+{
+	unsigned long flags;
+
+	start_report(&flags);
+	pr_err("BUG: KASAN: invalid-access\n");
+	pr_err("Asynchronous mode enabled: no access details available\n");
+	pr_err("\n");
+	dump_stack();
+	end_report(&flags, 0);
+}
+#endif /* CONFIG_KASAN_HW_TAGS */
+
static void __kasan_report(unsigned long addr, size_t size, bool is_write,
unsigned long ip)
{
--
2.30.0

2021-02-08 18:51:08

by Vincenzo Frascino

Subject: [PATCH v12 2/7] kasan: Add KASAN mode kernel parameter

Architectures supported by KASAN_HW_TAGS can provide a sync or async mode
of execution. On MTE-enabled arm64 hardware, for example, these correspond
to the synchronous and asynchronous tag checking modes.
In synchronous mode, an exception is triggered if a tag check fault occurs.
In asynchronous mode, if a tag check fault occurs, the TFSR_EL1 register is
updated asynchronously. The kernel checks the corresponding bits
periodically.

KASAN requires a specific kernel command line parameter to make use of
these hw features.

Add KASAN HW execution mode kernel command line parameter.

Note: This patch adds the kasan.mode kernel parameter and the
sync/async kernel command line options to enable the described features.
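
For readers who want to try the parsing rule outside the kernel, the sketch
below mirrors the logic of early_kasan_mode() from this patch in plain C.
The names parse_kasan_mode, mode_is_async and the enum model_mode are
hypothetical, and -1 is used here where the patch returns -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for enum kasan_arg_mode. */
enum model_mode {
	MODEL_MODE_DEFAULT,
	MODEL_MODE_SYNC,
	MODEL_MODE_ASYNC,
};

/*
 * A missing value or "sync" selects sync mode, "async" selects async
 * mode, anything else is rejected and leaves *out untouched.
 * Returns 0 on success, -1 on an unrecognised value.
 */
static int parse_kasan_mode(const char *arg, enum model_mode *out)
{
	if (!arg || !strcmp(arg, "sync"))
		*out = MODEL_MODE_SYNC;
	else if (!strcmp(arg, "async"))
		*out = MODEL_MODE_ASYNC;
	else
		return -1;

	return 0;
}

/* Only an explicit "async" request turns the async flag on. */
static bool mode_is_async(enum model_mode mode)
{
	return mode == MODEL_MODE_ASYNC;
}
```

On a real system the equivalent request is made by booting with
kasan.mode=async on the kernel command line; without the parameter the
mode stays at its default, which behaves as sync.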

Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
[ Add a new var instead of exposing kasan_arg_mode to be consistent with
flags for other command line arguments. ]
Signed-off-by: Andrey Konovalov <[email protected]>
---
Documentation/dev-tools/kasan.rst | 9 ++++++
lib/test_kasan.c | 2 +-
mm/kasan/hw_tags.c | 52 ++++++++++++++++++++++++++++++-
mm/kasan/kasan.h | 7 +++--
4 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index e022b7506e37..e3dca4d1f2a7 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -161,6 +161,15 @@ particular KASAN features.

- ``kasan=off`` or ``=on`` controls whether KASAN is enabled (default: ``on``).

+- ``kasan.mode=sync`` or ``=async`` controls whether KASAN is configured in
+ synchronous or asynchronous mode of execution (default: ``sync``).
+ Synchronous mode: a bad access is detected immediately when a tag
+ check fault occurs.
+ Asynchronous mode: a bad access detection is delayed. When a tag check
+ fault occurs, the information is stored in hardware (in the TFSR_EL1
+ register for arm64). The kernel periodically checks the hardware and
+ only reports tag faults during these checks.
+
- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack
traces collection (default: ``on`` for ``CONFIG_DEBUG_KERNEL=y``, otherwise
``off``).
diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index d16ec9e66806..7285dcf9fcc1 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -97,7 +97,7 @@ static void kasan_test_exit(struct kunit *test)
READ_ONCE(fail_data.report_found)); \
if (IS_ENABLED(CONFIG_KASAN_HW_TAGS)) { \
if (READ_ONCE(fail_data.report_found)) \
- hw_enable_tagging(); \
+ hw_enable_tagging_sync(); \
migrate_enable(); \
} \
} while (0)
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index e529428e7a11..f537d2240811 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -25,6 +25,12 @@ enum kasan_arg {
KASAN_ARG_ON,
};

+enum kasan_arg_mode {
+ KASAN_ARG_MODE_DEFAULT,
+ KASAN_ARG_MODE_SYNC,
+ KASAN_ARG_MODE_ASYNC,
+};
+
enum kasan_arg_stacktrace {
KASAN_ARG_STACKTRACE_DEFAULT,
KASAN_ARG_STACKTRACE_OFF,
@@ -38,6 +44,7 @@ enum kasan_arg_fault {
};

static enum kasan_arg kasan_arg __ro_after_init;
+static enum kasan_arg_mode kasan_arg_mode __ro_after_init;
static enum kasan_arg_stacktrace kasan_arg_stacktrace __ro_after_init;
static enum kasan_arg_fault kasan_arg_fault __ro_after_init;

@@ -45,6 +52,10 @@ static enum kasan_arg_fault kasan_arg_fault __ro_after_init;
DEFINE_STATIC_KEY_FALSE(kasan_flag_enabled);
EXPORT_SYMBOL(kasan_flag_enabled);

+/* Whether the asynchronous mode is enabled. */
+bool kasan_flag_async __ro_after_init;
+EXPORT_SYMBOL_GPL(kasan_flag_async);
+
/* Whether to collect alloc/free stack traces. */
DEFINE_STATIC_KEY_FALSE(kasan_flag_stacktrace);

@@ -68,6 +79,21 @@ static int __init early_kasan_flag(char *arg)
}
early_param("kasan", early_kasan_flag);

+/* kasan.mode=sync/async */
+static int __init early_kasan_mode(char *arg)
+{
+	/* If arg is not set the default mode is sync */
+	if ((!arg) || !strcmp(arg, "sync"))
+		kasan_arg_mode = KASAN_ARG_MODE_SYNC;
+	else if (!strcmp(arg, "async"))
+		kasan_arg_mode = KASAN_ARG_MODE_ASYNC;
+	else
+		return -EINVAL;
+
+	return 0;
+}
+early_param("kasan.mode", early_kasan_mode);
+
/* kasan.stacktrace=off/on */
static int __init early_kasan_flag_stacktrace(char *arg)
{
@@ -115,7 +141,15 @@ void kasan_init_hw_tags_cpu(void)
return;

hw_init_tags(KASAN_TAG_MAX);
- hw_enable_tagging();
+
+ /*
+ * Enable async mode only when explicitly requested through
+ * the command line.
+ */
+ if (kasan_arg_mode == KASAN_ARG_MODE_ASYNC)
+ hw_enable_tagging_async();
+ else
+ hw_enable_tagging_sync();
}

/* kasan_init_hw_tags() is called once on boot CPU. */
@@ -132,6 +166,22 @@ void __init kasan_init_hw_tags(void)
/* Enable KASAN. */
static_branch_enable(&kasan_flag_enabled);

+ switch (kasan_arg_mode) {
+ case KASAN_ARG_MODE_DEFAULT:
+ /*
+ * Default to sync mode.
+ * Do nothing, kasan_flag_async keeps its default value.
+ */
+ break;
+ case KASAN_ARG_MODE_SYNC:
+ /* Do nothing, kasan_flag_async keeps its default value. */
+ break;
+ case KASAN_ARG_MODE_ASYNC:
+ /* Async mode enabled. */
+ kasan_flag_async = true;
+ break;
+ }
+
switch (kasan_arg_stacktrace) {
case KASAN_ARG_STACKTRACE_DEFAULT:
/*
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 4fb8106f8e31..dd14e8870023 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -21,6 +21,7 @@ static inline bool kasan_stack_collection_enabled(void)
#endif

extern bool kasan_flag_panic __ro_after_init;
+extern bool kasan_flag_async __ro_after_init;

#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
#define KASAN_GRANULE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
@@ -294,7 +295,8 @@ static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
#define arch_set_mem_tag_range(addr, size, tag) ((void *)(addr))
#endif

-#define hw_enable_tagging() arch_enable_tagging()
+#define hw_enable_tagging_sync() arch_enable_tagging_sync()
+#define hw_enable_tagging_async() arch_enable_tagging_async()
#define hw_init_tags(max_tag) arch_init_tags(max_tag)
#define hw_set_tagging_report_once(state) arch_set_tagging_report_once(state)
#define hw_get_random_tag() arch_get_random_tag()
@@ -303,7 +305,8 @@ static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)

#else /* CONFIG_KASAN_HW_TAGS */

-#define hw_enable_tagging()
+#define hw_enable_tagging_sync()
+#define hw_enable_tagging_async()
#define hw_set_tagging_report_once(state)

#endif /* CONFIG_KASAN_HW_TAGS */
--
2.30.0

2021-02-08 18:51:33

by Vincenzo Frascino

Subject: [PATCH v12 1/7] arm64: mte: Add asynchronous mode support

MTE provides an asynchronous mode for detecting tag exceptions. In
particular, instead of triggering a fault, the arm64 core updates a
register that the kernel checks after the asynchronous tag check
fault has occurred.

Add support for MTE asynchronous mode.

The exception handling mechanism will be added with a future patch.

Note: KASAN HW activates async mode via kasan.mode kernel parameter.
The default mode is set to synchronous.
The code that verifies the status of TFSR_EL1 will be added with a
future patch.
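
The mode switch in this patch is a read-modify-write of the TCF field of
SCTLR_EL1. The sketch below models that operation on an ordinary 64-bit
variable. It is illustrative only: the MODEL_* names and clear_set() are
hypothetical, while the field position (bits [41:40]) and the encodings
(0b01 synchronous, 0b10 asynchronous) follow the architecture definition
of SCTLR_EL1.TCF.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_TCF_SHIFT 40
#define MODEL_TCF_NONE  (0ull << MODEL_TCF_SHIFT)	/* no tag check faults */
#define MODEL_TCF_SYNC  (1ull << MODEL_TCF_SHIFT)	/* synchronous exception */
#define MODEL_TCF_ASYNC (2ull << MODEL_TCF_SHIFT)	/* accumulate in TFSR_EL1 */
#define MODEL_TCF_MASK  (3ull << MODEL_TCF_SHIFT)

/*
 * Clear the bits in 'clear', then set the bits in 'set', leaving every
 * other bit of the register value unchanged.
 */
static uint64_t clear_set(uint64_t reg, uint64_t clear, uint64_t set)
{
	return (reg & ~clear) | set;
}

/* Select a tag check fault mode without disturbing other control bits. */
static uint64_t model_enable_kernel(uint64_t sctlr, uint64_t tcf)
{
	assert((tcf & ~MODEL_TCF_MASK) == 0);
	return clear_set(sctlr, MODEL_TCF_MASK, tcf);
}
```

Clearing the whole field before setting the new value matters because the
two modes use different bits: setting ASYNC on top of SYNC without the
clear would leave 0b11 in the field rather than 0b10.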

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
---
arch/arm64/include/asm/memory.h | 3 ++-
arch/arm64/include/asm/mte-kasan.h | 9 +++++++--
arch/arm64/kernel/mte.c | 16 ++++++++++++++--
3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index c759faf7a1ff..91515383d763 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -243,7 +243,8 @@ static inline const void *__tag_set(const void *addr, u8 tag)
}

#ifdef CONFIG_KASAN_HW_TAGS
-#define arch_enable_tagging() mte_enable_kernel()
+#define arch_enable_tagging_sync() mte_enable_kernel_sync()
+#define arch_enable_tagging_async() mte_enable_kernel_async()
#define arch_set_tagging_report_once(state) mte_set_report_once(state)
#define arch_init_tags(max_tag) mte_init_tags(max_tag)
#define arch_get_random_tag() mte_get_random_tag()
diff --git a/arch/arm64/include/asm/mte-kasan.h b/arch/arm64/include/asm/mte-kasan.h
index 3748d5bb88c0..8ad981069afb 100644
--- a/arch/arm64/include/asm/mte-kasan.h
+++ b/arch/arm64/include/asm/mte-kasan.h
@@ -29,7 +29,8 @@ u8 mte_get_mem_tag(void *addr);
u8 mte_get_random_tag(void);
void *mte_set_mem_tag_range(void *addr, size_t size, u8 tag);

-void mte_enable_kernel(void);
+void mte_enable_kernel_sync(void);
+void mte_enable_kernel_async(void);
void mte_init_tags(u64 max_tag);

void mte_set_report_once(bool state);
@@ -55,7 +56,11 @@ static inline void *mte_set_mem_tag_range(void *addr, size_t size, u8 tag)
return addr;
}

-static inline void mte_enable_kernel(void)
+static inline void mte_enable_kernel_sync(void)
+{
+}
+
+static inline void mte_enable_kernel_async(void)
{
}

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index c63b3d7a3cd9..92078e1eb627 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -153,11 +153,23 @@ void mte_init_tags(u64 max_tag)
write_sysreg_s(SYS_GCR_EL1_RRND | gcr_kernel_excl, SYS_GCR_EL1);
}

-void mte_enable_kernel(void)
+static inline void __mte_enable_kernel(const char *mode, unsigned long tcf)
{
/* Enable MTE Sync Mode for EL1. */
- sysreg_clear_set(sctlr_el1, SCTLR_ELx_TCF_MASK, SCTLR_ELx_TCF_SYNC);
+ sysreg_clear_set(sctlr_el1, SCTLR_ELx_TCF_MASK, tcf);
isb();
+
+ pr_info_once("MTE: enabled in %s mode at EL1\n", mode);
+}
+
+void mte_enable_kernel_sync(void)
+{
+ __mte_enable_kernel("synchronous", SCTLR_ELx_TCF_SYNC);
+}
+
+void mte_enable_kernel_async(void)
+{
+ __mte_enable_kernel("asynchronous", SCTLR_ELx_TCF_ASYNC);
}

void mte_set_report_once(bool state)
--
2.30.0

2021-02-08 18:52:56

by Vincenzo Frascino

Subject: [PATCH v12 5/7] arm64: mte: Enable async tag check fault

MTE provides a mode that asynchronously updates the TFSR_EL1 register
when a tag check exception is detected.

To take advantage of this mode the kernel has to verify the status of
the register at:
1. Context switching
2. Return to user/EL0 (Not required in entry from EL0 since the kernel
did not run)
3. Kernel entry from EL1
4. Kernel exit to EL1

If the register is non-zero a trace is reported.

Add the required features for EL1 detection and reporting.

Note: The ITFSB bit is set in the SCTLR_EL1 register, hence it guarantees
that the indirect writes to TFSR_EL1 are synchronized at exception entry to
EL1. On the context switch path the synchronization is guaranteed by the
dsb() in __switch_to().
The dsb(nsh) in mte_check_tfsr_exit() is provisional pending
confirmation by the architects.

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Acked-by: Andrey Konovalov <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
---
arch/arm64/include/asm/mte.h | 32 ++++++++++++++++++++++++++++
arch/arm64/kernel/entry-common.c | 6 ++++++
arch/arm64/kernel/mte.c | 36 ++++++++++++++++++++++++++++++++
3 files changed, 74 insertions(+)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index d02aff9f493d..237bb2f7309d 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -92,5 +92,37 @@ static inline void mte_assign_mem_tag_range(void *addr, size_t size)

#endif /* CONFIG_ARM64_MTE */

+#ifdef CONFIG_KASAN_HW_TAGS
+void mte_check_tfsr_el1(void);
+
+static inline void mte_check_tfsr_entry(void)
+{
+ mte_check_tfsr_el1();
+}
+
+static inline void mte_check_tfsr_exit(void)
+{
+ /*
+ * The asynchronous faults are sync'ed automatically with
+ * TFSR_EL1 on kernel entry but for exit an explicit dsb()
+ * is required.
+ */
+ dsb(nsh);
+ isb();
+
+ mte_check_tfsr_el1();
+}
+#else
+static inline void mte_check_tfsr_el1(void)
+{
+}
+static inline void mte_check_tfsr_entry(void)
+{
+}
+static inline void mte_check_tfsr_exit(void)
+{
+}
+#endif /* CONFIG_KASAN_HW_TAGS */
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_MTE_H */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 5346953e4382..31666511ba67 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -37,6 +37,8 @@ static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
lockdep_hardirqs_off(CALLER_ADDR0);
rcu_irq_enter_check_tick();
trace_hardirqs_off_finish();
+
+ mte_check_tfsr_entry();
}

/*
@@ -47,6 +49,8 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs)
{
lockdep_assert_irqs_disabled();

+ mte_check_tfsr_exit();
+
if (interrupts_enabled(regs)) {
if (regs->exit_rcu) {
trace_hardirqs_on_prepare();
@@ -243,6 +247,8 @@ asmlinkage void noinstr enter_from_user_mode(void)

asmlinkage void noinstr exit_to_user_mode(void)
{
+ mte_check_tfsr_exit();
+
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
user_enter_irqoff();
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 60531afc706e..3332aabda466 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -192,6 +192,29 @@ bool mte_report_once(void)
return READ_ONCE(report_fault_once);
}

+#ifdef CONFIG_KASAN_HW_TAGS
+void mte_check_tfsr_el1(void)
+{
+	u64 tfsr_el1;
+
+	if (!system_supports_mte())
+		return;
+
+	tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);
+
+	if (unlikely(tfsr_el1 & SYS_TFSR_EL1_TF1)) {
+		/*
+		 * Note: isb() is not required after this direct write
+		 * because there is no indirect read subsequent to it
+		 * (per ARM DDI 0487F.c table D13-1).
+		 */
+		write_sysreg_s(0, SYS_TFSR_EL1);
+
+		kasan_report_async();
+	}
+}
+#endif
+
static void update_sctlr_el1_tcf0(u64 tcf0)
{
/* ISB required for the kernel uaccess routines */
@@ -257,6 +280,19 @@ void mte_thread_switch(struct task_struct *next)
/* avoid expensive SCTLR_EL1 accesses if no change */
if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+ else
+ isb();
+
+ /*
+ * Check if an async tag exception occurred at EL1.
+ *
+ * Note: On the context switch path we rely on the dsb() present
+ * in __switch_to() to guarantee that the indirect writes to TFSR_EL1
+ * are synchronized before this point.
+ * isb() above is required for the same reason.
+ *
+ */
+ mte_check_tfsr_el1();
}

void mte_suspend_exit(void)
--
2.30.0

2021-02-08 18:53:09

by Vincenzo Frascino

Subject: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

When MTE async mode is enabled, TFSR_EL1 contains the accumulated
asynchronous tag check faults for EL1 and EL0.

During the suspend/resume operations the firmware might perform some
operations that could change the state of the register, resulting in
a spurious tag check fault report.

Save/restore the state of the TFSR_EL1 register during the
suspend/resume operations to prevent this from happening.
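
The intent can be sketched with a user-space model, shown here purely as
an illustration. model_tfsr stands in for TFSR_EL1, and
model_firmware_clobber() is a hypothetical placeholder for whatever the
firmware might do to the register while the core is down. Like the patch,
the model keeps a single saved value.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t model_tfsr;		/* stand-in for TFSR_EL1 */
static uint64_t model_saved_tfsr;	/* single save slot, as in the patch */

/* Save the register before the core enters suspend. */
static void model_suspend_enter(void)
{
	model_saved_tfsr = model_tfsr;
}

/* Hypothetical firmware side effect: leaves a fault bit set. */
static void model_firmware_clobber(void)
{
	model_tfsr = 1ull << 1;
}

/* Restore the saved value so the clobbered state is never reported. */
static void model_suspend_exit(void)
{
	model_tfsr = model_saved_tfsr;
}
```

With the restore in place, a clean register before suspend is still clean
after resume, and a fault that was pending before suspend is still pending
afterwards, so the next check reports only faults caused by the kernel.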

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
---
arch/arm64/include/asm/mte.h | 4 ++++
arch/arm64/kernel/mte.c | 22 ++++++++++++++++++++++
arch/arm64/kernel/suspend.c | 3 +++
3 files changed, 29 insertions(+)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 237bb2f7309d..2d79bcaaeb30 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -43,6 +43,7 @@ void mte_sync_tags(pte_t *ptep, pte_t pte);
void mte_copy_page_tags(void *kto, const void *kfrom);
void flush_mte_state(void);
void mte_thread_switch(struct task_struct *next);
+void mte_suspend_enter(void);
void mte_suspend_exit(void);
long set_mte_ctrl(struct task_struct *task, unsigned long arg);
long get_mte_ctrl(struct task_struct *task);
@@ -68,6 +69,9 @@ static inline void flush_mte_state(void)
static inline void mte_thread_switch(struct task_struct *next)
{
}
+static inline void mte_suspend_enter(void)
+{
+}
static inline void mte_suspend_exit(void)
{
}
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 3332aabda466..5c440967721b 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -25,6 +25,7 @@

u64 gcr_kernel_excl __ro_after_init;

+static u64 mte_suspend_tfsr_el1;
static bool report_fault_once = true;

/* Whether the MTE asynchronous mode is enabled. */
@@ -295,12 +296,33 @@ void mte_thread_switch(struct task_struct *next)
mte_check_tfsr_el1();
}

+void mte_suspend_enter(void)
+{
+	if (!system_supports_mte())
+		return;
+
+	/*
+	 * The barriers are required to guarantee that the indirect writes
+	 * to TFSR_EL1 are synchronized before we save the state.
+	 */
+	dsb(nsh);
+	isb();
+
+	/* Save SYS_TFSR_EL1 before suspend entry */
+	mte_suspend_tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);
+}
+
void mte_suspend_exit(void)
{
if (!system_supports_mte())
return;

update_gcr_el1_excl(gcr_kernel_excl);
+
+ /* Resume SYS_TFSR_EL1 after suspend exit */
+ write_sysreg_s(mte_suspend_tfsr_el1, SYS_TFSR_EL1);
+
+ mte_check_tfsr_el1();
}

long set_mte_ctrl(struct task_struct *task, unsigned long arg)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index a67b37a7a47e..16caa9b32dae 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -91,6 +91,9 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
unsigned long flags;
struct sleep_stack_data state;

+ /* Report any MTE async fault before going to suspend. */
+ mte_suspend_enter();
+
/*
* From this point debug exceptions are disabled to prevent
* updates to mdscr register (saved and restored along with
--
2.30.0

2021-02-08 18:54:27

by Vincenzo Frascino

Subject: [PATCH v12 7/7] kasan: don't run tests in async mode

From: Andrey Konovalov <[email protected]>

Asynchronous KASAN mode doesn't guarantee that a tag fault will be
detected immediately and causes tests to fail. Forbid running them
in asynchronous mode.

Signed-off-by: Andrey Konovalov <[email protected]>
---
lib/test_kasan.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 7285dcf9fcc1..f82d9630cae1 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
kunit_err(test, "can't run KASAN tests with KASAN disabled");
return -1;
}
+ if (kasan_flag_async) {
+ kunit_err(test, "can't run KASAN tests in async mode");
+ return -1;
+ }

multishot = kasan_save_enable_multi_shot();
hw_set_tagging_report_once(false);
--
2.30.0

2021-02-08 20:25:02

by Lorenzo Pieralisi

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

On Mon, Feb 08, 2021 at 04:56:16PM +0000, Vincenzo Frascino wrote:
> When MTE async mode is enabled TFSR_EL1 contains the accumulative
> asynchronous tag check faults for EL1 and EL0.
>
> During the suspend/resume operations the firmware might perform some
> operations that could change the state of the register resulting in
> a spurious tag check fault report.
>
> Save/restore the state of the TFSR_EL1 register during the
> suspend/resume operations to prevent this to happen.
>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Lorenzo Pieralisi <[email protected]>
> Signed-off-by: Vincenzo Frascino <[email protected]>
> ---
> arch/arm64/include/asm/mte.h | 4 ++++
> arch/arm64/kernel/mte.c | 22 ++++++++++++++++++++++
> arch/arm64/kernel/suspend.c | 3 +++
> 3 files changed, 29 insertions(+)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 237bb2f7309d..2d79bcaaeb30 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -43,6 +43,7 @@ void mte_sync_tags(pte_t *ptep, pte_t pte);
> void mte_copy_page_tags(void *kto, const void *kfrom);
> void flush_mte_state(void);
> void mte_thread_switch(struct task_struct *next);
> +void mte_suspend_enter(void);
> void mte_suspend_exit(void);
> long set_mte_ctrl(struct task_struct *task, unsigned long arg);
> long get_mte_ctrl(struct task_struct *task);
> @@ -68,6 +69,9 @@ static inline void flush_mte_state(void)
> static inline void mte_thread_switch(struct task_struct *next)
> {
> }
> +static inline void mte_suspend_enter(void)
> +{
> +}
> static inline void mte_suspend_exit(void)
> {
> }
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 3332aabda466..5c440967721b 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -25,6 +25,7 @@
>
> u64 gcr_kernel_excl __ro_after_init;
>
> +static u64 mte_suspend_tfsr_el1;

IIUC you need this per-CPU (core loses context on suspend-to-RAM but also
CPUidle, S2R is single threaded but CPUidle runs on every core idle
thread).

Unless you sync/report it on enter/exit (please note: I am not familiar
with MTE so it is just a, perhaps silly, suggestion to avoid
saving/restoring it).

Lorenzo

> static bool report_fault_once = true;
>
> /* Whether the MTE asynchronous mode is enabled. */
> @@ -295,12 +296,33 @@ void mte_thread_switch(struct task_struct *next)
> mte_check_tfsr_el1();
> }
>
> +void mte_suspend_enter(void)
> +{
> + if (!system_supports_mte())
> + return;
> +
> + /*
> + * The barriers are required to guarantee that the indirect writes
> + * to TFSR_EL1 are synchronized before we save the state.
> + */
> + dsb(nsh);
> + isb();
> +
> + /* Save SYS_TFSR_EL1 before suspend entry */
> + mte_suspend_tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);
> +}
> +
> void mte_suspend_exit(void)
> {
> if (!system_supports_mte())
> return;
>
> update_gcr_el1_excl(gcr_kernel_excl);
> +
> + /* Resume SYS_TFSR_EL1 after suspend exit */
> + write_sysreg_s(mte_suspend_tfsr_el1, SYS_TFSR_EL1);
> +
> + mte_check_tfsr_el1();
> }
>
> long set_mte_ctrl(struct task_struct *task, unsigned long arg)
> diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
> index a67b37a7a47e..16caa9b32dae 100644
> --- a/arch/arm64/kernel/suspend.c
> +++ b/arch/arm64/kernel/suspend.c
> @@ -91,6 +91,9 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
> unsigned long flags;
> struct sleep_stack_data state;
>
> + /* Report any MTE async fault before going to suspend. */
> + mte_suspend_enter();
> +
> /*
> * From this point debug exceptions are disabled to prevent
> * updates to mdscr register (saved and restored along with
> --
> 2.30.0
>

2021-02-09 06:37:52

by kernel test robot

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode

Hi Vincenzo,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20210125]
[cannot apply to arm64/for-next/core xlnx/master arm/for-next soc/for-next kvmarm/next linus/master hnaz-linux-mm/master v5.11-rc6 v5.11-rc5 v5.11-rc4 v5.11-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
base: 59fa6a163ffabc1bf25c5e0e33899e268a96d3cc
config: powerpc64-randconfig-r033-20210209 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/53907a0b15724b414ddd9201356f92e09571ef90
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
git checkout 53907a0b15724b414ddd9201356f92e09571ef90
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

powerpc-linux-ld: lib/test_kasan.o: in function `kasan_test_init':
test_kasan.c:(.text+0x849a): undefined reference to `kasan_flag_async'
>> powerpc-linux-ld: test_kasan.c:(.text+0x84a2): undefined reference to `kasan_flag_async'
powerpc-linux-ld: test_kasan.c:(.text+0x84e2): undefined reference to `kasan_flag_async'
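
For context, the series defines kasan_flag_async in mm/kasan/hw_tags.c,
which is only relevant to CONFIG_KASAN_HW_TAGS builds, while
mm/kasan/kasan.h declares it unconditionally and lib/test_kasan.c reads it
unconditionally. On a configuration without hardware tags the reference
therefore has no definition to link against. One possible shape of a fix,
shown as a self-contained sketch and not necessarily what a later revision
of the series does, is an accessor that collapses to a constant when the
feature is compiled out. MODEL_KASAN_HW_TAGS and the model_* names are
hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Build with -DMODEL_KASAN_HW_TAGS to model CONFIG_KASAN_HW_TAGS=y.
 * Without it, no variable exists and the accessor is a constant.
 */
#ifdef MODEL_KASAN_HW_TAGS
static bool model_flag_async;	/* would live in the HW_TAGS-only file */

static inline bool model_async_enabled(void)
{
	return model_flag_async;
}
#else
static inline bool model_async_enabled(void)
{
	return false;
}
#endif

/* Models the check added to the test init: refuse to run in async mode. */
static int model_test_init(void)
{
	if (model_async_enabled())
		return -1;

	return 0;
}
```

Callers then never name the variable directly, so configurations without
hardware tag support compile and link with no dangling symbol.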

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (1.97 kB)
.config.gz (32.76 kB)

2021-02-09 07:43:50

by kernel test robot

Subject: Re: [PATCH v12 3/7] kasan: Add report for async mode

Hi Vincenzo,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20210125]
[cannot apply to arm64/for-next/core xlnx/master arm/for-next soc/for-next kvmarm/next linus/master hnaz-linux-mm/master v5.11-rc6 v5.11-rc5 v5.11-rc4 v5.11-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
base: 59fa6a163ffabc1bf25c5e0e33899e268a96d3cc
config: x86_64-randconfig-s021-20210209 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-215-g0fb77bb6-dirty
# https://github.com/0day-ci/linux/commit/93bd347e4877e3616f7db64f488ebb469718dd68
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
git checkout 93bd347e4877e3616f7db64f488ebb469718dd68
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

ld: mm/kasan/report.o: in function `end_report':
>> mm/kasan/report.c:90: undefined reference to `kasan_flag_async'
>> ld: mm/kasan/report.c:90: undefined reference to `kasan_flag_async'


vim +90 mm/kasan/report.c

    87
    88	static void end_report(unsigned long *flags, unsigned long addr)
    89	{
  > 90		if (!kasan_flag_async)
    91			trace_error_report_end(ERROR_DETECTOR_KASAN, addr);
    92		pr_err("==================================================================\n");
    93		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
    94		spin_unlock_irqrestore(&report_lock, *flags);
    95		if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags)) {
    96			/*
    97			 * This thread may hit another WARN() in the panic path.
    98			 * Resetting this prevents additional WARN() from panicking the
    99			 * system on this thread. Other threads are blocked by the
   100			 * panic_mutex in panic().
   101			 */
   102			panic_on_warn = 0;
   103			panic("panic_on_warn set ...\n");
   104		}
   105	#ifdef CONFIG_KASAN_HW_TAGS
   106		if (kasan_flag_panic)
   107			panic("kasan.fault=panic set ...\n");
   108	#endif
   109		kasan_enable_current();
   110	}
   111
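
The link failure above comes from a build in which hardware tag-based KASAN cannot be enabled, so nothing defines `kasan_flag_async`. Below is a minimal userspace sketch of one possible way to avoid it, assuming the flag is only defined when CONFIG_KASAN_HW_TAGS is set. The helper names `kasan_async_mode_enabled()` and `should_trace_report_end()` are hypothetical and are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of a kernel header. Defining CONFIG_KASAN_HW_TAGS would
 * stand in for enabling hardware tag-based KASAN; it is left undefined here
 * to mimic the x86_64 and powerpc randconfig builds.
 */
#ifdef CONFIG_KASAN_HW_TAGS
extern bool kasan_flag_async;	/* assumed to be defined by the HW_TAGS code only */

static inline bool kasan_async_mode_enabled(void)
{
	return kasan_flag_async;
}
#else
/* No backing variable exists, so callers get a constant instead. */
static inline bool kasan_async_mode_enabled(void)
{
	return false;
}
#endif

/* Hypothetical stand-in for the condition at mm/kasan/report.c:90. */
static bool should_trace_report_end(void)
{
	return !kasan_async_mode_enabled();
}

/* Self-check: without HW_TAGS the report end is always traced. */
static void demo_generic_build(void)
{
	assert(should_trace_report_end());
}
```

With the accessor in place, generic code never references the variable directly, so a build without CONFIG_KASAN_HW_TAGS has no undefined symbol left to resolve.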

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.77 kB)
.config.gz (38.09 kB)

2021-02-09 10:54:38

by Vincenzo Frascino

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

Hi Lorenzo,

thank you for your review.

On 2/8/21 6:56 PM, Lorenzo Pieralisi wrote:
>> u64 gcr_kernel_excl __ro_after_init;
>>
>> +static u64 mte_suspend_tfsr_el1;
> IIUC you need this per-CPU (core loses context on suspend-to-RAM but also
> CPUidle, S2R is single threaded but CPUidle runs on every core idle
> thread).
>
> Unless you sync/report it on enter/exit (please note: I am not familiar
> with MTE so it is just a, perhaps silly, suggestion to avoid
> saving/restoring it).
>

I thought about making it per cpu, but I concluded that since it is an
asynchronous tag fault it wasn't necessary.

But thinking about it from a statistical point of view, you are completely
right: we might end up in a scenario in which we report the fault on multiple
cores when it happened on only one, or in a scenario in which we do not report
a potential fault at all.

I am going to update my code accordingly in the next version.
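
The two scenarios above can be modelled in userspace. This is only a sketch under stated assumptions: the `sim_*` names are hypothetical, the registers are plain array entries, and the array `sim_percpu_slot` stands in for what a per-CPU variable would provide in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_NR_CPUS	2
#define SIM_TFSR_FAULT	1ULL	/* model value for "a tag fault is pending" */

/* Simulated per-CPU TFSR_EL1 registers. */
static uint64_t sim_tfsr_el1[SIM_NR_CPUS];

/* v12 style: one save slot shared by every CPU. */
static uint64_t sim_shared_slot;
/* Per-CPU style: one save slot for each CPU. */
static uint64_t sim_percpu_slot[SIM_NR_CPUS];

static void sim_suspend(int cpu, bool percpu)
{
	uint64_t *slot = percpu ? &sim_percpu_slot[cpu] : &sim_shared_slot;

	*slot = sim_tfsr_el1[cpu];
	sim_tfsr_el1[cpu] = 0;	/* the core loses its context */
}

static void sim_resume(int cpu, bool percpu)
{
	const uint64_t *slot = percpu ? &sim_percpu_slot[cpu] : &sim_shared_slot;

	sim_tfsr_el1[cpu] = *slot;
}

/*
 * CPU 0 has a pending fault and CPU 1 is clean. CPU 0 goes idle first, then
 * CPU 1, and both wake up afterwards. Returns true if each CPU ends up with
 * the value it had before going idle.
 */
static bool sim_idle_roundtrip_preserves_state(bool percpu)
{
	sim_tfsr_el1[0] = SIM_TFSR_FAULT;
	sim_tfsr_el1[1] = 0;

	sim_suspend(0, percpu);
	sim_suspend(1, percpu);	/* with a shared slot this overwrites CPU 0's value */
	sim_resume(0, percpu);
	sim_resume(1, percpu);

	return sim_tfsr_el1[0] == SIM_TFSR_FAULT && sim_tfsr_el1[1] == 0;
}

static void sim_demo_percpu(void)
{
	assert(!sim_idle_roundtrip_preserves_state(false));	/* fault is lost */
	assert(sim_idle_roundtrip_preserves_state(true));	/* fault survives */
}
```

Reversing the idle order in the shared-slot case would instead restore CPU 0's fault on both CPUs, which is the "reported on multiple cores" scenario.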

Thanks!

> Lorenzo
>

--
Regards,
Vincenzo

2021-02-09 11:43:24

by Vincenzo Frascino

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode



On 2/9/21 6:32 AM, kernel test robot wrote:
> Hi Vincenzo,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on next-20210125]
> [cannot apply to arm64/for-next/core xlnx/master arm/for-next soc/for-next kvmarm/next linus/master hnaz-linux-mm/master v5.11-rc6 v5.11-rc5 v5.11-rc4 v5.11-rc6]

The patches are based on linux-next/akpm and, since they depend on some patches
present in that tree, they can be applied only on linux-next/akpm and
linux-next/master.

The dependency is reported in the cover letter.

Thanks,
Vincenzo

> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
> base: 59fa6a163ffabc1bf25c5e0e33899e268a96d3cc
> config: powerpc64-randconfig-r033-20210209 (attached as .config)
> compiler: powerpc-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/0day-ci/linux/commit/53907a0b15724b414ddd9201356f92e09571ef90
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
> git checkout 53907a0b15724b414ddd9201356f92e09571ef90
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All errors (new ones prefixed by >>):
>
> powerpc-linux-ld: lib/test_kasan.o: in function `kasan_test_init':
> test_kasan.c:(.text+0x849a): undefined reference to `kasan_flag_async'
>>> powerpc-linux-ld: test_kasan.c:(.text+0x84a2): undefined reference to `kasan_flag_async'
> powerpc-linux-ld: test_kasan.c:(.text+0x84e2): undefined reference to `kasan_flag_async'
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/[email protected]
>

--
Regards,
Vincenzo

2021-02-09 11:46:04

by Vincenzo Frascino

Subject: Re: [PATCH v12 3/7] kasan: Add report for async mode

On 2/9/21 7:39 AM, kernel test robot wrote:
> Hi Vincenzo,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on next-20210125]
> [cannot apply to arm64/for-next/core xlnx/master arm/for-next soc/for-next kvmarm/next linus/master hnaz-linux-mm/master v5.11-rc6 v5.11-rc5 v5.11-rc4 v5.11-rc6]

The patches are based on linux-next/akpm and, since they depend on some patches
present in that tree, they can be applied only on linux-next/akpm and
linux-next/master.

The dependency is reported in the cover letter.

Thanks,
Vincenzo

> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
> base: 59fa6a163ffabc1bf25c5e0e33899e268a96d3cc
> config: x86_64-randconfig-s021-20210209 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> reproduce:
> # apt-get install sparse
> # sparse version: v0.6.3-215-g0fb77bb6-dirty
> # https://github.com/0day-ci/linux/commit/93bd347e4877e3616f7db64f488ebb469718dd68
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
> git checkout 93bd347e4877e3616f7db64f488ebb469718dd68
> # save the attached .config to linux build tree
> make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All errors (new ones prefixed by >>):
>
> ld: mm/kasan/report.o: in function `end_report':
>>> mm/kasan/report.c:90: undefined reference to `kasan_flag_async'
>>> ld: mm/kasan/report.c:90: undefined reference to `kasan_flag_async'
>
>
> vim +90 mm/kasan/report.c
>
>     87
>     88	static void end_report(unsigned long *flags, unsigned long addr)
>     89	{
>   > 90		if (!kasan_flag_async)
>     91			trace_error_report_end(ERROR_DETECTOR_KASAN, addr);
>     92		pr_err("==================================================================\n");
>     93		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
>     94		spin_unlock_irqrestore(&report_lock, *flags);
>     95		if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags)) {
>     96			/*
>     97			 * This thread may hit another WARN() in the panic path.
>     98			 * Resetting this prevents additional WARN() from panicking the
>     99			 * system on this thread. Other threads are blocked by the
>    100			 * panic_mutex in panic().
>    101			 */
>    102			panic_on_warn = 0;
>    103			panic("panic_on_warn set ...\n");
>    104		}
>    105	#ifdef CONFIG_KASAN_HW_TAGS
>    106		if (kasan_flag_panic)
>    107			panic("kasan.fault=panic set ...\n");
>    108	#endif
>    109		kasan_enable_current();
>    110	}
>    111
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/[email protected]
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

--
Regards,
Vincenzo

2021-02-09 12:00:35

by Catalin Marinas

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

On Mon, Feb 08, 2021 at 04:56:16PM +0000, Vincenzo Frascino wrote:
> When MTE async mode is enabled TFSR_EL1 contains the accumulative
> asynchronous tag check faults for EL1 and EL0.
>
> During the suspend/resume operations the firmware might perform some
> operations that could change the state of the register resulting in
> a spurious tag check fault report.
>
> Save/restore the state of the TFSR_EL1 register during the
> suspend/resume operations to prevent this from happening.

Do we need a similar fix for TFSRE0_EL1? We get away with this if
suspend is only entered on the idle (kernel) thread but I recall we
could also enter suspend on behalf of a user process (I may be wrong
though).

If that's the case, it would make more sense to store the TFSR* regs in
the thread_struct alongside sctlr_tcf0. If we did that, we'd not need
the per-cpu mte_suspend_tfsr_el1 variable.

--
Catalin

2021-02-09 12:09:07

by Catalin Marinas

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode

On Mon, Feb 08, 2021 at 04:56:17PM +0000, Vincenzo Frascino wrote:
> From: Andrey Konovalov <[email protected]>
>
> Asynchronous KASAN mode doesn't guarantee that a tag fault will be
> detected immediately and causes tests to fail. Forbid running them
> in asynchronous mode.
>
> Signed-off-by: Andrey Konovalov <[email protected]>

That's missing your SoB.

> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> index 7285dcf9fcc1..f82d9630cae1 100644
> --- a/lib/test_kasan.c
> +++ b/lib/test_kasan.c
> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
>  		kunit_err(test, "can't run KASAN tests with KASAN disabled");
>  		return -1;
>  	}
> +	if (kasan_flag_async) {
> +		kunit_err(test, "can't run KASAN tests in async mode");
> +		return -1;
> +	}
>
>  	multishot = kasan_save_enable_multi_shot();
>  	hw_set_tagging_report_once(false);

I think we can still run the kasan tests in async mode if we check the
TFSR_EL1 at the end of each test by calling mte_check_tfsr_exit().
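
The idea of checking the register at the end of each test can be sketched in userspace as follows. This is a model only: `sim_tfsr_el1` is a plain variable rather than the real system register, and `sim_check_tfsr()` is a hypothetical stand-in for whatever check the tests would call (mte_check_tfsr_exit() is the candidate named above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t sim_tfsr_el1;	/* simulated TFSR_EL1 */

/* Async mode: a bad access only latches a flag, nothing is reported yet. */
static void sim_async_tag_fault(void)
{
	sim_tfsr_el1 |= 1;
}

/* Stand-in for the explicit check: clear the flag and say whether it was set. */
static bool sim_check_tfsr(void)
{
	bool pending = sim_tfsr_el1 != 0;

	sim_tfsr_el1 = 0;
	return pending;
}

typedef void (*sim_test_body)(void);

/* Run one test body, then check the register before the next test starts. */
static bool sim_run_test(sim_test_body body)
{
	body();
	return sim_check_tfsr();
}

static void sim_buggy_body(void)
{
	sim_async_tag_fault();
}

static void sim_clean_body(void)
{
}

static void sim_demo_async_tests(void)
{
	assert(sim_run_test(sim_buggy_body));	/* fault is seen at test end */
	assert(!sim_run_test(sim_clean_body));	/* nothing leaks into the next test */
}
```

Because the check also clears the flag, a fault raised by one test cannot be attributed to the test that runs after it.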

--
Catalin

2021-02-09 17:31:27

by Catalin Marinas

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

On Tue, Feb 09, 2021 at 02:33:28PM +0000, Lorenzo Pieralisi wrote:
> On Tue, Feb 09, 2021 at 11:55:33AM +0000, Catalin Marinas wrote:
> > On Mon, Feb 08, 2021 at 04:56:16PM +0000, Vincenzo Frascino wrote:
> > > When MTE async mode is enabled TFSR_EL1 contains the accumulative
> > > asynchronous tag check faults for EL1 and EL0.
> > >
> > > During the suspend/resume operations the firmware might perform some
> > > operations that could change the state of the register resulting in
> > > a spurious tag check fault report.
> > >
> > > Save/restore the state of the TFSR_EL1 register during the
> > > suspend/resume operations to prevent this from happening.
> >
> > Do we need a similar fix for TFSRE0_EL1? We get away with this if
> > suspend is only entered on the idle (kernel) thread but I recall we
> > could also enter suspend on behalf of a user process (I may be wrong
> > though).
>
> Yes, when we suspend the machine to RAM, we execute suspend on behalf
> of a userspace process (but that's only running on 1 cpu, the others
> are hotplugged out).
>
> IIUC (and that's an if) TFSRE0_EL1 is checked on kernel entry so I don't
> think there is a need to save/restore it (just reset it on suspend
> exit).

You are right, we don't check TFSRE0_EL1 on return to user, only
clear it, so no need to do anything on suspend/resume.

> TFSR_EL1, I don't see a point in saving/restoring it (it is a bit
> per-CPU AFAICS) either, IMO we should "check" it on suspend (if it is
> possible in that context) and reset it on resume.

I think this should work.

> I don't think though you can "check" with IRQs disabled so I suspect
> that TFSR_EL1 has to be saved/restored (which means that there is a
> black out period where we run kernel code without being able to detect
> faults but there is no solution to that other than delaying saving the
> value to just before calling into PSCI). Likewise on resume from low
> power.

It depends on whether kasan_report can be called with IRQs disabled. I
don't see why not, so if this works I'd rather just call mte_check_async
(or whatever it's called) on the suspend path and zero the register on
resume (mte_suspend_exit). We avoid any saving of the state.
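
A userspace sketch of this check-on-suspend, zero-on-resume scheme is given below. It is a model under stated assumptions: every `sim_*` name is hypothetical, `sim_firmware()` merely imitates firmware leaving arbitrary state behind, and the real register is replaced by a plain variable.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t sim_tfsr_el1;	/* simulated TFSR_EL1 */
static int sim_reports;		/* number of faults reported so far */

/* Report and clear a pending asynchronous fault, if there is one. */
static void sim_check_async(void)
{
	if (sim_tfsr_el1) {
		sim_tfsr_el1 = 0;
		sim_reports++;
	}
}

/* Suspend path: report anything pending before the context is lost. */
static void sim_suspend_enter(void)
{
	sim_check_async();
}

/* Firmware may leave arbitrary state in the register. */
static void sim_firmware(uint64_t leftover)
{
	sim_tfsr_el1 = leftover;
}

/* Resume path: discard whatever the firmware left behind. */
static void sim_suspend_exit(void)
{
	sim_tfsr_el1 = 0;
}

/* Run one suspend/resume cycle and return how many faults were reported. */
static int sim_suspend_cycle(uint64_t pending, uint64_t leftover)
{
	sim_reports = 0;
	sim_tfsr_el1 = pending;

	sim_suspend_enter();
	sim_firmware(leftover);
	sim_suspend_exit();
	sim_check_async();	/* next regular check after resume */

	return sim_reports;
}

static void sim_demo_suspend(void)
{
	assert(sim_suspend_cycle(1, 3) == 1);	/* real fault reported exactly once */
	assert(sim_suspend_cycle(0, 3) == 0);	/* firmware leftovers never reported */
}
```

No value is carried across the low-power state, which is why this variant needs no save slot at all.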

--
Catalin

2021-02-09 17:36:22

by Vincenzo Frascino

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode

Hi Andrey,

On 2/9/21 5:26 PM, Andrey Konovalov wrote:
> On Tue, Feb 9, 2021 at 6:07 PM Catalin Marinas <[email protected]> wrote:
>>
>> On Tue, Feb 09, 2021 at 04:02:25PM +0100, Andrey Konovalov wrote:
>>> On Tue, Feb 9, 2021 at 1:16 PM Vincenzo Frascino
>>> <[email protected]> wrote:
>>>> On 2/9/21 12:02 PM, Catalin Marinas wrote:
>>>>> On Mon, Feb 08, 2021 at 04:56:17PM +0000, Vincenzo Frascino wrote:
>>>>>> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
>>>>>> index 7285dcf9fcc1..f82d9630cae1 100644
>>>>>> --- a/lib/test_kasan.c
>>>>>> +++ b/lib/test_kasan.c
>>>>>> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
>>>>>>  		kunit_err(test, "can't run KASAN tests with KASAN disabled");
>>>>>>  		return -1;
>>>>>>  	}
>>>>>> +	if (kasan_flag_async) {
>>>>>> +		kunit_err(test, "can't run KASAN tests in async mode");
>>>>>> +		return -1;
>>>>>> +	}
>>>>>>
>>>>>>  	multishot = kasan_save_enable_multi_shot();
>>>>>>  	hw_set_tagging_report_once(false);
>>>>>
>>>>> I think we can still run the kasan tests in async mode if we check the
>>>>> TFSR_EL1 at the end of each test by calling mte_check_tfsr_exit().
>>>>>
>>>>
>>>> IIUC this was the plan for the future. But I'll let Andrey comment with more details.
>>>
>>> If it's possible to implement, then it would be good to have. Doesn't
>>> have to be a part of this series though.
>>
>> I think it can be part of this series but after the 5.12 merging window
>> (we are a few days away from final 5.11 and I don't think we should
>> rush the MTE kernel async support in).
>>
>> It would be nice to have the kasan tests running with async by the time
>> we merge the patches (at a quick look, I think it's possible but, of
>> course, we may hit some blockers when implementing it).
>
> OK, sounds good.
>
> If it's possible to put an explicit check for tag faults at the end of
> each test, then adding async support shouldn't be hard.
>
> Note, that some of the tests trigger bugs that are detected via
> explicit checks within KASAN. For example, KASAN checks that a pointer
> that's being freed points to a start of a slab object, or that the
> object is accessible when it gets freed, etc. I don't see this being a
> problem, so just FYI.
>

Once you have your patches ready please send them to me and I will repost
another version. In the meantime I will address the remaining comments.

> Thanks!
>

--
Regards,
Vincenzo

2021-02-09 19:57:26

by Vincenzo Frascino

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

On 2/9/21 5:28 PM, Catalin Marinas wrote:
>> I don't think though you can "check" with IRQs disabled so I suspect
>> that TFSR_EL1 has to be saved/restored (which means that there is a
>> black out period where we run kernel code without being able to detect
>> faults but there is no solution to that other than delaying saving the
>> value to just before calling into PSCI). Likewise on resume from low
>> power.
> It depends on whether kasan_report can be called with IRQs disabled. I
> don't see why not, so if this works I'd rather just call mte_check_async
> (or whatever it's called) on the suspend path and zero the register on
> resume (mte_suspend_exit). We avoid any saving of the state.

Fine by me. I tried a quick test and can confirm that kasan_report can be
invoked with IRQs disabled.

--
Regards,
Vincenzo

2021-02-10 03:35:41

by Vincenzo Frascino

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode



On 2/9/21 12:02 PM, Catalin Marinas wrote:
> On Mon, Feb 08, 2021 at 04:56:17PM +0000, Vincenzo Frascino wrote:
>> From: Andrey Konovalov <[email protected]>
>>
>> Asynchronous KASAN mode doesn't guarantee that a tag fault will be
>> detected immediately and causes tests to fail. Forbid running them
>> in asynchronous mode.
>>
>> Signed-off-by: Andrey Konovalov <[email protected]>
>
> That's missing your SoB.
>

Yes, I will add it in the next iteration.

>> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
>> index 7285dcf9fcc1..f82d9630cae1 100644
>> --- a/lib/test_kasan.c
>> +++ b/lib/test_kasan.c
>> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
>>  		kunit_err(test, "can't run KASAN tests with KASAN disabled");
>>  		return -1;
>>  	}
>> +	if (kasan_flag_async) {
>> +		kunit_err(test, "can't run KASAN tests in async mode");
>> +		return -1;
>> +	}
>>
>>  	multishot = kasan_save_enable_multi_shot();
>>  	hw_set_tagging_report_once(false);
>
> I think we can still run the kasan tests in async mode if we check the
> TFSR_EL1 at the end of each test by calling mte_check_tfsr_exit().
>

IIUC this was the plan for the future. But I'll let Andrey comment with more details.

--
Regards,
Vincenzo

2021-02-10 05:13:22

by Lorenzo Pieralisi

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

On Tue, Feb 09, 2021 at 11:55:33AM +0000, Catalin Marinas wrote:
> On Mon, Feb 08, 2021 at 04:56:16PM +0000, Vincenzo Frascino wrote:
> > When MTE async mode is enabled TFSR_EL1 contains the accumulative
> > asynchronous tag check faults for EL1 and EL0.
> >
> > During the suspend/resume operations the firmware might perform some
> > operations that could change the state of the register resulting in
> > a spurious tag check fault report.
> >
> > Save/restore the state of the TFSR_EL1 register during the
> > suspend/resume operations to prevent this from happening.
>
> Do we need a similar fix for TFSRE0_EL1? We get away with this if
> suspend is only entered on the idle (kernel) thread but I recall we
> could also enter suspend on behalf of a user process (I may be wrong
> though).

Yes, when we suspend the machine to RAM, we execute suspend on behalf
of a userspace process (but that's only running on 1 cpu, the others
are hotplugged out).

IIUC (and that's an if) TFSRE0_EL1 is checked on kernel entry so I don't
think there is a need to save/restore it (just reset it on suspend
exit).

TFSR_EL1, I don't see a point in saving/restoring it (it is a bit
per-CPU AFAICS) either, IMO we should "check" it on suspend (if it is
possible in that context) and reset it on resume.

I don't think though you can "check" with IRQs disabled so I suspect
that TFSR_EL1 has to be saved/restored (which means that there is a
black out period where we run kernel code without being able to detect
faults but there is no solution to that other than delaying saving the
value to just before calling into PSCI). Likewise on resume from low
power.

Thanks,
Lorenzo

> If that's the case, it would make more sense to store the TFSR* regs in
> the thread_struct alongside sctlr_tcf0. If we did that, we'd not need
> the per-cpu mte_suspend_tfsr_el1 variable.
>
> --
> Catalin

2021-02-10 05:19:31

by Vincenzo Frascino

Subject: Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend



On 2/9/21 2:33 PM, Lorenzo Pieralisi wrote:
>> Do we need a similar fix for TFSRE0_EL1? We get away with this if
>> suspend is only entered on the idle (kernel) thread but I recall we
>> could also enter suspend on behalf of a user process (I may be wrong
>> though).
> Yes, when we suspend the machine to RAM, we execute suspend on behalf
> of a userspace process (but that's only running on 1 cpu, the others
> are hotplugged out).
>
> IIUC (and that's an if) TFSRE0_EL1 is checked on kernel entry so I don't
> think there is a need to save/restore it (just reset it on suspend
> exit).
>
> TFSR_EL1, I don't see a point in saving/restoring it (it is a bit
> per-CPU AFAICS) either, IMO we should "check" it on suspend (if it is
> possible in that context) and reset it on resume.
>
> I don't think though you can "check" with IRQs disabled so I suspect
> that TFSR_EL1 has to be saved/restored (which means that there is a
> black out period where we run kernel code without being able to detect
> faults but there is no solution to that other than delaying saving the
> value to just before calling into PSCI). Likewise on resume from low
> power.
>

Ok, based on what you are saying it seems that the most viable solution here is
to save and restore TFSR_EL1. I will update my code accordingly.

> Thanks,
> Lorenzo
>

--
Regards,
Vincenzo

2021-02-10 05:23:13

by Andrey Konovalov

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode

On Tue, Feb 9, 2021 at 1:16 PM Vincenzo Frascino
<[email protected]> wrote:
>
>
>
> On 2/9/21 12:02 PM, Catalin Marinas wrote:
> > On Mon, Feb 08, 2021 at 04:56:17PM +0000, Vincenzo Frascino wrote:
> >> From: Andrey Konovalov <[email protected]>
> >>
> >> Asynchronous KASAN mode doesn't guarantee that a tag fault will be
> >> detected immediately and causes tests to fail. Forbid running them
> >> in asynchronous mode.
> >>
> >> Signed-off-by: Andrey Konovalov <[email protected]>
> >
> > That's missing your SoB.
> >
>
> Yes, I will add it in the next iteration.
>
> >> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> >> index 7285dcf9fcc1..f82d9630cae1 100644
> >> --- a/lib/test_kasan.c
> >> +++ b/lib/test_kasan.c
> >> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
> >>  		kunit_err(test, "can't run KASAN tests with KASAN disabled");
> >>  		return -1;
> >>  	}
> >> +	if (kasan_flag_async) {
> >> +		kunit_err(test, "can't run KASAN tests in async mode");
> >> +		return -1;
> >> +	}
> >>
> >>  	multishot = kasan_save_enable_multi_shot();
> >>  	hw_set_tagging_report_once(false);
> >
> > I think we can still run the kasan tests in async mode if we check the
> > TFSR_EL1 at the end of each test by calling mte_check_tfsr_exit().
> >
>
> IIUC this was the plan for the future. But I'll let Andrey comment with more details.

If it's possible to implement, then it would be good to have. Doesn't
have to be a part of this series though.

2021-02-10 07:42:14

by Catalin Marinas

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode

On Tue, Feb 09, 2021 at 04:02:25PM +0100, Andrey Konovalov wrote:
> On Tue, Feb 9, 2021 at 1:16 PM Vincenzo Frascino
> <[email protected]> wrote:
> > On 2/9/21 12:02 PM, Catalin Marinas wrote:
> > > On Mon, Feb 08, 2021 at 04:56:17PM +0000, Vincenzo Frascino wrote:
> > >> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> > >> index 7285dcf9fcc1..f82d9630cae1 100644
> > >> --- a/lib/test_kasan.c
> > >> +++ b/lib/test_kasan.c
> > >> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
> > >>  		kunit_err(test, "can't run KASAN tests with KASAN disabled");
> > >>  		return -1;
> > >>  	}
> > >> +	if (kasan_flag_async) {
> > >> +		kunit_err(test, "can't run KASAN tests in async mode");
> > >> +		return -1;
> > >> +	}
> > >>
> > >>  	multishot = kasan_save_enable_multi_shot();
> > >>  	hw_set_tagging_report_once(false);
> > >
> > > I think we can still run the kasan tests in async mode if we check the
> > > TFSR_EL1 at the end of each test by calling mte_check_tfsr_exit().
> > >
> >
> > IIUC this was the plan for the future. But I'll let Andrey comment with more details.
>
> If it's possible to implement, then it would be good to have. Doesn't
> have to be a part of this series though.

I think it can be part of this series but after the 5.12 merge window
(we are a few days away from final 5.11 and I don't think we should
rush the MTE kernel async support in).

It would be nice to have the kasan tests running with async by the time
we merge the patches (at a quick look, I think it's possible but, of
course, we may hit some blockers when implementing it).

--
Catalin

2021-02-10 07:45:26

by Andrey Konovalov

Subject: Re: [PATCH v12 7/7] kasan: don't run tests in async mode

On Tue, Feb 9, 2021 at 6:07 PM Catalin Marinas <[email protected]> wrote:
>
> On Tue, Feb 09, 2021 at 04:02:25PM +0100, Andrey Konovalov wrote:
> > On Tue, Feb 9, 2021 at 1:16 PM Vincenzo Frascino
> > <[email protected]> wrote:
> > > On 2/9/21 12:02 PM, Catalin Marinas wrote:
> > > > On Mon, Feb 08, 2021 at 04:56:17PM +0000, Vincenzo Frascino wrote:
> > > >> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> > > >> index 7285dcf9fcc1..f82d9630cae1 100644
> > > >> --- a/lib/test_kasan.c
> > > >> +++ b/lib/test_kasan.c
> > > >> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
> > > >>  		kunit_err(test, "can't run KASAN tests with KASAN disabled");
> > > >>  		return -1;
> > > >>  	}
> > > >> +	if (kasan_flag_async) {
> > > >> +		kunit_err(test, "can't run KASAN tests in async mode");
> > > >> +		return -1;
> > > >> +	}
> > > >>
> > > >>  	multishot = kasan_save_enable_multi_shot();
> > > >>  	hw_set_tagging_report_once(false);
> > > >
> > > > I think we can still run the kasan tests in async mode if we check the
> > > > TFSR_EL1 at the end of each test by calling mte_check_tfsr_exit().
> > > >
> > >
> > > IIUC this was the plan for the future. But I'll let Andrey comment with more details.
> >
> > If it's possible to implement, then it would be good to have. Doesn't
> > have to be a part of this series though.
>
> I think it can be part of this series but after the 5.12 merging window
> (we are a few days away from final 5.11 and I don't think we should
> rush the MTE kernel async support in).
>
> It would be nice to have the kasan tests running with async by the time
> we merge the patches (at a quick look, I think it's possible but, of
> course, we may hit some blockers when implementing it).

OK, sounds good.

If it's possible to put an explicit check for tag faults at the end of
each test, then adding async support shouldn't be hard.

Note that some of the tests trigger bugs that are detected via
explicit checks within KASAN. For example, KASAN checks that a pointer
that's being freed points to the start of a slab object, or that the
object is accessible when it gets freed, etc. I don't see this being a
problem, so just FYI.
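
To illustrate why such checks do not depend on the tag fault mode, here is a userspace sketch of a start-of-object check on free. It is a simplified model, not KASAN's code: `struct sim_cache` and `sim_free_ptr_is_valid()` are hypothetical names, and the slab is just a static buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a slab: equally sized objects packed from 'base'. */
struct sim_cache {
	uintptr_t base;
	size_t object_size;
	size_t nr_objects;
};

/*
 * Explicit software check performed at free time. It runs synchronously in
 * the freeing code, so it does not rely on a hardware tag check at all.
 */
static bool sim_free_ptr_is_valid(const struct sim_cache *c, const void *ptr)
{
	uintptr_t addr = (uintptr_t)ptr;
	uintptr_t off;

	if (addr < c->base)
		return false;

	off = addr - c->base;
	if (off >= c->object_size * c->nr_objects)
		return false;

	/* Only pointers to the first byte of an object may be freed. */
	return off % c->object_size == 0;
}

static void sim_demo_free_check(void)
{
	static char slab[4 * 64];
	const struct sim_cache cache = {
		.base = (uintptr_t)slab,
		.object_size = 64,
		.nr_objects = 4,
	};

	assert(sim_free_ptr_is_valid(&cache, slab + 64));		/* object start */
	assert(!sim_free_ptr_is_valid(&cache, slab + 65));		/* mid-object */
	assert(!sim_free_ptr_is_valid(&cache, slab + sizeof(slab)));	/* past the end */
}
```

A test that frees an interior pointer would therefore be flagged immediately in either mode, without waiting for a later register check.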

Thanks!

2021-02-10 08:39:35

by Chen, Rong A

Subject: Re: [kbuild-all] Re: [PATCH v12 7/7] kasan: don't run tests in async mode



On 2/9/21 7:33 PM, Vincenzo Frascino wrote:
>
> On 2/9/21 6:32 AM, kernel test robot wrote:
>> Hi Vincenzo,
>>
>> I love your patch! Yet something to improve:
>>
>> [auto build test ERROR on next-20210125]
>> [cannot apply to arm64/for-next/core xlnx/master arm/for-next soc/for-next kvmarm/next linus/master hnaz-linux-mm/master v5.11-rc6 v5.11-rc5 v5.11-rc4 v5.11-rc6]
> The patches are based on linux-next/akpm and, since they depend on some patches
> present in that tree, they can be applied only on linux-next/akpm and
> linux-next/master.
>
> The dependency is reported in the cover letter.

Hi Vincenzo,

Thanks for the feedback, we'll take a look.

Best Regards,
Rong Chen

>
> Thanks,
> Vincenzo
>
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url: https://github.com/0day-ci/linux/commits/Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
>> base: 59fa6a163ffabc1bf25c5e0e33899e268a96d3cc
>> config: powerpc64-randconfig-r033-20210209 (attached as .config)
>> compiler: powerpc-linux-gcc (GCC) 9.3.0
>> reproduce (this is a W=1 build):
>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>> chmod +x ~/bin/make.cross
>> # https://github.com/0day-ci/linux/commit/53907a0b15724b414ddd9201356f92e09571ef90
>> git remote add linux-review https://github.com/0day-ci/linux
>> git fetch --no-tags linux-review Vincenzo-Frascino/arm64-ARMv8-5-A-MTE-Add-async-mode-support/20210209-080907
>> git checkout 53907a0b15724b414ddd9201356f92e09571ef90
>> # save the attached .config to linux build tree
>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc64
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <[email protected]>
>>
>> All errors (new ones prefixed by >>):
>>
>> powerpc-linux-ld: lib/test_kasan.o: in function `kasan_test_init':
>> test_kasan.c:(.text+0x849a): undefined reference to `kasan_flag_async'
>>>> powerpc-linux-ld: test_kasan.c:(.text+0x84a2): undefined reference to `kasan_flag_async'
>> powerpc-linux-ld: test_kasan.c:(.text+0x84e2): undefined reference to `kasan_flag_async'
>>
>> ---
>> 0-DAY CI Kernel Test Service, Intel Corporation
>> https://lists.01.org/hyperkitty/list/[email protected]
>>