2021-01-30 16:56:55

by Vincenzo Frascino

Subject: [PATCH v11 0/5] arm64: ARMv8.5-A: MTE: Add async mode support

This patchset implements asynchronous mode support for the ARMv8.5-A
Memory Tagging Extension (MTE), a debugging feature that uses
architectural support to detect C and C++ memory errors such as buffer
overflow, use-after-free and use-after-return.

MTE is built on top of the AArch64 v8.0 virtual address tagging feature,
TBI (Top Byte Ignore), and allows a task to set a 4-bit tag on any subset
of its address space that is a multiple of the 16-byte granule size. MTE
is based on a lock-and-key mechanism where the lock is the tag associated
with the physical memory and the key is the tag associated with the
virtual address.
When MTE is enabled and tags are set for ranges of a task's address space,
the PE (processing element) compares the tag of the physical memory with
the tag of the virtual address (tag check operation). Access to the memory
is granted only if the two tags match. On a mismatch the PE raises an
exception.
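
As a rough illustration of the lock-and-key check described above, the
following user-space C sketch simulates it in software. It is not kernel
code and does not touch real MTE hardware: every sim_* name and
SIM_MEM_SIZE are hypothetical, and the only architectural facts it assumes
are the 4-bit tag, the 16-byte granule and the tag position in address
bits [59:56].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16u	/* tags cover 16-byte granules */
#define TAG_SHIFT    56		/* pointer tag sits in address bits [59:56] */
#define TAG_MASK     0xfULL	/* tags are 4 bits wide */
#define SIM_MEM_SIZE 256u	/* hypothetical size of the simulated memory */

/* One 4-bit tag (the "lock") per granule of simulated memory. */
static uint8_t sim_alloc_tags[SIM_MEM_SIZE / GRANULE_SIZE];

/* Build a tagged pointer (the "key") by placing the tag in bits [59:56]. */
static uint64_t sim_set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~(TAG_MASK << TAG_SHIFT)) |
	       ((uint64_t)(tag & TAG_MASK) << TAG_SHIFT);
}

static uint8_t sim_get_tag(uint64_t ptr)
{
	return (uint8_t)((ptr >> TAG_SHIFT) & TAG_MASK);
}

static uint64_t sim_untag(uint64_t ptr)
{
	return ptr & ~(TAG_MASK << TAG_SHIFT);
}

/* Set the lock on every granule that overlaps [addr, addr + size). */
static void sim_tag_range(uint64_t addr, size_t size, uint8_t tag)
{
	uint64_t first = addr / GRANULE_SIZE;
	uint64_t last;

	assert(size > 0 && addr + size <= SIM_MEM_SIZE);
	last = (addr + size - 1) / GRANULE_SIZE;

	for (uint64_t g = first; g <= last; g++)
		sim_alloc_tags[g] = tag & TAG_MASK;
}

/* Tag check operation: the access is allowed only if key and lock match. */
static bool sim_tag_check(uint64_t ptr)
{
	uint64_t addr = sim_untag(ptr);

	assert(addr < SIM_MEM_SIZE);
	return sim_get_tag(ptr) == sim_alloc_tags[addr / GRANULE_SIZE];
}
```

With a 32-byte object at offset 32 tagged 0x5, a pointer carrying tag 0x5
passes the check anywhere inside the object, while the same pointer
advanced by 32 bytes lands in an untagged granule and fails it, which is
how a linear buffer overflow would be caught.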

The exception can be handled synchronously or asynchronously. When the
asynchronous mode is enabled:
- Upon fault the PE updates the TFSR_EL1 register.
- The kernel detects the change during one of the following:
  - Context switching
  - Return to user/EL0
  - Kernel entry from EL1
  - Kernel exit to EL1
- If the register has been updated by the PE the kernel clears it and
  reports the error.
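
The deferred detection described in the list above can be sketched as a
small user-space simulation. This is only an illustration under stated
assumptions: a plain boolean stands in for the status recorded in
TFSR_EL1, and the sim_* names and the enum are hypothetical, not part of
the series.

```c
#include <stdbool.h>

/* The four points at which the kernel looks at the fault status. */
enum sim_checkpoint {
	SIM_CONTEXT_SWITCH,
	SIM_RETURN_TO_USER,
	SIM_ENTRY_FROM_EL1,
	SIM_EXIT_TO_EL1,
	SIM_NR_CHECKPOINTS,
};

/* Hypothetical stand-in for the sticky status kept in TFSR_EL1. */
static bool sim_fault_pending;

/* Number of reports issued at each checkpoint. */
static unsigned int sim_reports[SIM_NR_CHECKPOINTS];

/* "Hardware" side: record the mismatch and let execution continue. */
static void sim_async_tag_fault(void)
{
	sim_fault_pending = true;
}

/*
 * "Kernel" side: at a checkpoint, clear the status and report if a
 * fault was recorded since the previous check. Returns true when a
 * report was issued.
 */
static bool sim_check_at(enum sim_checkpoint where)
{
	if (!sim_fault_pending)
		return false;

	sim_fault_pending = false;
	sim_reports[where]++;
	return true;
}
```

Two faults recorded before the next checkpoint collapse into a single
report, and the report is attributed to the checkpoint rather than to the
faulting access. That loss of precision is the trade-off of the
asynchronous mode.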

The series is based on linux-next/akpm.

To simplify the testing a tree with the new patches on top has been made
available at [1].

[1] https://git.gitlab.arm.com/linux-arm/linux-vf.git mte/v10.async.akpm

Changes:
--------
v11:
  - Added patch that disables KUNIT tests in async mode
v10:
  - Rebase on the latest linux-next/akpm
  - Address review comments.
v9:
  - Rebase on the latest linux-next/akpm
  - Address review comments.
v8:
  - Address review comments.
v7:
  - Fix a warning reported by kernel test robot. This
    time for real.
v6:
  - Drop patches that forbid KASAN KUNIT tests when async
    mode is enabled.
  - Fix a warning reported by kernel test robot.
  - Address review comments.
v5:
  - Rebase the series on linux-next/akpm.
  - Forbid execution for KASAN KUNIT tests when async
    mode is enabled.
  - Dropped patch to inline mte_assign_mem_tag_range().
  - Address review comments.
v4:
  - Added support for kasan.mode (sync/async) kernel
    command line parameter.
  - Addressed review comments.
v3:
  - Exposed kasan_hw_tags_mode to convert the internal
    KASAN representation.
  - Added dsb() for kernel exit paths in arm64.
  - Addressed review comments.
v2:
  - Fixed a compilation issue reported by krobot.
  - General cleanup.

Cc: Andrew Morton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Evgenii Stepanov <[email protected]>
Cc: Branislav Rankov <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>

Andrey Konovalov (1):
  kasan: don't run tests in async mode

Vincenzo Frascino (4):
  arm64: mte: Add asynchronous mode support
  kasan: Add KASAN mode kernel parameter
  kasan: Add report for async mode
  arm64: mte: Enable async tag check fault

 Documentation/dev-tools/kasan.rst  |  9 +++++
 arch/arm64/include/asm/memory.h    |  3 +-
 arch/arm64/include/asm/mte-kasan.h |  9 ++++-
 arch/arm64/include/asm/mte.h       | 32 ++++++++++++++++
 arch/arm64/kernel/entry-common.c   |  6 +++
 arch/arm64/kernel/mte.c            | 60 +++++++++++++++++++++++++++++-
 include/linux/kasan.h              |  6 +++
 lib/test_kasan.c                   |  6 ++-
 mm/kasan/hw_tags.c                 | 51 ++++++++++++++++++++++++-
 mm/kasan/kasan.h                   |  7 +++-
 mm/kasan/report.c                  | 17 ++++++++-
 11 files changed, 196 insertions(+), 10 deletions(-)

--
2.30.0


2021-01-30 16:57:16

by Vincenzo Frascino

Subject: [PATCH v11 4/5] arm64: mte: Enable async tag check fault

MTE provides a mode that asynchronously updates the TFSR_EL1 register
when a tag check exception is detected.

To take advantage of this mode the kernel has to verify the status of
the register at:
1. Context switching
2. Return to user/EL0 (Not required in entry from EL0 since the kernel
did not run)
3. Kernel entry from EL1
4. Kernel exit to EL1

If the register is non-zero a trace is reported.

Add the required features for EL1 detection and reporting.

Note: The ITFSB bit is set in the SCTLR_EL1 register, which guarantees
that the indirect writes to TFSR_EL1 are synchronized at exception entry
to EL1. On the context switch path the synchronization is guaranteed by
the dsb() in __switch_to().
The dsb(nsh) in mte_check_tfsr_exit() is provisional pending
confirmation by the architects.

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Acked-by: Andrey Konovalov <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
---
 arch/arm64/include/asm/mte.h     | 32 +++++++++++++++++++++++
 arch/arm64/kernel/entry-common.c |  6 +++++
 arch/arm64/kernel/mte.c          | 44 ++++++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index d02aff9f493d..237bb2f7309d 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -92,5 +92,37 @@ static inline void mte_assign_mem_tag_range(void *addr, size_t size)

 #endif /* CONFIG_ARM64_MTE */

+#ifdef CONFIG_KASAN_HW_TAGS
+void mte_check_tfsr_el1(void);
+
+static inline void mte_check_tfsr_entry(void)
+{
+	mte_check_tfsr_el1();
+}
+
+static inline void mte_check_tfsr_exit(void)
+{
+	/*
+	 * The asynchronous faults are sync'ed automatically with
+	 * TFSR_EL1 on kernel entry but for exit an explicit dsb()
+	 * is required.
+	 */
+	dsb(nsh);
+	isb();
+
+	mte_check_tfsr_el1();
+}
+#else
+static inline void mte_check_tfsr_el1(void)
+{
+}
+static inline void mte_check_tfsr_entry(void)
+{
+}
+static inline void mte_check_tfsr_exit(void)
+{
+}
+#endif /* CONFIG_KASAN_HW_TAGS */
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_MTE_H */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 5346953e4382..31666511ba67 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -37,6 +37,8 @@ static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
 	lockdep_hardirqs_off(CALLER_ADDR0);
 	rcu_irq_enter_check_tick();
 	trace_hardirqs_off_finish();
+
+	mte_check_tfsr_entry();
 }

 /*
@@ -47,6 +49,8 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs)
 {
 	lockdep_assert_irqs_disabled();

+	mte_check_tfsr_exit();
+
 	if (interrupts_enabled(regs)) {
 		if (regs->exit_rcu) {
 			trace_hardirqs_on_prepare();
@@ -243,6 +247,8 @@ asmlinkage void noinstr enter_from_user_mode(void)

 asmlinkage void noinstr exit_to_user_mode(void)
 {
+	mte_check_tfsr_exit();
+
 	trace_hardirqs_on_prepare();
 	lockdep_hardirqs_on_prepare(CALLER_ADDR0);
 	user_enter_irqoff();
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 92078e1eb627..7763ac1f2917 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -182,6 +182,37 @@ bool mte_report_once(void)
 	return READ_ONCE(report_fault_once);
 }

+#ifdef CONFIG_KASAN_HW_TAGS
+void mte_check_tfsr_el1(void)
+{
+	u64 tfsr_el1;
+
+	if (!system_supports_mte())
+		return;
+
+	tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);
+
+	/*
+	 * The kernel should never trigger an asynchronous fault on a
+	 * TTBR0 address, so we should never see TF0 set.
+	 * For futexes we disable checks via PSTATE.TCO.
+	 */
+	WARN_ONCE(tfsr_el1 & SYS_TFSR_EL1_TF0,
+		  "Kernel async tag fault on TTBR0 address");
+
+	if (unlikely(tfsr_el1 & SYS_TFSR_EL1_TF1)) {
+		/*
+		 * Note: isb() is not required after this direct write
+		 * because there is no indirect read subsequent to it
+		 * (per ARM DDI 0487F.c table D13-1).
+		 */
+		write_sysreg_s(0, SYS_TFSR_EL1);
+
+		kasan_report_async();
+	}
+}
+#endif
+
 static void update_sctlr_el1_tcf0(u64 tcf0)
 {
 	/* ISB required for the kernel uaccess routines */
@@ -247,6 +278,19 @@ void mte_thread_switch(struct task_struct *next)
 	/* avoid expensive SCTLR_EL1 accesses if no change */
 	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
 		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+	else
+		isb();
+
+	/*
+	 * Check if an async tag exception occurred at EL1.
+	 *
+	 * Note: On the context switch path we rely on the dsb() present
+	 * in __switch_to() to guarantee that the indirect writes to TFSR_EL1
+	 * are synchronized before this point.
+	 * isb() above is required for the same reason.
+	 *
+	 */
+	mte_check_tfsr_el1();
 }

 void mte_suspend_exit(void)
--
2.30.0

2021-01-30 16:58:42

by Vincenzo Frascino

Subject: [PATCH v11 5/5] kasan: don't run tests in async mode

From: Andrey Konovalov <[email protected]>

Asynchronous KASAN mode doesn't guarantee that a tag fault will be
detected immediately and causes tests to fail. Forbid running them
in asynchronous mode.
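
Why deferred detection breaks the tests can be shown with a small
user-space simulation. This is a sketch under stated assumptions, not the
KUnit code: the sim_* names are hypothetical, and the "test" is modelled
as an expectation that a report exists immediately after the faulting
access.

```c
#include <stdbool.h>

static bool sim_async_mode;		/* stand-in for the async mode flag */
static bool sim_fault_pending;		/* fault recorded but not yet reported */
static unsigned int sim_reports;	/* reports issued so far */

/* A bad access: reported at once in sync mode, only recorded in async. */
static void sim_bad_access(void)
{
	if (sim_async_mode)
		sim_fault_pending = true;
	else
		sim_reports++;
}

/* Test-style expectation: a report must exist right after the access. */
static bool sim_expect_report(void)
{
	unsigned int before = sim_reports;

	sim_bad_access();
	return sim_reports == before + 1;
}

/* Guard in the spirit of this patch: refuse to run in async mode. */
static int sim_test_init(void)
{
	if (sim_async_mode)
		return -1;

	return 0;
}
```

In sync mode the expectation holds. In async mode the fault is only
recorded, so the expectation fails even though the bug would still be
reported at a later checkpoint, hence the early bail-out in the init
function.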

Signed-off-by: Andrey Konovalov <[email protected]>
---
lib/test_kasan.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 7285dcf9fcc1..f82d9630cae1 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
 		kunit_err(test, "can't run KASAN tests with KASAN disabled");
 		return -1;
 	}
+	if (kasan_flag_async) {
+		kunit_err(test, "can't run KASAN tests in async mode");
+		return -1;
+	}

 	multishot = kasan_save_enable_multi_shot();
 	hw_set_tagging_report_once(false);
--
2.30.0

2021-01-30 17:02:38

by Vincenzo Frascino

Subject: Re: [PATCH v11 5/5] kasan: don't run tests in async mode



On 1/30/21 4:52 PM, Vincenzo Frascino wrote:
> From: Andrey Konovalov <[email protected]>
>
> Asynchronous KASAN mode doesn't guarantee that a tag fault will be
> detected immediately and causes tests to fail. Forbid running them
> in asynchronous mode.
>
> Signed-off-by: Andrey Konovalov <[email protected]>

Reviewed-by: Vincenzo Frascino <[email protected]>

With:

[ 18.283644] 1..1
[ 18.284167] # Subtest: kasan
[ 18.284444] 1..45
[ 18.295536] # kmalloc_oob_right: can't run KASAN tests in async mode
[ 18.296873] # kmalloc_oob_right: failed to initialize: -1
[ 18.303714] not ok 1 - kmalloc_oob_right
[ 18.316439] # kmalloc_oob_left: can't run KASAN tests in async mode
[ 18.319466] # kmalloc_oob_left: failed to initialize: -1
[ 18.325001] not ok 2 - kmalloc_oob_left

Tested-by: Vincenzo Frascino <[email protected]>

> ---
> lib/test_kasan.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> index 7285dcf9fcc1..f82d9630cae1 100644
> --- a/lib/test_kasan.c
> +++ b/lib/test_kasan.c
> @@ -51,6 +51,10 @@ static int kasan_test_init(struct kunit *test)
> kunit_err(test, "can't run KASAN tests with KASAN disabled");
> return -1;
> }
> + if (kasan_flag_async) {
> + kunit_err(test, "can't run KASAN tests in async mode");
> + return -1;
> + }
>
> multishot = kasan_save_enable_multi_shot();
> hw_set_tagging_report_once(false);
>

--
Regards,
Vincenzo

2021-02-05 19:56:30

by Vincenzo Frascino

Subject: Re: [PATCH v11 0/5] arm64: ARMv8.5-A: MTE: Add async mode support

On 1/30/21 4:52 PM, Vincenzo Frascino wrote:
> This patchset implements the asynchronous mode support for ARMv8.5-A
> Memory Tagging Extension (MTE), which is a debugging feature that allows
> to detect with the help of the architecture the C and C++ programmatic
> memory errors like buffer overflow, use-after-free, use-after-return, etc.
>
> MTE is built on top of the AArch64 v8.0 virtual address tagging TBI
> (Top Byte Ignore) feature and allows a task to set a 4 bit tag on any
> subset of its address space that is multiple of a 16 bytes granule. MTE
> is based on a lock-key mechanism where the lock is the tag associated to
> the physical memory and the key is the tag associated to the virtual
> address.
> When MTE is enabled and tags are set for ranges of address space of a task,
> the PE will compare the tag related to the physical memory with the tag
> related to the virtual address (tag check operation). Access to the memory
> is granted only if the two tags match. In case of mismatch the PE will raise
> an exception.
>
> The exception can be handled synchronously or asynchronously. When the
> asynchronous mode is enabled:
> - Upon fault the PE updates the TFSR_EL1 register.
> - The kernel detects the change during one of the following:
> - Context switching
> - Return to user/EL0
> - Kernel entry from EL1
> - Kernel exit to EL1
> - If the register has been updated by the PE the kernel clears it and
> reports the error.
>
> The series is based on linux-next/akpm.
>

We suspect an issue with the kernel access nofault functions triggering
async faults, which impacts the Android init process.
Please do not merge this series until this is sorted out.

> To simplify the testing a tree with the new patches on top has been made
> available at [1].
>
> [1] https://git.gitlab.arm.com/linux-arm/linux-vf.git mte/v10.async.akpm
>
> Changes:
> --------
> v11:
> - Added patch that disables KUNIT tests in async mode
> v10:
> - Rebase on the latest linux-next/akpm
> - Address review comments.
> v9:
> - Rebase on the latest linux-next/akpm
> - Address review comments.
> v8:
> - Address review comments.
> v7:
> - Fix a warning reported by kernel test robot. This
> time for real.
> v6:
> - Drop patches that forbid KASAN KUNIT tests when async
> mode is enabled.
> - Fix a warning reported by kernel test robot.
> - Address review comments.
> v5:
> - Rebase the series on linux-next/akpm.
> - Forbid execution for KASAN KUNIT tests when async
> mode is enabled.
> - Dropped patch to inline mte_assign_mem_tag_range().
> - Address review comments.
> v4:
> - Added support for kasan.mode (sync/async) kernel
> command line parameter.
> - Addressed review comments.
> v3:
> - Exposed kasan_hw_tags_mode to convert the internal
> KASAN represenetation.
> - Added dsb() for kernel exit paths in arm64.
> - Addressed review comments.
> v2:
> - Fixed a compilation issue reported by krobot.
> - General cleanup.
>
> Cc: Andrew Morton <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Dmitry Vyukov <[email protected]>
> Cc: Andrey Ryabinin <[email protected]>
> Cc: Alexander Potapenko <[email protected]>
> Cc: Marco Elver <[email protected]>
> Cc: Evgenii Stepanov <[email protected]>
> Cc: Branislav Rankov <[email protected]>
> Cc: Andrey Konovalov <[email protected]>
> Signed-off-by: Vincenzo Frascino <[email protected]>
>
> Andrey Konovalov (1):
> kasan: don't run tests in async mode
>
> Vincenzo Frascino (4):
> arm64: mte: Add asynchronous mode support
> kasan: Add KASAN mode kernel parameter
> kasan: Add report for async mode
> arm64: mte: Enable async tag check fault
>
> Documentation/dev-tools/kasan.rst | 9 +++++
> arch/arm64/include/asm/memory.h | 3 +-
> arch/arm64/include/asm/mte-kasan.h | 9 ++++-
> arch/arm64/include/asm/mte.h | 32 ++++++++++++++++
> arch/arm64/kernel/entry-common.c | 6 +++
> arch/arm64/kernel/mte.c | 60 +++++++++++++++++++++++++++++-
> include/linux/kasan.h | 6 +++
> lib/test_kasan.c | 6 ++-
> mm/kasan/hw_tags.c | 51 ++++++++++++++++++++++++-
> mm/kasan/kasan.h | 7 +++-
> mm/kasan/report.c | 17 ++++++++-
> 11 files changed, 196 insertions(+), 10 deletions(-)
>

--
Regards,
Vincenzo

2021-02-05 23:16:06

by Vincenzo Frascino

Subject: Re: [PATCH v11 4/5] arm64: mte: Enable async tag check fault



On 2/5/21 3:39 PM, Catalin Marinas wrote:
> On Sat, Jan 30, 2021 at 04:52:24PM +0000, Vincenzo Frascino wrote:
>> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
>> index 92078e1eb627..7763ac1f2917 100644
>> --- a/arch/arm64/kernel/mte.c
>> +++ b/arch/arm64/kernel/mte.c
>> @@ -182,6 +182,37 @@ bool mte_report_once(void)
>> return READ_ONCE(report_fault_once);
>> }
>>
>> +#ifdef CONFIG_KASAN_HW_TAGS
>> +void mte_check_tfsr_el1(void)
>> +{
>> + u64 tfsr_el1;
>> +
>> + if (!system_supports_mte())
>> + return;
>> +
>> + tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);
>> +
>> + /*
>> + * The kernel should never trigger an asynchronous fault on a
>> + * TTBR0 address, so we should never see TF0 set.
>> + * For futexes we disable checks via PSTATE.TCO.
>> + */
>> + WARN_ONCE(tfsr_el1 & SYS_TFSR_EL1_TF0,
>> + "Kernel async tag fault on TTBR0 address");
>
> Sorry, I got confused when I suggested this warning. If the user is
> running in async mode, the TFSR_EL1.TF0 bit may be set by
> copy_mount_options(), strncpy_from_user() which rely on an actual fault
> happening (not the case with asynchronous where only a bit is set). With
> the user MTE support, we never report asynchronous faults caused by the
> kernel on user addresses as we can't easily track them. So this warning
> may be triggered on correctly functioning kernel/user.
>

No issue, I will re-post removing the WARN_ONCE().

>> +
>> + if (unlikely(tfsr_el1 & SYS_TFSR_EL1_TF1)) {
>> + /*
>> + * Note: isb() is not required after this direct write
>> + * because there is no indirect read subsequent to it
>> + * (per ARM DDI 0487F.c table D13-1).
>> + */
>> + write_sysreg_s(0, SYS_TFSR_EL1);
>
> Zeroing the whole register is still fine, we don't care about the TF0
> bit anyway.
>
>> +
>> + kasan_report_async();
>> + }
>> +}
>> +#endif
>

--
Regards,
Vincenzo

2021-02-05 23:33:19

by Catalin Marinas

Subject: Re: [PATCH v11 4/5] arm64: mte: Enable async tag check fault

On Sat, Jan 30, 2021 at 04:52:24PM +0000, Vincenzo Frascino wrote:
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 92078e1eb627..7763ac1f2917 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -182,6 +182,37 @@ bool mte_report_once(void)
> return READ_ONCE(report_fault_once);
> }
>
> +#ifdef CONFIG_KASAN_HW_TAGS
> +void mte_check_tfsr_el1(void)
> +{
> + u64 tfsr_el1;
> +
> + if (!system_supports_mte())
> + return;
> +
> + tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);
> +
> + /*
> + * The kernel should never trigger an asynchronous fault on a
> + * TTBR0 address, so we should never see TF0 set.
> + * For futexes we disable checks via PSTATE.TCO.
> + */
> + WARN_ONCE(tfsr_el1 & SYS_TFSR_EL1_TF0,
> + "Kernel async tag fault on TTBR0 address");

Sorry, I got confused when I suggested this warning. If the user is
running in async mode, the TFSR_EL1.TF0 bit may be set by
copy_mount_options(), strncpy_from_user() which rely on an actual fault
happening (not the case with asynchronous where only a bit is set). With
the user MTE support, we never report asynchronous faults caused by the
kernel on user addresses as we can't easily track them. So this warning
may be triggered on correctly functioning kernel/user.

> +
> + if (unlikely(tfsr_el1 & SYS_TFSR_EL1_TF1)) {
> + /*
> + * Note: isb() is not required after this direct write
> + * because there is no indirect read subsequent to it
> + * (per ARM DDI 0487F.c table D13-1).
> + */
> + write_sysreg_s(0, SYS_TFSR_EL1);

Zeroing the whole register is still fine, we don't care about the TF0
bit anyway.

> +
> + kasan_report_async();
> + }
> +}
> +#endif
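
To make the two review points concrete, here is a user-space sketch of
what the check could look like once the warning is dropped: TF0 is ignored
and the whole register is zeroed only when TF1 is set. It is an
illustration, not the re-posted patch. The sim_* names are hypothetical
and a plain variable stands in for TFSR_EL1; the bit positions (TF0 in
bit 0, TF1 in bit 1) follow the architecture.

```c
#include <stdint.h>

#define SIM_TFSR_EL1_TF0 (1ULL << 0)	/* fault on a TTBR0 (user) address */
#define SIM_TFSR_EL1_TF1 (1ULL << 1)	/* fault on a TTBR1 (kernel) address */

static uint64_t sim_tfsr_el1;		/* stand-in for the TFSR_EL1 register */
static unsigned int sim_async_reports;	/* stand-in for kasan_report_async() */

static void sim_check_tfsr_el1(void)
{
	uint64_t tfsr = sim_tfsr_el1;

	/*
	 * TF0 is deliberately not checked: uaccess helpers may set it
	 * on a correctly functioning kernel/user combination.
	 */
	if (tfsr & SIM_TFSR_EL1_TF1) {
		/* Zero the whole register; TF0 is of no interest here. */
		sim_tfsr_el1 = 0;
		sim_async_reports++;
	}
}
```

With only TF0 set the function does nothing and leaves the register
untouched; with TF1 set (alone or together with TF0) it reports once and
clears both bits.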

--
Catalin