LinuxLists.cc - [PATCHv3 7/8] x86: Expose untagging mask in /proc/$PID/arch

2022-06-10 15:07:02

Subject: [PATCHv3 7/8] x86: Expose untagging mask in /proc/$PID/arch_status

Add a line in /proc/$PID/arch_status to report untag_mask. It can be
used to find out the LAM status of a process from the outside, which is
useful for debuggers.

Signed-off-by: Kirill A. Shutemov <[email protected]>
---
arch/x86/include/asm/mmu_context.h | 10 ++++++
arch/x86/kernel/Makefile | 2 ++
arch/x86/kernel/fpu/xstate.c | 47 ----------------------------
arch/x86/kernel/proc.c | 50 ++++++++++++++++++++++++++++++
4 files changed, 62 insertions(+), 47 deletions(-)
create mode 100644 arch/x86/kernel/proc.c

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 05821534aadc..a6cded0f5e64 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -103,6 +103,11 @@ static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm)
 	mm->context.untag_mask = oldmm->context.untag_mask;
 }

+static inline unsigned long mm_untag_mask(struct mm_struct *mm)
+{
+	return mm->context.untag_mask;
+}
+
 static inline void mm_reset_untag_mask(struct mm_struct *mm)
 {
 	mm->context.untag_mask = -1UL;
@@ -119,6 +124,11 @@ static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm)
 {
 }

+static inline unsigned long mm_untag_mask(struct mm_struct *mm)
+{
+	return -1UL;
+}
+
 static inline void mm_reset_untag_mask(struct mm_struct *mm)
 {
 }
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 03364dc40d8d..228e108cbaba 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -145,6 +145,8 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o

obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o

+obj-$(CONFIG_PROC_FS) += proc.o
+
###
# 64 bit specific files
ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8340156bfd2..838a6f0627fd 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,8 +10,6 @@
 #include <linux/mman.h>
 #include <linux/nospec.h>
 #include <linux/pkeys.h>
-#include <linux/seq_file.h>
-#include <linux/proc_fs.h>
 #include <linux/vmalloc.h>

 #include <asm/fpu/api.h>
@@ -1745,48 +1743,3 @@ long fpu_xstate_prctl(int option, unsigned long arg2)
 		return -EINVAL;
 	}
 }
-
-#ifdef CONFIG_PROC_PID_ARCH_STATUS
-/*
- * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
- */
-static void avx512_status(struct seq_file *m, struct task_struct *task)
-{
-	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
-	long delta;
-
-	if (!timestamp) {
-		/*
-		 * Report -1 if no AVX512 usage
-		 */
-		delta = -1;
-	} else {
-		delta = (long)(jiffies - timestamp);
-		/*
-		 * Cap to LONG_MAX if time difference > LONG_MAX
-		 */
-		if (delta < 0)
-			delta = LONG_MAX;
-		delta = jiffies_to_msecs(delta);
-	}
-
-	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
-	seq_putc(m, '\n');
-}
-
-/*
- * Report architecture specific information
- */
-int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
-			struct pid *pid, struct task_struct *task)
-{
-	/*
-	 * Report AVX512 state if the processor and build option supported.
-	 */
-	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
-		avx512_status(m, task);
-
-	return 0;
-}
-#endif /* CONFIG_PROC_PID_ARCH_STATUS */
diff --git a/arch/x86/kernel/proc.c b/arch/x86/kernel/proc.c
new file mode 100644
index 000000000000..59e681425e09
--- /dev/null
+++ b/arch/x86/kernel/proc.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>
+#include <uapi/asm/prctl.h>
+#include <asm/mmu_context.h>
+
+/*
+ * Report the amount of time elapsed in millisecond since last AVX512
+ * use in the task.
+ */
+static void avx512_status(struct seq_file *m, struct task_struct *task)
+{
+	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
+	long delta;
+
+	if (!timestamp) {
+		/*
+		 * Report -1 if no AVX512 usage
+		 */
+		delta = -1;
+	} else {
+		delta = (long)(jiffies - timestamp);
+		/*
+		 * Cap to LONG_MAX if time difference > LONG_MAX
+		 */
+		if (delta < 0)
+			delta = LONG_MAX;
+		delta = jiffies_to_msecs(delta);
+	}
+
+	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
+	seq_putc(m, '\n');
+}
+
+/*
+ * Report architecture specific information
+ */
+int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
+			struct pid *pid, struct task_struct *task)
+{
+	/*
+	 * Report AVX512 state if the processor and build option supported.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
+		avx512_status(m, task);
+
+	seq_printf(m, "untag_mask:\t%#lx\n", mm_untag_mask(task->mm));
+
+	return 0;
+}
--
2.35.1
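
A debugger or other outside tool could consume the new line roughly as
follows. This is a minimal user-space sketch, not part of the patch: the
helper names parse_untag_mask() and read_untag_mask() are made up for
illustration, and it assumes the line keeps the "untag_mask:\t%#lx"
format produced by the seq_printf() call above.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find the "untag_mask:" line in arch_status-style
 * text and parse its hexadecimal value.
 * Returns 0 and stores the mask on success, -1 if the line is missing
 * or malformed.
 */
int parse_untag_mask(const char *text, unsigned long long *mask)
{
	static const char key[] = "untag_mask:";
	const size_t key_len = sizeof(key) - 1;
	const char *line = text;

	assert(text != NULL && mask != NULL);

	while (line != NULL && *line != '\0') {
		if (strncmp(line, key, key_len) == 0) {
			const char *value = line + key_len;
			unsigned long long parsed;
			char *end;

			/* Skip the tab (or spaces) between key and value. */
			while (*value == ' ' || *value == '\t')
				value++;
			if (!isxdigit((unsigned char)*value))
				return -1;

			errno = 0;
			/* Base 16 also accepts the "0x" prefix from %#lx. */
			parsed = strtoull(value, &end, 16);
			if (errno != 0 || end == value)
				return -1;
			*mask = parsed;
			return 0;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return -1;
}

/*
 * Hypothetical helper: read /proc/<pid>/arch_status and extract the mask.
 * Returns -1 if the file cannot be opened or has no untag_mask line.
 */
int read_untag_mask(long pid, unsigned long long *mask)
{
	char path[64];
	char buf[4096];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/arch_status", pid);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return parse_untag_mask(buf, mask);
}
```

A caller would pass the target PID to read_untag_mask() and treat a
result with all bits set (the -1UL default in the patch) as "no tag bits
are masked".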

2022-06-10 16:35:02

by Dave Hansen

Subject: Re: [PATCHv3 7/8] x86: Expose untagging mask in /proc/$PID/arch_status

On 6/10/22 07:35, Kirill A. Shutemov wrote:
> +/*
> + * Report architecture specific information
> + */
> +int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
> +			struct pid *pid, struct task_struct *task)
> +{
> +	/*
> +	 * Report AVX512 state if the processor and build option supported.
> +	 */
> +	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
> +		avx512_status(m, task);
> +
> +	seq_printf(m, "untag_mask:\t%#lx\n", mm_untag_mask(task->mm));
> +
> +	return 0;
> +}

Arch-specific gunk is great for, well, arch-specific stuff. AVX-512 and
its, um, "quirks", really won't show up anywhere else. But x86 isn't
even the first to be doing this address tagging business.

Shouldn't we be talking to the ARM folks about a common way to do this?

2022-06-11 01:30:49

by Kirill A. Shutemov

Subject: Re: [PATCHv3 7/8] x86: Expose untagging mask in /proc/$PID/arch_status

On Fri, Jun 10, 2022 at 08:24:38AM -0700, Dave Hansen wrote:
> On 6/10/22 07:35, Kirill A. Shutemov wrote:
> > +/*
> > + * Report architecture specific information
> > + */
> > +int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
> > +			struct pid *pid, struct task_struct *task)
> > +{
> > +	/*
> > +	 * Report AVX512 state if the processor and build option supported.
> > +	 */
> > +	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
> > +		avx512_status(m, task);
> > +
> > +	seq_printf(m, "untag_mask:\t%#lx\n", mm_untag_mask(task->mm));
> > +
> > +	return 0;
> > +}
>
> Arch-specific gunk is great for, well, arch-specific stuff. AVX-512 and
> its, um, "quirks", really won't show up anywhere else. But x86 isn't
> even the first to be doing this address tagging business.
>
> Shouldn't we be talking to the ARM folks about a common way to do this?

+ Catalin, Will.

I guess we can expose the mask via proc for ARM too, but I'm not sure if
we can unify the interface further without breaking existing TBI users:
TBI is enabled per-thread while LAM is per-process.

Any opinions?

--
Kirill A. Shutemov

2022-06-27 12:36:00

by Catalin Marinas

Subject: Re: [PATCHv3 7/8] x86: Expose untagging mask in /proc/$PID/arch_status

Hi Kirill,

Sorry, this fell through the cracks (thanks to Will for reminding me).

On Sat, Jun 11, 2022 at 04:28:30AM +0300, Kirill A. Shutemov wrote:
> On Fri, Jun 10, 2022 at 08:24:38AM -0700, Dave Hansen wrote:
> > On 6/10/22 07:35, Kirill A. Shutemov wrote:
> > > +/*
> > > + * Report architecture specific information
> > > + */
> > > +int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
> > > +			struct pid *pid, struct task_struct *task)
> > > +{
> > > +	/*
> > > +	 * Report AVX512 state if the processor and build option supported.
> > > +	 */
> > > +	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
> > > +		avx512_status(m, task);
> > > +
> > > +	seq_printf(m, "untag_mask:\t%#lx\n", mm_untag_mask(task->mm));
> > > +
> > > +	return 0;
> > > +}
> >
> > Arch-specific gunk is great for, well, arch-specific stuff. AVX-512 and
> > its, um, "quirks", really won't show up anywhere else. But x86 isn't
> > even the first to be doing this address tagging business.
> >
> > Shouldn't we be talking to the ARM folks about a common way to do this?
>
> + Catalin, Will.
>
> I guess we can expose the mask via proc for ARM too, but I'm not sure if
> we can unify the interface further without breaking existing TBI users:
> TBI is enabled per-thread while LAM is per-process.

Hardware TBI is enabled for all user space at boot (it was like this
from the beginning). The TBI syscall interface is per-thread (TIF flag)
but it doesn't change any hardware behaviour. The mask is fixed in
hardware, unchangeable. I'm fine with reporting an untag_mask in a
common way, only that setting it won't be possible on arm64.

If arm64 ever gains support for a modifiable untag_mask, there's a good
chance it would be per-mm as well, since the controls for TBI are per
page table.

--
Catalin
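
To make the mask semantics discussed in this thread concrete, here is a
small sketch of how a tool might apply a reported untag_mask to a tagged
user pointer. The constants are illustrative assumptions, not values
taken from the patch: the LAM one assumes the tag sits in bits 62:57
(LAM_U57), and the TBI one assumes a fixed mask that clears the top byte
of a user address, in line with the description of arm64 above. The
function names untag_addr() and tag_bits() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative masks only; assumptions, not read from any kernel header. */
#define UNTAG_MASK_NONE     UINT64_C(0xffffffffffffffff) /* -1UL: nothing masked */
#define UNTAG_MASK_LAM_U57  UINT64_C(0x81ffffffffffffff) /* clears bits 62:57 */
#define UNTAG_MASK_TBI_USER UINT64_C(0x00ffffffffffffff) /* clears bits 63:56 */

/* Strip the tag: keep only the address bits selected by the mask. */
uint64_t untag_addr(uint64_t addr, uint64_t untag_mask)
{
	return addr & untag_mask;
}

/* Extract the tag: keep only the bits that the mask would clear. */
uint64_t tag_bits(uint64_t addr, uint64_t untag_mask)
{
	return addr & ~untag_mask;
}
```

With a mask of all ones both helpers degrade gracefully: untag_addr()
returns the address unchanged and tag_bits() returns zero, which matches
the -1UL value reported when no tagging is in effect.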