2020-07-28 20:23:13

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 0/6] Reclaim x86 psABI TIF flags

Hi,

This patch set reduces the number of arch-specific TIF_ flags in x86, as
a clean up to reduce the pressure over the few remaining x86_32 TIF
bits and as a preparation to have the arch-agnostic TIF_ flags shared by
different architectures by the common syscall entry code recently
published by Thomas Gleixner.

This is based on top of:

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/entry

It was tested by booting and running an i386 rootfs over the 64bit
kernel, and also by running x32 binaries on top of it.

There is one remaining use of TIF_IA32 in User Mode Linux which is
turned off by a bogus define, such that I think it is currently dead
code and apparently a bug. I think, it was wired to an old TIF_IA32
flag definition that doesn't exist anymore in arch/um. I think it
deserves a fix other than dropping that code, but it is a different
piece of work that I will tackle next.

Gabriel Krisman Bertazi (6):
arch: x86: Don't use TIF flags for mm context
arch: x86: Wrap TIF_IA32 checks
arch: x86: Wrap TIF_X32 checks
arch: x86: Expose psABI on thread_info
arch: x86: Reclaim TIF_IA32 flag
arch: x86: Reclaim TIF_X32 flag

arch/x86/entry/vdso/vma.c | 2 +-
arch/x86/events/core.c | 4 ++--
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/intel/lbr.c | 2 +-
arch/x86/include/asm/compat.h | 2 +-
arch/x86/include/asm/elf.h | 2 +-
arch/x86/include/asm/mmu_context.h | 2 +-
arch/x86/include/asm/thread_info.h | 15 +++++++++++----
arch/x86/kernel/perf_regs.c | 2 +-
arch/x86/kernel/process_64.c | 16 ++++++----------
arch/x86/oprofile/backtrace.c | 2 +-
11 files changed, 27 insertions(+), 24 deletions(-)

--
2.27.0


2020-07-28 20:23:18

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 1/6] arch: x86: Don't use TIF flags for mm context

TIF_IA32 and TIF_X32 are going away. Create a dedicated enum for the
MM context.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
arch/x86/include/asm/mmu_context.h | 2 +-
arch/x86/include/asm/thread_info.h | 6 ++++++
arch/x86/kernel/process_64.c | 4 ++--
3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 47562147e70b..055ee5d66b41 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -178,7 +178,7 @@ static inline void arch_exit_mmap(struct mm_struct *mm)
static inline bool is_64bit_mm(struct mm_struct *mm)
{
return !IS_ENABLED(CONFIG_IA32_EMULATION) ||
- !(mm->context.ia32_compat == TIF_IA32);
+ !(mm->context.ia32_compat == PSABI_IA32);
}
#else
static inline bool is_64bit_mm(struct mm_struct *mm)
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 267701ae3d86..934aa15b20f2 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -53,6 +53,12 @@ struct task_struct;
#include <asm/cpufeature.h>
#include <linux/atomic.h>

+enum {
+ PSABI_IA64 = 0,
+ PSABI_IA32 = 1,
+ PSABI_X32 = 2
+};
+
struct thread_info {
unsigned long flags; /* low level flags */
u32 status; /* thread synchronous flags */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9a97415b2139..4452a35402f9 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -557,7 +557,7 @@ static void __set_personality_x32(void)
clear_thread_flag(TIF_IA32);
set_thread_flag(TIF_X32);
if (current->mm)
- current->mm->context.ia32_compat = TIF_X32;
+ current->mm->context.ia32_compat = PSABI_X32;
current->personality &= ~READ_IMPLIES_EXEC;
/*
* in_32bit_syscall() uses the presence of the x32 syscall bit
@@ -578,7 +578,7 @@ static void __set_personality_ia32(void)
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_X32);
if (current->mm)
- current->mm->context.ia32_compat = TIF_IA32;
+ current->mm->context.ia32_compat = PSABI_IA32;
current->personality |= force_personality32;
/* Prepare the first "return" to user space */
task_pt_regs(current)->orig_ax = __NR_ia32_execve;
--
2.27.0

2020-07-28 20:23:25

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

In preparation to remove TIF_IA32, add wrapper that check the process
has IA32 ABI without using the flag directly.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
arch/x86/events/core.c | 2 +-
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/intel/lbr.c | 2 +-
arch/x86/include/asm/compat.h | 2 +-
arch/x86/include/asm/thread_info.h | 2 ++
arch/x86/kernel/perf_regs.c | 2 +-
arch/x86/oprofile/backtrace.c | 2 +-
7 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 4103665c6e03..42dff74c6197 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2491,7 +2491,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry_ctx *ent
struct stack_frame_ia32 frame;
const struct stack_frame_ia32 __user *fp;

- if (!test_thread_flag(TIF_IA32))
+ if (!TASK_IA32(current))
return 0;

cs_base = get_segment_base(regs->cs);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index dc43cc124e09..27d1cc1f3d05 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1261,7 +1261,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
old_to = to;

#ifdef CONFIG_X86_64
- is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
+ is_64bit = kernel_ip(to) || !TASK_IA32(current);
#endif
insn_init(&insn, kaddr, size, is_64bit);
insn_get_length(&insn);
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 65113b16804a..6c097a2eac97 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -920,7 +920,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
* on 64-bit systems running 32-bit apps
*/
#ifdef CONFIG_X86_64
- is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
+ is64 = kernel_ip((unsigned long)addr) || !TASK_IA32(current);
#endif
insn_init(&insn, addr, bytes_read, is64);
insn_get_opcode(&insn);
diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index d4edf281fff4..d39f9b3ae683 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -181,7 +181,7 @@ static inline void __user *arch_compat_alloc_user_space(long len)
{
compat_uptr_t sp;

- if (test_thread_flag(TIF_IA32)) {
+ if (TASK_IA32(current)) {
sp = task_pt_regs(current)->sp;
} else {
/* -128 for the x32 ABI redzone */
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 934aa15b20f2..a3859595847c 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -243,4 +243,6 @@ extern void arch_setup_new_exec(void);
#define arch_setup_new_exec arch_setup_new_exec
#endif /* !__ASSEMBLY__ */

+#define TASK_IA32(tsk) test_tsk_thread_flag(tsk, TIF_IA32)
+
#endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index bb7e1132290b..9b446d3b67d2 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -123,7 +123,7 @@ int perf_reg_validate(u64 mask)

u64 perf_reg_abi(struct task_struct *task)
{
- if (test_tsk_thread_flag(task, TIF_IA32))
+ if (TASK_IA32(task))
return PERF_SAMPLE_REGS_ABI_32;
else
return PERF_SAMPLE_REGS_ABI_64;
diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
index a2488b6e27d6..3f1086afa297 100644
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -49,7 +49,7 @@ x86_backtrace_32(struct pt_regs * const regs, unsigned int depth)
struct stack_frame_ia32 *head;

/* User process is IA32 */
- if (!current || !test_thread_flag(TIF_IA32))
+ if (!current || !TASK_IA32(current))
return 0;

head = (struct stack_frame_ia32 *) regs->bp;
--
2.27.0

2020-07-28 20:23:57

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 3/6] arch: x86: Wrap TIF_X32 checks

In preparation to remove TIF_X32, add a wrapper that checks the process
is using the X32 ABI without using the flag directly.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
arch/x86/entry/vdso/vma.c | 2 +-
arch/x86/include/asm/elf.h | 2 +-
arch/x86/include/asm/thread_info.h | 1 +
arch/x86/kernel/process_64.c | 3 +--
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index ea7c1f0b79df..0f54a5feeced 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -417,7 +417,7 @@ int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
int uses_interp)
{
#ifdef CONFIG_X86_X32_ABI
- if (test_thread_flag(TIF_X32)) {
+ if (TASK_X32(current)) {
if (!vdso64_enabled)
return 0;
return map_vdso_randomized(&vdso_image_x32);
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 452beed7892b..a5c8f10d5180 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -363,7 +363,7 @@ do { \
#define AT_SYSINFO 32

#define COMPAT_ARCH_DLINFO \
-if (test_thread_flag(TIF_X32)) \
+if (TASK_X32(current)) \
ARCH_DLINFO_X32; \
else \
ARCH_DLINFO_IA32
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index a3859595847c..6d55a9c0dda2 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -244,5 +244,6 @@ extern void arch_setup_new_exec(void);
#endif /* !__ASSEMBLY__ */

#define TASK_IA32(tsk) test_tsk_thread_flag(tsk, TIF_IA32)
+#define TASK_X32(tsk) test_tsk_thread_flag(tsk, TIF_X32)

#endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4452a35402f9..f20a365017b8 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -406,8 +406,7 @@ EXPORT_SYMBOL_GPL(start_thread);
void compat_start_thread(struct pt_regs *regs, u32 new_ip, u32 new_sp)
{
start_thread_common(regs, new_ip, new_sp,
- test_thread_flag(TIF_X32)
- ? __USER_CS : __USER32_CS,
+ TASK_X32(current) ? __USER_CS : __USER32_CS,
__USER_DS, __USER_DS);
}
#endif
--
2.27.0

2020-07-28 20:24:35

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 5/6] arch: x86: Reclaim TIF_IA32 flag

Dropping this as a TIF flag is interesting given the pressure over x86
remaining x86 flags, plus considering the current common entry code,
reducing arch-specific flags is a good thing.

Notice that no path really relies on TIF_IA32 as part of a critical
path, therefore the cost of checking another field in thread_info
shouldn't be a problem.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
arch/x86/include/asm/thread_info.h | 4 +---
arch/x86/kernel/process_64.c | 3 ---
2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 698feefd5f5f..aa7d27054a8a 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -99,7 +99,6 @@ struct thread_info {
#define TIF_NEED_FPU_LOAD 14 /* load FPU on return to userspace */
#define TIF_NOCPUID 15 /* CPUID is not accessible in userland */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
-#define TIF_IA32 17 /* IA32 compatibility process */
#define TIF_SLD 18 /* Restore split lock detection on context switch */
#define TIF_MEMDIE 20 /* is terminating due to OOM killer */
#define TIF_POLLING_NRFLAG 21 /* idle is polling for TIF_NEED_RESCHED */
@@ -129,7 +128,6 @@ struct thread_info {
#define _TIF_NEED_FPU_LOAD (1 << TIF_NEED_FPU_LOAD)
#define _TIF_NOCPUID (1 << TIF_NOCPUID)
#define _TIF_NOTSC (1 << TIF_NOTSC)
-#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_SLD (1 << TIF_SLD)
#define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
@@ -245,7 +243,7 @@ extern void arch_setup_new_exec(void);
#define arch_setup_new_exec arch_setup_new_exec
#endif /* !__ASSEMBLY__ */

-#define TASK_IA32(tsk) test_tsk_thread_flag(tsk, TIF_IA32)
+#define TASK_IA32(tsk) (task_thread_info(tsk)->psabi == PSABI_IA32)
#define TASK_X32(tsk) test_tsk_thread_flag(tsk, TIF_X32)

#endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index aea2c03e8a5d..75059c9de829 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -533,7 +533,6 @@ void set_personality_64bit(void)

/* Make sure to be in 64bit mode */
current_thread_info()->psabi = PSABI_IA64;
- clear_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ADDR32);
clear_thread_flag(TIF_X32);
/* Pretend that this comes from a 64bit execve */
@@ -555,7 +554,6 @@ static void __set_personality_x32(void)
{
#ifdef CONFIG_X86_X32
current_thread_info()->psabi = PSABI_X32;
- clear_thread_flag(TIF_IA32);
set_thread_flag(TIF_X32);
if (current->mm)
current->mm->context.ia32_compat = PSABI_X32;
@@ -577,7 +575,6 @@ static void __set_personality_ia32(void)
{
#ifdef CONFIG_IA32_EMULATION
current_thread_info()->psabi = PSABI_IA32;
- set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_X32);
if (current->mm)
current->mm->context.ia32_compat = PSABI_IA32;
--
2.27.0

2020-07-28 20:25:07

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 6/6] arch: x86: Reclaim TIF_X32 flag

Dropping this as a TIF flag is interesting given the pressure over x86
remaining x86 flags, plus considering the current common entry code,
reducing arch-specific flags is a good thing.

Notice that no path really relies on TIF_X32 as part of a critical path,
therefore the cost of checking another field in thread_info shouldn't be
a problem.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
arch/x86/events/core.c | 2 +-
arch/x86/include/asm/thread_info.h | 4 +---
arch/x86/kernel/process_64.c | 3 ---
3 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 42dff74c6197..389a840e2211 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2574,7 +2574,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
* IA32 - Where we need to look at GDT/LDT segment descriptor tables
* to figure out what the 32bit base address is.
*
- * X32 - has TIF_X32 set, but is running in x86_64
+ * X32 - has PSABI_X32 set, but is running in x86_64
*
* X86_64 - CS,DS,SS,ES are all zero based.
*/
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index aa7d27054a8a..3059af355cdb 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -108,7 +108,6 @@ struct thread_info {
#define TIF_LAZY_MMU_UPDATES 27 /* task is updating the mmu lazily */
#define TIF_SYSCALL_TRACEPOINT 28 /* syscall tracepoint instrumentation */
#define TIF_ADDR32 29 /* 32-bit address space on 64 bits */
-#define TIF_X32 30 /* 32-bit native x86-64 binary */
#define TIF_FSCHECK 31 /* Check FS is USER_DS on return */

#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
@@ -136,7 +135,6 @@ struct thread_info {
#define _TIF_LAZY_MMU_UPDATES (1 << TIF_LAZY_MMU_UPDATES)
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
#define _TIF_ADDR32 (1 << TIF_ADDR32)
-#define _TIF_X32 (1 << TIF_X32)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)

/* flags to check in __switch_to() */
@@ -244,6 +242,6 @@ extern void arch_setup_new_exec(void);
#endif /* !__ASSEMBLY__ */

#define TASK_IA32(tsk) (task_thread_info(tsk)->psabi == PSABI_IA32)
-#define TASK_X32(tsk) test_tsk_thread_flag(tsk, TIF_X32)
+#define TASK_X32(tsk) (task_thread_info(tsk)->psabi == PSABI_X32)

#endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 75059c9de829..d9a72e186db6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -534,7 +534,6 @@ void set_personality_64bit(void)
/* Make sure to be in 64bit mode */
current_thread_info()->psabi = PSABI_IA64;
clear_thread_flag(TIF_ADDR32);
- clear_thread_flag(TIF_X32);
/* Pretend that this comes from a 64bit execve */
task_pt_regs(current)->orig_ax = __NR_execve;
current_thread_info()->status &= ~TS_COMPAT;
@@ -554,7 +553,6 @@ static void __set_personality_x32(void)
{
#ifdef CONFIG_X86_X32
current_thread_info()->psabi = PSABI_X32;
- set_thread_flag(TIF_X32);
if (current->mm)
current->mm->context.ia32_compat = PSABI_X32;
current->personality &= ~READ_IMPLIES_EXEC;
@@ -575,7 +573,6 @@ static void __set_personality_ia32(void)
{
#ifdef CONFIG_IA32_EMULATION
current_thread_info()->psabi = PSABI_IA32;
- clear_thread_flag(TIF_X32);
if (current->mm)
current->mm->context.ia32_compat = PSABI_IA32;
current->personality |= force_personality32;
--
2.27.0

2020-07-28 20:26:02

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 4/6] arch: x86: Expose psABI on thread_info

Expose psABI in thread_info, in preparation for the TIF_IA32 and
TIF_X32 flags removal.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
arch/x86/include/asm/thread_info.h | 2 ++
arch/x86/kernel/process_64.c | 3 +++
2 files changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 6d55a9c0dda2..698feefd5f5f 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -62,11 +62,13 @@ enum {
struct thread_info {
unsigned long flags; /* low level flags */
u32 status; /* thread synchronous flags */
+ short int psabi;
};

#define INIT_THREAD_INFO(tsk) \
{ \
.flags = 0, \
+ .psabi = 0, \
}

#else /* !__ASSEMBLY__ */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f20a365017b8..aea2c03e8a5d 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -532,6 +532,7 @@ void set_personality_64bit(void)
/* inherit personality from parent */

/* Make sure to be in 64bit mode */
+ current_thread_info()->psabi = PSABI_IA64;
clear_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ADDR32);
clear_thread_flag(TIF_X32);
@@ -553,6 +554,7 @@ void set_personality_64bit(void)
static void __set_personality_x32(void)
{
#ifdef CONFIG_X86_X32
+ current_thread_info()->psabi = PSABI_X32;
clear_thread_flag(TIF_IA32);
set_thread_flag(TIF_X32);
if (current->mm)
@@ -574,6 +576,7 @@ static void __set_personality_x32(void)
static void __set_personality_ia32(void)
{
#ifdef CONFIG_IA32_EMULATION
+ current_thread_info()->psabi = PSABI_IA32;
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_X32);
if (current->mm)
--
2.27.0

2020-07-29 03:45:06

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> In preparation to remove TIF_IA32, add wrapper that check the process
> has IA32 ABI without using the flag directly.

Thank you for doing this, but let's please do it right. There is,
fundamentally, no such thing as a "process with IA32 ABI".

>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> arch/x86/events/core.c | 2 +-
> arch/x86/events/intel/ds.c | 2 +-
> arch/x86/events/intel/lbr.c | 2 +-
> arch/x86/include/asm/compat.h | 2 +-
> arch/x86/include/asm/thread_info.h | 2 ++
> arch/x86/kernel/perf_regs.c | 2 +-
> arch/x86/oprofile/backtrace.c | 2 +-
> 7 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 4103665c6e03..42dff74c6197 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2491,7 +2491,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry_ctx *ent
> struct stack_frame_ia32 frame;
> const struct stack_frame_ia32 __user *fp;
>
> - if (!test_thread_flag(TIF_IA32))
> + if (!TASK_IA32(current))
> return 0;

if (user_64bit_mode(regs))
return 0;

>
> cs_base = get_segment_base(regs->cs);
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index dc43cc124e09..27d1cc1f3d05 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1261,7 +1261,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
> old_to = to;
>
> #ifdef CONFIG_X86_64
> - is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
> + is_64bit = kernel_ip(to) || !TASK_IA32(current);

PeterZ, does PEBS not give us a CPL? Is it really just IP?

Anyway, this should probably be:

is_64bit = kernel_ip(to) || user_64bit_mode(regs) || !user_mode(regs);


> #ifdef CONFIG_X86_64
> - is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
> + is64 = kernel_ip((unsigned long)addr) || !TASK_IA32(current);

Same as above.

> diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
> index d4edf281fff4..d39f9b3ae683 100644
> --- a/arch/x86/include/asm/compat.h
> +++ b/arch/x86/include/asm/compat.h
> @@ -181,7 +181,7 @@ static inline void __user *arch_compat_alloc_user_space(long len)
> {
> compat_uptr_t sp;
>
> - if (test_thread_flag(TIF_IA32)) {
> + if (TASK_IA32(current)) {
> sp = task_pt_regs(current)->sp;

Christoph, you spend a *lot* more time looking at this stuff lately
than I do, but this looks totally wrong. Shouldn't this be either:

sp = task_pt_regs(current)->sp;

/* This might be a compat syscall issued via int $0x80 from 64-bit-ABI code. */
if (user_64bit_mode(task_pt_regs(current))
sp -= 128;

Or perhaps the same thing without the user_64bit_mode() check at all?
There shouldn't be much if any harm done by respecting the redzone
unnecessarily.

> --- a/arch/x86/kernel/perf_regs.c
> +++ b/arch/x86/kernel/perf_regs.c
> @@ -123,7 +123,7 @@ int perf_reg_validate(u64 mask)
>
> u64 perf_reg_abi(struct task_struct *task)
> {
> - if (test_tsk_thread_flag(task, TIF_IA32))
> + if (TASK_IA32(task))
> return PERF_SAMPLE_REGS_ABI_32;
> else
> return PERF_SAMPLE_REGS_ABI_64;

Surely this should be:

if (user_64bit_mode(task_pt_regs(regs))
return PERF_SAMPLE_REGS_ABI_64;
else
return PERF_SAMPLE_REGS_ABI_32;

> diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
> index a2488b6e27d6..3f1086afa297 100644
> --- a/arch/x86/oprofile/backtrace.c
> +++ b/arch/x86/oprofile/backtrace.c
> @@ -49,7 +49,7 @@ x86_backtrace_32(struct pt_regs * const regs, unsigned int depth)
> struct stack_frame_ia32 *head;
>
> /* User process is IA32 */
> - if (!current || !test_thread_flag(TIF_IA32))
> + if (!current || !TASK_IA32(current))
> return 0;

if (user_64bit_mode(regs))
return 0;


And now you don't need the TASK_IA32 macro :)

All of the above being said, I'm wondering how many of these profiling
users remember to check whether the task is a kernel thread. And I
have no idea what task_pt_regs(current) contains in a kernel thread.

--Andy

2020-07-29 03:45:13

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 4/6] arch: x86: Expose psABI on thread_info

On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Expose psABI in thread_info, in preparation for the TIF_IA32 and
> TIF_X32 flags removal.

NAK. Linux threads don't have a user ABI like this. See my other comments :)

--Andy

2020-07-29 04:49:33

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: Re: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

Andy Lutomirski <[email protected]> writes:

> On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi
> <[email protected]> wrote:
>>
>> In preparation to remove TIF_IA32, add wrapper that check the process
>> has IA32 ABI without using the flag directly.
>
> Thank you for doing this, but let's please do it right. There is,
> fundamentally, no such thing as a "process with IA32 ABI".

Hi Andy,

Thanks a lot for your review.

As you can see, I'm learning my way here. Can you clarify "there is no
such a thing as a 'process with IA32 ABI'"? I'm not sure if I confused
the terminology or if (more worrisome for me) I got the concepts wrong.

My understanding is that TIF_IA32 marks a thread that is running under
the 32-bit compat mode, which would be running a 32-bit process (as in
compiled with -m32, for instance), while TIF_X32 marks a process running
under the X32 ABI. Each process would have only one of these
"personalities". This is what I meant by a process with IA32 ABI (which
is wrong in any case). Is there more to it, or is the problem the
terminology I used?

I don't have any comments on the other things you mentioned, except that
I need to go through them and better understand your suggestions. Would
you prefer me to rework this patch series with what you suggested or is
this something you want to take over and do yourself? Both ways are
fine by me.

Thanks,

--
Gabriel Krisman Bertazi

2020-07-29 04:57:37

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 3/6] arch: x86: Wrap TIF_X32 checks

On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> In preparation to remove TIF_X32, add a wrapper that checks the process
> is using the X32 ABI without using the flag directly.

I'm not sure what the right solution here is, but all three of these
functions are in the ELF loading path, and the callers of all three
have access to the full ELF info. Let's instead either stick enough
information into bprm to figure out what ABI is being loaded or pass
in a pointer to the ELF headers. The latter might be a bit nasty due
to the way that the ELF headers have a different type for compat.

In any case, sticking this into per-task state (task_struct or
anything in it) is silly. The full ABI info is on the call stack, and
there's no need to keep this particular bit around forever.

--Andy

2020-07-29 05:10:49

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

On Tue, Jul 28, 2020 at 9:46 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Andy Lutomirski <[email protected]> writes:
>
> > On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi
> > <[email protected]> wrote:
> >>
> >> In preparation to remove TIF_IA32, add wrapper that check the process
> >> has IA32 ABI without using the flag directly.
> >
> > Thank you for doing this, but let's please do it right. There is,
> > fundamentally, no such thing as a "process with IA32 ABI".
>
> Hi Andy,
>
> Thanks a lot for your review.
>
> As you can see, I'm learning my way here. Can you clarify "there is no
> such a thing as a 'process with IA32 ABI'"? I'm not sure if I confused
> the terminology or if (more worrisome for me) I got the concepts wrong.
>
> My understanding is that TIF_IA32 marks a thread that is running under
> the 32-bit compat mode, which would be running a 32-bit process (as in
> compiled with -m32, for instance), while TIF_X32 marks a process running
> under the X32 ABI. Each process would have only one of these
> "personalities". This is what I meant by a process with IA32 ABI (which
> is wrong in any case). Is there more to it, or is the problem the
> terminology I used?

There's more to it.

On sane architectures like Linux on ARM64, a process can be 32-bit or
64-bit. If you are a 64-bit ARM process, you run in 64-bit mode,
period. Anything other than execve() that lets you become 32-bit is a
bug.

Linux on x86 is nutty, though. (IIRC Sparc is too.) In particular,
it's entirely legal for a Linux program to switch between 32-bit and
64-bit mode all by itself. This can be done with LJMP, LRET, IRET, or
a few other hacks. It can also be done using any of the sigreturn
syscalls. (SYSENTER comes to mind, but anyone who uses SYSENTER to
switch modes is *nuts*.) Certain evil people named Andy Lutomirski
write kernel selftests that switch modes using sigreturn just to abuse
the kernel. But some real programs do this too. For example, I think
that CRIU restores 32-bit programs by starting out as a 64-bit
program, restoring everything, and then switching to 32-bit mode.
Keep in mind that most of these mode-switching tricks do not enter the
kernel at all.

The upshot is that kernel does not actually know what the running
program's ABI is. For all the kernel knows, a process uses the
Linux/psABI 64-bit ABI, the ia32 ABI, the x32 ABI, the Windows 32-bit
ABI, the Windows 64-bit ABI, and some gnarly ABI from DOS or OS/2, all
in the same process.

So we really shouldn't be trying to keep track of what we think the
user program should be doing, and instead we should pay attention to
what the user program actually did in its interactions with the
kernel. If we want to know whether a user task is running in 64-bit
mode right now, we can use user_64bit_mode(regs). (And keep in mind
that syscalls like rt_sigreturn() and anything that gets ptraced can
*change* the outcome of user_64bit_mode(regs) mid-syscall.) If we're
delivering a signal, we instead want to know the ABI of the signal's
handler, so we use is_ia32_frame(ksig) and is_x32_frame(ksig). If we
want to know whether mmap() should return values above 2^32-1, we
currently use some hackery, but we really ought to be looking at
whether the call was 64-bit mmap() or compat mmap(). (I had some
really half-baked patches to fix this at one point, but the core code
was a real mess and I didn't finish it.) If we still supported MPX in
the kernel, we'd have a real mess, because MPX malfunctions if user
programs switch bitness on us. Good riddance!

>
> I don't have any comments on the other things you mentioned, except that
> I need to go through them and better understand your suggestions. Would
> you prefer me to rework this patch series with what you suggested or is
> this something you want to take over and do yourself? Both ways are
> fine by me.

Please rework it :) I have seriously limited bandwidth right now.

2020-07-29 07:02:58

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

On Tue, Jul 28, 2020 at 08:43:27PM -0700, Andy Lutomirski wrote:
> > index d4edf281fff4..d39f9b3ae683 100644
> > --- a/arch/x86/include/asm/compat.h
> > +++ b/arch/x86/include/asm/compat.h
> > @@ -181,7 +181,7 @@ static inline void __user *arch_compat_alloc_user_space(long len)
> > {
> > compat_uptr_t sp;
> >
> > - if (test_thread_flag(TIF_IA32)) {
> > + if (TASK_IA32(current)) {
> > sp = task_pt_regs(current)->sp;
>
> Christoph, you spend a *lot* more time looking at this stuff lately
> than I do, but this looks totally wrong. Shouldn't this be either:
>
> sp = task_pt_regs(current)->sp;
>
> /* This might be a compat syscall issued via int $0x80 from 64-bit-ABI code. */
> if (user_64bit_mode(task_pt_regs(current))
> sp -= 128;
>
> Or perhaps the same thing without the user_64bit_mode() check at all?
> There shouldn't be much if any harm done by respecting the redzone
> unnecessarily.

compat_alloc_user_space is only used when either called from compat
calls or if in_compat_syscall() is true (and there are very few callers
left, and we plan to kill it off entirely..).

Which means we are either called from an i386 or x32 syscall, but then
again IIRC user_64bit_mode would also return true for x32. So your
above version looks correct, and I'd also be tempted to just always
respect the redzone.

> Surely this should be:
>
> if (user_64bit_mode(task_pt_regs(regs))

s/regs/current/

Btw, I wonder if want a shorthand for

user_64bit_mode(task_pt_regs(thread))

instead of always open coding it.

2020-07-29 09:04:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 1/6] arch: x86: Don't use TIF flags for mm context

On Tue, Jul 28, 2020 at 04:22:24PM -0400, Gabriel Krisman Bertazi wrote:

> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> index 267701ae3d86..934aa15b20f2 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -53,6 +53,12 @@ struct task_struct;
> #include <asm/cpufeature.h>
> #include <linux/atomic.h>
>
> +enum {
> + PSABI_IA64 = 0,

That's the Itanic, and it has sailed. I think you want PSABI_X86_64 or
something.

> + PSABI_IA32 = 1,
> + PSABI_X32 = 2
> +};

2020-07-29 09:15:01

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

On Tue, Jul 28, 2020 at 08:43:27PM -0700, Andy Lutomirski wrote:
> On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi

> > diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> > index dc43cc124e09..27d1cc1f3d05 100644
> > --- a/arch/x86/events/intel/ds.c
> > +++ b/arch/x86/events/intel/ds.c
> > @@ -1261,7 +1261,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
> > old_to = to;
> >
> > #ifdef CONFIG_X86_64
> > - is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
> > + is_64bit = kernel_ip(to) || !TASK_IA32(current);
>
> PeterZ, does PEBS not give us a CPL? Is it really just IP?
>
> Anyway, this should probably be:
>
> is_64bit = kernel_ip(to) || user_64bit_mode(regs) || !user_mode(regs);

Correct, PEBS doesn't have the segment registers and we get to guess :/
Look at the various pebs_record_* structures in
arch/x86/events/intel/ds.c.

That said, in fixup_ip() we're guaranteed to be in the same process, so
unless the task does really funny things like switch mode between
triggering the assist and getting the PMI, we ought to be able to trust
regs.

2020-07-29 18:14:48

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: Re: [PATCH 2/6] arch: x86: Wrap TIF_IA32 checks

Andy Lutomirski <[email protected]> writes:

> On Tue, Jul 28, 2020 at 9:46 PM Gabriel Krisman Bertazi
> <[email protected]> wrote:
>>
>> Andy Lutomirski <[email protected]> writes:
>>
>> > On Tue, Jul 28, 2020 at 1:22 PM Gabriel Krisman Bertazi
>> > <[email protected]> wrote:
>> >>
>> >> In preparation to remove TIF_IA32, add wrapper that check the process
>> >> has IA32 ABI without using the flag directly.
>> >
>> > Thank you for doing this, but let's please do it right. There is,
>> > fundamentally, no such thing as a "process with IA32 ABI".
>>
>> Hi Andy,
>>
>> Thanks a lot for your review.
>>
>> As you can see, I'm learning my way here. Can you clarify "there is no
>> such a thing as a 'process with IA32 ABI'"? I'm not sure if I confused
>> the terminology or if (more worrisome for me) I got the concepts wrong.
>>
>> My understanding is that TIF_IA32 marks a thread that is running under
>> the 32-bit compat mode, which would be running a 32-bit process (as in
>> compiled with -m32, for instance), while TIF_X32 marks a process running
>> under the X32 ABI. Each process would have only one of these
>> "personalities". This is what I meant by a process with IA32 ABI (which
>> is wrong in any case). Is there more to it, or is the problem the
>> terminology I used?
>
> There's more to it.

Thanks again for the explanation!

>> I don't have any comments on the other things you mentioned, except that
>> I need to go through them and better understand your suggestions. Would
>> you prefer me to rework this patch series with what you suggested or is
>> this something you want to take over and do yourself? Both ways are
>> fine by me.
>
> Please rework it :) I have seriously limited bandwidth right now.

Will do.

--
Gabriel Krisman Bertazi