2017-03-31 11:15:26

by Dmitry Safonov

Subject: [PATCHv5] x86/mm: make in_compat_syscall() work during exec

After my changes to mmap(), its code relies on the bitness of the
syscall being performed. Based on that, it chooses the allocation base:
mmap_base for a 64-bit mmap() and mmap_compat_base for a 32-bit syscall.
This was introduced by:
commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
32-bit mmap()").

Since then the code relies on in_compat_syscall() returning true for
32-bit syscalls. That usually holds while we're in the context of an
application that does 32-bit syscalls. But during exec() it does not hold
for an x32 ELF: the new application hasn't done any syscall yet, so the
x32 bit has not been set.

But do_execve() calls load_elf_binary(), which adds mappings with
elf_map(). That results in -ENOMEM for x32 ELF binaries, because
in_compat_syscall() says we're in a 64-bit syscall and so mmap_base
is used instead of mmap_compat_base.
For i386 ELFs it works because SET_PERSONALITY() sets the TS_COMPAT flag.
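
To make the failure mode concrete, here is a small userspace model of the
check and of the base selection described above. It is only a sketch, not
the kernel code: struct task_model and the model_*() helpers are
hypothetical names, and the constant values are what I believe the x86
headers define, so treat them as illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values, believed to match the x86 kernel headers. */
#define TS_COMPAT        0x0002u      /* 32-bit syscall active */
#define X32_SYSCALL_BIT  0x40000000u  /* marks an x32 syscall number */

/* Hypothetical stand-in for the few task fields that matter here. */
struct task_model {
	uint32_t thread_status;    /* models current->thread.status */
	uint64_t orig_ax;          /* models task_pt_regs(current)->orig_ax */
	uint64_t mmap_base;        /* models mm->mmap_base */
	uint64_t mmap_compat_base; /* models mm->mmap_compat_base */
};

/* Compat if either the ia32 flag or the x32 syscall bit is present. */
static bool model_in_compat_syscall(const struct task_model *t)
{
	assert(t != NULL);
	return (t->thread_status & TS_COMPAT) ||
	       (t->orig_ax & X32_SYSCALL_BIT);
}

/* The allocation base follows the bitness of the current syscall. */
static uint64_t model_pick_mmap_base(const struct task_model *t)
{
	return model_in_compat_syscall(t) ? t->mmap_compat_base
					  : t->mmap_base;
}
```

With thread_status == 0 and orig_ax holding a plain 64-bit execve number
(the state of an x32 binary exec'ed from a 64-bit task before this patch),
model_pick_mmap_base() returns mmap_base, which is the wrong base for a
task limited to 32-bit addresses.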

As suggested by HPA and with a diff by Thomas, make SET_PERSONALITY()
set the original syscall number to the execve() number of the matching
ABI, so that it looks as if we came from a syscall of the same bitness
as the binary being loaded.

Fixes: commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
32-bit mmap()")
Cc: [email protected]
Cc: [email protected]
Cc: Andrei Vagin <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Reported-by: Adam Borowski <[email protected]>
Suggested-by: H. Peter Anvin <[email protected]>
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
v5: Use the generated unistd includes rather than defining syscall
numbers by hand.
v4: Pretend that we've come from the appropriate system call rather than
only setting/dropping the x32 bit.

arch/x86/kernel/process_64.c | 70 +++++++++++++++++++++++++++++++-------------
1 file changed, 49 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ea1a6180bf39..57a827f3ed5b 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -53,6 +53,13 @@
#include <asm/xen/hypervisor.h>
#include <asm/vdso.h>
#include <asm/intel_rdt.h>
+#include <asm/unistd_64.h>
+#ifdef CONFIG_X86_X32
+#include <asm/unistd_64_x32.h>
+#endif
+#ifdef CONFIG_IA32_EMULATION
+#include <asm/unistd_32_ia32.h>
+#endif

__visible DEFINE_PER_CPU(unsigned long, rsp_scratch);

@@ -494,6 +501,8 @@ void set_personality_64bit(void)
clear_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ADDR32);
clear_thread_flag(TIF_X32);
+ /* Pretend that this comes from a 64bit execve */
+ task_pt_regs(current)->orig_ax = __NR_execve;

/* Ensure the corresponding mm is not marked. */
if (current->mm)
@@ -506,32 +515,51 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}

-void set_personality_ia32(bool x32)
+static void __set_personality_x32(void)
{
- /* inherit personality from parent */
+#ifdef CONFIG_X86_X32
+ clear_thread_flag(TIF_IA32);
+ set_thread_flag(TIF_X32);
+ if (current->mm)
+ current->mm->context.ia32_compat = TIF_X32;
+ current->personality &= ~READ_IMPLIES_EXEC;
+ /*
+ * in_compat_syscall() uses the presence of the x32
+ * syscall bit flag to determine compat status.
+ * The x86 mmap() code relies on the syscall bitness
+ * so set x32 syscall bit right here to make
+ * in_compat_syscall() work during exec().
+ *
+ * Pretend to come from a x32 execve.
+ */
+ task_pt_regs(current)->orig_ax = __NR_x32_execve | __X32_SYSCALL_BIT;
+ current->thread.status &= ~TS_COMPAT;
+#endif
+}

+static void __set_personality_ia32(void)
+{
+#ifdef CONFIG_IA32_EMULATION
+ set_thread_flag(TIF_IA32);
+ clear_thread_flag(TIF_X32);
+ if (current->mm)
+ current->mm->context.ia32_compat = TIF_IA32;
+ current->personality |= force_personality32;
+ /* Prepare the first "return" to user space */
+ task_pt_regs(current)->orig_ax = __NR_ia32_execve;
+ current->thread.status |= TS_COMPAT;
+#endif
+}
+
+void set_personality_ia32(bool x32)
+{
/* Make sure to be in 32bit mode */
set_thread_flag(TIF_ADDR32);

- /* Mark the associated mm as containing 32-bit tasks. */
- if (x32) {
- clear_thread_flag(TIF_IA32);
- set_thread_flag(TIF_X32);
- if (current->mm)
- current->mm->context.ia32_compat = TIF_X32;
- current->personality &= ~READ_IMPLIES_EXEC;
- /* in_compat_syscall() uses the presence of the x32
- syscall bit flag to determine compat status */
- current->thread.status &= ~TS_COMPAT;
- } else {
- set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_X32);
- if (current->mm)
- current->mm->context.ia32_compat = TIF_IA32;
- current->personality |= force_personality32;
- /* Prepare the first "return" to user space */
- current->thread.status |= TS_COMPAT;
- }
+ if (x32)
+ __set_personality_x32();
+ else
+ __set_personality_ia32();
}
EXPORT_SYMBOL_GPL(set_personality_ia32);

--
2.12.0


2017-03-31 14:51:34

by Thomas Gleixner

Subject: Re: [PATCHv5] x86/mm: make in_compat_syscall() work during exec

On Fri, 31 Mar 2017, Dmitry Safonov wrote:
> #include <asm/intel_rdt.h>
> +#include <asm/unistd_64.h>
> +#ifdef CONFIG_X86_X32
> +#include <asm/unistd_64_x32.h>
> +#endif

Bah. asm/unistd.h includes both 64bit and x32 headers.

Subject: [tip:x86/mm] x86/mm: Make in_compat_syscall() work during exec

Commit-ID: ada26481dfe698ac64b4aaf19a726e66eb8508c6
Gitweb: http://git.kernel.org/tip/ada26481dfe698ac64b4aaf19a726e66eb8508c6
Author: Dmitry Safonov <[email protected]>
AuthorDate: Fri, 31 Mar 2017 14:11:37 +0300
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 31 Mar 2017 16:53:02 +0200

x86/mm: Make in_compat_syscall() work during exec

The x86 mmap() code selects the mmap base for an allocation depending on
the bitness of the syscall. For 64bit syscalls it selects mm->mmap_base and
for 32bit mm->mmap_compat_base.

On execve the registers of the task invoking exec() are copied to the child
pt_regs. So child->pt_regs->orig_ax contains the execve syscall number of the
parent.

exec() calls mmap() which in turn uses in_compat_syscall() to check whether
the mapping is for a 32bit or a 64bit task. The decision is made on the
following criteria:

  ia32    child->thread.status & TS_COMPAT
  x32     child->pt_regs.orig_ax & __X32_SYSCALL_BIT
  ia64    !ia32 && !x32

child->thread.status is correctly set up in set_personality_*(), but the
syscall number in child->pt_regs.orig_ax is left unmodified.

Therefore the parent/child combinations work or fail in the following way:

  Parent  Child  child->thread.status  child->pt_regs.orig_ax  in_compat()  Works
  ia64    ia64   TS_COMPAT == 0        __X32_SYSCALL_BIT == 0  false        Y
  ia64    ia32   TS_COMPAT == 1        __X32_SYSCALL_BIT == 0  true         Y
  ia64    x32    TS_COMPAT == 0        __X32_SYSCALL_BIT == 0  false        N
  ia32    ia64   TS_COMPAT == 0        __X32_SYSCALL_BIT == 0  false        Y
  ia32    ia32   TS_COMPAT == 1        __X32_SYSCALL_BIT == 0  true         Y
  ia32    x32    TS_COMPAT == 0        __X32_SYSCALL_BIT == 0  false        N
  x32     ia64   TS_COMPAT == 0        __X32_SYSCALL_BIT == 1  true         N
  x32     ia32   TS_COMPAT == 1        __X32_SYSCALL_BIT == 1  true         Y
  x32     x32    TS_COMPAT == 0        __X32_SYSCALL_BIT == 1  true         Y

Make set_personality_*() store the syscall number, including
__X32_SYSCALL_BIT where applicable, which corresponds to the newly started
ELF executable in the child's pt_regs, i.e. pretend that the exec was
invoked from a task with the same executable format.

So both thread.status and pt_regs.orig_ax correspond to the new ELF format
and in_compat_syscall() returns the correct result.
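
The parent/child combinations can be replayed with a small userspace
model. This is a sketch under stated assumptions, not the kernel
implementation: struct regs_model, enum elf_fmt and all functions below
are hypothetical names, and the execve numbers (59, 11, 520) are what I
believe the three x86 syscall tables use. The "patched" flag switches the
orig_ax rewrite described above on and off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values, believed to match the x86 kernel headers. */
#define TS_COMPAT        0x0002u
#define X32_SYSCALL_BIT  0x40000000u

/* FMT_64 is what the changelog calls "ia64". */
enum elf_fmt { FMT_64, FMT_IA32, FMT_X32 };

/* Hypothetical stand-in for thread.status and pt_regs.orig_ax. */
struct regs_model {
	uint32_t status;
	uint64_t orig_ax;
};

/* execve syscall number as each ABI would place it in orig_ax. */
static uint64_t execve_nr(enum elf_fmt f)
{
	switch (f) {
	case FMT_IA32: return 11;
	case FMT_X32:  return 520 | X32_SYSCALL_BIT;
	default:       return 59;
	}
}

/* State left behind by the task that invoked execve. */
static struct regs_model enter_execve(enum elf_fmt parent)
{
	struct regs_model r = { 0, execve_nr(parent) };

	if (parent == FMT_IA32)
		r.status |= TS_COMPAT;
	return r;
}

/* Models set_personality_*(): status was always fixed up, orig_ax
 * only when the change from this commit is applied. */
static void model_set_personality(struct regs_model *r, enum elf_fmt child,
				  bool patched)
{
	assert(r != NULL);
	if (child == FMT_IA32)
		r->status |= TS_COMPAT;
	else
		r->status &= ~TS_COMPAT;
	if (patched)
		r->orig_ax = execve_nr(child);
}

static bool model_in_compat(const struct regs_model *r)
{
	return (r->status & TS_COMPAT) || (r->orig_ax & X32_SYSCALL_BIT);
}

/* Counts parent/child pairs for which the check gives the wrong answer. */
static int count_wrong_answers(bool patched)
{
	int wrong = 0;

	for (int p = FMT_64; p <= FMT_X32; p++) {
		for (int c = FMT_64; c <= FMT_X32; c++) {
			struct regs_model r = enter_execve((enum elf_fmt)p);

			model_set_personality(&r, (enum elf_fmt)c, patched);
			if (model_in_compat(&r) != (c != FMT_64))
				wrong++;
		}
	}
	return wrong;
}
```

In this model count_wrong_answers(false) finds the three failing rows of
the table (ia64 to x32, ia32 to x32, x32 to ia64), and
count_wrong_answers(true) finds none.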

[ tglx: Rewrote changelog ]

Fixes: commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Reported-by: Adam Borowski <[email protected]>
Suggested-by: H. Peter Anvin <[email protected]>
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Andrei Vagin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/kernel/process_64.c | 67 ++++++++++++++++++++++++++++++--------------
1 file changed, 46 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ea1a618..825a1e4 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -53,6 +53,11 @@
#include <asm/xen/hypervisor.h>
#include <asm/vdso.h>
#include <asm/intel_rdt.h>
+#include <asm/unistd.h>
+#ifdef CONFIG_IA32_EMULATION
+/* Not included via unistd.h */
+#include <asm/unistd_32_ia32.h>
+#endif

__visible DEFINE_PER_CPU(unsigned long, rsp_scratch);

@@ -494,6 +499,8 @@ void set_personality_64bit(void)
clear_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ADDR32);
clear_thread_flag(TIF_X32);
+ /* Pretend that this comes from a 64bit execve */
+ task_pt_regs(current)->orig_ax = __NR_execve;

/* Ensure the corresponding mm is not marked. */
if (current->mm)
@@ -506,32 +513,50 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}

-void set_personality_ia32(bool x32)
+static void __set_personality_x32(void)
{
- /* inherit personality from parent */
+#ifdef CONFIG_X86_X32
+ clear_thread_flag(TIF_IA32);
+ set_thread_flag(TIF_X32);
+ if (current->mm)
+ current->mm->context.ia32_compat = TIF_X32;
+ current->personality &= ~READ_IMPLIES_EXEC;
+ /*
+ * in_compat_syscall() uses the presence of the x32 syscall bit
+ * flag to determine compat status. The x86 mmap() code relies on
+ * the syscall bitness so set x32 syscall bit right here to make
+ * in_compat_syscall() work during exec().
+ *
+ * Pretend to come from a x32 execve.
+ */
+ task_pt_regs(current)->orig_ax = __NR_x32_execve | __X32_SYSCALL_BIT;
+ current->thread.status &= ~TS_COMPAT;
+#endif
+}

+static void __set_personality_ia32(void)
+{
+#ifdef CONFIG_IA32_EMULATION
+ set_thread_flag(TIF_IA32);
+ clear_thread_flag(TIF_X32);
+ if (current->mm)
+ current->mm->context.ia32_compat = TIF_IA32;
+ current->personality |= force_personality32;
+ /* Prepare the first "return" to user space */
+ task_pt_regs(current)->orig_ax = __NR_ia32_execve;
+ current->thread.status |= TS_COMPAT;
+#endif
+}
+
+void set_personality_ia32(bool x32)
+{
/* Make sure to be in 32bit mode */
set_thread_flag(TIF_ADDR32);

- /* Mark the associated mm as containing 32-bit tasks. */
- if (x32) {
- clear_thread_flag(TIF_IA32);
- set_thread_flag(TIF_X32);
- if (current->mm)
- current->mm->context.ia32_compat = TIF_X32;
- current->personality &= ~READ_IMPLIES_EXEC;
- /* in_compat_syscall() uses the presence of the x32
- syscall bit flag to determine compat status */
- current->thread.status &= ~TS_COMPAT;
- } else {
- set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_X32);
- if (current->mm)
- current->mm->context.ia32_compat = TIF_IA32;
- current->personality |= force_personality32;
- /* Prepare the first "return" to user space */
- current->thread.status |= TS_COMPAT;
- }
+ if (x32)
+ __set_personality_x32();
+ else
+ __set_personality_ia32();
}
EXPORT_SYMBOL_GPL(set_personality_ia32);