2012-02-20 00:08:56

by H. Peter Anvin

Subject: [PATCH 00/30] RFC: x32 support

This is an initial RFC patchset for x32 support. This is largely
complete and should be able to boot an x32 userspace.

The patch authors are assigned to myself and H. J. in a somewhat
haphazard fashion; the kernel side of x32 was very much a
collaborative effort, although H. J. ended up doing most of the later
coding due to my being occupied with kernel.org in late 2011.

This patchset applies on top of tip:core/types which in turn is based
on v3.3-rc3; there may be some trivially resolved conflicts with -rc4.
I am planning to push this out as tip:x86/x32 after the RFC.

Controversial issues in this patchset:

- There are a lot of ABIs that expose "long" or "unsigned long"; the
  traditional way to deal with that is compat, but we don't want to use
  that in most cases for x32. As a result, I have introduced
  __kernel_[u]long_t as an additional type in posix_types.h.

  This type largely duplicates __kernel_[s]size_t, but the latter is
  defined as [unsigned] int on a lot of platforms, so changing types
  over to __kernel_[s]size_t would have changed the type, if not the
  size, on a lot of platforms. I personally have no strong opinion on
  which way is preferable.

- The use of a flag to signal x32 rather than a completely new set
  of system call numbers. This was discussed back in August.

Incompleteness:

- There are a number of (especially) ioctl paths which don't yet handle
  a 64-bit time_t in the compat path. Handling it is necessary because of
  Linus' (quite correct) diktat that new ABIs should use a 64-bit
  time_t, but this also means this is the first 32-bit ABI with a 64-bit
  time_t, and some problems are expected to crop up.

- The header-exported ABI is not correct everywhere yet.


The total diffstat is attached.

arch/x86/Kconfig | 21 +-
arch/x86/ia32/ia32_signal.c | 12 +-
arch/x86/ia32/sys_ia32.c | 40 ---
arch/x86/include/asm/Kbuild | 2 +
arch/x86/include/asm/compat.h | 39 ++-
arch/x86/include/asm/elf.h | 29 ++-
arch/x86/include/asm/ia32.h | 9 +
arch/x86/include/asm/posix_types.h | 4 +-
arch/x86/include/asm/posix_types_x32.h | 19 +
arch/x86/include/asm/processor.h | 10 +-
arch/x86/include/asm/ptrace.h | 1 -
arch/x86/include/asm/sigcontext.h | 57 ++--
arch/x86/include/asm/sigframe.h | 13 +
arch/x86/include/asm/sighandling.h | 24 ++
arch/x86/include/asm/sys_ia32.h | 7 +-
arch/x86/include/asm/syscall.h | 5 +-
arch/x86/include/asm/thread_info.h | 6 +-
arch/x86/include/asm/unistd.h | 15 +-
arch/x86/kernel/asm-offsets_64.c | 6 +
arch/x86/kernel/cpu/perf_event.c | 4 +-
arch/x86/kernel/entry_64.S | 44 +++
arch/x86/kernel/process_64.c | 25 +-
arch/x86/kernel/signal.c | 138 +++++++-
arch/x86/kernel/sys_x86_64.c | 6 +-
arch/x86/kernel/syscall_64.c | 8 +
arch/x86/oprofile/backtrace.c | 2 +-
arch/x86/syscalls/Makefile | 19 +-
arch/x86/syscalls/syscall_32.tbl | 2 +-
arch/x86/syscalls/syscall_64.tbl | 579 +++++++++++++++++---------------
arch/x86/um/sys_call_table_64.c | 3 +
arch/x86/um/user-offsets.c | 2 +
arch/x86/vdso/.gitignore | 2 +
arch/x86/vdso/Makefile | 46 +++-
arch/x86/vdso/vdso32-setup.c | 6 +
arch/x86/vdso/vdsox32.S | 22 ++
arch/x86/vdso/vdsox32.lds.S | 32 ++
arch/x86/vdso/vma.c | 78 ++++-
drivers/char/lp.c | 20 +-
drivers/input/input-compat.c | 4 +-
drivers/input/input-compat.h | 2 +-
fs/binfmt_elf.c | 24 ++-
fs/compat.c | 26 ++-
include/asm-generic/posix_types.h | 23 +-
include/linux/Kbuild | 1 +
include/linux/aio_abi.h | 2 +-
include/linux/compat.h | 4 +
include/linux/kernel.h | 21 +-
include/linux/sysinfo.h | 24 ++
net/bluetooth/hci_sock.c | 3 +-
49 files changed, 1027 insertions(+), 464 deletions(-)

It is probably worth noting that this work directly spawned the
x86/syscall and core/types cleanups:

33 files changed, 1075 insertions(+), 1978 deletions(-)
22 files changed, 190 insertions(+), 1588 deletions(-)


2012-02-20 00:09:21

by H. Peter Anvin

Subject: [PATCH 01/30] x86: Factor out TIF_IA32 from 32-bit address space

From: "H. Peter Anvin" <[email protected]>

Factor out IA32 (compatibility instruction set) from 32-bit address
space in the thread_info flags; this is a precondition patch for x32
support.

Originally-by: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
---
arch/x86/include/asm/elf.h | 4 ++--
arch/x86/include/asm/processor.h | 4 ++--
arch/x86/include/asm/thread_info.h | 4 +++-
arch/x86/kernel/process_64.c | 2 ++
arch/x86/kernel/sys_x86_64.c | 6 +++---
arch/x86/oprofile/backtrace.c | 2 +-
6 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 5f962df..410fa6a 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -287,7 +287,7 @@ do { \
#define VDSO_HIGH_BASE 0xffffe000U /* CONFIG_COMPAT_VDSO address */

/* 1GB for 64bit, 8MB for 32bit */
-#define STACK_RND_MASK (test_thread_flag(TIF_IA32) ? 0x7ff : 0x3fffff)
+#define STACK_RND_MASK (test_thread_flag(TIF_ADDR32) ? 0x7ff : 0x3fffff)

#define ARCH_DLINFO \
do { \
@@ -330,7 +330,7 @@ static inline int mmap_is_ia32(void)
return 1;
#endif
#ifdef CONFIG_IA32_EMULATION
- if (test_thread_flag(TIF_IA32))
+ if (test_thread_flag(TIF_ADDR32))
return 1;
#endif
return 0;
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index aa9088c..9f748b5 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -924,9 +924,9 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);
#define IA32_PAGE_OFFSET ((current->personality & ADDR_LIMIT_3GB) ? \
0xc0000000 : 0xFFFFe000)

-#define TASK_SIZE (test_thread_flag(TIF_IA32) ? \
+#define TASK_SIZE (test_thread_flag(TIF_ADDR32) ? \
IA32_PAGE_OFFSET : TASK_SIZE_MAX)
-#define TASK_SIZE_OF(child) ((test_tsk_thread_flag(child, TIF_IA32)) ? \
+#define TASK_SIZE_OF(child) ((test_tsk_thread_flag(child, TIF_ADDR32)) ? \
IA32_PAGE_OFFSET : TASK_SIZE_MAX)

#define STACK_TOP TASK_SIZE
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index bc817cd..d1803a4 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -86,7 +86,7 @@ struct thread_info {
#define TIF_MCE_NOTIFY 10 /* notify userspace of an MCE */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
-#define TIF_IA32 17 /* 32bit process */
+#define TIF_IA32 17 /* IA32 compatibility process */
#define TIF_FORK 18 /* ret_from_fork */
#define TIF_MEMDIE 20 /* is terminating due to OOM killer */
#define TIF_DEBUG 21 /* uses debug registers */
@@ -95,6 +95,7 @@ struct thread_info {
#define TIF_BLOCKSTEP 25 /* set when we want DEBUGCTLMSR_BTF */
#define TIF_LAZY_MMU_UPDATES 27 /* task is updating the mmu lazily */
#define TIF_SYSCALL_TRACEPOINT 28 /* syscall tracepoint instrumentation */
+#define TIF_ADDR32 29 /* 32-bit address space on 64 bits */

#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
@@ -116,6 +117,7 @@ struct thread_info {
#define _TIF_BLOCKSTEP (1 << TIF_BLOCKSTEP)
#define _TIF_LAZY_MMU_UPDATES (1 << TIF_LAZY_MMU_UPDATES)
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
+#define _TIF_ADDR32 (1 << TIF_ADDR32)

/* work to do in syscall_trace_enter() */
#define _TIF_WORK_SYSCALL_ENTRY \
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9b9fe4a..0e900d0 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -508,6 +508,7 @@ void set_personality_64bit(void)

/* Make sure to be in 64bit mode */
clear_thread_flag(TIF_IA32);
+ clear_thread_flag(TIF_ADDR32);

/* Ensure the corresponding mm is not marked. */
if (current->mm)
@@ -526,6 +527,7 @@ void set_personality_ia32(void)

/* Make sure to be in 32bit mode */
set_thread_flag(TIF_IA32);
+ set_thread_flag(TIF_ADDR32);
current->personality |= force_personality32;

/* Mark the associated mm as containing 32-bit tasks. */
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 0514890..f921df8 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -98,7 +98,7 @@ out:
static void find_start_end(unsigned long flags, unsigned long *begin,
unsigned long *end)
{
- if (!test_thread_flag(TIF_IA32) && (flags & MAP_32BIT)) {
+ if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT)) {
unsigned long new_begin;
/* This is usually used needed to map code in small
model, so it needs to be in the first 31bit. Limit
@@ -144,7 +144,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
(!vma || addr + len <= vma->vm_start))
return addr;
}
- if (((flags & MAP_32BIT) || test_thread_flag(TIF_IA32))
+ if (((flags & MAP_32BIT) || test_thread_flag(TIF_ADDR32))
&& len <= mm->cached_hole_size) {
mm->cached_hole_size = 0;
mm->free_area_cache = begin;
@@ -205,7 +205,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
return addr;

/* for MAP_32BIT mappings we force the legact mmap base */
- if (!test_thread_flag(TIF_IA32) && (flags & MAP_32BIT))
+ if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT))
goto bottomup;

/* requesting a specific address */
diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
index bff89df..d6aa6e8 100644
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -67,7 +67,7 @@ x86_backtrace_32(struct pt_regs * const regs, unsigned int depth)
{
struct stack_frame_ia32 *head;

- /* User process is 32-bit */
+ /* User process is IA32 */
if (!current || !test_thread_flag(TIF_IA32))
return 0;

--
1.7.6.5
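
An illustrative aside, not part of the patch: the flag split above can be modeled in plain userspace C. The two bit numbers come from the thread_info.h hunk; everything prefixed model_/MODEL_ is a hypothetical name. MODEL_TASK_SIZE_MAX assumes a 47-bit user address space with 4 KiB pages, MODEL_IA32_PAGE_OFFSET assumes ADDR_LIMIT_3GB is not set, and the "address space only" setter merely guesses at how an x32-style personality might use the new flag.

```c
#include <assert.h>
#include <stdint.h>

/* Bit numbers as defined in the thread_info.h hunk of this patch. */
#define TIF_IA32   17	/* IA32 compatibility process */
#define TIF_ADDR32 29	/* 32-bit address space on 64 bits */

/* Assumptions for illustration only (see the lead-in). */
#define MODEL_IA32_PAGE_OFFSET 0xFFFFe000ULL
#define MODEL_TASK_SIZE_MAX    ((1ULL << 47) - 4096)

struct model_thread {
	uint32_t flags;
};

static int model_test_flag(const struct model_thread *t, int bit)
{
	return (t->flags >> bit) & 1u;
}

static void model_set_flag(struct model_thread *t, int bit)
{
	t->flags |= (uint32_t)1 << bit;
}

/* Mirrors set_personality_ia32(): both flags are set together. */
static void model_set_personality_ia32(struct model_thread *t)
{
	model_set_flag(t, TIF_IA32);
	model_set_flag(t, TIF_ADDR32);
}

/* Hypothetical: 32-bit address space without the IA32 instruction set. */
static void model_set_addr32_only(struct model_thread *t)
{
	model_set_flag(t, TIF_ADDR32);
}

/* Mirrors the new TASK_SIZE macro: only the address-space flag matters. */
static uint64_t model_task_size(const struct model_thread *t)
{
	return model_test_flag(t, TIF_ADDR32) ? MODEL_IA32_PAGE_OFFSET
					      : MODEL_TASK_SIZE_MAX;
}
```

In this model an IA32 task and an address-space-only task get the same task size, while only the former reports TIF_IA32; that is the distinction the oprofile hunk keeps by still testing TIF_IA32.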

2012-02-20 00:09:45

by H. Peter Anvin

Subject: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

From: "H. Peter Anvin" <[email protected]>

Use explicit sizes (__u64) instead of implicit sizes (unsigned long)
in the definition for sigcontext.h; this will allow this structure to
be shared between the x86-64 native ABI and the x32 ABI.

Originally-by: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
---
arch/x86/include/asm/sigcontext.h | 57 +++++++++++++++++++-----------------
1 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index 04459d2..4a08538 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -230,34 +230,37 @@ struct sigcontext {
* User-space might still rely on the old definition:
*/
struct sigcontext {
- unsigned long r8;
- unsigned long r9;
- unsigned long r10;
- unsigned long r11;
- unsigned long r12;
- unsigned long r13;
- unsigned long r14;
- unsigned long r15;
- unsigned long rdi;
- unsigned long rsi;
- unsigned long rbp;
- unsigned long rbx;
- unsigned long rdx;
- unsigned long rax;
- unsigned long rcx;
- unsigned long rsp;
- unsigned long rip;
- unsigned long eflags; /* RFLAGS */
- unsigned short cs;
- unsigned short gs;
- unsigned short fs;
- unsigned short __pad0;
- unsigned long err;
- unsigned long trapno;
- unsigned long oldmask;
- unsigned long cr2;
+ __u64 r8;
+ __u64 r9;
+ __u64 r10;
+ __u64 r11;
+ __u64 r12;
+ __u64 r13;
+ __u64 r14;
+ __u64 r15;
+ __u64 rdi;
+ __u64 rsi;
+ __u64 rbp;
+ __u64 rbx;
+ __u64 rdx;
+ __u64 rax;
+ __u64 rcx;
+ __u64 rsp;
+ __u64 rip;
+ __u64 eflags; /* RFLAGS */
+ __u16 cs;
+ __u16 gs;
+ __u16 fs;
+ __u16 __pad0;
+ __u64 err;
+ __u64 trapno;
+ __u64 oldmask;
+ __u64 cr2;
struct _fpstate __user *fpstate; /* zero when no FPU context */
- unsigned long reserved1[8];
+#ifndef __LP64__
+ __u32 __fpstate_pad;
+#endif
+ __u64 reserved1[8];
};
#endif /* !__KERNEL__ */

--
1.7.6.5
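
An illustrative aside, not part of the patch: the reason for the __fpstate_pad member can be checked with a small layout model. The struct names below are hypothetical, and the user pointer is modeled as a fixed-width integer (8 bytes for LP64, 4 bytes for an ILP32 ABI such as x32) so that the sketch compiles the same way on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields before fpstate are identical in both ABIs after this patch. */
struct model_sigcontext_head {
	uint64_t gprs_rip_eflags[18];	/* r8 ... eflags */
	uint16_t cs, gs, fs, pad0;
	uint64_t err, trapno, oldmask, cr2;
};

/* LP64 view: the fpstate pointer is 8 bytes wide. */
struct model_sigcontext_lp64 {
	struct model_sigcontext_head head;
	uint64_t fpstate;
	uint64_t reserved1[8];
};

/* ILP32 view: 4-byte pointer plus the explicit 4-byte pad. */
struct model_sigcontext_ilp32 {
	struct model_sigcontext_head head;
	uint32_t fpstate;
	uint32_t fpstate_pad;
	uint64_t reserved1[8];
};

/* Returns 1 when both views place reserved1 identically and have equal size. */
static int model_layouts_match(void)
{
	return offsetof(struct model_sigcontext_lp64, reserved1) ==
	       offsetof(struct model_sigcontext_ilp32, reserved1) &&
	       sizeof(struct model_sigcontext_lp64) ==
	       sizeof(struct model_sigcontext_ilp32);
}
```

With the pad in place, both views put reserved1 at the same offset, which is what lets one structure definition serve both ABIs.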

2012-02-20 00:10:12

by H. Peter Anvin

Subject: [PATCH 03/30] sysinfo: Move struct sysinfo to a separate header file

From: "H. Peter Anvin" <[email protected]>

struct sysinfo is just about the only thing exported to userspace from
<linux/kernel.h>, so move it into a separate header file with a
residual #include in <linux/kernel.h>.

Originally-by: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
---
include/linux/Kbuild | 1 +
include/linux/kernel.h | 21 ++-------------------
include/linux/sysinfo.h | 22 ++++++++++++++++++++++
3 files changed, 25 insertions(+), 19 deletions(-)
create mode 100644 include/linux/sysinfo.h

diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index c94e717..8446086 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -355,6 +355,7 @@ header-y += suspend_ioctls.h
header-y += swab.h
header-y += synclink.h
header-y += sysctl.h
+header-y += sysinfo.h
header-y += taskstats.h
header-y += tcp.h
header-y += telephony.h
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..dc6a50f 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -1,6 +1,8 @@
#ifndef _LINUX_KERNEL_H
#define _LINUX_KERNEL_H

+#include <linux/sysinfo.h>
+
/*
* 'kernel.h' contains some often-used function prototypes etc
*/
@@ -745,27 +747,8 @@ extern int __build_bug_on_failed;
# define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD
#endif

-struct sysinfo;
extern int do_sysinfo(struct sysinfo *info);

#endif /* __KERNEL__ */

-#define SI_LOAD_SHIFT 16
-struct sysinfo {
- long uptime; /* Seconds since boot */
- unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
- unsigned long totalram; /* Total usable main memory size */
- unsigned long freeram; /* Available memory size */
- unsigned long sharedram; /* Amount of shared memory */
- unsigned long bufferram; /* Memory used by buffers */
- unsigned long totalswap; /* Total swap space size */
- unsigned long freeswap; /* swap space still available */
- unsigned short procs; /* Number of current processes */
- unsigned short pad; /* explicit padding for m68k */
- unsigned long totalhigh; /* Total high memory size */
- unsigned long freehigh; /* Available high memory size */
- unsigned int mem_unit; /* Memory unit size in bytes */
- char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */
-};
-
#endif
diff --git a/include/linux/sysinfo.h b/include/linux/sysinfo.h
new file mode 100644
index 0000000..ec4fc22
--- /dev/null
+++ b/include/linux/sysinfo.h
@@ -0,0 +1,22 @@
+#ifndef _LINUX_SYSINFO_H
+#define _LINUX_SYSINFO_H
+
+#define SI_LOAD_SHIFT 16
+struct sysinfo {
+ long uptime; /* Seconds since boot */
+ unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
+ unsigned long totalram; /* Total usable main memory size */
+ unsigned long freeram; /* Available memory size */
+ unsigned long sharedram; /* Amount of shared memory */
+ unsigned long bufferram; /* Memory used by buffers */
+ unsigned long totalswap; /* Total swap space size */
+ unsigned long freeswap; /* swap space still available */
+ unsigned short procs; /* Number of current processes */
+ unsigned short pad; /* explicit padding for m68k */
+ unsigned long totalhigh; /* Total high memory size */
+ unsigned long freehigh; /* Available high memory size */
+ unsigned int mem_unit; /* Memory unit size in bytes */
+ char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */
+};
+
+#endif /* _LINUX_SYSINFO_H */
--
1.7.6.5

2012-02-20 00:10:36

by H. Peter Anvin

Subject: [PATCH 04/30] posix_types: Introduce __kernel_[u]long_t

From: "H. Peter Anvin" <[email protected]>

Introduce __kernel_[u]long_t, which allows an ABI to override all
defaults of type [unsigned] long.

This enables x32 and potentially other 32-bit userspace on 64-bit
kernel ABIs.

Signed-off-by: H. Peter Anvin <[email protected]>
---
include/asm-generic/posix_types.h | 23 ++++++++++++++---------
1 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/asm-generic/posix_types.h b/include/asm-generic/posix_types.h
index e294fe6..91d44bd 100644
--- a/include/asm-generic/posix_types.h
+++ b/include/asm-generic/posix_types.h
@@ -10,8 +10,13 @@
* architectures, so that you can override them.
*/

+#ifndef __kernel_long_t
+typedef long __kernel_long_t;
+typedef unsigned long __kernel_ulong_t;
+#endif
+
#ifndef __kernel_ino_t
-typedef unsigned long __kernel_ino_t;
+typedef __kernel_ulong_t __kernel_ino_t;
#endif

#ifndef __kernel_mode_t
@@ -19,7 +24,7 @@ typedef unsigned int __kernel_mode_t;
#endif

#ifndef __kernel_nlink_t
-typedef unsigned long __kernel_nlink_t;
+typedef __kernel_ulong_t __kernel_nlink_t;
#endif

#ifndef __kernel_pid_t
@@ -36,7 +41,7 @@ typedef unsigned int __kernel_gid_t;
#endif

#ifndef __kernel_suseconds_t
-typedef long __kernel_suseconds_t;
+typedef __kernel_long_t __kernel_suseconds_t;
#endif

#ifndef __kernel_daddr_t
@@ -67,9 +72,9 @@ typedef unsigned int __kernel_size_t;
typedef int __kernel_ssize_t;
typedef int __kernel_ptrdiff_t;
#else
-typedef unsigned long __kernel_size_t;
-typedef long __kernel_ssize_t;
-typedef long __kernel_ptrdiff_t;
+typedef __kernel_ulong_t __kernel_size_t;
+typedef __kernel_long_t __kernel_ssize_t;
+typedef __kernel_long_t __kernel_ptrdiff_t;
#endif
#endif

@@ -82,10 +87,10 @@ typedef struct {
/*
* anything below here should be completely generic
*/
-typedef long __kernel_off_t;
+typedef __kernel_long_t __kernel_off_t;
typedef long long __kernel_loff_t;
-typedef long __kernel_time_t;
-typedef long __kernel_clock_t;
+typedef __kernel_long_t __kernel_time_t;
+typedef __kernel_long_t __kernel_clock_t;
typedef int __kernel_timer_t;
typedef int __kernel_clockid_t;
typedef char * __kernel_caddr_t;
--
1.7.6.5

2012-02-20 00:11:03

by H. Peter Anvin

Subject: [PATCH 05/30] x32: Create posix_types_x32.h

From: "H. Peter Anvin" <[email protected]>

This is the same as the 64-bit posix_types.h, except that
__kernel_[u]long_t is defined to be [unsigned] long long and therefore
64 bits.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/Kbuild | 1 +
arch/x86/include/asm/posix_types.h | 4 +++-
arch/x86/include/asm/posix_types_x32.h | 19 +++++++++++++++++++
3 files changed, 23 insertions(+), 1 deletions(-)
create mode 100644 arch/x86/include/asm/posix_types_x32.h

diff --git a/arch/x86/include/asm/Kbuild b/arch/x86/include/asm/Kbuild
index b57e6a4..986954f 100644
--- a/arch/x86/include/asm/Kbuild
+++ b/arch/x86/include/asm/Kbuild
@@ -14,6 +14,7 @@ header-y += msr.h
header-y += mtrr.h
header-y += posix_types_32.h
header-y += posix_types_64.h
+header-y += posix_types_x32.h
header-y += prctl.h
header-y += processor-flags.h
header-y += ptrace-abi.h
diff --git a/arch/x86/include/asm/posix_types.h b/arch/x86/include/asm/posix_types.h
index bb7133d..3427b77 100644
--- a/arch/x86/include/asm/posix_types.h
+++ b/arch/x86/include/asm/posix_types.h
@@ -7,7 +7,9 @@
#else
# ifdef __i386__
# include "posix_types_32.h"
-# else
+# elif defined(__LP64__)
# include "posix_types_64.h"
+# else
+# include "posix_types_x32.h"
# endif
#endif
diff --git a/arch/x86/include/asm/posix_types_x32.h b/arch/x86/include/asm/posix_types_x32.h
new file mode 100644
index 0000000..85f9bda
--- /dev/null
+++ b/arch/x86/include/asm/posix_types_x32.h
@@ -0,0 +1,19 @@
+#ifndef _ASM_X86_POSIX_TYPES_X32_H
+#define _ASM_X86_POSIX_TYPES_X32_H
+
+/*
+ * This file is only used by user-level software, so you need to
+ * be a little careful about namespace pollution etc. Also, we cannot
+ * assume GCC is being used.
+ *
+ * These types should generally match the ones used by the 64-bit kernel,
+ *
+ */
+
+typedef long long __kernel_long_t;
+typedef unsigned long long __kernel_ulong_t;
+#define __kernel_long_t __kernel_long_t
+
+#include <asm/posix_types_64.h>
+
+#endif /* _ASM_X86_POSIX_TYPES_X32_H */
--
1.7.6.5
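
An illustrative aside, not part of the patch: the override idiom used by patches 04 and 05 (a typedef plus a same-named #define in the ABI header, and an #ifndef guard in the generic header) can be tried in a single file. The model_ names are hypothetical stand-ins for the __kernel_ types, chosen to stay out of the reserved identifier namespace.

```c
#include <assert.h>

/* What an ABI-specific header would do (cf. posix_types_x32.h). */
typedef long long          model_kernel_long_t;
typedef unsigned long long model_kernel_ulong_t;
#define model_kernel_long_t model_kernel_long_t	/* marks the override */

/* What the generic header does (cf. asm-generic/posix_types.h). */
#ifndef model_kernel_long_t
typedef long          model_kernel_long_t;
typedef unsigned long model_kernel_ulong_t;
#endif

/* Derived types pick up the override automatically. */
typedef model_kernel_long_t  model_kernel_time_t;
typedef model_kernel_ulong_t model_kernel_ino_t;
```

The self-referential #define expands to itself, so it changes nothing in later declarations; its only job is to make the #ifndef test in the generic header fail so that the default typedefs are skipped.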

2012-02-20 00:11:42

by H. Peter Anvin

Subject: [PATCH 06/30] sysinfo: Use explicit types in <linux/sysinfo.h>

From: "H. Peter Anvin" <[email protected]>

Change <linux/sysinfo.h> to use explicitly sized types. Replace
long/unsigned long with __kernel_[u]long_t so that a non-legacy 32-bit
ABI running on a 64-bit kernel can export those as 64-bit types.

Originally-by: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
include/linux/sysinfo.h | 30 ++++++++++++++++--------------
1 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/include/linux/sysinfo.h b/include/linux/sysinfo.h
index ec4fc22..934335a 100644
--- a/include/linux/sysinfo.h
+++ b/include/linux/sysinfo.h
@@ -1,22 +1,24 @@
#ifndef _LINUX_SYSINFO_H
#define _LINUX_SYSINFO_H

+#include <linux/types.h>
+
#define SI_LOAD_SHIFT 16
struct sysinfo {
- long uptime; /* Seconds since boot */
- unsigned long loads[3]; /* 1, 5, and 15 minute load averages */
- unsigned long totalram; /* Total usable main memory size */
- unsigned long freeram; /* Available memory size */
- unsigned long sharedram; /* Amount of shared memory */
- unsigned long bufferram; /* Memory used by buffers */
- unsigned long totalswap; /* Total swap space size */
- unsigned long freeswap; /* swap space still available */
- unsigned short procs; /* Number of current processes */
- unsigned short pad; /* explicit padding for m68k */
- unsigned long totalhigh; /* Total high memory size */
- unsigned long freehigh; /* Available high memory size */
- unsigned int mem_unit; /* Memory unit size in bytes */
- char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */
+ __kernel_long_t uptime; /* Seconds since boot */
+ __kernel_ulong_t loads[3]; /* 1, 5, and 15 minute load averages */
+ __kernel_ulong_t totalram; /* Total usable main memory size */
+ __kernel_ulong_t freeram; /* Available memory size */
+ __kernel_ulong_t sharedram; /* Amount of shared memory */
+ __kernel_ulong_t bufferram; /* Memory used by buffers */
+ __kernel_ulong_t totalswap; /* Total swap space size */
+ __kernel_ulong_t freeswap; /* swap space still available */
+ __u16 procs; /* Number of current processes */
+ __u16 pad; /* Explicit padding for m68k */
+ __kernel_ulong_t totalhigh; /* Total high memory size */
+ __kernel_ulong_t freehigh; /* Available high memory size */
+ __u32 mem_unit; /* Memory unit size in bytes */
+ char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
};

#endif /* _LINUX_SYSINFO_H */
--
1.7.6.5
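
An illustrative aside, not part of the patch: the size of the trailing _f[] padding in struct sysinfo is plain arithmetic on the width of the "long" type, and a hypothetical helper makes the two interesting cases explicit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Size in bytes of the trailing _f[] padding, following the expression
 * 20 - 2*sizeof(long type) - sizeof(int type) from the structure.
 */
static size_t model_sysinfo_pad_bytes(size_t long_bytes, size_t int_bytes)
{
	return 20 - 2 * long_bytes - int_bytes;
}
```

With a 4-byte long the padding is 20 - 8 - 4 = 8 bytes; with an 8-byte type it is 20 - 16 - 4 = 0, which relies on the GNU C zero-length array extension. Because __kernel_ulong_t is 8 bytes for x32, the x32 view ends up with the same zero-length padding as native x86-64.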

2012-02-20 00:12:08

by H. Peter Anvin

Subject: [PATCH 07/30] compat: Introduce COMPAT_USE_64BIT_TIME

From: "H. J. Lu" <[email protected]>

Allow a compatibility ABI to use a 64-bit time_t and 64-bit members in
struct timeval and struct timespec to avoid the Y2038 problem.

This will be used for the x32 ABI.

Signed-off-by: H. Peter Anvin <[email protected]>
---
include/linux/compat.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 41c9f65..1be91c0 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -19,6 +19,10 @@
#include <asm/siginfo.h>
#include <asm/signal.h>

+#ifndef COMPAT_USE_64BIT_TIME
+#define COMPAT_USE_64BIT_TIME 0
+#endif
+
#define compat_jiffies_to_clock_t(x) \
(((unsigned long)(x) * COMPAT_USER_HZ) / HZ)

--
1.7.6.5

2012-02-20 00:12:41

by H. Peter Anvin

Subject: [PATCH 08/30] compat: Use COMPAT_USE_64BIT_TIME in the lp driver

From: "H. J. Lu" <[email protected]>

Enable the lp driver to be used with a compat ABI with 64-bit time.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
---
drivers/char/lp.c | 20 ++++++++++++++------
1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/char/lp.c b/drivers/char/lp.c
index f434856..a111ff2 100644
--- a/drivers/char/lp.c
+++ b/drivers/char/lp.c
@@ -706,18 +706,26 @@ static long lp_compat_ioctl(struct file *file, unsigned int cmd,
{
unsigned int minor;
struct timeval par_timeout;
- struct compat_timeval __user *tc;
int ret;

minor = iminor(file->f_path.dentry->d_inode);
mutex_lock(&lp_mutex);
switch (cmd) {
case LPSETTIMEOUT:
- tc = compat_ptr(arg);
- if (get_user(par_timeout.tv_sec, &tc->tv_sec) ||
- get_user(par_timeout.tv_usec, &tc->tv_usec)) {
- ret = -EFAULT;
- break;
+ if (COMPAT_USE_64BIT_TIME) {
+ if (copy_from_user(&par_timeout, (void __user *)arg,
+ sizeof (struct timeval))) {
+ ret = -EFAULT;
+ break;
+ }
+ } else {
+ struct compat_timeval __user *tc;
+ tc = compat_ptr(arg);
+ if (get_user(par_timeout.tv_sec, &tc->tv_sec) ||
+ get_user(par_timeout.tv_usec, &tc->tv_usec)) {
+ ret = -EFAULT;
+ break;
+ }
}
ret = lp_set_timeout(minor, &par_timeout);
break;
--
1.7.6.5

2012-02-20 00:13:00

by H. Peter Anvin

Subject: [PATCH 09/30] compat: Use COMPAT_USE_64BIT_TIME in the input subsystem

From: "H. J. Lu" <[email protected]>

Enable the input system to be used with a compat ABI with 64-bit time.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
---
drivers/input/input-compat.c | 4 ++--
drivers/input/input-compat.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/input/input-compat.c b/drivers/input/input-compat.c
index e46a867..64ca711 100644
--- a/drivers/input/input-compat.c
+++ b/drivers/input/input-compat.c
@@ -17,7 +17,7 @@
int input_event_from_user(const char __user *buffer,
struct input_event *event)
{
- if (INPUT_COMPAT_TEST) {
+ if (INPUT_COMPAT_TEST && !COMPAT_USE_64BIT_TIME) {
struct input_event_compat compat_event;

if (copy_from_user(&compat_event, buffer,
@@ -41,7 +41,7 @@ int input_event_from_user(const char __user *buffer,
int input_event_to_user(char __user *buffer,
const struct input_event *event)
{
- if (INPUT_COMPAT_TEST) {
+ if (INPUT_COMPAT_TEST && !COMPAT_USE_64BIT_TIME) {
struct input_event_compat compat_event;

compat_event.time.tv_sec = event->time.tv_sec;
diff --git a/drivers/input/input-compat.h b/drivers/input/input-compat.h
index 22be27b..148f66f 100644
--- a/drivers/input/input-compat.h
+++ b/drivers/input/input-compat.h
@@ -67,7 +67,7 @@ struct ff_effect_compat {

static inline size_t input_event_size(void)
{
- return INPUT_COMPAT_TEST ?
+ return (INPUT_COMPAT_TEST && !COMPAT_USE_64BIT_TIME) ?
sizeof(struct input_event_compat) : sizeof(struct input_event);
}

--
1.7.6.5

2012-02-20 00:13:32

by H. Peter Anvin

Subject: [PATCH 10/30] compat: Use COMPAT_USE_64BIT_TIME in the Bluetooth subsystem

From: "H. J. Lu" <[email protected]>

Enable the Bluetooth subsystem to be used with a compat ABI with
64-bit time.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Marcel Holtmann <[email protected]>
Cc: Gustavo F. Padovan <[email protected]>
Cc: David S. Miller <[email protected]>
---
net/bluetooth/hci_sock.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 0dcc962..b2eb2b9 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -418,7 +418,8 @@ static inline void hci_sock_cmsg(struct sock *sk, struct msghdr *msg, struct sk_
data = &tv;
len = sizeof(tv);
#ifdef CONFIG_COMPAT
- if (msg->msg_flags & MSG_CMSG_COMPAT) {
+ if (!COMPAT_USE_64BIT_TIME &&
+ (msg->msg_flags & MSG_CMSG_COMPAT)) {
ctv.tv_sec = tv.tv_sec;
ctv.tv_usec = tv.tv_usec;
data = &ctv;
--
1.7.6.5

2012-02-20 00:14:05

by H. Peter Anvin

Subject: [PATCH 11/30] aio: Use __kernel_ulong_t to define aio_context_t

From: "H. Peter Anvin" <[email protected]>

Rather than using "unsigned long" which is ABI-dependent, use
__kernel_ulong_t to define the externally visible type aio_context_t.

Note: the change in this form will cause unsigned long/unsigned int
differences on existing ABIs. If that is unacceptable we may have to
define a new type.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Benjamin LaHaise <[email protected]>
---
include/linux/aio_abi.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/aio_abi.h b/include/linux/aio_abi.h
index 2c87316..86fa7a7 100644
--- a/include/linux/aio_abi.h
+++ b/include/linux/aio_abi.h
@@ -30,7 +30,7 @@
#include <linux/types.h>
#include <asm/byteorder.h>

-typedef unsigned long aio_context_t;
+typedef __kernel_ulong_t aio_context_t;

enum {
IOCB_CMD_PREAD = 0,
--
1.7.6.5

2012-02-20 00:14:29

by H. Peter Anvin

Subject: [PATCH 12/30] compat: Create compat_sys_p{read,write}v64

From: "H. J. Lu" <[email protected]>

For 32-bit ABIs which have real 64-bit registers, we don't want to
break the position argument into two. However, we still need compat
support to deal with 32-bit pointers, so we can't just use
sys_p{read,write}v directly.

Signed-off-by: H. Peter Anvin <[email protected]>
---
fs/compat.c | 26 ++++++++++++++++++++------
1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/fs/compat.c b/fs/compat.c
index fa9d721..83d751c 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -1177,10 +1177,9 @@ compat_sys_readv(unsigned long fd, const struct compat_iovec __user *vec,
}

asmlinkage ssize_t
-compat_sys_preadv(unsigned long fd, const struct compat_iovec __user *vec,
- unsigned long vlen, u32 pos_low, u32 pos_high)
+compat_sys_preadv64(unsigned long fd, const struct compat_iovec __user *vec,
+ unsigned long vlen, loff_t pos)
{
- loff_t pos = ((loff_t)pos_high << 32) | pos_low;
struct file *file;
int fput_needed;
ssize_t ret;
@@ -1197,6 +1196,14 @@ compat_sys_preadv(unsigned long fd, const struct compat_iovec __user *vec,
return ret;
}

+asmlinkage ssize_t
+compat_sys_preadv(unsigned long fd, const struct compat_iovec __user *vec,
+ unsigned long vlen, u32 pos_low, u32 pos_high)
+{
+ loff_t pos = ((loff_t)pos_high << 32) | pos_low;
+ return compat_sys_preadv64(fd, vec, vlen, pos);
+}
+
static size_t compat_writev(struct file *file,
const struct compat_iovec __user *vec,
unsigned long vlen, loff_t *pos)
@@ -1236,10 +1243,9 @@ compat_sys_writev(unsigned long fd, const struct compat_iovec __user *vec,
}

asmlinkage ssize_t
-compat_sys_pwritev(unsigned long fd, const struct compat_iovec __user *vec,
- unsigned long vlen, u32 pos_low, u32 pos_high)
+compat_sys_pwritev64(unsigned long fd, const struct compat_iovec __user *vec,
+ unsigned long vlen, loff_t pos)
{
- loff_t pos = ((loff_t)pos_high << 32) | pos_low;
struct file *file;
int fput_needed;
ssize_t ret;
@@ -1256,6 +1262,14 @@ compat_sys_pwritev(unsigned long fd, const struct compat_iovec __user *vec,
return ret;
}

+asmlinkage ssize_t
+compat_sys_pwritev(unsigned long fd, const struct compat_iovec __user *vec,
+ unsigned long vlen, u32 pos_low, u32 pos_high)
+{
+ loff_t pos = ((loff_t)pos_high << 32) | pos_low;
+ return compat_sys_pwritev64(fd, vec, vlen, pos);
+}
+
asmlinkage long
compat_sys_vmsplice(int fd, const struct compat_iovec __user *iov32,
unsigned int nr_segs, unsigned int flags)
--
1.7.6.5
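
An illustrative aside, not part of the patch: the refactoring above turns the legacy entry point into a thin wrapper that reassembles the file position and forwards to the 64-bit variant. A userspace model with hypothetical function names is shown below; it does the shift on an unsigned 64-bit value, a small deviation from the patch chosen to keep the sketch within strictly defined ISO C behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 64-bit entry point: receives the offset in one piece. */
static int64_t model_preadv64(int64_t pos)
{
	return pos;	/* a real implementation would read at this offset */
}

/* Stand-in for the legacy entry point: reassembles the two halves first. */
static int64_t model_preadv(uint32_t pos_low, uint32_t pos_high)
{
	uint64_t pos = ((uint64_t)pos_high << 32) | pos_low;

	return model_preadv64((int64_t)pos);
}
```

An ABI with real 64-bit registers can call the 64-bit variant directly and skip the split entirely, while still going through the compat iovec handling for its 32-bit pointers.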

2012-02-20 00:14:59

by H. Peter Anvin

Subject: [PATCH 13/30] elf: Allow core dump-related fields to be overridden

From: "H. J. Lu" <[email protected]>

Allow some core dump-related fields to be overridden. This allows
core dumps to work correctly for x32.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Roland McGrath <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
fs/binfmt_elf.c | 24 ++++++++++++++++++++----
1 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index bcb884e..43ba478 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1390,6 +1390,22 @@ static void do_thread_regset_writeback(struct task_struct *task,
regset->writeback(task, regset, 1);
}

+#ifndef PR_REG_SIZE
+#define PR_REG_SIZE(S) sizeof(S)
+#endif
+
+#ifndef PRSTATUS_SIZE
+#define PRSTATUS_SIZE(S) sizeof(S)
+#endif
+
+#ifndef PR_REG_PTR
+#define PR_REG_PTR(S) (&((S)->pr_reg))
+#endif
+
+#ifndef SET_PR_FPVALID
+#define SET_PR_FPVALID(S, V) ((S)->pr_fpvalid = (V))
+#endif
+
static int fill_thread_core_info(struct elf_thread_core_info *t,
const struct user_regset_view *view,
long signr, size_t *total)
@@ -1404,11 +1420,11 @@ static int fill_thread_core_info(struct elf_thread_core_info *t,
*/
fill_prstatus(&t->prstatus, t->task, signr);
(void) view->regsets[0].get(t->task, &view->regsets[0],
- 0, sizeof(t->prstatus.pr_reg),
- &t->prstatus.pr_reg, NULL);
+ 0, PR_REG_SIZE(t->prstatus.pr_reg),
+ PR_REG_PTR(&t->prstatus), NULL);

fill_note(&t->notes[0], "CORE", NT_PRSTATUS,
- sizeof(t->prstatus), &t->prstatus);
+ PRSTATUS_SIZE(t->prstatus), &t->prstatus);
*total += notesize(&t->notes[0]);

do_thread_regset_writeback(t->task, &view->regsets[0]);
@@ -1438,7 +1454,7 @@ static int fill_thread_core_info(struct elf_thread_core_info *t,
regset->core_note_type,
size, data);
else {
- t->prstatus.pr_fpvalid = 1;
+ SET_PR_FPVALID(&t->prstatus, 1);
fill_note(&t->notes[i], "CORE",
NT_PRFPREG, size, data);
}
--
1.7.6.5
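
The override mechanism in this patch is the usual #ifndef default: a header included earlier may define the macro, and the generic code falls back to the plain struct access otherwise. A minimal user-space sketch of the pattern follows. struct sketch_prstatus, reg_size and mark_fpvalid are hypothetical names for illustration; the struct is not the kernel's struct elf_prstatus.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical note layout; not the kernel's struct elf_prstatus. */
struct sketch_prstatus {
	long pr_reg[4];
	int pr_fpvalid;
};

/* An architecture header would define these first to override them. */
#ifndef PR_REG_SIZE
#define PR_REG_SIZE(S) sizeof(S)
#endif

#ifndef SET_PR_FPVALID
#define SET_PR_FPVALID(S, V) ((S)->pr_fpvalid = (V))
#endif

/* With no override in place, the default measures the whole register block. */
static_assert(PR_REG_SIZE(((struct sketch_prstatus *)0)->pr_reg) ==
	      4 * sizeof(long), "default macro measures the register block");

/* Size of the register block as the generic code would see it. */
static size_t reg_size(const struct sketch_prstatus *p)
{
	return PR_REG_SIZE(p->pr_reg);
}

/* Mark the floating-point state as valid and report the stored flag. */
static int mark_fpvalid(struct sketch_prstatus *p)
{
	SET_PR_FPVALID(p, 1);
	return p->pr_fpvalid;
}
```

Because the generic code only ever goes through the macros, an architecture can pick a different layout at run time without touching fs/binfmt_elf.c again.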

2012-02-20 00:15:22

by H. Peter Anvin

Subject: [PATCH 14/30] x86-64: Add prototype for old_rsp to a header file

From: "H. J. Lu" <[email protected]>

So far old_rsp has only been used in process_64.c, but the x32 code
will need it in other files as well.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/processor.h | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 9f748b5..e34f951 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -948,6 +948,12 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);

#define task_pt_regs(tsk) ((struct pt_regs *)(tsk)->thread.sp0 - 1)
extern unsigned long KSTK_ESP(struct task_struct *task);
+
+/*
+ * User space RSP while inside the SYSCALL fast path
+ */
+DECLARE_PER_CPU(unsigned long, old_rsp);
+
#endif /* CONFIG_X86_64 */

extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
--
1.7.6.5

2012-02-20 00:15:47

by H. Peter Anvin

Subject: [PATCH 15/30] x32: Add a thread flag for x32 processes

From: "H. Peter Anvin" <[email protected]>

An x32 process is *almost* the same thing as a 64-bit process with a
32-bit address limit, but there are a few minor differences -- in
particular core dumps are 32 bits and signal handling is different.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/thread_info.h | 2 ++
arch/x86/kernel/process_64.c | 2 ++
2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index d1803a4..912e935 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -96,6 +96,7 @@ struct thread_info {
#define TIF_LAZY_MMU_UPDATES 27 /* task is updating the mmu lazily */
#define TIF_SYSCALL_TRACEPOINT 28 /* syscall tracepoint instrumentation */
#define TIF_ADDR32 29 /* 32-bit address space on 64 bits */
+#define TIF_X32 30 /* 32-bit native x86-64 binary */

#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
@@ -118,6 +119,7 @@ struct thread_info {
#define _TIF_LAZY_MMU_UPDATES (1 << TIF_LAZY_MMU_UPDATES)
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
#define _TIF_ADDR32 (1 << TIF_ADDR32)
+#define _TIF_X32 (1 << TIF_X32)

/* work to do in syscall_trace_enter() */
#define _TIF_WORK_SYSCALL_ENTRY \
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 0e900d0..5fe2fba 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -509,6 +509,7 @@ void set_personality_64bit(void)
/* Make sure to be in 64bit mode */
clear_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ADDR32);
+ clear_thread_flag(TIF_X32);

/* Ensure the corresponding mm is not marked. */
if (current->mm)
@@ -528,6 +529,7 @@ void set_personality_ia32(void)
/* Make sure to be in 32bit mode */
set_thread_flag(TIF_IA32);
set_thread_flag(TIF_ADDR32);
+ clear_thread_flag(TIF_X32);
current->personality |= force_personality32;

/* Mark the associated mm as containing 32-bit tasks. */
--
1.7.6.5
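
For reference, the flag handling in the two personality hunks above reduces to simple bit arithmetic. The sketch below models only the two flags whose bit numbers appear in the diff (TIF_ADDR32 = 29, TIF_X32 = 30). The SKETCH_ names and the two helper functions are hypothetical, and the real code also touches TIF_IA32, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit numbers taken from the thread_info.h hunk above. */
#define SKETCH_TIF_ADDR32 29
#define SKETCH_TIF_X32    30

static_assert(SKETCH_TIF_X32 < 32, "flag must fit in a 32-bit flags word");

/* Masks built the same way as the _TIF_* macros, but unsigned. */
#define SKETCH_MASK_ADDR32 (1u << SKETCH_TIF_ADDR32)
#define SKETCH_MASK_X32    (1u << SKETCH_TIF_X32)

/* Models set_personality_64bit(): neither 32-bit addressing nor x32. */
static uint32_t personality_64bit(uint32_t flags)
{
	return flags & ~(SKETCH_MASK_ADDR32 | SKETCH_MASK_X32);
}

/* Models set_personality_ia32(): 32-bit addressing, explicitly not x32. */
static uint32_t personality_ia32(uint32_t flags)
{
	return (flags | SKETCH_MASK_ADDR32) & ~SKETCH_MASK_X32;
}
```

Both paths clear TIF_X32, so a task that execs a non-x32 binary cannot inherit the flag from its previous image.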

2012-02-20 00:16:14

by H. Peter Anvin

Subject: [PATCH 16/30] x86-64, ia32: Drop sys32_rt_sigprocmask

From: "H. Peter Anvin" <[email protected]>

On x86, the only difference between sys_rt_sigprocmask and
sys32_rt_sigprocmask is the alignment of the data structures.
However, x86 allows data accesses with arbitrary alignment, and
therefore there is no reason for this code to be different.

Reported-by: Gregory M. Lueck <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/ia32/sys_ia32.c | 40 --------------------------------------
arch/x86/include/asm/sys_ia32.h | 2 -
arch/x86/syscalls/syscall_32.tbl | 2 +-
3 files changed, 1 insertions(+), 43 deletions(-)

diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index f6f5c53..aec2202 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -287,46 +287,6 @@ asmlinkage long sys32_sigaction(int sig, struct old_sigaction32 __user *act,
return ret;
}

-asmlinkage long sys32_rt_sigprocmask(int how, compat_sigset_t __user *set,
- compat_sigset_t __user *oset,
- unsigned int sigsetsize)
-{
- sigset_t s;
- compat_sigset_t s32;
- int ret;
- mm_segment_t old_fs = get_fs();
-
- if (set) {
- if (copy_from_user(&s32, set, sizeof(compat_sigset_t)))
- return -EFAULT;
- switch (_NSIG_WORDS) {
- case 4: s.sig[3] = s32.sig[6] | (((long)s32.sig[7]) << 32);
- case 3: s.sig[2] = s32.sig[4] | (((long)s32.sig[5]) << 32);
- case 2: s.sig[1] = s32.sig[2] | (((long)s32.sig[3]) << 32);
- case 1: s.sig[0] = s32.sig[0] | (((long)s32.sig[1]) << 32);
- }
- }
- set_fs(KERNEL_DS);
- ret = sys_rt_sigprocmask(how,
- set ? (sigset_t __user *)&s : NULL,
- oset ? (sigset_t __user *)&s : NULL,
- sigsetsize);
- set_fs(old_fs);
- if (ret)
- return ret;
- if (oset) {
- switch (_NSIG_WORDS) {
- case 4: s32.sig[7] = (s.sig[3] >> 32); s32.sig[6] = s.sig[3];
- case 3: s32.sig[5] = (s.sig[2] >> 32); s32.sig[4] = s.sig[2];
- case 2: s32.sig[3] = (s.sig[1] >> 32); s32.sig[2] = s.sig[1];
- case 1: s32.sig[1] = (s.sig[0] >> 32); s32.sig[0] = s.sig[0];
- }
- if (copy_to_user(oset, &s32, sizeof(compat_sigset_t)))
- return -EFAULT;
- }
- return 0;
-}
-
asmlinkage long sys32_alarm(unsigned int seconds)
{
return alarm_setitimer(seconds);
diff --git a/arch/x86/include/asm/sys_ia32.h b/arch/x86/include/asm/sys_ia32.h
index cb23852..68da87b 100644
--- a/arch/x86/include/asm/sys_ia32.h
+++ b/arch/x86/include/asm/sys_ia32.h
@@ -36,8 +36,6 @@ asmlinkage long sys32_rt_sigaction(int, struct sigaction32 __user *,
struct sigaction32 __user *, unsigned int);
asmlinkage long sys32_sigaction(int, struct old_sigaction32 __user *,
struct old_sigaction32 __user *);
-asmlinkage long sys32_rt_sigprocmask(int, compat_sigset_t __user *,
- compat_sigset_t __user *, unsigned int);
asmlinkage long sys32_alarm(unsigned int);

asmlinkage long sys32_waitpid(compat_pid_t, unsigned int *, int);
diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index ce98e28..031cef8 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -181,7 +181,7 @@
172 i386 prctl sys_prctl
173 i386 rt_sigreturn ptregs_rt_sigreturn stub32_rt_sigreturn
174 i386 rt_sigaction sys_rt_sigaction sys32_rt_sigaction
-175 i386 rt_sigprocmask sys_rt_sigprocmask sys32_rt_sigprocmask
+175 i386 rt_sigprocmask sys_rt_sigprocmask
176 i386 rt_sigpending sys_rt_sigpending sys32_rt_sigpending
177 i386 rt_sigtimedwait sys_rt_sigtimedwait compat_sys_rt_sigtimedwait
178 i386 rt_sigqueueinfo sys_rt_sigqueueinfo sys32_rt_sigqueueinfo
--
1.7.6.5
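
The removed function rebuilt each 64-bit mask word from a pair of 32-bit compat words. On a little-endian machine such as x86 that repacking produces the same eight bytes the caller already supplied, which seems to be why only alignment is left as a difference. The sketch below illustrates the point in user space; pack_words and read_direct are hypothetical names, and the equality holds only on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
	      "one mask word spans two compat words");

/* Repack as the removed code did: low word first, high word second. */
static uint64_t pack_words(const uint32_t w[2])
{
	return (uint64_t)w[0] | ((uint64_t)w[1] << 32);
}

/* Read the same eight bytes directly as a single 64-bit word. */
static uint64_t read_direct(const uint32_t w[2])
{
	uint64_t v;

	/* memcpy tolerates any alignment of the source buffer. */
	memcpy(&v, w, sizeof(v));
	return v;
}
```

On a big-endian host the two functions would disagree, so this shortcut is specific to architectures like x86.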

2012-02-20 00:16:46

by H. Peter Anvin

Subject: [PATCH 17/30] x32: Add x32 system calls to syscalls/syscall_64.tbl

From: "H. Peter Anvin" <[email protected]>

Split the 64-bit system calls into "64" (64-bit only) and "common"
(64-bit or x32) and add the x32 system call numbers.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/asm-offsets_64.c | 2 +
arch/x86/kernel/syscall_64.c | 3 +
arch/x86/syscalls/Makefile | 2 +-
arch/x86/syscalls/syscall_64.tbl | 579 ++++++++++++++++++++------------------
arch/x86/um/sys_call_table_64.c | 3 +
arch/x86/um/user-offsets.c | 2 +
6 files changed, 317 insertions(+), 274 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index 834e897..c3354f7 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -1,6 +1,8 @@
#include <asm/ia32.h>

#define __SYSCALL_64(nr, sym, compat) [nr] = 1,
+#define __SYSCALL_COMMON(nr, sym, compat) [nr] = 1,
+#define __SYSCALL_X32(nr, sym, compat) /* Not yet */
static char syscalls_64[] = {
#include <asm/syscalls_64.h>
};
diff --git a/arch/x86/kernel/syscall_64.c b/arch/x86/kernel/syscall_64.c
index 7ac7943..26c4ca1 100644
--- a/arch/x86/kernel/syscall_64.c
+++ b/arch/x86/kernel/syscall_64.c
@@ -5,6 +5,9 @@
#include <linux/cache.h>
#include <asm/asm-offsets.h>

+#define __SYSCALL_COMMON(nr, sym, compat) __SYSCALL_64(nr, sym, compat)
+#define __SYSCALL_X32(nr, sym, compat) /* Not yet */
+
#define __SYSCALL_64(nr, sym, compat) extern asmlinkage void sym(void) ;
#include <asm/syscalls_64.h>
#undef __SYSCALL_64
diff --git a/arch/x86/syscalls/Makefile b/arch/x86/syscalls/Makefile
index 564b247..89dd958 100644
--- a/arch/x86/syscalls/Makefile
+++ b/arch/x86/syscalls/Makefile
@@ -24,7 +24,7 @@ syshdr_pfx_unistd_32_ia32 := ia32_
$(out)/unistd_32_ia32.h: $(syscall32) $(syshdr)
$(call if_changed,syshdr)

-syshdr_abi_unistd_64 := 64
+syshdr_abi_unistd_64 := common,64
$(out)/unistd_64.h: $(syscall64) $(syshdr)
$(call if_changed,syshdr)

diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index b440a8f..4aecc7e 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -4,317 +4,350 @@
# The format is:
# <number> <abi> <name> <entry point>
#
-# The abi is always "64" for this file (for now.)
+# The abi is "common", "64" or "x32" for this file.
#
-0 64 read sys_read
-1 64 write sys_write
-2 64 open sys_open
-3 64 close sys_close
-4 64 stat sys_newstat
-5 64 fstat sys_newfstat
-6 64 lstat sys_newlstat
-7 64 poll sys_poll
-8 64 lseek sys_lseek
-9 64 mmap sys_mmap
-10 64 mprotect sys_mprotect
-11 64 munmap sys_munmap
-12 64 brk sys_brk
+0 common read sys_read
+1 common write sys_write
+2 common open sys_open
+3 common close sys_close
+4 common stat sys_newstat
+5 common fstat sys_newfstat
+6 common lstat sys_newlstat
+7 common poll sys_poll
+8 common lseek sys_lseek
+9 common mmap sys_mmap
+10 common mprotect sys_mprotect
+11 common munmap sys_munmap
+12 common brk sys_brk
13 64 rt_sigaction sys_rt_sigaction
-14 64 rt_sigprocmask sys_rt_sigprocmask
+14 common rt_sigprocmask sys_rt_sigprocmask
15 64 rt_sigreturn stub_rt_sigreturn
16 64 ioctl sys_ioctl
-17 64 pread64 sys_pread64
-18 64 pwrite64 sys_pwrite64
+17 common pread64 sys_pread64
+18 common pwrite64 sys_pwrite64
19 64 readv sys_readv
20 64 writev sys_writev
-21 64 access sys_access
-22 64 pipe sys_pipe
-23 64 select sys_select
-24 64 sched_yield sys_sched_yield
-25 64 mremap sys_mremap
-26 64 msync sys_msync
-27 64 mincore sys_mincore
-28 64 madvise sys_madvise
-29 64 shmget sys_shmget
-30 64 shmat sys_shmat
-31 64 shmctl sys_shmctl
-32 64 dup sys_dup
-33 64 dup2 sys_dup2
-34 64 pause sys_pause
-35 64 nanosleep sys_nanosleep
-36 64 getitimer sys_getitimer
-37 64 alarm sys_alarm
-38 64 setitimer sys_setitimer
-39 64 getpid sys_getpid
-40 64 sendfile sys_sendfile64
-41 64 socket sys_socket
-42 64 connect sys_connect
-43 64 accept sys_accept
-44 64 sendto sys_sendto
+21 common access sys_access
+22 common pipe sys_pipe
+23 common select sys_select
+24 common sched_yield sys_sched_yield
+25 common mremap sys_mremap
+26 common msync sys_msync
+27 common mincore sys_mincore
+28 common madvise sys_madvise
+29 common shmget sys_shmget
+30 common shmat sys_shmat
+31 common shmctl sys_shmctl
+32 common dup sys_dup
+33 common dup2 sys_dup2
+34 common pause sys_pause
+35 common nanosleep sys_nanosleep
+36 common getitimer sys_getitimer
+37 common alarm sys_alarm
+38 common setitimer sys_setitimer
+39 common getpid sys_getpid
+40 common sendfile sys_sendfile64
+41 common socket sys_socket
+42 common connect sys_connect
+43 common accept sys_accept
+44 common sendto sys_sendto
45 64 recvfrom sys_recvfrom
46 64 sendmsg sys_sendmsg
47 64 recvmsg sys_recvmsg
-48 64 shutdown sys_shutdown
-49 64 bind sys_bind
-50 64 listen sys_listen
-51 64 getsockname sys_getsockname
-52 64 getpeername sys_getpeername
-53 64 socketpair sys_socketpair
-54 64 setsockopt sys_setsockopt
-55 64 getsockopt sys_getsockopt
-56 64 clone stub_clone
-57 64 fork stub_fork
-58 64 vfork stub_vfork
+48 common shutdown sys_shutdown
+49 common bind sys_bind
+50 common listen sys_listen
+51 common getsockname sys_getsockname
+52 common getpeername sys_getpeername
+53 common socketpair sys_socketpair
+54 common setsockopt sys_setsockopt
+55 common getsockopt sys_getsockopt
+56 common clone stub_clone
+57 common fork stub_fork
+58 common vfork stub_vfork
59 64 execve stub_execve
-60 64 exit sys_exit
-61 64 wait4 sys_wait4
-62 64 kill sys_kill
-63 64 uname sys_newuname
-64 64 semget sys_semget
-65 64 semop sys_semop
-66 64 semctl sys_semctl
-67 64 shmdt sys_shmdt
-68 64 msgget sys_msgget
-69 64 msgsnd sys_msgsnd
-70 64 msgrcv sys_msgrcv
-71 64 msgctl sys_msgctl
-72 64 fcntl sys_fcntl
-73 64 flock sys_flock
-74 64 fsync sys_fsync
-75 64 fdatasync sys_fdatasync
-76 64 truncate sys_truncate
-77 64 ftruncate sys_ftruncate
-78 64 getdents sys_getdents
-79 64 getcwd sys_getcwd
-80 64 chdir sys_chdir
-81 64 fchdir sys_fchdir
-82 64 rename sys_rename
-83 64 mkdir sys_mkdir
-84 64 rmdir sys_rmdir
-85 64 creat sys_creat
-86 64 link sys_link
-87 64 unlink sys_unlink
-88 64 symlink sys_symlink
-89 64 readlink sys_readlink
-90 64 chmod sys_chmod
-91 64 fchmod sys_fchmod
-92 64 chown sys_chown
-93 64 fchown sys_fchown
-94 64 lchown sys_lchown
-95 64 umask sys_umask
-96 64 gettimeofday sys_gettimeofday
-97 64 getrlimit sys_getrlimit
-98 64 getrusage sys_getrusage
-99 64 sysinfo sys_sysinfo
+60 common exit sys_exit
+61 common wait4 sys_wait4
+62 common kill sys_kill
+63 common uname sys_newuname
+64 common semget sys_semget
+65 common semop sys_semop
+66 common semctl sys_semctl
+67 common shmdt sys_shmdt
+68 common msgget sys_msgget
+69 common msgsnd sys_msgsnd
+70 common msgrcv sys_msgrcv
+71 common msgctl sys_msgctl
+72 common fcntl sys_fcntl
+73 common flock sys_flock
+74 common fsync sys_fsync
+75 common fdatasync sys_fdatasync
+76 common truncate sys_truncate
+77 common ftruncate sys_ftruncate
+78 common getdents sys_getdents
+79 common getcwd sys_getcwd
+80 common chdir sys_chdir
+81 common fchdir sys_fchdir
+82 common rename sys_rename
+83 common mkdir sys_mkdir
+84 common rmdir sys_rmdir
+85 common creat sys_creat
+86 common link sys_link
+87 common unlink sys_unlink
+88 common symlink sys_symlink
+89 common readlink sys_readlink
+90 common chmod sys_chmod
+91 common fchmod sys_fchmod
+92 common chown sys_chown
+93 common fchown sys_fchown
+94 common lchown sys_lchown
+95 common umask sys_umask
+96 common gettimeofday sys_gettimeofday
+97 common getrlimit sys_getrlimit
+98 common getrusage sys_getrusage
+99 common sysinfo sys_sysinfo
100 64 times sys_times
-101 64 ptrace sys_ptrace
-102 64 getuid sys_getuid
-103 64 syslog sys_syslog
-104 64 getgid sys_getgid
-105 64 setuid sys_setuid
-106 64 setgid sys_setgid
-107 64 geteuid sys_geteuid
-108 64 getegid sys_getegid
-109 64 setpgid sys_setpgid
-110 64 getppid sys_getppid
-111 64 getpgrp sys_getpgrp
-112 64 setsid sys_setsid
-113 64 setreuid sys_setreuid
-114 64 setregid sys_setregid
-115 64 getgroups sys_getgroups
-116 64 setgroups sys_setgroups
-117 64 setresuid sys_setresuid
-118 64 getresuid sys_getresuid
-119 64 setresgid sys_setresgid
-120 64 getresgid sys_getresgid
-121 64 getpgid sys_getpgid
-122 64 setfsuid sys_setfsuid
-123 64 setfsgid sys_setfsgid
-124 64 getsid sys_getsid
-125 64 capget sys_capget
-126 64 capset sys_capset
+101 common ptrace sys_ptrace
+102 common getuid sys_getuid
+103 common syslog sys_syslog
+104 common getgid sys_getgid
+105 common setuid sys_setuid
+106 common setgid sys_setgid
+107 common geteuid sys_geteuid
+108 common getegid sys_getegid
+109 common setpgid sys_setpgid
+110 common getppid sys_getppid
+111 common getpgrp sys_getpgrp
+112 common setsid sys_setsid
+113 common setreuid sys_setreuid
+114 common setregid sys_setregid
+115 common getgroups sys_getgroups
+116 common setgroups sys_setgroups
+117 common setresuid sys_setresuid
+118 common getresuid sys_getresuid
+119 common setresgid sys_setresgid
+120 common getresgid sys_getresgid
+121 common getpgid sys_getpgid
+122 common setfsuid sys_setfsuid
+123 common setfsgid sys_setfsgid
+124 common getsid sys_getsid
+125 common capget sys_capget
+126 common capset sys_capset
127 64 rt_sigpending sys_rt_sigpending
128 64 rt_sigtimedwait sys_rt_sigtimedwait
129 64 rt_sigqueueinfo sys_rt_sigqueueinfo
-130 64 rt_sigsuspend sys_rt_sigsuspend
+130 common rt_sigsuspend sys_rt_sigsuspend
131 64 sigaltstack stub_sigaltstack
-132 64 utime sys_utime
-133 64 mknod sys_mknod
+132 common utime sys_utime
+133 common mknod sys_mknod
134 64 uselib
-135 64 personality sys_personality
-136 64 ustat sys_ustat
-137 64 statfs sys_statfs
-138 64 fstatfs sys_fstatfs
-139 64 sysfs sys_sysfs
-140 64 getpriority sys_getpriority
-141 64 setpriority sys_setpriority
-142 64 sched_setparam sys_sched_setparam
-143 64 sched_getparam sys_sched_getparam
-144 64 sched_setscheduler sys_sched_setscheduler
-145 64 sched_getscheduler sys_sched_getscheduler
-146 64 sched_get_priority_max sys_sched_get_priority_max
-147 64 sched_get_priority_min sys_sched_get_priority_min
-148 64 sched_rr_get_interval sys_sched_rr_get_interval
-149 64 mlock sys_mlock
-150 64 munlock sys_munlock
-151 64 mlockall sys_mlockall
-152 64 munlockall sys_munlockall
-153 64 vhangup sys_vhangup
-154 64 modify_ldt sys_modify_ldt
-155 64 pivot_root sys_pivot_root
+135 common personality sys_personality
+136 common ustat sys_ustat
+137 common statfs sys_statfs
+138 common fstatfs sys_fstatfs
+139 common sysfs sys_sysfs
+140 common getpriority sys_getpriority
+141 common setpriority sys_setpriority
+142 common sched_setparam sys_sched_setparam
+143 common sched_getparam sys_sched_getparam
+144 common sched_setscheduler sys_sched_setscheduler
+145 common sched_getscheduler sys_sched_getscheduler
+146 common sched_get_priority_max sys_sched_get_priority_max
+147 common sched_get_priority_min sys_sched_get_priority_min
+148 common sched_rr_get_interval sys_sched_rr_get_interval
+149 common mlock sys_mlock
+150 common munlock sys_munlock
+151 common mlockall sys_mlockall
+152 common munlockall sys_munlockall
+153 common vhangup sys_vhangup
+154 common modify_ldt sys_modify_ldt
+155 common pivot_root sys_pivot_root
156 64 _sysctl sys_sysctl
-157 64 prctl sys_prctl
-158 64 arch_prctl sys_arch_prctl
-159 64 adjtimex sys_adjtimex
-160 64 setrlimit sys_setrlimit
-161 64 chroot sys_chroot
-162 64 sync sys_sync
-163 64 acct sys_acct
-164 64 settimeofday sys_settimeofday
-165 64 mount sys_mount
-166 64 umount2 sys_umount
-167 64 swapon sys_swapon
-168 64 swapoff sys_swapoff
-169 64 reboot sys_reboot
-170 64 sethostname sys_sethostname
-171 64 setdomainname sys_setdomainname
-172 64 iopl stub_iopl
-173 64 ioperm sys_ioperm
+157 common prctl sys_prctl
+158 common arch_prctl sys_arch_prctl
+159 common adjtimex sys_adjtimex
+160 common setrlimit sys_setrlimit
+161 common chroot sys_chroot
+162 common sync sys_sync
+163 common acct sys_acct
+164 common settimeofday sys_settimeofday
+165 common mount sys_mount
+166 common umount2 sys_umount
+167 common swapon sys_swapon
+168 common swapoff sys_swapoff
+169 common reboot sys_reboot
+170 common sethostname sys_sethostname
+171 common setdomainname sys_setdomainname
+172 common iopl stub_iopl
+173 common ioperm sys_ioperm
174 64 create_module
-175 64 init_module sys_init_module
-176 64 delete_module sys_delete_module
+175 common init_module sys_init_module
+176 common delete_module sys_delete_module
177 64 get_kernel_syms
178 64 query_module
-179 64 quotactl sys_quotactl
+179 common quotactl sys_quotactl
180 64 nfsservctl
-181 64 getpmsg
-182 64 putpmsg
-183 64 afs_syscall
-184 64 tuxcall
-185 64 security
-186 64 gettid sys_gettid
-187 64 readahead sys_readahead
-188 64 setxattr sys_setxattr
-189 64 lsetxattr sys_lsetxattr
-190 64 fsetxattr sys_fsetxattr
-191 64 getxattr sys_getxattr
-192 64 lgetxattr sys_lgetxattr
-193 64 fgetxattr sys_fgetxattr
-194 64 listxattr sys_listxattr
-195 64 llistxattr sys_llistxattr
-196 64 flistxattr sys_flistxattr
-197 64 removexattr sys_removexattr
-198 64 lremovexattr sys_lremovexattr
-199 64 fremovexattr sys_fremovexattr
-200 64 tkill sys_tkill
-201 64 time sys_time
-202 64 futex sys_futex
-203 64 sched_setaffinity sys_sched_setaffinity
-204 64 sched_getaffinity sys_sched_getaffinity
+181 common getpmsg
+182 common putpmsg
+183 common afs_syscall
+184 common tuxcall
+185 common security
+186 common gettid sys_gettid
+187 common readahead sys_readahead
+188 common setxattr sys_setxattr
+189 common lsetxattr sys_lsetxattr
+190 common fsetxattr sys_fsetxattr
+191 common getxattr sys_getxattr
+192 common lgetxattr sys_lgetxattr
+193 common fgetxattr sys_fgetxattr
+194 common listxattr sys_listxattr
+195 common llistxattr sys_llistxattr
+196 common flistxattr sys_flistxattr
+197 common removexattr sys_removexattr
+198 common lremovexattr sys_lremovexattr
+199 common fremovexattr sys_fremovexattr
+200 common tkill sys_tkill
+201 common time sys_time
+202 common futex sys_futex
+203 common sched_setaffinity sys_sched_setaffinity
+204 common sched_getaffinity sys_sched_getaffinity
205 64 set_thread_area
-206 64 io_setup sys_io_setup
-207 64 io_destroy sys_io_destroy
-208 64 io_getevents sys_io_getevents
-209 64 io_submit sys_io_submit
-210 64 io_cancel sys_io_cancel
+206 common io_setup sys_io_setup
+207 common io_destroy sys_io_destroy
+208 common io_getevents sys_io_getevents
+209 common io_submit sys_io_submit
+210 common io_cancel sys_io_cancel
211 64 get_thread_area
-212 64 lookup_dcookie sys_lookup_dcookie
-213 64 epoll_create sys_epoll_create
+212 common lookup_dcookie sys_lookup_dcookie
+213 common epoll_create sys_epoll_create
214 64 epoll_ctl_old
215 64 epoll_wait_old
-216 64 remap_file_pages sys_remap_file_pages
-217 64 getdents64 sys_getdents64
-218 64 set_tid_address sys_set_tid_address
-219 64 restart_syscall sys_restart_syscall
-220 64 semtimedop sys_semtimedop
-221 64 fadvise64 sys_fadvise64
+216 common remap_file_pages sys_remap_file_pages
+217 common getdents64 sys_getdents64
+218 common set_tid_address sys_set_tid_address
+219 common restart_syscall sys_restart_syscall
+220 common semtimedop sys_semtimedop
+221 common fadvise64 sys_fadvise64
222 64 timer_create sys_timer_create
-223 64 timer_settime sys_timer_settime
-224 64 timer_gettime sys_timer_gettime
-225 64 timer_getoverrun sys_timer_getoverrun
-226 64 timer_delete sys_timer_delete
-227 64 clock_settime sys_clock_settime
-228 64 clock_gettime sys_clock_gettime
-229 64 clock_getres sys_clock_getres
-230 64 clock_nanosleep sys_clock_nanosleep
-231 64 exit_group sys_exit_group
-232 64 epoll_wait sys_epoll_wait
-233 64 epoll_ctl sys_epoll_ctl
-234 64 tgkill sys_tgkill
-235 64 utimes sys_utimes
+223 common timer_settime sys_timer_settime
+224 common timer_gettime sys_timer_gettime
+225 common timer_getoverrun sys_timer_getoverrun
+226 common timer_delete sys_timer_delete
+227 common clock_settime sys_clock_settime
+228 common clock_gettime sys_clock_gettime
+229 common clock_getres sys_clock_getres
+230 common clock_nanosleep sys_clock_nanosleep
+231 common exit_group sys_exit_group
+232 common epoll_wait sys_epoll_wait
+233 common epoll_ctl sys_epoll_ctl
+234 common tgkill sys_tgkill
+235 common utimes sys_utimes
236 64 vserver
-237 64 mbind sys_mbind
-238 64 set_mempolicy sys_set_mempolicy
-239 64 get_mempolicy sys_get_mempolicy
-240 64 mq_open sys_mq_open
-241 64 mq_unlink sys_mq_unlink
-242 64 mq_timedsend sys_mq_timedsend
-243 64 mq_timedreceive sys_mq_timedreceive
+237 common mbind sys_mbind
+238 common set_mempolicy sys_set_mempolicy
+239 common get_mempolicy sys_get_mempolicy
+240 common mq_open sys_mq_open
+241 common mq_unlink sys_mq_unlink
+242 common mq_timedsend sys_mq_timedsend
+243 common mq_timedreceive sys_mq_timedreceive
244 64 mq_notify sys_mq_notify
-245 64 mq_getsetattr sys_mq_getsetattr
+245 common mq_getsetattr sys_mq_getsetattr
246 64 kexec_load sys_kexec_load
247 64 waitid sys_waitid
-248 64 add_key sys_add_key
-249 64 request_key sys_request_key
-250 64 keyctl sys_keyctl
-251 64 ioprio_set sys_ioprio_set
-252 64 ioprio_get sys_ioprio_get
-253 64 inotify_init sys_inotify_init
-254 64 inotify_add_watch sys_inotify_add_watch
-255 64 inotify_rm_watch sys_inotify_rm_watch
-256 64 migrate_pages sys_migrate_pages
-257 64 openat sys_openat
-258 64 mkdirat sys_mkdirat
-259 64 mknodat sys_mknodat
-260 64 fchownat sys_fchownat
-261 64 futimesat sys_futimesat
-262 64 newfstatat sys_newfstatat
-263 64 unlinkat sys_unlinkat
-264 64 renameat sys_renameat
-265 64 linkat sys_linkat
-266 64 symlinkat sys_symlinkat
-267 64 readlinkat sys_readlinkat
-268 64 fchmodat sys_fchmodat
-269 64 faccessat sys_faccessat
-270 64 pselect6 sys_pselect6
-271 64 ppoll sys_ppoll
-272 64 unshare sys_unshare
+248 common add_key sys_add_key
+249 common request_key sys_request_key
+250 common keyctl sys_keyctl
+251 common ioprio_set sys_ioprio_set
+252 common ioprio_get sys_ioprio_get
+253 common inotify_init sys_inotify_init
+254 common inotify_add_watch sys_inotify_add_watch
+255 common inotify_rm_watch sys_inotify_rm_watch
+256 common migrate_pages sys_migrate_pages
+257 common openat sys_openat
+258 common mkdirat sys_mkdirat
+259 common mknodat sys_mknodat
+260 common fchownat sys_fchownat
+261 common futimesat sys_futimesat
+262 common newfstatat sys_newfstatat
+263 common unlinkat sys_unlinkat
+264 common renameat sys_renameat
+265 common linkat sys_linkat
+266 common symlinkat sys_symlinkat
+267 common readlinkat sys_readlinkat
+268 common fchmodat sys_fchmodat
+269 common faccessat sys_faccessat
+270 common pselect6 sys_pselect6
+271 common ppoll sys_ppoll
+272 common unshare sys_unshare
273 64 set_robust_list sys_set_robust_list
274 64 get_robust_list sys_get_robust_list
-275 64 splice sys_splice
-276 64 tee sys_tee
-277 64 sync_file_range sys_sync_file_range
+275 common splice sys_splice
+276 common tee sys_tee
+277 common sync_file_range sys_sync_file_range
278 64 vmsplice sys_vmsplice
279 64 move_pages sys_move_pages
-280 64 utimensat sys_utimensat
-281 64 epoll_pwait sys_epoll_pwait
-282 64 signalfd sys_signalfd
-283 64 timerfd_create sys_timerfd_create
-284 64 eventfd sys_eventfd
-285 64 fallocate sys_fallocate
-286 64 timerfd_settime sys_timerfd_settime
-287 64 timerfd_gettime sys_timerfd_gettime
-288 64 accept4 sys_accept4
-289 64 signalfd4 sys_signalfd4
-290 64 eventfd2 sys_eventfd2
-291 64 epoll_create1 sys_epoll_create1
-292 64 dup3 sys_dup3
-293 64 pipe2 sys_pipe2
-294 64 inotify_init1 sys_inotify_init1
+280 common utimensat sys_utimensat
+281 common epoll_pwait sys_epoll_pwait
+282 common signalfd sys_signalfd
+283 common timerfd_create sys_timerfd_create
+284 common eventfd sys_eventfd
+285 common fallocate sys_fallocate
+286 common timerfd_settime sys_timerfd_settime
+287 common timerfd_gettime sys_timerfd_gettime
+288 common accept4 sys_accept4
+289 common signalfd4 sys_signalfd4
+290 common eventfd2 sys_eventfd2
+291 common epoll_create1 sys_epoll_create1
+292 common dup3 sys_dup3
+293 common pipe2 sys_pipe2
+294 common inotify_init1 sys_inotify_init1
295 64 preadv sys_preadv
296 64 pwritev sys_pwritev
297 64 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo
-298 64 perf_event_open sys_perf_event_open
+298 common perf_event_open sys_perf_event_open
299 64 recvmmsg sys_recvmmsg
-300 64 fanotify_init sys_fanotify_init
-301 64 fanotify_mark sys_fanotify_mark
-302 64 prlimit64 sys_prlimit64
-303 64 name_to_handle_at sys_name_to_handle_at
-304 64 open_by_handle_at sys_open_by_handle_at
-305 64 clock_adjtime sys_clock_adjtime
-306 64 syncfs sys_syncfs
+300 common fanotify_init sys_fanotify_init
+301 common fanotify_mark sys_fanotify_mark
+302 common prlimit64 sys_prlimit64
+303 common name_to_handle_at sys_name_to_handle_at
+304 common open_by_handle_at sys_open_by_handle_at
+305 common clock_adjtime sys_clock_adjtime
+306 common syncfs sys_syncfs
307 64 sendmmsg sys_sendmmsg
-308 64 setns sys_setns
-309 64 getcpu sys_getcpu
+308 common setns sys_setns
+309 common getcpu sys_getcpu
310 64 process_vm_readv sys_process_vm_readv
311 64 process_vm_writev sys_process_vm_writev
+#
+# x32-specific system call numbers start at 512 to avoid cache impact
+# for native 64-bit operation.
+#
+512 x32 rt_sigaction sys32_rt_sigaction
+513 x32 rt_sigreturn stub_x32_rt_sigreturn
+514 x32 ioctl compat_sys_ioctl
+515 x32 readv compat_sys_readv
+516 x32 writev compat_sys_writev
+517 x32 recvfrom compat_sys_recvfrom
+518 x32 sendmsg compat_sys_sendmsg
+519 x32 recvmsg compat_sys_recvmsg
+520 x32 execve stub_x32_execve
+521 x32 times compat_sys_times
+522 x32 rt_sigpending sys32_rt_sigpending
+523 x32 rt_sigtimedwait compat_sys_rt_sigtimedwait
+524 x32 rt_sigqueueinfo sys32_rt_sigqueueinfo
+525 x32 sigaltstack stub_x32_sigaltstack
+526 x32 timer_create compat_sys_timer_create
+527 x32 mq_notify compat_sys_mq_notify
+528 x32 kexec_load compat_sys_kexec_load
+529 x32 waitid compat_sys_waitid
+530 x32 set_robust_list compat_sys_set_robust_list
+531 x32 get_robust_list compat_sys_get_robust_list
+532 x32 vmsplice compat_sys_vmsplice
+533 x32 move_pages compat_sys_move_pages
+534 x32 preadv compat_sys_preadv64
+535 x32 pwritev compat_sys_pwritev64
+536 x32 rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
+537 x32 recvmmsg compat_sys_recvmmsg
+538 x32 sendmmsg compat_sys_sendmmsg
+539 x32 process_vm_readv compat_sys_process_vm_readv
+540 x32 process_vm_writev compat_sys_process_vm_writev
diff --git a/arch/x86/um/sys_call_table_64.c b/arch/x86/um/sys_call_table_64.c
index fe626c3..9924776 100644
--- a/arch/x86/um/sys_call_table_64.c
+++ b/arch/x86/um/sys_call_table_64.c
@@ -35,6 +35,9 @@
#define stub_sigaltstack sys_sigaltstack
#define stub_rt_sigreturn sys_rt_sigreturn

+#define __SYSCALL_COMMON(nr, sym, compat) __SYSCALL_64(nr, sym, compat)
+#define __SYSCALL_X32(nr, sym, compat) /* Not supported */
+
#define __SYSCALL_64(nr, sym, compat) extern asmlinkage void sym(void) ;
#include <asm/syscalls_64.h>

diff --git a/arch/x86/um/user-offsets.c b/arch/x86/um/user-offsets.c
index 5edf4f4..ce7e360 100644
--- a/arch/x86/um/user-offsets.c
+++ b/arch/x86/um/user-offsets.c
@@ -15,6 +15,8 @@ static char syscalls[] = {
};
#else
#define __SYSCALL_64(nr, sym, compat) [nr] = 1,
+#define __SYSCALL_COMMON(nr, sym, compat) [nr] = 1,
+#define __SYSCALL_X32(nr, sym, compat) /* Not supported */
static char syscalls[] = {
#include <asm/syscalls_64.h>
};
--
1.7.6.5
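
The three __SYSCALL_* macros work as an X-macro list: the generated header contains one macro call per table row, and each including file decides what a row expands to. The sketch below mimics the asm-offsets_64.c usage with a hypothetical four-row excerpt. The SKETCH_ names are made up, and the macros take two arguments here for brevity where the real ones take three.

```c
#include <assert.h>

/* Hypothetical excerpt standing in for the generated <asm/syscalls_64.h>. */
#define SKETCH_SYSCALL_LIST \
	SKETCH_SYSCALL_COMMON(0, sys_read) \
	SKETCH_SYSCALL_COMMON(1, sys_write) \
	SKETCH_SYSCALL_64(19, sys_readv) \
	SKETCH_SYSCALL_X32(515, compat_sys_readv)

/* "64" and "common" rows mark a slot; "x32" rows expand to nothing. */
#define SKETCH_SYSCALL_64(nr, sym)     [nr] = 1,
#define SKETCH_SYSCALL_COMMON(nr, sym) [nr] = 1,
#define SKETCH_SYSCALL_X32(nr, sym)    /* not yet */

static const char sketch_syscalls_64[] = { SKETCH_SYSCALL_LIST };

/* The array ends at the highest 64-bit slot; the x32 row adds nothing. */
static_assert(sizeof(sketch_syscalls_64) == 20,
	      "x32 rows must not grow the 64-bit table");

/* Report whether a number is populated in the 64-bit table. */
static int sketch_is_64bit_syscall(unsigned int nr)
{
	return nr < sizeof(sketch_syscalls_64) && sketch_syscalls_64[nr] != 0;
}
```

Leaving the x32 macro empty at this stage keeps the 64-bit table the same size as before while the rows are already present in the .tbl file.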

2012-02-20 00:17:08

by H. Peter Anvin

Subject: [PATCH 18/30] x32: Generate <asm/unistd_x32.h>

From: "H. Peter Anvin" <[email protected]>

Generate <asm/unistd_x32.h>; this exports x32 system call numbers to
user space.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/Kbuild | 1 +
arch/x86/include/asm/unistd.h | 7 ++++++-
arch/x86/syscalls/Makefile | 10 ++++++++--
3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/Kbuild b/arch/x86/include/asm/Kbuild
index 986954f..f9c0d3b 100644
--- a/arch/x86/include/asm/Kbuild
+++ b/arch/x86/include/asm/Kbuild
@@ -25,3 +25,4 @@ header-y += vsyscall.h

genhdr-y += unistd_32.h
genhdr-y += unistd_64.h
+genhdr-y += unistd_x32.h
diff --git a/arch/x86/include/asm/unistd.h b/arch/x86/include/asm/unistd.h
index 21f77b8..dab5349 100644
--- a/arch/x86/include/asm/unistd.h
+++ b/arch/x86/include/asm/unistd.h
@@ -1,6 +1,9 @@
#ifndef _ASM_X86_UNISTD_H
#define _ASM_X86_UNISTD_H 1

+/* x32 syscall flag bit */
+#define __X32_SYSCALL_BIT 0x40000000
+
#ifdef __KERNEL__
# ifdef CONFIG_X86_32

@@ -52,8 +55,10 @@
#else
# ifdef __i386__
# include <asm/unistd_32.h>
-# else
+# elif defined(__LP64__)
# include <asm/unistd_64.h>
+# else
+# include <asm/unistd_x32.h>
# endif
#endif

diff --git a/arch/x86/syscalls/Makefile b/arch/x86/syscalls/Makefile
index 89dd958..54bdbd7 100644
--- a/arch/x86/syscalls/Makefile
+++ b/arch/x86/syscalls/Makefile
@@ -11,7 +11,8 @@ systbl := $(srctree)/$(src)/syscalltbl.sh

quiet_cmd_syshdr = SYSHDR $@
cmd_syshdr = $(CONFIG_SHELL) '$(syshdr)' $< $@ \
- $(syshdr_abi_$(basetarget)) $(syshdr_pfx_$(basetarget))
+ $(syshdr_abi_$(basetarget)) $(syshdr_pfx_$(basetarget)) \
+ $(syshdr_offset_$(basetarget))
quiet_cmd_systbl = SYSTBL $@
cmd_systbl = $(CONFIG_SHELL) '$(systbl)' $< $@

@@ -24,6 +25,11 @@ syshdr_pfx_unistd_32_ia32 := ia32_
$(out)/unistd_32_ia32.h: $(syscall32) $(syshdr)
$(call if_changed,syshdr)

+syshdr_abi_unistd_x32 := common,x32
+syshdr_offset_unistd_x32 := __X32_SYSCALL_BIT
+$(out)/unistd_x32.h: $(syscall64) $(syshdr)
+ $(call if_changed,syshdr)
+
syshdr_abi_unistd_64 := common,64
$(out)/unistd_64.h: $(syscall64) $(syshdr)
$(call if_changed,syshdr)
@@ -33,7 +39,7 @@ $(out)/syscalls_32.h: $(syscall32) $(systbl)
$(out)/syscalls_64.h: $(syscall64) $(systbl)
$(call if_changed,systbl)

-syshdr-y += unistd_32.h unistd_64.h
+syshdr-y += unistd_32.h unistd_64.h unistd_x32.h
syshdr-y += syscalls_32.h
syshdr-$(CONFIG_X86_64) += unistd_32_ia32.h
syshdr-$(CONFIG_X86_64) += syscalls_64.h
--
1.7.6.5
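
A note on patch 18: the generated <asm/unistd_x32.h> is described as exporting x32 system call numbers with __X32_SYSCALL_BIT applied as an offset. The following is a minimal user-space sketch of that arithmetic, not the generated header itself. The helper names x32_user_nr() and x32_table_nr() are hypothetical; only the constant 0x40000000 comes from the patch.

```c
#include <assert.h>

/* Mirrors the value of __X32_SYSCALL_BIT defined in the patch. */
#define X32_SYSCALL_BIT 0x40000000u

/*
 * Hypothetical helper: the number an x32 program would place in %eax
 * for the syscall table entry table_nr (the flag is added as an offset).
 */
static unsigned int x32_user_nr(unsigned int table_nr)
{
	return X32_SYSCALL_BIT + table_nr;
}

/* Hypothetical helper: strip the flag to recover the table index. */
static unsigned int x32_table_nr(unsigned int user_nr)
{
	return user_nr & ~X32_SYSCALL_BIT;
}
```

For instance, table entry 539 (process_vm_readv in the x32 range of the table) would be exposed to user space as 0x40000000 + 539 under this scheme.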

2012-02-20 00:17:32

by H. Peter Anvin

Subject: [PATCH 19/30] x32: Generate <asm/unistd_64_x32.h>

From: "H. Peter Anvin" <[email protected]>

Generate macros for the *kernel* code to use to refer to x32 system
calls. These have an __NR_x32_ prefix and do not include
__X32_SYSCALL_BIT.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/unistd.h | 1 +
arch/x86/syscalls/Makefile | 7 ++++++-
2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/unistd.h b/arch/x86/include/asm/unistd.h
index dab5349..7a48a55 100644
--- a/arch/x86/include/asm/unistd.h
+++ b/arch/x86/include/asm/unistd.h
@@ -17,6 +17,7 @@
# else

# include <asm/unistd_64.h>
+# include <asm/unistd_64_x32.h>
# define __ARCH_WANT_COMPAT_SYS_TIME

# endif
diff --git a/arch/x86/syscalls/Makefile b/arch/x86/syscalls/Makefile
index 54bdbd7..414d402 100644
--- a/arch/x86/syscalls/Makefile
+++ b/arch/x86/syscalls/Makefile
@@ -34,6 +34,11 @@ syshdr_abi_unistd_64 := common,64
$(out)/unistd_64.h: $(syscall64) $(syshdr)
$(call if_changed,syshdr)

+syshdr_abi_unistd_64_x32 := x32
+syshdr_pfx_unistd_64_x32 := x32_
+$(out)/unistd_64_x32.h: $(syscall64) $(syshdr)
+ $(call if_changed,syshdr)
+
$(out)/syscalls_32.h: $(syscall32) $(systbl)
$(call if_changed,systbl)
$(out)/syscalls_64.h: $(syscall64) $(systbl)
@@ -41,7 +46,7 @@ $(out)/syscalls_64.h: $(syscall64) $(systbl)

syshdr-y += unistd_32.h unistd_64.h unistd_x32.h
syshdr-y += syscalls_32.h
-syshdr-$(CONFIG_X86_64) += unistd_32_ia32.h
+syshdr-$(CONFIG_X86_64) += unistd_32_ia32.h unistd_64_x32.h
syshdr-$(CONFIG_X86_64) += syscalls_64.h

targets += $(syshdr-y)
--
1.7.6.5

2012-02-20 00:17:59

by H. Peter Anvin

Subject: [PATCH 20/30] x86: Move some signal-handling definitions to a common header

From: "H. Peter Anvin" <[email protected]>

There are some definitions which are duplicated between
kernel/signal.c and ia32/ia32_signal.c; move them to a common header
file.

Rather than adding stuff to existing header files which contain data
structures, create a new header file; hence the slightly odd name
("all the good ones were taken.")

Note: nothing relied on signal_fault() being defined in
<asm/ptrace.h>.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 12 ++----------
arch/x86/include/asm/ptrace.h | 1 -
arch/x86/include/asm/sighandling.h | 19 +++++++++++++++++++
arch/x86/kernel/signal.c | 10 +---------
4 files changed, 22 insertions(+), 20 deletions(-)
create mode 100644 arch/x86/include/asm/sighandling.h

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 6557769..25d80f3 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -12,10 +12,8 @@
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/kernel.h>
-#include <linux/signal.h>
#include <linux/errno.h>
#include <linux/wait.h>
-#include <linux/ptrace.h>
#include <linux/unistd.h>
#include <linux/stddef.h>
#include <linux/personality.h>
@@ -31,16 +29,10 @@
#include <asm/proto.h>
#include <asm/vdso.h>
#include <asm/sigframe.h>
+#include <asm/sighandling.h>
#include <asm/sys_ia32.h>

-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
-
-#define FIX_EFLAGS (X86_EFLAGS_AC | X86_EFLAGS_OF | \
- X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \
- X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \
- X86_EFLAGS_CF)
-
-void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
+#define FIX_EFLAGS __FIX_EFLAGS

int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
{
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 3566454..dcfde52 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -145,7 +145,6 @@ extern unsigned long
convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs);
extern void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs,
int error_code, int si_code);
-void signal_fault(struct pt_regs *regs, void __user *frame, char *where);

extern long syscall_trace_enter(struct pt_regs *);
extern void syscall_trace_leave(struct pt_regs *);
diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
new file mode 100644
index 0000000..843e299
--- /dev/null
+++ b/arch/x86/include/asm/sighandling.h
@@ -0,0 +1,19 @@
+#ifndef _ASM_X86_SIGHANDLING_H
+#define _ASM_X86_SIGHANDLING_H
+
+#include <linux/compiler.h>
+#include <linux/ptrace.h>
+#include <linux/signal.h>
+
+#include <asm/processor-flags.h>
+
+#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
+
+#define __FIX_EFLAGS (X86_EFLAGS_AC | X86_EFLAGS_OF | \
+ X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \
+ X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \
+ X86_EFLAGS_CF)
+
+void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
+
+#endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 46a01bd..c432dc0 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -10,10 +10,8 @@
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/kernel.h>
-#include <linux/signal.h>
#include <linux/errno.h>
#include <linux/wait.h>
-#include <linux/ptrace.h>
#include <linux/tracehook.h>
#include <linux/unistd.h>
#include <linux/stddef.h>
@@ -26,6 +24,7 @@
#include <asm/i387.h>
#include <asm/vdso.h>
#include <asm/mce.h>
+#include <asm/sighandling.h>

#ifdef CONFIG_X86_64
#include <asm/proto.h>
@@ -37,13 +36,6 @@

#include <asm/sigframe.h>

-#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
-
-#define __FIX_EFLAGS (X86_EFLAGS_AC | X86_EFLAGS_OF | \
- X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \
- X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \
- X86_EFLAGS_CF)
-
#ifdef CONFIG_X86_32
# define FIX_EFLAGS (__FIX_EFLAGS | X86_EFLAGS_RF)
#else
--
1.7.6.5
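
A note on patch 20: __FIX_EFLAGS names the EFLAGS bits that user space may change through a signal frame. The sketch below shows how such a mask is typically applied when restoring flags; it is a stand-alone illustration, assuming the architectural x86 bit values, and merge_user_flags() is a hypothetical name rather than a kernel function.

```c
#include <assert.h>

/* Architectural x86 EFLAGS bit values. */
#define EFLAGS_CF 0x00000001ul
#define EFLAGS_PF 0x00000004ul
#define EFLAGS_AF 0x00000010ul
#define EFLAGS_ZF 0x00000040ul
#define EFLAGS_SF 0x00000080ul
#define EFLAGS_TF 0x00000100ul
#define EFLAGS_DF 0x00000400ul
#define EFLAGS_OF 0x00000800ul
#define EFLAGS_AC 0x00040000ul

/* Same set of bits as __FIX_EFLAGS in <asm/sighandling.h>. */
#define FIX_EFLAGS_SKETCH (EFLAGS_AC | EFLAGS_OF | EFLAGS_DF | \
			   EFLAGS_TF | EFLAGS_SF | EFLAGS_ZF | \
			   EFLAGS_AF | EFLAGS_PF | EFLAGS_CF)

/*
 * Hypothetical helper: take only the user-changeable bits from the
 * saved signal context and keep every other bit from the kernel's copy.
 */
static unsigned long merge_user_flags(unsigned long kernel_flags,
				      unsigned long user_flags)
{
	return (kernel_flags & ~FIX_EFLAGS_SKETCH) |
	       (user_flags & FIX_EFLAGS_SKETCH);
}
```

With this mask a forged frame can set CF but cannot, for example, raise IOPL, since the IOPL bits fall outside the mask.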

2012-02-20 00:18:24

by H. Peter Anvin

Subject: [PATCH 21/30] x32: Export setup/restore_sigcontext from signal.c

From: "H. Peter Anvin" <[email protected]>

Export setup_sigcontext() and restore_sigcontext() from signal.c, so
we can use the 64-bit versions verbatim for x32.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/sighandling.h | 5 +++++
arch/x86/kernel/signal.c | 10 ++++------
2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index 843e299..ada93b3 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -16,4 +16,9 @@

void signal_fault(struct pt_regs *regs, void __user *frame, char *where);

+int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
+ unsigned long *pax);
+int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
+ struct pt_regs *regs, unsigned long mask);
+
#endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index c432dc0..450fb25 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -60,9 +60,8 @@
regs->seg = GET_SEG(seg) | 3; \
} while (0)

-static int
-restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
- unsigned long *pax)
+int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
+ unsigned long *pax)
{
void __user *buf;
unsigned int tmpflags;
@@ -117,9 +116,8 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
return err;
}

-static int
-setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
- struct pt_regs *regs, unsigned long mask)
+int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
+ struct pt_regs *regs, unsigned long mask)
{
int err = 0;

--
1.7.6.5

2012-02-20 00:18:49

by H. Peter Anvin

Subject: [PATCH 22/30] x32: Add struct ucontext_x32

From: "H. J. Lu" <[email protected]>

Add a definition for struct ucontext_x32; this is inherently a mix of
the 32- and 64-bit versions.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/ia32.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/ia32.h b/arch/x86/include/asm/ia32.h
index 1f7e625..c6435ab 100644
--- a/arch/x86/include/asm/ia32.h
+++ b/arch/x86/include/asm/ia32.h
@@ -43,6 +43,15 @@ struct ucontext_ia32 {
compat_sigset_t uc_sigmask; /* mask last for extensibility */
};

+struct ucontext_x32 {
+ unsigned int uc_flags;
+ unsigned int uc_link;
+ stack_ia32_t uc_stack;
+ unsigned int uc__pad0; /* needed for alignment */
+ struct sigcontext uc_mcontext; /* the 64-bit sigcontext type */
+ compat_sigset_t uc_sigmask; /* mask last for extensibility */
+};
+
/* This matches struct stat64 in glibc2.2, hence the absolutely
* insane amounts of padding around dev_t's.
*/
--
1.7.6.5
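
A note on patch 22: the uc__pad0 member is commented as "needed for alignment". The sketch below reproduces the leading part of the layout in user space to show where the 64-bit sigcontext lands. The stand-in types are assumptions: stack32_sketch models stack_ia32_t as three 32-bit fields, sigcontext64_sketch stands in for the real struct sigcontext, and the trailing uc_sigmask is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for stack_ia32_t: assumed to be three 32-bit fields. */
struct stack32_sketch {
	uint32_t ss_sp;
	int32_t  ss_flags;
	uint32_t ss_size;
};

/* Stand-in for the 64-bit sigcontext; only its 8-byte members matter. */
struct sigcontext64_sketch {
	uint64_t regs[8];
};

/* Leading members of struct ucontext_x32, in the order used by the patch. */
struct ucontext_x32_sketch {
	uint32_t uc_flags;                      /* offset 0 */
	uint32_t uc_link;                       /* offset 4, a 32-bit pointer */
	struct stack32_sketch uc_stack;         /* offset 8, 12 bytes */
	uint32_t uc__pad0;                      /* offset 20, explicit padding */
	struct sigcontext64_sketch uc_mcontext; /* offset 24, 8-byte aligned */
};
```

Making the padding explicit fixes the offset of uc_mcontext at 24 regardless of how a given compiler aligns 64-bit members, which is presumably the point of the field.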

2012-02-20 00:19:15

by H. Peter Anvin

Subject: [PATCH 23/30] x32: Add rt_sigframe_x32

From: "H. Peter Anvin" <[email protected]>

Add rt_sigframe_x32 to <asm/sigframe.h>. Unfortunately we can't just
define all the data structures unconditionally, due to the #ifdef
CONFIG_COMPAT in <linux/compat.h> and its trickle-down effects, hence
the #ifdef mess.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/sigframe.h | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/sigframe.h b/arch/x86/include/asm/sigframe.h
index 4e0fe26..7c7c27c 100644
--- a/arch/x86/include/asm/sigframe.h
+++ b/arch/x86/include/asm/sigframe.h
@@ -59,12 +59,25 @@ struct rt_sigframe_ia32 {
#endif /* defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION) */

#ifdef CONFIG_X86_64
+
struct rt_sigframe {
char __user *pretcode;
struct ucontext uc;
struct siginfo info;
/* fp state follows here */
};
+
+#ifdef CONFIG_X86_X32_ABI
+
+struct rt_sigframe_x32 {
+ u64 pretcode;
+ struct ucontext_x32 uc;
+ compat_siginfo_t info;
+ /* fp state follows here */
+};
+
+#endif /* CONFIG_X86_X32_ABI */
+
#endif /* CONFIG_X86_64 */

#endif /* _ASM_X86_SIGFRAME_H */
--
1.7.6.5

2012-02-20 00:19:42

by H. Peter Anvin

Subject: [PATCH 24/30] x32: Handle the x32 system call flag

From: "H. Peter Anvin" <[email protected]>

x32 shares most system calls with x86-64, but unfortunately some
subsystems (the input subsystem is the chief offender) require
is_compat_task() when operating with a 32-bit userspace. The input
system actually has text files in sysfs whose meaning is dependent on
sizeof(long) in userspace!

We could solve this by having two completely disjoint system call
tables, requiring that each system call be duplicated. This patch
takes a different approach: we add a flag to the system call number;
this flag doesn't affect the system call dispatch but requests compat
treatment from affected subsystems for the duration of the system call.

The change of cmpq to cmpl is safe since it immediately follows the
andl, which zero-extends its 32-bit result into %rax.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/compat.h | 13 +++++++++++--
arch/x86/include/asm/syscall.h | 5 +++--
arch/x86/include/asm/unistd.h | 7 +++++++
arch/x86/kernel/entry_64.S | 10 ++++++++++
4 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index 30d737e..7938b84 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -7,6 +7,7 @@
#include <linux/types.h>
#include <linux/sched.h>
#include <asm/user32.h>
+#include <asm/unistd.h>

#define COMPAT_USER_HZ 100
#define COMPAT_UTS_MACHINE "i686\0\0"
@@ -212,9 +213,17 @@ static inline void __user *arch_compat_alloc_user_space(long len)
return (void __user *)regs->sp - len;
}

-static inline int is_compat_task(void)
+static inline bool is_compat_task(void)
{
- return current_thread_info()->status & TS_COMPAT;
+#ifdef CONFIG_IA32_EMULATION
+ if (current_thread_info()->status & TS_COMPAT)
+ return true;
+#endif
+#ifdef CONFIG_X86_X32_ABI
+ if (task_pt_regs(current)->orig_ax & __X32_SYSCALL_BIT)
+ return true;
+#endif
+ return false;
}

#endif /* _ASM_X86_COMPAT_H */
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index d962e56..386b786 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -16,6 +16,7 @@
#include <linux/sched.h>
#include <linux/err.h>
#include <asm/asm-offsets.h> /* For NR_syscalls */
+#include <asm/unistd.h>

extern const unsigned long sys_call_table[];

@@ -26,13 +27,13 @@ extern const unsigned long sys_call_table[];
*/
static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
{
- return regs->orig_ax;
+ return regs->orig_ax & __SYSCALL_MASK;
}

static inline void syscall_rollback(struct task_struct *task,
struct pt_regs *regs)
{
- regs->ax = regs->orig_ax;
+ regs->ax = regs->orig_ax & __SYSCALL_MASK;
}

static inline long syscall_get_error(struct task_struct *task,
diff --git a/arch/x86/include/asm/unistd.h b/arch/x86/include/asm/unistd.h
index 7a48a55..37cdc9d 100644
--- a/arch/x86/include/asm/unistd.h
+++ b/arch/x86/include/asm/unistd.h
@@ -5,6 +5,13 @@
#define __X32_SYSCALL_BIT 0x40000000

#ifdef __KERNEL__
+
+# ifdef CONFIG_X86_X32_ABI
+# define __SYSCALL_MASK (~(__X32_SYSCALL_BIT))
+# else
+# define __SYSCALL_MASK (~0)
+# endif
+
# ifdef CONFIG_X86_32

# include <asm/unistd_32.h>
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 3fe8239..a17b342 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -482,7 +482,12 @@ GLOBAL(system_call_after_swapgs)
testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
jnz tracesys
system_call_fastpath:
+#if __SYSCALL_MASK == ~0
cmpq $__NR_syscall_max,%rax
+#else
+ andl $__SYSCALL_MASK,%eax
+ cmpl $__NR_syscall_max,%eax
+#endif
ja badsys
movq %r10,%rcx
call *sys_call_table(,%rax,8) # XXX: rip relative
@@ -596,7 +601,12 @@ tracesys:
*/
LOAD_ARGS ARGOFFSET, 1
RESTORE_REST
+#if __SYSCALL_MASK == ~0
cmpq $__NR_syscall_max,%rax
+#else
+ andl $__SYSCALL_MASK,%eax
+ cmpl $__NR_syscall_max,%eax
+#endif
ja int_ret_from_sys_call /* RAX(%rsp) set to -ENOSYS above */
movq %r10,%rcx /* fixup for C */
call *sys_call_table(,%rax,8)
--
1.7.6.5
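
A note on patch 24: the flag-in-the-syscall-number scheme can be modelled in a few lines of user-space C. This is a sketch of the decode step only, assuming the masking and bounds check behave as in the entry_64.S hunks above. The struct, the function name decode_syscall(), and the limit NR_SYSCALL_MAX_SKETCH are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X32_SYSCALL_BIT       0x40000000u
#define SYSCALL_MASK_SKETCH   (~X32_SYSCALL_BIT)
#define NR_SYSCALL_MAX_SKETCH 540u	/* hypothetical table size limit */

struct decoded_syscall {
	bool     valid;		/* passes the "ja badsys" style bounds check */
	bool     compat;	/* flag bit set: request compat treatment */
	uint32_t nr;		/* index into the shared syscall table */
};

static struct decoded_syscall decode_syscall(uint64_t rax)
{
	struct decoded_syscall d;
	uint32_t eax = (uint32_t)rax;	/* andl works on the low 32 bits */

	d.compat = (eax & X32_SYSCALL_BIT) != 0;
	d.nr     = eax & SYSCALL_MASK_SKETCH;	/* flag does not affect dispatch */
	d.valid  = d.nr <= NR_SYSCALL_MAX_SKETCH;
	return d;
}
```

The same table index is dispatched whether or not the flag is set; only the compat query observes the difference.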

2012-02-20 00:20:18

by H. Peter Anvin

Subject: [PATCH 25/30] x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h>

From: "H. Peter Anvin" <[email protected]>

Unfortunately a lot of the compat types are guarded with CONFIG_COMPAT
or the equivalent, so add a similar guard to <asm/sys_ia32.h> to avoid
compilation failures when CONFIG_COMPAT=n.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/sys_ia32.h | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/sys_ia32.h b/arch/x86/include/asm/sys_ia32.h
index 68da87b..3fda9db4 100644
--- a/arch/x86/include/asm/sys_ia32.h
+++ b/arch/x86/include/asm/sys_ia32.h
@@ -10,6 +10,8 @@
#ifndef _ASM_X86_SYS_IA32_H
#define _ASM_X86_SYS_IA32_H

+#ifdef CONFIG_COMPAT
+
#include <linux/compiler.h>
#include <linux/linkage.h>
#include <linux/types.h>
@@ -81,4 +83,7 @@ asmlinkage long sys32_ipc(u32, int, int, int, compat_uptr_t, u32);

asmlinkage long sys32_fanotify_mark(int, unsigned int, u32, u32, int,
const char __user *);
+
+#endif /* CONFIG_COMPAT */
+
#endif /* _ASM_X86_SYS_IA32_H */
--
1.7.6.5

2012-02-20 00:20:44

by H. Peter Anvin

Subject: [PATCH 26/30] x32: Signal-related system calls

From: "H. Peter Anvin" <[email protected]>

x32 uses the 64-bit signal frame format, obviously, but there are some
structures which mix that with pointers or sizeof(long) types; as
such, we have to create a handful of system calls specific to x32. By
and large these are a mixture of the 64-bit and the compat system
calls.

Originally-by: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/entry_64.S | 19 +++++++
arch/x86/kernel/signal.c | 118 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 136 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index a17b342..53dc821 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -746,6 +746,25 @@ ENTRY(stub_rt_sigreturn)
CFI_ENDPROC
END(stub_rt_sigreturn)

+#ifdef CONFIG_X86_X32_ABI
+ PTREGSCALL stub_x32_sigaltstack, sys32_sigaltstack, %rdx
+
+ENTRY(stub_x32_rt_sigreturn)
+ CFI_STARTPROC
+ addq $8, %rsp
+ PARTIAL_FRAME 0
+ SAVE_REST
+ movq %rsp,%rdi
+ FIXUP_TOP_OF_STACK %r11
+ call sys32_x32_rt_sigreturn
+ movq %rax,RAX(%rsp) # fixme, this could be done at the higher layer
+ RESTORE_REST
+ jmp int_ret_from_sys_call
+ CFI_ENDPROC
+END(stub_x32_rt_sigreturn)
+
+#endif
+
/*
* Build the entry stubs and pointer table with some assembler magic.
* We pack 7 stubs into a single 32-byte chunk, which will fit in a
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 450fb25..c3846b6 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -29,6 +29,7 @@
#ifdef CONFIG_X86_64
#include <asm/proto.h>
#include <asm/ia32_unistd.h>
+#include <asm/sys_ia32.h>
#endif /* CONFIG_X86_64 */

#include <asm/syscall.h>
@@ -632,6 +633,16 @@ static int signr_convert(int sig)
#define is_ia32 0
#endif /* CONFIG_IA32_EMULATION */

+#ifdef CONFIG_X86_X32_ABI
+#define is_x32 test_thread_flag(TIF_X32)
+
+static int x32_setup_rt_frame(int sig, struct k_sigaction *ka,
+ siginfo_t *info, compat_sigset_t *set,
+ struct pt_regs *regs);
+#else /* !CONFIG_X86_X32_ABI */
+#define is_x32 0
+#endif /* CONFIG_X86_X32_ABI */
+
int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
sigset_t *set, struct pt_regs *regs);
int ia32_setup_frame(int sig, struct k_sigaction *ka,
@@ -656,8 +667,14 @@ setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
ret = ia32_setup_rt_frame(usig, ka, info, set, regs);
else
ret = ia32_setup_frame(usig, ka, set, regs);
- } else
+#ifdef CONFIG_X86_X32_ABI
+ } else if (is_x32) {
+ ret = x32_setup_rt_frame(usig, ka, info,
+ (compat_sigset_t *)set, regs);
+#endif
+ } else {
ret = __setup_rt_frame(sig, ka, info, set, regs);
+ }

if (ret) {
force_sigsegv(sig, current);
@@ -840,3 +857,102 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where)

force_sig(SIGSEGV, me);
}
+
+#ifdef CONFIG_X86_X32_ABI
+static int x32_setup_rt_frame(int sig, struct k_sigaction *ka,
+ siginfo_t *info, compat_sigset_t *set,
+ struct pt_regs *regs)
+{
+ struct rt_sigframe_x32 __user *frame;
+ void __user *restorer;
+ int err = 0;
+ void __user *fpstate = NULL;
+
+ frame = get_sigframe(ka, regs, sizeof(*frame), &fpstate);
+
+ if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
+ return -EFAULT;
+
+ if (ka->sa.sa_flags & SA_SIGINFO) {
+ if (copy_siginfo_to_user32(&frame->info, info))
+ return -EFAULT;
+ }
+
+ put_user_try {
+ /* Create the ucontext. */
+ if (cpu_has_xsave)
+ put_user_ex(UC_FP_XSTATE, &frame->uc.uc_flags);
+ else
+ put_user_ex(0, &frame->uc.uc_flags);
+ put_user_ex(0, &frame->uc.uc_link);
+ put_user_ex(current->sas_ss_sp, &frame->uc.uc_stack.ss_sp);
+ put_user_ex(sas_ss_flags(regs->sp),
+ &frame->uc.uc_stack.ss_flags);
+ put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
+ put_user_ex(0, &frame->uc.uc__pad0);
+ err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+ regs, set->sig[0]);
+ err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
+ if (ka->sa.sa_flags & SA_RESTORER) {
+ restorer = ka->sa.sa_restorer;
+ } else {
+ /* could use a vstub here */
+ restorer = NULL;
+ err |= -EFAULT;
+ }
+ put_user_ex(restorer, &frame->pretcode);
+ } put_user_catch(err);
+
+ if (err)
+ return -EFAULT;
+
+ /* Set up registers for signal handler */
+ regs->sp = (unsigned long) frame;
+ regs->ip = (unsigned long) ka->sa.sa_handler;
+
+ /* We use the x32 calling convention here... */
+ regs->di = sig;
+ regs->si = (unsigned long) &frame->info;
+ regs->dx = (unsigned long) &frame->uc;
+
+ loadsegment(ds, __USER_DS);
+ loadsegment(es, __USER_DS);
+
+ regs->cs = __USER_CS;
+ regs->ss = __USER_DS;
+
+ return 0;
+}
+
+asmlinkage long sys32_x32_rt_sigreturn(struct pt_regs *regs)
+{
+ struct rt_sigframe_x32 __user *frame;
+ sigset_t set;
+ unsigned long ax;
+ struct pt_regs tregs;
+
+ frame = (struct rt_sigframe_x32 __user *)(regs->sp - 8);
+
+ if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
+ goto badframe;
+ if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
+ goto badframe;
+
+ sigdelsetmask(&set, ~_BLOCKABLE);
+ set_current_blocked(&set);
+
+ if (restore_sigcontext(regs, &frame->uc.uc_mcontext, &ax))
+ goto badframe;
+
+ tregs = *regs;
+ if (sys32_sigaltstack(&frame->uc.uc_stack, NULL, &tregs) == -EFAULT)
+ goto badframe;
+
+ return ax;
+
+badframe:
+ signal_fault(regs, frame, "x32 rt_sigreturn");
+ return 0;
+}
+#endif
--
1.7.6.5
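
A note on patch 26: sys32_x32_rt_sigreturn() locates the frame at regs->sp - 8. A plausible reading, sketched below, is that the handler's return pops the 8-byte pretcode slot at the start of rt_sigframe_x32, so the stack pointer seen at sigreturn time is 8 bytes past the frame. The helper names are hypothetical and the code only models the address arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* pretcode is declared as u64 in struct rt_sigframe_x32. */
#define PRETCODE_SIZE ((uint64_t)sizeof(uint64_t))

/*
 * Hypothetical helper: stack pointer after the signal handler executes
 * a 64-bit "ret", which consumes the pretcode slot at the frame start.
 */
static uint64_t sp_after_handler_ret(uint64_t frame_addr)
{
	return frame_addr + PRETCODE_SIZE;
}

/* Hypothetical helper: the inverse step performed at sigreturn time. */
static uint64_t frame_from_sp(uint64_t sp)
{
	return sp - PRETCODE_SIZE;
}
```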

2012-02-20 00:21:18

by H. Peter Anvin

Subject: [PATCH 27/30] x32: Handle process creation

From: "H. Peter Anvin" <[email protected]>

Allow an x32 process to be started.

Originally-by: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
arch/x86/include/asm/compat.h | 26 ++++++++++++++++++++++++--
arch/x86/include/asm/elf.h | 25 +++++++++++++++++++++----
arch/x86/kernel/cpu/perf_event.c | 4 +++-
arch/x86/kernel/entry_64.S | 15 +++++++++++++++
arch/x86/kernel/process_64.c | 23 ++++++++++++++++-------
5 files changed, 79 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index 7938b84..e7f68b4 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -6,6 +6,7 @@
*/
#include <linux/types.h>
#include <linux/sched.h>
+#include <asm/processor.h>
#include <asm/user32.h>
#include <asm/unistd.h>

@@ -187,7 +188,20 @@ struct compat_shmid64_ds {
/*
* The type of struct elf_prstatus.pr_reg in compatible core dumps.
*/
+#ifdef CONFIG_X86_X32_ABI
+typedef struct user_regs_struct compat_elf_gregset_t;
+
+#define PR_REG_SIZE(S) (test_thread_flag(TIF_IA32) ? 68 : 216)
+#define PRSTATUS_SIZE(S) (test_thread_flag(TIF_IA32) ? 144 : 296)
+#define SET_PR_FPVALID(S,V) \
+ do { *(int *) (((void *) &((S)->pr_reg)) + PR_REG_SIZE(0)) = (V); } \
+ while (0)
+
+#define COMPAT_USE_64BIT_TIME \
+ (!!(task_pt_regs(current)->orig_ax & __X32_SYSCALL_BIT))
+#else
typedef struct user_regs_struct32 compat_elf_gregset_t;
+#endif

/*
* A pointer passed in from user mode. This should not
@@ -209,8 +223,16 @@ static inline compat_uptr_t ptr_to_compat(void __user *uptr)

static inline void __user *arch_compat_alloc_user_space(long len)
{
- struct pt_regs *regs = task_pt_regs(current);
- return (void __user *)regs->sp - len;
+ compat_uptr_t sp;
+
+ if (test_thread_flag(TIF_IA32)) {
+ sp = task_pt_regs(current)->sp;
+ } else {
+ /* -128 for the x32 ABI redzone */
+ sp = percpu_read(old_rsp) - 128;
+ }
+
+ return (void __user *)round_down(sp - len, 16);
}

static inline bool is_compat_task(void)
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 410fa6a..83aabea 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -156,7 +156,12 @@ do { \
#define elf_check_arch(x) \
((x)->e_machine == EM_X86_64)

-#define compat_elf_check_arch(x) elf_check_arch_ia32(x)
+#define compat_elf_check_arch(x) \
+ (elf_check_arch_ia32(x) || (x)->e_machine == EM_X86_64)
+
+#if __USER32_DS != __USER_DS
+# error "The following code assumes __USER32_DS == __USER_DS"
+#endif

static inline void elf_common_init(struct thread_struct *t,
struct pt_regs *regs, const u16 ds)
@@ -179,8 +184,9 @@ static inline void elf_common_init(struct thread_struct *t,
void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
#define compat_start_thread start_thread_ia32

-void set_personality_ia32(void);
-#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32()
+void set_personality_ia32(bool);
+#define COMPAT_SET_PERSONALITY(ex) \
+ set_personality_ia32((ex).e_machine == EM_X86_64)

#define COMPAT_ELF_PLATFORM ("i686")

@@ -296,9 +302,20 @@ do { \
(unsigned long)current->mm->context.vdso); \
} while (0)

+#define ARCH_DLINFO_X32 \
+do { \
+ if (vdso_enabled) \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, \
+ (unsigned long)current->mm->context.vdso); \
+} while (0)
+
#define AT_SYSINFO 32

-#define COMPAT_ARCH_DLINFO ARCH_DLINFO_IA32(sysctl_vsyscall32)
+#define COMPAT_ARCH_DLINFO \
+if (test_thread_flag(TIF_X32)) \
+ ARCH_DLINFO_X32; \
+else \
+ ARCH_DLINFO_IA32(sysctl_vsyscall32)

#define COMPAT_ELF_ET_DYN_BASE (TASK_UNMAPPED_BASE + 0x1000000)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 5adce10..63c0e05 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -28,7 +28,6 @@
#include <asm/apic.h>
#include <asm/stacktrace.h>
#include <asm/nmi.h>
-#include <asm/compat.h>
#include <asm/smp.h>
#include <asm/alternative.h>

@@ -1595,6 +1594,9 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
}

#ifdef CONFIG_COMPAT
+
+#include <asm/compat.h>
+
static inline int
perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
{
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 53dc821..9e036f0 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -763,6 +763,21 @@ ENTRY(stub_x32_rt_sigreturn)
CFI_ENDPROC
END(stub_x32_rt_sigreturn)

+ENTRY(stub_x32_execve)
+ CFI_STARTPROC
+ addq $8, %rsp
+ PARTIAL_FRAME 0
+ SAVE_REST
+ FIXUP_TOP_OF_STACK %r11
+ movq %rsp, %rcx
+ call sys32_execve
+ RESTORE_TOP_OF_STACK %r11
+ movq %rax,RAX(%rsp)
+ RESTORE_REST
+ jmp int_ret_from_sys_call
+ CFI_ENDPROC
+END(stub_x32_execve)
+
#endif

/*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 5fe2fba..a0701da 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -364,7 +364,9 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp)
{
start_thread_common(regs, new_ip, new_sp,
- __USER32_CS, __USER32_DS, __USER32_DS);
+ test_thread_flag(TIF_X32)
+ ? __USER_CS : __USER32_CS,
+ __USER_DS, __USER_DS);
}
#endif

@@ -508,6 +510,7 @@ void set_personality_64bit(void)

/* Make sure to be in 64bit mode */
clear_thread_flag(TIF_IA32);
+ clear_thread_flag(TIF_X32);
clear_thread_flag(TIF_ADDR32);
clear_thread_flag(TIF_X32);

@@ -522,22 +525,28 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}

-void set_personality_ia32(void)
+void set_personality_ia32(bool x32)
{
/* inherit personality from parent */

/* Make sure to be in 32bit mode */
- set_thread_flag(TIF_IA32);
set_thread_flag(TIF_ADDR32);
- clear_thread_flag(TIF_X32);
- current->personality |= force_personality32;

/* Mark the associated mm as containing 32-bit tasks. */
if (current->mm)
current->mm->context.ia32_compat = 1;

- /* Prepare the first "return" to user space */
- current_thread_info()->status |= TS_COMPAT;
+ if (x32) {
+ clear_thread_flag(TIF_IA32);
+ set_thread_flag(TIF_X32);
+ current->personality &= ~READ_IMPLIES_EXEC;
+ } else {
+ set_thread_flag(TIF_IA32);
+ clear_thread_flag(TIF_X32);
+ current->personality |= force_personality32;
+ /* Prepare the first "return" to user space */
+ current_thread_info()->status |= TS_COMPAT;
+ }
}

unsigned long get_wchan(struct task_struct *p)
--
1.7.6.5
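
A note on patch 27: the reworked arch_compat_alloc_user_space() skips the 128-byte red zone for x32 and rounds the result down to a 16-byte boundary. The sketch below models just that arithmetic in user space. compat_alloc_sketch() and its parameters are hypothetical; the real function reads the stack pointer from per-task or per-CPU state rather than taking it as an argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Red zone below the stack pointer, as named in the patch comment. */
#define X32_REDZONE 128u

/* Equivalent of round_down(v, 16) for a power-of-two alignment. */
static uint32_t round_down_16(uint32_t v)
{
	return v & ~(uint32_t)15;
}

/*
 * Hypothetical model: pick a scratch area of len bytes below the user
 * stack pointer. ia32 tasks have no red zone; x32 tasks do.
 */
static uint32_t compat_alloc_sketch(uint32_t user_sp, uint32_t len,
				    bool is_ia32)
{
	uint32_t sp = is_ia32 ? user_sp : user_sp - X32_REDZONE;

	return round_down_16(sp - len);
}
```

For a stack pointer of 0x1000 and a 20-byte request, the x32 path starts from 0xF80 and the ia32 path from 0x1000 before the length is subtracted and the result aligned.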

2012-02-20 00:21:37

by H. Peter Anvin

Subject: [PATCH 28/30] x32: If configured, add x32 system calls to system call tables

From: "H. Peter Anvin" <[email protected]>

If CONFIG_X86_X32_ABI is defined, add the x32 system calls to the
system call tables.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/asm-offsets_64.c | 6 +++++-
arch/x86/kernel/syscall_64.c | 7 ++++++-
2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index c3354f7..1b4754f 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -2,7 +2,11 @@

#define __SYSCALL_64(nr, sym, compat) [nr] = 1,
#define __SYSCALL_COMMON(nr, sym, compat) [nr] = 1,
-#define __SYSCALL_X32(nr, sym, compat) /* Not yet */
+#ifdef CONFIG_X86_X32_ABI
+# define __SYSCALL_X32(nr, sym, compat) [nr] = 1,
+#else
+# define __SYSCALL_X32(nr, sym, compat) /* nothing */
+#endif
static char syscalls_64[] = {
#include <asm/syscalls_64.h>
};
diff --git a/arch/x86/kernel/syscall_64.c b/arch/x86/kernel/syscall_64.c
index 26c4ca1..5c7f8c2 100644
--- a/arch/x86/kernel/syscall_64.c
+++ b/arch/x86/kernel/syscall_64.c
@@ -6,7 +6,12 @@
#include <asm/asm-offsets.h>

#define __SYSCALL_COMMON(nr, sym, compat) __SYSCALL_64(nr, sym, compat)
-#define __SYSCALL_X32(nr, sym, compat) /* Not yet */
+
+#ifdef CONFIG_X86_X32_ABI
+# define __SYSCALL_X32(nr, sym, compat) __SYSCALL_64(nr, sym, compat)
+#else
+# define __SYSCALL_X32(nr, sym, compat) /* nothing */
+#endif

#define __SYSCALL_64(nr, sym, compat) extern asmlinkage void sym(void) ;
#include <asm/syscalls_64.h>
--
1.7.6.5
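
A note on patch 28: the table is built by redefining per-ABI macros before including a generated list, so that x32 entries either expand to table slots or to nothing. The following stand-alone sketch shows the same technique with a three-entry list. Everything suffixed _SKETCH or _sketch is hypothetical, and the inline list stands in for the generated <asm/syscalls_64.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list standing in for the generated syscall header. */
#define SYSCALL_LIST_SKETCH \
	SYSCALL_COMMON_SKETCH(0, sys_common_sketch) \
	SYSCALL_64_SKETCH(1, sys_only64_sketch) \
	SYSCALL_X32_SKETCH(2, sys_onlyx32_sketch)

/* Stand-in for CONFIG_X86_X32_ABI; set to 0 to leave slot 2 empty. */
#define ENABLE_X32_SKETCH 1

typedef long (*syscall_fn)(void);

static long sys_common_sketch(void)  { return 100; }
static long sys_only64_sketch(void)  { return 101; }
static long sys_onlyx32_sketch(void) { return 102; }

/* Each entry becomes a designated initializer for its table slot. */
#define SYSCALL_64_SKETCH(nr, sym)     [nr] = sym,
#define SYSCALL_COMMON_SKETCH(nr, sym) SYSCALL_64_SKETCH(nr, sym)
#if ENABLE_X32_SKETCH
# define SYSCALL_X32_SKETCH(nr, sym)   SYSCALL_64_SKETCH(nr, sym)
#else
# define SYSCALL_X32_SKETCH(nr, sym)   /* nothing */
#endif

/* Slots that no macro fills stay NULL. */
static const syscall_fn table_sketch[3] = {
	SYSCALL_LIST_SKETCH
};
```

With ENABLE_X32_SKETCH set to 0, the sys_onlyx32_sketch entry is never referenced and its slot remains NULL, which mirrors the "nothing" expansion in the patch.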

2012-02-20 00:22:04

by H. Peter Anvin

Subject: [PATCH 29/30] x32: Allow x32 to be configured

From: "H. J. Lu" <[email protected]>

At this point, one should be able to build an x32 kernel.

Note that for now we depend on CONFIG_IA32_EMULATION. Long term, x32
and IA32 should be disentangled.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/Kconfig | 21 +++++++++++++++++----
1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5bed94e..c9d6c9e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2165,9 +2165,9 @@ config IA32_EMULATION
depends on X86_64
select COMPAT_BINFMT_ELF
---help---
- Include code to run 32-bit programs under a 64-bit kernel. You should
- likely turn this on, unless you're 100% sure that you don't have any
- 32-bit programs left.
+ Include code to run legacy 32-bit programs under a
+ 64-bit kernel. You should likely turn this on, unless you're
+ 100% sure that you don't have any 32-bit programs left.

config IA32_AOUT
tristate "IA32 a.out support"
@@ -2175,9 +2175,22 @@ config IA32_AOUT
---help---
Support old a.out binaries in the 32bit emulation.

+config X86_X32_ABI
+ bool "x32 ABI for 64-bit mode (EXPERIMENTAL)"
+ depends on X86_64 && IA32_EMULATION && EXPERIMENTAL
+ ---help---
+ Include code to run binaries for the x32 native 32-bit ABI
+ for 64-bit processors. An x32 process gets access to the
+ full 64-bit register file and wide data path while leaving
+ pointers at 32 bits for smaller memory footprint.
+
+ You will need a recent binutils (2.22 or later) with
+ elf32_x86_64 support enabled to compile a kernel with this
+ option set.
+
config COMPAT
def_bool y
- depends on IA32_EMULATION
+ depends on IA32_EMULATION || X86_X32_ABI

config COMPAT_FOR_U64_ALIGNMENT
def_bool COMPAT
--
1.7.6.5
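
The binutils requirement in the help text comes from the container format: an x32 object is a 32-bit ELF file whose machine type is still x86-64. A rough sketch of how the three x86 userspace ABIs can be told apart from the ELF header alone is shown below. It assumes a Linux host with <elf.h>; the demo_* names are hypothetical and this is not the kernel's own check.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

enum demo_abi {
	DEMO_ABI_UNKNOWN,
	DEMO_ABI_I386,
	DEMO_ABI_X32,
	DEMO_ABI_X86_64
};

/*
 * Classify by ELF class and machine type. e_ident and e_machine sit at
 * the same offsets in Elf32_Ehdr and Elf64_Ehdr, so the 32-bit header
 * view is enough for this test.
 */
static enum demo_abi demo_classify(const Elf32_Ehdr *hdr)
{
	assert(hdr != NULL);
	if (memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0)
		return DEMO_ABI_UNKNOWN;

	switch (hdr->e_ident[EI_CLASS]) {
	case ELFCLASS32:
		if (hdr->e_machine == EM_386)
			return DEMO_ABI_I386;
		if (hdr->e_machine == EM_X86_64)
			return DEMO_ABI_X32;	/* 32-bit container, 64-bit ISA */
		break;
	case ELFCLASS64:
		if (hdr->e_machine == EM_X86_64)
			return DEMO_ABI_X86_64;
		break;
	}
	return DEMO_ABI_UNKNOWN;
}

/* Build a minimal header with the given class/machine and classify it. */
static enum demo_abi demo_abi_of(unsigned char elf_class, Elf32_Half machine)
{
	Elf32_Ehdr hdr;

	memset(&hdr, 0, sizeof(hdr));
	memcpy(hdr.e_ident, ELFMAG, SELFMAG);
	hdr.e_ident[EI_CLASS] = elf_class;
	hdr.e_machine = machine;
	return demo_classify(&hdr);
}
```

For example, demo_abi_of(ELFCLASS32, EM_X86_64) yields DEMO_ABI_X32, while the same machine type with ELFCLASS64 yields DEMO_ABI_X86_64.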

2012-02-20 00:22:30

by H. Peter Anvin

Subject: [PATCH 30/30] x32: Add x32 VDSO support

From: "H. J. Lu" <[email protected]>

Add support for the x32 VDSO. Because the x86-64 and x32 ABIs are so
similar, the x32 VDSO contains the same code as the x86-64 VDSO; only
the container differs, since the x32 VDSO has to be an x32 shared
object.

Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/vdso/.gitignore | 2 +
arch/x86/vdso/Makefile | 46 ++++++++++++++++++++++++-
arch/x86/vdso/vdso32-setup.c | 6 +++
arch/x86/vdso/vdsox32.S | 22 ++++++++++++
arch/x86/vdso/vdsox32.lds.S | 32 +++++++++++++++++
arch/x86/vdso/vma.c | 78 +++++++++++++++++++++++++++++++++++++----
6 files changed, 177 insertions(+), 9 deletions(-)
create mode 100644 arch/x86/vdso/vdsox32.S
create mode 100644 arch/x86/vdso/vdsox32.lds.S

diff --git a/arch/x86/vdso/.gitignore b/arch/x86/vdso/.gitignore
index 60274d5..3282874 100644
--- a/arch/x86/vdso/.gitignore
+++ b/arch/x86/vdso/.gitignore
@@ -1,5 +1,7 @@
vdso.lds
vdso-syms.lds
+vdsox32.lds
+vdsox32-syms.lds
vdso32-syms.lds
vdso32-syscall-syms.lds
vdso32-sysenter-syms.lds
diff --git a/arch/x86/vdso/Makefile b/arch/x86/vdso/Makefile
index 5d17950..fd14be1 100644
--- a/arch/x86/vdso/Makefile
+++ b/arch/x86/vdso/Makefile
@@ -3,21 +3,29 @@
#

VDSO64-$(CONFIG_X86_64) := y
+VDSOX32-$(CONFIG_X86_X32_ABI) := y
VDSO32-$(CONFIG_X86_32) := y
VDSO32-$(CONFIG_COMPAT) := y

vdso-install-$(VDSO64-y) += vdso.so
+vdso-install-$(VDSOX32-y) += vdsox32.so
vdso-install-$(VDSO32-y) += $(vdso32-images)


# files to link into the vdso
vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o

+vobjs-$(VDSOX32-y) += $(vobjx32s-compat)
+
+# Filter out x32 objects.
+vobj64s := $(filter-out $(vobjx32s-compat),$(vobjs-y))
+
# files to link into kernel
obj-$(VDSO64-y) += vma.o vdso.o
+obj-$(VDSOX32-y) += vdsox32.o
obj-$(VDSO32-y) += vdso32.o vdso32-setup.o

-vobjs := $(foreach F,$(vobjs-y),$(obj)/$F)
+vobjs := $(foreach F,$(vobj64s),$(obj)/$F)

$(obj)/vdso.o: $(obj)/vdso.so

@@ -73,6 +81,42 @@ $(obj)/%-syms.lds: $(obj)/%.so.dbg FORCE
$(call if_changed,vdsosym)

#
+# X32 processes use x32 vDSO to access 64bit kernel data.
+#
+# Build x32 vDSO image:
+# 1. Compile x32 vDSO as 64bit.
+# 2. Convert object files to x32.
+# 3. Build x32 VDSO image with x32 objects, which contain 64-bit code
+# so that it can reach 64-bit address space with 64-bit pointers.
+#
+
+targets += vdsox32-syms.lds
+obj-$(VDSOX32-y) += vdsox32-syms.lds
+
+CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds)
+VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \
+ -Wl,-soname=linux-vdso.so.1 \
+ -Wl,-z,max-page-size=4096 \
+ -Wl,-z,common-page-size=4096
+
+vobjx32s-y := $(vobj64s:.o=-x32.o)
+vobjx32s := $(foreach F,$(vobjx32s-y),$(obj)/$F)
+
+# Convert 64bit object file to x32 for x32 vDSO.
+quiet_cmd_x32 = X32 $@
+ cmd_x32 = $(OBJCOPY) -O elf32-x86-64 $< $@
+
+$(obj)/%-x32.o: $(obj)/%.o FORCE
+ $(call if_changed,x32)
+
+targets += vdsox32.so vdsox32.so.dbg vdsox32.lds $(vobjx32s-y)
+
+$(obj)/vdsox32.o: $(src)/vdsox32.S $(obj)/vdsox32.so
+
+$(obj)/vdsox32.so.dbg: $(src)/vdsox32.lds $(vobjx32s) FORCE
+ $(call if_changed,vdso)
+
+#
# Build multiple 32-bit vDSO images to choose from at boot time.
#
obj-$(VDSO32-y) += vdso32-syms.lds
diff --git a/arch/x86/vdso/vdso32-setup.c b/arch/x86/vdso/vdso32-setup.c
index 468d591..01b8a0d 100644
--- a/arch/x86/vdso/vdso32-setup.c
+++ b/arch/x86/vdso/vdso32-setup.c
@@ -317,6 +317,12 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
int ret = 0;
bool compat;

+#ifdef CONFIG_X86_X32_ABI
+ extern int x32_setup_additional_pages(struct linux_binprm *, int);
+ if (test_thread_flag(TIF_X32))
+ return x32_setup_additional_pages (bprm, uses_interp);
+#endif
+
if (vdso_enabled == VDSO_DISABLED)
return 0;

diff --git a/arch/x86/vdso/vdsox32.S b/arch/x86/vdso/vdsox32.S
new file mode 100644
index 0000000..d6b9a7f
--- /dev/null
+++ b/arch/x86/vdso/vdsox32.S
@@ -0,0 +1,22 @@
+#include <asm/page_types.h>
+#include <linux/linkage.h>
+#include <linux/init.h>
+
+__PAGE_ALIGNED_DATA
+
+ .globl vdsox32_start, vdsox32_end
+ .align PAGE_SIZE
+vdsox32_start:
+ .incbin "arch/x86/vdso/vdsox32.so"
+vdsox32_end:
+ .align PAGE_SIZE /* extra data here leaks to userspace. */
+
+.previous
+
+ .globl vdsox32_pages
+ .bss
+ .align 8
+ .type vdsox32_pages, @object
+vdsox32_pages:
+ .zero (vdsox32_end - vdsox32_start + PAGE_SIZE - 1) / PAGE_SIZE * 8
+ .size vdsox32_pages, .-vdsox32_pages
diff --git a/arch/x86/vdso/vdsox32.lds.S b/arch/x86/vdso/vdsox32.lds.S
new file mode 100644
index 0000000..373ca9a
--- /dev/null
+++ b/arch/x86/vdso/vdsox32.lds.S
@@ -0,0 +1,32 @@
+/*
+ * Linker script for x32 vDSO.
+ * We #include the file to define the layout details.
+ * Here we only choose the prelinked virtual address.
+ *
+ * This file defines the version script giving the user-exported symbols in
+ * the DSO. We can define local symbols here called VDSO* to make their
+ * values visible using the asm-x86/vdso.h macros from the kernel proper.
+ */
+
+#define VDSO_PRELINK 0
+#include "vdso-layout.lds.S"
+
+/*
+ * This controls what userland symbols we export from the vDSO.
+ */
+VERSION {
+ LINUX_2.6 {
+ global:
+ clock_gettime;
+ __vdso_clock_gettime;
+ gettimeofday;
+ __vdso_gettimeofday;
+ getcpu;
+ __vdso_getcpu;
+ time;
+ __vdso_time;
+ local: *;
+ };
+}
+
+VDSOX32_PRELINK = VDSO_PRELINK;
diff --git a/arch/x86/vdso/vma.c b/arch/x86/vdso/vma.c
index 153407c..1bbcc62 100644
--- a/arch/x86/vdso/vma.c
+++ b/arch/x86/vdso/vma.c
@@ -24,7 +24,44 @@ extern unsigned short vdso_sync_cpuid;
extern struct page *vdso_pages[];
static unsigned vdso_size;

-static void __init patch_vdso(void *vdso, size_t len)
+#ifdef CONFIG_X86_X32_ABI
+extern char vdsox32_start[], vdsox32_end[];
+extern struct page *vdsox32_pages[];
+static unsigned vdsox32_size;
+
+static void __init patch_vdsox32(void *vdso, size_t len)
+{
+ Elf32_Ehdr *hdr = vdso;
+ Elf32_Shdr *sechdrs, *alt_sec = 0;
+ char *secstrings;
+ void *alt_data;
+ int i;
+
+ BUG_ON(len < sizeof(Elf32_Ehdr));
+ BUG_ON(memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0);
+
+ sechdrs = (void *)hdr + hdr->e_shoff;
+ secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+ for (i = 1; i < hdr->e_shnum; i++) {
+ Elf32_Shdr *shdr = &sechdrs[i];
+ if (!strcmp(secstrings + shdr->sh_name, ".altinstructions")) {
+ alt_sec = shdr;
+ goto found;
+ }
+ }
+
+ /* If we get here, it's probably a bug. */
+ pr_warning("patch_vdsox32: .altinstructions not found\n");
+ return; /* nothing to patch */
+
+found:
+ alt_data = (void *)hdr + alt_sec->sh_offset;
+ apply_alternatives(alt_data, alt_data + alt_sec->sh_size);
+}
+#endif
+
+static void __init patch_vdso64(void *vdso, size_t len)
{
Elf64_Ehdr *hdr = vdso;
Elf64_Shdr *sechdrs, *alt_sec = 0;
@@ -47,7 +84,7 @@ static void __init patch_vdso(void *vdso, size_t len)
}

/* If we get here, it's probably a bug. */
- pr_warning("patch_vdso: .altinstructions not found\n");
+ pr_warning("patch_vdso64: .altinstructions not found\n");
return; /* nothing to patch */

found:
@@ -60,12 +97,20 @@ static int __init init_vdso(void)
int npages = (vdso_end - vdso_start + PAGE_SIZE - 1) / PAGE_SIZE;
int i;

- patch_vdso(vdso_start, vdso_end - vdso_start);
+ patch_vdso64(vdso_start, vdso_end - vdso_start);

vdso_size = npages << PAGE_SHIFT;
for (i = 0; i < npages; i++)
vdso_pages[i] = virt_to_page(vdso_start + i*PAGE_SIZE);

+#ifdef CONFIG_X86_X32_ABI
+ patch_vdsox32(vdsox32_start, vdsox32_end - vdsox32_start);
+ npages = (vdsox32_end - vdsox32_start + PAGE_SIZE - 1) / PAGE_SIZE;
+ vdsox32_size = npages << PAGE_SHIFT;
+ for (i = 0; i < npages; i++)
+ vdsox32_pages[i] = virt_to_page(vdsox32_start + i*PAGE_SIZE);
+#endif
+
return 0;
}
subsys_initcall(init_vdso);
@@ -103,7 +148,10 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)

/* Setup a VMA at program startup for the vsyscall page.
Not called for compat tasks */
-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+static int setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp,
+ struct page **pages,
+ unsigned size)
{
struct mm_struct *mm = current->mm;
unsigned long addr;
@@ -113,8 +161,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
return 0;

down_write(&mm->mmap_sem);
- addr = vdso_addr(mm->start_stack, vdso_size);
- addr = get_unmapped_area(NULL, addr, vdso_size, 0, 0);
+ addr = vdso_addr(mm->start_stack, size);
+ addr = get_unmapped_area(NULL, addr, size, 0, 0);
if (IS_ERR_VALUE(addr)) {
ret = addr;
goto up_fail;
@@ -122,11 +170,11 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)

current->mm->context.vdso = (void *)addr;

- ret = install_special_mapping(mm, addr, vdso_size,
+ ret = install_special_mapping(mm, addr, size,
VM_READ|VM_EXEC|
VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC|
VM_ALWAYSDUMP,
- vdso_pages);
+ pages);
if (ret) {
current->mm->context.vdso = NULL;
goto up_fail;
@@ -137,6 +185,20 @@ up_fail:
return ret;
}

+int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+{
+ return setup_additional_pages (bprm, uses_interp, vdso_pages,
+ vdso_size);
+}
+
+#ifdef CONFIG_X86_X32_ABI
+int x32_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+{
+ return setup_additional_pages (bprm, uses_interp, vdsox32_pages,
+ vdsox32_size);
+}
+#endif
+
static __init int vdso_setup(char *s)
{
vdso_enabled = simple_strtoul(s, NULL, 0);
--
1.7.6.5
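
Both init_vdso() and the .zero directive in vdsox32.S use the same round-up-to-whole-pages arithmetic to size the mapping and the page pointer array. A small sketch of that calculation, assuming a 4096-byte page; the demo_* names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumed page size */

/* Pages needed to hold image_len bytes, rounding up. */
static size_t demo_vdso_npages(size_t image_len)
{
	return (image_len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/* Size of the mapping handed to the special-mapping code. */
static size_t demo_vdso_mapped_size(size_t image_len)
{
	return demo_vdso_npages(image_len) * DEMO_PAGE_SIZE;
}

/* Bytes reserved for the page pointer array (one pointer per page). */
static size_t demo_page_array_bytes(size_t image_len)
{
	size_t npages = demo_vdso_npages(image_len);

	assert(npages <= (size_t)-1 / sizeof(void *));	/* no overflow */
	return npages * sizeof(void *);
}
```

An image of 4097 bytes therefore occupies two pages and an 8192-byte mapping, which is why the data after vdsox32_end is padded to PAGE_SIZE.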

2012-02-20 00:52:14

by Linus Torvalds

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On Sun, Feb 19, 2012 at 4:07 PM, H. Peter Anvin <[email protected]> wrote:
>
> Use explicit sizes (__u64) instead of implicit sizes (unsigned long)
> in the definition for sigcontext.h; this will allow this structure to
> be shared between the x86-64 native ABI and the x32 ABI.

Btw, since we had this issue just with autofs: what are the x32 ABI
alignment issues for __u64? Are they like x86-64 ("natural alignment")
or x86-32 ("4-byte alignment")?

I assume they are natural alignment, and as pointed out by Davem, we
do have the versions of u64 that make this explicit: "compat_u64" is
the 4-byte-aligned one, while "__aligned_u64" is the natively aligned
one.

Just plain "__u64" doesn't tell which it is, which is sad and wrong,
but we're likely stuck with it forever. Unless some shining knight
comes and says "__u64 is native alignment, and if you want anything
else, you need to use __compat_u64", and actually fixes the cases
where x86-32 depends on the 4-byte aligned one.

Which would be nice, but sounds unlikely. Shining knights tend to be
rare. But this *could* possibly be automated, so it's not entirely out
of the question.

Linus

2012-02-20 00:56:22

by H. Peter Anvin

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On 02/19/2012 04:51 PM, Linus Torvalds wrote:
> On Sun, Feb 19, 2012 at 4:07 PM, H. Peter Anvin <[email protected]> wrote:
>>
>> Use explicit sizes (__u64) instead of implicit sizes (unsigned long)
>> in the definition for sigcontext.h; this will allow this structure to
>> be shared between the x86-64 native ABI and the x32 ABI.
>
> Btw, since we had this issue just with autofs: what are the x32 ABI
> alignment issues for __u64? Are they like x86-64 ("natural alignment")
> or x86-32 ("4-byte alignment")?
>
> I assume they are natural alignment, and as pointed out by Davem, we
> do have the versions of u64 that make this explicit: "compat_u64" is
> the 4-byte-aligned one, while "__aligned_u64" is the natively aligned
> one.
>
> Just plain "__u64" doesn't tell which it is, which is sad and wrong,
> but we're likely stuck with it forever. Unless some shining knight
> comes and says "__u64 is native alignment, and if you want anything
> else, you need to use __compat_u64", and actually fixes the cases
> where x86-32 depends on the 4-byte aligned one.
>
> Which would be nice, but sounds unlikely. Shining knights tend to be
> rare. But this *could* possibly be automated, so it's not entirely out
> of the question.
>

We are treating __u64 as x86-32 compatible, since x32 shares most of the
really complex paths (like ioctl) with i386 rather than with x86-64. So
it is defined in userspace as:

typedef unsigned long long __u64 __attribute__((aligned(4)));

__aligned_u64 obviously is naturally aligned, which matches uint64_t in
userspace.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
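
The practical effect of the two alignments discussed above can be seen in a structure layout. The following is only a sketch, assuming GCC or Clang (which allow a typedef to lower alignment via the aligned attribute); the demo_* names are hypothetical stand-ins for compat_u64 and __aligned_u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4-byte-aligned 64-bit integer, as i386 lays it out. */
typedef unsigned long long demo_compat_u64 __attribute__((aligned(4)));
/* Naturally (8-byte) aligned 64-bit integer. */
typedef unsigned long long demo_aligned_u64 __attribute__((aligned(8)));

struct demo_i386_like {
	uint32_t a;
	demo_compat_u64 b;	/* follows 'a' directly */
};

struct demo_natural {
	uint32_t a;
	demo_aligned_u64 b;	/* 4 bytes of padding before 'b' */
};

/* Checks the layouts this sketch expects under GCC/Clang. */
static void demo_check_layout(void)
{
	assert(offsetof(struct demo_i386_like, b) == 4);
	assert(sizeof(struct demo_i386_like) == 12);
	assert(offsetof(struct demo_natural, b) == 8);
	assert(sizeof(struct demo_natural) == 16);
}
```

The same two members thus produce a 12-byte or a 16-byte structure depending on which flavour of 64-bit type is used, which is exactly the kind of mismatch an ioctl structure shared between ABIs has to avoid.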

2012-02-20 00:56:34

by Linus Torvalds

Subject: Re: [PATCH 08/30] compat: Use COMPAT_USE_64BIT_TIME in the lp driver

On Sun, Feb 19, 2012 at 4:07 PM, H. Peter Anvin <[email protected]> wrote:
> From: "H. J. Lu" <[email protected]>
>
> Enable the lp driver to be used with a compat ABI with 64-bit time.

Ugh. Is this really the only case?

Because if it isn't, I suspect it would be much better off with a
helper function. In fact, even if this *does* end up being the only
place, a helper function to get/set a timeval from user space sounds
like a good idea, and makes things much more readable than that
ad-hoccery in the middle of code.

IOW, I'd like to see something like

get_user_timeval(void __user *tv, struct timeval *res)

even if it would be only local to this file.

Linus

2012-02-20 00:59:32

by H. Peter Anvin

Subject: Re: [PATCH 08/30] compat: Use COMPAT_USE_64BIT_TIME in the lp driver

On 02/19/2012 04:56 PM, Linus Torvalds wrote:
> On Sun, Feb 19, 2012 at 4:07 PM, H. Peter Anvin <[email protected]> wrote:
>> From: "H. J. Lu" <[email protected]>
>>
>> Enable the lp driver to be used with a compat ABI with 64-bit time.
>
> Ugh. Is this really the only case?
>

No, it's not the only case, and a helper function is a great idea.

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-20 02:23:38

by H. Peter Anvin

Subject: [PATCH 0/7] COMPAT_USE_64BIT_TIME v2

From: "H. Peter Anvin" <[email protected]>

This is a respin of the COMPAT_USE_64BIT_TIME bits of the x32
patchset. It replaces patches 07/30-10/30 out of that patchset.

The differences in this respin are:

1. Use helper functions as suggested by Linus, and
2. Add support in the networking stack (this bit got inadvertently
left out of the previous patchset.)

Linus, is this more what you were looking for?

This is the diffstat for this set:

drivers/char/lp.c | 5 +--
drivers/input/input-compat.c | 4 +-
drivers/input/input-compat.h | 2 +-
include/linux/compat.h | 20 ++++++++++++
kernel/compat.c | 70 +++++++++++++++++++++++++++++++++++++-----
net/bluetooth/hci_sock.c | 3 +-
net/compat.c | 65 ++++++++++++++++++++++++---------------
net/socket.c | 18 +++++------
8 files changed, 136 insertions(+), 51 deletions(-)

2012-02-20 02:24:53

by H. Peter Anvin

Subject: [PATCH 1/7] compat: Introduce COMPAT_USE_64BIT_TIME

From: "H. J. Lu" <[email protected]>

Allow a compatibility ABI to use a 64-bit time_t and 64-bit members in
struct timeval and struct timespec to avoid the Y2038 problem.

This will be used for the x32 ABI.

Signed-off-by: H. Peter Anvin <[email protected]>
---
include/linux/compat.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 41c9f65..1be91c0 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -19,6 +19,10 @@
#include <asm/siginfo.h>
#include <asm/signal.h>

+#ifndef COMPAT_USE_64BIT_TIME
+#define COMPAT_USE_64BIT_TIME 0
+#endif
+
#define compat_jiffies_to_clock_t(x) \
(((unsigned long)(x) * COMPAT_USER_HZ) / HZ)

--
1.7.6.5
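
For reference, the Y2038 problem mentioned in the changelog is simply the point where a signed 32-bit seconds counter runs out. A short user-space sketch, assuming a host with a 64-bit time_t and a compiler where narrowing to int32_t wraps (as GCC and Clang do); the demo_* names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* What a signed 32-bit tv_sec field ends up holding for a 64-bit value. */
static int32_t demo_truncate_sec(int64_t sec)
{
	return (int32_t)(uint32_t)sec;
}

/* Calendar year (UTC) of a seconds-since-epoch value, or -1 on failure. */
static int demo_year_of(int64_t sec)
{
	time_t t = (time_t)sec;
	struct tm *tm = gmtime(&t);

	assert(sizeof(time_t) >= sizeof(int32_t));
	return tm ? tm->tm_year + 1900 : -1;
}
```

The last second a 32-bit time_t can represent, INT32_MAX, falls in 2038; one second later the truncated value wraps to INT32_MIN, which decodes to a date in 1901.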

2012-02-20 02:25:28

by H. Peter Anvin

Subject: [PATCH 2/7] compat: Add helper functions to read/write struct timeval, timespec

From: "H. Peter Anvin" <[email protected]>

Add helper functions to read and write struct timeval and struct
timespec from userspace. We already had helper functions for reading
and writing struct compat_timespec; add a set of functions to do the
same with struct timeval, and add a second suite of functions which
can be sensitive to COMPAT_USE_64BIT_TIME and access either 32- or
64-bit time structures.

This also exports these helper functions to modules.

Rename the existing inlines for converting between struct
compat_timeval and native struct timespec so we can have a saner
naming convention for the exported functions.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
include/linux/compat.h | 16 +++++++++++
kernel/compat.c | 70 ++++++++++++++++++++++++++++++++++++++++++-----
2 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 1be91c0..a82e452 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -87,10 +87,26 @@ typedef struct {
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
} compat_sigset_t;

+/*
+ * These functions operate strictly on struct compat_time*
+ */
extern int get_compat_timespec(struct timespec *,
const struct compat_timespec __user *);
extern int put_compat_timespec(const struct timespec *,
struct compat_timespec __user *);
+extern int get_compat_timeval(struct timeval *,
+ const struct compat_timeval __user *);
+extern int put_compat_timeval(const struct timeval *,
+ struct compat_timeval __user *);
+/*
+ * These functions operate on 32- or 64-bit specs depending on
+ * COMPAT_USE_64BIT_TIME, hence the void user pointer arguments and the
+ * naming as compat_get/put_ rather than get/put_compat_.
+ */
+extern int compat_get_timespec(struct timespec *, const void __user *);
+extern int compat_put_timespec(const struct timespec *, void __user *);
+extern int compat_get_timeval(struct timeval *, const void __user *);
+extern int compat_put_timeval(const struct timeval *, void __user *);

struct compat_iovec {
compat_uptr_t iov_base;
diff --git a/kernel/compat.c b/kernel/compat.c
index f346ced..1743d67 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -31,11 +31,10 @@
#include <asm/uaccess.h>

/*
- * Note that the native side is already converted to a timespec, because
- * that's what we want anyway.
+ * Get/set struct timeval with struct timespec on the native side
*/
-static int compat_get_timeval(struct timespec *o,
- struct compat_timeval __user *i)
+static int compat_get_timeval_convert(struct timespec *o,
+ struct compat_timeval __user *i)
{
long usec;

@@ -46,8 +45,8 @@ static int compat_get_timeval(struct timespec *o,
return 0;
}

-static int compat_put_timeval(struct compat_timeval __user *o,
- struct timeval *i)
+static int compat_put_timeval_convert(struct compat_timeval __user *o,
+ struct timeval *i)
{
return (put_user(i->tv_sec, &o->tv_sec) ||
put_user(i->tv_usec, &o->tv_usec)) ? -EFAULT : 0;
@@ -117,7 +116,7 @@ asmlinkage long compat_sys_gettimeofday(struct compat_timeval __user *tv,
if (tv) {
struct timeval ktv;
do_gettimeofday(&ktv);
- if (compat_put_timeval(tv, &ktv))
+ if (compat_put_timeval_convert(tv, &ktv))
return -EFAULT;
}
if (tz) {
@@ -135,7 +134,7 @@ asmlinkage long compat_sys_settimeofday(struct compat_timeval __user *tv,
struct timezone ktz;

if (tv) {
- if (compat_get_timeval(&kts, tv))
+ if (compat_get_timeval_convert(&kts, tv))
return -EFAULT;
}
if (tz) {
@@ -146,12 +145,29 @@ asmlinkage long compat_sys_settimeofday(struct compat_timeval __user *tv,
return do_sys_settimeofday(tv ? &kts : NULL, tz ? &ktz : NULL);
}

+int get_compat_timeval(struct timeval *tv, const struct compat_timeval __user *ctv)
+{
+ return (!access_ok(VERIFY_READ, ctv, sizeof(*ctv)) ||
+ __get_user(tv->tv_sec, &ctv->tv_sec) ||
+ __get_user(tv->tv_usec, &ctv->tv_usec)) ? -EFAULT : 0;
+}
+EXPORT_SYMBOL_GPL(get_compat_timeval);
+
+int put_compat_timeval(const struct timeval *tv, struct compat_timeval __user *ctv)
+{
+ return (!access_ok(VERIFY_WRITE, ctv, sizeof(*ctv)) ||
+ __put_user(tv->tv_sec, &ctv->tv_sec) ||
+ __put_user(tv->tv_usec, &ctv->tv_usec)) ? -EFAULT : 0;
+}
+EXPORT_SYMBOL_GPL(put_compat_timeval);
+
int get_compat_timespec(struct timespec *ts, const struct compat_timespec __user *cts)
{
return (!access_ok(VERIFY_READ, cts, sizeof(*cts)) ||
__get_user(ts->tv_sec, &cts->tv_sec) ||
__get_user(ts->tv_nsec, &cts->tv_nsec)) ? -EFAULT : 0;
}
+EXPORT_SYMBOL_GPL(get_compat_timespec);

int put_compat_timespec(const struct timespec *ts, struct compat_timespec __user *cts)
{
@@ -161,6 +177,42 @@ int put_compat_timespec(const struct timespec *ts, struct compat_timespec __user
}
EXPORT_SYMBOL_GPL(put_compat_timespec);

+int compat_get_timeval(struct timeval *tv, const void __user *utv)
+{
+ if (COMPAT_USE_64BIT_TIME)
+ return copy_from_user(tv, utv, sizeof *tv) ? -EFAULT : 0;
+ else
+ return get_compat_timeval(tv, utv);
+}
+EXPORT_SYMBOL_GPL(compat_get_timeval);
+
+int compat_put_timeval(const struct timeval *tv, void __user *utv)
+{
+ if (COMPAT_USE_64BIT_TIME)
+ return copy_to_user(utv, tv, sizeof *tv) ? -EFAULT : 0;
+ else
+ return put_compat_timeval(tv, utv);
+}
+EXPORT_SYMBOL_GPL(compat_put_timeval);
+
+int compat_get_timespec(struct timespec *ts, const void __user *uts)
+{
+ if (COMPAT_USE_64BIT_TIME)
+ return copy_from_user(ts, uts, sizeof *ts) ? -EFAULT : 0;
+ else
+ return get_compat_timespec(ts, uts);
+}
+EXPORT_SYMBOL_GPL(compat_get_timespec);
+
+int compat_put_timespec(const struct timespec *ts, void __user *uts)
+{
+ if (COMPAT_USE_64BIT_TIME)
+ return copy_to_user(uts, ts, sizeof *ts) ? -EFAULT : 0;
+ else
+ return put_compat_timespec(ts, uts);
+}
+EXPORT_SYMBOL_GPL(compat_put_timespec);
+
static long compat_nanosleep_restart(struct restart_block *restart)
{
struct compat_timespec __user *rmtp;
@@ -1162,3 +1214,5 @@ void __user *compat_alloc_user_space(unsigned long len)
return ptr;
}
EXPORT_SYMBOL_GPL(compat_alloc_user_space);
+
+/*
--
1.7.6.5
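
The dispatch these helpers perform can be modelled in user space. The sketch below is only an illustration of the idea under stated assumptions: the demo_* types and functions are hypothetical, memcpy() stands in for the kernel's user-copy primitives, and the 64-bit-time choice is passed as a parameter instead of being the compile-time constant COMPAT_USE_64BIT_TIME.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct demo_timeval {		/* native 64-bit layout */
	int64_t tv_sec;
	int64_t tv_usec;
};

struct demo_compat_timeval {	/* legacy 32-bit layout */
	int32_t tv_sec;
	int32_t tv_usec;
};

/* Stand-in for a user copy: nonzero on failure (here only for NULL). */
static int demo_copy_in(void *dst, const void *src, size_t n)
{
	if (!src)
		return 1;
	memcpy(dst, src, n);
	return 0;
}

/* Read a timeval from an untyped source in either layout. */
static int demo_compat_get_timeval(struct demo_timeval *tv, const void *utv,
				   int use_64bit_time)
{
	struct demo_compat_timeval ctv;

	if (use_64bit_time)
		return demo_copy_in(tv, utv, sizeof(*tv)) ? -EFAULT : 0;

	if (demo_copy_in(&ctv, utv, sizeof(ctv)))
		return -EFAULT;
	tv->tv_sec = ctv.tv_sec;	/* sign-extend 32 -> 64 bits */
	tv->tv_usec = ctv.tv_usec;
	return 0;
}

/* Usage: fetch tv_sec through the 32-bit layout. */
static int64_t demo_sec_via_compat(int32_t sec)
{
	struct demo_compat_timeval ctv = { sec, 0 };
	struct demo_timeval tv = { 0, 0 };
	int err = demo_compat_get_timeval(&tv, &ctv, 0);

	assert(err == 0);
	return tv.tv_sec;
}

/* Usage: fetch tv_sec through the 64-bit layout. */
static int64_t demo_sec_via_native(int64_t sec)
{
	struct demo_timeval src = { sec, 0 };
	struct demo_timeval tv = { 0, 0 };
	int err = demo_compat_get_timeval(&tv, &src, 1);

	assert(err == 0);
	return tv.tv_sec;
}
```

Callers such as an ioctl handler then need a single call with a void pointer and never have to know which layout the calling ABI uses.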

2012-02-20 02:25:54

by H. Peter Anvin

Subject: [PATCH 3/7] compat: Handle COMPAT_USE_64BIT_TIME in the lp driver

From: "H. Peter Anvin" <[email protected]>

Enable the lp driver to be used with a compat ABI with 64-bit time.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
---
drivers/char/lp.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/drivers/char/lp.c b/drivers/char/lp.c
index f434856..64f5aeb 100644
--- a/drivers/char/lp.c
+++ b/drivers/char/lp.c
@@ -706,16 +706,13 @@ static long lp_compat_ioctl(struct file *file, unsigned int cmd,
{
unsigned int minor;
struct timeval par_timeout;
- struct compat_timeval __user *tc;
int ret;

minor = iminor(file->f_path.dentry->d_inode);
mutex_lock(&lp_mutex);
switch (cmd) {
case LPSETTIMEOUT:
- tc = compat_ptr(arg);
- if (get_user(par_timeout.tv_sec, &tc->tv_sec) ||
- get_user(par_timeout.tv_usec, &tc->tv_usec)) {
+ if (compat_get_timeval(&par_timeout, compat_ptr(arg))) {
ret = -EFAULT;
break;
}
--
1.7.6.5

2012-02-20 02:26:22

by H. Peter Anvin

Subject: [PATCH 4/7] compat: Use COMPAT_USE_64BIT_TIME in the input subsystem

From: "H. J. Lu" <[email protected]>

Enable the input system to be used with a compat ABI with 64-bit time.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
---
drivers/input/input-compat.c | 4 ++--
drivers/input/input-compat.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/input/input-compat.c b/drivers/input/input-compat.c
index e46a867..64ca711 100644
--- a/drivers/input/input-compat.c
+++ b/drivers/input/input-compat.c
@@ -17,7 +17,7 @@
int input_event_from_user(const char __user *buffer,
struct input_event *event)
{
- if (INPUT_COMPAT_TEST) {
+ if (INPUT_COMPAT_TEST && !COMPAT_USE_64BIT_TIME) {
struct input_event_compat compat_event;

if (copy_from_user(&compat_event, buffer,
@@ -41,7 +41,7 @@ int input_event_from_user(const char __user *buffer,
int input_event_to_user(char __user *buffer,
const struct input_event *event)
{
- if (INPUT_COMPAT_TEST) {
+ if (INPUT_COMPAT_TEST && !COMPAT_USE_64BIT_TIME) {
struct input_event_compat compat_event;

compat_event.time.tv_sec = event->time.tv_sec;
diff --git a/drivers/input/input-compat.h b/drivers/input/input-compat.h
index 22be27b..148f66f 100644
--- a/drivers/input/input-compat.h
+++ b/drivers/input/input-compat.h
@@ -67,7 +67,7 @@ struct ff_effect_compat {

static inline size_t input_event_size(void)
{
- return INPUT_COMPAT_TEST ?
+ return (INPUT_COMPAT_TEST && !COMPAT_USE_64BIT_TIME) ?
sizeof(struct input_event_compat) : sizeof(struct input_event);
}

--
1.7.6.5
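
The reason for the extra !COMPAT_USE_64BIT_TIME test is that the event record embeds a struct timeval, so its size depends on the time layout and not only on whether the task is a compat task. A sketch with hypothetical demo_* mirrors of the two structures, assuming the usual x86 type sizes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_timeval64 { int64_t tv_sec; int64_t tv_usec; };
struct demo_timeval32 { int32_t tv_sec; int32_t tv_usec; };

/* Mirrors the native event record: 64-bit time fields. */
struct demo_input_event {
	struct demo_timeval64 time;
	uint16_t type;
	uint16_t code;
	int32_t value;
};

/* Mirrors the legacy compat record: 32-bit time fields. */
struct demo_input_event_compat {
	struct demo_timeval32 time;
	uint16_t type;
	uint16_t code;
	int32_t value;
};

/* Record size seen by the caller, following the patched condition. */
static size_t demo_input_event_size(int compat_task, int use_64bit_time)
{
	assert(sizeof(struct demo_input_event) >
	       sizeof(struct demo_input_event_compat));
	return (compat_task && !use_64bit_time) ?
		sizeof(struct demo_input_event_compat) :
		sizeof(struct demo_input_event);
}
```

A compat task with 32-bit time sees 16-byte records, while a compat task with 64-bit time (the x32 case) sees the same 24-byte records as a native task.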

2012-02-20 02:27:07

by H. Peter Anvin

Subject: [PATCH 5/7] compat: Use COMPAT_USE_64BIT_TIME in the Bluetooth subsystem

From: "H. J. Lu" <[email protected]>

Enable the Bluetooth subsystem to be used with a compat ABI with
64-bit time.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Marcel Holtmann <[email protected]>
Cc: Gustavo F. Padovan <[email protected]>
Cc: David S. Miller <[email protected]>
---
net/bluetooth/hci_sock.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 0dcc962..b2eb2b9 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -418,7 +418,8 @@ static inline void hci_sock_cmsg(struct sock *sk, struct msghdr *msg, struct sk_
data = &tv;
len = sizeof(tv);
#ifdef CONFIG_COMPAT
- if (msg->msg_flags & MSG_CMSG_COMPAT) {
+ if (!COMPAT_USE_64BIT_TIME &&
+ (msg->msg_flags & MSG_CMSG_COMPAT)) {
ctv.tv_sec = tv.tv_sec;
ctv.tv_usec = tv.tv_usec;
data = &ctv;
--
1.7.6.5

2012-02-20 02:27:16

by H. Peter Anvin

Subject: [PATCH 6/7] compat: Use COMPAT_USE_64BIT_TIME in net/compat.c

From: "H. J. Lu" <[email protected]>

Handle 64-bit time structures in the networking core compat code.

Signed-off-by: H. Peter Anvin <[email protected]>
Cc: David S. Miller <[email protected]>
---
net/compat.c | 65 +++++++++++++++++++++++++++++++++++----------------------
1 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/net/compat.c b/net/compat.c
index 6def90e..73bf0e0 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -219,8 +219,6 @@ Efault:

int put_cmsg_compat(struct msghdr *kmsg, int level, int type, int len, void *data)
{
- struct compat_timeval ctv;
- struct compat_timespec cts[3];
struct compat_cmsghdr __user *cm = (struct compat_cmsghdr __user *) kmsg->msg_control;
struct compat_cmsghdr cmhdr;
int cmlen;
@@ -230,24 +228,28 @@ int put_cmsg_compat(struct msghdr *kmsg, int level, int type, int len, void *dat
return 0; /* XXX: return error? check spec. */
}

- if (level == SOL_SOCKET && type == SCM_TIMESTAMP) {
- struct timeval *tv = (struct timeval *)data;
- ctv.tv_sec = tv->tv_sec;
- ctv.tv_usec = tv->tv_usec;
- data = &ctv;
- len = sizeof(ctv);
- }
- if (level == SOL_SOCKET &&
- (type == SCM_TIMESTAMPNS || type == SCM_TIMESTAMPING)) {
- int count = type == SCM_TIMESTAMPNS ? 1 : 3;
- int i;
- struct timespec *ts = (struct timespec *)data;
- for (i = 0; i < count; i++) {
- cts[i].tv_sec = ts[i].tv_sec;
- cts[i].tv_nsec = ts[i].tv_nsec;
+ if (!COMPAT_USE_64BIT_TIME) {
+ struct compat_timeval ctv;
+ struct compat_timespec cts[3];
+ if (level == SOL_SOCKET && type == SCM_TIMESTAMP) {
+ struct timeval *tv = (struct timeval *)data;
+ ctv.tv_sec = tv->tv_sec;
+ ctv.tv_usec = tv->tv_usec;
+ data = &ctv;
+ len = sizeof(ctv);
+ }
+ if (level == SOL_SOCKET &&
+ (type == SCM_TIMESTAMPNS || type == SCM_TIMESTAMPING)) {
+ int count = type == SCM_TIMESTAMPNS ? 1 : 3;
+ int i;
+ struct timespec *ts = (struct timespec *)data;
+ for (i = 0; i < count; i++) {
+ cts[i].tv_sec = ts[i].tv_sec;
+ cts[i].tv_nsec = ts[i].tv_nsec;
+ }
+ data = &cts;
+ len = sizeof(cts[0]) * count;
}
- data = &cts;
- len = sizeof(cts[0]) * count;
}

cmlen = CMSG_COMPAT_LEN(len);
@@ -454,11 +456,15 @@ static int compat_sock_getsockopt(struct socket *sock, int level, int optname,

int compat_sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
{
- struct compat_timeval __user *ctv =
- (struct compat_timeval __user *) userstamp;
- int err = -ENOENT;
+ struct compat_timeval __user *ctv;
+ int err;
struct timeval tv;

+ if (COMPAT_USE_64BIT_TIME)
+ return sock_get_timestamp(sk, userstamp);
+
+ ctv = (struct compat_timeval __user *) userstamp;
+ err = -ENOENT;
if (!sock_flag(sk, SOCK_TIMESTAMP))
sock_enable_timestamp(sk, SOCK_TIMESTAMP);
tv = ktime_to_timeval(sk->sk_stamp);
@@ -478,11 +484,15 @@ EXPORT_SYMBOL(compat_sock_get_timestamp);

int compat_sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
{
- struct compat_timespec __user *ctv =
- (struct compat_timespec __user *) userstamp;
- int err = -ENOENT;
+ struct compat_timespec __user *ctv;
+ int err;
struct timespec ts;

+ if (COMPAT_USE_64BIT_TIME)
+ return sock_get_timestampns (sk, userstamp);
+
+ ctv = (struct compat_timespec __user *) userstamp;
+ err = -ENOENT;
if (!sock_flag(sk, SOCK_TIMESTAMP))
sock_enable_timestamp(sk, SOCK_TIMESTAMP);
ts = ktime_to_timespec(sk->sk_stamp);
@@ -767,6 +777,11 @@ asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
int datagrams;
struct timespec ktspec;

+ if (COMPAT_USE_64BIT_TIME)
+ return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
+ flags | MSG_CMSG_COMPAT,
+ (struct timespec *) timeout);
+
if (timeout == NULL)
return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
flags | MSG_CMSG_COMPAT, NULL);
--
1.7.6.5
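
The put_cmsg_compat() change boils down to this: timestamp payloads are narrowed to the 32-bit layout only when the compat ABI does not use 64-bit time, and otherwise are passed through untouched. A user-space sketch of that payload handling follows; the demo_* names are hypothetical, and the count is 1 for a single timestamp or 3 for the timestamping triple, as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_timespec64 { int64_t tv_sec; int64_t tv_nsec; };
struct demo_timespec32 { int32_t tv_sec; int32_t tv_nsec; };

#define DEMO_MAX_STAMPS 3

/* Write 'count' timestamps to 'out' in the caller's layout; return length. */
static size_t demo_pack_stamps(void *out, const struct demo_timespec64 *ts,
			       int count, int use_64bit_time)
{
	struct demo_timespec32 cts[DEMO_MAX_STAMPS];
	int i;

	assert(count >= 1 && count <= DEMO_MAX_STAMPS);

	if (use_64bit_time) {
		/* Same layout as native: no conversion needed. */
		memcpy(out, ts, sizeof(ts[0]) * count);
		return sizeof(ts[0]) * count;
	}

	for (i = 0; i < count; i++) {
		cts[i].tv_sec = (int32_t)ts[i].tv_sec;
		cts[i].tv_nsec = (int32_t)ts[i].tv_nsec;
	}
	memcpy(out, cts, sizeof(cts[0]) * count);
	return sizeof(cts[0]) * count;
}

/* Usage: payload length for a given count and time layout. */
static size_t demo_payload_len(int count, int use_64bit_time)
{
	struct demo_timespec64 ts[DEMO_MAX_STAMPS] = {
		{ 1, 2 }, { 3, 4 }, { 5, 6 }
	};
	unsigned char out[sizeof(ts)];

	return demo_pack_stamps(out, ts, count, use_64bit_time);
}
```

The control message length that follows is computed from this payload length, so getting the layout wrong would also corrupt the header that userspace parses.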

2012-02-20 02:27:40

by H. Peter Anvin

Subject: [PATCH 7/7] compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c

From: "H. Peter Anvin" <[email protected]>

Use helper functions aware of COMPAT_USE_64BIT_TIME to write struct
timeval and struct timespec to userspace in net/socket.c.

Signed-off-by: H. Peter Anvin <[email protected]>
---
net/socket.c | 18 ++++++++----------
1 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 28a96af..57f5a25 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2600,7 +2600,7 @@ void socket_seq_show(struct seq_file *seq)

#ifdef CONFIG_COMPAT
static int do_siocgstamp(struct net *net, struct socket *sock,
- unsigned int cmd, struct compat_timeval __user *up)
+ unsigned int cmd, void __user *up)
{
mm_segment_t old_fs = get_fs();
struct timeval ktv;
@@ -2609,15 +2609,14 @@ static int do_siocgstamp(struct net *net, struct socket *sock,
set_fs(KERNEL_DS);
err = sock_do_ioctl(net, sock, cmd, (unsigned long)&ktv);
set_fs(old_fs);
- if (!err) {
- err = put_user(ktv.tv_sec, &up->tv_sec);
- err |= __put_user(ktv.tv_usec, &up->tv_usec);
- }
+ if (!err)
+ err = compat_put_timeval(&ktv, up);
+
return err;
}

static int do_siocgstampns(struct net *net, struct socket *sock,
- unsigned int cmd, struct compat_timespec __user *up)
+ unsigned int cmd, void __user *up)
{
mm_segment_t old_fs = get_fs();
struct timespec kts;
@@ -2626,10 +2625,9 @@ static int do_siocgstampns(struct net *net, struct socket *sock,
set_fs(KERNEL_DS);
err = sock_do_ioctl(net, sock, cmd, (unsigned long)&kts);
set_fs(old_fs);
- if (!err) {
- err = put_user(kts.tv_sec, &up->tv_sec);
- err |= __put_user(kts.tv_nsec, &up->tv_nsec);
- }
+ if (!err)
+ err = compat_put_timespec(&kts, up);
+
return err;
}

--
1.7.6.5

2012-02-20 02:42:25

by H. Peter Anvin

Subject: Re: [PATCH 0/7] COMPAT_USE_64BIT_TIME v2

On 02/19/2012 06:22 PM, H. Peter Anvin wrote:
> From: "H. Peter Anvin" <[email protected]>
>
> This is a respin of the COMPAT_USE_64BIT_TIME bits of the x32
> patchset. It replaces patches 07/30-10/30 out of that patchset.
>
> The differences in this respin are:
>
> 1. Use helper functions as suggested by Linus, and
> 2. Add support in the networking stack (this bit got inadvertently
> left out of the previous patchset.)
>
> Linus, is this more what you were looking for?
>
> This is the diffstat for this set:
>

(And no, it doesn't even compile... working on that ;)

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-20 06:22:51

by H. Peter Anvin

Subject: Re: [PATCH 0/7] COMPAT_USE_64BIT_TIME v2

On 02/19/2012 06:42 PM, H. Peter Anvin wrote:
>
> (And no, it doesn't even compile... working on that ;)
>

The full tree, which now Actually Compiles[TM], is at:

git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-x32.git x86/x32

I won't patchbomb it tonight, though.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-20 12:12:37

by Geert Uytterhoeven

Subject: Re: [PATCH 06/30] sysinfo: Use explicit types in <linux/sysinfo.h>

On Mon, Feb 20, 2012 at 01:07, H. Peter Anvin <[email protected]> wrote:
> --- a/include/linux/sysinfo.h
> +++ b/include/linux/sysinfo.h
> @@ -1,22 +1,24 @@
>  #ifndef _LINUX_SYSINFO_H
>  #define _LINUX_SYSINFO_H
>
> +#include <linux/types.h>
> +
>  #define SI_LOAD_SHIFT  16
>  struct sysinfo {
> -       long uptime;                    /* Seconds since boot */
> -       unsigned long loads[3];         /* 1, 5, and 15 minute load averages */
> -       unsigned long totalram;         /* Total usable main memory size */
> -       unsigned long freeram;          /* Available memory size */
> -       unsigned long sharedram;        /* Amount of shared memory */
> -       unsigned long bufferram;        /* Memory used by buffers */
> -       unsigned long totalswap;        /* Total swap space size */
> -       unsigned long freeswap;         /* swap space still available */
> -       unsigned short procs;           /* Number of current processes */
> -       unsigned short pad;             /* explicit padding for m68k */
> -       unsigned long totalhigh;        /* Total high memory size */
> -       unsigned long freehigh;         /* Available high memory size */
> -       unsigned int mem_unit;          /* Memory unit size in bytes */
> -       char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */
> +       __kernel_long_t uptime;         /* Seconds since boot */
> +       __kernel_ulong_t loads[3];      /* 1, 5, and 15 minute load averages */
> +       __kernel_ulong_t totalram;      /* Total usable main memory size */
> +       __kernel_ulong_t freeram;       /* Available memory size */
> +       __kernel_ulong_t sharedram;     /* Amount of shared memory */
> +       __kernel_ulong_t bufferram;     /* Memory used by buffers */
> +       __kernel_ulong_t totalswap;     /* Total swap space size */
> +       __kernel_ulong_t freeswap;      /* swap space still available */
> +       __u16 procs;                    /* Number of current processes */
> +       __u16 pad;                      /* Explicit padding for m68k */

Fueling the discussion about natural vs. 4-byte alignment?

> +       __kernel_ulong_t totalhigh;     /* Total high memory size */
> +       __kernel_ulong_t freehigh;      /* Available high memory size */
> +       __u32 mem_unit;                 /* Memory unit size in bytes */
> +       char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];   /* Padding: libc5 uses this.. */
>  };

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

2012-02-20 17:30:32

by H. Peter Anvin

Subject: Re: [PATCH 06/30] sysinfo: Use explicit types in <linux/sysinfo.h>

Not really... it comes down to "implicit padding in kernel ABI structures is bad". They can easily become security holes.



Geert Uytterhoeven <[email protected]> wrote:

>On Mon, Feb 20, 2012 at 01:07, H. Peter Anvin <[email protected]> wrote:
>> --- a/include/linux/sysinfo.h
>> +++ b/include/linux/sysinfo.h
>> @@ -1,22 +1,24 @@
>>  #ifndef _LINUX_SYSINFO_H
>>  #define _LINUX_SYSINFO_H
>>
>> +#include <linux/types.h>
>> +
>>  #define SI_LOAD_SHIFT  16
>>  struct sysinfo {
>> -       long uptime;                    /* Seconds since boot */
>> -       unsigned long loads[3];         /* 1, 5, and 15 minute load averages */
>> -       unsigned long totalram;         /* Total usable main memory size */
>> -       unsigned long freeram;          /* Available memory size */
>> -       unsigned long sharedram;        /* Amount of shared memory */
>> -       unsigned long bufferram;        /* Memory used by buffers */
>> -       unsigned long totalswap;        /* Total swap space size */
>> -       unsigned long freeswap;         /* swap space still available */
>> -       unsigned short procs;           /* Number of current processes */
>> -       unsigned short pad;             /* explicit padding for m68k */
>> -       unsigned long totalhigh;        /* Total high memory size */
>> -       unsigned long freehigh;         /* Available high memory size */
>> -       unsigned int mem_unit;          /* Memory unit size in bytes */
>> -       char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */
>> +       __kernel_long_t uptime;         /* Seconds since boot */
>> +       __kernel_ulong_t loads[3];      /* 1, 5, and 15 minute load averages */
>> +       __kernel_ulong_t totalram;      /* Total usable main memory size */
>> +       __kernel_ulong_t freeram;       /* Available memory size */
>> +       __kernel_ulong_t sharedram;     /* Amount of shared memory */
>> +       __kernel_ulong_t bufferram;     /* Memory used by buffers */
>> +       __kernel_ulong_t totalswap;     /* Total swap space size */
>> +       __kernel_ulong_t freeswap;      /* swap space still available */
>> +       __u16 procs;                    /* Number of current processes */
>> +       __u16 pad;                      /* Explicit padding for m68k */
>
>Fueling the discussion about natural vs. 4-byte alignment?
>
>> +       __kernel_ulong_t totalhigh;     /* Total high memory size */
>> +       __kernel_ulong_t freehigh;      /* Available high memory size */
>> +       __u32 mem_unit;                 /* Memory unit size in bytes */
>> +       char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];   /* Padding: libc5 uses this.. */
>>  };
>
>Gr{oetje,eeting}s,
>
>                        Geert
>
>--
>Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
>In personal conversations with technical people, I call myself a hacker. But
>when I'm talking to journalists I just say "programmer" or something like that.
>                                -- Linus Torvalds

--
Sent from my mobile phone. Please excuse my brevity and lack of formatting.

2012-02-20 20:01:48

by Geert Uytterhoeven

Subject: Re: [PATCH 06/30] sysinfo: Use explicit types in <linux/sysinfo.h>

On Mon, Feb 20, 2012 at 18:29, H. Peter Anvin <[email protected]> wrote:
> Not really... it comes down to "implicit padding in kernel ABI structures is bad".  They can easily become security holes.

On 64-bit platforms with natural alignment of long, there's an implicit padding.

> Geert Uytterhoeven <[email protected]> wrote:
>
>>On Mon, Feb 20, 2012 at 01:07, H. Peter Anvin <[email protected]> wrote:
>>> --- a/include/linux/sysinfo.h
>>> +++ b/include/linux/sysinfo.h
>>> @@ -1,22 +1,24 @@
>>>  #ifndef _LINUX_SYSINFO_H
>>>  #define _LINUX_SYSINFO_H
>>>
>>> +#include <linux/types.h>
>>> +
>>>  #define SI_LOAD_SHIFT  16
>>>  struct sysinfo {
>>> -       long uptime;                    /* Seconds since boot */
>>> -       unsigned long loads[3];         /* 1, 5, and 15 minute load averages */
>>> -       unsigned long totalram;         /* Total usable main memory size */
>>> -       unsigned long freeram;          /* Available memory size */
>>> -       unsigned long sharedram;        /* Amount of shared memory */
>>> -       unsigned long bufferram;        /* Memory used by buffers */
>>> -       unsigned long totalswap;        /* Total swap space size */
>>> -       unsigned long freeswap;         /* swap space still available */
>>> -       unsigned short procs;           /* Number of current processes */
>>> -       unsigned short pad;             /* explicit padding for m68k */
>>> -       unsigned long totalhigh;        /* Total high memory size */
>>> -       unsigned long freehigh;         /* Available high memory size */
>>> -       unsigned int mem_unit;          /* Memory unit size in bytes */
>>> -       char _f[20-2*sizeof(long)-sizeof(int)]; /* Padding: libc5 uses this.. */
>>> +       __kernel_long_t uptime;         /* Seconds since boot */
>>> +       __kernel_ulong_t loads[3];      /* 1, 5, and 15 minute load averages */
>>> +       __kernel_ulong_t totalram;      /* Total usable main memory size */
>>> +       __kernel_ulong_t freeram;       /* Available memory size */
>>> +       __kernel_ulong_t sharedram;     /* Amount of shared memory */
>>> +       __kernel_ulong_t bufferram;     /* Memory used by buffers */
>>> +       __kernel_ulong_t totalswap;     /* Total swap space size */
>>> +       __kernel_ulong_t freeswap;      /* swap space still available */
>>> +       __u16 procs;                    /* Number of current processes */
>>> +       __u16 pad;                      /* Explicit padding for m68k */
>>
>>Fueling the discussion about natural vs. 4-byte alignment?
>>
>>> +       __kernel_ulong_t totalhigh;     /* Total high memory size */
>>> +       __kernel_ulong_t freehigh;      /* Available high memory size */
>>> +       __u32 mem_unit;                 /* Memory unit size in bytes */
>>> +       char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];   /* Padding: libc5 uses this.. */
>>>  };

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

2012-02-20 20:44:46

by H. Peter Anvin

Subject: Re: [PATCH 06/30] sysinfo: Use explicit types in <linux/sysinfo.h>

On 02/20/2012 12:01 PM, Geert Uytterhoeven wrote:
> On Mon, Feb 20, 2012 at 18:29, H. Peter Anvin <[email protected]> wrote:
>> Not really... it comes down to "implicit padding in kernel ABI structures is bad". They can easily become security holes.
>
> On 64-bit platforms with natural alignment of long, there's an implicit padding.
>

We should try to make those explicit, but it sometimes gets hard
retroactively. Note that the particular patch you're pointing to is
just an identity patch.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-21 00:17:43

by Andrew Lutomirski

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/19/2012 04:08 PM, H. Peter Anvin wrote:
> From: "H. J. Lu" <[email protected]>
>
> Add support for the x32 VDSO. The x32 VDSO takes advantage of the
> similarity between the x86-64 and the x32 ABIs to contain the same
> content, only the container is different, as the x32 VDSO obviously is
> an x32 shared object.

> +
> +/*
> + * This controls what userland symbols we export from the vDSO.
> + */
> +VERSION {
> + LINUX_2.6 {
> + global:
> + clock_gettime;
> + __vdso_clock_gettime;
> + gettimeofday;
> + __vdso_gettimeofday;
> + getcpu;
> + __vdso_getcpu;
> + time;
> + __vdso_time;
> + local: *;
> + };
> +}
> +

Would it make sense to remove the non-__vdso-prefixed weak symbols?
AFAICT they are somewhere between useless (because the __vdso symbols
are unambiguous), confusing (has anyone not read this and said "huh?"),
and wrong (they are not interchangeable with glibc's symbols as they
return different values).

We're stuck with them on x86-64, but x32 is new and has no
backwards-compatibility issues.

--Andy

2012-02-21 03:58:45

by H. Peter Anvin

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/20/2012 04:12 PM, Andy Lutomirski wrote:
>
> Would it make sense to remove the non-__vdso-prefixed weak symbols?
> AFAICT they are somewhere between useless (because the __vdso symbols
> are unambiguous), confusing (has anyone not read this and said "huh?"),
> and wrong (they are not interchangeable with glibc's symbols as they
> return different values).
>
> We're stuck with them on x86-64, but x32 is new and has no
> backwards-compatibility issues.
>

What about non-glibc?

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-21 16:52:42

by Andrew Lutomirski

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On Mon, Feb 20, 2012 at 7:58 PM, H. Peter Anvin <[email protected]> wrote:
> On 02/20/2012 04:12 PM, Andy Lutomirski wrote:
>>
>> Would it make sense to remove the non-__vdso-prefixed weak symbols?
>> AFAICT they are somewhere between useless (because the __vdso symbols
>> are unambiguous), confusing (has anyone not read this and said "huh?"),
>> and wrong (they are not interchangeable with glibc's symbols as they
>> return different values).
>>
>> We're stuck with them on x86-64, but x32 is new and has no
>> backwards-compatibility issues.
>>
>
> What about non-glibc?

IMO non-glibc users should just call __vdso_clock_gettime, etc.
Currently, code like:

if (clock_gettime(whatever) == -1)
handle_the_error();

is correct when linked against glibc but incorrect when linked
directly against the vdso.

--Andy

2012-02-21 17:51:32

by H. Peter Anvin

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/21/2012 08:52 AM, Andrew Lutomirski wrote:
>>
>> What about non-glibc?
>
> IMO non-glibc users should just call __vdso_clock_gettime, etc.
> Currently, code like:
>
> if (clock_gettime(whatever) == -1)
> handle_the_error();
>
> is correct when linked against glibc but incorrect when linked
> directly against the vdso.
>

The issue is what uclibc, Bionic, etc. actually do.

-hpa

2012-02-21 18:54:54

by Andrew Lutomirski

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On Tue, Feb 21, 2012 at 9:51 AM, H. Peter Anvin <[email protected]> wrote:
> On 02/21/2012 08:52 AM, Andrew Lutomirski wrote:
>>>
>>>
>>> What about non-glibc?
>>
>>
>> IMO non-glibc users should just call __vdso_clock_gettime, etc.
>> Currently, code like:
>>
>> if (clock_gettime(whatever) == -1)
>> handle_the_error();
>>
>> is correct when linked against glibc but incorrect when linked
>> directly against the vdso.
>>
>
> The issue is what uclibc, Bionic, etc. actually do.

AFAICS Bionic only works on x86-32 and calls clock_gettime via
hardcoded int 0x80, written in assembly (!). uclibc calls
__vdso_getcpu and does not seem to use the other vdso calls. On a
cursory inspection, klibc uses neither the vsyscall page nor the vdso.

I doubt that there's any existing libc replacement that uses the
non-prefixed vdso entries and that already works on x32 -- that would
be impressive. I'm not suggesting changing anything in the x86-64
vdso.

uclibc hardcodes a call to the vsyscall gettimeofday implementation in
its locking primitives, which probably gives terrible performance, but
that's a separate issue. I think do_emulate_vsyscall should send a
segfault if called by an x32 task -- there's some security benefit to
doing so, and there's unlikely to be any downside.

--Andy

2012-02-21 19:03:47

by H. Peter Anvin

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/21/2012 10:54 AM, Andrew Lutomirski wrote:
>
> uclibc hardcodes a call to the vsyscall gettimeofday implementation in
> its locking primitives, which probably gives terrible performance, but
> that's a separate issue. I think do_emulate_vsyscall should send a
> segfault if called by an x32 task -- there's some security benefit to
> doing so, and there's unlikely to be any downside.
>

The vsyscall page shouldn't be mapped for x32 tasks...

-hpa

2012-02-21 19:05:49

by Gustavo Padovan

Subject: Re: [PATCH 10/30] compat: Use COMPAT_USE_64BIT_TIME in the Bluetooth subsystem


* H. Peter Anvin <[email protected]> [2012-02-19 16:07:48 -0800]:

> From: "H. J. Lu" <[email protected]>
>
> Enable the Bluetooth subsystem to be used with a compat ABI with
> 64-bit time.
>
> Signed-off-by: H. Peter Anvin <[email protected]>
> Cc: Marcel Holtmann <[email protected]>
> Cc: Gustavo F. Padovan <[email protected]>
> Cc: David S. Miller <[email protected]>
> ---
> net/bluetooth/hci_sock.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)

Applied, thanks.

Gustavo

2012-02-21 19:15:43

by H. Peter Anvin

Subject: Re: [PATCH 10/30] compat: Use COMPAT_USE_64BIT_TIME in the Bluetooth subsystem

On 02/21/2012 11:05 AM, Gustavo Padovan wrote:
>
> * H. Peter Anvin <[email protected]> [2012-02-19 16:07:48 -0800]:
>
>> From: "H. J. Lu" <[email protected]>
>>
>> Enable the Bluetooth subsystem to be used with a compat ABI with
>> 64-bit time.
>>
>> Signed-off-by: H. Peter Anvin <[email protected]>
>> Cc: Marcel Holtmann <[email protected]>
>> Cc: Gustavo F. Padovan <[email protected]>
>> Cc: David S. Miller <[email protected]>
>> ---
>> net/bluetooth/hci_sock.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> Applied, thanks.
>

OK, I have it in my x86/x32 tree; you don't want to have this patch
without the prior patch for obvious reasons.

-hpa

2012-02-21 19:29:23

by Andrew Lutomirski

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On Tue, Feb 21, 2012 at 11:03 AM, H. Peter Anvin <[email protected]> wrote:
> On 02/21/2012 10:54 AM, Andrew Lutomirski wrote:
>>
>> uclibc hardcodes a call to the vsyscall gettimeofday implementation in
>> its locking primitives, which probably gives terrible performance, but
>> that's a separate issue. I think do_emulate_vsyscall should send a
>> segfault if called by an x32 task -- there's some security benefit to
>> doing so, and there's unlikely to be any downside.
>>
>
> The vsyscall page shouldn't be mapped for x32 tasks...

How is that possible? It lives in the fixmap and is presumably
visible from any 64-bit code.

Admittedly, x32 tasks are probably somewhat difficult to trick into
calling addresses with high bits set, but it's not necessarily
impossible.

--Andy

2012-02-21 19:37:22

by H. Peter Anvin

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/21/2012 11:29 AM, Andrew Lutomirski wrote:
>>
>> The vsyscall page shouldn't be mapped for x32 tasks...
>
> How is that possible? It lives in the fixmap and is presumably
> visible from any 64-bit code.
>
> Admittedly, x32 tasks are probably somewhat difficult to trick into
> calling addresses with high bits set, but it's not necessarily
> impossible.
>

Fair enough, and it's not necessarily all that hard either.

And it's visible even in a 32-bit task, although a 32-bit task has to
switch into 64-bit mode. Yet another reason the vsyscall page needs to die.

I was having delusions that we could have a task-owned PDT in negative
space, but that would require unsharing the third level, too, which is
just way too messy.

-hpa

2012-02-21 19:41:03

by Andrew Lutomirski

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On Tue, Feb 21, 2012 at 11:37 AM, H. Peter Anvin <[email protected]> wrote:
> On 02/21/2012 11:29 AM, Andrew Lutomirski wrote:
>>>
>>> The vsyscall page shouldn't be mapped for x32 tasks...
>>
>> How is that possible? It lives in the fixmap and is presumably
>> visible from any 64-bit code.
>>
>> Admittedly, x32 tasks are probably somewhat difficult to trick into
>> calling addresses with high bits set, but it's not necessarily
>> impossible.
>>
>
> Fair enough, and it's not necessarily all that hard either.
>
> And it's visible even in a 32-bit task, although a 32-bit task has to
> switch into 64-bit mode. Yet another reason the vsyscall page needs to die.
>
> I was having delusions that we could have a task-owned PDT in negative
> space, but that would require unsharing the third level, too, which is
> just way too messy.

I'd like to do that, too, and I'd also like to have a per-cpu
kernel-only page in there, but that's even worse. If we had a
separate cr3-like register for negative addresses, life would be good
:)

--Andy

2012-02-21 19:49:53

by H. Peter Anvin

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/21/2012 11:40 AM, Andrew Lutomirski wrote:
>>
>> I was having delusions that we could have a task-owned PDT in negative
>> space, but that would require unsharing the third level, too, which is
>> just way too messy.
>
> I'd like to do that, too, and I'd also like to have a per-cpu
> kernel-only page in there, but that's even worse. If we had a
> separate cr3-like register for negative addresses, life would be good
> :)
>

No, that wouldn't help. The situation is actually quite similar to the
current situation where we have an unshared fourth level, but since the
fourth entries are 512G per entry, we would have to push unsharing of
the kernel address space at least one more level (1G), possibly two
(2M). Painful.

The main advantage of a separate cr3 would be that we wouldn't need the
unshared top level for the kernel side.

-hpa

2012-02-21 19:51:45

by Andrew Lutomirski

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On Tue, Feb 21, 2012 at 11:49 AM, H. Peter Anvin <[email protected]> wrote:
> On 02/21/2012 11:40 AM, Andrew Lutomirski wrote:
>>>
>>> I was having delusions that we could have a task-owned PDT in negative
>>> space, but that would require unsharing the third level, too, which is
>>> just way too messy.
>>
>> I'd like to do that, too, and I'd also like to have a per-cpu
>> kernel-only page in there, but that's even worse. If we had a
>> separate cr3-like register for negative addresses, life would be good
>> :)
>>
>
> No, that wouldn't help. The situation is actually quite similar to the
> current situation where we have an unshared fourth level, but since the
> fourth entries are 512G per entry, we would have to push unsharing of
> the kernel address space at least one more level (1G), possibly two
> (2M). Painful.
>
> The main advantage of a separate cr3 would be that we wouldn't need the
> unshared top level for the kernel side.

Also, as is, if the top level wants to be per-cpu *and* per-task,
that's a big explosion of page tables that all need to stay in sync.

Oh well.

--Andy

2012-02-21 19:57:07

by H. Peter Anvin

Subject: Re: [PATCH 30/30] x32: Add x32 VDSO support

On 02/21/2012 11:51 AM, Andrew Lutomirski wrote:
>>
>> No, that wouldn't help. The situation is actually quite similar to the
>> current situation where we have an unshared fourth level, but since the
>> fourth entries are 512G per entry, we would have to push unsharing of
>> the kernel address space at least one more level (1G), possibly two
>> (2M). Painful.
>>
>> The main advantage of a separate cr3 would be that we wouldn't need the
>> unshared top level for the kernel side.
>
> Also, as is, if the top level wants to be per-cpu *and* per-task,
> that's a big explosion of page tables that all need to stay in sync.
>

Fair enough.

-hpa

2012-02-22 12:23:22

by Arnd Bergmann

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On Monday 20 February 2012, H. Peter Anvin wrote:
> We are using __u64 as x86-32 compatible since we are sharing most of the
> really complex path (like ioctl) with i386 much more so than x86-64. So
> it is defined in userspace as:
>
> typedef unsigned long long __u64 __attribute__((aligned(4)));
>
> __aligned_u64 obviously is naturally aligned, which matches uint64_t in
> userspace.

Has someone audited the interfaces to check if there are data structures that
use a plain signed or unsigned "long long" instead of __s64/__u64 in places
where i386 differs from the other compat implementations?

I found DRM_IOCTL_UPDATE_DRAW, but there could be more like this one.

Arnd

2012-02-22 13:48:21

by Jiri Kosina

Subject: Re: [PATCH 10/30] compat: Use COMPAT_USE_64BIT_TIME in the Bluetooth subsystem

On Tue, 21 Feb 2012, Gustavo Padovan wrote:

> > From: "H. J. Lu" <[email protected]>
> >
> > Enable the Bluetooth subsystem to be used with a compat ABI with
> > 64-bit time.
> >
> > Signed-off-by: H. Peter Anvin <[email protected]>
> > Cc: Marcel Holtmann <[email protected]>
> > Cc: Gustavo F. Padovan <[email protected]>
> > Cc: David S. Miller <[email protected]>
> > ---
> > net/bluetooth/hci_sock.c | 3 ++-
> > 1 files changed, 2 insertions(+), 1 deletions(-)
>
> Applied, thanks.

How much sense does it make to apply this single patch without the
previous x32 infrastructure changes?

--
Jiri Kosina
SUSE Labs

2012-02-22 14:46:17

by Gustavo Padovan

Subject: Re: [PATCH 10/30] compat: Use COMPAT_USE_64BIT_TIME in the Bluetooth subsystem

Hi Jiri,

* Jiri Kosina <[email protected]> [2012-02-22 14:47:29 +0100]:

> On Tue, 21 Feb 2012, Gustavo Padovan wrote:
>
> > > From: "H. J. Lu" <[email protected]>
> > >
> > > Enable the Bluetooth subsystem to be used with a compat ABI with
> > > 64-bit time.
> > >
> > > Signed-off-by: H. Peter Anvin <[email protected]>
> > > Cc: Marcel Holtmann <[email protected]>
> > > Cc: Gustavo F. Padovan <[email protected]>
> > > Cc: David S. Miller <[email protected]>
> > > ---
> > > net/bluetooth/hci_sock.c | 3 ++-
> > > 1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > Applied, thanks.
>
> How much sense does it make to apply this single patch without the
> previous x32 infrastructure changes?

No sense; I have already backed these changes out. That's what happens when
you try to apply patches in the middle of the Brazilian Carnival. :)

Gustavo

2012-02-22 18:14:48

by H. Peter Anvin

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On 02/22/2012 04:22 AM, Arnd Bergmann wrote:
> On Monday 20 February 2012, H. Peter Anvin wrote:
>> We are using __u64 as x86-32 compatible since we are sharing most of the
>> really complex path (like ioctl) with i386 much more so than x86-64. So
>> it is defined in userspace as:
>>
>> typedef unsigned long long __u64 __attribute__((aligned(4)));
>>
>> __aligned_u64 obviously is naturally aligned, which matches uint64_t in
>> userspace.
>
> Has someone audited the interfaces to check if there are data structures that
> use a plain signed or unsigned "long long" instead of __s64/__u64 in places
> where i386 differs from the other compat implementations?
>
> I found DRM_IOCTL_UPDATE_DRAW, but there could be more like this one.
>

Has someone audited every single ioctl in the kernel? Definitely not,
which is why x32 is marked EXPERIMENTAL. I still think it is time for
this work to move upstream, however.

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-22 21:24:26

by Arnd Bergmann

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On Wednesday 22 February 2012, H. Peter Anvin wrote:
> On 02/22/2012 04:22 AM, Arnd Bergmann wrote:
> > On Monday 20 February 2012, H. Peter Anvin wrote:
> >> We are using __u64 as x86-32 compatible since we are sharing most of the
> >> really complex path (like ioctl) with i386 much more so than x86-64. So
> >> it is defined in userspace as:
> >>
> >> typedef unsigned long long __u64 __attribute__((aligned(4)));
> >>
> >> __aligned_u64 obviously is naturally aligned, which matches uint64_t in
> >> userspace.
> >
> > Has someone audited the interfaces to check if there are data structures that
> > use a plain signed or unsigned "long long" instead of __s64/__u64 in places
> > where i386 differs from the other compat implementations?
> >
> > I found DRM_IOCTL_UPDATE_DRAW, but there could be more like this one.
> >
>
> Has someone audited every single ioctl in the kernel? Definitely not,
> which is why x32 is marked EXPERIMENTAL. I still think it is time for
> this work to move upstream, however.

Depends on how you want to do it. In some cases, the easiest answer
would be to change the data structure to use __u64 and be compatible
with i386. Once there are distros built using data structure with
padding around a long long, you have to use a run-time conditional
in the compat handler.

I'd say we should fix at least the ones that are easy to spot because
they already use compat_u64 or have an #ifdef CONFIG_X86_64 in compat
code. I've looked at everything I could find that fits into that category
and found only two locations. My expectation is that all other data
structures that would fall into this category are already broken
for 32 bit emulation on x86.

Signed-off-by: Arnd Bergmann <[email protected]>

diff --git a/include/drm/drm.h b/include/drm/drm.h
index 49d94ed..73b7c33 100644
--- a/include/drm/drm.h
+++ b/include/drm/drm.h
@@ -438,7 +438,7 @@ struct drm_update_draw {
drm_drawable_t handle;
unsigned int type;
unsigned int num;
- unsigned long long data;
+ __u64 data;
};

/**
diff --git a/include/sound/asound.h b/include/sound/asound.h
index a2e4ff5..a17e96c 100644
--- a/include/sound/asound.h
+++ b/include/sound/asound.h
@@ -824,8 +824,8 @@ struct snd_ctl_elem_value {
long *value_ptr; /* obsoleted */
} integer;
union {
- long long value[64];
- long long *value_ptr; /* obsoleted */
+ __s64 value[64];
+ __s64 *value_ptr; /* obsoleted */
} integer64;
union {
unsigned int item[128];
Arnd

2012-02-22 21:55:41

by H. Peter Anvin

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On 02/22/2012 01:23 PM, Arnd Bergmann wrote:
>
> Depends on how you want to do it. In some cases, the easiest answer
> would be to change the data structure to use __u64 and be compatible
> with i386. Once there are distros built using data structure with
> padding around a long long, you have to use a run-time conditional
> in the compat handler.
>

That we'd like to avoid, but as you said, they'd be broken or might
already be broken.

> I'd say we should fix at least the ones that are easy to spot because
> they already use compat_u64 or have an #ifdef CONFIG_X86_64 in compat
> code. I've looked at everything I could find that fits into that category
> and found only two locations. My expectation is that all other data
> structures that would fall into this category are already broken
> for 32 bit emulation on x86.
>
> Signed-off-by: Arnd Bergmann <[email protected]>
>
> diff --git a/include/drm/drm.h b/include/drm/drm.h
> index 49d94ed..73b7c33 100644
> --- a/include/drm/drm.h
> +++ b/include/drm/drm.h
> @@ -438,7 +438,7 @@ struct drm_update_draw {
> drm_drawable_t handle;
> unsigned int type;
> unsigned int num;
> - unsigned long long data;
> + __u64 data;
> };
>
> /**
> diff --git a/include/sound/asound.h b/include/sound/asound.h
> index a2e4ff5..a17e96c 100644
> --- a/include/sound/asound.h
> +++ b/include/sound/asound.h
> @@ -824,8 +824,8 @@ struct snd_ctl_elem_value {
> long *value_ptr; /* obsoleted */
> } integer;
> union {
> - long long value[64];
> - long long *value_ptr; /* obsoleted */
> + __s64 value[64];
> + __s64 *value_ptr; /* obsoleted */
> } integer64;
> union {
> unsigned int item[128];
> Arnd

Right, those are good starts.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
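
The padding problem discussed above can be seen by compiling the same structure with two different alignment rules for the 64-bit member. The following is a minimal sketch, not kernel code: the `demo_*` names are hypothetical, `demo_drawable_t` is assumed to be a plain `unsigned int`, and `demo_compat_u64` only imitates the idea of a 4-byte-aligned 64-bit type such as the `compat_u64` mentioned in the thread. It assumes GCC or Clang on a target that aligns `unsigned long long` to 8 bytes, as x86-64 does.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the drawable handle is a plain unsigned int. */
typedef unsigned int demo_drawable_t;

/* Hypothetical 64-bit type that only needs 4-byte alignment,
 * which is how the i386 ABI lays out a long long inside a struct. */
typedef unsigned long long __attribute__((aligned(4))) demo_compat_u64;

/* Layout with the compiler's native rule for unsigned long long. */
struct demo_update_draw_native {
	demo_drawable_t handle;
	unsigned int type;
	unsigned int num;
	unsigned long long data;
};

/* Layout with the i386-style 4-byte alignment for the 64-bit member. */
struct demo_update_draw_i386 {
	demo_drawable_t handle;
	unsigned int type;
	unsigned int num;
	demo_compat_u64 data;
};

/* Three 4-byte members, then the 64-bit member with no padding. */
static_assert(offsetof(struct demo_update_draw_i386, data) == 12,
	      "i386-style layout places data right after num");
static_assert(sizeof(struct demo_update_draw_i386) == 20,
	      "i386-style layout has no padding");

/* Offset of the 64-bit member under the native alignment rule. */
size_t demo_native_data_offset(void)
{
	return offsetof(struct demo_update_draw_native, data);
}

/* Total size of the structure under the native alignment rule. */
size_t demo_native_size(void)
{
	return sizeof(struct demo_update_draw_native);
}
```

On a target with 8-byte alignment for 64-bit integers, the native variant gains 4 bytes of padding before `data`, so the two layouts disagree on both the member offset and the structure size. That mismatch is what a compat handler has to account for.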

2012-02-23 04:50:12

by H. Peter Anvin

Subject: [tip:x86/x32] x32: Drop non-__vdso weak symbols from the x32 VDSO

Commit-ID: 862ae3132dc393ab6ea750b9ee9e0e1c276b9abb
Gitweb: http://git.kernel.org/tip/862ae3132dc393ab6ea750b9ee9e0e1c276b9abb
Author: H. Peter Anvin <[email protected]>
AuthorDate: Wed, 22 Feb 2012 20:37:10 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Wed, 22 Feb 2012 20:40:07 -0800

x32: Drop non-__vdso weak symbols from the x32 VDSO

Drop the legacy weak symbols that don't carry the __vdso prefix from
the x32 VDSO. This is a new ABI and we don't need to support that
legacy; the actual libc will export the proper symbols.

Suggested-by: Andy Lutomirski <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: H. J. Lu <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/vdso/vdsox32.lds.S | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/arch/x86/vdso/vdsox32.lds.S b/arch/x86/vdso/vdsox32.lds.S
index 373ca9a..62272aa 100644
--- a/arch/x86/vdso/vdsox32.lds.S
+++ b/arch/x86/vdso/vdsox32.lds.S
@@ -17,13 +17,9 @@
VERSION {
LINUX_2.6 {
global:
- clock_gettime;
__vdso_clock_gettime;
- gettimeofday;
__vdso_gettimeofday;
- getcpu;
__vdso_getcpu;
- time;
__vdso_time;
local: *;
};
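
For readers unfamiliar with the term, a weak symbol of the kind dropped here is typically a second, unprefixed name bound to the same code as the prefixed one. The sketch below illustrates that mechanism with GCC's `weak` and `alias` attributes on an ELF target; the `demo_*` names and the fixed return value are hypothetical and are not taken from the VDSO sources.

```c
#include <stddef.h>

/* Hypothetical stand-in for a prefixed entry point such as __vdso_time.
 * It returns a fixed value so the sketch stays deterministic. */
long demo_vdso_time(long *t)
{
	long now = 42;

	if (t != NULL)
		*t = now;
	return now;
}

/* Legacy-style unprefixed name: a weak alias for the same function.
 * Being weak, it gives way to any strong definition of demo_time
 * that another object file provides at link time. */
long demo_time(long *t) __attribute__((weak, alias("demo_vdso_time")));
```

Calling `demo_time()` runs the body of `demo_vdso_time()`. Removing the alias declaration leaves only the prefixed name exported, which is the effect the linker-script change above has on the x32 VDSO version list.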

2012-02-23 10:56:03

by Ingo Molnar

Subject: Re: [tip:x86/x32] x32: Drop non-__vdso weak symbols from the x32 VDSO


* tip-bot for H. Peter Anvin <[email protected]> wrote:

> Commit-ID: 862ae3132dc393ab6ea750b9ee9e0e1c276b9abb
> Gitweb: http://git.kernel.org/tip/862ae3132dc393ab6ea750b9ee9e0e1c276b9abb
> Author: H. Peter Anvin <[email protected]>
> AuthorDate: Wed, 22 Feb 2012 20:37:10 -0800
> Committer: H. Peter Anvin <[email protected]>
> CommitDate: Wed, 22 Feb 2012 20:40:07 -0800
>
> x32: Drop non-__vdso weak symbols from the x32 VDSO

One of the recent x32 commits broke the build on some configs:

VDSO arch/x86/vdso/vdsox32.so.dbg
/usr/bin/ld: arch/x86/vdso/vgetcpu-x32.o: relocation R_X86_64_64
against symbol `.rodata' isn't supported in x32 mode
arch/x86/vdso/vgetcpu-x32.o: could not read symbols: Bad value
collect2: error: ld returned 1 exit status

config attached.

binutils-2.21.51, gcc-4.7.0.

Thanks,

Ingo


Attachments:
(No filename) (821.00 B)
config (74.58 kB)

2012-02-23 14:37:07

by H. Peter Anvin

Subject: Re: [tip:x86/x32] x32: Drop non-__vdso weak symbols from the x32 VDSO

On 02/23/2012 02:55 AM, Ingo Molnar wrote:
>
> * tip-bot for H. Peter Anvin <[email protected]> wrote:
>
>> Commit-ID: 862ae3132dc393ab6ea750b9ee9e0e1c276b9abb
>> Gitweb: http://git.kernel.org/tip/862ae3132dc393ab6ea750b9ee9e0e1c276b9abb
>> Author: H. Peter Anvin <[email protected]>
>> AuthorDate: Wed, 22 Feb 2012 20:37:10 -0800
>> Committer: H. Peter Anvin <[email protected]>
>> CommitDate: Wed, 22 Feb 2012 20:40:07 -0800
>>
>> x32: Drop non-__vdso weak symbols from the x32 VDSO
>
> One of the recent x32 commits broke the build on some configs:
>
> VDSO arch/x86/vdso/vdsox32.so.dbg
> /usr/bin/ld: arch/x86/vdso/vgetcpu-x32.o: relocation R_X86_64_64
> against symbol `.rodata' isn't supported in x32 mode
> arch/x86/vdso/vgetcpu-x32.o: could not read symbols: Bad value
> collect2: error: ld returned 1 exit status
>
> config attached.
>
> binutils-2.21.51, gcc-4.7.0.
>
> Thanks,
>
> Ingo

Yes, this is known: binutils-2.21.51 has some x32 support, but that
version of the ABI is too old for what the kernel needs; in other words
it's not functional.

-hpa

2012-02-24 02:34:20

by H. Peter Anvin

Subject: Re: [PATCH 02/30] x86-64: Use explicit sizes in sigcontext.h, prepare for x32

On 02/22/2012 01:23 PM, Arnd Bergmann wrote:
>
> I'd say we should fix at least the ones that are easy to spot because
> they already use compat_u64 or have an #ifdef CONFIG_X86_64 in compat
> code. I've looked at everything I could find that fits into that category
> and found only two locations. My expectation is that all other data
> structures that would fall into this category are already broken
> for 32 bit emulation on x86.
>
> Signed-off-by: Arnd Bergmann <[email protected]>
>

Would you mind sending these as patches to the respective maintainers?
Since they don't depend on the x32 patchset per se it's probably the
best way to handle this.

Feel free to include my Acked-by: H. Peter Anvin <[email protected]>.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-02-27 22:23:08

by H. Peter Anvin

Subject: [tip:x86/x32] x32: Warn and disable rather than error if binutils too old

Commit-ID: 0bf6276392e990dd0da0ccd8e10f42597d503f29
Gitweb: http://git.kernel.org/tip/0bf6276392e990dd0da0ccd8e10f42597d503f29
Author: H. Peter Anvin <[email protected]>
AuthorDate: Mon, 27 Feb 2012 14:09:10 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Mon, 27 Feb 2012 14:09:10 -0800

x32: Warn and disable rather than error if binutils too old

If X32 is enabled in .config, but the binutils can't build it, issue a
warning and disable the feature rather than erroring out.

In order to support this, have CONFIG_X86_X32 be the option set in
Kconfig, and CONFIG_X86_X32_ABI be the option set by the Makefile when
it is enabled and binutils has been found to be functional.

Requested-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: H. J. Lu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/x86/Kconfig | 4 ++--
arch/x86/Makefile | 16 ++++++++++++++++
2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c9d6c9e..e2b38b4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2175,7 +2175,7 @@ config IA32_AOUT
---help---
Support old a.out binaries in the 32bit emulation.

-config X86_X32_ABI
+config X86_X32
bool "x32 ABI for 64-bit mode (EXPERIMENTAL)"
depends on X86_64 && IA32_EMULATION && EXPERIMENTAL
---help---
@@ -2190,7 +2190,7 @@ config X86_X32_ABI

config COMPAT
def_bool y
- depends on IA32_EMULATION || X86_X32_ABI
+ depends on IA32_EMULATION || X86_X32

config COMPAT_FOR_U64_ALIGNMENT
def_bool COMPAT
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 209ba12..31bb1eb 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -82,6 +82,22 @@ ifdef CONFIG_CC_STACKPROTECTOR
endif
endif

+ifdef CONFIG_X86_X32
+ x32_ld_ok := $(call try-run,\
+ /bin/echo -e '1: .quad 1b' | \
+ $(CC) $(KBUILD_AFLAGS) -c -xassembler -o "$$TMP" - && \
+ $(OBJCOPY) -O elf32-x86-64 "$$TMP" "$$TMPO" && \
+ $(LD) -m elf32_x86_64 "$$TMPO" -o "$$TMP",y,n)
+ ifeq ($(x32_ld_ok),y)
+ CONFIG_X86_X32_ABI := y
+ KBUILD_AFLAGS += -DCONFIG_X86_X32_ABI
+ KBUILD_CFLAGS += -DCONFIG_X86_X32_ABI
+ else
+ $(warning CONFIG_X86_X32 enabled but no binutils support)
+ endif
+endif
+export CONFIG_X86_X32_ABI
+
# Don't unroll struct assignments with kmemcheck enabled
ifeq ($(CONFIG_KMEMCHECK),y)
KBUILD_CFLAGS += $(call cc-option,-fno-builtin-memcpy)
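
Because the Makefile passes `-DCONFIG_X86_X32_ABI` on the compiler command line only when the binutils probe succeeds, C code can test that macro with an ordinary `#ifdef`. The following is a minimal sketch of that pattern; `demo_x32_abi_enabled` is a hypothetical helper and does not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether this translation unit was
 * compiled with -DCONFIG_X86_X32_ABI, i.e. whether the build system
 * decided that x32 support can actually be built. */
bool demo_x32_abi_enabled(void)
{
#ifdef CONFIG_X86_X32_ABI
	return true;
#else
	return false;
#endif
}
```

Compiled without the flag, the helper returns false, which corresponds to the "warn and disable" case: CONFIG_X86_X32 may be set in .config, yet x32-specific code stays compiled out.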

2012-02-28 09:50:30

by Ingo Molnar

Subject: [tip:x86/x32] x86/x32: Fix the binutils auto-detect

Commit-ID: 8bd69c2d5f9c0b5237c632d1b21dbfe4fd16ba6b
Gitweb: http://git.kernel.org/tip/8bd69c2d5f9c0b5237c632d1b21dbfe4fd16ba6b
Author: Ingo Molnar <[email protected]>
AuthorDate: Tue, 28 Feb 2012 10:35:06 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 28 Feb 2012 10:35:06 +0100

x86/x32: Fix the binutils auto-detect

Fix:

arch/x86/Makefile:96: *** recipe commences before first target. Stop.

Cc: H. Peter Anvin <[email protected]>
Cc: H. J. Lu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/Makefile | 14 +++++++-------
1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 31bb1eb..968dbe2 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -88,13 +88,13 @@ ifdef CONFIG_X86_X32
$(CC) $(KBUILD_AFLAGS) -c -xassembler -o "$$TMP" - && \
$(OBJCOPY) -O elf32-x86-64 "$$TMP" "$$TMPO" && \
$(LD) -m elf32_x86_64 "$$TMPO" -o "$$TMP",y,n)
- ifeq ($(x32_ld_ok),y)
- CONFIG_X86_X32_ABI := y
- KBUILD_AFLAGS += -DCONFIG_X86_X32_ABI
- KBUILD_CFLAGS += -DCONFIG_X86_X32_ABI
- else
- $(warning CONFIG_X86_X32 enabled but no binutils support)
- endif
+ ifeq ($(x32_ld_ok),y)
+ CONFIG_X86_X32_ABI := y
+ KBUILD_AFLAGS += -DCONFIG_X86_X32_ABI
+ KBUILD_CFLAGS += -DCONFIG_X86_X32_ABI
+ else
+ $(warning CONFIG_X86_X32 enabled but no binutils support)
+ endif
endif
export CONFIG_X86_X32_ABI