2016-10-21 21:06:05

by Yury Norov

Subject: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64

This series enables aarch64 with ilp32 mode and, as supporting work,
introduces the ARCH_32BIT_OFF_T configuration option, which is enabled
for existing 32-bit architectures but disabled for new arches (so 64-bit
off_t is used by new userspace).

This version is based on kernel v4.9-rc1. It works with glibc-2.24
and has been tested with LTP.

This version contains ABI changes and should be used with the new glibc
version. See links below.

This is an RFC because there is still no agreement on which way of
delousing the top halves of registers we prefer, and the choice affects
the ABI. In this patchset, the top halves of x0-x7 (the registers holding
w0-w7) are cleared for each syscall in the assembler entry.

The alternative approach is to introduce compat wrappers, which is a
little faster for natively routed syscalls (~2.6% for a syscall with
no payload) but much more complicated.
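
The delousing described above can be modeled in userspace as follows. This is only an illustrative sketch and not code from the series: the real clearing happens in the assembler syscall entry, and the names model_delouse_args and MODEL_NR_SYSCALL_ARGS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not kernel code: the eight syscall
 * argument registers x0-x7 are represented as a plain array.
 */
#define MODEL_NR_SYSCALL_ARGS 8

/* Clear bits 63:32 of every argument register, keeping w0-w7 intact. */
static void model_delouse_args(uint64_t regs[MODEL_NR_SYSCALL_ARGS])
{
	assert(regs != NULL);

	for (size_t i = 0; i < MODEL_NR_SYSCALL_ARGS; i++)
		regs[i] &= UINT64_C(0xffffffff);
}
```

Clearing all eight registers unconditionally costs a few instructions per syscall, but no per-syscall knowledge of argument types is needed, which is the trade-off against the wrapper approach mentioned above.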

Patch 1 may be applied separately from the other patches of the series.

v3: https://lkml.org/lkml/2014/9/3/704
v4: https://lkml.org/lkml/2015/4/13/691
v5: https://lkml.org/lkml/2015/9/29/911
v6: https://lkml.org/lkml/2016/5/23/661
v7: RFC nowrap: https://lkml.org/lkml/2016/6/17/990
v7: RFC2 nowrap: https://lkml.org/lkml/2016/8/17/245
v7: RFC3 nowrap: https://lkml.org/lkml/2016/8/17/245
- rebased on kernel 4.9-rc1;
- setrlimit(), getrlimit() special handling is dropped.
  rlim_t is still 64-bit, but glibc is forced to use sys_prlimit64(),
  and redirection is not needed anymore;
- sys_stat() and sys_stat64() redirection is dropped. Glibc defines
  aarch32-compatible struct stat instead;
- sys_fcntl() redirection is dropped. Glibc sets proper definitions for
  requests instead;
- renameat() is disabled for aarch64/ilp32. Glibc is forced to use renameat2();
- __ARCH_WANT_SYNC_FILE_RANGE2 is enabled for aarch64/ilp32 to force it to
  use sys_sync_file_range2 instead of sys_sync_file_range, like aarch32;
- VDSO code is refactored. Version is switched to 4.9;
- comments and documentation are revised;
- checkpatch.pl errors are fixed.

Links:
Kernel: https://github.com/norov/linux/tree/ilp32-v4.9
glibc: https://github.com/norov/glibc/tree/ilp32-2.24-dev1

Andrew Pinski (6):
arm64: rename COMPAT to AARCH32_EL0 in Kconfig
arm64: ensure the kernel is compiled for LP64
arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64
arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it
arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext
arm64:ilp32: add ARM64_ILP32 to Kconfig

Philipp Tomsich (1):
arm64:ilp32: add vdso-ilp32 and use for signal return

Yury Norov (11):
32-bit ABI: introduce ARCH_32BIT_OFF_T config option
arm64: ilp32: add documentation on the ILP32 ABI for ARM64
thread: move thread bits accessors to separated file
arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat)
arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64
arm64: introduce binfmt_elf32.c
arm64: ilp32: introduce binfmt_ilp32.c
arm64: ilp32: share aarch32 syscall handlers
arm64: signal: share lp64 signal routines to ilp32
arm64: signal32: move ilp32 and aarch32 common code to separated file
arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

Documentation/arm64/ilp32.txt | 46 +++++++
arch/Kconfig | 4 +
arch/arc/Kconfig | 1 +
arch/arm/Kconfig | 1 +
arch/arm64/Kconfig | 19 ++-
arch/arm64/Makefile | 5 +
arch/arm64/include/asm/compat.h | 19 +--
arch/arm64/include/asm/elf.h | 29 +++--
arch/arm64/include/asm/fpsimd.h | 2 +-
arch/arm64/include/asm/ftrace.h | 2 +-
arch/arm64/include/asm/hwcap.h | 6 +-
arch/arm64/include/asm/is_compat.h | 90 +++++++++++++
arch/arm64/include/asm/memory.h | 5 +-
arch/arm64/include/asm/processor.h | 11 +-
arch/arm64/include/asm/ptrace.h | 2 +-
arch/arm64/include/asm/seccomp.h | 2 +-
arch/arm64/include/asm/signal32.h | 9 +-
arch/arm64/include/asm/signal32_common.h | 27 ++++
arch/arm64/include/asm/signal_common.h | 33 +++++
arch/arm64/include/asm/signal_ilp32.h | 38 ++++++
arch/arm64/include/asm/syscall.h | 2 +-
arch/arm64/include/asm/thread_info.h | 4 +-
arch/arm64/include/asm/unistd.h | 8 +-
arch/arm64/include/asm/unistd32.h | 2 +-
arch/arm64/include/asm/vdso.h | 6 +
arch/arm64/include/uapi/asm/bitsperlong.h | 9 +-
arch/arm64/include/uapi/asm/unistd.h | 12 ++
arch/arm64/kernel/Makefile | 18 ++-
arch/arm64/kernel/asm-offsets.c | 9 +-
arch/arm64/kernel/binfmt_elf32.c | 31 +++++
arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++
arch/arm64/kernel/cpufeature.c | 8 +-
arch/arm64/kernel/cpuinfo.c | 20 +--
arch/arm64/kernel/entry.S | 34 ++++-
arch/arm64/kernel/entry32.S | 80 ------------
arch/arm64/kernel/entry32_common.S | 107 ++++++++++++++++
arch/arm64/kernel/entry_ilp32.S | 22 ++++
arch/arm64/kernel/head.S | 2 +-
arch/arm64/kernel/hw_breakpoint.c | 10 +-
arch/arm64/kernel/perf_regs.c | 2 +-
arch/arm64/kernel/process.c | 7 +-
arch/arm64/kernel/ptrace.c | 110 ++++++++++++++--
arch/arm64/kernel/signal.c | 102 +++++++++------
arch/arm64/kernel/signal32.c | 107 ----------------
arch/arm64/kernel/signal32_common.c | 135 ++++++++++++++++++++
arch/arm64/kernel/signal_ilp32.c | 174 ++++++++++++++++++++++++++
arch/arm64/kernel/sys32.c | 1 +
arch/arm64/kernel/sys_ilp32.c | 100 +++++++++++++++
arch/arm64/kernel/traps.c | 5 +-
arch/arm64/kernel/vdso-ilp32/.gitignore | 2 +
arch/arm64/kernel/vdso-ilp32/Makefile | 74 +++++++++++
arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S | 33 +++++
arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S | 95 ++++++++++++++
arch/arm64/kernel/vdso.c | 70 +++++++++--
arch/arm64/kernel/vdso/gettimeofday.S | 20 ++-
arch/arm64/kernel/vdso/vdso.S | 6 +-
arch/blackfin/Kconfig | 1 +
arch/cris/Kconfig | 1 +
arch/frv/Kconfig | 1 +
arch/h8300/Kconfig | 1 +
arch/hexagon/Kconfig | 1 +
arch/m32r/Kconfig | 1 +
arch/m68k/Kconfig | 1 +
arch/metag/Kconfig | 1 +
arch/microblaze/Kconfig | 1 +
arch/mips/Kconfig | 1 +
arch/mn10300/Kconfig | 1 +
arch/nios2/Kconfig | 1 +
arch/openrisc/Kconfig | 1 +
arch/parisc/Kconfig | 1 +
arch/powerpc/Kconfig | 1 +
arch/score/Kconfig | 1 +
arch/sh/Kconfig | 1 +
arch/sparc/Kconfig | 1 +
arch/tile/Kconfig | 1 +
arch/tile/kernel/compat.c | 3 +
arch/unicore32/Kconfig | 1 +
arch/x86/Kconfig | 1 +
arch/x86/um/Kconfig | 1 +
arch/xtensa/Kconfig | 1 +
drivers/clocksource/arm_arch_timer.c | 2 +-
include/linux/fcntl.h | 2 +-
include/linux/ptrace.h | 6 +
include/linux/thread_bits.h | 54 ++++++++
include/linux/thread_info.h | 44 +------
include/uapi/asm-generic/unistd.h | 5 +-
kernel/ptrace.c | 10 +-
87 files changed, 1635 insertions(+), 389 deletions(-)
create mode 100644 Documentation/arm64/ilp32.txt
create mode 100644 arch/arm64/include/asm/is_compat.h
create mode 100644 arch/arm64/include/asm/signal32_common.h
create mode 100644 arch/arm64/include/asm/signal_common.h
create mode 100644 arch/arm64/include/asm/signal_ilp32.h
create mode 100644 arch/arm64/kernel/binfmt_elf32.c
create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
create mode 100644 arch/arm64/kernel/entry32_common.S
create mode 100644 arch/arm64/kernel/entry_ilp32.S
create mode 100644 arch/arm64/kernel/signal32_common.c
create mode 100644 arch/arm64/kernel/signal_ilp32.c
create mode 100644 arch/arm64/kernel/sys_ilp32.c
create mode 100644 arch/arm64/kernel/vdso-ilp32/.gitignore
create mode 100644 arch/arm64/kernel/vdso-ilp32/Makefile
create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
create mode 100644 include/linux/thread_bits.h

--
2.7.4


2016-10-21 20:35:23

by Yury Norov

Subject: [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c

binfmt_ilp32.c is needed to handle ILP32 binaries.

Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Bamvor Zhang Jian <[email protected]>
---
arch/arm64/include/asm/elf.h | 6 +++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 104 insertions(+)
create mode 100644 arch/arm64/kernel/binfmt_ilp32.c

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index f259fe8..be29dde 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,

#define COMPAT_ELF_ET_DYN_BASE (2 * TASK_SIZE_32 / 3)

+#ifndef USE_AARCH64_GREG
/* AArch32 registers. */
#define COMPAT_ELF_NGREG 18
typedef unsigned int compat_elf_greg_t;
typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
+#else /* AArch64 registers for AARCH64/ILP32 */
+#define COMPAT_ELF_NGREG ELF_NGREG
+#define compat_elf_greg_t elf_greg_t
+#define compat_elf_gregset_t elf_gregset_t
+#endif

/* AArch32 EABI. */
#define EF_ARM_EABI_MASK 0xff000000
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index abe5040..f661888 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,6 +29,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE

arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o binfmt_elf32.o
+arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
diff --git a/arch/arm64/kernel/binfmt_ilp32.c b/arch/arm64/kernel/binfmt_ilp32.c
new file mode 100644
index 0000000..759066e
--- /dev/null
+++ b/arch/arm64/kernel/binfmt_ilp32.c
@@ -0,0 +1,97 @@
+/*
+ * Support for ILP32 Linux/aarch64 ELF binaries.
+ */
+#define USE_AARCH64_GREG
+
+#include <linux/elfcore-compat.h>
+#include <linux/time.h>
+
+#undef ELF_CLASS
+#define ELF_CLASS ELFCLASS32
+
+#undef elfhdr
+#undef elf_phdr
+#undef elf_shdr
+#undef elf_note
+#undef elf_addr_t
+#define elfhdr elf32_hdr
+#define elf_phdr elf32_phdr
+#define elf_shdr elf32_shdr
+#define elf_note elf32_note
+#define elf_addr_t Elf32_Addr
+
+/*
+ * Some data types as stored in coredump.
+ */
+#define user_long_t compat_long_t
+#define user_siginfo_t compat_siginfo_t
+#define copy_siginfo_to_user copy_siginfo_to_user32
+
+/*
+ * The machine-dependent core note format types are defined in elfcore-compat.h,
+ * which requires asm/elf.h to define compat_elf_gregset_t et al.
+ */
+#define elf_prstatus compat_elf_prstatus
+#define elf_prpsinfo compat_elf_prpsinfo
+
+/*
+ * Compat version of cputime_to_compat_timeval, perhaps this
+ * should be an inline in <linux/compat.h>.
+ */
+static void cputime_to_compat_timeval(const cputime_t cputime,
+ struct compat_timeval *value)
+{
+ struct timeval tv;
+
+ cputime_to_timeval(cputime, &tv);
+ value->tv_sec = tv.tv_sec;
+ value->tv_usec = tv.tv_usec;
+}
+
+#undef cputime_to_timeval
+#define cputime_to_timeval cputime_to_compat_timeval
+
+/* AARCH64 ILP32 EABI. */
+#undef elf_check_arch
+#define elf_check_arch(x) (((x)->e_machine == EM_AARCH64) \
+ && (x)->e_ident[EI_CLASS] == ELFCLASS32)
+
+#undef SET_PERSONALITY
+#define SET_PERSONALITY(ex) \
+do { \
+ set_thread_flag(TIF_32BIT_AARCH64); \
+ clear_thread_flag(TIF_32BIT); \
+} while (0)
+
+#undef ARCH_DLINFO
+#define ARCH_DLINFO \
+do { \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, \
+ (elf_addr_t)(long)current->mm->context.vdso); \
+} while (0)
+
+#undef ELF_PLATFORM
+#ifdef __AARCH64EB__
+#define ELF_PLATFORM ("aarch64_be:ilp32")
+#else
+#define ELF_PLATFORM ("aarch64:ilp32")
+#endif
+
+#undef ELF_ET_DYN_BASE
+#define ELF_ET_DYN_BASE COMPAT_ELF_ET_DYN_BASE
+
+#undef ELF_HWCAP
+#undef ELF_HWCAP2
+#define ELF_HWCAP ((u32) elf_hwcap)
+#define ELF_HWCAP2 ((u32) (elf_hwcap >> 32))
+
+/*
+ * Rename a few of the symbols that binfmt_elf.c will define.
+ * These are all local so the names don't really matter, but it
+ * might make some debugging less confusing not to duplicate them.
+ */
+#define elf_format compat_elf_format
+#define init_elf_binfmt init_compat_elf_binfmt
+#define exit_elf_binfmt exit_compat_elf_binfmt
+
+#include "../../../fs/binfmt_elf.c"
--
2.7.4
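
For readers following the patch above, here is a rough userspace sketch of two of its macros: the elf_check_arch() test for an ILP32 binary and the split of the 64-bit elf_hwcap word into ELF_HWCAP and ELF_HWCAP2. All model_* names and the trimmed header struct are hypothetical; the numeric constants are the standard ELF values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard ELF values, prefixed MODEL_ to avoid clashing with <elf.h>. */
#define MODEL_EI_CLASS		4
#define MODEL_ELFCLASS32	1
#define MODEL_ELFCLASS64	2
#define MODEL_EM_ARM		40
#define MODEL_EM_AARCH64	183

/* Hypothetical, trimmed header: only the fields the check looks at. */
struct model_elf_hdr {
	unsigned char e_ident[16];
	uint16_t e_machine;
};

/* Mirrors the patch: AArch64 machine type combined with a 32-bit class. */
static bool model_is_ilp32_binary(const struct model_elf_hdr *hdr)
{
	assert(hdr != NULL);

	return hdr->e_machine == MODEL_EM_AARCH64 &&
	       hdr->e_ident[MODEL_EI_CLASS] == MODEL_ELFCLASS32;
}

/* Low 32 bits, as in ELF_HWCAP. */
static uint32_t model_hwcap_lo(uint64_t hwcap)
{
	return (uint32_t)hwcap;
}

/* High 32 bits, as in ELF_HWCAP2. */
static uint32_t model_hwcap_hi(uint64_t hwcap)
{
	return (uint32_t)(hwcap >> 32);
}
```

An AArch32 binary (EM_ARM, ELFCLASS32) and an LP64 binary (EM_AARCH64, ELFCLASS64) both fail this check, so each ELF flavour ends up with exactly one matching loader.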

2016-10-21 20:48:12

by Yury Norov

Subject: [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64

Based on Andrew Pinski's patch-series.

Signed-off-by: Yury Norov <[email protected]>
---
Documentation/arm64/ilp32.txt | 46 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
create mode 100644 Documentation/arm64/ilp32.txt

diff --git a/Documentation/arm64/ilp32.txt b/Documentation/arm64/ilp32.txt
new file mode 100644
index 0000000..b96c18f
--- /dev/null
+++ b/Documentation/arm64/ilp32.txt
@@ -0,0 +1,46 @@
+ILP32 AARCH64 SYSCALL ABI
+=========================
+
+This document describes the ILP32 syscall ABI and where it differs
+from the generic compat linux syscall interface.
+
+AARCH64/ILP32 userspace can potentially access top halves of registers that
+are passed as syscall arguments, so such registers (w0-w7) are deloused.
+
+In AARCH64/ILP32 the following types are 64-bit (unlike in AARCH32):
+ino_t is u64 type.
+off_t is s64 type.
+blkcnt_t is s64 type.
+fsblkcnt_t is u64 type.
+fsfilcnt_t is u64 type.
+rlim_t is u64 type.
+
+AARCH64/ILP32 ABI uses standard syscall table which can be found at
+include/uapi/asm-generic/unistd.h, with the exceptions listed below.
+
+Syscalls which pass 64-bit values are handled by the code shared with
+AARCH32 and pass that value as a pair. The following syscalls are affected:
+fadvise64_64()
+fallocate()
+ftruncate64()
+pread64()
+pwrite64()
+readahead()
+sync_file_range()
+truncate64()
+sys_mmap()
+
+ptrace() syscall is handled by compat version.
+
+shmat() syscall is handled by non-compat handler as aarch64/ilp32 has no
+limitation on 4-pages alignment for shared memory.
+
+statfs() and fstatfs() take the size of struct statfs as an argument.
+It is calculated differently in kernel and user space, so the AARCH32
+handlers are used to handle it.
+
+struct rt_sigframe is redefined and contains struct compat_siginfo,
+as compat syscalls expect, and struct ilp32_sigframe, to handle the
+AARCH64 register set and 32-bit userspace register representation.
+
+elf_gregset_t is taken from lp64 to handle registers properly.
--
2.7.4
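
The documentation above says that 64-bit syscall arguments are passed as a pair. A minimal sketch of what joining and splitting such a pair looks like is given below. The helper names are hypothetical, and the assumption that the first slot carries the low half is made only for illustration; the actual register order is defined by the shared AARCH32 handlers, not by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild a 64-bit value (for example an offset)
 * from two 32-bit argument slots. Which slot holds the low half is an
 * assumption of this sketch.
 */
static int64_t model_join_pair(uint32_t lo, uint32_t hi)
{
	return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* The inverse operation, as a libc would do before issuing the syscall. */
static void model_split_pair(int64_t value, uint32_t *lo, uint32_t *hi)
{
	assert(lo != NULL && hi != NULL);

	*lo = (uint32_t)(uint64_t)value;
	*hi = (uint32_t)((uint64_t)value >> 32);
}
```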

2016-10-21 20:49:15

by Yury Norov

Subject: [PATCH 08/18] arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64

ILP32 tasks need to be distinguished from lp64 and aarch32 ones.
This patch adds the helper functions is_ilp32_compat_{task,thread} and
the thread flag TIF_32BIT_AARCH64 to do so. This is preparation
for the following patches in the ilp32 patchset.

For consistency, SET_PERSONALITY is changed here accordingly.

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Philipp Tomsich <[email protected]>
Signed-off-by: Christoph Muellner <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Reviewed-by: David Daney <[email protected]>
---
arch/arm64/include/asm/elf.h | 13 +++++++++++--
arch/arm64/include/asm/is_compat.h | 30 ++++++++++++++++++++++++++++--
arch/arm64/include/asm/thread_info.h | 2 ++
3 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 6a9049b..f259fe8 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -142,7 +142,11 @@ typedef struct user_fpsimd_state elf_fpregset_t;
*/
#define ELF_PLAT_INIT(_r, load_addr) (_r)->regs[0] = 0

-#define SET_PERSONALITY(ex) clear_thread_flag(TIF_32BIT);
+#define SET_PERSONALITY(ex) \
+do { \
+ clear_thread_flag(TIF_32BIT_AARCH64); \
+ clear_thread_flag(TIF_32BIT); \
+} while (0)

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
#define ARCH_DLINFO \
@@ -183,7 +187,12 @@ typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
((x)->e_flags & EF_ARM_EABI_MASK))

#define compat_start_thread compat_start_thread
-#define COMPAT_SET_PERSONALITY(ex) set_thread_flag(TIF_32BIT);
+#define COMPAT_SET_PERSONALITY(ex) \
+do { \
+ clear_thread_flag(TIF_32BIT_AARCH64); \
+ set_thread_flag(TIF_32BIT); \
+} while (0)
+
#define COMPAT_ARCH_DLINFO
extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
int uses_interp);
diff --git a/arch/arm64/include/asm/is_compat.h b/arch/arm64/include/asm/is_compat.h
index 8dba5ca..7726beb 100644
--- a/arch/arm64/include/asm/is_compat.h
+++ b/arch/arm64/include/asm/is_compat.h
@@ -45,18 +45,44 @@ static inline int is_a32_compat_thread(struct thread_info *thread)

#endif /* CONFIG_AARCH32_EL0 */

+#ifdef CONFIG_ARM64_ILP32
+
+static inline int is_ilp32_compat_task(void)
+{
+ return test_thread_flag(TIF_32BIT_AARCH64);
+}
+
+static inline int is_ilp32_compat_thread(struct thread_info *thread)
+{
+ return test_ti_thread_flag(thread, TIF_32BIT_AARCH64);
+}
+
+#else
+
+static inline int is_ilp32_compat_task(void)
+{
+ return 0;
+}
+
+static inline int is_ilp32_compat_thread(struct thread_info *thread)
+{
+ return 0;
+}
+
+#endif /* CONFIG_ARM64_ILP32 */
+
#ifdef CONFIG_COMPAT

static inline int is_compat_task(void)
{
- return is_a32_compat_task();
+ return is_a32_compat_task() || is_ilp32_compat_task();
}

#endif /* CONFIG_COMPAT */

static inline int is_compat_thread(struct thread_info *thread)
{
- return is_a32_compat_thread(thread);
+ return is_a32_compat_thread(thread) || is_ilp32_compat_thread(thread);
}


diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e12411f..680aca5 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -122,6 +122,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_RESTORE_SIGMASK 20
#define TIF_SINGLESTEP 21
#define TIF_32BIT 22 /* AARCH32 process */
+#define TIF_32BIT_AARCH64 23 /* 32 bit process on AArch64(ILP32) */

#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
@@ -133,6 +134,7 @@ static inline struct thread_info *current_thread_info(void)
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_32BIT (1 << TIF_32BIT)
+#define _TIF_32BIT_AARCH64 (1 << TIF_32BIT_AARCH64)

#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE)
--
2.7.4
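
To summarize how the two thread flags encode the three ABIs after this patch, here is a small userspace model. It is a sketch only: the bit numbers are taken from the patch, the ILP32 case follows the SET_PERSONALITY definition in the binfmt_ilp32.c patch of this series, and every model_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit numbers as in the patch; everything else is a hypothetical model. */
#define MODEL_TIF_32BIT		22	/* AArch32 process */
#define MODEL_TIF_32BIT_AARCH64	23	/* ILP32 process on AArch64 */

enum model_abi { MODEL_ABI_LP64, MODEL_ABI_AARCH32, MODEL_ABI_ILP32 };

struct model_thread {
	unsigned long flags;
};

/* Each personality sets at most one of the two flags and clears the other. */
static void model_set_personality(struct model_thread *t, enum model_abi abi)
{
	assert(t != NULL);

	t->flags &= ~((1UL << MODEL_TIF_32BIT) |
		      (1UL << MODEL_TIF_32BIT_AARCH64));
	if (abi == MODEL_ABI_AARCH32)
		t->flags |= 1UL << MODEL_TIF_32BIT;
	else if (abi == MODEL_ABI_ILP32)
		t->flags |= 1UL << MODEL_TIF_32BIT_AARCH64;
}

static bool model_is_a32_compat(const struct model_thread *t)
{
	return t->flags & (1UL << MODEL_TIF_32BIT);
}

static bool model_is_ilp32_compat(const struct model_thread *t)
{
	return t->flags & (1UL << MODEL_TIF_32BIT_AARCH64);
}

/* Compat now means "either 32-bit flavour", as in is_compat_thread(). */
static bool model_is_compat(const struct model_thread *t)
{
	return model_is_a32_compat(t) || model_is_ilp32_compat(t);
}
```

Clearing both flags before setting one is what keeps a stale flag from surviving when a task execs a binary of a different ABI.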

2016-10-21 20:49:35

by Yury Norov

Subject: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option

All new 32-bit architectures should have a 64-bit off_t type, but existing
architectures have a 32-bit one.

To handle this, a new config option, ARCH_32BIT_OFF_T, is added to
arch/Kconfig. It is disabled by default for non-64-bit architectures, and
all existing 32-bit architectures enable it explicitly here.

The new option affects force_o_largefile() behaviour. Namely, if off_t is
64 bits long, there is no reason to prevent the user from opening big files.

For the syscalls sys_openat() and sys_open_by_handle_at(),
force_o_largefile() is called to set the O_LARGEFILE flag, and this is the
only difference compared to the compat versions. All compat ABIs except
tile have already switched to 64-bit off_t, so the compat versions of these
syscalls are not needed anymore. Tile is handled explicitly.

Note that even if an architecture has only 64-bit off_t in the kernel
(arc, c6x, h8300, hexagon, metag, nios2, openrisc, tile32 and unicore32),
a libc may use 32-bit off_t, and therefore want to limit the file size
to 4GB unless specified differently in the open flags.

Signed-off-by: Yury Norov <[email protected]>
---
arch/Kconfig | 4 ++++
arch/arc/Kconfig | 1 +
arch/arm/Kconfig | 1 +
arch/blackfin/Kconfig | 1 +
arch/cris/Kconfig | 1 +
arch/frv/Kconfig | 1 +
arch/h8300/Kconfig | 1 +
arch/hexagon/Kconfig | 1 +
arch/m32r/Kconfig | 1 +
arch/m68k/Kconfig | 1 +
arch/metag/Kconfig | 1 +
arch/microblaze/Kconfig | 1 +
arch/mips/Kconfig | 1 +
arch/mn10300/Kconfig | 1 +
arch/nios2/Kconfig | 1 +
arch/openrisc/Kconfig | 1 +
arch/parisc/Kconfig | 1 +
arch/powerpc/Kconfig | 1 +
arch/score/Kconfig | 1 +
arch/sh/Kconfig | 1 +
arch/sparc/Kconfig | 1 +
arch/tile/Kconfig | 1 +
arch/tile/kernel/compat.c | 3 +++
arch/unicore32/Kconfig | 1 +
arch/x86/Kconfig | 1 +
arch/x86/um/Kconfig | 1 +
arch/xtensa/Kconfig | 1 +
include/linux/fcntl.h | 2 +-
include/uapi/asm-generic/unistd.h | 5 ++---
29 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 659bdd0..ec06a71 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -234,6 +234,10 @@ config ARCH_THREAD_STACK_ALLOCATOR
config ARCH_WANTS_DYNAMIC_TASK_STRUCT
bool

+config ARCH_32BIT_OFF_T
+ bool
+ depends on !64BIT
+
config HAVE_REGS_AND_STACK_ACCESS_API
bool
help
diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index ecd1237..3e8dfd6 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -9,6 +9,7 @@
config ARC
def_bool y
select ARCH_SUPPORTS_ATOMIC_RMW if ARC_HAS_LLSC
+ select ARCH_32BIT_OFF_T
select BUILDTIME_EXTABLE_SORT
select CLKSRC_OF
select CLONE_BACKWARDS
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b5d529f..ff8b8b2 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1,6 +1,7 @@
config ARM
bool
default y
+ select ARCH_32BIT_OFF_T
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig
index 3c1bd64..26418e7 100644
--- a/arch/blackfin/Kconfig
+++ b/arch/blackfin/Kconfig
@@ -12,6 +12,7 @@ config RWSEM_XCHGADD_ALGORITHM

config BLACKFIN
def_bool y
+ select ARCH_32BIT_OFF_T
select HAVE_ARCH_KGDB
select HAVE_ARCH_TRACEHOOK
select HAVE_DYNAMIC_FTRACE
diff --git a/arch/cris/Kconfig b/arch/cris/Kconfig
index 71b758d..8c059f0 100644
--- a/arch/cris/Kconfig
+++ b/arch/cris/Kconfig
@@ -50,6 +50,7 @@ config LOCKDEP_SUPPORT
config CRIS
bool
default y
+ select ARCH_32BIT_OFF_T
select HAVE_IDE
select GENERIC_ATOMIC64
select HAVE_UID16
diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig
index eefd9a4..2f14904 100644
--- a/arch/frv/Kconfig
+++ b/arch/frv/Kconfig
@@ -1,6 +1,7 @@
config FRV
bool
default y
+ select ARCH_32BIT_OFF_T
select HAVE_IDE
select HAVE_ARCH_TRACEHOOK
select HAVE_PERF_EVENTS
diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index 3ae8525..29bbcb1 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -1,5 +1,6 @@
config H8300
def_bool y
+ select ARCH_32BIT_OFF_T
select GENERIC_ATOMIC64
select HAVE_UID16
select VIRT_TO_BUS
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 1941e4b..bbcea8c 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -3,6 +3,7 @@ comment "Linux Kernel Configuration for Hexagon"

config HEXAGON
def_bool y
+ select ARCH_32BIT_OFF_T
select HAVE_OPROFILE
# Other pending projects/to-do items.
# select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/m32r/Kconfig b/arch/m32r/Kconfig
index 3cc8498..efa10d3 100644
--- a/arch/m32r/Kconfig
+++ b/arch/m32r/Kconfig
@@ -1,6 +1,7 @@
config M32R
bool
default y
+ select ARCH_32BIT_OFF_T
select HAVE_IDE
select HAVE_OPROFILE
select INIT_ALL_POSSIBLE
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index d140206..ed6f90c 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -1,6 +1,7 @@
config M68K
bool
default y
+ select ARCH_32BIT_OFF_T
select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
select HAVE_IDE
select HAVE_AOUT if MMU
diff --git a/arch/metag/Kconfig b/arch/metag/Kconfig
index 5b7a45d..c337192 100644
--- a/arch/metag/Kconfig
+++ b/arch/metag/Kconfig
@@ -1,5 +1,6 @@
config METAG
def_bool y
+ select ARCH_32BIT_OFF_T
select EMBEDDED
select GENERIC_ATOMIC64
select GENERIC_CLOCKEVENTS
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 86f6572..3a6146b 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -1,5 +1,6 @@
config MICROBLAZE
def_bool y
+ select ARCH_32BIT_OFF_T
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_WANT_IPC_PARSE_VERSION
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index b3c5bde..a01da24 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1,6 +1,7 @@
config MIPS
bool
default y
+ select ARCH_32BIT_OFF_T if !64BIT
select ARCH_SUPPORTS_UPROBES
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
diff --git a/arch/mn10300/Kconfig b/arch/mn10300/Kconfig
index 38e3494..c44c699 100644
--- a/arch/mn10300/Kconfig
+++ b/arch/mn10300/Kconfig
@@ -1,6 +1,7 @@
config MN10300
def_bool y
select HAVE_EXIT_THREAD
+ select ARCH_32BIT_OFF_T
select HAVE_OPROFILE
select HAVE_UID16
select GENERIC_IRQ_SHOW
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index 51a56c8..f9273c9 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -1,5 +1,6 @@
config NIOS2
def_bool y
+ select ARCH_32BIT_OFF_T
select CLKSRC_OF
select GENERIC_ATOMIC64
select GENERIC_CLOCKEVENTS
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index 489e7f9..c4c96c9 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -5,6 +5,7 @@

config OPENRISC
def_bool y
+ select ARCH_32BIT_OFF_T
select OF
select OF_EARLY_FLATTREE
select IRQ_DOMAIN
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 71c4a3a..025ae12 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -1,5 +1,6 @@
config PARISC
def_bool y
+ select ARCH_32BIT_OFF_T if !64BIT
select ARCH_MIGHT_HAVE_PC_PARPORT
select HAVE_IDE
select HAVE_OPROFILE
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 65fba4c..22178eb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -80,6 +80,7 @@ config ARCH_HAS_DMA_SET_COHERENT_MASK
config PPC
bool
default y
+ select ARCH_32BIT_OFF_T if PPC32
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select BINFMT_ELF
diff --git a/arch/score/Kconfig b/arch/score/Kconfig
index 507d631..0a9484b 100644
--- a/arch/score/Kconfig
+++ b/arch/score/Kconfig
@@ -2,6 +2,7 @@ menu "Machine selection"

config SCORE
def_bool y
+ select ARCH_32BIT_OFF_T
select GENERIC_IRQ_SHOW
select GENERIC_IOMAP
select GENERIC_ATOMIC64
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index ee08695..1f99eb3 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -56,6 +56,7 @@ config SUPERH

config SUPERH32
def_bool ARCH = "sh"
+ select ARCH_32BIT_OFF_T
select HAVE_KPROBES
select HAVE_KRETPROBES
select HAVE_IOREMAP_PROT if MMU && !X2TLB
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index b23c76b..36ef669 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -46,6 +46,7 @@ config SPARC

config SPARC32
def_bool !64BIT
+ select ARCH_32BIT_OFF_T
select GENERIC_ATOMIC64
select CLZ_TAB
select HAVE_UID16
diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig
index 4583c03..845dcbd 100644
--- a/arch/tile/Kconfig
+++ b/arch/tile/Kconfig
@@ -3,6 +3,7 @@

config TILE
def_bool y
+ select ARCH_32BIT_OFF_T if !64BIT
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_WANT_FRAME_POINTERS
diff --git a/arch/tile/kernel/compat.c b/arch/tile/kernel/compat.c
index bdaf71d..b38a898 100644
--- a/arch/tile/kernel/compat.c
+++ b/arch/tile/kernel/compat.c
@@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
#define compat_sys_readahead sys32_readahead
#define sys_llseek compat_sys_llseek

+#define sys_openat compat_sys_openat
+#define sys_open_by_handle_at compat_sys_open_by_handle_at
+
/* Call the assembly trampolines where necessary. */
#define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
#define sys_clone _sys_clone
diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig
index 0769066..cc642f9 100644
--- a/arch/unicore32/Kconfig
+++ b/arch/unicore32/Kconfig
@@ -1,6 +1,7 @@
config UNICORE32
def_bool y
select ARCH_HAS_DEVMEM_IS_ALLOWED
+ select ARCH_32BIT_OFF_T
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select HAVE_MEMBLOCK
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bada636..52d19b4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -20,6 +20,7 @@ config X86
select ACPI_LEGACY_TABLES_LOOKUP if ACPI
select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
select ANON_INODES
+ select ARCH_32BIT_OFF_T if X86_32
select ARCH_CLOCKSOURCE_DATA
select ARCH_DISCARD_MEMBLOCK
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
diff --git a/arch/x86/um/Kconfig b/arch/x86/um/Kconfig
index ed56a1c..8436bcd 100644
--- a/arch/x86/um/Kconfig
+++ b/arch/x86/um/Kconfig
@@ -21,6 +21,7 @@ config 64BIT
config X86_32
def_bool !64BIT
select HAVE_AOUT
+ select ARCH_32BIT_OFF_T
select ARCH_WANT_IPC_PARSE_VERSION
select MODULES_USE_ELF_REL
select CLONE_BACKWARDS
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index f610586..90c062d 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -3,6 +3,7 @@ config ZONE_DMA

config XTENSA
def_bool y
+ select ARCH_32BIT_OFF_T
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_IPC_PARSE_VERSION
select BUILDTIME_EXTABLE_SORT
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index 76ce329..46960a1 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -5,7 +5,7 @@


#ifndef force_o_largefile
-#define force_o_largefile() (BITS_PER_LONG != 32)
+#define force_o_largefile() (!IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T))
#endif

#if BITS_PER_LONG == 32
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 9b1462e..a6062be 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -178,7 +178,7 @@ __SYSCALL(__NR_fchownat, sys_fchownat)
#define __NR_fchown 55
__SYSCALL(__NR_fchown, sys_fchown)
#define __NR_openat 56
-__SC_COMP(__NR_openat, sys_openat, compat_sys_openat)
+__SYSCALL(__NR_openat, sys_openat)
#define __NR_close 57
__SYSCALL(__NR_close, sys_close)
#define __NR_vhangup 58
@@ -676,8 +676,7 @@ __SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
#define __NR_name_to_handle_at 264
__SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
#define __NR_open_by_handle_at 265
-__SC_COMP(__NR_open_by_handle_at, sys_open_by_handle_at, \
- compat_sys_open_by_handle_at)
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)
#define __NR_clock_adjtime 266
__SC_COMP(__NR_clock_adjtime, sys_clock_adjtime, compat_sys_clock_adjtime)
#define __NR_syncfs 267
--
2.7.4
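
The effect of the force_o_largefile() change in this patch can be sketched in userspace as below. This is an illustrative model under stated assumptions, not kernel code: struct model_config stands in for the Kconfig symbols, MODEL_O_LARGEFILE is a placeholder value (the real O_LARGEFILE is architecture-specific), and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag value; the real O_LARGEFILE differs per architecture. */
#define MODEL_O_LARGEFILE 0x8000

struct model_config {
	bool is_64bit;		/* models CONFIG_64BIT */
	bool arch_32bit_off_t;	/* models CONFIG_ARCH_32BIT_OFF_T */
};

/* Before the patch: force_o_largefile() is (BITS_PER_LONG != 32). */
static bool model_force_o_largefile_old(const struct model_config *cfg)
{
	assert(cfg != NULL);
	return cfg->is_64bit;
}

/* After the patch: !IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T). */
static bool model_force_o_largefile_new(const struct model_config *cfg)
{
	assert(cfg != NULL);
	return !cfg->arch_32bit_off_t;
}

/* What the open path does with the result. */
static int model_open_flags(int flags, const struct model_config *cfg)
{
	if (model_force_o_largefile_new(cfg))
		flags |= MODEL_O_LARGEFILE;
	return flags;
}
```

The old and new definitions agree for 64-bit kernels and for existing 32-bit architectures that select the option. They differ only for a new 32-bit architecture that leaves ARCH_32BIT_OFF_T unset, which now gets O_LARGEFILE forced on.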

2016-10-21 20:49:48

by Yury Norov

Subject: [PATCH 06/18] thread: move thread bits accessors to separated file

The thread bits accessors may be used from low-level code, so isolating
them is a way to avoid circular dependencies in header files.

The exact cause of the circular dependency is the WARN_ON() macro added
in patch [edd63a27] "set_restore_sigmask() is never called without
SIGPENDING (and never should be)".

Signed-off-by: Yury Norov <[email protected]>
---
include/linux/thread_bits.h | 54 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/thread_info.h | 44 +-----------------------------------
2 files changed, 55 insertions(+), 43 deletions(-)
create mode 100644 include/linux/thread_bits.h

diff --git a/include/linux/thread_bits.h b/include/linux/thread_bits.h
new file mode 100644
index 0000000..ed788b0
--- /dev/null
+++ b/include/linux/thread_bits.h
@@ -0,0 +1,54 @@
+
+/* thread_bits.h: common low-level thread bits accessors */
+
+#ifndef _LINUX_THREAD_BITS_H
+#define _LINUX_THREAD_BITS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/bitops.h>
+#include <asm/thread_info.h>
+
+/*
+ * flag set/clear/test wrappers
+ * - pass TIF_xxxx constants to these functions
+ */
+
+static inline void set_ti_thread_flag(struct thread_info *ti, int flag)
+{
+ set_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline void clear_ti_thread_flag(struct thread_info *ti, int flag)
+{
+ clear_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline int test_and_set_ti_thread_flag(struct thread_info *ti, int flag)
+{
+ return test_and_set_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline int test_and_clear_ti_thread_flag(struct thread_info *ti, int flag)
+{
+ return test_and_clear_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline int test_ti_thread_flag(struct thread_info *ti, int flag)
+{
+ return test_bit(flag, (unsigned long *)&ti->flags);
+}
+
+#define set_thread_flag(flag) \
+ set_ti_thread_flag(current_thread_info(), flag)
+#define clear_thread_flag(flag) \
+ clear_ti_thread_flag(current_thread_info(), flag)
+#define test_and_set_thread_flag(flag) \
+ test_and_set_ti_thread_flag(current_thread_info(), flag)
+#define test_and_clear_thread_flag(flag) \
+ test_and_clear_ti_thread_flag(current_thread_info(), flag)
+#define test_thread_flag(flag) \
+ test_ti_thread_flag(current_thread_info(), flag)
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _LINUX_THREAD_BITS_H */
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 45f004e..f6e3239 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -65,8 +65,7 @@ struct restart_block {

extern long do_no_restart_syscall(struct restart_block *parm);

-#include <linux/bitops.h>
-#include <asm/thread_info.h>
+#include <linux/thread_bits.h>

#ifdef __KERNEL__

@@ -77,47 +76,6 @@ extern long do_no_restart_syscall(struct restart_block *parm);
# define THREADINFO_GFP (GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
#endif

-/*
- * flag set/clear/test wrappers
- * - pass TIF_xxxx constants to these functions
- */
-
-static inline void set_ti_thread_flag(struct thread_info *ti, int flag)
-{
- set_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline void clear_ti_thread_flag(struct thread_info *ti, int flag)
-{
- clear_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline int test_and_set_ti_thread_flag(struct thread_info *ti, int flag)
-{
- return test_and_set_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline int test_and_clear_ti_thread_flag(struct thread_info *ti, int flag)
-{
- return test_and_clear_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline int test_ti_thread_flag(struct thread_info *ti, int flag)
-{
- return test_bit(flag, (unsigned long *)&ti->flags);
-}
-
-#define set_thread_flag(flag) \
- set_ti_thread_flag(current_thread_info(), flag)
-#define clear_thread_flag(flag) \
- clear_ti_thread_flag(current_thread_info(), flag)
-#define test_and_set_thread_flag(flag) \
- test_and_set_ti_thread_flag(current_thread_info(), flag)
-#define test_and_clear_thread_flag(flag) \
- test_and_clear_ti_thread_flag(current_thread_info(), flag)
-#define test_thread_flag(flag) \
- test_ti_thread_flag(current_thread_info(), flag)
-
#define tif_need_resched() test_thread_flag(TIF_NEED_RESCHED)

#ifndef CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES
--
2.7.4

2016-10-21 20:50:01

by Yury Norov

Subject: [PATCH 05/18] arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64

From: Andrew Pinski <[email protected]>

Define __BITS_PER_LONG depending on the ABI used (i.e. check whether
__ILP32__ or __LP64__ is defined). This is necessary for glibc to
determine the appropriate type definitions for the system call interface.
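
As an illustration of the check described above, the following userspace
sketch derives a bits-per-long value from the compiler's ABI macros. The
`SKETCH_BITS_PER_LONG` name and the `sketch_bits_per_long()` helper are
hypothetical. Unlike the patch, which rejects an unknown ABI with #error,
the sketch falls back to sizeof(long) so that it also builds on compilers
that define neither macro.

```c
#include <assert.h>
#include <limits.h>

/* Pick the width of long from the ABI macros the compiler predefines. */
#if defined(__LP64__)
# define SKETCH_BITS_PER_LONG 64
#elif defined(__ILP32__)
# define SKETCH_BITS_PER_LONG 32
#else
/* Assumption for portability only; the patch uses #error here instead. */
# define SKETCH_BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))
#endif

/* Return the value chosen at compile time. */
static int sketch_bits_per_long(void)
{
	return SKETCH_BITS_PER_LONG;
}

/* Check that the macro-derived value agrees with the real type width. */
static void sketch_check_bits_per_long(void)
{
	assert(sketch_bits_per_long() == (int)(sizeof(long) * CHAR_BIT));
}
```

Built with an LP64 toolchain the helper returns 64; an ILP32 toolchain
would be expected to yield 32 from the same source.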

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Philipp Tomsich <[email protected]>
Signed-off-by: Christoph Muellner <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Reviewed-by: David Daney <[email protected]>
---
arch/arm64/include/uapi/asm/bitsperlong.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/uapi/asm/bitsperlong.h b/arch/arm64/include/uapi/asm/bitsperlong.h
index fce9c29..ab61d68 100644
--- a/arch/arm64/include/uapi/asm/bitsperlong.h
+++ b/arch/arm64/include/uapi/asm/bitsperlong.h
@@ -16,7 +16,14 @@
#ifndef __ASM_BITSPERLONG_H
#define __ASM_BITSPERLONG_H

-#define __BITS_PER_LONG 64
+#if defined(__LP64__)
+/* Assuming __LP64__ will be defined for native ELF64's and not for ILP32. */
+# define __BITS_PER_LONG 64
+#elif defined(__ILP32__)
+# define __BITS_PER_LONG 32
+#else
+# error "Neither LP64 nor ILP32: unsupported ABI in asm/bitsperlong.h"
+#endif

#include <asm-generic/bitsperlong.h>

--
2.7.4

2016-10-21 20:50:14

by Yury Norov

Subject: [PATCH 12/18] arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it

From: Andrew Pinski <[email protected]>

Add a separate syscall table for ILP32, which dispatches either to the
native LP64 system call implementation or to the compat syscalls, as
appropriate.
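
The idea of a second table that mostly reuses native handlers can be
sketched in plain C as below. Everything here is hypothetical (the `sk_*`
names, the four-entry tables, the return values); it only mirrors the
pattern of filling a table with a "not implemented" default and then
overriding individual slots. The `[a ... b]` range designator is a GNU C
extension, as used in the patch itself.

```c
#include <assert.h>
#include <errno.h>

#define SK_NR_SYSCALLS 4

typedef long (*sk_syscall_fn)(void);

/* Default handler for unimplemented slots. */
static long sk_ni_syscall(void) { return -ENOSYS; }

/* Hypothetical handlers; the return values only identify which one ran. */
static long sk_native_getpid(void) { return 100; }
static long sk_native_pread64(void) { return 200; }
static long sk_compat_pread64_wrapper(void) { return 201; }

/* Native (LP64) table: every slot defaults to sk_ni_syscall. */
static const sk_syscall_fn sk_lp64_table[SK_NR_SYSCALLS] = {
	[0 ... SK_NR_SYSCALLS - 1] = sk_ni_syscall,
	[1] = sk_native_getpid,
	[2] = sk_native_pread64,
};

/* ILP32 table: shares the native handler in slot 1, a compat one in slot 2. */
static const sk_syscall_fn sk_ilp32_table[SK_NR_SYSCALLS] = {
	[0 ... SK_NR_SYSCALLS - 1] = sk_ni_syscall,
	[1] = sk_native_getpid,
	[2] = sk_compat_pread64_wrapper,
};

/* Select the table by ABI, bounds-check the number, then call the handler. */
static long sk_dispatch(int is_ilp32, unsigned int scno)
{
	const sk_syscall_fn *tbl = is_ilp32 ? sk_ilp32_table : sk_lp64_table;

	if (scno >= SK_NR_SYSCALLS)
		return sk_ni_syscall();
	return tbl[scno]();
}
```

In the patch the table selection happens in entry.S based on a thread
flag; the sketch collapses that into the `is_ilp32` argument.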

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Bamvor Zhang Jian <[email protected]>
---
arch/arm64/include/asm/unistd.h | 8 ++-
arch/arm64/include/uapi/asm/unistd.h | 12 +++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/entry.S | 28 +++++++++-
arch/arm64/kernel/sys_ilp32.c | 100 +++++++++++++++++++++++++++++++++++
5 files changed, 145 insertions(+), 5 deletions(-)
create mode 100644 arch/arm64/kernel/sys_ilp32.c

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index fe9d6c1..851cc8a 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -13,13 +13,17 @@
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+
+#ifdef CONFIG_COMPAT
+#define __ARCH_WANT_COMPAT_STAT64
+#define __ARCH_WANT_SYS_LLSEEK
+#endif
+
#ifdef CONFIG_AARCH32_EL0
#define __ARCH_WANT_COMPAT_SYS_GETDENTS64
-#define __ARCH_WANT_COMPAT_STAT64
#define __ARCH_WANT_SYS_GETHOSTNAME
#define __ARCH_WANT_SYS_PAUSE
#define __ARCH_WANT_SYS_GETPGRP
-#define __ARCH_WANT_SYS_LLSEEK
#define __ARCH_WANT_SYS_NICE
#define __ARCH_WANT_SYS_SIGPENDING
#define __ARCH_WANT_SYS_SIGPROCMASK
diff --git a/arch/arm64/include/uapi/asm/unistd.h b/arch/arm64/include/uapi/asm/unistd.h
index 043d17a..b4cd688 100644
--- a/arch/arm64/include/uapi/asm/unistd.h
+++ b/arch/arm64/include/uapi/asm/unistd.h
@@ -14,6 +14,18 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

+/*
+ * Use AARCH32 interface for sys_sync_file_range() as it passes 64-bit arguments.
+ */
+#if defined(__ILP32__) || defined(__SYSCALL_COMPAT)
+#define __ARCH_WANT_SYNC_FILE_RANGE2
+#endif
+
+/*
+ * AARCH64/ILP32 is introduced after renameat() was replaced with renameat2().
+ */
+#if !(defined(__ILP32__) || defined(__SYSCALL_COMPAT))
#define __ARCH_WANT_RENAMEAT
+#endif

#include <asm-generic/unistd.h>
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9123bb8..06070f5 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,7 +29,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE

arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o binfmt_elf32.o
-arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o
+arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o sys_ilp32.o
arm64-obj-$(CONFIG_COMPAT) += entry32_common.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index b6fb14b..b152aab 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -249,6 +249,23 @@ tsk .req x28 // current thread_info

.text

+#ifdef CONFIG_ARM64_ILP32
+/*
+ * AARCH64/ILP32. Zero top halves of x0-x7
+ * registers as userspace may put garbage there.
+ */
+ .macro delouse_input_regs
+ mov w0, w0
+ mov w1, w1
+ mov w2, w2
+ mov w3, w3
+ mov w4, w4
+ mov w5, w5
+ mov w6, w6
+ mov w7, w7
+ .endm
+#endif
+
/*
* Exception vectors.
*/
@@ -517,6 +534,7 @@ el0_svc_compat:
* AArch32 syscall handling
*/
adrp stbl, compat_sys_call_table // load compat syscall table pointer
+ ldr x16, [tsk, #TI_FLAGS]
uxtw scno, w7 // syscall number in w7 (r7)
mov sc_nr, #__NR_compat_syscalls
b el0_svc_naked
@@ -739,15 +757,21 @@ ENDPROC(ret_from_fork)
.align 6
el0_svc:
adrp stbl, sys_call_table // load syscall table pointer
+ ldr x16, [tsk, #TI_FLAGS]
uxtw scno, w8 // syscall number in w8
mov sc_nr, #__NR_syscalls
+#ifdef CONFIG_ARM64_ILP32
+ tst x16, #_TIF_32BIT_AARCH64
+ b.eq el0_svc_naked // We are using LP64 syscall table
+ adrp stbl, sys_call_ilp32_table // load ilp32 syscall table pointer
+ delouse_input_regs
+#endif
el0_svc_naked: // compat entry point
stp x0, scno, [sp, #S_ORIG_X0] // save the original x0 and syscall number
enable_dbg_and_irq
ct_user_exit 1

- ldr x16, [tsk, #TI_FLAGS] // check for syscall hooks
- tst x16, #_TIF_SYSCALL_WORK
+ tst x16, #_TIF_SYSCALL_WORK // check for syscall hooks
b.ne __sys_trace
cmp scno, sc_nr // check upper syscall limit
b.hs ni_sys
diff --git a/arch/arm64/kernel/sys_ilp32.c b/arch/arm64/kernel/sys_ilp32.c
new file mode 100644
index 0000000..fbf2f00
--- /dev/null
+++ b/arch/arm64/kernel/sys_ilp32.c
@@ -0,0 +1,100 @@
+/*
+ * AArch64- ILP32 specific system calls implementation
+ *
+ * Copyright (C) 2016 Cavium Inc.
+ * Author: Andrew Pinski <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define __SYSCALL_COMPAT
+
+#include <linux/compiler.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/msg.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+#include <linux/compat.h>
+#include <asm-generic/syscalls.h>
+
+/*
+ * AARCH32 requires 4-page alignment for shared memory,
+ * but AARCH64 - only 1 page. This is the only difference
+ * between compat and native sys_shmat(). So ILP32 just picks the
+ * AARCH64 version.
+ */
+#define compat_sys_shmat sys_shmat
+
+/*
+ * ILP32 needs special handling for some ptrace requests.
+ */
+#define sys_ptrace compat_sys_ptrace
+
+/*
+ * Using AARCH32 interface for syscalls that take 64-bit
+ * parameters in registers.
+ */
+#define compat_sys_fadvise64_64 compat_sys_fadvise64_64_wrapper
+#define compat_sys_fallocate compat_sys_fallocate_wrapper
+#define compat_sys_ftruncate64 compat_sys_ftruncate64_wrapper
+#define compat_sys_pread64 compat_sys_pread64_wrapper
+#define compat_sys_pwrite64 compat_sys_pwrite64_wrapper
+#define compat_sys_readahead compat_sys_readahead_wrapper
+#define compat_sys_sync_file_range2 compat_sys_sync_file_range2_wrapper
+#define compat_sys_truncate64 compat_sys_truncate64_wrapper
+#define sys_mmap2 compat_sys_mmap2_wrapper
+
+/*
+ * Using AARCH32 interface for syscalls that take the size of
+ * struct statfs as an argument, as it's calculated differently
+ * in kernel and user spaces.
+ */
+#define compat_sys_fstatfs64 compat_sys_fstatfs64_wrapper
+#define compat_sys_statfs64 compat_sys_statfs64_wrapper
+
+/*
+ * Using custom wrapper for rt_sigreturn() to handle custom
+ * struct rt_sigframe.
+ */
+#define compat_sys_rt_sigreturn ilp32_sys_rt_sigreturn_wrapper
+
+asmlinkage long compat_sys_fstatfs64_wrapper(void);
+asmlinkage long compat_sys_statfs64_wrapper(void);
+asmlinkage long compat_sys_fadvise64_64_wrapper(void);
+asmlinkage long compat_sys_fallocate_wrapper(void);
+asmlinkage long compat_sys_ftruncate64_wrapper(void);
+asmlinkage long compat_sys_mmap2_wrapper(void);
+asmlinkage long compat_sys_pread64_wrapper(void);
+asmlinkage long compat_sys_pwrite64_wrapper(void);
+asmlinkage long compat_sys_readahead_wrapper(void);
+asmlinkage long compat_sys_sync_file_range2_wrapper(void);
+asmlinkage long compat_sys_truncate64_wrapper(void);
+asmlinkage long ilp32_sys_rt_sigreturn_wrapper(void);
+
+#include <asm/syscall.h>
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym) [nr] = sym,
+
+/*
+ * The sys_call_ilp32_table array must be 4K aligned to be accessible from
+ * kernel/entry.S.
+ */
+void *sys_call_ilp32_table[__NR_syscalls] __aligned(4096) = {
+ [0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include <asm/unistd.h>
+};
--
2.7.4

2016-10-21 20:50:28

by Yury Norov

Subject: [PATCH 07/18] arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat)

Based on patch of Andrew Pinski.

This patch introduces is_a32_compat_task() and is_a32_compat_thread() so
that code can distinguish an AArch32-specific task or thread from a generic
compat one. The helpers are located in <asm/is_compat.h> to avoid a mess in
the headers.

Some files include both <linux/compat.h> and <asm/compat.h>. This is
unnecessary, because <linux/compat.h> already includes <asm/compat.h>, so
the redundant includes are removed as well.
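
The split between the two kinds of predicate can be sketched in userspace
C as follows. All names (`sk_thread_info`, the `sk_*` helpers, the
`SK_ARCH_*` constants) are hypothetical; the sketch only shows the
layering in which the generic compat test is built on the AArch32-specific
one, while callers that depend on AArch32 details use the specific test
directly.

```c
#include <assert.h>

/* Illustrative bit number for the "AArch32 process" flag. */
#define SK_TIF_32BIT 22

/* Hypothetical result codes for the audit-architecture example. */
enum { SK_ARCH_ARM = 0, SK_ARCH_AARCH64 = 1 };

struct sk_thread_info {
	unsigned long flags;
};

/* AArch32-specific test: true only for threads running AArch32 code. */
static inline int sk_is_a32_compat_thread(const struct sk_thread_info *ti)
{
	return (int)((ti->flags >> SK_TIF_32BIT) & 1);
}

/*
 * Generic compat test. In this sketch it is just the AArch32 test; a
 * later ABI could be added here without touching AArch32-only callers.
 */
static inline int sk_is_compat_thread(const struct sk_thread_info *ti)
{
	return sk_is_a32_compat_thread(ti);
}

/* A caller that depends on AArch32 specifics uses the specific test. */
static int sk_audit_arch(const struct sk_thread_info *ti)
{
	return sk_is_a32_compat_thread(ti) ? SK_ARCH_ARM : SK_ARCH_AARCH64;
}
```

A caller that only needs to know whether the address space is 32-bit would
keep using the generic predicate.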

Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Bamvor Zhang Jian <[email protected]>
---
arch/arm64/include/asm/compat.h | 19 ++---------
arch/arm64/include/asm/elf.h | 10 +++---
arch/arm64/include/asm/ftrace.h | 2 +-
arch/arm64/include/asm/is_compat.h | 64 ++++++++++++++++++++++++++++++++++++
arch/arm64/include/asm/memory.h | 5 +--
arch/arm64/include/asm/processor.h | 5 +--
arch/arm64/include/asm/syscall.h | 2 +-
arch/arm64/include/asm/thread_info.h | 2 +-
arch/arm64/kernel/hw_breakpoint.c | 10 +++---
arch/arm64/kernel/perf_regs.c | 2 +-
arch/arm64/kernel/process.c | 7 ++--
arch/arm64/kernel/ptrace.c | 11 +++----
arch/arm64/kernel/signal.c | 4 +--
arch/arm64/kernel/traps.c | 3 +-
14 files changed, 98 insertions(+), 48 deletions(-)
create mode 100644 arch/arm64/include/asm/is_compat.h

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index eb8432b..df2f72d 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -24,6 +24,8 @@
#include <linux/types.h>
#include <linux/sched.h>

+#include <asm/is_compat.h>
+
#define COMPAT_USER_HZ 100
#ifdef __AARCH64EB__
#define COMPAT_UTS_MACHINE "armv8b\0\0"
@@ -298,23 +300,6 @@ struct compat_shmid64_ds {
compat_ulong_t __unused5;
};

-static inline int is_compat_task(void)
-{
- return test_thread_flag(TIF_32BIT);
-}
-
-static inline int is_compat_thread(struct thread_info *thread)
-{
- return test_ti_thread_flag(thread, TIF_32BIT);
-}
-
-#else /* !CONFIG_COMPAT */
-
-static inline int is_compat_thread(struct thread_info *thread)
-{
- return 0;
-}
-
#endif /* CONFIG_COMPAT */
#endif /* __KERNEL__ */
#endif /* __ASM_COMPAT_H */
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index a55384f..6a9049b 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -16,6 +16,10 @@
#ifndef __ASM_ELF_H
#define __ASM_ELF_H

+#ifndef __ASSEMBLY__
+#include <linux/compat.h>
+#endif
+
#include <asm/hwcap.h>

/*
@@ -153,13 +157,9 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
int uses_interp);

/* 1GB of VA */
-#ifdef CONFIG_COMPAT
-#define STACK_RND_MASK (test_thread_flag(TIF_32BIT) ? \
+#define STACK_RND_MASK (is_compat_task() ? \
0x7ff >> (PAGE_SHIFT - 12) : \
0x3ffff >> (PAGE_SHIFT - 12))
-#else
-#define STACK_RND_MASK (0x3ffff >> (PAGE_SHIFT - 12))
-#endif

#ifdef __AARCH64EB__
#define COMPAT_ELF_PLATFORM ("v8b")
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index caa955f..0feb28a 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -54,7 +54,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
{
- return is_compat_task();
+ return is_a32_compat_task();
}
#endif /* ifndef __ASSEMBLY__ */

diff --git a/arch/arm64/include/asm/is_compat.h b/arch/arm64/include/asm/is_compat.h
new file mode 100644
index 0000000..8dba5ca
--- /dev/null
+++ b/arch/arm64/include/asm/is_compat.h
@@ -0,0 +1,64 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_IS_COMPAT_H
+#define __ASM_IS_COMPAT_H
+#ifndef __ASSEMBLY__
+
+#include <linux/thread_bits.h>
+
+#ifdef CONFIG_AARCH32_EL0
+
+static inline int is_a32_compat_task(void)
+{
+ return test_thread_flag(TIF_32BIT);
+}
+
+static inline int is_a32_compat_thread(struct thread_info *thread)
+{
+ return test_ti_thread_flag(thread, TIF_32BIT);
+}
+
+#else
+
+static inline int is_a32_compat_task(void)
+
+{
+ return 0;
+}
+
+static inline int is_a32_compat_thread(struct thread_info *thread)
+{
+ return 0;
+}
+
+#endif /* CONFIG_AARCH32_EL0 */
+
+#ifdef CONFIG_COMPAT
+
+static inline int is_compat_task(void)
+{
+ return is_a32_compat_task();
+}
+
+#endif /* CONFIG_COMPAT */
+
+static inline int is_compat_thread(struct thread_info *thread)
+{
+ return is_a32_compat_thread(thread);
+}
+
+
+#endif /* !__ASSEMBLY__ */
+#endif /* __ASM_IS_COMPAT_H */
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index ba62df8..39497ae 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -26,6 +26,7 @@
#include <linux/types.h>
#include <asm/bug.h>
#include <asm/sizes.h>
+#include <asm/is_compat.h>

/*
* Allow for constants defined here to be used from assembly code
@@ -78,9 +79,9 @@

#ifdef CONFIG_COMPAT
#define TASK_SIZE_32 UL(0x100000000)
-#define TASK_SIZE (test_thread_flag(TIF_32BIT) ? \
+#define TASK_SIZE (is_compat_task() ? \
TASK_SIZE_32 : TASK_SIZE_64)
-#define TASK_SIZE_OF(tsk) (test_tsk_thread_flag(tsk, TIF_32BIT) ? \
+#define TASK_SIZE_OF(tsk) (is_compat_thread(tsk) ? \
TASK_SIZE_32 : TASK_SIZE_64)
#else
#define TASK_SIZE TASK_SIZE_64
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 6173a7b..49a046a 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -30,6 +30,7 @@
#include <linux/string.h>

#include <asm/alternative.h>
+#include <asm/is_compat.h>
#include <asm/fpsimd.h>
#include <asm/hw_breakpoint.h>
#include <asm/lse.h>
@@ -40,7 +41,7 @@
#define STACK_TOP_MAX TASK_SIZE_64
#ifdef CONFIG_COMPAT
#define AARCH32_VECTORS_BASE 0xffff0000
-#define STACK_TOP (test_thread_flag(TIF_32BIT) ? \
+#define STACK_TOP (is_compat_task() ? \
AARCH32_VECTORS_BASE : STACK_TOP_MAX)
#else
#define STACK_TOP STACK_TOP_MAX
@@ -92,7 +93,7 @@ struct thread_struct {
#define task_user_tls(t) \
({ \
unsigned long *__tls; \
- if (is_compat_thread(task_thread_info(t))) \
+ if (is_a32_compat_thread(task_thread_info(t))) \
__tls = &(t)->thread.tp2_value; \
else \
__tls = &(t)->thread.tp_value; \
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 709a574..ce09641 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -113,7 +113,7 @@ static inline void syscall_set_arguments(struct task_struct *task,
*/
static inline int syscall_get_arch(void)
{
- if (is_compat_task())
+ if (is_a32_compat_task())
return AUDIT_ARCH_ARM;

return AUDIT_ARCH_AARCH64;
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e9ea5a6..e12411f 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -121,7 +121,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_FREEZE 19
#define TIF_RESTORE_SIGMASK 20
#define TIF_SINGLESTEP 21
-#define TIF_32BIT 22 /* 32bit process */
+#define TIF_32BIT 22 /* AARCH32 process */

#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 948b731..4c14957 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -168,7 +168,7 @@ enum hw_breakpoint_ops {
HW_BREAKPOINT_RESTORE
};

-static int is_compat_bp(struct perf_event *bp)
+static int is_a32_compat_bp(struct perf_event *bp)
{
struct task_struct *tsk = bp->hw.target;

@@ -179,7 +179,7 @@ static int is_compat_bp(struct perf_event *bp)
* deprecated behaviour if we use unaligned watchpoints in
* AArch64 state.
*/
- return tsk && is_compat_thread(task_thread_info(tsk));
+ return tsk && is_a32_compat_thread(task_thread_info(tsk));
}

/**
@@ -439,7 +439,7 @@ static int arch_build_bp_info(struct perf_event *bp)
* Watchpoints can be of length 1, 2, 4 or 8 bytes.
*/
if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE) {
- if (is_compat_bp(bp)) {
+ if (is_a32_compat_bp(bp)) {
if (info->ctrl.len != ARM_BREAKPOINT_LEN_2 &&
info->ctrl.len != ARM_BREAKPOINT_LEN_4)
return -EINVAL;
@@ -496,7 +496,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
* AArch32 tasks expect some simple alignment fixups, so emulate
* that here.
*/
- if (is_compat_bp(bp)) {
+ if (is_a32_compat_bp(bp)) {
if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
alignment_mask = 0x7;
else
@@ -685,7 +685,7 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,

info = counter_arch_bp(wp);
/* AArch32 watchpoints are either 4 or 8 bytes aligned. */
- if (is_compat_task()) {
+ if (is_a32_compat_task()) {
if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
alignment_mask = 0x7;
else
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index 3f62b35..a79058f 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -45,7 +45,7 @@ int perf_reg_validate(u64 mask)

u64 perf_reg_abi(struct task_struct *task)
{
- if (is_compat_thread(task_thread_info(task)))
+ if (is_a32_compat_thread(task_thread_info(task)))
return PERF_SAMPLE_REGS_ABI_32;
else
return PERF_SAMPLE_REGS_ABI_64;
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 27b2f13..b78f80d 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -47,7 +47,6 @@
#include <trace/events/power.h>

#include <asm/alternative.h>
-#include <asm/compat.h>
#include <asm/cacheflush.h>
#include <asm/fpsimd.h>
#include <asm/mmu_context.h>
@@ -204,7 +203,7 @@ static void tls_thread_flush(void)
{
write_sysreg(0, tpidr_el0);

- if (is_compat_task()) {
+ if (is_a32_compat_task()) {
current->thread.tp_value = 0;

/*
@@ -256,7 +255,7 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
*task_user_tls(p) = read_sysreg(tpidr_el0);

if (stack_start) {
- if (is_compat_thread(task_thread_info(p)))
+ if (is_a32_compat_thread(task_thread_info(p)))
childregs->compat_sp = stack_start;
else
childregs->sp = stack_start;
@@ -293,7 +292,7 @@ static void tls_thread_switch(struct task_struct *next)
*task_user_tls(current) = tpidr;

tpidr = *task_user_tls(next);
- tpidrro = is_compat_thread(task_thread_info(next)) ?
+ tpidrro = is_a32_compat_thread(task_thread_info(next)) ?
next->thread.tp_value : 0;

write_sysreg(tpidr, tpidr_el0);
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 1d6f43e..1d075ed 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -38,7 +38,6 @@
#include <linux/tracehook.h>
#include <linux/elf.h>

-#include <asm/compat.h>
#include <asm/debug-monitors.h>
#include <asm/pgtable.h>
#include <asm/syscall.h>
@@ -186,7 +185,7 @@ static void ptrace_hbptriggered(struct perf_event *bp,
#ifdef CONFIG_AARCH32_EL0
int i;

- if (!is_compat_task())
+ if (!is_a32_compat_task())
goto send_sig;

for (i = 0; i < ARM_MAX_BRP; ++i) {
@@ -1304,9 +1303,9 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
* 32-bit children use an extended user_aarch32_ptrace_view to allow
* access to the TLS register.
*/
- if (is_compat_task())
+ if (is_a32_compat_task())
return &user_aarch32_view;
- else if (is_compat_thread(task_thread_info(task)))
+ else if (is_a32_compat_thread(task_thread_info(task)))
return &user_aarch32_ptrace_view;
#endif
return &user_aarch64_view;
@@ -1333,7 +1332,7 @@ static void tracehook_report_syscall(struct pt_regs *regs,
* A scratch register (ip(r12) on AArch32, x7 on AArch64) is
* used to denote syscall entry/exit:
*/
- regno = (is_compat_task() ? 12 : 7);
+ regno = (is_a32_compat_task() ? 12 : 7);
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;

@@ -1444,7 +1443,7 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
if (!test_tsk_thread_flag(task, TIF_SINGLESTEP))
regs->pstate &= ~DBG_SPSR_SS;

- if (is_compat_thread(task_thread_info(task)))
+ if (is_a32_compat_thread(task_thread_info(task)))
return valid_compat_regs(regs);
else
return valid_native_regs(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 404dd67..f90cdf5 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -276,7 +276,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,

static void setup_restart_syscall(struct pt_regs *regs)
{
- if (is_compat_task())
+ if (is_a32_compat_task())
compat_setup_restart_syscall(regs);
else
regs->regs[8] = __NR_restart_syscall;
@@ -295,7 +295,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
/*
* Set up the stack frame
*/
- if (is_compat_task()) {
+ if (is_a32_compat_task()) {
if (ksig->ka.sa.sa_flags & SA_SIGINFO)
ret = compat_setup_rt_frame(usig, ksig, oldset, regs);
else
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 14a08a0..3644ddc 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -18,6 +18,7 @@
*/

#include <linux/bug.h>
+#include <linux/compat.h>
#include <linux/signal.h>
#include <linux/personality.h>
#include <linux/kallsyms.h>
@@ -528,7 +529,7 @@ asmlinkage long do_ni_syscall(struct pt_regs *regs)
{
#ifdef CONFIG_AARCH32_EL0
long ret;
- if (is_compat_task()) {
+ if (is_a32_compat_task()) {
ret = compat_arm_syscall(regs);
if (ret != -ENOSYS)
return ret;
--
2.7.4

2016-10-21 20:50:41

by Yury Norov

Subject: [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers

In ILP32, off_t is passed in a register pair, just as in aarch32.
This patch moves the corresponding aarch32 handlers to a common file
so that the ilp32 code can share them.
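
What "passed in a register pair" means for a handler can be sketched in C
as below. The `sk_regs_to_64()` helper is a hypothetical model of the
little-endian form of the regs_to_64 assembler macro used by the wrappers,
and `sk_pread64_offset()` models how compat_sys_pread64_wrapper rebuilds
the file offset from x4 (low half) and x5 (high half). The explicit
truncation to 32 bits is an assumption of the sketch; the assembly relies
on the upper register halves already being zero.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit register halves into one 64-bit value (little-endian). */
static inline uint64_t sk_regs_to_64(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/*
 * Model of the pread64 wrapper: the 64-bit offset arrives split across
 * argument registers 4 and 5 and is merged into a single value.
 */
static uint64_t sk_pread64_offset(const uint64_t regs[8])
{
	return sk_regs_to_64((uint32_t)regs[4], (uint32_t)regs[5]);
}

/* Small self-check with a recognisable bit pattern. */
static void sk_regs_selfcheck(void)
{
	assert(sk_regs_to_64(0x89abcdefu, 0x01234567u) ==
	       UINT64_C(0x0123456789abcdef));
}
```

After the merge the wrapper can branch to the native 64-bit handler, which
is why the same wrappers serve both aarch32 and ilp32 callers.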

Signed-off-by: Yury Norov <[email protected]>
---
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/entry32.S | 80 ---------------------------
arch/arm64/kernel/entry32_common.S | 107 +++++++++++++++++++++++++++++++++++++
3 files changed, 108 insertions(+), 80 deletions(-)
create mode 100644 arch/arm64/kernel/entry32_common.S

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index f661888..9123bb8 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o binfmt_elf32.o
arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o
+arm64-obj-$(CONFIG_COMPAT) += entry32_common.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
diff --git a/arch/arm64/kernel/entry32.S b/arch/arm64/kernel/entry32.S
index f332d5d..4bede03 100644
--- a/arch/arm64/kernel/entry32.S
+++ b/arch/arm64/kernel/entry32.S
@@ -39,83 +39,3 @@ ENTRY(compat_sys_rt_sigreturn_wrapper)
mov x0, sp
b compat_sys_rt_sigreturn
ENDPROC(compat_sys_rt_sigreturn_wrapper)
-
-ENTRY(compat_sys_statfs64_wrapper)
- mov w3, #84
- cmp w1, #88
- csel w1, w3, w1, eq
- b compat_sys_statfs64
-ENDPROC(compat_sys_statfs64_wrapper)
-
-ENTRY(compat_sys_fstatfs64_wrapper)
- mov w3, #84
- cmp w1, #88
- csel w1, w3, w1, eq
- b compat_sys_fstatfs64
-ENDPROC(compat_sys_fstatfs64_wrapper)
-
-/*
- * Note: off_4k (w5) is always in units of 4K. If we can't do the
- * requested offset because it is not page-aligned, we return -EINVAL.
- */
-ENTRY(compat_sys_mmap2_wrapper)
-#if PAGE_SHIFT > 12
- tst w5, #~PAGE_MASK >> 12
- b.ne 1f
- lsr w5, w5, #PAGE_SHIFT - 12
-#endif
- b sys_mmap_pgoff
-1: mov x0, #-EINVAL
- ret
-ENDPROC(compat_sys_mmap2_wrapper)
-
-/*
- * Wrappers for AArch32 syscalls that either take 64-bit parameters
- * in registers or that take 32-bit parameters which require sign
- * extension.
- */
-ENTRY(compat_sys_pread64_wrapper)
- regs_to_64 x3, x4, x5
- b sys_pread64
-ENDPROC(compat_sys_pread64_wrapper)
-
-ENTRY(compat_sys_pwrite64_wrapper)
- regs_to_64 x3, x4, x5
- b sys_pwrite64
-ENDPROC(compat_sys_pwrite64_wrapper)
-
-ENTRY(compat_sys_truncate64_wrapper)
- regs_to_64 x1, x2, x3
- b sys_truncate
-ENDPROC(compat_sys_truncate64_wrapper)
-
-ENTRY(compat_sys_ftruncate64_wrapper)
- regs_to_64 x1, x2, x3
- b sys_ftruncate
-ENDPROC(compat_sys_ftruncate64_wrapper)
-
-ENTRY(compat_sys_readahead_wrapper)
- regs_to_64 x1, x2, x3
- mov w2, w4
- b sys_readahead
-ENDPROC(compat_sys_readahead_wrapper)
-
-ENTRY(compat_sys_fadvise64_64_wrapper)
- mov w6, w1
- regs_to_64 x1, x2, x3
- regs_to_64 x2, x4, x5
- mov w3, w6
- b sys_fadvise64_64
-ENDPROC(compat_sys_fadvise64_64_wrapper)
-
-ENTRY(compat_sys_sync_file_range2_wrapper)
- regs_to_64 x2, x2, x3
- regs_to_64 x3, x4, x5
- b sys_sync_file_range2
-ENDPROC(compat_sys_sync_file_range2_wrapper)
-
-ENTRY(compat_sys_fallocate_wrapper)
- regs_to_64 x2, x2, x3
- regs_to_64 x3, x4, x5
- b sys_fallocate
-ENDPROC(compat_sys_fallocate_wrapper)
diff --git a/arch/arm64/kernel/entry32_common.S b/arch/arm64/kernel/entry32_common.S
new file mode 100644
index 0000000..f4a5e4d
--- /dev/null
+++ b/arch/arm64/kernel/entry32_common.S
@@ -0,0 +1,107 @@
+/*
+ * Compat system call wrappers
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Authors: Will Deacon <[email protected]>
+ * Catalin Marinas <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+#include <asm/errno.h>
+#include <asm/page.h>
+
+/*
+ * Note: off_4k (w5) is always in units of 4K. If we can't do the
+ * requested offset because it is not page-aligned, we return -EINVAL.
+ */
+ENTRY(compat_sys_mmap2_wrapper)
+#if PAGE_SHIFT > 12
+ tst w5, #~PAGE_MASK >> 12
+ b.ne 1f
+ lsr w5, w5, #PAGE_SHIFT - 12
+#endif
+ b sys_mmap_pgoff
+1: mov x0, #-EINVAL
+ ret
+ENDPROC(compat_sys_mmap2_wrapper)
+
+/*
+ * Wrappers for AArch32 syscalls that either take 64-bit parameters
+ * in registers or that take 32-bit parameters which require sign
+ * extension.
+ */
+ENTRY(compat_sys_pread64_wrapper)
+ regs_to_64 x3, x4, x5
+ b sys_pread64
+ENDPROC(compat_sys_pread64_wrapper)
+
+ENTRY(compat_sys_pwrite64_wrapper)
+ regs_to_64 x3, x4, x5
+ b sys_pwrite64
+ENDPROC(compat_sys_pwrite64_wrapper)
+
+ENTRY(compat_sys_truncate64_wrapper)
+ regs_to_64 x1, x2, x3
+ b sys_truncate
+ENDPROC(compat_sys_truncate64_wrapper)
+
+ENTRY(compat_sys_ftruncate64_wrapper)
+ regs_to_64 x1, x2, x3
+ b sys_ftruncate
+ENDPROC(compat_sys_ftruncate64_wrapper)
+
+ENTRY(compat_sys_readahead_wrapper)
+ regs_to_64 x1, x2, x3
+ mov w2, w4
+ b sys_readahead
+ENDPROC(compat_sys_readahead_wrapper)
+
+ENTRY(compat_sys_fadvise64_64_wrapper)
+ mov w6, w1
+ regs_to_64 x1, x2, x3
+ regs_to_64 x2, x4, x5
+ mov w3, w6
+ b sys_fadvise64_64
+ENDPROC(compat_sys_fadvise64_64_wrapper)
+
+ENTRY(compat_sys_sync_file_range2_wrapper)
+ regs_to_64 x2, x2, x3
+ regs_to_64 x3, x4, x5
+ b sys_sync_file_range2
+ENDPROC(compat_sys_sync_file_range2_wrapper)
+
+ENTRY(compat_sys_fallocate_wrapper)
+ regs_to_64 x2, x2, x3
+ regs_to_64 x3, x4, x5
+ b sys_fallocate
+ENDPROC(compat_sys_fallocate_wrapper)
+
+ENTRY(compat_sys_statfs64_wrapper)
+ mov w3, #84
+ cmp w1, #88
+ csel w1, w3, w1, eq
+ b compat_sys_statfs64
+ENDPROC(compat_sys_statfs64_wrapper)
+
+ENTRY(compat_sys_fstatfs64_wrapper)
+ mov w3, #84
+ cmp w1, #88
+ csel w1, w3, w1, eq
+ b compat_sys_fstatfs64
+ENDPROC(compat_sys_fstatfs64_wrapper)
--
2.7.4
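
For readers less familiar with AArch64 assembly, the wrappers in the diff above can be approximated in portable C. This is only a sketch under stated assumptions: `join_halves`, `mmap2_pgoff` and `statfs64_fixup_size` are hypothetical names, `page_shift` is passed as a parameter instead of the kernel's compile-time PAGE_SHIFT, and the low/high register order assumes a little-endian kernel (the `regs_to_64` macro itself is not part of this hunk).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Join two 32-bit register halves into one 64-bit argument.
 * Assumption: little-endian layout, so the first register holds the
 * low word and the second register holds the high word.
 */
static uint64_t join_halves(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/*
 * Convert an mmap2() offset given in 4K units into units of the page
 * size. Returns -EINVAL if the offset is not page-aligned.
 * page_shift stands in for the kernel's PAGE_SHIFT (12, 14 or 16).
 */
static int64_t mmap2_pgoff(uint32_t off_4k, unsigned int page_shift)
{
	assert(page_shift >= 12 && page_shift <= 16);

	/* Same value as (~PAGE_MASK >> 12) in the assembly. */
	uint32_t sub_page_mask = (1u << (page_shift - 12)) - 1;

	if (off_4k & sub_page_mask)
		return -EINVAL;
	return off_4k >> (page_shift - 12);
}

/*
 * Mirror of the cmp/csel sequence in the statfs64 wrappers: a size
 * argument of 88 is replaced by 84, any other value is passed through.
 */
static uint32_t statfs64_fixup_size(uint32_t sz)
{
	return sz == 88 ? 84 : sz;
}
```

With 4K pages the alignment check disappears entirely (the mask is zero), which matches the `#if PAGE_SHIFT > 12` guard in the assembly.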

2016-10-21 20:51:01

by Yury Norov

Subject: [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to a separate file

Signed-off-by: Yury Norov <[email protected]>
---
arch/arm64/include/asm/signal32.h | 3 +
arch/arm64/include/asm/signal32_common.h | 27 +++++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/signal32.c | 107 ------------------------
arch/arm64/kernel/signal32_common.c | 135 +++++++++++++++++++++++++++++++
5 files changed, 166 insertions(+), 108 deletions(-)
create mode 100644 arch/arm64/include/asm/signal32_common.h
create mode 100644 arch/arm64/kernel/signal32_common.c

diff --git a/arch/arm64/include/asm/signal32.h b/arch/arm64/include/asm/signal32.h
index e68fcce..1c4ede7 100644
--- a/arch/arm64/include/asm/signal32.h
+++ b/arch/arm64/include/asm/signal32.h
@@ -13,6 +13,9 @@
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+
+#include <asm/signal32_common.h>
+
#ifndef __ASM_SIGNAL32_H
#define __ASM_SIGNAL32_H

diff --git a/arch/arm64/include/asm/signal32_common.h b/arch/arm64/include/asm/signal32_common.h
new file mode 100644
index 0000000..36c1ebc
--- /dev/null
+++ b/arch/arm64/include/asm/signal32_common.h
@@ -0,0 +1,27 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGNAL32_COMMON_H
+#define __ASM_SIGNAL32_COMMON_H
+
+#ifdef CONFIG_COMPAT
+
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from);
+int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from);
+
+int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set);
+int get_sigset_t(sigset_t *set, const compat_sigset_t __user *uset);
+
+#endif /* CONFIG_COMPAT */
+
+#endif /* __ASM_SIGNAL32_COMMON_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 06070f5..fdc0052 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,7 +30,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o binfmt_elf32.o
arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o sys_ilp32.o
-arm64-obj-$(CONFIG_COMPAT) += entry32_common.o
+arm64-obj-$(CONFIG_COMPAT) += entry32_common.o signal32_common.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index b7063de..f2c1a38 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -103,113 +103,6 @@ struct compat_rt_sigframe {

#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))

-static inline int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set)
-{
- compat_sigset_t cset;
-
- cset.sig[0] = set->sig[0] & 0xffffffffull;
- cset.sig[1] = set->sig[0] >> 32;
-
- return copy_to_user(uset, &cset, sizeof(*uset));
-}
-
-static inline int get_sigset_t(sigset_t *set,
- const compat_sigset_t __user *uset)
-{
- compat_sigset_t s32;
-
- if (copy_from_user(&s32, uset, sizeof(*uset)))
- return -EFAULT;
-
- set->sig[0] = s32.sig[0] | (((long)s32.sig[1]) << 32);
- return 0;
-}
-
-int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
-{
- int err;
-
- if (!access_ok(VERIFY_WRITE, to, sizeof(*to)))
- return -EFAULT;
-
- /* If you change siginfo_t structure, please be sure
- * this code is fixed accordingly.
- * It should never copy any pad contained in the structure
- * to avoid security leaks, but must copy the generic
- * 3 ints plus the relevant union member.
- * This routine must convert siginfo from 64bit to 32bit as well
- * at the same time.
- */
- err = __put_user(from->si_signo, &to->si_signo);
- err |= __put_user(from->si_errno, &to->si_errno);
- err |= __put_user((short)from->si_code, &to->si_code);
- if (from->si_code < 0)
- err |= __copy_to_user(&to->_sifields._pad, &from->_sifields._pad,
- SI_PAD_SIZE);
- else switch (from->si_code & __SI_MASK) {
- case __SI_KILL:
- err |= __put_user(from->si_pid, &to->si_pid);
- err |= __put_user(from->si_uid, &to->si_uid);
- break;
- case __SI_TIMER:
- err |= __put_user(from->si_tid, &to->si_tid);
- err |= __put_user(from->si_overrun, &to->si_overrun);
- err |= __put_user(from->si_int, &to->si_int);
- break;
- case __SI_POLL:
- err |= __put_user(from->si_band, &to->si_band);
- err |= __put_user(from->si_fd, &to->si_fd);
- break;
- case __SI_FAULT:
- err |= __put_user((compat_uptr_t)(unsigned long)from->si_addr,
- &to->si_addr);
-#ifdef BUS_MCEERR_AO
- /*
- * Other callers might not initialize the si_lsb field,
- * so check explicitly for the right codes here.
- */
- if (from->si_signo == SIGBUS &&
- (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO))
- err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
-#endif
- break;
- case __SI_CHLD:
- err |= __put_user(from->si_pid, &to->si_pid);
- err |= __put_user(from->si_uid, &to->si_uid);
- err |= __put_user(from->si_status, &to->si_status);
- err |= __put_user(from->si_utime, &to->si_utime);
- err |= __put_user(from->si_stime, &to->si_stime);
- break;
- case __SI_RT: /* This is not generated by the kernel as of now. */
- case __SI_MESGQ: /* But this is */
- err |= __put_user(from->si_pid, &to->si_pid);
- err |= __put_user(from->si_uid, &to->si_uid);
- err |= __put_user(from->si_int, &to->si_int);
- break;
- case __SI_SYS:
- err |= __put_user((compat_uptr_t)(unsigned long)
- from->si_call_addr, &to->si_call_addr);
- err |= __put_user(from->si_syscall, &to->si_syscall);
- err |= __put_user(from->si_arch, &to->si_arch);
- break;
- default: /* this is just in case for now ... */
- err |= __put_user(from->si_pid, &to->si_pid);
- err |= __put_user(from->si_uid, &to->si_uid);
- break;
- }
- return err;
-}
-
-int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from)
-{
- if (copy_from_user(to, from, __ARCH_SI_PREAMBLE_SIZE) ||
- copy_from_user(to->_sifields._pad,
- from->_sifields._pad, SI_PAD_SIZE))
- return -EFAULT;
-
- return 0;
-}
-
/*
* VFP save/restore code.
*
diff --git a/arch/arm64/kernel/signal32_common.c b/arch/arm64/kernel/signal32_common.c
new file mode 100644
index 0000000..c8cba96
--- /dev/null
+++ b/arch/arm64/kernel/signal32_common.c
@@ -0,0 +1,135 @@
+/*
+ * Based on arch/arm/kernel/signal.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Modified by Will Deacon <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compat.h>
+#include <linux/signal.h>
+#include <linux/ratelimit.h>
+
+#include <asm/esr.h>
+#include <asm/fpsimd.h>
+#include <asm/signal32_common.h>
+#include <asm/uaccess.h>
+#include <asm/unistd.h>
+
+int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set)
+{
+ compat_sigset_t cset;
+
+ cset.sig[0] = set->sig[0] & 0xffffffffull;
+ cset.sig[1] = set->sig[0] >> 32;
+
+ return copy_to_user(uset, &cset, sizeof(*uset));
+}
+
+int get_sigset_t(sigset_t *set, const compat_sigset_t __user *uset)
+{
+ compat_sigset_t s32;
+
+ if (copy_from_user(&s32, uset, sizeof(*uset)))
+ return -EFAULT;
+
+ set->sig[0] = s32.sig[0] | (((long)s32.sig[1]) << 32);
+ return 0;
+}
+
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
+{
+ int err;
+
+ if (!access_ok(VERIFY_WRITE, to, sizeof(*to)))
+ return -EFAULT;
+
+ /* If you change siginfo_t structure, please be sure
+ * this code is fixed accordingly.
+ * It should never copy any pad contained in the structure
+ * to avoid security leaks, but must copy the generic
+ * 3 ints plus the relevant union member.
+ * This routine must convert siginfo from 64bit to 32bit as well
+ * at the same time.
+ */
+ err = __put_user(from->si_signo, &to->si_signo);
+ err |= __put_user(from->si_errno, &to->si_errno);
+ err |= __put_user((short)from->si_code, &to->si_code);
+ if (from->si_code < 0)
+ err |= __copy_to_user(&to->_sifields._pad, &from->_sifields._pad,
+ SI_PAD_SIZE);
+ else switch (from->si_code & __SI_MASK) {
+ case __SI_KILL:
+ err |= __put_user(from->si_pid, &to->si_pid);
+ err |= __put_user(from->si_uid, &to->si_uid);
+ break;
+ case __SI_TIMER:
+ err |= __put_user(from->si_tid, &to->si_tid);
+ err |= __put_user(from->si_overrun, &to->si_overrun);
+ err |= __put_user(from->si_int, &to->si_int);
+ break;
+ case __SI_POLL:
+ err |= __put_user(from->si_band, &to->si_band);
+ err |= __put_user(from->si_fd, &to->si_fd);
+ break;
+ case __SI_FAULT:
+ err |= __put_user((compat_uptr_t)(unsigned long)from->si_addr,
+ &to->si_addr);
+#ifdef BUS_MCEERR_AO
+ /*
+ * Other callers might not initialize the si_lsb field,
+ * so check explicitly for the right codes here.
+ */
+ if (from->si_signo == SIGBUS &&
+ (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO))
+ err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
+#endif
+ break;
+ case __SI_CHLD:
+ err |= __put_user(from->si_pid, &to->si_pid);
+ err |= __put_user(from->si_uid, &to->si_uid);
+ err |= __put_user(from->si_status, &to->si_status);
+ err |= __put_user(from->si_utime, &to->si_utime);
+ err |= __put_user(from->si_stime, &to->si_stime);
+ break;
+ case __SI_RT: /* This is not generated by the kernel as of now. */
+ case __SI_MESGQ: /* But this is */
+ err |= __put_user(from->si_pid, &to->si_pid);
+ err |= __put_user(from->si_uid, &to->si_uid);
+ err |= __put_user(from->si_int, &to->si_int);
+ break;
+ case __SI_SYS:
+ err |= __put_user((compat_uptr_t)(unsigned long)
+ from->si_call_addr, &to->si_call_addr);
+ err |= __put_user(from->si_syscall, &to->si_syscall);
+ err |= __put_user(from->si_arch, &to->si_arch);
+ break;
+ default: /* this is just in case for now ... */
+ err |= __put_user(from->si_pid, &to->si_pid);
+ err |= __put_user(from->si_uid, &to->si_uid);
+ break;
+ }
+ return err;
+}
+
+int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from)
+{
+ if (copy_from_user(to, from, __ARCH_SI_PREAMBLE_SIZE) ||
+ copy_from_user(to->_sifields._pad,
+ from->_sifields._pad, SI_PAD_SIZE))
+ return -EFAULT;
+
+ return 0;
+}
--
2.7.4
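
The sigset helpers moved by this patch split one 64-bit signal mask into two 32-bit words and join them again. A minimal user-space sketch of that arithmetic follows; `struct sigset32`, `split_sigset` and `join_sigset` are hypothetical stand-ins for `compat_sigset_t`, `put_sigset_t()` and `get_sigset_t()`, and the copy to and from user memory is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for compat_sigset_t: two 32-bit words. */
struct sigset32 {
	uint32_t sig[2];
};

/*
 * Split a 64-bit signal mask into a low and a high 32-bit word,
 * the same arithmetic put_sigset_t() applies to set->sig[0].
 */
static struct sigset32 split_sigset(uint64_t mask)
{
	struct sigset32 s;

	s.sig[0] = (uint32_t)(mask & 0xffffffffull);
	s.sig[1] = (uint32_t)(mask >> 32);
	return s;
}

/* Rebuild the 64-bit mask from the two words, as get_sigset_t() does. */
static uint64_t join_sigset(const struct sigset32 *s)
{
	assert(s != NULL);
	return (uint64_t)s->sig[0] | ((uint64_t)s->sig[1] << 32);
}
```

Splitting and then joining returns the original mask, so the round trip through the 32-bit layout loses no signal bits.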

2016-10-21 20:51:23

by Yury Norov

Subject: [PATCH 18/18] arm64:ilp32: add ARM64_ILP32 to Kconfig

From: Andrew Pinski <[email protected]>

This patch adds the config option for ILP32.

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Philipp Tomsich <[email protected]>
Signed-off-by: Christoph Muellner <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Reviewed-by: David Daney <[email protected]>
---
arch/arm64/Kconfig | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9efa86a..07e177f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -996,7 +996,7 @@ source "fs/Kconfig.binfmt"

config COMPAT
bool
- depends on AARCH32_EL0
+ depends on AARCH32_EL0 || ARM64_ILP32

config AARCH32_EL0
bool "Kernel support for 32-bit EL0"
@@ -1018,6 +1018,14 @@ config AARCH32_EL0

If you want to execute 32-bit userspace applications, say Y.

+config ARM64_ILP32
+ bool "Kernel support for ILP32"
+ select COMPAT
+ help
+ This option enables support for AArch64 ILP32 user space. ILP32
+ is an ABI where long and pointers are 32 bits wide, but which uses
+ the AArch64 instruction set.
+
config SYSVIPC_COMPAT
def_bool y
depends on COMPAT && SYSVIPC
--
2.7.4
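
To make the "long and pointers are 32 bits" wording of the help text concrete, here is a small sketch that classifies a C data model from type sizes. The names `enum model`, `data_model` and `this_build_model` are hypothetical and exist only for this illustration; the sketch says nothing about how the kernel itself detects ILP32 tasks.

```c
#include <assert.h>
#include <stddef.h>

enum model { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/* Classify a data model from the sizes of int, long and pointers. */
static enum model data_model(size_t int_sz, size_t long_sz, size_t ptr_sz)
{
	assert(int_sz > 0 && long_sz > 0 && ptr_sz > 0);

	if (int_sz == 4 && long_sz == 4 && ptr_sz == 4)
		return MODEL_ILP32;	/* int, long, pointer: all 32-bit */
	if (int_sz == 4 && long_sz == 8 && ptr_sz == 8)
		return MODEL_LP64;	/* long and pointer: 64-bit */
	return MODEL_OTHER;
}

/* Data model of the ABI this translation unit was compiled for. */
static enum model this_build_model(void)
{
	return data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Built with an AArch64 compiler using `-mabi=ilp32`, `this_build_model()` would be expected to report MODEL_ILP32, while the default LP64 build reports MODEL_LP64.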

2016-10-21 21:06:57

by Yury Norov

Subject: [PATCH 03/18] arm64: rename COMPAT to AARCH32_EL0 in Kconfig

From: Andrew Pinski <[email protected]>

This patchset adds ILP32 ABI support. In addition to AARCH32, which is
binary-compatible with ARM, ILP32 is (mostly) ABI-compatible.

From now on, the AARCH32_EL0 (formerly COMPAT) config option means support
for AARCH32 userspace, ARM64_ILP32 means support for the ILP32 ABI (see the
next patches), and COMPAT indicates that one of them, or both, is enabled.

Where needed, CONFIG_COMPAT is changed over to CONFIG_AARCH32_EL0.

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Philipp Tomsich <[email protected]>
Signed-off-by: Christoph Muellner <[email protected]>
Signed-off-by: Bamvor Jian Zhang <[email protected]>
---
arch/arm64/Kconfig | 10 ++++++++--
arch/arm64/include/asm/fpsimd.h | 2 +-
arch/arm64/include/asm/hwcap.h | 4 ++--
arch/arm64/include/asm/processor.h | 6 +++---
arch/arm64/include/asm/ptrace.h | 2 +-
arch/arm64/include/asm/seccomp.h | 2 +-
arch/arm64/include/asm/signal32.h | 6 ++++--
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/asm-offsets.c | 2 +-
arch/arm64/kernel/cpufeature.c | 8 ++++----
arch/arm64/kernel/cpuinfo.c | 20 +++++++++++---------
arch/arm64/kernel/entry.S | 6 +++---
arch/arm64/kernel/head.S | 2 +-
arch/arm64/kernel/ptrace.c | 8 ++++----
arch/arm64/kernel/traps.c | 2 +-
arch/arm64/kernel/vdso.c | 4 ++--
drivers/clocksource/arm_arch_timer.c | 2 +-
18 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 30398db..0cd786e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -396,7 +396,7 @@ config ARM64_ERRATUM_834220

config ARM64_ERRATUM_845719
bool "Cortex-A53: 845719: a load might read incorrect data"
- depends on COMPAT
+ depends on AARCH32_EL0
default y
help
This option adds an alternative code sequence to work around ARM
@@ -725,7 +725,7 @@ config FORCE_MAX_ZONEORDER

menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
- depends on COMPAT
+ depends on AARCH32_EL0
help
Legacy software support may require certain instructions
that have been deprecated or obsoleted in the architecture.
@@ -995,8 +995,14 @@ menu "Userspace binary formats"
source "fs/Kconfig.binfmt"

config COMPAT
+ bool
+ depends on AARCH32_EL0
+
+config AARCH32_EL0
bool "Kernel support for 32-bit EL0"
+ def_bool y
depends on ARM64_4K_PAGES || EXPERT
+ select COMPAT
select COMPAT_BINFMT_ELF
select HAVE_UID16
select OLD_SIGSUSPEND3
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50f559f..63b19f1 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -52,7 +52,7 @@ struct fpsimd_partial_state {
};


-#if defined(__KERNEL__) && defined(CONFIG_COMPAT)
+#if defined(__KERNEL__) && defined(CONFIG_AARCH32_EL0)
/* Masks for extracting the FPSR and FPCR from the FPSCR */
#define VFP_FPSCR_STAT_MASK 0xf800009f
#define VFP_FPSCR_CTRL_MASK 0x07f79f00
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 400b80b..2c7fc5d 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -46,7 +46,7 @@
*/
#define ELF_HWCAP (elf_hwcap)

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#define COMPAT_ELF_HWCAP (compat_elf_hwcap)
#define COMPAT_ELF_HWCAP2 (compat_elf_hwcap2)
extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;
@@ -54,7 +54,7 @@ extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;

enum {
CAP_HWCAP = 1,
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
CAP_COMPAT_HWCAP,
CAP_COMPAT_HWCAP2,
#endif
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index df2e53d..6173a7b 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -79,7 +79,7 @@ struct cpu_context {
struct thread_struct {
struct cpu_context cpu_context; /* cpu context */
unsigned long tp_value; /* TLS register */
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
unsigned long tp2_value;
#endif
struct fpsimd_state fpsimd_state;
@@ -88,7 +88,7 @@ struct thread_struct {
struct debug_info debug; /* debugging */
};

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#define task_user_tls(t) \
({ \
unsigned long *__tls; \
@@ -119,7 +119,7 @@ static inline void start_thread(struct pt_regs *regs, unsigned long pc,
regs->sp = sp;
}

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
{
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index ada08b5..f5ca5f5 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -125,7 +125,7 @@ struct pt_regs {

#define arch_has_single_step() (1)

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#define compat_thumb_mode(regs) \
(((regs)->pstate & COMPAT_PSR_T_BIT))
#else
diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
index c76fac9..00ef0bf 100644
--- a/arch/arm64/include/asm/seccomp.h
+++ b/arch/arm64/include/asm/seccomp.h
@@ -13,7 +13,7 @@

#include <asm/unistd.h>

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#define __NR_seccomp_read_32 __NR_compat_read
#define __NR_seccomp_write_32 __NR_compat_write
#define __NR_seccomp_exit_32 __NR_compat_exit
diff --git a/arch/arm64/include/asm/signal32.h b/arch/arm64/include/asm/signal32.h
index eeaa975..e68fcce 100644
--- a/arch/arm64/include/asm/signal32.h
+++ b/arch/arm64/include/asm/signal32.h
@@ -17,7 +17,9 @@
#define __ASM_SIGNAL32_H

#ifdef __KERNEL__
-#ifdef CONFIG_COMPAT
+
+#ifdef CONFIG_AARCH32_EL0
+
#include <linux/compat.h>

#define AARCH32_KERN_SIGRET_CODE_OFFSET 0x500
@@ -47,6 +49,6 @@ static inline int compat_setup_rt_frame(int usig, struct ksignal *ksig, sigset_t
static inline void compat_setup_restart_syscall(struct pt_regs *regs)
{
}
-#endif /* CONFIG_COMPAT */
+#endif /* CONFIG_AARCH32_EL0 */
#endif /* __KERNEL__ */
#endif /* __ASM_SIGNAL32_H */
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index e78ac26..fe9d6c1 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -13,7 +13,7 @@
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#define __ARCH_WANT_COMPAT_SYS_GETDENTS64
#define __ARCH_WANT_COMPAT_STAT64
#define __ARCH_WANT_SYS_GETHOSTNAME
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 7d66bba..8a19fda 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -27,7 +27,7 @@ OBJCOPYFLAGS := --prefix-symbols=__efistub_
$(obj)/%.stub.o: $(obj)/%.o FORCE
$(call if_changed,objcopy)

-arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
+arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 4a2f0f0..d8d7086 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -65,7 +65,7 @@ int main(void)
DEFINE(S_X28, offsetof(struct pt_regs, regs[28]));
DEFINE(S_LR, offsetof(struct pt_regs, regs[30]));
DEFINE(S_SP, offsetof(struct pt_regs, sp));
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
DEFINE(S_COMPAT_SP, offsetof(struct pt_regs, compat_sp));
#endif
DEFINE(S_PSTATE, offsetof(struct pt_regs, pstate));
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index b3ac0c4..12805ee 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -32,7 +32,7 @@
unsigned long elf_hwcap __read_mostly;
EXPORT_SYMBOL_GPL(elf_hwcap);

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#define COMPAT_ELF_HWCAP_DEFAULT \
(COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\
COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\
@@ -859,7 +859,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
};

static const struct arm64_cpu_capabilities compat_elf_hwcaps[] = {
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL),
HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES),
HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1),
@@ -875,7 +875,7 @@ static void __init cap_set_elf_hwcap(const struct arm64_cpu_capabilities *cap)
case CAP_HWCAP:
elf_hwcap |= cap->hwcap;
break;
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
case CAP_COMPAT_HWCAP:
compat_elf_hwcap |= (u32)cap->hwcap;
break;
@@ -898,7 +898,7 @@ static bool cpus_have_elf_hwcap(const struct arm64_cpu_capabilities *cap)
case CAP_HWCAP:
rc = (elf_hwcap & cap->hwcap) != 0;
break;
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
case CAP_COMPAT_HWCAP:
rc = (compat_elf_hwcap & (u32)cap->hwcap) != 0;
break;
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index c742df5..b76c759 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -134,15 +134,17 @@ static int c_show(struct seq_file *m, void *v)
*/
seq_puts(m, "Features\t:");
if (compat) {
-#ifdef CONFIG_COMPAT
- for (j = 0; compat_hwcap_str[j]; j++)
- if (compat_elf_hwcap & (1 << j))
- seq_printf(m, " %s", compat_hwcap_str[j]);
-
- for (j = 0; compat_hwcap2_str[j]; j++)
- if (compat_elf_hwcap2 & (1 << j))
- seq_printf(m, " %s", compat_hwcap2_str[j]);
-#endif /* CONFIG_COMPAT */
+#ifdef CONFIG_AARCH32_EL0
+ if (personality(current->personality) == PER_LINUX32) {
+ for (j = 0; compat_hwcap_str[j]; j++)
+ if (compat_elf_hwcap & (1 << j))
+ seq_printf(m, " %s", compat_hwcap_str[j]);
+
+ for (j = 0; compat_hwcap2_str[j]; j++)
+ if (compat_elf_hwcap2 & (1 << j))
+ seq_printf(m, " %s", compat_hwcap2_str[j]);
+ }
+#endif /* CONFIG_AARCH32_EL0 */
} else {
for (j = 0; hwcap_str[j]; j++)
if (elf_hwcap & (1 << j))
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 223d54a..b6fb14b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -271,7 +271,7 @@ ENTRY(vectors)
ventry el0_fiq_invalid // FIQ 64-bit EL0
ventry el0_error_invalid // Error 64-bit EL0

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
ventry el0_sync_compat // Synchronous 32-bit EL0
ventry el0_irq_compat // IRQ 32-bit EL0
ventry el0_fiq_invalid_compat // FIQ 32-bit EL0
@@ -311,7 +311,7 @@ el0_error_invalid:
inv_entry 0, BAD_ERROR
ENDPROC(el0_error_invalid)

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
el0_fiq_invalid_compat:
inv_entry 0, BAD_FIQ, 32
ENDPROC(el0_fiq_invalid_compat)
@@ -479,7 +479,7 @@ el0_sync:
b.ge el0_dbg
b el0_inv

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
.align 6
el0_sync_compat:
kernel_entry 0, 32
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 427f6d3..10cb017 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -575,7 +575,7 @@ CPU_LE( movk x0, #0x30d0, lsl #16 ) // Clear EE and E0E on LE systems
msr cptr_el2, x0 // Disable copro. traps to EL2
1:

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
msr hstr_el2, xzr // Disable CP15 traps to EL2
#endif

diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index e0c81da..1d6f43e 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -183,7 +183,7 @@ static void ptrace_hbptriggered(struct perf_event *bp,
.si_addr = (void __user *)(bkpt->trigger),
};

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
int i;

if (!is_compat_task())
@@ -758,7 +758,7 @@ static const struct user_regset_view user_aarch64_view = {
.regsets = aarch64_regsets, .n = ARRAY_SIZE(aarch64_regsets)
};

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
#include <linux/compat.h>

enum compat_regset {
@@ -1293,11 +1293,11 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,

return ret;
}
-#endif /* CONFIG_COMPAT */
+#endif /* CONFIG_AARCH32_EL0 */

const struct user_regset_view *task_user_regset_view(struct task_struct *task)
{
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
/*
* Core dumping of 32-bit tasks or compat ptrace requests must use the
* user_aarch32_view compatible with arm32. Native ptrace requests on
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 5ff020f..14a08a0 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -526,7 +526,7 @@ long compat_arm_syscall(struct pt_regs *regs);

asmlinkage long do_ni_syscall(struct pt_regs *regs)
{
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
long ret;
if (is_compat_task()) {
ret = compat_arm_syscall(regs);
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index a2c2478..7f822cd 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -49,7 +49,7 @@ static union {
} vdso_data_store __page_aligned_data;
struct vdso_data *vdso_data = &vdso_data_store.data;

-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
/*
* Create and map the vectors page for AArch32 tasks.
*/
@@ -108,7 +108,7 @@ int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)

return PTR_ERR_OR_ZERO(ret);
}
-#endif /* CONFIG_COMPAT */
+#endif /* CONFIG_AARCH32_EL0 */

static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
{
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 73c487d..0ed1b62 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -418,7 +418,7 @@ static void arch_timer_evtstrm_enable(int divider)
| ARCH_TIMER_VIRT_EVT_EN;
arch_timer_set_cntkctl(cntkctl);
elf_hwcap |= HWCAP_EVTSTRM;
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
compat_elf_hwcap |= COMPAT_HWCAP_EVTSTRM;
#endif
}
--
2.7.4
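
The cpuinfo.c hunk in this patch keeps the existing bit-to-name loops and wraps them in a PER_LINUX32 personality check. The loop itself can be sketched in user space as below. This is an illustration only: `sample_hwcap_str` is a made-up short table (the kernel's real tables are longer), and `format_features` is a hypothetical helper that writes into a buffer instead of a seq_file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative subset of feature names; index j corresponds to bit j. */
static const char *const sample_hwcap_str[] = {
	"swp", "half", "thumb", "26bit", "fastmult", NULL
};

/*
 * Append " name" to buf for every bit set in hwcap, in the style of
 * the loops in c_show(). Returns the number of characters written.
 */
static size_t format_features(char *buf, size_t len, unsigned int hwcap)
{
	size_t used = 0;
	int j;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (j = 0; sample_hwcap_str[j]; j++) {
		if (!(hwcap & (1u << j)))
			continue;

		int n = snprintf(buf + used, len - used, " %s",
				 sample_hwcap_str[j]);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of space; last name may be cut short */
		used += (size_t)n;
	}
	return used;
}
```

The NULL terminator in the table is what ends the loop, so the table length never has to be passed separately.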

2016-10-21 21:07:24

by Yury Norov

Subject: [PATCH 04/18] arm64: ensure the kernel is compiled for LP64

From: Andrew Pinski <[email protected]>

The kernel needs to be compiled as an LP64 binary for ARM64, even when
using a compiler that defaults to code generation for the ILP32 ABI.
Consequently, we need to explicitly pass '-mabi=lp64' (supported on
gcc-4.9 and newer).

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Philipp Tomsich <[email protected]>
Signed-off-by: Christoph Muellner <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Reviewed-by: David Daney <[email protected]>
---
arch/arm64/Makefile | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index ab51aed..80eb000 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -42,15 +42,20 @@ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
KBUILD_CFLAGS += $(call cc-option, -mpc-relative-literal-loads)
KBUILD_AFLAGS += $(lseinstr)

+KBUILD_CFLAGS += $(call cc-option,-mabi=lp64)
+KBUILD_AFLAGS += $(call cc-option,-mabi=lp64)
+
ifeq ($(CONFIG_CPU_BIG_ENDIAN), y)
KBUILD_CPPFLAGS += -mbig-endian
AS += -EB
LD += -EB
+LDFLAGS += -maarch64linuxb
UTS_MACHINE := aarch64_be
else
KBUILD_CPPFLAGS += -mlittle-endian
AS += -EL
LD += -EL
+LDFLAGS += -maarch64linux
UTS_MACHINE := aarch64
endif

--
2.7.4

2016-10-21 21:08:17

by Yury Norov

Subject: [PATCH 15/18] arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext

From: Andrew Pinski <[email protected]>

ILP32 uses AARCH32 compat structures and syscall handlers for signals.
However, the ILP32 struct rt_sigframe and ucontext differ from both LP64
and AARCH32, so ILP32-specific handlers are needed for them.
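
The resulting three-way choice made in the handle_signal() hunk of this patch can be modelled as a small dispatch function. This is a sketch, not kernel code: `enum task_abi`, `enum frame_kind` and `pick_frame` are hypothetical, and because the condition that selects between the two AArch32 frames is not visible in the hunk, the sketch assumes it is the SA_SIGINFO flag.

```c
#include <assert.h>

/*
 * Hypothetical task ABI tags; the kernel uses predicates such as
 * is_ilp32_compat_task() instead of an enum.
 */
enum task_abi { ABI_LP64, ABI_AARCH32, ABI_ILP32 };

enum frame_kind {
	FRAME_NATIVE_RT,	/* setup_rt_frame() */
	FRAME_COMPAT,		/* compat_setup_frame() */
	FRAME_COMPAT_RT,	/* compat_setup_rt_frame() */
	FRAME_ILP32_RT		/* ilp32_setup_rt_frame() */
};

/* Choose which signal frame layout to build for a task. */
static enum frame_kind pick_frame(enum task_abi abi, int wants_siginfo)
{
	assert(abi == ABI_LP64 || abi == ABI_AARCH32 || abi == ABI_ILP32);

	if (abi == ABI_AARCH32)	/* assumed: chosen by SA_SIGINFO */
		return wants_siginfo ? FRAME_COMPAT_RT : FRAME_COMPAT;
	if (abi == ABI_ILP32)	/* new branch added by this patch */
		return FRAME_ILP32_RT;
	return FRAME_NATIVE_RT;
}
```

ILP32 tasks always get the rt frame in this model, since the patch only introduces ilp32_setup_rt_frame() and no non-rt variant.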

Signed-off-by: Andrew Pinski <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
---
arch/arm64/include/asm/signal_ilp32.h | 38 ++++++++
arch/arm64/kernel/Makefile | 3 +-
arch/arm64/kernel/entry_ilp32.S | 22 +++++
arch/arm64/kernel/signal.c | 3 +
arch/arm64/kernel/signal_ilp32.c | 174 ++++++++++++++++++++++++++++++++++
5 files changed, 239 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/signal_ilp32.h
create mode 100644 arch/arm64/kernel/entry_ilp32.S
create mode 100644 arch/arm64/kernel/signal_ilp32.c

diff --git a/arch/arm64/include/asm/signal_ilp32.h b/arch/arm64/include/asm/signal_ilp32.h
new file mode 100644
index 0000000..d3210d8
--- /dev/null
+++ b/arch/arm64/include/asm/signal_ilp32.h
@@ -0,0 +1,38 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <asm/signal32_common.h>
+#include <asm/signal_common.h>
+
+#ifndef __ASM_SIGNAL_ILP32_H
+#define __ASM_SIGNAL_ILP32_H
+
+#ifdef CONFIG_ARM64_ILP32
+
+#include <linux/compat.h>
+
+int ilp32_setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
+ struct pt_regs *regs);
+
+#else
+
+static inline int ilp32_setup_rt_frame(int usig, struct ksignal *ksig,
+ sigset_t *set, struct pt_regs *regs)
+{
+ return -ENOSYS;
+}
+
+#endif /* CONFIG_ARM64_ILP32 */
+
+#endif /* __ASM_SIGNAL_ILP32_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index fdc0052..af400fb 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,7 +29,8 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE

arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
sys_compat.o entry32.o binfmt_elf32.o
-arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o sys_ilp32.o
+arm64-obj-$(CONFIG_ARM64_ILP32) += binfmt_ilp32.o sys_ilp32.o \
+ signal_ilp32.o entry_ilp32.o
arm64-obj-$(CONFIG_COMPAT) += entry32_common.o signal32_common.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
diff --git a/arch/arm64/kernel/entry_ilp32.S b/arch/arm64/kernel/entry_ilp32.S
new file mode 100644
index 0000000..a8bb94b
--- /dev/null
+++ b/arch/arm64/kernel/entry_ilp32.S
@@ -0,0 +1,22 @@
+/*
+ * ILP32 system call wrappers
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+ENTRY(ilp32_sys_rt_sigreturn_wrapper)
+ mov x0, sp
+ b ilp32_sys_rt_sigreturn
+ENDPROC(ilp32_sys_rt_sigreturn_wrapper)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 478d6c5..1b130f4 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -35,6 +35,7 @@
#include <asm/signal32.h>
#include <asm/vdso.h>
#include <asm/signal_common.h>
+#include <asm/signal_ilp32.h>

#define RT_SIGFRAME_FP_POS (offsetof(struct rt_sigframe, sig) \
+ offsetof(struct sigframe, fp))
@@ -325,6 +326,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
ret = compat_setup_rt_frame(usig, ksig, oldset, regs);
else
ret = compat_setup_frame(usig, ksig, oldset, regs);
+ } else if (is_ilp32_compat_task()) {
+ ret = ilp32_setup_rt_frame(usig, ksig, oldset, regs);
} else {
ret = setup_rt_frame(usig, ksig, oldset, regs);
}
diff --git a/arch/arm64/kernel/signal_ilp32.c b/arch/arm64/kernel/signal_ilp32.c
new file mode 100644
index 0000000..6f9b7aa
--- /dev/null
+++ b/arch/arm64/kernel/signal_ilp32.c
@@ -0,0 +1,174 @@
+/*
+ * Based on arch/arm/kernel/signal.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 Cavium Networks.
+ * Yury Norov <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compat.h>
+#include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/ratelimit.h>
+
+#include <asm/esr.h>
+#include <asm/fpsimd.h>
+#include <asm/signal_ilp32.h>
+#include <asm/uaccess.h>
+#include <asm/unistd.h>
+#include <asm/ucontext.h>
+
+
+#define ILP32_RT_SIGFRAME_FP_POS (offsetof(struct ilp32_rt_sigframe, sig) \
+ + offsetof(struct ilp32_sigframe, fp))
+
+struct ilp32_ucontext {
+ u32 uc_flags;
+ u32 uc_link;
+ compat_stack_t uc_stack;
+ compat_sigset_t uc_sigmask;
+ /* glibc uses a 1024-bit sigset_t */
+ __u8 unused[1024 / 8 - sizeof(compat_sigset_t)];
+ /* last for future expansion */
+ struct sigcontext uc_mcontext;
+};
+
+struct ilp32_sigframe {
+ struct ilp32_ucontext uc;
+ u64 fp;
+ u64 lr;
+};
+
+struct ilp32_rt_sigframe {
+ struct compat_siginfo info;
+ struct ilp32_sigframe sig;
+};
+
+static int restore_ilp32_sigframe(struct pt_regs *regs,
+ struct ilp32_sigframe __user *sf)
+{
+ int err;
+ sigset_t set;
+
+ err = get_sigset_t(&set, &sf->uc.uc_sigmask);
+ if (err == 0)
+ set_current_blocked(&set);
+ err |= restore_sigcontext(regs, &sf->uc.uc_mcontext);
+ return err;
+}
+
+static int setup_ilp32_sigframe(struct ilp32_sigframe __user *sf,
+ struct pt_regs *regs, sigset_t *set)
+{
+ int err = 0;
+
+ /* set up the stack frame for unwinding */
+ __put_user_error(regs->regs[29], &sf->fp, err);
+ __put_user_error(regs->regs[30], &sf->lr, err);
+
+ err |= put_sigset_t(&sf->uc.uc_sigmask, set);
+ err |= setup_sigcontext(&sf->uc.uc_mcontext, regs);
+ return err;
+}
+
+asmlinkage long ilp32_sys_rt_sigreturn(struct pt_regs *regs)
+{
+ struct ilp32_rt_sigframe __user *frame;
+
+ /* Always make any pending restarted system calls return -EINTR */
+ current->restart_block.fn = do_no_restart_syscall;
+
+ /*
+ * Since we stacked the signal on a 128-bit boundary,
+ * 'sp' should be 16-byte aligned here. If it's
+ * not, then the user is trying to mess with us.
+ */
+ if (regs->sp & 15)
+ goto badframe;
+
+ frame = (struct ilp32_rt_sigframe __user *) regs->sp;
+
+ if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
+ goto badframe;
+
+ if (restore_ilp32_sigframe(regs, &frame->sig))
+ goto badframe;
+
+ if (compat_restore_altstack(&frame->sig.uc.uc_stack))
+ goto badframe;
+
+ return regs->regs[0];
+
+badframe:
+ if (show_unhandled_signals)
+ pr_info_ratelimited("%s[%d]: bad frame in %s: pc=%08llx sp=%08llx\n",
+ current->comm, task_pid_nr(current),
+ __func__, regs->pc, regs->sp);
+ force_sig(SIGSEGV, current);
+
+ return 0;
+}
+
+static struct ilp32_rt_sigframe __user *ilp32_get_sigframe(struct ksignal *ksig,
+ struct pt_regs *regs)
+{
+ unsigned long sp, sp_top;
+ struct ilp32_rt_sigframe __user *frame;
+
+ sp = sp_top = sigsp(regs->sp, ksig);
+
+ sp = (sp - sizeof(struct ilp32_rt_sigframe)) & ~15;
+ frame = (struct ilp32_rt_sigframe __user *)sp;
+
+ /*
+ * Check that we can actually write to the signal frame.
+ */
+ if (!access_ok(VERIFY_WRITE, frame, sp_top - sp))
+ frame = NULL;
+
+ return frame;
+}
+
+/*
+ * ILP32 signal handling routines called from signal.c
+ */
+int ilp32_setup_rt_frame(int usig, struct ksignal *ksig,
+ sigset_t *set, struct pt_regs *regs)
+{
+ struct ilp32_rt_sigframe __user *frame;
+ int err = 0;
+
+ frame = ilp32_get_sigframe(ksig, regs);
+
+ if (!frame)
+ return 1;
+
+ err |= copy_siginfo_to_user32(&frame->info, &ksig->info);
+
+ __put_user_error(0, &frame->sig.uc.uc_flags, err);
+ __put_user_error(0, &frame->sig.uc.uc_link, err);
+
+ err |= __compat_save_altstack(&frame->sig.uc.uc_stack, regs->sp);
+ err |= setup_ilp32_sigframe(&frame->sig, regs, set);
+ if (err == 0) {
+ setup_return(regs, &ksig->ka,
+ frame, ILP32_RT_SIGFRAME_FP_POS, usig);
+ regs->regs[1] = (unsigned long)&frame->info;
+ regs->regs[2] = (unsigned long)&frame->sig.uc;
+ }
+
+ return err;
+}
--
2.7.4
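
For readers following the frame placement in ilp32_get_sigframe() and the matching check in ilp32_sys_rt_sigreturn(), the arithmetic can be sketched in userspace as below. This is only an illustration: place_frame() and frame_is_aligned() are hypothetical helper names, not functions from the patch, and the sigsp() and access_ok() steps are left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: reserve frame_size bytes
 * below sp and round the result down to a 16-byte (128-bit) boundary,
 * the same arithmetic as "(sp - sizeof(frame)) & ~15".
 */
static uint64_t place_frame(uint64_t sp, size_t frame_size)
{
	return (sp - frame_size) & ~(uint64_t)15;
}

/* Hypothetical helper mirroring the "regs->sp & 15" sanity check. */
static int frame_is_aligned(uint64_t sp)
{
	return (sp & 15) == 0;
}
```

Any address returned by place_frame() passes frame_is_aligned(), which is why a misaligned sp at sigreturn time is treated as a bad frame.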

2016-10-21 21:09:12

by Yury Norov

Subject: [PATCH 13/18] arm64: signal: share lp64 signal routines to ilp32

Split the lp64 signal frame setup and restore code into helpers and
export them through asm/signal_common.h, so that ilp32 can reuse them.
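
Because fp and lr move into a nested struct sigframe, the frame-pointer offset passed to setup_return() becomes a sum of two offsetof() terms. A minimal userspace sketch of that computation follows. The mock_* types and their sizes are assumptions made for illustration; they are not the kernel's struct siginfo or struct ucontext.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel types; the sizes are illustrative only. */
struct mock_siginfo  { uint8_t bytes[128]; };
struct mock_ucontext { uint8_t bytes[64]; };

/* Same shape as struct sigframe in the patch: context, then fp and lr. */
struct mock_sigframe {
	struct mock_ucontext uc;
	uint64_t fp;
	uint64_t lr;
};

/* Same shape as struct rt_sigframe: siginfo followed by the sigframe. */
struct mock_rt_sigframe {
	struct mock_siginfo info;
	struct mock_sigframe sig;
};

/* Mirrors RT_SIGFRAME_FP_POS: offset of fp from the start of the frame. */
#define MOCK_RT_SIGFRAME_FP_POS \
	(offsetof(struct mock_rt_sigframe, sig) + \
	 offsetof(struct mock_sigframe, fp))

static size_t mock_fp_pos(void)
{
	return MOCK_RT_SIGFRAME_FP_POS;
}
```

Passing the offset as a parameter, rather than hard-coding offsetof(struct rt_sigframe, fp), is what lets a different frame layout share the same setup_return().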

Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Bamvor Zhang Jian <[email protected]>
---
arch/arm64/include/asm/signal_common.h | 33 ++++++++++++
arch/arm64/kernel/signal.c | 93 +++++++++++++++++++++-------------
2 files changed, 92 insertions(+), 34 deletions(-)
create mode 100644 arch/arm64/include/asm/signal_common.h

diff --git a/arch/arm64/include/asm/signal_common.h b/arch/arm64/include/asm/signal_common.h
new file mode 100644
index 0000000..756ed2c
--- /dev/null
+++ b/arch/arm64/include/asm/signal_common.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 Cavium Networks.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_SIGNAL_COMMON_H
+#define __ASM_SIGNAL_COMMON_H
+
+#include <linux/uaccess.h>
+#include <asm/ucontext.h>
+#include <asm/fpsimd.h>
+
+int preserve_fpsimd_context(struct fpsimd_context __user *ctx);
+int restore_fpsimd_context(struct fpsimd_context __user *ctx);
+int setup_sigcontext(struct sigcontext __user *uc_mcontext, struct pt_regs *regs);
+int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sf);
+void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+ void __user *frame, off_t sigframe_off, int usig);
+
+#endif /* __ASM_SIGNAL_COMMON_H */
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index f90cdf5..478d6c5 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -34,18 +34,26 @@
#include <asm/fpsimd.h>
#include <asm/signal32.h>
#include <asm/vdso.h>
+#include <asm/signal_common.h>
+
+#define RT_SIGFRAME_FP_POS (offsetof(struct rt_sigframe, sig) \
+ + offsetof(struct sigframe, fp))
+
+struct sigframe {
+ struct ucontext uc;
+ u64 fp;
+ u64 lr;
+};

/*
* Do a signal return; undo the signal stack. These are aligned to 128-bit.
*/
struct rt_sigframe {
struct siginfo info;
- struct ucontext uc;
- u64 fp;
- u64 lr;
+ struct sigframe sig;
};

-static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
+int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
{
struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
int err;
@@ -65,7 +73,7 @@ static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
return err ? -EFAULT : 0;
}

-static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
+int restore_fpsimd_context(struct fpsimd_context __user *ctx)
{
struct fpsimd_state fpsimd;
__u32 magic, size;
@@ -93,22 +101,30 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
}

static int restore_sigframe(struct pt_regs *regs,
- struct rt_sigframe __user *sf)
+ struct sigframe __user *sf)
{
sigset_t set;
- int i, err;
- void *aux = sf->uc.uc_mcontext.__reserved;
-
+ int err;
err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
if (err == 0)
set_current_blocked(&set);

+ err |= restore_sigcontext(regs, &sf->uc.uc_mcontext);
+ return err;
+}
+
+
+int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *uc_mcontext)
+{
+ int i, err = 0;
+ void *aux = uc_mcontext->__reserved;
+
for (i = 0; i < 31; i++)
- __get_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
+ __get_user_error(regs->regs[i], &uc_mcontext->regs[i],
err);
- __get_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
- __get_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
- __get_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
+ __get_user_error(regs->sp, &uc_mcontext->sp, err);
+ __get_user_error(regs->pc, &uc_mcontext->pc, err);
+ __get_user_error(regs->pstate, &uc_mcontext->pstate, err);

/*
* Avoid sys_rt_sigreturn() restarting.
@@ -145,10 +161,10 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
if (!access_ok(VERIFY_READ, frame, sizeof (*frame)))
goto badframe;

- if (restore_sigframe(regs, frame))
+ if (restore_sigframe(regs, &frame->sig))
goto badframe;

- if (restore_altstack(&frame->uc.uc_stack))
+ if (restore_altstack(&frame->sig.uc.uc_stack))
goto badframe;

return regs->regs[0];
@@ -162,27 +178,36 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
return 0;
}

-static int setup_sigframe(struct rt_sigframe __user *sf,
+static int setup_sigframe(struct sigframe __user *sf,
struct pt_regs *regs, sigset_t *set)
{
- int i, err = 0;
- void *aux = sf->uc.uc_mcontext.__reserved;
- struct _aarch64_ctx *end;
+ int err = 0;

/* set up the stack frame for unwinding */
__put_user_error(regs->regs[29], &sf->fp, err);
__put_user_error(regs->regs[30], &sf->lr, err);
+ err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
+ err |= setup_sigcontext(&sf->uc.uc_mcontext, regs);
+
+ return err;
+}
+
+int setup_sigcontext(struct sigcontext __user *uc_mcontext,
+ struct pt_regs *regs)
+{
+ void *aux = uc_mcontext->__reserved;
+ struct _aarch64_ctx *end;
+ int i, err = 0;

for (i = 0; i < 31; i++)
- __put_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
+ __put_user_error(regs->regs[i], &uc_mcontext->regs[i],
err);
- __put_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
- __put_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
- __put_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);

- __put_user_error(current->thread.fault_address, &sf->uc.uc_mcontext.fault_address, err);
+ __put_user_error(regs->sp, &uc_mcontext->sp, err);
+ __put_user_error(regs->pc, &uc_mcontext->pc, err);
+ __put_user_error(regs->pstate, &uc_mcontext->pstate, err);

- err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
+ __put_user_error(current->thread.fault_address, &uc_mcontext->fault_address, err);

if (err == 0) {
struct fpsimd_context *fpsimd_ctx =
@@ -229,14 +254,14 @@ static struct rt_sigframe __user *get_sigframe(struct ksignal *ksig,
return frame;
}

-static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
- void __user *frame, int usig)
+void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+ void __user *frame, off_t fp_pos, int usig)
{
__sigrestore_t sigtramp;

regs->regs[0] = usig;
regs->sp = (unsigned long)frame;
- regs->regs[29] = regs->sp + offsetof(struct rt_sigframe, fp);
+ regs->regs[29] = regs->sp + fp_pos;
regs->pc = (unsigned long)ka->sa.sa_handler;

if (ka->sa.sa_flags & SA_RESTORER)
@@ -257,17 +282,17 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
if (!frame)
return 1;

- __put_user_error(0, &frame->uc.uc_flags, err);
- __put_user_error(NULL, &frame->uc.uc_link, err);
+ __put_user_error(0, &frame->sig.uc.uc_flags, err);
+ __put_user_error(NULL, &frame->sig.uc.uc_link, err);

- err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
- err |= setup_sigframe(frame, regs, set);
+ err |= __save_altstack(&frame->sig.uc.uc_stack, regs->sp);
+ err |= setup_sigframe(&frame->sig, regs, set);
if (err == 0) {
- setup_return(regs, &ksig->ka, frame, usig);
+ setup_return(regs, &ksig->ka, frame, RT_SIGFRAME_FP_POS, usig);
if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
err |= copy_siginfo_to_user(&frame->info, &ksig->info);
regs->regs[1] = (unsigned long)&frame->info;
- regs->regs[2] = (unsigned long)&frame->uc;
+ regs->regs[2] = (unsigned long)&frame->sig.uc;
}
}

--
2.7.4

2016-10-21 21:09:39

by Yury Norov

Subject: [PATCH 09/18] arm64: introduce binfmt_elf32.c

As we support more than one compat format, it is more reasonable not to
use fs/compat_binfmt_elf.c directly. A custom binfmt_elf32.c allows moving
aarch32-specific definitions there and makes the code more maintainable
and readable.

Signed-off-by: Yury Norov <[email protected]>
---
arch/arm64/Kconfig | 1 -
arch/arm64/include/asm/hwcap.h | 2 --
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/binfmt_elf32.c | 31 +++++++++++++++++++++++++++++++
4 files changed, 32 insertions(+), 4 deletions(-)
create mode 100644 arch/arm64/kernel/binfmt_elf32.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0cd786e..9efa86a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1003,7 +1003,6 @@ config AARCH32_EL0
def_bool y
depends on ARM64_4K_PAGES || EXPERT
select COMPAT
- select COMPAT_BINFMT_ELF
select HAVE_UID16
select OLD_SIGSUSPEND3
select COMPAT_OLD_SIGACTION
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 2c7fc5d..99dfd92 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -47,8 +47,6 @@
#define ELF_HWCAP (elf_hwcap)

#ifdef CONFIG_AARCH32_EL0
-#define COMPAT_ELF_HWCAP (compat_elf_hwcap)
-#define COMPAT_ELF_HWCAP2 (compat_elf_hwcap2)
extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;
#endif

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 8a19fda..abe5040 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -28,7 +28,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
$(call if_changed,objcopy)

arm64-obj-$(CONFIG_AARCH32_EL0) += sys32.o kuser32.o signal32.o \
- sys_compat.o entry32.o
+ sys_compat.o entry32.o binfmt_elf32.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
diff --git a/arch/arm64/kernel/binfmt_elf32.c b/arch/arm64/kernel/binfmt_elf32.c
new file mode 100644
index 0000000..aec1c8a
--- /dev/null
+++ b/arch/arm64/kernel/binfmt_elf32.c
@@ -0,0 +1,31 @@
+/*
+ * Support for AArch32 Linux ELF binaries.
+ */
+
+/* AArch32 EABI. */
+#define EF_ARM_EABI_MASK 0xff000000
+
+#define compat_start_thread compat_start_thread
+#define COMPAT_SET_PERSONALITY(ex) \
+do { \
+ clear_thread_flag(TIF_32BIT_AARCH64); \
+ set_thread_flag(TIF_32BIT); \
+} while (0)
+
+#define COMPAT_ARCH_DLINFO
+#define COMPAT_ELF_HWCAP (compat_elf_hwcap)
+#define COMPAT_ELF_HWCAP2 (compat_elf_hwcap2)
+
+#ifdef __AARCH64EB__
+#define COMPAT_ELF_PLATFORM ("v8b")
+#else
+#define COMPAT_ELF_PLATFORM ("v8l")
+#endif
+
+#define compat_arch_setup_additional_pages \
+ aarch32_setup_vectors_page
+struct linux_binprm;
+extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
+ int uses_interp);
+
+#include "../../../fs/compat_binfmt_elf.c"
--
2.7.4

2016-10-21 21:09:49

by Yury Norov

Subject: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

Introduce a new aarch32 ptrace syscall handler to avoid run-time
detection of the task type.
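
The idea can be sketched outside the kernel as follows: each ABI's syscall table binds the ptrace slot to its own handler, so the handler never has to ask which kind of task it is serving. All mock_* names and the return values below are hypothetical and exist only for this illustration; they are not the kernel's syscall tables.

```c
#include <stddef.h>

/* Hypothetical ABI identifiers, one per syscall table. */
enum mock_abi { MOCK_ABI_AARCH32, MOCK_ABI_ILP32, MOCK_ABI_COUNT };

typedef long (*mock_ptrace_fn)(long request);

/* Each handler knows its ABI statically; no task-type check inside. */
static long mock_aarch32_ptrace(long request)
{
	return 32000 + request;
}

static long mock_ilp32_ptrace(long request)
{
	return 64000 + request;
}

/* The table entry chosen at syscall entry fixes the handler. */
static const mock_ptrace_fn mock_ptrace_table[MOCK_ABI_COUNT] = {
	[MOCK_ABI_AARCH32] = mock_aarch32_ptrace,
	[MOCK_ABI_ILP32]   = mock_ilp32_ptrace,
};

static long mock_do_ptrace(enum mock_abi abi, long request)
{
	return mock_ptrace_table[abi](request);
}
```

In the patch the same effect comes from pointing __NR_ptrace in unistd32.h at compat_sys_aarch32_ptrace, while compat_arch_ptrace() is left to the ilp32 path.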

Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Bamvor Zhang Jian <[email protected]>
Signed-off-by: Chengming Zhou <[email protected]>
---
arch/arm64/include/asm/unistd32.h | 2 +-
arch/arm64/kernel/ptrace.c | 91 ++++++++++++++++++++++++++++++++++++++-
arch/arm64/kernel/sys32.c | 1 +
include/linux/ptrace.h | 6 +++
kernel/ptrace.c | 10 ++---
5 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index b7e8ef1..6da7cbd 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -74,7 +74,7 @@ __SYSCALL(__NR_getuid, sys_getuid16)
/* 25 was sys_stime */
__SYSCALL(25, sys_ni_syscall)
#define __NR_ptrace 26
-__SYSCALL(__NR_ptrace, compat_sys_ptrace)
+__SYSCALL(__NR_ptrace, compat_sys_aarch32_ptrace)
/* 27 was sys_alarm */
__SYSCALL(27, sys_ni_syscall)
/* 28 was sys_fstat */
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 1d075ed..ac542c9 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -29,6 +29,7 @@
#include <linux/user.h>
#include <linux/seccomp.h>
#include <linux/security.h>
+#include <linux/syscalls.h>
#include <linux/init.h>
#include <linux/signal.h>
#include <linux/uaccess.h>
@@ -40,6 +41,7 @@

#include <asm/debug-monitors.h>
#include <asm/pgtable.h>
+#include <asm/signal32_common.h>
#include <asm/syscall.h>
#include <asm/traps.h>
#include <asm/system_misc.h>
@@ -1215,7 +1217,7 @@ static int compat_ptrace_sethbpregs(struct task_struct *tsk, compat_long_t num,
}
#endif /* CONFIG_HAVE_HW_BREAKPOINT */

-long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+static long compat_a32_ptrace(struct task_struct *child, compat_long_t request,
compat_ulong_t caddr, compat_ulong_t cdata)
{
unsigned long addr = caddr;
@@ -1292,8 +1294,95 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,

return ret;
}
+
+COMPAT_SYSCALL_DEFINE4(aarch32_ptrace, compat_long_t, request, compat_long_t, pid,
+ compat_long_t, addr, compat_long_t, data)
+{
+ struct task_struct *child;
+ long ret;
+
+ if (request == PTRACE_TRACEME) {
+ ret = ptrace_traceme();
+ goto out;
+ }
+
+ child = ptrace_get_task_struct(pid);
+ if (IS_ERR(child)) {
+ ret = PTR_ERR(child);
+ goto out;
+ }
+
+ if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+ ret = ptrace_attach(child, request, addr, data);
+ goto out_put_task_struct;
+ }
+
+ ret = ptrace_check_attach(child, request == PTRACE_KILL ||
+ request == PTRACE_INTERRUPT);
+ if (!ret) {
+ ret = compat_a32_ptrace(child, request, addr, data);
+ if (ret || request != PTRACE_DETACH)
+ ptrace_unfreeze_traced(child);
+ }
+
+ out_put_task_struct:
+ put_task_struct(child);
+ out:
+ return ret;
+}
+
#endif /* CONFIG_AARCH32_EL0 */

+#ifdef CONFIG_ARM64_ILP32
+
+long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+ compat_ulong_t caddr, compat_ulong_t cdata)
+{
+ sigset_t new_set;
+
+ switch (request) {
+ case PTRACE_GETSIGMASK:
+ if (caddr != sizeof(compat_sigset_t))
+ return -EINVAL;
+
+ return put_sigset_t((compat_sigset_t __user *) (u64) cdata,
+ &child->blocked);
+
+ case PTRACE_SETSIGMASK:
+ if (caddr != sizeof(compat_sigset_t))
+ return -EINVAL;
+
+ if (get_sigset_t(&new_set, (compat_sigset_t __user *) (u64) cdata))
+ return -EFAULT;
+
+ sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+
+ /*
+ * Every thread does recalc_sigpending() after resume, so
+ * retarget_shared_pending() and recalc_sigpending() are not
+ * called here.
+ */
+ spin_lock_irq(&child->sighand->siglock);
+ child->blocked = new_set;
+ spin_unlock_irq(&child->sighand->siglock);
+
+ return 0;
+
+ default:
+ return compat_ptrace_request(child, request, caddr, cdata);
+ }
+}
+
+#elif defined(CONFIG_COMPAT)
+
+long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+ compat_ulong_t caddr, compat_ulong_t cdata)
+{
+ return 0;
+}
+
+#endif
+
const struct user_regset_view *task_user_regset_view(struct task_struct *task)
{
#ifdef CONFIG_AARCH32_EL0
diff --git a/arch/arm64/kernel/sys32.c b/arch/arm64/kernel/sys32.c
index a40b134..3752443 100644
--- a/arch/arm64/kernel/sys32.c
+++ b/arch/arm64/kernel/sys32.c
@@ -38,6 +38,7 @@ asmlinkage long compat_sys_fadvise64_64_wrapper(void);
asmlinkage long compat_sys_sync_file_range2_wrapper(void);
asmlinkage long compat_sys_fallocate_wrapper(void);
asmlinkage long compat_sys_mmap2_wrapper(void);
+asmlinkage long compat_sys_aarch32_ptrace(void);

#undef __SYSCALL
#define __SYSCALL(nr, sym) [nr] = sym,
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 504c98a..75887a0 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -97,6 +97,12 @@ int generic_ptrace_peekdata(struct task_struct *tsk, unsigned long addr,
unsigned long data);
int generic_ptrace_pokedata(struct task_struct *tsk, unsigned long addr,
unsigned long data);
+int ptrace_traceme(void);
+struct task_struct *ptrace_get_task_struct(pid_t pid);
+int ptrace_attach(struct task_struct *task, long request,
+ unsigned long addr, unsigned long flags);
+int ptrace_check_attach(struct task_struct *child, bool ignore_state);
+void ptrace_unfreeze_traced(struct task_struct *task);

/**
* ptrace_parent - return the task that is tracing the given task
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2a99027..5638880 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -138,7 +138,7 @@ static bool ptrace_freeze_traced(struct task_struct *task)
return ret;
}

-static void ptrace_unfreeze_traced(struct task_struct *task)
+void ptrace_unfreeze_traced(struct task_struct *task)
{
if (task->state != __TASK_TRACED)
return;
@@ -170,7 +170,7 @@ static void ptrace_unfreeze_traced(struct task_struct *task)
* RETURNS:
* 0 on success, -ESRCH if %child is not ready.
*/
-static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
+int ptrace_check_attach(struct task_struct *child, bool ignore_state)
{
int ret = -ESRCH;

@@ -294,7 +294,7 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
return !err;
}

-static int ptrace_attach(struct task_struct *task, long request,
+int ptrace_attach(struct task_struct *task, long request,
unsigned long addr,
unsigned long flags)
{
@@ -408,7 +408,7 @@ static int ptrace_attach(struct task_struct *task, long request,
* Performs checks and sets PT_PTRACED.
* Should be used by all ptrace implementations for PTRACE_TRACEME.
*/
-static int ptrace_traceme(void)
+int ptrace_traceme(void)
{
int ret = -EPERM;

@@ -1057,7 +1057,7 @@ int ptrace_request(struct task_struct *child, long request,
return ret;
}

-static struct task_struct *ptrace_get_task_struct(pid_t pid)
+struct task_struct *ptrace_get_task_struct(pid_t pid)
{
struct task_struct *child;

--
2.7.4

2016-10-21 21:09:25

by Yury Norov

Subject: [PATCH 17/18] arm64:ilp32: add vdso-ilp32 and use for signal return

From: Philipp Tomsich <[email protected]>

The ILP32 vDSO exports the following symbols:
__kernel_rt_sigreturn;
__kernel_gettimeofday;
__kernel_clock_gettime;
__kernel_clock_getres.

The kernel selects which shared object to use based on the result of
is_ilp32_compat_task() in arch/arm64/kernel/vdso.c, and substitutes the
correct pages and spec accordingly.
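
A rough userspace sketch of that selection is shown below. The mock_* names, image names, and offsets are assumptions made up for the example; the real code works on the vdso pages and the generated vdso_offset_* constants.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one vDSO image. */
struct mock_vdso_image {
	const char *name;
	size_t sigtramp_offset;	/* made-up offset of the sigreturn trampoline */
};

static const struct mock_vdso_image mock_vdso_lp64  = { "vdso.so", 0x7b0 };
static const struct mock_vdso_image mock_vdso_ilp32 = { "vdso-ilp32.so", 0x6a0 };

/* Pick the image by task type, standing in for is_ilp32_compat_task(). */
static const struct mock_vdso_image *mock_select_vdso(bool is_ilp32)
{
	return is_ilp32 ? &mock_vdso_ilp32 : &mock_vdso_lp64;
}

/* Trampoline address = mapping base + per-image symbol offset. */
static size_t mock_sigtramp_addr(size_t base, bool is_ilp32)
{
	return base + mock_select_vdso(is_ilp32)->sigtramp_offset;
}
```

The second helper corresponds loosely to the setup_return() change in this patch, where an ilp32 task gets sigtramp_ilp32 instead of sigtramp.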

Adjusted to move the data page before the code pages, in sync with
commit 601255ae3c98 ("arm64: vdso: move data page before code pages").

Signed-off-by: Philipp Tomsich <[email protected]>
Signed-off-by: Christoph Muellner <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Bamvor Zhang Jian <[email protected]>
---
arch/arm64/include/asm/vdso.h | 6 ++
arch/arm64/kernel/Makefile | 11 ++++
arch/arm64/kernel/asm-offsets.c | 7 ++
arch/arm64/kernel/signal.c | 2 +
arch/arm64/kernel/vdso-ilp32/.gitignore | 2 +
arch/arm64/kernel/vdso-ilp32/Makefile | 74 +++++++++++++++++++++
arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S | 33 ++++++++++
arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S | 95 +++++++++++++++++++++++++++
arch/arm64/kernel/vdso.c | 66 ++++++++++++++++---
arch/arm64/kernel/vdso/gettimeofday.S | 20 +++++-
arch/arm64/kernel/vdso/vdso.S | 6 +-
11 files changed, 306 insertions(+), 16 deletions(-)
create mode 100644 arch/arm64/kernel/vdso-ilp32/.gitignore
create mode 100644 arch/arm64/kernel/vdso-ilp32/Makefile
create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S

diff --git a/arch/arm64/include/asm/vdso.h b/arch/arm64/include/asm/vdso.h
index 839ce00..649a9a4 100644
--- a/arch/arm64/include/asm/vdso.h
+++ b/arch/arm64/include/asm/vdso.h
@@ -29,6 +29,12 @@

#include <generated/vdso-offsets.h>

+#ifdef CONFIG_ARM64_ILP32
+#include <generated/vdso-ilp32-offsets.h>
+#else
+#define vdso_offset_sigtramp_ilp32
+#endif
+
#define VDSO_SYMBOL(base, name) \
({ \
(void *)(vdso_offset_##name - VDSO_LBASE + (unsigned long)(base)); \
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index af400fb..43e680a 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -55,6 +55,17 @@ arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \
cpu-reset.o

obj-y += $(arm64-obj-y) vdso/ probes/
+obj-$(CONFIG_ARM64_ILP32) += vdso-ilp32/
obj-m += $(arm64-obj-m)
head-y := head.o
extra-y += $(head-y) vmlinux.lds
+
+# vDSO - this must be built first to generate the symbol offsets
+$(call objectify,$(arm64-obj-y)): $(obj)/vdso/vdso-offsets.h
+$(obj)/vdso/vdso-offsets.h: $(obj)/vdso
+
+ifeq ($(CONFIG_ARM64_ILP32),y)
+# vDSO - this must be built first to generate the symbol offsets
+$(call objectify,$(arm64-obj-y)): $(obj)/vdso-ilp32/vdso-ilp32-offsets.h
+$(obj)/vdso-ilp32/vdso-ilp32-offsets.h: $(obj)/vdso-ilp32
+endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index d8d7086..8f844b9 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -119,6 +119,13 @@ int main(void)
DEFINE(TSPEC_TV_SEC, offsetof(struct timespec, tv_sec));
DEFINE(TSPEC_TV_NSEC, offsetof(struct timespec, tv_nsec));
BLANK();
+#ifdef CONFIG_COMPAT
+ DEFINE(COMPAT_TVAL_TV_SEC, offsetof(struct compat_timeval, tv_sec));
+ DEFINE(COMPAT_TVAL_TV_USEC, offsetof(struct compat_timeval, tv_usec));
+ DEFINE(COMPAT_TSPEC_TV_SEC, offsetof(struct compat_timespec, tv_sec));
+ DEFINE(COMPAT_TSPEC_TV_NSEC, offsetof(struct compat_timespec, tv_nsec));
+ BLANK();
+#endif
DEFINE(TZ_MINWEST, offsetof(struct timezone, tz_minuteswest));
DEFINE(TZ_DSTTIME, offsetof(struct timezone, tz_dsttime));
BLANK();
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 1b130f4..72f68f0 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -267,6 +267,8 @@ void setup_return(struct pt_regs *regs, struct k_sigaction *ka,

if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
+ else if (is_ilp32_compat_task())
+ sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp_ilp32);
else
sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);

diff --git a/arch/arm64/kernel/vdso-ilp32/.gitignore b/arch/arm64/kernel/vdso-ilp32/.gitignore
new file mode 100644
index 0000000..61806c3
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/.gitignore
@@ -0,0 +1,2 @@
+vdso-ilp32.lds
+vdso-ilp32-offsets.h
diff --git a/arch/arm64/kernel/vdso-ilp32/Makefile b/arch/arm64/kernel/vdso-ilp32/Makefile
new file mode 100644
index 0000000..0671e88
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/Makefile
@@ -0,0 +1,74 @@
+#
+# Building a vDSO image for AArch64.
+#
+# Author: Will Deacon <[email protected]>
+# Heavily based on the vDSO Makefiles for other archs.
+#
+
+obj-ilp32-vdso := gettimeofday-ilp32.o note-ilp32.o sigreturn-ilp32.o
+
+# Build rules
+targets := $(obj-ilp32-vdso) vdso-ilp32.so vdso-ilp32.so.dbg
+obj-ilp32-vdso := $(addprefix $(obj)/, $(obj-ilp32-vdso))
+
+ccflags-y := -shared -fno-common -fno-builtin
+ccflags-y += -nostdlib -Wl,-soname=linux-ilp32-vdso.so.1 \
+ $(call cc-ldoption, -Wl$(comma)--hash-style=sysv)
+
+obj-y += vdso-ilp32.o
+extra-y += vdso-ilp32.lds vdso-ilp32-offsets.h
+CPPFLAGS_vdso-ilp32.lds += -P -C -U$(ARCH) -mabi=ilp32
+
+# Force dependency (incbin is bad)
+$(obj)/vdso-ilp32.o : $(obj)/vdso-ilp32.so
+
+# Link rule for the .so file, .lds has to be first
+$(obj)/vdso-ilp32.so.dbg: $(src)/vdso-ilp32.lds $(obj-ilp32-vdso)
+ $(call if_changed,vdso-ilp32ld)
+
+# Strip rule for the .so file
+$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: $(obj)/%.so.dbg FORCE
+ $(call if_changed,objcopy)
+
+# Generate VDSO offsets using helper script
+gen-vdsosym := $(srctree)/$(src)/../vdso/gen_vdso_offsets.sh
+quiet_cmd_vdsosym = VDSOSYM $@
+define cmd_vdsosym
+ $(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@ && \
+ cp $@ include/generated/
+endef
+
+$(obj)/vdso-ilp32-offsets.h: $(obj)/vdso-ilp32.so.dbg FORCE
+ $(call if_changed,vdsosym)
+
+# Assembly rules for the .S files
+#$(obj-ilp32-vdso): %.o: $(src)/../vdso/$(subst -ilp32,,%.S)
+# $(call if_changed_dep,vdso-ilp32as)
+
+$(obj)/gettimeofday-ilp32.o: $(src)/../vdso/gettimeofday.S
+ $(call if_changed_dep,vdso-ilp32as)
+
+$(obj)/note-ilp32.o: $(src)/../vdso/note.S
+ $(call if_changed_dep,vdso-ilp32as)
+
+# This one should be fine because ILP32 uses the same generic
+# __NR_rt_sigreturn syscall number.
+$(obj)/sigreturn-ilp32.o: $(src)/../vdso/sigreturn.S
+ $(call if_changed_dep,vdso-ilp32as)
+
+# Actual build commands
+quiet_cmd_vdso-ilp32ld = VDSOILP32L $@
+ cmd_vdso-ilp32ld = $(CC) $(c_flags) -mabi=ilp32 -Wl,-n -Wl,-T $^ -o $@
+quiet_cmd_vdso-ilp32as = VDSOILP32A $@
+ cmd_vdso-ilp32as = $(CC) $(a_flags) -mabi=ilp32 -c -o $@ $<
+
+# Install commands for the unstripped file
+quiet_cmd_vdso_install = INSTALL $@
+ cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@
+
+vdso-ilp32.so: $(obj)/vdso-ilp32.so.dbg
+ @mkdir -p $(MODLIB)/vdso
+ $(call cmd,vdso_install)
+
+vdso_install: vdso-ilp32.so
diff --git a/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
new file mode 100644
index 0000000..46ac072
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <[email protected]>
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/page.h>
+
+ __PAGE_ALIGNED_DATA
+
+ .globl vdso_ilp32_start, vdso_ilp32_end
+ .balign PAGE_SIZE
+vdso_ilp32_start:
+ .incbin "arch/arm64/kernel/vdso-ilp32/vdso-ilp32.so"
+ .balign PAGE_SIZE
+vdso_ilp32_end:
+
+ .previous
diff --git a/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
new file mode 100644
index 0000000..3b564ca
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
@@ -0,0 +1,95 @@
+/*
+ * GNU linker script for the VDSO library.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <[email protected]>
+ * Heavily based on the vDSO linker scripts for other archs.
+ */
+
+#include <linux/const.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+
+SECTIONS
+{
+ PROVIDE(_vdso_data = . - PAGE_SIZE);
+ . = VDSO_LBASE + SIZEOF_HEADERS;
+
+ .hash : { *(.hash) } :text
+ .gnu.hash : { *(.gnu.hash) }
+ .dynsym : { *(.dynsym) }
+ .dynstr : { *(.dynstr) }
+ .gnu.version : { *(.gnu.version) }
+ .gnu.version_d : { *(.gnu.version_d) }
+ .gnu.version_r : { *(.gnu.version_r) }
+
+ .note : { *(.note.*) } :text :note
+
+ . = ALIGN(16);
+
+ .text : { *(.text*) } :text =0xd503201f
+ PROVIDE (__etext = .);
+ PROVIDE (_etext = .);
+ PROVIDE (etext = .);
+
+ .eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr
+ .eh_frame : { KEEP (*(.eh_frame)) } :text
+
+ .dynamic : { *(.dynamic) } :text :dynamic
+
+ .rodata : { *(.rodata*) } :text
+
+ _end = .;
+ PROVIDE(end = .);
+
+ /DISCARD/ : {
+ *(.note.GNU-stack)
+ *(.data .data.* .gnu.linkonce.d.* .sdata*)
+ *(.bss .sbss .dynbss .dynsbss)
+ }
+}
+
+/*
+ * We must supply the ELF program headers explicitly to get just one
+ * PT_LOAD segment, and set the flags explicitly to make segments read-only.
+ */
+PHDRS
+{
+ text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+ dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
+ note PT_NOTE FLAGS(4); /* PF_R */
+ eh_frame_hdr PT_GNU_EH_FRAME;
+}
+
+/*
+ * This controls what symbols we export from the DSO.
+ */
+VERSION
+{
+ LINUX_4.9 {
+ global:
+ __kernel_rt_sigreturn;
+ __kernel_gettimeofday;
+ __kernel_clock_gettime;
+ __kernel_clock_getres;
+ local: *;
+ };
+}
+
+/*
+ * Make the sigreturn code visible to the kernel.
+ */
+VDSO_sigtramp_ilp32 = __kernel_rt_sigreturn;
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 7f822cd..3f884e1 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -37,8 +37,13 @@
#include <asm/vdso.h>
#include <asm/vdso_datapage.h>

-extern char vdso_start, vdso_end;
-static unsigned long vdso_pages __ro_after_init;
+extern char vdso_lp64_start, vdso_lp64_end;
+static unsigned long vdso_lp64_pages __ro_after_init;
+
+#ifdef CONFIG_ARM64_ILP32
+extern char vdso_ilp32_start, vdso_ilp32_end;
+static unsigned long vdso_ilp32_pages __ro_after_init;
+#endif

/*
* The vDSO data page.
@@ -110,7 +115,17 @@ int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)
}
#endif /* CONFIG_AARCH32_EL0 */

-static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
+static struct vm_special_mapping vdso_lp64_spec[2] __ro_after_init = {
+ {
+ .name = "[vvar]",
+ },
+ {
+ .name = "[vdso]",
+ },
+};
+
+#ifdef CONFIG_ARM64_ILP32
+static struct vm_special_mapping vdso_ilp32_spec[2] __ro_after_init = {
{
.name = "[vvar]",
},
@@ -118,20 +133,26 @@ static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
.name = "[vdso]",
},
};
+#endif

-static int __init vdso_init(void)
+static int __init vdso_init(char *vdso_start, char *vdso_end,
+ unsigned long *vdso_pagesp,
+ struct vm_special_mapping *vdso_spec)
{
int i;
+ unsigned long vdso_pages;
struct page **vdso_pagelist;

- if (memcmp(&vdso_start, "\177ELF", 4)) {
+ if (memcmp(vdso_start, "\177ELF", 4)) {
pr_err("vDSO is not a valid ELF object!\n");
return -EINVAL;
}

- vdso_pages = (&vdso_end - &vdso_start) >> PAGE_SHIFT;
+ vdso_pages = (vdso_end - vdso_start) >> PAGE_SHIFT;
+ *vdso_pagesp = vdso_pages;
pr_info("vdso: %ld pages (%ld code @ %p, %ld data @ %p)\n",
- vdso_pages + 1, vdso_pages, &vdso_start, 1L, vdso_data);
+ vdso_pages + 1, vdso_pages,
+ vdso_start, 1L, vdso_data);

/* Allocate the vDSO pagelist, plus a page for the data. */
vdso_pagelist = kcalloc(vdso_pages + 1, sizeof(struct page *),
@@ -144,14 +165,30 @@ static int __init vdso_init(void)

/* Grab the vDSO code pages. */
for (i = 0; i < vdso_pages; i++)
- vdso_pagelist[i + 1] = pfn_to_page(PHYS_PFN(__pa(&vdso_start)) + i);
+ vdso_pagelist[i + 1] =
+ pfn_to_page(PHYS_PFN(__pa(vdso_start)) + i);

vdso_spec[0].pages = &vdso_pagelist[0];
vdso_spec[1].pages = &vdso_pagelist[1];

return 0;
}
-arch_initcall(vdso_init);
+
+static int __init vdso_lp64_init(void)
+{
+ return vdso_init(&vdso_lp64_start, &vdso_lp64_end,
+ &vdso_lp64_pages, vdso_lp64_spec);
+}
+arch_initcall(vdso_lp64_init);
+
+#ifdef CONFIG_ARM64_ILP32
+static int __init vdso_ilp32_init(void)
+{
+ return vdso_init(&vdso_ilp32_start, &vdso_ilp32_end,
+ &vdso_ilp32_pages, vdso_ilp32_spec);
+}
+arch_initcall(vdso_ilp32_init);
+#endif

int arch_setup_additional_pages(struct linux_binprm *bprm,
int uses_interp)
@@ -159,8 +196,17 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
struct mm_struct *mm = current->mm;
unsigned long vdso_base, vdso_text_len, vdso_mapping_len;
void *ret;
+ unsigned long pages = vdso_lp64_pages;
+ struct vm_special_mapping *vdso_spec = vdso_lp64_spec;
+
+#ifdef CONFIG_ARM64_ILP32
+ if (is_ilp32_compat_task()) {
+ pages = vdso_ilp32_pages;
+ vdso_spec = vdso_ilp32_spec;
+ }
+#endif

- vdso_text_len = vdso_pages << PAGE_SHIFT;
+ vdso_text_len = pages << PAGE_SHIFT;
/* Be sure to map the data page */
vdso_mapping_len = vdso_text_len + PAGE_SIZE;

diff --git a/arch/arm64/kernel/vdso/gettimeofday.S b/arch/arm64/kernel/vdso/gettimeofday.S
index e00b467..062a33d 100644
--- a/arch/arm64/kernel/vdso/gettimeofday.S
+++ b/arch/arm64/kernel/vdso/gettimeofday.S
@@ -25,6 +25,16 @@
#define NSEC_PER_SEC_LO16 0xca00
#define NSEC_PER_SEC_HI16 0x3b9a

+#ifdef __LP64__
+#define PTR_REG(n) x##n
+#define OFFSET(n) n
+#define DELOUSE(n)
+#else
+#define PTR_REG(n) w##n
+#define OFFSET(n) COMPAT_##n
+#define DELOUSE(n) mov w##n, w##n
+#endif
+
vdso_data .req x6
seqcnt .req w7
w_tmp .req w8
@@ -119,7 +129,7 @@ x_tmp .req x8
.if \shift == 1
lsr x11, x11, x12
.endif
- stp x10, x11, [x1, #TSPEC_TV_SEC]
+ stp PTR_REG(10), PTR_REG(11), [x1, #OFFSET(TSPEC_TV_SEC)]
mov x0, xzr
ret
.endm
@@ -136,6 +146,8 @@ x_tmp .req x8
/* int __kernel_gettimeofday(struct timeval *tv, struct timezone *tz); */
ENTRY(__kernel_gettimeofday)
.cfi_startproc
+ DELOUSE(0)
+ DELOUSE(1)
adr vdso_data, _vdso_data
/* If tv is NULL, skip to the timezone code. */
cbz x0, 2f
@@ -160,7 +172,7 @@ ENTRY(__kernel_gettimeofday)
mov x13, #1000
lsl x13, x13, x12
udiv x11, x11, x13
- stp x10, x11, [x0, #TVAL_TV_SEC]
+ stp PTR_REG(10), PTR_REG(11), [x0, #OFFSET(TVAL_TV_SEC)]
2:
/* If tz is NULL, return 0. */
cbz x1, 3f
@@ -182,6 +194,7 @@ ENDPROC(__kernel_gettimeofday)
/* int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp); */
ENTRY(__kernel_clock_gettime)
.cfi_startproc
+ DELOUSE(1)
cmp w0, #JUMPSLOT_MAX
b.hi syscall
adr vdso_data, _vdso_data
@@ -297,6 +310,7 @@ ENDPROC(__kernel_clock_gettime)
/* int __kernel_clock_getres(clockid_t clock_id, struct timespec *res); */
ENTRY(__kernel_clock_getres)
.cfi_startproc
+ DELOUSE(1)
cmp w0, #CLOCK_REALTIME
ccmp w0, #CLOCK_MONOTONIC, #0x4, ne
ccmp w0, #CLOCK_MONOTONIC_RAW, #0x4, ne
@@ -311,7 +325,7 @@ ENTRY(__kernel_clock_getres)
ldr x2, 6f
2:
cbz w1, 3f
- stp xzr, x2, [x1]
+ stp PTR_REG(zr), PTR_REG(2), [x1]

3: /* res == NULL. */
mov w0, wzr
diff --git a/arch/arm64/kernel/vdso/vdso.S b/arch/arm64/kernel/vdso/vdso.S
index 82379a7..a40ae24 100644
--- a/arch/arm64/kernel/vdso/vdso.S
+++ b/arch/arm64/kernel/vdso/vdso.S
@@ -21,12 +21,12 @@
#include <linux/const.h>
#include <asm/page.h>

- .globl vdso_start, vdso_end
+ .globl vdso_lp64_start, vdso_lp64_end
.section .rodata
.balign PAGE_SIZE
-vdso_start:
+vdso_lp64_start:
.incbin "arch/arm64/kernel/vdso/vdso.so"
.balign PAGE_SIZE
-vdso_end:
+vdso_lp64_end:

.previous
--
2.7.4

2016-10-24 17:09:45

by Chris Metcalf

Subject: Re: [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64

On 10/21/2016 4:33 PM, Yury Norov wrote:
> Based on Andrew Pinski's patch-series.
>
> Signed-off-by: Yury Norov <[email protected]>
> ---
> Documentation/arm64/ilp32.txt | 46 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 46 insertions(+)
> create mode 100644 Documentation/arm64/ilp32.txt
>
> diff --git a/Documentation/arm64/ilp32.txt b/Documentation/arm64/ilp32.txt
> new file mode 100644
> index 0000000..b96c18f
> --- /dev/null
> +++ b/Documentation/arm64/ilp32.txt
> @@ -0,0 +1,46 @@
> +ILP32 AARCH64 SYSCALL ABI
> +=========================
> +
> +This document describes the ILP32 syscall ABI and where it differs
> +from the generic compat linux syscall interface.
> +
> +AARCH64/ILP32 userspace can potentially access top halves of registers that
> +are passed as syscall arguments, so such registers (w0-w7) are deloused.

I'm not sure what "potentially access" here means: I think what you want to say
is that userspace can pass garbage in the top half, but you should be clearer about
what you mean here. Also, you shouldn't use "deloused" here, since it's not a term
that's defined elsewhere in the kernel, even though it's been used colloquially on LKML.
Provide an actual implementation definition, like "have their top 32 bits zeroed".

> +AARCH64/ILP32 provides next types turned to 64-bit (comparing to AARCH32):

What does "turned" mean here? And "next types" isn't standard English; you want
to say something like "the following types". Likewise later with "next syscalls".

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

2016-10-24 18:04:33

by Chris Metcalf

Subject: Re: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option

On 10/21/2016 4:33 PM, Yury Norov wrote:
> All new 32-bit architectures should have 64-bit off_t type, but existing
> architectures has 32-bit ones.
>
> [...]
> For syscalls sys_openat() and sys_open_by_handle_at() force_o_largefile()
> is called, to set O_LARGEFILE flag, and this is the only difference
> comparing to compat versions. All compat ABIs are already turned to use
> 64-bit off_t, except tile. So, compat versions for this syscalls are not
> needed anymore. Tile is handled explicitly.
>
> [...]
> --- a/arch/tile/kernel/compat.c
> +++ b/arch/tile/kernel/compat.c
> @@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
> #define compat_sys_readahead sys32_readahead
> #define sys_llseek compat_sys_llseek
>
> +#define sys_openat compat_sys_openat
> +#define sys_open_by_handle_at compat_sys_open_by_handle_at
> +
> /* Call the assembly trampolines where necessary. */
> #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
> #define sys_clone _sys_clone

This patch accomplishes two goals that could be completely separated.
It's confusing to have them mixed in the same patch without any
discussion of why they are in the same patch.

First, you want to modify the default <asm/unistd.h> behavior for
compat syscalls so that the default is sys_openat (etc) rather than
the existing compat_sys_openat, and then use that new behavior for
arm64 ILP32. This lets you force O_LARGEFILE for arm64 ILP32 to
support having a 64-bit off_t at all times. To do that, you fix the
asm-generic header, and then make tile have a special override.
This seems reasonable enough.

Second, you introduce ARCH_32BIT_OFF_T basically as a synonym for
"BITS_PER_WORD == 32", so that new 32-bit architectures can choose not
to enable it. This is fine in the abstract, but I'm a bit troubled by
the fact that you are not actually introducing a new 32-bit
architecture here (just a new 32-bit mode for the arm 64-bit kernel).
Shouldn't this part of the change wait until someone actually has a
new 32-bit kernel to drive this forward?

If you want to push forward the ARCH_32BIT_OFF_T change in the absence
of an architecture that supports it, I would think it would be a lot
less confusing to have these two in separate patches, and make it
clear that the ARCH_32BIT_OFF_T change is just laying groundwork
for some hypothetical future architecture.

The existing commit language itself is also confusing. You write "All
compat ABIs are already turned to use 64-bit off_t, except tile."
First, I'm not sure what you mean by "turned" here. And, tile is just
one of many compat ABIs that allow O_LARGEFILE not to be part of the
open call: see arm64's AArch32 ABI, MIPS o32, s390 31-bit emulation,
sparc64's 32-bit mode, and of course x86's 32-bit compat mode.
Presumably your point here is that tile is the only pre-existing
architecture that #includes <asm/unistd.h> to create its compat
syscall table, and so I think "all except tile" here is particularly
confusing, since there are no architectures except tile that use the
__SYSCALL_COMPAT functionality in the current tree.

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

2016-10-24 22:24:08

by Arnd Bergmann

Subject: Re: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option

On Monday, October 24, 2016 12:30:47 PM CEST Chris Metcalf wrote:
> On 10/21/2016 4:33 PM, Yury Norov wrote:
> > All new 32-bit architectures should have 64-bit off_t type, but existing
> > architectures has 32-bit ones.
> >
> > [...]
> > For syscalls sys_openat() and sys_open_by_handle_at() force_o_largefile()
> > is called, to set O_LARGEFILE flag, and this is the only difference
> > comparing to compat versions. All compat ABIs are already turned to use
> > 64-bit off_t, except tile. So, compat versions for this syscalls are not
> > needed anymore. Tile is handled explicitly.
> >
> > [...]
> > --- a/arch/tile/kernel/compat.c
> > +++ b/arch/tile/kernel/compat.c
> > @@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
> > #define compat_sys_readahead sys32_readahead
> > #define sys_llseek compat_sys_llseek
> >
> > +#define sys_openat compat_sys_openat
> > +#define sys_open_by_handle_at compat_sys_open_by_handle_at
> > +
> > /* Call the assembly trampolines where necessary. */
> > #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
> > #define sys_clone _sys_clone
>
> This patch accomplishes two goals that could be completely separated.
> It's confusing to have them mixed in the same patch without any
> discussion of why they are in the same patch.
>
> First, you want to modify the default <asm/unistd.h> behavior for
> compat syscalls so that the default is sys_openat (etc) rather than
> the existing compat_sys_openat, and then use that new behavior for
> arm64 ILP32. This lets you force O_LARGEFILE for arm64 ILP32 to
> support having a 64-bit off_t at all times. To do that, you fix the
> asm-generic header, and then make tile have a special override.
> This seems reasonable enough.
>
> Second, you introduce ARCH_32BIT_OFF_T basically as a synonym for
> "BITS_PER_WORD == 32", so that new 32-bit architectures can choose not
> to enable it. This is fine in the abstract, but I'm a bit troubled by
> the fact that you are not actually introducing a new 32-bit
> architecture here (just a new 32-bit mode for the arm 64-bit kernel).
> Shouldn't this part of the change wait until someone actually has a
> new 32-bit kernel to drive this forward?

I asked for this specifically because we identified the problem
during the review of the aarch64 ilp32 code, and it might not
be noticed in the next architecture submission.

The most important aspect from my perspective is that the new
ilp32 ABI on aarch64 behaves the same way that any native 32-bit
architecture does, and when we change the default, it should
be done for both compat mode and native mode at the same time.

> If you want to push forward the ARCH_32BIT_OFF_T change in the absence
> of an architecture that supports it, I would think it would be a lot
> less confusing to have these two in separate patches, and make it
> clear that the ARCH_32BIT_OFF_T change is just laying groundwork
> for some hypothetical future architecture.
>
> The existing commit language itself is also confusing. You write "All
> compat ABIs are already turned to use 64-bit off_t, except tile."
> First, I'm not sure what you mean by "turned" here. And, tile is just
> one of many compat ABIs that allow O_LARGEFILE not to be part of the
> open call: see arm64's AArch32 ABI, MIPS o32, s390 31-bit emulation,
> sparc64's 32-bit mode, and of course x86's 32-bit compat mode.
> Presumably your point here is that tile is the only pre-existing
> architecture that #includes <asm/unistd.h> to create its compat
> syscall table, and so I think "all except tile" here is particularly
> confusing, since there are no architectures except tile that use the
> __SYSCALL_COMPAT functionality in the current tree.

Agreed, this could be made clearer, and splitting the patch up
in two also seems reasonable, though I didn't see it as important.

Arnd

2016-10-27 14:15:20

by Yury Norov

Subject: Re: [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64

Hi Chris,

Thank you for the comments.

On Mon, Oct 24, 2016 at 12:36:27PM -0400, Chris Metcalf wrote:
> On 10/21/2016 4:33 PM, Yury Norov wrote:
> >Based on Andrew Pinski's patch-series.
> >
> >Signed-off-by: Yury Norov <[email protected]>
> >---
> > Documentation/arm64/ilp32.txt | 46 +++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 46 insertions(+)
> > create mode 100644 Documentation/arm64/ilp32.txt
> >
> >diff --git a/Documentation/arm64/ilp32.txt b/Documentation/arm64/ilp32.txt
> >new file mode 100644
> >index 0000000..b96c18f
> >--- /dev/null
> >+++ b/Documentation/arm64/ilp32.txt
> >@@ -0,0 +1,46 @@
> >+ILP32 AARCH64 SYSCALL ABI
> >+=========================
> >+
> >+This document describes the ILP32 syscall ABI and where it differs
> >+from the generic compat linux syscall interface.
> >+
> >+AARCH64/ILP32 userspace can potentially access top halves of registers that
> >+are passed as syscall arguments, so such registers (w0-w7) are deloused.
>
> I'm not sure what "potentially access" here means: I think what you want to say
> is that userspace can pass garbage in the top half, but you should be clearer about
> what you mean here.

Yes. Will change.

> Also, you shouldn't use "deloused" here, since it's not a term
> that's defined elsewhere in the kernel, even though it's been used colloquially on LKML.
> Provide an actual implementation definition, like "have their top 32 bits zeroed".

Agreed.
In fact, "delouse" does appear in the name of the corresponding macro in
include/linux/compat.h:
#ifndef __SC_DELOUSE
#define __SC_DELOUSE(t,v) ((t)(unsigned long)(v))
#endif

But that does not make it suitable for documentation.

>
> >+AARCH64/ILP32 provides next types turned to 64-bit (comparing to AARCH32):
>
> What does "turned" mean here? And I "next types" isn't standard English; you want
> to say something like "the following types". Likewise later with "next syscalls".

Thanks, will change.

Yury

2016-10-27 16:05:52

by Yury Norov

Subject: Re: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option

On Tue, Oct 25, 2016 at 12:22:47AM +0200, Arnd Bergmann wrote:
> On Monday, October 24, 2016 12:30:47 PM CEST Chris Metcalf wrote:
> > On 10/21/2016 4:33 PM, Yury Norov wrote:
> > > All new 32-bit architectures should have 64-bit off_t type, but existing
> > > architectures has 32-bit ones.
> > >
> > > [...]
> > > For syscalls sys_openat() and sys_open_by_handle_at() force_o_largefile()
> > > is called, to set O_LARGEFILE flag, and this is the only difference
> > > comparing to compat versions. All compat ABIs are already turned to use
> > > 64-bit off_t, except tile. So, compat versions for this syscalls are not
> > > needed anymore. Tile is handled explicitly.
> > >
> > > [...]
> > > --- a/arch/tile/kernel/compat.c
> > > +++ b/arch/tile/kernel/compat.c
> > > @@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
> > > #define compat_sys_readahead sys32_readahead
> > > #define sys_llseek compat_sys_llseek
> > >
> > > +#define sys_openat compat_sys_openat
> > > +#define sys_open_by_handle_at compat_sys_open_by_handle_at
> > > +
> > > /* Call the assembly trampolines where necessary. */
> > > #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
> > > #define sys_clone _sys_clone
> >
> > This patch accomplishes two goals that could be completely separated.
> > It's confusing to have them mixed in the same patch without any
> > discussion of why they are in the same patch.
> >
> > First, you want to modify the default <asm/unistd.h> behavior for
> > compat syscalls so that the default is sys_openat (etc) rather than
> > the existing compat_sys_openat, and then use that new behavior for
> > arm64 ILP32. This lets you force O_LARGEFILE for arm64 ILP32 to
> > support having a 64-bit off_t at all times. To do that, you fix the
> > asm-generic header, and then make tile have a special override.
> > This seems reasonable enough.
> >
> > Second, you introduce ARCH_32BIT_OFF_T basically as a synonym for
> > "BITS_PER_WORD == 32", so that new 32-bit architectures can choose not
> > to enable it. This is fine in the abstract, but I'm a bit troubled by
> > the fact that you are not actually introducing a new 32-bit
> > architecture here (just a new 32-bit mode for the arm 64-bit kernel).
> > Shouldn't this part of the change wait until someone actually has a
> > new 32-bit kernel to drive this forward?
>
> I asked for this specifically because we identified the problem
> during the review of the aarch64 ilp32 code, and it might not
> be noticed in the next architecture submission.
>
> The most important aspect from my perspective is that the new
> ilp32 ABI on aarch64 behaves the same way that any native 32-bit
> architecture does, and when we change the default, it should
> be done for both compat mode and native mode at the same time.
>
> > If you want to push forward the ARCH_32BIT_OFF_T change in the absence
> > of an architecture that supports it, I would think it would be a lot
> > less confusing to have these two in separate patches, and make it
> > clear that the ARCH_32BIT_OFF_T change is just laying groundwork
> > for some hypothetical future architecture.
> >
> > The existing commit language itself is also confusing. You write "All
> > compat ABIs are already turned to use 64-bit off_t, except tile."
> > First, I'm not sure what you mean by "turned" here. And, tile is just
> > one of many compat ABIs that allow O_LARGEFILE not to be part of the
> > open call: see arm64's AArch32 ABI, MIPS o32, s390 31-bit emulation,
> > sparc64's 32-bit mode, and of course x86's 32-bit compat mode.
> > Presumably your point here is that tile is the only pre-existing
> > architecture that #includes <asm/unistd.h> to create its compat
> > syscall table, and so I think "all except tile" here is particularly
> > confusing, since there are no architectures except tile that use the
> > __SYSCALL_COMPAT functionality in the current tree.
>
> Agreed, this could be made clearer, and splitting the patch up
> in two also seems reasonable, though I didn't see it as important.
>
> Arnd

In the past this was a separate series of 2 patches, and it was even
acked by Arnd, but it was not submitted.
http://lists-archives.com/linux-kernel/28471253-32-bit-abi-introduce-arch_32bit_off_t-config-option.html

I can restore that small series within aarch64/ilp32 for the next iteration, or
resend it separately if you would rather submit it before aarch64/ilp32 (which
I would prefer).

Yury

2016-10-28 12:47:16

by Yury Norov

Subject: Re: ILP32 for ARM64 - testing with lmbench

[Add Steve Ellcey, thanks for testing on ThunderX]

Lmbench-3.0-a9 testing was performed on a ThunderX machine to check that
the ILP32 series does not add performance regressions for LP64. The test
summary is in the table below. Our measurements don't show a
significant LP64 performance regression if the ILP32 code is merged,
whether it is enabled or disabled.

                ILP32 enabled   ILP32 disabled   Standard Kernel
null syscall    0.1066          0.1121           0.1121
                95.09%          100.00%

stat            1.3947          1.3814           1.3864
                100.60%         99.64%

fstat           0.4459          0.4344           0.4524
                98.56%          96.02%

open/close      4.0606          4.0411           4.0453
                100.38%         99.90%

read            0.4819          0.5014           0.5014
                96.11%          100.00%

Tested with Linux 4.8 because 4.9-rc1 is not yet fixed for ThunderX.
Other system details are below.

Yury.

ubuntu@crb6:~$ uname -a
Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@crb6:~$ cat /proc/meminfo
MemTotal:       132011948 kB
MemFree:        131442672 kB
MemAvailable:   130695764 kB
Buffers:           15696 kB
Cached:            88088 kB
SwapCached:            0 kB
Active:            82760 kB
Inactive:          41336 kB
Active(anon):      20880 kB
Inactive(anon):     8576 kB
Active(file):      61880 kB
Inactive(file):    32760 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      128920572 kB
SwapFree:       128920572 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         20544 kB
Mapped:            19780 kB
Shmem:              9060 kB
Slab:              78804 kB
SReclaimable:      27372 kB
SUnreclaim:        51432 kB
KernelStack:        8336 kB
PageTables:          820 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    194926544 kB
Committed_AS:     256324 kB
VmallocTotal:   135290290112 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

ubuntu@crb6:~$ cat /proc/cpuinfo
processor : 0
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 1
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 2
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 3
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 4
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 5
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 6
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 7
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 8
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 9
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 10
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 11
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 12
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 13
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 14
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 15
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 16
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 17
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 18
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 19
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 20
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 21
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 22
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 23
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 24
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 25
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 26
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 27
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 28
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 29
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 30
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 31
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 32
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 33
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 34
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 35
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 36
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 37
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 38
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 39
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 40
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 41
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 42
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 43
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 44
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 45
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 46
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

processor : 47
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer : 0x43
CPU architecture: 8
CPU variant : 0x1
CPU part : 0x0a1
CPU revision : 0

2016-11-07 08:39:16

by Yury Norov

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

Hi all,

[add libc-alpha mail list]

For libc-alpha: this is part of the LKML submission with the latest
patches for aarch64/ilp32.
https://www.spinics.net/lists/arm-kernel/msg537846.html

The glibc that I use also includes consolidation patches from Adhemerval
Zanella and me that are not yet in glibc master. The full series is:
https://github.com/norov/glibc/tree/ilp32-2.24-dev2

Below are the results of a glibc testsuite run for aarch64/lp64
in different configurations. The column names mean:
kvgv: kernel is vanilla, glibc is vanilla;
kdgv: kernel has ilp32 patches applied, but ilp32 is disabled in config;
glibc is vanilla;
kegv: kernel has ilp32 patches applied and ilp32 is enabled, glibc is vanilla;
kege: kernel patches are applied and enabled, glibc patches are applied.

Only lines that differ between configurations are shown. Full results are in the attached archive.
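A table like the one below can be produced by merging one result file per configuration and keeping only the tests whose status is not the same everywhere. The following is a minimal sketch, not the script actually used: it assumes each result file holds lines of the form `STATUS: test/name`, that a test missing from a file is what the table calls NA, and the name `example/unchanged` is made up to show a row that gets filtered out.

```python
def parse_sum(text):
    """Return {test name: status} for lines shaped like 'PASS: io/tst-foo'."""
    results = {}
    for line in text.splitlines():
        status, sep, name = line.partition(": ")
        if sep and status.isupper():
            results[name.strip()] = status
    return results


def differing_rows(columns):
    """columns maps a configuration name to its {test: status} dict.

    Returns (test, [status per configuration]) for every test whose status
    is not identical in all configurations. A test missing from a
    configuration is reported as 'NA' (an assumption about what NA means).
    """
    tests = sorted(set().union(*columns.values()))
    rows = []
    for test in tests:
        statuses = [columns[name].get(test, "NA") for name in columns]
        if len(set(statuses)) > 1:
            rows.append((test, statuses))
    return rows


# Tiny inline samples standing in for two of the real result files.
kvgv = parse_sum(
    "PASS: csu/tst-atomic\n"
    "FAIL: malloc/tst-trim1\n"
    "PASS: example/unchanged\n"  # hypothetical test, identical in both columns
)
kege = parse_sum(
    "FAIL: csu/tst-atomic\n"
    "PASS: malloc/tst-trim1\n"
    "PASS: example/unchanged\n"
    "FAIL: sysvipc/test-sysvmsg\n"
)

for test, statuses in differing_rows({"kvgv": kvgv, "kege": kege}):
    print(test, *statuses)
```

Passing all four (or five) configurations in one dict gives one status column per configuration, in the order the dict was built.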

I haven't analyzed the regressions in depth yet, so any ideas/suggestions are appreciated.

Yury.

Test kvgv kdgv kegv kege
conform/ISO/stdio.h/linknamespace PASS PASS PASS FAIL
conform/ISO11/stdio.h/linknamespace PASS PASS PASS FAIL
conform/ISO99/stdio.h/linknamespace PASS PASS PASS FAIL
conform/POSIX/stdio.h/linknamespace PASS PASS PASS FAIL
conform/POSIX/sys/stat.h/linknamespace PASS PASS PASS FAIL
conform/UNIX98/stdio.h/linknamespace PASS PASS PASS FAIL
conform/XOPEN2K/stdio.h/linknamespace PASS PASS PASS FAIL
conform/XPG3/stdio.h/linknamespace PASS PASS PASS FAIL
conform/XPG4/stdio.h/linknamespace PASS PASS PASS FAIL
csu/tst-atomic PASS PASS PASS FAIL
elf/check-localplt PASS PASS PASS FAIL
iconvdata/mtrace-tst-loading PASS FAIL PASS PASS
iconvdata/tst-loading PASS FAIL PASS PASS
io/check-installed-headers-c PASS PASS PASS FAIL
io/check-installed-headers-cxx PASS PASS PASS FAIL
malloc/tst-malloc-backtrace FAIL PASS PASS PASS
malloc/tst-malloc-thread-exit FAIL PASS PASS PASS
malloc/tst-malloc-usable FAIL PASS PASS PASS
malloc/tst-mallocfork FAIL PASS PASS PASS
malloc/tst-mallocstate FAIL PASS PASS PASS
malloc/tst-mallopt FAIL PASS PASS PASS
malloc/tst-mcheck FAIL PASS PASS PASS
malloc/tst-memalign FAIL PASS PASS PASS
malloc/tst-obstack FAIL PASS PASS PASS
malloc/tst-posix_memalign FAIL PASS PASS PASS
malloc/tst-pvalloc FAIL PASS PASS PASS
malloc/tst-realloc FAIL PASS PASS PASS
malloc/tst-scratch_buffer FAIL PASS PASS PASS
malloc/tst-trim1 FAIL PASS PASS PASS
nptl/tst-eintr4 PASS PASS PASS NA
posix/tst-regex2 PASS FAIL FAIL FAIL
posix/tst-getaddrinfo4 PASS PASS FAIL FAIL
posix/tst-getaddrinfo5 PASS PASS FAIL FAIL
sysvipc/test-sysvmsg NA NA NA FAIL
sysvipc/test-sysvsem NA NA NA FAIL
sysvipc/test-sysvshm NA NA NA FAIL


Attachments:
(No filename) (2.62 kB)
lp64.sum.tar.gz (36.14 kB)

2016-11-09 09:58:45

by Yury Norov

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Mon, Nov 07, 2016 at 01:53:59PM +0530, Yury Norov wrote:
> Hi all,
>
> [add libc-alpha mail list]
>
> For libc-alpha: this is the part of LKML submission with latest
> patches for aarch64/ilp32.
> https://www.spinics.net/lists/arm-kernel/msg537846.html
>
> Glibc that I use has also included consolidation patches from Adhemerval
> Zanella and me that are still not in the glibc master. The full series is:
> https://github.com/norov/glibc/tree/ilp32-2.24-dev2
>
> Below is the results of glibc testsuite run for aarch64/lp64
> in different configurations. Column names meaning:
> kvgv: kernel is vanilla, glibc is vanilla;
> kdgv: kernel has ilp32 patches applied, but ilp32 is disabled in config;
> glibc is vanilla;
> kegv: kernel has ilp32 patches applied and ilp32 is enabled, glibc is vanilla;
> kege: kernel patches are applied and enabled, glibc patches are applied.
>
> Only different lines are shown. Full results are in attached archive.

The same, plus ILP32 regressions:

Test kvgv kdgv kegv kege ilp32
conform/ISO/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/ISO11/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/ISO99/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/POSIX/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/POSIX/sys/stat.h/linknamespace PASS PASS PASS FAIL FAIL
conform/UNIX98/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/XOPEN2K/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/XPG3/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
conform/XPG4/stdio.h/linknamespace PASS PASS PASS FAIL FAIL
csu/tst-atomic PASS PASS PASS FAIL PASS
elf/check-localplt PASS PASS PASS FAIL FAIL
iconvdata/mtrace-tst-loading PASS FAIL PASS PASS FAIL
iconvdata/tst-loading PASS FAIL PASS PASS PASS
io/check-installed-headers-c PASS PASS PASS FAIL FAIL
io/check-installed-headers-cxx PASS PASS PASS FAIL FAIL
malloc/tst-malloc-backtrace FAIL PASS PASS PASS PASS
malloc/tst-malloc-thread-exit FAIL PASS PASS PASS PASS
malloc/tst-malloc-usable FAIL PASS PASS PASS PASS
malloc/tst-mallocfork FAIL PASS PASS PASS PASS
malloc/tst-mallocstate FAIL PASS PASS PASS PASS
malloc/tst-mallopt FAIL PASS PASS PASS PASS
malloc/tst-mcheck FAIL PASS PASS PASS PASS
malloc/tst-memalign FAIL PASS PASS PASS PASS
malloc/tst-obstack FAIL PASS PASS PASS PASS
malloc/tst-posix_memalign FAIL PASS PASS PASS PASS
malloc/tst-pvalloc FAIL PASS PASS PASS PASS
malloc/tst-realloc FAIL PASS PASS PASS PASS
malloc/tst-scratch_buffer FAIL PASS PASS PASS PASS
malloc/tst-trim1 FAIL PASS PASS PASS PASS
nptl/tst-eintr4 PASS PASS PASS NA NA
posix/tst-regex2 PASS FAIL FAIL FAIL FAIL
posix/tst-getaddrinfo4 PASS PASS FAIL FAIL PASS
posix/tst-getaddrinfo5 PASS PASS FAIL FAIL PASS
sysvipc/test-sysvmsg NA NA NA FAIL PASS
sysvipc/test-sysvsem NA NA NA FAIL PASS
sysvipc/test-sysvshm NA NA NA FAIL PASS

c++-types-check PASS PASS PASS PASS FAIL
debug/tst-backtrace4 PASS PASS PASS PASS FAIL
elf/check-abi-libc PASS PASS PASS PASS FAIL
elf/tst-tls1 PASS PASS PASS PASS FAIL
elf/tst-tls1-static PASS PASS PASS PASS FAIL
elf/tst-tls2 PASS PASS PASS PASS FAIL
elf/tst-tls2-static PASS PASS PASS PASS FAIL
elf/tst-tls3 PASS PASS PASS PASS FAIL
math/check-abi-libm PASS PASS PASS PASS FAIL
misc/tst-writev PASS PASS PASS PASS NA
nptl/tst-cancel-self-canceltype PASS PASS PASS PASS FAIL
nptl/tst-cancel1 PASS PASS PASS PASS FAIL
nptl/tst-cancel10 PASS PASS PASS PASS FAIL
nptl/tst-cancel11 PASS PASS PASS PASS FAIL
nptl/tst-cancel13 PASS PASS PASS PASS FAIL
nptl/tst-cancel15 PASS PASS PASS PASS FAIL
nptl/tst-cancel16 PASS PASS PASS PASS FAIL
nptl/tst-cancel17 PASS PASS PASS PASS FAIL
nptl/tst-cancel18 PASS PASS PASS PASS FAIL
nptl/tst-cancel2 PASS PASS PASS PASS FAIL
nptl/tst-cancel20 PASS PASS PASS PASS FAIL
nptl/tst-cancel21 PASS PASS PASS PASS FAIL
nptl/tst-cancel24 PASS PASS PASS PASS FAIL
nptl/tst-cancel25 PASS PASS PASS PASS FAIL
nptl/tst-cancel26 PASS PASS PASS PASS FAIL
nptl/tst-cancel27 PASS PASS PASS PASS FAIL
nptl/tst-cancel3 PASS PASS PASS PASS FAIL
nptl/tst-cancel4 PASS PASS PASS PASS FAIL
nptl/tst-cancel5 PASS PASS PASS PASS FAIL
nptl/tst-cancel6 PASS PASS PASS PASS FAIL
nptl/tst-cancel7 PASS PASS PASS PASS FAIL
nptl/tst-cancelx10 PASS PASS PASS PASS FAIL
nptl/tst-cancelx11 PASS PASS PASS PASS FAIL
nptl/tst-cancelx13 PASS PASS PASS PASS FAIL
nptl/tst-cancelx15 PASS PASS PASS PASS FAIL
nptl/tst-cancelx16 PASS PASS PASS PASS FAIL
nptl/tst-cancelx17 PASS PASS PASS PASS FAIL
nptl/tst-cancelx18 PASS PASS PASS PASS FAIL
nptl/tst-cancelx2 PASS PASS PASS PASS FAIL
nptl/tst-cancelx20 PASS PASS PASS PASS FAIL
nptl/tst-cancelx21 PASS PASS PASS PASS FAIL
nptl/tst-cancelx3 PASS PASS PASS PASS FAIL
nptl/tst-cancelx4 PASS PASS PASS PASS FAIL
nptl/tst-cancelx5 PASS PASS PASS PASS FAIL
nptl/tst-cancelx6 PASS PASS PASS PASS FAIL
nptl/tst-cancelx7 PASS PASS PASS PASS FAIL
nptl/tst-cleanup4 PASS PASS PASS PASS FAIL
nptl/tst-cleanupx4 PASS PASS PASS PASS FAIL
nptl/tst-cond-except PASS PASS PASS PASS FAIL
nptl/tst-cond7 PASS PASS PASS PASS FAIL
nptl/tst-cond8 PASS PASS PASS PASS FAIL
nptl/tst-fini1 PASS PASS PASS PASS FAIL
nptl/tst-initializers1 PASS PASS PASS PASS FAIL
nptl/tst-initializers1-c11 PASS PASS PASS PASS FAIL
nptl/tst-initializers1-c89 PASS PASS PASS PASS FAIL
nptl/tst-initializers1-c99 PASS PASS PASS PASS FAIL
nptl/tst-initializers1-gnu11 PASS PASS PASS PASS FAIL
nptl/tst-initializers1-gnu89 PASS PASS PASS PASS FAIL
nptl/tst-initializers1-gnu99 PASS PASS PASS PASS FAIL
nptl/tst-join5 PASS PASS PASS PASS FAIL
nptl/tst-key3 PASS PASS PASS PASS FAIL
nptl/tst-mutex8 PASS PASS PASS PASS FAIL
nptl/tst-mutexpi8 PASS PASS PASS PASS FAIL
nptl/tst-once3 PASS PASS PASS PASS FAIL
nptl/tst-once4 PASS PASS PASS PASS FAIL
nptl/tst-oncex3 PASS PASS PASS PASS FAIL
nptl/tst-oncex4 PASS PASS PASS PASS FAIL
nptl/tst-rwlock15 PASS PASS PASS PASS FAIL
nptl/tst-rwlock8 PASS PASS PASS PASS FAIL
nptl/tst-rwlock9 PASS PASS PASS PASS FAIL
nptl/tst-sem11 PASS PASS PASS PASS FAIL
nptl/tst-sem12 PASS PASS PASS PASS FAIL
posix/bug-regex24 PASS PASS PASS PASS FAIL
rt/tst-mqueue1 PASS PASS PASS PASS FAIL
rt/tst-mqueue2 PASS PASS PASS PASS FAIL
rt/tst-mqueue4 PASS PASS PASS PASS FAIL
rt/tst-mqueue7 PASS PASS PASS PASS FAIL
rt/tst-mqueue8 PASS PASS PASS PASS FAIL
rt/tst-mqueue8x PASS PASS PASS PASS FAIL
stdlib/tst-makecontext3 PASS PASS PASS PASS FAIL

2016-11-16 11:23:03

by Maxim Kuvyrkov

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

> On Nov 9, 2016, at 1:56 PM, Yury Norov wrote:
>
> On Mon, Nov 07, 2016 at 01:53:59PM +0530, Yury Norov wrote:
>> Hi all,
>>
>> [add libc-alpha mail list]
>>
>> For libc-alpha: this is the part of LKML submission with latest
>> patches for aarch64/ilp32.
>> https://www.spinics.net/lists/arm-kernel/msg537846.html
>>
>> Glibc that I use has also included consolidation patches from Adhemerval
>> Zanella and me that are still not in the glibc master. The full series is:
>> https://github.com/norov/glibc/tree/ilp32-2.24-dev2
>>
>> Below is the results of glibc testsuite run for aarch64/lp64
>> in different configurations. Column names meaning:
>> kvgv: kernel is vanilla, glibc is vanilla;
>> kdgv: kernel has ilp32 patches applied, but ilp32 is disabled in config;
>> glibc is vanilla;
>> kegv: kernel has ilp32 patches applied and ilp32 is enabled, glibc is vanilla;
>> kege: kernel patches are applied and enabled, glibc patches are applied.
>>
>> Only different lines are shown. Full results are in attached archive.

Hi Yury,

The general requirement for merging ILP32 glibc patches is that LP64 does not regress in any reasonable configuration. This means that there should be 0 regressions between kvgv and kvge, i.e., glibc in LP64 mode with and without ILP32 patches does not regress on the vanilla kernel. The kvge configuration is not in your testing matrix, and I suggest you make sure it has no regressions before fixing the more "advanced" configuration of kege.

Ideally, there should be no regressions between kvgv and kege configurations, but I don't consider this to be a requirement for glibc acceptance of ILP32 patches, since any regressions between kvge and kege configurations are likely to be on the kernel side.

Speculating on the kernel requirements for the ILP32 kernel patchset, I think there should be 0 regressions between kvgv and kdgv configurations, where you have only 3 tests to investigate and fix.

[I do appreciate that there are progressions in your results as well, but the glibc policy is that they do not offset regressions.]
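The acceptance rule described above can be written down as a small check. This is only a sketch under stated assumptions: a regression is taken to mean PASS in the baseline and FAIL in the candidate (PASS to NA is not counted), the function names are made up for illustration, and the sample data is a subset of the kvgv and kdgv columns from the table in this thread.

```python
def regressions(baseline, candidate):
    """Tests that PASS in the baseline but FAIL in the candidate."""
    return sorted(test for test, status in baseline.items()
                  if status == "PASS" and candidate.get(test) == "FAIL")


def progressions(baseline, candidate):
    """Tests that FAIL in the baseline but PASS in the candidate."""
    return sorted(test for test, status in baseline.items()
                  if status == "FAIL" and candidate.get(test) == "PASS")


# Subset of the kvgv and kdgv columns posted earlier in the thread.
kvgv = {
    "iconvdata/mtrace-tst-loading": "PASS",
    "iconvdata/tst-loading": "PASS",
    "posix/tst-regex2": "PASS",
    "malloc/tst-mcheck": "FAIL",
    "malloc/tst-realloc": "FAIL",
    "malloc/tst-trim1": "FAIL",
}
kdgv = {
    "iconvdata/mtrace-tst-loading": "FAIL",
    "iconvdata/tst-loading": "FAIL",
    "posix/tst-regex2": "FAIL",
    "malloc/tst-mcheck": "PASS",
    "malloc/tst-realloc": "PASS",
    "malloc/tst-trim1": "PASS",
}

regressed = regressions(kvgv, kdgv)
improved = progressions(kvgv, kdgv)

# Progressions are reported but do not offset regressions.
acceptable = not regressed

print("regressions: ", regressed)
print("progressions:", improved)
print("acceptable:  ", acceptable)
```

The same two functions apply to any pair of columns, for example kvgv against a kvge run once that configuration has been tested.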

The above only concerns LP64 support in kernel and glibc.

Regarding ILP32 runtime, my opinion is that it is acceptable for ILP32 to have extra failures compared to LP64, since these are not regressions, but, rather, failures of a new configuration. From a superficial glance it seems that ILP32 linknamespace support requires attention, as well as stack unwinding (judging from NPTL failures).


--
Maxim Kuvyrkov
http://www.linaro.org








2016-11-17 03:43:21

by Zhangjian (Bamvor)

Subject: Re: ILP32 for ARM64 - testing with lmbench

Hi, all

I tested SPECint for aarch64 LP64 with aarch32 EL0 disabled and enabled,
and compared the results with a kernel without the ILP32 patches (4.8-rc6)
on our arm64 board. I found that the difference (ILP32 disabled vs. ILP32
unmerged) is bigger when aarch32 EL0 is enabled than when it is disabled.
bzip2, mcf, hmmer and libquantum show the top four differences[1]. Note
that bigger is better in the SPECint test.

To confirm the above results, I retested these four testcases in a
reportable way (see the command at the end). The result[2] shows that
libquantum decreases by 2.09% with ILP32 enabled and aarch32 on. I think
it is insignificant.

The lmbench results are not stable on my board. I plan to dig into this later.

[1] The following results were obtained with --size=ref --iterations=3.
1.1 Test when aarch32_el0 is enabled.
                 ILP32 disabled   baseline
400.perlbench    100.00%          100%
401.bzip2         99.35%          100%
403.gcc          100.26%          100%
429.mcf          102.75%          100%
445.gobmk        100.00%          100%
456.hmmer         95.66%          100%
458.sjeng        100.00%          100%
462.libquantum   100.00%          100%
471.omnetpp      100.59%          100%
473.astar         99.66%          100%
483.xalancbmk     99.10%          100%

1.2 Test when aarch32_el0 is disabled.
                 ILP32 disabled   baseline
400.perlbench    100.22%          100%
401.bzip2        100.95%          100%
403.gcc          100.20%          100%
429.mcf          100.76%          100%
445.gobmk        100.36%          100%
456.hmmer         97.94%          100%
458.sjeng         99.73%          100%
462.libquantum    98.72%          100%
471.omnetpp      100.86%          100%
473.astar         99.15%          100%
483.xalancbmk    100.08%          100%

[2] The following results were obtained with: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
2.1 Test when aarch32_el0 is enabled.
                 ILP32 enabled    baseline
401.bzip2        100.82%          100%
429.mcf          100.18%          100%
456.hmmer         99.64%          100%
462.libquantum    97.91%          100%
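For reference, the -2.09% figure quoted for libquantum is just the relative score from table 2.1 minus the 100% baseline. A minimal sketch of that arithmetic, using only the four values from table 2.1 (the variable names are illustrative, not part of any SPEC tooling):

```python
# Relative scores from table 2.1 (baseline = 100%; bigger is better).
relative_score = {
    "401.bzip2": 100.82,
    "429.mcf": 100.18,
    "456.hmmer": 99.64,
    "462.libquantum": 97.91,
}

# Change versus the baseline, in percentage points.
delta = {name: round(score - 100.0, 2) for name, score in relative_score.items()}

# Benchmark with the largest drop.
worst = min(delta, key=delta.get)

for name, change in delta.items():
    print(f"{name:<16} {change:+.2f}%")
print("largest drop:", worst, f"{delta[worst]:+.2f}%")
```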

Regards

Bamvor

On 2016/10/28 20:46, Yury Norov wrote:
> [Add Steve Ellcey, thanks for testing on ThunderX]
>
> Lmbench-3.0-a9 testing is performed on ThunderX machine to check that
> ILP32 series does not add performance regressions for LP64. Test
> summary is in the table below. Our measurements doesn't show
> significant performance regression of LP64 if ILP32 code is merged,
> both enabled or disabled.
>
> ILP32 enabled ILP32 disabled Standard Kernel
> null syscall 0.1066 0.1121 0.1121
> 95.09% 100.00%
>
> stat 1.3947 1.3814 1.3864
> 100.60% 99.64%
>
> fstat 0.4459 0.4344 0.4524
> 98.56% 96.02%
>
> open/close 4.0606 4.0411 4.0453
> 100.38% 99.90%
>
> read 0.4819 0.5014 0.5014
> 96.11% 100.00%
>
> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
> Other system details below.
>
> Yury.
>
> ubuntu@crb6:~$ uname -a
> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>
> ubuntu@crb6:~$ cat /proc/meminfo
> MemTotal: 132011948 kB
> MemFree: 131442672 kB
> MemAvailable: 130695764 kB
> Buffers: 15696 kB
> Cached: 88088 kB
> SwapCached: 0 kB
> Active: 82760 kB
> Inactive: 41336 kB
> Active(anon): 20880 kB
> Inactive(anon): 8576 kB
> Active(file): 61880 kB
> Inactive(file): 32760 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> SwapTotal: 128920572 kB
> SwapFree: 128920572 kB
> Dirty: 0 kB
> Writeback: 0 kB
> AnonPages: 20544 kB
> Mapped: 19780 kB
> Shmem: 9060 kB
> Slab: 78804 kB
> SReclaimable: 27372 kB
> SUnreclaim: 51432 kB
> KernelStack: 8336 kB
> PageTables: 820 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 194926544 kB
> Committed_AS: 256324 kB
> VmallocTotal: 135290290112 kB
> VmallocUsed: 0 kB
> VmallocChunk: 0 kB
> AnonHugePages: 0 kB
> ShmemHugePages: 0 kB
> ShmemPmdMapped: 0 kB
> CmaTotal: 0 kB
> CmaFree: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> ubuntu@crb6:~$ cat /proc/cpuinfo
> processor : 0
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 1
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 2
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 3
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 4
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 5
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 6
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 7
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 8
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 9
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 10
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 11
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 12
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 13
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 14
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 15
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 16
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 17
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 18
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 19
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 20
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 21
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 22
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 23
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 24
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 25
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 26
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 27
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 28
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 29
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 30
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 31
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 32
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 33
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 34
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 35
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 36
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 37
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 38
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 39
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 40
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 41
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 42
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 43
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 44
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 45
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 46
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>
> processor : 47
> BogoMIPS : 200.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer : 0x43
> CPU architecture: 8
> CPU variant : 0x1
> CPU part : 0x0a1
> CPU revision : 0
>

2016-11-17 05:02:39

by Maxim Kuvyrkov

[permalink] [raw]
Subject: Re: ILP32 for ARM64 - testing with lmbench

Hi Bamvor,

I'm surprised that you see this much difference from the ILP32 patches on SPEC CPU2006int at all. The SPEC CPU2006 benchmarks spend almost no time in kernel syscalls. I can imagine that memory, TLB, and cache handling in the kernel could affect CPU2006 benchmarks. Do the ILP32 patches touch code in those areas?

Other than that, it would be interesting to check what the variance is between the 3 iterations of benchmark runs. Could you check what the relative standard deviation is between the 3 iterations -- (STDEV(RUN1, RUN2, RUN3) / RUNselected)?

For reference, in my [non-ILP32] benchmarking I see 1.1% for 401.bzip2, 0.8% for 429.mcf, 0.2% for 456.hmmer, and 0.1% for 462.libquantum.
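
For anyone reproducing the variance check, here is a minimal Python sketch of the formula above. It assumes STDEV means the sample standard deviation (as in most spreadsheets) and that RUNselected is the median of the runs; neither is spelled out in this thread. The scores are made-up placeholders, not measurements.

```python
import statistics


def relative_stdev(runs, selected=None):
    """Sample standard deviation of `runs` divided by a reference score.

    `selected` stands in for RUNselected; when omitted, the median run
    is used (an assumption, not something stated in the thread).
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to compute a deviation")
    if selected is None:
        selected = statistics.median(runs)
    return statistics.stdev(runs) / selected


# Hypothetical scores from three iterations of one benchmark.
scores = [20.1, 20.3, 20.2]
print(f"relative standard deviation: {relative_stdev(scores):.2%}")
```

Running it once per benchmark and per kernel gives numbers that can be compared directly with the percentages quoted above.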

--
Maxim Kuvyrkov
http://www.linaro.org



> On Nov 17, 2016, at 7:28 AM, Zhangjian (Bamvor) wrote:
>
> Hi, all
>
> I tested SPECint for aarch64 LP64 with aarch32 EL0 disabled and enabled, and
> compared the results with an ILP32-unmerged kernel (4.8-rc6) on our arm64
> board. I found that the difference (ILP32 disabled / ILP32 unmerged) is bigger
> when aarch32 EL0 is enabled than when it is disabled. bzip2, mcf, hmmer, and
> libquantum show the four largest differences [1]. Note that bigger is better
> in the SPECint test.
>
> In order to make sure the above results, I retest these four testcases in
> reportable way(reference the command in the end). The result[2] show that
> libquantum decrease -2.09% after ILP32 enabled and aarch32 on. I think it is in
> significant.
>
> The result of lmbench is not stable in my board. I plan to dig it later.
>
> [1] The following test result is tested through --size=ref --iterations=3.
> 1.1 Test when aarch32_el0 is enabled.
>                   ILP32 disabled   base line
> 400.perlbench     100.00%          100%
> 401.bzip2          99.35%          100%
> 403.gcc           100.26%          100%
> 429.mcf           102.75%          100%
> 445.gobmk         100.00%          100%
> 456.hmmer          95.66%          100%
> 458.sjeng         100.00%          100%
> 462.libquantum    100.00%          100%
> 471.omnetpp       100.59%          100%
> 473.astar          99.66%          100%
> 483.xalancbmk      99.10%          100%
>
> 1.2 Test when aarch32_el0 is disabled
>                   ILP32 disabled   base line
> 400.perlbench     100.22%          100%
> 401.bzip2         100.95%          100%
> 403.gcc           100.20%          100%
> 429.mcf           100.76%          100%
> 445.gobmk         100.36%          100%
> 456.hmmer          97.94%          100%
> 458.sjeng          99.73%          100%
> 462.libquantum     98.72%          100%
> 471.omnetpp       100.86%          100%
> 473.astar          99.15%          100%
> 483.xalancbmk     100.08%          100%
>
> [2] The following test result is tested through: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
> 2.1 Test when aarch32_el0 is enabled.
>                   ILP32_enabled    base line
> 401.bzip2         100.82%          100%
> 429.mcf           100.18%          100%
> 456.hmmer          99.64%          100%
> 462.libquantum     97.91%          100%
>
> Regards
>
> Bamvor
>
> On 2016/10/28 20:46, Yury Norov wrote:
>> [Add Steve Ellcey, thanks for testing on ThunderX]
>>
>> Lmbench-3.0-a9 testing was performed on a ThunderX machine to check that
>> the ILP32 series does not add performance regressions for LP64. The test
>> summary is in the table below. Our measurements don't show a
>> significant performance regression of LP64 if the ILP32 code is merged,
>> whether it is enabled or disabled.
>>
>>                 ILP32 enabled   ILP32 disabled   Standard Kernel
>> null syscall    0.1066          0.1121           0.1121
>>                 95.09%          100.00%
>>
>> stat            1.3947          1.3814           1.3864
>>                 100.60%         99.64%
>>
>> fstat           0.4459          0.4344           0.4524
>>                 98.56%          96.02%
>>
>> open/close      4.0606          4.0411           4.0453
>>                 100.38%         99.90%
>>
>> read            0.4819          0.5014           0.5014
>>                 96.11%          100.00%
>>
>> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
>> Other system details below.
>>
>> Yury.
>>
>> ubuntu@crb6:~$ uname -a
>> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>>
>> ubuntu@crb6:~$ cat /proc/meminfo
>> MemTotal: 132011948 kB
>> MemFree: 131442672 kB
>> MemAvailable: 130695764 kB
>> Buffers: 15696 kB
>> Cached: 88088 kB
>> SwapCached: 0 kB
>> Active: 82760 kB
>> Inactive: 41336 kB
>> Active(anon): 20880 kB
>> Inactive(anon): 8576 kB
>> Active(file): 61880 kB
>> Inactive(file): 32760 kB
>> Unevictable: 0 kB
>> Mlocked: 0 kB
>> SwapTotal: 128920572 kB
>> SwapFree: 128920572 kB
>> Dirty: 0 kB
>> Writeback: 0 kB
>> AnonPages: 20544 kB
>> Mapped: 19780 kB
>> Shmem: 9060 kB
>> Slab: 78804 kB
>> SReclaimable: 27372 kB
>> SUnreclaim: 51432 kB
>> KernelStack: 8336 kB
>> PageTables: 820 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 194926544 kB
>> Committed_AS: 256324 kB
>> VmallocTotal: 135290290112 kB
>> VmallocUsed: 0 kB
>> VmallocChunk: 0 kB
>> AnonHugePages: 0 kB
>> ShmemHugePages: 0 kB
>> ShmemPmdMapped: 0 kB
>> CmaTotal: 0 kB
>> CmaFree: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>>
>> ubuntu@crb6:~$ cat /proc/cpuinfo
>> processor : 0
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 1
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 2
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 3
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 4
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 5
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 6
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 7
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 8
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 9
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 10
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 11
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 12
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 13
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 14
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 15
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 16
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 17
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 18
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 19
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 20
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 21
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 22
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 23
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 24
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 25
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 26
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 27
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 28
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 29
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 30
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 31
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 32
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 33
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 34
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 35
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 36
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 37
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 38
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 39
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 40
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 41
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 42
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 43
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 44
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 45
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 46
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>> processor : 47
>> BogoMIPS : 200.00
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer : 0x43
>> CPU architecture: 8
>> CPU variant : 0x1
>> CPU part : 0x0a1
>> CPU revision : 0
>>
>


2016-11-17 07:48:56

by Zhangjian (Bamvor)

[permalink] [raw]
Subject: Re: ILP32 for ARM64 - testing with lmbench

Hi, Maxim

On 2016/11/17 13:02, Maxim Kuvyrkov wrote:
> Hi Bamvor,
>
> I'm surprised that you see this much difference from ILP32 patches on SPEC CPU2006int at all. The SPEC CPU2006 benchmarks spend almost no time in the kernel syscalls. I can imagine memory, TLB, and cache handling in the kernel could affect CPU2006 benchmarks. Do ILP32 patches touch code in those areas?
>
> Other than that, it would be interesting to check what the variance is between the 3 iterations of benchmark runs. Could you check what relative standard deviation is between the 3 iterations -- (STDEV(RUN1, RUN2, RUN3) / RUNselected)?
>
> For reference, in my [non-ILP32] benchmarking I see 1.1% for 401.bzip2, 0.8% for 429.mcf, 0.2% for 456.hmmer, and 0.1% for 462.libquantum.

Here are my results:
                  ILP32_merged   ILP32_unmerged
401.bzip2         0.31%          0.26%
429.mcf           1.61%          1.36%
456.hmmer         1.37%          1.57%
462.libquantum    0.29%          0.28%
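
One rough way to read these figures is to express a score difference as a multiple of the run-to-run deviation. The Python sketch below does that for libquantum, using the 97.91% result from the earlier table and the 0.29% deviation from the ILP32_merged column. It assumes both figures describe the same configuration, which the thread does not confirm, and with only three runs it is an indication at best, not a significance test.

```python
def delta_in_noise_units(score_pct, baseline_pct, rsd_pct):
    """Score difference (percentage points) divided by the relative
    standard deviation (percent). Purely illustrative helper."""
    if rsd_pct <= 0:
        raise ValueError("relative standard deviation must be positive")
    return (score_pct - baseline_pct) / rsd_pct


# 462.libquantum: 97.91% of baseline, deviation 0.29% (ILP32_merged column).
ratio = delta_in_noise_units(97.91, 100.0, 0.29)
print(f"delta is {ratio:.1f}x the run-to-run deviation")
```

A ratio close to 1 would suggest the difference is within noise; a much larger magnitude suggests it is worth a closer look.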

Regards

Bamvor


2016-11-17 17:11:23

by Catalin Marinas

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Wed, Nov 16, 2016 at 03:22:26PM +0400, Maxim Kuvyrkov wrote:
> Regarding ILP32 runtime, my opinion is that it is acceptable for ILP32
> to have extra failures compared to LP64, since these are not
> regressions, but, rather, failures of a new configuration.

I disagree with this. We definitely need to understand why they fail,
otherwise we run the risk of potential glibc or kernel implementation
bugs becoming ABI.

--
Catalin

2016-11-18 00:19:23

by Steve Ellcey

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Wed, 2016-11-16 at 15:22 +0400, Maxim Kuvyrkov wrote:
> >
> > On Nov 9, 2016, at 1:56 PM, Yury Norov <[email protected]>
> > wrote:
> >
> > >
> > > Below is the results of glibc testsuite run for aarch64/lp64

I have been running the glibc testsuite as well.  I have only run it on
an ILP32 enabled kernel.  Using that kernel, top-of-tree glibc, and the
ILP32 glibc patches I have no LP64 regressions.  There are 5 failures
in LP64 mode but I get them with vanilla top-of-tree glibc sources too.
They are:
nptl/eintr1 (I actually don't run this because it kills the 'make check')
debug/tst-backtrace5
debug/tst-backtrace6
nptl/tst-stack4
nptl/tst-thread_local1

In ILP32 mode I get 33 failures. They include the above failures (minus
nptl/tst-thread_local1) plus:

c++-types-check
conform/ISO11/inttypes.h/conform
conform/ISO11/stdint.h/conform
conform/ISO99/inttypes.h/conform
conform/ISO99/stdint.h/conform
conform/POSIX2008/inttypes.h/conform
conform/POSIX2008/stdint.h/conform
conform/XOPEN2K/inttypes.h/conform
conform/XOPEN2K/stdint.h/conform
conform/XOPEN2K8/inttypes.h/conform
conform/XOPEN2K8/stdint.h/conform
elf/tst-tls1
elf/tst-tls1-static
elf/tst-tls2
elf/tst-tls2-static
elf/tst-tls3
math/check-abi-libm
math/test-double
math/test-double-finite
math/test-float
math/test-float-finite
misc/tst-sync_file_range
nptl/tst-cancel26
nptl/tst-cancel27
nptl/tst-sem3
rt/tst-mqueue1
rt/tst-mqueue2
rt/tst-mqueue4
rt/tst-mqueue7
stdlib/tst-makecontext3

I am currently looking at these ILP32 regressions (starting with the
tls failures) to see if I can figure out what is happening with them.

Steve Ellcey
[email protected]

2016-11-30 05:02:49

by Yury Norov

Subject: Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64

On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> This series enables aarch64 with ilp32 mode, and as supporting work,
> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> existing 32-bit architectures but disabled for new arches (so 64-bit
> off_t is used by new userspace).
>
> This version is based on kernel v4.9-rc1. It works with glibc-2.24,
> and tested with LTP.
>
> This version contains ABI changes, and should be used with new glibc
> version. See links below.
>
> This is RFC because there is still no solid understanding what type
> of registers top-halves delousing we prefer and it affects ABI. In
> this patchset, w0-w7 are cleared for each syscall in assembler entry.
>
> The alternative approach is in introducing compat wrappers which is
> little faster for natively routed syscalls (~2.6% for syscall with
> no payload) but much more complicated.

Hi all,

Steve Ellcey submitted glibc patches for ILP32:
https://www.sourceware.org/ml/libc-alpha/2016-11/msg01071.html
They implicitly assume that the kernel clears the top halves of registers for
all syscalls in the assembly entry code. Those patches are going to be taken.
If that happens, we will no longer have a choice on the kernel side of how to
clear the top halves.
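
To make "clearing the top halves" concrete, here is a minimal user-space
sketch in C. It is not the kernel entry code (which does this in assembly);
the function name delouse_args and the array-of-registers model are
assumptions made only for illustration. It shows the effect: each 64-bit
argument register keeps its low 32 bits and has its upper 32 bits zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the argument registers as an array of 64-bit
 * values. For every register, keep the low word (wN) and zero the upper
 * 32 bits, so stale data left there by 32-bit userspace is never seen
 * by a 64-bit syscall handler.
 */
static void delouse_args(uint64_t regs[], size_t nregs)
{
    assert(regs != NULL);
    for (size_t i = 0; i < nregs; i++)
        regs[i] = (uint32_t)regs[i];   /* truncate, then zero-extend */
}
```

With wrappers, the same truncation would instead be applied per syscall and
only to the arguments that need it, which is where the extra complexity
comes from.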

For me the current version is OK, and I see no problems with it. I am just
writing to remind everyone that this is still an RFC, and this is the last
chance to go back to wrappers.

Yury.

2016-11-30 06:53:18

by Adam Borowski

Subject: Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64

On Wed, Nov 30, 2016 at 10:32:09AM +0530, Yury Norov wrote:
> On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> > This series enables aarch64 with ilp32 mode, and as supporting work,
> > introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > existing 32-bit architectures but disabled for new arches (so 64-bit
> > off_t is used by new userspace).
>
> Hi all,
>
> Steve Ellcey submitted glibc patches for ILP32:
> https://www.sourceware.org/ml/libc-alpha/2016-11/msg01071.html
> They implicitly assume that the kernel clears the top halves of registers for
> all syscalls in the assembly entry code. Those patches are going to be taken.
> If that happens, we will no longer have a choice on the kernel side of how to
> clear the top halves.

Since a while ago, there's a package "arch-test" in Debian that empirically
enumerates architectures executable by the running kernel (and loaded
binfmts), by trying small test programs for each. The list of architectures
it knows does include arm64ilp32.

For most archs the test is just {write(1, "ok\n"); _exit(0);} unless there's
some difference from baseline that should be checked for, like dmb (ARMv7)
on armhf or mtvsrd (POWER8) on ppc64el. I could scribble in the top half of
a register to test the delousing, but it's not like alternate versions of
the ABI are expected in the wild...
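
A rough sketch of what such a baseline test can look like in C, assuming a
hosted libc rather than the raw syscalls a real test might use. The helper
name report_ok is made up for this illustration and is not taken from
arch-test itself.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Write "ok\n" to the given descriptor.
 * Returns 0 if the whole message was written, 1 otherwise.
 */
static int report_ok(int fd)
{
    static const char msg[] = "ok\n";
    ssize_t n;

    assert(fd >= 0);
    n = write(fd, msg, sizeof(msg) - 1);
    return n == (ssize_t)(sizeof(msg) - 1) ? 0 : 1;
}
```

A test binary would call _exit(report_ok(1)) from main and be built for the
architecture under test; if the kernel cannot execute it, nothing is printed.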


There's another issue: name. A stalled request to add it to dpkg's cputable
(https://bugs.debian.org/824742) uses "arm64ilp32" and "arm64ilp32be" which
are unwieldy. Even the discussion uses "ilp32" -- probably too generic.
https://wiki.linaro.org/Platform/arm64-ilp32 mentions both. I've heard
"a32" somewhere. I have no stake here (I'm on the CC list as a x32 not arm
porter...), but if you want to choose a color for this bikeshed, the time
is now.


Meow!
--
The bill declaring Jesus as the King of Poland fails to specify whether
the addition is at the top or end of the list of kings. What should the
historians do?

2016-12-05 10:08:48

by Andreas Schwab

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Dez 05 2016, "Zhangjian (Bamvor)" <[email protected]> wrote:

> Is there any progress on this? We could collaborate to fix those issues.

All the elf/nptl/rt fails should be fixed by the recent binutils fixes.

Andreas.

--
Andreas Schwab, SUSE Labs, [email protected]
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

2016-12-05 10:09:43

by Zhangjian (Bamvor)

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

Hi, Steve

On 2016/11/18 5:45, Steve Ellcey wrote:
> On Wed, 2016-11-16 at 15:22 +0400, Maxim Kuvyrkov wrote:
>>>
>>> On Nov 9, 2016, at 1:56 PM, Yury Norov <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Below is the results of glibc testsuite run for aarch64/lp64
>
> I have been running the glibc testsuite as well. I have only run it on
> an ILP32 enabled kernel. Using that kernel, top-of-tree glibc, and the
> ILP32 glibc patches I have no LP64 regressions. There are 5 failures
> in LP64 mode but I get them with vanilla top-of-tree glibc sources too.
> They are:
> nptl/eintr1 (I actually don't run this because it kills the 'make check')
> debug/tst-backtrace5
> debug/tst-backtrace6
> nptl/tst-stack4
> nptl/tst-thread_local1
>
> In ILP32 mode I get 33 failures, they include the above failures (minus
> nptl/tst-thread_local1) plus:
>
> c++-types-check
> conform/ISO11/inttypes.h/conform
> conform/ISO11/stdint.h/conform
> conform/ISO99/inttypes.h/conform
> conform/ISO99/stdint.h/conform
> conform/POSIX2008/inttypes.h/conform
> conform/POSIX2008/stdint.h/conform
> conform/XOPEN2K/inttypes.h/conform
> conform/XOPEN2K/stdint.h/conform
> conform/XOPEN2K8/inttypes.h/conform
> conform/XOPEN2K8/stdint.h/conform
> elf/tst-tls1
> elf/tst-tls1-static
> elf/tst-tls2
> elf/tst-tls2-static
> elf/tst-tls3
> math/check-abi-libm
> math/test-double
> math/test-double-finite
> math/test-float
> math/test-float-finite
> misc/tst-sync_file_range
> nptl/tst-cancel26
> nptl/tst-cancel27
> nptl/tst-sem3
> rt/tst-mqueue1
> rt/tst-mqueue2
> rt/tst-mqueue4
> rt/tst-mqueue7
> stdlib/tst-makecontext3
>
> I am currently looking at these ILP32 regressions (starting with the
> tls failures) to see if I can figure out what is happening with them.
Is there any progress on this? We could collaborate to fix those issues.

Regards

Bamvor
>
> Steve Ellcey
> [email protected]
>

2016-12-05 10:18:00

by Zhangjian (Bamvor)

Subject: Re: ILP32 for ARM64 - testing with lmbench

Hi, Catalin, Guys

Do you have any suggestions for the next step in upstreaming ILP32?
There are already test results for lmbench and specint. Do you think they are OK, or do you need more data to prove there is no regression?
I have also noticed that there are ILP32 failures in the glibc testsuite. Is that the only blocker for merging ILP32 (on the technical side)?

We appreciate any feedback or suggestions and hope we can collaborate to move the upstreaming forward.

(cc libc-alpha to get more input).

Thanks

Bamvor

On 2016/11/17 15:48, Zhangjian (Bamvor) wrote:
> Hi, Maxim
>
> On 2016/11/17 13:02, Maxim Kuvyrkov wrote:
>> Hi Bamvor,
>>
>> I'm surprised that you see this much difference from ILP32 patches on SPEC CPU2006int at all. The SPEC CPU2006 benchmarks spend almost no time in the kernel syscalls. I can imagine memory, TLB,
>> and cache handling in the kernel could affect CPU2006 benchmarks. Do ILP32 patches touch code in those areas?
>>
>> Other than that, it would be interesting to check what the variance is between the 3 iterations of benchmark runs. Could you check what relative standard deviation is between the 3 iterations --
>> (STDEV(RUN1, RUN2, RUN3) / RUNselected)?
>>
>> For reference, in my [non-ILP32] benchmarking I see 1.1% for 401.bzip2, 0.8% for 429.mcf, 0.2% for 456.hmmer, and 0.1% for 462.libquantum.
> Here is my result:
>                 ILP32_merged  ILP32_unmerged
> 401.bzip2       0.31%         0.26%
> 429.mcf         1.61%         1.36%
> 456.hmmer       1.37%         1.57%
> 462.libquantum  0.29%         0.28%
>
> Regards
>
> Bamvor
>
>>
>> --
>> Maxim Kuvyrkov
>> http://www.linaro.org
>>
>>
>>
>>> On Nov 17, 2016, at 7:28 AM, Zhangjian (Bamvor) <[email protected]> wrote:
>>>
>>> Hi, all
>>>
>>> I tested specint for aarch64 LP64 with aarch32 el0 disabled and enabled
>>> and compared against an ILP32-unmerged kernel (4.8-rc6) on our arm64 board.
>>> I found that the difference (ILP32 disabled / ILP32 unmerged) is bigger when
>>> aarch32 el0 is enabled than when it is disabled. bzip2, mcf, hmmer, and
>>> libquantum show the four largest differences[1]. Note that bigger is better
>>> in the specint test.
>>>
>>> To confirm the above results, I retested these four test cases in a
>>> reportable way (see the command at the end). The results[2] show that
>>> libquantum decreased by 2.09% after ILP32 was enabled with aarch32 on. I think it is in
>>> significant.
>>>
>>> The lmbench results are not stable on my board. I plan to dig into it later.
>>>
>>> [1] The following results were obtained with --size=ref --iterations=3.
>>> 1.1 Test when aarch32_el0 is enabled.
>>>                 ILP32 disabled  base line
>>> 400.perlbench   100.00%         100%
>>> 401.bzip2       99.35%          100%
>>> 403.gcc         100.26%         100%
>>> 429.mcf         102.75%         100%
>>> 445.gobmk       100.00%         100%
>>> 456.hmmer       95.66%          100%
>>> 458.sjeng       100.00%         100%
>>> 462.libquantum  100.00%         100%
>>> 471.omnetpp     100.59%         100%
>>> 473.astar       99.66%          100%
>>> 483.xalancbmk   99.10%          100%
>>>
>>> 1.2 Test when aarch32_el0 is disabled
>>>                 ILP32 disabled  base line
>>> 400.perlbench   100.22%         100%
>>> 401.bzip2       100.95%         100%
>>> 403.gcc         100.20%         100%
>>> 429.mcf         100.76%         100%
>>> 445.gobmk       100.36%         100%
>>> 456.hmmer       97.94%          100%
>>> 458.sjeng       99.73%          100%
>>> 462.libquantum  98.72%          100%
>>> 471.omnetpp     100.86%         100%
>>> 473.astar       99.15%          100%
>>> 483.xalancbmk   100.08%         100%
>>>
>>> [2] The following results were obtained with: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
>>> 2.1 Test when aarch32_el0 is enabled.
>>>                 ILP32_enabled  base line
>>> 401.bzip2       100.82%        100%
>>> 429.mcf         100.18%        100%
>>> 456.hmmer       99.64%         100%
>>> 462.libquantum  97.91%         100%
>>>
>>> Regards
>>>
>>> Bamvor
>>>
>>> On 2016/10/28 20:46, Yury Norov wrote:
>>>> [Add Steve Ellcey, thanks for testing on ThunderX]
>>>>
>>>> Lmbench-3.0-a9 testing was performed on a ThunderX machine to check that
>>>> the ILP32 series does not add performance regressions for LP64. The test
>>>> summary is in the table below. Our measurements don't show a
>>>> significant performance regression for LP64 if the ILP32 code is merged,
>>>> whether enabled or disabled.
>>>>
>>>>               ILP32 enabled  ILP32 disabled  Standard Kernel
>>>> null syscall  0.1066         0.1121          0.1121
>>>>               95.09%         100.00%
>>>>
>>>> stat          1.3947         1.3814          1.3864
>>>>               100.60%        99.64%
>>>>
>>>> fstat         0.4459         0.4344          0.4524
>>>>               98.56%         96.02%
>>>>
>>>> open/close    4.0606         4.0411          4.0453
>>>>               100.38%        99.90%
>>>>
>>>> read          0.4819         0.5014          0.5014
>>>>               96.11%         100.00%
>>>>
>>>> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
>>>> Other system details below.
>>>>
>>>> Yury.
>>>>
>>>> ubuntu@crb6:~$ uname -a
>>>> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>>>>
>>>> ubuntu@crb6:~$ cat /proc/meminfo
>>>> MemTotal: 132011948 kB
>>>> MemFree: 131442672 kB
>>>> MemAvailable: 130695764 kB
>>>> Buffers: 15696 kB
>>>> Cached: 88088 kB
>>>> SwapCached: 0 kB
>>>> Active: 82760 kB
>>>> Inactive: 41336 kB
>>>> Active(anon): 20880 kB
>>>> Inactive(anon): 8576 kB
>>>> Active(file): 61880 kB
>>>> Inactive(file): 32760 kB
>>>> Unevictable: 0 kB
>>>> Mlocked: 0 kB
>>>> SwapTotal: 128920572 kB
>>>> SwapFree: 128920572 kB
>>>> Dirty: 0 kB
>>>> Writeback: 0 kB
>>>> AnonPages: 20544 kB
>>>> Mapped: 19780 kB
>>>> Shmem: 9060 kB
>>>> Slab: 78804 kB
>>>> SReclaimable: 27372 kB
>>>> SUnreclaim: 51432 kB
>>>> KernelStack: 8336 kB
>>>> PageTables: 820 kB
>>>> NFS_Unstable: 0 kB
>>>> Bounce: 0 kB
>>>> WritebackTmp: 0 kB
>>>> CommitLimit: 194926544 kB
>>>> Committed_AS: 256324 kB
>>>> VmallocTotal: 135290290112 kB
>>>> VmallocUsed: 0 kB
>>>> VmallocChunk: 0 kB
>>>> AnonHugePages: 0 kB
>>>> ShmemHugePages: 0 kB
>>>> ShmemPmdMapped: 0 kB
>>>> CmaTotal: 0 kB
>>>> CmaFree: 0 kB
>>>> HugePages_Total: 0
>>>> HugePages_Free: 0
>>>> HugePages_Rsvd: 0
>>>> HugePages_Surp: 0
>>>> Hugepagesize: 2048 kB
>>>>
>>>> ubuntu@crb6:~$ cat /proc/cpuinfo
>>>> processor : 0
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 1
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 2
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 3
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 4
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 5
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 6
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 7
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 8
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 9
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 10
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 11
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 12
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 13
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 14
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 15
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 16
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 17
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 18
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 19
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 20
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 21
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 22
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 23
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 24
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 25
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 26
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 27
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 28
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 29
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 30
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 31
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 32
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 33
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 34
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 35
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 36
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 37
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 38
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 39
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 40
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 41
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 42
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 43
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 44
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 45
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 46
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>> processor : 47
>>>> BogoMIPS : 200.00
>>>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer : 0x43
>>>> CPU architecture: 8
>>>> CPU variant : 0x1
>>>> CPU part : 0x0a1
>>>> CPU revision : 0
>>>>
>>>
>>
>

2016-12-05 10:25:37

by Zhangjian (Bamvor)

Subject: Re: ILP32 for ARM64: testing with glibc testsuite



On 2016/12/5 18:07, Andreas Schwab wrote:
> On Dez 05 2016, "Zhangjian (Bamvor)" <[email protected]> wrote:
>
>> Is there any progress on this? We could collaborate to fix those issues.
>
> All the elf/nptl/rt fails should be fixed by the recent binutils fixes.
Cool. How about the conform and other failures?

Regards

Bamvor
>
> Andreas.
>

2016-12-05 14:13:44

by Catalin Marinas

Subject: Re: ILP32 for ARM64 - testing with lmbench

On Mon, Dec 05, 2016 at 06:16:09PM +0800, Zhangjian (Bamvor) wrote:
> Do you have any suggestions for the next step in upstreaming ILP32?

I mentioned the steps a few times before. I'm pasting them again here:

1. Complete the review of the Linux patches and ABI (no merge yet)
2. Review the corresponding glibc patches (no merge yet)
3. Ask (Linaro, Cavium) for toolchain + filesystem (pre-built and more
than just busybox) to be able to reproduce the testing in ARM
4. More testing (LTP, trinity, performance regressions etc.)
5. Move the ILP32 PCS out of beta (based on the results from 4)
6. Check the market again to see if anyone still needs ILP32
7. Based on 6, decide whether to merge the kernel and glibc patches

What's not explicitly mentioned in step 4 is glibc testing. Point 5 is
ARM's responsibility (toolchain folk).

> There are already test results for lmbench and specint. Do you think they
> are OK, or do you need more data to prove there is no regression?

I would need to reproduce the tests myself, see step 3.

> I have also noticed that there are ILP32 failures in the glibc testsuite.
> Is that the only blocker for merging ILP32 (on the technical side)?

It's probably not the only blocker but I have to review the kernel
patches again to make sure. I'd also like to see whether the libc-alpha
community is ok with the glibc counterpart (but don't merge the patches
until the ABI is agreed on both sides).

On performance, I want to make sure there are no regressions on
AArch32/compat and AArch64/LP64.

--
Catalin

2016-12-05 15:10:28

by Catalin Marinas

Subject: Re: [PATCH 09/18] arm64: introduce binfmt_elf32.c

On Fri, Oct 21, 2016 at 11:33:08PM +0300, Yury Norov wrote:
> As we support more than one compat formats, it looks more reasonable
> to not use fs/compat_binfmt.c. Custom binfmt_elf32.c allows to move aarch32
> specific definitions there and make code more maintainable and readable.

Can you remind me why we need this patch (rather than using the default
fs/compat_binfmt_elf.c which you include here anyway)?

> --- /dev/null
> +++ b/arch/arm64/kernel/binfmt_elf32.c
> @@ -0,0 +1,31 @@
> +/*
> + * Support for AArch32 Linux ELF binaries.
> + */
> +
> +/* AArch32 EABI. */
> +#define EF_ARM_EABI_MASK 0xff000000
> +
> +#define compat_start_thread compat_start_thread
> +#define COMPAT_SET_PERSONALITY(ex) \
> +do { \
> + clear_thread_flag(TIF_32BIT_AARCH64); \
> + set_thread_flag(TIF_32BIT); \
> +} while (0)

You introduce this here but it seems to still be present in asm/elf.h.

--
Catalin

2016-12-05 15:38:13

by Catalin Marinas

Subject: Re: [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c

On Fri, Oct 21, 2016 at 11:33:09PM +0300, Yury Norov wrote:
> binfmt_ilp32.c is needed to handle ILP32 binaries
>
> Signed-off-by: Yury Norov <[email protected]>
> Signed-off-by: Bamvor Zhang Jian <[email protected]>
> ---
> arch/arm64/include/asm/elf.h | 6 +++
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 104 insertions(+)
> create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
>
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index f259fe8..be29dde 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
>
> #define COMPAT_ELF_ET_DYN_BASE (2 * TASK_SIZE_32 / 3)
>
> +#ifndef USE_AARCH64_GREG
> /* AArch32 registers. */
> #define COMPAT_ELF_NGREG 18
> typedef unsigned int compat_elf_greg_t;
> typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
> +#else /* AArch64 registers for AARCH64/ILP32 */
> +#define COMPAT_ELF_NGREG ELF_NGREG
> +#define compat_elf_greg_t elf_greg_t
> +#define compat_elf_gregset_t elf_gregset_t
> +#endif

I think you only need compat_elf_gregset_t definition here and leave the
other two undefined.

--
Catalin

2016-12-05 16:18:33

by Catalin Marinas

Subject: Re: [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file

On Fri, Oct 21, 2016 at 11:33:13PM +0300, Yury Norov wrote:
> Signed-off-by: Yury Norov <[email protected]>

Please add some description, even if it means copying the subject.

> ---
> arch/arm64/include/asm/signal32.h | 3 +
> arch/arm64/include/asm/signal32_common.h | 27 +++++++
> arch/arm64/kernel/Makefile | 2 +-
> arch/arm64/kernel/signal32.c | 107 ------------------------
> arch/arm64/kernel/signal32_common.c | 135 +++++++++++++++++++++++++++++++
> 5 files changed, 166 insertions(+), 108 deletions(-)
> create mode 100644 arch/arm64/include/asm/signal32_common.h
> create mode 100644 arch/arm64/kernel/signal32_common.c

I wonder whether you can make such patches more readable by setting
"diff.renames" to "copy" in your gitconfig (unless it's set already and
Git cannot detect partial file code moving/copying).

--
Catalin

2016-12-05 16:34:33

by Catalin Marinas

Subject: Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> New aarch32 ptrace syscall handler is introduced to avoid run-time
> detection of the task type.

What's wrong with the run-time detection? If it's just to avoid a
negligible overhead, I would rather keep the code simpler by avoiding
duplicating the generic compat_sys_ptrace().
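
A toy sketch of the two shapes being compared, assuming nothing about the real handlers: every name below is hypothetical and the return values only mark which path was taken. Option 1 is the run-time check inside one shared entry point; option 2 gives each ABI its own syscall-table entry, so no check is needed at call time.

```c
#include <stddef.h>

enum compat_abi { ABI_AARCH32, ABI_ILP32 };

/* Stand-in request handlers; the return values only identify the path taken. */
long aarch32_ptrace_request(long request) { (void)request; return 32; }
long ilp32_ptrace_request(long request)   { (void)request; return 64; }

/* Option 1: one shared entry point that checks the task type on every call. */
long compat_ptrace_runtime(enum compat_abi abi, long request)
{
    if (abi == ABI_AARCH32)
        return aarch32_ptrace_request(request);
    return ilp32_ptrace_request(request);
}

/* Option 2: each ABI's syscall table points straight at its own handler. */
typedef long (*ptrace_entry_t)(long request);
const ptrace_entry_t aarch32_table_ptrace = aarch32_ptrace_request;
const ptrace_entry_t ilp32_table_ptrace = ilp32_ptrace_request;
```

Both options reach the same handler; the difference is one branch per call against an extra entry point to maintain.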

--
Catalin

2016-12-05 17:12:52

by Catalin Marinas

Subject: Re: [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers

On Fri, Oct 21, 2016 at 11:33:10PM +0300, Yury Norov wrote:
> off_t is passed in register pair just like in aarch32.
> In this patch corresponding aarch32 handlers are shared to
> ilp32 code.
[...]
> +/*
> + * Note: off_4k (w5) is always in units of 4K. If we can't do the
> + * requested offset because it is not page-aligned, we return -EINVAL.
> + */
> +ENTRY(compat_sys_mmap2_wrapper)
> +#if PAGE_SHIFT > 12
> + tst w5, #~PAGE_MASK >> 12
> + b.ne 1f
> + lsr w5, w5, #PAGE_SHIFT - 12
> +#endif
> + b sys_mmap_pgoff
> +1: mov x0, #-EINVAL
> + ret
> +ENDPROC(compat_sys_mmap2_wrapper)

For compat sys_mmap2, the pgoff argument is in multiples of 4K. This was
traditionally used for architectures where off_t is 32-bit to allow
mapping files to 2^44.

Since off_t is 64-bit with AArch64/ILP32, should we just pass the off_t
as a 64-bit value in two different registers (w5 and w6)?
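
For reference, the arithmetic done by the quoted wrapper can be sketched in C as below. This is an illustration only, not kernel code: the function names are hypothetical and the page shift is passed as a parameter instead of coming from PAGE_SHIFT.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an mmap2 offset given in 4K units (w5 in
 * the wrapper) into units of the kernel page size, or return -EINVAL if
 * the offset is not aligned to a kernel page.
 */
int64_t mmap2_off_to_pgoff(uint32_t off_4k, unsigned int page_shift)
{
    assert(page_shift >= 12 && page_shift < 32);

    unsigned int shift = page_shift - 12;
    uint32_t mask = (1u << shift) - 1;  /* same value as ~PAGE_MASK >> 12 */

    if (off_4k & mask)
        return -EINVAL;
    return (int64_t)(off_4k >> shift);
}

/* Largest byte offset expressible with a 32-bit count of 4K units. */
uint64_t mmap2_max_offset(void)
{
    return (uint64_t)UINT32_MAX << 12;  /* 2^44 minus one 4K unit */
}
```

With 64K pages (page_shift of 16), an off_4k of 16 maps to page offset 1, while an off_4k of 3 is rejected because it does not fall on a 64K boundary.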

--
Catalin

2016-12-05 19:34:07

by Steve Ellcey

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Mon, 2016-12-05 at 11:07 +0100, Andreas Schwab wrote:
> On Dez 05 2016, "Zhangjian (Bamvor)" <[email protected]>
> wrote:
>
> >
> > Is there some progress on it? We could collaborate to fix those
> > issues.
> All the elf/nptl/rt fails should be fixed by the recent binutils
> fixes.
>
> Andreas.

I am using binutils ToT and Yury's latest patch
(https://sourceware.org/ml/binutils/2016-12/msg00039.html) and I am still
seeing some nptl and rt failures in the glibc testsuite, specifically:

FAIL: nptl/tst-cancel26
FAIL: nptl/tst-cancel27
FAIL: nptl/tst-stack4
FAIL: rt/tst-mqueue1
FAIL: rt/tst-mqueue2
FAIL: rt/tst-mqueue4
FAIL: rt/tst-mqueue7

Steve Ellcey
[email protected]

2016-12-06 06:25:30

by Yury Norov

Subject: Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > detection of the task type.
>
> What's wrong with the run-time detection? If it's just to avoid a
> negligible overhead, I would rather keep the code simpler by avoiding
> duplicating the generic compat_sys_ptrace().

Nothing is wrong. This is how Arnd asked me to do it. You already asked this
question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html

If it still looks weird to you, I can switch back to runtime
ptrace. But I'd like to hear Arnd's opinion.

Yury.

2016-12-06 07:06:19

by Yury Norov

Subject: Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > detection of the task type.
> >
> > What's wrong with the run-time detection? If it's just to avoid a
> > negligible overhead, I would rather keep the code simpler by avoiding
> > duplicating the generic compat_sys_ptrace().
>
> Nothing wrong. This is how Arnd asked me to do. You already asked this
> question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
>
> If it's still looking weird to you, I can switch back to runtime
> ptrace. But I'd like to see Arnd's opinion.

This is Arnd's email:
https://patchwork.kernel.org/patch/7980521/

Yury.

2016-12-06 07:48:33

by Yury Norov

Subject: Re: [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers

On Mon, Dec 05, 2016 at 05:12:43PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:10PM +0300, Yury Norov wrote:
> > off_t is passed in register pair just like in aarch32.
> > In this patch corresponding aarch32 handlers are shared to
> > ilp32 code.
> [...]
> > +/*
> > + * Note: off_4k (w5) is always in units of 4K. If we can't do the
> > + * requested offset because it is not page-aligned, we return -EINVAL.
> > + */
> > +ENTRY(compat_sys_mmap2_wrapper)
> > +#if PAGE_SHIFT > 12
> > + tst w5, #~PAGE_MASK >> 12
> > + b.ne 1f
> > + lsr w5, w5, #PAGE_SHIFT - 12
> > +#endif
> > + b sys_mmap_pgoff
> > +1: mov x0, #-EINVAL
> > + ret
> > +ENDPROC(compat_sys_mmap2_wrapper)
>
> For compat sys_mmap2, the pgoff argument is in multiples of 4K. This was
> traditionally used for architectures where off_t is 32-bit to allow
> mapping files to 2^44.
>
> Since off_t is 64-bit with AArch64/ILP32, should we just pass the off_t
> as a 64-bit value in two different registers (w5 and w6)?

The current glibc implementation would break for 64-bit off_t if
I did what you suggest:
sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c
__ptr_t
__mmap (__ptr_t addr, size_t len, int prot, int flags, int fd, off_t offset)
{
  if (offset & (MMAP_PAGE_UNIT - 1))
    {
      __set_errno (EINVAL);
      return MAP_FAILED;
    }
  return (__ptr_t) INLINE_SYSCALL (mmap2, 6, addr, len, prot, flags, fd,
                                   offset / MMAP_PAGE_UNIT);
}

weak_alias (__mmap, mmap)

So it requires changes both in glibc and in kernel. I can do it. But
I'd like to collect opinions of kernel and glibc developers before
starting it.
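
A rough sketch of what passing the offset as a register pair would involve on both sides, assuming the low half travels first. The ordering of the halves is an ABI decision that this thread has not settled, and all names below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical pair of 32-bit syscall arguments (e.g. w5 and w6). */
struct reg_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Userspace side: split a 64-bit byte offset into two 32-bit arguments. */
struct reg_pair off64_split(int64_t offset)
{
    struct reg_pair pair;

    pair.lo = (uint32_t)((uint64_t)offset & 0xffffffffu);
    pair.hi = (uint32_t)((uint64_t)offset >> 32);
    return pair;
}

/* Kernel side: rebuild the 64-bit offset from the two halves. */
int64_t off64_join(uint32_t lo, uint32_t hi)
{
    return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

With this scheme glibc would pass the byte offset itself, so the division by MMAP_PAGE_UNIT in the code above would move into the kernel.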

Yury

2016-12-06 08:05:57

by Yury Norov

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Mon, Dec 05, 2016 at 06:24:11PM +0800, Zhangjian (Bamvor) wrote:
>
>
> On 2016/12/5 18:07, Andreas Schwab wrote:
> >On Dez 05 2016, "Zhangjian (Bamvor)" <[email protected]> wrote:
> >
> >>Is there some progress on it? We could collaborate to fix those issues.
> >
> >All the elf/nptl/rt fails should be fixed by the recent binutils fixes.
> Cool. How about the conform and other failures?

I think conform is only my local problem. I use a pretty non-standard
environment for build and testing - cross-compilation + qemu. Steve
builds and runs tests natively, and he doesn't see those regressions.

Yury

2016-12-06 08:32:50

by Andreas Schwab

Subject: Re: ILP32 for ARM64: testing with glibc testsuite

On Dez 05 2016, Steve Ellcey <[email protected]> wrote:

> FAIL: nptl/tst-cancel26
> FAIL: nptl/tst-cancel27

> FAIL: rt/tst-mqueue1
> FAIL: rt/tst-mqueue2
> FAIL: rt/tst-mqueue4
> FAIL: rt/tst-mqueue7

I don't see these failures. Maybe you need to rebuild libgcc?

https://build.opensuse.org/package/live_build_log/devel:ARM:AArch64:ILP32/glibc-testsuite/standard/aarch64_ilp32

Andreas.

--
Andreas Schwab, SUSE Labs, [email protected]
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

2016-12-06 09:40:04

by Yury Norov

Subject: Re: [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file

On Mon, Dec 05, 2016 at 04:18:24PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:13PM +0300, Yury Norov wrote:
> > Signed-off-by: Yury Norov <[email protected]>
>
> Please add some description, even if it means copying the subject.
>
> > ---
> > arch/arm64/include/asm/signal32.h | 3 +
> > arch/arm64/include/asm/signal32_common.h | 27 +++++++
> > arch/arm64/kernel/Makefile | 2 +-
> > arch/arm64/kernel/signal32.c | 107 ------------------------
> > arch/arm64/kernel/signal32_common.c | 135 +++++++++++++++++++++++++++++++
> > 5 files changed, 166 insertions(+), 108 deletions(-)
> > create mode 100644 arch/arm64/include/asm/signal32_common.h
> > create mode 100644 arch/arm64/kernel/signal32_common.c
>
> I wonder whether you can make such patches more readable by setting
> "diff.renames" to "copy" in your gitconfig (unless it's set already and
> Git cannot detect partial file code moving/copying).

I tried "git format-patch -C --find-copies-harder" - the same result.

Yury

2016-12-07 16:59:30

by Catalin Marinas

Subject: Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > detection of the task type.
> >
> > What's wrong with the run-time detection? If it's just to avoid a
> > negligible overhead, I would rather keep the code simpler by avoiding
> > duplicating the generic compat_sys_ptrace().
>
> Nothing wrong. This is how Arnd asked me to do. You already asked this
> question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html

Hmm, I completely forgot about this ;). There is still an advantage to
doing run-time checking if we avoid touching core code (less acks to
gather and less code duplication).

Let's see what Arnd says but the initial patch looked simpler.

--
Catalin

2016-12-07 20:41:53

by Arnd Bergmann

Subject: Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

On Wednesday, December 7, 2016 4:59:13 PM CET Catalin Marinas wrote:
> On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> > On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > > detection of the task type.
> > >
> > > What's wrong with the run-time detection? If it's just to avoid a
> > > negligible overhead, I would rather keep the code simpler by avoiding
> > > duplicating the generic compat_sys_ptrace().
> >
> > Nothing wrong. This is how Arnd asked me to do. You already asked this
> > question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
>
> Hmm, I completely forgot about this ;). There is still an advantage to
> doing run-time checking if we avoid touching core code (less acks to
> gather and less code duplication).
>
> Let's see what Arnd says but the initial patch looked simpler.

I don't currently have either version of the patch in my inbox
(the archive is on a different machine), but in general I'd still
think it's best to avoid the runtime check for aarch64-ilp32
altogether. I'd have to look at the overall kernel source to
see if it's worth avoiding one or two instances though, or
if there are an overwhelming number of other checks that we
can't avoid at all.

Regarding ptrace, I notice that arch/tile doesn't even use
the compat entry point for its ilp32 user space on 64-bit
kernels, it just calls the regular 64-bit one. Would that
help here?

Arnd

2016-12-08 13:13:40

by Catalin Marinas

Subject: Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

On Wed, Dec 07, 2016 at 09:40:13PM +0100, Arnd Bergmann wrote:
> On Wednesday, December 7, 2016 4:59:13 PM CET Catalin Marinas wrote:
> > On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> > > On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > > > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > > > detection of the task type.
> > > >
> > > > What's wrong with the run-time detection? If it's just to avoid a
> > > > negligible overhead, I would rather keep the code simpler by avoiding
> > > > duplicating the generic compat_sys_ptrace().
> > >
> > > Nothing wrong. This is how Arnd asked me to do. You already asked this
> > > question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
> >
> > Hmm, I completely forgot about this ;). There is still an advantage to
> > doing run-time checking if we avoid touching core code (less acks to
> > gather and less code duplication).
> >
> > Let's see what Arnd says but the initial patch looked simpler.
>
> I don't currently have either version of the patch in my inbox
> (the archive is on a different machine), but in general I'd still
> think it's best to avoid the runtime check for aarch64-ilp32
> altogether. I'd have to look at the overall kernel source to
> see if it's worth avoiding one or two instances though, or
> if there are an overwhelming number of other checks that we
> can't avoid at all.

Just in case you haven't found them already, current version:

https://marc.info/?l=linux-arm-kernel&m=147708276818318&w=2

Original version:

https://patchwork.kernel.org/patch/7980521/

The old one looks more readable and, given that ptrace is not really a
fast path, I'm not too worried about run-time checks.

> Regarding ptrace, I notice that arch/tile doesn't even use
> the compat entry point for its ilp32 user space on 64-bit
> kernels, it just calls the regular 64-bit one. Would that
> help here?

I don't know whether it would work, we have incompatible siginfo_t on
AArch64/ILP32.

--
Catalin

2016-12-11 12:08:27

by Yury Norov

Subject: Re: ILP32 for ARM64 - testing with lmbench

On Mon, Dec 05, 2016 at 02:13:12PM +0000, Catalin Marinas wrote:
> On Mon, Dec 05, 2016 at 06:16:09PM +0800, Zhangjian (Bamvor) wrote:
> > Do you have suggestion of next move of upstreaming ILP32?
>
> I mentioned the steps a few time before. I'm pasting them again here:
>
> 1. Complete the review of the Linux patches and ABI (no merge yet)
> 2. Review the corresponding glibc patches (no merge yet)
> 3. Ask (Linaro, Cavium) for toolchain + filesystem (pre-built and more
> than just busybox) to be able to reproduce the testing in ARM
> 4. More testing (LTP, trinity, performance regressions etc.)
> 5. Move the ILP32 PCS out of beta (based on the results from 4)
> 6. Check the market again to see if anyone still needs ILP32
> 7. Based on 6, decide whether to merge the kernel and glibc patches
>
> What's not explicitly mentioned in step 4 is glibc testing. Point 5 is
> ARM's responsibility (toolchain folk).
>
> > There are already the test results of lmbench and specint. Do you they
> > are ok or need more data to prove no regression?
>
> I would need to reproduce the tests myself, see step 3.

Hi Catalin,

> 3. Ask (Linaro, Cavium) for toolchain + filesystem (pre-built and more
> than just busybox) to be able to reproduce the testing in ARM

This is Andrew's toolchain, which I use to build the kernel, glibc, binutils etc.:
https://drive.google.com/open?id=0B93nHerV55yNVlVKaXpOOHQtbW8
It's not the latest build but it works well for me.

This archive contains a 4.9-rc8 kernel, initrd, sys-root and a qemu image
based on ilp32 busybox:
https://drive.google.com/open?id=0B93nHerV55yNbVo0bko0bWlQeFE

I can start Linux on qemu and run basic commands and tests in ilp32
mode. This is my first attempt to create a rootfs, and it is a very basic
busybox + sys-root. But it lets me start lp64 and ilp32 apps (find an
example there). If you need something more, let me know and I'll add
it. You can also use any professional distro with this ilp32-enabled
kernel, just copy the sys-root there (as I actually do - I run Ubuntu
14 daily).

BTW, it is of course a good idea to build and test an ilp32 user
environment, but in real life I think ilp32 apps will run in an lp64
userspace.

> 4. More testing (LTP, trinity, performance regressions etc.)

I also built and ran trinity. After ~24 hours I found all trinity
threads stalled for lp64, and after another 24 hours I found it
running but slower for ilp32. The kernel was alive in both cases.

Yury.

2016-12-14 09:41:04

by Yury Norov

Subject: Re: [PATCH 09/18] arm64: introduce binfmt_elf32.c

On Mon, Dec 05, 2016 at 03:10:19PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:08PM +0300, Yury Norov wrote:
> > As we support more than one compat formats, it looks more reasonable
> > to not use fs/compat_binfmt.c. Custom binfmt_elf32.c allows to move aarch32
> > specific definitions there and make code more maintainable and readable.
>
> Can you remind me why we need this patch (rather than using the default
> fs/compat_binfmt_elf.c which you include here anyway)?

https://patchwork.kernel.org/patch/8756121/

This is mostly to avoid runtime checks and hide some re-definitions
for aarch32 from ilp32, to avoid re-re-definition.

>
> > --- /dev/null
> > +++ b/arch/arm64/kernel/binfmt_elf32.c
> > @@ -0,0 +1,31 @@
> > +/*
> > + * Support for AArch32 Linux ELF binaries.
> > + */
> > +
> > +/* AArch32 EABI. */
> > +#define EF_ARM_EABI_MASK 0xff000000
> > +
> > +#define compat_start_thread compat_start_thread
> > +#define COMPAT_SET_PERSONALITY(ex) \
> > +do { \
> > + clear_thread_flag(TIF_32BIT_AARCH64); \
> > + set_thread_flag(TIF_32BIT); \
> > +} while (0)
>
> You introduce this here but it seems to still be present in asm/elf.h.

Hmm... Maybe the chunk that deletes it from asm/elf.h was dropped in some
rebase. Thank you for the catch. I'll check it again.

Yury

2016-12-18 07:08:49

by Yury Norov

Subject: Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64

On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> This series enables aarch64 with ilp32 mode, and as supporting work,
> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> existing 32-bit architectures but disabled for new arches (so 64-bit
> off_t is used by new userspace).
>
> This version is based on kernel v4.9-rc1. It works with glibc-2.24,
> and tested with LTP.

Hi Arnd, Catalin

For the last few days I have been trying to rebase this series on current
master, and I see significant conflicts and regressions. In fact, every time
I rebase on the next rc1, I feel like I'm playing roulette.

This is not a significant problem now because it's almost certain
that this series will not get into 4.10, for reasons not related to
kernel code, so I have time to deal with regressions. But in general,
I'd like to try my patches on top of other candidates for the next merge
window. I cannot read all emails on LKML, but I can easily detect
problems and join the discussion at an early stage if I see any problem.

This is probably a noob question, and there are well-known branches,
like Andrew Morton's. But at this stage it's very important to
have this series prepared for merge, so I'd prefer to ask about it.

Yury.

2016-12-21 18:57:00

by Yury Norov

Subject: Re: [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c

On Mon, Dec 05, 2016 at 03:38:01PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:09PM +0300, Yury Norov wrote:
> > binfmt_ilp32.c is needed to handle ILP32 binaries
> >
> > Signed-off-by: Yury Norov <[email protected]>
> > Signed-off-by: Bamvor Zhang Jian <[email protected]>
> > ---
> > arch/arm64/include/asm/elf.h | 6 +++
> > arch/arm64/kernel/Makefile | 1 +
> > arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 104 insertions(+)
> > create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
> >
> > diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> > index f259fe8..be29dde 100644
> > --- a/arch/arm64/include/asm/elf.h
> > +++ b/arch/arm64/include/asm/elf.h
> > @@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
> >
> > #define COMPAT_ELF_ET_DYN_BASE (2 * TASK_SIZE_32 / 3)
> >
> > +#ifndef USE_AARCH64_GREG
> > /* AArch32 registers. */
> > #define COMPAT_ELF_NGREG 18
> > typedef unsigned int compat_elf_greg_t;
> > typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
> > +#else /* AArch64 registers for AARCH64/ILP32 */
> > +#define COMPAT_ELF_NGREG ELF_NGREG
> > +#define compat_elf_greg_t elf_greg_t
> > +#define compat_elf_gregset_t elf_gregset_t
> > +#endif
>
> I think you only need compat_elf_gregset_t definition here and leave the
> other two undefined.

I checked everything here again, and found that almost all compat defines
can be moved to the corresponding binfmt files. If everything is OK, I'll
incorporate the patch below into the series.

Yury

--
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index abb75f5..76f0a5c 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -176,30 +176,10 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,

#define COMPAT_ELF_ET_DYN_BASE (2 * TASK_SIZE_32 / 3)

-#ifndef USE_AARCH64_GREG
/* AArch32 registers. */
#define COMPAT_ELF_NGREG 18
typedef unsigned int compat_elf_greg_t;
typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
-#else /* AArch64 registers for AARCH64/ILP32 */
-#define COMPAT_ELF_NGREG ELF_NGREG
-#define compat_elf_greg_t elf_greg_t
-#define compat_elf_gregset_t elf_gregset_t
-#endif
-
-/* AArch32 EABI. */
-#define EF_ARM_EABI_MASK 0xff000000
-#define compat_elf_check_arch(x) (system_supports_32bit_el0() && \
- ((x)->e_machine == EM_ARM) && \
- ((x)->e_flags & EF_ARM_EABI_MASK))
-
-#define compat_start_thread compat_start_thread
-#define COMPAT_ARCH_DLINFO
-extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
- int uses_interp);
-#define compat_arch_setup_additional_pages \
- aarch32_setup_vectors_page
-
#endif /* CONFIG_COMPAT */

#endif /* !__ASSEMBLY__ */
diff --git a/arch/arm64/kernel/binfmt_elf32.c b/arch/arm64/kernel/binfmt_elf32.c
index 99a4cf2..7c38a22 100644
--- a/arch/arm64/kernel/binfmt_elf32.c
+++ b/arch/arm64/kernel/binfmt_elf32.c
@@ -17,16 +17,16 @@
#define COMPAT_ELF_HWCAP (compat_elf_hwcap)
#define COMPAT_ELF_HWCAP2 (compat_elf_hwcap2)

-#ifdef __AARCH64EB__
-#define COMPAT_ELF_PLATFORM ("v8b")
-#else
-#define COMPAT_ELF_PLATFORM ("v8l")
-#endif
-
#define compat_arch_setup_additional_pages \
aarch32_setup_vectors_page
struct linux_binprm;
extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
int uses_interp);

+/* AArch32 EABI. */
+#define compat_elf_check_arch(x) (system_supports_32bit_el0() && \
+ ((x)->e_machine == EM_ARM) && \
+ ((x)->e_flags & EF_ARM_EABI_MASK))
+
+
#include "../../../fs/compat_binfmt_elf.c"
diff --git a/arch/arm64/kernel/binfmt_ilp32.c b/arch/arm64/kernel/binfmt_ilp32.c
index dd62467..ec4a412 100644
--- a/arch/arm64/kernel/binfmt_ilp32.c
+++ b/arch/arm64/kernel/binfmt_ilp32.c
@@ -1,7 +1,9 @@
/*
* Support for ILP32 Linux/aarch64 ELF binaries.
*/
-#define USE_AARCH64_GREG
+
+#undef compat_elf_gregset_t
+#define compat_elf_gregset_t elf_gregset_t

#include <linux/elfcore-compat.h>
#include <linux/time.h>