2021-06-11 18:06:33

by Dmitry Safonov

Subject: [PATCH v3 00/23] Add generic vdso_base tracking

v3 Changes:
- Migrated arch/powerpc to vdso_base
- Added x86/selftest for unmapped vdso & no landing on fast syscall
- Review comments from Andy & Christophe (thanks!)
- Amended s/born process/execed process/ everywhere I noticed
- Fixed a build robot warning on a cast from a __user pointer

I've tested it on x86; I would appreciate any help with Tested-by
on arm/arm64/mips/powerpc/s390/... platforms.

One thing I've noticed while cooking this, and haven't found a clean
way to solve, is the zero-terminated .pages[] array in vdso mappings:
it is not always actually zero-terminated, but it works anyway because
the mappings are VM_DONTEXPAND.
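
For reference, a condensed sketch of the .pages[] walk in
special_mapping_fault() (paraphrased from mm/mmap.c around this
series' base; not part of the set): the loop stops either at the
faulting offset or at a NULL entry, so a non-terminated array only
stays safe because VM_DONTEXPAND keeps the VMA from outgrowing the
installed pages.

	/* paraphrased from mm/mmap.c:special_mapping_fault() */
	struct page **pages = sm->pages;

	/* walk to the faulting page, or stop at a NULL terminator */
	for (; pgoff && *pages; ++pages)
		pgoff--;

	if (*pages) {
		struct page *page = *pages;

		get_page(page);
		vmf->page = page;
		return 0;
	}
	return VM_FAULT_SIGBUS;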

v2 Changes:
- Rename user_landing to vdso_base as it tracks vDSO VMA start address,
rather than the explicit address to land (Andy)
- Reword and don't use "new-execed" and "new-born" task (Andy)
- Fix failures reported by build robot

This started from discussion [1], where it was noted that a couple of
architectures currently support mremap() for vdso/sigpage, but not
munmap(). If an application maps something at the former location of
vdso/sigpage, the kernel will later land there after processing a
signal (good luck!).
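
A hypothetical userspace demonstration (not from the series; the real
test is patch 23's tools/testing/selftests/x86/test_munmap_vdso.c.
Both the unmap size and the libc behaviour here are assumptions: a
proper test derives the size from the vDSO ELF headers and installs
the handler without a libc-provided SA_RESTORER):

#include <signal.h>
#include <sys/auxv.h>
#include <sys/mman.h>

static void handler(int sig) { (void)sig; }

int main(void)
{
	unsigned long vdso = getauxval(AT_SYSINFO_EHDR);

	signal(SIGUSR1, handler);
	if (vdso)
		munmap((void *)vdso, 2 * 4096);	/* size is a guess */
	/*
	 * On architectures whose signal return trampoline lives in the
	 * vdso/sigpage, returning from the handler now jumps to whatever
	 * happens to be mapped at the old address - or to nothing at all.
	 */
	raise(SIGUSR1);
	return 0;
}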

The patch set is based on linux-next (next-20201123) and depends on
changes in x86/cleanups (those reclaim TIF_IA32/TIF_X32) and also on
my changes in akpm (fixing several mremap() issues).

Logically, the patch set divides into:
- patch 1: a cleanup for patches in x86/cleanups
- patches 2-13: cleanups for arch_setup_additional_pages()
- patches 14-15: x86 signal changes for unmapped vdso
- patches 16-22: provide generic vdso_base in mm_struct
- patch 23: selftest for unmapped vDSO & fast syscalls

In the end, besides the cleanups, it's now more predictable what
happens to applications with an unmapped vdso on architectures that
support .mremap() for vdso/sigpage.

I'm aware of only one user that unmaps vdso - Valgrind [2].
(There are possibly more, but this one is "special": it unmaps vdso,
but not vvar, which confuses CRIU [Checkpoint/Restore In Userspace];
that's why I'm aware of it.)

Patches as a .git branch:
https://github.com/0x7f454c46/linux/tree/vdso_base-v3

v2 Link:
https://lore.kernel.org/lkml/[email protected]/
v1 Link:
https://lore.kernel.org/lkml/[email protected]/

[1]: https://lore.kernel.org/linux-arch/CAJwJo6ZANqYkSHbQ+3b+Fi_VT80MtrzEV5yreQAWx-L8j8x2zA@mail.gmail.com/
[2]: https://github.com/checkpoint-restore/criu/issues/488

Cc: Alexander Viro <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]

Dmitry Safonov (23):
x86/elf: Check in_x32_syscall() in
compat_arch_setup_additional_pages()
elf: Move arch_setup_additional_pages() to generic elf.h
arm/elf: Remove needless ifdef CONFIG_MMU
arm64: Use in_compat_task() in arch_setup_additional_pages()
x86: Remove compat_arch_setup_additional_pages()
elf: Remove compat_arch_setup_additional_pages()
vdso: Set mm->context.vdso only on success of
_install_special_mapping()
elf/vdso: Modify arch_setup_additional_pages() parameters
elf: Use sysinfo_ehdr in ARCH_DLINFO()
arm/vdso: Remove vdso pointer from mm->context
s390/vdso: Remove vdso_base pointer from mm->context
sparc/vdso: Remove vdso pointer from mm->context
mm/mmap: Make vm_special_mapping::mremap return void
x86/signal: Land on &frame->retcode when vdso isn't mapped
x86/signal: Check if vdso_image_32 is mapped before trying to land on it
mm: Add vdso_base in mm_struct
x86/vdso: Migrate to generic vdso_base
arm/vdso: Migrate to generic vdso_base
arm64/vdso: Migrate compat signals to generic vdso_base
arm64/vdso: Migrate native signals to generic vdso_base
mips/vdso: Migrate to generic vdso_base
powerpc/vdso: Migrate native signals to generic vdso_base
x86/vdso/selftest: Add a test for unmapping vDSO

arch/Kconfig | 3 +
arch/alpha/include/asm/elf.h | 2 +-
arch/arm/Kconfig | 2 +
arch/arm/include/asm/elf.h | 10 +-
arch/arm/include/asm/mmu.h | 3 -
arch/arm/include/asm/vdso.h | 6 +-
arch/arm/kernel/process.c | 14 +-
arch/arm/kernel/signal.c | 6 +-
arch/arm/kernel/vdso.c | 20 +--
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/elf.h | 27 +---
arch/arm64/include/asm/mmu.h | 4 -
arch/arm64/kernel/signal.c | 10 +-
arch/arm64/kernel/signal32.c | 17 +-
arch/arm64/kernel/vdso.c | 72 +++------
arch/csky/Kconfig | 1 +
arch/csky/include/asm/elf.h | 4 -
arch/csky/kernel/vdso.c | 11 +-
arch/hexagon/Kconfig | 1 +
arch/hexagon/include/asm/elf.h | 6 -
arch/hexagon/kernel/vdso.c | 3 +-
arch/ia64/include/asm/elf.h | 2 +-
arch/mips/Kconfig | 2 +
arch/mips/include/asm/elf.h | 10 +-
arch/mips/include/asm/mmu.h | 2 -
arch/mips/kernel/signal.c | 11 +-
arch/mips/kernel/vdso.c | 5 +-
arch/mips/vdso/genvdso.c | 9 --
arch/nds32/Kconfig | 1 +
arch/nds32/include/asm/elf.h | 8 +-
arch/nds32/kernel/vdso.c | 8 +-
arch/nios2/Kconfig | 1 +
arch/nios2/include/asm/elf.h | 4 -
arch/nios2/mm/init.c | 2 +-
arch/powerpc/Kconfig | 2 +
arch/powerpc/include/asm/book3s/32/mmu-hash.h | 1 -
arch/powerpc/include/asm/book3s/64/mmu.h | 1 -
arch/powerpc/include/asm/elf.h | 9 +-
arch/powerpc/include/asm/mmu_context.h | 9 --
arch/powerpc/include/asm/nohash/32/mmu-40x.h | 1 -
arch/powerpc/include/asm/nohash/32/mmu-44x.h | 1 -
arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 1 -
arch/powerpc/include/asm/nohash/mmu-book3e.h | 1 -
arch/powerpc/kernel/signal_32.c | 8 +-
arch/powerpc/kernel/signal_64.c | 4 +-
arch/powerpc/kernel/vdso.c | 48 +-----
arch/powerpc/perf/callchain_32.c | 8 +-
arch/powerpc/perf/callchain_64.c | 4 +-
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/elf.h | 9 +-
arch/riscv/kernel/vdso.c | 11 +-
arch/s390/Kconfig | 1 +
arch/s390/include/asm/elf.h | 10 +-
arch/s390/include/asm/mmu.h | 1 -
arch/s390/kernel/vdso.c | 12 +-
arch/sh/Kconfig | 1 +
arch/sh/include/asm/elf.h | 16 +-
arch/sh/kernel/vsyscall/vsyscall.c | 3 +-
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/elf_64.h | 11 +-
arch/sparc/include/asm/mmu_64.h | 1 -
arch/sparc/vdso/vma.c | 20 +--
arch/x86/Kconfig | 2 +
arch/x86/entry/common.c | 10 +-
arch/x86/entry/vdso/extable.c | 4 +-
arch/x86/entry/vdso/vma.c | 79 ++++-----
arch/x86/ia32/ia32_signal.c | 18 ++-
arch/x86/include/asm/compat.h | 6 +
arch/x86/include/asm/elf.h | 44 ++---
arch/x86/include/asm/mmu.h | 1 -
arch/x86/include/asm/mmu_context.h | 5 -
arch/x86/include/asm/vdso.h | 4 +
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 3 +-
arch/x86/kernel/signal.c | 25 ++-
arch/x86/um/asm/elf.h | 9 +-
arch/x86/um/vdso/vma.c | 2 +-
fs/Kconfig.binfmt | 3 +
fs/aio.c | 3 +-
fs/binfmt_elf.c | 19 ++-
fs/binfmt_elf_fdpic.c | 17 +-
fs/compat_binfmt_elf.c | 12 --
include/asm-generic/mm_hooks.h | 9 +-
include/linux/elf.h | 24 ++-
include/linux/mm.h | 3 +-
include/linux/mm_types.h | 21 ++-
kernel/fork.c | 1 +
mm/mmap.c | 28 ++--
mm/mremap.c | 2 +-
tools/testing/selftests/x86/.gitignore | 1 +
tools/testing/selftests/x86/Makefile | 11 +-
.../testing/selftests/x86/test_munmap_vdso.c | 151 ++++++++++++++++++
91 files changed, 511 insertions(+), 491 deletions(-)
create mode 100644 tools/testing/selftests/x86/test_munmap_vdso.c


base-commit: 614124bea77e452aa6df7a8714e8bc820b489922
--
2.31.1


2021-06-11 18:06:37

by Dmitry Safonov

Subject: [PATCH v3 05/23] x86: Remove compat_arch_setup_additional_pages()

The same as for an x32 task, detect an ia32 task by in_ia32_syscall().
A process that has just done execve() on a compat ELF is doing compat
syscalls after the personality is set in load_elf_binary(); see the
comment near in_32bit_syscall().

Removing compat_arch_setup_additional_pages() provides a single point
of entry - arch_setup_additional_pages() - makes the ifdeffery easier
to read, and aligns the code with powerpc and sparc (mips also has a
single vdso setup function, but instead of taking the bitness from
mm.context, it takes a vdso image pointer there).
Together with aligning the arm64 code to use in_compat_task(), this
makes it possible to remove the compat_arch_setup_additional_pages()
macro redefinition from the compat elf code (one redefined macro less).

Cc: [email protected]
Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/entry/vdso/vma.c | 48 +++++++++++++++-----------------------
arch/x86/include/asm/elf.h | 5 ----
2 files changed, 19 insertions(+), 34 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 43d42ce82e86..99415ffb9501 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -375,50 +375,40 @@ int map_vdso_once(const struct vdso_image *image, unsigned long addr)
return map_vdso(image, addr);
}

-#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
-static int load_vdso32(void)
-{
- if (vdso32_enabled != 1) /* Other values all mean "disabled" */
- return 0;
-
- return map_vdso(&vdso_image_32, 0);
-}
-#endif
-
#ifdef CONFIG_X86_64
-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+static int load_vdso_64(void)
{
if (!vdso64_enabled)
return 0;

- return map_vdso_randomized(&vdso_image_64);
-}
-
-#ifdef CONFIG_COMPAT
-int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp)
-{
#ifdef CONFIG_X86_X32_ABI
- if (in_x32_syscall()) {
- if (!vdso64_enabled)
- return 0;
+ if (in_x32_syscall())
return map_vdso_randomized(&vdso_image_x32);
- }
#endif
-#ifdef CONFIG_IA32_EMULATION
- return load_vdso32();
+
+ return map_vdso_randomized(&vdso_image_64);
+}
#else
- return 0;
-#endif
+static int load_vdso_64(void)
+{
+ WARN_ON_ONCE(1);
+ return -ENODATA;
}
#endif
-#else
+
int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
{
- return load_vdso32();
-}
+#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
+ if (in_ia32_syscall()) {
+ if (vdso32_enabled != 1) /* Other values all mean "disabled" */
+ return 0;
+ return map_vdso(&vdso_image_32, 0);
+ }
#endif

+ return load_vdso_64();
+}
+
bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
{
#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 9ee5b3b3ba93..93ff2c7ca4df 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -377,11 +377,6 @@ else if (IS_ENABLED(CONFIG_IA32_EMULATION)) \
((unsigned long)current->mm->context.vdso + \
vdso_image_32.sym___kernel_vsyscall)

-struct linux_binprm;
-extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-#define compat_arch_setup_additional_pages compat_arch_setup_additional_pages
-
extern bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs);

/* Do not change the values. See get_align_mask() */
--
2.31.1

2021-06-11 18:06:38

by Dmitry Safonov

Subject: [PATCH v3 13/23] mm/mmap: Make vm_special_mapping::mremap return void

Previously, the .mremap() callback had to return an (int) to provide
a way to restrict resizing of a special mapping. Now resizing is
restricted by providing .may_split = special_mapping_split.

Removing the (int) return simplifies further changes to
special_mapping_mremap(), as it won't need to save the return code
from the callback. It also removes the needless `return 0` from
the callbacks.
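
For context, a condensed copy of the split restriction this relies on
(from mm/mmap.c around this series' base; growing is separately
prevented because _install_special_mapping() sets VM_DONTEXPAND):

static int special_mapping_split(struct vm_area_struct *vma,
				 unsigned long addr)
{
	/*
	 * Forbid splitting special mappings - the kernel has
	 * expectations about the number of pages in the mapping,
	 * and shrinking via mremap()/munmap() would split the VMA.
	 */
	return -EINVAL;
}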

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm/kernel/process.c | 3 +--
arch/arm64/kernel/vdso.c | 4 +---
arch/mips/vdso/genvdso.c | 3 +--
arch/x86/entry/vdso/vma.c | 4 +---
include/linux/mm_types.h | 4 ++--
mm/mmap.c | 2 +-
6 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 5897ccb88bca..b863c5525b5d 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -387,11 +387,10 @@ static unsigned long sigpage_addr(const struct mm_struct *mm,
static struct page *signal_page;
extern struct page *get_signal_page(void);

-static int sigpage_mremap(const struct vm_special_mapping *sm,
+static void sigpage_mremap(const struct vm_special_mapping *sm,
struct vm_area_struct *new_vma)
{
current->mm->context.sigpage = new_vma->vm_start;
- return 0;
}

static const struct vm_special_mapping sigpage_mapping = {
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index c0512c2e8183..680415e0098c 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -78,12 +78,10 @@ static union {
} vdso_data_store __page_aligned_data;
struct vdso_data *vdso_data = vdso_data_store.data;

-static int vdso_mremap(const struct vm_special_mapping *sm,
+static void vdso_mremap(const struct vm_special_mapping *sm,
struct vm_area_struct *new_vma)
{
current->mm->context.vdso = (void *)new_vma->vm_start;
-
- return 0;
}

static int __init __vdso_init(enum vdso_abi abi)
diff --git a/arch/mips/vdso/genvdso.c b/arch/mips/vdso/genvdso.c
index 09e30eb4be86..0303d30cde03 100644
--- a/arch/mips/vdso/genvdso.c
+++ b/arch/mips/vdso/genvdso.c
@@ -259,13 +259,12 @@ int main(int argc, char **argv)
fprintf(out_file, "#include <linux/linkage.h>\n");
fprintf(out_file, "#include <linux/mm.h>\n");
fprintf(out_file, "#include <asm/vdso.h>\n");
- fprintf(out_file, "static int vdso_mremap(\n");
+ fprintf(out_file, "static void vdso_mremap(\n");
fprintf(out_file, " const struct vm_special_mapping *sm,\n");
fprintf(out_file, " struct vm_area_struct *new_vma)\n");
fprintf(out_file, "{\n");
fprintf(out_file, " current->mm->context.vdso =\n");
fprintf(out_file, " (void *)(new_vma->vm_start);\n");
- fprintf(out_file, " return 0;\n");
fprintf(out_file, "}\n");

/* Write out the stripped VDSO data. */
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index f1abe43aadb9..a286d44751be 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -86,15 +86,13 @@ static void vdso_fix_landing(const struct vdso_image *image,
#endif
}

-static int vdso_mremap(const struct vm_special_mapping *sm,
+static void vdso_mremap(const struct vm_special_mapping *sm,
struct vm_area_struct *new_vma)
{
const struct vdso_image *image = current->mm->context.vdso_image;

vdso_fix_landing(image, new_vma);
current->mm->context.vdso = (void __user *)new_vma->vm_start;
-
- return 0;
}

#ifdef CONFIG_TIME_NS
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5aacc1c10a45..e9c5f2051f08 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -770,8 +770,8 @@ struct vm_special_mapping {
struct vm_area_struct *vma,
struct vm_fault *vmf);

- int (*mremap)(const struct vm_special_mapping *sm,
- struct vm_area_struct *new_vma);
+ void (*mremap)(const struct vm_special_mapping *sm,
+ struct vm_area_struct *new_vma);
};

enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index 0584e540246e..4f0d62409b1c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3401,7 +3401,7 @@ static int special_mapping_mremap(struct vm_area_struct *new_vma)
return -EFAULT;

if (sm->mremap)
- return sm->mremap(sm, new_vma);
+ sm->mremap(sm, new_vma);

return 0;
}
--
2.31.1

2021-06-11 18:06:50

by Dmitry Safonov

Subject: [PATCH v3 18/23] arm/vdso: Migrate to generic vdso_base

Use the generic way to track the landing vma area.
As a bonus, after unmapping the sigpage, the kernel won't try to land
on its previous position (due to the UNMAPPED_VDSO_BASE check instead
of a `context.vdso != 0`-style check).
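
For context, the assumed sentinel semantics (UNMAPPED_VDSO_BASE and
mm->vdso_base themselves are introduced in patch 16, "mm: Add
vdso_base in mm_struct"; this sketch only restates the design and is
not a hunk from that patch):

	/*
	 * mm->vdso_base is 0 in a fresh mm, set by
	 * arch_setup_additional_pages() on success, and poisoned with
	 * UNMAPPED_VDSO_BASE once the special mapping is munmap()ed.
	 */
	if ((unsigned long)mm->vdso_base == UNMAPPED_VDSO_BASE)
		return 1;	/* refuse to land; caller forces SIGSEGV */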

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm/Kconfig | 1 +
arch/arm/kernel/process.c | 9 +--------
arch/arm/kernel/signal.c | 6 +++++-
3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 2df5ad505b8b..edf1cbb908a9 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -23,6 +23,7 @@ config ARM
select ARCH_HAS_SYNC_DMA_FOR_CPU if SWIOTLB
select ARCH_HAS_TEARDOWN_DMA_OPS if MMU
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
+ select ARCH_HAS_VDSO_BASE
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAVE_NMI_SAFE_CMPXCHG if CPU_V7 || CPU_V7M || CPU_V6K
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index b863c5525b5d..3a5975d1ace6 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -387,16 +387,9 @@ static unsigned long sigpage_addr(const struct mm_struct *mm,
static struct page *signal_page;
extern struct page *get_signal_page(void);

-static void sigpage_mremap(const struct vm_special_mapping *sm,
- struct vm_area_struct *new_vma)
-{
- current->mm->context.sigpage = new_vma->vm_start;
-}
-
static const struct vm_special_mapping sigpage_mapping = {
.name = "[sigpage]",
.pages = &signal_page,
- .mremap = sigpage_mremap,
};

int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
@@ -434,7 +427,7 @@ int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
goto up_fail;
}

- mm->context.sigpage = addr;
+ mm->vdso_base = (void __user *)addr;

/* Unlike the sigpage, failure to install the vdso is unlikely
* to be fatal to the process, so no error check needed
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index a3a38d0a4c85..6c0507e84e24 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -451,13 +451,17 @@ setup_return(struct pt_regs *regs, struct ksignal *ksig,
#ifdef CONFIG_MMU
if (cpsr & MODE32_BIT) {
struct mm_struct *mm = current->mm;
+ unsigned long land = (unsigned long)mm->vdso_base;
+
+ if (land == UNMAPPED_VDSO_BASE)
+ return 1;

/*
* 32-bit code can use the signal return page
* except when the MPU has protected the vectors
* page from PL0
*/
- retcode = mm->context.sigpage + signal_return_offset +
+ retcode = land + signal_return_offset +
(idx << 2) + thumb;
} else
#endif
--
2.31.1

2021-06-11 18:06:51

by Dmitry Safonov

Subject: [PATCH v3 08/23] elf/vdso: Modify arch_setup_additional_pages() parameters

Both parameters of arch_setup_additional_pages() are currently unused.
Commit fc5243d98ac2 ("[S390] arch_setup_additional_pages arguments")
tried to introduce useful arguments, but they are still not used.

Remove the old parameters and introduce a sysinfo_ehdr argument that
is used to return the vdso address to put in the auxiliary vector as
the AT_SYSINFO_EHDR tag. The reason for adding this parameter is that
many architectures keep a vDSO pointer in their mm->context with the
only purpose of using it later in ARCH_DLINFO, which is the macro the
elf loader uses to set up the sysinfo_ehdr tag.

Returning the sysinfo_ehdr address, which is later used as an argument
to ARCH_DLINFO, allows dropping the vDSO pointer from mm->context,
along with any code responsible for tracking the vDSO position, on
platforms that don't use the vDSO as a signal landing site in
userspace (arm/s390/sparc).
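
Condensed from the diffs below, the new contract looks like this on
the loader side (fs/binfmt_elf.c; the ARCH_DLINFO hookup is patch 9):

	unsigned long sysinfo_ehdr = 0;

	retval = arch_setup_additional_pages(&sysinfo_ehdr);
	if (retval < 0)
		goto out;
	...
	/* later, while building the auxiliary vector: */
	ARCH_DLINFO(sysinfo_ehdr);	/* NEW_AUX_ENT(AT_SYSINFO_EHDR, ...) */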

Cc: Albert Ou <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm/include/asm/vdso.h | 6 ++++--
arch/arm/kernel/process.c | 4 ++--
arch/arm/kernel/vdso.c | 10 +++++++---
arch/arm64/kernel/vdso.c | 17 ++++++++---------
arch/csky/kernel/vdso.c | 11 ++++++-----
arch/hexagon/kernel/vdso.c | 3 ++-
arch/mips/kernel/vdso.c | 3 ++-
arch/nds32/kernel/vdso.c | 3 ++-
arch/nios2/mm/init.c | 2 +-
arch/powerpc/kernel/vdso.c | 12 +++++++-----
arch/riscv/kernel/vdso.c | 11 ++++++-----
arch/s390/kernel/vdso.c | 3 ++-
arch/sh/kernel/vsyscall/vsyscall.c | 3 ++-
arch/sparc/vdso/vma.c | 14 +++++++-------
arch/x86/entry/vdso/vma.c | 26 +++++++++++++++-----------
arch/x86/um/vdso/vma.c | 2 +-
fs/binfmt_elf.c | 3 ++-
fs/binfmt_elf_fdpic.c | 3 ++-
include/linux/elf.h | 17 ++++++++++++-----
19 files changed, 90 insertions(+), 63 deletions(-)

diff --git a/arch/arm/include/asm/vdso.h b/arch/arm/include/asm/vdso.h
index 5b85889f82ee..6b2b3b1fe833 100644
--- a/arch/arm/include/asm/vdso.h
+++ b/arch/arm/include/asm/vdso.h
@@ -10,13 +10,15 @@ struct mm_struct;

#ifdef CONFIG_VDSO

-void arm_install_vdso(struct mm_struct *mm, unsigned long addr);
+void arm_install_vdso(struct mm_struct *mm, unsigned long addr,
+ unsigned long *sysinfo_ehdr);

extern unsigned int vdso_total_pages;

#else /* CONFIG_VDSO */

-static inline void arm_install_vdso(struct mm_struct *mm, unsigned long addr)
+static inline void arm_install_vdso(struct mm_struct *mm, unsigned long addr,
+ unsigned long *sysinfo_ehdr)
{
}

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 6324f4db9b02..5897ccb88bca 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -400,7 +400,7 @@ static const struct vm_special_mapping sigpage_mapping = {
.mremap = sigpage_mremap,
};

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
@@ -441,7 +441,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
* to be fatal to the process, so no error check needed
* here.
*/
- arm_install_vdso(mm, addr + PAGE_SIZE);
+ arm_install_vdso(mm, addr + PAGE_SIZE, sysinfo_ehdr);

up_fail:
mmap_write_unlock(mm);
diff --git a/arch/arm/kernel/vdso.c b/arch/arm/kernel/vdso.c
index 015eff0a6e93..de516f8ba85a 100644
--- a/arch/arm/kernel/vdso.c
+++ b/arch/arm/kernel/vdso.c
@@ -233,7 +233,8 @@ static int install_vvar(struct mm_struct *mm, unsigned long addr)
}

/* assumes mmap_lock is write-locked */
-void arm_install_vdso(struct mm_struct *mm, unsigned long addr)
+void arm_install_vdso(struct mm_struct *mm, unsigned long addr,
+ unsigned long *sysinfo_ehdr)
{
struct vm_area_struct *vma;
unsigned long len;
@@ -252,7 +253,10 @@ void arm_install_vdso(struct mm_struct *mm, unsigned long addr)
VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
&vdso_text_mapping);

- if (!IS_ERR(vma))
- mm->context.vdso = addr;
+ if (IS_ERR(vma))
+ return;
+
+ mm->context.vdso = addr;
+ *sysinfo_ehdr = addr;
}

diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 1bc8adefa293..c0512c2e8183 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -213,8 +213,7 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,

static int __setup_additional_pages(enum vdso_abi abi,
struct mm_struct *mm,
- struct linux_binprm *bprm,
- int uses_interp)
+ unsigned long *sysinfo_ehdr)
{
unsigned long vdso_base, vdso_text_len, vdso_mapping_len;
unsigned long gp_flags = 0;
@@ -248,6 +247,8 @@ static int __setup_additional_pages(enum vdso_abi abi,
return PTR_ERR(ret);

mm->context.vdso = (void *)vdso_base;
+ *sysinfo_ehdr = vdso_base;
+
return 0;
}

@@ -405,8 +406,7 @@ static int aarch32_sigreturn_setup(struct mm_struct *mm)
return PTR_ERR_OR_ZERO(ret);
}

-static int aarch32_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp)
+static int aarch32_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
int ret;
@@ -416,8 +416,7 @@ static int aarch32_setup_additional_pages(struct linux_binprm *bprm,
return ret;

if (IS_ENABLED(CONFIG_COMPAT_VDSO)) {
- ret = __setup_additional_pages(VDSO_ABI_AA32, mm, bprm,
- uses_interp);
+ ret = __setup_additional_pages(VDSO_ABI_AA32, mm, sysinfo_ehdr);
if (ret)
return ret;
}
@@ -451,7 +450,7 @@ static int __init vdso_init(void)
}
arch_initcall(vdso_init);

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
int ret;
@@ -460,9 +459,9 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
return -EINTR;

if (is_compat_task())
- ret = aarch32_setup_additional_pages(bprm, uses_interp);
+ ret = aarch32_setup_additional_pages(sysinfo_ehdr);
else
- ret = __setup_additional_pages(VDSO_ABI_AA64, mm, bprm, uses_interp);
+ ret = __setup_additional_pages(VDSO_ABI_AA64, mm, sysinfo_ehdr);

mmap_write_unlock(mm);

diff --git a/arch/csky/kernel/vdso.c b/arch/csky/kernel/vdso.c
index 16c20d64d165..30160e64ee2d 100644
--- a/arch/csky/kernel/vdso.c
+++ b/arch/csky/kernel/vdso.c
@@ -52,11 +52,10 @@ static int __init vdso_init(void)
}
arch_initcall(vdso_init);

-int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
- unsigned long vdso_base, vdso_len;
+ unsigned long vdso_base, vvar_base, vdso_len;
int ret;

vdso_len = (vdso_pages + 1) << PAGE_SHIFT;
@@ -85,12 +84,14 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
goto end;
}

- vdso_base += (vdso_pages << PAGE_SHIFT);
- ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
+ vvar_base = vdso_base + (vdso_pages << PAGE_SHIFT);
+ ret = install_special_mapping(mm, vvar_base, PAGE_SIZE,
(VM_READ | VM_MAYREAD), &vdso_pagelist[vdso_pages]);

if (unlikely(ret))
mm->context.vdso = NULL;
+ else
+ *sysinfo_ehdr = vdso_base;
end:
mmap_write_unlock(mm);
return ret;
diff --git a/arch/hexagon/kernel/vdso.c b/arch/hexagon/kernel/vdso.c
index b70970ac809f..39e78fe82b99 100644
--- a/arch/hexagon/kernel/vdso.c
+++ b/arch/hexagon/kernel/vdso.c
@@ -46,7 +46,7 @@ arch_initcall(vdso_init);
/*
* Called from binfmt_elf. Create a VMA for the vDSO page.
*/
-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
int ret;
unsigned long vdso_base;
@@ -74,6 +74,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
goto up_fail;

mm->context.vdso = (void *)vdso_base;
+ *sysinfo_ehdr = vdso_base;

up_fail:
mmap_write_unlock(mm);
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 3d0cf471f2fe..9b2e1d2250b4 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -86,7 +86,7 @@ static unsigned long vdso_base(void)
return base;
}

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mips_vdso_image *image = current->thread.abi->vdso;
struct mm_struct *mm = current->mm;
@@ -185,6 +185,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
}

mm->context.vdso = (void *)vdso_addr;
+ *sysinfo_ehdr = vdso_addr;
ret = 0;

out:
diff --git a/arch/nds32/kernel/vdso.c b/arch/nds32/kernel/vdso.c
index 2d1d51a0fc64..1d35a33389e5 100644
--- a/arch/nds32/kernel/vdso.c
+++ b/arch/nds32/kernel/vdso.c
@@ -111,7 +111,7 @@ unsigned long inline vdso_random_addr(unsigned long vdso_mapping_len)
return addr;
}

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
unsigned long vdso_base, vdso_text_len, vdso_mapping_len;
@@ -185,6 +185,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
}

mm->context.vdso = (void *)vdso_base;
+ *sysinfo_ehdr = vdso_base;

up_fail:
mmap_write_unlock(mm);
diff --git a/arch/nios2/mm/init.c b/arch/nios2/mm/init.c
index 613fcaa5988a..0164f5cdb255 100644
--- a/arch/nios2/mm/init.c
+++ b/arch/nios2/mm/init.c
@@ -103,7 +103,7 @@ static int alloc_kuser_page(void)
}
arch_initcall(alloc_kuser_page);

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
int ret;
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 76e898b56002..6d6e575630c1 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -190,7 +190,7 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
* This is called from binfmt_elf, we create the special vma for the
* vDSO and insert it into the mm struct tree
*/
-static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+static int __arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
unsigned long vdso_size, vdso_base, mappings_size;
struct vm_special_mapping *vdso_spec;
@@ -248,15 +248,17 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
vma = _install_special_mapping(mm, vdso_base + vvar_size, vdso_size,
VM_READ | VM_EXEC | VM_MAYREAD |
VM_MAYWRITE | VM_MAYEXEC, vdso_spec);
- if (IS_ERR(vma))
+ if (IS_ERR(vma)) {
do_munmap(mm, vdso_base, vvar_size, NULL);
- else
+ } else {
mm->context.vdso = (void __user *)vdso_base + vvar_size;
+ *sysinfo_ehdr = vdso_base + vvar_size;
+ }

return PTR_ERR_OR_ZERO(vma);
}

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
int rc;
@@ -266,7 +268,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
if (mmap_write_lock_killable(mm))
return -EINTR;

- rc = __arch_setup_additional_pages(bprm, uses_interp);
+ rc = __arch_setup_additional_pages(sysinfo_ehdr);
if (rc)
mm->context.vdso = NULL;

diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
index 25a3b8849599..9cbbad8e48da 100644
--- a/arch/riscv/kernel/vdso.c
+++ b/arch/riscv/kernel/vdso.c
@@ -56,11 +56,10 @@ static int __init vdso_init(void)
}
arch_initcall(vdso_init);

-int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
- unsigned long vdso_base, vdso_len;
+ unsigned long vdso_base, vvar_base, vdso_len;
int ret;

vdso_len = (vdso_pages + 1) << PAGE_SHIFT;
@@ -89,12 +88,14 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
goto end;
}

- vdso_base += (vdso_pages << PAGE_SHIFT);
- ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
+ vvar_base = vdso_base + (vdso_pages << PAGE_SHIFT);
+ ret = install_special_mapping(mm, vvar_base, PAGE_SIZE,
(VM_READ | VM_MAYREAD), &vdso_pagelist[vdso_pages]);

if (unlikely(ret))
mm->context.vdso = NULL;
+ else
+ *sysinfo_ehdr = vdso_base;
end:
mmap_write_unlock(mm);
return ret;
diff --git a/arch/s390/kernel/vdso.c b/arch/s390/kernel/vdso.c
index 8c4e07d533c8..8a72fdedbae9 100644
--- a/arch/s390/kernel/vdso.c
+++ b/arch/s390/kernel/vdso.c
@@ -167,7 +167,7 @@ int vdso_getcpu_init(void)
}
early_initcall(vdso_getcpu_init); /* Must be called before SMP init */

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
unsigned long vdso_text_len, vdso_mapping_len;
unsigned long vvar_start, vdso_text_start;
@@ -204,6 +204,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
rc = PTR_ERR(vma);
} else {
current->mm->context.vdso_base = vdso_text_start;
+ *sysinfo_ehdr = vdso_text_start;
rc = 0;
}
out:
diff --git a/arch/sh/kernel/vsyscall/vsyscall.c b/arch/sh/kernel/vsyscall/vsyscall.c
index 1bd85a6949c4..de8df3261b4f 100644
--- a/arch/sh/kernel/vsyscall/vsyscall.c
+++ b/arch/sh/kernel/vsyscall/vsyscall.c
@@ -55,7 +55,7 @@ int __init vsyscall_init(void)
}

/* Setup a VMA at program startup for the vsyscall page */
-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
unsigned long addr;
@@ -78,6 +78,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
goto up_fail;

current->mm->context.vdso = (void *)addr;
+ *sysinfo_ehdr = addr;

up_fail:
mmap_write_unlock(mm);
diff --git a/arch/sparc/vdso/vma.c b/arch/sparc/vdso/vma.c
index d8a344f6c914..ae635893f9b3 100644
--- a/arch/sparc/vdso/vma.c
+++ b/arch/sparc/vdso/vma.c
@@ -346,8 +346,6 @@ static int __init init_vdso(void)
}
subsys_initcall(init_vdso);

-struct linux_binprm;
-
/* Shuffle the vdso up a bit, randomly. */
static unsigned long vdso_addr(unsigned long start, unsigned int len)
{
@@ -359,7 +357,8 @@ static unsigned long vdso_addr(unsigned long start, unsigned int len)
}

static int map_vdso(const struct vdso_image *image,
- struct vm_special_mapping *vdso_mapping)
+ struct vm_special_mapping *vdso_mapping,
+ unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
@@ -416,6 +415,7 @@ static int map_vdso(const struct vdso_image *image,
do_munmap(mm, text_start, image->size, NULL);
} else {
current->mm->context.vdso = (void __user *)text_start;
+ *sysinfo_ehdr = text_start;
}

up_fail:
@@ -423,7 +423,7 @@ static int map_vdso(const struct vdso_image *image,
return ret;
}

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{

if (!vdso_enabled)
@@ -431,11 +431,11 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)

#if defined CONFIG_COMPAT
if (!(is_32bit_task()))
- return map_vdso(&vdso_image_64_builtin, &vdso_mapping64);
+ return map_vdso(&vdso_image_64_builtin, &vdso_mapping64, sysinfo_ehdr);
else
- return map_vdso(&vdso_image_32_builtin, &vdso_mapping32);
+ return map_vdso(&vdso_image_32_builtin, &vdso_mapping32, sysinfo_ehdr);
#else
- return map_vdso(&vdso_image_64_builtin, &vdso_mapping64);
+ return map_vdso(&vdso_image_64_builtin, &vdso_mapping64, sysinfo_ehdr);
#endif

}
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 99415ffb9501..f1abe43aadb9 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -243,7 +243,8 @@ static const struct vm_special_mapping vvar_mapping = {
* @image - blob to map
* @addr - request a specific address (zero to map at free addr)
*/
-static int map_vdso(const struct vdso_image *image, unsigned long addr)
+static int map_vdso(const struct vdso_image *image, unsigned long addr,
+ unsigned long *sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
@@ -290,6 +291,7 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
} else {
current->mm->context.vdso = (void __user *)text_start;
current->mm->context.vdso_image = image;
+ *sysinfo_ehdr = text_start;
}

up_fail:
@@ -342,11 +344,12 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
return addr;
}

-static int map_vdso_randomized(const struct vdso_image *image)
+static int map_vdso_randomized(const struct vdso_image *image,
+ unsigned long *sysinfo_ehdr)
{
unsigned long addr = vdso_addr(current->mm->start_stack, image->size-image->sym_vvar_start);

- return map_vdso(image, addr);
+ return map_vdso(image, addr, sysinfo_ehdr);
}
#endif

@@ -354,6 +357,7 @@ int map_vdso_once(const struct vdso_image *image, unsigned long addr)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
+ unsigned long unused;

mmap_write_lock(mm);
/*
@@ -372,41 +376,41 @@ int map_vdso_once(const struct vdso_image *image, unsigned long addr)
}
mmap_write_unlock(mm);

- return map_vdso(image, addr);
+ return map_vdso(image, addr, &unused);
}

#ifdef CONFIG_X86_64
-static int load_vdso_64(void)
+static int load_vdso_64(unsigned long *sysinfo_ehdr)
{
if (!vdso64_enabled)
return 0;

#ifdef CONFIG_X86_X32_ABI
if (in_x32_syscall())
- return map_vdso_randomized(&vdso_image_x32);
+ return map_vdso_randomized(&vdso_image_x32, sysinfo_ehdr);
#endif

- return map_vdso_randomized(&vdso_image_64);
+ return map_vdso_randomized(&vdso_image_64, sysinfo_ehdr);
}
#else
-static int load_vdso_64(void)
+static int load_vdso_64(unsigned long *sysinfo_ehdr)
{
WARN_ON_ONCE(1);
return -ENODATA;
}
#endif

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
if (in_ia32_syscall()) {
if (vdso32_enabled != 1) /* Other values all mean "disabled" */
return 0;
- return map_vdso(&vdso_image_32, 0);
+ return map_vdso(&vdso_image_32, 0, sysinfo_ehdr);
}
#endif

- return load_vdso_64();
+ return load_vdso_64(sysinfo_ehdr);
}

bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
diff --git a/arch/x86/um/vdso/vma.c b/arch/x86/um/vdso/vma.c
index 76d9f6ce7a3d..77488065f7cc 100644
--- a/arch/x86/um/vdso/vma.c
+++ b/arch/x86/um/vdso/vma.c
@@ -50,7 +50,7 @@ static int __init init_vdso(void)
}
subsys_initcall(init_vdso);

-int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
int err;
struct mm_struct *mm = current->mm;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index dac2713c10ee..62741e55e3d1 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -836,6 +836,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
unsigned long interp_load_addr = 0;
unsigned long start_code, end_code, start_data, end_data;
unsigned long reloc_func_desc __maybe_unused = 0;
+ unsigned long sysinfo_ehdr = 0;
int executable_stack = EXSTACK_DEFAULT;
struct elfhdr *elf_ex = (struct elfhdr *)bprm->buf;
struct elfhdr *interp_elf_ex = NULL;
@@ -1252,7 +1253,7 @@ static int load_elf_binary(struct linux_binprm *bprm)

set_binfmt(&elf_format);

- retval = arch_setup_additional_pages(bprm, !!interpreter);
+ retval = arch_setup_additional_pages(&sysinfo_ehdr);
if (retval < 0)
goto out;

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 11cbf20b19da..421a09bc6ee6 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -183,6 +183,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm)
{
struct elf_fdpic_params exec_params, interp_params;
struct pt_regs *regs = current_pt_regs();
+ unsigned long sysinfo_ehdr = 0;
struct elf_phdr *phdr;
unsigned long stack_size, entryaddr;
#ifdef ELF_FDPIC_PLAT_INIT
@@ -375,7 +376,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm)
if (retval < 0)
goto error;

- retval = arch_setup_additional_pages(bprm, !!interpreter_name);
+ retval = arch_setup_additional_pages(&sysinfo_ehdr);
if (retval < 0)
goto error;
#endif
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 95bf7a1abaef..a8bea5611a4b 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -104,13 +104,20 @@ static inline int arch_elf_adjust_prot(int prot,
}
#endif

-struct linux_binprm;
#ifdef CONFIG_ARCH_HAS_SETUP_ADDITIONAL_PAGES
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
+/**
+ * arch_setup_additional_pages - Premap VMAs in a new-execed process
+ * @sysinfo_ehdr: Returns vDSO position to be set in the initial
+ * auxiliary vector (tag AT_SYSINFO_EHDR) by binfmt
+ * loader. On failure isn't initialized.
+ * As address == 0 is never used, it allows to check
+ * if the tag should be set.
+ *
+ * Return: Zero if successful, or a negative error code on failure.
+ */
+extern int arch_setup_additional_pages(unsigned long *sysinfo_ehdr);
#else
-static inline int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp)
+static inline int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
{
return 0;
}
--
2.31.1

2021-06-11 18:07:04

by Dmitry Safonov

Subject: [PATCH v3 10/23] arm/vdso: Remove vdso pointer from mm->context

Not used any more: sysinfo_ehdr is now passed back from
arch_setup_additional_pages() to set the AT_SYSINFO_EHDR tag.
.vdso_mremap() existed only to track the position of context.vdso
across mremap() syscalls; remove it too.

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm/include/asm/mmu.h | 3 ---
arch/arm/kernel/vdso.c | 10 ----------
2 files changed, 13 deletions(-)

diff --git a/arch/arm/include/asm/mmu.h b/arch/arm/include/asm/mmu.h
index 1592a4264488..2397b0a19f59 100644
--- a/arch/arm/include/asm/mmu.h
+++ b/arch/arm/include/asm/mmu.h
@@ -12,9 +12,6 @@ typedef struct {
#endif
unsigned int vmalloc_seq;
unsigned long sigpage;
-#ifdef CONFIG_VDSO
- unsigned long vdso;
-#endif
#ifdef CONFIG_BINFMT_ELF_FDPIC
unsigned long exec_fdpic_loadmap;
unsigned long interp_fdpic_loadmap;
diff --git a/arch/arm/kernel/vdso.c b/arch/arm/kernel/vdso.c
index de516f8ba85a..4b39c7d8f525 100644
--- a/arch/arm/kernel/vdso.c
+++ b/arch/arm/kernel/vdso.c
@@ -47,17 +47,8 @@ static const struct vm_special_mapping vdso_data_mapping = {
.pages = &vdso_data_page,
};

-static int vdso_mremap(const struct vm_special_mapping *sm,
- struct vm_area_struct *new_vma)
-{
- current->mm->context.vdso = new_vma->vm_start;
-
- return 0;
-}
-
static struct vm_special_mapping vdso_text_mapping __ro_after_init = {
.name = "[vdso]",
- .mremap = vdso_mremap,
};

struct elfinfo {
@@ -256,7 +247,6 @@ void arm_install_vdso(struct mm_struct *mm, unsigned long addr,
if (IS_ERR(vma))
return;

- mm->context.vdso = addr;
*sysinfo_ehdr = addr;
}

--
2.31.1

2021-06-11 18:07:04

by Dmitry Safonov

Subject: [PATCH v3 09/23] elf: Use sysinfo_ehdr in ARCH_DLINFO()

Instead of mm->context.vdso, use the pointer provided by the elf
loader. That allows dropping the pointer on arm/s390/sparc.
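
Condensed from the diffs below, the per-arch change boils down to:

	/* before: each arch reached into mm->context */
	NEW_AUX_ENT(AT_SYSINFO_EHDR,
		    (elf_addr_t)current->mm->context.vdso);

	/* after: the loader passes what arch_setup_additional_pages()
	 * returned via its sysinfo_ehdr argument */
	NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr);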

Cc: Christian Borntraeger <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/alpha/include/asm/elf.h | 2 +-
arch/arm/include/asm/elf.h | 5 ++---
arch/arm64/include/asm/elf.h | 18 +++++-------------
arch/ia64/include/asm/elf.h | 2 +-
arch/mips/include/asm/elf.h | 5 ++---
arch/nds32/include/asm/elf.h | 5 ++---
arch/powerpc/include/asm/elf.h | 4 ++--
arch/riscv/include/asm/elf.h | 5 ++---
arch/s390/include/asm/elf.h | 5 ++---
arch/sh/include/asm/elf.h | 10 +++++-----
arch/sparc/include/asm/elf_64.h | 5 ++---
arch/x86/include/asm/elf.h | 33 ++++++++++++++-------------------
arch/x86/um/asm/elf.h | 4 ++--
fs/binfmt_elf.c | 6 +++---
fs/binfmt_elf_fdpic.c | 11 ++++++-----
15 files changed, 51 insertions(+), 69 deletions(-)

diff --git a/arch/alpha/include/asm/elf.h b/arch/alpha/include/asm/elf.h
index 8049997fa372..701e820f28f0 100644
--- a/arch/alpha/include/asm/elf.h
+++ b/arch/alpha/include/asm/elf.h
@@ -155,7 +155,7 @@ extern int alpha_l2_cacheshape;
extern int alpha_l3_cacheshape;

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
NEW_AUX_ENT(AT_L1I_CACHESHAPE, alpha_l1i_cacheshape); \
NEW_AUX_ENT(AT_L1D_CACHESHAPE, alpha_l1d_cacheshape); \
diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index 47347d7412ec..76a0f04190f0 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -138,10 +138,9 @@ extern void elf_set_personality(const struct elf32_hdr *);
#define SET_PERSONALITY(ex) elf_set_personality(&(ex))

#ifdef CONFIG_VDSO
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (elf_addr_t)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)
#endif

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index a81953bcc1cf..e62818967a69 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -165,10 +165,9 @@ typedef struct user_fpsimd_state elf_fpregset_t;
})

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (elf_addr_t)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
\
/* \
* Should always be nonzero unless there's a kernel bug. \
@@ -223,19 +222,12 @@ typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
set_thread_flag(TIF_32BIT); \
})
#ifdef CONFIG_COMPAT_VDSO
-#define COMPAT_ARCH_DLINFO \
+#define COMPAT_ARCH_DLINFO(sysinfo_ehdr) \
do { \
- /* \
- * Note that we use Elf64_Off instead of elf_addr_t because \
- * elf_addr_t in compat is defined as Elf32_Addr and casting \
- * current->mm->context.vdso to it triggers a cast warning of \
- * cast from pointer to integer of different size. \
- */ \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (Elf64_Off)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)
#else
-#define COMPAT_ARCH_DLINFO
+#define COMPAT_ARCH_DLINFO(sysinfo_ehdr)
#endif

#endif /* CONFIG_COMPAT */
diff --git a/arch/ia64/include/asm/elf.h b/arch/ia64/include/asm/elf.h
index 6629301a2620..a257e5abddce 100644
--- a/arch/ia64/include/asm/elf.h
+++ b/arch/ia64/include/asm/elf.h
@@ -208,7 +208,7 @@ struct task_struct;
#define GATE_EHDR ((const struct elfhdr *) GATE_ADDR)

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
extern char __kernel_syscall_via_epc[]; \
NEW_AUX_ENT(AT_SYSINFO, (unsigned long) __kernel_syscall_via_epc); \
diff --git a/arch/mips/include/asm/elf.h b/arch/mips/include/asm/elf.h
index a5c8be47a39d..672a32fa59d9 100644
--- a/arch/mips/include/asm/elf.h
+++ b/arch/mips/include/asm/elf.h
@@ -456,10 +456,9 @@ extern const char *__elf_base_platform;
#define ELF_ET_DYN_BASE (TASK_SIZE / 3 * 2)

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (unsigned long)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)

#ifdef CONFIG_MIPS_FP_SUPPORT
diff --git a/arch/nds32/include/asm/elf.h b/arch/nds32/include/asm/elf.h
index 36cec4ae5a84..4f5894208efe 100644
--- a/arch/nds32/include/asm/elf.h
+++ b/arch/nds32/include/asm/elf.h
@@ -165,13 +165,12 @@ struct elf32_hdr;
#define FPU_AUX_ENT NEW_AUX_ENT(AT_IGNORE, 0)
#endif

-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
/* Optional FPU initialization */ \
FPU_AUX_ENT; \
\
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (elf_addr_t)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)

#endif
diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h
index d7d9820c9096..71b180d6ed90 100644
--- a/arch/powerpc/include/asm/elf.h
+++ b/arch/powerpc/include/asm/elf.h
@@ -155,7 +155,7 @@ extern int ucache_bsize;
* even if DLINFO_ARCH_ITEMS goes to zero or is undefined.
* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes
*/
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
/* Handle glibc compatibility. */ \
NEW_AUX_ENT(AT_IGNOREPPC, AT_IGNOREPPC); \
@@ -164,7 +164,7 @@ do { \
NEW_AUX_ENT(AT_DCACHEBSIZE, dcache_bsize); \
NEW_AUX_ENT(AT_ICACHEBSIZE, icache_bsize); \
NEW_AUX_ENT(AT_UCACHEBSIZE, 0); \
- VDSO_AUX_ENT(AT_SYSINFO_EHDR, (unsigned long)current->mm->context.vdso);\
+ VDSO_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
ARCH_DLINFO_CACHE_GEOMETRY; \
} while (0)

diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index 1d1d60df632e..7c56700f857d 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -58,10 +58,9 @@ extern unsigned long elf_hwcap;
#define ELF_PLATFORM (NULL)

#ifdef CONFIG_MMU
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (elf_addr_t)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
NEW_AUX_ENT(AT_L1I_CACHESIZE, \
get_cache_size(1, CACHE_TYPE_INST)); \
NEW_AUX_ENT(AT_L1I_CACHEGEOMETRY, \
diff --git a/arch/s390/include/asm/elf.h b/arch/s390/include/asm/elf.h
index 6583142149b0..c8026e3e5f10 100644
--- a/arch/s390/include/asm/elf.h
+++ b/arch/s390/include/asm/elf.h
@@ -268,11 +268,10 @@ do { \
#define STACK_RND_MASK MMAP_RND_MASK

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
if (vdso_enabled) \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (unsigned long)current->mm->context.vdso_base); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)

#endif
diff --git a/arch/sh/include/asm/elf.h b/arch/sh/include/asm/elf.h
index 9b3e22e771a1..03b813c0bc39 100644
--- a/arch/sh/include/asm/elf.h
+++ b/arch/sh/include/asm/elf.h
@@ -170,13 +170,13 @@ extern void __kernel_vsyscall;
#define VDSO_BASE ((unsigned long)current->mm->context.vdso)
#define VDSO_SYM(x) (VDSO_BASE + (unsigned long)(x))

-#define VSYSCALL_AUX_ENT \
+#define VSYSCALL_AUX_ENT(sysinfo_ehdr) \
if (vdso_enabled) \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, VDSO_BASE); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
else \
NEW_AUX_ENT(AT_IGNORE, 0)
#else
-#define VSYSCALL_AUX_ENT NEW_AUX_ENT(AT_IGNORE, 0)
+#define VSYSCALL_AUX_ENT(sysinfo_ehdr) NEW_AUX_ENT(AT_IGNORE, 0)
#endif /* CONFIG_VSYSCALL */

#ifdef CONFIG_SH_FPU
@@ -188,13 +188,13 @@ extern void __kernel_vsyscall;
extern int l1i_cache_shape, l1d_cache_shape, l2_cache_shape;

/* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
/* Optional FPU initialization */ \
FPU_AUX_ENT; \
\
/* Optional vsyscall entry */ \
- VSYSCALL_AUX_ENT; \
+ VSYSCALL_AUX_ENT(sysinfo_ehdr); \
\
/* Cache desc */ \
NEW_AUX_ENT(AT_L1I_CACHESHAPE, l1i_cache_shape); \
diff --git a/arch/sparc/include/asm/elf_64.h b/arch/sparc/include/asm/elf_64.h
index 63a622c36df3..1e7295b5ae2f 100644
--- a/arch/sparc/include/asm/elf_64.h
+++ b/arch/sparc/include/asm/elf_64.h
@@ -213,12 +213,11 @@ do { if ((ex).e_ident[EI_CLASS] == ELFCLASS32) \

extern unsigned int vdso_enabled;

-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
extern struct adi_config adi_state; \
if (vdso_enabled) \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (unsigned long)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
NEW_AUX_ENT(AT_ADI_BLKSZ, adi_state.caps.blksz); \
NEW_AUX_ENT(AT_ADI_NBITS, adi_state.caps.nbits); \
NEW_AUX_ENT(AT_ADI_UEONADI, adi_state.caps.ue_on_adi); \
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 93ff2c7ca4df..d543aca7c725 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -306,11 +306,14 @@ extern u32 elf_hwcap2;

struct task_struct;

-#define ARCH_DLINFO_IA32 \
+#define VDSO_ENTRY(sysinfo_ehdr) \
+ (sysinfo_ehdr + vdso_image_32.sym___kernel_vsyscall)
+
+#define ARCH_DLINFO_IA32(sysinfo_ehdr) \
do { \
- if (VDSO_CURRENT_BASE) { \
- NEW_AUX_ENT(AT_SYSINFO, VDSO_ENTRY); \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, VDSO_CURRENT_BASE); \
+ if (sysinfo_ehdr) { \
+ NEW_AUX_ENT(AT_SYSINFO, VDSO_ENTRY(sysinfo_ehdr)); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} \
} while (0)

@@ -344,39 +347,31 @@ extern bool mmap_address_hint_valid(unsigned long addr, unsigned long len);
#define __STACK_RND_MASK(is32bit) ((is32bit) ? 0x7ff : 0x3fffff)
#define STACK_RND_MASK __STACK_RND_MASK(mmap_is_ia32())

-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
if (vdso64_enabled) \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (unsigned long __force)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)

/* As a historical oddity, the x32 and x86_64 vDSOs are controlled together. */
-#define ARCH_DLINFO_X32 \
+#define ARCH_DLINFO_X32(sysinfo_ehdr) \
do { \
if (vdso64_enabled) \
- NEW_AUX_ENT(AT_SYSINFO_EHDR, \
- (unsigned long __force)current->mm->context.vdso); \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, sysinfo_ehdr); \
} while (0)

#define AT_SYSINFO 32

-#define COMPAT_ARCH_DLINFO \
+#define COMPAT_ARCH_DLINFO(sysinfo_ehdr) \
if (exec->e_machine == EM_X86_64) \
- ARCH_DLINFO_X32; \
+ ARCH_DLINFO_X32(sysinfo_ehdr); \
else if (IS_ENABLED(CONFIG_IA32_EMULATION)) \
- ARCH_DLINFO_IA32
+ ARCH_DLINFO_IA32(sysinfo_ehdr)

#define COMPAT_ELF_ET_DYN_BASE (TASK_UNMAPPED_BASE + 0x1000000)

#endif /* !CONFIG_X86_32 */

-#define VDSO_CURRENT_BASE ((unsigned long)current->mm->context.vdso)
-
-#define VDSO_ENTRY \
- ((unsigned long)current->mm->context.vdso + \
- vdso_image_32.sym___kernel_vsyscall)
-
extern bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs);

/* Do not change the values. See get_align_mask() */
diff --git a/arch/x86/um/asm/elf.h b/arch/x86/um/asm/elf.h
index b7c03a760a3c..8608b33ac0e4 100644
--- a/arch/x86/um/asm/elf.h
+++ b/arch/x86/um/asm/elf.h
@@ -88,7 +88,7 @@ extern unsigned long __kernel_vsyscall;
#define AT_SYSINFO 32
#define AT_SYSINFO_EHDR 33

-#define ARCH_DLINFO \
+#define ARCH_DLINFO(sysinfo_ehdr) \
do { \
if ( vsyscall_ehdr ) { \
NEW_AUX_ENT(AT_SYSINFO, __kernel_vsyscall); \
@@ -183,7 +183,7 @@ do { \

extern unsigned long um_vdso_addr;
#define AT_SYSINFO_EHDR 33
-#define ARCH_DLINFO NEW_AUX_ENT(AT_SYSINFO_EHDR, um_vdso_addr)
+#define ARCH_DLINFO(sysinfo_ehdr) NEW_AUX_ENT(AT_SYSINFO_EHDR, um_vdso_addr)

#endif

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 62741e55e3d1..a0e61ed9bdc7 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -171,7 +171,7 @@ static int padzero(unsigned long elf_bss)
static int
create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
unsigned long load_addr, unsigned long interp_load_addr,
- unsigned long e_entry)
+ unsigned long e_entry, unsigned long sysinfo_ehdr)
{
struct mm_struct *mm = current->mm;
unsigned long p = bprm->p;
@@ -252,7 +252,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT() in
* ARCH_DLINFO changes
*/
- ARCH_DLINFO;
+ ARCH_DLINFO(sysinfo_ehdr);
#endif
NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);
@@ -1258,7 +1258,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
goto out;

retval = create_elf_tables(bprm, elf_ex,
- load_addr, interp_load_addr, e_entry);
+ load_addr, interp_load_addr, e_entry, sysinfo_ehdr);
if (retval < 0)
goto out;

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 421a09bc6ee6..0b5f9252e5ad 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -63,7 +63,7 @@ static int elf_fdpic_map_file(struct elf_fdpic_params *, struct file *,

static int create_elf_fdpic_tables(struct linux_binprm *, struct mm_struct *,
struct elf_fdpic_params *,
- struct elf_fdpic_params *);
+ struct elf_fdpic_params *, unsigned long);

#ifndef CONFIG_MMU
static int elf_fdpic_map_file_constdisp_on_uclinux(struct elf_fdpic_params *,
@@ -434,8 +434,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm)
current->mm->start_stack = current->mm->start_brk + stack_size;
#endif

- if (create_elf_fdpic_tables(bprm, current->mm,
- &exec_params, &interp_params) < 0)
+ if (create_elf_fdpic_tables(bprm, current->mm, &exec_params,
+ &interp_params, sysinfo_ehdr) < 0)
goto error;

kdebug("- start_code %lx", current->mm->start_code);
@@ -496,7 +496,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm)
static int create_elf_fdpic_tables(struct linux_binprm *bprm,
struct mm_struct *mm,
struct elf_fdpic_params *exec_params,
- struct elf_fdpic_params *interp_params)
+ struct elf_fdpic_params *interp_params,
+ unsigned long sysinfo_ehdr)
{
const struct cred *cred = current_cred();
unsigned long sp, csp, nitems;
@@ -667,7 +668,7 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
/* ARCH_DLINFO must come last so platform specific code can enforce
* special alignment requirements on the AUXV if necessary (eg. PPC).
*/
- ARCH_DLINFO;
+ ARCH_DLINFO(sysinfo_ehdr);
#endif
#undef NEW_AUX_ENT

--
2.31.1

2021-06-11 18:07:07

by Dmitry Safonov

Subject: [PATCH v3 07/23] vdso: Set mm->context.vdso only on success of _install_special_mapping()

The old pattern was:
1. set mm->context.vdso = addr;
2. try install_special_mapping()
3. on failure set mm->context.vdso = 0

The reason behind the old pattern was to make arch_vma_name() work
in perf_event_mmap_event(). These days, using _install_special_mapping()
instead of install_special_mapping() makes the old pattern obsolete:
: if (vma->vm_ops && vma->vm_ops->name) {
: name = (char *) vma->vm_ops->name(vma);
: if (name)
: goto cpy_name;

Setting mm->context.vdso = 0 also makes little sense: mm_alloc()
zero-fills the new mm_struct. And as double safety, if
arch_setup_additional_pages() fails, bprm_execve() makes sure that the
half-initialized process doesn't make its way to userspace by
: force_sigsegv(SIGSEGV);

Let's clean up the code: set mm->context.vdso only on success,
relying on any new mm_struct being clean.

Some platforms do_munmap() the vvar if the vdso mapping failed, but
that's really only necessary on x86, where the vdso/vvar pair can be
mapped by userspace (see prctl_map_vdso()). On other platforms
vdso/vvar is only pre-mapped by the ELF loader, which, as described
above, will make sure not to let any half-baked process out. I've
left do_munmap() on !x86 in case the prctl() gets supported there.
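
Condensed from the diffs below, the before/after of the pattern
(names illustrative, taken from the arm64 case):

	/* before: set early so perf's arch_vma_name() would match */
	mm->context.vdso = (void *)vdso_base;
	ret = _install_special_mapping(mm, vdso_base, len, flags, cm);
	if (IS_ERR(ret)) {
		mm->context.vdso = NULL;	/* undo on failure */
		return PTR_ERR(ret);
	}

	/* after: rely on mm_alloc()'s zeroing, set only on success */
	ret = _install_special_mapping(mm, vdso_base, len, flags, cm);
	if (IS_ERR(ret))
		return PTR_ERR(ret);
	mm->context.vdso = (void *)vdso_base;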

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm/kernel/vdso.c | 2 --
arch/arm64/kernel/vdso.c | 16 +++++-----------
arch/nds32/kernel/vdso.c | 5 +----
arch/powerpc/kernel/vdso.c | 9 ++-------
arch/sparc/vdso/vma.c | 7 ++-----
5 files changed, 10 insertions(+), 29 deletions(-)

diff --git a/arch/arm/kernel/vdso.c b/arch/arm/kernel/vdso.c
index 3408269d19c7..015eff0a6e93 100644
--- a/arch/arm/kernel/vdso.c
+++ b/arch/arm/kernel/vdso.c
@@ -238,8 +238,6 @@ void arm_install_vdso(struct mm_struct *mm, unsigned long addr)
struct vm_area_struct *vma;
unsigned long len;

- mm->context.vdso = 0;
-
if (vdso_text_pagelist == NULL)
return;

diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index a8bf72320ad0..1bc8adefa293 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -227,34 +227,28 @@ static int __setup_additional_pages(enum vdso_abi abi,
vdso_mapping_len = vdso_text_len + VVAR_NR_PAGES * PAGE_SIZE;

vdso_base = get_unmapped_area(NULL, 0, vdso_mapping_len, 0, 0);
- if (IS_ERR_VALUE(vdso_base)) {
- ret = ERR_PTR(vdso_base);
- goto up_fail;
- }
+ if (IS_ERR_VALUE(vdso_base))
+ return vdso_base;

ret = _install_special_mapping(mm, vdso_base, VVAR_NR_PAGES * PAGE_SIZE,
VM_READ|VM_MAYREAD|VM_PFNMAP,
vdso_info[abi].dm);
if (IS_ERR(ret))
- goto up_fail;
+ return PTR_ERR(ret);

if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) && system_supports_bti())
gp_flags = VM_ARM64_BTI;

vdso_base += VVAR_NR_PAGES * PAGE_SIZE;
- mm->context.vdso = (void *)vdso_base;
ret = _install_special_mapping(mm, vdso_base, vdso_text_len,
VM_READ|VM_EXEC|gp_flags|
VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
vdso_info[abi].cm);
if (IS_ERR(ret))
- goto up_fail;
+ return PTR_ERR(ret);

+ mm->context.vdso = (void *)vdso_base;
return 0;
-
-up_fail:
- mm->context.vdso = NULL;
- return PTR_ERR(ret);
}

#ifdef CONFIG_COMPAT
diff --git a/arch/nds32/kernel/vdso.c b/arch/nds32/kernel/vdso.c
index e16009a07971..2d1d51a0fc64 100644
--- a/arch/nds32/kernel/vdso.c
+++ b/arch/nds32/kernel/vdso.c
@@ -175,7 +175,6 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)

/*Map vdso to user space */
vdso_base += PAGE_SIZE;
- mm->context.vdso = (void *)vdso_base;
vma = _install_special_mapping(mm, vdso_base, vdso_text_len,
VM_READ | VM_EXEC |
VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
@@ -185,11 +184,9 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
goto up_fail;
}

- mmap_write_unlock(mm);
- return 0;
+ mm->context.vdso = (void *)vdso_base;

up_fail:
- mm->context.vdso = NULL;
mmap_write_unlock(mm);
return ret;
}
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 717f2c9a7573..76e898b56002 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -229,13 +229,6 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
/* Add required alignment. */
vdso_base = ALIGN(vdso_base, VDSO_ALIGNMENT);

- /*
- * Put vDSO base into mm struct. We need to do this before calling
- * install_special_mapping or the perf counter mmap tracking code
- * will fail to recognise it as a vDSO.
- */
- mm->context.vdso = (void __user *)vdso_base + vvar_size;
-
vma = _install_special_mapping(mm, vdso_base, vvar_size,
VM_READ | VM_MAYREAD | VM_IO |
VM_DONTDUMP | VM_PFNMAP, &vvar_spec);
@@ -257,6 +250,8 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
VM_MAYWRITE | VM_MAYEXEC, vdso_spec);
if (IS_ERR(vma))
do_munmap(mm, vdso_base, vvar_size, NULL);
+ else
+ mm->context.vdso = (void __user *)vdso_base + vvar_size;

return PTR_ERR_OR_ZERO(vma);
}
diff --git a/arch/sparc/vdso/vma.c b/arch/sparc/vdso/vma.c
index cc19e09b0fa1..d8a344f6c914 100644
--- a/arch/sparc/vdso/vma.c
+++ b/arch/sparc/vdso/vma.c
@@ -390,7 +390,6 @@ static int map_vdso(const struct vdso_image *image,
}

text_start = addr - image->sym_vvar_start;
- current->mm->context.vdso = (void __user *)text_start;

/*
* MAYWRITE to allow gdb to COW and set breakpoints
@@ -412,16 +411,14 @@ static int map_vdso(const struct vdso_image *image,
-image->sym_vvar_start,
VM_READ|VM_MAYREAD,
&vvar_mapping);
-
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
do_munmap(mm, text_start, image->size, NULL);
+ } else {
+ current->mm->context.vdso = (void __user *)text_start;
}

up_fail:
- if (ret)
- current->mm->context.vdso = NULL;
-
mmap_write_unlock(mm);
return ret;
}
--
2.31.1

2021-06-11 18:07:23

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 15/23] x86/signal: Check if vdso_image_32 is mapped before trying to land on it

Provide a current_has_vdso(image) helper and check it prior to any
attempt to land on the vdso vma.
The helper is a macro, not a static inline function, to avoid including
linux/sched/task_stack.h in asm/vdso.h.
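
Callers then become a simple fallback chain, e.g. (a sketch condensing
the ia32_signal.c hunks below):
: if (current_has_vdso(&vdso_image_32))
:         restorer = current->mm->context.vdso +
:                    vdso_image_32.sym___kernel_sigreturn;
: else
:         restorer = &frame->retcode;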

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/entry/common.c | 10 +++++++++-
arch/x86/ia32/ia32_signal.c | 4 ++--
arch/x86/include/asm/vdso.h | 4 ++++
arch/x86/kernel/signal.c | 4 ++--
4 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 7b2542b13ebd..385a1c4bf4c0 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -150,11 +150,19 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
/* Returns 0 to return using IRET or 1 to return using SYSEXIT/SYSRETL. */
__visible noinstr long do_fast_syscall_32(struct pt_regs *regs)
{
+ unsigned long landing_pad;
+
+ if (!current_has_vdso(&vdso_image_32)) {
+ regs->ip = 0;
+ force_sigsegv(SIGSEGV);
+ syscall_exit_to_user_mode(regs);
+ }
+
/*
* Called using the internal vDSO SYSENTER/SYSCALL32 calling
* convention. Adjust regs so it looks like we entered using int80.
*/
- unsigned long landing_pad = (unsigned long)current->mm->context.vdso +
+ landing_pad = (unsigned long)current->mm->context.vdso +
vdso_image_32.sym_int80_landing_pad;

/*
diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index adb6994c40f6..2af40ae53a0e 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -255,7 +255,7 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
restorer = ksig->ka.sa.sa_restorer;
} else {
/* Return stub is in 32bit vsyscall page */
- if (current->mm->context.vdso)
+ if (current_has_vdso(&vdso_image_32))
restorer = current->mm->context.vdso +
vdso_image_32.sym___kernel_sigreturn;
else
@@ -336,7 +336,7 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig,

if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
- else if (current->mm->context.vdso)
+ else if (current_has_vdso(&vdso_image_32))
restorer = current->mm->context.vdso +
vdso_image_32.sym___kernel_rt_sigreturn;
else
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index 98aa103eb4ab..1ea7cb3f9b14 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -45,6 +45,10 @@ extern const struct vdso_image vdso_image_x32;
extern const struct vdso_image vdso_image_32;
#endif

+#define current_has_vdso(image) \
+ (current->mm->context.vdso != 0 && \
+ current->mm->context.vdso_image == image)
+
extern void __init init_vdso_image(const struct vdso_image *image);

extern int map_vdso_once(const struct vdso_image *image, unsigned long addr);
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 988cbc634949..77496ccb812d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -319,7 +319,7 @@ __setup_frame(int sig, struct ksignal *ksig, sigset_t *set,
unsafe_put_user(set->sig[1], &frame->extramask[0], Efault);
if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
- else if (current->mm->context.vdso)
+ else if (current_has_vdso(&vdso_image_32))
restorer = current->mm->context.vdso +
vdso_image_32.sym___kernel_sigreturn;
else
@@ -381,7 +381,7 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
/* Set up to return from userspace. */
if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
- else if (current->mm->context.vdso)
+ else if (current_has_vdso(&vdso_image_32))
restorer = current->mm->context.vdso +
vdso_image_32.sym___kernel_rt_sigreturn;
else
--
2.31.1

2021-06-11 18:07:56

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 02/23] elf: Move arch_setup_additional_pages() to generic elf.h

Ifdef the function in the header, not around its usage in the code.
Following kernel style, move the knob to Kconfig; that makes it easier
to follow where the option is enabled/disabled.
Remove the re-definition from compat_binfmt_elf, as it's always defined
for architectures that define compat_arch_setup_additional_pages
(arm64/x86). CONFIG_VDSO depends on MMU, so architectures that build the
vDSO only with an MMU select ARCH_HAS_SETUP_ADDITIONAL_PAGES if MMU.
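
With the generic stub in place, the ELF loaders can call the hook
unconditionally, e.g. (a sketch of the resulting binfmt_elf.c flow):
: /* No #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES needed anymore */
: retval = arch_setup_additional_pages(bprm, !!interpreter);
: if (retval < 0)
:         goto out;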

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm/Kconfig | 1 +
arch/arm/include/asm/elf.h | 3 ---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/elf.h | 6 +-----
arch/csky/Kconfig | 1 +
arch/csky/include/asm/elf.h | 4 ----
arch/hexagon/Kconfig | 1 +
arch/hexagon/include/asm/elf.h | 6 ------
arch/mips/Kconfig | 1 +
arch/mips/include/asm/elf.h | 5 -----
arch/nds32/Kconfig | 1 +
arch/nds32/include/asm/elf.h | 3 ---
arch/nios2/Kconfig | 1 +
arch/nios2/include/asm/elf.h | 4 ----
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/elf.h | 5 -----
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/elf.h | 4 ----
arch/s390/Kconfig | 1 +
arch/s390/include/asm/elf.h | 5 -----
arch/sh/Kconfig | 1 +
arch/sh/include/asm/elf.h | 6 ------
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/elf_64.h | 6 ------
arch/x86/Kconfig | 1 +
arch/x86/include/asm/elf.h | 4 ----
arch/x86/um/asm/elf.h | 5 -----
fs/Kconfig.binfmt | 3 +++
fs/binfmt_elf.c | 2 --
fs/binfmt_elf_fdpic.c | 3 +--
fs/compat_binfmt_elf.c | 2 --
include/linux/elf.h | 12 ++++++++++++
32 files changed, 30 insertions(+), 71 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 24804f11302d..2df5ad505b8b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -14,6 +14,7 @@ config ARM
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
select ARCH_HAS_PHYS_TO_DMA
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES if MMU
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index b8102a6ddf16..a7cd90b3a779 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -145,9 +145,6 @@ do { \
(elf_addr_t)current->mm->context.vdso); \
} while (0)
#endif
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-int arch_setup_additional_pages(struct linux_binprm *, int);
#endif

#endif
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9f1d8566bbf9..385ef8d8ad9b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -31,6 +31,7 @@ config ARM64
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 8d1c8dcb87fd..d1073ffa7f24 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -181,11 +181,6 @@ do { \
NEW_AUX_ENT(AT_IGNORE, 0); \
} while (0)

-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-
/* 1GB of VA */
#ifdef CONFIG_COMPAT
#define STACK_RND_MASK (test_thread_flag(TIF_32BIT) ? \
@@ -242,6 +237,7 @@ do { \
#else
#define COMPAT_ARCH_DLINFO
#endif
+struct linux_binprm;
extern int aarch32_setup_additional_pages(struct linux_binprm *bprm,
int uses_interp);
#define compat_arch_setup_additional_pages \
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 8de5b987edb9..68139fa18691 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -4,6 +4,7 @@ config CSKY
select ARCH_32BIT_OFF_T
select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_GCOV_PROFILE_ALL
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_USE_BUILTIN_BSWAP
diff --git a/arch/csky/include/asm/elf.h b/arch/csky/include/asm/elf.h
index 48b83e283ed4..89067e028335 100644
--- a/arch/csky/include/asm/elf.h
+++ b/arch/csky/include/asm/elf.h
@@ -83,8 +83,4 @@ extern int dump_task_regs(struct task_struct *tsk, elf_gregset_t *elf_regs);
#define ELF_PLATFORM (NULL)
#define SET_PERSONALITY(ex) set_personality(PER_LINUX)

-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
#endif /* __ASM_CSKY_ELF_H */
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 44a409967af1..04dc816d04bd 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -5,6 +5,7 @@ comment "Linux Kernel Configuration for Hexagon"
config HEXAGON
def_bool y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_NO_PREEMPT
# Other pending projects/to-do items.
diff --git a/arch/hexagon/include/asm/elf.h b/arch/hexagon/include/asm/elf.h
index 5bfdd9b147fd..eba4131610aa 100644
--- a/arch/hexagon/include/asm/elf.h
+++ b/arch/hexagon/include/asm/elf.h
@@ -207,10 +207,4 @@ do { \
*/
#define ELF_PLATFORM (NULL)

-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-
-
#endif
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index ed51970c08e7..81096dd2c1ef 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -9,6 +9,7 @@ config MIPS
select ARCH_HAS_KCOV
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE if !EVA
select ARCH_HAS_PTE_SPECIAL if !(32BIT && CPU_HAS_RIXI)
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/mips/include/asm/elf.h b/arch/mips/include/asm/elf.h
index dc8d2863752c..a5c8be47a39d 100644
--- a/arch/mips/include/asm/elf.h
+++ b/arch/mips/include/asm/elf.h
@@ -462,11 +462,6 @@ do { \
(unsigned long)current->mm->context.vdso); \
} while (0)

-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-
#ifdef CONFIG_MIPS_FP_SUPPORT

struct arch_elf_state {
diff --git a/arch/nds32/Kconfig b/arch/nds32/Kconfig
index 62313902d75d..02afe5ebdfff 100644
--- a/arch/nds32/Kconfig
+++ b/arch/nds32/Kconfig
@@ -8,6 +8,7 @@ config NDS32
def_bool y
select ARCH_32BIT_OFF_T
select ARCH_HAS_DMA_PREP_COHERENT
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_WANT_FRAME_POINTERS if FTRACE
diff --git a/arch/nds32/include/asm/elf.h b/arch/nds32/include/asm/elf.h
index 1853dc89b8ac..36cec4ae5a84 100644
--- a/arch/nds32/include/asm/elf.h
+++ b/arch/nds32/include/asm/elf.h
@@ -173,8 +173,5 @@ do { \
NEW_AUX_ENT(AT_SYSINFO_EHDR, \
(elf_addr_t)current->mm->context.vdso); \
} while (0)
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-int arch_setup_additional_pages(struct linux_binprm *, int);

#endif
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index c24955c81c92..8159123a995e 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -3,6 +3,7 @@ config NIOS2
def_bool y
select ARCH_32BIT_OFF_T
select ARCH_HAS_DMA_PREP_COHERENT
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_HAS_DMA_SET_UNCACHED
diff --git a/arch/nios2/include/asm/elf.h b/arch/nios2/include/asm/elf.h
index 984dd6de17c2..4f8baaef843f 100644
--- a/arch/nios2/include/asm/elf.h
+++ b/arch/nios2/include/asm/elf.h
@@ -28,10 +28,6 @@
/* regs is struct pt_regs, pr_reg is elf_gregset_t (which is
now struct_user_regs, they are different) */

-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
#define ELF_CORE_COPY_REGS(pr_reg, regs) \
{ do { \
/* Bleech. */ \
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 088dd2afcfe4..a9f842230ee4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -140,6 +140,7 @@ config PPC
select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE
diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h
index b8425e3cfd81..d7d9820c9096 100644
--- a/arch/powerpc/include/asm/elf.h
+++ b/arch/powerpc/include/asm/elf.h
@@ -111,11 +111,6 @@ extern int dcache_bsize;
extern int icache_bsize;
extern int ucache_bsize;

-/* vDSO has arch_setup_additional_pages */
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
#define VDSO_AUX_ENT(a,b) NEW_AUX_ENT(a,b)

/* 1GB for 64bit, 8MB for 32bit */
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c5914e70a0fd..1003fa4534a7 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -28,6 +28,7 @@ config RISCV
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES if MMU
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index f4b490cd0e5d..1d1d60df632e 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -75,10 +75,6 @@ do { \
NEW_AUX_ENT(AT_L2_CACHEGEOMETRY, \
get_cache_geometry(2, CACHE_TYPE_UNIFIED)); \
} while (0)
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
#endif /* CONFIG_MMU */

#define ELF_CORE_COPY_REGS(dest, regs) \
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index b4c7c34069f8..ccca15663345 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -76,6 +76,7 @@ config S390
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SCALED_CPUTIME
select ARCH_HAS_SET_MEMORY
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYSCALL_WRAPPER
diff --git a/arch/s390/include/asm/elf.h b/arch/s390/include/asm/elf.h
index 66d51ad090ab..6583142149b0 100644
--- a/arch/s390/include/asm/elf.h
+++ b/arch/s390/include/asm/elf.h
@@ -275,9 +275,4 @@ do { \
(unsigned long)current->mm->context.vdso_base); \
} while (0)

-struct linux_binprm;
-
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-int arch_setup_additional_pages(struct linux_binprm *, int);
-
#endif
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 68129537e350..9a1a39c5b3c4 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -10,6 +10,7 @@ config SUPERH
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_PTE_SPECIAL
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES if VSYSCALL
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HIBERNATION_POSSIBLE if MMU
select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/arch/sh/include/asm/elf.h b/arch/sh/include/asm/elf.h
index 2862d6d1cb64..9b3e22e771a1 100644
--- a/arch/sh/include/asm/elf.h
+++ b/arch/sh/include/asm/elf.h
@@ -164,12 +164,6 @@ do { \
set_personality(PER_LINUX_32BIT | (current->personality & (~PER_MASK)))

#ifdef CONFIG_VSYSCALL
-/* vDSO has arch_setup_additional_pages */
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-
extern unsigned int vdso_enabled;
extern void __kernel_vsyscall;

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 164a5254c91c..10c719a73b90 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -13,6 +13,7 @@ config 64BIT
config SPARC
bool
default y
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES if SPARC64
select ARCH_MIGHT_HAVE_PC_PARPORT if SPARC64 && PCI
select ARCH_MIGHT_HAVE_PC_SERIO
select DMA_OPS
diff --git a/arch/sparc/include/asm/elf_64.h b/arch/sparc/include/asm/elf_64.h
index 8fb09eec8c3e..63a622c36df3 100644
--- a/arch/sparc/include/asm/elf_64.h
+++ b/arch/sparc/include/asm/elf_64.h
@@ -223,10 +223,4 @@ do { \
NEW_AUX_ENT(AT_ADI_NBITS, adi_state.caps.nbits); \
NEW_AUX_ENT(AT_ADI_UEONADI, adi_state.caps.ue_on_adi); \
} while (0)
-
-struct linux_binprm;
-
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
#endif /* !(__ASM_SPARC64_ELF_H) */
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0045e1b44190..d30b952f4453 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -87,6 +87,7 @@ config X86
select ARCH_HAS_COPY_MC if X86_64
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_SET_DIRECT_MAP
+ select ARCH_HAS_SETUP_ADDITIONAL_PAGES
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index c0b5733005af..9ee5b3b3ba93 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -378,10 +378,6 @@ else if (IS_ENABLED(CONFIG_IA32_EMULATION)) \
vdso_image_32.sym___kernel_vsyscall)

struct linux_binprm;
-
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
int uses_interp);
#define compat_arch_setup_additional_pages compat_arch_setup_additional_pages
diff --git a/arch/x86/um/asm/elf.h b/arch/x86/um/asm/elf.h
index dcaf3b38a9e0..b7c03a760a3c 100644
--- a/arch/x86/um/asm/elf.h
+++ b/arch/x86/um/asm/elf.h
@@ -181,11 +181,6 @@ do { \
#define FIXADDR_USER_START 0
#define FIXADDR_USER_END 0

-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
-struct linux_binprm;
-extern int arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-
extern unsigned long um_vdso_addr;
#define AT_SYSINFO_EHDR 33
#define ARCH_DLINFO NEW_AUX_ENT(AT_SYSINFO_EHDR, um_vdso_addr)
diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt
index 06fb7a93a1bd..d98206ad7749 100644
--- a/fs/Kconfig.binfmt
+++ b/fs/Kconfig.binfmt
@@ -39,6 +39,9 @@ config ARCH_BINFMT_ELF_STATE
config ARCH_HAVE_ELF_PROT
bool

+config ARCH_HAS_SETUP_ADDITIONAL_PAGES
+ bool
+
config ARCH_USE_GNU_PROPERTY
bool

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 2347d9067df6..dac2713c10ee 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1252,11 +1252,9 @@ static int load_elf_binary(struct linux_binprm *bprm)

set_binfmt(&elf_format);

-#ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES
retval = arch_setup_additional_pages(bprm, !!interpreter);
if (retval < 0)
goto out;
-#endif /* ARCH_HAS_SETUP_ADDITIONAL_PAGES */

retval = create_elf_tables(bprm, elf_ex,
load_addr, interp_load_addr, e_entry);
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 2c99b102c860..11cbf20b19da 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -374,11 +374,10 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm)
executable_stack);
if (retval < 0)
goto error;
-#ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES
+
retval = arch_setup_additional_pages(bprm, !!interpreter_name);
if (retval < 0)
goto error;
-#endif
#endif

/* load the executable and interpreter into memory */
diff --git a/fs/compat_binfmt_elf.c b/fs/compat_binfmt_elf.c
index 049ba7c011b9..fad63a4f842e 100644
--- a/fs/compat_binfmt_elf.c
+++ b/fs/compat_binfmt_elf.c
@@ -111,8 +111,6 @@
#endif

#ifdef compat_arch_setup_additional_pages
-#undef ARCH_HAS_SETUP_ADDITIONAL_PAGES
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
#undef arch_setup_additional_pages
#define arch_setup_additional_pages compat_arch_setup_additional_pages
#endif
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 6dbcfe7a3fd7..95bf7a1abaef 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -104,4 +104,16 @@ static inline int arch_elf_adjust_prot(int prot,
}
#endif

+struct linux_binprm;
+#ifdef CONFIG_ARCH_HAS_SETUP_ADDITIONAL_PAGES
+extern int arch_setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp);
+#else
+static inline int arch_setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp)
+{
+ return 0;
+}
+#endif
+
#endif /* _LINUX_ELF_H */
--
2.31.1

2021-06-11 18:08:29

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 17/23] x86/vdso: Migrate to generic vdso_base

Migrate x86 to the generic way of tracking the landing vma area.
As a bonus, after unmapping the vdso the kernel won't try to land on its
previous position (thanks to the UNMAPPED_VDSO_BASE check instead of the
context.vdso != 0 check).
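
Checks against the old context.vdso != 0 then turn into, e.g. (a sketch
following the extable.c hunk below):
: if (current->mm->vdso_base == (void __user *)UNMAPPED_VDSO_BASE)
:         return false;   /* vDSO not mapped - nowhere to land */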

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 2 +-
arch/x86/entry/vdso/extable.c | 4 ++--
arch/x86/entry/vdso/vma.c | 13 ++++++-------
arch/x86/ia32/ia32_signal.c | 4 ++--
arch/x86/include/asm/mmu.h | 1 -
arch/x86/include/asm/vdso.h | 2 +-
arch/x86/kernel/signal.c | 4 ++--
8 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d30b952f4453..b1b6dab92fb6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -93,6 +93,7 @@ config X86
select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
select ARCH_HAS_SYSCALL_WRAPPER
select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_HAS_VDSO_BASE
select ARCH_HAS_DEBUG_WX
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 385a1c4bf4c0..f5861f6e00a2 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -162,7 +162,7 @@ __visible noinstr long do_fast_syscall_32(struct pt_regs *regs)
* Called using the internal vDSO SYSENTER/SYSCALL32 calling
* convention. Adjust regs so it looks like we entered using int80.
*/
- landing_pad = (unsigned long)current->mm->context.vdso +
+ landing_pad = (unsigned long)current->mm->vdso_base +
vdso_image_32.sym_int80_landing_pad;

/*
diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index afcf5b65beef..e5bfb35dafdb 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -25,10 +25,10 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
if (trapnr == X86_TRAP_DB || trapnr == X86_TRAP_BP)
return false;

- if (!current->mm->context.vdso)
+ if (current->mm->vdso_base == (void __user *)UNMAPPED_VDSO_BASE)
return false;

- base = (unsigned long)current->mm->context.vdso + image->extable_base;
+ base = (unsigned long)current->mm->vdso_base + image->extable_base;
nr_entries = image->extable_len / (sizeof(*extable));
extable = image->extable;

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index a286d44751be..f17e617c3b3e 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -77,7 +77,7 @@ static void vdso_fix_landing(const struct vdso_image *image,
struct pt_regs *regs = current_pt_regs();
unsigned long vdso_land = image->sym_int80_landing_pad;
unsigned long old_land_addr = vdso_land +
- (unsigned long)current->mm->context.vdso;
+ (unsigned long)current->mm->vdso_base;

/* Fixing userspace landing - look at do_fast_syscall_32 */
if (regs->ip == old_land_addr)
@@ -92,7 +92,6 @@ static void vdso_mremap(const struct vm_special_mapping *sm,
const struct vdso_image *image = current->mm->context.vdso_image;

vdso_fix_landing(image, new_vma);
- current->mm->context.vdso = (void __user *)new_vma->vm_start;
}

#ifdef CONFIG_TIME_NS
@@ -287,7 +286,7 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr,
ret = PTR_ERR(vma);
do_munmap(mm, text_start, image->size, NULL);
} else {
- current->mm->context.vdso = (void __user *)text_start;
+ current->mm->vdso_base = (void __user *)text_start;
current->mm->context.vdso_image = image;
*sysinfo_ehdr = text_start;
}
@@ -362,8 +361,8 @@ int map_vdso_once(const struct vdso_image *image, unsigned long addr)
* Check if we have already mapped vdso blob - fail to prevent
* abusing from userspace install_special_mapping, which may
* not do accounting and rlimit right.
- * We could search vma near context.vdso, but it's a slowpath,
- * so let's explicitly check all VMAs to be completely sure.
+ * It's a slowpath, let's explicitly check all VMAs to be
+ * completely sure.
*/
for (vma = mm->mmap; vma; vma = vma->vm_next) {
if (vma_is_special_mapping(vma, &vdso_mapping) ||
@@ -415,9 +414,9 @@ bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
{
#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
const struct vdso_image *image = current->mm->context.vdso_image;
- unsigned long vdso = (unsigned long) current->mm->context.vdso;
+ unsigned long vdso = (unsigned long) current->mm->vdso_base;

- if (in_ia32_syscall() && image == &vdso_image_32) {
+ if (in_ia32_syscall() && current_has_vdso(&vdso_image_32)) {
if (regs->ip == vdso + image->sym_vdso32_sigreturn_landing_pad ||
regs->ip == vdso + image->sym_vdso32_rt_sigreturn_landing_pad)
return true;
diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 2af40ae53a0e..a98f4c7cdc38 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -256,7 +256,7 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
} else {
/* Return stub is in 32bit vsyscall page */
if (current_has_vdso(&vdso_image_32))
- restorer = current->mm->context.vdso +
+ restorer = current->mm->vdso_base +
vdso_image_32.sym___kernel_sigreturn;
else
restorer = &frame->retcode;
@@ -337,7 +337,7 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig,
if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
else if (current_has_vdso(&vdso_image_32))
- restorer = current->mm->context.vdso +
+ restorer = current->mm->vdso_base +
vdso_image_32.sym___kernel_rt_sigreturn;
else
restorer = &frame->retcode;
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 5d7494631ea9..7bd10e6b8386 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -43,7 +43,6 @@ typedef struct {
#endif

struct mutex lock;
- void __user *vdso; /* vdso base address */
const struct vdso_image *vdso_image; /* vdso image in use */

atomic_t perf_rdpmc_allowed; /* nonzero if rdpmc is allowed */
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index 1ea7cb3f9b14..e4dcc189c30d 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -46,7 +46,7 @@ extern const struct vdso_image vdso_image_32;
#endif

#define current_has_vdso(image) \
- (current->mm->context.vdso != 0 && \
+ (current->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE && \
current->mm->context.vdso_image == image)

extern void __init init_vdso_image(const struct vdso_image *image);
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 77496ccb812d..c853d212e6cb 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -320,7 +320,7 @@ __setup_frame(int sig, struct ksignal *ksig, sigset_t *set,
if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
else if (current_has_vdso(&vdso_image_32))
- restorer = current->mm->context.vdso +
+ restorer = current->mm->vdso_base +
vdso_image_32.sym___kernel_sigreturn;
else
restorer = &frame->retcode;
@@ -382,7 +382,7 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
else if (current_has_vdso(&vdso_image_32))
- restorer = current->mm->context.vdso +
+ restorer = current->mm->vdso_base +
vdso_image_32.sym___kernel_rt_sigreturn;
else
restorer = &frame->retcode;
--
2.31.1

2021-06-11 18:08:50

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 04/23] arm64: Use in_compat_task() in arch_setup_additional_pages()

Instead of providing compat_arch_setup_additional_pages(), check whether
the task is compat from its personality, which is set earlier in
load_elf_binary(). That aligns the code with powerpc and sparc, and will
also allow the compat_arch_setup_additional_pages() macro to be removed
completely after doing the same for x86, simplifying the binfmt code
in the end and leaving the ELF loader a single function.
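
A minimal sketch of the resulting dispatch in
arch_setup_additional_pages() (see the vdso.c hunk below):
: if (is_compat_task())
:         ret = aarch32_setup_additional_pages(bprm, uses_interp);
: else
:         ret = __setup_additional_pages(VDSO_ABI_AA64, mm, bprm, uses_interp);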

Cc: [email protected]
Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/arm64/include/asm/elf.h | 5 -----
arch/arm64/kernel/vdso.c | 21 ++++++++++-----------
2 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index d1073ffa7f24..a81953bcc1cf 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -237,11 +237,6 @@ do { \
#else
#define COMPAT_ARCH_DLINFO
#endif
-struct linux_binprm;
-extern int aarch32_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-#define compat_arch_setup_additional_pages \
- aarch32_setup_additional_pages

#endif /* CONFIG_COMPAT */

diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index a61fc4f989b3..a8bf72320ad0 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -411,29 +411,24 @@ static int aarch32_sigreturn_setup(struct mm_struct *mm)
return PTR_ERR_OR_ZERO(ret);
}

-int aarch32_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+static int aarch32_setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp)
{
struct mm_struct *mm = current->mm;
int ret;

- if (mmap_write_lock_killable(mm))
- return -EINTR;
-
ret = aarch32_kuser_helpers_setup(mm);
if (ret)
- goto out;
+ return ret;

if (IS_ENABLED(CONFIG_COMPAT_VDSO)) {
ret = __setup_additional_pages(VDSO_ABI_AA32, mm, bprm,
uses_interp);
if (ret)
- goto out;
+ return ret;
}

- ret = aarch32_sigreturn_setup(mm);
-out:
- mmap_write_unlock(mm);
- return ret;
+ return aarch32_sigreturn_setup(mm);
}
#endif /* CONFIG_COMPAT */

@@ -470,7 +465,11 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
if (mmap_write_lock_killable(mm))
return -EINTR;

- ret = __setup_additional_pages(VDSO_ABI_AA64, mm, bprm, uses_interp);
+ if (is_compat_task())
+ ret = aarch32_setup_additional_pages(bprm, uses_interp);
+ else
+ ret = __setup_additional_pages(VDSO_ABI_AA64, mm, bprm, uses_interp);
+
mmap_write_unlock(mm);

return ret;
--
2.31.1

2021-06-11 18:09:15

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 14/23] x86/signal: Land on &frame->retcode when vdso isn't mapped

Since commit 9fbbd4dd17d0 ("x86: Don't require the vDSO for handling
a.out signals"), after processing a 32-bit signal with no vdso mapped,
frame->retcode is used as the landing address.
Do the same for rt ia32 signals.
This also makes the ia32 compat signals match the native ia32 case.

This shouldn't be mistaken for encouragement to run binaries with an
executable stack; rather, it's something to do in the hopefully very
rare situation of a disabled or unmapped vdso and an absent SA_RESTORER.
With a non-executable stack the process will segfault on the landing
attempt, rather than land on a random address where the vdso was
previously mapped. For programs with an executable stack it just makes
rt signals behave the same as non-rt ones.

Discouraging users from running with an executable stack is done
separately in commit 47a2ebb7f505 ("execve: warn if process starts with
executable stack").

Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 12 +++++++-----
arch/x86/kernel/signal.c | 23 ++++++++++-------------
2 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 5e3d9b7fd5fb..adb6994c40f6 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -270,8 +270,8 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
unsafe_put_user(set->sig[1], &frame->extramask[0], Efault);
unsafe_put_user(ptr_to_compat(restorer), &frame->pretcode, Efault);
/*
- * These are actually not used anymore, but left because some
- * gdb versions depend on them as a marker.
+ * This is popl %eax ; movl $__NR_sigreturn, %eax ; int $0x80
+ * gdb uses it as a signature to notice signal handler stack frames.
*/
unsafe_put_user(*((u64 *)&code), (u64 __user *)frame->retcode, Efault);
user_access_end();
@@ -336,14 +336,16 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig,

if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
- else
+ else if (current->mm->context.vdso)
restorer = current->mm->context.vdso +
vdso_image_32.sym___kernel_rt_sigreturn;
+ else
+ restorer = &frame->retcode;
unsafe_put_user(ptr_to_compat(restorer), &frame->pretcode, Efault);

/*
- * Not actually used anymore, but left because some gdb
- * versions need it.
+ * This is popl %eax ; movl $__NR_sigreturn, %eax ; int $0x80
+ * gdb uses it as a signature to notice signal handler stack frames.
*/
unsafe_put_user(*((u64 *)&code), (u64 __user *)frame->retcode, Efault);
unsafe_put_sigcontext32(&frame->uc.uc_mcontext, fp, regs, set, Efault);
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index a06cb107c0e8..988cbc634949 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -317,23 +317,20 @@ __setup_frame(int sig, struct ksignal *ksig, sigset_t *set,
unsafe_put_user(sig, &frame->sig, Efault);
unsafe_put_sigcontext(&frame->sc, fp, regs, set, Efault);
unsafe_put_user(set->sig[1], &frame->extramask[0], Efault);
- if (current->mm->context.vdso)
+ if (ksig->ka.sa.sa_flags & SA_RESTORER)
+ restorer = ksig->ka.sa.sa_restorer;
+ else if (current->mm->context.vdso)
restorer = current->mm->context.vdso +
vdso_image_32.sym___kernel_sigreturn;
else
restorer = &frame->retcode;
- if (ksig->ka.sa.sa_flags & SA_RESTORER)
- restorer = ksig->ka.sa.sa_restorer;

/* Set up to return from userspace. */
unsafe_put_user(restorer, &frame->pretcode, Efault);

/*
* This is popl %eax ; movl $__NR_sigreturn, %eax ; int $0x80
- *
- * WE DO NOT USE IT ANY MORE! It's only left here for historical
- * reasons and because gdb uses it as a signature to notice
- * signal handler stack frames.
+ * gdb uses it as a signature to notice signal handler stack frames.
*/
unsafe_put_user(*((u64 *)&retcode), (u64 *)frame->retcode, Efault);
user_access_end();
@@ -382,18 +379,18 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
unsafe_save_altstack(&frame->uc.uc_stack, regs->sp, Efault);

/* Set up to return from userspace. */
- restorer = current->mm->context.vdso +
- vdso_image_32.sym___kernel_rt_sigreturn;
if (ksig->ka.sa.sa_flags & SA_RESTORER)
restorer = ksig->ka.sa.sa_restorer;
+ else if (current->mm->context.vdso)
+ restorer = current->mm->context.vdso +
+ vdso_image_32.sym___kernel_rt_sigreturn;
+ else
+ restorer = &frame->retcode;
unsafe_put_user(restorer, &frame->pretcode, Efault);

/*
* This is movl $__NR_rt_sigreturn, %ax ; int $0x80
- *
- * WE DO NOT USE IT ANY MORE! It's only left here for historical
- * reasons and because gdb uses it as a signature to notice
- * signal handler stack frames.
+ * gdb uses it as a signature to notice signal handler stack frames.
*/
unsafe_put_user(*((u64 *)&rt_retcode), (u64 *)frame->retcode, Efault);
unsafe_put_sigcontext(&frame->uc.uc_mcontext, fp, regs, set, Efault);
--
2.31.1

2021-06-11 18:09:24

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 23/23] x86/vdso/selftest: Add a test for unmapping vDSO

Output for landing on x86:
> [root@localhost ~]# ./test_munmap_vdso_64
> AT_SYSINFO_EHDR is 0x7fffead9f000
> [NOTE] unmapping vDSO: [0x7fffead9f000, 0x7fffeada0000]
> [NOTE] vDSO partial move failed, will try with bigger size
> [NOTE] unmapping vDSO: [0x7fffead9f000, 0x7fffeada1000]
> [OK]
> [root@localhost ~]# ./test_munmap_vdso_32
> AT_SYSINFO_EHDR is 0xf7eef000
> [NOTE] unmapping vDSO: [0xf7eef000, 0xf7ef0000]
> [NOTE] vDSO partial move failed, will try with bigger size
> [NOTE] unmapping vDSO: [0xf7eef000, 0xf7ef1000]
> [OK]

The test also can check force_sigsegv(SIGSEGV) in do_fast_syscall_32():
> [root@localhost ~]# ./test_munmap_vdso_32 sysenter
> [NOTE] Using sysenter after munmap
> AT_SYSINFO_EHDR is 0xf7efe000
> [NOTE] unmapping vDSO: [0xf7efe000, 0xf7eff000]
> [NOTE] vDSO partial move failed, will try with bigger size
> [NOTE] unmapping vDSO: [0xf7efe000, 0xf7f00000]
> [OK] 32-bit process gets segfault on fast syscall with unmapped vDSO

Cc: Shuah Khan <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
tools/testing/selftests/x86/.gitignore | 1 +
tools/testing/selftests/x86/Makefile | 11 +-
.../testing/selftests/x86/test_munmap_vdso.c | 152 ++++++++++++++++++
3 files changed, 159 insertions(+), 5 deletions(-)
create mode 100644 tools/testing/selftests/x86/test_munmap_vdso.c

diff --git a/tools/testing/selftests/x86/.gitignore b/tools/testing/selftests/x86/.gitignore
index 1aaef5bf119a..9ce8337e8fa0 100644
--- a/tools/testing/selftests/x86/.gitignore
+++ b/tools/testing/selftests/x86/.gitignore
@@ -6,6 +6,7 @@ sysret_ss_attrs
syscall_nt
ptrace_syscall
test_mremap_vdso
+test_munmap_vdso
check_initial_reg_state
sigreturn
ldt_gdt
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 333980375bc7..43016351ddb3 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -10,12 +10,13 @@ CAN_BUILD_I386 := $(shell ./check_cc.sh $(CC) trivial_32bit_program.c -m32)
CAN_BUILD_X86_64 := $(shell ./check_cc.sh $(CC) trivial_64bit_program.c)
CAN_BUILD_WITH_NOPIE := $(shell ./check_cc.sh $(CC) trivial_program.c -no-pie)

-TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap_vdso \
- check_initial_reg_state sigreturn iopl ioperm \
- test_vsyscall mov_ss_trap \
+TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt \
+ test_mremap_vdso test_munmap_vdso \
+ check_initial_reg_state sigreturn iopl ioperm \
+ test_vsyscall mov_ss_trap \
syscall_arg_fault fsgsbase_restore
-TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
- test_FCMOV test_FCOMI test_FISTTP \
+TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
+ test_FCMOV test_FCOMI test_FISTTP \
vdso_restorer
TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering
# Some selftests require 32bit support enabled also on 64bit systems
diff --git a/tools/testing/selftests/x86/test_munmap_vdso.c b/tools/testing/selftests/x86/test_munmap_vdso.c
new file mode 100644
index 000000000000..f56433dae279
--- /dev/null
+++ b/tools/testing/selftests/x86/test_munmap_vdso.c
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * 32/64-bit test to check vDSO munmap.
+ *
+ * Copyright (c) 2021 Dmitry Safonov
+ */
+/*
+ * Can be built statically:
+ * gcc -Os -Wall -static -m32 test_munmap_vdso.c
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+#include <string.h>
+#include <signal.h>	/* kill(), SIGSEGV/SIGKILL used below */
+
+#include <sys/mman.h>
+#include <sys/auxv.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+
+#define PAGE_SIZE 4096
+
+static int try_to_unmap(void *vdso_addr, unsigned long size)
+{
+ int ret;
+
+ printf("[NOTE]\tunmapping vDSO: [%p, %#lx]\n",
+ vdso_addr, (unsigned long)vdso_addr + size);
+ fflush(stdout);
+
+#ifdef __i386__
+ /* vDSO is a landing for fast syscalls - don't use it for munmap() */
+ asm volatile ("int $0x80" : "=a" (ret)
+ : "a" (SYS_munmap),
+ "b" (vdso_addr),
+ "c" (size));
+ errno = -ret;
+#else /* __x86_64__ */
+ ret = munmap(vdso_addr, size);
+#endif
+ if (ret) {
+ if (errno == EINVAL) {
+ printf("[NOTE]\tvDSO partial move failed, will try with bigger size\n");
+ return -1; /* Retry with larger */
+ }
+ printf("[FAIL]\tmunmap failed (%d): %m\n", errno);
+ return 1;
+ }
+
+ return 0;
+}
+
+int main(int argc, char **argv, char **envp)
+{
+ pid_t child;
+
+#ifdef __i386__
+ enum syscall_type_t {
+ INT80, SYSCALL32, SYSENTER
+ } syscall_type = INT80;
+
+ if (argc > 1) {
+ if (!strcmp(argv[1], "syscall32")) {
+ syscall_type = SYSCALL32;
+ printf("[NOTE]\tUsing syscall32 after munmap\n");
+ } else if (!strcmp(argv[1], "sysenter")) {
+ syscall_type = SYSENTER;
+ printf("[NOTE]\tUsing sysenter after munmap\n");
+ }
+ }
+#endif
+
+ child = fork();
+ if (child == -1) {
+ printf("[WARN]\tfailed to fork (%d): %m\n", errno);
+ return 1;
+ }
+
+ if (child == 0) {
+ unsigned long vdso_size = PAGE_SIZE;
+ unsigned long auxval;
+ int ret = -1;
+
+ auxval = getauxval(AT_SYSINFO_EHDR);
+ printf("\tAT_SYSINFO_EHDR is %#lx\n", auxval);
+ if (!auxval || auxval == -ENOENT) {
+ printf("[WARN]\tgetauxval failed\n");
+ return 0;
+ }
+
+ /* Simpler than parsing ELF header */
+ while (ret < 0) {
+ ret = try_to_unmap((void *)auxval, vdso_size);
+ vdso_size += PAGE_SIZE;
+ }
+
+ /* Glibc is likely to explode now - exit with raw syscall */
+#ifdef __i386__
+ switch (syscall_type) {
+ case SYSCALL32:
+ asm volatile ("syscall" : : "a" (__NR_exit), "b" (!!ret));
+ case SYSENTER:
+ asm volatile ("sysenter" : : "a" (__NR_exit), "b" (!!ret));
+ default:
+ case INT80:
+ asm volatile ("int $0x80" : : "a" (__NR_exit), "b" (!!ret));
+ }
+#else /* __x86_64__ */
+ syscall(SYS_exit, ret);
+#endif
+ } else {
+ int status;
+
+ if (waitpid(child, &status, 0) != child) {
+ printf("[FAIL]\tUnexpected child, killing the expected one\n");
+ kill(child, SIGKILL);
+ return 1;
+ }
+
+
+#ifdef __i386__
+ switch (syscall_type) {
+ case SYSCALL32:
+ case SYSENTER:
+ if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV) {
+ printf("[OK]\t32-bit process gets segfault on fast syscall with unmapped vDSO\n");
+ return 0;
+ }
+ default:
+ case INT80:
+ /* same as on x86_64 */
+ }
+#endif
+
+ if (!WIFEXITED(status)) {
+ printf("[FAIL]\tmunmap() of the vDSO does not work on this kernel!\n");
+ if (WIFSIGNALED(status))
+ printf("[FAIL]\tprocess crashed with %s\n",
+ strsignal(WTERMSIG(status)));
+ return 1;
+ } else if (WEXITSTATUS(status) != 0) {
+ printf("[FAIL]\tChild failed with %d\n",
+ WEXITSTATUS(status));
+ return 1;
+ }
+ printf("[OK]\n");
+ }
+
+ return 0;
+}
--
2.31.1

2021-06-11 18:09:38

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH v3 22/23] powerpc/vdso: Migrate native signals to generic vdso_base

Migrate powerpc native signals to the generic way of tracking the
landing vma area. The diffstat speaks for itself.
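
The signal paths then check the generic base directly, e.g. (a sketch
matching the signal_32.c hunk below):
: if (tsk->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE)
:         tramp = VDSO32_SYMBOL(tsk->mm->vdso_base, sigtramp32);
: else
:         tramp = (unsigned long)mctx->mc_pad; /* on-stack trampoline */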

Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/book3s/32/mmu-hash.h | 1 -
arch/powerpc/include/asm/book3s/64/mmu.h | 1 -
arch/powerpc/include/asm/mmu_context.h | 9 ------
arch/powerpc/include/asm/nohash/32/mmu-40x.h | 1 -
arch/powerpc/include/asm/nohash/32/mmu-44x.h | 1 -
arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 1 -
arch/powerpc/include/asm/nohash/mmu-book3e.h | 1 -
arch/powerpc/kernel/signal_32.c | 8 ++---
arch/powerpc/kernel/signal_64.c | 4 +--
arch/powerpc/kernel/vdso.c | 31 +------------------
arch/powerpc/perf/callchain_32.c | 8 ++---
arch/powerpc/perf/callchain_64.c | 4 +--
arch/x86/include/asm/mmu_context.h | 5 ---
include/asm-generic/mm_hooks.h | 9 ++----
mm/mmap.c | 7 -----
16 files changed, 16 insertions(+), 76 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index a9f842230ee4..21e58d145c82 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -145,6 +145,7 @@ config PPC
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE
select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_HAS_VDSO_BASE
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_KEEP_MEMBLOCK
select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index b85f8e114a9c..d5ee68f394d9 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -90,7 +90,6 @@ struct hash_pte {

typedef struct {
unsigned long id;
- void __user *vdso;
} mm_context_t;

void update_bats(void);
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index eace8c3f7b0a..66bcc3ee3add 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -111,7 +111,6 @@ typedef struct {

struct hash_mm_context *hash_context;

- void __user *vdso;
/*
* pagetable fragment support
*/
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 4bc45d3ed8b0..71dedeac7fdb 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,

extern void arch_exit_mmap(struct mm_struct *mm);

-static inline void arch_unmap(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
- unsigned long vdso_base = (unsigned long)mm->context.vdso;
-
- if (start <= vdso_base && vdso_base < end)
- mm->context.vdso = NULL;
-}
-
#ifdef CONFIG_PPC_MEM_KEYS
bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
bool execute, bool foreign);
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-40x.h b/arch/powerpc/include/asm/nohash/32/mmu-40x.h
index 8a8f13a22cf4..366088bb1c3f 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-40x.h
@@ -57,7 +57,6 @@
typedef struct {
unsigned int id;
unsigned int active;
- void __user *vdso;
} mm_context_t;

#endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-44x.h b/arch/powerpc/include/asm/nohash/32/mmu-44x.h
index 2d92a39d8f2e..d67256ab7887 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-44x.h
@@ -108,7 +108,6 @@ extern unsigned int tlb_44x_index;
typedef struct {
unsigned int id;
unsigned int active;
- void __user *vdso;
} mm_context_t;

/* patch sites */
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 6e4faa0a9b35..9e394810faac 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -184,7 +184,6 @@ void mmu_pin_tlb(unsigned long top, bool readonly);
typedef struct {
unsigned int id;
unsigned int active;
- void __user *vdso;
void *pte_frag;
} mm_context_t;

diff --git a/arch/powerpc/include/asm/nohash/mmu-book3e.h b/arch/powerpc/include/asm/nohash/mmu-book3e.h
index e43a418d3ccd..61ac19f315e5 100644
--- a/arch/powerpc/include/asm/nohash/mmu-book3e.h
+++ b/arch/powerpc/include/asm/nohash/mmu-book3e.h
@@ -238,7 +238,6 @@ extern unsigned int tlbcam_index;
typedef struct {
unsigned int id;
unsigned int active;
- void __user *vdso;
} mm_context_t;

/* Page size definitions, common between 32 and 64-bit
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 8f05ed0da292..ae61c480af53 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -824,8 +824,8 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
}

/* Save user registers on the stack */
- if (tsk->mm->context.vdso) {
- tramp = VDSO32_SYMBOL(tsk->mm->context.vdso, sigtramp_rt32);
+ if (tsk->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE) {
+ tramp = VDSO32_SYMBOL(tsk->mm->vdso_base, sigtramp_rt32);
} else {
tramp = (unsigned long)mctx->mc_pad;
/* Set up the sigreturn trampoline: li r0,sigret; sc */
@@ -922,8 +922,8 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
else
unsafe_save_user_regs(regs, mctx, tm_mctx, 1, failed);

- if (tsk->mm->context.vdso) {
- tramp = VDSO32_SYMBOL(tsk->mm->context.vdso, sigtramp32);
+ if (tsk->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE) {
+ tramp = VDSO32_SYMBOL(tsk->mm->vdso_base, sigtramp32);
} else {
tramp = (unsigned long)mctx->mc_pad;
/* Set up the sigreturn trampoline: li r0,sigret; sc */
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index dca66481d0c2..468866dc1e0e 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -906,8 +906,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
tsk->thread.fp_state.fpscr = 0;

/* Set up to return from userspace. */
- if (tsk->mm->context.vdso) {
- regs->nip = VDSO64_SYMBOL(tsk->mm->context.vdso, sigtramp_rt64);
+ if (tsk->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE) {
+ regs->nip = VDSO64_SYMBOL(tsk->mm->vdso_base, sigtramp_rt64);
} else {
err |= setup_trampoline(__NR_rt_sigreturn, &frame->tramp[0]);
if (err)
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 6d6e575630c1..2080a0540537 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -57,29 +57,6 @@ enum vvar_pages {
VVAR_NR_PAGES,
};

-static int vdso_mremap(const struct vm_special_mapping *sm, struct vm_area_struct *new_vma,
- unsigned long text_size)
-{
- unsigned long new_size = new_vma->vm_end - new_vma->vm_start;
-
- if (new_size != text_size)
- return -EINVAL;
-
- current->mm->context.vdso = (void __user *)new_vma->vm_start;
-
- return 0;
-}
-
-static int vdso32_mremap(const struct vm_special_mapping *sm, struct vm_area_struct *new_vma)
-{
- return vdso_mremap(sm, new_vma, &vdso32_end - &vdso32_start);
-}
-
-static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_struct *new_vma)
-{
- return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start);
-}
-
static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf);

@@ -90,12 +67,10 @@ static struct vm_special_mapping vvar_spec __ro_after_init = {

static struct vm_special_mapping vdso32_spec __ro_after_init = {
.name = "[vdso]",
- .mremap = vdso32_mremap,
};

static struct vm_special_mapping vdso64_spec __ro_after_init = {
.name = "[vdso]",
- .mremap = vdso64_mremap,
};

#ifdef CONFIG_TIME_NS
@@ -251,7 +226,7 @@ static int __arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
if (IS_ERR(vma)) {
do_munmap(mm, vdso_base, vvar_size, NULL);
} else {
- mm->context.vdso = (void __user *)vdso_base + vvar_size;
+ mm->vdso_base = (void __user *)vdso_base + vvar_size;
*sysinfo_ehdr = vdso_base + vvar_size;
}

@@ -263,14 +238,10 @@ int arch_setup_additional_pages(unsigned long *sysinfo_ehdr)
struct mm_struct *mm = current->mm;
int rc;

- mm->context.vdso = NULL;
-
if (mmap_write_lock_killable(mm))
return -EINTR;

rc = __arch_setup_additional_pages(sysinfo_ehdr);
- if (rc)
- mm->context.vdso = NULL;

mmap_write_unlock(mm);
return rc;
diff --git a/arch/powerpc/perf/callchain_32.c b/arch/powerpc/perf/callchain_32.c
index b83c47b7947f..c48b63e16603 100644
--- a/arch/powerpc/perf/callchain_32.c
+++ b/arch/powerpc/perf/callchain_32.c
@@ -59,8 +59,8 @@ static int is_sigreturn_32_address(unsigned int nip, unsigned int fp)
{
if (nip == fp + offsetof(struct signal_frame_32, mctx.mc_pad))
return 1;
- if (current->mm->context.vdso &&
- nip == VDSO32_SYMBOL(current->mm->context.vdso, sigtramp32))
+ if (current->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE &&
+ nip == VDSO32_SYMBOL(current->mm->vdso_base, sigtramp32))
return 1;
return 0;
}
@@ -70,8 +70,8 @@ static int is_rt_sigreturn_32_address(unsigned int nip, unsigned int fp)
if (nip == fp + offsetof(struct rt_signal_frame_32,
uc.uc_mcontext.mc_pad))
return 1;
- if (current->mm->context.vdso &&
- nip == VDSO32_SYMBOL(current->mm->context.vdso, sigtramp_rt32))
+ if (current->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE &&
+ nip == VDSO32_SYMBOL(current->mm->vdso_base, sigtramp_rt32))
return 1;
return 0;
}
diff --git a/arch/powerpc/perf/callchain_64.c b/arch/powerpc/perf/callchain_64.c
index 8d0df4226328..ef7116bd525a 100644
--- a/arch/powerpc/perf/callchain_64.c
+++ b/arch/powerpc/perf/callchain_64.c
@@ -68,8 +68,8 @@ static int is_sigreturn_64_address(unsigned long nip, unsigned long fp)
{
if (nip == fp + offsetof(struct signal_frame_64, tramp))
return 1;
- if (current->mm->context.vdso &&
- nip == VDSO64_SYMBOL(current->mm->context.vdso, sigtramp_rt64))
+ if (current->mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE &&
+ nip == VDSO64_SYMBOL(current->mm->vdso_base, sigtramp_rt64))
return 1;
return 0;
}
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 27516046117a..394aeaf136bb 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -190,11 +190,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
}
#endif

-static inline void arch_unmap(struct mm_struct *mm, unsigned long start,
- unsigned long end)
-{
-}
-
/*
* We only want to enforce protection keys on the current process
* because we effectively have no access to PKRU for other
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index 4dbb177d1150..6cd41034743d 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
- * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap
- * and arch_unmap to be included in asm-FOO/mmu_context.h for any
+ * Define generic no-op hooks for arch_dup_mmap() and arch_exit_mmap()
+ * to be included in asm-FOO/mmu_context.h for any
* arch FOO which doesn't need to hook these.
*/
#ifndef _ASM_GENERIC_MM_HOOKS_H
@@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm)
{
}

-static inline void arch_unmap(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
-}
-
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
{
diff --git a/mm/mmap.c b/mm/mmap.c
index 5d1ffce51119..d22eb9ab770c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2821,13 +2821,6 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
if (len == 0)
return -EINVAL;

- /*
- * arch_unmap() might do unmaps itself. It must be called
- * and finish any rbtree manipulation before this code
- * runs and also starts to manipulate the rbtree.
- */
- arch_unmap(mm, start, end);
-
/* Find the first overlapping VMA */
vma = find_vma(mm, start);
if (!vma)
--
2.31.1
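
The callchain hunks above replace the old NULL check on mm->context.vdso
with a comparison against a sentinel. A minimal sketch of the idea,
assuming UNMAPPED_VDSO_BASE expands to TASK_SIZE_MAX as elsewhere in the
series (the helper name is illustrative, not part of the patch):

	/* TASK_SIZE_MAX lies above any possible user mapping, so it can
	 * never be a real vDSO base and is safe as a sentinel. */
	static inline bool vdso_is_mapped(struct mm_struct *mm)
	{
		return mm->vdso_base != (void __user *)UNMAPPED_VDSO_BASE;
	}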

2021-06-11 18:25:30

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH v3 23/23] x86/vdso/selftest: Add a test for unmapping vDSO

On 6/11/21 12:02 PM, Dmitry Safonov wrote:
> Output for landing on x86:
>> [root@localhost ~]# ./test_munmap_vdso_64
>> AT_SYSINFO_EHDR is 0x7fffead9f000
>> [NOTE] unmapping vDSO: [0x7fffead9f000, 0x7fffeada0000]
>> [NOTE] vDSO partial move failed, will try with bigger size
>> [NOTE] unmapping vDSO: [0x7fffead9f000, 0x7fffeada1000]
>> [OK]
>> [root@localhost ~]# ./test_munmap_vdso_32
>> AT_SYSINFO_EHDR is 0xf7eef000
>> [NOTE] unmapping vDSO: [0xf7eef000, 0xf7ef0000]
>> [NOTE] vDSO partial move failed, will try with bigger size
>> [NOTE] unmapping vDSO: [0xf7eef000, 0xf7ef1000]
>> [OK]
>
> The test can also check force_sigsegv(SIGSEGV) in do_fast_syscall_32():
>> [root@localhost ~]# ./test_munmap_vdso_32 sysenter
>> [NOTE] Using sysenter after munmap
>> AT_SYSINFO_EHDR is 0xf7efe000
>> [NOTE] unmapping vDSO: [0xf7efe000, 0xf7eff000]
>> [NOTE] vDSO partial move failed, will try with bigger size
>> [NOTE] unmapping vDSO: [0xf7efe000, 0xf7f00000]
>> [OK] 32-bit process gets segfault on fast syscall with unmapped vDSO
>
> Cc: Shuah Khan <[email protected]>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> tools/testing/selftests/x86/.gitignore | 1 +
> tools/testing/selftests/x86/Makefile | 11 +-
> .../testing/selftests/x86/test_munmap_vdso.c | 151 ++++++++++++++++++
> 3 files changed, 158 insertions(+), 5 deletions(-)
> create mode 100644 tools/testing/selftests/x86/test_munmap_vdso.c
>

I can take this through the kselftest tree for 5.14 - are there any
dependencies on the x86 tree I should be aware of?

thanks,
-- Shuah

2021-06-11 18:40:43

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH v3 23/23] x86/vdso/selftest: Add a test for unmapping vDSO

On 6/11/21 7:21 PM, Shuah Khan wrote:
> On 6/11/21 12:02 PM, Dmitry Safonov wrote:
>> Output for landing on x86:
>>> [root@localhost ~]# ./test_munmap_vdso_64
>>>     AT_SYSINFO_EHDR is 0x7fffead9f000
>>> [NOTE]    unmapping vDSO: [0x7fffead9f000, 0x7fffeada0000]
>>> [NOTE]    vDSO partial move failed, will try with bigger size
>>> [NOTE]    unmapping vDSO: [0x7fffead9f000, 0x7fffeada1000]
>>> [OK]
>>> [root@localhost ~]# ./test_munmap_vdso_32
>>>     AT_SYSINFO_EHDR is 0xf7eef000
>>> [NOTE]    unmapping vDSO: [0xf7eef000, 0xf7ef0000]
>>> [NOTE]    vDSO partial move failed, will try with bigger size
>>> [NOTE]    unmapping vDSO: [0xf7eef000, 0xf7ef1000]
>>> [OK]
>>
>> The test can also check force_sigsegv(SIGSEGV) in do_fast_syscall_32():
>>> [root@localhost ~]# ./test_munmap_vdso_32 sysenter
>>> [NOTE]    Using sysenter after munmap
>>>     AT_SYSINFO_EHDR is 0xf7efe000
>>> [NOTE]    unmapping vDSO: [0xf7efe000, 0xf7eff000]
>>> [NOTE]    vDSO partial move failed, will try with bigger size
>>> [NOTE]    unmapping vDSO: [0xf7efe000, 0xf7f00000]
>>> [OK]    32-bit process gets segfault on fast syscall with unmapped vDSO
>>
>> Cc: Shuah Khan <[email protected]>
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> ---
>>   tools/testing/selftests/x86/.gitignore        |   1 +
>>   tools/testing/selftests/x86/Makefile          |  11 +-
>>   .../testing/selftests/x86/test_munmap_vdso.c  | 151 ++++++++++++++++++
>>   3 files changed, 158 insertions(+), 5 deletions(-)
>>   create mode 100644 tools/testing/selftests/x86/test_munmap_vdso.c
>>
>
> I can take this through the kselftest tree for 5.14 - are there any
> dependencies on the x86 tree I should be aware of?

The test should work without the other patches from the set.
So I guess it's good to go on its own.

The only note I can make here is that, without the previous patches,
this part of the commit message is not exactly precise:
> The test can also check force_sigsegv(SIGSEGV) in
> do_fast_syscall_32()

It will still crash, but not because of kernel enforcement; rather,
by landing on the area where the vDSO was previously mapped.
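
For reference, the core of the test can be sketched in a few lines (a
simplification - the real test also probes the vDSO size by retrying
with a bigger range, as the output above shows):

	#include <stdio.h>
	#include <sys/auxv.h>
	#include <sys/mman.h>

	int main(void)
	{
		/* The vDSO base comes from the auxiliary vector. */
		unsigned long vdso = getauxval(AT_SYSINFO_EHDR);

		printf("AT_SYSINFO_EHDR is %#lx\n", vdso);
		if (munmap((void *)vdso, 0x1000))
			return 1;
		/* From here on, a signal return or fast syscall would
		 * land on the now-unmapped area. */
		return 0;
	}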

Thanks,
Dmitry

2021-06-11 18:45:29

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH v3 23/23] x86/vdso/selftest: Add a test for unmapping vDSO

On 6/11/21 12:37 PM, Dmitry Safonov wrote:
> On 6/11/21 7:21 PM, Shuah Khan wrote:
>> On 6/11/21 12:02 PM, Dmitry Safonov wrote:
>>> Output for landing on x86:
>>>> [root@localhost ~]# ./test_munmap_vdso_64
>>>>     AT_SYSINFO_EHDR is 0x7fffead9f000
>>>> [NOTE]    unmapping vDSO: [0x7fffead9f000, 0x7fffeada0000]
>>>> [NOTE]    vDSO partial move failed, will try with bigger size
>>>> [NOTE]    unmapping vDSO: [0x7fffead9f000, 0x7fffeada1000]
>>>> [OK]
>>>> [root@localhost ~]# ./test_munmap_vdso_32
>>>>     AT_SYSINFO_EHDR is 0xf7eef000
>>>> [NOTE]    unmapping vDSO: [0xf7eef000, 0xf7ef0000]
>>>> [NOTE]    vDSO partial move failed, will try with bigger size
>>>> [NOTE]    unmapping vDSO: [0xf7eef000, 0xf7ef1000]
>>>> [OK]
>>>
>>> The test can also check force_sigsegv(SIGSEGV) in do_fast_syscall_32():
>>>> [root@localhost ~]# ./test_munmap_vdso_32 sysenter
>>>> [NOTE]    Using sysenter after munmap
>>>>     AT_SYSINFO_EHDR is 0xf7efe000
>>>> [NOTE]    unmapping vDSO: [0xf7efe000, 0xf7eff000]
>>>> [NOTE]    vDSO partial move failed, will try with bigger size
>>>> [NOTE]    unmapping vDSO: [0xf7efe000, 0xf7f00000]
>>>> [OK]    32-bit process gets segfault on fast syscall with unmapped vDSO
>>>
>>> Cc: Shuah Khan <[email protected]>
>>> Signed-off-by: Dmitry Safonov <[email protected]>
>>> ---
>>>   tools/testing/selftests/x86/.gitignore        |   1 +
>>>   tools/testing/selftests/x86/Makefile          |  11 +-
>>>   .../testing/selftests/x86/test_munmap_vdso.c  | 151 ++++++++++++++++++
>>>   3 files changed, 158 insertions(+), 5 deletions(-)
>>>   create mode 100644 tools/testing/selftests/x86/test_munmap_vdso.c
>>>
>>
>> I can take this through the kselftest tree for 5.14 - are there any
>> dependencies on the x86 tree I should be aware of?
>
> The test should work without the other patches from the set.
> So I guess it's good to go on its own.
>
> The only note I can make here is that, without the previous patches,
> this part of the commit message is not exactly precise:
>> The test can also check force_sigsegv(SIGSEGV) in
>> do_fast_syscall_32()
>
> It will still crash, but not because of kernel enforcement; rather,
> by landing on the area where the vDSO was previously mapped.
>

Playing it safe, here is my Ack for it to go through the x86 tree:

Acked-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2021-06-15 10:23:18

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 04/23] arm64: Use in_compat_task() in arch_setup_additional_pages()

On Fri, Jun 11, 2021 at 07:02:23PM +0100, Dmitry Safonov wrote:
> Instead of providing compat_arch_setup_additional_pages(), check whether
> the task is compat from its personality, which is set earlier in
> load_elf_binary(). That aligns the code with powerpc and sparc, and it
> allows removing the compat_arch_setup_additional_pages() macro entirely
> after doing the same for x86, simplifying the binfmt code in the end
> and leaving the ELF loader a single function.
>
> Cc: [email protected]
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> arch/arm64/include/asm/elf.h | 5 -----
> arch/arm64/kernel/vdso.c | 21 ++++++++++-----------
> 2 files changed, 10 insertions(+), 16 deletions(-)

Acked-by: Will Deacon <[email protected]>

Will

2021-06-15 12:53:36

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v3 22/23] powerpc/vdso: Migrate native signals to generic vdso_base

Dmitry Safonov <[email protected]> writes:
> A generic way to track the vma area where the task lands.
> The diffstat speaks for itself.
>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/book3s/32/mmu-hash.h | 1 -
> arch/powerpc/include/asm/book3s/64/mmu.h | 1 -
> arch/powerpc/include/asm/mmu_context.h | 9 ------
> arch/powerpc/include/asm/nohash/32/mmu-40x.h | 1 -
> arch/powerpc/include/asm/nohash/32/mmu-44x.h | 1 -
> arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 1 -
> arch/powerpc/include/asm/nohash/mmu-book3e.h | 1 -
> arch/powerpc/kernel/signal_32.c | 8 ++---
> arch/powerpc/kernel/signal_64.c | 4 +--
> arch/powerpc/kernel/vdso.c | 31 +------------------
> arch/powerpc/perf/callchain_32.c | 8 ++---
> arch/powerpc/perf/callchain_64.c | 4 +--
> arch/x86/include/asm/mmu_context.h | 5 ---
> include/asm-generic/mm_hooks.h | 9 ++----
> mm/mmap.c | 7 -----
> 16 files changed, 16 insertions(+), 76 deletions(-)

LGTM.

Acked-by: Michael Ellerman <[email protected]> (powerpc)

cheers

2021-06-17 06:32:25

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 22/23] powerpc/vdso: Migrate native signals to generic vdso_base



On 11/06/2021 at 20:02, Dmitry Safonov wrote:
> A generic way to track the vma area where the task lands.
> The diffstat speaks for itself.
>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---

> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 4bc45d3ed8b0..71dedeac7fdb 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
>
> extern void arch_exit_mmap(struct mm_struct *mm);
>
> -static inline void arch_unmap(struct mm_struct *mm,
> - unsigned long start, unsigned long end)
> -{
> - unsigned long vdso_base = (unsigned long)mm->context.vdso;
> -
> - if (start <= vdso_base && vdso_base < end)
> - mm->context.vdso = NULL;
> -}
> -
> #ifdef CONFIG_PPC_MEM_KEYS
> bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
> bool execute, bool foreign);

powerpc was the only user of arch_unmap().

We should get rid of it completely (remove the stubs in
arch/x86/include/asm/mmu_context.h and include/asm-generic/mm_hooks.h,
and remove the call in mm/mmap.c).

2021-06-17 06:38:38

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 22/23] powerpc/vdso: Migrate native signals to generic vdso_base



On 11/06/2021 at 20:02, Dmitry Safonov wrote:
> A generic way to track the vma area where the task lands.
> The diffstat speaks for itself.
>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Signed-off-by: Dmitry Safonov <[email protected]>


Build failure:

CC arch/powerpc/kernel/asm-offsets.s
In file included from ./include/linux/mmzone.h:21,
from ./include/linux/gfp.h:6,
from ./include/linux/xarray.h:14,
from ./include/linux/radix-tree.h:19,
from ./include/linux/fs.h:15,
from ./include/linux/compat.h:17,
from arch/powerpc/kernel/asm-offsets.c:14:
./include/linux/mm_types.h: In function 'init_vdso_base':
./include/linux/mm_types.h:522:28: error: 'TASK_SIZE_MAX' undeclared (first use in this function);
did you mean 'XATTR_SIZE_MAX'?
522 | #define UNMAPPED_VDSO_BASE TASK_SIZE_MAX
| ^~~~~~~~~~~~~
./include/linux/mm_types.h:627:40: note: in expansion of macro 'UNMAPPED_VDSO_BASE'
627 | mm->vdso_base = (void __user *)UNMAPPED_VDSO_BASE;
| ^~~~~~~~~~~~~~~~~~
./include/linux/mm_types.h:522:28: note: each undeclared identifier is reported only once for each
function it appears in
522 | #define UNMAPPED_VDSO_BASE TASK_SIZE_MAX
| ^~~~~~~~~~~~~
./include/linux/mm_types.h:627:40: note: in expansion of macro 'UNMAPPED_VDSO_BASE'
627 | mm->vdso_base = (void __user *)UNMAPPED_VDSO_BASE;
| ^~~~~~~~~~~~~~~~~~
make[2]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1
make[1]: *** [prepare0] Error 2
make: *** [__sub-make] Error 2

2021-06-17 07:24:14

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 13/23] mm/mmap: Make vm_special_mapping::mremap return void



On 11/06/2021 at 20:02, Dmitry Safonov wrote:
> Previously, the .mremap() callback had to return (int) to provide
> a way to restrict resizing of a special mapping. Now that is
> restricted by providing .may_split = special_mapping_split.
>
> Removing the (int) return simplifies further changes to
> special_mapping_mremap(), as it won't need to save the return code
> from the callback. It also removes a needless `return 0` from the
> callbacks.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> arch/arm/kernel/process.c | 3 +--
> arch/arm64/kernel/vdso.c | 4 +---
> arch/mips/vdso/genvdso.c | 3 +--
> arch/x86/entry/vdso/vma.c | 4 +---
> include/linux/mm_types.h | 4 ++--
> mm/mmap.c | 2 +-
> 6 files changed, 7 insertions(+), 13 deletions(-)
>

Build failure:

CC arch/powerpc/kernel/vdso.o
arch/powerpc/kernel/vdso.c:93:19: error: initialization of 'void (*)(const struct vm_special_mapping
*, struct vm_area_struct *)' from incompatible pointer type 'int (*)(const struct vm_special_mapping
*, struct vm_area_struct *)' [-Werror=incompatible-pointer-types]
93 | .mremap = vdso32_mremap,
| ^~~~~~~~~~~~~
arch/powerpc/kernel/vdso.c:93:19: note: (near initialization for 'vdso32_spec.mremap')
arch/powerpc/kernel/vdso.c:98:19: error: initialization of 'void (*)(const struct vm_special_mapping
*, struct vm_area_struct *)' from incompatible pointer type 'int (*)(const struct vm_special_mapping
*, struct vm_area_struct *)' [-Werror=incompatible-pointer-types]
98 | .mremap = vdso64_mremap,
| ^~~~~~~~~~~~~
arch/powerpc/kernel/vdso.c:98:19: note: (near initialization for 'vdso64_spec.mremap')
cc1: all warnings being treated as errors
make[3]: *** [arch/powerpc/kernel/vdso.o] Error 1
make[2]: *** [arch/powerpc/kernel] Error 2
make[1]: *** [arch/powerpc] Error 2
make: *** [__sub-make] Error 2
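
The fixup this asks for on the powerpc side looks mechanical - a sketch,
assuming it simply mirrors the void-returning callbacks the patch
introduces for the other architectures (likewise for vdso64_mremap()):

	static void vdso32_mremap(const struct vm_special_mapping *sm,
				  struct vm_area_struct *new_vma)
	{
		vdso_mremap(sm, new_vma, &vdso32_end - &vdso32_start);
	}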

2021-06-17 07:38:18

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 22/23] powerpc/vdso: Migrate native signals to generic vdso_base



On 17/06/2021 at 08:36, Christophe Leroy wrote:
>
>
> On 11/06/2021 at 20:02, Dmitry Safonov wrote:
>> A generic way to track the vma area where the task lands.
>> The diffstat speaks for itself.
>>
>> Cc: Benjamin Herrenschmidt <[email protected]>
>> Cc: Michael Ellerman <[email protected]>
>> Cc: Paul Mackerras <[email protected]>
>> Signed-off-by: Dmitry Safonov <[email protected]>
>
>
> Build failure:
>
>   CC      arch/powerpc/kernel/asm-offsets.s
> In file included from ./include/linux/mmzone.h:21,
>                  from ./include/linux/gfp.h:6,
>                  from ./include/linux/xarray.h:14,
>                  from ./include/linux/radix-tree.h:19,
>                  from ./include/linux/fs.h:15,
>                  from ./include/linux/compat.h:17,
>                  from arch/powerpc/kernel/asm-offsets.c:14:
> ./include/linux/mm_types.h: In function 'init_vdso_base':
> ./include/linux/mm_types.h:522:28: error: 'TASK_SIZE_MAX' undeclared (first use in this function);
> did you mean 'XATTR_SIZE_MAX'?
>   522 | #define UNMAPPED_VDSO_BASE TASK_SIZE_MAX
>       |                            ^~~~~~~~~~~~~
> ./include/linux/mm_types.h:627:40: note: in expansion of macro 'UNMAPPED_VDSO_BASE'
>   627 |         mm->vdso_base = (void __user *)UNMAPPED_VDSO_BASE;
>       |                                        ^~~~~~~~~~~~~~~~~~
> ./include/linux/mm_types.h:522:28: note: each undeclared identifier is reported only once for each
> function it appears in
>   522 | #define UNMAPPED_VDSO_BASE TASK_SIZE_MAX
>       |                            ^~~~~~~~~~~~~
> ./include/linux/mm_types.h:627:40: note: in expansion of macro 'UNMAPPED_VDSO_BASE'
>   627 |         mm->vdso_base = (void __user *)UNMAPPED_VDSO_BASE;
>       |                                        ^~~~~~~~~~~~~~~~~~
> make[2]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1
> make[1]: *** [prepare0] Error 2
> make: *** [__sub-make] Error 2
>

Fixed by moving TASK_SIZE_MAX into asm/task_size_32.h and asm/task_size_64.h

diff --git a/arch/powerpc/include/asm/task_size_32.h b/arch/powerpc/include/asm/task_size_32.h
index de7290ee770f..03af9e6bb5cd 100644
--- a/arch/powerpc/include/asm/task_size_32.h
+++ b/arch/powerpc/include/asm/task_size_32.h
@@ -7,6 +7,7 @@
#endif

#define TASK_SIZE (CONFIG_TASK_SIZE)
+#define TASK_SIZE_MAX TASK_SIZE

/*
* This decides where the kernel will search for a free chunk of vm space during
diff --git a/arch/powerpc/include/asm/task_size_64.h b/arch/powerpc/include/asm/task_size_64.h
index c993482237ed..bfdb98c0ef43 100644
--- a/arch/powerpc/include/asm/task_size_64.h
+++ b/arch/powerpc/include/asm/task_size_64.h
@@ -49,6 +49,7 @@
TASK_SIZE_USER64)

#define TASK_SIZE TASK_SIZE_OF(current)
+#define TASK_SIZE_MAX TASK_SIZE_USER64

#define TASK_UNMAPPED_BASE_USER32 (PAGE_ALIGN(TASK_SIZE_USER32 / 4))
#define TASK_UNMAPPED_BASE_USER64 (PAGE_ALIGN(DEFAULT_MAP_WINDOW_USER64 / 4))
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 22c79ab40006..5823140d39f1 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -8,13 +8,6 @@
#include <asm/extable.h>
#include <asm/kup.h>

-#ifdef __powerpc64__
-/* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
-#define TASK_SIZE_MAX TASK_SIZE_USER64
-#else
-#define TASK_SIZE_MAX TASK_SIZE
-#endif
-
static inline bool __access_ok(unsigned long addr, unsigned long size)
{
return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;
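
Moving the definition into the task_size headers makes TASK_SIZE_MAX
visible wherever TASK_SIZE already is, which is presumably what the
generic UNMAPPED_VDSO_BASE in linux/mm_types.h needs, without dragging
asm/uaccess.h into it.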

2021-06-17 09:52:21

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking



On 11/06/2021 at 20:02, Dmitry Safonov wrote:
> v3 Changes:
> - Migrated arch/powerpc to vdso_base
> - Added x86/selftest for unmapped vdso & no landing on fast syscall
> - Review comments from Andy & Christophe (thanks!)
> - Amended s/born process/execed process/ everywhere I noticed
> - Build robot warning on cast from __user pointer
>
> I've tested it on x86, I would appreciate any help with
> Tested-by on arm/arm64/mips/powerpc/s390/... platforms.

I tried it on powerpc; normal use still works.

What tests can be done exactly?

We have a selftest in powerpc
(https://github.com/linuxppc/linux/blob/master/tools/testing/selftests/powerpc/signal/sigreturn_vdso.c)
but it doesn't work anymore since the split of VDSO into VDSO+VVAR.

Christophe

2021-06-21 21:13:34

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH v3 13/23] mm/mmap: Make vm_special_mapping::mremap return void

On 6/17/21 8:20 AM, Christophe Leroy wrote:
>
>
> On 11/06/2021 at 20:02, Dmitry Safonov wrote:
>> Previously, the .mremap() callback had to return (int) to provide
>> a way to restrict resizing of a special mapping. Now that is
>> restricted by providing .may_split = special_mapping_split.
>>
>> Removing the (int) return simplifies further changes to
>> special_mapping_mremap(), as it won't need to save the return code
>> from the callback. It also removes a needless `return 0` from the
>> callbacks.
>>
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> ---
>>   arch/arm/kernel/process.c | 3 +--
>>   arch/arm64/kernel/vdso.c  | 4 +---
>>   arch/mips/vdso/genvdso.c  | 3 +--
>>   arch/x86/entry/vdso/vma.c | 4 +---
>>   include/linux/mm_types.h  | 4 ++--
>>   mm/mmap.c                 | 2 +-
>>   6 files changed, 7 insertions(+), 13 deletions(-)
>>
>
> Build failure:
>
>   CC      arch/powerpc/kernel/vdso.o
> arch/powerpc/kernel/vdso.c:93:19: error: initialization of 'void
> (*)(const struct vm_special_mapping *, struct vm_area_struct *)' from
> incompatible pointer type 'int (*)(const struct vm_special_mapping *,
> struct vm_area_struct *)' [-Werror=incompatible-pointer-types]
>    93 |         .mremap = vdso32_mremap,
>       |                   ^~~~~~~~~~~~~
> arch/powerpc/kernel/vdso.c:93:19: note: (near initialization for
> 'vdso32_spec.mremap')
> arch/powerpc/kernel/vdso.c:98:19: error: initialization of 'void
> (*)(const struct vm_special_mapping *, struct vm_area_struct *)' from
> incompatible pointer type 'int (*)(const struct vm_special_mapping *,
> struct vm_area_struct *)' [-Werror=incompatible-pointer-types]
>    98 |         .mremap = vdso64_mremap,
>       |                   ^~~~~~~~~~~~~
> arch/powerpc/kernel/vdso.c:98:19: note: (near initialization for
> 'vdso64_spec.mremap')
> cc1: all warnings being treated as errors
> make[3]: *** [arch/powerpc/kernel/vdso.o] Error 1
> make[2]: *** [arch/powerpc/kernel] Error 2
> make[1]: *** [arch/powerpc] Error 2
> make: *** [__sub-make] Error 2
>

Thanks for reporting, Christophe, I'll fix that in v4.

Thanks,
Dmitry

2021-06-21 21:23:44

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH v3 22/23] powerpc/vdso: Migrate native signals to generic vdso_base

Hi Christophe,

On 6/17/21 8:34 AM, Christophe Leroy wrote:
>
>
> On 17/06/2021 at 08:36, Christophe Leroy wrote:
>>
>>
>> On 11/06/2021 at 20:02, Dmitry Safonov wrote:
>>> A generic way to track the vma area where the task lands.
>>> The diffstat speaks for itself.
>>>
>>> Cc: Benjamin Herrenschmidt <[email protected]>
>>> Cc: Michael Ellerman <[email protected]>
>>> Cc: Paul Mackerras <[email protected]>
>>> Signed-off-by: Dmitry Safonov <[email protected]>
>>
>>
>> Build failure:
>>
>>    CC      arch/powerpc/kernel/asm-offsets.s
>> In file included from ./include/linux/mmzone.h:21,
>>                   from ./include/linux/gfp.h:6,
>>                   from ./include/linux/xarray.h:14,
>>                   from ./include/linux/radix-tree.h:19,
>>                   from ./include/linux/fs.h:15,
>>                   from ./include/linux/compat.h:17,
>>                   from arch/powerpc/kernel/asm-offsets.c:14:
>> ./include/linux/mm_types.h: In function 'init_vdso_base':
>> ./include/linux/mm_types.h:522:28: error: 'TASK_SIZE_MAX' undeclared
>> (first use in this function); did you mean 'XATTR_SIZE_MAX'?
>>    522 | #define UNMAPPED_VDSO_BASE TASK_SIZE_MAX
>>        |                            ^~~~~~~~~~~~~
>> ./include/linux/mm_types.h:627:40: note: in expansion of macro
>> 'UNMAPPED_VDSO_BASE'
>>    627 |         mm->vdso_base = (void __user *)UNMAPPED_VDSO_BASE;
>>        |                                        ^~~~~~~~~~~~~~~~~~
>> ./include/linux/mm_types.h:522:28: note: each undeclared identifier is
>> reported only once for each function it appears in
>>    522 | #define UNMAPPED_VDSO_BASE TASK_SIZE_MAX
>>        |                            ^~~~~~~~~~~~~
>> ./include/linux/mm_types.h:627:40: note: in expansion of macro
>> 'UNMAPPED_VDSO_BASE'
>>    627 |         mm->vdso_base = (void __user *)UNMAPPED_VDSO_BASE;
>>        |                                        ^~~~~~~~~~~~~~~~~~
>> make[2]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1
>> make[1]: *** [prepare0] Error 2
>> make: *** [__sub-make] Error 2
>>
>
> Fixed by moving TASK_SIZE_MAX into asm/task_size_32.h and
> asm/task_size_64.h
>
> diff --git a/arch/powerpc/include/asm/task_size_32.h b/arch/powerpc/include/asm/task_size_32.h
> index de7290ee770f..03af9e6bb5cd 100644
> --- a/arch/powerpc/include/asm/task_size_32.h
> +++ b/arch/powerpc/include/asm/task_size_32.h
> @@ -7,6 +7,7 @@
>  #endif
>
>  #define TASK_SIZE (CONFIG_TASK_SIZE)
> +#define TASK_SIZE_MAX	TASK_SIZE
>
>  /*
>   * This decides where the kernel will search for a free chunk of vm space during
> diff --git a/arch/powerpc/include/asm/task_size_64.h b/arch/powerpc/include/asm/task_size_64.h
> index c993482237ed..bfdb98c0ef43 100644
> --- a/arch/powerpc/include/asm/task_size_64.h
> +++ b/arch/powerpc/include/asm/task_size_64.h
> @@ -49,6 +49,7 @@
>  			TASK_SIZE_USER64)
>
>  #define TASK_SIZE TASK_SIZE_OF(current)
> +#define TASK_SIZE_MAX	TASK_SIZE_USER64
>
>  #define TASK_UNMAPPED_BASE_USER32 (PAGE_ALIGN(TASK_SIZE_USER32 / 4))
>  #define TASK_UNMAPPED_BASE_USER64 (PAGE_ALIGN(DEFAULT_MAP_WINDOW_USER64 / 4))
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 22c79ab40006..5823140d39f1 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -8,13 +8,6 @@
>  #include <asm/extable.h>
>  #include <asm/kup.h>
>
> -#ifdef __powerpc64__
> -/* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
> -#define TASK_SIZE_MAX	TASK_SIZE_USER64
> -#else
> -#define TASK_SIZE_MAX	TASK_SIZE
> -#endif
> -
>  static inline bool __access_ok(unsigned long addr, unsigned long size)
>  {
>  	return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;


Thanks, that's very helpful!

--
Dmitry

2021-06-21 21:59:39

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

On 6/17/21 10:13 AM, Christophe Leroy wrote:
>
>
> On 11/06/2021 at 20:02, Dmitry Safonov wrote:
>> v3 Changes:
>> - Migrated arch/powerpc to vdso_base
>> - Added x86/selftest for unmapped vdso & no landing on fast syscall
>> - Review comments from Andy & Christophe (thanks!)
>> - Amended s/born process/execed process/ everywhere I noticed
>> - Build robot warning on cast from __user pointer
>>
>> I've tested it on x86, I would appreciate any help with
>> Tested-by on arm/arm64/mips/powerpc/s390/... platforms.
>
> I tried it on powerpc; normal use still works.

Thank you!

> What tests can be done exactly?

Well, for x86 I've run all the vdso tests from tools/testing/selftests/x86/
(including the new one from patch 23 here, which tests exactly the
forced segfault on an unmapped vDSO).
Normal use on the other platforms sounds good to me.
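
For reference, they can be built and run through kselftest in the usual
way (assuming nothing unusual about the setup):

	make -C tools/testing/selftests TARGETS=x86 run_tests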

>
> We have a selftest in powerpc
> (https://github.com/linuxppc/linux/blob/master/tools/testing/selftests/powerpc/signal/sigreturn_vdso.c)
> but it doesn't work anymore since the split of VDSO into VDSO+VVAR.

Well, it doesn't sound very hard to fix; see the sample diff inline.

Thank you,
Dmitry

--->8---
diff --git a/tools/testing/selftests/powerpc/signal/sigreturn_vdso.c b/tools/testing/selftests/powerpc/signal/sigreturn_vdso.c
index e282fff0fe25..a4f85ee13c4a 100644
--- a/tools/testing/selftests/powerpc/signal/sigreturn_vdso.c
+++ b/tools/testing/selftests/powerpc/signal/sigreturn_vdso.c
@@ -13,6 +13,7 @@
 #include <signal.h>
 #include <stdlib.h>
 #include <string.h>
+#include <stdbool.h>
 #include <sys/mman.h>
 #include <sys/types.h>
 #include <unistd.h>
@@ -23,7 +24,7 @@
 
 #include "utils.h"
 
-static int search_proc_maps(char *needle, unsigned long *low, unsigned long *high)
+static int search_proc_maps(char *needle, unsigned long *low, unsigned long *high, unsigned long *size)
 {
 	unsigned long start, end;
 	static char buf[4096];
@@ -52,6 +53,7 @@ static int search_proc_maps(char *needle, unsigned long *low, unsigned long *hig
 		if (strstr(name, needle)) {
 			*low = start;
 			*high = end - 1;
+			*size = end - start;
 			rc = 0;
 			break;
 		}
@@ -71,9 +73,12 @@ static void sigusr1_handler(int sig)
 
 int test_sigreturn_vdso(void)
 {
-	unsigned long low, high, size;
+	unsigned long stack_start, stack_end, stack_size;
+	unsigned long vdso_start, vdso_end, vdso_size;
+	unsigned long vvar_start, vvar_end, vvar_size;
+	char *vdso_parking, *vvar_parking;
 	struct sigaction act;
-	char *p;
+	bool vvar_present;
 
 	act.sa_handler = sigusr1_handler;
 	act.sa_flags = 0;
@@ -82,36 +87,56 @@ int test_sigreturn_vdso(void)
 	assert(sigaction(SIGUSR1, &act, NULL) == 0);
 
 	// Confirm the VDSO is mapped, and work out where it is
-	assert(search_proc_maps("[vdso]", &low, &high) == 0);
-	size = high - low + 1;
-	printf("VDSO is at 0x%lx-0x%lx (%lu bytes)\n", low, high, size);
+	assert(search_proc_maps("[vdso]", &vdso_start, &vdso_end, &vdso_size) == 0);
+	printf("VDSO is at 0x%lx-0x%lx (%lu bytes)\n", vdso_start, vdso_end, vdso_size);
+	// On older kernels there's only vdso, on newer vdso/vvar pair
+	if (search_proc_maps("[vvar]", &vvar_start, &vvar_end, &vvar_size) == 0) {
+		vvar_present = true;
+		printf("VVAR is at 0x%lx-0x%lx (%lu bytes)\n",
+		       vvar_start, vvar_end, vvar_size);
+	} else {
+		vvar_present = false;
+		vvar_size = 0;
+	}
 
 	kill(getpid(), SIGUSR1);
 	assert(took_signal == 1);
 	printf("Signal delivered OK with VDSO mapped\n");
 
-	// Remap the VDSO somewhere else
-	p = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-	assert(p != MAP_FAILED);
-	assert(mremap((void *)low, size, size, MREMAP_MAYMOVE|MREMAP_FIXED, p) != MAP_FAILED);
-	assert(search_proc_maps("[vdso]", &low, &high) == 0);
-	size = high - low + 1;
-	printf("VDSO moved to 0x%lx-0x%lx (%lu bytes)\n", low, high, size);
+	// Remap the VDSO and VVAR somewhere else
+	vdso_parking = mmap(NULL, vdso_size + vvar_size, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	assert(vdso_parking != MAP_FAILED);
+
+	if (vvar_present) {
+		// The relative position of vdso/vvar must always stay the same
+		if (vvar_start > vdso_start) {
+			vvar_parking = vdso_parking + vdso_size;
+		} else {
+			vvar_parking = vdso_parking;
+			vdso_parking = vvar_parking + vvar_size;
+		}
+		assert(mremap((void *)vvar_start, vvar_size, vvar_size, MREMAP_MAYMOVE|MREMAP_FIXED, vvar_parking) != MAP_FAILED);
+	}
+	assert(mremap((void *)vdso_start, vdso_size, vdso_size, MREMAP_MAYMOVE|MREMAP_FIXED, vdso_parking) != MAP_FAILED);
+
+	assert(search_proc_maps("[vdso]", &vdso_start, &vdso_end, &vdso_size) == 0);
+	printf("VDSO moved to 0x%lx-0x%lx (%lu bytes)\n", vdso_start, vdso_end, vdso_size);
+	assert(search_proc_maps("[vvar]", &vvar_start, &vvar_end, &vvar_size) == 0);
+	printf("VVAR moved to 0x%lx-0x%lx (%lu bytes)\n", vvar_start, vvar_end, vvar_size);
 
 	kill(getpid(), SIGUSR1);
 	assert(took_signal == 2);
 	printf("Signal delivered OK with VDSO moved\n");
 
-	assert(munmap((void *)low, size) == 0);
+	assert(munmap((void *)vdso_start, vdso_size) == 0);
 	printf("Unmapped VDSO\n");
 
 	// Confirm the VDSO is not mapped anymore
-	assert(search_proc_maps("[vdso]", &low, &high) != 0);
+	assert(search_proc_maps("[vdso]", &vdso_start, &vdso_end, &vdso_size) != 0);
 
 	// Make the stack executable
-	assert(search_proc_maps("[stack]", &low, &high) == 0);
-	size = high - low + 1;
-	mprotect((void *)low, size, PROT_READ|PROT_WRITE|PROT_EXEC);
+	assert(search_proc_maps("[stack]", &stack_start, &stack_end, &stack_size) == 0);
+	mprotect((void *)stack_start, stack_size, PROT_READ|PROT_WRITE|PROT_EXEC);
 	printf("Remapped the stack executable\n");
 
 	kill(getpid(), SIGUSR1);
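
One caveat with the sketch: the "[vvar]" lookup after the move runs
unconditionally, so on a pre-VVAR kernel it would need the same
vvar_present guard as the remap itself.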

2022-03-09 16:24:03

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

Hi Dmitry,

I'm wondering about the status of this series.

Wondering what to do while reviewing pending powerpc patches and
especially
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/[email protected]/


Christophe

On 11/06/2021 at 20:02, Dmitry Safonov wrote:
> v3 Changes:
> - Migrated arch/powerpc to vdso_base
> - Added x86/selftest for unmapped vdso & no landing on fast syscall
> - Review comments from Andy & Christophe (thanks!)
> - Amended s/born process/execed process/ everywhere I noticed
> - Build robot warning on cast from __user pointer
>
> I've tested it on x86, I would appreciate any help with
> Tested-by on arm/arm64/mips/powerpc/s390/... platforms.
>
> One thing I've noticed while cooking this and haven't found a clean
> way to solve is zero-terminated .pages[] array in vdso mappings, which
> is not always zero-terminated but works by the reason of
> VM_DONTEXPAND on mappings.
>
> v2 Changes:
> - Rename user_landing to vdso_base as it tracks vDSO VMA start address,
> rather than the explicit address to land (Andy)
> - Reword and don't use "new-execed" and "new-born" task (Andy)
> - Fix failures reported by build robot
>
> Started from discussion [1], where was noted that currently a couple of
> architectures support mremap() for vdso/sigpage, but not munmap().
> If an application maps something on the ex-place of vdso/sigpage,
> later after processing signal it will land there (good luck!)
>
> Patches set is based on linux-next (next-20201123) and it depends on
> changes in x86/cleanups (those reclaim TIF_IA32/TIF_X32) and also
> on my changes in akpm (fixing several mremap() issues).
>
> Logically, the patches set divides on:
> - patch 1: a cleanup for patches in x86/cleanups
> - patches 2-13: cleanups for arch_setup_additional_pages()
> - patches 13-14: x86 signal changes for unmapped vdso
> - patches 15-22: provide generic vdso_base in mm_struct
> - patch 23: selftest for unmapped vDSO & fast syscalls
>
> In the end, besides cleanups, it's now more predictable what happens for
> applications with unmapped vdso on architectures those support .mremap()
> for vdso/sigpage.
>
> I'm aware of only one user that unmaps vdso - Valgrind [2].
> (there possibly are more, but this one is "special", it unmaps vdso, but
> not vvar, which confuses CRIU [Checkpoint Restore In Userspace], that's
> why I'm aware of it)
>


2022-03-11 23:26:09

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

Hi Christophe,

On 3/9/22 15:41, Christophe Leroy wrote:
> Hi Dmitry,
>
> I'm wondering about the status of this series.

Yeah, I plan to work on v4 addressing the reviews.
WFH has quite affected my work on side projects, and I've laid this
patch set aside for a while; it touches every architecture and is,
besides, a bit challenging to upstream.

> Wondering what to do while reviewing pending powerpc patches and
> especially
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/[email protected]/

Please go ahead with that - I'll base the v4 patches on top of it.
Thanks for pinging me about this.

> Christophe
>
> On 11/06/2021 at 20:02, Dmitry Safonov wrote:
>> v3 Changes:
>> - Migrated arch/powerpc to vdso_base
>> - Added x86/selftest for unmapped vdso & no landing on fast syscall
>> - Review comments from Andy & Christophe (thanks!)
>> - Amended s/born process/execed process/ everywhere I noticed
>> - Build robot warning on cast from __user pointer
>>
>> I've tested it on x86, I would appreciate any help with
>> Tested-by on arm/arm64/mips/powerpc/s390/... platforms.
>>
>> One thing I've noticed while cooking this and haven't found a clean
>> way to solve is zero-terminated .pages[] array in vdso mappings, which
>> is not always zero-terminated but works by the reason of
>> VM_DONTEXPAND on mappings.
>>
>> v2 Changes:
>> - Rename user_landing to vdso_base as it tracks vDSO VMA start address,
>>    rather than the explicit address to land (Andy)
>> - Reword and don't use "new-execed" and "new-born" task (Andy)
>> - Fix failures reported by build robot
>>
>> Started from discussion [1], where was noted that currently a couple of
>> architectures support mremap() for vdso/sigpage, but not munmap().
>> If an application maps something on the ex-place of vdso/sigpage,
>> later after processing signal it will land there (good luck!)
>>
>> Patches set is based on linux-next (next-20201123) and it depends on
>> changes in x86/cleanups (those reclaim TIF_IA32/TIF_X32) and also
>> on my changes in akpm (fixing several mremap() issues).
>>
>> Logically, the patches set divides on:
>> - patch       1: a cleanup for patches in x86/cleanups
>> - patches  2-13: cleanups for arch_setup_additional_pages()
>> - patches 13-14: x86 signal changes for unmapped vdso
>> - patches 15-22: provide generic vdso_base in mm_struct
>> - patch      23: selftest for unmapped vDSO & fast syscalls
>>
>> In the end, besides cleanups, it's now more predictable what happens for
>> applications with unmapped vdso on architectures those support .mremap()
>> for vdso/sigpage.
>>
>> I'm aware of only one user that unmaps vdso - Valgrind [2].
>> (there possibly are more, but this one is "special", it unmaps vdso, but
>>   not vvar, which confuses CRIU [Checkpoint Restore In Userspace], that's
>>   why I'm aware of it)
>>
>


Thanks,
Dmitry

2022-08-19 09:33:22

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

Hi Dmitry,

On 10/03/2022 at 22:17, Dmitry Safonov wrote:
> Hi Christophe,
>
> On 3/9/22 15:41, Christophe Leroy wrote:
>> Hi Dmitry,
>>
>> I'm wondering about the status of this series.
>
> Yeah, I plan to work on v4 addressing the reviews.
> WFH has quite affected my work on side projects, and I've laid this
> patch set aside for a while; it touches every architecture and is,
> besides, a bit challenging to upstream.

Any progress?

Thanks
Christophe

2022-08-23 20:50:05

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

Hi Christophe,

On 8/19/22 10:17, Christophe Leroy wrote:
> Hi Dmitry,
>
>> On 10/03/2022 at 22:17, Dmitry Safonov wrote:
>> Hi Christophe,
>>
>> On 3/9/22 15:41, Christophe Leroy wrote:
>>> Hi Dmitry,
>>>
>>> I'm wondering about the status of this series.
>>
>> Yeah, I plan to work on v4 addressing the reviews.
>> WFH has quite affected my work on side projects, and I've laid this
>> patch set aside for a while; it touches every architecture and is,
>> besides, a bit challenging to upstream.
>
> Any progress?

Yeah, I'm back in the office, so I will spend some time on v4 - thanks
for pinging.

Thank you,
Dmitry

2023-10-11 10:29:25

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

Hi Dmitry,

On 23/08/2022 at 21:13, Dmitry Safonov wrote:
> Hi Christophe,
>
> On 8/19/22 10:17, Christophe Leroy wrote:
>> Hi Dmitry,
>>
>>> On 10/03/2022 at 22:17, Dmitry Safonov wrote:
>>> Hi Christophe,
>>>
>>> On 3/9/22 15:41, Christophe Leroy wrote:
>>>> Hi Dmitry,
>>>>
>>>> I'm wondering about the status of this series.
>>>
>>> Yeah, I plan to work on v4 addressing the reviews.
>>> WFH has quite affected my work on side projects, and I've laid this
>>> patch set aside for a while; it touches every architecture and is,
>>> besides, a bit challenging to upstream.
>>
>> Any progress?
>
> Yeah, I'm back in the office, so I will spend some time on v4 - thanks
> for pinging.
>

I haven't seen a v4 - did I miss it?

Christophe

2023-10-11 23:21:12

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v3 00/23] Add generic vdso_base tracking

On 6/11/21 11:02, Dmitry Safonov wrote:
>
> Patches set is based on linux-next (next-20201123) and it depends on
> changes in x86/cleanups (those reclaim TIF_IA32/TIF_X32) and also
> on my changes in akpm (fixing several mremap() issues).
>
> Logically, the patches set divides on:
> - patch 1: a cleanup for patches in x86/cleanups
> - patches 2-13: cleanups for arch_setup_additional_pages()
> - patches 13-14: x86 signal changes for unmapped vdso
> - patches 15-22: provide generic vdso_base in mm_struct
> - patch 23: selftest for unmapped vDSO & fast syscalls
>
> In the end, besides cleanups, it's now more predictable what happens for
> applications with unmapped vdso on architectures those support .mremap()
> for vdso/sigpage.
>
> I'm aware of only one user that unmaps vdso - Valgrind [2].
> (there possibly are more, but this one is "special", it unmaps vdso, but
> not vvar, which confuses CRIU [Checkpoint Restore In Userspace], that's
> why I'm aware of it)
>

There was another discussion that might be relevant: actually
associating the vdso with a real file, and allowing a program to map
said file normally if it wants access to one that it normally wouldn't
have (say, /proc/vdso/x86_64.so versus /proc/vdso/i386.so on the same
system).

The "catch", of course, is that this file will need to be mapped as
MAP_SHARED because of vdso data.
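
A sketch of what that could look like from userspace - entirely
hypothetical, as neither the /proc paths nor the mapping semantics
exist today, and the size here is just a placeholder:

	#include <fcntl.h>
	#include <stddef.h>
	#include <sys/mman.h>
	#include <unistd.h>

	static void *map_alt_vdso(void)
	{
		/* Hypothetical: map the 32-bit vDSO image from a
		 * 64-bit process. */
		int fd = open("/proc/vdso/i386.so", O_RDONLY);
		void *vdso;

		if (fd < 0)
			return NULL;
		/* MAP_SHARED so the kernel-updated vdso data stays live. */
		vdso = mmap(NULL, 0x2000, PROT_READ | PROT_EXEC,
			    MAP_SHARED, fd, 0);
		close(fd);
		return vdso == MAP_FAILED ? NULL : vdso;
	}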

-hpa