2020-07-14 11:00:45

by Christoph Hellwig

[permalink] [raw]
Subject: clean up address limit helpers v2

Hi all,

in preparation for eventually phasing out direct use of set_fs(), this
series removes the segment_eq() arch helper that is only used to
implement or duplicate the uaccess_kernel() API, and then adds
descriptive helpers to force the kernel address limit.


Changes since v1:
- drop to incorrect hunks
- fix a commit log typo

Diffstat:
arch/alpha/include/asm/uaccess.h | 2 +-
arch/arc/include/asm/segment.h | 3 +--
arch/arm/include/asm/uaccess.h | 4 ++--
arch/arm64/include/asm/uaccess.h | 2 +-
arch/arm64/kernel/sdei.c | 2 +-
arch/csky/include/asm/segment.h | 2 +-
arch/h8300/include/asm/segment.h | 2 +-
arch/ia64/include/asm/uaccess.h | 2 +-
arch/m68k/include/asm/segment.h | 2 +-
arch/m68k/include/asm/tlbflush.h | 6 +++---
arch/microblaze/include/asm/uaccess.h | 2 +-
arch/mips/include/asm/uaccess.h | 2 +-
arch/mips/kernel/unaligned.c | 27 +++++++++++++--------------
arch/nds32/include/asm/uaccess.h | 2 +-
arch/nds32/kernel/process.c | 2 +-
arch/nds32/mm/alignment.c | 7 +++----
arch/nios2/include/asm/uaccess.h | 2 +-
arch/openrisc/include/asm/uaccess.h | 2 +-
arch/parisc/include/asm/uaccess.h | 2 +-
arch/powerpc/include/asm/uaccess.h | 3 +--
arch/riscv/include/asm/uaccess.h | 6 +++---
arch/s390/include/asm/uaccess.h | 2 +-
arch/sh/include/asm/segment.h | 3 +--
arch/sh/kernel/traps_32.c | 12 +++++-------
arch/sparc/include/asm/uaccess_32.h | 2 +-
arch/sparc/include/asm/uaccess_64.h | 2 +-
arch/x86/include/asm/uaccess.h | 2 +-
arch/xtensa/include/asm/uaccess.h | 2 +-
drivers/firmware/arm_sdei.c | 5 ++---
fs/exec.c | 7 ++++++-
include/asm-generic/uaccess.h | 4 ++--
include/linux/syscalls.h | 2 +-
include/linux/uaccess.h | 20 ++++++++++++++++++--
kernel/events/callchain.c | 5 ++---
kernel/events/core.c | 5 ++---
kernel/exit.c | 2 +-
kernel/kthread.c | 5 ++---
kernel/stacktrace.c | 5 ++---
mm/maccess.c | 22 ++++++++++------------
39 files changed, 99 insertions(+), 92 deletions(-)


2020-07-14 11:02:07

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

Use the uaccess_kernel helper instead of duplicating it.

Signed-off-by: Christoph Hellwig <[email protected]>
---
include/linux/syscalls.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b951a87da9877c..e933a43d4a69ac 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -263,7 +263,7 @@ static inline void addr_limit_user_check(void)
return;
#endif

- if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
+ if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
"Invalid address limit on user-mode return"))
force_sig(SIGKILL);

--
2.26.2

2020-07-14 11:04:31

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 2/6] nds32: use uaccess_kernel in show_regs

Use the uaccess_kernel helper instead of duplicating it.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Greentime Hu <[email protected]>
---
arch/nds32/kernel/process.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/nds32/kernel/process.c b/arch/nds32/kernel/process.c
index 9712fd474f2ca3..f06265949ec28b 100644
--- a/arch/nds32/kernel/process.c
+++ b/arch/nds32/kernel/process.c
@@ -121,7 +121,7 @@ void show_regs(struct pt_regs *regs)
regs->uregs[3], regs->uregs[2], regs->uregs[1], regs->uregs[0]);
pr_info(" IRQs o%s Segment %s\n",
interrupts_enabled(regs) ? "n" : "ff",
- segment_eq(get_fs(), KERNEL_DS)? "kernel" : "user");
+ uaccess_kernel() ? "kernel" : "user");
}

EXPORT_SYMBOL(show_regs);
--
2.26.2

2020-07-14 11:04:35

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 3/6] riscv: include <asm/pgtable.h> in <asm/uaccess.h>

To ensure TASK_SIZE is defined for USER_DS.

Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/riscv/include/asm/uaccess.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 8ce9d607b53dce..22de922d6ecb2f 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -8,6 +8,8 @@
#ifndef _ASM_RISCV_UACCESS_H
#define _ASM_RISCV_UACCESS_H

+#include <asm/pgtable.h> /* for TASK_SIZE */
+
/*
* User space memory access functions
*/
--
2.26.2

2020-07-14 11:06:45

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 4/6] uaccess: remove segment_eq

segment_eq is only used to implement uaccess_kernel. Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Greentime Hu <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
---
arch/alpha/include/asm/uaccess.h | 2 +-
arch/arc/include/asm/segment.h | 3 +--
arch/arm/include/asm/uaccess.h | 4 ++--
arch/arm64/include/asm/uaccess.h | 2 +-
arch/csky/include/asm/segment.h | 2 +-
arch/h8300/include/asm/segment.h | 2 +-
arch/ia64/include/asm/uaccess.h | 2 +-
arch/m68k/include/asm/segment.h | 2 +-
arch/microblaze/include/asm/uaccess.h | 2 +-
arch/mips/include/asm/uaccess.h | 2 +-
arch/nds32/include/asm/uaccess.h | 2 +-
arch/nios2/include/asm/uaccess.h | 2 +-
arch/openrisc/include/asm/uaccess.h | 2 +-
arch/parisc/include/asm/uaccess.h | 2 +-
arch/powerpc/include/asm/uaccess.h | 3 +--
arch/riscv/include/asm/uaccess.h | 4 +---
arch/s390/include/asm/uaccess.h | 2 +-
arch/sh/include/asm/segment.h | 3 +--
arch/sparc/include/asm/uaccess_32.h | 2 +-
arch/sparc/include/asm/uaccess_64.h | 2 +-
arch/x86/include/asm/uaccess.h | 2 +-
arch/xtensa/include/asm/uaccess.h | 2 +-
include/asm-generic/uaccess.h | 4 ++--
include/linux/uaccess.h | 2 --
24 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
index 1fe2b56cb861fe..1b6f25efa247f0 100644
--- a/arch/alpha/include/asm/uaccess.h
+++ b/arch/alpha/include/asm/uaccess.h
@@ -20,7 +20,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

/*
* Is a address valid? This does a straightforward calculation rather
diff --git a/arch/arc/include/asm/segment.h b/arch/arc/include/asm/segment.h
index 6a2a5be5026de7..871f8ab11bfd02 100644
--- a/arch/arc/include/asm/segment.h
+++ b/arch/arc/include/asm/segment.h
@@ -14,8 +14,7 @@ typedef unsigned long mm_segment_t;

#define KERNEL_DS MAKE_MM_SEG(0)
#define USER_DS MAKE_MM_SEG(TASK_SIZE)
-
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)

#endif /* __ASSEMBLY__ */
#endif /* __ASMARC_SEGMENT_H */
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 98c6b91be4a8ad..b19c9bec1f7a63 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -76,7 +76,7 @@ static inline void set_fs(mm_segment_t fs)
modify_domain(DOMAIN_KERNEL, fs ? DOMAIN_CLIENT : DOMAIN_MANAGER);
}

-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)

/* We use 33-bit arithmetic here... */
#define __range_ok(addr, size) ({ \
@@ -263,7 +263,7 @@ extern int __put_user_8(void *, unsigned long long);
*/
#define USER_DS KERNEL_DS

-#define segment_eq(a, b) (1)
+#define uaccess_kernel() (true)
#define __addr_ok(addr) ((void)(addr), 1)
#define __range_ok(addr, size) ((void)(addr), 0)
#define get_fs() (KERNEL_DS)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index bc5c7b09115205..fcb8174de505ea 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -49,7 +49,7 @@ static inline void set_fs(mm_segment_t fs)
CONFIG_ARM64_UAO));
}

-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)

/*
* Test whether a block of memory is a valid user space address.
diff --git a/arch/csky/include/asm/segment.h b/arch/csky/include/asm/segment.h
index db2640d5f57591..79ede9b1a6467f 100644
--- a/arch/csky/include/asm/segment.h
+++ b/arch/csky/include/asm/segment.h
@@ -13,6 +13,6 @@ typedef struct {
#define USER_DS ((mm_segment_t) { 0x80000000UL })
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#endif /* __ASM_CSKY_SEGMENT_H */
diff --git a/arch/h8300/include/asm/segment.h b/arch/h8300/include/asm/segment.h
index a407978f9f9fb8..37950725d9b9c8 100644
--- a/arch/h8300/include/asm/segment.h
+++ b/arch/h8300/include/asm/segment.h
@@ -33,7 +33,7 @@ static inline mm_segment_t get_fs(void)
return USER_DS;
}

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#endif /* __ASSEMBLY__ */

diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
index 8aa473a4b0f4ef..179243c3dfc702 100644
--- a/arch/ia64/include/asm/uaccess.h
+++ b/arch/ia64/include/asm/uaccess.h
@@ -50,7 +50,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

/*
* When accessing user memory, we need to make sure the entire area really is in
diff --git a/arch/m68k/include/asm/segment.h b/arch/m68k/include/asm/segment.h
index c6686559e9b742..2b5e68a71ef78e 100644
--- a/arch/m68k/include/asm/segment.h
+++ b/arch/m68k/include/asm/segment.h
@@ -52,7 +52,7 @@ static inline void set_fs(mm_segment_t val)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
#endif

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#endif /* __ASSEMBLY__ */

diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 6723c56ec3783b..304b04ffea2faf 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -41,7 +41,7 @@
# define get_fs() (current_thread_info()->addr_limit)
# define set_fs(val) (current_thread_info()->addr_limit = (val))

-# define segment_eq(a, b) ((a).seg == (b).seg)
+# define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#ifndef CONFIG_MMU

diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 62b298c50905f6..61fc01f177a64b 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -72,7 +72,7 @@ extern u64 __ua_limit;
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

/*
* eva_kernel_access() - determine whether kernel memory access on an EVA system
diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
index 3a9219f53ee0d8..010ba5f1d7dd6b 100644
--- a/arch/nds32/include/asm/uaccess.h
+++ b/arch/nds32/include/asm/uaccess.h
@@ -44,7 +44,7 @@ static inline void set_fs(mm_segment_t fs)
current_thread_info()->addr_limit = fs;
}

-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)

#define __range_ok(addr, size) (size <= get_fs() && addr <= (get_fs() -size))

diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
index e83f831a76f939..a741abbed6fbf5 100644
--- a/arch/nios2/include/asm/uaccess.h
+++ b/arch/nios2/include/asm/uaccess.h
@@ -30,7 +30,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(seg) (current_thread_info()->addr_limit = (seg))

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#define __access_ok(addr, len) \
(((signed long)(((long)get_fs().seg) & \
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index 17c24f14615fb5..48b691530d3ed4 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -43,7 +43,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))

-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)

/* Ensure that the range from addr to addr+size is all within the process'
* address space
diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
index ebbb9ffe038c76..ed2cd4fb479b0c 100644
--- a/arch/parisc/include/asm/uaccess.h
+++ b/arch/parisc/include/asm/uaccess.h
@@ -14,7 +14,7 @@
#define KERNEL_DS ((mm_segment_t){0})
#define USER_DS ((mm_segment_t){1})

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 64c04ab0911236..00699903f1efca 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -38,8 +38,7 @@ static inline void set_fs(mm_segment_t fs)
set_thread_flag(TIF_FSCHECK);
}

-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (get_fs().seg)

#ifdef __powerpc64__
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 22de922d6ecb2f..f56c66b3f5fe21 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -64,11 +64,9 @@ static inline void set_fs(mm_segment_t fs)
current_thread_info()->addr_limit = fs;
}

-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (get_fs().seg)

-
/**
* access_ok: - Checks if a user space pointer is valid
* @addr: User space pointer to start of block to check
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 324438889fe168..f09444d6aeab30 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -32,7 +32,7 @@
#define USER_DS_SACF (3)

#define get_fs() (current->thread.mm_segment)
-#define segment_eq(a,b) (((a) & 2) == ((b) & 2))
+#define uaccess_kernel() ((get_fs() & 2) == KERNEL_DS)

void set_fs(mm_segment_t fs);

diff --git a/arch/sh/include/asm/segment.h b/arch/sh/include/asm/segment.h
index 33d1d28057cbfb..02e54a3335d68f 100644
--- a/arch/sh/include/asm/segment.h
+++ b/arch/sh/include/asm/segment.h
@@ -24,8 +24,7 @@ typedef struct {
#define USER_DS KERNEL_DS
#endif

-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
index d6d8413eca835a..0a2d3ebc4bb86d 100644
--- a/arch/sparc/include/asm/uaccess_32.h
+++ b/arch/sparc/include/asm/uaccess_32.h
@@ -28,7 +28,7 @@
#define get_fs() (current->thread.current_ds)
#define set_fs(val) ((current->thread.current_ds) = (val))

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

/* We have there a nice not-mapped page at PAGE_OFFSET - PAGE_SIZE, so that this test
* can be fairly lightweight.
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index bf9d330073b235..698cf69f74e998 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -32,7 +32,7 @@

#define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)})

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#define set_fs(val) \
do { \
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 18dfa07d3ef0df..dd3261f9f4eaad 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -33,7 +33,7 @@ static inline void set_fs(mm_segment_t fs)
set_thread_flag(TIF_FSCHECK);
}

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (current->thread.addr_limit.seg)

/*
diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
index e57f0d0a88d8ba..b9758119feca19 100644
--- a/arch/xtensa/include/asm/uaccess.h
+++ b/arch/xtensa/include/asm/uaccess.h
@@ -35,7 +35,7 @@
#define get_fs() (current->thread.current_ds)
#define set_fs(val) (current->thread.current_ds = (val))

-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)

#define __kernel_ok (uaccess_kernel())
#define __user_ok(addr, size) \
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index e935318804f8ae..ba68ee4dabfaa7 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -86,8 +86,8 @@ static inline void set_fs(mm_segment_t fs)
}
#endif

-#ifndef segment_eq
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#ifndef uaccess_kernel
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif

#define access_ok(addr, size) __access_ok((unsigned long)(addr),(size))
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 0a76ddc07d5970..5c62d0c6f15b16 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -6,8 +6,6 @@
#include <linux/sched.h>
#include <linux/thread_info.h>

-#define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)
-
#include <asm/uaccess.h>

/*
--
2.26.2

2020-07-14 11:08:54

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 5/6] uaccess: add force_uaccess_{begin,end} helpers

Add helpers to wrap the get_fs/set_fs magic for undoing any damange done
by set_fs(KERNEL_DS). There is no real functional benefit, but this
documents the intent of these calls better, and will allow stubbing the
functions out easily for kernels builds that do not allow address space
overrides in the future.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Greentime Hu <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
---
arch/arm64/kernel/sdei.c | 2 +-
arch/m68k/include/asm/tlbflush.h | 6 +++---
arch/mips/kernel/unaligned.c | 27 +++++++++++++--------------
arch/nds32/mm/alignment.c | 7 +++----
arch/sh/kernel/traps_32.c | 12 +++++-------
drivers/firmware/arm_sdei.c | 5 ++---
include/linux/uaccess.h | 18 ++++++++++++++++++
kernel/events/callchain.c | 5 ++---
kernel/events/core.c | 5 ++---
kernel/kthread.c | 5 ++---
kernel/stacktrace.c | 5 ++---
mm/maccess.c | 22 ++++++++++------------
12 files changed, 63 insertions(+), 56 deletions(-)

diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index dab88260b13739..7689f2031c0c41 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -180,7 +180,7 @@ static __kprobes unsigned long _sdei_handler(struct pt_regs *regs,

/*
* We didn't take an exception to get here, set PAN. UAO will be cleared
- * by sdei_event_handler()s set_fs(USER_DS) call.
+ * by sdei_event_handler()s force_uaccess_begin() call.
*/
__uaccess_enable_hw_pan();

diff --git a/arch/m68k/include/asm/tlbflush.h b/arch/m68k/include/asm/tlbflush.h
index 191e75a6bb249e..5337bc2c262f22 100644
--- a/arch/m68k/include/asm/tlbflush.h
+++ b/arch/m68k/include/asm/tlbflush.h
@@ -85,10 +85,10 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
{
if (vma->vm_mm == current->active_mm) {
- mm_segment_t old_fs = get_fs();
- set_fs(USER_DS);
+ mm_segment_t old_fs = force_uaccess_begin();
+
__flush_tlb_one(addr);
- set_fs(old_fs);
+ force_uaccess_end(old_fs);
}
}

diff --git a/arch/mips/kernel/unaligned.c b/arch/mips/kernel/unaligned.c
index 0adce604fa44cb..126a5f3f4e4ce3 100644
--- a/arch/mips/kernel/unaligned.c
+++ b/arch/mips/kernel/unaligned.c
@@ -191,17 +191,16 @@ static void emulate_load_store_insn(struct pt_regs *regs,
* memory, so we need to "switch" the address limit to
* user space, so that address check can work properly.
*/
- seg = get_fs();
- set_fs(USER_DS);
+ seg = force_uaccess_begin();
switch (insn.spec3_format.func) {
case lhe_op:
if (!access_ok(addr, 2)) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto sigbus;
}
LoadHWE(addr, value, res);
if (res) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto fault;
}
compute_return_epc(regs);
@@ -209,12 +208,12 @@ static void emulate_load_store_insn(struct pt_regs *regs,
break;
case lwe_op:
if (!access_ok(addr, 4)) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto sigbus;
}
LoadWE(addr, value, res);
if (res) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto fault;
}
compute_return_epc(regs);
@@ -222,12 +221,12 @@ static void emulate_load_store_insn(struct pt_regs *regs,
break;
case lhue_op:
if (!access_ok(addr, 2)) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto sigbus;
}
LoadHWUE(addr, value, res);
if (res) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto fault;
}
compute_return_epc(regs);
@@ -235,35 +234,35 @@ static void emulate_load_store_insn(struct pt_regs *regs,
break;
case she_op:
if (!access_ok(addr, 2)) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto sigbus;
}
compute_return_epc(regs);
value = regs->regs[insn.spec3_format.rt];
StoreHWE(addr, value, res);
if (res) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto fault;
}
break;
case swe_op:
if (!access_ok(addr, 4)) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto sigbus;
}
compute_return_epc(regs);
value = regs->regs[insn.spec3_format.rt];
StoreWE(addr, value, res);
if (res) {
- set_fs(seg);
+ force_uaccess_end(seg);
goto fault;
}
break;
default:
- set_fs(seg);
+ force_uaccess_end(seg);
goto sigill;
}
- set_fs(seg);
+ force_uaccess_end(seg);
}
#endif
break;
diff --git a/arch/nds32/mm/alignment.c b/arch/nds32/mm/alignment.c
index c8b9061a2ee3d5..1eb7ded6992b57 100644
--- a/arch/nds32/mm/alignment.c
+++ b/arch/nds32/mm/alignment.c
@@ -512,7 +512,7 @@ int do_unaligned_access(unsigned long addr, struct pt_regs *regs)
{
unsigned long inst;
int ret = -EFAULT;
- mm_segment_t seg = get_fs();
+ mm_segment_t seg;

inst = get_inst(regs->ipc);

@@ -520,13 +520,12 @@ int do_unaligned_access(unsigned long addr, struct pt_regs *regs)
"Faulting addr: 0x%08lx, pc: 0x%08lx [inst: 0x%08lx ]\n", addr,
regs->ipc, inst);

- set_fs(USER_DS);
-
+ seg = force_uaccess_begin();
if (inst & NDS32_16BIT_INSTRUCTION)
ret = do_16((inst >> 16) & 0xffff, regs);
else
ret = do_32(inst, regs);
- set_fs(seg);
+ force_uaccess_end(seg);

return ret;
}
diff --git a/arch/sh/kernel/traps_32.c b/arch/sh/kernel/traps_32.c
index 058c6181bb306c..b62ad0ba23950a 100644
--- a/arch/sh/kernel/traps_32.c
+++ b/arch/sh/kernel/traps_32.c
@@ -482,8 +482,6 @@ asmlinkage void do_address_error(struct pt_regs *regs,
error_code = lookup_exception_vector();
#endif

- oldfs = get_fs();
-
if (user_mode(regs)) {
int si_code = BUS_ADRERR;
unsigned int user_action;
@@ -491,13 +489,13 @@ asmlinkage void do_address_error(struct pt_regs *regs,
local_irq_enable();
inc_unaligned_user_access();

- set_fs(USER_DS);
+ oldfs = force_uaccess_begin();
if (copy_from_user(&instruction, (insn_size_t *)(regs->pc & ~1),
sizeof(instruction))) {
- set_fs(oldfs);
+ force_uaccess_end(oldfs);
goto uspace_segv;
}
- set_fs(oldfs);
+ force_uaccess_end(oldfs);

/* shout about userspace fixups */
unaligned_fixups_notify(current, instruction, regs);
@@ -520,11 +518,11 @@ asmlinkage void do_address_error(struct pt_regs *regs,
goto uspace_segv;
}

- set_fs(USER_DS);
+ oldfs = force_uaccess_begin();
tmp = handle_unaligned_access(instruction, regs,
&user_mem_access, 0,
address);
- set_fs(oldfs);
+ force_uaccess_end(oldfs);

if (tmp == 0)
return; /* sorted */
diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
index e7e36aab2386ff..b4b9ce97f415e3 100644
--- a/drivers/firmware/arm_sdei.c
+++ b/drivers/firmware/arm_sdei.c
@@ -1136,15 +1136,14 @@ int sdei_event_handler(struct pt_regs *regs,
* access kernel memory.
* Do the same here because this doesn't come via the same entry code.
*/
- orig_addr_limit = get_fs();
- set_fs(USER_DS);
+ orig_addr_limit = force_uaccess_begin();

err = arg->callback(event_num, regs, arg->callback_arg);
if (err)
pr_err_ratelimited("event %u on CPU %u failed with error: %d\n",
event_num, smp_processor_id(), err);

- set_fs(orig_addr_limit);
+ force_uaccess_end(orig_addr_limit);

return err;
}
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 5c62d0c6f15b16..94b28541165929 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -8,6 +8,24 @@

#include <asm/uaccess.h>

+/*
+ * Force the uaccess routines to be wired up for actual userspace access,
+ * overriding any possible set_fs(KERNEL_DS) still lingering around. Undone
+ * using force_uaccess_end below.
+ */
+static inline mm_segment_t force_uaccess_begin(void)
+{
+ mm_segment_t fs = get_fs();
+
+ set_fs(USER_DS);
+ return fs;
+}
+
+static inline void force_uaccess_end(mm_segment_t oldfs)
+{
+ set_fs(oldfs);
+}
+
/*
* Architectures should provide two primitives (raw_copy_{to,from}_user())
* and get rid of their private instances of copy_{to,from}_user() and
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 334d48b16c36d7..dddc773e99f7b5 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -218,10 +218,9 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
if (add_mark)
perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);

- fs = get_fs();
- set_fs(USER_DS);
+ fs = force_uaccess_begin();
perf_callchain_user(&ctx, regs);
- set_fs(fs);
+ force_uaccess_end(fs);
}
}

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 856d98c36f562d..c0c6f462a393e1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6436,10 +6436,9 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,

/* Data. */
sp = perf_user_stack_pointer(regs);
- fs = get_fs();
- set_fs(USER_DS);
+ fs = force_uaccess_begin();
rem = __output_copy_user(handle, (void *) sp, dump_size);
- set_fs(fs);
+ force_uaccess_end(fs);
dyn_size = dump_size - rem;

perf_output_skip(handle, rem);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 132f84a5fde3f0..379bf1e543371a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1254,8 +1254,7 @@ void kthread_use_mm(struct mm_struct *mm)
if (active_mm != mm)
mmdrop(active_mm);

- to_kthread(tsk)->oldfs = get_fs();
- set_fs(USER_DS);
+ to_kthread(tsk)->oldfs = force_uaccess_begin();
}
EXPORT_SYMBOL_GPL(kthread_use_mm);

@@ -1270,7 +1269,7 @@ void kthread_unuse_mm(struct mm_struct *mm)
WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
WARN_ON_ONCE(!tsk->mm);

- set_fs(to_kthread(tsk)->oldfs);
+ force_uaccess_end(to_kthread(tsk)->oldfs);

task_lock(tsk);
sync_mm_rss(mm);
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index 2af66e449aa6a8..946f44a9e86afb 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -233,10 +233,9 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size)
if (current->flags & PF_KTHREAD)
return 0;

- fs = get_fs();
- set_fs(USER_DS);
+ fs = force_uaccess_begin();
arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
- set_fs(fs);
+ force_uaccess_end(fs);

return c.len;
}
diff --git a/mm/maccess.c b/mm/maccess.c
index f98ff91e32c6df..3bd70405f2d848 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -205,15 +205,14 @@ long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count)
long copy_from_user_nofault(void *dst, const void __user *src, size_t size)
{
long ret = -EFAULT;
- mm_segment_t old_fs = get_fs();
+ mm_segment_t old_fs = force_uaccess_begin();

- set_fs(USER_DS);
if (access_ok(src, size)) {
pagefault_disable();
ret = __copy_from_user_inatomic(dst, src, size);
pagefault_enable();
}
- set_fs(old_fs);
+ force_uaccess_end(old_fs);

if (ret)
return -EFAULT;
@@ -233,15 +232,14 @@ EXPORT_SYMBOL_GPL(copy_from_user_nofault);
long copy_to_user_nofault(void __user *dst, const void *src, size_t size)
{
long ret = -EFAULT;
- mm_segment_t old_fs = get_fs();
+ mm_segment_t old_fs = force_uaccess_begin();

- set_fs(USER_DS);
if (access_ok(dst, size)) {
pagefault_disable();
ret = __copy_to_user_inatomic(dst, src, size);
pagefault_enable();
}
- set_fs(old_fs);
+ force_uaccess_end(old_fs);

if (ret)
return -EFAULT;
@@ -270,17 +268,17 @@ EXPORT_SYMBOL_GPL(copy_to_user_nofault);
long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,
long count)
{
- mm_segment_t old_fs = get_fs();
+ mm_segment_t old_fs;
long ret;

if (unlikely(count <= 0))
return 0;

- set_fs(USER_DS);
+ old_fs = force_uaccess_begin();
pagefault_disable();
ret = strncpy_from_user(dst, unsafe_addr, count);
pagefault_enable();
- set_fs(old_fs);
+ force_uaccess_end(old_fs);

if (ret >= count) {
ret = count;
@@ -310,14 +308,14 @@ long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,
*/
long strnlen_user_nofault(const void __user *unsafe_addr, long count)
{
- mm_segment_t old_fs = get_fs();
+ mm_segment_t old_fs;
int ret;

- set_fs(USER_DS);
+ old_fs = force_uaccess_begin();
pagefault_disable();
ret = strnlen_user(unsafe_addr, count);
pagefault_enable();
- set_fs(old_fs);
+ force_uaccess_end(old_fs);

return ret;
}
--
2.26.2

2020-07-14 11:11:55

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 6/6] exec: use force_uaccess_begin during exec and exit

Both exec and exit want to ensure that the uaccess routines actually do
access user pointers. Use the newly added force_uaccess_begin helper
instead of an open coded set_fs for that to prepare for kernel builds
where set_fs() does not exist.

Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/exec.c | 7 ++++++-
kernel/exit.c | 2 +-
2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index e6e8a9a7032784..769af470b69124 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1380,7 +1380,12 @@ int begin_new_exec(struct linux_binprm * bprm)
if (retval)
goto out_unlock;

- set_fs(USER_DS);
+ /*
+ * Ensure that the uaccess routines can actually operate on userspace
+ * pointers:
+ */
+ force_uaccess_begin();
+
me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
PF_NOFREEZE | PF_NO_SETAFFINITY);
flush_thread();
diff --git a/kernel/exit.c b/kernel/exit.c
index 727150f2810338..17d486a20f0dc6 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -731,7 +731,7 @@ void __noreturn do_exit(long code)
* mm_release()->clear_child_tid() from writing to a user-controlled
* kernel address.
*/
- set_fs(USER_DS);
+ force_uaccess_begin();

if (unlikely(in_atomic())) {
pr_info("note: %s[%d] exited with preempt_count %d\n",
--
2.26.2

2020-07-14 15:28:11

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 4/6] uaccess: remove segment_eq

Ack, just with a note:

On Tue, Jul 14, 2020 at 4:06 AM Christoph Hellwig <[email protected]> wrote:
>
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -33,7 +33,7 @@ static inline void set_fs(mm_segment_t fs)
> set_thread_flag(TIF_FSCHECK);
> }
>
> -#define segment_eq(a, b) ((a).seg == (b).seg)
> +#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> #define user_addr_max() (current->thread.addr_limit.seg)

This "uaccess_kernel()" interface is a better model anyway, because at
least on x86 (and from a quick glance at others), we might avoid the
exact equality comparison, and instead do simpler/better things.

On x86-64, for example, checking whether the limit has the high bit
set is not only more flexible and correct, it's much cheaper too.

Of course, trying to get rid of all this means that it doesn't matter
so much, but it would probably have been good to do this part years
ago regardless.

Linus

2020-07-14 15:30:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 5/6] uaccess: add force_uaccess_{begin,end} helpers

On Tue, Jul 14, 2020 at 4:08 AM Christoph Hellwig <[email protected]> wrote:
>
> Add helpers to wrap the get_fs/set_fs magic for undoing any damange done
> by set_fs(KERNEL_DS). There is no real functional benefit, but this
> documents the intent of these calls better, and will allow stubbing the
> functions out easily for kernels builds that do not allow address space
> overrides in the future.

It would perhaps have been nicer to rename the save variabel too
(neither "seg" nor "oldfs" make much sense once you get rid of the old
x86-inspired name).

But from a greppability standpoint and a doc standpoint, I guess just
renaming the function is sufficient (and certainly easier).

Linus

2020-07-15 04:15:05

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 6/6] exec: use force_uaccess_begin during exec and exit

Christoph Hellwig <[email protected]> writes:

> Both exec and exit want to ensure that the uaccess routines actually do
> access user pointers. Use the newly added force_uaccess_begin helper
> instead of an open coded set_fs for that to prepare for kernel builds
> where set_fs() does not exist.

Acked-by: "Eric W. Biederman" <[email protected]>

Have you played with a tree with all of your patches
and placing force_uaccess_begin in init/main.c:start_kernel?

Somewhere deep in the arch code we seem to have it all backwards
and kernel threads are all set_fs(KERNEL_DS). So just putting
a force_uaccess_begin somewhere very early should be enough
to switch things around.

> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> fs/exec.c | 7 ++++++-
> kernel/exit.c | 2 +-
> 2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index e6e8a9a7032784..769af470b69124 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1380,7 +1380,12 @@ int begin_new_exec(struct linux_binprm * bprm)
> if (retval)
> goto out_unlock;
>
> - set_fs(USER_DS);
> + /*
> + * Ensure that the uaccess routines can actually operate on userspace
> + * pointers:
> + */
> + force_uaccess_begin();
> +
> me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
> PF_NOFREEZE | PF_NO_SETAFFINITY);
> flush_thread();
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 727150f2810338..17d486a20f0dc6 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -731,7 +731,7 @@ void __noreturn do_exit(long code)
> * mm_release()->clear_child_tid() from writing to a user-controlled
> * kernel address.
> */
> - set_fs(USER_DS);
> + force_uaccess_begin();
>
> if (unlikely(in_atomic())) {
> pr_info("note: %s[%d] exited with preempt_count %d\n",

2020-07-15 06:27:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 6/6] exec: use force_uaccess_begin during exec and exit

On Tue, Jul 14, 2020 at 10:33:05PM -0500, Eric W. Biederman wrote:
> Christoph Hellwig <[email protected]> writes:
>
> > Both exec and exit want to ensure that the uaccess routines actually do
> > access user pointers. Use the newly added force_uaccess_begin helper
> > instead of an open coded set_fs for that to prepare for kernel builds
> > where set_fs() does not exist.
>
> Acked-by: "Eric W. Biederman" <[email protected]>
>
> Have you played with a tree with all of your patches
> and placing force_uaccess_begin in init/main.c:start_kernel?

No, I have converted the early boot code to not require KERNEL_DS
and except for a pending network item remove set_fs entirely. Older
series here, will need some rework:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/set_fs-removal

2020-07-16 23:49:54

by Andrew Morton

[permalink] [raw]
Subject: Re: clean up address limit helpers v2

On Tue, 14 Jul 2020 12:54:59 +0200 Christoph Hellwig <[email protected]> wrote:

> Hi all,
>
> in preparation for eventually phasing out direct use of set_fs(), this
> series removes the segment_eq() arch helper that is only used to
> implement or duplicate the uaccess_kernel() API, and then adds
> descriptive helpers to force the kernel address limit.
>
>
> Changes since v1:
> - drop to incorrect hunks
> - fix a commit log typo

I think this *is* v1. I can't find any differences in the patches and I
was unable to eyeball any changelog alterations?

2020-07-17 06:09:05

by Christoph Hellwig

[permalink] [raw]
Subject: Re: clean up address limit helpers v2

On Thu, Jul 16, 2020 at 04:49:24PM -0700, Andrew Morton wrote:
> On Tue, 14 Jul 2020 12:54:59 +0200 Christoph Hellwig <[email protected]> wrote:
>
> > Hi all,
> >
> > in preparation for eventually phasing out direct use of set_fs(), this
> > series removes the segment_eq() arch helper that is only used to
> > implement or duplicate the uaccess_kernel() API, and then adds
> > descriptive helpers to force the kernel address limit.
> >
> >
> > Changes since v1:
> > - drop to incorrect hunks
> > - fix a commit log typo
>
> I think this *is* v1. I can't find any differences in the patches and I
> was unable to eyeball any changelog alterations?

No, this actuall is v2.

"[PATCH 5/6] uaccess: add force_uaccess_{begin,end} helpers" dropped
two incorrect hunks in the m68k and sh arch code, and lost and "er"
in the commit log.

2020-07-18 01:39:25

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

Hi,

On Tue, Jul 14, 2020 at 12:55:00PM +0200, Christoph Hellwig wrote:
> Use the uaccess_kernel helper instead of duplicating it.
>
> Signed-off-by: Christoph Hellwig <[email protected]>

This patch causes a severe hiccup with my mps2-an385 boot test.

[ 8.545195] ------------[ cut here ]------------
[ 8.545294] WARNING: CPU: 0 PID: 1 at ./include/linux/syscalls.h:267 addr_limit_check_failed+0x11/0x24
[ 8.545376] Invalid address limit on user-mode return
[ 8.545487] CPU: 0 PID: 1 Comm: init Not tainted 5.8.0-rc5-next-20200717 #1
[ 8.545603] Hardware name: MPS2 (Device Tree Support)
[ 8.546053] [<2100af9d>] (unwind_backtrace) from [<2100a353>] (show_stack+0xb/0xc)
[ 8.546240] [<2100a353>] (show_stack) from [<2100dadb>] (__warn+0x6f/0x80)
[ 8.546311] [<2100dadb>] (__warn) from [<2100db1d>] (warn_slowpath_fmt+0x31/0x60)
[ 8.546383] [<2100db1d>] (warn_slowpath_fmt) from [<2100a159>] (addr_limit_check_failed+0x11/0x24)
[ 8.546464] [<2100a159>] (addr_limit_check_failed) from [<210080f3>] (ret_to_user_from_irq+0xf/0x18)
[ 8.546567] Exception stack(0x21427fb0 to 0x21427ff8)
[ 8.546649] 7fa0: 00000000 00000000 00000000 00000000
[ 8.546729] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 21744f80 00000000
[ 8.546800] 7fe0: 00000000 21757fa8 00000000 21700b6c 01000000 00000000
[ 8.546910] ---[ end trace f1b0cd10cc3456dc ]---

This keeps happening until qemu aborts.

Reverting the patch isn't possible since segment_eq() is gone in
next-20200717.

Guenter

---
# bad: [aab7ee9f8ff0110bfcd594b33dc33748dc1baf46] Add linux-next specific files for 20200717
# good: [11ba468877bb23f28956a35e896356252d63c983] Linux 5.8-rc5
git bisect start 'HEAD' 'v5.8-rc5'
# good: [4d55a7a1298d197755c1a0f4512f56917e938a83] Merge remote-tracking branch 'crypto/master'
git bisect good 4d55a7a1298d197755c1a0f4512f56917e938a83
# good: [e63bf5dcce255302e355cb2277a3a39c83752c92] Merge remote-tracking branch 'devicetree/for-next'
git bisect good e63bf5dcce255302e355cb2277a3a39c83752c92
# good: [94d932ec3afb923efd8c736974f8316413175a5b] Merge remote-tracking branch 'thunderbolt/next'
git bisect good 94d932ec3afb923efd8c736974f8316413175a5b
# good: [5ddd2e0dbe8fceb80b0b36bd38a32217be7a04a5] Merge remote-tracking branch 'livepatching/for-next'
git bisect good 5ddd2e0dbe8fceb80b0b36bd38a32217be7a04a5
# bad: [40346f79983caf46fb92f779b0353422d43580a9] ipc/shm.c: Remove the superfluous break
git bisect bad 40346f79983caf46fb92f779b0353422d43580a9
# good: [0b917599517f71ddef5f7274a8199a33cecd49b2] kasan: update required compiler versions in documentation
git bisect good 0b917599517f71ddef5f7274a8199a33cecd49b2
# good: [701571ae06641cc0632d113a6d25f54ce651e723] mm,hwpoison: rework soft offline for free pages
git bisect good 701571ae06641cc0632d113a6d25f54ce651e723
# bad: [1c21deffe923b068d2d297c248845ec93531d1bf] lib/test_bits.c: add tests of GENMASK
git bisect bad 1c21deffe923b068d2d297c248845ec93531d1bf
# bad: [9549375184b2cb4f63fa7917665acf9c44114499] uaccess: add force_uaccess_{begin,end} helpers
git bisect bad 9549375184b2cb4f63fa7917665acf9c44114499
# good: [233d009c15719e43c53b73296144664e0bd59a2e] mm/memory_hotplug: introduce default dummy memory_add_physaddr_to_nid()
git bisect good 233d009c15719e43c53b73296144664e0bd59a2e
# good: [42889ca325dd735ce964838cff81a444637d9d01] mm: drop duplicated words in <linux/mm.h>
git bisect good 42889ca325dd735ce964838cff81a444637d9d01
# bad: [3b17e98704eedeeff41672b2f64cef1bbefbb8b2] nds32: use uaccess_kernel in show_regs
git bisect bad 3b17e98704eedeeff41672b2f64cef1bbefbb8b2
# bad: [02dc30b876b111276fe7d83d492ddfc2b39b80e3] syscalls: use uaccess_kernel in addr_limit_user_check
git bisect bad 02dc30b876b111276fe7d83d492ddfc2b39b80e3
# first bad commit: [02dc30b876b111276fe7d83d492ddfc2b39b80e3] syscalls: use uaccess_kernel in addr_limit_user_check

2020-07-18 09:49:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On Fri, Jul 17, 2020 at 06:38:50PM -0700, Guenter Roeck wrote:
> Hi,
>
> On Tue, Jul 14, 2020 at 12:55:00PM +0200, Christoph Hellwig wrote:
> > Use the uaccess_kernel helper instead of duplicating it.
> >
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> This patch causes a severe hiccup with my mps2-an385 boot test.

I guess that is a nommu config?

Can you try this patch?

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index b19c9bec1f7a63..cc7daf374a6eb6 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -263,7 +263,7 @@ extern int __put_user_8(void *, unsigned long long);
*/
#define USER_DS KERNEL_DS

-#define uaccess_kernel() (true)
+#define uaccess_kernel() (false)
#define __addr_ok(addr) ((void)(addr), 1)
#define __range_ok(addr, size) ((void)(addr), 0)
#define get_fs() (KERNEL_DS)

2020-07-18 14:55:12

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On 7/18/20 2:48 AM, Christoph Hellwig wrote:
> On Fri, Jul 17, 2020 at 06:38:50PM -0700, Guenter Roeck wrote:
>> Hi,
>>
>> On Tue, Jul 14, 2020 at 12:55:00PM +0200, Christoph Hellwig wrote:
>>> Use the uaccess_kernel helper instead of duplicating it.
>>>
>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>
>> This patch causes a severe hiccup with my mps2-an385 boot test.
>
> I guess that is a nommu config?
>
Correct.

> Can you try this patch?
>
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index b19c9bec1f7a63..cc7daf374a6eb6 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -263,7 +263,7 @@ extern int __put_user_8(void *, unsigned long long);
> */
> #define USER_DS KERNEL_DS
>
> -#define uaccess_kernel() (true)
> +#define uaccess_kernel() (false)
> #define __addr_ok(addr) ((void)(addr), 1)
> #define __range_ok(addr, size) ((void)(addr), 0)
> #define get_fs() (KERNEL_DS)
>

With this change, the error message is gone. However, it still does not
work. The image just hangs instead.

Guenter

2020-07-20 10:01:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

To try to reproduce your report I built a mps2_defconfig kernel
and then run the qemu command line manually extraced from your
script below, using a mainline qemu built for arm-softmmu, but it
crashes with the following message even for the baseline kernel.

qemu: fatal: Lockup: can't escalate 3 to HardFault (current priority -1)

R00=00000000 R01=00000000 R02=00000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=ffffffe0 R14=fffffff9 R15=00000000
XPSR=40000003 -Z-- A handler
FPSCR: 00000000

Does anyone have an idea what this means?


---
/opt/qemu/bin/qemu-system-arm \
-M mps2-an385 \
-cpu cortex-m3 \
-dtb arch/arm/boot/dts/mps2-an385.dtb \
-kernel vmlinux \
-no-reboot \
-snapshot -m 16 \
-initrd ~/images/rootfs-arm-m3.cpio \
-append 'panic=-1' \
-bios ~/images/mps2-boot.axf \
-nographic -monitor null -serial stdio

2020-07-20 14:58:22

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On 7/20/20 3:01 AM, Christoph Hellwig wrote:
> To try to reproduce your report I built a mps2_defconfig kernel
> and then run the qemu command line manually extraced from your
> script below, using a mainline qemu built for arm-softmmu, but it
> crashes with the following message even for the baseline kernel.
>
> qemu: fatal: Lockup: can't escalate 3 to HardFault (current priority -1)
>
> R00=00000000 R01=00000000 R02=00000000 R03=00000000
> R04=00000000 R05=00000000 R06=00000000 R07=00000000
> R08=00000000 R09=00000000 R10=00000000 R11=00000000
> R12=00000000 R13=ffffffe0 R14=fffffff9 R15=00000000
> XPSR=40000003 -Z-- A handler
> FPSCR: 00000000
>
> Does anyone have an idea what this means?
>

Ah, sorry, you can't use the upstream version of qemu to test mps2-an385
Linux images. You'll have to use a version from https://github.com/groeck/qemu.
I'd recommend to use the v5.0.0-local branch.

I had to make some changes to qemu to be able to boot mps2-an385.
I tried to submit those changes into upstream qemu, but that was
rejected because, as I was told, the qemu implementation
would no longer reflect the real hardware with those changes in
place.

Guenter

2020-07-20 15:29:23

by Peter Maydell

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On Mon, 20 Jul 2020 at 15:55, Guenter Roeck <[email protected]> wrote:
> Ah, sorry, you can't use the upstream version of qemu to test mps2-an385
> Linux images. You'll have to use a version from https://github.com/groeck/qemu.
> I'd recommend to use the v5.0.0-local branch.
>
> I had to make some changes to qemu to be able to boot mps2-an385.
> I tried to submit those changes into upstream qemu, but that was
> rejected because, as I was told, the qemu implementation
> would no longer reflect the real hardware with those changes in
> place.

Yes; the rationale is that if you wanted to boot a kernel
on an actual MPS2 board you'd need a bit of guest code to
start it up (and to bundle the initrd/dtb in with it), so
since you need to write that code anyway you could use it for
booting the kernel in QEMU too.

I appreciate that this is awkward for kernel developers (and
perhaps for some other users too), but QEMU's handling of
-kernel and built-in-bootloader code is already a morass of
special cases and do-what-I-mean behaviour that I'm not
enthusiastic about further complicating :-)

(https://lists.gnu.org/archive/html/qemu-arm/2018-06/msg00393.html
has the archive of our original discussion on the point, for
other readers of this post interested in further context and
discussion.)

thanks
-- PMM

2020-07-20 15:55:20

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 0/6] arm: don't call addr_limit_user_check for nommu

On arm nommu kernel use the same constant for USER_DS and KERNEL_DS,
and seqment_eq always returns false. With the current check in
addr_limit_user_check that works by accident, but when replacing
seqment_eq with uaccess_kerne it will fail. Just remove the not
needed check entirely.

Signed-off-by: Christoph Hellwig <[email protected]>
Reported-by: Guenter Roeck <[email protected]>
---

Andrew: this should preferably go before the other patches in this
series.

arch/arm/kernel/signal.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index ab2568996ddb0c..c9dc912b83f012 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -713,7 +713,9 @@ struct page *get_signal_page(void)
/* Defer to generic check */
asmlinkage void addr_limit_check_failed(void)
{
+#ifdef CONFIG_MMU
addr_limit_user_check();
+#endif
}

#ifdef CONFIG_DEBUG_RSEQ
--
2.27.0

2020-07-20 22:11:57

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On Sat, Jul 18, 2020 at 11:48:46AM +0200, Christoph Hellwig wrote:
> On Fri, Jul 17, 2020 at 06:38:50PM -0700, Guenter Roeck wrote:
> > Hi,
> >
> > On Tue, Jul 14, 2020 at 12:55:00PM +0200, Christoph Hellwig wrote:
> > > Use the uaccess_kernel helper instead of duplicating it.
> > >
> > > Signed-off-by: Christoph Hellwig <[email protected]>
> >
> > This patch causes a severe hiccup with my mps2-an385 boot test.
>

I had another look into the code. Right after this patch, I see

#define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)

Yet, this patch is:

- if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
+ if (CHECK_DATA_CORRUPTION(uaccess_kernel(),

So there is a negation in the condition. Indeed, the following change
on top of next-20200720 fixes the problem for mps2-an385.

- if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
+ if (CHECK_DATA_CORRUPTION(!uaccess_kernel(),

How does this work anywhere ?

Thanks,
Guenter

> I guess that is a nommu config?
>
> Can you try this patch?
>
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index b19c9bec1f7a63..cc7daf374a6eb6 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -263,7 +263,7 @@ extern int __put_user_8(void *, unsigned long long);
> */
> #define USER_DS KERNEL_DS
>
> -#define uaccess_kernel() (true)
> +#define uaccess_kernel() (false)
> #define __addr_ok(addr) ((void)(addr), 1)
> #define __range_ok(addr, size) ((void)(addr), 0)
> #define get_fs() (KERNEL_DS)

2020-07-21 04:59:33

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On Mon, Jul 20, 2020 at 03:10:46PM -0700, Guenter Roeck wrote:
> I had another look into the code. Right after this patch, I see
>
> #define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)
>
> Yet, this patch is:
>
> - if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
> + if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
>
> So there is a negation in the condition. Indeed, the following change
> on top of next-20200720 fixes the problem for mps2-an385.
>
> - if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
> + if (CHECK_DATA_CORRUPTION(!uaccess_kernel(),
>
> How does this work anywhere ?

No, that is the wrong check - we want to make sure the address
space override doesn't leak to userspace. The problem is that
armnommu (and m68knommu, but that doesn't call the offending
function) pretends to not have a kernel address space, which doesn't
really work. Here is the fix I sent out yesterday, which I should
have Cc'ed you on, sorry:

---
From 2bb889b2d99a2d978e90640ade8fe02359287092 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <[email protected]>
Date: Mon, 20 Jul 2020 17:46:50 +0200
Subject: arm: don't call addr_limit_user_check for nommu

On arm nommu kernel use the same constant for USER_DS and KERNEL_DS,
and seqment_eq always returns false. With the current check in
addr_limit_user_check that works by accident, but when replacing
seqment_eq with uaccess_kerne it will fail. Just remove the not
needed check entirely.

Signed-off-by: Christoph Hellwig <[email protected]>
Reported-by: Guenter Roeck <[email protected]>
---
arch/arm/kernel/signal.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index ab2568996ddb0c..c9dc912b83f012 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -713,7 +713,9 @@ struct page *get_signal_page(void)
/* Defer to generic check */
asmlinkage void addr_limit_check_failed(void)
{
+#ifdef CONFIG_MMU
addr_limit_user_check();
+#endif
}

#ifdef CONFIG_DEBUG_RSEQ
--
2.27.0

2020-07-21 05:16:27

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On 7/20/20 9:58 PM, Christoph Hellwig wrote:
> On Mon, Jul 20, 2020 at 03:10:46PM -0700, Guenter Roeck wrote:
>> I had another look into the code. Right after this patch, I see
>>
>> #define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)
>>
>> Yet, this patch is:
>>
>> - if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
>> + if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
>>
>> So there is a negation in the condition. Indeed, the following change
>> on top of next-20200720 fixes the problem for mps2-an385.
>>
>> - if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
>> + if (CHECK_DATA_CORRUPTION(!uaccess_kernel(),
>>
>> How does this work anywhere ?
>
> No, that is the wrong check - we want to make sure the address
> space override doesn't leak to userspace. The problem is that
> armnommu (and m68knommu, but that doesn't call the offending
> function) pretends to not have a kernel address space, which doesn't
> really work. Here is the fix I sent out yesterday, which I should
> have Cc'ed you on, sorry:
>

The patch below makes sense, and it does work, but I still suspect
that something with your original patch is wrong, or at least suspicious.
Reason: My change above (Adding the "!") works for _all_ of my arm boot
tests. Or, in other words, it doesn't make a difference if true
or false is passed as first parameter of CHECK_DATA_CORRUPTION(), except
for nommu systems. Also, unless I am really missing something, your
original patch _does_ reverse the logic.

I didn't track this down further.

Thanks,
Guenter

> ---
>>From 2bb889b2d99a2d978e90640ade8fe02359287092 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <[email protected]>
> Date: Mon, 20 Jul 2020 17:46:50 +0200
> Subject: arm: don't call addr_limit_user_check for nommu
>
> On arm nommu kernel use the same constant for USER_DS and KERNEL_DS,
> and seqment_eq always returns false. With the current check in
> addr_limit_user_check that works by accident, but when replacing
> seqment_eq with uaccess_kerne it will fail. Just remove the not
> needed check entirely.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Reported-by: Guenter Roeck <[email protected]>

Tested-by: Guenter Roeck <[email protected]>

> ---
> arch/arm/kernel/signal.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> index ab2568996ddb0c..c9dc912b83f012 100644
> --- a/arch/arm/kernel/signal.c
> +++ b/arch/arm/kernel/signal.c
> @@ -713,7 +713,9 @@ struct page *get_signal_page(void)
> /* Defer to generic check */
> asmlinkage void addr_limit_check_failed(void)
> {
> +#ifdef CONFIG_MMU
> addr_limit_user_check();
> +#endif
> }
>
> #ifdef CONFIG_DEBUG_RSEQ
>

2020-07-21 05:21:11

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On Mon, Jul 20, 2020 at 10:15:37PM -0700, Guenter Roeck wrote:
> >> - if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
> >> + if (CHECK_DATA_CORRUPTION(!uaccess_kernel(),
> >>
> >> How does this work anywhere ?
> >
> > No, that is the wrong check - we want to make sure the address
> > space override doesn't leak to userspace. The problem is that
> > armnommu (and m68knommu, but that doesn't call the offending
> > function) pretends to not have a kernel address space, which doesn't
> > really work. Here is the fix I sent out yesterday, which I should
> > have Cc'ed you on, sorry:
> >
>
> The patch below makes sense, and it does work, but I still suspect
> that something with your original patch is wrong, or at least suspicious.
> Reason: My change above (Adding the "!") works for _all_ of my arm boot
> tests. Or, in other words, it doesn't make a difference if true
> or false is passed as first parameter of CHECK_DATA_CORRUPTION(), except
> for nommu systems. Also, unless I am really missing something, your
> original patch _does_ reverse the logic.

Well. segment_eq is in current mainline used in two places:

1) to implement uaccess_kernel
2) in addr_limit_user_check to implement uaccess_kernel-like
semantics using a strange reverse notation

I think the explanation for your observation is how addr_limit_user_check
is called on arm. The addr_limit_check_failed wrapper for it is called
from assembly code, but only after already checking the addr_limit,
basically duplicating the segment_eq check. So for mmu builds it won't
get called unless we leak the kernel address space override, which
is a pretty fatal error and won't show up in your boot tests. The
only good way to test it is by explicit injecting it using the
lkdtm module.

2020-07-21 05:31:04

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On 7/20/20 10:20 PM, Christoph Hellwig wrote:
> On Mon, Jul 20, 2020 at 10:15:37PM -0700, Guenter Roeck wrote:
>>>> - if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
>>>> + if (CHECK_DATA_CORRUPTION(!uaccess_kernel(),
>>>>
>>>> How does this work anywhere ?
>>>
>>> No, that is the wrong check - we want to make sure the address
>>> space override doesn't leak to userspace. The problem is that
>>> armnommu (and m68knommu, but that doesn't call the offending
>>> function) pretends to not have a kernel address space, which doesn't
>>> really work. Here is the fix I sent out yesterday, which I should
>>> have Cc'ed you on, sorry:
>>>
>>
>> The patch below makes sense, and it does work, but I still suspect
>> that something with your original patch is wrong, or at least suspicious.
>> Reason: My change above (Adding the "!") works for _all_ of my arm boot
>> tests. Or, in other words, it doesn't make a difference if true
>> or false is passed as first parameter of CHECK_DATA_CORRUPTION(), except
>> for nommu systems. Also, unless I am really missing something, your
>> original patch _does_ reverse the logic.
>
> Well. segment_eq is in current mainline used in two places:
>
> 1) to implement uaccess_kernel
> 2) in addr_limit_user_check to implement uaccess_kernel-like
> semantics using a strange reverse notation
>
> I think the explanation for your observation is how addr_limit_user_check
> is called on arm. The addr_limit_check_failed wrapper for it is called
> from assembly code, but only after already checking the addr_limit,
> basically duplicating the segment_eq check. So for mmu builds it won't
> get called unless we leak the kernel address space override, which
> is a pretty fatal error and won't show up in your boot tests. The
> only good way to test it is by explicit injecting it using the
> lkdtm module.
>

Guess I lost it somewhere. Are you saying the check was wrong all along
and your patch fixed it ?

Thanks,
Guenter

2020-07-21 05:39:15

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] syscalls: use uaccess_kernel in addr_limit_user_check

On Mon, Jul 20, 2020 at 10:30:30PM -0700, Guenter Roeck wrote:
> Guess I lost it somewhere. Are you saying the check was wrong all along
> and your patch fixed it ?

Oh, it is a little complicated.

Normally we have two address space limits, KERNEL_DS and USER_DS,
and they are supposed to be different. armnommu and m68k define them
to the same value for no good reason. That leads to:

uaccess_kernel always returning true as it does a positive check
agains KERNEL_DS, which disables a bunch of drivers like sg and
rdma, and could also lead to really strange and probably broken
results in a few places.

It also leads to the SIGKILL in addr_limit_user_check never
triggering due to the negatŃ–ve check, which is ok as the limits
never are different.