2019-10-01 01:12:11

by Aleksa Sarai

[permalink] [raw]
Subject: [PATCH v4 0/4] lib: introduce copy_struct_from_user() helper

Patch changelog:
v4:
* __always_inline copy_struct_from_user(). [Kees Cook]
* Rework test_user_copy.ko changes. [Kees Cook]
v3: <https://lore.kernel.org/lkml/[email protected]/>
<https://lore.kernel.org/lkml/[email protected]/>
v2: <https://lore.kernel.org/lkml/[email protected]/>
v1: <https://lore.kernel.org/lkml/[email protected]/>

This series was split off from the openat2(2) syscall discussion[1].
However, the copy_struct_to_user() helper has been dropped, because
after some discussion it appears that there is no really obvious
semantics for how copy_struct_to_user() should work on mixed-vintages
(for instance, whether [2] is the correct semantics for all syscalls).

A common pattern for syscall extensions is increasing the size of a
struct passed from userspace, such that the zero-value of the new fields
result in the old kernel behaviour (allowing for a mix of userspace and
kernel vintages to operate on one another in most cases).

Previously there was no common lib/ function that implemented
the necessary extension-checking semantics (and different syscalls
implemented them slightly differently or incompletely[3]). This series
implements the helper and ports several syscalls to use it.

Some in-kernel selftests are included in this patch. More complete
self-tests for copy_struct_from_user() are included in the openat2()
patchset.

[1]: https://lore.kernel.org/lkml/[email protected]/

[2]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
robustify sched_read_attr() ABI logic and code")

[3]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
similar checks to copy_struct_from_user() while rt_sigprocmask(2)
always rejects differently-sized struct arguments.

Aleksa Sarai (4):
lib: introduce copy_struct_from_user() helper
clone3: switch to copy_struct_from_user()
sched_setattr: switch to copy_struct_from_user()
perf_event_open: switch to copy_struct_from_user()

include/linux/bitops.h | 7 ++
include/linux/uaccess.h | 70 +++++++++++++++++++
include/uapi/linux/sched.h | 2 +
kernel/events/core.c | 47 +++----------
kernel/fork.c | 34 ++--------
kernel/sched/core.c | 43 ++----------
lib/strnlen_user.c | 8 +--
lib/test_user_copy.c | 136 +++++++++++++++++++++++++++++++++++--
lib/usercopy.c | 55 +++++++++++++++
9 files changed, 288 insertions(+), 114 deletions(-)

--
2.23.0


2019-10-01 01:12:58

by Aleksa Sarai

[permalink] [raw]
Subject: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

A common pattern for syscall extensions is increasing the size of a
struct passed from userspace, such that the zero-value of the new fields
result in the old kernel behaviour (allowing for a mix of userspace and
kernel vintages to operate on one another in most cases).

While this interface exists for communication in both directions, only
one interface is straightforward to have reasonable semantics for
(userspace passing a struct to the kernel). For kernel returns to
userspace, what the correct semantics are (whether there should be an
error if userspace is unaware of a new extension) is very
syscall-dependent and thus probably cannot be unified between syscalls
(a good example of this problem is [1]).

Previously there was no common lib/ function that implemented
the necessary extension-checking semantics (and different syscalls
implemented them slightly differently or incompletely[2]). Future
patches replace common uses of this pattern to make use of
copy_struct_from_user().

Some in-kernel selftests that insure that the handling of alignment and
various byte patterns are all handled identically to memchr_inv() usage.

[1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
robustify sched_read_attr() ABI logic and code")

[2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
similar checks to copy_struct_from_user() while rt_sigprocmask(2)
always rejects differently-sized struct arguments.

Suggested-by: Rasmus Villemoes <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
---
include/linux/bitops.h | 7 +++
include/linux/uaccess.h | 70 +++++++++++++++++++++
lib/strnlen_user.c | 8 +--
lib/test_user_copy.c | 136 ++++++++++++++++++++++++++++++++++++++--
lib/usercopy.c | 55 ++++++++++++++++
5 files changed, 263 insertions(+), 13 deletions(-)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index cf074bce3eb3..c94a9ff9f082 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -4,6 +4,13 @@
#include <asm/types.h>
#include <linux/bits.h>

+/* Set bits in the first 'n' bytes when loaded from memory */
+#ifdef __LITTLE_ENDIAN
+# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
+#else
+# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
+#endif
+
#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 70bbdc38dc37..8abbc713f7fb 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -231,6 +231,76 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,

#endif /* ARCH_HAS_NOCACHE_UACCESS */

+extern int check_zeroed_user(const void __user *from, size_t size);
+
+/**
+ * copy_struct_from_user: copy a struct from userspace
+ * @dst: Destination address, in kernel space. This buffer must be @ksize
+ * bytes long.
+ * @ksize: Size of @dst struct.
+ * @src: Source address, in userspace.
+ * @usize: (Alleged) size of @src struct.
+ *
+ * Copies a struct from userspace to kernel space, in a way that guarantees
+ * backwards-compatibility for struct syscall arguments (as long as future
+ * struct extensions are made such that all new fields are *appended* to the
+ * old struct, and zeroed-out new fields have the same meaning as the old
+ * struct).
+ *
+ * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
+ * The recommended usage is something like the following:
+ *
+ * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
+ * {
+ * int err;
+ * struct foo karg = {};
+ *
+ * if (usize > PAGE_SIZE)
+ * return -E2BIG;
+ * if (usize < FOO_SIZE_VER0)
+ * return -EINVAL;
+ *
+ * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
+ * if (err)
+ * return err;
+ *
+ * // ...
+ * }
+ *
+ * There are three cases to consider:
+ * * If @usize == @ksize, then it's copied verbatim.
+ * * If @usize < @ksize, then the userspace has passed an old struct to a
+ * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
+ * are to be zero-filled.
+ * * If @usize > @ksize, then the userspace has passed a new struct to an
+ * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
+ * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
+ *
+ * Returns (in all cases, some data may have been copied):
+ * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
+ * * -EFAULT: access to userspace failed.
+ */
+static __always_inline
+int copy_struct_from_user(void *dst, size_t ksize,
+ const void __user *src, size_t usize)
+{
+ size_t size = min(ksize, usize);
+ size_t rest = max(ksize, usize) - size;
+
+ /* Deal with trailing bytes. */
+ if (usize < ksize) {
+ memset(dst + size, 0, rest);
+ } else if (usize > ksize) {
+ int ret = check_zeroed_user(src + size, rest);
+ if (ret <= 0)
+ return ret ?: -E2BIG;
+ }
+ /* Copy the interoperable parts of the struct. */
+ if (copy_from_user(dst, src, size))
+ return -EFAULT;
+ return 0;
+}
+
/*
* probe_kernel_read(): safely attempt to read from a location
* @dst: pointer to the buffer that shall take the data
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 28ff554a1be8..6c0005d5dd5c 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -3,16 +3,10 @@
#include <linux/export.h>
#include <linux/uaccess.h>
#include <linux/mm.h>
+#include <linux/bitops.h>

#include <asm/word-at-a-time.h>

-/* Set bits in the first 'n' bytes when loaded from memory */
-#ifdef __LITTLE_ENDIAN
-# define aligned_byte_mask(n) ((1ul << 8*(n))-1)
-#else
-# define aligned_byte_mask(n) (~0xfful << (BITS_PER_LONG - 8 - 8*(n)))
-#endif
-
/*
* Do a strnlen, return length of string *with* final '\0'.
* 'count' is the user-supplied count, while 'max' is the
diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
index 67bcd5dfd847..950ee88cd6ac 100644
--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -31,14 +31,133 @@
# define TEST_U64
#endif

-#define test(condition, msg) \
-({ \
- int cond = (condition); \
- if (cond) \
- pr_warn("%s\n", msg); \
- cond; \
+#define test(condition, msg, ...) \
+({ \
+ int cond = (condition); \
+ if (cond) \
+ pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
+ cond; \
})

+static bool is_zeroed(void *from, size_t size)
+{
+ return memchr_inv(from, 0x0, size) == NULL;
+}
+
+static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
+{
+ int ret = 0;
+ size_t start, end, i;
+ size_t zero_start = size / 4;
+ size_t zero_end = size - zero_start;
+
+ /*
+ * We conduct a series of check_nonzero_user() tests on a block of memory
+ * with the following byte-pattern (trying every possible [start,end]
+ * pair):
+ *
+ * [ 00 ff 00 ff ... 00 00 00 00 ... ff 00 ff 00 ]
+ *
+ * And we verify that check_nonzero_user() acts identically to memchr_inv().
+ */
+
+ memset(kmem, 0x0, size);
+ for (i = 1; i < zero_start; i += 2)
+ kmem[i] = 0xff;
+ for (i = zero_end; i < size; i += 2)
+ kmem[i] = 0xff;
+
+ ret |= test(copy_to_user(umem, kmem, size),
+ "legitimate copy_to_user failed");
+
+ for (start = 0; start <= size; start++) {
+ for (end = start; end <= size; end++) {
+ size_t len = end - start;
+ int retval = check_zeroed_user(umem + start, len);
+ int expected = is_zeroed(kmem + start, len);
+
+ ret |= test(retval != expected,
+ "check_nonzero_user(=%d) != memchr_inv(=%d) mismatch (start=%zu, end=%zu)",
+ retval, expected, start, end);
+ }
+ }
+
+ return ret;
+}
+
+static int test_copy_struct_from_user(char *kmem, char __user *umem,
+ size_t size)
+{
+ int ret = 0;
+ char *umem_src = NULL, *expected = NULL;
+ size_t ksize, usize;
+
+ umem_src = kmalloc(size, GFP_KERNEL);
+ if (ret |= test(umem_src == NULL, "kmalloc failed"))
+ goto out_free;
+
+ expected = kmalloc(size, GFP_KERNEL);
+ if (ret |= test(expected == NULL, "kmalloc failed"))
+ goto out_free;
+
+ /* Fill umem with a fixed byte pattern. */
+ memset(umem_src, 0x3e, size);
+ ret |= test(copy_to_user(umem, umem_src, size),
+ "legitimate copy_to_user failed");
+
+ /* Check basic case -- (usize == ksize). */
+ ksize = size;
+ usize = size;
+
+ memcpy(expected, umem_src, ksize);
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
+ "copy_struct_from_user(usize == ksize) failed");
+ ret |= test(memcmp(kmem, expected, ksize),
+ "copy_struct_from_user(usize == ksize) gives unexpected copy");
+
+ /* Old userspace case -- (usize < ksize). */
+ ksize = size;
+ usize = size / 2;
+
+ memcpy(expected, umem_src, usize);
+ memset(expected + usize, 0x0, ksize - usize);
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
+ "copy_struct_from_user(usize < ksize) failed");
+ ret |= test(memcmp(kmem, expected, ksize),
+ "copy_struct_from_user(usize < ksize) gives unexpected copy");
+
+ /* New userspace (-E2BIG) case -- (usize > ksize). */
+ ksize = size / 2;
+ usize = size;
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize) != -E2BIG,
+ "copy_struct_from_user(usize > ksize) didn't give E2BIG");
+
+ /* New userspace (success) case -- (usize > ksize). */
+ ksize = size / 2;
+ usize = size;
+
+ memcpy(expected, umem_src, ksize);
+ ret |= test(clear_user(umem + ksize, usize - ksize),
+ "legitimate clear_user failed");
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
+ "copy_struct_from_user(usize > ksize) failed");
+ ret |= test(memcmp(kmem, expected, ksize),
+ "copy_struct_from_user(usize > ksize) gives unexpected copy");
+
+out_free:
+ kfree(expected);
+ kfree(umem_src);
+ return ret;
+}
+
static int __init test_user_copy_init(void)
{
int ret = 0;
@@ -106,6 +225,11 @@ static int __init test_user_copy_init(void)
#endif
#undef test_legit

+ /* Test usage of check_nonzero_user(). */
+ ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
+ /* Test usage of copy_struct_from_user(). */
+ ret |= test_copy_struct_from_user(kmem, usermem, 2 * PAGE_SIZE);
+
/*
* Invalid usage: none of these copies should succeed.
*/
diff --git a/lib/usercopy.c b/lib/usercopy.c
index c2bfbcaeb3dc..cbb4d9ec00f2 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/uaccess.h>
+#include <linux/bitops.h>

/* out-of-line parts */

@@ -31,3 +32,57 @@ unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
}
EXPORT_SYMBOL(_copy_to_user);
#endif
+
+/**
+ * check_zeroed_user: check if a userspace buffer only contains zero bytes
+ * @from: Source address, in userspace.
+ * @size: Size of buffer.
+ *
+ * This is effectively shorthand for "memchr_inv(from, 0, size) == NULL" for
+ * userspace addresses (and is more efficient because we don't care where the
+ * first non-zero byte is).
+ *
+ * Returns:
+ * * 0: There were non-zero bytes present in the buffer.
+ * * 1: The buffer was full of zero bytes.
+ * * -EFAULT: access to userspace failed.
+ */
+int check_zeroed_user(const void __user *from, size_t size)
+{
+ unsigned long val;
+ uintptr_t align = (uintptr_t) from % sizeof(unsigned long);
+
+ if (unlikely(size == 0))
+ return 1;
+
+ from -= align;
+ size += align;
+
+ if (!user_access_begin(from, size))
+ return -EFAULT;
+
+ unsafe_get_user(val, (unsigned long __user *) from, err_fault);
+ if (align)
+ val &= ~aligned_byte_mask(align);
+
+ while (size > sizeof(unsigned long)) {
+ if (unlikely(val))
+ goto done;
+
+ from += sizeof(unsigned long);
+ size -= sizeof(unsigned long);
+
+ unsafe_get_user(val, (unsigned long __user *) from, err_fault);
+ }
+
+ if (size < sizeof(unsigned long))
+ val &= aligned_byte_mask(size);
+
+done:
+ user_access_end();
+ return (val == 0);
+err_fault:
+ user_access_end();
+ return -EFAULT;
+}
+EXPORT_SYMBOL(check_zeroed_user);
--
2.23.0

2019-10-01 01:14:19

by Aleksa Sarai

[permalink] [raw]
Subject: [PATCH v4 3/4] sched_setattr: switch to copy_struct_from_user()

The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls. Ideally we could also
unify sched_getattr(2)-style syscalls as well, but unfortunately the
correct semantics for such syscalls are much less clear (see [1] for
more detail). In future we could come up with a more sane idea for how
the syscall interface should look.

[1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
robustify sched_read_attr() ABI logic and code")

Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
---
kernel/sched/core.c | 43 +++++++------------------------------------
1 file changed, 7 insertions(+), 36 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7880f4f64d0e..dd05a378631a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5106,9 +5106,6 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
u32 size;
int ret;

- if (!access_ok(uattr, SCHED_ATTR_SIZE_VER0))
- return -EFAULT;
-
/* Zero the full structure, so that a short copy will be nice: */
memset(attr, 0, sizeof(*attr));

@@ -5116,45 +5113,19 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
if (ret)
return ret;

- /* Bail out on silly large: */
- if (size > PAGE_SIZE)
- goto err_size;
-
/* ABI compatibility quirk: */
if (!size)
size = SCHED_ATTR_SIZE_VER0;
-
- if (size < SCHED_ATTR_SIZE_VER0)
+ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
goto err_size;

- /*
- * If we're handed a bigger struct than we know of,
- * ensure all the unknown bits are 0 - i.e. new
- * user-space does not rely on any kernel feature
- * extensions we dont know about yet.
- */
- if (size > sizeof(*attr)) {
- unsigned char __user *addr;
- unsigned char __user *end;
- unsigned char val;
-
- addr = (void __user *)uattr + sizeof(*attr);
- end = (void __user *)uattr + size;
-
- for (; addr < end; addr++) {
- ret = get_user(val, addr);
- if (ret)
- return ret;
- if (val)
- goto err_size;
- }
- size = sizeof(*attr);
+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
+ if (ret) {
+ if (ret == -E2BIG)
+ goto err_size;
+ return ret;
}

- ret = copy_from_user(attr, uattr, size);
- if (ret)
- return -EFAULT;
-
if ((attr->sched_flags & SCHED_FLAG_UTIL_CLAMP) &&
size < SCHED_ATTR_SIZE_VER1)
return -EINVAL;
@@ -5354,7 +5325,7 @@ sched_attr_copy_to_user(struct sched_attr __user *uattr,
* sys_sched_getattr - similar to sched_getparam, but with sched_attr
* @pid: the pid in question.
* @uattr: structure containing the extended parameters.
- * @usize: sizeof(attr) that user-space knows about, for forwards and backwards compatibility.
+ * @usize: sizeof(attr) for fwd/bwd comp.
* @flags: for future extension.
*/
SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
--
2.23.0

2019-10-01 01:15:22

by Aleksa Sarai

[permalink] [raw]
Subject: [PATCH v4 2/4] clone3: switch to copy_struct_from_user()

The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls. Additionally, explicitly
define CLONE_ARGS_SIZE_VER0 to match the other users of the
struct-extension pattern.

Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
---
include/uapi/linux/sched.h | 2 ++
kernel/fork.c | 34 +++++++---------------------------
2 files changed, 9 insertions(+), 27 deletions(-)

diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index b3105ac1381a..0945805982b4 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -47,6 +47,8 @@ struct clone_args {
__aligned_u64 tls;
};

+#define CLONE_ARGS_SIZE_VER0 64 /* sizeof first published struct */
+
/*
* Scheduling policies
*/
diff --git a/kernel/fork.c b/kernel/fork.c
index f9572f416126..2ef529869c64 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2525,39 +2525,19 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
#ifdef __ARCH_WANT_SYS_CLONE3
noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
struct clone_args __user *uargs,
- size_t size)
+ size_t usize)
{
+ int err;
struct clone_args args;

- if (unlikely(size > PAGE_SIZE))
+ if (unlikely(usize > PAGE_SIZE))
return -E2BIG;
-
- if (unlikely(size < sizeof(struct clone_args)))
+ if (unlikely(usize < CLONE_ARGS_SIZE_VER0))
return -EINVAL;

- if (unlikely(!access_ok(uargs, size)))
- return -EFAULT;
-
- if (size > sizeof(struct clone_args)) {
- unsigned char __user *addr;
- unsigned char __user *end;
- unsigned char val;
-
- addr = (void __user *)uargs + sizeof(struct clone_args);
- end = (void __user *)uargs + size;
-
- for (; addr < end; addr++) {
- if (get_user(val, addr))
- return -EFAULT;
- if (val)
- return -E2BIG;
- }
-
- size = sizeof(struct clone_args);
- }
-
- if (copy_from_user(&args, uargs, size))
- return -EFAULT;
+ err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
+ if (err)
+ return err;

/*
* Verify that higher 32bits of exit_signal are unset and that
--
2.23.0

2019-10-01 01:16:28

by Aleksa Sarai

[permalink] [raw]
Subject: [PATCH v4 4/4] perf_event_open: switch to copy_struct_from_user()

The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls.

Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
---
kernel/events/core.c | 47 +++++++++-----------------------------------
1 file changed, 9 insertions(+), 38 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4655adbbae10..3f0cb82e4fbc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10586,55 +10586,26 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
u32 size;
int ret;

- if (!access_ok(uattr, PERF_ATTR_SIZE_VER0))
- return -EFAULT;
-
- /*
- * zero the full structure, so that a short copy will be nice.
- */
+ /* Zero the full structure, so that a short copy will be nice. */
memset(attr, 0, sizeof(*attr));

ret = get_user(size, &uattr->size);
if (ret)
return ret;

- if (size > PAGE_SIZE) /* silly large */
- goto err_size;
-
- if (!size) /* abi compat */
+ /* ABI compatibility quirk: */
+ if (!size)
size = PERF_ATTR_SIZE_VER0;
-
- if (size < PERF_ATTR_SIZE_VER0)
+ if (size < PERF_ATTR_SIZE_VER0 || size > PAGE_SIZE)
goto err_size;

- /*
- * If we're handed a bigger struct than we know of,
- * ensure all the unknown bits are 0 - i.e. new
- * user-space does not rely on any kernel feature
- * extensions we dont know about yet.
- */
- if (size > sizeof(*attr)) {
- unsigned char __user *addr;
- unsigned char __user *end;
- unsigned char val;
-
- addr = (void __user *)uattr + sizeof(*attr);
- end = (void __user *)uattr + size;
-
- for (; addr < end; addr++) {
- ret = get_user(val, addr);
- if (ret)
- return ret;
- if (val)
- goto err_size;
- }
- size = sizeof(*attr);
+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
+ if (ret) {
+ if (ret == -E2BIG)
+ goto err_size;
+ return ret;
}

- ret = copy_from_user(attr, uattr, size);
- if (ret)
- return -EFAULT;
-
attr->size = size;

if (attr->__reserved_1)
--
2.23.0

2019-10-01 01:59:40

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

On Tue, Oct 01, 2019 at 11:10:52AM +1000, Aleksa Sarai wrote:
> A common pattern for syscall extensions is increasing the size of a
> struct passed from userspace, such that the zero-value of the new fields
> result in the old kernel behaviour (allowing for a mix of userspace and
> kernel vintages to operate on one another in most cases).
>
> While this interface exists for communication in both directions, only
> one interface is straightforward to have reasonable semantics for
> (userspace passing a struct to the kernel). For kernel returns to
> userspace, what the correct semantics are (whether there should be an
> error if userspace is unaware of a new extension) is very
> syscall-dependent and thus probably cannot be unified between syscalls
> (a good example of this problem is [1]).
>
> Previously there was no common lib/ function that implemented
> the necessary extension-checking semantics (and different syscalls
> implemented them slightly differently or incompletely[2]). Future
> patches replace common uses of this pattern to make use of
> copy_struct_from_user().
>
> Some in-kernel selftests that insure that the handling of alignment and
> various byte patterns are all handled identically to memchr_inv() usage.
>
> [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> always rejects differently-sized struct arguments.
>
> Suggested-by: Rasmus Villemoes <[email protected]>
> Signed-off-by: Aleksa Sarai <[email protected]>
> ---
> include/linux/bitops.h | 7 +++
> include/linux/uaccess.h | 70 +++++++++++++++++++++
> lib/strnlen_user.c | 8 +--
> lib/test_user_copy.c | 136 ++++++++++++++++++++++++++++++++++++++--
> lib/usercopy.c | 55 ++++++++++++++++
> 5 files changed, 263 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index cf074bce3eb3..c94a9ff9f082 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -4,6 +4,13 @@
> #include <asm/types.h>
> #include <linux/bits.h>
>
> +/* Set bits in the first 'n' bytes when loaded from memory */
> +#ifdef __LITTLE_ENDIAN
> +# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
> +#else
> +# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> +#endif
> +
> #define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
>
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 70bbdc38dc37..8abbc713f7fb 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -231,6 +231,76 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
>
> #endif /* ARCH_HAS_NOCACHE_UACCESS */
>
> +extern int check_zeroed_user(const void __user *from, size_t size);
> +
> +/**
> + * copy_struct_from_user: copy a struct from userspace
> + * @dst: Destination address, in kernel space. This buffer must be @ksize
> + * bytes long.
> + * @ksize: Size of @dst struct.
> + * @src: Source address, in userspace.
> + * @usize: (Alleged) size of @src struct.
> + *
> + * Copies a struct from userspace to kernel space, in a way that guarantees
> + * backwards-compatibility for struct syscall arguments (as long as future
> + * struct extensions are made such that all new fields are *appended* to the
> + * old struct, and zeroed-out new fields have the same meaning as the old
> + * struct).
> + *
> + * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
> + * The recommended usage is something like the following:
> + *
> + * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> + * {
> + * int err;
> + * struct foo karg = {};
> + *
> + * if (usize > PAGE_SIZE)
> + * return -E2BIG;
> + * if (usize < FOO_SIZE_VER0)
> + * return -EINVAL;
> + *
> + * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
> + * if (err)
> + * return err;
> + *
> + * // ...
> + * }
> + *
> + * There are three cases to consider:
> + * * If @usize == @ksize, then it's copied verbatim.
> + * * If @usize < @ksize, then the userspace has passed an old struct to a
> + * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> + * are to be zero-filled.
> + * * If @usize > @ksize, then the userspace has passed a new struct to an
> + * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> + * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> + *
> + * Returns (in all cases, some data may have been copied):
> + * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
> + * * -EFAULT: access to userspace failed.
> + */
> +static __always_inline
> +int copy_struct_from_user(void *dst, size_t ksize,
> + const void __user *src, size_t usize)

And of course I forgot to realize both this and check_zeroed_user()
should also have the __must_check attribute. Sorry for forgetting that
earlier!

With that, please consider it:

Reviewed-by: Kees Cook <[email protected]>

Thanks for working on this!

--
Kees Cook

2019-10-01 02:33:41

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] clone3: switch to copy_struct_from_user()

On Tue, Oct 01, 2019 at 11:10:53AM +1000, Aleksa Sarai wrote:
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls. Additionally, explicitly
> define CLONE_ARGS_SIZE_VER0 to match the other users of the
> struct-extension pattern.
>
> Reviewed-by: Kees Cook <[email protected]>
> Signed-off-by: Aleksa Sarai <[email protected]>

Reviewed-by: Christian Brauner <[email protected]>

2019-10-01 02:35:01

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

On Mon, Sep 30, 2019 at 06:58:39PM -0700, Kees Cook wrote:
> On Tue, Oct 01, 2019 at 11:10:52AM +1000, Aleksa Sarai wrote:
> > A common pattern for syscall extensions is increasing the size of a
> > struct passed from userspace, such that the zero-value of the new fields
> > result in the old kernel behaviour (allowing for a mix of userspace and
> > kernel vintages to operate on one another in most cases).
> >
> > While this interface exists for communication in both directions, only
> > one interface is straightforward to have reasonable semantics for
> > (userspace passing a struct to the kernel). For kernel returns to
> > userspace, what the correct semantics are (whether there should be an
> > error if userspace is unaware of a new extension) is very
> > syscall-dependent and thus probably cannot be unified between syscalls
> > (a good example of this problem is [1]).
> >
> > Previously there was no common lib/ function that implemented
> > the necessary extension-checking semantics (and different syscalls
> > implemented them slightly differently or incompletely[2]). Future
> > patches replace common uses of this pattern to make use of
> > copy_struct_from_user().
> >
> > Some in-kernel selftests that insure that the handling of alignment and
> > various byte patterns are all handled identically to memchr_inv() usage.
> >
> > [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> > robustify sched_read_attr() ABI logic and code")
> >
> > [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> > similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> > always rejects differently-sized struct arguments.
> >
> > Suggested-by: Rasmus Villemoes <[email protected]>
> > Signed-off-by: Aleksa Sarai <[email protected]>
> > ---
> > include/linux/bitops.h | 7 +++
> > include/linux/uaccess.h | 70 +++++++++++++++++++++
> > lib/strnlen_user.c | 8 +--
> > lib/test_user_copy.c | 136 ++++++++++++++++++++++++++++++++++++++--
> > lib/usercopy.c | 55 ++++++++++++++++
> > 5 files changed, 263 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> > index cf074bce3eb3..c94a9ff9f082 100644
> > --- a/include/linux/bitops.h
> > +++ b/include/linux/bitops.h
> > @@ -4,6 +4,13 @@
> > #include <asm/types.h>
> > #include <linux/bits.h>
> >
> > +/* Set bits in the first 'n' bytes when loaded from memory */
> > +#ifdef __LITTLE_ENDIAN
> > +# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
> > +#else
> > +# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> > +#endif
> > +
> > #define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> > #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
> >
> > diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> > index 70bbdc38dc37..8abbc713f7fb 100644
> > --- a/include/linux/uaccess.h
> > +++ b/include/linux/uaccess.h
> > @@ -231,6 +231,76 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
> >
> > #endif /* ARCH_HAS_NOCACHE_UACCESS */
> >
> > +extern int check_zeroed_user(const void __user *from, size_t size);
> > +
> > +/**
> > + * copy_struct_from_user: copy a struct from userspace
> > + * @dst: Destination address, in kernel space. This buffer must be @ksize
> > + * bytes long.
> > + * @ksize: Size of @dst struct.
> > + * @src: Source address, in userspace.
> > + * @usize: (Alleged) size of @src struct.
> > + *
> > + * Copies a struct from userspace to kernel space, in a way that guarantees
> > + * backwards-compatibility for struct syscall arguments (as long as future
> > + * struct extensions are made such that all new fields are *appended* to the
> > + * old struct, and zeroed-out new fields have the same meaning as the old
> > + * struct).
> > + *
> > + * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
> > + * The recommended usage is something like the following:
> > + *
> > + * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> > + * {
> > + * int err;
> > + * struct foo karg = {};
> > + *
> > + * if (usize > PAGE_SIZE)
> > + * return -E2BIG;
> > + * if (usize < FOO_SIZE_VER0)
> > + * return -EINVAL;
> > + *
> > + * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
> > + * if (err)
> > + * return err;
> > + *
> > + * // ...
> > + * }
> > + *
> > + * There are three cases to consider:
> > + * * If @usize == @ksize, then it's copied verbatim.
> > + * * If @usize < @ksize, then the userspace has passed an old struct to a
> > + * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> > + * are to be zero-filled.
> > + * * If @usize > @ksize, then the userspace has passed a new struct to an
> > + * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> > + * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> > + *
> > + * Returns (in all cases, some data may have been copied):
> > + * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
> > + * * -EFAULT: access to userspace failed.
> > + */
> > +static __always_inline
> > +int copy_struct_from_user(void *dst, size_t ksize,
> > + const void __user *src, size_t usize)
>
> And of course I forgot to realize both this and check_zeroed_user()
> should also have the __must_check attribute. Sorry for forgetting that
> earlier!

Just said to Aleksa that I'll just fix this up when I apply so he
doesn't have to resend. You ok with this, Kees?

>
> With that, please consider it:
>
> Reviewed-by: Kees Cook <[email protected]>

Reviewed-by: Christian Brauner <[email protected]>

2019-10-01 02:36:20

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] sched_setattr: switch to copy_struct_from_user()

On Tue, Oct 01, 2019 at 11:10:54AM +1000, Aleksa Sarai wrote:
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls. Ideally we could also
> unify sched_getattr(2)-style syscalls as well, but unfortunately the
> correct semantics for such syscalls are much less clear (see [1] for
> more detail). In future we could come up with a more sane idea for how
> the syscall interface should look.
>
> [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> Reviewed-by: Kees Cook <[email protected]>
> Signed-off-by: Aleksa Sarai <[email protected]>

Reviewed-by: Christian Brauner <[email protected]>

2019-10-01 02:36:51

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] perf_event_open: switch to copy_struct_from_user()

On Tue, Oct 01, 2019 at 11:10:55AM +1000, Aleksa Sarai wrote:
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls.
>
> Reviewed-by: Kees Cook <[email protected]>
> Signed-off-by: Aleksa Sarai <[email protected]>

Reviewed-by: Christian Brauner <[email protected]>

2019-10-01 16:51:54

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

On Tue, Oct 01, 2019 at 04:31:27AM +0200, Christian Brauner wrote:
> On Mon, Sep 30, 2019 at 06:58:39PM -0700, Kees Cook wrote:
> > On Tue, Oct 01, 2019 at 11:10:52AM +1000, Aleksa Sarai wrote:
> > > +static __always_inline
> > > +int copy_struct_from_user(void *dst, size_t ksize,
> > > + const void __user *src, size_t usize)
> >
> > And of course I forgot to realize both this and check_zeroed_user()
> > should also have the __must_check attribute. Sorry for forgetting that
> > earlier!
>
> Just said to Aleksa that I'll just fix this up when I apply so he
> doesn't have to resend. You ok with this, Kees?

Yup; that's totally fine. Thanks!

--
Kees Cook

2019-10-01 16:52:17

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] lib: introduce copy_struct_from_user() helper

On Tue, Oct 01, 2019 at 11:10:51AM +1000, Aleksa Sarai wrote:
> Patch changelog:
> v4:
> * __always_inline copy_struct_from_user(). [Kees Cook]
> * Rework test_user_copy.ko changes. [Kees Cook]
> v3: <https://lore.kernel.org/lkml/[email protected]/>
> <https://lore.kernel.org/lkml/[email protected]/>
> v2: <https://lore.kernel.org/lkml/[email protected]/>
> v1: <https://lore.kernel.org/lkml/[email protected]/>
>
> This series was split off from the openat2(2) syscall discussion[1].
> However, the copy_struct_to_user() helper has been dropped, because
> after some discussion it appears that there is no really obvious
> semantics for how copy_struct_to_user() should work on mixed-vintages
> (for instance, whether [2] is the correct semantics for all syscalls).
>
> A common pattern for syscall extensions is increasing the size of a
> struct passed from userspace, such that the zero-value of the new fields
> result in the old kernel behaviour (allowing for a mix of userspace and
> kernel vintages to operate on one another in most cases).
>
> Previously there was no common lib/ function that implemented
> the necessary extension-checking semantics (and different syscalls
> implemented them slightly differently or incompletely[3]). This series
> implements the helper and ports several syscalls to use it.
>
> Some in-kernel selftests are included in this patch. More complete
> self-tests for copy_struct_from_user() are included in the openat2()
> patchset.
>
> [1]: https://lore.kernel.org/lkml/[email protected]/
>
> [2]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> [3]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> always rejects differently-sized struct arguments.
>
> Aleksa Sarai (4):
> lib: introduce copy_struct_from_user() helper
> clone3: switch to copy_struct_from_user()
> sched_setattr: switch to copy_struct_from_user()
> perf_event_open: switch to copy_struct_from_user()
>
> include/linux/bitops.h | 7 ++
> include/linux/uaccess.h | 70 +++++++++++++++++++
> include/uapi/linux/sched.h | 2 +
> kernel/events/core.c | 47 +++----------
> kernel/fork.c | 34 ++--------
> kernel/sched/core.c | 43 ++----------
> lib/strnlen_user.c | 8 +--
> lib/test_user_copy.c | 136 +++++++++++++++++++++++++++++++++++--
> lib/usercopy.c | 55 +++++++++++++++
> 9 files changed, 288 insertions(+), 114 deletions(-)

I've picked this up now and it's sitting in
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user

It should show up in linux-next tomorrow. I will let this sit there for
a few days but overall this seems good to have in rc2.
If someone objects and prefers to take it through their tree I can drop
it.

Christian

2019-10-10 11:20:37

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

Hi Aleksa,

Aleksa Sarai <[email protected]> writes:
> A common pattern for syscall extensions is increasing the size of a
> struct passed from userspace, such that the zero-value of the new fields
> result in the old kernel behaviour (allowing for a mix of userspace and
> kernel vintages to operate on one another in most cases).
>
> While this interface exists for communication in both directions, only
> one interface is straightforward to have reasonable semantics for
> (userspace passing a struct to the kernel). For kernel returns to
> userspace, what the correct semantics are (whether there should be an
> error if userspace is unaware of a new extension) is very
> syscall-dependent and thus probably cannot be unified between syscalls
> (a good example of this problem is [1]).
>
> Previously there was no common lib/ function that implemented
> the necessary extension-checking semantics (and different syscalls
> implemented them slightly differently or incompletely[2]). Future
> patches replace common uses of this pattern to make use of
> copy_struct_from_user().
>
> Some in-kernel selftests that insure that the handling of alignment and
> various byte patterns are all handled identically to memchr_inv() usage.
...
> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> index 67bcd5dfd847..950ee88cd6ac 100644
> --- a/lib/test_user_copy.c
> +++ b/lib/test_user_copy.c
> @@ -31,14 +31,133 @@
...
> +static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> +{
> + int ret = 0;
> + size_t start, end, i;
> + size_t zero_start = size / 4;
> + size_t zero_end = size - zero_start;
> +
> + /*
> + * We conduct a series of check_nonzero_user() tests on a block of memory
> + * with the following byte-pattern (trying every possible [start,end]
> + * pair):
> + *
> + * [ 00 ff 00 ff ... 00 00 00 00 ... ff 00 ff 00 ]
> + *
> + * And we verify that check_nonzero_user() acts identically to memchr_inv().
> + */
> +
> + memset(kmem, 0x0, size);
> + for (i = 1; i < zero_start; i += 2)
> + kmem[i] = 0xff;
> + for (i = zero_end; i < size; i += 2)
> + kmem[i] = 0xff;
> +
> + ret |= test(copy_to_user(umem, kmem, size),
> + "legitimate copy_to_user failed");
> +
> + for (start = 0; start <= size; start++) {
> + for (end = start; end <= size; end++) {
> + size_t len = end - start;
> + int retval = check_zeroed_user(umem + start, len);
> + int expected = is_zeroed(kmem + start, len);
> +
> + ret |= test(retval != expected,
> + "check_nonzero_user(=%d) != memchr_inv(=%d) mismatch (start=%zu, end=%zu)",
> + retval, expected, start, end);
> + }
> + }

This is causing soft lockups for me on powerpc, eg:

[ 188.208315] watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
[ 188.208782] Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
[ 188.209594] CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
[ 188.210392] NIP: c000000000173650 LR: c000000000379cb0 CTR: c0000000007b20d0
[ 188.210612] REGS: c0000000ec213560 TRAP: 0901 Tainted: G L (5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty)
[ 188.210876] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28222422 XER: 20000000
[ 188.211060] CFAR: c000000000379cac IRQMASK: 0
[ 188.211060] GPR00: c000000000379cb0 c0000000ec2137f0 c0000000013bbb00 c000000000f527f0
[ 188.211060] GPR04: 000000000000004b 0000000000000000 00000000000085f5 c00000000fffb780
[ 188.211060] GPR08: 0000000000000000 0000000000000000 c0000000fb9a3080 c008000000411478
[ 188.211060] GPR12: c0000000007b20d0 c00000000fffb780
[ 188.211802] NIP [c000000000173650] __might_sleep+0x20/0xc0
[ 188.211924] LR [c000000000379cb0] __might_fault+0x40/0x60
[ 188.212037] Call Trace:
[ 188.212101] [c0000000ec2137f0] [c0000000001b99b4] vprintk_func+0xc4/0x230 (unreliable)
[ 188.212274] [c0000000ec213810] [c0000000007b21fc] check_zeroed_user+0x12c/0x200
[ 188.212478] [c0000000ec213860] [c0080000004106cc] test_user_copy_init+0x67c/0x1210 [test_user_copy]
[ 188.212681] [c0000000ec2139a0] [c000000000010440] do_one_initcall+0x60/0x340
[ 188.212859] [c0000000ec213a70] [c000000000213d4c] do_init_module+0x7c/0x2f0
[ 188.213004] [c0000000ec213b00] [c000000000216f24] load_module+0x2d94/0x30e0
[ 188.213150] [c0000000ec213d00] [c000000000217578] __do_sys_finit_module+0xc8/0x150
[ 188.213350] [c0000000ec213e20] [c00000000000b5d8] system_call+0x5c/0x68
[ 188.213494] Instruction dump:
[ 188.213587] 409efec0 4e800020 60000000 60000000 3c4c0125 384284d0 7c0802a6 60000000
[ 188.213767] fba1ffe8 fbc1fff0 fbe1fff8 7c9e2378 <f821ff81> 7c7f1b78 7cbd2b78 e94d0958


I think it's partly because I have DEBUG_ATOMIC_SLEEP enabled, which
means each unsafe_get_user() calls __might_fault() etc.

But even turning that off, it still takes forever.

> @@ -106,6 +225,11 @@ static int __init test_user_copy_init(void)
> #endif
> #undef test_legit
>
> + /* Test usage of check_nonzero_user(). */
> + ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);

I suspect it's just that PAGE_SIZE for me is 64K, and so the nested loop
above gets too big too fast.

If my math is right it's doing about 500 million iterations, vs ~2
million on a 4K kernel.

If I do the change below the entire test_user_copy module loads and runs
all the tests in about 10 seconds.

diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
index 950ee88cd6ac..03b617a36144 100644
--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -226,7 +226,7 @@ static int __init test_user_copy_init(void)
#undef test_legit

/* Test usage of check_nonzero_user(). */
- ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
+ ret |= test_check_nonzero_user(kmem, usermem, 2 * 4096);
/* Test usage of copy_struct_from_user(). */
ret |= test_copy_struct_from_user(kmem, usermem, 2 * PAGE_SIZE);


How long does it take on your systems? Is 10s in the ball park, or is
there something else pathological happening on my machine, and shrinking
it to 4096 is just papering over it?

cheers

2019-10-10 11:42:29

by Aleksa Sarai

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

On 2019-10-10, Michael Ellerman <[email protected]> wrote:
> Aleksa Sarai <[email protected]> writes:
> > A common pattern for syscall extensions is increasing the size of a
> > struct passed from userspace, such that the zero-value of the new fields
> > result in the old kernel behaviour (allowing for a mix of userspace and
> > kernel vintages to operate on one another in most cases).
> >
> > While this interface exists for communication in both directions, only
> > one interface is straightforward to have reasonable semantics for
> > (userspace passing a struct to the kernel). For kernel returns to
> > userspace, what the correct semantics are (whether there should be an
> > error if userspace is unaware of a new extension) is very
> > syscall-dependent and thus probably cannot be unified between syscalls
> > (a good example of this problem is [1]).
> >
> > Previously there was no common lib/ function that implemented
> > the necessary extension-checking semantics (and different syscalls
> > implemented them slightly differently or incompletely[2]). Future
> > patches replace common uses of this pattern to make use of
> > copy_struct_from_user().
> >
> > Some in-kernel selftests that insure that the handling of alignment and
> > various byte patterns are all handled identically to memchr_inv() usage.
> ...
> > diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> > index 67bcd5dfd847..950ee88cd6ac 100644
> > --- a/lib/test_user_copy.c
> > +++ b/lib/test_user_copy.c
> > @@ -31,14 +31,133 @@
> ...
> > +static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> > +{
> > + int ret = 0;
> > + size_t start, end, i;
> > + size_t zero_start = size / 4;
> > + size_t zero_end = size - zero_start;
> > +
> > + /*
> > + * We conduct a series of check_nonzero_user() tests on a block of memory
> > + * with the following byte-pattern (trying every possible [start,end]
> > + * pair):
> > + *
> > + * [ 00 ff 00 ff ... 00 00 00 00 ... ff 00 ff 00 ]
> > + *
> > + * And we verify that check_nonzero_user() acts identically to memchr_inv().
> > + */
> > +
> > + memset(kmem, 0x0, size);
> > + for (i = 1; i < zero_start; i += 2)
> > + kmem[i] = 0xff;
> > + for (i = zero_end; i < size; i += 2)
> > + kmem[i] = 0xff;
> > +
> > + ret |= test(copy_to_user(umem, kmem, size),
> > + "legitimate copy_to_user failed");
> > +
> > + for (start = 0; start <= size; start++) {
> > + for (end = start; end <= size; end++) {
> > + size_t len = end - start;
> > + int retval = check_zeroed_user(umem + start, len);
> > + int expected = is_zeroed(kmem + start, len);
> > +
> > + ret |= test(retval != expected,
> > + "check_nonzero_user(=%d) != memchr_inv(=%d) mismatch (start=%zu, end=%zu)",
> > + retval, expected, start, end);
> > + }
> > + }
>
> This is causing soft lockups for me on powerpc, eg:
>
> [ 188.208315] watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> [ 188.208782] Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> [ 188.209594] CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> [ 188.210392] NIP: c000000000173650 LR: c000000000379cb0 CTR: c0000000007b20d0
> [ 188.210612] REGS: c0000000ec213560 TRAP: 0901 Tainted: G L (5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty)
> [ 188.210876] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28222422 XER: 20000000
> [ 188.211060] CFAR: c000000000379cac IRQMASK: 0
> [ 188.211060] GPR00: c000000000379cb0 c0000000ec2137f0 c0000000013bbb00 c000000000f527f0
> [ 188.211060] GPR04: 000000000000004b 0000000000000000 00000000000085f5 c00000000fffb780
> [ 188.211060] GPR08: 0000000000000000 0000000000000000 c0000000fb9a3080 c008000000411478
> [ 188.211060] GPR12: c0000000007b20d0 c00000000fffb780
> [ 188.211802] NIP [c000000000173650] __might_sleep+0x20/0xc0
> [ 188.211924] LR [c000000000379cb0] __might_fault+0x40/0x60
> [ 188.212037] Call Trace:
> [ 188.212101] [c0000000ec2137f0] [c0000000001b99b4] vprintk_func+0xc4/0x230 (unreliable)
> [ 188.212274] [c0000000ec213810] [c0000000007b21fc] check_zeroed_user+0x12c/0x200
> [ 188.212478] [c0000000ec213860] [c0080000004106cc] test_user_copy_init+0x67c/0x1210 [test_user_copy]
> [ 188.212681] [c0000000ec2139a0] [c000000000010440] do_one_initcall+0x60/0x340
> [ 188.212859] [c0000000ec213a70] [c000000000213d4c] do_init_module+0x7c/0x2f0
> [ 188.213004] [c0000000ec213b00] [c000000000216f24] load_module+0x2d94/0x30e0
> [ 188.213150] [c0000000ec213d00] [c000000000217578] __do_sys_finit_module+0xc8/0x150
> [ 188.213350] [c0000000ec213e20] [c00000000000b5d8] system_call+0x5c/0x68
> [ 188.213494] Instruction dump:
> [ 188.213587] 409efec0 4e800020 60000000 60000000 3c4c0125 384284d0 7c0802a6 60000000
> [ 188.213767] fba1ffe8 fbc1fff0 fbe1fff8 7c9e2378 <f821ff81> 7c7f1b78 7cbd2b78 e94d0958
>
>
> I think it's partly because I have DEBUG_ATOMIC_SLEEP enabled, which
> means each unsafe_get_user() calls __might_fault() etc.
>
> But even turning that off, it still takes forever.
>
> > @@ -106,6 +225,11 @@ static int __init test_user_copy_init(void)
> > #endif
> > #undef test_legit
> >
> > + /* Test usage of check_nonzero_user(). */
> > + ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
>
> I suspect it's just that PAGE_SIZE for me is 64K, and so the nested loop
> above gets too big too fast.
>
> If my math is right it's doing about 500 million iterations, vs ~2
> million on a 4K kernel.
>
> If I do the change below the entire test_user_copy module loads and runs
> all the tests in about 10 seconds.
>
> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> index 950ee88cd6ac..03b617a36144 100644
> --- a/lib/test_user_copy.c
> +++ b/lib/test_user_copy.c
> @@ -226,7 +226,7 @@ static int __init test_user_copy_init(void)
> #undef test_legit
>
> /* Test usage of check_nonzero_user(). */
> - ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
> + ret |= test_check_nonzero_user(kmem, usermem, 2 * 4096);
> /* Test usage of copy_struct_from_user(). */
> ret |= test_copy_struct_from_user(kmem, usermem, 2 * PAGE_SIZE);
>
>
> How long does it take on your systems? Is 10s in the ball park, or is
> there something else pathological happening on my machine, and shrinking
> it to 4096 is just papering over it?

Yeah, it takes about 5-10s on my laptop. We could switch it to just
everything within a 4K block, but the main reason for testing with
2*PAGE_SIZE is to make sure that check_nonzero_user() works across page
boundaries. Though we could only do check_nonzero_user() in the region
of the page boundary (maybe i E (PAGE_SIZE-512,PAGE_SIZE+512]?)

Making a single test run for ~40min doesn't seem like that good of an
idea in retrospect. :P

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


Attachments:
(No filename) (7.05 kB)
signature.asc (235.00 B)
Download all attachments

2019-10-10 16:45:03

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper

On Thu, Oct 10, 2019 at 10:40:07PM +1100, Aleksa Sarai wrote:
> Yeah, it takes about 5-10s on my laptop. We could switch it to just
> everything within a 4K block, but the main reason for testing with
> 2*PAGE_SIZE is to make sure that check_nonzero_user() works across page
> boundaries. Though we could only do check_nonzero_user() in the region
> of the page boundary (maybe i E (PAGE_SIZE-512,PAGE_SIZE+512]?)

Yeah, I like this idea: just poke at the specific edge-case.

--
Kees Cook

2019-10-11 02:25:41

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

On a machine with a 64K PAGE_SIZE, the nested for loops in
test_check_nonzero_user() can lead to soft lockups, eg:

watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
...
NIP __might_sleep+0x20/0xc0
LR __might_fault+0x40/0x60
Call Trace:
check_zeroed_user+0x12c/0x200
test_user_copy_init+0x67c/0x1210 [test_user_copy]
do_one_initcall+0x60/0x340
do_init_module+0x7c/0x2f0
load_module+0x2d94/0x30e0
__do_sys_finit_module+0xc8/0x150
system_call+0x5c/0x68

Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
tweak it to only scan a 1024 byte region, but make it cross the
page boundary.

Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
Suggested-by: Aleksa Sarai <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
---
lib/test_user_copy.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

How does this look? It runs in < 1s on my machine here.

cheers

diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
index 950ee88cd6ac..9fb6bc609d4c 100644
--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
{
int ret = 0;
- size_t start, end, i;
- size_t zero_start = size / 4;
- size_t zero_end = size - zero_start;
+ size_t start, end, i, zero_start, zero_end;
+
+ if (test(size < 1024, "buffer too small"))
+ return -EINVAL;
+
+ /*
+ * We want to cross a page boundary to exercise the code more
+ * effectively. We assume the buffer we're passed has a page boundary at
+ * size / 2. We also don't want to make the size we scan too large,
+ * otherwise the test can take a long time and cause soft lockups. So
+ * scan a 1024 byte region across the page boundary.
+ */
+ start = size / 2 - 512;
+ size = 1024;
+
+ kmem += start;
+ umem += start;
+
+ zero_start = size / 4;
+ zero_end = size - zero_start;

/*
* We conduct a series of check_nonzero_user() tests on a block of memory
--
2.21.0

2019-10-11 03:51:24

by Aleksa Sarai

[permalink] [raw]
Subject: Re: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

On 2019-10-11, Michael Ellerman <[email protected]> wrote:
> On a machine with a 64K PAGE_SIZE, the nested for loops in
> test_check_nonzero_user() can lead to soft lockups, eg:
>
> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> ...
> NIP __might_sleep+0x20/0xc0
> LR __might_fault+0x40/0x60
> Call Trace:
> check_zeroed_user+0x12c/0x200
> test_user_copy_init+0x67c/0x1210 [test_user_copy]
> do_one_initcall+0x60/0x340
> do_init_module+0x7c/0x2f0
> load_module+0x2d94/0x30e0
> __do_sys_finit_module+0xc8/0x150
> system_call+0x5c/0x68
>
> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> tweak it to only scan a 1024 byte region, but make it cross the
> page boundary.
>
> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> Suggested-by: Aleksa Sarai <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>
> ---
> lib/test_user_copy.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> How does this look? It runs in < 1s on my machine here.
>
> cheers
>
> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> index 950ee88cd6ac..9fb6bc609d4c 100644
> --- a/lib/test_user_copy.c
> +++ b/lib/test_user_copy.c
> @@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
> static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> {
> int ret = 0;
> - size_t start, end, i;
> - size_t zero_start = size / 4;
> - size_t zero_end = size - zero_start;
> + size_t start, end, i, zero_start, zero_end;
> +
> + if (test(size < 1024, "buffer too small"))
> + return -EINVAL;
> +
> + /*
> + * We want to cross a page boundary to exercise the code more
> + * effectively. We assume the buffer we're passed has a page boundary at
> + * size / 2. We also don't want to make the size we scan too large,
> + * otherwise the test can take a long time and cause soft lockups. So
> + * scan a 1024 byte region across the page boundary.
> + */
> + start = size / 2 - 512;
> + size = 1024;

I don't think it's necessary to do "size / 2" here -- you can just use
PAGE_SIZE directly and check above that "size == 2*PAGE_SIZE" (not that
this check is exceptionally necessary -- since there's only one caller
of this function and it's in the same file).

> +
> + kmem += start;
> + umem += start;
> +
> + zero_start = size / 4;
> + zero_end = size - zero_start;
>
> /*
> * We conduct a series of check_nonzero_user() tests on a block of memory

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


Attachments:
(No filename) (2.88 kB)
signature.asc (235.00 B)
Download all attachments

2019-10-11 09:45:41

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

On Fri, Oct 11, 2019 at 02:48:10PM +1100, Aleksa Sarai wrote:
> On 2019-10-11, Michael Ellerman <[email protected]> wrote:
> > On a machine with a 64K PAGE_SIZE, the nested for loops in
> > test_check_nonzero_user() can lead to soft lockups, eg:
> >
> > watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> > Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> > CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> > ...
> > NIP __might_sleep+0x20/0xc0
> > LR __might_fault+0x40/0x60
> > Call Trace:
> > check_zeroed_user+0x12c/0x200
> > test_user_copy_init+0x67c/0x1210 [test_user_copy]
> > do_one_initcall+0x60/0x340
> > do_init_module+0x7c/0x2f0
> > load_module+0x2d94/0x30e0
> > __do_sys_finit_module+0xc8/0x150
> > system_call+0x5c/0x68
> >
> > Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> > tweak it to only scan a 1024 byte region, but make it cross the
> > page boundary.
> >
> > Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> > Suggested-by: Aleksa Sarai <[email protected]>
> > Signed-off-by: Michael Ellerman <[email protected]>
> > ---
> > lib/test_user_copy.c | 23 ++++++++++++++++++++---
> > 1 file changed, 20 insertions(+), 3 deletions(-)
> >
> > How does this look? It runs in < 1s on my machine here.
> >
> > cheers
> >
> > diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> > index 950ee88cd6ac..9fb6bc609d4c 100644
> > --- a/lib/test_user_copy.c
> > +++ b/lib/test_user_copy.c
> > @@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
> > static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> > {
> > int ret = 0;
> > - size_t start, end, i;
> > - size_t zero_start = size / 4;
> > - size_t zero_end = size - zero_start;
> > + size_t start, end, i, zero_start, zero_end;
> > +
> > + if (test(size < 1024, "buffer too small"))
> > + return -EINVAL;
> > +
> > + /*
> > + * We want to cross a page boundary to exercise the code more
> > + * effectively. We assume the buffer we're passed has a page boundary at
> > + * size / 2. We also don't want to make the size we scan too large,
> > + * otherwise the test can take a long time and cause soft lockups. So
> > + * scan a 1024 byte region across the page boundary.
> > + */
> > + start = size / 2 - 512;
> > + size = 1024;
>
> I don't think it's necessary to do "size / 2" here -- you can just use
> PAGE_SIZE directly and check above that "size == 2*PAGE_SIZE" (not that
> this check is exceptionally necessary -- since there's only one caller
> of this function and it's in the same file).

Michael, in case you resend, can you make my life a little easier and do
it on top of
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user
please. I have a fix from Aleksa sitting in there laready that _might_
cause a conflict otherwise.

Christian

2019-10-12 10:05:49

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

Aleksa Sarai <[email protected]> writes:
> On 2019-10-11, Michael Ellerman <[email protected]> wrote:
>> On a machine with a 64K PAGE_SIZE, the nested for loops in
>> test_check_nonzero_user() can lead to soft lockups, eg:
...
>> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
>> index 950ee88cd6ac..9fb6bc609d4c 100644
>> --- a/lib/test_user_copy.c
>> +++ b/lib/test_user_copy.c
>> @@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
>> static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
>> {
>> int ret = 0;
>> - size_t start, end, i;
>> - size_t zero_start = size / 4;
>> - size_t zero_end = size - zero_start;
>> + size_t start, end, i, zero_start, zero_end;
>> +
>> + if (test(size < 1024, "buffer too small"))
>> + return -EINVAL;
>> +
>> + /*
>> + * We want to cross a page boundary to exercise the code more
>> + * effectively. We assume the buffer we're passed has a page boundary at
>> + * size / 2. We also don't want to make the size we scan too large,
>> + * otherwise the test can take a long time and cause soft lockups. So
>> + * scan a 1024 byte region across the page boundary.
>> + */
>> + start = size / 2 - 512;
>> + size = 1024;
>
> I don't think it's necessary to do "size / 2" here -- you can just use
> PAGE_SIZE directly and check above that "size == 2*PAGE_SIZE" (not that
> this check is exceptionally necessary -- since there's only one caller
> of this function and it's in the same file).

OK, like this?

diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
index 950ee88cd6ac..48bc669b2549 100644
--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -47,9 +47,25 @@ static bool is_zeroed(void *from, size_t size)
static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
{
int ret = 0;
- size_t start, end, i;
- size_t zero_start = size / 4;
- size_t zero_end = size - zero_start;
+ size_t start, end, i, zero_start, zero_end;
+
+ if (test(size < 2 * PAGE_SIZE, "buffer too small"))
+ return -EINVAL;
+
+ /*
+ * We want to cross a page boundary to exercise the code more
+ * effectively. We also don't want to make the size we scan too large,
+ * otherwise the test can take a long time and cause soft lockups. So
+ * scan a 1024 byte region across the page boundary.
+ */
+ size = 1024;
+ start = PAGE_SIZE - (size / 2);
+
+ kmem += start;
+ umem += start;
+
+ zero_start = size / 4;
+ zero_end = size - zero_start;

/*
* We conduct a series of check_nonzero_user() tests on a block of memory

cheers

2019-10-12 10:21:52

by Aleksa Sarai

[permalink] [raw]
Subject: Re: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

On 2019-10-12, Michael Ellerman <[email protected]> wrote:
> Aleksa Sarai <[email protected]> writes:
> > On 2019-10-11, Michael Ellerman <[email protected]> wrote:
> >> On a machine with a 64K PAGE_SIZE, the nested for loops in
> >> test_check_nonzero_user() can lead to soft lockups, eg:
> ...
> >> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> >> index 950ee88cd6ac..9fb6bc609d4c 100644
> >> --- a/lib/test_user_copy.c
> >> +++ b/lib/test_user_copy.c
> >> @@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
> >> static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> >> {
> >> int ret = 0;
> >> - size_t start, end, i;
> >> - size_t zero_start = size / 4;
> >> - size_t zero_end = size - zero_start;
> >> + size_t start, end, i, zero_start, zero_end;
> >> +
> >> + if (test(size < 1024, "buffer too small"))
> >> + return -EINVAL;
> >> +
> >> + /*
> >> + * We want to cross a page boundary to exercise the code more
> >> + * effectively. We assume the buffer we're passed has a page boundary at
> >> + * size / 2. We also don't want to make the size we scan too large,
> >> + * otherwise the test can take a long time and cause soft lockups. So
> >> + * scan a 1024 byte region across the page boundary.
> >> + */
> >> + start = size / 2 - 512;
> >> + size = 1024;
> >
> > I don't think it's necessary to do "size / 2" here -- you can just use
> > PAGE_SIZE directly and check above that "size == 2*PAGE_SIZE" (not that
> > this check is exceptionally necessary -- since there's only one caller
> > of this function and it's in the same file).
>
> OK, like this?

Yup -- that looks good. I'll give it a Reviewed-by once you resend it.

> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> index 950ee88cd6ac..48bc669b2549 100644
> --- a/lib/test_user_copy.c
> +++ b/lib/test_user_copy.c
> @@ -47,9 +47,25 @@ static bool is_zeroed(void *from, size_t size)
> static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> {
> int ret = 0;
> - size_t start, end, i;
> - size_t zero_start = size / 4;
> - size_t zero_end = size - zero_start;
> + size_t start, end, i, zero_start, zero_end;
> +
> + if (test(size < 2 * PAGE_SIZE, "buffer too small"))
> + return -EINVAL;
> +
> + /*
> + * We want to cross a page boundary to exercise the code more
> + * effectively. We also don't want to make the size we scan too large,
> + * otherwise the test can take a long time and cause soft lockups. So
> + * scan a 1024 byte region across the page boundary.
> + */
> + size = 1024;
> + start = PAGE_SIZE - (size / 2);
> +
> + kmem += start;
> + umem += start;
> +
> + zero_start = size / 4;
> + zero_end = size - zero_start;
>
> /*
> * We conduct a series of check_nonzero_user() tests on a block of memory


--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


Attachments:
(No filename) (2.92 kB)
signature.asc (235.00 B)
Download all attachments

2019-10-16 15:31:43

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

On a machine with a 64K PAGE_SIZE, the nested for loops in
test_check_nonzero_user() can lead to soft lockups, eg:

watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
...
NIP __might_sleep+0x20/0xc0
LR __might_fault+0x40/0x60
Call Trace:
check_zeroed_user+0x12c/0x200
test_user_copy_init+0x67c/0x1210 [test_user_copy]
do_one_initcall+0x60/0x340
do_init_module+0x7c/0x2f0
load_module+0x2d94/0x30e0
__do_sys_finit_module+0xc8/0x150
system_call+0x5c/0x68

Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
tweak it to only scan a 1024 byte region, but make it cross the
page boundary.

Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
Suggested-by: Aleksa Sarai <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
---
lib/test_user_copy.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)

v2: Rework calculation to just use PAGE_SIZE directly.
Rebase onto Christian's tree.

diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
index ad2372727b1b..5ff04d8fe971 100644
--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -47,9 +47,25 @@ static bool is_zeroed(void *from, size_t size)
static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
{
int ret = 0;
- size_t start, end, i;
- size_t zero_start = size / 4;
- size_t zero_end = size - zero_start;
+ size_t start, end, i, zero_start, zero_end;
+
+ if (test(size < 2 * PAGE_SIZE, "buffer too small"))
+ return -EINVAL;
+
+ /*
+ * We want to cross a page boundary to exercise the code more
+ * effectively. We also don't want to make the size we scan too large,
+ * otherwise the test can take a long time and cause soft lockups. So
+ * scan a 1024 byte region across the page boundary.
+ */
+ size = 1024;
+ start = PAGE_SIZE - (size / 2);
+
+ kmem += start;
+ umem += start;
+
+ zero_start = size / 4;
+ zero_end = size - zero_start;

/*
* We conduct a series of check_nonzero_user() tests on a block of
--
2.21.0

2019-10-16 15:32:04

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

Christian Brauner <[email protected]> writes:
> On Fri, Oct 11, 2019 at 02:48:10PM +1100, Aleksa Sarai wrote:
>> On 2019-10-11, Michael Ellerman <[email protected]> wrote:
>> > On a machine with a 64K PAGE_SIZE, the nested for loops in
>> > test_check_nonzero_user() can lead to soft lockups, eg:
>> >
>> > watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
>> > Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
>> > CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
>> > ...
>> > NIP __might_sleep+0x20/0xc0
>> > LR __might_fault+0x40/0x60
>> > Call Trace:
>> > check_zeroed_user+0x12c/0x200
>> > test_user_copy_init+0x67c/0x1210 [test_user_copy]
>> > do_one_initcall+0x60/0x340
>> > do_init_module+0x7c/0x2f0
>> > load_module+0x2d94/0x30e0
>> > __do_sys_finit_module+0xc8/0x150
>> > system_call+0x5c/0x68
>> >
>> > Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
>> > tweak it to only scan a 1024 byte region, but make it cross the
>> > page boundary.
>> >
>> > Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
>> > Suggested-by: Aleksa Sarai <[email protected]>
>> > Signed-off-by: Michael Ellerman <[email protected]>
>> > ---
>> > lib/test_user_copy.c | 23 ++++++++++++++++++++---
>> > 1 file changed, 20 insertions(+), 3 deletions(-)
>> >
>> > How does this look? It runs in < 1s on my machine here.
>> >
>> > cheers
>> >
>> > diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
>> > index 950ee88cd6ac..9fb6bc609d4c 100644
>> > --- a/lib/test_user_copy.c
>> > +++ b/lib/test_user_copy.c
>> > @@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
>> > static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
>> > {
>> > int ret = 0;
>> > - size_t start, end, i;
>> > - size_t zero_start = size / 4;
>> > - size_t zero_end = size - zero_start;
>> > + size_t start, end, i, zero_start, zero_end;
>> > +
>> > + if (test(size < 1024, "buffer too small"))
>> > + return -EINVAL;
>> > +
>> > + /*
>> > + * We want to cross a page boundary to exercise the code more
>> > + * effectively. We assume the buffer we're passed has a page boundary at
>> > + * size / 2. We also don't want to make the size we scan too large,
>> > + * otherwise the test can take a long time and cause soft lockups. So
>> > + * scan a 1024 byte region across the page boundary.
>> > + */
>> > + start = size / 2 - 512;
>> > + size = 1024;
>>
>> I don't think it's necessary to do "size / 2" here -- you can just use
>> PAGE_SIZE directly and check above that "size == 2*PAGE_SIZE" (not that
>> this check is exceptionally necessary -- since there's only one caller
>> of this function and it's in the same file).
>
> Michael, in case you resend, can you make my life a little easier and do
> it on top of
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user
> please. I have a fix from Aleksa sitting in there laready that _might_
> cause a conflict otherwise.

No worries, done.

cheers

2019-10-16 15:37:43

by Aleksa Sarai

[permalink] [raw]
Subject: Re: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

On 2019-10-16, Michael Ellerman <[email protected]> wrote:
> On a machine with a 64K PAGE_SIZE, the nested for loops in
> test_check_nonzero_user() can lead to soft lockups, eg:
>
> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> ...
> NIP __might_sleep+0x20/0xc0
> LR __might_fault+0x40/0x60
> Call Trace:
> check_zeroed_user+0x12c/0x200
> test_user_copy_init+0x67c/0x1210 [test_user_copy]
> do_one_initcall+0x60/0x340
> do_init_module+0x7c/0x2f0
> load_module+0x2d94/0x30e0
> __do_sys_finit_module+0xc8/0x150
> system_call+0x5c/0x68
>
> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> tweak it to only scan a 1024 byte region, but make it cross the
> page boundary.
>
> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> Suggested-by: Aleksa Sarai <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>

Thanks Michael.

Reviewed-by: Aleksa Sarai <[email protected]>

> ---
> lib/test_user_copy.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> v2: Rework calculation to just use PAGE_SIZE directly.
> Rebase onto Christian's tree.
>
> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> index ad2372727b1b..5ff04d8fe971 100644
> --- a/lib/test_user_copy.c
> +++ b/lib/test_user_copy.c
> @@ -47,9 +47,25 @@ static bool is_zeroed(void *from, size_t size)
> static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> {
> int ret = 0;
> - size_t start, end, i;
> - size_t zero_start = size / 4;
> - size_t zero_end = size - zero_start;
> + size_t start, end, i, zero_start, zero_end;
> +
> + if (test(size < 2 * PAGE_SIZE, "buffer too small"))
> + return -EINVAL;
> +
> + /*
> + * We want to cross a page boundary to exercise the code more
> + * effectively. We also don't want to make the size we scan too large,
> + * otherwise the test can take a long time and cause soft lockups. So
> + * scan a 1024 byte region across the page boundary.
> + */
> + size = 1024;
> + start = PAGE_SIZE - (size / 2);
> +
> + kmem += start;
> + umem += start;
> +
> + zero_start = size / 4;
> + zero_end = size - zero_start;
>
> /*
> * We conduct a series of check_nonzero_user() tests on a block of
> --
> 2.21.0
>


--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


Attachments:
(No filename) (2.66 kB)
signature.asc (235.00 B)
Download all attachments

2019-10-16 15:39:59

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] usercopy: Avoid soft lockups in test_check_nonzero_user()

On Wed, Oct 16, 2019 at 11:28:20PM +1100, Michael Ellerman wrote:
> Christian Brauner <[email protected]> writes:
> > On Fri, Oct 11, 2019 at 02:48:10PM +1100, Aleksa Sarai wrote:
> >> On 2019-10-11, Michael Ellerman <[email protected]> wrote:
> >> > On a machine with a 64K PAGE_SIZE, the nested for loops in
> >> > test_check_nonzero_user() can lead to soft lockups, eg:
> >> >
> >> > watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> >> > Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> >> > CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> >> > ...
> >> > NIP __might_sleep+0x20/0xc0
> >> > LR __might_fault+0x40/0x60
> >> > Call Trace:
> >> > check_zeroed_user+0x12c/0x200
> >> > test_user_copy_init+0x67c/0x1210 [test_user_copy]
> >> > do_one_initcall+0x60/0x340
> >> > do_init_module+0x7c/0x2f0
> >> > load_module+0x2d94/0x30e0
> >> > __do_sys_finit_module+0xc8/0x150
> >> > system_call+0x5c/0x68
> >> >
> >> > Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> >> > tweak it to only scan a 1024 byte region, but make it cross the
> >> > page boundary.
> >> >
> >> > Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> >> > Suggested-by: Aleksa Sarai <[email protected]>
> >> > Signed-off-by: Michael Ellerman <[email protected]>
> >> > ---
> >> > lib/test_user_copy.c | 23 ++++++++++++++++++++---
> >> > 1 file changed, 20 insertions(+), 3 deletions(-)
> >> >
> >> > How does this look? It runs in < 1s on my machine here.
> >> >
> >> > cheers
> >> >
> >> > diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> >> > index 950ee88cd6ac..9fb6bc609d4c 100644
> >> > --- a/lib/test_user_copy.c
> >> > +++ b/lib/test_user_copy.c
> >> > @@ -47,9 +47,26 @@ static bool is_zeroed(void *from, size_t size)
> >> > static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> >> > {
> >> > int ret = 0;
> >> > - size_t start, end, i;
> >> > - size_t zero_start = size / 4;
> >> > - size_t zero_end = size - zero_start;
> >> > + size_t start, end, i, zero_start, zero_end;
> >> > +
> >> > + if (test(size < 1024, "buffer too small"))
> >> > + return -EINVAL;
> >> > +
> >> > + /*
> >> > + * We want to cross a page boundary to exercise the code more
> >> > + * effectively. We assume the buffer we're passed has a page boundary at
> >> > + * size / 2. We also don't want to make the size we scan too large,
> >> > + * otherwise the test can take a long time and cause soft lockups. So
> >> > + * scan a 1024 byte region across the page boundary.
> >> > + */
> >> > + start = size / 2 - 512;
> >> > + size = 1024;
> >>
> >> I don't think it's necessary to do "size / 2" here -- you can just use
> >> PAGE_SIZE directly and check above that "size == 2*PAGE_SIZE" (not that
> >> this check is exceptionally necessary -- since there's only one caller
> >> of this function and it's in the same file).
> >
> > Michael, in case you resend, can you make my life a little easier and do
> > it on top of
> > https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user
> > please. I have a fix from Aleksa sitting in there laready that _might_
> > cause a conflict otherwise.
>
> No worries, done.

Thank you!
Christian

2019-10-16 15:54:04

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

On Wed, Oct 16, 2019 at 11:27:32PM +1100, Michael Ellerman wrote:
> On a machine with a 64K PAGE_SIZE, the nested for loops in
> test_check_nonzero_user() can lead to soft lockups, eg:
>
> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> ...
> NIP __might_sleep+0x20/0xc0
> LR __might_fault+0x40/0x60
> Call Trace:
> check_zeroed_user+0x12c/0x200
> test_user_copy_init+0x67c/0x1210 [test_user_copy]
> do_one_initcall+0x60/0x340
> do_init_module+0x7c/0x2f0
> load_module+0x2d94/0x30e0
> __do_sys_finit_module+0xc8/0x150
> system_call+0x5c/0x68
>
> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> tweak it to only scan a 1024 byte region, but make it cross the
> page boundary.
>
> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> Suggested-by: Aleksa Sarai <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>

Thanks!
Acked-by: Christian Brauner <[email protected]>

2019-10-16 16:01:44

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

On Wed, Oct 16, 2019 at 11:27:32PM +1100, Michael Ellerman wrote:
> On a machine with a 64K PAGE_SIZE, the nested for loops in
> test_check_nonzero_user() can lead to soft lockups, eg:
>
> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> ...
> NIP __might_sleep+0x20/0xc0
> LR __might_fault+0x40/0x60
> Call Trace:
> check_zeroed_user+0x12c/0x200
> test_user_copy_init+0x67c/0x1210 [test_user_copy]
> do_one_initcall+0x60/0x340
> do_init_module+0x7c/0x2f0
> load_module+0x2d94/0x30e0
> __do_sys_finit_module+0xc8/0x150
> system_call+0x5c/0x68
>
> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> tweak it to only scan a 1024 byte region, but make it cross the
> page boundary.
>
> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> Suggested-by: Aleksa Sarai <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>

With Aleksa's Reviewed-by I've picked this up:
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user

Thanks!
Christian

2019-10-17 18:23:13

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

Christian Brauner <[email protected]> writes:
> On Wed, Oct 16, 2019 at 11:27:32PM +1100, Michael Ellerman wrote:
>> On a machine with a 64K PAGE_SIZE, the nested for loops in
>> test_check_nonzero_user() can lead to soft lockups, eg:
>>
>> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
>> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
>> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
>> ...
>> NIP __might_sleep+0x20/0xc0
>> LR __might_fault+0x40/0x60
>> Call Trace:
>> check_zeroed_user+0x12c/0x200
>> test_user_copy_init+0x67c/0x1210 [test_user_copy]
>> do_one_initcall+0x60/0x340
>> do_init_module+0x7c/0x2f0
>> load_module+0x2d94/0x30e0
>> __do_sys_finit_module+0xc8/0x150
>> system_call+0x5c/0x68
>>
>> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
>> tweak it to only scan a 1024 byte region, but make it cross the
>> page boundary.
>>
>> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
>> Suggested-by: Aleksa Sarai <[email protected]>
>> Signed-off-by: Michael Ellerman <[email protected]>
>
> With Aleksa's Reviewed-by I've picked this up:
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user

Thanks. Are you planning to send that to Linus for v5.4 or v5.5 ?

cheers

2019-10-18 09:42:56

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

On Thu, Oct 17, 2019 at 09:00:48AM +1100, Michael Ellerman wrote:
> Christian Brauner <[email protected]> writes:
> > On Wed, Oct 16, 2019 at 11:27:32PM +1100, Michael Ellerman wrote:
> >> On a machine with a 64K PAGE_SIZE, the nested for loops in
> >> test_check_nonzero_user() can lead to soft lockups, eg:
> >>
> >> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
> >> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
> >> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
> >> ...
> >> NIP __might_sleep+0x20/0xc0
> >> LR __might_fault+0x40/0x60
> >> Call Trace:
> >> check_zeroed_user+0x12c/0x200
> >> test_user_copy_init+0x67c/0x1210 [test_user_copy]
> >> do_one_initcall+0x60/0x340
> >> do_init_module+0x7c/0x2f0
> >> load_module+0x2d94/0x30e0
> >> __do_sys_finit_module+0xc8/0x150
> >> system_call+0x5c/0x68
> >>
> >> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
> >> tweak it to only scan a 1024 byte region, but make it cross the
> >> page boundary.
> >>
> >> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
> >> Suggested-by: Aleksa Sarai <[email protected]>
> >> Signed-off-by: Michael Ellerman <[email protected]>
> >
> > With Aleksa's Reviewed-by I've picked this up:
> > https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user
>
> Thanks. Are you planning to send that to Linus for v5.4 or v5.5 ?

This looks like a pretty straight bugfix to me since it's clearly
causing an issue for you on power so v5.4-rc4 is what I'd aim for. I
just want it to be in linux-next until tomorrow.

Christian

2019-10-23 03:36:34

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v2] usercopy: Avoid soft lockups in test_check_nonzero_user()

Christian Brauner <[email protected]> writes:
> On Thu, Oct 17, 2019 at 09:00:48AM +1100, Michael Ellerman wrote:
>> Christian Brauner <[email protected]> writes:
>> > On Wed, Oct 16, 2019 at 11:27:32PM +1100, Michael Ellerman wrote:
>> >> On a machine with a 64K PAGE_SIZE, the nested for loops in
>> >> test_check_nonzero_user() can lead to soft lockups, eg:
>> >>
>> >> watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
>> >> Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
>> >> CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
>> >> ...
>> >> NIP __might_sleep+0x20/0xc0
>> >> LR __might_fault+0x40/0x60
>> >> Call Trace:
>> >> check_zeroed_user+0x12c/0x200
>> >> test_user_copy_init+0x67c/0x1210 [test_user_copy]
>> >> do_one_initcall+0x60/0x340
>> >> do_init_module+0x7c/0x2f0
>> >> load_module+0x2d94/0x30e0
>> >> __do_sys_finit_module+0xc8/0x150
>> >> system_call+0x5c/0x68
>> >>
>> >> Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
>> >> tweak it to only scan a 1024 byte region, but make it cross the
>> >> page boundary.
>> >>
>> >> Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
>> >> Suggested-by: Aleksa Sarai <[email protected]>
>> >> Signed-off-by: Michael Ellerman <[email protected]>
>> >
>> > With Aleksa's Reviewed-by I've picked this up:
>> > https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=copy_struct_from_user
>>
>> Thanks. Are you planning to send that to Linus for v5.4 or v5.5 ?
>
> This looks like a pretty straight bugfix to me since it's clearly
> causing an issue for you on power so v5.4-rc4 is what I'd aim for. I
> just want it to be in linux-next until tomorrow.

I see it in mainine now, thanks!

cheers