2020-04-03 07:21:22

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH v2 1/5] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

Some architectures like powerpc64 have the capability to separate
read access and write access protection.
For get_user() and copy_from_user(), powerpc64 only open read access.
For put_user() and copy_to_user(), powerpc64 only open write access.
But when using unsafe_get_user() or unsafe_put_user(),
user_access_begin open both read and write.

Other architectures like powerpc book3s 32 bits only allow write
access protection. And on this architecture protection is an heavy
operation as it requires locking/unlocking per segment of 256Mbytes.
On those architecture it is therefore desirable to do the unlocking
only for write access. (Note that book3s/32 ranges from very old
powermac from the 90's with powerpc 601 processor, till modern
ADSL boxes with PowerQuicc II processors for instance so it
is still worth considering.)

In order to avoid any risk based of hacking some variable parameters
passed to user_access_begin/end that would allow hacking and
leaving user access open or opening too much, it is preferable to
use dedicated static functions that can't be overridden.

Add a user_read_access_begin and user_read_access_end to only open
read access.

Add a user_write_access_begin and user_write_access_end to only open
write access.

By default, when undefined, those new access helpers default on the
existing user_access_begin and user_access_end.

Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
v2: no change in this patch. See each patch for related changes.

v1 at https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=168174

This series is based on the discussion we had in January, see
https://patchwork.ozlabs.org/patch/1227926/ . I tried to
take into account all remarks, especially @hpa 's remark to use
a fixed API on not base the relocking on a magic id returned at
unlocking.

This series is awaited for implementing selective lkdtm test to
test powerpc64 independant read and write protection, see
https://patchwork.ozlabs.org/patch/1231765/

Signed-off-by: Christophe Leroy <[email protected]>
---
include/linux/uaccess.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 67f016010aad..9861c89f93be 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -378,6 +378,14 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
static inline unsigned long user_access_save(void) { return 0UL; }
static inline void user_access_restore(unsigned long flags) { }
#endif
+#ifndef user_write_access_begin
+#define user_write_access_begin user_access_begin
+#define user_write_access_end user_access_end
+#endif
+#ifndef user_read_access_begin
+#define user_read_access_begin user_access_begin
+#define user_read_access_end user_access_end
+#endif

#ifdef CONFIG_HARDENED_USERCOPY
void usercopy_warn(const char *name, const char *detail, bool to_user,
--
2.25.0


2020-04-03 07:21:42

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH v2 2/5] uaccess: Selectively open read or write user access

When opening user access to only perform reads, only open read access.
When opening user access to only perform writes, only open write
access.

Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
v2: Fixed a mismatched use of _read_ and _write_ in compat_get_bitmap()
and compat_put_bitmap()
---
fs/readdir.c | 12 ++++++------
kernel/compat.c | 12 ++++++------
kernel/exit.c | 12 ++++++------
lib/strncpy_from_user.c | 4 ++--
lib/strnlen_user.c | 4 ++--
lib/usercopy.c | 6 +++---
6 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index de2eceffdee8..ed6aaad451aa 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -242,7 +242,7 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
return -EINTR;
dirent = buf->current_dir;
prev = (void __user *) dirent - prev_reclen;
- if (!user_access_begin(prev, reclen + prev_reclen))
+ if (!user_write_access_begin(prev, reclen + prev_reclen))
goto efault;

/* This might be 'dirent->d_off', but if so it will get overwritten */
@@ -251,14 +251,14 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
- user_access_end();
+ user_write_access_end();

buf->current_dir = (void __user *)dirent + reclen;
buf->prev_reclen = reclen;
buf->count -= reclen;
return 0;
efault_end:
- user_access_end();
+ user_write_access_end();
efault:
buf->error = -EFAULT;
return -EFAULT;
@@ -327,7 +327,7 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
return -EINTR;
dirent = buf->current_dir;
prev = (void __user *)dirent - prev_reclen;
- if (!user_access_begin(prev, reclen + prev_reclen))
+ if (!user_write_access_begin(prev, reclen + prev_reclen))
goto efault;

/* This might be 'dirent->d_off', but if so it will get overwritten */
@@ -336,7 +336,7 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
unsafe_put_user(d_type, &dirent->d_type, efault_end);
unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
- user_access_end();
+ user_write_access_end();

buf->prev_reclen = reclen;
buf->current_dir = (void __user *)dirent + reclen;
@@ -344,7 +344,7 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
return 0;

efault_end:
- user_access_end();
+ user_write_access_end();
efault:
buf->error = -EFAULT;
return -EFAULT;
diff --git a/kernel/compat.c b/kernel/compat.c
index 843dd17e6078..b8d2800bb4b7 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -199,7 +199,7 @@ long compat_get_bitmap(unsigned long *mask, const compat_ulong_t __user *umask,
bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);

- if (!user_access_begin(umask, bitmap_size / 8))
+ if (!user_read_access_begin(umask, bitmap_size / 8))
return -EFAULT;

while (nr_compat_longs > 1) {
@@ -211,11 +211,11 @@ long compat_get_bitmap(unsigned long *mask, const compat_ulong_t __user *umask,
}
if (nr_compat_longs)
unsafe_get_user(*mask, umask++, Efault);
- user_access_end();
+ user_read_access_end();
return 0;

Efault:
- user_access_end();
+ user_read_access_end();
return -EFAULT;
}

@@ -228,7 +228,7 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);

- if (!user_access_begin(umask, bitmap_size / 8))
+ if (!user_write_access_begin(umask, bitmap_size / 8))
return -EFAULT;

while (nr_compat_longs > 1) {
@@ -239,10 +239,10 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
}
if (nr_compat_longs)
unsafe_put_user((compat_ulong_t)*mask, umask++, Efault);
- user_access_end();
+ user_write_access_end();
return 0;
Efault:
- user_access_end();
+ user_write_access_end();
return -EFAULT;
}

diff --git a/kernel/exit.c b/kernel/exit.c
index d70d47159640..61b2f7a85079 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1555,7 +1555,7 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
if (!infop)
return err;

- if (!user_access_begin(infop, sizeof(*infop)))
+ if (!user_write_access_begin(infop, sizeof(*infop)))
return -EFAULT;

unsafe_put_user(signo, &infop->si_signo, Efault);
@@ -1564,10 +1564,10 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
unsafe_put_user(info.pid, &infop->si_pid, Efault);
unsafe_put_user(info.uid, &infop->si_uid, Efault);
unsafe_put_user(info.status, &infop->si_status, Efault);
- user_access_end();
+ user_write_access_end();
return err;
Efault:
- user_access_end();
+ user_write_access_end();
return -EFAULT;
}

@@ -1682,7 +1682,7 @@ COMPAT_SYSCALL_DEFINE5(waitid,
if (!infop)
return err;

- if (!user_access_begin(infop, sizeof(*infop)))
+ if (!user_write_access_begin(infop, sizeof(*infop)))
return -EFAULT;

unsafe_put_user(signo, &infop->si_signo, Efault);
@@ -1691,10 +1691,10 @@ COMPAT_SYSCALL_DEFINE5(waitid,
unsafe_put_user(info.pid, &infop->si_pid, Efault);
unsafe_put_user(info.uid, &infop->si_uid, Efault);
unsafe_put_user(info.status, &infop->si_status, Efault);
- user_access_end();
+ user_write_access_end();
return err;
Efault:
- user_access_end();
+ user_write_access_end();
return -EFAULT;
}
#endif
diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 706020b06617..b90ec550183a 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -116,9 +116,9 @@ long strncpy_from_user(char *dst, const char __user *src, long count)

kasan_check_write(dst, count);
check_object_size(dst, count, false);
- if (user_access_begin(src, max)) {
+ if (user_read_access_begin(src, max)) {
retval = do_strncpy_from_user(dst, src, count, max);
- user_access_end();
+ user_read_access_end();
return retval;
}
}
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 41670d4a5816..1616710b8a82 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -109,9 +109,9 @@ long strnlen_user(const char __user *str, long count)
if (max > count)
max = count;

- if (user_access_begin(str, max)) {
+ if (user_read_access_begin(str, max)) {
retval = do_strnlen_user(str, count, max);
- user_access_end();
+ user_read_access_end();
return retval;
}
}
diff --git a/lib/usercopy.c b/lib/usercopy.c
index cbb4d9ec00f2..ca2a697a2061 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -58,7 +58,7 @@ int check_zeroed_user(const void __user *from, size_t size)
from -= align;
size += align;

- if (!user_access_begin(from, size))
+ if (!user_read_access_begin(from, size))
return -EFAULT;

unsafe_get_user(val, (unsigned long __user *) from, err_fault);
@@ -79,10 +79,10 @@ int check_zeroed_user(const void __user *from, size_t size)
val &= aligned_byte_mask(size);

done:
- user_access_end();
+ user_read_access_end();
return (val == 0);
err_fault:
- user_access_end();
+ user_read_access_end();
return -EFAULT;
}
EXPORT_SYMBOL(check_zeroed_user);
--
2.25.0

2020-04-03 07:22:04

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

Now we have user_read_access_begin() and user_write_access_begin()
in addition to user_access_begin().

Make it explicit that user_access_begin() provides both read and
write by renaming it user_full_access_begin(). And the same for
user_access_end() which becomes user_full_access_end().

Done with following command, then hand splitted two too long lines.

sed -i s/user_access_begin/user_full_access_begin/g `git grep -l user_access_begin`

Signed-off-by: Christophe Leroy <[email protected]>
---
v2: New, based on remark from Al Viro.
---
arch/powerpc/include/asm/uaccess.h | 5 +++--
arch/x86/include/asm/futex.h | 4 ++--
arch/x86/include/asm/uaccess.h | 7 ++++---
include/linux/uaccess.h | 8 ++++----
4 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 4427d419eb1d..7fe799e081f2 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -456,14 +456,15 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
size_t len);

-static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
+static __must_check inline bool
+user_full_access_begin(const void __user *ptr, size_t len)
{
if (unlikely(!access_ok(ptr, len)))
return false;
allow_read_write_user((void __user *)ptr, ptr, len);
return true;
}
-#define user_access_begin user_access_begin
+#define user_full_access_begin user_full_access_begin
#define user_access_end prevent_current_access_user
#define user_access_save prevent_user_access_return
#define user_access_restore restore_user_access
diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index f9c00110a69a..9eefea374bd4 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -56,7 +56,7 @@ do { \
static __always_inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
u32 __user *uaddr)
{
- if (!user_access_begin(uaddr, sizeof(u32)))
+ if (!user_full_access_begin(uaddr, sizeof(u32)))
return -EFAULT;

switch (op) {
@@ -92,7 +92,7 @@ static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
{
int ret = 0;

- if (!user_access_begin(uaddr, sizeof(u32)))
+ if (!user_full_access_begin(uaddr, sizeof(u32)))
return -EFAULT;
asm volatile("\n"
"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index d8f283b9a569..8776e815f215 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -473,16 +473,17 @@ extern struct movsl_mask {
* The "unsafe" user accesses aren't really "unsafe", but the naming
* is a big fat warning: you have to not only do the access_ok()
* checking before using them, but you have to surround them with the
- * user_access_begin/end() pair.
+ * user_full_access_begin/end() pair.
*/
-static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len)
+static __must_check __always_inline bool
+user_full_access_begin(const void __user *ptr, size_t len)
{
if (unlikely(!access_ok(ptr,len)))
return 0;
__uaccess_begin_nospec();
return 1;
}
-#define user_access_begin(a,b) user_access_begin(a,b)
+#define user_full_access_begin(a,b) user_full_access_begin(a,b)
#define user_access_end() __uaccess_end()

#define user_access_save() smap_save()
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 9861c89f93be..5be9bc930342 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -368,8 +368,8 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
#define probe_kernel_address(addr, retval) \
probe_kernel_read(&retval, addr, sizeof(retval))

-#ifndef user_access_begin
-#define user_access_begin(ptr,len) access_ok(ptr, len)
+#ifndef user_full_access_begin
+#define user_full_access_begin(ptr,len) access_ok(ptr, len)
#define user_access_end() do { } while (0)
#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e)
@@ -379,11 +379,11 @@ static inline unsigned long user_access_save(void) { return 0UL; }
static inline void user_access_restore(unsigned long flags) { }
#endif
#ifndef user_write_access_begin
-#define user_write_access_begin user_access_begin
+#define user_write_access_begin user_full_access_begin
#define user_write_access_end user_access_end
#endif
#ifndef user_read_access_begin
-#define user_read_access_begin user_access_begin
+#define user_read_access_begin user_full_access_begin
#define user_read_access_end user_access_end
#endif

--
2.25.0

2020-04-03 07:22:29

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH v2 3/5] drm/i915/gem: Replace user_access_begin by user_write_access_begin

When i915_gem_execbuffer2_ioctl() is using user_access_begin(),
that's only to perform unsafe_put_user() so use
user_write_access_begin() in order to only open write access.

Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
v2: Rebased (one part of the patch flies away)
---
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 36d069504836..b4c903308590 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2794,7 +2794,8 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
* And this range already got effectively checked earlier
* when we did the "copy_from_user()" above.
*/
- if (!user_access_begin(user_exec_list, count * sizeof(*user_exec_list)))
+ if (!user_write_access_begin(user_exec_list,
+ count * sizeof(*user_exec_list)))
goto end;

for (i = 0; i < args->buffer_count; i++) {
@@ -2808,7 +2809,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
end_user);
}
end_user:
- user_access_end();
+ user_write_access_end();
end:;
}

--
2.25.0

2020-04-03 07:22:59

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH v2 4/5] powerpc/uaccess: Implement user_read_access_begin and user_write_access_begin

Add support for selective read or write user access with
user_read_access_begin/end and user_write_access_begin/end.

Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
v2: no change
---
arch/powerpc/include/asm/book3s/32/kup.h | 4 ++--
arch/powerpc/include/asm/kup.h | 14 +++++++++++++-
arch/powerpc/include/asm/uaccess.h | 22 ++++++++++++++++++++++
3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 3c0ba22dc360..1617e73bee30 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -108,7 +108,7 @@ static __always_inline void allow_user_access(void __user *to, const void __user
u32 addr, end;

BUILD_BUG_ON(!__builtin_constant_p(dir));
- BUILD_BUG_ON(dir == KUAP_CURRENT);
+ BUILD_BUG_ON(dir & ~KUAP_READ_WRITE);

if (!(dir & KUAP_WRITE))
return;
@@ -131,7 +131,7 @@ static __always_inline void prevent_user_access(void __user *to, const void __us

BUILD_BUG_ON(!__builtin_constant_p(dir));

- if (dir == KUAP_CURRENT) {
+ if (dir & KUAP_CURRENT_WRITE) {
u32 kuap = current->thread.kuap;

if (unlikely(!kuap))
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index 92bcd1a26d73..c745ee41ad66 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -10,7 +10,9 @@
* Use the current saved situation instead of the to/from/size params.
* Used on book3s/32
*/
-#define KUAP_CURRENT 4
+#define KUAP_CURRENT_READ 4
+#define KUAP_CURRENT_WRITE 8
+#define KUAP_CURRENT (KUAP_CURRENT_READ | KUAP_CURRENT_WRITE)

#ifdef CONFIG_PPC64
#include <asm/book3s/64/kup-radix.h>
@@ -101,6 +103,16 @@ static inline void prevent_current_access_user(void)
prevent_user_access(NULL, NULL, ~0UL, KUAP_CURRENT);
}

+static inline void prevent_current_read_from_user(void)
+{
+ prevent_user_access(NULL, NULL, ~0UL, KUAP_CURRENT_READ);
+}
+
+static inline void prevent_current_write_to_user(void)
+{
+ prevent_user_access(NULL, NULL, ~0UL, KUAP_CURRENT_WRITE);
+}
+
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_POWERPC_KUAP_H_ */
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 2f500debae21..4427d419eb1d 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -468,6 +468,28 @@ static __must_check inline bool user_access_begin(const void __user *ptr, size_t
#define user_access_save prevent_user_access_return
#define user_access_restore restore_user_access

+static __must_check inline bool
+user_read_access_begin(const void __user *ptr, size_t len)
+{
+ if (unlikely(!access_ok(ptr, len)))
+ return false;
+ allow_read_from_user(ptr, len);
+ return true;
+}
+#define user_read_access_begin user_read_access_begin
+#define user_read_access_end prevent_current_read_from_user
+
+static __must_check inline bool
+user_write_access_begin(const void __user *ptr, size_t len)
+{
+ if (unlikely(!access_ok(ptr, len)))
+ return false;
+ allow_write_to_user((void __user *)ptr, len);
+ return true;
+}
+#define user_write_access_begin user_write_access_begin
+#define user_write_access_end prevent_current_write_to_user
+
#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
#define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
#define unsafe_put_user(x, p, e) unsafe_op_wrap(__put_user_allowed(x, p), e)
--
2.25.0

2020-04-03 18:02:32

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

On Fri, Apr 3, 2020 at 12:21 AM Christophe Leroy
<[email protected]> wrote:
>
> Now we have user_read_access_begin() and user_write_access_begin()
> in addition to user_access_begin().

I realize Al asked for this, but I don't think it really adds anything
to the series.

The "full" makes the names longer, but not really any more legible.

So I like 1-4, but am unconvinced about 5 and would prefer that to be
dropped. Sorry for the bikeshedding.

And I like this series much better without the cookie that was
discussed, and just making the hard rule be that they can't nest.

Some architecture may obviously use a cookie internally if they have
some nesting behavior of their own, but it doesn't look like we have
any major reason to expose that as the actual interface.

The only other question is how to synchronize this? I'm ok with it
going through the ppc tree, for example, and just let others build on
that. Maybe using a shared immutable branch with 5.6 as a base?

Linus

2020-04-03 20:53:21

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

On Fri, Apr 03, 2020 at 11:01:10AM -0700, Linus Torvalds wrote:
> On Fri, Apr 3, 2020 at 12:21 AM Christophe Leroy
> <[email protected]> wrote:
> >
> > Now we have user_read_access_begin() and user_write_access_begin()
> > in addition to user_access_begin().
>
> I realize Al asked for this, but I don't think it really adds anything
> to the series.
>
> The "full" makes the names longer, but not really any more legible.
>
> So I like 1-4, but am unconvinced about 5 and would prefer that to be
> dropped. Sorry for the bikeshedding.
>
> And I like this series much better without the cookie that was
> discussed, and just making the hard rule be that they can't nest.
>
> Some architecture may obviously use a cookie internally if they have
> some nesting behavior of their own, but it doesn't look like we have
> any major reason to expose that as the actual interface.
>
> The only other question is how to synchronize this? I'm ok with it
> going through the ppc tree, for example, and just let others build on
> that. Maybe using a shared immutable branch with 5.6 as a base?

I can do a 5.7-rc1-based branch with that; depending upon what we end
up doing for arm and s390 we can always change the calling conventions
come next cycle ;-/

My impressions after digging through arm side of things:

1) the only instance of nesting I'd found there (so far) is a mistake.
The rule should be "no fucking nesting, TYVM".

2) I'm really unhappy about the uaccess_with_memcpy thing. Besides
being fucking ugly, it kills any hope of lifting user_access_begin/end
out of raw_copy_to_user() - the things done in that bastard are
obviously *NOT* fit for uaccess block. Including the wonders like
/* the mmap semaphore is taken only if not in an atomic context */
atomic = faulthandler_disabled();

if (!atomic)
down_read(&current->mm->mmap_sem);
which, IMO, deserves to be taken out and shot. Not that pin_page_for_write()
in the same file (arch/arm/lib/uaccess_with_memcpy.c) did not deserve the
same treatment... As far as I can tell, the whole point of that thing
is that well, memcpy() is optimized better than raw_copy_to_user()...
So what's wrong with taking the damn optimized memcpy and using it for
raw_copy_to_user() instead?

Is that the lack of STRT analogue that would store several registers?
Because AFAICS commit f441882a5229ffaef61a47bccd4518f7e2749cbc
Author: Vincent Whitchurch <[email protected]>
Date: Fri Nov 9 10:09:48 2018 +0100
ARM: 8812/1: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS
makes for much saner solution... I realize that it's v6+ and this
thing is specifically for a v5 variant, but...

3) how much do we need to keep the old DACR value in a register for
uaccess_restore()? AFAICS, if we prohibit nesting it becomes
a function of USER_DS/KERNEL_DS setting (and that - only for
CPU_USE_DOMAINS), doesn't it? And we had to have fetched it
recently, back when access_ok() had been done, so shouldn't
it be in cache?

2020-04-04 06:22:22

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

Hi Christophe,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on next-20200403]
[cannot apply to powerpc/next drm-intel/for-linux-next tip/x86/core v5.6]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/uaccess-Add-user_read_access_begin-end-and-user_write_access_begin-end/20200404-080555
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 5364abc57993b3bf60c41923cb98a8f1a594e749
config: x86_64-randconfig-s1-20200404 (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <[email protected]>

All errors (new ones prefixed by >>):

arch/x86/ia32/ia32_signal.c: In function 'ia32_setup_frame':
>> arch/x86/ia32/ia32_signal.c:265:7: error: implicit declaration of function 'user_access_begin'; did you mean 'user_access_end'? [-Werror=implicit-function-declaration]
if (!user_access_begin(frame, sizeof(*frame)))
^~~~~~~~~~~~~~~~~
user_access_end
cc1: some warnings being treated as errors

vim +265 arch/x86/ia32/ia32_signal.c

^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 233
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 234 int ia32_setup_frame(int sig, struct ksignal *ksig,
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 235 compat_sigset_t *set, struct pt_regs *regs)
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 236 {
3b0d29ee1c73b6 arch/x86/ia32/ia32_signal.c Hiroshi Shimamoto 2008-12-17 237 struct sigframe_ia32 __user *frame;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 238 void __user *restorer;
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 239 void __user *fp = NULL;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 240
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 241 /* copy_to_user optimizes that into a single 8 byte store */
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 242 static const struct {
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 243 u16 poplmovl;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 244 u32 val;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 245 u16 int80;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 246 } __attribute__((packed)) code = {
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 247 0xb858, /* popl %eax ; movl $...,%eax */
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 248 __NR_ia32_sigreturn,
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 249 0x80cd, /* int $0x80 */
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 250 };
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 251
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 252 frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 253
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 254 if (ksig->ka.sa.sa_flags & SA_RESTORER) {
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 255 restorer = ksig->ka.sa.sa_restorer;
af65d64845a90c arch/x86/ia32/ia32_signal.c Roland McGrath 2008-01-30 256 } else {
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 257 /* Return stub is in 32bit vsyscall page */
1a3e4ca41c5a38 arch/x86/ia32/ia32_signal.c Roland McGrath 2008-04-09 258 if (current->mm->context.vdso)
6f121e548f8367 arch/x86/ia32/ia32_signal.c Andy Lutomirski 2014-05-05 259 restorer = current->mm->context.vdso +
0a6d1fa0d2b48f arch/x86/ia32/ia32_signal.c Andy Lutomirski 2015-10-05 260 vdso_image_32.sym___kernel_sigreturn;
9fbbd4dd17d071 arch/x86_64/ia32/ia32_signal.c Andi Kleen 2007-02-13 261 else
ade1af77129dea arch/x86/ia32/ia32_signal.c Jan Engelhardt 2008-01-30 262 restorer = &frame->retcode;
af65d64845a90c arch/x86/ia32/ia32_signal.c Roland McGrath 2008-01-30 263 }
3b4b75700a245d arch/x86/ia32/ia32_signal.c Hiroshi Shimamoto 2009-01-23 264
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 @265 if (!user_access_begin(frame, sizeof(*frame)))
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 266 return -EFAULT;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 267
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 268 unsafe_put_user(sig, &frame->sig, Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 269 unsafe_put_sigcontext32(&frame->sc, fp, regs, set, Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 270 unsafe_put_user(set->sig[1], &frame->extramask[0], Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 271 unsafe_put_user(ptr_to_compat(restorer), &frame->pretcode, Efault);
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 272 /*
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 273 * These are actually not used anymore, but left because some
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 274 * gdb versions depend on them as a marker.
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 275 */
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 276 unsafe_put_user(*((u64 *)&code), (u64 __user *)frame->retcode, Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 277 user_access_end();
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 278
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 279 /* Set up registers for signal handler */
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 280 regs->sp = (unsigned long) frame;
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 281 regs->ip = (unsigned long) ksig->ka.sa.sa_handler;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 282
536e3ee4fed13d arch/x86_64/ia32/ia32_signal.c Andi Kleen 2006-09-26 283 /* Make -mregparm=3 work */
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 284 regs->ax = sig;
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 285 regs->dx = 0;
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 286 regs->cx = 0;
536e3ee4fed13d arch/x86_64/ia32/ia32_signal.c Andi Kleen 2006-09-26 287
b6edbb1e045a71 arch/x86/ia32/ia32_signal.c Jeremy Fitzhardinge 2008-08-19 288 loadsegment(ds, __USER32_DS);
b6edbb1e045a71 arch/x86/ia32/ia32_signal.c Jeremy Fitzhardinge 2008-08-19 289 loadsegment(es, __USER32_DS);
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 290
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 291 regs->cs = __USER32_CS;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 292 regs->ss = __USER32_DS;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 293
1d001df19d5323 arch/x86_64/ia32/ia32_signal.c Andi Kleen 2006-09-26 294 return 0;
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 295 Efault:
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 296 user_access_end();
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 297 return -EFAULT;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 298 }
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 299

:::::: The code at line 265 was first introduced by commit
:::::: e2390741053e4931841650b5eadac32697aa88aa x86: ia32_setup_frame(): consolidate uaccess areas

:::::: TO: Al Viro <[email protected]>
:::::: CC: Al Viro <[email protected]>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (9.03 kB)
.config.gz (32.76 kB)
Download all attachments

2020-04-04 07:21:44

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

Hi Christophe,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on next-20200403]
[cannot apply to powerpc/next drm-intel/for-linux-next tip/x86/core v5.6]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/uaccess-Add-user_read_access_begin-end-and-user_write_access_begin-end/20200404-080555
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 5364abc57993b3bf60c41923cb98a8f1a594e749
config: i386-allyesconfig (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <[email protected]>

All errors (new ones prefixed by >>):

arch/x86/kernel/vm86_32.c: In function 'save_v86_state':
>> arch/x86/kernel/vm86_32.c:116:7: error: implicit declaration of function 'user_access_begin'; did you mean 'user_access_end'? [-Werror=implicit-function-declaration]
if (!user_access_begin(user, vm86->vm86plus.is_vm86pus ?
^~~~~~~~~~~~~~~~~
user_access_end
cc1: some warnings being treated as errors

vim +116 arch/x86/kernel/vm86_32.c

^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 95
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 96 void save_v86_state(struct kernel_vm86_regs *regs, int retval)
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 97 {
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 98 struct task_struct *tsk = current;
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 99 struct vm86plus_struct __user *user;
9fda6a0681e070 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 100 struct vm86 *vm86 = current->thread.vm86;
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 101
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 102 /*
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 103 * This gets called from entry.S with interrupts disabled, but
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 104 * from process context. Enable interrupts here, before trying
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 105 * to access user space.
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 106 */
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 107 local_irq_enable();
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 108
1342635638cba9 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 109 if (!vm86 || !vm86->user_vm86) {
1342635638cba9 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 110 pr_alert("no user_vm86: BAD\n");
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 111 do_exit(SIGSEGV);
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 112 }
decd275e62d5ee arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 113 set_flags(regs->pt.flags, VEFLAGS, X86_EFLAGS_VIF | vm86->veflags_mask);
1342635638cba9 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 114 user = vm86->user_vm86;
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 115
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 @116 if (!user_access_begin(user, vm86->vm86plus.is_vm86pus ?
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 117 sizeof(struct vm86plus_struct) :
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 118 sizeof(struct vm86_struct)))
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 119 goto Efault;
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 120
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 121 unsafe_put_user(regs->pt.bx, &user->regs.ebx, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 122 unsafe_put_user(regs->pt.cx, &user->regs.ecx, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 123 unsafe_put_user(regs->pt.dx, &user->regs.edx, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 124 unsafe_put_user(regs->pt.si, &user->regs.esi, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 125 unsafe_put_user(regs->pt.di, &user->regs.edi, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 126 unsafe_put_user(regs->pt.bp, &user->regs.ebp, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 127 unsafe_put_user(regs->pt.ax, &user->regs.eax, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 128 unsafe_put_user(regs->pt.ip, &user->regs.eip, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 129 unsafe_put_user(regs->pt.cs, &user->regs.cs, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 130 unsafe_put_user(regs->pt.flags, &user->regs.eflags, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 131 unsafe_put_user(regs->pt.sp, &user->regs.esp, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 132 unsafe_put_user(regs->pt.ss, &user->regs.ss, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 133 unsafe_put_user(regs->es, &user->regs.es, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 134 unsafe_put_user(regs->ds, &user->regs.ds, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 135 unsafe_put_user(regs->fs, &user->regs.fs, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 136 unsafe_put_user(regs->gs, &user->regs.gs, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 137 unsafe_put_user(vm86->screen_bitmap, &user->screen_bitmap, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 138
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 139 user_access_end();
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 140
da51da189a24bb arch/x86/kernel/vm86_32.c Andy Lutomirski 2017-11-02 141 preempt_disable();
9fda6a0681e070 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 142 tsk->thread.sp0 = vm86->saved_sp0;
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 143 tsk->thread.sysenter_cs = __KERNEL_CS;
252e1a0526304f arch/x86/kernel/vm86_32.c Joerg Roedel 2018-07-18 144 update_task_stack(tsk);
bd7dc5a6afac71 arch/x86/kernel/vm86_32.c Andy Lutomirski 2017-11-02 145 refresh_sysenter_cs(&tsk->thread);
9fda6a0681e070 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 146 vm86->saved_sp0 = 0;
da51da189a24bb arch/x86/kernel/vm86_32.c Andy Lutomirski 2017-11-02 147 preempt_enable();
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 148
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 149 memcpy(&regs->pt, &vm86->regs32, sizeof(struct pt_regs));
49d26b6eaa8e97 arch/i386/kernel/vm86.c Jeremy Fitzhardinge 2006-12-07 150
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 151 lazy_load_gs(vm86->regs32.gs);
49d26b6eaa8e97 arch/i386/kernel/vm86.c Jeremy Fitzhardinge 2006-12-07 152
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 153 regs->pt.ax = retval;
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 154 return;
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 155
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 156 Efault_end:
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 157 user_access_end();
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 158 Efault:
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 159 pr_alert("could not access userspace vm86 info\n");
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 160 do_exit(SIGSEGV);
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 161 }
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 162

:::::: The code at line 116 was first introduced by commit
:::::: a37d01ead405e3aa14d72d284721fe46422b3b63 x86: switch save_v86_state() to unsafe_put_user()

:::::: TO: Al Viro <[email protected]>
:::::: CC: Al Viro <[email protected]>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (9.25 kB)
.config.gz (70.44 kB)
Download all attachments

2020-04-05 18:50:07

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()



Le 03/04/2020 à 20:01, Linus Torvalds a écrit :
> On Fri, Apr 3, 2020 at 12:21 AM Christophe Leroy
> <[email protected]> wrote:
>>
>> Now we have user_read_access_begin() and user_write_access_begin()
>> in addition to user_access_begin().
>
> I realize Al asked for this, but I don't think it really adds anything
> to the series.
>
> The "full" makes the names longer, but not really any more legible.
>
> So I like 1-4, but am unconvinced about 5 and would prefer that to be
> dropped. Sorry for the bikeshedding.
>

Yes I was not sure about it, that's the reason why I added it as the
last patch of the series.

And in the meantime, we see Robots reporting build failures due to
additional use of user_access_begin() in parallele to this change, so I
guess it would anyway be a challenge to perform such a change without
coordination.

> And I like this series much better without the cookie that was
> discussed, and just making the hard rule be that they can't nest.
>
> Some architecture may obviously use a cookie internally if they have
> some nesting behavior of their own, but it doesn't look like we have
> any major reason to expose that as the actual interface.
>
> The only other question is how to synchronize this? I'm ok with it
> going through the ppc tree, for example, and just let others build on
> that. Maybe using a shared immutable branch with 5.6 as a base?

Michael, can you take patches 1 to 4 ?

Otherwise, can you ack patch 4 to enable merging through another tree ?

Christophe

2020-04-21 02:51:43

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

[rmk Cc'd]
On Fri, Apr 03, 2020 at 09:52:05PM +0100, Al Viro wrote:

> I can do a 5.7-rc1-based branch with that; depending upon what we end
> up doing for arm and s390 we can always change the calling conventions
> come next cycle ;-/
>
> My impressions after digging through arm side of things:
>
> 1) the only instance of nesting I'd found there (so far) is a mistake.
> The rule should be "no fucking nesting, TYVM".

OK, after quite a bit of digging:
1) everything outside of arm is quite happy with not passing
anything to user_access_end(). s390 is a red herring in that respect.
2) on arm we definitely can get rid of nesting. However,
there are some unpleasant sides of the logics in there. What we have
is an MMU register; everything except for two 2bit fields in it is
constant. One of those fields is a function of get_fs(), another might
serve an analogue of x86 EFLAGS.AC. Rules:

DACR.USER is 0 if CONFIG_SW_DOMAIN_PAN is enabled and we are
*not* in uaccess section; otherwise it's 1.
DACR.KERNEL is 3 if CONFIG_USE_DOMAINS is enabled and we are
under KERNEL_DS; otherwise it's 1.

[USE_DOMAINS is forced to "yes" on v5 and earlier, configurable on v6+]
[SW_DOMAIN_PAN is forced to "no" on v7 if we want support of huge physical
space, configurable with default to "yes" otherwise]

On entry into kernel we get into USER_DS state before we get
out of asm glue. Original settings are restored on return. That goes
both for ->addr_limit (get_fs() value) and for DACR.KERNEL contents.
DACR.USER ("uaccess allowed") is switched to "disabled" state before
we reach C code and restored on return from kernel.

The costs are interesting; setting the register is costly, in
the same manner STAC/CLAC is. Reading it... hell knows; I don't see any
explicit information about that. As it is, both set_fs() and starting
uaccess block (uaccess_save_and_enable() - the thing that would've
gone into user_access_begin()) do both read and write to register; with
minimal massage we could get rid of reading the damn thing in set_fs().
user_access_end() candidate does a plain write to register, with value
kept around since the beginning of uaccess block.

*IF* read from that register is cheap, we can trivially get rid
of passing the cookie there - it's a matter of reading the register
and clearing one bit in it before writing it back. If that is costly,
though... We can easily calculate it from ->addr_limit, which we already
have in cache at that point, or will need shortly anyway. In that
case it would probably make sense to do the same to user_access_begin()
and set_fs(). Note that I'm not suggesting to do anything of that sort
in switch_to() - existing mechanism doesn't need any changes, and neither
does the asm glue in entry*.S.

The only source I'd been able to find speeks of >= 60 cycles
(and possibly much more) for non-pipelined coprocessor instructions;
the list of such does contain loads and stores to a bunch of registers.
However, the register in question (p15/c3) has only store mentioned there,
so loads might be cheap; no obvious reasons for those to be slow.
That's a question to arm folks, I'm afraid... rmk?

Note that we can keep the current variant (i.e.
user_access_begin() being just the check for access_ok(), user_access_end()
being empty and uaccess_save_and_enable()/uaccess_restore() done manually
inside the primitives); after all, a lot of architectures don't _have_
anything of that sort. It's just that decisions regarding the calling
conventions for these primitives will be much harder to change later on...

Again, arm (32bit one) is the only architectures that has something
of that sort and needs to pass cookie from beginning to the end of uaccess
blocks. Everything else splits into several classes:

1) has MMU, shared address space for kernel/userland, no stac analogues.
alpha, arc, csky, hexagon, itanic, nds32, nios32, openrisc, sh,
sparc32, unicore32, xtensa/MMU, microblaze/MMU, mips/MMU,
m68k/MMU/COLDFIRE.
No way to do anything other than plain access_ok() for user_access_begin().

2) has MMU, shared address space for kernel/userland, has stac analogue,
possibly with separate "for read" and "for write" variants. Can live
without passing any cookies.
arm64, powerpc, riscv, x86
Current variant with changes in this patchset covers those.

3) non-MMU, uses memcpy() for everything, or at least ought to:
c6x, h8300, m68k/!MMU, xtensa/!MMU(?), microblaze/!MMU(?), mips/!MMU(?),
arm/!MMU
No memory protection of any sort...

4) sparc-like: MMU, separate address spaces for userland and kernel,
has explicit insns for uaccess + some register(s) to choose what those
insns actually hit.
sparc64, parisc, m68k/MMU/!COLDFIRE
No stac/clac analogue would make sense.

5) s390: weird one - there is an stac analogue as far the hardware is
concerned, but it can't be separated from inline asm where
actual uaccess insns are. From the kernel POV it's sparc-like.
Nothing that would reasonably map to user_access_begin/user_access_end

6) um: no uaccess, in a sense of dereferencing non-kernel pointers. What
it does is simulation of page table walk + explicit call of #PF handler
on missing pages + kmap_atomic() to get a kernel alias.

2020-04-21 09:15:47

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

On Tue, Apr 21, 2020 at 03:49:19AM +0100, Al Viro wrote:
> The only source I'd been able to find speeks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid... rmk?

I have no information on that; instruction timings are not defined
at architecture level (architecture reference manual), nor do I find
information in the CPU technical reference manual (which would be
specific to the CPU). Instruction timings tend to be implementation
dependent.

I've always consulted Will Deacon when I've needed to know whether
an instruction is expensive or not.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

2020-04-21 18:37:15

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

On Mon, Apr 20, 2020 at 7:49 PM Al Viro <[email protected]> wrote:
>
> The only source I'd been able to find speaks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid... rmk?

_If_ it turns out to be expensive, is there any reason we couldn't
just cache the value in general?

That's what x86 tends to do with expensive system registers. One
example would be "msr_misc_features_shadow".

But maybe that's something to worry about when/if it turns out to
actually be a problem?

Linus

2020-05-29 04:23:17

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] drm/i915/gem: Replace user_access_begin by user_write_access_begin

On Fri, 2020-04-03 at 07:20:52 UTC, Christophe Leroy wrote:
> When i915_gem_execbuffer2_ioctl() is using user_access_begin(),
> that's only to perform unsafe_put_user() so use
> user_write_access_begin() in order to only open write access.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

Applied to powerpc topic/uaccess, thanks.

https://git.kernel.org/powerpc/c/b44f687386875b714dae2afa768e73401e45c21c

cheers

2020-05-29 04:25:43

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

On Fri, 2020-04-03 at 07:20:50 UTC, Christophe Leroy wrote:
> Some architectures like powerpc64 have the capability to separate
> read access and write access protection.
> For get_user() and copy_from_user(), powerpc64 only open read access.
> For put_user() and copy_to_user(), powerpc64 only open write access.
> But when using unsafe_get_user() or unsafe_put_user(),
> user_access_begin open both read and write.
>
> Other architectures like powerpc book3s 32 bits only allow write
> access protection. And on this architecture protection is an heavy
> operation as it requires locking/unlocking per segment of 256Mbytes.
> On those architecture it is therefore desirable to do the unlocking
> only for write access. (Note that book3s/32 ranges from very old
> powermac from the 90's with powerpc 601 processor, till modern
> ADSL boxes with PowerQuicc II processors for instance so it
> is still worth considering.)
>
> In order to avoid any risk based of hacking some variable parameters
> passed to user_access_begin/end that would allow hacking and
> leaving user access open or opening too much, it is preferable to
> use dedicated static functions that can't be overridden.
>
> Add a user_read_access_begin and user_read_access_end to only open
> read access.
>
> Add a user_write_access_begin and user_write_access_end to only open
> write access.
>
> By default, when undefined, those new access helpers default on the
> existing user_access_begin and user_access_end.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

Applied to powerpc topic/uaccess, thanks.

https://git.kernel.org/powerpc/c/999a22890cb183b918e4372395d24426a755cef2

cheers

2020-05-29 04:25:43

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] uaccess: Selectively open read or write user access

On Fri, 2020-04-03 at 07:20:51 UTC, Christophe Leroy wrote:
> When opening user access to only perform reads, only open read access.
> When opening user access to only perform writes, only open write
> access.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

Applied to powerpc topic/uaccess, thanks.

https://git.kernel.org/powerpc/c/41cd780524674082b037e7c8461f90c5e42103f0

cheers

2020-05-29 04:27:01

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v2 4/5] powerpc/uaccess: Implement user_read_access_begin and user_write_access_begin

On Fri, 2020-04-03 at 07:20:53 UTC, Christophe Leroy wrote:
> Add support for selective read or write user access with
> user_read_access_begin/end and user_write_access_begin/end.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

Applied to powerpc topic/uaccess-ppc, thanks.

https://git.kernel.org/powerpc/c/4fe5cda9f89d0aea8e915b7c96ae34bda4e12e51

cheers