LinuxLists.cc - [RFC V2] fs/kcore: change copy_to_user to copy_in

2015-08-20 10:47:21

Subject: [RFC V2] fs/kcore: change copy_to_user to copy_in_user

The copy_to_user() here expect can fix the fault on both kernel and
user address, this is not true on other platforms except x86,
change to user copy_in_user() so that can detect the source and
destination address page fault, work as expected:

ENTRY(copy_user_generic_string)
ASM_STAC
cmpl $8,%edx
jb 2f /* less than 8 bytes, go to byte copy loop */
ALIGN_DESTINATION
movl %edx,%ecx
shrl $3,%ecx
andl $7,%edx
1: rep
movsq
2: movl %edx,%ecx
3: rep
movsb
xorl %eax,%eax
ASM_CLAC
ret

.section .fixup,"ax"
11: leal (%rdx,%rcx,8),%ecx
12: movl %ecx,%edx /* ecx is zerorest also */
jmp copy_user_handle_tail
.previous

_ASM_EXTABLE(1b,11b)
_ASM_EXTABLE(3b,12b)
ENDPROC(copy_user_generic_string)

This is the x86 copy_to_user implementation, the fault instruction is
label 1b, 3b, because movsb instruction will fault on both source and
destination address, but on other platforms, like arm64 and arm:

ENTRY(__copy_to_user)
ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN)
add x5, x0, x2 // upper user buffer boundary
subs x2, x2, #16
b.mi 1f
0:
ldp x3, x4, [x1], #16
subs x2, x2, #16
USER(9f, stp x3, x4, [x0], #16)
b.pl 0b
1: adds x2, x2, #8
b.mi 2f
ldr x3, [x1], #8
sub x2, x2, #8
USER(9f, str x3, [x0], #8 )
2: adds x2, x2, #4
b.mi 3f
ldr w3, [x1], #4
sub x2, x2, #4
USER(9f, str w3, [x0], #4 )
3: adds x2, x2, #2
b.mi 4f
ldrh w3, [x1], #2
sub x2, x2, #2
USER(9f, strh w3, [x0], #2 )
4: adds x2, x2, #1
b.mi 5f
ldrb w3, [x1]
USER(9f, strb w3, [x0] )
5: mov x0, #0

It only check the str instructions, means only check fault on destination
address, if the source address in kernel is invalid, it will result in
kernel panic(), this may happened in read_kcore() function, the
kern_addr_valid() only make sure the address is valid during call it,
when it returned, the kernel address may be invalid, like kunmap(),
setfixmap(), vfree() function will change kernel pagetables, and there is
not any sync between read_kcore() and kernel pagetable change. I change
to use copy_in_user(), so that can check both source and destination address,
avoid the kernel panic() in some unlucky circumstance.

Signed-off-by: yalin wang <[email protected]>
---
fs/proc/kcore.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 92e6726..4f28deb 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -515,8 +515,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
} else {
if (kern_addr_valid(start)) {
unsigned long n;
-
- n = copy_to_user(buffer, (char *)start, tsz);
+ if ((start + tsz < tsz) ||
+ (start + tsz) > TASK_SIZE)
+ return -EFAULT;
+ set_fs(KERNEL_DS);
+ n = copy_in_user(buffer, (char *)start, tsz);
+ set_fs(USER_DS);
/*
* We cannot distinguish between fault on source
* and fault on destination. When this happens
--
1.9.1

2015-08-20 22:16:44

by Andrew Morton

[permalink] [raw]

Subject: Re: [RFC V2] fs/kcore: change copy_to_user to copy_in_user

On Thu, 20 Aug 2015 18:46:56 +0800 yalin wang <[email protected]> wrote:

> The copy_to_user() here expect can fix the fault on both kernel and
> user address, this is not true on other platforms except x86,
> change to user copy_in_user() so that can detect the source and
> destination address page fault, work as expected:
>
> ...
>
> This is the x86 copy_to_user implementation, the fault instruction is
> label 1b, 3b, because movsb instruction will fault on both source and
> destination address, but on other platforms, like arm64 and arm:
>
> ...
>
> It only check the str instructions, means only check fault on destination
> address, if the source address in kernel is invalid, it will result in
> kernel panic(), this may happened in read_kcore() function, the
> kern_addr_valid() only make sure the address is valid during call it,
> when it returned, the kernel address may be invalid, like kunmap(),
> setfixmap(), vfree() function will change kernel pagetables, and there is
> not any sync between read_kcore() and kernel pagetable change. I change
> to use copy_in_user(), so that can check both source and destination address,
> avoid the kernel panic() in some unlucky circumstance.

This difference between x86 and the other architectures is problematic.
It creates risk that code which was tested on x86 will fail on other
architectures in rare situations.

Now, if I'm understanding things correctly, this situation is rare
because there aren't many places in which the kernel will access kernel
memory which can be concurrently unmapped.

> ...
>
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -515,8 +515,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> } else {
> if (kern_addr_valid(start)) {
> unsigned long n;
> -
> - n = copy_to_user(buffer, (char *)start, tsz);
> + if ((start + tsz < tsz) ||
> + (start + tsz) > TASK_SIZE)
> + return -EFAULT;
> + set_fs(KERNEL_DS);
> + n = copy_in_user(buffer, (char *)start, tsz);
> + set_fs(USER_DS);
> /*
> * We cannot distinguish between fault on source
> * and fault on destination. When this happens

huh. I seem to have forgotten that copy_in_user() exists. It is of
course rigorously undocumented, but iirc it's there to copy from
userspace to userspace (copy_within_user would be a better name?)

And in this patch what you're doing is repurposing copy_in_user to mean
"copy from userspace to kernel memory which might be unmapped".

I'm not at all confident that this trick will work on all
architectures: code which is designed to handle a userspace fault is
now expected to correctly handle a kernel space fault?