2004-04-18 00:28:28

by Chris Evans

[permalink] [raw]
Subject: Nasty 2.6 sendfile() bug / regression; affects vsftpd

Hi Andrew, linux-kernel,

Large-file download support with vsftpd and the 2.6 kernel is a broken
combination. After careful consideration, I'm pretty sure the bug is with
the 2.6 kernel. At the very least, behaviour has changed from 2.4. An
initial fix is included below.

vsftpd makes extensive use of sendfile to achieve high performance. When
vsftpd was first written, sendfile64 did not exist. So, in order to
transfer >2Gb files, multiple sendfile() calls are used [1]. The main
issue with sendfile() and >2Gb files is of course that the "offset" in/out
parameter cannot be used, because it is not 64-bit. The solution is
simple; use NULL for the offset parameter and keep track of the file
position in a purely user-space variable.
The above scheme works well with 2.4, but unfortunately fails with
EOVERFLOW in 2.6 at the 2Gb mark.
This failure seems to be a bug. I could be wrong here, but it seems the
intent of EOVERFLOW is to indicate to the user that a large 64-bit kernel
value cannot be stuffed into a 32-bit userspace return value. In the case
of using NULL as the offset paramter, userspace does not care about the
64-bit file offset and no truncation will ever occur.
The below patch fixes this up to restore 2.4 behaviour in the event of a
NULL offset parameter. It's currently under testing and looking good.

Any chance you can review this and sneak into 2.6.soon?

--- read_write.c.old 2004-04-17 18:39:11.000000000 +0100
+++ read_write.c 2004-04-17 23:38:04.000000000 +0100
@@ -545,6 +545,7 @@
loff_t pos;
ssize_t retval;
int fput_needed_in, fput_needed_out;
+ loff_t *orig_ppos = ppos;

/*
* Get input file, and verify that it is ok..
@@ -599,7 +600,7 @@
retval = -EINVAL;
if (unlikely(pos < 0))
goto fput_out;
- if (unlikely(pos + count > max)) {
+ if (unlikely(pos + count > max && orig_ppos != NULL)) {
retval = -EOVERFLOW;
if (pos >= max)
goto fput_out;
@@ -608,7 +609,7 @@

retval = in_file->f_op->sendfile(in_file, ppos, count, file_send_actor, out_file);

- if (*ppos > max)
+ if (*ppos > max && orig_ppos != NULL)
retval = -EOVERFLOW;

fput_out:

Cheers
Chris

[1] Note that I'm not inspired to use sendfile64, as it will STILL need
multiple sendfile64 calls - the "count" variabe is still 32-bit. The only
change from sendfile is that the offset parameter becomes 64-bit.


2004-04-18 03:12:30

by Linus Torvalds

[permalink] [raw]
Subject: Re: Nasty 2.6 sendfile() bug / regression; affects vsftpd



On Sun, 18 Apr 2004 [email protected] wrote:
>
> Any chance you can review this and sneak into 2.6.soon?

Your patch is horribly ugly. How about this (much simpler) patch instead?

It just sets the "max" to zero if pos in NULL in the caller. That just
seems a much better/saner approach.

Can you test that this one-liner fixes the issue for you?

Linus

----
--- 1.37/fs/read_write.c Mon Apr 12 10:54:20 2004
+++ edited/fs/read_write.c Sat Apr 17 20:09:41 2004
@@ -635,7 +635,7 @@
return ret;
}

- return do_sendfile(out_fd, in_fd, NULL, count, MAX_NON_LFS);
+ return do_sendfile(out_fd, in_fd, NULL, count, 0);
}

asmlinkage ssize_t sys_sendfile64(int out_fd, int in_fd, loff_t __user *offset, size_t count)

2004-04-18 13:50:05

by Chris Evans

[permalink] [raw]
Subject: Re: Nasty 2.6 sendfile() bug / regression; affects vsftpd


On Sat, 17 Apr 2004, Linus Torvalds wrote:

> Your patch is horribly ugly. How about this (much simpler) patch instead?

Hey, it's a guaranteed way to get your attention ;-)

> Can you test that this one-liner fixes the issue for you?

It certainly does.

Cheers
Chris

> ----
> --- 1.37/fs/read_write.c Mon Apr 12 10:54:20 2004
> +++ edited/fs/read_write.c Sat Apr 17 20:09:41 2004
> @@ -635,7 +635,7 @@
> return ret;
> }
>
> - return do_sendfile(out_fd, in_fd, NULL, count, MAX_NON_LFS);
> + return do_sendfile(out_fd, in_fd, NULL, count, 0);
> }
>
> asmlinkage ssize_t sys_sendfile64(int out_fd, int in_fd, loff_t __user *offset, size_t count)
>

2004-04-19 00:47:04

by Jamie Lokier

[permalink] [raw]
Subject: Re: Nasty 2.6 sendfile() bug / regression; affects vsftpd

Looking at related code, sys_sendfile64 a few lines down.

if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
return -EFAULT;

if (unlikely(put_user(pos, offset)))
return -EFAULT;

It seems odd that put_user() is used to write an 8-byte value, but
get_user() cannot be used read one. I looked in <asm-i386/uaccess.h>
and indeed the asymmetry is there.

Is there a reason why put_user() supports 1/2/4/8 bytes and get_user()
supports only 1/2/4 bytes?

-- Jamie

2004-04-19 05:28:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: Nasty 2.6 sendfile() bug / regression; affects vsftpd



On Mon, 19 Apr 2004, Jamie Lokier wrote:
>
> Is there a reason why put_user() supports 1/2/4/8 bytes and get_user()
> supports only 1/2/4 bytes?

It's a bit more complicated to do get_user, mainly because we use a 64-bit
value to pass the data around already on x86 - the "real data" in %eax,
and the error code in %edx. So you'd need to have a slightly different
calling convention for the 8-byte case, so it was more than just
"duplicate the other cases".

I agree that it's an ugly special case. get/put_user should really accept
all the normal cases, and that includes 'u64'.

Not a lot of code cares, though, so for now we've just had the special
case. You are the first one to notice, I think.

Linus

2004-04-19 09:44:19

by Jamie Lokier

[permalink] [raw]
Subject: [PATCH] Add 64-bit get_user and __get_user for i386

Linus Torvalds wrote:
> I agree that it's an ugly special case. get/put_user should really accept
> all the normal cases, and that includes 'u64'.

Enjoy,
-- Jamie

Subject: [PATCH] Add 64-bit get_user and __get_user for i386
Patch: uaccess64-2.6.5

Add 64-bit get_user and __get_user for i386.
Don't ask me how, but this shrinks the kernel too.

--- orig-2.6.5/include/asm-i386/uaccess.h 2003-12-18 02:58:16.000000000 +0000
+++ uaccess64-2.6.5/include/asm-i386/uaccess.h 2004-04-19 10:33:20.000000000 +0100
@@ -169,16 +169,19 @@
* On error, the variable @x is set to zero.
*/
#define get_user(x,ptr) \
-({ int __ret_gu,__val_gu; \
+(sizeof (*(ptr)) == 8 \
+ ? (__copy_from_user_ll((void *)&(x), (ptr), 8) ? -EFAULT : 0) : \
+({ \
+ int __ret_gu,__val_gu; \
switch(sizeof (*(ptr))) { \
case 1: __get_user_x(1,__ret_gu,__val_gu,ptr); break; \
case 2: __get_user_x(2,__ret_gu,__val_gu,ptr); break; \
case 4: __get_user_x(4,__ret_gu,__val_gu,ptr); break; \
- default: __get_user_x(X,__ret_gu,__val_gu,ptr); break; \
+ default: __get_user_bad(); break; \
} \
(x) = (__typeof__(*(ptr)))__val_gu; \
__ret_gu; \
-})
+}))

extern void __put_user_bad(void);

@@ -333,13 +336,14 @@
: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))


-#define __get_user_nocheck(x,ptr,size) \
-({ \
- long __gu_err, __gu_val; \
- __get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT);\
- (x) = (__typeof__(*(ptr)))__gu_val; \
- __gu_err; \
-})
+#define __get_user_nocheck(x,ptr,size) \
+((size) == 8 ? (__copy_from_user_ll((void *)&(x), (ptr), 8) ? -EFAULT : 0) : \
+({ \
+ long __gu_err, __gu_val; \
+ __get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT); \
+ (x) = (__typeof__(*(ptr)))__gu_val; \
+ __gu_err; \
+}))

extern long __get_user_bad(void);

2004-04-19 10:15:54

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] Add 64-bit get_user and __get_user for i386

Ignore the previous patch. It didn't include the access_ok() check in
64-bit get_user(). Use this instead.

Enjoy,
-- Jamie

Subject: [PATCH] Add 64-bit get_user and __get_user for i386
Patch: uaccess64-2.6.5-jl2

Add 64-bit get_user and __get_user for i386.
This time, include the access_ok() check.
Don't ask me how, but this shrinks the kernel too.

--- orig-2.6.5/include/asm-i386/uaccess.h 2003-12-18 02:58:16.000000000 +0000
+++ uaccess64-2.6.5/include/asm-i386/uaccess.h 2004-04-19 11:06:52.000000000 +0100
@@ -168,17 +168,19 @@
* Returns zero on success, or -EFAULT on error.
* On error, the variable @x is set to zero.
*/
-#define get_user(x,ptr) \
-({ int __ret_gu,__val_gu; \
- switch(sizeof (*(ptr))) { \
- case 1: __get_user_x(1,__ret_gu,__val_gu,ptr); break; \
- case 2: __get_user_x(2,__ret_gu,__val_gu,ptr); break; \
- case 4: __get_user_x(4,__ret_gu,__val_gu,ptr); break; \
- default: __get_user_x(X,__ret_gu,__val_gu,ptr); break; \
- } \
- (x) = (__typeof__(*(ptr)))__val_gu; \
- __ret_gu; \
-})
+#define get_user(x,ptr) \
+(sizeof (*(ptr)) == 8 ? (copy_from_user((void *)&(x),(ptr),8) ? -EFAULT : 0): \
+({ \
+ int __ret_gu,__val_gu; \
+ switch(sizeof (*(ptr))) { \
+ case 1: __get_user_x(1,__ret_gu,__val_gu,ptr); break; \
+ case 2: __get_user_x(2,__ret_gu,__val_gu,ptr); break; \
+ case 4: __get_user_x(4,__ret_gu,__val_gu,ptr); break; \
+ default: __get_user_bad(); break; \
+ } \
+ (x) = (__typeof__(*(ptr)))__val_gu; \
+ __ret_gu; \
+}))

extern void __put_user_bad(void);

@@ -333,13 +335,14 @@
: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))


-#define __get_user_nocheck(x,ptr,size) \
-({ \
- long __gu_err, __gu_val; \
- __get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT);\
- (x) = (__typeof__(*(ptr)))__gu_val; \
- __gu_err; \
-})
+#define __get_user_nocheck(x,ptr,size) \
+((size) == 8 ? (__copy_from_user_ll((void *)&(x), (ptr), 8) ? -EFAULT : 0) : \
+({ \
+ long __gu_err, __gu_val; \
+ __get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT); \
+ (x) = (__typeof__(*(ptr)))__gu_val; \
+ __gu_err; \
+}))

extern long __get_user_bad(void);



>
> Enjoy,
> -- Jamie
>
> Subject: [PATCH] Add 64-bit get_user and __get_user for i386
> Patch: uaccess64-2.6.5
>
> Add 64-bit get_user and __get_user for i386.
> Don't ask me how, but this shrinks the kernel too.
>
> --- orig-2.6.5/include/asm-i386/uaccess.h 2003-12-18 02:58:16.000000000 +0000
> +++ uaccess64-2.6.5/include/asm-i386/uaccess.h 2004-04-19 10:33:20.000000000 +0100
> @@ -169,16 +169,19 @@
> * On error, the variable @x is set to zero.
> */
> #define get_user(x,ptr) \
> -({ int __ret_gu,__val_gu; \
> +(sizeof (*(ptr)) == 8 \
> + ? (__copy_from_user_ll((void *)&(x), (ptr), 8) ? -EFAULT : 0) : \
> +({ \
> + int __ret_gu,__val_gu; \
> switch(sizeof (*(ptr))) { \
> case 1: __get_user_x(1,__ret_gu,__val_gu,ptr); break; \
> case 2: __get_user_x(2,__ret_gu,__val_gu,ptr); break; \
> case 4: __get_user_x(4,__ret_gu,__val_gu,ptr); break; \
> - default: __get_user_x(X,__ret_gu,__val_gu,ptr); break; \
> + default: __get_user_bad(); break; \
> } \
> (x) = (__typeof__(*(ptr)))__val_gu; \
> __ret_gu; \
> -})
> +}))
>
> extern void __put_user_bad(void);
>
> @@ -333,13 +336,14 @@
> : ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))
>
>
> -#define __get_user_nocheck(x,ptr,size) \
> -({ \
> - long __gu_err, __gu_val; \
> - __get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT);\
> - (x) = (__typeof__(*(ptr)))__gu_val; \
> - __gu_err; \
> -})
> +#define __get_user_nocheck(x,ptr,size) \
> +((size) == 8 ? (__copy_from_user_ll((void *)&(x), (ptr), 8) ? -EFAULT : 0) : \
> +({ \
> + long __gu_err, __gu_val; \
> + __get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT); \
> + (x) = (__typeof__(*(ptr)))__gu_val; \
> + __gu_err; \
> +}))
>
> extern long __get_user_bad(void);
>

2004-04-19 10:18:11

by Jamie Lokier

[permalink] [raw]
Subject: [PATCH] Use get_user for 64-bit read in sendfile64

Patch: sendfile64_get_user-2.6.5
Requires: uaccess64-2.6.5-jl2

Use get_user instead of copy_from_user to read a loff_t.
Compiled output is identical on i386.

--- orig-2.6.5/fs/read_write.c 2004-04-14 08:29:43.000000000 +0100
+++ uaccess64-2.6.5/fs/read_write.c 2004-04-19 10:35:35.000000000 +0100
@@ -644,7 +644,7 @@
ssize_t ret;

if (offset) {
- if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
+ if (unlikely(get_user(pos, offset)))
return -EFAULT;
ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
if (unlikely(put_user(pos, offset)))

2004-04-19 10:36:53

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] Use get_user for 64-bit read in sendfile64

Jamie Lokier wrote:
> Use get_user instead of copy_from_user to read a loff_t.
> Compiled output is identical on i386.

Fwiw, all architectures have 64-bit get_user except i386 (prior to
earlier patch), SH and PPC32.

PPC is weird: it has a get_user64 macro which allows 64 bits.
ARM is also weird: it has 64-bit get_user, but not __get_user.

-- Jamie

2004-04-19 14:56:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] Add 64-bit get_user and __get_user for i386



On Mon, 19 Apr 2004, Jamie Lokier wrote:
>
> Subject: [PATCH] Add 64-bit get_user and __get_user for i386
> Patch: uaccess64-2.6.5
>
> Add 64-bit get_user and __get_user for i386.
> Don't ask me how, but this shrinks the kernel too.

There must be some bug somewhere if the kernel shrinks from this. Although
possibly the bug is in gcc ;)

Anyway, please don't do it like this (ie making one case be just a
memcpy). If we do this, let's do it right - ie 'd much rather see
something like

#define get_user(x, ptr) \
({ __typeof__(*(ptr)) __val_gu; \
int __ret_gu; \
switch (sizeof(__val_gu)) { \
case 1: __get_user_x(1,__ret_gu,__val_gu,ptr); break; \
case 2: __get_user_x(2,__ret_gu,__val_gu,ptr); break; \
case 4: __get_user_x(4,__ret_gu,__val_gu,ptr); break; \
case 8: __get_user_8(__ret_gu,__val_gu,ptr); break; \
default: __get_user_bad(); break; \
} \
(x) = __val_gu; \
__ret_gu; \
})

and then you just make "__get_user_8()" look something like

#define __get_user_8(ret,x,ptr) \
__asm__ __volatile__("call __get_user_8" \
:"=A" (ret), "=c" (x) \
:"1" (ptr))

Which is _different_ from the other "get_user" cases: it passes the
address in %ecx, and it returns the error in %ecx too - the return value
comes in %edx:%eax. Make the __get_user_8 in getuser.S match those
different rules.

What do you think?

Linus

2004-04-20 17:42:28

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] Add 64-bit get_user and __get_user for i386

Linus Torvalds wrote:
> On Tue, 20 Apr 2004, Jamie Lokier wrote:
> > Changed the registers for all __get_user_X so they could all be
> > consistent: %ecx now contains the address, and %eax or %edx:%eax
> > contain the result.
>
> This clobbers unnecessary registers. You now use three registers instead
> of just two.

Wrong. It uses two registers for the 1/2/4-byte cases (%ecx and %eax).
It uses three for 8-byte, which is necessary: 8-byte result + error code.

That's achieved by using different asm output constraints: "=a" for
1/2/4 bytes and "=A" for 8 bytes.

> > There was a boundary bug in getuser.S: get_user(x,(char *)0xbfffffff)
> > would fail. Fixed.
>
> Hmm. I'll believe you.

Just look, it's obvious. This is the old code

__get_user_4:
addl $3,%eax
jc bad_get_user
GET_THREAD_INFO(%edx)
cmpl TI_ADDR_LIMIT(%edx),%eax
jae bad_get_user

Change "jae" to "ja" and it's fine. (x86_64 has copied this bug).

> > +/*
> > + * Careful: we have to cast the result to the type of the pointer for
> > + * sign reasons.
> > + */
> ...
> > + __typeof__(*(ptr)) __gu_val; \
> ...
> > + (x) = (__typeof__(*(ptr)))__gu_val; \
>
> And exactly why do you want to cast the value to a type that it already
> has?

Well spotted. It's an artifact of experiments with __gu_val being
long, long long etc. for various sizes of get_user() and seeing
unusual warnings when *ptr and/or x are pointers.

Patch re-rolled. The only change is the removal of that cast,
and a clarifying comment in getuser.S.

I've checked it compiles to identical code.

Enjoy,
-- Jamie


Subject: [PATCH] Add 64-bit get_user and __get_user for i386
Patch: uaccess64-2.6.5-jl4

Added 64-bit get_user and __get_user for i386.

Changed the registers for all __get_user_X so they could all be
consistent: %ecx now contains the address, and %eax or %edx:%eax
contain the result.

There was a boundary bug in getuser.S: get_user(x,(char *)0xbfffffff)
would fail. Fixed.

Added might_sleep() to get_user().

Added the 8-byte constant-length cases for __copy_from_user() and
__copy_to_user().

Changed constant-length short copy_from_user() to be out of line (it was
inline), as this is just another way to write get_user(). It calls the
same routines as get_user() and one test showed equivalent uses now
generate identical code (sendfile64). Constant-length
__copy_from_user() remains inline, just like __get_user().

Very slight cosmetic moving of a few definitions into a more
consistent arrangement.

This patch shaved about 1.2k off my kernel, due to the un-inlining of
copy_from_user() calls which are equivalent to get_user().



diff -urN --exclude-from=dontdiff dual-2.6.5/arch/i386/kernel/i386_ksyms.c uaccess64-2.6.5/arch/i386/kernel/i386_ksyms.c
--- dual-2.6.5/arch/i386/kernel/i386_ksyms.c 2004-04-14 08:28:07.000000000 +0100
+++ uaccess64-2.6.5/arch/i386/kernel/i386_ksyms.c 2004-04-20 01:55:10.000000000 +0100
@@ -103,6 +103,7 @@
EXPORT_SYMBOL_NOVERS(__get_user_1);
EXPORT_SYMBOL_NOVERS(__get_user_2);
EXPORT_SYMBOL_NOVERS(__get_user_4);
+EXPORT_SYMBOL_NOVERS(__get_user_8);

EXPORT_SYMBOL(strpbrk);
EXPORT_SYMBOL(strstr);
diff -urN --exclude-from=dontdiff dual-2.6.5/arch/i386/lib/getuser.S uaccess64-2.6.5/arch/i386/lib/getuser.S
--- dual-2.6.5/arch/i386/lib/getuser.S 2003-12-18 02:57:59.000000000 +0000
+++ uaccess64-2.6.5/arch/i386/lib/getuser.S 2004-04-20 18:15:24.000000000 +0100
@@ -14,10 +14,11 @@
/*
* __get_user_X
*
- * Inputs: %eax contains the address
+ * Inputs: %ecx contains the address
*
- * Outputs: %eax is error code (0 or -EFAULT)
- * %edx contains zero-extended value
+ * Outputs: %ecx is error code (0 or -EFAULT)
+ * %eax contains zero-extended value
+ * %edx is high part of 64 bit result, or preserved if <= 32 bits
*
* These functions should not modify any other registers,
* as they get called from within inline assembly.
@@ -27,44 +28,61 @@
.align 4
.globl __get_user_1
__get_user_1:
- GET_THREAD_INFO(%edx)
- cmpl TI_ADDR_LIMIT(%edx),%eax
- jae bad_get_user
-1: movzbl (%eax),%edx
- xorl %eax,%eax
+ GET_THREAD_INFO(%eax)
+ cmpl TI_ADDR_LIMIT(%eax),%ecx
+ ja bad_get_user
+1: movzbl (%ecx),%eax
+ xorl %ecx,%ecx
ret

.align 4
.globl __get_user_2
__get_user_2:
- addl $1,%eax
+ addl $1,%ecx
jc bad_get_user
- GET_THREAD_INFO(%edx)
- cmpl TI_ADDR_LIMIT(%edx),%eax
- jae bad_get_user
-2: movzwl -1(%eax),%edx
- xorl %eax,%eax
+ GET_THREAD_INFO(%eax)
+ cmpl TI_ADDR_LIMIT(%eax),%ecx
+ ja bad_get_user
+2: movzwl -1(%ecx),%eax
+ xorl %ecx,%ecx
ret

.align 4
.globl __get_user_4
__get_user_4:
- addl $3,%eax
+ addl $3,%ecx
jc bad_get_user
- GET_THREAD_INFO(%edx)
- cmpl TI_ADDR_LIMIT(%edx),%eax
- jae bad_get_user
-3: movl -3(%eax),%edx
- xorl %eax,%eax
+ GET_THREAD_INFO(%eax)
+ cmpl TI_ADDR_LIMIT(%eax),%ecx
+ ja bad_get_user
+3: movl -3(%ecx),%eax
+ xorl %ecx,%ecx
ret

-bad_get_user:
+.align 4
+.globl __get_user_8
+__get_user_8:
+ addl $7,%ecx
+ jc bad_get_user_8
+ GET_THREAD_INFO(%eax)
+ cmpl TI_ADDR_LIMIT(%eax),%ecx
+ ja bad_get_user_8
+4: movl -7(%ecx),%eax
+5: movl -3(%ecx),%edx
+ xorl %ecx,%ecx
+ ret
+
+bad_get_user_8:
xorl %edx,%edx
- movl $-14,%eax
+bad_get_user:
+ xorl %eax,%eax
+ movl $-14,%ecx
ret

.section __ex_table,"a"
.long 1b,bad_get_user
.long 2b,bad_get_user
.long 3b,bad_get_user
+ .long 4b,bad_get_user_8
+ .long 5b,bad_get_user_8
.previous
diff -urN --exclude-from=dontdiff dual-2.6.5/include/asm-i386/uaccess.h uaccess64-2.6.5/include/asm-i386/uaccess.h
--- dual-2.6.5/include/asm-i386/uaccess.h 2003-12-18 02:58:16.000000000 +0000
+++ uaccess64-2.6.5/include/asm-i386/uaccess.h 2004-04-20 17:49:53.000000000 +0100
@@ -140,17 +140,7 @@
* accesses to the same area of user memory).
*/

-extern void __get_user_1(void);
-extern void __get_user_2(void);
-extern void __get_user_4(void);
-
-#define __get_user_x(size,ret,x,ptr) \
- __asm__ __volatile__("call __get_user_" #size \
- :"=a" (ret),"=d" (x) \
- :"0" (ptr))
-

-/* Careful: we have to cast the result to the type of the pointer for sign reasons */
/**
* get_user: - Get a simple variable from user space.
* @x: Variable to store result.
@@ -168,19 +158,36 @@
* Returns zero on success, or -EFAULT on error.
* On error, the variable @x is set to zero.
*/
+/*
+ * Careful: we have to cast the result to the type of the pointer for
+ * sign reasons.
+ */
#define get_user(x,ptr) \
-({ int __ret_gu,__val_gu; \
+({ \
+ __typeof__(*(ptr)) __gu_val; \
+ int __gu_err; \
+ might_sleep(); \
switch(sizeof (*(ptr))) { \
- case 1: __get_user_x(1,__ret_gu,__val_gu,ptr); break; \
- case 2: __get_user_x(2,__ret_gu,__val_gu,ptr); break; \
- case 4: __get_user_x(4,__ret_gu,__val_gu,ptr); break; \
- default: __get_user_x(X,__ret_gu,__val_gu,ptr); break; \
+ case 1: __get_user_x(1,__gu_val,ptr,__gu_err,"=a"); break; \
+ case 2: __get_user_x(2,__gu_val,ptr,__gu_err,"=a"); break; \
+ case 4: __get_user_x(4,__gu_val,ptr,__gu_err,"=a"); break; \
+ case 8: __get_user_x(8,__gu_val,ptr,__gu_err,"=A"); break; \
+ default: __get_user_bad(); __gu_val = 0; __gu_err = 0; \
} \
- (x) = (__typeof__(*(ptr)))__val_gu; \
- __ret_gu; \
+ (x) = __gu_val; \
+ __gu_err; \
})

-extern void __put_user_bad(void);
+extern void __get_user_1(void);
+extern void __get_user_2(void);
+extern void __get_user_4(void);
+extern void __get_user_8(void);
+
+#define __get_user_x(size, x, ptr, err, rtype) \
+ __asm__ __volatile__("call __get_user_" #size \
+ : "=c" (err), rtype (x) \
+ : "0" (ptr));
+

/**
* put_user: - Write a simple value into user space.
@@ -255,33 +262,17 @@
__pu_err; \
})

-
#define __put_user_check(x,ptr,size) \
({ \
long __pu_err = -EFAULT; \
__typeof__(*(ptr)) *__pu_addr = (ptr); \
- might_sleep(); \
+ might_sleep(); \
if (access_ok(VERIFY_WRITE,__pu_addr,size)) \
__put_user_size((x),__pu_addr,(size),__pu_err,-EFAULT); \
__pu_err; \
})

-#define __put_user_u64(x, addr, err) \
- __asm__ __volatile__( \
- "1: movl %%eax,0(%2)\n" \
- "2: movl %%edx,4(%2)\n" \
- "3:\n" \
- ".section .fixup,\"ax\"\n" \
- "4: movl %3,%0\n" \
- " jmp 3b\n" \
- ".previous\n" \
- ".section __ex_table,\"a\"\n" \
- " .align 4\n" \
- " .long 1b,4b\n" \
- " .long 2b,4b\n" \
- ".previous" \
- : "=r"(err) \
- : "A" (x), "r" (addr), "i"(-EFAULT), "0"(err))
+extern void __put_user_bad(void);

#ifdef CONFIG_X86_WP_WORKS_OK

@@ -292,8 +283,8 @@
case 1: __put_user_asm(x,ptr,retval,"b","b","iq",errret);break; \
case 2: __put_user_asm(x,ptr,retval,"w","w","ir",errret);break; \
case 4: __put_user_asm(x,ptr,retval,"l","","ir",errret); break; \
- case 8: __put_user_u64((__typeof__(*ptr))(x),ptr,retval); break;\
- default: __put_user_bad(); \
+ case 8: __put_user_u64(x,ptr,retval,errret); break; \
+ default: __put_user_bad(); \
} \
} while (0)

@@ -301,7 +292,7 @@

#define __put_user_size(x,ptr,size,retval,errret) \
do { \
- __typeof__(*(ptr)) __pus_tmp = x; \
+ __typeof__(*(ptr)) __pus_tmp = (x); \
retval = 0; \
\
if(unlikely(__copy_to_user_ll(ptr, &__pus_tmp, size) != 0)) \
@@ -332,12 +323,31 @@
: "=r"(err) \
: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))

+#define __put_user_u64(x, addr, err, errret) \
+ __asm__ __volatile__( \
+ "1: movl %%eax,%2\n" \
+ "2: movl %%edx,%3\n" \
+ "3:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "4: movl %4,%0\n" \
+ " jmp 3b\n" \
+ ".previous\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .align 4\n" \
+ " .long 1b,4b\n" \
+ " .long 2b,4b\n" \
+ ".previous" \
+ : "=r" (err) \
+ : "A" (x), "m" (__m(addr)), "m" (__m(4+(char*)addr)), \
+ "i" (errret), "0" (err))
+

#define __get_user_nocheck(x,ptr,size) \
({ \
- long __gu_err, __gu_val; \
+ __typeof__(*(ptr)) __gu_val; \
+ int __gu_err; \
__get_user_size(__gu_val,(ptr),(size),__gu_err,-EFAULT);\
- (x) = (__typeof__(*(ptr)))__gu_val; \
+ (x) = __gu_val; \
__gu_err; \
})

@@ -350,7 +360,8 @@
case 1: __get_user_asm(x,ptr,retval,"b","b","=q",errret);break; \
case 2: __get_user_asm(x,ptr,retval,"w","w","=r",errret);break; \
case 4: __get_user_asm(x,ptr,retval,"l","","=r",errret);break; \
- default: (x) = __get_user_bad(); \
+ case 8: __get_user_u64(x,ptr,retval,errret);break; \
+ default: __get_user_bad(); (x) = 0; \
} \
} while (0)

@@ -370,6 +381,26 @@
: "=r"(err), ltype (x) \
: "m"(__m(addr)), "i"(errret), "0"(err))

+#define __get_user_u64(x, addr, err, errret) \
+ __asm__ __volatile__( \
+ "1: movl %2,%%eax\n" \
+ "2: movl %3,%%edx\n" \
+ "3:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "4: movl %4,%0\n" \
+ " xorl %%eax,%%eax\n" \
+ " xorl %%edx,%%edx\n" \
+ " jmp 3b\n" \
+ ".previous\n" \
+ ".section __ex_table,\"a\"\n" \
+ " .align 4\n" \
+ " .long 1b,4b\n" \
+ " .long 2b,4b\n" \
+ ".previous" \
+ : "=r" (err), "=A" (x) \
+ : "m" (__m(addr)), "m" (__m(4+(char*)addr)), \
+ "i" (errret), "0" (err));
+

unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned long n);
unsigned long __copy_from_user_ll(void *to, const void __user *from, unsigned long n);
@@ -411,6 +442,9 @@
case 4:
__put_user_size(*(u32 *)from, (u32 *)to, 4, ret, 4);
return ret;
+ case 8:
+ __put_user_size(*(u64 *)from, (u64 *)to, 8, ret, 8);
+ return ret;
}
}
return __copy_to_user_ll(to, from, n);
@@ -449,6 +483,9 @@
case 4:
__get_user_size(*(u32 *)to, from, 4, ret, 4);
return ret;
+ case 8:
+ __get_user_size(*(u64 *)to, from, 8, ret, 8);
+ return ret;
}
}
return __copy_from_user_ll(to, from, n);
@@ -496,8 +533,24 @@
copy_from_user(void *to, const void __user *from, unsigned long n)
{
might_sleep();
+ if (__builtin_constant_p(n)) {
+ switch (n) {
+ case 1:
+ __get_user_x(1, *(u8 *)to, from, n, "=a");
+ return n ? 1 : 0;
+ case 2:
+ __get_user_x(2, *(u16 *)to, from, n, "=a");
+ return n ? 2 : 0;
+ case 4:
+ __get_user_x(4, *(u32 *)to, from, n, "=a");
+ return n ? 4 : 0;
+ case 8:
+ __get_user_x(8, *(u64 *)to, from, n, "=A");
+ return n ? 8 : 0;
+ }
+ }
if (access_ok(VERIFY_READ, from, n))
- n = __copy_from_user(to, from, n);
+ n = __copy_from_user_ll(to, from, n);
else
memset(to, 0, n);
return n;

2004-04-20 18:16:16

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] Add 64-bit get_user and __get_user for i386



On Tue, 20 Apr 2004, Jamie Lokier wrote:
>
> Patch re-rolled. The only change is the removal of that cast,
> and a clarifying comment in getuser.S.

Well, it's not applying against current kernels (which have already
un-inlined some of these things), and that comment still isn't clear to
me. There's a comment about casting things, but the thing is, the macro
contains no casts anywhere...

Other than that, I'm convinced.

Linus

2004-04-20 23:25:22

by Jim Wilson

[permalink] [raw]
Subject: Re: [PATCH] Add 64-bit get_user and __get_user for i386

Jamie Lokier wrote:
> However that doesn't seem to be a problem. Experiments show that an
> __asm__ which outputs to a char or short type is _always_
> sign-extended or zero-extended if needed by the compiler, on various
> i386 subtargets. The compiler doesn't assume anything about the
> output higher bits from an __asm__.
>
> --> GCC experts, can you confirm the above property please?

expand_asm_operands in stmt.c checks for promoted variables, and if it
sees one, it will generate a temporary with the unpromoted size to
replace it in the asm, and then copy the temporary to the promoted
variable after the asm, ensuring that the extension happens after the asm.

> (Function calls are similar: GCC won't assume anything about the
> higher bits of the result received from a function, although it always
> promotes the result before returning a value. Odd combination).

I think this is a minor gcc bug. We have been promoting char/short
return values since the gcc-1.x days. This promotion occurs regardless
of whether the target asks for return values to be promoted. This
promotion happens in start_function in c-decl.c. Perhaps this is a left
over from the K&R C days, as I don't see anything in the C89 standard
that requires this. Changing this now might break ports that require
the promotion, but were not explicitly asking for it, so changing this
might be tricky. I'd suggest opening a gcc bugzilla bug report if we
really care about this.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com