2008-06-27 21:39:53

by Glauber Costa

Subject: [PATCH 0/39] Merge files at x86/lib

Hey folks,

Here goes a series of patches that merges some user-related files.
From x86/lib, delay.c, getuser.S, and putuser.S are merged. For the
last two of them, the accompanying include/asm-x86/uaccess.h is merged
as well, or close to it: there are some small leftovers that are
sufficiently different to remain in their own files.
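
The key to most of it is choosing register names and operand sizes at
build time instead of hardcoding them. As a rough illustration
(abridged from include/asm-x86/asm.h; __ASM_REG and the _ASM_AX-style
aliases are the new additions from this series):

	#ifdef CONFIG_X86_32
	# define __ASM_SEL(a, b) __ASM_FORM(a)
	#else
	# define __ASM_SEL(a, b) __ASM_FORM(b)
	#endif

	#define __ASM_REG(reg)	__ASM_SEL(e##reg, r##reg)
	#define _ASM_AX		__ASM_REG(ax)	/* %eax on i386, %rax on x86_64 */

With that, a line like "cmp TI_addr_limit(%_ASM_DX),%_ASM_AX" in
getuser.S assembles to the right word-size instruction on both
i386 and x86_64.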

As for bisectability, all patches have been tested in more than 20
different configs for both i386 and x86_64, in the usual way (it's
just that I'm testing in more configs now). If you find a build bug
in this series, please send me the offending config so I can add it
to my pool.

The diffstat and the one big patch follow at the end of this
introductory message.

Ingo, in the absence of any objections, you can pull this work from:

git://git.kernel.org/pub/scm/linux/kernel/git/glommer/linux-2.6-x86-integration.git master

into your tip/master tree.



Thanks!

Glauber

------------------>
Glauber Costa (39):
Don't use size specifiers
provide delay loop for x86_64
use rdtscll in read_current_timer for i386.
explicitly use edx in const delay function.
integrate delay functions
use something common for both architectures
don't clobber r8 nor use rcx
don't use word-size specifiers
adapt x86_64 getuser functions
rename threadinfo to TI
Don't use word-size specifiers on getuser_64
introduce __ASM_REG macro
use _ASM_PTR instead of explicit word-size pointers
merge getuser asm functions
don't save ebx in putuser_32.S
use put_user_x instead of all variants.
clobber rbx in putuser_64.S
pass argument to putuser_64 functions in ax register.
change testing logic in putuser_64.S
replace function headers by macros
don't use word-size specifiers in putuser files
use macros from asm.h
merge putuser asm functions
commonize __range_not_ok
merge common parts of uaccess.
merge getuser
move __addr_ok to uaccess.h
use k modifier for 4-byte access.
mark x86_64 as having a working WP.
don't always use EFAULT on __put_user_size.
merge __put_user_asm and its user.
don't always use EFAULT on __get_user_size.
merge __get_user_asm and its users.
Be more explicit in __put_user_x
turn __put_user_check directly into put_user.
merge put_user
move __get_user and __put_user into uaccess.h
put movsl_mask into uaccess.h
define architectural characteristics in uaccess.h

arch/x86/Kconfig.cpu | 2 +-
arch/x86/ia32/ia32entry.S | 25 +-
arch/x86/kernel/asm-offsets_64.c | 2 +-
arch/x86/kernel/entry_64.S | 27 +-
arch/x86/kernel/tsc_64.c | 1 +
arch/x86/lib/Makefile | 4 +-
arch/x86/lib/copy_user_64.S | 4 +-
arch/x86/lib/{delay_32.c => delay.c} | 17 +-
arch/x86/lib/delay_64.c | 85 ------
arch/x86/lib/{getuser_64.S => getuser.S} | 87 +++---
arch/x86/lib/getuser_32.S | 78 -----
arch/x86/lib/{putuser_32.S => putuser.S} | 73 +++---
arch/x86/lib/putuser_64.S | 106 -------
include/asm-x86/asm.h | 9 +-
include/asm-x86/uaccess.h | 449 ++++++++++++++++++++++++++++++
include/asm-x86/uaccess_32.h | 422 ----------------------------
include/asm-x86/uaccess_64.h | 260 -----------------
17 files changed, 577 insertions(+), 1074 deletions(-)
rename arch/x86/lib/{delay_32.c => delay.c} (96%)
delete mode 100644 arch/x86/lib/delay_64.c
rename arch/x86/lib/{getuser_64.S => getuser.S} (53%)
delete mode 100644 arch/x86/lib/getuser_32.S
rename arch/x86/lib/{putuser_32.S => putuser.S} (54%)
delete mode 100644 arch/x86/lib/putuser_64.S

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index d5f04f9..99ec0fe 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -344,7 +344,7 @@ config X86_F00F_BUG

config X86_WP_WORKS_OK
def_bool y
- depends on X86_32 && !M386
+ depends on !M386

config X86_INVLPG
def_bool y
diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 3aefbce..9bfea05 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -103,7 +103,7 @@ ENTRY(ia32_sysenter_target)
pushfq
CFI_ADJUST_CFA_OFFSET 8
/*CFI_REL_OFFSET rflags,0*/
- movl 8*3-THREAD_SIZE+threadinfo_sysenter_return(%rsp), %r10d
+ movl 8*3-THREAD_SIZE+TI_sysenter_return(%rsp), %r10d
CFI_REGISTER rip,r10
pushq $__USER32_CS
CFI_ADJUST_CFA_OFFSET 8
@@ -123,8 +123,9 @@ ENTRY(ia32_sysenter_target)
.quad 1b,ia32_badarg
.previous
GET_THREAD_INFO(%r10)
- orl $TS_COMPAT,threadinfo_status(%r10)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+ orl $TS_COMPAT,TI_status(%r10)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%r10)
CFI_REMEMBER_STATE
jnz sysenter_tracesys
sysenter_do_call:
@@ -136,9 +137,9 @@ sysenter_do_call:
GET_THREAD_INFO(%r10)
cli
TRACE_IRQS_OFF
- testl $_TIF_ALLWORK_MASK,threadinfo_flags(%r10)
+ testl $_TIF_ALLWORK_MASK,TI_flags(%r10)
jnz int_ret_from_sys_call
- andl $~TS_COMPAT,threadinfo_status(%r10)
+ andl $~TS_COMPAT,TI_status(%r10)
/* clear IF, that popfq doesn't enable interrupts early */
andl $~0x200,EFLAGS-R11(%rsp)
movl RIP-R11(%rsp),%edx /* User %eip */
@@ -230,8 +231,9 @@ ENTRY(ia32_cstar_target)
.quad 1b,ia32_badarg
.previous
GET_THREAD_INFO(%r10)
- orl $TS_COMPAT,threadinfo_status(%r10)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+ orl $TS_COMPAT,TI_status(%r10)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%r10)
CFI_REMEMBER_STATE
jnz cstar_tracesys
cstar_do_call:
@@ -243,9 +245,9 @@ cstar_do_call:
GET_THREAD_INFO(%r10)
cli
TRACE_IRQS_OFF
- testl $_TIF_ALLWORK_MASK,threadinfo_flags(%r10)
+ testl $_TIF_ALLWORK_MASK,TI_flags(%r10)
jnz int_ret_from_sys_call
- andl $~TS_COMPAT,threadinfo_status(%r10)
+ andl $~TS_COMPAT,TI_status(%r10)
RESTORE_ARGS 1,-ARG_SKIP,1,1,1
movl RIP-ARGOFFSET(%rsp),%ecx
CFI_REGISTER rip,rcx
@@ -324,8 +326,9 @@ ENTRY(ia32_syscall)
this could be a problem. */
SAVE_ARGS 0,0,1
GET_THREAD_INFO(%r10)
- orl $TS_COMPAT,threadinfo_status(%r10)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+ orl $TS_COMPAT,TI_status(%r10)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%r10)
jnz ia32_tracesys
ia32_do_syscall:
cmpl $(IA32_NR_syscalls-1),%eax
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index a5bbec3..2fcc6ac 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -34,7 +34,7 @@ int main(void)
ENTRY(pid);
BLANK();
#undef ENTRY
-#define ENTRY(entry) DEFINE(threadinfo_ ## entry, offsetof(struct thread_info, entry))
+#define ENTRY(entry) DEFINE(TI_ ## entry, offsetof(struct thread_info, entry))
ENTRY(flags);
ENTRY(addr_limit);
ENTRY(preempt_count);
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c035b20..b79cfc9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -277,13 +277,13 @@ ENTRY(ret_from_fork)
CFI_ADJUST_CFA_OFFSET -4
call schedule_tail
GET_THREAD_INFO(%rcx)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),threadinfo_flags(%rcx)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),TI_flags(%rcx)
jnz rff_trace
rff_action:
RESTORE_REST
testl $3,CS-ARGOFFSET(%rsp) # from kernel_thread?
je int_ret_from_sys_call
- testl $_TIF_IA32,threadinfo_flags(%rcx)
+ testl $_TIF_IA32,TI_flags(%rcx)
jnz int_ret_from_sys_call
RESTORE_TOP_OF_STACK %rdi,ARGOFFSET
jmp ret_from_sys_call
@@ -352,7 +352,8 @@ ENTRY(system_call_after_swapgs)
movq %rcx,RIP-ARGOFFSET(%rsp)
CFI_REL_OFFSET rip,RIP-ARGOFFSET
GET_THREAD_INFO(%rcx)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%rcx)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%rcx)
jnz tracesys
cmpq $__NR_syscall_max,%rax
ja badsys
@@ -371,7 +372,7 @@ sysret_check:
GET_THREAD_INFO(%rcx)
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
andl %edi,%edx
jnz sysret_careful
CFI_REMEMBER_STATE
@@ -455,10 +456,10 @@ int_ret_from_sys_call:
int_with_check:
LOCKDEP_SYS_EXIT_IRQ
GET_THREAD_INFO(%rcx)
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
andl %edi,%edx
jnz int_careful
- andl $~TS_COMPAT,threadinfo_status(%rcx)
+ andl $~TS_COMPAT,TI_status(%rcx)
jmp retint_swapgs

/* Either reschedule or signal or syscall exit tracking needed. */
@@ -666,7 +667,7 @@ retint_with_reschedule:
movl $_TIF_WORK_MASK,%edi
retint_check:
LOCKDEP_SYS_EXIT_IRQ
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
andl %edi,%edx
CFI_REMEMBER_STATE
jnz retint_careful
@@ -764,7 +765,7 @@ retint_signal:
/* Returning to kernel space from exception. */
/* rcx: threadinfo. interrupts off. */
ENTRY(retexc_kernel)
- testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx)
+ testl $HARDNMI_MASK,TI_preempt_count(%rcx)
jz retint_kernel /* Not nested over NMI ? */
testw $X86_EFLAGS_TF,EFLAGS-ARGOFFSET(%rsp) /* trap flag? */
jnz retint_kernel /*
@@ -782,9 +783,9 @@ ENTRY(retexc_kernel)
/* Returning to kernel space. Check if we need preemption */
/* rcx: threadinfo. interrupts off. */
ENTRY(retint_kernel)
- cmpl $0,threadinfo_preempt_count(%rcx)
+ cmpl $0,TI_preempt_count(%rcx)
jnz retint_restore_args
- bt $TIF_NEED_RESCHED,threadinfo_flags(%rcx)
+ bt $TIF_NEED_RESCHED,TI_flags(%rcx)
jnc retint_restore_args
bt $9,EFLAGS-ARGOFFSET(%rsp) /* interrupts off? */
jnc retint_restore_args
@@ -945,7 +946,7 @@ paranoid_restore_no_nmi\trace:
jmp irq_return
paranoid_restore\trace:
GET_THREAD_INFO(%rcx)
- testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx)
+ testl $HARDNMI_MASK,TI_preempt_count(%rcx)
jz paranoid_restore_no_nmi\trace /* Nested over NMI ? */
testw $X86_EFLAGS_TF,EFLAGS-0(%rsp) /* trap flag? */
jnz paranoid_restore_no_nmi\trace
@@ -953,7 +954,7 @@ paranoid_restore\trace:
INTERRUPT_RETURN_NMI_SAFE
paranoid_userspace\trace:
GET_THREAD_INFO(%rcx)
- movl threadinfo_flags(%rcx),%ebx
+ movl TI_flags(%rcx),%ebx
andl $_TIF_WORK_MASK,%ebx
jz paranoid_swapgs\trace
movq %rsp,%rdi /* &pt_regs */
@@ -1051,7 +1052,7 @@ error_exit:
testl %eax,%eax
jne retexc_kernel
LOCKDEP_SYS_EXIT_IRQ
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
movl $_TIF_WORK_MASK,%edi
andl %edi,%edx
jnz retint_careful
diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index 9898fb0..36ac46f 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -258,6 +258,7 @@ void __init tsc_calibrate(void)
out:
for_each_possible_cpu(cpu)
set_cyc2ns_scale(tsc_khz, cpu);
+ use_tsc_delay();
}

/*
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 84aa288..aa3fa41 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -4,9 +4,9 @@

obj-$(CONFIG_SMP) := msr-on-cpu.o

-lib-y := delay_$(BITS).o
+lib-y := delay.o
lib-y += thunk_$(BITS).o
-lib-y += usercopy_$(BITS).o getuser_$(BITS).o putuser_$(BITS).o
+lib-y += usercopy_$(BITS).o getuser.o putuser.o
lib-y += memcpy_$(BITS).o

ifeq ($(CONFIG_X86_32),y)
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index ee1c3f6..7eaaf01 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -40,7 +40,7 @@ ENTRY(copy_to_user)
movq %rdi,%rcx
addq %rdx,%rcx
jc bad_to_user
- cmpq threadinfo_addr_limit(%rax),%rcx
+ cmpq TI_addr_limit(%rax),%rcx
jae bad_to_user
xorl %eax,%eax /* clear zero flag */
ALTERNATIVE_JUMP X86_FEATURE_REP_GOOD,copy_user_generic_unrolled,copy_user_generic_string
@@ -65,7 +65,7 @@ ENTRY(copy_from_user)
movq %rsi,%rcx
addq %rdx,%rcx
jc bad_from_user
- cmpq threadinfo_addr_limit(%rax),%rcx
+ cmpq TI_addr_limit(%rax),%rcx
jae bad_from_user
movl $1,%ecx /* set zero flag */
ALTERNATIVE_JUMP X86_FEATURE_REP_GOOD,copy_user_generic_unrolled,copy_user_generic_string
diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay.c
similarity index 96%
rename from arch/x86/lib/delay_32.c
rename to arch/x86/lib/delay.c
index ef69131..f456860 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay.c
@@ -29,7 +29,7 @@
/* simple loop based delay: */
static void delay_loop(unsigned long loops)
{
- __asm__ __volatile__(
+ asm volatile(
" test %0,%0 \n"
" jz 3f \n"
" jmp 1f \n"
@@ -38,9 +38,9 @@ static void delay_loop(unsigned long loops)
"1: jmp 2f \n"

".align 16 \n"
- "2: decl %0 \n"
+ "2: dec %0 \n"
" jnz 2b \n"
- "3: decl %0 \n"
+ "3: dec %0 \n"

: /* we don't need output */
:"a" (loops)
@@ -98,7 +98,7 @@ void use_tsc_delay(void)
int __devinit read_current_timer(unsigned long *timer_val)
{
if (delay_fn == delay_tsc) {
- rdtscl(*timer_val);
+ rdtscll(*timer_val);
return 0;
}
return -1;
@@ -108,31 +108,30 @@ void __delay(unsigned long loops)
{
delay_fn(loops);
}
+EXPORT_SYMBOL(__delay);

inline void __const_udelay(unsigned long xloops)
{
int d0;

xloops *= 4;
- __asm__("mull %0"
+ asm("mull %%edx"
:"=d" (xloops), "=&a" (d0)
:"1" (xloops), "0"
(cpu_data(raw_smp_processor_id()).loops_per_jiffy * (HZ/4)));

__delay(++xloops);
}
+EXPORT_SYMBOL(__const_udelay);

void __udelay(unsigned long usecs)
{
__const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */
}
+EXPORT_SYMBOL(__udelay);

void __ndelay(unsigned long nsecs)
{
__const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */
}
-
-EXPORT_SYMBOL(__delay);
-EXPORT_SYMBOL(__const_udelay);
-EXPORT_SYMBOL(__udelay);
EXPORT_SYMBOL(__ndelay);
diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
deleted file mode 100644
index 4c441be..0000000
--- a/arch/x86/lib/delay_64.c
+++ /dev/null
@@ -1,85 +0,0 @@
-/*
- * Precise Delay Loops for x86-64
- *
- * Copyright (C) 1993 Linus Torvalds
- * Copyright (C) 1997 Martin Mares <[email protected]>
- *
- * The __delay function must _NOT_ be inlined as its execution time
- * depends wildly on alignment on many x86 processors.
- */
-
-#include <linux/module.h>
-#include <linux/sched.h>
-#include <linux/timex.h>
-#include <linux/preempt.h>
-#include <linux/delay.h>
-#include <linux/init.h>
-
-#include <asm/delay.h>
-#include <asm/msr.h>
-
-#ifdef CONFIG_SMP
-#include <asm/smp.h>
-#endif
-
-int __devinit read_current_timer(unsigned long *timer_value)
-{
- rdtscll(*timer_value);
- return 0;
-}
-
-void __delay(unsigned long loops)
-{
- unsigned bclock, now;
- int cpu;
-
- preempt_disable();
- cpu = smp_processor_id();
- rdtscl(bclock);
- for (;;) {
- rdtscl(now);
- if ((now - bclock) >= loops)
- break;
-
- /* Allow RT tasks to run */
- preempt_enable();
- rep_nop();
- preempt_disable();
-
- /*
- * It is possible that we moved to another CPU, and
- * since TSC's are per-cpu we need to calculate
- * that. The delay must guarantee that we wait "at
- * least" the amount of time. Being moved to another
- * CPU could make the wait longer but we just need to
- * make sure we waited long enough. Rebalance the
- * counter for this CPU.
- */
- if (unlikely(cpu != smp_processor_id())) {
- loops -= (now - bclock);
- cpu = smp_processor_id();
- rdtscl(bclock);
- }
- }
- preempt_enable();
-}
-EXPORT_SYMBOL(__delay);
-
-inline void __const_udelay(unsigned long xloops)
-{
- __delay(((xloops * HZ *
- cpu_data(raw_smp_processor_id()).loops_per_jiffy) >> 32) + 1);
-}
-EXPORT_SYMBOL(__const_udelay);
-
-void __udelay(unsigned long usecs)
-{
- __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long nsecs)
-{
- __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */
-}
-EXPORT_SYMBOL(__ndelay);
diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser.S
similarity index 53%
rename from arch/x86/lib/getuser_64.S
rename to arch/x86/lib/getuser.S
index 5448876..ad37400 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser.S
@@ -3,6 +3,7 @@
*
* (C) Copyright 1998 Linus Torvalds
* (C) Copyright 2005 Andi Kleen
+ * (C) Copyright 2008 Glauber Costa
*
* These functions have a non-standard call interface
* to make them more efficient, especially as they
@@ -13,14 +14,13 @@
/*
* __get_user_X
*
- * Inputs: %rcx contains the address.
+ * Inputs: %[r|e]ax contains the address.
* The register is modified, but all changes are undone
* before returning because the C code doesn't know about it.
*
- * Outputs: %rax is error code (0 or -EFAULT)
- * %rdx contains zero-extended value
- *
- * %r8 is destroyed.
+ * Outputs: %[r|e]ax is error code (0 or -EFAULT)
+ * %[r|e]dx contains zero-extended value
+ *
*
* These functions should not modify any other registers,
* as they get called from within inline assembly.
@@ -32,78 +32,73 @@
#include <asm/errno.h>
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
+#include <asm/asm.h>

.text
ENTRY(__get_user_1)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- cmpq threadinfo_addr_limit(%r8),%rcx
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-1: movzb (%rcx),%edx
- xorl %eax,%eax
+1: movzb (%_ASM_AX),%edx
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $1,%rcx
- jc 20f
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae 20f
- decq %rcx
-2: movzwl (%rcx),%edx
- xorl %eax,%eax
+ add $1,%_ASM_AX
+ jc bad_get_user
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+2: movzwl -1(%_ASM_AX),%edx
+ xor %eax,%eax
ret
-20: decq %rcx
- jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $3,%rcx
- jc 30f
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae 30f
- subq $3,%rcx
-3: movl (%rcx),%edx
- xorl %eax,%eax
+ add $3,%_ASM_AX
+ jc bad_get_user
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+3: mov -3(%_ASM_AX),%edx
+ xor %eax,%eax
ret
-30: subq $3,%rcx
- jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_4)

+#ifdef CONFIG_X86_64
ENTRY(__get_user_8)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $7,%rcx
- jc 40f
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae 40f
- subq $7,%rcx
-4: movq (%rcx),%rdx
- xorl %eax,%eax
+ add $7,%_ASM_AX
+ jc bad_get_user
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+4: movq -7(%_ASM_AX),%_ASM_DX
+ xor %eax,%eax
ret
-40: subq $7,%rcx
- jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_8)
+#endif

bad_get_user:
CFI_STARTPROC
- xorl %edx,%edx
- movq $(-EFAULT),%rax
+ xor %edx,%edx
+ mov $(-EFAULT),%_ASM_AX
ret
CFI_ENDPROC
END(bad_get_user)

.section __ex_table,"a"
- .quad 1b,bad_get_user
- .quad 2b,bad_get_user
- .quad 3b,bad_get_user
- .quad 4b,bad_get_user
-.previous
+ _ASM_PTR 1b,bad_get_user
+ _ASM_PTR 2b,bad_get_user
+ _ASM_PTR 3b,bad_get_user
+#ifdef CONFIG_X86_64
+ _ASM_PTR 4b,bad_get_user
+#endif
diff --git a/arch/x86/lib/getuser_32.S b/arch/x86/lib/getuser_32.S
deleted file mode 100644
index 6d84b53..0000000
--- a/arch/x86/lib/getuser_32.S
+++ /dev/null
@@ -1,78 +0,0 @@
-/*
- * __get_user functions.
- *
- * (C) Copyright 1998 Linus Torvalds
- *
- * These functions have a non-standard call interface
- * to make them more efficient, especially as they
- * return an error value in addition to the "real"
- * return value.
- */
-#include <linux/linkage.h>
-#include <asm/dwarf2.h>
-#include <asm/thread_info.h>
-
-
-/*
- * __get_user_X
- *
- * Inputs: %eax contains the address
- *
- * Outputs: %eax is error code (0 or -EFAULT)
- * %edx contains zero-extended value
- *
- * These functions should not modify any other registers,
- * as they get called from within inline assembly.
- */
-
-.text
-ENTRY(__get_user_1)
- CFI_STARTPROC
- GET_THREAD_INFO(%edx)
- cmpl TI_addr_limit(%edx),%eax
- jae bad_get_user
-1: movzbl (%eax),%edx
- xorl %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_1)
-
-ENTRY(__get_user_2)
- CFI_STARTPROC
- addl $1,%eax
- jc bad_get_user
- GET_THREAD_INFO(%edx)
- cmpl TI_addr_limit(%edx),%eax
- jae bad_get_user
-2: movzwl -1(%eax),%edx
- xorl %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_2)
-
-ENTRY(__get_user_4)
- CFI_STARTPROC
- addl $3,%eax
- jc bad_get_user
- GET_THREAD_INFO(%edx)
- cmpl TI_addr_limit(%edx),%eax
- jae bad_get_user
-3: movl -3(%eax),%edx
- xorl %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_4)
-
-bad_get_user:
- CFI_STARTPROC
- xorl %edx,%edx
- movl $-14,%eax
- ret
- CFI_ENDPROC
-END(bad_get_user)
-
-.section __ex_table,"a"
- .long 1b,bad_get_user
- .long 2b,bad_get_user
- .long 3b,bad_get_user
-.previous
diff --git a/arch/x86/lib/putuser_32.S b/arch/x86/lib/putuser.S
similarity index 54%
rename from arch/x86/lib/putuser_32.S
rename to arch/x86/lib/putuser.S
index f58fba1..36b0d15 100644
--- a/arch/x86/lib/putuser_32.S
+++ b/arch/x86/lib/putuser.S
@@ -2,6 +2,8 @@
* __put_user functions.
*
* (C) Copyright 2005 Linus Torvalds
+ * (C) Copyright 2005 Andi Kleen
+ * (C) Copyright 2008 Glauber Costa
*
* These functions have a non-standard call interface
* to make them more efficient, especially as they
@@ -11,6 +13,8 @@
#include <linux/linkage.h>
#include <asm/dwarf2.h>
#include <asm/thread_info.h>
+#include <asm/errno.h>
+#include <asm/asm.h>


/*
@@ -26,73 +30,68 @@
*/

#define ENTER CFI_STARTPROC ; \
- pushl %ebx ; \
- CFI_ADJUST_CFA_OFFSET 4 ; \
- CFI_REL_OFFSET ebx, 0 ; \
- GET_THREAD_INFO(%ebx)
-#define EXIT popl %ebx ; \
- CFI_ADJUST_CFA_OFFSET -4 ; \
- CFI_RESTORE ebx ; \
- ret ; \
+ GET_THREAD_INFO(%_ASM_BX)
+#define EXIT ret ; \
CFI_ENDPROC

.text
ENTRY(__put_user_1)
ENTER
- cmpl TI_addr_limit(%ebx),%ecx
+ cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
jae bad_put_user
-1: movb %al,(%ecx)
- xorl %eax,%eax
+1: movb %al,(%_ASM_CX)
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_1)

ENTRY(__put_user_2)
ENTER
- movl TI_addr_limit(%ebx),%ebx
- subl $1,%ebx
- cmpl %ebx,%ecx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $1,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
jae bad_put_user
-2: movw %ax,(%ecx)
- xorl %eax,%eax
+2: movw %ax,(%_ASM_CX)
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_2)

ENTRY(__put_user_4)
ENTER
- movl TI_addr_limit(%ebx),%ebx
- subl $3,%ebx
- cmpl %ebx,%ecx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $3,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
jae bad_put_user
-3: movl %eax,(%ecx)
- xorl %eax,%eax
+3: movl %eax,(%_ASM_CX)
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_4)

ENTRY(__put_user_8)
ENTER
- movl TI_addr_limit(%ebx),%ebx
- subl $7,%ebx
- cmpl %ebx,%ecx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $7,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
jae bad_put_user
-4: movl %eax,(%ecx)
-5: movl %edx,4(%ecx)
- xorl %eax,%eax
+4: mov %_ASM_AX,(%_ASM_CX)
+#ifdef CONFIG_X86_32
+5: movl %edx,4(%_ASM_CX)
+#endif
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_8)

bad_put_user:
- CFI_STARTPROC simple
- CFI_DEF_CFA esp, 2*4
- CFI_OFFSET eip, -1*4
- CFI_OFFSET ebx, -2*4
- movl $-14,%eax
+ CFI_STARTPROC
+ movl $-EFAULT,%eax
EXIT
END(bad_put_user)

.section __ex_table,"a"
- .long 1b,bad_put_user
- .long 2b,bad_put_user
- .long 3b,bad_put_user
- .long 4b,bad_put_user
- .long 5b,bad_put_user
+ _ASM_PTR 1b,bad_put_user
+ _ASM_PTR 2b,bad_put_user
+ _ASM_PTR 3b,bad_put_user
+ _ASM_PTR 4b,bad_put_user
+#ifdef CONFIG_X86_32
+ _ASM_PTR 5b,bad_put_user
+#endif
.previous
diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
deleted file mode 100644
index 4989f5a..0000000
--- a/arch/x86/lib/putuser_64.S
+++ /dev/null
@@ -1,106 +0,0 @@
-/*
- * __put_user functions.
- *
- * (C) Copyright 1998 Linus Torvalds
- * (C) Copyright 2005 Andi Kleen
- *
- * These functions have a non-standard call interface
- * to make them more efficient, especially as they
- * return an error value in addition to the "real"
- * return value.
- */
-
-/*
- * __put_user_X
- *
- * Inputs: %rcx contains the address
- * %rdx contains new value
- *
- * Outputs: %rax is error code (0 or -EFAULT)
- *
- * %r8 is destroyed.
- *
- * These functions should not modify any other registers,
- * as they get called from within inline assembly.
- */
-
-#include <linux/linkage.h>
-#include <asm/dwarf2.h>
-#include <asm/page.h>
-#include <asm/errno.h>
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
- .text
-ENTRY(__put_user_1)
- CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae bad_put_user
-1: movb %dl,(%rcx)
- xorl %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__put_user_1)
-
-ENTRY(__put_user_2)
- CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $1,%rcx
- jc 20f
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae 20f
- decq %rcx
-2: movw %dx,(%rcx)
- xorl %eax,%eax
- ret
-20: decq %rcx
- jmp bad_put_user
- CFI_ENDPROC
-ENDPROC(__put_user_2)
-
-ENTRY(__put_user_4)
- CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $3,%rcx
- jc 30f
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae 30f
- subq $3,%rcx
-3: movl %edx,(%rcx)
- xorl %eax,%eax
- ret
-30: subq $3,%rcx
- jmp bad_put_user
- CFI_ENDPROC
-ENDPROC(__put_user_4)
-
-ENTRY(__put_user_8)
- CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $7,%rcx
- jc 40f
- cmpq threadinfo_addr_limit(%r8),%rcx
- jae 40f
- subq $7,%rcx
-4: movq %rdx,(%rcx)
- xorl %eax,%eax
- ret
-40: subq $7,%rcx
- jmp bad_put_user
- CFI_ENDPROC
-ENDPROC(__put_user_8)
-
-bad_put_user:
- CFI_STARTPROC
- movq $(-EFAULT),%rax
- ret
- CFI_ENDPROC
-END(bad_put_user)
-
-.section __ex_table,"a"
- .quad 1b,bad_put_user
- .quad 2b,bad_put_user
- .quad 3b,bad_put_user
- .quad 4b,bad_put_user
-.previous
diff --git a/include/asm-x86/asm.h b/include/asm-x86/asm.h
index 7093982..9722032 100644
--- a/include/asm-x86/asm.h
+++ b/include/asm-x86/asm.h
@@ -3,8 +3,10 @@

#ifdef __ASSEMBLY__
# define __ASM_FORM(x) x
+# define __ASM_EX_SEC .section __ex_table
#else
# define __ASM_FORM(x) " " #x " "
+# define __ASM_EX_SEC " .section __ex_table,\"a\"\n"
#endif

#ifdef CONFIG_X86_32
@@ -14,6 +16,7 @@
#endif

#define __ASM_SIZE(inst) __ASM_SEL(inst##l, inst##q)
+#define __ASM_REG(reg) __ASM_SEL(e##reg, r##reg)

#define _ASM_PTR __ASM_SEL(.long, .quad)
#define _ASM_ALIGN __ASM_SEL(.balign 4, .balign 8)
@@ -24,10 +27,14 @@
#define _ASM_ADD __ASM_SIZE(add)
#define _ASM_SUB __ASM_SIZE(sub)
#define _ASM_XADD __ASM_SIZE(xadd)
+#define _ASM_AX __ASM_REG(ax)
+#define _ASM_BX __ASM_REG(bx)
+#define _ASM_CX __ASM_REG(cx)
+#define _ASM_DX __ASM_REG(dx)

/* Exception table entry */
# define _ASM_EXTABLE(from,to) \
- " .section __ex_table,\"a\"\n" \
+ __ASM_EX_SEC \
_ASM_ALIGN "\n" \
_ASM_PTR #from "," #to "\n" \
" .previous\n"
diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 9fefd29..a1e8157 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -1,5 +1,454 @@
+#ifndef _ASM_UACCES_H_
+#define _ASM_UACCES_H_
+/*
+ * User space memory access functions
+ */
+#include <linux/errno.h>
+#include <linux/compiler.h>
+#include <linux/thread_info.h>
+#include <linux/prefetch.h>
+#include <linux/string.h>
+#include <asm/asm.h>
+#include <asm/page.h>
+
+#define VERIFY_READ 0
+#define VERIFY_WRITE 1
+
+/*
+ * The fs value determines whether argument validity checking should be
+ * performed or not. If get_fs() == USER_DS, checking is performed, with
+ * get_fs() == KERNEL_DS, checking is bypassed.
+ *
+ * For historical reasons, these macros are grossly misnamed.
+ */
+
+#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
+
+#define KERNEL_DS MAKE_MM_SEG(-1UL)
+#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
+
+#define get_ds() (KERNEL_DS)
+#define get_fs() (current_thread_info()->addr_limit)
+#define set_fs(x) (current_thread_info()->addr_limit = (x))
+
+#define segment_eq(a, b) ((a).seg == (b).seg)
+
+#define __addr_ok(addr) \
+ ((unsigned long __force)(addr) < \
+ (current_thread_info()->addr_limit.seg))
+
+/*
+ * Test whether a block of memory is a valid user space address.
+ * Returns 0 if the range is valid, nonzero otherwise.
+ *
+ * This is equivalent to the following test:
+ * (u33)addr + (u33)size >= (u33)current->addr_limit.seg (u65 for x86_64)
+ *
+ * This needs 33-bit (65-bit for x86_64) arithmetic. We have a carry...
+ */
+
+#define __range_not_ok(addr, size) \
+({ \
+ unsigned long flag, roksum; \
+ __chk_user_ptr(addr); \
+ asm("# range_ok\n\r" \
+ "add %3,%1 ; sbb %0,%0 ; cmp %1,%4 ; sbb $0,%0" \
+ : "=&r" (flag), "=r" (roksum) \
+ : "1" (addr), "g" ((long)(size)), \
+ "g" (current_thread_info()->addr_limit.seg)); \
+ flag; \
+})
+
+/**
+ * access_ok: - Checks if a user space pointer is valid
+ * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
+ * %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
+ * to write to a block, it is always safe to read from it.
+ * @addr: User space pointer to start of block to check
+ * @size: Size of block to check
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * Checks if a pointer to a block of memory in user space is valid.
+ *
+ * Returns true (nonzero) if the memory block may be valid, false (zero)
+ * if it is definitely invalid.
+ *
+ * Note that, depending on architecture, this function probably just
+ * checks that the pointer is in the user space range - after calling
+ * this function, memory access functions may still return -EFAULT.
+ */
+#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))
+
+/*
+ * The exception table consists of pairs of addresses: the first is the
+ * address of an instruction that is allowed to fault, and the second is
+ * the address at which the program should continue. No registers are
+ * modified, so it is entirely up to the continuation code to figure out
+ * what to do.
+ *
+ * All the routines below use bits of fixup code that are out of line
+ * with the main instruction path. This means when everything is well,
+ * we don't even have to jump over them. Further, they do not intrude
+ * on our cache or tlb entries.
+ */
+
+struct exception_table_entry {
+ unsigned long insn, fixup;
+};
+
+extern int fixup_exception(struct pt_regs *regs);
+
+/*
+ * These are the main single-value transfer routines. They automatically
+ * use the right size if we just have the right pointer type.
+ *
+ * This gets kind of ugly. We want to return _two_ values in "get_user()"
+ * and yet we don't want to do any pointers, because that is too much
+ * of a performance impact. Thus we have a few rather ugly macros here,
+ * and hide all the ugliness from the user.
+ *
+ * The "__xxx" versions of the user access functions are versions that
+ * do not verify the address space, that must have been done previously
+ * with a separate "access_ok()" call (this is used when we do multiple
+ * accesses to the same area of user memory).
+ */
+
+extern int __get_user_1(void);
+extern int __get_user_2(void);
+extern int __get_user_4(void);
+extern int __get_user_8(void);
+extern int __get_user_bad(void);
+
+#define __get_user_x(size, ret, x, ptr) \
+ asm volatile("call __get_user_" #size \
+ : "=a" (ret),"=d" (x) \
+ : "0" (ptr)) \
+
+/* Careful: we have to cast the result to the type of the pointer
+ * for sign reasons */
+
+/**
+ * get_user: - Get a simple variable from user space.
+ * @x: Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+#ifdef CONFIG_X86_32
+#define __get_user_8(__ret_gu, __val_gu, ptr) \
+ __get_user_x(X, __ret_gu, __val_gu, ptr)
+#else
+#define __get_user_8(__ret_gu, __val_gu, ptr) \
+ __get_user_x(8, __ret_gu, __val_gu, ptr)
+#endif
+
+#define get_user(x, ptr) \
+({ \
+ int __ret_gu; \
+ unsigned long __val_gu; \
+ __chk_user_ptr(ptr); \
+ switch (sizeof(*(ptr))) { \
+ case 1: \
+ __get_user_x(1, __ret_gu, __val_gu, ptr); \
+ break; \
+ case 2: \
+ __get_user_x(2, __ret_gu, __val_gu, ptr); \
+ break; \
+ case 4: \
+ __get_user_x(4, __ret_gu, __val_gu, ptr); \
+ break; \
+ case 8: \
+ __get_user_8(__ret_gu, __val_gu, ptr); \
+ break; \
+ default: \
+ __get_user_x(X, __ret_gu, __val_gu, ptr); \
+ break; \
+ } \
+ (x) = (__typeof__(*(ptr)))__val_gu; \
+ __ret_gu; \
+})
+
+#define __put_user_x(size, x, ptr, __ret_pu) \
+ asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
+ :"0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
+
+
+
+#ifdef CONFIG_X86_32
+#define __put_user_u64(x, addr, err) \
+ asm volatile("1: movl %%eax,0(%2)\n" \
+ "2: movl %%edx,4(%2)\n" \
+ "3:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "4: movl %3,%0\n" \
+ " jmp 3b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 4b) \
+ _ASM_EXTABLE(2b, 4b) \
+ : "=r" (err) \
+ : "A" (x), "r" (addr), "i" (-EFAULT), "0" (err))
+
+#define __put_user_x8(x, ptr, __ret_pu) \
+ asm volatile("call __put_user_8" : "=a" (__ret_pu) \
+ : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
+#else
+#define __put_user_u64(x, ptr, retval) \
+ __put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT)
+#define __put_user_x8(x, ptr, __ret_pu) __put_user_x(8, x, ptr, __ret_pu)
+#endif
+
+extern void __put_user_bad(void);
+
+/*
+ * Strange magic calling convention: pointer in %ecx,
+ * value in %eax(:%edx), return value in %eax. clobbers %rbx
+ */
+extern void __put_user_1(void);
+extern void __put_user_2(void);
+extern void __put_user_4(void);
+extern void __put_user_8(void);
+
+#ifdef CONFIG_X86_WP_WORKS_OK
+
+/**
+ * put_user: - Write a simple value into user space.
+ * @x: Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+#define put_user(x, ptr) \
+({ \
+ int __ret_pu; \
+ __typeof__(*(ptr)) __pu_val; \
+ __chk_user_ptr(ptr); \
+ __pu_val = x; \
+ switch (sizeof(*(ptr))) { \
+ case 1: \
+ __put_user_x(1, __pu_val, ptr, __ret_pu); \
+ break; \
+ case 2: \
+ __put_user_x(2, __pu_val, ptr, __ret_pu); \
+ break; \
+ case 4: \
+ __put_user_x(4, __pu_val, ptr, __ret_pu); \
+ break; \
+ case 8: \
+ __put_user_x8(__pu_val, ptr, __ret_pu); \
+ break; \
+ default: \
+ __put_user_x(X, __pu_val, ptr, __ret_pu); \
+ break; \
+ } \
+ __ret_pu; \
+})
+
+#define __put_user_size(x, ptr, size, retval, errret) \
+do { \
+ retval = 0; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: \
+ __put_user_asm(x, ptr, retval, "b", "b", "iq", errret); \
+ break; \
+ case 2: \
+ __put_user_asm(x, ptr, retval, "w", "w", "ir", errret); \
+ break; \
+ case 4: \
+ __put_user_asm(x, ptr, retval, "l", "k", "ir", errret);\
+ break; \
+ case 8: \
+ __put_user_u64((__typeof__(*ptr))(x), ptr, retval); \
+ break; \
+ default: \
+ __put_user_bad(); \
+ } \
+} while (0)
+
+#else
+
+#define __put_user_size(x, ptr, size, retval, errret) \
+do { \
+ __typeof__(*(ptr))__pus_tmp = x; \
+ retval = 0; \
+ \
+ if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, size) != 0)) \
+ retval = errret; \
+} while (0)
+
+#define put_user(x, ptr) \
+({ \
+ int __ret_pu; \
+ __typeof__(*(ptr))__pus_tmp = x; \
+ __ret_pu = 0; \
+ if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, \
+ sizeof(*(ptr))) != 0)) \
+ __ret_pu = -EFAULT; \
+ __ret_pu; \
+})
+#endif
+
+#ifdef CONFIG_X86_32
+#define __get_user_asm_u64(x, ptr, retval, errret) (x) = __get_user_bad()
+#else
+#define __get_user_asm_u64(x, ptr, retval, errret) \
+ __get_user_asm(x, ptr, retval, "q", "", "=r", errret)
+#endif
+
+#define __get_user_size(x, ptr, size, retval, errret) \
+do { \
+ retval = 0; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: \
+ __get_user_asm(x, ptr, retval, "b", "b", "=q", errret); \
+ break; \
+ case 2: \
+ __get_user_asm(x, ptr, retval, "w", "w", "=r", errret); \
+ break; \
+ case 4: \
+ __get_user_asm(x, ptr, retval, "l", "k", "=r", errret); \
+ break; \
+ case 8: \
+ __get_user_asm_u64(x, ptr, retval, errret); \
+ break; \
+ default: \
+ (x) = __get_user_bad(); \
+ } \
+} while (0)
+
+#define __get_user_asm(x, addr, err, itype, rtype, ltype, errret) \
+ asm volatile("1: mov"itype" %2,%"rtype"1\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: mov %3,%0\n" \
+ " xor"itype" %"rtype"1,%"rtype"1\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=r" (err), ltype(x) \
+ : "m" (__m(addr)), "i" (errret), "0" (err))
+
+#define __put_user_nocheck(x, ptr, size) \
+({ \
+ long __pu_err; \
+ __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \
+ __pu_err; \
+})
+
+#define __get_user_nocheck(x, ptr, size) \
+({ \
+ long __gu_err; \
+ unsigned long __gu_val; \
+ __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \
+ (x) = (__force __typeof__(*(ptr)))__gu_val; \
+ __gu_err; \
+})
+
+/* FIXME: this hack is definitely wrong -AK */
+struct __large_struct { unsigned long buf[100]; };
+#define __m(x) (*(struct __large_struct __user *)(x))
+
+/*
+ * Tell gcc we read from memory instead of writing: this is because
+ * we do not write to any memory gcc knows about, so there are no
+ * aliasing issues.
+ */
+#define __put_user_asm(x, addr, err, itype, rtype, ltype, errret) \
+ asm volatile("1: mov"itype" %"rtype"1,%2\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: mov %3,%0\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=r"(err) \
+ : ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
+/**
+ * __get_user: - Get a simple variable from user space, with less checking.
+ * @x: Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+
+#define __get_user(x, ptr) \
+ __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
+/**
+ * __put_user: - Write a simple value into user space, with less checking.
+ * @x: Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+
+#define __put_user(x, ptr) \
+ __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
+
+#define __get_user_unaligned __get_user
+#define __put_user_unaligned __put_user
+
+/*
+ * movsl can be slow when source and dest are not both 8-byte aligned
+ */
+#ifdef CONFIG_X86_INTEL_USERCOPY
+extern struct movsl_mask {
+ int mask;
+} ____cacheline_aligned_in_smp movsl_mask;
+#endif
+
+#define ARCH_HAS_NOCACHE_UACCESS 1
+
#ifdef CONFIG_X86_32
# include "uaccess_32.h"
#else
+# define ARCH_HAS_SEARCH_EXTABLE
# include "uaccess_64.h"
#endif
+
+#endif
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 8e7595c..6fdef39 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -11,426 +11,6 @@
#include <asm/asm.h>
#include <asm/page.h>

-#define VERIFY_READ 0
-#define VERIFY_WRITE 1
-
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not. If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
-
-
-#define KERNEL_DS MAKE_MM_SEG(0xFFFFFFFFUL)
-#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
-
-#define get_ds() (KERNEL_DS)
-#define get_fs() (current_thread_info()->addr_limit)
-#define set_fs(x) (current_thread_info()->addr_limit = (x))
-
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
-/*
- * movsl can be slow when source and dest are not both 8-byte aligned
- */
-#ifdef CONFIG_X86_INTEL_USERCOPY
-extern struct movsl_mask {
- int mask;
-} ____cacheline_aligned_in_smp movsl_mask;
-#endif
-
-#define __addr_ok(addr) \
- ((unsigned long __force)(addr) < \
- (current_thread_info()->addr_limit.seg))
-
-/*
- * Test whether a block of memory is a valid user space address.
- * Returns 0 if the range is valid, nonzero otherwise.
- *
- * This is equivalent to the following test:
- * (u33)addr + (u33)size >= (u33)current->addr_limit.seg
- *
- * This needs 33-bit arithmetic. We have a carry...
- */
-#define __range_ok(addr, size) \
-({ \
- unsigned long flag, roksum; \
- __chk_user_ptr(addr); \
- asm("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0" \
- :"=&r" (flag), "=r" (roksum) \
- :"1" (addr), "g" ((int)(size)), \
- "rm" (current_thread_info()->addr_limit.seg)); \
- flag; \
-})
-
-/**
- * access_ok: - Checks if a user space pointer is valid
- * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
- * %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
- * to write to a block, it is always safe to read from it.
- * @addr: User space pointer to start of block to check
- * @size: Size of block to check
- *
- * Context: User context only. This function may sleep.
- *
- * Checks if a pointer to a block of memory in user space is valid.
- *
- * Returns true (nonzero) if the memory block may be valid, false (zero)
- * if it is definitely invalid.
- *
- * Note that, depending on architecture, this function probably just
- * checks that the pointer is in the user space range - after calling
- * this function, memory access functions may still return -EFAULT.
- */
-#define access_ok(type, addr, size) (likely(__range_ok(addr, size) == 0))
-
-/*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue. No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
- *
- * All the routines below use bits of fixup code that are out of line
- * with the main instruction path. This means when everything is well,
- * we don't even have to jump over them. Further, they do not intrude
- * on our cache or tlb entries.
- */
-
-struct exception_table_entry {
- unsigned long insn, fixup;
-};
-
-extern int fixup_exception(struct pt_regs *regs);
-
-/*
- * These are the main single-value transfer routines. They automatically
- * use the right size if we just have the right pointer type.
- *
- * This gets kind of ugly. We want to return _two_ values in "get_user()"
- * and yet we don't want to do any pointers, because that is too much
- * of a performance impact. Thus we have a few rather ugly macros here,
- * and hide all the ugliness from the user.
- *
- * The "__xxx" versions of the user access functions are versions that
- * do not verify the address space, that must have been done previously
- * with a separate "access_ok()" call (this is used when we do multiple
- * accesses to the same area of user memory).
- */
-
-extern void __get_user_1(void);
-extern void __get_user_2(void);
-extern void __get_user_4(void);
-
-#define __get_user_x(size, ret, x, ptr) \
- asm volatile("call __get_user_" #size \
- :"=a" (ret),"=d" (x) \
- :"0" (ptr))
-
-
-/* Careful: we have to cast the result to the type of the pointer
- * for sign reasons */
-
-/**
- * get_user: - Get a simple variable from user space.
- * @x: Variable to store result.
- * @ptr: Source address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple variable from user space to kernel
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and the result of
- * dereferencing @ptr must be assignable to @x without a cast.
- *
- * Returns zero on success, or -EFAULT on error.
- * On error, the variable @x is set to zero.
- */
-#define get_user(x, ptr) \
-({ \
- int __ret_gu; \
- unsigned long __val_gu; \
- __chk_user_ptr(ptr); \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __get_user_x(1, __ret_gu, __val_gu, ptr); \
- break; \
- case 2: \
- __get_user_x(2, __ret_gu, __val_gu, ptr); \
- break; \
- case 4: \
- __get_user_x(4, __ret_gu, __val_gu, ptr); \
- break; \
- default: \
- __get_user_x(X, __ret_gu, __val_gu, ptr); \
- break; \
- } \
- (x) = (__typeof__(*(ptr)))__val_gu; \
- __ret_gu; \
-})
-
-extern void __put_user_bad(void);
-
-/*
- * Strange magic calling convention: pointer in %ecx,
- * value in %eax(:%edx), return value in %eax, no clobbers.
- */
-extern void __put_user_1(void);
-extern void __put_user_2(void);
-extern void __put_user_4(void);
-extern void __put_user_8(void);
-
-#define __put_user_1(x, ptr) \
- asm volatile("call __put_user_1" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr))
-
-#define __put_user_2(x, ptr) \
- asm volatile("call __put_user_2" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr))
-
-#define __put_user_4(x, ptr) \
- asm volatile("call __put_user_4" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr))
-
-#define __put_user_8(x, ptr) \
- asm volatile("call __put_user_8" : "=a" (__ret_pu) \
- : "A" ((typeof(*(ptr)))(x)), "c" (ptr))
-
-#define __put_user_X(x, ptr) \
- asm volatile("call __put_user_X" : "=a" (__ret_pu) \
- : "c" (ptr))
-
-/**
- * put_user: - Write a simple value into user space.
- * @x: Value to copy to user space.
- * @ptr: Destination address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple value from kernel space to user
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and @x must be assignable
- * to the result of dereferencing @ptr.
- *
- * Returns zero on success, or -EFAULT on error.
- */
-#ifdef CONFIG_X86_WP_WORKS_OK
-
-#define put_user(x, ptr) \
-({ \
- int __ret_pu; \
- __typeof__(*(ptr)) __pu_val; \
- __chk_user_ptr(ptr); \
- __pu_val = x; \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __put_user_1(__pu_val, ptr); \
- break; \
- case 2: \
- __put_user_2(__pu_val, ptr); \
- break; \
- case 4: \
- __put_user_4(__pu_val, ptr); \
- break; \
- case 8: \
- __put_user_8(__pu_val, ptr); \
- break; \
- default: \
- __put_user_X(__pu_val, ptr); \
- break; \
- } \
- __ret_pu; \
-})
-
-#else
-#define put_user(x, ptr) \
-({ \
- int __ret_pu; \
- __typeof__(*(ptr))__pus_tmp = x; \
- __ret_pu = 0; \
- if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, \
- sizeof(*(ptr))) != 0)) \
- __ret_pu = -EFAULT; \
- __ret_pu; \
-})
-
-
-#endif
-
-/**
- * __get_user: - Get a simple variable from user space, with less checking.
- * @x: Variable to store result.
- * @ptr: Source address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple variable from user space to kernel
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and the result of
- * dereferencing @ptr must be assignable to @x without a cast.
- *
- * Caller must check the pointer with access_ok() before calling this
- * function.
- *
- * Returns zero on success, or -EFAULT on error.
- * On error, the variable @x is set to zero.
- */
-#define __get_user(x, ptr) \
- __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
-
-
-/**
- * __put_user: - Write a simple value into user space, with less checking.
- * @x: Value to copy to user space.
- * @ptr: Destination address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple value from kernel space to user
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and @x must be assignable
- * to the result of dereferencing @ptr.
- *
- * Caller must check the pointer with access_ok() before calling this
- * function.
- *
- * Returns zero on success, or -EFAULT on error.
- */
-#define __put_user(x, ptr) \
- __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
-
-#define __put_user_nocheck(x, ptr, size) \
-({ \
- long __pu_err; \
- __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \
- __pu_err; \
-})
-
-
-#define __put_user_u64(x, addr, err) \
- asm volatile("1: movl %%eax,0(%2)\n" \
- "2: movl %%edx,4(%2)\n" \
- "3:\n" \
- ".section .fixup,\"ax\"\n" \
- "4: movl %3,%0\n" \
- " jmp 3b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 4b) \
- _ASM_EXTABLE(2b, 4b) \
- : "=r" (err) \
- : "A" (x), "r" (addr), "i" (-EFAULT), "0" (err))
-
-#ifdef CONFIG_X86_WP_WORKS_OK
-
-#define __put_user_size(x, ptr, size, retval, errret) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __put_user_asm(x, ptr, retval, "b", "b", "iq", errret); \
- break; \
- case 2: \
- __put_user_asm(x, ptr, retval, "w", "w", "ir", errret); \
- break; \
- case 4: \
- __put_user_asm(x, ptr, retval, "l", "", "ir", errret); \
- break; \
- case 8: \
- __put_user_u64((__typeof__(*ptr))(x), ptr, retval); \
- break; \
- default: \
- __put_user_bad(); \
- } \
-} while (0)
-
-#else
-
-#define __put_user_size(x, ptr, size, retval, errret) \
-do { \
- __typeof__(*(ptr))__pus_tmp = x; \
- retval = 0; \
- \
- if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, size) != 0)) \
- retval = errret; \
-} while (0)
-
-#endif
-struct __large_struct { unsigned long buf[100]; };
-#define __m(x) (*(struct __large_struct __user *)(x))
-
-/*
- * Tell gcc we read from memory instead of writing: this is because
- * we do not write to any memory gcc knows about, so there are no
- * aliasing issues.
- */
-#define __put_user_asm(x, addr, err, itype, rtype, ltype, errret) \
- asm volatile("1: mov"itype" %"rtype"1,%2\n" \
- "2:\n" \
- ".section .fixup,\"ax\"\n" \
- "3: movl %3,%0\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r"(err) \
- : ltype (x), "m" (__m(addr)), "i" (errret), "0" (err))
-
-
-#define __get_user_nocheck(x, ptr, size) \
-({ \
- long __gu_err; \
- unsigned long __gu_val; \
- __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \
- (x) = (__typeof__(*(ptr)))__gu_val; \
- __gu_err; \
-})
-
-extern long __get_user_bad(void);
-
-#define __get_user_size(x, ptr, size, retval, errret) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __get_user_asm(x, ptr, retval, "b", "b", "=q", errret); \
- break; \
- case 2: \
- __get_user_asm(x, ptr, retval, "w", "w", "=r", errret); \
- break; \
- case 4: \
- __get_user_asm(x, ptr, retval, "l", "", "=r", errret); \
- break; \
- default: \
- (x) = __get_user_bad(); \
- } \
-} while (0)
-
-#define __get_user_asm(x, addr, err, itype, rtype, ltype, errret) \
- asm volatile("1: mov"itype" %2,%"rtype"1\n" \
- "2:\n" \
- ".section .fixup,\"ax\"\n" \
- "3: movl %3,%0\n" \
- " xor"itype" %"rtype"1,%"rtype"1\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r" (err), ltype (x) \
- : "m" (__m(addr)), "i" (errret), "0" (err))
-
-
unsigned long __must_check __copy_to_user_ll
(void __user *to, const void *from, unsigned long n);
unsigned long __must_check __copy_from_user_ll
@@ -576,8 +156,6 @@ __copy_from_user(void *to, const void __user *from, unsigned long n)
return __copy_from_user_ll(to, from, n);
}

-#define ARCH_HAS_NOCACHE_UACCESS
-
static __always_inline unsigned long __copy_from_user_nocache(void *to,
const void __user *from, unsigned long n)
{
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index b8a2f43..4e3ec00 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -9,265 +9,6 @@
#include <linux/prefetch.h>
#include <asm/page.h>

-#define VERIFY_READ 0
-#define VERIFY_WRITE 1
-
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not. If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
-
-#define KERNEL_DS MAKE_MM_SEG(0xFFFFFFFFFFFFFFFFUL)
-#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
-
-#define get_ds() (KERNEL_DS)
-#define get_fs() (current_thread_info()->addr_limit)
-#define set_fs(x) (current_thread_info()->addr_limit = (x))
-
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
-#define __addr_ok(addr) (!((unsigned long)(addr) & \
- (current_thread_info()->addr_limit.seg)))
-
-/*
- * Uhhuh, this needs 65-bit arithmetic. We have a carry..
- */
-#define __range_not_ok(addr, size) \
-({ \
- unsigned long flag, roksum; \
- __chk_user_ptr(addr); \
- asm("# range_ok\n\r" \
- "addq %3,%1 ; sbbq %0,%0 ; cmpq %1,%4 ; sbbq $0,%0" \
- : "=&r" (flag), "=r" (roksum) \
- : "1" (addr), "g" ((long)(size)), \
- "g" (current_thread_info()->addr_limit.seg)); \
- flag; \
-})
-
-#define access_ok(type, addr, size) (__range_not_ok(addr, size) == 0)
-
-/*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue. No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
- *
- * All the routines below use bits of fixup code that are out of line
- * with the main instruction path. This means when everything is well,
- * we don't even have to jump over them. Further, they do not intrude
- * on our cache or tlb entries.
- */
-
-struct exception_table_entry {
- unsigned long insn, fixup;
-};
-
-extern int fixup_exception(struct pt_regs *regs);
-
-#define ARCH_HAS_SEARCH_EXTABLE
-
-/*
- * These are the main single-value transfer routines. They automatically
- * use the right size if we just have the right pointer type.
- *
- * This gets kind of ugly. We want to return _two_ values in "get_user()"
- * and yet we don't want to do any pointers, because that is too much
- * of a performance impact. Thus we have a few rather ugly macros here,
- * and hide all the ugliness from the user.
- *
- * The "__xxx" versions of the user access functions are versions that
- * do not verify the address space, that must have been done previously
- * with a separate "access_ok()" call (this is used when we do multiple
- * accesses to the same area of user memory).
- */
-
-#define __get_user_x(size, ret, x, ptr) \
- asm volatile("call __get_user_" #size \
- : "=a" (ret),"=d" (x) \
- : "c" (ptr) \
- : "r8")
-
-/* Careful: we have to cast the result to the type of the pointer
- * for sign reasons */
-
-#define get_user(x, ptr) \
-({ \
- unsigned long __val_gu; \
- int __ret_gu; \
- __chk_user_ptr(ptr); \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __get_user_x(1, __ret_gu, __val_gu, ptr); \
- break; \
- case 2: \
- __get_user_x(2, __ret_gu, __val_gu, ptr); \
- break; \
- case 4: \
- __get_user_x(4, __ret_gu, __val_gu, ptr); \
- break; \
- case 8: \
- __get_user_x(8, __ret_gu, __val_gu, ptr); \
- break; \
- default: \
- __get_user_bad(); \
- break; \
- } \
- (x) = (__force typeof(*(ptr)))__val_gu; \
- __ret_gu; \
-})
-
-extern void __put_user_1(void);
-extern void __put_user_2(void);
-extern void __put_user_4(void);
-extern void __put_user_8(void);
-extern void __put_user_bad(void);
-
-#define __put_user_x(size, ret, x, ptr) \
- asm volatile("call __put_user_" #size \
- :"=a" (ret) \
- :"c" (ptr),"d" (x) \
- :"r8")
-
-#define put_user(x, ptr) \
- __put_user_check((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
-
-#define __get_user(x, ptr) \
- __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
-#define __put_user(x, ptr) \
- __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
-
-#define __get_user_unaligned __get_user
-#define __put_user_unaligned __put_user
-
-#define __put_user_nocheck(x, ptr, size) \
-({ \
- int __pu_err; \
- __put_user_size((x), (ptr), (size), __pu_err); \
- __pu_err; \
-})
-
-
-#define __put_user_check(x, ptr, size) \
-({ \
- int __pu_err; \
- typeof(*(ptr)) __user *__pu_addr = (ptr); \
- switch (size) { \
- case 1: \
- __put_user_x(1, __pu_err, x, __pu_addr); \
- break; \
- case 2: \
- __put_user_x(2, __pu_err, x, __pu_addr); \
- break; \
- case 4: \
- __put_user_x(4, __pu_err, x, __pu_addr); \
- break; \
- case 8: \
- __put_user_x(8, __pu_err, x, __pu_addr); \
- break; \
- default: \
- __put_user_bad(); \
- } \
- __pu_err; \
-})
-
-#define __put_user_size(x, ptr, size, retval) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __put_user_asm(x, ptr, retval, "b", "b", "iq", -EFAULT);\
- break; \
- case 2: \
- __put_user_asm(x, ptr, retval, "w", "w", "ir", -EFAULT);\
- break; \
- case 4: \
- __put_user_asm(x, ptr, retval, "l", "k", "ir", -EFAULT);\
- break; \
- case 8: \
- __put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT); \
- break; \
- default: \
- __put_user_bad(); \
- } \
-} while (0)
-
-/* FIXME: this hack is definitely wrong -AK */
-struct __large_struct { unsigned long buf[100]; };
-#define __m(x) (*(struct __large_struct __user *)(x))
-
-/*
- * Tell gcc we read from memory instead of writing: this is because
- * we do not write to any memory gcc knows about, so there are no
- * aliasing issues.
- */
-#define __put_user_asm(x, addr, err, itype, rtype, ltype, errno) \
- asm volatile("1: mov"itype" %"rtype"1,%2\n" \
- "2:\n" \
- ".section .fixup, \"ax\"\n" \
- "3: mov %3,%0\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r"(err) \
- : ltype (x), "m" (__m(addr)), "i" (errno), "0" (err))
-
-
-#define __get_user_nocheck(x, ptr, size) \
-({ \
- int __gu_err; \
- unsigned long __gu_val; \
- __get_user_size(__gu_val, (ptr), (size), __gu_err); \
- (x) = (__force typeof(*(ptr)))__gu_val; \
- __gu_err; \
-})
-
-extern int __get_user_1(void);
-extern int __get_user_2(void);
-extern int __get_user_4(void);
-extern int __get_user_8(void);
-extern int __get_user_bad(void);
-
-#define __get_user_size(x, ptr, size, retval) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __get_user_asm(x, ptr, retval, "b", "b", "=q", -EFAULT);\
- break; \
- case 2: \
- __get_user_asm(x, ptr, retval, "w", "w", "=r", -EFAULT);\
- break; \
- case 4: \
- __get_user_asm(x, ptr, retval, "l", "k", "=r", -EFAULT);\
- break; \
- case 8: \
- __get_user_asm(x, ptr, retval, "q", "", "=r", -EFAULT); \
- break; \
- default: \
- (x) = __get_user_bad(); \
- } \
-} while (0)
-
-#define __get_user_asm(x, addr, err, itype, rtype, ltype, errno) \
- asm volatile("1: mov"itype" %2,%"rtype"1\n" \
- "2:\n" \
- ".section .fixup, \"ax\"\n" \
- "3: mov %3,%0\n" \
- " xor"itype" %"rtype"1,%"rtype"1\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r" (err), ltype (x) \
- : "m" (__m(addr)), "i"(errno), "0"(err))
-
/*
* Copy To/From Userspace
*/
@@ -437,7 +178,6 @@ __copy_to_user_inatomic(void __user *dst, const void *src, unsigned size)
return copy_user_generic((__force void *)dst, src, size);
}

-#define ARCH_HAS_NOCACHE_UACCESS 1
extern long __copy_user_nocache(void *dst, const void __user *src,
unsigned size, int zerorest);


2008-06-27 21:36:51

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 01/39] Don't use size specifiers

Remove the "l" from inline asm at arch/x86/lib/delay_32.c.
It is not needed.
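
For illustration, a minimal standalone sketch of the rule (not part of
the patch): when the operand is a register, the assembler deduces the
operand size by itself, so a bare "dec" assembles to the same
instruction as "decl" (or "decq" on 64-bit):

static inline void spin_down(unsigned long loops)
{
	/* %0 is a register, so plain "dec" gets the right width */
	asm volatile("1:	dec %0\n"
		     "	jnz 1b"
		     : "+r" (loops));
}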

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/delay_32.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
index ef69131..54013f8 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay_32.c
@@ -38,9 +38,9 @@ static void delay_loop(unsigned long loops)
"1: jmp 2f \n"

".align 16 \n"
- "2: decl %0 \n"
+ "2: dec %0 \n"
" jnz 2b \n"
- "3: decl %0 \n"
+ "3: dec %0 \n"

: /* we don't need output */
:"a" (loops)
--
1.5.5.1

2008-06-27 21:37:13

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 03/39] use rdtscll in read_current_timer for i386.

This way we achieve the same code for both arches.
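
For illustration, a userspace-style sketch of the difference (not the
kernel macros themselves): rdtscl() keeps only the low 32 bits of the
time-stamp counter, while rdtscll() composes EDX:EAX into a single
64-bit value, which is what a timer read wants:

static inline unsigned long long read_tsc_full(void)
{
	unsigned int lo, hi;

	/* rdtsc returns the counter split across eax (low) and edx (high) */
	asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return ((unsigned long long)hi << 32) | lo;
}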

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/delay_32.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
index 54013f8..bf6de05 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay_32.c
@@ -98,7 +98,7 @@ void use_tsc_delay(void)
int __devinit read_current_timer(unsigned long *timer_val)
{
if (delay_fn == delay_tsc) {
- rdtscl(*timer_val);
+ rdtscll(*timer_val);
return 0;
}
return -1;
--
1.5.5.1

2008-06-27 21:37:42

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 07/39] don't clobber r8 nor use rcx

There's really no reason to clobber r8 or pass the address in rcx.
We can safely use only two registers (which we already have to touch anyway)
to do the job.
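
The trick on the C side is the "0" matching constraint. A minimal
standalone sketch of how it works, independent of the patch:

static inline unsigned long bump(unsigned long v)
{
	unsigned long result;

	/*
	 * "0" forces v into whatever register was chosen for output
	 * %0, so input and output travel through the same register.
	 */
	asm("add $1,%0" : "=r" (result) : "0" (v));
	return result;
}

In the __get_user_x hunk below, tying the pointer to output 0 means it
arrives in the same register the error code comes back in (rax),
leaving rdx as the only other register the helpers need to touch.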

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/getuser_64.S | 42 +++++++++++++++++++++---------------------
include/asm-x86/uaccess_64.h | 3 +--
2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
index 5448876..2b003d3 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser_64.S
@@ -36,10 +36,10 @@
.text
ENTRY(__get_user_1)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- cmpq threadinfo_addr_limit(%r8),%rcx
+ GET_THREAD_INFO(%rdx)
+ cmpq threadinfo_addr_limit(%rdx),%rax
jae bad_get_user
-1: movzb (%rcx),%edx
+1: movzb (%rax),%edx
xorl %eax,%eax
ret
CFI_ENDPROC
@@ -47,48 +47,48 @@ ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $1,%rcx
+ GET_THREAD_INFO(%rdx)
+ addq $1,%rax
jc 20f
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq threadinfo_addr_limit(%rdx),%rax
jae 20f
- decq %rcx
-2: movzwl (%rcx),%edx
+ decq %rax
+2: movzwl (%rax),%edx
xorl %eax,%eax
ret
-20: decq %rcx
+20: decq %rax
jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $3,%rcx
+ GET_THREAD_INFO(%rdx)
+ addq $3,%rax
jc 30f
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq threadinfo_addr_limit(%rdx),%rax
jae 30f
- subq $3,%rcx
-3: movl (%rcx),%edx
+ subq $3,%rax
+3: movl (%rax),%edx
xorl %eax,%eax
ret
-30: subq $3,%rcx
+30: subq $3,%rax
jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_4)

ENTRY(__get_user_8)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- addq $7,%rcx
+ GET_THREAD_INFO(%rdx)
+ addq $7,%rax
jc 40f
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq threadinfo_addr_limit(%rdx),%rax
jae 40f
- subq $7,%rcx
-4: movq (%rcx),%rdx
+ subq $7,%rax
+4: movq (%rax),%rdx
xorl %eax,%eax
ret
-40: subq $7,%rcx
+40: subq $7,%rax
jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_8)
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 83382c1..9049f4e 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -90,8 +90,7 @@ extern int fixup_exception(struct pt_regs *regs);
#define __get_user_x(size, ret, x, ptr) \
asm volatile("call __get_user_" #size \
: "=a" (ret),"=d" (x) \
- : "c" (ptr) \
- : "r8")
+ : "0" (ptr)) \

/* Careful: we have to cast the result to the type of the pointer
* for sign reasons */
--
1.5.5.1

2008-06-27 21:38:01

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 05/39] integrate delay functions

delay_32.c and delay_64.c are now identical, so they are integrated into delay.c

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/Makefile | 2 +-
arch/x86/lib/delay.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/lib/delay_32.c | 138 -----------------------------------------------
arch/x86/lib/delay_64.c | 128 -------------------------------------------
4 files changed, 138 insertions(+), 267 deletions(-)
create mode 100644 arch/x86/lib/delay.c
delete mode 100644 arch/x86/lib/delay_32.c
delete mode 100644 arch/x86/lib/delay_64.c

diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 84aa288..700c2c3 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -4,7 +4,7 @@

obj-$(CONFIG_SMP) := msr-on-cpu.o

-lib-y := delay_$(BITS).o
+lib-y := delay.o
lib-y += thunk_$(BITS).o
lib-y += usercopy_$(BITS).o getuser_$(BITS).o putuser_$(BITS).o
lib-y += memcpy_$(BITS).o
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
new file mode 100644
index 0000000..f456860
--- /dev/null
+++ b/arch/x86/lib/delay.c
@@ -0,0 +1,137 @@
+/*
+ * Precise Delay Loops for i386
+ *
+ * Copyright (C) 1993 Linus Torvalds
+ * Copyright (C) 1997 Martin Mares <[email protected]>
+ * Copyright (C) 2008 Jiri Hladky <hladky _dot_ jiri _at_ gmail _dot_ com>
+ *
+ * The __delay function must _NOT_ be inlined as its execution time
+ * depends wildly on alignment on many x86 processors. The additional
+ * jump magic is needed to get the timing stable on all the CPU's
+ * we have to worry about.
+ */
+
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/timex.h>
+#include <linux/preempt.h>
+#include <linux/delay.h>
+#include <linux/init.h>
+
+#include <asm/processor.h>
+#include <asm/delay.h>
+#include <asm/timer.h>
+
+#ifdef CONFIG_SMP
+# include <asm/smp.h>
+#endif
+
+/* simple loop based delay: */
+static void delay_loop(unsigned long loops)
+{
+ asm volatile(
+ " test %0,%0 \n"
+ " jz 3f \n"
+ " jmp 1f \n"
+
+ ".align 16 \n"
+ "1: jmp 2f \n"
+
+ ".align 16 \n"
+ "2: dec %0 \n"
+ " jnz 2b \n"
+ "3: dec %0 \n"
+
+ : /* we don't need output */
+ :"a" (loops)
+ );
+}
+
+/* TSC based delay: */
+static void delay_tsc(unsigned long loops)
+{
+ unsigned long bclock, now;
+ int cpu;
+
+ preempt_disable();
+ cpu = smp_processor_id();
+ rdtscl(bclock);
+ for (;;) {
+ rdtscl(now);
+ if ((now - bclock) >= loops)
+ break;
+
+ /* Allow RT tasks to run */
+ preempt_enable();
+ rep_nop();
+ preempt_disable();
+
+ /*
+ * It is possible that we moved to another CPU, and
+ * since TSC's are per-cpu we need to calculate
+ * that. The delay must guarantee that we wait "at
+ * least" the amount of time. Being moved to another
+ * CPU could make the wait longer but we just need to
+ * make sure we waited long enough. Rebalance the
+ * counter for this CPU.
+ */
+ if (unlikely(cpu != smp_processor_id())) {
+ loops -= (now - bclock);
+ cpu = smp_processor_id();
+ rdtscl(bclock);
+ }
+ }
+ preempt_enable();
+}
+
+/*
+ * Since we calibrate only once at boot, this
+ * function should be set once at boot and not changed
+ */
+static void (*delay_fn)(unsigned long) = delay_loop;
+
+void use_tsc_delay(void)
+{
+ delay_fn = delay_tsc;
+}
+
+int __devinit read_current_timer(unsigned long *timer_val)
+{
+ if (delay_fn == delay_tsc) {
+ rdtscll(*timer_val);
+ return 0;
+ }
+ return -1;
+}
+
+void __delay(unsigned long loops)
+{
+ delay_fn(loops);
+}
+EXPORT_SYMBOL(__delay);
+
+inline void __const_udelay(unsigned long xloops)
+{
+ int d0;
+
+ xloops *= 4;
+ asm("mull %%edx"
+ :"=d" (xloops), "=&a" (d0)
+ :"1" (xloops), "0"
+ (cpu_data(raw_smp_processor_id()).loops_per_jiffy * (HZ/4)));
+
+ __delay(++xloops);
+}
+EXPORT_SYMBOL(__const_udelay);
+
+void __udelay(unsigned long usecs)
+{
+ __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */
+}
+EXPORT_SYMBOL(__udelay);
+
+void __ndelay(unsigned long nsecs)
+{
+ __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */
+}
+EXPORT_SYMBOL(__ndelay);
diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
deleted file mode 100644
index 0b659a3..0000000
--- a/arch/x86/lib/delay_32.c
+++ /dev/null
@@ -1,138 +0,0 @@
-/*
- * Precise Delay Loops for i386
- *
- * Copyright (C) 1993 Linus Torvalds
- * Copyright (C) 1997 Martin Mares <[email protected]>
- * Copyright (C) 2008 Jiri Hladky <hladky _dot_ jiri _at_ gmail _dot_ com>
- *
- * The __delay function must _NOT_ be inlined as its execution time
- * depends wildly on alignment on many x86 processors. The additional
- * jump magic is needed to get the timing stable on all the CPU's
- * we have to worry about.
- */
-
-#include <linux/module.h>
-#include <linux/sched.h>
-#include <linux/timex.h>
-#include <linux/preempt.h>
-#include <linux/delay.h>
-#include <linux/init.h>
-
-#include <asm/processor.h>
-#include <asm/delay.h>
-#include <asm/timer.h>
-
-#ifdef CONFIG_SMP
-# include <asm/smp.h>
-#endif
-
-/* simple loop based delay: */
-static void delay_loop(unsigned long loops)
-{
- __asm__ __volatile__(
- " test %0,%0 \n"
- " jz 3f \n"
- " jmp 1f \n"
-
- ".align 16 \n"
- "1: jmp 2f \n"
-
- ".align 16 \n"
- "2: dec %0 \n"
- " jnz 2b \n"
- "3: dec %0 \n"
-
- : /* we don't need output */
- :"a" (loops)
- );
-}
-
-/* TSC based delay: */
-static void delay_tsc(unsigned long loops)
-{
- unsigned long bclock, now;
- int cpu;
-
- preempt_disable();
- cpu = smp_processor_id();
- rdtscl(bclock);
- for (;;) {
- rdtscl(now);
- if ((now - bclock) >= loops)
- break;
-
- /* Allow RT tasks to run */
- preempt_enable();
- rep_nop();
- preempt_disable();
-
- /*
- * It is possible that we moved to another CPU, and
- * since TSC's are per-cpu we need to calculate
- * that. The delay must guarantee that we wait "at
- * least" the amount of time. Being moved to another
- * CPU could make the wait longer but we just need to
- * make sure we waited long enough. Rebalance the
- * counter for this CPU.
- */
- if (unlikely(cpu != smp_processor_id())) {
- loops -= (now - bclock);
- cpu = smp_processor_id();
- rdtscl(bclock);
- }
- }
- preempt_enable();
-}
-
-/*
- * Since we calibrate only once at boot, this
- * function should be set once at boot and not changed
- */
-static void (*delay_fn)(unsigned long) = delay_loop;
-
-void use_tsc_delay(void)
-{
- delay_fn = delay_tsc;
-}
-
-int __devinit read_current_timer(unsigned long *timer_val)
-{
- if (delay_fn == delay_tsc) {
- rdtscll(*timer_val);
- return 0;
- }
- return -1;
-}
-
-void __delay(unsigned long loops)
-{
- delay_fn(loops);
-}
-
-inline void __const_udelay(unsigned long xloops)
-{
- int d0;
-
- xloops *= 4;
- __asm__("mull %%edx"
- :"=d" (xloops), "=&a" (d0)
- :"1" (xloops), "0"
- (cpu_data(raw_smp_processor_id()).loops_per_jiffy * (HZ/4)));
-
- __delay(++xloops);
-}
-
-void __udelay(unsigned long usecs)
-{
- __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */
-}
-
-void __ndelay(unsigned long nsecs)
-{
- __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */
-}
-
-EXPORT_SYMBOL(__delay);
-EXPORT_SYMBOL(__const_udelay);
-EXPORT_SYMBOL(__udelay);
-EXPORT_SYMBOL(__ndelay);
diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
deleted file mode 100644
index ff3dfec..0000000
--- a/arch/x86/lib/delay_64.c
+++ /dev/null
@@ -1,128 +0,0 @@
-/*
- * Precise Delay Loops for x86-64
- *
- * Copyright (C) 1993 Linus Torvalds
- * Copyright (C) 1997 Martin Mares <[email protected]>
- *
- * The __delay function must _NOT_ be inlined as its execution time
- * depends wildly on alignment on many x86 processors.
- */
-
-#include <linux/module.h>
-#include <linux/sched.h>
-#include <linux/timex.h>
-#include <linux/preempt.h>
-#include <linux/delay.h>
-#include <linux/init.h>
-
-#include <asm/delay.h>
-#include <asm/msr.h>
-
-#ifdef CONFIG_SMP
-#include <asm/smp.h>
-#endif
-
-/* simple loop based delay: */
-static void delay_loop(unsigned long loops)
-{
- asm volatile(
- " test %0,%0 \n"
- " jz 3f \n"
- " jmp 1f \n"
-
- ".align 16 \n"
- "1: jmp 2f \n"
-
- ".align 16 \n"
- "2: dec %0 \n"
- " jnz 2b \n"
- "3: dec %0 \n"
-
- : /* we don't need output */
- :"a" (loops)
- );
-}
-
-static void delay_tsc(unsigned long loops)
-{
- unsigned bclock, now;
- int cpu;
-
- preempt_disable();
- cpu = smp_processor_id();
- rdtscl(bclock);
- for (;;) {
- rdtscl(now);
- if ((now - bclock) >= loops)
- break;
-
- /* Allow RT tasks to run */
- preempt_enable();
- rep_nop();
- preempt_disable();
-
- /*
- * It is possible that we moved to another CPU, and
- * since TSC's are per-cpu we need to calculate
- * that. The delay must guarantee that we wait "at
- * least" the amount of time. Being moved to another
- * CPU could make the wait longer but we just need to
- * make sure we waited long enough. Rebalance the
- * counter for this CPU.
- */
- if (unlikely(cpu != smp_processor_id())) {
- loops -= (now - bclock);
- cpu = smp_processor_id();
- rdtscl(bclock);
- }
- }
- preempt_enable();
-}
-
-static void (*delay_fn)(unsigned long) = delay_loop;
-
-void use_tsc_delay(void)
-{
- delay_fn = delay_tsc;
-}
-
-int __devinit read_current_timer(unsigned long *timer_value)
-{
- if (delay_fn == delay_tsc) {
- rdtscll(*timer_value);
- return 0;
- }
- return -1;
-}
-
-void __delay(unsigned long loops)
-{
- delay_fn(loops);
-}
-EXPORT_SYMBOL(__delay);
-
-inline void __const_udelay(unsigned long xloops)
-{
- int d0;
- xloops *= 4;
- __asm__("mull %%edx"
- :"=d" (xloops), "=&a" (d0)
- :"1" (xloops), "0"
- (cpu_data(raw_smp_processor_id()).loops_per_jiffy * (HZ/4)));
-
- __delay(++xloops);
-}
-
-EXPORT_SYMBOL(__const_udelay);
-
-void __udelay(unsigned long usecs)
-{
- __const_udelay(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up) */
-}
-EXPORT_SYMBOL(__udelay);
-
-void __ndelay(unsigned long nsecs)
-{
- __const_udelay(nsecs * 0x00005); /* 2**32 / 1000000000 (rounded up) */
-}
-EXPORT_SYMBOL(__ndelay);
--
1.5.5.1

2008-06-27 21:38:31

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 02/39] provide delay loop for x86_64

This is for consistency with i386. We call use_tsc_delay()
at TSC initialization for x86_64, so we'll always be using it.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/kernel/tsc_64.c | 1 +
arch/x86/lib/delay_64.c | 44 ++++++++++++++++++++++++++++++++++++++++----
2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index 9898fb0..36ac46f 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -258,6 +258,7 @@ void __init tsc_calibrate(void)
out:
for_each_possible_cpu(cpu)
set_cyc2ns_scale(tsc_khz, cpu);
+ use_tsc_delay();
}

/*
diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
index 4c441be..d0326d0 100644
--- a/arch/x86/lib/delay_64.c
+++ b/arch/x86/lib/delay_64.c
@@ -22,13 +22,28 @@
#include <asm/smp.h>
#endif

-int __devinit read_current_timer(unsigned long *timer_value)
+/* simple loop based delay: */
+static void delay_loop(unsigned long loops)
{
- rdtscll(*timer_value);
- return 0;
+ asm volatile(
+ " test %0,%0 \n"
+ " jz 3f \n"
+ " jmp 1f \n"
+
+ ".align 16 \n"
+ "1: jmp 2f \n"
+
+ ".align 16 \n"
+ "2: dec %0 \n"
+ " jnz 2b \n"
+ "3: dec %0 \n"
+
+ : /* we don't need output */
+ :"a" (loops)
+ );
}

-void __delay(unsigned long loops)
+static void delay_tsc(unsigned long loops)
{
unsigned bclock, now;
int cpu;
@@ -63,6 +78,27 @@ void __delay(unsigned long loops)
}
preempt_enable();
}
+
+static void (*delay_fn)(unsigned long) = delay_loop;
+
+void use_tsc_delay(void)
+{
+ delay_fn = delay_tsc;
+}
+
+int __devinit read_current_timer(unsigned long *timer_value)
+{
+ if (delay_fn == delay_tsc) {
+ rdtscll(*timer_value);
+ return 0;
+ }
+ return -1;
+}
+
+void __delay(unsigned long loops)
+{
+ delay_fn(loops);
+}
EXPORT_SYMBOL(__delay);

inline void __const_udelay(unsigned long xloops)
--
1.5.5.1

2008-06-27 21:38:50

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 09/39] adapt x86_64 getuser functions

Instead of doing a sub after the addition, use the offset directly
in the memory operand of the mov instructions.
This is the way i386 does it.
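
For illustration, a hypothetical C helper showing the same idiom (the
name and shape are mine, not from the tree): the address was biased by
size-1 for the wraparound check, and a negative displacement in the mov
undoes the bias for free:

static inline unsigned int read_u16_biased(unsigned long biased_addr)
{
	unsigned int val;

	/* biased_addr == real address + 1; -1(%reg) re-centers it */
	asm("movzwl -1(%1),%0" : "=r" (val) : "r" (biased_addr));
	return val;
}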

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/getuser_64.S | 33 ++++++++++++---------------------
1 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
index 2b003d3..df37d3a 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser_64.S
@@ -47,49 +47,40 @@ ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- GET_THREAD_INFO(%rdx)
addq $1,%rax
- jc 20f
+ jc bad_get_user
+ GET_THREAD_INFO(%rdx)
cmpq threadinfo_addr_limit(%rdx),%rax
- jae 20f
- decq %rax
-2: movzwl (%rax),%edx
+ jae bad_get_user
+2: movzwl -1(%rax),%edx
xorl %eax,%eax
ret
-20: decq %rax
- jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- GET_THREAD_INFO(%rdx)
addq $3,%rax
- jc 30f
+ jc bad_get_user
+ GET_THREAD_INFO(%rdx)
cmpq threadinfo_addr_limit(%rdx),%rax
- jae 30f
- subq $3,%rax
-3: movl (%rax),%edx
+ jae bad_get_user
+3: movl -3(%rax),%edx
xorl %eax,%eax
ret
-30: subq $3,%rax
- jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_4)

ENTRY(__get_user_8)
CFI_STARTPROC
- GET_THREAD_INFO(%rdx)
addq $7,%rax
- jc 40f
+ jc bad_get_user
+ GET_THREAD_INFO(%rdx)
cmpq threadinfo_addr_limit(%rdx),%rax
- jae 40f
- subq $7,%rax
-4: movq (%rax),%rdx
+ jae bad_get_user
+4: movq -7(%rax),%rdx
xorl %eax,%eax
ret
-40: subq $7,%rax
- jmp bad_get_user
CFI_ENDPROC
ENDPROC(__get_user_8)

--
1.5.5.1

2008-06-27 21:39:14

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 11/39] Don't use word-size specifiers on getuser_64

The instructions access registers, so the size is unambiguous.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/getuser_64.S | 28 ++++++++++++++--------------
1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
index 0ec7890..6134752 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser_64.S
@@ -37,57 +37,57 @@
ENTRY(__get_user_1)
CFI_STARTPROC
GET_THREAD_INFO(%rdx)
- cmpq TI_addr_limit(%rdx),%rax
+ cmp TI_addr_limit(%rdx),%rax
jae bad_get_user
1: movzb (%rax),%edx
- xorl %eax,%eax
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- addq $1,%rax
+ add $1,%rax
jc bad_get_user
GET_THREAD_INFO(%rdx)
- cmpq TI_addr_limit(%rdx),%rax
+ cmp TI_addr_limit(%rdx),%rax
jae bad_get_user
2: movzwl -1(%rax),%edx
- xorl %eax,%eax
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- addq $3,%rax
+ add $3,%rax
jc bad_get_user
GET_THREAD_INFO(%rdx)
- cmpq TI_addr_limit(%rdx),%rax
+ cmp TI_addr_limit(%rdx),%rax
jae bad_get_user
-3: movl -3(%rax),%edx
- xorl %eax,%eax
+3: mov -3(%rax),%edx
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_4)

ENTRY(__get_user_8)
CFI_STARTPROC
- addq $7,%rax
+ add $7,%rax
jc bad_get_user
GET_THREAD_INFO(%rdx)
- cmpq TI_addr_limit(%rdx),%rax
+ cmp TI_addr_limit(%rdx),%rax
jae bad_get_user
4: movq -7(%rax),%rdx
- xorl %eax,%eax
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_8)

bad_get_user:
CFI_STARTPROC
- xorl %edx,%edx
- movq $(-EFAULT),%rax
+ xor %edx,%edx
+ mov $(-EFAULT),%rax
ret
CFI_ENDPROC
END(bad_get_user)
--
1.5.5.1

2008-06-27 21:39:31

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 08/39] don't use word-size specifiers

Since the instructions refer to registers, the assembler is able to
figure the operand size out by itself.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/getuser_32.S | 24 ++++++++++++------------
1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/lib/getuser_32.S b/arch/x86/lib/getuser_32.S
index 6d84b53..8200fde 100644
--- a/arch/x86/lib/getuser_32.S
+++ b/arch/x86/lib/getuser_32.S
@@ -29,44 +29,44 @@
ENTRY(__get_user_1)
CFI_STARTPROC
GET_THREAD_INFO(%edx)
- cmpl TI_addr_limit(%edx),%eax
+ cmp TI_addr_limit(%edx),%eax
jae bad_get_user
-1: movzbl (%eax),%edx
- xorl %eax,%eax
+1: movzb (%eax),%edx
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- addl $1,%eax
+ add $1,%eax
jc bad_get_user
GET_THREAD_INFO(%edx)
- cmpl TI_addr_limit(%edx),%eax
+ cmp TI_addr_limit(%edx),%eax
jae bad_get_user
2: movzwl -1(%eax),%edx
- xorl %eax,%eax
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- addl $3,%eax
+ add $3,%eax
jc bad_get_user
GET_THREAD_INFO(%edx)
- cmpl TI_addr_limit(%edx),%eax
+ cmp TI_addr_limit(%edx),%eax
jae bad_get_user
-3: movl -3(%eax),%edx
- xorl %eax,%eax
+3: mov -3(%eax),%edx
+ xor %eax,%eax
ret
CFI_ENDPROC
ENDPROC(__get_user_4)

bad_get_user:
CFI_STARTPROC
- xorl %edx,%edx
- movl $-14,%eax
+ xor %edx,%edx
+ mov $-14,%eax
ret
CFI_ENDPROC
END(bad_get_user)
--
1.5.5.1

2008-06-27 21:40:23

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 06/39] use something common for both architectures

Using an explicit hexadecimal constant (0xFFFFFFFFUL) introduces an
unnecessary difference between i386 and x86_64 because of the different
sizes of their longs. Use -1UL instead.
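
A quick userspace check of the claim, compilable on either arch
(illustration only):

#include <stdio.h>

int main(void)
{
	/* prints ffffffff on i386 and ffffffffffffffff on x86_64 */
	printf("%lx\n", -1UL);
	return 0;
}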

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_32.h | 2 +-
include/asm-x86/uaccess_64.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 8e7595c..6a8adec 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -25,7 +25,7 @@
#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })


-#define KERNEL_DS MAKE_MM_SEG(0xFFFFFFFFUL)
+#define KERNEL_DS MAKE_MM_SEG(-1UL)
#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)

#define get_ds() (KERNEL_DS)
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index b8a2f43..83382c1 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -22,7 +22,7 @@

#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })

-#define KERNEL_DS MAKE_MM_SEG(0xFFFFFFFFFFFFFFFFUL)
+#define KERNEL_DS MAKE_MM_SEG(-1UL)
#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)

#define get_ds() (KERNEL_DS)
--
1.5.5.1

2008-06-27 21:40:51

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 04/39] explicitly use edx in const delay function.

For x86_64, we can't just use %0, as it would
generate a mul against rdx, which is not really what we
want (note the ">> 32" in the x86_64 version).

Using a u64 variable with a shift on i386 generates bad code,
so the solution is to use %%edx explicitly in the inline assembly
for both.
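
For reference, what the asm computes, written out in C (a sketch of
the intent only; the real code keeps everything in eax/edx):

static inline unsigned int scale_loops(unsigned int xloops4,
				       unsigned int lpj_hz4)
{
	/*
	 * mull %%edx computes eax * edx -> edx:eax; the "=d" output
	 * keeps the high half, which is the ">> 32" for free.
	 * Here xloops4 = xloops * 4 and lpj_hz4 = loops_per_jiffy * (HZ/4).
	 */
	unsigned long long prod = (unsigned long long)xloops4 * lpj_hz4;

	return (unsigned int)(prod >> 32);
}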

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/delay_32.c | 2 +-
arch/x86/lib/delay_64.c | 11 +++++++++--
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
index bf6de05..0b659a3 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay_32.c
@@ -114,7 +114,7 @@ inline void __const_udelay(unsigned long xloops)
int d0;

xloops *= 4;
- __asm__("mull %0"
+ __asm__("mull %%edx"
:"=d" (xloops), "=&a" (d0)
:"1" (xloops), "0"
(cpu_data(raw_smp_processor_id()).loops_per_jiffy * (HZ/4)));
diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
index d0326d0..ff3dfec 100644
--- a/arch/x86/lib/delay_64.c
+++ b/arch/x86/lib/delay_64.c
@@ -103,9 +103,16 @@ EXPORT_SYMBOL(__delay);

inline void __const_udelay(unsigned long xloops)
{
- __delay(((xloops * HZ *
- cpu_data(raw_smp_processor_id()).loops_per_jiffy) >> 32) + 1);
+ int d0;
+ xloops *= 4;
+ __asm__("mull %%edx"
+ :"=d" (xloops), "=&a" (d0)
+ :"1" (xloops), "0"
+ (cpu_data(raw_smp_processor_id()).loops_per_jiffy * (HZ/4)));
+
+ __delay(++xloops);
}
+
EXPORT_SYMBOL(__const_udelay);

void __udelay(unsigned long usecs)
--
1.5.5.1

2008-06-27 21:41:19

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 10/39] rename threadinfo to TI

This is for consistency with i386.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/ia32/ia32entry.S | 25 ++++++++++++++-----------
arch/x86/kernel/asm-offsets_64.c | 2 +-
arch/x86/kernel/entry_64.S | 27 ++++++++++++++-------------
arch/x86/lib/copy_user_64.S | 4 ++--
arch/x86/lib/getuser_64.S | 8 ++++----
arch/x86/lib/putuser_64.S | 8 ++++----
6 files changed, 39 insertions(+), 35 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 3aefbce..9bfea05 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -103,7 +103,7 @@ ENTRY(ia32_sysenter_target)
pushfq
CFI_ADJUST_CFA_OFFSET 8
/*CFI_REL_OFFSET rflags,0*/
- movl 8*3-THREAD_SIZE+threadinfo_sysenter_return(%rsp), %r10d
+ movl 8*3-THREAD_SIZE+TI_sysenter_return(%rsp), %r10d
CFI_REGISTER rip,r10
pushq $__USER32_CS
CFI_ADJUST_CFA_OFFSET 8
@@ -123,8 +123,9 @@ ENTRY(ia32_sysenter_target)
.quad 1b,ia32_badarg
.previous
GET_THREAD_INFO(%r10)
- orl $TS_COMPAT,threadinfo_status(%r10)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+ orl $TS_COMPAT,TI_status(%r10)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%r10)
CFI_REMEMBER_STATE
jnz sysenter_tracesys
sysenter_do_call:
@@ -136,9 +137,9 @@ sysenter_do_call:
GET_THREAD_INFO(%r10)
cli
TRACE_IRQS_OFF
- testl $_TIF_ALLWORK_MASK,threadinfo_flags(%r10)
+ testl $_TIF_ALLWORK_MASK,TI_flags(%r10)
jnz int_ret_from_sys_call
- andl $~TS_COMPAT,threadinfo_status(%r10)
+ andl $~TS_COMPAT,TI_status(%r10)
/* clear IF, that popfq doesn't enable interrupts early */
andl $~0x200,EFLAGS-R11(%rsp)
movl RIP-R11(%rsp),%edx /* User %eip */
@@ -230,8 +231,9 @@ ENTRY(ia32_cstar_target)
.quad 1b,ia32_badarg
.previous
GET_THREAD_INFO(%r10)
- orl $TS_COMPAT,threadinfo_status(%r10)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+ orl $TS_COMPAT,TI_status(%r10)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%r10)
CFI_REMEMBER_STATE
jnz cstar_tracesys
cstar_do_call:
@@ -243,9 +245,9 @@ cstar_do_call:
GET_THREAD_INFO(%r10)
cli
TRACE_IRQS_OFF
- testl $_TIF_ALLWORK_MASK,threadinfo_flags(%r10)
+ testl $_TIF_ALLWORK_MASK,TI_flags(%r10)
jnz int_ret_from_sys_call
- andl $~TS_COMPAT,threadinfo_status(%r10)
+ andl $~TS_COMPAT,TI_status(%r10)
RESTORE_ARGS 1,-ARG_SKIP,1,1,1
movl RIP-ARGOFFSET(%rsp),%ecx
CFI_REGISTER rip,rcx
@@ -324,8 +326,9 @@ ENTRY(ia32_syscall)
this could be a problem. */
SAVE_ARGS 0,0,1
GET_THREAD_INFO(%r10)
- orl $TS_COMPAT,threadinfo_status(%r10)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+ orl $TS_COMPAT,TI_status(%r10)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%r10)
jnz ia32_tracesys
ia32_do_syscall:
cmpl $(IA32_NR_syscalls-1),%eax
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index a5bbec3..2fcc6ac 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -34,7 +34,7 @@ int main(void)
ENTRY(pid);
BLANK();
#undef ENTRY
-#define ENTRY(entry) DEFINE(threadinfo_ ## entry, offsetof(struct thread_info, entry))
+#define ENTRY(entry) DEFINE(TI_ ## entry, offsetof(struct thread_info, entry))
ENTRY(flags);
ENTRY(addr_limit);
ENTRY(preempt_count);
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c035b20..b79cfc9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -277,13 +277,13 @@ ENTRY(ret_from_fork)
CFI_ADJUST_CFA_OFFSET -4
call schedule_tail
GET_THREAD_INFO(%rcx)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),threadinfo_flags(%rcx)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),TI_flags(%rcx)
jnz rff_trace
rff_action:
RESTORE_REST
testl $3,CS-ARGOFFSET(%rsp) # from kernel_thread?
je int_ret_from_sys_call
- testl $_TIF_IA32,threadinfo_flags(%rcx)
+ testl $_TIF_IA32,TI_flags(%rcx)
jnz int_ret_from_sys_call
RESTORE_TOP_OF_STACK %rdi,ARGOFFSET
jmp ret_from_sys_call
@@ -352,7 +352,8 @@ ENTRY(system_call_after_swapgs)
movq %rcx,RIP-ARGOFFSET(%rsp)
CFI_REL_OFFSET rip,RIP-ARGOFFSET
GET_THREAD_INFO(%rcx)
- testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%rcx)
+ testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP), \
+ TI_flags(%rcx)
jnz tracesys
cmpq $__NR_syscall_max,%rax
ja badsys
@@ -371,7 +372,7 @@ sysret_check:
GET_THREAD_INFO(%rcx)
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
andl %edi,%edx
jnz sysret_careful
CFI_REMEMBER_STATE
@@ -455,10 +456,10 @@ int_ret_from_sys_call:
int_with_check:
LOCKDEP_SYS_EXIT_IRQ
GET_THREAD_INFO(%rcx)
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
andl %edi,%edx
jnz int_careful
- andl $~TS_COMPAT,threadinfo_status(%rcx)
+ andl $~TS_COMPAT,TI_status(%rcx)
jmp retint_swapgs

/* Either reschedule or signal or syscall exit tracking needed. */
@@ -666,7 +667,7 @@ retint_with_reschedule:
movl $_TIF_WORK_MASK,%edi
retint_check:
LOCKDEP_SYS_EXIT_IRQ
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
andl %edi,%edx
CFI_REMEMBER_STATE
jnz retint_careful
@@ -764,7 +765,7 @@ retint_signal:
/* Returning to kernel space from exception. */
/* rcx: threadinfo. interrupts off. */
ENTRY(retexc_kernel)
- testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx)
+ testl $HARDNMI_MASK,TI_preempt_count(%rcx)
jz retint_kernel /* Not nested over NMI ? */
testw $X86_EFLAGS_TF,EFLAGS-ARGOFFSET(%rsp) /* trap flag? */
jnz retint_kernel /*
@@ -782,9 +783,9 @@ ENTRY(retexc_kernel)
/* Returning to kernel space. Check if we need preemption */
/* rcx: threadinfo. interrupts off. */
ENTRY(retint_kernel)
- cmpl $0,threadinfo_preempt_count(%rcx)
+ cmpl $0,TI_preempt_count(%rcx)
jnz retint_restore_args
- bt $TIF_NEED_RESCHED,threadinfo_flags(%rcx)
+ bt $TIF_NEED_RESCHED,TI_flags(%rcx)
jnc retint_restore_args
bt $9,EFLAGS-ARGOFFSET(%rsp) /* interrupts off? */
jnc retint_restore_args
@@ -945,7 +946,7 @@ paranoid_restore_no_nmi\trace:
jmp irq_return
paranoid_restore\trace:
GET_THREAD_INFO(%rcx)
- testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx)
+ testl $HARDNMI_MASK,TI_preempt_count(%rcx)
jz paranoid_restore_no_nmi\trace /* Nested over NMI ? */
testw $X86_EFLAGS_TF,EFLAGS-0(%rsp) /* trap flag? */
jnz paranoid_restore_no_nmi\trace
@@ -953,7 +954,7 @@ paranoid_restore\trace:
INTERRUPT_RETURN_NMI_SAFE
paranoid_userspace\trace:
GET_THREAD_INFO(%rcx)
- movl threadinfo_flags(%rcx),%ebx
+ movl TI_flags(%rcx),%ebx
andl $_TIF_WORK_MASK,%ebx
jz paranoid_swapgs\trace
movq %rsp,%rdi /* &pt_regs */
@@ -1051,7 +1052,7 @@ error_exit:
testl %eax,%eax
jne retexc_kernel
LOCKDEP_SYS_EXIT_IRQ
- movl threadinfo_flags(%rcx),%edx
+ movl TI_flags(%rcx),%edx
movl $_TIF_WORK_MASK,%edi
andl %edi,%edx
jnz retint_careful
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index ee1c3f6..7eaaf01 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -40,7 +40,7 @@ ENTRY(copy_to_user)
movq %rdi,%rcx
addq %rdx,%rcx
jc bad_to_user
- cmpq threadinfo_addr_limit(%rax),%rcx
+ cmpq TI_addr_limit(%rax),%rcx
jae bad_to_user
xorl %eax,%eax /* clear zero flag */
ALTERNATIVE_JUMP X86_FEATURE_REP_GOOD,copy_user_generic_unrolled,copy_user_generic_string
@@ -65,7 +65,7 @@ ENTRY(copy_from_user)
movq %rsi,%rcx
addq %rdx,%rcx
jc bad_from_user
- cmpq threadinfo_addr_limit(%rax),%rcx
+ cmpq TI_addr_limit(%rax),%rcx
jae bad_from_user
movl $1,%ecx /* set zero flag */
ALTERNATIVE_JUMP X86_FEATURE_REP_GOOD,copy_user_generic_unrolled,copy_user_generic_string
diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
index df37d3a..0ec7890 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser_64.S
@@ -37,7 +37,7 @@
ENTRY(__get_user_1)
CFI_STARTPROC
GET_THREAD_INFO(%rdx)
- cmpq threadinfo_addr_limit(%rdx),%rax
+ cmpq TI_addr_limit(%rdx),%rax
jae bad_get_user
1: movzb (%rax),%edx
xorl %eax,%eax
@@ -50,7 +50,7 @@ ENTRY(__get_user_2)
addq $1,%rax
jc bad_get_user
GET_THREAD_INFO(%rdx)
- cmpq threadinfo_addr_limit(%rdx),%rax
+ cmpq TI_addr_limit(%rdx),%rax
jae bad_get_user
2: movzwl -1(%rax),%edx
xorl %eax,%eax
@@ -63,7 +63,7 @@ ENTRY(__get_user_4)
addq $3,%rax
jc bad_get_user
GET_THREAD_INFO(%rdx)
- cmpq threadinfo_addr_limit(%rdx),%rax
+ cmpq TI_addr_limit(%rdx),%rax
jae bad_get_user
3: movl -3(%rax),%edx
xorl %eax,%eax
@@ -76,7 +76,7 @@ ENTRY(__get_user_8)
addq $7,%rax
jc bad_get_user
GET_THREAD_INFO(%rdx)
- cmpq threadinfo_addr_limit(%rdx),%rax
+ cmpq TI_addr_limit(%rdx),%rax
jae bad_get_user
4: movq -7(%rax),%rdx
xorl %eax,%eax
diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index 4989f5a..940796f 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -35,7 +35,7 @@
ENTRY(__put_user_1)
CFI_STARTPROC
GET_THREAD_INFO(%r8)
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%r8),%rcx
jae bad_put_user
1: movb %dl,(%rcx)
xorl %eax,%eax
@@ -48,7 +48,7 @@ ENTRY(__put_user_2)
GET_THREAD_INFO(%r8)
addq $1,%rcx
jc 20f
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%r8),%rcx
jae 20f
decq %rcx
2: movw %dx,(%rcx)
@@ -64,7 +64,7 @@ ENTRY(__put_user_4)
GET_THREAD_INFO(%r8)
addq $3,%rcx
jc 30f
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%r8),%rcx
jae 30f
subq $3,%rcx
3: movl %edx,(%rcx)
@@ -80,7 +80,7 @@ ENTRY(__put_user_8)
GET_THREAD_INFO(%r8)
addq $7,%rcx
jc 40f
- cmpq threadinfo_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%r8),%rcx
jae 40f
subq $7,%rcx
4: movq %rdx,(%rcx)
--
1.5.5.1

2008-06-27 21:41:37

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 12/39] introduce __ASM_REG macro

There are situations in which the architecture wants to use the
register that represents its word size, whatever that is. For those,
introduce __ASM_REG in asm.h, along with its first two users, _ASM_AX
and _ASM_DX. They already have users waiting for them, namely the
getuser functions.
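
A minimal expansion demo (a userspace sketch that mirrors asm.h's
__ASM_SEL selection instead of including the kernel header):

#ifdef __x86_64__
# define __ASM_SEL(a, b) b
#else
# define __ASM_SEL(a, b) a
#endif

#define __ASM_REG(reg)	__ASM_SEL(e##reg, r##reg)
#define _ASM_AX		__ASM_REG(ax)
#define _ASM_DX		__ASM_REG(dx)

/*
 * _ASM_AX now pastes to eax on i386 and rax on x86_64, so
 * "cmp TI_addr_limit(%_ASM_DX),%_ASM_AX" assembles correctly per-arch.
 */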

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/getuser_32.S | 25 +++++++++++++------------
arch/x86/lib/getuser_64.S | 36 ++++++++++++++++++------------------
include/asm-x86/asm.h | 3 +++
3 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/arch/x86/lib/getuser_32.S b/arch/x86/lib/getuser_32.S
index 8200fde..2cc3cee 100644
--- a/arch/x86/lib/getuser_32.S
+++ b/arch/x86/lib/getuser_32.S
@@ -11,6 +11,7 @@
#include <linux/linkage.h>
#include <asm/dwarf2.h>
#include <asm/thread_info.h>
+#include <asm/asm.h>


/*
@@ -28,10 +29,10 @@
.text
ENTRY(__get_user_1)
CFI_STARTPROC
- GET_THREAD_INFO(%edx)
- cmp TI_addr_limit(%edx),%eax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-1: movzb (%eax),%edx
+1: movzb (%_ASM_AX),%edx
xor %eax,%eax
ret
CFI_ENDPROC
@@ -39,12 +40,12 @@ ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- add $1,%eax
+ add $1,%_ASM_AX
jc bad_get_user
- GET_THREAD_INFO(%edx)
- cmp TI_addr_limit(%edx),%eax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-2: movzwl -1(%eax),%edx
+2: movzwl -1(%_ASM_AX),%edx
xor %eax,%eax
ret
CFI_ENDPROC
@@ -52,12 +53,12 @@ ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- add $3,%eax
+ add $3,%_ASM_AX
jc bad_get_user
- GET_THREAD_INFO(%edx)
- cmp TI_addr_limit(%edx),%eax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-3: mov -3(%eax),%edx
+3: mov -3(%_ASM_AX),%edx
xor %eax,%eax
ret
CFI_ENDPROC
@@ -66,7 +67,7 @@ ENDPROC(__get_user_4)
bad_get_user:
CFI_STARTPROC
xor %edx,%edx
- mov $-14,%eax
+ mov $-14,%_ASM_AX
ret
CFI_ENDPROC
END(bad_get_user)
diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
index 6134752..63b0e5c 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser_64.S
@@ -13,14 +13,13 @@
/*
* __get_user_X
*
- * Inputs: %rcx contains the address.
+ * Inputs: %rax contains the address.
* The register is modified, but all changes are undone
* before returning because the C code doesn't know about it.
*
* Outputs: %rax is error code (0 or -EFAULT)
* %rdx contains zero-extended value
*
- * %r8 is destroyed.
*
* These functions should not modify any other registers,
* as they get called from within inline assembly.
@@ -32,14 +31,15 @@
#include <asm/errno.h>
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
+#include <asm/asm.h>

.text
ENTRY(__get_user_1)
CFI_STARTPROC
- GET_THREAD_INFO(%rdx)
- cmp TI_addr_limit(%rdx),%rax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-1: movzb (%rax),%edx
+1: movzb (%_ASM_AX),%edx
xor %eax,%eax
ret
CFI_ENDPROC
@@ -47,12 +47,12 @@ ENDPROC(__get_user_1)

ENTRY(__get_user_2)
CFI_STARTPROC
- add $1,%rax
+ add $1,%_ASM_AX
jc bad_get_user
- GET_THREAD_INFO(%rdx)
- cmp TI_addr_limit(%rdx),%rax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-2: movzwl -1(%rax),%edx
+2: movzwl -1(%_ASM_AX),%edx
xor %eax,%eax
ret
CFI_ENDPROC
@@ -60,12 +60,12 @@ ENDPROC(__get_user_2)

ENTRY(__get_user_4)
CFI_STARTPROC
- add $3,%rax
+ add $3,%_ASM_AX
jc bad_get_user
- GET_THREAD_INFO(%rdx)
- cmp TI_addr_limit(%rdx),%rax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-3: mov -3(%rax),%edx
+3: mov -3(%_ASM_AX),%edx
xor %eax,%eax
ret
CFI_ENDPROC
@@ -73,12 +73,12 @@ ENDPROC(__get_user_4)

ENTRY(__get_user_8)
CFI_STARTPROC
- add $7,%rax
+ add $7,%_ASM_AX
jc bad_get_user
- GET_THREAD_INFO(%rdx)
- cmp TI_addr_limit(%rdx),%rax
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
-4: movq -7(%rax),%rdx
+4: movq -7(%_ASM_AX),%_ASM_DX
xor %eax,%eax
ret
CFI_ENDPROC
@@ -87,7 +87,7 @@ ENDPROC(__get_user_8)
bad_get_user:
CFI_STARTPROC
xor %edx,%edx
- mov $(-EFAULT),%rax
+ mov $(-EFAULT),%_ASM_AX
ret
CFI_ENDPROC
END(bad_get_user)
diff --git a/include/asm-x86/asm.h b/include/asm-x86/asm.h
index 7093982..435402e 100644
--- a/include/asm-x86/asm.h
+++ b/include/asm-x86/asm.h
@@ -14,6 +14,7 @@
#endif

#define __ASM_SIZE(inst) __ASM_SEL(inst##l, inst##q)
+#define __ASM_REG(reg) __ASM_SEL(e##reg, r##reg)

#define _ASM_PTR __ASM_SEL(.long, .quad)
#define _ASM_ALIGN __ASM_SEL(.balign 4, .balign 8)
@@ -24,6 +25,8 @@
#define _ASM_ADD __ASM_SIZE(add)
#define _ASM_SUB __ASM_SIZE(sub)
#define _ASM_XADD __ASM_SIZE(xadd)
+#define _ASM_AX __ASM_REG(ax)
+#define _ASM_DX __ASM_REG(dx)

/* Exception table entry */
# define _ASM_EXTABLE(from,to) \
--
1.5.5.1

2008-06-27 21:41:59

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 14/39] merge getuser asm functions

getuser_32.S and getuser_64.S are merged into getuser.S

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/Makefile | 2 +-
arch/x86/lib/getuser.S | 104 +++++++++++++++++++++++++++++++++++++++++++++
arch/x86/lib/getuser_32.S | 79 ----------------------------------
arch/x86/lib/getuser_64.S | 100 -------------------------------------------
include/asm-x86/asm.h | 4 +-
5 files changed, 108 insertions(+), 181 deletions(-)
create mode 100644 arch/x86/lib/getuser.S
delete mode 100644 arch/x86/lib/getuser_32.S
delete mode 100644 arch/x86/lib/getuser_64.S

diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 700c2c3..bf990ca 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -6,7 +6,7 @@ obj-$(CONFIG_SMP) := msr-on-cpu.o

lib-y := delay.o
lib-y += thunk_$(BITS).o
-lib-y += usercopy_$(BITS).o getuser_$(BITS).o putuser_$(BITS).o
+lib-y += usercopy_$(BITS).o getuser.o putuser_$(BITS).o
lib-y += memcpy_$(BITS).o

ifeq ($(CONFIG_X86_32),y)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
new file mode 100644
index 0000000..ad37400
--- /dev/null
+++ b/arch/x86/lib/getuser.S
@@ -0,0 +1,104 @@
+/*
+ * __get_user functions.
+ *
+ * (C) Copyright 1998 Linus Torvalds
+ * (C) Copyright 2005 Andi Kleen
+ * (C) Copyright 2008 Glauber Costa
+ *
+ * These functions have a non-standard call interface
+ * to make them more efficient, especially as they
+ * return an error value in addition to the "real"
+ * return value.
+ */
+
+/*
+ * __get_user_X
+ *
+ * Inputs: %[r|e]ax contains the address.
+ * The register is modified, but all changes are undone
+ * before returning because the C code doesn't know about it.
+ *
+ * Outputs: %[r|e]ax is error code (0 or -EFAULT)
+ * %[r|e]dx contains zero-extended value
+ *
+ *
+ * These functions should not modify any other registers,
+ * as they get called from within inline assembly.
+ */
+
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+#include <asm/page.h>
+#include <asm/errno.h>
+#include <asm/asm-offsets.h>
+#include <asm/thread_info.h>
+#include <asm/asm.h>
+
+ .text
+ENTRY(__get_user_1)
+ CFI_STARTPROC
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+1: movzb (%_ASM_AX),%edx
+ xor %eax,%eax
+ ret
+ CFI_ENDPROC
+ENDPROC(__get_user_1)
+
+ENTRY(__get_user_2)
+ CFI_STARTPROC
+ add $1,%_ASM_AX
+ jc bad_get_user
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+2: movzwl -1(%_ASM_AX),%edx
+ xor %eax,%eax
+ ret
+ CFI_ENDPROC
+ENDPROC(__get_user_2)
+
+ENTRY(__get_user_4)
+ CFI_STARTPROC
+ add $3,%_ASM_AX
+ jc bad_get_user
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+3: mov -3(%_ASM_AX),%edx
+ xor %eax,%eax
+ ret
+ CFI_ENDPROC
+ENDPROC(__get_user_4)
+
+#ifdef CONFIG_X86_64
+ENTRY(__get_user_8)
+ CFI_STARTPROC
+ add $7,%_ASM_AX
+ jc bad_get_user
+ GET_THREAD_INFO(%_ASM_DX)
+ cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
+ jae bad_get_user
+4: movq -7(%_ASM_AX),%_ASM_DX
+ xor %eax,%eax
+ ret
+ CFI_ENDPROC
+ENDPROC(__get_user_8)
+#endif
+
+bad_get_user:
+ CFI_STARTPROC
+ xor %edx,%edx
+ mov $(-EFAULT),%_ASM_AX
+ ret
+ CFI_ENDPROC
+END(bad_get_user)
+
+.section __ex_table,"a"
+ _ASM_PTR 1b,bad_get_user
+ _ASM_PTR 2b,bad_get_user
+ _ASM_PTR 3b,bad_get_user
+#ifdef CONFIG_X86_64
+ _ASM_PTR 4b,bad_get_user
+#endif
diff --git a/arch/x86/lib/getuser_32.S b/arch/x86/lib/getuser_32.S
deleted file mode 100644
index 2bb0a18..0000000
--- a/arch/x86/lib/getuser_32.S
+++ /dev/null
@@ -1,79 +0,0 @@
-/*
- * __get_user functions.
- *
- * (C) Copyright 1998 Linus Torvalds
- *
- * These functions have a non-standard call interface
- * to make them more efficient, especially as they
- * return an error value in addition to the "real"
- * return value.
- */
-#include <linux/linkage.h>
-#include <asm/dwarf2.h>
-#include <asm/thread_info.h>
-#include <asm/asm.h>
-
-
-/*
- * __get_user_X
- *
- * Inputs: %eax contains the address
- *
- * Outputs: %eax is error code (0 or -EFAULT)
- * %edx contains zero-extended value
- *
- * These functions should not modify any other registers,
- * as they get called from within inline assembly.
- */
-
-.text
-ENTRY(__get_user_1)
- CFI_STARTPROC
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-1: movzb (%_ASM_AX),%edx
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_1)
-
-ENTRY(__get_user_2)
- CFI_STARTPROC
- add $1,%_ASM_AX
- jc bad_get_user
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-2: movzwl -1(%_ASM_AX),%edx
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_2)
-
-ENTRY(__get_user_4)
- CFI_STARTPROC
- add $3,%_ASM_AX
- jc bad_get_user
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-3: mov -3(%_ASM_AX),%edx
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_4)
-
-bad_get_user:
- CFI_STARTPROC
- xor %edx,%edx
- mov $-14,%_ASM_AX
- ret
- CFI_ENDPROC
-END(bad_get_user)
-
-.section __ex_table,"a"
- _ASM_PTR 1b,bad_get_user
- _ASM_PTR 2b,bad_get_user
- _ASM_PTR 3b,bad_get_user
-.previous
diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
deleted file mode 100644
index e333884..0000000
--- a/arch/x86/lib/getuser_64.S
+++ /dev/null
@@ -1,100 +0,0 @@
-/*
- * __get_user functions.
- *
- * (C) Copyright 1998 Linus Torvalds
- * (C) Copyright 2005 Andi Kleen
- *
- * These functions have a non-standard call interface
- * to make them more efficient, especially as they
- * return an error value in addition to the "real"
- * return value.
- */
-
-/*
- * __get_user_X
- *
- * Inputs: %rax contains the address.
- * The register is modified, but all changes are undone
- * before returning because the C code doesn't know about it.
- *
- * Outputs: %rax is error code (0 or -EFAULT)
- * %rdx contains zero-extended value
- *
- *
- * These functions should not modify any other registers,
- * as they get called from within inline assembly.
- */
-
-#include <linux/linkage.h>
-#include <asm/dwarf2.h>
-#include <asm/page.h>
-#include <asm/errno.h>
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-#include <asm/asm.h>
-
- .text
-ENTRY(__get_user_1)
- CFI_STARTPROC
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-1: movzb (%_ASM_AX),%edx
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_1)
-
-ENTRY(__get_user_2)
- CFI_STARTPROC
- add $1,%_ASM_AX
- jc bad_get_user
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-2: movzwl -1(%_ASM_AX),%edx
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_2)
-
-ENTRY(__get_user_4)
- CFI_STARTPROC
- add $3,%_ASM_AX
- jc bad_get_user
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-3: mov -3(%_ASM_AX),%edx
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_4)
-
-ENTRY(__get_user_8)
- CFI_STARTPROC
- add $7,%_ASM_AX
- jc bad_get_user
- GET_THREAD_INFO(%_ASM_DX)
- cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
- jae bad_get_user
-4: movq -7(%_ASM_AX),%_ASM_DX
- xor %eax,%eax
- ret
- CFI_ENDPROC
-ENDPROC(__get_user_8)
-
-bad_get_user:
- CFI_STARTPROC
- xor %edx,%edx
- mov $(-EFAULT),%_ASM_AX
- ret
- CFI_ENDPROC
-END(bad_get_user)
-
-.section __ex_table,"a"
- _ASM_PTR 1b,bad_get_user
- _ASM_PTR 2b,bad_get_user
- _ASM_PTR 3b,bad_get_user
- _ASM_PTR 4b,bad_get_user
-.previous
diff --git a/include/asm-x86/asm.h b/include/asm-x86/asm.h
index 435402e..57750a9 100644
--- a/include/asm-x86/asm.h
+++ b/include/asm-x86/asm.h
@@ -3,8 +3,10 @@

#ifdef __ASSEMBLY__
# define __ASM_FORM(x) x
+# define __ASM_EX_SEC .section __ex_table
#else
# define __ASM_FORM(x) " " #x " "
+# define __ASM_EX_SEC " .section __ex_table,\"a\"\n"
#endif

#ifdef CONFIG_X86_32
@@ -30,7 +32,7 @@

/* Exception table entry */
# define _ASM_EXTABLE(from,to) \
- " .section __ex_table,\"a\"\n" \
+ __ASM_EX_SEC \
_ASM_ALIGN "\n" \
_ASM_PTR #from "," #to "\n" \
" .previous\n"
--
1.5.5.1

2008-06-27 21:42:26

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 15/39] don't save ebx in putuser_32.S

Clobber it in the inline asm macros instead, and let the compiler save and restore it for us.
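
A sketch of the trade (the callee name here is hypothetical): listing
"ebx" as a clobber tells gcc the call destroys it, so gcc saves and
restores it only when it actually holds a live value, and the
unconditional pushl/popl pair in the .S file can go away:

#define demo_call_clobbering_ebx(ret)			\
	asm volatile("call helper_that_uses_ebx"	\
		     : "=a" (ret)			\
		     :					\
		     : "ebx")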

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_32.S | 13 ++-----------
include/asm-x86/uaccess_32.h | 10 +++++-----
2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/arch/x86/lib/putuser_32.S b/arch/x86/lib/putuser_32.S
index f58fba1..5b2a926 100644
--- a/arch/x86/lib/putuser_32.S
+++ b/arch/x86/lib/putuser_32.S
@@ -26,14 +26,8 @@
*/

#define ENTER CFI_STARTPROC ; \
- pushl %ebx ; \
- CFI_ADJUST_CFA_OFFSET 4 ; \
- CFI_REL_OFFSET ebx, 0 ; \
GET_THREAD_INFO(%ebx)
-#define EXIT popl %ebx ; \
- CFI_ADJUST_CFA_OFFSET -4 ; \
- CFI_RESTORE ebx ; \
- ret ; \
+#define EXIT ret ; \
CFI_ENDPROC

.text
@@ -81,10 +75,7 @@ ENTRY(__put_user_8)
ENDPROC(__put_user_8)

bad_put_user:
- CFI_STARTPROC simple
- CFI_DEF_CFA esp, 2*4
- CFI_OFFSET eip, -1*4
- CFI_OFFSET ebx, -2*4
+ CFI_STARTPROC
movl $-14,%eax
EXIT
END(bad_put_user)
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 6a8adec..94d201b 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -188,23 +188,23 @@ extern void __put_user_8(void);

#define __put_user_1(x, ptr) \
asm volatile("call __put_user_1" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr))
+ : "0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

#define __put_user_2(x, ptr) \
asm volatile("call __put_user_2" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr))
+ : "0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

#define __put_user_4(x, ptr) \
asm volatile("call __put_user_4" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr))
+ : "0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

#define __put_user_8(x, ptr) \
asm volatile("call __put_user_8" : "=a" (__ret_pu) \
- : "A" ((typeof(*(ptr)))(x)), "c" (ptr))
+ : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

#define __put_user_X(x, ptr) \
asm volatile("call __put_user_X" : "=a" (__ret_pu) \
- : "c" (ptr))
+ : "c" (ptr): "ebx")

/**
* put_user: - Write a simple value into user space.
--
1.5.5.1

2008-06-27 21:42:48

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 13/39] use _ASM_PTR instead of explicit word-size pointers

switch .long and .quad with _ASM_PTR in getuser*.S.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/getuser_32.S | 6 +++---
arch/x86/lib/getuser_64.S | 8 ++++----
2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/lib/getuser_32.S b/arch/x86/lib/getuser_32.S
index 2cc3cee..2bb0a18 100644
--- a/arch/x86/lib/getuser_32.S
+++ b/arch/x86/lib/getuser_32.S
@@ -73,7 +73,7 @@ bad_get_user:
END(bad_get_user)

.section __ex_table,"a"
- .long 1b,bad_get_user
- .long 2b,bad_get_user
- .long 3b,bad_get_user
+ _ASM_PTR 1b,bad_get_user
+ _ASM_PTR 2b,bad_get_user
+ _ASM_PTR 3b,bad_get_user
.previous
diff --git a/arch/x86/lib/getuser_64.S b/arch/x86/lib/getuser_64.S
index 63b0e5c..e333884 100644
--- a/arch/x86/lib/getuser_64.S
+++ b/arch/x86/lib/getuser_64.S
@@ -93,8 +93,8 @@ bad_get_user:
END(bad_get_user)

.section __ex_table,"a"
- .quad 1b,bad_get_user
- .quad 2b,bad_get_user
- .quad 3b,bad_get_user
- .quad 4b,bad_get_user
+ _ASM_PTR 1b,bad_get_user
+ _ASM_PTR 2b,bad_get_user
+ _ASM_PTR 3b,bad_get_user
+ _ASM_PTR 4b,bad_get_user
.previous
--
1.5.5.1

2008-06-27 21:43:21

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 16/39] use put_user_x instead of all variants.

Follow the pattern and define a single __put_user_x, instead
of defining a macro for each available size. The exception is
__put_user_8, since the "A" constraint does not give us enough
power to specify which register (a or d) to use in the common
32-bit case.
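
A small sketch of the i386-only wrinkle (illustration, not from the
patch): the "A" constraint binds a 64-bit value to the edx:eax pair as
a whole, with no way to name which half goes where, so the 8-byte case
keeps a dedicated macro:

static inline void demo_pair_constraint(unsigned long long v)
{
	/* on i386, "A" loads the low half into %eax, the high into %edx */
	asm volatile("" : : "A" (v));
}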

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_32.h | 25 +++++++------------------
1 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 94d201b..ff3443b 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -186,25 +186,14 @@ extern void __put_user_2(void);
extern void __put_user_4(void);
extern void __put_user_8(void);

-#define __put_user_1(x, ptr) \
- asm volatile("call __put_user_1" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
-
-#define __put_user_2(x, ptr) \
- asm volatile("call __put_user_2" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
-
-#define __put_user_4(x, ptr) \
- asm volatile("call __put_user_4" : "=a" (__ret_pu) \
- : "0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
+#define __put_user_x(size, x, ptr) \
+ asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
+ :"0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

#define __put_user_8(x, ptr) \
asm volatile("call __put_user_8" : "=a" (__ret_pu) \
: "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

-#define __put_user_X(x, ptr) \
- asm volatile("call __put_user_X" : "=a" (__ret_pu) \
- : "c" (ptr): "ebx")

/**
* put_user: - Write a simple value into user space.
@@ -232,19 +221,19 @@ extern void __put_user_8(void);
__pu_val = x; \
switch (sizeof(*(ptr))) { \
case 1: \
- __put_user_1(__pu_val, ptr); \
+ __put_user_x(1, __pu_val, ptr); \
break; \
case 2: \
- __put_user_2(__pu_val, ptr); \
+ __put_user_x(2, __pu_val, ptr); \
break; \
case 4: \
- __put_user_4(__pu_val, ptr); \
+ __put_user_x(4, __pu_val, ptr); \
break; \
case 8: \
__put_user_8(__pu_val, ptr); \
break; \
default: \
- __put_user_X(__pu_val, ptr); \
+ __put_user_x(X, __pu_val, ptr); \
break; \
} \
__ret_pu; \
--
1.5.5.1

2008-06-27 21:43:43

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 17/39] clobber rbx in putuser_64.S

Instead of clobbering r8, clobber rbx, which is the i386 way.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_64.S | 18 +++++++++---------
include/asm-x86/uaccess_64.h | 2 +-
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index 940796f..0702885 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -18,7 +18,7 @@
*
* Outputs: %rax is error code (0 or -EFAULT)
*
- * %r8 is destroyed.
+ * %rbx is destroyed.
*
* These functions should not modify any other registers,
* as they get called from within inline assembly.
@@ -34,8 +34,8 @@
.text
ENTRY(__put_user_1)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
- cmpq TI_addr_limit(%r8),%rcx
+ GET_THREAD_INFO(%rbx)
+ cmpq TI_addr_limit(%rbx),%rcx
jae bad_put_user
1: movb %dl,(%rcx)
xorl %eax,%eax
@@ -45,10 +45,10 @@ ENDPROC(__put_user_1)

ENTRY(__put_user_2)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
+ GET_THREAD_INFO(%rbx)
addq $1,%rcx
jc 20f
- cmpq TI_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%rbx),%rcx
jae 20f
decq %rcx
2: movw %dx,(%rcx)
@@ -61,10 +61,10 @@ ENDPROC(__put_user_2)

ENTRY(__put_user_4)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
+ GET_THREAD_INFO(%rbx)
addq $3,%rcx
jc 30f
- cmpq TI_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%rbx),%rcx
jae 30f
subq $3,%rcx
3: movl %edx,(%rcx)
@@ -77,10 +77,10 @@ ENDPROC(__put_user_4)

ENTRY(__put_user_8)
CFI_STARTPROC
- GET_THREAD_INFO(%r8)
+ GET_THREAD_INFO(%rbx)
addq $7,%rcx
jc 40f
- cmpq TI_addr_limit(%r8),%rcx
+ cmpq TI_addr_limit(%rbx),%rcx
jae 40f
subq $7,%rcx
4: movq %rdx,(%rcx)
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 9049f4e..8933ddb 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -131,7 +131,7 @@ extern void __put_user_bad(void);
asm volatile("call __put_user_" #size \
:"=a" (ret) \
:"c" (ptr),"d" (x) \
- :"r8")
+ :"ebx")

#define put_user(x, ptr) \
__put_user_check((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
--
1.5.5.1

2008-06-27 21:44:06

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 18/39] pass argument to putuser_64 functions in ax register.

This is consistent with i386 usage.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_64.S | 8 ++++----
include/asm-x86/uaccess_64.h | 2 +-
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index 0702885..ce5fcd5 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -37,7 +37,7 @@ ENTRY(__put_user_1)
GET_THREAD_INFO(%rbx)
cmpq TI_addr_limit(%rbx),%rcx
jae bad_put_user
-1: movb %dl,(%rcx)
+1: movb %al,(%rcx)
xorl %eax,%eax
ret
CFI_ENDPROC
@@ -51,7 +51,7 @@ ENTRY(__put_user_2)
cmpq TI_addr_limit(%rbx),%rcx
jae 20f
decq %rcx
-2: movw %dx,(%rcx)
+2: movw %ax,(%rcx)
xorl %eax,%eax
ret
20: decq %rcx
@@ -67,7 +67,7 @@ ENTRY(__put_user_4)
cmpq TI_addr_limit(%rbx),%rcx
jae 30f
subq $3,%rcx
-3: movl %edx,(%rcx)
+3: movl %eax,(%rcx)
xorl %eax,%eax
ret
30: subq $3,%rcx
@@ -83,7 +83,7 @@ ENTRY(__put_user_8)
cmpq TI_addr_limit(%rbx),%rcx
jae 40f
subq $7,%rcx
-4: movq %rdx,(%rcx)
+4: movq %rax,(%rcx)
xorl %eax,%eax
ret
40: subq $7,%rcx
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 8933ddb..05bb246 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -130,7 +130,7 @@ extern void __put_user_bad(void);
#define __put_user_x(size, ret, x, ptr) \
asm volatile("call __put_user_" #size \
:"=a" (ret) \
- :"c" (ptr),"d" (x) \
+ :"c" (ptr),"a" (x) \
:"ebx")

#define put_user(x, ptr) \
--
1.5.5.1

2008-06-27 21:44:33

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 19/39] change testing logic in putuser_64.S

Instead of operating on a register that we then need to restore
to its original state afterwards (the target address), just
subtract from %rbx, which is trashed anyway. This saves a few
instructions.

Also, this is the i386 way.
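
For the curious, the two checks are equivalent. A plain-C sketch of
the logic (old_bad/new_bad are made-up names; this is illustrative
only, and assumes addr_limit is large enough that the subtraction
cannot wrap):

/* Old scheme: bump the address to its last byte, watch the carry,
 * then undo the bump before the store. */
static int old_bad(unsigned long addr, unsigned long limit, unsigned long n)
{
        unsigned long last = addr + (n - 1);
        return last < addr || last >= limit;    /* wrapped, or past limit */
}

/* New scheme: shrink the limit once instead; the address register
 * is never modified, and the limit register is trashed anyway. */
static int new_bad(unsigned long addr, unsigned long limit, unsigned long n)
{
        return addr >= limit - (n - 1);
}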

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_64.S | 33 ++++++++++++---------------------
1 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index ce5fcd5..a96bd8a 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -46,48 +46,39 @@ ENDPROC(__put_user_1)
ENTRY(__put_user_2)
CFI_STARTPROC
GET_THREAD_INFO(%rbx)
- addq $1,%rcx
- jc 20f
- cmpq TI_addr_limit(%rbx),%rcx
- jae 20f
- decq %rcx
+ mov TI_addr_limit(%rbx),%rbx
+ sub $1, %rbx
+ cmpq %rbx ,%rcx
+ jae bad_put_user
2: movw %ax,(%rcx)
xorl %eax,%eax
ret
-20: decq %rcx
- jmp bad_put_user
CFI_ENDPROC
ENDPROC(__put_user_2)

ENTRY(__put_user_4)
CFI_STARTPROC
GET_THREAD_INFO(%rbx)
- addq $3,%rcx
- jc 30f
- cmpq TI_addr_limit(%rbx),%rcx
- jae 30f
- subq $3,%rcx
+ mov TI_addr_limit(%rbx),%rbx
+ sub $3, %rbx
+ cmp %rbx, %rcx
+ jae bad_put_user
3: movl %eax,(%rcx)
xorl %eax,%eax
ret
-30: subq $3,%rcx
- jmp bad_put_user
CFI_ENDPROC
ENDPROC(__put_user_4)

ENTRY(__put_user_8)
CFI_STARTPROC
GET_THREAD_INFO(%rbx)
- addq $7,%rcx
- jc 40f
- cmpq TI_addr_limit(%rbx),%rcx
- jae 40f
- subq $7,%rcx
+ mov TI_addr_limit(%rbx),%rbx
+ sub $7, %rbx
+ cmp %rbx, %rcx
+ jae bad_put_user
4: movq %rax,(%rcx)
xorl %eax,%eax
ret
-40: subq $7,%rcx
- jmp bad_put_user
CFI_ENDPROC
ENDPROC(__put_user_8)

--
1.5.5.1

2008-06-27 21:44:54

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 20/39] replace function headers by macros

In putuser_64.S, do it the i386 way, and replace the code at the
beginning and end of each function with macros, since it is always
the same thing. This saves lines.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_64.S | 32 ++++++++++++++------------------
1 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index a96bd8a..6d7513b 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -31,62 +31,58 @@
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>

+#define ENTER CFI_STARTPROC ; \
+ GET_THREAD_INFO(%rbx)
+#define EXIT ret ; \
+ CFI_ENDPROC
+
.text
ENTRY(__put_user_1)
- CFI_STARTPROC
- GET_THREAD_INFO(%rbx)
+ ENTER
cmpq TI_addr_limit(%rbx),%rcx
jae bad_put_user
1: movb %al,(%rcx)
xorl %eax,%eax
- ret
- CFI_ENDPROC
+ EXIT
ENDPROC(__put_user_1)

ENTRY(__put_user_2)
- CFI_STARTPROC
- GET_THREAD_INFO(%rbx)
+ ENTER
mov TI_addr_limit(%rbx),%rbx
sub $1, %rbx
cmpq %rbx ,%rcx
jae bad_put_user
2: movw %ax,(%rcx)
xorl %eax,%eax
- ret
- CFI_ENDPROC
+ EXIT
ENDPROC(__put_user_2)

ENTRY(__put_user_4)
- CFI_STARTPROC
- GET_THREAD_INFO(%rbx)
+ ENTER
mov TI_addr_limit(%rbx),%rbx
sub $3, %rbx
cmp %rbx, %rcx
jae bad_put_user
3: movl %eax,(%rcx)
xorl %eax,%eax
- ret
- CFI_ENDPROC
+ EXIT
ENDPROC(__put_user_4)

ENTRY(__put_user_8)
- CFI_STARTPROC
- GET_THREAD_INFO(%rbx)
+ ENTER
mov TI_addr_limit(%rbx),%rbx
sub $7, %rbx
cmp %rbx, %rcx
jae bad_put_user
4: movq %rax,(%rcx)
xorl %eax,%eax
- ret
- CFI_ENDPROC
+ EXIT
ENDPROC(__put_user_8)

bad_put_user:
CFI_STARTPROC
movq $(-EFAULT),%rax
- ret
- CFI_ENDPROC
+ EXIT
END(bad_put_user)

.section __ex_table,"a"
--
1.5.5.1

2008-06-27 21:45:25

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 21/39] don't use word-size specifiers in putuser files

Remove them where the operand size is unambiguous.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_32.S | 28 ++++++++++++++--------------
arch/x86/lib/putuser_64.S | 14 +++++++-------
2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/x86/lib/putuser_32.S b/arch/x86/lib/putuser_32.S
index 5b2a926..b67a37c 100644
--- a/arch/x86/lib/putuser_32.S
+++ b/arch/x86/lib/putuser_32.S
@@ -33,44 +33,44 @@
.text
ENTRY(__put_user_1)
ENTER
- cmpl TI_addr_limit(%ebx),%ecx
+ cmp TI_addr_limit(%ebx),%ecx
jae bad_put_user
1: movb %al,(%ecx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_1)

ENTRY(__put_user_2)
ENTER
- movl TI_addr_limit(%ebx),%ebx
- subl $1,%ebx
- cmpl %ebx,%ecx
+ mov TI_addr_limit(%ebx),%ebx
+ sub $1,%ebx
+ cmp %ebx,%ecx
jae bad_put_user
2: movw %ax,(%ecx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_2)

ENTRY(__put_user_4)
ENTER
- movl TI_addr_limit(%ebx),%ebx
- subl $3,%ebx
- cmpl %ebx,%ecx
+ mov TI_addr_limit(%ebx),%ebx
+ sub $3,%ebx
+ cmp %ebx,%ecx
jae bad_put_user
3: movl %eax,(%ecx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_4)

ENTRY(__put_user_8)
ENTER
- movl TI_addr_limit(%ebx),%ebx
- subl $7,%ebx
- cmpl %ebx,%ecx
+ mov TI_addr_limit(%ebx),%ebx
+ sub $7,%ebx
+ cmp %ebx,%ecx
jae bad_put_user
4: movl %eax,(%ecx)
5: movl %edx,4(%ecx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_8)

diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index 6d7513b..c18fc0f 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -39,10 +39,10 @@
.text
ENTRY(__put_user_1)
ENTER
- cmpq TI_addr_limit(%rbx),%rcx
+ cmp TI_addr_limit(%rbx),%rcx
jae bad_put_user
1: movb %al,(%rcx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_1)

@@ -50,10 +50,10 @@ ENTRY(__put_user_2)
ENTER
mov TI_addr_limit(%rbx),%rbx
sub $1, %rbx
- cmpq %rbx ,%rcx
+ cmp %rbx ,%rcx
jae bad_put_user
2: movw %ax,(%rcx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_2)

@@ -64,7 +64,7 @@ ENTRY(__put_user_4)
cmp %rbx, %rcx
jae bad_put_user
3: movl %eax,(%rcx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_4)

@@ -75,13 +75,13 @@ ENTRY(__put_user_8)
cmp %rbx, %rcx
jae bad_put_user
4: movq %rax,(%rcx)
- xorl %eax,%eax
+ xor %eax,%eax
EXIT
ENDPROC(__put_user_8)

bad_put_user:
CFI_STARTPROC
- movq $(-EFAULT),%rax
+ mov $(-EFAULT),%rax
EXIT
END(bad_put_user)

--
1.5.5.1

2008-06-27 21:45:49

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 22/39] use macros from asm.h

In putuser_32.S and putuser_64.S, replace things like .quad, .long,
and explicit references to [r|e]ax with the appropriate macros
from asm/asm.h.
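
Roughly, the selection machinery looks like this (a simplified
sketch, not the verbatim header):

#ifdef CONFIG_X86_32
# define __ASM_SEL(a, b)        a
#else
# define __ASM_SEL(a, b)        b
#endif

#define __ASM_REG(reg)          __ASM_SEL(e##reg, r##reg)

#define _ASM_BX                 __ASM_REG(bx)           /* ebx or rbx */
#define _ASM_CX                 __ASM_REG(cx)           /* ecx or rcx */
#define _ASM_PTR                __ASM_SEL(.long, .quad) /* pointer-sized */

Each macro picks the 32- or 64-bit form once, so the .S file can be
written a single time for both architectures.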

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/putuser_32.S | 43 ++++++++++++++++++++++---------------------
arch/x86/lib/putuser_64.S | 41 +++++++++++++++++++++--------------------
include/asm-x86/asm.h | 2 ++
3 files changed, 45 insertions(+), 41 deletions(-)

diff --git a/arch/x86/lib/putuser_32.S b/arch/x86/lib/putuser_32.S
index b67a37c..e7eda34 100644
--- a/arch/x86/lib/putuser_32.S
+++ b/arch/x86/lib/putuser_32.S
@@ -11,6 +11,7 @@
#include <linux/linkage.h>
#include <asm/dwarf2.h>
#include <asm/thread_info.h>
+#include <asm/asm.h>


/*
@@ -26,50 +27,50 @@
*/

#define ENTER CFI_STARTPROC ; \
- GET_THREAD_INFO(%ebx)
+ GET_THREAD_INFO(%_ASM_BX)
#define EXIT ret ; \
CFI_ENDPROC

.text
ENTRY(__put_user_1)
ENTER
- cmp TI_addr_limit(%ebx),%ecx
+ cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
jae bad_put_user
-1: movb %al,(%ecx)
+1: movb %al,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_1)

ENTRY(__put_user_2)
ENTER
- mov TI_addr_limit(%ebx),%ebx
- sub $1,%ebx
- cmp %ebx,%ecx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $1,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
jae bad_put_user
-2: movw %ax,(%ecx)
+2: movw %ax,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_2)

ENTRY(__put_user_4)
ENTER
- mov TI_addr_limit(%ebx),%ebx
- sub $3,%ebx
- cmp %ebx,%ecx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $3,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
jae bad_put_user
-3: movl %eax,(%ecx)
+3: movl %eax,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_4)

ENTRY(__put_user_8)
ENTER
- mov TI_addr_limit(%ebx),%ebx
- sub $7,%ebx
- cmp %ebx,%ecx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $7,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
jae bad_put_user
-4: movl %eax,(%ecx)
-5: movl %edx,4(%ecx)
+4: movl %_ASM_AX,(%_ASM_CX)
+5: movl %edx,4(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_8)
@@ -81,9 +82,9 @@ bad_put_user:
END(bad_put_user)

.section __ex_table,"a"
- .long 1b,bad_put_user
- .long 2b,bad_put_user
- .long 3b,bad_put_user
- .long 4b,bad_put_user
- .long 5b,bad_put_user
+ _ASM_PTR 1b,bad_put_user
+ _ASM_PTR 2b,bad_put_user
+ _ASM_PTR 3b,bad_put_user
+ _ASM_PTR 4b,bad_put_user
+ _ASM_PTR 5b,bad_put_user
.previous
diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
index c18fc0f..d496cc8 100644
--- a/arch/x86/lib/putuser_64.S
+++ b/arch/x86/lib/putuser_64.S
@@ -30,64 +30,65 @@
#include <asm/errno.h>
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
+#include <asm/asm.h>

#define ENTER CFI_STARTPROC ; \
- GET_THREAD_INFO(%rbx)
+ GET_THREAD_INFO(%_ASM_BX)
#define EXIT ret ; \
CFI_ENDPROC

.text
ENTRY(__put_user_1)
ENTER
- cmp TI_addr_limit(%rbx),%rcx
+ cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
jae bad_put_user
-1: movb %al,(%rcx)
+1: movb %al,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_1)

ENTRY(__put_user_2)
ENTER
- mov TI_addr_limit(%rbx),%rbx
- sub $1, %rbx
- cmp %rbx ,%rcx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $1, %_ASM_BX
+ cmp %_ASM_BX ,%_ASM_CX
jae bad_put_user
-2: movw %ax,(%rcx)
+2: movw %ax,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_2)

ENTRY(__put_user_4)
ENTER
- mov TI_addr_limit(%rbx),%rbx
- sub $3, %rbx
- cmp %rbx, %rcx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $3, %_ASM_BX
+ cmp %_ASM_BX, %_ASM_CX
jae bad_put_user
-3: movl %eax,(%rcx)
+3: movl %eax,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_4)

ENTRY(__put_user_8)
ENTER
- mov TI_addr_limit(%rbx),%rbx
- sub $7, %rbx
- cmp %rbx, %rcx
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $7, %_ASM_BX
+ cmp %_ASM_BX, %_ASM_CX
jae bad_put_user
-4: movq %rax,(%rcx)
+4: movq %_ASM_AX,(%_ASM_CX)
xor %eax,%eax
EXIT
ENDPROC(__put_user_8)

bad_put_user:
CFI_STARTPROC
- mov $(-EFAULT),%rax
+ mov $(-EFAULT),%eax
EXIT
END(bad_put_user)

.section __ex_table,"a"
- .quad 1b,bad_put_user
- .quad 2b,bad_put_user
- .quad 3b,bad_put_user
- .quad 4b,bad_put_user
+ _ASM_PTR 1b,bad_put_user
+ _ASM_PTR 2b,bad_put_user
+ _ASM_PTR 3b,bad_put_user
+ _ASM_PTR 4b,bad_put_user
.previous
diff --git a/include/asm-x86/asm.h b/include/asm-x86/asm.h
index 57750a9..9722032 100644
--- a/include/asm-x86/asm.h
+++ b/include/asm-x86/asm.h
@@ -28,6 +28,8 @@
#define _ASM_SUB __ASM_SIZE(sub)
#define _ASM_XADD __ASM_SIZE(xadd)
#define _ASM_AX __ASM_REG(ax)
+#define _ASM_BX __ASM_REG(bx)
+#define _ASM_CX __ASM_REG(cx)
#define _ASM_DX __ASM_REG(dx)

/* Exception table entry */
--
1.5.5.1

2008-06-27 21:46:18

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 23/39] merge putuser asm functions

putuser_32.S and putuser_64.S are merged into putuser.S

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/lib/Makefile | 2 +-
arch/x86/lib/putuser.S | 97 +++++++++++++++++++++++++++++++++++++++++++++
arch/x86/lib/putuser_32.S | 90 -----------------------------------------
arch/x86/lib/putuser_64.S | 94 -------------------------------------------
4 files changed, 98 insertions(+), 185 deletions(-)
create mode 100644 arch/x86/lib/putuser.S
delete mode 100644 arch/x86/lib/putuser_32.S
delete mode 100644 arch/x86/lib/putuser_64.S

diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index bf990ca..aa3fa41 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -6,7 +6,7 @@ obj-$(CONFIG_SMP) := msr-on-cpu.o

lib-y := delay.o
lib-y += thunk_$(BITS).o
-lib-y += usercopy_$(BITS).o getuser.o putuser_$(BITS).o
+lib-y += usercopy_$(BITS).o getuser.o putuser.o
lib-y += memcpy_$(BITS).o

ifeq ($(CONFIG_X86_32),y)
diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
new file mode 100644
index 0000000..36b0d15
--- /dev/null
+++ b/arch/x86/lib/putuser.S
@@ -0,0 +1,97 @@
+/*
+ * __put_user functions.
+ *
+ * (C) Copyright 2005 Linus Torvalds
+ * (C) Copyright 2005 Andi Kleen
+ * (C) Copyright 2008 Glauber Costa
+ *
+ * These functions have a non-standard call interface
+ * to make them more efficient, especially as they
+ * return an error value in addition to the "real"
+ * return value.
+ */
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+#include <asm/thread_info.h>
+#include <asm/errno.h>
+#include <asm/asm.h>
+
+
+/*
+ * __put_user_X
+ *
+ * Inputs: %eax[:%edx] contains the data
+ * %ecx contains the address
+ *
+ * Outputs: %eax is error code (0 or -EFAULT)
+ *
+ * These functions should not modify any other registers,
+ * as they get called from within inline assembly.
+ */
+
+#define ENTER CFI_STARTPROC ; \
+ GET_THREAD_INFO(%_ASM_BX)
+#define EXIT ret ; \
+ CFI_ENDPROC
+
+.text
+ENTRY(__put_user_1)
+ ENTER
+ cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
+ jae bad_put_user
+1: movb %al,(%_ASM_CX)
+ xor %eax,%eax
+ EXIT
+ENDPROC(__put_user_1)
+
+ENTRY(__put_user_2)
+ ENTER
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $1,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
+ jae bad_put_user
+2: movw %ax,(%_ASM_CX)
+ xor %eax,%eax
+ EXIT
+ENDPROC(__put_user_2)
+
+ENTRY(__put_user_4)
+ ENTER
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $3,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
+ jae bad_put_user
+3: movl %eax,(%_ASM_CX)
+ xor %eax,%eax
+ EXIT
+ENDPROC(__put_user_4)
+
+ENTRY(__put_user_8)
+ ENTER
+ mov TI_addr_limit(%_ASM_BX),%_ASM_BX
+ sub $7,%_ASM_BX
+ cmp %_ASM_BX,%_ASM_CX
+ jae bad_put_user
+4: mov %_ASM_AX,(%_ASM_CX)
+#ifdef CONFIG_X86_32
+5: movl %edx,4(%_ASM_CX)
+#endif
+ xor %eax,%eax
+ EXIT
+ENDPROC(__put_user_8)
+
+bad_put_user:
+ CFI_STARTPROC
+ movl $-EFAULT,%eax
+ EXIT
+END(bad_put_user)
+
+.section __ex_table,"a"
+ _ASM_PTR 1b,bad_put_user
+ _ASM_PTR 2b,bad_put_user
+ _ASM_PTR 3b,bad_put_user
+ _ASM_PTR 4b,bad_put_user
+#ifdef CONFIG_X86_32
+ _ASM_PTR 5b,bad_put_user
+#endif
+.previous
diff --git a/arch/x86/lib/putuser_32.S b/arch/x86/lib/putuser_32.S
deleted file mode 100644
index e7eda34..0000000
--- a/arch/x86/lib/putuser_32.S
+++ /dev/null
@@ -1,90 +0,0 @@
-/*
- * __put_user functions.
- *
- * (C) Copyright 2005 Linus Torvalds
- *
- * These functions have a non-standard call interface
- * to make them more efficient, especially as they
- * return an error value in addition to the "real"
- * return value.
- */
-#include <linux/linkage.h>
-#include <asm/dwarf2.h>
-#include <asm/thread_info.h>
-#include <asm/asm.h>
-
-
-/*
- * __put_user_X
- *
- * Inputs: %eax[:%edx] contains the data
- * %ecx contains the address
- *
- * Outputs: %eax is error code (0 or -EFAULT)
- *
- * These functions should not modify any other registers,
- * as they get called from within inline assembly.
- */
-
-#define ENTER CFI_STARTPROC ; \
- GET_THREAD_INFO(%_ASM_BX)
-#define EXIT ret ; \
- CFI_ENDPROC
-
-.text
-ENTRY(__put_user_1)
- ENTER
- cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
- jae bad_put_user
-1: movb %al,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_1)
-
-ENTRY(__put_user_2)
- ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $1,%_ASM_BX
- cmp %_ASM_BX,%_ASM_CX
- jae bad_put_user
-2: movw %ax,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_2)
-
-ENTRY(__put_user_4)
- ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $3,%_ASM_BX
- cmp %_ASM_BX,%_ASM_CX
- jae bad_put_user
-3: movl %eax,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_4)
-
-ENTRY(__put_user_8)
- ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $7,%_ASM_BX
- cmp %_ASM_BX,%_ASM_CX
- jae bad_put_user
-4: movl %_ASM_AX,(%_ASM_CX)
-5: movl %edx,4(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_8)
-
-bad_put_user:
- CFI_STARTPROC
- movl $-14,%eax
- EXIT
-END(bad_put_user)
-
-.section __ex_table,"a"
- _ASM_PTR 1b,bad_put_user
- _ASM_PTR 2b,bad_put_user
- _ASM_PTR 3b,bad_put_user
- _ASM_PTR 4b,bad_put_user
- _ASM_PTR 5b,bad_put_user
-.previous
diff --git a/arch/x86/lib/putuser_64.S b/arch/x86/lib/putuser_64.S
deleted file mode 100644
index d496cc8..0000000
--- a/arch/x86/lib/putuser_64.S
+++ /dev/null
@@ -1,94 +0,0 @@
-/*
- * __put_user functions.
- *
- * (C) Copyright 1998 Linus Torvalds
- * (C) Copyright 2005 Andi Kleen
- *
- * These functions have a non-standard call interface
- * to make them more efficient, especially as they
- * return an error value in addition to the "real"
- * return value.
- */
-
-/*
- * __put_user_X
- *
- * Inputs: %rcx contains the address
- * %rdx contains new value
- *
- * Outputs: %rax is error code (0 or -EFAULT)
- *
- * %rbx is destroyed.
- *
- * These functions should not modify any other registers,
- * as they get called from within inline assembly.
- */
-
-#include <linux/linkage.h>
-#include <asm/dwarf2.h>
-#include <asm/page.h>
-#include <asm/errno.h>
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-#include <asm/asm.h>
-
-#define ENTER CFI_STARTPROC ; \
- GET_THREAD_INFO(%_ASM_BX)
-#define EXIT ret ; \
- CFI_ENDPROC
-
- .text
-ENTRY(__put_user_1)
- ENTER
- cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
- jae bad_put_user
-1: movb %al,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_1)
-
-ENTRY(__put_user_2)
- ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $1, %_ASM_BX
- cmp %_ASM_BX ,%_ASM_CX
- jae bad_put_user
-2: movw %ax,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_2)
-
-ENTRY(__put_user_4)
- ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $3, %_ASM_BX
- cmp %_ASM_BX, %_ASM_CX
- jae bad_put_user
-3: movl %eax,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_4)
-
-ENTRY(__put_user_8)
- ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $7, %_ASM_BX
- cmp %_ASM_BX, %_ASM_CX
- jae bad_put_user
-4: movq %_ASM_AX,(%_ASM_CX)
- xor %eax,%eax
- EXIT
-ENDPROC(__put_user_8)
-
-bad_put_user:
- CFI_STARTPROC
- mov $(-EFAULT),%eax
- EXIT
-END(bad_put_user)
-
-.section __ex_table,"a"
- _ASM_PTR 1b,bad_put_user
- _ASM_PTR 2b,bad_put_user
- _ASM_PTR 3b,bad_put_user
- _ASM_PTR 4b,bad_put_user
-.previous
--
1.5.5.1

2008-06-27 21:46:47

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 24/39] commonize __range_not_ok

For i386, __range_not_ok is a better name than __range_ok, since
the macro returns 0 when the range is in fact okay. Other than that,
neither version needs the word-size specifiers, so we remove them.
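
What the add/sbb sequence computes, as a plain-C sketch
(range_not_ok here is a made-up standalone function, not the macro):

static int range_not_ok(unsigned long addr, unsigned long size,
                        unsigned long limit)
{
        unsigned long sum = addr + size;
        int carry = sum < addr;         /* addr + size wrapped around */

        /* nonzero (i.e. not ok) if the sum wrapped or passed the limit */
        return carry || sum > limit;
}

The extra bit of precision (33/65 bits) lives in the carry flag,
which the first sbb folds into the result.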

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_32.h | 6 +++---
include/asm-x86/uaccess_64.h | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index ff3443b..9884f50 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -56,11 +56,11 @@ extern struct movsl_mask {
*
* This needs 33-bit arithmetic. We have a carry...
*/
-#define __range_ok(addr, size) \
+#define __range_not_ok(addr, size) \
({ \
unsigned long flag, roksum; \
__chk_user_ptr(addr); \
- asm("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0" \
+ asm("add %3,%1 ; sbb %0,%0; cmp %1,%4; sbb $0,%0" \
:"=&r" (flag), "=r" (roksum) \
:"1" (addr), "g" ((int)(size)), \
"rm" (current_thread_info()->addr_limit.seg)); \
@@ -86,7 +86,7 @@ extern struct movsl_mask {
* checks that the pointer is in the user space range - after calling
* this function, memory access functions may still return -EFAULT.
*/
-#define access_ok(type, addr, size) (likely(__range_ok(addr, size) == 0))
+#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))

/*
* The exception table consists of pairs of addresses: the first is the
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 05bb246..d607fd0 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -42,7 +42,7 @@
unsigned long flag, roksum; \
__chk_user_ptr(addr); \
asm("# range_ok\n\r" \
- "addq %3,%1 ; sbbq %0,%0 ; cmpq %1,%4 ; sbbq $0,%0" \
+ "add %3,%1 ; sbb %0,%0 ; cmp %1,%4 ; sbb $0,%0" \
: "=&r" (flag), "=r" (roksum) \
: "1" (addr), "g" ((long)(size)), \
"g" (current_thread_info()->addr_limit.seg)); \
--
1.5.5.1

2008-06-27 21:47:10

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 26/39] merge getuser

Merge the get_user versions from uaccess_32.h and uaccess_64.h into
uaccess.h. One part is 64-bit only (for now); for that, we use a
__get_user_8 macro.
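
A hypothetical caller, to show what the unified macro expands to
(get_flag and uptr are made-up names):

/* sizeof(*uptr) == 4, so the switch below turns this into a call
 * to __get_user_4; 8-byte pointees go through __get_user_8. */
static int get_flag(int __user *uptr, int *flag)
{
        return get_user(*flag, uptr);   /* 0 on success, -EFAULT on fault */
}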

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 55 ++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/uaccess_32.h | 43 --------------------------------
include/asm-x86/uaccess_64.h | 29 ----------------------
3 files changed, 55 insertions(+), 72 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 2290513..a06a810 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -121,6 +121,61 @@ extern int __get_user_bad(void);
: "=a" (ret),"=d" (x) \
: "0" (ptr)) \

+/* Careful: we have to cast the result to the type of the pointer
+ * for sign reasons */
+
+/**
+ * get_user: - Get a simple variable from user space.
+ * @x: Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+#ifdef CONFIG_X86_32
+#define __get_user_8(__ret_gu, __val_gu, ptr) \
+ __get_user_x(X, __ret_gu, __val_gu, ptr)
+#else
+#define __get_user_8(__ret_gu, __val_gu, ptr) \
+ __get_user_x(8, __ret_gu, __val_gu, ptr)
+#endif
+
+#define get_user(x, ptr) \
+({ \
+ int __ret_gu; \
+ unsigned long __val_gu; \
+ __chk_user_ptr(ptr); \
+ switch (sizeof(*(ptr))) { \
+ case 1: \
+ __get_user_x(1, __ret_gu, __val_gu, ptr); \
+ break; \
+ case 2: \
+ __get_user_x(2, __ret_gu, __val_gu, ptr); \
+ break; \
+ case 4: \
+ __get_user_x(4, __ret_gu, __val_gu, ptr); \
+ break; \
+ case 8: \
+ __get_user_8(__ret_gu, __val_gu, ptr); \
+ break; \
+ default: \
+ __get_user_x(X, __ret_gu, __val_gu, ptr); \
+ break; \
+ } \
+ (x) = (__typeof__(*(ptr)))__val_gu; \
+ __ret_gu; \
+})
+
+
#ifdef CONFIG_X86_32
# include "uaccess_32.h"
#else
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 92ad19e..3cc3236 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -24,49 +24,6 @@ extern struct movsl_mask {
((unsigned long __force)(addr) < \
(current_thread_info()->addr_limit.seg))

-/* Careful: we have to cast the result to the type of the pointer
- * for sign reasons */
-
-/**
- * get_user: - Get a simple variable from user space.
- * @x: Variable to store result.
- * @ptr: Source address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple variable from user space to kernel
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and the result of
- * dereferencing @ptr must be assignable to @x without a cast.
- *
- * Returns zero on success, or -EFAULT on error.
- * On error, the variable @x is set to zero.
- */
-#define get_user(x, ptr) \
-({ \
- int __ret_gu; \
- unsigned long __val_gu; \
- __chk_user_ptr(ptr); \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __get_user_x(1, __ret_gu, __val_gu, ptr); \
- break; \
- case 2: \
- __get_user_x(2, __ret_gu, __val_gu, ptr); \
- break; \
- case 4: \
- __get_user_x(4, __ret_gu, __val_gu, ptr); \
- break; \
- default: \
- __get_user_x(X, __ret_gu, __val_gu, ptr); \
- break; \
- } \
- (x) = (__typeof__(*(ptr)))__val_gu; \
- __ret_gu; \
-})
-
extern void __put_user_bad(void);

/*
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 243dbb4..4a44b90 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -14,35 +14,6 @@

#define ARCH_HAS_SEARCH_EXTABLE

-/* Careful: we have to cast the result to the type of the pointer
- * for sign reasons */
-
-#define get_user(x, ptr) \
-({ \
- unsigned long __val_gu; \
- int __ret_gu; \
- __chk_user_ptr(ptr); \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __get_user_x(1, __ret_gu, __val_gu, ptr); \
- break; \
- case 2: \
- __get_user_x(2, __ret_gu, __val_gu, ptr); \
- break; \
- case 4: \
- __get_user_x(4, __ret_gu, __val_gu, ptr); \
- break; \
- case 8: \
- __get_user_x(8, __ret_gu, __val_gu, ptr); \
- break; \
- default: \
- __get_user_bad(); \
- break; \
- } \
- (x) = (__force typeof(*(ptr)))__val_gu; \
- __ret_gu; \
-})
-
extern void __put_user_1(void);
extern void __put_user_2(void);
extern void __put_user_4(void);
--
1.5.5.1

2008-06-27 21:47:36

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 25/39] merge common parts of uaccess.

Common parts of uaccess_32.h and uaccess_64.h
are moved into uaccess.h.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 125 ++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/uaccess_32.h | 110 -------------------------------------
include/asm-x86/uaccess_64.h | 84 ----------------------------
3 files changed, 125 insertions(+), 194 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 9fefd29..2290513 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -1,5 +1,130 @@
+#ifndef _ASM_UACCES_H_
+#define _ASM_UACCES_H_
+/*
+ * User space memory access functions
+ */
+#include <linux/errno.h>
+#include <linux/compiler.h>
+#include <linux/thread_info.h>
+#include <linux/prefetch.h>
+#include <linux/string.h>
+#include <asm/asm.h>
+#include <asm/page.h>
+
+#define VERIFY_READ 0
+#define VERIFY_WRITE 1
+
+/*
+ * The fs value determines whether argument validity checking should be
+ * performed or not. If get_fs() == USER_DS, checking is performed, with
+ * get_fs() == KERNEL_DS, checking is bypassed.
+ *
+ * For historical reasons, these macros are grossly misnamed.
+ */
+
+#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
+
+#define KERNEL_DS MAKE_MM_SEG(-1UL)
+#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
+
+#define get_ds() (KERNEL_DS)
+#define get_fs() (current_thread_info()->addr_limit)
+#define set_fs(x) (current_thread_info()->addr_limit = (x))
+
+#define segment_eq(a, b) ((a).seg == (b).seg)
+
+/*
+ * Test whether a block of memory is a valid user space address.
+ * Returns 0 if the range is valid, nonzero otherwise.
+ *
+ * This is equivalent to the following test:
+ * (u33)addr + (u33)size >= (u33)current->addr_limit.seg (u65 for x86_64)
+ *
+ * This needs 33-bit (65-bit for x86_64) arithmetic. We have a carry...
+ */
+
+#define __range_not_ok(addr, size) \
+({ \
+ unsigned long flag, roksum; \
+ __chk_user_ptr(addr); \
+ asm("# range_ok\n\r" \
+ "add %3,%1 ; sbb %0,%0 ; cmp %1,%4 ; sbb $0,%0" \
+ : "=&r" (flag), "=r" (roksum) \
+ : "1" (addr), "g" ((long)(size)), \
+ "g" (current_thread_info()->addr_limit.seg)); \
+ flag; \
+})
+
+/**
+ * access_ok: - Checks if a user space pointer is valid
+ * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
+ * %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
+ * to write to a block, it is always safe to read from it.
+ * @addr: User space pointer to start of block to check
+ * @size: Size of block to check
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * Checks if a pointer to a block of memory in user space is valid.
+ *
+ * Returns true (nonzero) if the memory block may be valid, false (zero)
+ * if it is definitely invalid.
+ *
+ * Note that, depending on architecture, this function probably just
+ * checks that the pointer is in the user space range - after calling
+ * this function, memory access functions may still return -EFAULT.
+ */
+#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))
+
+/*
+ * The exception table consists of pairs of addresses: the first is the
+ * address of an instruction that is allowed to fault, and the second is
+ * the address at which the program should continue. No registers are
+ * modified, so it is entirely up to the continuation code to figure out
+ * what to do.
+ *
+ * All the routines below use bits of fixup code that are out of line
+ * with the main instruction path. This means when everything is well,
+ * we don't even have to jump over them. Further, they do not intrude
+ * on our cache or tlb entries.
+ */
+
+struct exception_table_entry {
+ unsigned long insn, fixup;
+};
+
+extern int fixup_exception(struct pt_regs *regs);
+
+/*
+ * These are the main single-value transfer routines. They automatically
+ * use the right size if we just have the right pointer type.
+ *
+ * This gets kind of ugly. We want to return _two_ values in "get_user()"
+ * and yet we don't want to do any pointers, because that is too much
+ * of a performance impact. Thus we have a few rather ugly macros here,
+ * and hide all the ugliness from the user.
+ *
+ * The "__xxx" versions of the user access functions are versions that
+ * do not verify the address space, that must have been done previously
+ * with a separate "access_ok()" call (this is used when we do multiple
+ * accesses to the same area of user memory).
+ */
+
+extern int __get_user_1(void);
+extern int __get_user_2(void);
+extern int __get_user_4(void);
+extern int __get_user_8(void);
+extern int __get_user_bad(void);
+
+#define __get_user_x(size, ret, x, ptr) \
+ asm volatile("call __get_user_" #size \
+ : "=a" (ret),"=d" (x) \
+ : "0" (ptr)) \
+
#ifdef CONFIG_X86_32
# include "uaccess_32.h"
#else
# include "uaccess_64.h"
#endif
+
+#endif
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 9884f50..92ad19e 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -11,29 +11,6 @@
#include <asm/asm.h>
#include <asm/page.h>

-#define VERIFY_READ 0
-#define VERIFY_WRITE 1
-
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not. If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
-
-
-#define KERNEL_DS MAKE_MM_SEG(-1UL)
-#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
-
-#define get_ds() (KERNEL_DS)
-#define get_fs() (current_thread_info()->addr_limit)
-#define set_fs(x) (current_thread_info()->addr_limit = (x))
-
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
/*
* movsl can be slow when source and dest are not both 8-byte aligned
*/
@@ -47,91 +24,6 @@ extern struct movsl_mask {
((unsigned long __force)(addr) < \
(current_thread_info()->addr_limit.seg))

-/*
- * Test whether a block of memory is a valid user space address.
- * Returns 0 if the range is valid, nonzero otherwise.
- *
- * This is equivalent to the following test:
- * (u33)addr + (u33)size >= (u33)current->addr_limit.seg
- *
- * This needs 33-bit arithmetic. We have a carry...
- */
-#define __range_not_ok(addr, size) \
-({ \
- unsigned long flag, roksum; \
- __chk_user_ptr(addr); \
- asm("add %3,%1 ; sbb %0,%0; cmp %1,%4; sbb $0,%0" \
- :"=&r" (flag), "=r" (roksum) \
- :"1" (addr), "g" ((int)(size)), \
- "rm" (current_thread_info()->addr_limit.seg)); \
- flag; \
-})
-
-/**
- * access_ok: - Checks if a user space pointer is valid
- * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
- * %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
- * to write to a block, it is always safe to read from it.
- * @addr: User space pointer to start of block to check
- * @size: Size of block to check
- *
- * Context: User context only. This function may sleep.
- *
- * Checks if a pointer to a block of memory in user space is valid.
- *
- * Returns true (nonzero) if the memory block may be valid, false (zero)
- * if it is definitely invalid.
- *
- * Note that, depending on architecture, this function probably just
- * checks that the pointer is in the user space range - after calling
- * this function, memory access functions may still return -EFAULT.
- */
-#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))
-
-/*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue. No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
- *
- * All the routines below use bits of fixup code that are out of line
- * with the main instruction path. This means when everything is well,
- * we don't even have to jump over them. Further, they do not intrude
- * on our cache or tlb entries.
- */
-
-struct exception_table_entry {
- unsigned long insn, fixup;
-};
-
-extern int fixup_exception(struct pt_regs *regs);
-
-/*
- * These are the main single-value transfer routines. They automatically
- * use the right size if we just have the right pointer type.
- *
- * This gets kind of ugly. We want to return _two_ values in "get_user()"
- * and yet we don't want to do any pointers, because that is too much
- * of a performance impact. Thus we have a few rather ugly macros here,
- * and hide all the ugliness from the user.
- *
- * The "__xxx" versions of the user access functions are versions that
- * do not verify the address space, that must have been done previously
- * with a separate "access_ok()" call (this is used when we do multiple
- * accesses to the same area of user memory).
- */
-
-extern void __get_user_1(void);
-extern void __get_user_2(void);
-extern void __get_user_4(void);
-
-#define __get_user_x(size, ret, x, ptr) \
- asm volatile("call __get_user_" #size \
- :"=a" (ret),"=d" (x) \
- :"0" (ptr))
-
-
/* Careful: we have to cast the result to the type of the pointer
* for sign reasons */

@@ -386,8 +278,6 @@ struct __large_struct { unsigned long buf[100]; };
__gu_err; \
})

-extern long __get_user_bad(void);
-
#define __get_user_size(x, ptr, size, retval, errret) \
do { \
retval = 0; \
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index d607fd0..243dbb4 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -9,89 +9,11 @@
#include <linux/prefetch.h>
#include <asm/page.h>

-#define VERIFY_READ 0
-#define VERIFY_WRITE 1
-
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not. If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
-
-#define KERNEL_DS MAKE_MM_SEG(-1UL)
-#define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
-
-#define get_ds() (KERNEL_DS)
-#define get_fs() (current_thread_info()->addr_limit)
-#define set_fs(x) (current_thread_info()->addr_limit = (x))
-
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
#define __addr_ok(addr) (!((unsigned long)(addr) & \
(current_thread_info()->addr_limit.seg)))

-/*
- * Uhhuh, this needs 65-bit arithmetic. We have a carry..
- */
-#define __range_not_ok(addr, size) \
-({ \
- unsigned long flag, roksum; \
- __chk_user_ptr(addr); \
- asm("# range_ok\n\r" \
- "add %3,%1 ; sbb %0,%0 ; cmp %1,%4 ; sbb $0,%0" \
- : "=&r" (flag), "=r" (roksum) \
- : "1" (addr), "g" ((long)(size)), \
- "g" (current_thread_info()->addr_limit.seg)); \
- flag; \
-})
-
-#define access_ok(type, addr, size) (__range_not_ok(addr, size) == 0)
-
-/*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue. No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
- *
- * All the routines below use bits of fixup code that are out of line
- * with the main instruction path. This means when everything is well,
- * we don't even have to jump over them. Further, they do not intrude
- * on our cache or tlb entries.
- */
-
-struct exception_table_entry {
- unsigned long insn, fixup;
-};
-
-extern int fixup_exception(struct pt_regs *regs);
-
#define ARCH_HAS_SEARCH_EXTABLE

-/*
- * These are the main single-value transfer routines. They automatically
- * use the right size if we just have the right pointer type.
- *
- * This gets kind of ugly. We want to return _two_ values in "get_user()"
- * and yet we don't want to do any pointers, because that is too much
- * of a performance impact. Thus we have a few rather ugly macros here,
- * and hide all the ugliness from the user.
- *
- * The "__xxx" versions of the user access functions are versions that
- * do not verify the address space, that must have been done previously
- * with a separate "access_ok()" call (this is used when we do multiple
- * accesses to the same area of user memory).
- */
-
-#define __get_user_x(size, ret, x, ptr) \
- asm volatile("call __get_user_" #size \
- : "=a" (ret),"=d" (x) \
- : "0" (ptr)) \
-
/* Careful: we have to cast the result to the type of the pointer
* for sign reasons */

@@ -227,12 +149,6 @@ struct __large_struct { unsigned long buf[100]; };
__gu_err; \
})

-extern int __get_user_1(void);
-extern int __get_user_2(void);
-extern int __get_user_4(void);
-extern int __get_user_8(void);
-extern int __get_user_bad(void);
-
#define __get_user_size(x, ptr, size, retval) \
do { \
retval = 0; \
--
1.5.5.1

2008-06-27 21:47:58

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 27/39] move __addr_ok to uaccess.h

Take it out of uaccess_32.h. Since no users of the x86_64 version
seem to exist, we simply pick the i386 one.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 4 ++++
include/asm-x86/uaccess_32.h | 4 ----
include/asm-x86/uaccess_64.h | 3 ---
3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index a06a810..3721513 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -33,6 +33,10 @@

#define segment_eq(a, b) ((a).seg == (b).seg)

+#define __addr_ok(addr) \
+ ((unsigned long __force)(addr) < \
+ (current_thread_info()->addr_limit.seg))
+
/*
* Test whether a block of memory is a valid user space address.
* Returns 0 if the range is valid, nonzero otherwise.
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 3cc3236..87b1aed 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -20,10 +20,6 @@ extern struct movsl_mask {
} ____cacheline_aligned_in_smp movsl_mask;
#endif

-#define __addr_ok(addr) \
- ((unsigned long __force)(addr) < \
- (current_thread_info()->addr_limit.seg))
-
extern void __put_user_bad(void);

/*
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 4a44b90..8130876 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -9,9 +9,6 @@
#include <linux/prefetch.h>
#include <asm/page.h>

-#define __addr_ok(addr) (!((unsigned long)(addr) & \
- (current_thread_info()->addr_limit.seg)))
-
#define ARCH_HAS_SEARCH_EXTABLE

extern void __put_user_1(void);
--
1.5.5.1

2008-06-27 21:48:24

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 29/39] mark x86_64 as having a working WP.

select X86_WP_WORKS_OK for x86_64 too.

Signed-off-by: Glauber Costa <[email protected]>
---
arch/x86/Kconfig.cpu | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index d5f04f9..99ec0fe 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -344,7 +344,7 @@ config X86_F00F_BUG

config X86_WP_WORKS_OK
def_bool y
- depends on X86_32 && !M386
+ depends on !M386

config X86_INVLPG
def_bool y
--
1.5.5.1

2008-06-27 21:48:43

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 28/39] use k modifier for 4-byte access.

Do it in a separate patch for bisectability.
The goal is to have put_user_size integrated.
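
The "k" modifier forces gcc to print the 32-bit name of the
register; a standalone sketch of the effect (store32 is a made-up
example, not kernel code):

/* %k1 always prints the 32-bit register name (e.g. %eax even when
 * the value sits in a 64-bit register), so the template emits
 * "movl %eax,(mem)" on both 32-bit and 64-bit builds. */
static inline void store32(unsigned int val, unsigned int *p)
{
        asm volatile("movl %k1,%0" : "=m" (*p) : "r" (val));
}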

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_32.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 87b1aed..4c47a5b 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -180,7 +180,7 @@ do { \
__put_user_asm(x, ptr, retval, "w", "w", "ir", errret); \
break; \
case 4: \
- __put_user_asm(x, ptr, retval, "l", "", "ir", errret); \
+ __put_user_asm(x, ptr, retval, "l", "k", "ir", errret);\
break; \
case 8: \
__put_user_u64((__typeof__(*ptr))(x), ptr, retval); \
--
1.5.5.1

2008-06-27 21:49:14

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 30/39] don't always use EFAULT on __put_user_size.

Let the user of the macro specify the desired return value.
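
A hypothetical caller showing the new parameter (put_u32_or_einval
is a made-up name):

/* With errret as a parameter, a site that prefers an error code
 * other than -EFAULT can now pass its own. */
static int put_u32_or_einval(unsigned int val, unsigned int __user *p)
{
        int err;

        __put_user_size(val, p, sizeof(*p), err, -EINVAL);
        return err;
}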

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_64.h | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 8130876..6532d63 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -37,7 +37,7 @@ extern void __put_user_bad(void);
#define __put_user_nocheck(x, ptr, size) \
({ \
int __pu_err; \
- __put_user_size((x), (ptr), (size), __pu_err); \
+ __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \
__pu_err; \
})

@@ -65,22 +65,22 @@ extern void __put_user_bad(void);
__pu_err; \
})

-#define __put_user_size(x, ptr, size, retval) \
+#define __put_user_size(x, ptr, size, retval, errret) \
do { \
retval = 0; \
__chk_user_ptr(ptr); \
switch (size) { \
case 1: \
- __put_user_asm(x, ptr, retval, "b", "b", "iq", -EFAULT);\
+ __put_user_asm(x, ptr, retval, "b", "b", "iq", errret);\
break; \
case 2: \
- __put_user_asm(x, ptr, retval, "w", "w", "ir", -EFAULT);\
+ __put_user_asm(x, ptr, retval, "w", "w", "ir", errret);\
break; \
case 4: \
- __put_user_asm(x, ptr, retval, "l", "k", "ir", -EFAULT);\
+ __put_user_asm(x, ptr, retval, "l", "k", "ir", errret);\
break; \
case 8: \
- __put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT); \
+ __put_user_asm(x, ptr, retval, "q", "", "Zr", errret); \
break; \
default: \
__put_user_bad(); \
--
1.5.5.1

2008-06-27 21:49:34

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 31/39] merge __put_user_asm and its user.

Move both __put_user_asm and __put_user_size to
uaccess.h. i386 already had a special function for 64-bit access,
so for x86_64, we just define a macro with the same name.
Note that for x86_64, CONFIG_X86_WP_WORKS_OK will always
be defined, so the #else part will never even be compiled in.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 84 ++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/uaccess_32.h | 77 --------------------------------------
include/asm-x86/uaccess_64.h | 51 -------------------------
3 files changed, 84 insertions(+), 128 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 3721513..8479be3 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -179,6 +179,90 @@ extern int __get_user_bad(void);
__ret_gu; \
})

+#ifdef CONFIG_X86_32
+#define __put_user_u64(x, addr, err) \
+ asm volatile("1: movl %%eax,0(%2)\n" \
+ "2: movl %%edx,4(%2)\n" \
+ "3:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "4: movl %3,%0\n" \
+ " jmp 3b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 4b) \
+ _ASM_EXTABLE(2b, 4b) \
+ : "=r" (err) \
+ : "A" (x), "r" (addr), "i" (-EFAULT), "0" (err))
+#else
+#define __put_user_u64(x, ptr, retval) \
+ __put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT)
+#endif
+
+#ifdef CONFIG_X86_WP_WORKS_OK
+
+#define __put_user_size(x, ptr, size, retval, errret) \
+do { \
+ retval = 0; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: \
+ __put_user_asm(x, ptr, retval, "b", "b", "iq", errret); \
+ break; \
+ case 2: \
+ __put_user_asm(x, ptr, retval, "w", "w", "ir", errret); \
+ break; \
+ case 4: \
+ __put_user_asm(x, ptr, retval, "l", "k", "ir", errret);\
+ break; \
+ case 8: \
+ __put_user_u64((__typeof__(*ptr))(x), ptr, retval); \
+ break; \
+ default: \
+ __put_user_bad(); \
+ } \
+} while (0)
+
+#else
+
+#define __put_user_size(x, ptr, size, retval, errret) \
+do { \
+ __typeof__(*(ptr))__pus_tmp = x; \
+ retval = 0; \
+ \
+ if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, size) != 0)) \
+ retval = errret; \
+} while (0)
+
+#endif
+
+#define __put_user_nocheck(x, ptr, size) \
+({ \
+ long __pu_err; \
+ __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \
+ __pu_err; \
+})
+
+
+
+/* FIXME: this hack is definitely wrong -AK */
+struct __large_struct { unsigned long buf[100]; };
+#define __m(x) (*(struct __large_struct __user *)(x))
+
+/*
+ * Tell gcc we read from memory instead of writing: this is because
+ * we do not write to any memory gcc knows about, so there are no
+ * aliasing issues.
+ */
+#define __put_user_asm(x, addr, err, itype, rtype, ltype, errret) \
+ asm volatile("1: mov"itype" %"rtype"1,%2\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: mov %3,%0\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=r"(err) \
+ : ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
+

#ifdef CONFIG_X86_32
# include "uaccess_32.h"
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 4c47a5b..fab7557 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -145,83 +145,6 @@ extern void __put_user_8(void);
#define __put_user(x, ptr) \
__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))

-#define __put_user_nocheck(x, ptr, size) \
-({ \
- long __pu_err; \
- __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \
- __pu_err; \
-})
-
-
-#define __put_user_u64(x, addr, err) \
- asm volatile("1: movl %%eax,0(%2)\n" \
- "2: movl %%edx,4(%2)\n" \
- "3:\n" \
- ".section .fixup,\"ax\"\n" \
- "4: movl %3,%0\n" \
- " jmp 3b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 4b) \
- _ASM_EXTABLE(2b, 4b) \
- : "=r" (err) \
- : "A" (x), "r" (addr), "i" (-EFAULT), "0" (err))
-
-#ifdef CONFIG_X86_WP_WORKS_OK
-
-#define __put_user_size(x, ptr, size, retval, errret) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __put_user_asm(x, ptr, retval, "b", "b", "iq", errret); \
- break; \
- case 2: \
- __put_user_asm(x, ptr, retval, "w", "w", "ir", errret); \
- break; \
- case 4: \
- __put_user_asm(x, ptr, retval, "l", "k", "ir", errret);\
- break; \
- case 8: \
- __put_user_u64((__typeof__(*ptr))(x), ptr, retval); \
- break; \
- default: \
- __put_user_bad(); \
- } \
-} while (0)
-
-#else
-
-#define __put_user_size(x, ptr, size, retval, errret) \
-do { \
- __typeof__(*(ptr))__pus_tmp = x; \
- retval = 0; \
- \
- if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, size) != 0)) \
- retval = errret; \
-} while (0)
-
-#endif
-struct __large_struct { unsigned long buf[100]; };
-#define __m(x) (*(struct __large_struct __user *)(x))
-
-/*
- * Tell gcc we read from memory instead of writing: this is because
- * we do not write to any memory gcc knows about, so there are no
- * aliasing issues.
- */
-#define __put_user_asm(x, addr, err, itype, rtype, ltype, errret) \
- asm volatile("1: mov"itype" %"rtype"1,%2\n" \
- "2:\n" \
- ".section .fixup,\"ax\"\n" \
- "3: movl %3,%0\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r"(err) \
- : ltype (x), "m" (__m(addr)), "i" (errret), "0" (err))
-
-
#define __get_user_nocheck(x, ptr, size) \
({ \
long __gu_err; \
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 6532d63..42c01aa 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -34,14 +34,6 @@ extern void __put_user_bad(void);
#define __get_user_unaligned __get_user
#define __put_user_unaligned __put_user

-#define __put_user_nocheck(x, ptr, size) \
-({ \
- int __pu_err; \
- __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \
- __pu_err; \
-})
-
-
#define __put_user_check(x, ptr, size) \
({ \
int __pu_err; \
@@ -65,49 +57,6 @@ extern void __put_user_bad(void);
__pu_err; \
})

-#define __put_user_size(x, ptr, size, retval, errret) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __put_user_asm(x, ptr, retval, "b", "b", "iq", errret);\
- break; \
- case 2: \
- __put_user_asm(x, ptr, retval, "w", "w", "ir", errret);\
- break; \
- case 4: \
- __put_user_asm(x, ptr, retval, "l", "k", "ir", errret);\
- break; \
- case 8: \
- __put_user_asm(x, ptr, retval, "q", "", "Zr", errret); \
- break; \
- default: \
- __put_user_bad(); \
- } \
-} while (0)
-
-/* FIXME: this hack is definitely wrong -AK */
-struct __large_struct { unsigned long buf[100]; };
-#define __m(x) (*(struct __large_struct __user *)(x))
-
-/*
- * Tell gcc we read from memory instead of writing: this is because
- * we do not write to any memory gcc knows about, so there are no
- * aliasing issues.
- */
-#define __put_user_asm(x, addr, err, itype, rtype, ltype, errno) \
- asm volatile("1: mov"itype" %"rtype"1,%2\n" \
- "2:\n" \
- ".section .fixup, \"ax\"\n" \
- "3: mov %3,%0\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r"(err) \
- : ltype (x), "m" (__m(addr)), "i" (errno), "0" (err))
-
-
#define __get_user_nocheck(x, ptr, size) \
({ \
int __gu_err; \
--
1.5.5.1

2008-06-27 21:49:55

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 32/39] don't always use EFAULT on __get_user_size.

Let the user of the macro specify the desired return value.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_64.h | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 42c01aa..e0875d7 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -61,27 +61,27 @@ extern void __put_user_bad(void);
({ \
int __gu_err; \
unsigned long __gu_val; \
- __get_user_size(__gu_val, (ptr), (size), __gu_err); \
+ __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT);\
(x) = (__force typeof(*(ptr)))__gu_val; \
__gu_err; \
})

-#define __get_user_size(x, ptr, size, retval) \
+#define __get_user_size(x, ptr, size, retval, errret) \
do { \
retval = 0; \
__chk_user_ptr(ptr); \
switch (size) { \
case 1: \
- __get_user_asm(x, ptr, retval, "b", "b", "=q", -EFAULT);\
+ __get_user_asm(x, ptr, retval, "b", "b", "=q", errret);\
break; \
case 2: \
- __get_user_asm(x, ptr, retval, "w", "w", "=r", -EFAULT);\
+ __get_user_asm(x, ptr, retval, "w", "w", "=r", errret);\
break; \
case 4: \
- __get_user_asm(x, ptr, retval, "l", "k", "=r", -EFAULT);\
+ __get_user_asm(x, ptr, retval, "l", "k", "=r", errret);\
break; \
case 8: \
- __get_user_asm(x, ptr, retval, "q", "", "=r", -EFAULT); \
+ __get_user_asm(x, ptr, retval, "q", "", "=r", errret); \
break; \
default: \
(x) = __get_user_bad(); \
--
1.5.5.1

2008-06-27 21:50:33

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 33/39] merge __get_user_asm and its users.

Move __get_user_asm, __get_user_size, and __get_user_nocheck
to uaccess.h. This requires us to define a macro in __get_user_size
for the 64-bit access case.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 50 +++++++++++++++++++++++++++++++++++++++++-
include/asm-x86/uaccess_32.h | 41 ----------------------------------
include/asm-x86/uaccess_64.h | 43 ------------------------------------
3 files changed, 49 insertions(+), 85 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 8479be3..4ff31c5 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -234,6 +234,47 @@ do { \

#endif

+#ifdef CONFIG_X86_32
+#define __get_user_asm_u64(x, ptr, retval, errret) (x) = __get_user_bad()
+#else
+#define __get_user_asm_u64(x, ptr, retval, errret) \
+ __get_user_asm(x, ptr, retval, "q", "", "=r", errret)
+#endif
+
+#define __get_user_size(x, ptr, size, retval, errret) \
+do { \
+ retval = 0; \
+ __chk_user_ptr(ptr); \
+ switch (size) { \
+ case 1: \
+ __get_user_asm(x, ptr, retval, "b", "b", "=q", errret); \
+ break; \
+ case 2: \
+ __get_user_asm(x, ptr, retval, "w", "w", "=r", errret); \
+ break; \
+ case 4: \
+ __get_user_asm(x, ptr, retval, "l", "k", "=r", errret); \
+ break; \
+ case 8: \
+ __get_user_asm_u64(x, ptr, retval, errret); \
+ break; \
+ default: \
+ (x) = __get_user_bad(); \
+ } \
+} while (0)
+
+#define __get_user_asm(x, addr, err, itype, rtype, ltype, errret) \
+ asm volatile("1: mov"itype" %2,%"rtype"1\n" \
+ "2:\n" \
+ ".section .fixup,\"ax\"\n" \
+ "3: mov %3,%0\n" \
+ " xor"itype" %"rtype"1,%"rtype"1\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE(1b, 3b) \
+ : "=r" (err), ltype(x) \
+ : "m" (__m(addr)), "i" (errret), "0" (err))
+
#define __put_user_nocheck(x, ptr, size) \
({ \
long __pu_err; \
@@ -241,7 +282,14 @@ do { \
__pu_err; \
})

-
+#define __get_user_nocheck(x, ptr, size) \
+({ \
+ long __gu_err; \
+ unsigned long __gu_val; \
+ __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \
+ (x) = (__force __typeof__(*(ptr)))__gu_val; \
+ __gu_err; \
+})

/* FIXME: this hack is definitely wrong -AK */
struct __large_struct { unsigned long buf[100]; };
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index fab7557..ebfe6b2 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -145,47 +145,6 @@ extern void __put_user_8(void);
#define __put_user(x, ptr) \
__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))

-#define __get_user_nocheck(x, ptr, size) \
-({ \
- long __gu_err; \
- unsigned long __gu_val; \
- __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \
- (x) = (__typeof__(*(ptr)))__gu_val; \
- __gu_err; \
-})
-
-#define __get_user_size(x, ptr, size, retval, errret) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __get_user_asm(x, ptr, retval, "b", "b", "=q", errret); \
- break; \
- case 2: \
- __get_user_asm(x, ptr, retval, "w", "w", "=r", errret); \
- break; \
- case 4: \
- __get_user_asm(x, ptr, retval, "l", "", "=r", errret); \
- break; \
- default: \
- (x) = __get_user_bad(); \
- } \
-} while (0)
-
-#define __get_user_asm(x, addr, err, itype, rtype, ltype, errret) \
- asm volatile("1: mov"itype" %2,%"rtype"1\n" \
- "2:\n" \
- ".section .fixup,\"ax\"\n" \
- "3: movl %3,%0\n" \
- " xor"itype" %"rtype"1,%"rtype"1\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r" (err), ltype (x) \
- : "m" (__m(addr)), "i" (errret), "0" (err))
-
-
unsigned long __must_check __copy_to_user_ll
(void __user *to, const void *from, unsigned long n);
unsigned long __must_check __copy_from_user_ll
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index e0875d7..42a9769 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -57,49 +57,6 @@ extern void __put_user_bad(void);
__pu_err; \
})

-#define __get_user_nocheck(x, ptr, size) \
-({ \
- int __gu_err; \
- unsigned long __gu_val; \
- __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT);\
- (x) = (__force typeof(*(ptr)))__gu_val; \
- __gu_err; \
-})
-
-#define __get_user_size(x, ptr, size, retval, errret) \
-do { \
- retval = 0; \
- __chk_user_ptr(ptr); \
- switch (size) { \
- case 1: \
- __get_user_asm(x, ptr, retval, "b", "b", "=q", errret);\
- break; \
- case 2: \
- __get_user_asm(x, ptr, retval, "w", "w", "=r", errret);\
- break; \
- case 4: \
- __get_user_asm(x, ptr, retval, "l", "k", "=r", errret);\
- break; \
- case 8: \
- __get_user_asm(x, ptr, retval, "q", "", "=r", errret); \
- break; \
- default: \
- (x) = __get_user_bad(); \
- } \
-} while (0)
-
-#define __get_user_asm(x, addr, err, itype, rtype, ltype, errno) \
- asm volatile("1: mov"itype" %2,%"rtype"1\n" \
- "2:\n" \
- ".section .fixup, \"ax\"\n" \
- "3: mov %3,%0\n" \
- " xor"itype" %"rtype"1,%"rtype"1\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(1b, 3b) \
- : "=r" (err), ltype (x) \
- : "m" (__m(addr)), "i"(errno), "0"(err))
-
/*
* Copy To/From Userspace
*/
--
1.5.5.1

2008-06-27 21:50:57

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 35/39] turn __put_user_check directly into put_user.

We now also check the user pointer in x86_64's put_user, the way i386 does.

This is done in a separate patch for bisecting purposes.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_64.h | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 42a9769..9139854 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -23,9 +23,6 @@ extern void __put_user_bad(void);
:"c" (ptr),"a" (x) \
:"ebx")

-#define put_user(x, ptr) \
- __put_user_check((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
-
#define __get_user(x, ptr) \
__get_user_nocheck((x), (ptr), sizeof(*(ptr)))
#define __put_user(x, ptr) \
@@ -34,11 +31,12 @@ extern void __put_user_bad(void);
#define __get_user_unaligned __get_user
#define __put_user_unaligned __put_user

-#define __put_user_check(x, ptr, size) \
+#define put_user(x, ptr) \
({ \
int __pu_err; \
typeof(*(ptr)) __user *__pu_addr = (ptr); \
- switch (size) { \
+ __chk_user_ptr(ptr); \
+ switch (sizeof(*(ptr))) { \
case 1: \
__put_user_x(1, __pu_err, x, __pu_addr); \
break; \
--
1.5.5.1

2008-06-27 21:51:26

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 36/39] merge put_user

move both versions, which are highly similar, to uaccess.h.
Note that, for x86_64, X86_WP_WORKS_OK is always defined.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 74 +++++++++++++++++++++++++++++++++++++++
include/asm-x86/uaccess_32.h | 79 ------------------------------------------
include/asm-x86/uaccess_64.h | 36 -------------------
3 files changed, 74 insertions(+), 115 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 4ff31c5..583cc48 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -179,6 +179,12 @@ extern int __get_user_bad(void);
__ret_gu; \
})

+#define __put_user_x(size, x, ptr, __ret_pu) \
+ asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
+ :"0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
+
+
+
#ifdef CONFIG_X86_32
#define __put_user_u64(x, addr, err) \
asm volatile("1: movl %%eax,0(%2)\n" \
@@ -192,13 +198,71 @@ extern int __get_user_bad(void);
_ASM_EXTABLE(2b, 4b) \
: "=r" (err) \
: "A" (x), "r" (addr), "i" (-EFAULT), "0" (err))
+
+#define __put_user_x8(x, ptr, __ret_pu) \
+ asm volatile("call __put_user_8" : "=a" (__ret_pu) \
+ : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
#else
#define __put_user_u64(x, ptr, retval) \
__put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT)
+#define __put_user_x8(x, ptr, __ret_pu) __put_user_x(8, x, ptr, __ret_pu)
#endif

+extern void __put_user_bad(void);
+
+/*
+ * Strange magic calling convention: pointer in %ecx,
+ * value in %eax(:%edx), return value in %eax. clobbers %rbx
+ */
+extern void __put_user_1(void);
+extern void __put_user_2(void);
+extern void __put_user_4(void);
+extern void __put_user_8(void);
+
#ifdef CONFIG_X86_WP_WORKS_OK

+/**
+ * put_user: - Write a simple value into user space.
+ * @x: Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+#define put_user(x, ptr) \
+({ \
+ int __ret_pu; \
+ __typeof__(*(ptr)) __pu_val; \
+ __chk_user_ptr(ptr); \
+ __pu_val = x; \
+ switch (sizeof(*(ptr))) { \
+ case 1: \
+ __put_user_x(1, __pu_val, ptr, __ret_pu); \
+ break; \
+ case 2: \
+ __put_user_x(2, __pu_val, ptr, __ret_pu); \
+ break; \
+ case 4: \
+ __put_user_x(4, __pu_val, ptr, __ret_pu); \
+ break; \
+ case 8: \
+ __put_user_x8(__pu_val, ptr, __ret_pu); \
+ break; \
+ default: \
+ __put_user_x(X, __pu_val, ptr, __ret_pu); \
+ break; \
+ } \
+ __ret_pu; \
+})
+
#define __put_user_size(x, ptr, size, retval, errret) \
do { \
retval = 0; \
@@ -232,6 +296,16 @@ do { \
retval = errret; \
} while (0)

+#define put_user(x, ptr) \
+({ \
+ int __ret_pu; \
+ __typeof__(*(ptr))__pus_tmp = x; \
+ __ret_pu = 0; \
+ if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, \
+ sizeof(*(ptr))) != 0)) \
+ __ret_pu = -EFAULT; \
+ __ret_pu; \
+})
#endif

#ifdef CONFIG_X86_32
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 2c90673..e5c0437 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -20,85 +20,6 @@ extern struct movsl_mask {
} ____cacheline_aligned_in_smp movsl_mask;
#endif

-extern void __put_user_bad(void);
-
-/*
- * Strange magic calling convention: pointer in %ecx,
- * value in %eax(:%edx), return value in %eax, no clobbers.
- */
-extern void __put_user_1(void);
-extern void __put_user_2(void);
-extern void __put_user_4(void);
-extern void __put_user_8(void);
-
-#define __put_user_x(size, x, ptr, __ret_pu) \
- asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
- :"0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
-
-#define __put_user_8(x, ptr, __ret_pu) \
- asm volatile("call __put_user_8" : "=a" (__ret_pu) \
- : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
-
-
-/**
- * put_user: - Write a simple value into user space.
- * @x: Value to copy to user space.
- * @ptr: Destination address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple value from kernel space to user
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and @x must be assignable
- * to the result of dereferencing @ptr.
- *
- * Returns zero on success, or -EFAULT on error.
- */
-#ifdef CONFIG_X86_WP_WORKS_OK
-
-#define put_user(x, ptr) \
-({ \
- int __ret_pu; \
- __typeof__(*(ptr)) __pu_val; \
- __chk_user_ptr(ptr); \
- __pu_val = x; \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __put_user_x(1, __pu_val, ptr, __ret_pu); \
- break; \
- case 2: \
- __put_user_x(2, __pu_val, ptr, __ret_pu); \
- break; \
- case 4: \
- __put_user_x(4, __pu_val, ptr, __ret_pu); \
- break; \
- case 8: \
- __put_user_8(__pu_val, ptr, __ret_pu); \
- break; \
- default: \
- __put_user_x(X, __pu_val, ptr, __ret_pu); \
- break; \
- } \
- __ret_pu; \
-})
-
-#else
-#define put_user(x, ptr) \
-({ \
- int __ret_pu; \
- __typeof__(*(ptr))__pus_tmp = x; \
- __ret_pu = 0; \
- if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, \
- sizeof(*(ptr))) != 0)) \
- __ret_pu = -EFAULT; \
- __ret_pu; \
-})
-
-
-#endif
-
/**
* __get_user: - Get a simple variable from user space, with less checking.
* @x: Variable to store result.
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 9139854..2e75a5d 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -11,18 +11,6 @@

#define ARCH_HAS_SEARCH_EXTABLE

-extern void __put_user_1(void);
-extern void __put_user_2(void);
-extern void __put_user_4(void);
-extern void __put_user_8(void);
-extern void __put_user_bad(void);
-
-#define __put_user_x(size, ret, x, ptr) \
- asm volatile("call __put_user_" #size \
- :"=a" (ret) \
- :"c" (ptr),"a" (x) \
- :"ebx")
-
#define __get_user(x, ptr) \
__get_user_nocheck((x), (ptr), sizeof(*(ptr)))
#define __put_user(x, ptr) \
@@ -31,30 +19,6 @@ extern void __put_user_bad(void);
#define __get_user_unaligned __get_user
#define __put_user_unaligned __put_user

-#define put_user(x, ptr) \
-({ \
- int __pu_err; \
- typeof(*(ptr)) __user *__pu_addr = (ptr); \
- __chk_user_ptr(ptr); \
- switch (sizeof(*(ptr))) { \
- case 1: \
- __put_user_x(1, __pu_err, x, __pu_addr); \
- break; \
- case 2: \
- __put_user_x(2, __pu_err, x, __pu_addr); \
- break; \
- case 4: \
- __put_user_x(4, __pu_err, x, __pu_addr); \
- break; \
- case 8: \
- __put_user_x(8, __pu_err, x, __pu_addr); \
- break; \
- default: \
- __put_user_bad(); \
- } \
- __pu_err; \
-})
-
/*
* Copy To/From Userspace
*/
--
1.5.5.1

2008-06-27 21:51:46

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 37/39] move __get_user and __put_user into uaccess.h

We also carry the unaligned version with us. Only x86_64 uses
it, but there's no problem in defining it.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 47 ++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/uaccess_32.h | 46 -----------------------------------------
include/asm-x86/uaccess_64.h | 8 -------
3 files changed, 47 insertions(+), 54 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 583cc48..4ebb992 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -384,7 +384,54 @@ struct __large_struct { unsigned long buf[100]; };
_ASM_EXTABLE(1b, 3b) \
: "=r"(err) \
: ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
+/**
+ * __get_user: - Get a simple variable from user space, with less checking.
+ * @x: Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+
+#define __get_user(x, ptr) \
+ __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
+/**
+ * __put_user: - Write a simple value into user space, with less checking.
+ * @x: Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+
+#define __put_user(x, ptr) \
+ __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))

+#define __get_user_unaligned __get_user
+#define __put_user_unaligned __put_user

#ifdef CONFIG_X86_32
# include "uaccess_32.h"
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index e5c0437..d3b5bf8 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -20,52 +20,6 @@ extern struct movsl_mask {
} ____cacheline_aligned_in_smp movsl_mask;
#endif

-/**
- * __get_user: - Get a simple variable from user space, with less checking.
- * @x: Variable to store result.
- * @ptr: Source address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple variable from user space to kernel
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and the result of
- * dereferencing @ptr must be assignable to @x without a cast.
- *
- * Caller must check the pointer with access_ok() before calling this
- * function.
- *
- * Returns zero on success, or -EFAULT on error.
- * On error, the variable @x is set to zero.
- */
-#define __get_user(x, ptr) \
- __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
-
-
-/**
- * __put_user: - Write a simple value into user space, with less checking.
- * @x: Value to copy to user space.
- * @ptr: Destination address, in user space.
- *
- * Context: User context only. This function may sleep.
- *
- * This macro copies a single simple value from kernel space to user
- * space. It supports simple types like char and int, but not larger
- * data types like structures or arrays.
- *
- * @ptr must have pointer-to-simple-variable type, and @x must be assignable
- * to the result of dereferencing @ptr.
- *
- * Caller must check the pointer with access_ok() before calling this
- * function.
- *
- * Returns zero on success, or -EFAULT on error.
- */
-#define __put_user(x, ptr) \
- __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
-
unsigned long __must_check __copy_to_user_ll
(void __user *to, const void *from, unsigned long n);
unsigned long __must_check __copy_from_user_ll
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index 2e75a5d..b5bacd6 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -11,14 +11,6 @@

#define ARCH_HAS_SEARCH_EXTABLE

-#define __get_user(x, ptr) \
- __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
-#define __put_user(x, ptr) \
- __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
-
-#define __get_user_unaligned __get_user
-#define __put_user_unaligned __put_user
-
/*
* Copy To/From Userspace
*/
--
1.5.5.1

2008-06-27 21:52:20

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 34/39] Be more explicit in __put_user_x

For both __put_user_x and __put_user_8 macros, pass the error
variable explicitly.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess_32.h | 14 +++++++-------
1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index ebfe6b2..2c90673 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -31,11 +31,11 @@ extern void __put_user_2(void);
extern void __put_user_4(void);
extern void __put_user_8(void);

-#define __put_user_x(size, x, ptr) \
+#define __put_user_x(size, x, ptr, __ret_pu) \
asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
:"0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

-#define __put_user_8(x, ptr) \
+#define __put_user_8(x, ptr, __ret_pu) \
asm volatile("call __put_user_8" : "=a" (__ret_pu) \
: "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")

@@ -66,19 +66,19 @@ extern void __put_user_8(void);
__pu_val = x; \
switch (sizeof(*(ptr))) { \
case 1: \
- __put_user_x(1, __pu_val, ptr); \
+ __put_user_x(1, __pu_val, ptr, __ret_pu); \
break; \
case 2: \
- __put_user_x(2, __pu_val, ptr); \
+ __put_user_x(2, __pu_val, ptr, __ret_pu); \
break; \
case 4: \
- __put_user_x(4, __pu_val, ptr); \
+ __put_user_x(4, __pu_val, ptr, __ret_pu); \
break; \
case 8: \
- __put_user_8(__pu_val, ptr); \
+ __put_user_8(__pu_val, ptr, __ret_pu); \
break; \
default: \
- __put_user_x(X, __pu_val, ptr); \
+ __put_user_x(X, __pu_val, ptr, __ret_pu); \
break; \
} \
__ret_pu; \
--
1.5.5.1

2008-06-27 21:52:39

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 38/39] put movsl_mask into uaccess.h

x86_64 does not need it, but it won't have X86_INTEL_USERCOPY
defined either.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 9 +++++++++
include/asm-x86/uaccess_32.h | 9 ---------
2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index 4ebb992..ddc32fe 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -433,6 +433,15 @@ struct __large_struct { unsigned long buf[100]; };
#define __get_user_unaligned __get_user
#define __put_user_unaligned __put_user

+/*
+ * movsl can be slow when source and dest are not both 8-byte aligned
+ */
+#ifdef CONFIG_X86_INTEL_USERCOPY
+extern struct movsl_mask {
+ int mask;
+} ____cacheline_aligned_in_smp movsl_mask;
+#endif
+
#ifdef CONFIG_X86_32
# include "uaccess_32.h"
#else
diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index d3b5bf8..3467749 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -11,15 +11,6 @@
#include <asm/asm.h>
#include <asm/page.h>

-/*
- * movsl can be slow when source and dest are not both 8-byte aligned
- */
-#ifdef CONFIG_X86_INTEL_USERCOPY
-extern struct movsl_mask {
- int mask;
-} ____cacheline_aligned_in_smp movsl_mask;
-#endif
-
unsigned long __must_check __copy_to_user_ll
(void __user *to, const void *from, unsigned long n);
unsigned long __must_check __copy_from_user_ll
--
1.5.5.1

2008-06-27 21:53:07

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 39/39] define architectural characteristics in uaccess.h

Remove them from the arch-specific file.

Signed-off-by: Glauber Costa <[email protected]>
---
include/asm-x86/uaccess.h | 3 +++
include/asm-x86/uaccess_32.h | 2 --
include/asm-x86/uaccess_64.h | 3 ---
3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index ddc32fe..a1e8157 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -442,9 +442,12 @@ extern struct movsl_mask {
} ____cacheline_aligned_in_smp movsl_mask;
#endif

+#define ARCH_HAS_NOCACHE_UACCESS 1
+
#ifdef CONFIG_X86_32
# include "uaccess_32.h"
#else
+# define ARCH_HAS_SEARCH_EXTABLE
# include "uaccess_64.h"
#endif

diff --git a/include/asm-x86/uaccess_32.h b/include/asm-x86/uaccess_32.h
index 3467749..6fdef39 100644
--- a/include/asm-x86/uaccess_32.h
+++ b/include/asm-x86/uaccess_32.h
@@ -156,8 +156,6 @@ __copy_from_user(void *to, const void __user *from, unsigned long n)
return __copy_from_user_ll(to, from, n);
}

-#define ARCH_HAS_NOCACHE_UACCESS
-
static __always_inline unsigned long __copy_from_user_nocache(void *to,
const void __user *from, unsigned long n)
{
diff --git a/include/asm-x86/uaccess_64.h b/include/asm-x86/uaccess_64.h
index b5bacd6..4e3ec00 100644
--- a/include/asm-x86/uaccess_64.h
+++ b/include/asm-x86/uaccess_64.h
@@ -9,8 +9,6 @@
#include <linux/prefetch.h>
#include <asm/page.h>

-#define ARCH_HAS_SEARCH_EXTABLE
-
/*
* Copy To/From Userspace
*/
@@ -180,7 +178,6 @@ __copy_to_user_inatomic(void __user *dst, const void *src, unsigned size)
return copy_user_generic((__force void *)dst, src, size);
}

-#define ARCH_HAS_NOCACHE_UACCESS 1
extern long __copy_user_nocache(void *dst, const void __user *src,
unsigned size, int zerorest);

--
1.5.5.1

2008-06-27 23:19:07

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 08/39] don't use word-size specifiers

Glauber Costa wrote:
> since the instructions refer to registers, they'll be able
> to figure it out.
>
> Signed-off-by: Glauber Costa <[email protected]>

> diff --git a/arch/x86/lib/getuser_32.S b/arch/x86/lib/getuser_32.S
> index 6d84b53..8200fde 100644
> --- a/arch/x86/lib/getuser_32.S
> +++ b/arch/x86/lib/getuser_32.S
> @@ -29,44 +29,44 @@
> ENTRY(__get_user_1)
> CFI_STARTPROC
> GET_THREAD_INFO(%edx)
> - cmpl TI_addr_limit(%edx),%eax
> + cmp TI_addr_limit(%edx),%eax
> jae bad_get_user
> -1: movzbl (%eax),%edx
> - xorl %eax,%eax
> +1: movzb (%eax),%edx
> + xor %eax,%eax
> ret
> CFI_ENDPROC

I hate to say it, but I really think this is a step backwards in
readability. Consistency is a good thing, and with the suffixes in
place we are consistent between instructions that refer to memory and
instructions that refer to registers. We also get one more check on
things, where the assembler can tell the programmer he probably typoed.

So I would prefer if we *didn't* go down this route, except for explicit
unification, but that's not the case here (since the size is still
explicit in the register names.)
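
To illustrate the kind of check at stake (a made-up example, not from
this patch): with an explicit suffix the assembler rejects a mistyped
operand size outright, while the suffix-less form silently assembles at
whatever width the register implies:

	movl %ax,(%edx)		# rejected: 16-bit register with 'l' suffix
	mov %ax,(%edx)		# accepted: silently becomes a 16-bit store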

-hpa

2008-06-27 23:23:20

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 08/39] don't use word-size specifiers

H. Peter Anvin wrote:
>
> I hate to say it, but I really think this is a step backwards in
> readability. Consistency is a good thing, and with the suffixes in
> place we are consistent between instructions that refer to memory and
> instructions that refer to registers. We also get one more check on
> things, where the assembler can tell the programmer he probably typoed.
>
> So I would prefer if we *didn't* go down this route, except for explicit
> unification, but that's not the case here (since the size is still
> explicit in the register names.)
>

Okay, I didn't really explain what I meant very well here...

Obviously, most of your patch series is all about unification, and that
is a Good Thing, and thank you for doing it. What I was trying to say
was that it is not obvious from just reading the patchset what changes
are necessary for unification, and which ones are stylistic changes. If
*all* the changes are unification, please just say so and disregard this
remark, and I'll go ahead and apply your patchset.

-hpa

2008-06-28 01:48:17

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 08/39] don't use word-size specifiers

H. Peter Anvin wrote:
> H. Peter Anvin wrote:
>>
>> I hate to say it, but I really think this is a step backwards in
>> readability. Consistency is a good thing, and with the suffixes in
>> place we are consistent between instructions that refer to memory and
>> instructions that refer to registers. We also get one more check on
>> things, where the assembler can tell the programmer he probably typoed.
>>
>> So I would prefer if we *didn't* go down this route, except for
>> explicit unification, but that's not the case here (since the size is
>> still explicit in the register names.)
>>
>
> Okay, I didn't really explain what I meant very well here...
>
> Obviously, most of your patch series is all about unification, and that
> is a Good Thing, and thank you for doing it. What I was trying to say
> was that it is not obvious from just reading the patchset what changes
> are necessary for unification, and which ones are stylistic changes. If
> *all* the changes are unification, please just say so and disregard this
> remark, and I'll go ahead and apply your patchset.
>
> -hpa
They're all about unification, but I split it for bisectability, as Ingo
has requested many times (and I came to agree with it completely
after a while ;-)).

So, exactly because not using size specifiers can introduce bugs here, I
did it in a separate patch. But it all ends at unification in the end.

Sorry if the intention was not explicit enough.

2008-06-28 05:15:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 08/39] don't use word-size specifiers

Glauber Costa wrote:
> They're all about unification, but I split it for bisectability, as Ingo
> has requested many times (and I came to agree with it completely
> after a while ;-)).
>
> So, exactly because not using size specifiers can introduce bugs here, I
> did it in a separate patch. But it all ends at unification in the end.
>
> Sorry if the intention was not explicit enough.

No problem, I just wanted to verify. Thank you :)

-hpa

2008-06-28 12:09:56

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 12/39] introduce __ASM_REG macro

Glauber Costa <[email protected]> writes:

> There are situations in which the architecture wants to use the
> register that represents its word-size, whatever it is. For those,
> introduce __ASM_REG in asm.h, along with the first users _ASM_AX
> and _ASM_DX. They have users waiting for it, namely the getuser
> functions.

FYI the ABI of the 64bit get_user was designed for minimal
code size originally. You should check for regressions if you
change that.

-Andi

2008-06-28 12:12:47

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 17/39] clobber rbx in putuser_64.S

Glauber Costa <[email protected]> writes:

> Instead of clobbering r8, clobber rbx, which is the i386 way.

Note rbx is callee saved on 64bit, so using that one means
the surrounding function always has to save explicitely.
Not the case with r8.

There's a reason it is the way it is.
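
As a hypothetical sketch of the cost (illustrative, not actual compiler
output): a surrounding function that needs a scratch register around the
call ends up with a save/restore pair when %rbx is the clobber, since
%rbx is callee-saved; %r8 is caller-saved and costs nothing:

	push %rbx		# forced save: %rbx is callee-saved, yet clobbered
	...
	call __put_user_4	# may trash %rbx
	...
	pop %rbx		# matching restore
				# with %r8 as the clobber, neither is emitted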

-Andi

2008-06-28 12:23:37

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 15/39] don't save ebx in putuser_32.S

Glauber Costa <[email protected]> writes:

> clobber it in the inline asm macros, and let the compiler do this for us.

I would expect that definitely will cause code size regressions ...

-Andi

2008-06-30 05:12:17

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 0/39] Merge files at x86/lib

Glauber Costa wrote:
> Hey folks,
>
> Here it goes a series of patch that merges some user-related files.
> From x86/lib, delay.c, getuser.S, and putuser.S are merged. For the
> last two of them, the accompanying include/asm-x86/uaccess.h is merged.
> Or close to. There are some small leftovers, that are sufficiently
> different to remain in its own files.
>
> As for bisectability, all patches have been tested in more than 20
> different configs for both i386 and x86_64, in the usual way (just
> I'm testing in more configs now). If you find a build bug in this
> series, please send me the offending config so I can add to my poll.
>
> The diffstat and the one-big-patch follows at the end of this introductory
> message.
>
> Ingo, in the absence of any objections, you can pull this work from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/glommer/linux-2.6-x86-integration.git master
>
> into your tip/master tree
>

Hi Glauber,

Applied to -tip as x86/unify-lib.

-hpa

2008-06-30 06:31:03

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.


* Glauber Costa <[email protected]> wrote:

> common parts of uaccess_32.h and uaccess_64.h
> are put in uaccess.h.

-tip testing found that it causes this build failure:

fs/binfmt_aout.c: Assembler messages:
fs/binfmt_aout.c:152: Error: suffix or operands invalid for `cmp'

with:

http://redhat.com/~mingo/misc/config-Mon_Jun_30_08_17_42_CEST_2008.bad

and comparing the 32-bit and unified version is not simple and the
commit is rather large.

I'm sure the fix is simple, but this bug shows a structural problem with
this unification patch. The proper way to unify files is to first bring
both the 32-bit and the 64-bit version up to a unified form via
finegrained changes, so that uaccess_32.h and uaccess_64.h becomes
exactly the same file.

... _then_ only, in a final 'mechanic unification' step the two files
are merged into uaccess.h. (but no change is done to the content)

If anything breaks during such a series it's bisectable to a finegrained
patch on either the 32-bit or the 64-bit side. If this commit was shaped
that way i could now report to you the exact bisection result - instead
of this too-broad bisection result.

So please rework this commit in that fashion (not just to fix this
breakage but in anticipation of future commits) - uaccess.h is central
enough for us to be super careful about it.

Ingo

2008-06-30 06:32:32

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.


* Ingo Molnar <[email protected]> wrote:

> * Glauber Costa <[email protected]> wrote:
>
> > common parts of uaccess_32.h and uaccess_64.h
> > are put in uaccess.h.
>
> -tip testing found that it causes this build failure:
>
> fs/binfmt_aout.c: Assembler messages:
> fs/binfmt_aout.c:152: Error: suffix or operands invalid for `cmp'
>
> with:
>
> http://redhat.com/~mingo/misc/config-Mon_Jun_30_08_17_42_CEST_2008.bad

there was another merge fallout as well, when i merged it into
tip/master - see below.

Ingo

------------->
commit d0a1b893fd764ae722cef905bf282c52b768feb1
Author: Ingo Molnar <[email protected]>
Date: Mon Jun 30 08:05:44 2008 +0200

- fix merge fallout between x86/unify-lib and tracing/nmisafe

Signed-off-by: Ingo Molnar <[email protected]>

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 32c5ffd..19c7210 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -765,7 +765,7 @@ retint_signal:
/* Returning to kernel space from exception. */
/* rcx: threadinfo. interrupts off. */
ENTRY(retexc_kernel)
- testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx)
+ testl $HARDNMI_MASK, TI_preempt_count(%rcx)
jz retint_kernel /* Not nested over NMI ? */
testw $X86_EFLAGS_TF,EFLAGS-ARGOFFSET(%rsp) /* trap flag? */
jnz retint_kernel /*
@@ -946,7 +946,7 @@ paranoid_restore_no_nmi\trace:
jmp irq_return
paranoid_restore\trace:
GET_THREAD_INFO(%rcx)
- testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx)
+ testl $HARDNMI_MASK, TI_preempt_count(%rcx)
jz paranoid_restore_no_nmi\trace /* Nested over NMI ? */
testw $X86_EFLAGS_TF,EFLAGS-0(%rsp) /* trap flag? */
jnz paranoid_restore_no_nmi\trace

2008-06-30 16:32:55

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.

Ingo Molnar wrote:
>
> there was another merge fallout as well, when i merged it into
> tip/master - see below.
>
> Ingo
>

This was actually in the original patch, but I had to remove it when
applying it on top of auto-x86-next.

-hpa

2008-06-30 18:49:18

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.

Ingo Molnar wrote:
> * Glauber Costa <[email protected]> wrote:
>
>> common parts of uaccess_32.h and uaccess_64.h
>> are put in uaccess.h.
>
> -tip testing found that it causes this build failure:
>
> fs/binfmt_aout.c: Assembler messages:
> fs/binfmt_aout.c:152: Error: suffix or operands invalid for `cmp'
>
> with:
>
> http://redhat.com/~mingo/misc/config-Mon_Jun_30_08_17_42_CEST_2008.bad
>
> and comparing the 32-bit and unified version is not simple and the
> commit is rather large.
>
> I'm sure the fix is simple, but this bug shows a structural problem with
> this unification patch. The proper way to unify files is to first bring
> both the 32-bit and the 64-bit version up to a unified form via
> finegrained changes, so that uaccess_32.h and uaccess_64.h become
> exactly the same file.
>
> ... _then_ only, in a final 'mechanic unification' step the two files
> are merged into uaccess.h. (but no change is done to the content)
>
> If anything breaks during such a series it's bisectable to a finegrained
> patch on either the 32-bit or the 64-bit side. If this commit was shaped
> that way i could now report to you the exact bisection result - instead
> of this too-broad bisection result.
>
> So please rework this commit in that fashion (not just to fix this
> breakage but in anticipation of future commits) - uaccess.h is central
> enough for us to be super careful about it.
>
> Ingo
Fair.

However, as I wrote in the first patch of the series, I'm not doing a
complete unification of uaccess.h. Part of it is left for future work,
since it's a little bit trickier.

So I didn't have the option of a mechanical move. I did try, however,
to make sure this patch was only a code move, with everything that
goes to the common file being equal in both files.

Needless to say, I failed. ;-) This was for a very tiny piece, but still...

The options I see are:

* to redo the uaccess.h unification this way, making sure a diff between
the diffs of the arch files reports nothing different, or:
* to remove the topmost patches that touch uaccess*.h, and leave only
the ones that integrate the .c and .S files, until I can really
integrate the whole of it.
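
(For the first option, one way to verify it would be a diff of the two
interdiffs, say something like
"diff <(diff uaccess_32.h uaccess.h) <(diff uaccess_64.h uaccess.h)" -
that's a sketch of the idea, not a command that was actually run.)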

For the second, however, although I was careful to make incremental
changes, some small differences may exist. Examples of these differences
are places in which I introduce a few ifdefs. It's close to nothing, but
still not mechanical. Because of that, you might want me to redo the
whole series.

Your call.

2008-06-30 18:53:48

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.


* Glauber Costa <[email protected]> wrote:

> Ingo Molnar wrote:
>> * Glauber Costa <[email protected]> wrote:
>>
>>> common parts of uaccess_32.h and uaccess_64.h
>>> are put in uaccess.h.
>>
>> -tip testing found that it causes this build failure:
>>
>> fs/binfmt_aout.c: Assembler messages:
>> fs/binfmt_aout.c:152: Error: suffix or operands invalid for `cmp'
>>
>> with:
>>
>> http://redhat.com/~mingo/misc/config-Mon_Jun_30_08_17_42_CEST_2008.bad
>>
>> and comparing the 32-bit and unified version is not simple and the
>> commit is rather large.
>>
>> I'm sure the fix is simple, but this bug shows a structural problem
>> with this unification patch. The proper way to unify files is to first
>> bring both the 32-bit and the 64-bit version up to a unified form via
>> finegrained changes, so that uaccess_32.h and uaccess_64.h becomes
>> exactly the same file.
>>
>> ... _then_ only, in a final 'mechanic unification' step the two files
>> are merged into uaccess.h. (but no change is done to the content)
>>
>> If anything breaks during such a series it's bisectable to a
>> finegrained patch on either the 32-bit or the 64-bit side. If this
>> commit was shaped that way i could now report to you the exact
>> bisection result - instead of this too-broad bisection result.
>>
>> So please rework this commit in that fashion (not just to fix this
>> breakage but in anticipation of future commits) - uaccess.h is central
>> enough for us to be super careful about it.
>>
>> Ingo
> Fair.
>
> However, as I wrote in the first patch of the series, I'm not doing a
> complete unification of uaccess.h. Part of it is left for future work,
> since it's a little bit trickier.
>
> So I didn't have the option of a mechanical move. I did try, however,
> to make sure this patch was only a code move, with everything that
> goes to the common file being equal in both files.
>
> Needless to say, I failed. ;-) This was for a very tiny piece, but still...
>
> The options I see are:
>
> * to redo the uaccess.h unification this way, making sure a diff
> between the diffs of the arch files reports nothing different, or:
> * to remove the topmost patches that touch uaccess*.h, and leave only
> the ones that integrate the .c and .S files, until I can really
> integrate the whole of it.
>
> For the second, however, although I was careful to make incremental
> changes, some small differences may exist. Examples of these
> differences are places in which I introduce a few ifdefs. It's close
> to nothing, but still not mechanical. Because of that, you might want
> me to redo the whole series.
>
> Your call.

well the primary worry is the build failure with gcc 4.3.1 that i've
posted. If that's simple to fix we could re-try with your existing
series.

But to be defensive it's always useful to move one component at a time.
Even if you don't end up doing a mechanical unification - the stuff you
move you should be able to claim to be exactly identical. I.e. the final
step can be mechanic in that it unifies exactly the same content (even
though both files still have remaining bits).

Then we'll end up with nice bisection reports to the specific area that
is impacted by a problem.

Ingo

2008-06-30 19:02:28

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.

Ingo Molnar wrote:
> * Glauber Costa <[email protected]> wrote:
>
>> Ingo Molnar wrote:
>>> * Glauber Costa <[email protected]> wrote:
>>>
>>>> common parts of uaccess_32.h and uaccess_64.h
>>>> are put in uaccess.h.
>>> -tip testing found that it causes this build failure:
>>>
>>> fs/binfmt_aout.c: Assembler messages:
>>> fs/binfmt_aout.c:152: Error: suffix or operands invalid for `cmp'
>>>
>>> with:
>>>
>>> http://redhat.com/~mingo/misc/config-Mon_Jun_30_08_17_42_CEST_2008.bad
>>>
>>> and comparing the 32-bit and unified version is not simple and the
>>> commit is rather large.
>>>
>>> I'm sure the fix is simple, but this bug shows a structural problem
>>> with this unification patch. The proper way to unify files is to first
>>> bring both the 32-bit and the 64-bit version up to a unified form via
>>> finegrained changes, so that uaccess_32.h and uaccess_64.h become
>>> exactly the same file.
>>>
>>> ... _then_ only, in a final 'mechanic unification' step the two files
>>> are merged into uaccess.h. (but no change is done to the content)
>>>
>>> If anything breaks during such a series it's bisectable to a
>>> finegrained patch on either the 32-bit or the 64-bit side. If this
>>> commit was shaped that way i could now report to you the exact
>>> bisection result - instead of this too-broad bisection result.
>>>
>>> So please rework this commit in that fashion (not just to fix this
>>> breakage but in anticipation of future commits) - uaccess.h is central
>>> enough for us to be super careful about it.
>>>
>>> Ingo
>> Fair.
>>
>> However, as I wrote in the first patch of the series, I'm not doing a
>> complete unification of uaccess.h. Part of it is left for future work,
>> since it's a little bit trickier.
>>
>> So I didn't have the option of a mechanical move. I did try, however,
>> to make sure this patch was only a code move, with everything that
>> goes to the common file being equal in both files.
>>
>> Needless to say, I failed. ;-) This was for a very tiny piece, but still...
>>
>> The options I see are:
>>
>> * to redo the uaccess.h unification this way, making sure a diff
>> between the diffs of the arch files reports nothing different, or:
>> * to remove the topmost patches that touch uaccess*.h, and leave only
>> the ones that integrate the .c and .S files, until I can really
>> integrate the whole of it.
>>
>> For the second, however, although I was careful to make incremental
>> changes, some small differences may exist. Examples of these
>> differences are places in which I introduce a few ifdefs. It's close
>> to nothing, but still not mechanical. Because of that, you might want
>> me to redo the whole series.
>>
>> Your call.
>
> well the primary worry is the build failure with gcc 4.3.1 that i've
> posted. If that's simple to fix we could re-try with your existing
> series.
>
> But to be defensive it's always useful to move one component at a time.
> Even if you don't end up doing a mechanical unification - the stuff you
> move you should be able to claim to be exactly identical. I.e. the final
> step can be mechanic in that it unifies exactly the same content (even
> though both files still have remaining bits).
>
> Then we'll end up with nice bisection reports to the specific area that
> is impacted by a problem.
>
> Ingo
I already have a fix for that. But I'll repost it in a way in which I
can claim the (part of the) files to be identical. For now, can you trim
the tree at that point? I think it's the best option.

As for bisection, note that I did everything with bisection in mind, so
I do know the importance of it. It's more a failure than a fundamental
mistake.

Thanks.

2008-06-30 19:30:36

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 25/39] merge common parts of uaccess.


* Glauber Costa <[email protected]> wrote:

> Ingo Molnar wrote:
>> * Glauber Costa <[email protected]> wrote:
>>
>>> Ingo Molnar wrote:
>>>> * Glauber Costa <[email protected]> wrote:
>>>>
>>>>> common parts of uaccess_32.h and uaccess_64.h
>>>>> are put in uaccess.h.
>>>> -tip testing found that it causes this build failure:
>>>>
>>>> fs/binfmt_aout.c: Assembler messages:
>>>> fs/binfmt_aout.c:152: Error: suffix or operands invalid for `cmp'
>>>>
>>>> with:
>>>>
>>>> http://redhat.com/~mingo/misc/config-Mon_Jun_30_08_17_42_CEST_2008.bad
>>>>
>>>> and comparing the 32-bit and unified version is not simple and the
>>>> commit is rather large.
>>>>
>>>> I'm sure the fix is simple, but this bug shows a structural problem
>>>> with this unification patch. The proper way to unify files is to
>>>> first bring both the 32-bit and the 64-bit version up to a unified
>>>> form via finegrained changes, so that uaccess_32.h and
>>>> uaccess_64.h becomes exactly the same file.
>>>>
>>>> ... _then_ only, in a final 'mechanic unification' step the two
>>>> files are merged into uaccess.h. (but no change is done to the
>>>> content)
>>>>
>>>> If anything breaks during such a series it's bisectable to a
>>>> finegrained patch on either the 32-bit or the 64-bit side. If this
>>>> commit was shaped that way i could now report to you the exact
>>>> bisection result - instead of this too-broad bisection result.
>>>>
>>>> So please rework this commit in that fashion (not just to fix this
>>>> breakage but in anticipation of future commits) - uaccess.h is
>>>> central enough for us to be super careful about it.
>>>>
>>>> Ingo
>>> Fair.
>>>
>>> However, as I wrote in the first patch of the series, I'm not doing a
>>> complete unification of uaccess.h. Part of it is left for future
>>> work, since it's a little bit trickier.
>>>
>>> So I didn't have the option of a mechanical move. I did try,
>>> however, to make sure this patch was only a code move, with
>>> everything that goes to the common file being equal in both
>>> files.
>>>
>>> Needless to say, I failed. ;-) This was for a very tiny piece, but still...
>>>
>>> The options I see are:
>>>
>>> * to redo the uaccess.h unification this way, making sure a diff
>>> between the diffs of the arch files reports nothing different, or:
>>> * to remove the topmost patches that touch uaccess*.h, and leave only
>>> the ones that integrate the .c and .S files, until I can really
>>> integrate the whole of it.
>>>
>>> For the second, however, although I was careful to make incremental
>>> changes, some small differences may exist. Examples of these
>>> differences are places in which I introduce a few ifdefs. It's close
>>> to nothing, but still not mechanical. Because of that, you might want
>>> me to redo the whole series.
>>>
>>> Your call.
>>
>> well the primary worry is the build failure with gcc 4.3.1 that i've
>> posted. If that's simple to fix we could re-try with your existing
>> series.
>>
>> But to be defensive it's always useful to move one component at a time.
>> Even if you don't end up doing a mechanical unification - the stuff you
>> move you should be able to claim to be exactly identical. I.e. the
>> final step can be mechanic in that it unifies exactly the same content
>> (even though both files still have remaining bits).
>>
>> Then we'll end up with nice bisection reports to the specific area that
>> is impacted by a problem.
>>
>> Ingo
>
> I already have a fix for that. But I'll repost it in a way in which I
> can claim the (part of the) files to be identical. For now, can you
> trim the tree at that point? I think it's the best option.

sounds good. Could you git branch for that, and post the pull request
plus the shortlog+diffstat to lkml?

To construct that branch you can merge any subset of the existing
tip/x86/unify-lib series into that. I.e. you can cut the tree yourself
at the point you find most appropriate - and we can then reset
tip/x86/unify-lib and pull your tree into it.

Due to the test failure the topic is not integrated yet externally so it
has no append-only constraints.

> As for bisection, note that I did everything with bisection in mind,
> so I do know the importance of it. It's more a failure than a
> fundamental mistake.

yeah, i know that and i'm not complaining :)

Ingo

2008-06-30 20:44:56

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 15/39] don't save ebx in putuser_32.S

Andi Kleen wrote:
> Glauber Costa <[email protected]> writes:
>
>> clobber it in the inline asm macros, and let the compiler do this for us.
>
> I would expect that definitely will cause code size regressions ...
>
> -Andi

Andi,

Thanks for reviewing my patches.

Although I agree with you, I think the reduced code size is not enough of
a reason to keep code duplicated. If you can provide a version of it that
is guaranteed to reduce the code size for both i386 and x86_64, or even
one that makes it minimal for x86_64 and yet works with i386, fine.

Otherwise, I still think this is the best option.

2008-06-30 21:03:04

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 17/39] clobber rbx in putuser_64.S

Andi Kleen wrote:
> Glauber Costa <[email protected]> writes:
>
>> Instead of clobbering r8, clobber rbx, which is the i386 way.
>
> Note rbx is callee saved on 64bit, so using that one means
> the surrounding function always has to save explicitely.
> Not the case with r8.
>
> There's a reason it is the way it is.
>
> -Andi
Right. Thanks for pointing this out.
However, r8 is not available for i386. We could use %ax, but it
holds part of the data for the call itself.

But for this case, I think we can come up with a macro that selects the
appropriate register for each of them. Should be easy to do now that the
code is merged.

Many thanks.

2008-06-30 21:44:57

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 17/39] clobber rbx in putuser_64.S

From d7162c0499da551b71c2894fa8357c148cc3b974 Mon Sep 17 00:00:00 2001
From: Glauber Costa <[email protected]>
Date: Mon, 30 Jun 2008 18:30:32 -0300
Subject: [PATCH] Use r8 as clobber for putuser at x86_64.

As pointed out by Andi Kleen, we can do a little bit
better with putuser, regarding code generation. (As
a matter of fact, we used to.)

So use either r8 or ebx in the clobber list, depending
on which platform we are on. This patch also provides comments
on the reasoning behind it.

Signed-off-by: Glauber Costa <[email protected]>
CC: Andi Kleen <[email protected]>
---
arch/x86/lib/putuser.S | 39 ++++++++++++++++++++++++++++-----------
include/asm-x86/uaccess.h | 16 +++++++++-------
2 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
index 36b0d15..896667d 100644
--- a/arch/x86/lib/putuser.S
+++ b/arch/x86/lib/putuser.S
@@ -30,14 +30,31 @@
*/

#define ENTER CFI_STARTPROC ; \
- GET_THREAD_INFO(%_ASM_BX)
+ GET_THREAD_INFO(%SCRATCH_REG)
#define EXIT ret ; \
CFI_ENDPROC

+/*
+ * The i386 calling convention determines that eax, ecx and edx belong
+ * to the called function (caller-saved). This means that if we used any
+ * of those registers, the compiler would not need to save and restore them.
+ * However, all those registers are used for parameter passing in i386,
+ * and we have to trash one of them anyway. We randomly choose ebx.
+ *
+ * x86_64 has many more caller-saved registers, and we can pick one of them
+ * to use here. Again at random, we pick r8. This should lead to better code
+ * generation on the latter platform.
+ */
+#ifdef CONFIG_X86_64
+#define SCRATCH_REG r8
+#else
+#define SCRATCH_REG ebx
+#endif
+
.text
ENTRY(__put_user_1)
ENTER
- cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
+ cmp TI_addr_limit(%SCRATCH_REG),%_ASM_CX
jae bad_put_user
1: movb %al,(%_ASM_CX)
xor %eax,%eax
@@ -46,9 +63,9 @@ ENDPROC(__put_user_1)

ENTRY(__put_user_2)
ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $1,%_ASM_BX
- cmp %_ASM_BX,%_ASM_CX
+ mov TI_addr_limit(%SCRATCH_REG),%SCRATCH_REG
+ sub $1,%SCRATCH_REG
+ cmp %SCRATCH_REG,%_ASM_CX
jae bad_put_user
2: movw %ax,(%_ASM_CX)
xor %eax,%eax
@@ -57,9 +74,9 @@ ENDPROC(__put_user_2)

ENTRY(__put_user_4)
ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $3,%_ASM_BX
- cmp %_ASM_BX,%_ASM_CX
+ mov TI_addr_limit(%SCRATCH_REG),%SCRATCH_REG
+ sub $3,%SCRATCH_REG
+ cmp %SCRATCH_REG,%_ASM_CX
jae bad_put_user
3: movl %eax,(%_ASM_CX)
xor %eax,%eax
@@ -68,9 +85,9 @@ ENDPROC(__put_user_4)

ENTRY(__put_user_8)
ENTER
- mov TI_addr_limit(%_ASM_BX),%_ASM_BX
- sub $7,%_ASM_BX
- cmp %_ASM_BX,%_ASM_CX
+ mov TI_addr_limit(%SCRATCH_REG),%SCRATCH_REG
+ sub $7,%SCRATCH_REG
+ cmp %SCRATCH_REG,%_ASM_CX
jae bad_put_user
4: mov %_ASM_AX,(%_ASM_CX)
#ifdef CONFIG_X86_32
diff --git a/include/asm-x86/uaccess.h b/include/asm-x86/uaccess.h
index f6fa4d8..c00fbc3 100644
--- a/include/asm-x86/uaccess.h
+++ b/include/asm-x86/uaccess.h
@@ -178,12 +178,6 @@ extern int __get_user_bad(void);
__ret_gu; \
})

-#define __put_user_x(size, x, ptr, __ret_pu) \
- asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
- :"0" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
-
-
-
#ifdef CONFIG_X86_32
#define __put_user_u64(x, addr, err) \
asm volatile("1: movl %%eax,0(%2)\n" \
@@ -201,17 +195,25 @@ extern int __get_user_bad(void);
#define __put_user_x8(x, ptr, __ret_pu) \
asm volatile("call __put_user_8" : "=a" (__ret_pu) \
: "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx")
+#define PUT_USER_REG_CLOBBER "ebx"
#else
#define __put_user_u64(x, ptr, retval) \
__put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT)
#define __put_user_x8(x, ptr, __ret_pu) __put_user_x(8, x, ptr, __ret_pu)
+#define PUT_USER_REG_CLOBBER "r8"
#endif

+#define __put_user_x(size, x, ptr, __ret_pu) \
+ asm volatile("call __put_user_" #size : "=a" (__ret_pu) \
+ :"0" ((typeof(*(ptr)))(x)), "c" (ptr) \
+ : PUT_USER_REG_CLOBBER )
+
extern void __put_user_bad(void);

/*
* Strange magic calling convention: pointer in %ecx,
- * value in %eax(:%edx), return value in %eax. clobbers %rbx
+ * value in %eax(:%edx), return value in %eax.
+ * For clobbers, refer to the explanation in putuser.S
*/
extern void __put_user_1(void);
extern void __put_user_2(void);
--
1.5.5.1


Attachments:
0001-Use-r8-as-clobber-for-putuser-at-x86_64.patch (4.27 kB)

2008-06-30 23:33:26

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 17/39] clobber rbx in putuser_64.S

> But for this case, I think we can come up with a macro that selects the
> appropriate register for each of them. Should be easy to do now that the
> code is merged.

Note that each get_user() backend is ~10 lines or so. If you add
that many macros you might end up with more code than if you
just keep them separate.

While I admit I am also partly to blame for some asm macro
mess, e.g. in entry.S, I relented and would now advocate
to minimize macro use in assembler. It simply makes it much
harder to understand and to change.

-Andi

2008-07-01 02:47:07

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 17/39] clobber rbx in putuser_64.S

Andi Kleen wrote:
>> But for this case, I think we can come up with a macro that selects the
>> appropriate register for each of them. Should be easy to do now that the
>> code is merged.
>
> Note that each get_user() backend is ~10 lines or so. If you add
> that many macros you might end up with more code than if you
> just keep them separate.

I agree I might. But I honestly don't think this is the case here.

Does anyone else have a word on this?

> While I admit I am also partly to blame for some asm macro
> mess, e.g. in entry.S, I relented and would now advocate
> to minimize macro use in assembler. It simply makes it much
> harder to understand and to change.
>
> -Andi
>

2008-07-01 15:12:11

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 17/39] clobber rbx in putuser_64.S

Andi Kleen wrote:
>> But for this case, I think we can come up with a macro that selects the
>> appropriate register for each of them. Should be easy to do now that the
>> code is merged.
>
> Note that each get_user() backend is ~10 lines or so. If you add
> that many macros you might end up with more code than if you
> just keep them separate.
>
> While I admit I am also partly to blame for some asm macro
> mess, e.g. in entry.S, I relented and would now advocate
> to minimize macro use in assembler. It simply makes it much
> harder to understand and to change.
>
> -Andi
>

As it turns out, neither seems significant.


4991509 618198 475308 6085015 5cd997 vmlinux
4989760 618038 475308 6083106 5cd222 vmlinux.top
4989392 618038 475308 6082738 5cd0b2 vmlinux.patched


vmlinux is base before integration, .top is the top of my tree, and
.patched, with the r8 patch added.
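
(The columns follow the text/data/bss/dec/hex output of binutils' "size"
tool; presumably something like "size vmlinux vmlinux.top vmlinux.patched"
produced them.)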

As you can see, there _is_ a difference in code size, but not
significant by any means.

Clobbering r8 instead of rbx (.patched vs .top) gives us no
more difference than 0.007%. Hard to say it matters.

The whole series gives us a 0.03% improvement in code size already
(although it was not my intention).

So I'd go for leaving the tree as is, clobbering rbx anyway.