2007-11-09 21:28:43

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 0/24] paravirt_ops for unified x86 - that's me again!

Hey folks,

Here's a new spin of the pvops64 patch series.
We didn't get that many comments from the last time,
so it should be probably almost ready to get in. Heya!

>From the last version, the most notable changes are:
* consolidation of system.h, merging jeremy's comments about ordering
concerns
* consolidation of smp functions that goes through smp_ops. They're sharing
a bunch of code now.

Other than that, just some issues that arose from the rebase.

Please, not that this patch series _does not_ apply over linus git anymore,
but rather, over tglx cleanup series.

The first patch in this series is already on linus', but not on tglx', so
I'm sending it again, because you'll need it if you want to compile it
anyway.

tglx, in the absense of any outstanding NACKs, or any very big call for
improvements, could you please pull it in your tree?

Have fun,



2007-11-09 21:28:27

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 2/24] irqflags consolidation

This patch consolidates the irqflags include files containing common
paravirt definitions. The native definition for interrupt handling, halt,
and such, are the same for 32 and 64 bit, and they are kept in irqflags.h.
The differences are split in the arch-specific files.

The syscall function, irq_enable_sysexit, has a very specific i386 naming,
and its name is then changed to a more general one.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/asm-offsets_32.c | 2 +-
arch/x86/kernel/entry_32.S | 8 +-
arch/x86/kernel/paravirt_32.c | 10 +-
arch/x86/kernel/vmi_32.c | 4 +-
arch/x86/xen/enlighten.c | 2 +-
include/asm-x86/irqflags.h | 246 +++++++++++++++++++++++++++++++++++++-
include/asm-x86/irqflags_32.h | 174 ---------------------------
include/asm-x86/irqflags_64.h | 154 ------------------------
include/asm-x86/paravirt.h | 9 +-
9 files changed, 261 insertions(+), 348 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c
index 0e45981..c1ccfab 100644
--- a/arch/x86/kernel/asm-offsets_32.c
+++ b/arch/x86/kernel/asm-offsets_32.c
@@ -123,7 +123,7 @@ void foo(void)
OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable);
OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable);
OFFSET(PV_CPU_iret, pv_cpu_ops, iret);
- OFFSET(PV_CPU_irq_enable_sysexit, pv_cpu_ops, irq_enable_sysexit);
+ OFFSET(PV_CPU_irq_enable_syscall_ret, pv_cpu_ops, irq_enable_syscall_ret);
OFFSET(PV_CPU_read_cr0, pv_cpu_ops, read_cr0);
#endif

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index dc7f938..d63609d 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -58,7 +58,7 @@
* for paravirtualization. The following will never clobber any registers:
* INTERRUPT_RETURN (aka. "iret")
* GET_CR0_INTO_EAX (aka. "movl %cr0, %eax")
- * ENABLE_INTERRUPTS_SYSEXIT (aka "sti; sysexit").
+ * ENABLE_INTERRUPTS_SYSCALL_RET (aka "sti; sysexit").
*
* For DISABLE_INTERRUPTS/ENABLE_INTERRUPTS (aka "cli"/"sti"), you must
* specify what registers can be overwritten (CLBR_NONE, CLBR_EAX/EDX/ECX/ANY).
@@ -351,7 +351,7 @@ sysenter_past_esp:
xorl %ebp,%ebp
TRACE_IRQS_ON
1: mov PT_FS(%esp), %fs
- ENABLE_INTERRUPTS_SYSEXIT
+ ENABLE_INTERRUPTS_SYSCALL_RET
CFI_ENDPROC
.pushsection .fixup,"ax"
2: movl $0,PT_FS(%esp)
@@ -882,10 +882,10 @@ ENTRY(native_iret)
.previous
END(native_iret)

-ENTRY(native_irq_enable_sysexit)
+ENTRY(native_irq_enable_syscall_ret)
sti
sysexit
-END(native_irq_enable_sysexit)
+END(native_irq_enable_syscall_ret)
#endif

KPROBE_ENTRY(int3)
diff --git a/arch/x86/kernel/paravirt_32.c b/arch/x86/kernel/paravirt_32.c
index 6a80d67..04f51d0 100644
--- a/arch/x86/kernel/paravirt_32.c
+++ b/arch/x86/kernel/paravirt_32.c
@@ -60,7 +60,7 @@ DEF_NATIVE(pv_irq_ops, irq_enable, "sti");
DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf");
DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax");
DEF_NATIVE(pv_cpu_ops, iret, "iret");
-DEF_NATIVE(pv_cpu_ops, irq_enable_sysexit, "sti; sysexit");
+DEF_NATIVE(pv_cpu_ops, irq_enable_syscall_ret, "sti; sysexit");
DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
@@ -88,7 +88,7 @@ static unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
SITE(pv_irq_ops, restore_fl);
SITE(pv_irq_ops, save_fl);
SITE(pv_cpu_ops, iret);
- SITE(pv_cpu_ops, irq_enable_sysexit);
+ SITE(pv_cpu_ops, irq_enable_syscall_ret);
SITE(pv_mmu_ops, read_cr2);
SITE(pv_mmu_ops, read_cr3);
SITE(pv_mmu_ops, write_cr3);
@@ -186,7 +186,7 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
/* If the operation is a nop, then nop the callsite */
ret = paravirt_patch_nop();
else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
- type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit))
+ type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_syscall_ret))
/* If operation requires a jmp, then jmp */
ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
else
@@ -237,7 +237,7 @@ static void native_flush_tlb_single(unsigned long addr)

/* These are in entry.S */
extern void native_iret(void);
-extern void native_irq_enable_sysexit(void);
+extern void native_irq_enable_syscall_ret(void);

static int __init print_banner(void)
{
@@ -384,7 +384,7 @@ struct pv_cpu_ops pv_cpu_ops = {
.write_idt_entry = write_dt_entry,
.load_esp0 = native_load_esp0,

- .irq_enable_sysexit = native_irq_enable_sysexit,
+ .irq_enable_syscall_ret = native_irq_enable_syscall_ret,
.iret = native_iret,

.set_iopl_mask = native_set_iopl_mask,
diff --git a/arch/x86/kernel/vmi_32.c b/arch/x86/kernel/vmi_32.c
index f02bad6..aacce42 100644
--- a/arch/x86/kernel/vmi_32.c
+++ b/arch/x86/kernel/vmi_32.c
@@ -148,7 +148,7 @@ static unsigned vmi_patch(u8 type, u16 clobbers, void *insns,
insns, eip);
case PARAVIRT_PATCH(pv_cpu_ops.iret):
return patch_internal(VMI_CALL_IRET, len, insns, eip);
- case PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit):
+ case PARAVIRT_PATCH(pv_cpu_ops.irq_enable_syscall_ret):
return patch_internal(VMI_CALL_SYSEXIT, len, insns, eip);
default:
break;
@@ -870,7 +870,7 @@ static inline int __init activate_vmi(void)
* the backend. They are performance critical anyway, so requiring
* a patch is not a big problem.
*/
- pv_cpu_ops.irq_enable_sysexit = (void *)0xfeedbab0;
+ pv_cpu_ops.irq_enable_syscall_ret = (void *)0xfeedbab0;
pv_cpu_ops.iret = (void *)0xbadbab0;

#ifdef CONFIG_SMP
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 94c39aa..094b915 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -953,7 +953,7 @@ static const struct pv_cpu_ops xen_cpu_ops __initdata = {
.read_pmc = native_read_pmc,

.iret = (void *)&hypercall_page[__HYPERVISOR_iret],
- .irq_enable_sysexit = NULL, /* never called */
+ .irq_enable_syscall_ret = NULL, /* never called */

.load_tr_desc = paravirt_nop,
.set_ldt = xen_set_ldt,
diff --git a/include/asm-x86/irqflags.h b/include/asm-x86/irqflags.h
index 1b695ff..92021c1 100644
--- a/include/asm-x86/irqflags.h
+++ b/include/asm-x86/irqflags.h
@@ -1,5 +1,245 @@
-#ifdef CONFIG_X86_32
-# include "irqflags_32.h"
+#ifndef _X86_IRQFLAGS_H_
+#define _X86_IRQFLAGS_H_
+
+#include <asm/processor-flags.h>
+
+#ifndef __ASSEMBLY__
+/*
+ * Interrupt control:
+ */
+
+static inline unsigned long native_save_fl(void)
+{
+ unsigned long flags;
+
+ __asm__ __volatile__(
+ "# __raw_save_flags\n\t"
+ "pushf ; pop %0"
+ : "=g" (flags)
+ : /* no input */
+ : "memory"
+ );
+
+ return flags;
+}
+
+static inline void native_restore_fl(unsigned long flags)
+{
+ __asm__ __volatile__(
+ "push %0 ; popf"
+ : /* no output */
+ :"g" (flags)
+ :"memory", "cc"
+ );
+}
+
+static inline void native_irq_disable(void)
+{
+ asm volatile("cli": : :"memory");
+}
+
+static inline void native_irq_enable(void)
+{
+ asm volatile("sti": : :"memory");
+}
+
+static inline void native_safe_halt(void)
+{
+ asm volatile("sti; hlt": : :"memory");
+}
+
+static inline void native_halt(void)
+{
+ asm volatile("hlt": : :"memory");
+}
+
+#endif
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#ifndef __ASSEMBLY__
+
+static inline unsigned long __raw_local_save_flags(void)
+{
+ return native_save_fl();
+}
+
+static inline void raw_local_irq_restore(unsigned long flags)
+{
+ native_restore_fl(flags);
+}
+
+static inline void raw_local_irq_disable(void)
+{
+ native_irq_disable();
+}
+
+static inline void raw_local_irq_enable(void)
+{
+ native_irq_enable();
+}
+
+/*
+ * Used in the idle loop; sti takes one instruction cycle
+ * to complete:
+ */
+static inline void raw_safe_halt(void)
+{
+ native_safe_halt();
+}
+
+/*
+ * Used when interrupts are already enabled or to
+ * shutdown the processor:
+ */
+static inline void halt(void)
+{
+ native_halt();
+}
+
+/*
+ * For spinlocks, etc:
+ */
+static inline unsigned long __raw_local_irq_save(void)
+{
+ unsigned long flags = __raw_local_save_flags();
+
+ raw_local_irq_disable();
+
+ return flags;
+}
+#else
+
+#define ENABLE_INTERRUPTS(x) sti
+#define DISABLE_INTERRUPTS(x) cli
+
+#ifdef CONFIG_X86_64
+#define INTERRUPT_RETURN iretq
+#define ENABLE_INTERRUPTS_SYSCALL_RET \
+ movq %gs:pda_oldrsp, %rsp; \
+ swapgs; \
+ sysretq;
+#else
+#define INTERRUPT_RETURN iret
+#define ENABLE_INTERRUPTS_SYSCALL_RET sti; sysexit
+#define GET_CR0_INTO_EAX movl %cr0, %eax
+#endif
+
+
+#endif /* __ASSEMBLY__ */
+#endif /* CONFIG_PARAVIRT */
+
+#ifndef __ASSEMBLY__
+#define raw_local_save_flags(flags) \
+ do { (flags) = __raw_local_save_flags(); } while (0)
+
+#define raw_local_irq_save(flags) \
+ do { (flags) = __raw_local_irq_save(); } while (0)
+
+static inline int raw_irqs_disabled_flags(unsigned long flags)
+{
+ return !(flags & X86_EFLAGS_IF);
+}
+
+static inline int raw_irqs_disabled(void)
+{
+ unsigned long flags = __raw_local_save_flags();
+
+ return raw_irqs_disabled_flags(flags);
+}
+
+/*
+ * makes the traced hardirq state match with the machine state
+ *
+ * should be a rarely used function, only in places where its
+ * otherwise impossible to know the irq state, like in traps.
+ */
+static inline void trace_hardirqs_fixup_flags(unsigned long flags)
+{
+ if (raw_irqs_disabled_flags(flags))
+ trace_hardirqs_off();
+ else
+ trace_hardirqs_on();
+}
+
+static inline void trace_hardirqs_fixup(void)
+{
+ unsigned long flags = __raw_local_save_flags();
+
+ trace_hardirqs_fixup_flags(flags);
+}
+
#else
-# include "irqflags_64.h"
+
+#ifdef CONFIG_X86_64
+/*
+ * Currently paravirt can't handle swapgs nicely when we
+ * don't have a stack we can rely on (such as a user space
+ * stack). So we either find a way around these or just fault
+ * and emulate if a guest tries to call swapgs directly.
+ *
+ * Either way, this is a good way to document that we don't
+ * have a reliable stack. x86_64 only.
+ */
+#define SWAPGS_UNSAFE_STACK swapgs
+#define ARCH_TRACE_IRQS_ON call trace_hardirqs_on_thunk
+#define ARCH_TRACE_IRQS_OFF call trace_hardirqs_off_thunk
+#define ARCH_LOCKDEP_SYS_EXIT call lockdep_sys_exit_thunk
+#define ARCH_LOCKDEP_SYS_EXIT_IRQ \
+ TRACE_IRQS_ON; \
+ sti; \
+ SAVE_REST; \
+ LOCKDEP_SYS_EXIT; \
+ RESTORE_REST; \
+ cli; \
+ TRACE_IRQS_OFF;
+
+#else
+#define ARCH_TRACE_IRQS_ON \
+ pushl %eax; \
+ pushl %ecx; \
+ pushl %edx; \
+ call trace_hardirqs_on; \
+ popl %edx; \
+ popl %ecx; \
+ popl %eax;
+
+#define ARCH_TRACE_IRQS_OFF \
+ pushl %eax; \
+ pushl %ecx; \
+ pushl %edx; \
+ call trace_hardirqs_off; \
+ popl %edx; \
+ popl %ecx; \
+ popl %eax;
+
+#define ARCH_LOCKDEP_SYS_EXIT \
+ pushl %eax; \
+ pushl %ecx; \
+ pushl %edx; \
+ call lockdep_sys_exit; \
+ popl %edx; \
+ popl %ecx; \
+ popl %eax;
+
+#define ARCH_LOCKDEP_SYS_EXIT_IRQ
+#endif
+
+#ifdef CONFIG_TRACE_IRQFLAGS
+# define TRACE_IRQS_ON ARCH_TRACE_IRQS_ON
+# define TRACE_IRQS_OFF ARCH_TRACE_IRQS_OFF
+#else
+# define TRACE_IRQS_ON
+# define TRACE_IRQS_OFF
+#endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define LOCKDEP_SYS_EXIT ARCH_LOCKDEP_SYS_EXIT
+# define LOCKDEP_SYS_EXIT_IRQ ARCH_LOCKDEP_SYS_EXIT_IRQ
+# else
+# define LOCKDEP_SYS_EXIT
+# define LOCKDEP_SYS_EXIT_IRQ
+# endif
+
+#endif /* __ASSEMBLY__ */
#endif
diff --git a/include/asm-x86/irqflags_32.h b/include/asm-x86/irqflags_32.h
deleted file mode 100644
index d8500f4..0000000
--- a/include/asm-x86/irqflags_32.h
+++ /dev/null
@@ -1,174 +0,0 @@
-/*
- * IRQ flags handling
- *
- * This file gets included from lowlevel asm headers too, to provide
- * wrapped versions of the local_irq_*() APIs, based on the
- * raw_local_irq_*() functions from the lowlevel headers.
- */
-#ifndef _ASM_IRQFLAGS_H
-#define _ASM_IRQFLAGS_H
-#include <asm/processor-flags.h>
-
-#ifndef __ASSEMBLY__
-static inline unsigned long native_save_fl(void)
-{
- unsigned long f;
- asm volatile("pushfl ; popl %0":"=g" (f): /* no input */);
- return f;
-}
-
-static inline void native_restore_fl(unsigned long f)
-{
- asm volatile("pushl %0 ; popfl": /* no output */
- :"g" (f)
- :"memory", "cc");
-}
-
-static inline void native_irq_disable(void)
-{
- asm volatile("cli": : :"memory");
-}
-
-static inline void native_irq_enable(void)
-{
- asm volatile("sti": : :"memory");
-}
-
-static inline void native_safe_halt(void)
-{
- asm volatile("sti; hlt": : :"memory");
-}
-
-static inline void native_halt(void)
-{
- asm volatile("hlt": : :"memory");
-}
-#endif /* __ASSEMBLY__ */
-
-#ifdef CONFIG_PARAVIRT
-#include <asm/paravirt.h>
-#else
-#ifndef __ASSEMBLY__
-
-static inline unsigned long __raw_local_save_flags(void)
-{
- return native_save_fl();
-}
-
-static inline void raw_local_irq_restore(unsigned long flags)
-{
- native_restore_fl(flags);
-}
-
-static inline void raw_local_irq_disable(void)
-{
- native_irq_disable();
-}
-
-static inline void raw_local_irq_enable(void)
-{
- native_irq_enable();
-}
-
-/*
- * Used in the idle loop; sti takes one instruction cycle
- * to complete:
- */
-static inline void raw_safe_halt(void)
-{
- native_safe_halt();
-}
-
-/*
- * Used when interrupts are already enabled or to
- * shutdown the processor:
- */
-static inline void halt(void)
-{
- native_halt();
-}
-
-/*
- * For spinlocks, etc:
- */
-static inline unsigned long __raw_local_irq_save(void)
-{
- unsigned long flags = __raw_local_save_flags();
-
- raw_local_irq_disable();
-
- return flags;
-}
-
-#else
-#define DISABLE_INTERRUPTS(clobbers) cli
-#define ENABLE_INTERRUPTS(clobbers) sti
-#define ENABLE_INTERRUPTS_SYSEXIT sti; sysexit
-#define INTERRUPT_RETURN iret
-#define GET_CR0_INTO_EAX movl %cr0, %eax
-#endif /* __ASSEMBLY__ */
-#endif /* CONFIG_PARAVIRT */
-
-#ifndef __ASSEMBLY__
-#define raw_local_save_flags(flags) \
- do { (flags) = __raw_local_save_flags(); } while (0)
-
-#define raw_local_irq_save(flags) \
- do { (flags) = __raw_local_irq_save(); } while (0)
-
-static inline int raw_irqs_disabled_flags(unsigned long flags)
-{
- return !(flags & X86_EFLAGS_IF);
-}
-
-static inline int raw_irqs_disabled(void)
-{
- unsigned long flags = __raw_local_save_flags();
-
- return raw_irqs_disabled_flags(flags);
-}
-#endif /* __ASSEMBLY__ */
-
-/*
- * Do the CPU's IRQ-state tracing from assembly code. We call a
- * C function, so save all the C-clobbered registers:
- */
-#ifdef CONFIG_TRACE_IRQFLAGS
-
-# define TRACE_IRQS_ON \
- pushl %eax; \
- pushl %ecx; \
- pushl %edx; \
- call trace_hardirqs_on; \
- popl %edx; \
- popl %ecx; \
- popl %eax;
-
-# define TRACE_IRQS_OFF \
- pushl %eax; \
- pushl %ecx; \
- pushl %edx; \
- call trace_hardirqs_off; \
- popl %edx; \
- popl %ecx; \
- popl %eax;
-
-#else
-# define TRACE_IRQS_ON
-# define TRACE_IRQS_OFF
-#endif
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define LOCKDEP_SYS_EXIT \
- pushl %eax; \
- pushl %ecx; \
- pushl %edx; \
- call lockdep_sys_exit; \
- popl %edx; \
- popl %ecx; \
- popl %eax;
-#else
-# define LOCKDEP_SYS_EXIT
-#endif
-
-#endif
diff --git a/include/asm-x86/irqflags_64.h b/include/asm-x86/irqflags_64.h
deleted file mode 100644
index 857d9eb..0000000
--- a/include/asm-x86/irqflags_64.h
+++ /dev/null
@@ -1,154 +0,0 @@
-/*
- * IRQ flags handling
- *
- * This file gets included from lowlevel asm headers too, to provide
- * wrapped versions of the local_irq_*() APIs, based on the
- * raw_local_irq_*() functions from the lowlevel headers.
- */
-#ifndef _ASM_IRQFLAGS_H
-#define _ASM_IRQFLAGS_H
-#include <asm/processor-flags.h>
-
-#ifndef __ASSEMBLY__
-/*
- * Interrupt control:
- */
-
-static inline unsigned long __raw_local_save_flags(void)
-{
- unsigned long flags;
-
- __asm__ __volatile__(
- "# __raw_save_flags\n\t"
- "pushfq ; popq %q0"
- : "=g" (flags)
- : /* no input */
- : "memory"
- );
-
- return flags;
-}
-
-#define raw_local_save_flags(flags) \
- do { (flags) = __raw_local_save_flags(); } while (0)
-
-static inline void raw_local_irq_restore(unsigned long flags)
-{
- __asm__ __volatile__(
- "pushq %0 ; popfq"
- : /* no output */
- :"g" (flags)
- :"memory", "cc"
- );
-}
-
-#ifdef CONFIG_X86_VSMP
-
-/*
- * Interrupt control for the VSMP architecture:
- */
-
-static inline void raw_local_irq_disable(void)
-{
- unsigned long flags = __raw_local_save_flags();
-
- raw_local_irq_restore((flags & ~X86_EFLAGS_IF) | X86_EFLAGS_AC);
-}
-
-static inline void raw_local_irq_enable(void)
-{
- unsigned long flags = __raw_local_save_flags();
-
- raw_local_irq_restore((flags | X86_EFLAGS_IF) & (~X86_EFLAGS_AC));
-}
-
-static inline int raw_irqs_disabled_flags(unsigned long flags)
-{
- return !(flags & X86_EFLAGS_IF) || (flags & X86_EFLAGS_AC);
-}
-
-#else /* CONFIG_X86_VSMP */
-
-static inline void raw_local_irq_disable(void)
-{
- __asm__ __volatile__("cli" : : : "memory");
-}
-
-static inline void raw_local_irq_enable(void)
-{
- __asm__ __volatile__("sti" : : : "memory");
-}
-
-static inline int raw_irqs_disabled_flags(unsigned long flags)
-{
- return !(flags & X86_EFLAGS_IF);
-}
-
-#endif
-
-/*
- * For spinlocks, etc.:
- */
-
-static inline unsigned long __raw_local_irq_save(void)
-{
- unsigned long flags = __raw_local_save_flags();
-
- raw_local_irq_disable();
-
- return flags;
-}
-
-#define raw_local_irq_save(flags) \
- do { (flags) = __raw_local_irq_save(); } while (0)
-
-static inline int raw_irqs_disabled(void)
-{
- unsigned long flags = __raw_local_save_flags();
-
- return raw_irqs_disabled_flags(flags);
-}
-
-/*
- * Used in the idle loop; sti takes one instruction cycle
- * to complete:
- */
-static inline void raw_safe_halt(void)
-{
- __asm__ __volatile__("sti; hlt" : : : "memory");
-}
-
-/*
- * Used when interrupts are already enabled or to
- * shutdown the processor:
- */
-static inline void halt(void)
-{
- __asm__ __volatile__("hlt": : :"memory");
-}
-
-#else /* __ASSEMBLY__: */
-# ifdef CONFIG_TRACE_IRQFLAGS
-# define TRACE_IRQS_ON call trace_hardirqs_on_thunk
-# define TRACE_IRQS_OFF call trace_hardirqs_off_thunk
-# else
-# define TRACE_IRQS_ON
-# define TRACE_IRQS_OFF
-# endif
-# ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define LOCKDEP_SYS_EXIT call lockdep_sys_exit_thunk
-# define LOCKDEP_SYS_EXIT_IRQ \
- TRACE_IRQS_ON; \
- sti; \
- SAVE_REST; \
- LOCKDEP_SYS_EXIT; \
- RESTORE_REST; \
- cli; \
- TRACE_IRQS_OFF;
-# else
-# define LOCKDEP_SYS_EXIT
-# define LOCKDEP_SYS_EXIT_IRQ
-# endif
-#endif
-
-#endif
diff --git a/include/asm-x86/paravirt.h b/include/asm-x86/paravirt.h
index f59d370..d81a361 100644
--- a/include/asm-x86/paravirt.h
+++ b/include/asm-x86/paravirt.h
@@ -121,7 +121,7 @@ struct pv_cpu_ops {
u64 (*read_pmc)(void);

/* These two are jmp to, not actually called. */
- void (*irq_enable_sysexit)(void);
+ void (*irq_enable_syscall_ret)(void);
void (*iret)(void);

struct pv_lazy_ops lazy_mode;
@@ -1138,9 +1138,10 @@ static inline unsigned long __raw_local_irq_save(void)
call *%cs:pv_irq_ops+PV_IRQ_irq_enable; \
popl %edx; popl %ecx; popl %eax)

-#define ENABLE_INTERRUPTS_SYSEXIT \
- PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), CLBR_NONE,\
- jmp *%cs:pv_cpu_ops+PV_CPU_irq_enable_sysexit)
+#define ENABLE_INTERRUPTS_SYSCALL_RET \
+ PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_syscall_ret),\
+ CLBR_NONE, \
+ jmp *%cs:pv_cpu_ops+PV_CPU_irq_enable_syscall_ret)

#define GET_CR0_INTO_EAX \
push %ecx; push %edx; \
--
1.4.4.2

2007-11-09 21:28:58

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 1/24] mm/sparse-vmemmap.c: make sure init_mm is included

mm/sparse-vmemmap.c uses init_mm in some places. However, it is not
present in any of the headers currently included in the file.

init_mm is defined as extern in sched.h, so we add it to the headers list

Up to now, this problem was masked by the fact that functions like
set_pte_at() and pmd_populate_kernel() are usually macros that expand to
simpler variants that does not use the first parameter at all.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
---
mm/sparse-vmemmap.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index d3b718b..22620f6 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -24,6 +24,7 @@
#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/vmalloc.h>
+#include <linux/sched.h>
#include <asm/dma.h>
#include <asm/pgalloc.h>
#include <asm/pgtable.h>
--
1.4.4.2

2007-11-09 21:29:23

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 4/24] tlb functions consolidation

This patch consolidates part of the tlb handling functions for the x86
architecture. In this approach, we start by the parts actually used for
paravirt in i386.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/smp_64.c | 5 ++-
include/asm-x86/tlbflush.h | 77 +++++++++++++++++++++++++++++++++++++++++
include/asm-x86/tlbflush_32.h | 77 -----------------------------------------
include/asm-x86/tlbflush_64.h | 43 +++--------------------
4 files changed, 85 insertions(+), 117 deletions(-)

diff --git a/arch/x86/kernel/smp_64.c b/arch/x86/kernel/smp_64.c
index 62b0f2a..ce3935b 100644
--- a/arch/x86/kernel/smp_64.c
+++ b/arch/x86/kernel/smp_64.c
@@ -166,11 +166,12 @@ out:
add_pda(irq_tlb_count, 1);
}

-static void flush_tlb_others(cpumask_t cpumask, struct mm_struct *mm,
- unsigned long va)
+void native_flush_tlb_others(const cpumask_t *cpumaskp,
+ struct mm_struct *mm, unsigned long va)
{
int sender;
union smp_flush_state *f;
+ cpumask_t cpumask = *cpumaskp;

/* Caller has disabled preemption */
sender = smp_processor_id() % NUM_INVALIDATE_TLB_VECTORS;
diff --git a/include/asm-x86/tlbflush.h b/include/asm-x86/tlbflush.h
index 9af4cc8..93283cf 100644
--- a/include/asm-x86/tlbflush.h
+++ b/include/asm-x86/tlbflush.h
@@ -1,5 +1,82 @@
+#ifndef _X86_TLBFLUSH_H_
+#define _X86_TLBFLUSH_H_
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define __flush_tlb() __native_flush_tlb()
+#define __flush_tlb_global() __native_flush_tlb_global()
+#define __flush_tlb_single(addr) __native_flush_tlb_single(addr)
+#endif
+
+static inline void __native_flush_tlb(void)
+{
+ write_cr3(read_cr3());
+}
+
+static inline void __native_flush_tlb_global(void)
+{
+ unsigned long cr4 = read_cr4();
+ write_cr4(cr4 & ~X86_CR4_PGE); /* clear PGE */
+ write_cr4(cr4); /* write old PGE again and flush TLBs */
+}
+
+#define __native_flush_tlb_single(addr) \
+ __asm__ __volatile__("invlpg (%0)" ::"r" (addr) : "memory")
+
+#ifdef CONFIG_SMP
+
+#include <asm/smp.h>
+#include <linux/mm.h>
+
+#define local_flush_tlb() \
+ __flush_tlb()
+
+extern void flush_tlb_all(void);
+extern void flush_tlb_current_task(void);
+extern void flush_tlb_mm(struct mm_struct *);
+extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
+
+#define flush_tlb() flush_tlb_current_task()
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end)
+{
+ flush_tlb_mm(vma->vm_mm);
+}
+
+void native_flush_tlb_others(const cpumask_t *cpumask, struct mm_struct *mm,
+ unsigned long va);
+
+#define TLBSTATE_OK 1
+#define TLBSTATE_LAZY 2
+
+#ifdef CONFIG_X86_64
+/* Roughly an IPI every 20MB with 4k pages for freeing page table
+ ranges. Cost is about 42k of memory for each CPU. */
+#define ARCH_FREE_PTE_NR 5350
+
+#else /* X86_64 */
+struct tlb_state
+{
+ struct mm_struct *active_mm;
+ int state;
+ char __cacheline_padding[L1_CACHE_BYTES-8];
+};
+DECLARE_PER_CPU(struct tlb_state, cpu_tlbstate);
+#endif /* X86_64 */
+
+#endif
+
+#ifndef CONFIG_PARAVIRT
+#define flush_tlb_others(mask, mm, va) \
+ native_flush_tlb_others(&mask, mm, va)
+#endif
+
#ifdef CONFIG_X86_32
# include "tlbflush_32.h"
#else
# include "tlbflush_64.h"
#endif
+
+#endif
diff --git a/include/asm-x86/tlbflush_32.h b/include/asm-x86/tlbflush_32.h
index 2bd5b95..07eaf37 100644
--- a/include/asm-x86/tlbflush_32.h
+++ b/include/asm-x86/tlbflush_32.h
@@ -1,49 +1,8 @@
#ifndef _I386_TLBFLUSH_H
#define _I386_TLBFLUSH_H

-#include <linux/mm.h>
#include <asm/processor.h>

-#ifdef CONFIG_PARAVIRT
-#include <asm/paravirt.h>
-#else
-#define __flush_tlb() __native_flush_tlb()
-#define __flush_tlb_global() __native_flush_tlb_global()
-#define __flush_tlb_single(addr) __native_flush_tlb_single(addr)
-#endif
-
-#define __native_flush_tlb() \
- do { \
- unsigned int tmpreg; \
- \
- __asm__ __volatile__( \
- "movl %%cr3, %0; \n" \
- "movl %0, %%cr3; # flush TLB \n" \
- : "=r" (tmpreg) \
- :: "memory"); \
- } while (0)
-
-/*
- * Global pages have to be flushed a bit differently. Not a real
- * performance problem because this does not happen often.
- */
-#define __native_flush_tlb_global() \
- do { \
- unsigned int tmpreg, cr4, cr4_orig; \
- \
- __asm__ __volatile__( \
- "movl %%cr4, %2; # turn off PGE \n" \
- "movl %2, %1; \n" \
- "andl %3, %1; \n" \
- "movl %1, %%cr4; \n" \
- "movl %%cr3, %0; \n" \
- "movl %0, %%cr3; # flush TLB \n" \
- "movl %2, %%cr4; # turn PGE back on \n" \
- : "=&r" (tmpreg), "=&r" (cr4), "=&r" (cr4_orig) \
- : "i" (~X86_CR4_PGE) \
- : "memory"); \
- } while (0)
-
#define __native_flush_tlb_single(addr) \
__asm__ __volatile__("invlpg (%0)" ::"r" (addr) : "memory")

@@ -120,44 +79,8 @@ static inline void native_flush_tlb_others(const cpumask_t *cpumask,
{
}

-#else /* SMP */
-
-#include <asm/smp.h>
-
-#define local_flush_tlb() \
- __flush_tlb()
-
-extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
-extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-
-#define flush_tlb() flush_tlb_current_task()
-
-static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
-{
- flush_tlb_mm(vma->vm_mm);
-}
-
-void native_flush_tlb_others(const cpumask_t *cpumask, struct mm_struct *mm,
- unsigned long va);
-
-#define TLBSTATE_OK 1
-#define TLBSTATE_LAZY 2
-
-struct tlb_state
-{
- struct mm_struct *active_mm;
- int state;
- char __cacheline_padding[L1_CACHE_BYTES-8];
-};
-DECLARE_PER_CPU(struct tlb_state, cpu_tlbstate);
#endif /* SMP */

-#ifndef CONFIG_PARAVIRT
-#define flush_tlb_others(mask, mm, va) \
- native_flush_tlb_others(&mask, mm, va)
-#endif

static inline void flush_tlb_kernel_range(unsigned long start,
unsigned long end)
diff --git a/include/asm-x86/tlbflush_64.h b/include/asm-x86/tlbflush_64.h
index 7731fd2..d1d1097 100644
--- a/include/asm-x86/tlbflush_64.h
+++ b/include/asm-x86/tlbflush_64.h
@@ -6,21 +6,8 @@
#include <asm/processor.h>
#include <asm/system.h>

-static inline void __flush_tlb(void)
-{
- write_cr3(read_cr3());
-}
-
-static inline void __flush_tlb_all(void)
-{
- unsigned long cr4 = read_cr4();
- write_cr4(cr4 & ~X86_CR4_PGE); /* clear PGE */
- write_cr4(cr4); /* write old PGE again and flush TLBs */
-}
-
-#define __flush_tlb_one(addr) \
- __asm__ __volatile__("invlpg (%0)" :: "r" (addr) : "memory")
-
+#define __flush_tlb_one(addr) __flush_tlb_single(addr)
+#define __flush_tlb_all() __flush_tlb_global()

/*
* TLB flushing:
@@ -63,32 +50,12 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
__flush_tlb();
}

-#else
-
-#include <asm/smp.h>
-
-#define local_flush_tlb() \
- __flush_tlb()
-
-extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
-extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-
-#define flush_tlb() flush_tlb_current_task()
-
-static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
+static inline void native_flush_tlb_others(const cpumask_t *cpumask,
+ struct mm_struct *mm,
+ unsigned long va)
{
- flush_tlb_mm(vma->vm_mm);
}

-#define TLBSTATE_OK 1
-#define TLBSTATE_LAZY 2
-
-/* Roughly an IPI every 20MB with 4k pages for freeing page table
- ranges. Cost is about 42k of memory for each CPU. */
-#define ARCH_FREE_PTE_NR 5350
-
#endif

static inline void flush_tlb_kernel_range(unsigned long start,
--
1.4.4.2

2007-11-09 21:29:40

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 10/24] paravirt hooks at entry functions.

Those are the hooks needed for paravirt at entry_64.S
In general, they follow the way of i386.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/entry_64.S | 108 +++++++++++++++++++++++++++-----------------
1 files changed, 66 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 3a058bb..b6d7008 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -50,6 +50,7 @@
#include <asm/hw_irq.h>
#include <asm/page.h>
#include <asm/irqflags.h>
+#include <asm/paravirt.h>

.code64

@@ -57,6 +58,20 @@
#define retint_kernel retint_restore_args
#endif

+#ifdef CONFIG_PARAVIRT
+ENTRY(native_irq_enable_syscall_ret)
+ movq %gs:pda_oldrsp,%rsp
+ swapgs
+ sysretq
+/*
+ * This could well be defined as a C function, but as it is only used here,
+ * let it be locally defined
+ */
+ENTRY(native_swapgs)
+ swapgs
+ retq
+#endif /* CONFIG_PARAVIRT */
+

.macro TRACE_IRQS_IRETQ offset=ARGOFFSET
#ifdef CONFIG_TRACE_IRQFLAGS
@@ -216,14 +231,21 @@ ENTRY(system_call)
CFI_DEF_CFA rsp,PDA_STACKOFFSET
CFI_REGISTER rip,rcx
/*CFI_REGISTER rflags,r11*/
- swapgs
+ SWAPGS_UNSAFE_STACK
+ /*
+ * A hypervisor implementation might want to use a label
+ * after the swapgs, so that it can do the swapgs
+ * for the guest and jump here on syscall.
+ */
+ENTRY(system_call_after_swapgs)
+
movq %rsp,%gs:pda_oldrsp
movq %gs:pda_kernelstack,%rsp
/*
* No need to follow this irqs off/on section - it's straight
* and short:
*/
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
SAVE_ARGS 8,1
movq %rax,ORIG_RAX-ARGOFFSET(%rsp)
movq %rcx,RIP-ARGOFFSET(%rsp)
@@ -246,7 +268,7 @@ ret_from_sys_call:
sysret_check:
LOCKDEP_SYS_EXIT
GET_THREAD_INFO(%rcx)
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
movl threadinfo_flags(%rcx),%edx
andl %edi,%edx
@@ -260,9 +282,7 @@ sysret_check:
CFI_REGISTER rip,rcx
RESTORE_ARGS 0,-ARG_SKIP,1
/*CFI_REGISTER rflags,r11*/
- movq %gs:pda_oldrsp,%rsp
- swapgs
- sysretq
+ ENABLE_INTERRUPTS_SYSCALL_RET

CFI_RESTORE_STATE
/* Handle reschedules */
@@ -271,7 +291,7 @@ sysret_careful:
bt $TIF_NEED_RESCHED,%edx
jnc sysret_signal
TRACE_IRQS_ON
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
pushq %rdi
CFI_ADJUST_CFA_OFFSET 8
call schedule
@@ -282,7 +302,7 @@ sysret_careful:
/* Handle a signal */
sysret_signal:
TRACE_IRQS_ON
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
testl $(_TIF_SIGPENDING|_TIF_SINGLESTEP|_TIF_MCE_NOTIFY),%edx
jz 1f

@@ -295,7 +315,7 @@ sysret_signal:
1: movl $_TIF_NEED_RESCHED,%edi
/* Use IRET because user could have changed frame. This
works because ptregscall_common has called FIXUP_TOP_OF_STACK. */
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
jmp int_with_check

@@ -327,7 +347,7 @@ tracesys:
*/
.globl int_ret_from_sys_call
int_ret_from_sys_call:
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
testl $3,CS-ARGOFFSET(%rsp)
je retint_restore_args
@@ -349,20 +369,20 @@ int_careful:
bt $TIF_NEED_RESCHED,%edx
jnc int_very_careful
TRACE_IRQS_ON
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
pushq %rdi
CFI_ADJUST_CFA_OFFSET 8
call schedule
popq %rdi
CFI_ADJUST_CFA_OFFSET -8
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
jmp int_with_check

/* handle signals and tracing -- both require a full stack frame */
int_very_careful:
TRACE_IRQS_ON
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
SAVE_REST
/* Check for syscall exit trace */
testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edx
@@ -385,7 +405,7 @@ int_signal:
1: movl $_TIF_NEED_RESCHED,%edi
int_restore_rest:
RESTORE_REST
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
jmp int_with_check
CFI_ENDPROC
@@ -506,7 +526,7 @@ END(stub_rt_sigreturn)
CFI_DEF_CFA_REGISTER rbp
testl $3,CS(%rdi)
je 1f
- swapgs
+ SWAPGS
/* irqcount is used to check if a CPU is already on an interrupt
stack or not. While this is essentially redundant with preempt_count
it is a little cheaper to use a separate counter in the PDA
@@ -527,7 +547,7 @@ ENTRY(common_interrupt)
interrupt do_IRQ
/* 0(%rsp): oldrsp-ARGOFFSET */
ret_from_intr:
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
decl %gs:pda_irqcount
leaveq
@@ -556,13 +576,13 @@ retint_swapgs: /* return to user-space */
/*
* The iretq could re-enable interrupts:
*/
- cli
+ DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_IRETQ
- swapgs
+ SWAPGS
jmp restore_args

retint_restore_args: /* return to kernel space */
- cli
+ DISABLE_INTERRUPTS(CLBR_ANY)
/*
* The iretq could re-enable interrupts:
*/
@@ -570,10 +590,14 @@ retint_restore_args: /* return to kernel space */
restore_args:
RESTORE_ARGS 0,8,0
iret_label:
+#ifdef CONFIG_PARAVIRT
+ INTERRUPT_RETURN
+#endif
+ENTRY(native_iret)
iretq

.section __ex_table,"a"
- .quad iret_label,bad_iret
+ .quad native_iret, bad_iret
.previous
.section .fixup,"ax"
/* force a signal here? this matches i386 behaviour */
@@ -581,24 +605,24 @@ iret_label:
bad_iret:
movq $11,%rdi /* SIGSEGV */
TRACE_IRQS_ON
- sti
- jmp do_exit
- .previous
-
+ ENABLE_INTERRUPTS(CLBR_ANY | ~(CLBR_RDI))
+ jmp do_exit
+ .previous
+
/* edi: workmask, edx: work */
retint_careful:
CFI_RESTORE_STATE
bt $TIF_NEED_RESCHED,%edx
jnc retint_signal
TRACE_IRQS_ON
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
pushq %rdi
CFI_ADJUST_CFA_OFFSET 8
call schedule
popq %rdi
CFI_ADJUST_CFA_OFFSET -8
GET_THREAD_INFO(%rcx)
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
jmp retint_check

@@ -606,14 +630,14 @@ retint_signal:
testl $(_TIF_SIGPENDING|_TIF_SINGLESTEP|_TIF_MCE_NOTIFY),%edx
jz retint_swapgs
TRACE_IRQS_ON
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
SAVE_REST
movq $-1,ORIG_RAX(%rsp)
xorl %esi,%esi # oldset
movq %rsp,%rdi # &pt_regs
call do_notify_resume
RESTORE_REST
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
movl $_TIF_NEED_RESCHED,%edi
GET_THREAD_INFO(%rcx)
@@ -731,7 +755,7 @@ END(spurious_interrupt)
rdmsr
testl %edx,%edx
js 1f
- swapgs
+ SWAPGS
xorl %ebx,%ebx
1:
.if \ist
@@ -747,7 +771,7 @@ END(spurious_interrupt)
.if \ist
addq $EXCEPTION_STKSZ, per_cpu__init_tss + TSS_ist + (\ist - 1) * 8(%rbp)
.endif
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
.if \irqtrace
TRACE_IRQS_OFF
.endif
@@ -776,10 +800,10 @@ paranoid_swapgs\trace:
.if \trace
TRACE_IRQS_IRETQ 0
.endif
- swapgs
+ SWAPGS_UNSAFE_STACK
paranoid_restore\trace:
RESTORE_ALL 8
- iretq
+ INTERRUPT_RETURN
paranoid_userspace\trace:
GET_THREAD_INFO(%rcx)
movl threadinfo_flags(%rcx),%ebx
@@ -794,11 +818,11 @@ paranoid_userspace\trace:
.if \trace
TRACE_IRQS_ON
.endif
- sti
+ ENABLE_INTERRUPTS(CLBR_NONE)
xorl %esi,%esi /* arg2: oldset */
movq %rsp,%rdi /* arg1: &pt_regs */
call do_notify_resume
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
.if \trace
TRACE_IRQS_OFF
.endif
@@ -807,9 +831,9 @@ paranoid_schedule\trace:
.if \trace
TRACE_IRQS_ON
.endif
- sti
+ ENABLE_INTERRUPTS(CLBR_ANY)
call schedule
- cli
+ DISABLE_INTERRUPTS(CLBR_ANY)
.if \trace
TRACE_IRQS_OFF
.endif
@@ -862,7 +886,7 @@ KPROBE_ENTRY(error_entry)
testl $3,CS(%rsp)
je error_kernelspace
error_swapgs:
- swapgs
+ SWAPGS
error_sti:
movq %rdi,RDI(%rsp)
CFI_REL_OFFSET rdi,RDI
@@ -874,7 +898,7 @@ error_sti:
error_exit:
movl %ebx,%eax
RESTORE_REST
- cli
+ DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
GET_THREAD_INFO(%rcx)
testl %eax,%eax
@@ -911,12 +935,12 @@ ENTRY(load_gs_index)
CFI_STARTPROC
pushf
CFI_ADJUST_CFA_OFFSET 8
- cli
- swapgs
+ DISABLE_INTERRUPTS(CLBR_ANY | ~(CLBR_RDI))
+ SWAPGS
gs_change:
movl %edi,%gs
2: mfence /* workaround */
- swapgs
+ SWAPGS
popf
CFI_ADJUST_CFA_OFFSET -8
ret
@@ -930,7 +954,7 @@ ENDPROC(load_gs_index)
.section .fixup,"ax"
/* running with kernelgs */
bad_gs:
- swapgs /* switch back to user gs */
+ SWAPGS /* switch back to user gs */
xorl %eax,%eax
movl %eax,%gs
jmp 2b
--
1.4.4.2

2007-11-09 21:30:00

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 9/24] Wipe out traditional opt from x86_64 Makefile

Among other things, using -traditional as a gcc option stops us from
using macro token pasting, which is a feature we heavily rely on.

There was still a use of -traditional in arch/x86/kernel/Makefile_64,
which this patch removes.

I don't see any problems building kernels in my x86_64 box without
-traditional.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/Makefile_64 | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/Makefile_64 b/arch/x86/kernel/Makefile_64
index ffee997..0714528 100644
--- a/arch/x86/kernel/Makefile_64
+++ b/arch/x86/kernel/Makefile_64
@@ -3,7 +3,6 @@
#

extra-y := head_64.o head64.o init_task.o vmlinux.lds
-EXTRA_AFLAGS := -traditional
obj-y := process_64.o signal_64.o entry_64.o traps_64.o irq_64.o \
ptrace_64.o time_64.o ioport_64.o ldt.o setup_64.o i8259_64.o sys_x86_64.o \
x8664_ksyms_64.o i387_64.o syscall_64.o vsyscall_64.o \
--
1.4.4.2

2007-11-09 21:30:31

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 8/24] consolidate system.h

This patch consolidates system.h header. For i386, it adds functions
read/write_cr8 that ain't really needed, but will also not hurt. If they are
used somewhere in i386 code, there's a bug anyway, and should be fixed.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
include/asm-x86/system.h | 134 +++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/system_32.h | 102 --------------------------------
include/asm-x86/system_64.h | 77 -------------------------
3 files changed, 134 insertions(+), 179 deletions(-)

diff --git a/include/asm-x86/system.h b/include/asm-x86/system.h
index 692562b..ef20916 100644
--- a/include/asm-x86/system.h
+++ b/include/asm-x86/system.h
@@ -1,5 +1,139 @@
+#ifndef _X86_SYSTEM_H_
+#define _X86_SYSTEM_H_
+
+#include <linux/kernel.h>
+
+static inline void native_clts(void)
+{
+ asm volatile ("clts");
+}
+
+/*
+ * Volatile isn't enough to prevent the compiler from reordering the
+ * read/write functions for the control registers and messing everything up.
+ * A memory clobber would solve the problem, but would prevent reordering of
+ * all loads stores around it, which can hurt performance. Solution is to
+ * use a variable and mimic reads and writes to it to enforce serialization
+ */
+static unsigned long __force_order;
+
+static inline unsigned long native_read_cr0(void)
+{
+ unsigned long val;
+ asm volatile("mov %%cr0,%0\n\t" :"=r" (val), "=m" (__force_order));
+ return val;
+}
+
+static inline void native_write_cr0(unsigned long val)
+{
+ asm volatile("mov %0,%%cr0": :"r" (val), "m" (__force_order));
+}
+
+static inline unsigned long native_read_cr2(void)
+{
+ unsigned long val;
+ asm volatile("mov %%cr2,%0\n\t" :"=r" (val), "=m" (__force_order));
+ return val;
+}
+
+static inline void native_write_cr2(unsigned long val)
+{
+ asm volatile("mov %0,%%cr2": :"r" (val), "m" (__force_order));
+}
+
+static inline unsigned long native_read_cr3(void)
+{
+ unsigned long val;
+ asm volatile("mov %%cr3,%0\n\t" :"=r" (val), "=m" (__force_order));
+ return val;
+}
+
+static inline void native_write_cr3(unsigned long val)
+{
+ asm volatile("mov %0,%%cr3": :"r" (val), "m" (__force_order));
+}
+
+static inline unsigned long native_read_cr4(void)
+{
+ unsigned long val;
+ asm volatile("mov %%cr4,%0\n\t" :"=r" (val), "=m" (__force_order));
+ return val;
+}
+
+static inline unsigned long native_read_cr4_safe(void)
+{
+ unsigned long val;
+ /* This could fault if %cr4 does not exist. In x86_64, a cr4 always
+ * exists, so it will never fail. */
+#ifdef CONFIG_X86_32
+ asm volatile("1: mov %%cr4, %0 \n"
+ "2: \n"
+ ".section __ex_table,\"a\" \n"
+ ".long 1b,2b \n"
+ ".previous \n"
+ : "=r" (val), "=m" (__force_order) : "0" (0));
+#else
+ val = native_read_cr4();
+#endif
+ return val;
+}
+
+static inline void native_write_cr4(unsigned long val)
+{
+ asm volatile("mov %0,%%cr4": :"r" (val), "m" (__force_order));
+}
+
+static inline unsigned long native_read_cr8(void)
+{
+ unsigned long cr8;
+ asm volatile("mov %%cr8,%0" : "=r" (cr8), "=m" (__force_order));
+ return cr8;
+}
+
+static inline void native_write_cr8(unsigned long val)
+{
+ asm volatile("mov %0,%%cr8" : : "r" (val));
+}
+
+static inline void native_wbinvd(void)
+{
+ asm volatile("wbinvd": : :"memory");
+}
+
+static inline void clflush(volatile void *__p)
+{
+ asm volatile("clflush %0" : "+m" (*(char __force *)__p));
+}
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define read_cr0() (native_read_cr0())
+#define write_cr0(x) (native_write_cr0(x))
+#define read_cr2() (native_read_cr2())
+#define write_cr2(x) (native_write_cr2(x))
+#define read_cr3() (native_read_cr3())
+#define write_cr3(x) (native_write_cr3(x))
+#define read_cr4() (native_read_cr4())
+#define read_cr4_safe() (native_read_cr4_safe())
+#define write_cr4(x) (native_write_cr4(x))
+#define read_cr8() (native_read_cr8())
+#define write_cr8(x) (native_write_cr8(x))
+#define wbinvd() (native_wbinvd())
+
+/* Clear the 'TS' bit */
+#define clts() (native_clts())
+
+#endif/* CONFIG_PARAVIRT */
+
+#define stts() write_cr0(8 | read_cr0())
+
+#define nop() __asm__ __volatile__ ("nop")
+
#ifdef CONFIG_X86_32
# include "system_32.h"
#else
# include "system_64.h"
#endif
+
+#endif
diff --git a/include/asm-x86/system_32.h b/include/asm-x86/system_32.h
index db6283e..26fc8f5 100644
--- a/include/asm-x86/system_32.h
+++ b/include/asm-x86/system_32.h
@@ -1,7 +1,6 @@
#ifndef __ASM_SYSTEM_H
#define __ASM_SYSTEM_H

-#include <linux/kernel.h>
#include <asm/segment.h>
#include <asm/cpufeature.h>
#include <asm/cmpxchg.h>
@@ -89,105 +88,6 @@ __asm__ __volatile__ ("movw %%dx,%1\n\t" \
#define savesegment(seg, value) \
asm volatile("mov %%" #seg ",%0":"=rm" (value))

-
-static inline void native_clts(void)
-{
- asm volatile ("clts");
-}
-
-static inline unsigned long native_read_cr0(void)
-{
- unsigned long val;
- asm volatile("movl %%cr0,%0\n\t" :"=r" (val));
- return val;
-}
-
-static inline void native_write_cr0(unsigned long val)
-{
- asm volatile("movl %0,%%cr0": :"r" (val));
-}
-
-static inline unsigned long native_read_cr2(void)
-{
- unsigned long val;
- asm volatile("movl %%cr2,%0\n\t" :"=r" (val));
- return val;
-}
-
-static inline void native_write_cr2(unsigned long val)
-{
- asm volatile("movl %0,%%cr2": :"r" (val));
-}
-
-static inline unsigned long native_read_cr3(void)
-{
- unsigned long val;
- asm volatile("movl %%cr3,%0\n\t" :"=r" (val));
- return val;
-}
-
-static inline void native_write_cr3(unsigned long val)
-{
- asm volatile("movl %0,%%cr3": :"r" (val));
-}
-
-static inline unsigned long native_read_cr4(void)
-{
- unsigned long val;
- asm volatile("movl %%cr4,%0\n\t" :"=r" (val));
- return val;
-}
-
-static inline unsigned long native_read_cr4_safe(void)
-{
- unsigned long val;
- /* This could fault if %cr4 does not exist */
- asm volatile("1: movl %%cr4, %0 \n"
- "2: \n"
- ".section __ex_table,\"a\" \n"
- ".long 1b,2b \n"
- ".previous \n"
- : "=r" (val): "0" (0));
- return val;
-}
-
-static inline void native_write_cr4(unsigned long val)
-{
- asm volatile("movl %0,%%cr4": :"r" (val));
-}
-
-static inline void native_wbinvd(void)
-{
- asm volatile("wbinvd": : :"memory");
-}
-
-static inline void clflush(volatile void *__p)
-{
- asm volatile("clflush %0" : "+m" (*(char __force *)__p));
-}
-
-#ifdef CONFIG_PARAVIRT
-#include <asm/paravirt.h>
-#else
-#define read_cr0() (native_read_cr0())
-#define write_cr0(x) (native_write_cr0(x))
-#define read_cr2() (native_read_cr2())
-#define write_cr2(x) (native_write_cr2(x))
-#define read_cr3() (native_read_cr3())
-#define write_cr3(x) (native_write_cr3(x))
-#define read_cr4() (native_read_cr4())
-#define read_cr4_safe() (native_read_cr4_safe())
-#define write_cr4(x) (native_write_cr4(x))
-#define wbinvd() (native_wbinvd())
-
-/* Clear the 'TS' bit */
-#define clts() (native_clts())
-
-#endif/* CONFIG_PARAVIRT */
-
-/* Set the 'TS' bit */
-#define stts() write_cr0(8 | read_cr0())
-
#endif /* __KERNEL__ */

static inline unsigned long get_limit(unsigned long segment)
@@ -198,8 +98,6 @@ static inline unsigned long get_limit(unsigned long segment)
return __limit+1;
}

-#define nop() __asm__ __volatile__ ("nop")
-
/*
* Force strict CPU ordering.
* And yes, this is required on UP too when we're talking
diff --git a/include/asm-x86/system_64.h b/include/asm-x86/system_64.h
index 4cb2384..84a8f10 100644
--- a/include/asm-x86/system_64.h
+++ b/include/asm-x86/system_64.h
@@ -1,7 +1,6 @@
#ifndef __ASM_SYSTEM_H
#define __ASM_SYSTEM_H

-#include <linux/kernel.h>
#include <asm/segment.h>
#include <asm/cmpxchg.h>

@@ -62,85 +61,9 @@ extern void load_gs_index(unsigned);
".previous" \
: :"r" (value), "r" (0))

-/*
- * Clear and set 'TS' bit respectively
- */
-#define clts() __asm__ __volatile__ ("clts")
-
-static inline unsigned long read_cr0(void)
-{
- unsigned long cr0;
- asm volatile("movq %%cr0,%0" : "=r" (cr0));
- return cr0;
-}
-
-static inline void write_cr0(unsigned long val)
-{
- asm volatile("movq %0,%%cr0" :: "r" (val));
-}
-
-static inline unsigned long read_cr2(void)
-{
- unsigned long cr2;
- asm volatile("movq %%cr2,%0" : "=r" (cr2));
- return cr2;
-}
-
-static inline void write_cr2(unsigned long val)
-{
- asm volatile("movq %0,%%cr2" :: "r" (val));
-}
-
-static inline unsigned long read_cr3(void)
-{
- unsigned long cr3;
- asm volatile("movq %%cr3,%0" : "=r" (cr3));
- return cr3;
-}
-
-static inline void write_cr3(unsigned long val)
-{
- asm volatile("movq %0,%%cr3" :: "r" (val) : "memory");
-}
-
-static inline unsigned long read_cr4(void)
-{
- unsigned long cr4;
- asm volatile("movq %%cr4,%0" : "=r" (cr4));
- return cr4;
-}
-
-static inline void write_cr4(unsigned long val)
-{
- asm volatile("movq %0,%%cr4" :: "r" (val) : "memory");
-}
-
-static inline unsigned long read_cr8(void)
-{
- unsigned long cr8;
- asm volatile("movq %%cr8,%0" : "=r" (cr8));
- return cr8;
-}
-
-static inline void write_cr8(unsigned long val)
-{
- asm volatile("movq %0,%%cr8" :: "r" (val) : "memory");
-}
-
-#define stts() write_cr0(8 | read_cr0())
-
-#define wbinvd() \
- __asm__ __volatile__ ("wbinvd": : :"memory")

#endif /* __KERNEL__ */

-static inline void clflush(volatile void *__p)
-{
- asm volatile("clflush %0" : "+m" (*(char __force *)__p));
-}
-
-#define nop() __asm__ __volatile__ ("nop")
-
#ifdef CONFIG_SMP
#define smp_mb() mb()
#define smp_rmb() barrier()
--
1.4.4.2

2007-11-09 21:30:51

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 12/24] provide native irq initialization function

The interrupt initialization routine becomes native_init_IRQ and will
be overriden later in case paravirt is on. The interrupt array is made visible
for guests such lguest, that will need to have their own initialization
mechanism (though using most of the same irq lines) later on.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/i8259_64.c | 7 +++++--
include/asm-x86/irq_64.h | 3 +++
2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/i8259_64.c b/arch/x86/kernel/i8259_64.c
index 3041e59..53955f4 100644
--- a/arch/x86/kernel/i8259_64.c
+++ b/arch/x86/kernel/i8259_64.c
@@ -77,7 +77,7 @@ BUILD_16_IRQS(0xc) BUILD_16_IRQS(0xd) BUILD_16_IRQS(0xe) BUILD_16_IRQS(0xf)
IRQ(x,c), IRQ(x,d), IRQ(x,e), IRQ(x,f)

/* for the irq vectors */
-static void (*interrupt[NR_VECTORS - FIRST_EXTERNAL_VECTOR])(void) = {
+void (*interrupt[NR_VECTORS - FIRST_EXTERNAL_VECTOR])(void) = {
IRQLIST_16(0x2), IRQLIST_16(0x3),
IRQLIST_16(0x4), IRQLIST_16(0x5), IRQLIST_16(0x6), IRQLIST_16(0x7),
IRQLIST_16(0x8), IRQLIST_16(0x9), IRQLIST_16(0xa), IRQLIST_16(0xb),
@@ -456,7 +456,10 @@ void __init init_ISA_irqs (void)
}
}

-void __init init_IRQ(void)
+/* Overridden in paravirt.c */
+void init_IRQ(void) __attribute__((weak, alias("native_init_IRQ")));
+
+void __init native_init_IRQ(void)
{
int i;

diff --git a/include/asm-x86/irq_64.h b/include/asm-x86/irq_64.h
index 5006c6e..4f02446 100644
--- a/include/asm-x86/irq_64.h
+++ b/include/asm-x86/irq_64.h
@@ -46,6 +46,9 @@ static __inline__ int irq_canonicalize(int irq)
extern void fixup_irqs(cpumask_t map);
#endif

+#include <linux/init.h>
+void native_init_IRQ(void);
+
#define __ARCH_HAS_DO_SOFTIRQ 1

#endif /* _ASM_IRQ_H */
--
1.4.4.2

2007-11-09 21:31:15

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 11/24] read/write_crX, clts and wbinvd for 64-bit paravirt

This patch introduces, and patch callers when needed, native
versions for read/write_crX functions, clts and wbinvd.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/mm/pageattr_64.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/pageattr_64.c b/arch/x86/mm/pageattr_64.c
index 14ab327..3a483a8 100644
--- a/arch/x86/mm/pageattr_64.c
+++ b/arch/x86/mm/pageattr_64.c
@@ -13,6 +13,7 @@
#include <asm/tlbflush.h>
#include <asm/uaccess.h>
#include <asm/io.h>
+#include <asm/paravirt.h>

pte_t *lookup_address(unsigned long address)
{
@@ -84,7 +85,7 @@ static void flush_kernel_map(void *arg)
much cheaper than WBINVD. */
/* clflush is still broken. Disable for now. */
if (1 || !cpu_has_clflush) {
- asm volatile("wbinvd" ::: "memory");
+ wbinvd();
} else {
list_for_each_entry(pg, l, lru) {
void *addr = page_address(pg);
--
1.4.4.2

2007-11-09 21:31:38

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 14/24] export math_state_restore

Export math_state_restore symbol, so it can be used for hypervisors.
They are commonly loaded as modules (lguest being an example).

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/traps_64.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index 4d752a8..0876692 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -1069,6 +1069,7 @@ asmlinkage void math_state_restore(void)
task_thread_info(me)->status |= TS_USEDFPU;
me->fpu_counter++;
}
+EXPORT_SYMBOL_GPL(math_state_restore);

void __init trap_init(void)
{
--
1.4.4.2

2007-11-09 21:31:57

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 3/24] consolidate spinlock.h

The cli and sti instructions need to be replaced by paravirt hooks.
For the i386 architecture, this is already done. The code requirements
aren't much different from x86_64 POV, so this part is consolidated in
the common header

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
include/asm-x86/spinlock.h | 14 ++++++++++++++
include/asm-x86/spinlock_32.h | 9 ---------
include/asm-x86/spinlock_64.h | 8 +++++---
3 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/include/asm-x86/spinlock.h b/include/asm-x86/spinlock.h
index d74d85e..e1d555a 100644
--- a/include/asm-x86/spinlock.h
+++ b/include/asm-x86/spinlock.h
@@ -1,5 +1,19 @@
+#ifndef _X86_SPINLOCK_H_
+#define _X86_SPINLOCK_H_
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define CLI_STRING "cli"
+#define STI_STRING "sti"
+#define CLI_STI_CLOBBERS
+#define CLI_STI_INPUT_ARGS
+#endif /* CONFIG_PARAVIRT */
+
#ifdef CONFIG_X86_32
# include "spinlock_32.h"
#else
# include "spinlock_64.h"
#endif
+
+#endif
diff --git a/include/asm-x86/spinlock_32.h b/include/asm-x86/spinlock_32.h
index d3bcebe..ebbf371 100644
--- a/include/asm-x86/spinlock_32.h
+++ b/include/asm-x86/spinlock_32.h
@@ -7,15 +7,6 @@
#include <asm/processor.h>
#include <linux/compiler.h>

-#ifdef CONFIG_PARAVIRT
-#include <asm/paravirt.h>
-#else
-#define CLI_STRING "cli"
-#define STI_STRING "sti"
-#define CLI_STI_CLOBBERS
-#define CLI_STI_INPUT_ARGS
-#endif /* CONFIG_PARAVIRT */
-
/*
* Your basic SMP spinlocks, allowing only a single CPU anywhere
*
diff --git a/include/asm-x86/spinlock_64.h b/include/asm-x86/spinlock_64.h
index 88bf981..e56b17e 100644
--- a/include/asm-x86/spinlock_64.h
+++ b/include/asm-x86/spinlock_64.h
@@ -48,12 +48,12 @@ static inline void __raw_spin_lock_flags(raw_spinlock_t *lock, unsigned long fla
"jns 5f\n"
"testl $0x200, %1\n\t" /* interrupts were disabled? */
"jz 4f\n\t"
- "sti\n"
+ STI_STRING "\n"
"3:\t"
"rep;nop\n\t"
"cmpl $0, %0\n\t"
"jle 3b\n\t"
- "cli\n\t"
+ CLI_STRING "\n\t"
"jmp 1b\n"
"4:\t"
"rep;nop\n\t"
@@ -61,7 +61,9 @@ static inline void __raw_spin_lock_flags(raw_spinlock_t *lock, unsigned long fla
"jg 1b\n\t"
"jmp 4b\n"
"5:\n\t"
- : "+m" (lock->slock) : "r" ((unsigned)flags) : "memory");
+ : "+m" (lock->slock)
+ : "r" ((unsigned)flags) CLI_STI_INPUT_ARGS
+ : "memory" CLI_STI_CLOBBERS);
}
#endif

--
1.4.4.2

2007-11-09 21:32:22

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 13/24] report ring kernel is running without paravirt

When paravirtualization is disabled, the kernel is always
running at ring 0. So report it in the appropriate macro

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
include/asm-x86/segment_64.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86/segment_64.h b/include/asm-x86/segment_64.h
index 04b8ab2..240c1bf 100644
--- a/include/asm-x86/segment_64.h
+++ b/include/asm-x86/segment_64.h
@@ -50,4 +50,8 @@
#define GDT_SIZE (GDT_ENTRIES * 8)
#define TLS_SIZE (GDT_ENTRY_TLS_ENTRIES * 8)

+#ifndef CONFIG_PARAVIRT
+#define get_kernel_rpl() 0
+#endif
+
#endif
--
1.4.4.2

2007-11-09 21:32:40

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 7/24] consolidate msr.h

This patch goes one step forward in consolidating the msr.h header.
It shares code between i386 and x86_64, instead of duplicating the
code for tsc reading, msr reading/writing, etc.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/ia32/syscall32.c | 2 +-
arch/x86/kernel/setup64.c | 6 +-
arch/x86/kernel/tsc_64.c | 17 +++-
arch/x86/kernel/vsyscall_64.c | 4 +-
arch/x86/vdso/vgetcpu.c | 4 +-
include/asm-x86/alternative_32.h | 17 +++-
include/asm-x86/alternative_64.h | 27 ++++-
include/asm-x86/msr.h | 225 ++++++++++----------------------------
include/asm-x86/tsc.h | 33 +++++-
9 files changed, 151 insertions(+), 184 deletions(-)

diff --git a/arch/x86/ia32/syscall32.c b/arch/x86/ia32/syscall32.c
index d751d96..a1247ed 100644
--- a/arch/x86/ia32/syscall32.c
+++ b/arch/x86/ia32/syscall32.c
@@ -82,5 +82,5 @@ void syscall32_cpu_init(void)
checking_wrmsrl(MSR_IA32_SYSENTER_ESP, 0ULL);
checking_wrmsrl(MSR_IA32_SYSENTER_EIP, (u64)ia32_sysenter_target);

- wrmsrl(MSR_CSTAR, ia32_cstar_target);
+ wrmsrl(MSR_CSTAR, (u64)ia32_cstar_target);
}
diff --git a/arch/x86/kernel/setup64.c b/arch/x86/kernel/setup64.c
index 3558ac7..50b7514 100644
--- a/arch/x86/kernel/setup64.c
+++ b/arch/x86/kernel/setup64.c
@@ -122,7 +122,7 @@ void pda_init(int cpu)
asm volatile("movl %0,%%fs ; movl %0,%%gs" :: "r" (0));
/* Memory clobbers used to order PDA accessed */
mb();
- wrmsrl(MSR_GS_BASE, pda);
+ wrmsrl(MSR_GS_BASE, (u64)pda);
mb();

pda->cpunumber = cpu;
@@ -161,8 +161,8 @@ void syscall_init(void)
* but only a 32bit target. LSTAR sets the 64bit rip.
*/
wrmsrl(MSR_STAR, ((u64)__USER32_CS)<<48 | ((u64)__KERNEL_CS)<<32);
- wrmsrl(MSR_LSTAR, system_call);
- wrmsrl(MSR_CSTAR, ignore_sysret);
+ wrmsrl(MSR_LSTAR, (u64)system_call);
+ wrmsrl(MSR_CSTAR, (u64)ignore_sysret);

#ifdef CONFIG_IA32_EMULATION
syscall32_cpu_init ();
diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index 9c70af4..4502539 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -30,7 +30,7 @@ static unsigned long long cycles_2_ns(unsigned long long cyc)
return (cyc * cyc2ns_scale) >> NS_SCALE;
}

-unsigned long long sched_clock(void)
+unsigned long long native_sched_clock(void)
{
unsigned long a = 0;

@@ -44,6 +44,19 @@ unsigned long long sched_clock(void)
return cycles_2_ns(a);
}

+/* We need to define a real function for sched_clock, to override the
+ weak default version */
+#ifdef CONFIG_PARAVIRT
+unsigned long long sched_clock(void)
+{
+ return paravirt_sched_clock();
+}
+#else
+unsigned long long
+sched_clock(void) __attribute__((alias("native_sched_clock")));
+#endif
+
+
static int tsc_unstable;

inline int check_tsc_unstable(void)
@@ -256,7 +269,7 @@ static cycle_t read_tsc(void)

static cycle_t __vsyscall_fn vread_tsc(void)
{
- cycle_t ret = (cycle_t)get_cycles_sync();
+ cycle_t ret = (cycle_t)vget_cycles_sync();
return ret;
}

diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index ad4005c..1425d02 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -190,7 +190,7 @@ time_t __vsyscall(1) vtime(time_t *t)
long __vsyscall(2)
vgetcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache)
{
- unsigned int dummy, p;
+ unsigned int p;
unsigned long j = 0;

/* Fast cache - only recompute value once per jiffies and avoid
@@ -205,7 +205,7 @@ vgetcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache)
p = tcache->blob[1];
} else if (__vgetcpu_mode == VGETCPU_RDTSCP) {
/* Load per CPU data from RDTSCP */
- rdtscp(dummy, dummy, p);
+ native_read_tscp(&p);
} else {
/* Load per CPU data from GDT */
asm("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
diff --git a/arch/x86/vdso/vgetcpu.c b/arch/x86/vdso/vgetcpu.c
index 91f6e85..61d0def 100644
--- a/arch/x86/vdso/vgetcpu.c
+++ b/arch/x86/vdso/vgetcpu.c
@@ -15,7 +15,7 @@

long __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache)
{
- unsigned int dummy, p;
+ unsigned int p;
unsigned long j = 0;

/* Fast cache - only recompute value once per jiffies and avoid
@@ -30,7 +30,7 @@ long __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache)
p = tcache->blob[1];
} else if (*vdso_vgetcpu_mode == VGETCPU_RDTSCP) {
/* Load per CPU data from RDTSCP */
- rdtscp(dummy, dummy, p);
+ native_read_tscp(&p);
} else {
/* Load per CPU data from GDT */
asm("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
diff --git a/include/asm-x86/alternative_32.h b/include/asm-x86/alternative_32.h
index bda6c81..1ed7708 100644
--- a/include/asm-x86/alternative_32.h
+++ b/include/asm-x86/alternative_32.h
@@ -101,7 +101,22 @@ static inline void alternatives_smp_switch(int smp) {}
* use this macro(s) if you need more than one output parameter
* in alternative_io
*/
-#define ASM_OUTPUT2(a, b) a, b
+#define ASM_OUTPUT2(a, b...) a, b
+
+#define fixup_section(code, fixup, output, input...) \
+ asm volatile("2: " code "\n" \
+ "1:\n\t" \
+ ".section .fixup,\"ax\"\n\t" \
+ "3: " fixup "\n\t" \
+ "jmp 1b\n\t" \
+ ".previous\n\t" \
+ ".section __ex_table,\"a\"\n" \
+ " .align 4\n\t" \
+ " .long 2b,3b\n\t" \
+ ".previous" \
+ : output \
+ : input)
+

/*
* Alternative inline assembly for SMP.
diff --git a/include/asm-x86/alternative_64.h b/include/asm-x86/alternative_64.h
index ab161e8..f080b69 100644
--- a/include/asm-x86/alternative_64.h
+++ b/include/asm-x86/alternative_64.h
@@ -141,14 +141,29 @@ static inline void alternatives_smp_switch(int smp) {}
* use this macro(s) if you need more than one output parameter
* in alternative_io
*/
-#define ASM_OUTPUT2(a, b) a, b
-
-struct paravirt_patch;
+#define ASM_OUTPUT2(a, b...) a, b
+
+#define fixup_section(code, fixup, output, input...) \
+ asm volatile("2: " code "\n" \
+ "1:\n\t" \
+ ".section .fixup,\"ax\"\n\t" \
+ "3: " fixup "\n\t" \
+ " jmp 1b\n\t" \
+ ".previous\n\t" \
+ ".section __ex_table,\"a\"\n" \
+ " .align 8\n\t" \
+ " .quad 2b,3b\n\t" \
+ ".previous" \
+ : output \
+ : input)
+
+struct paravirt_patch_site;
#ifdef CONFIG_PARAVIRT
-void apply_paravirt(struct paravirt_patch *start, struct paravirt_patch *end);
+void apply_paravirt(struct paravirt_patch_site *start,
+ struct paravirt_patch_site *end);
#else
-static inline void
-apply_paravirt(struct paravirt_patch *start, struct paravirt_patch *end)
+static inline void apply_paravirt(struct paravirt_patch_site *start,
+ struct paravirt_patch_site *end)
{}
#define __parainstructions NULL
#define __parainstructions_end NULL
diff --git a/include/asm-x86/msr.h b/include/asm-x86/msr.h
index 48f73c7..9171564 100644
--- a/include/asm-x86/msr.h
+++ b/include/asm-x86/msr.h
@@ -3,8 +3,6 @@

#include <asm/msr-index.h>

-#ifdef __i386__
-
#ifdef __KERNEL__
#ifndef __ASSEMBLY__

@@ -12,70 +10,66 @@

static inline unsigned long long native_read_msr(unsigned int msr)
{
- unsigned long long val;
-
- asm volatile("rdmsr" : "=A" (val) : "c" (msr));
- return val;
+ unsigned long a, d;
+ asm volatile("rdmsr" : "=a" (a), "=d" (d) : "c" (msr));
+ return a | ((u64)d << 32);
}

static inline unsigned long long native_read_msr_safe(unsigned int msr,
int *err)
{
- unsigned long long val;
-
- asm volatile("2: rdmsr ; xorl %0,%0\n"
- "1:\n\t"
- ".section .fixup,\"ax\"\n\t"
- "3: movl %3,%0 ; jmp 1b\n\t"
- ".previous\n\t"
- ".section __ex_table,\"a\"\n"
- " .align 4\n\t"
- " .long 2b,3b\n\t"
- ".previous"
- : "=r" (*err), "=A" (val)
- : "c" (msr), "i" (-EFAULT));
-
- return val;
+ unsigned long a, d;
+ fixup_section("rdmsr; xor %0, %0", "mov %4, %0",
+ ASM_OUTPUT2("=r" (*err), "=a"((a)), "=d"((d))),
+ "c"(msr), "i"(-EFAULT), "0"(0));
+ return a | ((u64)d << 32);
}

-static inline void native_write_msr(unsigned int msr, unsigned long long val)
+static inline void native_write_msr(unsigned int msr, unsigned low,
+ unsigned high)
{
- asm volatile("wrmsr" : : "c" (msr), "A"(val));
+ asm volatile("wrmsr" : : "c" (msr), "a"(low), "d" (high));
}

static inline int native_write_msr_safe(unsigned int msr,
- unsigned long long val)
+ unsigned low, unsigned high)
{
int err;
- asm volatile("2: wrmsr ; xorl %0,%0\n"
- "1:\n\t"
- ".section .fixup,\"ax\"\n\t"
- "3: movl %4,%0 ; jmp 1b\n\t"
- ".previous\n\t"
- ".section __ex_table,\"a\"\n"
- " .align 4\n\t"
- " .long 2b,3b\n\t"
- ".previous"
- : "=a" (err)
- : "c" (msr), "0" ((u32)val), "d" ((u32)(val>>32)),
- "i" (-EFAULT));
+ fixup_section("wrmsr; xor %0, %0", "mov %4, %0", "=a" (err),
+ "c" (msr), "0" (low), "d" (high),
+ "i" (-EFAULT));
return err;
}

static inline unsigned long long native_read_tsc(void)
{
- unsigned long long val;
- asm volatile("rdtsc" : "=A" (val));
- return val;
+ unsigned int low, high;
+ asm volatile("rdtsc" : "=a" (low), "=d" (high));
+ return low | ((u64)(high) << 32);
}

-static inline unsigned long long native_read_pmc(void)
+static inline unsigned long long native_read_pmc(int counter)
{
- unsigned long long val;
- asm volatile("rdpmc" : "=A" (val));
- return val;
+ unsigned long low, high;
+ asm volatile ("rdpmc"
+ : "=a" (low), "=d" (high)
+ : "c" (counter));
+
+ return low | ((u64)high << 32);
}

+static inline unsigned long long native_read_tscp(int *aux)
+{
+ unsigned long low, high;
+ asm volatile (".byte 0x0f,0x01,0xf9"
+ : "=a" (low), "=d" (high), "=c" (*aux));
+ return low | ((u64)high >> 32);
+}
+
+#endif /* ! __ASSEMBLY__ */
+#endif /* __KERNEL__ */
+
+#ifndef __ASSEMBLY__
#ifdef CONFIG_PARAVIRT
#include <asm/paravirt.h>
#else
@@ -93,20 +87,26 @@ static inline unsigned long long native_read_pmc(void)
(val2) = (u32)(__val >> 32); \
} while(0)

-static inline void wrmsr(u32 __msr, u32 __low, u32 __high)
+static inline void wrmsr(unsigned int msr, unsigned int low, unsigned int high)
{
- native_write_msr(__msr, ((u64)__high << 32) | __low);
+ native_write_msr(msr, low, high);
}

#define rdmsrl(msr,val) \
((val) = native_read_msr(msr))

-#define wrmsrl(msr,val) native_write_msr(msr, val)
+static inline void wrmsrl(unsigned int msr, unsigned long long val)
+{
+ unsigned long low, high;
+ low = (u32)val;
+ high = val >> 32;
+ native_write_msr(msr, low, high);
+}

/* wrmsr with exception handling */
-static inline int wrmsr_safe(u32 __msr, u32 __low, u32 __high)
+static inline int wrmsr_safe(int msr, int low, int high)
{
- return native_write_msr_safe(__msr, ((u64)__high << 32) | __low);
+ return native_write_msr_safe(msr, low, high);
}

/* rdmsr with exception handling */
@@ -129,130 +129,28 @@ static inline int wrmsr_safe(u32 __msr, u32 __low, u32 __high)

#define rdpmc(counter,low,high) \
do { \
- u64 _l = native_read_pmc(); \
+ u64 _l = native_read_pmc(counter); \
(low) = (u32)_l; \
(high) = (u32)(_l >> 32); \
- } while(0)
-#endif /* !CONFIG_PARAVIRT */
-
-#ifdef CONFIG_SMP
-void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
-void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
-int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
-int wrmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
-#else /* CONFIG_SMP */
-static inline void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h)
-{
- rdmsr(msr_no, *l, *h);
-}
-static inline void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
-{
- wrmsr(msr_no, l, h);
-}
-static inline int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h)
-{
- return rdmsr_safe(msr_no, l, h);
-}
-static inline int wrmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
-{
- return wrmsr_safe(msr_no, l, h);
-}
-#endif /* CONFIG_SMP */
-#endif /* ! __ASSEMBLY__ */
-#endif /* __KERNEL__ */
-
-#else /* __i386__ */
-
-#ifndef __ASSEMBLY__
-#include <linux/errno.h>
-/*
- * Access to machine-specific registers (available on 586 and better only)
- * Note: the rd* operations modify the parameters directly (without using
- * pointer indirection), this allows gcc to optimize better
- */
+ } while (0)

-#define rdmsr(msr,val1,val2) \
- __asm__ __volatile__("rdmsr" \
- : "=a" (val1), "=d" (val2) \
- : "c" (msr))
-
-
-#define rdmsrl(msr,val) do { unsigned long a__,b__; \
- __asm__ __volatile__("rdmsr" \
- : "=a" (a__), "=d" (b__) \
- : "c" (msr)); \
- val = a__ | (b__<<32); \
-} while(0)
-
-#define wrmsr(msr,val1,val2) \
- __asm__ __volatile__("wrmsr" \
- : /* no outputs */ \
- : "c" (msr), "a" (val1), "d" (val2))
+#define rdtscp(low, high, aux) \
+ do { \
+ unsigned long long _val = native_read_tscp(&(aux)); \
+ (low) = (u32)_val; \
+ (high) = (u32)(_val >> 32); \
+ } while (0)

-#define wrmsrl(msr,val) wrmsr(msr,(__u32)((__u64)(val)),((__u64)(val))>>32)
+#define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))

-/* wrmsr with exception handling */
-#define wrmsr_safe(msr,a,b) ({ int ret__; \
- asm volatile("2: wrmsr ; xorl %0,%0\n" \
- "1:\n\t" \
- ".section .fixup,\"ax\"\n\t" \
- "3: movl %4,%0 ; jmp 1b\n\t" \
- ".previous\n\t" \
- ".section __ex_table,\"a\"\n" \
- " .align 8\n\t" \
- " .quad 2b,3b\n\t" \
- ".previous" \
- : "=a" (ret__) \
- : "c" (msr), "0" (a), "d" (b), "i" (-EFAULT)); \
- ret__; })
+#endif /* !CONFIG_PARAVIRT */

#define checking_wrmsrl(msr,val) wrmsr_safe(msr,(u32)(val),(u32)((val)>>32))

-#define rdmsr_safe(msr,a,b) \
- ({ int ret__; \
- asm volatile ("1: rdmsr\n" \
- "2:\n" \
- ".section .fixup,\"ax\"\n" \
- "3: movl %4,%0\n" \
- " jmp 2b\n" \
- ".previous\n" \
- ".section __ex_table,\"a\"\n" \
- " .align 8\n" \
- " .quad 1b,3b\n" \
- ".previous":"=&bDS" (ret__), "=a"(*(a)), "=d"(*(b)) \
- :"c"(msr), "i"(-EIO), "0"(0)); \
- ret__; })
-
-#define rdtsc(low,high) \
- __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))
-
-#define rdtscl(low) \
- __asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
-
-#define rdtscp(low,high,aux) \
- asm volatile (".byte 0x0f,0x01,0xf9" : "=a" (low), "=d" (high), "=c" (aux))
-
-#define rdtscll(val) do { \
- unsigned int __a,__d; \
- asm volatile("rdtsc" : "=a" (__a), "=d" (__d)); \
- (val) = ((unsigned long)__a) | (((unsigned long)__d)<<32); \
-} while(0)
-
-#define rdtscpll(val, aux) do { \
- unsigned long __a, __d; \
- asm volatile (".byte 0x0f,0x01,0xf9" : "=a" (__a), "=d" (__d), "=c" (aux)); \
- (val) = (__d << 32) | __a; \
-} while (0)
-
#define write_tsc(val1,val2) wrmsr(0x10, val1, val2)

#define write_rdtscp_aux(val) wrmsr(0xc0000103, val, 0)

-#define rdpmc(counter,low,high) \
- __asm__ __volatile__("rdpmc" \
- : "=a" (low), "=d" (high) \
- : "c" (counter))
-
#ifdef CONFIG_SMP
void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
@@ -275,9 +173,6 @@ static inline int wrmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
{
return wrmsr_safe(msr_no, l, h);
}
-#endif /* CONFIG_SMP */
-#endif /* __ASSEMBLY__ */
-
-#endif /* !__i386__ */
-
+#endif /* CONFIG_SMP */
+#endif /* ! __ASSEMBLY__ */
#endif
diff --git a/include/asm-x86/tsc.h b/include/asm-x86/tsc.h
index d7b1c4e..651e6ac 100644
--- a/include/asm-x86/tsc.h
+++ b/include/asm-x86/tsc.h
@@ -33,7 +33,7 @@ static inline cycles_t get_cycles(void)
}

/* Like get_cycles, but make sure the CPU is synchronized. */
-static __always_inline cycles_t get_cycles_sync(void)
+static __always_inline cycles_t __get_cycles_sync(void)
{
unsigned long long ret;
unsigned eax, edx;
@@ -55,11 +55,40 @@ static __always_inline cycles_t get_cycles_sync(void)
*/
alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
"=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
- rdtscll(ret);

+ return 0;
+}
+
+static __always_inline cycles_t get_cycles_sync(void)
+{
+ unsigned long long ret;
+ ret = __get_cycles_sync();
+ if (!ret)
+ rdtscll(ret);
return ret;
}

+#ifdef CONFIG_PARAVIRT
+/*
+ * For paravirt guests, some functionalities are executed through function
+ * pointers in the various pvops structures.
+ * These function pointers exist inside the kernel and can not
+ * be accessed by user space. To avoid this, we make a copy of the
+ * get_cycles_sync (called in kernel) but force the use of native_read_tsc.
+ * Ideally, the guest should set up it's own clock and vread
+ */
+static __always_inline long long vget_cycles_sync(void)
+{
+ unsigned long long ret;
+ ret = __get_cycles_sync();
+ if (!ret)
+ ret = native_read_tsc();
+ return ret;
+}
+#else
+# define vget_cycles_sync() get_cycles_sync()
+#endif
+
extern void tsc_init(void);
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
--
1.4.4.2

2007-11-09 21:32:57

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 15/24] native versions for set pagetables

This patch turns the set_p{te,md,ud,gd} functions into their
native_ versions. There is no need to patch any caller.

Also, it adds pte_update() and pte_update_defer() calls whenever
we modify a page table entry. This last part was coded to match
i386 as close as possible.

Pieces of the header are moved to below the #ifdef CONFIG_PARAVIRT
site, as they are users of the newly defined set_* macros.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
include/asm-x86/pgtable_64.h | 192 ++++++++++++++++++++++++++++--------------
1 files changed, 128 insertions(+), 64 deletions(-)

diff --git a/include/asm-x86/pgtable_64.h b/include/asm-x86/pgtable_64.h
index 9b0ff47..592d613 100644
--- a/include/asm-x86/pgtable_64.h
+++ b/include/asm-x86/pgtable_64.h
@@ -57,56 +57,107 @@ extern unsigned long empty_zero_page[PAGE_SIZE/sizeof(unsigned long)];
*/
#define PTRS_PER_PTE 512

-#ifndef __ASSEMBLY__
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+
+#define set_pte native_set_pte
+#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
+#define set_pmd native_set_pmd
+#define set_pud native_set_pud
+#define set_pgd native_set_pgd
+#define pte_clear(mm, addr, xp) \
+do { \
+ set_pte_at(mm, addr, xp, __pte(0)); \
+} while (0)

-#define pte_ERROR(e) \
- printk("%s:%d: bad pte %p(%016lx).\n", __FILE__, __LINE__, &(e), pte_val(e))
-#define pmd_ERROR(e) \
- printk("%s:%d: bad pmd %p(%016lx).\n", __FILE__, __LINE__, &(e), pmd_val(e))
-#define pud_ERROR(e) \
- printk("%s:%d: bad pud %p(%016lx).\n", __FILE__, __LINE__, &(e), pud_val(e))
-#define pgd_ERROR(e) \
- printk("%s:%d: bad pgd %p(%016lx).\n", __FILE__, __LINE__, &(e), pgd_val(e))
+#define pmd_clear(xp) do { set_pmd(xp, __pmd(0)); } while (0)
+#define pud_clear native_pud_clear
+#define pgd_clear native_pgd_clear
+#define pte_update(mm, addr, ptep) do { } while (0)
+#define pte_update_defer(mm, addr, ptep) do { } while (0)

-#define pgd_none(x) (!pgd_val(x))
-#define pud_none(x) (!pud_val(x))
+#endif

-static inline void set_pte(pte_t *dst, pte_t val)
+#ifndef __ASSEMBLY__
+
+static inline void native_set_pte(pte_t *dst, pte_t val)
{
- pte_val(*dst) = pte_val(val);
+ dst->pte = pte_val(val);
}
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)

-static inline void set_pmd(pmd_t *dst, pmd_t val)
+static inline void native_set_pmd(pmd_t *dst, pmd_t val)
{
- pmd_val(*dst) = pmd_val(val);
+ dst->pmd = pmd_val(val);
}

-static inline void set_pud(pud_t *dst, pud_t val)
+static inline void native_set_pud(pud_t *dst, pud_t val)
{
- pud_val(*dst) = pud_val(val);
+ dst->pud = pud_val(val);
}

-static inline void pud_clear (pud_t *pud)
+static inline void native_set_pgd(pgd_t *dst, pgd_t val)
{
- set_pud(pud, __pud(0));
+ dst->pgd = pgd_val(val);
}

-static inline void set_pgd(pgd_t *dst, pgd_t val)
+static inline void native_pud_clear(pud_t *pud)
{
- pgd_val(*dst) = pgd_val(val);
-}
+ set_pud(pud, __pud(0));
+}

-static inline void pgd_clear (pgd_t * pgd)
+static inline void native_pgd_clear(pgd_t *pgd)
{
set_pgd(pgd, __pgd(0));
}

-#define ptep_get_and_clear(mm,addr,xp) __pte(xchg(&(xp)->pte, 0))
+static inline void native_set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pteval)
+{
+ native_set_pte(ptep, pteval);
+}
+
+static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep)
+{
+ native_set_pte_at(mm, addr, ptep, __pte(0));
+}
+
+static inline void native_pmd_clear(pmd_t *pmd)
+{
+ native_set_pmd(pmd, __pmd(0));
+}
+
+
+#define pte_ERROR(e) \
+ printk("%s:%d: bad pte %p(%016llx).\n", \
+ __FILE__, __LINE__, &(e), (u64)pte_val(e))
+#define pmd_ERROR(e) \
+ printk("%s:%d: bad pmd %p(%016llx).\n", \
+ __FILE__, __LINE__, &(e), (u64)pmd_val(e))
+#define pud_ERROR(e) \
+ printk("%s:%d: bad pud %p(%016llx).\n", \
+ __FILE__, __LINE__, &(e), (u64)pud_val(e))
+#define pgd_ERROR(e) \
+ printk("%s:%d: bad pgd %p(%016llx).\n", \
+ __FILE__, __LINE__, &(e), (u64)pgd_val(e))
+
+#define pgd_none(x) (!pgd_val(x))
+#define pud_none(x) (!pud_val(x))

struct mm_struct;

-static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, unsigned long addr, pte_t *ptep, int full)
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ pte_t pte = __pte(xchg(&ptep->pte, 0));
+ pte_update(mm, addr, ptep);
+ return pte;
+}
+
+static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep,
+ int full)
{
pte_t pte;
if (full) {
@@ -246,7 +297,6 @@ static inline unsigned long pmd_bad(pmd_t pmd)

#define pte_none(x) (!pte_val(x))
#define pte_present(x) (pte_val(x) & (_PAGE_PRESENT | _PAGE_PROTNONE))
-#define pte_clear(mm,addr,xp) do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)

#define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT)) /* FIXME: is this
right? */
@@ -255,11 +305,11 @@ static inline unsigned long pmd_bad(pmd_t pmd)

static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
{
- pte_t pte;
- pte_val(pte) = (page_nr << PAGE_SHIFT);
- pte_val(pte) |= pgprot_val(pgprot);
- pte_val(pte) &= __supported_pte_mask;
- return pte;
+ unsigned long pte;
+ pte = (page_nr << PAGE_SHIFT);
+ pte |= pgprot_val(pgprot);
+ pte &= __supported_pte_mask;
+ return __pte(pte);
}

/*
@@ -283,30 +333,6 @@ static inline pte_t pte_mkwrite(pte_t pte) { set_pte(&pte, __pte(pte_val(pte) |
static inline pte_t pte_mkhuge(pte_t pte) { set_pte(&pte, __pte(pte_val(pte) | _PAGE_PSE)); return pte; }
static inline pte_t pte_clrhuge(pte_t pte) { set_pte(&pte, __pte(pte_val(pte) & ~_PAGE_PSE)); return pte; }

-struct vm_area_struct;
-
-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
-{
- if (!pte_young(*ptep))
- return 0;
- return test_and_clear_bit(_PAGE_BIT_ACCESSED, &ptep->pte);
-}
-
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
-{
- clear_bit(_PAGE_BIT_RW, &ptep->pte);
-}
-
-/*
- * Macro to mark a page protection value as "uncacheable".
- */
-#define pgprot_noncached(prot) (__pgprot(pgprot_val(prot) | _PAGE_PCD | _PAGE_PWT))
-
-static inline int pmd_large(pmd_t pte) {
- return (pmd_val(pte) & __LARGE_PTE) == __LARGE_PTE;
-}
-
-
/*
* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
@@ -340,7 +366,6 @@ static inline int pmd_large(pmd_t pte) {
pmd_index(address))
#define pmd_none(x) (!pmd_val(x))
#define pmd_present(x) (pmd_val(x) & _PAGE_PRESENT)
-#define pmd_clear(xp) do { set_pmd(xp, __pmd(0)); } while (0)
#define pfn_pmd(nr,prot) (__pmd(((nr) << PAGE_SHIFT) | pgprot_val(prot)))
#define pmd_pfn(x) ((pmd_val(x) & __PHYSICAL_MASK) >> PAGE_SHIFT)

@@ -352,15 +377,53 @@ static inline int pmd_large(pmd_t pte) {

/* page, protection -> pte */
#define mk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot))
-#define mk_pte_huge(entry) (pte_val(entry) |= _PAGE_PRESENT | _PAGE_PSE)
-
+
+static inline pte_t __mk_pte_huge(pte_t entry)
+{
+ unsigned long pte;
+ pte = pte_val(entry);
+ pte |= _PAGE_PRESENT | _PAGE_PSE;
+ return __pte(pte);
+}
+#define mk_pte_huge(entry) ((entry) = __mk_pte_huge(entry))
+
+#include <linux/mm_types.h>
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+ int ret = 0;
+ if (!pte_young(*ptep))
+ return 0;
+ ret = test_and_clear_bit(_PAGE_BIT_ACCESSED, &ptep->pte);
+ pte_update(vma->vm_mm, addr, ptep);
+ return ret;
+}
+
+static inline void ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ clear_bit(_PAGE_BIT_RW, &ptep->pte);
+ pte_update(mm, addr, ptep);
+}
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+#define pgprot_noncached(prot) (__pgprot(pgprot_val(prot) | _PAGE_PCD | _PAGE_PWT))
+
+static inline int pmd_large(pmd_t pte)
+{
+ return (pmd_val(pte) & __LARGE_PTE) == __LARGE_PTE;
+}
+
/* Change flags of a PTE */
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+static inline pte_t pte_modify(pte_t pte_old, pgprot_t newprot)
{
- pte_val(pte) &= _PAGE_CHG_MASK;
- pte_val(pte) |= pgprot_val(newprot);
- pte_val(pte) &= __supported_pte_mask;
- return pte;
+ unsigned long pte = pte_val(pte_old);
+ pte &= _PAGE_CHG_MASK;
+ pte |= pgprot_val(newprot);
+ pte &= __supported_pte_mask;
+ return __pte(pte);
}

#define pte_index(address) \
@@ -387,6 +450,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
int __changed = !pte_same(*(__ptep), __entry); \
if (__changed && __dirty) { \
set_pte(__ptep, __entry); \
+ pte_update_defer((__vma)->vm_mm, (__address), (__ptep)); \
flush_tlb_page(__vma, __address); \
} \
__changed; \
--
1.4.4.2

2007-11-09 21:33:26

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 5/24] smp x86 consolidation

This patch consolidates part of the pieces of smp for both architectures.
(i386 and x86_64). It makes part the calls go through smp_ops, and shares
code for those functions in smpcommon.c

There's more room for code sharing here, but it is left as an exercise to
the reader ;-)

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/Makefile_32 | 2 +-
arch/x86/kernel/Makefile_64 | 2 +-
arch/x86/kernel/smp_32.c | 218 ------------------------------
arch/x86/kernel/smp_64.c | 245 ---------------------------------
arch/x86/kernel/smpboot_64.c | 8 +-
arch/x86/kernel/smpcommon.c | 291 ++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/smpcommon_32.c | 81 -----------
include/asm-x86/idle.h | 18 +++-
include/asm-x86/smp.h | 66 +++++++++
include/asm-x86/smp_32.h | 58 --------
include/asm-x86/smp_64.h | 4 -
11 files changed, 379 insertions(+), 614 deletions(-)

diff --git a/arch/x86/kernel/Makefile_32 b/arch/x86/kernel/Makefile_32
index b08179a..b0da543 100644
--- a/arch/x86/kernel/Makefile_32
+++ b/arch/x86/kernel/Makefile_32
@@ -20,7 +20,7 @@ obj-$(CONFIG_MICROCODE) += microcode.o
obj-$(CONFIG_PCI) += early-quirks.o
obj-$(CONFIG_APM) += apm_32.o
obj-$(CONFIG_X86_SMP) += smp_32.o smpboot_32.o tsc_sync.o
-obj-$(CONFIG_SMP) += smpcommon_32.o
+obj-$(CONFIG_SMP) += smpcommon.o
obj-$(CONFIG_X86_TRAMPOLINE) += trampoline_32.o
obj-$(CONFIG_X86_MPPARSE) += mpparse_32.o
obj-$(CONFIG_X86_LOCAL_APIC) += apic_32.o nmi_32.o
diff --git a/arch/x86/kernel/Makefile_64 b/arch/x86/kernel/Makefile_64
index 686de84..ffee997 100644
--- a/arch/x86/kernel/Makefile_64
+++ b/arch/x86/kernel/Makefile_64
@@ -17,7 +17,7 @@ obj-y += acpi/
obj-$(CONFIG_X86_MSR) += msr.o
obj-$(CONFIG_MICROCODE) += microcode.o
obj-$(CONFIG_X86_CPUID) += cpuid.o
-obj-$(CONFIG_SMP) += smp_64.o smpboot_64.o trampoline_64.o tsc_sync.o
+obj-$(CONFIG_SMP) += smp_64.o smpboot_64.o trampoline_64.o tsc_sync.o smpcommon.o
obj-y += apic_64.o nmi_64.o
obj-y += io_apic_64.o mpparse_64.o genapic_64.o genapic_flat_64.o
obj-$(CONFIG_KEXEC) += machine_kexec_64.o relocate_kernel_64.o crash.o
diff --git a/arch/x86/kernel/smp_32.c b/arch/x86/kernel/smp_32.c
index fcaa026..a7cc319 100644
--- a/arch/x86/kernel/smp_32.c
+++ b/arch/x86/kernel/smp_32.c
@@ -464,213 +464,6 @@ void flush_tlb_all(void)
on_each_cpu(do_flush_tlb_all, NULL, 1, 1);
}

-/*
- * this function sends a 'reschedule' IPI to another CPU.
- * it goes straight through and wastes no time serializing
- * anything. Worst case is that we lose a reschedule ...
- */
-static void native_smp_send_reschedule(int cpu)
-{
- WARN_ON(cpu_is_offline(cpu));
- send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
-}
-
-/*
- * Structure and data for smp_call_function(). This is designed to minimise
- * static memory requirements. It also looks cleaner.
- */
-static DEFINE_SPINLOCK(call_lock);
-
-struct call_data_struct {
- void (*func) (void *info);
- void *info;
- atomic_t started;
- atomic_t finished;
- int wait;
-};
-
-void lock_ipi_call_lock(void)
-{
- spin_lock_irq(&call_lock);
-}
-
-void unlock_ipi_call_lock(void)
-{
- spin_unlock_irq(&call_lock);
-}
-
-static struct call_data_struct *call_data;
-
-static void __smp_call_function(void (*func) (void *info), void *info,
- int nonatomic, int wait)
-{
- struct call_data_struct data;
- int cpus = num_online_cpus() - 1;
-
- if (!cpus)
- return;
-
- data.func = func;
- data.info = info;
- atomic_set(&data.started, 0);
- data.wait = wait;
- if (wait)
- atomic_set(&data.finished, 0);
-
- call_data = &data;
- mb();
-
- /* Send a message to all other CPUs and wait for them to respond */
- send_IPI_allbutself(CALL_FUNCTION_VECTOR);
-
- /* Wait for response */
- while (atomic_read(&data.started) != cpus)
- cpu_relax();
-
- if (wait)
- while (atomic_read(&data.finished) != cpus)
- cpu_relax();
-}
-
-
-/**
- * smp_call_function_mask(): Run a function on a set of other CPUs.
- * @mask: The set of cpus to run on. Must not include the current cpu.
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @wait: If true, wait (atomically) until function has completed on other CPUs.
- *
- * Returns 0 on success, else a negative status code.
- *
- * If @wait is true, then returns once @func has returned; otherwise
- * it returns just before the target cpu calls @func.
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- */
-static int
-native_smp_call_function_mask(cpumask_t mask,
- void (*func)(void *), void *info,
- int wait)
-{
- struct call_data_struct data;
- cpumask_t allbutself;
- int cpus;
-
- /* Can deadlock when called with interrupts disabled */
- WARN_ON(irqs_disabled());
-
- /* Holding any lock stops cpus from going down. */
- spin_lock(&call_lock);
-
- allbutself = cpu_online_map;
- cpu_clear(smp_processor_id(), allbutself);
-
- cpus_and(mask, mask, allbutself);
- cpus = cpus_weight(mask);
-
- if (!cpus) {
- spin_unlock(&call_lock);
- return 0;
- }
-
- data.func = func;
- data.info = info;
- atomic_set(&data.started, 0);
- data.wait = wait;
- if (wait)
- atomic_set(&data.finished, 0);
-
- call_data = &data;
- mb();
-
- /* Send a message to other CPUs */
- if (cpus_equal(mask, allbutself))
- send_IPI_allbutself(CALL_FUNCTION_VECTOR);
- else
- send_IPI_mask(mask, CALL_FUNCTION_VECTOR);
-
- /* Wait for response */
- while (atomic_read(&data.started) != cpus)
- cpu_relax();
-
- if (wait)
- while (atomic_read(&data.finished) != cpus)
- cpu_relax();
- spin_unlock(&call_lock);
-
- return 0;
-}
-
-static void stop_this_cpu (void * dummy)
-{
- local_irq_disable();
- /*
- * Remove this CPU:
- */
- cpu_clear(smp_processor_id(), cpu_online_map);
- disable_local_APIC();
- if (cpu_data(smp_processor_id()).hlt_works_ok)
- for(;;) halt();
- for (;;);
-}
-
-/*
- * this function calls the 'stop' function on all other CPUs in the system.
- */
-
-static void native_smp_send_stop(void)
-{
- /* Don't deadlock on the call lock in panic */
- int nolock = !spin_trylock(&call_lock);
- unsigned long flags;
-
- local_irq_save(flags);
- __smp_call_function(stop_this_cpu, NULL, 0, 0);
- if (!nolock)
- spin_unlock(&call_lock);
- disable_local_APIC();
- local_irq_restore(flags);
-}
-
-/*
- * Reschedule call back. Nothing to do,
- * all the work is done automatically when
- * we return from the interrupt.
- */
-fastcall void smp_reschedule_interrupt(struct pt_regs *regs)
-{
- ack_APIC_irq();
- __get_cpu_var(irq_stat).irq_resched_count++;
-}
-
-fastcall void smp_call_function_interrupt(struct pt_regs *regs)
-{
- void (*func) (void *info) = call_data->func;
- void *info = call_data->info;
- int wait = call_data->wait;
-
- ack_APIC_irq();
- /*
- * Notify initiating CPU that I've grabbed the data and am
- * about to execute the function
- */
- mb();
- atomic_inc(&call_data->started);
- /*
- * At this point the info structure may be out of scope unless wait==1
- */
- irq_enter();
- (*func)(info);
- __get_cpu_var(irq_stat).irq_call_count++;
- irq_exit();
-
- if (wait) {
- mb();
- atomic_inc(&call_data->finished);
- }
-}
-
static int convert_apicid_to_cpu(int apic_id)
{
int i;
@@ -698,14 +491,3 @@ int safe_smp_processor_id(void)
return cpuid >= 0 ? cpuid : 0;
}

-struct smp_ops smp_ops = {
- .smp_prepare_boot_cpu = native_smp_prepare_boot_cpu,
- .smp_prepare_cpus = native_smp_prepare_cpus,
- .cpu_up = native_cpu_up,
- .smp_cpus_done = native_smp_cpus_done,
-
- .smp_send_stop = native_smp_send_stop,
- .smp_send_reschedule = native_smp_send_reschedule,
- .smp_call_function_mask = native_smp_call_function_mask,
-};
-EXPORT_SYMBOL_GPL(smp_ops);
diff --git a/arch/x86/kernel/smp_64.c b/arch/x86/kernel/smp_64.c
index ce3935b..c19c68b 100644
--- a/arch/x86/kernel/smp_64.c
+++ b/arch/x86/kernel/smp_64.c
@@ -283,248 +283,3 @@ void flush_tlb_all(void)
{
on_each_cpu(do_flush_tlb_all, NULL, 1, 1);
}
-
-/*
- * this function sends a 'reschedule' IPI to another CPU.
- * it goes straight through and wastes no time serializing
- * anything. Worst case is that we lose a reschedule ...
- */
-
-void smp_send_reschedule(int cpu)
-{
- send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
-}
-
-/*
- * Structure and data for smp_call_function(). This is designed to minimise
- * static memory requirements. It also looks cleaner.
- */
-static DEFINE_SPINLOCK(call_lock);
-
-struct call_data_struct {
- void (*func) (void *info);
- void *info;
- atomic_t started;
- atomic_t finished;
- int wait;
-};
-
-static struct call_data_struct * call_data;
-
-void lock_ipi_call_lock(void)
-{
- spin_lock_irq(&call_lock);
-}
-
-void unlock_ipi_call_lock(void)
-{
- spin_unlock_irq(&call_lock);
-}
-
-/*
- * this function sends a 'generic call function' IPI to all other CPU
- * of the system defined in the mask.
- */
-static int __smp_call_function_mask(cpumask_t mask,
- void (*func)(void *), void *info,
- int wait)
-{
- struct call_data_struct data;
- cpumask_t allbutself;
- int cpus;
-
- allbutself = cpu_online_map;
- cpu_clear(smp_processor_id(), allbutself);
-
- cpus_and(mask, mask, allbutself);
- cpus = cpus_weight(mask);
-
- if (!cpus)
- return 0;
-
- data.func = func;
- data.info = info;
- atomic_set(&data.started, 0);
- data.wait = wait;
- if (wait)
- atomic_set(&data.finished, 0);
-
- call_data = &data;
- wmb();
-
- /* Send a message to other CPUs */
- if (cpus_equal(mask, allbutself))
- send_IPI_allbutself(CALL_FUNCTION_VECTOR);
- else
- send_IPI_mask(mask, CALL_FUNCTION_VECTOR);
-
- /* Wait for response */
- while (atomic_read(&data.started) != cpus)
- cpu_relax();
-
- if (!wait)
- return 0;
-
- while (atomic_read(&data.finished) != cpus)
- cpu_relax();
-
- return 0;
-}
-/**
- * smp_call_function_mask(): Run a function on a set of other CPUs.
- * @mask: The set of cpus to run on. Must not include the current cpu.
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @wait: If true, wait (atomically) until function has completed on other CPUs.
- *
- * Returns 0 on success, else a negative status code.
- *
- * If @wait is true, then returns once @func has returned; otherwise
- * it returns just before the target cpu calls @func.
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- */
-int smp_call_function_mask(cpumask_t mask,
- void (*func)(void *), void *info,
- int wait)
-{
- int ret;
-
- /* Can deadlock when called with interrupts disabled */
- WARN_ON(irqs_disabled());
-
- spin_lock(&call_lock);
- ret = __smp_call_function_mask(mask, func, info, wait);
- spin_unlock(&call_lock);
- return ret;
-}
-EXPORT_SYMBOL(smp_call_function_mask);
-
-/*
- * smp_call_function_single - Run a function on a specific CPU
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @nonatomic: Currently unused.
- * @wait: If true, wait until function has completed on other CPUs.
- *
- * Retrurns 0 on success, else a negative status code.
- *
- * Does not return until the remote CPU is nearly ready to execute <func>
- * or is or has executed.
- */
-
-int smp_call_function_single (int cpu, void (*func) (void *info), void *info,
- int nonatomic, int wait)
-{
- /* prevent preemption and reschedule on another processor */
- int ret, me = get_cpu();
-
- /* Can deadlock when called with interrupts disabled */
- WARN_ON(irqs_disabled());
-
- if (cpu == me) {
- local_irq_disable();
- func(info);
- local_irq_enable();
- put_cpu();
- return 0;
- }
-
- ret = smp_call_function_mask(cpumask_of_cpu(cpu), func, info, wait);
-
- put_cpu();
- return ret;
-}
-EXPORT_SYMBOL(smp_call_function_single);
-
-/*
- * smp_call_function - run a function on all other CPUs.
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @nonatomic: currently unused.
- * @wait: If true, wait (atomically) until function has completed on other
- * CPUs.
- *
- * Returns 0 on success, else a negative status code. Does not return until
- * remote CPUs are nearly ready to execute func or are or have executed.
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- * Actually there are a few legal cases, like panic.
- */
-int smp_call_function (void (*func) (void *info), void *info, int nonatomic,
- int wait)
-{
- return smp_call_function_mask(cpu_online_map, func, info, wait);
-}
-EXPORT_SYMBOL(smp_call_function);
-
-static void stop_this_cpu(void *dummy)
-{
- local_irq_disable();
- /*
- * Remove this CPU:
- */
- cpu_clear(smp_processor_id(), cpu_online_map);
- disable_local_APIC();
- for (;;)
- halt();
-}
-
-void smp_send_stop(void)
-{
- int nolock;
- unsigned long flags;
-
- if (reboot_force)
- return;
-
- /* Don't deadlock on the call lock in panic */
- nolock = !spin_trylock(&call_lock);
- local_irq_save(flags);
- __smp_call_function_mask(cpu_online_map, stop_this_cpu, NULL, 0);
- if (!nolock)
- spin_unlock(&call_lock);
- disable_local_APIC();
- local_irq_restore(flags);
-}
-
-/*
- * Reschedule call back. Nothing to do,
- * all the work is done automatically when
- * we return from the interrupt.
- */
-asmlinkage void smp_reschedule_interrupt(void)
-{
- ack_APIC_irq();
- add_pda(irq_resched_count, 1);
-}
-
-asmlinkage void smp_call_function_interrupt(void)
-{
- void (*func) (void *info) = call_data->func;
- void *info = call_data->info;
- int wait = call_data->wait;
-
- ack_APIC_irq();
- /*
- * Notify initiating CPU that I've grabbed the data and am
- * about to execute the function
- */
- mb();
- atomic_inc(&call_data->started);
- /*
- * At this point the info structure may be out of scope unless wait==1
- */
- exit_idle();
- irq_enter();
- (*func)(info);
- add_pda(irq_call_count, 1);
- irq_exit();
- if (wait) {
- mb();
- atomic_inc(&call_data->finished);
- }
-}
-
diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
index 6f3e63c..17e54fa 100644
--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -864,7 +864,7 @@ void __init smp_set_apicids(void)
* Prepare for SMP bootup. The MP table or ACPI has been read
* earlier. Just do some sanity checking here and enable APIC mode.
*/
-void __init smp_prepare_cpus(unsigned int max_cpus)
+void __init native_smp_prepare_cpus(unsigned int max_cpus)
{
nmi_watchdog_default();
current_cpu_data = boot_cpu_data;
@@ -908,7 +908,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
/*
* Early setup to make printk work.
*/
-void __init smp_prepare_boot_cpu(void)
+void __init native_smp_prepare_boot_cpu(void)
{
int me = smp_processor_id();
cpu_set(me, cpu_online_map);
@@ -919,7 +919,7 @@ void __init smp_prepare_boot_cpu(void)
/*
* Entry point to boot a CPU.
*/
-int __cpuinit __cpu_up(unsigned int cpu)
+int __cpuinit native_cpu_up(unsigned int cpu)
{
int apicid = cpu_present_to_apicid(cpu);
unsigned long flags;
@@ -977,7 +977,7 @@ int __cpuinit __cpu_up(unsigned int cpu)
/*
* Finish the SMP boot.
*/
-void __init smp_cpus_done(unsigned int max_cpus)
+void __init native_smp_cpus_done(unsigned int max_cpus)
{
smp_cleanup_boot();
setup_ioapic_dest();
diff --git a/arch/x86/kernel/smpcommon.c b/arch/x86/kernel/smpcommon.c
new file mode 100644
index 0000000..e3bee40
--- /dev/null
+++ b/arch/x86/kernel/smpcommon.c
@@ -0,0 +1,291 @@
+/*
+ * SMP stuff which is common to all sub-architectures.
+ */
+#include <linux/cpu.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+#include <asm/mach_apic.h>
+#include <asm/proto.h>
+#include <asm/apicdef.h>
+#include <asm/smp.h>
+#include <asm/idle.h>
+
+#ifdef CONFIG_X86_32
+DEFINE_PER_CPU(unsigned long, this_cpu_off);
+EXPORT_PER_CPU_SYMBOL(this_cpu_off);
+
+/* Initialize the CPU's GDT. This is either the boot CPU doing itself
+ (still using the master per-cpu area), or a CPU doing it for a
+ secondary which will soon come up. */
+__cpuinit void init_gdt(int cpu)
+{
+ struct desc_struct *gdt = get_cpu_gdt_table(cpu);
+
+ pack_descriptor((u32 *)&gdt[GDT_ENTRY_PERCPU].a,
+ (u32 *)&gdt[GDT_ENTRY_PERCPU].b,
+ __per_cpu_offset[cpu], 0xFFFFF,
+ 0x80 | DESCTYPE_S | 0x2, 0x8);
+
+ per_cpu(this_cpu_off, cpu) = __per_cpu_offset[cpu];
+ per_cpu(cpu_number, cpu) = cpu;
+}
+#endif
+
+/*
+ * Structure and data for smp_call_function(). This is designed to minimise
+ * static memory requirements. It also looks cleaner.
+ */
+static DEFINE_SPINLOCK(call_lock);
+
+struct call_data_struct {
+ void (*func) (void *info);
+ void *info;
+ atomic_t started;
+ atomic_t finished;
+ int wait;
+};
+
+static struct call_data_struct * call_data;
+
+void lock_ipi_call_lock(void)
+{
+ spin_lock_irq(&call_lock);
+}
+
+void unlock_ipi_call_lock(void)
+{
+ spin_unlock_irq(&call_lock);
+}
+
+/**
+ * smp_call_function_mask(): Run a function on a set of other CPUs.
+ * @mask: The set of cpus to run on. Must not include the current cpu.
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+ * @wait: If true, wait (atomically) until function has completed on other CPUs.
+ *
+ * Returns 0 on success, else a negative status code.
+ *
+ * If @wait is true, then returns once @func has returned; otherwise
+ * it returns just before the target cpu calls @func.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler.
+ */
+static int
+native_smp_call_function_mask(cpumask_t mask,
+ void (*func)(void *), void *info,
+ int wait)
+{
+ struct call_data_struct data;
+ cpumask_t allbutself;
+ int cpus;
+
+ /* Can deadlock when called with interrupts disabled */
+ WARN_ON(irqs_disabled());
+
+ /* Holding any lock stops cpus from going down. */
+ spin_lock(&call_lock);
+
+ allbutself = cpu_online_map;
+ cpu_clear(smp_processor_id(), allbutself);
+
+ cpus_and(mask, mask, allbutself);
+ cpus = cpus_weight(mask);
+
+ if (!cpus) {
+ spin_unlock(&call_lock);
+ return 0;
+ }
+
+ data.func = func;
+ data.info = info;
+ atomic_set(&data.started, 0);
+ data.wait = wait;
+ if (wait)
+ atomic_set(&data.finished, 0);
+
+ call_data = &data;
+ mb();
+
+ /* Send a message to other CPUs */
+ if (cpus_equal(mask, allbutself))
+ send_IPI_allbutself(CALL_FUNCTION_VECTOR);
+ else
+ send_IPI_mask(mask, CALL_FUNCTION_VECTOR);
+
+ /* Wait for response */
+ while (atomic_read(&data.started) != cpus)
+ cpu_relax();
+
+ if (wait)
+ while (atomic_read(&data.finished) != cpus)
+ cpu_relax();
+ spin_unlock(&call_lock);
+
+ return 0;
+}
+
+/**
+ * smp_call_function(): Run a function on all other CPUs.
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+ * @nonatomic: Unused.
+ * @wait: If true, wait (atomically) until function has completed on other CPUs.
+ *
+ * Returns 0 on success, else a negative status code.
+ *
+ * If @wait is true, then returns once @func has returned; otherwise
+ * it returns just before the target cpu calls @func.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler.
+ */
+int smp_call_function(void (*func) (void *info), void *info, int nonatomic,
+ int wait)
+{
+ return smp_call_function_mask(cpu_online_map, func, info, wait);
+}
+EXPORT_SYMBOL(smp_call_function);
+
+/**
+ * smp_call_function_single - Run a function on a specific CPU
+ * @cpu: The target CPU. Cannot be the calling CPU.
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+ * @nonatomic: Unused.
+ * @wait: If true, wait until function has completed on other CPUs.
+ *
+ * Returns 0 on success, else a negative status code.
+ *
+ * If @wait is true, then returns once @func has returned; otherwise
+ * it returns just before the target cpu calls @func.
+ */
+int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
+ int nonatomic, int wait)
+{
+ /* prevent preemption and reschedule on another processor */
+ int ret;
+ int me = get_cpu();
+ if (cpu == me) {
+ local_irq_disable();
+ func(info);
+ local_irq_enable();
+ put_cpu();
+ return 0;
+ }
+
+ ret = smp_call_function_mask(cpumask_of_cpu(cpu), func, info, wait);
+
+ put_cpu();
+ return ret;
+}
+EXPORT_SYMBOL(smp_call_function_single);
+
+static void stop_this_cpu(void *dummy)
+{
+ local_irq_disable();
+ /*
+ * Remove this CPU:
+ */
+ cpu_clear(smp_processor_id(), cpu_online_map);
+ disable_local_APIC();
+#ifdef CONFIG_X86_32
+ if (cpu_data(smp_processor_id()).hlt_works_ok)
+#endif
+ for (;;)
+ halt();
+ for (;;);
+}
+
+void native_smp_send_stop(void)
+{
+ int nolock;
+ unsigned long flags;
+
+#ifdef CONFIG_X86_64
+ if (reboot_force)
+ return;
+#endif
+
+ /* Don't deadlock on the call lock in panic */
+ nolock = !spin_trylock(&call_lock);
+ local_irq_save(flags);
+ smp_call_function(stop_this_cpu, NULL, 0, 0);
+ if (!nolock)
+ spin_unlock(&call_lock);
+ disable_local_APIC();
+ local_irq_restore(flags);
+}
+
+asmlinkage void smp_call_function_interrupt(void)
+{
+ void (*func) (void *info) = call_data->func;
+ void *info = call_data->info;
+ int wait = call_data->wait;
+
+ ack_APIC_irq();
+ /*
+ * Notify initiating CPU that I've grabbed the data and am
+ * about to execute the function
+ */
+ mb();
+ atomic_inc(&call_data->started);
+ /*
+ * At this point the info structure may be out of scope unless wait==1
+ */
+ exit_idle();
+ irq_enter();
+ (*func)(info);
+ /* FIXME: Put this thing in the same place for both arches */
+#ifdef CONFIG_X86_64
+ add_pda(irq_call_count, 1);
+#else
+ __get_cpu_var(irq_stat).irq_call_count++;
+#endif
+ irq_exit();
+ if (wait) {
+ mb();
+ atomic_inc(&call_data->finished);
+ }
+}
+
+/*
+ * this function sends a 'reschedule' IPI to another CPU.
+ * it goes straight through and wastes no time serializing
+ * anything. Worst case is that we lose a reschedule ...
+ */
+static void native_smp_send_reschedule(int cpu)
+{
+ WARN_ON(cpu_is_offline(cpu));
+ send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
+}
+
+/*
+ * Reschedule call back. Nothing to do,
+ * all the work is done automatically when
+ * we return from the interrupt.
+ */
+asmlinkage void smp_reschedule_interrupt(struct pt_regs *regs)
+{
+ ack_APIC_irq();
+ /* FIXME: Put this thing in the same place for both arches */
+#ifdef CONFIG_X86_64
+ add_pda(irq_call_count, 1);
+#else
+ __get_cpu_var(irq_stat).irq_resched_count++;
+#endif
+}
+
+struct smp_ops smp_ops = {
+ .smp_prepare_boot_cpu = native_smp_prepare_boot_cpu,
+ .smp_prepare_cpus = native_smp_prepare_cpus,
+ .cpu_up = native_cpu_up,
+ .smp_cpus_done = native_smp_cpus_done,
+
+ .smp_send_stop = native_smp_send_stop,
+ .smp_send_reschedule = native_smp_send_reschedule,
+ .smp_call_function_mask = native_smp_call_function_mask,
+};
+EXPORT_SYMBOL(smp_ops);
diff --git a/arch/x86/kernel/smpcommon_32.c b/arch/x86/kernel/smpcommon_32.c
deleted file mode 100644
index bbfe85a..0000000
--- a/arch/x86/kernel/smpcommon_32.c
+++ /dev/null
@@ -1,81 +0,0 @@
-/*
- * SMP stuff which is common to all sub-architectures.
- */
-#include <linux/module.h>
-#include <asm/smp.h>
-
-DEFINE_PER_CPU(unsigned long, this_cpu_off);
-EXPORT_PER_CPU_SYMBOL(this_cpu_off);
-
-/* Initialize the CPU's GDT. This is either the boot CPU doing itself
- (still using the master per-cpu area), or a CPU doing it for a
- secondary which will soon come up. */
-__cpuinit void init_gdt(int cpu)
-{
- struct desc_struct *gdt = get_cpu_gdt_table(cpu);
-
- pack_descriptor((u32 *)&gdt[GDT_ENTRY_PERCPU].a,
- (u32 *)&gdt[GDT_ENTRY_PERCPU].b,
- __per_cpu_offset[cpu], 0xFFFFF,
- 0x80 | DESCTYPE_S | 0x2, 0x8);
-
- per_cpu(this_cpu_off, cpu) = __per_cpu_offset[cpu];
- per_cpu(cpu_number, cpu) = cpu;
-}
-
-
-/**
- * smp_call_function(): Run a function on all other CPUs.
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @nonatomic: Unused.
- * @wait: If true, wait (atomically) until function has completed on other CPUs.
- *
- * Returns 0 on success, else a negative status code.
- *
- * If @wait is true, then returns once @func has returned; otherwise
- * it returns just before the target cpu calls @func.
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- */
-int smp_call_function(void (*func) (void *info), void *info, int nonatomic,
- int wait)
-{
- return smp_call_function_mask(cpu_online_map, func, info, wait);
-}
-EXPORT_SYMBOL(smp_call_function);
-
-/**
- * smp_call_function_single - Run a function on a specific CPU
- * @cpu: The target CPU. Cannot be the calling CPU.
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @nonatomic: Unused.
- * @wait: If true, wait until function has completed on other CPUs.
- *
- * Returns 0 on success, else a negative status code.
- *
- * If @wait is true, then returns once @func has returned; otherwise
- * it returns just before the target cpu calls @func.
- */
-int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
- int nonatomic, int wait)
-{
- /* prevent preemption and reschedule on another processor */
- int ret;
- int me = get_cpu();
- if (cpu == me) {
- local_irq_disable();
- func(info);
- local_irq_enable();
- put_cpu();
- return 0;
- }
-
- ret = smp_call_function_mask(cpumask_of_cpu(cpu), func, info, wait);
-
- put_cpu();
- return ret;
-}
-EXPORT_SYMBOL(smp_call_function_single);
diff --git a/include/asm-x86/idle.h b/include/asm-x86/idle.h
index d240e5b..487bca1 100644
--- a/include/asm-x86/idle.h
+++ b/include/asm-x86/idle.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_X86_64_IDLE_H
-#define _ASM_X86_64_IDLE_H 1
+#ifndef _ASM_X86_IDLE_H
+#define _ASM_X86_IDLE_H 1

#define IDLE_START 1
#define IDLE_END 2
@@ -7,7 +7,21 @@
struct notifier_block;
void idle_notifier_register(struct notifier_block *n);

+#ifdef CONFIG_X86_64
void enter_idle(void);
void exit_idle(void);
+#else
+/*
+ * i386 does not really uses it, but define them here so we can avoid ifdefs in
+ * shared locations that uses it
+ */
+static inline void enter_idle(void)
+{
+}
+static inline void exit_idle(void)
+{
+}
+#endif
+

#endif
diff --git a/include/asm-x86/smp.h b/include/asm-x86/smp.h
index f2e8319..b2f99df 100644
--- a/include/asm-x86/smp.h
+++ b/include/asm-x86/smp.h
@@ -1,5 +1,71 @@
+#ifndef _X86_SMP_H_
+#define _X86_SMP_H_
+
+#ifndef __ASSEMBLY__
+struct smp_ops
+{
+ void (*smp_prepare_boot_cpu)(void);
+ void (*smp_prepare_cpus)(unsigned max_cpus);
+ int (*cpu_up)(unsigned cpu);
+ void (*smp_cpus_done)(unsigned max_cpus);
+
+ void (*smp_send_stop)(void);
+ void (*smp_send_reschedule)(int cpu);
+ int (*smp_call_function_mask)(cpumask_t mask,
+ void (*func)(void *info), void *info,
+ int wait);
+};
+
+extern struct smp_ops smp_ops;
+
+static inline void smp_prepare_boot_cpu(void)
+{
+ smp_ops.smp_prepare_boot_cpu();
+}
+static inline void smp_prepare_cpus(unsigned int max_cpus)
+{
+ smp_ops.smp_prepare_cpus(max_cpus);
+}
+static inline int __cpu_up(unsigned int cpu)
+{
+ return smp_ops.cpu_up(cpu);
+}
+static inline void smp_cpus_done(unsigned int max_cpus)
+{
+ smp_ops.smp_cpus_done(max_cpus);
+}
+
+static inline void smp_send_stop(void)
+{
+ smp_ops.smp_send_stop();
+}
+static inline void smp_send_reschedule(int cpu)
+{
+ smp_ops.smp_send_reschedule(cpu);
+}
+
+static inline int smp_call_function_mask(cpumask_t mask,
+ void (*func) (void *info),
+ void *info, int wait)
+{
+ return smp_ops.smp_call_function_mask(mask, func, info, wait);
+}
+
+void native_smp_prepare_boot_cpu(void);
+void native_smp_prepare_cpus(unsigned int max_cpus);
+int native_cpu_up(unsigned int cpunum);
+void native_smp_cpus_done(unsigned int max_cpus);
+
+#ifndef CONFIG_PARAVIRT
+#define startup_ipi_hook(phys_apicid, start_eip, start_esp) \
+do { } while (0)
+#endif
+#endif /* __ASSEMBLY__ */
+
#ifdef CONFIG_X86_32
# include "smp_32.h"
#else
# include "smp_64.h"
#endif
+
+#endif
diff --git a/include/asm-x86/smp_32.h b/include/asm-x86/smp_32.h
index e10b7af..a53b03f 100644
--- a/include/asm-x86/smp_32.h
+++ b/include/asm-x86/smp_32.h
@@ -53,64 +53,6 @@ extern void cpu_uninit(void);
extern void remove_siblinginfo(int cpu);
#endif

-struct smp_ops
-{
- void (*smp_prepare_boot_cpu)(void);
- void (*smp_prepare_cpus)(unsigned max_cpus);
- int (*cpu_up)(unsigned cpu);
- void (*smp_cpus_done)(unsigned max_cpus);
-
- void (*smp_send_stop)(void);
- void (*smp_send_reschedule)(int cpu);
- int (*smp_call_function_mask)(cpumask_t mask,
- void (*func)(void *info), void *info,
- int wait);
-};
-
-extern struct smp_ops smp_ops;
-
-static inline void smp_prepare_boot_cpu(void)
-{
- smp_ops.smp_prepare_boot_cpu();
-}
-static inline void smp_prepare_cpus(unsigned int max_cpus)
-{
- smp_ops.smp_prepare_cpus(max_cpus);
-}
-static inline int __cpu_up(unsigned int cpu)
-{
- return smp_ops.cpu_up(cpu);
-}
-static inline void smp_cpus_done(unsigned int max_cpus)
-{
- smp_ops.smp_cpus_done(max_cpus);
-}
-
-static inline void smp_send_stop(void)
-{
- smp_ops.smp_send_stop();
-}
-static inline void smp_send_reschedule(int cpu)
-{
- smp_ops.smp_send_reschedule(cpu);
-}
-static inline int smp_call_function_mask(cpumask_t mask,
- void (*func) (void *info), void *info,
- int wait)
-{
- return smp_ops.smp_call_function_mask(mask, func, info, wait);
-}
-
-void native_smp_prepare_boot_cpu(void);
-void native_smp_prepare_cpus(unsigned int max_cpus);
-int native_cpu_up(unsigned int cpunum);
-void native_smp_cpus_done(unsigned int max_cpus);
-
-#ifndef CONFIG_PARAVIRT
-#define startup_ipi_hook(phys_apicid, start_eip, start_esp) \
-do { } while (0)
-#endif
-
/*
* This function is needed by all SMP systems. It must _always_ be valid
* from the initial startup. We map APIC_BASE very early in page_setup(),
diff --git a/include/asm-x86/smp_64.h b/include/asm-x86/smp_64.h
index ab612b0..279ff92 100644
--- a/include/asm-x86/smp_64.h
+++ b/include/asm-x86/smp_64.h
@@ -36,9 +36,6 @@ extern volatile unsigned long smp_invalidate_needed;
extern void lock_ipi_call_lock(void);
extern void unlock_ipi_call_lock(void);
extern int smp_num_siblings;
-extern void smp_send_reschedule(int cpu);
-extern int smp_call_function_mask(cpumask_t mask, void (*func)(void *),
- void *info, int wait);

/*
* cpu_sibling_map and cpu_core_map now live
@@ -127,4 +124,3 @@ extern unsigned int boot_cpu_id;
#define cpu_physical_id(cpu) boot_cpu_id
#endif /* !CONFIG_SMP */
#endif
-
--
1.4.4.2

2007-11-09 21:33:43

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 19/24] turn priviled operation into a macro in head_64.S

under paravirt, read cr2 cannot be issued directly anymore.
So wrap it in a macro, defined to the operation itself in case
paravirt is off, but to something else if we have paravirt
in the game

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/head_64.S | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index b6167fe..c31b1c9 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -19,6 +19,13 @@
#include <asm/msr.h>
#include <asm/cache.h>

+#ifdef CONFIG_PARAVIRT
+#include <asm/asm-offsets.h>
+#include <asm/paravirt.h>
+#else
+#define GET_CR2_INTO_RCX movq %cr2, %rcx
+#endif
+
/* we are not able to switch in one step to the final KERNEL ADRESS SPACE
* because we need identity-mapped pages.
*
@@ -267,7 +274,7 @@ ENTRY(early_idt_handler)
xorl %eax,%eax
movq 8(%rsp),%rsi # get rip
movq (%rsp),%rdx
- movq %cr2,%rcx
+ GET_CR2_INTO_RCX
leaq early_idt_msg(%rip),%rdi
call early_printk
cmpl $2,early_recursion_flag(%rip)
--
1.4.4.2

2007-11-09 21:33:56

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 17/24] This patch add provisions for time related functions so they

can be later replaced by paravirt versions.

it basically encloses {g,s}et_wallclock inside the
already existent functions update_persistent_clock and
read_persistent_clock, and defines {s,g}et_wallclock
to the core of such functions.

it also allow for a later-on-game time initialization, as done
by i386. Paravirt guests can set a function to do their own
initialization this way.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/time_64.c | 12 +++++++++---
include/asm-x86/time.h | 26 +++++++++++++++++++++-----
2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/time_64.c b/arch/x86/kernel/time_64.c
index f88bf6b..89943d8 100644
--- a/arch/x86/kernel/time_64.c
+++ b/arch/x86/kernel/time_64.c
@@ -21,6 +21,8 @@
#include <asm/hpet.h>
#include <asm/nmi.h>
#include <asm/vgtod.h>
+#include <asm/time.h>
+#include <asm/timer.h>

volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;

@@ -54,7 +56,7 @@ static irqreturn_t timer_event_interrupt(int irq, void *dev_id)
/* calibrate_cpu is used on systems with fixed rate TSCs to determine
* processor frequency */
#define TICK_COUNT 100000000
-static unsigned int __init tsc_calibrate_cpu_khz(void)
+unsigned long __init native_calculate_cpu_khz(void)
{
int tsc_start, tsc_now;
int i, no_ctr_free;
@@ -104,20 +106,23 @@ static struct irqaction irq0 = {
.name = "timer"
};

-void __init time_init(void)
+void __init hpet_time_init(void)
{
if (!hpet_enable())
setup_pit_timer();

setup_irq(0, &irq0);
+}

+void __init time_init(void)
+{
tsc_calibrate();

cpu_khz = tsc_khz;
if (cpu_has(&boot_cpu_data, X86_FEATURE_CONSTANT_TSC) &&
boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
boot_cpu_data.x86 == 16)
- cpu_khz = tsc_calibrate_cpu_khz();
+ cpu_khz = calculate_cpu_khz();

if (unsynchronized_tsc())
mark_tsc_unstable("TSCs unsynchronized");
@@ -130,4 +135,5 @@ void __init time_init(void)
printk(KERN_INFO "time.c: Detected %d.%03d MHz processor.\n",
cpu_khz / 1000, cpu_khz % 1000);
init_tsc_clocksource();
+ late_time_init = choose_time_init();
}
diff --git a/include/asm-x86/time.h b/include/asm-x86/time.h
index b3f94cd..68779b0 100644
--- a/include/asm-x86/time.h
+++ b/include/asm-x86/time.h
@@ -1,8 +1,12 @@
-#ifndef _ASMi386_TIME_H
-#define _ASMi386_TIME_H
+#ifndef _ASMX86_TIME_H
+#define _ASMX86_TIME_H
+
+extern void (*late_time_init)(void);
+extern void hpet_time_init(void);

-#include <linux/efi.h>
#include <asm/mc146818rtc.h>
+#ifdef CONFIG_X86_32
+#include <linux/efi.h>

static inline unsigned long native_get_wallclock(void)
{
@@ -28,8 +32,20 @@ static inline int native_set_wallclock(unsigned long nowtime)
return retval;
}

-extern void (*late_time_init)(void);
-extern void hpet_time_init(void);
+#else
+extern void native_time_init_hook(void);
+
+static inline unsigned long native_get_wallclock(void)
+{
+ return mach_get_cmos_time();
+}
+
+static inline int native_set_wallclock(unsigned long nowtime)
+{
+ return mach_set_rtc_mmss(nowtime);
+}
+
+#endif

#ifdef CONFIG_PARAVIRT
#include <asm/paravirt.h>
--
1.4.4.2

2007-11-09 21:34:23

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 22/24] prepare x86_64 architecture initialization for paravirt

This patch prepares the x86_64 architecture initialization for
paravirt. It requires a memory initialization step, which is done
by implementing 64-bit version for machine_specific_memory_setup,
and putting an ARCH_SETUP hook, for guest-dependent initialization.
This last step is done akin to i386

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/e820_64.c | 9 +++++++--
arch/x86/kernel/setup_64.c | 28 +++++++++++++++++++++++++++-
include/asm-x86/setup.h | 11 ++++++++---
3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/e820_64.c b/arch/x86/kernel/e820_64.c
index 0128b0b..eed900b 100644
--- a/arch/x86/kernel/e820_64.c
+++ b/arch/x86/kernel/e820_64.c
@@ -639,8 +639,10 @@ void early_panic(char *msg)
panic(msg);
}

-void __init setup_memory_region(void)
+/* We're not void only for x86 32-bit compat */
+char * __init machine_specific_memory_setup(void)
{
+ char *who = "BIOS-e820";
/*
* Try to copy the BIOS-supplied E820-map.
*
@@ -651,7 +653,10 @@ void __init setup_memory_region(void)
if (copy_e820_map(boot_params.e820_map, boot_params.e820_entries) < 0)
early_panic("Cannot find a valid memory map");
printk(KERN_INFO "BIOS-provided physical RAM map:\n");
- e820_print_map("BIOS-e820");
+ e820_print_map(who);
+
+ /* In case someone cares... */
+ return who;
}

static int __init parse_memopt(char *p)
diff --git a/arch/x86/kernel/setup_64.c b/arch/x86/kernel/setup_64.c
index 2451a63..1c9f237 100644
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -39,6 +39,7 @@
#include <linux/dmi.h>
#include <linux/dma-mapping.h>
#include <linux/ctype.h>
+#include <linux/uaccess.h>

#include <asm/mtrr.h>
#include <asm/uaccess.h>
@@ -61,6 +62,12 @@
#include <asm/cacheflush.h>
#include <asm/mce.h>

+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define ARCH_SETUP
+#endif
+
/*
* Machine setup..
*/
@@ -244,6 +251,16 @@ static void discover_ebda(void)
* 4K EBDA area at 0x40E
*/
ebda_addr = *(unsigned short *)__va(EBDA_ADDR_POINTER);
+ /*
+ * There can be some situations, like paravirtualized guests,
+ * in which there is no available ebda information. In such
+ * case, just skip it
+ */
+ if (!ebda_addr) {
+ ebda_size = 0;
+ return;
+ }
+
ebda_addr <<= 4;

ebda_size = *(unsigned short *)__va(ebda_addr);
@@ -257,6 +274,12 @@ static void discover_ebda(void)
ebda_size = 64*1024;
}

+/* Overridden in paravirt.c if CONFIG_PARAVIRT */
+void __attribute__((weak)) memory_setup(void)
+{
+ machine_specific_memory_setup();
+}
+
void __init setup_arch(char **cmdline_p)
{
printk(KERN_INFO "Command line: %s\n", boot_command_line);
@@ -272,7 +295,10 @@ void __init setup_arch(char **cmdline_p)
rd_prompt = ((boot_params.hdr.ram_size & RAMDISK_PROMPT_FLAG) != 0);
rd_doload = ((boot_params.hdr.ram_size & RAMDISK_LOAD_FLAG) != 0);
#endif
- setup_memory_region();
+
+ ARCH_SETUP
+
+ memory_setup();
copy_edd();

if (!boot_params.hdr.root_flags)
diff --git a/include/asm-x86/setup.h b/include/asm-x86/setup.h
index 24d786e..071e054 100644
--- a/include/asm-x86/setup.h
+++ b/include/asm-x86/setup.h
@@ -3,6 +3,13 @@

#define COMMAND_LINE_SIZE 2048

+#ifndef __ASSEMBLY__
+char *machine_specific_memory_setup(void);
+#ifndef CONFIG_PARAVIRT
+#define paravirt_post_allocator_init() do {} while (0)
+#endif
+#endif /* __ASSEMBLY__ */
+
#ifdef __KERNEL__

#ifdef __i386__
@@ -51,9 +58,7 @@ void __init add_memory_region(unsigned long long start,

extern unsigned long init_pg_tables_end;

-#ifndef CONFIG_PARAVIRT
-#define paravirt_post_allocator_init() do {} while (0)
-#endif
+

#endif /* __i386__ */
#endif /* _SETUP */
--
1.4.4.2

2007-11-09 21:34:40

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 21/24] native versions for page table entries values

This patch turns the page operations (set and make a page table)
into native_ versions. The operations itself will be later
overriden by paravirt.

It uses unsigned long long for consistency with 32-bit. So we
have to fix fault_64.c to get rid of warnings.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/mm/fault_64.c | 8 +++---
include/asm-x86/page_64.h | 56 +++++++++++++++++++++++++++++++++++++++++----
2 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/fault_64.c b/arch/x86/mm/fault_64.c
index 161c0d1..86b7307 100644
--- a/arch/x86/mm/fault_64.c
+++ b/arch/x86/mm/fault_64.c
@@ -157,22 +157,22 @@ void dump_pagetable(unsigned long address)
pgd = __va((unsigned long)pgd & PHYSICAL_PAGE_MASK);
pgd += pgd_index(address);
if (bad_address(pgd)) goto bad;
- printk("PGD %lx ", pgd_val(*pgd));
+ printk("PGD %llx ", pgd_val(*pgd));
if (!pgd_present(*pgd)) goto ret;

pud = pud_offset(pgd, address);
if (bad_address(pud)) goto bad;
- printk("PUD %lx ", pud_val(*pud));
+ printk("PUD %llx ", pud_val(*pud));
if (!pud_present(*pud)) goto ret;

pmd = pmd_offset(pud, address);
if (bad_address(pmd)) goto bad;
- printk("PMD %lx ", pmd_val(*pmd));
+ printk("PMD %llx ", pmd_val(*pmd));
if (!pmd_present(*pmd) || pmd_large(*pmd)) goto ret;

pte = pte_offset_kernel(pmd, address);
if (bad_address(pte)) goto bad;
- printk("PTE %lx", pte_val(*pte));
+ printk("PTE %llx", pte_val(*pte));
ret:
printk("\n");
return;
diff --git a/include/asm-x86/page_64.h b/include/asm-x86/page_64.h
index 6fdc904..b8da60c 100644
--- a/include/asm-x86/page_64.h
+++ b/include/asm-x86/page_64.h
@@ -65,16 +65,62 @@ typedef struct { unsigned long pgprot; } pgprot_t;

extern unsigned long phys_base;

-#define pte_val(x) ((x).pte)
-#define pmd_val(x) ((x).pmd)
-#define pud_val(x) ((x).pud)
-#define pgd_val(x) ((x).pgd)
-#define pgprot_val(x) ((x).pgprot)
+static inline unsigned long long native_pte_val(pte_t pte)
+{
+ return pte.pte;
+}
+
+static inline unsigned long long native_pud_val(pud_t pud)
+{
+ return pud.pud;
+}
+
+
+static inline unsigned long long native_pmd_val(pmd_t pmd)
+{
+ return pmd.pmd;
+}
+
+static inline unsigned long long native_pgd_val(pgd_t pgd)
+{
+ return pgd.pgd;
+}
+
+static inline pte_t native_make_pte(unsigned long long pte)
+{
+ return (pte_t){ pte };
+}
+
+static inline pud_t native_make_pud(unsigned long long pud)
+{
+ return (pud_t){ pud };
+}
+
+static inline pmd_t native_make_pmd(unsigned long long pmd)
+{
+ return (pmd_t){ pmd };
+}
+
+static inline pgd_t native_make_pgd(unsigned long long pgd)
+{
+ return (pgd_t){ pgd };
+}
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define pte_val(x) native_pte_val(x)
+#define pmd_val(x) native_pmd_val(x)
+#define pud_val(x) native_pud_val(x)
+#define pgd_val(x) native_pgd_val(x)

#define __pte(x) ((pte_t) { (x) } )
#define __pmd(x) ((pmd_t) { (x) } )
#define __pud(x) ((pud_t) { (x) } )
#define __pgd(x) ((pgd_t) { (x) } )
+#endif /* CONFIG_PARAVIRT */
+
+#define pgprot_val(x) ((x).pgprot)
#define __pgprot(x) ((pgprot_t) { (x) } )

#endif /* !__ASSEMBLY__ */
--
1.4.4.2

2007-11-09 21:34:57

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 18/24] export cpu_gdt_descr

With paravirualization, hypervisors needs to handle the gdt,
that was right to this point only used at very early
inialization code. Hypervisors (lguest being the current case)
are commonly modules, so make it an export

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/x8664_ksyms_64.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index 105712e..f97aed4 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -8,6 +8,7 @@
#include <asm/processor.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
+#include <asm/desc.h>

EXPORT_SYMBOL(kernel_thread);

@@ -51,3 +52,8 @@ EXPORT_SYMBOL(__memcpy);
EXPORT_SYMBOL(load_gs_index);

EXPORT_SYMBOL(_proxy_pda);
+
+#ifdef CONFIG_PARAVIRT
+/* Virtualized guests may want to use it */
+EXPORT_SYMBOL_GPL(cpu_gdt_descr);
+#endif
--
1.4.4.2

2007-11-09 21:35:36

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 16/24] add native functions for descriptors handling

This patch turns the basic descriptor handling into native_
functions. It is basically write_idt, load_idt, write_gdt,
load_gdt, set_ldt, store_tr, load_tls, and the ones
for updating a single entry.

In the process of doing that, we change the definition of
load_LDT_nolock, and caller sites have to be patched. We
also patch call sites that now needs a typecast.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
include/asm-x86/desc.h | 59 ++++++++++++
include/asm-x86/desc_32.h | 45 ---------
include/asm-x86/desc_64.h | 191 ++++++++++++++++++-------------------
include/asm-x86/mmu_context_64.h | 23 ++++-
4 files changed, 169 insertions(+), 149 deletions(-)

diff --git a/include/asm-x86/desc.h b/include/asm-x86/desc.h
index 6065c50..276dc6e 100644
--- a/include/asm-x86/desc.h
+++ b/include/asm-x86/desc.h
@@ -1,5 +1,64 @@
+#ifndef _ASM_DESC_H_
+#define _ASM_DESC_H_
+
#ifdef CONFIG_X86_32
# include "desc_32.h"
#else
# include "desc_64.h"
#endif
+
+#ifndef __ASSEMBLY__
+#define LDT_entry_a(info) \
+ ((((info)->base_addr & 0x0000ffff) << 16) | ((info)->limit & 0x0ffff))
+
+#define LDT_entry_b(info) \
+ (((info)->base_addr & 0xff000000) | \
+ (((info)->base_addr & 0x00ff0000) >> 16) | \
+ ((info)->limit & 0xf0000) | \
+ (((info)->read_exec_only ^ 1) << 9) | \
+ ((info)->contents << 10) | \
+ (((info)->seg_not_present ^ 1) << 15) | \
+ ((info)->seg_32bit << 22) | \
+ ((info)->limit_in_pages << 23) | \
+ ((info)->useable << 20) | \
+ 0x7000)
+
+#define _LDT_empty(info) (\
+ (info)->base_addr == 0 && \
+ (info)->limit == 0 && \
+ (info)->contents == 0 && \
+ (info)->read_exec_only == 1 && \
+ (info)->seg_32bit == 0 && \
+ (info)->limit_in_pages == 0 && \
+ (info)->seg_not_present == 1 && \
+ (info)->useable == 0 )
+
+#ifdef CONFIG_X86_64
+#define LDT_empty(info) (_LDT_empty(info) && ((info)->lm == 0))
+#else
+#define LDT_empty(info) _LDT_empty(info)
+#endif
+
+static inline void clear_LDT(void)
+{
+ set_ldt(NULL, 0);
+}
+
+/*
+ * load one particular LDT into the current CPU
+ */
+static inline void load_LDT_nolock(mm_context_t *pc)
+{
+ set_ldt(pc->ldt, pc->size);
+}
+
+static inline void load_LDT(mm_context_t *pc)
+{
+ preempt_disable();
+ load_LDT_nolock(pc);
+ preempt_enable();
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/include/asm-x86/desc_32.h b/include/asm-x86/desc_32.h
index c547403..84bb843 100644
--- a/include/asm-x86/desc_32.h
+++ b/include/asm-x86/desc_32.h
@@ -162,51 +162,6 @@ static inline void __set_tss_desc(unsigned int cpu, unsigned int entry, const vo

#define set_tss_desc(cpu,addr) __set_tss_desc(cpu, GDT_ENTRY_TSS, addr)

-#define LDT_entry_a(info) \
- ((((info)->base_addr & 0x0000ffff) << 16) | ((info)->limit & 0x0ffff))
-
-#define LDT_entry_b(info) \
- (((info)->base_addr & 0xff000000) | \
- (((info)->base_addr & 0x00ff0000) >> 16) | \
- ((info)->limit & 0xf0000) | \
- (((info)->read_exec_only ^ 1) << 9) | \
- ((info)->contents << 10) | \
- (((info)->seg_not_present ^ 1) << 15) | \
- ((info)->seg_32bit << 22) | \
- ((info)->limit_in_pages << 23) | \
- ((info)->useable << 20) | \
- 0x7000)
-
-#define LDT_empty(info) (\
- (info)->base_addr == 0 && \
- (info)->limit == 0 && \
- (info)->contents == 0 && \
- (info)->read_exec_only == 1 && \
- (info)->seg_32bit == 0 && \
- (info)->limit_in_pages == 0 && \
- (info)->seg_not_present == 1 && \
- (info)->useable == 0 )
-
-static inline void clear_LDT(void)
-{
- set_ldt(NULL, 0);
-}
-
-/*
- * load one particular LDT into the current CPU
- */
-static inline void load_LDT_nolock(mm_context_t *pc)
-{
- set_ldt(pc->ldt, pc->size);
-}
-
-static inline void load_LDT(mm_context_t *pc)
-{
- preempt_disable();
- load_LDT_nolock(pc);
- preempt_enable();
-}
-
static inline unsigned long get_desc_base(unsigned long *desc)
{
unsigned long base;
diff --git a/include/asm-x86/desc_64.h b/include/asm-x86/desc_64.h
index 7d48df7..d12cd07 100644
--- a/include/asm-x86/desc_64.h
+++ b/include/asm-x86/desc_64.h
@@ -16,11 +16,12 @@

extern struct desc_struct cpu_gdt_table[GDT_ENTRIES];

-#define load_TR_desc() asm volatile("ltr %w0"::"r" (GDT_ENTRY_TSS*8))
-#define load_LDT_desc() asm volatile("lldt %w0"::"r" (GDT_ENTRY_LDT*8))
-#define clear_LDT() asm volatile("lldt %w0"::"r" (0))
+static inline void native_load_tr_desc(void)
+{
+ asm volatile("ltr %w0"::"r" (GDT_ENTRY_TSS*8));
+}

-static inline unsigned long __store_tr(void)
+static inline unsigned long native_store_tr(void)
{
unsigned long tr;

@@ -28,8 +29,6 @@ static inline unsigned long __store_tr(void)
return tr;
}

-#define store_tr(tr) (tr) = __store_tr()
-
/*
* This is the ldt that every process will get unless we need
* something other than this.
@@ -38,79 +37,54 @@ extern struct desc_struct default_ldt[];
extern struct gate_struct idt_table[];
extern struct desc_ptr cpu_gdt_descr[];

-static inline void write_ldt_entry(struct desc_struct *ldt,
- int entry, u32 entry_low, u32 entry_high)
+static inline void write_dt_entry(struct desc_struct *dt, int entry,
+ u32 entry_low, u32 entry_high)
{
- __u32 *lp = (__u32 *)((entry << 3) + (char *)ldt);
-
- lp[0] = entry_low;
- lp[1] = entry_high;
+ ((struct n_desc_struct *)dt)[entry].a = entry_low;
+ ((struct n_desc_struct *)dt)[entry].b = entry_high;
}

/* the cpu gdt accessor */
#define cpu_gdt(_cpu) ((struct desc_struct *)cpu_gdt_descr[_cpu].address)

-static inline void load_gdt(const struct desc_ptr *ptr)
+static inline void native_load_gdt(const struct desc_ptr *ptr)
{
asm volatile("lgdt %w0"::"m" (*ptr));
}

-static inline void store_gdt(struct desc_ptr *ptr)
+static inline void native_store_gdt(struct desc_ptr *ptr)
{
asm("sgdt %w0":"=m" (*ptr));
}

-static inline void _set_gate(void *adr, unsigned type, unsigned long func,
- unsigned dpl, unsigned ist)
+static inline void native_write_idt_entry(void *adr, struct gate_struct *s)
{
- struct gate_struct s;
-
- s.offset_low = PTR_LOW(func);
- s.segment = __KERNEL_CS;
- s.ist = ist;
- s.p = 1;
- s.dpl = dpl;
- s.zero0 = 0;
- s.zero1 = 0;
- s.type = type;
- s.offset_middle = PTR_MIDDLE(func);
- s.offset_high = PTR_HIGH(func);
- /*
- * does not need to be atomic because it is only done once at
- * setup time
- */
- memcpy(adr, &s, 16);
+ /* does not need to be atomic because
+ * it is only done once at setup time */
+ memcpy(adr, s, 16);
}

-static inline void set_intr_gate(int nr, void *func)
+static inline void native_write_gdt_entry(void *ptr, void *entry,
+ unsigned type, unsigned size)
{
- BUG_ON((unsigned)nr > 0xFF);
- _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 0, 0);
-}
-
-static inline void set_intr_gate_ist(int nr, void *func, unsigned ist)
-{
- BUG_ON((unsigned)nr > 0xFF);
- _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 0, ist);
-}
-
-static inline void set_system_gate(int nr, void *func)
-{
- BUG_ON((unsigned)nr > 0xFF);
- _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 3, 0);
+ memcpy(ptr, entry, size);
}

-static inline void set_system_gate_ist(int nr, void *func, unsigned ist)
-{
- _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 3, ist);
-}
+/*
+ * This one unfortunately can't go with the others, in below, because it has
+ * an user anxious for its definition: set_tssldt_descriptor
+ */
+#ifndef CONFIG_PARAVIRT
+#define write_gdt_entry(_ptr, _e, _type, _size) \
+ native_write_gdt_entry((_ptr), (_e), (_type), (_size))
+#endif

-static inline void load_idt(const struct desc_ptr *ptr)
+static inline void native_load_idt(const struct desc_ptr *ptr)
{
asm volatile("lidt %w0"::"m" (*ptr));
}

-static inline void store_idt(struct desc_ptr *dtr)
+static inline void native_store_idt(struct desc_ptr *dtr)
{
asm("sidt %w0":"=m" (*dtr));
}
@@ -129,7 +103,7 @@ static inline void set_tssldt_descriptor(void *ptr, unsigned long tss,
d.limit1 = (size >> 16) & 0xF;
d.base2 = (PTR_MIDDLE(tss) >> 8) & 0xFF;
d.base3 = PTR_HIGH(tss);
- memcpy(ptr, &d, 16);
+ write_gdt_entry(ptr, &d, type, 16);
}

static inline void set_tss_desc(unsigned cpu, void *addr)
@@ -152,35 +126,7 @@ static inline void set_ldt_desc(unsigned cpu, void *addr, int size)
DESC_LDT, size * 8 - 1);
}

-#define LDT_entry_a(info) \
- ((((info)->base_addr & 0x0000ffff) << 16) | ((info)->limit & 0x0ffff))
-/* Don't allow setting of the lm bit. It is useless anyways because
- 64bit system calls require __USER_CS. */
-#define LDT_entry_b(info) \
- (((info)->base_addr & 0xff000000) | \
- (((info)->base_addr & 0x00ff0000) >> 16) | \
- ((info)->limit & 0xf0000) | \
- (((info)->read_exec_only ^ 1) << 9) | \
- ((info)->contents << 10) | \
- (((info)->seg_not_present ^ 1) << 15) | \
- ((info)->seg_32bit << 22) | \
- ((info)->limit_in_pages << 23) | \
- ((info)->useable << 20) | \
- /* ((info)->lm << 21) | */ \
- 0x7000)
-
-#define LDT_empty(info) (\
- (info)->base_addr == 0 && \
- (info)->limit == 0 && \
- (info)->contents == 0 && \
- (info)->read_exec_only == 1 && \
- (info)->seg_32bit == 0 && \
- (info)->limit_in_pages == 0 && \
- (info)->seg_not_present == 1 && \
- (info)->useable == 0 && \
- (info)->lm == 0)
-
-static inline void load_TLS(struct thread_struct *t, unsigned int cpu)
+static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
{
unsigned int i;
u64 *gdt = (u64 *)(cpu_gdt(cpu) + GDT_ENTRY_TLS_MIN);
@@ -189,28 +135,77 @@ static inline void load_TLS(struct thread_struct *t, unsigned int cpu)
gdt[i] = t->tls_array[i];
}

-/*
- * load one particular LDT into the current CPU
- */
-static inline void load_LDT_nolock(mm_context_t *pc, int cpu)
+static inline void native_set_ldt(const void *addr,
+ unsigned int entries)
{
- int count = pc->size;
+ if (likely(entries == 0))
+ __asm__ __volatile__ ("lldt %w0" :: "r" (0));
+ else {
+ unsigned cpu = smp_processor_id();

- if (likely(!count)) {
- clear_LDT();
- return;
+ set_tssldt_descriptor(&cpu_gdt(cpu)[GDT_ENTRY_LDT],
+ (unsigned long)addr, DESC_LDT,
+ entries * 8 - 1);
+ __asm__ __volatile__ ("lldt %w0"::"r" (GDT_ENTRY_LDT*8));
}
+}
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define load_TR_desc() native_load_tr_desc()
+#define load_gdt(ptr) native_load_gdt(ptr)
+#define load_idt(ptr) native_load_idt(ptr)
+#define load_TLS(t, cpu) native_load_tls(t, cpu)
+#define set_ldt(addr, entries) native_set_ldt(addr, entries)
+#define store_tr(tr) (tr) = native_store_tr()
+#define store_gdt(ptr) native_store_gdt(ptr)
+#define store_idt(ptr) native_store_idt(ptr)
+
+#define write_idt_entry(_adr, _s) native_write_idt_entry((_adr), (_s))
+#define write_ldt_entry(_ldt, _number, _entry1, _entry2) \
+ write_dt_entry((_ldt), (_number), (_entry1), (_entry2))
+#endif
+
+static inline void _set_gate(void *adr, unsigned type, unsigned long func,
+ unsigned dpl, unsigned ist)
+{
+ struct gate_struct s;
+
+ s.offset_low = PTR_LOW(func);
+ s.segment = __KERNEL_CS;
+ s.ist = ist;
+ s.p = 1;
+ s.dpl = dpl;
+ s.zero0 = 0;
+ s.zero1 = 0;
+ s.type = type;
+ s.offset_middle = PTR_MIDDLE(func);
+ s.offset_high = PTR_HIGH(func);
+ write_idt_entry(adr, &s);
+}
+
+static inline void set_intr_gate(int nr, void *func)
+{
+ BUG_ON((unsigned)nr > 0xFF);
+ _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 0, 0);
+}

- set_ldt_desc(cpu, pc->ldt, count);
- load_LDT_desc();
+static inline void set_intr_gate_ist(int nr, void *func, unsigned ist)
+{
+ BUG_ON((unsigned)nr > 0xFF);
+ _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 0, ist);
}

-static inline void load_LDT(mm_context_t *pc)
+static inline void set_system_gate(int nr, void *func)
{
- int cpu = get_cpu();
+ BUG_ON((unsigned)nr > 0xFF);
+ _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 3, 0);
+}

- load_LDT_nolock(pc, cpu);
- put_cpu();
+static inline void set_system_gate_ist(int nr, void *func, unsigned ist)
+{
+ _set_gate(&idt_table[nr], GATE_INTERRUPT, (unsigned long) func, 3, ist);
}

extern struct desc_ptr idt_descr;
diff --git a/include/asm-x86/mmu_context_64.h b/include/asm-x86/mmu_context_64.h
index 29f95c3..85b7fb8 100644
--- a/include/asm-x86/mmu_context_64.h
+++ b/include/asm-x86/mmu_context_64.h
@@ -7,7 +7,16 @@
#include <asm/pda.h>
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
#include <asm-generic/mm_hooks.h>
+static inline void paravirt_activate_mm(struct mm_struct *prev,
+ struct mm_struct *next)
+{
+}
+#endif /* CONFIG_PARAVIRT */

/*
* possibly do the LDT unload here?
@@ -25,7 +34,7 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)

static inline void load_cr3(pgd_t *pgd)
{
- asm volatile("movq %0,%%cr3" :: "r" (__pa(pgd)) : "memory");
+ write_cr3(__pa(pgd));
}

static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
@@ -43,7 +52,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
load_cr3(next->pgd);

if (unlikely(next->context.ldt != prev->context.ldt))
- load_LDT_nolock(&next->context, cpu);
+ load_LDT_nolock(&next->context);
}
#ifdef CONFIG_SMP
else {
@@ -56,7 +65,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
* to make sure to use no freed page tables.
*/
load_cr3(next->pgd);
- load_LDT_nolock(&next->context, cpu);
+ load_LDT_nolock(&next->context);
}
}
#endif
@@ -67,8 +76,10 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
asm volatile("movl %0,%%fs"::"r"(0)); \
} while(0)

-#define activate_mm(prev, next) \
- switch_mm((prev),(next),NULL)
-
+#define activate_mm(prev, next) \
+do { \
+ paravirt_activate_mm(prev, next); \
+ switch_mm((prev), (next), NULL); \
+} while (0)

#endif
--
1.4.4.2

2007-11-09 21:35:58

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 20/24] tweak io_64.h for paravirt.

We need something here because we can't call in and out instructions
directly. However, we have to be careful, because no indirections are
allowed in misc_64.c , and paravirt_ops is a kind of one. So just
call it directly there

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/boot/compressed/misc_64.c | 6 +++++
include/asm-x86/io_64.h | 37 +++++++++++++++++++++++++++++------
2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/misc_64.c b/arch/x86/boot/compressed/misc_64.c
index 6ea015a..6640a17 100644
--- a/arch/x86/boot/compressed/misc_64.c
+++ b/arch/x86/boot/compressed/misc_64.c
@@ -9,6 +9,12 @@
* High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 1996
*/

+/*
+ * we have to be careful, because no indirections are allowed here, and
+ * paravirt_ops is a kind of one. As it will only run in baremetal anyway,
+ * we just keep it from happening
+ */
+#undef CONFIG_PARAVIRT
#define _LINUX_STRING_H_ 1
#define __LINUX_BITMAP_H 1

diff --git a/include/asm-x86/io_64.h b/include/asm-x86/io_64.h
index a037b07..57fcdd9 100644
--- a/include/asm-x86/io_64.h
+++ b/include/asm-x86/io_64.h
@@ -35,12 +35,24 @@
* - Arnaldo Carvalho de Melo <[email protected]>
*/

-#define __SLOW_DOWN_IO "\noutb %%al,$0x80"
+static inline void native_io_delay(void)
+{
+ asm volatile("outb %%al,$0x80" : : : "memory");
+}

-#ifdef REALLY_SLOW_IO
-#define __FULL_SLOW_DOWN_IO __SLOW_DOWN_IO __SLOW_DOWN_IO __SLOW_DOWN_IO __SLOW_DOWN_IO
+#if defined(CONFIG_PARAVIRT)
+#include <asm/paravirt.h>
#else
-#define __FULL_SLOW_DOWN_IO __SLOW_DOWN_IO
+
+static inline void slow_down_io(void)
+{
+ native_io_delay();
+#ifdef REALLY_SLOW_IO
+ native_io_delay();
+ native_io_delay();
+ native_io_delay();
+#endif
+}
#endif

/*
@@ -52,9 +64,15 @@ static inline void out##s(unsigned x value, unsigned short port) {
#define __OUT2(s,s1,s2) \
__asm__ __volatile__ ("out" #s " %" s1 "0,%" s2 "1"

+#ifndef REALLY_SLOW_IO
+#define REALLY_SLOW_IO
+#define UNSET_REALLY_SLOW_IO
+#endif
+
#define __OUT(s,s1,x) \
__OUT1(s,x) __OUT2(s,s1,"w") : : "a" (value), "Nd" (port)); } \
-__OUT1(s##_p,x) __OUT2(s,s1,"w") __FULL_SLOW_DOWN_IO : : "a" (value), "Nd" (port));} \
+__OUT1(s##_p, x) __OUT2(s, s1, "w") : : "a" (value), "Nd" (port)); \
+ slow_down_io(); }

#define __IN1(s) \
static inline RETURN_TYPE in##s(unsigned short port) { RETURN_TYPE _v;
@@ -63,8 +81,13 @@ static inline RETURN_TYPE in##s(unsigned short port) { RETURN_TYPE _v;
__asm__ __volatile__ ("in" #s " %" s2 "1,%" s1 "0"

#define __IN(s,s1,i...) \
-__IN1(s) __IN2(s,s1,"w") : "=a" (_v) : "Nd" (port) ,##i ); return _v; } \
-__IN1(s##_p) __IN2(s,s1,"w") __FULL_SLOW_DOWN_IO : "=a" (_v) : "Nd" (port) ,##i ); return _v; } \
+__IN1(s) __IN2(s, s1, "w") : "=a" (_v) : "Nd" (port), ##i); return _v; } \
+__IN1(s##_p) __IN2(s, s1, "w") : "=a" (_v) : "Nd" (port), ##i); \
+ slow_down_io(); return _v; }
+
+#ifdef UNSET_REALLY_SLOW_IO
+#undef REALLY_SLOW_IO
+#endif

#define __INS(s) \
static inline void ins##s(unsigned short port, void * addr, unsigned long count) \
--
1.4.4.2

2007-11-09 21:36:28

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 23/24] consolidation of paravirt for 32 and 64 bits

This patch unifies the paravirt ops structures for usage in
x86_64 and i386 architectures. Some new field had to be created
to accomodate the differences between the architectures.

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/Kconfig.x86_64 | 11 +
arch/x86/kernel/Makefile | 3 +
arch/x86/kernel/Makefile_32 | 1 -
arch/x86/kernel/asm-offsets.c | 8 +
arch/x86/kernel/asm-offsets_32.c | 8 -
arch/x86/kernel/asm-offsets_64.c | 22 +-
arch/x86/kernel/paravirt.c | 447 +++++++++++++++++++++++++++++++++
arch/x86/kernel/paravirt_32.c | 472 -----------------------------------
arch/x86/kernel/paravirt_patch_32.c | 52 ++++
arch/x86/kernel/paravirt_patch_64.c | 56 ++++
arch/x86/kernel/vmlinux_64.lds.S | 6 +
include/asm-x86/paravirt.h | 472 +++++++++++++++++++++++++++++------
12 files changed, 987 insertions(+), 571 deletions(-)

diff --git a/arch/x86/Kconfig.x86_64 b/arch/x86/Kconfig.x86_64
index b45855c..568ba7a 100644
--- a/arch/x86/Kconfig.x86_64
+++ b/arch/x86/Kconfig.x86_64
@@ -372,6 +372,17 @@ config NODES_SHIFT

# Dummy CONFIG option to select ACPI_NUMA from drivers/acpi/Kconfig.

+config PARAVIRT
+ bool
+ depends on EXPERIMENTAL
+ help
+ Paravirtualization is a way of running multiple instances of
+ Linux on the same machine, under a hypervisor. This option
+ changes the kernel so it can modify itself when it is run
+ under a hypervisor, improving performance significantly.
+ However, when run without a hypervisor the kernel is
+ theoretically slower. If in doubt, say N.
+
config X86_64_ACPI_NUMA
bool "ACPI NUMA detection"
depends on NUMA
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 3857334..f444d0e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,8 +1,11 @@
ifeq ($(CONFIG_X86_32),y)
include ${srctree}/arch/x86/kernel/Makefile_32
+obj-$(CONFIG_PARAVIRT) += paravirt_patch_32.o
else
include ${srctree}/arch/x86/kernel/Makefile_64
+obj-$(CONFIG_PARAVIRT) += paravirt_patch_64.o
endif
+obj-$(CONFIG_PARAVIRT) += paravirt.o

# Workaround to delete .lds files with make clean
# The problem is that we do not enter Makefile_32 with make clean.
diff --git a/arch/x86/kernel/Makefile_32 b/arch/x86/kernel/Makefile_32
index b0da543..44ba1d6 100644
--- a/arch/x86/kernel/Makefile_32
+++ b/arch/x86/kernel/Makefile_32
@@ -43,7 +43,6 @@ obj-$(CONFIG_K8_NB) += k8.o
obj-$(CONFIG_MGEODE_LX) += geode_32.o mfgpt_32.o

obj-$(CONFIG_VMI) += vmi_32.o vmiclock_32.o
-obj-$(CONFIG_PARAVIRT) += paravirt_32.o
obj-y += pcspeaker.o

obj-$(CONFIG_SCx200) += scx200_32.o
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cfa82c8..25530d5 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -1,3 +1,11 @@
+#define DEFINE(sym, val) \
+ asm volatile("\n->" #sym " %0 " #val : : "i" (val))
+
+#define BLANK() asm volatile("\n->" : : )
+
+#define OFFSET(sym, str, mem) \
+ DEFINE(sym, offsetof(struct str, mem));
+
#ifdef CONFIG_X86_32
# include "asm-offsets_32.c"
#else
diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c
index c1ccfab..f320b2d 100644
--- a/arch/x86/kernel/asm-offsets_32.c
+++ b/arch/x86/kernel/asm-offsets_32.c
@@ -25,14 +25,6 @@
#include "../../../drivers/lguest/lg.h"
#endif

-#define DEFINE(sym, val) \
- asm volatile("\n->" #sym " %0 " #val : : "i" (val))
-
-#define BLANK() asm volatile("\n->" : : )
-
-#define OFFSET(sym, str, mem) \
- DEFINE(sym, offsetof(struct str, mem));
-
/* workaround for a warning with -Wmissing-prototypes */
void foo(void);

diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index d1b6ed9..3ef77dd 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -16,14 +16,7 @@
#include <asm/thread_info.h>
#include <asm/ia32.h>
#include <asm/bootparam.h>
-
-#define DEFINE(sym, val) \
- asm volatile("\n->" #sym " %0 " #val : : "i" (val))
-
-#define BLANK() asm volatile("\n->" : : )
-
-#define OFFSET(sym, str, mem) \
- DEFINE(sym, offsetof(struct str, mem))
+#include <asm/paravirt.h>

#define __NO_STUBS 1
#undef __SYSCALL
@@ -76,6 +69,19 @@ int main(void)
offsetof (struct rt_sigframe32, uc.uc_mcontext));
BLANK();
#endif
+#ifdef CONFIG_PARAVIRT
+ OFFSET(PARAVIRT_enabled, pv_info, paravirt_enabled);
+ OFFSET(PARAVIRT_PATCH_pv_cpu_ops, paravirt_patch_template, pv_cpu_ops);
+ OFFSET(PARAVIRT_PATCH_pv_irq_ops, paravirt_patch_template, pv_irq_ops);
+ OFFSET(PARAVIRT_PATCH_pv_mmu_ops, paravirt_patch_template, pv_mmu_ops);
+ OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable);
+ OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable);
+ OFFSET(PV_CPU_iret, pv_cpu_ops, iret);
+ OFFSET(PV_CPU_irq_enable_syscall_ret, pv_cpu_ops, irq_enable_syscall_ret);
+ OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
+ OFFSET(PV_CPU_swapgs, pv_cpu_ops, swapgs);
+ BLANK();
+#endif
DEFINE(pbe_address, offsetof(struct pbe, address));
DEFINE(pbe_orig_address, offsetof(struct pbe, orig_address));
DEFINE(pbe_next, offsetof(struct pbe, next));
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
new file mode 100644
index 0000000..8fc21ed
--- /dev/null
+++ b/arch/x86/kernel/paravirt.c
@@ -0,0 +1,447 @@
+/* Paravirtualization interfaces
+ Copyright (C) 2006 Rusty Russell IBM Corporation
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+
+ 2007 - x86_64 support added by Glauber de Oliveira Costa
+*/
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/efi.h>
+#include <linux/bcd.h>
+#include <linux/highmem.h>
+#include <linux/pci_regs.h>
+#include <linux/pci_ids.h>
+
+#include <asm/bug.h>
+#include <asm/paravirt.h>
+#include <asm/desc.h>
+#include <asm/setup.h>
+#include <asm/arch_hooks.h>
+#include <asm/time.h>
+#include <asm/irq.h>
+#include <asm/delay.h>
+#include <asm/fixmap.h>
+#include <asm/apic.h>
+#include <asm/tlbflush.h>
+#include <asm/timer.h>
+#include <asm/io.h>
+#include <asm/pci-direct.h>
+
+
+/* nop stub */
+void _paravirt_nop(void)
+{
+}
+
+static void __init default_banner(void)
+{
+ printk(KERN_INFO "Booting paravirtualized kernel on %s\n",
+ pv_info.name);
+}
+
+char *memory_setup(void)
+{
+ return pv_init_ops.memory_setup();
+}
+
+unsigned paravirt_patch_nop(void)
+{
+ return 0;
+}
+
+unsigned paravirt_patch_ignore(unsigned len)
+{
+ return len;
+}
+
+struct branch {
+ unsigned char opcode;
+ u32 delta;
+} __attribute__((packed));
+
+unsigned paravirt_patch_call(void *insnbuf,
+ const void *target, u16 tgt_clobbers,
+ unsigned long addr, u16 site_clobbers,
+ unsigned len)
+{
+ struct branch *b = insnbuf;
+ unsigned long delta = (unsigned long)target - (addr+5);
+
+ if (tgt_clobbers & ~site_clobbers)
+ return len; /* target would clobber too much for this site */
+ if (len < 5)
+ return len; /* call too long for patch site */
+
+ b->opcode = 0xe8; /* call */
+ b->delta = delta;
+ BUILD_BUG_ON(sizeof(*b) != 5);
+
+ return 5;
+}
+
+unsigned paravirt_patch_jmp(void *insnbuf, const void *target,
+ unsigned long addr, unsigned len)
+{
+ struct branch *b = insnbuf;
+ unsigned long delta = (unsigned long)target - (addr+5);
+
+ if (len < 5)
+ return len; /* call too long for patch site */
+
+ b->opcode = 0xe9; /* jmp */
+ b->delta = delta;
+
+ return 5;
+}
+
+/* Undefined instruction for dealing with missing ops pointers. */
+static const unsigned char ud2a[] = { 0x0f, 0x0b };
+
+/* Neat trick to map patch type back to the call within the
+ * corresponding structure. */
+static void *get_call_destination(u8 type)
+{
+ struct paravirt_patch_template tmpl = {
+ .pv_init_ops = pv_init_ops,
+ .pv_time_ops = pv_time_ops,
+ .pv_cpu_ops = pv_cpu_ops,
+ .pv_irq_ops = pv_irq_ops,
+ .pv_apic_ops = pv_apic_ops,
+ .pv_mmu_ops = pv_mmu_ops,
+ };
+ return *((void **)&tmpl + type);
+}
+
+unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
+ unsigned long addr, unsigned len)
+{
+ void *opfunc = get_call_destination(type);
+ unsigned ret;
+
+ if (opfunc == NULL)
+ /* If there's no function, patch it with a ud2a (BUG) */
+ ret = paravirt_patch_insns(insnbuf, len, ud2a, ud2a+sizeof(ud2a));
+ else if (opfunc == paravirt_nop)
+ /* If the operation is a nop, then nop the callsite */
+ ret = paravirt_patch_nop();
+ else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
+ type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_syscall_ret))
+ /* If operation requires a jmp, then jmp */
+ ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
+ else
+ /* Otherwise call the function; assume target could
+ clobber any caller-save reg */
+ ret = paravirt_patch_call(insnbuf, opfunc, CLBR_ANY,
+ addr, clobbers, len);
+
+ return ret;
+}
+
+unsigned paravirt_patch_insns(void *insnbuf, unsigned len,
+ const char *start, const char *end)
+{
+ unsigned insn_len = end - start;
+
+ if (insn_len > len || start == NULL)
+ insn_len = len;
+ else
+ memcpy(insnbuf, start, insn_len);
+
+ return insn_len;
+}
+
+void init_IRQ(void)
+{
+ pv_irq_ops.init_IRQ();
+}
+
+static void native_flush_tlb(void)
+{
+ __native_flush_tlb();
+}
+
+/*
+ * Global pages have to be flushed a bit differently. Not a real
+ * performance problem because this does not happen often.
+ */
+static void native_flush_tlb_global(void)
+{
+ __native_flush_tlb_global();
+}
+
+static void native_flush_tlb_single(unsigned long addr)
+{
+ __native_flush_tlb_single(addr);
+}
+
+/* These are in entry.S */
+extern void native_iret(void);
+extern void native_irq_enable_syscall_ret(void);
+
+static int __init print_banner(void)
+{
+ pv_init_ops.banner();
+ return 0;
+}
+core_initcall(print_banner);
+
+static struct resource reserve_ioports = {
+ .start = 0,
+ .end = IO_SPACE_LIMIT,
+ .name = "paravirt-ioport",
+ .flags = IORESOURCE_IO | IORESOURCE_BUSY,
+};
+
+static struct resource reserve_iomem = {
+ .start = 0,
+ .end = -1,
+ .name = "paravirt-iomem",
+ .flags = IORESOURCE_MEM | IORESOURCE_BUSY,
+};
+
+/*
+ * Reserve the whole legacy IO space to prevent any legacy drivers
+ * from wasting time probing for their hardware. This is a fairly
+ * brute-force approach to disabling all non-virtual drivers.
+ *
+ * Note that this must be called very early to have any effect.
+ */
+int paravirt_disable_iospace(void)
+{
+ int ret;
+
+ ret = request_resource(&ioport_resource, &reserve_ioports);
+ if (ret == 0) {
+ ret = request_resource(&iomem_resource, &reserve_iomem);
+ if (ret)
+ release_resource(&reserve_ioports);
+ }
+
+ return ret;
+}
+
+static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;
+
+static inline void enter_lazy(enum paravirt_lazy_mode mode)
+{
+ BUG_ON(__get_cpu_var(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
+ BUG_ON(preemptible());
+
+ __get_cpu_var(paravirt_lazy_mode) = mode;
+}
+
+void paravirt_leave_lazy(enum paravirt_lazy_mode mode)
+{
+ BUG_ON(__get_cpu_var(paravirt_lazy_mode) != mode);
+ BUG_ON(preemptible());
+
+ __get_cpu_var(paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;
+}
+
+void paravirt_enter_lazy_mmu(void)
+{
+ enter_lazy(PARAVIRT_LAZY_MMU);
+}
+
+void paravirt_leave_lazy_mmu(void)
+{
+ paravirt_leave_lazy(PARAVIRT_LAZY_MMU);
+}
+
+void paravirt_enter_lazy_cpu(void)
+{
+ enter_lazy(PARAVIRT_LAZY_CPU);
+}
+
+void paravirt_leave_lazy_cpu(void)
+{
+ paravirt_leave_lazy(PARAVIRT_LAZY_CPU);
+}
+
+enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
+{
+ return __get_cpu_var(paravirt_lazy_mode);
+}
+
+struct pv_info pv_info = {
+ .name = "bare hardware",
+ .paravirt_enabled = 0,
+ .kernel_rpl = 0,
+ .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */
+};
+
+struct pv_init_ops pv_init_ops = {
+ .patch = native_patch,
+ .banner = default_banner,
+ .arch_setup = paravirt_nop,
+ .memory_setup = machine_specific_memory_setup,
+};
+
+struct pv_time_ops pv_time_ops = {
+ .time_init = hpet_time_init,
+ .get_wallclock = native_get_wallclock,
+ .set_wallclock = native_set_wallclock,
+ .sched_clock = native_sched_clock,
+ .get_cpu_khz = native_calculate_cpu_khz,
+};
+
+struct pv_irq_ops pv_irq_ops = {
+ .init_IRQ = native_init_IRQ,
+ .save_fl = native_save_fl,
+ .restore_fl = native_restore_fl,
+ .irq_disable = native_irq_disable,
+ .irq_enable = native_irq_enable,
+ .safe_halt = native_safe_halt,
+ .halt = native_halt,
+ .read_cr8 = native_read_cr8,
+ .write_cr8 = native_write_cr8,
+};
+
+struct pv_cpu_ops pv_cpu_ops = {
+ .cpuid = native_cpuid,
+ .get_debugreg = native_get_debugreg,
+ .set_debugreg = native_set_debugreg,
+ .clts = native_clts,
+ .read_cr0 = native_read_cr0,
+ .write_cr0 = native_write_cr0,
+ .read_cr4 = native_read_cr4,
+ .read_cr4_safe = native_read_cr4_safe,
+ .write_cr4 = native_write_cr4,
+ .wbinvd = native_wbinvd,
+ .read_msr = native_read_msr_safe,
+ .write_msr = native_write_msr_safe,
+ .read_tsc = native_read_tsc,
+ .read_pmc = native_read_pmc,
+ .read_tscp = native_read_tscp,
+ .load_tr_desc = native_load_tr_desc,
+ .set_ldt = native_set_ldt,
+ .load_gdt = native_load_gdt,
+ .load_idt = native_load_idt,
+ .store_gdt = native_store_gdt,
+ .store_idt = native_store_idt,
+ .store_tr = native_store_tr,
+ .load_tls = native_load_tls,
+ .write_ldt_entry = write_dt_entry,
+#ifdef CONFIG_X86_32
+ .write_gdt_entry = write_dt_entry,
+ .write_idt_entry = write_dt_entry,
+#else
+ .write_gdt_entry = native_write_gdt_entry,
+ .write_idt_entry = native_write_idt_entry,
+#endif
+ .load_esp0 = native_load_esp0,
+
+ .irq_enable_syscall_ret = native_irq_enable_syscall_ret,
+ .iret = native_iret,
+ .swapgs = native_swapgs,
+
+ .set_iopl_mask = native_set_iopl_mask,
+ .io_delay = native_io_delay,
+
+ .lazy_mode = {
+ .enter = paravirt_nop,
+ .leave = paravirt_nop,
+ },
+};
+
+struct pv_apic_ops pv_apic_ops = {
+#ifdef CONFIG_X86_LOCAL_APIC
+ .apic_write = native_apic_write,
+ .apic_write_atomic = native_apic_write_atomic,
+ .apic_read = native_apic_read,
+ .setup_boot_clock = setup_boot_APIC_clock,
+ .setup_secondary_clock = setup_secondary_APIC_clock,
+ .startup_ipi_hook = paravirt_nop,
+#endif
+};
+
+struct pv_mmu_ops pv_mmu_ops = {
+#ifdef CONFIG_X86_32
+ .pagetable_setup_start = native_pagetable_setup_start,
+ .pagetable_setup_done = native_pagetable_setup_done,
+#else
+ .pagetable_setup_start = paravirt_nop,
+ .pagetable_setup_done = paravirt_nop,
+#endif
+
+ .read_cr2 = native_read_cr2,
+ .write_cr2 = native_write_cr2,
+ .read_cr3 = native_read_cr3,
+ .write_cr3 = native_write_cr3,
+
+ .flush_tlb_user = native_flush_tlb,
+ .flush_tlb_kernel = native_flush_tlb_global,
+ .flush_tlb_single = native_flush_tlb_single,
+ .flush_tlb_others = native_flush_tlb_others,
+
+ .alloc_pt = paravirt_nop,
+ .alloc_pd = paravirt_nop,
+ .alloc_pd_clone = paravirt_nop,
+ .release_pt = paravirt_nop,
+ .release_pd = paravirt_nop,
+
+ .set_pte = native_set_pte,
+ .set_pte_at = native_set_pte_at,
+ .set_pmd = native_set_pmd,
+ .pte_update = paravirt_nop,
+ .pte_update_defer = paravirt_nop,
+
+#ifdef CONFIG_HIGHPTE
+ .kmap_atomic_pte = kmap_atomic,
+#endif
+
+#ifdef CONFIG_X86_PAE
+ .set_pte_atomic = native_set_pte_atomic,
+ .set_pte_present = native_set_pte_present,
+#endif
+#if defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
+ .set_pud = native_set_pud,
+ .pte_clear = native_pte_clear,
+ .pmd_clear = native_pmd_clear,
+ .pmd_val = native_pmd_val,
+ .make_pmd = native_make_pmd,
+#endif
+ .pte_val = native_pte_val,
+ .pgd_val = native_pgd_val,
+
+ .make_pte = native_make_pte,
+ .make_pgd = native_make_pgd,
+#ifdef CONFIG_X86_64
+ .set_pgd = native_set_pgd,
+
+ .pud_clear = native_pud_clear,
+ .pgd_clear = native_pgd_clear,
+
+ .pud_val = native_pud_val,
+
+ .make_pud = native_make_pud,
+#endif
+ .dup_mmap = paravirt_nop,
+ .exit_mmap = paravirt_nop,
+ .activate_mm = paravirt_nop,
+
+ .lazy_mode = {
+ .enter = paravirt_nop,
+ .leave = paravirt_nop,
+ },
+};
+
+EXPORT_SYMBOL_GPL(pv_time_ops);
+EXPORT_SYMBOL_GPL(pv_cpu_ops);
+EXPORT_SYMBOL_GPL(pv_mmu_ops);
+EXPORT_SYMBOL_GPL(pv_apic_ops);
+EXPORT_SYMBOL_GPL(pv_info);
+EXPORT_SYMBOL (pv_irq_ops);
diff --git a/arch/x86/kernel/paravirt_32.c b/arch/x86/kernel/paravirt_32.c
deleted file mode 100644
index 04f51d0..0000000
--- a/arch/x86/kernel/paravirt_32.c
+++ /dev/null
@@ -1,472 +0,0 @@
-/* Paravirtualization interfaces
- Copyright (C) 2006 Rusty Russell IBM Corporation
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program; if not, write to the Free Software
- Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
-*/
-#include <linux/errno.h>
-#include <linux/module.h>
-#include <linux/efi.h>
-#include <linux/bcd.h>
-#include <linux/highmem.h>
-
-#include <asm/bug.h>
-#include <asm/paravirt.h>
-#include <asm/desc.h>
-#include <asm/setup.h>
-#include <asm/arch_hooks.h>
-#include <asm/time.h>
-#include <asm/irq.h>
-#include <asm/delay.h>
-#include <asm/fixmap.h>
-#include <asm/apic.h>
-#include <asm/tlbflush.h>
-#include <asm/timer.h>
-
-/* nop stub */
-void _paravirt_nop(void)
-{
-}
-
-static void __init default_banner(void)
-{
- printk(KERN_INFO "Booting paravirtualized kernel on %s\n",
- pv_info.name);
-}
-
-char *memory_setup(void)
-{
- return pv_init_ops.memory_setup();
-}
-
-/* Simple instruction patching code. */
-#define DEF_NATIVE(ops, name, code) \
- extern const char start_##ops##_##name[], end_##ops##_##name[]; \
- asm("start_" #ops "_" #name ": " code "; end_" #ops "_" #name ":")
-
-DEF_NATIVE(pv_irq_ops, irq_disable, "cli");
-DEF_NATIVE(pv_irq_ops, irq_enable, "sti");
-DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf");
-DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax");
-DEF_NATIVE(pv_cpu_ops, iret, "iret");
-DEF_NATIVE(pv_cpu_ops, irq_enable_syscall_ret, "sti; sysexit");
-DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
-DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
-DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
-DEF_NATIVE(pv_cpu_ops, clts, "clts");
-DEF_NATIVE(pv_cpu_ops, read_tsc, "rdtsc");
-
-/* Undefined instruction for dealing with missing ops pointers. */
-static const unsigned char ud2a[] = { 0x0f, 0x0b };
-
-static unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
- unsigned long addr, unsigned len)
-{
- const unsigned char *start, *end;
- unsigned ret;
-
- switch(type) {
-#define SITE(ops, x) \
- case PARAVIRT_PATCH(ops.x): \
- start = start_##ops##_##x; \
- end = end_##ops##_##x; \
- goto patch_site
-
- SITE(pv_irq_ops, irq_disable);
- SITE(pv_irq_ops, irq_enable);
- SITE(pv_irq_ops, restore_fl);
- SITE(pv_irq_ops, save_fl);
- SITE(pv_cpu_ops, iret);
- SITE(pv_cpu_ops, irq_enable_syscall_ret);
- SITE(pv_mmu_ops, read_cr2);
- SITE(pv_mmu_ops, read_cr3);
- SITE(pv_mmu_ops, write_cr3);
- SITE(pv_cpu_ops, clts);
- SITE(pv_cpu_ops, read_tsc);
-#undef SITE
-
- patch_site:
- ret = paravirt_patch_insns(ibuf, len, start, end);
- break;
-
- default:
- ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
- break;
- }
-
- return ret;
-}
-
-unsigned paravirt_patch_nop(void)
-{
- return 0;
-}
-
-unsigned paravirt_patch_ignore(unsigned len)
-{
- return len;
-}
-
-struct branch {
- unsigned char opcode;
- u32 delta;
-} __attribute__((packed));
-
-unsigned paravirt_patch_call(void *insnbuf,
- const void *target, u16 tgt_clobbers,
- unsigned long addr, u16 site_clobbers,
- unsigned len)
-{
- struct branch *b = insnbuf;
- unsigned long delta = (unsigned long)target - (addr+5);
-
- if (tgt_clobbers & ~site_clobbers)
- return len; /* target would clobber too much for this site */
- if (len < 5)
- return len; /* call too long for patch site */
-
- b->opcode = 0xe8; /* call */
- b->delta = delta;
- BUILD_BUG_ON(sizeof(*b) != 5);
-
- return 5;
-}
-
-unsigned paravirt_patch_jmp(void *insnbuf, const void *target,
- unsigned long addr, unsigned len)
-{
- struct branch *b = insnbuf;
- unsigned long delta = (unsigned long)target - (addr+5);
-
- if (len < 5)
- return len; /* call too long for patch site */
-
- b->opcode = 0xe9; /* jmp */
- b->delta = delta;
-
- return 5;
-}
-
-/* Neat trick to map patch type back to the call within the
- * corresponding structure. */
-static void *get_call_destination(u8 type)
-{
- struct paravirt_patch_template tmpl = {
- .pv_init_ops = pv_init_ops,
- .pv_time_ops = pv_time_ops,
- .pv_cpu_ops = pv_cpu_ops,
- .pv_irq_ops = pv_irq_ops,
- .pv_apic_ops = pv_apic_ops,
- .pv_mmu_ops = pv_mmu_ops,
- };
- return *((void **)&tmpl + type);
-}
-
-unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
- unsigned long addr, unsigned len)
-{
- void *opfunc = get_call_destination(type);
- unsigned ret;
-
- if (opfunc == NULL)
- /* If there's no function, patch it with a ud2a (BUG) */
- ret = paravirt_patch_insns(insnbuf, len, ud2a, ud2a+sizeof(ud2a));
- else if (opfunc == paravirt_nop)
- /* If the operation is a nop, then nop the callsite */
- ret = paravirt_patch_nop();
- else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
- type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_syscall_ret))
- /* If operation requires a jmp, then jmp */
- ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
- else
- /* Otherwise call the function; assume target could
- clobber any caller-save reg */
- ret = paravirt_patch_call(insnbuf, opfunc, CLBR_ANY,
- addr, clobbers, len);
-
- return ret;
-}
-
-unsigned paravirt_patch_insns(void *insnbuf, unsigned len,
- const char *start, const char *end)
-{
- unsigned insn_len = end - start;
-
- if (insn_len > len || start == NULL)
- insn_len = len;
- else
- memcpy(insnbuf, start, insn_len);
-
- return insn_len;
-}
-
-void init_IRQ(void)
-{
- pv_irq_ops.init_IRQ();
-}
-
-static void native_flush_tlb(void)
-{
- __native_flush_tlb();
-}
-
-/*
- * Global pages have to be flushed a bit differently. Not a real
- * performance problem because this does not happen often.
- */
-static void native_flush_tlb_global(void)
-{
- __native_flush_tlb_global();
-}
-
-static void native_flush_tlb_single(unsigned long addr)
-{
- __native_flush_tlb_single(addr);
-}
-
-/* These are in entry.S */
-extern void native_iret(void);
-extern void native_irq_enable_syscall_ret(void);
-
-static int __init print_banner(void)
-{
- pv_init_ops.banner();
- return 0;
-}
-core_initcall(print_banner);
-
-static struct resource reserve_ioports = {
- .start = 0,
- .end = IO_SPACE_LIMIT,
- .name = "paravirt-ioport",
- .flags = IORESOURCE_IO | IORESOURCE_BUSY,
-};
-
-static struct resource reserve_iomem = {
- .start = 0,
- .end = -1,
- .name = "paravirt-iomem",
- .flags = IORESOURCE_MEM | IORESOURCE_BUSY,
-};
-
-/*
- * Reserve the whole legacy IO space to prevent any legacy drivers
- * from wasting time probing for their hardware. This is a fairly
- * brute-force approach to disabling all non-virtual drivers.
- *
- * Note that this must be called very early to have any effect.
- */
-int paravirt_disable_iospace(void)
-{
- int ret;
-
- ret = request_resource(&ioport_resource, &reserve_ioports);
- if (ret == 0) {
- ret = request_resource(&iomem_resource, &reserve_iomem);
- if (ret)
- release_resource(&reserve_ioports);
- }
-
- return ret;
-}
-
-static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;
-
-static inline void enter_lazy(enum paravirt_lazy_mode mode)
-{
- BUG_ON(x86_read_percpu(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
- BUG_ON(preemptible());
-
- x86_write_percpu(paravirt_lazy_mode, mode);
-}
-
-void paravirt_leave_lazy(enum paravirt_lazy_mode mode)
-{
- BUG_ON(x86_read_percpu(paravirt_lazy_mode) != mode);
- BUG_ON(preemptible());
-
- x86_write_percpu(paravirt_lazy_mode, PARAVIRT_LAZY_NONE);
-}
-
-void paravirt_enter_lazy_mmu(void)
-{
- enter_lazy(PARAVIRT_LAZY_MMU);
-}
-
-void paravirt_leave_lazy_mmu(void)
-{
- paravirt_leave_lazy(PARAVIRT_LAZY_MMU);
-}
-
-void paravirt_enter_lazy_cpu(void)
-{
- enter_lazy(PARAVIRT_LAZY_CPU);
-}
-
-void paravirt_leave_lazy_cpu(void)
-{
- paravirt_leave_lazy(PARAVIRT_LAZY_CPU);
-}
-
-enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
-{
- return x86_read_percpu(paravirt_lazy_mode);
-}
-
-struct pv_info pv_info = {
- .name = "bare hardware",
- .paravirt_enabled = 0,
- .kernel_rpl = 0,
- .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */
-};
-
-struct pv_init_ops pv_init_ops = {
- .patch = native_patch,
- .banner = default_banner,
- .arch_setup = paravirt_nop,
- .memory_setup = machine_specific_memory_setup,
-};
-
-struct pv_time_ops pv_time_ops = {
- .time_init = hpet_time_init,
- .get_wallclock = native_get_wallclock,
- .set_wallclock = native_set_wallclock,
- .sched_clock = native_sched_clock,
- .get_cpu_khz = native_calculate_cpu_khz,
-};
-
-struct pv_irq_ops pv_irq_ops = {
- .init_IRQ = native_init_IRQ,
- .save_fl = native_save_fl,
- .restore_fl = native_restore_fl,
- .irq_disable = native_irq_disable,
- .irq_enable = native_irq_enable,
- .safe_halt = native_safe_halt,
- .halt = native_halt,
-};
-
-struct pv_cpu_ops pv_cpu_ops = {
- .cpuid = native_cpuid,
- .get_debugreg = native_get_debugreg,
- .set_debugreg = native_set_debugreg,
- .clts = native_clts,
- .read_cr0 = native_read_cr0,
- .write_cr0 = native_write_cr0,
- .read_cr4 = native_read_cr4,
- .read_cr4_safe = native_read_cr4_safe,
- .write_cr4 = native_write_cr4,
- .wbinvd = native_wbinvd,
- .read_msr = native_read_msr_safe,
- .write_msr = native_write_msr_safe,
- .read_tsc = native_read_tsc,
- .read_pmc = native_read_pmc,
- .load_tr_desc = native_load_tr_desc,
- .set_ldt = native_set_ldt,
- .load_gdt = native_load_gdt,
- .load_idt = native_load_idt,
- .store_gdt = native_store_gdt,
- .store_idt = native_store_idt,
- .store_tr = native_store_tr,
- .load_tls = native_load_tls,
- .write_ldt_entry = write_dt_entry,
- .write_gdt_entry = write_dt_entry,
- .write_idt_entry = write_dt_entry,
- .load_esp0 = native_load_esp0,
-
- .irq_enable_syscall_ret = native_irq_enable_syscall_ret,
- .iret = native_iret,
-
- .set_iopl_mask = native_set_iopl_mask,
- .io_delay = native_io_delay,
-
- .lazy_mode = {
- .enter = paravirt_nop,
- .leave = paravirt_nop,
- },
-};
-
-struct pv_apic_ops pv_apic_ops = {
-#ifdef CONFIG_X86_LOCAL_APIC
- .apic_write = native_apic_write,
- .apic_write_atomic = native_apic_write_atomic,
- .apic_read = native_apic_read,
- .setup_boot_clock = setup_boot_APIC_clock,
- .setup_secondary_clock = setup_secondary_APIC_clock,
- .startup_ipi_hook = paravirt_nop,
-#endif
-};
-
-struct pv_mmu_ops pv_mmu_ops = {
- .pagetable_setup_start = native_pagetable_setup_start,
- .pagetable_setup_done = native_pagetable_setup_done,
-
- .read_cr2 = native_read_cr2,
- .write_cr2 = native_write_cr2,
- .read_cr3 = native_read_cr3,
- .write_cr3 = native_write_cr3,
-
- .flush_tlb_user = native_flush_tlb,
- .flush_tlb_kernel = native_flush_tlb_global,
- .flush_tlb_single = native_flush_tlb_single,
- .flush_tlb_others = native_flush_tlb_others,
-
- .alloc_pt = paravirt_nop,
- .alloc_pd = paravirt_nop,
- .alloc_pd_clone = paravirt_nop,
- .release_pt = paravirt_nop,
- .release_pd = paravirt_nop,
-
- .set_pte = native_set_pte,
- .set_pte_at = native_set_pte_at,
- .set_pmd = native_set_pmd,
- .pte_update = paravirt_nop,
- .pte_update_defer = paravirt_nop,
-
-#ifdef CONFIG_HIGHPTE
- .kmap_atomic_pte = kmap_atomic,
-#endif
-
-#ifdef CONFIG_X86_PAE
- .set_pte_atomic = native_set_pte_atomic,
- .set_pte_present = native_set_pte_present,
- .set_pud = native_set_pud,
- .pte_clear = native_pte_clear,
- .pmd_clear = native_pmd_clear,
-
- .pmd_val = native_pmd_val,
- .make_pmd = native_make_pmd,
-#endif
-
- .pte_val = native_pte_val,
- .pgd_val = native_pgd_val,
-
- .make_pte = native_make_pte,
- .make_pgd = native_make_pgd,
-
- .dup_mmap = paravirt_nop,
- .exit_mmap = paravirt_nop,
- .activate_mm = paravirt_nop,
-
- .lazy_mode = {
- .enter = paravirt_nop,
- .leave = paravirt_nop,
- },
-};
-
-EXPORT_SYMBOL_GPL(pv_time_ops);
-EXPORT_SYMBOL_GPL(pv_cpu_ops);
-EXPORT_SYMBOL_GPL(pv_mmu_ops);
-EXPORT_SYMBOL_GPL(pv_apic_ops);
-EXPORT_SYMBOL_GPL(pv_info);
-EXPORT_SYMBOL (pv_irq_ops);
diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c
new file mode 100644
index 0000000..46ae585
--- /dev/null
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -0,0 +1,52 @@
+#include <asm/paravirt.h>
+
+DEF_NATIVE(pv_irq_ops, irq_disable, "cli");
+DEF_NATIVE(pv_irq_ops, irq_enable, "sti");
+DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf");
+DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax");
+DEF_NATIVE(pv_cpu_ops, iret, "iret");
+DEF_NATIVE(pv_cpu_ops, irq_enable_syscall_ret, "sti; sysexit");
+DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
+DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
+DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
+DEF_NATIVE(pv_cpu_ops, clts, "clts");
+DEF_NATIVE(pv_cpu_ops, read_tsc, "rdtsc");
+
+/* Undefined instruction for dealing with missing ops pointers. */
+static const unsigned char ud2a[] = { 0x0f, 0x0b };
+
+unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
+ unsigned long addr, unsigned len)
+{
+ const unsigned char *start, *end;
+ unsigned ret;
+
+#define PATCH_SITE(ops, x) \
+ case PARAVIRT_PATCH(ops.x): \
+ start = start_##ops##_##x; \
+ end = end_##ops##_##x; \
+ goto patch_site
+ switch(type) {
+ PATCH_SITE(pv_irq_ops, irq_disable);
+ PATCH_SITE(pv_irq_ops, irq_enable);
+ PATCH_SITE(pv_irq_ops, restore_fl);
+ PATCH_SITE(pv_irq_ops, save_fl);
+ PATCH_SITE(pv_cpu_ops, iret);
+ PATCH_SITE(pv_cpu_ops, irq_enable_syscall_ret);
+ PATCH_SITE(pv_mmu_ops, read_cr2);
+ PATCH_SITE(pv_mmu_ops, read_cr3);
+ PATCH_SITE(pv_mmu_ops, write_cr3);
+ PATCH_SITE(pv_cpu_ops, clts);
+ PATCH_SITE(pv_cpu_ops, read_tsc);
+
+ patch_site:
+ ret = paravirt_patch_insns(ibuf, len, start, end);
+ break;
+
+ default:
+ ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
+ break;
+ }
+#undef PATCH_SITE
+ return ret;
+}
diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c
new file mode 100644
index 0000000..cbfc4f3
--- /dev/null
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -0,0 +1,56 @@
+#include <asm/paravirt.h>
+#include <asm/asm-offsets.h>
+
+DEF_NATIVE(pv_irq_ops, irq_disable, "cli");
+DEF_NATIVE(pv_irq_ops, irq_enable, "sti");
+DEF_NATIVE(pv_irq_ops, restore_fl, "pushq %rdi; popfq");
+DEF_NATIVE(pv_irq_ops, save_fl, "pushfq; popq %rax");
+DEF_NATIVE(pv_cpu_ops, iret, "iretq");
+DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax");
+DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax");
+DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3");
+DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)");
+DEF_NATIVE(pv_cpu_ops, clts, "clts");
+DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
+
+/* the three commands give us more control to how to return from a syscall */
+DEF_NATIVE(pv_cpu_ops, irq_enable_syscall_ret, "movq %gs:" __stringify(pda_oldrsp) ", %rsp; swapgs; sysretq;");
+DEF_NATIVE(pv_cpu_ops, swapgs, "swapgs");
+
+unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
+ unsigned long addr, unsigned len)
+{
+ const unsigned char *start, *end;
+ unsigned ret;
+
+#define PATCH_SITE(ops, x) \
+ case PARAVIRT_PATCH(ops.x): \
+ start = start_##ops##_##x; \
+ end = end_##ops##_##x; \
+ goto patch_site
+ switch(type) {
+ PATCH_SITE(pv_irq_ops, restore_fl);
+ PATCH_SITE(pv_irq_ops, save_fl);
+ PATCH_SITE(pv_irq_ops, irq_enable);
+ PATCH_SITE(pv_irq_ops, irq_disable);
+ PATCH_SITE(pv_cpu_ops, iret);
+ PATCH_SITE(pv_cpu_ops, irq_enable_syscall_ret);
+ PATCH_SITE(pv_cpu_ops, swapgs);
+ PATCH_SITE(pv_mmu_ops, read_cr2);
+ PATCH_SITE(pv_mmu_ops, read_cr3);
+ PATCH_SITE(pv_mmu_ops, write_cr3);
+ PATCH_SITE(pv_cpu_ops, clts);
+ PATCH_SITE(pv_mmu_ops, flush_tlb_single);
+ PATCH_SITE(pv_cpu_ops, wbinvd);
+
+ patch_site:
+ ret = paravirt_patch_insns(ibuf, len, start, end);
+ break;
+
+ default:
+ ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
+ break;
+ }
+#undef PATCH_SITE
+ return ret;
+}
diff --git a/arch/x86/kernel/vmlinux_64.lds.S b/arch/x86/kernel/vmlinux_64.lds.S
index ba8ea97..c3fce85 100644
--- a/arch/x86/kernel/vmlinux_64.lds.S
+++ b/arch/x86/kernel/vmlinux_64.lds.S
@@ -185,6 +185,12 @@ SECTIONS
.altinstr_replacement : AT(ADDR(.altinstr_replacement) - LOAD_OFFSET) {
*(.altinstr_replacement)
}
+ . = ALIGN(8);
+ .parainstructions : AT(ADDR(.parainstructions) - LOAD_OFFSET) {
+ __parainstructions = .;
+ *(.parainstructions)
+ __parainstructions_end = .;
+ }
/* .exit.text is discard at runtime, not link time, to deal with references
from .altinstructions and .eh_frame */
.exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
diff --git a/include/asm-x86/paravirt.h b/include/asm-x86/paravirt.h
index d81a361..13291e2 100644
--- a/include/asm-x86/paravirt.h
+++ b/include/asm-x86/paravirt.h
@@ -7,23 +7,42 @@
#include <asm/page.h>

/* Bitmask of what can be clobbered: usually at least eax. */
-#define CLBR_NONE 0x0
-#define CLBR_EAX 0x1
-#define CLBR_ECX 0x2
-#define CLBR_EDX 0x4
-#define CLBR_ANY 0x7
+#define CLBR_NONE 0
+#define CLBR_EAX (1 << 0)
+#define CLBR_ECX (1 << 1)
+#define CLBR_EDX (1 << 2)
+
+#ifdef CONFIG_X86_64
+#define CLBR_RSI (1 << 3)
+#define CLBR_RDI (1 << 4)
+#define CLBR_R8 (1 << 5)
+#define CLBR_R9 (1 << 6)
+#define CLBR_R10 (1 << 7)
+#define CLBR_R11 (1 << 8)
+#define CLBR_ANY ((1 << 9) - 1)
+#include <asm/desc_defs.h>
+#else
+/* CLBR_ANY should match all regs platform has. For i386, that's just it */
+#define CLBR_ANY ((1 << 3) - 1)
+#endif /* X86_64 */

#ifndef __ASSEMBLY__
#include <linux/types.h>
#include <linux/cpumask.h>
#include <asm/kmap_types.h>
+#include <linux/stringify.h>

struct page;
struct thread_struct;
-struct Xgt_desc_struct;
struct tss_struct;
struct mm_struct;
struct desc_struct;
+/* FIXME: Ideally, the two arches would use the same data structure */
+#ifdef CONFIG_X86_64
+typedef struct desc_ptr x86_descr_ptr;
+#else
+typedef struct Xgt_desc_struct x86_descr_ptr;
+#endif

/* general info */
struct pv_info {
@@ -54,7 +73,6 @@ struct pv_init_ops {
void (*banner)(void);
};

-
struct pv_lazy_ops {
/* Set deferred update mode, used for batching operations. */
void (*enter)(void);
@@ -86,21 +104,29 @@ struct pv_cpu_ops {
unsigned long (*read_cr4)(void);
void (*write_cr4)(unsigned long);

+
/* Segment descriptor handling */
void (*load_tr_desc)(void);
- void (*load_gdt)(const struct Xgt_desc_struct *);
- void (*load_idt)(const struct Xgt_desc_struct *);
- void (*store_gdt)(struct Xgt_desc_struct *);
- void (*store_idt)(struct Xgt_desc_struct *);
+ void (*load_gdt)(const x86_descr_ptr *);
+ void (*load_idt)(const x86_descr_ptr *);
+ void (*store_gdt)(x86_descr_ptr *);
+ void (*store_idt)(x86_descr_ptr *);
void (*set_ldt)(const void *desc, unsigned entries);
unsigned long (*store_tr)(void);
void (*load_tls)(struct thread_struct *t, unsigned int cpu);
void (*write_ldt_entry)(struct desc_struct *,
int entrynum, u32 low, u32 high);
+#ifdef CONFIG_X86_32
void (*write_gdt_entry)(struct desc_struct *,
int entrynum, u32 low, u32 high);
void (*write_idt_entry)(struct desc_struct *,
int entrynum, u32 low, u32 high);
+#else
+ void (*write_gdt_entry)(void *ptr, void *entry, unsigned type,
+ unsigned size);
+ void (*write_idt_entry)(void *adr, struct gate_struct *s);
+#endif
+
void (*load_esp0)(struct tss_struct *tss, struct thread_struct *t);

void (*set_iopl_mask)(unsigned mask);
@@ -115,15 +141,18 @@ struct pv_cpu_ops {
/* MSR, PMC and TSR operations.
err = 0/-EFAULT. wrmsr returns 0/-EFAULT. */
u64 (*read_msr)(unsigned int msr, int *err);
- int (*write_msr)(unsigned int msr, u64 val);
+ int (*write_msr)(unsigned int msr, unsigned int low, unsigned int high);

u64 (*read_tsc)(void);
- u64 (*read_pmc)(void);
+ u64 (*read_pmc)(int counter);
+ u64 (*read_tscp)(int *aux);

/* These two are jmp to, not actually called. */
void (*irq_enable_syscall_ret)(void);
void (*iret)(void);

+ void (*swapgs)(void);
+
struct pv_lazy_ops lazy_mode;
};

@@ -142,6 +171,10 @@ struct pv_irq_ops {
void (*irq_enable)(void);
void (*safe_halt)(void);
void (*halt)(void);
+
+ /* cr8 register has interrupt priority information on x86_64 */
+ unsigned long (*read_cr8)(void);
+ void (*write_cr8)(unsigned long);
};

struct pv_apic_ops {
@@ -150,9 +183,9 @@ struct pv_apic_ops {
* Direct APIC operations, principally for VMI. Ideally
* these shouldn't be in this interface.
*/
- void (*apic_write)(unsigned long reg, unsigned long v);
- void (*apic_write_atomic)(unsigned long reg, unsigned long v);
- unsigned long (*apic_read)(unsigned long reg);
+ void (*apic_write)(unsigned long reg, u32 v);
+ void (*apic_write_atomic)(unsigned long reg, u32 v);
+ u32 (*apic_read)(unsigned long reg);
void (*setup_boot_clock)(void);
void (*setup_secondary_clock)(void);

@@ -216,6 +249,8 @@ struct pv_mmu_ops {
void (*set_pte_atomic)(pte_t *ptep, pte_t pteval);
void (*set_pte_present)(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte);
+#endif
+#if defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
void (*set_pud)(pud_t *pudp, pud_t pudval);
void (*pte_clear)(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
void (*pmd_clear)(pmd_t *pmdp);
@@ -227,6 +262,16 @@ struct pv_mmu_ops {
pte_t (*make_pte)(unsigned long long pte);
pmd_t (*make_pmd)(unsigned long long pmd);
pgd_t (*make_pgd)(unsigned long long pgd);
+ #ifdef CONFIG_X86_64
+ void (*set_pgd)(pgd_t *pgdp, pgd_t pgdval);
+
+ void (*pud_clear)(pud_t *pudp);
+ void (*pgd_clear)(pgd_t *pgdp);
+
+ unsigned long long (*pud_val)(pud_t);
+
+ pud_t (*make_pud)(unsigned long long pud);
+ #endif
#else
unsigned long (*pte_val)(pte_t);
unsigned long (*pgd_val)(pgd_t);
@@ -255,6 +300,12 @@ struct paravirt_patch_template
struct pv_mmu_ops pv_mmu_ops;
};

+#ifdef CONFIG_X86_64
+#define WORDSIZE_STR " .quad"
+#else
+#define WORDSIZE_STR " .long"
+#endif
+
extern struct pv_info pv_info;
extern struct pv_init_ops pv_init_ops;
extern struct pv_time_ops pv_time_ops;
@@ -279,7 +330,8 @@ extern struct pv_mmu_ops pv_mmu_ops;
#define _paravirt_alt(insn_string, type, clobber) \
"771:\n\t" insn_string "\n" "772:\n" \
".pushsection .parainstructions,\"a\"\n" \
- " .long 771b\n" \
+ ".align 8\n" \
+ WORDSIZE_STR " 771b\n" \
" .byte " type "\n" \
" .byte 772b-771b\n" \
" .short " clobber "\n" \
@@ -289,6 +341,11 @@ extern struct pv_mmu_ops pv_mmu_ops;
#define paravirt_alt(insn_string) \
_paravirt_alt(insn_string, "%c[paravirt_typenum]", "%c[paravirt_clobber]")

+/* Simple instruction patching code. */
+#define DEF_NATIVE(ops, name, code) \
+ extern const char start_##ops##_##name[], end_##ops##_##name[]; \
+ asm("start_" #ops "_" #name ": " code "; end_" #ops "_" #name ":")
+
unsigned paravirt_patch_nop(void);
unsigned paravirt_patch_ignore(unsigned len);
unsigned paravirt_patch_call(void *insnbuf,
@@ -303,6 +360,9 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
unsigned paravirt_patch_insns(void *insnbuf, unsigned len,
const char *start, const char *end);

+unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
+ unsigned long addr, unsigned len);
+
int paravirt_disable_iospace(void);

/*
@@ -319,22 +379,29 @@ int paravirt_disable_iospace(void);
* runtime.
*
* Normally, a call to a pv_op function is a simple indirect call:
- * (paravirt_ops.operations)(args...).
+ * (pv_op_struct.operations)(args...).
*
* Unfortunately, this is a relatively slow operation for modern CPUs,
* because it cannot necessarily determine what the destination
- * address is. In this case, the address is a runtime constant, so at
- * the very least we can patch the call to e a simple direct call, or
+ * address is. In this case, the address is a runtime constant, so at
+ * the very least we can patch the call to be a simple direct call, or
* ideally, patch an inline implementation into the callsite. (Direct
* calls are essentially free, because the call and return addresses
* are completely predictable.)
*
- * These macros rely on the standard gcc "regparm(3)" calling
+ * For i386, these macros rely on the standard gcc "regparm(3)" calling
* convention, in which the first three arguments are placed in %eax,
* %edx, %ecx (in that order), and the remaining arguments are placed
* on the stack. All caller-save registers (eax,edx,ecx) are expected
* to be modified (either clobbered or used for return values).
*
+ * X86_64, on the other hand, already specifies a register-based calling
+ * conventions, returning at %rax, with parameteres going on %rdi, %rsi,
+ * %rdx, and %rcx. Note that for this reason, x86_64 does not need any
+ * special handling for dealing with 4 arguments, unlike i386.
+ * However, x86_64 also have to clobber all caller saved registers, which
+ * unfortunately, are quite a bit (r8 - r11)
+ *
* The call instruction itself is marked by placing its start address
* and size into the .parainstructions section, so that
* apply_paravirt() in arch/i386/kernel/alternative.c can do the
@@ -356,9 +423,10 @@ int paravirt_disable_iospace(void);
* the return type. The macro then uses sizeof() on that type to
* determine whether its a 32 or 64 bit value, and places the return
* in the right register(s) (just %eax for 32-bit, and %edx:%eax for
- * 64-bit).
+ * 64-bit). For x86_64 machines, it just returns at %rax regardless of
+ * the return value size.
*
- * 64-bit arguments are passed as a pair of adjacent 32-bit arguments
+ * i386 also passes 64-bit arguments as a pair of adjacent 32-bit arguments
* in low,high order.
*
* Small structures are passed and returned in registers. The macro
@@ -369,46 +437,67 @@ int paravirt_disable_iospace(void);
* means that all uses must be wrapped in inline functions. This also
* makes sure the incoming and outgoing types are always correct.
*/
+#ifdef CONFIG_X86_32
+#define PVOP_VCALL_ARGS unsigned long __eax,__edx,__ecx
+#define PVOP_CALL_ARGS PVOP_VCALL_ARGS
+#define PVOP_VCALL_CLOBBERS "=a" (__eax), "=d" (__edx), \
+ "=c" (__ecx)
+#define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS
+#define EXTRA_CLOBBERS
+#define VEXTRA_CLOBBERS
+#else
+#define PVOP_VCALL_ARGS unsigned long __edi,__esi,__edx,__ecx
+#define PVOP_CALL_ARGS PVOP_VCALL_ARGS, __eax
+#define PVOP_VCALL_CLOBBERS "=D" (__edi), \
+ "=S" (__esi), "=d" (__edx), \
+ "=c" (__ecx)
+
+#define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
+
+#define EXTRA_CLOBBERS , "r8", "r9", "r10", "r11"
+#define VEXTRA_CLOBBERS , "rax", "r8", "r9", "r10", "r11"
+#endif
+
#define __PVOP_CALL(rettype, op, pre, post, ...) \
({ \
rettype __ret; \
- unsigned long __eax, __edx, __ecx; \
+ PVOP_CALL_ARGS; \
+ /* This is 32-bit specific, but is okay in 64-bit */ \
+ /* since this condition will never hold */ \
if (sizeof(rettype) > sizeof(unsigned long)) { \
asm volatile(pre \
paravirt_alt(PARAVIRT_CALL) \
post \
- : "=a" (__eax), "=d" (__edx), \
- "=c" (__ecx) \
+ : PVOP_CALL_CLOBBERS \
: paravirt_type(op), \
paravirt_clobber(CLBR_ANY), \
##__VA_ARGS__ \
- : "memory", "cc"); \
+ : "memory", "cc" EXTRA_CLOBBERS); \
__ret = (rettype)((((u64)__edx) << 32) | __eax); \
} else { \
asm volatile(pre \
paravirt_alt(PARAVIRT_CALL) \
post \
- : "=a" (__eax), "=d" (__edx), \
- "=c" (__ecx) \
+ : PVOP_CALL_CLOBBERS \
: paravirt_type(op), \
paravirt_clobber(CLBR_ANY), \
##__VA_ARGS__ \
- : "memory", "cc"); \
+ : "memory", "cc" EXTRA_CLOBBERS); \
__ret = (rettype)__eax; \
} \
__ret; \
})
#define __PVOP_VCALL(op, pre, post, ...) \
({ \
- unsigned long __eax, __edx, __ecx; \
+ PVOP_VCALL_ARGS; \
asm volatile(pre \
paravirt_alt(PARAVIRT_CALL) \
post \
- : "=a" (__eax), "=d" (__edx), "=c" (__ecx) \
+ : PVOP_VCALL_CLOBBERS \
: paravirt_type(op), \
paravirt_clobber(CLBR_ANY), \
##__VA_ARGS__ \
- : "memory", "cc"); \
+ : "memory", "cc" VEXTRA_CLOBBERS); \
})

#define PVOP_CALL0(rettype, op) \
@@ -417,22 +506,27 @@ int paravirt_disable_iospace(void);
__PVOP_VCALL(op, "", "")

#define PVOP_CALL1(rettype, op, arg1) \
- __PVOP_CALL(rettype, op, "", "", "0" ((u32)(arg1)))
+ __PVOP_CALL(rettype, op, "", "", "0" ((unsigned long)(arg1)))
#define PVOP_VCALL1(op, arg1) \
- __PVOP_VCALL(op, "", "", "0" ((u32)(arg1)))
+ __PVOP_VCALL(op, "", "", "0" ((unsigned long)(arg1)))

#define PVOP_CALL2(rettype, op, arg1, arg2) \
- __PVOP_CALL(rettype, op, "", "", "0" ((u32)(arg1)), "1" ((u32)(arg2)))
+ __PVOP_CALL(rettype, op, "", "", "0" ((unsigned long)(arg1)), \
+ "1" ((unsigned long)(arg2)))
+
#define PVOP_VCALL2(op, arg1, arg2) \
- __PVOP_VCALL(op, "", "", "0" ((u32)(arg1)), "1" ((u32)(arg2)))
+ __PVOP_VCALL(op, "", "", "0" ((unsigned long)(arg1)), \
+ "1" ((unsigned long)(arg2)))

#define PVOP_CALL3(rettype, op, arg1, arg2, arg3) \
- __PVOP_CALL(rettype, op, "", "", "0" ((u32)(arg1)), \
- "1"((u32)(arg2)), "2"((u32)(arg3)))
+ __PVOP_CALL(rettype, op, "", "", "0" ((unsigned long)(arg1)), \
+ "1"((unsigned long)(arg2)), "2"((unsigned long)(arg3)))
#define PVOP_VCALL3(op, arg1, arg2, arg3) \
- __PVOP_VCALL(op, "", "", "0" ((u32)(arg1)), "1"((u32)(arg2)), \
- "2"((u32)(arg3)))
+ __PVOP_VCALL(op, "", "", "0" ((unsigned long)(arg1)), \
+ "1"((unsigned long)(arg2)), "2"((unsigned long)(arg3)))

+/* This is the only difference in x86_64. We can make it much simpler */
+#ifdef CONFIG_X86_32
#define PVOP_CALL4(rettype, op, arg1, arg2, arg3, arg4) \
__PVOP_CALL(rettype, op, \
"push %[_arg4];", "lea 4(%%esp),%%esp;", \
@@ -443,6 +537,16 @@ int paravirt_disable_iospace(void);
"push %[_arg4];", "lea 4(%%esp),%%esp;", \
"0" ((u32)(arg1)), "1" ((u32)(arg2)), \
"2" ((u32)(arg3)), [_arg4] "mr" ((u32)(arg4)))
+#else
+#define PVOP_CALL4(rettype, op, arg1, arg2, arg3, arg4) \
+ __PVOP_CALL(rettype, op, "", "", "0" ((unsigned long)(arg1)), \
+ "1"((unsigned long)(arg2)), "2"((unsigned long)(arg3)), \
+ "3"((unsigned long)(arg4)))
+#define PVOP_VCALL4(op, arg1, arg2, arg3, arg4) \
+ __PVOP_VCALL(op, "", "", "0" ((unsigned long)(arg1)), \
+ "1"((unsigned long)(arg2)), "2"((unsigned long)(arg3)), \
+ "3"((unsigned long)(arg4)))
+#endif

static inline int paravirt_enabled(void)
{
@@ -540,6 +644,15 @@ static inline void write_cr4(unsigned long x)
PVOP_VCALL1(pv_cpu_ops.write_cr4, x);
}

+static inline unsigned long read_cr8(void)
+{
+ return PVOP_CALL0(unsigned long, pv_irq_ops.read_cr8);
+}
+
+static inline void write_cr8(unsigned long x)
+{
+ PVOP_VCALL1(pv_irq_ops.write_cr8, x);
+}
static inline void raw_safe_halt(void)
{
PVOP_VCALL0(pv_irq_ops.safe_halt);
@@ -561,6 +674,7 @@ static inline u64 paravirt_read_msr(unsigned msr, int *err)
{
return PVOP_CALL2(u64, pv_cpu_ops.read_msr, msr, err);
}
+
static inline int paravirt_write_msr(unsigned msr, unsigned low, unsigned high)
{
return PVOP_CALL3(int, pv_cpu_ops.write_msr, msr, low, high);
@@ -613,8 +727,6 @@ static inline unsigned long long paravirt_sched_clock(void)
}
#define calculate_cpu_khz() (pv_time_ops.get_cpu_khz())

-#define write_tsc(val1,val2) wrmsr(0x10, val1, val2)
-
static inline unsigned long long paravirt_read_pmc(int counter)
{
return PVOP_CALL1(u64, pv_cpu_ops.read_pmc, counter);
@@ -626,15 +738,36 @@ static inline unsigned long long paravirt_read_pmc(int counter)
high = _l >> 32; \
} while(0)

+static inline unsigned long paravirt_readt_scp(int *aux)
+{
+ return PVOP_CALL1(u64, pv_cpu_ops.read_tscp, aux);
+}
+
+#define rdtscp(low, high, aux) \
+do { \
+ int __aux; \
+ unsigned long __val = paravirt_rdtscp(&__aux); \
+ (low) = (u32)__val; \
+ (high) = (u32)(__val >> 32); \
+ (aux) = __aux; \
+} while (0)
+
+#define rdtscpll(val, aux) \
+do { \
+ unsigned long __aux; \
+ val = paravirt_rdtscp(&__aux); \
+ (aux) = __aux; \
+} while (0)
+
static inline void load_TR_desc(void)
{
PVOP_VCALL0(pv_cpu_ops.load_tr_desc);
}
-static inline void load_gdt(const struct Xgt_desc_struct *dtr)
+static inline void load_gdt(const x86_descr_ptr *dtr)
{
PVOP_VCALL1(pv_cpu_ops.load_gdt, dtr);
}
-static inline void load_idt(const struct Xgt_desc_struct *dtr)
+static inline void load_idt(const x86_descr_ptr *dtr)
{
PVOP_VCALL1(pv_cpu_ops.load_idt, dtr);
}
@@ -642,11 +775,11 @@ static inline void set_ldt(const void *addr, unsigned entries)
{
PVOP_VCALL2(pv_cpu_ops.set_ldt, addr, entries);
}
-static inline void store_gdt(struct Xgt_desc_struct *dtr)
+static inline void store_gdt(x86_descr_ptr *dtr)
{
PVOP_VCALL1(pv_cpu_ops.store_gdt, dtr);
}
-static inline void store_idt(struct Xgt_desc_struct *dtr)
+static inline void store_idt(x86_descr_ptr *dtr)
{
PVOP_VCALL1(pv_cpu_ops.store_idt, dtr);
}
@@ -663,6 +796,8 @@ static inline void write_ldt_entry(void *dt, int entry, u32 low, u32 high)
{
PVOP_VCALL4(pv_cpu_ops.write_ldt_entry, dt, entry, low, high);
}
+
+#ifdef CONFIG_X86_32
static inline void write_gdt_entry(void *dt, int entry, u32 low, u32 high)
{
PVOP_VCALL4(pv_cpu_ops.write_gdt_entry, dt, entry, low, high);
@@ -671,6 +806,19 @@ static inline void write_idt_entry(void *dt, int entry, u32 low, u32 high)
{
PVOP_VCALL4(pv_cpu_ops.write_idt_entry, dt, entry, low, high);
}
+#else
+static inline void write_gdt_entry(void *ptr, void *entry,
+ unsigned type, unsigned size)
+{
+ PVOP_VCALL4(pv_cpu_ops.write_gdt_entry, ptr, entry, type, size);
+}
+
+static inline void write_idt_entry(void *adr, struct gate_struct *s)
+{
+ PVOP_VCALL2(pv_cpu_ops.write_idt_entry, adr, s);
+}
+#endif
+
static inline void set_iopl_mask(unsigned mask)
{
PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask);
@@ -690,19 +838,19 @@ static inline void slow_down_io(void) {
/*
* Basic functions accessing APICs.
*/
-static inline void apic_write(unsigned long reg, unsigned long v)
+static inline void apic_write(unsigned long reg, u32 v)
{
PVOP_VCALL2(pv_apic_ops.apic_write, reg, v);
}

-static inline void apic_write_atomic(unsigned long reg, unsigned long v)
+static inline void apic_write_atomic(unsigned long reg, u32 v)
{
PVOP_VCALL2(pv_apic_ops.apic_write_atomic, reg, v);
}

-static inline unsigned long apic_read(unsigned long reg)
+static inline u32 apic_read(unsigned long reg)
{
- return PVOP_CALL1(unsigned long, pv_apic_ops.apic_read, reg);
+ return PVOP_CALL1(u32, pv_apic_ops.apic_read, reg);
}

static inline void setup_boot_clock(void)
@@ -762,10 +910,12 @@ static inline void __flush_tlb(void)
{
PVOP_VCALL0(pv_mmu_ops.flush_tlb_user);
}
+
static inline void __flush_tlb_global(void)
{
PVOP_VCALL0(pv_mmu_ops.flush_tlb_kernel);
}
+
static inline void __flush_tlb_single(unsigned long addr)
{
PVOP_VCALL1(pv_mmu_ops.flush_tlb_single, addr);
@@ -908,7 +1058,103 @@ static inline void pmd_clear(pmd_t *pmdp)
PVOP_VCALL1(pv_mmu_ops.pmd_clear, pmdp);
}

-#else /* !CONFIG_X86_PAE */
+#elif defined(CONFIG_X86_64)
+/* FIXME: There ought to be a way to do it that duplicate less code */
+static inline pte_t __pte(unsigned long long val)
+{
+ unsigned long long ret;
+ ret = PVOP_CALL1(unsigned long long, pv_mmu_ops.make_pte, val);
+ return (pte_t) { ret };
+}
+
+static inline pmd_t __pmd(unsigned long long val)
+{
+ unsigned long long ret;
+ ret = PVOP_CALL1(unsigned long long, pv_mmu_ops.make_pmd, val);
+ return (pmd_t) { ret };
+}
+
+static inline pud_t __pud(unsigned long long val)
+{
+ unsigned long long ret;
+ ret = PVOP_CALL1(unsigned long long, pv_mmu_ops.make_pud, val);
+ return (pud_t) { ret };
+}
+
+static inline pgd_t __pgd(unsigned long long val)
+{
+ unsigned long long ret;
+ ret = PVOP_CALL1(unsigned long long, pv_mmu_ops.make_pgd, val);
+ return (pgd_t) { ret };
+}
+
+static inline unsigned long long pte_val(pte_t x)
+{
+ return PVOP_CALL1(unsigned long long, pv_mmu_ops.pte_val, x.pte);
+}
+
+static inline unsigned long long pmd_val(pmd_t x)
+{
+ return PVOP_CALL1(unsigned long long, pv_mmu_ops.pmd_val, x.pmd);
+}
+
+static inline unsigned long long pud_val(pud_t x)
+{
+ return PVOP_CALL1(unsigned long long, pv_mmu_ops.pud_val, x.pud);
+}
+
+static inline unsigned long long pgd_val(pgd_t x)
+{
+ return PVOP_CALL1(unsigned long long, pv_mmu_ops.pgd_val, x.pgd);
+}
+
+static inline void set_pte(pte_t *ptep, pte_t pteval)
+{
+ PVOP_VCALL2(pv_mmu_ops.set_pte, ptep, pteval.pte);
+}
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pteval)
+{
+ PVOP_VCALL4(pv_mmu_ops.set_pte_at, mm, addr, ptep, pteval.pte);
+}
+
+static inline void set_pmd(pmd_t *pmdp, pmd_t pmdval)
+{
+ PVOP_VCALL2(pv_mmu_ops.set_pmd, pmdp, pmdval.pmd);
+}
+
+static inline void set_pud(pud_t *pudp, pud_t pudval)
+{
+ PVOP_VCALL2(pv_mmu_ops.set_pud, pudp, pudval.pud);
+}
+
+static inline void set_pgd(pgd_t *pgdp, pgd_t pgdval)
+{
+ PVOP_VCALL2(pv_mmu_ops.set_pgd, pgdp, pgdval.pgd);
+}
+
+static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+{
+ PVOP_VCALL3(pv_mmu_ops.pte_clear, mm, addr, ptep);
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+ PVOP_VCALL1(pv_mmu_ops.pmd_clear, pmdp);
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+ PVOP_VCALL1(pv_mmu_ops.pud_clear, pudp);
+}
+
+static inline void pgd_clear(pgd_t *pgdp)
+{
+ PVOP_VCALL1(pv_mmu_ops.pgd_clear, pgdp);
+}
+
+#else /* !CONFIG_X86_PAE && !CONFIG_X86_64*/

static inline pte_t __pte(unsigned long val)
{
@@ -1014,52 +1260,68 @@ struct paravirt_patch_site {
extern struct paravirt_patch_site __parainstructions[],
__parainstructions_end[];

+#ifdef CONFIG_X86_32
+#define PV_SAVE_REGS "pushl %%ecx; pushl %%edx;"
+#define PV_RESTORE_REGS "popl %%edx; popl %%ecx"
+#define PV_FLAGS_ARG "0"
+#define PV_EXTRA_CLOBBERS
+#define PV_VEXTRA_CLOBBERS
+#else
+/* We save some registers, but all of them, that's too much. We clobber all
+ * caller saved registers but the argument parameter */
+#define PV_SAVE_REGS "pushq %%rdi;"
+#define PV_RESTORE_REGS "popq %%rdi;"
+#define PV_EXTRA_CLOBBERS EXTRA_CLOBBERS, "rcx" , "rdx"
+#define PV_VEXTRA_CLOBBERS EXTRA_CLOBBERS, "rdi", "rcx" , "rdx"
+#define PV_FLAGS_ARG "D"
+#endif
+
static inline unsigned long __raw_local_save_flags(void)
{
unsigned long f;

- asm volatile(paravirt_alt("pushl %%ecx; pushl %%edx;"
+ asm volatile(paravirt_alt(PV_SAVE_REGS
PARAVIRT_CALL
- "popl %%edx; popl %%ecx")
+ PV_RESTORE_REGS)
: "=a"(f)
: paravirt_type(pv_irq_ops.save_fl),
paravirt_clobber(CLBR_EAX)
- : "memory", "cc");
+ : "memory", "cc" PV_VEXTRA_CLOBBERS);
return f;
}

static inline void raw_local_irq_restore(unsigned long f)
{
- asm volatile(paravirt_alt("pushl %%ecx; pushl %%edx;"
+ asm volatile(paravirt_alt(PV_SAVE_REGS
PARAVIRT_CALL
- "popl %%edx; popl %%ecx")
+ PV_RESTORE_REGS)
: "=a"(f)
- : "0"(f),
+ : PV_FLAGS_ARG (f),
paravirt_type(pv_irq_ops.restore_fl),
paravirt_clobber(CLBR_EAX)
- : "memory", "cc");
+ : "memory", "cc" PV_EXTRA_CLOBBERS);
}

static inline void raw_local_irq_disable(void)
{
- asm volatile(paravirt_alt("pushl %%ecx; pushl %%edx;"
+ asm volatile(paravirt_alt(PV_SAVE_REGS
PARAVIRT_CALL
- "popl %%edx; popl %%ecx")
+ PV_RESTORE_REGS)
:
: paravirt_type(pv_irq_ops.irq_disable),
paravirt_clobber(CLBR_EAX)
- : "memory", "eax", "cc");
+ : "memory", "eax", "cc" PV_VEXTRA_CLOBBERS);
}

static inline void raw_local_irq_enable(void)
{
- asm volatile(paravirt_alt("pushl %%ecx; pushl %%edx;"
+ asm volatile(paravirt_alt(PV_SAVE_REGS
PARAVIRT_CALL
- "popl %%edx; popl %%ecx")
+ PV_RESTORE_REGS)
:
: paravirt_type(pv_irq_ops.irq_enable),
paravirt_clobber(CLBR_EAX)
- : "memory", "eax", "cc");
+ : "memory", "eax", "cc" PV_VEXTRA_CLOBBERS);
}

static inline unsigned long __raw_local_irq_save(void)
@@ -1071,27 +1333,41 @@ static inline unsigned long __raw_local_irq_save(void)
return f;
}

+#ifdef CONFIG_X86_32
+#define SAVE_REGS "pushl %%ecx; pushl %%edx;"
+#define RESTORE_REGS "popl %%edx; popl %%ecx"
+#define CLI_STI_CLOBBERS , "%eax"
+#else /* !X86_32 */
+#define SAVE_REGS "pushq %%rcx; pushq %%rdx;"
+#define RESTORE_REGS "popq %%rdx; popq %%rcx"
+#define CLI_STI_CLOBBERS , "%rax", "%rdi", "%rsi", "%r8", "%r9", "%r10",\
+ "%r11", "%r12", "%r13", "%r14", "%r15"
+#endif /* X86_32 */
+
#define CLI_STRING \
- _paravirt_alt("pushl %%ecx; pushl %%edx;" \
+ _paravirt_alt(SAVE_REGS \
"call *%[paravirt_cli_opptr];" \
- "popl %%edx; popl %%ecx", \
+ RESTORE_REGS, \
"%c[paravirt_cli_type]", "%c[paravirt_clobber]")

+
#define STI_STRING \
- _paravirt_alt("pushl %%ecx; pushl %%edx;" \
+ _paravirt_alt(SAVE_REGS \
"call *%[paravirt_sti_opptr];" \
- "popl %%edx; popl %%ecx", \
+ RESTORE_REGS, \
"%c[paravirt_sti_type]", "%c[paravirt_clobber]")

-#define CLI_STI_CLOBBERS , "%eax"
-#define CLI_STI_INPUT_ARGS \
- , \
- [paravirt_cli_type] "i" (PARAVIRT_PATCH(pv_irq_ops.irq_disable)), \
+
+#define CLI_STI_INPUT_ARGS \
+ , \
+ [paravirt_cli_type] "i" (PARAVIRT_PATCH(pv_irq_ops.irq_disable)),\
[paravirt_cli_opptr] "m" (pv_irq_ops.irq_disable), \
- [paravirt_sti_type] "i" (PARAVIRT_PATCH(pv_irq_ops.irq_enable)), \
+ [paravirt_sti_type] "i" (PARAVIRT_PATCH(pv_irq_ops.irq_enable)),\
[paravirt_sti_opptr] "m" (pv_irq_ops.irq_enable), \
paravirt_clobber(CLBR_EAX)

+
+
/* Make sure as little as possible of this mess escapes. */
#undef PARAVIRT_CALL
#undef __PVOP_CALL
@@ -1106,48 +1382,80 @@ static inline unsigned long __raw_local_irq_save(void)
#undef PVOP_CALL3
#undef PVOP_VCALL4
#undef PVOP_CALL4
+#undef PV_SAVE_REGS
+#undef PV_RESTORE_REGS

#else /* __ASSEMBLY__ */

-#define PARA_PATCH(struct, off) ((PARAVIRT_PATCH_##struct + (off)) / 4)
-
-#define PARA_SITE(ptype, clobbers, ops) \
+#define _PARA_SITE(ptype, clobbers, ops, word) \
771:; \
ops; \
772:; \
.pushsection .parainstructions,"a"; \
- .long 771b; \
+ .align 8; \
+ word 771b; \
.byte ptype; \
.byte 772b-771b; \
.short clobbers; \
.popsection

+#ifdef CONFIG_X86_64
+#define PV_SAVE_REGS pushq %rax; pushq %rdi; pushq %rcx; pushq %rdx
+#define PV_RESTORE_REGS popq %rdx; popq %rcx; popq %rdi; popq %rax
+#define PARA_PATCH(struct, off) ((PARAVIRT_PATCH_##struct + (off)) / 8)
+#define PARA_SITE(ptype, clobbers, ops) _PARA_SITE(ptype, clobbers, ops, .quad)
+#else
+#define PV_SAVE_REGS pushl %eax; pushl %edi; pushl %ecx; pushl %edx
+#define PV_RESTORE_REGS popl %edx; popl %ecx; popl %edi; popl %eax
+#define PARA_PATCH(struct, off) ((PARAVIRT_PATCH_##struct + (off)) / 4)
+#define PARA_SITE(ptype, clobbers, ops) _PARA_SITE(ptype, clobbers, ops, .long)
+#endif
+
#define INTERRUPT_RETURN \
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE, \
jmp *%cs:pv_cpu_ops+PV_CPU_iret)

#define DISABLE_INTERRUPTS(clobbers) \
PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \
- pushl %eax; pushl %ecx; pushl %edx; \
+ PV_SAVE_REGS; \
call *%cs:pv_irq_ops+PV_IRQ_irq_disable; \
- popl %edx; popl %ecx; popl %eax) \
+ PV_RESTORE_REGS;)

#define ENABLE_INTERRUPTS(clobbers) \
PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_enable), clobbers, \
- pushl %eax; pushl %ecx; pushl %edx; \
+ PV_SAVE_REGS; \
call *%cs:pv_irq_ops+PV_IRQ_irq_enable; \
- popl %edx; popl %ecx; popl %eax)
+ PV_RESTORE_REGS;)

#define ENABLE_INTERRUPTS_SYSCALL_RET \
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_syscall_ret),\
CLBR_NONE, \
jmp *%cs:pv_cpu_ops+PV_CPU_irq_enable_syscall_ret)

+#ifdef CONFIG_X86_32
#define GET_CR0_INTO_EAX \
push %ecx; push %edx; \
call *pv_cpu_ops+PV_CPU_read_cr0; \
pop %edx; pop %ecx

+#else
+/* Those are x86_64 specific */
+#define SWAPGS \
+ PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE, \
+ pushq %rax; pushq %rdi; pushq %rcx; pushq %rdx; \
+ call *pv_cpu_ops+PV_CPU_swapgs; \
+ popq %rdx; popq %rcx; popq %rdi; popq %rax; \
+ )
+
+#define GET_CR2_INTO_RCX \
+ call *pv_mmu_ops+PV_MMU_read_cr2; \
+ movq %rax, %rcx; \
+ xorq %rax, %rax;
+
+#endif
+
+#undef WSIZE
+
#endif /* __ASSEMBLY__ */
#endif /* CONFIG_PARAVIRT */
#endif /* __ASM_PARAVIRT_H */
--
1.4.4.2

2007-11-09 21:36:46

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 24/24] make vsmp a paravirt client

This patch makes vsmp a paravirt client. It now uses the whole
infrastructure provided by pvops. When we detect we're running
a vsmp box, we change the irq-related paravirt operations (and so,
it have to happen quite early), and the patching function

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/Kconfig.x86_64 | 3 +-
arch/x86/kernel/Makefile_64 | 3 +-
arch/x86/kernel/setup_64.c | 2 +
arch/x86/kernel/vsmp_64.c | 82 +++++++++++++++++++++++++++++++++++++-----
include/asm-x86/setup.h | 6 ++-
5 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/arch/x86/Kconfig.x86_64 b/arch/x86/Kconfig.x86_64
index 568ba7a..fe4ef61 100644
--- a/arch/x86/Kconfig.x86_64
+++ b/arch/x86/Kconfig.x86_64
@@ -148,15 +148,14 @@ config X86_PC
bool "PC-compatible"
help
Choose this option if your computer is a standard PC or compatible.
-
config X86_VSMP
bool "Support for ScaleMP vSMP"
depends on PCI
+ select PARAVIRT
help
Support for ScaleMP vSMP systems. Say 'Y' here if this kernel is
supposed to run on these EM64T-based machines. Only choose this option
if you have one of these machines.
-
endchoice

choice
diff --git a/arch/x86/kernel/Makefile_64 b/arch/x86/kernel/Makefile_64
index 0714528..34e5c7d 100644
--- a/arch/x86/kernel/Makefile_64
+++ b/arch/x86/kernel/Makefile_64
@@ -8,7 +8,7 @@ obj-y := process_64.o signal_64.o entry_64.o traps_64.o irq_64.o \
x8664_ksyms_64.o i387_64.o syscall_64.o vsyscall_64.o \
setup64.o bootflag.o e820_64.o reboot_64.o quirks.o i8237.o \
pci-dma_64.o pci-nommu_64.o alternative.o hpet.o tsc_64.o bugs_64.o \
- i8253.o rtc.o
+ i8253.o rtc.o vsmp_64.o

obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-y += cpu/
@@ -29,7 +29,6 @@ obj-$(CONFIG_CALGARY_IOMMU) += pci-calgary_64.o tce_64.o
obj-$(CONFIG_SWIOTLB) += pci-swiotlb_64.o
obj-$(CONFIG_KPROBES) += kprobes_64.o
obj-$(CONFIG_X86_PM_TIMER) += pmtimer_64.o
-obj-$(CONFIG_X86_VSMP) += vsmp_64.o
obj-$(CONFIG_K8_NB) += k8.o
obj-$(CONFIG_AUDIT) += audit_64.o

diff --git a/arch/x86/kernel/setup_64.c b/arch/x86/kernel/setup_64.c
index 1c9f237..8cc5915 100644
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -338,6 +338,8 @@ void __init setup_arch(char **cmdline_p)

init_memory_mapping(0, (end_pfn_map << PAGE_SHIFT));

+ vsmp_init();
+
dmi_scan_machine();

#ifdef CONFIG_SMP
diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index d971210..24b0541 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -8,31 +8,95 @@
*
* Ravikiran Thirumalai <[email protected]>,
* Shai Fultheim <[email protected]>
+ * Paravirt ops integration: Glauber de Oliveira Costa <[email protected]>
*/
-
#include <linux/init.h>
#include <linux/pci_ids.h>
#include <linux/pci_regs.h>
#include <asm/pci-direct.h>
#include <asm/io.h>
+#include <asm/paravirt.h>
+
+/*
+ * Interrupt control for the VSMP architecture:
+ */
+
+static inline unsigned long vsmp_save_fl(void)
+{
+ unsigned long flags = native_save_fl();
+
+ if (!(flags & X86_EFLAGS_AC))
+ return X86_EFLAGS_IF;
+ return 0;
+}
+
+static inline void vsmp_restore_fl(unsigned long flags)
+{
+ if (flags & X86_EFLAGS_IF)
+ flags &= ~X86_EFLAGS_AC;
+ if (!(flags & X86_EFLAGS_IF))
+ flags &= X86_EFLAGS_AC;
+ native_restore_fl(flags);
+}
+
+static inline void vsmp_irq_disable(void)
+{
+ unsigned long flags = native_save_fl();

-static int __init vsmp_init(void)
+ vsmp_restore_fl(flags & ~X86_EFLAGS_IF);
+}
+
+static inline void vsmp_irq_enable(void)
+{
+ unsigned long flags = native_save_fl();
+
+ vsmp_restore_fl(flags | X86_EFLAGS_IF);
+}
+
+#ifdef CONFIG_PARAVIRT
+static unsigned __init vsmp_patch(u8 type, u16 clobbers, void *ibuf,
+ unsigned long addr, unsigned len)
+{
+ switch (type) {
+ case PARAVIRT_PATCH(pv_irq_ops.irq_enable):
+ case PARAVIRT_PATCH(pv_irq_ops.irq_disable):
+ case PARAVIRT_PATCH(pv_irq_ops.save_fl):
+ case PARAVIRT_PATCH(pv_irq_ops.restore_fl):
+ return paravirt_patch_default(type, clobbers, ibuf, addr, len);
+ default:
+ return native_patch(type, clobbers, ibuf, addr, len);
+ }
+
+}
+#endif
+
+void __init vsmp_init(void)
{
void *address;
- unsigned int cap, ctl;
+ unsigned int cap, ctl, cfg;

if (!early_pci_allowed())
- return 0;
+ return;

/* Check if we are running on a ScaleMP vSMP box */
if ((read_pci_config_16(0, 0x1f, 0, PCI_VENDOR_ID) !=
PCI_VENDOR_ID_SCALEMP) ||
(read_pci_config_16(0, 0x1f, 0, PCI_DEVICE_ID) !=
PCI_DEVICE_ID_SCALEMP_VSMP_CTL))
- return 0;
+ return;
+
+#ifdef CONFIG_PARAVIRT
+ /* If we are, use the distinguished irq functions */
+ pv_irq_ops.irq_disable = vsmp_irq_disable;
+ pv_irq_ops.irq_enable = vsmp_irq_enable;
+ pv_irq_ops.save_fl = vsmp_save_fl;
+ pv_irq_ops.restore_fl = vsmp_restore_fl;
+ pv_init_ops.patch = vsmp_patch;
+#endif

/* set vSMP magic bits to indicate vSMP capable kernel */
- address = ioremap(read_pci_config(0, 0x1f, 0, PCI_BASE_ADDRESS_0), 8);
+ cfg = read_pci_config(0, 0x1f, 0, PCI_BASE_ADDRESS_0);
+ address = early_ioremap(cfg, 8);
cap = readl(address);
ctl = readl(address + 4);
printk(KERN_INFO "vSMP CTL: capabilities:0x%08x control:0x%08x\n",
@@ -45,8 +109,6 @@ static int __init vsmp_init(void)
printk(KERN_INFO "vSMP CTL: control set to:0x%08x\n", ctl);
}

- iounmap(address);
- return 0;
+ early_iounmap(address, 8);
+ return;
}
-
-core_initcall(vsmp_init);
diff --git a/include/asm-x86/setup.h b/include/asm-x86/setup.h
index 071e054..8f532fa 100644
--- a/include/asm-x86/setup.h
+++ b/include/asm-x86/setup.h
@@ -4,6 +4,10 @@
#define COMMAND_LINE_SIZE 2048

#ifndef __ASSEMBLY__
+
+/* For EM64T-based VSMP machines */
+void vsmp_init(void);
+
char *machine_specific_memory_setup(void);
#ifndef CONFIG_PARAVIRT
#define paravirt_post_allocator_init() do {} while (0)
@@ -58,8 +62,6 @@ void __init add_memory_region(unsigned long long start,

extern unsigned long init_pg_tables_end;

-
-
#endif /* __i386__ */
#endif /* _SETUP */
#endif /* __ASSEMBLY__ */
--
1.4.4.2

2007-11-09 21:38:00

by Glauber Costa

[permalink] [raw]
Subject: [PATCH 6/24] Add debugreg/load_rsp native hooks

This patch adds native hooks for debugreg handling functions,
and for the native load_rsp0 function. The later also have its
call sites patched. There's some room for consolidation in the
processor*.h headers, and it is done, for paravirt related functions

Signed-off-by: Glauber de Oliveira Costa <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/kernel/process_64.c | 2 +-
arch/x86/kernel/smpboot_64.c | 2 +-
include/asm-x86/msr.h | 67 ------------------
include/asm-x86/processor.h | 146 ++++++++++++++++++++++++++++++++++++++++
include/asm-x86/processor_32.h | 138 +-------------------------------------
include/asm-x86/processor_64.h | 32 ++++-----
6 files changed, 165 insertions(+), 222 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6472f37..9647a10 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -589,7 +589,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
/*
* Reload esp0, LDT and the page table pointer:
*/
- tss->rsp0 = next->rsp0;
+ load_esp0(tss, next);

/*
* Switch DS and ES.
diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
index 17e54fa..8fb0f90 100644
--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -613,7 +613,7 @@ do_rest:
start_rip = setup_trampoline();

init_rsp = c_idle.idle->thread.rsp;
- per_cpu(init_tss,cpu).rsp0 = init_rsp;
+ load_esp0(&per_cpu(init_tss, cpu), &c_idle.idle->thread);
initial_code = start_secondary;
clear_tsk_thread_flag(c_idle.idle, TIF_FORK);

diff --git a/include/asm-x86/msr.h b/include/asm-x86/msr.h
index ba4b314..48f73c7 100644
--- a/include/asm-x86/msr.h
+++ b/include/asm-x86/msr.h
@@ -253,73 +253,6 @@ static inline int wrmsr_safe_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
: "=a" (low), "=d" (high) \
: "c" (counter))

-static inline void cpuid(int op, unsigned int *eax, unsigned int *ebx,
- unsigned int *ecx, unsigned int *edx)
-{
- __asm__("cpuid"
- : "=a" (*eax),
- "=b" (*ebx),
- "=c" (*ecx),
- "=d" (*edx)
- : "0" (op));
-}
-
-/* Some CPUID calls want 'count' to be placed in ecx */
-static inline void cpuid_count(int op, int count, int *eax, int *ebx, int *ecx,
- int *edx)
-{
- __asm__("cpuid"
- : "=a" (*eax),
- "=b" (*ebx),
- "=c" (*ecx),
- "=d" (*edx)
- : "0" (op), "c" (count));
-}
-
-/*
- * CPUID functions returning a single datum
- */
-static inline unsigned int cpuid_eax(unsigned int op)
-{
- unsigned int eax;
-
- __asm__("cpuid"
- : "=a" (eax)
- : "0" (op)
- : "bx", "cx", "dx");
- return eax;
-}
-static inline unsigned int cpuid_ebx(unsigned int op)
-{
- unsigned int eax, ebx;
-
- __asm__("cpuid"
- : "=a" (eax), "=b" (ebx)
- : "0" (op)
- : "cx", "dx" );
- return ebx;
-}
-static inline unsigned int cpuid_ecx(unsigned int op)
-{
- unsigned int eax, ecx;
-
- __asm__("cpuid"
- : "=a" (eax), "=c" (ecx)
- : "0" (op)
- : "bx", "dx" );
- return ecx;
-}
-static inline unsigned int cpuid_edx(unsigned int op)
-{
- unsigned int eax, edx;
-
- __asm__("cpuid"
- : "=a" (eax), "=d" (edx)
- : "0" (op)
- : "bx", "cx");
- return edx;
-}
-
#ifdef CONFIG_SMP
void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
diff --git a/include/asm-x86/processor.h b/include/asm-x86/processor.h
index 46e1c04..a576a72 100644
--- a/include/asm-x86/processor.h
+++ b/include/asm-x86/processor.h
@@ -1,5 +1,151 @@
+#ifndef _X86_PROCESSOR_H_
+#define _X86_PROCESSOR_H_
+
+#include <linux/kernel.h>
+#include <asm/bug.h>
+
+static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
+ unsigned int *ecx, unsigned int *edx)
+{
+ /* ecx is often an input as well as an output. */
+ __asm__("cpuid"
+ : "=a" (*eax),
+ "=b" (*ebx),
+ "=c" (*ecx),
+ "=d" (*edx)
+ : "0" (*eax), "2" (*ecx));
+}
+
+static inline unsigned long native_get_debugreg(int regno)
+{
+ unsigned long val = 0; /* Damn you, gcc! */
+
+ switch (regno) {
+ case 0:
+ asm volatile ("mov %%db0, %0" :"=r" (val)); break;
+ case 1:
+ asm volatile ("mov %%db1, %0" :"=r" (val)); break;
+ case 2:
+ asm volatile ("mov %%db2, %0" :"=r" (val)); break;
+ case 3:
+ asm volatile ("mov %%db3, %0" :"=r" (val)); break;
+ case 6:
+ asm volatile ("mov %%db6, %0" :"=r" (val)); break;
+ case 7:
+ asm volatile ("mov %%db7, %0" :"=r" (val)); break;
+ default:
+ WARN_ON(1);
+ }
+ return val;
+}
+
+static inline void native_set_debugreg(int regno, unsigned long value)
+{
+ switch (regno) {
+ case 0:
+ asm("mov %0,%%db0" : :"r" (value) : "memory");
+ break;
+ case 1:
+ asm("mov %0,%%db1" : :"r" (value) : "memory");
+ break;
+ case 2:
+ asm("mov %0,%%db2" : :"r" (value) : "memory");
+ break;
+ case 3:
+ asm("mov %0,%%db3" : :"r" (value) : "memory");
+ break;
+ case 6:
+ asm("mov %0,%%db6" : :"r" (value) : "memory");
+ break;
+ case 7:
+ asm("mov %0,%%db7" : :"r" (value) : "memory");
+ break;
+ default:
+ WARN_ON(1);
+ }
+}
+
#ifdef CONFIG_X86_32
# include "processor_32.h"
#else
# include "processor_64.h"
#endif
+
+#ifdef CONFIG_PARAVIRT
+#include <asm/paravirt.h>
+#else
+#define paravirt_enabled() 0
+#define __cpuid native_cpuid
+#define SWAPGS swapgs
+
+static inline void load_esp0(struct tss_struct *tss,
+ struct thread_struct *thread)
+{
+ native_load_esp0(tss, thread);
+}
+
+/*
+ * These special macros can be used to get or set a debugging register
+ */
+#define get_debugreg(var, register) \
+ (var) = native_get_debugreg(register)
+#define set_debugreg(value, register) \
+ native_set_debugreg(register, value)
+
+#define set_iopl_mask native_set_iopl_mask
+#endif /* CONFIG_PARAVIRT */
+
+/*
+ * Generic CPUID function
+ * clear %ecx since some cpus (Cyrix MII) do not set or clear %ecx
+ * resulting in stale register contents being returned.
+ */
+static inline void cpuid(unsigned int op, unsigned int *eax, unsigned int *ebx,
+ unsigned int *ecx, unsigned int *edx)
+{
+ *eax = op;
+ *ecx = 0;
+ __cpuid(eax, ebx, ecx, edx);
+}
+
+/* Some CPUID calls want 'count' to be placed in ecx */
+static inline void cpuid_count(int op, int count, unsigned *eax,
+ unsigned *ebx, unsigned *ecx, unsigned *edx)
+{
+ *eax = op;
+ *ecx = count;
+ __cpuid(eax, ebx, ecx, edx);
+}
+
+/*
+ * CPUID functions returning a single datum
+ */
+static inline unsigned int cpuid_eax(unsigned int op)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid(op, &eax, &ebx, &ecx, &edx);
+ return eax;
+}
+static inline unsigned int cpuid_ebx(unsigned int op)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid(op, &eax, &ebx, &ecx, &edx);
+ return ebx;
+}
+static inline unsigned int cpuid_ecx(unsigned int op)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid(op, &eax, &ebx, &ecx, &edx);
+ return ecx;
+}
+static inline unsigned int cpuid_edx(unsigned int op)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid(op, &eax, &ebx, &ecx, &edx);
+ return edx;
+}
+#endif
diff --git a/include/asm-x86/processor_32.h b/include/asm-x86/processor_32.h
index 3b51a18..233fee3 100644
--- a/include/asm-x86/processor_32.h
+++ b/include/asm-x86/processor_32.h
@@ -132,17 +132,6 @@ extern void detect_ht(struct cpuinfo_x86 *c);
static inline void detect_ht(struct cpuinfo_x86 *c) {}
#endif

-static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
- unsigned int *ecx, unsigned int *edx)
-{
- /* ecx is often an input as well as an output. */
- __asm__("cpuid"
- : "=a" (*eax),
- "=b" (*ebx),
- "=c" (*ecx),
- "=d" (*edx)
- : "0" (*eax), "2" (*ecx));
-}

#define load_cr3(pgdir) write_cr3(__pa(pgdir))

@@ -505,56 +494,6 @@ static inline void native_load_esp0(struct tss_struct *tss, struct thread_struct
}
}

-
-static inline unsigned long native_get_debugreg(int regno)
-{
- unsigned long val = 0; /* Damn you, gcc! */
-
- switch (regno) {
- case 0:
- asm("movl %%db0, %0" :"=r" (val)); break;
- case 1:
- asm("movl %%db1, %0" :"=r" (val)); break;
- case 2:
- asm("movl %%db2, %0" :"=r" (val)); break;
- case 3:
- asm("movl %%db3, %0" :"=r" (val)); break;
- case 6:
- asm("movl %%db6, %0" :"=r" (val)); break;
- case 7:
- asm("movl %%db7, %0" :"=r" (val)); break;
- default:
- BUG();
- }
- return val;
-}
-
-static inline void native_set_debugreg(int regno, unsigned long value)
-{
- switch (regno) {
- case 0:
- asm("movl %0,%%db0" : /* no output */ :"r" (value));
- break;
- case 1:
- asm("movl %0,%%db1" : /* no output */ :"r" (value));
- break;
- case 2:
- asm("movl %0,%%db2" : /* no output */ :"r" (value));
- break;
- case 3:
- asm("movl %0,%%db3" : /* no output */ :"r" (value));
- break;
- case 6:
- asm("movl %0,%%db6" : /* no output */ :"r" (value));
- break;
- case 7:
- asm("movl %0,%%db7" : /* no output */ :"r" (value));
- break;
- default:
- BUG();
- }
-}
-
/*
* Set IOPL bits in EFLAGS from given mask
*/
@@ -571,82 +510,9 @@ static inline void native_set_iopl_mask(unsigned mask)
: "i" (~X86_EFLAGS_IOPL), "r" (mask));
}

-#ifdef CONFIG_PARAVIRT
-#include <asm/paravirt.h>
-#else
-#define paravirt_enabled() 0
-#define __cpuid native_cpuid
-
-static inline void load_esp0(struct tss_struct *tss, struct thread_struct *thread)
-{
- native_load_esp0(tss, thread);
-}
-
-/*
- * These special macros can be used to get or set a debugging register
- */
-#define get_debugreg(var, register) \
- (var) = native_get_debugreg(register)
-#define set_debugreg(value, register) \
- native_set_debugreg(register, value)
-
-#define set_iopl_mask native_set_iopl_mask
-#endif /* CONFIG_PARAVIRT */
-
-/*
- * Generic CPUID function
- * clear %ecx since some cpus (Cyrix MII) do not set or clear %ecx
- * resulting in stale register contents being returned.
- */
-static inline void cpuid(unsigned int op,
- unsigned int *eax, unsigned int *ebx,
- unsigned int *ecx, unsigned int *edx)
-{
- *eax = op;
- *ecx = 0;
- __cpuid(eax, ebx, ecx, edx);
-}
-
-/* Some CPUID calls want 'count' to be placed in ecx */
-static inline void cpuid_count(unsigned int op, int count,
- unsigned int *eax, unsigned int *ebx,
- unsigned int *ecx, unsigned int *edx)
-{
- *eax = op;
- *ecx = count;
- __cpuid(eax, ebx, ecx, edx);
-}
-
-/*
- * CPUID functions returning a single datum
- */
-static inline unsigned int cpuid_eax(unsigned int op)
-{
- unsigned int eax, ebx, ecx, edx;
-
- cpuid(op, &eax, &ebx, &ecx, &edx);
- return eax;
-}
-static inline unsigned int cpuid_ebx(unsigned int op)
-{
- unsigned int eax, ebx, ecx, edx;
-
- cpuid(op, &eax, &ebx, &ecx, &edx);
- return ebx;
-}
-static inline unsigned int cpuid_ecx(unsigned int op)
-{
- unsigned int eax, ebx, ecx, edx;
-
- cpuid(op, &eax, &ebx, &ecx, &edx);
- return ecx;
-}
-static inline unsigned int cpuid_edx(unsigned int op)
+/* We don't really have one */
+static inline void native_swapgs(void)
{
- unsigned int eax, ebx, ecx, edx;
-
- cpuid(op, &eax, &ebx, &ecx, &edx);
- return edx;
}

/* generic versions from gas */
diff --git a/include/asm-x86/processor_64.h b/include/asm-x86/processor_64.h
index 6c0d96a..befecc6 100644
--- a/include/asm-x86/processor_64.h
+++ b/include/asm-x86/processor_64.h
@@ -114,21 +114,13 @@ extern unsigned long mmu_cr4_features;
static inline void set_in_cr4 (unsigned long mask)
{
mmu_cr4_features |= mask;
- __asm__("movq %%cr4,%%rax\n\t"
- "orq %0,%%rax\n\t"
- "movq %%rax,%%cr4\n"
- : : "irg" (mask)
- :"ax");
+ write_cr4(read_cr4() | mask);
}

static inline void clear_in_cr4 (unsigned long mask)
{
mmu_cr4_features &= ~mask;
- __asm__("movq %%cr4,%%rax\n\t"
- "andq %0,%%rax\n\t"
- "movq %%rax,%%cr4\n"
- : : "irg" (~mask)
- :"ax");
+ write_cr4(read_cr4() & ~mask);
}


@@ -249,6 +241,12 @@ struct thread_struct {
.rsp0 = (unsigned long)&init_stack + sizeof(init_stack) \
}

+static inline void native_load_esp0(struct tss_struct *tss,
+ struct thread_struct *thread)
+{
+ tss->rsp0 = thread->rsp0;
+}
+
#define INIT_MMAP \
{ &init_mm, 0, 0, NULL, PAGE_SHARED, VM_READ | VM_WRITE | VM_EXEC, 1, NULL, NULL }

@@ -264,13 +262,13 @@ struct thread_struct {
set_fs(USER_DS); \
} while(0)

-#define get_debugreg(var, register) \
- __asm__("movq %%db" #register ", %0" \
- :"=r" (var))
-#define set_debugreg(value, register) \
- __asm__("movq %0,%%db" #register \
- : /* no output */ \
- :"r" (value))
+extern void native_swapgs(void);
+
+/* We have it for simmetry with i386 code, but we don't really use it */
+static inline void native_set_iopl_mask(unsigned mask)
+{
+
+}

struct task_struct;
struct mm_struct;
--
1.4.4.2

2007-11-09 23:01:51

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 1/24] mm/sparse-vmemmap.c: make sure init_mm is included

Glauber de Oliveira Costa wrote:
> mm/sparse-vmemmap.c uses init_mm in some places. However, it is not
> present in any of the headers currently included in the file.
>
> init_mm is defined as extern in sched.h, so we add it to the headers list
>
> Up to now, this problem was masked by the fact that functions like
> set_pte_at() and pmd_populate_kernel() are usually macros that expand to
> simpler variants that does not use the first parameter at all.
>
> Signed-off-by: Glauber de Oliveira Costa <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
> ---
> mm/sparse-vmemmap.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
> index d3b718b..22620f6 100644
> --- a/mm/sparse-vmemmap.c
> +++ b/mm/sparse-vmemmap.c
> @@ -24,6 +24,7 @@
> #include <linux/module.h>
> #include <linux/spinlock.h>
> #include <linux/vmalloc.h>
> +#include <linux/sched.h>
>

This is already in git.

J

2007-11-10 00:30:27

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 5/24] smp x86 consolidation

Glauber de Oliveira Costa wrote:
> This patch consolidates part of the pieces of smp for both architectures.
> (i386 and x86_64). It makes part the calls go through smp_ops, and shares
> code for those functions in smpcommon.c
>
> There's more room for code sharing here, but it is left as an exercise to
> the reader ;-)
>

I'm getting link errors in 32-bit:

arch/x86/kernel/built-in.o: In function `native_smp_send_reschedule':
/home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/smpcommon.c:262: undefined reference to `genapic'
arch/x86/kernel/built-in.o: In function `native_smp_call_function_mask':
/home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/smpcommon.c:113: undefined reference to `genapic'


J

2007-11-10 01:26:39

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 1/24] mm/sparse-vmemmap.c: make sure init_mm is included

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jeremy Fitzhardinge escreveu:
> Glauber de Oliveira Costa wrote:
>> mm/sparse-vmemmap.c uses init_mm in some places. However, it is not
>> present in any of the headers currently included in the file.
>>
>> init_mm is defined as extern in sched.h, so we add it to the headers list
>>
>> Up to now, this problem was masked by the fact that functions like
>> set_pte_at() and pmd_populate_kernel() are usually macros that expand to
>> simpler variants that does not use the first parameter at all.
>>
>> Signed-off-by: Glauber de Oliveira Costa <[email protected]>
>> Signed-off-by: Andrew Morton <[email protected]>
>> Signed-off-by: Linus Torvalds <[email protected]>
>> ---
>> mm/sparse-vmemmap.c | 1 +
>> 1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
>> index d3b718b..22620f6 100644
>> --- a/mm/sparse-vmemmap.c
>> +++ b/mm/sparse-vmemmap.c
>> @@ -24,6 +24,7 @@
>> #include <linux/module.h>
>> #include <linux/spinlock.h>
>> #include <linux/vmalloc.h>
>> +#include <linux/sched.h>
>>
>
> This is already in git.
>
> J
As I told in the 0th message, yes, I'm aware.
Just it does not seem to be in tglx's , so if people are willing to try
this out, they'll needed. Thus it's included in the series.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Remi - http://enigmail.mozdev.org

iD8DBQFHNQjijYI8LaFUWXMRAn+UAJ48V1EyWoXkWu1+J0Y0ze59H7ZG2QCcDdgW
4qVJgJcQDfJrvZk8TSo901s=
=RwdO
-----END PGP SIGNATURE-----

2007-11-10 01:29:45

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 5/24] smp x86 consolidation

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jeremy Fitzhardinge escreveu:
> Glauber de Oliveira Costa wrote:
>> This patch consolidates part of the pieces of smp for both architectures.
>> (i386 and x86_64). It makes part the calls go through smp_ops, and shares
>> code for those functions in smpcommon.c
>>
>> There's more room for code sharing here, but it is left as an exercise to
>> the reader ;-)
>>
>
> I'm getting link errors in 32-bit:
>
> arch/x86/kernel/built-in.o: In function `native_smp_send_reschedule':
> /home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/smpcommon.c:262: undefined reference to `genapic'
> arch/x86/kernel/built-in.o: In function `native_smp_call_function_mask':
> /home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/smpcommon.c:113: undefined reference to `genapic'
>
Ok, it compiled just fine here. I bet it's due to one of that i386 lots
of variants.
Which subarchitecture are you compiling for, jeremy ?
> J

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Remi - http://enigmail.mozdev.org

iD8DBQFHNQmKjYI8LaFUWXMRAlrLAKCHOb28oE/veBkbVeJZDCbjE8OADwCg81ye
nsBfBe2iLOFHF6dxT4mRauc=
=xjo4
-----END PGP SIGNATURE-----

2007-11-10 02:09:42

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 5/24] smp x86 consolidation

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc2
# Fri Nov 9 15:19:56 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
# CONFIG_TASK_XACCT is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_FAIR_USER_SCHED=y
# CONFIG_FAIR_CGROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set
CONFIG_BLK_DEV_BSG=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_PREEMPT_NOTIFIERS=y

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
CONFIG_PARAVIRT=y
CONFIG_PARAVIRT_GUEST=y
CONFIG_XEN=y
CONFIG_VMI=y
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_HPET_TIMER=y
CONFIG_NR_CPUS=4
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

#
# Firmware Drivers
#
CONFIG_EDD=m
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_NOHIGHMEM is not set
# CONFIG_HIGHMEM4G is not set
CONFIG_HIGHMEM64G=y
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_X86_PAE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
# CONFIG_SPARSEMEM_VMEMMAP_ENABLE is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_RESOURCES_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_NR_QUICK=1
CONFIG_VIRT_TO_BUS=y
CONFIG_HIGHPTE=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
# CONFIG_IRQBALANCE is not set
# CONFIG_SECCOMP is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x100000
# CONFIG_RELOCATABLE is not set
CONFIG_PHYSICAL_ALIGN=0x100000
CONFIG_HOTPLUG_CPU=y
# CONFIG_COMPAT_VDSO is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_LEGACY is not set
CONFIG_PM_DEBUG=y
# CONFIG_PM_VERBOSE is not set
# CONFIG_PM_TRACE is not set
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_SLEEP=y
CONFIG_SUSPEND_SMP_POSSIBLE=y
CONFIG_SUSPEND=y
CONFIG_HIBERNATION_SMP_POSSIBLE=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_PROCFS=y
# CONFIG_ACPI_PROC_EVENT is not set
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_BAY=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set
CONFIG_ACPI_BLACKLIST_YEAR=1999
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_SBS=m
# CONFIG_APM is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=y
# CONFIG_X86_POWERNOW_K6 is not set
# CONFIG_X86_POWERNOW_K7 is not set
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_GX_SUSPMOD is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_SPEEDSTEP_ICH is not set
# CONFIG_X86_SPEEDSTEP_SMI is not set
# CONFIG_X86_P4_CLOCKMOD is not set
# CONFIG_X86_CPUFREQ_NFORCE2 is not set
# CONFIG_X86_LONGRUN is not set
# CONFIG_X86_LONGHAUL is not set
# CONFIG_X86_E_POWERSAVER is not set

#
# shared options
#
# CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set
# CONFIG_X86_SPEEDSTEP_LIB is not set
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
# CONFIG_PCI_LEGACY is not set
# CONFIG_PCI_DEBUG is not set
CONFIG_HT_IRQ=y
CONFIG_ISA_DMA_API=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
CONFIG_PCCARD=y
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
# CONFIG_PCMCIA_IOCTL is not set
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=y
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
# CONFIG_PD6729 is not set
# CONFIG_I82092 is not set
CONFIG_PCCARD_NONSTATIC=y
CONFIG_HOTPLUG_PCI=y
# CONFIG_HOTPLUG_PCI_FAKE is not set
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=y
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_AOUT is not set
CONFIG_BINFMT_MISC=y

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
CONFIG_NET_KEY=m
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
# CONFIG_INET_XFRM_MODE_BEET is not set
CONFIG_INET_LRO=y
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
# CONFIG_IPV6 is not set
# CONFIG_INET6_XFRM_TUNNEL is not set
# CONFIG_INET6_TUNNEL is not set
# CONFIG_NETLABEL is not set
# CONFIG_NETWORK_SECMARK is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
# CONFIG_NETFILTER_NETLINK is not set
CONFIG_NF_CONNTRACK_ENABLED=m
CONFIG_NF_CONNTRACK=m
# CONFIG_NF_CT_ACCT is not set
# CONFIG_NF_CONNTRACK_MARK is not set
# CONFIG_NF_CONNTRACK_EVENTS is not set
# CONFIG_NF_CT_PROTO_SCTP is not set
# CONFIG_NF_CT_PROTO_UDPLITE is not set
# CONFIG_NF_CONNTRACK_AMANDA is not set
CONFIG_NF_CONNTRACK_FTP=m
# CONFIG_NF_CONNTRACK_H323 is not set
# CONFIG_NF_CONNTRACK_IRC is not set
# CONFIG_NF_CONNTRACK_NETBIOS_NS is not set
# CONFIG_NF_CONNTRACK_PPTP is not set
# CONFIG_NF_CONNTRACK_SANE is not set
# CONFIG_NF_CONNTRACK_SIP is not set
# CONFIG_NF_CONNTRACK_TFTP is not set
CONFIG_NETFILTER_XTABLES=m
# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
# CONFIG_NETFILTER_XT_TARGET_MARK is not set
# CONFIG_NETFILTER_XT_TARGET_NFQUEUE is not set
# CONFIG_NETFILTER_XT_TARGET_NFLOG is not set
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set
# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set
# CONFIG_NETFILTER_XT_MATCH_CONNBYTES is not set
# CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_CONNMARK is not set
# CONFIG_NETFILTER_XT_MATCH_CONNTRACK is not set
# CONFIG_NETFILTER_XT_MATCH_DCCP is not set
# CONFIG_NETFILTER_XT_MATCH_DSCP is not set
# CONFIG_NETFILTER_XT_MATCH_ESP is not set
# CONFIG_NETFILTER_XT_MATCH_HELPER is not set
# CONFIG_NETFILTER_XT_MATCH_LENGTH is not set
# CONFIG_NETFILTER_XT_MATCH_LIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_MAC is not set
# CONFIG_NETFILTER_XT_MATCH_MARK is not set
# CONFIG_NETFILTER_XT_MATCH_POLICY is not set
# CONFIG_NETFILTER_XT_MATCH_MULTIPORT is not set
# CONFIG_NETFILTER_XT_MATCH_PHYSDEV is not set
# CONFIG_NETFILTER_XT_MATCH_PKTTYPE is not set
# CONFIG_NETFILTER_XT_MATCH_QUOTA is not set
# CONFIG_NETFILTER_XT_MATCH_REALM is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
CONFIG_NETFILTER_XT_MATCH_STATE=m
# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set
# CONFIG_NETFILTER_XT_MATCH_STRING is not set
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
# CONFIG_NETFILTER_XT_MATCH_HASHLIMIT is not set

#
# IP: Netfilter Configuration
#
CONFIG_NF_CONNTRACK_IPV4=m
CONFIG_NF_CONNTRACK_PROC_COMPAT=y
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
# CONFIG_IP_NF_TARGET_REDIRECT is not set
# CONFIG_IP_NF_TARGET_NETMAP is not set
# CONFIG_IP_NF_TARGET_SAME is not set
# CONFIG_NF_NAT_SNMP_BASIC is not set
CONFIG_NF_NAT_FTP=m
# CONFIG_NF_NAT_IRC is not set
# CONFIG_NF_NAT_TFTP is not set
# CONFIG_NF_NAT_AMANDA is not set
# CONFIG_NF_NAT_PPTP is not set
# CONFIG_NF_NAT_H323 is not set
# CONFIG_NF_NAT_SIP is not set
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_RAW is not set
# CONFIG_IP_NF_ARPTABLES is not set

#
# Bridge: Netfilter Configuration
#
# CONFIG_BRIDGE_NF_EBTABLES is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_SCHED is not set
CONFIG_NET_SCH_FIFO=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
CONFIG_BT=m
CONFIG_BT_L2CAP=m
CONFIG_BT_SCO=m
CONFIG_BT_RFCOMM=m
CONFIG_BT_RFCOMM_TTY=y
CONFIG_BT_BNEP=m
CONFIG_BT_BNEP_MC_FILTER=y
CONFIG_BT_BNEP_PROTO_FILTER=y
CONFIG_BT_HIDP=m

#
# Bluetooth device drivers
#
CONFIG_BT_HCIUSB=m
CONFIG_BT_HCIUSB_SCO=y
# CONFIG_BT_HCIBTSDIO is not set
CONFIG_BT_HCIUART=m
CONFIG_BT_HCIUART_H4=y
CONFIG_BT_HCIUART_BCSP=y
# CONFIG_BT_HCIUART_LL is not set
CONFIG_BT_HCIBCM203X=m
CONFIG_BT_HCIBPA10X=m
CONFIG_BT_HCIBFUSB=m
CONFIG_BT_HCIDTL1=m
CONFIG_BT_HCIBT3C=m
CONFIG_BT_HCIBLUECARD=m
CONFIG_BT_HCIBTUART=m
CONFIG_BT_HCIVHCI=m
# CONFIG_AF_RXRPC is not set

#
# Wireless
#
CONFIG_CFG80211=m
CONFIG_NL80211=y
CONFIG_WIRELESS_EXT=y
CONFIG_MAC80211=m
# CONFIG_MAC80211_DEBUG is not set
CONFIG_IEEE80211=m
# CONFIG_IEEE80211_DEBUG is not set
CONFIG_IEEE80211_CRYPT_WEP=m
CONFIG_IEEE80211_CRYPT_CCMP=m
CONFIG_IEEE80211_CRYPT_TKIP=m
CONFIG_IEEE80211_SOFTMAC=m
# CONFIG_IEEE80211_SOFTMAC_DEBUG is not set
CONFIG_RFKILL=m
CONFIG_RFKILL_INPUT=m
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_CONNECTOR is not set
# CONFIG_MTD is not set
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
CONFIG_PARPORT_PC_FIFO=y
CONFIG_PARPORT_PC_SUPERIO=y
CONFIG_PARPORT_PC_PCMCIA=m
# CONFIG_PARPORT_GSC is not set
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PARPORT_NOT_PC=y
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_BLK_DEV_RAM_BLOCKSIZE=4096
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_MISC_DEVICES=y
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_MSI_LAPTOP is not set
# CONFIG_SONY_LAPTOP is not set
CONFIG_THINKPAD_ACPI=m
# CONFIG_THINKPAD_ACPI_DEBUG is not set
CONFIG_THINKPAD_ACPI_BAY=y
# CONFIG_IDE is not set

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_TGT is not set
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=m
# CONFIG_CHR_DEV_SCH is not set

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
# CONFIG_SCSI_SCAN_ASYNC is not set
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_LOWLEVEL_PCMCIA is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_ACPI=y
CONFIG_SATA_AHCI=y
# CONFIG_SATA_SVW is not set
CONFIG_ATA_PIIX=y
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIL24 is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ACPI is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CS5535 is not set
# CONFIG_PATA_CS5536 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
CONFIG_ATA_GENERIC=y
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PCMCIA is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
CONFIG_MD=y
# CONFIG_BLK_DEV_MD is not set
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_DEBUG is not set
# CONFIG_DM_CRYPT is not set
CONFIG_DM_SNAPSHOT=m
# CONFIG_DM_MIRROR is not set
# CONFIG_DM_ZERO is not set
# CONFIG_DM_MULTIPATH is not set
# CONFIG_DM_DELAY is not set
# CONFIG_DM_UEVENT is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
CONFIG_IEEE1394=m

#
# Subsystem Options
#
# CONFIG_IEEE1394_VERBOSEDEBUG is not set

#
# Controllers
#

#
# Texas Instruments PCILynx requires I2C
#
CONFIG_IEEE1394_OHCI1394=m

#
# Protocols
#
CONFIG_IEEE1394_VIDEO1394=m
CONFIG_IEEE1394_SBP2=m
# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
CONFIG_IEEE1394_ETH1394_ROM_ENTRY=y
CONFIG_IEEE1394_ETH1394=m
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
# CONFIG_NETDEVICES_MULTIQUEUE is not set
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=m
# CONFIG_VETH is not set
# CONFIG_NET_SB1000 is not set
# CONFIG_IP1000 is not set
# CONFIG_ARCNET is not set
# CONFIG_PHYLIB is not set
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
# CONFIG_IBM_NEW_EMAC_ZMII is not set
# CONFIG_IBM_NEW_EMAC_RGMII is not set
# CONFIG_IBM_NEW_EMAC_TAH is not set
# CONFIG_IBM_NEW_EMAC_EMAC4 is not set
CONFIG_NET_PCI=y
CONFIG_PCNET32=y
# CONFIG_PCNET32_NAPI is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
CONFIG_NE2K_PCI=y
# CONFIG_8139CP is not set
CONFIG_8139TOO=y
CONFIG_8139TOO_PIO=y
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_SC92031 is not set
# CONFIG_NET_POCKET is not set
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
CONFIG_E1000=m
# CONFIG_E1000_NAPI is not set
# CONFIG_E1000_DISABLE_PACKET_SPLIT is not set
CONFIG_E1000E=m
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_NETDEV_10000 is not set
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
CONFIG_WLAN_80211=y
# CONFIG_PCMCIA_RAYCS is not set
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
# CONFIG_LIBERTAS is not set
# CONFIG_AIRO is not set
# CONFIG_HERMES is not set
# CONFIG_ATMEL is not set
# CONFIG_AIRO_CS is not set
# CONFIG_PCMCIA_WL3501 is not set
# CONFIG_PRISM54 is not set
# CONFIG_USB_ZD1201 is not set
# CONFIG_RTL8187 is not set
# CONFIG_ADM8211 is not set
# CONFIG_P54_COMMON is not set
CONFIG_IWLWIFI=y
CONFIG_IWLWIFI_DEBUG=y
CONFIG_IWLWIFI_SENSITIVITY=y
CONFIG_IWLWIFI_SPECTRUM_MEASUREMENT=y
CONFIG_IWLWIFI_QOS=y
# CONFIG_IWL4965 is not set
CONFIG_IWL3945=m
# CONFIG_HOSTAP is not set
# CONFIG_BCM43XX is not set
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_ZD1211RW is not set
# CONFIG_RT2X00 is not set

#
# USB Network Adapters
#
CONFIG_USB_CATC=m
CONFIG_USB_KAWETH=m
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
CONFIG_USB_USBNET_MII=m
CONFIG_USB_USBNET=m
CONFIG_USB_NET_AX8817X=m
CONFIG_USB_NET_CDCETHER=m
# CONFIG_USB_NET_DM9601 is not set
CONFIG_USB_NET_GL620A=m
CONFIG_USB_NET_NET1080=m
CONFIG_USB_NET_PLUSB=m
# CONFIG_USB_NET_MCS7830 is not set
CONFIG_USB_NET_RNDIS_HOST=m
CONFIG_USB_NET_CDC_SUBSET=m
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
# CONFIG_USB_KC2190 is not set
CONFIG_USB_NET_ZAURUS=m
# CONFIG_NET_PCMCIA is not set
# CONFIG_WAN is not set
CONFIG_XEN_NETDEV_FRONTEND=y
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
# CONFIG_PPP_MPPE is not set
# CONFIG_PPPOE is not set
# CONFIG_PPPOL2TP is not set
# CONFIG_SLIP is not set
CONFIG_SLHC=m
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
CONFIG_NETCONSOLE=m
# CONFIG_NETCONSOLE_DYNAMIC is not set
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_POLLDEV=m

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_VSXXXAA is not set
CONFIG_INPUT_JOYSTICK=y
# CONFIG_JOYSTICK_ANALOG is not set
# CONFIG_JOYSTICK_A3D is not set
# CONFIG_JOYSTICK_ADI is not set
# CONFIG_JOYSTICK_COBRA is not set
# CONFIG_JOYSTICK_GF2K is not set
# CONFIG_JOYSTICK_GRIP is not set
# CONFIG_JOYSTICK_GRIP_MP is not set
# CONFIG_JOYSTICK_GUILLEMOT is not set
# CONFIG_JOYSTICK_INTERACT is not set
# CONFIG_JOYSTICK_SIDEWINDER is not set
# CONFIG_JOYSTICK_TMDC is not set
# CONFIG_JOYSTICK_IFORCE is not set
# CONFIG_JOYSTICK_WARRIOR is not set
# CONFIG_JOYSTICK_MAGELLAN is not set
# CONFIG_JOYSTICK_SPACEORB is not set
# CONFIG_JOYSTICK_SPACEBALL is not set
# CONFIG_JOYSTICK_STINGER is not set
# CONFIG_JOYSTICK_TWIDJOY is not set
# CONFIG_JOYSTICK_DB9 is not set
# CONFIG_JOYSTICK_GAMECON is not set
# CONFIG_JOYSTICK_TURBOGRAFX is not set
# CONFIG_JOYSTICK_JOYDUMP is not set
# CONFIG_JOYSTICK_XPAD is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=m
# CONFIG_INPUT_WISTRON_BTNS is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
CONFIG_INPUT_UINPUT=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
# CONFIG_ROCKETPORT is not set
# CONFIG_CYCLADES is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_MOXA_SMARTIO_NEW is not set
# CONFIG_ISI is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
# CONFIG_SYNCLINK_GT is not set
# CONFIG_N_HDLC is not set
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
# CONFIG_RIO is not set
# CONFIG_STALDRV is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_CS is not set
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_PRINTER=m
CONFIG_LP_CONSOLE=y
CONFIG_PPDEV=m
# CONFIG_TIPAR is not set
CONFIG_HVC_DRIVER=y
CONFIG_HVC_XEN=y
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_INTEL=m
# CONFIG_HW_RANDOM_AMD is not set
# CONFIG_HW_RANDOM_GEODE is not set
# CONFIG_HW_RANDOM_VIA is not set
CONFIG_NVRAM=y
CONFIG_RTC=m
CONFIG_GEN_RTC=m
CONFIG_GEN_RTC_X=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set

#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_CARDMAN_4000 is not set
# CONFIG_CARDMAN_4040 is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_NSC_GPIO is not set
# CONFIG_CS5535_GPIO is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
CONFIG_HPET_RTC_IRQ=y
CONFIG_HPET_MMAP=y
CONFIG_HANGCHECK_TIMER=m
CONFIG_TCG_TPM=m
# CONFIG_TCG_TIS is not set
# CONFIG_TCG_NSC is not set
CONFIG_TCG_ATMEL=m
# CONFIG_TCG_INFINEON is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_I2C is not set

#
# SPI support
#
# CONFIG_SPI is not set
# CONFIG_SPI_MASTER is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
CONFIG_HWMON=m
# CONFIG_HWMON_VID is not set
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
CONFIG_SENSORS_CORETEMP=m
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
CONFIG_SENSORS_HDAPS=m
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_HWMON_DEBUG_CHIP is not set
# CONFIG_WATCHDOG is not set

#
# Sonics Silicon Backplane
#
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_SM501 is not set

#
# Multimedia devices
#
CONFIG_VIDEO_DEV=m
CONFIG_VIDEO_V4L1=y
CONFIG_VIDEO_V4L1_COMPAT=y
CONFIG_VIDEO_V4L2=y
# CONFIG_VIDEO_CAPTURE_DRIVERS is not set
# CONFIG_RADIO_ADAPTERS is not set
# CONFIG_DVB_CORE is not set
# CONFIG_DAB is not set

#
# Graphics support
#
CONFIG_AGP=y
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=y
# CONFIG_AGP_NVIDIA is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=m
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
# CONFIG_DRM_RADEON is not set
CONFIG_DRM_I810=m
CONFIG_DRM_I830=m
CONFIG_DRM_I915=m
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
CONFIG_VGASTATE=m
CONFIG_VIDEO_OUTPUT_CONTROL=m
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
# CONFIG_FB_DDC is not set
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_SYS_FOPS is not set
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
CONFIG_FB_VESA=y
# CONFIG_FB_HECUBA is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
CONFIG_FB_I810=m
CONFIG_FB_I810_GTF=y
# CONFIG_FB_I810_I2C is not set
# CONFIG_FB_LE80578 is not set
CONFIG_FB_INTEL=m
CONFIG_FB_INTEL_DEBUG=y
# CONFIG_FB_INTEL_I2C is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_CYBLA is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_GEODE is not set
# CONFIG_FB_VIRTUAL is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=m
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_CORGI is not set
# CONFIG_BACKLIGHT_PROGEAR is not set

#
# Display device support
#
CONFIG_DISPLAY_SUPPORT=m

#
# Display hardware drivers
#

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_VIDEO_SELECT=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY is not set
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y

#
# Sound
#
CONFIG_SOUND=m

#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
CONFIG_SND_SEQ_RTCTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
# CONFIG_SND_SUPPORT_OLD_API is not set
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set

#
# Generic devices
#
CONFIG_SND_MPU401_UART=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_DUMMY=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
# CONFIG_SND_MTS64 is not set
# CONFIG_SND_SERIAL_U16550 is not set
CONFIG_SND_MPU401=m
# CONFIG_SND_PORTMAN2X4 is not set

#
# PCI devices
#
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS5530 is not set
# CONFIG_SND_CS5535AUDIO is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
CONFIG_SND_HDA_INTEL=m
# CONFIG_SND_HDA_HWDEP is not set
CONFIG_SND_HDA_CODEC_REALTEK=y
CONFIG_SND_HDA_CODEC_ANALOG=y
CONFIG_SND_HDA_CODEC_SIGMATEL=y
CONFIG_SND_HDA_CODEC_VIA=y
CONFIG_SND_HDA_CODEC_ATIHDMI=y
CONFIG_SND_HDA_CODEC_CONEXANT=y
CONFIG_SND_HDA_CODEC_CMEDIA=y
CONFIG_SND_HDA_CODEC_SI3054=y
CONFIG_SND_HDA_GENERIC=y
# CONFIG_SND_HDA_POWER_SAVE is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set
CONFIG_SND_AC97_POWER_SAVE=y
CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0

#
# USB devices
#
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_USX2Y is not set
# CONFIG_SND_USB_CAIAQ is not set

#
# PCMCIA devices
#
# CONFIG_SND_VXPOCKET is not set
# CONFIG_SND_PDAUDIOCF is not set

#
# System on Chip audio support
#
# CONFIG_SND_SOC is not set

#
# SoC Audio support for SuperH
#

#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
CONFIG_AC97_BUS=m
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
# CONFIG_HID_DEBUG is not set
# CONFIG_HIDRAW is not set

#
# USB Input Devices
#
CONFIG_USB_HID=y
# CONFIG_USB_HIDINPUT_POWERBOOK is not set
CONFIG_HID_FF=y
CONFIG_HID_PID=y
CONFIG_LOGITECH_FF=y
# CONFIG_PANTHERLORD_FF is not set
CONFIG_THRUSTMASTER_FF=y
# CONFIG_ZEROPLUS_FF is not set
CONFIG_USB_HIDDEV=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_DEVICE_CLASS is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
CONFIG_USB_SUSPEND=y
# CONFIG_USB_PERSIST is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
# CONFIG_USB_EHCI_SPLIT_ISO is not set
CONFIG_USB_EHCI_ROOT_HUB_TT=y
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=m
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# may also be needed; see USB_STORAGE Help for more information
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
# CONFIG_USB_STORAGE_ISD200 is not set
CONFIG_USB_STORAGE_DPCM=y
CONFIG_USB_STORAGE_USBAT=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_STORAGE_ALAUDA=y
# CONFIG_USB_STORAGE_KARMA is not set
CONFIG_USB_LIBUSUAL=y

#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
CONFIG_USB_MON=y

#
# USB port drivers
#
CONFIG_USB_USS720=m

#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_AIRCABLE is not set
CONFIG_USB_SERIAL_AIRPRIME=m
CONFIG_USB_SERIAL_ARK3116=m
CONFIG_USB_SERIAL_BELKIN=m
# CONFIG_USB_SERIAL_CH341 is not set
CONFIG_USB_SERIAL_WHITEHEAT=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP2101=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_FUNSOFT=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
CONFIG_USB_SERIAL_KEYSPAN_MPR=y
CONFIG_USB_SERIAL_KEYSPAN_USA28=y
CONFIG_USB_SERIAL_KEYSPAN_USA28X=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y
CONFIG_USB_SERIAL_KEYSPAN_USA19=y
CONFIG_USB_SERIAL_KEYSPAN_USA18X=y
CONFIG_USB_SERIAL_KEYSPAN_USA19W=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y
CONFIG_USB_SERIAL_KEYSPAN_USA49W=y
CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
# CONFIG_USB_SERIAL_MOS7720 is not set
# CONFIG_USB_SERIAL_MOS7840 is not set
CONFIG_USB_SERIAL_NAVMAN=m
CONFIG_USB_SERIAL_PL2303=m
# CONFIG_USB_SERIAL_OTI6858 is not set
CONFIG_USB_SERIAL_HP4X=m
CONFIG_USB_SERIAL_SAFE=m
CONFIG_USB_SERIAL_SAFE_PADDED=y
CONFIG_USB_SERIAL_SIERRAWIRELESS=m
CONFIG_USB_SERIAL_TI=m
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
# CONFIG_USB_SERIAL_DEBUG is not set
CONFIG_USB_EZUSB=y

#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=m
CONFIG_USB_EMI26=m
# CONFIG_USB_ADUTUX is not set
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
# CONFIG_USB_BERRY_CHARGE is not set
CONFIG_USB_LED=m
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGET is not set
CONFIG_USB_IDMOUSE=m
# CONFIG_USB_FTDI_ELAN is not set
CONFIG_USB_APPLEDISPLAY=m
CONFIG_USB_SISUSBVGA=m
CONFIG_USB_SISUSBVGA_CON=y
CONFIG_USB_LD=m
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
CONFIG_USB_TEST=m

#
# USB DSL modem support
#

#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
# CONFIG_MMC_UNSAFE_RESUME is not set

#
# MMC/SD Card Drivers
#
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_BOUNCE=y
# CONFIG_SDIO_UART is not set

#
# MMC/SD Host Controller Drivers
#
CONFIG_MMC_SDHCI=m
# CONFIG_MMC_RICOH_MMC is not set
# CONFIG_MMC_WBSD is not set
# CONFIG_MMC_TIFM_SD is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=m

#
# LED drivers
#

#
# LED Triggers
#
# CONFIG_LEDS_TRIGGERS is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=m
CONFIG_RTC_CLASS=m

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
CONFIG_RTC_INTF_DEV_UIE_EMUL=y
# CONFIG_RTC_DRV_TEST is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=m
CONFIG_RTC_DRV_DS1553=m
# CONFIG_RTC_DRV_STK17TA8 is not set
CONFIG_RTC_DRV_DS1742=m
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T59 is not set
CONFIG_RTC_DRV_V3020=m

#
# on-CPU RTC drivers
#
# CONFIG_DMADEVICES is not set
# CONFIG_AUXDISPLAY is not set
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
# CONFIG_KVM_AMD is not set

#
# Userspace I/O
#
# CONFIG_UIO is not set

#
# File systems
#
# CONFIG_EXT2_FS is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4DEV_FS is not set
CONFIG_JBD=y
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=y
# CONFIG_XFS_QUOTA is not set
# CONFIG_XFS_SECURITY is not set
# CONFIG_XFS_POSIX_ACL is not set
# CONFIG_XFS_RT is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_QUOTA is not set
CONFIG_DNOTIFY=y
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=m
# CONFIG_FUSE_FS is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
# CONFIG_TMPFS_POSIX_ACL is not set
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=m

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
CONFIG_HFSPLUS_FS=m
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
# CONFIG_NFSD_V4 is not set
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
CONFIG_SUNRPC_BIND34=y
CONFIG_RPCSEC_GSS_KRB5=y
# CONFIG_RPCSEC_GSS_SPKM3 is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
# CONFIG_BSD_DISKLABEL is not set
# CONFIG_MINIX_SUBPARTITION is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
CONFIG_LDM_PARTITION=y
# CONFIG_LDM_DEBUG is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
# CONFIG_EFI_PARTITION is not set
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=m
# CONFIG_DLM is not set
CONFIG_INSTRUMENTATION=y
# CONFIG_PROFILING is not set
# CONFIG_KPROBES is not set
# CONFIG_MARKERS is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_DEBUG_FS is not set
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_SCHED_DEBUG is not set
# CONFIG_SCHEDSTATS is not set
CONFIG_TIMER_STATS=y
# CONFIG_SLUB_DEBUG_ON is not set
CONFIG_DEBUG_PREEMPT=y
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
CONFIG_FRAME_POINTER=y
CONFIG_FORCED_INLINING=y
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_SAMPLES is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set

#
# Page alloc debug is incompatible with Software Suspend on i386
#
# CONFIG_DEBUG_RODATA is not set
# CONFIG_4KSTACKS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_DOUBLEFAULT=y

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
CONFIG_SECURITY_CAPABILITIES=y
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
# CONFIG_SECURITY_ROOTPLUG is not set
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_TGR192=m
# CONFIG_CRYPTO_GF128MUL is not set
CONFIG_CRYPTO_ECB=m
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_PCBC=m
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_XTS is not set
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_FCRYPT is not set
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
# CONFIG_CRYPTO_TWOFISH_586 is not set
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_AES_586=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_ANUBIS=m
# CONFIG_CRYPTO_SEED is not set
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_CRC32C=m
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_TEST is not set
# CONFIG_CRYPTO_AUTHENC is not set
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
CONFIG_CRYPTO_DEV_GEODE=m

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y


Attachments:
.config (49.89 kB)

2007-11-11 18:07:27

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client

On Fri, Nov 09, 2007 at 04:43:05PM -0200, Glauber de Oliveira Costa wrote:
> This patch makes vsmp a paravirt client. It now uses the whole
> infrastructure provided by pvops. When we detect we're running
> a vsmp box, we change the irq-related paravirt operations (and so,
> it have to happen quite early), and the patching function

The PARAVIRT ifdefs look wrong. Surely you don't need them at all
because it cannot work at all without paravirt.

Also you got some white space damage.

And the "EM64T based comment" is wrong because there are AMD based
vSMPs too.

-Andi

2007-11-12 08:33:40

by Amit Shah

[permalink] [raw]
Subject: Re: [kvm-devel] [PATCH 0/24] paravirt_ops for unified x86 - that's me again!

On Saturday 10 November 2007 00:12:41 Glauber de Oliveira Costa wrote:
> Hey folks,
>
> Here's a new spin of the pvops64 patch series.
> We didn't get that many comments from the last time,
> so it should be probably almost ready to get in. Heya!
>
> >From the last version, the most notable changes are:
>
> * consolidation of system.h, merging jeremy's comments about ordering
> concerns
> * consolidation of smp functions that goes through smp_ops. They're sharing
> a bunch of code now.
>
> Other than that, just some issues that arose from the rebase.
>
> Please, not that this patch series _does not_ apply over linus git anymore,
> but rather, over tglx cleanup series.
>
> The first patch in this series is already on linus', but not on tglx', so
> I'm sending it again, because you'll need it if you want to compile it
> anyway.
>
> tglx, in the absense of any outstanding NACKs, or any very big call for
> improvements, could you please pull it in your tree?
>
> Have fun,

Glauber, are you planning on consolidating the dma_ops structure for 32- and
64-bit? 32-bit doesn't currently have a dma_mapping_ops structure, which
makes paravirtualizing DMA access difficult on 32-bit.

2007-11-13 11:37:24

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andi Kleen escreveu:
> On Fri, Nov 09, 2007 at 04:43:05PM -0200, Glauber de Oliveira Costa wrote:
>> This patch makes vsmp a paravirt client. It now uses the whole
>> infrastructure provided by pvops. When we detect we're running
>> a vsmp box, we change the irq-related paravirt operations (and so,
>> it have to happen quite early), and the patching function
>
> The PARAVIRT ifdefs look wrong. Surely you don't need them at all
> because it cannot work at all without paravirt.

The vsmp_64.c file is now compiled unconditionally, according to which
me and kiran agreed to. The detection code is always run, but will only
trigger when a suitable box is found. Accordingly, the paravirt structs
are only touched when PARAVIRT is on. Otherwise, we don't even have the
symbols.

> Also you got some white space damage.

Thanks, will fix.

> And the "EM64T based comment" is wrong because there are AMD based
> vSMPs too.
Just got it as-is from the old Kconfig. Do you think it should be fixed
as well?

> -Andi

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Remi - http://enigmail.mozdev.org

iD8DBQFHOYxKjYI8LaFUWXMRApS3AJwJSYjW4Lw3dnPR4yMNfXABnMoQQQCcDMnf
3wBQoPjGO/8HO1Os4Y21vIU=
=S+KO
-----END PGP SIGNATURE-----

2007-11-13 11:42:36

by Glauber Costa

[permalink] [raw]
Subject: Re: [kvm-devel] [PATCH 0/24] paravirt_ops for unified x86 - that's me again!

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Amit Shah escreveu:
> On Saturday 10 November 2007 00:12:41 Glauber de Oliveira Costa wrote:
>> Hey folks,
>>
>> Here's a new spin of the pvops64 patch series.
>> We didn't get that many comments from the last time,
>> so it should be probably almost ready to get in. Heya!
>>
>> >From the last version, the most notable changes are:
>>
>> * consolidation of system.h, merging jeremy's comments about ordering
>> concerns
>> * consolidation of smp functions that goes through smp_ops. They're sharing
>> a bunch of code now.
>>
>> Other than that, just some issues that arose from the rebase.
>>
>> Please, not that this patch series _does not_ apply over linus git anymore,
>> but rather, over tglx cleanup series.
>>
>> The first patch in this series is already on linus', but not on tglx', so
>> I'm sending it again, because you'll need it if you want to compile it
>> anyway.
>>
>> tglx, in the absense of any outstanding NACKs, or any very big call for
>> improvements, could you please pull it in your tree?
>>
>> Have fun,
>
> Glauber, are you planning on consolidating the dma_ops structure for 32- and
> 64-bit? 32-bit doesn't currently have a dma_mapping_ops structure, which
> makes paravirtualizing DMA access difficult on 32-bit.
Until its get merged, definitely not. Although important, this is
significant work, and can delay us even more. But I was not planning to
do it at all (well, you were the first that raised the issue...)

So if the reason for your question is you are planning to work on it, go
ahead ;-)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Remi - http://enigmail.mozdev.org

iD8DBQFHOY3mjYI8LaFUWXMRAvnqAJ4ridsG0ZB2aI7U36hbZFBO0PoDgQCgqvjc
n6RbVz7Jw8t9qCyKUN+hLGg=
=crYj
-----END PGP SIGNATURE-----

2007-11-13 12:34:16

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client


> The vsmp_64.c file is now compiled unconditionally, according to which
> me and kiran agreed to. The detection code is always run, but will only
> trigger when a suitable box is found. Accordingly, the paravirt structs
> are only touched when PARAVIRT is on. Otherwise, we don't even have the
> symbols.

That seems dumb. What good is it if it doesn't patch the interrupt code?

-Andi

2007-11-13 12:52:10

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andi Kleen escreveu:
>> The vsmp_64.c file is now compiled unconditionally, according to which
>> me and kiran agreed to. The detection code is always run, but will only
>> trigger when a suitable box is found. Accordingly, the paravirt structs
>> are only touched when PARAVIRT is on. Otherwise, we don't even have the
>> symbols.
>
> That seems dumb. What good is it if it doesn't patch the interrupt code?

If vsmp is selected, PARAVIRT will be too, and the interrupt code will
be patched.
the vsmp option triggers a select statement.

the ifdef only exists because, as I said, the code itself will be always
compiled in, to avoid an ifdef in setup_64.c. So it's just a taking it
from here, putting it there issue. Kiran seem to prefer this way, but I
don't really have a preference.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Remi - http://enigmail.mozdev.org

iD8DBQFHOZ3IjYI8LaFUWXMRAjPkAJ0XosdyMcj1j4h6XW5dVaj95NH7cgCeIW5o
CGnBnZOTGz9DIu5D997bsZ4=
=BP57
-----END PGP SIGNATURE-----

2007-11-13 13:09:57

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client


> If vsmp is selected, PARAVIRT will be too, and the interrupt code will
> be patched.
> the vsmp option triggers a select statement.
>
> the ifdef only exists because, as I said, the code itself will be always
> compiled in, to avoid an ifdef in setup_64.c. So it's just a taking it
> from here, putting it there issue.

Peeking around needlessly in pci config space at each boot just to avoid an
ifdef is not an good idea.

-Andi


2007-11-13 17:23:50

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client

Glauber de Oliveira Costa wrote:
> the ifdef only exists because, as I said, the code itself will be always
> compiled in, to avoid an ifdef in setup_64.c. So it's just a taking it
> from here, putting it there issue. Kiran seem to prefer this way, but I
> don't really have a preference.

It would be better to have the ifdef in setup_64.c and just make the
compilation of vsmp_64.c depend on CONFIG_X86_VSMP. If the ifdef really
rankles you could use a weak stub function somewhere, or define an
inline stub in vsmp.h.

J

2007-11-13 18:44:21

by Ravikiran Thirumalai

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client

On Tue, Nov 13, 2007 at 09:36:42AM -0200, Glauber de Oliveira Costa wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>
>> And the "EM64T based comment" is wrong because there are AMD based
>> vSMPs too.
>Just got it as-is from the old Kconfig. Do you think it should be fixed
>as well?

Yep.

Thanks,
Kiran

2007-11-13 18:52:17

by Ravikiran Thirumalai

[permalink] [raw]
Subject: Re: [PATCH 24/24] make vsmp a paravirt client

On Tue, Nov 13, 2007 at 02:09:41PM +0100, Andi Kleen wrote:
>
>> If vsmp is selected, PARAVIRT will be too, and the interrupt code will
>> be patched.
>> the vsmp option triggers a select statement.
>>
>> the ifdef only exists because, as I said, the code itself will be always
>> compiled in, to avoid an ifdef in setup_64.c. So it's just a taking it
>> from here, putting it there issue.
>
>Peeking around needlessly in pci config space at each boot just to avoid an
>ifdef is not an good idea.

Agreed. vsmp_init does not make sense unless the irq ops are patched. We
can call vsmp_init under #ifdef CONFIG_PARAVIRT.

Thanks,
Kiran

2007-11-13 20:28:28

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [kvm-devel] [PATCH 0/24] paravirt_ops for unified x86 - that's me again!

Amit Shah wrote:
> Glauber, are you planning on consolidating the dma_ops structure for 32- and
> 64-bit? 32-bit doesn't currently have a dma_mapping_ops structure, which
> makes paravirtualizing DMA access difficult on 32-bit.


I think it's a good idea. While I haven't worked out the details yet,
it seems like something I'll need for Xen dom0 support.

J

2007-11-17 09:33:25

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 5/24] smp x86 consolidation


* Glauber de Oliveira Costa <[email protected]> wrote:

> > I'm getting link errors in 32-bit:
> >
> > arch/x86/kernel/built-in.o: In function `native_smp_send_reschedule':
> > /home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/smpcommon.c:262: undefined reference to `genapic'
> > arch/x86/kernel/built-in.o: In function `native_smp_call_function_mask':
> > /home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/smpcommon.c:113: undefined reference to `genapic'
> >
> Ok, it compiled just fine here. I bet it's due to one of that i386 lots
> of variants.

this patch is causing build failures all around the place. It does not
build on CONFIG_X86_SUMMIT, it does not build if CONFIG_X86_GENERICARCH
is turned off, etc. It does not even build on UP - fix for that is
attached below. I've dropped this patch for now.

Ingo

-------------->
Subject: x86: fix build error
From: Ingo Molnar <[email protected]>

fix build error on !CONFIG_SMP.

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
include/asm-x86/smp.h | 2 ++
1 file changed, 2 insertions(+)

Index: linux/include/asm-x86/smp.h
===================================================================
--- linux.orig/include/asm-x86/smp.h
+++ linux/include/asm-x86/smp.h
@@ -2,6 +2,7 @@
#define _X86_SMP_H_

#ifndef __ASSEMBLY__
+#ifdef CONFIG_SMP
struct smp_ops
{
void (*smp_prepare_boot_cpu)(void);
@@ -55,6 +56,7 @@ void native_smp_prepare_boot_cpu(void);
void native_smp_prepare_cpus(unsigned int max_cpus);
int native_cpu_up(unsigned int cpunum);
void native_smp_cpus_done(unsigned int max_cpus);
+#endif

#ifndef CONFIG_PARAVIRT
#define startup_ipi_hook(phys_apicid, start_eip, start_esp) \

2007-11-19 22:22:10

by Bastian Blank

[permalink] [raw]
Subject: Re: [PATCH 7/24] consolidate msr.h

On Fri, Nov 09, 2007 at 04:42:48PM -0200, Glauber de Oliveira Costa wrote:
> - wrmsrl(MSR_CSTAR, ia32_cstar_target);
> + wrmsrl(MSR_CSTAR, (u64)ia32_cstar_target);

Hmm, why do you add explicit casts? The compiler should convert that
correctly on its own.

> +static inline void wrmsrl(unsigned int msr, unsigned long long val)

Hmm, long long is 64 bit on all x86, but why not use explicit u64 to
show that?

Bastian

--
Captain's Log, star date 21:34.5...

2007-11-19 22:37:12

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 7/24] consolidate msr.h


On Mon, 19 Nov 2007, Bastian Blank wrote:

> On Fri, Nov 09, 2007 at 04:42:48PM -0200, Glauber de Oliveira Costa wrote:
> > - wrmsrl(MSR_CSTAR, ia32_cstar_target);
> > + wrmsrl(MSR_CSTAR, (u64)ia32_cstar_target);
>
> Hmm, why do you add explicit casts? The compiler should convert that
> correctly on its own.
>
> > +static inline void wrmsrl(unsigned int msr, unsigned long long val)
>
> Hmm, long long is 64 bit on all x86, but why not use explicit u64 to
> show that?

(quick reply)

With PVOPS on it gives compiler warnings without that explict cast.
Without looking at the code, IIRC with non-PVOPS it is a macro directly
into asm, so it didn't matter what the cast was. But with PVOPS as a
function, it gave compiler warnings.

Take it out and try compiling it for both i386 and x86_64. One of them
gave warnings. But maybe it's not a problem now.

-- Steve

2007-11-20 05:52:15

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 7/24] consolidate msr.h


* Steven Rostedt <[email protected]> wrote:

> > On Fri, Nov 09, 2007 at 04:42:48PM -0200, Glauber de Oliveira Costa wrote:
> > > - wrmsrl(MSR_CSTAR, ia32_cstar_target);
> > > + wrmsrl(MSR_CSTAR, (u64)ia32_cstar_target);
> >
> > Hmm, why do you add explicit casts? The compiler should convert that
> > correctly on its own.
> >
> > > +static inline void wrmsrl(unsigned int msr, unsigned long long val)
> >
> > Hmm, long long is 64 bit on all x86, but why not use explicit u64 to
> > show that?
>
> (quick reply)
>
> With PVOPS on it gives compiler warnings without that explict cast.
> Without looking at the code, IIRC with non-PVOPS it is a macro
> directly into asm, so it didn't matter what the cast was. But with
> PVOPS as a function, it gave compiler warnings.
>
> Take it out and try compiling it for both i386 and x86_64. One of them
> gave warnings. But maybe it's not a problem now.

i dont think there's ever any true need (and good cause) to force
integer type casts like that at the callee site.

Ingo

2007-11-20 06:14:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 7/24] consolidate msr.h



On Tue, 20 Nov 2007, Ingo Molnar wrote:

> * Steven Rostedt <[email protected]> wrote:
>
> > With PVOPS on it gives compiler warnings without that explict cast.
> > Without looking at the code, IIRC with non-PVOPS it is a macro
> > directly into asm, so it didn't matter what the cast was. But with
> > PVOPS as a function, it gave compiler warnings.
> >
> > Take it out and try compiling it for both i386 and x86_64. One of them
> > gave warnings. But maybe it's not a problem now.
>
> i dont think there's ever any true need (and good cause) to force
> integer type casts like that at the callee site.

I guess the problem is that we converted a macro to a function, where the
macro did no type checking. Now we need to pick between integers and
pointers. Some places uses intergers in wrmsrl and some use pointers. So
changing this to a typechecking protocol is not going to be nice.

Looking at the current code now, we have this:

====
checking_wrmsrl(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
checking_wrmsrl(MSR_IA32_SYSENTER_ESP, 0ULL);
checking_wrmsrl(MSR_IA32_SYSENTER_EIP, (u64)ia32_sysenter_target);

wrmsrl(MSR_CSTAR, ia32_cstar_target);
====

A typecast is already used in that same area.

-- Steve


2007-11-20 06:17:10

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 7/24] consolidate msr.h


On Tue, 20 Nov 2007, Ingo Molnar wrote:
>
> i dont think there's ever any true need (and good cause) to force
> integer type casts like that at the callee site.

Unless you mean we should do something like this:

static inline void __wrmsrl(unsigned int msr, unsigned long long val);
#define wrmsr(msr, val) __wrmsrl(msr, (unsigned long long)var)

-- Steve


2007-11-20 10:43:34

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 7/24] consolidate msr.h

On Tuesday 20 November 2007 17:16:45 Steven Rostedt wrote:
> On Tue, 20 Nov 2007, Ingo Molnar wrote:
> > i dont think there's ever any true need (and good cause) to force
> > integer type casts like that at the callee site.
>
> Unless you mean we should do something like this:
>
> static inline void __wrmsrl(unsigned int msr, unsigned long long val);
> #define wrmsr(msr, val) __wrmsrl(msr, (unsigned long long)var)

Heh:

union ptr_or_val {
void *ptr;
unsigned long long val;
};

static inline void __wrmsrl(unsigned int msr, union ptr_or_val pv);
#define wrmsr(msr, v) __wrmsrl(msr, (union ptr_or_val)v)

Ok, maybe not...
Rusty.