2015-07-16 21:44:05

by Boris Ostrovsky

Subject: [PATCH v2 0/6] 32-bit PVH domU support

A set of PVH-related patches.

The first patch is an x86-64 ABI fix for PVH guests. The second is a small
update that removes a redundant memset() (on both the PV and PVH code paths).

The rest of the series enables non-privileged 32-bit PVH guests. This requires the
hypervisor patches from http://lists.xenproject.org/archives/html/xen-devel/2015-07/msg02229.html

Changes in v2:
* Removed second argument to xen_pvh_early_cpu_init() per Konrad's suggestion
(patch 1)
* Replaced segment updates done in assembly with loadsegment() macros (patch 4)


Boris Ostrovsky (6):
xen/x86/pvh: Save %rbx in xen_pvh_early_cpu_init()
xen/x86: Remove unnecessary memset() call
xen/x86/pvh: Properly set page tables for 32-bit PVH guests
xen/x86/pvh: Set up descriptors for 32-bit PVH guests
xen/x86/pvh: Add 32-bit PVH initialization code
xen/x86/pvh: Allow building 32-bit PVH guests

arch/x86/xen/Kconfig | 2 +-
arch/x86/xen/enlighten.c | 21 +++++++++++----------
arch/x86/xen/mmu.c | 22 +++++++++++++---------
arch/x86/xen/smp.c | 21 +++++++++++----------
arch/x86/xen/smp.h | 8 ++++++--
arch/x86/xen/xen-head.S | 38 +++++++++++++++++++++++++-------------
6 files changed, 67 insertions(+), 45 deletions(-)

--
1.8.1.4


2015-07-16 21:45:00

by Boris Ostrovsky

Subject: [PATCH v2 1/6] xen/x86/pvh: Save %rbx in xen_pvh_early_cpu_init()

The x86-64 ABI requires that functions preserve %rbx. When
xen_pvh_early_cpu_init() is executed on the boot CPU it is invoked as a
function, and the 'cpuid' instruction clobbers %rbx. (This is not a
concern on secondary processors, where xen_pvh_early_cpu_init() is
the entry point and thus does not need to preserve registers.)

Since we cannot access the stack on secondary processors (the PTE's NX bit
will cause a #GP on AMD --- see commit a2ef5dc2c7cb ("x86/xen: Set EFER.NX
and EFER.SCE in PVH guests")), we can't always save %rbx on the stack. We
work around this by creating a new entry point for those processors ---
xen_pvh_early_cpu_init_secondary(). Note that on secondary CPUs we will
load %rbx with garbage from the stack, which is OK.

As a side effect of these changes we don't need to save %rsi anymore,
since we can use %rbx instead of %rsi as a temporary register.

(We could store %rbx in another scratch register such as %r8, but since
we will need a new entry point for 32-bit PVH anyway, we go with this
approach for symmetry.)

Konrad also suggested that with a separate entry point for secondary
processors we don't need the second argument ('bool entry') to
xen_pvh_early_cpu_init[_secondary](), so it is dropped.

Signed-off-by: Boris Ostrovsky <[email protected]>
---
Changes in v2:
* Dropped second argument to xen_pvh_early_cpu_init()

arch/x86/xen/enlighten.c | 2 +-
arch/x86/xen/smp.c | 4 ++--
arch/x86/xen/smp.h | 8 ++++++--
arch/x86/xen/xen-head.S | 27 ++++++++++++++-------------
4 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 0b95c9b..f8dc398 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1454,7 +1454,7 @@ static void __init xen_pvh_early_guest_init(void)

xen_have_vector_callback = 1;

- xen_pvh_early_cpu_init(0, false);
+ xen_pvh_early_cpu_init(0);
xen_pvh_set_cr_flags(0);

#ifdef CONFIG_X86_32
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 8648438..e53be3b 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -423,9 +423,9 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
* bit set. This means before DS/SS is touched, NX in
* EFER must be set. Hence the following assembly glue code.
*/
- ctxt->user_regs.eip = (unsigned long)xen_pvh_early_cpu_init;
+ ctxt->user_regs.eip =
+ (unsigned long)xen_pvh_early_cpu_init_secondary;
ctxt->user_regs.rdi = cpu;
- ctxt->user_regs.rsi = true; /* entry == true */
}
#endif
ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index 963d62a..538c00d 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -9,9 +9,13 @@ extern void xen_send_IPI_all(int vector);
extern void xen_send_IPI_self(int vector);

#ifdef CONFIG_XEN_PVH
-extern void xen_pvh_early_cpu_init(int cpu, bool entry);
+extern void xen_pvh_early_cpu_init(int cpu);
+extern void xen_pvh_early_cpu_init_secondary(int cpu);
#else
-static inline void xen_pvh_early_cpu_init(int cpu, bool entry)
+static inline void xen_pvh_early_cpu_init(int cpu)
+{
+}
+static inline void xen_pvh_early_cpu_init_secondary(int cpu)
{
}
#endif
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 8afdfcc..944b08b 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -53,31 +53,32 @@ ENTRY(startup_xen)
/*
* xen_pvh_early_cpu_init() - early PVH VCPU initialization
* @cpu: this cpu number (%rdi)
- * @entry: true if this is a secondary vcpu coming up on this entry
- * point, false if this is the boot CPU being initialized for
- * the first time (%rsi)
- *
- * Note: This is called as a function on the boot CPU, and is the entry point
- * on the secondary CPU.
*/
ENTRY(xen_pvh_early_cpu_init)
- mov %rsi, %r11
-
+ mov %rbx, -8(%rsp)
+ xor %esi, %esi
+ jmp 1f
+
+/* Entry point for secondary CPUs. Can't touch stack until NX is dealt with. */
+ENTRY(xen_pvh_early_cpu_init_secondary)
+ mov $1, %esi
+1:
/* Gather features to see if NX implemented. */
mov $0x80000001, %eax
cpuid
- mov %edx, %esi
+ mov %edx, %ebx

mov $MSR_EFER, %ecx
rdmsr
bts $_EFER_SCE, %eax

- bt $20, %esi
- jnc 1f /* No NX, skip setting it */
+ bt $20, %ebx
+ jnc 2f /* No NX, skip setting it */
bts $_EFER_NX, %eax
-1: wrmsr
+2: wrmsr
+ mov -8(%rsp), %rbx
#ifdef CONFIG_SMP
- cmp $0, %r11b
+ cmp $0, %esi
jne cpu_bringup_and_idle
#endif
ret
--
1.8.1.4

2015-07-16 21:44:07

by Boris Ostrovsky

Subject: [PATCH v2 2/6] xen/x86: Remove unnecessary memset() call

Since ctxt is kzalloc'd, there is no need to call memset() on
ctxt->fpu_ctxt.

Signed-off-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/smp.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index e53be3b..33a4529 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -377,7 +377,6 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
ctxt->user_regs.fs = __KERNEL_PERCPU;
ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
#endif
- memset(&ctxt->fpu_ctxt, 0, sizeof(ctxt->fpu_ctxt));

if (!xen_feature(XENFEAT_auto_translated_physmap)) {
ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
--
1.8.1.4

2015-07-16 21:44:12

by Boris Ostrovsky

Subject: [PATCH v2 3/6] xen/x86/pvh: Properly set page tables for 32-bit PVH guests

32-bit PVH guests don't want to write-protect/pin page tables.

Signed-off-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/mmu.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index dd151b2..b473df8 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1762,8 +1762,9 @@ void __init xen_setup_machphys_mapping(void)
machine_to_phys_nr = MACH2PHYS_NR_ENTRIES;
}
#ifdef CONFIG_X86_32
- WARN_ON((machine_to_phys_mapping + (machine_to_phys_nr - 1))
- < machine_to_phys_mapping);
+ if (!xen_feature(XENFEAT_auto_translated_physmap))
+ WARN_ON((machine_to_phys_mapping + (machine_to_phys_nr - 1))
+ < machine_to_phys_mapping);
#endif
}

@@ -1958,15 +1959,18 @@ void __init xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn)
initial_page_table[KERNEL_PGD_BOUNDARY] =
__pgd(__pa(initial_kernel_pmd) | _PAGE_PRESENT);

- set_page_prot(initial_kernel_pmd, PAGE_KERNEL_RO);
- set_page_prot(initial_page_table, PAGE_KERNEL_RO);
- set_page_prot(empty_zero_page, PAGE_KERNEL_RO);
+ if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+ set_page_prot(initial_kernel_pmd, PAGE_KERNEL_RO);
+ set_page_prot(initial_page_table, PAGE_KERNEL_RO);
+ set_page_prot(empty_zero_page, PAGE_KERNEL_RO);

- pin_pagetable_pfn(MMUEXT_UNPIN_TABLE, PFN_DOWN(__pa(pgd)));
+ pin_pagetable_pfn(MMUEXT_UNPIN_TABLE, PFN_DOWN(__pa(pgd)));

- pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE,
- PFN_DOWN(__pa(initial_page_table)));
- xen_write_cr3(__pa(initial_page_table));
+ pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE,
+ PFN_DOWN(__pa(initial_page_table)));
+ xen_write_cr3(__pa(initial_page_table));
+ } else
+ native_write_cr3(__pa(initial_page_table));

memblock_reserve(__pa(xen_start_info->pt_base),
xen_start_info->nr_pt_frames * PAGE_SIZE);
--
1.8.1.4

2015-07-16 21:45:47

by Boris Ostrovsky

Subject: [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

Signed-off-by: Boris Ostrovsky <[email protected]>
---
Changes in v2:
* Set segment selectors using loadsegment() instead of assembly

arch/x86/xen/enlighten.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index f8dc398..d665b1d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1362,12 +1362,12 @@ static void __init xen_boot_params_init_edd(void)
static void __ref xen_setup_gdt(int cpu)
{
if (xen_feature(XENFEAT_auto_translated_physmap)) {
-#ifdef CONFIG_X86_64
- unsigned long dummy;
+ unsigned long __attribute__((unused)) dummy;

- load_percpu_segment(cpu); /* We need to access per-cpu area */
+ setup_stack_canary_segment(cpu);
switch_to_new_gdt(cpu); /* GDT and GS set */

+#ifdef CONFIG_X86_64
/* We are switching of the Xen provided GDT to our HVM mode
* GDT. The new GDT has __KERNEL_CS with CS.L = 1
* and we are jumping to reload it.
@@ -1392,8 +1392,13 @@ static void __ref xen_setup_gdt(int cpu)
loadsegment(ds, 0);
loadsegment(fs, 0);
#else
- /* PVH: TODO Implement. */
- BUG();
+ asm volatile ("ljmp %0,$1f\n"
+ "1:\n"
+ :: "i" (__KERNEL_CS));
+
+ loadsegment(ss, __KERNEL_DS);
+ loadsegment(ds, __USER_DS);
+ loadsegment(es, __USER_DS);
#endif
return; /* PVH does not need any PV GDT ops. */
}
--
1.8.1.4

2015-07-16 21:44:10

by Boris Ostrovsky

Subject: [PATCH v2 5/6] xen/x86/pvh: Add 32-bit PVH initialization code

Signed-off-by: Boris Ostrovsky <[email protected]>
---
Changes in v2:
* Some code reshuffling due to changes in patch 1.

arch/x86/xen/enlighten.c | 4 ----
arch/x86/xen/smp.c | 16 +++++++++-------
arch/x86/xen/xen-head.S | 13 ++++++++++++-
3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index d665b1d..9eb395f 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1461,10 +1461,6 @@ static void __init xen_pvh_early_guest_init(void)

xen_pvh_early_cpu_init(0);
xen_pvh_set_cr_flags(0);
-
-#ifdef CONFIG_X86_32
- BUG(); /* PVH: Implement proper support. */
-#endif
}
#endif /* CONFIG_XEN_PVH */

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 33a4529..f8c5d23 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -372,11 +372,8 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)

gdt = get_cpu_gdt_table(cpu);

-#ifdef CONFIG_X86_32
- /* Note: PVH is not yet supported on x86_32. */
- ctxt->user_regs.fs = __KERNEL_PERCPU;
- ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
-#endif
+ ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
+ ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));

if (!xen_feature(XENFEAT_auto_translated_physmap)) {
ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
@@ -403,6 +400,8 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
ctxt->kernel_sp = idle->thread.sp0;

#ifdef CONFIG_X86_32
+ ctxt->user_regs.fs = __KERNEL_PERCPU;
+ ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
ctxt->event_callback_cs = __KERNEL_CS;
ctxt->failsafe_callback_cs = __KERNEL_CS;
#else
@@ -424,11 +423,14 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
*/
ctxt->user_regs.eip =
(unsigned long)xen_pvh_early_cpu_init_secondary;
+#ifdef CONFIG_X86_64
ctxt->user_regs.rdi = cpu;
+#else
+ *((uint32_t *)ctxt->user_regs.esp + 1) = cpu;
+#endif
}
#endif
- ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
- ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
+
if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
BUG();

diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 944b08b..eada11b 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -55,7 +55,11 @@ ENTRY(startup_xen)
* @cpu: this cpu number (%rdi)
*/
ENTRY(xen_pvh_early_cpu_init)
+#ifdef CONFIG_X86_64
mov %rbx, -8(%rsp)
+#else
+ mov %ebx, -4(%esp)
+#endif
xor %esi, %esi
jmp 1f

@@ -70,13 +74,20 @@ ENTRY(xen_pvh_early_cpu_init_secondary)

mov $MSR_EFER, %ecx
rdmsr
+#ifdef CONFIG_X86_64
bts $_EFER_SCE, %eax
-
+#endif
bt $20, %ebx
jnc 2f /* No NX, skip setting it */
bts $_EFER_NX, %eax
2: wrmsr
+
+#ifdef CONFIG_X86_64
mov -8(%rsp), %rbx
+#else
+ mov -4(%esp), %ebx
+#endif
+
#ifdef CONFIG_SMP
cmp $0, %esi
jne cpu_bringup_and_idle
--
1.8.1.4

2015-07-16 21:44:47

by Boris Ostrovsky

Subject: [PATCH v2 6/6] xen/x86/pvh: Allow building 32-bit PVH guests

Signed-off-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index e88fda8..891031e 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -48,5 +48,5 @@ config XEN_DEBUG_FS

config XEN_PVH
bool "Support for running as a PVH guest"
- depends on X86_64 && XEN && XEN_PVHVM
+ depends on XEN && XEN_PVHVM
def_bool n
--
1.8.1.4

2015-07-17 15:10:06

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2 1/6] xen/x86/pvh: Save %rbx in xen_pvh_early_cpu_init()

On Thu, Jul 16, 2015 at 05:43:36PM -0400, Boris Ostrovsky wrote:
> x86-64 ABI requires that functions preserve %rbx. When
> xen_pvh_early_cpu_init() is executed on boot cpu it is invoked as a
> function and 'cpuid' instruction will clobber %rbx. (This is not a
> concern on secondary processors since there xen_pvh_early_cpu_init() is
> the entry point and thus does not need to preserve registers.)
>
> Since we cannot access stack on secondary processors (PTE's NX bit will
> cause #GP on AMD --- see commit a2ef5dc2c7cb ("x86/xen: Set EFER.NX and
> EFER.SCE in PVH guests")) we can't always save %rbx on the stack. We
> work around this by creating a new entry point for those processors ---
> xen_pvh_early_cpu_init_secondary(). Note that on secondary CPUs we will
> load %rbx with garbage from the stack, which is OK.
>
> As a side effect of these changes we don't need to save %rsi anymore,
> since we can use %rbx instead of %rsi as a temporary register.
>
> (We could store %rbx into another scratch register such as %r8 but since
> we will need new entry point for 32-bit PVH anyway we go with this
> approach for symmetry).
>
> Konrad also sugested that with separate entry point for secondary
> processors we don't need the second argument ('bool entry') to
> xen_pvh_early_cpu_init[_secondary](). So it is dropped.
>
> Signed-off-by: Boris Ostrovsky <[email protected]>

Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> Changes in v2:
> * Dropped second argument to xen_pvh_early_cpu_init()
>
> arch/x86/xen/enlighten.c | 2 +-
> arch/x86/xen/smp.c | 4 ++--
> arch/x86/xen/smp.h | 8 ++++++--
> arch/x86/xen/xen-head.S | 27 ++++++++++++++-------------
> 4 files changed, 23 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 0b95c9b..f8dc398 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1454,7 +1454,7 @@ static void __init xen_pvh_early_guest_init(void)
>
> xen_have_vector_callback = 1;
>
> - xen_pvh_early_cpu_init(0, false);
> + xen_pvh_early_cpu_init(0);
> xen_pvh_set_cr_flags(0);
>
> #ifdef CONFIG_X86_32
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 8648438..e53be3b 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -423,9 +423,9 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
> * bit set. This means before DS/SS is touched, NX in
> * EFER must be set. Hence the following assembly glue code.
> */
> - ctxt->user_regs.eip = (unsigned long)xen_pvh_early_cpu_init;
> + ctxt->user_regs.eip =
> + (unsigned long)xen_pvh_early_cpu_init_secondary;
> ctxt->user_regs.rdi = cpu;
> - ctxt->user_regs.rsi = true; /* entry == true */
> }
> #endif
> ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
> diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
> index 963d62a..538c00d 100644
> --- a/arch/x86/xen/smp.h
> +++ b/arch/x86/xen/smp.h
> @@ -9,9 +9,13 @@ extern void xen_send_IPI_all(int vector);
> extern void xen_send_IPI_self(int vector);
>
> #ifdef CONFIG_XEN_PVH
> -extern void xen_pvh_early_cpu_init(int cpu, bool entry);
> +extern void xen_pvh_early_cpu_init(int cpu);
> +extern void xen_pvh_early_cpu_init_secondary(int cpu);
> #else
> -static inline void xen_pvh_early_cpu_init(int cpu, bool entry)
> +static inline void xen_pvh_early_cpu_init(int cpu)
> +{
> +}
> +static inline void xen_pvh_early_cpu_init_secondary(int cpu)
> {
> }
> #endif
> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index 8afdfcc..944b08b 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -53,31 +53,32 @@ ENTRY(startup_xen)
> /*
> * xen_pvh_early_cpu_init() - early PVH VCPU initialization
> * @cpu: this cpu number (%rdi)
> - * @entry: true if this is a secondary vcpu coming up on this entry
> - * point, false if this is the boot CPU being initialized for
> - * the first time (%rsi)
> - *
> - * Note: This is called as a function on the boot CPU, and is the entry point
> - * on the secondary CPU.
> */
> ENTRY(xen_pvh_early_cpu_init)
> - mov %rsi, %r11
> -
> + mov %rbx, -8(%rsp)
> + xor %esi, %esi
> + jmp 1f
> +
> +/* Entry point for secondary CPUs. Can't touch stack until NX is dealt with. */
> +ENTRY(xen_pvh_early_cpu_init_secondary)
> + mov $1, %esi
> +1:
> /* Gather features to see if NX implemented. */
> mov $0x80000001, %eax
> cpuid
> - mov %edx, %esi
> + mov %edx, %ebx
>
> mov $MSR_EFER, %ecx
> rdmsr
> bts $_EFER_SCE, %eax
>
> - bt $20, %esi
> - jnc 1f /* No NX, skip setting it */
> + bt $20, %ebx
> + jnc 2f /* No NX, skip setting it */
> bts $_EFER_NX, %eax
> -1: wrmsr
> +2: wrmsr
> + mov -8(%rsp), %rbx
> #ifdef CONFIG_SMP
> - cmp $0, %r11b
> + cmp $0, %esi
> jne cpu_bringup_and_idle
> #endif
> ret
> --
> 1.8.1.4
>

2015-07-17 15:21:26

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

On Thu, Jul 16, 2015 at 05:43:39PM -0400, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky <[email protected]>
> ---
> Changes in v2:
> * Set segment selectors using loadsegment() instead of assembly
>
> arch/x86/xen/enlighten.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index f8dc398..d665b1d 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1362,12 +1362,12 @@ static void __init xen_boot_params_init_edd(void)
> static void __ref xen_setup_gdt(int cpu)
> {
> if (xen_feature(XENFEAT_auto_translated_physmap)) {
> -#ifdef CONFIG_X86_64
> - unsigned long dummy;
> + unsigned long __attribute__((unused)) dummy;
>
> - load_percpu_segment(cpu); /* We need to access per-cpu area */

You removed that - where are we going to do that? As the
'switch_to_new_gdt' uses the per-cpu GDT table.

> + setup_stack_canary_segment(cpu);

As this one too ?
> switch_to_new_gdt(cpu); /* GDT and GS set */
>
> +#ifdef CONFIG_X86_64
> /* We are switching of the Xen provided GDT to our HVM mode
> * GDT. The new GDT has __KERNEL_CS with CS.L = 1
> * and we are jumping to reload it.
> @@ -1392,8 +1392,13 @@ static void __ref xen_setup_gdt(int cpu)
> loadsegment(ds, 0);
> loadsegment(fs, 0);
> #else
> - /* PVH: TODO Implement. */
> - BUG();
> + asm volatile ("ljmp %0,$1f\n"
> + "1:\n"
> + :: "i" (__KERNEL_CS));
> +
> + loadsegment(ss, __KERNEL_DS);
> + loadsegment(ds, __USER_DS);
> + loadsegment(es, __USER_DS);
> #endif
> return; /* PVH does not need any PV GDT ops. */
> }
> --
> 1.8.1.4
>

2015-07-17 15:21:48

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2 5/6] xen/x86/pvh: Add 32-bit PVH initialization code

On Thu, Jul 16, 2015 at 05:43:40PM -0400, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky <[email protected]>

Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> Changes in v2:
> * Some code reshuffling due to changes in patch 1.
>
> arch/x86/xen/enlighten.c | 4 ----
> arch/x86/xen/smp.c | 16 +++++++++-------
> arch/x86/xen/xen-head.S | 13 ++++++++++++-
> 3 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index d665b1d..9eb395f 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1461,10 +1461,6 @@ static void __init xen_pvh_early_guest_init(void)
>
> xen_pvh_early_cpu_init(0);
> xen_pvh_set_cr_flags(0);
> -
> -#ifdef CONFIG_X86_32
> - BUG(); /* PVH: Implement proper support. */
> -#endif
> }
> #endif /* CONFIG_XEN_PVH */
>
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 33a4529..f8c5d23 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -372,11 +372,8 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
>
> gdt = get_cpu_gdt_table(cpu);
>
> -#ifdef CONFIG_X86_32
> - /* Note: PVH is not yet supported on x86_32. */
> - ctxt->user_regs.fs = __KERNEL_PERCPU;
> - ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
> -#endif
> + ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
> + ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
>
> if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
> @@ -403,6 +400,8 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
> ctxt->kernel_sp = idle->thread.sp0;
>
> #ifdef CONFIG_X86_32
> + ctxt->user_regs.fs = __KERNEL_PERCPU;
> + ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
> ctxt->event_callback_cs = __KERNEL_CS;
> ctxt->failsafe_callback_cs = __KERNEL_CS;
> #else
> @@ -424,11 +423,14 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
> */
> ctxt->user_regs.eip =
> (unsigned long)xen_pvh_early_cpu_init_secondary;
> +#ifdef CONFIG_X86_64
> ctxt->user_regs.rdi = cpu;
> +#else
> + *((uint32_t *)ctxt->user_regs.esp + 1) = cpu;
> +#endif
> }
> #endif
> - ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
> - ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
> +
> if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
> BUG();
>
> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index 944b08b..eada11b 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -55,7 +55,11 @@ ENTRY(startup_xen)
> * @cpu: this cpu number (%rdi)
> */
> ENTRY(xen_pvh_early_cpu_init)
> +#ifdef CONFIG_X86_64
> mov %rbx, -8(%rsp)
> +#else
> + mov %ebx, -4(%esp)
> +#endif
> xor %esi, %esi
> jmp 1f
>
> @@ -70,13 +74,20 @@ ENTRY(xen_pvh_early_cpu_init_secondary)
>
> mov $MSR_EFER, %ecx
> rdmsr
> +#ifdef CONFIG_X86_64
> bts $_EFER_SCE, %eax
> -
> +#endif
> bt $20, %ebx
> jnc 2f /* No NX, skip setting it */
> bts $_EFER_NX, %eax
> 2: wrmsr
> +
> +#ifdef CONFIG_X86_64
> mov -8(%rsp), %rbx
> +#else
> + mov -4(%esp), %ebx
> +#endif
> +
> #ifdef CONFIG_SMP
> cmp $0, %esi
> jne cpu_bringup_and_idle
> --
> 1.8.1.4
>

2015-07-17 15:37:06

by Boris Ostrovsky

Subject: Re: [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

On 07/17/2015 11:21 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 16, 2015 at 05:43:39PM -0400, Boris Ostrovsky wrote:
>> Signed-off-by: Boris Ostrovsky <[email protected]>
>> ---
>> Changes in v2:
>> * Set segment selectors using loadsegment() instead of assembly
>>
>> arch/x86/xen/enlighten.c | 15 ++++++++++-----
>> 1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index f8dc398..d665b1d 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -1362,12 +1362,12 @@ static void __init xen_boot_params_init_edd(void)
>> static void __ref xen_setup_gdt(int cpu)
>> {
>> if (xen_feature(XENFEAT_auto_translated_physmap)) {
>> -#ifdef CONFIG_X86_64
>> - unsigned long dummy;
>> + unsigned long __attribute__((unused)) dummy;
>>
>> - load_percpu_segment(cpu); /* We need to access per-cpu area */
> You removed that - where are we going to do that? As the
> 'switch_to_new_gdt' uses the per-cpu GDT table.

load_percpu_segment() is part of switch_to_new_gdt(), so I thought there
was no need to call it here.

But you are right --- switch_to_new_gdt() starts with
get_cpu_gdt_table(), which accesses the per-CPU area. How did this manage
to work, then?

>
>> + setup_stack_canary_segment(cpu);
> As this one too ?

This one I added. On 64-bit it's a nop so we never had problems without
having this call. On 32-bit, if CC_STACKPROTECTOR is set, we will fail
without setting this up.

-boris

>> switch_to_new_gdt(cpu); /* GDT and GS set */
>>
>> +#ifdef CONFIG_X86_64
>> /* We are switching of the Xen provided GDT to our HVM mode
>> * GDT. The new GDT has __KERNEL_CS with CS.L = 1
>> * and we are jumping to reload it.
>> @@ -1392,8 +1392,13 @@ static void __ref xen_setup_gdt(int cpu)
>> loadsegment(ds, 0);
>> loadsegment(fs, 0);
>> #else
>> - /* PVH: TODO Implement. */
>> - BUG();
>> + asm volatile ("ljmp %0,$1f\n"
>> + "1:\n"
>> + :: "i" (__KERNEL_CS));
>> +
>> + loadsegment(ss, __KERNEL_DS);
>> + loadsegment(ds, __USER_DS);
>> + loadsegment(es, __USER_DS);
>> #endif
>> return; /* PVH does not need any PV GDT ops. */
>> }
>> --
>> 1.8.1.4
>>

2015-07-17 16:43:45

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

On Fri, Jul 17, 2015 at 11:36:29AM -0400, Boris Ostrovsky wrote:
> On 07/17/2015 11:21 AM, Konrad Rzeszutek Wilk wrote:
> >On Thu, Jul 16, 2015 at 05:43:39PM -0400, Boris Ostrovsky wrote:
> >>Signed-off-by: Boris Ostrovsky <[email protected]>
> >>---
> >>Changes in v2:
> >>* Set segment selectors using loadsegment() instead of assembly
> >>
> >> arch/x86/xen/enlighten.c | 15 ++++++++++-----
> >> 1 file changed, 10 insertions(+), 5 deletions(-)
> >>
> >>diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> >>index f8dc398..d665b1d 100644
> >>--- a/arch/x86/xen/enlighten.c
> >>+++ b/arch/x86/xen/enlighten.c
> >>@@ -1362,12 +1362,12 @@ static void __init xen_boot_params_init_edd(void)
> >> static void __ref xen_setup_gdt(int cpu)
> >> {
> >> if (xen_feature(XENFEAT_auto_translated_physmap)) {
> >>-#ifdef CONFIG_X86_64
> >>- unsigned long dummy;
> >>+ unsigned long __attribute__((unused)) dummy;
> >>- load_percpu_segment(cpu); /* We need to access per-cpu area */
> >You removed that - where are we going to do that? As the
> >'switch_to_new_gdt' uses the per-cpu GDT table.
>
> load_percpu_segment() is part of switch_to_new_gdt(), so I thought there is
> no need to call it here.
>
> But you are right --- switch_to_new_gdt() starts with get_cpu_gdt_table()
> which accesses per-CPU area. How did this manage to work then?

I was surprised as well - I was expecting your patch to blow up.
Unless we are doing something fancy for CPU0, and for the other CPUs we
already have the per-cpu segment set up during bootup (copied from the BSP)?

>
> >
> >>+ setup_stack_canary_segment(cpu);
> >As this one too ?
>
> This one I added. On 64-bit it's a nop so we never had problems without
> having this call. On 32-bit, if CC_STACKPROTECTOR is set, we will fail
> without setting this up.

Right, but it uses the per-cpu area, so I would have expected it to crash
(since you did not do load_percpu_segment). But maybe it did load, just
from some unknown area? :-)
>
> -boris
>
> >> switch_to_new_gdt(cpu); /* GDT and GS set */
> >>+#ifdef CONFIG_X86_64
> >> /* We are switching of the Xen provided GDT to our HVM mode
> >> * GDT. The new GDT has __KERNEL_CS with CS.L = 1
> >> * and we are jumping to reload it.
> >>@@ -1392,8 +1392,13 @@ static void __ref xen_setup_gdt(int cpu)
> >> loadsegment(ds, 0);
> >> loadsegment(fs, 0);
> >> #else
> >>- /* PVH: TODO Implement. */
> >>- BUG();
> >>+ asm volatile ("ljmp %0,$1f\n"
> >>+ "1:\n"
> >>+ :: "i" (__KERNEL_CS));
> >>+
> >>+ loadsegment(ss, __KERNEL_DS);
> >>+ loadsegment(ds, __USER_DS);
> >>+ loadsegment(es, __USER_DS);
> >> #endif
> >> return; /* PVH does not need any PV GDT ops. */
> >> }
> >>--
> >>1.8.1.4
> >>
>

2015-07-17 17:53:37

by Boris Ostrovsky

Subject: Re: [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

On 07/17/2015 12:43 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 17, 2015 at 11:36:29AM -0400, Boris Ostrovsky wrote:
>> On 07/17/2015 11:21 AM, Konrad Rzeszutek Wilk wrote:
>>> On Thu, Jul 16, 2015 at 05:43:39PM -0400, Boris Ostrovsky wrote:
>>>> Signed-off-by: Boris Ostrovsky <[email protected]>
>>>> ---
>>>> Changes in v2:
>>>> * Set segment selectors using loadsegment() instead of assembly
>>>>
>>>> arch/x86/xen/enlighten.c | 15 ++++++++++-----
>>>> 1 file changed, 10 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>> index f8dc398..d665b1d 100644
>>>> --- a/arch/x86/xen/enlighten.c
>>>> +++ b/arch/x86/xen/enlighten.c
>>>> @@ -1362,12 +1362,12 @@ static void __init xen_boot_params_init_edd(void)
>>>> static void __ref xen_setup_gdt(int cpu)
>>>> {
>>>> if (xen_feature(XENFEAT_auto_translated_physmap)) {
>>>> -#ifdef CONFIG_X86_64
>>>> - unsigned long dummy;
>>>> + unsigned long __attribute__((unused)) dummy;
>>>> - load_percpu_segment(cpu); /* We need to access per-cpu area */
>>> You removed that - where are we going to do that? As the
>>> 'switch_to_new_gdt' uses the per-cpu GDT table.
>> load_percpu_segment() is part of switch_to_new_gdt(), so I thought there is
>> no need to call it here.
>>
>> But you are right --- switch_to_new_gdt() starts with get_cpu_gdt_table()
>> which accesses per-CPU area. How did this manage to work then?
> I was surprised as well - I was expecting your patch to have blow up.
> Unless we are doing something fancy for CPU0 and for the other CPUs we
> already have the per-cpu segment setup during bootup (copied from BSP)?


No, %fs is zero when we enter xen_setup_gdt() (for 32-bit).

In any case, I should put load_percpu_segment() back.

-boris

2015-07-21 15:33:25

by Boris Ostrovsky

Subject: Re: [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

On 07/17/2015 01:52 PM, Boris Ostrovsky wrote:
> On 07/17/2015 12:43 PM, Konrad Rzeszutek Wilk wrote:
>> On Fri, Jul 17, 2015 at 11:36:29AM -0400, Boris Ostrovsky wrote:
>>> On 07/17/2015 11:21 AM, Konrad Rzeszutek Wilk wrote:
>>>> On Thu, Jul 16, 2015 at 05:43:39PM -0400, Boris Ostrovsky wrote:
>>>>> Signed-off-by: Boris Ostrovsky <[email protected]>
>>>>> ---
>>>>> Changes in v2:
>>>>> * Set segment selectors using loadsegment() instead of assembly
>>>>>
>>>>> arch/x86/xen/enlighten.c | 15 ++++++++++-----
>>>>> 1 file changed, 10 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>> index f8dc398..d665b1d 100644
>>>>> --- a/arch/x86/xen/enlighten.c
>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>> @@ -1362,12 +1362,12 @@ static void __init
>>>>> xen_boot_params_init_edd(void)
>>>>> static void __ref xen_setup_gdt(int cpu)
>>>>> {
>>>>> if (xen_feature(XENFEAT_auto_translated_physmap)) {
>>>>> -#ifdef CONFIG_X86_64
>>>>> - unsigned long dummy;
>>>>> + unsigned long __attribute__((unused)) dummy;
>>>>> - load_percpu_segment(cpu); /* We need to access per-cpu
>>>>> area */
>>>> You removed that - where are we going to do that? As the
>>>> 'switch_to_new_gdt' uses the per-cpu GDT table.
>>> load_percpu_segment() is part of switch_to_new_gdt(), so I thought
>>> there is
>>> no need to call it here.
>>>
>>> But you are right --- switch_to_new_gdt() starts with
>>> get_cpu_gdt_table()
>>> which accesses per-CPU area. How did this manage to work then?
>> I was surprised as well - I was expecting your patch to have blow up.
>> Unless we are doing something fancy for CPU0 and for the other CPUs we
>> already have the per-cpu segment setup during bootup (copied from BSP)?
>
>
> No, %fs is zero when we enter xen_setup_gdt() (for 32-bit).
>
> In any case, I should put load_percpu_segment() back.


No, I shouldn't.

Until the new GDT is loaded we can't load selectors, since the current
GDT doesn't have descriptors set up for them, and so any attempt to load
an uninitialized selector results in a fault.

This worked for 64-bit guests because there we load zero into %gs, and
that is allowed (the processor doesn't perform descriptor checks for the
first 4 selector values). But for 32-bit guests we load %fs with 0xd8.

And the reason the code worked before is that we were using the "master"
per-cpu area, and the GDT is the same for all CPUs at that point. Or
so I think.


-boris