2012-10-30 15:47:49

by Olaf Hering

[permalink] [raw]
Subject: [PATCH v4] xen PVonHVM: move shared_info to reserved memory area

This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3
("xen PVonHVM: move shared_info to MMIO before kexec").

Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into an
E820 reserved memory area.

The pfn containing the shared_info is located somewhere in RAM. This will
cause trouble if the current kernel is doing a kexec boot into a new
kernel. The new kernel (and its startup code) can not know where the pfn
is, so it can not reserve the page. The hypervisor will continue to update
the pfn, and as a result memory corruption occours in the new kernel.

The toolstack marks the memory area FC000000-FFFFFFFF as reserved in the
E820 map. Within that range newer toolstacks will keep 1MB starting from
FE700000 as reserved for guest use. Older toolstacks will usually not
allocate areas up to FE700000, so FE700000 is expected to work also with
older toolstacks.

Signed-off-by: Olaf Hering <[email protected]>
---
v4:
rebase to 3.7-rc3
v3:
make xen_hvm_init_shared_info static
v2:
add __init to xen_hvm_set_shared_info

arch/x86/xen/enlighten.c | 46 ++++++++++++++++++++++++++++++----------------
arch/x86/xen/suspend.c | 2 +-
arch/x86/xen/xen-ops.h | 2 +-
3 files changed, 32 insertions(+), 18 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 586d838..0cc41f8 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -101,7 +101,6 @@ struct shared_info xen_dummy_shared_info;

void *xen_initial_gdt;

-RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
__read_mostly int xen_have_vector_callback;
EXPORT_SYMBOL_GPL(xen_have_vector_callback);

@@ -1495,38 +1494,53 @@ asmlinkage void __init xen_start_kernel(void)
#endif
}

-void __ref xen_hvm_init_shared_info(void)
+#ifdef CONFIG_XEN_PVHVM
+#define HVM_SHARED_INFO_ADDR 0xFE700000UL
+static struct shared_info *xen_hvm_shared_info;
+
+static void xen_hvm_connect_shared_info(unsigned long pfn)
{
- int cpu;
struct xen_add_to_physmap xatp;
- static struct shared_info *shared_info_page = 0;

- if (!shared_info_page)
- shared_info_page = (struct shared_info *)
- extend_brk(PAGE_SIZE, PAGE_SIZE);
xatp.domid = DOMID_SELF;
xatp.idx = 0;
xatp.space = XENMAPSPACE_shared_info;
- xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+ xatp.gpfn = pfn;
if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
BUG();

- HYPERVISOR_shared_info = (struct shared_info *)shared_info_page;
+}
+static void __init xen_hvm_set_shared_info(struct shared_info *sip)
+{
+ int cpu;
+
+ HYPERVISOR_shared_info = sip;

/* xen_vcpu is a pointer to the vcpu_info struct in the shared_info
* page, we use it in the event channel upcall and in some pvclock
* related functions. We don't need the vcpu_info placement
* optimizations because we don't use any pv_mmu or pv_irq op on
- * HVM.
- * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is
- * online but xen_hvm_init_shared_info is run at resume time too and
- * in that case multiple vcpus might be online. */
- for_each_online_cpu(cpu) {
+ * HVM. */
+ for_each_online_cpu(cpu)
per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
- }
}

-#ifdef CONFIG_XEN_PVHVM
+/* Reconnect the shared_info pfn to a (new) mfn */
+void xen_hvm_resume_shared_info(void)
+{
+ xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT);
+}
+
+/* Obtain an address to the pfn in reserved area */
+static void __init xen_hvm_init_shared_info(void)
+{
+ set_fixmap(FIX_PARAVIRT_BOOTMAP, HVM_SHARED_INFO_ADDR);
+ xen_hvm_shared_info =
+ (struct shared_info *)fix_to_virt(FIX_PARAVIRT_BOOTMAP);
+ xen_hvm_connect_shared_info(HVM_SHARED_INFO_ADDR >> PAGE_SHIFT);
+ xen_hvm_set_shared_info(xen_hvm_shared_info);
+}
+
static void __init init_hvm_pv_info(void)
{
int major, minor;
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 45329c8..ae8a00c 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -30,7 +30,7 @@ void xen_arch_hvm_post_suspend(int suspend_cancelled)
{
#ifdef CONFIG_XEN_PVHVM
int cpu;
- xen_hvm_init_shared_info();
+ xen_hvm_resume_shared_info();
xen_callback_vector();
xen_unplug_emulated_devices();
if (xen_feature(XENFEAT_hvm_safe_pvclock)) {
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index a95b417..d2e73d1 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -40,7 +40,7 @@ void xen_enable_syscall(void);
void xen_vcpu_restore(void);

void xen_callback_vector(void);
-void xen_hvm_init_shared_info(void);
+void xen_hvm_resume_shared_info(void);
void xen_unplug_emulated_devices(void);

void __init xen_build_dynamic_phys_to_machine(void);
--
1.8.0


2012-10-30 16:14:09

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4] xen PVonHVM: move shared_info to reserved memory area

>>> On 30.10.12 at 16:47, Olaf Hering <[email protected]> wrote:
> This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3
> ("xen PVonHVM: move shared_info to MMIO before kexec").
>
> Currently kexec in a PVonHVM guest fails with a triple fault because the
> new kernel overwrites the shared info page. The exact failure depends on
> the size of the kernel image. This patch moves the pfn from RAM into an
> E820 reserved memory area.

One thing that occurred to me only now: How is this relocation
of the shared info going to help with the vCPU info placement?
You can't undo this, nor can you re-register these areas to be
put in a different location (of course, both of there could be
implemented in the hypervisor). Yet the hypervisor writes to
some of these areas' fields as much as it does write to the
shared info structure itself.

Jan

2012-10-30 16:30:11

by Olaf Hering

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4] xen PVonHVM: move shared_info to reserved memory area

On Tue, Oct 30, Jan Beulich wrote:

> >>> On 30.10.12 at 16:47, Olaf Hering <[email protected]> wrote:
> > This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3
> > ("xen PVonHVM: move shared_info to MMIO before kexec").
> >
> > Currently kexec in a PVonHVM guest fails with a triple fault because the
> > new kernel overwrites the shared info page. The exact failure depends on
> > the size of the kernel image. This patch moves the pfn from RAM into an
> > E820 reserved memory area.
>
> One thing that occurred to me only now: How is this relocation
> of the shared info going to help with the vCPU info placement?
> You can't undo this, nor can you re-register these areas to be
> put in a different location (of course, both of there could be
> implemented in the hypervisor). Yet the hypervisor writes to
> some of these areas' fields as much as it does write to the
> shared info structure itself.

Maybe the wording "move" is a bit misleading in this patch.

A single "move" of the actual pfn happens during boot, that is when a
PVonHVM enabled guest kernel does the XENMAPSPACE_shared_info operation.
It moves the pfn of the shared info page from the location the hvmloader
initially configured to this new pfn (0xfffff -> 0xfe700).
Another relocation does not happen at runtime, AFAIK.

The "move" which this patch does is more a source level move in the
sense that RESERVE_BRK (which is somewhere in the middle of RAM) is not
used anymore. Instead a pfn in an E820_Reserved area is used, see
xen-unstable changeset 26108:79185dcdf558 "hvmloader: Reserve
FE700000-FE800000 in physical memory map for guest use."


Olaf

2012-10-30 16:44:36

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4] xen PVonHVM: move shared_info to reserved memory area

>>> On 30.10.12 at 17:30, Olaf Hering <[email protected]> wrote:
> On Tue, Oct 30, Jan Beulich wrote:
>
>> >>> On 30.10.12 at 16:47, Olaf Hering <[email protected]> wrote:
>> > This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3
>> > ("xen PVonHVM: move shared_info to MMIO before kexec").
>> >
>> > Currently kexec in a PVonHVM guest fails with a triple fault because the
>> > new kernel overwrites the shared info page. The exact failure depends on
>> > the size of the kernel image. This patch moves the pfn from RAM into an
>> > E820 reserved memory area.
>>
>> One thing that occurred to me only now: How is this relocation
>> of the shared info going to help with the vCPU info placement?
>> You can't undo this, nor can you re-register these areas to be
>> put in a different location (of course, both of there could be
>> implemented in the hypervisor). Yet the hypervisor writes to
>> some of these areas' fields as much as it does write to the
>> shared info structure itself.
>
> Maybe the wording "move" is a bit misleading in this patch.
>
> A single "move" of the actual pfn happens during boot, that is when a
> PVonHVM enabled guest kernel does the XENMAPSPACE_shared_info operation.
> It moves the pfn of the shared info page from the location the hvmloader
> initially configured to this new pfn (0xfffff -> 0xfe700).
> Another relocation does not happen at runtime, AFAIK.
>
> The "move" which this patch does is more a source level move in the
> sense that RESERVE_BRK (which is somewhere in the middle of RAM) is not
> used anymore. Instead a pfn in an E820_Reserved area is used, see
> xen-unstable changeset 26108:79185dcdf558 "hvmloader: Reserve
> FE700000-FE800000 in physical memory map for guest use."

And iirc you're doing this relocation because otherwise the newly
booting kernel image may get overwritten at an (from its
perspective) arbitrary location. What I'm trying to point out is
that the shared info structure is not the only thing (potentially)
inside the kernel image that might get overwritten - the relocated
(by the old kernel) vCPU info may as well. Plus I think the new
kernel has no way of relocating it a second time (including back
into the shared info structure) with how the hypervisor works at
present.

Jan

2012-10-30 17:03:28

by Olaf Hering

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4] xen PVonHVM: move shared_info to reserved memory area

On Tue, Oct 30, Jan Beulich wrote:

> And iirc you're doing this relocation because otherwise the newly
> booting kernel image may get overwritten at an (from its
> perspective) arbitrary location. What I'm trying to point out is
> that the shared info structure is not the only thing (potentially)
> inside the kernel image that might get overwritten - the relocated
> (by the old kernel) vCPU info may as well. Plus I think the new
> kernel has no way of relocating it a second time (including back
> into the shared info structure) with how the hypervisor works at
> present.

Thanks, I have not looked at this, nor was I aware of it until now.
I will see what needs to be done.

Is this also an issue with the xenlinux based PVonHVM code?


Olaf

2012-10-31 08:02:17

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4] xen PVonHVM: move shared_info to reserved memory area

>>> On 30.10.12 at 18:03, Olaf Hering <[email protected]> wrote:
> On Tue, Oct 30, Jan Beulich wrote:
>
>> And iirc you're doing this relocation because otherwise the newly
>> booting kernel image may get overwritten at an (from its
>> perspective) arbitrary location. What I'm trying to point out is
>> that the shared info structure is not the only thing (potentially)
>> inside the kernel image that might get overwritten - the relocated
>> (by the old kernel) vCPU info may as well. Plus I think the new
>> kernel has no way of relocating it a second time (including back
>> into the shared info structure) with how the hypervisor works at
>> present.
>
> Thanks, I have not looked at this, nor was I aware of it until now.
> I will see what needs to be done.
>
> Is this also an issue with the xenlinux based PVonHVM code?

No - this code doesn't make use of that feature (yet, we've got
a feature request that would involve doing so), as it only ever
touches vCPU 0's info.

Jan