Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754167AbaGOPHW (ORCPT ); Tue, 15 Jul 2014 11:07:22 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:30321 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754023AbaGOPHF (ORCPT ); Tue, 15 Jul 2014 11:07:05 -0400 Date: Tue, 15 Jul 2014 11:06:52 -0400 From: Konrad Rzeszutek Wilk To: Vitaly Kuznetsov Cc: xen-devel@lists.xenproject.org, Boris Ostrovsky , David Vrabel , Stefano Stabellini , Andrew Jones , linux-kernel@vger.kernel.org Subject: Re: [PATCH RFC 1/4] xen PVonHVM: use E820_Reserved area for shared_info Message-ID: <20140715150652.GN3403@laptop.dumpdata.com> References: <1405431640-649-1-git-send-email-vkuznets@redhat.com> <1405431640-649-2-git-send-email-vkuznets@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1405431640-649-2-git-send-email-vkuznets@redhat.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-Source-IP: ucsinet21.oracle.com [156.151.31.93] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jul 15, 2014 at 03:40:37PM +0200, Vitaly Kuznetsov wrote: > From: Olaf Hering > > This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3 > ("xen PVonHVM: move shared_info to MMIO before kexec"). > > Currently kexec in a PVonHVM guest fails with a triple fault because the > new kernel overwrites the shared info page. The exact failure depends on > the size of the kernel image. This patch moves the pfn from RAM into an > E820 reserved memory area. > > The pfn containing the shared_info is located somewhere in RAM. This will > cause trouble if the current kernel is doing a kexec boot into a new > kernel. The new kernel (and its startup code) can not know where the pfn > is, so it can not reserve the page. The hypervisor will continue to update > the pfn, and as a result memory corruption occours in the new kernel. > > The toolstack marks the memory area FC000000-FFFFFFFF as reserved in the > E820 map. Within that range newer toolstacks (4.3+) will keep 1MB > starting from FE700000 as reserved for guest use. Older Xen4 toolstacks > will usually not allocate areas up to FE700000, so FE700000 is expected > to work also with older toolstacks. > > In Xen3 there is no reserved area at a fixed location. If the guest is > started on such old hosts the shared_info page will be placed in RAM. As > a result kexec can not be used. So this looks right, the one thing that we really need to check is e9daff24a266307943457086533041bd971d0ef9 This reverts commit 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f. We are doing this b/c on 32-bit PVonHVM with older hypervisors (Xen 4.1) it ends up bothing up the start_info. This is bad b/c we use it for the time keeping, and the timekeeping code loops forever - as the version field never changes. Olaf says to revert it, so lets do that. Could you kindly test that the migration on 32-bit PVHVM guests on older hypervisors works? > > Signed-off-by: Olaf Hering > Signed-off-by: Konrad Rzeszutek Wilk > (cherry picked from commit 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f) > > [On resume we need to reset the xen_vcpu_info, which the original > patch did not do] > > Signed-off-by: Vitaly Kuznetsov > --- > arch/x86/xen/enlighten.c | 74 ++++++++++++++++++++++++++++++++++++------------ > arch/x86/xen/suspend.c | 2 +- > arch/x86/xen/xen-ops.h | 2 +- > 3 files changed, 58 insertions(+), 20 deletions(-) > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index ffb101e..a11af62 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -1726,23 +1726,29 @@ asmlinkage __visible void __init xen_start_kernel(void) > #endif > } > > -void __ref xen_hvm_init_shared_info(void) > +#ifdef CONFIG_XEN_PVHVM > +#define HVM_SHARED_INFO_ADDR 0xFE700000UL > +static struct shared_info *xen_hvm_shared_info; > +static unsigned long xen_hvm_sip_phys; > +static int xen_major, xen_minor; > + > +static void xen_hvm_connect_shared_info(unsigned long pfn) > { > - int cpu; > struct xen_add_to_physmap xatp; > - static struct shared_info *shared_info_page = 0; > > - if (!shared_info_page) > - shared_info_page = (struct shared_info *) > - extend_brk(PAGE_SIZE, PAGE_SIZE); > xatp.domid = DOMID_SELF; > xatp.idx = 0; > xatp.space = XENMAPSPACE_shared_info; > - xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT; > + xatp.gpfn = pfn; > if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) > BUG(); > > - HYPERVISOR_shared_info = (struct shared_info *)shared_info_page; > +} > +static void __init xen_hvm_set_shared_info(struct shared_info *sip) > +{ > + int cpu; > + > + HYPERVISOR_shared_info = sip; > > /* xen_vcpu is a pointer to the vcpu_info struct in the shared_info > * page, we use it in the event channel upcall and in some pvclock > @@ -1760,20 +1766,39 @@ void __ref xen_hvm_init_shared_info(void) > } > } > > -#ifdef CONFIG_XEN_PVHVM > +/* Reconnect the shared_info pfn to a (new) mfn */ > +void xen_hvm_resume_shared_info(void) > +{ > + xen_hvm_connect_shared_info(xen_hvm_sip_phys >> PAGE_SHIFT); > + xen_hvm_set_shared_info(xen_hvm_shared_info); > +} > + > +/* Xen tools prior to Xen 4 do not provide a E820_Reserved area for guest usage. > + * On these old tools the shared info page will be placed in E820_Ram. > + * Xen 4 provides a E820_Reserved area at 0xFC000000, and this code expects > + * that nothing is mapped up to HVM_SHARED_INFO_ADDR. > + * Xen 4.3+ provides an explicit 1MB area at HVM_SHARED_INFO_ADDR which is used > + * here for the shared info page. */ > +static void __init xen_hvm_init_shared_info(void) > +{ > + if (xen_major < 4) { > + xen_hvm_shared_info = extend_brk(PAGE_SIZE, PAGE_SIZE); > + xen_hvm_sip_phys = __pa(xen_hvm_shared_info); > + } else { > + xen_hvm_sip_phys = HVM_SHARED_INFO_ADDR; > + set_fixmap(FIX_PARAVIRT_BOOTMAP, xen_hvm_sip_phys); > + xen_hvm_shared_info = > + (struct shared_info *)fix_to_virt(FIX_PARAVIRT_BOOTMAP); > + } > + xen_hvm_resume_shared_info(); > +} > + > static void __init init_hvm_pv_info(void) > { > - int major, minor; > - uint32_t eax, ebx, ecx, edx, pages, msr, base; > + uint32_t ecx, edx, pages, msr, base; > u64 pfn; > > base = xen_cpuid_base(); > - cpuid(base + 1, &eax, &ebx, &ecx, &edx); > - > - major = eax >> 16; > - minor = eax & 0xffff; > - printk(KERN_INFO "Xen version %d.%d.\n", major, minor); > - > cpuid(base + 2, &pages, &msr, &ecx, &edx); > > pfn = __pa(hypercall_page); > @@ -1828,10 +1853,23 @@ static void __init xen_hvm_guest_init(void) > > static uint32_t __init xen_hvm_platform(void) > { > + uint32_t eax, ebx, ecx, edx, base; > + > if (xen_pv_domain()) > return 0; > > - return xen_cpuid_base(); > + base = xen_cpuid_base(); > + if (!base) > + return 0; > + > + cpuid(base + 1, &eax, &ebx, &ecx, &edx); > + > + xen_major = eax >> 16; > + xen_minor = eax & 0xffff; > + > + printk(KERN_INFO "Xen version %d.%d.\n", xen_major, xen_minor); > + > + return 1; > } > > bool xen_hvm_need_lapic(void) > diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c > index c4df9db..8d6793e 100644 > --- a/arch/x86/xen/suspend.c > +++ b/arch/x86/xen/suspend.c > @@ -32,7 +32,7 @@ static void xen_hvm_post_suspend(int suspend_cancelled) > { > #ifdef CONFIG_XEN_PVHVM > int cpu; > - xen_hvm_init_shared_info(); > + xen_hvm_resume_shared_info(); > xen_callback_vector(); > xen_unplug_emulated_devices(); > if (xen_feature(XENFEAT_hvm_safe_pvclock)) { > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h > index 97d8765..d083e82 100644 > --- a/arch/x86/xen/xen-ops.h > +++ b/arch/x86/xen/xen-ops.h > @@ -43,7 +43,7 @@ void xen_enable_syscall(void); > void xen_vcpu_restore(void); > > void xen_callback_vector(void); > -void xen_hvm_init_shared_info(void); > +void xen_hvm_resume_shared_info(void); > void xen_unplug_emulated_devices(void); > > void __init xen_build_dynamic_phys_to_machine(void); > -- > 1.9.3 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/