From: Maran Wilson
To: boris.ostrovsky@oracle.com, jgross@suse.com, tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org, xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org, roger.pau@citrix.com,
    rkrcmar@redhat.com, JBeulich@suse.com, andrew.cooper3@citrix.com, pbonzini@redhat.com, kvm@vger.kernel.org
Subject: [RFC PATCH] KVM: x86: Allow Qemu/KVM to use PVH entry point
Date: Tue, 28 Nov 2017 11:34:42 -0800
Message-Id: <1511897682-32060-1-git-send-email-maran.wilson@oracle.com>
X-Mailer: git-send-email 1.8.3.1
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

For certain applications it is desirable to rapidly boot a KVM virtual
machine. In cases where legacy hardware and software support within the
guest is not needed, Qemu should be able to boot directly into the
uncompressed Linux kernel binary without the need to run firmware.

There already exists an ABI to allow this for Xen PVH guests and the ABI
is supported by Linux and FreeBSD:

   https://xenbits.xen.org/docs/unstable/misc/hvmlite.html

This PoC patch enables Qemu to use that same entry point for booting KVM
guests.

Even though the code is still PoC quality, I'm sending this as an RFC now
since there are a number of different ways the specific implementation
details can be handled. I chose a shared code path for Xen and KVM guests
but could just as easily create a separate code path that is advertised
by a different ELF note for KVM. There also seems to be some flexibility
in how the e820 table data is passed and how (or if) it should be
identified as e820 data. As a starting point, I've chosen the options
that seem to result in the smallest patch with minimal to no changes
required of the x86/HVM direct boot ABI.
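To make the e820 option concrete, here is a minimal user-space sketch of the lookup the patch performs: the second boot module is treated as e820 data only when its command line matches the "e820 data" signature exactly. All `model_*` names, `MODEL_MAX_E820`, and the use of plain pointers in place of physical addresses are assumptions made for illustration. The clamp to the destination capacity is an addition of the sketch and is not present in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define E820_SIG       "e820 data" /* signature string used in the patch */
#define MODEL_MAX_E820 128          /* illustrative capacity (assumption) */

/* Hypothetical user-space stand-ins for the kernel structures. */
struct model_e820_entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

struct model_e820_table {
	uint32_t nr_entries;
	struct model_e820_entry entries[MODEL_MAX_E820];
};

/* Pointers replace the physical addresses held in hvm_modlist_entry. */
struct model_module {
	const void *data;    /* stands in for paddr */
	const char *cmdline; /* stands in for cmdline_paddr; may be NULL */
};

/*
 * Copy the e820 entries carried by the second module into dst.
 * Returns the number of entries copied, or 0 when there is no second
 * module or its command line does not match the signature exactly.
 * Clamping to dst_cap is an addition of this sketch.
 */
size_t model_copy_e820(const struct model_module *mods, size_t nr_mods,
		       struct model_e820_entry *dst, size_t dst_cap)
{
	const struct model_e820_table *tp;
	size_t n, i;

	assert(dst != NULL);

	if (nr_mods < 2 || mods[1].cmdline == NULL || mods[1].data == NULL)
		return 0;
	if (strcmp(mods[1].cmdline, E820_SIG) != 0)
		return 0;

	tp = mods[1].data;
	n = tp->nr_entries < dst_cap ? tp->nr_entries : dst_cap;
	for (i = 0; i < n; i++)
		dst[i] = tp->entries[i];
	return n;
}
```

A caller would pass the module list and a destination array sized like the zero-page e820 table. A return value of 0 means no e820 data was found, which corresponds to the patch leaving e820_entries at zero.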
---
 arch/x86/xen/enlighten_pvh.c | 74 ++++++++++++++++++++++++++++++++------------
 1 file changed, 55 insertions(+), 19 deletions(-)

diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 98ab176..d93f711 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -31,21 +31,46 @@ static void xen_pvh_arch_setup(void)
 		acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
 }
 
-static void __init init_pvh_bootparams(void)
+static void __init init_pvh_bootparams(bool xen_guest)
 {
 	struct xen_memory_map memmap;
 	int rc;
 
 	memset(&pvh_bootparams, 0, sizeof(pvh_bootparams));
 
-	memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
-	set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
-	rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
-	if (rc) {
-		xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
-		BUG();
+	if (xen_guest) {
+		memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
+		set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
+		rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
+		if (rc) {
+			xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
+			BUG();
+		}
+		pvh_bootparams.e820_entries = memmap.nr_entries;
+	} else if (pvh_start_info.nr_modules > 1) {
+		/* The second module should be the e820 data for KVM guests */
+		struct hvm_modlist_entry *modaddr;
+		char e820_sig[] = "e820 data";
+		struct boot_e820_entry *ep;
+		struct e820_table *tp;
+		char *cmdline_str;
+		int idx;
+
+		modaddr = __va(pvh_start_info.modlist_paddr +
+			       sizeof(struct hvm_modlist_entry));
+		cmdline_str = __va(modaddr->cmdline_paddr);
+
+		if ((modaddr->cmdline_paddr) &&
+		    (!strncmp(e820_sig, cmdline_str, sizeof(e820_sig)))) {
+			tp = __va(modaddr->paddr);
+			ep = (struct boot_e820_entry *)tp->entries;
+
+			pvh_bootparams.e820_entries = tp->nr_entries;
+
+			for (idx = 0; idx < tp->nr_entries ; idx++, ep++)
+				pvh_bootparams.e820_table[idx] = *ep;
+		}
 	}
-	pvh_bootparams.e820_entries = memmap.nr_entries;
 
 	if (pvh_bootparams.e820_entries < E820_MAX_ENTRIES_ZEROPAGE - 1) {
 		pvh_bootparams.e820_table[pvh_bootparams.e820_entries].addr =
@@ -55,8 +80,9 @@ static void __init init_pvh_bootparams(void)
 		pvh_bootparams.e820_table[pvh_bootparams.e820_entries].type =
 			E820_TYPE_RESERVED;
 		pvh_bootparams.e820_entries++;
-	} else
+	} else if (xen_guest) {
 		xen_raw_printk("Warning: Can fit ISA range into e820\n");
+	}
 
 	pvh_bootparams.hdr.cmd_line_ptr =
 		pvh_start_info.cmdline_paddr;
@@ -76,7 +102,7 @@ static void __init init_pvh_bootparams(void)
 	 * environment (i.e. hardware_subarch 0).
 	 */
 	pvh_bootparams.hdr.version = 0x212;
-	pvh_bootparams.hdr.type_of_loader = (9 << 4) | 0; /* Xen loader */
+	pvh_bootparams.hdr.type_of_loader = ((xen_guest ? 0x9 : 0xb) << 4) | 0;
 }
 
 /*
@@ -85,22 +111,32 @@ static void __init init_pvh_bootparams(void)
  */
 void __init xen_prepare_pvh(void)
 {
-	u32 msr;
+
+	u32 msr = xen_cpuid_base();
 	u64 pfn;
+	bool xen_guest = msr ? true : false;
 
 	if (pvh_start_info.magic != XEN_HVM_START_MAGIC_VALUE) {
-		xen_raw_printk("Error: Unexpected magic value (0x%08x)\n",
-				pvh_start_info.magic);
+		if (xen_guest)
+			xen_raw_printk("Error: Unexpected magic value (0x%08x)\n",
+					pvh_start_info.magic);
 		BUG();
 	}
 
-	xen_pvh = 1;
+	if (xen_guest) {
+		xen_pvh = 1;
+
+		msr = cpuid_ebx(msr + 2);
+		pfn = __pa(hypercall_page);
+		wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
+
+	} else if (!hypervisor_cpuid_base("KVMKVMKVM\0\0\0", 0)) {
+		BUG();
+	}
 
-	msr = cpuid_ebx(xen_cpuid_base() + 2);
-	pfn = __pa(hypercall_page);
-	wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
+	init_pvh_bootparams(xen_guest);
 
-	init_pvh_bootparams();
+	if (xen_guest)
+		x86_init.oem.arch_setup = xen_pvh_arch_setup;
 
-	x86_init.oem.arch_setup = xen_pvh_arch_setup;
 }
-- 
1.8.3.1
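As a side note on the type_of_loader change in the patch: the field packs a 4-bit boot loader ID into the upper nibble and a 4-bit version into the lower nibble, with 0x9 assigned to Xen and 0xB to Qemu in the x86 boot protocol. The sketch below shows that packing in isolation. The helper and enum names are hypothetical and do not appear in the kernel.

```c
#include <stdint.h>

/* Loader IDs as listed in the x86 boot protocol's type_of_loader table. */
enum {
	LOADER_ID_XEN  = 0x9,
	LOADER_ID_QEMU = 0xb,
};

/* Pack a 4-bit loader ID and a 4-bit version into one byte. */
uint8_t make_type_of_loader(unsigned int id, unsigned int version)
{
	return (uint8_t)(((id & 0xfu) << 4) | (version & 0xfu));
}

/* Mirror the patch's choice: Xen ID for Xen guests, Qemu ID otherwise. */
uint8_t pvh_type_of_loader(int xen_guest)
{
	return make_type_of_loader(xen_guest ? LOADER_ID_XEN : LOADER_ID_QEMU, 0);
}
```

With version 0, a Xen guest gets 0x90 and a KVM guest booted by Qemu gets 0xb0, matching the expression `((xen_guest ? 0x9 : 0xb) << 4) | 0` in the diff.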