2017-12-12 23:43:37

Subject: [RFC PATCH v3 0/2] KVM: x86: Allow Qemu/KVM to use PVH entry point

Changes from v2:

* All structures (including memory map table entries) are padded and
aligned to an 8 byte boundary.

* Removed the "packed" attributes and made changes to comments as
suggested by Jan.

Changes from v1:

* Adopted Paolo's suggestion for defining a v2 PVH ABI that includes the
e820 map instead of using the second module entry to pass the table.

* Cleaned things up a bit to reduce the number of xen vs non-xen special
cases.

Maran Wilson (2):
xen/pvh: Add memory map pointer to hvm_start_info struct
KVM: x86: Allow Qemu/KVM to use PVH entry point

arch/x86/xen/enlighten_pvh.c | 51 ++++++++++++++++++++++++++++++++++++---------------
include/xen/interface/hvm/start_info.h | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 85 insertions(+), 16 deletions(-)

2017-12-12 23:44:34

by Maran Wilson

Subject: [RFC PATCH v3 1/2] xen/pvh: Add memory map pointer to hvm_start_info struct

The start info structure that is defined as part of the x86/HVM direct
boot ABI and used for starting Xen PVH guests would be more versatile if
it also included a way to pass information about the memory map to the
guest. This would allow KVM guests to share the same entry point.
---
include/xen/interface/hvm/start_info.h | 50 +++++++++++++++++++++++++++++++++-
1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/xen/interface/hvm/start_info.h b/include/xen/interface/hvm/start_info.h
index 6484159..80cfbd3 100644
--- a/include/xen/interface/hvm/start_info.h
+++ b/include/xen/interface/hvm/start_info.h
@@ -33,7 +33,7 @@
* | magic | Contains the magic value XEN_HVM_START_MAGIC_VALUE
* | | ("xEn3" with the 0x80 bit of the "E" set).
* 4 +----------------+
- * | version | Version of this structure. Current version is 0. New
+ * | version | Version of this structure. Current version is 1. New
* | | versions are guaranteed to be backwards-compatible.
* 8 +----------------+
* | flags | SIF_xxx flags.
@@ -48,6 +48,15 @@
* 32 +----------------+
* | rsdp_paddr | Physical address of the RSDP ACPI data structure.
* 40 +----------------+
+ * | memmap_paddr | Physical address of the (optional) memory map. Only
+ * | | present in version 1 and newer of the structure.
+ * 48 +----------------+
+ * | memmap_entries | Number of entries in the memory map table. Only
+ * | | present in version 1 and newer of the structure.
+ * | | Zero if there is no memory map being provided.
+ * 52 +----------------+
+ * | reserved | Version 1 and newer only.
+ * 56 +----------------+
*
* The layout of each entry in the module structure is the following:
*
@@ -62,10 +71,34 @@
* | reserved |
* 32 +----------------+
*
+ * The layout of each entry in the memory map table is as follows:
+ *
+ * 0 +----------------+
+ * | addr | Base address
+ * 8 +----------------+
+ * | size | Size of mapping in bytes
+ * 16 +----------------+
+ * | type | Type of mapping as defined between the hypervisor
+ * | | and guest it's starting. E820_TYPE_xxx, for example.
+ * 20 +----------------+
+ * | reserved |
+ * 24 +----------------+
+ *
* The address and sizes are always a 64bit little endian unsigned integer.
*
* NB: Xen on x86 will always try to place all the data below the 4GiB
* boundary.
+ *
+ * Version numbers of the hvm_start_info structure have evolved like this:
+ *
+ * Version 0:
+ *
+ * Version 1: Added the memmap_paddr/memmap_entries fields (plus 4 bytes of
+ * padding) to the end of the hvm_start_info struct. These new
+ * fields can be used to pass a memory map to the guest. The
+ * memory map is optional and so guests that understand version 1
+ * of the structure must check that memmap_entries is non-zero
+ * before trying to read the memory map.
*/
#define XEN_HVM_START_MAGIC_VALUE 0x336ec578

@@ -86,6 +119,14 @@ struct hvm_start_info {
uint64_t cmdline_paddr; /* Physical address of the command line. */
uint64_t rsdp_paddr; /* Physical address of the RSDP ACPI data */
/* structure. */
+ uint64_t memmap_paddr; /* Physical address of an array of */
+ /* hvm_memmap_table_entry. Only present in */
+ /* version 1 and newer of the structure */
+ uint32_t memmap_entries; /* Number of entries in the memmap table. */
+ /* Only present in version 1 and newer of */
+ /* the structure. Value will be zero if */
+ /* there is no memory map being provided. */
+ uint32_t reserved;
};

struct hvm_modlist_entry {
@@ -95,4 +136,11 @@ struct hvm_modlist_entry {
uint64_t reserved;
};

+struct hvm_memmap_table_entry {
+ uint64_t addr; /* Base address of the memory region */
+ uint64_t size; /* Size of the memory region in bytes */
+ uint32_t type; /* Mapping type */
+ uint32_t reserved;
+};
+
#endif /* __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__ */
--
1.8.3.1
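
[Editorial sketch, not part of the posted patch] The offsets in the layout comment above can be checked mechanically. The following user-space C copy of the two structures is a minimal sketch under stated assumptions: the nr_modules and modlist_paddr fields are not visible in the diff context and are filled in from the upstream header as recalled, and the helper names pvh_magic_from_text() and start_info_has_memmap() are hypothetical. It asserts the documented offsets at compile time, rebuilds the magic value from the "xEn3" description, and shows the version/entry-count check that the version 1 notes ask guests to perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the version 1 layout described in the patch. */
struct hvm_start_info {
    uint32_t magic;
    uint32_t version;
    uint32_t flags;
    uint32_t nr_modules;     /* assumption: not shown in the diff context */
    uint64_t modlist_paddr;  /* assumption: not shown in the diff context */
    uint64_t cmdline_paddr;
    uint64_t rsdp_paddr;
    uint64_t memmap_paddr;   /* new in version 1 */
    uint32_t memmap_entries; /* new in version 1 */
    uint32_t reserved;       /* new in version 1 */
};

struct hvm_memmap_table_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
    uint32_t reserved;
};

/* Offsets taken from the layout comment in the patch. */
_Static_assert(offsetof(struct hvm_start_info, rsdp_paddr) == 32, "rsdp_paddr");
_Static_assert(offsetof(struct hvm_start_info, memmap_paddr) == 40, "memmap_paddr");
_Static_assert(offsetof(struct hvm_start_info, memmap_entries) == 48, "memmap_entries");
_Static_assert(offsetof(struct hvm_start_info, reserved) == 52, "reserved");
_Static_assert(sizeof(struct hvm_start_info) == 56, "start_info size");
_Static_assert(offsetof(struct hvm_memmap_table_entry, type) == 16, "type");
_Static_assert(sizeof(struct hvm_memmap_table_entry) == 24, "entry size");

/* "xEn3" with the 0x80 bit of the "E" set, read as a little-endian u32. */
uint32_t pvh_magic_from_text(void)
{
    const uint8_t b[4] = { 'x', 'E' | 0x80, 'n', '3' };

    return ((uint32_t)b[0]) | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* The memory map is optional: require version >= 1 and a non-zero count. */
int start_info_has_memmap(const struct hvm_start_info *si)
{
    return si->version >= 1 && si->memmap_entries != 0;
}
```

Because every field is naturally aligned, the compiler inserts no padding of its own, which is consistent with the v3 change of dropping the "packed" attributes in favour of explicit reserved fields.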

2017-12-12 23:48:26

by Maran Wilson

Subject: [RFC PATCH v3 2/2] KVM: x86: Allow Qemu/KVM to use PVH entry point

For certain applications it is desirable to rapidly boot a KVM virtual
machine. In cases where legacy hardware and software support within the
guest is not needed, Qemu should be able to boot directly into the
uncompressed Linux kernel binary without the need to run firmware.

There already exists an ABI to allow this for Xen PVH guests and the ABI
is supported by Linux and FreeBSD:

https://xenbits.xen.org/docs/unstable/misc/pvh.html

This patch enables Qemu to use that same entry point for booting KVM
guests.
---
arch/x86/xen/enlighten_pvh.c | 51 +++++++++++++++++++++++++++++++-------------
1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 98ab176..12f3716 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -31,21 +31,38 @@ static void xen_pvh_arch_setup(void)
acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
}

-static void __init init_pvh_bootparams(void)
+static void __init init_pvh_bootparams(bool xen_guest)
{
struct xen_memory_map memmap;
int rc;

memset(&pvh_bootparams, 0, sizeof(pvh_bootparams));

- memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
- set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
- rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
- if (rc) {
- xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
+ if ((pvh_start_info.version > 0) && (pvh_start_info.memmap_entries)) {
+ struct hvm_memmap_table_entry *ep;
+ int i;
+
+ ep = __va(pvh_start_info.memmap_paddr);
+ pvh_bootparams.e820_entries = pvh_start_info.memmap_entries;
+
+ for (i = 0; i < pvh_bootparams.e820_entries ; i++, ep++) {
+ pvh_bootparams.e820_table[i].addr = ep->addr;
+ pvh_bootparams.e820_table[i].size = ep->size;
+ pvh_bootparams.e820_table[i].type = ep->type;
+ }
+ } else if (xen_guest) {
+ memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
+ set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
+ rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
+ if (rc) {
+ xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
+ BUG();
+ }
+ pvh_bootparams.e820_entries = memmap.nr_entries;
+ } else {
+ xen_raw_printk("Error: Could not find memory map\n");
BUG();
}
- pvh_bootparams.e820_entries = memmap.nr_entries;

if (pvh_bootparams.e820_entries < E820_MAX_ENTRIES_ZEROPAGE - 1) {
pvh_bootparams.e820_table[pvh_bootparams.e820_entries].addr =
@@ -76,7 +93,7 @@ static void __init init_pvh_bootparams(void)
* environment (i.e. hardware_subarch 0).
*/
pvh_bootparams.hdr.version = 0x212;
- pvh_bootparams.hdr.type_of_loader = (9 << 4) | 0; /* Xen loader */
+ pvh_bootparams.hdr.type_of_loader = ((xen_guest ? 0x9 : 0xb) << 4) | 0;
}

/*
@@ -85,8 +102,10 @@ static void __init init_pvh_bootparams(void)
*/
void __init xen_prepare_pvh(void)
{
- u32 msr;
+
+ u32 msr = xen_cpuid_base();
u64 pfn;
+ bool xen_guest = !!msr;

if (pvh_start_info.magic != XEN_HVM_START_MAGIC_VALUE) {
xen_raw_printk("Error: Unexpected magic value (0x%08x)\n",
@@ -94,13 +113,15 @@ void __init xen_prepare_pvh(void)
BUG();
}

- xen_pvh = 1;
+ if (xen_guest) {
+ xen_pvh = 1;

- msr = cpuid_ebx(xen_cpuid_base() + 2);
- pfn = __pa(hypercall_page);
- wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
+ msr = cpuid_ebx(msr + 2);
+ pfn = __pa(hypercall_page);
+ wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));

- init_pvh_bootparams();
+ x86_init.oem.arch_setup = xen_pvh_arch_setup;
+ }

- x86_init.oem.arch_setup = xen_pvh_arch_setup;
+ init_pvh_bootparams(xen_guest);
}
--
1.8.3.1
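
[Editorial sketch, not part of the posted patch] The control flow that init_pvh_bootparams() gains in this patch can be modelled in a few lines of user-space C. The names pick_memmap_source(), copy_memmap_clamped(), pvh_type_of_loader() and struct e820_entry_sketch are hypothetical stand-ins. Two things here are assumptions rather than what the patch does: the copy clamps the entry count to the destination capacity (the posted loop trusts memmap_entries as given), and E820_MAX_ENTRIES_ZEROPAGE is taken to be 128 as in the kernel headers. The loader ID 0xb is, as far as I recall from the x86 boot protocol document, the one assigned to Qemu; the patch itself only comments the 0x9 value as the Xen loader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define E820_MAX_ENTRIES_ZEROPAGE 128 /* assumption: kernel's value */

struct hvm_memmap_table_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
    uint32_t reserved;
};

/* Stand-in for the zero page's e820 table entry. */
struct e820_entry_sketch {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

enum memmap_source {
    MEMMAP_FROM_START_INFO,    /* version >= 1 and entries present */
    MEMMAP_FROM_XEN_HYPERCALL, /* fall back to XENMEM_memory_map */
    MEMMAP_MISSING             /* the patch calls BUG() here */
};

/* Same ordering as the patch: start_info first, then Xen, else fail. */
enum memmap_source pick_memmap_source(uint32_t version, uint32_t entries,
                                      bool xen_guest)
{
    if (version > 0 && entries != 0)
        return MEMMAP_FROM_START_INFO;
    if (xen_guest)
        return MEMMAP_FROM_XEN_HYPERCALL;
    return MEMMAP_MISSING;
}

/* Copy entries, never writing past dst_cap; returns the number copied. */
size_t copy_memmap_clamped(struct e820_entry_sketch *dst, size_t dst_cap,
                           const struct hvm_memmap_table_entry *src,
                           size_t src_n)
{
    size_t n = src_n < dst_cap ? src_n : dst_cap;

    for (size_t i = 0; i < n; i++) {
        dst[i].addr = src[i].addr;
        dst[i].size = src[i].size;
        dst[i].type = src[i].type;
    }
    return n;
}

/* Upper nibble is the loader ID, lower nibble the loader version (0). */
uint8_t pvh_type_of_loader(bool xen_guest)
{
    return (uint8_t)(((xen_guest ? 0x9 : 0xb) << 4) | 0);
}
```

A caller modelled on the patch would pass E820_MAX_ENTRIES_ZEROPAGE as dst_cap and store the return value in e820_entries. Whether an oversized map should be truncated or treated as fatal is a policy choice that the thread does not settle.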

2017-12-15 15:55:55

by Jürgen Groß

Subject: Re: [RFC PATCH v3 2/2] KVM: x86: Allow Qemu/KVM to use PVH entry point

On 13/12/17 00:42, Maran Wilson wrote:
> For certain applications it is desirable to rapidly boot a KVM virtual
> machine. In cases where legacy hardware and software support within the
> guest is not needed, Qemu should be able to boot directly into the
> uncompressed Linux kernel binary without the need to run firmware.
>
> There already exists an ABI to allow this for Xen PVH guests and the ABI
> is supported by Linux and FreeBSD:
>
> https://xenbits.xen.org/docs/unstable/misc/pvh.html
>
> This patch enables Qemu to use that same entry point for booting KVM
> guests.

I'm fine with the general idea.

I'm wondering whether you really want to require CONFIG_XEN for the
KVM case, though.

Wouldn't it be better to rename arch/x86/xen/enlighten_pvh.c to
arch/x86/pvh.c and arch/x86/xen/xen-pvh.S to arch/x86/pvh-head.S,
put both under CONFIG_PVH umbrella and select this from CONFIG_XEN_PVH
and KVM_PVH (or what you like to call it)?

In the two moved source files you can make Xen/KVM-specific parts
optional via their CONFIG_ options.

And you might want to add a separate ELF note for the KVM case?

> ---
> arch/x86/xen/enlighten_pvh.c | 51 +++++++++++++++++++++++++++++++-------------
> 1 file changed, 36 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
> index 98ab176..12f3716 100644
> --- a/arch/x86/xen/enlighten_pvh.c
> +++ b/arch/x86/xen/enlighten_pvh.c
> @@ -31,21 +31,38 @@ static void xen_pvh_arch_setup(void)
> acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
> }
>
> -static void __init init_pvh_bootparams(void)
> +static void __init init_pvh_bootparams(bool xen_guest)
> {
> struct xen_memory_map memmap;
> int rc;
>
> memset(&pvh_bootparams, 0, sizeof(pvh_bootparams));
>
> - memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
> - set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
> - rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
> - if (rc) {
> - xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
> + if ((pvh_start_info.version > 0) && (pvh_start_info.memmap_entries)) {
> + struct hvm_memmap_table_entry *ep;
> + int i;
> +
> + ep = __va(pvh_start_info.memmap_paddr);
> + pvh_bootparams.e820_entries = pvh_start_info.memmap_entries;
> +
> + for (i = 0; i < pvh_bootparams.e820_entries ; i++, ep++) {
> + pvh_bootparams.e820_table[i].addr = ep->addr;
> + pvh_bootparams.e820_table[i].size = ep->size;
> + pvh_bootparams.e820_table[i].type = ep->type;
> + }
> + } else if (xen_guest) {
> + memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
> + set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
> + rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
> + if (rc) {
> + xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
> + BUG();
> + }
> + pvh_bootparams.e820_entries = memmap.nr_entries;
> + } else {
> + xen_raw_printk("Error: Could not find memory map\n");

xen_raw_printk() without being a Xen guest?

Juergen

2017-12-15 17:25:41

by Paolo Bonzini

Subject: Re: [RFC PATCH v3 2/2] KVM: x86: Allow Qemu/KVM to use PVH entry point

On 15/12/2017 16:55, Juergen Gross wrote:
> I'm fine with the general idea.
>
> I'm wondering whether you really want to require CONFIG_XEN for the
> KVM case, though.
>
> Wouldn't it be better to rename arch/x86/xen/enlighten_pvh.c to
> arch/x86/pvh.c and arch/x86/xen/xen-pvh.S to arch/x86/pvh-head.S,

Yes, sounds good.

> put both under CONFIG_PVH umbrella and select this from CONFIG_XEN_PVH
> and KVM_PVH (or what you like to call it)?

CONFIG_KVM_GUEST will be good enough.

> In the two moved source files you can make Xen/KVM-specific parts
> optional via their CONFIG_ options.
>
> And you might want to add a separate ELF note for the KVM case?

As long as it's compatible with Xen, it's not needed. Only the startup
code changes between a KVM "PVH" kernel and a KVM "HVM" kernel.

Paolo
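
[Editorial sketch, not part of the thread] The suggestion to keep Xen-specific parts optional via their CONFIG_ options is commonly implemented in the kernel with a real function when the option is enabled and an empty inline stub otherwise, so that the shared code needs no #ifdef at the call site. The following user-space model is a sketch under stated assumptions: struct pvh_state, xen_pvh_init() and prepare_pvh() are hypothetical names, and CONFIG_XEN_PVH is defined by hand here where a kernel build would get it from Kconfig.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stands in for the symbol Kconfig would generate. */
#define CONFIG_XEN_PVH 1

/* Hypothetical record of what the early PVH setup has done. */
struct pvh_state {
    bool hypercall_page_set; /* Xen-only step */
    bool bootparams_ready;   /* common step */
};

#ifdef CONFIG_XEN_PVH
/* Xen-specific setup, built only when Xen PVH support is configured. */
static void xen_pvh_init(struct pvh_state *s)
{
    s->hypercall_page_set = true;
}
#else
/* Empty stub keeps the common caller free of #ifdef blocks. */
static inline void xen_pvh_init(struct pvh_state *s)
{
    (void)s;
}
#endif

/* Common entry path shared by Xen and non-Xen (e.g. KVM) guests. */
void prepare_pvh(struct pvh_state *s, bool xen_guest)
{
    if (xen_guest)
        xen_pvh_init(s);

    /* Platform-neutral work, such as filling in the boot parameters. */
    s->bootparams_ready = true;
}
```

With this shape, a kernel built without the Xen option still compiles the common path, and error reporting in that path would need a platform-neutral print routine instead of xen_raw_printk(), which is the concern raised earlier in the thread.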