Changes from v1:
* Adopted Paolo's suggestion for defining a v2 PVH ABI that includes the
e820 map instead of using the second module entry to pass the table.
* Cleaned things up a bit to reduce the number of xen vs non-xen special
cases.
Juergen also had a suggestion to split the different hypervisor types
early and use a common set of service functions instead of special casing
xen_guest everywhere.
There are certainly less special cases in this version of the patch, but
if we still think it's important to split things up between common, Xen,
and KVM components, then I would appreciate a suggestion on how best that
can be done. Are we talking about just re-factoring functions in the
existing file? Or do we need to go all the way and pull all the PVH entry
code out of xen directories and find a home for it somewhere else so that
we can use kernels built without CONFIG_XEN to start KVM guests via the
PVH entry point. If the latter, any suggestions for which common files or
directories I can move this stuff to?
Maran Wilson (2):
xen/pvh: Add memory map pointer to hvm_start_info struct
KVM: x86: Allow Qemu/KVM to use PVH entry point
arch/x86/xen/enlighten_pvh.c | 48 +++++++++++++++++++++++++++++++++---------------
include/xen/interface/hvm/start_info.h | 34 +++++++++++++++++++++++++++++++---
2 files changed, 64 insertions(+), 18 deletions(-)
The start info structure that is defined as part of the x86/HVM direct
boot ABI and used for starting Xen PVH guests would be more versatile if
it also included a way to efficiently pass information about the memory
map to the guest.
That way Xen PVH guests would not be forced to use a hypercall to get the
information and would make it easier for KVM guests to share the PVH
entry point.
---
include/xen/interface/hvm/start_info.h | 34 +++++++++++++++++++++++++++++++---
1 file changed, 31 insertions(+), 3 deletions(-)
diff --git a/include/xen/interface/hvm/start_info.h b/include/xen/interface/hvm/start_info.h
index 6484159..60206bb 100644
--- a/include/xen/interface/hvm/start_info.h
+++ b/include/xen/interface/hvm/start_info.h
@@ -33,7 +33,7 @@
* | magic | Contains the magic value XEN_HVM_START_MAGIC_VALUE
* | | ("xEn3" with the 0x80 bit of the "E" set).
* 4 +----------------+
- * | version | Version of this structure. Current version is 0. New
+ * | version | Version of this structure. Current version is 1. New
* | | versions are guaranteed to be backwards-compatible.
* 8 +----------------+
* | flags | SIF_xxx flags.
@@ -48,6 +48,12 @@
* 32 +----------------+
* | rsdp_paddr | Physical address of the RSDP ACPI data structure.
* 40 +----------------+
+ * | memmap_paddr | Physical address of the memory map. Only present in
+ * | | version 1 and newer of the structure.
+ * 48 +----------------+
+ * | memmap_entries | Number of entries in the memory map table. Only
+ * | | present in version 1 and newer of the structure.
+ * 52 +----------------+
*
* The layout of each entry in the module structure is the following:
*
@@ -62,6 +68,17 @@
* | reserved |
* 32 +----------------+
*
+ * The layout of each entry in the memory map table is as follows and no
+ * padding is used between entries in the array:
+ *
+ * 0 +----------------+
+ * | addr | Base address
+ * 8 +----------------+
+ * | size | Size of mapping
+ * 16 +----------------+
+ * | type | E820_TYPE_xxx
+ * 20 +----------------|
+ *
* The address and sizes are always a 64bit little endian unsigned integer.
*
* NB: Xen on x86 will always try to place all the data below the 4GiB
@@ -86,13 +103,24 @@ struct hvm_start_info {
uint64_t cmdline_paddr; /* Physical address of the command line. */
uint64_t rsdp_paddr; /* Physical address of the RSDP ACPI data */
/* structure. */
-};
+ uint64_t memmap_paddr; /* Physical address of an array of */
+ /* hvm_memmap_table_entry. Only present in */
+ /* Ver 1 or later. For e820 mem map table. */
+ uint32_t memmap_entries; /* Only present in Ver 1 or later. Number of */
+ /* entries in the memmap table. */
+} __attribute__((packed));
struct hvm_modlist_entry {
uint64_t paddr; /* Physical address of the module. */
uint64_t size; /* Size of the module in bytes. */
uint64_t cmdline_paddr; /* Physical address of the command line. */
uint64_t reserved;
-};
+} __attribute__((packed));
+
+struct hvm_memmap_table_entry {
+ uint64_t addr; /* Base address of the memory region */
+ uint64_t size; /* Size of the memory region */
+ uint32_t type; /* E820_TYPE_xxx of the memory region */
+} __attribute__((packed));
#endif /* __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__ */
--
1.8.3.1
For certain applications it is desirable to rapidly boot a KVM virtual
machine. In cases where legacy hardware and software support within the
guest is not needed, Qemu should be able to boot directly into the
uncompressed Linux kernel binary without the need to run firmware.
There already exists an ABI to allow this for Xen PVH guests and the ABI
is supported by Linux and FreeBSD:
https://xenbits.xen.org/docs/unstable/misc/pvh.html
This patch enables Qemu to use that same entry point for booting KVM
guests.
---
arch/x86/xen/enlighten_pvh.c | 48 ++++++++++++++++++++++++++++++--------------
1 file changed, 33 insertions(+), 15 deletions(-)
diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 98ab176..f11fbfc 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -31,21 +31,35 @@ static void xen_pvh_arch_setup(void)
acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
}
-static void __init init_pvh_bootparams(void)
+static void __init init_pvh_bootparams(bool xen_guest)
{
struct xen_memory_map memmap;
int rc;
memset(&pvh_bootparams, 0, sizeof(pvh_bootparams));
- memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
- set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
- rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
- if (rc) {
- xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
+ if ((pvh_start_info.version > 0) && (pvh_start_info.memmap_entries)) {
+ struct boot_e820_entry *ep;
+ int idx;
+
+ ep = __va(pvh_start_info.memmap_paddr);
+ pvh_bootparams.e820_entries = pvh_start_info.memmap_entries;
+
+ for (idx = 0; idx < pvh_bootparams.e820_entries ; idx++, ep++)
+ pvh_bootparams.e820_table[idx] = *ep;
+ } else if (xen_guest) {
+ memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
+ set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
+ rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
+ if (rc) {
+ xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
+ BUG();
+ }
+ pvh_bootparams.e820_entries = memmap.nr_entries;
+ } else {
+ xen_raw_printk("Error: Could not find memory map\n");
BUG();
}
- pvh_bootparams.e820_entries = memmap.nr_entries;
if (pvh_bootparams.e820_entries < E820_MAX_ENTRIES_ZEROPAGE - 1) {
pvh_bootparams.e820_table[pvh_bootparams.e820_entries].addr =
@@ -76,7 +90,7 @@ static void __init init_pvh_bootparams(void)
* environment (i.e. hardware_subarch 0).
*/
pvh_bootparams.hdr.version = 0x212;
- pvh_bootparams.hdr.type_of_loader = (9 << 4) | 0; /* Xen loader */
+ pvh_bootparams.hdr.type_of_loader = ((xen_guest ? 0x9 : 0xb) << 4) | 0;
}
/*
@@ -85,8 +99,10 @@ static void __init init_pvh_bootparams(void)
*/
void __init xen_prepare_pvh(void)
{
- u32 msr;
+
+ u32 msr = xen_cpuid_base();
u64 pfn;
+ bool xen_guest = !!msr;
if (pvh_start_info.magic != XEN_HVM_START_MAGIC_VALUE) {
xen_raw_printk("Error: Unexpected magic value (0x%08x)\n",
@@ -94,13 +110,15 @@ void __init xen_prepare_pvh(void)
BUG();
}
- xen_pvh = 1;
+ if (xen_guest) {
+ xen_pvh = 1;
- msr = cpuid_ebx(xen_cpuid_base() + 2);
- pfn = __pa(hypercall_page);
- wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
+ msr = cpuid_ebx(msr + 2);
+ pfn = __pa(hypercall_page);
+ wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
- init_pvh_bootparams();
+ x86_init.oem.arch_setup = xen_pvh_arch_setup;
+ }
- x86_init.oem.arch_setup = xen_pvh_arch_setup;
+ init_pvh_bootparams(xen_guest);
}
--
1.8.3.1
>>> On 07.12.17 at 23:45, <[email protected]> wrote:
> The start info structure that is defined as part of the x86/HVM direct
> boot ABI and used for starting Xen PVH guests would be more versatile if
> it also included a way to efficiently pass information about the memory
> map to the guest.
>
> That way Xen PVH guests would not be forced to use a hypercall to get the
> information and would make it easier for KVM guests to share the PVH
> entry point.
> ---
> include/xen/interface/hvm/start_info.h | 34 +++++++++++++++++++++++++++++++---
> 1 file changed, 31 insertions(+), 3 deletions(-)
First of all such a change should be submitted against the canonical
copy of the header, which lives in the Xen tree.
The argument of avoiding a hypercall doesn't really count imo - this
isn't in any way performance critical code. The argument of making
re-use easier is fine, though.
> --- a/include/xen/interface/hvm/start_info.h
> +++ b/include/xen/interface/hvm/start_info.h
> @@ -33,7 +33,7 @@
> * | magic | Contains the magic value XEN_HVM_START_MAGIC_VALUE
> * | | ("xEn3" with the 0x80 bit of the "E" set).
> * 4 +----------------+
> - * | version | Version of this structure. Current version is 0. New
> + * | version | Version of this structure. Current version is 1. New
> * | | versions are guaranteed to be backwards-compatible.
> * 8 +----------------+
> * | flags | SIF_xxx flags.
> @@ -48,6 +48,12 @@
> * 32 +----------------+
> * | rsdp_paddr | Physical address of the RSDP ACPI data structure.
> * 40 +----------------+
> + * | memmap_paddr | Physical address of the memory map. Only present in
> + * | | version 1 and newer of the structure.
> + * 48 +----------------+
> + * | memmap_entries | Number of entries in the memory map table. Only
> + * | | present in version 1 and newer of the structure.
> + * 52 +----------------+
Please let's make this optional even in v1 (and later), i.e. spell out
that it may be zero. That way Xen code could continue to use the
hypercall approach even.
Also please spell out a 4-byte reserved entry at the end, to make
the specified structure a multiple of 8 in size again regardless of
bitness of the producer/consumer.
> @@ -62,6 +68,17 @@
> * | reserved |
> * 32 +----------------+
> *
> + * The layout of each entry in the memory map table is as follows and no
> + * padding is used between entries in the array:
> + *
> + * 0 +----------------+
> + * | addr | Base address
> + * 8 +----------------+
> + * | size | Size of mapping
> + * 16 +----------------+
> + * | type | E820_TYPE_xxx
> + * 20 +----------------|
I'm not convinced of re-using E820 types here. I can see that this
might ease the consumption in Linux, but I don't think there should
be any connection to x86 aspects here - the data being supplied is
x86-agnostic, and Linux'es placement of the header is also making
no connection to x86 (oddly enough, the current placement in the
Xen tree does, for a reason which escapes me).
I could also imagine reasons to add new types without them being
sanctioned by whoever maintains E820 type assignments.
As to the size field - you need to spell out whether these are bytes
or pages (it might be worthwhile to also make this explicit for the
addr one, but there I view it as less of a problem, since "address"
doesn't commonly mean a page granular entity).
Also this again lacks a 4-byte reserved field at the end.
> @@ -86,13 +103,24 @@ struct hvm_start_info {
> uint64_t cmdline_paddr; /* Physical address of the command line. */
> uint64_t rsdp_paddr; /* Physical address of the RSDP ACPI data */
> /* structure.
> */
> -};
> + uint64_t memmap_paddr; /* Physical address of an array of */
> + /* hvm_memmap_table_entry. Only present in */
> + /* Ver 1 or later. For e820 mem map table. */
> + uint32_t memmap_entries; /* Only present in Ver 1 or later. Number of */
> + /* entries in the memmap table. */
> +} __attribute__((packed));
No packed attribute here and below please, at least not in the
canonical (non-Linux) variant of the header.
Jan
Thanks for taking a look Jan. More below...
On 12/8/2017 12:49 AM, Jan Beulich wrote:
>>>> On 07.12.17 at 23:45, <[email protected]> wrote:
>> The start info structure that is defined as part of the x86/HVM direct
>> boot ABI and used for starting Xen PVH guests would be more versatile if
>> it also included a way to efficiently pass information about the memory
>> map to the guest.
>>
>> That way Xen PVH guests would not be forced to use a hypercall to get the
>> information and would make it easier for KVM guests to share the PVH
>> entry point.
>> ---
>> include/xen/interface/hvm/start_info.h | 34 +++++++++++++++++++++++++++++++---
>> 1 file changed, 31 insertions(+), 3 deletions(-)
> First of all such a change should be submitted against the canonical
> copy of the header, which lives in the Xen tree.
Understood. Will do that when this converts from RFC to actual patch.
> The argument of avoiding a hypercall doesn't really count imo - this
> isn't in any way performance critical code. The argument of making
> re-use easier is fine, though.
Okay, I will reword the commit message.
>> --- a/include/xen/interface/hvm/start_info.h
>> +++ b/include/xen/interface/hvm/start_info.h
>> @@ -33,7 +33,7 @@
>> * | magic | Contains the magic value XEN_HVM_START_MAGIC_VALUE
>> * | | ("xEn3" with the 0x80 bit of the "E" set).
>> * 4 +----------------+
>> - * | version | Version of this structure. Current version is 0. New
>> + * | version | Version of this structure. Current version is 1. New
>> * | | versions are guaranteed to be backwards-compatible.
>> * 8 +----------------+
>> * | flags | SIF_xxx flags.
>> @@ -48,6 +48,12 @@
>> * 32 +----------------+
>> * | rsdp_paddr | Physical address of the RSDP ACPI data structure.
>> * 40 +----------------+
>> + * | memmap_paddr | Physical address of the memory map. Only present in
>> + * | | version 1 and newer of the structure.
>> + * 48 +----------------+
>> + * | memmap_entries | Number of entries in the memory map table. Only
>> + * | | present in version 1 and newer of the structure.
>> + * 52 +----------------+
> Please let's make this optional even in v1 (and later), i.e. spell out
> that it may be zero. That way Xen code could continue to use the
> hypercall approach even.
Yes, my intention was to make this optional. I will spell it out.
> Also please spell out a 4-byte reserved entry at the end, to make
> the specified structure a multiple of 8 in size again regardless of
> bitness of the producer/consumer.
Sure, I can add that.
>> @@ -62,6 +68,17 @@
>> * | reserved |
>> * 32 +----------------+
>> *
>> + * The layout of each entry in the memory map table is as follows and no
>> + * padding is used between entries in the array:
>> + *
>> + * 0 +----------------+
>> + * | addr | Base address
>> + * 8 +----------------+
>> + * | size | Size of mapping
>> + * 16 +----------------+
>> + * | type | E820_TYPE_xxx
>> + * 20 +----------------|
> I'm not convinced of re-using E820 types here. I can see that this
> might ease the consumption in Linux, but I don't think there should
> be any connection to x86 aspects here - the data being supplied is
> x86-agnostic, and Linux'es placement of the header is also making
> no connection to x86 (oddly enough, the current placement in the
> Xen tree does, for a reason which escapes me).
>
> I could also imagine reasons to add new types without them being
> sanctioned by whoever maintains E820 type assignments.
So there are three aspects to discuss here.
1) The addition of the "E820_TYPE_xxx" comment. I am fine with just
changing that to "mapping type" and leaving it as something to be
coordinated between the hypervisor and the guest OS being started by
that hypervisor.
2) x86 vs x86-agnostic. While I'm trying to keep this interface generic
in terms of guest OS (like Linux, FreeBSD, possible other guests in the
future) and hypervisor type (Xen, QEMU/KVM, etc), I was actually under
the impression that we are dealing with an ABI that is very much x86
specific.
The canonical document describing the ABI
(https://xenbits.xen.org/docs/unstable/misc/pvh.html) is titled "x86/HVM
direct boot ABI" and goes on to describe an interface in very
x86-specific terms. i.e. The ebx register must contain a pointer, cs,
ds, es must be set a certain way, etc.
That is probably why Xen's placement of the header file is in a x86
section of the tree. And also why there already exist a number of "x86"
references in the existing header file. A quick grep of the existing
header file will show lines like:
"C representation of the x86/HVM start info layout"
"Start of day structure passed to PVH guests and to HVM guests in %ebx"
"Xen on x86 will always try to place all the data below the 4GiB"
If at some point in the future someone decides to implement a similar
ABI for a different CPU architecture while re-using this same
hvm_start_info struct, then this header will have to be redone a bit
anyway. But I'm not aware of any other such ABI that exists or is
currently in the works.
3) The (packed) layout of the hvm_memmap_table_entry struct. I did
initially consider just making this a new structure that did not
necessarily match struct e820_entry in its array layout. But, it's not
just the consumer that has an easier time digesting it in the e820_entry
array format. It's also the producer side (QEMU for instance) where code
already exists to lay out this information in e820_entry array format.
And since this is all x86 specific anyway, it just seemed like I would
be needlessly making more work for both ends by inventing a completely
new memory map layout just for the sake of being different. Especially
when there doesn't seem to be anything terribly broken about the
existing e820_entry array format as a general purpose memory map.
Would that be acceptable?
> As to the size field - you need to spell out whether these are bytes
> or pages (it might be worthwhile to also make this explicit for the
> addr one, but there I view it as less of a problem, since "address"
> doesn't commonly mean a page granular entity).
Sure, I will change that to "Size of the mapping in bytes".
> Also this again lacks a 4-byte reserved field at the end.
For the reasons described above, I'd much prefer to leave this struct
packed and without the 4-byte reserved field. But let me know if there
is a compelling reason that I have missed.
>> @@ -86,13 +103,24 @@ struct hvm_start_info {
>> uint64_t cmdline_paddr; /* Physical address of the command line. */
>> uint64_t rsdp_paddr; /* Physical address of the RSDP ACPI data */
>> /* structure.
>> */
>> -};
>> + uint64_t memmap_paddr; /* Physical address of an array of */
>> + /* hvm_memmap_table_entry. Only present in */
>> + /* Ver 1 or later. For e820 mem map table. */
>> + uint32_t memmap_entries; /* Only present in Ver 1 or later. Number of */
>> + /* entries in the memmap table. */
>> +} __attribute__((packed));
> No packed attribute here and below please, at least not in the
> canonical (non-Linux) variant of the header.
Sure, with the addition of the 4 byte reserved field you mentioned
earlier, I'll remove the packed attribute for the two existing
structures. But as discussed above, would prefer to leave it in place
for the new memory map array struct.
Thanks,
-Maran
> Jan
On 12/07/2017 05:45 PM, Maran Wilson wrote:
>
> Juergen also had a suggestion to split the different hypervisor types
> early and use a common set of service functions instead of special casing
> xen_guest everywhere.
>
> There are certainly less special cases in this version of the patch, but
> if we still think it's important to split things up between common, Xen,
> and KVM components, then I would appreciate a suggestion on how best that
> can be done. Are we talking about just re-factoring functions in the
> existing file? Or do we need to go all the way and pull all the PVH entry
> code out of xen directories and find a home for it somewhere else so that
> we can use kernels built without CONFIG_XEN to start KVM guests via the
> PVH entry point. If the latter, any suggestions for which common files or
> directories I can move this stuff to?
I wonder whether the time has come for arch/x86/virt/
xen/
kvm/
hyperv/
kernel/paravirt*
kernel/cpu/hypervisor.c
-boris
>>> On 08.12.17 at 20:05, <[email protected]> wrote:
> On 12/8/2017 12:49 AM, Jan Beulich wrote:
>> I'm not convinced of re-using E820 types here. I can see that this
>> might ease the consumption in Linux, but I don't think there should
>> be any connection to x86 aspects here - the data being supplied is
>> x86-agnostic, and Linux'es placement of the header is also making
>> no connection to x86 (oddly enough, the current placement in the
>> Xen tree does, for a reason which escapes me).
>>
>> I could also imagine reasons to add new types without them being
>> sanctioned by whoever maintains E820 type assignments.
>
> So there are three aspects to discuss here.
>
> 1) The addition of the "E820_TYPE_xxx" comment. I am fine with just
> changing that to "mapping type" and leaving it as something to be
> coordinated between the hypervisor and the guest OS being started by
> that hypervisor.
>
> 2) x86 vs x86-agnostic. While I'm trying to keep this interface generic
> in terms of guest OS (like Linux, FreeBSD, possible other guests in the
> future) and hypervisor type (Xen, QEMU/KVM, etc), I was actually under
> the impression that we are dealing with an ABI that is very much x86
> specific.
>
> The canonical document describing the ABI
> (https://xenbits.xen.org/docs/unstable/misc/pvh.html) is titled "x86/HVM
> direct boot ABI" and goes on to describe an interface in very
> x86-specific terms. i.e. The ebx register must contain a pointer, cs,
> ds, es must be set a certain way, etc.
>
> That is probably why Xen's placement of the header file is in a x86
> section of the tree. And also why there already exist a number of "x86"
> references in the existing header file. A quick grep of the existing
> header file will show lines like:
>
> "C representation of the x86/HVM start info layout"
> "Start of day structure passed to PVH guests and to HVM guests in %ebx"
> "Xen on x86 will always try to place all the data below the 4GiB"
>
> If at some point in the future someone decides to implement a similar
> ABI for a different CPU architecture while re-using this same
> hvm_start_info struct, then this header will have to be redone a bit
> anyway. But I'm not aware of any other such ABI that exists or is
> currently in the works.
Anything like this being in the works elsewhere doesn't mean much.
Otherwise you'd advocate for anyone introducing something new
on a single architecture only to ignore all portability concerns. My
fundamental point here is that within the currently defined
structures there's nothing x86 specific, no matter that a few
comments are mentioning x86 (and in at least two of the three
cases for obvious [implementation] reasons rather than to
explain why this interface can/should be x86 only).
> 3) The (packed) layout of the hvm_memmap_table_entry struct. I did
> initially consider just making this a new structure that did not
> necessarily match struct e820_entry in its array layout. But, it's not
> just the consumer that has an easier time digesting it in the e820_entry
> array format. It's also the producer side (QEMU for instance) where code
> already exists to lay out this information in e820_entry array format.
> And since this is all x86 specific anyway, it just seemed like I would
> be needlessly making more work for both ends by inventing a completely
> new memory map layout just for the sake of being different. Especially
> when there doesn't seem to be anything terribly broken about the
> existing e820_entry array format as a general purpose memory map.
The brokenness is the mis-alignment of every other array member.
This doesn't matter on x86, but it would matter on any architecture
requiring strict alignment. Furthermore please note that in the
canonical headers you can't even express what you want: Use of
#pragma pack or the packed attribute is prohibited there - we
demand that the headers can be used by any C89-compatible
compiler (lately there have been a few extensions requiring C99,
but these need to be opted in for by consumers, which imo is not
an option here).
Jan
On 08/12/2017 09:49, Jan Beulich wrote:
>> + * The layout of each entry in the memory map table is as follows and no
>> + * padding is used between entries in the array:
>> + *
>> + * 0 +----------------+
>> + * | addr | Base address
>> + * 8 +----------------+
>> + * | size | Size of mapping
>> + * 16 +----------------+
>> + * | type | E820_TYPE_xxx
>> + * 20 +----------------|
> I'm not convinced of re-using E820 types here. I can see that this
> might ease the consumption in Linux, but I don't think there should
> be any connection to x86 aspects here - the data being supplied is
> x86-agnostic, and Linux'es placement of the header is also making
> no connection to x86 (oddly enough, the current placement in the
> Xen tree does, for a reason which escapes me).
FWIW, e820 types are now part of the ACPI standard. So using them is
not necessarily related to x86, and reasonably x86-agnostic.
Paolo
> I could also imagine reasons to add new types without them being
> sanctioned by whoever maintains E820 type assignments.
>>> On 11.12.17 at 22:59, <[email protected]> wrote:
> On 08/12/2017 09:49, Jan Beulich wrote:
>>> + * The layout of each entry in the memory map table is as follows and no
>>> + * padding is used between entries in the array:
>>> + *
>>> + * 0 +----------------+
>>> + * | addr | Base address
>>> + * 8 +----------------+
>>> + * | size | Size of mapping
>>> + * 16 +----------------+
>>> + * | type | E820_TYPE_xxx
>>> + * 20 +----------------|
>> I'm not convinced of re-using E820 types here. I can see that this
>> might ease the consumption in Linux, but I don't think there should
>> be any connection to x86 aspects here - the data being supplied is
>> x86-agnostic, and Linux'es placement of the header is also making
>> no connection to x86 (oddly enough, the current placement in the
>> Xen tree does, for a reason which escapes me).
>
> FWIW, e820 types are now part of the ACPI standard. So using them is
> not necessarily related to x86, and reasonably x86-agnostic.
Sort of - the description of it starts with "This interface is used in
real mode only on IA-PC-based systems ..."
But it being there is useful in another way: It shows that there's
an optional field making the full structure 64-bit aligned again. (It
at the same time shows - I admit I had forgotten about this aspect -
that the structure size isn't fixed in the first place, so consumers
have to convert [truncate/extend] the output to their internal
representation anyway, and hence there's even less of a reason
to tie the proposed structure's layout to the E820 one.)
Jan
On 12/12/2017 09:06, Jan Beulich wrote:
>>>> On 11.12.17 at 22:59, <[email protected]> wrote:
>> On 08/12/2017 09:49, Jan Beulich wrote:
>>>> + * The layout of each entry in the memory map table is as follows and no
>>>> + * padding is used between entries in the array:
>>>> + *
>>>> + * 0 +----------------+
>>>> + * | addr | Base address
>>>> + * 8 +----------------+
>>>> + * | size | Size of mapping
>>>> + * 16 +----------------+
>>>> + * | type | E820_TYPE_xxx
>>>> + * 20 +----------------|
>>> I'm not convinced of re-using E820 types here. I can see that this
>>> might ease the consumption in Linux, but I don't think there should
>>> be any connection to x86 aspects here - the data being supplied is
>>> x86-agnostic, and Linux'es placement of the header is also making
>>> no connection to x86 (oddly enough, the current placement in the
>>> Xen tree does, for a reason which escapes me).
>>
>> FWIW, e820 types are now part of the ACPI standard. So using them is
>> not necessarily related to x86, and reasonably x86-agnostic.
>
> Sort of - the description of it starts with "This interface is used in
> real mode only on IA-PC-based systems ..."
Note I said the e820 *types*. While the interface is there for PC
compatibility, the ACPI address range types (AddressRangeMemory,
AddressRangeReserved, AddressRangeACPI, etc.) are exactly the e820 types.
> But it being there is useful in another way: It shows that there's
> an optional field making the full structure 64-bit aligned again. (It
> at the same time shows - I admit I had forgotten about this aspect -
> that the structure size isn't fixed in the first place, so consumers
> have to convert [truncate/extend] the output to their internal
> representation anyway, and hence there's even less of a reason
> to tie the proposed structure's layout to the E820 one.)
My point was that the e820 types are okay to use in an
architecture-agnostic way in my opinion. The layout only matters so
much, as there aren't many ways to encode a memory map (note I do agree
about that alignment dword).
Paolo