Hi all
This patch series makes Linux run as the root partition [0]
on Microsoft Hypervisor [1].
Microsoft wants to create a complete virtualization stack with Linux and
Microsoft Hypervisor. There will be a subsequent patch series to provide a
device node (/dev/mshv) such that userspace programs can create and run virtual
machines. We've also ported Cloud Hypervisor [3] over and have been able to
boot a Linux guest with Virtio devices since late July.
Being an RFC series, this implements only the absolutely necessary
components to get things running. I will break down this series a bit.
A large portion of this series consists of patches that augment hyperv-tlfs.h.
They should be rather uncontroversial and can be applied right away.
A few key things other than the changes to hyperv-tlfs.h:
1. Linux needs to set up existing Hyper-V facilities differently.
2. Linux needs to make a few hypercalls to bring up APs.
3. Interrupts are remapped by IOMMU, which is controlled by the hypervisor.
Linux needs to make hypercalls to map and unmap interrupts. This is
done by introducing a new MSI irqdomain and new irqchips.
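To make item 3 a little more concrete, here is a rough userspace toy model of the idea, not code from this series. It only shows the shape: the chip callbacks do not touch the remapping hardware themselves, they ask the hypervisor to map or unmap the interrupt. Every name below (fake_hv_state, fake_hv_map_interrupt, fake_hv_unmap_interrupt, fake_irq_chip) is invented for illustration, and the "hypercalls" are plain functions that update a table.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_FAKE_IRQS 8

/* Hypothetical record of what the hypervisor-controlled IOMMU has mapped. */
struct fake_hv_state {
	bool mapped[NR_FAKE_IRQS];
	uint32_t vector[NR_FAKE_IRQS];
};

/* Stand-in for a "map device interrupt" hypercall; returns 0 on success. */
static int fake_hv_map_interrupt(struct fake_hv_state *hv,
				 unsigned int irq, uint32_t vector)
{
	if (irq >= NR_FAKE_IRQS || hv->mapped[irq])
		return -1;
	hv->mapped[irq] = true;
	hv->vector[irq] = vector;
	return 0;
}

/* Stand-in for the matching "unmap device interrupt" hypercall. */
static int fake_hv_unmap_interrupt(struct fake_hv_state *hv, unsigned int irq)
{
	if (irq >= NR_FAKE_IRQS || !hv->mapped[irq])
		return -1;
	hv->mapped[irq] = false;
	return 0;
}

/*
 * Hypothetical irqchip-style callback table: setting up an interrupt maps
 * it through the hypervisor, tearing it down unmaps it again.
 */
struct fake_irq_chip {
	int (*irq_setup)(struct fake_hv_state *hv,
			 unsigned int irq, uint32_t vector);
	int (*irq_teardown)(struct fake_hv_state *hv, unsigned int irq);
};

static const struct fake_irq_chip fake_hv_msi_chip = {
	.irq_setup	= fake_hv_map_interrupt,
	.irq_teardown	= fake_hv_unmap_interrupt,
};
```

The real code has to deal with MSI messages, device IDs and the hypercall input/output pages, none of which this model attempts to capture.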
#3 is perhaps the thing that we feel least confident about. We drew inspiration
from the Xen code in Linux. We are of course open to criticism and suggestions
on how to make it better / acceptable to upstream.
We're aware of tglx's series to change some of the MSI code, so we may need to
change some of the code after that series is upstreamed. But it wouldn't hurt
to throw this out as soon as possible for feedback.
Comments and suggestions are welcome.
Thanks,
Wei.
[0] Just think of it like Xen's Dom0.
[1] Hyper-V is better known, but it really refers to the whole stack
including the hypervisor and other components that run in the Windows kernel
and userspace.
[3] https://github.com/cloud-hypervisor/
Cc: [email protected]
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Sunil Muthuswamy <[email protected]>
Cc: Nuno Das Neves <[email protected]>
Cc: Vineeth Pillai <[email protected]>
Cc: Samuel Ortiz <[email protected]>
Wei Liu (18):
asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to
HV_CPU_MANAGEMENT
x86/hyperv: detect if Linux is the root partition
Drivers: hv: vmbus: skip VMBus initialization if Linux is root
iommu/hyperv: don't setup IRQ remapping when running as root
clocksource/hyperv: use MSR-based access if running as root
x86/hyperv: allocate output arg pages if required
x86/hyperv: extract partition ID from Microsoft Hypervisor if
necessary
x86/hyperv: handling hypercall page setup for root
x86/hyperv: provide a bunch of helper functions
x86/hyperv: implement and use hv_smp_prepare_cpus
asm-generic/hyperv: update hv_msi_entry
asm-generic/hyperv: update hv_interrupt_entry
asm-generic/hyperv: introduce hv_device_id and auxiliary structures
asm-generic/hyperv: import data structures for mapping device
interrupts
x86/apic/msi: export pci_msi_get_hwirq
x86/hyperv: implement MSI domain for root partition
x86/ioapic: export a few functions and data structures via io_apic.h
x86/hyperv: handle IO-APIC when running as root
arch/x86/hyperv/Makefile | 4 +-
arch/x86/hyperv/hv_init.c | 126 +++++-
arch/x86/hyperv/hv_proc.c | 209 ++++++++++
arch/x86/hyperv/irqdomain.c | 580 ++++++++++++++++++++++++++++
arch/x86/include/asm/hyperv-tlfs.h | 23 ++
arch/x86/include/asm/io_apic.h | 20 +
arch/x86/include/asm/mshyperv.h | 13 +-
arch/x86/include/asm/msi.h | 3 +
arch/x86/kernel/apic/io_apic.c | 28 +-
arch/x86/kernel/apic/msi.c | 3 +-
arch/x86/kernel/cpu/mshyperv.c | 43 +++
drivers/clocksource/hyperv_timer.c | 3 +
drivers/hv/vmbus_drv.c | 3 +
drivers/iommu/hyperv-iommu.c | 3 +-
drivers/pci/controller/pci-hyperv.c | 2 +-
include/asm-generic/hyperv-tlfs.h | 243 +++++++++++-
16 files changed, 1268 insertions(+), 38 deletions(-)
create mode 100644 arch/x86/hyperv/hv_proc.c
create mode 100644 arch/x86/hyperv/irqdomain.c
--
2.20.1
This makes the name match Hyper-V TLFS.
Signed-off-by: Wei Liu <[email protected]>
---
include/asm-generic/hyperv-tlfs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index e73a11850055..e6903589a82a 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -88,7 +88,7 @@
#define HV_CONNECT_PORT BIT(7)
#define HV_ACCESS_STATS BIT(8)
#define HV_DEBUGGING BIT(11)
-#define HV_CPU_POWER_MANAGEMENT BIT(12)
+#define HV_CPU_MANAGEMENT BIT(12)
/*
--
2.20.1
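For now, use the CPU management privilege flag to check whether Linux is
running as the root partition. Stash the result in hv_root_partition so
it can be used later.
Also add defines for the CPU management features CPUID leaf, for future
use when we want more fine-grained detection.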
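As a rough standalone illustration (not code from this patch), the check boils down to testing bit 12 of the EBX value returned by CPUID leaf HYPERV_CPUID_FEATURES. The helper name hv_ebx_indicates_root() is made up for this sketch; the two constants mirror the values in hyperv-tlfs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirroring hyperv-tlfs.h. */
#define HYPERV_CPUID_FEATURES	0x40000003
#define HV_CPU_MANAGEMENT	(1u << 12)

/*
 * Hypothetical helper: takes the EBX value returned by CPUID leaf
 * HYPERV_CPUID_FEATURES and reports whether the CPU management privilege
 * is granted, which this patch treats as "running as the root partition".
 */
static bool hv_ebx_indicates_root(uint32_t features_ebx)
{
	return (features_ebx & HV_CPU_MANAGEMENT) != 0;
}
```

In the kernel the EBX value comes from cpuid_ebx(HYPERV_CPUID_FEATURES); the sketch takes it as a parameter so the bit test can be exercised without a hypervisor.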
Signed-off-by: Wei Liu <[email protected]>
---
arch/x86/hyperv/hv_init.c | 4 ++++
arch/x86/include/asm/hyperv-tlfs.h | 10 ++++++++++
arch/x86/include/asm/mshyperv.h | 2 ++
arch/x86/kernel/cpu/mshyperv.c | 16 ++++++++++++++++
4 files changed, 32 insertions(+)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 6035df1b49e1..cac8e4c56261 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -26,6 +26,10 @@
#include <linux/syscore_ops.h>
#include <clocksource/hyperv_timer.h>
+/* Is Linux running as the root partition? */
+bool hv_root_partition;
+EXPORT_SYMBOL_GPL(hv_root_partition);
+
void *hv_hypercall_pg;
EXPORT_SYMBOL_GPL(hv_hypercall_pg);
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 7a4d2062385c..20d628c1ed50 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -21,6 +21,7 @@
#define HYPERV_CPUID_FEATURES 0x40000003
#define HYPERV_CPUID_ENLIGHTMENT_INFO 0x40000004
#define HYPERV_CPUID_IMPLEMENT_LIMITS 0x40000005
+#define HYPERV_CPUID_CPU_MANAGEMENT_FEATURES 0x40000007
#define HYPERV_CPUID_NESTED_FEATURES 0x4000000A
#define HYPERV_HYPERVISOR_PRESENT_BIT 0x80000000
@@ -136,6 +137,15 @@
/* Recommend using enlightened VMCS */
#define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED BIT(14)
+/*
+ * CPU management features identification.
+ * These are HYPERV_CPUID_CPU_MANAGEMENT_FEATURES.EAX bits.
+ */
+#define HV_X64_START_LOGICAL_PROCESSOR BIT(0)
+#define HV_X64_CREATE_ROOT_VIRTUAL_PROCESSOR BIT(1)
+#define HV_X64_PERFORMANCE_COUNTER_SYNC BIT(2)
+#define HV_X64_RESERVED_IDENTITY_BIT BIT(31)
+
/*
* Virtual processor will never share a physical core with another virtual
* processor, except for virtual processors that are reported as sibling SMT
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 60b944dd2df1..2a2cc81beac6 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -224,6 +224,8 @@ int hyperv_fill_flush_guest_mapping_list(
struct hv_guest_mapping_flush_list *flush,
u64 start_gfn, u64 end_gfn);
+extern bool hv_root_partition;
+
#ifdef CONFIG_X86_64
void hv_apic_init(void);
void __init hv_init_spinlocks(void);
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index af94f05a5c66..1bf57d310f78 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -232,6 +232,22 @@ static void __init ms_hyperv_init_platform(void)
pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
+ /*
+ * Check CPU management privilege.
+ *
+ * To mirror what Windows does we should extract CPU management
+ * features and use the ReservedIdentityBit to detect if Linux is the
+ * root partition. But that requires negotiating CPU management
+ * interface (a process to be finalized).
+ *
+ * For now, use the privilege flag as the indicator for running as
+ * root.
+ */
+ if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_CPU_MANAGEMENT) {
+ hv_root_partition = true;
+ pr_info("Hyper-V: running as root partition\n");
+ }
+
/*
* Extract host information.
*/
--
2.20.1
The IOMMU code needs more work. For now, we are sure that the IRQ
remapping hooks are not applicable when Linux is the root partition.
Signed-off-by: Wei Liu <[email protected]>
---
drivers/iommu/hyperv-iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index 8919c1c70b68..4da684ab292c 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -20,6 +20,7 @@
#include <asm/io_apic.h>
#include <asm/irq_remapping.h>
#include <asm/hypervisor.h>
+#include <asm/mshyperv.h>
#include "irq_remapping.h"
@@ -143,7 +144,7 @@ static int __init hyperv_prepare_irq_remapping(void)
int i;
if (!hypervisor_is_type(X86_HYPER_MS_HYPERV) ||
- !x2apic_supported())
+ !x2apic_supported() || hv_root_partition)
return -ENODEV;
fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);
--
2.20.1
On Mon, Sep 14, 2020 at 11:27:48AM +0000, Wei Liu wrote:
> The IOMMU code needs more work. We're sure for now the IRQ remapping
> hooks are not applicable when Linux is the root.
>
> Signed-off-by: Wei Liu <[email protected]>
> ---
> drivers/iommu/hyperv-iommu.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
Acked-by: Joerg Roedel <[email protected]>