2023-07-27 08:13:04

by Yi-De Wu

Subject: [PATCH v5 00/12] GenieZone hypervisor drivers

This series is based on linux-next, tag: next-20230726.

GenieZone hypervisor (gzvm) is a type-1 hypervisor that supports various virtual
machine types and provides security features such as TEE-like scenarios and
secure boot. It can create guest VMs for security use cases and has
virtualization capabilities for both the platform and interrupts. Although the
hypervisor can be booted independently, it requires the assistance of the
GenieZone hypervisor kernel driver (gzvm-ko) to leverage the Linux kernel's
abilities for vCPU scheduling, memory management, inter-VM communication and
virtio backend support.

Changes in v5:
- Add dt solution back for device initialization
- Add GZVM_EXIT_GZ reason for gzvm_vcpu_run()
- Add patch for guest page fault handler
- Add patch for supporting pin/unpin memory
- Remove unused enum members, namely GZVM_FUNC_GET_REGS and GZVM_FUNC_SET_REGS
- Use dev_dbg() for debugging when a platform device is available, and use
pr_debug() otherwise
- Respond to reviewers' comments and fix bugs accordingly

Changes in v4:
https://lore.kernel.org/lkml/[email protected]/
- Add macro to set VM as protected without triggering pvmfw in AVF.
- Add support to pass dtb config to hypervisor.
- Add support for virtual timer.
- Add UAPI to pass memory region metadata to hypervisor.
- Define our own macros for ARM's interrupt numbers
- Elaborate more on the GenieZone hypervisor in documentation
- Fix coding style.
- Implement our own module for converting ipa to pa
- Modify the way of initializing the device from dt to a more discoverable way
- Move refactoring changes into independent patches.

Changes in v3:
https://lore.kernel.org/all/[email protected]/
- Refactor: separate arch/arm64/geniezone/gzvm_arch.c into vm.c/vcpu.c/vgic.c
- Remove redundant functions
- Fix reviewer's comments

Changes in v2:
https://lore.kernel.org/all/[email protected]/
- Refactor: move to drivers/virt/geniezone
- Refactor: decouple arch-dependent and arch-independent
- Check pending signal before entering guest context
- Fix reviewer's comments

Initial Commit in v1:
https://lore.kernel.org/all/[email protected]/

Yi-De Wu (12):
docs: geniezone: Introduce GenieZone hypervisor
dt-bindings: hypervisor: Add MediaTek GenieZone hypervisor
virt: geniezone: Add GenieZone hypervisor support
virt: geniezone: Add vcpu support
virt: geniezone: Add irqchip support for virtual interrupt injection
virt: geniezone: Add irqfd support
virt: geniezone: Add ioeventfd support
virt: geniezone: Add memory region support
virt: geniezone: Add dtb config support
virt: geniezone: Add virtual timer support
virt: geniezone: Add guest page fault handler
virt: geniezone: Add memory pin/unpin support

.../hypervisor/mediatek,geniezone-hyp.yaml | 31 +
Documentation/virt/geniezone/introduction.rst | 86 +++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 13 +
arch/arm64/Kbuild | 1 +
arch/arm64/geniezone/Makefile | 9 +
arch/arm64/geniezone/driver.c | 26 +
arch/arm64/geniezone/gzvm_arch_common.h | 130 ++++
arch/arm64/geniezone/vcpu.c | 155 +++++
arch/arm64/geniezone/vgic.c | 124 ++++
arch/arm64/geniezone/vm.c | 251 ++++++++
arch/arm64/include/uapi/asm/gzvm_arch.h | 58 ++
drivers/virt/Kconfig | 2 +
drivers/virt/geniezone/Kconfig | 16 +
drivers/virt/geniezone/Makefile | 12 +
drivers/virt/geniezone/gzvm_common.h | 12 +
drivers/virt/geniezone/gzvm_exception.c | 34 ++
drivers/virt/geniezone/gzvm_hvc.c | 34 ++
drivers/virt/geniezone/gzvm_ioeventfd.c | 273 +++++++++
drivers/virt/geniezone/gzvm_irqfd.c | 566 ++++++++++++++++++
drivers/virt/geniezone/gzvm_main.c | 154 +++++
drivers/virt/geniezone/gzvm_mmu.c | 210 +++++++
drivers/virt/geniezone/gzvm_vcpu.c | 280 +++++++++
drivers/virt/geniezone/gzvm_vm.c | 488 +++++++++++++++
include/linux/gzvm_drv.h | 185 ++++++
include/uapi/asm-generic/Kbuild | 1 +
include/uapi/asm-generic/gzvm_arch.h | 13 +
include/uapi/linux/gzvm.h | 362 +++++++++++
28 files changed, 3527 insertions(+)
create mode 100644 Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
create mode 100644 Documentation/virt/geniezone/introduction.rst
create mode 100644 arch/arm64/geniezone/Makefile
create mode 100644 arch/arm64/geniezone/driver.c
create mode 100644 arch/arm64/geniezone/gzvm_arch_common.h
create mode 100644 arch/arm64/geniezone/vcpu.c
create mode 100644 arch/arm64/geniezone/vgic.c
create mode 100644 arch/arm64/geniezone/vm.c
create mode 100644 arch/arm64/include/uapi/asm/gzvm_arch.h
create mode 100644 drivers/virt/geniezone/Kconfig
create mode 100644 drivers/virt/geniezone/Makefile
create mode 100644 drivers/virt/geniezone/gzvm_common.h
create mode 100644 drivers/virt/geniezone/gzvm_exception.c
create mode 100644 drivers/virt/geniezone/gzvm_hvc.c
create mode 100644 drivers/virt/geniezone/gzvm_ioeventfd.c
create mode 100644 drivers/virt/geniezone/gzvm_irqfd.c
create mode 100644 drivers/virt/geniezone/gzvm_main.c
create mode 100644 drivers/virt/geniezone/gzvm_mmu.c
create mode 100644 drivers/virt/geniezone/gzvm_vcpu.c
create mode 100644 drivers/virt/geniezone/gzvm_vm.c
create mode 100644 include/linux/gzvm_drv.h
create mode 100644 include/uapi/asm-generic/gzvm_arch.h
create mode 100644 include/uapi/linux/gzvm.h

--
2.18.0



2023-07-27 08:16:18

by Yi-De Wu

Subject: [PATCH v5 11/12] virt: geniezone: Add guest page fault handler

From: "Yingshiuan Pan" <[email protected]>

This page fault handler helps the GenieZone hypervisor do demand paging.
On a lower-level translation fault, the GenieZone hypervisor first checks
that the faulting GPA (guest physical address, or IPA on ARM) is valid,
e.g. within a registered memory region, and then sets up
vcpu_run->exit_reason with the information needed to return to the
gzvm driver.

With the fault information, the gzvm driver looks up the corresponding
physical address and issues MT_HVC_GZVM_MAP_GUEST to request that the
hypervisor map the found PA to the faulting GPA (IPA).
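
Condensed, the in-kernel path added by this patch looks roughly like the
sketch below (error handling omitted; see gzvm_handle_page_fault() in the
diff for the actual code):

	/* sketch only: resolve the faulting GPA and ask the hypervisor to map it */
	gfn = PHYS_PFN(vcpu->run->exception.fault_gpa);
	memslot_id = gzvm_find_memslot(vm, gfn);			/* is the GPA registered? */
	gzvm_gfn_to_pfn_memslot(&vm->memslot[memslot_id], gfn, &pfn);	/* GPA -> host PA */
	gzvm_arch_map_guest(vm->vm_id, memslot_id, pfn, gfn, 1);	/* MT_HVC_GZVM_MAP_GUEST */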

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/gzvm_arch_common.h | 2 +
arch/arm64/geniezone/vm.c | 9 ++++
arch/arm64/include/uapi/asm/gzvm_arch.h | 4 ++
drivers/virt/geniezone/Makefile | 2 +-
drivers/virt/geniezone/gzvm_exception.c | 66 +++++++++++++++++++++++++
drivers/virt/geniezone/gzvm_main.c | 2 +
drivers/virt/geniezone/gzvm_vcpu.c | 6 ++-
drivers/virt/geniezone/gzvm_vm.c | 28 ++++++++++-
include/linux/gzvm_drv.h | 7 +++
include/uapi/asm-generic/gzvm_arch.h | 3 ++
include/uapi/linux/gzvm.h | 14 ++++++
11 files changed, 138 insertions(+), 5 deletions(-)
create mode 100644 drivers/virt/geniezone/gzvm_exception.c

diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index e51310be2376..d4db0ee7bcb8 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -24,6 +24,7 @@ enum {
GZVM_FUNC_INFORM_EXIT = 14,
GZVM_FUNC_MEMREGION_PURPOSE = 15,
GZVM_FUNC_SET_DTB_CONFIG = 16,
+ GZVM_FUNC_MAP_GUEST = 17,
NR_GZVM_FUNC,
};

@@ -48,6 +49,7 @@ enum {
#define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT)
#define MT_HVC_GZVM_MEMREGION_PURPOSE GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE)
#define MT_HVC_GZVM_SET_DTB_CONFIG GZVM_HCALL_ID(GZVM_FUNC_SET_DTB_CONFIG)
+#define MT_HVC_GZVM_MAP_GUEST GZVM_HCALL_ID(GZVM_FUNC_MAP_GUEST)

#define GIC_V3_NR_LRS 16

diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
index a47e1d60dc1f..9d6b22bd1d70 100644
--- a/arch/arm64/geniezone/vm.c
+++ b/arch/arm64/geniezone/vm.c
@@ -240,3 +240,12 @@ u64 gzvm_hva_to_pa_arch(u64 hva)

return par & PAR_PA47_MASK;
}
+
+int gzvm_arch_map_guest(u16 vm_id, int memslot_id, u64 pfn, u64 gfn,
+ u64 nr_pages)
+{
+ struct arm_smccc_res res;
+
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_MAP_GUEST, vm_id, memslot_id,
+ pfn, gfn, nr_pages, 0, 0, &res);
+}
diff --git a/arch/arm64/include/uapi/asm/gzvm_arch.h b/arch/arm64/include/uapi/asm/gzvm_arch.h
index acfe9be0f849..ebb136c2a57a 100644
--- a/arch/arm64/include/uapi/asm/gzvm_arch.h
+++ b/arch/arm64/include/uapi/asm/gzvm_arch.h
@@ -51,4 +51,8 @@
#define GZVM_VGIC_NR_PPIS 16
#define GZVM_VGIC_NR_PRIVATE_IRQS (GZVM_VGIC_NR_SGIS + GZVM_VGIC_NR_PPIS)

+struct gzvm_arch_exception {
+ __u64 esr_el2;
+};
+
#endif /* __GZVM_ARCH_H__ */
diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile
index bc5ae49f2407..e1299f99df76 100644
--- a/drivers/virt/geniezone/Makefile
+++ b/drivers/virt/geniezone/Makefile
@@ -8,4 +8,4 @@ GZVM_DIR ?= ../../../drivers/virt/geniezone

gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \
$(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqfd.o \
- $(GZVM_DIR)/gzvm_ioeventfd.o
+ $(GZVM_DIR)/gzvm_ioeventfd.o $(GZVM_DIR)/gzvm_exception.o
diff --git a/drivers/virt/geniezone/gzvm_exception.c b/drivers/virt/geniezone/gzvm_exception.c
new file mode 100644
index 000000000000..c2cab1472d2f
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_exception.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/device.h>
+#include <linux/gzvm_drv.h>
+
+/**
+ * gzvm_handle_page_fault() - Handle guest page fault, find corresponding page
+ * for the faulting gpa
+ * @vcpu: Pointer to struct gzvm_vcpu of the faulting vcpu
+ *
+ * Return:
+ * * 0 - Successfully handled the guest page fault
+ * * -EFAULT - Failed to map the phys addr to the guest's GPA
+ */
+static int gzvm_handle_page_fault(struct gzvm_vcpu *vcpu)
+{
+ struct gzvm *vm = vcpu->gzvm;
+ int memslot_id;
+ u64 pfn, gfn;
+ int ret;
+
+ gfn = PHYS_PFN(vcpu->run->exception.fault_gpa);
+ memslot_id = gzvm_find_memslot(vm, gfn);
+ if (unlikely(memslot_id < 0))
+ return -EFAULT;
+
+ ret = gzvm_gfn_to_pfn_memslot(&vm->memslot[memslot_id], gfn, &pfn);
+ if (unlikely(ret))
+ return -EFAULT;
+
+ ret = gzvm_arch_map_guest(vm->vm_id, memslot_id, pfn, gfn, 1);
+ if (unlikely(ret))
+ return -EFAULT;
+
+ return 0;
+}
+
+/**
+ * gzvm_handle_guest_exception() - Handle guest exception
+ * @vcpu: Pointer to struct gzvm_vcpu of the faulting vcpu
+ * Return:
+ * * true - This exception has been processed; no need to return to the VMM.
+ * * false - This exception has not been processed; userspace handling is required.
+ */
+bool gzvm_handle_guest_exception(struct gzvm_vcpu *vcpu)
+{
+ int ret;
+
+ switch (vcpu->run->exception.exception) {
+ case GZVM_EXCEPTION_PAGE_FAULT:
+ ret = gzvm_handle_page_fault(vcpu);
+ break;
+ case GZVM_EXCEPTION_UNKNOWN:
+ fallthrough;
+ default:
+ ret = -EFAULT;
+ }
+
+ if (!ret)
+ return true;
+ else
+ return false;
+}
diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c
index a4c235f3ff01..933146f79d4f 100644
--- a/drivers/virt/geniezone/gzvm_main.c
+++ b/drivers/virt/geniezone/gzvm_main.c
@@ -30,6 +30,8 @@ int gzvm_err_to_errno(unsigned long err)
return 0;
case ERR_NO_MEMORY:
return -ENOMEM;
+ case ERR_INVALID_ARGS:
+ return -EINVAL;
case ERR_NOT_SUPPORTED:
return -EOPNOTSUPP;
case ERR_NOT_IMPLEMENTED:
diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c
index 8dac8ba4c0cf..794b24da40e2 100644
--- a/drivers/virt/geniezone/gzvm_vcpu.c
+++ b/drivers/virt/geniezone/gzvm_vcpu.c
@@ -112,9 +112,11 @@ static long gzvm_vcpu_run(struct gzvm_vcpu *vcpu, void * __user argp)
* it's geniezone's responsibility to fill corresponding data
* structure
*/
- case GZVM_EXIT_HYPERCALL:
- fallthrough;
case GZVM_EXIT_EXCEPTION:
+ if (!gzvm_handle_guest_exception(vcpu))
+ need_userspace = true;
+ break;
+ case GZVM_EXIT_HYPERCALL:
fallthrough;
case GZVM_EXIT_DEBUG:
fallthrough;
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index 8e9967b754df..3da5fdc141b6 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -98,8 +98,7 @@ static u64 __gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn)
* * 0 - Succeed
* * -EFAULT - Failed to convert
*/
-static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn,
- u64 *pfn)
+int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn, u64 *pfn)
{
u64 __pfn;

@@ -117,6 +116,31 @@ static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn,
return 0;
}

+/**
+ * gzvm_find_memslot() - Find the memslot containing this @gfn
+ * @vm: Pointer to struct gzvm
+ * @gfn: Guest frame number
+ *
+ * Return:
+ * * >=0 - Index of memslot
+ * * -EFAULT - Not found
+ */
+int gzvm_find_memslot(struct gzvm *vm, u64 gfn)
+{
+ int i;
+
+ for (i = 0; i < GZVM_MAX_MEM_REGION; i++) {
+ if (vm->memslot[i].npages == 0)
+ continue;
+
+ if (gfn >= vm->memslot[i].base_gfn &&
+ gfn < vm->memslot[i].base_gfn + vm->memslot[i].npages)
+ return i;
+ }
+
+ return -EFAULT;
+}
+
/**
* fill_constituents() - Populate pa to buffer until full
* @consti: Pointer to struct mem_region_addr_range.
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index e5b21ac9215b..d7838679c700 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -23,6 +23,7 @@
*/
#define NO_ERROR (0)
#define ERR_NO_MEMORY (-5)
+#define ERR_INVALID_ARGS (-8)
#define ERR_NOT_SUPPORTED (-24)
#define ERR_NOT_IMPLEMENTED (-27)
#define ERR_FAULT (-40)
@@ -119,6 +120,8 @@ int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp);
int gzvm_arch_create_vm(unsigned long vm_type);
int gzvm_arch_destroy_vm(u16 vm_id);
+int gzvm_arch_map_guest(u16 vm_id, int memslot_id, u64 pfn, u64 gfn,
+ u64 nr_pages);
int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
struct gzvm_enable_cap *cap,
void __user *argp);
@@ -134,6 +137,10 @@ int gzvm_arch_inform_exit(u16 vm_id);
int gzvm_arch_drv_init(void);
void gzvm_arch_drv_exit(void);

+int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn, u64 *pfn);
+int gzvm_find_memslot(struct gzvm *vm, u64 gfn);
+bool gzvm_handle_guest_exception(struct gzvm_vcpu *vcpu);
+
int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev);
int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
u32 irq_type, u32 irq, bool level);
diff --git a/include/uapi/asm-generic/gzvm_arch.h b/include/uapi/asm-generic/gzvm_arch.h
index c4cc12716c91..0b2cde406f5a 100644
--- a/include/uapi/asm-generic/gzvm_arch.h
+++ b/include/uapi/asm-generic/gzvm_arch.h
@@ -5,6 +5,9 @@

#ifndef __ASM_GENERIC_GZVM_ARCH_H
#define __ASM_GENERIC_GZVM_ARCH_H
+
/* geniezone only supports aarch64 platform for now */
+struct gzvm_arch_exception {
+};

#endif /* __ASM_GENERIC_GZVM_ARCH_H */
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index d37be00fbeea..a3329b713089 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -150,6 +150,12 @@ enum {
GZVM_EXIT_GZ = 0x9292000a,
};

+/* exception definitions of GZVM_EXIT_EXCEPTION */
+enum {
+ GZVM_EXCEPTION_UNKNOWN = 0x0,
+ GZVM_EXCEPTION_PAGE_FAULT = 0x1,
+};
+
/**
* struct gzvm_vcpu_run: Same purpose as kvm_run, this struct is
* shared between userspace, kernel and
@@ -174,6 +180,11 @@ enum {
* Handle exception occurred in VM
* @exception: Which exception vector
* @error_code: Exception error codes
+ * @fault_gpa: Fault GPA (guest physical address or IPA in ARM)
+ * @reserved: Reserved for future use and should be zeroed; it also keeps
+ * the offset of `gzvm_arch_exception` stable
+ * @arch: struct gzvm_arch_exception, architecture information for guest
+ * exception
* @hypercall: The nested struct in anonymous union.
* Some hypercalls issued from VM must be handled
* @args: The hypercall's arguments
@@ -220,6 +231,9 @@ struct gzvm_vcpu_run {
struct {
__u32 exception;
__u32 error_code;
+ __u64 fault_gpa;
+ __u64 reserved[6];
+ struct gzvm_arch_exception arch;
} exception;
/* GZVM_EXIT_HYPERCALL */
struct {
--
2.18.0


2023-07-27 08:16:46

by Yi-De Wu

Subject: [PATCH v5 02/12] dt-bindings: hypervisor: Add MediaTek GenieZone hypervisor

From: "Yingshiuan Pan" <[email protected]>

Add documentation for the GenieZone (gzvm) node. This node tells the gzvm
driver to start probing whether the GenieZone hypervisor is available and
able to perform virtual machine operations.

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
.../hypervisor/mediatek,geniezone-hyp.yaml | 31 +++++++++++++++++++
MAINTAINERS | 1 +
2 files changed, 32 insertions(+)
create mode 100644 Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml

diff --git a/Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml b/Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
new file mode 100644
index 000000000000..ab89a4c310cb
--- /dev/null
+++ b/Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/hypervisor/mediatek,geniezone-hyp.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MediaTek GenieZone hypervisor
+
+maintainers:
+ - Yingshiuan Pan <[email protected]>
+
+description:
+ This interface is designed for integrating the GenieZone hypervisor into the
+ Android Virtualization Framework (AVF) along with crosvm as the VMM.
+ It acts as a wrapper for every hypercall to the GenieZone hypervisor in
+ order to control guest VM lifecycles and virtual interrupt injection.
+
+properties:
+ compatible:
+ const: mediatek,geniezone-hyp
+
+required:
+ - compatible
+
+additionalProperties: false
+
+examples:
+ - |
+ hypervisor {
+ compatible = "mediatek,geniezone-hyp";
+ };
diff --git a/MAINTAINERS b/MAINTAINERS
index a81903c029f2..bfbfdb790446 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8745,6 +8745,7 @@ GENIEZONE HYPERVISOR DRIVER
M: Yingshiuan Pan <[email protected]>
M: Ze-Yu Wang <[email protected]>
M: Yi-De Wu <[email protected]>
+F: Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
F: Documentation/virt/geniezone/

GENWQE (IBM Generic Workqueue Card)
--
2.18.0


2023-07-27 08:20:54

by Yi-De Wu

Subject: [PATCH v5 09/12] virt: geniezone: Add dtb config support

From: "Jerry Wang" <[email protected]>

The hypervisor might need to know the exact address and size of the dtb
passed from userspace, so that it can parse the dtb and retrieve the VM
configuration.
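
As a rough illustration (not part of this patch), a VMM could use the new
ioctl as in the sketch below, where vm_fd is the VM file descriptor obtained
from GZVM_CREATE_VM and dtb_ipa/dtb_size are placeholders for where the VMM
loaded the dtb in guest memory:

	struct gzvm_dtb_config cfg = {
		.dtb_addr = dtb_ipa,	/* guest-physical address of the dtb (VMM-specific) */
		.dtb_size = dtb_size,
	};

	if (ioctl(vm_fd, GZVM_SET_DTB_CONFIG, &cfg) < 0)
		err(1, "GZVM_SET_DTB_CONFIG");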

Signed-off-by: Jerry Wang <[email protected]>
Signed-off-by: Liju-clr Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/gzvm_arch_common.h | 2 ++
arch/arm64/geniezone/vm.c | 9 +++++++++
drivers/virt/geniezone/gzvm_vm.c | 10 ++++++++++
include/linux/gzvm_drv.h | 1 +
include/uapi/linux/gzvm.h | 14 ++++++++++++++
5 files changed, 36 insertions(+)

diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index 321f5dbcd616..82d2c44e819b 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -23,6 +23,7 @@ enum {
GZVM_FUNC_ENABLE_CAP = 13,
GZVM_FUNC_INFORM_EXIT = 14,
GZVM_FUNC_MEMREGION_PURPOSE = 15,
+ GZVM_FUNC_SET_DTB_CONFIG = 16,
NR_GZVM_FUNC,
};

@@ -46,6 +47,7 @@ enum {
#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
#define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT)
#define MT_HVC_GZVM_MEMREGION_PURPOSE GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE)
+#define MT_HVC_GZVM_SET_DTB_CONFIG GZVM_HCALL_ID(GZVM_FUNC_SET_DTB_CONFIG)

#define GIC_V3_NR_LRS 16

diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
index 17327081eb27..a47e1d60dc1f 100644
--- a/arch/arm64/geniezone/vm.c
+++ b/arch/arm64/geniezone/vm.c
@@ -119,6 +119,15 @@ int gzvm_arch_memregion_purpose(struct gzvm *gzvm,
mem->flags, 0, 0, 0, &res);
}

+int gzvm_arch_set_dtb_config(struct gzvm *gzvm, struct gzvm_dtb_config *cfg)
+{
+ struct arm_smccc_res res;
+
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_DTB_CONFIG, gzvm->vm_id,
+ cfg->dtb_addr, cfg->dtb_size, 0, 0, 0, 0,
+ &res);
+}
+
static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm,
struct gzvm_enable_cap *cap,
struct arm_smccc_res *res)
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index a1cf970e4c91..8e9967b754df 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -413,6 +413,16 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
ret = gzvm_vm_ioctl_enable_cap(gzvm, &cap, argp);
break;
}
+ case GZVM_SET_DTB_CONFIG: {
+ struct gzvm_dtb_config cfg;
+
+ if (copy_from_user(&cfg, argp, sizeof(cfg))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ ret = gzvm_arch_set_dtb_config(gzvm, &cfg);
+ break;
+ }
default:
ret = -ENOTTY;
}
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index 369c3df6c0b5..7bc00218dce6 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -143,6 +143,7 @@ void gzvm_vm_irqfd_release(struct gzvm *gzvm);

int gzvm_arch_memregion_purpose(struct gzvm *gzvm,
struct gzvm_userspace_memory_region *mem);
+int gzvm_arch_set_dtb_config(struct gzvm *gzvm, struct gzvm_dtb_config *args);

int gzvm_init_ioeventfd(struct gzvm *gzvm);
int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args);
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index 506ef975de02..d37be00fbeea 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -326,4 +326,18 @@ struct gzvm_ioeventfd {

#define GZVM_IOEVENTFD _IOW(GZVM_IOC_MAGIC, 0x79, struct gzvm_ioeventfd)

+/**
+ * struct gzvm_dtb_config: store address and size of dtb passed from userspace
+ *
+ * @dtb_addr: dtb address set by VMM (guest memory)
+ * @dtb_size: dtb size
+ */
+struct gzvm_dtb_config {
+ __u64 dtb_addr;
+ __u64 dtb_size;
+};
+
+#define GZVM_SET_DTB_CONFIG _IOW(GZVM_IOC_MAGIC, 0xff, \
+ struct gzvm_dtb_config)
+
#endif /* __GZVM_H__ */
--
2.18.0


2023-07-27 08:21:14

by Yi-De Wu

Subject: [PATCH v5 08/12] virt: geniezone: Add memory region support

From: "Jerry Wang" <[email protected]>

The hypervisor might need to know the precise purpose of each memory
region so that it can provide the appropriate memory protection. Add a new
uapi to pass the address, size and purpose of a memory region to the
hypervisor.

Signed-off-by: Jerry Wang <[email protected]>
Signed-off-by: Liju-clr Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/gzvm_arch_common.h | 2 ++
arch/arm64/geniezone/vm.c | 10 ++++++++++
drivers/virt/geniezone/gzvm_vm.c | 7 +++++++
include/linux/gzvm_drv.h | 3 +++
4 files changed, 22 insertions(+)

diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index 051d8f49a1df..321f5dbcd616 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -22,6 +22,7 @@ enum {
GZVM_FUNC_PROBE = 12,
GZVM_FUNC_ENABLE_CAP = 13,
GZVM_FUNC_INFORM_EXIT = 14,
+ GZVM_FUNC_MEMREGION_PURPOSE = 15,
NR_GZVM_FUNC,
};

@@ -44,6 +45,7 @@ enum {
#define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE)
#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
#define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT)
+#define MT_HVC_GZVM_MEMREGION_PURPOSE GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE)

#define GIC_V3_NR_LRS 16

diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
index 2df321f13057..17327081eb27 100644
--- a/arch/arm64/geniezone/vm.c
+++ b/arch/arm64/geniezone/vm.c
@@ -109,6 +109,16 @@ int gzvm_arch_destroy_vm(u16 vm_id)
0, 0, &res);
}

+int gzvm_arch_memregion_purpose(struct gzvm *gzvm,
+ struct gzvm_userspace_memory_region *mem)
+{
+ struct arm_smccc_res res;
+
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_MEMREGION_PURPOSE, gzvm->vm_id,
+ mem->guest_phys_addr, mem->memory_size,
+ mem->flags, 0, 0, 0, &res);
+}
+
static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm,
struct gzvm_enable_cap *cap,
struct arm_smccc_res *res)
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index 60bd017e41fa..a1cf970e4c91 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -234,6 +234,7 @@ static int
gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
struct gzvm_userspace_memory_region *mem)
{
+ int ret;
struct vm_area_struct *vma;
struct gzvm_memslot *memslot;
unsigned long size;
@@ -258,6 +259,12 @@ gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
memslot->vma = vma;
memslot->flags = mem->flags;
memslot->slot_id = mem->slot;
+
+ ret = gzvm_arch_memregion_purpose(gzvm, mem);
+ if (ret) {
+ pr_err("Failed to config memory region for the specified purpose\n");
+ return -EFAULT;
+ }
return register_memslot_addr_range(gzvm, memslot);
}

diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index d2985e4df4a0..369c3df6c0b5 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -141,6 +141,9 @@ void gzvm_drv_irqfd_exit(void);
int gzvm_vm_irqfd_init(struct gzvm *gzvm);
void gzvm_vm_irqfd_release(struct gzvm *gzvm);

+int gzvm_arch_memregion_purpose(struct gzvm *gzvm,
+ struct gzvm_userspace_memory_region *mem);
+
int gzvm_init_ioeventfd(struct gzvm *gzvm);
int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args);
bool gzvm_ioevent_write(struct gzvm_vcpu *vcpu, __u64 addr, int len,
--
2.18.0


2023-07-27 08:21:24

by Yi-De Wu

Subject: [PATCH v5 07/12] virt: geniezone: Add ioeventfd support

From: "Yingshiuan Pan" <[email protected]>

Ioeventfd leverages eventfd to provide an asynchronous notification
mechanism for the VMM. The VMM can register an mmio address and bind it to an
eventfd. Once an mmio trap occurs on a registered region, the
corresponding eventfd is signaled.
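
For illustration only (not part of this patch), a VMM could register an
ioeventfd roughly as in the sketch below, where vm_fd comes from
GZVM_CREATE_VM and mmio_gpa is a placeholder for the guest-physical MMIO
address being watched:

	int efd = eventfd(0, 0);
	struct gzvm_ioeventfd io = {
		.addr = mmio_gpa,	/* guest-physical MMIO address (VMM-specific) */
		.len = 4,		/* 1, 2, 4, 8, or 0 to ignore the access size */
		.fd = efd,
		.flags = 0,		/* or GZVM_IOEVENTFD_FLAG_DATAMATCH with .datamatch set */
	};

	if (ioctl(vm_fd, GZVM_IOEVENTFD, &io) < 0)
		err(1, "GZVM_IOEVENTFD");
	/* matching 4-byte guest writes to mmio_gpa now signal efd in the kernel
	 * instead of exiting to the VMM */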

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
drivers/virt/geniezone/Makefile | 3 +-
drivers/virt/geniezone/gzvm_ioeventfd.c | 273 ++++++++++++++++++++++++
drivers/virt/geniezone/gzvm_vcpu.c | 27 ++-
drivers/virt/geniezone/gzvm_vm.c | 17 ++
include/linux/gzvm_drv.h | 12 ++
include/uapi/linux/gzvm.h | 25 +++
6 files changed, 355 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/geniezone/gzvm_ioeventfd.c

diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile
index 19a835b0aac2..bc5ae49f2407 100644
--- a/drivers/virt/geniezone/Makefile
+++ b/drivers/virt/geniezone/Makefile
@@ -7,4 +7,5 @@
GZVM_DIR ?= ../../../drivers/virt/geniezone

gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \
- $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqfd.o
+ $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqfd.o \
+ $(GZVM_DIR)/gzvm_ioeventfd.o
diff --git a/drivers/virt/geniezone/gzvm_ioeventfd.c b/drivers/virt/geniezone/gzvm_ioeventfd.c
new file mode 100644
index 000000000000..8d41db16ada2
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_ioeventfd.c
@@ -0,0 +1,273 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/syscalls.h>
+#include <linux/gzvm.h>
+#include <linux/gzvm_drv.h>
+#include <linux/wait.h>
+#include <linux/poll.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+struct gzvm_ioevent {
+ struct list_head list;
+ __u64 addr;
+ __u32 len;
+ struct eventfd_ctx *evt_ctx;
+ __u64 datamatch;
+ bool wildcard;
+};
+
+/**
+ * ioeventfd_check_collision() - Check collision; assumes gzvm->slots_lock held.
+ * @gzvm: Pointer to gzvm.
+ * @p: Pointer to gzvm_ioevent.
+ *
+ * Return:
+ * * true - collision found
+ * * false - no collision
+ */
+static bool ioeventfd_check_collision(struct gzvm *gzvm, struct gzvm_ioevent *p)
+{
+ struct gzvm_ioevent *_p;
+
+ list_for_each_entry(_p, &gzvm->ioevents, list)
+ if (_p->addr == p->addr &&
+ (!_p->len || !p->len ||
+ (_p->len == p->len &&
+ (_p->wildcard || p->wildcard ||
+ _p->datamatch == p->datamatch))))
+ return true;
+
+ return false;
+}
+
+static void gzvm_ioevent_release(struct gzvm_ioevent *p)
+{
+ eventfd_ctx_put(p->evt_ctx);
+ list_del(&p->list);
+ kfree(p);
+}
+
+static bool gzvm_ioevent_in_range(struct gzvm_ioevent *p, __u64 addr, int len,
+ const void *val)
+{
+ u64 _val;
+
+ if (addr != p->addr)
+ /* address must be precise for a hit */
+ return false;
+
+ if (!p->len)
+ /* length = 0 means only look at the address, so always a hit */
+ return true;
+
+ if (len != p->len)
+ /* address-range must be precise for a hit */
+ return false;
+
+ if (p->wildcard)
+ /* all else equal, wildcard is always a hit */
+ return true;
+
+ /* otherwise, we have to actually compare the data */
+
+ WARN_ON_ONCE(!IS_ALIGNED((unsigned long)val, len));
+
+ switch (len) {
+ case 1:
+ _val = *(u8 *)val;
+ break;
+ case 2:
+ _val = *(u16 *)val;
+ break;
+ case 4:
+ _val = *(u32 *)val;
+ break;
+ case 8:
+ _val = *(u64 *)val;
+ break;
+ default:
+ return false;
+ }
+
+ return _val == p->datamatch;
+}
+
+static int gzvm_deassign_ioeventfd(struct gzvm *gzvm,
+ struct gzvm_ioeventfd *args)
+{
+ struct gzvm_ioevent *p, *tmp;
+ struct eventfd_ctx *evt_ctx;
+ int ret = -ENOENT;
+ bool wildcard;
+
+ evt_ctx = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(evt_ctx))
+ return PTR_ERR(evt_ctx);
+
+ wildcard = !(args->flags & GZVM_IOEVENTFD_FLAG_DATAMATCH);
+
+ mutex_lock(&gzvm->lock);
+
+ list_for_each_entry_safe(p, tmp, &gzvm->ioevents, list) {
+ if (p->evt_ctx != evt_ctx ||
+ p->addr != args->addr ||
+ p->len != args->len ||
+ p->wildcard != wildcard)
+ continue;
+
+ if (!p->wildcard && p->datamatch != args->datamatch)
+ continue;
+
+ gzvm_ioevent_release(p);
+ ret = 0;
+ break;
+ }
+
+ mutex_unlock(&gzvm->lock);
+
+ /* drop the reference taken at the beginning of this function */
+ eventfd_ctx_put(evt_ctx);
+
+ return ret;
+}
+
+static int gzvm_assign_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args)
+{
+ struct eventfd_ctx *evt_ctx;
+ struct gzvm_ioevent *evt;
+ int ret;
+
+ evt_ctx = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(evt_ctx))
+ return PTR_ERR(evt_ctx);
+
+ evt = kmalloc(sizeof(*evt), GFP_KERNEL);
+ if (!evt)
+ return -ENOMEM;
+ *evt = (struct gzvm_ioevent) {
+ .addr = args->addr,
+ .len = args->len,
+ .evt_ctx = evt_ctx,
+ };
+ if (args->flags & GZVM_IOEVENTFD_FLAG_DATAMATCH) {
+ evt->datamatch = args->datamatch;
+ evt->wildcard = false;
+ } else {
+ evt->wildcard = true;
+ }
+
+ if (ioeventfd_check_collision(gzvm, evt)) {
+ ret = -EEXIST;
+ goto err_free;
+ }
+
+ mutex_lock(&gzvm->lock);
+ list_add_tail(&evt->list, &gzvm->ioevents);
+ mutex_unlock(&gzvm->lock);
+
+ return 0;
+
+err_free:
+ kfree(evt);
+ eventfd_ctx_put(evt_ctx);
+ return ret;
+}
+
+/**
+ * gzvm_ioeventfd_check_valid() - Check whether the user arguments are valid.
+ * @args: Pointer to gzvm_ioeventfd.
+ *
+ * Return:
+ * * true if user arguments are valid.
+ * * false if user arguments are invalid.
+ */
+static bool gzvm_ioeventfd_check_valid(struct gzvm_ioeventfd *args)
+{
+ /* must be natural-word sized, or 0 to ignore length */
+ switch (args->len) {
+ case 0:
+ case 1:
+ case 2:
+ case 4:
+ case 8:
+ break;
+ default:
+ return false;
+ }
+
+ /* check for range overflow */
+ if (args->addr + args->len < args->addr)
+ return false;
+
+ /* check for extra flags that we don't understand */
+ if (args->flags & ~GZVM_IOEVENTFD_VALID_FLAG_MASK)
+ return false;
+
+ /* ioeventfd with no length can't be combined with DATAMATCH */
+ if (!args->len && (args->flags & GZVM_IOEVENTFD_FLAG_DATAMATCH))
+ return false;
+
+ /* gzvm does not support pio bus ioeventfd */
+ if (args->flags & GZVM_IOEVENTFD_FLAG_PIO)
+ return false;
+
+ return true;
+}
+
+/**
+ * gzvm_ioeventfd() - Register ioevent to ioevent list.
+ * @gzvm: Pointer to gzvm.
+ * @args: Pointer to gzvm_ioeventfd.
+ *
+ * Return:
+ * * 0 - Success.
+ * * Negative - Failure.
+ */
+int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args)
+{
+ if (!gzvm_ioeventfd_check_valid(args))
+ return -EINVAL;
+
+ if (args->flags & GZVM_IOEVENTFD_FLAG_DEASSIGN)
+ return gzvm_deassign_ioeventfd(gzvm, args);
+ return gzvm_assign_ioeventfd(gzvm, args);
+}
+
+/**
+ * gzvm_ioevent_write() - Traverse this vm's registered ioeventfds to see if
+ * any of them needs to be signaled.
+ * @vcpu: Pointer to vcpu.
+ * @addr: mmio address.
+ * @len: mmio size.
+ * @val: Pointer to void.
+ *
+ * Return:
+ * * true if this io is already sent to ioeventfd's listener.
+ * * false if we cannot find any ioeventfd registering this mmio write.
+ */
+bool gzvm_ioevent_write(struct gzvm_vcpu *vcpu, __u64 addr, int len,
+ const void *val)
+{
+ struct gzvm_ioevent *e;
+
+ list_for_each_entry(e, &vcpu->gzvm->ioevents, list) {
+ if (gzvm_ioevent_in_range(e, addr, len, val)) {
+ eventfd_signal(e->evt_ctx, 1);
+ return true;
+ }
+ }
+ return false;
+}
+
+int gzvm_init_ioeventfd(struct gzvm *gzvm)
+{
+ INIT_LIST_HEAD(&gzvm->ioevents);
+
+ return 0;
+}
diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c
index a717fc713b2e..72bd122a8be7 100644
--- a/drivers/virt/geniezone/gzvm_vcpu.c
+++ b/drivers/virt/geniezone/gzvm_vcpu.c
@@ -50,6 +50,30 @@ static long gzvm_vcpu_update_one_reg(struct gzvm_vcpu *vcpu,
return 0;
}

+/**
+ * gzvm_vcpu_handle_mmio() - Handle mmio in kernel space.
+ * @vcpu: Pointer to vcpu.
+ *
+ * Return:
+ * * true - This mmio exit has been processed.
+ * * false - This mmio exit has not been processed, require userspace.
+ */
+static bool gzvm_vcpu_handle_mmio(struct gzvm_vcpu *vcpu)
+{
+ __u64 addr;
+ __u32 len;
+ const void *val_ptr;
+
+ /* So far, we don't have in-kernel mmio read handler */
+ if (!vcpu->run->mmio.is_write)
+ return false;
+ addr = vcpu->run->mmio.phys_addr;
+ len = vcpu->run->mmio.size;
+ val_ptr = &vcpu->run->mmio.data;
+
+ return gzvm_ioevent_write(vcpu, addr, len, val_ptr);
+}
+
/**
* gzvm_vcpu_run() - Handle vcpu run ioctl, entry point to guest and exit
* point from guest
@@ -81,7 +105,8 @@ static long gzvm_vcpu_run(struct gzvm_vcpu *vcpu, void * __user argp)

switch (exit_reason) {
case GZVM_EXIT_MMIO:
- need_userspace = true;
+ if (!gzvm_vcpu_handle_mmio(vcpu))
+ need_userspace = true;
break;
/**
* it's geniezone's responsibility to fill corresponding data
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index a93f5b0e7078..60bd017e41fa 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -386,6 +386,16 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
ret = gzvm_irqfd(gzvm, &data);
break;
}
+ case GZVM_IOEVENTFD: {
+ struct gzvm_ioeventfd data;
+
+ if (copy_from_user(&data, argp, sizeof(data))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ ret = gzvm_ioeventfd(gzvm, &data);
+ break;
+ }
case GZVM_ENABLE_CAP: {
struct gzvm_enable_cap cap;

@@ -462,6 +472,13 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type)
return ERR_PTR(ret);
}

+ ret = gzvm_init_ioeventfd(gzvm);
+ if (ret) {
+ pr_err("Failed to initialize ioeventfd\n");
+ kfree(gzvm);
+ return ERR_PTR(ret);
+ }
+
mutex_lock(&gzvm_list_lock);
list_add(&gzvm->vm_list, &gzvm_list);
mutex_unlock(&gzvm_list_lock);
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index af7043d66567..d2985e4df4a0 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -6,6 +6,7 @@
#ifndef __GZVM_DRV_H__
#define __GZVM_DRV_H__

+#include <linux/eventfd.h>
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/gzvm.h>
@@ -89,6 +90,8 @@ struct gzvm {
struct mutex resampler_lock;
} irqfds;

+ struct list_head ioevents;
+
struct list_head vm_list;
u16 vm_id;

@@ -138,4 +141,13 @@ void gzvm_drv_irqfd_exit(void);
int gzvm_vm_irqfd_init(struct gzvm *gzvm);
void gzvm_vm_irqfd_release(struct gzvm *gzvm);

+int gzvm_init_ioeventfd(struct gzvm *gzvm);
+int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args);
+bool gzvm_ioevent_write(struct gzvm_vcpu *vcpu, __u64 addr, int len,
+ const void *val);
+void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt);
+struct vm_area_struct *vma_lookup(struct mm_struct *mm, unsigned long addr);
+void add_wait_queue_priority(struct wait_queue_head *wq_head,
+ struct wait_queue_entry *wq_entry);
+
#endif /* __GZVM_DRV_H__ */
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index f4b16d70f035..506ef975de02 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -301,4 +301,29 @@ struct gzvm_irqfd {

#define GZVM_IRQFD _IOW(GZVM_IOC_MAGIC, 0x76, struct gzvm_irqfd)

+enum {
+ gzvm_ioeventfd_flag_nr_datamatch = 0,
+ gzvm_ioeventfd_flag_nr_pio = 1,
+ gzvm_ioeventfd_flag_nr_deassign = 2,
+ gzvm_ioeventfd_flag_nr_max,
+};
+
+#define GZVM_IOEVENTFD_FLAG_DATAMATCH (1 << gzvm_ioeventfd_flag_nr_datamatch)
+#define GZVM_IOEVENTFD_FLAG_PIO (1 << gzvm_ioeventfd_flag_nr_pio)
+#define GZVM_IOEVENTFD_FLAG_DEASSIGN (1 << gzvm_ioeventfd_flag_nr_deassign)
+#define GZVM_IOEVENTFD_VALID_FLAG_MASK ((1 << gzvm_ioeventfd_flag_nr_max) - 1)
+
+struct gzvm_ioeventfd {
+ __u64 datamatch;
+ /* private: legal pio/mmio address */
+ __u64 addr;
+ /* private: 1, 2, 4, or 8 bytes; or 0 to ignore length */
+ __u32 len;
+ __s32 fd;
+ __u32 flags;
+ __u8 pad[36];
+};
+
+#define GZVM_IOEVENTFD _IOW(GZVM_IOC_MAGIC, 0x79, struct gzvm_ioeventfd)
+
#endif /* __GZVM_H__ */
--
2.18.0


2023-07-27 09:08:08

by Yi-De Wu

Subject: [PATCH v5 01/12] docs: geniezone: Introduce GenieZone hypervisor

From: "Yi-De Wu" <[email protected]>

GenieZone is MediaTek's proprietary hypervisor solution, and it runs
standalone in EL2 as a type-1 hypervisor. It is a pure EL2
implementation, which implies that it does not rely on any specific host VM,
and this behavior improves GenieZone's security as it limits its interface.

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
---
Documentation/virt/geniezone/introduction.rst | 86 +++++++++++++++++++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 6 ++
3 files changed, 93 insertions(+)
create mode 100644 Documentation/virt/geniezone/introduction.rst

diff --git a/Documentation/virt/geniezone/introduction.rst b/Documentation/virt/geniezone/introduction.rst
new file mode 100644
index 000000000000..fb9fa41bcfb8
--- /dev/null
+++ b/Documentation/virt/geniezone/introduction.rst
@@ -0,0 +1,86 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+GenieZone Introduction
+======================
+
+Overview
+========
+GenieZone hypervisor (gzvm) is a type-1 hypervisor that supports various virtual
+machine types and provides security features such as TEE-like scenarios and
+secure boot. It can create guest VMs for security use cases and has
+virtualization capabilities for both the platform and interrupts. Although the
+hypervisor can be booted independently, it requires the assistance of the
+GenieZone hypervisor kernel driver (gzvm-ko) to leverage the Linux kernel's
+abilities for vCPU scheduling, memory management, inter-VM communication and
+virtio backend support.
+
+Supported Architecture
+======================
+GenieZone now only supports MediaTek ARM64 SoC.
+
+Features
+========
+
+- vCPU Management
+
+The VM manager provides vCPUs by time-sharing the physical CPUs. It relies on
+the Linux kernel in the host VM for vCPU scheduling and VM power management.
+
+- Memory Management
+
+For security reasons, direct use of physical memory from VMs is forbidden;
+access is instead dictated by the privilege models managed by the GenieZone
+hypervisor. With the help of gzvm-ko, the hypervisor is able to manipulate
+memory as objects.
+
+- Virtual Platform
+
+We emulate a virtual mobile platform for the guest OS running in a guest
+VM. The platform supports various architecture-defined devices, such as the
+virtual arch timer, GIC, MMIO, PSCI, exception watching, etc.
+
+- Inter-VM Communication
+
+Communication among guest VMs is provided mainly via RPC. More communication
+mechanisms based on VirtIO-vsock will be provided in the future.
+
+- Device Virtualization
+
+Device virtualization is provided using the well-known VirtIO. gzvm-ko
+redirects MMIO traps back to the VMM, where the virtual devices are mostly
+emulated. Ioeventfd is implemented using eventfd to signal the host VM that
+some IO events in guest VMs need to be processed.
+
+- Interrupt virtualization
+
+While guest VMs are running, all interrupts, both virtual and physical, are
+handled by the GenieZone hypervisor with the help of gzvm-ko. When no guest
+VM is running, physical interrupts are handled by the host VM directly for
+performance reasons. Irqfd is also implemented using eventfd for accepting
+vIRQ requests in gzvm-ko.
+
+Platform architecture component
+===============================
+
+- vm
+
+The vm component is responsible for setting up the capability and memory
+management for the protected VMs. The capability mainly covers lifecycle
+control and boot context initialization, and the memory management is tightly
+integrated with the ARM 2-stage translation tables to convert VA to IPA to PA
+under the security measures required by protected VMs.
+
+- vcpu
+
+The vcpu component is the core of virtualizing an aarch64 physical CPU, and
+it controls the vCPU lifecycle, including creation, running and destruction.
+With a self-defined exit handler, the vm component is able to act
+accordingly before a vCPU is terminated.
+
+- vgic
+
+The vgic component exposes control interfaces to the Linux kernel via irqchip,
+and we intend to support all SPIs, PPIs, and SGIs. For virtual
+interrupts, the GenieZone hypervisor writes to the list registers and triggers
+vIRQ injection in guest VMs via the GIC.
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index 7fb55ae08598..cf12444db336 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -16,6 +16,7 @@ Virtualization Support
coco/sev-guest
coco/tdx-guest
hyperv/index
+ geniezone/introduction

.. only:: html and subproject

diff --git a/MAINTAINERS b/MAINTAINERS
index ae1fd58fc64c..a81903c029f2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8741,6 +8741,12 @@ F: include/vdso/
F: kernel/time/vsyscall.c
F: lib/vdso/

+GENIEZONE HYPERVISOR DRIVER
+M: Yingshiuan Pan <[email protected]>
+M: Ze-Yu Wang <[email protected]>
+M: Yi-De Wu <[email protected]>
+F: Documentation/virt/geniezone/
+
GENWQE (IBM Generic Workqueue Card)
M: Frank Haverkamp <[email protected]>
S: Supported
--
2.18.0


2023-07-27 09:12:08

by Yi-De Wu

Subject: [PATCH v5 03/12] virt: geniezone: Add GenieZone hypervisor support

From: "Yingshiuan Pan" <[email protected]>

GenieZone is MediaTek's hypervisor solution, and it runs standalone in EL2
as a type-1 hypervisor. This patch exports a set of ioctl
interfaces for a userspace VMM (e.g., crosvm) to operate the guest VM
lifecycle (creation and destruction) on GenieZone.
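
For illustration only (not part of this patch), a VMM is expected to open the
misc device and create a VM roughly as in the sketch below; the node name
/dev/gzvm follows from the misc device being registered with KBUILD_MODNAME
("gzvm"), GZVM_CREATE_VM is assumed to return a new VM file descriptor
(analogous to KVM_CREATE_VM), and vm_type is a placeholder:

	int gzvm_fd, vm_fd;
	unsigned long vm_type = 0;	/* placeholder VM type */

	gzvm_fd = open("/dev/gzvm", O_RDWR);
	if (gzvm_fd < 0)
		err(1, "open /dev/gzvm");

	vm_fd = ioctl(gzvm_fd, GZVM_CREATE_VM, vm_type);
	if (vm_fd < 0)
		err(1, "GZVM_CREATE_VM");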

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Jerry Wang <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
MAINTAINERS | 6 +
arch/arm64/Kbuild | 1 +
arch/arm64/geniezone/Makefile | 9 +
arch/arm64/geniezone/gzvm_arch_common.h | 68 ++++
arch/arm64/geniezone/vm.c | 212 +++++++++++++
arch/arm64/include/uapi/asm/gzvm_arch.h | 20 ++
drivers/virt/Kconfig | 2 +
drivers/virt/geniezone/Kconfig | 16 +
drivers/virt/geniezone/Makefile | 10 +
drivers/virt/geniezone/gzvm_main.c | 143 +++++++++
drivers/virt/geniezone/gzvm_vm.c | 400 ++++++++++++++++++++++++
include/linux/gzvm_drv.h | 90 ++++++
include/uapi/asm-generic/Kbuild | 1 +
include/uapi/asm-generic/gzvm_arch.h | 10 +
include/uapi/linux/gzvm.h | 76 +++++
15 files changed, 1064 insertions(+)
create mode 100644 arch/arm64/geniezone/Makefile
create mode 100644 arch/arm64/geniezone/gzvm_arch_common.h
create mode 100644 arch/arm64/geniezone/vm.c
create mode 100644 arch/arm64/include/uapi/asm/gzvm_arch.h
create mode 100644 drivers/virt/geniezone/Kconfig
create mode 100644 drivers/virt/geniezone/Makefile
create mode 100644 drivers/virt/geniezone/gzvm_main.c
create mode 100644 drivers/virt/geniezone/gzvm_vm.c
create mode 100644 include/linux/gzvm_drv.h
create mode 100644 include/uapi/asm-generic/gzvm_arch.h
create mode 100644 include/uapi/linux/gzvm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index bfbfdb790446..b91d41dd2f2f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8747,6 +8747,12 @@ M: Ze-Yu Wang <[email protected]>
M: Yi-De Wu <[email protected]>
F: Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
F: Documentation/virt/geniezone/
+F: arch/arm64/geniezone/
+F: arch/arm64/include/uapi/asm/gzvm_arch.h
+F: drivers/virt/geniezone/
+F: include/linux/gzvm_drv.h
F: include/uapi/asm-generic/gzvm_arch.h
+F: include/uapi/linux/gzvm.h

GENWQE (IBM Generic Workqueue Card)
M: Frank Haverkamp <[email protected]>
diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
index 5bfbf7d79c99..0c3cca572919 100644
--- a/arch/arm64/Kbuild
+++ b/arch/arm64/Kbuild
@@ -4,6 +4,7 @@ obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/
obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
obj-$(CONFIG_CRYPTO) += crypto/
+obj-$(CONFIG_MTK_GZVM) += geniezone/

# for cleaning
subdir- += boot
diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile
new file mode 100644
index 000000000000..2957898cdd05
--- /dev/null
+++ b/arch/arm64/geniezone/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Main Makefile for gzvm, this one includes drivers/virt/geniezone/Makefile
+#
+include $(srctree)/drivers/virt/geniezone/Makefile
+
+gzvm-y += vm.o
+
+obj-$(CONFIG_MTK_GZVM) += gzvm.o
diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
new file mode 100644
index 000000000000..fdb95d619102
--- /dev/null
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#ifndef __GZVM_ARCH_COMMON_H__
+#define __GZVM_ARCH_COMMON_H__
+
+#include <linux/arm-smccc.h>
+
+enum {
+ GZVM_FUNC_CREATE_VM = 0,
+ GZVM_FUNC_DESTROY_VM = 1,
+ GZVM_FUNC_CREATE_VCPU = 2,
+ GZVM_FUNC_DESTROY_VCPU = 3,
+ GZVM_FUNC_SET_MEMREGION = 4,
+ GZVM_FUNC_RUN = 5,
+ GZVM_FUNC_GET_ONE_REG = 8,
+ GZVM_FUNC_SET_ONE_REG = 9,
+ GZVM_FUNC_IRQ_LINE = 10,
+ GZVM_FUNC_CREATE_DEVICE = 11,
+ GZVM_FUNC_PROBE = 12,
+ GZVM_FUNC_ENABLE_CAP = 13,
+ NR_GZVM_FUNC,
+};
+
+#define SMC_ENTITY_MTK 59
+#define GZVM_FUNCID_START (0x1000)
+#define GZVM_HCALL_ID(func) \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \
+ SMC_ENTITY_MTK, (GZVM_FUNCID_START + (func)))
+
+#define MT_HVC_GZVM_CREATE_VM GZVM_HCALL_ID(GZVM_FUNC_CREATE_VM)
+#define MT_HVC_GZVM_DESTROY_VM GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VM)
+#define MT_HVC_GZVM_CREATE_VCPU GZVM_HCALL_ID(GZVM_FUNC_CREATE_VCPU)
+#define MT_HVC_GZVM_DESTROY_VCPU GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VCPU)
+#define MT_HVC_GZVM_SET_MEMREGION GZVM_HCALL_ID(GZVM_FUNC_SET_MEMREGION)
+#define MT_HVC_GZVM_RUN GZVM_HCALL_ID(GZVM_FUNC_RUN)
+#define MT_HVC_GZVM_GET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_GET_ONE_REG)
+#define MT_HVC_GZVM_SET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_SET_ONE_REG)
+#define MT_HVC_GZVM_IRQ_LINE GZVM_HCALL_ID(GZVM_FUNC_IRQ_LINE)
+#define MT_HVC_GZVM_CREATE_DEVICE GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE)
+#define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE)
+#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
+
+/**
+ * gzvm_hypcall_wrapper() - the wrapper for hvc calls
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * Return: 0 on success, or a negative Linux errno converted from the geniezone error code.
+ */
+static inline int gzvm_hypcall_wrapper(unsigned long a0, unsigned long a1,
+ unsigned long a2, unsigned long a3,
+ unsigned long a4, unsigned long a5,
+ unsigned long a6, unsigned long a7,
+ struct arm_smccc_res *res)
+{
+ arm_smccc_hvc(a0, a1, a2, a3, a4, a5, a6, a7, res);
+ return gzvm_err_to_errno(res->a0);
+}
+
+static inline u16 get_vmid_from_tuple(unsigned int tuple)
+{
+ return (u16)(tuple >> 16);
+}
+
+#endif /* __GZVM_ARCH_COMMON_H__ */
diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
new file mode 100644
index 000000000000..e35751b21821
--- /dev/null
+++ b/arch/arm64/geniezone/vm.c
@@ -0,0 +1,212 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <asm/sysreg.h>
+#include <linux/arm-smccc.h>
+#include <linux/err.h>
+#include <linux/uaccess.h>
+
+#include <linux/gzvm.h>
+#include <linux/gzvm_drv.h>
+#include "gzvm_arch_common.h"
+
+#define PAR_PA47_MASK ((((1UL << 48) - 1) >> 12) << 12)
+
+int gzvm_arch_probe(void)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_hvc(MT_HVC_GZVM_PROBE, 0, 0, 0, 0, 0, 0, 0, &res);
+ if (res.a0 == 0)
+ return 0;
+
+ return -ENXIO;
+}
+
+int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
+ phys_addr_t region)
+{
+ struct arm_smccc_res res;
+
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_MEMREGION, vm_id,
+ buf_size, region, 0, 0, 0, 0, &res);
+}
+
+static int gzvm_cap_arm_vm_ipa_size(void __user *argp)
+{
+ __u64 value = CONFIG_ARM64_PA_BITS;
+
+ if (copy_to_user(argp, &value, sizeof(__u64)))
+ return -EFAULT;
+
+ return 0;
+}
+
+int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp)
+{
+ int ret = -EOPNOTSUPP;
+
+ switch (cap) {
+ case GZVM_CAP_ARM_PROTECTED_VM: {
+ __u64 success = 1;
+
+ if (copy_to_user(argp, &success, sizeof(__u64)))
+ return -EFAULT;
+ ret = 0;
+ break;
+ }
+ case GZVM_CAP_ARM_VM_IPA_SIZE: {
+ ret = gzvm_cap_arm_vm_ipa_size(argp);
+ break;
+ }
+ default:
+ ret = -EOPNOTSUPP;
+ }
+
+ return ret;
+}
+
+/**
+ * gzvm_arch_create_vm() - create vm
+ * @vm_type: VM type. Only supports Linux VM now.
+ *
+ * Return:
+ * * positive value - VM ID
+ * * -ENOMEM - Memory not enough for storing VM data
+ */
+int gzvm_arch_create_vm(unsigned long vm_type)
+{
+ struct arm_smccc_res res;
+ int ret;
+
+ ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VM, vm_type, 0, 0, 0, 0,
+ 0, 0, &res);
+
+ if (ret == 0)
+ return res.a1;
+ else
+ return ret;
+}
+
+int gzvm_arch_destroy_vm(u16 vm_id)
+{
+ struct arm_smccc_res res;
+
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_DESTROY_VM, vm_id, 0, 0, 0, 0,
+ 0, 0, &res);
+}
+
+static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm,
+ struct gzvm_enable_cap *cap,
+ struct arm_smccc_res *res)
+{
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_ENABLE_CAP, gzvm->vm_id,
+ cap->cap, cap->args[0], cap->args[1],
+ cap->args[2], cap->args[3], cap->args[4],
+ res);
+}
+
+/**
+ * gzvm_vm_ioctl_get_pvmfw_size() - Get pvmfw size from hypervisor, return
+ * in x1, and return to userspace in args
+ * @gzvm: Pointer to struct gzvm.
+ * @cap: Pointer to struct gzvm_enable_cap.
+ * @argp: Pointer to struct gzvm_enable_cap in user space.
+ *
+ * Return:
+ * * 0 - Succeed
+ * * -EINVAL - Hypervisor return invalid results
+ * * -EFAULT - Fail to copy back to userspace buffer
+ */
+static int gzvm_vm_ioctl_get_pvmfw_size(struct gzvm *gzvm,
+ struct gzvm_enable_cap *cap,
+ void __user *argp)
+{
+ struct arm_smccc_res res = {0};
+
+ if (gzvm_vm_arch_enable_cap(gzvm, cap, &res) != 0)
+ return -EINVAL;
+
+ cap->args[1] = res.a1;
+ if (copy_to_user(argp, cap, sizeof(*cap)))
+ return -EFAULT;
+
+ return 0;
+}
+
+/**
+ * gzvm_vm_ioctl_cap_pvm() - Proceed GZVM_CAP_ARM_PROTECTED_VM's subcommands
+ * @gzvm: Pointer to struct gzvm.
+ * @cap: Pointer to struct gzvm_enable_cap.
+ * @argp: Pointer to struct gzvm_enable_cap in user space.
+ *
+ * Return:
+ * * 0 - Succeed
+ * * -EINVAL - Invalid subcommand or arguments
+ */
+static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm,
+ struct gzvm_enable_cap *cap,
+ void __user *argp)
+{
+ int ret = -EINVAL;
+ struct arm_smccc_res res = {0};
+
+ switch (cap->args[0]) {
+ case GZVM_CAP_ARM_PVM_SET_PVMFW_IPA:
+ fallthrough;
+ case GZVM_CAP_ARM_PVM_SET_PROTECTED_VM:
+ ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res);
+ break;
+ case GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE:
+ ret = gzvm_vm_ioctl_get_pvmfw_size(gzvm, cap, argp);
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
+ struct gzvm_enable_cap *cap,
+ void __user *argp)
+{
+ int ret = -EINVAL;
+
+ switch (cap->cap) {
+ case GZVM_CAP_ARM_PROTECTED_VM:
+ ret = gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp);
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+/**
+ * gzvm_hva_to_pa_arch() - converts hva to pa with arch-specific way
+ * @hva: Host virtual address.
+ *
+ * Return: 0 if translation error
+ */
+u64 gzvm_hva_to_pa_arch(u64 hva)
+{
+ u64 par;
+ unsigned long flags;
+
+ local_irq_save(flags);
+ asm volatile("at s1e1r, %0" :: "r" (hva));
+ isb();
+ par = read_sysreg_par();
+ local_irq_restore(flags);
+
+ if (par & SYS_PAR_EL1_F)
+ return 0;
+
+ return par & PAR_PA47_MASK;
+}
diff --git a/arch/arm64/include/uapi/asm/gzvm_arch.h b/arch/arm64/include/uapi/asm/gzvm_arch.h
new file mode 100644
index 000000000000..847bb627a65d
--- /dev/null
+++ b/arch/arm64/include/uapi/asm/gzvm_arch.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#ifndef __GZVM_ARCH_H__
+#define __GZVM_ARCH_H__
+
+#include <linux/types.h>
+
+#define GZVM_CAP_ARM_VM_IPA_SIZE 165
+#define GZVM_CAP_ARM_PROTECTED_VM 0xffbadab1
+
+/* sub-commands put in args[0] for GZVM_CAP_ARM_PROTECTED_VM */
+#define GZVM_CAP_ARM_PVM_SET_PVMFW_IPA 0
+#define GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE 1
+/* GZVM_CAP_ARM_PVM_SET_PROTECTED_VM only sets protected but not load pvmfw */
+#define GZVM_CAP_ARM_PVM_SET_PROTECTED_VM 2
+
+#endif /* __GZVM_ARCH_H__ */
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index f79ab13a5c28..9bbf0bdf672c 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"

source "drivers/virt/coco/tdx-guest/Kconfig"

+source "drivers/virt/geniezone/Kconfig"
+
endif
diff --git a/drivers/virt/geniezone/Kconfig b/drivers/virt/geniezone/Kconfig
new file mode 100644
index 000000000000..2643fb8913cc
--- /dev/null
+++ b/drivers/virt/geniezone/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config MTK_GZVM
+ tristate "GenieZone Hypervisor driver for guest VM operation"
+ depends on ARM64
+ help
+ This driver, gzvm, enables running guest VMs on the MTK GenieZone
+ hypervisor. It exports KVM-like interfaces for a VMM (e.g., crosvm) in
+ order to operate guest VMs on the GenieZone hypervisor.
+
+ GenieZone hypervisor now only supports MediaTek SoC and arm64
+ architecture.
+
+ Select M if you want it to be built as a module (gzvm.ko).
+
+ If unsure, say N.
diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile
new file mode 100644
index 000000000000..066efddc0b9c
--- /dev/null
+++ b/drivers/virt/geniezone/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for the GenieZone driver. This file should be included by the
+# arch Makefile to avoid two kernel modules being generated.
+#
+
+GZVM_DIR ?= ../../../drivers/virt/geniezone
+
+gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o
+
diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c
new file mode 100644
index 000000000000..b629b41a0cd9
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_main.c
@@ -0,0 +1,143 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/device.h>
+#include <linux/file.h>
+#include <linux/kdev_t.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/gzvm_drv.h>
+
+/**
+ * gzvm_err_to_errno() - Convert geniezone return value to standard errno
+ *
+ * @err: Return value from geniezone function return
+ *
+ * Return: Standard errno
+ */
+int gzvm_err_to_errno(unsigned long err)
+{
+ int gz_err = (int)err;
+
+ switch (gz_err) {
+ case 0:
+ return 0;
+ case ERR_NO_MEMORY:
+ return -ENOMEM;
+ case ERR_NOT_SUPPORTED:
+ return -EOPNOTSUPP;
+ case ERR_NOT_IMPLEMENTED:
+ return -EOPNOTSUPP;
+ case ERR_FAULT:
+ return -EFAULT;
+ default:
+ break;
+ }
+
+ return -EINVAL;
+}
+
+/**
+ * gzvm_dev_ioctl_check_extension() - Check if the given capability is
+ * supported or not
+ *
+ * @gzvm: Pointer to struct gzvm
+ * @args: Pointer in u64 from userspace
+ *
+ * Return:
+ * * 0 - Support, no error
+ * * -EOPNOTSUPP - Not support
+ * * -EFAULT - Failed to get data from userspace
+ */
+long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args)
+{
+ __u64 cap;
+ void __user *argp = (void __user *)args;
+
+ if (copy_from_user(&cap, argp, sizeof(uint64_t)))
+ return -EFAULT;
+ return gzvm_arch_check_extension(gzvm, cap, argp);
+}
+
+static long gzvm_dev_ioctl(struct file *filp, unsigned int cmd,
+ unsigned long user_args)
+{
+ long ret = -ENOTTY;
+
+ switch (cmd) {
+ case GZVM_CREATE_VM:
+ ret = gzvm_dev_ioctl_create_vm(user_args);
+ break;
+ case GZVM_CHECK_EXTENSION:
+ if (!user_args)
+ return -EINVAL;
+ ret = gzvm_dev_ioctl_check_extension(NULL, user_args);
+ break;
+ default:
+ ret = -ENOTTY;
+ }
+
+ return ret;
+}
+
+static const struct file_operations gzvm_chardev_ops = {
+ .unlocked_ioctl = gzvm_dev_ioctl,
+ .llseek = noop_llseek,
+};
+
+static struct miscdevice gzvm_dev = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = KBUILD_MODNAME,
+ .fops = &gzvm_chardev_ops,
+};
+
+static int gzvm_drv_probe(struct platform_device *pdev)
+{
+ int ret;
+
+ if (gzvm_arch_probe() != 0) {
+ dev_err(&pdev->dev, "No available conduit found\n");
+ return -ENODEV;
+ }
+
+ ret = misc_register(&gzvm_dev);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int gzvm_drv_remove(struct platform_device *pdev)
+{
+ gzvm_destroy_all_vms();
+ misc_deregister(&gzvm_dev);
+ return 0;
+}
+
+static const struct of_device_id gzvm_of_match[] = {
+ { .compatible = "mediatek,geniezone-hyp", },
+ {/* sentinel */},
+};
+
+static struct platform_driver gzvm_driver = {
+ .probe = gzvm_drv_probe,
+ .remove = gzvm_drv_remove,
+ .driver = {
+ .name = KBUILD_MODNAME,
+ .owner = THIS_MODULE,
+ .of_match_table = gzvm_of_match,
+ },
+};
+
+module_platform_driver(gzvm_driver);
+
+MODULE_DEVICE_TABLE(of, gzvm_of_match);
+MODULE_AUTHOR("MediaTek");
+MODULE_DESCRIPTION("GenieZone interface for VMM");
+MODULE_LICENSE("GPL");
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
new file mode 100644
index 000000000000..ee751369fd4b
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -0,0 +1,400 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/kdev_t.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/gzvm_drv.h>
+
+static DEFINE_MUTEX(gzvm_list_lock);
+static LIST_HEAD(gzvm_list);
+
+/**
+ * hva_to_pa_fast() - Convert hva to pa using the generic fast path
+ * @hva: Host virtual address.
+ *
+ * Return: Physical address, or 0 on translation error
+ */
+static u64 hva_to_pa_fast(u64 hva)
+{
+ struct page *page[1];
+ u64 pfn;
+
+
+ if (get_user_page_fast_only(hva, 0, page)) {
+ pfn = page_to_phys(page[0]);
+ put_page(page[0]);
+ return pfn;
+ } else {
+ return 0;
+ }
+}
+
+/**
+ * hva_to_pa_slow() - Convert hva to pa; note that this function may sleep
+ * @hva: Host virtual address.
+ *
+ * Return: Physical address, or 0 on translation error
+ */
+static u64 hva_to_pa_slow(u64 hva)
+{
+ struct page *page;
+ int npages;
+ u64 pfn;
+
+ npages = get_user_pages_unlocked(hva, 1, &page, 0);
+ if (npages != 1)
+ return 0;
+
+ pfn = page_to_phys(page);
+ put_page(page);
+
+ return pfn;
+}
+
+static u64 gzvm_gfn_to_hva_memslot(struct gzvm_memslot *memslot, u64 gfn)
+{
+ u64 offset = gfn - memslot->base_gfn;
+
+ return memslot->userspace_addr + offset * PAGE_SIZE;
+}
+
+static u64 __gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn)
+{
+ u64 hva, pa;
+
+ hva = gzvm_gfn_to_hva_memslot(memslot, gfn);
+
+ pa = gzvm_hva_to_pa_arch(hva);
+ if (pa != 0)
+ return PHYS_PFN(pa);
+
+ pa = hva_to_pa_fast(hva);
+ if (pa)
+ return PHYS_PFN(pa);
+
+ pa = hva_to_pa_slow(hva);
+ if (pa)
+ return PHYS_PFN(pa);
+
+ return 0;
+}
+
+/**
+ * gzvm_gfn_to_pfn_memslot() - Translate gfn (guest ipa) to pfn (host pa),
+ * result is in @pfn
+ * @memslot: Pointer to struct gzvm_memslot.
+ * @gfn: Guest frame number.
+ * @pfn: Host page frame number.
+ *
+ * Return:
+ * * 0 - Succeed
+ * * -EFAULT - Failed to convert
+ */
+static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn,
+ u64 *pfn)
+{
+ u64 __pfn;
+
+ if (!memslot)
+ return -EFAULT;
+
+ __pfn = __gzvm_gfn_to_pfn_memslot(memslot, gfn);
+ if (__pfn == 0) {
+ *pfn = 0;
+ return -EFAULT;
+ }
+
+ *pfn = __pfn;
+
+ return 0;
+}
+
+/**
+ * fill_constituents() - Populate pa to buffer until full
+ * @consti: Pointer to struct mem_region_addr_range.
+ * @consti_cnt: Constituent count.
+ * @max_nr_consti: Maximum number of constituent count.
+ * @gfn: Guest frame number.
+ * @total_pages: Total page numbers.
+ * @slot: Pointer to struct gzvm_memslot.
+ *
+ * Return: Number of pages filled in, negative if error
+ */
+static int fill_constituents(struct mem_region_addr_range *consti,
+ int *consti_cnt, int max_nr_consti, u64 gfn,
+ u32 total_pages, struct gzvm_memslot *slot)
+{
+ u64 pfn, prev_pfn, gfn_end;
+ int nr_pages = 1;
+ int i = 0;
+
+ if (unlikely(total_pages == 0))
+ return -EINVAL;
+ gfn_end = gfn + total_pages;
+
+ /* entry 0 */
+ if (gzvm_gfn_to_pfn_memslot(slot, gfn, &pfn) != 0)
+ return -EFAULT;
+ consti[0].address = PFN_PHYS(pfn);
+ consti[0].pg_cnt = 1;
+ gfn++;
+ prev_pfn = pfn;
+
+ while (i < max_nr_consti && gfn < gfn_end) {
+ if (gzvm_gfn_to_pfn_memslot(slot, gfn, &pfn) != 0)
+ return -EFAULT;
+ if (pfn == (prev_pfn + 1)) {
+ consti[i].pg_cnt++;
+ } else {
+ i++;
+ if (i >= max_nr_consti)
+ break;
+ consti[i].address = PFN_PHYS(pfn);
+ consti[i].pg_cnt = 1;
+ }
+ prev_pfn = pfn;
+ gfn++;
+ nr_pages++;
+ }
+ if (i != max_nr_consti)
+ i++;
+ *consti_cnt = i;
+
+ return nr_pages;
+}
+
+/* register_memslot_addr_range() - Register memory region to GZ */
+static int
+register_memslot_addr_range(struct gzvm *gzvm, struct gzvm_memslot *memslot)
+{
+ struct gzvm_memory_region_ranges *region;
+ u32 buf_size;
+ int max_nr_consti, remain_pages;
+ u64 gfn, gfn_end;
+
+ buf_size = PAGE_SIZE * 2;
+ region = alloc_pages_exact(buf_size, GFP_KERNEL);
+ if (!region)
+ return -ENOMEM;
+ max_nr_consti = (buf_size - sizeof(*region)) /
+ sizeof(struct mem_region_addr_range);
+
+ region->slot = memslot->slot_id;
+ remain_pages = memslot->npages;
+ gfn = memslot->base_gfn;
+ gfn_end = gfn + remain_pages;
+ while (gfn < gfn_end) {
+ int nr_pages;
+
+ nr_pages = fill_constituents(region->constituents,
+ &region->constituent_cnt,
+ max_nr_consti, gfn,
+ remain_pages, memslot);
+ if (nr_pages < 0) {
+ pr_err("Failed to fill constituents\n");
+ free_pages_exact(region, buf_size);
+ return nr_pages;
+ }
+ region->gpa = PFN_PHYS(gfn);
+ region->total_pages = nr_pages;
+
+ remain_pages -= nr_pages;
+ gfn += nr_pages;
+
+ if (gzvm_arch_set_memregion(gzvm->vm_id, buf_size,
+ virt_to_phys(region))) {
+ pr_err("Failed to register memregion to hypervisor\n");
+ free_pages_exact(region, buf_size);
+ return -EFAULT;
+ }
+ }
+ free_pages_exact(region, buf_size);
+ return 0;
+}
+
+/**
+ * gzvm_vm_ioctl_set_memory_region() - Set memory region of guest
+ * @gzvm: Pointer to struct gzvm.
+ * @mem: Input memory region from user.
+ *
+ * Return:
+ * * -ENXIO - memslot is out-of-range
+ * * -EFAULT - Cannot find the corresponding vma
+ * * -EINVAL - Region size and vma size do not match
+ */
+static int
+gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
+ struct gzvm_userspace_memory_region *mem)
+{
+ struct vm_area_struct *vma;
+ struct gzvm_memslot *memslot;
+ unsigned long size;
+ __u32 slot;
+
+ slot = mem->slot;
+ if (slot >= GZVM_MAX_MEM_REGION)
+ return -ENXIO;
+ memslot = &gzvm->memslot[slot];
+
+ vma = vma_lookup(gzvm->mm, mem->userspace_addr);
+ if (!vma)
+ return -EFAULT;
+
+ size = vma->vm_end - vma->vm_start;
+ if (size != mem->memory_size)
+ return -EINVAL;
+
+ memslot->base_gfn = __phys_to_pfn(mem->guest_phys_addr);
+ memslot->npages = size >> PAGE_SHIFT;
+ memslot->userspace_addr = mem->userspace_addr;
+ memslot->vma = vma;
+ memslot->flags = mem->flags;
+ memslot->slot_id = mem->slot;
+ return register_memslot_addr_range(gzvm, memslot);
+}
+
+static int gzvm_vm_ioctl_enable_cap(struct gzvm *gzvm,
+ struct gzvm_enable_cap *cap,
+ void __user *argp)
+{
+ return gzvm_vm_ioctl_arch_enable_cap(gzvm, cap, argp);
+}
+
+/* gzvm_vm_ioctl() - Ioctl handler of VM FD */
+static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
+ unsigned long arg)
+{
+ long ret = -ENOTTY;
+ void __user *argp = (void __user *)arg;
+ struct gzvm *gzvm = filp->private_data;
+
+ switch (ioctl) {
+ case GZVM_CHECK_EXTENSION: {
+ ret = gzvm_dev_ioctl_check_extension(gzvm, arg);
+ break;
+ }
+ case GZVM_SET_USER_MEMORY_REGION: {
+ struct gzvm_userspace_memory_region userspace_mem;
+
+ if (copy_from_user(&userspace_mem, argp, sizeof(userspace_mem))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ ret = gzvm_vm_ioctl_set_memory_region(gzvm, &userspace_mem);
+ break;
+ }
+ case GZVM_ENABLE_CAP: {
+ struct gzvm_enable_cap cap;
+
+ if (copy_from_user(&cap, argp, sizeof(cap))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ ret = gzvm_vm_ioctl_enable_cap(gzvm, &cap, argp);
+ break;
+ }
+ default:
+ ret = -ENOTTY;
+ }
+out:
+ return ret;
+}
+
+static void gzvm_destroy_vm(struct gzvm *gzvm)
+{
+ pr_debug("VM-%u is going to be destroyed\n", gzvm->vm_id);
+
+ mutex_lock(&gzvm->lock);
+
+ gzvm_arch_destroy_vm(gzvm->vm_id);
+
+ mutex_lock(&gzvm_list_lock);
+ list_del(&gzvm->vm_list);
+ mutex_unlock(&gzvm_list_lock);
+
+ mutex_unlock(&gzvm->lock);
+
+ kfree(gzvm);
+}
+
+static int gzvm_vm_release(struct inode *inode, struct file *filp)
+{
+ struct gzvm *gzvm = filp->private_data;
+
+ gzvm_destroy_vm(gzvm);
+ return 0;
+}
+
+static const struct file_operations gzvm_vm_fops = {
+ .release = gzvm_vm_release,
+ .unlocked_ioctl = gzvm_vm_ioctl,
+ .llseek = noop_llseek,
+};
+
+static struct gzvm *gzvm_create_vm(unsigned long vm_type)
+{
+ int ret;
+ struct gzvm *gzvm;
+
+ gzvm = kzalloc(sizeof(*gzvm), GFP_KERNEL);
+ if (!gzvm)
+ return ERR_PTR(-ENOMEM);
+
+ ret = gzvm_arch_create_vm(vm_type);
+ if (ret < 0) {
+ kfree(gzvm);
+ return ERR_PTR(ret);
+ }
+
+ gzvm->vm_id = ret;
+ gzvm->mm = current->mm;
+ mutex_init(&gzvm->lock);
+
+ mutex_lock(&gzvm_list_lock);
+ list_add(&gzvm->vm_list, &gzvm_list);
+ mutex_unlock(&gzvm_list_lock);
+
+ pr_debug("VM-%u is created\n", gzvm->vm_id);
+
+ return gzvm;
+}
+
+/**
+ * gzvm_dev_ioctl_create_vm() - Create vm fd
+ * @vm_type: VM type. Only supports Linux VM now.
+ *
+ * Return: fd of vm, negative if error
+ */
+int gzvm_dev_ioctl_create_vm(unsigned long vm_type)
+{
+ struct gzvm *gzvm;
+
+ gzvm = gzvm_create_vm(vm_type);
+ if (IS_ERR(gzvm))
+ return PTR_ERR(gzvm);
+
+ return anon_inode_getfd("gzvm-vm", &gzvm_vm_fops, gzvm,
+ O_RDWR | O_CLOEXEC);
+}
+
+void gzvm_destroy_all_vms(void)
+{
+ struct gzvm *gzvm, *tmp;
+
+ mutex_lock(&gzvm_list_lock);
+ if (list_empty(&gzvm_list))
+ goto out;
+
+ list_for_each_entry_safe(gzvm, tmp, &gzvm_list, vm_list)
+ gzvm_destroy_vm(gzvm);
+
+out:
+ mutex_unlock(&gzvm_list_lock);
+}
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
new file mode 100644
index 000000000000..4fd52fcbd5a8
--- /dev/null
+++ b/include/linux/gzvm_drv.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#ifndef __GZVM_DRV_H__
+#define __GZVM_DRV_H__
+
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/gzvm.h>
+
+#define GZVM_VCPU_MMAP_SIZE PAGE_SIZE
+#define INVALID_VM_ID 0xffff
+
+/*
+ * These are the definitions of APIs between the GenieZone hypervisor and the
+ * driver; they do not need to be visible to uapi. Furthermore, we need
+ * GenieZone-specific error codes in order to map them to Linux errno
+ */
+#define NO_ERROR (0)
+#define ERR_NO_MEMORY (-5)
+#define ERR_NOT_SUPPORTED (-24)
+#define ERR_NOT_IMPLEMENTED (-27)
+#define ERR_FAULT (-40)
+
+/*
+ * The following data structures are for data transferring between driver and
+ * hypervisor, and they're aligned with hypervisor definitions
+ */
+#define GZVM_MAX_VCPUS 8
+#define GZVM_MAX_MEM_REGION 10
+
+/* struct mem_region_addr_range - Identical to ffa memory constituent */
+struct mem_region_addr_range {
+ /* the base IPA of the constituent memory region, aligned to 4 kiB */
+ __u64 address;
+ /* the number of 4 kiB pages in the constituent memory region. */
+ __u32 pg_cnt;
+ __u32 reserved;
+};
+
+struct gzvm_memory_region_ranges {
+ __u32 slot;
+ __u32 constituent_cnt;
+ __u64 total_pages;
+ __u64 gpa;
+ struct mem_region_addr_range constituents[];
+};
+
+/* struct gzvm_memslot - VM's memory slot descriptor */
+struct gzvm_memslot {
+ u64 base_gfn; /* begin of guest page frame */
+ unsigned long npages; /* number of pages this slot covers */
+ unsigned long userspace_addr; /* corresponding userspace va */
+ struct vm_area_struct *vma; /* vma related to this userspace addr */
+ u32 flags;
+ u32 slot_id;
+};
+
+struct gzvm {
+ /* userspace tied to this vm */
+ struct mm_struct *mm;
+ struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION];
+ /* lock for list_add */
+ struct mutex lock;
+ struct list_head vm_list;
+ u16 vm_id;
+};
+
+long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args);
+int gzvm_dev_ioctl_create_vm(unsigned long vm_type);
+
+int gzvm_err_to_errno(unsigned long err);
+
+void gzvm_destroy_all_vms(void);
+
+/* arch-dependant functions */
+int gzvm_arch_probe(void);
+int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
+ phys_addr_t region);
+int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp);
+int gzvm_arch_create_vm(unsigned long vm_type);
+int gzvm_arch_destroy_vm(u16 vm_id);
+int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
+ struct gzvm_enable_cap *cap,
+ void __user *argp);
+u64 gzvm_hva_to_pa_arch(u64 hva);
+
+#endif /* __GZVM_DRV_H__ */
diff --git a/include/uapi/asm-generic/Kbuild b/include/uapi/asm-generic/Kbuild
index ebb180aac74e..5af115a3c1a8 100644
--- a/include/uapi/asm-generic/Kbuild
+++ b/include/uapi/asm-generic/Kbuild
@@ -34,3 +34,4 @@ mandatory-y += termbits.h
mandatory-y += termios.h
mandatory-y += types.h
mandatory-y += unistd.h
+mandatory-y += gzvm_arch.h
diff --git a/include/uapi/asm-generic/gzvm_arch.h b/include/uapi/asm-generic/gzvm_arch.h
new file mode 100644
index 000000000000..c4cc12716c91
--- /dev/null
+++ b/include/uapi/asm-generic/gzvm_arch.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#ifndef __ASM_GENERIC_GZVM_ARCH_H
+#define __ASM_GENERIC_GZVM_ARCH_H
+/* geniezone only supports aarch64 platform for now */
+
+#endif /* __ASM_GENERIC_GZVM_ARCH_H */
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
new file mode 100644
index 000000000000..99730c142b0e
--- /dev/null
+++ b/include/uapi/linux/gzvm.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+/**
+ * DOC: UAPI of GenieZone Hypervisor
+ *
+ * This file declares common data structure shared among user space,
+ * kernel space, and GenieZone hypervisor.
+ */
+#ifndef __GZVM_H__
+#define __GZVM_H__
+
+#include <linux/const.h>
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#include <asm/gzvm_arch.h>
+
+/* GZVM ioctls */
+#define GZVM_IOC_MAGIC 0x92 /* gz */
+
+/* ioctls for /dev/gzvm fds */
+#define GZVM_CREATE_VM _IO(GZVM_IOC_MAGIC, 0x01) /* Returns a Geniezone VM fd */
+
+/*
+ * Check if the given capability is supported or not.
+ * The argument is a capability, e.g. GZVM_CAP_ARM_PROTECTED_VM or
+ * GZVM_CAP_ARM_VM_IPA_SIZE.
+ * Returns 0 if supported, -EOPNOTSUPP if unsupported, or -EFAULT if the
+ * argument cannot be read from userspace.
+ */
+#define GZVM_CHECK_EXTENSION _IO(GZVM_IOC_MAGIC, 0x03)
+
+/* ioctls for VM fds */
+/* for GZVM_SET_MEMORY_REGION */
+struct gzvm_memory_region {
+ __u32 slot;
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size; /* bytes */
+};
+
+#define GZVM_SET_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x40, \
+ struct gzvm_memory_region)
+
+/* for GZVM_SET_USER_MEMORY_REGION */
+struct gzvm_userspace_memory_region {
+ __u32 slot;
+ __u32 flags;
+ __u64 guest_phys_addr;
+ /* bytes */
+ __u64 memory_size;
+ /* start of the userspace allocated memory */
+ __u64 userspace_addr;
+};
+
+#define GZVM_SET_USER_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x46, \
+ struct gzvm_userspace_memory_region)
+
+/* for GZVM_ENABLE_CAP */
+struct gzvm_enable_cap {
+ /* in */
+ __u64 cap;
+ /*
+ * we have a total of 5 (8 - 3) registers that can be used for
+ * additional args
+ */
+ __u64 args[5];
+};
+
+#define GZVM_ENABLE_CAP _IOW(GZVM_IOC_MAGIC, 0xa3, \
+ struct gzvm_enable_cap)
+
+#endif /* __GZVM_H__ */
--
2.18.0


2023-07-27 09:12:30

by Yi-De Wu

[permalink] [raw]
Subject: [PATCH v5 10/12] virt: geniezone: Add virtual timer support

From: "Willix Yeh" <[email protected]>

Implement the vtimer migration handler.
- Use a host hrtimer to carry the guest vtimer across vcpu exits; the
  remaining delay is converted from timer cycles to nanoseconds (see the
  sketch after this list)
- Check the migrate flag to decide whether to arm the hrtimer
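
For reference, below is a minimal, userspace-style sketch (not part of the
patch) of the cycles-to-nanoseconds conversion this handler relies on: at
init the arch timer frequency is folded into a (mult, shift) pair, and the
remaining guest vtimer delay is later scaled to nanoseconds before arming
the host hrtimer. The 13 MHz frequency, the shift value and the delay are
assumptions for illustration; the driver obtains the real values through
arch_timer_get_cntfrq() and clocks_calc_mult_shift().

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /* Assumed CNTFRQ; the driver reads the real one at probe time. */
        const uint64_t freq = 13000000;   /* 13 MHz */
        const uint32_t shift = 10;        /* assumed; chosen by clocks_calc_mult_shift() in the driver */
        /* ns = cycles * (NSEC_PER_SEC / freq), folded into a multiply and a shift */
        const uint32_t mult = (uint32_t)(((1000000000ULL << shift) + freq / 2) / freq);

        uint64_t delay_cycles = 65000;    /* e.g. ~5 ms left on the guest vtimer */
        uint64_t ns = (delay_cycles * (uint64_t)mult) >> shift;

        printf("%llu cycles -> ~%llu ns\n",
               (unsigned long long)delay_cycles, (unsigned long long)ns);
        return 0;
}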

Signed-off-by: Willix Yeh <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/Makefile | 2 +-
arch/arm64/geniezone/driver.c | 26 ++++++++
arch/arm64/geniezone/gzvm_arch_common.h | 18 ++++++
arch/arm64/geniezone/vcpu.c | 83 ++++++++++++++++++++++---
arch/arm64/geniezone/vgic.c | 16 +++++
drivers/virt/geniezone/gzvm_main.c | 5 ++
drivers/virt/geniezone/gzvm_vcpu.c | 4 +-
include/linux/gzvm_drv.h | 8 ++-
8 files changed, 149 insertions(+), 13 deletions(-)
create mode 100644 arch/arm64/geniezone/driver.c

diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile
index 0e4f1087f9de..59e04cc0a000 100644
--- a/arch/arm64/geniezone/Makefile
+++ b/arch/arm64/geniezone/Makefile
@@ -4,6 +4,6 @@
#
include $(srctree)/drivers/virt/geniezone/Makefile

-gzvm-y += vm.o vcpu.o vgic.o
+gzvm-y += vm.o vcpu.o vgic.o driver.o

obj-$(CONFIG_MTK_GZVM) += gzvm.o
diff --git a/arch/arm64/geniezone/driver.c b/arch/arm64/geniezone/driver.c
new file mode 100644
index 000000000000..fb6ec0fed4d8
--- /dev/null
+++ b/arch/arm64/geniezone/driver.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/clocksource.h>
+#include <linux/gzvm_drv.h>
+#include "gzvm_arch_common.h"
+
+struct timecycle clock_scale_factor;
+
+int gzvm_arch_drv_init(void)
+{
+ /* initialize the mult and shift of clock_scale_factor */
+ clocks_calc_mult_shift(&clock_scale_factor.mult,
+ &clock_scale_factor.shift,
+ arch_timer_get_cntfrq(),
+ NSEC_PER_SEC,
+ 10);
+
+ return 0;
+}
+
+void gzvm_arch_drv_exit(void)
+{
+}
diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index 82d2c44e819b..e51310be2376 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -51,6 +51,13 @@ enum {

#define GIC_V3_NR_LRS 16

+struct timecycle {
+ u32 mult;
+ u32 shift;
+};
+
+extern struct timecycle clock_scale_factor;
+
/**
* gzvm_hypcall_wrapper() - the wrapper for hvc calls
* @a0-a7: arguments passed in registers 0 to 7
@@ -84,14 +91,22 @@ static inline u16 get_vcpuid_from_tuple(unsigned int tuple)
* @__pad: add an explicit '__u32 __pad;' in the middle to make it clear
* what the actual layout is.
* @lr: The array of LRs(list registers).
+ * @vtimer_delay: The remaining time until the next tick of guest VM.
+ * @vtimer_migrate: Flag indicating whether the guest VM's vtimer needs to be migrated.
+ * @vtimer_irq_num: vtimer irq number.
*
* - Keep the same layout of hypervisor data struct.
* - Sync list registers back for acking virtual device interrupt status.
+ * - Sync timer registers back for migrating the guest timer to the host's
+ * hrtimer so the timer keeps working in the background.
*/
struct gzvm_vcpu_hwstate {
__le32 nr_lrs;
__le32 __pad;
__le64 lr[GIC_V3_NR_LRS];
+ __le64 vtimer_delay;
+ __le32 vtimer_migrate;
+ __le32 vtimer_irq_num;
};

static inline unsigned int
@@ -107,4 +122,7 @@ disassemble_vm_vcpu_tuple(unsigned int tuple, u16 *vmid, u16 *vcpuid)
*vcpuid = get_vcpuid_from_tuple(tuple);
}

+int gzvm_vgic_inject_ppi(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq, bool level);
+
#endif /* __GZVM_ARCH_COMMON_H__ */
diff --git a/arch/arm64/geniezone/vcpu.c b/arch/arm64/geniezone/vcpu.c
index 95681fd66656..b26bbadf10a0 100644
--- a/arch/arm64/geniezone/vcpu.c
+++ b/arch/arm64/geniezone/vcpu.c
@@ -4,6 +4,7 @@
*/

#include <linux/arm-smccc.h>
+#include <linux/clocksource.h>
#include <linux/err.h>
#include <linux/uaccess.h>

@@ -40,25 +41,91 @@ int gzvm_arch_vcpu_update_one_reg(struct gzvm_vcpu *vcpu, __u64 reg_id,
return ret;
}

+static void clear_migrate_state(struct gzvm_vcpu *vcpu)
+{
+ vcpu->hwstate->vtimer_migrate = 0;
+ vcpu->hwstate->vtimer_delay = 0;
+}
+
+static u64 gzvm_mtimer_delay_time(u64 delay)
+{
+ u64 ns;
+
+ ns = clocksource_cyc2ns(delay, clock_scale_factor.mult,
+ clock_scale_factor.shift);
+
+ return ns;
+}
+
+static void gzvm_mtimer_release(struct gzvm_vcpu *vcpu)
+{
+ hrtimer_cancel(&vcpu->gzvm_mtimer);
+
+ clear_migrate_state(vcpu);
+}
+
+static void gzvm_mtimer_catch(struct hrtimer *hrt, u64 delay)
+{
+ u64 ns;
+
+ ns = gzvm_mtimer_delay_time(delay);
+ hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns), HRTIMER_MODE_ABS_HARD);
+}
+
+static void mtimer_irq_forward(struct gzvm_vcpu *vcpu)
+{
+ gzvm_vgic_inject_ppi(vcpu->gzvm, vcpu->vcpuid,
+ vcpu->hwstate->vtimer_irq_num, 1);
+}
+
+static enum hrtimer_restart gzvm_mtimer_expire(struct hrtimer *hrt)
+{
+ struct gzvm_vcpu *vcpu;
+
+ vcpu = container_of(hrt, struct gzvm_vcpu, gzvm_mtimer);
+
+ mtimer_irq_forward(vcpu);
+
+ return HRTIMER_NORESTART;
+}
+
+static void vtimer_init(struct gzvm_vcpu *vcpu)
+{
+ /* gzvm_mtimer init based on hrtimer */
+ hrtimer_init(&vcpu->gzvm_mtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
+ vcpu->gzvm_mtimer.function = gzvm_mtimer_expire;
+}
+
int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason)
{
struct arm_smccc_res res;
unsigned long a1;
int ret;

+ /* cancel the hrtimer and clear the migrate state */
+ if (vcpu->hwstate->vtimer_migrate)
+ gzvm_mtimer_release(vcpu);
+
a1 = assemble_vm_vcpu_tuple(vcpu->gzvm->vm_id, vcpu->vcpuid);
ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_RUN, a1, 0, 0, 0, 0, 0,
0, &res);
+
+ /* arm the hrtimer if migration is needed */
+ if (vcpu->hwstate->vtimer_migrate)
+ gzvm_mtimer_catch(&vcpu->gzvm_mtimer, vcpu->hwstate->vtimer_delay);
+
*exit_reason = res.a1;
return ret;
}

-int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid)
+int gzvm_arch_destroy_vcpu(struct gzvm_vcpu *vcpu)
{
struct arm_smccc_res res;
unsigned long a1;

- a1 = assemble_vm_vcpu_tuple(vm_id, vcpuid);
+ hrtimer_cancel(&vcpu->gzvm_mtimer);
+
+ a1 = assemble_vm_vcpu_tuple(vcpu->gzvm->vm_id, vcpu->vcpuid);
gzvm_hypcall_wrapper(MT_HVC_GZVM_DESTROY_VCPU, a1, 0, 0, 0, 0, 0, 0,
&res);

@@ -67,20 +134,20 @@ int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid)

/**
* gzvm_arch_create_vcpu() - Call smc to gz hypervisor to create vcpu
- * @vm_id: vm id
- * @vcpuid: vcpu id
- * @run: Virtual address of vcpu->run
+ * @vcpu: Pointer to struct gzvm_vcpu
*
* Return: The wrapper helps caller to convert geniezone errno to Linux errno.
*/
-int gzvm_arch_create_vcpu(u16 vm_id, int vcpuid, void *run)
+int gzvm_arch_create_vcpu(struct gzvm_vcpu *vcpu)
{
struct arm_smccc_res res;
unsigned long a1, a2;
int ret;

- a1 = assemble_vm_vcpu_tuple(vm_id, vcpuid);
- a2 = (__u64)virt_to_phys(run);
+ vtimer_init(vcpu);
+
+ a1 = assemble_vm_vcpu_tuple(vcpu->gzvm->vm_id, vcpu->vcpuid);
+ a2 = (__u64)virt_to_phys(vcpu->run);
ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VCPU, a1, a2, 0, 0, 0, 0,
0, &res);

diff --git a/arch/arm64/geniezone/vgic.c b/arch/arm64/geniezone/vgic.c
index 3746e0c9e247..e24728997b57 100644
--- a/arch/arm64/geniezone/vgic.c
+++ b/arch/arm64/geniezone/vgic.c
@@ -91,6 +91,22 @@ static int gzvm_vgic_inject_spi(struct gzvm *gzvm, unsigned int vcpu_idx,
level);
}

+/**
+ * gzvm_vgic_inject_ppi() - Inject virtual ppi interrupt
+ * @gzvm: Pointer to struct gzvm
+ * @vcpu_idx: vcpu index
+ * @irq: This is the ppi interrupt number (16 ~ 31)
+ * @level: 1 if true else 0
+ *
+ * Return:
+ * * 0 if succeed else other negative values indicating each errors
+ */
+int gzvm_vgic_inject_ppi(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq, bool level)
+{
+ return gzvm_vgic_inject_irq(gzvm, vcpu_idx, GZVM_IRQ_TYPE_PPI, irq, level);
+}
+
int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev)
{
struct arm_smccc_res res;
diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c
index d4d5d75d3660..a4c235f3ff01 100644
--- a/drivers/virt/geniezone/gzvm_main.c
+++ b/drivers/virt/geniezone/gzvm_main.c
@@ -106,6 +106,10 @@ static int gzvm_drv_probe(struct platform_device *pdev)
return -ENODEV;
}

+ ret = gzvm_arch_drv_init();
+ if (ret)
+ return ret;
+
ret = misc_register(&gzvm_dev);
if (ret)
return ret;
@@ -121,6 +125,7 @@ static int gzvm_drv_remove(struct platform_device *pdev)
gzvm_drv_irqfd_exit();
gzvm_destroy_all_vms();
misc_deregister(&gzvm_dev);
+ gzvm_arch_drv_exit();
return 0;
}

diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c
index 72bd122a8be7..8dac8ba4c0cf 100644
--- a/drivers/virt/geniezone/gzvm_vcpu.c
+++ b/drivers/virt/geniezone/gzvm_vcpu.c
@@ -188,7 +188,7 @@ static void gzvm_destroy_vcpu(struct gzvm_vcpu *vcpu)
if (!vcpu)
return;

- gzvm_arch_destroy_vcpu(vcpu->gzvm->vm_id, vcpu->vcpuid);
+ gzvm_arch_destroy_vcpu(vcpu);
/* clean guest's data */
memset(vcpu->run, 0, GZVM_VCPU_RUN_MAP_SIZE);
free_pages_exact(vcpu->run, GZVM_VCPU_RUN_MAP_SIZE);
@@ -257,7 +257,7 @@ int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid)
vcpu->gzvm = gzvm;
mutex_init(&vcpu->lock);

- ret = gzvm_arch_create_vcpu(gzvm->vm_id, vcpu->vcpuid, vcpu->run);
+ ret = gzvm_arch_create_vcpu(vcpu);
if (ret < 0)
goto free_vcpu_run;

diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index 7bc00218dce6..e5b21ac9215b 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -6,6 +6,7 @@
#ifndef __GZVM_DRV_H__
#define __GZVM_DRV_H__

+#include <asm/arch_timer.h>
#include <linux/eventfd.h>
#include <linux/list.h>
#include <linux/mutex.h>
@@ -71,6 +72,7 @@ struct gzvm_vcpu {
struct mutex lock;
struct gzvm_vcpu_run *run;
struct gzvm_vcpu_hwstate *hwstate;
+ struct hrtimer gzvm_mtimer;
};

struct gzvm {
@@ -125,10 +127,12 @@ u64 gzvm_hva_to_pa_arch(u64 hva);
int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid);
int gzvm_arch_vcpu_update_one_reg(struct gzvm_vcpu *vcpu, __u64 reg_id,
bool is_write, __u64 *data);
-int gzvm_arch_create_vcpu(u16 vm_id, int vcpuid, void *run);
+int gzvm_arch_create_vcpu(struct gzvm_vcpu *vcpu);
int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason);
-int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid);
+int gzvm_arch_destroy_vcpu(struct gzvm_vcpu *vcpu);
int gzvm_arch_inform_exit(u16 vm_id);
+int gzvm_arch_drv_init(void);
+void gzvm_arch_drv_exit(void);

int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev);
int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
--
2.18.0


2023-07-27 09:16:27

by Yi-De Wu

[permalink] [raw]
Subject: [PATCH v5 05/12] virt: geniezone: Add irqchip support for virtual interrupt injection

From: "Yingshiuan Pan" <[email protected]>

Enable GenieZone to handle virtual interrupt injection requests.
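
To make the new interface concrete, here is a hedged, VMM-side sketch (not
part of the patch) of driving GZVM_IRQ_LINE for an SPI. It packs the irq
field according to the GZVM_IRQ_LINE_* bit layout added to the UAPI header
in this patch; the SPI number in the usage comment is only an example.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/gzvm.h>

/*
 * Assert or deassert an SPI on a GenieZone VM. vm_fd is the fd returned by
 * GZVM_CREATE_VM. The SPI number starts from 0; the driver adds the
 * 32-interrupt private offset internally.
 */
static int gzvm_set_spi(int vm_fd, uint32_t spi, uint32_t level)
{
        struct gzvm_irq_level irq_level = { 0 };

        /* Pack the irq field: type in bits 27:24, vcpu in 23:16, num in 15:0 */
        irq_level.irq = ((uint32_t)GZVM_IRQ_TYPE_SPI << 24) | (spi & 0xffffu);
        irq_level.level = level;

        return ioctl(vm_fd, GZVM_IRQ_LINE, &irq_level);
}

/* e.g. gzvm_set_spi(vm_fd, 5, 1); then gzvm_set_spi(vm_fd, 5, 0); */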

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/Makefile | 2 +-
arch/arm64/geniezone/vgic.c | 108 ++++++++++++++++++++++++
arch/arm64/include/uapi/asm/gzvm_arch.h | 4 +
drivers/virt/geniezone/gzvm_common.h | 12 +++
drivers/virt/geniezone/gzvm_vm.c | 82 ++++++++++++++++++
include/linux/gzvm_drv.h | 4 +
include/uapi/linux/gzvm.h | 66 +++++++++++++++
7 files changed, 277 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/geniezone/vgic.c
create mode 100644 drivers/virt/geniezone/gzvm_common.h

diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile
index 69b0a4abeab0..0e4f1087f9de 100644
--- a/arch/arm64/geniezone/Makefile
+++ b/arch/arm64/geniezone/Makefile
@@ -4,6 +4,6 @@
#
include $(srctree)/drivers/virt/geniezone/Makefile

-gzvm-y += vm.o vcpu.o
+gzvm-y += vm.o vcpu.o vgic.o

obj-$(CONFIG_MTK_GZVM) += gzvm.o
diff --git a/arch/arm64/geniezone/vgic.c b/arch/arm64/geniezone/vgic.c
new file mode 100644
index 000000000000..3746e0c9e247
--- /dev/null
+++ b/arch/arm64/geniezone/vgic.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/irqchip/arm-gic-v3.h>
+#include <linux/gzvm.h>
+#include <linux/gzvm_drv.h>
+#include "gzvm_arch_common.h"
+
+/**
+ * is_irq_valid() - Check whether the irq number and irq_type match
+ * @irq: interrupt number
+ * @irq_type: interrupt type
+ *
+ * Return:
+ * true if irq is valid else false.
+ */
+static bool is_irq_valid(u32 irq, u32 irq_type)
+{
+ switch (irq_type) {
+ case GZVM_IRQ_TYPE_CPU:
+ /* 0 ~ 15: SGI */
+ if (likely(irq <= GZVM_IRQ_CPU_FIQ))
+ return true;
+ break;
+ case GZVM_IRQ_TYPE_PPI:
+ /* 16 ~ 31: PPI */
+ if (likely(irq >= GZVM_VGIC_NR_SGIS &&
+ irq < GZVM_VGIC_NR_PRIVATE_IRQS))
+ return true;
+ break;
+ case GZVM_IRQ_TYPE_SPI:
+ /* 32 ~ : SPI */
+ if (likely(irq >= GZVM_VGIC_NR_PRIVATE_IRQS))
+ return true;
+ break;
+ default:
+ return false;
+ }
+ return false;
+}
+
+/**
+ * gzvm_vgic_inject_irq() - Inject virtual interrupt to a VM
+ * @gzvm: Pointer to struct gzvm
+ * @vcpu_idx: vcpu index, only valid if PPI
+ * @irq_type: Interrupt type
+ * @irq: irq number
+ * @level: 1 if true else 0
+ *
+ * Return:
+ * * 0 - Success.
+ * * Negative - Failure.
+ */
+static int gzvm_vgic_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq_type, u32 irq, bool level)
+{
+ unsigned long a1 = assemble_vm_vcpu_tuple(gzvm->vm_id, vcpu_idx);
+ struct arm_smccc_res res;
+
+ if (unlikely(!is_irq_valid(irq, irq_type)))
+ return -EINVAL;
+
+ gzvm_hypcall_wrapper(MT_HVC_GZVM_IRQ_LINE, a1, irq, level,
+ 0, 0, 0, 0, &res);
+ if (res.a0) {
+ pr_err("Failed to set IRQ level (%d) to irq#%u on vcpu %d with ret=%d\n",
+ level, irq, vcpu_idx, (int)res.a0);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+/**
+ * gzvm_vgic_inject_spi() - Inject virtual spi interrupt
+ * @gzvm: Pointer to struct gzvm
+ * @vcpu_idx: vcpu index
+ * @spi_irq: This is spi interrupt number (starts from 0 instead of 32)
+ * @level: 1 if true else 0
+ *
+ * Return:
+ * * 0 if succeed else other negative values indicating each errors
+ */
+static int gzvm_vgic_inject_spi(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 spi_irq, bool level)
+{
+ return gzvm_vgic_inject_irq(gzvm, 0, GZVM_IRQ_TYPE_SPI,
+ spi_irq + GZVM_VGIC_NR_PRIVATE_IRQS,
+ level);
+}
+
+int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev)
+{
+ struct arm_smccc_res res;
+
+ return gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_DEVICE, vm_id,
+ virt_to_phys(gzvm_dev), 0, 0, 0, 0, 0,
+ &res);
+}
+
+int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq_type, u32 irq, bool level)
+{
+ /* use SPI by default */
+ return gzvm_vgic_inject_spi(gzvm, vcpu_idx, irq, level);
+}
diff --git a/arch/arm64/include/uapi/asm/gzvm_arch.h b/arch/arm64/include/uapi/asm/gzvm_arch.h
index e56b4700e07e..acfe9be0f849 100644
--- a/arch/arm64/include/uapi/asm/gzvm_arch.h
+++ b/arch/arm64/include/uapi/asm/gzvm_arch.h
@@ -47,4 +47,8 @@
#define GZVM_REG_ARM_CORE_REG(name) \
(offsetof(struct gzvm_regs, name) / sizeof(__u32))

+#define GZVM_VGIC_NR_SGIS 16
+#define GZVM_VGIC_NR_PPIS 16
+#define GZVM_VGIC_NR_PRIVATE_IRQS (GZVM_VGIC_NR_SGIS + GZVM_VGIC_NR_PPIS)
+
#endif /* __GZVM_ARCH_H__ */
diff --git a/drivers/virt/geniezone/gzvm_common.h b/drivers/virt/geniezone/gzvm_common.h
new file mode 100644
index 000000000000..d0e39ded79e6
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_common.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#ifndef __GZVM_COMMON_H__
+#define __GZVM_COMMON_H__
+
+int gzvm_irqchip_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq_type, u32 irq, bool level);
+
+#endif /* __GZVM_COMMON_H__ */
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index aea99d050653..b1397180cd02 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -11,6 +11,7 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/gzvm_drv.h>
+#include "gzvm_common.h"

static DEFINE_MUTEX(gzvm_list_lock);
static LIST_HEAD(gzvm_list);
@@ -260,6 +261,73 @@ gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
return register_memslot_addr_range(gzvm, memslot);
}

+int gzvm_irqchip_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq_type, u32 irq, bool level)
+{
+ return gzvm_arch_inject_irq(gzvm, vcpu_idx, irq_type, irq, level);
+}
+
+static int gzvm_vm_ioctl_irq_line(struct gzvm *gzvm,
+ struct gzvm_irq_level *irq_level)
+{
+ u32 irq = irq_level->irq;
+ u32 irq_type, vcpu_idx, vcpu2_idx, irq_num;
+ bool level = irq_level->level;
+
+ irq_type = FIELD_GET(GZVM_IRQ_LINE_TYPE, irq);
+ vcpu_idx = FIELD_GET(GZVM_IRQ_LINE_VCPU, irq);
+ vcpu2_idx = FIELD_GET(GZVM_IRQ_LINE_VCPU2, irq) * (GZVM_IRQ_VCPU_MASK + 1);
+ irq_num = FIELD_GET(GZVM_IRQ_LINE_NUM, irq);
+
+ return gzvm_irqchip_inject_irq(gzvm, vcpu_idx + vcpu2_idx, irq_type, irq_num,
+ level);
+}
+
+static int gzvm_vm_ioctl_create_device(struct gzvm *gzvm, void __user *argp)
+{
+ struct gzvm_create_device *gzvm_dev;
+ void *dev_data = NULL;
+ int ret;
+
+ gzvm_dev = (struct gzvm_create_device *)alloc_pages_exact(PAGE_SIZE,
+ GFP_KERNEL);
+ if (!gzvm_dev)
+ return -ENOMEM;
+ if (copy_from_user(gzvm_dev, argp, sizeof(*gzvm_dev))) {
+ ret = -EFAULT;
+ goto err_free_dev;
+ }
+
+ if (gzvm_dev->attr_addr != 0 && gzvm_dev->attr_size != 0) {
+ size_t attr_size = gzvm_dev->attr_size;
+ void __user *attr_addr = (void __user *)gzvm_dev->attr_addr;
+
+ /* Size of device specific data should not be over a page. */
+ if (attr_size > PAGE_SIZE) {
+ ret = -EINVAL;
+ goto err_free_dev;
+ }
+
+ dev_data = alloc_pages_exact(attr_size, GFP_KERNEL);
+ if (!dev_data) {
+ ret = -ENOMEM;
+ goto err_free_dev;
+ }
+
+ if (copy_from_user(dev_data, attr_addr, attr_size)) {
+ ret = -EFAULT;
+ goto err_free_dev_data;
+ }
+ gzvm_dev->attr_addr = virt_to_phys(dev_data);
+ }
+
+ ret = gzvm_arch_create_device(gzvm->vm_id, gzvm_dev);
+err_free_dev_data:
+ if (dev_data)
+ free_pages_exact(dev_data, 0);
+err_free_dev:
+ free_pages_exact(gzvm_dev, 0);
+ return ret;
+}
+
static int gzvm_vm_ioctl_enable_cap(struct gzvm *gzvm,
struct gzvm_enable_cap *cap,
void __user *argp)
@@ -294,6 +362,20 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
ret = gzvm_vm_ioctl_set_memory_region(gzvm, &userspace_mem);
break;
}
+ case GZVM_IRQ_LINE: {
+ struct gzvm_irq_level irq_event;
+
+ if (copy_from_user(&irq_event, argp, sizeof(irq_event))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ ret = gzvm_vm_ioctl_irq_line(gzvm, &irq_event);
+ break;
+ }
+ case GZVM_CREATE_DEVICE: {
+ ret = gzvm_vm_ioctl_create_device(gzvm, argp);
+ break;
+ }
case GZVM_ENABLE_CAP: {
struct gzvm_enable_cap cap;

diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index c42edb4345cc..d86885d46195 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -108,4 +108,8 @@ int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason);
int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid);
int gzvm_arch_inform_exit(u16 vm_id);

+int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev);
+int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
+ u32 irq_type, u32 irq, bool level);
+
#endif /* __GZVM_DRV_H__ */
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index 4814c82b0dff..fb019d232a98 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -64,6 +64,72 @@ struct gzvm_userspace_memory_region {
#define GZVM_SET_USER_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x46, \
struct gzvm_userspace_memory_region)

+/* for GZVM_IRQ_LINE, irq field index values */
+#define GZVM_IRQ_VCPU_MASK 0xff
+#define GZVM_IRQ_LINE_TYPE GENMASK(27, 24)
+#define GZVM_IRQ_LINE_VCPU GENMASK(23, 16)
+#define GZVM_IRQ_LINE_VCPU2 GENMASK(31, 28)
+#define GZVM_IRQ_LINE_NUM GENMASK(15, 0)
+
+/* irq_type field */
+#define GZVM_IRQ_TYPE_CPU 0
+#define GZVM_IRQ_TYPE_SPI 1
+#define GZVM_IRQ_TYPE_PPI 2
+
+/* out-of-kernel GIC cpu interrupt injection irq_number field */
+#define GZVM_IRQ_CPU_IRQ 0
+#define GZVM_IRQ_CPU_FIQ 1
+
+struct gzvm_irq_level {
+ union {
+ __u32 irq;
+ __s32 status;
+ };
+ __u32 level;
+};
+
+#define GZVM_IRQ_LINE _IOW(GZVM_IOC_MAGIC, 0x61, \
+ struct gzvm_irq_level)
+
+enum gzvm_device_type {
+ GZVM_DEV_TYPE_ARM_VGIC_V3_DIST = 0,
+ GZVM_DEV_TYPE_ARM_VGIC_V3_REDIST = 1,
+ GZVM_DEV_TYPE_MAX,
+};
+
+/**
+ * struct gzvm_create_device: For GZVM_CREATE_DEVICE.
+ * @dev_type: Device type.
+ * @id: Device id.
+ * @flags: Flags of the virtual device; these are passed through to the
+ * hypervisor to handle.
+ * @dev_addr: Device ipa address in VM's view.
+ * @dev_reg_size: Device register range size.
+ * @attr_addr: If user -> kernel, this is user virtual address of device
+ * specific attributes (if needed). If kernel->hypervisor,
+ * this is ipa.
+ * @attr_size: This attr_size is the buffer size in bytes of each attribute
+ * needed from various devices. The attribute here refers to the
+ * additional data passed from VMM(e.g. Crosvm) to GenieZone
+ * hypervisor when virtual devices were to be created. Thus,
+ * we need attr_addr and attr_size in the gzvm_create_device
+ * structure to keep track of the attribute mentioned.
+ *
+ * Store information needed to create device.
+ */
+struct gzvm_create_device {
+ __u32 dev_type;
+ __u32 id;
+ __u64 flags;
+ __u64 dev_addr;
+ __u64 dev_reg_size;
+ __u64 attr_addr;
+ __u64 attr_size;
+};
+
+#define GZVM_CREATE_DEVICE _IOWR(GZVM_IOC_MAGIC, 0xe0, \
+ struct gzvm_create_device)
+
/*
* ioctls for vcpu fds
*/
--
2.18.0


2023-07-27 09:16:42

by Yi-De Wu

[permalink] [raw]
Subject: [PATCH v5 04/12] virt: geniezone: Add vcpu support

From: "Yingshiuan Pan" <[email protected]>

The VMM uses this interface to create a vcpu instance, which is exposed
as a fd. This fd is used for all vcpu operations, such as setting vcpu
registers, and it accepts the most important ioctl, GZVM_RUN, which
requests the GenieZone hypervisor to context switch into the VM's vcpu
context.
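
A minimal, hedged sketch of the resulting userspace flow (not part of the
patch): create the vcpu fd from the VM fd, then drive it with GZVM_RUN and
dispatch on exit_reason. The /dev/gzvm node name follows from the misc
device registration in gzvm_main.c, and the MMIO handling is left as a
placeholder.

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/gzvm.h>

/* Illustrative only; error handling is trimmed for brevity. */
static int run_vcpu0(void)
{
        struct gzvm_vcpu_run run;
        int gzvm_fd, vm_fd, vcpu_fd;

        gzvm_fd = open("/dev/gzvm", O_RDWR);
        vm_fd = ioctl(gzvm_fd, GZVM_CREATE_VM, 0);      /* returns a VM fd */
        vcpu_fd = ioctl(vm_fd, GZVM_CREATE_VCPU, 0);    /* arg is the vcpu slot */

        memset(&run, 0, sizeof(run));                   /* padding1 must be zero */
        for (;;) {
                if (ioctl(vcpu_fd, GZVM_RUN, &run) < 0)
                        return -1;                      /* e.g. interrupted by a signal */

                switch (run.exit_reason) {
                case GZVM_EXIT_MMIO:
                        /* decode run.mmio and emulate the access here */
                        break;
                case GZVM_EXIT_SHUTDOWN:
                        return 0;
                default:
                        break;                          /* other exits elided */
                }
        }
}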

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Jerry Wang <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/Makefile | 2 +-
arch/arm64/geniezone/gzvm_arch_common.h | 20 ++
arch/arm64/geniezone/vcpu.c | 88 +++++++++
arch/arm64/geniezone/vm.c | 11 ++
arch/arm64/include/uapi/asm/gzvm_arch.h | 30 +++
drivers/virt/geniezone/Makefile | 3 +-
drivers/virt/geniezone/gzvm_vcpu.c | 250 ++++++++++++++++++++++++
drivers/virt/geniezone/gzvm_vm.c | 5 +
include/linux/gzvm_drv.h | 21 ++
include/uapi/linux/gzvm.h | 136 +++++++++++++
10 files changed, 564 insertions(+), 2 deletions(-)
create mode 100644 arch/arm64/geniezone/vcpu.c
create mode 100644 drivers/virt/geniezone/gzvm_vcpu.c

diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile
index 2957898cdd05..69b0a4abeab0 100644
--- a/arch/arm64/geniezone/Makefile
+++ b/arch/arm64/geniezone/Makefile
@@ -4,6 +4,6 @@
#
include $(srctree)/drivers/virt/geniezone/Makefile

-gzvm-y += vm.o
+gzvm-y += vm.o vcpu.o

obj-$(CONFIG_MTK_GZVM) += gzvm.o
diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index fdb95d619102..9be9cf77faa3 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -21,6 +21,7 @@ enum {
GZVM_FUNC_CREATE_DEVICE = 11,
GZVM_FUNC_PROBE = 12,
GZVM_FUNC_ENABLE_CAP = 13,
+ GZVM_FUNC_INFORM_EXIT = 14,
NR_GZVM_FUNC,
};

@@ -42,6 +43,7 @@ enum {
#define MT_HVC_GZVM_CREATE_DEVICE GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE)
#define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE)
#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
+#define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT)

/**
* gzvm_hypcall_wrapper() - the wrapper for hvc calls
@@ -65,4 +67,22 @@ static inline u16 get_vmid_from_tuple(unsigned int tuple)
return (u16)(tuple >> 16);
}

+static inline u16 get_vcpuid_from_tuple(unsigned int tuple)
+{
+ return (u16)(tuple & 0xffff);
+}
+
+static inline unsigned int
+assemble_vm_vcpu_tuple(u16 vmid, u16 vcpuid)
+{
+ return ((unsigned int)vmid << 16 | vcpuid);
+}
+
+static inline void
+disassemble_vm_vcpu_tuple(unsigned int tuple, u16 *vmid, u16 *vcpuid)
+{
+ *vmid = get_vmid_from_tuple(tuple);
+ *vcpuid = get_vcpuid_from_tuple(tuple);
+}
+
#endif /* __GZVM_ARCH_COMMON_H__ */
diff --git a/arch/arm64/geniezone/vcpu.c b/arch/arm64/geniezone/vcpu.c
new file mode 100644
index 000000000000..95681fd66656
--- /dev/null
+++ b/arch/arm64/geniezone/vcpu.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/arm-smccc.h>
+#include <linux/err.h>
+#include <linux/uaccess.h>
+
+#include <linux/gzvm.h>
+#include <linux/gzvm_drv.h>
+#include "gzvm_arch_common.h"
+
+int gzvm_arch_vcpu_update_one_reg(struct gzvm_vcpu *vcpu, __u64 reg_id,
+ bool is_write, __u64 *data)
+{
+ struct arm_smccc_res res;
+ unsigned long a1;
+ int ret;
+
+ /* reg id follows KVM's encoding */
+ switch (reg_id & GZVM_REG_ARM_COPROC_MASK) {
+ case GZVM_REG_ARM_CORE:
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ a1 = assemble_vm_vcpu_tuple(vcpu->gzvm->vm_id, vcpu->vcpuid);
+ if (!is_write) {
+ ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_GET_ONE_REG,
+ a1, reg_id, 0, 0, 0, 0, 0, &res);
+ if (ret == 0)
+ *data = res.a1;
+ } else {
+ ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_ONE_REG,
+ a1, reg_id, *data, 0, 0, 0, 0, &res);
+ }
+
+ return ret;
+}
+
+int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason)
+{
+ struct arm_smccc_res res;
+ unsigned long a1;
+ int ret;
+
+ a1 = assemble_vm_vcpu_tuple(vcpu->gzvm->vm_id, vcpu->vcpuid);
+ ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_RUN, a1, 0, 0, 0, 0, 0,
+ 0, &res);
+ *exit_reason = res.a1;
+ return ret;
+}
+
+int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid)
+{
+ struct arm_smccc_res res;
+ unsigned long a1;
+
+ a1 = assemble_vm_vcpu_tuple(vm_id, vcpuid);
+ gzvm_hypcall_wrapper(MT_HVC_GZVM_DESTROY_VCPU, a1, 0, 0, 0, 0, 0, 0,
+ &res);
+
+ return 0;
+}
+
+/**
+ * gzvm_arch_create_vcpu() - Call smc to gz hypervisor to create vcpu
+ * @vm_id: vm id
+ * @vcpuid: vcpu id
+ * @run: Virtual address of vcpu->run
+ *
+ * Return: The wrapper helps caller to convert geniezone errno to Linux errno.
+ */
+int gzvm_arch_create_vcpu(u16 vm_id, int vcpuid, void *run)
+{
+ struct arm_smccc_res res;
+ unsigned long a1, a2;
+ int ret;
+
+ a1 = assemble_vm_vcpu_tuple(vm_id, vcpuid);
+ a2 = (__u64)virt_to_phys(run);
+ ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VCPU, a1, a2, 0, 0, 0, 0,
+ 0, &res);
+
+ return ret;
+}
diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
index e35751b21821..2df321f13057 100644
--- a/arch/arm64/geniezone/vm.c
+++ b/arch/arm64/geniezone/vm.c
@@ -14,6 +14,17 @@

#define PAR_PA47_MASK ((((1UL << 48) - 1) >> 12) << 12)

+int gzvm_arch_inform_exit(u16 vm_id)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_hvc(MT_HVC_GZVM_INFORM_EXIT, vm_id, 0, 0, 0, 0, 0, 0, &res);
+ if (res.a0 == 0)
+ return 0;
+
+ return -ENXIO;
+}
+
int gzvm_arch_probe(void)
{
struct arm_smccc_res res;
diff --git a/arch/arm64/include/uapi/asm/gzvm_arch.h b/arch/arm64/include/uapi/asm/gzvm_arch.h
index 847bb627a65d..e56b4700e07e 100644
--- a/arch/arm64/include/uapi/asm/gzvm_arch.h
+++ b/arch/arm64/include/uapi/asm/gzvm_arch.h
@@ -17,4 +17,34 @@
/* GZVM_CAP_ARM_PVM_SET_PROTECTED_VM only sets protected but not load pvmfw */
#define GZVM_CAP_ARM_PVM_SET_PROTECTED_VM 2

+/*
+ * Architecture specific registers are to be defined in arch headers and
+ * ORed with the arch identifier.
+ */
+#define GZVM_REG_ARM 0x4000000000000000ULL
+#define GZVM_REG_ARM64 0x6000000000000000ULL
+
+#define GZVM_REG_SIZE_SHIFT 52
+#define GZVM_REG_SIZE_MASK 0x00f0000000000000ULL
+#define GZVM_REG_SIZE_U8 0x0000000000000000ULL
+#define GZVM_REG_SIZE_U16 0x0010000000000000ULL
+#define GZVM_REG_SIZE_U32 0x0020000000000000ULL
+#define GZVM_REG_SIZE_U64 0x0030000000000000ULL
+#define GZVM_REG_SIZE_U128 0x0040000000000000ULL
+#define GZVM_REG_SIZE_U256 0x0050000000000000ULL
+#define GZVM_REG_SIZE_U512 0x0060000000000000ULL
+#define GZVM_REG_SIZE_U1024 0x0070000000000000ULL
+#define GZVM_REG_SIZE_U2048 0x0080000000000000ULL
+
+#define GZVM_REG_ARCH_MASK 0xff00000000000000ULL
+
+/* If you need to interpret the index values, here is the key: */
+#define GZVM_REG_ARM_COPROC_MASK 0x000000000FFF0000
+#define GZVM_REG_ARM_COPROC_SHIFT 16
+
+/* Normal registers are mapped as coprocessor 16. */
+#define GZVM_REG_ARM_CORE (0x0010 << GZVM_REG_ARM_COPROC_SHIFT)
+#define GZVM_REG_ARM_CORE_REG(name) \
+ (offsetof(struct gzvm_regs, name) / sizeof(__u32))
+
#endif /* __GZVM_ARCH_H__ */
diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile
index 066efddc0b9c..8ebf2db0c970 100644
--- a/drivers/virt/geniezone/Makefile
+++ b/drivers/virt/geniezone/Makefile
@@ -6,5 +6,6 @@

GZVM_DIR ?= ../../../drivers/virt/geniezone

-gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o
+gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \
+ $(GZVM_DIR)/gzvm_vcpu.o

diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c
new file mode 100644
index 000000000000..e051343f2b0e
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_vcpu.c
@@ -0,0 +1,250 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <asm/sysreg.h>
+#include <linux/anon_inodes.h>
+#include <linux/device.h>
+#include <linux/file.h>
+#include <linux/mm.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/gzvm_drv.h>
+
+/* maximum size needed for holding an integer */
+#define ITOA_MAX_LEN 12
+
+static long gzvm_vcpu_update_one_reg(struct gzvm_vcpu *vcpu,
+ void * __user argp,
+ bool is_write)
+{
+ struct gzvm_one_reg reg;
+ void __user *reg_addr;
+ u64 data = 0;
+ u64 reg_size;
+ long ret;
+
+ if (copy_from_user(&reg, argp, sizeof(reg)))
+ return -EFAULT;
+
+ reg_addr = (void __user *)reg.addr;
+ reg_size = (reg.id & GZVM_REG_SIZE_MASK) >> GZVM_REG_SIZE_SHIFT;
+ reg_size = BIT(reg_size);
+
+ if (is_write) {
+ if (copy_from_user(&data, reg_addr, reg_size))
+ return -EFAULT;
+ }
+
+ ret = gzvm_arch_vcpu_update_one_reg(vcpu, reg.id, is_write, &data);
+
+ if (ret)
+ return ret;
+
+ if (!is_write) {
+ if (copy_to_user(reg_addr, &data, reg_size))
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+/**
+ * gzvm_vcpu_run() - Handle vcpu run ioctl, entry point to guest and exit
+ * point from guest
+ * @vcpu: Pointer to struct gzvm_vcpu
+ * @argp: Pointer to struct gzvm_vcpu_run in userspace
+ *
+ * Return:
+ * * 0 - Success.
+ * * Negative - Failure.
+ */
+static long gzvm_vcpu_run(struct gzvm_vcpu *vcpu, void * __user argp)
+{
+ bool need_userspace = false;
+ u64 exit_reason = 0;
+
+ if (copy_from_user(vcpu->run, argp, sizeof(struct gzvm_vcpu_run)))
+ return -EFAULT;
+
+ for (int i = 0; i < ARRAY_SIZE(vcpu->run->padding1); i++) {
+ if (vcpu->run->padding1[i])
+ return -EINVAL;
+ }
+
+ if (vcpu->run->immediate_exit == 1)
+ return -EINTR;
+
+ while (!need_userspace && !signal_pending(current)) {
+ gzvm_arch_vcpu_run(vcpu, &exit_reason);
+
+ switch (exit_reason) {
+ case GZVM_EXIT_MMIO:
+ need_userspace = true;
+ break;
+ /*
+ * It's geniezone's responsibility to fill the corresponding data
+ * structure
+ */
+ case GZVM_EXIT_HYPERCALL:
+ fallthrough;
+ case GZVM_EXIT_EXCEPTION:
+ fallthrough;
+ case GZVM_EXIT_DEBUG:
+ fallthrough;
+ case GZVM_EXIT_FAIL_ENTRY:
+ fallthrough;
+ case GZVM_EXIT_INTERNAL_ERROR:
+ fallthrough;
+ case GZVM_EXIT_SYSTEM_EVENT:
+ fallthrough;
+ case GZVM_EXIT_SHUTDOWN:
+ need_userspace = true;
+ break;
+ case GZVM_EXIT_IRQ:
+ fallthrough;
+ case GZVM_EXIT_GZ:
+ break;
+ case GZVM_EXIT_UNKNOWN:
+ fallthrough;
+ default:
+ pr_err("vcpu unknown exit\n");
+ need_userspace = true;
+ goto out;
+ }
+ }
+
+out:
+ if (copy_to_user(argp, vcpu->run, sizeof(struct gzvm_vcpu_run)))
+ return -EFAULT;
+ if (signal_pending(current)) {
+ /* invoke hvc to inform the gz hypervisor of the vcpu exit */
+ gzvm_arch_inform_exit(vcpu->gzvm->vm_id);
+ return -ERESTARTSYS;
+ }
+ return 0;
+}
+
+static long gzvm_vcpu_ioctl(struct file *filp, unsigned int ioctl,
+ unsigned long arg)
+{
+ int ret = -ENOTTY;
+ void __user *argp = (void __user *)arg;
+ struct gzvm_vcpu *vcpu = filp->private_data;
+
+ switch (ioctl) {
+ case GZVM_RUN:
+ ret = gzvm_vcpu_run(vcpu, argp);
+ break;
+ case GZVM_GET_ONE_REG:
+ /* is_write */
+ ret = gzvm_vcpu_update_one_reg(vcpu, argp, false);
+ break;
+ case GZVM_SET_ONE_REG:
+ /* is_write */
+ ret = gzvm_vcpu_update_one_reg(vcpu, argp, true);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static const struct file_operations gzvm_vcpu_fops = {
+ .unlocked_ioctl = gzvm_vcpu_ioctl,
+ .llseek = noop_llseek,
+};
+
+/* caller must hold the vm lock */
+static void gzvm_destroy_vcpu(struct gzvm_vcpu *vcpu)
+{
+ if (!vcpu)
+ return;
+
+ gzvm_arch_destroy_vcpu(vcpu->gzvm->vm_id, vcpu->vcpuid);
+ /* clean guest's data */
+ memset(vcpu->run, 0, GZVM_VCPU_RUN_MAP_SIZE);
+ free_pages_exact(vcpu->run, GZVM_VCPU_RUN_MAP_SIZE);
+ kfree(vcpu);
+}
+
+/**
+ * gzvm_destroy_vcpus() - Destroy all vcpus, caller has to hold the vm lock
+ *
+ * @gzvm: vm struct that owns the vcpus
+ */
+void gzvm_destroy_vcpus(struct gzvm *gzvm)
+{
+ int i;
+
+ for (i = 0; i < GZVM_MAX_VCPUS; i++) {
+ gzvm_destroy_vcpu(gzvm->vcpus[i]);
+ gzvm->vcpus[i] = NULL;
+ }
+}
+
+/* create_vcpu_fd() - Allocates an inode for the vcpu. */
+static int create_vcpu_fd(struct gzvm_vcpu *vcpu)
+{
+ /* sizeof("gzvm-vcpu:") + max(strlen(itoa(vcpuid))) + null */
+ char name[10 + ITOA_MAX_LEN + 1];
+
+ snprintf(name, sizeof(name), "gzvm-vcpu:%d", vcpu->vcpuid);
+ return anon_inode_getfd(name, &gzvm_vcpu_fops, vcpu, O_RDWR | O_CLOEXEC);
+}
+
+/**
+ * gzvm_vm_ioctl_create_vcpu() - for GZVM_CREATE_VCPU
+ * @gzvm: Pointer to struct gzvm
+ * @cpuid: vcpu id, taken from the ioctl argument
+ *
+ * Return: Fd of vcpu, negative errno if error occurs
+ */
+int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid)
+{
+ struct gzvm_vcpu *vcpu;
+ int ret;
+
+ if (cpuid >= GZVM_MAX_VCPUS)
+ return -EINVAL;
+
+ vcpu = kzalloc(sizeof(*vcpu), GFP_KERNEL);
+ if (!vcpu)
+ return -ENOMEM;
+
+ /*
+ * Allocate 2 pages for data sharing between driver and gz hypervisor
+ *
+ * |- page 0 -|- page 1 -|
+ * |gzvm_vcpu_run|......|hwstate|.......|
+ *
+ */
+ vcpu->run = alloc_pages_exact(GZVM_VCPU_RUN_MAP_SIZE,
+ GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+ if (!vcpu->run) {
+ ret = -ENOMEM;
+ goto free_vcpu;
+ }
+ vcpu->vcpuid = cpuid;
+ vcpu->gzvm = gzvm;
+ mutex_init(&vcpu->lock);
+
+ ret = gzvm_arch_create_vcpu(gzvm->vm_id, vcpu->vcpuid, vcpu->run);
+ if (ret < 0)
+ goto free_vcpu_run;
+
+ ret = create_vcpu_fd(vcpu);
+ if (ret < 0)
+ goto free_vcpu_run;
+ gzvm->vcpus[cpuid] = vcpu;
+
+ return ret;
+
+free_vcpu_run:
+ free_pages_exact(vcpu->run, GZVM_VCPU_RUN_MAP_SIZE);
+free_vcpu:
+ kfree(vcpu);
+ return ret;
+}
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index ee751369fd4b..aea99d050653 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -280,6 +280,10 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
ret = gzvm_dev_ioctl_check_extension(gzvm, arg);
break;
}
+ case GZVM_CREATE_VCPU: {
+ ret = gzvm_vm_ioctl_create_vcpu(gzvm, arg);
+ break;
+ }
case GZVM_SET_USER_MEMORY_REGION: {
struct gzvm_userspace_memory_region userspace_mem;

@@ -313,6 +317,7 @@ static void gzvm_destroy_vm(struct gzvm *gzvm)

mutex_lock(&gzvm->lock);

+ gzvm_destroy_vcpus(gzvm);
gzvm_arch_destroy_vm(gzvm->vm_id);

mutex_lock(&gzvm_list_lock);
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index 4fd52fcbd5a8..c42edb4345cc 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -31,6 +31,8 @@
#define GZVM_MAX_VCPUS 8
#define GZVM_MAX_MEM_REGION 10

+#define GZVM_VCPU_RUN_MAP_SIZE (PAGE_SIZE * 2)
+
/* struct mem_region_addr_range - Identical to ffa memory constituent */
struct mem_region_addr_range {
/* the base IPA of the constituent memory region, aligned to 4 kiB */
@@ -58,7 +60,16 @@ struct gzvm_memslot {
u32 slot_id;
};

+struct gzvm_vcpu {
+ struct gzvm *gzvm;
+ int vcpuid;
+ /* lock of vcpu*/
+ struct mutex lock;
+ struct gzvm_vcpu_run *run;
+};
+
struct gzvm {
+ struct gzvm_vcpu *vcpus[GZVM_MAX_VCPUS];
/* userspace tied to this vm */
struct mm_struct *mm;
struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION];
@@ -75,6 +86,8 @@ int gzvm_err_to_errno(unsigned long err);

void gzvm_destroy_all_vms(void);

+void gzvm_destroy_vcpus(struct gzvm *gzvm);
+
/* arch-dependant functions */
int gzvm_arch_probe(void);
int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
@@ -85,6 +98,14 @@ int gzvm_arch_destroy_vm(u16 vm_id);
int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
struct gzvm_enable_cap *cap,
void __user *argp);
+
u64 gzvm_hva_to_pa_arch(u64 hva);
+int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid);
+int gzvm_arch_vcpu_update_one_reg(struct gzvm_vcpu *vcpu, __u64 reg_id,
+ bool is_write, __u64 *data);
+int gzvm_arch_create_vcpu(u16 vm_id, int vcpuid, void *run);
+int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason);
+int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid);
+int gzvm_arch_inform_exit(u16 vm_id);

#endif /* __GZVM_DRV_H__ */
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index 99730c142b0e..4814c82b0dff 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -44,6 +44,11 @@ struct gzvm_memory_region {

#define GZVM_SET_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x40, \
struct gzvm_memory_region)
+/*
+ * GZVM_CREATE_VCPU receives as a parameter the vcpu slot,
+ * and returns a vcpu fd.
+ */
+#define GZVM_CREATE_VCPU _IO(GZVM_IOC_MAGIC, 0x41)

/* for GZVM_SET_USER_MEMORY_REGION */
struct gzvm_userspace_memory_region {
@@ -59,6 +64,124 @@ struct gzvm_userspace_memory_region {
#define GZVM_SET_USER_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x46, \
struct gzvm_userspace_memory_region)

+/*
+ * ioctls for vcpu fds
+ */
+#define GZVM_RUN _IO(GZVM_IOC_MAGIC, 0x80)
+
+/* VM exit reason */
+enum {
+ GZVM_EXIT_UNKNOWN = 0x92920000,
+ GZVM_EXIT_MMIO = 0x92920001,
+ GZVM_EXIT_HYPERCALL = 0x92920002,
+ GZVM_EXIT_IRQ = 0x92920003,
+ GZVM_EXIT_EXCEPTION = 0x92920004,
+ GZVM_EXIT_DEBUG = 0x92920005,
+ GZVM_EXIT_FAIL_ENTRY = 0x92920006,
+ GZVM_EXIT_INTERNAL_ERROR = 0x92920007,
+ GZVM_EXIT_SYSTEM_EVENT = 0x92920008,
+ GZVM_EXIT_SHUTDOWN = 0x92920009,
+ GZVM_EXIT_GZ = 0x9292000a,
+};
+
+/**
+ * struct gzvm_vcpu_run: Same purpose as kvm_run, this struct is
+ * shared between userspace, kernel and
+ * GenieZone hypervisor
+ * @exit_reason: The reason why gzvm_vcpu_run has stopped running the vCPU
+ * @immediate_exit: Polled when the vcpu is scheduled.
+ * If set, immediately returns -EINTR
+ * @padding1: Reserved for future-proof and must be zero filled
+ * @mmio: The nested struct in anonymous union. Handle mmio in host side
+ * @phys_addr: The address guest tries to access
+ * @data: The value to be written (is_write is 1) or
+ * to be filled by the user for reads (is_write is 0)
+ * @size: The size of written data.
+ * Only the first `size` bytes of `data` are handled
+ * @reg_nr: The register number where the data is stored
+ * @is_write: 1 for VM to perform a write or 0 for VM to perform a read
+ * @fail_entry: The nested struct in anonymous union.
+ * Handle invalid entry address at the first run
+ * @hardware_entry_failure_reason: The reason codes about hardware entry failure
+ * @cpu: The current processor number via smp_processor_id()
+ * @exception: The nested struct in anonymous union.
+ * Handle exception occurred in VM
+ * @exception: Which exception vector
+ * @error_code: Exception error codes
+ * @hypercall: The nested struct in anonymous union.
+ * Some hypercalls issued from VM must be handled
+ * @args: The hypercall's arguments
+ * @internal: The nested struct in anonymous union. The errors from hypervisor
+ * @suberror: The errors codes about GZVM_EXIT_INTERNAL_ERROR
+ * @ndata: The number of elements used in data[]
+ * @data: Keep the detailed information about GZVM_EXIT_INTERNAL_ERROR
+ * @system_event: The nested struct in anonymous union.
+ * VM's PSCI must be handled by host
+ * @type: System event type.
+ * Ex. GZVM_SYSTEM_EVENT_SHUTDOWN or GZVM_SYSTEM_EVENT_RESET...etc.
+ * @ndata: The number of elements used in data[]
+ * @data: Keep the detailed information about GZVM_EXIT_SYSTEM_EVENT
+ * @padding: Fix it to a reasonable size future-proof for keeping the same
+ * struct size when adding new variables in the union is needed
+ *
+ * Keep identical layout between the 3 modules
+ */
+struct gzvm_vcpu_run {
+ /* to userspace */
+ __u32 exit_reason;
+ __u8 immediate_exit;
+ __u8 padding1[3];
+ /* union structure of collection of guest exit reason */
+ union {
+ /* GZVM_EXIT_MMIO */
+ struct {
+ /* from FAR_EL2 */
+ __u64 phys_addr;
+ __u8 data[8];
+ /* from ESR_EL2 (access size) */
+ __u64 size;
+ /* from ESR_EL2 */
+ __u32 reg_nr;
+ /* from ESR_EL2 */
+ __u8 is_write;
+ } mmio;
+ /* GZVM_EXIT_FAIL_ENTRY */
+ struct {
+ __u64 hardware_entry_failure_reason;
+ __u32 cpu;
+ } fail_entry;
+ /* GZVM_EXIT_EXCEPTION */
+ struct {
+ __u32 exception;
+ __u32 error_code;
+ } exception;
+ /* GZVM_EXIT_HYPERCALL */
+ struct {
+ __u64 args[8]; /* in-out */
+ } hypercall;
+ /* GZVM_EXIT_INTERNAL_ERROR */
+ struct {
+ __u32 suberror;
+ __u32 ndata;
+ __u64 data[16];
+ } internal;
+ /* GZVM_EXIT_SYSTEM_EVENT */
+ struct {
+#define GZVM_SYSTEM_EVENT_SHUTDOWN 1
+#define GZVM_SYSTEM_EVENT_RESET 2
+#define GZVM_SYSTEM_EVENT_CRASH 3
+#define GZVM_SYSTEM_EVENT_WAKEUP 4
+#define GZVM_SYSTEM_EVENT_SUSPEND 5
+#define GZVM_SYSTEM_EVENT_SEV_TERM 6
+#define GZVM_SYSTEM_EVENT_S2IDLE 7
+ __u32 type;
+ __u32 ndata;
+ __u64 data[16];
+ } system_event;
+ char padding[256];
+ };
+};
+
/* for GZVM_ENABLE_CAP */
struct gzvm_enable_cap {
/* in */
@@ -73,4 +196,17 @@ struct gzvm_enable_cap {
#define GZVM_ENABLE_CAP _IOW(GZVM_IOC_MAGIC, 0xa3, \
struct gzvm_enable_cap)

+/* for GZVM_GET/SET_ONE_REG */
+struct gzvm_one_reg {
+ __u64 id;
+ __u64 addr;
+};
+
+#define GZVM_GET_ONE_REG _IOW(GZVM_IOC_MAGIC, 0xab, \
+ struct gzvm_one_reg)
+#define GZVM_SET_ONE_REG _IOW(GZVM_IOC_MAGIC, 0xac, \
+ struct gzvm_one_reg)
+
+#define GZVM_REG_GENERIC 0x0000000000000000ULL
+
#endif /* __GZVM_H__ */
--
2.18.0


2023-07-27 09:17:36

by Yi-De Wu

[permalink] [raw]
Subject: [PATCH v5 06/12] virt: geniezone: Add irqfd support

From: "Yingshiuan Pan" <[email protected]>

irqfd enables threads other than vcpu threads to inject virtual
interrupts asynchronously through an irqfd rather than through the
ioctl interface. This interface is necessary for a VMM that creates
separate threads for I/O handling or uses vhost devices.
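
For illustration only, a minimal userspace sketch of the intended flow
(assuming a VM fd obtained via GZVM_CREATE_VM, the uapi header added by
this patch being installed, and a made-up helper name) could look like:

	/*
	 * Hedged sketch: wire an eventfd to a guest SPI so that any VMM
	 * thread can inject the interrupt asynchronously.
	 */
	#include <string.h>
	#include <sys/eventfd.h>
	#include <sys/ioctl.h>
	#include <linux/gzvm.h>

	static int wire_irqfd(int vm_fd, unsigned int gsi)
	{
		struct gzvm_irqfd irqfd;
		int efd = eventfd(0, EFD_CLOEXEC);

		if (efd < 0)
			return -1;

		memset(&irqfd, 0, sizeof(irqfd));
		irqfd.fd = efd;
		irqfd.gsi = gsi;	/* SPI number, counted from 0 */

		if (ioctl(vm_fd, GZVM_IRQFD, &irqfd) < 0)
			return -1;

		/* later, from any thread: eventfd_write(efd, 1); */
		return efd;
	}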

Signed-off-by: Yingshiuan Pan <[email protected]>
Signed-off-by: Liju Chen <[email protected]>
Signed-off-by: Yi-De Wu <[email protected]>
---
arch/arm64/geniezone/gzvm_arch_common.h | 18 +
drivers/virt/geniezone/Makefile | 3 +-
drivers/virt/geniezone/gzvm_irqfd.c | 566 ++++++++++++++++++++++++
drivers/virt/geniezone/gzvm_main.c | 4 +
drivers/virt/geniezone/gzvm_vcpu.c | 1 +
drivers/virt/geniezone/gzvm_vm.c | 18 +
include/linux/gzvm_drv.h | 26 ++
include/uapi/linux/gzvm.h | 26 ++
8 files changed, 660 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/geniezone/gzvm_irqfd.c

diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index 9be9cf77faa3..051d8f49a1df 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -45,6 +45,8 @@ enum {
#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
#define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT)

+#define GIC_V3_NR_LRS 16
+
/**
* gzvm_hypcall_wrapper() - the wrapper for hvc calls
* @a0-a7: arguments passed in registers 0 to 7
@@ -72,6 +74,22 @@ static inline u16 get_vcpuid_from_tuple(unsigned int tuple)
return (u16)(tuple & 0xffff);
}

+/**
+ * struct gzvm_vcpu_hwstate: Sync architecture state back to host for handling
+ * @nr_lrs: The available LRs (list registers) in the SoC.
+ * @__pad: Explicit padding in the middle of the struct to make the
+ * actual layout unambiguous.
+ * @lr: The array of LRs(list registers).
+ *
+ * - Keep the same layout of hypervisor data struct.
+ * - Sync list registers back for acking virtual device interrupt status.
+ */
+struct gzvm_vcpu_hwstate {
+ __le32 nr_lrs;
+ __le32 __pad;
+ __le64 lr[GIC_V3_NR_LRS];
+};
+
static inline unsigned int
assemble_vm_vcpu_tuple(u16 vmid, u16 vcpuid)
{
diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile
index 8ebf2db0c970..19a835b0aac2 100644
--- a/drivers/virt/geniezone/Makefile
+++ b/drivers/virt/geniezone/Makefile
@@ -7,5 +7,4 @@
GZVM_DIR ?= ../../../drivers/virt/geniezone

gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \
- $(GZVM_DIR)/gzvm_vcpu.o
-
+ $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqfd.o
diff --git a/drivers/virt/geniezone/gzvm_irqfd.c b/drivers/virt/geniezone/gzvm_irqfd.c
new file mode 100644
index 000000000000..b10ac3a940ee
--- /dev/null
+++ b/drivers/virt/geniezone/gzvm_irqfd.c
@@ -0,0 +1,566 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/syscalls.h>
+#include <linux/gzvm_drv.h>
+#include "gzvm_common.h"
+
+struct gzvm_irq_ack_notifier {
+ struct hlist_node link;
+ unsigned int gsi;
+ void (*irq_acked)(struct gzvm_irq_ack_notifier *ian);
+};
+
+/**
+ * struct gzvm_kernel_irqfd_resampler - irqfd resampler descriptor.
+ * @gzvm: Pointer to gzvm.
+ * @list: List of resampling struct _irqfd objects sharing this gsi.
+ * RCU list modified under gzvm->irqfds.resampler_lock.
+ * @notifier: gzvm irq ack notifier.
+ * @link: Entry in list of gzvm->irqfd.resampler_list.
+ * Used for sharing resamplers among irqfds on the same gsi.
+ * Accessed and modified under gzvm->irqfds.resampler_lock.
+ *
+ * Resampling irqfds are a special variety of irqfds used to emulate
+ * level triggered interrupts. The interrupt is asserted on eventfd
+ * trigger. On acknowledgment through the irq ack notifier, the
+ * interrupt is de-asserted and userspace is notified through the
+ * resamplefd. All resamplers on the same gsi are de-asserted
+ * together, so we don't need to track the state of each individual
+ * user. We can also therefore share the same irq source ID.
+ */
+struct gzvm_kernel_irqfd_resampler {
+ struct gzvm *gzvm;
+
+ struct list_head list;
+ struct gzvm_irq_ack_notifier notifier;
+
+ struct list_head link;
+};
+
+/**
+ * struct gzvm_kernel_irqfd: gzvm kernel irqfd descriptor.
+ * @gzvm: Pointer to struct gzvm.
+ * @wait: Wait queue entry.
+ * @gsi: Used for level IRQ fast-path.
+ * @resampler: The resampler used by this irqfd (resampler-only).
+ * @resamplefd: Eventfd notified on resample (resampler-only).
+ * @resampler_link: Entry in list of irqfds for a resampler (resampler-only).
+ * @eventfd: Used for setup/shutdown.
+ * @list: struct list_head.
+ * @pt: struct poll_table_struct.
+ * @shutdown: struct work_struct.
+ */
+struct gzvm_kernel_irqfd {
+ struct gzvm *gzvm;
+ wait_queue_entry_t wait;
+
+ int gsi;
+
+ struct gzvm_kernel_irqfd_resampler *resampler;
+
+ struct eventfd_ctx *resamplefd;
+
+ struct list_head resampler_link;
+
+ struct eventfd_ctx *eventfd;
+ struct list_head list;
+ poll_table pt;
+ struct work_struct shutdown;
+};
+
+static struct workqueue_struct *irqfd_cleanup_wq;
+
+/**
+ * irqfd_set_spi(): irqfd to inject virtual interrupt.
+ * @gzvm: Pointer to gzvm.
+ * @irq_source_id: irq source id.
+ * @irq: The SPI interrupt number (starts from 0 instead of 32).
+ * @level: irq triggered level.
+ * @line_status: irq status.
+ */
+static void irqfd_set_spi(struct gzvm *gzvm, int irq_source_id, u32 irq,
+ int level, bool line_status)
+{
+ if (level)
+ gzvm_irqchip_inject_irq(gzvm, irq_source_id, 0, irq, level);
+}
+
+/**
+ * irqfd_resampler_ack() - Notify all of the resampler irqfds using this GSI
+ * when the IRQ has been de-asserted once.
+ * @ian: Pointer to gzvm_irq_ack_notifier.
+ *
+ * Since resampler irqfds share an IRQ source ID, we de-assert once
+ * then notify all of the resampler irqfds using this GSI. We can't
+ * do multiple de-asserts or we risk racing with incoming re-asserts.
+ */
+static void irqfd_resampler_ack(struct gzvm_irq_ack_notifier *ian)
+{
+ struct gzvm_kernel_irqfd_resampler *resampler;
+ struct gzvm *gzvm;
+ struct gzvm_kernel_irqfd *irqfd;
+ int idx;
+
+ resampler = container_of(ian,
+ struct gzvm_kernel_irqfd_resampler, notifier);
+ gzvm = resampler->gzvm;
+
+ irqfd_set_spi(gzvm, GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
+ resampler->notifier.gsi, 0, false);
+
+ idx = srcu_read_lock(&gzvm->irq_srcu);
+
+ list_for_each_entry_srcu(irqfd, &resampler->list, resampler_link,
+ srcu_read_lock_held(&gzvm->irq_srcu)) {
+ eventfd_signal(irqfd->resamplefd, 1);
+ }
+
+ srcu_read_unlock(&gzvm->irq_srcu, idx);
+}
+
+static void gzvm_register_irq_ack_notifier(struct gzvm *gzvm,
+ struct gzvm_irq_ack_notifier *ian)
+{
+ mutex_lock(&gzvm->irq_lock);
+ hlist_add_head_rcu(&ian->link, &gzvm->irq_ack_notifier_list);
+ mutex_unlock(&gzvm->irq_lock);
+}
+
+static void gzvm_unregister_irq_ack_notifier(struct gzvm *gzvm,
+ struct gzvm_irq_ack_notifier *ian)
+{
+ mutex_lock(&gzvm->irq_lock);
+ hlist_del_init_rcu(&ian->link);
+ mutex_unlock(&gzvm->irq_lock);
+ synchronize_srcu(&gzvm->irq_srcu);
+}
+
+static void irqfd_resampler_shutdown(struct gzvm_kernel_irqfd *irqfd)
+{
+ struct gzvm_kernel_irqfd_resampler *resampler = irqfd->resampler;
+ struct gzvm *gzvm = resampler->gzvm;
+
+ mutex_lock(&gzvm->irqfds.resampler_lock);
+
+ list_del_rcu(&irqfd->resampler_link);
+ synchronize_srcu(&gzvm->irq_srcu);
+
+ if (list_empty(&resampler->list)) {
+ list_del(&resampler->link);
+ gzvm_unregister_irq_ack_notifier(gzvm, &resampler->notifier);
+ irqfd_set_spi(gzvm, GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
+ resampler->notifier.gsi, 0, false);
+ kfree(resampler);
+ }
+
+ mutex_unlock(&gzvm->irqfds.resampler_lock);
+}
+
+/**
+ * irqfd_shutdown() - Race-free decouple logic (ordering is critical).
+ * @work: Pointer to work_struct.
+ */
+static void irqfd_shutdown(struct work_struct *work)
+{
+ struct gzvm_kernel_irqfd *irqfd =
+ container_of(work, struct gzvm_kernel_irqfd, shutdown);
+ struct gzvm *gzvm = irqfd->gzvm;
+ u64 cnt;
+
+ /* Make sure irqfd has been initialized in assign path. */
+ synchronize_srcu(&gzvm->irq_srcu);
+
+ /*
+ * Synchronize with the wait-queue and unhook ourselves to prevent
+ * further events.
+ */
+ eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt);
+
+ if (irqfd->resampler) {
+ irqfd_resampler_shutdown(irqfd);
+ eventfd_ctx_put(irqfd->resamplefd);
+ }
+
+ /*
+ * It is now safe to release the object's resources
+ */
+ eventfd_ctx_put(irqfd->eventfd);
+ kfree(irqfd);
+}
+
+/**
+ * irqfd_is_active() - Assumes gzvm->irqfds.lock is held.
+ * @irqfd: Pointer to gzvm_kernel_irqfd.
+ *
+ * Return:
+ * * true - irqfd is active.
+ */
+static bool irqfd_is_active(struct gzvm_kernel_irqfd *irqfd)
+{
+ return list_empty(&irqfd->list) ? false : true;
+}
+
+/**
+ * irqfd_deactivate() - Mark the irqfd as inactive and schedule it for removal.
+ * assumes gzvm->irqfds.lock is held.
+ * @irqfd: Pointer to gzvm_kernel_irqfd.
+ */
+static void irqfd_deactivate(struct gzvm_kernel_irqfd *irqfd)
+{
+ if (!irqfd_is_active(irqfd))
+ return;
+
+ list_del_init(&irqfd->list);
+
+ queue_work(irqfd_cleanup_wq, &irqfd->shutdown);
+}
+
+/**
+ * irqfd_wakeup() - Callback of the irqfd wait queue, woken by a write to the
+ * irqfd to perform virtual interrupt injection.
+ * @wait: Pointer to wait_queue_entry_t.
+ * @mode: Unused.
+ * @sync: Unused.
+ * @key: Get flags about Epoll events.
+ *
+ * Return:
+ * * 0 - Success
+ */
+static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync,
+ void *key)
+{
+ struct gzvm_kernel_irqfd *irqfd =
+ container_of(wait, struct gzvm_kernel_irqfd, wait);
+ __poll_t flags = key_to_poll(key);
+ struct gzvm *gzvm = irqfd->gzvm;
+
+ if (flags & EPOLLIN) {
+ u64 cnt;
+
+ eventfd_ctx_do_read(irqfd->eventfd, &cnt);
+ /* gzvm's irq injection is not blocked, don't need workq */
+ irqfd_set_spi(gzvm, GZVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi,
+ 1, false);
+ }
+
+ if (flags & EPOLLHUP) {
+ /* The eventfd is closing, detach from GZVM */
+ unsigned long iflags;
+
+ spin_lock_irqsave(&gzvm->irqfds.lock, iflags);
+
+ /*
+ * Check again whether someone deactivated the irqfd before
+ * we could acquire the irqfds.lock.
+ */
+ if (irqfd_is_active(irqfd))
+ irqfd_deactivate(irqfd);
+
+ spin_unlock_irqrestore(&gzvm->irqfds.lock, iflags);
+ }
+
+ return 0;
+}
+
+static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
+ poll_table *pt)
+{
+ struct gzvm_kernel_irqfd *irqfd =
+ container_of(pt, struct gzvm_kernel_irqfd, pt);
+ add_wait_queue_priority(wqh, &irqfd->wait);
+}
+
+static int gzvm_irqfd_assign(struct gzvm *gzvm, struct gzvm_irqfd *args)
+{
+ struct gzvm_kernel_irqfd *irqfd, *tmp;
+ struct fd f;
+ struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
+ int ret;
+ __poll_t events;
+ int idx;
+
+ irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL_ACCOUNT);
+ if (!irqfd)
+ return -ENOMEM;
+
+ irqfd->gzvm = gzvm;
+ irqfd->gsi = args->gsi;
+ irqfd->resampler = NULL;
+
+ INIT_LIST_HEAD(&irqfd->list);
+ INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
+
+ f = fdget(args->fd);
+ if (!f.file) {
+ ret = -EBADF;
+ goto out;
+ }
+
+ eventfd = eventfd_ctx_fileget(f.file);
+ if (IS_ERR(eventfd)) {
+ ret = PTR_ERR(eventfd);
+ goto fail;
+ }
+
+ irqfd->eventfd = eventfd;
+
+ if (args->flags & GZVM_IRQFD_FLAG_RESAMPLE) {
+ struct gzvm_kernel_irqfd_resampler *resampler;
+
+ resamplefd = eventfd_ctx_fdget(args->resamplefd);
+ if (IS_ERR(resamplefd)) {
+ ret = PTR_ERR(resamplefd);
+ goto fail;
+ }
+
+ irqfd->resamplefd = resamplefd;
+ INIT_LIST_HEAD(&irqfd->resampler_link);
+
+ mutex_lock(&gzvm->irqfds.resampler_lock);
+
+ list_for_each_entry(resampler,
+ &gzvm->irqfds.resampler_list, link) {
+ if (resampler->notifier.gsi == irqfd->gsi) {
+ irqfd->resampler = resampler;
+ break;
+ }
+ }
+
+ if (!irqfd->resampler) {
+ resampler = kzalloc(sizeof(*resampler),
+ GFP_KERNEL_ACCOUNT);
+ if (!resampler) {
+ ret = -ENOMEM;
+ mutex_unlock(&gzvm->irqfds.resampler_lock);
+ goto fail;
+ }
+
+ resampler->gzvm = gzvm;
+ INIT_LIST_HEAD(&resampler->list);
+ resampler->notifier.gsi = irqfd->gsi;
+ resampler->notifier.irq_acked = irqfd_resampler_ack;
+ INIT_LIST_HEAD(&resampler->link);
+
+ list_add(&resampler->link, &gzvm->irqfds.resampler_list);
+ gzvm_register_irq_ack_notifier(gzvm,
+ &resampler->notifier);
+ irqfd->resampler = resampler;
+ }
+
+ list_add_rcu(&irqfd->resampler_link, &irqfd->resampler->list);
+ synchronize_srcu(&gzvm->irq_srcu);
+
+ mutex_unlock(&gzvm->irqfds.resampler_lock);
+ }
+
+ /*
+ * Install our own custom wake-up handling so we are notified via
+ * a callback whenever someone signals the underlying eventfd
+ */
+ init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
+ init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
+
+ spin_lock_irq(&gzvm->irqfds.lock);
+
+ ret = 0;
+ list_for_each_entry(tmp, &gzvm->irqfds.items, list) {
+ if (irqfd->eventfd != tmp->eventfd)
+ continue;
+ /* This fd is used for another irq already. */
+ pr_err("already used: gsi=%d fd=%d\n", args->gsi, args->fd);
+ ret = -EBUSY;
+ spin_unlock_irq(&gzvm->irqfds.lock);
+ goto fail;
+ }
+
+ idx = srcu_read_lock(&gzvm->irq_srcu);
+
+ list_add_tail(&irqfd->list, &gzvm->irqfds.items);
+
+ spin_unlock_irq(&gzvm->irqfds.lock);
+
+ /*
+ * Check if there was an event already pending on the eventfd
+ * before we registered, and trigger it as if we didn't miss it.
+ */
+ events = vfs_poll(f.file, &irqfd->pt);
+
+ /* In case there is already a pending event */
+ if (events & EPOLLIN)
+ irqfd_set_spi(gzvm, GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
+ irqfd->gsi, 1, false);
+
+ srcu_read_unlock(&gzvm->irq_srcu, idx);
+
+ /*
+ * do not drop the file until the irqfd is fully initialized, otherwise
+ * we might race against the EPOLLHUP
+ */
+ fdput(f);
+ return 0;
+
+fail:
+ if (irqfd->resampler)
+ irqfd_resampler_shutdown(irqfd);
+
+ if (resamplefd && !IS_ERR(resamplefd))
+ eventfd_ctx_put(resamplefd);
+
+ if (eventfd && !IS_ERR(eventfd))
+ eventfd_ctx_put(eventfd);
+
+ fdput(f);
+
+out:
+ kfree(irqfd);
+ return ret;
+}
+
+static void gzvm_notify_acked_gsi(struct gzvm *gzvm, int gsi)
+{
+ struct gzvm_irq_ack_notifier *gian;
+
+ hlist_for_each_entry_srcu(gian, &gzvm->irq_ack_notifier_list,
+ link, srcu_read_lock_held(&gzvm->irq_srcu))
+ if (gian->gsi == gsi)
+ gian->irq_acked(gian);
+}
+
+void gzvm_notify_acked_irq(struct gzvm *gzvm, unsigned int gsi)
+{
+ int idx;
+
+ idx = srcu_read_lock(&gzvm->irq_srcu);
+ gzvm_notify_acked_gsi(gzvm, gsi);
+ srcu_read_unlock(&gzvm->irq_srcu, idx);
+}
+
+/**
+ * gzvm_irqfd_deassign() - Shutdown any irqfd's that match fd+gsi.
+ * @gzvm: Pointer to gzvm.
+ * @args: Pointer to gzvm_irqfd.
+ *
+ * Return:
+ * * 0 - Success.
+ * * Negative value - Failure.
+ */
+static int gzvm_irqfd_deassign(struct gzvm *gzvm, struct gzvm_irqfd *args)
+{
+ struct gzvm_kernel_irqfd *irqfd, *tmp;
+ struct eventfd_ctx *eventfd;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(eventfd);
+
+ spin_lock_irq(&gzvm->irqfds.lock);
+
+ list_for_each_entry_safe(irqfd, tmp, &gzvm->irqfds.items, list) {
+ if (irqfd->eventfd == eventfd && irqfd->gsi == args->gsi)
+ irqfd_deactivate(irqfd);
+ }
+
+ spin_unlock_irq(&gzvm->irqfds.lock);
+ eventfd_ctx_put(eventfd);
+
+ /*
+ * Block until we know all outstanding shutdown jobs have completed
+ * so that we guarantee there will not be any more interrupts on this
+ * gsi once this deassign function returns.
+ */
+ flush_workqueue(irqfd_cleanup_wq);
+
+ return 0;
+}
+
+int gzvm_irqfd(struct gzvm *gzvm, struct gzvm_irqfd *args)
+{
+ for (int i = 0; i < ARRAY_SIZE(args->pad); i++) {
+ if (args->pad[i])
+ return -EINVAL;
+ }
+
+ if (args->flags &
+ ~(GZVM_IRQFD_FLAG_DEASSIGN | GZVM_IRQFD_FLAG_RESAMPLE))
+ return -EINVAL;
+
+ if (args->flags & GZVM_IRQFD_FLAG_DEASSIGN)
+ return gzvm_irqfd_deassign(gzvm, args);
+
+ return gzvm_irqfd_assign(gzvm, args);
+}
+
+/**
+ * gzvm_vm_irqfd_init() - Initialize irqfd data structure per VM
+ *
+ * @gzvm: Pointer to struct gzvm.
+ *
+ * Return:
+ * * 0 - Success.
+ * * Negative - Failure.
+ */
+int gzvm_vm_irqfd_init(struct gzvm *gzvm)
+{
+ mutex_init(&gzvm->irq_lock);
+
+ spin_lock_init(&gzvm->irqfds.lock);
+ INIT_LIST_HEAD(&gzvm->irqfds.items);
+ INIT_LIST_HEAD(&gzvm->irqfds.resampler_list);
+ if (init_srcu_struct(&gzvm->irq_srcu))
+ return -EINVAL;
+ INIT_HLIST_HEAD(&gzvm->irq_ack_notifier_list);
+ mutex_init(&gzvm->irqfds.resampler_lock);
+
+ return 0;
+}
+
+/**
+ * gzvm_vm_irqfd_release() - This function is called as the gzvm VM fd is being
+ * released. Shut down all irqfds that still remain open.
+ * @gzvm: Pointer to gzvm.
+ */
+void gzvm_vm_irqfd_release(struct gzvm *gzvm)
+{
+ struct gzvm_kernel_irqfd *irqfd, *tmp;
+
+ spin_lock_irq(&gzvm->irqfds.lock);
+
+ list_for_each_entry_safe(irqfd, tmp, &gzvm->irqfds.items, list)
+ irqfd_deactivate(irqfd);
+
+ spin_unlock_irq(&gzvm->irqfds.lock);
+
+ /*
+ * Block until we know all outstanding shutdown jobs have completed.
+ */
+ flush_workqueue(irqfd_cleanup_wq);
+}
+
+/**
+ * gzvm_drv_irqfd_init() - Create the irqfd cleanup workqueue.
+ *
+ * Return:
+ * * 0 - Success.
+ * * Negative - Failure.
+ *
+ * Create a host-wide workqueue for issuing deferred shutdown requests
+ * aggregated from all vm* instances. We need our own isolated
+ * queue to ease flushing work items when a VM exits.
+ */
+int gzvm_drv_irqfd_init(void)
+{
+ irqfd_cleanup_wq = alloc_workqueue("gzvm-irqfd-cleanup", 0, 0);
+ if (!irqfd_cleanup_wq)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void gzvm_drv_irqfd_exit(void)
+{
+ destroy_workqueue(irqfd_cleanup_wq);
+}
diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c
index b629b41a0cd9..d4d5d75d3660 100644
--- a/drivers/virt/geniezone/gzvm_main.c
+++ b/drivers/virt/geniezone/gzvm_main.c
@@ -110,11 +110,15 @@ static int gzvm_drv_probe(struct platform_device *pdev)
if (ret)
return ret;

+ ret = gzvm_drv_irqfd_init();
+ if (ret)
+ return ret;
return 0;
}

static int gzvm_drv_remove(struct platform_device *pdev)
{
+ gzvm_drv_irqfd_exit();
gzvm_destroy_all_vms();
misc_deregister(&gzvm_dev);
return 0;
diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c
index e051343f2b0e..a717fc713b2e 100644
--- a/drivers/virt/geniezone/gzvm_vcpu.c
+++ b/drivers/virt/geniezone/gzvm_vcpu.c
@@ -227,6 +227,7 @@ int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid)
ret = -ENOMEM;
goto free_vcpu;
}
+ vcpu->hwstate = (void *)vcpu->run + PAGE_SIZE;
vcpu->vcpuid = cpuid;
vcpu->gzvm = gzvm;
mutex_init(&vcpu->lock);
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index b1397180cd02..a93f5b0e7078 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -376,6 +376,16 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
ret = gzvm_vm_ioctl_create_device(gzvm, argp);
break;
}
+ case GZVM_IRQFD: {
+ struct gzvm_irqfd data;
+
+ if (copy_from_user(&data, argp, sizeof(data))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ ret = gzvm_irqfd(gzvm, &data);
+ break;
+ }
case GZVM_ENABLE_CAP: {
struct gzvm_enable_cap cap;

@@ -399,6 +409,7 @@ static void gzvm_destroy_vm(struct gzvm *gzvm)

mutex_lock(&gzvm->lock);

+ gzvm_vm_irqfd_release(gzvm);
gzvm_destroy_vcpus(gzvm);
gzvm_arch_destroy_vm(gzvm->vm_id);

@@ -444,6 +455,13 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type)
gzvm->mm = current->mm;
mutex_init(&gzvm->lock);

+ ret = gzvm_vm_irqfd_init(gzvm);
+ if (ret) {
+ pr_err("Failed to initialize irqfd\n");
+ kfree(gzvm);
+ return ERR_PTR(ret);
+ }
+
mutex_lock(&gzvm_list_lock);
list_add(&gzvm->vm_list, &gzvm_list);
mutex_unlock(&gzvm_list_lock);
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index d86885d46195..af7043d66567 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -9,6 +9,7 @@
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/gzvm.h>
+#include <linux/srcu.h>

#define GZVM_VCPU_MMAP_SIZE PAGE_SIZE
#define INVALID_VM_ID 0xffff
@@ -23,6 +24,8 @@
#define ERR_NOT_SUPPORTED (-24)
#define ERR_NOT_IMPLEMENTED (-27)
#define ERR_FAULT (-40)
+#define GZVM_USERSPACE_IRQ_SOURCE_ID 0
+#define GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1

/*
* The following data structures are for data transferring between driver and
@@ -66,6 +69,7 @@ struct gzvm_vcpu {
/* lock of vcpu*/
struct mutex lock;
struct gzvm_vcpu_run *run;
+ struct gzvm_vcpu_hwstate *hwstate;
};

struct gzvm {
@@ -75,8 +79,23 @@ struct gzvm {
struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION];
/* lock for list_add*/
struct mutex lock;
+
+ struct {
+ /* lock for irqfds list operation */
+ spinlock_t lock;
+ struct list_head items;
+ struct list_head resampler_list;
+ /* lock for irqfds resampler */
+ struct mutex resampler_lock;
+ } irqfds;
+
struct list_head vm_list;
u16 vm_id;
+
+ struct hlist_head irq_ack_notifier_list;
+ struct srcu_struct irq_srcu;
+ /* lock for irq injection */
+ struct mutex irq_lock;
};

long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args);
@@ -112,4 +131,11 @@ int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev);
int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx,
u32 irq_type, u32 irq, bool level);

+void gzvm_notify_acked_irq(struct gzvm *gzvm, unsigned int gsi);
+int gzvm_irqfd(struct gzvm *gzvm, struct gzvm_irqfd *args);
+int gzvm_drv_irqfd_init(void);
+void gzvm_drv_irqfd_exit(void);
+int gzvm_vm_irqfd_init(struct gzvm *gzvm);
+void gzvm_vm_irqfd_release(struct gzvm *gzvm);
+
#endif /* __GZVM_DRV_H__ */
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index fb019d232a98..f4b16d70f035 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -275,4 +275,30 @@ struct gzvm_one_reg {

#define GZVM_REG_GENERIC 0x0000000000000000ULL

+#define GZVM_IRQFD_FLAG_DEASSIGN BIT(0)
+/*
+ * GZVM_IRQFD_FLAG_RESAMPLE indicates resamplefd is valid and specifies
+ * the irqfd to operate in resampling mode for level-triggered interrupt
+ * emulation.
+ */
+#define GZVM_IRQFD_FLAG_RESAMPLE BIT(1)
+
+/**
+ * struct gzvm_irqfd: gzvm irqfd descriptor
+ * @fd: File descriptor.
+ * @gsi: Used for level IRQ fast-path.
+ * @flags: FLAG_DEASSIGN or FLAG_RESAMPLE.
+ * @resamplefd: The file descriptor of the resampler.
+ * @pad: Reserved for future use.
+ */
+struct gzvm_irqfd {
+ __u32 fd;
+ __u32 gsi;
+ __u32 flags;
+ __u32 resamplefd;
+ __u8 pad[16];
+};
+
+#define GZVM_IRQFD _IOW(GZVM_IOC_MAGIC, 0x76, struct gzvm_irqfd)
+
#endif /* __GZVM_H__ */
--
2.18.0


2023-07-27 09:48:47

by Eugen Hristev

[permalink] [raw]
Subject: Re: [PATCH v5 03/12] virt: geniezone: Add GenieZone hypervisor support

Hi Yi-De,

On 7/27/23 10:59, Yi-De Wu wrote:
> From: "Yingshiuan Pan" <[email protected]>
>
> GenieZone is MediaTek hypervisor solution, and it is running in EL2
> stand alone as a type-I hypervisor. This patch exports a set of ioctl
> interfaces for userspace VMM (e.g., crosvm) to operate guest VMs
> lifecycle (creation and destroy) on GenieZone.
>
> Signed-off-by: Yingshiuan Pan <[email protected]>
> Signed-off-by: Jerry Wang <[email protected]>
> Signed-off-by: Liju Chen <[email protected]>
> Signed-off-by: Yi-De Wu <[email protected]>
> ---
> MAINTAINERS | 6 +
> arch/arm64/Kbuild | 1 +
> arch/arm64/geniezone/Makefile | 9 +
> arch/arm64/geniezone/gzvm_arch_common.h | 68 ++++
> arch/arm64/geniezone/vm.c | 212 +++++++++++++
> arch/arm64/include/uapi/asm/gzvm_arch.h | 20 ++
> drivers/virt/Kconfig | 2 +
> drivers/virt/geniezone/Kconfig | 16 +
> drivers/virt/geniezone/Makefile | 10 +
> drivers/virt/geniezone/gzvm_main.c | 143 +++++++++
> drivers/virt/geniezone/gzvm_vm.c | 400 ++++++++++++++++++++++++
> include/linux/gzvm_drv.h | 90 ++++++
> include/uapi/asm-generic/Kbuild | 1 +
> include/uapi/asm-generic/gzvm_arch.h | 10 +
> include/uapi/linux/gzvm.h | 76 +++++
> 15 files changed, 1064 insertions(+)
> create mode 100644 arch/arm64/geniezone/Makefile
> create mode 100644 arch/arm64/geniezone/gzvm_arch_common.h
> create mode 100644 arch/arm64/geniezone/vm.c
> create mode 100644 arch/arm64/include/uapi/asm/gzvm_arch.h
> create mode 100644 drivers/virt/geniezone/Kconfig
> create mode 100644 drivers/virt/geniezone/Makefile
> create mode 100644 drivers/virt/geniezone/gzvm_main.c
> create mode 100644 drivers/virt/geniezone/gzvm_vm.c
> create mode 100644 include/linux/gzvm_drv.h
> create mode 100644 include/uapi/asm-generic/gzvm_arch.h
> create mode 100644 include/uapi/linux/gzvm.h
>

I have a feeling this patch is a bit big; it would help review if it were
split into smaller chunks.

> diff --git a/MAINTAINERS b/MAINTAINERS
> index bfbfdb790446..b91d41dd2f2f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8747,6 +8747,12 @@ M: Ze-Yu Wang <[email protected]>
> M: Yi-De Wu <[email protected]>
> F: Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
> F: Documentation/virt/geniezone/
> +F: arch/arm64/geniezone/
> +F: arch/arm64/include/uapi/asm/gzvm_arch.h
> +F: drivers/virt/geniezone/
> +F: include/linux/gzvm_drv.h
> +F include/uapi/asm-generic/gzvm_arch.h
> +F: include/uapi/linux/gzvm.h
>
> GENWQE (IBM Generic Workqueue Card)
> M: Frank Haverkamp <[email protected]>
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..0c3cca572919 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -4,6 +4,7 @@ obj-$(CONFIG_KVM) += kvm/
> obj-$(CONFIG_XEN) += xen/
> obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> obj-$(CONFIG_CRYPTO) += crypto/
> +obj-$(CONFIG_MTK_GZVM) += geniezone/
>
> # for cleaning
> subdir- += boot
> diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile
> new file mode 100644
> index 000000000000..2957898cdd05
> --- /dev/null
> +++ b/arch/arm64/geniezone/Makefile
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Main Makefile for gzvm, this one includes drivers/virt/geniezone/Makefile
> +#
> +include $(srctree)/drivers/virt/geniezone/Makefile
> +
> +gzvm-y += vm.o
> +
> +obj-$(CONFIG_MTK_GZVM) += gzvm.o
> diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
> new file mode 100644
> index 000000000000..fdb95d619102
> --- /dev/null
> +++ b/arch/arm64/geniezone/gzvm_arch_common.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#ifndef __GZVM_ARCH_COMMON_H__
> +#define __GZVM_ARCH_COMMON_H__
> +
> +#include <linux/arm-smccc.h>
> +
> +enum {
> + GZVM_FUNC_CREATE_VM = 0,
> + GZVM_FUNC_DESTROY_VM = 1,
> + GZVM_FUNC_CREATE_VCPU = 2,
> + GZVM_FUNC_DESTROY_VCPU = 3,
> + GZVM_FUNC_SET_MEMREGION = 4,
> + GZVM_FUNC_RUN = 5,
> + GZVM_FUNC_GET_ONE_REG = 8,
> + GZVM_FUNC_SET_ONE_REG = 9,
> + GZVM_FUNC_IRQ_LINE = 10,
> + GZVM_FUNC_CREATE_DEVICE = 11,
> + GZVM_FUNC_PROBE = 12,
> + GZVM_FUNC_ENABLE_CAP = 13,
> + NR_GZVM_FUNC,
> +};
> +
> +#define SMC_ENTITY_MTK 59
> +#define GZVM_FUNCID_START (0x1000)
> +#define GZVM_HCALL_ID(func) \
> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \
> + SMC_ENTITY_MTK, (GZVM_FUNCID_START + (func)))
> +
> +#define MT_HVC_GZVM_CREATE_VM GZVM_HCALL_ID(GZVM_FUNC_CREATE_VM)
> +#define MT_HVC_GZVM_DESTROY_VM GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VM)
> +#define MT_HVC_GZVM_CREATE_VCPU GZVM_HCALL_ID(GZVM_FUNC_CREATE_VCPU)
> +#define MT_HVC_GZVM_DESTROY_VCPU GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VCPU)
> +#define MT_HVC_GZVM_SET_MEMREGION GZVM_HCALL_ID(GZVM_FUNC_SET_MEMREGION)
> +#define MT_HVC_GZVM_RUN GZVM_HCALL_ID(GZVM_FUNC_RUN)
> +#define MT_HVC_GZVM_GET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_GET_ONE_REG)
> +#define MT_HVC_GZVM_SET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_SET_ONE_REG)
> +#define MT_HVC_GZVM_IRQ_LINE GZVM_HCALL_ID(GZVM_FUNC_IRQ_LINE)
> +#define MT_HVC_GZVM_CREATE_DEVICE GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE)
> +#define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE)
> +#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
> +
> +/**
> + * gzvm_hypcall_wrapper() - the wrapper for hvc calls
> + * @a0-a7: arguments passed in registers 0 to 7
> + * @res: result values from registers 0 to 3
> + *
> + * Return: The wrapper helps caller to convert geniezone errno to Linux errno.
> + */
> +static inline int gzvm_hypcall_wrapper(unsigned long a0, unsigned long a1,
> + unsigned long a2, unsigned long a3,
> + unsigned long a4, unsigned long a5,
> + unsigned long a6, unsigned long a7,
> + struct arm_smccc_res *res)
> +{
> + arm_smccc_hvc(a0, a1, a2, a3, a4, a5, a6, a7, res);
> + return gzvm_err_to_errno(res->a0);
> +}
> +
> +static inline u16 get_vmid_from_tuple(unsigned int tuple)
> +{
> + return (u16)(tuple >> 16);
> +}
> +
> +#endif /* __GZVM_ARCH_COMMON_H__ */
> diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
> new file mode 100644
> index 000000000000..e35751b21821
> --- /dev/null
> +++ b/arch/arm64/geniezone/vm.c
> @@ -0,0 +1,212 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#include <asm/sysreg.h>
> +#include <linux/arm-smccc.h>
> +#include <linux/err.h>
> +#include <linux/uaccess.h>
> +
> +#include <linux/gzvm.h>
> +#include <linux/gzvm_drv.h>
> +#include "gzvm_arch_common.h"
> +
> +#define PAR_PA47_MASK ((((1UL << 48) - 1) >> 12) << 12)
> +
> +int gzvm_arch_probe(void)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_hvc(MT_HVC_GZVM_PROBE, 0, 0, 0, 0, 0, 0, 0, &res);
> + if (res.a0 == 0)
> + return 0;

I would see the error path as a particular case here, e.g.

if (res.a0)
return -ENXIO;

and on the usual path return success.

(as you already do below in some functions)...

> +
> + return -ENXIO;
> +}
> +
> +int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
> + phys_addr_t region)
> +{
> + struct arm_smccc_res res;
> +
> + return gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_MEMREGION, vm_id,
> + buf_size, region, 0, 0, 0, 0, &res);
> +}
> +
> +static int gzvm_cap_arm_vm_ipa_size(void __user *argp)
> +{
> + __u64 value = CONFIG_ARM64_PA_BITS;
> +
> + if (copy_to_user(argp, &value, sizeof(__u64)))
> + return -EFAULT;

... e.g. here.

> +
> + return 0;
> +}
> +
> +int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp)
> +{
> + int ret = -EOPNOTSUPP;
> +
> + switch (cap) {
> + case GZVM_CAP_ARM_PROTECTED_VM: {
> + __u64 success = 1;
> +
> + if (copy_to_user(argp, &success, sizeof(__u64)))
> + return -EFAULT;
> + ret = 0;
> + break;
> + }
> + case GZVM_CAP_ARM_VM_IPA_SIZE: {
> + ret = gzvm_cap_arm_vm_ipa_size(argp);
> + break;
> + }
> + default:
> + ret = -EOPNOTSUPP;

you already initialized ret to -EOPNOTSUPP, why don't you initialize it
with 0, and just set it as error code here, and avoid setting it to 0 on
the success case above.
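
Just to illustrate (a rough sketch, not a requirement), the function could
then look like:

	int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp)
	{
		int ret = 0;

		switch (cap) {
		case GZVM_CAP_ARM_PROTECTED_VM: {
			__u64 success = 1;

			if (copy_to_user(argp, &success, sizeof(__u64)))
				return -EFAULT;
			break;
		}
		case GZVM_CAP_ARM_VM_IPA_SIZE:
			ret = gzvm_cap_arm_vm_ipa_size(argp);
			break;
		default:
			ret = -EOPNOTSUPP;
		}

		return ret;
	}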

> + }
> +
> + return ret;
> +}
> +
> +/**
> + * gzvm_arch_create_vm() - create vm
> + * @vm_type: VM type. Only supports Linux VM now.
> + *
> + * Return:
> + * * positive value - VM ID
> + * * -ENOMEM - Memory not enough for storing VM data
> + */
> +int gzvm_arch_create_vm(unsigned long vm_type)
> +{
> + struct arm_smccc_res res;
> + int ret;
> +
> + ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VM, vm_type, 0, 0, 0, 0,
> + 0, 0, &res);
> +
> + if (ret == 0)
> + return res.a1;
> + else
> + return ret;
> +}
> +
> +int gzvm_arch_destroy_vm(u16 vm_id)
> +{
> + struct arm_smccc_res res;
> +
> + return gzvm_hypcall_wrapper(MT_HVC_GZVM_DESTROY_VM, vm_id, 0, 0, 0, 0,
> + 0, 0, &res);
> +}
> +
> +static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + struct arm_smccc_res *res)
> +{
> + return gzvm_hypcall_wrapper(MT_HVC_GZVM_ENABLE_CAP, gzvm->vm_id,
> + cap->cap, cap->args[0], cap->args[1],
> + cap->args[2], cap->args[3], cap->args[4],
> + res);
> +}
> +
> +/**
> + * gzvm_vm_ioctl_get_pvmfw_size() - Get pvmfw size from hypervisor, return
> + * in x1, and return to userspace in args
> + * @gzvm: Pointer to struct gzvm.
> + * @cap: Pointer to struct gzvm_enable_cap.
> + * @argp: Pointer to struct gzvm_enable_cap in user space.
> + *
> + * Return:
> + * * 0 - Succeed
> + * * -EINVAL - Hypervisor return invalid results
> + * * -EFAULT - Fail to copy back to userspace buffer
> + */
> +static int gzvm_vm_ioctl_get_pvmfw_size(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + struct arm_smccc_res res = {0};
> +
> + if (gzvm_vm_arch_enable_cap(gzvm, cap, &res) != 0)
> + return -EINVAL;
> +
> + cap->args[1] = res.a1;
> + if (copy_to_user(argp, cap, sizeof(*cap)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +/**
> + * gzvm_vm_ioctl_cap_pvm() - Proceed GZVM_CAP_ARM_PROTECTED_VM's subcommands
> + * @gzvm: Pointer to struct gzvm.
> + * @cap: Pointer to struct gzvm_enable_cap.
> + * @argp: Pointer to struct gzvm_enable_cap in user space.
> + *
> + * Return:
> + * * 0 - Succeed
> + * * -EINVAL - Invalid subcommand or arguments
> + */
> +static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + int ret = -EINVAL;

This initialization appears redundant as you always rewrite ret to a new
value below

> + struct arm_smccc_res res = {0};
> +
> + switch (cap->args[0]) {
> + case GZVM_CAP_ARM_PVM_SET_PVMFW_IPA:
> + fallthrough;
> + case GZVM_CAP_ARM_PVM_SET_PROTECTED_VM:
> + ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res);
> + break;
> + case GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE:
> + ret = gzvm_vm_ioctl_get_pvmfw_size(gzvm, cap, argp);
> + break;
> + default:
> + ret = -EINVAL;
> + break;
> + }
> +
> + return ret;
> +}
> +
> +int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + int ret = -EINVAL;
same here
> +
> + switch (cap->cap) {
> + case GZVM_CAP_ARM_PROTECTED_VM:
> + ret = gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp);
> + break;
> + default:
> + ret = -EINVAL;
> + break;
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * gzvm_hva_to_pa_arch() - converts hva to pa with arch-specific way
> + * @hva: Host virtual address.
> + *
> + * Return: 0 if translation error

This is a bit misleading, if you look at the code, you return 0 if the
bit SYS_PAR_EL1_F is present, but also return 0 if bit PAR_PA47_MASK is
not present. Are those situations identical ?

Also, it's a bit strange to return 0 for an error case.
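
One possible alternative (only a sketch, and it assumes callers are updated
for the new signature) would be to report the failure explicitly instead of
overloading 0:

	static int gzvm_hva_to_pa_arch(u64 hva, u64 *pa)
	{
		u64 par;
		unsigned long flags;

		local_irq_save(flags);
		asm volatile("at s1e1r, %0" :: "r" (hva));
		isb();
		par = read_sysreg_par();
		local_irq_restore(flags);

		/* translation fault: report an error instead of returning 0 */
		if (par & SYS_PAR_EL1_F)
			return -EFAULT;

		*pa = par & PAR_PA47_MASK;
		return 0;
	}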

> + */
> +u64 gzvm_hva_to_pa_arch(u64 hva)
> +{
> + u64 par;
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + asm volatile("at s1e1r, %0" :: "r" (hva));
> + isb();
> + par = read_sysreg_par();
> + local_irq_restore(flags);
> +
> + if (par & SYS_PAR_EL1_F)
> + return 0;
> +
> + return par & PAR_PA47_MASK;
> +}
> diff --git a/arch/arm64/include/uapi/asm/gzvm_arch.h b/arch/arm64/include/uapi/asm/gzvm_arch.h
> new file mode 100644
> index 000000000000..847bb627a65d
> --- /dev/null
> +++ b/arch/arm64/include/uapi/asm/gzvm_arch.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#ifndef __GZVM_ARCH_H__
> +#define __GZVM_ARCH_H__
> +
> +#include <linux/types.h>
> +
> +#define GZVM_CAP_ARM_VM_IPA_SIZE 165
> +#define GZVM_CAP_ARM_PROTECTED_VM 0xffbadab1
> +
> +/* sub-commands put in args[0] for GZVM_CAP_ARM_PROTECTED_VM */
> +#define GZVM_CAP_ARM_PVM_SET_PVMFW_IPA 0
> +#define GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE 1
> +/* GZVM_CAP_ARM_PVM_SET_PROTECTED_VM only sets protected but not load pvmfw */
> +#define GZVM_CAP_ARM_PVM_SET_PROTECTED_VM 2
> +
> +#endif /* __GZVM_ARCH_H__ */
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..9bbf0bdf672c 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>
> source "drivers/virt/coco/tdx-guest/Kconfig"
>
> +source "drivers/virt/geniezone/Kconfig"
> +
> endif
> diff --git a/drivers/virt/geniezone/Kconfig b/drivers/virt/geniezone/Kconfig
> new file mode 100644
> index 000000000000..2643fb8913cc
> --- /dev/null
> +++ b/drivers/virt/geniezone/Kconfig
> @@ -0,0 +1,16 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config MTK_GZVM
> + tristate "GenieZone Hypervisor driver for guest VM operation"
> + depends on ARM64

if only mediatek SoC is supported, should it depend on it here ?

> + help
> + This driver, gzvm, enables to run guest VMs on MTK GenieZone
> + hypervisor. It exports kvm-like interfaces for VMM (e.g., crosvm) in
> + order to operate guest VMs on GenieZone hypervisor.
> +
> + GenieZone hypervisor now only supports MediaTek SoC and arm64
> + architecture.
> +
> + Select M if you want it be built as a module (gzvm.ko).
> +
> + If unsure, say N.
> diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile
> new file mode 100644
> index 000000000000..066efddc0b9c
> --- /dev/null
> +++ b/drivers/virt/geniezone/Makefile
> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Makefile for GenieZone driver, this file should be include in arch's
> +# to avoid two ko being generated.
> +#
> +
> +GZVM_DIR ?= ../../../drivers/virt/geniezone
> +
> +gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o
> +
> diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c
> new file mode 100644
> index 000000000000..b629b41a0cd9
> --- /dev/null
> +++ b/drivers/virt/geniezone/gzvm_main.c
> @@ -0,0 +1,143 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/device.h>
> +#include <linux/file.h>
> +#include <linux/kdev_t.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/gzvm_drv.h>
> +
> +/**
> + * gzvm_err_to_errno() - Convert geniezone return value to standard errno
> + *
> + * @err: Return value from geniezone function return
> + *
> + * Return: Standard errno
> + */
> +int gzvm_err_to_errno(unsigned long err)
> +{
> + int gz_err = (int)err;
> +
> + switch (gz_err) {
> + case 0:
> + return 0;
> + case ERR_NO_MEMORY:
> + return -ENOMEM;
> + case ERR_NOT_SUPPORTED:
> + return -EOPNOTSUPP;
> + case ERR_NOT_IMPLEMENTED:
> + return -EOPNOTSUPP;
> + case ERR_FAULT:
> + return -EFAULT;
> + default:
> + break;
> + }
> +
> + return -EINVAL;
> +}
> +
> +/**
> + * gzvm_dev_ioctl_check_extension() - Check if given capability is support
> + * or not
> + *
> + * @gzvm: Pointer to struct gzvm
> + * @args: Pointer in u64 from userspace
> + *
> + * Return:
> + * * 0 - Support, no error
Supported ?

> + * * -EOPNOTSUPP - Not support
> + * * -EFAULT - Failed to get data from userspace
> + */
> +long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args)
> +{
> + __u64 cap;
> + void __user *argp = (void __user *)args;
> +
> + if (copy_from_user(&cap, argp, sizeof(uint64_t)))
> + return -EFAULT;
> + return gzvm_arch_check_extension(gzvm, cap, argp);
> +}
> +
> +static long gzvm_dev_ioctl(struct file *filp, unsigned int cmd,
> + unsigned long user_args)
> +{
> + long ret = -ENOTTY;
again redundant initializations
> +
> + switch (cmd) {
> + case GZVM_CREATE_VM:
> + ret = gzvm_dev_ioctl_create_vm(user_args);
> + break;
> + case GZVM_CHECK_EXTENSION:
> + if (!user_args)
> + return -EINVAL;
> + ret = gzvm_dev_ioctl_check_extension(NULL, user_args);
> + break;
> + default:
> + ret = -ENOTTY;
> + }
> +
> + return ret;
> +}
> +
> +static const struct file_operations gzvm_chardev_ops = {
> + .unlocked_ioctl = gzvm_dev_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> +static struct miscdevice gzvm_dev = {
> + .minor = MISC_DYNAMIC_MINOR,
> + .name = KBUILD_MODNAME,
> + .fops = &gzvm_chardev_ops,
> +};
> +
> +static int gzvm_drv_probe(struct platform_device *pdev)
> +{
> + int ret;
> +
> + if (gzvm_arch_probe() != 0) {
> + dev_err(&pdev->dev, "Not found available conduit\n");
> + return -ENODEV;
> + }
> +
> + ret = misc_register(&gzvm_dev);

return misc_register(...) ?

> + if (ret)
> + return ret;
> +
> + return 0;
> +}
> +
> +static int gzvm_drv_remove(struct platform_device *pdev)
> +{
> + gzvm_destroy_all_vms();
> + misc_deregister(&gzvm_dev);
> + return 0;
> +}
> +
> +static const struct of_device_id gzvm_of_match[] = {
> + { .compatible = "mediatek,geniezone-hyp", },
> + {/* sentinel */},
> +};
> +
> +static struct platform_driver gzvm_driver = {
> + .probe = gzvm_drv_probe,
> + .remove = gzvm_drv_remove,
> + .driver = {
> + .name = KBUILD_MODNAME,
> + .owner = THIS_MODULE,
> + .of_match_table = gzvm_of_match,
> + },
> +};
> +
> +module_platform_driver(gzvm_driver);
> +
> +MODULE_DEVICE_TABLE(of, gzvm_of_match);
> +MODULE_AUTHOR("MediaTek");
> +MODULE_DESCRIPTION("GenieZone interface for VMM");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
> new file mode 100644
> index 000000000000..ee751369fd4b
> --- /dev/null
> +++ b/drivers/virt/geniezone/gzvm_vm.c
> @@ -0,0 +1,400 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/kdev_t.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/gzvm_drv.h>
> +
> +static DEFINE_MUTEX(gzvm_list_lock);
> +static LIST_HEAD(gzvm_list);
> +
> +/**
> + * hva_to_pa_fast() - converts hva to pa in generic fast way
> + * @hva: Host virtual address.
> + *
> + * Return: 0 if translation error
> + */
> +static u64 hva_to_pa_fast(u64 hva)
> +{
> + struct page *page[1];
> +
> + u64 pfn;
> +
> + if (get_user_page_fast_only(hva, 0, page)) {
> + pfn = page_to_phys(page[0]);
> + put_page((struct page *)page);
> + return pfn;
> + } else {
you can remove the 'else' and just return 0 here as you return pfn in
the if(true) case.

> + return 0;
> + }
> +}
> +
> +/**
> + * hva_to_pa_slow() - note that this function may sleep
> + * @hva: Host virtual address.
> + *
> + * Return: 0 if translation error
> + */
> +static u64 hva_to_pa_slow(u64 hva)
> +{
> + struct page *page;
> + int npages;
> + u64 pfn;
> +
> + npages = get_user_pages_unlocked(hva, 1, &page, 0);
> + if (npages != 1)
> + return 0;
> +
> + pfn = page_to_phys(page);
> + put_page(page);
> +
> + return pfn;
> +}
> +
> +static u64 gzvm_gfn_to_hva_memslot(struct gzvm_memslot *memslot, u64 gfn)
> +{
> + u64 offset = gfn - memslot->base_gfn;
> +
> + return memslot->userspace_addr + offset * PAGE_SIZE;
> +}
> +
> +static u64 __gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn)
> +{
> + u64 hva, pa;
> +
> + hva = gzvm_gfn_to_hva_memslot(memslot, gfn);
> +
> + pa = gzvm_hva_to_pa_arch(hva);
> + if (pa != 0)
> + return PHYS_PFN(pa);
> +
> + pa = hva_to_pa_fast(hva);
> + if (pa)
> + return PHYS_PFN(pa);
> +
> + pa = hva_to_pa_slow(hva);
> + if (pa)
> + return PHYS_PFN(pa);
> +
> + return 0;
> +}
> +
> +/**
> + * gzvm_gfn_to_pfn_memslot() - Translate gfn (guest ipa) to pfn (host pa),
> + * result is in @pfn
> + * @memslot: Pointer to struct gzvm_memslot.
> + * @gfn: Guest frame number.
> + * @pfn: Host page frame number.
> + *
> + * Return:
> + * * 0 - Succeed
> + * * -EFAULT - Failed to convert
> + */
> +static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn,
> + u64 *pfn)
> +{
> + u64 __pfn;
> +
> + if (!memslot)
> + return -EFAULT;
> +
> + __pfn = __gzvm_gfn_to_pfn_memslot(memslot, gfn);
> + if (__pfn == 0) {
> + *pfn = 0;
> + return -EFAULT;
> + }
> +
> + *pfn = __pfn;
> +
> + return 0;
> +}
> +
> +/**
> + * fill_constituents() - Populate pa to buffer until full
> + * @consti: Pointer to struct mem_region_addr_range.
> + * @consti_cnt: Constituent count.
> + * @max_nr_consti: Maximum number of constituent count.
> + * @gfn: Guest frame number.
> + * @total_pages: Total page numbers.
> + * @slot: Pointer to struct gzvm_memslot.
> + *
> + * Return: how many pages we've fill in, negative if error
> + */
> +static int fill_constituents(struct mem_region_addr_range *consti,
> + int *consti_cnt, int max_nr_consti, u64 gfn,
> + u32 total_pages, struct gzvm_memslot *slot)
> +{
> + u64 pfn, prev_pfn, gfn_end;
> + int nr_pages = 1;
> + int i = 0;
> +
> + if (unlikely(total_pages == 0))
> + return -EINVAL;
> + gfn_end = gfn + total_pages;
> +
> + /* entry 0 */
> + if (gzvm_gfn_to_pfn_memslot(slot, gfn, &pfn) != 0)
> + return -EFAULT;
> + consti[0].address = PFN_PHYS(pfn);
> + consti[0].pg_cnt = 1;
> + gfn++;
> + prev_pfn = pfn;
> +
> + while (i < max_nr_consti && gfn < gfn_end) {
> + if (gzvm_gfn_to_pfn_memslot(slot, gfn, &pfn) != 0)
> + return -EFAULT;
> + if (pfn == (prev_pfn + 1)) {
> + consti[i].pg_cnt++;
> + } else {
> + i++;
> + if (i >= max_nr_consti)
> + break;
> + consti[i].address = PFN_PHYS(pfn);
> + consti[i].pg_cnt = 1;
> + }
> + prev_pfn = pfn;
> + gfn++;
> + nr_pages++;
> + }
> + if (i != max_nr_consti)
> + i++;
> + *consti_cnt = i;
> +
> + return nr_pages;
> +}
> +
> +/* register_memslot_addr_range() - Register memory region to GZ */
> +static int
> +register_memslot_addr_range(struct gzvm *gzvm, struct gzvm_memslot *memslot)
> +{
> + struct gzvm_memory_region_ranges *region;
> + u32 buf_size;
> + int max_nr_consti, remain_pages;
> + u64 gfn, gfn_end;
> +
> + buf_size = PAGE_SIZE * 2;
> + region = alloc_pages_exact(buf_size, GFP_KERNEL);
> + if (!region)
> + return -ENOMEM;
> + max_nr_consti = (buf_size - sizeof(*region)) /
> + sizeof(struct mem_region_addr_range);
> +
> + region->slot = memslot->slot_id;
> + remain_pages = memslot->npages;
> + gfn = memslot->base_gfn;
> + gfn_end = gfn + remain_pages;
> + while (gfn < gfn_end) {
> + int nr_pages;
> +
> + nr_pages = fill_constituents(region->constituents,
> + &region->constituent_cnt,
> + max_nr_consti, gfn,
> + remain_pages, memslot);
> + if (nr_pages < 0) {
> + pr_err("Failed to fill constituents\n");
> + free_pages_exact(region, buf_size);
> + return nr_pages;
> + }
> + region->gpa = PFN_PHYS(gfn);
> + region->total_pages = nr_pages;
> +
> + remain_pages -= nr_pages;
> + gfn += nr_pages;
> +
> + if (gzvm_arch_set_memregion(gzvm->vm_id, buf_size,
> + virt_to_phys(region))) {
> + pr_err("Failed to register memregion to hypervisor\n");
> + free_pages_exact(region, buf_size);
> + return -EFAULT;
> + }
> + }
> + free_pages_exact(region, buf_size);
> + return 0;
> +}
> +
> +/**
> + * gzvm_vm_ioctl_set_memory_region() - Set memory region of guest
> + * @gzvm: Pointer to struct gzvm.
> + * @mem: Input memory region from user.
> + *
> + * Return:
> + * * -EXIO - memslot is out-of-range
> + * * -EFAULT - Cannot find corresponding vma
> + * * -EINVAL - region size and vma size does not match
I assume 0 for success ?

> + */
> +static int
> +gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
> + struct gzvm_userspace_memory_region *mem)
> +{
> + struct vm_area_struct *vma;
> + struct gzvm_memslot *memslot;
> + unsigned long size;
> + __u32 slot;
> +
> + slot = mem->slot;
> + if (slot >= GZVM_MAX_MEM_REGION)
> + return -ENXIO;
> + memslot = &gzvm->memslot[slot];
> +
> + vma = vma_lookup(gzvm->mm, mem->userspace_addr);
> + if (!vma)
> + return -EFAULT;
> +
> + size = vma->vm_end - vma->vm_start;
> + if (size != mem->memory_size)
> + return -EINVAL;
> +
> + memslot->base_gfn = __phys_to_pfn(mem->guest_phys_addr);
> + memslot->npages = size >> PAGE_SHIFT;
> + memslot->userspace_addr = mem->userspace_addr;
> + memslot->vma = vma;
> + memslot->flags = mem->flags;
> + memslot->slot_id = mem->slot;
> + return register_memslot_addr_range(gzvm, memslot);
> +}
> +
> +static int gzvm_vm_ioctl_enable_cap(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + return gzvm_vm_ioctl_arch_enable_cap(gzvm, cap, argp);
> +}
> +
> +/* gzvm_vm_ioctl() - Ioctl handler of VM FD */
> +static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
> + unsigned long arg)
> +{
> + long ret = -ENOTTY;
appears to be redundant

> + void __user *argp = (void __user *)arg;
> + struct gzvm *gzvm = filp->private_data;
> +
> + switch (ioctl) {
> + case GZVM_CHECK_EXTENSION: {
> + ret = gzvm_dev_ioctl_check_extension(gzvm, arg);
> + break;
> + }
> + case GZVM_SET_USER_MEMORY_REGION: {
> + struct gzvm_userspace_memory_region userspace_mem;
> +
> + if (copy_from_user(&userspace_mem, argp, sizeof(userspace_mem))) {
> + ret = -EFAULT;
> + goto out;
> + }
> + ret = gzvm_vm_ioctl_set_memory_region(gzvm, &userspace_mem);
> + break;
> + }
> + case GZVM_ENABLE_CAP: {
> + struct gzvm_enable_cap cap;
> +
> + if (copy_from_user(&cap, argp, sizeof(cap))) {
> + ret = -EFAULT;
> + goto out;
> + }
> + ret = gzvm_vm_ioctl_enable_cap(gzvm, &cap, argp);
> + break;
> + }
> + default:
> + ret = -ENOTTY;
> + }
> +out:
> + return ret;
> +}
> +
> +static void gzvm_destroy_vm(struct gzvm *gzvm)
> +{
> + pr_debug("VM-%u is going to be destroyed\n", gzvm->vm_id);
> +
> + mutex_lock(&gzvm->lock);
> +
> + gzvm_arch_destroy_vm(gzvm->vm_id);
> +
> + mutex_lock(&gzvm_list_lock);
> + list_del(&gzvm->vm_list);
> + mutex_unlock(&gzvm_list_lock);
> +
> + mutex_unlock(&gzvm->lock);
> +
> + kfree(gzvm);
> +}
> +
> +static int gzvm_vm_release(struct inode *inode, struct file *filp)
> +{
> + struct gzvm *gzvm = filp->private_data;
> +
> + gzvm_destroy_vm(gzvm);
> + return 0;
> +}
> +
> +static const struct file_operations gzvm_vm_fops = {
> + .release = gzvm_vm_release,
> + .unlocked_ioctl = gzvm_vm_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> +static struct gzvm *gzvm_create_vm(unsigned long vm_type)
> +{
> + int ret;
> + struct gzvm *gzvm;
> +
> + gzvm = kzalloc(sizeof(*gzvm), GFP_KERNEL);
> + if (!gzvm)
> + return ERR_PTR(-ENOMEM);
> +
> + ret = gzvm_arch_create_vm(vm_type);
> + if (ret < 0) {
> + kfree(gzvm);
> + return ERR_PTR(ret);
> + }
> +
> + gzvm->vm_id = ret;
> + gzvm->mm = current->mm;
> + mutex_init(&gzvm->lock);
> +
> + mutex_lock(&gzvm_list_lock);
> + list_add(&gzvm->vm_list, &gzvm_list);
> + mutex_unlock(&gzvm_list_lock);
> +
> + pr_debug("VM-%u is created\n", gzvm->vm_id);
> +
> + return gzvm;
> +}
> +
> +/**
> + * gzvm_dev_ioctl_create_vm - Create vm fd
> + * @vm_type: VM type. Only supports Linux VM now.
> + *
> + * Return: fd of vm, negative if error
> + */
> +int gzvm_dev_ioctl_create_vm(unsigned long vm_type)
> +{
> + struct gzvm *gzvm;
> +
> + gzvm = gzvm_create_vm(vm_type);
> + if (IS_ERR(gzvm))
> + return PTR_ERR(gzvm);
> +
> + return anon_inode_getfd("gzvm-vm", &gzvm_vm_fops, gzvm,
> + O_RDWR | O_CLOEXEC);
> +}
> +
> +void gzvm_destroy_all_vms(void)
> +{
> + struct gzvm *gzvm, *tmp;
> +
> + mutex_lock(&gzvm_list_lock);
> + if (list_empty(&gzvm_list))
> + goto out;
> +
> + list_for_each_entry_safe(gzvm, tmp, &gzvm_list, vm_list)
> + gzvm_destroy_vm(gzvm);
> +
> +out:
> + mutex_unlock(&gzvm_list_lock);
> +}
> diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
> new file mode 100644
> index 000000000000..4fd52fcbd5a8
> --- /dev/null
> +++ b/include/linux/gzvm_drv.h
> @@ -0,0 +1,90 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#ifndef __GZVM_DRV_H__
> +#define __GZVM_DRV_H__
> +
> +#include <linux/list.h>
> +#include <linux/mutex.h>
> +#include <linux/gzvm.h>
> +
> +#define GZVM_VCPU_MMAP_SIZE PAGE_SIZE
> +#define INVALID_VM_ID 0xffff
> +
> +/*
> + * These are the efinitions of APIs between GenieZone hypervisor and driver,
typo: definitions

> + * there's no need to be visible to uapi. Furthermore, We need GenieZone
'We' doesn't have to be capitalized
> + * specific error code in order to map to Linux errno
> + */
> +#define NO_ERROR (0)
> +#define ERR_NO_MEMORY (-5)
> +#define ERR_NOT_SUPPORTED (-24)
> +#define ERR_NOT_IMPLEMENTED (-27)
> +#define ERR_FAULT (-40)
> +
> +/*
> + * The following data structures are for data transferring between driver and
> + * hypervisor, and they're aligned with hypervisor definitions
> + */
> +#define GZVM_MAX_VCPUS 8
> +#define GZVM_MAX_MEM_REGION 10
> +
> +/* struct mem_region_addr_range - Identical to ffa memory constituent */
> +struct mem_region_addr_range {
> + /* the base IPA of the constituent memory region, aligned to 4 kiB */
> + __u64 address;
> + /* the number of 4 kiB pages in the constituent memory region. */
> + __u32 pg_cnt;
> + __u32 reserved;
> +};
> +
> +struct gzvm_memory_region_ranges {
> + __u32 slot;
> + __u32 constituent_cnt;
> + __u64 total_pages;
> + __u64 gpa;
> + struct mem_region_addr_range constituents[];
> +};
> +
> +/* struct gzvm_memslot - VM's memory slot descriptor */
> +struct gzvm_memslot {
> + u64 base_gfn; /* begin of guest page frame */
> + unsigned long npages; /* number of pages this slot covers */
> + unsigned long userspace_addr; /* corresponding userspace va */
> + struct vm_area_struct *vma; /* vma related to this userspace addr */
> + u32 flags;
> + u32 slot_id;
> +};
> +
> +struct gzvm {
> + /* userspace tied to this vm */
> + struct mm_struct *mm;
> + struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION];
> + /* lock for list_add*/
> + struct mutex lock;
> + struct list_head vm_list;
> + u16 vm_id;
> +};
> +
> +long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args);
> +int gzvm_dev_ioctl_create_vm(unsigned long vm_type);
> +
> +int gzvm_err_to_errno(unsigned long err);
> +
> +void gzvm_destroy_all_vms(void);
> +
> +/* arch-dependant functions */
> +int gzvm_arch_probe(void);
> +int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
> + phys_addr_t region);
> +int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp);
> +int gzvm_arch_create_vm(unsigned long vm_type);
> +int gzvm_arch_destroy_vm(u16 vm_id);
> +int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp);
> +u64 gzvm_hva_to_pa_arch(u64 hva);
> +
> +#endif /* __GZVM_DRV_H__ */
> diff --git a/include/uapi/asm-generic/Kbuild b/include/uapi/asm-generic/Kbuild
> index ebb180aac74e..5af115a3c1a8 100644
> --- a/include/uapi/asm-generic/Kbuild
> +++ b/include/uapi/asm-generic/Kbuild
> @@ -34,3 +34,4 @@ mandatory-y += termbits.h
> mandatory-y += termios.h
> mandatory-y += types.h
> mandatory-y += unistd.h
> +mandatory-y += gzvm_arch.h
> diff --git a/include/uapi/asm-generic/gzvm_arch.h b/include/uapi/asm-generic/gzvm_arch.h
> new file mode 100644
> index 000000000000..c4cc12716c91
> --- /dev/null
> +++ b/include/uapi/asm-generic/gzvm_arch.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#ifndef __ASM_GENERIC_GZVM_ARCH_H
> +#define __ASM_GENERIC_GZVM_ARCH_H
> +/* geniezone only supports aarch64 platform for now */
> +
> +#endif /* __ASM_GENERIC_GZVM_ARCH_H */
> diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
> new file mode 100644
> index 000000000000..99730c142b0e
> --- /dev/null
> +++ b/include/uapi/linux/gzvm.h
> @@ -0,0 +1,76 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +/**
> + * DOC: UAPI of GenieZone Hypervisor
> + *
> + * This file declares common data structure shared among user space,
> + * kernel space, and GenieZone hypervisor.
> + */
> +#ifndef __GZVM_H__
> +#define __GZVM_H__
> +
> +#include <linux/const.h>
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#include <asm/gzvm_arch.h>
> +
> +/* GZVM ioctls */
> +#define GZVM_IOC_MAGIC 0x92 /* gz */
> +
> +/* ioctls for /dev/gzvm fds */
> +#define GZVM_CREATE_VM _IO(GZVM_IOC_MAGIC, 0x01) /* Returns a Geniezone VM fd */
> +
> +/*
> + * Check if the given capability is supported or not.
> + * The argument is capability. Ex. GZVM_CAP_ARM_PROTECTED_VM or GZVM_CAP_ARM_VM_IPA_SIZE
> + * return is 0 (supported, no error)
> + * return is -EOPNOTSUPP (unsupported)
> + * return is -EFAULT (failed to get the argument from userspace)
> + */
> +#define GZVM_CHECK_EXTENSION _IO(GZVM_IOC_MAGIC, 0x03)
> +
> +/* ioctls for VM fds */
> +/* for GZVM_SET_MEMORY_REGION */
> +struct gzvm_memory_region {
> + __u32 slot;
> + __u32 flags;
> + __u64 guest_phys_addr;
> + __u64 memory_size; /* bytes */
> +};
> +
> +#define GZVM_SET_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x40, \
> + struct gzvm_memory_region)
> +
> +/* for GZVM_SET_USER_MEMORY_REGION */
> +struct gzvm_userspace_memory_region {
> + __u32 slot;
> + __u32 flags;
> + __u64 guest_phys_addr;
> + /* bytes */
> + __u64 memory_size;
> + /* start of the userspace allocated memory */
> + __u64 userspace_addr;
> +};
> +
> +#define GZVM_SET_USER_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x46, \
> + struct gzvm_userspace_memory_region)
> +
> +/* for GZVM_ENABLE_CAP */
> +struct gzvm_enable_cap {
> + /* in */
> + __u64 cap;
> + /**
> + * we have total 5 (8 - 3) registers can be used for
which can be used ?

> + * additional args
> + */
> + __u64 args[5];
> +};
> +
> +#define GZVM_ENABLE_CAP _IOW(GZVM_IOC_MAGIC, 0xa3, \
> + struct gzvm_enable_cap)
> +
> +#endif /* __GZVM_H__ */


Regards,

Eugen

Subject: Re: [PATCH v5 03/12] virt: geniezone: Add GenieZone hypervisor support

On 27/07/23 09:59, Yi-De Wu wrote:
> From: "Yingshiuan Pan" <[email protected]>
>
> GenieZone is MediaTek hypervisor solution, and it is running in EL2
> stand alone as a type-I hypervisor. This patch exports a set of ioctl
> interfaces for userspace VMM (e.g., crosvm) to operate guest VMs
> lifecycle (creation and destroy) on GenieZone.
>
> Signed-off-by: Yingshiuan Pan <[email protected]>
> Signed-off-by: Jerry Wang <[email protected]>
> Signed-off-by: Liju Chen <[email protected]>
> Signed-off-by: Yi-De Wu <[email protected]>
> ---
> MAINTAINERS | 6 +
> arch/arm64/Kbuild | 1 +
> arch/arm64/geniezone/Makefile | 9 +
> arch/arm64/geniezone/gzvm_arch_common.h | 68 ++++
> arch/arm64/geniezone/vm.c | 212 +++++++++++++
> arch/arm64/include/uapi/asm/gzvm_arch.h | 20 ++
> drivers/virt/Kconfig | 2 +
> drivers/virt/geniezone/Kconfig | 16 +
> drivers/virt/geniezone/Makefile | 10 +
> drivers/virt/geniezone/gzvm_main.c | 143 +++++++++
> drivers/virt/geniezone/gzvm_vm.c | 400 ++++++++++++++++++++++++
> include/linux/gzvm_drv.h | 90 ++++++
> include/uapi/asm-generic/Kbuild | 1 +
> include/uapi/asm-generic/gzvm_arch.h | 10 +
> include/uapi/linux/gzvm.h | 76 +++++
> 15 files changed, 1064 insertions(+)
> create mode 100644 arch/arm64/geniezone/Makefile
> create mode 100644 arch/arm64/geniezone/gzvm_arch_common.h
> create mode 100644 arch/arm64/geniezone/vm.c
> create mode 100644 arch/arm64/include/uapi/asm/gzvm_arch.h
> create mode 100644 drivers/virt/geniezone/Kconfig
> create mode 100644 drivers/virt/geniezone/Makefile
> create mode 100644 drivers/virt/geniezone/gzvm_main.c
> create mode 100644 drivers/virt/geniezone/gzvm_vm.c
> create mode 100644 include/linux/gzvm_drv.h
> create mode 100644 include/uapi/asm-generic/gzvm_arch.h
> create mode 100644 include/uapi/linux/gzvm.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bfbfdb790446..b91d41dd2f2f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8747,6 +8747,12 @@ M: Ze-Yu Wang <[email protected]>
> M: Yi-De Wu <[email protected]>
> F: Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
> F: Documentation/virt/geniezone/
> +F: arch/arm64/geniezone/
> +F: arch/arm64/include/uapi/asm/gzvm_arch.h
> +F: drivers/virt/geniezone/
> +F: include/linux/gzvm_drv.h
> +F include/uapi/asm-generic/gzvm_arch.h
> +F: include/uapi/linux/gzvm.h
>
> GENWQE (IBM Generic Workqueue Card)
> M: Frank Haverkamp <[email protected]>
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..0c3cca572919 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -4,6 +4,7 @@ obj-$(CONFIG_KVM) += kvm/
> obj-$(CONFIG_XEN) += xen/
> obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> obj-$(CONFIG_CRYPTO) += crypto/
> +obj-$(CONFIG_MTK_GZVM) += geniezone/
>
> # for cleaning
> subdir- += boot
> diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile
> new file mode 100644
> index 000000000000..2957898cdd05
> --- /dev/null
> +++ b/arch/arm64/geniezone/Makefile
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Main Makefile for gzvm, this one includes drivers/virt/geniezone/Makefile
> +#
> +include $(srctree)/drivers/virt/geniezone/Makefile
> +
> +gzvm-y += vm.o
> +
> +obj-$(CONFIG_MTK_GZVM) += gzvm.o
> diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
> new file mode 100644
> index 000000000000..fdb95d619102
> --- /dev/null
> +++ b/arch/arm64/geniezone/gzvm_arch_common.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#ifndef __GZVM_ARCH_COMMON_H__
> +#define __GZVM_ARCH_COMMON_H__
> +
> +#include <linux/arm-smccc.h>
> +
> +enum {
> + GZVM_FUNC_CREATE_VM = 0,
> + GZVM_FUNC_DESTROY_VM = 1,
> + GZVM_FUNC_CREATE_VCPU = 2,
> + GZVM_FUNC_DESTROY_VCPU = 3,
> + GZVM_FUNC_SET_MEMREGION = 4,
> + GZVM_FUNC_RUN = 5,
> + GZVM_FUNC_GET_ONE_REG = 8,
> + GZVM_FUNC_SET_ONE_REG = 9,
> + GZVM_FUNC_IRQ_LINE = 10,
> + GZVM_FUNC_CREATE_DEVICE = 11,
> + GZVM_FUNC_PROBE = 12,
> + GZVM_FUNC_ENABLE_CAP = 13,
> + NR_GZVM_FUNC,
> +};
> +
> +#define SMC_ENTITY_MTK 59
> +#define GZVM_FUNCID_START (0x1000)
> +#define GZVM_HCALL_ID(func) \
> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \
> + SMC_ENTITY_MTK, (GZVM_FUNCID_START + (func)))
> +
> +#define MT_HVC_GZVM_CREATE_VM GZVM_HCALL_ID(GZVM_FUNC_CREATE_VM)
> +#define MT_HVC_GZVM_DESTROY_VM GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VM)
> +#define MT_HVC_GZVM_CREATE_VCPU GZVM_HCALL_ID(GZVM_FUNC_CREATE_VCPU)
> +#define MT_HVC_GZVM_DESTROY_VCPU GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VCPU)
> +#define MT_HVC_GZVM_SET_MEMREGION GZVM_HCALL_ID(GZVM_FUNC_SET_MEMREGION)
> +#define MT_HVC_GZVM_RUN GZVM_HCALL_ID(GZVM_FUNC_RUN)
> +#define MT_HVC_GZVM_GET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_GET_ONE_REG)
> +#define MT_HVC_GZVM_SET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_SET_ONE_REG)
> +#define MT_HVC_GZVM_IRQ_LINE GZVM_HCALL_ID(GZVM_FUNC_IRQ_LINE)
> +#define MT_HVC_GZVM_CREATE_DEVICE GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE)
> +#define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE)
> +#define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
> +
> +/**
> + * gzvm_hypcall_wrapper() - the wrapper for hvc calls
> + * @a0-a7: arguments passed in registers 0 to 7
> + * @res: result values from registers 0 to 3
> + *
> + * Return: The wrapper helps caller to convert geniezone errno to Linux errno.
> + */
> +static inline int gzvm_hypcall_wrapper(unsigned long a0, unsigned long a1,
> + unsigned long a2, unsigned long a3,
> + unsigned long a4, unsigned long a5,
> + unsigned long a6, unsigned long a7,
> + struct arm_smccc_res *res)
> +{
> + arm_smccc_hvc(a0, a1, a2, a3, a4, a5, a6, a7, res);
> + return gzvm_err_to_errno(res->a0);
> +}
> +
> +static inline u16 get_vmid_from_tuple(unsigned int tuple)
> +{
> + return (u16)(tuple >> 16);
> +}
> +
> +#endif /* __GZVM_ARCH_COMMON_H__ */
> diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
> new file mode 100644
> index 000000000000..e35751b21821
> --- /dev/null
> +++ b/arch/arm64/geniezone/vm.c
> @@ -0,0 +1,212 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#include <asm/sysreg.h>
> +#include <linux/arm-smccc.h>
> +#include <linux/err.h>
> +#include <linux/uaccess.h>
> +
> +#include <linux/gzvm.h>
> +#include <linux/gzvm_drv.h>
> +#include "gzvm_arch_common.h"
> +
> +#define PAR_PA47_MASK ((((1UL << 48) - 1) >> 12) << 12)
> +
> +int gzvm_arch_probe(void)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_hvc(MT_HVC_GZVM_PROBE, 0, 0, 0, 0, 0, 0, 0, &res);
> + if (res.a0 == 0)
> + return 0;
> +
> + return -ENXIO;
> +}
> +
> +int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
> + phys_addr_t region)
> +{
> + struct arm_smccc_res res;
> +
> + return gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_MEMREGION, vm_id,
> + buf_size, region, 0, 0, 0, 0, &res);
> +}
> +
> +static int gzvm_cap_arm_vm_ipa_size(void __user *argp)
> +{
> + __u64 value = CONFIG_ARM64_PA_BITS;
> +
> + if (copy_to_user(argp, &value, sizeof(__u64)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void __user *argp)
> +{
> + int ret = -EOPNOTSUPP;

int ret;

> +
> + switch (cap) {
> + case GZVM_CAP_ARM_PROTECTED_VM: {
> + __u64 success = 1;
> +
> + if (copy_to_user(argp, &success, sizeof(__u64)))
> + return -EFAULT;
> + ret = 0;

here, instead of ret = 0...

return 0;

> + break;
> + }
> + case GZVM_CAP_ARM_VM_IPA_SIZE: {
> + ret = gzvm_cap_arm_vm_ipa_size(argp);

and here
return ret;

> + break;
> + }
> + default:
> + ret = -EOPNOTSUPP;

break;

> + }
> +

return -EOPNOTSUPP;

> + return ret;
> +}
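
Putting those pieces together, the whole switch could then look roughly
like this (untested, just to show the shape):

	switch (cap) {
	case GZVM_CAP_ARM_PROTECTED_VM: {
		__u64 success = 1;

		if (copy_to_user(argp, &success, sizeof(__u64)))
			return -EFAULT;
		return 0;
	}
	case GZVM_CAP_ARM_VM_IPA_SIZE:
		return gzvm_cap_arm_vm_ipa_size(argp);
	default:
		break;
	}

	return -EOPNOTSUPP;

That way there is no local ret to keep in sync with the success and
error paths.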
> +
> +/**
> + * gzvm_arch_create_vm() - create vm
> + * @vm_type: VM type. Only supports Linux VM now.
> + *
> + * Return:
> + * * positive value - VM ID
> + * * -ENOMEM - Memory not enough for storing VM data
> + */
> +int gzvm_arch_create_vm(unsigned long vm_type)
> +{
> + struct arm_smccc_res res;
> + int ret;
> +
> + ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VM, vm_type, 0, 0, 0, 0,
> + 0, 0, &res);
> +

return ret ? ret : res.a1;

> + if (ret == 0)
> + return res.a1;
> + else
> + return ret;
> +}
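
i.e. the tail of the function collapses to something like (untested):

	ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VM, vm_type, 0, 0, 0, 0,
				   0, 0, &res);

	return ret ? ret : res.a1;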
> +
> +int gzvm_arch_destroy_vm(u16 vm_id)
> +{
> + struct arm_smccc_res res;
> +
> + return gzvm_hypcall_wrapper(MT_HVC_GZVM_DESTROY_VM, vm_id, 0, 0, 0, 0,
> + 0, 0, &res);
> +}
> +
> +static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + struct arm_smccc_res *res)
> +{
> + return gzvm_hypcall_wrapper(MT_HVC_GZVM_ENABLE_CAP, gzvm->vm_id,
> + cap->cap, cap->args[0], cap->args[1],
> + cap->args[2], cap->args[3], cap->args[4],
> + res);
> +}
> +
> +/**
> + * gzvm_vm_ioctl_get_pvmfw_size() - Get pvmfw size from hypervisor, return
> + * in x1, and return to userspace in args
> + * @gzvm: Pointer to struct gzvm.
> + * @cap: Pointer to struct gzvm_enable_cap.
> + * @argp: Pointer to struct gzvm_enable_cap in user space.
> + *
> + * Return:
> + * * 0 - Succeed
> + * * -EINVAL - Hypervisor return invalid results
> + * * -EFAULT - Fail to copy back to userspace buffer
> + */
> +static int gzvm_vm_ioctl_get_pvmfw_size(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + struct arm_smccc_res res = {0};
> +
> + if (gzvm_vm_arch_enable_cap(gzvm, cap, &res) != 0)
> + return -EINVAL;
> +
> + cap->args[1] = res.a1;
> + if (copy_to_user(argp, cap, sizeof(*cap)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +/**
> + * gzvm_vm_ioctl_cap_pvm() - Proceed GZVM_CAP_ARM_PROTECTED_VM's subcommands
> + * @gzvm: Pointer to struct gzvm.
> + * @cap: Pointer to struct gzvm_enable_cap.
> + * @argp: Pointer to struct gzvm_enable_cap in user space.
> + *
> + * Return:
> + * * 0 - Succeed
> + * * -EINVAL - Invalid subcommand or arguments
> + */
> +static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + int ret = -EINVAL;
> + struct arm_smccc_res res = {0};

Invert for readability; struct arm_smccc_res res first, ret last...
also, you don't need to initialize ret to -EINVAL, because:

> +
> + switch (cap->args[0]) {
> + case GZVM_CAP_ARM_PVM_SET_PVMFW_IPA:
> + fallthrough;
> + case GZVM_CAP_ARM_PVM_SET_PROTECTED_VM:
> + ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res);

return ret;

> + break;
> + case GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE:
> + ret = gzvm_vm_ioctl_get_pvmfw_size(gzvm, cap, argp);

return ret;

> + break;
> + default:

just break here

> + ret = -EINVAL;
> + break;
> + }
> +

return -EINVAL;

> + return ret;
> +}
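
With the direct returns, the whole helper would be roughly (untested):

	static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm,
					 struct gzvm_enable_cap *cap,
					 void __user *argp)
	{
		struct arm_smccc_res res = {0};

		switch (cap->args[0]) {
		case GZVM_CAP_ARM_PVM_SET_PVMFW_IPA:
			fallthrough;
		case GZVM_CAP_ARM_PVM_SET_PROTECTED_VM:
			return gzvm_vm_arch_enable_cap(gzvm, cap, &res);
		case GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE:
			return gzvm_vm_ioctl_get_pvmfw_size(gzvm, cap, argp);
		default:
			break;
		}

		return -EINVAL;
	}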
> +
> +int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
> + struct gzvm_enable_cap *cap,
> + void __user *argp)
> +{
> + int ret = -EINVAL;

same comments here

> +
> + switch (cap->cap) {
> + case GZVM_CAP_ARM_PROTECTED_VM:
> + ret = gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp);
> + break;
> + default:
> + ret = -EINVAL;
> + break;
> + }
> +
> + return ret;
> +}
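
Same shape here, e.g. (untested):

	switch (cap->cap) {
	case GZVM_CAP_ARM_PROTECTED_VM:
		return gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp);
	default:
		break;
	}

	return -EINVAL;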
> +
> +/**
> + * gzvm_hva_to_pa_arch() - converts hva to pa with arch-specific way
> + * @hva: Host virtual address.
> + *
> + * Return: 0 if translation error
> + */
> +u64 gzvm_hva_to_pa_arch(u64 hva)
> +{
> + u64 par;
> + unsigned long flags;

unsigned long flags;
u64 par;

> +
> + local_irq_save(flags);
> + asm volatile("at s1e1r, %0" :: "r" (hva));
> + isb();
> + par = read_sysreg_par();
> + local_irq_restore(flags);
> +
> + if (par & SYS_PAR_EL1_F)
> + return 0;
> +
> + return par & PAR_PA47_MASK;
> +}

..snip..

> diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c
> new file mode 100644
> index 000000000000..b629b41a0cd9
> --- /dev/null
> +++ b/drivers/virt/geniezone/gzvm_main.c
> @@ -0,0 +1,143 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/device.h>
> +#include <linux/file.h>
> +#include <linux/kdev_t.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/gzvm_drv.h>
> +

..snip..

> +
> +static long gzvm_dev_ioctl(struct file *filp, unsigned int cmd,
> + unsigned long user_args)
> +{
> + long ret = -ENOTTY;

long ret;

> +
> + switch (cmd) {
> + case GZVM_CREATE_VM:
> + ret = gzvm_dev_ioctl_create_vm(user_args);

you may even do just

return gzvm_dev_ioctl_create_vm(user_args);

> + break;
> + case GZVM_CHECK_EXTENSION:
> + if (!user_args)
> + return -EINVAL;
> + ret = gzvm_dev_ioctl_check_extension(NULL, user_args);

return ....

> + break;
> + default:

break...

> + ret = -ENOTTY;
> + }

...return

> +
> + return ret;
> +}
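
So the whole handler could be reduced to roughly (untested):

	static long gzvm_dev_ioctl(struct file *filp, unsigned int cmd,
				   unsigned long user_args)
	{
		switch (cmd) {
		case GZVM_CREATE_VM:
			return gzvm_dev_ioctl_create_vm(user_args);
		case GZVM_CHECK_EXTENSION:
			if (!user_args)
				return -EINVAL;
			return gzvm_dev_ioctl_check_extension(NULL, user_args);
		default:
			break;
		}

		return -ENOTTY;
	}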
> +
> +static const struct file_operations gzvm_chardev_ops = {
> + .unlocked_ioctl = gzvm_dev_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> +static struct miscdevice gzvm_dev = {
> + .minor = MISC_DYNAMIC_MINOR,
> + .name = KBUILD_MODNAME,
> + .fops = &gzvm_chardev_ops,
> +};
> +
> +static int gzvm_drv_probe(struct platform_device *pdev)
> +{
> + int ret;
> +
> + if (gzvm_arch_probe() != 0) {
> + dev_err(&pdev->dev, "Not found available conduit\n");
> + return -ENODEV;
> + }
> +
> + ret = misc_register(&gzvm_dev);
> + if (ret)
> + return ret;
> +
> + return 0;
> +}
> +
> +static int gzvm_drv_remove(struct platform_device *pdev)
> +{
> + gzvm_destroy_all_vms();
> + misc_deregister(&gzvm_dev);
> + return 0;
> +}
> +
> +static const struct of_device_id gzvm_of_match[] = {
> + { .compatible = "mediatek,geniezone-hyp", },

Remove the comma after "mediatek,geniezone-hyp" as it's not needed.

> + {/* sentinel */},
> +};
> +
> +static struct platform_driver gzvm_driver = {
> + .probe = gzvm_drv_probe,
> + .remove = gzvm_drv_remove,
> + .driver = {
> + .name = KBUILD_MODNAME,
> + .owner = THIS_MODULE,
> + .of_match_table = gzvm_of_match,
> + },
> +};
> +
> +module_platform_driver(gzvm_driver);
> +
> +MODULE_DEVICE_TABLE(of, gzvm_of_match);
> +MODULE_AUTHOR("MediaTek");
> +MODULE_DESCRIPTION("GenieZone interface for VMM");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
> new file mode 100644
> index 000000000000..ee751369fd4b
> --- /dev/null
> +++ b/drivers/virt/geniezone/gzvm_vm.c
> @@ -0,0 +1,400 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 MediaTek Inc.
> + */
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/kdev_t.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/gzvm_drv.h>
> +
> +static DEFINE_MUTEX(gzvm_list_lock);
> +static LIST_HEAD(gzvm_list);
> +
> +/**
> + * hva_to_pa_fast() - converts hva to pa in generic fast way
> + * @hva: Host virtual address.
> + *
> + * Return: 0 if translation error
> + */
> +static u64 hva_to_pa_fast(u64 hva)
> +{
> + struct page *page[1];
> +

Remove extra blank line

> + u64 pfn;
> +
> + if (get_user_page_fast_only(hva, 0, page)) {
> + pfn = page_to_phys(page[0]);
> + put_page((struct page *)page);
> + return pfn;

why the else branch? just do...

if (get_user_page_fast_only(.....)) {
do_something; return pfn;
}

return 0;
}

> + } else {
> + return 0;
> + }
> +}
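
i.e. something like this (untested). Note also that put_page() should
be passed the page pointer itself, i.e. page[0], not a cast of the
local array:

	static u64 hva_to_pa_fast(u64 hva)
	{
		struct page *page[1];
		u64 pfn;

		if (get_user_page_fast_only(hva, 0, page)) {
			pfn = page_to_phys(page[0]);
			put_page(page[0]);
			return pfn;
		}

		return 0;
	}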
> +
> +/**
> + * hva_to_pa_slow() - note that this function may sleep

/**
* hva_to_pa_slow() - converts hva to pa in a slow way
* @hva: Host virtual address
*
* This function converts HVA to PA in a slow way (because......)
*
* Context: This function may sleep
* Return: PA or 0 for translation error
*/

> + * @hva: Host virtual address.
> + *
> + * Return: 0 if translation error
> + */
> +static u64 hva_to_pa_slow(u64 hva)
> +{
> + struct page *page;
> + int npages;
> + u64 pfn;
> +
> + npages = get_user_pages_unlocked(hva, 1, &page, 0);
> + if (npages != 1)
> + return 0;
> +
> + pfn = page_to_phys(page);
> + put_page(page);
> +
> + return pfn;
> +}
> +

..snip..

> +
> +/* register_memslot_addr_range() - Register memory region to GZ */

/**
* register_memslot_addr_range() - Register memory region to GenieZone
* @gzvm: xxxx
* @memslot: xxxx
*
* Return: something
*/
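
For example, based on the parameters and return values visible in the
code below, something along these lines:

/**
 * register_memslot_addr_range() - Register memory region to GenieZone
 * @gzvm: Pointer to struct gzvm.
 * @memslot: Pointer to the memory slot to be registered.
 *
 * Return: 0 on success, -ENOMEM if the transfer buffer cannot be
 * allocated, or another negative error code if registering the region
 * with the hypervisor fails.
 */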

> +static int
> +register_memslot_addr_range(struct gzvm *gzvm, struct gzvm_memslot *memslot)
> +{
> + struct gzvm_memory_region_ranges *region;
> + u32 buf_size;

u32 buf_size = PAGE_SIZE * 2;

> + int max_nr_consti, remain_pages;
> + u64 gfn, gfn_end;
> +
> + buf_size = PAGE_SIZE * 2;
> + region = alloc_pages_exact(buf_size, GFP_KERNEL);
> + if (!region)
> + return -ENOMEM;
> + max_nr_consti = (buf_size - sizeof(*region)) /
> + sizeof(struct mem_region_addr_range);
> +
> + region->slot = memslot->slot_id;
> + remain_pages = memslot->npages;
> + gfn = memslot->base_gfn;
> + gfn_end = gfn + remain_pages;
> + while (gfn < gfn_end) {
> + int nr_pages;

int nr_pages = fill_constituents(...)

> +
> + nr_pages = fill_constituents(region->constituents,
> + &region->constituent_cnt,
> + max_nr_consti, gfn,
> + remain_pages, memslot);
> + if (nr_pages < 0) {
> + pr_err("Failed to fill constituents\n");
> + free_pages_exact(region, buf_size);
> + return nr_pages;
> + }
> + region->gpa = PFN_PHYS(gfn);
> + region->total_pages = nr_pages;
> +
> + remain_pages -= nr_pages;
> + gfn += nr_pages;
> +
> + if (gzvm_arch_set_memregion(gzvm->vm_id, buf_size,
> + virt_to_phys(region))) {
> + pr_err("Failed to register memregion to hypervisor\n");
> + free_pages_exact(region, buf_size);
> + return -EFAULT;
> + }
> + }
> + free_pages_exact(region, buf_size);
> + return 0;
> +}
> +
> +/**
> + * gzvm_vm_ioctl_set_memory_region() - Set memory region of guest
> + * @gzvm: Pointer to struct gzvm.
> + * @mem: Input memory region from user.
> + *
> + * Return:

* Return: 0 for success, negative number for error
*
* -EXIO - The memslot is out-of-range
* -EFAULT - Cannot find corresponding vma
* -EINVAL - Region size and VMA size mismatch
*/

> + * * -EXIO - memslot is out-of-range
> + * * -EFAULT - Cannot find corresponding vma
> + * * -EINVAL - region size and vma size does not match
> + */
> +static int
> +gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
> + struct gzvm_userspace_memory_region *mem)
> +{
> + struct vm_area_struct *vma;
> + struct gzvm_memslot *memslot;
> + unsigned long size;
> + __u32 slot;
> +
> + slot = mem->slot;
> + if (slot >= GZVM_MAX_MEM_REGION)
> + return -ENXIO;
> + memslot = &gzvm->memslot[slot];
> +
> + vma = vma_lookup(gzvm->mm, mem->userspace_addr);
> + if (!vma)
> + return -EFAULT;
> +
> + size = vma->vm_end - vma->vm_start;
> + if (size != mem->memory_size)
> + return -EINVAL;
> +
> + memslot->base_gfn = __phys_to_pfn(mem->guest_phys_addr);
> + memslot->npages = size >> PAGE_SHIFT;
> + memslot->userspace_addr = mem->userspace_addr;
> + memslot->vma = vma;
> + memslot->flags = mem->flags;
> + memslot->slot_id = mem->slot;
> + return register_memslot_addr_range(gzvm, memslot);
> +}
> +

There are other instances of the same issues throughout the patch, so
please fix accordingly everywhere else as well.

Regards,
Angelo


2023-08-01 08:53:06

by Yi-De Wu

[permalink] [raw]
Subject: Re: [PATCH v5 03/12] virt: geniezone: Add GenieZone hypervisor support

On Thu, 2023-07-27 at 11:51 +0300, Eugen Hristev wrote:
> Hi Yi-De,
>
> On 7/27/23 10:59, Yi-De Wu wrote:
> > From: "Yingshiuan Pan" <[email protected]>
> >
> > GenieZone is MediaTek hypervisor solution, and it is running in EL2
> > stand alone as a type-I hypervisor. This patch exports a set of
> > ioctl
> > interfaces for userspace VMM (e.g., crosvm) to operate guest VMs
> > lifecycle (creation and destroy) on GenieZone.
> >
> > Signed-off-by: Yingshiuan Pan <[email protected]>
> > Signed-off-by: Jerry Wang <[email protected]>
> > Signed-off-by: Liju Chen <[email protected]>
> > Signed-off-by: Yi-De Wu <[email protected]>
> > ---
> > MAINTAINERS | 6 +
> > arch/arm64/Kbuild | 1 +
> > arch/arm64/geniezone/Makefile | 9 +
> > arch/arm64/geniezone/gzvm_arch_common.h | 68 ++++
> > arch/arm64/geniezone/vm.c | 212 +++++++++++++
> > arch/arm64/include/uapi/asm/gzvm_arch.h | 20 ++
> > drivers/virt/Kconfig | 2 +
> > drivers/virt/geniezone/Kconfig | 16 +
> > drivers/virt/geniezone/Makefile | 10 +
> > drivers/virt/geniezone/gzvm_main.c | 143 +++++++++
> > drivers/virt/geniezone/gzvm_vm.c | 400
> > ++++++++++++++++++++++++
> > include/linux/gzvm_drv.h | 90 ++++++
> > include/uapi/asm-generic/Kbuild | 1 +
> > include/uapi/asm-generic/gzvm_arch.h | 10 +
> > include/uapi/linux/gzvm.h | 76 +++++
> > 15 files changed, 1064 insertions(+)
> > create mode 100644 arch/arm64/geniezone/Makefile
> > create mode 100644 arch/arm64/geniezone/gzvm_arch_common.h
> > create mode 100644 arch/arm64/geniezone/vm.c
> > create mode 100644 arch/arm64/include/uapi/asm/gzvm_arch.h
> > create mode 100644 drivers/virt/geniezone/Kconfig
> > create mode 100644 drivers/virt/geniezone/Makefile
> > create mode 100644 drivers/virt/geniezone/gzvm_main.c
> > create mode 100644 drivers/virt/geniezone/gzvm_vm.c
> > create mode 100644 include/linux/gzvm_drv.h
> > create mode 100644 include/uapi/asm-generic/gzvm_arch.h
> > create mode 100644 include/uapi/linux/gzvm.h
> >
>
> I have a feeling this patch is a bit big, and it would help review if
> it were split into smaller chunks.
>
Sure, we will split this patch into several smaller patches according
to their features.

> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index bfbfdb790446..b91d41dd2f2f 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -8747,6 +8747,12 @@ M: Ze-Yu Wang <[email protected]>
> > M: Yi-De Wu <[email protected]>
> > F: Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml
> > F: Documentation/virt/geniezone/
> > +F: arch/arm64/geniezone/
> > +F: arch/arm64/include/uapi/asm/gzvm_arch.h
> > +F: drivers/virt/geniezone/
> > +F: include/linux/gzvm_drv.h
> > +F include/uapi/asm-generic/gzvm_arch.h
> > +F: include/uapi/linux/gzvm.h
> >
> > GENWQE (IBM Generic Workqueue Card)
> > M: Frank Haverkamp <[email protected]>
> > diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> > index 5bfbf7d79c99..0c3cca572919 100644
> > --- a/arch/arm64/Kbuild
> > +++ b/arch/arm64/Kbuild
> > @@ -4,6 +4,7 @@ obj-$(CONFIG_KVM) += kvm/
> > obj-$(CONFIG_XEN) += xen/
> > obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> > obj-$(CONFIG_CRYPTO) += crypto/
> > +obj-$(CONFIG_MTK_GZVM) += geniezone/
> >
> > # for cleaning
> > subdir- += boot
> > diff --git a/arch/arm64/geniezone/Makefile
> > b/arch/arm64/geniezone/Makefile
> > new file mode 100644
> > index 000000000000..2957898cdd05
> > --- /dev/null
> > +++ b/arch/arm64/geniezone/Makefile
> > @@ -0,0 +1,9 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +#
> > +# Main Makefile for gzvm, this one includes
> > drivers/virt/geniezone/Makefile
> > +#
> > +include $(srctree)/drivers/virt/geniezone/Makefile
> > +
> > +gzvm-y += vm.o
> > +
> > +obj-$(CONFIG_MTK_GZVM) += gzvm.o
> > diff --git a/arch/arm64/geniezone/gzvm_arch_common.h
> > b/arch/arm64/geniezone/gzvm_arch_common.h
> > new file mode 100644
> > index 000000000000..fdb95d619102
> > --- /dev/null
> > +++ b/arch/arm64/geniezone/gzvm_arch_common.h
> > @@ -0,0 +1,68 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#ifndef __GZVM_ARCH_COMMON_H__
> > +#define __GZVM_ARCH_COMMON_H__
> > +
> > +#include <linux/arm-smccc.h>
> > +
> > +enum {
> > + GZVM_FUNC_CREATE_VM = 0,
> > + GZVM_FUNC_DESTROY_VM = 1,
> > + GZVM_FUNC_CREATE_VCPU = 2,
> > + GZVM_FUNC_DESTROY_VCPU = 3,
> > + GZVM_FUNC_SET_MEMREGION = 4,
> > + GZVM_FUNC_RUN = 5,
> > + GZVM_FUNC_GET_ONE_REG = 8,
> > + GZVM_FUNC_SET_ONE_REG = 9,
> > + GZVM_FUNC_IRQ_LINE = 10,
> > + GZVM_FUNC_CREATE_DEVICE = 11,
> > + GZVM_FUNC_PROBE = 12,
> > + GZVM_FUNC_ENABLE_CAP = 13,
> > + NR_GZVM_FUNC,
> > +};
> > +
> > +#define SMC_ENTITY_MTK 59
> > +#define GZVM_FUNCID_START (0x1000)
> > +#define GZVM_HCALL_ID(func) \
> > +	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \
> > +			   SMC_ENTITY_MTK, (GZVM_FUNCID_START + (func)))
> > +
> > +#define MT_HVC_GZVM_CREATE_VM		GZVM_HCALL_ID(GZVM_FUNC_CREATE_VM)
> > +#define MT_HVC_GZVM_DESTROY_VM		GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VM)
> > +#define MT_HVC_GZVM_CREATE_VCPU		GZVM_HCALL_ID(GZVM_FUNC_CREATE_VCPU)
> > +#define MT_HVC_GZVM_DESTROY_VCPU	GZVM_HCALL_ID(GZVM_FUNC_DESTROY_VCPU)
> > +#define MT_HVC_GZVM_SET_MEMREGION	GZVM_HCALL_ID(GZVM_FUNC_SET_MEMREGION)
> > +#define MT_HVC_GZVM_RUN			GZVM_HCALL_ID(GZVM_FUNC_RUN)
> > +#define MT_HVC_GZVM_GET_ONE_REG		GZVM_HCALL_ID(GZVM_FUNC_GET_ONE_REG)
> > +#define MT_HVC_GZVM_SET_ONE_REG		GZVM_HCALL_ID(GZVM_FUNC_SET_ONE_REG)
> > +#define MT_HVC_GZVM_IRQ_LINE		GZVM_HCALL_ID(GZVM_FUNC_IRQ_LINE)
> > +#define MT_HVC_GZVM_CREATE_DEVICE	GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE)
> > +#define MT_HVC_GZVM_PROBE		GZVM_HCALL_ID(GZVM_FUNC_PROBE)
> > +#define MT_HVC_GZVM_ENABLE_CAP		GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP)
> > +
> > +/**
> > + * gzvm_hypcall_wrapper() - the wrapper for hvc calls
> > + * @a0-a7: arguments passed in registers 0 to 7
> > + * @res: result values from registers 0 to 3
> > + *
> > + * Return: The wrapper helps caller to convert geniezone errno to
> > Linux errno.
> > + */
> > +static inline int gzvm_hypcall_wrapper(unsigned long a0, unsigned long a1,
> > +				       unsigned long a2, unsigned long a3,
> > +				       unsigned long a4, unsigned long a5,
> > +				       unsigned long a6, unsigned long a7,
> > +				       struct arm_smccc_res *res)
> > +{
> > + arm_smccc_hvc(a0, a1, a2, a3, a4, a5, a6, a7, res);
> > + return gzvm_err_to_errno(res->a0);
> > +}
> > +
> > +static inline u16 get_vmid_from_tuple(unsigned int tuple)
> > +{
> > + return (u16)(tuple >> 16);
> > +}
> > +
> > +#endif /* __GZVM_ARCH_COMMON_H__ */
> > diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
> > new file mode 100644
> > index 000000000000..e35751b21821
> > --- /dev/null
> > +++ b/arch/arm64/geniezone/vm.c
> > @@ -0,0 +1,212 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#include <asm/sysreg.h>
> > +#include <linux/arm-smccc.h>
> > +#include <linux/err.h>
> > +#include <linux/uaccess.h>
> > +
> > +#include <linux/gzvm.h>
> > +#include <linux/gzvm_drv.h>
> > +#include "gzvm_arch_common.h"
> > +
> > +#define PAR_PA47_MASK ((((1UL << 48) - 1) >> 12) << 12)
> > +
> > +int gzvm_arch_probe(void)
> > +{
> > + struct arm_smccc_res res;
> > +
> > + arm_smccc_hvc(MT_HVC_GZVM_PROBE, 0, 0, 0, 0, 0, 0, 0, &res);
> > + if (res.a0 == 0)
> > + return 0;
>
> I would see the error path as a particular case here, e.g.
>
> if (res.a0)
> return -ENXIO;
>
> and on the usual path return success.
>
> (as you already do below in some functions)...
>
> > +
> > + return -ENXIO;
> > +}
> > +
> > +int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
> > + phys_addr_t region)
> > +{
> > + struct arm_smccc_res res;
> > +
> > + return gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_MEMREGION, vm_id,
> > + buf_size, region, 0, 0, 0, 0,
> > &res);
> > +}
> > +
> > +static int gzvm_cap_arm_vm_ipa_size(void __user *argp)
> > +{
> > + __u64 value = CONFIG_ARM64_PA_BITS;
> > +
> > + if (copy_to_user(argp, &value, sizeof(__u64)))
> > + return -EFAULT;
>
> ... e.g. here.
>
> > +
> > + return 0;
> > +}
> > +
> > +int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void
> > __user *argp)
> > +{
> > + int ret = -EOPNOTSUPP;
> > +
> > + switch (cap) {
> > + case GZVM_CAP_ARM_PROTECTED_VM: {
> > + __u64 success = 1;
> > +
> > + if (copy_to_user(argp, &success, sizeof(__u64)))
> > + return -EFAULT;
> > + ret = 0;
> > + break;
> > + }
> > + case GZVM_CAP_ARM_VM_IPA_SIZE: {
> > + ret = gzvm_cap_arm_vm_ipa_size(argp);
> > + break;
> > + }
> > + default:
> > + ret = -EOPNOTSUPP;
>
> you already initialized ret to -EOPNOTSUPP, why don't you initialize
> it
> with 0, and just set it as error code here, and avoid setting it to 0
> on
> the success case above.
>
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * gzvm_arch_create_vm() - create vm
> > + * @vm_type: VM type. Only supports Linux VM now.
> > + *
> > + * Return:
> > + * * positive value - VM ID
> > + * * -ENOMEM - Memory not enough for storing VM data
> > + */
> > +int gzvm_arch_create_vm(unsigned long vm_type)
> > +{
> > + struct arm_smccc_res res;
> > + int ret;
> > +
> > + ret = gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_VM, vm_type, 0,
> > 0, 0, 0,
> > + 0, 0, &res);
> > +
> > + if (ret == 0)
> > + return res.a1;
> > + else
> > + return ret;
> > +}
> > +
> > +int gzvm_arch_destroy_vm(u16 vm_id)
> > +{
> > + struct arm_smccc_res res;
> > +
> > + return gzvm_hypcall_wrapper(MT_HVC_GZVM_DESTROY_VM, vm_id, 0,
> > 0, 0, 0,
> > + 0, 0, &res);
> > +}
> > +
> > +static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm,
> > + struct gzvm_enable_cap *cap,
> > + struct arm_smccc_res *res)
> > +{
> > + return gzvm_hypcall_wrapper(MT_HVC_GZVM_ENABLE_CAP, gzvm-
> > >vm_id,
> > + cap->cap, cap->args[0], cap-
> > >args[1],
> > + cap->args[2], cap->args[3], cap-
> > >args[4],
> > + res);
> > +}
> > +
> > +/**
> > + * gzvm_vm_ioctl_get_pvmfw_size() - Get pvmfw size from
> > hypervisor, return
> > + * in x1, and return to userspace in
> > args
> > + * @gzvm: Pointer to struct gzvm.
> > + * @cap: Pointer to struct gzvm_enable_cap.
> > + * @argp: Pointer to struct gzvm_enable_cap in user space.
> > + *
> > + * Return:
> > + * * 0 - Succeed
> > + * * -EINVAL - Hypervisor return invalid results
> > + * * -EFAULT - Fail to copy back to userspace buffer
> > + */
> > +static int gzvm_vm_ioctl_get_pvmfw_size(struct gzvm *gzvm,
> > + struct gzvm_enable_cap *cap,
> > + void __user *argp)
> > +{
> > + struct arm_smccc_res res = {0};
> > +
> > + if (gzvm_vm_arch_enable_cap(gzvm, cap, &res) != 0)
> > + return -EINVAL;
> > +
> > + cap->args[1] = res.a1;
> > + if (copy_to_user(argp, cap, sizeof(*cap)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * gzvm_vm_ioctl_cap_pvm() - Proceed GZVM_CAP_ARM_PROTECTED_VM's
> > subcommands
> > + * @gzvm: Pointer to struct gzvm.
> > + * @cap: Pointer to struct gzvm_enable_cap.
> > + * @argp: Pointer to struct gzvm_enable_cap in user space.
> > + *
> > + * Return:
> > + * * 0 - Succeed
> > + * * -EINVAL - Invalid subcommand or arguments
> > + */
> > +static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm,
> > + struct gzvm_enable_cap *cap,
> > + void __user *argp)
> > +{
> > + int ret = -EINVAL;
>
> This initialization appears redundant as you always rewrite ret to a
> new
> value below
>
> > + struct arm_smccc_res res = {0};
> > +
> > + switch (cap->args[0]) {
> > + case GZVM_CAP_ARM_PVM_SET_PVMFW_IPA:
> > + fallthrough;
> > + case GZVM_CAP_ARM_PVM_SET_PROTECTED_VM:
> > + ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res);
> > + break;
> > + case GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE:
> > + ret = gzvm_vm_ioctl_get_pvmfw_size(gzvm, cap, argp);
> > + break;
> > + default:
> > + ret = -EINVAL;
> > + break;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
> > + struct gzvm_enable_cap *cap,
> > + void __user *argp)
> > +{
> > + int ret = -EINVAL;
>
> same here
> > +
> > + switch (cap->cap) {
> > + case GZVM_CAP_ARM_PROTECTED_VM:
> > + ret = gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp);
> > + break;
> > + default:
> > + ret = -EINVAL;
> > + break;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * gzvm_hva_to_pa_arch() - converts hva to pa with arch-specific
> > way
> > + * @hva: Host virtual address.
> > + *
> > + * Return: 0 if translation error
>
> This is a bit misleading: if you look at the code, you return 0 if the
> SYS_PAR_EL1_F bit is set, but you also return 0 if none of the
> PAR_PA47_MASK bits are set. Are those situations identical ?
>
> Also, it's a bit strange to return 0 for an error case.
>
> > + */
> > +u64 gzvm_hva_to_pa_arch(u64 hva)
> > +{
> > + u64 par;
> > + unsigned long flags;
> > +
> > + local_irq_save(flags);
> > + asm volatile("at s1e1r, %0" :: "r" (hva));
> > + isb();
> > + par = read_sysreg_par();
> > + local_irq_restore(flags);
> > +
> > + if (par & SYS_PAR_EL1_F)
> > + return 0;
> > +
> > + return par & PAR_PA47_MASK;
> > +}
> > diff --git a/arch/arm64/include/uapi/asm/gzvm_arch.h
> > b/arch/arm64/include/uapi/asm/gzvm_arch.h
> > new file mode 100644
> > index 000000000000..847bb627a65d
> > --- /dev/null
> > +++ b/arch/arm64/include/uapi/asm/gzvm_arch.h
> > @@ -0,0 +1,20 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#ifndef __GZVM_ARCH_H__
> > +#define __GZVM_ARCH_H__
> > +
> > +#include <linux/types.h>
> > +
> > +#define GZVM_CAP_ARM_VM_IPA_SIZE 165
> > +#define GZVM_CAP_ARM_PROTECTED_VM 0xffbadab1
> > +
> > +/* sub-commands put in args[0] for GZVM_CAP_ARM_PROTECTED_VM */
> > +#define GZVM_CAP_ARM_PVM_SET_PVMFW_IPA 0
> > +#define GZVM_CAP_ARM_PVM_GET_PVMFW_SIZE 1
> > +/* GZVM_CAP_ARM_PVM_SET_PROTECTED_VM only sets protected but not
> > load pvmfw */
> > +#define GZVM_CAP_ARM_PVM_SET_PROTECTED_VM 2
> > +
> > +#endif /* __GZVM_ARCH_H__ */
> > diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> > index f79ab13a5c28..9bbf0bdf672c 100644
> > --- a/drivers/virt/Kconfig
> > +++ b/drivers/virt/Kconfig
> > @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
> >
> > source "drivers/virt/coco/tdx-guest/Kconfig"
> >
> > +source "drivers/virt/geniezone/Kconfig"
> > +
> > endif
> > diff --git a/drivers/virt/geniezone/Kconfig
> > b/drivers/virt/geniezone/Kconfig
> > new file mode 100644
> > index 000000000000..2643fb8913cc
> > --- /dev/null
> > +++ b/drivers/virt/geniezone/Kconfig
> > @@ -0,0 +1,16 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +config MTK_GZVM
> > + tristate "GenieZone Hypervisor driver for guest VM operation"
> > + depends on ARM64
>
> if only mediatek SoC is supported, should it depend on it here ?
>
> > + help
> > + This driver, gzvm, enables to run guest VMs on MTK GenieZone
> > + hypervisor. It exports kvm-like interfaces for VMM (e.g.,
> > crosvm) in
> > + order to operate guest VMs on GenieZone hypervisor.
> > +
> > + GenieZone hypervisor now only supports MediaTek SoC and arm64
> > + architecture.
> > +
> > + Select M if you want it be built as a module (gzvm.ko).
> > +
> > + If unsure, say N.
> > diff --git a/drivers/virt/geniezone/Makefile
> > b/drivers/virt/geniezone/Makefile
> > new file mode 100644
> > index 000000000000..066efddc0b9c
> > --- /dev/null
> > +++ b/drivers/virt/geniezone/Makefile
> > @@ -0,0 +1,10 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +#
> > +# Makefile for GenieZone driver, this file should be include in
> > arch's
> > +# to avoid two ko being generated.
> > +#
> > +
> > +GZVM_DIR ?= ../../../drivers/virt/geniezone
> > +
> > +gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o
> > +
> > diff --git a/drivers/virt/geniezone/gzvm_main.c
> > b/drivers/virt/geniezone/gzvm_main.c
> > new file mode 100644
> > index 000000000000..b629b41a0cd9
> > --- /dev/null
> > +++ b/drivers/virt/geniezone/gzvm_main.c
> > @@ -0,0 +1,143 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#include <linux/anon_inodes.h>
> > +#include <linux/device.h>
> > +#include <linux/file.h>
> > +#include <linux/kdev_t.h>
> > +#include <linux/miscdevice.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/slab.h>
> > +#include <linux/gzvm_drv.h>
> > +
> > +/**
> > + * gzvm_err_to_errno() - Convert geniezone return value to
> > standard errno
> > + *
> > + * @err: Return value from geniezone function return
> > + *
> > + * Return: Standard errno
> > + */
> > +int gzvm_err_to_errno(unsigned long err)
> > +{
> > + int gz_err = (int)err;
> > +
> > + switch (gz_err) {
> > + case 0:
> > + return 0;
> > + case ERR_NO_MEMORY:
> > + return -ENOMEM;
> > + case ERR_NOT_SUPPORTED:
> > + return -EOPNOTSUPP;
> > + case ERR_NOT_IMPLEMENTED:
> > + return -EOPNOTSUPP;
> > + case ERR_FAULT:
> > + return -EFAULT;
> > + default:
> > + break;
> > + }
> > +
> > + return -EINVAL;
> > +}
> > +
> > +/**
> > + * gzvm_dev_ioctl_check_extension() - Check if given capability is
> > support
> > + * or not
> > + *
> > + * @gzvm: Pointer to struct gzvm
> > + * @args: Pointer in u64 from userspace
> > + *
> > + * Return:
> > + * * 0 - Support, no error
>
> Supported ?
>
> > + * * -EOPNOTSUPP - Not support
> > + * * -EFAULT - Failed to get data from userspace
> > + */
> > +long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned
> > long args)
> > +{
> > + __u64 cap;
> > + void __user *argp = (void __user *)args;
> > +
> > + if (copy_from_user(&cap, argp, sizeof(uint64_t)))
> > + return -EFAULT;
> > + return gzvm_arch_check_extension(gzvm, cap, argp);
> > +}
> > +
> > +static long gzvm_dev_ioctl(struct file *filp, unsigned int cmd,
> > + unsigned long user_args)
> > +{
> > + long ret = -ENOTTY;
>
> again redundant initializations
> > +
> > + switch (cmd) {
> > + case GZVM_CREATE_VM:
> > + ret = gzvm_dev_ioctl_create_vm(user_args);
> > + break;
> > + case GZVM_CHECK_EXTENSION:
> > + if (!user_args)
> > + return -EINVAL;
> > + ret = gzvm_dev_ioctl_check_extension(NULL, user_args);
> > + break;
> > + default:
> > + ret = -ENOTTY;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static const struct file_operations gzvm_chardev_ops = {
> > + .unlocked_ioctl = gzvm_dev_ioctl,
> > + .llseek = noop_llseek,
> > +};
> > +
> > +static struct miscdevice gzvm_dev = {
> > + .minor = MISC_DYNAMIC_MINOR,
> > + .name = KBUILD_MODNAME,
> > + .fops = &gzvm_chardev_ops,
> > +};
> > +
> > +static int gzvm_drv_probe(struct platform_device *pdev)
> > +{
> > + int ret;
> > +
> > + if (gzvm_arch_probe() != 0) {
> > + dev_err(&pdev->dev, "Not found available conduit\n");
> > + return -ENODEV;
> > + }
> > +
> > + ret = misc_register(&gzvm_dev);
>
> return misc_register(...) ?
>
> > + if (ret)
> > + return ret;
> > +
> > + return 0;
> > +}
> > +
> > +static int gzvm_drv_remove(struct platform_device *pdev)
> > +{
> > + gzvm_destroy_all_vms();
> > + misc_deregister(&gzvm_dev);
> > + return 0;
> > +}
> > +
> > +static const struct of_device_id gzvm_of_match[] = {
> > + { .compatible = "mediatek,geniezone-hyp", },
> > + {/* sentinel */},
> > +};
> > +
> > +static struct platform_driver gzvm_driver = {
> > + .probe = gzvm_drv_probe,
> > + .remove = gzvm_drv_remove,
> > + .driver = {
> > + .name = KBUILD_MODNAME,
> > + .owner = THIS_MODULE,
> > + .of_match_table = gzvm_of_match,
> > + },
> > +};
> > +
> > +module_platform_driver(gzvm_driver);
> > +
> > +MODULE_DEVICE_TABLE(of, gzvm_of_match);
> > +MODULE_AUTHOR("MediaTek");
> > +MODULE_DESCRIPTION("GenieZone interface for VMM");
> > +MODULE_LICENSE("GPL");
> > diff --git a/drivers/virt/geniezone/gzvm_vm.c
> > b/drivers/virt/geniezone/gzvm_vm.c
> > new file mode 100644
> > index 000000000000..ee751369fd4b
> > --- /dev/null
> > +++ b/drivers/virt/geniezone/gzvm_vm.c
> > @@ -0,0 +1,400 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#include <linux/anon_inodes.h>
> > +#include <linux/file.h>
> > +#include <linux/kdev_t.h>
> > +#include <linux/mm.h>
> > +#include <linux/module.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/slab.h>
> > +#include <linux/gzvm_drv.h>
> > +
> > +static DEFINE_MUTEX(gzvm_list_lock);
> > +static LIST_HEAD(gzvm_list);
> > +
> > +/**
> > + * hva_to_pa_fast() - converts hva to pa in generic fast way
> > + * @hva: Host virtual address.
> > + *
> > + * Return: 0 if translation error
> > + */
> > +static u64 hva_to_pa_fast(u64 hva)
> > +{
> > + struct page *page[1];
> > +
> > + u64 pfn;
> > +
> > + if (get_user_page_fast_only(hva, 0, page)) {
> > + pfn = page_to_phys(page[0]);
> > + put_page((struct page *)page);
> > + return pfn;
> > + } else {
>
> you can remove the 'else' and just return 0 here as you return pfn
> in
> the if(true) case.
>
> > + return 0;
> > + }
> > +}
> > +
> > +/**
> > + * hva_to_pa_slow() - note that this function may sleep
> > + * @hva: Host virtual address.
> > + *
> > + * Return: 0 if translation error
> > + */
> > +static u64 hva_to_pa_slow(u64 hva)
> > +{
> > + struct page *page;
> > + int npages;
> > + u64 pfn;
> > +
> > + npages = get_user_pages_unlocked(hva, 1, &page, 0);
> > + if (npages != 1)
> > + return 0;
> > +
> > + pfn = page_to_phys(page);
> > + put_page(page);
> > +
> > + return pfn;
> > +}
> > +
> > +static u64 gzvm_gfn_to_hva_memslot(struct gzvm_memslot *memslot,
> > u64 gfn)
> > +{
> > + u64 offset = gfn - memslot->base_gfn;
> > +
> > + return memslot->userspace_addr + offset * PAGE_SIZE;
> > +}
> > +
> > +static u64 __gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot,
> > u64 gfn)
> > +{
> > + u64 hva, pa;
> > +
> > + hva = gzvm_gfn_to_hva_memslot(memslot, gfn);
> > +
> > + pa = gzvm_hva_to_pa_arch(hva);
> > + if (pa != 0)
> > + return PHYS_PFN(pa);
> > +
> > + pa = hva_to_pa_fast(hva);
> > + if (pa)
> > + return PHYS_PFN(pa);
> > +
> > + pa = hva_to_pa_slow(hva);
> > + if (pa)
> > + return PHYS_PFN(pa);
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * gzvm_gfn_to_pfn_memslot() - Translate gfn (guest ipa) to pfn
> > (host pa),
> > + * result is in @pfn
> > + * @memslot: Pointer to struct gzvm_memslot.
> > + * @gfn: Guest frame number.
> > + * @pfn: Host page frame number.
> > + *
> > + * Return:
> > + * * 0 - Succeed
> > + * * -EFAULT - Failed to convert
> > + */
> > +static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot,
> > u64 gfn,
> > + u64 *pfn)
> > +{
> > + u64 __pfn;
> > +
> > + if (!memslot)
> > + return -EFAULT;
> > +
> > + __pfn = __gzvm_gfn_to_pfn_memslot(memslot, gfn);
> > + if (__pfn == 0) {
> > + *pfn = 0;
> > + return -EFAULT;
> > + }
> > +
> > + *pfn = __pfn;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * fill_constituents() - Populate pa to buffer until full
> > + * @consti: Pointer to struct mem_region_addr_range.
> > + * @consti_cnt: Constituent count.
> > + * @max_nr_consti: Maximum number of constituent count.
> > + * @gfn: Guest frame number.
> > + * @total_pages: Total page numbers.
> > + * @slot: Pointer to struct gzvm_memslot.
> > + *
> > + * Return: how many pages we've fill in, negative if error
> > + */
> > +static int fill_constituents(struct mem_region_addr_range *consti,
> > + int *consti_cnt, int max_nr_consti, u64
> > gfn,
> > + u32 total_pages, struct gzvm_memslot
> > *slot)
> > +{
> > + u64 pfn, prev_pfn, gfn_end;
> > + int nr_pages = 1;
> > + int i = 0;
> > +
> > + if (unlikely(total_pages == 0))
> > + return -EINVAL;
> > + gfn_end = gfn + total_pages;
> > +
> > + /* entry 0 */
> > + if (gzvm_gfn_to_pfn_memslot(slot, gfn, &pfn) != 0)
> > + return -EFAULT;
> > + consti[0].address = PFN_PHYS(pfn);
> > + consti[0].pg_cnt = 1;
> > + gfn++;
> > + prev_pfn = pfn;
> > +
> > + while (i < max_nr_consti && gfn < gfn_end) {
> > + if (gzvm_gfn_to_pfn_memslot(slot, gfn, &pfn) != 0)
> > + return -EFAULT;
> > + if (pfn == (prev_pfn + 1)) {
> > + consti[i].pg_cnt++;
> > + } else {
> > + i++;
> > + if (i >= max_nr_consti)
> > + break;
> > + consti[i].address = PFN_PHYS(pfn);
> > + consti[i].pg_cnt = 1;
> > + }
> > + prev_pfn = pfn;
> > + gfn++;
> > + nr_pages++;
> > + }
> > + if (i != max_nr_consti)
> > + i++;
> > + *consti_cnt = i;
> > +
> > + return nr_pages;
> > +}
> > +
> > +/* register_memslot_addr_range() - Register memory region to GZ */
> > +static int
> > +register_memslot_addr_range(struct gzvm *gzvm, struct gzvm_memslot
> > *memslot)
> > +{
> > + struct gzvm_memory_region_ranges *region;
> > + u32 buf_size;
> > + int max_nr_consti, remain_pages;
> > + u64 gfn, gfn_end;
> > +
> > + buf_size = PAGE_SIZE * 2;
> > + region = alloc_pages_exact(buf_size, GFP_KERNEL);
> > + if (!region)
> > + return -ENOMEM;
> > + max_nr_consti = (buf_size - sizeof(*region)) /
> > + sizeof(struct mem_region_addr_range);
> > +
> > + region->slot = memslot->slot_id;
> > + remain_pages = memslot->npages;
> > + gfn = memslot->base_gfn;
> > + gfn_end = gfn + remain_pages;
> > + while (gfn < gfn_end) {
> > + int nr_pages;
> > +
> > + nr_pages = fill_constituents(region->constituents,
> > + &region->constituent_cnt,
> > + max_nr_consti, gfn,
> > + remain_pages, memslot);
> > + if (nr_pages < 0) {
> > + pr_err("Failed to fill constituents\n");
> > + free_pages_exact(region, buf_size);
> > + return nr_pages;
> > + }
> > + region->gpa = PFN_PHYS(gfn);
> > + region->total_pages = nr_pages;
> > +
> > + remain_pages -= nr_pages;
> > + gfn += nr_pages;
> > +
> > + if (gzvm_arch_set_memregion(gzvm->vm_id, buf_size,
> > + virt_to_phys(region))) {
> > + pr_err("Failed to register memregion to hypervisor\n");
> > + free_pages_exact(region, buf_size);
> > + return -EFAULT;
> > + }
> > + }
> > + free_pages_exact(region, buf_size);
> > + return 0;
> > +}
> > +
> > +/**
> > + * gzvm_vm_ioctl_set_memory_region() - Set memory region of guest
> > + * @gzvm: Pointer to struct gzvm.
> > + * @mem: Input memory region from user.
> > + *
> > + * Return:
> > + * * -EXIO - memslot is out-of-range
> > + * * -EFAULT - Cannot find corresponding vma
> > + * * -EINVAL - region size and vma size does not
> > match
>
> I assume 0 for success ?
>
> > + */
> > +static int
> > +gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm,
> > + struct gzvm_userspace_memory_region
> > *mem)
> > +{
> > + struct vm_area_struct *vma;
> > + struct gzvm_memslot *memslot;
> > + unsigned long size;
> > + __u32 slot;
> > +
> > + slot = mem->slot;
> > + if (slot >= GZVM_MAX_MEM_REGION)
> > + return -ENXIO;
> > + memslot = &gzvm->memslot[slot];
> > +
> > + vma = vma_lookup(gzvm->mm, mem->userspace_addr);
> > + if (!vma)
> > + return -EFAULT;
> > +
> > + size = vma->vm_end - vma->vm_start;
> > + if (size != mem->memory_size)
> > + return -EINVAL;
> > +
> > + memslot->base_gfn = __phys_to_pfn(mem->guest_phys_addr);
> > + memslot->npages = size >> PAGE_SHIFT;
> > + memslot->userspace_addr = mem->userspace_addr;
> > + memslot->vma = vma;
> > + memslot->flags = mem->flags;
> > + memslot->slot_id = mem->slot;
> > + return register_memslot_addr_range(gzvm, memslot);
> > +}
> > +
> > +static int gzvm_vm_ioctl_enable_cap(struct gzvm *gzvm,
> > + struct gzvm_enable_cap *cap,
> > + void __user *argp)
> > +{
> > + return gzvm_vm_ioctl_arch_enable_cap(gzvm, cap, argp);
> > +}
> > +
> > +/* gzvm_vm_ioctl() - Ioctl handler of VM FD */
> > +static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
> > + unsigned long arg)
> > +{
> > + long ret = -ENOTTY;
>
> appears to be redundant
>
> > + void __user *argp = (void __user *)arg;
> > + struct gzvm *gzvm = filp->private_data;
> > +
> > + switch (ioctl) {
> > + case GZVM_CHECK_EXTENSION: {
> > + ret = gzvm_dev_ioctl_check_extension(gzvm, arg);
> > + break;
> > + }
> > + case GZVM_SET_USER_MEMORY_REGION: {
> > + struct gzvm_userspace_memory_region userspace_mem;
> > +
> > + if (copy_from_user(&userspace_mem, argp,
> > sizeof(userspace_mem))) {
> > + ret = -EFAULT;
> > + goto out;
> > + }
> > + ret = gzvm_vm_ioctl_set_memory_region(gzvm,
> > &userspace_mem);
> > + break;
> > + }
> > + case GZVM_ENABLE_CAP: {
> > + struct gzvm_enable_cap cap;
> > +
> > + if (copy_from_user(&cap, argp, sizeof(cap))) {
> > + ret = -EFAULT;
> > + goto out;
> > + }
> > + ret = gzvm_vm_ioctl_enable_cap(gzvm, &cap, argp);
> > + break;
> > + }
> > + default:
> > + ret = -ENOTTY;
> > + }
> > +out:
> > + return ret;
> > +}
> > +
> > +static void gzvm_destroy_vm(struct gzvm *gzvm)
> > +{
> > + pr_debug("VM-%u is going to be destroyed\n", gzvm->vm_id);
> > +
> > + mutex_lock(&gzvm->lock);
> > +
> > + gzvm_arch_destroy_vm(gzvm->vm_id);
> > +
> > + mutex_lock(&gzvm_list_lock);
> > + list_del(&gzvm->vm_list);
> > + mutex_unlock(&gzvm_list_lock);
> > +
> > + mutex_unlock(&gzvm->lock);
> > +
> > + kfree(gzvm);
> > +}
> > +
> > +static int gzvm_vm_release(struct inode *inode, struct file *filp)
> > +{
> > + struct gzvm *gzvm = filp->private_data;
> > +
> > + gzvm_destroy_vm(gzvm);
> > + return 0;
> > +}
> > +
> > +static const struct file_operations gzvm_vm_fops = {
> > + .release = gzvm_vm_release,
> > + .unlocked_ioctl = gzvm_vm_ioctl,
> > + .llseek = noop_llseek,
> > +};
> > +
> > +static struct gzvm *gzvm_create_vm(unsigned long vm_type)
> > +{
> > + int ret;
> > + struct gzvm *gzvm;
> > +
> > + gzvm = kzalloc(sizeof(*gzvm), GFP_KERNEL);
> > + if (!gzvm)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + ret = gzvm_arch_create_vm(vm_type);
> > + if (ret < 0) {
> > + kfree(gzvm);
> > + return ERR_PTR(ret);
> > + }
> > +
> > + gzvm->vm_id = ret;
> > + gzvm->mm = current->mm;
> > + mutex_init(&gzvm->lock);
> > +
> > + mutex_lock(&gzvm_list_lock);
> > + list_add(&gzvm->vm_list, &gzvm_list);
> > + mutex_unlock(&gzvm_list_lock);
> > +
> > + pr_debug("VM-%u is created\n", gzvm->vm_id);
> > +
> > + return gzvm;
> > +}
> > +
> > +/**
> > + * gzvm_dev_ioctl_create_vm - Create vm fd
> > + * @vm_type: VM type. Only supports Linux VM now.
> > + *
> > + * Return: fd of vm, negative if error
> > + */
> > +int gzvm_dev_ioctl_create_vm(unsigned long vm_type)
> > +{
> > + struct gzvm *gzvm;
> > +
> > + gzvm = gzvm_create_vm(vm_type);
> > + if (IS_ERR(gzvm))
> > + return PTR_ERR(gzvm);
> > +
> > + return anon_inode_getfd("gzvm-vm", &gzvm_vm_fops, gzvm,
> > + O_RDWR | O_CLOEXEC);
> > +}
> > +
> > +void gzvm_destroy_all_vms(void)
> > +{
> > + struct gzvm *gzvm, *tmp;
> > +
> > + mutex_lock(&gzvm_list_lock);
> > + if (list_empty(&gzvm_list))
> > + goto out;
> > +
> > + list_for_each_entry_safe(gzvm, tmp, &gzvm_list, vm_list)
> > + gzvm_destroy_vm(gzvm);
> > +
> > +out:
> > + mutex_unlock(&gzvm_list_lock);
> > +}
> > diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
> > new file mode 100644
> > index 000000000000..4fd52fcbd5a8
> > --- /dev/null
> > +++ b/include/linux/gzvm_drv.h
> > @@ -0,0 +1,90 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#ifndef __GZVM_DRV_H__
> > +#define __GZVM_DRV_H__
> > +
> > +#include <linux/list.h>
> > +#include <linux/mutex.h>
> > +#include <linux/gzvm.h>
> > +
> > +#define GZVM_VCPU_MMAP_SIZE PAGE_SIZE
> > +#define INVALID_VM_ID 0xffff
> > +
> > +/*
> > + * These are the efinitions of APIs between GenieZone hypervisor
> > and driver,
>
> typo: definitions
>
> > + * there's no need to be visible to uapi. Furthermore, We need
> > GenieZone
>
> "We" doesn't have to be capitalized
> > + * specific error code in order to map to Linux errno
> > + */
> > +#define NO_ERROR (0)
> > +#define ERR_NO_MEMORY (-5)
> > +#define ERR_NOT_SUPPORTED (-24)
> > +#define ERR_NOT_IMPLEMENTED (-27)
> > +#define ERR_FAULT (-40)
> > +
> > +/*
> > + * The following data structures are for data transferring between
> > driver and
> > + * hypervisor, and they're aligned with hypervisor definitions
> > + */
> > +#define GZVM_MAX_VCPUS 8
> > +#define GZVM_MAX_MEM_REGION 10
> > +
> > +/* struct mem_region_addr_range - Identical to ffa memory
> > constituent */
> > +struct mem_region_addr_range {
> > + /* the base IPA of the constituent memory region, aligned to 4
> > kiB */
> > + __u64 address;
> > + /* the number of 4 kiB pages in the constituent memory region.
> > */
> > + __u32 pg_cnt;
> > + __u32 reserved;
> > +};
> > +
> > +struct gzvm_memory_region_ranges {
> > + __u32 slot;
> > + __u32 constituent_cnt;
> > + __u64 total_pages;
> > + __u64 gpa;
> > + struct mem_region_addr_range constituents[];
> > +};
> > +
> > +/* struct gzvm_memslot - VM's memory slot descriptor */
> > +struct gzvm_memslot {
> > + u64 base_gfn; /* begin of guest page
> > frame */
> > + unsigned long npages; /* number of pages this
> > slot covers */
> > + unsigned long userspace_addr; /* corresponding userspace
> > va */
> > + struct vm_area_struct *vma; /* vma related to this userspace
> > addr */
> > + u32 flags;
> > + u32 slot_id;
> > +};
> > +
> > +struct gzvm {
> > + /* userspace tied to this vm */
> > + struct mm_struct *mm;
> > + struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION];
> > + /* lock for list_add*/
> > + struct mutex lock;
> > + struct list_head vm_list;
> > + u16 vm_id;
> > +};
> > +
> > +long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned
> > long args);
> > +int gzvm_dev_ioctl_create_vm(unsigned long vm_type);
> > +
> > +int gzvm_err_to_errno(unsigned long err);
> > +
> > +void gzvm_destroy_all_vms(void);
> > +
> > +/* arch-dependant functions */
> > +int gzvm_arch_probe(void);
> > +int gzvm_arch_set_memregion(u16 vm_id, size_t buf_size,
> > + phys_addr_t region);
> > +int gzvm_arch_check_extension(struct gzvm *gzvm, __u64 cap, void
> > __user *argp);
> > +int gzvm_arch_create_vm(unsigned long vm_type);
> > +int gzvm_arch_destroy_vm(u16 vm_id);
> > +int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
> > + struct gzvm_enable_cap *cap,
> > + void __user *argp);
> > +u64 gzvm_hva_to_pa_arch(u64 hva);
> > +
> > +#endif /* __GZVM_DRV_H__ */
> > diff --git a/include/uapi/asm-generic/Kbuild b/include/uapi/asm-
> > generic/Kbuild
> > index ebb180aac74e..5af115a3c1a8 100644
> > --- a/include/uapi/asm-generic/Kbuild
> > +++ b/include/uapi/asm-generic/Kbuild
> > @@ -34,3 +34,4 @@ mandatory-y += termbits.h
> > mandatory-y += termios.h
> > mandatory-y += types.h
> > mandatory-y += unistd.h
> > +mandatory-y += gzvm_arch.h
> > diff --git a/include/uapi/asm-generic/gzvm_arch.h
> > b/include/uapi/asm-generic/gzvm_arch.h
> > new file mode 100644
> > index 000000000000..c4cc12716c91
> > --- /dev/null
> > +++ b/include/uapi/asm-generic/gzvm_arch.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +#ifndef __ASM_GENERIC_GZVM_ARCH_H
> > +#define __ASM_GENERIC_GZVM_ARCH_H
> > +/* geniezone only supports aarch64 platform for now */
> > +
> > +#endif /* __ASM_GENERIC_GZVM_ARCH_H */
> > diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
> > new file mode 100644
> > index 000000000000..99730c142b0e
> > --- /dev/null
> > +++ b/include/uapi/linux/gzvm.h
> > @@ -0,0 +1,76 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + * Copyright (c) 2023 MediaTek Inc.
> > + */
> > +
> > +/**
> > + * DOC: UAPI of GenieZone Hypervisor
> > + *
> > + * This file declares common data structure shared among user
> > space,
> > + * kernel space, and GenieZone hypervisor.
> > + */
> > +#ifndef __GZVM_H__
> > +#define __GZVM_H__
> > +
> > +#include <linux/const.h>
> > +#include <linux/types.h>
> > +#include <linux/ioctl.h>
> > +
> > +#include <asm/gzvm_arch.h>
> > +
> > +/* GZVM ioctls */
> > +#define GZVM_IOC_MAGIC 0x92 /* gz */
> > +
> > +/* ioctls for /dev/gzvm fds */
> > +#define GZVM_CREATE_VM _IO(GZVM_IOC_MAGIC, 0x01) /*
> > Returns a Geniezone VM fd */
> > +
> > +/*
> > + * Check if the given capability is supported or not.
> > + * The argument is capability. Ex. GZVM_CAP_ARM_PROTECTED_VM or
> > GZVM_CAP_ARM_VM_IPA_SIZE
> > + * return is 0 (supported, no error)
> > + * return is -EOPNOTSUPP (unsupported)
> > + * return is -EFAULT (failed to get the argument from userspace)
> > + */
> > +#define GZVM_CHECK_EXTENSION _IO(GZVM_IOC_MAGIC, 0x03)
> > +
> > +/* ioctls for VM fds */
> > +/* for GZVM_SET_MEMORY_REGION */
> > +struct gzvm_memory_region {
> > + __u32 slot;
> > + __u32 flags;
> > + __u64 guest_phys_addr;
> > + __u64 memory_size; /* bytes */
> > +};
> > +
> > +#define GZVM_SET_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x40, \
> > + struct gzvm_memory_region)
> > +
> > +/* for GZVM_SET_USER_MEMORY_REGION */
> > +struct gzvm_userspace_memory_region {
> > + __u32 slot;
> > + __u32 flags;
> > + __u64 guest_phys_addr;
> > + /* bytes */
> > + __u64 memory_size;
> > + /* start of the userspace allocated memory */
> > + __u64 userspace_addr;
> > +};
> > +
> > +#define GZVM_SET_USER_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x46, \
> > +					     struct gzvm_userspace_memory_region)
> > +
> > +/* for GZVM_ENABLE_CAP */
> > +struct gzvm_enable_cap {
> > + /* in */
> > + __u64 cap;
> > + /**
> > + * we have total 5 (8 - 3) registers can be used for
>
> which can be used ?
>
> > + * additional args
> > + */
> > + __u64 args[5];
> > +};
> > +
> > +#define GZVM_ENABLE_CAP _IOW(GZVM_IOC_MAGIC, 0xa3, \
> > + struct gzvm_enable_cap)
> > +
> > +#endif /* __GZVM_H__ */
>
>
> Regards,
>
> Eugen
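
For reference, below is a minimal userspace sketch of how the ioctls quoted above might be exercised. The /dev/gzvm node path, the vm_type value of 0, the guest RAM base and the convention of passing the capability id directly as the GZVM_CHECK_EXTENSION argument are assumptions for illustration; GZVM_CAP_ARM_VM_IPA_SIZE comes from the arm64 asm/gzvm_arch.h header, which is not part of the hunks shown here.

/* Minimal sketch, assuming a /dev/gzvm chardev and the UAPI quoted above. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/gzvm.h>		/* the UAPI header added by this patch */

int main(void)
{
	struct gzvm_userspace_memory_region region;
	void *guest_mem;
	int gzvm_fd, vm_fd;

	gzvm_fd = open("/dev/gzvm", O_RDWR);
	if (gzvm_fd < 0) {
		perror("open /dev/gzvm");
		return 1;
	}

	/*
	 * Capability probe; per the quoted comment, 0 means supported.
	 * Passing the capability id as the ioctl argument is an assumption.
	 */
	if (ioctl(gzvm_fd, GZVM_CHECK_EXTENSION, GZVM_CAP_ARM_VM_IPA_SIZE))
		fprintf(stderr, "GZVM_CAP_ARM_VM_IPA_SIZE not supported\n");

	/* GZVM_CREATE_VM returns a GenieZone VM fd; vm_type 0 is assumed. */
	vm_fd = ioctl(gzvm_fd, GZVM_CREATE_VM, 0);
	if (vm_fd < 0) {
		perror("GZVM_CREATE_VM");
		return 1;
	}

	/* Back 16 MiB of guest physical memory with anonymous host memory. */
	guest_mem = mmap(NULL, 16 << 20, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (guest_mem == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	memset(&region, 0, sizeof(region));
	region.slot = 0;
	region.guest_phys_addr = 0x80000000;	/* assumed guest RAM base */
	region.memory_size = 16 << 20;
	region.userspace_addr = (__u64)(unsigned long)guest_mem;

	if (ioctl(vm_fd, GZVM_SET_USER_MEMORY_REGION, &region))
		perror("GZVM_SET_USER_MEMORY_REGION");

	close(vm_fd);
	close(gzvm_fd);
	return 0;
}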

2023-08-11 17:47:18

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v5 04/12] virt: geniezone: Add vcpu support

On Thu, Jul 27, 2023 at 03:59:57PM +0800, Yi-De Wu wrote:
> From: "Yingshiuan Pan" <[email protected]>
>
> The VMM uses this interface to create a vcpu instance, which is an fd,
> and this fd is used for all vcpu operations, such as setting vcpu
> registers. It also accepts the most important ioctl, GZVM_VCPU_RUN,
> which requests the GenieZone hypervisor to context switch into the
> VM's vcpu context.
>
> Signed-off-by: Yingshiuan Pan <[email protected]>
> Signed-off-by: Jerry Wang <[email protected]>
> Signed-off-by: Liju Chen <[email protected]>
> Signed-off-by: Yi-De Wu <[email protected]>
> ---
> arch/arm64/geniezone/Makefile | 2 +-
> arch/arm64/geniezone/gzvm_arch_common.h | 20 ++
> arch/arm64/geniezone/vcpu.c | 88 +++++++++
> arch/arm64/geniezone/vm.c | 11 ++
> arch/arm64/include/uapi/asm/gzvm_arch.h | 30 +++

I'm almost certain that the arm64 maintainers will reject putting this
here. What is the purpose of the split with drivers/virt/? Do you plan
to support another arch in the near future?

Yes, there's KVM stuff in arch/arm64, but that is multi-arch.

> drivers/virt/geniezone/Makefile | 3 +-
> drivers/virt/geniezone/gzvm_vcpu.c | 250 ++++++++++++++++++++++++
> drivers/virt/geniezone/gzvm_vm.c | 5 +
> include/linux/gzvm_drv.h | 21 ++
> include/uapi/linux/gzvm.h | 136 +++++++++++++
> 10 files changed, 564 insertions(+), 2 deletions(-)
> create mode 100644 arch/arm64/geniezone/vcpu.c
> create mode 100644 drivers/virt/geniezone/gzvm_vcpu.c
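
For reference, below is a hedged userspace sketch of the vcpu lifecycle the commit message describes. Only GZVM_VCPU_RUN is named by the text above; the GZVM_CREATE_VCPU ioctl name, its vcpu-id argument and the absence of a run-structure argument are assumptions for illustration, not the final ABI.

/* Sketch only: vcpu fd creation and run loop, under the assumptions above. */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/gzvm.h>

static int run_vcpu(int vm_fd)
{
	int vcpu_fd, ret;

	/* Create vcpu 0 on the VM fd; the returned fd drives this vcpu. */
	vcpu_fd = ioctl(vm_fd, GZVM_CREATE_VCPU, 0);
	if (vcpu_fd < 0) {
		perror("GZVM_CREATE_VCPU");
		return -1;
	}

	/*
	 * Vcpu registers would be set through this fd before the first run;
	 * that ioctl is part of this patch but not shown in the hunks above.
	 */

	for (;;) {
		/* Ask GenieZone to context switch into the guest vcpu. */
		ret = ioctl(vcpu_fd, GZVM_VCPU_RUN, 0);
		if (ret < 0) {
			perror("GZVM_VCPU_RUN");
			break;
		}
		/*
		 * On return, the VMM would inspect the exit reason (MMIO
		 * access, shutdown, ...), service it and loop, or stop.
		 */
		break;
	}

	close(vcpu_fd);
	return ret;
}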

2023-08-11 17:50:47

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v5 00/12] GenieZone hypervisor drivers

On Thu, Jul 27, 2023 at 03:59:53PM +0800, Yi-De Wu wrote:
> This series is based on linux-next, tag: next-20230726.
>
> GenieZone hypervisor(gzvm) is a type-1 hypervisor that supports various virtual
> machine types and provides security features such as TEE-like scenarios and
> secure boot. It can create guest VMs for security use cases and has
> virtualization capabilities for both platform and interrupt. Although the
> hypervisor can be booted independently, it requires the assistance of GenieZone
> hypervisor kernel driver(gzvm-ko) to leverage the ability of Linux kernel for
> vCPU scheduling, memory management, inter-VM communication and virtio backend
> support.
>
> Changes in v5:
> - Add dt solution back for device initialization

Why? It's a software interface that you define and control. Make that
interface discoverable.

Rob

2023-08-15 14:42:51

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v5 04/12] virt: geniezone: Add vcpu support

On Fri, Aug 11, 2023 at 11:00:54AM -0600, Rob Herring wrote:
> On Thu, Jul 27, 2023 at 03:59:57PM +0800, Yi-De Wu wrote:
> > From: "Yingshiuan Pan" <[email protected]>
> >
> > The VMM uses this interface to create a vcpu instance, which is an
> > fd, and this fd is used for all vcpu operations, such as setting vcpu
> > registers. It also accepts the most important ioctl, GZVM_VCPU_RUN,
> > which requests the GenieZone hypervisor to context switch into the
> > VM's vcpu context.
> >
> > Signed-off-by: Yingshiuan Pan <[email protected]>
> > Signed-off-by: Jerry Wang <[email protected]>
> > Signed-off-by: Liju Chen <[email protected]>
> > Signed-off-by: Yi-De Wu <[email protected]>
> > ---
> > arch/arm64/geniezone/Makefile | 2 +-
> > arch/arm64/geniezone/gzvm_arch_common.h | 20 ++
> > arch/arm64/geniezone/vcpu.c | 88 +++++++++
> > arch/arm64/geniezone/vm.c | 11 ++
> > arch/arm64/include/uapi/asm/gzvm_arch.h | 30 +++
>
> I'm almost certain that the arm64 maintainers will reject putting this
> here. What is the purpose of the split with drivers/virt/? Do you plan
> to support another arch in the near future?

Thanks, Rob. You're absolutely right that this doesn't belong in the
architecture code.

Will

2023-09-01 17:11:04

by Yi-De Wu

[permalink] [raw]
Subject: Re: [PATCH v5 00/12] GenieZone hypervisor drivers

On Thu, 2023-08-17 at 15:31 +0800, Yi-De Wu wrote:
> On Fri, 2023-08-11 at 10:52 -0600, Rob Herring wrote:
> >
> > On Thu, Jul 27, 2023 at 03:59:53PM +0800, Yi-De Wu wrote:
> > > This series is based on linux-next, tag: next-20230726.
> > >
> > > GenieZone hypervisor(gzvm) is a type-1 hypervisor that supports
> > > various virtual machine types and provides security features such
> > > as TEE-like scenarios and secure boot. It can create guest VMs for
> > > security use cases and has virtualization capabilities for both
> > > platform and interrupt. Although the hypervisor can be booted
> > > independently, it requires the assistance of GenieZone hypervisor
> > > kernel driver(gzvm-ko) to leverage the ability of Linux kernel for
> > > vCPU scheduling, memory management, inter-VM communication and
> > > virtio backend support.
> > >
> > > Changes in v5:
> > > - Add dt solution back for device initialization
> >
> > Why? It's a software interface that you define and control. Make that
> > interface discoverable.
> >
> > Rob
>
> hi Rob,
>
> Let me recap a bit, as you might not have noticed our previous
> response[1]. To discover our GenieZone hypervisor, two solutions have
> been discussed: with dt or without dt.
>
> The reasons we use dt now were listed in a previous mail thread[2];
> I'll just copy the statements here for better sync-up.
> - Although dt is meant for hardware, it is difficult to discover a
> specific hypervisor without probing on every system, which would
> pollute all other users as a consequence.
> - The GenieZone hypervisor could be considered a vendor model that
> assists platform virtualization, and its implementation is independent
> of Linuxism.
>
> In contrast to the dt solution, what we had been doing was probing via
> a hypercall to see whether our hypervisor exists. However, this could
> raise concerns about "polluting all systems", even for systems without
> the GenieZone hypervisor embedded[3].
>
> We're wondering if you have a specific implementation in mind that
> would let us initialize our device in a discoverable manner without
> affecting other systems. We'd appreciate any hints.
>
> Regards,
>
> Reference
> 1. https://lore.kernel.org/all/[email protected]/
> 2. https://lore.kernel.org/lkml/[email protected]/
> 3. https://lore.kernel.org/all/[email protected]/
>
>
> Regards,

A gentle ping.

We believe a simple dt node would be a concise solution here for
initializing the GenieZone hypervisor. We also found that some other
software components use dt as well[4]. Perhaps it could be brought into
discussion whether dt is suitable for our use case.

Reference
4. OP-TEE Trusted OS maintained by Linaro

https://elixir.bootlin.com/linux/v6.1/source/Documentation/devicetree/bindings/arm/firmware/tlm,trusted-foundations.yaml
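
For illustration, below is a minimal sketch of the dt-based discovery being proposed: a platform driver that matches on the hypervisor node. The compatible string is inferred from the binding file name in this series (mediatek,geniezone-hyp.yaml) and the probe body is a placeholder; the point is that the presence check only runs on systems whose dt declares the node.

/* Sketch only: dt-based discovery of the hypervisor, assumptions noted above. */
#include <linux/module.h>
#include <linux/of.h>
#include <linux/platform_device.h>

static int gzvm_probe_sketch(struct platform_device *pdev)
{
	/*
	 * Only reached on systems whose dt declares the hypervisor node,
	 * so a hypercall presence check issued here would no longer run
	 * on every arm64 machine.
	 */
	dev_dbg(&pdev->dev, "GenieZone hypervisor node matched\n");
	return 0;
}

static const struct of_device_id gzvm_of_match_sketch[] = {
	{ .compatible = "mediatek,geniezone-hyp" },	/* assumed string */
	{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, gzvm_of_match_sketch);

static struct platform_driver gzvm_driver_sketch = {
	.probe = gzvm_probe_sketch,
	.driver = {
		.name = "gzvm-sketch",
		.of_match_table = gzvm_of_match_sketch,
	},
};
module_platform_driver(gzvm_driver_sketch);

MODULE_LICENSE("GPL");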