Gunyah is a Type-1 hypervisor: it is independent of any high-level OS
kernel and runs at a higher CPU privilege level. It does not depend on
any lower-privileged OS kernel or code for its core functionality. This
improves security and allows a much smaller trusted computing base than
a Type-2 hypervisor.
Gunyah is an open source hypervisor. The source repo is available at
https://github.com/quic/gunyah-hypervisor.
The diagram below shows the architecture.
::
         VM A                    VM B
     +-----+ +-----+  | +-----+ +-----+ +-----+
     |     | |     |  | |     | |     | |     |
 EL0 | APP | | APP |  | | APP | | APP | | APP |
     |     | |     |  | |     | |     | |     |
     +-----+ +-----+  | +-----+ +-----+ +-----+
 ---------------------|-------------------------
     +--------------+ | +----------------------+
     |              | | |                      |
 EL1 | Linux Kernel | | |Linux kernel/Other OS |   ...
     |              | | |                      |
     +--------------+ | +----------------------+
 --------hvc/smc------|------hvc/smc------------
     +----------------------------------------+
     |                                        |
 EL2 |            Gunyah Hypervisor           |
     |                                        |
     +----------------------------------------+
Gunyah provides the following features:
- Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on
physical CPUs and enables time-sharing of the CPUs.
- Memory Management: Gunyah tracks memory ownership and use of all memory
under its control. Memory partitioning between VMs is a fundamental
security feature.
- Interrupt Virtualization: All interrupts are handled in the hypervisor
and routed to the assigned VM.
- Inter-VM Communication: There are several different mechanisms provided
for communicating between VMs.
- Device Virtualization: Para-virtualization of devices is supported using
inter-VM communication. Low level system features and devices such as
interrupt controllers are supported with emulation where required.
This series adds the basic framework for detecting that Linux is running
under Gunyah as a virtual machine, communication with the Gunyah Resource
Manager, and a virtual machine manager capable of launching virtual machines.
The series relies on two other patches posted separately:
- https://lore.kernel.org/all/[email protected]/
- https://lore.kernel.org/all/[email protected]/
Changes in v11:
- Rename struct gh_vm_dtb_config:gpa -> guest_phys_addr & overflow checks for this
- More docstrings throughout
- Make resp_buf and resp_buf_size optional
- Replace deprecated idr with xarray
- Refcounting on misc device instead of RM's platform device
- Renaming variables, structs, etc. from gunyah_ -> gh_
- Drop removal of user mem regions
- Drop mem_lend functionality; to converge with restricted_memfd later
Changes in v10: https://lore.kernel.org/all/[email protected]/
- Fix bisectability (end result of series is same, --fixups applied to wrong commits)
- Convert GH_ERROR_* and GH_RM_ERROR_* to enums
- Correct race condition between allocating/freeing user memory
- Replace offsetof with struct_size
- Series-wide renaming of functions to be more consistent
- VM shutdown & restart support added in vCPU and VM Manager patches
- Convert VM function name (string) to type (number)
- Convert VM function argument to value (which could be a pointer) to remove memory wastage for arguments
- Remove defensive checks of hypervisor correctness
- Clean ups to ioeventfd as suggested by Srivatsa
Changes in v9: https://lore.kernel.org/all/[email protected]/
- Refactor Gunyah API flags to be exposed as feature flags at kernel level
- Move mbox client cleanup into gunyah_msgq_remove()
- Simplify gh_rm_call return value and response payload
- Missing clean-up/error handling/little endian fixes as suggested by Srivatsa and Alex in v8 series
Changes in v8: https://lore.kernel.org/all/[email protected]/
- Treat VM manager as a library of RM
- Add patches 21-28 as RFC to support proxy-scheduled vCPUs and necessary bits to support virtio
from Gunyah userspace
Changes in v7: https://lore.kernel.org/all/[email protected]/
- Refactor to remove gunyah RM bus
- Refactor to allow multiple RM device instances
- Bump UAPI to start at 0x0
- Refactor QCOM SCM's platform hooks to allow CONFIG_QCOM_SCM=Y/CONFIG_GUNYAH=M combinations
Changes in v6: https://lore.kernel.org/all/[email protected]/
- *Replace gunyah-console with gunyah VM Manager*
- Move include/asm-generic/gunyah.h into include/linux/gunyah.h
- s/gunyah_msgq/gh_msgq/
- Minor tweaks and documentation tidying based on comments from Jiri, Greg, Arnd, Dmitry, and Bagas.
Changes in v5: https://lore.kernel.org/all/[email protected]/
- Dropped sysfs nodes
- Switch from aux bus to Gunyah RM bus for the subdevices
- Cleaning up RM console
Changes in v4: https://lore.kernel.org/all/[email protected]/
- Tidied up documentation throughout based on questions/feedback received
- Switched message queue implementation to use mailboxes
- Renamed "gunyah_device" as "gunyah_resource"
Changes in v3: https://lore.kernel.org/all/[email protected]/
- /Maintained/Supported/ in MAINTAINERS
- Tidied up documentation throughout based on questions/feedback received
- Moved hypercalls into arch/arm64/gunyah/; following hyper-v's implementation
- Drop opaque typedefs
- Move sysfs nodes under /sys/hypervisor/gunyah/
- Moved Gunyah console driver to drivers/tty/
- Reworked gh_device design to drop the Gunyah bus.
Changes in v2: https://lore.kernel.org/all/[email protected]/
- DT bindings clean up
- Switch hypercalls to follow SMCCC
v1: https://lore.kernel.org/all/[email protected]/
Elliot Berman (26):
docs: gunyah: Introduce Gunyah Hypervisor
dt-bindings: Add binding for gunyah hypervisor
gunyah: Common types and error codes for Gunyah hypercalls
virt: gunyah: Add hypercalls to identify Gunyah
virt: gunyah: Identify hypervisor version
virt: gunyah: msgq: Add hypercalls to send and receive messages
mailbox: Add Gunyah message queue mailbox
gunyah: rsc_mgr: Add resource manager RPC core
gunyah: rsc_mgr: Add VM lifecycle RPC
gunyah: vm_mgr: Introduce basic VM Manager
gunyah: rsc_mgr: Add RPC for sharing memory
gunyah: vm_mgr: Add/remove user memory regions
gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot
samples: Add sample userspace Gunyah VM Manager
gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim
firmware: qcom_scm: Register Gunyah platform ops
docs: gunyah: Document Gunyah VM Manager
virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource
gunyah: vm_mgr: Add framework to add VM Functions
virt: gunyah: Add resource tickets
virt: gunyah: Add IO handlers
virt: gunyah: Add proxy-scheduled vCPUs
virt: gunyah: Add hypercalls for sending doorbell
virt: gunyah: Add irqfd interface
virt: gunyah: Add ioeventfd
MAINTAINERS: Add Gunyah hypervisor drivers section
.../bindings/firmware/gunyah-hypervisor.yaml | 82 ++
.../userspace-api/ioctl/ioctl-number.rst | 1 +
Documentation/virt/gunyah/index.rst | 114 +++
Documentation/virt/gunyah/message-queue.rst | 71 ++
Documentation/virt/gunyah/vm-manager.rst | 151 +++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 13 +
arch/arm64/Kbuild | 1 +
arch/arm64/gunyah/Makefile | 3 +
arch/arm64/gunyah/gunyah_hypercall.c | 148 +++
arch/arm64/include/asm/gunyah.h | 23 +
drivers/firmware/Kconfig | 2 +
drivers/firmware/qcom_scm.c | 100 ++
drivers/mailbox/Makefile | 2 +
drivers/mailbox/gunyah-msgq.c | 209 +++++
drivers/virt/Kconfig | 2 +
drivers/virt/Makefile | 1 +
drivers/virt/gunyah/Kconfig | 46 +
drivers/virt/gunyah/Makefile | 11 +
drivers/virt/gunyah/gunyah.c | 57 ++
drivers/virt/gunyah/gunyah_ioeventfd.c | 117 +++
drivers/virt/gunyah/gunyah_irqfd.c | 164 ++++
drivers/virt/gunyah/gunyah_platform_hooks.c | 80 ++
drivers/virt/gunyah/gunyah_vcpu.c | 465 +++++++++
drivers/virt/gunyah/rsc_mgr.c | 885 ++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 19 +
drivers/virt/gunyah/rsc_mgr_rpc.c | 497 ++++++++++
drivers/virt/gunyah/vm_mgr.c | 785 ++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 71 ++
drivers/virt/gunyah/vm_mgr_mm.c | 252 +++++
include/linux/gunyah.h | 194 ++++
include/linux/gunyah_rsc_mgr.h | 168 ++++
include/linux/gunyah_vm_mgr.h | 112 +++
include/uapi/linux/gunyah.h | 257 +++++
samples/Kconfig | 10 +
samples/Makefile | 1 +
samples/gunyah/.gitignore | 2 +
samples/gunyah/Makefile | 6 +
samples/gunyah/gunyah_vmm.c | 270 ++++++
samples/gunyah/sample_vm.dts | 68 ++
40 files changed, 5461 insertions(+)
create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
create mode 100644 Documentation/virt/gunyah/index.rst
create mode 100644 Documentation/virt/gunyah/message-queue.rst
create mode 100644 Documentation/virt/gunyah/vm-manager.rst
create mode 100644 arch/arm64/gunyah/Makefile
create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
create mode 100644 arch/arm64/include/asm/gunyah.h
create mode 100644 drivers/mailbox/gunyah-msgq.c
create mode 100644 drivers/virt/gunyah/Kconfig
create mode 100644 drivers/virt/gunyah/Makefile
create mode 100644 drivers/virt/gunyah/gunyah.c
create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
create mode 100644 drivers/virt/gunyah/rsc_mgr.c
create mode 100644 drivers/virt/gunyah/rsc_mgr.h
create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
create mode 100644 drivers/virt/gunyah/vm_mgr.c
create mode 100644 drivers/virt/gunyah/vm_mgr.h
create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
create mode 100644 include/linux/gunyah.h
create mode 100644 include/linux/gunyah_rsc_mgr.h
create mode 100644 include/linux/gunyah_vm_mgr.h
create mode 100644 include/uapi/linux/gunyah.h
create mode 100644 samples/gunyah/.gitignore
create mode 100644 samples/gunyah/Makefile
create mode 100644 samples/gunyah/gunyah_vmm.c
create mode 100644 samples/gunyah/sample_vm.dts
base-commit: 2eb29d59ddf02e39774abfb60b2030b0b7e27c1f
prerequisite-patch-id: 25a39c504532b2fcdf51baff6dc55f7885db2375
prerequisite-patch-id: b48c45acdec06adf37e09fe35e6a9412c5784800
--
2.39.2
Add architecture-independent standard error codes, types, and macros for
Gunyah hypercalls.
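
As a rough illustration of how a caller might consume these codes, the
sketch below mirrors a handful of the enum values in a standalone
userspace program and applies the same mapping as gh_remap_error(). The
demo_ names are hypothetical and exist only so the example builds outside
the kernel; the real definitions are the ones added to
include/linux/gunyah.h by this patch.

```c
#include <assert.h>
#include <errno.h>

/* Userspace mirror of a subset of enum gh_error (values copied from the patch). */
enum demo_gh_error {
	DEMO_GH_ERROR_OK = 0,
	DEMO_GH_ERROR_UNIMPLEMENTED = -1,
	DEMO_GH_ERROR_RETRY = -2,
	DEMO_GH_ERROR_ARG_INVAL = 1,
	DEMO_GH_ERROR_NOMEM = 10,
	DEMO_GH_ERROR_DENIED = 30,
	DEMO_GH_ERROR_BUSY = 31,
	DEMO_GH_ERROR_IDLE = 32,
	DEMO_GH_ERROR_CSPACE_FULL = 54,
	DEMO_GH_ERROR_MSGQUEUE_EMPTY = 60,
	DEMO_GH_ERROR_MSGQUEUE_FULL = 61,
};

/* Same mapping as gh_remap_error(), restricted to the values mirrored above. */
static int demo_remap_error(enum demo_gh_error gh_error)
{
	switch (gh_error) {
	case DEMO_GH_ERROR_OK:
		return 0;
	case DEMO_GH_ERROR_NOMEM:
		return -ENOMEM;
	case DEMO_GH_ERROR_DENIED:
	case DEMO_GH_ERROR_CSPACE_FULL:
		return -EACCES;
	case DEMO_GH_ERROR_BUSY:
	case DEMO_GH_ERROR_IDLE:
		return -EBUSY;
	case DEMO_GH_ERROR_MSGQUEUE_FULL:
	case DEMO_GH_ERROR_MSGQUEUE_EMPTY:
		return -EIO;
	case DEMO_GH_ERROR_UNIMPLEMENTED:
	case DEMO_GH_ERROR_RETRY:
		return -EOPNOTSUPP;
	default:
		/* Anything not listed, such as the ARG_* codes, becomes -EINVAL. */
		return -EINVAL;
	}
}
```

A driver would presumably apply the remap once, at the boundary between a
hypercall wrapper and the rest of the kernel, so that callers further up
only ever see errno values.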
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
include/linux/gunyah.h | 83 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
create mode 100644 include/linux/gunyah.h
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
new file mode 100644
index 000000000000..54b4be71caf7
--- /dev/null
+++ b/include/linux/gunyah.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _LINUX_GUNYAH_H
+#define _LINUX_GUNYAH_H
+
+#include <linux/errno.h>
+#include <linux/limits.h>
+
+/******************************************************************************/
+/* Common arch-independent definitions for Gunyah hypercalls */
+#define GH_CAPID_INVAL U64_MAX
+#define GH_VMID_ROOT_VM 0xff
+
+enum gh_error {
+ GH_ERROR_OK = 0,
+ GH_ERROR_UNIMPLEMENTED = -1,
+ GH_ERROR_RETRY = -2,
+
+ GH_ERROR_ARG_INVAL = 1,
+ GH_ERROR_ARG_SIZE = 2,
+ GH_ERROR_ARG_ALIGN = 3,
+
+ GH_ERROR_NOMEM = 10,
+
+ GH_ERROR_ADDR_OVFL = 20,
+ GH_ERROR_ADDR_UNFL = 21,
+ GH_ERROR_ADDR_INVAL = 22,
+
+ GH_ERROR_DENIED = 30,
+ GH_ERROR_BUSY = 31,
+ GH_ERROR_IDLE = 32,
+
+ GH_ERROR_IRQ_BOUND = 40,
+ GH_ERROR_IRQ_UNBOUND = 41,
+
+ GH_ERROR_CSPACE_CAP_NULL = 50,
+ GH_ERROR_CSPACE_CAP_REVOKED = 51,
+ GH_ERROR_CSPACE_WRONG_OBJ_TYPE = 52,
+ GH_ERROR_CSPACE_INSUF_RIGHTS = 53,
+ GH_ERROR_CSPACE_FULL = 54,
+
+ GH_ERROR_MSGQUEUE_EMPTY = 60,
+ GH_ERROR_MSGQUEUE_FULL = 61,
+};
+
+/**
+ * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux error code
+ * @gh_error: Gunyah hypercall return value
+ */
+static inline int gh_remap_error(enum gh_error gh_error)
+{
+ switch (gh_error) {
+ case GH_ERROR_OK:
+ return 0;
+ case GH_ERROR_NOMEM:
+ return -ENOMEM;
+ case GH_ERROR_DENIED:
+ case GH_ERROR_CSPACE_CAP_NULL:
+ case GH_ERROR_CSPACE_CAP_REVOKED:
+ case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
+ case GH_ERROR_CSPACE_INSUF_RIGHTS:
+ case GH_ERROR_CSPACE_FULL:
+ return -EACCES;
+ case GH_ERROR_BUSY:
+ case GH_ERROR_IDLE:
+ return -EBUSY;
+ case GH_ERROR_IRQ_BOUND:
+ case GH_ERROR_IRQ_UNBOUND:
+ case GH_ERROR_MSGQUEUE_FULL:
+ case GH_ERROR_MSGQUEUE_EMPTY:
+ return -EIO;
+ case GH_ERROR_UNIMPLEMENTED:
+ case GH_ERROR_RETRY:
+ return -EOPNOTSUPP;
+ default:
+ return -EINVAL;
+ }
+}
+
+#endif
--
2.39.2
Add hypercalls to identify when Linux is running as a virtual machine
under Gunyah.
There are two calls to help identify Gunyah:
1. arch_is_gh_guest() queries the hypervisor UID and returns true when it
matches a known Gunyah UUID.
2. gh_hypercall_hyp_identify() returns build information and a set of
feature flags that are supported by Gunyah.
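
To make the api_info layout concrete, here is a minimal userspace sketch
that unpacks the first word returned by gh_hypercall_hyp_identify(). The
bit positions are taken from the GH_API_INFO_* macros added in this patch;
the demo_ struct and function are hypothetical stand-ins (the kernel code
would use FIELD_GET() with the real macros instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout copied from the GH_API_INFO_* macros in this patch. */
#define DEMO_API_INFO_API_VERSION_MASK	0x3fffULL	/* bits 13:0 */
#define DEMO_API_INFO_BIG_ENDIAN	(1ULL << 14)
#define DEMO_API_INFO_IS_64BIT		(1ULL << 15)
#define DEMO_API_INFO_VARIANT_SHIFT	56		/* bits 63:56 */

/* Hypothetical decoded view of api_info; not a structure from the patch. */
struct demo_api_info {
	uint16_t version;
	bool big_endian;
	bool is_64bit;
	uint8_t variant;
};

static struct demo_api_info demo_decode_api_info(uint64_t api_info)
{
	struct demo_api_info out;

	out.version = (uint16_t)(api_info & DEMO_API_INFO_API_VERSION_MASK);
	out.big_endian = (api_info & DEMO_API_INFO_BIG_ENDIAN) != 0;
	out.is_64bit = (api_info & DEMO_API_INFO_IS_64BIT) != 0;
	/* The variant occupies the top byte, so a plain shift isolates it. */
	out.variant = (uint8_t)(api_info >> DEMO_API_INFO_VARIANT_SHIFT);
	return out;
}
```

With this layout, a guest that only supports GH_API_V1 could compare the
decoded version against 1 and bail out early on anything else.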
Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/Kbuild | 1 +
arch/arm64/gunyah/Makefile | 3 ++
arch/arm64/gunyah/gunyah_hypercall.c | 64 ++++++++++++++++++++++++++++
drivers/virt/Kconfig | 2 +
drivers/virt/gunyah/Kconfig | 13 ++++++
include/linux/gunyah.h | 28 ++++++++++++
6 files changed, 111 insertions(+)
create mode 100644 arch/arm64/gunyah/Makefile
create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
create mode 100644 drivers/virt/gunyah/Kconfig
diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
index 5bfbf7d79c99..e4847ba0e3c9 100644
--- a/arch/arm64/Kbuild
+++ b/arch/arm64/Kbuild
@@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/
obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
+obj-$(CONFIG_GUNYAH) += gunyah/
obj-$(CONFIG_CRYPTO) += crypto/
# for cleaning
diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
new file mode 100644
index 000000000000..84f1e38cafb1
--- /dev/null
+++ b/arch/arm64/gunyah/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
new file mode 100644
index 000000000000..0d14e767e2c8
--- /dev/null
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/arm-smccc.h>
+#include <linux/module.h>
+#include <linux/gunyah.h>
+#include <linux/uuid.h>
+
+static const uuid_t gh_known_uuids[] = {
+ /* Qualcomm's version of Gunyah {19bd54bd-0b37-571b-946f-609b54539de6} */
+ UUID_INIT(0x19bd54bd, 0x0b37, 0x571b, 0x94, 0x6f, 0x60, 0x9b, 0x54, 0x53, 0x9d, 0xe6),
+ /* Standard version of Gunyah {c1d58fcd-a453-5fdb-9265-ce36673d5f14} */
+ UUID_INIT(0xc1d58fcd, 0xa453, 0x5fdb, 0x92, 0x65, 0xce, 0x36, 0x67, 0x3d, 0x5f, 0x14),
+};
+
+bool arch_is_gh_guest(void)
+{
+ struct arm_smccc_res res;
+ uuid_t uuid;
+ int i;
+
+ arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
+
+ ((u32 *)&uuid.b[0])[0] = lower_32_bits(res.a0);
+ ((u32 *)&uuid.b[0])[1] = lower_32_bits(res.a1);
+ ((u32 *)&uuid.b[0])[2] = lower_32_bits(res.a2);
+ ((u32 *)&uuid.b[0])[3] = lower_32_bits(res.a3);
+
+ for (i = 0; i < ARRAY_SIZE(gh_known_uuids); i++)
+ if (uuid_equal(&uuid, &gh_known_uuids[i]))
+ return true;
+
+ return false;
+}
+EXPORT_SYMBOL_GPL(arch_is_gh_guest);
+
+#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, \
+ fn)
+
+#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+
+/**
+ * gh_hypercall_hyp_identify() - Returns build information and feature flags
+ * supported by Gunyah.
+ * @hyp_identity: filled by the hypercall with the API info and feature flags.
+ */
+void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
+
+ hyp_identity->api_info = res.a0;
+ hyp_identity->flags[0] = res.a1;
+ hyp_identity->flags[1] = res.a2;
+ hyp_identity->flags[2] = res.a3;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index f79ab13a5c28..85bd6626ffc9 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
source "drivers/virt/coco/tdx-guest/Kconfig"
+source "drivers/virt/gunyah/Kconfig"
+
endif
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
new file mode 100644
index 000000000000..1a737694c333
--- /dev/null
+++ b/drivers/virt/gunyah/Kconfig
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config GUNYAH
+ tristate "Gunyah Virtualization drivers"
+ depends on ARM64
+ depends on MAILBOX
+ help
+ The Gunyah drivers are the helper interfaces that run in a guest VM
+ such as basic inter-VM IPC and signaling mechanisms, and higher level
+ services such as memory/device sharing, IRQ sharing, and so on.
+
+ Say Y/M here to enable the drivers needed to interact in a Gunyah
+ virtual environment.
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 54b4be71caf7..bd080e3a6fc9 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -6,8 +6,10 @@
#ifndef _LINUX_GUNYAH_H
#define _LINUX_GUNYAH_H
+#include <linux/bitfield.h>
#include <linux/errno.h>
#include <linux/limits.h>
+#include <linux/types.h>
/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
@@ -80,4 +82,30 @@ static inline int gh_remap_error(enum gh_error gh_error)
}
}
+enum gh_api_feature {
+ GH_FEATURE_DOORBELL = 1,
+ GH_FEATURE_MSGQUEUE = 2,
+ GH_FEATURE_VCPU = 5,
+ GH_FEATURE_MEMEXTENT = 6,
+};
+
+bool arch_is_gh_guest(void);
+
+u16 gh_api_version(void);
+bool gh_api_has_feature(enum gh_api_feature feature);
+
+#define GH_API_V1 1
+
+#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
+#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
+#define GH_API_INFO_IS_64BIT BIT_ULL(15)
+#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
+
+struct gh_hypercall_hyp_identify_resp {
+ u64 api_info;
+ u64 flags[3];
+};
+
+void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
+
#endif
--
2.39.2
Add hypercalls to send and receive messages on a Gunyah message queue.
Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/gunyah/gunyah_hypercall.c | 31 ++++++++++++++++++++++++++++
include/linux/gunyah.h | 6 ++++++
2 files changed, 37 insertions(+)
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index 0d14e767e2c8..3420d8f286a9 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -41,6 +41,8 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
fn)
#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
+#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
/**
* gh_hypercall_hyp_identify() - Returns build information and feature flags
@@ -60,5 +62,34 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
}
EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
+enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, (uintptr_t)buff, tx_flags, 0, &res);
+
+ if (res.a0 == GH_ERROR_OK)
+ *ready = !!res.a1;
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
+
+enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
+ bool *ready)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, (uintptr_t)buff, size, 0, &res);
+
+ if (res.a0 == GH_ERROR_OK) {
+ *recv_size = res.a1;
+ *ready = !!res.a2;
+ }
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index bd080e3a6fc9..18cfbf5ee48b 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -108,4 +108,10 @@ struct gh_hypercall_hyp_identify_resp {
void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
+#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
+
+enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready);
+enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
+ bool *ready);
+
#endif
--
2.39.2
Add a sample Gunyah VMM capable of launching a non-proxy scheduled VM.
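
Before copying the image, DTB and ramdisk into guest memory, the sample
checks that each blob fits at its offset. A minimal sketch of that check
follows, written as two comparisons so that offset + size cannot wrap
around. demo_fits_in_guest() is a hypothetical helper for illustration and
is not part of the sample itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the byte range [offset, offset + size) lies entirely
 * within a guest memory region of guest_size bytes.
 */
static bool demo_fits_in_guest(uint64_t offset, uint64_t size, uint64_t guest_size)
{
	/* Checking offset first makes the subtraction below safe. */
	if (offset > guest_size)
		return false;

	return size <= guest_size - offset;
}
```

With the sample's defaults (100 MB of guest memory and a ramdisk offset of
0x4600000), this leaves 0x1e00000 bytes, or 30 MB, for the ramdisk.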
Signed-off-by: Elliot Berman <[email protected]>
---
samples/Kconfig | 10 ++
samples/Makefile | 1 +
samples/gunyah/.gitignore | 2 +
samples/gunyah/Makefile | 6 +
samples/gunyah/gunyah_vmm.c | 270 +++++++++++++++++++++++++++++++++++
samples/gunyah/sample_vm.dts | 68 +++++++++
6 files changed, 357 insertions(+)
create mode 100644 samples/gunyah/.gitignore
create mode 100644 samples/gunyah/Makefile
create mode 100644 samples/gunyah/gunyah_vmm.c
create mode 100644 samples/gunyah/sample_vm.dts
diff --git a/samples/Kconfig b/samples/Kconfig
index 30ef8bd48ba3..11070bf02bd7 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -273,6 +273,16 @@ config SAMPLE_CORESIGHT_SYSCFG
This demonstrates how a user may create their own CoreSight
configurations and easily load them into the system at runtime.
+config SAMPLE_GUNYAH
+ bool "Build example Gunyah Virtual Machine Manager"
+ depends on CC_CAN_LINK && HEADERS_INSTALL
+ depends on GUNYAH
+ help
+ Build an example Gunyah VMM userspace program capable of launching
+ a basic virtual machine under the Gunyah hypervisor.
+ This demonstrates how to create a virtual machine under the Gunyah
+ hypervisor.
+
source "samples/rust/Kconfig"
endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index 7cb632ef88ee..a65555802642 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -37,3 +37,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak/
obj-$(CONFIG_SAMPLE_CORESIGHT_SYSCFG) += coresight/
obj-$(CONFIG_SAMPLE_FPROBE) += fprobe/
obj-$(CONFIG_SAMPLES_RUST) += rust/
+obj-$(CONFIG_SAMPLE_GUNYAH) += gunyah/
diff --git a/samples/gunyah/.gitignore b/samples/gunyah/.gitignore
new file mode 100644
index 000000000000..adc7d1589fde
--- /dev/null
+++ b/samples/gunyah/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+/gunyah_vmm
diff --git a/samples/gunyah/Makefile b/samples/gunyah/Makefile
new file mode 100644
index 000000000000..faf14f9bb337
--- /dev/null
+++ b/samples/gunyah/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+userprogs-always-y += gunyah_vmm
+dtb-y += sample_vm.dtb
+
+userccflags += -I usr/include
diff --git a/samples/gunyah/gunyah_vmm.c b/samples/gunyah/gunyah_vmm.c
new file mode 100644
index 000000000000..d0ba9c20cb13
--- /dev/null
+++ b/samples/gunyah/gunyah_vmm.c
@@ -0,0 +1,270 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <getopt.h>
+#include <limits.h>
+#include <stdint.h>
+#include <fcntl.h>
+#include <string.h>
+#include <sys/sysmacros.h>
+#define __USE_GNU
+#include <sys/mman.h>
+
+#include <linux/gunyah.h>
+
+struct vm_config {
+ int image_fd;
+ int dtb_fd;
+ int ramdisk_fd;
+
+ uint64_t guest_base;
+ uint64_t guest_size;
+
+ uint64_t image_offset;
+ off_t image_size;
+ uint64_t dtb_offset;
+ off_t dtb_size;
+ uint64_t ramdisk_offset;
+ off_t ramdisk_size;
+};
+
+static struct option options[] = {
+ { "help", no_argument, NULL, 'h' },
+ { "image", required_argument, NULL, 'i' },
+ { "dtb", required_argument, NULL, 'd' },
+ { "ramdisk", optional_argument, NULL, 'r' },
+ { "base", optional_argument, NULL, 'B' },
+ { "size", optional_argument, NULL, 'S' },
+ { "image_offset", optional_argument, NULL, 'I' },
+ { "dtb_offset", optional_argument, NULL, 'D' },
+ { "ramdisk_offset", optional_argument, NULL, 'R' },
+ { }
+};
+
+static void print_help(char *cmd)
+{
+ printf("gunyah_vmm, a sample tool to launch Gunyah VMs\n"
+ "Usage: %s <options>\n"
+ " --help, -h this menu\n"
+ " --image, -i <image> VM image file to load (e.g. a kernel Image) [Required]\n"
+ " --dtb, -d <dtb> Devicetree to load [Required]\n"
+ " --ramdisk, -r <ramdisk> Ramdisk to load\n"
+ " --base, -B <address> Set the base address of guest's memory [Default: 0x80000000]\n"
+ " --size, -S <number> Size of the guest's memory in bytes [Default: 0x6400000 (100 MB)]\n"
+ " --image_offset, -I <number> Offset into guest memory to load the VM image file [Default: 0]\n"
+ " --dtb_offset, -D <number> Offset into guest memory to load the DTB [Default: 0x45f0000]\n"
+ " --ramdisk_offset, -R <number> Offset into guest memory to load a ramdisk [Default: 0x4600000]\n"
+ , cmd);
+}
+
+int main(int argc, char **argv)
+{
+ int gunyah_fd, vm_fd, guest_fd;
+ struct gh_userspace_memory_region guest_mem_desc = { 0 };
+ struct gh_vm_dtb_config dtb_config = { 0 };
+ char *guest_mem;
+ struct vm_config config = {
+ /* Defaults good enough to boot static kernel and a basic ramdisk */
+ .ramdisk_fd = -1,
+ .guest_base = 0x80000000,
+ .guest_size = 0x6400000, /* 100 MB */
+ .image_offset = 0,
+ .dtb_offset = 0x45f0000,
+ .ramdisk_offset = 0x4600000, /* put at +70MB (30MB for ramdisk) */
+ };
+ struct stat st;
+ int opt, optidx, ret = 0;
+ long l;
+
+ while ((opt = getopt_long(argc, argv, "hi:d:r:B:S:I:D:R:", options, &optidx)) != -1) {
+ switch (opt) {
+ case 'i':
+ config.image_fd = open(optarg, O_RDONLY | O_CLOEXEC);
+ if (config.image_fd < 0) {
+ perror("Failed to open image");
+ return -1;
+ }
+ if (stat(optarg, &st) < 0) {
+ perror("Failed to stat image");
+ return -1;
+ }
+ config.image_size = st.st_size;
+ break;
+ case 'd':
+ config.dtb_fd = open(optarg, O_RDONLY | O_CLOEXEC);
+ if (config.dtb_fd < 0) {
+ perror("Failed to open dtb");
+ return -1;
+ }
+ if (stat(optarg, &st) < 0) {
+ perror("Failed to stat dtb");
+ return -1;
+ }
+ config.dtb_size = st.st_size;
+ break;
+ case 'r':
+ config.ramdisk_fd = open(optarg, O_RDONLY | O_CLOEXEC);
+ if (config.ramdisk_fd < 0) {
+ perror("Failed to open ramdisk");
+ return -1;
+ }
+ if (stat(optarg, &st) < 0) {
+ perror("Failed to stat ramdisk");
+ return -1;
+ }
+ config.ramdisk_size = st.st_size;
+ break;
+ case 'B':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse base address");
+ return -1;
+ }
+ config.guest_base = l;
+ break;
+ case 'S':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse memory size");
+ return -1;
+ }
+ config.guest_size = l;
+ break;
+ case 'I':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse image offset");
+ return -1;
+ }
+ config.image_offset = l;
+ break;
+ case 'D':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse dtb offset");
+ return -1;
+ }
+ config.dtb_offset = l;
+ break;
+ case 'R':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse ramdisk offset");
+ return -1;
+ }
+ config.ramdisk_offset = l;
+ break;
+ case 'h':
+ print_help(argv[0]);
+ return 0;
+ default:
+ print_help(argv[0]);
+ return -1;
+ }
+ }
+
+ if (!config.image_fd || !config.dtb_fd) {
+ print_help(argv[0]);
+ return -1;
+ }
+
+ if (config.image_offset + config.image_size > config.guest_size) {
+ fprintf(stderr, "Image offset and size puts it outside guest memory. Make image smaller or increase guest memory size.\n");
+ return -1;
+ }
+
+ if (config.dtb_offset + config.dtb_size > config.guest_size) {
+ fprintf(stderr, "DTB offset and size puts it outside guest memory. Make dtb smaller or increase guest memory size.\n");
+ return -1;
+ }
+
+ if (config.ramdisk_fd != -1 &&
+ config.ramdisk_offset + config.ramdisk_size > config.guest_size) {
+ fprintf(stderr, "Ramdisk offset and size puts it outside guest memory. Make ramdisk smaller or increase guest memory size.\n");
+ return -1;
+ }
+
+ gunyah_fd = open("/dev/gunyah", O_RDWR | O_CLOEXEC);
+ if (gunyah_fd < 0) {
+ perror("Failed to open /dev/gunyah");
+ return -1;
+ }
+
+ vm_fd = ioctl(gunyah_fd, GH_CREATE_VM, 0);
+ if (vm_fd < 0) {
+ perror("Failed to create vm");
+ return -1;
+ }
+
+ guest_fd = memfd_create("guest_memory", MFD_CLOEXEC);
+ if (guest_fd < 0) {
+ perror("Failed to create guest memfd");
+ return -1;
+ }
+
+ if (ftruncate(guest_fd, config.guest_size) < 0) {
+ perror("Failed to grow guest memory");
+ return -1;
+ }
+
+ guest_mem = mmap(NULL, config.guest_size, PROT_READ | PROT_WRITE, MAP_SHARED, guest_fd, 0);
+ if (guest_mem == MAP_FAILED) {
+ perror("Failed to mmap guest memory");
+ return -1;
+ }
+
+ if (read(config.image_fd, guest_mem + config.image_offset, config.image_size) < 0) {
+ perror("Failed to read image into guest memory");
+ return -1;
+ }
+
+ if (read(config.dtb_fd, guest_mem + config.dtb_offset, config.dtb_size) < 0) {
+ perror("Failed to read dtb into guest memory");
+ return -1;
+ }
+
+ if (config.ramdisk_fd > 0 &&
+ read(config.ramdisk_fd, guest_mem + config.ramdisk_offset,
+ config.ramdisk_size) < 0) {
+ perror("Failed to read ramdisk into guest memory");
+ return -1;
+ }
+
+ guest_mem_desc.label = 0;
+ guest_mem_desc.flags = GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC;
+ guest_mem_desc.guest_phys_addr = config.guest_base;
+ guest_mem_desc.memory_size = config.guest_size;
+ guest_mem_desc.userspace_addr = (__u64)guest_mem;
+
+ if (ioctl(vm_fd, GH_VM_SET_USER_MEM_REGION, &guest_mem_desc) < 0) {
+ perror("Failed to register guest memory with VM");
+ return -1;
+ }
+
+ dtb_config.guest_phys_addr = config.guest_base + config.dtb_offset;
+ dtb_config.size = config.dtb_size;
+ if (ioctl(vm_fd, GH_VM_SET_DTB_CONFIG, &dtb_config) < 0) {
+ perror("Failed to set DTB configuration for VM");
+ return -1;
+ }
+
+ ret = ioctl(vm_fd, GH_VM_START);
+ if (ret) {
+ perror("GH_VM_START failed");
+ return -1;
+ }
+
+ while (1)
+ sleep(10);
+
+ return 0;
+}
diff --git a/samples/gunyah/sample_vm.dts b/samples/gunyah/sample_vm.dts
new file mode 100644
index 000000000000..293bbc0469c8
--- /dev/null
+++ b/samples/gunyah/sample_vm.dts
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+/dts-v1/;
+
+/ {
+ #address-cells = <2>;
+ #size-cells = <2>;
+ interrupt-parent = <&intc>;
+
+ chosen {
+ bootargs = "nokaslr";
+ };
+
+ cpus {
+ #address-cells = <0x2>;
+ #size-cells = <0>;
+
+ cpu@0 {
+ device_type = "cpu";
+ compatible = "arm,armv8";
+ reg = <0 0>;
+ };
+ };
+
+ intc: interrupt-controller@3FFF0000 {
+ compatible = "arm,gic-v3";
+ #interrupt-cells = <3>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ interrupt-controller;
+ reg = <0 0x3FFF0000 0 0x10000>,
+ <0 0x3FFD0000 0 0x20000>;
+ };
+
+ timer {
+ compatible = "arm,armv8-timer";
+ always-on;
+ interrupts = <1 13 0x108>,
+ <1 14 0x108>,
+ <1 11 0x108>,
+ <1 10 0x108>;
+ clock-frequency = <19200000>;
+ };
+
+ gunyah-vm-config {
+ image-name = "linux_vm_0";
+
+ memory {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ base-address = <0 0x80000000>;
+ };
+
+ interrupts {
+ config = <&intc>;
+ };
+
+ vcpus {
+ affinity-map = < 0 >;
+ sched-priority = < (-1) >;
+ sched-timeslice = < 2000 >;
+ };
+ };
+};
--
2.39.2
Gunyah VM manager is a kernel module which exposes an interface to
Gunyah userspace to load, run, and interact with other Gunyah virtual
machines. The interface is a character device at /dev/gunyah.
Add a basic VM manager driver. Upcoming patches will add more ioctls
into this driver.
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
.../userspace-api/ioctl/ioctl-number.rst | 1 +
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/rsc_mgr.c | 38 +++++-
drivers/virt/gunyah/vm_mgr.c | 116 ++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 23 ++++
include/uapi/linux/gunyah.h | 23 ++++
6 files changed, 201 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/gunyah/vm_mgr.c
create mode 100644 drivers/virt/gunyah/vm_mgr.h
create mode 100644 include/uapi/linux/gunyah.h
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 0a1882e296ae..2513324ae7be 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -137,6 +137,7 @@ Code Seq# Include File Comments
'F' DD video/sstfb.h conflict!
'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict!
'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict!
+'G' 00-0F linux/gunyah.h conflict!
'H' 00-7F linux/hiddev.h conflict!
'H' 00-0F linux/hidraw.h conflict!
'H' 01 linux/mei.h conflict!
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index de29769f2f3f..03951cf82023 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,5 +2,5 @@
obj-$(CONFIG_GUNYAH) += gunyah.o
-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
index 67813c9a52db..d7ce692d0067 100644
--- a/drivers/virt/gunyah/rsc_mgr.c
+++ b/drivers/virt/gunyah/rsc_mgr.c
@@ -15,8 +15,10 @@
#include <linux/completion.h>
#include <linux/gunyah_rsc_mgr.h>
#include <linux/platform_device.h>
+#include <linux/miscdevice.h>
#include "rsc_mgr.h"
+#include "vm_mgr.h"
#define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
#define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
@@ -129,6 +131,7 @@ struct gh_rm_connection {
* @cache: cache for allocating Tx messages
* @send_lock: synchronization to allow only one request to be sent at a time
* @nh: notifier chain for clients interested in RM notification messages
+ * @miscdev: /dev/gunyah
*/
struct gh_rm {
struct device *dev;
@@ -145,6 +148,8 @@ struct gh_rm {
struct kmem_cache *cache;
struct mutex send_lock;
struct blocking_notifier_head nh;
+
+ struct miscdevice miscdev;
};
/**
@@ -593,6 +598,21 @@ void gh_rm_put(struct gh_rm *rm)
}
EXPORT_SYMBOL_GPL(gh_rm_put);
+static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+ struct miscdevice *miscdev = filp->private_data;
+ struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
+
+ return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
+}
+
+static const struct file_operations gh_dev_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = gh_dev_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
+ .llseek = noop_llseek,
+};
+
static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool tx,
struct gh_resource *ghrsc)
{
@@ -651,7 +671,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
- return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
+ ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
+ if (ret)
+ goto err_cache;
+
+ rm->miscdev.name = "gunyah";
+ rm->miscdev.minor = MISC_DYNAMIC_MINOR;
+ rm->miscdev.fops = &gh_dev_fops;
+
+ ret = misc_register(&rm->miscdev);
+ if (ret)
+ goto err_msgq;
+
+ return 0;
+err_msgq:
+ mbox_free_channel(gh_msgq_chan(&rm->msgq));
+ gh_msgq_remove(&rm->msgq);
err_cache:
kmem_cache_destroy(rm->cache);
return ret;
@@ -661,6 +696,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
{
struct gh_rm *rm = platform_get_drvdata(pdev);
+ misc_deregister(&rm->miscdev);
mbox_free_channel(gh_msgq_chan(&rm->msgq));
gh_msgq_remove(&rm->msgq);
kmem_cache_destroy(rm->cache);
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
new file mode 100644
index 000000000000..dbacf36af72d
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -0,0 +1,116 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gh_vm_mgr: " fmt
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+
+#include <uapi/linux/gunyah.h>
+
+#include "vm_mgr.h"
+
+static void gh_vm_free(struct work_struct *work)
+{
+ struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
+ int ret;
+
+ ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
+ if (ret)
+ pr_warn("Failed to deallocate vmid: %d\n", ret);
+
+ gh_rm_put(ghvm->rm);
+ kfree(ghvm);
+}
+
+static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
+{
+ struct gh_vm *ghvm;
+ int vmid;
+
+ vmid = gh_rm_alloc_vmid(rm, 0);
+ if (vmid < 0)
+ return ERR_PTR(vmid);
+
+ ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
+ if (!ghvm) {
+ gh_rm_dealloc_vmid(rm, vmid);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ ghvm->parent = gh_rm_get(rm);
+ ghvm->vmid = vmid;
+ ghvm->rm = rm;
+
+ INIT_WORK(&ghvm->free_work, gh_vm_free);
+
+ return ghvm;
+}
+
+static int gh_vm_release(struct inode *inode, struct file *filp)
+{
+ struct gh_vm *ghvm = filp->private_data;
+
+ /* The VM will be reset and RM calls will be made, which can sleep interruptibly.
+ * Defer to a work item so this thread can still receive signals.
+ */
+ schedule_work(&ghvm->free_work);
+ return 0;
+}
+
+static const struct file_operations gh_vm_fops = {
+ .release = gh_vm_release,
+ .llseek = noop_llseek,
+};
+
+static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
+{
+ struct gh_vm *ghvm;
+ struct file *file;
+ int fd, err;
+
+ /* arg reserved for future use. */
+ if (arg)
+ return -EINVAL;
+
+ ghvm = gh_vm_alloc(rm);
+ if (IS_ERR(ghvm))
+ return PTR_ERR(ghvm);
+
+ fd = get_unused_fd_flags(O_CLOEXEC);
+ if (fd < 0) {
+ err = fd;
+ goto err_destroy_vm;
+ }
+
+ file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
+ if (IS_ERR(file)) {
+ err = PTR_ERR(file);
+ goto err_put_fd;
+ }
+
+ fd_install(fd, file);
+
+ return fd;
+
+err_put_fd:
+ put_unused_fd(fd);
+err_destroy_vm:
+ gh_vm_free(&ghvm->free_work);
+ return err;
+}
+
+long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
+{
+ switch (cmd) {
+ case GH_CREATE_VM:
+ return gh_dev_ioctl_create_vm(rm, arg);
+ default:
+ return -ENOIOCTLCMD;
+ }
+}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
new file mode 100644
index 000000000000..4b22fbcac91c
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _GH_PRIV_VM_MGR_H
+#define _GH_PRIV_VM_MGR_H
+
+#include <linux/gunyah_rsc_mgr.h>
+
+#include <uapi/linux/gunyah.h>
+
+long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
+
+struct gh_vm {
+ u16 vmid;
+ struct gh_rm *rm;
+ struct device *parent;
+
+ struct work_struct free_work;
+};
+
+#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
new file mode 100644
index 000000000000..10ba32d2b0a6
--- /dev/null
+++ b/include/uapi/linux/gunyah.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _UAPI_LINUX_GUNYAH
+#define _UAPI_LINUX_GUNYAH
+
+/*
+ * Userspace interface for /dev/gunyah - gunyah based virtual machine
+ */
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#define GH_IOCTL_TYPE 'G'
+
+/*
+ * ioctls for /dev/gunyah fds:
+ */
+#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
+
+#endif
--
2.39.2
Qualcomm platforms have a firmware entity which performs access control
to physical pages. Dynamically started Gunyah virtual machines use the
QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
to the memory used by guest VMs. Gunyah doesn't do this operation for us
since it is the current VM (typically VMID_HLOS) delegating the access
and not Gunyah itself. Use the Gunyah platform ops to achieve this so
that only Qualcomm platforms attempt to make the needed SCM calls.
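The VMID handling used by both hooks can be sketched in isolation as
below. This is a userspace model under stated assumptions, not the
driver code: the two constants are the ones this patch defines in
drivers/firmware/qcom_scm.c, while the helper names scm_vmid_for() and
scm_owner_mask() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by this patch in drivers/firmware/qcom_scm.c. */
#define QCOM_SCM_RM_MANAGED_VMID	0x3A
#define QCOM_SCM_MAX_MANAGED_VMID	0x3F

/*
 * Hypothetical helper: VMIDs within the firmware-managed range are
 * passed through; anything above collapses to the RM-managed VMID.
 */
static uint16_t scm_vmid_for(uint16_t gunyah_vmid)
{
	if (gunyah_vmid <= QCOM_SCM_MAX_MANAGED_VMID)
		return gunyah_vmid;
	return QCOM_SCM_RM_MANAGED_VMID;
}

/*
 * Hypothetical helper: build the bitmask of current owners that is
 * handed to the firmware as the "source" set when access is moved
 * back to the host.
 */
static uint64_t scm_owner_mask(const uint16_t *vmids, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= 1ull << scm_vmid_for(vmids[i]);

	return mask;
}
```

Because every dynamically allocated VMID maps to the same bit, several
guest ACL entries may contribute only one bit to the source mask.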
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/firmware/Kconfig | 2 +
drivers/firmware/qcom_scm.c | 100 +++++++++++++++++++++++++++++++++
include/linux/gunyah_rsc_mgr.h | 2 +-
3 files changed, 103 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index b59e3041fd62..b888068ff6f2 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -214,6 +214,8 @@ config MTK_ADSP_IPC
config QCOM_SCM
tristate
+ select VIRT_DRIVERS
+ select GUNYAH_PLATFORM_HOOKS
config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
bool "Qualcomm download mode enabled by default"
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index b95616b35bff..89a261a9e021 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -20,6 +20,7 @@
#include <linux/clk.h>
#include <linux/reset-controller.h>
#include <linux/arm-smccc.h>
+#include <linux/gunyah_rsc_mgr.h>
#include "qcom_scm.h"
@@ -30,6 +31,9 @@ module_param(download_mode, bool, 0);
#define SCM_HAS_IFACE_CLK BIT(1)
#define SCM_HAS_BUS_CLK BIT(2)
+#define QCOM_SCM_RM_MANAGED_VMID 0x3A
+#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
+
struct qcom_scm {
struct device *dev;
struct clk *core_clk;
@@ -1299,6 +1303,99 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
}
EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
+static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ struct qcom_scm_vmperm *new_perms;
+ u64 src, src_cpy;
+ int ret = 0, i, n;
+ u16 vmid;
+
+ new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
+ if (!new_perms)
+ return -ENOMEM;
+
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ new_perms[n].vmid = vmid;
+ else
+ new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
+ if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
+ new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
+ if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
+ new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
+ if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
+ new_perms[n].perm |= QCOM_SCM_PERM_READ;
+ }
+
+ src = (1ull << QCOM_SCM_VMID_HLOS);
+
+ for (i = 0; i < mem_parcel->n_mem_entries; i++) {
+ src_cpy = src;
+ ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
+ le64_to_cpu(mem_parcel->mem_entries[i].size),
+ &src_cpy, new_perms, mem_parcel->n_acl_entries);
+ if (ret) {
+ src = 0;
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ src |= (1ull << vmid);
+ else
+ src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
+ }
+
+ new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
+
+ for (i--; i >= 0; i--) {
+ src_cpy = src;
+ WARN_ON_ONCE(qcom_scm_assign_mem(
+ le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
+ le64_to_cpu(mem_parcel->mem_entries[i].size),
+ &src_cpy, new_perms, 1));
+ }
+ break;
+ }
+ }
+
+ kfree(new_perms);
+ return ret;
+}
+
+static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ struct qcom_scm_vmperm new_perms;
+ u64 src = 0, src_cpy;
+ int ret = 0, i, n;
+ u16 vmid;
+
+ new_perms.vmid = QCOM_SCM_VMID_HLOS;
+ new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE | QCOM_SCM_PERM_READ;
+
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ src |= (1ull << vmid);
+ else
+ src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
+ }
+
+ for (i = 0; i < mem_parcel->n_mem_entries; i++) {
+ src_cpy = src;
+ ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
+ le64_to_cpu(mem_parcel->mem_entries[i].size),
+ &src_cpy, &new_perms, 1);
+ WARN_ON_ONCE(ret);
+ }
+
+ return ret;
+}
+
+static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
+ .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
+ .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
+};
+
static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
{
struct device_node *tcsr;
@@ -1502,6 +1599,9 @@ static int qcom_scm_probe(struct platform_device *pdev)
if (download_mode)
qcom_scm_set_download_mode(true);
+ if (devm_gh_rm_register_platform_ops(&pdev->dev, &qcom_scm_gh_rm_platform_ops))
+ dev_warn(__scm->dev, "Gunyah RM platform ops were already registered\n");
+
return 0;
}
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index 515087931a2b..acf8c1545a6c 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -145,7 +145,7 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
struct gh_rm_hyp_resources **resources);
int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
-struct gunyah_rm_platform_ops {
+struct gh_rm_platform_ops {
int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
};
--
2.39.2
Gunyah message queues are a unidirectional inter-VM pipe for messages up
to 1024 bytes. This driver supports pairing a receiver message queue and
a transmitter message queue to expose a single mailbox channel.
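The send path has to reconcile Gunyah's synchronous "ready" flag with
the mailbox framework's asynchronous tx-done reporting. The decision
made in gh_msgq_send_data() can be modelled as a pure function, as
sketched below. All names here (the model_* enums and
model_send_outcome()) are hypothetical, and the enum values are
placeholders rather than Gunyah's real error codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values; the real enum gh_error is in <linux/gunyah.h>. */
enum model_gh_error {
	MODEL_GH_ERROR_OK,
	MODEL_GH_ERROR_MSGQUEUE_FULL,
	MODEL_GH_ERROR_OTHER,
};

enum model_tx_action {
	MODEL_TX_RETRY,		/* send_data returns -EAGAIN */
	MODEL_TX_WAIT_IRQ,	/* sent, queue now full: tx IRQ reports txdone */
	MODEL_TX_DONE_TASKLET,	/* tasklet reports txdone with the result */
};

/* Classify the outcome of one send hypercall. */
static enum model_tx_action model_send_outcome(enum model_gh_error err,
						bool ready)
{
	/* Queue was already full: nothing was sent, caller must retry. */
	if (err == MODEL_GH_ERROR_MSGQUEUE_FULL)
		return MODEL_TX_RETRY;

	/* Sent, but no room for another message until the peer drains. */
	if (err == MODEL_GH_ERROR_OK && !ready)
		return MODEL_TX_WAIT_IRQ;

	/* Sent with room left, or failed with an error for the client. */
	return MODEL_TX_DONE_TASKLET;
}
```

Note that any error other than "queue full" still goes through the
tasklet, so the client sees the failure through its tx_done callback
instead of the mailbox channel stalling.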
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/message-queue.rst | 8 +
drivers/mailbox/Makefile | 2 +
drivers/mailbox/gunyah-msgq.c | 209 ++++++++++++++++++++
include/linux/gunyah.h | 57 ++++++
4 files changed, 276 insertions(+)
create mode 100644 drivers/mailbox/gunyah-msgq.c
diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
index b352918ae54b..70d82a4ef32d 100644
--- a/Documentation/virt/gunyah/message-queue.rst
+++ b/Documentation/virt/gunyah/message-queue.rst
@@ -61,3 +61,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
| | | | | |
| | | | | |
+---------------+ +-----------------+ +---------------+
+
+Gunyah message queues are exposed as mailboxes. To create the mailbox, create
+a mbox_client and call `gh_msgq_init()`. On receipt of the RX_READY interrupt,
+all messages in the RX message queue are read and pushed via the `rx_callback`
+of the registered mbox_client.
+
+.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
+ :identifiers: gh_msgq_init
diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
index fc9376117111..5f929bb55e9a 100644
--- a/drivers/mailbox/Makefile
+++ b/drivers/mailbox/Makefile
@@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
+obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
+
obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
new file mode 100644
index 000000000000..1989298653f9
--- /dev/null
+++ b/drivers/mailbox/gunyah-msgq.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/mailbox_controller.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/gunyah.h>
+#include <linux/printk.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/wait.h>
+
+#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
+
+static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
+{
+ struct gh_msgq *msgq = data;
+ struct gh_msgq_rx_data rx_data;
+ enum gh_error gh_error;
+ bool ready = true;
+
+ while (ready) {
+ gh_error = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
+ &rx_data.data, sizeof(rx_data.data),
+ &rx_data.length, &ready);
+ if (gh_error != GH_ERROR_OK) {
+ if (gh_error != GH_ERROR_MSGQUEUE_EMPTY)
+ dev_warn(msgq->mbox.dev, "Failed to receive data: %d\n", gh_error);
+ break;
+ }
+ mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
+ }
+
+ return IRQ_HANDLED;
+}
+
+/* Fired when message queue transitions from "full" to "space available" to send messages */
+static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
+{
+ struct gh_msgq *msgq = data;
+
+ mbox_chan_txdone(gh_msgq_chan(msgq), 0);
+
+ return IRQ_HANDLED;
+}
+
+/* Fired after sending message and hypercall told us there was more space available. */
+static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
+{
+ struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
+
+ mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
+}
+
+static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
+{
+ struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
+ struct gh_msgq_tx_data *msgq_data = data;
+ u64 tx_flags = 0;
+ enum gh_error gh_error;
+ bool ready;
+
+ if (msgq_data->push)
+ tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
+
+ gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length, msgq_data->data,
+ tx_flags, &ready);
+
+ /*
+ * unlikely because Linux tracks state of msgq and should not try to
+ * send message when msgq is full.
+ */
+ if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
+ return -EAGAIN;
+
+ /*
+ * Propagate all other errors to client. If we return error to mailbox
+ * framework, then no other messages can be sent and nobody will know
+ * to retry this message.
+ */
+ msgq->last_ret = gh_remap_error(gh_error);
+
+ /*
+ * This message was successfully sent, but message queue isn't ready to
+ * accept more messages because it's now full. Mailbox framework
+ * requires that we only report that message was transmitted when
+ * we're ready to transmit another message. We'll get that in the form
+ * of tx IRQ once the other side starts to drain the msgq.
+ */
+ if (gh_error == GH_ERROR_OK) {
+ if (!ready)
+ return 0;
+ } else
+ dev_err(msgq->mbox.dev, "Failed to send data: %d (%d)\n", gh_error, msgq->last_ret);
+
+ /*
+ * We can send more messages. Mailbox framework requires that tx done
+ * happens asynchronously to sending the message. Gunyah message queues
+ * tell us right away on the hypercall return whether we can send more
+ * messages. To work around this, defer the txdone to a tasklet.
+ */
+ tasklet_schedule(&msgq->txdone_tasklet);
+
+ return 0;
+}
+
+static struct mbox_chan_ops gh_msgq_ops = {
+ .send_data = gh_msgq_send_data,
+};
+
+/**
+ * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
+ * @parent: optional, device parent used for the mailbox controller
+ * @msgq: Pointer to the gh_msgq to initialize
+ * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
+ * @tx_ghrsc: optional, the transmission side of the message queue
+ * @rx_ghrsc: optional, the receiving side of the message queue
+ *
+ * At least one of tx_ghrsc and rx_ghrsc must not be NULL. Most message queue use cases come with
+ * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
+ * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
+ * is set, the mbox_client must register an .rx_callback() and the message queue driver will
+ * deliver all available messages upon receiving the RX ready interrupt. The messages should be
+ * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
+ * after the callback.
+ *
+ * Return: 0 on success, negative errno otherwise
+ */
+int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
+ struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc)
+{
+ int ret;
+
+ /* Need at least one of tx_ghrsc/rx_ghrsc, each with the matching resource type */
+ if ((!tx_ghrsc && !rx_ghrsc) ||
+ (tx_ghrsc && tx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_TX) ||
+ (rx_ghrsc && rx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_RX))
+ return -EINVAL;
+
+ if (!gh_api_has_feature(GH_FEATURE_MSGQUEUE))
+ return -EOPNOTSUPP;
+
+ msgq->tx_ghrsc = tx_ghrsc;
+ msgq->rx_ghrsc = rx_ghrsc;
+
+ msgq->mbox.dev = parent;
+ msgq->mbox.ops = &gh_msgq_ops;
+ msgq->mbox.num_chans = 1;
+ msgq->mbox.txdone_irq = true;
+ msgq->mbox.chans = &msgq->mbox_chan;
+
+ if (msgq->tx_ghrsc) {
+ ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
+ msgq);
+ if (ret)
+ goto err_chans;
+ }
+
+ if (msgq->rx_ghrsc) {
+ ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
+ IRQF_ONESHOT, "gh_msgq_rx", msgq);
+ if (ret)
+ goto err_tx_irq;
+ }
+
+ tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
+
+ ret = mbox_controller_register(&msgq->mbox);
+ if (ret)
+ goto err_rx_irq;
+
+ ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
+ if (ret)
+ goto err_mbox;
+
+ return 0;
+err_mbox:
+ mbox_controller_unregister(&msgq->mbox);
+err_rx_irq:
+ if (msgq->rx_ghrsc)
+ free_irq(msgq->rx_ghrsc->irq, msgq);
+err_tx_irq:
+ if (msgq->tx_ghrsc)
+ free_irq(msgq->tx_ghrsc->irq, msgq);
+err_chans:
+ /* mbox.chans points at the embedded mbox_chan, so there is nothing to free */
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_msgq_init);
+
+void gh_msgq_remove(struct gh_msgq *msgq)
+{
+ tasklet_kill(&msgq->txdone_tasklet);
+ mbox_controller_unregister(&msgq->mbox);
+
+ if (msgq->rx_ghrsc)
+ free_irq(msgq->rx_ghrsc->irq, msgq);
+
+ if (msgq->tx_ghrsc)
+ free_irq(msgq->tx_ghrsc->irq, msgq);
+
+ /* mbox.chans points at the embedded mbox_chan, so there is nothing to free */
+}
+EXPORT_SYMBOL_GPL(gh_msgq_remove);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Message Queue Driver");
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 18cfbf5ee48b..378bec0f2ce1 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -8,11 +8,68 @@
#include <linux/bitfield.h>
#include <linux/errno.h>
+#include <linux/interrupt.h>
#include <linux/limits.h>
+#include <linux/mailbox_controller.h>
+#include <linux/mailbox_client.h>
#include <linux/types.h>
+/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
+enum gh_resource_type {
+ GH_RESOURCE_TYPE_BELL_TX = 0,
+ GH_RESOURCE_TYPE_BELL_RX = 1,
+ GH_RESOURCE_TYPE_MSGQ_TX = 2,
+ GH_RESOURCE_TYPE_MSGQ_RX = 3,
+ GH_RESOURCE_TYPE_VCPU = 4,
+};
+
+struct gh_resource {
+ enum gh_resource_type type;
+ u64 capid;
+ unsigned int irq;
+};
+
+/*
+ * Gunyah Message Queues
+ */
+
+#define GH_MSGQ_MAX_MSG_SIZE 240
+
+struct gh_msgq_tx_data {
+ size_t length;
+ bool push;
+ char data[];
+};
+
+struct gh_msgq_rx_data {
+ size_t length;
+ char data[GH_MSGQ_MAX_MSG_SIZE];
+};
+
+struct gh_msgq {
+ struct gh_resource *tx_ghrsc;
+ struct gh_resource *rx_ghrsc;
+
+ /* msgq private */
+ int last_ret; /* Linux error, not GH_ERROR_* */
+ struct mbox_chan mbox_chan;
+ struct mbox_controller mbox;
+ struct tasklet_struct txdone_tasklet;
+};
+
+
+int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
+ struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc);
+void gh_msgq_remove(struct gh_msgq *msgq);
+
+static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
+{
+ return &msgq->mbox.chans[0];
+}
+
/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
+
#define GH_CAPID_INVAL U64_MAX
#define GH_VMID_ROOT_VM 0xff
--
2.39.2
When launching a virtual machine, Gunyah userspace allocates memory for
the guest and informs Gunyah about these memory regions through
the GH_VM_SET_USER_MEM_REGION ioctl.
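Once the user pages are pinned, physically contiguous pages are merged
into single memory entries before the parcel is described to the
resource manager. The merging step can be sketched in userspace as
below. This is a model under stated assumptions, not the driver code:
a 4 KiB page size is assumed, and struct model_mem_entry and
model_coalesce() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE	4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one memory entry (base address and size). */
struct model_mem_entry {
	uint64_t base;
	uint64_t size;
};

/*
 * Merge runs of physically contiguous pages into single entries.
 * @pages holds @npages page-aligned physical addresses (npages >= 1);
 * @out must have room for @npages entries. Returns the entry count.
 */
static size_t model_coalesce(const uint64_t *pages, size_t npages,
			     struct model_mem_entry *out)
{
	size_t i, j = 0;

	assert(npages > 0);

	out[0].base = pages[0];
	out[0].size = MODEL_PAGE_SIZE;

	for (i = 1; i < npages; i++) {
		if (pages[i] - pages[i - 1] == MODEL_PAGE_SIZE) {
			/* Extends the current run. */
			out[j].size += MODEL_PAGE_SIZE;
		} else {
			/* Gap (or backwards jump): start a new entry. */
			j++;
			out[j].base = pages[i];
			out[j].size = MODEL_PAGE_SIZE;
		}
	}

	return j + 1;
}
```

In the worst case no two pages are adjacent and the entry count equals
the page count, which is why the scratch array is sized for npages.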
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/vm_mgr.c | 44 ++++++
drivers/virt/gunyah/vm_mgr.h | 25 ++++
drivers/virt/gunyah/vm_mgr_mm.c | 229 ++++++++++++++++++++++++++++++++
include/uapi/linux/gunyah.h | 29 ++++
5 files changed, 328 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 03951cf82023..ff8bc4925392 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,5 +2,5 @@
obj-$(CONFIG_GUNYAH) += gunyah.o
-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index dbacf36af72d..e950274c6a53 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -18,8 +18,16 @@
static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
+ struct gh_vm_mem *mapping, *tmp;
int ret;
+ mutex_lock(&ghvm->mm_lock);
+ list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
+ gh_vm_mem_reclaim(ghvm, mapping);
+ kfree(mapping);
+ }
+ mutex_unlock(&ghvm->mm_lock);
+
ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
if (ret)
pr_warn("Failed to deallocate vmid: %d\n", ret);
@@ -47,11 +55,44 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
ghvm->vmid = vmid;
ghvm->rm = rm;
+ mutex_init(&ghvm->mm_lock);
+ INIT_LIST_HEAD(&ghvm->memory_mappings);
INIT_WORK(&ghvm->free_work, gh_vm_free);
return ghvm;
}
+static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+ struct gh_vm *ghvm = filp->private_data;
+ void __user *argp = (void __user *)arg;
+ long r;
+
+ switch (cmd) {
+ case GH_VM_SET_USER_MEM_REGION: {
+ struct gh_userspace_memory_region region;
+
+ if (!gh_api_has_feature(GH_FEATURE_MEMEXTENT))
+ return -EOPNOTSUPP;
+
+ if (copy_from_user(&region, argp, sizeof(region)))
+ return -EFAULT;
+
+ /* All other flag bits are reserved for future use */
+ if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC))
+ return -EINVAL;
+
+ r = gh_vm_mem_alloc(ghvm, &region);
+ break;
+ }
+ default:
+ r = -ENOTTY;
+ break;
+ }
+
+ return r;
+}
+
static int gh_vm_release(struct inode *inode, struct file *filp)
{
struct gh_vm *ghvm = filp->private_data;
@@ -64,6 +105,9 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
}
static const struct file_operations gh_vm_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = gh_vm_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
.release = gh_vm_release,
.llseek = noop_llseek,
};
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 4b22fbcac91c..c9f6fa5478ed 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -7,17 +7,42 @@
#define _GH_PRIV_VM_MGR_H
#include <linux/gunyah_rsc_mgr.h>
+#include <linux/list.h>
+#include <linux/miscdevice.h>
+#include <linux/mutex.h>
#include <uapi/linux/gunyah.h>
long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
+enum gh_vm_mem_share_type {
+ VM_MEM_SHARE,
+ VM_MEM_LEND,
+};
+
+struct gh_vm_mem {
+ struct list_head list;
+ enum gh_vm_mem_share_type share_type;
+ struct gh_rm_mem_parcel parcel;
+
+ __u64 guest_phys_addr;
+ struct page **pages;
+ unsigned long npages;
+};
+
struct gh_vm {
u16 vmid;
struct gh_rm *rm;
struct device *parent;
struct work_struct free_work;
+ struct mutex mm_lock;
+ struct list_head memory_mappings;
};
+int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
+void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
+int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
+struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label);
+
#endif
diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
new file mode 100644
index 000000000000..db6f55cef37f
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr_mm.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gh_vm_mgr: " fmt
+
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/mm.h>
+
+#include <uapi/linux/gunyah.h>
+
+#include "vm_mgr.h"
+
+static struct gh_vm_mem *__gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label)
+ __must_hold(&ghvm->mm_lock)
+{
+ struct gh_vm_mem *mapping;
+
+ list_for_each_entry(mapping, &ghvm->memory_mappings, list)
+ if (mapping->parcel.label == label)
+ return mapping;
+
+ return NULL;
+}
+
+void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
+ __must_hold(&ghvm->mm_lock)
+{
+ int i, ret = 0;
+
+ if (mapping->parcel.mem_handle != GH_MEM_HANDLE_INVAL) {
+ ret = gh_rm_mem_reclaim(ghvm->rm, &mapping->parcel);
+ if (ret)
+ pr_warn("Failed to reclaim memory parcel for label %d: %d\n",
+ mapping->parcel.label, ret);
+ }
+
+ if (!ret)
+ for (i = 0; i < mapping->npages; i++)
+ unpin_user_page(mapping->pages[i]);
+
+ kfree(mapping->pages);
+ kfree(mapping->parcel.acl_entries);
+ kfree(mapping->parcel.mem_entries);
+
+ list_del(&mapping->list);
+}
+
+struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label)
+{
+ struct gh_vm_mem *mapping;
+ int ret;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ERR_PTR(ret);
+
+ mapping = __gh_vm_mem_find_by_label(ghvm, label);
+ mutex_unlock(&ghvm->mm_lock);
+
+ return mapping ? : ERR_PTR(-ENODEV);
+}
+
+int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
+{
+ struct gh_vm_mem *mapping, *tmp_mapping;
+ struct gh_rm_mem_entry *mem_entries;
+ phys_addr_t curr_page, prev_page;
+ struct gh_rm_mem_parcel *parcel;
+ int i, j, pinned, ret = 0;
+ size_t entry_size;
+ u16 vmid;
+
+ if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
+ !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
+ return -EINVAL;
+
+ if (region->guest_phys_addr + region->memory_size < region->guest_phys_addr)
+ return -EOVERFLOW;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ret;
+
+ mapping = __gh_vm_mem_find_by_label(ghvm, region->label);
+ if (mapping) {
+ mutex_unlock(&ghvm->mm_lock);
+ return -EEXIST;
+ }
+
+ mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
+ if (!mapping) {
+ mutex_unlock(&ghvm->mm_lock);
+ return -ENOMEM;
+ }
+
+ mapping->parcel.label = region->label;
+ mapping->guest_phys_addr = region->guest_phys_addr;
+ mapping->npages = region->memory_size >> PAGE_SHIFT;
+ parcel = &mapping->parcel;
+ parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
+ parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
+
+ /* Check for overlap */
+ list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
+ if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
+ tmp_mapping->guest_phys_addr) ||
+ (mapping->guest_phys_addr >=
+ tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
+ ret = -EEXIST;
+ goto free_mapping;
+ }
+ }
+
+ list_add(&mapping->list, &ghvm->memory_mappings);
+
+ mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
+ if (!mapping->pages) {
+ ret = -ENOMEM;
+ mapping->npages = 0; /* update npages for reclaim */
+ goto reclaim;
+ }
+
+ pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
+ FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
+ if (pinned < 0) {
+ ret = pinned;
+ mapping->npages = 0; /* update npages for reclaim */
+ goto reclaim;
+ } else if (pinned != mapping->npages) {
+ ret = -EFAULT;
+ mapping->npages = pinned; /* update npages for reclaim */
+ goto reclaim;
+ }
+
+ parcel->n_acl_entries = 2;
+ mapping->share_type = VM_MEM_SHARE;
+ parcel->acl_entries = kcalloc(parcel->n_acl_entries, sizeof(*parcel->acl_entries),
+ GFP_KERNEL);
+ if (!parcel->acl_entries) {
+ ret = -ENOMEM;
+ goto reclaim;
+ }
+
+ parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
+
+ if (region->flags & GH_MEM_ALLOW_READ)
+ parcel->acl_entries[0].perms |= GH_RM_ACL_R;
+ if (region->flags & GH_MEM_ALLOW_WRITE)
+ parcel->acl_entries[0].perms |= GH_RM_ACL_W;
+ if (region->flags & GH_MEM_ALLOW_EXEC)
+ parcel->acl_entries[0].perms |= GH_RM_ACL_X;
+
+ if (mapping->share_type == VM_MEM_SHARE) {
+ ret = gh_rm_get_vmid(ghvm->rm, &vmid);
+ if (ret)
+ goto reclaim;
+
+ parcel->acl_entries[1].vmid = cpu_to_le16(vmid);
+ /* Host assumed to have all these permissions. Gunyah will not
+ * grant new permissions if host actually had less than RWX
+ */
+ parcel->acl_entries[1].perms |= GH_RM_ACL_R | GH_RM_ACL_W | GH_RM_ACL_X;
+ }
+
+ mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries), GFP_KERNEL);
+ if (!mem_entries) {
+ ret = -ENOMEM;
+ goto reclaim;
+ }
+
+ /* reduce number of entries by combining contiguous pages into single memory entry */
+ prev_page = page_to_phys(mapping->pages[0]);
+ mem_entries[0].ipa_base = cpu_to_le64(prev_page);
+ entry_size = PAGE_SIZE;
+ for (i = 1, j = 0; i < mapping->npages; i++) {
+ curr_page = page_to_phys(mapping->pages[i]);
+ if (curr_page - prev_page == PAGE_SIZE) {
+ entry_size += PAGE_SIZE;
+ } else {
+ mem_entries[j].size = cpu_to_le64(entry_size);
+ j++;
+ mem_entries[j].ipa_base = cpu_to_le64(curr_page);
+ entry_size = PAGE_SIZE;
+ }
+
+ prev_page = curr_page;
+ }
+ mem_entries[j].size = cpu_to_le64(entry_size);
+
+ parcel->n_mem_entries = j + 1;
+ parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) * parcel->n_mem_entries,
+ GFP_KERNEL);
+ kfree(mem_entries);
+ if (!parcel->mem_entries) {
+ ret = -ENOMEM;
+ goto reclaim;
+ }
+
+ mutex_unlock(&ghvm->mm_lock);
+ return 0;
+reclaim:
+ gh_vm_mem_reclaim(ghvm, mapping);
+free_mapping:
+ kfree(mapping);
+ mutex_unlock(&ghvm->mm_lock);
+ return ret;
+}
+
+int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
+{
+ struct gh_vm_mem *mapping;
+ int ret;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ret;
+
+ mapping = __gh_vm_mem_find_by_label(ghvm, label);
+ if (!mapping)
+ goto out;
+
+ gh_vm_mem_reclaim(ghvm, mapping);
+ kfree(mapping);
+out:
+ mutex_unlock(&ghvm->mm_lock);
+ return ret;
+}
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 10ba32d2b0a6..a19207e3e065 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -20,4 +20,33 @@
*/
#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
+/*
+ * ioctls for VM fds
+ */
+
+#define GH_MEM_ALLOW_READ (1UL << 0)
+#define GH_MEM_ALLOW_WRITE (1UL << 1)
+#define GH_MEM_ALLOW_EXEC (1UL << 2)
+
+/**
+ * struct gh_userspace_memory_region - Userspace memory description for GH_VM_SET_USER_MEM_REGION
+ * @label: Unique identifier for the region.
+ * @flags: Flags for memory parcel behavior
+ * @guest_phys_addr: Location of the memory region in guest's memory space (page-aligned)
+ * @memory_size: Size of the region (page-aligned)
+ * @userspace_addr: Location of the memory region in caller (userspace)'s memory
+ *
+ * See Documentation/virt/gunyah/vm-manager.rst for further details.
+ */
+struct gh_userspace_memory_region {
+ __u32 label;
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size;
+ __u64 userspace_addr;
+};
+
+#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
+ struct gh_userspace_memory_region)
+
#endif
--
2.39.2
The Gunyah resource manager provides an API to manipulate stage 2 page
tables. Manipulations are represented as a memory parcel. Memory parcels
describe a list of memory regions (intermediate physical address and
size), a list of new permissions for VMs, and the memory type (DDR or
MMIO). Memory parcels are uniquely identified by a handle allocated by
Gunyah. Gunyah supports a few types of memory parcel sharing:
- Sharing: the guest and host VM both have access
- Lending: only the guest has access; host VM loses access
- Donating: Permanently lent (not reclaimed even if guest shuts down)
Memory parcels that have been shared or lent can be reclaimed by the
host via an additional call. The reclaim operation restores the original
access the host VM had to the memory parcel and removes the other VM's
access.
Note that memory parcels don't describe where in the guest VM the memory
parcel should reside. The guest VM must accept the memory parcel either
explicitly via a "gh_rm_mem_accept" call (not introduced here) or be
configured to accept it automatically at boot. When the guest VM accepts
the memory parcel, it also specifies the IPA at which it wants to place
the memory parcel.
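
The RPC code in this patch caps one message at GH_RM_MAX_MEM_ENTRIES (512)
memory entries and sends any remainder through MEM_APPEND calls. The
following is a minimal userspace sketch of that splitting, not the driver
code itself: `struct chunk` and `plan_chunks()` are hypothetical names
introduced only for illustration, and no RPC is actually issued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors GH_RM_MAX_MEM_ENTRIES from this patch. */
#define MAX_ENTRIES_PER_MSG 512

/* Hypothetical bookkeeping type: one planned RPC message. */
struct chunk {
	size_t first;   /* index of the first memory entry carried */
	size_t count;   /* number of memory entries carried */
	bool is_append; /* false: initial MEM_SHARE/MEM_LEND, true: MEM_APPEND */
	bool more;      /* initial message only: APPEND flag, more entries follow */
	bool end;       /* append message only: END flag, this is the last one */
};

/*
 * Plan how n_entries memory entries are split across messages.
 * Fills at most max_chunks plans in out and returns the number of
 * messages needed, or 0 if n_entries is 0 or out is too small.
 */
size_t plan_chunks(size_t n_entries, struct chunk *out, size_t max_chunks)
{
	size_t needed, i, first = 0;

	if (!n_entries)
		return 0;

	needed = (n_entries + MAX_ENTRIES_PER_MSG - 1) / MAX_ENTRIES_PER_MSG;
	if (needed > max_chunks)
		return 0;

	for (i = 0; i < needed; i++) {
		size_t left = n_entries - first;
		size_t count = left > MAX_ENTRIES_PER_MSG ? MAX_ENTRIES_PER_MSG : left;

		out[i].first = first;
		out[i].count = count;
		out[i].is_append = i > 0;
		out[i].more = (i == 0) && (needed > 1);
		out[i].end = (i > 0) && (i == needed - 1);
		first += count;
	}

	return needed;
}
```

For example, a parcel with 1200 entries would be planned as one initial
message carrying 512 entries with the APPEND flag set, followed by two
MEM_APPEND messages carrying 512 and 176 entries, the last with the END
flag set.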
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/rsc_mgr_rpc.c | 223 ++++++++++++++++++++++++++++++
include/linux/gunyah_rsc_mgr.h | 48 +++++++
2 files changed, 271 insertions(+)
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index ffcb861a31b5..3df15ad5b97d 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -6,6 +6,12 @@
#include <linux/gunyah_rsc_mgr.h>
#include "rsc_mgr.h"
+/* Message IDs: Memory Management */
+#define GH_RM_RPC_MEM_LEND 0x51000012
+#define GH_RM_RPC_MEM_SHARE 0x51000013
+#define GH_RM_RPC_MEM_RECLAIM 0x51000015
+#define GH_RM_RPC_MEM_APPEND 0x51000018
+
/* Message IDs: VM Management */
#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
@@ -22,6 +28,46 @@ struct gh_rm_vm_common_vmid_req {
__le16 _padding;
} __packed;
+/* Call: MEM_LEND, MEM_SHARE */
+#define GH_MEM_SHARE_REQ_FLAGS_APPEND BIT(1)
+
+struct gh_rm_mem_share_req_header {
+ u8 mem_type;
+ u8 _padding0;
+ u8 flags;
+ u8 _padding1;
+ __le32 label;
+} __packed;
+
+struct gh_rm_mem_share_req_acl_section {
+ __le32 n_entries;
+ struct gh_rm_mem_acl_entry entries[];
+};
+
+struct gh_rm_mem_share_req_mem_section {
+ __le16 n_entries;
+ __le16 _padding;
+ struct gh_rm_mem_entry entries[];
+};
+
+/* Call: MEM_RELEASE */
+struct gh_rm_mem_release_req {
+ __le32 mem_handle;
+ u8 flags; /* currently not used */
+ u8 _padding0;
+ __le16 _padding1;
+} __packed;
+
+/* Call: MEM_APPEND */
+#define GH_MEM_APPEND_REQ_FLAGS_END BIT(0)
+
+struct gh_rm_mem_append_req_header {
+ __le32 mem_handle;
+ u8 flags;
+ u8 _padding0;
+ __le16 _padding1;
+} __packed;
+
/* Call: VM_ALLOC */
struct gh_rm_vm_alloc_vmid_resp {
__le16 vmid;
@@ -51,6 +97,8 @@ struct gh_rm_vm_config_image_req {
__le64 dtb_size;
} __packed;
+#define GH_RM_MAX_MEM_ENTRIES 512
+
/*
* Several RM calls take only a VMID as a parameter and give only standard
* response back. Deduplicate boilerplate code by using this common call.
@@ -64,6 +112,181 @@ static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), NULL, NULL);
}
+static int _gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle, bool end_append,
+ struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
+{
+ struct gh_rm_mem_share_req_mem_section *mem_section;
+ struct gh_rm_mem_append_req_header *req_header;
+ size_t msg_size = 0;
+ void *msg;
+ int ret;
+
+ msg_size += sizeof(struct gh_rm_mem_append_req_header);
+ msg_size += struct_size(mem_section, entries, n_mem_entries);
+
+ msg = kzalloc(msg_size, GFP_KERNEL);
+ if (!msg)
+ return -ENOMEM;
+
+ req_header = msg;
+ mem_section = (void *)req_header + sizeof(struct gh_rm_mem_append_req_header);
+
+ req_header->mem_handle = cpu_to_le32(mem_handle);
+ if (end_append)
+ req_header->flags |= GH_MEM_APPEND_REQ_FLAGS_END;
+
+ mem_section->n_entries = cpu_to_le16(n_mem_entries);
+ memcpy(mem_section->entries, mem_entries, sizeof(*mem_entries) * n_mem_entries);
+
+ ret = gh_rm_call(rm, GH_RM_RPC_MEM_APPEND, msg, msg_size, NULL, NULL);
+ kfree(msg);
+
+ return ret;
+}
+
+static int gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle,
+ struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
+{
+ bool end_append;
+ int ret = 0;
+ size_t n;
+
+ while (n_mem_entries) {
+ if (n_mem_entries > GH_RM_MAX_MEM_ENTRIES) {
+ end_append = false;
+ n = GH_RM_MAX_MEM_ENTRIES;
+ } else {
+ end_append = true;
+ n = n_mem_entries;
+ }
+
+ ret = _gh_rm_mem_append(rm, mem_handle, end_append, mem_entries, n);
+ if (ret)
+ break;
+
+ mem_entries += n;
+ n_mem_entries -= n;
+ }
+
+ return ret;
+}
+
+static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_mem_parcel *p)
+{
+ size_t msg_size = 0, initial_mem_entries = p->n_mem_entries, resp_size;
+ struct gh_rm_mem_share_req_acl_section *acl_section;
+ struct gh_rm_mem_share_req_mem_section *mem_section;
+ struct gh_rm_mem_share_req_header *req_header;
+ u32 *attr_section;
+ __le32 *resp;
+ void *msg;
+ int ret;
+
+ if (!p->acl_entries || !p->n_acl_entries || !p->mem_entries || !p->n_mem_entries ||
+ p->n_acl_entries > U8_MAX || p->mem_handle != GH_MEM_HANDLE_INVAL)
+ return -EINVAL;
+
+ if (initial_mem_entries > GH_RM_MAX_MEM_ENTRIES)
+ initial_mem_entries = GH_RM_MAX_MEM_ENTRIES;
+
+ /* The format of the message goes:
+ * request header
+ * ACL entries (which VMs get what kind of access to this memory parcel)
+ * Memory entries (list of memory regions to share)
+ * Memory attributes (currently unused, we'll hard-code the size to 0)
+ */
+ msg_size += sizeof(struct gh_rm_mem_share_req_header);
+ msg_size += struct_size(acl_section, entries, p->n_acl_entries);
+ msg_size += struct_size(mem_section, entries, initial_mem_entries);
+ msg_size += sizeof(u32); /* for memory attributes, currently unused */
+
+ msg = kzalloc(msg_size, GFP_KERNEL);
+ if (!msg)
+ return -ENOMEM;
+
+ req_header = msg;
+ acl_section = (void *)req_header + sizeof(*req_header);
+ mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
+ attr_section = (void *)mem_section + struct_size(mem_section, entries, initial_mem_entries);
+
+ req_header->mem_type = p->mem_type;
+ if (initial_mem_entries != p->n_mem_entries)
+ req_header->flags |= GH_MEM_SHARE_REQ_FLAGS_APPEND;
+ req_header->label = cpu_to_le32(p->label);
+
+ acl_section->n_entries = cpu_to_le32(p->n_acl_entries);
+ memcpy(acl_section->entries, p->acl_entries, sizeof(*(p->acl_entries)) * p->n_acl_entries);
+
+ mem_section->n_entries = cpu_to_le16(initial_mem_entries);
+ memcpy(mem_section->entries, p->mem_entries,
+ sizeof(*(p->mem_entries)) * initial_mem_entries);
+
+ /* Set n_entries for memory attribute section to 0 */
+ *attr_section = 0;
+
+ ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
+ kfree(msg);
+
+ if (ret)
+ return ret;
+
+ p->mem_handle = le32_to_cpu(*resp);
+
+ if (initial_mem_entries != p->n_mem_entries) {
+ ret = gh_rm_mem_append(rm, p->mem_handle,
+ &p->mem_entries[initial_mem_entries],
+ p->n_mem_entries - initial_mem_entries);
+ if (ret) {
+ gh_rm_mem_reclaim(rm, p);
+ p->mem_handle = GH_MEM_HANDLE_INVAL;
+ }
+ }
+
+ kfree(resp);
+ return ret;
+}
+
+/**
+ * gh_rm_mem_lend() - Lend memory to other virtual machines.
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Package the memory information of the memory to be lent.
+ *
+ * Lending removes Linux's access to the memory while the memory parcel is lent.
+ */
+int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
+{
+ return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_LEND, parcel);
+}
+
+
+/**
+ * gh_rm_mem_share() - Share memory with other virtual machines.
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Package the memory information of the memory to be shared.
+ *
+ * Sharing keeps Linux's access to the memory while the memory parcel is shared.
+ */
+int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
+{
+ return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_SHARE, parcel);
+}
+
+/**
+ * gh_rm_mem_reclaim() - Reclaim a memory parcel
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Package the memory information of the memory to be reclaimed.
+ *
+ * RM maps the associated memory back into the stage-2 page tables of the owner VM.
+ */
+int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
+{
+ struct gh_rm_mem_release_req req = {
+ .mem_handle = cpu_to_le32(parcel->mem_handle),
+ };
+
+ return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
+}
+
/**
* gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
* @rm: Handle to a Gunyah resource manager
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index 6a2f434e67f7..88a429dad09e 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -11,6 +11,7 @@
#include <linux/gunyah.h>
#define GH_VMID_INVAL U16_MAX
+#define GH_MEM_HANDLE_INVAL U32_MAX
struct gh_rm;
int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
@@ -51,7 +52,54 @@ struct gh_rm_vm_status_payload {
#define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
+#define GH_RM_ACL_X BIT(0)
+#define GH_RM_ACL_W BIT(1)
+#define GH_RM_ACL_R BIT(2)
+
+struct gh_rm_mem_acl_entry {
+ __le16 vmid;
+ u8 perms;
+ u8 reserved;
+} __packed;
+
+struct gh_rm_mem_entry {
+ __le64 ipa_base;
+ __le64 size;
+} __packed;
+
+enum gh_rm_mem_type {
+ GH_RM_MEM_TYPE_NORMAL = 0,
+ GH_RM_MEM_TYPE_IO = 1,
+};
+
+/*
+ * struct gh_rm_mem_parcel - Package info about memory to be lent/shared/donated/reclaimed
+ * @mem_type: The type of memory: normal (DDR) or IO
+ * @label: A client-specified identifier which can be used by the other VMs to identify the purpose
+ * of the memory parcel.
+ * @acl_entries: An array of access control entries. Each entry specifies a VM and what access
+ * is allowed for the memory parcel.
+ * @n_acl_entries: Count of the number of entries in the `acl_entries` array.
+ * @mem_entries: A list of regions to be associated with the memory parcel. Addresses should be
+ * (intermediate) physical addresses from Linux's perspective.
+ * @n_mem_entries: Count of the number of entries in the `mem_entries` array.
+ * @mem_handle: On success, filled with memory handle that RM allocates for this memory parcel
+ */
+struct gh_rm_mem_parcel {
+ enum gh_rm_mem_type mem_type;
+ u32 label;
+ size_t n_acl_entries;
+ struct gh_rm_mem_acl_entry *acl_entries;
+ size_t n_mem_entries;
+ struct gh_rm_mem_entry *mem_entries;
+ u32 mem_handle;
+};
+
/* RPC Calls */
+int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
+int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
+int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
+
int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
--
2.39.2
Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/rsc_mgr_rpc.c | 260 ++++++++++++++++++++++++++++++
include/linux/gunyah_rsc_mgr.h | 73 +++++++++
3 files changed, 334 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index cc864ff5abbb..de29769f2f3f 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,5 +2,5 @@
obj-$(CONFIG_GUNYAH) += gunyah.o
-gunyah_rsc_mgr-y += rsc_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
new file mode 100644
index 000000000000..ffcb861a31b5
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -0,0 +1,260 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/gunyah_rsc_mgr.h>
+#include "rsc_mgr.h"
+
+/* Message IDs: VM Management */
+#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
+#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
+#define GH_RM_RPC_VM_START 0x56000004
+#define GH_RM_RPC_VM_STOP 0x56000005
+#define GH_RM_RPC_VM_RESET 0x56000006
+#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
+#define GH_RM_RPC_VM_INIT 0x5600000B
+#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
+#define GH_RM_RPC_VM_GET_VMID 0x56000024
+
+struct gh_rm_vm_common_vmid_req {
+ __le16 vmid;
+ __le16 _padding;
+} __packed;
+
+/* Call: VM_ALLOC */
+struct gh_rm_vm_alloc_vmid_resp {
+ __le16 vmid;
+ __le16 _padding;
+} __packed;
+
+/* Call: VM_STOP */
+#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
+
+#define GH_RM_VM_STOP_REASON_FORCE_STOP 3
+
+struct gh_rm_vm_stop_req {
+ __le16 vmid;
+ u8 flags;
+ u8 _padding;
+ __le32 stop_reason;
+} __packed;
+
+/* Call: VM_CONFIG_IMAGE */
+struct gh_rm_vm_config_image_req {
+ __le16 vmid;
+ __le16 auth_mech;
+ __le32 mem_handle;
+ __le64 image_offset;
+ __le64 image_size;
+ __le64 dtb_offset;
+ __le64 dtb_size;
+} __packed;
+
+/*
+ * Several RM calls take only a VMID as a parameter and give only standard
+ * response back. Deduplicate boilerplate code by using this common call.
+ */
+static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
+{
+ struct gh_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+
+ return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), NULL, NULL);
+}
+
+/**
+ * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: Use 0 to dynamically allocate a VM. A reserved VMID can be supplied
+ * to request allocation of a platform-defined VM.
+ *
+ * Returns - the allocated VMID or negative value on error
+ */
+int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
+{
+ struct gh_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+ struct gh_rm_vm_alloc_vmid_resp *resp_payload;
+ size_t resp_size;
+ void *resp;
+ int ret;
+
+ ret = gh_rm_call(rm, GH_RM_RPC_VM_ALLOC_VMID, &req_payload, sizeof(req_payload), &resp,
+ &resp_size);
+ if (ret)
+ return ret;
+
+ if (!vmid) {
+ resp_payload = resp;
+ ret = le16_to_cpu(resp_payload->vmid);
+ }
+ kfree(resp);
+
+ return ret;
+}
+
+/**
+ * gh_rm_dealloc_vmid() - Dispose of a VMID
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ */
+int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_DEALLOC_VMID, vmid);
+}
+
+/**
+ * gh_rm_vm_reset() - Reset the VM's resources
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ *
+ * While tearing down the VM, request RM to clean up all the VM resources
+ * associated with the VM. Only after this, Linux can clean up all the
+ * references it maintains to resources.
+ */
+int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_RESET, vmid);
+}
+
+/**
+ * gh_rm_vm_start() - Move the VM into "ready to run" state
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ *
+ * On VMs which use proxy scheduling, vcpu_run is needed to actually run the VM.
+ * On VMs which use Gunyah's scheduling, the vCPUs start executing in accordance with Gunyah
+ * scheduling policies.
+ */
+int gh_rm_vm_start(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_START, vmid);
+}
+
+/**
+ * gh_rm_vm_stop() - Send a request to Resource Manager VM to forcibly stop a VM.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ */
+int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid)
+{
+ struct gh_rm_vm_stop_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ .flags = GH_RM_VM_STOP_FLAG_FORCE_STOP,
+ .stop_reason = cpu_to_le32(GH_RM_VM_STOP_REASON_FORCE_STOP),
+ };
+
+ return gh_rm_call(rm, GH_RM_RPC_VM_STOP, &req_payload, sizeof(req_payload), NULL, NULL);
+}
+
+/**
+ * gh_rm_vm_configure() - Prepare a VM to start and provide the common
+ * configuration needed by RM to configure a VM
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ * @auth_mechanism: Authentication mechanism used by resource manager to verify
+ * the virtual machine
+ * @mem_handle: Handle to a previously shared memparcel that contains all parts
+ * of the VM image subject to authentication.
+ * @image_offset: Start address of VM image, relative to the start of memparcel
+ * @image_size: Size of the VM image
+ * @dtb_offset: Start address of the devicetree binary with VM configuration,
+ * relative to start of memparcel.
+ * @dtb_size: Maximum size of devicetree binary. Resource manager applies
+ * an overlay to the DTB and dtb_size should include room for
+ * the overlay.
+ */
+int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
+ u32 mem_handle, u64 image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)
+{
+ struct gh_rm_vm_config_image_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ .auth_mech = cpu_to_le16(auth_mechanism),
+ .mem_handle = cpu_to_le32(mem_handle),
+ .image_offset = cpu_to_le64(image_offset),
+ .image_size = cpu_to_le64(image_size),
+ .dtb_offset = cpu_to_le64(dtb_offset),
+ .dtb_size = cpu_to_le64(dtb_size),
+ };
+
+ return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload, sizeof(req_payload),
+ NULL, NULL);
+}
+
+/**
+ * gh_rm_vm_init() - Move the VM to initialized state.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier
+ *
+ * RM will allocate needed resources for the VM.
+ */
+int gh_rm_vm_init(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_INIT, vmid);
+}
+
+/**
+ * gh_rm_get_hyp_resources() - Retrieve hypervisor resources (capabilities) associated with a VM
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VMID of the other VM to get the resources of
+ * @resources: Set by gh_rm_get_hyp_resources and contains the returned hypervisor resources.
+ */
+int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
+ struct gh_rm_hyp_resources **resources)
+{
+ struct gh_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+ struct gh_rm_hyp_resources *resp;
+ size_t resp_size;
+ int ret;
+
+ ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_HYP_RESOURCES,
+ &req_payload, sizeof(req_payload),
+ (void **)&resp, &resp_size);
+ if (ret)
+ return ret;
+
+ if (!resp_size)
+ return -EBADMSG;
+
+ if (resp_size < struct_size(resp, entries, 0) ||
+ resp_size != struct_size(resp, entries, le32_to_cpu(resp->n_entries))) {
+ kfree(resp);
+ return -EBADMSG;
+ }
+
+ *resources = resp;
+ return 0;
+}
+
+/**
+ * gh_rm_get_vmid() - Retrieve VMID of this virtual machine
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: Filled with the VMID of this VM
+ */
+int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid)
+{
+ static u16 cached_vmid = GH_VMID_INVAL;
+ size_t resp_size;
+ __le32 *resp;
+ int ret;
+
+ if (cached_vmid != GH_VMID_INVAL) {
+ *vmid = cached_vmid;
+ return 0;
+ }
+
+ ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_VMID, NULL, 0, (void **)&resp, &resp_size);
+ if (ret)
+ return ret;
+
+ *vmid = cached_vmid = lower_16_bits(le32_to_cpu(*resp));
+ kfree(resp);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_get_vmid);
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index deca9b3da541..6a2f434e67f7 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -18,4 +18,77 @@ int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
struct device *gh_rm_get(struct gh_rm *rm);
void gh_rm_put(struct gh_rm *rm);
+struct gh_rm_vm_exited_payload {
+ __le16 vmid;
+ __le16 exit_type;
+ __le32 exit_reason_size;
+ u8 exit_reason[];
+} __packed;
+
+#define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
+
+enum gh_rm_vm_status {
+ GH_RM_VM_STATUS_NO_STATE = 0,
+ GH_RM_VM_STATUS_INIT = 1,
+ GH_RM_VM_STATUS_READY = 2,
+ GH_RM_VM_STATUS_RUNNING = 3,
+ GH_RM_VM_STATUS_PAUSED = 4,
+ GH_RM_VM_STATUS_LOAD = 5,
+ GH_RM_VM_STATUS_AUTH = 6,
+ GH_RM_VM_STATUS_INIT_FAILED = 8,
+ GH_RM_VM_STATUS_EXITED = 9,
+ GH_RM_VM_STATUS_RESETTING = 10,
+ GH_RM_VM_STATUS_RESET = 11,
+};
+
+struct gh_rm_vm_status_payload {
+ __le16 vmid;
+ u16 reserved;
+ u8 vm_status;
+ u8 os_status;
+ __le16 app_status;
+} __packed;
+
+#define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
+
+/* RPC Calls */
+int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
+int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
+int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
+int gh_rm_vm_start(struct gh_rm *rm, u16 vmid);
+int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid);
+
+enum gh_rm_vm_auth_mechanism {
+ GH_RM_VM_AUTH_NONE = 0,
+ GH_RM_VM_AUTH_QCOM_PIL_ELF = 1,
+ GH_RM_VM_AUTH_QCOM_ANDROID_PVM = 2,
+};
+
+int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
+ u32 mem_handle, u64 image_offset, u64 image_size,
+ u64 dtb_offset, u64 dtb_size);
+int gh_rm_vm_init(struct gh_rm *rm, u16 vmid);
+
+struct gh_rm_hyp_resource {
+ u8 type;
+ u8 reserved;
+ __le16 partner_vmid;
+ __le32 resource_handle;
+ __le32 resource_label;
+ __le64 cap_id;
+ __le32 virq_handle;
+ __le32 virq;
+ __le64 base;
+ __le64 size;
+} __packed;
+
+struct gh_rm_hyp_resources {
+ __le32 n_entries;
+ struct gh_rm_hyp_resource entries[];
+} __packed;
+
+int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
+ struct gh_rm_hyp_resources **resources);
+int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
+
#endif
--
2.39.2
On Qualcomm platforms, there is a firmware entity which controls access
to physical pages. In order to share memory with another VM, this entity
needs to be informed that the guest VM should have access to the memory.
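
The hook flow added by this patch can be sketched in plain userspace C as
below. This is a simplified model under stated assumptions, not the kernel
code: locking is omitted, `struct parcel` is a hypothetical stand-in for
struct gh_rm_mem_parcel, the `rm_call_result` argument stands in for the
result of the RM RPC, and the demo hooks only count how often they run.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct gh_rm_mem_parcel. */
struct parcel {
	unsigned int label;
};

struct platform_ops {
	int (*pre_mem_share)(struct parcel *p);
	int (*post_mem_reclaim)(struct parcel *p);
};

/* Single registration slot; the real code guards it with an rwsem. */
static struct platform_ops *registered_ops;

int register_platform_ops(struct platform_ops *ops)
{
	if (registered_ops)
		return -EEXIST;
	registered_ops = ops;
	return 0;
}

void unregister_platform_ops(struct platform_ops *ops)
{
	if (registered_ops == ops)
		registered_ops = NULL;
}

/* Both wrappers succeed trivially when no hook is registered. */
int platform_pre_mem_share(struct parcel *p)
{
	if (registered_ops && registered_ops->pre_mem_share)
		return registered_ops->pre_mem_share(p);
	return 0;
}

int platform_post_mem_reclaim(struct parcel *p)
{
	if (registered_ops && registered_ops->post_mem_reclaim)
		return registered_ops->post_mem_reclaim(p);
	return 0;
}

/*
 * Model of the share path: inform the platform first, then issue the
 * RM call; if the RM call fails, undo the platform-side grant.
 */
int share_parcel(struct parcel *p, int rm_call_result)
{
	int ret = platform_pre_mem_share(p);

	if (ret)
		return ret;
	if (rm_call_result) {
		platform_post_mem_reclaim(p);
		return rm_call_result;
	}
	return 0;
}

/* Demo hooks that only count their invocations. */
int pre_calls, reclaim_calls;

int demo_pre_share(struct parcel *p)
{
	(void)p;
	pre_calls++;
	return 0;
}

int demo_post_reclaim(struct parcel *p)
{
	(void)p;
	reclaim_calls++;
	return 0;
}

struct platform_ops demo_ops = {
	.pre_mem_share = demo_pre_share,
	.post_mem_reclaim = demo_post_reclaim,
};
```

Only one set of ops can be registered at a time, so a second registration
is rejected with -EEXIST, matching gh_rm_register_platform_ops() in this
patch.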
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Kconfig | 4 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 3 +
drivers/virt/gunyah/rsc_mgr_rpc.c | 18 ++++-
include/linux/gunyah_rsc_mgr.h | 17 +++++
6 files changed, 121 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 1a737694c333..de815189dab6 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -4,6 +4,7 @@ config GUNYAH
tristate "Gunyah Virtualization drivers"
depends on ARM64
depends on MAILBOX
+ select GUNYAH_PLATFORM_HOOKS
help
The Gunyah drivers are the helper interfaces that run in a guest VM
such as basic inter-VM IPC and signaling mechanisms, and higher level
@@ -11,3 +12,6 @@ config GUNYAH
Say Y/M here to enable the drivers needed to interact in a Gunyah
virtual environment.
+
+config GUNYAH_PLATFORM_HOOKS
+ tristate
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index ff8bc4925392..6b8f84dbfe0d 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_GUNYAH) += gunyah.o
+obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
new file mode 100644
index 000000000000..60da0e154e98
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/module.h>
+#include <linux/rwsem.h>
+#include <linux/gunyah_rsc_mgr.h>
+
+#include "rsc_mgr.h"
+
+static struct gh_rm_platform_ops *rm_platform_ops;
+static DECLARE_RWSEM(rm_platform_ops_lock);
+
+int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->pre_mem_share)
+ ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
+
+int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
+ ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
+
+int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
+{
+ int ret = 0;
+
+ down_write(&rm_platform_ops_lock);
+ if (!rm_platform_ops)
+ rm_platform_ops = platform_ops;
+ else
+ ret = -EEXIST;
+ up_write(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
+
+void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops)
+{
+ down_write(&rm_platform_ops_lock);
+ if (rm_platform_ops == platform_ops)
+ rm_platform_ops = NULL;
+ up_write(&rm_platform_ops_lock);
+}
+EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
+
+static void _devm_gh_rm_unregister_platform_ops(void *data)
+{
+ gh_rm_unregister_platform_ops(data);
+}
+
+int devm_gh_rm_register_platform_ops(struct device *dev, struct gh_rm_platform_ops *ops)
+{
+ int ret;
+
+ ret = gh_rm_register_platform_ops(ops);
+ if (ret)
+ return ret;
+
+ return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops, ops);
+}
+EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Platform Hooks");
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 3665ebc7b020..6838e736f361 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -13,4 +13,7 @@ struct gh_rm;
int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buf_size,
void **resp_buf, size_t *resp_buf_size);
+int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+
#endif
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index 3df15ad5b97d..733be4dc8dd2 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -204,6 +204,12 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
if (!msg)
return -ENOMEM;
+ ret = gh_rm_platform_pre_mem_share(rm, p);
+ if (ret) {
+ kfree(msg);
+ return ret;
+ }
+
req_header = msg;
acl_section = (void *)req_header + sizeof(*req_header);
mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
@@ -227,8 +233,10 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
kfree(msg);
- if (ret)
+ if (ret) {
+ gh_rm_platform_post_mem_reclaim(rm, p);
return ret;
+ }
p->mem_handle = le32_to_cpu(*resp);
@@ -283,8 +291,14 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
struct gh_rm_mem_release_req req = {
.mem_handle = cpu_to_le32(parcel->mem_handle),
};
+ int ret;
+
+ ret = gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
+ /* Do not call platform mem reclaim hooks: the reclaim didn't happen */
+ if (ret)
+ return ret;
- return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
+ return gh_rm_platform_post_mem_reclaim(rm, parcel);
}
/**
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index 8b0b46f28e39..515087931a2b 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -145,4 +145,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
struct gh_rm_hyp_resources **resources);
int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
+struct gh_rm_platform_ops {
+ int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+ int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+};
+
+#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
+int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops);
+void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops);
+int devm_gh_rm_register_platform_ops(struct device *dev, struct gh_rm_platform_ops *ops);
+#else
+static inline int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
+ { return 0; }
+static inline void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops) { }
+static inline int devm_gh_rm_register_platform_ops(struct device *dev,
+ struct gh_rm_platform_ops *ops) { return 0; }
+#endif
+
#endif
--
2.39.2
Add the remaining ioctls needed to support non-proxy VM boot:
- The Gunyah Resource Manager uses the VM's devicetree to configure the
virtual machine. The location of the devicetree in the guest's
physical memory is declared via the GH_VM_SET_DTB_CONFIG ioctl.
- The GH_VM_START ioctl triggers the start of the virtual machine.
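As a rough illustration of how a VMM might use the two ioctls, here is a
minimal user-space sketch. It assumes a VM file descriptor has already been
obtained and its memory regions registered. The struct and ioctl numbers are
mirrored locally from the uapi header in this patch, GH_IOCTL_TYPE is an
assumed value, and gh_boot_vm() is a hypothetical helper, not part of the
series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Local mirror of the uapi additions in this patch. A real program would
 * include <linux/gunyah.h> instead. The value of GH_IOCTL_TYPE is an
 * assumption made for this sketch.
 */
#define GH_IOCTL_TYPE 'G'

struct gh_vm_dtb_config {
	uint64_t guest_phys_addr;	/* address of the DTB in guest memory */
	uint64_t size;			/* maximum size of the DTB */
};
static_assert(sizeof(struct gh_vm_dtb_config) == 16, "unexpected uapi layout");

#define GH_VM_SET_DTB_CONFIG	_IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
#define GH_VM_START		_IO(GH_IOCTL_TYPE, 0x3)

/*
 * Hypothetical helper: declare where the devicetree lives, then start the VM.
 * Returns 0 on success or a negative errno value on failure.
 */
int gh_boot_vm(int vm_fd, uint64_t dtb_addr, uint64_t dtb_size)
{
	struct gh_vm_dtb_config cfg = {
		.guest_phys_addr = dtb_addr,
		.size = dtb_size,
	};

	if (ioctl(vm_fd, GH_VM_SET_DTB_CONFIG, &cfg) < 0)
		return -errno;

	if (ioctl(vm_fd, GH_VM_START) < 0)
		return -errno;

	return 0;
}
```

The DTB range must fall inside a memory region already registered with
GH_VM_SET_USER_MEM_REGION, since the kernel looks up the backing memory
parcel by guest address when the VM is started.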
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 243 ++++++++++++++++++++++++++++++--
drivers/virt/gunyah/vm_mgr.h | 10 ++
drivers/virt/gunyah/vm_mgr_mm.c | 23 +++
include/linux/gunyah_rsc_mgr.h | 6 +
include/uapi/linux/gunyah.h | 13 ++
5 files changed, 282 insertions(+), 13 deletions(-)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index e950274c6a53..299b9bb81edc 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -9,37 +9,118 @@
#include <linux/file.h>
#include <linux/gunyah_rsc_mgr.h>
#include <linux/miscdevice.h>
+#include <linux/mm.h>
#include <linux/module.h>
#include <uapi/linux/gunyah.h>
#include "vm_mgr.h"
+static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
+{
+ struct gh_rm_vm_status_payload *payload = data;
+
+ if (payload->vmid != ghvm->vmid)
+ return NOTIFY_OK;
+
+ /* All other state transitions are synchronous to a corresponding RM call */
+ if (payload->vm_status == GH_RM_VM_STATUS_RESET) {
+ down_write(&ghvm->status_lock);
+ ghvm->vm_status = payload->vm_status;
+ up_write(&ghvm->status_lock);
+ wake_up(&ghvm->vm_status_wait);
+ }
+
+ return NOTIFY_DONE;
+}
+
+static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
+{
+ struct gh_rm_vm_exited_payload *payload = data;
+
+ if (payload->vmid != ghvm->vmid)
+ return NOTIFY_OK;
+
+ down_write(&ghvm->status_lock);
+ ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
+ up_write(&ghvm->status_lock);
+
+ return NOTIFY_DONE;
+}
+
+static int gh_vm_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
+{
+ struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
+
+ switch (action) {
+ case GH_RM_NOTIFICATION_VM_STATUS:
+ return gh_vm_rm_notification_status(ghvm, data);
+ case GH_RM_NOTIFICATION_VM_EXITED:
+ return gh_vm_rm_notification_exited(ghvm, data);
+ default:
+ return NOTIFY_OK;
+ }
+}
+
+static void gh_vm_stop(struct gh_vm *ghvm)
+{
+ int ret;
+
+ down_write(&ghvm->status_lock);
+ if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
+ ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
+ if (ret)
+ dev_warn(ghvm->parent, "Failed to stop VM: %d\n", ret);
+ }
+
+ ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
+ up_write(&ghvm->status_lock);
+}
+
static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
struct gh_vm_mem *mapping, *tmp;
int ret;
- mutex_lock(&ghvm->mm_lock);
- list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
- gh_vm_mem_reclaim(ghvm, mapping);
- kfree(mapping);
- }
- mutex_unlock(&ghvm->mm_lock);
-
- ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
- if (ret)
- pr_warn("Failed to deallocate vmid: %d\n", ret);
+ switch (ghvm->vm_status) {
+ case GH_RM_VM_STATUS_RUNNING:
+ gh_vm_stop(ghvm);
+ fallthrough;
+ case GH_RM_VM_STATUS_INIT_FAILED:
+ case GH_RM_VM_STATUS_LOAD:
+ case GH_RM_VM_STATUS_EXITED:
+ mutex_lock(&ghvm->mm_lock);
+ list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
+ gh_vm_mem_reclaim(ghvm, mapping);
+ kfree(mapping);
+ }
+ mutex_unlock(&ghvm->mm_lock);
+ fallthrough;
+ case GH_RM_VM_STATUS_NO_STATE:
+ ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
+ if (ret)
+ dev_warn(ghvm->parent, "Failed to deallocate vmid: %d\n", ret);
+
+ gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
+ gh_rm_put(ghvm->rm);
+ kfree(ghvm);
+ break;
+ default:
+ dev_err(ghvm->parent, "VM is in unknown state: %d. VM will not be cleaned up.\n",
+ ghvm->vm_status);
- put_gh_rm(ghvm->rm);
- kfree(ghvm);
+ gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
+ gh_rm_put(ghvm->rm);
+ kfree(ghvm);
+ break;
+ }
}
static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
{
struct gh_vm *ghvm;
- int vmid;
+ int vmid, ret;
vmid = gh_rm_alloc_vmid(rm, 0);
if (vmid < 0)
@@ -55,13 +136,130 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
ghvm->vmid = vmid;
ghvm->rm = rm;
+ init_waitqueue_head(&ghvm->vm_status_wait);
+ ghvm->nb.notifier_call = gh_vm_rm_notification;
+ ret = gh_rm_notifier_register(rm, &ghvm->nb);
+ if (ret) {
+ gh_rm_put(rm);
+ gh_rm_dealloc_vmid(rm, vmid);
+ kfree(ghvm);
+ return ERR_PTR(ret);
+ }
+
mutex_init(&ghvm->mm_lock);
INIT_LIST_HEAD(&ghvm->memory_mappings);
+ init_rwsem(&ghvm->status_lock);
INIT_WORK(&ghvm->free_work, gh_vm_free);
+ ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
return ghvm;
}
+static int gh_vm_start(struct gh_vm *ghvm)
+{
+ struct gh_vm_mem *mapping;
+ u64 dtb_offset;
+ u32 mem_handle;
+ int ret;
+
+ down_write(&ghvm->status_lock);
+ if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
+ up_write(&ghvm->status_lock);
+ return 0;
+ }
+
+ ghvm->vm_status = GH_RM_VM_STATUS_RESET;
+
+ mutex_lock(&ghvm->mm_lock);
+ list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
+ switch (mapping->share_type) {
+ case VM_MEM_LEND:
+ ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
+ break;
+ case VM_MEM_SHARE:
+ ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
+ break;
+ }
+ if (ret) {
+ dev_warn(ghvm->parent, "Failed to %s parcel %d: %d\n",
+ mapping->share_type == VM_MEM_LEND ? "lend" : "share",
+ mapping->parcel.label, ret);
+ mutex_unlock(&ghvm->mm_lock);
+ goto err;
+ }
+ }
+ mutex_unlock(&ghvm->mm_lock);
+
+ mapping = gh_vm_mem_find_by_addr(ghvm, ghvm->dtb_config.guest_phys_addr,
+ ghvm->dtb_config.size);
+ if (IS_ERR_OR_NULL(mapping)) {
+ dev_warn(ghvm->parent, "Failed to find the memory_handle for DTB\n");
+ ret = -EINVAL;
+ goto err;
+ }
+
+ mem_handle = mapping->parcel.mem_handle;
+ dtb_offset = ghvm->dtb_config.guest_phys_addr - mapping->guest_phys_addr;
+
+ ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, mem_handle,
+ 0, 0, dtb_offset, ghvm->dtb_config.size);
+ if (ret) {
+ dev_warn(ghvm->parent, "Failed to configure VM: %d\n", ret);
+ goto err;
+ }
+
+ ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
+ if (ret) {
+ dev_warn(ghvm->parent, "Failed to initialize VM: %d\n", ret);
+ goto err;
+ }
+
+ ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
+ if (ret) {
+ dev_warn(ghvm->parent, "Failed to start VM: %d\n", ret);
+ goto err;
+ }
+
+ ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
+ up_write(&ghvm->status_lock);
+ return ret;
+err:
+ ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
+ /* gh_vm_free will handle releasing resources and reclaiming memory */
+ up_write(&ghvm->status_lock);
+ return ret;
+}
+
+static int gh_vm_ensure_started(struct gh_vm *ghvm)
+{
+ int ret;
+
+ ret = down_read_interruptible(&ghvm->status_lock);
+ if (ret)
+ return ret;
+
+ /* Unlikely because VM is typically started */
+ if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
+ up_read(&ghvm->status_lock);
+ ret = gh_vm_start(ghvm);
+ if (ret)
+ return ret;
+ /*
+ * gh_vm_start() is guaranteed to bring status out of
+ * GH_RM_VM_STATUS_LOAD, thus an infinitely recursive call is not
+ * possible
+ */
+ return gh_vm_ensure_started(ghvm);
+ }
+
+ /* Unlikely because VM is typically running */
+ if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
+ ret = -ENODEV;
+
+ up_read(&ghvm->status_lock);
+ return ret;
+}
+
static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
struct gh_vm *ghvm = filp->private_data;
@@ -85,6 +283,25 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
r = gh_vm_mem_alloc(ghvm, &region);
break;
}
+ case GH_VM_SET_DTB_CONFIG: {
+ struct gh_vm_dtb_config dtb_config;
+
+ if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
+ return -EFAULT;
+
+ dtb_config.size = PAGE_ALIGN(dtb_config.size);
+ if (dtb_config.guest_phys_addr + dtb_config.size < dtb_config.guest_phys_addr)
+ return -EOVERFLOW;
+
+ ghvm->dtb_config = dtb_config;
+
+ r = 0;
+ break;
+ }
+ case GH_VM_START: {
+ r = gh_vm_ensure_started(ghvm);
+ break;
+ }
default:
r = -ENOTTY;
break;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index c9f6fa5478ed..26bcc2ae4478 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -10,6 +10,8 @@
#include <linux/list.h>
#include <linux/miscdevice.h>
#include <linux/mutex.h>
+#include <linux/rwsem.h>
+#include <linux/wait.h>
#include <uapi/linux/gunyah.h>
@@ -34,6 +36,13 @@ struct gh_vm {
u16 vmid;
struct gh_rm *rm;
struct device *parent;
+ enum gh_rm_vm_auth_mechanism auth;
+ struct gh_vm_dtb_config dtb_config;
+
+ struct notifier_block nb;
+ enum gh_rm_vm_status vm_status;
+ wait_queue_head_t vm_status_wait;
+ struct rw_semaphore status_lock;
struct work_struct free_work;
struct mutex mm_lock;
@@ -44,5 +53,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *regio
void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label);
+struct gh_vm_mem *gh_vm_mem_find_by_addr(struct gh_vm *ghvm, u64 guest_phys_addr, u32 size);
#endif
diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
index db6f55cef37f..6e1d2e8bddb7 100644
--- a/drivers/virt/gunyah/vm_mgr_mm.c
+++ b/drivers/virt/gunyah/vm_mgr_mm.c
@@ -47,6 +47,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
list_del(&mapping->list);
}
+struct gh_vm_mem *gh_vm_mem_find_by_addr(struct gh_vm *ghvm, u64 guest_phys_addr, u32 size)
+{
+ struct gh_vm_mem *mapping = NULL;
+ int ret;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ERR_PTR(ret);
+
+ list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
+ if (guest_phys_addr >= mapping->guest_phys_addr &&
+ (guest_phys_addr + size <= mapping->guest_phys_addr +
+ (mapping->npages << PAGE_SHIFT))) {
+ goto unlock;
+ }
+ }
+
+ mapping = NULL;
+unlock:
+ mutex_unlock(&ghvm->mm_lock);
+ return mapping;
+}
+
struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label)
{
struct gh_vm_mem *mapping;
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index 88a429dad09e..8b0b46f28e39 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -29,6 +29,12 @@ struct gh_rm_vm_exited_payload {
#define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
enum gh_rm_vm_status {
+ /*
+ * RM doesn't have a state for a partially failed load because
+ * loading is handled only by Linux.
+ */
+ GH_RM_VM_STATUS_LOAD_FAILED = -1,
+
GH_RM_VM_STATUS_NO_STATE = 0,
GH_RM_VM_STATUS_INIT = 1,
GH_RM_VM_STATUS_READY = 2,
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index a19207e3e065..d6abd8605a2e 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -49,4 +49,17 @@ struct gh_userspace_memory_region {
#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
struct gh_userspace_memory_region)
+/**
+ * struct gh_vm_dtb_config - Set the location of the VM's devicetree blob
+ * @guest_phys_addr: Address of the VM's devicetree in guest memory.
+ * @size: Maximum size of the devicetree.
+ */
+struct gh_vm_dtb_config {
+ __u64 guest_phys_addr;
+ __u64 size;
+};
+#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
+
+#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
+
#endif
--
2.39.2
The resource manager is a special virtual machine which is always
running on a Gunyah system. It provides APIs for creating and destroying
VMs, secure memory management, sharing/lending of memory between VMs,
and setup of inter-VM communication. Calls to the resource manager are
made via message queues.
This patch implements the basic probing and RPC mechanism to make those
API calls. Request/response calls can be made with gh_rm_call().
Drivers can also register for notifications pushed by RM via
gh_rm_notifier_register().
Specific API calls that resource manager supports will be implemented in
subsequent patches.
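The request framing used by gh_rm_send_request() in this patch can be modeled
in user space roughly as follows. This is only a sketch: the 240-byte message
queue limit is an assumed value for GH_MSGQ_MAX_MSG_SIZE (it is defined
outside this patch), and the helper names rm_cont_fragments() and
rm_pack_type() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSGQ_MAX_MSG_SIZE	240u	/* assumed value of GH_MSGQ_MAX_MSG_SIZE */
#define RPC_HDR_SIZE		8u	/* u8 api + u8 type + __le16 seq + __le32 msg_id */
#define RM_MAX_MSG_SIZE		(MSGQ_MAX_MSG_SIZE - RPC_HDR_SIZE)
#define RM_MAX_NUM_FRAGMENTS	62u

#define RM_RPC_TYPE_CONTINUATION	0x0u
#define RM_RPC_TYPE_REQUEST		0x1u

/* The fragment count is carried in bits 7:2 of the header "type" byte. */
static_assert(RM_MAX_NUM_FRAGMENTS <= 0x3f, "fragment count must fit in 6 bits");

/*
 * Number of continuation messages that follow the first request message,
 * or -1 if the payload exceeds the fragment limit.
 */
int rm_cont_fragments(size_t req_size)
{
	if (req_size > (size_t)RM_MAX_NUM_FRAGMENTS * RM_MAX_MSG_SIZE)
		return -1;

	return req_size ? (int)((req_size - 1) / RM_MAX_MSG_SIZE) : 0;
}

/* Pack the header "type" byte: message type in bits 1:0, fragments in bits 7:2. */
uint8_t rm_pack_type(unsigned int type, unsigned int fragments)
{
	return (uint8_t)((type & 0x3u) | ((fragments & 0x3fu) << 2));
}
```

Every message of a fragmented request carries the same fragment count. Only
the type field changes, from REQUEST on the first message to CONTINUATION on
the rest, which is what lets the receiver check that a continuation belongs
to the connection it is currently assembling.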
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 3 +
drivers/virt/gunyah/rsc_mgr.c | 688 +++++++++++++++++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 16 +
include/linux/gunyah_rsc_mgr.h | 21 +
4 files changed, 728 insertions(+)
create mode 100644 drivers/virt/gunyah/rsc_mgr.c
create mode 100644 drivers/virt/gunyah/rsc_mgr.h
create mode 100644 include/linux/gunyah_rsc_mgr.h
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 34f32110faf9..cc864ff5abbb 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,3 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_GUNYAH) += gunyah.o
+
+gunyah_rsc_mgr-y += rsc_mgr.o
+obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
new file mode 100644
index 000000000000..67813c9a52db
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr.c
@@ -0,0 +1,688 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/of.h>
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/gunyah.h>
+#include <linux/module.h>
+#include <linux/of_irq.h>
+#include <linux/notifier.h>
+#include <linux/workqueue.h>
+#include <linux/completion.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/platform_device.h>
+
+#include "rsc_mgr.h"
+
+#define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
+#define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
+#define RM_RPC_API_VERSION FIELD_PREP(RM_RPC_API_VERSION_MASK, 1)
+#define RM_RPC_HEADER_WORDS FIELD_PREP(RM_RPC_HEADER_WORDS_MASK, \
+ (sizeof(struct gh_rm_rpc_hdr) / sizeof(u32)))
+#define RM_RPC_API (RM_RPC_API_VERSION | RM_RPC_HEADER_WORDS)
+
+#define RM_RPC_TYPE_CONTINUATION 0x0
+#define RM_RPC_TYPE_REQUEST 0x1
+#define RM_RPC_TYPE_REPLY 0x2
+#define RM_RPC_TYPE_NOTIF 0x3
+#define RM_RPC_TYPE_MASK GENMASK(1, 0)
+
+#define GH_RM_MAX_NUM_FRAGMENTS 62
+#define RM_RPC_FRAGMENTS_MASK GENMASK(7, 2)
+
+struct gh_rm_rpc_hdr {
+ u8 api;
+ u8 type;
+ __le16 seq;
+ __le32 msg_id;
+} __packed;
+
+struct gh_rm_rpc_reply_hdr {
+ struct gh_rm_rpc_hdr hdr;
+ __le32 err_code; /* GH_RM_ERROR_* */
+} __packed;
+
+#define GH_RM_MAX_MSG_SIZE (GH_MSGQ_MAX_MSG_SIZE - sizeof(struct gh_rm_rpc_hdr))
+
+/* RM Error codes */
+enum gh_rm_error {
+ GH_RM_ERROR_OK = 0x0,
+ GH_RM_ERROR_UNIMPLEMENTED = 0xFFFFFFFF,
+ GH_RM_ERROR_NOMEM = 0x1,
+ GH_RM_ERROR_NORESOURCE = 0x2,
+ GH_RM_ERROR_DENIED = 0x3,
+ GH_RM_ERROR_INVALID = 0x4,
+ GH_RM_ERROR_BUSY = 0x5,
+ GH_RM_ERROR_ARGUMENT_INVALID = 0x6,
+ GH_RM_ERROR_HANDLE_INVALID = 0x7,
+ GH_RM_ERROR_VALIDATE_FAILED = 0x8,
+ GH_RM_ERROR_MAP_FAILED = 0x9,
+ GH_RM_ERROR_MEM_INVALID = 0xA,
+ GH_RM_ERROR_MEM_INUSE = 0xB,
+ GH_RM_ERROR_MEM_RELEASED = 0xC,
+ GH_RM_ERROR_VMID_INVALID = 0xD,
+ GH_RM_ERROR_LOOKUP_FAILED = 0xE,
+ GH_RM_ERROR_IRQ_INVALID = 0xF,
+ GH_RM_ERROR_IRQ_INUSE = 0x10,
+ GH_RM_ERROR_IRQ_RELEASED = 0x11,
+};
+
+/**
+ * struct gh_rm_connection - Represents a complete message from resource manager
+ * @payload: Combined payload of all the fragments (msg headers stripped off).
+ * @size: Size of the payload received so far.
+ * @msg_id: Message ID from the header.
+ * @type: RM_RPC_TYPE_REPLY or RM_RPC_TYPE_NOTIF.
+ * @num_fragments: total number of fragments expected to be received.
+ * @fragments_received: fragments received so far.
+ * @reply: Fields used for request/reply sequences
+ * @notification: Fields used for notifications
+ */
+struct gh_rm_connection {
+ void *payload;
+ size_t size;
+ __le32 msg_id;
+ u8 type;
+
+ u8 num_fragments;
+ u8 fragments_received;
+
+ union {
+ /**
+ * @ret: Linux return code, there was an error processing connection
+ * @seq: Sequence ID for the main message.
+ * @rm_error: For request/reply sequences with standard replies
+ * @seq_done: Signals caller that the RM reply has been received
+ */
+ struct {
+ int ret;
+ u16 seq;
+ enum gh_rm_error rm_error;
+ struct completion seq_done;
+ } reply;
+
+ /**
+ * @rm: Pointer to the RM that launched the connection
+ * @work: Triggered when all fragments of a notification received
+ */
+ struct {
+ struct gh_rm *rm;
+ struct work_struct work;
+ } notification;
+ };
+};
+
+/**
+ * struct gh_rm - private data for communicating w/Gunyah resource manager
+ * @dev: pointer to device
+ * @tx_ghrsc: message queue resource to TX to RM
+ * @rx_ghrsc: message queue resource to RX from RM
+ * @msgq: mailbox instance of above
+ * @active_rx_connection: ongoing gh_rm_connection for which we're receiving fragments
+ * @last_tx_ret: return value of last mailbox tx
+ * @call_xarray: xarray to allocate & lookup sequence IDs for Request/Response flows
+ * @next_seq: next ID to allocate (for xa_alloc_cyclic)
+ * @cache: cache for allocating Tx messages
+ * @send_lock: synchronization to allow only one request to be sent at a time
+ * @nh: notifier chain for clients interested in RM notification messages
+ */
+struct gh_rm {
+ struct device *dev;
+ struct gh_resource tx_ghrsc;
+ struct gh_resource rx_ghrsc;
+ struct gh_msgq msgq;
+ struct mbox_client msgq_client;
+ struct gh_rm_connection *active_rx_connection;
+ int last_tx_ret;
+
+ struct xarray call_xarray;
+ u32 next_seq;
+
+ struct kmem_cache *cache;
+ struct mutex send_lock;
+ struct blocking_notifier_head nh;
+};
+
+/**
+ * gh_rm_remap_error() - Remap Gunyah resource manager errors into a Linux error code
+ * @rm_error: "Standard" return value from Gunyah resource manager
+ */
+static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
+{
+ switch (rm_error) {
+ case GH_RM_ERROR_OK:
+ return 0;
+ case GH_RM_ERROR_UNIMPLEMENTED:
+ return -EOPNOTSUPP;
+ case GH_RM_ERROR_NOMEM:
+ return -ENOMEM;
+ case GH_RM_ERROR_NORESOURCE:
+ return -ENODEV;
+ case GH_RM_ERROR_DENIED:
+ return -EPERM;
+ case GH_RM_ERROR_BUSY:
+ return -EBUSY;
+ case GH_RM_ERROR_INVALID:
+ case GH_RM_ERROR_ARGUMENT_INVALID:
+ case GH_RM_ERROR_HANDLE_INVALID:
+ case GH_RM_ERROR_VALIDATE_FAILED:
+ case GH_RM_ERROR_MAP_FAILED:
+ case GH_RM_ERROR_MEM_INVALID:
+ case GH_RM_ERROR_MEM_INUSE:
+ case GH_RM_ERROR_MEM_RELEASED:
+ case GH_RM_ERROR_VMID_INVALID:
+ case GH_RM_ERROR_LOOKUP_FAILED:
+ case GH_RM_ERROR_IRQ_INVALID:
+ case GH_RM_ERROR_IRQ_INUSE:
+ case GH_RM_ERROR_IRQ_RELEASED:
+ return -EINVAL;
+ default:
+ return -EBADMSG;
+ }
+}
+
+static int gh_rm_init_connection_payload(struct gh_rm_connection *connection, void *msg,
+ size_t hdr_size, size_t msg_size)
+{
+ size_t max_buf_size, payload_size;
+ struct gh_rm_rpc_hdr *hdr = msg;
+
+ if (msg_size < hdr_size)
+ return -EINVAL;
+
+ payload_size = msg_size - hdr_size;
+
+ connection->num_fragments = FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type);
+ connection->fragments_received = 0;
+
+ /* There's not going to be any payload, no need to allocate buffer. */
+ if (!payload_size && !connection->num_fragments)
+ return 0;
+
+ if (connection->num_fragments > GH_RM_MAX_NUM_FRAGMENTS)
+ return -EINVAL;
+
+ max_buf_size = payload_size + (connection->num_fragments * GH_RM_MAX_MSG_SIZE);
+
+ connection->payload = kzalloc(max_buf_size, GFP_KERNEL);
+ if (!connection->payload)
+ return -ENOMEM;
+
+ memcpy(connection->payload, msg + hdr_size, payload_size);
+ connection->size = payload_size;
+ return 0;
+}
+
+static void gh_rm_abort_connection(struct gh_rm *rm)
+{
+ switch (rm->active_rx_connection->type) {
+ case RM_RPC_TYPE_REPLY:
+ rm->active_rx_connection->reply.ret = -EIO;
+ complete(&rm->active_rx_connection->reply.seq_done);
+ break;
+ case RM_RPC_TYPE_NOTIF:
+ fallthrough;
+ default:
+ kfree(rm->active_rx_connection->payload);
+ kfree(rm->active_rx_connection);
+ }
+
+ rm->active_rx_connection = NULL;
+}
+
+static void gh_rm_notif_work(struct work_struct *work)
+{
+ struct gh_rm_connection *connection = container_of(work, struct gh_rm_connection,
+ notification.work);
+ struct gh_rm *rm = connection->notification.rm;
+
+ blocking_notifier_call_chain(&rm->nh, connection->msg_id, connection->payload);
+
+ gh_rm_put(rm);
+ kfree(connection->payload);
+ kfree(connection);
+}
+
+static void gh_rm_process_notif(struct gh_rm *rm, void *msg, size_t msg_size)
+{
+ struct gh_rm_connection *connection;
+ struct gh_rm_rpc_hdr *hdr = msg;
+ int ret;
+
+ if (rm->active_rx_connection)
+ gh_rm_abort_connection(rm);
+
+ connection = kzalloc(sizeof(*connection), GFP_KERNEL);
+ if (!connection)
+ return;
+
+ connection->type = RM_RPC_TYPE_NOTIF;
+ connection->msg_id = hdr->msg_id;
+
+ gh_rm_get(rm);
+ connection->notification.rm = rm;
+ INIT_WORK(&connection->notification.work, gh_rm_notif_work);
+
+ ret = gh_rm_init_connection_payload(connection, msg, sizeof(*hdr), msg_size);
+ if (ret) {
+ dev_err(rm->dev, "Failed to initialize connection for notification: %d\n", ret);
+ gh_rm_put(rm);
+ kfree(connection);
+ return;
+ }
+
+ rm->active_rx_connection = connection;
+}
+
+static void gh_rm_process_rply(struct gh_rm *rm, void *msg, size_t msg_size)
+{
+ struct gh_rm_rpc_reply_hdr *reply_hdr = msg;
+ struct gh_rm_connection *connection;
+ u16 seq_id;
+
+ seq_id = le16_to_cpu(reply_hdr->hdr.seq);
+ connection = xa_load(&rm->call_xarray, seq_id);
+
+ if (!connection || connection->msg_id != reply_hdr->hdr.msg_id)
+ return;
+
+ if (rm->active_rx_connection)
+ gh_rm_abort_connection(rm);
+
+ if (gh_rm_init_connection_payload(connection, msg, sizeof(*reply_hdr), msg_size)) {
+ dev_err(rm->dev, "Failed to alloc connection buffer for sequence %d\n", seq_id);
+ /* Send connection complete and error the client. */
+ connection->reply.ret = -ENOMEM;
+ complete(&connection->reply.seq_done);
+ return;
+ }
+
+ connection->reply.rm_error = le32_to_cpu(reply_hdr->err_code);
+ rm->active_rx_connection = connection;
+}
+
+static void gh_rm_process_cont(struct gh_rm *rm, struct gh_rm_connection *connection,
+ void *msg, size_t msg_size)
+{
+ struct gh_rm_rpc_hdr *hdr = msg;
+ size_t payload_size = msg_size - sizeof(*hdr);
+
+ if (!rm->active_rx_connection)
+ return;
+
+ /*
+ * The fragment count in hdr->type and hdr->msg_id preserve the values from
+ * the first reply or notif message. To detect mishandling, check they match.
+ */
+ if (connection->msg_id != hdr->msg_id ||
+ connection->num_fragments != FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type)) {
+ gh_rm_abort_connection(rm);
+ return;
+ }
+
+ memcpy(connection->payload + connection->size, msg + sizeof(*hdr), payload_size);
+ connection->size += payload_size;
+ connection->fragments_received++;
+}
+
+static void gh_rm_try_complete_connection(struct gh_rm *rm)
+{
+ struct gh_rm_connection *connection = rm->active_rx_connection;
+
+ if (!connection || connection->fragments_received != connection->num_fragments)
+ return;
+
+ switch (connection->type) {
+ case RM_RPC_TYPE_REPLY:
+ complete(&connection->reply.seq_done);
+ break;
+ case RM_RPC_TYPE_NOTIF:
+ schedule_work(&connection->notification.work);
+ break;
+ default:
+ dev_err_ratelimited(rm->dev, "Invalid message type (%d) received\n",
+ connection->type);
+ gh_rm_abort_connection(rm);
+ break;
+ }
+
+ rm->active_rx_connection = NULL;
+}
+
+static void gh_rm_msgq_rx_data(struct mbox_client *cl, void *mssg)
+{
+ struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
+ struct gh_msgq_rx_data *rx_data = mssg;
+ size_t msg_size = rx_data->length;
+ void *msg = rx_data->data;
+ struct gh_rm_rpc_hdr *hdr;
+
+ if (msg_size < sizeof(*hdr) || msg_size > GH_MSGQ_MAX_MSG_SIZE)
+ return;
+
+ hdr = msg;
+ if (hdr->api != RM_RPC_API) {
+ dev_err(rm->dev, "Unknown RM RPC API version: %x\n", hdr->api);
+ return;
+ }
+
+ switch (FIELD_GET(RM_RPC_TYPE_MASK, hdr->type)) {
+ case RM_RPC_TYPE_NOTIF:
+ gh_rm_process_notif(rm, msg, msg_size);
+ break;
+ case RM_RPC_TYPE_REPLY:
+ gh_rm_process_rply(rm, msg, msg_size);
+ break;
+ case RM_RPC_TYPE_CONTINUATION:
+ gh_rm_process_cont(rm, rm->active_rx_connection, msg, msg_size);
+ break;
+ default:
+ dev_err(rm->dev, "Invalid message type (%lu) received\n",
+ FIELD_GET(RM_RPC_TYPE_MASK, hdr->type));
+ return;
+ }
+
+ gh_rm_try_complete_connection(rm);
+}
+
+static void gh_rm_msgq_tx_done(struct mbox_client *cl, void *mssg, int r)
+{
+ struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
+
+ kmem_cache_free(rm->cache, mssg);
+ rm->last_tx_ret = r;
+}
+
+static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
+ const void *req_buff, size_t req_buf_size,
+ struct gh_rm_connection *connection)
+{
+ size_t buf_size_remaining = req_buf_size;
+ const void *req_buf_curr = req_buff;
+ struct gh_msgq_tx_data *msg;
+ struct gh_rm_rpc_hdr *hdr, hdr_template;
+ u32 cont_fragments = 0;
+ size_t payload_size;
+ void *payload;
+ int ret;
+
+ if (req_buf_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
+ dev_warn(rm->dev, "Request of %zu bytes exceeds the fragment limit\n",
+ req_buf_size);
+ dump_stack();
+ return -E2BIG;
+ }
+
+ if (req_buf_size)
+ cont_fragments = (req_buf_size - 1) / GH_RM_MAX_MSG_SIZE;
+
+ hdr_template.api = RM_RPC_API;
+ hdr_template.type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_REQUEST) |
+ FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
+ hdr_template.seq = cpu_to_le16(connection->reply.seq);
+ hdr_template.msg_id = cpu_to_le32(message_id);
+
+ ret = mutex_lock_interruptible(&rm->send_lock);
+ if (ret)
+ return ret;
+
+ /* Consider also the 'request' packet for the loop count */
+ do {
+ msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
+ if (!msg) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ /* Fill header */
+ hdr = (struct gh_rm_rpc_hdr *)msg->data;
+ *hdr = hdr_template;
+
+ /* Copy payload */
+ payload = hdr + 1;
+ payload_size = min(buf_size_remaining, GH_RM_MAX_MSG_SIZE);
+ memcpy(payload, req_buf_curr, payload_size);
+ req_buf_curr += payload_size;
+ buf_size_remaining -= payload_size;
+
+ /* Force the last fragment to immediately alert the receiver */
+ msg->push = !buf_size_remaining;
+ msg->length = sizeof(*hdr) + payload_size;
+
+ ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
+ if (ret < 0) {
+ kmem_cache_free(rm->cache, msg);
+ break;
+ }
+
+ if (rm->last_tx_ret) {
+ ret = rm->last_tx_ret;
+ break;
+ }
+
+ hdr_template.type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_CONTINUATION) |
+ FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
+ } while (buf_size_remaining);
+
+out:
+ mutex_unlock(&rm->send_lock);
+ return ret < 0 ? ret : 0;
+}
+
+/**
+ * gh_rm_call() - Perform a request/response RPC with the resource manager
+ * @rm: Pointer to Gunyah resource manager internal data
+ * @message_id: The RM RPC message-id
+ * @req_buff: Request buffer that contains the payload
+ * @req_buf_size: Total size of the payload
+ * @resp_buf: Pointer to a response buffer
+ * @resp_buf_size: Size of the response buffer
+ *
+ * Make a request to the RM-VM and wait for the reply. For a successful
+ * response, the function returns the payload. The size of the payload is set in
+ * resp_buf_size. The resp_buf should be freed by the caller when 0 is returned
+ * and resp_buf_size != 0.
+ *
+ * req_buff must not be NULL when req_buf_size > 0. If req_buf_size == 0,
+ * req_buff *can* be NULL and no additional payload is sent.
+ *
+ * Context: Process context. Will sleep waiting for reply.
+ * Return: 0 on success. <0 if error.
+ */
+int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff, size_t req_buf_size,
+ void **resp_buf, size_t *resp_buf_size)
+{
+ struct gh_rm_connection *connection;
+ u32 seq_id;
+ int ret;
+
+ /* message_id 0 is reserved. A nonzero req_buf_size requires a non-NULL req_buff */
+ if (!message_id || (!req_buff && req_buf_size) || !rm)
+ return -EINVAL;
+
+
+ connection = kzalloc(sizeof(*connection), GFP_KERNEL);
+ if (!connection)
+ return -ENOMEM;
+
+ connection->type = RM_RPC_TYPE_REPLY;
+ connection->msg_id = cpu_to_le32(message_id);
+
+ init_completion(&connection->reply.seq_done);
+
+ /* Allocate a new seq number for this connection */
+ ret = xa_alloc_cyclic(&rm->call_xarray, &seq_id, connection, xa_limit_16b, &rm->next_seq,
+ GFP_KERNEL);
+ if (ret < 0)
+ goto free;
+ connection->reply.seq = lower_16_bits(seq_id);
+
+ /* Send the request to the Resource Manager */
+ ret = gh_rm_send_request(rm, message_id, req_buff, req_buf_size, connection);
+ if (ret < 0)
+ goto out;
+
+ /* Wait for response */
+ ret = wait_for_completion_interruptible(&connection->reply.seq_done);
+ if (ret)
+ goto out;
+
+ /* Check for internal (kernel) error waiting for the response */
+ if (connection->reply.ret) {
+ ret = connection->reply.ret;
+ if (ret != -ENOMEM)
+ kfree(connection->payload);
+ goto out;
+ }
+
+ /* Got a response, did resource manager give us an error? */
+ if (connection->reply.rm_error != GH_RM_ERROR_OK) {
+ dev_warn(rm->dev, "RM rejected message %08x. Error: %d\n", message_id,
+ connection->reply.rm_error);
+ dump_stack();
+ ret = gh_rm_remap_error(connection->reply.rm_error);
+ kfree(connection->payload);
+ goto out;
+ }
+
+ /* Everything looks good, return the payload */
+ if (resp_buf_size)
+ *resp_buf_size = connection->size;
+ if (connection->size && resp_buf)
+ *resp_buf = connection->payload;
+ else {
+ /* kfree in case RM sent us multiple fragments but never any data in
+ * those fragments. We would've allocated memory for it, but connection->size == 0
+ */
+ kfree(connection->payload);
+ }
+
+out:
+ xa_erase(&rm->call_xarray, connection->reply.seq);
+free:
+ kfree(connection);
+ return ret;
+}
+
+
+int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb)
+{
+ return blocking_notifier_chain_register(&rm->nh, nb);
+}
+EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
+
+int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb)
+{
+ return blocking_notifier_chain_unregister(&rm->nh, nb);
+}
+EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
+
+struct device *gh_rm_get(struct gh_rm *rm)
+{
+ return get_device(rm->dev);
+}
+EXPORT_SYMBOL_GPL(gh_rm_get);
+
+void gh_rm_put(struct gh_rm *rm)
+{
+ put_device(rm->dev);
+}
+EXPORT_SYMBOL_GPL(gh_rm_put);
+
+static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool tx,
+ struct gh_resource *ghrsc)
+{
+ struct device_node *node = pdev->dev.of_node;
+ int ret;
+ int idx = tx ? 0 : 1;
+
+ ghrsc->type = tx ? GH_RESOURCE_TYPE_MSGQ_TX : GH_RESOURCE_TYPE_MSGQ_RX;
+
+ ghrsc->irq = platform_get_irq(pdev, idx);
+ if (ghrsc->irq < 0) {
+ dev_err(&pdev->dev, "Failed to get irq%d: %d\n", idx, ghrsc->irq);
+ return ghrsc->irq;
+ }
+
+ ret = of_property_read_u64_index(node, "reg", idx, &ghrsc->capid);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to get capid%d: %d\n", idx, ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int gh_rm_drv_probe(struct platform_device *pdev)
+{
+ struct gh_msgq_tx_data *msg;
+ struct gh_rm *rm;
+ int ret;
+
+ rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
+ if (!rm)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, rm);
+ rm->dev = &pdev->dev;
+
+ mutex_init(&rm->send_lock);
+ BLOCKING_INIT_NOTIFIER_HEAD(&rm->nh);
+ xa_init_flags(&rm->call_xarray, XA_FLAGS_ALLOC);
+ rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data, GH_MSGQ_MAX_MSG_SIZE), 0,
+ SLAB_HWCACHE_ALIGN, NULL);
+ if (!rm->cache)
+ return -ENOMEM;
+
+ ret = gh_msgq_platform_probe_direction(pdev, true, &rm->tx_ghrsc);
+ if (ret)
+ goto err_cache;
+
+ ret = gh_msgq_platform_probe_direction(pdev, false, &rm->rx_ghrsc);
+ if (ret)
+ goto err_cache;
+
+ rm->msgq_client.dev = &pdev->dev;
+ rm->msgq_client.tx_block = true;
+ rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
+ rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
+
+ return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
+err_cache:
+ kmem_cache_destroy(rm->cache);
+ return ret;
+}
+
+static int gh_rm_drv_remove(struct platform_device *pdev)
+{
+ struct gh_rm *rm = platform_get_drvdata(pdev);
+
+ mbox_free_channel(gh_msgq_chan(&rm->msgq));
+ gh_msgq_remove(&rm->msgq);
+ kmem_cache_destroy(rm->cache);
+
+ return 0;
+}
+
+static const struct of_device_id gh_rm_of_match[] = {
+ { .compatible = "gunyah-resource-manager" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, gh_rm_of_match);
+
+static struct platform_driver gh_rm_driver = {
+ .probe = gh_rm_drv_probe,
+ .remove = gh_rm_drv_remove,
+ .driver = {
+ .name = "gh_rsc_mgr",
+ .of_match_table = gh_rm_of_match,
+ },
+};
+module_platform_driver(gh_rm_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Resource Manager Driver");
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
new file mode 100644
index 000000000000..3665ebc7b020
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#ifndef __GH_RSC_MGR_PRIV_H
+#define __GH_RSC_MGR_PRIV_H
+
+#include <linux/gunyah.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/types.h>
+
+struct gh_rm;
+int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buf_size,
+ void **resp_buf, size_t *resp_buf_size);
+
+#endif
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
new file mode 100644
index 000000000000..deca9b3da541
--- /dev/null
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _GUNYAH_RSC_MGR_H
+#define _GUNYAH_RSC_MGR_H
+
+#include <linux/list.h>
+#include <linux/notifier.h>
+#include <linux/gunyah.h>
+
+#define GH_VMID_INVAL U16_MAX
+
+struct gh_rm;
+int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
+int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
+struct device *gh_rm_get(struct gh_rm *rm);
+void gh_rm_put(struct gh_rm *rm);
+
+#endif
--
2.39.2
Gunyah doorbells allow two virtual machines to signal each other using
interrupts. Add the hypercalls needed to assert the interrupt.
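As a rough illustration of how the two hypercalls are expected to interact,
the userspace sketch below models a doorbell as a flags word plus two masks.
The semantics shown are an assumption about the doorbell design and are not
defined by this patch: flags are OR-ed in on send, the interrupt is raised
when (flags & enable_mask) is non-zero, and ack_mask selects the flags that
are cleared on delivery. struct model_bell and the model_* helpers are
hypothetical names that exist only in this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a doorbell; not a kernel structure. */
struct model_bell {
	uint64_t flags;       /* pending flags */
	uint64_t enable_mask; /* flags that may raise the interrupt */
	uint64_t ack_mask;    /* flags cleared when the interrupt is delivered */
	unsigned int irqs;    /* number of interrupts raised so far */
};

/* Same argument shape as gh_hypercall_bell_set_mask(), minus the capid. */
static void model_bell_set_mask(struct model_bell *b, uint64_t enable_mask,
				uint64_t ack_mask)
{
	assert(b);
	b->enable_mask = enable_mask;
	b->ack_mask = ack_mask;
}

/*
 * Same argument shape as gh_hypercall_bell_send(), minus the capid.
 * Returns true if the (assumed) interrupt condition was met.
 */
static bool model_bell_send(struct model_bell *b, uint64_t new_flags,
			    uint64_t *old_flags)
{
	assert(b && old_flags);

	*old_flags = b->flags;
	b->flags |= new_flags;

	if (!(b->flags & b->enable_mask))
		return false; /* nothing enabled is pending: no interrupt */

	b->irqs++;
	b->flags &= ~b->ack_mask; /* assumed auto-acknowledge on delivery */
	return true;
}
```

In this model the sender learns from old_flags whether an earlier signal was
still pending, which is the same information the real hypercall hands back
through its old_flags argument.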
Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/gunyah/gunyah_hypercall.c | 25 +++++++++++++++++++++++++
include/linux/gunyah.h | 3 +++
2 files changed, 28 insertions(+)
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index f01f5cec4d23..0f1cdb706e91 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -41,6 +41,8 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
fn)
#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+#define GH_HYPERCALL_BELL_SEND GH_HYPERCALL(0x8012)
+#define GH_HYPERCALL_BELL_SET_MASK GH_HYPERCALL(0x8015)
#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
#define GH_HYPERCALL_VCPU_RUN GH_HYPERCALL(0x8065)
@@ -63,6 +65,29 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
}
EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
+enum gh_error gh_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_BELL_SEND, capid, new_flags, 0, &res);
+
+ if (res.a0 == GH_ERROR_OK)
+ *old_flags = res.a1;
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_bell_send);
+
+enum gh_error gh_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_BELL_SET_MASK, capid, enable_mask, ack_mask, 0, &res);
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_bell_set_mask);
+
enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready)
{
struct arm_smccc_res res;
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 37f1e2c822ce..63395dacc1a8 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -169,6 +169,9 @@ struct gh_hypercall_hyp_identify_resp {
void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
+enum gh_error gh_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags);
+enum gh_error gh_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask);
+
#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready);
--
2.39.2
Introduce a framework for Gunyah userspace to install VM functions. VM
functions are optional interfaces to the virtual machine. vCPUs,
ioeventfds, and irqfds are examples of such VM functions and are
implemented in subsequent patches.
A generic framework is implemented instead of individual ioctls to
create vCPUs, irqfds, etc., in order to simplify the VM manager core
implementation and allow dynamic loading of VM function modules.
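For illustration, a VMM might prepare the descriptor for GH_VM_ADD_FUNCTION
roughly as sketched below. The struct layout and GH_FN_MAX_ARG_SIZE are
copied from the uapi header added by this patch; example_fn_desc_init() is a
hypothetical helper, and its NULL check is a choice made for this example
rather than something the kernel does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the uapi definitions added by this patch. */
#define GH_FN_MAX_ARG_SIZE 256

struct gh_fn_desc {
	uint32_t type;
	uint32_t arg_size;
	uint64_t arg;
};

/*
 * Hypothetical helper: fill @desc for a function of @type whose argument
 * lives at @arg. Oversized arguments are rejected up front, mirroring the
 * arg_size check in gh_vm_add_function().
 */
static int example_fn_desc_init(struct gh_fn_desc *desc, uint32_t type,
				const void *arg, size_t arg_size)
{
	assert(desc);

	if (arg_size > GH_FN_MAX_ARG_SIZE)
		return -EINVAL;
	/* Example-only policy: a non-empty argument needs a buffer. */
	if (arg_size && !arg)
		return -EINVAL;

	memset(desc, 0, sizeof(*desc));
	desc->type = type;
	desc->arg_size = (uint32_t)arg_size;
	desc->arg = (uint64_t)(uintptr_t)arg;
	return 0;
}
```

The filled descriptor would then be passed to the VM file descriptor with
ioctl(vm_fd, GH_VM_ADD_FUNCTION, &desc). Removal matches instances by
comparing the type and the argument bytes, so the same argument contents
have to be supplied to GH_VM_REMOVE_FUNCTION.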
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 18 ++
drivers/virt/gunyah/vm_mgr.c | 208 ++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.h | 4 +
include/linux/gunyah_vm_mgr.h | 73 ++++++++
include/uapi/linux/gunyah.h | 17 ++
5 files changed, 316 insertions(+), 4 deletions(-)
create mode 100644 include/linux/gunyah_vm_mgr.h
diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index 1b4aa18670a3..af8ad88a88ab 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -17,6 +17,24 @@ sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
ioctl. The VM itself is configured to use the memory region via the
devicetree.
+Gunyah Functions
+================
+
+Components of a Gunyah VM's configuration that need kernel configuration are
+called "functions" and are built on top of a common framework. Functions are
+identified by a type (see the GH_FN_* macros) and take an optional argument to
+configure them. They are typically created by the `GH_VM_ADD_FUNCTION` ioctl.
+
+Functions typically do at least one of these operations:
+
+1. Create resource ticket(s). Resource tickets allow a function to register
+ itself as the client for a Gunyah resource (e.g. doorbell or vCPU) and
+ the function is given the pointer to the `struct gh_resource` when the
+ VM is starting.
+
+2. Register IO handler(s). IO handlers allow a function to handle stage-2 faults
+ from the virtual machine.
+
Sample Userspace VMM
====================
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 299b9bb81edc..88db011395ec 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -6,16 +6,165 @@
#define pr_fmt(fmt) "gh_vm_mgr: " fmt
#include <linux/anon_inodes.h>
+#include <linux/compat.h>
#include <linux/file.h>
#include <linux/gunyah_rsc_mgr.h>
+#include <linux/gunyah_vm_mgr.h>
#include <linux/miscdevice.h>
#include <linux/mm.h>
#include <linux/module.h>
+#include <linux/xarray.h>
#include <uapi/linux/gunyah.h>
#include "vm_mgr.h"
+static DEFINE_XARRAY(functions);
+
+int gh_vm_function_register(struct gh_vm_function *fn)
+{
+ if (!fn->bind || !fn->unbind)
+ return -EINVAL;
+
+ return xa_err(xa_store(&functions, fn->type, fn, GFP_KERNEL));
+}
+EXPORT_SYMBOL_GPL(gh_vm_function_register);
+
+static void gh_vm_remove_function_instance(struct gh_vm_function_instance *inst)
+ __must_hold(&inst->ghvm->fn_lock)
+{
+ inst->fn->unbind(inst);
+ list_del(&inst->vm_list);
+ module_put(inst->fn->mod);
+ kfree(inst->argp);
+ kfree(inst);
+}
+
+void gh_vm_function_unregister(struct gh_vm_function *fn)
+{
+ /* Expecting unregister to only come when unloading a module */
+ WARN_ON(fn->mod && module_refcount(fn->mod));
+ xa_erase(&functions, fn->type);
+}
+EXPORT_SYMBOL_GPL(gh_vm_function_unregister);
+
+static struct gh_vm_function *gh_vm_get_function(u32 type)
+{
+ struct gh_vm_function *fn;
+ int r;
+
+ fn = xa_load(&functions, type);
+ if (!fn) {
+ r = request_module("ghfunc:%d", type);
+ if (r)
+ return ERR_PTR(r);
+
+ fn = xa_load(&functions, type);
+ }
+
+ if (!fn || !try_module_get(fn->mod))
+ fn = ERR_PTR(-ENOENT);
+
+ return fn;
+}
+
+static long gh_vm_add_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
+{
+ struct gh_vm_function_instance *inst;
+ void __user *argp;
+ long r = 0;
+
+ if (f->arg_size > GH_FN_MAX_ARG_SIZE) {
+ dev_err(ghvm->parent, "%s: arg_size > %d\n", __func__, GH_FN_MAX_ARG_SIZE);
+ return -EINVAL;
+ }
+
+ inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ inst->arg_size = f->arg_size;
+ if (inst->arg_size) {
+ inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
+ if (!inst->argp) {
+ r = -ENOMEM;
+ goto free;
+ }
+
+ argp = u64_to_user_ptr(f->arg);
+ if (copy_from_user(inst->argp, argp, f->arg_size)) {
+ r = -EFAULT;
+ goto free_arg;
+ }
+ }
+
+ inst->fn = gh_vm_get_function(f->type);
+ if (IS_ERR(inst->fn)) {
+ r = PTR_ERR(inst->fn);
+ goto free_arg;
+ }
+
+ inst->ghvm = ghvm;
+ inst->rm = ghvm->rm;
+
+ mutex_lock(&ghvm->fn_lock);
+ r = inst->fn->bind(inst);
+ if (r >= 0)
+ list_add(&inst->vm_list, &ghvm->functions);
+ mutex_unlock(&ghvm->fn_lock);
+ if (r < 0) {
+ module_put(inst->fn->mod);
+ goto free_arg;
+ }
+
+ return r;
+free_arg:
+ kfree(inst->argp);
+free:
+ kfree(inst);
+ return r;
+}
+
+static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
+{
+ struct gh_vm_function_instance *inst, *iter;
+ void __user *user_argp;
+ void *argp;
+ long r = 0;
+
+ r = mutex_lock_interruptible(&ghvm->fn_lock);
+ if (r)
+ return r;
+
+ if (f->arg_size) {
+ argp = kzalloc(f->arg_size, GFP_KERNEL);
+ if (!argp) {
+ r = -ENOMEM;
+ goto out;
+ }
+
+ user_argp = u64_to_user_ptr(f->arg);
+ if (copy_from_user(argp, user_argp, f->arg_size)) {
+ r = -EFAULT;
+ kfree(argp);
+ goto out;
+ }
+
+ list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
+ if (inst->fn->type == f->type &&
+ f->arg_size == inst->arg_size &&
+ !memcmp(argp, inst->argp, f->arg_size))
+ gh_vm_remove_function_instance(inst);
+ }
+
+ kfree(argp);
+ }
+
+out:
+ mutex_unlock(&ghvm->fn_lock);
+ return r;
+}
+
static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
{
struct gh_rm_vm_status_payload *payload = data;
@@ -80,6 +229,7 @@ static void gh_vm_stop(struct gh_vm *ghvm)
static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
+ struct gh_vm_function_instance *inst, *iiter;
struct gh_vm_mem *mapping, *tmp;
int ret;
@@ -90,6 +240,12 @@ static void gh_vm_free(struct work_struct *work)
case GH_RM_VM_STATUS_INIT_FAILED:
case GH_RM_VM_STATUS_LOAD:
case GH_RM_VM_STATUS_EXITED:
+ mutex_lock(&ghvm->fn_lock);
+ list_for_each_entry_safe(inst, iiter, &ghvm->functions, vm_list) {
+ gh_vm_remove_function_instance(inst);
+ }
+ mutex_unlock(&ghvm->fn_lock);
+
mutex_lock(&ghvm->mm_lock);
list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
gh_vm_mem_reclaim(ghvm, mapping);
@@ -117,6 +273,28 @@ static void gh_vm_free(struct work_struct *work)
}
}
+static void _gh_vm_put(struct kref *kref)
+{
+ struct gh_vm *ghvm = container_of(kref, struct gh_vm, kref);
+
+ /* The VM will be reset and make RM calls, which can sleep interruptibly.
+ * Defer to a work item so that this thread can still receive signals.
+ */
+ schedule_work(&ghvm->free_work);
+}
+
+int __must_check gh_vm_get(struct gh_vm *ghvm)
+{
+ return kref_get_unless_zero(&ghvm->kref);
+}
+EXPORT_SYMBOL_GPL(gh_vm_get);
+
+void gh_vm_put(struct gh_vm *ghvm)
+{
+ kref_put(&ghvm->kref, _gh_vm_put);
+}
+EXPORT_SYMBOL_GPL(gh_vm_put);
+
static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
{
struct gh_vm *ghvm;
@@ -150,6 +328,8 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
INIT_LIST_HEAD(&ghvm->memory_mappings);
init_rwsem(&ghvm->status_lock);
INIT_WORK(&ghvm->free_work, gh_vm_free);
+ kref_init(&ghvm->kref);
+ INIT_LIST_HEAD(&ghvm->functions);
ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
return ghvm;
@@ -302,6 +482,29 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
r = gh_vm_ensure_started(ghvm);
break;
}
+ case GH_VM_ADD_FUNCTION: {
+ struct gh_fn_desc f;
+
+ if (copy_from_user(&f, argp, sizeof(f)))
+ return -EFAULT;
+
+ r = gh_vm_add_function(ghvm, &f);
+ break;
+ }
+ case GH_VM_REMOVE_FUNCTION: {
+ struct gh_fn_desc *f;
+
+ f = kzalloc(sizeof(*f), GFP_KERNEL);
+ if (!f)
+ return -ENOMEM;
+
+ if (copy_from_user(f, argp, sizeof(*f)))
+ r = -EFAULT;
+ else
+ r = gh_vm_rm_function(ghvm, f);
+ kfree(f);
+ break;
+ }
default:
r = -ENOTTY;
break;
@@ -314,10 +517,7 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
{
struct gh_vm *ghvm = filp->private_data;
- /* VM will be reset and make RM calls which can interruptible sleep.
- * Defer to a work so this thread can receive signal.
- */
- schedule_work(&ghvm->free_work);
+ gh_vm_put(ghvm);
return 0;
}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 26bcc2ae4478..7bd271bad721 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -8,6 +8,7 @@
#include <linux/gunyah_rsc_mgr.h>
#include <linux/list.h>
+#include <linux/kref.h>
#include <linux/miscdevice.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
@@ -45,8 +46,11 @@ struct gh_vm {
struct rw_semaphore status_lock;
struct work_struct free_work;
+ struct kref kref;
struct mutex mm_lock;
struct list_head memory_mappings;
+ struct mutex fn_lock;
+ struct list_head functions;
};
int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
new file mode 100644
index 000000000000..3825c951790a
--- /dev/null
+++ b/include/linux/gunyah_vm_mgr.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _GUNYAH_VM_MGR_H
+#define _GUNYAH_VM_MGR_H
+
+#include <linux/compiler_types.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/list.h>
+#include <linux/mod_devicetable.h>
+#include <linux/notifier.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gh_vm;
+
+int __must_check gh_vm_get(struct gh_vm *ghvm);
+void gh_vm_put(struct gh_vm *ghvm);
+
+struct gh_vm_function_instance;
+struct gh_vm_function {
+ u32 type;
+ const char *name;
+ struct module *mod;
+ long (*bind)(struct gh_vm_function_instance *f);
+ void (*unbind)(struct gh_vm_function_instance *f);
+};
+
+/**
+ * struct gh_vm_function_instance - Represents one function instance
+ * @arg_size: size of user argument
+ * @argp: pointer to user argument
+ * @ghvm: Pointer to VM instance
+ * @rm: Pointer to resource manager for the VM instance
+ * @fn: The ops for the function
+ * @data: Private data for function
+ * @vm_list: for gh_vm's functions list
+ * @fn_list: for gh_vm_function's instances list
+ */
+struct gh_vm_function_instance {
+ size_t arg_size;
+ void *argp;
+ struct gh_vm *ghvm;
+ struct gh_rm *rm;
+ struct gh_vm_function *fn;
+ void *data;
+ struct list_head vm_list;
+};
+
+int gh_vm_function_register(struct gh_vm_function *f);
+void gh_vm_function_unregister(struct gh_vm_function *f);
+
+#define DECLARE_GH_VM_FUNCTION(_name, _type, _bind, _unbind) \
+ static struct gh_vm_function _name = { \
+ .type = _type, \
+ .name = __stringify(_name), \
+ .mod = THIS_MODULE, \
+ .bind = _bind, \
+ .unbind = _unbind, \
+ }; \
+ MODULE_ALIAS("ghfunc:"__stringify(_type))
+
+#define module_gh_vm_function(__gf) \
+ module_driver(__gf, gh_vm_function_register, gh_vm_function_unregister)
+
+#define DECLARE_GH_VM_FUNCTION_INIT(_name, _type, _bind, _unbind) \
+ DECLARE_GH_VM_FUNCTION(_name, _type, _bind, _unbind); \
+ module_gh_vm_function(_name)
+
+#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index d6abd8605a2e..caeb3b3a3e9a 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -62,4 +62,21 @@ struct gh_vm_dtb_config {
#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
+#define GH_FN_MAX_ARG_SIZE 256
+
+/**
+ * struct gh_fn_desc - Arguments to create a VM function
+ * @type: Type of the function. See GH_FN_* macro for supported types
+ * @arg_size: Size of argument to pass to the function. arg_size <= GH_FN_MAX_ARG_SIZE
+ * @arg: Value or pointer to argument given to the function
+ */
+struct gh_fn_desc {
+ __u32 type;
+ __u32 arg_size;
+ __u64 arg;
+};
+
+#define GH_VM_ADD_FUNCTION _IOW(GH_IOCTL_TYPE, 0x4, struct gh_fn_desc)
+#define GH_VM_REMOVE_FUNCTION _IOW(GH_IOCTL_TYPE, 0x7, struct gh_fn_desc)
+
#endif
--
2.39.2
Some VM functions need to acquire Gunyah resources. For instance, Gunyah
vCPUs are exposed to the host as a resource. The Gunyah vCPU function
will register a resource ticket and be able to interact with the
hypervisor once the resource ticket is filled.
Resource tickets are the mechanism for functions to acquire ownership of
Gunyah resources. Gunyah functions can be created before the VM's
resources are created and made available to Linux. A resource ticket
identifies a type of resource and a label of a resource which the ticket
holder is interested in.
Resources are created by Gunyah as configured in the VM's devicetree
configuration. Gunyah doesn't process the label and that makes it
possible for userspace to create multiple resources with the same label.
Resource ticket owners need to be prepared for populate to be called
multiple times if userspace created multiple resources with the same
label.
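The matching rule described above can be modeled in plain C as follows. This
is a simplified sketch, not the kernel implementation: struct model_resource,
struct model_ticket and the model_* functions are hypothetical stand-ins that
keep only the fields needed to show why populate may run more than once for a
single ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct gh_resource. */
struct model_resource {
	int type;
	uint32_t label;
	int claimed; /* set once a ticket has accepted the resource */
};

/* Hypothetical, simplified stand-in for struct gh_vm_resource_ticket. */
struct model_ticket {
	int type;
	uint32_t label;
	unsigned int populated; /* how many resources were handed over */
	int (*populate)(struct model_ticket *t, struct model_resource *r);
};

/* Example populate callback: accept every matching resource. */
static int model_populate(struct model_ticket *t, struct model_resource *r)
{
	(void)r;
	t->populated++;
	return 0;
}

/*
 * Offer every unclaimed resource with a matching type and label to the
 * ticket, in the spirit of what gh_vm_add_resource_ticket() does for
 * resources that are already known to the VM.
 */
static void model_fill_ticket(struct model_ticket *t,
			      struct model_resource *rs, size_t n)
{
	size_t i;

	assert(t && t->populate);

	for (i = 0; i < n; i++) {
		if (rs[i].claimed || rs[i].type != t->type ||
		    rs[i].label != t->label)
			continue;
		/* A zero return means the ticket owner took the resource. */
		if (!t->populate(t, &rs[i]))
			rs[i].claimed = 1;
	}
}
```

With two resources of the same type and label, model_fill_ticket() invokes
the callback twice, so a ticket owner that expects a single resource has to
reject the second one itself by returning an error from populate.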
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 112 +++++++++++++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.h | 4 ++
include/linux/gunyah_vm_mgr.h | 14 +++++
3 files changed, 129 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 88db011395ec..0269bcdaf692 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -165,6 +165,74 @@ static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
return r;
}
+int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket)
+{
+ struct gh_vm_resource_ticket *iter;
+ struct gh_resource *ghrsc;
+ int ret = 0;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry(iter, &ghvm->resource_tickets, list) {
+ if (iter->resource_type == ticket->resource_type && iter->label == ticket->label) {
+ ret = -EEXIST;
+ goto out;
+ }
+ }
+
+ if (!try_module_get(ticket->owner)) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ list_add(&ticket->list, &ghvm->resource_tickets);
+ INIT_LIST_HEAD(&ticket->resources);
+
+ list_for_each_entry(ghrsc, &ghvm->resources, list) {
+ if (ghrsc->type == ticket->resource_type && ghrsc->rm_label == ticket->label) {
+ if (!ticket->populate(ticket, ghrsc))
+ list_move(&ghrsc->list, &ticket->resources);
+ }
+ }
+out:
+ mutex_unlock(&ghvm->resources_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_vm_add_resource_ticket);
+
+void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket)
+{
+ struct gh_resource *ghrsc, *iter;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry_safe(ghrsc, iter, &ticket->resources, list) {
+ ticket->unpopulate(ticket, ghrsc);
+ list_move(&ghrsc->list, &ghvm->resources);
+ }
+
+ module_put(ticket->owner);
+ list_del(&ticket->list);
+ mutex_unlock(&ghvm->resources_lock);
+}
+EXPORT_SYMBOL_GPL(gh_vm_remove_resource_ticket);
+
+static void gh_vm_add_resource(struct gh_vm *ghvm, struct gh_resource *ghrsc)
+{
+ struct gh_vm_resource_ticket *ticket;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry(ticket, &ghvm->resource_tickets, list) {
+ if (ghrsc->type == ticket->resource_type && ghrsc->rm_label == ticket->label) {
+ if (!ticket->populate(ticket, ghrsc)) {
+ list_add(&ghrsc->list, &ticket->resources);
+ goto found;
+ }
+ }
+ }
+ list_add(&ghrsc->list, &ghvm->resources);
+found:
+ mutex_unlock(&ghvm->resources_lock);
+}
+
static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
{
struct gh_rm_vm_status_payload *payload = data;
@@ -230,6 +298,8 @@ static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
struct gh_vm_function_instance *inst, *iiter;
+ struct gh_vm_resource_ticket *ticket, *titer;
+ struct gh_resource *ghrsc, *riter;
struct gh_vm_mem *mapping, *tmp;
int ret;
@@ -246,6 +316,25 @@ static void gh_vm_free(struct work_struct *work)
}
mutex_unlock(&ghvm->fn_lock);
+ if (!list_empty(&ghvm->resource_tickets)) {
+ dev_warn(ghvm->parent, "Dangling resource tickets:\n");
+ list_for_each_entry_safe(ticket, titer, &ghvm->resource_tickets, list) {
+ dev_warn(ghvm->parent, " %pS\n", ticket->populate);
+ gh_vm_remove_resource_ticket(ghvm, ticket);
+ }
+ }
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry_safe(ghrsc, riter, &ghvm->resources, list) {
+ gh_rm_free_resource(ghrsc);
+ }
+ mutex_unlock(&ghvm->resources_lock);
+
+ ret = gh_rm_vm_reset(ghvm->rm, ghvm->vmid);
+ if (ret)
+ dev_err(ghvm->parent, "Failed to reset the vm: %d\n", ret);
+ wait_event(ghvm->vm_status_wait, ghvm->vm_status == GH_RM_VM_STATUS_RESET);
+
mutex_lock(&ghvm->mm_lock);
list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
gh_vm_mem_reclaim(ghvm, mapping);
@@ -329,6 +418,9 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
init_rwsem(&ghvm->status_lock);
INIT_WORK(&ghvm->free_work, gh_vm_free);
kref_init(&ghvm->kref);
+ mutex_init(&ghvm->resources_lock);
+ INIT_LIST_HEAD(&ghvm->resources);
+ INIT_LIST_HEAD(&ghvm->resource_tickets);
INIT_LIST_HEAD(&ghvm->functions);
ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
@@ -338,9 +430,11 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
static int gh_vm_start(struct gh_vm *ghvm)
{
struct gh_vm_mem *mapping;
+ struct gh_rm_hyp_resources *resources;
+ struct gh_resource *ghrsc;
u64 dtb_offset;
u32 mem_handle;
- int ret;
+ int ret, i, n;
down_write(&ghvm->status_lock);
if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
@@ -394,6 +488,22 @@ static int gh_vm_start(struct gh_vm *ghvm)
goto err;
}
+ ret = gh_rm_get_hyp_resources(ghvm->rm, ghvm->vmid, &resources);
+ if (ret) {
+ dev_warn(ghvm->parent, "Failed to get hypervisor resources for VM: %d\n", ret);
+ goto err;
+ }
+
+ for (i = 0, n = le32_to_cpu(resources->n_entries); i < n; i++) {
+ ghrsc = gh_rm_alloc_resource(ghvm->rm, &resources->entries[i]);
+ if (!ghrsc) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ gh_vm_add_resource(ghvm, ghrsc);
+ }
+
ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
if (ret) {
dev_warn(ghvm->parent, "Failed to start VM: %d\n", ret);
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 7bd271bad721..18d0e1effd25 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -7,6 +7,7 @@
#define _GH_PRIV_VM_MGR_H
#include <linux/gunyah_rsc_mgr.h>
+#include <linux/gunyah_vm_mgr.h>
#include <linux/list.h>
#include <linux/kref.h>
#include <linux/miscdevice.h>
@@ -51,6 +52,9 @@ struct gh_vm {
struct list_head memory_mappings;
struct mutex fn_lock;
struct list_head functions;
+ struct mutex resources_lock;
+ struct list_head resources;
+ struct list_head resource_tickets;
};
int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
index 3825c951790a..01b1761b5923 100644
--- a/include/linux/gunyah_vm_mgr.h
+++ b/include/linux/gunyah_vm_mgr.h
@@ -70,4 +70,18 @@ void gh_vm_function_unregister(struct gh_vm_function *f);
DECLARE_GH_VM_FUNCTION(_name, _type, _bind, _unbind); \
module_gh_vm_function(_name)
+struct gh_vm_resource_ticket {
+ struct list_head list; /* for gh_vm's resource_tickets list */
+ struct list_head resources; /* resources claimed by this ticket */
+ enum gh_resource_type resource_type;
+ u32 label;
+
+ struct module *owner;
+ int (*populate)(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc);
+ void (*unpopulate)(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc);
+};
+
+int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
+void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
+
#endif
--
2.39.2
Add a framework for VM functions to handle stage-2 write faults from Gunyah
guest virtual machines. IO handlers have a range of addresses which they
apply to. Optionally, they may apply only when the value written
matches the IO handler's value.
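The intended matching rule can be sketched in userspace C as below. This is
an assumption-laden model rather than the rb-tree code in this patch: it
treats a handler as matching on its start address, treats a length of 0 as
"any access width", and applies the data check only when datamatch is set.
struct model_io_handler and model_io_matches() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the matching fields in struct gh_vm_io_handler. */
struct model_io_handler {
	uint64_t addr;
	uint8_t len;    /* 0: any access width */
	bool datamatch; /* true: only writes of @data match */
	uint64_t data;
};

/* Return true if a guest write of @data with width @len at @addr matches. */
static bool model_io_matches(const struct model_io_handler *h, uint64_t addr,
			     uint32_t len, uint64_t data)
{
	assert(h);

	if (h->addr != addr)
		return false;
	/* A zero length on either side acts as a wildcard. */
	if (h->len && len && h->len != len)
		return false;
	/* The written value only matters for datamatch handlers. */
	if (h->datamatch && h->data != data)
		return false;
	return true;
}
```

A handler without datamatch therefore covers every value written to its
address, while a datamatch handler behaves like an ioeventfd that fires only
for one specific value.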
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 94 +++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 4 ++
include/linux/gunyah_vm_mgr.h | 25 ++++++++++
3 files changed, 123 insertions(+)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 0269bcdaf692..b31fac15ff45 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -233,6 +233,100 @@ static void gh_vm_add_resource(struct gh_vm *ghvm, struct gh_resource *ghrsc)
mutex_unlock(&ghvm->resources_lock);
}
+static int _gh_vm_io_handler_compare(const struct rb_node *node, const struct rb_node *parent)
+{
+ struct gh_vm_io_handler *n = container_of(node, struct gh_vm_io_handler, node);
+ struct gh_vm_io_handler *p = container_of(parent, struct gh_vm_io_handler, node);
+
+ if (n->addr < p->addr)
+ return -1;
+ if (n->addr > p->addr)
+ return 1;
+ if ((n->len && !p->len) || (!n->len && p->len))
+ return 0;
+ if (n->len < p->len)
+ return -1;
+ if (n->len > p->len)
+ return 1;
+ if (!n->datamatch || !p->datamatch)
+ return 0;
+ if (n->data != p->data)
+ return n->data < p->data ? -1 : 1;
+ return 0;
+}
+
+static int gh_vm_io_handler_compare(struct rb_node *node, const struct rb_node *parent)
+{
+ return _gh_vm_io_handler_compare(node, parent);
+}
+
+static int gh_vm_io_handler_find(const void *key, const struct rb_node *node)
+{
+ const struct gh_vm_io_handler *k = key;
+
+ return _gh_vm_io_handler_compare(&k->node, node);
+}
+
+static struct gh_vm_io_handler *gh_vm_mgr_find_io_hdlr(struct gh_vm *ghvm, u64 addr,
+ u64 len, u64 data)
+{
+ struct gh_vm_io_handler key = {
+ .addr = addr,
+ .len = len,
+ .datamatch = true, .data = data,
+ };
+ struct rb_node *node;
+
+ node = rb_find(&key, &ghvm->mmio_handler_root, gh_vm_io_handler_find);
+ if (!node)
+ return NULL;
+
+ return container_of(node, struct gh_vm_io_handler, node);
+}
+
+int gh_vm_mmio_write(struct gh_vm *ghvm, u64 addr, u32 len, u64 data)
+{
+ struct gh_vm_io_handler *io_hdlr = NULL;
+ int ret;
+
+ down_read(&ghvm->mmio_handler_lock);
+ io_hdlr = gh_vm_mgr_find_io_hdlr(ghvm, addr, len, data);
+ if (!io_hdlr || !io_hdlr->ops || !io_hdlr->ops->write) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ ret = io_hdlr->ops->write(io_hdlr, addr, len, data);
+
+out:
+ up_read(&ghvm->mmio_handler_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_vm_mmio_write);
+
+int gh_vm_add_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_hdlr)
+{
+ struct rb_node *found;
+
+ if (io_hdlr->datamatch && (!io_hdlr->len || io_hdlr->len > sizeof(io_hdlr->data)))
+ return -EINVAL;
+
+ down_write(&ghvm->mmio_handler_lock);
+ found = rb_find_add(&io_hdlr->node, &ghvm->mmio_handler_root, gh_vm_io_handler_compare);
+ up_write(&ghvm->mmio_handler_lock);
+
+ return found ? -EEXIST : 0;
+}
+EXPORT_SYMBOL_GPL(gh_vm_add_io_handler);
+
+void gh_vm_remove_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_hdlr)
+{
+ down_write(&ghvm->mmio_handler_lock);
+ rb_erase(&io_hdlr->node, &ghvm->mmio_handler_root);
+ up_write(&ghvm->mmio_handler_lock);
+}
+EXPORT_SYMBOL_GPL(gh_vm_remove_io_handler);
+
static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
{
struct gh_rm_vm_status_payload *payload = data;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 18d0e1effd25..9c1046af80ed 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -55,6 +55,8 @@ struct gh_vm {
struct mutex resources_lock;
struct list_head resources;
struct list_head resource_tickets;
+ struct rb_root mmio_handler_root;
+ struct rw_semaphore mmio_handler_lock;
};
int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
@@ -63,4 +65,6 @@ int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label);
struct gh_vm_mem *gh_vm_mem_find_by_addr(struct gh_vm *ghvm, u64 guest_phys_addr, u32 size);
+int gh_vm_mmio_write(struct gh_vm *ghvm, u64 addr, u32 len, u64 data);
+
#endif
diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
index 01b1761b5923..2dbf5e5f4037 100644
--- a/include/linux/gunyah_vm_mgr.h
+++ b/include/linux/gunyah_vm_mgr.h
@@ -84,4 +84,29 @@ struct gh_vm_resource_ticket {
int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
+/*
+ * gh_vm_io_handler contains the info about an io device and its associated
+ * addr and the ops associated with the io device.
+ */
+struct gh_vm_io_handler {
+ struct rb_node node;
+ u64 addr;
+
+ bool datamatch;
+ u8 len;
+ u64 data;
+ struct gh_vm_io_handler_ops *ops;
+};
+
+/*
+ * gh_vm_io_handler_ops contains function pointers associated with an iodevice.
+ */
+struct gh_vm_io_handler_ops {
+ int (*read)(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data);
+ int (*write)(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data);
+};
+
+int gh_vm_add_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_dev);
+void gh_vm_remove_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_dev);
+
#endif
--
2.39.2
Gunyah allows host virtual machines to schedule guest virtual machines
and handle their MMIO accesses. vCPUs are presented to the host as a
Gunyah resource and represented to userspace as a Gunyah VM function.
Creating the vcpu VM function will create a file descriptor that:
- can run an ioctl: GH_VCPU_RUN to schedule the guest vCPU until the
next interrupt occurs on the host or when the guest vCPU can no
longer be run.
- can be mmap'd to share a gh_vcpu_run structure, through which userspace
can look up the reason why GH_VCPU_RUN returned and provide return
values for MMIO access.
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 46 ++-
arch/arm64/gunyah/gunyah_hypercall.c | 28 ++
drivers/virt/gunyah/Kconfig | 11 +
drivers/virt/gunyah/Makefile | 2 +
drivers/virt/gunyah/gunyah_vcpu.c | 465 +++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.c | 4 +
drivers/virt/gunyah/vm_mgr.h | 1 +
include/linux/gunyah.h | 8 +
include/uapi/linux/gunyah.h | 108 ++++++
9 files changed, 671 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index af8ad88a88ab..83d326b0d11f 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -5,8 +5,7 @@ Virtual Machine Manager
=======================
The Gunyah Virtual Machine Manager is a Linux driver to support launching
-virtual machines using Gunyah. It presently supports launching non-proxy
-scheduled Linux-like virtual machines.
+virtual machines using Gunyah.
Except for some basic information about the location of initial binaries,
most of the configuration about a Gunyah virtual machine is described in the
@@ -107,3 +106,46 @@ GH_VM_START
~~~~~~~~~~~
This ioctl starts the VM.
+
+GH_VM_ADD_FUNCTION
+~~~~~~~~~~~~~~~~~~
+
+This ioctl registers a Gunyah VM function with the VM manager. The VM function
+is described with a `type` string and some arguments for that type. Typically,
+the function is added before the VM starts, but the function doesn't "operate"
+until the VM starts with GH_VM_START: e.g. vCPU ioctls will all return an error
+until the VM starts because the vCPUs don't exist until the VM is started. This
+allows the VMM to set up all the kernel functionality needed for the VM *before*
+the VM starts.
+
+.. kernel-doc:: include/uapi/linux/gunyah.h
+ :identifiers: gh_fn_desc
+
+The possible types are documented below:
+
+.. kernel-doc:: include/uapi/linux/gunyah.h
+ :identifiers: GH_FN_VCPU gh_fn_vcpu_arg
+
+Gunyah VCPU API Descriptions
+----------------------------
+
+A vCPU file descriptor is created after calling `GH_VM_ADD_FUNCTION` with the type `GH_FN_VCPU`.
+
+GH_VCPU_RUN
+~~~~~~~~~~~
+
+This ioctl is used to run a guest virtual cpu. While there are no
+explicit parameters, there is an implicit parameter block that can be
+obtained by mmap()ing the vcpu fd at offset 0, with the size given by
+GH_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
+gh_vcpu_run' (see below).
+
+GH_VCPU_MMAP_SIZE
+~~~~~~~~~~~~~~~~~
+
+The GH_VCPU_RUN ioctl communicates with userspace via a shared
+memory region. This ioctl returns the size of that region. See the
+GH_VCPU_RUN documentation for details.
+
+.. kernel-doc:: include/uapi/linux/gunyah.h
+ :identifiers: gh_vcpu_run gh_vm_exit_info
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index 3420d8f286a9..f01f5cec4d23 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -43,6 +43,7 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
+#define GH_HYPERCALL_VCPU_RUN GH_HYPERCALL(0x8065)
/**
* gh_hypercall_hyp_identify() - Returns build information and feature flags
@@ -91,5 +92,32 @@ enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t
}
EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
+enum gh_error gh_hypercall_vcpu_run(u64 capid, u64 *resume_data,
+ struct gh_hypercall_vcpu_run_resp *resp)
+{
+ struct arm_smccc_1_2_regs args = {
+ .a0 = GH_HYPERCALL_VCPU_RUN,
+ .a1 = capid,
+ .a2 = resume_data[0],
+ .a3 = resume_data[1],
+ .a4 = resume_data[2],
+ /* C language says this will be implicitly zero. Gunyah requires 0, so be explicit */
+ .a5 = 0,
+ };
+ struct arm_smccc_1_2_regs res;
+
+ arm_smccc_1_2_hvc(&args, &res);
+
+ if (res.a0 == GH_ERROR_OK) {
+ resp->state = res.a1;
+ resp->state_data[0] = res.a2;
+ resp->state_data[1] = res.a3;
+ resp->state_data[2] = res.a4;
+ }
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_vcpu_run);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index de815189dab6..4c1c6110b50e 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -15,3 +15,14 @@ config GUNYAH
config GUNYAH_PLATFORM_HOOKS
tristate
+
+config GUNYAH_VCPU
+ tristate "Runnable Gunyah vCPUs"
+ depends on GUNYAH
+ help
+ Enable kernel support for host-scheduled vCPUs running under Gunyah.
+ When selecting this option, userspace virtual machine managers (VMM)
+ can schedule the guest VM's vCPUs instead of using Gunyah's scheduler.
+ VMMs can also handle stage 2 faults of the vCPUs.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 6b8f84dbfe0d..2d1b604a7b03 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -5,3 +5,5 @@ obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
+
+obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
diff --git a/drivers/virt/gunyah/gunyah_vcpu.c b/drivers/virt/gunyah/gunyah_vcpu.c
new file mode 100644
index 000000000000..870e471a11df
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_vcpu.c
@@ -0,0 +1,465 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_vm_mgr.h>
+#include <linux/interrupt.h>
+#include <linux/kref.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/wait.h>
+
+#include "vm_mgr.h"
+
+#include <uapi/linux/gunyah.h>
+
+#define MAX_VCPU_NAME 20 /* gh-vcpu:u32_max+NUL */
+
+struct gh_vcpu {
+ struct gh_vm_function_instance *f;
+ struct gh_resource *rsc;
+ struct mutex run_lock;
+ /* Track why vcpu_run left last time around. */
+ enum {
+ GH_VCPU_UNKNOWN = 0,
+ GH_VCPU_READY,
+ GH_VCPU_MMIO_READ,
+ GH_VCPU_SYSTEM_DOWN,
+ } state;
+ u8 mmio_read_len;
+ struct gh_vcpu_run *vcpu_run;
+ struct completion ready;
+ struct gh_vm *ghvm;
+
+ struct notifier_block nb;
+ struct gh_vm_resource_ticket ticket;
+ struct kref kref;
+};
+
+/* VCPU is ready to run */
+#define GH_VCPU_STATE_READY 0
+/* VCPU is sleeping until an interrupt arrives */
+#define GH_VCPU_STATE_EXPECTS_WAKEUP 1
+/* VCPU is powered off */
+#define GH_VCPU_STATE_POWERED_OFF 2
+/* VCPU is blocked in EL2 for unspecified reason */
+#define GH_VCPU_STATE_BLOCKED 3
+/* VCPU has returned for MMIO READ */
+#define GH_VCPU_ADDRSPACE_VMMIO_READ 4
+/* VCPU has returned for MMIO WRITE */
+#define GH_VCPU_ADDRSPACE_VMMIO_WRITE 5
+
+static void vcpu_release(struct kref *kref)
+{
+ struct gh_vcpu *vcpu = container_of(kref, struct gh_vcpu, kref);
+
+ free_page((unsigned long)vcpu->vcpu_run);
+ kfree(vcpu);
+}
+
+/*
+ * When hypervisor allows us to schedule vCPU again, it gives us an interrupt
+ */
+static irqreturn_t gh_vcpu_irq_handler(int irq, void *data)
+{
+ struct gh_vcpu *vcpu = data;
+
+ complete(&vcpu->ready);
+ return IRQ_HANDLED;
+}
+
+static bool gh_handle_mmio(struct gh_vcpu *vcpu,
+ struct gh_hypercall_vcpu_run_resp *vcpu_run_resp)
+{
+ int ret = 0;
+ u64 addr = vcpu_run_resp->state_data[0],
+ len = vcpu_run_resp->state_data[1],
+ data = vcpu_run_resp->state_data[2];
+
+ if (vcpu_run_resp->state == GH_VCPU_ADDRSPACE_VMMIO_READ) {
+ vcpu->vcpu_run->mmio.is_write = 0;
+ /* Record that we need to give vCPU user's supplied value next gh_vcpu_run() */
+ vcpu->state = GH_VCPU_MMIO_READ;
+ vcpu->mmio_read_len = len;
+ } else { /* GH_VCPU_ADDRSPACE_VMMIO_WRITE */
+ /* Try internal handlers first */
+ ret = gh_vm_mmio_write(vcpu->f->ghvm, addr, len, data);
+ if (!ret)
+ return true;
+
+ /* Give userspace the info */
+ vcpu->vcpu_run->mmio.is_write = 1;
+ memcpy(vcpu->vcpu_run->mmio.data, &data, len);
+ }
+
+ vcpu->vcpu_run->mmio.phys_addr = addr;
+ vcpu->vcpu_run->mmio.len = len;
+ vcpu->vcpu_run->exit_reason = GH_VCPU_EXIT_MMIO;
+
+ return false;
+}
+
+static int gh_vcpu_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
+{
+ struct gh_vcpu *vcpu = container_of(nb, struct gh_vcpu, nb);
+ struct gh_rm_vm_exited_payload *exit_payload = data;
+
+ if (action == GH_RM_NOTIFICATION_VM_EXITED &&
+ le16_to_cpu(exit_payload->vmid) == vcpu->ghvm->vmid)
+ complete(&vcpu->ready);
+
+ return NOTIFY_OK;
+}
+
+static inline enum gh_vm_status remap_vm_status(enum gh_rm_vm_status rm_status)
+{
+ switch (rm_status) {
+ case GH_RM_VM_STATUS_INIT_FAILED:
+ return GH_VM_STATUS_LOAD_FAILED;
+ case GH_RM_VM_STATUS_EXITED:
+ return GH_VM_STATUS_EXITED;
+ default:
+ return GH_VM_STATUS_CRASHED;
+ }
+}
+
+/**
+ * gh_vcpu_check_system() - Check whether VM as a whole is running
+ * @vcpu: Pointer to gh_vcpu
+ *
+ * Returns true if the VM is alive.
+ * Returns false if the VM is not alive (can only be that the VM is shutting down).
+ */
+static bool gh_vcpu_check_system(struct gh_vcpu *vcpu)
+ __must_hold(&vcpu->run_lock)
+{
+ bool ret = true;
+
+ down_read(&vcpu->ghvm->status_lock);
+ if (likely(vcpu->ghvm->vm_status == GH_RM_VM_STATUS_RUNNING))
+ goto out;
+
+ vcpu->vcpu_run->status.status = remap_vm_status(vcpu->ghvm->vm_status);
+ vcpu->vcpu_run->status.exit_info = vcpu->ghvm->exit_info;
+ vcpu->vcpu_run->exit_reason = GH_VCPU_EXIT_STATUS;
+ vcpu->state = GH_VCPU_SYSTEM_DOWN;
+ ret = false;
+out:
+ up_read(&vcpu->ghvm->status_lock);
+ return ret;
+}
+
+/**
+ * gh_vcpu_run() - Request Gunyah to begin scheduling this vCPU.
+ * @vcpu: The client descriptor that was obtained via gh_vcpu_alloc()
+ */
+static int gh_vcpu_run(struct gh_vcpu *vcpu)
+{
+ struct gh_hypercall_vcpu_run_resp vcpu_run_resp;
+ u64 state_data[3] = { 0 };
+ enum gh_error gh_error;
+ int ret = 0;
+
+ if (!vcpu->f)
+ return -ENODEV;
+
+ if (mutex_lock_interruptible(&vcpu->run_lock))
+ return -ERESTARTSYS;
+
+ if (!vcpu->rsc) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ switch (vcpu->state) {
+ case GH_VCPU_UNKNOWN:
+ if (vcpu->ghvm->vm_status != GH_RM_VM_STATUS_RUNNING) {
+ /* Check if VM is up. If VM is starting, will block until VM is fully up
+ * since that thread does down_write.
+ */
+ if (!gh_vcpu_check_system(vcpu))
+ goto out;
+ }
+ vcpu->state = GH_VCPU_READY;
+ break;
+ case GH_VCPU_MMIO_READ:
+ memcpy(&state_data[0], vcpu->vcpu_run->mmio.data, vcpu->mmio_read_len);
+ vcpu->state = GH_VCPU_READY;
+ break;
+ case GH_VCPU_SYSTEM_DOWN:
+ goto out;
+ default:
+ break;
+ }
+
+ while (!ret && !signal_pending(current)) {
+ if (vcpu->vcpu_run->immediate_exit) {
+ ret = -EINTR;
+ goto out;
+ }
+
+ gh_error = gh_hypercall_vcpu_run(vcpu->rsc->capid, state_data, &vcpu_run_resp);
+ if (gh_error == GH_ERROR_OK) {
+ ret = 0;
+ switch (vcpu_run_resp.state) {
+ case GH_VCPU_STATE_READY:
+ if (need_resched())
+ schedule();
+ break;
+ case GH_VCPU_STATE_POWERED_OFF:
+ /* vcpu might be off because the VM is shut down.
+ * If so, it won't ever run again: exit back to user
+ */
+ if (!gh_vcpu_check_system(vcpu))
+ goto out;
+ /* Otherwise, another vcpu will turn it on (e.g. by PSCI)
+ * and hyp sends an interrupt to wake Linux up.
+ */
+ fallthrough;
+ case GH_VCPU_STATE_EXPECTS_WAKEUP:
+ ret = wait_for_completion_interruptible(&vcpu->ready);
+ /* reinitialize completion before next hypercall. If we reinitialize
+ * after the hypercall, interrupt may have already come before
+ * re-initializing the completion and then end up waiting for
+ * event that already happened.
+ */
+ reinit_completion(&vcpu->ready);
+ /* Check system status again. Completion might've
+ * come from gh_vcpu_rm_notification
+ */
+ if (!ret && !gh_vcpu_check_system(vcpu))
+ goto out;
+ break;
+ case GH_VCPU_STATE_BLOCKED:
+ schedule();
+ break;
+ case GH_VCPU_ADDRSPACE_VMMIO_READ:
+ case GH_VCPU_ADDRSPACE_VMMIO_WRITE:
+ if (!gh_handle_mmio(vcpu, &vcpu_run_resp))
+ goto out;
+ break;
+ default:
+ pr_warn_ratelimited("Unknown vCPU state: %llx\n",
+ vcpu_run_resp.state);
+ schedule();
+ break;
+ }
+ } else if (gh_error == GH_ERROR_RETRY) {
+ schedule();
+ ret = 0;
+ } else
+ ret = gh_remap_error(gh_error);
+ }
+
+out:
+ mutex_unlock(&vcpu->run_lock);
+
+ if (signal_pending(current))
+ return -ERESTARTSYS;
+
+ return ret;
+}
+
+static long gh_vcpu_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+ struct gh_vcpu *vcpu = filp->private_data;
+ long ret = -EINVAL;
+
+ switch (cmd) {
+ case GH_VCPU_RUN:
+ ret = gh_vcpu_run(vcpu);
+ break;
+ case GH_VCPU_MMAP_SIZE:
+ ret = PAGE_SIZE;
+ break;
+ default:
+ break;
+ }
+ return ret;
+}
+
+static int gh_vcpu_release(struct inode *inode, struct file *filp)
+{
+ struct gh_vcpu *vcpu = filp->private_data;
+
+ gh_vm_put(vcpu->ghvm);
+ kref_put(&vcpu->kref, vcpu_release);
+ return 0;
+}
+
+static vm_fault_t gh_vcpu_fault(struct vm_fault *vmf)
+{
+ struct gh_vcpu *vcpu = vmf->vma->vm_file->private_data;
+ struct page *page;
+
+ if (vmf->pgoff)
+ return VM_FAULT_SIGBUS;
+ page = virt_to_page(vcpu->vcpu_run);
+ get_page(page);
+ vmf->page = page;
+ return 0;
+}
+
+static const struct vm_operations_struct gh_vcpu_ops = {
+ .fault = gh_vcpu_fault,
+};
+
+static int gh_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ vma->vm_ops = &gh_vcpu_ops;
+ return 0;
+}
+
+static const struct file_operations gh_vcpu_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = gh_vcpu_ioctl,
+ .release = gh_vcpu_release,
+ .llseek = noop_llseek,
+ .mmap = gh_vcpu_mmap,
+};
+
+static int gh_vcpu_populate(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc)
+{
+ struct gh_vcpu *vcpu = container_of(ticket, struct gh_vcpu, ticket);
+ int ret;
+
+ mutex_lock(&vcpu->run_lock);
+ if (vcpu->rsc) {
+ ret = -EBUSY;
+ goto out;
+ }
+
+ vcpu->rsc = ghrsc;
+ init_completion(&vcpu->ready);
+
+ ret = request_irq(vcpu->rsc->irq, gh_vcpu_irq_handler, IRQF_TRIGGER_RISING, "gh_vcpu",
+ vcpu);
+ if (ret)
+ pr_warn("Failed to request vcpu irq %d: %d\n", vcpu->rsc->irq, ret);
+
+out:
+ mutex_unlock(&vcpu->run_lock);
+ return ret;
+}
+
+static void gh_vcpu_unpopulate(struct gh_vm_resource_ticket *ticket,
+ struct gh_resource *ghrsc)
+{
+ struct gh_vcpu *vcpu = container_of(ticket, struct gh_vcpu, ticket);
+
+ vcpu->vcpu_run->immediate_exit = true;
+ complete_all(&vcpu->ready);
+ mutex_lock(&vcpu->run_lock);
+ free_irq(vcpu->rsc->irq, vcpu);
+ vcpu->rsc = NULL;
+ mutex_unlock(&vcpu->run_lock);
+}
+
+static long gh_vcpu_bind(struct gh_vm_function_instance *f)
+{
+ struct gh_fn_vcpu_arg *arg = f->argp;
+ struct gh_vcpu *vcpu;
+ char name[MAX_VCPU_NAME];
+ struct file *file;
+ struct page *page;
+ int fd;
+ long r;
+
+ if (!gh_api_has_feature(GH_FEATURE_VCPU))
+ return -EOPNOTSUPP;
+
+ if (f->arg_size != sizeof(*arg))
+ return -EINVAL;
+
+ vcpu = kzalloc(sizeof(*vcpu), GFP_KERNEL);
+ if (!vcpu)
+ return -ENOMEM;
+
+ vcpu->f = f;
+ f->data = vcpu;
+ mutex_init(&vcpu->run_lock);
+ kref_init(&vcpu->kref);
+
+ page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!page) {
+ r = -ENOMEM;
+ goto err_destroy_vcpu;
+ }
+ vcpu->vcpu_run = page_address(page);
+
+ vcpu->ticket.resource_type = GH_RESOURCE_TYPE_VCPU;
+ vcpu->ticket.label = arg->id;
+ vcpu->ticket.owner = THIS_MODULE;
+ vcpu->ticket.populate = gh_vcpu_populate;
+ vcpu->ticket.unpopulate = gh_vcpu_unpopulate;
+
+ r = gh_vm_add_resource_ticket(f->ghvm, &vcpu->ticket);
+ if (r)
+ goto err_destroy_page;
+
+ fd = get_unused_fd_flags(O_CLOEXEC);
+ if (fd < 0) {
+ r = fd;
+ goto err_remove_vcpu;
+ }
+
+ if (!gh_vm_get(f->ghvm)) {
+ r = -ENODEV;
+ goto err_put_fd;
+ }
+ vcpu->ghvm = f->ghvm;
+
+ vcpu->nb.notifier_call = gh_vcpu_rm_notification;
+ /* Ensure we run after the vm_mgr handles the notification and does
+ * any necessary state changes. We wake up to check the new state.
+ */
+ vcpu->nb.priority = -1;
+ r = gh_rm_notifier_register(f->rm, &vcpu->nb);
+ if (r)
+ goto err_put_gh_vm;
+
+ kref_get(&vcpu->kref);
+ snprintf(name, sizeof(name), "gh-vcpu:%u", vcpu->ticket.label);
+ file = anon_inode_getfile(name, &gh_vcpu_fops, vcpu, O_RDWR);
+ if (IS_ERR(file)) {
+ r = PTR_ERR(file);
+ goto err_notifier;
+ }
+
+ fd_install(fd, file);
+
+ return fd;
+err_notifier:
+ gh_rm_notifier_unregister(f->rm, &vcpu->nb);
+err_put_gh_vm:
+ gh_vm_put(vcpu->ghvm);
+err_put_fd:
+ put_unused_fd(fd);
+err_remove_vcpu:
+ gh_vm_remove_resource_ticket(f->ghvm, &vcpu->ticket);
+err_destroy_page:
+ free_page((unsigned long)vcpu->vcpu_run);
+err_destroy_vcpu:
+ kfree(vcpu);
+ return r;
+}
+
+static void gh_vcpu_unbind(struct gh_vm_function_instance *f)
+{
+ struct gh_vcpu *vcpu = f->data;
+
+ gh_rm_notifier_unregister(f->rm, &vcpu->nb);
+ gh_vm_remove_resource_ticket(vcpu->f->ghvm, &vcpu->ticket);
+ vcpu->f = NULL;
+
+ kref_put(&vcpu->kref, vcpu_release);
+}
+
+DECLARE_GH_VM_FUNCTION_INIT(vcpu, GH_FN_VCPU, gh_vcpu_bind, gh_vcpu_unbind);
+MODULE_DESCRIPTION("Gunyah vCPU Driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index b31fac15ff45..d453d902847e 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -354,6 +354,10 @@ static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
down_write(&ghvm->status_lock);
ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
+ ghvm->exit_info.type = le16_to_cpu(payload->exit_type);
+ ghvm->exit_info.reason_size = le32_to_cpu(payload->exit_reason_size);
+ memcpy(&ghvm->exit_info.reason, payload->exit_reason,
+ min(GH_VM_MAX_EXIT_REASON_SIZE, ghvm->exit_info.reason_size));
up_write(&ghvm->status_lock);
return NOTIFY_DONE;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 9c1046af80ed..df78756639b6 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -45,6 +45,7 @@ struct gh_vm {
enum gh_rm_vm_status vm_status;
wait_queue_head_t vm_status_wait;
struct rw_semaphore status_lock;
+ struct gh_vm_exit_info exit_info;
struct work_struct free_work;
struct kref kref;
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 3e706b59d2c0..37f1e2c822ce 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -175,4 +175,12 @@ enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_
enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
bool *ready);
+struct gh_hypercall_vcpu_run_resp {
+ u64 state;
+ u64 state_data[3];
+};
+
+enum gh_error gh_hypercall_vcpu_run(u64 capid, u64 *resume_data,
+ struct gh_hypercall_vcpu_run_resp *resp);
+
#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index caeb3b3a3e9a..e52265fa5715 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
+/**
+ * GH_FN_VCPU - create a vCPU instance to control a vCPU
+ *
+ * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
+ *
+ * The vcpu type will register with the VM Manager to expect to control
+ * vCPU number `id`. It returns a file descriptor allowing interaction with
+ * the vCPU. See the Gunyah vCPU API description sections for interacting with
+ * the Gunyah vCPU file descriptors.
+ *
+ * Return: file descriptor to manipulate the vcpu. See GH_VCPU_* ioctls
+ */
+#define GH_FN_VCPU 1
+
#define GH_FN_MAX_ARG_SIZE 256
+/**
+ * struct gh_fn_vcpu_arg - Arguments to create a vCPU
+ * @id: vcpu id
+ */
+struct gh_fn_vcpu_arg {
+ __u32 id;
+};
+
+#define GH_IRQFD_LEVEL (1UL << 0)
+
/**
* struct gh_fn_desc - Arguments to create a VM function
* @type: Type of the function. See GH_FN_* macro for supported types
@@ -79,4 +103,88 @@ struct gh_fn_desc {
#define GH_VM_ADD_FUNCTION _IOW(GH_IOCTL_TYPE, 0x4, struct gh_fn_desc)
#define GH_VM_REMOVE_FUNCTION _IOW(GH_IOCTL_TYPE, 0x7, struct gh_fn_desc)
+enum gh_vm_status {
+ GH_VM_STATUS_LOAD_FAILED = 1,
+#define GH_VM_STATUS_LOAD_FAILED GH_VM_STATUS_LOAD_FAILED
+ GH_VM_STATUS_EXITED = 2,
+#define GH_VM_STATUS_EXITED GH_VM_STATUS_EXITED
+ GH_VM_STATUS_CRASHED = 3,
+#define GH_VM_STATUS_CRASHED GH_VM_STATUS_CRASHED
+};
+
+/*
+ * Gunyah presently sends max 4 bytes of exit_reason.
+ * If that changes, this macro can be safely increased without breaking
+ * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
+ */
+#define GH_VM_MAX_EXIT_REASON_SIZE 8u
+
+/**
+ * struct gh_vm_exit_info - Reason for VM exit as reported by Gunyah
+ * See Gunyah documentation for values.
+ * @type: Describes how VM exited
+ * @padding: padding bytes
+ * @reason_size: Number of bytes valid for `reason`
+ * @reason: See Gunyah documentation for interpretation. Note: these values are
+ * not interpreted by Linux and need to be converted from little-endian
+ * as applicable.
+ */
+struct gh_vm_exit_info {
+ __u16 type;
+ __u16 padding;
+ __u32 reason_size;
+ __u8 reason[GH_VM_MAX_EXIT_REASON_SIZE];
+};
+
+#define GH_VCPU_EXIT_UNKNOWN 0
+#define GH_VCPU_EXIT_MMIO 1
+#define GH_VCPU_EXIT_STATUS 2
+
+/**
+ * struct gh_vcpu_run - Application code obtains a pointer to the gh_vcpu_run
+ * structure by mmap()ing a vcpu fd.
+ * @immediate_exit: polled when scheduling the vcpu. If set, immediately returns -EINTR.
+ * @padding: padding bytes
+ * @exit_reason: Set when GH_VCPU_RUN returns successfully and gives reason why
+ * GH_VCPU_RUN has stopped running the vCPU.
+ * @mmio: Used when exit_reason == GH_VCPU_EXIT_MMIO
+ * The guest has faulted on a memory-mapped I/O instruction that
+ * couldn't be satisfied by Gunyah.
+ * @mmio.phys_addr: Address guest tried to access
+ * @mmio.data: the value that was written if `is_write == 1`. Filled by
+ * user for reads (`is_write == 0`).
+ * @mmio.len: Length of write. Only the first `len` bytes of `data`
+ * are considered by Gunyah.
+ * @mmio.is_write: 1 if VM tried to perform a write, 0 for a read
+ * @status: Used when exit_reason == GH_VCPU_EXIT_STATUS.
+ * The guest VM is no longer runnable. This struct informs why.
+ * @status.status: See `enum gh_vm_status` for possible values
+ * @status.exit_info: Used when status == GH_VM_STATUS_EXITED
+ */
+struct gh_vcpu_run {
+ /* in */
+ __u8 immediate_exit;
+ __u8 padding[7];
+
+ /* out */
+ __u32 exit_reason;
+
+ union {
+ struct {
+ __u64 phys_addr;
+ __u8 data[8];
+ __u32 len;
+ __u8 is_write;
+ } mmio;
+
+ struct {
+ enum gh_vm_status status;
+ struct gh_vm_exit_info exit_info;
+ } status;
+ };
+};
+
+#define GH_VCPU_RUN _IO(GH_IOCTL_TYPE, 0x5)
+#define GH_VCPU_MMAP_SIZE _IO(GH_IOCTL_TYPE, 0x6)
+
#endif
--
2.39.2
Enable support for creating irqfds which can raise an interrupt on a
Gunyah virtual machine. irqfds are exposed to userspace as a Gunyah VM
function with the name "irqfd". If the VM devicetree is not configured
to create a doorbell with the corresponding label, userspace will still
be able to assert the eventfd but no interrupt will be raised on the
guest.
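A minimal sketch of how userspace might prepare the irqfd arguments is shown below. The structure and flag are local mirrors of the definitions added in this patch, and demo_fill_irqfd_arg() is a hypothetical helper. In a real VMM the fd would come from eventfd(2) and the filled structure would be handed to GH_VM_ADD_FUNCTION inside a gh_fn_desc, whose layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of definitions from this patch (assumption: a real VMM
 * would take these from <linux/gunyah.h>). */
#define DEMO_GH_FN_IRQFD    2
#define DEMO_GH_IRQFD_LEVEL (1UL << 0)

struct demo_gh_fn_irqfd_arg {
	uint32_t fd;      /* eventfd that raises the doorbell when written */
	uint32_t label;   /* doorbell label from the guest devicetree */
	uint32_t flags;   /* 0 or DEMO_GH_IRQFD_LEVEL */
	uint32_t padding;
};

/*
 * Hypothetical helper: fill the argument block for an irqfd function.
 * Returns 0 on success, -1 if the eventfd is not a valid descriptor.
 */
int demo_fill_irqfd_arg(struct demo_gh_fn_irqfd_arg *arg, int eventfd,
			uint32_t label, int level)
{
	assert(arg != NULL);

	if (eventfd < 0)
		return -1;

	/* Zero everything first so padding and unused flag bits are clear. */
	memset(arg, 0, sizeof(*arg));
	arg->fd = (uint32_t)eventfd;
	arg->label = label;
	arg->flags = level ? (uint32_t)DEMO_GH_IRQFD_LEVEL : 0;
	return 0;
}
```

Clearing the structure before filling it is a defensive choice: the bind callback in this patch rejects any flag bit other than GH_IRQFD_LEVEL, so stale bits in flags would make the request fail with -EINVAL.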
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 2 +-
drivers/virt/gunyah/Kconfig | 9 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_irqfd.c | 164 +++++++++++++++++++++++
include/linux/gunyah.h | 5 +
include/uapi/linux/gunyah.h | 30 +++++
6 files changed, 210 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index 83d326b0d11f..a1dd70f0cbf6 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -124,7 +124,7 @@ the VM starts.
The possible types are documented below:
.. kernel-doc:: include/uapi/linux/gunyah.h
- :identifiers: GH_FN_VCPU gh_fn_vcpu_arg
+ :identifiers: GH_FN_VCPU gh_fn_vcpu_arg GH_FN_IRQFD gh_fn_irqfd_arg
Gunyah VCPU API Descriptions
----------------------------
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 4c1c6110b50e..2cde24d429d1 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -26,3 +26,12 @@ config GUNYAH_VCPU
VMMs can also handle stage 2 faults of the vCPUs.
Say Y/M here if unsure and you want to support Gunyah VMMs.
+
+config GUNYAH_IRQFD
+ tristate "Gunyah irqfd interface"
+ depends on GUNYAH
+ help
+ Enable kernel support for creating irqfds which can raise an interrupt
+ on a Gunyah virtual machine.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 2d1b604a7b03..6cf756bfa3c2 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -7,3 +7,4 @@ gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
+obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
diff --git a/drivers/virt/gunyah/gunyah_irqfd.c b/drivers/virt/gunyah/gunyah_irqfd.c
new file mode 100644
index 000000000000..38e5fe266b00
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_irqfd.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_vm_mgr.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/printk.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gh_irqfd {
+ struct gh_resource *ghrsc;
+ struct gh_vm_resource_ticket ticket;
+ struct gh_vm_function_instance *f;
+
+ bool level;
+
+ struct eventfd_ctx *ctx;
+ wait_queue_entry_t wait;
+ poll_table pt;
+};
+
+static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync, void *key)
+{
+ struct gh_irqfd *irqfd = container_of(wait, struct gh_irqfd, wait);
+ __poll_t flags = key_to_poll(key);
+ u64 enable_mask = GH_BELL_NONBLOCK;
+ u64 old_flags;
+ int ret = 0;
+
+ if (flags & EPOLLIN) {
+ if (irqfd->ghrsc) {
+ ret = gh_hypercall_bell_send(irqfd->ghrsc->capid, enable_mask, &old_flags);
+ if (ret)
+ pr_err_ratelimited("Failed to inject interrupt %d: %d\n",
+ irqfd->ticket.label, ret);
+ } else
+ pr_err_ratelimited("Premature injection of interrupt\n");
+ }
+
+ return 0;
+}
+
+static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh, poll_table *pt)
+{
+ struct gh_irqfd *irq_ctx = container_of(pt, struct gh_irqfd, pt);
+
+ add_wait_queue(wqh, &irq_ctx->wait);
+}
+
+static int gh_irqfd_populate(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc)
+{
+ struct gh_irqfd *irqfd = container_of(ticket, struct gh_irqfd, ticket);
+ u64 enable_mask = GH_BELL_NONBLOCK;
+ u64 ack_mask = ~0;
+ int ret = 0;
+
+ if (irqfd->ghrsc) {
+ pr_warn("irqfd%d already got a Gunyah resource. Check if multiple resources with same label were configured.\n",
+ irqfd->ticket.label);
+ return -EBUSY;
+ }
+
+ irqfd->ghrsc = ghrsc;
+ if (irqfd->level) {
+ ret = gh_hypercall_bell_set_mask(irqfd->ghrsc->capid, enable_mask, ack_mask);
+ if (ret)
+ pr_warn("irq %d couldn't be set as level triggered. Might cause IRQ storm if asserted\n",
+ irqfd->ticket.label);
+ }
+
+ return 0;
+}
+
+static void gh_irqfd_unpopulate(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc)
+{
+ struct gh_irqfd *irqfd = container_of(ticket, struct gh_irqfd, ticket);
+ u64 cnt;
+
+ eventfd_ctx_remove_wait_queue(irqfd->ctx, &irqfd->wait, &cnt);
+}
+
+static long gh_irqfd_bind(struct gh_vm_function_instance *f)
+{
+ struct gh_fn_irqfd_arg *args = f->argp;
+ struct gh_irqfd *irqfd;
+ __poll_t events;
+ struct fd fd;
+ long r;
+
+ if (f->arg_size != sizeof(*args))
+ return -EINVAL;
+
+ /* All other flag bits are reserved for future use */
+ if (args->flags & ~GH_IRQFD_LEVEL)
+ return -EINVAL;
+
+ irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+ if (!irqfd)
+ return -ENOMEM;
+
+ irqfd->f = f;
+ f->data = irqfd;
+
+ fd = fdget(args->fd);
+ if (!fd.file) {
+ kfree(irqfd);
+ return -EBADF;
+ }
+
+ irqfd->ctx = eventfd_ctx_fileget(fd.file);
+ if (IS_ERR(irqfd->ctx)) {
+ r = PTR_ERR(irqfd->ctx);
+ goto err_fdput;
+ }
+
+ if (args->flags & GH_IRQFD_LEVEL)
+ irqfd->level = true;
+
+ init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
+ init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
+
+ irqfd->ticket.resource_type = GH_RESOURCE_TYPE_BELL_TX;
+ irqfd->ticket.label = args->label;
+ irqfd->ticket.owner = THIS_MODULE;
+ irqfd->ticket.populate = gh_irqfd_populate;
+ irqfd->ticket.unpopulate = gh_irqfd_unpopulate;
+
+ r = gh_vm_add_resource_ticket(f->ghvm, &irqfd->ticket);
+ if (r)
+ goto err_ctx;
+
+ events = vfs_poll(fd.file, &irqfd->pt);
+ if (events & EPOLLIN)
+ pr_warn("Premature injection of interrupt\n");
+ fdput(fd);
+
+ return 0;
+err_ctx:
+ eventfd_ctx_put(irqfd->ctx);
+err_fdput:
+ fdput(fd);
+ kfree(irqfd);
+ return r;
+}
+
+static void gh_irqfd_unbind(struct gh_vm_function_instance *f)
+{
+ struct gh_irqfd *irqfd = f->data;
+
+ gh_vm_remove_resource_ticket(irqfd->f->ghvm, &irqfd->ticket);
+ eventfd_ctx_put(irqfd->ctx);
+ kfree(irqfd);
+}
+
+DECLARE_GH_VM_FUNCTION_INIT(irqfd, GH_FN_IRQFD, gh_irqfd_bind, gh_irqfd_unbind);
+MODULE_DESCRIPTION("Gunyah irqfds");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 63395dacc1a8..0344b6988cfa 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -33,6 +33,11 @@ struct gh_resource {
u32 rm_label;
};
+/**
+ * Gunyah Doorbells
+ */
+#define GH_BELL_NONBLOCK BIT(32)
+
/**
* Gunyah Message Queues
*/
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index e52265fa5715..5617dadc1c7b 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -76,6 +76,19 @@ struct gh_vm_dtb_config {
*/
#define GH_FN_VCPU 1
+/**
+ * GH_FN_IRQFD - register eventfd to assert a Gunyah doorbell
+ *
+ * gh_fn_desc is filled with gh_fn_irqfd_arg
+ *
+ * Allows setting an eventfd to directly trigger a guest interrupt.
+ * irqfd.fd specifies the file descriptor to use as the eventfd.
+ * irqfd.label corresponds to the doorbell label used in the guest VM's devicetree.
+ *
+ * Return: 0
+ */
+#define GH_FN_IRQFD 2
+
#define GH_FN_MAX_ARG_SIZE 256
/**
@@ -88,6 +101,23 @@ struct gh_fn_vcpu_arg {
#define GH_IRQFD_LEVEL (1UL << 0)
+/**
+ * struct gh_fn_irqfd_arg - Arguments to create an irqfd function
+ * @fd: an eventfd which when written to will raise a doorbell
+ * @label: Label of the doorbell created on the guest VM
+ * @flags: GH_IRQFD_LEVEL configures the corresponding doorbell to behave
+ * like a level triggered interrupt.
+ * @padding: padding bytes
+ */
+struct gh_fn_irqfd_arg {
+ __u32 fd;
+ __u32 label;
+ __u32 flags;
+ __u32 padding;
+};
+
+#define GH_IOEVENTFD_DATAMATCH (1UL << 0)
+
/**
* struct gh_fn_desc - Arguments to create a VM function
* @type: Type of the function. See GH_FN_* macro for supported types
--
2.39.2
When booting a Gunyah virtual machine, the host VM may gain capabilities
to interact with resources for the guest virtual machine. Examples of
such resources are vCPUs or message queues. To use those resources, we
need to translate the RM response into a gh_resource structure, which is
useful to Linux drivers. Presently, Linux drivers need only to know
the type of resource, the capability ID, and an interrupt.
On ARM64 systems, the interrupt reported by Gunyah is the GIC interrupt
ID number and is always an SPI.
Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/include/asm/gunyah.h | 23 +++++
drivers/virt/gunyah/rsc_mgr.c | 163 +++++++++++++++++++++++++++++++-
include/linux/gunyah.h | 4 +
include/linux/gunyah_rsc_mgr.h | 3 +
4 files changed, 192 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/gunyah.h
diff --git a/arch/arm64/include/asm/gunyah.h b/arch/arm64/include/asm/gunyah.h
new file mode 100644
index 000000000000..64cfb964efee
--- /dev/null
+++ b/arch/arm64/include/asm/gunyah.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#ifndef __ASM_GUNYAH_H_
+#define __ASM_GUNYAH_H_
+
+#include <linux/irq.h>
+#include <dt-bindings/interrupt-controller/arm-gic.h>
+
+static inline int arch_gh_fill_irq_fwspec_params(u32 virq, struct irq_fwspec *fwspec)
+{
+ if (virq < 32 || virq > 1019)
+ return -EINVAL;
+
+ fwspec->param_count = 3;
+ fwspec->param[0] = GIC_SPI;
+ fwspec->param[1] = virq - 32;
+ fwspec->param[2] = IRQ_TYPE_EDGE_RISING;
+ return 0;
+}
+
+#endif
diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
index d7ce692d0067..383be5ac0f44 100644
--- a/drivers/virt/gunyah/rsc_mgr.c
+++ b/drivers/virt/gunyah/rsc_mgr.c
@@ -17,6 +17,8 @@
#include <linux/platform_device.h>
#include <linux/miscdevice.h>
+#include <asm/gunyah.h>
+
#include "rsc_mgr.h"
#include "vm_mgr.h"
@@ -132,6 +134,7 @@ struct gh_rm_connection {
* @send_lock: synchronization to allow only one request to be sent at a time
* @nh: notifier chain for clients interested in RM notification messages
* @miscdev: /dev/gunyah
+ * @irq_domain: Domain to translate Gunyah hwirqs to Linux irqs
*/
struct gh_rm {
struct device *dev;
@@ -150,6 +153,7 @@ struct gh_rm {
struct blocking_notifier_head nh;
struct miscdevice miscdev;
+ struct irq_domain *irq_domain;
};
/**
@@ -190,6 +194,134 @@ static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
}
}
+struct gh_irq_chip_data {
+ u32 gh_virq;
+};
+
+static struct irq_chip gh_rm_irq_chip = {
+ .name = "Gunyah",
+ .irq_enable = irq_chip_enable_parent,
+ .irq_disable = irq_chip_disable_parent,
+ .irq_ack = irq_chip_ack_parent,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_mask_ack = irq_chip_mask_ack_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+ .irq_eoi = irq_chip_eoi_parent,
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+ .irq_set_type = irq_chip_set_type_parent,
+ .irq_set_wake = irq_chip_set_wake_parent,
+ .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
+ .irq_retrigger = irq_chip_retrigger_hierarchy,
+ .irq_get_irqchip_state = irq_chip_get_parent_state,
+ .irq_set_irqchip_state = irq_chip_set_parent_state,
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int gh_rm_irq_domain_alloc(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ struct gh_irq_chip_data *chip_data, *spec = arg;
+ struct irq_fwspec parent_fwspec;
+ struct gh_rm *rm = d->host_data;
+ u32 gh_virq = spec->gh_virq;
+ int ret;
+
+ if (nr_irqs != 1 || gh_virq == U32_MAX)
+ return -EINVAL;
+
+ chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL);
+ if (!chip_data)
+ return -ENOMEM;
+
+ chip_data->gh_virq = gh_virq;
+
+ ret = irq_domain_set_hwirq_and_chip(d, virq, chip_data->gh_virq, &gh_rm_irq_chip,
+ chip_data);
+ if (ret)
+ goto err_free_irq_data;
+
+ parent_fwspec.fwnode = d->parent->fwnode;
+ ret = arch_gh_fill_irq_fwspec_params(chip_data->gh_virq, &parent_fwspec);
+ if (ret) {
+ dev_err(rm->dev, "virq translation failed %u: %d\n", chip_data->gh_virq, ret);
+ goto err_free_irq_data;
+ }
+
+ ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, &parent_fwspec);
+ if (ret)
+ goto err_free_irq_data;
+
+ return ret;
+err_free_irq_data:
+ kfree(chip_data);
+ return ret;
+}
+
+static void gh_rm_irq_domain_free_single(struct irq_domain *d, unsigned int virq)
+{
+ struct gh_irq_chip_data *chip_data;
+ struct irq_data *irq_data;
+
+ irq_data = irq_domain_get_irq_data(d, virq);
+ if (!irq_data)
+ return;
+
+ chip_data = irq_data->chip_data;
+
+ kfree(chip_data);
+ irq_data->chip_data = NULL;
+}
+
+static void gh_rm_irq_domain_free(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs)
+{
+ unsigned int i;
+
+ for (i = 0; i < nr_irqs; i++)
+ gh_rm_irq_domain_free_single(d, virq + i);
+}
+
+static const struct irq_domain_ops gh_rm_irq_domain_ops = {
+ .alloc = gh_rm_irq_domain_alloc,
+ .free = gh_rm_irq_domain_free,
+};
+
+struct gh_resource *gh_rm_alloc_resource(struct gh_rm *rm, struct gh_rm_hyp_resource *hyp_resource)
+{
+ struct gh_resource *ghrsc;
+
+ ghrsc = kzalloc(sizeof(*ghrsc), GFP_KERNEL);
+ if (!ghrsc)
+ return NULL;
+
+ ghrsc->type = hyp_resource->type;
+ ghrsc->capid = le64_to_cpu(hyp_resource->cap_id);
+ ghrsc->irq = IRQ_NOTCONNECTED;
+ ghrsc->rm_label = le32_to_cpu(hyp_resource->resource_label);
+ if (hyp_resource->virq && le32_to_cpu(hyp_resource->virq) != U32_MAX) {
+ struct gh_irq_chip_data irq_data = {
+ .gh_virq = le32_to_cpu(hyp_resource->virq),
+ };
+
+ ghrsc->irq = irq_domain_alloc_irqs(rm->irq_domain, 1, NUMA_NO_NODE, &irq_data);
+ if ((int)ghrsc->irq < 0) {
+ dev_err(rm->dev,
+ "Failed to allocate interrupt for resource %d label: %d: %d\n",
+ ghrsc->type, ghrsc->rm_label, ghrsc->irq);
+ ghrsc->irq = IRQ_NOTCONNECTED;
+ }
+ }
+
+ return ghrsc;
+}
+
+void gh_rm_free_resource(struct gh_resource *ghrsc)
+{
+ irq_dispose_mapping(ghrsc->irq);
+ kfree(ghrsc);
+}
+
static int gh_rm_init_connection_payload(struct gh_rm_connection *connection, void *msg,
size_t hdr_size, size_t msg_size)
{
@@ -639,6 +771,8 @@ static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool t
static int gh_rm_drv_probe(struct platform_device *pdev)
{
+ struct irq_domain *parent_irq_domain;
+ struct device_node *parent_irq_node;
struct gh_msgq_tx_data *msg;
struct gh_rm *rm;
int ret;
@@ -675,15 +809,41 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
if (ret)
goto err_cache;
+ parent_irq_node = of_irq_find_parent(pdev->dev.of_node);
+ if (!parent_irq_node) {
+ dev_err(&pdev->dev, "Failed to find interrupt parent of resource manager\n");
+ ret = -ENODEV;
+ goto err_msgq;
+ }
+
+ parent_irq_domain = irq_find_host(parent_irq_node);
+ if (!parent_irq_domain) {
+ dev_err(&pdev->dev, "Failed to find interrupt parent domain of resource manager\n");
+ ret = -ENODEV;
+ goto err_msgq;
+ }
+
+ rm->irq_domain = irq_domain_add_hierarchy(parent_irq_domain, 0, 0, pdev->dev.of_node,
+ &gh_rm_irq_domain_ops, NULL);
+ if (!rm->irq_domain) {
+ dev_err(&pdev->dev, "Failed to add irq domain\n");
+ ret = -ENODEV;
+ goto err_msgq;
+ }
+ rm->irq_domain->host_data = rm;
+
+ rm->miscdev.parent = &pdev->dev;
rm->miscdev.name = "gunyah";
rm->miscdev.minor = MISC_DYNAMIC_MINOR;
rm->miscdev.fops = &gh_dev_fops;
ret = misc_register(&rm->miscdev);
if (ret)
- goto err_msgq;
+ goto err_irq_domain;
return 0;
+err_irq_domain:
+ irq_domain_remove(rm->irq_domain);
err_msgq:
mbox_free_channel(gh_msgq_chan(&rm->msgq));
gh_msgq_remove(&rm->msgq);
@@ -697,6 +857,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
struct gh_rm *rm = platform_get_drvdata(pdev);
misc_deregister(&rm->miscdev);
+ irq_domain_remove(rm->irq_domain);
mbox_free_channel(gh_msgq_chan(&rm->msgq));
gh_msgq_remove(&rm->msgq);
kmem_cache_destroy(rm->cache);
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 378bec0f2ce1..3e706b59d2c0 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -27,6 +27,10 @@ struct gh_resource {
enum gh_resource_type type;
u64 capid;
unsigned int irq;
+
+ /* To help allocator in vm manager */
+ struct list_head list;
+ u32 rm_label;
};
/**
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index acf8c1545a6c..58693c27cf1a 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -145,6 +145,9 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
struct gh_rm_hyp_resources **resources);
int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
+struct gh_resource *gh_rm_alloc_resource(struct gh_rm *rm, struct gh_rm_hyp_resource *hyp_resource);
+void gh_rm_free_resource(struct gh_resource *ghrsc);
+
struct gh_rm_platform_ops {
int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
--
2.39.2
Document the ioctls and usage of Gunyah VM Manager driver.
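The alignment and overlap rules that the new document states for GH_VM_SET_USER_MEM_REGION can be modelled in a few lines. This is only a sketch under assumptions: struct mem_region is a hypothetical stand-in carrying the three fields the document names, the 4 KiB page size is assumed, and the kernel's actual checks may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ull /* assumption: 4 KiB pages */

/* Hypothetical mirror of the fields named in vm-manager.rst. */
struct mem_region {
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

/* All three values must be page-aligned per the documentation. */
static bool region_is_page_aligned(const struct mem_region *r)
{
	const uint64_t mask = SKETCH_PAGE_SIZE - 1;

	return !((r->guest_phys_addr | r->memory_size | r->userspace_addr) & mask);
}

/* Two regions overlap if their guest physical ranges intersect. */
static bool regions_overlap(const struct mem_region *a, const struct mem_region *b)
{
	return a->guest_phys_addr < b->guest_phys_addr + b->memory_size &&
	       b->guest_phys_addr < a->guest_phys_addr + a->memory_size;
}
```

A VMM could run checks like these before issuing the ioctl to fail early with a clearer error than -EINVAL.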
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/index.rst | 1 +
Documentation/virt/gunyah/vm-manager.rst | 91 ++++++++++++++++++++++++
2 files changed, 92 insertions(+)
create mode 100644 Documentation/virt/gunyah/vm-manager.rst
diff --git a/Documentation/virt/gunyah/index.rst b/Documentation/virt/gunyah/index.rst
index 74aa345e0a14..7058249825b1 100644
--- a/Documentation/virt/gunyah/index.rst
+++ b/Documentation/virt/gunyah/index.rst
@@ -7,6 +7,7 @@ Gunyah Hypervisor
.. toctree::
:maxdepth: 1
+ vm-manager
message-queue
Gunyah is a Type-1 hypervisor which is independent of any OS kernel, and runs in
diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
new file mode 100644
index 000000000000..1b4aa18670a3
--- /dev/null
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -0,0 +1,91 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+Virtual Machine Manager
+=======================
+
+The Gunyah Virtual Machine Manager is a Linux driver to support launching
+virtual machines using Gunyah. It presently supports launching non-proxy
+scheduled Linux-like virtual machines.
+
+Except for some basic information about the location of initial binaries,
+most of the configuration of a Gunyah virtual machine is described in the
+VM's devicetree. The devicetree is generated by userspace. Interacting with the
+virtual machine is still done via the kernel and VM configuration requires some
+of the corresponding functionality to be set up in the kernel. For instance,
+sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
+ioctl. The VM itself is configured to use the memory region via the
+devicetree.
+
+Sample Userspace VMM
+====================
+
+A sample userspace VMM is included in samples/gunyah/ along with a minimal
+devicetree that can be used to launch a VM. To build this sample, enable
+CONFIG_SAMPLE_GUNYAH.
+
+IOCTLs and userspace VMM flows
+==============================
+
+The kernel exposes a char device interface at /dev/gunyah.
+
+To create a VM, use the GH_CREATE_VM ioctl. A successful call will return a
+"Gunyah VM" file descriptor.
+
+/dev/gunyah API Descriptions
+----------------------------
+
+GH_CREATE_VM
+~~~~~~~~~~~~
+
+Creates a Gunyah VM. The argument is reserved for future use and must be 0.
+
+Gunyah VM API Descriptions
+--------------------------
+
+GH_VM_SET_USER_MEM_REGION
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This ioctl allows the user to create or delete a memory parcel for a guest
+virtual machine. Each memory region is uniquely identified by a label;
+attempting to create two regions with the same label is not allowed. Labels are
+unique per virtual machine.
+
+While the VMM is guest-agnostic and allows runtime addition of memory regions,
+Linux guest virtual machines do not support accepting memory regions at runtime.
+Thus, memory regions should be provided before starting the VM and the VM must
+be configured to accept these at boot-up.
+
+The guest physical address is used by the Linux kernel to check that the
+requested user regions do not overlap and to help find the corresponding memory
+region for calls like GH_VM_SET_DTB_CONFIG. It must be page aligned.
+
+memory_size and userspace_addr must be page-aligned.
+
+The flags field of gh_userspace_memory_region accepts the following bits. All
+other bits must be 0 and are reserved for future use. The ioctl will return
+-EINVAL if an unsupported bit is detected.
+
+ - GH_MEM_ALLOW_READ/GH_MEM_ALLOW_WRITE/GH_MEM_ALLOW_EXEC sets read/write/exec
+ permissions for the guest, respectively.
+
+To add a memory region, call GH_VM_SET_USER_MEM_REGION with fields set as
+described above.
+
+.. kernel-doc:: include/uapi/linux/gunyah.h
+ :identifiers: gh_userspace_memory_region
+
+GH_VM_SET_DTB_CONFIG
+~~~~~~~~~~~~~~~~~~~~
+
+This ioctl sets the location of the VM's devicetree blob and is used by Gunyah
+Resource Manager to allocate resources. The guest physical memory should be part
+of the primary memory parcel provided to the VM prior to GH_VM_START.
+
+.. kernel-doc:: include/uapi/linux/gunyah.h
+ :identifiers: gh_vm_dtb_config
+
+GH_VM_START
+~~~~~~~~~~~
+
+This ioctl starts the VM.
--
2.39.2
Allow userspace to attach an ioeventfd to an mmio address within the guest.
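The argument checks performed when binding an ioeventfd can be read as a small pure function. The sketch below restates them in standalone userspace C so a VMM author can see which combinations are rejected. struct ioeventfd_arg mirrors struct gh_fn_ioeventfd_arg from this patch, and ioeventfd_arg_check() is a hypothetical helper name, not part of the driver.

```c
#include <errno.h>
#include <stdint.h>

#define GH_IOEVENTFD_DATAMATCH (1UL << 0)

/* Userspace mirror of struct gh_fn_ioeventfd_arg (fixed-width types). */
struct ioeventfd_arg {
	uint64_t datamatch;
	uint64_t addr;
	uint32_t len;
	int32_t fd;
	uint32_t flags;
	uint32_t padding;
};

/* Hypothetical helper: returns 0 if the arguments would pass validation. */
static int ioeventfd_arg_check(const struct ioeventfd_arg *a)
{
	/* Length must be a natural access size, or 0 to match any length. */
	switch (a->len) {
	case 0: case 1: case 2: case 4: case 8:
		break;
	default:
		return -EINVAL;
	}

	/* Reject ranges that wrap around the 64-bit address space. */
	if (a->addr + a->len < a->addr)
		return -EINVAL;

	/* Data matching needs a concrete access length. */
	if (!a->len && (a->flags & GH_IOEVENTFD_DATAMATCH))
		return -EINVAL;

	/* All other flag bits are reserved. */
	if (a->flags & ~GH_IOEVENTFD_DATAMATCH)
		return -EINVAL;

	return 0;
}
```

The fd itself is not checked here; in the driver that step is done by looking up the eventfd context, which has no userspace equivalent in this sketch.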
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 2 +-
drivers/virt/gunyah/Kconfig | 9 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_ioeventfd.c | 117 +++++++++++++++++++++++
include/uapi/linux/gunyah.h | 37 +++++++
5 files changed, 165 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index a1dd70f0cbf6..cd41a705849f 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -124,7 +124,7 @@ the VM starts.
The possible types are documented below:
.. kernel-doc:: include/uapi/linux/gunyah.h
- :identifiers: GH_FN_VCPU gh_fn_vcpu_arg GH_FN_IRQFD gh_fn_irqfd_arg
+ :identifiers: GH_FN_VCPU gh_fn_vcpu_arg GH_FN_IRQFD gh_fn_irqfd_arg GH_FN_IOEVENTFD gh_fn_ioeventfd_arg
Gunyah VCPU API Descriptions
----------------------------
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 2cde24d429d1..bd8e31184962 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -35,3 +35,12 @@ config GUNYAH_IRQFD
on Gunyah virtual machine.
Say Y/M here if unsure and you want to support Gunyah VMMs.
+
+config GUNYAH_IOEVENTFD
+ tristate "Gunyah ioeventfd interface"
+ depends on GUNYAH
+ help
+ Enable kernel support for creating ioeventfds which can alert userspace
+ when a Gunyah virtual machine writes to a memory address.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 6cf756bfa3c2..7347b1470491 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
+obj-$(CONFIG_GUNYAH_IOEVENTFD) += gunyah_ioeventfd.o
diff --git a/drivers/virt/gunyah/gunyah_ioeventfd.c b/drivers/virt/gunyah/gunyah_ioeventfd.c
new file mode 100644
index 000000000000..517f55706ed9
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_ioeventfd.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_vm_mgr.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gh_ioeventfd {
+ struct gh_vm_function_instance *f;
+ struct gh_vm_io_handler io_handler;
+
+ struct eventfd_ctx *ctx;
+};
+
+static int gh_write_ioeventfd(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data)
+{
+ struct gh_ioeventfd *iofd = container_of(io_dev, struct gh_ioeventfd, io_handler);
+
+ eventfd_signal(iofd->ctx, 1);
+ return 0;
+}
+
+static struct gh_vm_io_handler_ops io_ops = {
+ .write = gh_write_ioeventfd,
+};
+
+static long gh_ioeventfd_bind(struct gh_vm_function_instance *f)
+{
+ const struct gh_fn_ioeventfd_arg *args = f->argp;
+ struct eventfd_ctx *ctx = NULL;
+ struct gh_ioeventfd *iofd;
+ int ret;
+
+ if (f->arg_size != sizeof(*args))
+ return -EINVAL;
+
+ /* must be natural-word sized, or 0 to ignore length */
+ switch (args->len) {
+ case 0:
+ case 1:
+ case 2:
+ case 4:
+ case 8:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* check for range overflow */
+ if (args->addr + args->len < args->addr)
+ return -EINVAL;
+
+ /* ioeventfd with no length can't be combined with DATAMATCH */
+ if (!args->len && (args->flags & GH_IOEVENTFD_DATAMATCH))
+ return -EINVAL;
+
+ /* All other flag bits are reserved for future use */
+ if (args->flags & ~GH_IOEVENTFD_DATAMATCH)
+ return -EINVAL;
+
+ ctx = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ iofd = kzalloc(sizeof(*iofd), GFP_KERNEL);
+ if (!iofd) {
+ ret = -ENOMEM;
+ goto err_eventfd;
+ }
+
+ f->data = iofd;
+ iofd->f = f;
+
+ iofd->ctx = ctx;
+
+ if (args->flags & GH_IOEVENTFD_DATAMATCH) {
+ iofd->io_handler.datamatch = true;
+ iofd->io_handler.len = args->len;
+ iofd->io_handler.data = args->datamatch;
+ }
+ iofd->io_handler.addr = args->addr;
+ iofd->io_handler.ops = &io_ops;
+
+ ret = gh_vm_add_io_handler(f->ghvm, &iofd->io_handler);
+ if (ret)
+ goto err_io_dev_add;
+
+ return 0;
+
+err_io_dev_add:
+ kfree(iofd);
+err_eventfd:
+ eventfd_ctx_put(ctx);
+ return ret;
+}
+
+static void gh_ioevent_unbind(struct gh_vm_function_instance *f)
+{
+ struct gh_ioeventfd *iofd = f->data;
+
+ gh_vm_remove_io_handler(iofd->f->ghvm, &iofd->io_handler);
+ eventfd_ctx_put(iofd->ctx);
+ kfree(iofd);
+}
+
+DECLARE_GH_VM_FUNCTION_INIT(ioeventfd, GH_FN_IOEVENTFD,
+ gh_ioeventfd_bind, gh_ioevent_unbind);
+MODULE_DESCRIPTION("Gunyah ioeventfds");
+MODULE_LICENSE("GPL");
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 5617dadc1c7b..f8482ff4cc55 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -89,6 +89,23 @@ struct gh_vm_dtb_config {
*/
#define GH_FN_IRQFD 2
+/**
+ * GH_FN_IOEVENTFD - register ioeventfd to trigger when VM faults on parameter
+ *
+ * gh_fn_desc is filled with gh_fn_ioeventfd_arg
+ *
+ * Attaches an ioeventfd to a legal mmio address within the guest. A guest write
+ * to the registered address will signal the provided event instead of triggering
+ * an exit on the GH_VCPU_RUN ioctl.
+ *
+ * If GH_IOEVENTFD_DATAMATCH flag is set, the event will be signaled only if the
+ * written value to the registered address is equal to datamatch in
+ * struct gh_fn_ioeventfd_arg.
+ *
+ * Return: 0
+ */
+#define GH_FN_IOEVENTFD 3
+
#define GH_FN_MAX_ARG_SIZE 256
/**
@@ -118,6 +135,26 @@ struct gh_fn_irqfd_arg {
#define GH_IOEVENTFD_DATAMATCH (1UL << 0)
+/**
+ * struct gh_fn_ioeventfd_arg - Arguments to create an ioeventfd function
+ * @datamatch: data used when GH_IOEVENTFD_DATAMATCH is set
+ * @addr: Address in guest memory
+ * @len: Length of access
+ * @fd: When ioeventfd is matched, this eventfd is written
+ * @flags: If GH_IOEVENTFD_DATAMATCH flag is set, the event will be signaled
+ * only if the written value to the registered address is equal to
+ * @datamatch
+ * @padding: padding bytes
+ */
+struct gh_fn_ioeventfd_arg {
+ __u64 datamatch;
+ __u64 addr; /* legal mmio address */
+ __u32 len; /* 1, 2, 4, or 8 bytes; or 0 to ignore length */
+ __s32 fd;
+ __u32 flags;
+ __u32 padding;
+};
+
/**
* struct gh_fn_desc - Arguments to create a VM function
* @type: Type of the function. See GH_FN_* macro for supported types
--
2.39.2
Add myself and Prakruthi as maintainers of Gunyah hypervisor drivers.
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
MAINTAINERS | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index b0db911207ba..26ba59610276 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8916,6 +8916,19 @@ L: [email protected]
S: Maintained
F: block/partitions/efi.*
+GUNYAH HYPERVISOR DRIVER
+M: Elliot Berman <[email protected]>
+M: Prakruthi Deepak Heragu <[email protected]>
+L: [email protected]
+S: Supported
+F: Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
+F: Documentation/virt/gunyah/
+F: arch/arm64/gunyah/
+F: drivers/mailbox/gunyah-msgq.c
+F: drivers/virt/gunyah/
+F: include/linux/gunyah*.h
+F: samples/gunyah/
+
HABANALABS PCI DRIVER
M: Oded Gabbay <[email protected]>
L: [email protected]
--
2.39.2
On 04/03/2023 01:06, Elliot Berman wrote:
> Add hypercalls to identify when Linux is running as a virtual machine
> under Gunyah.
>
> There are two calls to help identify Gunyah:
>
> 1. arch_is_gh_guest() checks the hypervisor UID and returns true when
> running under a known Gunyah hypervisor.
> 2. gh_hypercall_hyp_identify() returns build information and a set of
> feature flags that are supported by Gunyah.
>
> Signed-off-by: Elliot Berman <[email protected]>
Reviewed-by: Srinivas Kandagatla <[email protected]>
> ---
> arch/arm64/Kbuild | 1 +
> arch/arm64/gunyah/Makefile | 3 ++
> arch/arm64/gunyah/gunyah_hypercall.c | 64 ++++++++++++++++++++++++++++
> drivers/virt/Kconfig | 2 +
> drivers/virt/gunyah/Kconfig | 13 ++++++
> include/linux/gunyah.h | 28 ++++++++++++
> 6 files changed, 111 insertions(+)
> create mode 100644 arch/arm64/gunyah/Makefile
> create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
> create mode 100644 drivers/virt/gunyah/Kconfig
>
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..e4847ba0e3c9 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
> obj-$(CONFIG_KVM) += kvm/
> obj-$(CONFIG_XEN) += xen/
> obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> +obj-$(CONFIG_GUNYAH) += gunyah/
> obj-$(CONFIG_CRYPTO) += crypto/
>
> # for cleaning
> diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
> new file mode 100644
> index 000000000000..84f1e38cafb1
> --- /dev/null
> +++ b/arch/arm64/gunyah/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> new file mode 100644
> index 000000000000..0d14e767e2c8
> --- /dev/null
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -0,0 +1,64 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/arm-smccc.h>
> +#include <linux/module.h>
> +#include <linux/gunyah.h>
> +#include <linux/uuid.h>
> +
> +static const uuid_t gh_known_uuids[] = {
> + /* Qualcomm's version of Gunyah {19bd54bd-0b37-571b-946f-609b54539de6} */
> + UUID_INIT(0x19bd54bd, 0x0b37, 0x571b, 0x94, 0x6f, 0x60, 0x9b, 0x54, 0x53, 0x9d, 0xe6),
> + /* Standard version of Gunyah {c1d58fcd-a453-5fdb-9265-ce36673d5f14} */
> + UUID_INIT(0xc1d58fcd, 0xa453, 0x5fdb, 0x92, 0x65, 0xce, 0x36, 0x67, 0x3d, 0x5f, 0x14),
> +};
> +
> +bool arch_is_gh_guest(void)
> +{
> + struct arm_smccc_res res;
> + uuid_t uuid;
> + int i;
> +
> + arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
> +
> + ((u32 *)&uuid.b[0])[0] = lower_32_bits(res.a0);
> + ((u32 *)&uuid.b[0])[1] = lower_32_bits(res.a1);
> + ((u32 *)&uuid.b[0])[2] = lower_32_bits(res.a2);
> + ((u32 *)&uuid.b[0])[3] = lower_32_bits(res.a3);
> +
> + for (i = 0; i < ARRAY_SIZE(gh_known_uuids); i++)
> + if (uuid_equal(&uuid, &gh_known_uuids[i]))
> + return true;
> +
> + return false;
> +}
> +EXPORT_SYMBOL_GPL(arch_is_gh_guest);
> +
> +#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
> + ARM_SMCCC_OWNER_VENDOR_HYP, \
> + fn)
> +
> +#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +
> +/**
> + * gh_hypercall_hyp_identify() - Returns build information and feature flags
> + * supported by Gunyah.
> + * @hyp_identity: filled by the hypercall with the API info and feature flags.
> + */
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
> +
> + hyp_identity->api_info = res.a0;
> + hyp_identity->flags[0] = res.a1;
> + hyp_identity->flags[1] = res.a2;
> + hyp_identity->flags[2] = res.a3;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..85bd6626ffc9 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>
> source "drivers/virt/coco/tdx-guest/Kconfig"
>
> +source "drivers/virt/gunyah/Kconfig"
> +
> endif
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> new file mode 100644
> index 000000000000..1a737694c333
> --- /dev/null
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -0,0 +1,13 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config GUNYAH
> + tristate "Gunyah Virtualization drivers"
> + depends on ARM64
> + depends on MAILBOX
> + help
> + The Gunyah drivers are the helper interfaces that run in a guest VM
> + such as basic inter-VM IPC and signaling mechanisms, and higher level
> + services such as memory/device sharing, IRQ sharing, and so on.
> +
> + Say Y/M here to enable the drivers needed to interact in a Gunyah
> + virtual environment.
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 54b4be71caf7..bd080e3a6fc9 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -6,8 +6,10 @@
> #ifndef _LINUX_GUNYAH_H
> #define _LINUX_GUNYAH_H
>
> +#include <linux/bitfield.h>
> #include <linux/errno.h>
> #include <linux/limits.h>
> +#include <linux/types.h>
>
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> @@ -80,4 +82,30 @@ static inline int gh_remap_error(enum gh_error gh_error)
> }
> }
>
> +enum gh_api_feature {
> + GH_FEATURE_DOORBELL = 1,
> + GH_FEATURE_MSGQUEUE = 2,
> + GH_FEATURE_VCPU = 5,
> + GH_FEATURE_MEMEXTENT = 6,
> +};
> +
> +bool arch_is_gh_guest(void);
> +
> +u16 gh_api_version(void);
> +bool gh_api_has_feature(enum gh_api_feature feature);
> +
> +#define GH_API_V1 1
> +
> +#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
> +#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
> +#define GH_API_INFO_IS_64BIT BIT_ULL(15)
> +#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
> +
> +struct gh_hypercall_hyp_identify_resp {
> + u64 api_info;
> + u64 flags[3];
> +};
> +
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
> +
> #endif
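For readers following the GH_API_INFO_* masks quoted above, here is a minimal sketch of how the api_info word returned by the identify hypercall could be split into its fields. It is plain userspace C with the bit positions copied from the quoted header; struct gh_api_info and gh_decode_api_info() are hypothetical names and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the api_info register. */
struct gh_api_info {
	uint16_t version;  /* bits 13:0 */
	bool big_endian;   /* bit 14 */
	bool is_64bit;     /* bit 15 */
	uint8_t variant;   /* bits 63:56 */
};

/* Split api_info using the bit layout of the GH_API_INFO_* masks. */
static struct gh_api_info gh_decode_api_info(uint64_t api_info)
{
	struct gh_api_info info = {
		.version = (uint16_t)(api_info & 0x3fff),
		.big_endian = (api_info >> 14) & 1,
		.is_64bit = (api_info >> 15) & 1,
		.variant = (uint8_t)(api_info >> 56),
	};

	return info;
}
```

In the kernel the same extraction would normally go through FIELD_GET() with the masks from the header, which is presumably why linux/bitfield.h is included.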
On 04/03/2023 01:06, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
>
> Signed-off-by: Elliot Berman <[email protected]>
Reviewed-by: Srinivas Kandagatla <[email protected]>
> ---
> Documentation/virt/gunyah/message-queue.rst | 8 +
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/gunyah-msgq.c | 209 ++++++++++++++++++++
> include/linux/gunyah.h | 57 ++++++
> 4 files changed, 276 insertions(+)
> create mode 100644 drivers/mailbox/gunyah-msgq.c
>
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index b352918ae54b..70d82a4ef32d 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -61,3 +61,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
> | | | | | |
> | | | | | |
> +---------------+ +-----------------+ +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init()`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> + :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> +obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
> +
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
>
> obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..1989298653f9
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c
> @@ -0,0 +1,209 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>
> +
> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
> +
> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> + struct gh_msgq_rx_data rx_data;
> + enum gh_error gh_error;
> + bool ready = true;
> +
> + while (ready) {
> + gh_error = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
> + &rx_data.data, sizeof(rx_data.data),
> + &rx_data.length, &ready);
> + if (gh_error != GH_ERROR_OK) {
> + if (gh_error != GH_ERROR_MSGQUEUE_EMPTY)
> + dev_warn(msgq->mbox.dev, "Failed to receive data: %d\n", gh_error);
> + break;
> + }
> + mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
> + }
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
> +{
> + struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
> + struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
> + struct gh_msgq_tx_data *msgq_data = data;
> + u64 tx_flags = 0;
> + enum gh_error gh_error;
> + bool ready;
> +
> + if (msgq_data->push)
> + tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
> +
> + gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length, msgq_data->data,
> + tx_flags, &ready);
> +
> + /*
> + * unlikely because Linux tracks state of msgq and should not try to
> + * send message when msgq is full.
> + */
> + if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
> + return -EAGAIN;
> +
> + /*
> + * Propagate all other errors to client. If we return error to mailbox
> + * framework, then no other messages can be sent and nobody will know
> + * to retry this message.
> + */
> + msgq->last_ret = gh_remap_error(gh_error);
> +
> + /*
> + * This message was successfully sent, but message queue isn't ready to
> + * accept more messages because it's now full. Mailbox framework
> + * requires that we only report that message was transmitted when
> + * we're ready to transmit another message. We'll get that in the form
> + * of tx IRQ once the other side starts to drain the msgq.
> + */
> + if (gh_error == GH_ERROR_OK) {
> + if (!ready)
> + return 0;
> + } else
> + dev_err(msgq->mbox.dev, "Failed to send data: %d (%d)\n", gh_error, msgq->last_ret);
> +
> + /*
> + * We can send more messages. Mailbox framework requires that tx done
> + * happens asynchronously to sending the message. Gunyah message queues
> + * tell us right away on the hypercall return whether we can send more
> + * messages. To work around this, defer the txdone to a tasklet.
> + */
> + tasklet_schedule(&msgq->txdone_tasklet);
> +
> + return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> + .send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc must be not NULL. Most message queue use cases come with
> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client must register an .rx_callback() and the message queue driver will
> + * deliver all available messages upon receiving the RX ready interrupt. The messages should be
> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Return: 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc)
> +{
> + int ret;
> +
> + /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> + if ((!tx_ghrsc && !rx_ghrsc) ||
> + (tx_ghrsc && tx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_TX) ||
> + (rx_ghrsc && rx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_RX))
> + return -EINVAL;
> +
> + if (!gh_api_has_feature(GH_FEATURE_MSGQUEUE))
> + return -EOPNOTSUPP;
> +
> + msgq->tx_ghrsc = tx_ghrsc;
> + msgq->rx_ghrsc = rx_ghrsc;
> +
> + msgq->mbox.dev = parent;
> + msgq->mbox.ops = &gh_msgq_ops;
> + msgq->mbox.num_chans = 1;
> + msgq->mbox.txdone_irq = true;
> + msgq->mbox.chans = &msgq->mbox_chan;
> +
> + if (msgq->tx_ghrsc) {
> + ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> + msgq);
> + if (ret)
> + goto err_chans;
> + }
> +
> + if (msgq->rx_ghrsc) {
> + ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> + IRQF_ONESHOT, "gh_msgq_rx", msgq);
> + if (ret)
> + goto err_tx_irq;
> + }
> +
> + tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> + ret = mbox_controller_register(&msgq->mbox);
> + if (ret)
> + goto err_rx_irq;
> +
> + ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> + if (ret)
> + goto err_mbox;
> +
> + return 0;
> +err_mbox:
> + mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{
> + tasklet_kill(&msgq->txdone_tasklet);
> + mbox_controller_unregister(&msgq->mbox);
> +
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 18cfbf5ee48b..378bec0f2ce1 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -8,11 +8,68 @@
>
> #include <linux/bitfield.h>
> #include <linux/errno.h>
> +#include <linux/interrupt.h>
> #include <linux/limits.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox_client.h>
> #include <linux/types.h>
>
> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
> +enum gh_resource_type {
> + GH_RESOURCE_TYPE_BELL_TX = 0,
> + GH_RESOURCE_TYPE_BELL_RX = 1,
> + GH_RESOURCE_TYPE_MSGQ_TX = 2,
> + GH_RESOURCE_TYPE_MSGQ_RX = 3,
> + GH_RESOURCE_TYPE_VCPU = 4,
> +};
> +
> +struct gh_resource {
> + enum gh_resource_type type;
> + u64 capid;
> + unsigned int irq;
> +};
> +
> +/**
> + * Gunyah Message Queues
> + */
> +
> +#define GH_MSGQ_MAX_MSG_SIZE 240
> +
> +struct gh_msgq_tx_data {
> + size_t length;
> + bool push;
> + char data[];
> +};
> +
> +struct gh_msgq_rx_data {
> + size_t length;
> + char data[GH_MSGQ_MAX_MSG_SIZE];
> +};
> +
> +struct gh_msgq {
> + struct gh_resource *tx_ghrsc;
> + struct gh_resource *rx_ghrsc;
> +
> + /* msgq private */
> + int last_ret; /* Linux error, not GH_STATUS_* */
> + struct mbox_chan mbox_chan;
> + struct mbox_controller mbox;
> + struct tasklet_struct txdone_tasklet;
> +};
> +
> +
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc);
> +void gh_msgq_remove(struct gh_msgq *msgq);
> +
> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
> +{
> + return &msgq->mbox.chans[0];
> +}
> +
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> +
> #define GH_CAPID_INVAL U64_MAX
> #define GH_VMID_ROOT_VM 0xff
>
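The kernel-doc for gh_msgq_init() says the client has to consume or copy each message inside its .rx_callback(), because the gh_msgq_rx_data buffer is reused afterwards. A rough userspace sketch of that copy-out pattern is below. struct demo_client and demo_rx_callback() are hypothetical names that are not in the patch, and the real callback would take a struct mbox_client and a void pointer instead.

```c
#include <stddef.h>
#include <string.h>

#define GH_MSGQ_MAX_MSG_SIZE 240

/* Userspace mirror of struct gh_msgq_rx_data from the patch above. */
struct gh_msgq_rx_data {
	size_t length;
	char data[GH_MSGQ_MAX_MSG_SIZE];
};

/* Hypothetical client-owned storage; not part of the patch. */
struct demo_client {
	size_t length;
	char buf[GH_MSGQ_MAX_MSG_SIZE];
};

/*
 * Hypothetical stand-in for an mbox_client .rx_callback(): copy the payload
 * out before returning, because the driver reuses the rx buffer afterwards.
 * Returns 0 on success, -1 if the reported length is out of range.
 */
static int demo_rx_callback(struct demo_client *cl, const struct gh_msgq_rx_data *rx)
{
	if (rx->length > GH_MSGQ_MAX_MSG_SIZE)
		return -1;

	memcpy(cl->buf, rx->data, rx->length);
	cl->length = rx->length;
	return 0;
}
```

The point is only that nothing keeps a pointer into rx after the callback returns; whether a real client copies into a preallocated buffer or a freshly allocated one is up to the client.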
On 04/03/2023 01:06, Elliot Berman wrote:
> Add architecture-independent standard error codes, types, and macros for
> Gunyah hypercalls.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
Reviewed-by: Srinivas Kandagatla <[email protected]>
> include/linux/gunyah.h | 83 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 83 insertions(+)
> create mode 100644 include/linux/gunyah.h
>
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> new file mode 100644
> index 000000000000..54b4be71caf7
> --- /dev/null
> +++ b/include/linux/gunyah.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_GUNYAH_H
> +#define _LINUX_GUNYAH_H
> +
> +#include <linux/errno.h>
> +#include <linux/limits.h>
> +
> +/******************************************************************************/
> +/* Common arch-independent definitions for Gunyah hypercalls */
> +#define GH_CAPID_INVAL U64_MAX
> +#define GH_VMID_ROOT_VM 0xff
> +
> +enum gh_error {
> + GH_ERROR_OK = 0,
> + GH_ERROR_UNIMPLEMENTED = -1,
> + GH_ERROR_RETRY = -2,
> +
> + GH_ERROR_ARG_INVAL = 1,
> + GH_ERROR_ARG_SIZE = 2,
> + GH_ERROR_ARG_ALIGN = 3,
> +
> + GH_ERROR_NOMEM = 10,
> +
> + GH_ERROR_ADDR_OVFL = 20,
> + GH_ERROR_ADDR_UNFL = 21,
> + GH_ERROR_ADDR_INVAL = 22,
> +
> + GH_ERROR_DENIED = 30,
> + GH_ERROR_BUSY = 31,
> + GH_ERROR_IDLE = 32,
> +
> + GH_ERROR_IRQ_BOUND = 40,
> + GH_ERROR_IRQ_UNBOUND = 41,
> +
> + GH_ERROR_CSPACE_CAP_NULL = 50,
> + GH_ERROR_CSPACE_CAP_REVOKED = 51,
> + GH_ERROR_CSPACE_WRONG_OBJ_TYPE = 52,
> + GH_ERROR_CSPACE_INSUF_RIGHTS = 53,
> + GH_ERROR_CSPACE_FULL = 54,
> +
> + GH_ERROR_MSGQUEUE_EMPTY = 60,
> + GH_ERROR_MSGQUEUE_FULL = 61,
> +};
> +
> +/**
> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux error code
> + * @gh_error: Gunyah hypercall return value
> + */
> +static inline int gh_remap_error(enum gh_error gh_error)
> +{
> + switch (gh_error) {
> + case GH_ERROR_OK:
> + return 0;
> + case GH_ERROR_NOMEM:
> + return -ENOMEM;
> + case GH_ERROR_DENIED:
> + case GH_ERROR_CSPACE_CAP_NULL:
> + case GH_ERROR_CSPACE_CAP_REVOKED:
> + case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
> + case GH_ERROR_CSPACE_INSUF_RIGHTS:
> + case GH_ERROR_CSPACE_FULL:
> + return -EACCES;
> + case GH_ERROR_BUSY:
> + case GH_ERROR_IDLE:
> + return -EBUSY;
> + case GH_ERROR_IRQ_BOUND:
> + case GH_ERROR_IRQ_UNBOUND:
> + case GH_ERROR_MSGQUEUE_FULL:
> + case GH_ERROR_MSGQUEUE_EMPTY:
> + return -EIO;
> + case GH_ERROR_UNIMPLEMENTED:
> + case GH_ERROR_RETRY:
> + return -EOPNOTSUPP;
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +#endif
On 04/03/2023 01:06, Elliot Berman wrote:
> Gunyah VM manager is a kernel module which exposes an interface to
> Gunyah userspace to load, run, and interact with other Gunyah virtual
> machines. The interface is a character device at /dev/gunyah.
>
> Add a basic VM manager driver. Upcoming patches will add more ioctls
> into this driver.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
Reviewed-by: Srinivas Kandagatla <[email protected]>
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr.c | 38 +++++-
> drivers/virt/gunyah/vm_mgr.c | 116 ++++++++++++++++++
> drivers/virt/gunyah/vm_mgr.h | 23 ++++
> include/uapi/linux/gunyah.h | 23 ++++
> 6 files changed, 201 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr.c
> create mode 100644 drivers/virt/gunyah/vm_mgr.h
> create mode 100644 include/uapi/linux/gunyah.h
>
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 0a1882e296ae..2513324ae7be 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -137,6 +137,7 @@ Code Seq# Include File Comments
> 'F' DD video/sstfb.h conflict!
> 'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict!
> 'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict!
> +'G' 00-0f linux/gunyah.h conflict!
> 'H' 00-7F linux/hiddev.h conflict!
> 'H' 00-0F linux/hidraw.h conflict!
> 'H' 01 linux/mei.h conflict!
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index de29769f2f3f..03951cf82023 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 67813c9a52db..d7ce692d0067 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -15,8 +15,10 @@
> #include <linux/completion.h>
> #include <linux/gunyah_rsc_mgr.h>
> #include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
>
> #include "rsc_mgr.h"
> +#include "vm_mgr.h"
>
> #define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
> #define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
> @@ -129,6 +131,7 @@ struct gh_rm_connection {
> * @cache: cache for allocating Tx messages
> * @send_lock: synchronization to allow only one request to be sent at a time
> * @nh: notifier chain for clients interested in RM notification messages
> + * @miscdev: /dev/gunyah
> */
> struct gh_rm {
> struct device *dev;
> @@ -145,6 +148,8 @@ struct gh_rm {
> struct kmem_cache *cache;
> struct mutex send_lock;
> struct blocking_notifier_head nh;
> +
> + struct miscdevice miscdev;
> };
>
> /**
> @@ -593,6 +598,21 @@ void gh_rm_put(struct gh_rm *rm)
> }
> EXPORT_SYMBOL_GPL(gh_rm_put);
>
> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct miscdevice *miscdev = filp->private_data;
> + struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
> +
> + return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
> +}
> +
> +static const struct file_operations gh_dev_fops = {
> + .owner = THIS_MODULE,
> + .unlocked_ioctl = gh_dev_ioctl,
> + .compat_ioctl = compat_ptr_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool tx,
> struct gh_resource *ghrsc)
> {
> @@ -651,7 +671,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
> rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
> rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>
> - return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> + ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + rm->miscdev.name = "gunyah";
> + rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> + rm->miscdev.fops = &gh_dev_fops;
> +
> + ret = misc_register(&rm->miscdev);
> + if (ret)
> + goto err_msgq;
> +
> + return 0;
> +err_msgq:
> + mbox_free_channel(gh_msgq_chan(&rm->msgq));
> + gh_msgq_remove(&rm->msgq);
> err_cache:
> kmem_cache_destroy(rm->cache);
> return ret;
> @@ -661,6 +696,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
> {
> struct gh_rm *rm = platform_get_drvdata(pdev);
>
> + misc_deregister(&rm->miscdev);
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> kmem_cache_destroy(rm->cache);
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> new file mode 100644
> index 000000000000..dbacf36af72d
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -0,0 +1,116 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static void gh_vm_free(struct work_struct *work)
> +{
> + struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + int ret;
> +
> + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> + if (ret)
> + pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> + gh_rm_put(ghvm->rm);
> + kfree(ghvm);
> +}
> +
> +static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> +{
> + struct gh_vm *ghvm;
> + int vmid;
> +
> + vmid = gh_rm_alloc_vmid(rm, 0);
> + if (vmid < 0)
> + return ERR_PTR(vmid);
> +
> + ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
> + if (!ghvm) {
> + gh_rm_dealloc_vmid(rm, vmid);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + ghvm->parent = gh_rm_get(rm);
> + ghvm->vmid = vmid;
> + ghvm->rm = rm;
> +
> + INIT_WORK(&ghvm->free_work, gh_vm_free);
> +
> + return ghvm;
> +}
> +
> +static int gh_vm_release(struct inode *inode, struct file *filp)
> +{
> + struct gh_vm *ghvm = filp->private_data;
> +
> + /* The VM will be reset via RM calls, which can sleep interruptibly.
> + * Defer to a work item so this thread can still receive signals.
> + */
> + schedule_work(&ghvm->free_work);
> + return 0;
> +}
> +
> +static const struct file_operations gh_vm_fops = {
> + .release = gh_vm_release,
> + .llseek = noop_llseek,
> +};
> +
> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
> +{
> + struct gh_vm *ghvm;
> + struct file *file;
> + int fd, err;
> +
> + /* arg reserved for future use. */
> + if (arg)
> + return -EINVAL;
> +
> + ghvm = gh_vm_alloc(rm);
> + if (IS_ERR(ghvm))
> + return PTR_ERR(ghvm);
> +
> + fd = get_unused_fd_flags(O_CLOEXEC);
> + if (fd < 0) {
> + err = fd;
> + goto err_destroy_vm;
> + }
> +
> + file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> + if (IS_ERR(file)) {
> + err = PTR_ERR(file);
> + goto err_put_fd;
> + }
> +
> + fd_install(fd, file);
> +
> + return fd;
> +
> +err_put_fd:
> + put_unused_fd(fd);
> +err_destroy_vm:
> + gh_vm_free(&ghvm->free_work);
> + return err;
> +}
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
> +{
> + switch (cmd) {
> + case GH_CREATE_VM:
> + return gh_dev_ioctl_create_vm(rm, arg);
> + default:
> + return -ENOIOCTLCMD;
> + }
> +}
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> new file mode 100644
> index 000000000000..4b22fbcac91c
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GH_PRIV_VM_MGR_H
> +#define _GH_PRIV_VM_MGR_H
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
> +
> +struct gh_vm {
> + u16 vmid;
> + struct gh_rm *rm;
> + struct device *parent;
> +
> + struct work_struct free_work;
> +};
> +
> +#endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> new file mode 100644
> index 000000000000..10ba32d2b0a6
> --- /dev/null
> +++ b/include/uapi/linux/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_GUNYAH
> +#define _UAPI_LINUX_GUNYAH
> +
> +/*
> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
> + */
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define GH_IOCTL_TYPE 'G'
> +
> +/*
> + * ioctls for /dev/gunyah fds:
> + */
> +#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
> +
> +#endif
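For reference, this is roughly how I would expect userspace to drive the new ioctl. It is only a sketch: demo_gh_create_vm() is a hypothetical helper, the two macros are copied from the uapi header above in case it is not installed, and closing the device fd right after the ioctl assumes the returned VM fd does not depend on it.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Copied from include/uapi/linux/gunyah.h in this patch. */
#define GH_IOCTL_TYPE	'G'
#define GH_CREATE_VM	_IO(GH_IOCTL_TYPE, 0x0)

/*
 * Hypothetical helper: open the Gunyah device node at @path and request a
 * new VM. Returns the VM fd on success or a negative errno value on failure.
 */
static int demo_gh_create_vm(const char *path)
{
	int dev_fd, vm_fd, err;

	dev_fd = open(path, O_RDWR);
	if (dev_fd < 0)
		return -errno;

	/* The argument is reserved for future use; the driver rejects non-zero. */
	vm_fd = ioctl(dev_fd, GH_CREATE_VM, 0);
	err = errno;

	/* Assumption: the VM fd is an independent anon inode, so this is safe. */
	close(dev_fd);

	return vm_fd < 0 ? -err : vm_fd;
}
```

A caller would pass "/dev/gunyah" and close() the returned fd when done, which ends up in gh_vm_release() on the kernel side.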
On 04/03/2023 01:06, Elliot Berman wrote:
> On Qualcomm platforms, there is a firmware entity which controls access
> to physical pages. In order to share memory with another VM, this entity
> needs to be informed that the guest VM should have access to the memory.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Kconfig | 4 ++
> drivers/virt/gunyah/Makefile | 1 +
> drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 3 +
> drivers/virt/gunyah/rsc_mgr_rpc.c | 18 ++++-
> include/linux/gunyah_rsc_mgr.h | 17 +++++
> 6 files changed, 121 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
>
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index 1a737694c333..de815189dab6 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -4,6 +4,7 @@ config GUNYAH
> tristate "Gunyah Virtualization drivers"
> depends on ARM64
> depends on MAILBOX
> + select GUNYAH_PLATFORM_HOOKS
> help
> The Gunyah drivers are the helper interfaces that run in a guest VM
> such as basic inter-VM IPC and signaling mechanisms, and higher level
> @@ -11,3 +12,6 @@ config GUNYAH
>
> Say Y/M here to enable the drivers needed to interact in a Gunyah
> virtual environment.
> +
> +config GUNYAH_PLATFORM_HOOKS
> + tristate
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index ff8bc4925392..6b8f84dbfe0d 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> +obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>
> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
> new file mode 100644
> index 000000000000..60da0e154e98
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
> @@ -0,0 +1,80 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/rwsem.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include "rsc_mgr.h"
> +
> +static struct gh_rm_platform_ops *rm_platform_ops;
> +static DECLARE_RWSEM(rm_platform_ops_lock);
> +
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
I think I have asked this question before, but I cannot find the answer
in the old replies.
Why are these platform hooks not part of core Gunyah? Do we need a
dedicated module for this?
From the look of the APIs, this is very close to rm, and I think
this functionality should live with rm.
--srini
> +{
> + int ret = 0;
> +
> + down_read(&rm_platform_ops_lock);
> + if (rm_platform_ops && rm_platform_ops->pre_mem_share)
> + ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
> + up_read(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
> +
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + int ret = 0;
> +
> + down_read(&rm_platform_ops_lock);
> + if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
> + ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
> + up_read(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
> +
> +int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
> +{
> + int ret = 0;
> +
> + down_write(&rm_platform_ops_lock);
> + if (!rm_platform_ops)
> + rm_platform_ops = platform_ops;
> + else
> + ret = -EEXIST;
> + up_write(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
> +
> +void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops)
> +{
> + down_write(&rm_platform_ops_lock);
> + if (rm_platform_ops == platform_ops)
> + rm_platform_ops = NULL;
> + up_write(&rm_platform_ops_lock);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
> +
> +static void _devm_gh_rm_unregister_platform_ops(void *data)
> +{
> + gh_rm_unregister_platform_ops(data);
> +}
> +
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gh_rm_platform_ops *ops)
> +{
> + int ret;
> +
> + ret = gh_rm_register_platform_ops(ops);
> + if (ret)
> + return ret;
> +
> + return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops, ops);
> +}
> +EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Platform Hooks");
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index 3665ebc7b020..6838e736f361 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -13,4 +13,7 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buf_size,
> void **resp_buf, size_t *resp_buf_size);
>
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +
> #endif
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index 3df15ad5b97d..733be4dc8dd2 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -204,6 +204,12 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
> if (!msg)
> return -ENOMEM;
>
> + ret = gh_rm_platform_pre_mem_share(rm, p);
> + if (ret) {
> + kfree(msg);
> + return ret;
> + }
> +
> req_header = msg;
> acl_section = (void *)req_header + sizeof(*req_header);
> mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
> @@ -227,8 +233,10 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
> ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
> kfree(msg);
>
> - if (ret)
> + if (ret) {
> + gh_rm_platform_post_mem_reclaim(rm, p);
> return ret;
> + }
>
> p->mem_handle = le32_to_cpu(*resp);
>
> @@ -283,8 +291,14 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> struct gh_rm_mem_release_req req = {
> .mem_handle = cpu_to_le32(parcel->mem_handle),
> };
> + int ret;
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
> + /* Do not call platform mem reclaim hooks: the reclaim didn't happen */
> + if (ret)
> + return ret;
>
> - return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
> + return gh_rm_platform_post_mem_reclaim(rm, parcel);
> }
>
> /**
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 8b0b46f28e39..515087931a2b 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -145,4 +145,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> struct gh_rm_hyp_resources **resources);
> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>
> +struct gh_rm_platform_ops {
> + int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> + int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +};
> +
> +#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
> +int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops);
> +void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops);
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gh_rm_platform_ops *ops);
> +#else
> +static inline int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
> + { return 0; }
> +static inline void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops) { }
> +static inline int devm_gh_rm_register_platform_ops(struct device *dev,
> + struct gh_rm_platform_ops *ops) { return 0; }
> +#endif
> +
> #endif
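To make sure I read the registration semantics correctly, here is a small userspace model of the single ops slot: a second registration fails with -EEXIST, unregistering only clears the slot when the pointer matches, and the hook wrapper returns 0 when nothing is registered. All demo_* names are hypothetical, and the rwsem from the patch is left out, so this is not thread-safe.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct gh_rm_mem_parcel. */
struct demo_parcel {
	int shared;
};

/* Hypothetical stand-in for struct gh_rm_platform_ops. */
struct demo_platform_ops {
	int (*pre_mem_share)(struct demo_parcel *p);
	int (*post_mem_reclaim)(struct demo_parcel *p);
};

/* Single registration slot. The patch guards it with an rwsem; omitted here. */
static struct demo_platform_ops *demo_ops;

static int demo_register(struct demo_platform_ops *ops)
{
	if (demo_ops)
		return -EEXIST;
	demo_ops = ops;
	return 0;
}

static void demo_unregister(struct demo_platform_ops *ops)
{
	/* Only the current owner may clear the slot. */
	if (demo_ops == ops)
		demo_ops = NULL;
}

static int demo_pre_mem_share(struct demo_parcel *p)
{
	/* No ops registered, or no hook provided: treat as success. */
	if (demo_ops && demo_ops->pre_mem_share)
		return demo_ops->pre_mem_share(p);
	return 0;
}

/* Example hook that just records that it ran. */
static int demo_mark_shared(struct demo_parcel *p)
{
	p->shared = 1;
	return 0;
}
```

If that matches the intent, then a platform without any registered ops silently shares memory without the firmware step, which is what the stubs for !CONFIG_GUNYAH_PLATFORM_HOOKS suggest as well.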
Hi Elliot,
On 04/03/2023 01:06, Elliot Berman wrote:
> Qualcomm platforms have a firmware entity which performs access control
> to physical pages. Dynamically started Gunyah virtual machines use the
> QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
> to the memory used by guest VMs. Gunyah doesn't do this operation for us
> since it is the current VM (typically VMID_HLOS) delegating the access
> and not Gunyah itself. Use the Gunyah platform ops to achieve this so
> that only Qualcomm platforms attempt to make the needed SCM calls.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/firmware/Kconfig | 2 +
> drivers/firmware/qcom_scm.c | 100 +++++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 2 +-
> 3 files changed, 103 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index b59e3041fd62..b888068ff6f2 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -214,6 +214,8 @@ config MTK_ADSP_IPC
>
> config QCOM_SCM
> tristate
> + select VIRT_DRIVERS
> + select GUNYAH_PLATFORM_HOOKS
>
I still have concerns about these selects in Kconfig on older Qualcomm
platforms that use SCM and do not have Gunyah.
In our last discussion you mentioned the requirement for
"CONFIG_GUNYAH=y and CONFIG_QCOM_SCM=m".
I think that should be doable, and the selects can be removed, if you make a
separate GUNYAH_QCOM_PLATFORM_HOOKS driver.
Does this work?
>----------------------->cut<-------------------------------
From 1fb7995aecf17caefd09ffb516579bc4ac9ac301 Mon Sep 17 00:00:00 2001
From: Srinivas Kandagatla <[email protected]>
Date: Tue, 21 Mar 2023 13:34:02 +0000
Subject: [PATCH] virt: gunyah: add qcom platform hooks
Signed-off-by: Srinivas Kandagatla <[email protected]>
---
drivers/firmware/Kconfig | 2 --
drivers/firmware/qcom_scm.c | 14 +++-----
drivers/virt/gunyah/Kconfig | 5 +++
drivers/virt/gunyah/Makefile | 1 +
.../virt/gunyah/gunyah_qcom_platform_hooks.c | 35 +++++++++++++++++++
include/linux/firmware/qcom/qcom_scm.h | 3 ++
6 files changed, 48 insertions(+), 12 deletions(-)
create mode 100644 drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index b888068ff6f2..b59e3041fd62 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -214,8 +214,6 @@ config MTK_ADSP_IPC
config QCOM_SCM
tristate
- select VIRT_DRIVERS
- select GUNYAH_PLATFORM_HOOKS
config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
bool "Qualcomm download mode enabled by default"
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 5273cf64ee2a..194ea2bc9a1d 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -1301,7 +1301,7 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
}
EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
-static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+int qcom_scm_gh_rm_pre_mem_share(struct gh_rm_mem_parcel *mem_parcel)
{
struct qcom_scm_vmperm *new_perms;
u64 src, src_cpy;
@@ -1359,8 +1359,9 @@ static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parce
kfree(new_perms);
return ret;
}
+EXPORT_SYMBOL_GPL(qcom_scm_gh_rm_pre_mem_share);
-static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm_mem_parcel *mem_parcel)
{
struct qcom_scm_vmperm new_perms;
u64 src = 0, src_cpy;
@@ -1388,11 +1389,7 @@ static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_pa
return ret;
}
-
-static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
- .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
- .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
-};
+EXPORT_SYMBOL_GPL(qcom_scm_gh_rm_post_mem_reclaim);
static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
{
@@ -1597,9 +1594,6 @@ static int qcom_scm_probe(struct platform_device *pdev)
if (download_mode)
qcom_scm_set_download_mode(true);
- if (devm_gh_rm_register_platform_ops(&pdev->dev, &qcom_scm_gh_rm_platform_ops))
- dev_warn(__scm->dev, "Gunyah RM platform ops were already registered\n");
-
return 0;
}
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index bd8e31184962..a9c48d6518f7 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -16,6 +16,11 @@ config GUNYAH
config GUNYAH_PLATFORM_HOOKS
tristate
+config GUNYAH_QCOM_PLATFORM_HOOKS
+ tristate "Gunyah Platform hooks for Qualcomm"
+ depends on ARCH_QCOM && QCOM_SCM
+ depends on GUNYAH
+
config GUNYAH_VCPU
tristate "Runnable Gunyah vCPUs"
depends on GUNYAH
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 7347b1470491..c33f701bb5c8 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,6 +2,7 @@
obj-$(CONFIG_GUNYAH) += gunyah.o
obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
+obj-$(CONFIG_GUNYAH_QCOM_PLATFORM_HOOKS) += gunyah_qcom_platform_hooks.o
gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c b/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
new file mode 100644
index 000000000000..3332f84134d3
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
@@ -0,0 +1,35 @@
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/firmware/qcom/qcom_scm.h>
+#include <linux/gunyah_rsc_mgr.h>
+
+static int qcom_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ return qcom_scm_gh_rm_pre_mem_share(mem_parcel);
+}
+
+static int qcom_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ return qcom_scm_gh_rm_post_mem_reclaim(mem_parcel);
+}
+
+static struct gh_rm_platform_ops qcom_gh_platform_hooks_ops = {
+ .pre_mem_share = qcom_gh_rm_pre_mem_share,
+ .post_mem_reclaim = qcom_gh_rm_post_mem_reclaim,
+};
+
+static int __init qcom_gh_platform_hooks_register(void)
+{
+ return gh_rm_register_platform_ops(&qcom_gh_platform_hooks_ops);
+}
+
+static void __exit qcom_gh_platform_hooks_unregister(void)
+{
+ gh_rm_unregister_platform_ops(&qcom_gh_platform_hooks_ops);
+}
+
+module_init(qcom_gh_platform_hooks_register);
+module_exit(qcom_gh_platform_hooks_unregister);
+
+MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Gunyah Platform Hooks driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/firmware/qcom/qcom_scm.h b/include/linux/firmware/qcom/qcom_scm.h
index 1e449a5d7f5c..9b0d33db803d 100644
--- a/include/linux/firmware/qcom/qcom_scm.h
+++ b/include/linux/firmware/qcom/qcom_scm.h
@@ -121,5 +121,8 @@ extern int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
u64 limit_node, u32 node_id, u64 version);
extern int qcom_scm_lmh_profile_change(u32 profile_id);
extern bool qcom_scm_lmh_dcvsh_available(void);
+struct gh_rm_mem_parcel;
+extern int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm_mem_parcel *mem_parcel);
+extern int qcom_scm_gh_rm_pre_mem_share(struct gh_rm_mem_parcel *mem_parcel);
#endif
--------------------------->cut<-----------------------
> config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
> bool "Qualcomm download mode enabled by default"
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index b95616b35bff..89a261a9e021 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -20,6 +20,7 @@
> #include <linux/clk.h>
> #include <linux/reset-controller.h>
> #include <linux/arm-smccc.h>
> +#include <linux/gunyah_rsc_mgr.h>
>
> #include "qcom_scm.h"
>
> @@ -30,6 +31,9 @@ module_param(download_mode, bool, 0);
> #define SCM_HAS_IFACE_CLK BIT(1)
> #define SCM_HAS_BUS_CLK BIT(2)
>
> +#define QCOM_SCM_RM_MANAGED_VMID 0x3A
> +#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
> +
> struct qcom_scm {
> struct device *dev;
> struct clk *core_clk;
> @@ -1299,6 +1303,99 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
> }
> EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
>
> +static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + struct qcom_scm_vmperm *new_perms;
> + u64 src, src_cpy;
> + int ret = 0, i, n;
> + u16 vmid;
> +
> + new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
> + if (!new_perms)
> + return -ENOMEM;
> +
> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> + new_perms[n].vmid = vmid;
> + else
> + new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
> + new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
> + new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
> + new_perms[n].perm |= QCOM_SCM_PERM_READ;
> + }
> +
> + src = (1ull << QCOM_SCM_VMID_HLOS);
> +
> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
> + src_cpy = src;
> + ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
> + le64_to_cpu(mem_parcel->mem_entries[i].size),
> + &src_cpy, new_perms, mem_parcel->n_acl_entries);
> + if (ret) {
> + src = 0;
> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> + src |= (1ull << vmid);
> + else
> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
> + }
> +
> + new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
> +
> + for (i--; i >= 0; i--) {
> + src_cpy = src;
> + WARN_ON_ONCE(qcom_scm_assign_mem(
> + le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
> + le64_to_cpu(mem_parcel->mem_entries[i].size),
> + &src_cpy, new_perms, 1));
> + }
> + break;
> + }
> + }
> +
> + kfree(new_perms);
> + return ret;
> +}
> +
> +static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + struct qcom_scm_vmperm new_perms;
> + u64 src = 0, src_cpy;
> + int ret = 0, i, n;
> + u16 vmid;
> +
> + new_perms.vmid = QCOM_SCM_VMID_HLOS;
> + new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE | QCOM_SCM_PERM_READ;
> +
> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> + src |= (1ull << vmid);
> + else
> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
> + }
> +
> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
> + src_cpy = src;
> + ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
> + le64_to_cpu(mem_parcel->mem_entries[i].size),
> + &src_cpy, &new_perms, 1);
> + WARN_ON_ONCE(ret);
> + }
> +
> + return ret;
> +}
> +
> +static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
> + .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
> + .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
> +};
> +
> static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
> {
> struct device_node *tcsr;
> @@ -1502,6 +1599,9 @@ static int qcom_scm_probe(struct platform_device *pdev)
> if (download_mode)
> qcom_scm_set_download_mode(true);
>
> + if (devm_gh_rm_register_platform_ops(&pdev->dev, &qcom_scm_gh_rm_platform_ops))
> + dev_warn(__scm->dev, "Gunyah RM platform ops were already registered\n");
> +
> return 0;
> }
>
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 515087931a2b..acf8c1545a6c 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -145,7 +145,7 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> struct gh_rm_hyp_resources **resources);
> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>
> -struct gunyah_rm_platform_ops {
> +struct gh_rm_platform_ops {
> int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> };
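For readers following the quoted patch, the VMID clamping and permission translation it performs can be sketched in plain C as below. The two VMID constants are copied from the patch. The `GH_RM_ACL_*` and `QCOM_SCM_PERM_*` bit values are not shown in this thread and are assumed here, and the `sketch_*` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the quoted patch. */
#define QCOM_SCM_RM_MANAGED_VMID  0x3A
#define QCOM_SCM_MAX_MANAGED_VMID 0x3F

/* Assumed bit values; these definitions do not appear in this thread. */
#define GH_RM_ACL_X (1u << 0)
#define GH_RM_ACL_W (1u << 1)
#define GH_RM_ACL_R (1u << 2)
#define QCOM_SCM_PERM_EXEC  0x1u
#define QCOM_SCM_PERM_WRITE 0x2u
#define QCOM_SCM_PERM_READ  0x4u

struct sketch_acl_entry {
	uint16_t vmid;  /* already converted to CPU byte order */
	uint8_t perms;  /* GH_RM_ACL_* bits */
};

/* VMIDs above the managed range collapse onto the RM-managed VMID. */
uint16_t sketch_scm_vmid(uint16_t vmid)
{
	return vmid <= QCOM_SCM_MAX_MANAGED_VMID ? vmid : QCOM_SCM_RM_MANAGED_VMID;
}

/* Translate Gunyah ACL bits into SCM permission bits, one by one. */
uint32_t sketch_scm_perm(uint8_t acl_perms)
{
	uint32_t perm = 0;

	if (acl_perms & GH_RM_ACL_X)
		perm |= QCOM_SCM_PERM_EXEC;
	if (acl_perms & GH_RM_ACL_W)
		perm |= QCOM_SCM_PERM_WRITE;
	if (acl_perms & GH_RM_ACL_R)
		perm |= QCOM_SCM_PERM_READ;
	return perm;
}

/* Build the source-owner bitmask used when reclaiming memory. */
uint64_t sketch_src_mask(const struct sketch_acl_entry *acl, size_t n)
{
	uint64_t src = 0;
	size_t i;

	for (i = 0; i < n; i++)
		src |= 1ull << sketch_scm_vmid(acl[i].vmid);
	return src;
}
```

Because every clamped VMID is at most 0x3F, the shift in `sketch_src_mask()` always stays within a 64-bit mask.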
On 04/03/2023 01:06, Elliot Berman wrote:
> Add remaining ioctls to support non-proxy VM boot:
>
> - Gunyah Resource Manager uses the VM's devicetree to configure the
> virtual machine. The location of the devicetree in the guest's
> virtual memory can be declared via the SET_DTB_CONFIG ioctl.
> - Trigger start of the virtual machine with VM_START ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/vm_mgr.c | 243 ++++++++++++++++++++++++++++++--
> drivers/virt/gunyah/vm_mgr.h | 10 ++
> drivers/virt/gunyah/vm_mgr_mm.c | 23 +++
> include/linux/gunyah_rsc_mgr.h | 6 +
> include/uapi/linux/gunyah.h | 13 ++
> 5 files changed, 282 insertions(+), 13 deletions(-)
>
...
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index a19207e3e065..d6abd8605a2e 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -49,4 +49,17 @@ struct gh_userspace_memory_region {
> #define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> struct gh_userspace_memory_region)
>
> +/**
> + * struct gh_vm_dtb_config - Set the location of the VM's devicetree blob
> + * @guest_phys_addr: Address of the VM's devicetree in guest memory.
> + * @size: Maximum size of the devicetree.
> + */
> +struct gh_vm_dtb_config {
> + __u64 guest_phys_addr;
> + __u64 size;
> +};
> +#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
> +
> +#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
A comment here stating that this ioctl will *ONLY* start an
unauthenticated VM would be useful to users.
With that fixed,
Reviewed-by: Srinivas Kandagatla <[email protected]>
--srini
> +
> #endif
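As a rough illustration of how userspace might drive the two ioctls in the hunk above, here is a minimal sketch. The struct and ioctl numbers mirror the quoted hunk. The `GH_IOCTL_TYPE` value is not shown there and is an assumption, as are the helper name `sketch_boot_vm()` and the way `vm_fd` is obtained.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Assumption: the ioctl type character is not shown in the quoted hunk. */
#define GH_IOCTL_TYPE 'G'

/* Mirrors struct gh_vm_dtb_config from the quoted hunk. */
struct gh_vm_dtb_config {
	uint64_t guest_phys_addr;  /* address of the devicetree in guest memory */
	uint64_t size;             /* maximum size of the devicetree */
};

#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
#define GH_VM_START          _IO(GH_IOCTL_TYPE, 0x3)

/*
 * Declare where the devicetree lives in guest memory, then start the VM.
 * vm_fd is assumed to be a VM file descriptor obtained elsewhere.
 * Returns 0 on success, -1 with errno set on failure.
 */
int sketch_boot_vm(int vm_fd, uint64_t dtb_addr, uint64_t dtb_max_size)
{
	struct gh_vm_dtb_config dtb = {
		.guest_phys_addr = dtb_addr,
		.size = dtb_max_size,
	};

	if (ioctl(vm_fd, GH_VM_SET_DTB_CONFIG, &dtb) < 0)
		return -1;

	/* Per the review comment above, this starts an unauthenticated VM. */
	return ioctl(vm_fd, GH_VM_START);
}
```

The memory regions holding the kernel image and devicetree would need to be set up with GH_VM_SET_USER_MEM_REGION before this call. That step is omitted here.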
On 04/03/2023 01:06, Elliot Berman wrote:
> Add hypercalls to send and receive messages on a Gunyah message queue.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
Reviewed-by: Srinivas Kandagatla <[email protected]>
> arch/arm64/gunyah/gunyah_hypercall.c | 31 ++++++++++++++++++++++++++++
> include/linux/gunyah.h | 6 ++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> index 0d14e767e2c8..3420d8f286a9 100644
> --- a/arch/arm64/gunyah/gunyah_hypercall.c
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -41,6 +41,8 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
> fn)
>
> #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
> +#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
>
> /**
> * gh_hypercall_hyp_identify() - Returns build information and feature flags
> @@ -60,5 +62,34 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
> }
> EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
>
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, (uintptr_t)buff, tx_flags, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK)
> + *ready = !!res.a1;
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
> +
> +enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
> + bool *ready)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, (uintptr_t)buff, size, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK) {
> + *recv_size = res.a1;
> + *ready = !!res.a2;
> + }
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
> +
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index bd080e3a6fc9..18cfbf5ee48b 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -108,4 +108,10 @@ struct gh_hypercall_hyp_identify_resp {
>
> void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
>
> +#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
> +
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready);
> +enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
> + bool *ready);
> +
> #endif
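One detail of the quoted wrappers worth spelling out: the output parameters are written only when the hypercall succeeds, so a caller has to check the return value before reading them. The following userspace sketch mocks the result registers to show that behaviour for the receive path. `struct sketch_res`, `SKETCH_GH_ERROR_OTHER`, and `sketch_decode_recv()` are hypothetical names, and `GH_ERROR_OK` is assumed to be 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GH_ERROR_OK assumed to be 0; the second value is purely illustrative. */
enum gh_error {
	GH_ERROR_OK = 0,
	SKETCH_GH_ERROR_OTHER = 1,
};

/* Stand-in for the result registers filled in by the hvc call. */
struct sketch_res {
	uint64_t a0;  /* error code */
	uint64_t a1;  /* received size */
	uint64_t a2;  /* non-zero if more messages are ready */
};

/* Same decoding as the quoted gh_hypercall_msgq_recv(), minus the hvc. */
enum gh_error sketch_decode_recv(const struct sketch_res *res,
				 size_t *recv_size, bool *ready)
{
	if (res->a0 == GH_ERROR_OK) {
		*recv_size = (size_t)res->a1;
		*ready = !!res->a2;
	}

	return (enum gh_error)res->a0;
}
```

On an error return, `*recv_size` and `*ready` keep whatever values the caller put there beforehand.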
On 3/21/2023 7:24 AM, Srinivas Kandagatla wrote:
> Hi Elliot,
>
> On 04/03/2023 01:06, Elliot Berman wrote:
>> Qualcomm platforms have a firmware entity which performs access control
>> to physical pages. Dynamically started Gunyah virtual machines use the
>> QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
>> to the memory used by guest VMs. Gunyah doesn't do this operation for us
>> since it is the current VM (typically VMID_HLOS) delegating the access
>> and not Gunyah itself. Use the Gunyah platform ops to achieve this so
>> that only Qualcomm platforms attempt to make the needed SCM calls.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>> drivers/firmware/Kconfig | 2 +
>> drivers/firmware/qcom_scm.c | 100 +++++++++++++++++++++++++++++++++
>> include/linux/gunyah_rsc_mgr.h | 2 +-
>> 3 files changed, 103 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
>> index b59e3041fd62..b888068ff6f2 100644
>> --- a/drivers/firmware/Kconfig
>> +++ b/drivers/firmware/Kconfig
>> @@ -214,6 +214,8 @@ config MTK_ADSP_IPC
>> config QCOM_SCM
>> tristate
>> + select VIRT_DRIVERS
>> + select GUNYAH_PLATFORM_HOOKS
>>
>
> I still have concerns with these selects in Kconfig on older Qualcomm
> platforms that use SCM and do not have GUNYAH.
>
> In our last discussion you mentioned the requirement for
> "CONFIG_GUNYAH=y and CONFIG_QCOM_SCM=m"
>
> I think that should be doable, and the selects can be removed, if you
> make a separate GUNYAH_QCOM_PLATFORM_HOOKS driver
>
> Does this work?
This works for Android and all the Qualcomm vendor (downstream)
platforms where we can explicitly load modules. I don't think this
module would be implicitly loaded by any kernel mechanism.
> >----------------------->cut<-------------------------------
> From 1fb7995aecf17caefd09ffb516579bc4ac9ac301 Mon Sep 17 00:00:00 2001
> From: Srinivas Kandagatla <[email protected]>
> Date: Tue, 21 Mar 2023 13:34:02 +0000
> Subject: [PATCH] virt: gunyah: add qcom platform hooks
>
> Signed-off-by: Srinivas Kandagatla <[email protected]>
> ---
> drivers/firmware/Kconfig | 2 --
> drivers/firmware/qcom_scm.c | 14 +++-----
> drivers/virt/gunyah/Kconfig | 5 +++
> drivers/virt/gunyah/Makefile | 1 +
> .../virt/gunyah/gunyah_qcom_platform_hooks.c | 35 +++++++++++++++++++
> include/linux/firmware/qcom/qcom_scm.h | 3 ++
> 6 files changed, 48 insertions(+), 12 deletions(-)
> create mode 100644 drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
>
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index b888068ff6f2..b59e3041fd62 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -214,8 +214,6 @@ config MTK_ADSP_IPC
>
> config QCOM_SCM
> tristate
> - select VIRT_DRIVERS
> - select GUNYAH_PLATFORM_HOOKS
>
> config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
> bool "Qualcomm download mode enabled by default"
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index 5273cf64ee2a..194ea2bc9a1d 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -1301,7 +1301,7 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32
> payload_reg, u32 payload_val,
> }
> EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
>
> -static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct
> gh_rm_mem_parcel *mem_parcel)
> +int qcom_scm_gh_rm_pre_mem_share(struct gh_rm_mem_parcel *mem_parcel)
> {
> struct qcom_scm_vmperm *new_perms;
> u64 src, src_cpy;
> @@ -1359,8 +1359,9 @@ static int qcom_scm_gh_rm_pre_mem_share(struct
> gh_rm *rm, struct gh_rm_mem_parce
> kfree(new_perms);
> return ret;
> }
> +EXPORT_SYMBOL_GPL(qcom_scm_gh_rm_pre_mem_share);
>
> -static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct
> gh_rm_mem_parcel *mem_parcel)
> +int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm_mem_parcel *mem_parcel)
> {
> struct qcom_scm_vmperm new_perms;
> u64 src = 0, src_cpy;
> @@ -1388,11 +1389,7 @@ static int qcom_scm_gh_rm_post_mem_reclaim(struct
> gh_rm *rm, struct gh_rm_mem_pa
>
> return ret;
> }
> -
> -static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
> - .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
> - .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
> -};
> +EXPORT_SYMBOL_GPL(qcom_scm_gh_rm_post_mem_reclaim);
>
> static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
> {
> @@ -1597,9 +1594,6 @@ static int qcom_scm_probe(struct platform_device
> *pdev)
> if (download_mode)
> qcom_scm_set_download_mode(true);
>
> - if (devm_gh_rm_register_platform_ops(&pdev->dev,
> &qcom_scm_gh_rm_platform_ops))
> - dev_warn(__scm->dev, "Gunyah RM platform ops were already
> registered\n");
> -
> return 0;
> }
>
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index bd8e31184962..a9c48d6518f7 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -16,6 +16,11 @@ config GUNYAH
> config GUNYAH_PLATFORM_HOOKS
> tristate
>
> +config GUNYAH_QCOM_PLATFORM_HOOKS
> + tristate "Gunyah Platform hooks for Qualcomm"
> + depends on ARCH_QCOM && QCOM_SCM
> + depends on GUNYAH
> +
> config GUNYAH_VCPU
> tristate "Runnable Gunyah vCPUs"
> depends on GUNYAH
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 7347b1470491..c33f701bb5c8 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,6 +2,7 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
> +obj-$(CONFIG_GUNYAH_QCOM_PLATFORM_HOOKS) += gunyah_qcom_platform_hooks.o
>
> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
> b/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
> new file mode 100644
> index 000000000000..3332f84134d3
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
> @@ -0,0 +1,35 @@
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/firmware/qcom/qcom_scm.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +static int qcom_gh_rm_pre_mem_share(struct gh_rm *rm, struct
> gh_rm_mem_parcel *mem_parcel)
> +{
> + return qcom_scm_gh_rm_pre_mem_share(mem_parcel);
> +}
> +
> +static int qcom_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct
> gh_rm_mem_parcel *mem_parcel)
> +{
> + return qcom_scm_gh_rm_post_mem_reclaim(mem_parcel);
> +}
> +
> +static struct gh_rm_platform_ops qcom_gh_platform_hooks_ops = {
> + .pre_mem_share = qcom_gh_rm_pre_mem_share,
> + .post_mem_reclaim = qcom_gh_rm_post_mem_reclaim,
> +};
> +
> +static int __init qcom_gh_platform_hooks_register(void)
> +{
> + return gh_rm_register_platform_ops(&qcom_gh_platform_hooks_ops);
> +}
> +
> +static void __exit qcom_gh_platform_hooks_unregister(void)
> +{
> + gh_rm_unregister_platform_ops(&qcom_gh_platform_hooks_ops);
> +}
> +
> +module_init(qcom_gh_platform_hooks_register);
> +module_exit(qcom_gh_platform_hooks_unregister);
> +
> +MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Gunyah Platform Hooks
> driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/firmware/qcom/qcom_scm.h
> b/include/linux/firmware/qcom/qcom_scm.h
> index 1e449a5d7f5c..9b0d33db803d 100644
> --- a/include/linux/firmware/qcom/qcom_scm.h
> +++ b/include/linux/firmware/qcom/qcom_scm.h
> @@ -121,5 +121,8 @@ extern int qcom_scm_lmh_dcvsh(u32 payload_fn, u32
> payload_reg, u32 payload_val,
> u64 limit_node, u32 node_id, u64 version);
> extern int qcom_scm_lmh_profile_change(u32 profile_id);
> extern bool qcom_scm_lmh_dcvsh_available(void);
> +struct gh_rm_mem_parcel;
> +extern int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm_mem_parcel
> *mem_parcel);
> +extern int qcom_scm_gh_rm_pre_mem_share(struct gh_rm_mem_parcel
> *mem_parcel);
>
> #endif
> --------------------------->cut<-----------------------
>
>> config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
>> bool "Qualcomm download mode enabled by default"
>> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
>> index b95616b35bff..89a261a9e021 100644
>> --- a/drivers/firmware/qcom_scm.c
>> +++ b/drivers/firmware/qcom_scm.c
>> @@ -20,6 +20,7 @@
>> #include <linux/clk.h>
>> #include <linux/reset-controller.h>
>> #include <linux/arm-smccc.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> #include "qcom_scm.h"
>> @@ -30,6 +31,9 @@ module_param(download_mode, bool, 0);
>> #define SCM_HAS_IFACE_CLK BIT(1)
>> #define SCM_HAS_BUS_CLK BIT(2)
>> +#define QCOM_SCM_RM_MANAGED_VMID 0x3A
>> +#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
>> +
>> struct qcom_scm {
>> struct device *dev;
>> struct clk *core_clk;
>> @@ -1299,6 +1303,99 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32
>> payload_reg, u32 payload_val,
>> }
>> EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
>> +static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +{
>> + struct qcom_scm_vmperm *new_perms;
>> + u64 src, src_cpy;
>> + int ret = 0, i, n;
>> + u16 vmid;
>> +
>> + new_perms = kcalloc(mem_parcel->n_acl_entries,
>> sizeof(*new_perms), GFP_KERNEL);
>> + if (!new_perms)
>> + return -ENOMEM;
>> +
>> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
>> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
>> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
>> + new_perms[n].vmid = vmid;
>> + else
>> + new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
>> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
>> + new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
>> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
>> + new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
>> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
>> + new_perms[n].perm |= QCOM_SCM_PERM_READ;
>> + }
>> +
>> + src = (1ull << QCOM_SCM_VMID_HLOS);
>> +
>> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
>> + src_cpy = src;
>> + ret =
>> qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
>> + le64_to_cpu(mem_parcel->mem_entries[i].size),
>> + &src_cpy, new_perms, mem_parcel->n_acl_entries);
>> + if (ret) {
>> + src = 0;
>> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
>> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
>> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
>> + src |= (1ull << vmid);
>> + else
>> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
>> + }
>> +
>> + new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
>> +
>> + for (i--; i >= 0; i--) {
>> + src_cpy = src;
>> + WARN_ON_ONCE(qcom_scm_assign_mem(
>> +
>> le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
>> + le64_to_cpu(mem_parcel->mem_entries[i].size),
>> + &src_cpy, new_perms, 1));
>> + }
>> + break;
>> + }
>> + }
>> +
>> + kfree(new_perms);
>> + return ret;
>> +}
>> +
>> +static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +{
>> + struct qcom_scm_vmperm new_perms;
>> + u64 src = 0, src_cpy;
>> + int ret = 0, i, n;
>> + u16 vmid;
>> +
>> + new_perms.vmid = QCOM_SCM_VMID_HLOS;
>> + new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE |
>> QCOM_SCM_PERM_READ;
>> +
>> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
>> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
>> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
>> + src |= (1ull << vmid);
>> + else
>> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
>> + }
>> +
>> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
>> + src_cpy = src;
>> + ret =
>> qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
>> + le64_to_cpu(mem_parcel->mem_entries[i].size),
>> + &src_cpy, &new_perms, 1);
>> + WARN_ON_ONCE(ret);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
>> + .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
>> + .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
>> +};
>> +
>> static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
>> {
>> struct device_node *tcsr;
>> @@ -1502,6 +1599,9 @@ static int qcom_scm_probe(struct platform_device
>> *pdev)
>> if (download_mode)
>> qcom_scm_set_download_mode(true);
>> + if (devm_gh_rm_register_platform_ops(&pdev->dev,
>> &qcom_scm_gh_rm_platform_ops))
>> + dev_warn(__scm->dev, "Gunyah RM platform ops were already
>> registered\n");
>> +
>> return 0;
>> }
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> index 515087931a2b..acf8c1545a6c 100644
>> --- a/include/linux/gunyah_rsc_mgr.h
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -145,7 +145,7 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16
>> vmid,
>> struct gh_rm_hyp_resources **resources);
>> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>> -struct gunyah_rm_platform_ops {
>> +struct gh_rm_platform_ops {
>> int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel
>> *mem_parcel);
>> int (*post_mem_reclaim)(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel);
>> };
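The error path of qcom_scm_gh_rm_pre_mem_share() quoted above follows an "assign everything or roll back" pattern: if one entry fails, the entries already handed to the guest are returned to the host in reverse order. A minimal sketch of that control flow is below. The callback type and all `sketch_*` names are hypothetical stand-ins for qcom_scm_assign_mem(), and rollback failures are ignored here, whereas the patch flags them with WARN_ON_ONCE().

```c
#include <stddef.h>

/* Hypothetical callback standing in for qcom_scm_assign_mem(). */
typedef int (*sketch_assign_fn)(size_t entry, int to_guest, void *ctx);

/* Assign every entry to the guest, or undo the ones already assigned. */
int sketch_share_all(size_t n_entries, sketch_assign_fn assign, void *ctx)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < n_entries; i++) {
		ret = assign(i, 1, ctx);
		if (ret) {
			/* Return already-assigned entries to the host, newest first. */
			while (i-- > 0)
				(void)assign(i, 0, ctx);
			break;
		}
	}

	return ret;
}

/* Example callback: tracks ownership in an array and fails at one index. */
struct sketch_state {
	int owned_by_guest[8];
	size_t fail_at;
};

int sketch_mock_assign(size_t entry, int to_guest, void *ctx)
{
	struct sketch_state *st = ctx;

	if (to_guest && entry == st->fail_at)
		return -1;
	st->owned_by_guest[entry] = to_guest;
	return 0;
}
```

The caller sees the original error code, and after a failed call no entry is left assigned to the guest, provided every rollback call succeeds.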
On 21/03/2023 18:40, Elliot Berman wrote:
>
>
> On 3/21/2023 7:24 AM, Srinivas Kandagatla wrote:
>> Hi Elliot,
>>
>> On 04/03/2023 01:06, Elliot Berman wrote:
>>> Qualcomm platforms have a firmware entity which performs access control
>>> to physical pages. Dynamically started Gunyah virtual machines use the
>>> QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
>>> to the memory used by guest VMs. Gunyah doesn't do this operation for us
>>> since it is the current VM (typically VMID_HLOS) delegating the access
>>> and not Gunyah itself. Use the Gunyah platform ops to achieve this so
>>> that only Qualcomm platforms attempt to make the needed SCM calls.
>>>
>>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>>> Signed-off-by: Elliot Berman <[email protected]>
>>> ---
>>> drivers/firmware/Kconfig | 2 +
>>> drivers/firmware/qcom_scm.c | 100 +++++++++++++++++++++++++++++++++
>>> include/linux/gunyah_rsc_mgr.h | 2 +-
>>> 3 files changed, 103 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
>>> index b59e3041fd62..b888068ff6f2 100644
>>> --- a/drivers/firmware/Kconfig
>>> +++ b/drivers/firmware/Kconfig
>>> @@ -214,6 +214,8 @@ config MTK_ADSP_IPC
>>> config QCOM_SCM
>>> tristate
>>> + select VIRT_DRIVERS
>>> + select GUNYAH_PLATFORM_HOOKS
>>>
>>
>> I still have concerns with these selects in Kconfig on older Qualcomm
>> platforms that use SCM and do not have GUNYAH.
>>
>> In our last discussion you mentioned the requirement for
>> "CONFIG_GUNYAH=y and CONFIG_QCOM_SCM=m"
>>
>> I think that should be doable, and the selects can be removed, if you
>> make a separate GUNYAH_QCOM_PLATFORM_HOOKS driver
>>
>> Does this work?
>
> This works for Android and all the Qualcomm vendor (downstream)
> platforms where we can explicitly load modules. I don't think this
> module would be implicitly loaded by any kernel mechanism.
We could also load this module based on a UUID match at the Gunyah core
level, if that helps.
--srini
>
>> >----------------------->cut<-------------------------------
>> From 1fb7995aecf17caefd09ffb516579bc4ac9ac301 Mon Sep 17 00:00:00 2001
>> From: Srinivas Kandagatla <[email protected]>
>> Date: Tue, 21 Mar 2023 13:34:02 +0000
>> Subject: [PATCH] virt: gunyah: add qcom platform hooks
>>
>> Signed-off-by: Srinivas Kandagatla <[email protected]>
>> ---
>> drivers/firmware/Kconfig | 2 --
>> drivers/firmware/qcom_scm.c | 14 +++-----
>> drivers/virt/gunyah/Kconfig | 5 +++
>> drivers/virt/gunyah/Makefile | 1 +
>> .../virt/gunyah/gunyah_qcom_platform_hooks.c | 35 +++++++++++++++++++
>> include/linux/firmware/qcom/qcom_scm.h | 3 ++
>> 6 files changed, 48 insertions(+), 12 deletions(-)
>> create mode 100644 drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
>>
>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
>> index b888068ff6f2..b59e3041fd62 100644
>> --- a/drivers/firmware/Kconfig
>> +++ b/drivers/firmware/Kconfig
>> @@ -214,8 +214,6 @@ config MTK_ADSP_IPC
>>
>> config QCOM_SCM
>> tristate
>> - select VIRT_DRIVERS
>> - select GUNYAH_PLATFORM_HOOKS
>>
>> config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
>> bool "Qualcomm download mode enabled by default"
>> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
>> index 5273cf64ee2a..194ea2bc9a1d 100644
>> --- a/drivers/firmware/qcom_scm.c
>> +++ b/drivers/firmware/qcom_scm.c
>> @@ -1301,7 +1301,7 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32
>> payload_reg, u32 payload_val,
>> }
>> EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
>>
>> -static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +int qcom_scm_gh_rm_pre_mem_share(struct gh_rm_mem_parcel *mem_parcel)
>> {
>> struct qcom_scm_vmperm *new_perms;
>> u64 src, src_cpy;
>> @@ -1359,8 +1359,9 @@ static int qcom_scm_gh_rm_pre_mem_share(struct
>> gh_rm *rm, struct gh_rm_mem_parce
>> kfree(new_perms);
>> return ret;
>> }
>> +EXPORT_SYMBOL_GPL(qcom_scm_gh_rm_pre_mem_share);
>>
>> -static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm_mem_parcel *mem_parcel)
>> {
>> struct qcom_scm_vmperm new_perms;
>> u64 src = 0, src_cpy;
>> @@ -1388,11 +1389,7 @@ static int
>> qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_pa
>>
>> return ret;
>> }
>> -
>> -static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
>> - .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
>> - .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
>> -};
>> +EXPORT_SYMBOL_GPL(qcom_scm_gh_rm_post_mem_reclaim);
>>
>> static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
>> {
>> @@ -1597,9 +1594,6 @@ static int qcom_scm_probe(struct platform_device
>> *pdev)
>> if (download_mode)
>> qcom_scm_set_download_mode(true);
>>
>> - if (devm_gh_rm_register_platform_ops(&pdev->dev,
>> &qcom_scm_gh_rm_platform_ops))
>> - dev_warn(__scm->dev, "Gunyah RM platform ops were already
>> registered\n");
>> -
>> return 0;
>> }
>>
>> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
>> index bd8e31184962..a9c48d6518f7 100644
>> --- a/drivers/virt/gunyah/Kconfig
>> +++ b/drivers/virt/gunyah/Kconfig
>> @@ -16,6 +16,11 @@ config GUNYAH
>> config GUNYAH_PLATFORM_HOOKS
>> tristate
>>
>> +config GUNYAH_QCOM_PLATFORM_HOOKS
>> + tristate "Gunyah Platform hooks for Qualcomm"
>> + depends on ARCH_QCOM && QCOM_SCM
>> + depends on GUNYAH
>> +
>> config GUNYAH_VCPU
>> tristate "Runnable Gunyah vCPUs"
>> depends on GUNYAH
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index 7347b1470491..c33f701bb5c8 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,6 +2,7 @@
>>
>> obj-$(CONFIG_GUNYAH) += gunyah.o
>> obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>> +obj-$(CONFIG_GUNYAH_QCOM_PLATFORM_HOOKS) += gunyah_qcom_platform_hooks.o
>>
>> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
>> b/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
>> new file mode 100644
>> index 000000000000..3332f84134d3
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/gunyah_qcom_platform_hooks.c
>> @@ -0,0 +1,35 @@
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/firmware/qcom/qcom_scm.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +
>> +static int qcom_gh_rm_pre_mem_share(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +{
>> + return qcom_scm_gh_rm_pre_mem_share(mem_parcel);
>> +}
>> +
>> +static int qcom_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +{
>> + return qcom_scm_gh_rm_post_mem_reclaim(mem_parcel);
>> +}
>> +
>> +static struct gh_rm_platform_ops qcom_gh_platform_hooks_ops = {
>> + .pre_mem_share = qcom_gh_rm_pre_mem_share,
>> + .post_mem_reclaim = qcom_gh_rm_post_mem_reclaim,
>> +};
>> +
>> +static int __init qcom_gh_platform_hooks_register(void)
>> +{
>> + return gh_rm_register_platform_ops(&qcom_gh_platform_hooks_ops);
>> +}
>> +
>> +static void __exit qcom_gh_platform_hooks_unregister(void)
>> +{
>> + gh_rm_unregister_platform_ops(&qcom_gh_platform_hooks_ops);
>> +}
>> +
>> +module_init(qcom_gh_platform_hooks_register);
>> +module_exit(qcom_gh_platform_hooks_unregister);
>> +
>> +MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Gunyah Platform Hooks
>> driver");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/include/linux/firmware/qcom/qcom_scm.h
>> b/include/linux/firmware/qcom/qcom_scm.h
>> index 1e449a5d7f5c..9b0d33db803d 100644
>> --- a/include/linux/firmware/qcom/qcom_scm.h
>> +++ b/include/linux/firmware/qcom/qcom_scm.h
>> @@ -121,5 +121,8 @@ extern int qcom_scm_lmh_dcvsh(u32 payload_fn, u32
>> payload_reg, u32 payload_val,
>> u64 limit_node, u32 node_id, u64 version);
>> extern int qcom_scm_lmh_profile_change(u32 profile_id);
>> extern bool qcom_scm_lmh_dcvsh_available(void);
>> +struct gh_rm_mem_parcel;
>> +extern int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm_mem_parcel
>> *mem_parcel);
>> +extern int qcom_scm_gh_rm_pre_mem_share(struct gh_rm_mem_parcel
>> *mem_parcel);
>>
>> #endif
>> --------------------------->cut<-----------------------
>>
>>> config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
>>> bool "Qualcomm download mode enabled by default"
>>> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
>>> index b95616b35bff..89a261a9e021 100644
>>> --- a/drivers/firmware/qcom_scm.c
>>> +++ b/drivers/firmware/qcom_scm.c
>>> @@ -20,6 +20,7 @@
>>> #include <linux/clk.h>
>>> #include <linux/reset-controller.h>
>>> #include <linux/arm-smccc.h>
>>> +#include <linux/gunyah_rsc_mgr.h>
>>> #include "qcom_scm.h"
>>> @@ -30,6 +31,9 @@ module_param(download_mode, bool, 0);
>>> #define SCM_HAS_IFACE_CLK BIT(1)
>>> #define SCM_HAS_BUS_CLK BIT(2)
>>> +#define QCOM_SCM_RM_MANAGED_VMID 0x3A
>>> +#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
>>> +
>>> struct qcom_scm {
>>> struct device *dev;
>>> struct clk *core_clk;
>>> @@ -1299,6 +1303,99 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32
>>> payload_reg, u32 payload_val,
>>> }
>>> EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
>>> +static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct
>>> gh_rm_mem_parcel *mem_parcel)
>>> +{
>>> + struct qcom_scm_vmperm *new_perms;
>>> + u64 src, src_cpy;
>>> + int ret = 0, i, n;
>>> + u16 vmid;
>>> +
>>> + new_perms = kcalloc(mem_parcel->n_acl_entries,
>>> sizeof(*new_perms), GFP_KERNEL);
>>> + if (!new_perms)
>>> + return -ENOMEM;
>>> +
>>> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
>>> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
>>> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
>>> + new_perms[n].vmid = vmid;
>>> + else
>>> + new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
>>> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
>>> + new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
>>> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
>>> + new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
>>> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
>>> + new_perms[n].perm |= QCOM_SCM_PERM_READ;
>>> + }
>>> +
>>> + src = (1ull << QCOM_SCM_VMID_HLOS);
>>> +
>>> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
>>> + src_cpy = src;
>>> + ret =
>>> qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
>>> + le64_to_cpu(mem_parcel->mem_entries[i].size),
>>> + &src_cpy, new_perms,
>>> mem_parcel->n_acl_entries);
>>> + if (ret) {
>>> + src = 0;
>>> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
>>> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
>>> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
>>> + src |= (1ull << vmid);
>>> + else
>>> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
>>> + }
>>> +
>>> + new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
>>> +
>>> + for (i--; i >= 0; i--) {
>>> + src_cpy = src;
>>> + WARN_ON_ONCE(qcom_scm_assign_mem(
>>> + le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
>>> + le64_to_cpu(mem_parcel->mem_entries[i].size),
>>> + &src_cpy, new_perms, 1));
>>> + }
>>> + break;
>>> + }
>>> + }
>>> +
>>> + kfree(new_perms);
>>> + return ret;
>>> +}
>>> +
>>> +static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct
>>> gh_rm_mem_parcel *mem_parcel)
>>> +{
>>> + struct qcom_scm_vmperm new_perms;
>>> + u64 src = 0, src_cpy;
>>> + int ret = 0, i, n;
>>> + u16 vmid;
>>> +
>>> + new_perms.vmid = QCOM_SCM_VMID_HLOS;
>>> + new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE |
>>> QCOM_SCM_PERM_READ;
>>> +
>>> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
>>> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
>>> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
>>> + src |= (1ull << vmid);
>>> + else
>>> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
>>> + }
>>> +
>>> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
>>> + src_cpy = src;
>>> + ret =
>>> qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
>>> + le64_to_cpu(mem_parcel->mem_entries[i].size),
>>> + &src_cpy, &new_perms, 1);
>>> + WARN_ON_ONCE(ret);
>>> + }
>>> +
>>> + return ret;
>>> +}
>>> +
>>> +static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
>>> + .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
>>> + .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
>>> +};
>>> +
>>> static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
>>> {
>>> struct device_node *tcsr;
>>> @@ -1502,6 +1599,9 @@ static int qcom_scm_probe(struct
>>> platform_device *pdev)
>>> if (download_mode)
>>> qcom_scm_set_download_mode(true);
>>> + if (devm_gh_rm_register_platform_ops(&pdev->dev,
>>> &qcom_scm_gh_rm_platform_ops))
>>> + dev_warn(__scm->dev, "Gunyah RM platform ops were already
>>> registered\n");
>>> +
>>> return 0;
>>> }
>>> diff --git a/include/linux/gunyah_rsc_mgr.h
>>> b/include/linux/gunyah_rsc_mgr.h
>>> index 515087931a2b..acf8c1545a6c 100644
>>> --- a/include/linux/gunyah_rsc_mgr.h
>>> +++ b/include/linux/gunyah_rsc_mgr.h
>>> @@ -145,7 +145,7 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16
>>> vmid,
>>> struct gh_rm_hyp_resources **resources);
>>> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>>> -struct gunyah_rm_platform_ops {
>>> +struct gh_rm_platform_ops {
>>> int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel
>>> *mem_parcel);
>>> int (*post_mem_reclaim)(struct gh_rm *rm, struct
>>> gh_rm_mem_parcel *mem_parcel);
>>> };
On 3/21/2023 7:23 AM, Srinivas Kandagatla wrote:
>
>
> On 04/03/2023 01:06, Elliot Berman wrote:
>> On Qualcomm platforms, there is a firmware entity which controls access
>> to physical pages. In order to share memory with another VM, this entity
>> needs to be informed that the guest VM should have access to the memory.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>> drivers/virt/gunyah/Kconfig | 4 ++
>> drivers/virt/gunyah/Makefile | 1 +
>> drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
>> drivers/virt/gunyah/rsc_mgr.h | 3 +
>> drivers/virt/gunyah/rsc_mgr_rpc.c | 18 ++++-
>> include/linux/gunyah_rsc_mgr.h | 17 +++++
>> 6 files changed, 121 insertions(+), 2 deletions(-)
>> create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
>>
>> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
>> index 1a737694c333..de815189dab6 100644
>> --- a/drivers/virt/gunyah/Kconfig
>> +++ b/drivers/virt/gunyah/Kconfig
>> @@ -4,6 +4,7 @@ config GUNYAH
>> tristate "Gunyah Virtualization drivers"
>> depends on ARM64
>> depends on MAILBOX
>> + select GUNYAH_PLATFORM_HOOKS
>> help
>> The Gunyah drivers are the helper interfaces that run in a
>> guest VM
>> such as basic inter-VM IPC and signaling mechanisms, and
>> higher level
>> @@ -11,3 +12,6 @@ config GUNYAH
>> Say Y/M here to enable the drivers needed to interact in a Gunyah
>> virtual environment.
>> +
>> +config GUNYAH_PLATFORM_HOOKS
>> + tristate
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index ff8bc4925392..6b8f84dbfe0d 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -1,6 +1,7 @@
>> # SPDX-License-Identifier: GPL-2.0
>> obj-$(CONFIG_GUNYAH) += gunyah.o
>> +obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c
>> b/drivers/virt/gunyah/gunyah_platform_hooks.c
>> new file mode 100644
>> index 000000000000..60da0e154e98
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
>> @@ -0,0 +1,80 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/rwsem.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +
>> +#include "rsc_mgr.h"
>> +
>> +static struct gh_rm_platform_ops *rm_platform_ops;
>> +static DECLARE_RWSEM(rm_platform_ops_lock);
>> +
>> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>
> I think I have asked this question before, but I cannot find the answer
> in the old replies.
> Why are these platform hooks not part of core gunyah? Do we need a
> dedicated module for this?
> By the looks of the APIs, this is very close to rm, and I think
> this functionality should live with rm.
>
If the platform hooks were part of core gunyah, then core gunyah could
only be enabled as =y in the default defconfig because QCOM_SCM would
rely on the platform hooks (core gunyah).
With the suggestion to move the SCM platform hooks into a separate module,
I can/will bring the platform hooks back into core gunyah module.
- Elliot
> --srini
>> +{
>> + int ret = 0;
>> +
>> + down_read(&rm_platform_ops_lock);
>> + if (rm_platform_ops && rm_platform_ops->pre_mem_share)
>> + ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
>> + up_read(&rm_platform_ops_lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
>> +
>> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel)
>> +{
>> + int ret = 0;
>> +
>> + down_read(&rm_platform_ops_lock);
>> + if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
>> + ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
>> + up_read(&rm_platform_ops_lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
>> +
>> +int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
>> +{
>> + int ret = 0;
>> +
>> + down_write(&rm_platform_ops_lock);
>> + if (!rm_platform_ops)
>> + rm_platform_ops = platform_ops;
>> + else
>> + ret = -EEXIST;
>> + up_write(&rm_platform_ops_lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
>> +
>> +void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops
>> *platform_ops)
>> +{
>> + down_write(&rm_platform_ops_lock);
>> + if (rm_platform_ops == platform_ops)
>> + rm_platform_ops = NULL;
>> + up_write(&rm_platform_ops_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
>> +
>> +static void _devm_gh_rm_unregister_platform_ops(void *data)
>> +{
>> + gh_rm_unregister_platform_ops(data);
>> +}
>> +
>> +int devm_gh_rm_register_platform_ops(struct device *dev, struct
>> gh_rm_platform_ops *ops)
>> +{
>> + int ret;
>> +
>> + ret = gh_rm_register_platform_ops(ops);
>> + if (ret)
>> + return ret;
>> +
>> + return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops,
>> ops);
>> +}
>> +EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Gunyah Platform Hooks");
>> diff --git a/drivers/virt/gunyah/rsc_mgr.h
>> b/drivers/virt/gunyah/rsc_mgr.h
>> index 3665ebc7b020..6838e736f361 100644
>> --- a/drivers/virt/gunyah/rsc_mgr.h
>> +++ b/drivers/virt/gunyah/rsc_mgr.h
>> @@ -13,4 +13,7 @@ struct gh_rm;
>> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void
>> *req_buff, size_t req_buf_size,
>> void **resp_buf, size_t *resp_buf_size);
>> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel);
>> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *mem_parcel);
>> +
>> #endif
>> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c
>> b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> index 3df15ad5b97d..733be4dc8dd2 100644
>> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
>> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> @@ -204,6 +204,12 @@ static int gh_rm_mem_lend_common(struct gh_rm
>> *rm, u32 message_id, struct gh_rm_
>> if (!msg)
>> return -ENOMEM;
>> + ret = gh_rm_platform_pre_mem_share(rm, p);
>> + if (ret) {
>> + kfree(msg);
>> + return ret;
>> + }
>> +
>> req_header = msg;
>> acl_section = (void *)req_header + sizeof(*req_header);
>> mem_section = (void *)acl_section + struct_size(acl_section,
>> entries, p->n_acl_entries);
>> @@ -227,8 +233,10 @@ static int gh_rm_mem_lend_common(struct gh_rm
>> *rm, u32 message_id, struct gh_rm_
>> ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp,
>> &resp_size);
>> kfree(msg);
>> - if (ret)
>> + if (ret) {
>> + gh_rm_platform_post_mem_reclaim(rm, p);
>> return ret;
>> + }
>> p->mem_handle = le32_to_cpu(*resp);
>> @@ -283,8 +291,14 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct
>> gh_rm_mem_parcel *parcel)
>> struct gh_rm_mem_release_req req = {
>> .mem_handle = cpu_to_le32(parcel->mem_handle),
>> };
>> + int ret;
>> +
>> + ret = gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req),
>> NULL, NULL);
>> + /* Do not call platform mem reclaim hooks: the reclaim didn't
>> happen*/
>> + if (ret)
>> + return ret;
>> - return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req),
>> NULL, NULL);
>> + return gh_rm_platform_post_mem_reclaim(rm, parcel);
>> }
>> /**
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> index 8b0b46f28e39..515087931a2b 100644
>> --- a/include/linux/gunyah_rsc_mgr.h
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -145,4 +145,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16
>> vmid,
>> struct gh_rm_hyp_resources **resources);
>> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>> +struct gunyah_rm_platform_ops {
>> + int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel
>> *mem_parcel);
>> + int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel
>> *mem_parcel);
>> +};
>> +
>> +#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
>> +int gh_rm_register_platform_ops(struct gh_rm_platform_ops
>> *platform_ops);
>> +void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops
>> *platform_ops);
>> +int devm_gh_rm_register_platform_ops(struct device *dev, struct
>> gh_rm_platform_ops *ops);
>> +#else
>> +static inline int gh_rm_register_platform_ops(struct
>> gh_rm_platform_ops *platform_ops)
>> + { return 0; }
>> +static inline void gh_rm_unregister_platform_ops(struct
>> gh_rm_platform_ops *platform_ops) { }
>> +static inline int devm_gh_rm_register_platform_ops(struct device *dev,
>> + struct gh_rm_platform_ops *ops) { return 0; }
>> +#endif
>> +
>> #endif
On Fri, Mar 03, 2023 at 05:06:18PM -0800, Elliot Berman wrote:
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/vm_mgr.c | 44 ++++++
> drivers/virt/gunyah/vm_mgr.h | 25 ++++
> drivers/virt/gunyah/vm_mgr_mm.c | 229 ++++++++++++++++++++++++++++++++
> include/uapi/linux/gunyah.h | 29 ++++
> 5 files changed, 328 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
[...]
> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
> +{
> + struct gh_vm_mem *mapping, *tmp_mapping;
> + struct gh_rm_mem_entry *mem_entries;
> + phys_addr_t curr_page, prev_page;
> + struct gh_rm_mem_parcel *parcel;
> + int i, j, pinned, ret = 0;
> + size_t entry_size;
> + u16 vmid;
> +
> + if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
> + !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
> + return -EINVAL;
> +
> + if (region->guest_phys_addr + region->memory_size < region->guest_phys_addr)
> + return -EOVERFLOW;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
> +
> + mapping = __gh_vm_mem_find_by_label(ghvm, region->label);
> + if (mapping) {
> + mutex_unlock(&ghvm->mm_lock);
> + return -EEXIST;
> + }
> +
> + mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
> + if (!mapping) {
> + mutex_unlock(&ghvm->mm_lock);
> + return -ENOMEM;
> + }
> +
> + mapping->parcel.label = region->label;
> + mapping->guest_phys_addr = region->guest_phys_addr;
> + mapping->npages = region->memory_size >> PAGE_SHIFT;
> + parcel = &mapping->parcel;
> + parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
> + parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
> +
> + /* Check for overlap */
> + list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
> + if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
> + tmp_mapping->guest_phys_addr) ||
> + (mapping->guest_phys_addr >=
> + tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
> + ret = -EEXIST;
> + goto free_mapping;
> + }
> + }
> +
> + list_add(&mapping->list, &ghvm->memory_mappings);
> +
> + mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
> + if (!mapping->pages) {
> + ret = -ENOMEM;
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
> + }
> +
> + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
> + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
> + if (pinned < 0) {
> + ret = pinned;
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
> + } else if (pinned != mapping->npages) {
> + ret = -EFAULT;
> + mapping->npages = pinned; /* update npages for reclaim */
> + goto reclaim;
> + }
I think Fuad mentioned this on an older version of these patches, but it
looks like you're failing to account for the pinned memory here which is
a security issue depending on who is able to issue the ioctl() calling
into here.
Specifically, I'm thinking that your kXalloc() calls should be using
GFP_KERNEL_ACCOUNT in this function and also that you should be calling
account_locked_vm() for the pages being pinned.
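To make the accounting pattern concrete, here is a rough userspace model, for illustration only. The names toy_mm, toy_account_locked_vm() and toy_pin_region() are hypothetical stand-ins and not kernel APIs; the real account_locked_vm() works on an mm_struct inside the kernel. The sketch only shows the ordering I have in mind: charge the pages against a limit before pinning, and drop the charge again if pinning fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-process locked-memory state. */
struct toy_mm {
	unsigned long locked_vm;	/* pages currently charged */
	unsigned long limit;		/* maximum pages that may be charged */
};

/* Charge (inc) or uncharge (!inc) pages; refuse a charge past the limit. */
static int toy_account_locked_vm(struct toy_mm *mm, unsigned long pages, bool inc)
{
	assert(mm && mm->locked_vm <= mm->limit);

	if (!inc) {
		assert(pages <= mm->locked_vm);
		mm->locked_vm -= pages;
		return 0;
	}

	if (pages > mm->limit - mm->locked_vm)
		return -ENOMEM;

	mm->locked_vm += pages;
	return 0;
}

/* Charge first, then "pin"; pin_ok simulates whether pinning succeeds. */
static int toy_pin_region(struct toy_mm *mm, unsigned long npages, bool pin_ok)
{
	int ret = toy_account_locked_vm(mm, npages, true);

	if (ret)
		return ret;

	if (!pin_ok) {
		/* Undo the charge so a failed pin leaves no accounting behind. */
		toy_account_locked_vm(mm, npages, false);
		return -EFAULT;
	}

	return 0;
}

/* Small driver: returns the pages still charged after three attempts. */
static unsigned long toy_demo(void)
{
	struct toy_mm mm = { .locked_vm = 0, .limit = 16 };

	toy_pin_region(&mm, 10, true);	/* succeeds, 10 pages charged */
	toy_pin_region(&mm, 10, true);	/* refused, would exceed the limit */
	toy_pin_region(&mm, 4, false);	/* charged, then rolled back */

	return mm.locked_vm;
}
```

The point of the model is that every failure path leaves the charged count exactly where it was before the call, which is what the reclaim path in the patch would also need to guarantee.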
Finally, what happens if userspace passes in a file mapping?
Will
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Gunyah is a Type-1 hypervisor independent of any
> high-level OS kernel, and runs in a higher CPU privilege level. It does
> not depend on any lower-privileged OS kernel/code for its core
> functionality. This increases its security and can support a much smaller
> trusted computing base than a Type-2 hypervisor.
>
> Gunyah is an open source hypervisor. The source repo is available at
> https://github.com/quic/gunyah-hypervisor.
I've done a pretty detailed review again, and got further along
than I did last time. Things are definitely looking better, but
I have found some bugs that need to be addressed.
I also make a lot of comments about grouping certain sets of
definitions into enumerated types. Also I tend to notice when
things aren't done consistently, and I mention that a lot.
There are silly suggestions all over about alignment of
things--these are mainly to make the code look prettier,
though that's a matter of opinion.
I still prefer having lines generally closer to 80 columns
wide, but I've already mentioned that...
I really focused on the code, and not the documentation.
In fact I didn't even pay much attention to your patch
headers either. I did not review the SCM calls yet.
So in summary I have not reviewed patches 1, 2, 16, 17,
and 26. I'll try to look at everything in my next review,
which I hope will be final (or very close).
-Alex
> The diagram below shows the architecture.
>
> ::
>
> VM A VM B
> +-----+ +-----+ | +-----+ +-----+ +-----+
> | | | | | | | | | | |
> EL0 | APP | | APP | | | APP | | APP | | APP |
> | | | | | | | | | | |
> +-----+ +-----+ | +-----+ +-----+ +-----+
> ---------------------|-------------------------
> +--------------+ | +----------------------+
> | | | | |
> EL1 | Linux Kernel | | |Linux kernel/Other OS | ...
> | | | | |
> +--------------+ | +----------------------+
> --------hvc/smc------|------hvc/smc------------
> +----------------------------------------+
> | |
> EL2 | Gunyah Hypervisor |
> | |
> +----------------------------------------+
>
> Gunyah provides these following features.
>
. . .
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add architecture-independent standard error codes, types, and macros for
> Gunyah hypercalls.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
See a few comments below. -Alex
> ---
> include/linux/gunyah.h | 83 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 83 insertions(+)
> create mode 100644 include/linux/gunyah.h
>
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> new file mode 100644
> index 000000000000..54b4be71caf7
> --- /dev/null
> +++ b/include/linux/gunyah.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_GUNYAH_H
> +#define _LINUX_GUNYAH_H
> +
> +#include <linux/errno.h>
> +#include <linux/limits.h>
> +
> +/******************************************************************************/
> +/* Common arch-independent definitions for Gunyah hypercalls */
> +#define GH_CAPID_INVAL U64_MAX
> +#define GH_VMID_ROOT_VM 0xff
The above definition doesn't seem to be used anywhere, but seeing
it raises the question for me of what type it is expected to have.
If it were used, would it be used in an 8-bit field?
> +
> +enum gh_error {
> + GH_ERROR_OK = 0,
> + GH_ERROR_UNIMPLEMENTED = -1,
> + GH_ERROR_RETRY = -2,
There might be nothing fundamentally wrong with this, but I
dislike seeing negative values assigned to enums.
These error values are returned from the hypervisor, and it
looks like they'll likely be truncated from a 64-bit unsigned
value. Are they *sent* from the hypervisor as 64-bit signed
values? Or 32-bit signed values? (In that case, the
I just wonder if you can use 0xffffffff or 0xffff for example
rather than -1, depending on the actual value that gets passed.
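To illustrate why the width matters, here is a small userspace sketch. It is not taken from the patch, and I am not asserting how Gunyah actually encodes its return values; low32_as_signed() and is_u32_all_ones() are hypothetical helpers. It shows that a register holding a sign-extended 64-bit -1 and one holding a zero-extended 32-bit 0xffffffff both read as -1 once truncated to a signed 32-bit value, yet only one of them compares equal to the constant 0xffffffff.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret the low 32 bits of a 64-bit register value as signed. */
static int32_t low32_as_signed(uint64_t reg)
{
	uint32_t lo = (uint32_t)reg;

	if (lo <= INT32_MAX)
		return (int32_t)lo;

	/* Avoids relying on implementation-defined unsigned-to-signed conversion. */
	assert(UINT32_MAX - lo <= INT32_MAX);
	return -(int32_t)(UINT32_MAX - lo) - 1;
}

/* True only if the register holds exactly 0xffffffff with the upper half clear. */
static int is_u32_all_ones(uint64_t reg)
{
	return reg == UINT64_C(0xffffffff);
}
```

So whether an unsigned constant can replace -1 in the enum depends on which of those two encodings the hypervisor really uses.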
> +
> + GH_ERROR_ARG_INVAL = 1,
> + GH_ERROR_ARG_SIZE = 2,
> + GH_ERROR_ARG_ALIGN = 3,
> +
> + GH_ERROR_NOMEM = 10,
> +
> + GH_ERROR_ADDR_OVFL = 20,
> + GH_ERROR_ADDR_UNFL = 21,
> + GH_ERROR_ADDR_INVAL = 22,
> +
> + GH_ERROR_DENIED = 30,
> + GH_ERROR_BUSY = 31,
> + GH_ERROR_IDLE = 32,
> +
> + GH_ERROR_IRQ_BOUND = 40,
> + GH_ERROR_IRQ_UNBOUND = 41,
> +
> + GH_ERROR_CSPACE_CAP_NULL = 50,
> + GH_ERROR_CSPACE_CAP_REVOKED = 51,
> + GH_ERROR_CSPACE_WRONG_OBJ_TYPE = 52,
> + GH_ERROR_CSPACE_INSUF_RIGHTS = 53,
> + GH_ERROR_CSPACE_FULL = 54,
> +
> + GH_ERROR_MSGQUEUE_EMPTY = 60,
> + GH_ERROR_MSGQUEUE_FULL = 61,
> +};
> +
> +/**
> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux error code
> + * @gh_error: Gunyah hypercall return value
> + */
> +static inline int gh_remap_error(enum gh_error gh_error)
Since you're remapping a gh_error, I would have named this
gh_error_remap().
> +{
> + switch (gh_error) {
> + case GH_ERROR_OK:
> + return 0;
> + case GH_ERROR_NOMEM:
> + return -ENOMEM;
> + case GH_ERROR_DENIED:
> + case GH_ERROR_CSPACE_CAP_NULL:
> + case GH_ERROR_CSPACE_CAP_REVOKED:
> + case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
> + case GH_ERROR_CSPACE_INSUF_RIGHTS:
> + case GH_ERROR_CSPACE_FULL:
> + return -EACCES;
> + case GH_ERROR_BUSY:
> + case GH_ERROR_IDLE:
> + return -EBUSY;
> + case GH_ERROR_IRQ_BOUND:
> + case GH_ERROR_IRQ_UNBOUND:
> + case GH_ERROR_MSGQUEUE_FULL:
> + case GH_ERROR_MSGQUEUE_EMPTY:
> + return -EIO;
> + case GH_ERROR_UNIMPLEMENTED:
> + case GH_ERROR_RETRY:
> + return -EOPNOTSUPP;
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +#endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add hypercalls to identify when Linux is running a virtual machine under
> Gunyah.
>
> There are two calls to help identify Gunyah:
>
> 1. gh_hypercall_get_uid() returns a UID when running under a Gunyah
> hypervisor.
> 2. gh_hypercall_hyp_identify() returns build information and a set of
> feature flags that are supported by Gunyah.
>
> Signed-off-by: Elliot Berman <[email protected]>
Two very minor comments below. -Alex
> ---
> arch/arm64/Kbuild | 1 +
> arch/arm64/gunyah/Makefile | 3 ++
> arch/arm64/gunyah/gunyah_hypercall.c | 64 ++++++++++++++++++++++++++++
> drivers/virt/Kconfig | 2 +
> drivers/virt/gunyah/Kconfig | 13 ++++++
> include/linux/gunyah.h | 28 ++++++++++++
> 6 files changed, 111 insertions(+)
> create mode 100644 arch/arm64/gunyah/Makefile
> create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
> create mode 100644 drivers/virt/gunyah/Kconfig
>
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..e4847ba0e3c9 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
> obj-$(CONFIG_KVM) += kvm/
> obj-$(CONFIG_XEN) += xen/
> obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> +obj-$(CONFIG_GUNYAH) += gunyah/
> obj-$(CONFIG_CRYPTO) += crypto/
>
> # for cleaning
> diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
> new file mode 100644
> index 000000000000..84f1e38cafb1
> --- /dev/null
> +++ b/arch/arm64/gunyah/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> new file mode 100644
> index 000000000000..0d14e767e2c8
> --- /dev/null
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -0,0 +1,64 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/arm-smccc.h>
> +#include <linux/module.h>
> +#include <linux/gunyah.h>
> +#include <linux/uuid.h>
> +
> +static const uuid_t gh_known_uuids[] = {
> + /* Qualcomm's version of Gunyah {19bd54bd-0b37-571b-946f-609b54539de6} */
> + UUID_INIT(0x19bd54bd, 0x0b37, 0x571b, 0x94, 0x6f, 0x60, 0x9b, 0x54, 0x53, 0x9d, 0xe6),
> + /* Standard version of Gunyah {c1d58fcd-a453-5fdb-9265-ce36673d5f14} */
> + UUID_INIT(0xc1d58fcd, 0xa453, 0x5fdb, 0x92, 0x65, 0xce, 0x36, 0x67, 0x3d, 0x5f, 0x14),
> +};
> +
> +bool arch_is_gh_guest(void)
> +{
> + struct arm_smccc_res res;
> + uuid_t uuid;
> + int i;
> +
> + arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
> +
> + ((u32 *)&uuid.b[0])[0] = lower_32_bits(res.a0);
> + ((u32 *)&uuid.b[0])[1] = lower_32_bits(res.a1);
> + ((u32 *)&uuid.b[0])[2] = lower_32_bits(res.a2);
> + ((u32 *)&uuid.b[0])[3] = lower_32_bits(res.a3);
> +
> + for (i = 0; i < ARRAY_SIZE(gh_known_uuids); i++)
> + if (uuid_equal(&uuid, &gh_known_uuids[i]))
> + return true;
> +
> + return false;
> +}
> +EXPORT_SYMBOL_GPL(arch_is_gh_guest);
> +
> +#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
> + ARM_SMCCC_OWNER_VENDOR_HYP, \
> + fn)
> +
> +#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +
> +/**
> + * gh_hypercall_hyp_identify() - Returns build information and feature flags
> + * supported by Gunyah.
> + * @hyp_identity: filled by the hypercall with the API info and feature flags.
> + */
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
> +
> + hyp_identity->api_info = res.a0;
> + hyp_identity->flags[0] = res.a1;
> + hyp_identity->flags[1] = res.a2;
> + hyp_identity->flags[2] = res.a3;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..85bd6626ffc9 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>
> source "drivers/virt/coco/tdx-guest/Kconfig"
>
> +source "drivers/virt/gunyah/Kconfig"
> +
> endif
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> new file mode 100644
> index 000000000000..1a737694c333
> --- /dev/null
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -0,0 +1,13 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config GUNYAH
> + tristate "Gunyah Virtualization drivers"
> + depends on ARM64
> + depends on MAILBOX
> + help
> + The Gunyah drivers are the helper interfaces that run in a guest VM
> + such as basic inter-VM IPC and signaling mechanisms, and higher level
> + services such as memory/device sharing, IRQ sharing, and so on.
> +
> + Say Y/M here to enable the drivers needed to interact in a Gunyah
> + virtual environment.
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 54b4be71caf7..bd080e3a6fc9 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -6,8 +6,10 @@
> #ifndef _LINUX_GUNYAH_H
> #define _LINUX_GUNYAH_H
>
> +#include <linux/bitfield.h>
> #include <linux/errno.h>
> #include <linux/limits.h>
> +#include <linux/types.h>
>
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> @@ -80,4 +82,30 @@ static inline int gh_remap_error(enum gh_error gh_error)
> }
> }
>
> +enum gh_api_feature {
> + GH_FEATURE_DOORBELL = 1,
> + GH_FEATURE_MSGQUEUE = 2,
> + GH_FEATURE_VCPU = 5,
Fix alignment in the line above.
> + GH_FEATURE_MEMEXTENT = 6,
> +};
> +
> +bool arch_is_gh_guest(void);
> +
> +u16 gh_api_version(void);
> +bool gh_api_has_feature(enum gh_api_feature feature);
> +
> +#define GH_API_V1 1
> +
> +#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
> +#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
> +#define GH_API_INFO_IS_64BIT BIT_ULL(15)
Maybe a comment saying "bits 16-55 are reserved"?
> +#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
> +
> +struct gh_hypercall_hyp_identify_resp {
> + u64 api_info;
> + u64 flags[3];
> +};
> +
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
> +
> #endif
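For the api_info layout quoted above, a userspace sketch of what I mean by documenting the reserved bits follows. The GENMASK_U64() macro, GH_API_INFO_RESERVED_MASK and the helper functions are hypothetical and only mirror the kernel's GENMASK_ULL()/BIT_ULL() so that the example builds outside the kernel. The static assertion checks that the four documented fields plus the reserved range cover all 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace equivalent of GENMASK_ULL(h, l), for illustration only. */
#define GENMASK_U64(h, l) \
	(((~UINT64_C(0)) >> (63 - (h))) & ((~UINT64_C(0)) << (l)))

#define GH_API_INFO_API_VERSION_MASK	GENMASK_U64(13, 0)
#define GH_API_INFO_BIG_ENDIAN		(UINT64_C(1) << 14)
#define GH_API_INFO_IS_64BIT		(UINT64_C(1) << 15)
/* Bits 16-55 are reserved (hypothetical name, not in the patch) */
#define GH_API_INFO_RESERVED_MASK	GENMASK_U64(55, 16)
#define GH_API_INFO_VARIANT_MASK	GENMASK_U64(63, 56)

/* Every bit of api_info belongs to exactly one of the ranges above. */
static_assert((GH_API_INFO_API_VERSION_MASK | GH_API_INFO_BIG_ENDIAN |
	       GH_API_INFO_IS_64BIT | GH_API_INFO_RESERVED_MASK |
	       GH_API_INFO_VARIANT_MASK) == UINT64_MAX,
	      "api_info masks must cover all 64 bits");

static uint16_t toy_api_version(uint64_t api_info)
{
	return (uint16_t)(api_info & GH_API_INFO_API_VERSION_MASK);
}

static uint8_t toy_api_variant(uint64_t api_info)
{
	return (uint8_t)((api_info & GH_API_INFO_VARIANT_MASK) >> 56);
}

static int toy_api_is_64bit(uint64_t api_info)
{
	return (api_info & GH_API_INFO_IS_64BIT) != 0;
}

static int toy_api_is_big_endian(uint64_t api_info)
{
	return (api_info & GH_API_INFO_BIG_ENDIAN) != 0;
}
```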
On 3/3/23 7:06 PM, Elliot Berman wrote:
> The resource manager is a special virtual machine which is always
> running on a Gunyah system. It provides APIs for creating and destroying
> VMs, secure memory management, sharing/lending of memory between VMs,
> and setup of inter-VM communication. Calls to the resource manager are
> made via message queues.
>
> This patch implements the basic probing and RPC mechanism to make those
> API calls. Request/response calls can be made with gh_rm_call.
> Drivers can also register to notifications pushed by RM via
> gh_rm_register_notifier
>
> Specific API calls that resource manager supports will be implemented in
> subsequent patches.
Mostly very simple issues noted here. -Alex
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 3 +
> drivers/virt/gunyah/rsc_mgr.c | 688 +++++++++++++++++++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 16 +
> include/linux/gunyah_rsc_mgr.h | 21 +
> 4 files changed, 728 insertions(+)
> create mode 100644 drivers/virt/gunyah/rsc_mgr.c
> create mode 100644 drivers/virt/gunyah/rsc_mgr.h
> create mode 100644 include/linux/gunyah_rsc_mgr.h
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 34f32110faf9..cc864ff5abbb 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,3 +1,6 @@
> # SPDX-License-Identifier: GPL-2.0
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> +
> +gunyah_rsc_mgr-y += rsc_mgr.o
> +obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> new file mode 100644
> index 000000000000..67813c9a52db
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -0,0 +1,688 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
. . .
> +static void gh_rm_try_complete_connection(struct gh_rm *rm)
> +{
> + struct gh_rm_connection *connection = rm->active_rx_connection;
> +
> + if (!connection || connection->fragments_received != connection->num_fragments)
> + return;
> +
> + switch (connection->type) {
> + case RM_RPC_TYPE_REPLY:
> + complete(&connection->reply.seq_done);
> + break;
> + case RM_RPC_TYPE_NOTIF:
> + schedule_work(&connection->notification.work);
> + break;
> + default:
> + dev_err_ratelimited(rm->dev, "Invalid message type (%d) received\n",
s/%d/%u/
> + connection->type);
> + gh_rm_abort_connection(rm);
> + break;
> + }
> +
> + rm->active_rx_connection = NULL;
> +}
> +
> +static void gh_rm_msgq_rx_data(struct mbox_client *cl, void *mssg)
> +{
> + struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
> + struct gh_msgq_rx_data *rx_data = mssg;
> + size_t msg_size = rx_data->length;
> + void *msg = rx_data->data;
> + struct gh_rm_rpc_hdr *hdr;
> +
> + if (msg_size < sizeof(*hdr) || msg_size > GH_MSGQ_MAX_MSG_SIZE)
> + return;
> +
> + hdr = msg;
> + if (hdr->api != RM_RPC_API) {
> + dev_err(rm->dev, "Unknown RM RPC API version: %x\n", hdr->api);
> + return;
> + }
> +
> + switch (FIELD_GET(RM_RPC_TYPE_MASK, hdr->type)) {
> + case RM_RPC_TYPE_NOTIF:
> + gh_rm_process_notif(rm, msg, msg_size);
> + break;
> + case RM_RPC_TYPE_REPLY:
> + gh_rm_process_rply(rm, msg, msg_size);
> + break;
> + case RM_RPC_TYPE_CONTINUATION:
> + gh_rm_process_cont(rm, rm->active_rx_connection, msg, msg_size);
> + break;
> + default:
> + dev_err(rm->dev, "Invalid message type (%lu) received\n",
> + FIELD_GET(RM_RPC_TYPE_MASK, hdr->type));
> + return;
> + }
> +
> + gh_rm_try_complete_connection(rm);
> +}
> +
> +static void gh_rm_msgq_tx_done(struct mbox_client *cl, void *mssg, int r)
> +{
> + struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
> +
> + kmem_cache_free(rm->cache, mssg);
> + rm->last_tx_ret = r;
> +}
> +
> +static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
> + const void *req_buff, size_t req_buf_size,
> + struct gh_rm_connection *connection)
> +{
> + size_t buf_size_remaining = req_buf_size;
> + const void *req_buf_curr = req_buff;
> + struct gh_msgq_tx_data *msg;
> + struct gh_rm_rpc_hdr *hdr, hdr_template;
> + u32 cont_fragments = 0;
> + size_t payload_size;
> + void *payload;
> + int ret;
> +
> + if (req_buf_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
> + dev_warn(rm->dev, "Limit exceeded for the number of fragments: %u\n",
> + cont_fragments);
You are printing the value of cont_fragments here when it's just zero.
> + dump_stack();
> + return -E2BIG;
> + }
> +
Move the computation of cont_fragments prior to the block above.
You could use a ?: statement to assign it.
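Something along these lines, written here as a standalone userspace sketch rather than a patch. The two limit values are placeholders I picked for the example, not the driver's real definitions, and cont_fragments_for() and check_request_size() are hypothetical helper names. Because the fragment count is computed before the size check, a warning printed on the error path would show a meaningful number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration; the real limits live in the driver. */
#define GH_RM_MAX_MSG_SIZE	224u
#define GH_RM_MAX_NUM_FRAGMENTS	62u

/* Number of continuation fragments that follow the first request message. */
static uint32_t cont_fragments_for(size_t req_buf_size)
{
	return req_buf_size ?
		(uint32_t)((req_buf_size - 1) / GH_RM_MAX_MSG_SIZE) : 0;
}

/* Returns 0 if the request fits, -1 otherwise; *frags always holds the count. */
static int check_request_size(size_t req_buf_size, uint32_t *frags)
{
	assert(frags != NULL);

	*frags = cont_fragments_for(req_buf_size);
	if (req_buf_size > (size_t)GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE)
		return -1;

	return 0;
}
```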
> + if (req_buf_size)
> + cont_fragments = (req_buf_size - 1) / GH_RM_MAX_MSG_SIZE;
> +
> + hdr_template.api = RM_RPC_API;
> + hdr_template.type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_REQUEST) |
> + FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
The line above should be indented further.
> + hdr_template.seq = cpu_to_le16(connection->reply.seq);
> + hdr_template.msg_id = cpu_to_le32(message_id);
> +
> + ret = mutex_lock_interruptible(&rm->send_lock);
> + if (ret)
> + return ret;
> +
> + /* Consider also the 'request' packet for the loop count */
I don't think the comment above is helpful.
> + do {
> + msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
> + if (!msg) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + /* Fill header */
> + hdr = (struct gh_rm_rpc_hdr *)msg->data;
I personally would prefer &msg->data[0] in this case.
> + *hdr = hdr_template;
> +
> + /* Copy payload */
> + payload = hdr + 1;
I think I might have suggested using "hdr + 1" here.
Elsewhere you use something like:
payload = (char *)hdr + sizeof(*hdr);
or something similar. I suggest you choose one approach and use
it consistently throughout the driver. Either is fine, but I
have a slight preference for the "hdr + 1" way.
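To show that the two spellings really are interchangeable, here is a small standalone sketch. The struct is a stand-in I made up for illustration (it is not the driver's gh_rm_rpc_hdr), and the function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an RPC header; the field layout is illustrative only. */
struct example_hdr {
	uint8_t  api;
	uint8_t  type;
	uint16_t seq;
	uint32_t msg_id;
};

/* Typed pointer arithmetic: steps over exactly one header. */
static void *payload_via_increment(struct example_hdr *hdr)
{
	return hdr + 1;
}

/*
 * Byte arithmetic: the size must be that of the struct, sizeof(*hdr).
 * Writing sizeof(hdr) would give the size of the pointer instead.
 */
static void *payload_via_sizeof(struct example_hdr *hdr)
{
	return (char *)hdr + sizeof(*hdr);
}
```

Both return the address immediately after the header. The "hdr + 1" form has the small advantage that there is no sizeof operand to get wrong.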
> + payload_size = min(buf_size_remaining, GH_RM_MAX_MSG_SIZE);
> + memcpy(payload, req_buf_curr, payload_size);
> + req_buf_curr += payload_size;
> + buf_size_remaining -= payload_size;
> +
> + /* Force the last fragment to immediately alert the receiver */
> + msg->push = !buf_size_remaining;
> + msg->length = sizeof(*hdr) + payload_size;
> +
> + ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
> + if (ret < 0) {
> + kmem_cache_free(rm->cache, msg);
> + break;
> + }
> +
> + if (rm->last_tx_ret) {
> + ret = rm->last_tx_ret;
> + break;
> + }
> +
> + hdr_template.type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_CONTINUATION) |
> + FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
> + } while (buf_size_remaining);
> +
> +out:
> + mutex_unlock(&rm->send_lock);
> + return ret < 0 ? ret : 0;
> +}
> +
> +/**
> + * gh_rm_call: Achieve request-response type communication with RPC
> + * @rm: Pointer to Gunyah resource manager internal data
> + * @message_id: The RM RPC message-id
> + * @req_buff: Request buffer that contains the payload
> + * @req_buf_size: Total size of the payload
> + * @resp_buf: Pointer to a response buffer
> + * @resp_buf_size: Size of the response buffer
> + *
> + * Make a request to the RM-VM and wait for reply back. For a successful
I think you could just say "to the RM and wait"...
Overall I suggest using "RM" or "RM VM" consistently when you talk
about the Resource Manager. This is the only place I see "RM-VM".
> + * response, the function returns the payload. The size of the payload is set in
> + * resp_buf_size. The resp_buf should be freed by the caller when 0 is returned
s/should/must/
> + * and resp_buf_size != 0.
> + *
> + * req_buff should be not NULL for req_buf_size >0. If req_buf_size == 0,
> + * req_buff *can* be NULL and no additional payload is sent.
I'd say use "buf" or "buff" but not both in your naming
convention.
> + *
> + * Context: Process context. Will sleep waiting for reply.
> + * Return: 0 on success. <0 if error.
> + */
> +int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff, size_t req_buf_size,
> + void **resp_buf, size_t *resp_buf_size)
I suspect you could define the request buffer as a pointer to const;
can you?
> +{
> + struct gh_rm_connection *connection;
> + u32 seq_id;
> + int ret;
> +
> + /* message_id 0 is reserved. req_buf_size implies req_buf is not NULL */
> + if (!message_id || (!req_buff && req_buf_size) || !rm)
If you're going to check for a null RM pointer, I'd check it first.
> + return -EINVAL;
> +
> +
> + connection = kzalloc(sizeof(*connection), GFP_KERNEL);
> + if (!connection)
> + return -ENOMEM;
> +
> + connection->type = RM_RPC_TYPE_REPLY;
> + connection->msg_id = cpu_to_le32(message_id);
> +
> + init_completion(&connection->reply.seq_done);
. . .
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> new file mode 100644
> index 000000000000..deca9b3da541
> --- /dev/null
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GUNYAH_RSC_MGR_H
> +#define _GUNYAH_RSC_MGR_H
> +
> +#include <linux/list.h>
> +#include <linux/notifier.h>
> +#include <linux/gunyah.h>
> +
> +#define GH_VMID_INVAL U16_MAX
Add a tab before U16_MAX; it will line up more nicely
when you define GH_MEM_HANDLE_INVAL later.
> +
> +struct gh_rm;
> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
> +struct device *gh_rm_get(struct gh_rm *rm);
> +void gh_rm_put(struct gh_rm *rm);
> +
> +#endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>
> Signed-off-by: Elliot Berman <[email protected]>
Several comments, no major issues here. -Alex
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr_rpc.c | 260 ++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 73 +++++++++
> 3 files changed, 334 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index cc864ff5abbb..de29769f2f3f 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> new file mode 100644
> index 000000000000..ffcb861a31b5
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -0,0 +1,260 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +#include "rsc_mgr.h"
> +
> +/* Message IDs: VM Management */
> +#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
> +#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
> +#define GH_RM_RPC_VM_START 0x56000004
> +#define GH_RM_RPC_VM_STOP 0x56000005
> +#define GH_RM_RPC_VM_RESET 0x56000006
> +#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
> +#define GH_RM_RPC_VM_INIT 0x5600000B
> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
> +#define GH_RM_RPC_VM_GET_VMID 0x56000024
> +
> +struct gh_rm_vm_common_vmid_req {
> + __le16 vmid;
> + __le16 _padding;
> +} __packed;
> +
> +/* Call: VM_ALLOC */
> +struct gh_rm_vm_alloc_vmid_resp {
> + __le16 vmid;
> + __le16 _padding;
> +} __packed;
> +
> +/* Call: VM_STOP */
> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
> +
> +#define GH_RM_VM_STOP_REASON_FORCE_STOP 3
> +
> +struct gh_rm_vm_stop_req {
> + __le16 vmid;
> + u8 flags;
> + u8 _padding;
> + __le32 stop_reason;
> +} __packed;
> +
> +/* Call: VM_CONFIG_IMAGE */
> +struct gh_rm_vm_config_image_req {
> + __le16 vmid;
> + __le16 auth_mech;
> + __le32 mem_handle;
> + __le64 image_offset;
> + __le64 image_size;
> + __le64 dtb_offset;
> + __le64 dtb_size;
> +} __packed;
> +
> +/*
> + * Several RM calls take only a VMID as a parameter and give only standard
> + * response back. Deduplicate boilerplate code by using this common call.
> + */
> +static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
> +{
> + struct gh_rm_vm_common_vmid_req req_payload = {
> + .vmid = cpu_to_le16(vmid),
> + };
> +
> + return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), NULL, NULL);
> +}
> +
> +/**
> + * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: Use 0 to dynamically allocate a VM. A reserved VMID can be supplied
> + * to request allocation of a platform-defined VM.
> + *
> + * Returns - the allocated VMID or negative value on error
> + */
> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
> +{
> + struct gh_rm_vm_common_vmid_req req_payload = {
> + .vmid = vmid,
> + };
> + struct gh_rm_vm_alloc_vmid_resp *resp_payload;
> + size_t resp_size;
> + void *resp;
> + int ret;
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_VM_ALLOC_VMID, &req_payload, sizeof(req_payload), &resp,
> + &resp_size);
> + if (ret)
> + return ret;
> +
> + if (!vmid) {
> + resp_payload = resp;
> + ret = le16_to_cpu(resp_payload->vmid);
> + kfree(resp);
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * gh_rm_dealloc_vmid() - Dispose the VMID
s/the/of a/
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + */
> +int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid)
> +{
> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_DEALLOC_VMID, vmid);
> +}
> +
> +/**
> + * gh_rm_vm_reset() - Reset the VM's resources
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + *
> + * While tearing down the VM, request RM to clean up all the VM resources
s/While/As part of/
> + * associated with the VM. Only after this, Linux can clean up all the
> + * references it maintains to resources.
> + */
> +int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid)
> +{
> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_RESET, vmid);
> +}
> +
> +/**
> + * gh_rm_vm_start() - Move the VM into "ready to run" state
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + *
> + * On VMs which use proxy scheduling, vcpu_run is needed to actually run the VM.
> + * On VMs which use Gunyah's scheduling, the vCPUs start executing in accordance with Gunyah
> + * scheduling policies.
> + */
> +int gh_rm_vm_start(struct gh_rm *rm, u16 vmid)
> +{
> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_START, vmid);
> +}
> +
> +/**
> + * gh_rm_vm_stop() - Send a request to Resource Manager VM to forcibly stop a VM.
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + */
> +int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid)
> +{
> + struct gh_rm_vm_stop_req req_payload = {
> + .vmid = cpu_to_le16(vmid),
> + .flags = GH_RM_VM_STOP_FLAG_FORCE_STOP,
> + .stop_reason = cpu_to_le32(GH_RM_VM_STOP_REASON_FORCE_STOP),
> + };
> +
> + return gh_rm_call(rm, GH_RM_RPC_VM_STOP, &req_payload, sizeof(req_payload), NULL, NULL);
> +}
> +
> +/**
> + * gh_rm_vm_configure() - Prepare a VM to start and provide the common
> + * configuration needed by RM to configure a VM
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + * @auth_mechanism: Authentication mechanism used by resource manager to verify
> + * the virtual machine
> + * @mem_handle: Handle to a previously shared memparcel that contains all parts
> + * of the VM image subject to authentication.
> + * @image_offset: Start address of VM image, relative to the start of memparcel
> + * @image_size: Size of the VM image
> + * @dtb_offset: Start address of the devicetree binary with VM configuration,
> + * relative to start of memparcel.
> + * @dtb_size: Maximum size of devicetree binary. Resource manager applies
> + * an overlay to the DTB and dtb_size should include room for
> + * the overlay.
The above comment about including extra room doesn't sit well.
How much extra room is required? Is there any way you can
provide an estimate? Or better yet, is it possible to have
gh_rm_call() somehow calculate that extra amount and add it on?
> + */
> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
> + u32 mem_handle, u64 image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)
From what I can tell, the auth argument (and generally, ghvm->auth)
is never used. If that's the case, it might be nicer to explicitly
not include it for now, and only add it when it's going to be used
(and tested to work correctly).
I don't know if this is a reasonable strategy, but I'm always a
little skeptical about unused code like this.
> +{
> + struct gh_rm_vm_config_image_req req_payload = {
> + .vmid = cpu_to_le16(vmid),
> + .auth_mech = cpu_to_le16(auth_mechanism),
> + .mem_handle = cpu_to_le32(mem_handle),
> + .image_offset = cpu_to_le64(image_offset),
> + .image_size = cpu_to_le64(image_size),
> + .dtb_offset = cpu_to_le64(dtb_offset),
> + .dtb_size = cpu_to_le64(dtb_size),
> + };
> +
Are there any sanity checks that could be performed before we
actually make the call to the resource manager? Like, can
you ensure the DTB offset and size are in range?
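As an example of the kind of check I mean, and assuming the caller knows the size of the memparcel (the parcel_size parameter below is hypothetical; gh_rm_vm_configure() does not take it today), an overflow-safe range test could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if [offset, offset + size) lies entirely within a parcel of
 * parcel_size bytes. Written so that offset + size is never computed,
 * which avoids wrapping for large 64-bit values.
 */
static bool region_in_parcel(uint64_t offset, uint64_t size, uint64_t parcel_size)
{
	return size <= parcel_size && offset <= parcel_size - size;
}
```

The same helper would cover both the image region and the DTB region. Whether Linux or the resource manager is the right place for this check is a separate question.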
> + return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload, sizeof(req_payload),
> + NULL, NULL);
> +}
> +
> +/**
> + * gh_rm_vm_init() - Move the VM to initialized state.
s/the/a/
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier
> + *
> + * RM will allocate needed resources for the VM.
> + */
> +int gh_rm_vm_init(struct gh_rm *rm, u16 vmid)
> +{
> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_INIT, vmid);
> +}
> +
> +/**
> + * gh_rm_get_hyp_resources() - Retrieve hypervisor resources (capabilities) associated with a VM
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VMID of the other VM to get the resources of
> + * @resources: Set by gh_rm_get_hyp_resources and contains the returned hypervisor resources.
Caller must free the resources pointer returned if successful.
(Please mention this.)
> + */
> +int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> + struct gh_rm_hyp_resources **resources)
> +{
> + struct gh_rm_vm_common_vmid_req req_payload = {
> + .vmid = cpu_to_le16(vmid),
> + };
> + struct gh_rm_hyp_resources *resp;
> + size_t resp_size;
> + int ret;
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_HYP_RESOURCES,
> + &req_payload, sizeof(req_payload),
> + (void **)&resp, &resp_size);
> + if (ret)
> + return ret;
> +
> + if (!resp_size)
> + return -EBADMSG;
> +
> + if (resp_size < struct_size(resp, entries, 0) ||
> + resp_size != struct_size(resp, entries, le32_to_cpu(resp->n_entries))) {
> + kfree(resp);
> + return -EBADMSG;
> + }
> +
> + *resources = resp;
> + return 0;
> +}
> +
> +/**
> + * gh_rm_get_vmid() - Retrieve VMID of this virtual machine
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: Filled with the VMID of this VM
> + */
> +int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid)
> +{
> + static u16 cached_vmid = GH_VMID_INVAL;
> + size_t resp_size;
> + __le32 *resp;
> + int ret;
> +
> + if (cached_vmid != GH_VMID_INVAL) {
> + *vmid = cached_vmid;
> + return 0;
> + }
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_VMID, NULL, 0, (void **)&resp, &resp_size);
> + if (ret)
> + return ret;
> +
> + *vmid = cached_vmid = lower_16_bits(le32_to_cpu(*resp));
> + kfree(resp);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_get_vmid);
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index deca9b3da541..6a2f434e67f7 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -18,4 +18,77 @@ int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
> struct device *gh_rm_get(struct gh_rm *rm);
> void gh_rm_put(struct gh_rm *rm);
>
> +struct gh_rm_vm_exited_payload {
> + __le16 vmid;
> + __le16 exit_type;
> + __le32 exit_reason_size;
> + u8 exit_reason[];
> +} __packed;
> +
> +#define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
I think all these notification reasons should be defined in
an enumerated type, to group them, and name the group.
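Something along these lines. The enum name and the helper are only suggestions of mine; the two values are the ones defined in this patch:

```c
/* Notification IDs sent by the resource manager (values from this patch). */
enum gh_rm_notification_id {
	GH_RM_NOTIFICATION_VM_EXITED = 0x56100001,
	GH_RM_NOTIFICATION_VM_STATUS = 0x56100008,
};

/* Hypothetical helper: a switch over the enum documents the known IDs. */
static const char *gh_rm_notification_name(enum gh_rm_notification_id id)
{
	switch (id) {
	case GH_RM_NOTIFICATION_VM_EXITED:
		return "VM_EXITED";
	case GH_RM_NOTIFICATION_VM_STATUS:
		return "VM_STATUS";
	}
	return "unknown";
}
```

Grouping them this way also lets the compiler warn when a switch over the type misses a newly added notification.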
> +
> +enum gh_rm_vm_status {
> + GH_RM_VM_STATUS_NO_STATE = 0,
> + GH_RM_VM_STATUS_INIT = 1,
> + GH_RM_VM_STATUS_READY = 2,
> + GH_RM_VM_STATUS_RUNNING = 3,
> + GH_RM_VM_STATUS_PAUSED = 4,
> + GH_RM_VM_STATUS_LOAD = 5,
> + GH_RM_VM_STATUS_AUTH = 6,
> + GH_RM_VM_STATUS_INIT_FAILED = 8,
> + GH_RM_VM_STATUS_EXITED = 9,
> + GH_RM_VM_STATUS_RESETTING = 10,
> + GH_RM_VM_STATUS_RESET = 11,
> +};
> +
> +struct gh_rm_vm_status_payload {
> + __le16 vmid;
> + u16 reserved;
> + u8 vm_status;
> + u8 os_status;
> + __le16 app_status;
> +} __packed;
> +
> +#define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
> +
> +/* RPC Calls */
> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
> +int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
> +int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
> +int gh_rm_vm_start(struct gh_rm *rm, u16 vmid);
> +int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid);
> +
> +enum gh_rm_vm_auth_mechanism {
> + GH_RM_VM_AUTH_NONE = 0,
> + GH_RM_VM_AUTH_QCOM_PIL_ELF = 1,
> + GH_RM_VM_AUTH_QCOM_ANDROID_PVM = 2,
> +};
> +
> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
> + u32 mem_handle, u64 image_offset, u64 image_size,
> + u64 dtb_offset, u64 dtb_size);
> +int gh_rm_vm_init(struct gh_rm *rm, u16 vmid);
> +
> +struct gh_rm_hyp_resource {
> + u8 type;
Maybe add a comment on the above field, and others, such as:
u8 type; /* enum gh_resource_type */
> + u8 reserved;
> + __le16 partner_vmid;
> + __le32 resource_handle;
> + __le32 resource_label;
> + __le64 cap_id;
> + __le32 virq_handle;
> + __le32 virq;
> + __le64 base;
> + __le64 size;
> +} __packed;
> +
> +struct gh_rm_hyp_resources {
> + __le32 n_entries;
> + struct gh_rm_hyp_resource entries[];
> +} __packed;
> +
> +int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> + struct gh_rm_hyp_resources **resources);
> +int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
> +
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
I think it's good to reuse existing frameworks, for example, using
the mailbox abstraction to implement your messaging code. But I
find there are some minor mismatches between what you need and
the way the mailbox code works.
I'm not really suggesting you change anything, but I'll just say
it seemed like there were a few spots you needed to do things
that were slightly awkward in order to satisfy mailbox requirements.
I'll point out in a few comments what I mean below.
I'll take another look at it next time, but I assume this
works and I have no other new comments today.
-Alex
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> Documentation/virt/gunyah/message-queue.rst | 8 +
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/gunyah-msgq.c | 209 ++++++++++++++++++++
> include/linux/gunyah.h | 57 ++++++
> 4 files changed, 276 insertions(+)
> create mode 100644 drivers/mailbox/gunyah-msgq.c
>
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index b352918ae54b..70d82a4ef32d 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -61,3 +61,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
> | | | | | |
> | | | | | |
> +---------------+ +-----------------+ +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init()`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> + :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> +obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
> +
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
>
> obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..1989298653f9
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c
> @@ -0,0 +1,209 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>
> +
> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
> +
> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> + struct gh_msgq_rx_data rx_data;
> + enum gh_error gh_error;
> + bool ready = true;
> +
> + while (ready) {
> + gh_error = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
> + &rx_data.data, sizeof(rx_data.data),
> + &rx_data.length, &ready);
> + if (gh_error != GH_ERROR_OK) {
> + if (gh_error != GH_ERROR_MSGQUEUE_EMPTY)
> + dev_warn(msgq->mbox.dev, "Failed to receive data: %d\n", gh_error);
> + break;
> + }
> + mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
> + }
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
> +{
> + struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
> + struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
> + struct gh_msgq_tx_data *msgq_data = data;
> + u64 tx_flags = 0;
> + enum gh_error gh_error;
> + bool ready;
> +
> + if (msgq_data->push)
> + tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
> +
> + gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length, msgq_data->data,
> + tx_flags, &ready);
> +
> + /**
> + * unlikely because Linux tracks state of msgq and should not try to
> + * send message when msgq is full.
> + */
> + if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
> + return -EAGAIN;
> +
> + /**
> + * Propagate all other errors to client. If we return error to mailbox
> + * framework, then no other messages can be sent and nobody will know
> + * to retry this message.
If you weren't using the mailbox framework, would you be
sending the error to the client in this case? (I'm just
curious; it's good to document the behavior if you were
to return it to the mailbox framework.)
> + */
> + msgq->last_ret = gh_remap_error(gh_error);
> +
> + /**
> + * This message was successfully sent, but message queue isn't ready to
> + * accept more messages because it's now full. Mailbox framework
> + * requires that we only report that message was transmitted when
> + * we're ready to transmit another message. We'll get that in the form
> + * of tx IRQ once the other side starts to drain the msgq.
So you are forced to delay reporting the completion
here because you're using the mailbox framework.
> + */
> + if (gh_error == GH_ERROR_OK) {
> + if (!ready)
> + return 0;
> + } else
> + dev_err(msgq->mbox.dev, "Failed to send data: %d (%d)\n", gh_error, msgq->last_ret);
> +
> + /**
> + * We can send more messages. Mailbox framework requires that tx done
> + * happens asynchronously to sending the message. Gunyah message queues
> + * tell us right away on the hypercall return whether we can send more
> + * messages. To work around this, defer the txdone to a tasklet.
> + */
If you weren't using the mailbox framework, you'd send the next
message directly rather than scheduling this tasklet to do it.
> + tasklet_schedule(&msgq->txdone_tasklet);
> +
> + return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> + .send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc must be not NULL. Most message queue use cases come with
> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client must register an .rx_callback() and the message queue driver will
> + * deliver all available messages upon receiving the RX ready interrupt. The messages should be
> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc)
> +{
> + int ret;
> +
> + /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> + if ((!tx_ghrsc && !rx_ghrsc) ||
> + (tx_ghrsc && tx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_TX) ||
> + (rx_ghrsc && rx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_RX))
> + return -EINVAL;
> +
> + if (!gh_api_has_feature(GH_FEATURE_MSGQUEUE))
> + return -EOPNOTSUPP;
> +
> + msgq->tx_ghrsc = tx_ghrsc;
> + msgq->rx_ghrsc = rx_ghrsc;
> +
> + msgq->mbox.dev = parent;
> + msgq->mbox.ops = &gh_msgq_ops;
> + msgq->mbox.num_chans = 1;
> + msgq->mbox.txdone_irq = true;
> + msgq->mbox.chans = &msgq->mbox_chan;
> +
> + if (msgq->tx_ghrsc) {
> + ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> + msgq);
> + if (ret)
> + goto err_chans;
> + }
> +
> + if (msgq->rx_ghrsc) {
> + ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> + IRQF_ONESHOT, "gh_msgq_rx", msgq);
> + if (ret)
> + goto err_tx_irq;
> + }
> +
> + tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> + ret = mbox_controller_register(&msgq->mbox);
> + if (ret)
> + goto err_rx_irq;
> +
> + ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> + if (ret)
> + goto err_mbox;
> +
> + return 0;
> +err_mbox:
> + mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> + kfree(msgq->mbox.chans);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{
> + tasklet_kill(&msgq->txdone_tasklet);
> + mbox_controller_unregister(&msgq->mbox);
> +
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> + kfree(msgq->mbox.chans);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 18cfbf5ee48b..378bec0f2ce1 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -8,11 +8,68 @@
>
> #include <linux/bitfield.h>
> #include <linux/errno.h>
> +#include <linux/interrupt.h>
> #include <linux/limits.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox_client.h>
> #include <linux/types.h>
>
> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
I'm not sure what you mean by "Follows" here. You mean that these are
the gh_rm_hyp_resource type values that GET_HYP_RESOURCES can return?
Note that gh_resource_type values must fit in an 8 bit field.
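One way to make the 8-bit constraint explicit is a compile-time check. This is a userspace sketch: the GH_RESOURCE_TYPE_MAX sentinel and the validity helper are hypothetical additions of mine, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum gh_resource_type {
	GH_RESOURCE_TYPE_BELL_TX = 0,
	GH_RESOURCE_TYPE_BELL_RX = 1,
	GH_RESOURCE_TYPE_MSGQ_TX = 2,
	GH_RESOURCE_TYPE_MSGQ_RX = 3,
	GH_RESOURCE_TYPE_VCPU    = 4,
	GH_RESOURCE_TYPE_MAX,	/* hypothetical sentinel, one past the last type */
};

/* Fails the build if a future type no longer fits the u8 "type" field. */
static_assert(GH_RESOURCE_TYPE_MAX - 1 <= UINT8_MAX,
	      "gh_resource_type must fit in an 8-bit field");

/* Check a raw value from the wire before treating it as a known type. */
static bool gh_resource_type_valid(uint8_t raw)
{
	return raw < GH_RESOURCE_TYPE_MAX;
}
```

In the kernel the equivalent would be static_assert() from linux/build_bug.h rather than assert.h.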
> +enum gh_resource_type {
> + GH_RESOURCE_TYPE_BELL_TX = 0,
> + GH_RESOURCE_TYPE_BELL_RX = 1,
> + GH_RESOURCE_TYPE_MSGQ_TX = 2,
> + GH_RESOURCE_TYPE_MSGQ_RX = 3,
Fix alignment below.
> + GH_RESOURCE_TYPE_VCPU = 4,
> +};
> +
> +struct gh_resource {
> + enum gh_resource_type type;
> + u64 capid;
> + unsigned int irq;
> +};
> +
> +/**
> + * Gunyah Message Queues
> + */
> +
> +#define GH_MSGQ_MAX_MSG_SIZE 240
Maybe insert another tab before the 240. You later define
GH_BELL_NONBLOCK that far out, and aligning them will look
better.
> +
> +struct gh_msgq_tx_data {
> + size_t length;
> + bool push;
> + char data[];
> +};
> +
> +struct gh_msgq_rx_data {
> + size_t length;
> + char data[GH_MSGQ_MAX_MSG_SIZE];
> +};
> +
> +struct gh_msgq {
> + struct gh_resource *tx_ghrsc;
> + struct gh_resource *rx_ghrsc;
> +
> + /* msgq private */
> + int last_ret; /* Linux error, not GH_STATUS_* */
> + struct mbox_chan mbox_chan;
> + struct mbox_controller mbox;
> + struct tasklet_struct txdone_tasklet;
> +};
> +
> +
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc);
> +void gh_msgq_remove(struct gh_msgq *msgq);
> +
> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
> +{
> + return &msgq->mbox.chans[0];
> +}
> +
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> +
> #define GH_CAPID_INVAL U64_MAX
> #define GH_VMID_ROOT_VM 0xff
>
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Gunyah resource manager provides API to manipulate stage 2 page tables.
> Manipulations are represented as a memory parcel. Memory parcels
> describe a list of memory regions (intermediate physical address and
> size), a list of new permissions for VMs, and the memory type (DDR or
> MMIO). Memory parcels are uniquely identified by a handle allocated by
> Gunyah. There are a few types of memory parcel sharing which Gunyah
> supports:
>
> - Sharing: the guest and host VM both have access
> - Lending: only the guest has access; host VM loses access
> - Donating: Permanently lent (not reclaimed even if guest shuts down)
>
> Memory parcels that have been shared or lent can be reclaimed by the
> host via an additional call. The reclaim operation restores the original
> access the host VM had to the memory parcel and removes the access to
> other VM.
>
> One point to note that memory parcels don't describe where in the guest
> VM the memory parcel should reside. The guest VM must accept the memory
> parcel either explicitly via a "gh_rm_mem_accept" call (not introduced
> here) or be configured to accept it automatically at boot. As the guest
> VM accepts the memory parcel, it also mentions the IPA it wants to place
> memory parcel.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
The comments here aren't anything major, just suggestions. -Alex
> ---
> drivers/virt/gunyah/rsc_mgr_rpc.c | 223 ++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 48 +++++++
> 2 files changed, 271 insertions(+)
>
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index ffcb861a31b5..3df15ad5b97d 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -6,6 +6,12 @@
> #include <linux/gunyah_rsc_mgr.h>
> #include "rsc_mgr.h"
>
> +/* Message IDs: Memory Management */
> +#define GH_RM_RPC_MEM_LEND 0x51000012
> +#define GH_RM_RPC_MEM_SHARE 0x51000013
> +#define GH_RM_RPC_MEM_RECLAIM 0x51000015
> +#define GH_RM_RPC_MEM_APPEND 0x51000018
> +
> /* Message IDs: VM Management */
> #define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
> #define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
> @@ -22,6 +28,46 @@ struct gh_rm_vm_common_vmid_req {
> __le16 _padding;
> } __packed;
>
> +/* Call: MEM_LEND, MEM_SHARE */
> +#define GH_MEM_SHARE_REQ_FLAGS_APPEND BIT(1)
> +
> +struct gh_rm_mem_share_req_header {
> + u8 mem_type;
> + u8 _padding0;
> + u8 flags;
> + u8 _padding1;
> + __le32 label;
> +} __packed;
> +
> +struct gh_rm_mem_share_req_acl_section {
> + __le32 n_entries;
> + struct gh_rm_mem_acl_entry entries[];
> +};
> +
> +struct gh_rm_mem_share_req_mem_section {
> + __le16 n_entries;
> + __le16 _padding;
> + struct gh_rm_mem_entry entries[];
> +};
> +
> +/* Call: MEM_RELEASE */
> +struct gh_rm_mem_release_req {
> + __le32 mem_handle;
> + u8 flags; /* currently not used */
> + u8 _padding0;
> + __le16 _padding1;
> +} __packed;
> +
> +/* Call: MEM_APPEND */
> +#define GH_MEM_APPEND_REQ_FLAGS_END BIT(0)
Insert a tab before BIT(0) so it lines up with the value assigned
to GH_MEM_SHARE_REQ_FLAGS_APPEND, above. The same comment
applies to GH_RM_VM_STOP_FLAG_FORCE_STOP (and so on).
> +
> +struct gh_rm_mem_append_req_header {
> + __le32 mem_handle;
> + u8 flags;
> + u8 _padding0;
> + __le16 _padding1;
> +} __packed;
> +
> /* Call: VM_ALLOC */
> struct gh_rm_vm_alloc_vmid_resp {
> __le16 vmid;
> @@ -51,6 +97,8 @@ struct gh_rm_vm_config_image_req {
> __le64 dtb_size;
> } __packed;
>
> +#define GH_RM_MAX_MEM_ENTRIES 512
> +
> /*
> * Several RM calls take only a VMID as a parameter and give only standard
> * response back. Deduplicate boilerplate code by using this common call.
> @@ -64,6 +112,181 @@ static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
> return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), NULL, NULL);
> }
>
> +static int _gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle, bool end_append,
> + struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
> +{
> + struct gh_rm_mem_share_req_mem_section *mem_section;
> + struct gh_rm_mem_append_req_header *req_header;
> + size_t msg_size = 0;
> + void *msg;
> + int ret;
> +
> + msg_size += sizeof(struct gh_rm_mem_append_req_header);
> + msg_size += struct_size(mem_section, entries, n_mem_entries);
> +
> + msg = kzalloc(msg_size, GFP_KERNEL);
> + if (!msg)
> + return -ENOMEM;
> +
> + req_header = msg;
> + mem_section = (void *)req_header + sizeof(struct gh_rm_mem_append_req_header);
> +
> + req_header->mem_handle = cpu_to_le32(mem_handle);
> + if (end_append)
> + req_header->flags |= GH_MEM_APPEND_REQ_FLAGS_END;
> +
> + mem_section->n_entries = cpu_to_le16(n_mem_entries);
> + memcpy(mem_section->entries, mem_entries, sizeof(*mem_entries) * n_mem_entries);
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_MEM_APPEND, msg, msg_size, NULL, NULL);
> + kfree(msg);
> +
> + return ret;
> +}
> +
> +static int gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle,
> + struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
> +{
> + bool end_append;
> + int ret = 0;
> + size_t n;
> +
> + while (n_mem_entries) {
> + if (n_mem_entries > GH_RM_MAX_MEM_ENTRIES) {
> + end_append = false;
> + n = GH_RM_MAX_MEM_ENTRIES;
> + } else {
> + end_append = true;
> + n = n_mem_entries;
> + }
> +
> + ret = _gh_rm_mem_append(rm, mem_handle, end_append, mem_entries, n);
> + if (ret)
> + break;
> +
> + mem_entries += n;
> + n_mem_entries -= n;
> + }
> +
> + return ret;
> +}
> +
> +static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_mem_parcel *p)
> +{
> + size_t msg_size = 0, initial_mem_entries = p->n_mem_entries, resp_size;
> + struct gh_rm_mem_share_req_acl_section *acl_section;
> + struct gh_rm_mem_share_req_mem_section *mem_section;
> + struct gh_rm_mem_share_req_header *req_header;
> + u32 *attr_section;
> + __le32 *resp;
> + void *msg;
> + int ret;
> +
> + if (!p->acl_entries || !p->n_acl_entries || !p->mem_entries || !p->n_mem_entries ||
> + p->n_acl_entries > U8_MAX || p->mem_handle != GH_MEM_HANDLE_INVAL)
> + return -EINVAL;
> +
> + if (initial_mem_entries > GH_RM_MAX_MEM_ENTRIES)
> + initial_mem_entries = GH_RM_MAX_MEM_ENTRIES;
> +
> + /* The format of the message goes:
> + * request header
> + * ACL entries (which VMs get what kind of access to this memory parcel)
> + * Memory entries (list of memory regions to share)
> + * Memory attributes (currently unused, we'll hard-code the size to 0)
> + */
> + msg_size += sizeof(struct gh_rm_mem_share_req_header);
> + msg_size += struct_size(acl_section, entries, p->n_acl_entries);
> + msg_size += struct_size(mem_section, entries, initial_mem_entries);
Perhaps you can compute and cache these sizes, and use them
both here and below when computing the addresses of the
sections within the message.
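Roughly what I have in mind is below. This is only a userspace sketch:
the struct layouts are copied from your patch, but compute_lend_layout()
and struct lend_layout are names I made up, and it assumes GCC or Clang
for the packed attribute. In the kernel you would keep using
struct_size(), which also saturates on overflow; the plain arithmetic
here does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACKED __attribute__((packed))

/* Stand-ins for the wire structures in the patch. */
struct req_header {
	uint8_t mem_type;
	uint8_t _padding0;
	uint8_t flags;
	uint8_t _padding1;
	uint32_t label;
} PACKED;

struct acl_entry {
	uint16_t vmid;
	uint8_t perms;
	uint8_t reserved;
} PACKED;

struct mem_entry {
	uint64_t ipa_base;
	uint64_t size;
} PACKED;

struct acl_section {
	uint32_t n_entries;
	struct acl_entry entries[];
};

struct mem_section {
	uint16_t n_entries;
	uint16_t _padding;
	struct mem_entry entries[];
};

/* Hypothetical: byte offsets of each section plus the total message size. */
struct lend_layout {
	size_t acl_offset;
	size_t mem_offset;
	size_t attr_offset;
	size_t total_size;
};

/* Compute each section size once; offsets and total both derive from them. */
static struct lend_layout compute_lend_layout(size_t n_acl, size_t n_mem)
{
	size_t acl_size = sizeof(struct acl_section) + n_acl * sizeof(struct acl_entry);
	size_t mem_size = sizeof(struct mem_section) + n_mem * sizeof(struct mem_entry);
	struct lend_layout l;

	assert(n_acl <= UINT8_MAX);	/* mirrors the U8_MAX check in the patch */

	l.acl_offset = sizeof(struct req_header);
	l.mem_offset = l.acl_offset + acl_size;
	l.attr_offset = l.mem_offset + mem_size;
	l.total_size = l.attr_offset + sizeof(uint32_t);	/* attribute count word */
	return l;
}
```

In gh_rm_mem_lend_common() the three offsets would be added to msg to
get acl_section, mem_section and attr_section, and total_size would be
what you pass to kzalloc() and gh_rm_call().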
> + msg_size += sizeof(u32); /* for memory attributes, currently unused */
> +
> + msg = kzalloc(msg_size, GFP_KERNEL);
> + if (!msg)
> + return -ENOMEM;
> +
> + req_header = msg;
> + acl_section = (void *)req_header + sizeof(*req_header);
> + mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
> + attr_section = (void *)mem_section + struct_size(mem_section, entries, initial_mem_entries);
> +
> + req_header->mem_type = p->mem_type;
> + if (initial_mem_entries != p->n_mem_entries)
> + req_header->flags |= GH_MEM_SHARE_REQ_FLAGS_APPEND;
> + req_header->label = cpu_to_le32(p->label);
> +
> + acl_section->n_entries = cpu_to_le32(p->n_acl_entries);
> + memcpy(acl_section->entries, p->acl_entries, sizeof(*(p->acl_entries)) * p->n_acl_entries);
Should you use struct_size(), or maybe flex_array_size() in the
line above?
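To illustrate, here is a userspace sketch. FLEX_ARRAY_SIZE() is my
stand-in for the kernel's flex_array_size() (the real macro in
<linux/overflow.h> also saturates on overflow), fill_acl_section() is a
made-up helper, and the byte-order conversion is left out. The point is
only that the copy length is derived from the destination's flexible
array rather than from the source pointer's type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for flex_array_size(): count * sizeof(one array element). */
#define FLEX_ARRAY_SIZE(p, member, count) \
	((size_t)(count) * sizeof(*(p)->member))

struct acl_entry {
	uint16_t vmid;
	uint8_t perms;
	uint8_t reserved;
} __attribute__((packed));

struct acl_section {
	uint32_t n_entries;
	struct acl_entry entries[];
};

/* Hypothetical helper: fill an ACL section and return the bytes copied. */
static size_t fill_acl_section(struct acl_section *dst,
			       const struct acl_entry *src, size_t n)
{
	size_t bytes = FLEX_ARRAY_SIZE(dst, entries, n);

	assert(dst && src);
	dst->n_entries = (uint32_t)n;	/* cpu_to_le32() in the kernel */
	memcpy(dst->entries, src, bytes);
	return bytes;
}
```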
> +
> + mem_section->n_entries = cpu_to_le16(initial_mem_entries);
> + memcpy(mem_section->entries, p->mem_entries,
> + sizeof(*(p->mem_entries)) * initial_mem_entries);
Here too.
> +
> + /* Set n_entries for memory attribute section to 0 */
> + *attr_section = 0;
> +
> + ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
> + kfree(msg);
> +
> + if (ret)
> + return ret;
> +
> + p->mem_handle = le32_to_cpu(*resp);
> +
> + if (initial_mem_entries != p->n_mem_entries) {
> + ret = gh_rm_mem_append(rm, p->mem_handle,
> + &p->mem_entries[initial_mem_entries],
> + p->n_mem_entries - initial_mem_entries);
> + if (ret) {
> + gh_rm_mem_reclaim(rm, p);
> + p->mem_handle = GH_MEM_HANDLE_INVAL;
> + }
> + }
> +
> + kfree(resp);
> + return ret;
> +}
> +
> +/**
> + * gh_rm_mem_lend() - Lend memory to other virtual machines.
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Package the memory information of the memory to be lent.
Again, the "package" here doesn't clarify things for me.
Maybe just "Information about the memory to be lent"?
> + *
> + * Lending removes Linux's access to the memory while the memory parcel is lent.
> + */
> +int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> + return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_LEND, parcel);
> +}
> +
> +
> +/**
> + * gh_rm_mem_share() - Share memory with other virtual machines.
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Package the memory information of the memory to be shared.
> + *
> + * Sharing keeps Linux's access to the memory while the memory parcel is shared.
> + */
> +int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> + return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_SHARE, parcel);
> +}
> +
> +/**
> + * gh_rm_mem_reclaim() - Reclaim a memory parcel
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Package the memory information of the memory to be reclaimed.
> + *
> + * RM maps the associated memory back into the stage-2 page tables of the owner VM.
> + */
> +int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> + struct gh_rm_mem_release_req req = {
> + .mem_handle = cpu_to_le32(parcel->mem_handle),
> + };
> +
> + return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
> +}
> +
> /**
> * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
> * @rm: Handle to a Gunyah resource manager
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 6a2f434e67f7..88a429dad09e 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -11,6 +11,7 @@
> #include <linux/gunyah.h>
>
> #define GH_VMID_INVAL U16_MAX
> +#define GH_MEM_HANDLE_INVAL U32_MAX
>
> struct gh_rm;
> int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
> @@ -51,7 +52,54 @@ struct gh_rm_vm_status_payload {
>
> #define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
>
> +#define GH_RM_ACL_X BIT(0)
> +#define GH_RM_ACL_W BIT(1)
> +#define GH_RM_ACL_R BIT(2)
> +
> +struct gh_rm_mem_acl_entry {
> + __le16 vmid;
> + u8 perms;
> + u8 reserved;
> +} __packed;
> +
> +struct gh_rm_mem_entry {
> + __le64 ipa_base;
Does "ipa" represent "intermediate physical address"? Please at
least explain that, and preferably, rename the field so it's a
little more explicit (maybe "intermediate_addr"?).
> + __le64 size;
> +} __packed;
> +
> +enum gh_rm_mem_type {
> + GH_RM_MEM_TYPE_NORMAL = 0,
> + GH_RM_MEM_TYPE_IO = 1,
> +};
> +
> +/*
> + * struct gh_rm_mem_parcel - Package info about memory to be lent/shared/donated/reclaimed
The term "Package info", when describing a type you have named "parcel",
is a little confusing. I think you could safely drop "Package" (here
and elsewhere).
My understanding is that a parcel is a Gunyah representation of a set
of memory regions that have been passed (lent or shared) to one or
more other VMs, defining access permissions for each VM to all of the
regions in the parcel.
> + * @mem_type: The type of memory: normal (DDR) or IO
> + * @label: An client-specified identifier which can be used by the other VMs to identify the purpose
s/An/A/
> + * of the memory parcel.
> + * @acl_entries: An array of access control entries. Each entry specifies a VM and what access
> + * is allowed for the memory parcel.
> + * @n_acl_entries: Count of the number of entries in the `acl_entries` array.
When you refer to a symbol in kernel-doc, you can use @acl_entries
rather than something like `acl_entries`.
> + * @mem_entries: An list of regions to be associated with the memory parcel. Addresses should be
> + * (intermediate) physical addresses from Linux's perspective.
> + * @n_mem_entries: Count of the number of entries in the `mem_entries` array.
I don't know if this is required, but I suggest you list the
descriptions here in the same order as the fields are defined
in the structure below.
> + * @mem_handle: On success, filled with memory handle that RM allocates for this memory parcel
> + */
> +struct gh_rm_mem_parcel {
> + enum gh_rm_mem_type mem_type;
> + u32 label;
> + size_t n_acl_entries;
> + struct gh_rm_mem_acl_entry *acl_entries;
> + size_t n_mem_entries;
> + struct gh_rm_mem_entry *mem_entries;
> + u32 mem_handle;
> +};
> +
> /* RPC Calls */
> +int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +
> int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
> int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
> int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Gunyah VM manager is a kernel module which exposes an interface to
> Gunyah userspace to load, run, and interact with other Gunyah virtual
> machines. The interface is a character device at /dev/gunyah.
>
> Add a basic VM manager driver. Upcoming patches will add more ioctls
> into this driver.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
One suggestion to move some code here. And a few other minor
things.
-Alex
> ---
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr.c | 38 +++++-
> drivers/virt/gunyah/vm_mgr.c | 116 ++++++++++++++++++
> drivers/virt/gunyah/vm_mgr.h | 23 ++++
> include/uapi/linux/gunyah.h | 23 ++++
> 6 files changed, 201 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr.c
> create mode 100644 drivers/virt/gunyah/vm_mgr.h
> create mode 100644 include/uapi/linux/gunyah.h
>
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 0a1882e296ae..2513324ae7be 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -137,6 +137,7 @@ Code Seq# Include File Comments
> 'F' DD video/sstfb.h conflict!
> 'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict!
> 'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict!
> +'G' 00-0f linux/gunyah.h conflict!
> 'H' 00-7F linux/hiddev.h conflict!
> 'H' 00-0F linux/hidraw.h conflict!
> 'H' 01 linux/mei.h conflict!
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index de29769f2f3f..03951cf82023 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 67813c9a52db..d7ce692d0067 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -15,8 +15,10 @@
> #include <linux/completion.h>
> #include <linux/gunyah_rsc_mgr.h>
> #include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
>
> #include "rsc_mgr.h"
> +#include "vm_mgr.h"
>
> #define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
> #define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
> @@ -129,6 +131,7 @@ struct gh_rm_connection {
> * @cache: cache for allocating Tx messages
> * @send_lock: synchronization to allow only one request to be sent at a time
> * @nh: notifier chain for clients interested in RM notification messages
> + * @miscdev: /dev/gunyah
> */
> struct gh_rm {
> struct device *dev;
> @@ -145,6 +148,8 @@ struct gh_rm {
> struct kmem_cache *cache;
> struct mutex send_lock;
> struct blocking_notifier_head nh;
> +
> + struct miscdevice miscdev;
> };
>
> /**
> @@ -593,6 +598,21 @@ void gh_rm_put(struct gh_rm *rm)
> }
> EXPORT_SYMBOL_GPL(gh_rm_put);
>
I feel like the /dev/gunyah code would more appropriately live
in "vm_mgr.c". All gh_dev_ioctl() does is call the function
defined there, so it's a VM-oriented rather than
resource-oriented device.
> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct miscdevice *miscdev = filp->private_data;
> + struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
> +
> + return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
> +}
> +
> +static const struct file_operations gh_dev_fops = {
> + .owner = THIS_MODULE,
> + .unlocked_ioctl = gh_dev_ioctl,
> + .compat_ioctl = compat_ptr_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool tx,
> struct gh_resource *ghrsc)
> {
> @@ -651,7 +671,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
> rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
> rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>
> - return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> + ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + rm->miscdev.name = "gunyah";
> + rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> + rm->miscdev.fops = &gh_dev_fops;
> +
> + ret = misc_register(&rm->miscdev);
> + if (ret)
> + goto err_msgq;
> +
> + return 0;
> +err_msgq:
> + mbox_free_channel(gh_msgq_chan(&rm->msgq));
> + gh_msgq_remove(&rm->msgq);
> err_cache:
> kmem_cache_destroy(rm->cache);
> return ret;
> @@ -661,6 +696,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
> {
> struct gh_rm *rm = platform_get_drvdata(pdev);
>
> + misc_deregister(&rm->miscdev);
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> kmem_cache_destroy(rm->cache);
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> new file mode 100644
> index 000000000000..dbacf36af72d
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -0,0 +1,116 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static void gh_vm_free(struct work_struct *work)
> +{
> + struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + int ret;
> +
> + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> + if (ret)
> + pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> + put_gh_rm(ghvm->rm);
> + kfree(ghvm);
> +}
> +
> +static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> +{
> + struct gh_vm *ghvm;
> + int vmid;
> +
> + vmid = gh_rm_alloc_vmid(rm, 0);
> + if (vmid < 0)
> + return ERR_PTR(vmid);
> +
> + ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
> + if (!ghvm) {
> + gh_rm_dealloc_vmid(rm, vmid);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + ghvm->parent = gh_rm_get(rm);
> + ghvm->vmid = vmid;
> + ghvm->rm = rm;
> +
> + INIT_WORK(&ghvm->free_work, gh_vm_free);
> +
> + return ghvm;
> +}
> +
> +static int gh_vm_release(struct inode *inode, struct file *filp)
> +{
> + struct gh_vm *ghvm = filp->private_data;
> +
> + /* VM will be reset and make RM calls which can interruptible sleep.
> + * Defer to a work so this thread can receive signal.
> + */
> + schedule_work(&ghvm->free_work);
> + return 0;
> +}
> +
> +static const struct file_operations gh_vm_fops = {
> + .release = gh_vm_release,
> + .llseek = noop_llseek,
> +};
> +
> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
> +{
> + struct gh_vm *ghvm;
> + struct file *file;
> + int fd, err;
> +
> + /* arg reserved for future use. */
Do you have a clear idea of how this might be used in the future?
I was thinking you could silently ignore the argument value, but
I suppose if it *does* get used in the future, you want the caller
to know it's being ignored. (Is that right?)
> + if (arg)
> + return -EINVAL;
> +
> + ghvm = gh_vm_alloc(rm);
> + if (IS_ERR(ghvm))
> + return PTR_ERR(ghvm);
> +
> + fd = get_unused_fd_flags(O_CLOEXEC);
> + if (fd < 0) {
> + err = fd;
> + goto err_destroy_vm;
> + }
> +
> + file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> + if (IS_ERR(file)) {
> + err = PTR_ERR(file);
> + goto err_put_fd;
> + }
> +
> + fd_install(fd, file);
> +
> + return fd;
> +
> +err_put_fd:
> + put_unused_fd(fd);
> +err_destroy_vm:
> + gh_vm_free(&ghvm->free_work);
> + return err;
> +}
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
> +{
> + switch (cmd) {
> + case GH_CREATE_VM:
> + return gh_dev_ioctl_create_vm(rm, arg);
> + default:
> + return -ENOIOCTLCMD;
> + }
> +}
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> new file mode 100644
> index 000000000000..4b22fbcac91c
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GH_PRIV_VM_MGR_H
> +#define _GH_PRIV_VM_MGR_H
Maybe _GH_VM_MGR_H?
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
> +
> +struct gh_vm {
> + u16 vmid;
> + struct gh_rm *rm;
> + struct device *parent;
> +
> + struct work_struct free_work;
> +};
> +
> +#endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> new file mode 100644
> index 000000000000..10ba32d2b0a6
> --- /dev/null
> +++ b/include/uapi/linux/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_GUNYAH
> +#define _UAPI_LINUX_GUNYAH
Use _UAPI_LINUX_GUNYAH_H
> +
> +/*
> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
> + */
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define GH_IOCTL_TYPE 'G'
> +
> +/*
> + * ioctls for /dev/gunyah fds:
> + */
> +#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
> +
> +#endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add remaining ioctls to support non-proxy VM boot:
>
> - Gunyah Resource Manager uses the VM's devicetree to configure the
> virtual machine. The location of the devicetree in the guest's
> virtual memory can be declared via the SET_DTB_CONFIG ioctl.
> - Trigger start of the virtual machine with VM_START ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
I identify one bug here, possibly another. And I have a few
suggestions about things that could improve code readability.
-Alex
> ---
> drivers/virt/gunyah/vm_mgr.c | 243 ++++++++++++++++++++++++++++++--
> drivers/virt/gunyah/vm_mgr.h | 10 ++
> drivers/virt/gunyah/vm_mgr_mm.c | 23 +++
> include/linux/gunyah_rsc_mgr.h | 6 +
> include/uapi/linux/gunyah.h | 13 ++
> 5 files changed, 282 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index e950274c6a53..299b9bb81edc 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -9,37 +9,118 @@
> #include <linux/file.h>
> #include <linux/gunyah_rsc_mgr.h>
> #include <linux/miscdevice.h>
> +#include <linux/mm.h>
> #include <linux/module.h>
>
> #include <uapi/linux/gunyah.h>
>
> #include "vm_mgr.h"
>
> +static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
> +{
> + struct gh_rm_vm_status_payload *payload = data;
> +
> + if (payload->vmid != ghvm->vmid)
> + return NOTIFY_OK;
> +
> + /* All other state transitions are synchronous to a corresponding RM call */
> + if (payload->vm_status == GH_RM_VM_STATUS_RESET) {
> + down_write(&ghvm->status_lock);
> + ghvm->vm_status = payload->vm_status;
> + up_write(&ghvm->status_lock);
> + wake_up(&ghvm->vm_status_wait);
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
> +{
> + struct gh_rm_vm_exited_payload *payload = data;
> +
> + if (payload->vmid != ghvm->vmid)
> + return NOTIFY_OK;
> +
> + down_write(&ghvm->status_lock);
> + ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> + up_write(&ghvm->status_lock);
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int gh_vm_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
> +{
> + struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
> +
> + switch (action) {
> + case GH_RM_NOTIFICATION_VM_STATUS:
> + return gh_vm_rm_notification_status(ghvm, data);
> + case GH_RM_NOTIFICATION_VM_EXITED:
> + return gh_vm_rm_notification_exited(ghvm, data);
> + default:
> + return NOTIFY_OK;
> + }
> +}
> +
> +static void gh_vm_stop(struct gh_vm *ghvm)
> +{
> + int ret;
> +
> + down_write(&ghvm->status_lock);
> + if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
> + ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
> + if (ret)
> + dev_warn(ghvm->parent, "Failed to stop VM: %d\n", ret);
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> + up_write(&ghvm->status_lock);
> +}
> +
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> - mutex_lock(&ghvm->mm_lock);
> - list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> - gh_vm_mem_reclaim(ghvm, mapping);
> - kfree(mapping);
> - }
> - mutex_unlock(&ghvm->mm_lock);
> -
> - ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> - if (ret)
> - pr_warn("Failed to deallocate vmid: %d\n", ret);
> + switch (ghvm->vm_status) {
> + case GH_RM_VM_STATUS_RUNNING:
> + gh_vm_stop(ghvm);
> + fallthrough;
> + case GH_RM_VM_STATUS_INIT_FAILED:
> + case GH_RM_VM_STATUS_LOAD:
> + case GH_RM_VM_STATUS_EXITED:
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> + }
> + mutex_unlock(&ghvm->mm_lock);
> + fallthrough;
> + case GH_RM_VM_STATUS_NO_STATE:
> + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> + if (ret)
> + dev_warn(ghvm->parent, "Failed to deallocate vmid: %d\n", ret);
> +
> + gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
I think you should unregister the notifier before you
deallocate the VMID. The notifier callbacks compare against
ghvm->vmid, so they might still use it.
> + gh_rm_put(ghvm->rm);
> + kfree(ghvm);
> + break;
> + default:
> + dev_err(ghvm->parent, "VM is unknown state: %d. VM will not be cleaned up.\n",
> + ghvm->vm_status);
>
> - put_gh_rm(ghvm->rm);
> - kfree(ghvm);
> + gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> + gh_rm_put(ghvm->rm);
> + kfree(ghvm);
> + break;
> + }
> }
>
> static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> {
> struct gh_vm *ghvm;
> - int vmid;
> + int vmid, ret;
>
> vmid = gh_rm_alloc_vmid(rm, 0);
> if (vmid < 0)
> @@ -55,13 +136,130 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> ghvm->vmid = vmid;
> ghvm->rm = rm;
>
> + init_waitqueue_head(&ghvm->vm_status_wait);
> + ghvm->nb.notifier_call = gh_vm_rm_notification;
> + ret = gh_rm_notifier_register(rm, &ghvm->nb);
> + if (ret) {
> + gh_rm_put(rm);
> + gh_rm_dealloc_vmid(rm, vmid);
> + kfree(ghvm);
> + return ERR_PTR(ret);
> + }
> +
> mutex_init(&ghvm->mm_lock);
> INIT_LIST_HEAD(&ghvm->memory_mappings);
> + init_rwsem(&ghvm->status_lock);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
> + ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>
> return ghvm;
> }
>
> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> + struct gh_vm_mem *mapping;
> + u64 dtb_offset;
> + u32 mem_handle;
> + int ret;
> +
> + down_write(&ghvm->status_lock);
> + if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> + up_write(&ghvm->status_lock);
> + return 0;
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> + switch (mapping->share_type) {
> + case VM_MEM_LEND:
> + ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
> + break;
> + case VM_MEM_SHARE:
> + ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
> + break;
> + }
> + if (ret) {
> + dev_warn(ghvm->parent, "Failed to %s parcel %d: %d\n",
> + mapping->share_type == VM_MEM_LEND ? "lend" : "share",
> + mapping->parcel.label,
> + ret);
> + goto err;
> + }
> + }
> + mutex_unlock(&ghvm->mm_lock);
> +
> + mapping = gh_vm_mem_find_by_addr(ghvm, ghvm->dtb_config.guest_phys_addr,
> + ghvm->dtb_config.size);
> + if (!mapping) {
> + dev_warn(ghvm->parent, "Failed to find the memory_handle for DTB\n");
> + ret = -EINVAL;
> + goto err;
> + }
> +
> + mem_handle = mapping->parcel.mem_handle;
> + dtb_offset = ghvm->dtb_config.guest_phys_addr - mapping->guest_phys_addr;
> +
> + ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, mem_handle,
> + 0, 0, dtb_offset, ghvm->dtb_config.size);
> + if (ret) {
> + dev_warn(ghvm->parent, "Failed to configure VM: %d\n", ret);
> + goto err;
> + }
> +
> + ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
> + if (ret) {
> + dev_warn(ghvm->parent, "Failed to initialize VM: %d\n", ret);
> + goto err;
> + }
> +
> + ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
> + if (ret) {
> + dev_warn(ghvm->parent, "Failed to start VM: %d\n", ret);
> + goto err;
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
> + up_write(&ghvm->status_lock);
> + return ret;
> +err:
> + ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
> + /* gh_vm_free will handle releasing resources and reclaiming memory */
> + up_write(&ghvm->status_lock);
> + return ret;
> +}
> +
> +static int gh_vm_ensure_started(struct gh_vm *ghvm)
> +{
> + int ret;
> +
> + ret = down_read_interruptible(&ghvm->status_lock);
> + if (ret)
> + return ret;
> +
> + /* Unlikely because VM is typically started */
> + if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
> + up_read(&ghvm->status_lock);
> + ret = gh_vm_start(ghvm);
> + if (ret)
> + goto out;
You have already released the status lock at this point.
So going to "out" will do it again. This is a BUG.
I think you just want to return ret, and you don't need
the "out" error path.
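Here is the control flow I'd expect, as a toy userspace model. The lock
is just a counter standing in for the rwsem read side, vm_start() is a
fake whose result is chosen by the caller, and all of the names are
mine rather than the driver's. Every path drops the lock exactly once:

```c
#include <assert.h>
#include <errno.h>

enum vm_status { VM_LOAD, VM_RUNNING, VM_INIT_FAILED };

struct vm {
	enum vm_status status;
	int readers;		/* toy stand-in for the rwsem read side */
	int start_result;	/* what the fake start routine returns */
};

static void lock_read(struct vm *vm)
{
	vm->readers++;
}

static void unlock_read(struct vm *vm)
{
	assert(vm->readers > 0);	/* would catch a double unlock */
	vm->readers--;
}

/* Fake start: always moves the status out of VM_LOAD. */
static int vm_start(struct vm *vm)
{
	if (vm->start_result) {
		vm->status = VM_INIT_FAILED;
		return vm->start_result;
	}
	vm->status = VM_RUNNING;
	return 0;
}

static int vm_ensure_started(struct vm *vm)
{
	int ret = 0;

	lock_read(vm);
	if (vm->status == VM_LOAD) {
		unlock_read(vm);
		ret = vm_start(vm);
		if (ret)
			return ret;	/* lock already dropped, just return */
		/* status has left VM_LOAD, so this recurses at most once */
		return vm_ensure_started(vm);
	}

	if (vm->status != VM_RUNNING)
		ret = -ENODEV;

	unlock_read(vm);
	return ret;
}
```

With the early return in place there is no remaining user of the "out"
label, so it can simply go away.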
> + /** gh_vm_start() is guaranteed to bring status out of
> + * GH_RM_VM_STATUS_LOAD, thus inifitely recursive call is not
> + * possible
> + */
> + return gh_vm_ensure_started(ghvm);
> + }
> +
> + /* Unlikely because VM is typically running */
> + if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
> + ret = -ENODEV;
> +
> +out:
> + up_read(&ghvm->status_lock);
> + return ret;
> +}
> +
> static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> {
> struct gh_vm *ghvm = filp->private_data;
> @@ -85,6 +283,25 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> r = gh_vm_mem_alloc(ghvm, &region);
> break;
> }
> + case GH_VM_SET_DTB_CONFIG: {
> + struct gh_vm_dtb_config dtb_config;
> +
> + if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
> + return -EFAULT;
> +
It's clear that the base of the DTB does not need to be
page aligned. But why do you round up the size to a
page boundary? (It might be the "extra for overlay"
I comment on elsewhere, but even if so, it's worth
mentioning this.)
> + dtb_config.size = PAGE_ALIGN(dtb_config.size);
> + if (dtb_config.guest_phys_addr + dtb_config.size < dtb_config.guest_phys_addr)
> + return -EOVERFLOW;
> +
> + ghvm->dtb_config = dtb_config;
> +
> + r = 0;
> + break;
> + }
> + case GH_VM_START: {
> + r = gh_vm_ensure_started(ghvm);
> + break;
> + }
> default:
> r = -ENOTTY;
> break;
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index c9f6fa5478ed..26bcc2ae4478 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -10,6 +10,8 @@
> #include <linux/list.h>
> #include <linux/miscdevice.h>
> #include <linux/mutex.h>
> +#include <linux/rwsem.h>
> +#include <linux/wait.h>
>
> #include <uapi/linux/gunyah.h>
>
> @@ -34,6 +36,13 @@ struct gh_vm {
> u16 vmid;
> struct gh_rm *rm;
> struct device *parent;
> + enum gh_rm_vm_auth_mechanism auth;
> + struct gh_vm_dtb_config dtb_config;
> +
> + struct notifier_block nb;
> + enum gh_rm_vm_status vm_status;
> + wait_queue_head_t vm_status_wait;
> + struct rw_semaphore status_lock;
>
> struct work_struct free_work;
> struct mutex mm_lock;
> @@ -44,5 +53,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *regio
> void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
> int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
> struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label);
> +struct gh_vm_mem *gh_vm_mem_find_by_addr(struct gh_vm *ghvm, u64 guest_phys_addr, u32 size);
>
> #endif
> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> index db6f55cef37f..6e1d2e8bddb7 100644
> --- a/drivers/virt/gunyah/vm_mgr_mm.c
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -47,6 +47,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
> list_del(&mapping->list);
> }
>
I think you should call this gh_vm_mem_find_mapping(). You are
finding the mapping that contains the given range.
> +struct gh_vm_mem *gh_vm_mem_find_by_addr(struct gh_vm *ghvm, u64 guest_phys_addr, u32 size)
> +{
> + struct gh_vm_mem *mapping = NULL;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ERR_PTR(ret);
> +
I think you could slightly modify this, and replace this loop and
the one in gh_vm_mem_alloc() with a call to a helper function.
What I suggest is that you define a function like
__gh_vm_mem_overlap(ghvm, offset, size). It would return a
gh_vm_mem pointer if it found any mapping that overlapped
the range, or null if none is found.
Here, you could then check the returned result to ensure
the entire range fits within it.
And in gh_vm_mem_alloc() you could simply use the result
to indicate that you can't allocate the range (because
at least one existing range covers all or part of the new
one).
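A rough userspace sketch of that split follows. It is not the driver code: struct mem_range stands in for struct gh_vm_mem, the array stands in for the memory_mappings list, locking is omitted, and all function names are hypothetical. It also assumes existing mappings never overlap each other and that base + size does not wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct gh_vm_mem: only the fields the range check needs. */
struct mem_range {
	uint64_t base;	/* guest physical base address */
	uint64_t size;	/* length in bytes */
};

/* Return the first entry overlapping [base, base + size), or NULL. */
static const struct mem_range *mem_find_overlap(const struct mem_range *ranges, size_t n,
						uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		const struct mem_range *r = &ranges[i];

		if (base < r->base + r->size && r->base < base + size)
			return r;
	}
	return NULL;
}

/* Lookup path: the whole range must sit inside the mapping that was found. */
static const struct mem_range *mem_find_mapping(const struct mem_range *ranges, size_t n,
						uint64_t base, uint64_t size)
{
	const struct mem_range *r = mem_find_overlap(ranges, n, base, size);

	if (r && base >= r->base && base + size <= r->base + r->size)
		return r;
	return NULL;
}

/* Allocation path: any overlap at all means the new range is rejected. */
static bool mem_can_alloc(const struct mem_range *ranges, size_t n,
			  uint64_t base, uint64_t size)
{
	return mem_find_overlap(ranges, n, base, size) == NULL;
}
```

Because the existing mappings are assumed disjoint, a range that straddles two of them overlaps the first without being contained in it, so the lookup path returns NULL for it.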
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> + if (guest_phys_addr >= mapping->guest_phys_addr &&
> + (guest_phys_addr + size <= mapping->guest_phys_addr +
> + (mapping->npages << PAGE_SHIFT))) {
> + goto unlock;
> + }
> + }
> +
> + mapping = NULL;
> +unlock:
> + mutex_unlock(&ghvm->mm_lock);
> + return mapping;
> +}
> +
> struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label)
> {
> struct gh_vm_mem *mapping;
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 88a429dad09e..8b0b46f28e39 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -29,6 +29,12 @@ struct gh_rm_vm_exited_payload {
> #define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
>
> enum gh_rm_vm_status {
> + /**
> + * RM doesn't have a state where load partially failed because
> + * only Linux
I have no idea what the comment above means... Please fix.
Several of the values below are never explicitly assigned,
and some are used but not assigned. The others apparently
might come back from the resource manager? Why, for
example, are the PAUSED, AUTH, and RESETTING statuses
defined if we don't use them?
> + */
> + GH_RM_VM_STATUS_LOAD_FAILED = -1,
> +
> GH_RM_VM_STATUS_NO_STATE = 0,
> GH_RM_VM_STATUS_INIT = 1,
> GH_RM_VM_STATUS_READY = 2,
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index a19207e3e065..d6abd8605a2e 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -49,4 +49,17 @@ struct gh_userspace_memory_region {
> #define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> struct gh_userspace_memory_region)
>
> +/**
> + * struct gh_vm_dtb_config - Set the location of the VM's devicetree blob
> + * @guest_phys_addr: Address of the VM's devicetree in guest memory.
> + * @size: Maximum size of the devicetree.
> + */
> +struct gh_vm_dtb_config {
> + __u64 guest_phys_addr;
> + __u64 size;
> +};
> +#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
> +
> +#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
> +
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add hypercalls to send and receive messages on a Gunyah message queue.
>
> Signed-off-by: Elliot Berman <[email protected]>
One comment below. -Alex
> ---
> arch/arm64/gunyah/gunyah_hypercall.c | 31 ++++++++++++++++++++++++++++
> include/linux/gunyah.h | 6 ++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> index 0d14e767e2c8..3420d8f286a9 100644
> --- a/arch/arm64/gunyah/gunyah_hypercall.c
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -41,6 +41,8 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
> fn)
>
> #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
> +#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
>
> /**
> * gh_hypercall_hyp_identify() - Returns build information and feature flags
> @@ -60,5 +62,34 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
> }
> EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
>
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready)
The tx_flags argument--being a mask of some kind--should be unsigned,
and perhaps 64 bits wide. The only caller passes a u64 value here
(which would technically be truncated).
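A small userspace illustration of the concern, as a sketch only. TX_FLAG_HIGH is a hypothetical flag above bit 31 (nothing like it exists in the patch), and uint32_t is used instead of int so the narrowing conversion is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

#define TX_FLAG_PUSH (UINT64_C(1) << 0)		/* mirrors BIT(0) in the patch */
#define TX_FLAG_HIGH (UINT64_C(1) << 40)	/* hypothetical flag above bit 31 */

/* Models a 32-bit flags parameter: the upper half is silently dropped. */
static uint64_t flags_after_narrow(uint64_t tx_flags)
{
	return (uint32_t)tx_flags;
}

/* Models a 64-bit flags parameter: every bit reaches the hypercall. */
static uint64_t flags_after_wide(uint64_t tx_flags)
{
	return tx_flags;
}
```

With only bit 0 defined today nothing is lost, but a wider unsigned parameter keeps the prototype in step with what the caller already passes.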
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, (uintptr_t)buff, tx_flags, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK)
> + *ready = !!res.a1;
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
> +
> +enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
> + bool *ready)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, (uintptr_t)buff, size, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK) {
> + *recv_size = res.a1;
> + *ready = !!res.a2;
> + }
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
> +
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index bd080e3a6fc9..18cfbf5ee48b 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -108,4 +108,10 @@ struct gh_hypercall_hyp_identify_resp {
>
> void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
>
> +#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
> +
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready);
> +enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
> + bool *ready);
> +
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
I have quite a few suggestions here, but I don't think I identified
any bugs.
-Alex
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/vm_mgr.c | 44 ++++++
> drivers/virt/gunyah/vm_mgr.h | 25 ++++
> drivers/virt/gunyah/vm_mgr_mm.c | 229 ++++++++++++++++++++++++++++++++
> include/uapi/linux/gunyah.h | 29 ++++
> 5 files changed, 328 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 03951cf82023..ff8bc4925392 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index dbacf36af72d..e950274c6a53 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -18,8 +18,16 @@
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> + }
> + mutex_unlock(&ghvm->mm_lock);
> +
> ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> if (ret)
> pr_warn("Failed to deallocate vmid: %d\n", ret);
> @@ -47,11 +55,44 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> ghvm->vmid = vmid;
> ghvm->rm = rm;
>
> + mutex_init(&ghvm->mm_lock);
> + INIT_LIST_HEAD(&ghvm->memory_mappings);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
>
> return ghvm;
> }
>
> +static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct gh_vm *ghvm = filp->private_data;
> + void __user *argp = (void __user *)arg;
> + long r;
> +
> + switch (cmd) {
> + case GH_VM_SET_USER_MEM_REGION: {
> + struct gh_userspace_memory_region region;
> +
> + if (!gh_api_has_feature(GH_FEATURE_MEMEXTENT))
> + return -EOPNOTSUPP;
> +
> + if (copy_from_user(&region, argp, sizeof(region)))
> + return -EFAULT;
> +
> + /* All other flag bits are reserved for future use */
> + if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC))
> + return -EINVAL;
> +
> + r = gh_vm_mem_alloc(ghvm, &region);
> + break;
> + }
> + default:
> + r = -ENOTTY;
> + break;
> + }
> +
> + return r;
> +}
> +
> static int gh_vm_release(struct inode *inode, struct file *filp)
> {
> struct gh_vm *ghvm = filp->private_data;
> @@ -64,6 +105,9 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
> }
>
> static const struct file_operations gh_vm_fops = {
> + .owner = THIS_MODULE,
> + .unlocked_ioctl = gh_vm_ioctl,
> + .compat_ioctl = compat_ptr_ioctl,
> .release = gh_vm_release,
> .llseek = noop_llseek,
> };
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 4b22fbcac91c..c9f6fa5478ed 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -7,17 +7,42 @@
> #define _GH_PRIV_VM_MGR_H
>
> #include <linux/gunyah_rsc_mgr.h>
> +#include <linux/list.h>
> +#include <linux/miscdevice.h>
> +#include <linux/mutex.h>
>
> #include <uapi/linux/gunyah.h>
>
> long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
>
> +enum gh_vm_mem_share_type {
> + VM_MEM_SHARE,
> + VM_MEM_LEND,
> +};
> +
> +struct gh_vm_mem {
> + struct list_head list;
> + enum gh_vm_mem_share_type share_type;
> + struct gh_rm_mem_parcel parcel;
> +
> + __u64 guest_phys_addr;
> + struct page **pages;
> + unsigned long npages;
> +};
> +
> struct gh_vm {
> u16 vmid;
> struct gh_rm *rm;
> struct device *parent;
>
> struct work_struct free_work;
> + struct mutex mm_lock;
> + struct list_head memory_mappings;
> };
>
> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
> +struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label);
> +
> #endif
> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> new file mode 100644
> index 000000000000..db6f55cef37f
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -0,0 +1,229 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/mm.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static struct gh_vm_mem *__gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label)
> + __must_hold(&ghvm->mm_lock)
> +{
> + struct gh_vm_mem *mapping;
> +
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list)
> + if (mapping->parcel.label == label)
> + return mapping;
> +
> + return NULL;
> +}
> +
I think you should call this function gh_vm_mem_reclaim_mapping().
Then add a new function with this name, which takes only a ghvm
argument, and will take mm_lock and call this function to reclaim
all mappings on a VM. That encapsulates some code found in
gh_vm_free(). (You could use a similar model for removing
functions, as well as tickets and resources, and use them
in gh_vm_free().)
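A userspace sketch of that shape, under stated assumptions: struct vm and struct mapping are simplified stand-ins, both function names are hypothetical, the lock is only indicated by comments, and unlike the patch the per-mapping function also frees the mapping itself.

```c
#include <assert.h>
#include <stdlib.h>

struct mapping {
	struct mapping *next;
	void *pages;		/* stand-in for the pinned page array */
};

struct vm {
	struct mapping *mappings;	/* a real version would also hold mm_lock */
};

/* Per-mapping teardown; the caller is assumed to hold the VM's lock. */
static void vm_mem_reclaim_mapping(struct vm *vm, struct mapping *m)
{
	struct mapping **pp = &vm->mappings;

	/* Unlink m from the singly linked list if it is present. */
	while (*pp && *pp != m)
		pp = &(*pp)->next;
	if (*pp)
		*pp = m->next;

	free(m->pages);
	free(m);
}

/* Whole-VM teardown: returns how many mappings were reclaimed. */
static unsigned int vm_mem_reclaim(struct vm *vm)
{
	unsigned int n = 0;

	/* mutex_lock(&vm->mm_lock) would go here */
	while (vm->mappings) {
		vm_mem_reclaim_mapping(vm, vm->mappings);
		n++;
	}
	/* mutex_unlock(&vm->mm_lock) would go here */

	return n;
}
```

With this split, the free path only needs a single call to the whole-VM function and does not have to know how mappings are stored.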
> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
> + __must_hold(&ghvm->mm_lock)
> +{
> + int i, ret = 0;
> +
> + if (mapping->parcel.mem_handle != GH_MEM_HANDLE_INVAL) {
> + ret = gh_rm_mem_reclaim(ghvm->rm, &mapping->parcel);
> + if (ret)
> + pr_warn("Failed to reclaim memory parcel for label %d: %d\n",
> + mapping->parcel.label, ret);
> + }
> +
> + if (!ret)
> + for (i = 0; i < mapping->npages; i++)
> + unpin_user_page(mapping->pages[i]);
> +
> + kfree(mapping->pages);
> + kfree(mapping->parcel.acl_entries);
> + kfree(mapping->parcel.mem_entries);
> +
> + list_del(&mapping->list);
> +}
> +
The next function is never used. Can you get rid of it for
now, and add it if/when needed?
> +struct gh_vm_mem *gh_vm_mem_find_by_label(struct gh_vm *ghvm, u32 label)
> +{
> + struct gh_vm_mem *mapping;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ERR_PTR(ret);
> +
> + mapping = __gh_vm_mem_find_by_label(ghvm, label);
> + mutex_unlock(&ghvm->mm_lock);
> +
> + return mapping ? : ERR_PTR(-ENODEV);
> +}
> +
> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
> +{
> + struct gh_vm_mem *mapping, *tmp_mapping;
> + struct gh_rm_mem_entry *mem_entries;
> + phys_addr_t curr_page, prev_page;
> + struct gh_rm_mem_parcel *parcel;
> + int i, j, pinned, ret = 0;
> + size_t entry_size;
> + u16 vmid;
> +
> + if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
> + !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
A clever trick would be to use:
allofem = region->memory_size |
region->userspace_addr |
region->guest_phys_addr;
if (!region->memory_size || !PAGE_ALIGNED(allofem))
But that's just a clever trick.
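The trick works because a value is page-aligned exactly when its low bits are zero, and OR-ing the three values leaves a low bit set if any one of them has it set. A minimal userspace check, assuming a 4 KiB page and using hypothetical helper names in place of PAGE_ALIGNED():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(4096)	/* assumed page size for the sketch */

static bool page_aligned(uint64_t v)
{
	return (v & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* The form in the patch: test each value on its own. */
static bool region_ok_separate(uint64_t size, uint64_t uaddr, uint64_t gpa)
{
	return size && page_aligned(size) && page_aligned(uaddr) && page_aligned(gpa);
}

/* The combined form: one alignment test on the OR of all three. */
static bool region_ok_combined(uint64_t size, uint64_t uaddr, uint64_t gpa)
{
	return size && page_aligned(size | uaddr | gpa);
}
```

The zero-size check has to stay separate, since a zero size contributes no bits to the OR.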
> + return -EINVAL;
> +
> + if (region->guest_phys_addr + region->memory_size < region->guest_phys_addr)
> + return -EOVERFLOW;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
> +
> + mapping = __gh_vm_mem_find_by_label(ghvm, region->label);
> + if (mapping) {
> + mutex_unlock(&ghvm->mm_lock);
> + return -EEXIST;
> + }
> +
> + mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
> + if (!mapping) {
> + mutex_unlock(&ghvm->mm_lock);
> + return -ENOMEM;
> + }
> +
> + mapping->parcel.label = region->label;
> + mapping->guest_phys_addr = region->guest_phys_addr;
> + mapping->npages = region->memory_size >> PAGE_SHIFT;
> + parcel = &mapping->parcel;
Assign parcel->label here instead.
> + parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
> + parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
> +
> + /* Check for overlap */
See my other suggestion about using a common helper here.
> + list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
> + if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
> + tmp_mapping->guest_phys_addr) ||
I *think* this || is supposed to be &&. But I find the way this
is formatted (and negated) makes this all more confusing than
it should be.
> + (mapping->guest_phys_addr >=
> + tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
I think this little block of code is worthy of its own function
to allow it to be examined in isolation (and maybe be explained).
static bool gh_vm_mem_overlap(struct gh_vm_mem *a, struct gh_vm_mem *b)
{
u64 a_end = a->guest_phys_addr + (a->npages << PAGE_SHIFT);
u64 b_end = b->guest_phys_addr + (b->npages << PAGE_SHIFT);
return a->guest_phys_addr < b_end && b->guest_phys_addr < a_end;
}
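One way to examine the condition in isolation is to put the negated test from the patch next to the positive form and compare them. By De Morgan's law, !(A || B) equals (!A && !B), so the two should agree. A userspace sketch, with struct range and a fixed page shift of 12 as stand-ins for the kernel types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

struct range {
	uint64_t guest_phys_addr;
	unsigned long npages;
};

static uint64_t range_end(const struct range *r)
{
	/* Parenthesize the shift: + binds tighter than << in C. */
	return r->guest_phys_addr + ((uint64_t)r->npages << SKETCH_PAGE_SHIFT);
}

/* Form in the patch: not (a ends at or before b, or a starts at or after b's end). */
static bool overlap_negated(const struct range *a, const struct range *b)
{
	return !(range_end(a) <= b->guest_phys_addr ||
		 a->guest_phys_addr >= range_end(b));
}

/* Positive form: each range starts before the other one ends. */
static bool overlap_direct(const struct range *a, const struct range *b)
{
	return a->guest_phys_addr < range_end(b) &&
	       b->guest_phys_addr < range_end(a);
}
```

Adjacent ranges that share only a boundary address count as non-overlapping in both forms, because the end address is exclusive.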
> + ret = -EEXIST;
> + goto free_mapping;
> + }
> + }
> +
> + list_add(&mapping->list, &ghvm->memory_mappings);
You should defer adding the mapping to the list. More below.
> +
> + mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
> + if (!mapping->pages) {
> + ret = -ENOMEM;
> + mapping->npages = 0; /* update npages for reclaim */
If you haven't added it to the list, you can goto free_mapping
here instead.
> + goto reclaim;
> + }
> +
> + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
> + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
> + if (pinned < 0) {
> + ret = pinned;
I would suggest having a new err_free_pages label that frees
mapping->pages, just before free_mapping.
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
Now that you actually have something to reclaim, add the
mapping to the VM mappings list.
> + } else if (pinned != mapping->npages) {
> + ret = -EFAULT;
> + mapping->npages = pinned; /* update npages for reclaim */
> + goto reclaim;
> + }
> +
> + parcel->n_acl_entries = 2;
> + mapping->share_type = VM_MEM_SHARE;
> + parcel->acl_entries = kcalloc(parcel->n_acl_entries, sizeof(*parcel->acl_entries),
> + GFP_KERNEL);
> + if (!parcel->acl_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
> +
> + if (region->flags & GH_MEM_ALLOW_READ)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_R;
> + if (region->flags & GH_MEM_ALLOW_WRITE)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_W;
> + if (region->flags & GH_MEM_ALLOW_EXEC)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_X;
> +
> + if (mapping->share_type == VM_MEM_SHARE) {
There is no need for this conditional test. The share_type *is*
VM_MEM_SHARE (you just set it to that).
> + ret = gh_rm_get_vmid(ghvm->rm, &vmid);
> + if (ret)
> + goto reclaim;
> +
> + parcel->acl_entries[1].vmid = cpu_to_le16(vmid);
> + /* Host assumed to have all these permissions. Gunyah will not
> + * grant new permissions if host actually had less than RWX
> + */
> + parcel->acl_entries[1].perms |= GH_RM_ACL_R | GH_RM_ACL_W | GH_RM_ACL_X;
The perms value is already zero here. Just use =, not |=.
> + }
> +
> + mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries), GFP_KERNEL);
> + if (!mem_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + /* reduce number of entries by combining contiguous pages into single memory entry */
Are you sure you need to do this? I.e., does pin_user_pages_fast()
already take care of consolidating these pages?
> + prev_page = page_to_phys(mapping->pages[0]);
> + mem_entries[0].ipa_base = cpu_to_le64(prev_page);
> + entry_size = PAGE_SIZE;
> + for (i = 1, j = 0; i < mapping->npages; i++) {
> + curr_page = page_to_phys(mapping->pages[i]);
I think you can actually use the page frame numbers
here instead of the addresses. If they are consecutive,
they are contiguous. See pages_are_mergeable() for an
example of that. Using PFNs might simplify this code.
> + if (curr_page - prev_page == PAGE_SIZE) {
> + entry_size += PAGE_SIZE;
> + } else {
> + mem_entries[j].size = cpu_to_le64(entry_size);
> + j++;
> + mem_entries[j].ipa_base = cpu_to_le64(curr_page);
> + entry_size = PAGE_SIZE;
> + }
> +
> + prev_page = curr_page;
> + }
> + mem_entries[j].size = cpu_to_le64(entry_size);
It might be messier, but it seems like you could scan the pages to
see how many you'll need (after combining), then allocate the array
of mem entries based on that. That is, do that rather than allocating,
filling, then duplicating and freeing.
count = 1;
curr_page = mapping->pages[0];
for (i = 1; i < mapping->npages; i++) {
next_page = mapping->pages[i];
if (page_to_pfn(next_page) !=
page_to_pfn(curr_page) + 1)
count++;
curr_page = next_page;
}
parcel->n_mem_entries = count;
parcel->mem_entries = kcalloc(count, ...);
/* Then fill them up */
(Not tested, but you get the idea.)
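Spelled out a little further, the two passes could look like the userspace sketch below. It works on an array of page frame numbers rather than struct page pointers, leaves out the cpu_to_le64() conversions, and uses hypothetical names throughout (struct mem_entry, count_runs, build_entries).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE (UINT64_C(1) << SKETCH_PAGE_SHIFT)

struct mem_entry {
	uint64_t base;	/* physical address of the first page in the run */
	uint64_t size;	/* run length in bytes */
};

/* Pass 1: count runs of consecutive page frame numbers. */
static size_t count_runs(const uint64_t *pfns, size_t npages)
{
	size_t count = npages ? 1 : 0;

	for (size_t i = 1; i < npages; i++)
		if (pfns[i] != pfns[i - 1] + 1)
			count++;
	return count;
}

/* Pass 2: allocate exactly that many entries and fill them. Caller frees. */
static struct mem_entry *build_entries(const uint64_t *pfns, size_t npages,
				       size_t *n_entries)
{
	size_t count = count_runs(pfns, npages), j = 0;
	struct mem_entry *entries;

	*n_entries = 0;
	if (!count)
		return NULL;

	entries = calloc(count, sizeof(*entries));
	if (!entries)
		return NULL;

	entries[0].base = pfns[0] << SKETCH_PAGE_SHIFT;
	entries[0].size = SKETCH_PAGE_SIZE;
	for (size_t i = 1; i < npages; i++) {
		if (pfns[i] == pfns[i - 1] + 1) {
			entries[j].size += SKETCH_PAGE_SIZE;	/* extend current run */
		} else {
			j++;					/* start a new run */
			entries[j].base = pfns[i] << SKETCH_PAGE_SHIFT;
			entries[j].size = SKETCH_PAGE_SIZE;
		}
	}

	*n_entries = count;
	return entries;
}
```

The cost is a second walk over the page array; the gain is a single allocation of the final size with no temporary copy.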
> +
> + parcel->n_mem_entries = j + 1;
> + parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) * parcel->n_mem_entries,
> + GFP_KERNEL);
> + kfree(mem_entries);
> + if (!parcel->mem_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + mutex_unlock(&ghvm->mm_lock);
> + return 0;
> +reclaim:
> + gh_vm_mem_reclaim(ghvm, mapping);
> +free_mapping:
> + kfree(mapping);
> + mutex_unlock(&ghvm->mm_lock);
> + return ret;
> +}
> +
> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
> +{
> + struct gh_vm_mem *mapping;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
> +
> + mapping = __gh_vm_mem_find_by_label(ghvm, label);
> + if (!mapping)
> + goto out;
> +
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> +out:
> + mutex_unlock(&ghvm->mm_lock);
> + return ret;
> +}
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index 10ba32d2b0a6..a19207e3e065 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -20,4 +20,33 @@
> */
> #define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
>
> +/*
> + * ioctls for VM fds
> + */
> +
I think you should define the following three values in an enum.
> +#define GH_MEM_ALLOW_READ (1UL << 0)
> +#define GH_MEM_ALLOW_WRITE (1UL << 1)
> +#define GH_MEM_ALLOW_EXEC (1UL << 2)
> +
> +/**
> + * struct gh_userspace_memory_region - Userspace memory descripion for GH_VM_SET_USER_MEM_REGION
> + * @label: Unique identifer to the region.
Unique with respect to what? I think it's unique among memory
regions defined within a VM. And I think it's arbitrary and
defined by the caller (right?).
> + * @flags: Flags for memory parcel behavior
> + * @guest_phys_addr: Location of the memory region in guest's memory space (page-aligned)
> + * @memory_size: Size of the region (page-aligned)
> + * @userspace_addr: Location of the memory region in caller (userspace)'s memory
> + *
> + * See Documentation/virt/gunyah/vm-manager.rst for further details.
> + */
> +struct gh_userspace_memory_region {
> + __u32 label;
> + __u32 flags;
Add a comment to indicate what types of values "flags" can have.
Maybe "flags" should be called "perms" or something?
> + __u64 guest_phys_addr;
> + __u64 memory_size;
> + __u64 userspace_addr;
Why isn't userspace_addr just a (void *)? That would be a more natural
thing to pass to the kernel. Is it to avoid 32-bit/64-bit pointer
differences in the API?
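If that is the reason, the usual pattern is sketched below: fixed-width fields give the struct the same size and offsets for 32-bit and 64-bit callers, and userspace converts its pointer through uintptr_t. This is an assumption about the intent, not something the patch states; struct region_fixed only mirrors the field order of the quoted struct, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the quoted struct, with fixed-width userspace types. */
struct region_fixed {
	uint32_t label;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

/* No pointer-sized member, so the layout does not depend on the ABI. */
_Static_assert(sizeof(struct region_fixed) == 32, "unexpected struct size");
_Static_assert(offsetof(struct region_fixed, userspace_addr) == 24,
	       "unexpected userspace_addr offset");

/* Caller side: widen a pointer to 64 bits without sign-extension surprises. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Receiver side: recover the pointer from the 64-bit field. */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

A (void *) member would instead be 4 bytes for a 32-bit caller and 8 bytes for a 64-bit one, which would need a separate compat translation of the ioctl argument.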
> +};
> +
> +#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> + struct gh_userspace_memory_region)
> +
I think it's nicer to group the definitions of these IOCTL values.
Then in the struct definitions that follow, you can add a comment that
indicates which IOCTL the struct is used for.
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add a sample Gunyah VMM capable of launching a non-proxy scheduled VM.
>
This example doesn't show any VM functions getting added. Is
that handled by cros_vm or something? It would be nice to
have the process of interpreting the DTS file and setting
up CPUs and interrupts be explained.
Honestly, I haven't reviewed your documentation this time
around, so maybe you've done that and I just haven't looked...
A few tiny suggestions below, but this generally looks fine.
I'll look at it once more next time.
-Alex
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> samples/Kconfig | 10 ++
> samples/Makefile | 1 +
> samples/gunyah/.gitignore | 2 +
> samples/gunyah/Makefile | 6 +
> samples/gunyah/gunyah_vmm.c | 270 +++++++++++++++++++++++++++++++++++
> samples/gunyah/sample_vm.dts | 68 +++++++++
> 6 files changed, 357 insertions(+)
> create mode 100644 samples/gunyah/.gitignore
> create mode 100644 samples/gunyah/Makefile
> create mode 100644 samples/gunyah/gunyah_vmm.c
> create mode 100644 samples/gunyah/sample_vm.dts
>
> diff --git a/samples/Kconfig b/samples/Kconfig
> index 30ef8bd48ba3..11070bf02bd7 100644
> --- a/samples/Kconfig
> +++ b/samples/Kconfig
> @@ -273,6 +273,16 @@ config SAMPLE_CORESIGHT_SYSCFG
> This demonstrates how a user may create their own CoreSight
> configurations and easily load them into the system at runtime.
>
> +config SAMPLE_GUNYAH
> + bool "Build example Gunyah Virtual Machine Manager"
> + depends on CC_CAN_LINK && HEADERS_INSTALL
> + depends on GUNYAH
> + help
> + Build an example Gunyah VMM userspace program capable of launching
> + a basic virtual machine under the Gunyah hypervisor.
> + This demonstrates how to create a virtual machine under the Gunyah
> + hypervisor.
> +
> source "samples/rust/Kconfig"
>
> endif # SAMPLES
> diff --git a/samples/Makefile b/samples/Makefile
> index 7cb632ef88ee..a65555802642 100644
> --- a/samples/Makefile
> +++ b/samples/Makefile
> @@ -37,3 +37,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak/
> obj-$(CONFIG_SAMPLE_CORESIGHT_SYSCFG) += coresight/
> obj-$(CONFIG_SAMPLE_FPROBE) += fprobe/
> obj-$(CONFIG_SAMPLES_RUST) += rust/
> +obj-$(CONFIG_SAMPLE_GUNYAH) += gunyah/
> diff --git a/samples/gunyah/.gitignore b/samples/gunyah/.gitignore
> new file mode 100644
> index 000000000000..adc7d1589fde
> --- /dev/null
> +++ b/samples/gunyah/.gitignore
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +/gunyah_vmm
> diff --git a/samples/gunyah/Makefile b/samples/gunyah/Makefile
> new file mode 100644
> index 000000000000..faf14f9bb337
> --- /dev/null
> +++ b/samples/gunyah/Makefile
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +userprogs-always-y += gunyah_vmm
> +dtb-y += sample_vm.dtb
> +
> +userccflags += -I usr/include
> diff --git a/samples/gunyah/gunyah_vmm.c b/samples/gunyah/gunyah_vmm.c
> new file mode 100644
> index 000000000000..d0ba9c20cb13
> --- /dev/null
> +++ b/samples/gunyah/gunyah_vmm.c
> @@ -0,0 +1,270 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <sys/ioctl.h>
> +#include <getopt.h>
> +#include <limits.h>
> +#include <stdint.h>
> +#include <fcntl.h>
> +#include <string.h>
> +#include <sys/sysmacros.h>
> +#define __USE_GNU
> +#include <sys/mman.h>
> +
> +#include <linux/gunyah.h>
> +
> +struct vm_config {
> + int image_fd;
> + int dtb_fd;
> + int ramdisk_fd;
> +
> + uint64_t guest_base;
> + uint64_t guest_size;
> +
> + uint64_t image_offset;
> + off_t image_size;
> + uint64_t dtb_offset;
> + off_t dtb_size;
> + uint64_t ramdisk_offset;
> + off_t ramdisk_size;
> +};
> +
> +static struct option options[] = {
> + { "help", no_argument, NULL, 'h' },
> + { "image", required_argument, NULL, 'i' },
> + { "dtb", required_argument, NULL, 'd' },
> + { "ramdisk", optional_argument, NULL, 'r' },
> + { "base", optional_argument, NULL, 'B' },
> + { "size", optional_argument, NULL, 'S' },
> + { "image_offset", optional_argument, NULL, 'I' },
> + { "dtb_offset", optional_argument, NULL, 'D' },
> + { "ramdisk_offset", optional_argument, NULL, 'R' },
> + { }
> +};
> +
> +static void print_help(char *cmd)
> +{
> + printf("gunyah_vmm, a sample tool to launch Gunyah VMs\n"
> + "Usage: %s <options>\n"
> + " --help, -h this menu\n"
> + " --image, -i <image> VM image file to load (e.g. a kernel Image) [Required]\n"
> + " --dtb, -d <dtb> Devicetree to load [Required]\n"
s/Devicetree/Devicetree file/
> + " --ramdisk, -r <ramdisk> Ramdisk to load\n"
s/Ramdisk/Ramdisk file/
> + " --base, -B <address> Set the base address of guest's memory [Default: 0x80000000]\n"
> + " --size, -S <number> The number of bytes large to make the guest's memory [Default: 0x6400000 (100 MB)]\n"
> + " --image_offset, -I <number> Offset into guest memory to load the VM image file [Default: 0x10000]\n"
> + " --dtb_offset, -D <number> Offset into guest memory to load the DTB [Default: 0]\n"
> + " --ramdisk_offset, -R <number> Offset into guest memory to load a ramdisk [Default: 0x4600000]\n"
> + , cmd);
> +}
> +
> +int main(int argc, char **argv)
> +{
> + int gunyah_fd, vm_fd, guest_fd;
> + struct gh_userspace_memory_region guest_mem_desc = { 0 };
> + struct gh_vm_dtb_config dtb_config = { 0 };
> + char *guest_mem;
> + struct vm_config config = {
> + /* Defaults good enough to boot static kernel and a basic ramdisk */
> + .ramdisk_fd = -1,
> + .guest_base = 0x80000000,
> + .guest_size = 0x6400000, /* 100 MB */
> + .image_offset = 0,
> + .dtb_offset = 0x45f0000,
> + .ramdisk_offset = 0x4600000, /* put at +70MB (30MB for ramdisk) */
> + };
> + struct stat st;
> + int opt, optidx, ret = 0;
> + long l;
> +
> + while ((opt = getopt_long(argc, argv, "hi:d:r:B:S:I:D:R:c:", options, &optidx)) != -1) {
> + switch (opt) {
> + case 'i':
> + config.image_fd = open(optarg, O_RDONLY | O_CLOEXEC);
> + if (config.image_fd < 0) {
> + perror("Failed to open image");
> + return -1;
> + }
> + if (stat(optarg, &st) < 0) {
> + perror("Failed to stat image");
> + return -1;
> + }
> + config.image_size = st.st_size;
> + break;
> + case 'd':
> + config.dtb_fd = open(optarg, O_RDONLY | O_CLOEXEC);
> + if (config.dtb_fd < 0) {
> + perror("Failed to open dtb");
> + return -1;
> + }
> + if (stat(optarg, &st) < 0) {
> + perror("Failed to stat dtb");
> + return -1;
> + }
> + config.dtb_size = st.st_size;
> + break;
> + case 'r':
> + config.ramdisk_fd = open(optarg, O_RDONLY | O_CLOEXEC);
> + if (config.ramdisk_fd < 0) {
> + perror("Failed to open ramdisk");
> + return -1;
> + }
> + if (stat(optarg, &st) < 0) {
> + perror("Failed to stat ramdisk");
> + return -1;
> + }
> + config.ramdisk_size = st.st_size;
> + break;
> + case 'B':
> + l = strtol(optarg, NULL, 0);
> + if (l == LONG_MIN) {
> + perror("Failed to parse base address");
> + return -1;
> + }
> + config.guest_base = l;
> + break;
> + case 'S':
> + l = strtol(optarg, NULL, 0);
> + if (l == LONG_MIN) {
> + perror("Failed to parse memory size");
> + return -1;
> + }
> + config.guest_size = l;
> + break;
> + case 'I':
> + l = strtol(optarg, NULL, 0);
> + if (l == LONG_MIN) {
> + perror("Failed to parse image offset");
> + return -1;
> + }
> + config.image_offset = l;
> + break;
> + case 'D':
> + l = strtol(optarg, NULL, 0);
> + if (l == LONG_MIN) {
> + perror("Failed to parse dtb offset");
> + return -1;
> + }
> + config.dtb_offset = l;
> + break;
> + case 'R':
> + l = strtol(optarg, NULL, 0);
> + if (l == LONG_MIN) {
> + perror("Failed to parse ramdisk offset");
> + return -1;
> + }
> + config.ramdisk_offset = l;
> + break;
> + case 'h':
> + print_help(argv[0]);
> + return 0;
> + default:
> + print_help(argv[0]);
> + return -1;
> + }
> + }
> +
> + if (!config.image_fd || !config.dtb_fd) {
> + print_help(argv[0]);
> + return -1;
> + }
> +
> + if (config.image_offset + config.image_size > config.guest_size) {
> + fprintf(stderr, "Image offset and size puts it outside guest memory. Make image smaller or increase guest memory size.\n");
> + return -1;
> + }
> +
> + if (config.dtb_offset + config.dtb_size > config.guest_size) {
> + fprintf(stderr, "DTB offset and size puts it outside guest memory. Make dtb smaller or increase guest memory size.\n");
> + return -1;
> + }
> +
> + if (config.ramdisk_fd == -1 &&
> + config.ramdisk_offset + config.ramdisk_size > config.guest_size) {
> + fprintf(stderr, "Ramdisk offset and size puts it outside guest memory. Make ramdisk smaller or increase guest memory size.\n");
> + return -1;
> + }
> +
> + gunyah_fd = open("/dev/gunyah", O_RDWR | O_CLOEXEC);
> + if (gunyah_fd < 0) {
> + perror("Failed to open /dev/gunyah");
> + return -1;
> + }
> +
> + vm_fd = ioctl(gunyah_fd, GH_CREATE_VM, 0);
> + if (vm_fd < 0) {
> + perror("Failed to create vm");
> + return -1;
> + }
> +
> + guest_fd = memfd_create("guest_memory", MFD_CLOEXEC);
> + if (guest_fd < 0) {
> + perror("Failed to create guest memfd");
> + return -1;
> + }
> +
> + if (ftruncate(guest_fd, config.guest_size) < 0) {
> + perror("Failed to grow guest memory");
> + return -1;
> + }
> +
> + guest_mem = mmap(NULL, config.guest_size, PROT_READ | PROT_WRITE, MAP_SHARED, guest_fd, 0);
> + if (guest_mem == MAP_FAILED) {
> + perror("Not enough memory");
> + return -1;
> + }
> +
> + if (read(config.image_fd, guest_mem + config.image_offset, config.image_size) < 0) {
> + perror("Failed to read image into guest memory");
> + return -1;
> + }
> +
> + if (read(config.dtb_fd, guest_mem + config.dtb_offset, config.dtb_size) < 0) {
> + perror("Failed to read dtb into guest memory");
> + return -1;
> + }
> +
> + if (config.ramdisk_fd > 0 &&
> + read(config.ramdisk_fd, guest_mem + config.ramdisk_offset,
> + config.ramdisk_size) < 0) {
> + perror("Failed to read ramdisk into guest memory");
> + return -1;
> + }
> +
> + guest_mem_desc.label = 0;
> + guest_mem_desc.flags = GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC;
> + guest_mem_desc.guest_phys_addr = config.guest_base;
> + guest_mem_desc.memory_size = config.guest_size;
> + guest_mem_desc.userspace_addr = (__u64)guest_mem;
> +
> + if (ioctl(vm_fd, GH_VM_SET_USER_MEM_REGION, &guest_mem_desc) < 0) {
> + perror("Failed to register guest memory with VM");
> + return -1;
> + }
> +
> + dtb_config.guest_phys_addr = config.guest_base + config.dtb_offset;
> + dtb_config.size = config.dtb_size;
> + if (ioctl(vm_fd, GH_VM_SET_DTB_CONFIG, &dtb_config) < 0) {
> + perror("Failed to set DTB configuration for VM");
> + return -1;
> + }
> +
> + ret = ioctl(vm_fd, GH_VM_START);
> + if (ret) {
> + perror("GH_VM_START failed");
> + return -1;
> + }
> +
> + while (1)
> + sleep(10);
> +
> + return 0;
> +}
> diff --git a/samples/gunyah/sample_vm.dts b/samples/gunyah/sample_vm.dts
> new file mode 100644
> index 000000000000..293bbc0469c8
> --- /dev/null
> +++ b/samples/gunyah/sample_vm.dts
> @@ -0,0 +1,68 @@
> +// SPDX-License-Identifier: BSD-3-Clause
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +/dts-v1/;
> +
> +/ {
> + #address-cells = <2>;
> + #size-cells = <2>;
> + interrupt-parent = <&intc>;
> +
> + chosen {
> + bootargs = "nokaslr";
> + };
> +
> + cpus {
> + #address-cells = <0x2>;
> + #size-cells = <0>;
> +
> + cpu@0 {
> + device_type = "cpu";
> + compatible = "arm,armv8";
> + reg = <0 0>;
> + };
> + };
> +
> + intc: interrupt-controller@3FFF0000 {
> + compatible = "arm,gic-v3";
> + #interrupt-cells = <3>;
> + #address-cells = <2>;
> + #size-cells = <2>;
> + interrupt-controller;
> + reg = <0 0x3FFF0000 0 0x10000>,
> + <0 0x3FFD0000 0 0x20000>;
> + };
> +
> + timer {
> + compatible = "arm,armv8-timer";
> + always-on;
> + interrupts = <1 13 0x108>,
> + <1 14 0x108>,
> + <1 11 0x108>,
> + <1 10 0x108>;
> + clock-frequency = <19200000>;
> + };
> +
> + gunyah-vm-config {
> + image-name = "linux_vm_0";
> +
> + memory {
> + #address-cells = <2>;
> + #size-cells = <2>;
> +
> + base-address = <0 0x80000000>;
> + };
> +
> + interrupts {
> + config = <&intc>;
> + };
> +
> + vcpus {
> + affinity-map = < 0 >;
> + sched-priority = < (-1) >;
> + sched-timeslice = < 2000 >;
> + };
> + };
> +};
On 3/3/23 7:06 PM, Elliot Berman wrote:
> On Qualcomm platforms, there is a firmware entity which controls access
> to physical pages. In order to share memory with another VM, this entity
> needs to be informed that the guest VM should have access to the memory.
Will Gunyah ever be used on something other than a Qualcomm
platform?
Is there really any need to have these "platform hooks"
conditionally compiled?
One other comment below.
-Alex
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Kconfig | 4 ++
> drivers/virt/gunyah/Makefile | 1 +
> drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 3 +
> drivers/virt/gunyah/rsc_mgr_rpc.c | 18 ++++-
> include/linux/gunyah_rsc_mgr.h | 17 +++++
> 6 files changed, 121 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
>
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index 1a737694c333..de815189dab6 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -4,6 +4,7 @@ config GUNYAH
> tristate "Gunyah Virtualization drivers"
> depends on ARM64
> depends on MAILBOX
> + select GUNYAH_PLATFORM_HOOKS
> help
> The Gunyah drivers are the helper interfaces that run in a guest VM
> such as basic inter-VM IPC and signaling mechanisms, and higher level
> @@ -11,3 +12,6 @@ config GUNYAH
>
> Say Y/M here to enable the drivers needed to interact in a Gunyah
> virtual environment.
> +
> +config GUNYAH_PLATFORM_HOOKS
> + tristate
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index ff8bc4925392..6b8f84dbfe0d 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> +obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>
> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
> new file mode 100644
> index 000000000000..60da0e154e98
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
> @@ -0,0 +1,80 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/rwsem.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include "rsc_mgr.h"
> +
> +static struct gh_rm_platform_ops *rm_platform_ops;
> +static DECLARE_RWSEM(rm_platform_ops_lock);
> +
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + int ret = 0;
> +
> + down_read(&rm_platform_ops_lock);
> + if (rm_platform_ops && rm_platform_ops->pre_mem_share)
> + ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
> + up_read(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
> +
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + int ret = 0;
> +
> + down_read(&rm_platform_ops_lock);
> + if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
> + ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
> + up_read(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
> +
> +int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
> +{
> + int ret = 0;
> +
> + down_write(&rm_platform_ops_lock);
> + if (!rm_platform_ops)
> + rm_platform_ops = platform_ops;
> + else
> + ret = -EEXIST;
> + up_write(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
> +
> +void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops)
> +{
> + down_write(&rm_platform_ops_lock);
> + if (rm_platform_ops == platform_ops)
> + rm_platform_ops = NULL;
> + up_write(&rm_platform_ops_lock);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
> +
> +static void _devm_gh_rm_unregister_platform_ops(void *data)
> +{
> + gh_rm_unregister_platform_ops(data);
> +}
> +
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gh_rm_platform_ops *ops)
> +{
> + int ret;
> +
> + ret = gh_rm_register_platform_ops(ops);
> + if (ret)
> + return ret;
> +
> + return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops, ops);
> +}
> +EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Platform Hooks");
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index 3665ebc7b020..6838e736f361 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -13,4 +13,7 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buf_size,
> void **resp_buf, size_t *resp_buf_size);
>
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +
> #endif
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index 3df15ad5b97d..733be4dc8dd2 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -204,6 +204,12 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
> if (!msg)
> return -ENOMEM;
>
> + ret = gh_rm_platform_pre_mem_share(rm, p);
> + if (ret) {
> + kfree(msg);
> + return ret;
> + }
> +
> req_header = msg;
> acl_section = (void *)req_header + sizeof(*req_header);
> mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
> @@ -227,8 +233,10 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
> ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
> kfree(msg);
>
> - if (ret)
> + if (ret) {
> + gh_rm_platform_post_mem_reclaim(rm, p);
> return ret;
> + }
>
> p->mem_handle = le32_to_cpu(*resp);
>
> @@ -283,8 +291,14 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> struct gh_rm_mem_release_req req = {
> .mem_handle = cpu_to_le32(parcel->mem_handle),
> };
> + int ret;
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
> + /* Do not call platform mem reclaim hooks: the reclaim didn't happen*/
Move the comment above the gh_rm_call() call, and rephrase, such as:
/* Only call platform mem reclaim hooks if... */
> + if (ret)
> + return ret;
>
> - return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
> + return gh_rm_platform_post_mem_reclaim(rm, parcel);
> }
>
> /**
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 8b0b46f28e39..515087931a2b 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -145,4 +145,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> struct gh_rm_hyp_resources **resources);
> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>
> +struct gh_rm_platform_ops {
> + int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> + int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +};
> +
> +#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
> +int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops);
> +void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops);
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gh_rm_platform_ops *ops);
> +#else
> +static inline int gh_rm_register_platform_ops(struct gh_rm_platform_ops *platform_ops)
> + { return 0; }
> +static inline void gh_rm_unregister_platform_ops(struct gh_rm_platform_ops *platform_ops) { }
> +static inline int devm_gh_rm_register_platform_ops(struct device *dev,
> + struct gh_rm_platform_ops *ops) { return 0; }
> +#endif
> +
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> When booting a Gunyah virtual machine, the host VM may gain capabilities
> to interact with resources for the guest virtual machine. Examples of
> such resources are vCPUs or message queues. To use those resources, we
> need to translate the RM response into a gunyah_resource structure which
> is useful to Linux drivers. Presently, Linux drivers need only to know
> the type of resource, the capability ID, and an interrupt.
>
> On ARM64 systems, the interrupt reported by Gunyah is the GIC interrupt
> ID number and always a SPI.
>
> Signed-off-by: Elliot Berman <[email protected]>
Several comments here, nothing major. -Alex
> ---
> arch/arm64/include/asm/gunyah.h | 23 +++++
> drivers/virt/gunyah/rsc_mgr.c | 163 +++++++++++++++++++++++++++++++-
> include/linux/gunyah.h | 4 +
> include/linux/gunyah_rsc_mgr.h | 3 +
> 4 files changed, 192 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/gunyah.h
>
> diff --git a/arch/arm64/include/asm/gunyah.h b/arch/arm64/include/asm/gunyah.h
> new file mode 100644
> index 000000000000..64cfb964efee
> --- /dev/null
> +++ b/arch/arm64/include/asm/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +#ifndef __ASM_GUNYAH_H_
> +#define __ASM_GUNYAH_H_
Maybe just one _ at the beginning and none at the end?
Follow the same convention across all your header files.
(Maybe you're looking at other files in the same directory
as this one, but that's not consistent.)
> +
> +#include <linux/irq.h>
> +#include <dt-bindings/interrupt-controller/arm-gic.h>
> +
> +static inline int arch_gh_fill_irq_fwspec_params(u32 virq, struct irq_fwspec *fwspec)
> +{
> + if (virq < 32 || virq > 1019)
> + return -EINVAL;
What is special about VIRQs greater than 1019 (minus 32)?
It's probably documented somewhere but it's worth adding a
comment here to explain the check.
You would know better than I, but could/should the caller
be responsible for this check? (Not a big deal.)
> +
> + fwspec->param_count = 3;
> + fwspec->param[0] = GIC_SPI;
> + fwspec->param[1] = virq - 32;
And why is 32 subtracted?
> + fwspec->param[2] = IRQ_TYPE_EDGE_RISING;
> + return 0;
> +}
> +
> +#endif
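On the two questions above about 1019 and 32: in the GIC architecture, INTIDs 0-15 are SGIs, 16-31 are PPIs, 32-1019 are SPIs, and 1020-1023 are reserved for special purposes. The devicetree GIC binding numbers SPIs from zero, which would explain both the range check and the subtraction. A standalone sketch of that mapping, using hypothetical constant and function names (an illustration of the idea, not the patch's code):

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical names for the GIC INTID range that holds SPIs. */
#define GIC_SPI_INTID_FIRST	32u	/* INTIDs 0-31 are SGIs and PPIs */
#define GIC_SPI_INTID_LAST	1019u	/* INTIDs 1020-1023 are special */

/*
 * Translate a GIC INTID into the zero-based SPI number used by the
 * devicetree GIC binding. Returns 0 on success, -EINVAL if the INTID
 * is not an SPI.
 */
static int intid_to_spi(uint32_t intid, uint32_t *spi)
{
	if (intid < GIC_SPI_INTID_FIRST || intid > GIC_SPI_INTID_LAST)
		return -EINVAL;

	*spi = intid - GIC_SPI_INTID_FIRST;
	return 0;
}
```

Named constants along these lines, or a one-line comment in arch_gh_fill_irq_fwspec_params(), would make the check self-explanatory.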
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index d7ce692d0067..383be5ac0f44 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -17,6 +17,8 @@
> #include <linux/platform_device.h>
> #include <linux/miscdevice.h>
>
> +#include <asm/gunyah.h>
> +
> #include "rsc_mgr.h"
> #include "vm_mgr.h"
>
> @@ -132,6 +134,7 @@ struct gh_rm_connection {
> * @send_lock: synchronization to allow only one request to be sent at a time
> * @nh: notifier chain for clients interested in RM notification messages
> * @miscdev: /dev/gunyah
> + * @irq_domain: Domain to translate Gunyah hwirqs to Linux irqs
> */
> struct gh_rm {
> struct device *dev;
> @@ -150,6 +153,7 @@ struct gh_rm {
> struct blocking_notifier_head nh;
>
> struct miscdevice miscdev;
> + struct irq_domain *irq_domain;
> };
>
> /**
> @@ -190,6 +194,134 @@ static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
> }
> }
>
> +struct gh_irq_chip_data {
> + u32 gh_virq;
> +};
> +
> +static struct irq_chip gh_rm_irq_chip = {
> + .name = "Gunyah",
> + .irq_enable = irq_chip_enable_parent,
> + .irq_disable = irq_chip_disable_parent,
> + .irq_ack = irq_chip_ack_parent,
> + .irq_mask = irq_chip_mask_parent,
> + .irq_mask_ack = irq_chip_mask_ack_parent,
> + .irq_unmask = irq_chip_unmask_parent,
> + .irq_eoi = irq_chip_eoi_parent,
> + .irq_set_affinity = irq_chip_set_affinity_parent,
> + .irq_set_type = irq_chip_set_type_parent,
> + .irq_set_wake = irq_chip_set_wake_parent,
> + .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
> + .irq_retrigger = irq_chip_retrigger_hierarchy,
> + .irq_get_irqchip_state = irq_chip_get_parent_state,
> + .irq_set_irqchip_state = irq_chip_set_parent_state,
> + .flags = IRQCHIP_SET_TYPE_MASKED |
> + IRQCHIP_SKIP_SET_WAKE |
> + IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int gh_rm_irq_domain_alloc(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs,
> + void *arg)
> +{
> + struct gh_irq_chip_data *chip_data, *spec = arg;
> + struct irq_fwspec parent_fwspec;
> + struct gh_rm *rm = d->host_data;
> + u32 gh_virq = spec->gh_virq;
> + int ret;
> +
> + if (nr_irqs != 1 || gh_virq == U32_MAX)
Does U32_MAX have special meaning? Why are you checking for it?
Whatever it is, you should explain why this is invalid here.
> + return -EINVAL;
> +
> + chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL);
> + if (!chip_data)
> + return -ENOMEM;
> +
> + chip_data->gh_virq = gh_virq;
> +
> + ret = irq_domain_set_hwirq_and_chip(d, virq, chip_data->gh_virq, &gh_rm_irq_chip,
> + chip_data);
> + if (ret)
> + goto err_free_irq_data;
> +
> + parent_fwspec.fwnode = d->parent->fwnode;
> + ret = arch_gh_fill_irq_fwspec_params(chip_data->gh_virq, &parent_fwspec);
> + if (ret) {
> + dev_err(rm->dev, "virq translation failed %u: %d\n", chip_data->gh_virq, ret);
> + goto err_free_irq_data;
> + }
> +
> + ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, &parent_fwspec);
> + if (ret)
> + goto err_free_irq_data;
> +
> + return ret;
> +err_free_irq_data:
> + kfree(chip_data);
> + return ret;
> +}
> +
> +static void gh_rm_irq_domain_free_single(struct irq_domain *d, unsigned int virq)
> +{
> + struct gh_irq_chip_data *chip_data;
No need to define chip_data.
> + struct irq_data *irq_data;
> +
> + irq_data = irq_domain_get_irq_data(d, virq);
> + if (!irq_data)
> + return;
> +
> + chip_data = irq_data->chip_data;
> +
> + kfree(chip_data);
Just call kfree(irq_data->chip_data);
> + irq_data->chip_data = NULL;
> +}
> +
> +static void gh_rm_irq_domain_free(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < nr_irqs; i++)
> + gh_rm_irq_domain_free_single(d, virq + i);
> +}
> +
> +static const struct irq_domain_ops gh_rm_irq_domain_ops = {
> + .alloc = gh_rm_irq_domain_alloc,
> + .free = gh_rm_irq_domain_free,
> +};
> +
> +struct gh_resource *gh_rm_alloc_resource(struct gh_rm *rm, struct gh_rm_hyp_resource *hyp_resource)
> +{
> + struct gh_resource *ghrsc;
> +
> + ghrsc = kzalloc(sizeof(*ghrsc), GFP_KERNEL);
> + if (!ghrsc)
> + return NULL;
> +
> + ghrsc->type = hyp_resource->type;
> + ghrsc->capid = le64_to_cpu(hyp_resource->cap_id);
> + ghrsc->irq = IRQ_NOTCONNECTED;
> + ghrsc->rm_label = le32_to_cpu(hyp_resource->resource_label);
> + if (hyp_resource->virq && le32_to_cpu(hyp_resource->virq) != U32_MAX) {
Again, does U32_MAX have a particular meaning here?
> + struct gh_irq_chip_data irq_data = {
> + .gh_virq = le32_to_cpu(hyp_resource->virq),
> + };
> +
> + ghrsc->irq = irq_domain_alloc_irqs(rm->irq_domain, 1, NUMA_NO_NODE, &irq_data);
> + if (ghrsc->irq < 0) {
> + dev_err(rm->dev,
> + "Failed to allocate interrupt for resource %d label: %d: %d\n",
> + ghrsc->type, ghrsc->rm_label, ghrsc->irq);
> + ghrsc->irq = IRQ_NOTCONNECTED;
ghrsc->irq already had that value. You could use a local
variable irq to hold the value, and then assign ghrsc->irq
after you know it's good.
> + }
> + }
> +
> + return ghrsc;
> +}
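To illustrate the local-variable suggestion above: the irq field of struct gh_resource is declared unsigned int, so a "< 0" test on it can never be true. Holding the allocator's signed return value in a local keeps the error check meaningful. A minimal userspace sketch, where the struct, the stub allocators and the IRQ_NOTCONNECTED value are simplified stand-ins rather than the kernel definitions:

```c
/* Stand-in for the kernel's "no interrupt" marker. */
#define IRQ_NOTCONNECTED	(1U << 31)

/* Simplified stand-in for struct gh_resource. */
struct resource_sketch {
	unsigned int irq;
};

/* Hypothetical allocator type: returns an irq number or a negative errno. */
typedef int (*alloc_irq_fn)(void);

/* Stub allocators used only to exercise both paths. */
static int alloc_ok(void)   { return 42; }
static int alloc_fail(void) { return -12; }

static void resource_set_irq(struct resource_sketch *rsc, alloc_irq_fn alloc)
{
	int irq;

	rsc->irq = IRQ_NOTCONNECTED;

	/* Keep the signed return value in a local so the check works. */
	irq = alloc();
	if (irq < 0)
		return;		/* rsc->irq keeps IRQ_NOTCONNECTED */

	rsc->irq = (unsigned int)irq;
}
```

The field is only overwritten once the value is known to be good, which also removes the redundant reassignment in the error branch.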
> +
> +void gh_rm_free_resource(struct gh_resource *ghrsc)
> +{
> + irq_dispose_mapping(ghrsc->irq);
> + kfree(ghrsc);
> +}
> +
> static int gh_rm_init_connection_payload(struct gh_rm_connection *connection, void *msg,
> size_t hdr_size, size_t msg_size)
> {
> @@ -639,6 +771,8 @@ static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool t
>
> static int gh_rm_drv_probe(struct platform_device *pdev)
> {
> + struct irq_domain *parent_irq_domain;
> + struct device_node *parent_irq_node;
> struct gh_msgq_tx_data *msg;
> struct gh_rm *rm;
> int ret;
> @@ -675,15 +809,41 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
> if (ret)
> goto err_cache;
>
> + parent_irq_node = of_irq_find_parent(pdev->dev.of_node);
> + if (!parent_irq_node) {
> + dev_err(&pdev->dev, "Failed to find interrupt parent of resource manager\n");
> + ret = -ENODEV;
> + goto err_msgq;
> + }
> +
> + parent_irq_domain = irq_find_host(parent_irq_node);
> + if (!parent_irq_domain) {
> + dev_err(&pdev->dev, "Failed to find interrupt parent domain of resource manager\n");
> + ret = -ENODEV;
> + goto err_msgq;
> + }
> +
> + rm->irq_domain = irq_domain_add_hierarchy(parent_irq_domain, 0, 0, pdev->dev.of_node,
> + &gh_rm_irq_domain_ops, NULL);
> + if (!rm->irq_domain) {
> + dev_err(&pdev->dev, "Failed to add irq domain\n");
> + ret = -ENODEV;
> + goto err_msgq;
> + }
> + rm->irq_domain->host_data = rm;
> +
> + rm->miscdev.parent = &pdev->dev;
> rm->miscdev.name = "gunyah";
> rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> rm->miscdev.fops = &gh_dev_fops;
>
> ret = misc_register(&rm->miscdev);
> if (ret)
> - goto err_msgq;
> + goto err_irq_domain;
>
> return 0;
> +err_irq_domain:
> + irq_domain_remove(rm->irq_domain);
> err_msgq:
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> @@ -697,6 +857,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
> struct gh_rm *rm = platform_get_drvdata(pdev);
>
> misc_deregister(&rm->miscdev);
> + irq_domain_remove(rm->irq_domain);
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> kmem_cache_destroy(rm->cache);
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 378bec0f2ce1..3e706b59d2c0 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -27,6 +27,10 @@ struct gh_resource {
> enum gh_resource_type type;
> u64 capid;
> unsigned int irq;
> +
> + /* To help allocator in vm manager */
I don't find the above comment helpful.
> + struct list_head list;
> + u32 rm_label;
> };
>
> /**
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index acf8c1545a6c..58693c27cf1a 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -145,6 +145,9 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> struct gh_rm_hyp_resources **resources);
> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>
> +struct gh_resource *gh_rm_alloc_resource(struct gh_rm *rm, struct gh_rm_hyp_resource *hyp_resource);
> +void gh_rm_free_resource(struct gh_resource *ghrsc);
> +
> struct gh_rm_platform_ops {
> int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Introduce a framework for Gunyah userspace to install VM functions. VM
> functions are optional interfaces to the virtual machine. vCPUs,
> ioeventfds, and irqfds are examples of such VM functions and are
> implemented in subsequent patches.
>
> A generic framework is implemented instead of individual ioctls to
> create vCPUs, irqfds, etc., in order to simplify the VM manager core
> implementation and allow dynamic loading of VM function modules.
>
> Signed-off-by: Elliot Berman <[email protected]>
I found two bugs here, and have some suggestions that might
improve code readability.
-Alex
> ---
> Documentation/virt/gunyah/vm-manager.rst | 18 ++
> drivers/virt/gunyah/vm_mgr.c | 208 ++++++++++++++++++++++-
> drivers/virt/gunyah/vm_mgr.h | 4 +
> include/linux/gunyah_vm_mgr.h | 73 ++++++++
> include/uapi/linux/gunyah.h | 17 ++
> 5 files changed, 316 insertions(+), 4 deletions(-)
> create mode 100644 include/linux/gunyah_vm_mgr.h
>
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> index 1b4aa18670a3..af8ad88a88ab 100644
> --- a/Documentation/virt/gunyah/vm-manager.rst
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -17,6 +17,24 @@ sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
> ioctl. The VM itself is configured to use the memory region via the
> devicetree.
>
> +Gunyah Functions
> +================
> +
> +Components of a Gunyah VM's configuration that need kernel configuration are
> +called "functions" and are built on top of a framework. Functions are identified
> +by a string and have some argument(s) to configure them. They are typically
> +created by the `GH_VM_ADD_FUNCTION` ioctl.
> +
> +Functions typically do at least one of these operations:
> +
> +1. Create resource ticket(s). Resource tickets allow a function to register
> + itself as the client for a Gunyah resource (e.g. doorbell or vCPU) and
> + the function is given the pointer to the `struct gh_resource` when the
> + VM is starting.
> +
> +2. Register IO handler(s). IO handlers allow a function to handle stage-2 faults
> + from the virtual machine.
> +
> Sample Userspace VMM
> ====================
>
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index 299b9bb81edc..88db011395ec 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -6,16 +6,165 @@
> #define pr_fmt(fmt) "gh_vm_mgr: " fmt
>
> #include <linux/anon_inodes.h>
> +#include <linux/compat.h>
> #include <linux/file.h>
> #include <linux/gunyah_rsc_mgr.h>
> +#include <linux/gunyah_vm_mgr.h>
> #include <linux/miscdevice.h>
> #include <linux/mm.h>
> #include <linux/module.h>
> +#include <linux/xarray.h>
>
> #include <uapi/linux/gunyah.h>
>
> #include "vm_mgr.h"
>
> +static DEFINE_XARRAY(functions);
> +
> +int gh_vm_function_register(struct gh_vm_function *fn)
> +{
> + if (!fn->bind || !fn->unbind)
> + return -EINVAL;
> +
> + return xa_err(xa_store(&functions, fn->type, fn, GFP_KERNEL));
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_function_register);
> +
I would move gh_vm_remove_function_instance() down, grouping it
more closely with the code that uses it.
> +static void gh_vm_remove_function_instance(struct gh_vm_function_instance *inst)
> + __must_hold(&inst->ghvm->fn_lock)
> +{
> + inst->fn->unbind(inst);
> + list_del(&inst->vm_list);
> + module_put(inst->fn->mod);
> + kfree(inst->argp);
> + kfree(inst);
> +}
> +
> +void gh_vm_function_unregister(struct gh_vm_function *fn)
> +{
> + /* Expecting unregister to only come when unloading a module */
> + WARN_ON(fn->mod && module_refcount(fn->mod));
> + xa_erase(&functions, fn->type);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_function_unregister);
> +
You define gh_vm_get_function(), but you don't define the matching
gh_vm_put_function() abstraction. Instead, you just expect the
caller to know that they should call module_put(). Even if it's
a simple wrapper, please define gh_vm_put_function().
> +static struct gh_vm_function *gh_vm_get_function(u32 type)
> +{
> + struct gh_vm_function *fn;
> + int r;
> +
> + fn = xa_load(&functions, type);
> + if (!fn) {
> + r = request_module("ghfunc:%d", type);
> + if (r)
> + return ERR_PTR(r);
> +
> + fn = xa_load(&functions, type);
> + }
> +
> + if (!fn || !try_module_get(fn->mod))
> + fn = ERR_PTR(-ENOENT);
> +
> + return fn;
> +}
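A sketch of the get/put pairing suggested above, written as a userspace mock so it stands alone. All names here are hypothetical stand-ins: a plain counter replaces the module reference count, and the real wrapper would presumably just call module_put(fn->mod).

```c
#include <stddef.h>

/* Simplified stand-ins for struct module and struct gh_vm_function. */
struct module_sketch {
	int refcount;
};

struct vm_function_sketch {
	struct module_sketch *mod;
};

/* Stand-in for try_module_get(): always succeeds here. */
static int module_get_sketch(struct module_sketch *mod)
{
	mod->refcount++;
	return 1;
}

/* Stand-in for module_put(). */
static void module_put_sketch(struct module_sketch *mod)
{
	mod->refcount--;
}

/* Take a reference on the function's module; NULL on failure. */
static struct vm_function_sketch *function_get(struct vm_function_sketch *fn)
{
	if (!fn || !module_get_sketch(fn->mod))
		return NULL;
	return fn;
}

/* Matching release, so callers never touch the module directly. */
static void function_put(struct vm_function_sketch *fn)
{
	module_put_sketch(fn->mod);
}
```

Callers then pair function_get() with function_put() and do not need to know that a module reference is what is being held.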
> +
> +static long gh_vm_add_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
This is adding a function *instance*. Maybe it would be clearer
if you included that in the name.
> +{
> + struct gh_vm_function_instance *inst;
> + void __user *argp;
> + long r = 0;
> +
> + if (f->arg_size > GH_FN_MAX_ARG_SIZE) {
> + dev_err(ghvm->parent, "%s: arg_size > %d\n", __func__, GH_FN_MAX_ARG_SIZE);
> + return -EINVAL;
> + }
> +
> + inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + inst->arg_size = f->arg_size;
> + if (inst->arg_size) {
> + inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
> + if (!inst->argp) {
> + r = -ENOMEM;
> + goto free;
> + }
> +
> + argp = u64_to_user_ptr(f->arg);
> + if (copy_from_user(inst->argp, argp, f->arg_size)) {
> + r = -EFAULT;
> + goto free_arg;
> + }
> + }
> +
> + inst->fn = gh_vm_get_function(f->type);
> + if (IS_ERR(inst->fn)) {
> + r = PTR_ERR(inst->fn);
> + goto free_arg;
> + }
> +
> + inst->ghvm = ghvm;
> + inst->rm = ghvm->rm;
> +
> + mutex_lock(&ghvm->fn_lock);
> + r = inst->fn->bind(inst);
> + if (r < 0) {
You need to unlock the mutex here. This is a BUG.
> + module_put(inst->fn->mod);
> + goto free_arg;
Perhaps you should add a new label in the error path and
unlock the mutex and put the function reference there.
> + }
> +
> + list_add(&inst->vm_list, &ghvm->functions);
> + mutex_unlock(&ghvm->fn_lock);
> +
> + return r;
> +free_arg:
> + kfree(inst->argp);
> +free:
> + kfree(inst);
> + return r;
> +}
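The error-path shape suggested above could look roughly like the following. This is a userspace mock under stated assumptions: lock_sketch stands in for the fn_lock mutex, bind_sketch() for the function's bind callback, and calloc()/free() for the kernel allocators. None of these names come from the patch.

```c
#include <errno.h>
#include <stdlib.h>

/* Trivial stand-in for a mutex, tracking only whether it is held. */
struct lock_sketch {
	int held;
};

static void lock_sketch_lock(struct lock_sketch *l)   { l->held = 1; }
static void lock_sketch_unlock(struct lock_sketch *l) { l->held = 0; }

struct instance_sketch {
	int bound;
};

static struct lock_sketch fn_lock_sketch;

/* Hypothetical bind callback: fails when asked to. */
static int bind_sketch(struct instance_sketch *inst, int should_fail)
{
	if (should_fail)
		return -EINVAL;
	inst->bound = 1;
	return 0;
}

static int add_instance(int should_fail, struct instance_sketch **out)
{
	struct instance_sketch *inst;
	int r;

	inst = calloc(1, sizeof(*inst));
	if (!inst)
		return -ENOMEM;

	lock_sketch_lock(&fn_lock_sketch);
	r = bind_sketch(inst, should_fail);
	if (r < 0)
		goto err_unlock;

	*out = inst;
	lock_sketch_unlock(&fn_lock_sketch);
	return 0;

err_unlock:
	/* Every exit taken after the lock must release it. */
	lock_sketch_unlock(&fn_lock_sketch);
	free(inst);
	return r;
}
```

In the real function the new label would sit above free_arg and would also drop the function reference before falling through to the existing cleanup.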
> +
> +static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
This is removing a function *instance*, right?
> +{
> + struct gh_vm_function_instance *inst, *iter;
> + void __user *user_argp;
> + void *argp;
> + long r = 0;
> +
> + r = mutex_lock_interruptible(&ghvm->fn_lock);
> + if (r)
> + return r;
> +
> + if (f->arg_size) {
This is a BUG. You aren't freeing things; you seem
to have just duplicated the allocation code.
Actually, I think this is just a block of copied code
that's here by mistake. The loop might be doing
the removal you intend.
> + argp = kzalloc(f->arg_size, GFP_KERNEL);
> + if (!argp) {
> + r = -ENOMEM;
> + goto out;
> + }
> +
> + user_argp = u64_to_user_ptr(f->arg);
> + if (copy_from_user(argp, user_argp, f->arg_size)) {
> + r = -EFAULT;
> + kfree(argp);
> + goto out;
> + }
>
I *think* the for loop and freeing the argument here
still does what you want.
> + list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
> + if (inst->fn->type == f->type &&
> + f->arg_size == inst->arg_size &&
> + !memcmp(argp, inst->argp, f->arg_size))
> + gh_vm_remove_function_instance(inst);
> + }
> +
> + kfree(argp);
> + }
> +
> +out:
> + mutex_unlock(&ghvm->fn_lock);
> + return r;
> +}
> +
> static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
> {
> struct gh_rm_vm_status_payload *payload = data;
> @@ -80,6 +229,7 @@ static void gh_vm_stop(struct gh_vm *ghvm)
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + struct gh_vm_function_instance *inst, *iiter;
> struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> @@ -90,6 +240,12 @@ static void gh_vm_free(struct work_struct *work)
> case GH_RM_VM_STATUS_INIT_FAILED:
> case GH_RM_VM_STATUS_LOAD:
> case GH_RM_VM_STATUS_EXITED:
> + mutex_lock(&ghvm->fn_lock);
> + list_for_each_entry_safe(inst, iiter, &ghvm->functions, vm_list) {
> + gh_vm_remove_function_instance(inst);
> + }
> + mutex_unlock(&ghvm->fn_lock);
> +
> mutex_lock(&ghvm->mm_lock);
> list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> gh_vm_mem_reclaim(ghvm, mapping);
> @@ -117,6 +273,28 @@ static void gh_vm_free(struct work_struct *work)
> }
> }
>
> +static void _gh_vm_put(struct kref *kref)
> +{
> + struct gh_vm *ghvm = container_of(kref, struct gh_vm, kref);
> +
> + /* VM will be reset and make RM calls which can interruptible sleep.
> + * Defer to a work so this thread can receive signal.
> + */
> + schedule_work(&ghvm->free_work);
> +}
> +
> +int __must_check gh_vm_get(struct gh_vm *ghvm)
> +{
> + return kref_get_unless_zero(&ghvm->kref);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_get);
> +
Maybe move _gh_vm_put() here.
> +void gh_vm_put(struct gh_vm *ghvm)
> +{
> + kref_put(&ghvm->kref, _gh_vm_put);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_put);
> +
> static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> {
> struct gh_vm *ghvm;
> @@ -150,6 +328,8 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> INIT_LIST_HEAD(&ghvm->memory_mappings);
> init_rwsem(&ghvm->status_lock);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
> + kref_init(&ghvm->kref);
> + INIT_LIST_HEAD(&ghvm->functions);
> ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>
> return ghvm;
> @@ -302,6 +482,29 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> r = gh_vm_ensure_started(ghvm);
> break;
> }
> + case GH_VM_ADD_FUNCTION: {
> + struct gh_fn_desc f;
> +
> + if (copy_from_user(&f, argp, sizeof(f)))
> + return -EFAULT;
> +
> + r = gh_vm_add_function(ghvm, &f);
> + break;
> + }
> + case GH_VM_REMOVE_FUNCTION: {
To be clear, this is removing a function *instance*.
(I'm not suggesting you change the name.)
> + struct gh_fn_desc *f;
> +
> + f = kzalloc(sizeof(*f), GFP_KERNEL);
Why do you allocate a function descriptor here dynamically,
while when adding a function you just define a descriptor
as a local variable (on the stack)? It looks to me like
you should be able to do it the same way here and avoid
the possibility of a kzalloc() failure.
> + if (!f)
> + return -ENOMEM;
> +
> + if (copy_from_user(f, argp, sizeof(*f)))
> + return -EFAULT;
If the copy_from_user() fails you will have leaked the
memory allocated for the function descriptor. This is a BUG.
> +
> + r = gh_vm_rm_function(ghvm, f);
> + kfree(f);
> + break;
> + }
> default:
> r = -ENOTTY;
> break;
. . .
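On the GH_VM_REMOVE_FUNCTION case above: keeping the descriptor on the stack, as the ADD case already does, removes both the allocation failure and the leak on a failed copy. A standalone sketch of that shape, where copy_sketch() is a stand-in for copy_from_user() and the struct layout and remove_sketch() are hypothetical simplifications:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct gh_fn_desc. */
struct fn_desc_sketch {
	uint32_t type;
	uint32_t arg_size;
	uint64_t arg;
};

/*
 * Stand-in for copy_from_user(): returns the number of bytes NOT copied,
 * so 0 means success. A NULL source models a faulting user pointer.
 */
static unsigned long copy_sketch(void *dst, const void *src, unsigned long n)
{
	if (!src)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical removal routine; just reports the type it was given. */
static long remove_sketch(const struct fn_desc_sketch *f)
{
	return (long)f->type;
}

static long handle_remove(const void *user_arg)
{
	struct fn_desc_sketch f;	/* on the stack: nothing to free */

	if (copy_sketch(&f, user_arg, sizeof(f)))
		return -EFAULT;

	return remove_sketch(&f);
}
```

With no heap allocation there is no cleanup to forget on the early return.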
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Gunyah allows host virtual machines to schedule guest virtual machines
> and handle their MMIO accesses. vCPUs are presented to the host as a
> Gunyah resource and represented to userspace as a Gunyah VM function.
>
> Creating the vcpu VM function will create a file descriptor that:
> - can run an ioctl: GH_VCPU_RUN to schedule the guest vCPU until the
> next interrupt occurs on the host or when the guest vCPU can no
> longer be run.
> - can be mmap'd to share a gh_vcpu_run structure which can look up the
> reason why GH_VCPU_RUN returned and provide return values for MMIO
> access.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
I suggest reorganizing and renaming a few things here, but I don't
think there's anything major.
-Alex
> ---
> Documentation/virt/gunyah/vm-manager.rst | 46 ++-
> arch/arm64/gunyah/gunyah_hypercall.c | 28 ++
> drivers/virt/gunyah/Kconfig | 11 +
> drivers/virt/gunyah/Makefile | 2 +
> drivers/virt/gunyah/gunyah_vcpu.c | 465 +++++++++++++++++++++++
> drivers/virt/gunyah/vm_mgr.c | 4 +
> drivers/virt/gunyah/vm_mgr.h | 1 +
> include/linux/gunyah.h | 8 +
> include/uapi/linux/gunyah.h | 108 ++++++
> 9 files changed, 671 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
>
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> index af8ad88a88ab..83d326b0d11f 100644
> --- a/Documentation/virt/gunyah/vm-manager.rst
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -5,8 +5,7 @@ Virtual Machine Manager
> =======================
>
> The Gunyah Virtual Machine Manager is a Linux driver to support launching
> -virtual machines using Gunyah. It presently supports launching non-proxy
> -scheduled Linux-like virtual machines.
> +virtual machines using Gunyah.
>
> Except for some basic information about the location of initial binaries,
> most of the configuration about a Gunyah virtual machine is described in the
> @@ -107,3 +106,46 @@ GH_VM_START
> ~~~~~~~~~~~
>
> This ioctl starts the VM.
> +
> +GH_VM_ADD_FUNCTION
> +~~~~~~~~~~~~~~~~~~
> +
> +This ioctl registers a Gunyah VM function with the VM manager. The VM function
> +is described with a `type` string and some arguments for that type. Typically,
> +the function is added before the VM starts, but the function doesn't "operate"
> +until the VM starts with GH_VM_START: e.g. vCPU ioctls will all return an error
> +until the VM starts because the vCPUs don't exist until the VM is started. This
> +allows the VMM to set up all the kernel functionality needed for the VM *before*
> +the VM starts.
> +
> +.. kernel-doc:: include/uapi/linux/gunyah.h
> + :identifiers: gh_fn_desc
> +
> +The possible types are documented below:
> +
> +.. kernel-doc:: include/uapi/linux/gunyah.h
> + :identifiers: GH_FN_VCPU gh_fn_vcpu_arg
> +
> +Gunyah VCPU API Descriptions
> +----------------------------
> +
> +A vCPU file descriptor is created after calling `GH_VM_ADD_FUNCTION` with the type `GH_FN_VCPU`.
> +
> +GH_VCPU_RUN
> +~~~~~~~~~~~
> +
> +This ioctl is used to run a guest virtual cpu. While there are no
> +explicit parameters, there is an implicit parameter block that can be
> +obtained by mmap()ing the vcpu fd at offset 0, with the size given by
> +GH_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
> +gh_vcpu_run' (see below).
> +
> +GH_VCPU_MMAP_SIZE
> +~~~~~~~~~~~~~~~~~
> +
> +The GH_VCPU_RUN ioctl communicates with userspace via a shared
> +memory region. This ioctl returns the size of that region. See the
> +GH_VCPU_RUN documentation for details.
> +
> +.. kernel-doc:: include/uapi/linux/gunyah.h
> + :identifiers: gh_vcpu_run gh_vm_exit_info
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> index 3420d8f286a9..f01f5cec4d23 100644
> --- a/arch/arm64/gunyah/gunyah_hypercall.c
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -43,6 +43,7 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
> #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> #define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
> #define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
> +#define GH_HYPERCALL_VCPU_RUN GH_HYPERCALL(0x8065)
>
> /**
> * gh_hypercall_hyp_identify() - Returns build information and feature flags
> @@ -91,5 +92,32 @@ enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t
> }
> EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
>
> +enum gh_error gh_hypercall_vcpu_run(u64 capid, u64 *resume_data,
> + struct gh_hypercall_vcpu_run_resp *resp)
> +{
> + struct arm_smccc_1_2_regs args = {
> + .a0 = GH_HYPERCALL_VCPU_RUN,
> + .a1 = capid,
> + .a2 = resume_data[0],
> + .a3 = resume_data[1],
> + .a4 = resume_data[2],
> + /* C language says this will be implicitly zero. Gunyah requires 0, so be explicit */
> + .a5 = 0,
> + };
> + struct arm_smccc_1_2_regs res;
> +
> + arm_smccc_1_2_hvc(&args, &res);
> +
> + if (res.a0 == GH_ERROR_OK) {
> + resp->state = res.a1;
> + resp->state_data[0] = res.a2;
> + resp->state_data[1] = res.a3;
> + resp->state_data[2] = res.a4;
> + }
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_vcpu_run);
> +
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index de815189dab6..4c1c6110b50e 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -15,3 +15,14 @@ config GUNYAH
>
> config GUNYAH_PLATFORM_HOOKS
> tristate
> +
> +config GUNYAH_VCPU
> + tristate "Runnable Gunyah vCPUs"
> + depends on GUNYAH
> + help
> + Enable kernel support for host-scheduled vCPUs running under Gunyah.
> + When selecting this option, userspace virtual machine managers (VMM)
> + can schedule the guest VM's vCPUs instead of using Gunyah's scheduler.
> + VMMs can also handle stage 2 faults of the vCPUs.
> +
> + Say Y/M here if unsure and you want to support Gunyah VMMs.
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 6b8f84dbfe0d..2d1b604a7b03 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -5,3 +5,5 @@ obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>
> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> +
> +obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
> diff --git a/drivers/virt/gunyah/gunyah_vcpu.c b/drivers/virt/gunyah/gunyah_vcpu.c
> new file mode 100644
> index 000000000000..870e471a11df
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_vcpu.c
> @@ -0,0 +1,465 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/gunyah.h>
> +#include <linux/gunyah_vm_mgr.h>
> +#include <linux/interrupt.h>
> +#include <linux/kref.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/types.h>
> +#include <linux/wait.h>
> +
> +#include "vm_mgr.h"
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#define MAX_VCPU_NAME 20 /* gh-vcpu:u32_max+NUL */
> +
> +struct gh_vcpu {
> + struct gh_vm_function_instance *f;
> + struct gh_resource *rsc;
> + struct mutex run_lock;
> + /* Track why vcpu_run left last time around. */
> + enum {
> + GH_VCPU_UNKNOWN = 0,
> + GH_VCPU_READY,
> + GH_VCPU_MMIO_READ,
> + GH_VCPU_SYSTEM_DOWN,
> + } state;
> + u8 mmio_read_len;
> + struct gh_vcpu_run *vcpu_run;
> + struct completion ready;
> + struct gh_vm *ghvm;
> +
> + struct notifier_block nb;
> + struct gh_vm_resource_ticket ticket;
> + struct kref kref;
> +};
> +
Here again, I suggest defining the states using an enumerated
type. Then add kernel-doc comments to describe them, rather
than these one-line comments. I like enums because they give
you a way to refer to the group of values by name.
> +/* VCPU is ready to run */
> +#define GH_VCPU_STATE_READY 0
> +/* VCPU is sleeping until an interrupt arrives */
> +#define GH_VCPU_STATE_EXPECTS_WAKEUP 1
> +/* VCPU is powered off */
> +#define GH_VCPU_STATE_POWERED_OFF 2
> +/* VCPU is blocked in EL2 for unspecified reason */
> +#define GH_VCPU_STATE_BLOCKED 3
> +/* VCPU has returned for MMIO READ */
> +#define GH_VCPU_ADDRSPACE_VMMIO_READ 4
> +/* VCPU has returned for MMIO WRITE */
> +#define GH_VCPU_ADDRSPACE_VMMIO_WRITE 5
The states above are used as values held in the state field
of the gh_hypercall_vcpu_run_resp structure. I find it
confusing that you define them here right below the gh_vcpu
structure (which also has a "state" field--even though its
values are listed above).
Perhaps these values should be defined in <linux/gunyah.h>
instead, where gh_hypercall_vcpu_run_resp is defined. I
realize that even if you did that, they'd only be used
in this file.
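Something along these lines is what I have in mind. This is an
untested sketch that stands alone outside the kernel; the enum name
gh_vcpu_run_state and the helper are made up for illustration, and
only the values and descriptions come from your patch:

```c
#include <stdbool.h>

/**
 * enum gh_vcpu_run_state - vCPU state reported by the vcpu_run hypercall
 * @GH_VCPU_STATE_READY: vCPU is ready to run
 * @GH_VCPU_STATE_EXPECTS_WAKEUP: vCPU is sleeping until an interrupt arrives
 * @GH_VCPU_STATE_POWERED_OFF: vCPU is powered off
 * @GH_VCPU_STATE_BLOCKED: vCPU is blocked in EL2 for an unspecified reason
 * @GH_VCPU_ADDRSPACE_VMMIO_READ: vCPU has returned for an MMIO read
 * @GH_VCPU_ADDRSPACE_VMMIO_WRITE: vCPU has returned for an MMIO write
 */
enum gh_vcpu_run_state {
	GH_VCPU_STATE_READY		= 0,
	GH_VCPU_STATE_EXPECTS_WAKEUP	= 1,
	GH_VCPU_STATE_POWERED_OFF	= 2,
	GH_VCPU_STATE_BLOCKED		= 3,
	GH_VCPU_ADDRSPACE_VMMIO_READ	= 4,
	GH_VCPU_ADDRSPACE_VMMIO_WRITE	= 5,
};

/* Hypothetical helper: true if the reported state needs MMIO handling. */
bool gh_vcpu_run_state_is_mmio(enum gh_vcpu_run_state state)
{
	return state == GH_VCPU_ADDRSPACE_VMMIO_READ ||
	       state == GH_VCPU_ADDRSPACE_VMMIO_WRITE;
}
```

With a named type, the switch in gh_vcpu_run() could also be written
against the enum rather than a bare u64.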
> +
> +static void vcpu_release(struct kref *kref)
> +{
> + struct gh_vcpu *vcpu = container_of(kref, struct gh_vcpu, kref);
> +
> + free_page((unsigned long)vcpu->vcpu_run);
> + kfree(vcpu);
> +}
> +
> +/*
> + * When hypervisor allows us to schedule vCPU again, it gives us an interrupt
> + */
> +static irqreturn_t gh_vcpu_irq_handler(int irq, void *data)
> +{
> + struct gh_vcpu *vcpu = data;
> +
> + complete(&vcpu->ready);
> + return IRQ_HANDLED;
> +}
> +
> +static bool gh_handle_mmio(struct gh_vcpu *vcpu,
> + struct gh_hypercall_vcpu_run_resp *vcpu_run_resp)
> +{
> + int ret = 0;
> + u64 addr = vcpu_run_resp->state_data[0],
> + len = vcpu_run_resp->state_data[1],
> + data = vcpu_run_resp->state_data[2];
> +
> + if (vcpu_run_resp->state == GH_VCPU_ADDRSPACE_VMMIO_READ) {
> + vcpu->vcpu_run->mmio.is_write = 0;
> + /* Record that we need to give vCPU user's supplied value next gh_vcpu_run() */
> + vcpu->state = GH_VCPU_MMIO_READ;
> + vcpu->mmio_read_len = len;
> + } else { /* GH_VCPU_ADDRSPACE_VMMIO_WRITE */
> + /* Try internal handlers first */
> + ret = gh_vm_mmio_write(vcpu->f->ghvm, addr, len, data);
> + if (!ret)
> + return true;
> +
> + /* Give userspace the info */
> + vcpu->vcpu_run->mmio.is_write = 1;
> + memcpy(vcpu->vcpu_run->mmio.data, &data, len);
> + }
> +
> + vcpu->vcpu_run->mmio.phys_addr = addr;
> + vcpu->vcpu_run->mmio.len = len;
> + vcpu->vcpu_run->exit_reason = GH_VCPU_EXIT_MMIO;
> +
> + return false;
> +}
> +
> +static int gh_vcpu_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
> +{
> + struct gh_vcpu *vcpu = container_of(nb, struct gh_vcpu, nb);
> + struct gh_rm_vm_exited_payload *exit_payload = data;
> +
> + if (action == GH_RM_NOTIFICATION_VM_EXITED &&
> + le16_to_cpu(exit_payload->vmid) == vcpu->ghvm->vmid)
> + complete(&vcpu->ready);
> +
> + return NOTIFY_OK;
> +}
> +
> +static inline enum gh_vm_status remap_vm_status(enum gh_rm_vm_status rm_status)
> +{
> + switch (rm_status) {
> + case GH_RM_VM_STATUS_INIT_FAILED:
> + return GH_VM_STATUS_LOAD_FAILED;
> + case GH_RM_VM_STATUS_EXITED:
> + return GH_VM_STATUS_EXITED;
> + default:
> + return GH_VM_STATUS_CRASHED;
> + }
> +}
> +
> +/**
> + * gh_vcpu_check_system() - Check whether VM as a whole is running
> + * @vcpu: Pointer to gh_vcpu
> + *
> + * Returns true if the VM is alive.
> + * Returns false if the VM is not alive (which can only mean the VM is shutting down).
> + */
> +static bool gh_vcpu_check_system(struct gh_vcpu *vcpu)
> + __must_hold(&vcpu->run_lock)
> +{
> + bool ret = true;
> +
> + down_read(&vcpu->ghvm->status_lock);
> + if (likely(vcpu->ghvm->vm_status == GH_RM_VM_STATUS_RUNNING))
> + goto out;
> +
> + vcpu->vcpu_run->status.status = remap_vm_status(vcpu->ghvm->vm_status);
> + vcpu->vcpu_run->status.exit_info = vcpu->ghvm->exit_info;
> + vcpu->vcpu_run->exit_reason = GH_VCPU_EXIT_STATUS;
> + vcpu->state = GH_VCPU_SYSTEM_DOWN;
> + ret = false;
> +out:
> + up_read(&vcpu->ghvm->status_lock);
> + return ret;
> +}
> +
> +/**
> + * gh_vcpu_run() - Request Gunyah to begin scheduling this vCPU.
> + * @vcpu: The client descriptor that was obtained via gh_vcpu_alloc()
> + */
> +static int gh_vcpu_run(struct gh_vcpu *vcpu)
> +{
> + struct gh_hypercall_vcpu_run_resp vcpu_run_resp;
> + u64 state_data[3] = { 0 };
> + enum gh_error gh_error;
> + int ret = 0;
> +
> + if (!vcpu->f)
> + return -ENODEV;
> +
> + if (mutex_lock_interruptible(&vcpu->run_lock))
> + return -ERESTARTSYS;
> +
> + if (!vcpu->rsc) {
> + ret = -ENODEV;
> + goto out;
> + }
> +
> + switch (vcpu->state) {
> + case GH_VCPU_UNKNOWN:
> + if (vcpu->ghvm->vm_status != GH_RM_VM_STATUS_RUNNING) {
> + /* Check if VM is up. If VM is starting, will block until VM is fully up
> + * since that thread does down_write.
> + */
> + if (!gh_vcpu_check_system(vcpu))
> + goto out;
> + }
> + vcpu->state = GH_VCPU_READY;
> + break;
> + case GH_VCPU_MMIO_READ:
I think you should verify that vcpu->mmio_read_len is <= 8 bytes
(or sizeof(state_data[0])). It is set in gh_handle_mmio(), and
*should* be correct, but in isolation here it would be defensive.
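Roughly like this (an untested, stand-alone sketch; the helper name
gh_copy_mmio_read_data() is hypothetical, and whether to return
-EINVAL or warn and clamp is your call):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the user-supplied MMIO read value into the
 * first resume-data word, rejecting a length that would overflow it.
 */
int gh_copy_mmio_read_data(uint64_t *resume_word, const uint8_t *data,
			   size_t len)
{
	if (len > sizeof(*resume_word))
		return -EINVAL;

	memcpy(resume_word, data, len);
	return 0;
}
```

The destination is left untouched when the length is rejected, so a
bad length can never spill into state_data[1].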
> + memcpy(&state_data[0], vcpu->vcpu_run->mmio.data, vcpu->mmio_read_len);
> + vcpu->state = GH_VCPU_READY;
> + break;
> + case GH_VCPU_SYSTEM_DOWN:
> + goto out;
> + default:
> + break;
> + }
> +
> + while (!ret && !signal_pending(current)) {
> + if (vcpu->vcpu_run->immediate_exit) {
> + ret = -EINTR;
> + goto out;
> + }
> +
> + gh_error = gh_hypercall_vcpu_run(vcpu->rsc->capid, state_data, &vcpu_run_resp);
> + if (gh_error == GH_ERROR_OK) {
> + ret = 0;
> + switch (vcpu_run_resp.state) {
> + case GH_VCPU_STATE_READY:
> + if (need_resched())
> + schedule();
> + break;
> + case GH_VCPU_STATE_POWERED_OFF:
> + /* vcpu might be off because the VM is shut down.
> + * If so, it won't ever run again: exit back to user
> + */
> + if (!gh_vcpu_check_system(vcpu))
> + goto out;
> + /* Otherwise, another vcpu will turn it on (e.g. by PSCI)
> + * and hyp sends an interrupt to wake Linux up.
> + */
> + fallthrough;
> + case GH_VCPU_STATE_EXPECTS_WAKEUP:
> + ret = wait_for_completion_interruptible(&vcpu->ready);
> + /* reinitialize completion before next hypercall. If we reinitialize
> + * after the hypercall, interrupt may have already come before
> + * re-initializing the completion and then end up waiting for
> + * event that already happened.
> + */
> + reinit_completion(&vcpu->ready);
> + /* Check system status again. Completion might've
> + * come from gh_vcpu_rm_notification
> + */
> + if (!ret && !gh_vcpu_check_system(vcpu))
> + goto out;
> + break;
> + case GH_VCPU_STATE_BLOCKED:
> + schedule();
> + break;
> + case GH_VCPU_ADDRSPACE_VMMIO_READ:
> + case GH_VCPU_ADDRSPACE_VMMIO_WRITE:
> + if (!gh_handle_mmio(vcpu, &vcpu_run_resp))
> + goto out;
> + break;
> + default:
> + pr_warn_ratelimited("Unknown vCPU state: %llx\n",
> + vcpu_run_resp.state);
> + schedule();
> + break;
> + }
> + } else if (gh_error == GH_ERROR_RETRY) {
> + schedule();
> + ret = 0;
I don't think assigning ret here is necessary.
Add curly braces for the block below.
> + } else
> + ret = gh_remap_error(gh_error);
> + }
> +
> +out:
> + mutex_unlock(&vcpu->run_lock);
> +
> + if (signal_pending(current))
> + return -ERESTARTSYS;
> +
> + return ret;
> +}
> +
> +static long gh_vcpu_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct gh_vcpu *vcpu = filp->private_data;
> + long ret = -EINVAL;
> +
> + switch (cmd) {
> + case GH_VCPU_RUN:
> + ret = gh_vcpu_run(vcpu);
> + break;
> + case GH_VCPU_MMAP_SIZE:
> + ret = PAGE_SIZE;
> + break;
> + default:
> + break;
> + }
> + return ret;
> +}
> +
> +static int gh_vcpu_release(struct inode *inode, struct file *filp)
> +{
> + struct gh_vcpu *vcpu = filp->private_data;
> +
> + gh_vm_put(vcpu->ghvm);
> + kref_put(&vcpu->kref, vcpu_release);
> + return 0;
> +}
> +
> +static vm_fault_t gh_vcpu_fault(struct vm_fault *vmf)
> +{
> + struct gh_vcpu *vcpu = vmf->vma->vm_file->private_data;
> + struct page *page = NULL;
> +
> + if (vmf->pgoff == 0)
> + page = virt_to_page(vcpu->vcpu_run);
> +
> + get_page(page);
> + vmf->page = page;
> + return 0;
> +}
> +
> +static const struct vm_operations_struct gh_vcpu_ops = {
> + .fault = gh_vcpu_fault,
> +};
> +
> +static int gh_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> + vma->vm_ops = &gh_vcpu_ops;
> + return 0;
> +}
> +
> +static const struct file_operations gh_vcpu_fops = {
> + .owner = THIS_MODULE,
> + .unlocked_ioctl = gh_vcpu_ioctl,
> + .release = gh_vcpu_release,
> + .llseek = noop_llseek,
> + .mmap = gh_vcpu_mmap,
> +};
> +
> +static int gh_vcpu_populate(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc)
> +{
> + struct gh_vcpu *vcpu = container_of(ticket, struct gh_vcpu, ticket);
> + int ret;
> +
> + mutex_lock(&vcpu->run_lock);
> + if (vcpu->rsc) {
> + ret = -1;
I think this should be -EBUSY, or (as I mention
elsewhere) this function could return Boolean.
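For the errno variant, I mean something like the following. It is a
simplified, untested model with the locking and IRQ request left out;
the *_model names are placeholders standing in for gh_vcpu and
gh_resource:

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder types standing in for gh_resource and gh_vcpu. */
struct gh_resource_model {
	int irq;
};

struct gh_vcpu_model {
	struct gh_resource_model *rsc;
};

/* Refuse a second resource with a meaningful errno instead of -1. */
int gh_vcpu_populate_model(struct gh_vcpu_model *vcpu,
			   struct gh_resource_model *ghrsc)
{
	if (vcpu->rsc)
		return -EBUSY;

	vcpu->rsc = ghrsc;
	return 0;
}
```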
> + goto out;
> + }
> +
> + vcpu->rsc = ghrsc;
> + init_completion(&vcpu->ready);
> +
> + ret = request_irq(vcpu->rsc->irq, gh_vcpu_irq_handler, IRQF_TRIGGER_RISING, "gh_vcpu",
> + vcpu);
> + if (ret)
> + pr_warn("Failed to request vcpu irq %d: %d", vcpu->rsc->irq, ret);
> +
> +out:
> + mutex_unlock(&vcpu->run_lock);
> + return ret;
> +}
> +
> +static void gh_vcpu_unpopulate(struct gh_vm_resource_ticket *ticket,
> + struct gh_resource *ghrsc)
> +{
> + struct gh_vcpu *vcpu = container_of(ticket, struct gh_vcpu, ticket);
> +
> + vcpu->vcpu_run->immediate_exit = true;
> + complete_all(&vcpu->ready);
> + mutex_lock(&vcpu->run_lock);
> + free_irq(vcpu->rsc->irq, vcpu);
> + vcpu->rsc = NULL;
> + mutex_unlock(&vcpu->run_lock);
> +}
> +
> +static long gh_vcpu_bind(struct gh_vm_function_instance *f)
> +{
> + struct gh_fn_vcpu_arg *arg = f->argp;
> + struct gh_vcpu *vcpu;
> + char name[MAX_VCPU_NAME];
> + struct file *file;
> + struct page *page;
> + int fd;
> + long r;
> +
> + if (!gh_api_has_feature(GH_FEATURE_VCPU))
> + return -EOPNOTSUPP;
> +
> + if (f->arg_size != sizeof(*arg))
> + return -EINVAL;
> +
> + vcpu = kzalloc(sizeof(*vcpu), GFP_KERNEL);
> + if (!vcpu)
> + return -ENOMEM;
> +
> + vcpu->f = f;
> + f->data = vcpu;
> + mutex_init(&vcpu->run_lock);
> + kref_init(&vcpu->kref);
> +
> + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> + if (!page) {
> + r = -ENOMEM;
> + goto err_destroy_vcpu;
> + }
> + vcpu->vcpu_run = page_address(page);
> +
> + vcpu->ticket.resource_type = GH_RESOURCE_TYPE_VCPU;
> + vcpu->ticket.label = arg->id;
> + vcpu->ticket.owner = THIS_MODULE;
> + vcpu->ticket.populate = gh_vcpu_populate;
> + vcpu->ticket.unpopulate = gh_vcpu_unpopulate;
> +
> + r = gh_vm_add_resource_ticket(f->ghvm, &vcpu->ticket);
> + if (r)
> + goto err_destroy_page;
> +
> + fd = get_unused_fd_flags(O_CLOEXEC);
> + if (fd < 0) {
> + r = fd;
> + goto err_remove_vcpu;
> + }
> +
> + if (!gh_vm_get(f->ghvm)) {
> + r = -ENODEV;
> + goto err_put_fd;
> + }
> + vcpu->ghvm = f->ghvm;
> +
> + vcpu->nb.notifier_call = gh_vcpu_rm_notification;
> + /* Ensure we run after the vm_mgr handles the notification and does
> + * any necessary state changes. We wake up to check the new state.
> + */
> + vcpu->nb.priority = -1;
> + r = gh_rm_notifier_register(f->rm, &vcpu->nb);
> + if (r)
> + goto err_put_gh_vm;
> +
> + kref_get(&vcpu->kref);
> + snprintf(name, sizeof(name), "gh-vcpu:%d", vcpu->ticket.label);
s/%d/%u/
> + file = anon_inode_getfile(name, &gh_vcpu_fops, vcpu, O_RDWR);
> + if (IS_ERR(file)) {
> + r = PTR_ERR(file);
> + goto err_notifier;
> + }
Maybe group getting the anonymous file with getting
an unused file descriptor.
> +
> + fd_install(fd, file);
> +
> + return fd;
> +err_notifier:
> + gh_rm_notifier_unregister(f->rm, &vcpu->nb);
> +err_put_gh_vm:
> + gh_vm_put(vcpu->ghvm);
> +err_put_fd:
> + put_unused_fd(fd);
> +err_remove_vcpu:
> + gh_vm_remove_resource_ticket(f->ghvm, &vcpu->ticket);
> +err_destroy_page:
> + free_page((unsigned long)vcpu->vcpu_run);
> +err_destroy_vcpu:
> + kfree(vcpu);
> + return r;
> +}
> +
> +static void gh_vcpu_unbind(struct gh_vm_function_instance *f)
> +{
> + struct gh_vcpu *vcpu = f->data;
> +
> + gh_rm_notifier_unregister(f->rm, &vcpu->nb);
> + gh_vm_remove_resource_ticket(vcpu->f->ghvm, &vcpu->ticket);
> + vcpu->f = NULL;
> +
> + kref_put(&vcpu->kref, vcpu_release);
> +}
> +
> +DECLARE_GH_VM_FUNCTION_INIT(vcpu, GH_FN_VCPU, gh_vcpu_bind, gh_vcpu_unbind);
> +MODULE_DESCRIPTION("Gunyah vCPU Driver");
Maybe "Gunyah vCPU VM function(s)"?
And if you use "Driver" (or "driver") here, consider using it for
the irqfd and ioeventfd module descriptions as well.
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index b31fac15ff45..d453d902847e 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -354,6 +354,10 @@ static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
>
> down_write(&ghvm->status_lock);
> ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> + ghvm->exit_info.type = le16_to_cpu(payload->exit_type);
> + ghvm->exit_info.reason_size = le32_to_cpu(payload->exit_reason_size);
> + memcpy(&ghvm->exit_info.reason, payload->exit_reason,
> + min(GH_VM_MAX_EXIT_REASON_SIZE, ghvm->exit_info.reason_size));
> up_write(&ghvm->status_lock);
>
> return NOTIFY_DONE;
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 9c1046af80ed..df78756639b6 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -45,6 +45,7 @@ struct gh_vm {
> enum gh_rm_vm_status vm_status;
> wait_queue_head_t vm_status_wait;
> struct rw_semaphore status_lock;
> + struct gh_vm_exit_info exit_info;
>
> struct work_struct free_work;
> struct kref kref;
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 3e706b59d2c0..37f1e2c822ce 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -175,4 +175,12 @@ enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_
> enum gh_error gh_hypercall_msgq_recv(u64 capid, void *buff, size_t size, size_t *recv_size,
> bool *ready);
>
> +struct gh_hypercall_vcpu_run_resp {
> + u64 state;
> + u64 state_data[3];
> +};
> +
> +enum gh_error gh_hypercall_vcpu_run(u64 capid, u64 *resume_data,
> + struct gh_hypercall_vcpu_run_resp *resp);
> +
> #endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index caeb3b3a3e9a..e52265fa5715 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
>
> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>
> +/**
> + * GH_FN_VCPU - create a vCPU instance to control a vCPU
> + *
> + * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
> + *
> + * The vcpu type will register with the VM Manager to expect to control
> + * vCPU number `vcpu_id`. It returns a file descriptor allowing interaction with
> + * the vCPU. See the Gunyah vCPU API description sections for interacting with
> + * the Gunyah vCPU file descriptors.
> + *
> + * Return: file descriptor to manipulate the vcpu. See GH_VCPU_* ioctls
> + */
> +#define GH_FN_VCPU 1
I think you should define GH_FN_VCPU, GH_FN_IRQFD, and GH_FN_IOEVENTFD
in an enumerated type. Each has a type associated with it, and you can
add the explanation for the function in the kernel-doc comments above
those type definitions.
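As a rough illustration (untested; the enum name gh_fn_type and the
validity helper are my invention, and only GH_FN_VCPU's value is taken
from this patch, so the other two values are placeholders):

```c
#include <stdbool.h>
#include <stdint.h>

/**
 * enum gh_fn_type - Valid types of Gunyah VM functions
 * @GH_FN_VCPU: create a vCPU instance to control a vCPU;
 *              the function argument is a &struct gh_fn_vcpu_arg
 * @GH_FN_IRQFD: described where the irqfd function is added
 * @GH_FN_IOEVENTFD: described where the ioeventfd function is added
 */
enum gh_fn_type {
	GH_FN_VCPU = 1,
	GH_FN_IRQFD,		/* placeholder value */
	GH_FN_IOEVENTFD,	/* placeholder value */
};

/* Hypothetical helper: check a type supplied by userspace. */
bool gh_fn_type_valid(uint32_t type)
{
	switch (type) {
	case GH_FN_VCPU:
	case GH_FN_IRQFD:
	case GH_FN_IOEVENTFD:
		return true;
	default:
		return false;
	}
}
```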
> +
> #define GH_FN_MAX_ARG_SIZE 256
>
> +/**
> + * struct gh_fn_vcpu_arg - Arguments to create a vCPU
> + * @id: vcpu id
> + */
> +struct gh_fn_vcpu_arg {
> + __u32 id;
I realize this is the "CPU ID" but the other two function types
name this field "label" and it gets used as the label for the
ticket when it gets bound. So I suggest naming this "label".
> +};
> +
> +#define GH_IRQFD_LEVEL (1UL << 0)
> +
> /**
> * struct gh_fn_desc - Arguments to create a VM function
> * @type: Type of the function. See GH_FN_* macro for supported types
If you do what I suggest, the above comment should instead refer
to the enumerated type name.
> @@ -79,4 +103,88 @@ struct gh_fn_desc {
> #define GH_VM_ADD_FUNCTION _IOW(GH_IOCTL_TYPE, 0x4, struct gh_fn_desc)
> #define GH_VM_REMOVE_FUNCTION _IOW(GH_IOCTL_TYPE, 0x7, struct gh_fn_desc)
>
> +enum gh_vm_status {
> + GH_VM_STATUS_LOAD_FAILED = 1,
> +#define GH_VM_STATUS_LOAD_FAILED GH_VM_STATUS_LOAD_FAILED
> + GH_VM_STATUS_EXITED = 2,
> +#define GH_VM_STATUS_EXITED GH_VM_STATUS_EXITED
> + GH_VM_STATUS_CRASHED = 3,
> +#define GH_VM_STATUS_CRASHED GH_VM_STATUS_CRASHED
> +};
> +
> +/*
> + * Gunyah presently sends max 4 bytes of exit_reason.
> + * If that changes, this macro can be safely increased without breaking
> + * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
Is PAGE_SIZE allowed to be anything other than 4096 bytes? Do you
expect this driver to work properly if the page size were configured
to be 16384 bytes? In other words, is this a Gunyah constant, or
is it *really* the page size configured for Linux?
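Whatever the answer, a build-time check would document the constraint.
Below is a stand-alone, untested sketch that mirrors the structures in
this patch with <stdint.h> types; GH_VCPU_RUN_MIN_PAGE_SIZE is a name
I made up and 4096 is my assumption for the smallest page size. In the
kernel this would presumably be a static_assert() or BUILD_BUG_ON()
against whichever size is really meant:

```c
#include <stdint.h>

#define GH_VM_MAX_EXIT_REASON_SIZE	8u
/* Assumption for this sketch: smallest page the mapping must fit in. */
#define GH_VCPU_RUN_MIN_PAGE_SIZE	4096u

enum gh_vm_status {
	GH_VM_STATUS_LOAD_FAILED = 1,
	GH_VM_STATUS_EXITED = 2,
	GH_VM_STATUS_CRASHED = 3,
};

struct gh_vm_exit_info {
	uint16_t type;
	uint16_t padding;
	uint32_t reason_size;
	uint8_t reason[GH_VM_MAX_EXIT_REASON_SIZE];
};

struct gh_vcpu_run {
	/* in */
	uint8_t immediate_exit;
	uint8_t padding[7];

	/* out */
	uint32_t exit_reason;

	union {
		struct {
			uint64_t phys_addr;
			uint8_t data[8];
			uint32_t len;
			uint8_t is_write;
		} mmio;

		struct {
			enum gh_vm_status status;
			struct gh_vm_exit_info exit_info;
		} status;
	};
};

/* Fails the build if the exit reason grows past the shared page. */
_Static_assert(sizeof(struct gh_vcpu_run) <= GH_VCPU_RUN_MIN_PAGE_SIZE,
	       "struct gh_vcpu_run must fit in the shared page");
```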
> + */
> +#define GH_VM_MAX_EXIT_REASON_SIZE 8u
> +
> +/**
> + * struct gh_vm_exit_info - Reason for VM exit as reported by Gunyah
> + * See Gunyah documentation for values.
> + * @type: Describes how VM exited
> + * @padding: padding bytes
> + * @reason_size: Number of bytes valid for `reason`
> + * @reason: See Gunyah documentation for interpretation. Note: these values are
> + * not interpreted by Linux and need to be converted from little-endian
> + * as applicable.
> + */
> +struct gh_vm_exit_info {
> + __u16 type;
> + __u16 padding;
> + __u32 reason_size;
> + __u8 reason[GH_VM_MAX_EXIT_REASON_SIZE];
> +};
> +
Define this group of values in an enumerated type.
Are these the possible "exit_reason" values? If so,
maybe name them GH_VCPU_EXIT_REASON_*.
> +#define GH_VCPU_EXIT_UNKNOWN 0
> +#define GH_VCPU_EXIT_MMIO 1
> +#define GH_VCPU_EXIT_STATUS 2
> +
> +/**
> + * struct gh_vcpu_run - Application code obtains a pointer to the gh_vcpu_run
> + * structure by mmap()ing a vcpu fd.
> + * @immediate_exit: polled when scheduling the vcpu. If set, immediately returns -EINTR.
> + * @padding: padding bytes
> + * @exit_reason: Set when GH_VCPU_RUN returns successfully and gives reason why
> + * GH_VCPU_RUN has stopped running the vCPU.
> + * @mmio: Used when exit_reason == GH_VCPU_EXIT_MMIO
> + * The guest has faulted on a memory-mapped I/O instruction that
> + * couldn't be satisfied by Gunyah.
> + * @mmio.phys_addr: Address guest tried to access
> + * @mmio.data: the value that was written if `is_write == 1`. Filled by
> + * user for reads (`is_write == 0`).
> + * @mmio.len: Length of write. Only the first `len` bytes of `data`
> + * are considered by Gunyah.
> + * @mmio.is_write: 1 if VM tried to perform a write, 0 for a read
> + * @status: Used when exit_reason == GH_VCPU_EXIT_STATUS.
> + * The guest VM is no longer runnable. This struct informs why.
> + * @status.status: See `enum gh_vm_status` for possible values
> + * @status.exit_info: Used when status == GH_VM_STATUS_EXITED
> + */
> +struct gh_vcpu_run {
> + /* in */
> + __u8 immediate_exit;
> + __u8 padding[7];
> +
> + /* out */
> + __u32 exit_reason;
> +
> + union {
> + struct {
> + __u64 phys_addr;
> + __u8 data[8];
> + __u32 len;
> + __u8 is_write;
> + } mmio;
> +
> + struct {
> + enum gh_vm_status status;
> + struct gh_vm_exit_info exit_info;
> + } status;
> + };
> +};
> +
> +#define GH_VCPU_RUN _IO(GH_IOCTL_TYPE, 0x5)
> +#define GH_VCPU_MMAP_SIZE _IO(GH_IOCTL_TYPE, 0x6)
> +
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Some VM functions need to acquire Gunyah resources. For instance, Gunyah
> vCPUs are exposed to the host as a resource. The Gunyah vCPU function
> will register a resource ticket and be able to interact with the
> hypervisor once the resource ticket is filled.
>
> Resource tickets are the mechanism for functions to acquire ownership of
> Gunyah resources. Gunyah functions can be created before the VM's
> resources are created and made available to Linux. A resource ticket
> identifies a type of resource and a label of a resource which the ticket
> holder is interested in.
>
> Resources are created by Gunyah as configured in the VM's devicetree
> configuration. Gunyah doesn't process the label and that makes it
> possible for userspace to create multiple resources with the same label.
> Resource ticket owners need to be prepared for populate to be called
> multiple times if userspace created multiple resources with the same
> label.
>
> Signed-off-by: Elliot Berman <[email protected]>
One possibly substantive suggestion here, plus a couple suggestions
to add or revise comments.
-Alex
> ---
> drivers/virt/gunyah/vm_mgr.c | 112 +++++++++++++++++++++++++++++++++-
> drivers/virt/gunyah/vm_mgr.h | 4 ++
> include/linux/gunyah_vm_mgr.h | 14 +++++
> 3 files changed, 129 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index 88db011395ec..0269bcdaf692 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -165,6 +165,74 @@ static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
> return r;
> }
>
> +int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket)
> +{
> + struct gh_vm_resource_ticket *iter;
> + struct gh_resource *ghrsc;
> + int ret = 0;
> +
> + mutex_lock(&ghvm->resources_lock);
> + list_for_each_entry(iter, &ghvm->resource_tickets, list) {
> + if (iter->resource_type == ticket->resource_type && iter->label == ticket->label) {
> + ret = -EEXIST;
> + goto out;
> + }
> + }
> +
> + if (!try_module_get(ticket->owner)) {
> + ret = -ENODEV;
> + goto out;
> + }
> +
> + list_add(&ticket->list, &ghvm->resource_tickets);
> + INIT_LIST_HEAD(&ticket->resources);
> +
> + list_for_each_entry(ghrsc, &ghvm->resources, list) {
> + if (ghrsc->type == ticket->resource_type && ghrsc->rm_label == ticket->label) {
> + if (!ticket->populate(ticket, ghrsc))
> + list_move(&ghrsc->list, &ticket->resources);
> + }
> + }
> +out:
> + mutex_unlock(&ghvm->resources_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_add_resource_ticket);
> +
> +void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket)
> +{
> + struct gh_resource *ghrsc, *iter;
> +
> + mutex_lock(&ghvm->resources_lock);
> + list_for_each_entry_safe(ghrsc, iter, &ticket->resources, list) {
> + ticket->unpopulate(ticket, ghrsc);
> + list_move(&ghrsc->list, &ghvm->resources);
> + }
> +
> + module_put(ticket->owner);
> + list_del(&ticket->list);
> + mutex_unlock(&ghvm->resources_lock);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_remove_resource_ticket);
> +
> +static void gh_vm_add_resource(struct gh_vm *ghvm, struct gh_resource *ghrsc)
> +{
> + struct gh_vm_resource_ticket *ticket;
> +
> + mutex_lock(&ghvm->resources_lock);
> + list_for_each_entry(ticket, &ghvm->resource_tickets, list) {
> + if (ghrsc->type == ticket->resource_type && ghrsc->rm_label == ticket->label) {
> + if (!ticket->populate(ticket, ghrsc)) {
> + list_add(&ghrsc->list, &ticket->resources);
> + goto found;
> + }
I think the "goto found" belongs here, unconditionally.
You disallow adding more than one ticket of a given type
with the same label. So you will never match another
ticket once you've matched this one.
The populate function generally shouldn't fail. I think
it only fails if you find a duplicate, and again, I think
you prevent that from happening. (But if it does, you
silently ignore it...)
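To make the control flow concrete, here is a simplified, untested
model using arrays instead of the kernel lists. All names are
placeholders, and returning NULL so the caller can keep the resource
on the VM's unclaimed list is my assumption about how a populate
failure should be handled:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct rsc_model {
	int type;
	uint32_t label;
};

struct ticket_model {
	int type;
	uint32_t label;
	struct rsc_model *rsc;	/* resource claimed by this ticket, if any */
};

/* Stand-in for ticket->populate(): refuses a second resource. */
int ticket_populate(struct ticket_model *t, struct rsc_model *r)
{
	if (t->rsc)
		return -EBUSY;
	t->rsc = r;
	return 0;
}

/* Returns the ticket that claimed @r, or NULL if @r stays unclaimed. */
struct ticket_model *add_resource(struct ticket_model *tickets, size_t n,
				  struct rsc_model *r)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct ticket_model *t = &tickets[i];

		if (t->type != r->type || t->label != r->label)
			continue;

		/* At most one ticket can match, so stop here either way. */
		return ticket_populate(t, r) ? NULL : t;
	}

	return NULL;
}
```

The point is only that the search ends at the first match; a failed
populate is then visible to the caller rather than silently skipped.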
> + }
> + }
> + list_add(&ghrsc->list, &ghvm->resources);
> +found:
> + mutex_unlock(&ghvm->resources_lock);
> +}
> +
> static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
> {
> struct gh_rm_vm_status_payload *payload = data;
> @@ -230,6 +298,8 @@ static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> struct gh_vm_function_instance *inst, *iiter;
> + struct gh_vm_resource_ticket *ticket, *titer;
> + struct gh_resource *ghrsc, *riter;
> struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> @@ -246,6 +316,25 @@ static void gh_vm_free(struct work_struct *work)
> }
> mutex_unlock(&ghvm->fn_lock);
>
> + mutex_lock(&ghvm->resources_lock);
> + if (!list_empty(&ghvm->resource_tickets)) {
> + dev_warn(ghvm->parent, "Dangling resource tickets:\n");
> + list_for_each_entry_safe(ticket, titer, &ghvm->resource_tickets, list) {
> + dev_warn(ghvm->parent, " %pS\n", ticket->populate);
> + gh_vm_remove_resource_ticket(ghvm, ticket);
> + }
> + }
> +
> + list_for_each_entry_safe(ghrsc, riter, &ghvm->resources, list) {
> + gh_rm_free_resource(ghrsc);
> + }
> + mutex_unlock(&ghvm->resources_lock);
> +
> + ret = gh_rm_vm_reset(ghvm->rm, ghvm->vmid);
> + if (ret)
> + dev_err(ghvm->parent, "Failed to reset the vm: %d\n", ret);
> + wait_event(ghvm->vm_status_wait, ghvm->vm_status == GH_RM_VM_STATUS_RESET);
> +
> mutex_lock(&ghvm->mm_lock);
> list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> gh_vm_mem_reclaim(ghvm, mapping);
> @@ -329,6 +418,9 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> init_rwsem(&ghvm->status_lock);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
> kref_init(&ghvm->kref);
> + mutex_init(&ghvm->resources_lock);
> + INIT_LIST_HEAD(&ghvm->resources);
> + INIT_LIST_HEAD(&ghvm->resource_tickets);
> INIT_LIST_HEAD(&ghvm->functions);
> ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>
> @@ -338,9 +430,11 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> static int gh_vm_start(struct gh_vm *ghvm)
> {
> struct gh_vm_mem *mapping;
> + struct gh_rm_hyp_resources *resources;
> + struct gh_resource *ghrsc;
> u64 dtb_offset;
> u32 mem_handle;
> - int ret;
> + int ret, i, n;
>
> down_write(&ghvm->status_lock);
> if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> @@ -394,6 +488,22 @@ static int gh_vm_start(struct gh_vm *ghvm)
> goto err;
> }
>
> + ret = gh_rm_get_hyp_resources(ghvm->rm, ghvm->vmid, &resources);
> + if (ret) {
> + dev_warn(ghvm->parent, "Failed to get hypervisor resources for VM: %d\n", ret);
> + goto err;
> + }
> +
> + for (i = 0, n = le32_to_cpu(resources->n_entries); i < n; i++) {
> + ghrsc = gh_rm_alloc_resource(ghvm->rm, &resources->entries[i]);
> + if (!ghrsc) {
> + ret = -ENOMEM;
> + goto err;
> + }
> +
> + gh_vm_add_resource(ghvm, ghrsc);
> + }
> +
> ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
> if (ret) {
> dev_warn(ghvm->parent, "Failed to start VM: %d\n", ret);
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 7bd271bad721..18d0e1effd25 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -7,6 +7,7 @@
> #define _GH_PRIV_VM_MGR_H
>
> #include <linux/gunyah_rsc_mgr.h>
> +#include <linux/gunyah_vm_mgr.h>
> #include <linux/list.h>
> #include <linux/kref.h>
> #include <linux/miscdevice.h>
> @@ -51,6 +52,9 @@ struct gh_vm {
> struct list_head memory_mappings;
> struct mutex fn_lock;
> struct list_head functions;
> + struct mutex resources_lock;
> + struct list_head resources;
> + struct list_head resource_tickets;
> };
>
> int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
> diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
> index 3825c951790a..01b1761b5923 100644
> --- a/include/linux/gunyah_vm_mgr.h
> +++ b/include/linux/gunyah_vm_mgr.h
> @@ -70,4 +70,18 @@ void gh_vm_function_unregister(struct gh_vm_function *f);
> DECLARE_GH_VM_FUNCTION(_name, _type, _bind, _unbind); \
> module_gh_vm_function(_name)
>
> +struct gh_vm_resource_ticket {
> + struct list_head list; /* for gh_vm's resources list */
Maybe "resource lists" above (it's for the resources list and
resource_tickets list).
> + struct list_head resources; /* for gh_resources's list */
Maybe: /* resources associated with this ticket */
> + enum gh_resource_type resource_type;
> + u32 label;
> +
> + struct module *owner;
> + int (*populate)(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc);
> + void (*unpopulate)(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc);
> +};
> +
> +int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
> +void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
> +
> #endif
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Add framework for VM functions to handle stage-2 write faults from Gunyah
> guest virtual machines. IO handlers have a range of addresses which they
> apply to. Optionally, they may apply to only when the value written
> matches the IO handler's value.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
Two (related) bugs and a suggestion that might help avoid
adding the same problem in the future. (Or maybe I made
that suggestion elsewhere? Anyway, you'll see.)
-Alex
> ---
> drivers/virt/gunyah/vm_mgr.c | 94 +++++++++++++++++++++++++++++++++++
> drivers/virt/gunyah/vm_mgr.h | 4 ++
> include/linux/gunyah_vm_mgr.h | 25 ++++++++++
> 3 files changed, 123 insertions(+)
>
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index 0269bcdaf692..b31fac15ff45 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -233,6 +233,100 @@ static void gh_vm_add_resource(struct gh_vm *ghvm, struct gh_resource *ghrsc)
> mutex_unlock(&ghvm->resources_lock);
> }
>
> +static int _gh_vm_io_handler_compare(const struct rb_node *node, const struct rb_node *parent)
> +{
> + struct gh_vm_io_handler *n = container_of(node, struct gh_vm_io_handler, node);
> + struct gh_vm_io_handler *p = container_of(parent, struct gh_vm_io_handler, node);
> +
> + if (n->addr < p->addr)
> + return -1;
> + if (n->addr > p->addr)
> + return 1;
> + if ((n->len && !p->len) || (!n->len && p->len))
> + return 0;
> + if (n->len < p->len)
> + return -1;
> + if (n->len > p->len)
> + return 1;
The datamatch field in a gh_vm_io_handler structure is Boolean.
If this is what you intend, it would be better not to treat
it as an integer value (i.e., don't use < and >).
However I *think* what you want is to be comparing the
data fields here. If so, this is a BUG.
I think you should maybe use "data" in the gh_fn_ioeventfd_arg
structure rather than "datamatch". And then use "datamatch"
consistently as a Boolean indicating whether to do matching,
and "data" to be the value used in matching.
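To make the suggestion concrete, here is a minimal userspace sketch
of a comparator that keeps "datamatch" strictly Boolean and orders
on a separate "data" value. The struct and function names are
hypothetical stand-ins, not the definitions from the patch, and
treating a handler without datamatch as covering any data is my
assumption about the intended semantics.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gh_vm_io_handler (illustration only). */
struct io_handler {
	uint64_t addr;
	uint64_t data;		/* value to match; meaningful only if datamatch */
	uint32_t len;		/* 0 means "any length" */
	bool datamatch;		/* whether to match on data at all */
};

static int io_handler_compare(const struct io_handler *n,
			      const struct io_handler *p)
{
	if (n->addr < p->addr)
		return -1;
	if (n->addr > p->addr)
		return 1;
	/* A zero length on one side acts as a wildcard, as in the patch. */
	if ((n->len && !p->len) || (!n->len && p->len))
		return 0;
	if (n->len < p->len)
		return -1;
	if (n->len > p->len)
		return 1;
	/* Assumption: a handler that does no data matching covers any data. */
	if (!n->datamatch || !p->datamatch)
		return 0;
	/* Both sides match on data: order by the value, not by the flag. */
	if (n->data < p->data)
		return -1;
	if (n->data > p->data)
		return 1;
	return 0;
}
```

With this shape the Boolean is only ever tested for truth, and the
ordering comes from the data values themselves.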
> + if (n->datamatch < p->datamatch)
> + return -1;
> + if (n->datamatch > p->datamatch)
> + return 1;
> + return 0;
> +}
> +
> +static int gh_vm_io_handler_compare(struct rb_node *node, const struct rb_node *parent)
> +{
> + return _gh_vm_io_handler_compare(node, parent);
> +}
> +
> +static int gh_vm_io_handler_find(const void *key, const struct rb_node *node)
> +{
> + const struct gh_vm_io_handler *k = key;
> +
> + return _gh_vm_io_handler_compare(&k->node, node);
> +}
> +
> +static struct gh_vm_io_handler *gh_vm_mgr_find_io_hdlr(struct gh_vm *ghvm, u64 addr,
> + u64 len, u64 data)
> +{
> + struct gh_vm_io_handler key = {
> + .addr = addr,
> + .len = len,
> + .datamatch = data,
The datamatch field here is Boolean. I'm pretty sure you
want to assign the data field instead, in which case, this
is a BUG.
If you *do* intend to treat the data assigned as Boolean,
please use !!data to make this obvious.
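A small userspace illustration of why this matters, assuming the
field really is a C bool: converting any nonzero 64-bit value to
bool yields true, so two different written values end up
indistinguishable in the lookup key. The struct and helper below
are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key with a Boolean datamatch field, as in the patch. */
struct io_key {
	uint64_t addr;
	bool datamatch;
};

/* Mirrors ".datamatch = data": the value is converted to bool, not stored. */
static struct io_key make_key(uint64_t addr, uint64_t data)
{
	struct io_key key = {
		.addr = addr,
		.datamatch = data,	/* any nonzero value becomes true */
	};

	return key;
}
```

Keys built from data 5 and data 7 carry the same datamatch value,
so the tree lookup cannot tell them apart.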
> + };
> + struct rb_node *node;
> +
> + node = rb_find(&key, &ghvm->mmio_handler_root, gh_vm_io_handler_find);
> + if (!node)
> + return NULL;
> +
> + return container_of(node, struct gh_vm_io_handler, node);
> +}
> +
> +int gh_vm_mmio_write(struct gh_vm *ghvm, u64 addr, u32 len, u64 data)
> +{
> + struct gh_vm_io_handler *io_hdlr = NULL;
> + int ret;
> +
> + down_read(&ghvm->mmio_handler_lock);
> + io_hdlr = gh_vm_mgr_find_io_hdlr(ghvm, addr, len, data);
> + if (!io_hdlr || !io_hdlr->ops || !io_hdlr->ops->write) {
> + ret = -ENODEV;
> + goto out;
> + }
> +
> + ret = io_hdlr->ops->write(io_hdlr, addr, len, data);
> +
> +out:
> + up_read(&ghvm->mmio_handler_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_mmio_write);
. . .
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Gunyah doorbells allow two virtual machines to signal each other using
> interrupts. Add the hypercalls needed to assert the interrupt.
>
> Signed-off-by: Elliot Berman <[email protected]>
One minor suggestion below. -Alex
> ---
> arch/arm64/gunyah/gunyah_hypercall.c | 25 +++++++++++++++++++++++++
> include/linux/gunyah.h | 3 +++
> 2 files changed, 28 insertions(+)
>
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> index f01f5cec4d23..0f1cdb706e91 100644
> --- a/arch/arm64/gunyah/gunyah_hypercall.c
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -41,6 +41,8 @@ EXPORT_SYMBOL_GPL(arch_is_gh_guest);
> fn)
>
> #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +#define GH_HYPERCALL_BELL_SEND GH_HYPERCALL(0x8012)
> +#define GH_HYPERCALL_BELL_SET_MASK GH_HYPERCALL(0x8015)
> #define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
> #define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
> #define GH_HYPERCALL_VCPU_RUN GH_HYPERCALL(0x8065)
> @@ -63,6 +65,29 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
> }
> EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
>
> +enum gh_error gh_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_BELL_SEND, capid, new_flags, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK)
> + *old_flags = res.a1;
At least one caller doesn't care about the result. So you could
accept a null pointer as the old_flags argument, and in that case
just don't assign it.
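Something along these lines is what I have in mind. This is a
userspace sketch only: fake_bell_send_hvc() is a hypothetical
stand-in for the real HVC, and the enum is trimmed to what the
example needs.

```c
#include <stddef.h>
#include <stdint.h>

enum gh_error {
	GH_ERROR_OK = 0,
	GH_ERROR_UNIMPLEMENTED = -1,
};

struct fake_res {
	uint64_t a0;	/* error code */
	uint64_t a1;	/* previous flags */
};

/* Hypothetical stand-in for the hypercall; remembers flags between calls. */
static void fake_bell_send_hvc(uint64_t capid, uint64_t new_flags,
			       struct fake_res *res)
{
	static uint64_t flags;

	(void)capid;
	res->a0 = GH_ERROR_OK;
	res->a1 = flags;
	flags |= new_flags;
}

static enum gh_error bell_send(uint64_t capid, uint64_t new_flags,
			       uint64_t *old_flags)
{
	struct fake_res res;

	fake_bell_send_hvc(capid, new_flags, &res);

	/* Callers that don't care about the old flags may pass NULL. */
	if (res.a0 == GH_ERROR_OK && old_flags)
		*old_flags = res.a1;

	return (enum gh_error)res.a0;
}
```

A caller that ignores the previous flags then becomes simply
bell_send(capid, flags, NULL), with no dummy local variable.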
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_bell_send);
> +
> +enum gh_error gh_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_BELL_SET_MASK, capid, enable_mask, ack_mask, 0, &res);
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_bell_set_mask);
> +
> enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready)
> {
> struct arm_smccc_res res;
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 37f1e2c822ce..63395dacc1a8 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -169,6 +169,9 @@ struct gh_hypercall_hyp_identify_resp {
>
> void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
>
> +enum gh_error gh_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags);
> +enum gh_error gh_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask);
> +
> #define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
>
> enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, void *buff, int tx_flags, bool *ready);
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Enable support for creating irqfds which can raise an interrupt on a
> Gunyah virtual machine. irqfds are exposed to userspace as a Gunyah VM
> function with the name "irqfd". If the VM devicetree is not configured
> to create a doorbell with the corresponding label, userspace will still
> be able to assert the eventfd but no interrupt will be raised on the
> guest.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
I suggest a few things below, including some code simplification.
I also have a few questions (which could possibly be answered by
adding comments).
-Alex
> ---
> Documentation/virt/gunyah/vm-manager.rst | 2 +-
> drivers/virt/gunyah/Kconfig | 9 ++
> drivers/virt/gunyah/Makefile | 1 +
> drivers/virt/gunyah/gunyah_irqfd.c | 164 +++++++++++++++++++++++
> include/linux/gunyah.h | 5 +
> include/uapi/linux/gunyah.h | 30 +++++
> 6 files changed, 210 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
>
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> index 83d326b0d11f..a1dd70f0cbf6 100644
> --- a/Documentation/virt/gunyah/vm-manager.rst
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -124,7 +124,7 @@ the VM starts.
> The possible types are documented below:
>
> .. kernel-doc:: include/uapi/linux/gunyah.h
> - :identifiers: GH_FN_VCPU gh_fn_vcpu_arg
> + :identifiers: GH_FN_VCPU gh_fn_vcpu_arg GH_FN_IRQFD gh_fn_irqfd_arg
>
> Gunyah VCPU API Descriptions
> ----------------------------
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index 4c1c6110b50e..2cde24d429d1 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -26,3 +26,12 @@ config GUNYAH_VCPU
> VMMs can also handle stage 2 faults of the vCPUs.
>
> Say Y/M here if unsure and you want to support Gunyah VMMs.
> +
> +config GUNYAH_IRQFD
> + tristate "Gunyah irqfd interface"
> + depends on GUNYAH
> + help
> + Enable kernel support for creating irqfds which can raise an interrupt
> + on Gunyah virtual machine.
> +
> + Say Y/M here if unsure and you want to support Gunyah VMMs.
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 2d1b604a7b03..6cf756bfa3c2 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -7,3 +7,4 @@ gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>
> obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
> +obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
> diff --git a/drivers/virt/gunyah/gunyah_irqfd.c b/drivers/virt/gunyah/gunyah_irqfd.c
> new file mode 100644
> index 000000000000..38e5fe266b00
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_irqfd.c
> @@ -0,0 +1,164 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/eventfd.h>
> +#include <linux/file.h>
> +#include <linux/fs.h>
> +#include <linux/gunyah.h>
> +#include <linux/gunyah_vm_mgr.h>
> +#include <linux/module.h>
> +#include <linux/poll.h>
> +#include <linux/printk.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +struct gh_irqfd {
> + struct gh_resource *ghrsc;
> + struct gh_vm_resource_ticket ticket;
> + struct gh_vm_function_instance *f;
> +
> + bool level;
> +
> + struct eventfd_ctx *ctx;
> + wait_queue_entry_t wait;
> + poll_table pt;
> +};
> +
> +static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync, void *key)
> +{
> + struct gh_irqfd *irqfd = container_of(wait, struct gh_irqfd, wait);
> + __poll_t flags = key_to_poll(key);
> + u64 enable_mask = GH_BELL_NONBLOCK;
> + u64 old_flags;
> + int ret = 0;
> +
> + if (flags & EPOLLIN) {
> + if (irqfd->ghrsc) {
> + ret = gh_hypercall_bell_send(irqfd->ghrsc->capid, enable_mask, &old_flags);
I commented elsewhere that you might support passing a null
pointer as the last argument above (since you don't use the
result).
> + if (ret)
> + pr_err_ratelimited("Failed to inject interrupt %d: %d\n",
> + irqfd->ticket.label, ret);
> + } else
> + pr_err_ratelimited("Premature injection of interrupt\n");
> + }
> +
> + return 0;
> +}
> +
> +static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh, poll_table *pt)
> +{
> + struct gh_irqfd *irq_ctx = container_of(pt, struct gh_irqfd, pt);
> +
> + add_wait_queue(wqh, &irq_ctx->wait);
> +}
> +
> +static int gh_irqfd_populate(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc)
> +{
> + struct gh_irqfd *irqfd = container_of(ticket, struct gh_irqfd, ticket);
> + u64 enable_mask = GH_BELL_NONBLOCK;
> + u64 ack_mask = ~0;
Why is the ACK mask ~0?
I guess I don't know details about this hypercall (do you document
them somewhere?), so it's hard to judge whether or why this is the
right thing to use. The enable_mask is just GH_BELL_NONBLOCK,
which is just BIT(32).
> + int ret = 0;
> +
> + if (irqfd->ghrsc) {
> + pr_warn("irqfd%d already got a Gunyah resource. Check if multiple resources with same label were configured.\n",
s/%d/%u/
> + irqfd->ticket.label);
> + return -1;
I would say you should return -EBUSY here instead.
However, all callers just check for a zero/nonzero result, so
you could have this function (and the function pointer it's
assigned to) return a Boolean instead (true on success).
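For the -EBUSY variant, a minimal userspace sketch (the struct and
function names here are hypothetical, and the hypercall part of
the real populate callback is left out):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the patch's structures. */
struct resource {
	unsigned long capid;
};

struct irqfd {
	struct resource *rsc;
};

static int irqfd_populate(struct irqfd *irqfd, struct resource *rsc)
{
	/* A second resource with the same label is a configuration error. */
	if (irqfd->rsc)
		return -EBUSY;

	irqfd->rsc = rsc;
	return 0;
}
```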
> + }
> +
> + irqfd->ghrsc = ghrsc;
> + if (irqfd->level) {
I think I don't understand this part of the code well
enough to know this. What happens if level is false?
> + ret = gh_hypercall_bell_set_mask(irqfd->ghrsc->capid, enable_mask, ack_mask);
> + if (ret)
> + pr_warn("irq %d couldn't be set as level triggered. Might cause IRQ storm if asserted\n",
> + irqfd->ticket.label);
> + }
> +
> + return 0;
> +}
> +
> +static void gh_irqfd_unpopulate(struct gh_vm_resource_ticket *ticket, struct gh_resource *ghrsc)
> +{
> + struct gh_irqfd *irqfd = container_of(ticket, struct gh_irqfd, ticket);
> + u64 cnt;
> +
> + eventfd_ctx_remove_wait_queue(irqfd->ctx, &irqfd->wait, &cnt);
> +}
> +
> +static long gh_irqfd_bind(struct gh_vm_function_instance *f)
> +{
> + struct gh_fn_irqfd_arg *args = f->argp;
> + struct gh_irqfd *irqfd;
> + __poll_t events;
> + struct fd fd;
> + long r;
> +
> + if (f->arg_size != sizeof(*args))
> + return -EINVAL;
> +
> + /* All other flag bits are reserved for future use */
> + if (args->flags & ~GH_IRQFD_LEVEL)
> + return -EINVAL;
> +
> + irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> + if (!irqfd)
> + return -ENOMEM;
> +
> + irqfd->f = f;
> + f->data = irqfd;
> +
In the next section you get a temporary reference to the FD,
then look up the eventfd context from its file. But in
gh_ioeventfd_bind() you just call eventfd_ctx_fdget().
I *think* you can do the same here, but perhaps I'm missing
something.
> + fd = fdget(args->fd);
> + if (!fd.file) {
> + kfree(irqfd);
> + return -EBADF;
> + }
> +
> + irqfd->ctx = eventfd_ctx_fileget(fd.file);
> + if (IS_ERR(irqfd->ctx)) {
> + r = PTR_ERR(irqfd->ctx);
> + goto err_fdput;
> + }
> +
I.e., rather than the two function calls above, you could just
call:
irqfd->ctx = eventfd_ctx_fdget(args->fd);
And in that case you also wouldn't need the fdput() call in the
error path below.
> + if (args->flags & GH_IRQFD_LEVEL)
> + irqfd->level = true;
> +
> + init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
> + init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
> +
> + irqfd->ticket.resource_type = GH_RESOURCE_TYPE_BELL_TX;
> + irqfd->ticket.label = args->label;
> + irqfd->ticket.owner = THIS_MODULE;
> + irqfd->ticket.populate = gh_irqfd_populate;
> + irqfd->ticket.unpopulate = gh_irqfd_unpopulate;
> +
> + r = gh_vm_add_resource_ticket(f->ghvm, &irqfd->ticket);
> + if (r)
> + goto err_ctx;
> +
> + events = vfs_poll(fd.file, &irqfd->pt);
> + if (events & EPOLLIN)
> + pr_warn("Premature injection of interrupt\n");
> + fdput(fd);
> +
> + return 0;
> +err_ctx:
> + eventfd_ctx_put(irqfd->ctx);
> +err_fdput:
> + fdput(fd);
> + kfree(irqfd);
> + return r;
> +}
> +
> +static void gh_irqfd_unbind(struct gh_vm_function_instance *f)
> +{
> + struct gh_irqfd *irqfd = f->data;
> +
> + gh_vm_remove_resource_ticket(irqfd->f->ghvm, &irqfd->ticket);
> + eventfd_ctx_put(irqfd->ctx);
> + kfree(irqfd);
> +}
> +
> +DECLARE_GH_VM_FUNCTION_INIT(irqfd, GH_FN_IRQFD, gh_irqfd_bind, gh_irqfd_unbind);
> +MODULE_DESCRIPTION("Gunyah irqfds");
Maybe singular, and maybe "Gunyah irqfd VM function(s)".
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 63395dacc1a8..0344b6988cfa 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -33,6 +33,11 @@ struct gh_resource {
> u32 rm_label;
> };
>
> +/**
> + * Gunyah Doorbells
> + */
> +#define GH_BELL_NONBLOCK BIT(32)
> +
> /**
> * Gunyah Message Queues
> */
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index e52265fa5715..5617dadc1c7b 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -76,6 +76,19 @@ struct gh_vm_dtb_config {
> */
> #define GH_FN_VCPU 1
>
> +/**
> + * GH_FN_IRQFD - register eventfd to assert a Gunyah doorbell
> + *
> + * gh_fn_desc is filled with gh_fn_irqfd_arg
> + *
> + * Allows setting an eventfd to directly trigger a guest interrupt.
> + * irqfd.fd specifies the file descriptor to use as the eventfd.
> + * irqfd.label corresponds to the doorbell label used in the guest VM's devicetree.
> + *
> + * Return: 0
> + */
> +#define GH_FN_IRQFD 2
> +
> #define GH_FN_MAX_ARG_SIZE 256
>
> /**
> @@ -88,6 +101,23 @@ struct gh_fn_vcpu_arg {
>
> #define GH_IRQFD_LEVEL (1UL << 0)
This is associated with the IRQFD "flags" field, so I'd name it
GH_IRQFD_FLAGS_LEVEL.
>
> +/**
> + * struct gh_fn_irqfd_arg - Arguments to create an irqfd function
> + * @fd: an eventfd which when written to will raise a doorbell
> + * @label: Label of the doorbell created on the guest VM
> + * @flags: GH_IRQFD_LEVEL configures the corresponding doorbell to behave
> + * like a level triggered interrupt.
> + * @padding: padding bytes
> + */
> +struct gh_fn_irqfd_arg {
> + __u32 fd;
Should the "fd" field be signed? Should it be an int? (Perhaps
you're trying to define a fixed kernel API, so __s32 if signed would
be better.)
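To illustrate the concern with a userspace sketch (the two struct
layouts are hypothetical and differ only in the fd type): with an
unsigned field, the usual -1 "no fd" value turns into a large
positive number and a negative-fd check can never fire.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical argument layouts, differing only in the fd type. */
struct arg_unsigned_fd {
	uint32_t fd;
};

struct arg_signed_fd {
	int32_t fd;
};

/* With an unsigned field, the stored value is never negative. */
static bool unsigned_fd_is_negative(const struct arg_unsigned_fd *a)
{
	return (int64_t)a->fd < 0;
}

static bool signed_fd_is_negative(const struct arg_signed_fd *a)
{
	return a->fd < 0;
}
```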
> + __u32 label;
> + __u32 flags;
> + __u32 padding;
> +};
> +
> +#define GH_IOEVENTFD_DATAMATCH (1UL << 0)
> +
> /**
> * struct gh_fn_desc - Arguments to create a VM function
> * @type: Type of the function. See GH_FN_* macro for supported types
On 3/3/23 7:06 PM, Elliot Berman wrote:
> Allow userspace to attach an ioeventfd to an mmio address within the guest.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
Mostly minor suggestions here. -Alex
> ---
> Documentation/virt/gunyah/vm-manager.rst | 2 +-
> drivers/virt/gunyah/Kconfig | 9 ++
> drivers/virt/gunyah/Makefile | 1 +
> drivers/virt/gunyah/gunyah_ioeventfd.c | 117 +++++++++++++++++++++++
> include/uapi/linux/gunyah.h | 37 +++++++
> 5 files changed, 165 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
>
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> index a1dd70f0cbf6..cd41a705849f 100644
> --- a/Documentation/virt/gunyah/vm-manager.rst
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -124,7 +124,7 @@ the VM starts.
> The possible types are documented below:
>
> .. kernel-doc:: include/uapi/linux/gunyah.h
> - :identifiers: GH_FN_VCPU gh_fn_vcpu_arg GH_FN_IRQFD gh_fn_irqfd_arg
> + :identifiers: GH_FN_VCPU gh_fn_vcpu_arg GH_FN_IRQFD gh_fn_irqfd_arg GH_FN_IOEVENTFD gh_fn_ioeventfd_arg
>
> Gunyah VCPU API Descriptions
> ----------------------------
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index 2cde24d429d1..bd8e31184962 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -35,3 +35,12 @@ config GUNYAH_IRQFD
> on Gunyah virtual machine.
>
> Say Y/M here if unsure and you want to support Gunyah VMMs.
> +
> +config GUNYAH_IOEVENTFD
> + tristate "Gunyah ioeventfd interface"
> + depends on GUNYAH
> + help
> + Enable kernel support for creating ioeventfds which can alert userspace
> + when a Gunyah virtual machine accesses a memory address.
> +
> + Say Y/M here if unsure and you want to support Gunyah VMMs.
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 6cf756bfa3c2..7347b1470491 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -8,3 +8,4 @@ obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>
> obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
> obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
> +obj-$(CONFIG_GUNYAH_IOEVENTFD) += gunyah_ioeventfd.o
> diff --git a/drivers/virt/gunyah/gunyah_ioeventfd.c b/drivers/virt/gunyah/gunyah_ioeventfd.c
> new file mode 100644
> index 000000000000..517f55706ed9
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_ioeventfd.c
> @@ -0,0 +1,117 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/eventfd.h>
> +#include <linux/file.h>
> +#include <linux/fs.h>
> +#include <linux/gunyah.h>
> +#include <linux/gunyah_vm_mgr.h>
> +#include <linux/module.h>
> +#include <linux/printk.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +struct gh_ioeventfd {
> + struct gh_vm_function_instance *f;
> + struct gh_vm_io_handler io_handler;
> +
> + struct eventfd_ctx *ctx;
> +};
> +
> +static int gh_write_ioeventfd(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data)
> +{
> + struct gh_ioeventfd *iofd = container_of(io_dev, struct gh_ioeventfd, io_handler);
> +
I think it's interesting that this signals an event even if
len is zero. I'm not saying it's wrong, just interesting...
> + eventfd_signal(iofd->ctx, 1);
> + return 0;
> +}
> +
> +static struct gh_vm_io_handler_ops io_ops = {
> + .write = gh_write_ioeventfd,
> +};
> +
> +static long gh_ioeventfd_bind(struct gh_vm_function_instance *f)
> +{
> + const struct gh_fn_ioeventfd_arg *args = f->argp;
> + struct eventfd_ctx *ctx = NULL;
No need to initialize ctx.
> + struct gh_ioeventfd *iofd;
> + int ret;
> +
> + if (f->arg_size != sizeof(*args))
> + return -EINVAL;
> +
> + /* must be natural-word sized, or 0 to ignore length */
> + switch (args->len) {
> + case 0:
> + case 1:
> + case 2:
> + case 4:
> + case 8:
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + /* check for range overflow */
> + if (args->addr + args->len < args->addr)
I think you could use:
if (overflows_type(args->addr + args->len, args->addr))
This is a relatively recent addition (and I haven't been using it
myself yet) but it's meant for this purpose. Consider using it
and its relatives here and anywhere else you're making this kind
of check.
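For reference, a userspace sketch of the wraparound check itself.
It uses the GCC/Clang builtin so that it runs outside the kernel;
as far as I know, check_add_overflow() in <linux/overflow.h> is the
in-kernel helper that performs the addition and reports whether it
wrapped, so that may be the closer fit than overflows_type() here.
The function name below is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if [addr, addr + len) wraps past the end of the u64 space. */
static bool range_wraps(uint64_t addr, uint32_t len)
{
	uint64_t end;

	/* The builtin stores the (possibly wrapped) sum and reports overflow. */
	return __builtin_add_overflow(addr, (uint64_t)len, &end);
}
```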
> + return -EINVAL;
> +
> + /* ioeventfd with no length can't be combined with DATAMATCH */
> + if (!args->len && (args->flags & GH_IOEVENTFD_DATAMATCH))
> + return -EINVAL;
> +
Maybe check for invalid flags before ensuring
valid flags are used properly?
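Roughly this ordering, as a userspace sketch. The function name is
hypothetical; the flag value and the accepted lengths are taken
from the patch.

```c
#include <errno.h>
#include <stdint.h>

#define GH_IOEVENTFD_DATAMATCH	(1UL << 0)

/* Hypothetical helper showing the order of the argument checks. */
static int validate_ioeventfd_args(uint32_t len, uint32_t flags)
{
	/* First reject any flag bits we don't know about. */
	if (flags & ~GH_IOEVENTFD_DATAMATCH)
		return -EINVAL;

	/* Must be natural-word sized, or 0 to ignore length. */
	switch (len) {
	case 0:
	case 1:
	case 2:
	case 4:
	case 8:
		break;
	default:
		return -EINVAL;
	}

	/* Only then check that the known flags are used consistently. */
	if (!len && (flags & GH_IOEVENTFD_DATAMATCH))
		return -EINVAL;

	return 0;
}
```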
> + /* All other flag bits are reserved for future use */
> + if (args->flags & ~GH_IOEVENTFD_DATAMATCH)
> + return -EINVAL;
> +
> + ctx = eventfd_ctx_fdget(args->fd);
> + if (IS_ERR(ctx))
> + return PTR_ERR(ctx);
> +
> + iofd = kzalloc(sizeof(*iofd), GFP_KERNEL);
> + if (!iofd) {
> + ret = -ENOMEM;
> + goto err_eventfd;
> + }
> +
> + f->data = iofd;
> + iofd->f = f;
> +
> + iofd->ctx = ctx;
> +
> + if (args->flags & GH_IOEVENTFD_DATAMATCH) {
> + iofd->io_handler.datamatch = true;
> + iofd->io_handler.len = args->len;
> + iofd->io_handler.data = args->datamatch;
I think you might want to rename one or the other of these
fields (datamatch or data). I might be wrong; I'll explain
elsewhere what I mean.
> + }
> + iofd->io_handler.addr = args->addr;
> + iofd->io_handler.ops = &io_ops;
> +
> + ret = gh_vm_add_io_handler(f->ghvm, &iofd->io_handler);
> + if (ret)
> + goto err_io_dev_add;
> +
> + return 0;
> +
> +err_io_dev_add:
> + kfree(iofd);
> +err_eventfd:
> + eventfd_ctx_put(ctx);
> + return ret;
> +}
> +
> +static void gh_ioevent_unbind(struct gh_vm_function_instance *f)
> +{
> + struct gh_ioeventfd *iofd = f->data;
> +
> + eventfd_ctx_put(iofd->ctx);
It's not a big deal but I prefer to "undo" everything in the
reverse order that they are originally "done". I.e., put the
eventfd context after removing the I/O handler.
> + gh_vm_remove_io_handler(iofd->f->ghvm, &iofd->io_handler);
> + kfree(iofd);
> +}
> +
> +DECLARE_GH_VM_FUNCTION_INIT(ioeventfd, GH_FN_IOEVENTFD,
> + gh_ioeventfd_bind, gh_ioevent_unbind);
> +MODULE_DESCRIPTION("Gunyah ioeventfds");
s/ioeventfds/ioeventfd/
I understand why you might want it to be plural, but I think it's
better to just name the abstraction. (If you take this suggestion,
check elsewhere and be consistent.)
AND/OR... You might also somehow incorporate the fact that this is a
VM *function* that is represented: "Gunyah ioeventfd VM function(s)"
> +MODULE_LICENSE("GPL");
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index 5617dadc1c7b..f8482ff4cc55 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -89,6 +89,23 @@ struct gh_vm_dtb_config {
> */
> #define GH_FN_IRQFD 2
>
> +/**
> + * GH_FN_IOEVENTFD - register ioeventfd to trigger when VM faults on parameter
What does "faults on parameter" mean?
> + *
> + * gh_fn_desc is filled with gh_fn_ioeventfd_arg
> + *
> + * Attaches an ioeventfd to a legal mmio address within the guest. A guest write
> + * in the registered address will signal the provided event instead of triggering
> + * an exit on the GH_VCPU_RUN ioctl.
> + *
> + * If GH_IOEVENTFD_DATAMATCH flag is set, the event will be signaled only if the
> + * written value to the registered address is equal to datamatch in
> + * struct gh_fn_ioeventfd_arg.
> + *
> + * Return: 0
> + */
> +#define GH_FN_IOEVENTFD 3
If you added another tab before 3, it will align more nicely with the
next definition. (If you do that, add a tab in the other function
definitions as well.)
> +
> #define GH_FN_MAX_ARG_SIZE 256
>
> /**
> @@ -118,6 +135,26 @@ struct gh_fn_irqfd_arg {
>
> #define GH_IOEVENTFD_DATAMATCH (1UL << 0)
>
> +/**
> + * struct gh_fn_ioeventfd_arg - Arguments to create an ioeventfd function
> + * @datamatch: data used when GH_IOEVENTFD_DATAMATCH is set
> + * @addr: Address in guest memory
> + * @len: Length of access
> + * @fd: When ioeventfd is matched, this eventfd is written
> + * @flags: If GH_IOEVENTFD_DATAMATCH flag is set, the event will be signaled
> + * only if the written value to the registered address is equal to
> + * @datamatch
> + * @padding: padding bytes
> + */
> +struct gh_fn_ioeventfd_arg {
> + __u64 datamatch;
> + __u64 addr; /* legal mmio address */
> + __u32 len; /* 1, 2, 4, or 8 bytes; or 0 to ignore length */
> + __s32 fd;
> + __u32 flags;
> + __u32 padding;
> +};
> +
> /**
> * struct gh_fn_desc - Arguments to create a VM function
> * @type: Type of the function. See GH_FN_* macro for supported types
On 3/31/2023 7:24 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> Add architecture-independent standard error codes, types, and macros for
>> Gunyah hypercalls.
>>
>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>
> See a few comments below. -Alex
>
>> ---
>> include/linux/gunyah.h | 83 ++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 83 insertions(+)
>> create mode 100644 include/linux/gunyah.h
>>
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> new file mode 100644
>> index 000000000000..54b4be71caf7
>> --- /dev/null
>> +++ b/include/linux/gunyah.h
>> @@ -0,0 +1,83 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _LINUX_GUNYAH_H
>> +#define _LINUX_GUNYAH_H
>> +
>> +#include <linux/errno.h>
>> +#include <linux/limits.h>
>> +
>> +/******************************************************************************/
>> +/* Common arch-independent definitions for Gunyah
>> hypercalls */
>> +#define GH_CAPID_INVAL U64_MAX
>> +#define GH_VMID_ROOT_VM 0xff
>
> The above definition doesn't seem to be used anywhere, but seeing
> it begs the question to me of what type it is expected to have.
> If it were used, where would it be used in an 8 bit field?
>
VMIDs are u16, and the root VM (Resource Manager) VMID is 0xff. I think
this definition snuck in from the downstream code and is indeed not
needed or used anywhere. I'll remove it.
>> +
>> +enum gh_error {
>> + GH_ERROR_OK = 0,
>> + GH_ERROR_UNIMPLEMENTED = -1,
>> + GH_ERROR_RETRY = -2,
>
> There might be nothing fundamentally wrong with this, but I
> dislike seeing negative values assigned to enums.
>
> These error values are returned from the hypervisor, and it
> looks like they'll likely be truncated from a 64-bit unsigned
> value. Are they *sent* from the hypervisor as 64-bit signed
> values? Or 32-bit signed values? (In that case, the
>
> I just wonder if you can use 0xffffffff or 0xffff for example
> rather than -1, depending on the actual value that gets passed.
>
They are sent from the hypervisor as 64-bit signed values (it's filling
a register). I think truncating should be OK because Gunyah wants to
maintain compatibility with 32-bit architectures and we would not see an
error number that truly requires more than 32 bits to represent.
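As a sanity check of the truncation argument, a userspace sketch.
The helper name is hypothetical, and the narrowing conversion
relies on the two's-complement behaviour that GCC and Clang
provide.

```c
#include <stdint.h>

/* Hypothetical helper: narrow a 64-bit hypercall return register to 32 bits. */
static int32_t gh_error_from_reg(uint64_t reg)
{
	/* Keep the low 32 bits, then reinterpret them as signed. */
	return (int32_t)(uint32_t)reg;
}
```

A register holding 64-bit -1 or -2 still reads back as -1 or -2
after narrowing, and the small positive codes are unaffected.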
>> +
>> + GH_ERROR_ARG_INVAL = 1,
>> + GH_ERROR_ARG_SIZE = 2,
>> + GH_ERROR_ARG_ALIGN = 3,
>> +
>> + GH_ERROR_NOMEM = 10,
>> +
>> + GH_ERROR_ADDR_OVFL = 20,
>> + GH_ERROR_ADDR_UNFL = 21,
>> + GH_ERROR_ADDR_INVAL = 22,
>> +
>> + GH_ERROR_DENIED = 30,
>> + GH_ERROR_BUSY = 31,
>> + GH_ERROR_IDLE = 32,
>> +
>> + GH_ERROR_IRQ_BOUND = 40,
>> + GH_ERROR_IRQ_UNBOUND = 41,
>> +
>> + GH_ERROR_CSPACE_CAP_NULL = 50,
>> + GH_ERROR_CSPACE_CAP_REVOKED = 51,
>> + GH_ERROR_CSPACE_WRONG_OBJ_TYPE = 52,
>> + GH_ERROR_CSPACE_INSUF_RIGHTS = 53,
>> + GH_ERROR_CSPACE_FULL = 54,
>> +
>> + GH_ERROR_MSGQUEUE_EMPTY = 60,
>> + GH_ERROR_MSGQUEUE_FULL = 61,
>> +};
>> +
>> +/**
>> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux
>> error code
>> + * @gh_error: Gunyah hypercall return value
>> + */
>> +static inline int gh_remap_error(enum gh_error gh_error)
>
> Since you're remapping a gh_error, I would have named this
> gh_error_remap().
>
Done.
>> +{
>> + switch (gh_error) {
>> + case GH_ERROR_OK:
>> + return 0;
>> + case GH_ERROR_NOMEM:
>> + return -ENOMEM;
>> + case GH_ERROR_DENIED:
>> + case GH_ERROR_CSPACE_CAP_NULL:
>> + case GH_ERROR_CSPACE_CAP_REVOKED:
>> + case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
>> + case GH_ERROR_CSPACE_INSUF_RIGHTS:
>> + case GH_ERROR_CSPACE_FULL:
>> + return -EACCES;
>> + case GH_ERROR_BUSY:
>> + case GH_ERROR_IDLE:
>> + return -EBUSY;
>> + case GH_ERROR_IRQ_BOUND:
>> + case GH_ERROR_IRQ_UNBOUND:
>> + case GH_ERROR_MSGQUEUE_FULL:
>> + case GH_ERROR_MSGQUEUE_EMPTY:
>> + return -EIO;
>> + case GH_ERROR_UNIMPLEMENTED:
>> + case GH_ERROR_RETRY:
>> + return -EOPNOTSUPP;
>> + default:
>> + return -EINVAL;
>> + }
>> +}
>> +
>> +#endif
>
On 3/31/2023 7:25 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> Gunyah message queues are a unidirectional inter-VM pipe for messages up
>> to 1024 bytes. This driver supports pairing a receiver message queue and
>> a transmitter message queue to expose a single mailbox channel.
>
> I think it's good to reuse existing frameworks, for example, using
> the mailbox abstraction to implement your messaging code. But I
> find there are some minor mismatches between what you need and
> the way the mailbox code works.
>
> I'm not really suggesting you change anything, but I'll just say
> it seemed like there were a few spots you needed to do things
> that were slightly awkward in order to satisfy mailbox requirements.
>
> I'll point out in a few comments what I mean below.
>
> I'll take one more look at it again next time, but I assume this
> works and I have no other new comments today.
>
> -Alex
>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>> Documentation/virt/gunyah/message-queue.rst | 8 +
>> drivers/mailbox/Makefile | 2 +
>> drivers/mailbox/gunyah-msgq.c | 209 ++++++++++++++++++++
>> include/linux/gunyah.h | 57 ++++++
>> 4 files changed, 276 insertions(+)
>> create mode 100644 drivers/mailbox/gunyah-msgq.c
>>
>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>> b/Documentation/virt/gunyah/message-queue.rst
>> index b352918ae54b..70d82a4ef32d 100644
>> --- a/Documentation/virt/gunyah/message-queue.rst
>> +++ b/Documentation/virt/gunyah/message-queue.rst
>> @@ -61,3 +61,11 @@ vIRQ: two TX message queues will have two vIRQs
>> (and two capability IDs).
>> | | | |
>> | |
>> | | | |
>> | |
>> +---------------+ +-----------------+
>> +---------------+
>> +
>> +Gunyah message queues are exposed as mailboxes. To create the
>> mailbox, create
>> +a mbox_client and call `gh_msgq_init()`. On receipt of the RX_READY
>> interrupt,
>> +all messages in the RX message queue are read and pushed via the
>> `rx_callback`
>> +of the registered mbox_client.
>> +
>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>> + :identifiers: gh_msgq_init
>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>> index fc9376117111..5f929bb55e9a 100644
>> --- a/drivers/mailbox/Makefile
>> +++ b/drivers/mailbox/Makefile
>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>> +obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
>> +
>> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
>> obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
>> diff --git a/drivers/mailbox/gunyah-msgq.c
>> b/drivers/mailbox/gunyah-msgq.c
>> new file mode 100644
>> index 000000000000..1989298653f9
>> --- /dev/null
>> +++ b/drivers/mailbox/gunyah-msgq.c
>> @@ -0,0 +1,209 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/module.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/gunyah.h>
>> +#include <linux/printk.h>
>> +#include <linux/init.h>
>> +#include <linux/slab.h>
>> +#include <linux/wait.h>
>> +
>> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct
>> gh_msgq, mbox))
>> +
>> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
>> +{
>> + struct gh_msgq *msgq = data;
>> + struct gh_msgq_rx_data rx_data;
>> + enum gh_error gh_error;
>> + bool ready = true;
>> +
>> + while (ready) {
>> + gh_error = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
>> + &rx_data.data, sizeof(rx_data.data),
>> + &rx_data.length, &ready);
>> + if (gh_error != GH_ERROR_OK) {
>> + if (gh_error != GH_ERROR_MSGQUEUE_EMPTY)
>> + dev_warn(msgq->mbox.dev, "Failed to receive data:
>> %d\n", gh_error);
>> + break;
>> + }
>> + mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
>> + }
>> +
>> + return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired when message queue transitions from "full" to "space
>> available" to send messages */
>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>> +{
>> + struct gh_msgq *msgq = data;
>> +
>> + mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>> +
>> + return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired after sending message and hypercall told us there was more
>> space available. */
>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>> +{
>> + struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq,
>> txdone_tasklet);
>> +
>> + mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
>> +}
>> +
>> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
>> +{
>> + struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
>> + struct gh_msgq_tx_data *msgq_data = data;
>> + u64 tx_flags = 0;
>> + enum gh_error gh_error;
>> + bool ready;
>> +
>> + if (msgq_data->push)
>> + tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
>> +
>> + gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid,
>> msgq_data->length, msgq_data->data,
>> + tx_flags, &ready);
>> +
>> + /**
>> + * unlikely because Linux tracks state of msgq and should not try to
>> + * send message when msgq is full.
>> + */
>> + if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
>> + return -EAGAIN;
>> +
>> + /**
>> + * Propagate all other errors to client. If we return error to
>> mailbox
>> + * framework, then no other messages can be sent and nobody will
>> know
>> + * to retry this message.
>
> If you weren't using the mailbox framework, would you be
> sending the error to the client in this case? (I'm just
> curious; it's good to document the behavior if you were
> to return it to the mailbox framework.)
>
>> + */
>> + msgq->last_ret = gh_remap_error(gh_error);
>> +
>> + /**
>> + * This message was successfully sent, but message queue isn't
>> ready to
>> + * accept more messages because it's now full. Mailbox framework
>> + * requires that we only report that message was transmitted when
>> + * we're ready to transmit another message. We'll get that in the
>> form
>> + * of tx IRQ once the other side starts to drain the msgq.
>
> So you are forced to delay reporting the completion
> here because you're using the mailbox framework.
>
>> + */
>> + if (gh_error == GH_ERROR_OK) {
>> + if (!ready)
>> + return 0;
>> + } else
>> + dev_err(msgq->mbox.dev, "Failed to send data: %d (%d)\n",
>> gh_error, msgq->last_ret);
>> +
>> + /**
>> + * We can send more messages. Mailbox framework requires that tx
>> done
>> + * happens asynchronously to sending the message. Gunyah message
>> queues
>> + * tell us right away on the hypercall return whether we can send
>> more
>> + * messages. To work around this, defer the txdone to a tasklet.
>> + */
>
> If you weren't using the mailbox framework, you'd send the next
> message directly rather than scheduling this tasklet to do it.
>
That's correct for this and the other 2 comments above.
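For reference, the three outcomes of the send path discussed above can be
modelled as below. This is a user-space sketch only; the enum and function
names are hypothetical, and the booleans stand in for the hypercall result
(GH_ERROR_MSGQUEUE_FULL, GH_ERROR_OK) and the "ready" output flag.

```c
#include <stdbool.h>

/* Hypothetical names, used only to label the three outcomes. */
enum tx_action {
	TX_RETRY_LATER,		/* queue was full: -EAGAIN, nothing was sent */
	TX_WAIT_FOR_IRQ,	/* sent, queue now full: tx IRQ reports txdone */
	TX_DEFER_TXDONE,	/* sent with room left, or failed: tasklet reports txdone */
};

/* Model of the decision made at the end of gh_msgq_send_data(). */
static enum tx_action gh_msgq_tx_action(bool queue_full, bool ok, bool ready)
{
	if (queue_full)
		return TX_RETRY_LATER;
	if (ok && !ready)
		return TX_WAIT_FOR_IRQ;
	return TX_DEFER_TXDONE;
}
```

Only the last case needs the tasklet; without the mailbox framework it would
be the point where the next message could be sent directly.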
>> + tasklet_schedule(&msgq->txdone_tasklet);
>> +
>> + return 0;
>> +}
>> +
>> +static struct mbox_chan_ops gh_msgq_ops = {
>> + .send_data = gh_msgq_send_data,
>> +};
>> +
>> +/**
>> + * gh_msgq_init() - Initialize a Gunyah message queue with an
>> mbox_client
>> + * @parent: optional, device parent used for the mailbox controller
>> + * @msgq: Pointer to the gh_msgq to initialize
>> + * @cl: A mailbox client to bind to the mailbox channel that the
>> message queue creates
>> + * @tx_ghrsc: optional, the transmission side of the message queue
>> + * @rx_ghrsc: optional, the receiving side of the message queue
>> + *
>> + * At least one of tx_ghrsc and rx_ghrsc must be not NULL. Most
>> message queue use cases come with
>> + * a pair of message queues to facilitate bidirectional
>> communication. When tx_ghrsc is set,
>> + * the client can send messages with
>> mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
>> + * is set, the mbox_client must register an .rx_callback() and the
>> message queue driver will
>> + * deliver all available messages upon receiving the RX ready
>> interrupt. The messages should be
>> + * consumed or copied by the client right away as the gh_msgq_rx_data
>> will be replaced/destroyed
>> + * after the callback.
>> + *
>> + * Returns - 0 on success, negative otherwise
>> + */
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct
>> mbox_client *cl,
>> + struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc)
>> +{
>> + int ret;
>> +
>> + /* Must have at least a tx_ghrsc or rx_ghrsc and that they are
>> the right device types */
>> + if ((!tx_ghrsc && !rx_ghrsc) ||
>> + (tx_ghrsc && tx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_TX) ||
>> + (rx_ghrsc && rx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_RX))
>> + return -EINVAL;
>> +
>> + if (!gh_api_has_feature(GH_FEATURE_MSGQUEUE))
>> + return -EOPNOTSUPP;
>> +
>> + msgq->tx_ghrsc = tx_ghrsc;
>> + msgq->rx_ghrsc = rx_ghrsc;
>> +
>> + msgq->mbox.dev = parent;
>> + msgq->mbox.ops = &gh_msgq_ops;
>> + msgq->mbox.num_chans = 1;
>> + msgq->mbox.txdone_irq = true;
>> + msgq->mbox.chans = &msgq->mbox_chan;
>> +
>> + if (msgq->tx_ghrsc) {
>> + ret = request_irq(msgq->tx_ghrsc->irq,
>> gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
>> + msgq);
>> + if (ret)
>> + goto err_chans;
>> + }
>> +
>> + if (msgq->rx_ghrsc) {
>> + ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL,
>> gh_msgq_rx_irq_handler,
>> + IRQF_ONESHOT, "gh_msgq_rx", msgq);
>> + if (ret)
>> + goto err_tx_irq;
>> + }
>> +
>> + tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
>> +
>> + ret = mbox_controller_register(&msgq->mbox);
>> + if (ret)
>> + goto err_rx_irq;
>> +
>> + ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
>> + if (ret)
>> + goto err_mbox;
>> +
>> + return 0;
>> +err_mbox:
>> + mbox_controller_unregister(&msgq->mbox);
>> +err_rx_irq:
>> + if (msgq->rx_ghrsc)
>> + free_irq(msgq->rx_ghrsc->irq, msgq);
>> +err_tx_irq:
>> + if (msgq->tx_ghrsc)
>> + free_irq(msgq->tx_ghrsc->irq, msgq);
>> +err_chans:
>> + kfree(msgq->mbox.chans);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_msgq_init);
>> +
>> +void gh_msgq_remove(struct gh_msgq *msgq)
>> +{
>> + tasklet_kill(&msgq->txdone_tasklet);
>> + mbox_controller_unregister(&msgq->mbox);
>> +
>> + if (msgq->rx_ghrsc)
>> + free_irq(msgq->rx_ghrsc->irq, msgq);
>> +
>> + if (msgq->tx_ghrsc)
>> + free_irq(msgq->tx_ghrsc->irq, msgq);
>> +
>> + kfree(msgq->mbox.chans);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> index 18cfbf5ee48b..378bec0f2ce1 100644
>> --- a/include/linux/gunyah.h
>> +++ b/include/linux/gunyah.h
>> @@ -8,11 +8,68 @@
>> #include <linux/bitfield.h>
>> #include <linux/errno.h>
>> +#include <linux/interrupt.h>
>> #include <linux/limits.h>
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/mailbox_client.h>
>> #include <linux/types.h>
>> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
>
> I'm not sure what you mean by "Follows" here. You mean that these are
> the gh_rm_hyp_resource type values that GET_HYP_RESOURCES can return?
>
Correct. I will say "Matches ... for VM_GET_HYP_RESOURCES RPC" if it
helps make it clearer.
> Note that gh_resource_type values must fit in an 8 bit field.
>
>> +enum gh_resource_type {
>> + GH_RESOURCE_TYPE_BELL_TX = 0,
>> + GH_RESOURCE_TYPE_BELL_RX = 1,
>> + GH_RESOURCE_TYPE_MSGQ_TX = 2,
>> + GH_RESOURCE_TYPE_MSGQ_RX = 3,
>
> Fix alignment below.
>
>> + GH_RESOURCE_TYPE_VCPU = 4,
>> +};
>> +
>> +struct gh_resource {
>> + enum gh_resource_type type;
>> + u64 capid;
>> + unsigned int irq;
>> +};
>> +
>> +/**
>> + * Gunyah Message Queues
>> + */
>> +
>> +#define GH_MSGQ_MAX_MSG_SIZE 240
>
> Maybe insert another tab before the 240. You later define
> GH_BELL_NONBLOCK that far out, and aligning them will look
> better.
>
>> +
>> +struct gh_msgq_tx_data {
>> + size_t length;
>> + bool push;
>> + char data[];
>> +};
>> +
>> +struct gh_msgq_rx_data {
>> + size_t length;
>> + char data[GH_MSGQ_MAX_MSG_SIZE];
>> +};
>> +
>> +struct gh_msgq {
>> + struct gh_resource *tx_ghrsc;
>> + struct gh_resource *rx_ghrsc;
>> +
>> + /* msgq private */
>> + int last_ret; /* Linux error, not GH_STATUS_* */
>> + struct mbox_chan mbox_chan;
>> + struct mbox_controller mbox;
>> + struct tasklet_struct txdone_tasklet;
>> +};
>> +
>> +
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct
>> mbox_client *cl,
>> + struct gh_resource *tx_ghrsc, struct gh_resource
>> *rx_ghrsc);
>> +void gh_msgq_remove(struct gh_msgq *msgq);
>> +
>> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
>> +{
>> + return &msgq->mbox.chans[0];
>> +}
>> +
>>
>> /******************************************************************************/
>> /* Common arch-independent definitions for Gunyah
>> hypercalls */
>> +
>> #define GH_CAPID_INVAL U64_MAX
>> #define GH_VMID_ROOT_VM 0xff
>
On 3/31/2023 7:25 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> The resource manager is a special virtual machine which is always
>> running on a Gunyah system. It provides APIs for creating and destroying
>> VMs, secure memory management, sharing/lending of memory between VMs,
>> and setup of inter-VM communication. Calls to the resource manager are
>> made via message queues.
>>
>> This patch implements the basic probing and RPC mechanism to make those
>> API calls. Request/response calls can be made with gh_rm_call.
>> Drivers can also register to notifications pushed by RM via
>> gh_rm_register_notifier
>>
>> Specific API calls that resource manager supports will be implemented in
>> subsequent patches.
>
> Mostly very simple issues noted here. -Alex
>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>> drivers/virt/gunyah/Makefile | 3 +
>> drivers/virt/gunyah/rsc_mgr.c | 688 +++++++++++++++++++++++++++++++++
>> drivers/virt/gunyah/rsc_mgr.h | 16 +
>> include/linux/gunyah_rsc_mgr.h | 21 +
>> 4 files changed, 728 insertions(+)
>> create mode 100644 drivers/virt/gunyah/rsc_mgr.c
>> create mode 100644 drivers/virt/gunyah/rsc_mgr.h
>> create mode 100644 include/linux/gunyah_rsc_mgr.h
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index 34f32110faf9..cc864ff5abbb 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -1,3 +1,6 @@
>> # SPDX-License-Identifier: GPL-2.0
>> obj-$(CONFIG_GUNYAH) += gunyah.o
>> +
>> +gunyah_rsc_mgr-y += rsc_mgr.o
>> +obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr.c
>> b/drivers/virt/gunyah/rsc_mgr.c
>> new file mode 100644
>> index 000000000000..67813c9a52db
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr.c
>> @@ -0,0 +1,688 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>
> . . .
>
>> +static void gh_rm_try_complete_connection(struct gh_rm *rm)
>> +{
>> + struct gh_rm_connection *connection = rm->active_rx_connection;
>> +
>> + if (!connection || connection->fragments_received !=
>> connection->num_fragments)
>> + return;
>> +
>> + switch (connection->type) {
>> + case RM_RPC_TYPE_REPLY:
>> + complete(&connection->reply.seq_done);
>> + break;
>> + case RM_RPC_TYPE_NOTIF:
>> + schedule_work(&connection->notification.work);
>> + break;
>> + default:
>> + dev_err_ratelimited(rm->dev, "Invalid message type (%d)
>> received\n",
>
> s/%d/%u/
>
>> + connection->type);
>> + gh_rm_abort_connection(rm);
>> + break;
>> + }
>> +
>> + rm->active_rx_connection = NULL;
>> +}
>> +
>> +static void gh_rm_msgq_rx_data(struct mbox_client *cl, void *mssg)
>> +{
>> + struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
>> + struct gh_msgq_rx_data *rx_data = mssg;
>> + size_t msg_size = rx_data->length;
>> + void *msg = rx_data->data;
>> + struct gh_rm_rpc_hdr *hdr;
>> +
>> + if (msg_size < sizeof(*hdr) || msg_size > GH_MSGQ_MAX_MSG_SIZE)
>> + return;
>> +
>> + hdr = msg;
>> + if (hdr->api != RM_RPC_API) {
>> + dev_err(rm->dev, "Unknown RM RPC API version: %x\n", hdr->api);
>> + return;
>> + }
>> +
>> + switch (FIELD_GET(RM_RPC_TYPE_MASK, hdr->type)) {
>> + case RM_RPC_TYPE_NOTIF:
>> + gh_rm_process_notif(rm, msg, msg_size);
>> + break;
>> + case RM_RPC_TYPE_REPLY:
>> + gh_rm_process_rply(rm, msg, msg_size);
>> + break;
>> + case RM_RPC_TYPE_CONTINUATION:
>> + gh_rm_process_cont(rm, rm->active_rx_connection, msg, msg_size);
>> + break;
>> + default:
>> + dev_err(rm->dev, "Invalid message type (%lu) received\n",
>> + FIELD_GET(RM_RPC_TYPE_MASK, hdr->type));
>> + return;
>> + }
>> +
>> + gh_rm_try_complete_connection(rm);
>> +}
>> +
>> +static void gh_rm_msgq_tx_done(struct mbox_client *cl, void *mssg,
>> int r)
>> +{
>> + struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
>> +
>> + kmem_cache_free(rm->cache, mssg);
>> + rm->last_tx_ret = r;
>> +}
>> +
>> +static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
>> + const void *req_buff, size_t req_buf_size,
>> + struct gh_rm_connection *connection)
>> +{
>> + size_t buf_size_remaining = req_buf_size;
>> + const void *req_buf_curr = req_buff;
>> + struct gh_msgq_tx_data *msg;
>> + struct gh_rm_rpc_hdr *hdr, hdr_template;
>> + u32 cont_fragments = 0;
>> + size_t payload_size;
>> + void *payload;
>> + int ret;
>> +
>> + if (req_buf_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
>> + dev_warn(rm->dev, "Limit exceeded for the number of
>> fragments: %u\n",
>> + cont_fragments);
>
> You are printing the value of cont_fragments here when it's just zero.
>
>> + dump_stack();
>> + return -E2BIG;
>> + }
>> +
>
> Move the computation of cont_fragments prior to the block above.
> You could use a ?: statement to assign it.
>
>> + if (req_buf_size)
>> + cont_fragments = (req_buf_size - 1) / GH_RM_MAX_MSG_SIZE;
>> +
>> + hdr_template.api = RM_RPC_API;
>> + hdr_template.type = FIELD_PREP(RM_RPC_TYPE_MASK,
>> RM_RPC_TYPE_REQUEST) |
>> + FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
>
> The line above should be indented further.
>
>> + hdr_template.seq = cpu_to_le16(connection->reply.seq);
>> + hdr_template.msg_id = cpu_to_le32(message_id);
>> +
>> + ret = mutex_lock_interruptible(&rm->send_lock);
>> + if (ret)
>> + return ret;
>> +
>> + /* Consider also the 'request' packet for the loop count */
>
> I don't think the comment above is helpful.
>
>> + do {
>> + msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
>> + if (!msg) {
>> + ret = -ENOMEM;
>> + goto out;
>> + }
>> +
>> + /* Fill header */
>> + hdr = (struct gh_rm_rpc_hdr *)msg->data;
>
> I personally would prefer &msg->data[0] in this case.
>
>> + *hdr = hdr_template;
>> +
>> + /* Copy payload */
>> + payload = hdr + 1;
>
> I think I might have suggested using "hdr + 1" here.
>
> Elsewhere you use something like:
> payload = (char *)hdr + sizeof(hdr);
> or something similar. I suggest you choose one approach and use
> it consistently throughout the driver. Either is fine, but I
> have a slight preference for the "hdr + 1" way.
>
I think you might be referencing the memcpy in
gh_rm_init_connection_payload. In gh_rm_init_connection_payload,
hdr_size is not fixed: for notifications, it's just the RPC header. For
responses, there is the RPC header + the "RM error code". To be able to
re-use the same header processing, I'd have to do byte arithmetic rather
than the "hdr + 1" way. I also prefer the "hdr + 1" way, but if I am
going to be consistent, I need to stick with byte arithmetic.
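A small user-space sketch of the two approaches, for comparison. The struct
layout and the helper names are assumptions made for illustration; only the
idea (fixed header vs. header whose size depends on the message type) comes
from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the RPC header; this 8-byte layout is an assumption. */
struct rpc_hdr {
	uint8_t api;
	uint8_t type;
	uint16_t seq;
	uint32_t msg_id;
};

/* Fixed-size header: the payload starts right after the struct. */
static inline void *payload_fixed(struct rpc_hdr *hdr)
{
	return hdr + 1;
}

/*
 * Variable-size header: the caller passes the header size, e.g.
 * sizeof(struct rpc_hdr) for notifications, or that plus the size of
 * the RM error code for replies.
 */
static inline void *payload_at(void *msg, size_t hdr_size)
{
	return (char *)msg + hdr_size;
}
```

Both helpers return the same address when hdr_size == sizeof(struct rpc_hdr);
only the byte-arithmetic form also covers the reply case.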
>> + payload_size = min(buf_size_remaining, GH_RM_MAX_MSG_SIZE);
>> + memcpy(payload, req_buf_curr, payload_size);
>> + req_buf_curr += payload_size;
>> + buf_size_remaining -= payload_size;
>> +
>> + /* Force the last fragment to immediately alert the receiver */
>> + msg->push = !buf_size_remaining;
>> + msg->length = sizeof(*hdr) + payload_size;
>> +
>> + ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
>> + if (ret < 0) {
>> + kmem_cache_free(rm->cache, msg);
>> + break;
>> + }
>> +
>> + if (rm->last_tx_ret) {
>> + ret = rm->last_tx_ret;
>> + break;
>> + }
>> +
>> + hdr_template.type = FIELD_PREP(RM_RPC_TYPE_MASK,
>> RM_RPC_TYPE_CONTINUATION) |
>> + FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
>> + } while (buf_size_remaining);
>> +
>> +out:
>> + mutex_unlock(&rm->send_lock);
>> + return ret < 0 ? ret : 0;
>> +}
>> +
>> +/**
>> + * gh_rm_call: Achieve request-response type communication with RPC
>> + * @rm: Pointer to Gunyah resource manager internal data
>> + * @message_id: The RM RPC message-id
>> + * @req_buff: Request buffer that contains the payload
>> + * @req_buf_size: Total size of the payload
>> + * @resp_buf: Pointer to a response buffer
>> + * @resp_buf_size: Size of the response buffer
>> + *
>> + * Make a request to the RM-VM and wait for reply back. For a successful
>
> I think you could just say "to the RM and wait"...
>
> Overall I suggest using "RM" or "RM VM" consistently when you talk
> about the Resource Manager. This is the only place I see "RM-VM".
>
>> + * response, the function returns the payload. The size of the
>> payload is set in
>> + * resp_buf_size. The resp_buf should be freed by the caller when 0
>> is returned
>
> s/should/must/
>
>> + * and resp_buf_size != 0.
>> + *
>> + * req_buff should be not NULL for req_buf_size >0. If req_buf_size
>> == 0,
>> + * req_buff *can* be NULL and no additional payload is sent.
>
> I'd say use "buf" or "buff" but not both in your naming
> convention.
>
Not intentional -- will make it consistent.
>> + *
>> + * Context: Process context. Will sleep waiting for reply.
>> + * Return: 0 on success. <0 if error.
>> + */
>> +int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff,
>> size_t req_buf_size,
>> + void **resp_buf, size_t *resp_buf_size)
>
> I suspect you could define the request buffer as a pointer to const;
> can you?
>
I can!
>> +{
>> + struct gh_rm_connection *connection;
>> + u32 seq_id;
>> + int ret;
>> +
>> + /* message_id 0 is reserved. req_buf_size implies req_buf is not
>> NULL */
>> + if (!message_id || (!req_buff && req_buf_size) || !rm)
>
> If you're going to check for a null RM pointer, I'd check it first.
>
>> + return -EINVAL;
>> +
>> +
>> + connection = kzalloc(sizeof(*connection), GFP_KERNEL);
>> + if (!connection)
>> + return -ENOMEM;
>> +
>> + connection->type = RM_RPC_TYPE_REPLY;
>> + connection->msg_id = cpu_to_le32(message_id);
>> +
>> + init_completion(&connection->reply.seq_done);
>
> . . .
>
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> new file mode 100644
>> index 000000000000..deca9b3da541
>> --- /dev/null
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -0,0 +1,21 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _GUNYAH_RSC_MGR_H
>> +#define _GUNYAH_RSC_MGR_H
>> +
>> +#include <linux/list.h>
>> +#include <linux/notifier.h>
>> +#include <linux/gunyah.h>
>> +
>> +#define GH_VMID_INVAL U16_MAX
>
> Add a tab before U16_MAX; it will line up more nicely
> when you define GH_MEM_HANDLE_INVAL later.
>
>> +
>> +struct gh_rm;
>> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block
>> *nb);
>> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block
>> *nb);
>> +struct device *gh_rm_get(struct gh_rm *rm);
>> +void gh_rm_put(struct gh_rm *rm);
>> +
>> +#endif
>
On 3/31/2023 7:25 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>
> Several comments, no major issues here. -Alex
>
>> ---
>> drivers/virt/gunyah/Makefile | 2 +-
>> drivers/virt/gunyah/rsc_mgr_rpc.c | 260 ++++++++++++++++++++++++++++++
>> include/linux/gunyah_rsc_mgr.h | 73 +++++++++
>> 3 files changed, 334 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index cc864ff5abbb..de29769f2f3f 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>> obj-$(CONFIG_GUNYAH) += gunyah.o
>> -gunyah_rsc_mgr-y += rsc_mgr.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
>> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c
>> b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> new file mode 100644
>> index 000000000000..ffcb861a31b5
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> @@ -0,0 +1,260 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include "rsc_mgr.h"
>> +
>> +/* Message IDs: VM Management */
>> +#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
>> +#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
>> +#define GH_RM_RPC_VM_START 0x56000004
>> +#define GH_RM_RPC_VM_STOP 0x56000005
>> +#define GH_RM_RPC_VM_RESET 0x56000006
>> +#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
>> +#define GH_RM_RPC_VM_INIT 0x5600000B
>> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
>> +#define GH_RM_RPC_VM_GET_VMID 0x56000024
>> +
>> +struct gh_rm_vm_common_vmid_req {
>> + __le16 vmid;
>> + __le16 _padding;
>> +} __packed;
>> +
>> +/* Call: VM_ALLOC */
>> +struct gh_rm_vm_alloc_vmid_resp {
>> + __le16 vmid;
>> + __le16 _padding;
>> +} __packed;
>> +
>> +/* Call: VM_STOP */
>> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
>> +
>> +#define GH_RM_VM_STOP_REASON_FORCE_STOP 3
>> +
>> +struct gh_rm_vm_stop_req {
>> + __le16 vmid;
>> + u8 flags;
>> + u8 _padding;
>> + __le32 stop_reason;
>> +} __packed;
>> +
>> +/* Call: VM_CONFIG_IMAGE */
>> +struct gh_rm_vm_config_image_req {
>> + __le16 vmid;
>> + __le16 auth_mech;
>> + __le32 mem_handle;
>> + __le64 image_offset;
>> + __le64 image_size;
>> + __le64 dtb_offset;
>> + __le64 dtb_size;
>> +} __packed;
>> +
>> +/*
>> + * Several RM calls take only a VMID as a parameter and give only
>> standard
>> + * response back. Deduplicate boilerplate code by using this common
>> call.
>> + */
>> +static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id,
>> u16 vmid)
>> +{
>> + struct gh_rm_vm_common_vmid_req req_payload = {
>> + .vmid = cpu_to_le16(vmid),
>> + };
>> +
>> + return gh_rm_call(rm, message_id, &req_payload,
>> sizeof(req_payload), NULL, NULL);
>> +}
>> +
>> +/**
>> + * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM
>> identifier.
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: Use 0 to dynamically allocate a VM. A reserved VMID can be
>> supplied
>> + * to request allocation of a platform-defined VM.
>> + *
>> + * Returns - the allocated VMID or negative value on error
>> + */
>> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
>> +{
>> + struct gh_rm_vm_common_vmid_req req_payload = {
>> + .vmid = vmid,
>> + };
>> + struct gh_rm_vm_alloc_vmid_resp *resp_payload;
>> + size_t resp_size;
>> + void *resp;
>> + int ret;
>> +
>> + ret = gh_rm_call(rm, GH_RM_RPC_VM_ALLOC_VMID, &req_payload,
>> sizeof(req_payload), &resp,
>> + &resp_size);
>> + if (ret)
>> + return ret;
>> +
>> + if (!vmid) {
>> + resp_payload = resp;
>> + ret = le16_to_cpu(resp_payload->vmid);
>> + kfree(resp);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +/**
>> + * gh_rm_dealloc_vmid() - Dispose the VMID
>
> s/the/of a/
>
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
>> + */
>> +int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid)
>> +{
>> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_DEALLOC_VMID, vmid);
>> +}
>> +
>> +/**
>> + * gh_rm_vm_reset() - Reset the VM's resources
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
>> + *
>> + * While tearing down the VM, request RM to clean up all the VM
>> resources
>
> s/While/As part of/
>
>> + * associated with the VM. Only after this, Linux can clean up all the
>> + * references it maintains to resources.
>> + */
>> +int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid)
>> +{
>> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_RESET, vmid);
>> +}
>> +
>> +/**
>> + * gh_rm_vm_start() - Move the VM into "ready to run" state
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
>> + *
>> + * On VMs which use proxy scheduling, vcpu_run is needed to actually
>> run the VM.
>> + * On VMs which use Gunyah's scheduling, the vCPUs start executing in
>> accordance with Gunyah
>> + * scheduling policies.
>> + */
>> +int gh_rm_vm_start(struct gh_rm *rm, u16 vmid)
>> +{
>> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_START, vmid);
>> +}
>> +
>> +/**
>> + * gh_rm_vm_stop() - Send a request to Resource Manager VM to
>> forcibly stop a VM.
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
>> + */
>> +int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid)
>> +{
>> + struct gh_rm_vm_stop_req req_payload = {
>> + .vmid = cpu_to_le16(vmid),
>> + .flags = GH_RM_VM_STOP_FLAG_FORCE_STOP,
>> + .stop_reason = cpu_to_le32(GH_RM_VM_STOP_REASON_FORCE_STOP),
>> + };
>> +
>> + return gh_rm_call(rm, GH_RM_RPC_VM_STOP, &req_payload,
>> sizeof(req_payload), NULL, NULL);
>> +}
>> +
>> +/**
>> + * gh_rm_vm_configure() - Prepare a VM to start and provide the common
>> + * configuration needed by RM to configure a VM
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
>> + * @auth_mechanism: Authentication mechanism used by resource manager
>> to verify
>> + * the virtual machine
>> + * @mem_handle: Handle to a previously shared memparcel that contains
>> all parts
>> + * of the VM image subject to authentication.
>> + * @image_offset: Start address of VM image, relative to the start of
>> memparcel
>> + * @image_size: Size of the VM image
>> + * @dtb_offset: Start address of the devicetree binary with VM
>> configuration,
>> + * relative to start of memparcel.
>> + * @dtb_size: Maximum size of devicetree binary. Resource manager
>> applies
>> + * an overlay to the DTB and dtb_size should include room for
>> + * the overlay.
>
> The above comment about including extra room doesn't sit well.
> How much extra room is required? Is there any way you can
> provide an estimate? Or better yet, is it possible to have
> gh_rm_call() somehow calculate that extra amount and add it on?
>
The amount of extra room that's required is partially dependent on the
number of Gunyah resources the VM creates. In practice, the memory map
usually carves out a large amount of memory and the DT is much smaller
than that. Crosvm carves out 2 MiB; Qualcomm devices (bootloader) usually
carve out 2 MiB as well. When telling RM about the DT, you should tell
RM about the whole 2 MiB, not the size of the actual devicetree blob.
I realize now this documentation is more UAPI facing than for internal
kernel API. I'll move this documentation over there as well.
>> + */
>> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum
>> gh_rm_vm_auth_mechanism auth_mechanism,
>> + u32 mem_handle, u64 image_offset, u64 image_size, u64
>> dtb_offset, u64 dtb_size)
>
> From what I can tell, the auth argument (and generally, ghvm->auth)
> is never used. If that's the case, it might be nicer to explicitly
> not include it for now, and only add it when it's going to be used
> (and tested to work correctly).
>
> I don't know if this is a reasonable strategy, but I'm always a
> little skeptical about unused code like this.
>
I don't have any technical reasons to keep it and I could move the
hard-coded auth type here. I thought it would be best to keep the
assumption in VM manager and not in the RPC.
>> +{
>> + struct gh_rm_vm_config_image_req req_payload = {
>> + .vmid = cpu_to_le16(vmid),
>> + .auth_mech = cpu_to_le16(auth_mechanism),
>> + .mem_handle = cpu_to_le32(mem_handle),
>> + .image_offset = cpu_to_le64(image_offset),
>> + .image_size = cpu_to_le64(image_size),
>> + .dtb_offset = cpu_to_le64(dtb_offset),
>> + .dtb_size = cpu_to_le64(dtb_size),
>> + };
>> +
>
> Are there any sanity checks that could be performed before we
> actually make the call to the resource manager? Like, can
> you ensure the DTB offset and size are in range?
>
The "VM Manager" will perform those checks, as does Resource Manager. At
the RPC layer, we don't know the size of the memory parcel so we don't
have a range to reference.
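For reference, the kind of range check a layer that does know the parcel size could perform might look like the sketch below. range_in_parcel() is a hypothetical helper, not a function from this series, and it is written so that a huge offset or size cannot wrap around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when [offset, offset + size) lies entirely inside a
 * parcel of parcel_size bytes. Avoids computing offset + size, which
 * could wrap around for large u64 inputs.
 */
static bool range_in_parcel(uint64_t parcel_size, uint64_t offset,
			    uint64_t size)
{
	if (size == 0 || offset > parcel_size)
		return false;
	return size <= parcel_size - offset;
}

static void range_in_parcel_examples(void)
{
	/* Ends exactly at the end of a 2 MiB parcel. */
	assert(range_in_parcel(0x200000, 0x1f0000, 0x10000));
	/* One byte too long. */
	assert(!range_in_parcel(0x200000, 0x1f0000, 0x10001));
	/* Would wrap if offset + size were computed directly. */
	assert(!range_in_parcel(0x200000, UINT64_MAX, 2));
}
```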
>> + return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload,
>> sizeof(req_payload),
>> + NULL, NULL);
>> +}
>> +
>> +/**
>> + * gh_rm_vm_init() - Move the VM to initialized state.
>
> s/the/a/
>
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VM identifier
>> + *
>> + * RM will allocate needed resources for the VM.
>> + */
>> +int gh_rm_vm_init(struct gh_rm *rm, u16 vmid)
>> +{
>> + return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_INIT, vmid);
>> +}
>> +
>> +/**
>> + * gh_rm_get_hyp_resources() - Retrieve hypervisor resources
>> (capabilities) associated with a VM
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: VMID of the other VM to get the resources of
>> + * @resources: Set by gh_rm_get_hyp_resources and contains the
>> returned hypervisor resources.
>
> Caller must free the resources pointer returned if successful.
> (Please mention this.)
>
>> + */
>> +int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
>> + struct gh_rm_hyp_resources **resources)
>> +{
>> + struct gh_rm_vm_common_vmid_req req_payload = {
>> + .vmid = cpu_to_le16(vmid),
>> + };
>> + struct gh_rm_hyp_resources *resp;
>> + size_t resp_size;
>> + int ret;
>> +
>> + ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_HYP_RESOURCES,
>> + &req_payload, sizeof(req_payload),
>> + (void **)&resp, &resp_size);
>> + if (ret)
>> + return ret;
>> +
>> + if (!resp_size)
>> + return -EBADMSG;
>> +
>> + if (resp_size < struct_size(resp, entries, 0) ||
>> + resp_size != struct_size(resp, entries,
>> le32_to_cpu(resp->n_entries))) {
>> + kfree(resp);
>> + return -EBADMSG;
>> + }
>> +
>> + *resources = resp;
>> + return 0;
>> +}
>> +
>> +/**
>> + * gh_rm_get_vmid() - Retrieve VMID of this virtual machine
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: Filled with the VMID of this VM
>> + */
>> +int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid)
>> +{
>> + static u16 cached_vmid = GH_VMID_INVAL;
>> + size_t resp_size;
>> + __le32 *resp;
>> + int ret;
>> +
>> + if (cached_vmid != GH_VMID_INVAL) {
>> + *vmid = cached_vmid;
>> + return 0;
>> + }
>> +
>> + ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_VMID, NULL, 0, (void
>> **)&resp, &resp_size);
>> + if (ret)
>> + return ret;
>> +
>> + *vmid = cached_vmid = lower_16_bits(le32_to_cpu(*resp));
>> + kfree(resp);
>> +
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_get_vmid);
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> index deca9b3da541..6a2f434e67f7 100644
>> --- a/include/linux/gunyah_rsc_mgr.h
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -18,4 +18,77 @@ int gh_rm_notifier_unregister(struct gh_rm *rm,
>> struct notifier_block *nb);
>> struct device *gh_rm_get(struct gh_rm *rm);
>> void gh_rm_put(struct gh_rm *rm);
>> +struct gh_rm_vm_exited_payload {
>> + __le16 vmid;
>> + __le16 exit_type;
>> + __le32 exit_reason_size;
>> + u8 exit_reason[];
>> +} __packed;
>> +
>> +#define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
>
> I think all these notification reasons should be defined in
> an enumerated type, to group them, and name the group.
>
Gunyah doesn't enumerate these macros. Linux could create an enum for these
macros, but it's not reflected by the hypervisor. Keeping the fully
expanded macro also makes it easier to match up in the Gunyah source code.
>> +
>> +enum gh_rm_vm_status {
>> + GH_RM_VM_STATUS_NO_STATE = 0,
>> + GH_RM_VM_STATUS_INIT = 1,
>> + GH_RM_VM_STATUS_READY = 2,
>> + GH_RM_VM_STATUS_RUNNING = 3,
>> + GH_RM_VM_STATUS_PAUSED = 4,
>> + GH_RM_VM_STATUS_LOAD = 5,
>> + GH_RM_VM_STATUS_AUTH = 6,
>> + GH_RM_VM_STATUS_INIT_FAILED = 8,
>> + GH_RM_VM_STATUS_EXITED = 9,
>> + GH_RM_VM_STATUS_RESETTING = 10,
>> + GH_RM_VM_STATUS_RESET = 11,
>> +};
>> +
>> +struct gh_rm_vm_status_payload {
>> + __le16 vmid;
>> + u16 reserved;
>> + u8 vm_status;
>> + u8 os_status;
>> + __le16 app_status;
>> +} __packed;
>> +
>> +#define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
>> +
>> +/* RPC Calls */
>> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
>> +int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
>> +int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
>> +int gh_rm_vm_start(struct gh_rm *rm, u16 vmid);
>> +int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid);
>> +
>> +enum gh_rm_vm_auth_mechanism {
>> + GH_RM_VM_AUTH_NONE = 0,
>> + GH_RM_VM_AUTH_QCOM_PIL_ELF = 1,
>> + GH_RM_VM_AUTH_QCOM_ANDROID_PVM = 2,
>> +};
>> +
>> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum
>> gh_rm_vm_auth_mechanism auth_mechanism,
>> + u32 mem_handle, u64 image_offset, u64 image_size,
>> + u64 dtb_offset, u64 dtb_size);
>> +int gh_rm_vm_init(struct gh_rm *rm, u16 vmid);
>> +
>> +struct gh_rm_hyp_resource {
>> + u8 type;
>
> Maybe add a comment on the above field, and others, such as:
>
> u8 type; /* enum gh_resource_type */
>
>> + u8 reserved;
>> + __le16 partner_vmid;
>> + __le32 resource_handle;
>> + __le32 resource_label;
>> + __le64 cap_id;
>> + __le32 virq_handle;
>> + __le32 virq;
>> + __le64 base;
>> + __le64 size;
>> +} __packed;
>> +
>> +struct gh_rm_hyp_resources {
>> + __le32 n_entries;
>> + struct gh_rm_hyp_resource entries[];
>> +} __packed;
>> +
>> +int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
>> + struct gh_rm_hyp_resources **resources);
>> +int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>> +
>> #endif
>
On 3/24/2023 11:37 AM, Will Deacon wrote:
> On Fri, Mar 03, 2023 at 05:06:18PM -0800, Elliot Berman wrote:
>> When launching a virtual machine, Gunyah userspace allocates memory for
>> the guest and informs Gunyah about these memory regions through
>> SET_USER_MEMORY_REGION ioctl.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>> drivers/virt/gunyah/Makefile | 2 +-
>> drivers/virt/gunyah/vm_mgr.c | 44 ++++++
>> drivers/virt/gunyah/vm_mgr.h | 25 ++++
>> drivers/virt/gunyah/vm_mgr_mm.c | 229 ++++++++++++++++++++++++++++++++
>> include/uapi/linux/gunyah.h | 29 ++++
>> 5 files changed, 328 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
>
> [...]
>
>> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
>> +{
>> + struct gh_vm_mem *mapping, *tmp_mapping;
>> + struct gh_rm_mem_entry *mem_entries;
>> + phys_addr_t curr_page, prev_page;
>> + struct gh_rm_mem_parcel *parcel;
>> + int i, j, pinned, ret = 0;
>> + size_t entry_size;
>> + u16 vmid;
>> +
>> + if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
>> + !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
>> + return -EINVAL;
>> +
>> + if (region->guest_phys_addr + region->memory_size < region->guest_phys_addr)
>> + return -EOVERFLOW;
>> +
>> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
>> + if (ret)
>> + return ret;
>> +
>> + mapping = __gh_vm_mem_find_by_label(ghvm, region->label);
>> + if (mapping) {
>> + mutex_unlock(&ghvm->mm_lock);
>> + return -EEXIST;
>> + }
>> +
>> + mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
>> + if (!mapping) {
>> + mutex_unlock(&ghvm->mm_lock);
>> + return -ENOMEM;
>> + }
>> +
>> + mapping->parcel.label = region->label;
>> + mapping->guest_phys_addr = region->guest_phys_addr;
>> + mapping->npages = region->memory_size >> PAGE_SHIFT;
>> + parcel = &mapping->parcel;
>> + parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
>> + parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
>> +
>> + /* Check for overlap */
>> + list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
>> + if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
>> + tmp_mapping->guest_phys_addr) ||
>> + (mapping->guest_phys_addr >=
>> + tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
>> + ret = -EEXIST;
>> + goto free_mapping;
>> + }
>> + }
>> +
>> + list_add(&mapping->list, &ghvm->memory_mappings);
>> +
>> + mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
>> + if (!mapping->pages) {
>> + ret = -ENOMEM;
>> + mapping->npages = 0; /* update npages for reclaim */
>> + goto reclaim;
>> + }
>> +
>> + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
>> + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
>> + if (pinned < 0) {
>> + ret = pinned;
>> + mapping->npages = 0; /* update npages for reclaim */
>> + goto reclaim;
>> + } else if (pinned != mapping->npages) {
>> + ret = -EFAULT;
>> + mapping->npages = pinned; /* update npages for reclaim */
>> + goto reclaim;
>> + }
>
> I think Fuad mentioned this on an older version of these patches, but it
> looks like you're failing to account for the pinned memory here which is
> a security issue depending on who is able to issue the ioctl() calling
> into here.
>
> Specifically, I'm thinking that your kXalloc() calls should be using
> GFP_KERNEL_ACCOUNT in this function and also that you should be calling
> account_locked_vm() for the pages being pinned.
>
Added the accounting for the v12.
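As a rough userspace model of what accounting pinned pages means: charge the page count against a per-process locked-memory limit before pinning, and undo the charge on unpin or on a later failure. Everything below (struct mm_model and both helpers) is hypothetical. It only mimics the limit check that account_locked_vm() is understood to perform against RLIMIT_MEMLOCK and is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-mm locked-page counter. */
struct mm_model {
	uint64_t locked_pages;  /* pages already accounted as locked */
	uint64_t limit_pages;   /* locked-memory limit expressed in pages */
	bool can_bypass_limit;  /* models a privileged caller */
};

/* Charge npages before pinning; fail without side effects if over limit. */
static int model_account_pinned(struct mm_model *mm, uint64_t npages)
{
	if (!mm->can_bypass_limit &&
	    (npages > mm->limit_pages ||
	     mm->locked_pages > mm->limit_pages - npages))
		return -ENOMEM;

	mm->locked_pages += npages;
	return 0;
}

/* Undo the charge on unpin or on any later failure path. */
static void model_unaccount_pinned(struct mm_model *mm, uint64_t npages)
{
	assert(mm->locked_pages >= npages);
	mm->locked_pages -= npages;
}
```

The point of charging first is that an unprivileged caller cannot pin more memory than its limit allows, which is the concern raised above.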
> Finally, what happens if userspace passes in a file mapping?
Userspace will get EFAULT (-14) back when trying to launch the VM
(pin_user_pages_fast returns this as you might have been expecting). We
haven't yet had any need to support file-backed mappings.
Thanks,
Elliot
On 3/31/2023 7:25 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> @@ -129,6 +131,7 @@ struct gh_rm_connection {
>> * @cache: cache for allocating Tx messages
>> * @send_lock: synchronization to allow only one request to be sent
>> at a time
>> * @nh: notifier chain for clients interested in RM notification
>> messages
>> + * @miscdev: /dev/gunyah
>> */
>> struct gh_rm {
>> struct device *dev;
>> @@ -145,6 +148,8 @@ struct gh_rm {
>> struct kmem_cache *cache;
>> struct mutex send_lock;
>> struct blocking_notifier_head nh;
>> +
>> + struct miscdevice miscdev;
>> };
>> /**
>> @@ -593,6 +598,21 @@ void gh_rm_put(struct gh_rm *rm)
>> }
>> EXPORT_SYMBOL_GPL(gh_rm_put);
>
> I feel like /dev/gunyah code would more appropriately be found
> in "vm_mgr.c". All gh_dev_ioctl() does is call the function
> defined there, and it's therefore a VM-oriented rather than
> resource-oriented device.
I'd like to keep the gh_dev_ioctl where it is because it keeps the
struct gh_rm explicitly private to rsc_mgr.c, and I think this helps
keep the design cleaner long term by preventing new members from
sneaking into struct gh_rm.
>> +
>> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
>> +{
>> + struct gh_vm *ghvm;
>> + struct file *file;
>> + int fd, err;
>> +
>> + /* arg reserved for future use. */
>
> Do you have a clear idea of how this might be used in the future?
Not yet. I have some vague ideas to use it as an enumeration of "special"
VM types. We might have a special number for VMs which use "protected VM
firmware" for the Android boot flow, another number for the "Trusted UI
VM", another for "OEM VM", etc. Passing 0 would always be the
unauthenticated VM which we are creating today.
We're considering bumping the info to a separate ioctl since additional
info needs to be passed from userspace to configure the VM. Userspace
would do GH_CREATE_VM(). Another ioctl like GH_VM_SET_PVMFW_ADDRESS()
would imply that the VM uses the protected VM firmware for the Android
boot flow. Another ioctl call would be used to imply the "Trusted UI
VM". In any case, we're still in early design phase.
>
> I was thinking you could silently ignore the argument value, but
> I suppose if it *does* get used in the future, you want the caller
> to know it's being ignored. (Is that right?)
>
That's right.
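In other words, the handler rejects anything it does not understand yet. A minimal sketch of that pattern, assuming the check returns -EINVAL (check_reserved_arg() is a hypothetical name):

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical entry check for a request whose argument is reserved:
 * accept only 0, so a meaning can be assigned later without silently
 * changing behaviour for callers that passed garbage.
 */
static int check_reserved_arg(unsigned long arg)
{
	if (arg)
		return -EINVAL;
	return 0;
}

static void check_reserved_arg_examples(void)
{
	assert(check_reserved_arg(0) == 0);
	assert(check_reserved_arg(1) == -EINVAL);
}
```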
Thanks,
Elliot
On 3/31/2023 7:26 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> +
>> + mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries),
>> GFP_KERNEL);
>> + if (!mem_entries) {
>> + ret = -ENOMEM;
>> + goto reclaim;
>> + }
>> +
>> + /* reduce number of entries by combining contiguous pages into
>> single memory entry */
>
> Are you sure you need to do this? I.e., does pin_user_pages_fast()
> already take care of consolidating these pages?
>
pin_user_pages_fast wouldn't consolidate the page entries. There's a
speedup in sharing memory when pages are contiguous since less
information needs to be transmitted to Gunyah describing the memory.
>> + prev_page = page_to_phys(mapping->pages[0]);
>> + mem_entries[0].ipa_base = cpu_to_le64(prev_page);
>> + entry_size = PAGE_SIZE;
>> + for (i = 1, j = 0; i < mapping->npages; i++) {
>> + curr_page = page_to_phys(mapping->pages[i]);
>
> I think you can actually use the page frame numbers
> here instead of the addresses. If they are consecutive,
> they are contiguous. See pages_are_mergeable() for an
> example of that. Using PFNs might simplify this code.
>
It did, thanks for the suggestion!
>> + if (curr_page - prev_page == PAGE_SIZE) {
>> + entry_size += PAGE_SIZE;
>> + } else {
>> + mem_entries[j].size = cpu_to_le64(entry_size);
>> + j++;
>> + mem_entries[j].ipa_base = cpu_to_le64(curr_page);
>> + entry_size = PAGE_SIZE;
>> + }
>> +
>> + prev_page = curr_page;
>> + }
>> + mem_entries[j].size = cpu_to_le64(entry_size);
>
> It might be messier, but it seems like you could scan the pages to
> see how many you'll need (after combining), then allocate the array
> of mem entries based on that. That is, do that rather than allocating,
> filling, then duplicating and freeing.
>
> count = 1;
> curr_page = mapping->pages[0];
> for (i = 1; i < mapping->npages; i++) {
> next_page = mapping->pages[i];
> if (page_to_pfn(next_page) !=
> page_to_pfn(curr_page) + 1)
> count++;
> curr_page = next_page;
> }
> parcel->n_mem_entries = count;
> parcel->mem_entries = kcalloc(count, ...);
> /* Then fill them up */
>
> (Not tested, but you get the idea.)
>
It wasn't too messy IMO, I think this ended up simplifying the loop.
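For anyone following along, the count-then-fill approach can be sketched in plain userspace C as below. This is only an illustration under stated assumptions: pages are represented by an array of page frame numbers, struct mem_entry and both helpers are hypothetical, 4 KiB pages are assumed, and the little-endian conversion done by the real driver is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

struct mem_entry {
	uint64_t base; /* physical address of the first page in the run */
	uint64_t size; /* length of the run in bytes */
};

/* Count runs of consecutive page frame numbers. npages must be > 0. */
static size_t count_runs(const uint64_t *pfns, size_t npages)
{
	size_t i, count = 1;

	for (i = 1; i < npages; i++)
		if (pfns[i] != pfns[i - 1] + 1)
			count++;
	return count;
}

/*
 * Allocate exactly one entry per run and fill it in.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct mem_entry *build_entries(const uint64_t *pfns, size_t npages,
				       size_t *n_entries)
{
	struct mem_entry *entries;
	size_t i, j = 0;

	assert(npages > 0);
	*n_entries = count_runs(pfns, npages);
	entries = calloc(*n_entries, sizeof(*entries));
	if (!entries)
		return NULL;

	entries[0].base = pfns[0] << SKETCH_PAGE_SHIFT;
	entries[0].size = 1ULL << SKETCH_PAGE_SHIFT;
	for (i = 1; i < npages; i++) {
		if (pfns[i] == pfns[i - 1] + 1) {
			/* Contiguous with the previous page: grow the run. */
			entries[j].size += 1ULL << SKETCH_PAGE_SHIFT;
		} else {
			/* Gap: start a new entry. */
			j++;
			entries[j].base = pfns[i] << SKETCH_PAGE_SHIFT;
			entries[j].size = 1ULL << SKETCH_PAGE_SHIFT;
		}
	}
	return entries;
}
```

Scanning twice costs an extra pass over the page array, but it avoids the temporary worst-case allocation and the later duplicate-and-free step.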
>> +
>> + parcel->n_mem_entries = j + 1;
>> + parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) *
>> parcel->n_mem_entries,
>> + GFP_KERNEL);
>> + kfree(mem_entries);
>> + if (!parcel->mem_entries) {
>> + ret = -ENOMEM;
>> + goto reclaim;
>> + }
>> +
>> + mutex_unlock(&ghvm->mm_lock);
>> + return 0;
>> +reclaim:
>> + gh_vm_mem_reclaim(ghvm, mapping);
>> +free_mapping:
>> + kfree(mapping);
>> + mutex_unlock(&ghvm->mm_lock);
>> + return ret;
>> +}
>> +
>> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
>> +{
>> + struct gh_vm_mem *mapping;
>> + int ret;
>> +
>> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
>> + if (ret)
>> + return ret;
>> +
>> + mapping = __gh_vm_mem_find_by_label(ghvm, label);
>> + if (!mapping)
>> + goto out;
>> +
>> + gh_vm_mem_reclaim(ghvm, mapping);
>> + kfree(mapping);
>> +out:
>> + mutex_unlock(&ghvm->mm_lock);
>> + return ret;
>> +}
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> index 10ba32d2b0a6..a19207e3e065 100644
>> --- a/include/uapi/linux/gunyah.h
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -20,4 +20,33 @@
>> */
>> #define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a
>> Gunyah VM fd */
>> +/*
>> + * ioctls for VM fds
>> + */
>> +
>
> I think you should define the following three values in an enum.
>
>> +#define GH_MEM_ALLOW_READ (1UL << 0)
>> +#define GH_MEM_ALLOW_WRITE (1UL << 1)
>> +#define GH_MEM_ALLOW_EXEC (1UL << 2)
>> +
>> +/**
>> + * struct gh_userspace_memory_region - Userspace memory description
>> for GH_VM_SET_USER_MEM_REGION
>> + * @label: Unique identifier to the region.
>
> Unique with respect to what? I think it's unique among memory
> regions defined within a VM. And I think it's arbitrary and
> defined by the caller (right?).
>
>> + * @flags: Flags for memory parcel behavior
>> + * @guest_phys_addr: Location of the memory region in guest's memory
>> space (page-aligned)
>> + * @memory_size: Size of the region (page-aligned)
>> + * @userspace_addr: Location of the memory region in caller
>> (userspace)'s memory
>> + *
>> + * See Documentation/virt/gunyah/vm-manager.rst for further details.
>> + */
>> +struct gh_userspace_memory_region {
>> + __u32 label;
>> + __u32 flags;
>
> Add a comment to indicate what types of values "flags" can have.
> Maybe "flags" should be called "perms" or something?
>
Added documentation for the valid values of flags. I'm anticipating
needing to add other flag values beyond permission bits.
>> + __u64 guest_phys_addr;
>> + __u64 memory_size;
>> + __u64 userspace_addr;
>
> Why isn't userspace_addr just a (void *)? That would be a more natural
> thing to pass to the kernel. Is it to avoid 32-bit/64-bit pointer
> differences in the API?
>
Yes, to avoid 32-bit/64-bit pointer differences in API.
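A short sketch of why the fixed-width field helps, assuming a local struct that only mirrors the field layout quoted above (struct region_sketch and ptr_to_u64() are hypothetical names): with no pointer-typed member, the struct has the same size for 32-bit and 64-bit callers, and userspace widens its pointer explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the quoted field layout; not the real UAPI header. */
struct region_sketch {
	uint32_t label;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

/* Same size for 32-bit and 64-bit callers because no field is a pointer. */
_Static_assert(sizeof(struct region_sketch) == 32,
	       "layout must not depend on pointer width");

/* Widen a pointer through uintptr_t so the conversion is well defined. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

static void ptr_to_u64_examples(void)
{
	int x = 0;
	struct region_sketch r = { .userspace_addr = ptr_to_u64(&x) };

	/* Round-trips back to the original pointer. */
	assert((int *)(uintptr_t)r.userspace_addr == &x);
}
```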
>> +};
>> +
>> +#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
>> + struct gh_userspace_memory_region)
>> +
>
> I think it's nicer to group the definitions of these IOCTL values.
> Then in the struct definitions that follow, you can add comment that
> indicates which IOCTL the struct is used for.
>
>> #endif
>
On 3/21/2023 7:24 AM, Srinivas Kandagatla wrote:
> On 04/03/2023 01:06, Elliot Berman wrote:
>> +
>> +#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
> A comment here that this is going to *ONLY* start an un-authenticated VM
> would be useful to the users.
>
There is only support for unauthenticated VM in the UAPI being proposed
and I'd like to re-use GH_VM_START ioctl for other VM types as well. Is
the comment really useful? I can easily see us forgetting to remove the
comment, which would then be more confusing once the other VM types get added.
Thanks,
Elliot
On 4/11/23 4:07 PM, Elliot Berman wrote:
>
>
> On 3/21/2023 7:24 AM, Srinivas Kandagatla wrote:
>
>> On 04/03/2023 01:06, Elliot Berman wrote:
>>> +
>>> +#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>> A comment here that this is going to *ONLY* start an un-authenticated
>> VM would be useful to the users.
>>
>
> There is only support for unauthenticated VM in the UAPI being proposed
> and I'd like to re-use GH_VM_START ioctl for other VM types as well. Is
> the comment really useful? I can easily see forgetting to remove the
> comment and then being more confusing once the other VM types get added.
It's up to you. And in general, I think your responses to my
comments have been fine--even when you just explain why you
don't plan to implement my suggestion. Thank you.
-Alex
>
> Thanks,
> Elliot
On 3/31/2023 7:26 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> index 88a429dad09e..8b0b46f28e39 100644
>> --- a/include/linux/gunyah_rsc_mgr.h
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -29,6 +29,12 @@ struct gh_rm_vm_exited_payload {
>> #define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
>> enum gh_rm_vm_status {
>> + /**
>> + * RM doesn't have a state where load partially failed because
>> + * only Linux
>
> I have no idea what the comment above means... Please fix.
>
> Several of the values below are never explicitly assigned,
> and some are used but not assigned. The others apparently
> might come back from the resource manager? Why, for
> example, are the PAUSED, AUTH, and RESETTING statuses
> defined if we don't use them?
>
I ended up no longer needing VM_STATUS_LOAD_FAILED.
The other status values are defined by Gunyah resource manager. RM will
notify us about the state transitions.
Some of the state transitions can be inferred by Linux directly. For
instance, gh_rm_vm_init() will transition the VM from
GH_RM_VM_STATUS_INIT to GH_RM_VM_STATUS_READY iff it returns
successfully. There is one instance where we wait for VM to exit during
the VM teardown as well.
Thanks,
Elliot
>> + */
>> + GH_RM_VM_STATUS_LOAD_FAILED = -1,
>> +
>> GH_RM_VM_STATUS_NO_STATE = 0,
>> GH_RM_VM_STATUS_INIT = 1,
>> GH_RM_VM_STATUS_READY = 2,
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> index a19207e3e065..d6abd8605a2e 100644
>> --- a/include/uapi/linux/gunyah.h
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -49,4 +49,17 @@ struct gh_userspace_memory_region {
>> #define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
>> struct gh_userspace_memory_region)
>> +/**
>> + * struct gh_vm_dtb_config - Set the location of the VM's devicetree
>> blob
>> + * @guest_phys_addr: Address of the VM's devicetree in guest memory.
>> + * @size: Maximum size of the devicetree.
>> + */
>> +struct gh_vm_dtb_config {
>> + __u64 guest_phys_addr;
>> + __u64 size;
>> +};
>> +#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct
>> gh_vm_dtb_config)
>> +
>> +#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>> +
>> #endif
>
On Tue, Apr 11, 2023 at 01:34:34PM -0700, Elliot Berman wrote:
> On 3/24/2023 11:37 AM, Will Deacon wrote:
> > On Fri, Mar 03, 2023 at 05:06:18PM -0800, Elliot Berman wrote:
> > > +
> > > + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
> > > + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
> > > + if (pinned < 0) {
> > > + ret = pinned;
> > > + mapping->npages = 0; /* update npages for reclaim */
> > > + goto reclaim;
> > > + } else if (pinned != mapping->npages) {
> > > + ret = -EFAULT;
> > > + mapping->npages = pinned; /* update npages for reclaim */
> > > + goto reclaim;
> > > + }
> >
> > I think Fuad mentioned this on an older version of these patches, but it
> > looks like you're failing to account for the pinned memory here which is
> > a security issue depending on who is able to issue the ioctl() calling
> > into here.
> >
> > Specifically, I'm thinking that your kXalloc() calls should be using
> > GFP_KERNEL_ACCOUNT in this function and also that you should be calling
> > account_locked_vm() for the pages being pinned.
> >
>
> Added the accounting for the v12.
>
> > Finally, what happens if userspace passes in a file mapping?
>
> Userspace will get EFAULT (-14) back when trying to launch the VM
> (pin_user_pages_fast returns this as you might have been expecting). We
> haven't yet had any need to support file-backed mappings.
Hmm, no, that's actually surprising to me. I'd have thought GUP would
happily pin page-cache pages for file mappings, so I'm intrigued as to
which FOLL_ flag is causing you to get an error code back. Can you
enlighten me on where the failure originates, please?
Will
On 4/11/2023 2:19 PM, Will Deacon wrote:
> On Tue, Apr 11, 2023 at 01:34:34PM -0700, Elliot Berman wrote:
>> On 3/24/2023 11:37 AM, Will Deacon wrote:
>>> On Fri, Mar 03, 2023 at 05:06:18PM -0800, Elliot Berman wrote:
>>>> +
>>>> + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
>>>> + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
>>>> + if (pinned < 0) {
>>>> + ret = pinned;
>>>> + mapping->npages = 0; /* update npages for reclaim */
>>>> + goto reclaim;
>>>> + } else if (pinned != mapping->npages) {
>>>> + ret = -EFAULT;
>>>> + mapping->npages = pinned; /* update npages for reclaim */
>>>> + goto reclaim;
>>>> + }
>>>
>>> I think Fuad mentioned this on an older version of these patches, but it
>>> looks like you're failing to account for the pinned memory here which is
>>> a security issue depending on who is able to issue the ioctl() calling
>>> into here.
>>>
>>> Specifically, I'm thinking that your kXalloc() calls should be using
>>> GFP_KERNEL_ACCOUNT in this function and also that you should be calling
>>> account_locked_vm() for the pages being pinned.
>>>
>>
>> Added the accounting for the v12.
>>
>>> Finally, what happens if userspace passes in a file mapping?
>>
>> Userspace will get EFAULT (-14) back when trying to launch the VM
>> (pin_user_pages_fast returns this as you might have been expecting). We
>> haven't yet had any need to support file-backed mappings.
>
> Hmm, no, that's actually surprising to me. I'd have thought GUP would
> happily pin page-cache pages for file mappings, so I'm intrigued as to
> which FOLL_ flag is causing you to get an error code back. Can you
> enlighten me on where the failure originates, please?
Ah this ended up being an error on my part. Userspace was opening the
file as RO and Gunyah driver will unconditionally add FOLL_WRITE as part
of the gup flags. I got the flags aligned and seemed to be able to boot
the VM ok and it works as expected.
Thanks,
Elliot
On Wed, Apr 12, 2023 at 01:48:07PM -0700, Elliot Berman wrote:
>
>
> On 4/11/2023 2:19 PM, Will Deacon wrote:
> > On Tue, Apr 11, 2023 at 01:34:34PM -0700, Elliot Berman wrote:
> > > On 3/24/2023 11:37 AM, Will Deacon wrote:
> > > > On Fri, Mar 03, 2023 at 05:06:18PM -0800, Elliot Berman wrote:
> > > > > +
> > > > > + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
> > > > > + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
> > > > > + if (pinned < 0) {
> > > > > + ret = pinned;
> > > > > + mapping->npages = 0; /* update npages for reclaim */
> > > > > + goto reclaim;
> > > > > + } else if (pinned != mapping->npages) {
> > > > > + ret = -EFAULT;
> > > > > + mapping->npages = pinned; /* update npages for reclaim */
> > > > > + goto reclaim;
> > > > > + }
> > > >
> > > > I think Fuad mentioned this on an older version of these patches, but it
> > > > looks like you're failing to account for the pinned memory here which is
> > > > a security issue depending on who is able to issue the ioctl() calling
> > > > into here.
> > > >
> > > > Specifically, I'm thinking that your kXalloc() calls should be using
> > > > GFP_KERNEL_ACCOUNT in this function and also that you should be calling
> > > > account_locked_vm() for the pages being pinned.
> > > >
> > >
> > > Added the accounting for the v12.
> > >
> > > > Finally, what happens if userspace passes in a file mapping?
> > >
> > > Userspace will get EFAULT (-14) back when trying to launch the VM
> > > (pin_user_pages_fast returns this as you might have been expecting). We
> > > haven't yet had any need to support file-backed mappings.
> >
> > Hmm, no, that's actually surprising to me. I'd have thought GUP would
> > happily pin page-cache pages for file mappings, so I'm intrigued as to
> > which FOLL_ flag is causing you to get an error code back. Can you
> > enlighten me on where the failure originates, please?
>
> Ah this ended up being an error on my part. Userspace was opening the file
> as RO and Gunyah driver will unconditionally add FOLL_WRITE as part of the
> gup flags. I got the flags aligned and seemed to be able to boot the VM ok
> and it works as expected.
I suspect you can run into latent filesystem corruption issues in this case,
as the VM can dirty pages without the filesystem knowing. That's why we
restricted to anonymous memory with pKVM for now.
Will
On 3/31/2023 7:27 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
[snip]
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> index caeb3b3a3e9a..e52265fa5715 100644
>> --- a/include/uapi/linux/gunyah.h
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
>> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>> +/**
>> + * GH_FN_VCPU - create a vCPU instance to control a vCPU
>> + *
>> + * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
>> + *
>> + * The vcpu type will register with the VM Manager to expect to control
>> + * vCPU number `vcpu_id`. It returns a file descriptor allowing
>> interaction with
>> + * the vCPU. See the Gunyah vCPU API description sections for
>> interacting with
>> + * the Gunyah vCPU file descriptors.
>> + *
>> + * Return: file descriptor to manipulate the vcpu. See GH_VCPU_* ioctls
>> + */
>> +#define GH_FN_VCPU 1
>
> I think you should define GH_FN_VCPU, GH_FN_IRQFD, and GH_FN_IOEVENTFD
> in an enumerated type. Each has a type associated with it, and you can
> add the explanation for the function in the kernel-doc comments above
> those type definitions.
>
I'd like to enumify the GH_FN_* macros, but one challenge I'm facing is
that it breaks the module alias implementation in patch 19.
MODULE_ALIAS("ghfunc:"__stringify(_type))
When the GH_FN_* are regular preprocessor macros backed by an integer,
the preprocessor will make the module alias ghfunc:0 (or ghfunc:1, etc).
This works well because I can do
request_module("ghfunc:%d", type);
if the function hasn't been registered, and then gunyah_vcpu.ko gets
loaded automatically.
With an enum, only the compiler knows the value of GH_FN_VCPU, so the
preprocessor will
make the module alias like ghfunc:GH_FN_VCPU.
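The difference is easy to reproduce outside the kernel. The sketch below uses a two-level stringify macro in the same spirit as the kernel's __stringify(); all SKETCH_* and FN_VCPU_* names are hypothetical stand-ins for the real macros.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringify: the outer macro expands x before # is applied. */
#define SKETCH_STR_1(x) #x
#define SKETCH_STR(x) SKETCH_STR_1(x)

/* Object-like macro: the preprocessor expands it before stringifying. */
#define FN_VCPU_MACRO 1

/* Enumerator: only the compiler knows its value, so the name survives. */
enum fn_sketch { FN_VCPU_ENUM = 1 };

static const char *alias_from_macro(void)
{
	return "ghfunc:" SKETCH_STR(FN_VCPU_MACRO);
}

static const char *alias_from_enum(void)
{
	return "ghfunc:" SKETCH_STR(FN_VCPU_ENUM);
}

static void alias_examples(void)
{
	assert(strcmp(alias_from_macro(), "ghfunc:1") == 0);
	assert(strcmp(alias_from_enum(), "ghfunc:FN_VCPU_ENUM") == 0);
}
```

Only the first alias string would match what request_module("ghfunc:%d", type) asks for.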
[snip]
>> +
>> +/*
>> + * Gunyah presently sends max 4 bytes of exit_reason.
>> + * If that changes, this macro can be safely increased without breaking
>> + * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
>
> Is PAGE_SIZE allowed to be anything other than 4096 bytes? Do you
> expect this driver to work properly if the page size were configured
> to be 16384 bytes? In other words, is this a Gunyah constant, or
> is it *really* the page size configured for Linux?
>
Our implementations only use 4096-byte pages. I expect the driver to
work properly when using 16k pages. This really is a Linux page. It's a
reflection of the alloc_page in gh_vcpu_bind().
The exit reason is copied from hypervisor into field accessible by
userspace directly. Gunyah makes the exit reason size dynamic -- there's
no architectural limitation preventing the exit reason from being a
string or some lengthy data.
As I was writing this response, I realized that I should be able to make
this a zero-length array and ensure that reason[] doesn't overflow
PAGE_SIZE...
The comment was trying to explain that Linux itself imposes a limitation
on the maximum exit reason size. If we need to support longer exit
reason, we're OK to do so long as the total size doesn't overrun
PAGE_SIZE. There aren't any plans to need longer exit reasons than the 8
bytes mentioned today.
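A sketch of the zero-length (flexible) array idea, assuming a made-up layout for the shared page: struct vcpu_run_sketch and both helpers are hypothetical and do not reflect the real struct gh_vcpu_run. The page size is passed in, so the same bound works for 4 KiB and 16 KiB pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared-page layout; field names are illustrative only. */
struct vcpu_run_sketch {
	uint32_t exit_type;
	uint32_t reason_size;
	uint8_t reason[]; /* flexible array uses the rest of the page */
};

/* Largest reason[] payload that still fits in one page of page_size bytes. */
static size_t max_reason_size(size_t page_size)
{
	assert(page_size > offsetof(struct vcpu_run_sketch, reason));
	return page_size - offsetof(struct vcpu_run_sketch, reason);
}

/* Clamp a reported length so a copy never runs off the page. */
static size_t clamp_reason_size(size_t reported, size_t page_size)
{
	size_t max = max_reason_size(page_size);

	return reported < max ? reported : max;
}
```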
Thanks,
Elliot
On 3/31/2023 7:27 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
[snip]
>> +
>> +static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode,
>> int sync, void *key)
>> +{
>> + struct gh_irqfd *irqfd = container_of(wait, struct gh_irqfd, wait);
>> + __poll_t flags = key_to_poll(key);
>> + u64 enable_mask = GH_BELL_NONBLOCK;
>> + u64 old_flags;
>> + int ret = 0;
>> +
>> + if (flags & EPOLLIN) {
>> + if (irqfd->ghrsc) {
>> + ret = gh_hypercall_bell_send(irqfd->ghrsc->capid,
>> enable_mask, &old_flags);
>
> I commented elsewhere that you might support passing a null
> pointer as the last argument above (since you don't use the
> result).
>
>> + if (ret)
>> + pr_err_ratelimited("Failed to inject interrupt %d:
>> %d\n",
>> + irqfd->ticket.label, ret);
>> + } else
>> + pr_err_ratelimited("Premature injection of interrupt\n");
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static void irqfd_ptable_queue_proc(struct file *file,
>> wait_queue_head_t *wqh, poll_table *pt)
>> +{
>> + struct gh_irqfd *irq_ctx = container_of(pt, struct gh_irqfd, pt);
>> +
>> + add_wait_queue(wqh, &irq_ctx->wait);
>> +}
>> +
>> +static int gh_irqfd_populate(struct gh_vm_resource_ticket *ticket,
>> struct gh_resource *ghrsc)
>> +{
>> + struct gh_irqfd *irqfd = container_of(ticket, struct gh_irqfd,
>> ticket);
>> + u64 enable_mask = GH_BELL_NONBLOCK;
>> + u64 ack_mask = ~0;
>
> Why is the ACK mask ~0?
>
> I guess I don't know details about this hypercall (do you document
> them somewhere?), so it's hard to judge whether or why this is the
> right thing to use. The enable_mask is just GH_BELL_NONBLOCK,
> which is just BIT(32).
>
I talked to our hypervisor folks and they mentioned we can simplify
this. In v12, enable_mask and ack_mask can just be "1" (BIT(0)). We had
chosen bit 32 arbitrarily.
[snip]
>
>> + }
>> +
>> + irqfd->ghrsc = ghrsc;
>> + if (irqfd->level) {
>
> I think I don't understand this part of the code well
> enough to know this. What happens if level is false?
>
If level is false, then the guest is assumed to set up the IRQ on its
side as edge-triggered. In that case, we don't need to configure the
enable mask/ack mask because the doorbell flags aren't polled.
[snip]
>> +/**
>> + * struct gh_fn_irqfd_arg - Arguments to create an irqfd function
>> + * @fd: an eventfd which when written to will raise a doorbell
>> + * @label: Label of the doorbell created on the guest VM
>> + * @flags: GH_IRQFD_LEVEL configures the corresponding doorbell to
>> behave
>> + * like a level triggered interrupt.
>> + * @padding: padding bytes
>> + */
>> +struct gh_fn_irqfd_arg {
>> + __u32 fd;
>
> Should the "fd" field be signed? Should it be an int? (Perhaps
> you're trying to define a fixed kernel API, so __s32 if signed would
> be better.)
>
It looked to me like some interfaces use __u32 and some use __s32. Is
one technically correct?
On 3/31/2023 7:27 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> Some VM functions need to acquire Gunyah resources. For instance, Gunyah
>> vCPUs are exposed to the host as a resource. The Gunyah vCPU function
>> will register a resource ticket and be able to interact with the
>> hypervisor once the resource ticket is filled.
>>
>> Resource tickets are the mechanism for functions to acquire ownership of
>> Gunyah resources. Gunyah functions can be created before the VM's
>> resources are created and made available to Linux. A resource ticket
>> identifies a type of resource and a label of a resource which the ticket
>> holder is interested in.
>>
>> Resources are created by Gunyah as configured in the VM's devicetree
>> configuration. Gunyah doesn't process the label and that makes it
>> possible for userspace to create multiple resources with the same label.
>> Resource ticket owners need to be prepared for populate to be called
>> multiple times if userspace created multiple resources with the same
>> label.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>
> One possibly substantive suggestion here, plus a couple suggestions
> to add or revise comments.
>
> -Alex
>
>> ---
>> drivers/virt/gunyah/vm_mgr.c | 112 +++++++++++++++++++++++++++++++++-
>> drivers/virt/gunyah/vm_mgr.h | 4 ++
>> include/linux/gunyah_vm_mgr.h | 14 +++++
>> 3 files changed, 129 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> index 88db011395ec..0269bcdaf692 100644
>> --- a/drivers/virt/gunyah/vm_mgr.c
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -165,6 +165,74 @@ static long gh_vm_rm_function(struct gh_vm *ghvm,
>> struct gh_fn_desc *f)
>> return r;
>> }
>> +int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct
>> gh_vm_resource_ticket *ticket)
>> +{
>> + struct gh_vm_resource_ticket *iter;
>> + struct gh_resource *ghrsc;
>> + int ret = 0;
>> +
>> + mutex_lock(&ghvm->resources_lock);
>> + list_for_each_entry(iter, &ghvm->resource_tickets, list) {
>> + if (iter->resource_type == ticket->resource_type &&
>> iter->label == ticket->label) {
>> + ret = -EEXIST;
>> + goto out;
>> + }
>> + }
>> +
>> + if (!try_module_get(ticket->owner)) {
>> + ret = -ENODEV;
>> + goto out;
>> + }
>> +
>> + list_add(&ticket->list, &ghvm->resource_tickets);
>> + INIT_LIST_HEAD(&ticket->resources);
>> +
>> + list_for_each_entry(ghrsc, &ghvm->resources, list) {
>> + if (ghrsc->type == ticket->resource_type && ghrsc->rm_label
>> == ticket->label) {
>> + if (!ticket->populate(ticket, ghrsc))
>> + list_move(&ghrsc->list, &ticket->resources);
>> + }
>> + }
>> +out:
>> + mutex_unlock(&ghvm->resources_lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_add_resource_ticket);
>> +
>> +void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct
>> gh_vm_resource_ticket *ticket)
>> +{
>> + struct gh_resource *ghrsc, *iter;
>> +
>> + mutex_lock(&ghvm->resources_lock);
>> + list_for_each_entry_safe(ghrsc, iter, &ticket->resources, list) {
>> + ticket->unpopulate(ticket, ghrsc);
>> + list_move(&ghrsc->list, &ghvm->resources);
>> + }
>> +
>> + module_put(ticket->owner);
>> + list_del(&ticket->list);
>> + mutex_unlock(&ghvm->resources_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_remove_resource_ticket);
>> +
>> +static void gh_vm_add_resource(struct gh_vm *ghvm, struct gh_resource
>> *ghrsc)
>> +{
>> + struct gh_vm_resource_ticket *ticket;
>> +
>> + mutex_lock(&ghvm->resources_lock);
>> + list_for_each_entry(ticket, &ghvm->resource_tickets, list) {
>> + if (ghrsc->type == ticket->resource_type && ghrsc->rm_label
>> == ticket->label) {
>> + if (!ticket->populate(ticket, ghrsc)) {
>> + list_add(&ghrsc->list, &ticket->resources);
>> + goto found;
>> + }
>
> I think the "goto found" belongs here, unconditionally.
> You disallow adding more than one ticket of a given type
> with the same label. So you will never match another
> ticket once you've matched this one.
>
> The populate function generally shouldn't fail. I think
> it only fails if you find a duplicate, and again, I think
> you prevent that from happening. (But if it does, you
> silently ignore it...)
>
I agree with this suggestion; there's no need to waste time checking
other tickets once we find the match. I'll move the "goto found" line.
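
For illustration, here is a minimal user-space sketch of the lookup with
the unconditional early exit. It is not the driver code: the struct and
function names are hypothetical, an array stands in for the kernel list,
and the populate callback is modeled by a stored return value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gh_vm_resource_ticket. */
struct sketch_ticket {
	uint32_t type;
	uint32_t label;
	int populate_ret;	/* what the modeled populate() callback returns */
	int populate_calls;	/* how many times populate() was invoked */
	bool holds_resource;	/* set when populate() succeeded */
};

/*
 * Offer a new resource (type, label) to the registered tickets.
 * Returns the ticket that matched, or NULL if none did.
 */
static struct sketch_ticket *sketch_add_resource(struct sketch_ticket *tickets,
						 size_t count, uint32_t type,
						 uint32_t label)
{
	for (size_t i = 0; i < count; i++) {
		struct sketch_ticket *t = &tickets[i];

		if (t->type != type || t->label != label)
			continue;

		t->populate_calls++;
		if (!t->populate_ret)
			t->holds_resource = true;

		/*
		 * (type, label) is unique among tickets, so no later ticket
		 * can match: stop here whether or not populate() succeeded.
		 */
		return t;
	}
	return NULL;
}
```

The early return relies on gh_vm_add_resource_ticket() rejecting a second
ticket with the same type and label (-EEXIST).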
[snip]
Thanks,
Elliot
On 3/31/2023 7:26 AM, Alex Elder wrote:
> On 3/3/23 7:06 PM, Elliot Berman wrote:
>> When booting a Gunyah virtual machine, the host VM may gain capabilities
>> to interact with resources for the guest virtual machine. Examples of
>> such resources are vCPUs or message queues. To use those resources, we
>> need to translate the RM response into a gunyah_resource structure which
>> are useful to Linux drivers. Presently, Linux drivers need only to know
>> the type of resource, the capability ID, and an interrupt.
>>
>> On ARM64 systems, the interrupt reported by Gunyah is the GIC interrupt
>> ID number and always a SPI.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>
> Several comments here, nothing major. -Alex
>
>> ---
>> arch/arm64/include/asm/gunyah.h | 23 +++++
>> drivers/virt/gunyah/rsc_mgr.c | 163 +++++++++++++++++++++++++++++++-
>> include/linux/gunyah.h | 4 +
>> include/linux/gunyah_rsc_mgr.h | 3 +
>> 4 files changed, 192 insertions(+), 1 deletion(-)
>> create mode 100644 arch/arm64/include/asm/gunyah.h
>>
>> diff --git a/arch/arm64/include/asm/gunyah.h
>> b/arch/arm64/include/asm/gunyah.h
>> new file mode 100644
>> index 000000000000..64cfb964efee
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/gunyah.h
>> @@ -0,0 +1,23 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights
>> reserved.
>> + */
>> +#ifndef __ASM_GUNYAH_H_
>> +#define __ASM_GUNYAH_H_
>
> Maybe just one _ at the beginning and none at the end?
> Follow the same convention across all your header files.
> (Maybe you're looking at other files in the same directory
> as this one, but that's not consistent.)
>
>> +
>> +#include <linux/irq.h>
>> +#include <dt-bindings/interrupt-controller/arm-gic.h>
>> +
>> +static inline int arch_gh_fill_irq_fwspec_params(u32 virq, struct
>> irq_fwspec *fwspec)
>> +{
>> + if (virq < 32 || virq > 1019)
>> + return -EINVAL;
>
> What is special about VIRQs greater than 1019 (minus 32)?
>
> It's probably documented somewhere but it's worth adding a
> comment here to explain the check.
>
> You would know better than I, but could/should the caller
> be responsible for this check? (Not a big deal.)
>
I think the caller definitely should not be responsible for this check.
On arm systems, the IRQ # that is returned is the hwirq # for the GIC.
Presently, Gunyah only gives us SPI interrupts [32,1019], so I've written
a translation to an SPI fwspec.
>> +
>> + fwspec->param_count = 3;
>> + fwspec->param[0] = GIC_SPI;
>> + fwspec->param[1] = virq - 32;
>
> And why is 32 subtracted?
>
The GIC driver expects the SPI #, not the hwirq #.
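
As a standalone illustration of the range check and the subtraction, here
is a user-space sketch. The struct and the SKETCH_* names are stand-ins
invented for this example; the two constants are assumed to mirror GIC_SPI
and IRQ_TYPE_EDGE_RISING. The range reflects the GIC interrupt ID layout:
0-15 are SGIs, 16-31 are PPIs, 32-1019 are SPIs, and 1020-1023 are
reserved for special purposes.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for kernel definitions, for illustration only. */
#define SKETCH_GIC_SPI			0	/* assumed to match GIC_SPI */
#define SKETCH_IRQ_TYPE_EDGE_RISING	1	/* assumed to match IRQ_TYPE_EDGE_RISING */

#define SKETCH_SPI_HWIRQ_FIRST	32u	/* first GIC interrupt ID that is an SPI */
#define SKETCH_SPI_HWIRQ_LAST	1019u	/* last GIC interrupt ID that is an SPI */

/* Hypothetical, trimmed-down stand-in for struct irq_fwspec. */
struct sketch_fwspec {
	int param_count;
	uint32_t param[3];
};

/* Translate a GIC hwirq number into a three-cell SPI specifier. */
static int sketch_fill_spi_fwspec(uint32_t hwirq, struct sketch_fwspec *fwspec)
{
	if (hwirq < SKETCH_SPI_HWIRQ_FIRST || hwirq > SKETCH_SPI_HWIRQ_LAST)
		return -EINVAL;

	fwspec->param_count = 3;
	fwspec->param[0] = SKETCH_GIC_SPI;
	/* The specifier carries the SPI index, which is zero-based. */
	fwspec->param[1] = hwirq - SKETCH_SPI_HWIRQ_FIRST;
	fwspec->param[2] = SKETCH_IRQ_TYPE_EDGE_RISING;
	return 0;
}
```

With these bounds, hwirq 32 maps to SPI 0 and hwirq 1019 maps to SPI 987.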
>> + fwspec->param[2] = IRQ_TYPE_EDGE_RISING;
>> + return 0;
>> +}
>> +
>> +#endif
>> diff --git a/drivers/virt/gunyah/rsc_mgr.c
>> b/drivers/virt/gunyah/rsc_mgr.c
>> index d7ce692d0067..383be5ac0f44 100644
>> --- a/drivers/virt/gunyah/rsc_mgr.c
>> +++ b/drivers/virt/gunyah/rsc_mgr.c
>> @@ -17,6 +17,8 @@
>> #include <linux/platform_device.h>
>> #include <linux/miscdevice.h>
>> +#include <asm/gunyah.h>
>> +
>> #include "rsc_mgr.h"
>> #include "vm_mgr.h"
>> @@ -132,6 +134,7 @@ struct gh_rm_connection {
>> * @send_lock: synchronization to allow only one request to be sent
>> at a time
>> * @nh: notifier chain for clients interested in RM notification
>> messages
>> * @miscdev: /dev/gunyah
>> + * @irq_domain: Domain to translate Gunyah hwirqs to Linux irqs
>> */
>> struct gh_rm {
>> struct device *dev;
>> @@ -150,6 +153,7 @@ struct gh_rm {
>> struct blocking_notifier_head nh;
>> struct miscdevice miscdev;
>> + struct irq_domain *irq_domain;
>> };
>> /**
>> @@ -190,6 +194,134 @@ static inline int gh_rm_remap_error(enum
>> gh_rm_error rm_error)
>> }
>> }
>> +struct gh_irq_chip_data {
>> + u32 gh_virq;
>> +};
>> +
>> +static struct irq_chip gh_rm_irq_chip = {
>> + .name = "Gunyah",
>> + .irq_enable = irq_chip_enable_parent,
>> + .irq_disable = irq_chip_disable_parent,
>> + .irq_ack = irq_chip_ack_parent,
>> + .irq_mask = irq_chip_mask_parent,
>> + .irq_mask_ack = irq_chip_mask_ack_parent,
>> + .irq_unmask = irq_chip_unmask_parent,
>> + .irq_eoi = irq_chip_eoi_parent,
>> + .irq_set_affinity = irq_chip_set_affinity_parent,
>> + .irq_set_type = irq_chip_set_type_parent,
>> + .irq_set_wake = irq_chip_set_wake_parent,
>> + .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
>> + .irq_retrigger = irq_chip_retrigger_hierarchy,
>> + .irq_get_irqchip_state = irq_chip_get_parent_state,
>> + .irq_set_irqchip_state = irq_chip_set_parent_state,
>> + .flags = IRQCHIP_SET_TYPE_MASKED |
>> + IRQCHIP_SKIP_SET_WAKE |
>> + IRQCHIP_MASK_ON_SUSPEND,
>> +};
>> +
>> +static int gh_rm_irq_domain_alloc(struct irq_domain *d, unsigned int
>> virq, unsigned int nr_irqs,
>> + void *arg)
>> +{
>> + struct gh_irq_chip_data *chip_data, *spec = arg;
>> + struct irq_fwspec parent_fwspec;
>> + struct gh_rm *rm = d->host_data;
>> + u32 gh_virq = spec->gh_virq;
>> + int ret;
>> +
>> + if (nr_irqs != 1 || gh_virq == U32_MAX)
>
> Does U32_MAX have special meaning? Why are you checking for it?
> Whatever it is, you should explain why this is invalid here.
>
This was a holdover from deprecated Gunyah code. Since there are new
feature/version checks, it's not possible for Linux to encounter these
values. I've dropped it in v12.
Thanks,
Elliot
On 4/17/23 5:41 PM, Elliot Berman wrote:
>
>
> On 3/31/2023 7:27 AM, Alex Elder wrote:
>> On 3/3/23 7:06 PM, Elliot Berman wrote:
>
> [snip]
>
>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>> index caeb3b3a3e9a..e52265fa5715 100644
>>> --- a/include/uapi/linux/gunyah.h
>>> +++ b/include/uapi/linux/gunyah.h
>>> @@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
>>> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>>> +/**
>>> + * GH_FN_VCPU - create a vCPU instance to control a vCPU
>>> + *
>>> + * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
>>> + *
>>> + * The vcpu type will register with the VM Manager to expect to control
>>> + * vCPU number `vcpu_id`. It returns a file descriptor allowing
>>> interaction with
>>> + * the vCPU. See the Gunyah vCPU API description sections for
>>> interacting with
>>> + * the Gunyah vCPU file descriptors.
>>> + *
>>> + * Return: file descriptor to manipulate the vcpu. See GH_VCPU_* ioctls
>>> + */
>>> +#define GH_FN_VCPU 1
>>
>> I think you should define GH_FN_VCPU, GH_FN_IRQFD, and GH_FN_IOEVENTFD
>> in an enumerated type. Each has a type associated with it, and you can
>> add the explanation for the function in the kernel-doc comments above
>> those type definitions.
>>
>
> I'd like to enumify the GH_FN_* macros, but one challenge I'm facing is
> that it breaks the module alias implementation in patch 19.
>
> MODULE_ALIAS("ghfunc:"__stringify(_type))
>
> When the GH_FN_* are regular preprocessor macros backed by an integer,
> the preprocessor will make the module alias ghfunc:0 (or ghfunc:1, etc).
> This works well because I can do
>
> request_module("ghfunc:%d", type);
>
> If the function hasn't been registered and then gunyah_vcpu.ko gets
> loaded automatically.
>
> With enum, compiler knows the value of GH_FN_VCPU and preprocessor will
> make the module alias like ghfunc:GH_FN_VCPU.
>
> [snip]
>
>>> +
>>> +/*
>>> + * Gunyah presently sends max 4 bytes of exit_reason.
>>> + * If that changes, this macro can be safely increased without breaking
>>> + * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
>>
>> Is PAGE_SIZE allowed to be anything other than 4096 bytes? Do you
>> expect this driver to work properly if the page size were configured
>> to be 16384 bytes? In other words, is this a Gunyah constant, or
>> is it *really* the page size configured for Linux?
>>
>
> Our implementations are only doing 4096 bytes. I expect the driver to
> work properly when using 16k pages. This really is a Linux page. It's a
> reflection of the alloc_page in gh_vcpu_bind().
OK. I guess I'd be on the lookout for anything that uses 4096 when
PAGE_SIZE is what's actually meant. I have no idea what's involved
with the hypervisor if you wanted to try something else, but if you
haven't tested that, you could maybe do an early check in your probe
function:
BUILD_BUG_ON(PAGE_SIZE != 4096);
> The exit reason is copied from hypervisor into field accessible by
> userspace directly. Gunyah makes the exit reason size dynamic -- there's
> no architectural limitation preventing the exit reason from being a
> string or some lengthy data.
Sounds good. I like having statements like this tested, and maybe
you have. I.e., test with the exit_reason size something like 16
bytes and ensure that works. Testing this is not technically needed,
but your comment suggests it can be done.
> As I was writing this response, I realized that I should be able to make
> this a zero-length array and ensure that reason[] doesn't overflow
> PAGE_SIZE...
Maybe some good came out of it?
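
A rough sketch of what that bound could look like, using a C99 flexible
array member in place of the zero-length array mentioned above. The struct
layout and all names are hypothetical and are not taken from the patch;
the page size is passed in so the same arithmetic can be checked for 4K
and 16K pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; only the trailing reason[] matters here. */
struct sketch_vcpu_run {
	uint8_t immediate_exit;
	uint8_t padding[7];
	uint32_t exit_reason_type;
	uint32_t exit_reason_len;
	uint8_t reason[];	/* flexible array member, sized by the page */
};

/* Largest exit reason that still fits in one page holding the struct. */
static size_t sketch_max_reason_len(size_t page_size)
{
	return page_size - offsetof(struct sketch_vcpu_run, reason);
}

/* Nonzero if an exit reason of len bytes can be copied without overflow. */
static int sketch_reason_fits(size_t len, size_t page_size)
{
	return len <= sketch_max_reason_len(page_size);
}
```

With this layout the header occupies 16 bytes, so a 4096-byte page leaves
4080 bytes for the reason and a 16384-byte page leaves 16368.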
> The comment was trying to explain that Linux itself imposes a limitation
> on the maximum exit reason size. If we need to support longer exit
Your comment isn't clear that Linux is what limits the size.
This is all kind of picky though. My main point was about
the PAGE_SIZE assumption.
-Alex
> reason, we're OK to do so long as the total size doesn't overrun
> PAGE_SIZE. There aren't any plans to need longer exit reasons than the 8
> bytes mentioned today.
>
> Thanks,
> Elliot
On 4/17/23 5:55 PM, Elliot Berman wrote:
>>>
>>> +struct gh_fn_irqfd_arg {
>>> + __u32 fd;
>>
>> Should the "fd" field be signed? Should it be an int? (Perhaps
>> you're trying to define a fixed kernel API, so __s32 if signed would
>> be better.)
>>
>
> It looked to me like some interfaces use __u32 and some use __s32. Is
> one technically correct?
Good question. It depends on how you use it.
It's a file descriptor, so it should be an int, and it appears
that's always a 32-bit signed value (on both 32- and 64-bit machines).
So the size seems to be right.
Whether it's signed or not I think depends on whether you
ever save an error value in this field. I doubt you do,
but if you do, it should be signed. Otherwise, the largest
value will never exceed INT_MAX/S32_MAX; and in that case
either is fine.
Will Gunyah ever run on a 32-bit machine?
-Alex
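
To make the trade-off concrete, here is a small sketch of how a consumer
of an unsigned field could validate it before treating it as a file
descriptor. The helper name is hypothetical and this is not code from the
series; it only shows that a negative value stored into a __u32-style
field remains detectable because it lands above INT_MAX.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Convert an unsigned 32-bit ABI field into an int file descriptor.
 * Anything above INT_MAX cannot be a valid descriptor; that range is
 * also where a negative int ends up when stored in an unsigned field.
 */
static int sketch_fd_from_u32(uint32_t field, int *fd)
{
	if (field > (uint32_t)INT_MAX)
		return -EBADF;

	*fd = (int)field;
	return 0;
}
```

So a valid descriptor such as 3 passes through unchanged, while
(uint32_t)-1 is rejected.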
On 4/17/2023 3:41 PM, Elliot Berman wrote:
>
>
> On 3/31/2023 7:27 AM, Alex Elder wrote:
>> On 3/3/23 7:06 PM, Elliot Berman wrote:
>
> [snip]
>
>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>> index caeb3b3a3e9a..e52265fa5715 100644
>>> --- a/include/uapi/linux/gunyah.h
>>> +++ b/include/uapi/linux/gunyah.h
>>> @@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
>>> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>>> +/**
>>> + * GH_FN_VCPU - create a vCPU instance to control a vCPU
>>> + *
>>> + * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
>>> + *
>>> + * The vcpu type will register with the VM Manager to expect to control
>>> + * vCPU number `vcpu_id`. It returns a file descriptor allowing
>>> interaction with
>>> + * the vCPU. See the Gunyah vCPU API description sections for
>>> interacting with
>>> + * the Gunyah vCPU file descriptors.
>>> + *
>>> + * Return: file descriptor to manipulate the vcpu. See GH_VCPU_* ioctls
>>> + */
>>> +#define GH_FN_VCPU 1
>>
>> I think you should define GH_FN_VCPU, GH_FN_IRQFD, and GH_FN_IOEVENTFD
>> in an enumerated type. Each has a type associated with it, and you can
>> add the explanation for the function in the kernel-doc comments above
>> those type definitions.
>>
>
> I'd like to enumify the GH_FN_* macros, but one challenge I'm facing is
> that it breaks the module alias implementation in patch 19.
>
> MODULE_ALIAS("ghfunc:"__stringify(_type))
>
> When the GH_FN_* are regular preprocessor macros backed by an integer,
> the preprocessor will make the module alias ghfunc:0 (or ghfunc:1, etc).
> This works well because I can do
>
> request_module("ghfunc:%d", type);
>
> If the function hasn't been registered and then gunyah_vcpu.ko gets
> loaded automatically.
>
> With enum, compiler knows the value of GH_FN_VCPU and preprocessor will
> make the module alias like ghfunc:GH_FN_VCPU.
>
I still like the idea of having enum for documentation and clarity. I
noticed that nfnetlink.h saw the same problem for NFNL_SUBSYS_*.
Is this compromise terrible and I should give up on the enum?
enum gh_fn_type {
/* _GH_FN_* macro required for MODULE_ALIAS, otherwise __stringify() trick
* won't work anymore */
#define _GH_FN_VCPU 1
GH_FN_VCPU = _GH_FN_VCPU,
#define _GH_FN_IRQFD 2
GH_FN_IRQFD = _GH_FN_IRQFD,
#define _GH_FN_IOEVENTFD 3
GH_FN_IOEVENTFD = _GH_FN_IOEVENTFD,
};
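
For anyone following along, the stringification difference can be
reproduced outside the kernel. The sketch below uses hypothetical SKETCH_*
names and a two-level stringify macro written to behave like the kernel's
__stringify() for a single argument.

```c
#include <string.h>

/* Two-level stringify: the outer macro expands its argument first. */
#define SKETCH_STR_1(x)	#x
#define SKETCH_STR(x)	SKETCH_STR_1(x)

/* A macro backed by an integer, plus an enum constant with the same value. */
#define SKETCH_FN_VCPU_MACRO 1
enum sketch_fn_type {
	SKETCH_FN_VCPU_ENUM = SKETCH_FN_VCPU_MACRO,
};

/* The macro is replaced by 1 before stringification: "ghfunc:1". */
static const char *sketch_alias_from_macro(void)
{
	return "ghfunc:" SKETCH_STR(SKETCH_FN_VCPU_MACRO);
}

/*
 * The enum constant is not a macro, so the preprocessor leaves the
 * identifier alone and stringifies its name instead of its value.
 */
static const char *sketch_alias_from_enum(void)
{
	return "ghfunc:" SKETCH_STR(SKETCH_FN_VCPU_ENUM);
}
```

Only the macro form produces an alias that a numeric
request_module("ghfunc:%d", type) lookup could match.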
> [snip]
>
>>> +
>>> +/*
>>> + * Gunyah presently sends max 4 bytes of exit_reason.
>>> + * If that changes, this macro can be safely increased without breaking
>>> + * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
>>
>> Is PAGE_SIZE allowed to be anything other than 4096 bytes? Do you
>> expect this driver to work properly if the page size were configured
>> to be 16384 bytes? In other words, is this a Gunyah constant, or
>> is it *really* the page size configured for Linux?
>>
>
> Our implementations are only doing 4096 bytes. I expect the driver to
> work properly when using 16k pages. This really is a Linux page. It's a
> reflection of the alloc_page in gh_vcpu_bind().
>
> The exit reason is copied from hypervisor into field accessible by
> userspace directly. Gunyah makes the exit reason size dynamic -- there's
> no architectural limitation preventing the exit reason from being a
> string or some lengthy data.
>
> As I was writing this response, I realized that I should be able to make
> this a zero-length array and ensure that reason[] doesn't overflow
> PAGE_SIZE...
>
> The comment was trying to explain that Linux itself imposes a limitation
> on the maximum exit reason size. If we need to support longer exit
> reason, we're OK to do so long as the total size doesn't overrun
> PAGE_SIZE. There aren't any plans to need longer exit reasons than the 8
> bytes mentioned today.
>
> Thanks,
> Elliot
On 4/18/23 12:18 PM, Elliot Berman wrote:
>
>
> On 4/17/2023 3:41 PM, Elliot Berman wrote:
>>
>>
>> On 3/31/2023 7:27 AM, Alex Elder wrote:
>>> On 3/3/23 7:06 PM, Elliot Berman wrote:
>>
>> [snip]
>>
>>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>>> index caeb3b3a3e9a..e52265fa5715 100644
>>>> --- a/include/uapi/linux/gunyah.h
>>>> +++ b/include/uapi/linux/gunyah.h
>>>> @@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
>>>> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>>>> +/**
>>>> + * GH_FN_VCPU - create a vCPU instance to control a vCPU
>>>> + *
>>>> + * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
>>>> + *
>>>> + * The vcpu type will register with the VM Manager to expect to
>>>> control
>>>> + * vCPU number `vcpu_id`. It returns a file descriptor allowing
>>>> interaction with
>>>> + * the vCPU. See the Gunyah vCPU API description sections for
>>>> interacting with
>>>> + * the Gunyah vCPU file descriptors.
>>>> + *
>>>> + * Return: file descriptor to manipulate the vcpu. See GH_VCPU_*
>>>> ioctls
>>>> + */
>>>> +#define GH_FN_VCPU 1
>>>
>>> I think you should define GH_FN_VCPU, GH_FN_IRQFD, and GH_FN_IOEVENTFD
>>> in an enumerated type. Each has a type associated with it, and you can
>>> add the explanation for the function in the kernel-doc comments above
>>> those type definitions.
>>>
>>
>> I'd like to enumify the GH_FN_* macros, but one challenge I'm facing
>> is that it breaks the module alias implementation in patch 19.
>>
>> MODULE_ALIAS("ghfunc:"__stringify(_type))
>>
>> When the GH_FN_* are regular preprocessor macros backed by an integer,
>> the preprocessor will make the module alias ghfunc:0 (or ghfunc:1,
>> etc). This works well because I can do
>>
>> request_module("ghfunc:%d", type);
>>
>> If the function hasn't been registered and then gunyah_vcpu.ko gets
>> loaded automatically.
>>
>> With enum, compiler knows the value of GH_FN_VCPU and preprocessor
>> will make the module alias like ghfunc:GH_FN_VCPU.
>>
>
> I still like the idea of having enum for documentation and clarity. I
> noticed that nfnetlink.h saw the same problem for NFNL_SUBSYS_*.
>
> Is this compromise terrible and I should give up on the enum?
You know, I've seen this pattern in the kernel and never thought
too much about why it was done. Maybe this is exactly the reason.
It sure *seems* like there might be some macro magic that might
cause the enum symbol's numeric value to be used but I think the
problem is that enums are C tokens, which are not evaluated at
preprocessor time.
You could probably skip the leading underscore, and do this as
it's done for nfnetlink_groups in that same header file.
Maybe somebody else can confirm, or has a better suggestion.
-Alex
> enum gh_fn_type {
> /* _GH_FN_* macro required for MODULE_ALIAS, otherwise __stringify() trick
> * won't work anymore */
> #define _GH_FN_VCPU 1
> GH_FN_VCPU = _GH_FN_VCPU,
> #define _GH_FN_IRQFD 2
> GH_FN_IRQFD = _GH_FN_IRQFD,
> #define _GH_FN_IOEVENTFD 3
> GH_FN_IOEVENTFD = _GH_FN_IOEVENTFD,
> };
>
>> [snip]
>>
>>>> +
>>>> +/*
>>>> + * Gunyah presently sends max 4 bytes of exit_reason.
>>>> + * If that changes, this macro can be safely increased without
>>>> breaking
>>>> + * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
>>>
>>> Is PAGE_SIZE allowed to be anything other than 4096 bytes? Do you
>>> expect this driver to work properly if the page size were configured
>>> to be 16384 bytes? In other words, is this a Gunyah constant, or
>>> is it *really* the page size configured for Linux?
>>>
>>
>> Our implementations are only doing 4096 bytes. I expect the driver to
>> work properly when using 16k pages. This really is a Linux page. It's
>> a reflection of the alloc_page in gh_vcpu_bind().
>>
>> The exit reason is copied from hypervisor into field accessible by
>> userspace directly. Gunyah makes the exit reason size dynamic --
>> there's no architectural limitation preventing the exit reason from
>> being a string or some lengthy data.
>>
>> As I was writing this response, I realized that I should be able to
>> make this a zero-length array and ensure that reason[] doesn't
>> overflow PAGE_SIZE...
>>
>> The comment was trying to explain that Linux itself imposes a
>> limitation on the maximum exit reason size. If we need to support
>> longer exit reason, we're OK to do so long as the total size doesn't
>> overrun PAGE_SIZE. There aren't any plans to need longer exit reasons
>> than the 8 bytes mentioned today.
>>
>> Thanks,
>> Elliot
On 4/18/2023 10:31 AM, Alex Elder wrote:
> On 4/18/23 12:18 PM, Elliot Berman wrote:
>>
>>
>> On 4/17/2023 3:41 PM, Elliot Berman wrote:
>>>
>>>
>>> On 3/31/2023 7:27 AM, Alex Elder wrote:
>>>> On 3/3/23 7:06 PM, Elliot Berman wrote:
>>>
>>> [snip]
>>>
>>>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>>>> index caeb3b3a3e9a..e52265fa5715 100644
>>>>> --- a/include/uapi/linux/gunyah.h
>>>>> +++ b/include/uapi/linux/gunyah.h
>>>>> @@ -62,8 +62,32 @@ struct gh_vm_dtb_config {
>>>>> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>>>>> +/**
>>>>> + * GH_FN_VCPU - create a vCPU instance to control a vCPU
>>>>> + *
>>>>> + * gh_fn_desc is filled with &struct gh_fn_vcpu_arg
>>>>> + *
>>>>> + * The vcpu type will register with the VM Manager to expect to
>>>>> control
>>>>> + * vCPU number `vcpu_id`. It returns a file descriptor allowing
>>>>> interaction with
>>>>> + * the vCPU. See the Gunyah vCPU API description sections for
>>>>> interacting with
>>>>> + * the Gunyah vCPU file descriptors.
>>>>> + *
>>>>> + * Return: file descriptor to manipulate the vcpu. See GH_VCPU_*
>>>>> ioctls
>>>>> + */
>>>>> +#define GH_FN_VCPU 1
>>>>
>>>> I think you should define GH_FN_VCPU, GH_FN_IRQFD, and GH_FN_IOEVENTFD
>>>> in an enumerated type. Each has a type associated with it, and you can
>>>> add the explanation for the function in the kernel-doc comments above
>>>> those type definitions.
>>>>
>>>
>>> I'd like to enumify the GH_FN_* macros, but one challenge I'm facing
>>> is that it breaks the module alias implementation in patch 19.
>>>
>>> MODULE_ALIAS("ghfunc:"__stringify(_type))
>>>
>>> When the GH_FN_* are regular preprocessor macros backed by an
>>> integer, the preprocessor will make the module alias ghfunc:0 (or
>>> ghfunc:1, etc). This works well because I can do
>>>
>>> request_module("ghfunc:%d", type);
>>>
>>> If the function hasn't been registered and then gunyah_vcpu.ko gets
>>> loaded automatically.
>>>
>>> With enum, compiler knows the value of GH_FN_VCPU and preprocessor
>>> will make the module alias like ghfunc:GH_FN_VCPU.
>>>
>>
>> I still like the idea of having enum for documentation and clarity. I
>> noticed that nfnetlink.h saw the same problem for NFNL_SUBSYS_*.
>>
>> Is this compromise terrible and I should give up on the enum?
>
> You know, I've seen this pattern in the kernel and never thought
> too much about why it was done. Maybe this is exactly the reason.
>
> It sure *seems* like there might be some macro magic that might
> cause the enum symbol's numeric value to be used but I think the
> problem is that enums are C tokens, which are not evaluated at
> preprocessor time.
>
> You could probably skip the leading underscore, and do this as
> it's done for nfnetlink_groups in that same header file.
>
> Maybe somebody else can confirm, or has a better suggestion.
>
If I skip the leading underscore, the preprocessor macro GH_FN_VCPU
expands to the GH_FN_VCPU enum constant, and I'm stuck back where I'd be
if I didn't have the preprocessor macro in the first place. I'm not sure
why the preprocessor macros are done for nfnetlink_groups. I saw one case
where enum kvm_device_type does the same, but that might have been done
because it was converting preprocessor macros to an enum.
Just a guess -- maybe the preprocessor macro was preserved to support
userspace code doing this?
#ifdef KVM_DEV_TYPE_FSL_MPIC_20
...
#endif
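
A small sketch of that guess, with hypothetical SKETCH_* names: defining a
macro with the same name as an enum constant keeps the constant typed by
the enum while still letting #ifdef see it. A constant without the macro
is invisible to the preprocessor.

```c
enum sketch_dev_type {
	SKETCH_DEV_TYPE_A = 1,
/* Self-referential macro: expands to the enum constant of the same name. */
#define SKETCH_DEV_TYPE_A SKETCH_DEV_TYPE_A
	SKETCH_DEV_TYPE_B,	/* no macro: not detectable with #ifdef */
};

/* Returns 1 because the macro above makes the name visible to #ifdef. */
static int sketch_has_type_a(void)
{
#ifdef SKETCH_DEV_TYPE_A
	return 1;
#else
	return 0;
#endif
}

/* Returns 0: SKETCH_DEV_TYPE_B exists only as an enum constant. */
static int sketch_has_type_b(void)
{
#ifdef SKETCH_DEV_TYPE_B
	return 1;
#else
	return 0;
#endif
}
```

This also shows why the same-name form doesn't help MODULE_ALIAS:
stringifying SKETCH_DEV_TYPE_A still yields the identifier, not 1.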
> -Alex
>
>
>> enum gh_fn_type {
>> /* _GH_FN_* macro required for MODULE_ALIAS, otherwise __stringify()
>> trick
>> * won't work anymore */
>> #define _GH_FN_VCPU 1
>> GH_FN_VCPU = _GH_FN_VCPU,
>> #define _GH_FN_IRQFD 2
>> GH_FN_IRQFD = _GH_FN_IRQFD,
>> #define _GH_FN_IOEVENTFD 3
>> GH_FN_IOEVENTFD = _GH_FN_IOEVENTFD,
>> };
>>
>>> [snip]
>>>
>>>>> +
>>>>> +/*
>>>>> + * Gunyah presently sends max 4 bytes of exit_reason.
>>>>> + * If that changes, this macro can be safely increased without
>>>>> breaking
>>>>> + * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
>>>>
>>>> Is PAGE_SIZE allowed to be anything other than 4096 bytes? Do you
>>>> expect this driver to work properly if the page size were configured
>>>> to be 16384 bytes? In other words, is this a Gunyah constant, or
>>>> is it *really* the page size configured for Linux?
>>>>
>>>
>>> Our implementations are only doing 4096 bytes. I expect the driver to
>>> work properly when using 16k pages. This really is a Linux page. It's
>>> a reflection of the alloc_page in gh_vcpu_bind().
>>>
>>> The exit reason is copied from hypervisor into field accessible by
>>> userspace directly. Gunyah makes the exit reason size dynamic --
>>> there's no architectural limitation preventing the exit reason from
>>> being a string or some lengthy data.
>>>
>>> As I was writing this response, I realized that I should be able to
>>> make this a zero-length array and ensure that reason[] doesn't
>>> overflow PAGE_SIZE...
>>>
>>> The comment was trying to explain that Linux itself imposes a
>>> limitation on the maximum exit reason size. If we need to support
>>> longer exit reason, we're OK to do so long as the total size doesn't
>>> overrun PAGE_SIZE. There aren't any plans to need longer exit reasons
>>> than the 8 bytes mentioned today.
>>>
>>> Thanks,
>>> Elliot
>