Gunyah is a Type-1 hypervisor, independent of any high-level OS kernel,
and runs at a higher CPU privilege level. It does not depend on any
lower-privileged OS kernel/code for its core functionality. This
increases its security and allows for a much smaller trusted computing
base than a Type-2 hypervisor. Gunyah is designed for isolated virtual
machine use cases and supports launching trusted, isolated virtual
machines from a relatively less trusted host virtual machine.
Gunyah is an open-source hypervisor. The source repository is available
at https://github.com/quic/gunyah-hypervisor.
The diagram below shows the architecture for AArch64.
::
VM A VM B
+-----+ +-----+ | +-----+ +-----+ +-----+
| | | | | | | | | | |
EL0 | APP | | APP | | | APP | | APP | | APP |
| | | | | | | | | | |
+-----+ +-----+ | +-----+ +-----+ +-----+
---------------------|-------------------------
+--------------+ | +----------------------+
| | | | |
EL1 | Linux Kernel | | |Linux kernel/Other OS | ...
| | | | |
+--------------+ | +----------------------+
--------hvc/smc------|------hvc/smc------------
+----------------------------------------+
| |
EL2 | Gunyah Hypervisor |
| |
+----------------------------------------+
Gunyah provides the following features:
- Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs)
on physical CPUs and enables time-sharing of the CPUs.
- Memory Management: Gunyah tracks the ownership and use of all memory
  under its control. It provides low-level dynamic memory management
  APIs on top of which higher-level donation, lending and sharing are
  built. Gunyah provides strong VM memory isolation for trusted VMs.
- Interrupt Virtualization: Interrupts are managed by the hypervisor
and are routed directly to the assigned VM.
- Inter-VM Communication: There are several different mechanisms
provided for communicating between VMs.
- Device Virtualization: Para-virtualization of devices is supported
  using inter-VM communication and virtio primitives. Low-level
  architecture features and devices such as CPU timers and interrupt
  controllers are supported with hardware virtualization and emulation
  where required.
- Resource Manager: Gunyah supports a "root" VM that initially owns all
VM memory and IO resources. The Gunyah Resource Manager is the default
bundled root VM and provides high-level services including dynamic VM
management and secure memory donation, lending and sharing.
This series adds the basic framework for detecting that Linux is running
as a virtual machine under Gunyah, support for communicating with the
Gunyah Resource Manager, and a sample virtual machine manager capable of
launching virtual machines.
Changes in v16:
- Fleshed out memory reclaim while VM is running
- Documentation and comments
Changes in v15:
https://lore.kernel.org/r/[email protected]
- First implementation of virtual machines backed by guestmemfd and
  using demand paging to provide memory instead of providing it all up
  front.
- Use message queue hypercalls directly instead of traversing through
mailbox framework.
Changes in v14: https://lore.kernel.org/all/[email protected]/
- Coding/cosmetic tweaks suggested by Alex
- Mark IRQs as wake-up capable
Changes in v13:
https://lore.kernel.org/all/[email protected]/
- Tweaks to message queue driver to address race condition between IRQ
and mailbox registration
- Allow removal of VM functions by function-specific comparison --
  specifically, to allow removing an irqfd by label only, without
  requiring the original FD to be provided.
Changes in v12:
https://lore.kernel.org/all/[email protected]/
- Stylistic/cosmetic tweaks suggested by Alex
- Remove patch "virt: gunyah: Identify hypervisor version" and squash
  the check that we're running under a reasonable Gunyah hypervisor
  into the RM driver
- Refactor platform hooks into a separate module per suggestion from
Srini
- GFP_KERNEL_ACCOUNT and account_locked_vm() for page pinning
- enum-ify related constants
Changes in v11:
https://lore.kernel.org/all/[email protected]/
- Rename struct gh_vm_dtb_config:gpa -> guest_phys_addr & overflow
checks for this
- More docstrings throughout
- Make resp_buf and resp_buf_size optional
- Replace deprecated idr with xarray
- Refcounting on misc device instead of RM's platform device
- Renaming variables, structs, etc. from gunyah_ -> gh_
- Drop removal of user mem regions
- Drop mem_lend functionality; to converge with restricted_memfd later
Changes in v10:
https://lore.kernel.org/all/[email protected]/
- Fix bisectability (end result of series is same, --fixups applied to
wrong commits)
- Convert GH_ERROR_* and GH_RM_ERROR_* to enums
- Correct race condition between allocating/freeing user memory
- Replace offsetof with struct_size
- Series-wide renaming of functions to be more consistent
- VM shutdown & restart support added in vCPU and VM Manager patches
- Convert VM function name (string) to type (number)
- Convert VM function argument to value (which could be a pointer) to
remove memory wastage for arguments
- Remove defensive checks of hypervisor correctness
- Clean ups to ioeventfd as suggested by Srivatsa
Changes in v9:
https://lore.kernel.org/all/[email protected]/
- Refactor Gunyah API flags to be exposed as feature flags at kernel
level
- Move mbox client cleanup into gunyah_msgq_remove()
- Simplify gh_rm_call return value and response payload
- Missing clean-up/error handling/little endian fixes as suggested by
Srivatsa and Alex in v8 series
Changes in v8:
https://lore.kernel.org/all/[email protected]/
- Treat VM manager as a library of RM
- Add patches 21-28 as RFC to support proxy-scheduled vCPUs and the
  necessary bits to support virtio from Gunyah userspace
Changes in v7:
https://lore.kernel.org/all/[email protected]/
- Refactor to remove gunyah RM bus
- Refactor to allow multiple RM device instances
- Bump UAPI to start at 0x0
- Refactor QCOM SCM's platform hooks to allow
CONFIG_QCOM_SCM=Y/CONFIG_GUNYAH=M combinations
Changes in v6:
https://lore.kernel.org/all/[email protected]/
- *Replace gunyah-console with gunyah VM Manager*
- Move include/asm-generic/gunyah.h into include/linux/gunyah.h
- s/gunyah_msgq/gh_msgq/
- Minor tweaks and documentation tidying based on comments from Jiri,
Greg, Arnd, Dmitry, and Bagas.
Changes in v5
https://lore.kernel.org/all/[email protected]/
- Dropped sysfs nodes
- Switch from aux bus to Gunyah RM bus for the subdevices
- Cleaning up RM console
Changes in v4:
https://lore.kernel.org/all/[email protected]/
- Tidied up documentation throughout based on questions/feedback received
- Switched message queue implementation to use mailboxes
- Renamed "gunyah_device" as "gunyah_resource"
Changes in v3:
https://lore.kernel.org/all/[email protected]/
- s/Maintained/Supported/ in MAINTAINERS
- Tidied up documentation throughout based on questions/feedback received
- Moved hypercalls into arch/arm64/gunyah/; following hyper-v's implementation
- Drop opaque typedefs
- Move sysfs nodes under /sys/hypervisor/gunyah/
- Moved Gunyah console driver to drivers/tty/
- Reworked gh_device design to drop the Gunyah bus.
Changes in v2: https://lore.kernel.org/all/[email protected]/
- DT bindings clean up
- Switch hypercalls to follow SMCCC
v1: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Elliot Berman <[email protected]>
---
Elliot Berman (34):
docs: gunyah: Introduce Gunyah Hypervisor
dt-bindings: Add binding for gunyah hypervisor
gunyah: Common types and error codes for Gunyah hypercalls
virt: gunyah: Add hypercalls to identify Gunyah
virt: gunyah: Add hypervisor driver
virt: gunyah: msgq: Add hypercalls to send and receive messages
gunyah: rsc_mgr: Add resource manager RPC core
gunyah: vm_mgr: Introduce basic VM Manager
gunyah: rsc_mgr: Add VM lifecycle RPC
gunyah: vm_mgr: Add VM start/stop
virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource
virt: gunyah: Add resource tickets
gunyah: vm_mgr: Add framework for VM Functions
virt: gunyah: Add hypercalls for running a vCPU
virt: gunyah: Add proxy-scheduled vCPUs
gunyah: Add hypercalls for demand paging
gunyah: rsc_mgr: Add memory parcel RPC
virt: gunyah: Add interfaces to map memory into guest address space
gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim
virt: gunyah: Add Qualcomm Gunyah platform ops
virt: gunyah: Implement guestmemfd
virt: gunyah: Add ioctl to bind guestmem to VMs
virt: gunyah: guestmem: Initialize RM mem parcels from guestmem
virt: gunyah: Share guest VM dtb configuration to Gunyah
gunyah: rsc_mgr: Add RPC to enable demand paging
mm/interval_tree: Export iter_first/iter_next
virt: gunyah: Enable demand paging
gunyah: rsc_mgr: Add RPC to set VM boot context
virt: gunyah: Allow userspace to initialize context of primary vCPU
virt: gunyah: Add hypercalls for sending doorbell
virt: gunyah: Add irqfd interface
virt: gunyah: Add IO handlers
virt: gunyah: Add ioeventfd
MAINTAINERS: Add Gunyah hypervisor drivers section
.../bindings/firmware/gunyah-hypervisor.yaml | 82 ++
Documentation/userspace-api/ioctl/ioctl-number.rst | 1 +
Documentation/virt/gunyah/index.rst | 134 +++
Documentation/virt/gunyah/message-queue.rst | 68 ++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 12 +
arch/arm64/Kbuild | 1 +
arch/arm64/gunyah/Makefile | 3 +
arch/arm64/gunyah/gunyah_hypercall.c | 279 ++++++
arch/arm64/include/asm/gunyah.h | 57 ++
drivers/virt/Kconfig | 2 +
drivers/virt/Makefile | 1 +
drivers/virt/gunyah/Kconfig | 47 +
drivers/virt/gunyah/Makefile | 9 +
drivers/virt/gunyah/guest_memfd.c | 960 ++++++++++++++++++++
drivers/virt/gunyah/gunyah.c | 52 ++
drivers/virt/gunyah/gunyah_ioeventfd.c | 139 +++
drivers/virt/gunyah/gunyah_irqfd.c | 190 ++++
drivers/virt/gunyah/gunyah_platform_hooks.c | 115 +++
drivers/virt/gunyah/gunyah_qcom.c | 218 +++++
drivers/virt/gunyah/gunyah_vcpu.c | 584 ++++++++++++
drivers/virt/gunyah/rsc_mgr.c | 948 ++++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 144 +++
drivers/virt/gunyah/rsc_mgr_rpc.c | 586 +++++++++++++
drivers/virt/gunyah/vm_mgr.c | 976 +++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 153 ++++
drivers/virt/gunyah/vm_mgr_mem.c | 321 +++++++
include/linux/gunyah.h | 482 ++++++++++
include/uapi/linux/gunyah.h | 378 ++++++++
mm/interval_tree.c | 3 +
30 files changed, 6946 insertions(+)
---
base-commit: bffdfd2e7e63175ae261131a620f809d946cf9a7
change-id: 20231208-gunyah-952aca7668e0
Best regards,
--
Elliot Berman <[email protected]>
Introduce a framework for Gunyah userspace to install VM functions. VM
functions are optional interfaces to the virtual machine. vCPUs,
ioeventfds, and irqfds are examples of such VM functions and are
implemented in subsequent patches.
A generic framework is implemented instead of individual ioctls to
create vCPUs, irqfds, etc., in order to simplify the VM manager core
implementation and allow dynamic loading of VM function modules.
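As a rough illustration of the framework's shape (not the kernel code
itself), a userspace model of the type-keyed registry might look like
the sketch below; all names and constants here are hypothetical.

```c
#include <stddef.h>

/* Hypothetical userspace model of the VM-function registry: each
 * function type registers bind/unbind ops keyed by a numeric type,
 * and instance creation looks the type up at ADD_FUNCTION time. */
struct fn_ops {
	unsigned int type;
	int (*bind)(void *arg);
	void (*unbind)(void *arg);
};

#define MAX_FN 8
static struct fn_ops *registry[MAX_FN];

/* Mirrors the register path: reject ops missing bind/unbind. */
static int fn_register(struct fn_ops *ops)
{
	if (!ops || ops->type >= MAX_FN || !ops->bind || !ops->unbind)
		return -1;
	registry[ops->type] = ops;
	return 0;
}

static struct fn_ops *fn_lookup(unsigned int type)
{
	return type < MAX_FN ? registry[type] : NULL;
}

/* A stand-in "vcpu" function type. */
static int vcpu_bind(void *arg) { (void)arg; return 0; }
static void vcpu_unbind(void *arg) { (void)arg; }
static struct fn_ops vcpu_fn = {
	.type = 1, .bind = vcpu_bind, .unbind = vcpu_unbind,
};
```

The real framework additionally falls back to request_module() when a
type is not yet registered, so function modules can be loaded on demand.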
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 207 ++++++++++++++++++++++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.h | 10 +++
include/linux/gunyah.h | 87 +++++++++++++++++-
include/uapi/linux/gunyah.h | 18 ++++
4 files changed, 318 insertions(+), 4 deletions(-)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 65badcf6357b..5d4f413f7a76 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -6,15 +6,175 @@
#define pr_fmt(fmt) "gunyah_vm_mgr: " fmt
#include <linux/anon_inodes.h>
+#include <linux/compat.h>
#include <linux/file.h>
#include <linux/miscdevice.h>
#include <linux/module.h>
+#include <linux/xarray.h>
#include <uapi/linux/gunyah.h>
#include "rsc_mgr.h"
#include "vm_mgr.h"
+static DEFINE_XARRAY(gunyah_vm_functions);
+
+static void gunyah_vm_put_function(struct gunyah_vm_function *fn)
+{
+ module_put(fn->mod);
+}
+
+static struct gunyah_vm_function *gunyah_vm_get_function(u32 type)
+{
+ struct gunyah_vm_function *fn;
+
+ fn = xa_load(&gunyah_vm_functions, type);
+ if (!fn) {
+ request_module("ghfunc:%d", type);
+
+ fn = xa_load(&gunyah_vm_functions, type);
+ }
+
+ if (!fn || !try_module_get(fn->mod))
+ fn = ERR_PTR(-ENOENT);
+
+ return fn;
+}
+
+static void
+gunyah_vm_remove_function_instance(struct gunyah_vm_function_instance *inst)
+ __must_hold(&inst->ghvm->fn_lock)
+{
+ inst->fn->unbind(inst);
+ list_del(&inst->vm_list);
+ gunyah_vm_put_function(inst->fn);
+ kfree(inst->argp);
+ kfree(inst);
+}
+
+static void gunyah_vm_remove_functions(struct gunyah_vm *ghvm)
+{
+ struct gunyah_vm_function_instance *inst, *iiter;
+
+ mutex_lock(&ghvm->fn_lock);
+ list_for_each_entry_safe(inst, iiter, &ghvm->functions, vm_list) {
+ gunyah_vm_remove_function_instance(inst);
+ }
+ mutex_unlock(&ghvm->fn_lock);
+}
+
+static long gunyah_vm_add_function_instance(struct gunyah_vm *ghvm,
+ struct gunyah_fn_desc *f)
+{
+ struct gunyah_vm_function_instance *inst;
+ void __user *argp;
+ long r = 0;
+
+ if (f->arg_size > GUNYAH_FN_MAX_ARG_SIZE) {
+ dev_err_ratelimited(ghvm->parent, "%s: arg_size > %d\n",
+ __func__, GUNYAH_FN_MAX_ARG_SIZE);
+ return -EINVAL;
+ }
+
+ inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ inst->arg_size = f->arg_size;
+ if (inst->arg_size) {
+ inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
+ if (!inst->argp) {
+ r = -ENOMEM;
+ goto free;
+ }
+
+ argp = u64_to_user_ptr(f->arg);
+ if (copy_from_user(inst->argp, argp, f->arg_size)) {
+ r = -EFAULT;
+ goto free_arg;
+ }
+ }
+
+ inst->fn = gunyah_vm_get_function(f->type);
+ if (IS_ERR(inst->fn)) {
+ r = PTR_ERR(inst->fn);
+ goto free_arg;
+ }
+
+ inst->ghvm = ghvm;
+ inst->rm = ghvm->rm;
+
+ mutex_lock(&ghvm->fn_lock);
+ r = inst->fn->bind(inst);
+ if (r < 0) {
+ mutex_unlock(&ghvm->fn_lock);
+ gunyah_vm_put_function(inst->fn);
+ goto free_arg;
+ }
+
+ list_add(&inst->vm_list, &ghvm->functions);
+ mutex_unlock(&ghvm->fn_lock);
+
+ return r;
+free_arg:
+ kfree(inst->argp);
+free:
+ kfree(inst);
+ return r;
+}
+
+static long gunyah_vm_rm_function_instance(struct gunyah_vm *ghvm,
+ struct gunyah_fn_desc *f)
+{
+ struct gunyah_vm_function_instance *inst, *iter;
+ void __user *user_argp;
+ void *argp __free(kfree) = NULL;
+ long r = 0;
+
+ if (f->arg_size) {
+ argp = kzalloc(f->arg_size, GFP_KERNEL);
+ if (!argp)
+ return -ENOMEM;
+
+ user_argp = u64_to_user_ptr(f->arg);
+ if (copy_from_user(argp, user_argp, f->arg_size))
+ return -EFAULT;
+ }
+
+ r = mutex_lock_interruptible(&ghvm->fn_lock);
+ if (r)
+ return r;
+
+ r = -ENOENT;
+ list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
+ if (inst->fn->type == f->type &&
+ inst->fn->compare(inst, argp, f->arg_size)) {
+ gunyah_vm_remove_function_instance(inst);
+ r = 0;
+ }
+ }
+
+ mutex_unlock(&ghvm->fn_lock);
+ return r;
+}
+
+int gunyah_vm_function_register(struct gunyah_vm_function *fn)
+{
+ if (!fn->bind || !fn->unbind)
+ return -EINVAL;
+
+ return xa_err(xa_store(&gunyah_vm_functions, fn->type, fn, GFP_KERNEL));
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_function_register);
+
+void gunyah_vm_function_unregister(struct gunyah_vm_function *fn)
+{
+ /* Expecting unregister to only come when unloading a module */
+ WARN_ON(fn->mod && module_refcount(fn->mod));
+ xa_erase(&gunyah_vm_functions, fn->type);
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_function_unregister);
+
int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
struct gunyah_vm_resource_ticket *ticket)
{
@@ -191,7 +351,11 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
init_rwsem(&ghvm->status_lock);
init_waitqueue_head(&ghvm->vm_status_wait);
+ kref_init(&ghvm->kref);
ghvm->vm_status = GUNYAH_RM_VM_STATUS_NO_STATE;
+
+ INIT_LIST_HEAD(&ghvm->functions);
+ mutex_init(&ghvm->fn_lock);
mutex_init(&ghvm->resources_lock);
INIT_LIST_HEAD(&ghvm->resources);
INIT_LIST_HEAD(&ghvm->resource_tickets);
@@ -306,6 +470,7 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
unsigned long arg)
{
struct gunyah_vm *ghvm = filp->private_data;
+ void __user *argp = (void __user *)arg;
long r;
switch (cmd) {
@@ -313,6 +478,24 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
r = gunyah_vm_ensure_started(ghvm);
break;
}
+ case GUNYAH_VM_ADD_FUNCTION: {
+ struct gunyah_fn_desc f;
+
+ if (copy_from_user(&f, argp, sizeof(f)))
+ return -EFAULT;
+
+ r = gunyah_vm_add_function_instance(ghvm, &f);
+ break;
+ }
+ case GUNYAH_VM_REMOVE_FUNCTION: {
+ struct gunyah_fn_desc f;
+
+ if (copy_from_user(&f, argp, sizeof(f)))
+ return -EFAULT;
+
+ r = gunyah_vm_rm_function_instance(ghvm, &f);
+ break;
+ }
default:
r = -ENOTTY;
break;
@@ -321,9 +504,15 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
return r;
}
-static int gunyah_vm_release(struct inode *inode, struct file *filp)
+int __must_check gunyah_vm_get(struct gunyah_vm *ghvm)
{
- struct gunyah_vm *ghvm = filp->private_data;
+ return kref_get_unless_zero(&ghvm->kref);
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_get);
+
+static void _gunyah_vm_put(struct kref *kref)
+{
+ struct gunyah_vm *ghvm = container_of(kref, struct gunyah_vm, kref);
int ret;
/**
@@ -333,6 +522,7 @@ static int gunyah_vm_release(struct inode *inode, struct file *filp)
if (ghvm->vm_status == GUNYAH_RM_VM_STATUS_RUNNING)
gunyah_vm_stop(ghvm);
+ gunyah_vm_remove_functions(ghvm);
gunyah_vm_clean_resources(ghvm);
if (ghvm->vm_status != GUNYAH_RM_VM_STATUS_NO_STATE &&
@@ -357,6 +547,19 @@ static int gunyah_vm_release(struct inode *inode, struct file *filp)
gunyah_rm_put(ghvm->rm);
kfree(ghvm);
+}
+
+void gunyah_vm_put(struct gunyah_vm *ghvm)
+{
+ kref_put(&ghvm->kref, _gunyah_vm_put);
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_put);
+
+static int gunyah_vm_release(struct inode *inode, struct file *filp)
+{
+ struct gunyah_vm *ghvm = filp->private_data;
+
+ gunyah_vm_put(ghvm);
return 0;
}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 0d291f722885..190a95ee8da6 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -7,6 +7,8 @@
#define _GUNYAH_VM_MGR_PRIV_H
#include <linux/device.h>
+#include <linux/kref.h>
+#include <linux/mutex.h>
#include <linux/rwsem.h>
#include <linux/wait.h>
@@ -26,6 +28,10 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* @vm_status: Current state of the VM, as last reported by RM
* @vm_status_wait: Wait queue for status @vm_status changes
* @status_lock: Serializing state transitions
+ * @kref: Reference counter for the VM
+ * @fn_lock: Serializes addition and removal of functions
+ * @functions: List of &struct gunyah_vm_function_instance that have been
+ * created by user for this VM.
* @resource_lock: Serializing addition of resources and resource tickets
* @resources: List of &struct gunyah_resource that are associated with this VM
* @resource_tickets: List of &struct gunyah_vm_resource_ticket
@@ -42,6 +48,10 @@ struct gunyah_vm {
enum gunyah_rm_vm_status vm_status;
wait_queue_head_t vm_status_wait;
struct rw_semaphore status_lock;
+
+ struct kref kref;
+ struct mutex fn_lock;
+ struct list_head functions;
struct mutex resources_lock;
struct list_head resources;
struct list_head resource_tickets;
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 001769100260..359cd63b4938 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -11,8 +11,93 @@
#include <linux/interrupt.h>
#include <linux/limits.h>
#include <linux/list.h>
+#include <linux/mod_devicetable.h>
#include <linux/types.h>
+#include <uapi/linux/gunyah.h>
+
+struct gunyah_vm;
+
+int __must_check gunyah_vm_get(struct gunyah_vm *ghvm);
+void gunyah_vm_put(struct gunyah_vm *ghvm);
+
+struct gunyah_vm_function_instance;
+/**
+ * struct gunyah_vm_function - Represents a function type
+ * @type: value from &enum gunyah_fn_type
+ * @name: friendly name for debug purposes
+ * @mod: owner of the function type
+ * @bind: Called when a new function of this type has been allocated.
+ * @unbind: Called when the function instance is being destroyed.
+ * @compare: Compare function instance @f's argument to the provided arg.
+ * Return true if they are equivalent. Used on GUNYAH_VM_REMOVE_FUNCTION.
+ */
+struct gunyah_vm_function {
+ u32 type;
+ const char *name;
+ struct module *mod;
+ long (*bind)(struct gunyah_vm_function_instance *f);
+ void (*unbind)(struct gunyah_vm_function_instance *f);
+ bool (*compare)(const struct gunyah_vm_function_instance *f,
+ const void *arg, size_t size);
+};
+
+/**
+ * struct gunyah_vm_function_instance - Represents one function instance
+ * @arg_size: size of user argument
+ * @argp: pointer to user argument
+ * @ghvm: Pointer to VM instance
+ * @rm: Pointer to resource manager for the VM instance
+ * @fn: The ops for the function
+ * @data: Private data for function
+ * @vm_list: for gunyah_vm's functions list
+ * @fn_list: for gunyah_vm_function's instances list
+ */
+struct gunyah_vm_function_instance {
+ size_t arg_size;
+ void *argp;
+ struct gunyah_vm *ghvm;
+ struct gunyah_rm *rm;
+ struct gunyah_vm_function *fn;
+ void *data;
+ struct list_head vm_list;
+};
+
+int gunyah_vm_function_register(struct gunyah_vm_function *f);
+void gunyah_vm_function_unregister(struct gunyah_vm_function *f);
+
+/* Since the function identifiers were set up in a uapi header as an
+ * enum and we do not want to change that, the user must supply the expanded
+ * constant as well and the compiler checks they are the same.
+ * See also MODULE_ALIAS_RDMA_NETLINK.
+ */
+#define MODULE_ALIAS_GUNYAH_VM_FUNCTION(_type, _idx) \
+ static inline void __maybe_unused __chk##_idx(void) \
+ { \
+ BUILD_BUG_ON(_type != _idx); \
+ } \
+ MODULE_ALIAS("ghfunc:" __stringify(_idx))
+
+#define DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind, _compare) \
+ static struct gunyah_vm_function _name = { \
+ .type = _type, \
+ .name = __stringify(_name), \
+ .mod = THIS_MODULE, \
+ .bind = _bind, \
+ .unbind = _unbind, \
+ .compare = _compare, \
+ }
+
+#define module_gunyah_vm_function(__gf) \
+ module_driver(__gf, gunyah_vm_function_register, \
+ gunyah_vm_function_unregister)
+
+#define DECLARE_GUNYAH_VM_FUNCTION_INIT(_name, _type, _idx, _bind, _unbind, \
+ _compare) \
+ DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind, _compare); \
+ module_gunyah_vm_function(_name); \
+ MODULE_ALIAS_GUNYAH_VM_FUNCTION(_type, _idx)
+
/* Matches resource manager's resource types for VM_GET_HYP_RESOURCES RPC */
enum gunyah_resource_type {
/* clang-format off */
@@ -35,8 +120,6 @@ struct gunyah_resource {
u32 rm_label;
};
-struct gunyah_vm;
-
/**
* struct gunyah_vm_resource_ticket - Represents a ticket to reserve access to VM resource(s)
* @vm_list: for @gunyah_vm->resource_tickets
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 31e7f79a6c39..1b7cb5fde70a 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -25,4 +25,22 @@
*/
#define GUNYAH_VM_START _IO(GUNYAH_IOCTL_TYPE, 0x3)
+#define GUNYAH_FN_MAX_ARG_SIZE 256
+
+/**
+ * struct gunyah_fn_desc - Arguments to create a VM function
+ * @type: Type of the function. See &enum gunyah_fn_type.
+ * @arg_size: Size of argument to pass to the function. arg_size <= GUNYAH_FN_MAX_ARG_SIZE
+ * @arg: Pointer to argument given to the function. See &enum gunyah_fn_type for expected
+ * arguments for a function type.
+ */
+struct gunyah_fn_desc {
+ __u32 type;
+ __u32 arg_size;
+ __u64 arg;
+};
+
+#define GUNYAH_VM_ADD_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x4, struct gunyah_fn_desc)
+#define GUNYAH_VM_REMOVE_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x7, struct gunyah_fn_desc)
+
#endif
--
2.34.1
Qualcomm platforms have a firmware entity that performs access control
for physical pages. Dynamically started Gunyah virtual machines use the
QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
to the memory used by guest VMs. Gunyah doesn't perform this operation
for us, since it is the current VM (typically VMID_HLOS) delegating the
access, not Gunyah itself. Use the Gunyah platform ops to achieve this
so that only Qualcomm platforms attempt to make the needed SCM calls.
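The core of the pre-share hook is a direct translation from Gunyah RM
ACL bits into SCM permission bits. A standalone sketch of that mapping
follows; the bit values here are illustrative stand-ins, not the
kernel's actual GUNYAH_RM_ACL_* / QCOM_SCM_PERM_* constants.

```c
/* Illustrative stand-ins; the real values live in kernel headers. */
enum { ACL_X = 0x1, ACL_W = 0x2, ACL_R = 0x4 };
enum { PERM_EXEC = 0x1, PERM_WRITE = 0x2, PERM_READ = 0x4 };

/* Mirrors the per-ACL-entry loop that builds one vmperm entry for
 * each VM listed in the parcel's access control list. */
static unsigned int acl_to_scm_perm(unsigned int acl)
{
	unsigned int perm = 0;

	if (acl & ACL_X)
		perm |= PERM_EXEC;
	if (acl & ACL_W)
		perm |= PERM_WRITE;
	if (acl & ACL_R)
		perm |= PERM_READ;
	return perm;
}
```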
Reviewed-by: Alex Elder <[email protected]>
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Kconfig | 13 +++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_qcom.c | 218 ++++++++++++++++++++++++++++++++++++++
3 files changed, 232 insertions(+)
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 23ba523d25dc..fe2823dc48ba 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -4,6 +4,7 @@ config GUNYAH
tristate "Gunyah Virtualization drivers"
depends on ARM64
select GUNYAH_PLATFORM_HOOKS
+ imply GUNYAH_QCOM_PLATFORM if ARCH_QCOM
help
The Gunyah drivers are the helper interfaces that run in a guest VM
such as basic inter-VM IPC and signaling mechanisms, and higher level
@@ -14,3 +15,15 @@ config GUNYAH
config GUNYAH_PLATFORM_HOOKS
tristate
+
+config GUNYAH_QCOM_PLATFORM
+ tristate "Support for Gunyah on Qualcomm platforms"
+ depends on GUNYAH
+ select GUNYAH_PLATFORM_HOOKS
+ select QCOM_SCM
+ help
+ Enable support for interacting with Gunyah on Qualcomm
+ platforms. Interaction with Qualcomm firmware requires
+ extra platform-specific support.
+
+ Say Y/M here to use Gunyah on Qualcomm platforms.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index ffcde0e0ccfa..a6c6f29b887a 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -4,3 +4,4 @@ gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o
obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
+obj-$(CONFIG_GUNYAH_QCOM_PLATFORM) += gunyah_qcom.o
diff --git a/drivers/virt/gunyah/gunyah_qcom.c b/drivers/virt/gunyah/gunyah_qcom.c
new file mode 100644
index 000000000000..2381d75482ca
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_qcom.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/arm-smccc.h>
+#include <linux/gunyah.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/firmware/qcom/qcom_scm.h>
+#include <linux/types.h>
+#include <linux/uuid.h>
+
+#define QCOM_SCM_RM_MANAGED_VMID 0x3A
+#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
+
+static int
+qcom_scm_gunyah_rm_pre_mem_share(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel)
+{
+ struct qcom_scm_vmperm *new_perms __free(kfree) = NULL;
+ u64 src, src_cpy;
+ int ret = 0, i, n;
+ u16 vmid;
+
+ new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms),
+ GFP_KERNEL);
+ if (!new_perms)
+ return -ENOMEM;
+
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ new_perms[n].vmid = vmid;
+ else
+ new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
+ if (mem_parcel->acl_entries[n].perms & GUNYAH_RM_ACL_X)
+ new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
+ if (mem_parcel->acl_entries[n].perms & GUNYAH_RM_ACL_W)
+ new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
+ if (mem_parcel->acl_entries[n].perms & GUNYAH_RM_ACL_R)
+ new_perms[n].perm |= QCOM_SCM_PERM_READ;
+ }
+
+ src = BIT_ULL(QCOM_SCM_VMID_HLOS);
+
+ for (i = 0; i < mem_parcel->n_mem_entries; i++) {
+ src_cpy = src;
+ ret = qcom_scm_assign_mem(
+ le64_to_cpu(mem_parcel->mem_entries[i].phys_addr),
+ le64_to_cpu(mem_parcel->mem_entries[i].size), &src_cpy,
+ new_perms, mem_parcel->n_acl_entries);
+ if (ret)
+ break;
+ }
+
+	/* If all assignments succeeded, we're done; otherwise unwind below */
+ if (!ret)
+ return 0;
+
+ src = 0;
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ src |= BIT_ULL(vmid);
+ else
+ src |= BIT_ULL(QCOM_SCM_RM_MANAGED_VMID);
+ }
+
+ new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
+
+ for (i--; i >= 0; i--) {
+ src_cpy = src;
+ WARN_ON_ONCE(qcom_scm_assign_mem(
+ le64_to_cpu(mem_parcel->mem_entries[i].phys_addr),
+ le64_to_cpu(mem_parcel->mem_entries[i].size), &src_cpy,
+ new_perms, 1));
+ }
+
+ return ret;
+}
+
+static int
+qcom_scm_gunyah_rm_post_mem_reclaim(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel)
+{
+ struct qcom_scm_vmperm new_perms;
+ u64 src = 0, src_cpy;
+ int ret = 0, i, n;
+ u16 vmid;
+
+ new_perms.vmid = QCOM_SCM_VMID_HLOS;
+ new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE |
+ QCOM_SCM_PERM_READ;
+
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ src |= (1ull << vmid);
+ else
+ src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
+ }
+
+ for (i = 0; i < mem_parcel->n_mem_entries; i++) {
+ src_cpy = src;
+ ret = qcom_scm_assign_mem(
+ le64_to_cpu(mem_parcel->mem_entries[i].phys_addr),
+ le64_to_cpu(mem_parcel->mem_entries[i].size), &src_cpy,
+ &new_perms, 1);
+ WARN_ON_ONCE(ret);
+ }
+
+ return ret;
+}
+
+static int
+qcom_scm_gunyah_rm_pre_demand_page(struct gunyah_rm *rm, u16 vmid,
+ enum gunyah_pagetable_access access,
+ struct folio *folio)
+{
+ struct qcom_scm_vmperm new_perms[2];
+ unsigned int n = 1;
+ u64 src;
+
+ new_perms[0].vmid = QCOM_SCM_RM_MANAGED_VMID;
+ new_perms[0].perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE |
+ QCOM_SCM_PERM_READ;
+ if (access != GUNYAH_PAGETABLE_ACCESS_X &&
+ access != GUNYAH_PAGETABLE_ACCESS_RX &&
+ access != GUNYAH_PAGETABLE_ACCESS_RWX) {
+ new_perms[1].vmid = QCOM_SCM_VMID_HLOS;
+ new_perms[1].perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE |
+ QCOM_SCM_PERM_READ;
+ n++;
+ }
+
+ src = BIT_ULL(QCOM_SCM_VMID_HLOS);
+
+ return qcom_scm_assign_mem(__pfn_to_phys(folio_pfn(folio)),
+ folio_size(folio), &src, new_perms, n);
+}
+
+static int
+qcom_scm_gunyah_rm_release_demand_page(struct gunyah_rm *rm, u16 vmid,
+ enum gunyah_pagetable_access access,
+ struct folio *folio)
+{
+ struct qcom_scm_vmperm new_perms;
+ u64 src;
+
+ new_perms.vmid = QCOM_SCM_VMID_HLOS;
+ new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE |
+ QCOM_SCM_PERM_READ;
+
+ src = BIT_ULL(QCOM_SCM_RM_MANAGED_VMID);
+
+ if (access != GUNYAH_PAGETABLE_ACCESS_X &&
+ access != GUNYAH_PAGETABLE_ACCESS_RX &&
+ access != GUNYAH_PAGETABLE_ACCESS_RWX)
+ src |= BIT_ULL(QCOM_SCM_VMID_HLOS);
+
+ return qcom_scm_assign_mem(__pfn_to_phys(folio_pfn(folio)),
+ folio_size(folio), &src, &new_perms, 1);
+}
+
+static struct gunyah_rm_platform_ops qcom_scm_gunyah_rm_platform_ops = {
+ .pre_mem_share = qcom_scm_gunyah_rm_pre_mem_share,
+ .post_mem_reclaim = qcom_scm_gunyah_rm_post_mem_reclaim,
+ .pre_demand_page = qcom_scm_gunyah_rm_pre_demand_page,
+ .release_demand_page = qcom_scm_gunyah_rm_release_demand_page,
+};
+
+/* {19bd54bd-0b37-571b-946f-609b54539de6} */
+static const uuid_t QCOM_EXT_UUID = UUID_INIT(0x19bd54bd, 0x0b37, 0x571b, 0x94,
+ 0x6f, 0x60, 0x9b, 0x54, 0x53,
+ 0x9d, 0xe6);
+
+#define GUNYAH_QCOM_EXT_CALL_UUID_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, 0x3f01)
+
+static bool gunyah_has_qcom_extensions(void)
+{
+ struct arm_smccc_res res;
+ uuid_t uuid;
+ u32 *up;
+
+ arm_smccc_1_1_smc(GUNYAH_QCOM_EXT_CALL_UUID_ID, &res);
+
+ up = (u32 *)&uuid.b[0];
+ up[0] = lower_32_bits(res.a0);
+ up[1] = lower_32_bits(res.a1);
+ up[2] = lower_32_bits(res.a2);
+ up[3] = lower_32_bits(res.a3);
+
+ return uuid_equal(&uuid, &QCOM_EXT_UUID);
+}
+
+static int __init qcom_gunyah_platform_hooks_register(void)
+{
+ if (!gunyah_has_qcom_extensions())
+ return -ENODEV;
+
+ pr_info("Enabling Gunyah hooks for Qualcomm platforms.\n");
+
+ return gunyah_rm_register_platform_ops(
+ &qcom_scm_gunyah_rm_platform_ops);
+}
+
+static void __exit qcom_gunyah_platform_hooks_unregister(void)
+{
+ gunyah_rm_unregister_platform_ops(&qcom_scm_gunyah_rm_platform_ops);
+}
+
+module_init(qcom_gunyah_platform_hooks_register);
+module_exit(qcom_gunyah_platform_hooks_unregister);
+MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Platform Hooks for Gunyah");
+MODULE_LICENSE("GPL");
--
2.34.1
In a Gunyah hypervisor system using the Gunyah Resource Manager, the
"standard" unit for donating, lending and sharing memory is called a
memory parcel (memparcel). A memparcel is an abstraction used by the
resource manager to securely manage the donation, lending and sharing
of memory, which may be physically and virtually fragmented, without
dealing directly with physical memory addresses.
Memparcels are created and managed through the RM RPC functions for
lending, sharing and reclaiming memory from VMs.
When creating a new VM, the initial VM memory containing the VM image and
the VM's device tree blob must be provided as a memparcel. The memparcel
must be created using the RM RPC for lending and mapping the memory into
the VM.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/rsc_mgr.h | 9 ++
drivers/virt/gunyah/rsc_mgr_rpc.c | 231 ++++++++++++++++++++++++++++++++++++++
include/linux/gunyah.h | 43 +++++++
3 files changed, 283 insertions(+)
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 52711de77bb7..ec8ad8149e8e 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -10,6 +10,7 @@
#include <linux/types.h>
#define GUNYAH_VMID_INVAL U16_MAX
+#define GUNYAH_MEM_HANDLE_INVAL U32_MAX
struct gunyah_rm;
@@ -58,6 +59,12 @@ struct gunyah_rm_vm_status_payload {
__le16 app_status;
} __packed;
+/* RPC Calls */
+int gunyah_rm_mem_share(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *parcel);
+int gunyah_rm_mem_reclaim(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *parcel);
+
int gunyah_rm_alloc_vmid(struct gunyah_rm *rm, u16 vmid);
int gunyah_rm_dealloc_vmid(struct gunyah_rm *rm, u16 vmid);
int gunyah_rm_vm_reset(struct gunyah_rm *rm, u16 vmid);
@@ -99,6 +106,8 @@ struct gunyah_rm_hyp_resources {
int gunyah_rm_get_hyp_resources(struct gunyah_rm *rm, u16 vmid,
struct gunyah_rm_hyp_resources **resources);
+int gunyah_rm_get_vmid(struct gunyah_rm *rm, u16 *vmid);
+
struct gunyah_resource *
gunyah_rm_alloc_resource(struct gunyah_rm *rm,
struct gunyah_rm_hyp_resource *hyp_resource);
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index 141ce0145e91..bc44bde990ce 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -5,6 +5,12 @@
#include "rsc_mgr.h"
+/* Message IDs: Memory Management */
+#define GUNYAH_RM_RPC_MEM_LEND 0x51000012
+#define GUNYAH_RM_RPC_MEM_SHARE 0x51000013
+#define GUNYAH_RM_RPC_MEM_RECLAIM 0x51000015
+#define GUNYAH_RM_RPC_MEM_APPEND 0x51000018
+
/* Message IDs: VM Management */
/* clang-format off */
#define GUNYAH_RM_RPC_VM_ALLOC_VMID 0x56000001
@@ -15,6 +21,7 @@
#define GUNYAH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
#define GUNYAH_RM_RPC_VM_INIT 0x5600000B
#define GUNYAH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
+#define GUNYAH_RM_RPC_VM_GET_VMID 0x56000024
/* clang-format on */
struct gunyah_rm_vm_common_vmid_req {
@@ -22,6 +29,48 @@ struct gunyah_rm_vm_common_vmid_req {
__le16 _padding;
} __packed;
+/* Call: MEM_LEND, MEM_SHARE */
+#define GUNYAH_RM_MAX_MEM_ENTRIES 512
+
+#define GUNYAH_MEM_SHARE_REQ_FLAGS_APPEND BIT(1)
+
+struct gunyah_rm_mem_share_req_header {
+ u8 mem_type;
+ u8 _padding0;
+ u8 flags;
+ u8 _padding1;
+ __le32 label;
+} __packed;
+
+struct gunyah_rm_mem_share_req_acl_section {
+ __le32 n_entries;
+ struct gunyah_rm_mem_acl_entry entries[];
+} __packed;
+
+struct gunyah_rm_mem_share_req_mem_section {
+ __le16 n_entries;
+ __le16 _padding;
+ struct gunyah_rm_mem_entry entries[];
+} __packed;
+
+/* Call: MEM_RELEASE */
+struct gunyah_rm_mem_release_req {
+ __le32 mem_handle;
+ u8 flags; /* currently not used */
+ u8 _padding0;
+ __le16 _padding1;
+} __packed;
+
+/* Call: MEM_APPEND */
+#define GUNYAH_MEM_APPEND_REQ_FLAGS_END BIT(0)
+
+struct gunyah_rm_mem_append_req_header {
+ __le32 mem_handle;
+ u8 flags;
+ u8 _padding0;
+ __le16 _padding1;
+} __packed;
+
/* Call: VM_ALLOC */
struct gunyah_rm_vm_alloc_vmid_resp {
__le16 vmid;
@@ -66,6 +115,159 @@ static int gunyah_rm_common_vmid_call(struct gunyah_rm *rm, u32 message_id,
NULL, NULL);
}
+static int gunyah_rm_mem_append(struct gunyah_rm *rm, u32 mem_handle,
+ struct gunyah_rm_mem_entry *entries,
+ size_t n_entries)
+{
+ struct gunyah_rm_mem_append_req_header *req __free(kfree) = NULL;
+ struct gunyah_rm_mem_share_req_mem_section *mem;
+ int ret = 0;
+ size_t n;
+
+ req = kzalloc(sizeof(*req) + struct_size(mem, entries, GUNYAH_RM_MAX_MEM_ENTRIES),
+ GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ req->mem_handle = cpu_to_le32(mem_handle);
+ mem = (void *)(req + 1);
+
+ while (n_entries) {
+ req->flags = 0;
+ if (n_entries > GUNYAH_RM_MAX_MEM_ENTRIES) {
+ n = GUNYAH_RM_MAX_MEM_ENTRIES;
+ } else {
+ req->flags |= GUNYAH_MEM_APPEND_REQ_FLAGS_END;
+ n = n_entries;
+ }
+
+ mem->n_entries = cpu_to_le16(n);
+ memcpy(mem->entries, entries, sizeof(*entries) * n);
+
+ ret = gunyah_rm_call(rm, GUNYAH_RM_RPC_MEM_APPEND, req,
+ sizeof(*req) + struct_size(mem, entries, n),
+ NULL, NULL);
+ if (ret)
+ break;
+
+ entries += n;
+ n_entries -= n;
+ }
+
+ return ret;
+}
+
+/**
+ * gunyah_rm_mem_share() - Share memory with other virtual machines.
+ * @rm: Handle to a Gunyah resource manager
+ * @p: Information about the memory to be shared.
+ *
+ * Sharing keeps Linux's access to the memory while the memory parcel is shared.
+ */
+int gunyah_rm_mem_share(struct gunyah_rm *rm, struct gunyah_rm_mem_parcel *p)
+{
+ u32 message_id = p->n_acl_entries == 1 ? GUNYAH_RM_RPC_MEM_LEND :
+ GUNYAH_RM_RPC_MEM_SHARE;
+ size_t msg_size, initial_mem_entries = p->n_mem_entries, resp_size;
+ struct gunyah_rm_mem_share_req_acl_section *acl;
+ struct gunyah_rm_mem_share_req_mem_section *mem;
+ struct gunyah_rm_mem_share_req_header *req_header;
+ size_t acl_size, mem_size;
+ u32 *attr_section;
+ bool need_append = false;
+ __le32 *resp;
+ void *msg;
+ int ret;
+
+ if (!p->acl_entries || !p->n_acl_entries || !p->mem_entries ||
+ !p->n_mem_entries || p->n_acl_entries > U8_MAX ||
+ p->mem_handle != GUNYAH_MEM_HANDLE_INVAL)
+ return -EINVAL;
+
+ if (initial_mem_entries > GUNYAH_RM_MAX_MEM_ENTRIES) {
+ initial_mem_entries = GUNYAH_RM_MAX_MEM_ENTRIES;
+ need_append = true;
+ }
+
+ acl_size = struct_size(acl, entries, p->n_acl_entries);
+ mem_size = struct_size(mem, entries, initial_mem_entries);
+
+ /* The message format is:
+ * request header
+ * ACL entries (which VMs get what kind of access to this memory parcel)
+ * Memory entries (list of memory regions to share)
+ * Memory attributes (currently unused, we'll hard-code the size to 0)
+ */
+ msg_size = sizeof(struct gunyah_rm_mem_share_req_header) + acl_size +
+ mem_size +
+ sizeof(u32); /* for memory attributes, currently unused */
+
+ msg = kzalloc(msg_size, GFP_KERNEL);
+ if (!msg)
+ return -ENOMEM;
+
+ req_header = msg;
+ acl = (void *)req_header + sizeof(*req_header);
+ mem = (void *)acl + acl_size;
+ attr_section = (void *)mem + mem_size;
+
+ req_header->mem_type = p->mem_type;
+ if (need_append)
+ req_header->flags |= GUNYAH_MEM_SHARE_REQ_FLAGS_APPEND;
+ req_header->label = cpu_to_le32(p->label);
+
+ acl->n_entries = cpu_to_le32(p->n_acl_entries);
+ memcpy(acl->entries, p->acl_entries,
+ flex_array_size(acl, entries, p->n_acl_entries));
+
+ mem->n_entries = cpu_to_le16(initial_mem_entries);
+ memcpy(mem->entries, p->mem_entries,
+ flex_array_size(mem, entries, initial_mem_entries));
+
+ /* Set n_entries for memory attribute section to 0 */
+ *attr_section = 0;
+
+ ret = gunyah_rm_call(rm, message_id, msg, msg_size, (void **)&resp,
+ &resp_size);
+ kfree(msg);
+
+ if (ret)
+ return ret;
+
+ p->mem_handle = le32_to_cpu(*resp);
+ kfree(resp);
+
+ if (need_append) {
+ ret = gunyah_rm_mem_append(
+ rm, p->mem_handle, &p->mem_entries[initial_mem_entries],
+ p->n_mem_entries - initial_mem_entries);
+ if (ret) {
+ gunyah_rm_mem_reclaim(rm, p);
+ p->mem_handle = GUNYAH_MEM_HANDLE_INVAL;
+ }
+ }
+
+ return ret;
+}
+
+/**
+ * gunyah_rm_mem_reclaim() - Reclaim a memory parcel
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Information about the memory to be reclaimed.
+ *
+ * RM maps the associated memory back into the stage-2 page tables of the owner VM.
+ */
+int gunyah_rm_mem_reclaim(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *parcel)
+{
+ struct gunyah_rm_mem_release_req req = {
+ .mem_handle = cpu_to_le32(parcel->mem_handle),
+ };
+
+ return gunyah_rm_call(rm, GUNYAH_RM_RPC_MEM_RECLAIM, &req, sizeof(req),
+ NULL, NULL);
+}
+
/**
* gunyah_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
* @rm: Handle to a Gunyah resource manager
@@ -236,3 +438,32 @@ int gunyah_rm_get_hyp_resources(struct gunyah_rm *rm, u16 vmid,
*resources = resp;
return 0;
}
+
+/**
+ * gunyah_rm_get_vmid() - Retrieve VMID of this virtual machine
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: Filled with the VMID of this VM
+ */
+int gunyah_rm_get_vmid(struct gunyah_rm *rm, u16 *vmid)
+{
+ static u16 cached_vmid = GUNYAH_VMID_INVAL;
+ size_t resp_size;
+ __le32 *resp;
+ int ret;
+
+ if (cached_vmid != GUNYAH_VMID_INVAL) {
+ *vmid = cached_vmid;
+ return 0;
+ }
+
+ ret = gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_GET_VMID, NULL, 0,
+ (void **)&resp, &resp_size);
+ if (ret)
+ return ret;
+
+ *vmid = cached_vmid = lower_16_bits(le32_to_cpu(*resp));
+ kfree(resp);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_get_vmid);
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index a517c5c33a75..9065f5758c39 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -156,6 +156,49 @@ int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
void gunyah_vm_remove_resource_ticket(struct gunyah_vm *ghvm,
struct gunyah_vm_resource_ticket *ticket);
+#define GUNYAH_RM_ACL_X BIT(0)
+#define GUNYAH_RM_ACL_W BIT(1)
+#define GUNYAH_RM_ACL_R BIT(2)
+
+struct gunyah_rm_mem_acl_entry {
+ __le16 vmid;
+ u8 perms;
+ u8 reserved;
+} __packed;
+
+struct gunyah_rm_mem_entry {
+ __le64 phys_addr;
+ __le64 size;
+} __packed;
+
+enum gunyah_rm_mem_type {
+ GUNYAH_RM_MEM_TYPE_NORMAL = 0,
+ GUNYAH_RM_MEM_TYPE_IO = 1,
+};
+
+/**
+ * struct gunyah_rm_mem_parcel - Info about memory to be lent/shared/donated/reclaimed
+ * @mem_type: The type of memory: normal (DDR) or IO
+ * @label: A client-specified identifier which other VMs can use to identify
+ * the purpose of the memory parcel.
+ * @n_acl_entries: Count of the number of entries in the @acl_entries array.
+ * @acl_entries: An array of access control entries. Each entry specifies a VM and what access
+ * is allowed for the memory parcel.
+ * @n_mem_entries: Count of the number of entries in the @mem_entries array.
+ * @mem_entries: An array of regions to be associated with the memory parcel. Addresses should be
+ * (intermediate) physical addresses from Linux's perspective.
+ * @mem_handle: On success, filled with memory handle that RM allocates for this memory parcel
+ */
+struct gunyah_rm_mem_parcel {
+ enum gunyah_rm_mem_type mem_type;
+ u32 label;
+ size_t n_acl_entries;
+ struct gunyah_rm_mem_acl_entry *acl_entries;
+ size_t n_mem_entries;
+ struct gunyah_rm_mem_entry *mem_entries;
+ u32 mem_handle;
+};
+
/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
#define GUNYAH_CAPID_INVAL U64_MAX
--
2.34.1
Add Gunyah Resource Manager RPCs to enable demand paging for a virtual
machine. The resource manager needs to be informed of the private memory
regions which will be demand paged, and of the location where the DTB
memory parcel should live in the guest's address space.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/rsc_mgr.h | 12 +++++++
drivers/virt/gunyah/rsc_mgr_rpc.c | 71 +++++++++++++++++++++++++++++++++++++++
2 files changed, 83 insertions(+)
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 68d08d3cff02..99c2db18579c 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -108,6 +108,18 @@ int gunyah_rm_get_hyp_resources(struct gunyah_rm *rm, u16 vmid,
int gunyah_rm_get_vmid(struct gunyah_rm *rm, u16 *vmid);
+int gunyah_rm_vm_set_demand_paging(struct gunyah_rm *rm, u16 vmid, u32 count,
+ struct gunyah_rm_mem_entry *mem_entries);
+
+enum gunyah_rm_range_id {
+ GUNYAH_RM_RANGE_ID_IMAGE = 0,
+ GUNYAH_RM_RANGE_ID_FIRMWARE = 1,
+};
+
+int gunyah_rm_vm_set_address_layout(struct gunyah_rm *rm, u16 vmid,
+ enum gunyah_rm_range_id range_id,
+ u64 base_address, u64 size);
+
struct gunyah_resource *
gunyah_rm_alloc_resource(struct gunyah_rm *rm,
struct gunyah_rm_hyp_resource *hyp_resource);
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index 0d78613827b5..f4e396fd0d47 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -22,6 +22,8 @@
#define GUNYAH_RM_RPC_VM_INIT 0x5600000B
#define GUNYAH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
#define GUNYAH_RM_RPC_VM_GET_VMID 0x56000024
+#define GUNYAH_RM_RPC_VM_SET_DEMAND_PAGING 0x56000033
+#define GUNYAH_RM_RPC_VM_SET_ADDRESS_LAYOUT 0x56000034
/* clang-format on */
struct gunyah_rm_vm_common_vmid_req {
@@ -100,6 +102,23 @@ struct gunyah_rm_vm_config_image_req {
__le64 dtb_size;
} __packed;
+/* Call: VM_SET_DEMAND_PAGING */
+struct gunyah_rm_vm_set_demand_paging_req {
+ __le16 vmid;
+ __le16 _padding;
+ __le32 range_count;
+ DECLARE_FLEX_ARRAY(struct gunyah_rm_mem_entry, ranges);
+} __packed;
+
+/* Call: VM_SET_ADDRESS_LAYOUT */
+struct gunyah_rm_vm_set_address_layout_req {
+ __le16 vmid;
+ __le16 _padding;
+ __le32 range_id;
+ __le64 range_base;
+ __le64 range_size;
+} __packed;
+
/*
* Several RM calls take only a VMID as a parameter and give only standard
* response back. Deduplicate boilerplate code by using this common call.
@@ -481,3 +500,55 @@ int gunyah_rm_get_vmid(struct gunyah_rm *rm, u16 *vmid)
return ret;
}
EXPORT_SYMBOL_GPL(gunyah_rm_get_vmid);
+
+/**
+ * gunyah_rm_vm_set_demand_paging() - Enable demand paging of memory regions
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VMID of the other VM
+ * @count: Number of demand paged memory regions
+ * @entries: Array of the regions
+ */
+int gunyah_rm_vm_set_demand_paging(struct gunyah_rm *rm, u16 vmid, u32 count,
+ struct gunyah_rm_mem_entry *entries)
+{
+ struct gunyah_rm_vm_set_demand_paging_req *req __free(kfree) = NULL;
+ size_t req_size;
+
+ req_size = struct_size(req, ranges, count);
+ if (req_size == SIZE_MAX)
+ return -EINVAL;
+
+ req = kzalloc(req_size, GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ req->vmid = cpu_to_le16(vmid);
+ req->range_count = cpu_to_le32(count);
+ memcpy(req->ranges, entries, sizeof(*entries) * count);
+
+ return gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_SET_DEMAND_PAGING, req,
+ req_size, NULL, NULL);
+}
+
+/**
+ * gunyah_rm_vm_set_address_layout() - Set the start address of images
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VMID of the other VM
+ * @range_id: Which image to set
+ * @base_address: Base address
+ * @size: Size
+ */
+int gunyah_rm_vm_set_address_layout(struct gunyah_rm *rm, u16 vmid,
+ enum gunyah_rm_range_id range_id,
+ u64 base_address, u64 size)
+{
+ struct gunyah_rm_vm_set_address_layout_req req = {
+ .vmid = cpu_to_le16(vmid),
+ .range_id = cpu_to_le32(range_id),
+ .range_base = cpu_to_le64(base_address),
+ .range_size = cpu_to_le64(size),
+ };
+
+ return gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_SET_ADDRESS_LAYOUT, &req,
+ sizeof(req), NULL, NULL);
+}
--
2.34.1
The initial context of the primary vCPU can be set by performing RM RPC
calls.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/rsc_mgr.h | 2 ++
drivers/virt/gunyah/rsc_mgr_rpc.c | 32 ++++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 99c2db18579c..2acaf8dff365 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -84,6 +84,8 @@ int gunyah_rm_vm_configure(struct gunyah_rm *rm, u16 vmid,
u32 mem_handle, u64 image_offset, u64 image_size,
u64 dtb_offset, u64 dtb_size);
int gunyah_rm_vm_init(struct gunyah_rm *rm, u16 vmid);
+int gunyah_rm_vm_set_boot_context(struct gunyah_rm *rm, u16 vmid, u8 reg_set,
+ u8 reg_index, u64 value);
struct gunyah_rm_hyp_resource {
u8 type;
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index f4e396fd0d47..bbdae0b05cd4 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -20,6 +20,7 @@
#define GUNYAH_RM_RPC_VM_RESET 0x56000006
#define GUNYAH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
#define GUNYAH_RM_RPC_VM_INIT 0x5600000B
+#define GUNYAH_RM_RPC_VM_SET_BOOT_CONTEXT 0x5600000C
#define GUNYAH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
#define GUNYAH_RM_RPC_VM_GET_VMID 0x56000024
#define GUNYAH_RM_RPC_VM_SET_DEMAND_PAGING 0x56000033
@@ -102,6 +103,15 @@ struct gunyah_rm_vm_config_image_req {
__le64 dtb_size;
} __packed;
+/* Call: VM_SET_BOOT_CONTEXT */
+struct gunyah_rm_vm_set_boot_context_req {
+ __le16 vmid;
+ u8 reg_set;
+ u8 reg_index;
+ __le32 _padding;
+ __le64 value;
+} __packed;
+
/* Call: VM_SET_DEMAND_PAGING */
struct gunyah_rm_vm_set_demand_paging_req {
__le16 vmid;
@@ -435,6 +445,28 @@ int gunyah_rm_vm_init(struct gunyah_rm *rm, u16 vmid)
return gunyah_rm_common_vmid_call(rm, GUNYAH_RM_RPC_VM_INIT, vmid);
}
+/**
+ * gunyah_rm_vm_set_boot_context() - set the initial boot context of the primary vCPU
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier
+ * @reg_set: See &enum gunyah_vm_boot_context_reg
+ * @reg_index: Which register to set; must be 0 for REG_SET_PC
+ * @value: Value to set in the register
+ */
+int gunyah_rm_vm_set_boot_context(struct gunyah_rm *rm, u16 vmid, u8 reg_set,
+ u8 reg_index, u64 value)
+{
+ struct gunyah_rm_vm_set_boot_context_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ .reg_set = reg_set,
+ .reg_index = reg_index,
+ .value = cpu_to_le64(value),
+ };
+
+ return gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_SET_BOOT_CONTEXT,
+ &req_payload, sizeof(req_payload), NULL, NULL);
+}
+
/**
* gunyah_rm_get_hyp_resources() - Retrieve hypervisor resources (capabilities) associated with a VM
* @rm: Handle to a Gunyah resource manager
--
2.34.1
Gunyah allows vCPUs that are configured as proxy-scheduled to be scheduled by
another virtual machine (the host), which holds capabilities to those vCPUs
with suitable rights.
Gunyah also supports configuring regions of a proxy-scheduled VM's address
space to be virtualized by the host VM. This permits a host VMM to emulate MMIO
devices in the proxy-scheduled VM.
vCPUs are presented to the host as a Gunyah resource and represented to
userspace as a Gunyah VM function.
Creating the vcpu function on the VM will create a file descriptor that:
- can handle an ioctl to run the vCPU. When called, Gunyah will directly
context-switch to the selected vCPU and run it until one of the following
events occurs:
* the host vcpu's time slice ends
* the host vcpu receives an interrupt or would have been pre-empted
by the hypervisor
* a fault occurs in the proxy-scheduled vcpu
* a power management event, such as idle or cpu-off call in the vcpu
- can be mmap'd to share the gunyah_vcpu_run structure with userspace. This
allows the vcpu_run result codes to be accessed, and for arguments to
vcpu_run to be passed, e.g. for resuming the vcpu when handling certain fault
and exit cases.
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/gunyah_vcpu.c | 557 ++++++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.c | 5 +
drivers/virt/gunyah/vm_mgr.h | 2 +
include/uapi/linux/gunyah.h | 163 +++++++++++
5 files changed, 728 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 47f1fae5419b..3f82af8c5ce7 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,4 +2,4 @@
gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
-obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o
+obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
diff --git a/drivers/virt/gunyah/gunyah_vcpu.c b/drivers/virt/gunyah/gunyah_vcpu.c
new file mode 100644
index 000000000000..b636b54dc9a1
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_vcpu.c
@@ -0,0 +1,557 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/gunyah.h>
+#include <linux/interrupt.h>
+#include <linux/kref.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/wait.h>
+
+#include "vm_mgr.h"
+
+#include <uapi/linux/gunyah.h>
+
+#define MAX_VCPU_NAME 20 /* "gh-vcpu:" + 10 digits of U32_MAX + NUL */
+
+/**
+ * struct gunyah_vcpu - Track an instance of gunyah vCPU
+ * @f: Function instance (how we get associated with the main VM)
+ * @rsc: Pointer to the Gunyah vCPU resource, will be NULL until VM starts
+ * @run_lock: One userspace thread at a time should run the vCPU
+ * @ghvm: Pointer to the main VM struct; quicker look up than going through
+ * @f->ghvm
+ * @vcpu_run: Pointer to page shared with userspace to communicate vCPU state
+ * @state: Our copy of the state of the vCPU, since userspace could trick
+ * kernel to behave incorrectly if we relied on @vcpu_run
+ * @mmio_read_len: Our copy of @vcpu_run->mmio.len; see also @state
+ * @mmio_addr: Our copy of @vcpu_run->mmio.phys_addr; see also @state
+ * @ready: if vCPU goes to sleep, hypervisor reports to us that it's sleeping
+ * and will signal interrupt (from @rsc) when it's time to wake up.
+ * This completion signals that we can run vCPU again.
+ * @nb: When VM exits, the status of VM is reported via @vcpu_run->status.
+ * We need to track overall VM status, and the nb gives us the updates from
+ * Resource Manager.
+ * @ticket: resource ticket to claim vCPU# for the VM
+ * @kref: Reference counter
+ */
+struct gunyah_vcpu {
+ struct gunyah_vm_function_instance *f;
+ struct gunyah_resource *rsc;
+ struct mutex run_lock;
+ struct gunyah_vm *ghvm;
+
+ struct gunyah_vcpu_run *vcpu_run;
+
+ /*
+ * Track why the vcpu_run hypercall returned. This mirrors the vcpu_run
+ * structure shared with userspace, but is kept internally so the kernel
+ * does not have to trust userspace not to modify the shared structure.
+ */
+ enum {
+ GUNYAH_VCPU_RUN_STATE_UNKNOWN = 0,
+ GUNYAH_VCPU_RUN_STATE_READY,
+ GUNYAH_VCPU_RUN_STATE_MMIO_READ,
+ GUNYAH_VCPU_RUN_STATE_MMIO_WRITE,
+ GUNYAH_VCPU_RUN_STATE_SYSTEM_DOWN,
+ } state;
+ u8 mmio_read_len;
+ u64 mmio_addr;
+
+ struct completion ready;
+
+ struct notifier_block nb;
+ struct gunyah_vm_resource_ticket ticket;
+ struct kref kref;
+};
+
+static void vcpu_release(struct kref *kref)
+{
+ struct gunyah_vcpu *vcpu = container_of(kref, struct gunyah_vcpu, kref);
+
+ free_page((unsigned long)vcpu->vcpu_run);
+ kfree(vcpu);
+}
+
+/*
+ * When hypervisor allows us to schedule vCPU again, it gives us an interrupt
+ */
+static irqreturn_t gunyah_vcpu_irq_handler(int irq, void *data)
+{
+ struct gunyah_vcpu *vcpu = data;
+
+ complete(&vcpu->ready);
+ return IRQ_HANDLED;
+}
+
+static void gunyah_handle_page_fault(
+ struct gunyah_vcpu *vcpu,
+ const struct gunyah_hypercall_vcpu_run_resp *vcpu_run_resp)
+{
+ u64 addr = vcpu_run_resp->state_data[0];
+
+ vcpu->vcpu_run->page_fault.resume_action = GUNYAH_VCPU_RESUME_FAULT;
+ vcpu->vcpu_run->page_fault.attempt = 0;
+ vcpu->vcpu_run->page_fault.phys_addr = addr;
+ vcpu->vcpu_run->exit_reason = GUNYAH_VCPU_EXIT_PAGE_FAULT;
+}
+
+static void
+gunyah_handle_mmio(struct gunyah_vcpu *vcpu,
+ const struct gunyah_hypercall_vcpu_run_resp *vcpu_run_resp)
+{
+ u64 addr = vcpu_run_resp->state_data[0],
+ len = vcpu_run_resp->state_data[1],
+ data = vcpu_run_resp->state_data[2];
+
+ if (WARN_ON(len > sizeof(u64)))
+ len = sizeof(u64);
+
+ if (vcpu_run_resp->state == GUNYAH_VCPU_ADDRSPACE_VMMIO_READ) {
+ vcpu->vcpu_run->mmio.is_write = 0;
+ /*
+ * Record that we need to hand the user-supplied value to the
+ * vCPU on the next gunyah_vcpu_run().
+ */
+ vcpu->state = GUNYAH_VCPU_RUN_STATE_MMIO_READ;
+ vcpu->mmio_read_len = len;
+ } else { /* GUNYAH_VCPU_ADDRSPACE_VMMIO_WRITE */
+ vcpu->vcpu_run->mmio.is_write = 1;
+ memcpy(vcpu->vcpu_run->mmio.data, &data, len);
+ vcpu->state = GUNYAH_VCPU_RUN_STATE_MMIO_WRITE;
+ }
+
+ vcpu->vcpu_run->mmio.resume_action = 0;
+ vcpu->mmio_addr = vcpu->vcpu_run->mmio.phys_addr = addr;
+ vcpu->vcpu_run->mmio.len = len;
+ vcpu->vcpu_run->exit_reason = GUNYAH_VCPU_EXIT_MMIO;
+}
+
+static int gunyah_handle_mmio_resume(struct gunyah_vcpu *vcpu,
+ unsigned long resume_data[3])
+{
+ switch (vcpu->vcpu_run->mmio.resume_action) {
+ case GUNYAH_VCPU_RESUME_HANDLED:
+ if (vcpu->state == GUNYAH_VCPU_RUN_STATE_MMIO_READ) {
+ if (unlikely(vcpu->mmio_read_len >
+ sizeof(resume_data[0])))
+ vcpu->mmio_read_len = sizeof(resume_data[0]);
+ memcpy(&resume_data[0], vcpu->vcpu_run->mmio.data,
+ vcpu->mmio_read_len);
+ }
+ resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_EMULATE;
+ break;
+ case GUNYAH_VCPU_RESUME_FAULT:
+ resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_FAULT;
+ break;
+ case GUNYAH_VCPU_RESUME_RETRY:
+ resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_RETRY;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int gunyah_vcpu_rm_notification(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct gunyah_vcpu *vcpu = container_of(nb, struct gunyah_vcpu, nb);
+ struct gunyah_rm_vm_exited_payload *exit_payload = data;
+
+ /* Wake up userspace waiting for the vCPU to be runnable again */
+ if (action == GUNYAH_RM_NOTIFICATION_VM_EXITED &&
+ le16_to_cpu(exit_payload->vmid) == vcpu->ghvm->vmid)
+ complete(&vcpu->ready);
+
+ return NOTIFY_OK;
+}
+
+static inline enum gunyah_vm_status
+remap_vm_status(enum gunyah_rm_vm_status rm_status)
+{
+ switch (rm_status) {
+ case GUNYAH_RM_VM_STATUS_INIT_FAILED:
+ return GUNYAH_VM_STATUS_LOAD_FAILED;
+ case GUNYAH_RM_VM_STATUS_EXITED:
+ return GUNYAH_VM_STATUS_EXITED;
+ default:
+ return GUNYAH_VM_STATUS_CRASHED;
+ }
+}
+
+/**
+ * gunyah_vcpu_check_system() - Check whether VM as a whole is running
+ * @vcpu: Pointer to gunyah_vcpu
+ *
+ * Return: true if the VM is alive; false if the VM is not alive (which
+ * can only happen when the VM is shutting down).
+ */
+static bool gunyah_vcpu_check_system(struct gunyah_vcpu *vcpu)
+ __must_hold(&vcpu->run_lock)
+{
+ bool ret = true;
+
+ down_read(&vcpu->ghvm->status_lock);
+ if (likely(vcpu->ghvm->vm_status == GUNYAH_RM_VM_STATUS_RUNNING))
+ goto out;
+
+ vcpu->vcpu_run->status.status = remap_vm_status(vcpu->ghvm->vm_status);
+ vcpu->vcpu_run->status.exit_info = vcpu->ghvm->exit_info;
+ vcpu->vcpu_run->exit_reason = GUNYAH_VCPU_EXIT_STATUS;
+ vcpu->state = GUNYAH_VCPU_RUN_STATE_SYSTEM_DOWN;
+ ret = false;
+out:
+ up_read(&vcpu->ghvm->status_lock);
+ return ret;
+}
+
+/**
+ * gunyah_vcpu_run() - Request Gunyah to begin scheduling this vCPU.
+ * @vcpu: The client descriptor that was obtained via gunyah_vcpu_alloc()
+ */
+static int gunyah_vcpu_run(struct gunyah_vcpu *vcpu)
+{
+ struct gunyah_hypercall_vcpu_run_resp vcpu_run_resp;
+ unsigned long resume_data[3] = { 0 };
+ enum gunyah_error gunyah_error;
+ int ret = 0;
+
+ if (!vcpu->f)
+ return -ENODEV;
+
+ if (mutex_lock_interruptible(&vcpu->run_lock))
+ return -ERESTARTSYS;
+
+ if (!vcpu->rsc) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ switch (vcpu->state) {
+ case GUNYAH_VCPU_RUN_STATE_UNKNOWN:
+ if (vcpu->ghvm->vm_status != GUNYAH_RM_VM_STATUS_RUNNING) {
+ /*
+ * Check whether the VM is up. If the VM is still
+ * starting, this blocks until the VM is fully up,
+ * since the starting thread holds the write side
+ * of status_lock.
+ */
+ if (!gunyah_vcpu_check_system(vcpu))
+ goto out;
+ }
+ vcpu->state = GUNYAH_VCPU_RUN_STATE_READY;
+ break;
+ case GUNYAH_VCPU_RUN_STATE_MMIO_READ:
+ case GUNYAH_VCPU_RUN_STATE_MMIO_WRITE:
+ ret = gunyah_handle_mmio_resume(vcpu, resume_data);
+ if (ret)
+ goto out;
+ vcpu->state = GUNYAH_VCPU_RUN_STATE_READY;
+ break;
+ case GUNYAH_VCPU_RUN_STATE_SYSTEM_DOWN:
+ goto out;
+ default:
+ break;
+ }
+
+ while (!ret && !signal_pending(current)) {
+ if (vcpu->vcpu_run->immediate_exit) {
+ ret = -EINTR;
+ goto out;
+ }
+
+ gunyah_error = gunyah_hypercall_vcpu_run(
+ vcpu->rsc->capid, resume_data, &vcpu_run_resp);
+ if (gunyah_error == GUNYAH_ERROR_OK) {
+ memset(resume_data, 0, sizeof(resume_data));
+ switch (vcpu_run_resp.state) {
+ case GUNYAH_VCPU_STATE_READY:
+ if (need_resched())
+ schedule();
+ break;
+ case GUNYAH_VCPU_STATE_POWERED_OFF:
+ /*
+ * The vCPU might be off because the VM has shut
+ * down. If so, it will never run again.
+ */
+ if (!gunyah_vcpu_check_system(vcpu))
+ goto out;
+ /*
+ * Otherwise, another vCPU will turn it on (e.g.
+ * via PSCI) and the hypervisor sends an interrupt
+ * to wake Linux up.
+ */
+ fallthrough;
+ case GUNYAH_VCPU_STATE_EXPECTS_WAKEUP:
+ ret = wait_for_completion_interruptible(
+ &vcpu->ready);
+ /*
+ * Reinitialize the completion before the next
+ * hypercall. If we reinitialized after the
+ * hypercall, the interrupt might already have
+ * arrived before the completion was reset, and
+ * we would then wait for an event that has
+ * already happened.
+ */
+ reinit_completion(&vcpu->ready);
+ /*
+ * Check the VM status again; the completion
+ * may have been signalled because the VM exited.
+ */
+ if (!ret && !gunyah_vcpu_check_system(vcpu))
+ goto out;
+ break;
+ case GUNYAH_VCPU_STATE_BLOCKED:
+ schedule();
+ break;
+ case GUNYAH_VCPU_ADDRSPACE_VMMIO_READ:
+ case GUNYAH_VCPU_ADDRSPACE_VMMIO_WRITE:
+ gunyah_handle_mmio(vcpu, &vcpu_run_resp);
+ goto out;
+ case GUNYAH_VCPU_ADDRSPACE_PAGE_FAULT:
+ gunyah_handle_page_fault(vcpu, &vcpu_run_resp);
+ goto out;
+ default:
+ pr_warn_ratelimited(
+ "Unknown vCPU state: %llx\n",
+ vcpu_run_resp.sized_state);
+ schedule();
+ break;
+ }
+ } else if (gunyah_error == GUNYAH_ERROR_RETRY) {
+ schedule();
+ } else {
+ ret = gunyah_error_remap(gunyah_error);
+ }
+ }
+
+out:
+ mutex_unlock(&vcpu->run_lock);
+
+ if (signal_pending(current))
+ return -ERESTARTSYS;
+
+ return ret;
+}
+
+static long gunyah_vcpu_ioctl(struct file *filp, unsigned int cmd,
+ unsigned long arg)
+{
+ struct gunyah_vcpu *vcpu = filp->private_data;
+ long ret = -ENOTTY;
+
+ switch (cmd) {
+ case GUNYAH_VCPU_RUN:
+ ret = gunyah_vcpu_run(vcpu);
+ break;
+ case GUNYAH_VCPU_MMAP_SIZE:
+ ret = PAGE_SIZE;
+ break;
+ default:
+ break;
+ }
+ return ret;
+}
+
+static int gunyah_vcpu_release(struct inode *inode, struct file *filp)
+{
+ struct gunyah_vcpu *vcpu = filp->private_data;
+
+ gunyah_vm_put(vcpu->ghvm);
+ kref_put(&vcpu->kref, vcpu_release);
+ return 0;
+}
+
+static vm_fault_t gunyah_vcpu_fault(struct vm_fault *vmf)
+{
+ struct gunyah_vcpu *vcpu = vmf->vma->vm_file->private_data;
+ struct page *page;
+
+ if (vmf->pgoff)
+ return VM_FAULT_SIGBUS;
+
+ page = virt_to_page(vcpu->vcpu_run);
+ get_page(page);
+ vmf->page = page;
+ return 0;
+}
+
+static const struct vm_operations_struct gunyah_vcpu_ops = {
+ .fault = gunyah_vcpu_fault,
+};
+
+static int gunyah_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ vma->vm_ops = &gunyah_vcpu_ops;
+ return 0;
+}
+
+static const struct file_operations gunyah_vcpu_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = gunyah_vcpu_ioctl,
+ .release = gunyah_vcpu_release,
+ .llseek = noop_llseek,
+ .mmap = gunyah_vcpu_mmap,
+};
+
+static bool gunyah_vcpu_populate(struct gunyah_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc)
+{
+ struct gunyah_vcpu *vcpu =
+ container_of(ticket, struct gunyah_vcpu, ticket);
+ int ret;
+
+ mutex_lock(&vcpu->run_lock);
+ if (vcpu->rsc) {
+ pr_warn("vcpu%d already got a Gunyah resource. Check if multiple resources with same label were configured.\n",
+ vcpu->ticket.label);
+ ret = -EEXIST;
+ goto out;
+ }
+
+ vcpu->rsc = ghrsc;
+ init_completion(&vcpu->ready);
+
+ ret = request_irq(vcpu->rsc->irq, gunyah_vcpu_irq_handler,
+ IRQF_TRIGGER_RISING, "gunyah_vcpu", vcpu);
+ if (ret) {
+ pr_warn("Failed to request vcpu irq %d: %d\n", vcpu->rsc->irq,
+ ret);
+ vcpu->rsc = NULL;
+ goto out;
+ }
+
+ enable_irq_wake(vcpu->rsc->irq);
+
+out:
+ mutex_unlock(&vcpu->run_lock);
+ return !ret;
+}
+
+static void gunyah_vcpu_unpopulate(struct gunyah_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc)
+{
+ struct gunyah_vcpu *vcpu =
+ container_of(ticket, struct gunyah_vcpu, ticket);
+
+ vcpu->vcpu_run->immediate_exit = true;
+ complete_all(&vcpu->ready);
+ mutex_lock(&vcpu->run_lock);
+ free_irq(vcpu->rsc->irq, vcpu);
+ vcpu->rsc = NULL;
+ mutex_unlock(&vcpu->run_lock);
+}
+
+static long gunyah_vcpu_bind(struct gunyah_vm_function_instance *f)
+{
+ struct gunyah_fn_vcpu_arg *arg = f->argp;
+ struct gunyah_vcpu *vcpu;
+ char name[MAX_VCPU_NAME];
+ struct file *file;
+ struct page *page;
+ int fd;
+ long r;
+
+ if (f->arg_size != sizeof(*arg))
+ return -EINVAL;
+
+ vcpu = kzalloc(sizeof(*vcpu), GFP_KERNEL);
+ if (!vcpu)
+ return -ENOMEM;
+
+ vcpu->f = f;
+ f->data = vcpu;
+ mutex_init(&vcpu->run_lock);
+ kref_init(&vcpu->kref);
+
+ page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!page) {
+ r = -ENOMEM;
+ goto err_destroy_vcpu;
+ }
+ vcpu->vcpu_run = page_address(page);
+
+ vcpu->ticket.resource_type = GUNYAH_RESOURCE_TYPE_VCPU;
+ vcpu->ticket.label = arg->id;
+ vcpu->ticket.owner = THIS_MODULE;
+ vcpu->ticket.populate = gunyah_vcpu_populate;
+ vcpu->ticket.unpopulate = gunyah_vcpu_unpopulate;
+
+ r = gunyah_vm_add_resource_ticket(f->ghvm, &vcpu->ticket);
+ if (r)
+ goto err_destroy_page;
+
+ if (!gunyah_vm_get(f->ghvm)) {
+ r = -ENODEV;
+ goto err_remove_resource_ticket;
+ }
+ vcpu->ghvm = f->ghvm;
+
+ vcpu->nb.notifier_call = gunyah_vcpu_rm_notification;
+ /*
+ * Ensure we run after the vm_mgr handles the notification and does
+ * any necessary state changes.
+ */
+ vcpu->nb.priority = -1;
+ r = gunyah_rm_notifier_register(f->rm, &vcpu->nb);
+ if (r)
+ goto err_put_gunyah_vm;
+
+ kref_get(&vcpu->kref);
+
+ fd = get_unused_fd_flags(O_CLOEXEC);
+ if (fd < 0) {
+ r = fd;
+ goto err_notifier;
+ }
+
+ snprintf(name, sizeof(name), "gh-vcpu:%u", vcpu->ticket.label);
+ file = anon_inode_getfile(name, &gunyah_vcpu_fops, vcpu, O_RDWR);
+ if (IS_ERR(file)) {
+ r = PTR_ERR(file);
+ goto err_put_fd;
+ }
+
+ fd_install(fd, file);
+
+ return fd;
+err_put_fd:
+ put_unused_fd(fd);
+err_notifier:
+ gunyah_rm_notifier_unregister(f->rm, &vcpu->nb);
+err_put_gunyah_vm:
+ gunyah_vm_put(vcpu->ghvm);
+err_remove_resource_ticket:
+ gunyah_vm_remove_resource_ticket(f->ghvm, &vcpu->ticket);
+err_destroy_page:
+ free_page((unsigned long)vcpu->vcpu_run);
+err_destroy_vcpu:
+ kfree(vcpu);
+ return r;
+}
+
+static void gunyah_vcpu_unbind(struct gunyah_vm_function_instance *f)
+{
+ struct gunyah_vcpu *vcpu = f->data;
+
+ gunyah_rm_notifier_unregister(f->rm, &vcpu->nb);
+ gunyah_vm_remove_resource_ticket(vcpu->ghvm, &vcpu->ticket);
+ vcpu->f = NULL;
+
+ kref_put(&vcpu->kref, vcpu_release);
+}
+
+static bool gunyah_vcpu_compare(const struct gunyah_vm_function_instance *f,
+ const void *arg, size_t size)
+{
+ const struct gunyah_fn_vcpu_arg *instance = f->argp, *other = arg;
+
+ if (sizeof(*other) != size)
+ return false;
+
+ return instance->id == other->id;
+}
+
+DECLARE_GUNYAH_VM_FUNCTION_INIT(vcpu, GUNYAH_FN_VCPU, 1, gunyah_vcpu_bind,
+ gunyah_vcpu_unbind, gunyah_vcpu_compare);
+MODULE_DESCRIPTION("Gunyah vCPU Function");
+MODULE_LICENSE("GPL");
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 5d4f413f7a76..db3d1d18ccb8 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -302,6 +302,11 @@ static int gunyah_vm_rm_notification_exited(struct gunyah_vm *ghvm, void *data)
down_write(&ghvm->status_lock);
ghvm->vm_status = GUNYAH_RM_VM_STATUS_EXITED;
+ ghvm->exit_info.type = le16_to_cpu(payload->exit_type);
+ ghvm->exit_info.reason_size = le32_to_cpu(payload->exit_reason_size);
+ memcpy(&ghvm->exit_info.reason, payload->exit_reason,
+ min(GUNYAH_VM_MAX_EXIT_REASON_SIZE,
+ ghvm->exit_info.reason_size));
up_write(&ghvm->status_lock);
wake_up(&ghvm->vm_status_wait);
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 190a95ee8da6..8c5b94101b2c 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -28,6 +28,7 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* @vm_status: Current state of the VM, as last reported by RM
* @vm_status_wait: Wait queue for status @vm_status changes
* @status_lock: Serializing state transitions
+ * @exit_info: Breadcrumbs for why the VM is no longer running
* @kref: Reference counter for VM functions
* @fn_lock: Serialization addition of functions
* @functions: List of &struct gunyah_vm_function_instance that have been
@@ -48,6 +49,7 @@ struct gunyah_vm {
enum gunyah_rm_vm_status vm_status;
wait_queue_head_t vm_status_wait;
struct rw_semaphore status_lock;
+ struct gunyah_vm_exit_info exit_info;
struct kref kref;
struct mutex fn_lock;
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 1b7cb5fde70a..46f7d3aa61d0 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -25,8 +25,33 @@
*/
#define GUNYAH_VM_START _IO(GUNYAH_IOCTL_TYPE, 0x3)
+/**
+ * enum gunyah_fn_type - Valid types of Gunyah VM functions
+ * @GUNYAH_FN_VCPU: create a vCPU instance to control a vCPU
+ * &struct gunyah_fn_desc.arg is a pointer to &struct gunyah_fn_vcpu_arg
+ * Return: file descriptor to manipulate the vcpu.
+ */
+enum gunyah_fn_type {
+ GUNYAH_FN_VCPU = 1,
+};
+
#define GUNYAH_FN_MAX_ARG_SIZE 256
+/**
+ * struct gunyah_fn_vcpu_arg - Arguments to create a vCPU.
+ * @id: vcpu id
+ *
+ * Create this function with &GUNYAH_VM_ADD_FUNCTION using type &GUNYAH_FN_VCPU.
+ *
+ * The vcpu type will register with the VM Manager to expect to control
+ * vCPU number @id. It returns a file descriptor allowing interaction with
+ * the vCPU. See the Gunyah vCPU API description sections for interacting with
+ * the Gunyah vCPU file descriptors.
+ */
+struct gunyah_fn_vcpu_arg {
+ __u32 id;
+};
+
/**
* struct gunyah_fn_desc - Arguments to create a VM function
* @type: Type of the function. See &enum gunyah_fn_type.
@@ -43,4 +68,142 @@ struct gunyah_fn_desc {
#define GUNYAH_VM_ADD_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x4, struct gunyah_fn_desc)
#define GUNYAH_VM_REMOVE_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x7, struct gunyah_fn_desc)
+/*
+ * ioctls for vCPU fds
+ */
+
+/**
+ * enum gunyah_vm_status - Stores status reason why VM is not runnable (exited).
+ * @GUNYAH_VM_STATUS_LOAD_FAILED: VM didn't start because it couldn't be loaded.
+ * @GUNYAH_VM_STATUS_EXITED: VM requested shutdown/reboot.
+ * Use &struct gunyah_vm_exit_info.reason for further details.
+ * @GUNYAH_VM_STATUS_CRASHED: VM state is unknown and has crashed.
+ */
+enum gunyah_vm_status {
+ GUNYAH_VM_STATUS_LOAD_FAILED = 1,
+ GUNYAH_VM_STATUS_EXITED = 2,
+ GUNYAH_VM_STATUS_CRASHED = 3,
+};
+
+/*
+ * Gunyah presently sends max 4 bytes of exit_reason.
+ * If that changes, this macro can be safely increased without breaking
+ * userspace so long as struct gunyah_vcpu_run < PAGE_SIZE.
+ */
+#define GUNYAH_VM_MAX_EXIT_REASON_SIZE 8u
+
+/**
+ * struct gunyah_vm_exit_info - Reason for VM exit as reported by Gunyah
+ * See Gunyah documentation for values.
+ * @type: Describes how VM exited
+ * @padding: padding bytes
+ * @reason_size: Number of bytes valid for `reason`
+ * @reason: See Gunyah documentation for interpretation. Note: these values are
+ * not interpreted by Linux and need to be converted from little-endian
+ * as applicable.
+ */
+struct gunyah_vm_exit_info {
+ __u16 type;
+ __u16 padding;
+ __u32 reason_size;
+ __u8 reason[GUNYAH_VM_MAX_EXIT_REASON_SIZE];
+};
+
+/**
+ * enum gunyah_vcpu_exit - Stores reason why &GUNYAH_VCPU_RUN ioctl recently exited with status 0
+ * @GUNYAH_VCPU_EXIT_UNKNOWN: Not used, status != 0
+ * @GUNYAH_VCPU_EXIT_MMIO: vCPU performed a read or write that could not be handled
+ * by hypervisor or Linux. Use &struct gunyah_vcpu_run.mmio for
+ * details of the read/write.
+ * @GUNYAH_VCPU_EXIT_STATUS: vCPU not able to run because the VM has exited.
+ * Use &struct gunyah_vcpu_run.status for why VM has exited.
+ * @GUNYAH_VCPU_EXIT_PAGE_FAULT: vCPU tried to execute an instruction at an address
+ * for which memory hasn't been provided. Use
+ * &struct gunyah_vcpu_run.page_fault for details.
+ */
+enum gunyah_vcpu_exit {
+ GUNYAH_VCPU_EXIT_UNKNOWN,
+ GUNYAH_VCPU_EXIT_MMIO,
+ GUNYAH_VCPU_EXIT_STATUS,
+ GUNYAH_VCPU_EXIT_PAGE_FAULT,
+};
+
+/**
+ * enum gunyah_vcpu_resume_action - Provide resume action after an MMIO or page fault
+ * @GUNYAH_VCPU_RESUME_HANDLED: The mmio or page fault has been handled, continue
+ * normal operation of vCPU
+ * @GUNYAH_VCPU_RESUME_FAULT: The mmio or page fault could not be satisfied;
+ * inject the original fault back to the guest.
+ * @GUNYAH_VCPU_RESUME_RETRY: Retry the faulting instruction. Perhaps you added
+ * memory binding to satisfy the request.
+ */
+enum gunyah_vcpu_resume_action {
+ GUNYAH_VCPU_RESUME_HANDLED = 0,
+ GUNYAH_VCPU_RESUME_FAULT,
+ GUNYAH_VCPU_RESUME_RETRY,
+};
+
+/**
+ * struct gunyah_vcpu_run - Application code obtains a pointer to the gunyah_vcpu_run
+ * structure by mmap()ing a vcpu fd.
+ * @immediate_exit: polled when scheduling the vcpu. If set, immediately returns -EINTR.
+ * @padding: padding bytes
+ * @exit_reason: Set when GUNYAH_VCPU_RUN returns successfully and gives reason why
+ * GUNYAH_VCPU_RUN has stopped running the vCPU. See &enum gunyah_vcpu_exit.
+ * @mmio: Used when exit_reason == GUNYAH_VCPU_EXIT_MMIO
+ * The guest has faulted on a memory-mapped I/O access that
+ * couldn't be satisfied by Gunyah.
+ * @mmio.phys_addr: Address guest tried to access
+ * @mmio.data: the value that was written if `is_write == 1`. Filled by
+ * user for reads (`is_write == 0`).
+ * @mmio.len: Length of write. Only the first `len` bytes of `data`
+ * are considered by Gunyah.
+ * @mmio.is_write: 1 if VM tried to perform a write, 0 for a read
+ * @mmio.resume_action: See &enum gunyah_vcpu_resume_action
+ * @status: Used when exit_reason == GUNYAH_VCPU_EXIT_STATUS.
+ * The guest VM is no longer runnable. This struct informs why.
+ * @status.status: See &enum gunyah_vm_status for possible values
+ * @status.exit_info: Used when status == GUNYAH_VM_STATUS_EXITED
+ * @page_fault: Used when exit_reason == GUNYAH_VCPU_EXIT_PAGE_FAULT.
+ * The guest has faulted on a region that can only be provided
+ * by mapping memory at phys_addr.
+ * @page_fault.phys_addr: Address guest tried to access.
+ * @page_fault.attempt: Error code why Linux wasn't able to handle fault itself
+ * Typically, if no memory was mapped: -ENOENT,
+ * If permission bits weren't what the VM wanted: -EPERM
+ * @page_fault.resume_action: See &enum gunyah_vcpu_resume_action
+ */
+struct gunyah_vcpu_run {
+ /* in */
+ __u8 immediate_exit;
+ __u8 padding[7];
+
+ /* out */
+ __u32 exit_reason;
+
+ union {
+ struct {
+ __u64 phys_addr;
+ __u8 data[8];
+ __u32 len;
+ __u8 is_write;
+ __u8 resume_action;
+ } mmio;
+
+ struct {
+ enum gunyah_vm_status status;
+ struct gunyah_vm_exit_info exit_info;
+ } status;
+
+ struct {
+ __u64 phys_addr;
+ __s32 attempt;
+ __u8 resume_action;
+ } page_fault;
+ };
+};
+
+#define GUNYAH_VCPU_RUN _IO(GUNYAH_IOCTL_TYPE, 0x5)
+#define GUNYAH_VCPU_MMAP_SIZE _IO(GUNYAH_IOCTL_TYPE, 0x6)
+
#endif
--
2.34.1
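The run loop a VMM builds on this UAPI can be sketched in userspace. The snippet below is illustrative only and not part of the patch: it mirrors the layout of struct gunyah_vcpu_run with simplified stand-in names (vcpu_run, EXIT_MMIO, RESUME_HANDLED, the pretend device address 0x1000 are all assumptions) and shows how a VMM might dispatch an MMIO exit after GUNYAH_VCPU_RUN returns 0, with the ioctl/mmap plumbing omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the UAPI enums above (assumed names). */
enum vcpu_exit { EXIT_UNKNOWN, EXIT_MMIO, EXIT_STATUS, EXIT_PAGE_FAULT };
enum resume_action { RESUME_HANDLED, RESUME_FAULT, RESUME_RETRY };

/* Mirrors the layout of struct gunyah_vcpu_run (mmio arm only). */
struct vcpu_run {
	uint8_t immediate_exit;
	uint8_t padding[7];
	uint32_t exit_reason;
	union {
		struct {
			uint64_t phys_addr;
			uint8_t data[8];
			uint32_t len;
			uint8_t is_write;
			uint8_t resume_action;
		} mmio;
	};
};

/* Handle an MMIO write to a pretend device at 0x1000: consume the data
 * and mark the access handled; anything else gets the fault injected
 * back into the guest via RESUME_FAULT.
 */
static int handle_exit(struct vcpu_run *run)
{
	if (run->exit_reason != EXIT_MMIO)
		return -1;
	if (run->mmio.is_write && run->mmio.phys_addr == 0x1000) {
		run->mmio.resume_action = RESUME_HANDLED;
		return 0;
	}
	run->mmio.resume_action = RESUME_FAULT;
	return 0;
}
```

A real VMM would mmap() the vcpu fd (size from GUNYAH_VCPU_MMAP_SIZE), call GUNYAH_VCPU_RUN in a loop, and invoke a handler like this each time the ioctl returns 0.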
Gunyah virtual machines are created either with all memory provided at
VM creation, using the Resource Manager memory parcel construct, or
incrementally, by enabling VM demand paging.
The Gunyah demand paging support is provided directly by the hypervisor
and does not require the creation of resource manager memory parcels.
Demand paging allows the host to map/unmap contiguous pages (folios) to
a Gunyah memory extent object with the correct rights allowing its
contained pages to be mapped into the Guest VM's address space. Memory
extents are Gunyah's mechanism for handling system memory abstracting
from the direct use of physical page numbers. Memory extents are
hypervisor objects and are therefore referenced and access controlled
with capabilities.
When a virtual machine is configured for demand paging, three memory
extent capabilities and one address space capability are provided to the host. The
resource manager defined policy is such that memory in the "host-only"
extent (the default) is private to the host. Memory in the "guest-only"
extent can be used for guest private mappings and is unmapped from the
host. Memory in the "host-and-guest-shared" extent can be mapped
concurrently and shared between the host and guest VMs.
Implement two functions which Linux can use to move memory between the
virtual machines: gunyah_vm_provide_folio and gunyah_vm_reclaim_folio. Memory
that has been provided to the guest is tracked in a maple tree to be
reclaimed later. Folios provided to the virtual machine are assumed to
be owned by the Gunyah stack: the folio's ->private field is used for
bookkeeping about whether the page is mapped into the virtual machine.
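The maple-tree bookkeeping relies on the xa_tag_pointer() trick: folio pointers are at least 4-byte aligned, so the low two bits are free to carry the share/write state alongside the pointer in a single entry. The following is a userspace model of that encoding, not the kernel implementation (tag_pointer/untag_pointer are assumed stand-ins for the xarray helpers):

```c
#include <assert.h>
#include <stdint.h>

#define WRITE_TAG (1UL << 0)
#define SHARE_TAG (1UL << 1)
#define TAG_MASK  (WRITE_TAG | SHARE_TAG)

/* Pack tag bits into the low bits of an aligned pointer. */
static void *tag_pointer(void *p, unsigned long tag)
{
	return (void *)((uintptr_t)p | tag);
}

/* Recover just the tag bits from a stored entry. */
static unsigned long pointer_tag(void *entry)
{
	return (uintptr_t)entry & TAG_MASK;
}

/* Recover the original pointer from a stored entry. */
static void *untag_pointer(void *entry)
{
	return (void *)((uintptr_t)entry & ~(uintptr_t)TAG_MASK);
}
```

This is why reclaim can decide whether to use the shared or private extents purely from the entry it loads, without a side table.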
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/vm_mgr.c | 67 +++++++++
drivers/virt/gunyah/vm_mgr.h | 46 ++++++
drivers/virt/gunyah/vm_mgr_mem.c | 309 +++++++++++++++++++++++++++++++++++++++
4 files changed, 423 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 3f82af8c5ce7..f3c9507224ee 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o
obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index db3d1d18ccb8..26b6dce49970 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -17,6 +17,16 @@
#include "rsc_mgr.h"
#include "vm_mgr.h"
+#define GUNYAH_VM_ADDRSPACE_LABEL 0
+// "To" extent for memory private to guest
+#define GUNYAH_VM_MEM_EXTENT_GUEST_PRIVATE_LABEL 0
+// "From" extent for memory shared with guest
+#define GUNYAH_VM_MEM_EXTENT_HOST_SHARED_LABEL 1
+// "To" extent for memory shared with the guest
+#define GUNYAH_VM_MEM_EXTENT_GUEST_SHARED_LABEL 3
+// "From" extent for memory private to guest
+#define GUNYAH_VM_MEM_EXTENT_HOST_PRIVATE_LABEL 2
+
static DEFINE_XARRAY(gunyah_vm_functions);
static void gunyah_vm_put_function(struct gunyah_vm_function *fn)
@@ -175,6 +185,16 @@ void gunyah_vm_function_unregister(struct gunyah_vm_function *fn)
}
EXPORT_SYMBOL_GPL(gunyah_vm_function_unregister);
+static bool gunyah_vm_resource_ticket_populate_noop(
+ struct gunyah_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+ return true;
+}
+static void gunyah_vm_resource_ticket_unpopulate_noop(
+ struct gunyah_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+}
+
int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
struct gunyah_vm_resource_ticket *ticket)
{
@@ -342,6 +362,17 @@ static void gunyah_vm_stop(struct gunyah_vm *ghvm)
ghvm->vm_status != GUNYAH_RM_VM_STATUS_RUNNING);
}
+static inline void setup_extent_ticket(struct gunyah_vm *ghvm,
+ struct gunyah_vm_resource_ticket *ticket,
+ u32 label)
+{
+ ticket->resource_type = GUNYAH_RESOURCE_TYPE_MEM_EXTENT;
+ ticket->label = label;
+ ticket->populate = gunyah_vm_resource_ticket_populate_noop;
+ ticket->unpopulate = gunyah_vm_resource_ticket_unpopulate_noop;
+ gunyah_vm_add_resource_ticket(ghvm, ticket);
+}
+
static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
{
struct gunyah_vm *ghvm;
@@ -365,6 +396,25 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
INIT_LIST_HEAD(&ghvm->resources);
INIT_LIST_HEAD(&ghvm->resource_tickets);
+ mt_init(&ghvm->mm);
+
+ ghvm->addrspace_ticket.resource_type = GUNYAH_RESOURCE_TYPE_ADDR_SPACE;
+ ghvm->addrspace_ticket.label = GUNYAH_VM_ADDRSPACE_LABEL;
+ ghvm->addrspace_ticket.populate =
+ gunyah_vm_resource_ticket_populate_noop;
+ ghvm->addrspace_ticket.unpopulate =
+ gunyah_vm_resource_ticket_unpopulate_noop;
+ gunyah_vm_add_resource_ticket(ghvm, &ghvm->addrspace_ticket);
+
+ setup_extent_ticket(ghvm, &ghvm->host_private_extent_ticket,
+ GUNYAH_VM_MEM_EXTENT_HOST_PRIVATE_LABEL);
+ setup_extent_ticket(ghvm, &ghvm->host_shared_extent_ticket,
+ GUNYAH_VM_MEM_EXTENT_HOST_SHARED_LABEL);
+ setup_extent_ticket(ghvm, &ghvm->guest_private_extent_ticket,
+ GUNYAH_VM_MEM_EXTENT_GUEST_PRIVATE_LABEL);
+ setup_extent_ticket(ghvm, &ghvm->guest_shared_extent_ticket,
+ GUNYAH_VM_MEM_EXTENT_GUEST_SHARED_LABEL);
+
return ghvm;
}
@@ -528,6 +578,21 @@ static void _gunyah_vm_put(struct kref *kref)
gunyah_vm_stop(ghvm);
gunyah_vm_remove_functions(ghvm);
+
+ /*
+ * If this fails, we're going to lose the memory for good, which is
+ * WARN-worthy but not unrecoverable (we just lose memory). This
+ * call should always succeed, though, because the VM is not
+ * running and RM will let us reclaim all the memory.
+ */
+ WARN_ON(gunyah_vm_reclaim_range(ghvm, 0, U64_MAX));
+
+ gunyah_vm_remove_resource_ticket(ghvm, &ghvm->addrspace_ticket);
+ gunyah_vm_remove_resource_ticket(ghvm, &ghvm->host_shared_extent_ticket);
+ gunyah_vm_remove_resource_ticket(ghvm, &ghvm->host_private_extent_ticket);
+ gunyah_vm_remove_resource_ticket(ghvm, &ghvm->guest_shared_extent_ticket);
+ gunyah_vm_remove_resource_ticket(ghvm, &ghvm->guest_private_extent_ticket);
+
gunyah_vm_clean_resources(ghvm);
if (ghvm->vm_status != GUNYAH_RM_VM_STATUS_NO_STATE &&
@@ -541,6 +606,8 @@ static void _gunyah_vm_put(struct kref *kref)
ghvm->vm_status == GUNYAH_RM_VM_STATUS_RESET);
}
+ mtree_destroy(&ghvm->mm);
+
if (ghvm->vm_status > GUNYAH_RM_VM_STATUS_NO_STATE) {
gunyah_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 8c5b94101b2c..e500f6eb014e 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -8,6 +8,7 @@
#include <linux/device.h>
#include <linux/kref.h>
+#include <linux/maple_tree.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
#include <linux/wait.h>
@@ -16,12 +17,42 @@
#include "rsc_mgr.h"
+static inline u64 gunyah_gpa_to_gfn(u64 gpa)
+{
+ return gpa >> PAGE_SHIFT;
+}
+
+static inline u64 gunyah_gfn_to_gpa(u64 gfn)
+{
+ return gfn << PAGE_SHIFT;
+}
+
long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
unsigned long arg);
/**
* struct gunyah_vm - Main representation of a Gunyah Virtual machine
* @vmid: Gunyah's VMID for this virtual machine
+ * @mm: A maple tree of all memory that has been mapped to a VM.
+ * Indices are guest frame numbers; entries are either folios or
+ * RM mem parcels
+ * @addrspace_ticket: Resource ticket to the capability for guest VM's
+ * address space
+ * @host_private_extent_ticket: Resource ticket to the capability for our
+ * memory extent from which to lend private
+ * memory to the guest
+ * @host_shared_extent_ticket: Resource ticket to the capability for our
+ * memory extent from which to share memory
+ * with the guest. Distinction with
+ * @host_private_extent_ticket needed for
+ * current Qualcomm platforms; on non-Qualcomm
+ * platforms, this is the same capability ID
+ * @guest_private_extent_ticket: Resource ticket to the capability for
+ * the guest's memory extent to lend private
+ * memory to
+ * @guest_shared_extent_ticket: Resource ticket to the capability for
+ * the memory extent that represents
+ * memory shared with the guest.
* @rm: Pointer to the resource manager struct to make RM calls
* @parent: For logging
* @nb: Notifier block for RM notifications
@@ -43,6 +74,11 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
*/
struct gunyah_vm {
u16 vmid;
+ struct maple_tree mm;
+ struct gunyah_vm_resource_ticket addrspace_ticket,
+ host_private_extent_ticket, host_shared_extent_ticket,
+ guest_private_extent_ticket, guest_shared_extent_ticket;
+
struct gunyah_rm *rm;
struct notifier_block nb;
@@ -63,4 +99,14 @@ struct gunyah_vm {
};
+int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
+ struct gunyah_rm_mem_parcel *parcel, u64 gfn,
+ u64 nr);
+int gunyah_vm_reclaim_parcel(struct gunyah_vm *ghvm,
+ struct gunyah_rm_mem_parcel *parcel, u64 gfn);
+int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
+ u64 gfn, bool share, bool write);
+int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn);
+int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr);
+
#endif
diff --git a/drivers/virt/gunyah/vm_mgr_mem.c b/drivers/virt/gunyah/vm_mgr_mem.c
new file mode 100644
index 000000000000..d3fcb4514907
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr_mem.c
@@ -0,0 +1,309 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gunyah_vm_mgr: " fmt
+
+#include <asm/gunyah.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+
+#include "vm_mgr.h"
+
+#define WRITE_TAG (1 << 0)
+#define SHARE_TAG (1 << 1)
+
+static inline struct gunyah_resource *
+__first_resource(struct gunyah_vm_resource_ticket *ticket)
+{
+ return list_first_entry_or_null(&ticket->resources,
+ struct gunyah_resource, list);
+}
+
+int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
+ struct gunyah_rm_mem_parcel *parcel, u64 gfn,
+ u64 nr)
+{
+ struct gunyah_rm_mem_entry *entry;
+ unsigned long i, entry_size, tag = 0;
+ struct folio *folio;
+ pgoff_t off = 0;
+ int ret;
+
+ if (parcel->n_acl_entries > 1)
+ tag |= SHARE_TAG;
+ if (parcel->acl_entries[0].perms & GUNYAH_RM_ACL_W)
+ tag |= WRITE_TAG;
+
+ for (i = 0; i < parcel->n_mem_entries; i++) {
+ entry = &parcel->mem_entries[i];
+ entry_size = PHYS_PFN(le64_to_cpu(entry->size));
+
+ folio = pfn_folio(PHYS_PFN(le64_to_cpu(entry->phys_addr)));
+ ret = mtree_insert_range(&ghvm->mm, gfn + off,
+ gfn + off + folio_nr_pages(folio) - 1,
+ xa_tag_pointer(folio, tag), GFP_KERNEL);
+ if (ret == -ENOMEM)
+ return ret;
+ BUG_ON(ret);
+ off += folio_nr_pages(folio);
+ }
+
+ BUG_ON(off != nr);
+
+ return 0;
+}
+
+static inline u32 donate_flags(bool share)
+{
+ if (share)
+ return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+ GUNYAH_MEMEXTENT_DONATE_TO_SIBLING);
+ else
+ return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+ GUNYAH_MEMEXTENT_DONATE_TO_PROTECTED);
+}
+
+static inline u32 reclaim_flags(bool share)
+{
+ if (share)
+ return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+ GUNYAH_MEMEXTENT_DONATE_TO_SIBLING);
+ else
+ return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+ GUNYAH_MEMEXTENT_DONATE_FROM_PROTECTED);
+}
+
+int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
+ u64 gfn, bool share, bool write)
+{
+ struct gunyah_resource *guest_extent, *host_extent, *addrspace;
+ u32 map_flags = BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PARTIAL);
+ u64 extent_attrs, gpa = gunyah_gfn_to_gpa(gfn);
+ phys_addr_t pa = PFN_PHYS(folio_pfn(folio));
+ enum gunyah_pagetable_access access;
+ size_t size = folio_size(folio);
+ enum gunyah_error gunyah_error;
+ unsigned long tag = 0;
+ int ret;
+
+ /* clang-format off */
+ if (share) {
+ guest_extent = __first_resource(&ghvm->guest_shared_extent_ticket);
+ host_extent = __first_resource(&ghvm->host_shared_extent_ticket);
+ } else {
+ guest_extent = __first_resource(&ghvm->guest_private_extent_ticket);
+ host_extent = __first_resource(&ghvm->host_private_extent_ticket);
+ }
+ /* clang-format on */
+ addrspace = __first_resource(&ghvm->addrspace_ticket);
+
+ if (!addrspace || !guest_extent || !host_extent)
+ return -ENODEV;
+
+ if (share) {
+ map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_VMMIO);
+ tag |= SHARE_TAG;
+ } else {
+ map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PRIVATE);
+ }
+
+ if (write)
+ tag |= WRITE_TAG;
+
+ ret = mtree_insert_range(&ghvm->mm, gfn,
+ gfn + folio_nr_pages(folio) - 1,
+ xa_tag_pointer(folio, tag), GFP_KERNEL);
+ if (ret == -EEXIST)
+ return -EAGAIN;
+ if (ret)
+ return ret;
+
+ if (share && write)
+ access = GUNYAH_PAGETABLE_ACCESS_RW;
+ else if (share && !write)
+ access = GUNYAH_PAGETABLE_ACCESS_R;
+ else if (!share && write)
+ access = GUNYAH_PAGETABLE_ACCESS_RWX;
+ else /* !share && !write */
+ access = GUNYAH_PAGETABLE_ACCESS_RX;
+
+ gunyah_error = gunyah_hypercall_memextent_donate(donate_flags(share),
+ host_extent->capid,
+ guest_extent->capid,
+ pa, size);
+ if (gunyah_error != GUNYAH_ERROR_OK) {
+ pr_err("Failed to donate memory for guest address 0x%016llx: %d\n",
+ gpa, gunyah_error);
+ ret = gunyah_error_remap(gunyah_error);
+ goto remove;
+ }
+
+ extent_attrs =
+ FIELD_PREP_CONST(GUNYAH_MEMEXTENT_MAPPING_TYPE,
+ ARCH_GUNYAH_DEFAULT_MEMTYPE) |
+ FIELD_PREP(GUNYAH_MEMEXTENT_MAPPING_USER_ACCESS, access) |
+ FIELD_PREP(GUNYAH_MEMEXTENT_MAPPING_KERNEL_ACCESS, access);
+ gunyah_error = gunyah_hypercall_addrspace_map(addrspace->capid,
+ guest_extent->capid, gpa,
+ extent_attrs, map_flags,
+ pa, size);
+ if (gunyah_error != GUNYAH_ERROR_OK) {
+ pr_err("Failed to map guest address 0x%016llx: %d\n", gpa,
+ gunyah_error);
+ ret = gunyah_error_remap(gunyah_error);
+ goto memextent_reclaim;
+ }
+
+ folio_get(folio);
+ if (!share)
+ folio_set_private(folio);
+ return 0;
+memextent_reclaim:
+ gunyah_error = gunyah_hypercall_memextent_donate(reclaim_flags(share),
+ guest_extent->capid,
+ host_extent->capid, pa,
+ size);
+ if (gunyah_error != GUNYAH_ERROR_OK)
+ pr_err("Failed to reclaim memory donation for guest address 0x%016llx: %d\n",
+ gpa, gunyah_error);
+remove:
+ mtree_erase(&ghvm->mm, gfn);
+ return ret;
+}
+
+static int __gunyah_vm_reclaim_folio_locked(struct gunyah_vm *ghvm, void *entry,
+ u64 gfn, const bool sync)
+{
+ u32 map_flags = BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PARTIAL);
+ struct gunyah_resource *guest_extent, *host_extent, *addrspace;
+ enum gunyah_error gunyah_error;
+ struct folio *folio;
+ bool share;
+ phys_addr_t pa;
+ size_t size;
+
+ addrspace = __first_resource(&ghvm->addrspace_ticket);
+ if (!addrspace)
+ return -ENODEV;
+
+ share = !!(xa_pointer_tag(entry) & SHARE_TAG);
+ folio = xa_untag_pointer(entry);
+
+ if (!sync)
+ map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_NOSYNC);
+
+ /* clang-format off */
+ if (share) {
+ guest_extent = __first_resource(&ghvm->guest_shared_extent_ticket);
+ host_extent = __first_resource(&ghvm->host_shared_extent_ticket);
+ map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_VMMIO);
+ } else {
+ guest_extent = __first_resource(&ghvm->guest_private_extent_ticket);
+ host_extent = __first_resource(&ghvm->host_private_extent_ticket);
+ map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PRIVATE);
+ }
+ /* clang-format on */
+
+ pa = PFN_PHYS(folio_pfn(folio));
+ size = folio_size(folio);
+
+ gunyah_error = gunyah_hypercall_addrspace_unmap(addrspace->capid,
+ guest_extent->capid,
+ gunyah_gfn_to_gpa(gfn),
+ map_flags, pa, size);
+ if (gunyah_error != GUNYAH_ERROR_OK) {
+ pr_err_ratelimited(
+ "Failed to unmap guest address 0x%016llx: %d\n",
+ gunyah_gfn_to_gpa(gfn), gunyah_error);
+ return gunyah_error_remap(gunyah_error);
+ }
+
+ gunyah_error = gunyah_hypercall_memextent_donate(reclaim_flags(share),
+ guest_extent->capid,
+ host_extent->capid, pa,
+ size);
+ if (gunyah_error != GUNYAH_ERROR_OK) {
+ pr_err_ratelimited(
+ "Failed to reclaim memory donation for guest address 0x%016llx: %d\n",
+ gunyah_gfn_to_gpa(gfn), gunyah_error);
+ return gunyah_error_remap(gunyah_error);
+ }
+
+ BUG_ON(mtree_erase(&ghvm->mm, gfn) != entry);
+
+ folio_clear_private(folio);
+ folio_put(folio);
+ return 0;
+}
+
+int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn)
+{
+ void *entry;
+
+ entry = mtree_load(&ghvm->mm, gfn);
+ if (!entry)
+ return 0;
+
+ return __gunyah_vm_reclaim_folio_locked(ghvm, entry, gfn, true);
+}
+
+int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr)
+{
+ unsigned long next = gfn, g;
+ struct folio *folio;
+ int ret, ret2 = 0;
+ void *entry;
+ bool sync;
+
+ mt_for_each(&ghvm->mm, entry, next, gfn + nr) {
+ folio = xa_untag_pointer(entry);
+ g = next;
+ sync = !!mt_find_after(&ghvm->mm, &g, gfn + nr);
+
+ g = next - folio_nr_pages(folio);
+ folio_get(folio);
+ folio_lock(folio);
+ if (mtree_load(&ghvm->mm, g) == entry)
+ ret = __gunyah_vm_reclaim_folio_locked(ghvm, entry, g, sync);
+ else
+ ret = -EAGAIN;
+ folio_unlock(folio);
+ folio_put(folio);
+ if (ret && ret2 != -EAGAIN)
+ ret2 = ret;
+ }
+
+ return ret2;
+}
--
2.34.1
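The access-rights selection that gunyah_vm_provide_folio() performs (shared memory is mapped non-executable for the guest, private memory executable) is a small pure function worth seeing on its own. The sketch below is a userspace model with assumed names (pagetable_access, ACCESS_*) standing in for the GUNYAH_PAGETABLE_ACCESS_* values in the patch:

```c
#include <assert.h>

/* Assumed stand-ins for GUNYAH_PAGETABLE_ACCESS_R/RW/RX/RWX. */
enum access { ACCESS_R, ACCESS_RW, ACCESS_RX, ACCESS_RWX };

/* Shared memory is never guest-executable; private memory always is.
 * The write flag then selects between the read-only and writable
 * variants, mirroring the if/else ladder in the patch.
 */
static enum access pagetable_access(int share, int write)
{
	if (share)
		return write ? ACCESS_RW : ACCESS_R;
	return write ? ACCESS_RWX : ACCESS_RX;
}
```

Keeping executability off shared mappings limits what a compromised peer can do with memory both sides can touch, which is likely the design intent here.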
The Gunyah Resource Manager sets up a virtual machine based on a device tree
that lives in guest memory. The resource manager requires this memory to be
provided as a memory parcel for it to read and manipulate. Construct a
memory parcel, lend it to the virtual machine, and inform resource
manager about the device tree location (the memory parcel ID and offset
into the memory parcel).
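The offset arithmetic described here (and passed to gunyah_rm_vm_configure() in the patch) can be sketched in isolation: the parcel is lent at page/folio granularity, so the DTB's offset within the parcel is its guest physical address minus the parcel's page-aligned start. This is a standalone userspace sketch assuming 4K pages; dtb_offset_in_parcel is a hypothetical helper, not a function from the patch:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4K pages */

/* parcel_start_gfn is the guest frame number where the lent parcel
 * begins (the folio containing the DTB). The DTB may start mid-folio,
 * so its offset inside the parcel is the difference between its guest
 * physical address and the parcel's page-aligned base.
 */
static uint64_t dtb_offset_in_parcel(uint64_t guest_phys_addr,
				     uint64_t parcel_start_gfn)
{
	return guest_phys_addr - (parcel_start_gfn << PAGE_SHIFT);
}
```

This matches the expression `guest_phys_addr - (parcel_start << PAGE_SHIFT)` that the patch hands to the resource manager as the DTB offset.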
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 49 ++++++++++++++++++++++++++++++++++++++++++--
drivers/virt/gunyah/vm_mgr.h | 13 +++++++++++-
include/uapi/linux/gunyah.h | 14 +++++++++++++
3 files changed, 73 insertions(+), 3 deletions(-)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index be2061aa0a06..4379b5ba151e 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -445,8 +445,27 @@ static int gunyah_vm_start(struct gunyah_vm *ghvm)
ghvm->vmid = ret;
ghvm->vm_status = GUNYAH_RM_VM_STATUS_LOAD;
- ret = gunyah_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, 0, 0, 0,
- 0, 0);
+ ghvm->dtb.parcel_start = ghvm->dtb.config.guest_phys_addr >> PAGE_SHIFT;
+ ghvm->dtb.parcel_pages = ghvm->dtb.config.size >> PAGE_SHIFT;
+ /*
+ * RM requires the DTB parcel to be lent to guard against malicious
+ * modifications while starting the VM. Force it so.
+ */
+ ghvm->dtb.parcel.n_acl_entries = 1;
+ ret = gunyah_gmem_share_parcel(ghvm, &ghvm->dtb.parcel,
+ &ghvm->dtb.parcel_start,
+ &ghvm->dtb.parcel_pages);
+ if (ret) {
+ dev_warn(ghvm->parent,
+ "Failed to allocate parcel for DTB: %d\n", ret);
+ goto err;
+ }
+
+ ret = gunyah_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth,
+ ghvm->dtb.parcel.mem_handle, 0, 0,
+ ghvm->dtb.config.guest_phys_addr -
+ (ghvm->dtb.parcel_start
+ << PAGE_SHIFT),
+ ghvm->dtb.config.size);
if (ret) {
dev_warn(ghvm->parent, "Failed to configure VM: %d\n", ret);
goto err;
@@ -485,6 +504,8 @@ static int gunyah_vm_start(struct gunyah_vm *ghvm)
goto err;
}
+ WARN_ON(gunyah_vm_parcel_to_paged(ghvm, &ghvm->dtb.parcel,
+ ghvm->dtb.parcel_start,
+ ghvm->dtb.parcel_pages));
+
ghvm->vm_status = GUNYAH_RM_VM_STATUS_RUNNING;
up_write(&ghvm->status_lock);
return ret;
@@ -531,6 +552,21 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
long r;
switch (cmd) {
+ case GUNYAH_VM_SET_DTB_CONFIG: {
+ struct gunyah_vm_dtb_config dtb_config;
+
+ if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
+ return -EFAULT;
+
+ if (overflows_type(dtb_config.guest_phys_addr + dtb_config.size,
+ u64))
+ return -EOVERFLOW;
+
+ ghvm->dtb.config = dtb_config;
+
+ r = 0;
+ break;
+ }
case GUNYAH_VM_START: {
r = gunyah_vm_ensure_started(ghvm);
break;
@@ -589,6 +625,15 @@ static void _gunyah_vm_put(struct kref *kref)
if (ghvm->vm_status == GUNYAH_RM_VM_STATUS_RUNNING)
gunyah_vm_stop(ghvm);
+ if (ghvm->vm_status == GUNYAH_RM_VM_STATUS_LOAD) {
+ ret = gunyah_gmem_reclaim_parcel(ghvm, &ghvm->dtb.parcel,
+ ghvm->dtb.parcel_start,
+ ghvm->dtb.parcel_pages);
+ if (ret)
+ dev_err(ghvm->parent,
+ "Failed to reclaim DTB parcel: %d\n", ret);
+ }
+
gunyah_vm_remove_functions(ghvm);
down_write(&ghvm->bindings_lock);
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index a79c11f1c3a5..b2ab2f1bda3a 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -72,6 +72,13 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* @resource_tickets: List of &struct gunyah_vm_resource_ticket
* @auth: Authentication mechanism to be used by resource manager when
* launching the VM
+ * @dtb: For tracking dtb configuration when launching the VM
+ * @dtb.config: Location of the DTB in the guest memory
+ * @dtb.parcel_start: Guest frame number of the start of the memory parcel
+ *                    lent to the VM (the DTB may start mid-folio; the whole
+ *                    folio is lent, and parcel_start is the folio's start)
+ * @dtb.parcel_pages: Number of pages lent for the memory parcel
+ * @dtb.parcel: Data for resource manager to lend the parcel
*
* Members are grouped by hot path.
*/
@@ -101,7 +108,11 @@ struct gunyah_vm {
struct device *parent;
enum gunyah_rm_vm_auth_mechanism auth;
-
+ struct {
+ struct gunyah_vm_dtb_config config;
+ u64 parcel_start, parcel_pages;
+ struct gunyah_rm_mem_parcel parcel;
+ } dtb;
};
int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 1af4c5ae6bc3..a89d9bedf3e5 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -42,6 +42,20 @@ struct gunyah_create_mem_args {
/*
* ioctls for gunyah-vm fds (returned by GUNYAH_CREATE_VM)
*/
+
+/**
+ * struct gunyah_vm_dtb_config - Set the location of the VM's devicetree blob
+ * @guest_phys_addr: Address of the VM's devicetree in guest memory.
+ * @size: Maximum size of the devicetree including space for overlays.
+ *        The resource manager applies an overlay to the DTB, so @size should
+ *        include room for the overlay. A page of memory is typically plenty.
+ */
+struct gunyah_vm_dtb_config {
+ __u64 guest_phys_addr;
+ __u64 size;
+};
+#define GUNYAH_VM_SET_DTB_CONFIG _IOW(GUNYAH_IOCTL_TYPE, 0x2, struct gunyah_vm_dtb_config)
+
#define GUNYAH_VM_START _IO(GUNYAH_IOCTL_TYPE, 0x3)
/**
--
2.34.1
Export vma_interval_tree_iter_first and vma_interval_tree_iter_next for
vma_interval_tree_foreach.
Signed-off-by: Elliot Berman <[email protected]>
---
mm/interval_tree.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/interval_tree.c b/mm/interval_tree.c
index 32e390c42c53..faa50767496c 100644
--- a/mm/interval_tree.c
+++ b/mm/interval_tree.c
@@ -24,6 +24,9 @@ INTERVAL_TREE_DEFINE(struct vm_area_struct, shared.rb,
unsigned long, shared.rb_subtree_last,
vma_start_pgoff, vma_last_pgoff, /* empty */, vma_interval_tree)
+EXPORT_SYMBOL_GPL(vma_interval_tree_iter_first);
+EXPORT_SYMBOL_GPL(vma_interval_tree_iter_next);
+
/* Insert node immediately after prev in the interval tree */
void vma_interval_tree_insert_after(struct vm_area_struct *node,
struct vm_area_struct *prev,
--
2.34.1
Allow userspace to attach an ioeventfd to an mmio address within the
guest. Userspace provides a description of the type of write to
"subscribe" to and an eventfd to trigger when that type of write is
performed by the guest. This mechanism allows userspace to respond
asynchronously to a guest manipulating a virtualized device and is
similar to KVM's ioeventfd.
Reviewed-by: Alex Elder <[email protected]>
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Kconfig | 9 +++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_ioeventfd.c | 139 +++++++++++++++++++++++++++++++++
include/uapi/linux/gunyah.h | 37 +++++++++
4 files changed, 186 insertions(+)
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 1685b75fb77a..855d41a88b16 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -36,3 +36,12 @@ config GUNYAH_IRQFD
on Gunyah virtual machine.
Say Y/M here if unsure and you want to support Gunyah VMMs.
+
+config GUNYAH_IOEVENTFD
+ tristate "Gunyah ioeventfd interface"
+ depends on GUNYAH
+ help
+ Enable kernel support for creating ioeventfds which can alert userspace
+ when a Gunyah virtual machine accesses a memory address.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index b41b02792921..2aec5989402b 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
obj-$(CONFIG_GUNYAH_QCOM_PLATFORM) += gunyah_qcom.o
obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
+obj-$(CONFIG_GUNYAH_IOEVENTFD) += gunyah_ioeventfd.o
diff --git a/drivers/virt/gunyah/gunyah_ioeventfd.c b/drivers/virt/gunyah/gunyah_ioeventfd.c
new file mode 100644
index 000000000000..e33924d19be4
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_ioeventfd.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/device/driver.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/gunyah.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gunyah_ioeventfd {
+ struct gunyah_vm_function_instance *f;
+ struct gunyah_vm_io_handler io_handler;
+
+ struct eventfd_ctx *ctx;
+};
+
+static int gunyah_write_ioeventfd(struct gunyah_vm_io_handler *io_dev, u64 addr,
+ u32 len, u64 data)
+{
+ struct gunyah_ioeventfd *iofd =
+ container_of(io_dev, struct gunyah_ioeventfd, io_handler);
+
+ eventfd_signal(iofd->ctx);
+ return 0;
+}
+
+static struct gunyah_vm_io_handler_ops io_ops = {
+ .write = gunyah_write_ioeventfd,
+};
+
+static long gunyah_ioeventfd_bind(struct gunyah_vm_function_instance *f)
+{
+ const struct gunyah_fn_ioeventfd_arg *args = f->argp;
+ struct gunyah_ioeventfd *iofd;
+ struct eventfd_ctx *ctx;
+ int ret;
+
+ if (f->arg_size != sizeof(*args))
+ return -EINVAL;
+
+ /* All other flag bits are reserved for future use */
+ if (args->flags & ~GUNYAH_IOEVENTFD_FLAGS_DATAMATCH)
+ return -EINVAL;
+
+ /* must be natural-word sized, or 0 to ignore length */
+ switch (args->len) {
+ case 0:
+ case 1:
+ case 2:
+ case 4:
+ case 8:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* check for range overflow */
+ if (overflows_type(args->addr + args->len, u64))
+ return -EINVAL;
+
+ /* ioeventfd with no length can't be combined with DATAMATCH */
+ if (!args->len && (args->flags & GUNYAH_IOEVENTFD_FLAGS_DATAMATCH))
+ return -EINVAL;
+
+ ctx = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ iofd = kzalloc(sizeof(*iofd), GFP_KERNEL);
+ if (!iofd) {
+ ret = -ENOMEM;
+ goto err_eventfd;
+ }
+
+ f->data = iofd;
+ iofd->f = f;
+
+ iofd->ctx = ctx;
+
+ if (args->flags & GUNYAH_IOEVENTFD_FLAGS_DATAMATCH) {
+ iofd->io_handler.datamatch = true;
+ iofd->io_handler.len = args->len;
+ iofd->io_handler.data = args->datamatch;
+ }
+ iofd->io_handler.addr = args->addr;
+ iofd->io_handler.ops = &io_ops;
+
+ ret = gunyah_vm_add_io_handler(f->ghvm, &iofd->io_handler);
+ if (ret)
+ goto err_io_dev_add;
+
+ return 0;
+
+err_io_dev_add:
+ kfree(iofd);
+err_eventfd:
+ eventfd_ctx_put(ctx);
+ return ret;
+}
+
+static void gunyah_ioevent_unbind(struct gunyah_vm_function_instance *f)
+{
+ struct gunyah_ioeventfd *iofd = f->data;
+
+ eventfd_ctx_put(iofd->ctx);
+ gunyah_vm_remove_io_handler(iofd->f->ghvm, &iofd->io_handler);
+ kfree(iofd);
+}
+
+static bool gunyah_ioevent_compare(const struct gunyah_vm_function_instance *f,
+ const void *arg, size_t size)
+{
+ const struct gunyah_fn_ioeventfd_arg *instance = f->argp, *other = arg;
+
+ if (sizeof(*other) != size)
+ return false;
+
+ if (instance->addr != other->addr || instance->len != other->len ||
+ instance->flags != other->flags)
+ return false;
+
+ if ((instance->flags & GUNYAH_IOEVENTFD_FLAGS_DATAMATCH) &&
+ instance->datamatch != other->datamatch)
+ return false;
+
+ return true;
+}
+
+DECLARE_GUNYAH_VM_FUNCTION_INIT(ioeventfd, GUNYAH_FN_IOEVENTFD, 3,
+ gunyah_ioeventfd_bind, gunyah_ioevent_unbind,
+ gunyah_ioevent_compare);
+MODULE_DESCRIPTION("Gunyah ioeventfd VM Function");
+MODULE_LICENSE("GPL");
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index cb7b0bb9bef3..fd461e2fe8b5 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -65,10 +65,13 @@ struct gunyah_vm_dtb_config {
* Return: file descriptor to manipulate the vcpu.
* @GUNYAH_FN_IRQFD: register eventfd to assert a Gunyah doorbell
* &struct gunyah_fn_desc.arg is a pointer to &struct gunyah_fn_irqfd_arg
+ * @GUNYAH_FN_IOEVENTFD: register an ioeventfd to trigger on a guest write to the given address
+ * &struct gunyah_fn_desc.arg is a pointer to &struct gunyah_fn_ioeventfd_arg
*/
enum gunyah_fn_type {
GUNYAH_FN_VCPU = 1,
GUNYAH_FN_IRQFD,
+ GUNYAH_FN_IOEVENTFD,
};
#define GUNYAH_FN_MAX_ARG_SIZE 256
@@ -120,6 +123,40 @@ struct gunyah_fn_irqfd_arg {
__u32 padding;
};
+/**
+ * enum gunyah_ioeventfd_flags - flags for use in gunyah_fn_ioeventfd_arg
+ * @GUNYAH_IOEVENTFD_FLAGS_DATAMATCH: the event will be signaled only if the
+ * written value to the registered address is
+ * equal to &struct gunyah_fn_ioeventfd_arg.datamatch
+ */
+enum gunyah_ioeventfd_flags {
+ GUNYAH_IOEVENTFD_FLAGS_DATAMATCH = 1UL << 0,
+};
+
+/**
+ * struct gunyah_fn_ioeventfd_arg - Arguments to create an ioeventfd function
+ * @datamatch: data used when GUNYAH_IOEVENTFD_FLAGS_DATAMATCH is set
+ * @addr: Address in guest memory
+ * @len: Length of access
+ * @fd: When ioeventfd is matched, this eventfd is written
+ * @flags: See &enum gunyah_ioeventfd_flags
+ * @padding: padding bytes
+ *
+ * Create this function with &GUNYAH_VM_ADD_FUNCTION using type &GUNYAH_FN_IOEVENTFD.
+ *
+ * Attaches an ioeventfd to a legal mmio address within the guest. A guest write
+ * to the registered address will signal the provided event instead of triggering
+ * an exit on the GUNYAH_VCPU_RUN ioctl.
+ */
+struct gunyah_fn_ioeventfd_arg {
+ __u64 datamatch;
+ __u64 addr; /* legal mmio address */
+ __u32 len; /* 1, 2, 4, or 8 bytes; or 0 to ignore length */
+ __s32 fd;
+ __u32 flags;
+ __u32 padding;
+};
+
/**
* struct gunyah_fn_desc - Arguments to create a VM function
* @type: Type of the function. See &enum gunyah_fn_type.
--
2.34.1
Enable support for creating irqfds which can raise an interrupt on a
Gunyah virtual machine. irqfds are exposed to userspace as a Gunyah VM
function with the name "irqfd". If the VM devicetree is not configured
to create a doorbell with the corresponding label, userspace will still
be able to assert the eventfd but no interrupt will be raised on the
guest.
Acked-by: Alex Elder <[email protected]>
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Kconfig | 9 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_irqfd.c | 190 +++++++++++++++++++++++++++++++++++++
include/uapi/linux/gunyah.h | 35 +++++++
4 files changed, 235 insertions(+)
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index fe2823dc48ba..1685b75fb77a 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -27,3 +27,12 @@ config GUNYAH_QCOM_PLATFORM
extra platform-specific support.
Say Y/M here to use Gunyah on Qualcomm platforms.
+
+config GUNYAH_IRQFD
+ tristate "Gunyah irqfd interface"
+ depends on GUNYAH
+ help
+ Enable kernel support for creating irqfds which can raise an interrupt
+ on Gunyah virtual machine.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index c4505fce177d..b41b02792921 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -5,3 +5,4 @@ gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o guest_memfd.o
obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
obj-$(CONFIG_GUNYAH_QCOM_PLATFORM) += gunyah_qcom.o
+obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
diff --git a/drivers/virt/gunyah/gunyah_irqfd.c b/drivers/virt/gunyah/gunyah_irqfd.c
new file mode 100644
index 000000000000..030af9069639
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_irqfd.c
@@ -0,0 +1,190 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/device/driver.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/gunyah.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/printk.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gunyah_irqfd {
+ struct gunyah_resource *ghrsc;
+ struct gunyah_vm_resource_ticket ticket;
+ struct gunyah_vm_function_instance *f;
+
+ bool level;
+
+ struct eventfd_ctx *ctx;
+ wait_queue_entry_t wait;
+ poll_table pt;
+};
+
+static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync,
+ void *key)
+{
+ struct gunyah_irqfd *irqfd =
+ container_of(wait, struct gunyah_irqfd, wait);
+ __poll_t flags = key_to_poll(key);
+ int ret = 0;
+
+ if (flags & EPOLLIN) {
+ if (irqfd->ghrsc) {
+ ret = gunyah_hypercall_bell_send(irqfd->ghrsc->capid, 1,
+ NULL);
+ if (ret)
+ pr_err_ratelimited(
+ "Failed to inject interrupt %d: %d\n",
+ irqfd->ticket.label, ret);
+ } else
+ pr_err_ratelimited(
+ "Premature injection of interrupt\n");
+ }
+
+ return 0;
+}
+
+static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
+ poll_table *pt)
+{
+ struct gunyah_irqfd *irq_ctx =
+ container_of(pt, struct gunyah_irqfd, pt);
+
+ add_wait_queue(wqh, &irq_ctx->wait);
+}
+
+static bool gunyah_irqfd_populate(struct gunyah_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc)
+{
+ struct gunyah_irqfd *irqfd =
+ container_of(ticket, struct gunyah_irqfd, ticket);
+ int ret;
+
+ if (irqfd->ghrsc) {
+ pr_warn("irqfd%d already got a Gunyah resource. Check if multiple resources with same label were configured.\n",
+ irqfd->ticket.label);
+ return false;
+ }
+
+ irqfd->ghrsc = ghrsc;
+ if (irqfd->level) {
+ /* Configure the bell to trigger when bit 0 is asserted (see
+ * irqfd_wakeup) and to automatically clear bit 0 once it is
+ * received by the VM (ack_mask). Bit 0 must be cleared right
+ * away, otherwise the line will never be deasserted. Emulating
+ * an edge-triggered interrupt does not need either mask
+ * because the irq is raised once per gunyah_hypercall_bell_send
+ */
+ ret = gunyah_hypercall_bell_set_mask(irqfd->ghrsc->capid, 1, 1);
+ if (ret)
+ pr_warn("irq %d couldn't be set as level triggered. Might cause IRQ storm if asserted\n",
+ irqfd->ticket.label);
+ }
+
+ return true;
+}
+
+static void gunyah_irqfd_unpopulate(struct gunyah_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc)
+{
+ struct gunyah_irqfd *irqfd =
+ container_of(ticket, struct gunyah_irqfd, ticket);
+ u64 cnt;
+
+ eventfd_ctx_remove_wait_queue(irqfd->ctx, &irqfd->wait, &cnt);
+}
+
+static long gunyah_irqfd_bind(struct gunyah_vm_function_instance *f)
+{
+ struct gunyah_fn_irqfd_arg *args = f->argp;
+ struct gunyah_irqfd *irqfd;
+ __poll_t events;
+ struct fd fd;
+ long r;
+
+ if (f->arg_size != sizeof(*args))
+ return -EINVAL;
+
+ /* All other flag bits are reserved for future use */
+ if (args->flags & ~GUNYAH_IRQFD_FLAGS_LEVEL)
+ return -EINVAL;
+
+ irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+ if (!irqfd)
+ return -ENOMEM;
+
+ irqfd->f = f;
+ f->data = irqfd;
+
+ fd = fdget(args->fd);
+ if (!fd.file) {
+ kfree(irqfd);
+ return -EBADF;
+ }
+
+ irqfd->ctx = eventfd_ctx_fileget(fd.file);
+ if (IS_ERR(irqfd->ctx)) {
+ r = PTR_ERR(irqfd->ctx);
+ goto err_fdput;
+ }
+
+ if (args->flags & GUNYAH_IRQFD_FLAGS_LEVEL)
+ irqfd->level = true;
+
+ init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
+ init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
+
+ irqfd->ticket.resource_type = GUNYAH_RESOURCE_TYPE_BELL_TX;
+ irqfd->ticket.label = args->label;
+ irqfd->ticket.owner = THIS_MODULE;
+ irqfd->ticket.populate = gunyah_irqfd_populate;
+ irqfd->ticket.unpopulate = gunyah_irqfd_unpopulate;
+
+ r = gunyah_vm_add_resource_ticket(f->ghvm, &irqfd->ticket);
+ if (r)
+ goto err_ctx;
+
+ events = vfs_poll(fd.file, &irqfd->pt);
+ if (events & EPOLLIN)
+ pr_warn("Premature injection of interrupt\n");
+ fdput(fd);
+
+ return 0;
+err_ctx:
+ eventfd_ctx_put(irqfd->ctx);
+err_fdput:
+ fdput(fd);
+ kfree(irqfd);
+ return r;
+}
+
+static void gunyah_irqfd_unbind(struct gunyah_vm_function_instance *f)
+{
+ struct gunyah_irqfd *irqfd = f->data;
+
+ gunyah_vm_remove_resource_ticket(irqfd->f->ghvm, &irqfd->ticket);
+ eventfd_ctx_put(irqfd->ctx);
+ kfree(irqfd);
+}
+
+static bool gunyah_irqfd_compare(const struct gunyah_vm_function_instance *f,
+ const void *arg, size_t size)
+{
+ const struct gunyah_fn_irqfd_arg *instance = f->argp, *other = arg;
+
+ if (sizeof(*other) != size)
+ return false;
+
+ return instance->label == other->label;
+}
+
+DECLARE_GUNYAH_VM_FUNCTION_INIT(irqfd, GUNYAH_FN_IRQFD, 2, gunyah_irqfd_bind,
+ gunyah_irqfd_unbind, gunyah_irqfd_compare);
+MODULE_DESCRIPTION("Gunyah irqfd VM Function");
+MODULE_LICENSE("GPL");
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 574116f54472..cb7b0bb9bef3 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -63,9 +63,12 @@ struct gunyah_vm_dtb_config {
* @GUNYAH_FN_VCPU: create a vCPU instance to control a vCPU
* &struct gunyah_fn_desc.arg is a pointer to &struct gunyah_fn_vcpu_arg
* Return: file descriptor to manipulate the vcpu.
+ * @GUNYAH_FN_IRQFD: register eventfd to assert a Gunyah doorbell
+ * &struct gunyah_fn_desc.arg is a pointer to &struct gunyah_fn_irqfd_arg
*/
enum gunyah_fn_type {
GUNYAH_FN_VCPU = 1,
+ GUNYAH_FN_IRQFD,
};
#define GUNYAH_FN_MAX_ARG_SIZE 256
@@ -85,6 +88,38 @@ struct gunyah_fn_vcpu_arg {
__u32 id;
};
+/**
+ * enum gunyah_irqfd_flags - flags for use in gunyah_fn_irqfd_arg
+ * @GUNYAH_IRQFD_FLAGS_LEVEL: make the interrupt operate like a level-triggered
+ *                            interrupt on the guest side. Triggering the IRQFD
+ *                            before the guest handles the interrupt causes the
+ *                            interrupt to stay asserted.
+ */
+enum gunyah_irqfd_flags {
+ GUNYAH_IRQFD_FLAGS_LEVEL = 1UL << 0,
+};
+
+/**
+ * struct gunyah_fn_irqfd_arg - Arguments to create an irqfd function.
+ *
+ * Create this function with &GUNYAH_VM_ADD_FUNCTION using type &GUNYAH_FN_IRQFD.
+ *
+ * Allows setting an eventfd to directly trigger a guest interrupt.
+ * irqfd.fd specifies the file descriptor to use as the eventfd.
+ * irqfd.label corresponds to the doorbell label used in the guest VM's devicetree.
+ *
+ * @fd: an eventfd which when written to will raise a doorbell
+ * @label: Label of the doorbell created on the guest VM
+ * @flags: see &enum gunyah_irqfd_flags
+ * @padding: padding bytes
+ */
+struct gunyah_fn_irqfd_arg {
+ __u32 fd;
+ __u32 label;
+ __u32 flags;
+ __u32 padding;
+};
+
/**
* struct gunyah_fn_desc - Arguments to create a VM function
* @type: Type of the function. See &enum gunyah_fn_type.
--
2.34.1
Add myself and Prakruthi as maintainers of Gunyah hypervisor drivers.
Reviewed-by: Alex Elder <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
MAINTAINERS | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index fa67e2624723..64f70ef1ef91 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9306,6 +9306,18 @@ L: [email protected]
S: Maintained
F: block/partitions/efi.*
+GUNYAH HYPERVISOR DRIVER
+M: Elliot Berman <[email protected]>
+M: Prakruthi Deepak Heragu <[email protected]>
+L: [email protected]
+S: Supported
+F: Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
+F: Documentation/virt/gunyah/
+F: arch/arm64/gunyah/
+F: drivers/virt/gunyah/
+F: include/linux/gunyah*.h
+K: gunyah
+
HABANALABS PCI DRIVER
M: Oded Gabbay <[email protected]>
L: [email protected]
--
2.34.1
Gunyah Resource Manager sets up a virtual machine based on a device
tree which lives in guest memory. The resource manager requires this
memory to be provided as a memory parcel for it to read and manipulate.
Implement a function to construct a memory parcel from a guestmem
binding.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/guest_memfd.c | 190 ++++++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 6 ++
2 files changed, 196 insertions(+)
diff --git a/drivers/virt/gunyah/guest_memfd.c b/drivers/virt/gunyah/guest_memfd.c
index 71686f1946da..5eeac6ac451e 100644
--- a/drivers/virt/gunyah/guest_memfd.c
+++ b/drivers/virt/gunyah/guest_memfd.c
@@ -653,3 +653,193 @@ int gunyah_gmem_modify_mapping(struct gunyah_vm *ghvm,
fput(file);
return ret;
}
+
+int gunyah_gmem_share_parcel(struct gunyah_vm *ghvm, struct gunyah_rm_mem_parcel *parcel,
+ u64 *gfn, u64 *nr)
+{
+ struct folio *folio, *prev_folio;
+ unsigned long nr_entries, i, j, start, end;
+ struct gunyah_gmem_binding *b;
+ bool lend;
+ int ret;
+
+ parcel->mem_handle = GUNYAH_MEM_HANDLE_INVAL;
+
+ if (!*nr)
+ return -EINVAL;
+
+
+ down_read(&ghvm->bindings_lock);
+ b = mtree_load(&ghvm->bindings, *gfn);
+ if (!b || *gfn > b->gfn + b->nr || *gfn < b->gfn) {
+ ret = -ENOENT;
+ goto unlock;
+ }
+
+ /**
+ * Generally, indices can be based on gfn, guest_memfd offset, or
+ * offset into binding. start and end are based on offset into binding.
+ */
+ start = *gfn - b->gfn;
+
+ if (start + *nr > b->nr) {
+ ret = -ENOENT;
+ goto unlock;
+ }
+
+ end = start + *nr;
+ lend = parcel->n_acl_entries == 1 || gunyah_guest_mem_is_lend(ghvm, b->flags);
+
+ /**
+ * First, calculate the number of physically discontiguous regions
+ * the parcel covers. Each memory entry corresponds to one folio.
+ * In future, each memory entry could correspond to contiguous
+ * folios that are also adjacent in guest_memfd, but parcels
+ * are only being used for small amounts of memory for now, so
+ * this optimization is premature.
+ */
+ nr_entries = 0;
+ prev_folio = NULL;
+ for (i = start + b->i_off; i < end + b->i_off;) {
+ folio = gunyah_gmem_get_folio(file_inode(b->file), i); /* A */
+ if (!folio) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ nr_entries++;
+ i = folio_index(folio) + folio_nr_pages(folio);
+ }
+ end = i - b->i_off;
+
+ parcel->mem_entries =
+ kcalloc(nr_entries, sizeof(*parcel->mem_entries), GFP_KERNEL);
+ if (!parcel->mem_entries) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ /**
+ * Walk through all the folios again, now filling the mem_entries array.
+ */
+ j = 0;
+ prev_folio = NULL;
+ for (i = start + b->i_off; i < end + b->i_off; j++) {
+ folio = filemap_get_folio(file_inode(b->file)->i_mapping, i); /* B */
+ if (WARN_ON(IS_ERR(folio))) {
+ ret = PTR_ERR(folio);
+ i = end + b->i_off;
+ goto out;
+ }
+
+ if (lend)
+ folio_set_private(folio);
+
+ parcel->mem_entries[j].size = cpu_to_le64(folio_size(folio));
+ parcel->mem_entries[j].phys_addr = cpu_to_le64(PFN_PHYS(folio_pfn(folio)));
+ i = folio_index(folio) + folio_nr_pages(folio);
+ folio_put(folio); /* B */
+ }
+ BUG_ON(j != nr_entries);
+ parcel->n_mem_entries = nr_entries;
+
+ if (lend)
+ parcel->n_acl_entries = 1;
+
+ parcel->acl_entries = kcalloc(parcel->n_acl_entries,
+ sizeof(*parcel->acl_entries), GFP_KERNEL);
+ if (!parcel->acl_entries) {
+ ret = -ENOMEM;
+ goto free_entries;
+ }
+
+ parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
+ if (b->flags & GUNYAH_MEM_ALLOW_READ)
+ parcel->acl_entries[0].perms |= GUNYAH_RM_ACL_R;
+ if (b->flags & GUNYAH_MEM_ALLOW_WRITE)
+ parcel->acl_entries[0].perms |= GUNYAH_RM_ACL_W;
+ if (b->flags & GUNYAH_MEM_ALLOW_EXEC)
+ parcel->acl_entries[0].perms |= GUNYAH_RM_ACL_X;
+
+ if (!lend) {
+ u16 host_vmid;
+
+ ret = gunyah_rm_get_vmid(ghvm->rm, &host_vmid);
+ if (ret)
+ goto free_acl;
+
+ parcel->acl_entries[1].vmid = cpu_to_le16(host_vmid);
+ parcel->acl_entries[1].perms = GUNYAH_RM_ACL_R | GUNYAH_RM_ACL_W | GUNYAH_RM_ACL_X;
+ }
+
+ parcel->mem_handle = GUNYAH_MEM_HANDLE_INVAL;
+ folio = filemap_get_folio(file_inode(b->file)->i_mapping, start); /* C */
+ *gfn = folio_index(folio) - b->i_off + b->gfn;
+ *nr = end - (folio_index(folio) - b->i_off);
+ folio_put(folio); /* C */
+
+ ret = gunyah_rm_mem_share(ghvm->rm, parcel);
+ goto out;
+free_acl:
+ kfree(parcel->acl_entries);
+ parcel->acl_entries = NULL;
+free_entries:
+ kfree(parcel->mem_entries);
+ parcel->mem_entries = NULL;
+ parcel->n_mem_entries = 0;
+out:
+ /* unlock the folios */
+ for (j = start + b->i_off; j < i;) {
+ folio = filemap_get_folio(file_inode(b->file)->i_mapping, j); /* D */
+ if (IS_ERR(folio)) {
+ j++; continue; } /* advance past the hole to avoid looping forever */
+ j = folio_index(folio) + folio_nr_pages(folio);
+ folio_unlock(folio); /* A */
+ folio_put(folio); /* D */
+ if (ret)
+ folio_put(folio); /* A */
+ /* matching folio_put for A is done at
+ * (1) gunyah_gmem_reclaim_parcel or
+ * (2) after gunyah_gmem_parcel_to_paged, gunyah_vm_reclaim_folio
+ */
+ }
+unlock:
+ up_read(&ghvm->bindings_lock);
+ return ret;
+}
+
+int gunyah_gmem_reclaim_parcel(struct gunyah_vm *ghvm,
+ struct gunyah_rm_mem_parcel *parcel, u64 gfn,
+ u64 nr)
+{
+ struct gunyah_rm_mem_entry *entry;
+ struct folio *folio;
+ pgoff_t i;
+ int ret;
+
+ if (parcel->mem_handle != GUNYAH_MEM_HANDLE_INVAL) {
+ ret = gunyah_rm_mem_reclaim(ghvm->rm, parcel);
+ if (ret) {
+ dev_err(ghvm->parent, "Failed to reclaim parcel: %d\n",
+ ret);
+ /* We can't reclaim the pages -- hold onto the pages
+ * forever because we don't know what state the memory
+ * is in
+ */
+ return ret;
+ }
+ parcel->mem_handle = GUNYAH_MEM_HANDLE_INVAL;
+
+ for (i = 0; i < parcel->n_mem_entries; i++) {
+ entry = &parcel->mem_entries[i];
+
+ folio = pfn_folio(PHYS_PFN(le64_to_cpu(entry->phys_addr)));
+ folio_put(folio); /* A */
+ }
+
+ kfree(parcel->mem_entries);
+ kfree(parcel->acl_entries);
+ }
+
+ return 0;
+}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 518d05eeb642..a79c11f1c3a5 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -119,5 +119,11 @@ int gunyah_gmem_modify_mapping(struct gunyah_vm *ghvm,
struct gunyah_map_mem_args *args);
struct gunyah_gmem_binding;
void gunyah_gmem_remove_binding(struct gunyah_gmem_binding *binding);
+int gunyah_gmem_share_parcel(struct gunyah_vm *ghvm,
+ struct gunyah_rm_mem_parcel *parcel, u64 *gfn,
+ u64 *nr);
+int gunyah_gmem_reclaim_parcel(struct gunyah_vm *ghvm,
+ struct gunyah_rm_mem_parcel *parcel, u64 gfn,
+ u64 nr);
#endif
--
2.34.1
RM provides APIs to fill the boot context: the initial register values
used when starting a vCPU. Most importantly, this allows userspace to
set the initial PC for the primary vCPU when the VM starts to run.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 2 ++
include/uapi/linux/gunyah.h | 23 +++++++++++++
3 files changed, 102 insertions(+)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 3b767eeeb7c2..1f3d29749174 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -395,6 +395,7 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
mutex_init(&ghvm->resources_lock);
INIT_LIST_HEAD(&ghvm->resources);
INIT_LIST_HEAD(&ghvm->resource_tickets);
+ xa_init(&ghvm->boot_context);
mt_init(&ghvm->mm);
mt_init(&ghvm->bindings);
@@ -420,6 +421,66 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
return ghvm;
}
+static long gunyah_vm_set_boot_context(struct gunyah_vm *ghvm,
+ struct gunyah_vm_boot_context *boot_ctx)
+{
+ u8 reg_set, reg_index; /* to check values are reasonable */
+ int ret;
+
+ reg_set = (boot_ctx->reg >> GUNYAH_VM_BOOT_CONTEXT_REG_SHIFT) & 0xff;
+ reg_index = boot_ctx->reg & 0xff;
+
+ switch (reg_set) {
+ case REG_SET_X:
+ if (reg_index > 31)
+ return -EINVAL;
+ break;
+ case REG_SET_PC:
+ if (reg_index)
+ return -EINVAL;
+ break;
+ case REG_SET_SP:
+ if (reg_index > 2)
+ return -EINVAL;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ ret = down_read_interruptible(&ghvm->status_lock);
+ if (ret)
+ return ret;
+
+ if (ghvm->vm_status != GUNYAH_RM_VM_STATUS_NO_STATE) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ret = xa_err(xa_store(&ghvm->boot_context, boot_ctx->reg,
+ (void *)boot_ctx->value, GFP_KERNEL));
+out:
+ up_read(&ghvm->status_lock);
+ return ret;
+}
+
+static inline int gunyah_vm_fill_boot_context(struct gunyah_vm *ghvm)
+{
+ unsigned long reg_set, reg_index, id;
+ void *entry;
+ int ret;
+
+ xa_for_each(&ghvm->boot_context, id, entry) {
+ reg_set = (id >> GUNYAH_VM_BOOT_CONTEXT_REG_SHIFT) & 0xff;
+ reg_index = id & 0xff;
+ ret = gunyah_rm_vm_set_boot_context(
+ ghvm->rm, ghvm->vmid, reg_set, reg_index, (u64)entry);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static int gunyah_vm_start(struct gunyah_vm *ghvm)
{
struct gunyah_rm_hyp_resources *resources;
@@ -496,6 +557,13 @@ static int gunyah_vm_start(struct gunyah_vm *ghvm)
}
ghvm->vm_status = GUNYAH_RM_VM_STATUS_READY;
+ ret = gunyah_vm_fill_boot_context(ghvm);
+ if (ret) {
+ dev_warn(ghvm->parent, "Failed to setup boot context: %d\n",
+ ret);
+ goto err;
+ }
+
ret = gunyah_rm_get_hyp_resources(ghvm->rm, ghvm->vmid, &resources);
if (ret) {
dev_warn(ghvm->parent,
@@ -614,6 +682,14 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
return gunyah_gmem_modify_mapping(ghvm, &args);
}
+ case GUNYAH_VM_SET_BOOT_CONTEXT: {
+ struct gunyah_vm_boot_context boot_ctx;
+
+ if (copy_from_user(&boot_ctx, argp, sizeof(boot_ctx)))
+ return -EFAULT;
+
+ return gunyah_vm_set_boot_context(ghvm, &boot_ctx);
+ }
default:
r = -ENOTTY;
break;
@@ -699,6 +775,7 @@ static void _gunyah_vm_put(struct kref *kref)
"Failed to deallocate vmid: %d\n", ret);
}
+ xa_destroy(&ghvm->boot_context);
gunyah_rm_put(ghvm->rm);
kfree(ghvm);
}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 474ac866d237..4a436c3e435c 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -79,6 +79,7 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* folio; parcel_start is start of the folio)
* @dtb.parcel_pages: Number of pages lent for the memory parcel
* @dtb.parcel: Data for resource manager to lend the parcel
+ * @boot_context: Requested initial boot context to set when launching the VM
*
* Members are grouped by hot path.
*/
@@ -113,6 +114,7 @@ struct gunyah_vm {
u64 parcel_start, parcel_pages;
struct gunyah_rm_mem_parcel parcel;
} dtb;
+ struct xarray boot_context;
};
int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index a89d9bedf3e5..574116f54472 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -142,6 +142,29 @@ struct gunyah_map_mem_args {
#define GUNYAH_VM_MAP_MEM _IOW(GUNYAH_IOCTL_TYPE, 0x9, struct gunyah_map_mem_args)
+enum gunyah_vm_boot_context_reg {
+ REG_SET_X = 0,
+ REG_SET_PC = 1,
+ REG_SET_SP = 2,
+};
+
+#define GUNYAH_VM_BOOT_CONTEXT_REG_SHIFT 8
+#define GUNYAH_VM_BOOT_CONTEXT_REG(reg, idx) ((((reg) & 0xff) << GUNYAH_VM_BOOT_CONTEXT_REG_SHIFT) |\
+ ((idx) & 0xff))
+
+/**
+ * struct gunyah_vm_boot_context - Set an initial register for the VM
+ * @reg: Register to set. See GUNYAH_VM_BOOT_CONTEXT_REG_* macros
+ * @reserved: reserved for alignment
+ * @value: value to fill in the register
+ */
+struct gunyah_vm_boot_context {
+ __u32 reg;
+ __u32 reserved;
+ __u64 value;
+};
+#define GUNYAH_VM_SET_BOOT_CONTEXT _IOW(GUNYAH_IOCTL_TYPE, 0xa, struct gunyah_vm_boot_context)
+
/*
* ioctls for vCPU fds
*/
--
2.34.1
Add a framework for VM functions to handle stage-2 write faults from Gunyah
guest virtual machines. IO handlers cover a range of addresses to which they
apply. Optionally, a handler may apply only when the value written
matches the IO handler's value.
Reviewed-by: Alex Elder <[email protected]>
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/gunyah_vcpu.c | 4 ++
drivers/virt/gunyah/vm_mgr.c | 115 ++++++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 8 +++
include/linux/gunyah.h | 29 ++++++++++
4 files changed, 156 insertions(+)
diff --git a/drivers/virt/gunyah/gunyah_vcpu.c b/drivers/virt/gunyah/gunyah_vcpu.c
index f01e6d6163ba..edadb056cc18 100644
--- a/drivers/virt/gunyah/gunyah_vcpu.c
+++ b/drivers/virt/gunyah/gunyah_vcpu.c
@@ -133,6 +133,10 @@ gunyah_handle_mmio(struct gunyah_vcpu *vcpu, unsigned long resume_data[3],
vcpu->state = GUNYAH_VCPU_RUN_STATE_MMIO_READ;
vcpu->mmio_read_len = len;
} else { /* GUNYAH_VCPU_ADDRSPACE_VMMIO_WRITE */
+ if (!gunyah_vm_mmio_write(vcpu->ghvm, addr, len, data)) {
+ resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_EMULATE;
+ return true;
+ }
vcpu->vcpu_run->mmio.is_write = 1;
memcpy(vcpu->vcpu_run->mmio.data, &data, len);
vcpu->state = GUNYAH_VCPU_RUN_STATE_MMIO_WRITE;
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 1f3d29749174..cb63cb121846 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -295,6 +295,118 @@ static void gunyah_vm_clean_resources(struct gunyah_vm *ghvm)
mutex_unlock(&ghvm->resources_lock);
}
+static int _gunyah_vm_io_handler_compare(const struct rb_node *node,
+ const struct rb_node *parent)
+{
+ struct gunyah_vm_io_handler *n =
+ container_of(node, struct gunyah_vm_io_handler, node);
+ struct gunyah_vm_io_handler *p =
+ container_of(parent, struct gunyah_vm_io_handler, node);
+
+ if (n->addr < p->addr)
+ return -1;
+ if (n->addr > p->addr)
+ return 1;
+ if ((n->len && !p->len) || (!n->len && p->len))
+ return 0;
+ if (n->len < p->len)
+ return -1;
+ if (n->len > p->len)
+ return 1;
+ /*
+ * One of the io handlers doesn't have datamatch and the other does.
+ * For purposes of comparison, that makes them identical since the
+ * one that doesn't have datamatch will match the same writes as the
+ * one that does.
+ */
+ if (n->datamatch != p->datamatch)
+ return 0;
+ if (n->data < p->data)
+ return -1;
+ if (n->data > p->data)
+ return 1;
+ return 0;
+}
+
+static int gunyah_vm_io_handler_compare(struct rb_node *node,
+ const struct rb_node *parent)
+{
+ return _gunyah_vm_io_handler_compare(node, parent);
+}
+
+static int gunyah_vm_io_handler_find(const void *key,
+ const struct rb_node *node)
+{
+ const struct gunyah_vm_io_handler *k = key;
+
+ return _gunyah_vm_io_handler_compare(&k->node, node);
+}
+
+static struct gunyah_vm_io_handler *
+gunyah_vm_mgr_find_io_hdlr(struct gunyah_vm *ghvm, u64 addr, u64 len, u64 data)
+{
+ struct gunyah_vm_io_handler key = {
+ .addr = addr,
+ .len = len,
+ .datamatch = true,
+ .data = data,
+ };
+ struct rb_node *node;
+
+ node = rb_find(&key, &ghvm->mmio_handler_root,
+ gunyah_vm_io_handler_find);
+ if (!node)
+ return NULL;
+
+ return container_of(node, struct gunyah_vm_io_handler, node);
+}
+
+int gunyah_vm_mmio_write(struct gunyah_vm *ghvm, u64 addr, u32 len, u64 data)
+{
+ struct gunyah_vm_io_handler *io_hdlr = NULL;
+ int ret;
+
+ down_read(&ghvm->mmio_handler_lock);
+ io_hdlr = gunyah_vm_mgr_find_io_hdlr(ghvm, addr, len, data);
+ if (!io_hdlr || !io_hdlr->ops || !io_hdlr->ops->write) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ ret = io_hdlr->ops->write(io_hdlr, addr, len, data);
+
+out:
+ up_read(&ghvm->mmio_handler_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_mmio_write);
+
+int gunyah_vm_add_io_handler(struct gunyah_vm *ghvm,
+ struct gunyah_vm_io_handler *io_hdlr)
+{
+ struct rb_node *found;
+
+ if (io_hdlr->datamatch &&
+ (!io_hdlr->len || io_hdlr->len > sizeof(io_hdlr->data)))
+ return -EINVAL;
+
+ down_write(&ghvm->mmio_handler_lock);
+ found = rb_find_add(&io_hdlr->node, &ghvm->mmio_handler_root,
+ gunyah_vm_io_handler_compare);
+ up_write(&ghvm->mmio_handler_lock);
+
+ return found ? -EEXIST : 0;
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_add_io_handler);
+
+void gunyah_vm_remove_io_handler(struct gunyah_vm *ghvm,
+ struct gunyah_vm_io_handler *io_hdlr)
+{
+ down_write(&ghvm->mmio_handler_lock);
+ rb_erase(&io_hdlr->node, &ghvm->mmio_handler_root);
+ up_write(&ghvm->mmio_handler_lock);
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_remove_io_handler);
+
static int gunyah_vm_rm_notification_status(struct gunyah_vm *ghvm, void *data)
{
struct gunyah_rm_vm_status_payload *payload = data;
@@ -397,6 +509,9 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
INIT_LIST_HEAD(&ghvm->resource_tickets);
xa_init(&ghvm->boot_context);
+ init_rwsem(&ghvm->mmio_handler_lock);
+ ghvm->mmio_handler_root = RB_ROOT;
+
mt_init(&ghvm->mm);
mt_init(&ghvm->bindings);
init_rwsem(&ghvm->bindings_lock);
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 4a436c3e435c..b956989fa5e6 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -10,6 +10,7 @@
#include <linux/kref.h>
#include <linux/maple_tree.h>
#include <linux/mutex.h>
+#include <linux/rbtree.h>
#include <linux/rwsem.h>
#include <linux/wait.h>
@@ -56,6 +57,9 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* @guest_shared_extent_ticket: Resource ticket to the capability for
* the memory extent that represents
* memory shared with the guest.
+ * @mmio_handler_root: RB tree of MMIO handlers.
+ * Entries are &struct gunyah_vm_io_handler
+ * @mmio_handler_lock: Serializes access to @mmio_handler_root
* @rm: Pointer to the resource manager struct to make RM calls
* @parent: For logging
* @nb: Notifier block for RM notifications
@@ -91,6 +95,8 @@ struct gunyah_vm {
struct gunyah_vm_resource_ticket addrspace_ticket,
host_private_extent_ticket, host_shared_extent_ticket,
guest_private_extent_ticket, guest_shared_extent_ticket;
+ struct rb_root mmio_handler_root;
+ struct rw_semaphore mmio_handler_lock;
struct gunyah_rm *rm;
@@ -117,6 +123,8 @@ struct gunyah_vm {
struct xarray boot_context;
};
+int gunyah_vm_mmio_write(struct gunyah_vm *ghvm, u64 addr, u32 len, u64 data);
+
int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
struct gunyah_rm_mem_parcel *parcel, u64 gfn,
u64 nr);
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 67cb9350ab9e..4638c358869a 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -156,6 +156,35 @@ int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
void gunyah_vm_remove_resource_ticket(struct gunyah_vm *ghvm,
struct gunyah_vm_resource_ticket *ticket);
+/*
+ * gunyah_vm_io_handler describes an I/O device: the address range it covers,
+ * an optional datamatch value, and the ops used to access it.
+ */
+struct gunyah_vm_io_handler {
+ struct rb_node node;
+ u64 addr;
+
+ bool datamatch;
+ u8 len;
+ u64 data;
+ struct gunyah_vm_io_handler_ops *ops;
+};
+
+/*
+ * gunyah_vm_io_handler_ops contains the function pointers associated with an I/O device.
+ */
+struct gunyah_vm_io_handler_ops {
+ int (*read)(struct gunyah_vm_io_handler *io_dev, u64 addr, u32 len,
+ u64 data);
+ int (*write)(struct gunyah_vm_io_handler *io_dev, u64 addr, u32 len,
+ u64 data);
+};
+
+int gunyah_vm_add_io_handler(struct gunyah_vm *ghvm,
+ struct gunyah_vm_io_handler *io_dev);
+void gunyah_vm_remove_io_handler(struct gunyah_vm *ghvm,
+ struct gunyah_vm_io_handler *io_dev);
+
#define GUNYAH_RM_ACL_X BIT(0)
#define GUNYAH_RM_ACL_W BIT(1)
#define GUNYAH_RM_ACL_R BIT(2)
--
2.34.1
A maple tree is used to maintain a map from guest address ranges to the
guestmemfd that provides the memory backing each range. The mapping of a
guest address range to a guestmemfd is called a binding. Implement an
ioctl to add/remove bindings to/from the virtual machine. The binding
determines whether the memory is shared (host retains access) or lent
(host loses access).
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/guest_memfd.c | 394 +++++++++++++++++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.c | 20 ++
drivers/virt/gunyah/vm_mgr.h | 9 +
include/uapi/linux/gunyah.h | 41 ++++
4 files changed, 455 insertions(+), 9 deletions(-)
diff --git a/drivers/virt/gunyah/guest_memfd.c b/drivers/virt/gunyah/guest_memfd.c
index 73a3f1368081..71686f1946da 100644
--- a/drivers/virt/gunyah/guest_memfd.c
+++ b/drivers/virt/gunyah/guest_memfd.c
@@ -16,6 +16,51 @@
#include "vm_mgr.h"
+/**
+ * struct gunyah_gmem_binding - Represents a binding of guestmem to a Gunyah VM
+ * @gfn: Guest address to place acquired folios
+ * @ghvm: Pointer to Gunyah VM in this binding
+ * @i_off: offset into the guestmem to grab folios from
+ * @file: Pointer to guest_memfd
+ * @i_entry: list entry for inode->i_private_list
+ * @flags: Access flags for the binding
+ * @nr: Number of pages covered by this binding
+ */
+struct gunyah_gmem_binding {
+ u64 gfn;
+ struct gunyah_vm *ghvm;
+
+ pgoff_t i_off;
+ struct file *file;
+ struct list_head i_entry;
+
+ u32 flags;
+ unsigned long nr;
+};
+
+static inline pgoff_t gunyah_gfn_to_off(struct gunyah_gmem_binding *b, u64 gfn)
+{
+ return gfn - b->gfn + b->i_off;
+}
+
+static inline u64 gunyah_off_to_gfn(struct gunyah_gmem_binding *b, pgoff_t off)
+{
+ return off - b->i_off + b->gfn;
+}
+
+static inline bool gunyah_guest_mem_is_lend(struct gunyah_vm *ghvm, u32 flags)
+{
+ u8 access = flags & GUNYAH_MEM_ACCESS_MASK;
+
+ if (access == GUNYAH_MEM_FORCE_LEND)
+ return true;
+ else if (access == GUNYAH_MEM_FORCE_SHARE)
+ return false;
+
+ /* RM requires all VMs to be protected (isolated) */
+ return true;
+}
+
static struct folio *gunyah_gmem_get_huge_folio(struct inode *inode,
pgoff_t index)
{
@@ -83,17 +128,55 @@ static struct folio *gunyah_gmem_get_folio(struct inode *inode, pgoff_t index)
return folio;
}
+/**
+ * gunyah_gmem_launder_folio() - Tries to unmap one folio from virtual machine(s)
+ * @folio: The folio to unmap
+ *
+ * Return: 0 if the folio has been reclaimed from any virtual machine(s) the
+ * folio was mapped into.
+ */
+static int gunyah_gmem_launder_folio(struct folio *folio)
+{
+ struct address_space *const mapping = folio->mapping;
+ struct gunyah_gmem_binding *b;
+ pgoff_t index = folio_index(folio);
+ int ret = 0;
+ u64 gfn;
+
+ filemap_invalidate_lock_shared(mapping);
+ list_for_each_entry(b, &mapping->i_private_list, i_entry) {
+ /* if the mapping doesn't cover this folio: skip */
+ if (b->i_off > index || index >= b->i_off + b->nr)
+ continue;
+
+ gfn = gunyah_off_to_gfn(b, index);
+ ret = gunyah_vm_reclaim_folio(b->ghvm, gfn);
+ if (WARN_RATELIMIT(ret, "failed to reclaim gfn: %08llx %d\n",
+ gfn, ret))
+ break;
+ }
+ filemap_invalidate_unlock_shared(mapping);
+
+ return ret;
+}
+
static vm_fault_t gunyah_gmem_host_fault(struct vm_fault *vmf)
{
struct folio *folio;
folio = gunyah_gmem_get_folio(file_inode(vmf->vma->vm_file),
vmf->pgoff);
- if (!folio || folio_test_private(folio)) {
+ if (!folio)
+ return VM_FAULT_SIGBUS;
+
+ /* If the folio is lent to a VM, try to reclaim it */
+ if (folio_test_private(folio) && gunyah_gmem_launder_folio(folio)) {
folio_unlock(folio);
folio_put(folio);
return VM_FAULT_SIGBUS;
}
+ /* gunyah_gmem_launder_folio should clear the private bit if it returns 0 */
+ BUG_ON(folio_test_private(folio));
vmf->page = folio_file_page(folio, vmf->pgoff);
@@ -106,9 +189,36 @@ static const struct vm_operations_struct gunyah_gmem_vm_ops = {
static int gunyah_gmem_mmap(struct file *file, struct vm_area_struct *vma)
{
- file_accessed(file);
- vma->vm_ops = &gunyah_gmem_vm_ops;
- return 0;
+ struct address_space *const mapping = file->f_mapping;
+ struct gunyah_gmem_binding *b;
+ int ret = 0;
+ u64 gfn, nr;
+
+ filemap_invalidate_lock_shared(mapping);
+ list_for_each_entry(b, &mapping->i_private_list, i_entry) {
+ if (!gunyah_guest_mem_is_lend(b->ghvm, b->flags))
+ continue;
+
+ /* if the binding doesn't cover this vma: skip */
+ if (vma->vm_pgoff + vma_pages(vma) <= b->i_off)
+ continue;
+ if (vma->vm_pgoff >= b->i_off + b->nr)
+ continue;
+
+ gfn = gunyah_off_to_gfn(b, vma->vm_pgoff);
+ nr = gunyah_off_to_gfn(b, vma->vm_pgoff + vma_pages(vma)) - gfn;
+ ret = gunyah_vm_reclaim_range(b->ghvm, gfn, nr);
+ if (ret)
+ break;
+ }
+ filemap_invalidate_unlock_shared(mapping);
+
+ if (!ret) {
+ file_accessed(file);
+ vma->vm_ops = &gunyah_gmem_vm_ops;
+ }
+
+ return ret;
}
/**
@@ -125,9 +235,7 @@ static int gunyah_gmem_mmap(struct file *file, struct vm_area_struct *vma)
static long gunyah_gmem_punch_hole(struct inode *inode, loff_t offset,
loff_t len)
{
- truncate_inode_pages_range(inode->i_mapping, offset, offset + len - 1);
-
- return 0;
+ return invalidate_inode_pages2_range(inode->i_mapping, offset, offset + len - 1);
}
static long gunyah_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
@@ -204,6 +312,12 @@ static long gunyah_gmem_fallocate(struct file *file, int mode, loff_t offset,
static int gunyah_gmem_release(struct inode *inode, struct file *file)
{
+ /*
+ * Each binding holds a reference on the file, so we shouldn't get
+ * here if i_private_list is not empty.
+ */
+ BUG_ON(!list_empty(&inode->i_mapping->i_private_list));
+
return 0;
}
@@ -216,10 +330,26 @@ static const struct file_operations gunyah_gmem_fops = {
.release = gunyah_gmem_release,
};
+static bool gunyah_gmem_release_folio(struct folio *folio, gfp_t gfp_flags)
+{
+ /* should return true if released; launder folio returns 0 if freed */
+ return !gunyah_gmem_launder_folio(folio);
+}
+
+static int gunyah_gmem_remove_folio(struct address_space *mapping,
+ struct folio *folio)
+{
+ if (mapping != folio->mapping)
+ return -EINVAL;
+
+ return gunyah_gmem_launder_folio(folio);
+}
+
static const struct address_space_operations gunyah_gmem_aops = {
.dirty_folio = noop_dirty_folio,
- .migrate_folio = migrate_folio,
- .error_remove_folio = generic_error_remove_folio,
+ .release_folio = gunyah_gmem_release_folio,
+ .launder_folio = gunyah_gmem_launder_folio,
+ .error_remove_folio = gunyah_gmem_remove_folio,
};
int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
@@ -267,6 +397,7 @@ int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
mapping_set_large_folios(inode->i_mapping);
mapping_set_unmovable(inode->i_mapping);
+ mapping_set_release_always(inode->i_mapping);
/* Unmovable mappings are supposed to be marked unevictable as well. */
WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
@@ -277,3 +408,248 @@ int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
put_unused_fd(fd);
return err;
}
+
+void gunyah_gmem_remove_binding(struct gunyah_gmem_binding *b)
+{
+ WARN_ON(gunyah_vm_reclaim_range(b->ghvm, b->gfn, b->nr));
+ mtree_erase(&b->ghvm->bindings, b->gfn);
+ list_del(&b->i_entry);
+ fput(b->file);
+ kfree(b);
+}
+
+static inline unsigned long gunyah_gmem_page_mask(struct file *file)
+{
+ unsigned long gmem_flags = (unsigned long)file_inode(file)->i_private;
+
+ if (gmem_flags & GHMF_ALLOW_HUGEPAGE) {
+#if IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
+ return HPAGE_PMD_MASK;
+#else
+ return ULONG_MAX;
+#endif
+ }
+
+ return PAGE_MASK;
+}
+
+static int gunyah_gmem_init_binding(struct gunyah_vm *ghvm, struct file *file,
+ struct gunyah_map_mem_args *args,
+ struct gunyah_gmem_binding *binding)
+{
+ const unsigned long page_mask = ~gunyah_gmem_page_mask(file);
+
+ if (args->flags & ~(GUNYAH_MEM_ALLOW_RWX | GUNYAH_MEM_ACCESS_MASK))
+ return -EINVAL;
+
+ if (args->guest_addr & page_mask)
+ return -EINVAL;
+
+ if (args->offset & page_mask)
+ return -EINVAL;
+
+ if (args->size & page_mask)
+ return -EINVAL;
+
+ binding->gfn = gunyah_gpa_to_gfn(args->guest_addr);
+ binding->ghvm = ghvm;
+ binding->i_off = args->offset >> PAGE_SHIFT;
+ binding->file = file;
+ binding->flags = args->flags;
+ binding->nr = args->size >> PAGE_SHIFT;
+
+ return 0;
+}
+
+static int gunyah_gmem_trim_binding(struct gunyah_gmem_binding *b,
+ unsigned long start_delta,
+ unsigned long end_delta)
+{
+ struct gunyah_vm *ghvm = b->ghvm;
+ int ret;
+
+ down_write(&ghvm->bindings_lock);
+ if (!start_delta && !end_delta) {
+ ret = gunyah_vm_reclaim_range(ghvm, b->gfn, b->nr);
+ if (ret)
+ goto unlock;
+ gunyah_gmem_remove_binding(b);
+ } else if (start_delta && !end_delta) {
+ /* shrink the beginning */
+ ret = gunyah_vm_reclaim_range(ghvm, b->gfn, start_delta);
+ if (ret)
+ goto unlock;
+ mtree_erase(&ghvm->bindings, b->gfn);
+ b->gfn += start_delta;
+ b->i_off += start_delta;
+ b->nr -= start_delta;
+ ret = mtree_insert_range(&ghvm->bindings, b->gfn,
+ b->gfn + b->nr - 1, b, GFP_KERNEL);
+ } else if (!start_delta && end_delta) {
+ /* Shrink the end */
+ ret = gunyah_vm_reclaim_range(ghvm, b->gfn + b->nr - end_delta,
+ end_delta);
+ if (ret)
+ goto unlock;
+ mtree_erase(&ghvm->bindings, b->gfn);
+ b->nr -= end_delta;
+ ret = mtree_insert_range(&ghvm->bindings, b->gfn,
+ b->gfn + b->nr - 1, b, GFP_KERNEL);
+ } else {
+ /* TODO: split the mapping into 2 */
+ ret = -EINVAL;
+ }
+
+unlock:
+ up_write(&ghvm->bindings_lock);
+ return ret;
+}
+
+static int gunyah_gmem_remove_mapping(struct gunyah_vm *ghvm, struct file *file,
+ struct gunyah_map_mem_args *args)
+{
+ struct inode *inode = file_inode(file);
+ struct gunyah_gmem_binding *b = NULL;
+ unsigned long start_delta, end_delta;
+ struct gunyah_gmem_binding remove;
+ int ret;
+
+ ret = gunyah_gmem_init_binding(ghvm, file, args, &remove);
+ if (ret)
+ return ret;
+
+ ret = -ENOENT;
+ filemap_invalidate_lock(inode->i_mapping);
+ list_for_each_entry(b, &inode->i_mapping->i_private_list, i_entry) {
+ if (b->ghvm != remove.ghvm || b->flags != remove.flags ||
+ WARN_ON(b->file != remove.file))
+ continue;
+ /*
+ * Test if the binding to remove is within this binding
+ * [gfn b nr]
+ * [gfn remove nr]
+ */
+ if (b->gfn > remove.gfn)
+ continue;
+ if (b->gfn + b->nr < remove.gfn + remove.nr)
+ continue;
+
+ /*
+ * We found the binding!
+ * Compute the delta in gfn start and make sure the offset
+ * into the guest memfd matches.
+ */
+ start_delta = remove.gfn - b->gfn;
+ if (remove.i_off - b->i_off != start_delta)
+ continue;
+ end_delta = b->gfn + b->nr - remove.gfn - remove.nr;
+
+ ret = gunyah_gmem_trim_binding(b, start_delta, end_delta);
+ break;
+ }
+
+ filemap_invalidate_unlock(inode->i_mapping);
+ return ret;
+}
+
+static bool gunyah_gmem_binding_allowed_overlap(struct gunyah_gmem_binding *a,
+ struct gunyah_gmem_binding *b)
+{
+ /* assumes we are operating on the same file, check to be sure */
+ BUG_ON(a->file != b->file);
+
+ /*
+ * Gunyah only guarantees we can share a page with one VM and
+ * doesn't (currently) allow us to share the same page with multiple
+ * VMs, regardless of whether the host can also access it.
+ * Gunyah supports, but Linux hasn't implemented, mapping the same
+ * page at two separate addresses in the guest's address space. That
+ * doesn't seem reasonable today, but we could do it later.
+ * All this to justify: check that the `a` region doesn't overlap
+ * with the `b` region w.r.t. file offsets.
+ */
+ if (a->i_off + a->nr <= b->i_off)
+ return true;
+ if (a->i_off >= b->i_off + b->nr)
+ return true;
+
+ return false;
+}
+
+static int gunyah_gmem_add_mapping(struct gunyah_vm *ghvm, struct file *file,
+ struct gunyah_map_mem_args *args)
+{
+ struct gunyah_gmem_binding *b, *tmp = NULL;
+ struct inode *inode = file_inode(file);
+ int ret;
+
+ b = kzalloc(sizeof(*b), GFP_KERNEL);
+ if (!b)
+ return -ENOMEM;
+
+ ret = gunyah_gmem_init_binding(ghvm, file, args, b);
+ if (ret) {
+ kfree(b);
+ return ret;
+ }
+
+ filemap_invalidate_lock(inode->i_mapping);
+ list_for_each_entry(tmp, &inode->i_mapping->i_private_list, i_entry) {
+ if (!gunyah_gmem_binding_allowed_overlap(b, tmp)) {
+ ret = -EEXIST;
+ goto unlock;
+ }
+ }
+
+ ret = mtree_insert_range(&ghvm->bindings, b->gfn, b->gfn + b->nr - 1,
+ b, GFP_KERNEL);
+ if (ret)
+ goto unlock;
+
+ list_add(&b->i_entry, &inode->i_mapping->i_private_list);
+
+unlock:
+ filemap_invalidate_unlock(inode->i_mapping);
+ if (ret)
+ kfree(b);
+ return ret;
+}
+
+int gunyah_gmem_modify_mapping(struct gunyah_vm *ghvm,
+ struct gunyah_map_mem_args *args)
+{
+ u8 access = args->flags & GUNYAH_MEM_ACCESS_MASK;
+ struct file *file;
+ int ret = -EINVAL;
+
+ file = fget(args->guest_mem_fd);
+ if (!file)
+ return -EINVAL;
+
+ if (file->f_op != &gunyah_gmem_fops)
+ goto err_file;
+
+ if (args->flags & ~(GUNYAH_MEM_ALLOW_RWX | GUNYAH_MEM_UNMAP | GUNYAH_MEM_ACCESS_MASK))
+ goto err_file;
+
+ /* VM needs to have some permissions to the memory */
+ if (!(args->flags & GUNYAH_MEM_ALLOW_RWX))
+ goto err_file;
+
+ if (access != GUNYAH_MEM_DEFAULT_ACCESS &&
+ access != GUNYAH_MEM_FORCE_LEND && access != GUNYAH_MEM_FORCE_SHARE)
+ goto err_file;
+
+ if (!PAGE_ALIGNED(args->guest_addr) || !PAGE_ALIGNED(args->offset) ||
+ !PAGE_ALIGNED(args->size))
+ goto err_file;
+
+ if (args->flags & GUNYAH_MEM_UNMAP) {
+ args->flags &= ~GUNYAH_MEM_UNMAP;
+ ret = gunyah_gmem_remove_mapping(ghvm, file, args);
+ } else {
+ ret = gunyah_gmem_add_mapping(ghvm, file, args);
+ }
+
+err_file:
+ if (ret)
+ fput(file);
+ return ret;
+}
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 33751d5cddd2..be2061aa0a06 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -397,6 +397,8 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
INIT_LIST_HEAD(&ghvm->resource_tickets);
mt_init(&ghvm->mm);
+ mt_init(&ghvm->bindings);
+ init_rwsem(&ghvm->bindings_lock);
ghvm->addrspace_ticket.resource_type = GUNYAH_RESOURCE_TYPE_ADDR_SPACE;
ghvm->addrspace_ticket.label = GUNYAH_VM_ADDRSPACE_LABEL;
@@ -551,6 +553,14 @@ static long gunyah_vm_ioctl(struct file *filp, unsigned int cmd,
r = gunyah_vm_rm_function_instance(ghvm, &f);
break;
}
+ case GUNYAH_VM_MAP_MEM: {
+ struct gunyah_map_mem_args args;
+
+ if (copy_from_user(&args, argp, sizeof(args)))
+ return -EFAULT;
+
+ return gunyah_gmem_modify_mapping(ghvm, &args);
+ }
default:
r = -ENOTTY;
break;
@@ -568,6 +578,8 @@ EXPORT_SYMBOL_GPL(gunyah_vm_get);
static void _gunyah_vm_put(struct kref *kref)
{
struct gunyah_vm *ghvm = container_of(kref, struct gunyah_vm, kref);
+ struct gunyah_gmem_binding *b;
+ unsigned long idx = 0;
int ret;
/**
@@ -579,6 +591,13 @@ static void _gunyah_vm_put(struct kref *kref)
gunyah_vm_remove_functions(ghvm);
+ down_write(&ghvm->bindings_lock);
+ mt_for_each(&ghvm->bindings, b, idx, ULONG_MAX) {
+ gunyah_gmem_remove_binding(b);
+ }
+ up_write(&ghvm->bindings_lock);
+ WARN_ON(!mtree_empty(&ghvm->bindings));
+ mtree_destroy(&ghvm->bindings);
/**
* If this fails, we're going to lose the memory for good and is
* BUG_ON-worthy, but not unrecoverable (we just lose memory).
@@ -606,6 +625,7 @@ static void _gunyah_vm_put(struct kref *kref)
ghvm->vm_status == GUNYAH_RM_VM_STATUS_RESET);
}
+ WARN_ON(!mtree_empty(&ghvm->mm));
mtree_destroy(&ghvm->mm);
if (ghvm->vm_status > GUNYAH_RM_VM_STATUS_NO_STATE) {
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 055990842959..518d05eeb642 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -36,6 +36,9 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* @mm: A maple tree of all memory that has been mapped to a VM.
* Indices are guest frame numbers; entries are either folios or
* RM mem parcels
+ * @bindings: A maple tree of guest memfd bindings. Indices are guest frame
+ * numbers; entries are &struct gunyah_gmem_binding
+ * @bindings_lock: For serialization to @bindings
* @addrspace_ticket: Resource ticket to the capability for guest VM's
* address space
* @host_private_extent_ticket: Resource ticket to the capability for our
@@ -75,6 +78,8 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
struct gunyah_vm {
u16 vmid;
struct maple_tree mm;
+ struct maple_tree bindings;
+ struct rw_semaphore bindings_lock;
struct gunyah_vm_resource_ticket addrspace_ticket,
host_private_extent_ticket, host_shared_extent_ticket,
guest_private_extent_ticket, guest_shared_extent_ticket;
@@ -110,5 +115,9 @@ int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn);
int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr);
int gunyah_guest_mem_create(struct gunyah_create_mem_args *args);
+int gunyah_gmem_modify_mapping(struct gunyah_vm *ghvm,
+ struct gunyah_map_mem_args *args);
+struct gunyah_gmem_binding;
+void gunyah_gmem_remove_binding(struct gunyah_gmem_binding *binding);
#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index c5f506350364..1af4c5ae6bc3 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -87,6 +87,47 @@ struct gunyah_fn_desc {
#define GUNYAH_VM_ADD_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x4, struct gunyah_fn_desc)
#define GUNYAH_VM_REMOVE_FUNCTION _IOW(GUNYAH_IOCTL_TYPE, 0x7, struct gunyah_fn_desc)
+/**
+ * enum gunyah_map_flags - Possible flags on &struct gunyah_map_mem_args
+ * @GUNYAH_MEM_DEFAULT_ACCESS: Use the default host access for the VM type
+ * @GUNYAH_MEM_FORCE_LEND: Force unmapping the memory from the host once the guest starts to use it
+ * @GUNYAH_MEM_FORCE_SHARE: Allow the host to continue accessing the memory after the guest starts to use it
+ * @GUNYAH_MEM_ALLOW_READ: Allow guest to read memory
+ * @GUNYAH_MEM_ALLOW_WRITE: Allow guest to write to the memory
+ * @GUNYAH_MEM_ALLOW_EXEC: Allow guest to execute instructions in the memory
+ */
+enum gunyah_map_flags {
+ GUNYAH_MEM_DEFAULT_ACCESS = 0,
+ GUNYAH_MEM_FORCE_LEND = 1,
+ GUNYAH_MEM_FORCE_SHARE = 2,
+#define GUNYAH_MEM_ACCESS_MASK 0x7
+
+ GUNYAH_MEM_ALLOW_READ = 1UL << 4,
+ GUNYAH_MEM_ALLOW_WRITE = 1UL << 5,
+ GUNYAH_MEM_ALLOW_EXEC = 1UL << 6,
+ GUNYAH_MEM_ALLOW_RWX =
+ (GUNYAH_MEM_ALLOW_READ | GUNYAH_MEM_ALLOW_WRITE | GUNYAH_MEM_ALLOW_EXEC),
+
+ GUNYAH_MEM_UNMAP = 1UL << 8,
+};
+
+/**
+ * struct gunyah_map_mem_args - Description of a guest memory mapping for a VM
+ * @guest_addr: Location in guest address space to place the memory
+ * @flags: See &enum gunyah_map_flags.
+ * @guest_mem_fd: File descriptor created by GUNYAH_CREATE_GUEST_MEM
+ * @offset: Offset into the guest memory file
+ * @size: Size, in bytes, of the mapping
+ */
+struct gunyah_map_mem_args {
+ __u64 guest_addr;
+ __u32 flags;
+ __u32 guest_mem_fd;
+ __u64 offset;
+ __u64 size;
+};
+
+#define GUNYAH_VM_MAP_MEM _IOW(GUNYAH_IOCTL_TYPE, 0x9, struct gunyah_map_mem_args)
+
/*
* ioctls for vCPU fds
*/
--
2.34.1
Tell the resource manager to enable demand paging, and wire up vCPU faults
to provide the backing folio when a guestmemfd is bound to the faulting
address.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/guest_memfd.c | 115 ++++++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/gunyah_vcpu.c | 39 ++++++++++---
drivers/virt/gunyah/vm_mgr.c | 17 ++++++
drivers/virt/gunyah/vm_mgr.h | 3 +
4 files changed, 166 insertions(+), 8 deletions(-)
diff --git a/drivers/virt/gunyah/guest_memfd.c b/drivers/virt/gunyah/guest_memfd.c
index 5eeac6ac451e..4696ff4c7c22 100644
--- a/drivers/virt/gunyah/guest_memfd.c
+++ b/drivers/virt/gunyah/guest_memfd.c
@@ -843,3 +843,118 @@ int gunyah_gmem_reclaim_parcel(struct gunyah_vm *ghvm,
return 0;
}
+
+int gunyah_gmem_setup_demand_paging(struct gunyah_vm *ghvm)
+{
+ struct gunyah_rm_mem_entry *entries;
+ struct gunyah_gmem_binding *b;
+ unsigned long index = 0;
+ u32 count = 0, i;
+ int ret = 0;
+
+ down_read(&ghvm->bindings_lock);
+ mt_for_each(&ghvm->bindings, b, index, ULONG_MAX)
+ if (gunyah_guest_mem_is_lend(ghvm, b->flags))
+ count++;
+
+ if (!count)
+ goto out;
+
+ entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
+ if (!entries) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ index = i = 0;
+ mt_for_each(&ghvm->bindings, b, index, ULONG_MAX) {
+ if (!gunyah_guest_mem_is_lend(ghvm, b->flags))
+ continue;
+ entries[i].phys_addr = cpu_to_le64(gunyah_gfn_to_gpa(b->gfn));
+ entries[i].size = cpu_to_le64(b->nr << PAGE_SHIFT);
+ if (++i == count)
+ break;
+ }
+
+ ret = gunyah_rm_vm_set_demand_paging(ghvm->rm, ghvm->vmid, i, entries);
+ kfree(entries);
+out:
+ up_read(&ghvm->bindings_lock);
+ return ret;
+}
+
+static bool folio_mmapped(struct folio *folio)
+{
+ struct address_space *mapping = folio->mapping;
+ struct vm_area_struct *vma;
+ bool ret = false;
+
+ i_mmap_lock_read(mapping);
+ vma_interval_tree_foreach(vma, &mapping->i_mmap, folio_index(folio),
+ folio_index(folio) + folio_nr_pages(folio)) {
+ ret = true;
+ break;
+ }
+ i_mmap_unlock_read(mapping);
+ return ret;
+}
+
+int gunyah_gmem_demand_page(struct gunyah_vm *ghvm, u64 gpa, bool write)
+{
+ unsigned long gfn = gunyah_gpa_to_gfn(gpa);
+ struct gunyah_gmem_binding *b;
+ struct folio *folio;
+ int ret;
+
+ down_read(&ghvm->bindings_lock);
+ b = mtree_load(&ghvm->bindings, gfn);
+ if (!b) {
+ ret = -ENOENT;
+ goto unlock;
+ }
+
+ if (write && !(b->flags & GUNYAH_MEM_ALLOW_WRITE)) {
+ ret = -EPERM;
+ goto unlock;
+ }
+
+ folio = gunyah_gmem_get_folio(file_inode(b->file), gunyah_gfn_to_off(b, gfn));
+ if (IS_ERR(folio)) {
+ ret = PTR_ERR(folio);
+ pr_err_ratelimited(
+ "Failed to obtain memory for guest addr %016llx: %d\n",
+ gpa, ret);
+ goto unlock;
+ }
+
+ if (gunyah_guest_mem_is_lend(ghvm, b->flags) &&
+ (folio_mapped(folio) || folio_mmapped(folio))) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ /*
+ * The folio covers the requested guest address, but the folio may not
+ * start at the requested guest address. Recompute the gfn based on
+ * the folio itself.
+ */
+ gfn = gunyah_off_to_gfn(b, folio_index(folio));
+
+ ret = gunyah_vm_provide_folio(ghvm, folio, gfn,
+ !gunyah_guest_mem_is_lend(ghvm, b->flags),
+ !!(b->flags & GUNYAH_MEM_ALLOW_WRITE));
+ if (ret) {
+ if (ret != -EAGAIN)
+ pr_err_ratelimited(
+ "Failed to provide folio for guest addr: %016llx: %d\n",
+ gpa, ret);
+ goto out;
+ }
+out:
+ folio_unlock(folio);
+ folio_put(folio);
+unlock:
+ up_read(&ghvm->bindings_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_gmem_demand_page);
diff --git a/drivers/virt/gunyah/gunyah_vcpu.c b/drivers/virt/gunyah/gunyah_vcpu.c
index b636b54dc9a1..f01e6d6163ba 100644
--- a/drivers/virt/gunyah/gunyah_vcpu.c
+++ b/drivers/virt/gunyah/gunyah_vcpu.c
@@ -89,29 +89,44 @@ static irqreturn_t gunyah_vcpu_irq_handler(int irq, void *data)
return IRQ_HANDLED;
}
-static void gunyah_handle_page_fault(
+static bool gunyah_handle_page_fault(
struct gunyah_vcpu *vcpu,
const struct gunyah_hypercall_vcpu_run_resp *vcpu_run_resp)
{
u64 addr = vcpu_run_resp->state_data[0];
+ bool write = !!vcpu_run_resp->state_data[1];
+ int ret = 0;
+
+ ret = gunyah_gmem_demand_page(vcpu->ghvm, addr, write);
+ if (!ret || ret == -EAGAIN)
+ return true;
vcpu->vcpu_run->page_fault.resume_action = GUNYAH_VCPU_RESUME_FAULT;
- vcpu->vcpu_run->page_fault.attempt = 0;
+ vcpu->vcpu_run->page_fault.attempt = ret;
vcpu->vcpu_run->page_fault.phys_addr = addr;
vcpu->vcpu_run->exit_reason = GUNYAH_VCPU_EXIT_PAGE_FAULT;
+ return false;
}
-static void
-gunyah_handle_mmio(struct gunyah_vcpu *vcpu,
+static bool
+gunyah_handle_mmio(struct gunyah_vcpu *vcpu, unsigned long resume_data[3],
const struct gunyah_hypercall_vcpu_run_resp *vcpu_run_resp)
{
u64 addr = vcpu_run_resp->state_data[0],
len = vcpu_run_resp->state_data[1],
data = vcpu_run_resp->state_data[2];
+ int ret;
if (WARN_ON(len > sizeof(u64)))
len = sizeof(u64);
+ ret = gunyah_gmem_demand_page(vcpu->ghvm, addr,
+ vcpu_run_resp->state == GUNYAH_VCPU_ADDRSPACE_VMMIO_WRITE);
+ if (!ret || ret == -EAGAIN) {
+ resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_RETRY;
+ return true;
+ }
+
if (vcpu_run_resp->state == GUNYAH_VCPU_ADDRSPACE_VMMIO_READ) {
vcpu->vcpu_run->mmio.is_write = 0;
/* Record that we need to give vCPU user's supplied value next gunyah_vcpu_run() */
@@ -127,6 +142,8 @@ gunyah_handle_mmio(struct gunyah_vcpu *vcpu,
vcpu->mmio_addr = vcpu->vcpu_run->mmio.phys_addr = addr;
vcpu->vcpu_run->mmio.len = len;
vcpu->vcpu_run->exit_reason = GUNYAH_VCPU_EXIT_MMIO;
+
+ return false;
}
static int gunyah_handle_mmio_resume(struct gunyah_vcpu *vcpu,
@@ -147,6 +164,8 @@ static int gunyah_handle_mmio_resume(struct gunyah_vcpu *vcpu,
resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_FAULT;
break;
case GUNYAH_VCPU_RESUME_RETRY:
+ gunyah_gmem_demand_page(vcpu->ghvm, vcpu->mmio_addr,
+ vcpu->state == GUNYAH_VCPU_RUN_STATE_MMIO_WRITE);
resume_data[1] = GUNYAH_ADDRSPACE_VMMIO_ACTION_RETRY;
break;
default:
@@ -309,11 +328,15 @@ static int gunyah_vcpu_run(struct gunyah_vcpu *vcpu)
break;
case GUNYAH_VCPU_ADDRSPACE_VMMIO_READ:
case GUNYAH_VCPU_ADDRSPACE_VMMIO_WRITE:
- gunyah_handle_mmio(vcpu, &vcpu_run_resp);
- goto out;
+ if (!gunyah_handle_mmio(vcpu, resume_data,
+ &vcpu_run_resp))
+ goto out;
+ break;
case GUNYAH_VCPU_ADDRSPACE_PAGE_FAULT:
- gunyah_handle_page_fault(vcpu, &vcpu_run_resp);
- goto out;
+ if (!gunyah_handle_page_fault(vcpu,
+ &vcpu_run_resp))
+ goto out;
+ break;
default:
pr_warn_ratelimited(
"Unknown vCPU state: %llx\n",
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 4379b5ba151e..3b767eeeb7c2 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -471,6 +471,23 @@ static int gunyah_vm_start(struct gunyah_vm *ghvm)
goto err;
}
+ ret = gunyah_gmem_setup_demand_paging(ghvm);
+ if (ret) {
+ dev_warn(ghvm->parent,
+ "Failed to set up gmem demand paging: %d\n", ret);
+ goto err;
+ }
+
+ ret = gunyah_rm_vm_set_address_layout(
+ ghvm->rm, ghvm->vmid, GUNYAH_RM_RANGE_ID_IMAGE,
+ ghvm->dtb.parcel_start << PAGE_SHIFT,
+ ghvm->dtb.parcel_pages << PAGE_SHIFT);
+ if (ret) {
+ dev_warn(ghvm->parent,
+ "Failed to set location of DTB mem parcel: %d\n", ret);
+ goto err;
+ }
+
ret = gunyah_rm_vm_init(ghvm->rm, ghvm->vmid);
if (ret) {
ghvm->vm_status = GUNYAH_RM_VM_STATUS_INIT_FAILED;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index b2ab2f1bda3a..474ac866d237 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -137,4 +137,7 @@ int gunyah_gmem_reclaim_parcel(struct gunyah_vm *ghvm,
struct gunyah_rm_mem_parcel *parcel, u64 gfn,
u64 nr);
+int gunyah_gmem_setup_demand_paging(struct gunyah_vm *ghvm);
+int gunyah_gmem_demand_page(struct gunyah_vm *ghvm, u64 gpa, bool write);
+
#endif
--
2.34.1
Some VM functions need to acquire Gunyah resources. For instance, Gunyah
vCPUs are exposed to the host as a resource. The Gunyah vCPU function
will register a resource ticket and be able to interact with the
hypervisor once the resource ticket is filled.
Resource tickets are the mechanism for functions to acquire ownership of
Gunyah resources. Gunyah functions can be created before the VM's
resources are created and made available to Linux. A resource ticket
identifies a type of resource and a label of a resource which the ticket
holder is interested in.
Resources are created by Gunyah as configured in the VM's devicetree
configuration. Gunyah does not interpret the label, which makes it
possible for userspace to create multiple resources with the same label.
Resource ticket owners need to be prepared for populate to be called
multiple times if userspace created multiple resources with the same
label.
Reviewed-by: Alex Elder <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 128 ++++++++++++++++++++++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.h | 7 +++
include/linux/gunyah.h | 39 +++++++++++++
3 files changed, 173 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index f6e6b5669aae..65badcf6357b 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -15,6 +15,106 @@
#include "rsc_mgr.h"
#include "vm_mgr.h"
+int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
+ struct gunyah_vm_resource_ticket *ticket)
+{
+ struct gunyah_vm_resource_ticket *iter;
+ struct gunyah_resource *ghrsc, *rsc_iter;
+ int ret = 0;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry(iter, &ghvm->resource_tickets, vm_list) {
+ if (iter->resource_type == ticket->resource_type &&
+ iter->label == ticket->label) {
+ ret = -EEXIST;
+ goto out;
+ }
+ }
+
+ if (!try_module_get(ticket->owner)) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ list_add(&ticket->vm_list, &ghvm->resource_tickets);
+ INIT_LIST_HEAD(&ticket->resources);
+
+ list_for_each_entry_safe(ghrsc, rsc_iter, &ghvm->resources, list) {
+ if (ghrsc->type == ticket->resource_type &&
+ ghrsc->rm_label == ticket->label) {
+ if (ticket->populate(ticket, ghrsc))
+ list_move(&ghrsc->list, &ticket->resources);
+ }
+ }
+out:
+ mutex_unlock(&ghvm->resources_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_add_resource_ticket);
+
+void gunyah_vm_remove_resource_ticket(struct gunyah_vm *ghvm,
+ struct gunyah_vm_resource_ticket *ticket)
+{
+ struct gunyah_resource *ghrsc, *iter;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry_safe(ghrsc, iter, &ticket->resources, list) {
+ ticket->unpopulate(ticket, ghrsc);
+ list_move(&ghrsc->list, &ghvm->resources);
+ }
+
+ module_put(ticket->owner);
+ list_del(&ticket->vm_list);
+ mutex_unlock(&ghvm->resources_lock);
+}
+EXPORT_SYMBOL_GPL(gunyah_vm_remove_resource_ticket);
+
+static void gunyah_vm_add_resource(struct gunyah_vm *ghvm,
+ struct gunyah_resource *ghrsc)
+{
+ struct gunyah_vm_resource_ticket *ticket;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry(ticket, &ghvm->resource_tickets, vm_list) {
+ if (ghrsc->type == ticket->resource_type &&
+ ghrsc->rm_label == ticket->label) {
+ if (ticket->populate(ticket, ghrsc))
+ list_add(&ghrsc->list, &ticket->resources);
+ else
+ list_add(&ghrsc->list, &ghvm->resources);
+ /* Unconditional: we prevent multiple identical
+ * resource tickets, so there cannot be another
+ * ticket elsewhere in the list if populate() failed.
+ */
+ goto found;
+ }
+ }
+ list_add(&ghrsc->list, &ghvm->resources);
+found:
+ mutex_unlock(&ghvm->resources_lock);
+}
+
+static void gunyah_vm_clean_resources(struct gunyah_vm *ghvm)
+{
+ struct gunyah_vm_resource_ticket *ticket, *titer;
+ struct gunyah_resource *ghrsc, *riter;
+
+ mutex_lock(&ghvm->resources_lock);
+ if (!list_empty(&ghvm->resource_tickets)) {
+ dev_warn(ghvm->parent, "Dangling resource tickets:\n");
+ list_for_each_entry_safe(ticket, titer, &ghvm->resource_tickets,
+ vm_list) {
+ dev_warn(ghvm->parent, " %pS\n", ticket->populate);
+ gunyah_vm_remove_resource_ticket(ghvm, ticket);
+ }
+ }
+
+ list_for_each_entry_safe(ghrsc, riter, &ghvm->resources, list) {
+ gunyah_rm_free_resource(ghrsc);
+ }
+ mutex_unlock(&ghvm->resources_lock);
+}
+
static int gunyah_vm_rm_notification_status(struct gunyah_vm *ghvm, void *data)
{
struct gunyah_rm_vm_status_payload *payload = data;
@@ -92,13 +192,18 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
init_rwsem(&ghvm->status_lock);
init_waitqueue_head(&ghvm->vm_status_wait);
ghvm->vm_status = GUNYAH_RM_VM_STATUS_NO_STATE;
+ mutex_init(&ghvm->resources_lock);
+ INIT_LIST_HEAD(&ghvm->resources);
+ INIT_LIST_HEAD(&ghvm->resource_tickets);
return ghvm;
}
static int gunyah_vm_start(struct gunyah_vm *ghvm)
{
- int ret;
+ struct gunyah_rm_hyp_resources *resources;
+ struct gunyah_resource *ghrsc;
+ int ret, i, n;
down_write(&ghvm->status_lock);
if (ghvm->vm_status != GUNYAH_RM_VM_STATUS_NO_STATE) {
@@ -134,6 +239,25 @@ static int gunyah_vm_start(struct gunyah_vm *ghvm)
}
ghvm->vm_status = GUNYAH_RM_VM_STATUS_READY;
+ ret = gunyah_rm_get_hyp_resources(ghvm->rm, ghvm->vmid, &resources);
+ if (ret) {
+ dev_warn(ghvm->parent,
+ "Failed to get hypervisor resources for VM: %d\n",
+ ret);
+ goto err;
+ }
+
+ for (i = 0, n = le32_to_cpu(resources->n_entries); i < n; i++) {
+ ghrsc = gunyah_rm_alloc_resource(ghvm->rm,
+ &resources->entries[i]);
+ if (!ghrsc) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ gunyah_vm_add_resource(ghvm, ghrsc);
+ }
+
ret = gunyah_rm_vm_start(ghvm->rm, ghvm->vmid);
if (ret) {
dev_warn(ghvm->parent, "Failed to start VM: %d\n", ret);
@@ -209,6 +333,8 @@ static int gunyah_vm_release(struct inode *inode, struct file *filp)
if (ghvm->vm_status == GUNYAH_RM_VM_STATUS_RUNNING)
gunyah_vm_stop(ghvm);
+ gunyah_vm_clean_resources(ghvm);
+
if (ghvm->vm_status != GUNYAH_RM_VM_STATUS_NO_STATE &&
ghvm->vm_status != GUNYAH_RM_VM_STATUS_LOAD &&
ghvm->vm_status != GUNYAH_RM_VM_STATUS_RESET) {
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index e6cc9aead0b6..0d291f722885 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -26,6 +26,9 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
* @vm_status: Current state of the VM, as last reported by RM
* @vm_status_wait: Wait queue for status @vm_status changes
* @status_lock: Serializing state transitions
+ * @resources_lock: Serializing addition of resources and resource tickets
+ * @resources: List of &struct gunyah_resource that are associated with this VM
+ * @resource_tickets: List of &struct gunyah_vm_resource_ticket
* @auth: Authentication mechanism to be used by resource manager when
* launching the VM
*
@@ -39,9 +42,13 @@ struct gunyah_vm {
enum gunyah_rm_vm_status vm_status;
wait_queue_head_t vm_status_wait;
struct rw_semaphore status_lock;
+ struct mutex resources_lock;
+ struct list_head resources;
+ struct list_head resource_tickets;
struct device *parent;
enum gunyah_rm_vm_auth_mechanism auth;
};
#endif
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index ede8abb1b276..001769100260 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -10,6 +10,7 @@
#include <linux/errno.h>
#include <linux/interrupt.h>
#include <linux/limits.h>
+#include <linux/list.h>
#include <linux/types.h>
/* Matches resource manager's resource types for VM_GET_HYP_RESOURCES RPC */
@@ -34,6 +35,44 @@ struct gunyah_resource {
u32 rm_label;
};
+struct gunyah_vm;
+
+/**
+ * struct gunyah_vm_resource_ticket - Represents a ticket to reserve access to VM resource(s)
+ * @vm_list: for @gunyah_vm->resource_tickets
+ * @resources: List of resource(s) associated with this ticket
+ * (members are from @gunyah_resource->list)
+ * @resource_type: Type of resource this ticket reserves
+ * @label: Label of the resource from resource manager this ticket reserves.
+ * @owner: owner of the ticket
+ * @populate: callback provided by the ticket owner and called when a resource is found that
+ * matches @resource_type and @label. Note that this callback could be called
+ * multiple times if userspace created multiple resources with the same type/label.
+ * This callback may also be called long after gunyah_vm_add_resource_ticket(),
+ * since gunyah_vm_add_resource_ticket() can be called before the VM starts.
+ * @unpopulate: callback provided by the ticket owner and called when the ticket owner should no
+ * longer use the resource provided in the argument. When unpopulate() returns,
+ * the ticket owner must not use the resource any more, as the resource
+ * may be freed.
+ */
+struct gunyah_vm_resource_ticket {
+ struct list_head vm_list;
+ struct list_head resources;
+ enum gunyah_resource_type resource_type;
+ u32 label;
+
+ struct module *owner;
+ bool (*populate)(struct gunyah_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc);
+ void (*unpopulate)(struct gunyah_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc);
+};
+
+int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
+ struct gunyah_vm_resource_ticket *ticket);
+void gunyah_vm_remove_resource_ticket(struct gunyah_vm *ghvm,
+ struct gunyah_vm_resource_ticket *ticket);
+
/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
#define GUNYAH_CAPID_INVAL U64_MAX
--
2.34.1
Memory for Gunyah virtual machines is provided by a
Gunyah guestmemfd. Because memory given to virtual machines may be
unmapped at stage-2 from the host (i.e. in the hypervisor's page tables
for the host), special care needs to be taken to ensure that the kernel
doesn't have a page mapped when it is lent to the guest. Without this
tracking, a kernel panic could be induced by userspace tricking the
kernel into accessing guest-private memory.
Introduce the basic guestmemfd ops and ioctl. Userspace should be able
to access the memory unless it is provided to the guest virtual machine:
this is necessary to allow userspace to preload binaries such as the
kernel Image prior to running the VM. Subsequent commits will wire up
providing the memory to the guest.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/guest_memfd.c | 279 ++++++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.c | 9 ++
drivers/virt/gunyah/vm_mgr.h | 2 +
include/uapi/linux/gunyah.h | 19 +++
5 files changed, 310 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index a6c6f29b887a..c4505fce177d 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o guest_memfd.o
obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
diff --git a/drivers/virt/gunyah/guest_memfd.c b/drivers/virt/gunyah/guest_memfd.c
new file mode 100644
index 000000000000..73a3f1368081
--- /dev/null
+++ b/drivers/virt/gunyah/guest_memfd.c
@@ -0,0 +1,279 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gunyah_guest_mem: " fmt
+
+#include <linux/anon_inodes.h>
+#include <linux/types.h>
+#include <linux/falloc.h>
+#include <linux/file.h>
+#include <linux/migrate.h>
+#include <linux/pagemap.h>
+
+#include <uapi/linux/gunyah.h>
+
+#include "vm_mgr.h"
+
+static struct folio *gunyah_gmem_get_huge_folio(struct inode *inode,
+ pgoff_t index)
+{
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ unsigned long huge_index = round_down(index, HPAGE_PMD_NR);
+ unsigned long flags = (unsigned long)inode->i_private;
+ struct address_space *mapping = inode->i_mapping;
+ gfp_t gfp = mapping_gfp_mask(mapping);
+ struct folio *folio;
+
+ if (!(flags & GHMF_ALLOW_HUGEPAGE))
+ return NULL;
+
+ if (filemap_range_has_page(mapping, huge_index << PAGE_SHIFT,
+ (huge_index + HPAGE_PMD_NR - 1)
+ << PAGE_SHIFT))
+ return NULL;
+
+ folio = filemap_alloc_folio(gfp, HPAGE_PMD_ORDER);
+ if (!folio)
+ return NULL;
+
+ if (filemap_add_folio(mapping, folio, huge_index, gfp)) {
+ folio_put(folio);
+ return NULL;
+ }
+
+ return folio;
+#else
+ return NULL;
+#endif
+}
+
+static struct folio *gunyah_gmem_get_folio(struct inode *inode, pgoff_t index)
+{
+ struct folio *folio;
+
+ folio = gunyah_gmem_get_huge_folio(inode, index);
+ if (!folio) {
+ folio = filemap_grab_folio(inode->i_mapping, index);
+ if (IS_ERR_OR_NULL(folio))
+ return NULL;
+ }
+
+ /*
+ * Use the up-to-date flag to track whether or not the memory has been
+ * zeroed before being handed off to the guest. There is no backing
+ * storage for the memory, so the folio will remain up-to-date until
+ * it's removed.
+ */
+ if (!folio_test_uptodate(folio)) {
+ unsigned long nr_pages = folio_nr_pages(folio);
+ unsigned long i;
+
+ for (i = 0; i < nr_pages; i++)
+ clear_highpage(folio_page(folio, i));
+
+ folio_mark_uptodate(folio);
+ }
+
+ /*
+ * Ignore accessed, referenced, and dirty flags. The memory is
+ * unevictable and there is no storage to write back to.
+ */
+ return folio;
+}
+
+static vm_fault_t gunyah_gmem_host_fault(struct vm_fault *vmf)
+{
+ struct folio *folio;
+
+ folio = gunyah_gmem_get_folio(file_inode(vmf->vma->vm_file),
+ vmf->pgoff);
+ if (!folio)
+ return VM_FAULT_SIGBUS;
+
+ if (folio_test_private(folio)) {
+ folio_unlock(folio);
+ folio_put(folio);
+ return VM_FAULT_SIGBUS;
+ }
+
+ vmf->page = folio_file_page(folio, vmf->pgoff);
+
+ return VM_FAULT_LOCKED;
+}
+
+static const struct vm_operations_struct gunyah_gmem_vm_ops = {
+ .fault = gunyah_gmem_host_fault,
+};
+
+static int gunyah_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ file_accessed(file);
+ vma->vm_ops = &gunyah_gmem_vm_ops;
+ return 0;
+}
+
+/**
+ * gunyah_gmem_punch_hole() - try to reclaim a range of pages
+ * @inode: guest memfd inode
+ * @offset: Offset into memfd to start reclaim
+ * @len: length to reclaim
+ *
+ * Will try to unmap from virtual machines any folios covered by
+ * [offset, offset+len). If unmapped, then tries to free those folios.
+ *
+ * Return: 0 on success, or an error code if any folio in the range couldn't be freed.
+ */
+static long gunyah_gmem_punch_hole(struct inode *inode, loff_t offset,
+ loff_t len)
+{
+ truncate_inode_pages_range(inode->i_mapping, offset, offset + len - 1);
+
+ return 0;
+}
+
+static long gunyah_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
+{
+ struct address_space *mapping = inode->i_mapping;
+ pgoff_t start, index, end;
+ int r;
+
+ /* Dedicated guest is immutable by default. */
+ if (offset + len > i_size_read(inode))
+ return -EINVAL;
+
+ filemap_invalidate_lock_shared(mapping);
+
+ start = offset >> PAGE_SHIFT;
+ end = (offset + len) >> PAGE_SHIFT;
+
+ r = 0;
+ for (index = start; index < end;) {
+ struct folio *folio;
+
+ if (signal_pending(current)) {
+ r = -EINTR;
+ break;
+ }
+
+ folio = gunyah_gmem_get_folio(inode, index);
+ if (!folio) {
+ r = -ENOMEM;
+ break;
+ }
+
+ index = folio_next_index(folio);
+
+ folio_unlock(folio);
+ folio_put(folio);
+
+ /* 64-bit only, wrapping the index should be impossible. */
+ if (WARN_ON_ONCE(!index))
+ break;
+
+ cond_resched();
+ }
+
+ filemap_invalidate_unlock_shared(mapping);
+
+ return r;
+}
+
+static long gunyah_gmem_fallocate(struct file *file, int mode, loff_t offset,
+ loff_t len)
+{
+ long ret;
+
+ if (!(mode & FALLOC_FL_KEEP_SIZE))
+ return -EOPNOTSUPP;
+
+ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
+ FALLOC_FL_ZERO_RANGE))
+ return -EOPNOTSUPP;
+
+ if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len))
+ return -EINVAL;
+
+ if (mode & FALLOC_FL_PUNCH_HOLE)
+ ret = gunyah_gmem_punch_hole(file_inode(file), offset, len);
+ else
+ ret = gunyah_gmem_allocate(file_inode(file), offset, len);
+
+ if (!ret)
+ file_modified(file);
+ return ret;
+}
+
+static int gunyah_gmem_release(struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static const struct file_operations gunyah_gmem_fops = {
+ .owner = THIS_MODULE,
+ .llseek = generic_file_llseek,
+ .mmap = gunyah_gmem_mmap,
+ .open = generic_file_open,
+ .fallocate = gunyah_gmem_fallocate,
+ .release = gunyah_gmem_release,
+};
+
+static const struct address_space_operations gunyah_gmem_aops = {
+ .dirty_folio = noop_dirty_folio,
+ .migrate_folio = migrate_folio,
+ .error_remove_folio = generic_error_remove_folio,
+};
+
+int gunyah_guest_mem_create(struct gunyah_create_mem_args *args)
+{
+ const char *anon_name = "[gh-gmem]";
+ unsigned long fd_flags = 0;
+ struct inode *inode;
+ struct file *file;
+ int fd, err;
+
+ if (!PAGE_ALIGNED(args->size))
+ return -EINVAL;
+
+ if (args->flags & ~(GHMF_CLOEXEC | GHMF_ALLOW_HUGEPAGE))
+ return -EINVAL;
+
+ if (args->flags & GHMF_CLOEXEC)
+ fd_flags |= O_CLOEXEC;
+
+ fd = get_unused_fd_flags(fd_flags);
+ if (fd < 0)
+ return fd;
+
+ /*
+ * Use the so called "secure" variant, which creates a unique inode
+ * instead of reusing a single inode. Each guest_memfd instance needs
+ * its own inode to track the size, flags, etc.
+ */
+ file = anon_inode_create_getfile(anon_name, &gunyah_gmem_fops, NULL,
+ O_RDWR, NULL);
+ if (IS_ERR(file)) {
+ err = PTR_ERR(file);
+ goto err_fd;
+ }
+
+ file->f_flags |= O_LARGEFILE;
+
+ inode = file->f_inode;
+ WARN_ON(file->f_mapping != inode->i_mapping);
+
+ inode->i_private = (void *)(unsigned long)args->flags;
+ inode->i_mapping->a_ops = &gunyah_gmem_aops;
+ inode->i_mode |= S_IFREG;
+ inode->i_size = args->size;
+ mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+ mapping_set_large_folios(inode->i_mapping);
+ mapping_set_unmovable(inode->i_mapping);
+ /* Unmovable mappings are supposed to be marked unevictable as well. */
+ WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+ fd_install(fd, file);
+ return fd;
+
+err_fd:
+ put_unused_fd(fd);
+ return err;
+}
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 26b6dce49970..33751d5cddd2 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -687,6 +687,15 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
switch (cmd) {
case GUNYAH_CREATE_VM:
return gunyah_dev_ioctl_create_vm(rm, arg);
+ case GUNYAH_CREATE_GUEST_MEM: {
+ struct gunyah_create_mem_args args;
+
+ if (copy_from_user(&args, (const void __user *)arg,
+ sizeof(args)))
+ return -EFAULT;
+
+ return gunyah_guest_mem_create(&args);
+ }
default:
return -ENOTTY;
}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index e500f6eb014e..055990842959 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -109,4 +109,6 @@ int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn);
int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr);
+int gunyah_guest_mem_create(struct gunyah_create_mem_args *args);
+
#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 46f7d3aa61d0..c5f506350364 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -20,6 +20,25 @@
*/
#define GUNYAH_CREATE_VM _IO(GUNYAH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
+enum gunyah_mem_flags {
+ GHMF_CLOEXEC = (1UL << 0),
+ GHMF_ALLOW_HUGEPAGE = (1UL << 1),
+};
+
+/**
+ * struct gunyah_create_mem_args - Description of guest memory to create
+ * @flags: See GHMF_*.
+ * @size: Size in bytes of the guest memory; must be page-aligned
+ * @reserved: Reserved for future use
+ */
+struct gunyah_create_mem_args {
+ __u64 flags;
+ __u64 size;
+ __u64 reserved[6];
+};
+
+#define GUNYAH_CREATE_GUEST_MEM \
+ _IOW(GUNYAH_IOCTL_TYPE, 0x8, \
+ struct gunyah_create_mem_args) /* Returns a Gunyah memory fd */
+
/*
* ioctls for gunyah-vm fds (returned by GUNYAH_CREATE_VM)
*/
--
2.34.1
Add Gunyah Resource Manager RPC interfaces to launch an unauthenticated
virtual machine.
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/rsc_mgr.h | 78 +++++++++++++
drivers/virt/gunyah/rsc_mgr_rpc.c | 238 ++++++++++++++++++++++++++++++++++++++
3 files changed, 317 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index ceccbbe68b38..47f1fae5419b 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-gunyah_rsc_mgr-y += rsc_mgr.o vm_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 21318ef25040..205b9ea735e5 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -20,6 +20,84 @@ int gunyah_rm_notifier_unregister(struct gunyah_rm *rm,
struct device *gunyah_rm_get(struct gunyah_rm *rm);
void gunyah_rm_put(struct gunyah_rm *rm);
+struct gunyah_rm_vm_exited_payload {
+ __le16 vmid;
+ __le16 exit_type;
+ __le32 exit_reason_size;
+ u8 exit_reason[];
+} __packed;
+
+enum gunyah_rm_notification_id {
+ /* clang-format off */
+ GUNYAH_RM_NOTIFICATION_VM_EXITED = 0x56100001,
+ GUNYAH_RM_NOTIFICATION_VM_STATUS = 0x56100008,
+ /* clang-format on */
+};
+
+enum gunyah_rm_vm_status {
+ /* clang-format off */
+ GUNYAH_RM_VM_STATUS_NO_STATE = 0,
+ GUNYAH_RM_VM_STATUS_INIT = 1,
+ GUNYAH_RM_VM_STATUS_READY = 2,
+ GUNYAH_RM_VM_STATUS_RUNNING = 3,
+ GUNYAH_RM_VM_STATUS_PAUSED = 4,
+ GUNYAH_RM_VM_STATUS_LOAD = 5,
+ GUNYAH_RM_VM_STATUS_AUTH = 6,
+ GUNYAH_RM_VM_STATUS_INIT_FAILED = 8,
+ GUNYAH_RM_VM_STATUS_EXITED = 9,
+ GUNYAH_RM_VM_STATUS_RESETTING = 10,
+ GUNYAH_RM_VM_STATUS_RESET = 11,
+ /* clang-format on */
+};
+
+struct gunyah_rm_vm_status_payload {
+ __le16 vmid;
+ u16 reserved;
+ u8 vm_status;
+ u8 os_status;
+ __le16 app_status;
+} __packed;
+
+int gunyah_rm_alloc_vmid(struct gunyah_rm *rm, u16 vmid);
+int gunyah_rm_dealloc_vmid(struct gunyah_rm *rm, u16 vmid);
+int gunyah_rm_vm_reset(struct gunyah_rm *rm, u16 vmid);
+int gunyah_rm_vm_start(struct gunyah_rm *rm, u16 vmid);
+int gunyah_rm_vm_stop(struct gunyah_rm *rm, u16 vmid);
+
+enum gunyah_rm_vm_auth_mechanism {
+ /* clang-format off */
+ GUNYAH_RM_VM_AUTH_NONE = 0,
+ GUNYAH_RM_VM_AUTH_QCOM_PIL_ELF = 1,
+ GUNYAH_RM_VM_AUTH_QCOM_ANDROID_PVM = 2,
+ /* clang-format on */
+};
+
+int gunyah_rm_vm_configure(struct gunyah_rm *rm, u16 vmid,
+ enum gunyah_rm_vm_auth_mechanism auth_mechanism,
+ u32 mem_handle, u64 image_offset, u64 image_size,
+ u64 dtb_offset, u64 dtb_size);
+int gunyah_rm_vm_init(struct gunyah_rm *rm, u16 vmid);
+
+struct gunyah_rm_hyp_resource {
+ u8 type;
+ u8 reserved;
+ __le16 partner_vmid;
+ __le32 resource_handle;
+ __le32 resource_label;
+ __le64 cap_id;
+ __le32 virq_handle;
+ __le32 virq;
+ __le64 base;
+ __le64 size;
+} __packed;
+
+struct gunyah_rm_hyp_resources {
+ __le32 n_entries;
+ struct gunyah_rm_hyp_resource entries[];
+} __packed;
+
+int gunyah_rm_get_hyp_resources(struct gunyah_rm *rm, u16 vmid,
+ struct gunyah_rm_hyp_resources **resources);
int gunyah_rm_call(struct gunyah_rm *rsc_mgr, u32 message_id,
const void *req_buf, size_t req_buf_size, void **resp_buf,
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
new file mode 100644
index 000000000000..141ce0145e91
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -0,0 +1,238 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include "rsc_mgr.h"
+
+/* Message IDs: VM Management */
+/* clang-format off */
+#define GUNYAH_RM_RPC_VM_ALLOC_VMID 0x56000001
+#define GUNYAH_RM_RPC_VM_DEALLOC_VMID 0x56000002
+#define GUNYAH_RM_RPC_VM_START 0x56000004
+#define GUNYAH_RM_RPC_VM_STOP 0x56000005
+#define GUNYAH_RM_RPC_VM_RESET 0x56000006
+#define GUNYAH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
+#define GUNYAH_RM_RPC_VM_INIT 0x5600000B
+#define GUNYAH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
+/* clang-format on */
+
+struct gunyah_rm_vm_common_vmid_req {
+ __le16 vmid;
+ __le16 _padding;
+} __packed;
+
+/* Call: VM_ALLOC */
+struct gunyah_rm_vm_alloc_vmid_resp {
+ __le16 vmid;
+ __le16 _padding;
+} __packed;
+
+/* Call: VM_STOP */
+#define GUNYAH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
+
+#define GUNYAH_RM_VM_STOP_REASON_FORCE_STOP 3
+
+struct gunyah_rm_vm_stop_req {
+ __le16 vmid;
+ u8 flags;
+ u8 _padding;
+ __le32 stop_reason;
+} __packed;
+
+/* Call: VM_CONFIG_IMAGE */
+struct gunyah_rm_vm_config_image_req {
+ __le16 vmid;
+ __le16 auth_mech;
+ __le32 mem_handle;
+ __le64 image_offset;
+ __le64 image_size;
+ __le64 dtb_offset;
+ __le64 dtb_size;
+} __packed;
+
+/*
+ * Several RM calls take only a VMID as a parameter and give only standard
+ * response back. Deduplicate boilerplate code by using this common call.
+ */
+static int gunyah_rm_common_vmid_call(struct gunyah_rm *rm, u32 message_id,
+ u16 vmid)
+{
+ struct gunyah_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+
+ return gunyah_rm_call(rm, message_id, &req_payload, sizeof(req_payload),
+ NULL, NULL);
+}
+
+/**
+ * gunyah_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: Use 0 to dynamically allocate a VM. A reserved VMID can be supplied
+ * to request allocation of a platform-defined VM.
+ *
+ * Return: the allocated VMID or negative value on error
+ */
+int gunyah_rm_alloc_vmid(struct gunyah_rm *rm, u16 vmid)
+{
+ struct gunyah_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+ struct gunyah_rm_vm_alloc_vmid_resp *resp_payload;
+ size_t resp_size;
+ void *resp;
+ int ret;
+
+ ret = gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_ALLOC_VMID, &req_payload,
+ sizeof(req_payload), &resp, &resp_size);
+ if (ret)
+ return ret;
+
+ if (!vmid) {
+ resp_payload = resp;
+ ret = le16_to_cpu(resp_payload->vmid);
+ kfree(resp);
+ }
+
+ return ret;
+}
+
+/**
+ * gunyah_rm_dealloc_vmid() - Dispose of a VMID
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gunyah_rm_alloc_vmid
+ */
+int gunyah_rm_dealloc_vmid(struct gunyah_rm *rm, u16 vmid)
+{
+ return gunyah_rm_common_vmid_call(rm, GUNYAH_RM_RPC_VM_DEALLOC_VMID,
+ vmid);
+}
+
+/**
+ * gunyah_rm_vm_reset() - Reset a VM's resources
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gunyah_rm_alloc_vmid
+ *
+ * As part of tearing down the VM, request RM to clean up all the VM resources
+ * associated with the VM. Only after this, Linux can clean up all the
+ * references it maintains to resources.
+ */
+int gunyah_rm_vm_reset(struct gunyah_rm *rm, u16 vmid)
+{
+ return gunyah_rm_common_vmid_call(rm, GUNYAH_RM_RPC_VM_RESET, vmid);
+}
+
+/**
+ * gunyah_rm_vm_start() - Move a VM into "ready to run" state
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gunyah_rm_alloc_vmid
+ *
+ * On VMs which use proxy scheduling, vcpu_run is needed to actually run the VM.
+ * On VMs which use Gunyah's scheduling, the vCPUs start executing in accordance with Gunyah
+ * scheduling policies.
+ */
+int gunyah_rm_vm_start(struct gunyah_rm *rm, u16 vmid)
+{
+ return gunyah_rm_common_vmid_call(rm, GUNYAH_RM_RPC_VM_START, vmid);
+}
+
+/**
+ * gunyah_rm_vm_stop() - Send a request to Resource Manager VM to forcibly stop a VM.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gunyah_rm_alloc_vmid
+ */
+int gunyah_rm_vm_stop(struct gunyah_rm *rm, u16 vmid)
+{
+ struct gunyah_rm_vm_stop_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ .flags = GUNYAH_RM_VM_STOP_FLAG_FORCE_STOP,
+ .stop_reason = cpu_to_le32(GUNYAH_RM_VM_STOP_REASON_FORCE_STOP),
+ };
+
+ return gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_STOP, &req_payload,
+ sizeof(req_payload), NULL, NULL);
+}
+
+/**
+ * gunyah_rm_vm_configure() - Prepare a VM to start and provide the common
+ * configuration needed by RM to configure a VM
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gunyah_rm_alloc_vmid
+ * @auth_mechanism: Authentication mechanism used by resource manager to verify
+ * the virtual machine
+ * @mem_handle: Handle to a previously shared memparcel that contains all parts
+ * of the VM image subject to authentication.
+ * @image_offset: Start address of VM image, relative to the start of memparcel
+ * @image_size: Size of the VM image
+ * @dtb_offset: Start address of the devicetree binary with VM configuration,
+ * relative to start of memparcel.
+ * @dtb_size: Maximum size of devicetree binary.
+ */
+int gunyah_rm_vm_configure(struct gunyah_rm *rm, u16 vmid,
+ enum gunyah_rm_vm_auth_mechanism auth_mechanism,
+ u32 mem_handle, u64 image_offset, u64 image_size,
+ u64 dtb_offset, u64 dtb_size)
+{
+ struct gunyah_rm_vm_config_image_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ .auth_mech = cpu_to_le16(auth_mechanism),
+ .mem_handle = cpu_to_le32(mem_handle),
+ .image_offset = cpu_to_le64(image_offset),
+ .image_size = cpu_to_le64(image_size),
+ .dtb_offset = cpu_to_le64(dtb_offset),
+ .dtb_size = cpu_to_le64(dtb_size),
+ };
+
+ return gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_CONFIG_IMAGE, &req_payload,
+ sizeof(req_payload), NULL, NULL);
+}
+
+/**
+ * gunyah_rm_vm_init() - Move the VM to initialized state.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier
+ *
+ * RM will allocate needed resources for the VM.
+ */
+int gunyah_rm_vm_init(struct gunyah_rm *rm, u16 vmid)
+{
+ return gunyah_rm_common_vmid_call(rm, GUNYAH_RM_RPC_VM_INIT, vmid);
+}
+
+/**
+ * gunyah_rm_get_hyp_resources() - Retrieve hypervisor resources (capabilities) associated with a VM
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VMID of the other VM to get the resources of
+ * @resources: Set by gunyah_rm_get_hyp_resources and contains the returned hypervisor resources.
+ * Caller must free the resources pointer if successful.
+ */
+int gunyah_rm_get_hyp_resources(struct gunyah_rm *rm, u16 vmid,
+ struct gunyah_rm_hyp_resources **resources)
+{
+ struct gunyah_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+ struct gunyah_rm_hyp_resources *resp;
+ size_t resp_size;
+ int ret;
+
+ ret = gunyah_rm_call(rm, GUNYAH_RM_RPC_VM_GET_HYP_RESOURCES,
+ &req_payload, sizeof(req_payload), (void **)&resp,
+ &resp_size);
+ if (ret)
+ return ret;
+
+ if (!resp_size)
+ return -EBADMSG;
+
+ if (resp_size < struct_size(resp, entries, 0) ||
+ resp_size !=
+ struct_size(resp, entries, le32_to_cpu(resp->n_entries))) {
+ kfree(resp);
+ return -EBADMSG;
+ }
+
+ *resources = resp;
+ return 0;
+}
--
2.34.1
Gunyah doorbells allow a virtual machine to signal another using
interrupts. Add the hypercalls needed to assert the interrupt.
Reviewed-by: Alex Elder <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/gunyah/gunyah_hypercall.c | 38 ++++++++++++++++++++++++++++++++++++
include/linux/gunyah.h | 5 +++++
2 files changed, 43 insertions(+)
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index 38403dc28c66..3c2672d683ae 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -37,6 +37,8 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
/* clang-format off */
#define GUNYAH_HYPERCALL_HYP_IDENTIFY GUNYAH_HYPERCALL(0x8000)
+#define GUNYAH_HYPERCALL_BELL_SEND GUNYAH_HYPERCALL(0x8012)
+#define GUNYAH_HYPERCALL_BELL_SET_MASK GUNYAH_HYPERCALL(0x8015)
#define GUNYAH_HYPERCALL_MSGQ_SEND GUNYAH_HYPERCALL(0x801B)
#define GUNYAH_HYPERCALL_MSGQ_RECV GUNYAH_HYPERCALL(0x801C)
#define GUNYAH_HYPERCALL_ADDRSPACE_MAP GUNYAH_HYPERCALL(0x802B)
@@ -64,6 +66,42 @@ void gunyah_hypercall_hyp_identify(
}
EXPORT_SYMBOL_GPL(gunyah_hypercall_hyp_identify);
+/**
+ * gunyah_hypercall_bell_send() - Assert a Gunyah doorbell
+ * @capid: capability ID of the doorbell
+ * @new_flags: bits to set on the doorbell
+ * @old_flags: Filled with the bits set before the send call if return value is GUNYAH_ERROR_OK
+ */
+enum gunyah_error gunyah_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GUNYAH_HYPERCALL_BELL_SEND, capid, new_flags, 0, &res);
+
+ if (res.a0 == GUNYAH_ERROR_OK && old_flags)
+ *old_flags = res.a1;
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gunyah_hypercall_bell_send);
+
+/**
+ * gunyah_hypercall_bell_set_mask() - Set masks on a Gunyah doorbell
+ * @capid: capability ID of the doorbell
+ * @enable_mask: which bits trigger the receiver interrupt
+ * @ack_mask: which bits are automatically acknowledged when the receiver
+ * interrupt is ack'd
+ */
+enum gunyah_error gunyah_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GUNYAH_HYPERCALL_BELL_SET_MASK, capid, enable_mask, ack_mask, 0, &res);
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gunyah_hypercall_bell_set_mask);
+
/**
* gunyah_hypercall_msgq_send() - Send a buffer on a message queue
* @capid: capability ID of the message queue to add message
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 32ce578220ca..67cb9350ab9e 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -346,6 +346,11 @@ gunyah_api_version(const struct gunyah_hypercall_hyp_identify_resp *gunyah_api)
void gunyah_hypercall_hyp_identify(
struct gunyah_hypercall_hyp_identify_resp *hyp_identity);
+enum gunyah_error gunyah_hypercall_bell_send(u64 capid, u64 new_flags,
+ u64 *old_flags);
+enum gunyah_error gunyah_hypercall_bell_set_mask(u64 capid, u64 enable_mask,
+ u64 ack_mask);
+
/* Immediately raise RX vIRQ on receiver VM */
#define GUNYAH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
--
2.34.1
Three hypercalls are needed to support demand paging.
To create page mappings in a virtual machine's address space, memory
must be moved to a memory extent that is allowed to be mapped into that
address space. Memory extents are Gunyah's implementation of access
control. Once the memory is moved to the proper memory extent, the
memory can be mapped into the VM's address space. Implement the
bindings to perform those hypercalls.
Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/gunyah/gunyah_hypercall.c | 87 ++++++++++++++++++++++++++++++++++++
arch/arm64/include/asm/gunyah.h | 21 +++++++++
include/linux/gunyah.h | 56 +++++++++++++++++++++++
3 files changed, 164 insertions(+)
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index fee21df42c17..38403dc28c66 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -39,6 +39,9 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
#define GUNYAH_HYPERCALL_HYP_IDENTIFY GUNYAH_HYPERCALL(0x8000)
#define GUNYAH_HYPERCALL_MSGQ_SEND GUNYAH_HYPERCALL(0x801B)
#define GUNYAH_HYPERCALL_MSGQ_RECV GUNYAH_HYPERCALL(0x801C)
+#define GUNYAH_HYPERCALL_ADDRSPACE_MAP GUNYAH_HYPERCALL(0x802B)
+#define GUNYAH_HYPERCALL_ADDRSPACE_UNMAP GUNYAH_HYPERCALL(0x802C)
+#define GUNYAH_HYPERCALL_MEMEXTENT_DONATE GUNYAH_HYPERCALL(0x8061)
#define GUNYAH_HYPERCALL_VCPU_RUN GUNYAH_HYPERCALL(0x8065)
/* clang-format on */
@@ -114,6 +117,90 @@ enum gunyah_error gunyah_hypercall_msgq_recv(u64 capid, void *buff, size_t size,
}
EXPORT_SYMBOL_GPL(gunyah_hypercall_msgq_recv);
+/**
+ * gunyah_hypercall_addrspace_map() - Add memory to an address space from a memory extent
+ * @capid: Address space capability ID
+ * @extent_capid: Memory extent capability ID
+ * @vbase: location in address space
+ * @extent_attrs: Attributes for the memory
+ * @flags: Flags for address space mapping
+ * @offset: Offset into memory extent (physical address of memory)
+ * @size: Size of memory to map; must be page-aligned
+ */
+enum gunyah_error gunyah_hypercall_addrspace_map(u64 capid, u64 extent_capid, u64 vbase,
+ u32 extent_attrs, u32 flags, u64 offset, u64 size)
+{
+ struct arm_smccc_1_2_regs args = {
+ .a0 = GUNYAH_HYPERCALL_ADDRSPACE_MAP,
+ .a1 = capid,
+ .a2 = extent_capid,
+ .a3 = vbase,
+ .a4 = extent_attrs,
+ .a5 = flags,
+ .a6 = offset,
+ .a7 = size,
+ /* The C language guarantees this is implicitly zero. Gunyah requires 0, so be explicit */
+ .a8 = 0,
+ };
+ struct arm_smccc_1_2_regs res;
+
+ arm_smccc_1_2_hvc(&args, &res);
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gunyah_hypercall_addrspace_map);
+
+/**
+ * gunyah_hypercall_addrspace_unmap() - Remove memory from an address space
+ * @capid: Address space capability ID
+ * @extent_capid: Memory extent capability ID
+ * @vbase: location in address space
+ * @flags: Flags for address space mapping
+ * @offset: Offset into memory extent (physical address of memory)
+ * @size: Size of memory to unmap; must be page-aligned
+ */
+enum gunyah_error gunyah_hypercall_addrspace_unmap(u64 capid, u64 extent_capid, u64 vbase,
+ u32 flags, u64 offset, u64 size)
+{
+ struct arm_smccc_1_2_regs args = {
+ .a0 = GUNYAH_HYPERCALL_ADDRSPACE_UNMAP,
+ .a1 = capid,
+ .a2 = extent_capid,
+ .a3 = vbase,
+ .a4 = flags,
+ .a5 = offset,
+ .a6 = size,
+ /* The C language guarantees this is implicitly zero. Gunyah requires 0, so be explicit */
+ .a7 = 0,
+ };
+ struct arm_smccc_1_2_regs res;
+
+ arm_smccc_1_2_hvc(&args, &res);
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gunyah_hypercall_addrspace_unmap);
+
+/**
+ * gunyah_hypercall_memextent_donate() - Donate memory from one memory extent to another
+ * @options: donate options
+ * @from_capid: Memory extent capability ID to donate from
+ * @to_capid: Memory extent capability ID to donate to
+ * @offset: Offset into memory extent (physical address of memory)
+ * @size: Size of memory to donate; must be page-aligned
+ */
+enum gunyah_error gunyah_hypercall_memextent_donate(u32 options, u64 from_capid, u64 to_capid,
+ u64 offset, u64 size)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GUNYAH_HYPERCALL_MEMEXTENT_DONATE, options, from_capid, to_capid,
+ offset, size, 0, &res);
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gunyah_hypercall_memextent_donate);
+
/**
* gunyah_hypercall_vcpu_run() - Donate CPU time to a vcpu
* @capid: capability ID of the vCPU to run
diff --git a/arch/arm64/include/asm/gunyah.h b/arch/arm64/include/asm/gunyah.h
index 0cd3debe22b6..4adf24977fd1 100644
--- a/arch/arm64/include/asm/gunyah.h
+++ b/arch/arm64/include/asm/gunyah.h
@@ -33,4 +33,25 @@ static inline int arch_gunyah_fill_irq_fwspec_params(u32 virq,
return 0;
}
+enum arch_gunyah_memtype {
+ /* clang-format off */
+ GUNYAH_MEMTYPE_DEVICE_nGnRnE = 0,
+ GUNYAH_DEVICE_nGnRE = 1,
+ GUNYAH_DEVICE_nGRE = 2,
+ GUNYAH_DEVICE_GRE = 3,
+
+ GUNYAH_NORMAL_NC = 0b0101,
+ GUNYAH_NORMAL_ONC_IWT = 0b0110,
+ GUNYAH_NORMAL_ONC_IWB = 0b0111,
+ GUNYAH_NORMAL_OWT_INC = 0b1001,
+ GUNYAH_NORMAL_WT = 0b1010,
+ GUNYAH_NORMAL_OWT_IWB = 0b1011,
+ GUNYAH_NORMAL_OWB_INC = 0b1101,
+ GUNYAH_NORMAL_OWB_IWT = 0b1110,
+ GUNYAH_NORMAL_WB = 0b1111,
+ /* clang-format on */
+};
+
+#define ARCH_GUNYAH_DEFAULT_MEMTYPE GUNYAH_NORMAL_WB
+
#endif
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 8405b2faf774..a517c5c33a75 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -274,6 +274,62 @@ enum gunyah_error gunyah_hypercall_msgq_send(u64 capid, size_t size, void *buff,
enum gunyah_error gunyah_hypercall_msgq_recv(u64 capid, void *buff, size_t size,
size_t *recv_size, bool *ready);
+#define GUNYAH_ADDRSPACE_SELF_CAP 0
+
+enum gunyah_pagetable_access {
+ /* clang-format off */
+ GUNYAH_PAGETABLE_ACCESS_NONE = 0,
+ GUNYAH_PAGETABLE_ACCESS_X = 1,
+ GUNYAH_PAGETABLE_ACCESS_W = 2,
+ GUNYAH_PAGETABLE_ACCESS_R = 4,
+ GUNYAH_PAGETABLE_ACCESS_RX = 5,
+ GUNYAH_PAGETABLE_ACCESS_RW = 6,
+ GUNYAH_PAGETABLE_ACCESS_RWX = 7,
+ /* clang-format on */
+};
+
+/* clang-format off */
+#define GUNYAH_MEMEXTENT_MAPPING_USER_ACCESS GENMASK_ULL(2, 0)
+#define GUNYAH_MEMEXTENT_MAPPING_KERNEL_ACCESS GENMASK_ULL(6, 4)
+#define GUNYAH_MEMEXTENT_MAPPING_TYPE GENMASK_ULL(23, 16)
+/* clang-format on */
+
+enum gunyah_memextent_donate_type {
+ /* clang-format off */
+ GUNYAH_MEMEXTENT_DONATE_TO_CHILD = 0,
+ GUNYAH_MEMEXTENT_DONATE_TO_PARENT = 1,
+ GUNYAH_MEMEXTENT_DONATE_TO_SIBLING = 2,
+ GUNYAH_MEMEXTENT_DONATE_TO_PROTECTED = 3,
+ GUNYAH_MEMEXTENT_DONATE_FROM_PROTECTED = 4,
+ /* clang-format on */
+};
+
+enum gunyah_addrspace_map_flag_bits {
+ /* clang-format off */
+ GUNYAH_ADDRSPACE_MAP_FLAG_PARTIAL = 0,
+ GUNYAH_ADDRSPACE_MAP_FLAG_PRIVATE = 1,
+ GUNYAH_ADDRSPACE_MAP_FLAG_VMMIO = 2,
+ GUNYAH_ADDRSPACE_MAP_FLAG_NOSYNC = 31,
+ /* clang-format on */
+};
+
+enum gunyah_error gunyah_hypercall_addrspace_map(u64 capid, u64 extent_capid,
+ u64 vbase, u32 extent_attrs,
+ u32 flags, u64 offset,
+ u64 size);
+enum gunyah_error gunyah_hypercall_addrspace_unmap(u64 capid, u64 extent_capid,
+ u64 vbase, u32 flags,
+ u64 offset, u64 size);
+
+/* clang-format off */
+#define GUNYAH_MEMEXTENT_OPTION_TYPE_MASK GENMASK_ULL(7, 0)
+#define GUNYAH_MEMEXTENT_OPTION_NOSYNC BIT(31)
+/* clang-format on */
+
+enum gunyah_error gunyah_hypercall_memextent_donate(u32 options, u64 from_capid,
+ u64 to_capid, u64 offset,
+ u64 size);
+
struct gunyah_hypercall_vcpu_run_resp {
union {
enum {
--
2.34.1
On Qualcomm platforms, there is a firmware entity which controls access
to physical pages. In order to share memory with another VM, this entity
needs to be informed that the guest VM should have access to the memory.
Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Kconfig | 4 +
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_platform_hooks.c | 115 ++++++++++++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 10 +++
drivers/virt/gunyah/rsc_mgr_rpc.c | 20 ++++-
drivers/virt/gunyah/vm_mgr_mem.c | 32 +++++---
include/linux/gunyah.h | 37 +++++++++
7 files changed, 206 insertions(+), 13 deletions(-)
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 6f4c85db80b5..23ba523d25dc 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -3,6 +3,7 @@
config GUNYAH
tristate "Gunyah Virtualization drivers"
depends on ARM64
+ select GUNYAH_PLATFORM_HOOKS
help
The Gunyah drivers are the helper interfaces that run in a guest VM
such as basic inter-VM IPC and signaling mechanisms, and higher level
@@ -10,3 +11,6 @@ config GUNYAH
Say Y/M here to enable the drivers needed to interact in a Gunyah
virtual environment.
+
+config GUNYAH_PLATFORM_HOOKS
+ tristate
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index f3c9507224ee..ffcde0e0ccfa 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -3,3 +3,4 @@
gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o
obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o
+obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
new file mode 100644
index 000000000000..a1f93321e5ba
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/device.h>
+#include <linux/gunyah.h>
+#include <linux/module.h>
+#include <linux/rwsem.h>
+
+#include "rsc_mgr.h"
+
+static const struct gunyah_rm_platform_ops *rm_platform_ops;
+static DECLARE_RWSEM(rm_platform_ops_lock);
+
+int gunyah_rm_platform_pre_mem_share(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->pre_mem_share)
+ ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_platform_pre_mem_share);
+
+int gunyah_rm_platform_post_mem_reclaim(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
+ ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_platform_post_mem_reclaim);
+
+int gunyah_rm_platform_pre_demand_page(struct gunyah_rm *rm, u16 vmid,
+ u32 flags, struct folio *folio)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->pre_demand_page)
+ ret = rm_platform_ops->pre_demand_page(rm, vmid, flags, folio);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_platform_pre_demand_page);
+
+int gunyah_rm_platform_reclaim_demand_page(struct gunyah_rm *rm, u16 vmid,
+ u32 flags, struct folio *folio)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->release_demand_page)
+ ret = rm_platform_ops->release_demand_page(rm, vmid, flags,
+ folio);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_platform_reclaim_demand_page);
+
+int gunyah_rm_register_platform_ops(
+ const struct gunyah_rm_platform_ops *platform_ops)
+{
+ int ret = 0;
+
+ down_write(&rm_platform_ops_lock);
+ if (!rm_platform_ops)
+ rm_platform_ops = platform_ops;
+ else
+ ret = -EEXIST;
+ up_write(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_register_platform_ops);
+
+void gunyah_rm_unregister_platform_ops(
+ const struct gunyah_rm_platform_ops *platform_ops)
+{
+ down_write(&rm_platform_ops_lock);
+ if (rm_platform_ops == platform_ops)
+ rm_platform_ops = NULL;
+ up_write(&rm_platform_ops_lock);
+}
+EXPORT_SYMBOL_GPL(gunyah_rm_unregister_platform_ops);
+
+static void _devm_gunyah_rm_unregister_platform_ops(void *data)
+{
+ gunyah_rm_unregister_platform_ops(
+ (const struct gunyah_rm_platform_ops *)data);
+}
+
+int devm_gunyah_rm_register_platform_ops(
+ struct device *dev, const struct gunyah_rm_platform_ops *ops)
+{
+ int ret;
+
+ ret = gunyah_rm_register_platform_ops(ops);
+ if (ret)
+ return ret;
+
+ return devm_add_action(dev, _devm_gunyah_rm_unregister_platform_ops,
+ (void *)ops);
+}
+EXPORT_SYMBOL_GPL(devm_gunyah_rm_register_platform_ops);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Platform Hooks");
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index ec8ad8149e8e..68d08d3cff02 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -117,4 +117,14 @@ int gunyah_rm_call(struct gunyah_rm *rsc_mgr, u32 message_id,
const void *req_buf, size_t req_buf_size, void **resp_buf,
size_t *resp_buf_size);
+int gunyah_rm_platform_pre_mem_share(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel);
+int gunyah_rm_platform_post_mem_reclaim(
+ struct gunyah_rm *rm, struct gunyah_rm_mem_parcel *mem_parcel);
+
+int gunyah_rm_platform_pre_demand_page(struct gunyah_rm *rm, u16 vmid,
+ u32 flags, struct folio *folio);
+int gunyah_rm_platform_reclaim_demand_page(struct gunyah_rm *rm, u16 vmid,
+ u32 flags, struct folio *folio);
+
#endif
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index bc44bde990ce..0d78613827b5 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -206,6 +206,12 @@ int gunyah_rm_mem_share(struct gunyah_rm *rm, struct gunyah_rm_mem_parcel *p)
if (!msg)
return -ENOMEM;
+ ret = gunyah_rm_platform_pre_mem_share(rm, p);
+ if (ret) {
+ kfree(msg);
+ return ret;
+ }
+
req_header = msg;
acl = (void *)req_header + sizeof(*req_header);
mem = (void *)acl + acl_size;
@@ -231,8 +237,10 @@ int gunyah_rm_mem_share(struct gunyah_rm *rm, struct gunyah_rm_mem_parcel *p)
&resp_size);
kfree(msg);
- if (ret)
+ if (ret) {
+ gunyah_rm_platform_post_mem_reclaim(rm, p);
return ret;
+ }
p->mem_handle = le32_to_cpu(*resp);
kfree(resp);
@@ -263,9 +271,15 @@ int gunyah_rm_mem_reclaim(struct gunyah_rm *rm,
struct gunyah_rm_mem_release_req req = {
.mem_handle = cpu_to_le32(parcel->mem_handle),
};
+ int ret;
- return gunyah_rm_call(rm, GUNYAH_RM_RPC_MEM_RECLAIM, &req, sizeof(req),
- NULL, NULL);
+ ret = gunyah_rm_call(rm, GUNYAH_RM_RPC_MEM_RECLAIM, &req, sizeof(req),
+ NULL, NULL);
+ /* Only call platform mem reclaim hooks if we reclaimed the memory */
+ if (ret)
+ return ret;
+
+ return gunyah_rm_platform_post_mem_reclaim(rm, parcel);
}
/**
diff --git a/drivers/virt/gunyah/vm_mgr_mem.c b/drivers/virt/gunyah/vm_mgr_mem.c
index d3fcb4514907..15610a8c6f82 100644
--- a/drivers/virt/gunyah/vm_mgr_mem.c
+++ b/drivers/virt/gunyah/vm_mgr_mem.c
@@ -9,6 +9,7 @@
#include <linux/mm.h>
#include <linux/pagemap.h>
+#include "rsc_mgr.h"
#include "vm_mgr.h"
#define WRITE_TAG (1 << 0)
@@ -84,7 +85,7 @@ int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
size_t size = folio_size(folio);
enum gunyah_error gunyah_error;
unsigned long tag = 0;
- int ret;
+ int ret, tmp;
/* clang-format off */
if (share) {
@@ -127,6 +128,11 @@ int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
else /* !share && !write */
access = GUNYAH_PAGETABLE_ACCESS_RX;
+ ret = gunyah_rm_platform_pre_demand_page(ghvm->rm, ghvm->vmid, access,
+ folio);
+ if (ret)
+ goto remove;
+
gunyah_error = gunyah_hypercall_memextent_donate(donate_flags(share),
host_extent->capid,
guest_extent->capid,
@@ -135,7 +141,7 @@ int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
pr_err("Failed to donate memory for guest address 0x%016llx: %d\n",
gpa, gunyah_error);
ret = gunyah_error_remap(gunyah_error);
- goto remove;
+ goto platform_release;
}
extent_attrs =
@@ -166,6 +172,14 @@ int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
if (gunyah_error != GUNYAH_ERROR_OK)
pr_err("Failed to reclaim memory donation for guest address 0x%016llx: %d\n",
gpa, gunyah_error);
+platform_release:
+ tmp = gunyah_rm_platform_reclaim_demand_page(ghvm->rm, ghvm->vmid,
+ access, folio);
+ if (tmp) {
+ pr_err("Platform failed to reclaim memory for guest address 0x%016llx: %d\n",
+ gpa, tmp);
+ return ret;
+ }
remove:
mtree_erase(&ghvm->mm, gfn);
return ret;
@@ -243,14 +257,12 @@ static int __gunyah_vm_reclaim_folio_locked(struct gunyah_vm *ghvm, void *entry,
else /* !share && !write */
access = GUNYAH_PAGETABLE_ACCESS_RX;
- gunyah_error = gunyah_hypercall_memextent_donate(donate_flags(share),
- guest_extent->capid,
- host_extent->capid, pa,
- size);
- if (gunyah_error != GUNYAH_ERROR_OK) {
- pr_err("Failed to reclaim memory donation for guest address 0x%016llx: %d\n",
- gfn << PAGE_SHIFT, gunyah_error);
- ret = gunyah_error_remap(gunyah_error);
+ ret = gunyah_rm_platform_reclaim_demand_page(ghvm->rm, ghvm->vmid,
+ access, folio);
+ if (ret) {
+ pr_err_ratelimited(
"Platform failed to reclaim memory for guest address 0x%016llx: %d\n",
+ gunyah_gfn_to_gpa(gfn), ret);
goto err;
}
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 9065f5758c39..32ce578220ca 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -199,6 +199,43 @@ struct gunyah_rm_mem_parcel {
u32 mem_handle;
};
+struct gunyah_rm_platform_ops {
+ int (*pre_mem_share)(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel);
+ int (*post_mem_reclaim)(struct gunyah_rm *rm,
+ struct gunyah_rm_mem_parcel *mem_parcel);
+
+ int (*pre_demand_page)(struct gunyah_rm *rm, u16 vmid, u32 flags,
+ struct folio *folio);
+ int (*release_demand_page)(struct gunyah_rm *rm, u16 vmid, u32 flags,
+ struct folio *folio);
+};
+
+#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
+int gunyah_rm_register_platform_ops(
+ const struct gunyah_rm_platform_ops *platform_ops);
+void gunyah_rm_unregister_platform_ops(
+ const struct gunyah_rm_platform_ops *platform_ops);
+int devm_gunyah_rm_register_platform_ops(
+ struct device *dev, const struct gunyah_rm_platform_ops *ops);
+#else
+static inline int gunyah_rm_register_platform_ops(
+ const struct gunyah_rm_platform_ops *platform_ops)
+{
+ return 0;
+}
+static inline void gunyah_rm_unregister_platform_ops(
+ const struct gunyah_rm_platform_ops *platform_ops)
+{
+}
+static inline int
+devm_gunyah_rm_register_platform_ops(struct device *dev,
+ const struct gunyah_rm_platform_ops *ops)
+{
+ return 0;
+}
+#endif
+
/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
#define GUNYAH_CAPID_INVAL U64_MAX
--
2.34.1