2023-02-14 21:13:12

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 00/26] Drivers for Gunyah hypervisor

Gunyah is a Type-1 hypervisor independent of any
high-level OS kernel, and runs in a higher CPU privilege level. It does
not depend on any lower-privileged OS kernel/code for its core
functionality. This increases its security and can support a much smaller
trusted computing base than a Type-2 hypervisor.

Gunyah is an open source hypervisor. The source repo is available at
https://github.com/quic/gunyah-hypervisor.

The diagram below shows the architecture.

::

VM A VM B
+-----+ +-----+ | +-----+ +-----+ +-----+
| | | | | | | | | | |
EL0 | APP | | APP | | | APP | | APP | | APP |
| | | | | | | | | | |
+-----+ +-----+ | +-----+ +-----+ +-----+
---------------------|-------------------------
+--------------+ | +----------------------+
| | | | |
EL1 | Linux Kernel | | |Linux kernel/Other OS | ...
| | | | |
+--------------+ | +----------------------+
--------hvc/smc------|------hvc/smc------------
+----------------------------------------+
| |
EL2 | Gunyah Hypervisor |
| |
+----------------------------------------+

Gunyah provides these following features.

- Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on
physical CPUs and enables time-sharing of the CPUs.
- Memory Management: Gunyah tracks memory ownership and use of all memory
under its control. Memory partitioning between VMs is a fundamental
security feature.
- Interrupt Virtualization: All interrupts are handled in the hypervisor
and routed to the assigned VM.
- Inter-VM Communication: There are several different mechanisms provided
for communicating between VMs.
- Device Virtualization: Para-virtualization of devices is supported using
inter-VM communication. Low level system features and devices such as
interrupt controllers are supported with emulation where required.

This series adds the basic framework for detecting that Linux is running
under Gunyah as a virtual machine, communication with the Gunyah Resource
Manager, and a virtual machine manager capable of launching virtual machines.

The series relies on two other patches posted separately:
- https://lore.kernel.org/all/[email protected]/
- https://lore.kernel.org/all/[email protected]/

Changes in v10:
- Fix bisectability (end result of series is same, --fixups applied to wrong commits)
- Convert GH_ERROR_* and GH_RM_ERROR_* to enums
- Correct race condition between allocating/freeing user memory
- Replace offsetof with struct_size
- Series-wide renaming of functions to be more consistent
- VM shutdown & restart support added in vCPU and VM Manager patches
- Convert VM function name (string) to type (number)
- Convert VM function argument to value (which could be a pointer) to remove memory wastage for arguments
- Remove defensive checks of hypervisor correctness
- Clean ups to ioeventfd as suggested by Srivatsa

Changes in v9: https://lore.kernel.org/all/[email protected]/
- Refactor Gunyah API flags to be exposed as feature flags at kernel level
- Move mbox client cleanup into gunyah_msgq_remove()
- Simplify gh_rm_call return value and response payload
- Missing clean-up/error handling/little endian fixes as suggested by Srivatsa and Alex in v8 series

Changes in v8: https://lore.kernel.org/all/[email protected]/
- Treat VM manager as a library of RM
- Add patches 21-28 as RFC to support proxy-scheduled vCPUs and necessary bits to support virtio
from Gunyah userspace

Changes in v7: https://lore.kernel.org/all/[email protected]/
- Refactor to remove gunyah RM bus
- Refactor allow multiple RM device instances
- Bump UAPI to start at 0x0
- Refactor QCOM SCM's platform hooks to allow CONFIG_QCOM_SCM=Y/CONFIG_GUNYAH=M combinations

Changes in v6: https://lore.kernel.org/all/[email protected]/
- *Replace gunyah-console with gunyah VM Manager*
- Move include/asm-generic/gunyah.h into include/linux/gunyah.h
- s/gunyah_msgq/gh_msgq/
- Minor tweaks and documentation tidying based on comments from Jiri, Greg, Arnd, Dmitry, and Bagas.

Changes in v5: https://lore.kernel.org/all/[email protected]/
- Dropped sysfs nodes
- Switch from aux bus to Gunyah RM bus for the subdevices
- Cleaning up RM console

Changes in v4: https://lore.kernel.org/all/[email protected]/
- Tidied up documentation throughout based on questions/feedback received
- Switched message queue implementation to use mailboxes
- Renamed "gunyah_device" as "gunyah_resource"

Changes in v3: https://lore.kernel.org/all/[email protected]/
- /Maintained/Supported/ in MAINTAINERS
- Tidied up documentation throughout based on questions/feedback received
- Moved hypercalls into arch/arm64/gunyah/; following hyper-v's implementation
- Drop opaque typedefs
- Move sysfs nodes under /sys/hypervisor/gunyah/
- Moved Gunyah console driver to drivers/tty/
- Reworked gunyah_device design to drop the Gunyah bus.

Changes in v2: https://lore.kernel.org/all/[email protected]/
- DT bindings clean up
- Switch hypercalls to follow SMCCC

v1: https://lore.kernel.org/all/[email protected]/

Elliot Berman (26):
docs: gunyah: Introduce Gunyah Hypervisor
dt-bindings: Add binding for gunyah hypervisor
gunyah: Common types and error codes for Gunyah hypercalls
virt: gunyah: Add hypercalls to identify Gunyah
virt: gunyah: Identify hypervisor version
virt: gunyah: msgq: Add hypercalls to send and receive messages
mailbox: Add Gunyah message queue mailbox
gunyah: rsc_mgr: Add resource manager RPC core
gunyah: rsc_mgr: Add VM lifecycle RPC
gunyah: vm_mgr: Introduce basic VM Manager
gunyah: rsc_mgr: Add RPC for sharing memory
unyah: vm_mgr: Add/remove user memory regions
gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot
samples: Add sample userspace Gunyah VM Manager
gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim
firmware: qcom_scm: Register Gunyah platform ops
docs: gunyah: Document Gunyah VM Manager
virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource
gunyah: vm_mgr: Add framework to add VM Functions
virt: gunyah: Add resource tickets
virt: gunyah: Add IO handlers
virt: gunyah: Add proxy-scheduled vCPUs
virt: gunyah: Add hypercalls for sending doorbell
virt: gunyah: Add irqfd interface
virt: gunyah: Add ioeventfd
MAINTAINERS: Add Gunyah hypervisor drivers section

.../bindings/firmware/gunyah-hypervisor.yaml | 82 ++
.../userspace-api/ioctl/ioctl-number.rst | 1 +
Documentation/virt/gunyah/index.rst | 114 +++
Documentation/virt/gunyah/message-queue.rst | 69 ++
Documentation/virt/gunyah/vm-manager.rst | 193 +++++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 13 +
arch/arm64/Kbuild | 1 +
arch/arm64/gunyah/Makefile | 3 +
arch/arm64/gunyah/gunyah_hypercall.c | 146 ++++
arch/arm64/include/asm/gunyah.h | 23 +
drivers/firmware/Kconfig | 2 +
drivers/firmware/qcom_scm.c | 100 +++
drivers/mailbox/Makefile | 2 +
drivers/mailbox/gunyah-msgq.c | 214 +++++
drivers/virt/Kconfig | 2 +
drivers/virt/Makefile | 1 +
drivers/virt/gunyah/Kconfig | 46 +
drivers/virt/gunyah/Makefile | 11 +
drivers/virt/gunyah/gunyah.c | 54 ++
drivers/virt/gunyah/gunyah_ioeventfd.c | 113 +++
drivers/virt/gunyah/gunyah_irqfd.c | 160 ++++
drivers/virt/gunyah/gunyah_platform_hooks.c | 80 ++
drivers/virt/gunyah/gunyah_vcpu.c | 463 ++++++++++
drivers/virt/gunyah/rsc_mgr.c | 798 +++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 169 ++++
drivers/virt/gunyah/rsc_mgr_rpc.c | 419 +++++++++
drivers/virt/gunyah/vm_mgr.c | 801 ++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 70 ++
drivers/virt/gunyah/vm_mgr_mm.c | 258 ++++++
include/linux/gunyah.h | 198 +++++
include/linux/gunyah_rsc_mgr.h | 171 ++++
include/linux/gunyah_vm_mgr.h | 119 +++
include/uapi/linux/gunyah.h | 191 +++++
samples/Kconfig | 10 +
samples/Makefile | 1 +
samples/gunyah/.gitignore | 2 +
samples/gunyah/Makefile | 6 +
samples/gunyah/gunyah_vmm.c | 270 ++++++
samples/gunyah/sample_vm.dts | 68 ++
40 files changed, 5445 insertions(+)
create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
create mode 100644 Documentation/virt/gunyah/index.rst
create mode 100644 Documentation/virt/gunyah/message-queue.rst
create mode 100644 Documentation/virt/gunyah/vm-manager.rst
create mode 100644 arch/arm64/gunyah/Makefile
create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
create mode 100644 arch/arm64/include/asm/gunyah.h
create mode 100644 drivers/mailbox/gunyah-msgq.c
create mode 100644 drivers/virt/gunyah/Kconfig
create mode 100644 drivers/virt/gunyah/Makefile
create mode 100644 drivers/virt/gunyah/gunyah.c
create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
create mode 100644 drivers/virt/gunyah/rsc_mgr.c
create mode 100644 drivers/virt/gunyah/rsc_mgr.h
create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
create mode 100644 drivers/virt/gunyah/vm_mgr.c
create mode 100644 drivers/virt/gunyah/vm_mgr.h
create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
create mode 100644 include/linux/gunyah.h
create mode 100644 include/linux/gunyah_rsc_mgr.h
create mode 100644 include/linux/gunyah_vm_mgr.h
create mode 100644 include/uapi/linux/gunyah.h
create mode 100644 samples/gunyah/.gitignore
create mode 100644 samples/gunyah/Makefile
create mode 100644 samples/gunyah/gunyah_vmm.c
create mode 100644 samples/gunyah/sample_vm.dts


base-commit: 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb
--
2.39.1



2023-02-14 21:13:53

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 03/26] gunyah: Common types and error codes for Gunyah hypercalls

Add architecture-independent standard error codes, types, and macros for
Gunyah hypercalls.

Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
include/linux/gunyah.h | 82 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 82 insertions(+)
create mode 100644 include/linux/gunyah.h

diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
new file mode 100644
index 000000000000..59ef4c735ae8
--- /dev/null
+++ b/include/linux/gunyah.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _LINUX_GUNYAH_H
+#define _LINUX_GUNYAH_H
+
+#include <linux/errno.h>
+#include <linux/limits.h>
+
+/******************************************************************************/
+/* Common arch-independent definitions for Gunyah hypercalls */
+#define GH_CAPID_INVAL U64_MAX
+#define GH_VMID_ROOT_VM 0xff
+
+enum gh_error {
+ GH_ERROR_OK = 0,
+ GH_ERROR_UNIMPLEMENTED = -1,
+ GH_ERROR_RETRY = -2,
+
+ GH_ERROR_ARG_INVAL = 1,
+ GH_ERROR_ARG_SIZE = 2,
+ GH_ERROR_ARG_ALIGN = 3,
+
+ GH_ERROR_NOMEM = 10,
+
+ GH_ERROR_ADDR_OVFL = 20,
+ GH_ERROR_ADDR_UNFL = 21,
+ GH_ERROR_ADDR_INVAL = 22,
+
+ GH_ERROR_DENIED = 30,
+ GH_ERROR_BUSY = 31,
+ GH_ERROR_IDLE = 32,
+
+ GH_ERROR_IRQ_BOUND = 40,
+ GH_ERROR_IRQ_UNBOUND = 41,
+
+ GH_ERROR_CSPACE_CAP_NULL = 50,
+ GH_ERROR_CSPACE_CAP_REVOKED = 51,
+ GH_ERROR_CSPACE_WRONG_OBJ_TYPE = 52,
+ GH_ERROR_CSPACE_INSUF_RIGHTS = 53,
+ GH_ERROR_CSPACE_FULL = 54,
+
+ GH_ERROR_MSGQUEUE_EMPTY = 60,
+ GH_ERROR_MSGQUEUE_FULL = 61,
+};
+
+/**
+ * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux error code
+ * @gh_error: Gunyah hypercall return value
+ */
+static inline int gh_remap_error(enum gh_error gh_error)
+{
+ switch (gh_error) {
+ case GH_ERROR_OK:
+ return 0;
+ case GH_ERROR_NOMEM:
+ return -ENOMEM;
+ case GH_ERROR_DENIED:
+ case GH_ERROR_CSPACE_CAP_NULL:
+ case GH_ERROR_CSPACE_CAP_REVOKED:
+ case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
+ case GH_ERROR_CSPACE_INSUF_RIGHTS:
+ case GH_ERROR_CSPACE_FULL:
+ return -EACCES;
+ case GH_ERROR_BUSY:
+ case GH_ERROR_IDLE:
+ case GH_ERROR_IRQ_BOUND:
+ case GH_ERROR_IRQ_UNBOUND:
+ case GH_ERROR_MSGQUEUE_FULL:
+ case GH_ERROR_MSGQUEUE_EMPTY:
+ return -EBUSY;
+ case GH_ERROR_UNIMPLEMENTED:
+ case GH_ERROR_RETRY:
+ return -EOPNOTSUPP;
+ default:
+ return -EINVAL;
+ }
+}
+
+#endif
--
2.39.1


2023-02-14 21:13:57

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 02/26] dt-bindings: Add binding for gunyah hypervisor

When Linux is booted as a guest under the Gunyah hypervisor, the Gunyah
Resource Manager applies a devicetree overlay describing the virtual
platform configuration of the guest VM, such as the message queue
capability IDs for communicating with the Resource Manager. This
information is not otherwise discoverable by a VM: the Gunyah hypervisor
core does not provide a direct interface to discover capability IDs nor
a way to communicate with RM without having already known the
corresponding message queue capability ID. Add the DT bindings that
Gunyah adheres for the hypervisor node and message queues.

Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
.../bindings/firmware/gunyah-hypervisor.yaml | 82 +++++++++++++++++++
1 file changed, 82 insertions(+)
create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml

diff --git a/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml b/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
new file mode 100644
index 000000000000..3fc0b043ac3c
--- /dev/null
+++ b/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
@@ -0,0 +1,82 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/firmware/gunyah-hypervisor.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Gunyah Hypervisor
+
+maintainers:
+ - Prakruthi Deepak Heragu <[email protected]>
+ - Elliot Berman <[email protected]>
+
+description: |+
+ Gunyah virtual machines use this information to determine the capability IDs
+ of the message queues used to communicate with the Gunyah Resource Manager.
+ See also: https://github.com/quic/gunyah-resource-manager/blob/develop/src/vm_creation/dto_construct.c
+
+properties:
+ compatible:
+ const: gunyah-hypervisor
+
+ "#address-cells":
+ description: Number of cells needed to represent 64-bit capability IDs.
+ const: 2
+
+ "#size-cells":
+ description: must be 0, because capability IDs are not memory address
+ ranges and do not have a size.
+ const: 0
+
+patternProperties:
+ "^gunyah-resource-mgr(@.*)?":
+ type: object
+ description:
+ Resource Manager node which is required to communicate to Resource
+ Manager VM using Gunyah Message Queues.
+
+ properties:
+ compatible:
+ const: gunyah-resource-manager
+
+ reg:
+ items:
+ - description: Gunyah capability ID of the TX message queue
+ - description: Gunyah capability ID of the RX message queue
+
+ interrupts:
+ items:
+ - description: Interrupt for the TX message queue
+ - description: Interrupt for the RX message queue
+
+ additionalProperties: false
+
+ required:
+ - compatible
+ - reg
+ - interrupts
+
+additionalProperties: false
+
+required:
+ - compatible
+ - "#address-cells"
+ - "#size-cells"
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+ hypervisor {
+ #address-cells = <2>;
+ #size-cells = <0>;
+ compatible = "gunyah-hypervisor";
+
+ gunyah-resource-mgr@0 {
+ compatible = "gunyah-resource-manager";
+ interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX full IRQ */
+ <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX empty IRQ */
+ reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
+ /* TX, RX cap ids */
+ };
+ };
--
2.39.1


2023-02-14 21:13:59

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 01/26] docs: gunyah: Introduce Gunyah Hypervisor

Gunyah is an open-source Type-1 hypervisor developed by Qualcomm. It
does not depend on any lower-privileged OS/kernel code for its core
functionality. This increases its security and can support a smaller
trusted computing based when compared to Type-2 hypervisors.

Add documentation describing the Gunyah hypervisor and the main
components of the Gunyah hypervisor which are of interest to Linux
virtualization development.

Reviewed-by: Bagas Sanjaya <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/index.rst | 113 ++++++++++++++++++++
Documentation/virt/gunyah/message-queue.rst | 61 +++++++++++
Documentation/virt/index.rst | 1 +
3 files changed, 175 insertions(+)
create mode 100644 Documentation/virt/gunyah/index.rst
create mode 100644 Documentation/virt/gunyah/message-queue.rst

diff --git a/Documentation/virt/gunyah/index.rst b/Documentation/virt/gunyah/index.rst
new file mode 100644
index 000000000000..45adbbc311db
--- /dev/null
+++ b/Documentation/virt/gunyah/index.rst
@@ -0,0 +1,113 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Gunyah Hypervisor
+=================
+
+.. toctree::
+ :maxdepth: 1
+
+ message-queue
+
+Gunyah is a Type-1 hypervisor which is independent of any OS kernel, and runs in
+a higher CPU privilege level. It does not depend on any lower-privileged operating system
+for its core functionality. This increases its security and can support a much smaller
+trusted computing base than a Type-2 hypervisor.
+
+Gunyah is an open source hypervisor. The source repo is available at
+https://github.com/quic/gunyah-hypervisor.
+
+Gunyah provides these following features.
+
+- Scheduling:
+
+ A scheduler for virtual CPUs (vCPUs) on physical CPUs enables time-sharing
+ of the CPUs. Gunyah supports two models of scheduling:
+
+ 1. "Behind the back" scheduling in which Gunyah hypervisor schedules vCPUS on its own.
+ 2. "Proxy" scheduling in which a delegated VM can donate part of one of its vCPU slice
+ to another VM's vCPU via a hypercall.
+
+- Memory Management:
+
+ APIs handling memory, abstracted as objects, limiting direct use of physical
+ addresses. Memory ownership and usage tracking of all memory under its control.
+ Memory partitioning between VMs is a fundamental security feature.
+
+- Interrupt Virtualization:
+
+ Uses CPU hardware interrupt virtualization capabilities. Interrupts are handled
+ in the hypervisor and routed to the assigned VM.
+
+- Inter-VM Communication:
+
+ There are several different mechanisms provided for communicating between VMs.
+
+- Virtual platform:
+
+ Architectural devices such as interrupt controllers and CPU timers are directly provided
+ by the hypervisor as well as core virtual platform devices and system APIs such as ARM PSCI.
+
+- Device Virtualization:
+
+ Para-virtualization of devices is supported using inter-VM communication.
+
+Architectures supported
+=======================
+AArch64 with a GIC
+
+Resources and Capabilities
+==========================
+
+Some services or resources provided by the Gunyah hypervisor are described to a virtual machine by
+capability IDs. For instance, inter-VM communication is performed with doorbells and message queues.
+Gunyah allows access to manipulate that doorbell via the capability ID. These resources are
+described in Linux as a struct gunyah_resource.
+
+High level management of these resources is performed by the resource manager VM. RM informs a
+guest VM about resources it can access through either the device tree or via guest-initiated RPC.
+
+For each virtual machine, Gunyah maintains a table of resources which can be accessed by that VM.
+An entry in this table is called a "capability" and VMs can only access resources via this
+capability table. Hence, virtual Gunyah resources are referenced by a "capability IDs" and not
+"resource IDs". If 2 VMs have access to the same resource, they might not be using the same
+capability ID to access that resource since the capability tables are independent per VM.
+
+Resource Manager
+================
+
+The resource manager (RM) is a privileged application VM supporting the Gunyah Hypervisor.
+It provides policy enforcement aspects of the virtualization system. The resource manager can
+be treated as an extension of the Hypervisor but is separated to its own partition to ensure
+that the hypervisor layer itself remains small and secure and to maintain a separation of policy
+and mechanism in the platform. RM runs at arm64 NS-EL1 similar to other virtual machines.
+
+Communication with the resource manager from each guest VM happens with message-queue.rst. Details
+about the specific messages can be found in drivers/virt/gunyah/rsc_mgr.c
+
+::
+
+ +-------+ +--------+ +--------+
+ | RM | | VM_A | | VM_B |
+ +-.-.-.-+ +---.----+ +---.----+
+ | | | |
+ +-.-.-----------.------------.----+
+ | | \==========/ | |
+ | \========================/ |
+ | Gunyah |
+ +---------------------------------+
+
+The source for the resource manager is available at https://github.com/quic/gunyah-resource-manager.
+
+The resource manager provides the following features:
+
+- VM lifecycle management: allocating a VM, starting VMs, destruction of VMs
+- VM access control policy, including memory sharing and lending
+- Interrupt routing configuration
+- Forwarding of system-level events (e.g. VM shutdown) to owner VM
+
+When booting a virtual machine which uses a devicetree such as Linux, resource manager overlays a
+/hypervisor node. This node can let Linux know it is running as a Gunyah guest VM,
+how to communicate with resource manager, and basic description and capabilities of
+this VM. See Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml for a description
+of this node.
diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
new file mode 100644
index 000000000000..0667b3eb1ff9
--- /dev/null
+++ b/Documentation/virt/gunyah/message-queue.rst
@@ -0,0 +1,61 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Message Queues
+==============
+Message queue is a simple low-capacity IPC channel between two VMs. It is
+intended for sending small control and configuration messages. Each message
+queue is unidirectional, so a full-duplex IPC channel requires a pair of queues.
+
+Messages can be up to 240 bytes in length. Longer messages require a further
+protocol on top of the message queue messages themselves. For instance, communication
+with the resource manager adds a header field for sending longer messages via multiple
+message fragments.
+
+The diagram below shows how message queue works. A typical configuration involves
+2 message queues. Message queue 1 allows VM_A to send messages to VM_B. Message
+queue 2 allows VM_B to send messages to VM_A.
+
+1. VM_A sends a message of up to 240 bytes in length. It raises a hypercall
+ with the message to inform the hypervisor to add the message to
+ message queue 1's queue.
+
+2. Gunyah raises the corresponding interrupt for VM_B (Rx vIRQ) when any of
+ these happens:
+
+ a. gh_msgq_send has PUSH flag. Queue is immediately flushed. This is the typical case.
+ b. Explicility with gh_msgq_push command from VM_A.
+ c. Message queue has reached a threshold depth.
+
+3. VM_B calls gh_msgq_recv and Gunyah copies message to requested buffer.
+
+4. Gunyah buffers messages in the queue. If the queue became full when VM_A added a message,
+ the return values for gh_msgq_send() include a flag that indicates the queue is full.
+ Once VM_B receives the message and, thus, there is space in the queue, Gunyah
+ will raise the Tx vIRQ on VM_A to indicate it can continue sending messages.
+
+For VM_B to send a message to VM_A, the process is identical, except that hypercalls
+reference message queue 2's capability ID. Each message queue has its own independent
+vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
+
+::
+
+ +---------------+ +-----------------+ +---------------+
+ | VM_A | |Gunyah hypervisor| | VM_B |
+ | | | | | |
+ | | | | | |
+ | | Tx | | | |
+ | |-------->| | Rx vIRQ | |
+ |gh_msgq_send() | Tx vIRQ |Message queue 1 |-------->|gh_msgq_recv() |
+ | |<------- | | | |
+ | | | | | |
+ | Message Queue | | | | Message Queue |
+ | driver | | | | driver |
+ | | | | | |
+ | | | | | |
+ | | | | Tx | |
+ | | Rx vIRQ | |<--------| |
+ |gh_msgq_recv() |<--------|Message queue 2 | Tx vIRQ |gh_msgq_send() |
+ | | | |-------->| |
+ | | | | | |
+ | | | | | |
+ +---------------+ +-----------------+ +---------------+
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index 7fb55ae08598..15869ee059b3 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -16,6 +16,7 @@ Virtualization Support
coco/sev-guest
coco/tdx-guest
hyperv/index
+ gunyah/index

.. only:: html and subproject

--
2.39.1


2023-02-14 21:19:18

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 04/26] virt: gunyah: Add hypercalls to identify Gunyah

Add hypercalls to identify when Linux is running a virtual machine under
Gunyah.

There are two calls to help identify Gunyah:

1. gh_hypercall_get_uid() returns a UID when running under a Gunyah
hypervisor.
2. gh_hypercall_hyp_identify() returns build information and a set of
feature flags that are supported by Gunyah.

Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/Kbuild | 1 +
arch/arm64/gunyah/Makefile | 3 ++
arch/arm64/gunyah/gunyah_hypercall.c | 61 ++++++++++++++++++++++++++++
drivers/virt/Kconfig | 2 +
drivers/virt/gunyah/Kconfig | 13 ++++++
include/linux/gunyah.h | 33 +++++++++++++++
6 files changed, 113 insertions(+)
create mode 100644 arch/arm64/gunyah/Makefile
create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
create mode 100644 drivers/virt/gunyah/Kconfig

diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
index 5bfbf7d79c99..e4847ba0e3c9 100644
--- a/arch/arm64/Kbuild
+++ b/arch/arm64/Kbuild
@@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/
obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
+obj-$(CONFIG_GUNYAH) += gunyah/
obj-$(CONFIG_CRYPTO) += crypto/

# for cleaning
diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
new file mode 100644
index 000000000000..84f1e38cafb1
--- /dev/null
+++ b/arch/arm64/gunyah/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
new file mode 100644
index 000000000000..f30d06ee80cf
--- /dev/null
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/arm-smccc.h>
+#include <linux/module.h>
+#include <linux/gunyah.h>
+
+static const uint32_t gunyah_known_uuids[][4] = {
+ {0x19bd54bd, 0x0b37571b, 0x946f609b, 0x54539de6}, /* QC_HYP (Qualcomm's build) */
+ {0x673d5f14, 0x9265ce36, 0xa4535fdb, 0xc1d58fcd}, /* GUNYAH (open source build) */
+};
+
+bool arch_is_gunyah_guest(void)
+{
+ struct arm_smccc_res res;
+ u32 uid[4];
+ int i;
+
+ arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
+
+ uid[0] = lower_32_bits(res.a0);
+ uid[1] = lower_32_bits(res.a1);
+ uid[2] = lower_32_bits(res.a2);
+ uid[3] = lower_32_bits(res.a3);
+
+ for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
+ if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
+ break;
+
+ return i != ARRAY_SIZE(gunyah_known_uuids);
+}
+EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
+
+#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, \
+ fn)
+
+#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+
+/**
+ * gh_hypercall_hyp_identify() - Returns build information and feature flags
+ * supported by Gunyah.
+ * @hyp_identity: filled by the hypercall with the API info and feature flags.
+ */
+void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
+
+ hyp_identity->api_info = res.a0;
+ hyp_identity->flags[0] = res.a1;
+ hyp_identity->flags[1] = res.a2;
+ hyp_identity->flags[2] = res.a3;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index f79ab13a5c28..85bd6626ffc9 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"

source "drivers/virt/coco/tdx-guest/Kconfig"

+source "drivers/virt/gunyah/Kconfig"
+
endif
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
new file mode 100644
index 000000000000..1a737694c333
--- /dev/null
+++ b/drivers/virt/gunyah/Kconfig
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config GUNYAH
+ tristate "Gunyah Virtualization drivers"
+ depends on ARM64
+ depends on MAILBOX
+ help
+ The Gunyah drivers are the helper interfaces that run in a guest VM
+ such as basic inter-VM IPC and signaling mechanisms, and higher level
+ services such as memory/device sharing, IRQ sharing, and so on.
+
+ Say Y/M here to enable the drivers needed to interact in a Gunyah
+ virtual environment.
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 59ef4c735ae8..3fef2854c5e1 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -6,8 +6,10 @@
#ifndef _LINUX_GUNYAH_H
#define _LINUX_GUNYAH_H

+#include <linux/bitfield.h>
#include <linux/errno.h>
#include <linux/limits.h>
+#include <linux/types.h>

/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
@@ -79,4 +81,35 @@ static inline int gh_remap_error(enum gh_error gh_error)
}
}

+enum gh_api_feature {
+ GH_API_FEATURE_DOORBELL,
+ GH_API_FEATURE_MSGQUEUE,
+ GH_API_FEATURE_VCPU,
+ GH_API_FEATURE_MEMEXTENT,
+};
+
+bool arch_is_gunyah_guest(void);
+
+u16 gh_api_version(void);
+bool gh_api_has_feature(enum gh_api_feature feature);
+
+#define GUNYAH_API_V1 1
+
+#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
+#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
+#define GH_API_INFO_IS_64BIT BIT_ULL(15)
+#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
+
+#define GH_IDENTIFY_DOORBELL BIT_ULL(1)
+#define GH_IDENTIFY_MSGQUEUE BIT_ULL(2)
+#define GH_IDENTIFY_VCPU BIT_ULL(5)
+#define GH_IDENTIFY_MEMEXTENT BIT_ULL(6)
+
+struct gh_hypercall_hyp_identify_resp {
+ u64 api_info;
+ u64 flags[3];
+};
+
+void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
+
#endif
--
2.39.1


2023-02-14 21:20:36

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 04/26] virt: gunyah: Add hypercalls to identify Gunyah

Add hypercalls to identify when Linux is running a virtual machine under
Gunyah.

There are two calls to help identify Gunyah:

1. gh_hypercall_get_uid() returns a UID when running under a Gunyah
hypervisor.
2. gh_hypercall_hyp_identify() returns build information and a set of
feature flags that are supported by Gunyah.

Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/Kbuild | 1 +
arch/arm64/gunyah/Makefile | 3 ++
arch/arm64/gunyah/gunyah_hypercall.c | 61 ++++++++++++++++++++++++++++
drivers/virt/Kconfig | 2 +
drivers/virt/gunyah/Kconfig | 13 ++++++
include/linux/gunyah.h | 33 +++++++++++++++
6 files changed, 113 insertions(+)
create mode 100644 arch/arm64/gunyah/Makefile
create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
create mode 100644 drivers/virt/gunyah/Kconfig

diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
index 5bfbf7d79c99..e4847ba0e3c9 100644
--- a/arch/arm64/Kbuild
+++ b/arch/arm64/Kbuild
@@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/
obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
+obj-$(CONFIG_GUNYAH) += gunyah/
obj-$(CONFIG_CRYPTO) += crypto/

# for cleaning
diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
new file mode 100644
index 000000000000..84f1e38cafb1
--- /dev/null
+++ b/arch/arm64/gunyah/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
new file mode 100644
index 000000000000..f30d06ee80cf
--- /dev/null
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/arm-smccc.h>
+#include <linux/module.h>
+#include <linux/gunyah.h>
+
+static const uint32_t gunyah_known_uuids[][4] = {
+ {0x19bd54bd, 0x0b37571b, 0x946f609b, 0x54539de6}, /* QC_HYP (Qualcomm's build) */
+ {0x673d5f14, 0x9265ce36, 0xa4535fdb, 0xc1d58fcd}, /* GUNYAH (open source build) */
+};
+
+bool arch_is_gunyah_guest(void)
+{
+ struct arm_smccc_res res;
+ u32 uid[4];
+ int i;
+
+ arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
+
+ uid[0] = lower_32_bits(res.a0);
+ uid[1] = lower_32_bits(res.a1);
+ uid[2] = lower_32_bits(res.a2);
+ uid[3] = lower_32_bits(res.a3);
+
+ for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
+ if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
+ break;
+
+ return i != ARRAY_SIZE(gunyah_known_uuids);
+}
+EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
+
+#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, \
+ fn)
+
+#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+
+/**
+ * gh_hypercall_hyp_identify() - Returns build information and feature flags
+ * supported by Gunyah.
+ * @hyp_identity: filled by the hypercall with the API info and feature flags.
+ */
+void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
+
+ hyp_identity->api_info = res.a0;
+ hyp_identity->flags[0] = res.a1;
+ hyp_identity->flags[1] = res.a2;
+ hyp_identity->flags[2] = res.a3;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index f79ab13a5c28..85bd6626ffc9 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"

source "drivers/virt/coco/tdx-guest/Kconfig"

+source "drivers/virt/gunyah/Kconfig"
+
endif
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
new file mode 100644
index 000000000000..1a737694c333
--- /dev/null
+++ b/drivers/virt/gunyah/Kconfig
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config GUNYAH
+ tristate "Gunyah Virtualization drivers"
+ depends on ARM64
+ depends on MAILBOX
+ help
+ The Gunyah drivers are the helper interfaces that run in a guest VM
+ such as basic inter-VM IPC and signaling mechanisms, and higher level
+ services such as memory/device sharing, IRQ sharing, and so on.
+
+ Say Y/M here to enable the drivers needed to interact in a Gunyah
+ virtual environment.
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 59ef4c735ae8..3fef2854c5e1 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -6,8 +6,10 @@
#ifndef _LINUX_GUNYAH_H
#define _LINUX_GUNYAH_H

+#include <linux/bitfield.h>
#include <linux/errno.h>
#include <linux/limits.h>
+#include <linux/types.h>

/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
@@ -79,4 +81,35 @@ static inline int gh_remap_error(enum gh_error gh_error)
}
}

+enum gh_api_feature {
+ GH_API_FEATURE_DOORBELL,
+ GH_API_FEATURE_MSGQUEUE,
+ GH_API_FEATURE_VCPU,
+ GH_API_FEATURE_MEMEXTENT,
+};
+
+bool arch_is_gunyah_guest(void);
+
+u16 gh_api_version(void);
+bool gh_api_has_feature(enum gh_api_feature feature);
+
+#define GUNYAH_API_V1 1
+
+#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
+#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
+#define GH_API_INFO_IS_64BIT BIT_ULL(15)
+#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
+
+#define GH_IDENTIFY_DOORBELL BIT_ULL(1)
+#define GH_IDENTIFY_MSGQUEUE BIT_ULL(2)
+#define GH_IDENTIFY_VCPU BIT_ULL(5)
+#define GH_IDENTIFY_MEMEXTENT BIT_ULL(6)
+
+struct gh_hypercall_hyp_identify_resp {
+ u64 api_info;
+ u64 flags[3];
+};
+
+void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
+
#endif
--
2.39.1


2023-02-14 21:22:36

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 05/26] virt: gunyah: Identify hypervisor version

Export the version of Gunyah which is reported via the hyp_identify
hypercall. Increments of the major API version indicate possibly
backwards incompatible changes.

Export the hypervisor identity so that Gunyah drivers can act according
to the major API version.

Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/Makefile | 1 +
drivers/virt/gunyah/Makefile | 3 ++
drivers/virt/gunyah/gunyah.c | 54 ++++++++++++++++++++++++++++++++++++
3 files changed, 58 insertions(+)
create mode 100644 drivers/virt/gunyah/Makefile
create mode 100644 drivers/virt/gunyah/gunyah.c

diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index e9aa6fc96fab..a5817e2d7d71 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_ACRN_HSM) += acrn/
obj-$(CONFIG_EFI_SECRET) += coco/efi_secret/
obj-$(CONFIG_SEV_GUEST) += coco/sev-guest/
obj-$(CONFIG_INTEL_TDX_GUEST) += coco/tdx-guest/
+obj-y += gunyah/
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
new file mode 100644
index 000000000000..34f32110faf9
--- /dev/null
+++ b/drivers/virt/gunyah/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_GUNYAH) += gunyah.o
diff --git a/drivers/virt/gunyah/gunyah.c b/drivers/virt/gunyah/gunyah.c
new file mode 100644
index 000000000000..776e83c6920d
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gunyah: " fmt
+
+#include <linux/gunyah.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+
+static struct gh_hypercall_hyp_identify_resp gunyah_api;
+
+u16 gh_api_version(void)
+{
+ return FIELD_GET(GH_API_INFO_API_VERSION_MASK, gunyah_api.api_info);
+}
+EXPORT_SYMBOL_GPL(gh_api_version);
+
+bool gh_api_has_feature(enum gh_api_feature feature)
+{
+ switch (feature) {
+ case GH_API_FEATURE_DOORBELL:
+ return !!(gunyah_api.flags[0] & GH_IDENTIFY_DOORBELL);
+ case GH_API_FEATURE_MSGQUEUE:
+ return !!(gunyah_api.flags[0] & GH_IDENTIFY_MSGQUEUE);
+ case GH_API_FEATURE_VCPU:
+ return !!(gunyah_api.flags[0] & GH_IDENTIFY_VCPU);
+ case GH_API_FEATURE_MEMEXTENT:
+ return !!(gunyah_api.flags[0] & GH_IDENTIFY_MEMEXTENT);
+ default:
+ return false;
+ }
+}
+EXPORT_SYMBOL_GPL(gh_api_has_feature);
+
+static int __init gunyah_init(void)
+{
+ if (!arch_is_gunyah_guest())
+ return -ENODEV;
+
+ gh_hypercall_hyp_identify(&gunyah_api);
+
+ pr_info("Running under Gunyah hypervisor %llx/v%u\n",
+ FIELD_GET(GH_API_INFO_VARIANT_MASK, gunyah_api.api_info),
+ gh_api_version());
+
+ return 0;
+}
+arch_initcall(gunyah_init);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Hypervisor Driver");
--
2.39.1


2023-02-14 21:23:35

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 06/26] virt: gunyah: msgq: Add hypercalls to send and receive messages

Add hypercalls to send and receive messages on a Gunyah message queue.

Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/gunyah/gunyah_hypercall.c | 32 ++++++++++++++++++++++++++++
include/linux/gunyah.h | 7 ++++++
2 files changed, 39 insertions(+)

diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index f30d06ee80cf..2ca9ab098ff6 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -38,6 +38,8 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
fn)

#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
+#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)

/**
* gh_hypercall_hyp_identify() - Returns build information and feature flags
@@ -57,5 +59,35 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
}
EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);

+enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
+ bool *ready)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, buff, tx_flags, 0, &res);
+
+ if (res.a0 == GH_ERROR_OK)
+ *ready = res.a1;
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
+
+enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
+ bool *ready)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, buff, size, 0, &res);
+
+ if (res.a0 == GH_ERROR_OK) {
+ *recv_size = res.a1;
+ *ready = res.a2;
+ }
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 3fef2854c5e1..cb6df4eec5c2 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -112,4 +112,11 @@ struct gh_hypercall_hyp_identify_resp {

void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);

+#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
+
+enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
+ bool *ready);
+enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
+ bool *ready);
+
#endif
--
2.39.1


2023-02-14 21:23:56

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox

Gunyah message queues are a unidirectional inter-VM pipe for messages up
to 1024 bytes. This driver supports pairing a receiver message queue and
a transmitter message queue to expose a single mailbox channel.

Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/message-queue.rst | 8 +
drivers/mailbox/Makefile | 2 +
drivers/mailbox/gunyah-msgq.c | 214 ++++++++++++++++++++
include/linux/gunyah.h | 56 +++++
4 files changed, 280 insertions(+)
create mode 100644 drivers/mailbox/gunyah-msgq.c

diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
index 0667b3eb1ff9..082085e981e0 100644
--- a/Documentation/virt/gunyah/message-queue.rst
+++ b/Documentation/virt/gunyah/message-queue.rst
@@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
| | | | | |
| | | | | |
+---------------+ +-----------------+ +---------------+
+
+Gunyah message queues are exposed as mailboxes. To create the mailbox, create
+a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY interrupt,
+all messages in the RX message queue are read and pushed via the `rx_callback`
+of the registered mbox_client.
+
+.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
+ :identifiers: gh_msgq_init
diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
index fc9376117111..5f929bb55e9a 100644
--- a/drivers/mailbox/Makefile
+++ b/drivers/mailbox/Makefile
@@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o

obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o

+obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
+
obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o

obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
new file mode 100644
index 000000000000..03ffaa30ce9b
--- /dev/null
+++ b/drivers/mailbox/gunyah-msgq.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/mailbox_controller.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/gunyah.h>
+#include <linux/printk.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/wait.h>
+
+#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
+
+static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
+{
+ struct gh_msgq *msgq = data;
+ struct gh_msgq_rx_data rx_data;
+ enum gh_error err;
+ bool ready = true;
+
+ while (ready) {
+ err = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
+ (uintptr_t)&rx_data.data, sizeof(rx_data.data),
+ &rx_data.length, &ready);
+ if (err != GH_ERROR_OK) {
+ if (err != GH_ERROR_MSGQUEUE_EMPTY)
+ pr_warn("Failed to receive data from msgq for %s: %d\n",
+ msgq->mbox.dev ? dev_name(msgq->mbox.dev) : "", err);
+ break;
+ }
+ mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
+ }
+
+ return IRQ_HANDLED;
+}
+
+/* Fired when message queue transitions from "full" to "space available" to send messages */
+static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
+{
+ struct gh_msgq *msgq = data;
+
+ mbox_chan_txdone(gh_msgq_chan(msgq), 0);
+
+ return IRQ_HANDLED;
+}
+
+/* Fired after sending message and hypercall told us there was more space available. */
+static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
+{
+ struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
+
+ mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
+}
+
+static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
+{
+ struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
+ struct gh_msgq_tx_data *msgq_data = data;
+ u64 tx_flags = 0;
+ enum gh_error gh_error;
+ bool ready;
+
+ if (msgq_data->push)
+ tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
+
+ gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length,
+ (uintptr_t)msgq_data->data, tx_flags, &ready);
+
+ /**
+ * unlikely because Linux tracks state of msgq and should not try to
+ * send message when msgq is full.
+ */
+ if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
+ return -EAGAIN;
+
+ /**
+ * Propagate all other errors to client. If we return error to mailbox
+ * framework, then no other messages can be sent and nobody will know
+ * to retry this message.
+ */
+ msgq->last_ret = gh_remap_error(gh_error);
+
+ /**
+ * This message was successfully sent, but message queue isn't ready to
+ * receive more messages because it's now full. Mailbox framework
+ * requires that we only report that message was transmitted when
+ * we're ready to transmit another message. We'll get that in the form
+ * of tx IRQ once the other side starts to drain the msgq.
+ */
+ if (gh_error == GH_ERROR_OK && !ready)
+ return 0;
+
+ /**
+ * We can send more messages. Mailbox framework requires that tx done
+ * happens asynchronously to sending the message. Gunyah message queues
+ * tell us right away on the hypercall return whether we can send more
+ * messages. To work around this, defer the txdone to a tasklet.
+ */
+ tasklet_schedule(&msgq->txdone_tasklet);
+
+ return 0;
+}
+
+static struct mbox_chan_ops gh_msgq_ops = {
+ .send_data = gh_msgq_send_data,
+};
+
+/**
+ * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
+ * @parent: optional, device parent used for the mailbox controller
+ * @msgq: Pointer to the gh_msgq to initialize
+ * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
+ * @tx_ghrsc: optional, the transmission side of the message queue
+ * @rx_ghrsc: optional, the receiving side of the message queue
+ *
+ * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with
+ * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
+ * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
+ * is set, the mbox_client should register an .rx_callback() and the message queue driver will
+ * push all available messages upon receiving the RX ready interrupt. The messages should be
+ * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
+ * after the callback.
+ *
+ * Returns - 0 on success, negative otherwise
+ */
+int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
+ struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
+{
+ int ret;
+
+ /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
+ if ((!tx_ghrsc && !rx_ghrsc) ||
+ (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
+ (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
+ return -EINVAL;
+
+ if (gh_api_version() != GUNYAH_API_V1) {
+ pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
+ gh_api_version(), GUNYAH_API_V1);
+ return -EOPNOTSUPP;
+ }
+
+ if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
+ return -EOPNOTSUPP;
+
+ msgq->tx_ghrsc = tx_ghrsc;
+ msgq->rx_ghrsc = rx_ghrsc;
+
+ msgq->mbox.dev = parent;
+ msgq->mbox.ops = &gh_msgq_ops;
+ msgq->mbox.num_chans = 1;
+ msgq->mbox.txdone_irq = true;
+ msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);
+ if (!msgq->mbox.chans)
+ return -ENOMEM;
+
+ if (msgq->tx_ghrsc) {
+ ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
+ msgq);
+ if (ret)
+ goto err_chans;
+ }
+
+ if (msgq->rx_ghrsc) {
+ ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
+ IRQF_ONESHOT, "gh_msgq_rx", msgq);
+ if (ret)
+ goto err_tx_irq;
+ }
+
+ tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
+
+ ret = mbox_controller_register(&msgq->mbox);
+ if (ret)
+ goto err_rx_irq;
+
+ ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
+ if (ret)
+ goto err_mbox;
+
+ return 0;
+err_mbox:
+ mbox_controller_unregister(&msgq->mbox);
+err_rx_irq:
+ if (msgq->rx_ghrsc)
+ free_irq(msgq->rx_ghrsc->irq, msgq);
+err_tx_irq:
+ if (msgq->tx_ghrsc)
+ free_irq(msgq->tx_ghrsc->irq, msgq);
+err_chans:
+ kfree(msgq->mbox.chans);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_msgq_init);
+
+void gh_msgq_remove(struct gh_msgq *msgq)
+{
+ mbox_controller_unregister(&msgq->mbox);
+
+ if (msgq->rx_ghrsc)
+ free_irq(msgq->rx_ghrsc->irq, msgq);
+
+ if (msgq->tx_ghrsc)
+ free_irq(msgq->tx_ghrsc->irq, msgq);
+
+ kfree(msgq->mbox.chans);
+}
+EXPORT_SYMBOL_GPL(gh_msgq_remove);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Message Queue Driver");
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index cb6df4eec5c2..2e13669c6363 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -8,11 +8,67 @@

#include <linux/bitfield.h>
#include <linux/errno.h>
+#include <linux/interrupt.h>
#include <linux/limits.h>
+#include <linux/mailbox_controller.h>
+#include <linux/mailbox_client.h>
#include <linux/types.h>

+/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
+enum gunyah_resource_type {
+ GUNYAH_RESOURCE_TYPE_BELL_TX = 0,
+ GUNYAH_RESOURCE_TYPE_BELL_RX = 1,
+ GUNYAH_RESOURCE_TYPE_MSGQ_TX = 2,
+ GUNYAH_RESOURCE_TYPE_MSGQ_RX = 3,
+ GUNYAH_RESOURCE_TYPE_VCPU = 4,
+};
+
+struct gunyah_resource {
+ enum gunyah_resource_type type;
+ u64 capid;
+ int irq;
+};
+
+/**
+ * Gunyah Message Queues
+ */
+
+#define GH_MSGQ_MAX_MSG_SIZE 240
+
+struct gh_msgq_tx_data {
+ size_t length;
+ bool push;
+ char data[];
+};
+
+struct gh_msgq_rx_data {
+ size_t length;
+ char data[GH_MSGQ_MAX_MSG_SIZE];
+};
+
+struct gh_msgq {
+ struct gunyah_resource *tx_ghrsc;
+ struct gunyah_resource *rx_ghrsc;
+
+ /* msgq private */
+ int last_ret; /* Linux error, not GH_STATUS_* */
+ struct mbox_controller mbox;
+ struct tasklet_struct txdone_tasklet;
+};
+
+
+int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
+ struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc);
+void gh_msgq_remove(struct gh_msgq *msgq);
+
+static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
+{
+ return &msgq->mbox.chans[0];
+}
+
/******************************************************************************/
/* Common arch-independent definitions for Gunyah hypercalls */
+
#define GH_CAPID_INVAL U64_MAX
#define GH_VMID_ROOT_VM 0xff

--
2.39.1


2023-02-14 21:24:31

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core


The resource manager is a special virtual machine which is always
running on a Gunyah system. It provides APIs for creating and destroying
VMs, secure memory management, sharing/lending of memory between VMs,
and setup of inter-VM communication. Calls to the resource manager are
made via message queues.

This patch implements the basic probing and RPC mechanism to make those
API calls. Request/response calls can be made with gh_rm_call.
Drivers can also register to notifications pushed by RM via
gh_rm_register_notifier

Specific API calls that resource manager supports will be implemented in
subsequent patches.

Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 3 +
drivers/virt/gunyah/rsc_mgr.c | 604 +++++++++++++++++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 77 +++++
include/linux/gunyah_rsc_mgr.h | 24 ++
4 files changed, 708 insertions(+)
create mode 100644 drivers/virt/gunyah/rsc_mgr.c
create mode 100644 drivers/virt/gunyah/rsc_mgr.h
create mode 100644 include/linux/gunyah_rsc_mgr.h

diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 34f32110faf9..cc864ff5abbb 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,3 +1,6 @@
# SPDX-License-Identifier: GPL-2.0

obj-$(CONFIG_GUNYAH) += gunyah.o
+
+gunyah_rsc_mgr-y += rsc_mgr.o
+obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
new file mode 100644
index 000000000000..2a47139873a8
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr.c
@@ -0,0 +1,604 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/of.h>
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/gunyah.h>
+#include <linux/module.h>
+#include <linux/of_irq.h>
+#include <linux/kthread.h>
+#include <linux/notifier.h>
+#include <linux/workqueue.h>
+#include <linux/completion.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/platform_device.h>
+
+#include "rsc_mgr.h"
+
+#define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
+#define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
+#define RM_RPC_API_VERSION FIELD_PREP(RM_RPC_API_VERSION_MASK, 1)
+#define RM_RPC_HEADER_WORDS FIELD_PREP(RM_RPC_HEADER_WORDS_MASK, \
+ (sizeof(struct gh_rm_rpc_hdr) / sizeof(u32)))
+#define RM_RPC_API (RM_RPC_API_VERSION | RM_RPC_HEADER_WORDS)
+
+#define RM_RPC_TYPE_CONTINUATION 0x0
+#define RM_RPC_TYPE_REQUEST 0x1
+#define RM_RPC_TYPE_REPLY 0x2
+#define RM_RPC_TYPE_NOTIF 0x3
+#define RM_RPC_TYPE_MASK GENMASK(1, 0)
+
+#define GH_RM_MAX_NUM_FRAGMENTS 62
+#define RM_RPC_FRAGMENTS_MASK GENMASK(7, 2)
+
+struct gh_rm_rpc_hdr {
+ u8 api;
+ u8 type;
+ __le16 seq;
+ __le32 msg_id;
+} __packed;
+
+struct gh_rm_rpc_reply_hdr {
+ struct gh_rm_rpc_hdr hdr;
+ __le32 err_code; /* GH_RM_ERROR_* */
+} __packed;
+
+#define GH_RM_MAX_MSG_SIZE (GH_MSGQ_MAX_MSG_SIZE - sizeof(struct gh_rm_rpc_hdr))
+
+/**
+ * struct gh_rm_connection - Represents a complete message from resource manager
+ * @payload: Combined payload of all the fragments (msg headers stripped off).
+ * @size: Size of the payload received so far.
+ * @msg_id: Message ID from the header.
+ * @num_fragments: total number of fragments expected to be received.
+ * @fragments_received: fragments received so far.
+ * @reply: Fields used for request/reply sequences
+ * @ret: Linux return code, set in case there was an error processing connection
+ * @type: RM_RPC_TYPE_REPLY or RM_RPC_TYPE_NOTIF.
+ * @rm_error: For request/reply sequences with standard replies.
+ * @seq: Sequence ID for the main message.
+ * @seq_done: Signals caller that the RM reply has been received
+ * @notification: Fields used for notifiations
+ * @work: Triggered when all fragments of a notification received
+ */
+struct gh_rm_connection {
+ void *payload;
+ size_t size;
+ __le32 msg_id;
+ u8 type;
+
+ u8 num_fragments;
+ u8 fragments_received;
+
+ union {
+ struct {
+ int ret;
+ u16 seq;
+ enum gh_rm_error rm_error;
+ struct completion seq_done;
+ } reply;
+
+ struct {
+ struct gh_rm *rm;
+ struct work_struct work;
+ } notification;
+ };
+};
+
+struct gh_rm {
+ struct device *dev;
+ struct gunyah_resource tx_ghrsc, rx_ghrsc;
+ struct gh_msgq msgq;
+ struct mbox_client msgq_client;
+ struct gh_rm_connection *active_rx_connection;
+ int last_tx_ret;
+
+ struct idr call_idr;
+ struct mutex call_idr_lock;
+
+ struct kmem_cache *cache;
+ struct mutex send_lock;
+ struct blocking_notifier_head nh;
+};
+
+static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
+{
+ struct gh_rm_connection *connection;
+
+ connection = kzalloc(sizeof(*connection), GFP_KERNEL);
+ if (!connection)
+ return ERR_PTR(-ENOMEM);
+
+ connection->type = type;
+ connection->msg_id = msg_id;
+
+ return connection;
+}
+
+static int gh_rm_init_connection_payload(struct gh_rm_connection *connection, void *msg,
+ size_t hdr_size, size_t msg_size)
+{
+ size_t max_buf_size, payload_size;
+ struct gh_rm_rpc_hdr *hdr = msg;
+
+ if (hdr_size > msg_size)
+ return -EINVAL;
+
+ payload_size = msg_size - hdr_size;
+
+ connection->num_fragments = FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type);
+ connection->fragments_received = 0;
+
+ /* There's not going to be any payload, no need to allocate buffer. */
+ if (!payload_size && !connection->num_fragments)
+ return 0;
+
+ if (connection->num_fragments > GH_RM_MAX_NUM_FRAGMENTS)
+ return -EINVAL;
+
+ max_buf_size = payload_size + (connection->num_fragments * GH_RM_MAX_MSG_SIZE);
+
+ connection->payload = kzalloc(max_buf_size, GFP_KERNEL);
+ if (!connection->payload)
+ return -ENOMEM;
+
+ memcpy(connection->payload, msg + hdr_size, payload_size);
+ connection->size = payload_size;
+ return 0;
+}
+
+static void gh_rm_notif_work(struct work_struct *work)
+{
+ struct gh_rm_connection *connection = container_of(work, struct gh_rm_connection,
+ notification.work);
+ struct gh_rm *rm = connection->notification.rm;
+
+ blocking_notifier_call_chain(&rm->nh, connection->msg_id, connection->payload);
+
+ put_gh_rm(rm);
+ kfree(connection->payload);
+ kfree(connection);
+}
+
+static struct gh_rm_connection *gh_rm_process_notif(struct gh_rm *rm, void *msg, size_t msg_size)
+{
+ struct gh_rm_connection *connection;
+ struct gh_rm_rpc_hdr *hdr = msg;
+ int ret;
+
+ connection = gh_rm_alloc_connection(hdr->msg_id, RM_RPC_TYPE_NOTIF);
+ if (IS_ERR(connection)) {
+ dev_err(rm->dev, "Failed to alloc connection for notification: %ld, dropping.\n",
+ PTR_ERR(connection));
+ return NULL;
+ }
+
+ get_gh_rm(rm);
+ connection->notification.rm = rm;
+ INIT_WORK(&connection->notification.work, gh_rm_notif_work);
+
+ ret = gh_rm_init_connection_payload(connection, msg, sizeof(*hdr), msg_size);
+ if (ret) {
+ dev_err(rm->dev, "Failed to initialize connection buffer for notification: %d\n",
+ ret);
+ kfree(connection);
+ return NULL;
+ }
+
+ return connection;
+}
+
+static struct gh_rm_connection *gh_rm_process_rply(struct gh_rm *rm, void *msg, size_t msg_size)
+{
+ struct gh_rm_rpc_reply_hdr *reply_hdr = msg;
+ struct gh_rm_connection *connection;
+ u16 seq_id = le16_to_cpu(reply_hdr->hdr.seq);
+
+ mutex_lock(&rm->call_idr_lock);
+ connection = idr_find(&rm->call_idr, seq_id);
+ mutex_unlock(&rm->call_idr_lock);
+
+ if (!connection || connection->msg_id != reply_hdr->hdr.msg_id)
+ return NULL;
+
+ if (gh_rm_init_connection_payload(connection, msg, sizeof(*reply_hdr), msg_size)) {
+ dev_err(rm->dev, "Failed to alloc connection buffer for sequence %d\n", seq_id);
+ /* Send connection complete and error the client. */
+ connection->reply.ret = -ENOMEM;
+ complete(&connection->reply.seq_done);
+ return NULL;
+ }
+
+ connection->reply.rm_error = le32_to_cpu(reply_hdr->err_code);
+ return connection;
+}
+
+static int gh_rm_process_cont(struct gh_rm *rm, struct gh_rm_connection *connection,
+ void *msg, size_t msg_size)
+{
+ struct gh_rm_rpc_hdr *hdr = msg;
+ size_t payload_size = msg_size - sizeof(*hdr);
+
+ /*
+ * hdr->fragments and hdr->msg_id preserves the value from first reply
+ * or notif message. To detect mishandling, check it's still intact.
+ */
+ if (connection->msg_id != hdr->msg_id ||
+ connection->num_fragments != FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type))
+ return -EINVAL;
+
+ memcpy(connection->payload + connection->size, msg + sizeof(*hdr), payload_size);
+ connection->size += payload_size;
+ connection->fragments_received++;
+ return 0;
+}
+
+static void gh_rm_abort_connection(struct gh_rm_connection *connection)
+{
+ switch (connection->type) {
+ case RM_RPC_TYPE_REPLY:
+ connection->reply.ret = -EIO;
+ complete(&connection->reply.seq_done);
+ break;
+ case RM_RPC_TYPE_NOTIF:
+ fallthrough;
+ default:
+ kfree(connection->payload);
+ kfree(connection);
+ }
+}
+
+static bool gh_rm_complete_connection(struct gh_rm *rm, struct gh_rm_connection *connection)
+{
+ if (!connection || connection->fragments_received != connection->num_fragments)
+ return false;
+
+ switch (connection->type) {
+ case RM_RPC_TYPE_REPLY:
+ complete(&connection->reply.seq_done);
+ break;
+ case RM_RPC_TYPE_NOTIF:
+ schedule_work(&connection->notification.work);
+ break;
+ default:
+ dev_err(rm->dev, "Invalid message type (%d) received\n", connection->type);
+ gh_rm_abort_connection(connection);
+ break;
+ }
+
+ return true;
+}
+
+static void gh_rm_msgq_rx_data(struct mbox_client *cl, void *mssg)
+{
+ struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
+ struct gh_msgq_rx_data *rx_data = mssg;
+ size_t msg_size = rx_data->length;
+ void *msg = rx_data->data;
+ struct gh_rm_rpc_hdr *hdr;
+
+ if (msg_size <= sizeof(*hdr) || msg_size > GH_MSGQ_MAX_MSG_SIZE)
+ return;
+
+ hdr = msg;
+ if (hdr->api != RM_RPC_API) {
+ dev_err(rm->dev, "Unknown RM RPC API version: %x\n", hdr->api);
+ return;
+ }
+
+ switch (FIELD_GET(RM_RPC_TYPE_MASK, hdr->type)) {
+ case RM_RPC_TYPE_NOTIF:
+ rm->active_rx_connection = gh_rm_process_notif(rm, msg, msg_size);
+ break;
+ case RM_RPC_TYPE_REPLY:
+ rm->active_rx_connection = gh_rm_process_rply(rm, msg, msg_size);
+ break;
+ case RM_RPC_TYPE_CONTINUATION:
+ if (gh_rm_process_cont(rm, rm->active_rx_connection, msg, msg_size)) {
+ gh_rm_abort_connection(rm->active_rx_connection);
+ rm->active_rx_connection = NULL;
+ }
+ break;
+ default:
+ dev_err(rm->dev, "Invalid message type (%lu) received\n",
+ FIELD_GET(RM_RPC_TYPE_MASK, hdr->type));
+ return;
+ }
+
+ if (gh_rm_complete_connection(rm, rm->active_rx_connection))
+ rm->active_rx_connection = NULL;
+}
+
+static void gh_rm_msgq_tx_done(struct mbox_client *cl, void *mssg, int r)
+{
+ struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
+
+ kmem_cache_free(rm->cache, mssg);
+ rm->last_tx_ret = r;
+}
+
+static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
+ const void *req_buff, size_t req_buff_size,
+ struct gh_rm_connection *connection)
+{
+ u8 msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_REQUEST);
+ size_t buff_size_remaining = req_buff_size;
+ const void *req_buff_curr = req_buff;
+ struct gh_msgq_tx_data *msg;
+ struct gh_rm_rpc_hdr *hdr;
+ u32 cont_fragments = 0;
+ size_t payload_size;
+ void *payload;
+ int ret;
+
+ if (req_buff_size)
+ cont_fragments = (req_buff_size - 1) / GH_RM_MAX_MSG_SIZE;
+
+ if (req_buff_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
+ pr_warn("Limit exceeded for the number of fragments: %u\n", cont_fragments);
+ dump_stack();
+ return -E2BIG;
+ }
+
+ ret = mutex_lock_interruptible(&rm->send_lock);
+ if (ret)
+ return ret;
+
+ /* Consider also the 'request' packet for the loop count */
+ do {
+ msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
+ if (!msg) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ /* Fill header */
+ hdr = (struct gh_rm_rpc_hdr *)msg->data;
+ hdr->api = RM_RPC_API;
+ hdr->type = msg_type | FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
+ hdr->seq = cpu_to_le16(connection->reply.seq);
+ hdr->msg_id = cpu_to_le32(message_id);
+
+ /* Copy payload */
+ payload = hdr + 1;
+ payload_size = min(buff_size_remaining, GH_RM_MAX_MSG_SIZE);
+ memcpy(payload, req_buff_curr, payload_size);
+ req_buff_curr += payload_size;
+ buff_size_remaining -= payload_size;
+
+ /* Force the last fragment to immediately alert the receiver */
+ msg->push = !buff_size_remaining;
+ msg->length = sizeof(*hdr) + payload_size;
+
+ ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
+ if (ret < 0) {
+ kmem_cache_free(rm->cache, msg);
+ break;
+ }
+
+ if (rm->last_tx_ret) {
+ ret = rm->last_tx_ret;
+ break;
+ }
+
+ msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_CONTINUATION);
+ } while (buff_size_remaining);
+
+out:
+ mutex_unlock(&rm->send_lock);
+ return ret < 0 ? ret : 0;
+}
+
+/**
+ * gh_rm_call: Achieve request-response type communication with RPC
+ * @rm: Pointer to Gunyah resource manager internal data
+ * @message_id: The RM RPC message-id
+ * @req_buff: Request buffer that contains the payload
+ * @req_buff_size: Total size of the payload
+ * @resp_buf: Pointer to a response buffer
+ * @resp_buff_size: Size of the response buffer
+ *
+ * Make a request to the RM-VM and wait for reply back. For a successful
+ * response, the function returns the payload. The size of the payload is set in
+ * resp_buff_size. The resp_buf should be freed by the caller.
+ *
+ * req_buff should be not NULL for req_buff_size >0. If req_buff_size == 0,
+ * req_buff *can* be NULL and no additional payload is sent.
+ *
+ * Context: Process context. Will sleep waiting for reply.
+ * Return: 0 on success. <0 if error.
+ */
+int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff, size_t req_buff_size,
+ void **resp_buf, size_t *resp_buff_size)
+{
+ struct gh_rm_connection *connection;
+ int ret;
+
+ /* message_id 0 is reserved. req_buff_size implies req_buf is not NULL */
+ if (!message_id || (!req_buff && req_buff_size) || !rm)
+ return -EINVAL;
+
+ connection = gh_rm_alloc_connection(cpu_to_le32(message_id), RM_RPC_TYPE_REPLY);
+ if (IS_ERR(connection))
+ return PTR_ERR(connection);
+
+ init_completion(&connection->reply.seq_done);
+
+ /* Allocate a new seq number for this connection */
+ mutex_lock(&rm->call_idr_lock);
+ ret = idr_alloc_cyclic(&rm->call_idr, connection, 0, U16_MAX,
+ GFP_KERNEL);
+ mutex_unlock(&rm->call_idr_lock);
+ if (ret < 0)
+ goto out;
+ connection->reply.seq = ret;
+
+ /* Send the request to the Resource Manager */
+ ret = gh_rm_send_request(rm, message_id, req_buff, req_buff_size, connection);
+ if (ret < 0)
+ goto out;
+
+ /* Wait for response */
+ ret = wait_for_completion_interruptible(&connection->reply.seq_done);
+ if (ret)
+ goto out;
+
+ /* Check for internal (kernel) error waiting for the response */
+ if (connection->reply.ret) {
+ ret = connection->reply.ret;
+ if (ret != -ENOMEM)
+ kfree(connection->payload);
+ goto out;
+ }
+
+ /* Got a response, did resource manager give us an error? */
+ if (connection->reply.rm_error != GH_RM_ERROR_OK) {
+ pr_warn("RM rejected message %08x. Error: %d\n", message_id,
+ connection->reply.rm_error);
+ dump_stack();
+ ret = gh_rm_remap_error(connection->reply.rm_error);
+ kfree(connection->payload);
+ goto out;
+ }
+
+ /* Everything looks good, return the payload */
+ *resp_buff_size = connection->size;
+ if (connection->size)
+ *resp_buf = connection->payload;
+ else {
+ /* kfree in case RM sent us multiple fragments but never any data in
+ * those fragments. We would've allocated memory for it, but connection->size == 0
+ */
+ kfree(connection->payload);
+ }
+
+out:
+ mutex_lock(&rm->call_idr_lock);
+ idr_remove(&rm->call_idr, connection->reply.seq);
+ mutex_unlock(&rm->call_idr_lock);
+ kfree(connection);
+ return ret;
+}
+
+
+int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb)
+{
+ return blocking_notifier_chain_register(&rm->nh, nb);
+}
+EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
+
+int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb)
+{
+ return blocking_notifier_chain_unregister(&rm->nh, nb);
+}
+EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
+
+void get_gh_rm(struct gh_rm *rm)
+{
+ get_device(rm->dev);
+}
+EXPORT_SYMBOL_GPL(get_gh_rm);
+
+void put_gh_rm(struct gh_rm *rm)
+{
+ put_device(rm->dev);
+}
+EXPORT_SYMBOL_GPL(put_gh_rm);
+
+static int gh_msgq_platform_probe_direction(struct platform_device *pdev,
+ bool tx, int idx, struct gunyah_resource *ghrsc)
+{
+ struct device_node *node = pdev->dev.of_node;
+ int ret;
+
+ ghrsc->type = tx ? GUNYAH_RESOURCE_TYPE_MSGQ_TX : GUNYAH_RESOURCE_TYPE_MSGQ_RX;
+
+ ghrsc->irq = platform_get_irq(pdev, idx);
+ if (ghrsc->irq < 0) {
+ dev_err(&pdev->dev, "Failed to get irq%d: %d\n", idx, ghrsc->irq);
+ return ghrsc->irq;
+ }
+
+ ret = of_property_read_u64_index(node, "reg", idx, &ghrsc->capid);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to get capid%d: %d\n", idx, ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int gh_rm_drv_probe(struct platform_device *pdev)
+{
+ struct gh_msgq_tx_data *msg;
+ struct gh_rm *rm;
+ int ret;
+
+ rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
+ if (!rm)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, rm);
+ rm->dev = &pdev->dev;
+
+ mutex_init(&rm->call_idr_lock);
+ idr_init(&rm->call_idr);
+ rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data, GH_MSGQ_MAX_MSG_SIZE), 0,
+ SLAB_HWCACHE_ALIGN, NULL);
+ if (!rm->cache)
+ return -ENOMEM;
+ mutex_init(&rm->send_lock);
+ BLOCKING_INIT_NOTIFIER_HEAD(&rm->nh);
+
+ ret = gh_msgq_platform_probe_direction(pdev, true, 0, &rm->tx_ghrsc);
+ if (ret)
+ goto err_cache;
+
+ ret = gh_msgq_platform_probe_direction(pdev, false, 1, &rm->rx_ghrsc);
+ if (ret)
+ goto err_cache;
+
+ rm->msgq_client.dev = &pdev->dev;
+ rm->msgq_client.tx_block = true;
+ rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
+ rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
+
+ return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
+err_cache:
+ kmem_cache_destroy(rm->cache);
+ return ret;
+}
+
+static int gh_rm_drv_remove(struct platform_device *pdev)
+{
+ struct gh_rm *rm = platform_get_drvdata(pdev);
+
+ mbox_free_channel(gh_msgq_chan(&rm->msgq));
+ gh_msgq_remove(&rm->msgq);
+ kmem_cache_destroy(rm->cache);
+
+ return 0;
+}
+
+static const struct of_device_id gh_rm_of_match[] = {
+ { .compatible = "gunyah-resource-manager" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, gh_rm_of_match);
+
+static struct platform_driver gh_rm_driver = {
+ .probe = gh_rm_drv_probe,
+ .remove = gh_rm_drv_remove,
+ .driver = {
+ .name = "gh_rsc_mgr",
+ .of_match_table = gh_rm_of_match,
+ },
+};
+module_platform_driver(gh_rm_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Resource Manager Driver");
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
new file mode 100644
index 000000000000..d4e799a7526f
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#ifndef __GH_RSC_MGR_PRIV_H
+#define __GH_RSC_MGR_PRIV_H
+
+#include <linux/gunyah.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/types.h>
+
+/* RM Error codes */
+enum gh_rm_error {
+ GH_RM_ERROR_OK = 0x0,
+ GH_RM_ERROR_UNIMPLEMENTED = 0xFFFFFFFF,
+ GH_RM_ERROR_NOMEM = 0x1,
+ GH_RM_ERROR_NORESOURCE = 0x2,
+ GH_RM_ERROR_DENIED = 0x3,
+ GH_RM_ERROR_INVALID = 0x4,
+ GH_RM_ERROR_BUSY = 0x5,
+ GH_RM_ERROR_ARGUMENT_INVALID = 0x6,
+ GH_RM_ERROR_HANDLE_INVALID = 0x7,
+ GH_RM_ERROR_VALIDATE_FAILED = 0x8,
+ GH_RM_ERROR_MAP_FAILED = 0x9,
+ GH_RM_ERROR_MEM_INVALID = 0xA,
+ GH_RM_ERROR_MEM_INUSE = 0xB,
+ GH_RM_ERROR_MEM_RELEASED = 0xC,
+ GH_RM_ERROR_VMID_INVALID = 0xD,
+ GH_RM_ERROR_LOOKUP_FAILED = 0xE,
+ GH_RM_ERROR_IRQ_INVALID = 0xF,
+ GH_RM_ERROR_IRQ_INUSE = 0x10,
+ GH_RM_ERROR_IRQ_RELEASED = 0x11,
+};
+
+/**
+ * gh_rm_remap_error() - Remap Gunyah resource manager errors into a Linux error code
+ * @gh_error: "Standard" return value from Gunyah resource manager
+ */
+static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
+{
+ switch (rm_error) {
+ case GH_RM_ERROR_OK:
+ return 0;
+ case GH_RM_ERROR_UNIMPLEMENTED:
+ return -EOPNOTSUPP;
+ case GH_RM_ERROR_NOMEM:
+ return -ENOMEM;
+ case GH_RM_ERROR_NORESOURCE:
+ return -ENODEV;
+ case GH_RM_ERROR_DENIED:
+ return -EPERM;
+ case GH_RM_ERROR_BUSY:
+ return -EBUSY;
+ case GH_RM_ERROR_INVALID:
+ case GH_RM_ERROR_ARGUMENT_INVALID:
+ case GH_RM_ERROR_HANDLE_INVALID:
+ case GH_RM_ERROR_VALIDATE_FAILED:
+ case GH_RM_ERROR_MAP_FAILED:
+ case GH_RM_ERROR_MEM_INVALID:
+ case GH_RM_ERROR_MEM_INUSE:
+ case GH_RM_ERROR_MEM_RELEASED:
+ case GH_RM_ERROR_VMID_INVALID:
+ case GH_RM_ERROR_LOOKUP_FAILED:
+ case GH_RM_ERROR_IRQ_INVALID:
+ case GH_RM_ERROR_IRQ_INUSE:
+ case GH_RM_ERROR_IRQ_RELEASED:
+ return -EINVAL;
+ default:
+ return -EBADMSG;
+ }
+}
+
+struct gh_rm;
+int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
+ void **resp_buf, size_t *resp_buff_size);
+
+#endif
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
new file mode 100644
index 000000000000..c992b3188c8d
--- /dev/null
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _GUNYAH_RSC_MGR_H
+#define _GUNYAH_RSC_MGR_H
+
+#include <linux/list.h>
+#include <linux/notifier.h>
+#include <linux/gunyah.h>
+
+#define GH_VMID_INVAL U16_MAX
+
+/* Gunyah recognizes VMID0 as an alias to the current VM's ID */
+#define GH_VMID_SELF 0
+
+struct gh_rm;
+int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
+int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
+void get_gh_rm(struct gh_rm *rm);
+void put_gh_rm(struct gh_rm *rm);
+
+#endif
--
2.39.1


2023-02-14 21:25:00

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager


Gunyah VM manager is a kernel moduel which exposes an interface to
Gunyah userspace to load, run, and interact with other Gunyah virtual
machines. The interface is a character device at /dev/gunyah.

Add a basic VM manager driver. Upcoming patches will add more ioctls
into this driver.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
.../userspace-api/ioctl/ioctl-number.rst | 1 +
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/rsc_mgr.c | 37 +++++-
drivers/virt/gunyah/vm_mgr.c | 118 ++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 22 ++++
include/uapi/linux/gunyah.h | 23 ++++
6 files changed, 201 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/gunyah/vm_mgr.c
create mode 100644 drivers/virt/gunyah/vm_mgr.h
create mode 100644 include/uapi/linux/gunyah.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 0a1882e296ae..2513324ae7be 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -137,6 +137,7 @@ Code Seq# Include File Comments
'F' DD video/sstfb.h conflict!
'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict!
'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict!
+'G' 00-0f linux/gunyah.h conflict!
'H' 00-7F linux/hiddev.h conflict!
'H' 00-0F linux/hidraw.h conflict!
'H' 01 linux/mei.h conflict!
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index de29769f2f3f..03951cf82023 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,5 +2,5 @@

obj-$(CONFIG_GUNYAH) += gunyah.o

-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
index 2a47139873a8..73c5a6b7cbbc 100644
--- a/drivers/virt/gunyah/rsc_mgr.c
+++ b/drivers/virt/gunyah/rsc_mgr.c
@@ -16,8 +16,10 @@
#include <linux/completion.h>
#include <linux/gunyah_rsc_mgr.h>
#include <linux/platform_device.h>
+#include <linux/miscdevice.h>

#include "rsc_mgr.h"
+#include "vm_mgr.h"

#define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
#define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
@@ -103,6 +105,8 @@ struct gh_rm {
struct kmem_cache *cache;
struct mutex send_lock;
struct blocking_notifier_head nh;
+
+ struct miscdevice miscdev;
};

static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
@@ -509,6 +513,21 @@ void put_gh_rm(struct gh_rm *rm)
}
EXPORT_SYMBOL_GPL(put_gh_rm);

+static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+ struct miscdevice *miscdev = filp->private_data;
+ struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
+
+ return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
+}
+
+static const struct file_operations gh_dev_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = gh_dev_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
+ .llseek = noop_llseek,
+};
+
static int gh_msgq_platform_probe_direction(struct platform_device *pdev,
bool tx, int idx, struct gunyah_resource *ghrsc)
{
@@ -567,7 +586,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
rm->msgq_client.tx_done = gh_rm_msgq_tx_done;

- return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
+ ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
+ if (ret)
+ goto err_cache;
+
+ rm->miscdev.name = "gunyah";
+ rm->miscdev.minor = MISC_DYNAMIC_MINOR;
+ rm->miscdev.fops = &gh_dev_fops;
+
+ ret = misc_register(&rm->miscdev);
+ if (ret)
+ goto err_msgq;
+
+ return 0;
+err_msgq:
+ mbox_free_channel(gh_msgq_chan(&rm->msgq));
+ gh_msgq_remove(&rm->msgq);
err_cache:
kmem_cache_destroy(rm->cache);
return ret;
@@ -577,6 +611,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
{
struct gh_rm *rm = platform_get_drvdata(pdev);

+ misc_deregister(&rm->miscdev);
mbox_free_channel(gh_msgq_chan(&rm->msgq));
gh_msgq_remove(&rm->msgq);
kmem_cache_destroy(rm->cache);
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
new file mode 100644
index 000000000000..fd890a57172e
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gh_vm_mgr: " fmt
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+
+#include <uapi/linux/gunyah.h>
+
+#include "vm_mgr.h"
+
+static void gh_vm_free(struct work_struct *work)
+{
+ struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
+ int ret;
+
+ ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
+ if (ret)
+ pr_warn("Failed to deallocate vmid: %d\n", ret);
+
+ put_gh_rm(ghvm->rm);
+ kfree(ghvm);
+}
+
+static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
+{
+ struct gh_vm *ghvm;
+ int vmid;
+
+ vmid = gh_rm_alloc_vmid(rm, 0);
+ if (vmid < 0)
+ return ERR_PTR(vmid);
+
+ ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
+ if (!ghvm) {
+ gh_rm_dealloc_vmid(rm, vmid);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ get_gh_rm(rm);
+
+ ghvm->vmid = vmid;
+ ghvm->rm = rm;
+
+ INIT_WORK(&ghvm->free_work, gh_vm_free);
+
+ return ghvm;
+}
+
+static int gh_vm_release(struct inode *inode, struct file *filp)
+{
+ struct gh_vm *ghvm = filp->private_data;
+
+ /* VM will be reset and make RM calls which can interruptible sleep.
+ * Defer to a work so this thread can receive signal.
+ */
+ schedule_work(&ghvm->free_work);
+ return 0;
+}
+
+static const struct file_operations gh_vm_fops = {
+ .release = gh_vm_release,
+ .compat_ioctl = compat_ptr_ioctl,
+ .llseek = noop_llseek,
+};
+
+static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
+{
+ struct gh_vm *ghvm;
+ struct file *file;
+ int fd, err;
+
+ /* arg reserved for future use. */
+ if (arg)
+ return -EINVAL;
+
+ ghvm = gh_vm_alloc(rm);
+ if (IS_ERR(ghvm))
+ return PTR_ERR(ghvm);
+
+ fd = get_unused_fd_flags(O_CLOEXEC);
+ if (fd < 0) {
+ err = fd;
+ goto err_destroy_vm;
+ }
+
+ file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
+ if (IS_ERR(file)) {
+ err = PTR_ERR(file);
+ goto err_put_fd;
+ }
+
+ fd_install(fd, file);
+
+ return fd;
+
+err_put_fd:
+ put_unused_fd(fd);
+err_destroy_vm:
+ kfree(ghvm);
+ return err;
+}
+
+long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
+{
+ switch (cmd) {
+ case GH_CREATE_VM:
+ return gh_dev_ioctl_create_vm(rm, arg);
+ default:
+ return -ENOIOCTLCMD;
+ }
+}
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
new file mode 100644
index 000000000000..76954da706e9
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _GH_PRIV_VM_MGR_H
+#define _GH_PRIV_VM_MGR_H
+
+#include <linux/gunyah_rsc_mgr.h>
+
+#include <uapi/linux/gunyah.h>
+
+long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
+
+struct gh_vm {
+ u16 vmid;
+ struct gh_rm *rm;
+
+ struct work_struct free_work;
+};
+
+#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
new file mode 100644
index 000000000000..10ba32d2b0a6
--- /dev/null
+++ b/include/uapi/linux/gunyah.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _UAPI_LINUX_GUNYAH
+#define _UAPI_LINUX_GUNYAH
+
+/*
+ * Userspace interface for /dev/gunyah - gunyah based virtual machine
+ */
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#define GH_IOCTL_TYPE 'G'
+
+/*
+ * ioctls for /dev/gunyah fds:
+ */
+#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
+
+#endif
--
2.39.1


2023-02-14 21:25:06

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 11/26] gunyah: rsc_mgr: Add RPC for sharing memory


Gunyah resource manager provides API to manipulate stage 2 page tables.
Manipulations are represented as a memory parcel. Memory parcels
describe a list of memory regions (intermediate physical address and
size), a list of new permissions for VMs, and the memory type (DDR or
MMIO). Memory parcels are uniquely identified by a handle allocated by
Gunyah. There are a few types of memory parcel sharing which Gunyah
supports:

- Sharing: the guest and host VM both have access
- Lending: only the guest has access; host VM loses access
- Donating: Permanently lent (not reclaimed even if guest shuts down)

Memory parcels that have been shared or lent can be reclaimed by the
host via an additional call. The reclaim operation restores the original
access the host VM had to the memory parcel and removes the access to
other VM.

One point to note that memory parcels don't describe where in the guest
VM the memory parcel should reside. The guest VM must accept the memory
parcel either explicitly via a "gh_rm_mem_accept" call (not introduced
here) or be configured to accept it automatically at boot. As the guest
VM accepts the memory parcel, it also mentions the IPA it wants to place
memory parcel.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/rsc_mgr.h | 44 +++++++
drivers/virt/gunyah/rsc_mgr_rpc.c | 185 ++++++++++++++++++++++++++++++
include/linux/gunyah_rsc_mgr.h | 47 ++++++++
3 files changed, 276 insertions(+)

diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 7406237bc66d..9b23cefe02b0 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -74,6 +74,12 @@ struct gh_rm;
int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
void **resp_buf, size_t *resp_buff_size);

+/* Message IDs: Memory Management */
+#define GH_RM_RPC_MEM_LEND 0x51000012
+#define GH_RM_RPC_MEM_SHARE 0x51000013
+#define GH_RM_RPC_MEM_RECLAIM 0x51000015
+#define GH_RM_RPC_MEM_APPEND 0x51000018
+
/* Message IDs: VM Management */
#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
@@ -90,6 +96,44 @@ struct gh_rm_vm_common_vmid_req {
__le16 reserved0;
} __packed;

+/* Call: MEM_LEND, MEM_SHARE */
+struct gh_rm_mem_share_req_header {
+ u8 mem_type;
+ u8 reserved0;
+#define GH_MEM_SHARE_REQ_FLAGS_APPEND BIT(1)
+ u8 flags;
+ u8 reserved1;
+ __le32 label;
+} __packed;
+
+struct gh_rm_mem_share_req_acl_section {
+ __le32 n_entries;
+ struct gh_rm_mem_acl_entry entries[];
+};
+
+struct gh_rm_mem_share_req_mem_section {
+ __le16 n_entries;
+ __le16 reserved0;
+ struct gh_rm_mem_entry entries[];
+};
+
+/* Call: MEM_RELEASE */
+struct gh_rm_mem_release_req {
+ __le32 mem_handle;
+ u8 flags; /* currently not used */
+ __le16 reserved0;
+ u8 reserved1;
+} __packed;
+
+/* Call: MEM_APPEND */
+struct gh_rm_mem_append_req_header {
+ __le32 mem_handle;
+#define GH_MEM_APPEND_REQ_FLAGS_END BIT(0)
+ u8 flags;
+ __le16 reserved0;
+ u8 reserved1;
+} __packed;
+
/* Call: VM_ALLOC */
struct gh_rm_vm_alloc_vmid_resp {
__le16 vmid;
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index 4515cdd80106..0c83b097fec9 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -7,6 +7,8 @@

#include "rsc_mgr.h"

+#define GH_RM_MAX_MEM_ENTRIES 512
+
/*
* Several RM calls take only a VMID as a parameter and give only standard
* response back. Deduplicate boilerplate code by using this common call.
@@ -22,6 +24,189 @@ static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), &resp, &resp_size);
}

+static int _gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle, bool end_append,
+ struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
+{
+ struct gh_rm_mem_share_req_mem_section *mem_section;
+ struct gh_rm_mem_append_req_header *req_header;
+ size_t msg_size = 0, resp_size;
+ void *msg, *resp;
+ int ret;
+
+ msg_size += sizeof(struct gh_rm_mem_append_req_header);
+ msg_size += struct_size(mem_section, entries, n_mem_entries);
+
+ msg = kzalloc(msg_size, GFP_KERNEL);
+ if (!msg)
+ return -ENOMEM;
+
+ req_header = msg;
+ mem_section = (void *)req_header + sizeof(struct gh_rm_mem_append_req_header);
+
+ req_header->mem_handle = cpu_to_le32(mem_handle);
+ if (end_append)
+ req_header->flags |= GH_MEM_APPEND_REQ_FLAGS_END;
+
+ mem_section->n_entries = cpu_to_le16(n_mem_entries);
+ memcpy(mem_section->entries, mem_entries, sizeof(*mem_entries) * n_mem_entries);
+
+ ret = gh_rm_call(rm, GH_RM_RPC_MEM_APPEND, msg, msg_size, &resp, &resp_size);
+ kfree(msg);
+
+ return ret;
+}
+
+static int gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle,
+ struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
+{
+ bool end_append;
+ int ret = 0;
+ size_t n;
+
+ while (n_mem_entries) {
+ if (n_mem_entries > GH_RM_MAX_MEM_ENTRIES) {
+ end_append = false;
+ n = GH_RM_MAX_MEM_ENTRIES;
+ } else {
+ end_append = true;
+ n = n_mem_entries;
+ }
+
+ ret = _gh_rm_mem_append(rm, mem_handle, end_append, mem_entries, n);
+ if (ret)
+ break;
+
+ mem_entries += n;
+ n_mem_entries -= n;
+ }
+
+ return ret;
+}
+
+static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_mem_parcel *p)
+{
+ size_t msg_size = 0, initial_mem_entries = p->n_mem_entries, resp_size;
+ struct gh_rm_mem_share_req_acl_section *acl_section;
+ struct gh_rm_mem_share_req_mem_section *mem_section;
+ struct gh_rm_mem_share_req_header *req_header;
+ u32 *attr_section;
+ __le32 *resp;
+ void *msg;
+ int ret;
+
+ if (!p->acl_entries || !p->n_acl_entries || !p->mem_entries || !p->n_mem_entries ||
+ p->n_acl_entries > U8_MAX || p->mem_handle != GH_MEM_HANDLE_INVAL)
+ return -EINVAL;
+
+ if (initial_mem_entries > GH_RM_MAX_MEM_ENTRIES)
+ initial_mem_entries = GH_RM_MAX_MEM_ENTRIES;
+
+ /* The format of the message goes:
+ * request header
+ * ACL entries (which VMs get what kind of access to this memory parcel)
+ * Memory entries (list of memory regions to share)
+ * Memory attributes (currently unused, we'll hard-code the size to 0)
+ */
+ msg_size += sizeof(struct gh_rm_mem_share_req_header);
+ msg_size += struct_size(acl_section, entries, p->n_acl_entries);
+ msg_size += struct_size(mem_section, entries, initial_mem_entries);
+ msg_size += sizeof(u32); /* for memory attributes, currently unused */
+
+ msg = kzalloc(msg_size, GFP_KERNEL);
+ if (!msg)
+ return -ENOMEM;
+
+ req_header = msg;
+ acl_section = (void *)req_header + sizeof(*req_header);
+ mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
+ attr_section = (void *)mem_section + struct_size(mem_section, entries, initial_mem_entries);
+
+ req_header->mem_type = p->mem_type;
+ if (initial_mem_entries != p->n_mem_entries)
+ req_header->flags |= GH_MEM_SHARE_REQ_FLAGS_APPEND;
+ req_header->label = cpu_to_le32(p->label);
+
+ acl_section->n_entries = cpu_to_le32(p->n_acl_entries);
+ memcpy(acl_section->entries, p->acl_entries, sizeof(*(p->acl_entries)) * p->n_acl_entries);
+
+ mem_section->n_entries = cpu_to_le16(initial_mem_entries);
+ memcpy(mem_section->entries, p->mem_entries,
+ sizeof(*(p->mem_entries)) * initial_mem_entries);
+
+ /* Set n_entries for memory attribute section to 0 */
+ *attr_section = 0;
+
+ ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
+ kfree(msg);
+
+ if (ret)
+ return ret;
+
+ p->mem_handle = le32_to_cpu(*resp);
+
+ if (initial_mem_entries != p->n_mem_entries) {
+ ret = gh_rm_mem_append(rm, p->mem_handle,
+ &p->mem_entries[initial_mem_entries],
+ p->n_mem_entries - initial_mem_entries);
+ if (ret) {
+ gh_rm_mem_reclaim(rm, p);
+ p->mem_handle = GH_MEM_HANDLE_INVAL;
+ }
+ }
+
+ kfree(resp);
+ return ret;
+}
+
+/**
+ * gh_rm_mem_lend() - Lend memory to other virtual machines.
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Package the memory information of the memory to be lent.
+ *
+ * Lending removes Linux's access to the memory while the memory parcel is lent.
+ */
+int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
+{
+ return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_LEND, parcel);
+}
+
+
+/**
+ * gh_rm_mem_share() - Share memory with other virtual machines.
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Package the memory information of the memory to be shared.
+ *
+ * Sharing keeps Linux's access to the memory while the memory parcel is shared.
+ */
+int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
+{
+ return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_SHARE, parcel);
+}
+
+/**
+ * gh_rm_mem_reclaim() - Reclaim a memory parcel
+ * @rm: Handle to a Gunyah resource manager
+ * @parcel: Package the memory information of the memory to be reclaimed.
+ *
+ * RM maps the associated memory back into the stage-2 page tables of the owner VM.
+ */
+int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
+{
+ struct gh_rm_mem_release_req req = {
+ .mem_handle = cpu_to_le32(parcel->mem_handle),
+ };
+ size_t resp_size;
+ void *resp;
+ int ret;
+
+ ret = gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), &resp, &resp_size);
+ /* Do not call platform mem reclaim hooks: the reclaim didn't happen*/
+ if (ret)
+ return ret;
+
+ return ret;
+}
+
/**
* gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
* @rm: Handle to a Gunyah resource manager
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index e7bd29f8be6e..2d8b8b6cc394 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -11,6 +11,7 @@
#include <linux/gunyah.h>

#define GH_VMID_INVAL U16_MAX
+#define GH_MEM_HANDLE_INVAL U32_MAX

/* Gunyah recognizes VMID0 as an alias to the current VM's ID */
#define GH_VMID_SELF 0
@@ -54,7 +55,53 @@ struct gh_rm_vm_status_payload {

#define GH_RM_NOTIFICATION_VM_STATUS 0x56100008

+struct gh_rm_mem_acl_entry {
+ __le16 vmid;
+#define GH_RM_ACL_X BIT(0)
+#define GH_RM_ACL_W BIT(1)
+#define GH_RM_ACL_R BIT(2)
+ u8 perms;
+ u8 reserved;
+} __packed;
+
+struct gh_rm_mem_entry {
+ __le64 ipa_base;
+ __le64 size;
+} __packed;
+
+enum gh_rm_mem_type {
+ GH_RM_MEM_TYPE_NORMAL = 0,
+ GH_RM_MEM_TYPE_IO = 1,
+};
+
+/*
+ * struct gh_rm_mem_parcel - Package info about memory to be lent/shared/donated/reclaimed
+ * @mem_type: The type of memory: normal (DDR) or IO
+ * @label: An client-specified identifier which can be used by the other VMs to identify the purpose
+ * of the memory parcel.
+ * @acl_entries: An array of access control entries. Each entry specifies a VM and what access
+ * is allowed for the memory parcel.
+ * @n_acl_entries: Count of the number of entries in the `acl_entries` array.
+ * @mem_entries: An list of regions to be associated with the memory parcel. Addresses should be
+ * (intermediate) physical addresses from Linux's perspective.
+ * @n_mem_entries: Count of the number of entries in the `mem_entries` array.
+ * @mem_handle: On success, filled with memory handle that RM allocates for this memory parcel
+ */
+struct gh_rm_mem_parcel {
+ enum gh_rm_mem_type mem_type;
+ u32 label;
+ size_t n_acl_entries;
+ struct gh_rm_mem_acl_entry *acl_entries;
+ size_t n_mem_entries;
+ struct gh_rm_mem_entry *mem_entries;
+ u32 mem_handle;
+};
+
/* RPC Calls */
+int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
+int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
+int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
+
int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
--
2.39.1


2023-02-14 21:25:10

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 09/26] gunyah: rsc_mgr: Add VM lifecycle RPC


Add Gunyah Resource Manager RPC to launch an unauthenticated VM.

Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/rsc_mgr.h | 45 ++++++
drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
include/linux/gunyah_rsc_mgr.h | 73 ++++++++++
4 files changed, 345 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c

diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index cc864ff5abbb..de29769f2f3f 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,5 +2,5 @@

obj-$(CONFIG_GUNYAH) += gunyah.o

-gunyah_rsc_mgr-y += rsc_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index d4e799a7526f..7406237bc66d 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -74,4 +74,49 @@ struct gh_rm;
int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
void **resp_buf, size_t *resp_buff_size);

+/* Message IDs: VM Management */
+#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
+#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
+#define GH_RM_RPC_VM_START 0x56000004
+#define GH_RM_RPC_VM_STOP 0x56000005
+#define GH_RM_RPC_VM_RESET 0x56000006
+#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
+#define GH_RM_RPC_VM_INIT 0x5600000B
+#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
+#define GH_RM_RPC_VM_GET_VMID 0x56000024
+
+struct gh_rm_vm_common_vmid_req {
+ __le16 vmid;
+ __le16 reserved0;
+} __packed;
+
+/* Call: VM_ALLOC */
+struct gh_rm_vm_alloc_vmid_resp {
+ __le16 vmid;
+ __le16 reserved0;
+} __packed;
+
+/* Call: VM_STOP */
+struct gh_rm_vm_stop_req {
+ __le16 vmid;
+#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
+ u8 flags;
+ u8 reserved;
+#define GH_RM_VM_STOP_REASON_FORCE_STOP 3
+ __le32 stop_reason;
+} __packed;
+
+/* Call: VM_CONFIG_IMAGE */
+struct gh_rm_vm_config_image_req {
+ __le16 vmid;
+ __le16 auth_mech;
+ __le32 mem_handle;
+ __le64 image_offset;
+ __le64 image_size;
+ __le64 dtb_offset;
+ __le64 dtb_size;
+} __packed;
+
+/* Call: GET_HYP_RESOURCES */
+
#endif
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
new file mode 100644
index 000000000000..4515cdd80106
--- /dev/null
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -0,0 +1,226 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/gunyah_rsc_mgr.h>
+
+#include "rsc_mgr.h"
+
+/*
+ * Several RM calls take only a VMID as a parameter and give only standard
+ * response back. Deduplicate boilerplate code by using this common call.
+ */
+static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
+{
+ struct gh_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+ size_t resp_size;
+ void *resp;
+
+ return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), &resp, &resp_size);
+}
+
+/**
+ * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: Use GH_VMID_INVAL or 0 to dynamically allocate a VM. A reserved VMID can
+ * be supplied to request allocation of a platform-defined VM.
+ *
+ * Returns - the allocated VMID or negative value on error
+ */
+int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
+{
+ struct gh_rm_vm_common_vmid_req req_payload = { 0 };
+ struct gh_rm_vm_alloc_vmid_resp *resp_payload;
+ size_t resp_size;
+ void *resp;
+ int ret;
+
+ if (vmid == GH_VMID_INVAL)
+ vmid = 0;
+
+ req_payload.vmid = vmid;
+
+ ret = gh_rm_call(rm, GH_RM_RPC_VM_ALLOC_VMID, &req_payload, sizeof(req_payload), &resp,
+ &resp_size);
+ if (ret)
+ return ret;
+
+ if (!vmid) {
+ resp_payload = resp;
+ ret = le16_to_cpu(resp_payload->vmid);
+ kfree(resp);
+ }
+
+ return ret;
+}
+
+/**
+ * gh_rm_dealloc_vmid() - Dispose the VMID
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ */
+int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_DEALLOC_VMID, vmid);
+}
+
+/**
+ * gh_rm_vm_reset() - Reset the VM's resources
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ *
+ * While tearing down the VM, request RM to clean up all the VM resources
+ * associated with the VM. Only after this, Linux can clean up all the
+ * references it maintains to resources.
+ */
+int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_RESET, vmid);
+}
+
+/**
+ * gh_rm_vm_start() - Move the VM into "ready to run" state
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ *
+ * On VMs which use proxy scheduling, vcpu_run is needed to actually run the VM.
+ * On VMs which use Gunyah's scheduling, the vCPUs start executing in accordance with Gunyah
+ * scheduling policies.
+ */
+int gh_rm_vm_start(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_START, vmid);
+}
+
+/**
+ * gh_rm_vm_stop() - Send a request to Resource Manager VM to forcibly stop a VM.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ */
+int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid)
+{
+ struct gh_rm_vm_stop_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ .flags = GH_RM_VM_STOP_FLAG_FORCE_STOP,
+ .stop_reason = cpu_to_le32(GH_RM_VM_STOP_REASON_FORCE_STOP),
+ };
+ size_t resp_size;
+ void *resp;
+
+ return gh_rm_call(rm, GH_RM_RPC_VM_STOP, &req_payload, sizeof(req_payload),
+ &resp, &resp_size);
+}
+
+/**
+ * gh_rm_vm_configure() - Prepare a VM to start and provide the common
+ * configuration needed by RM to configure a VM
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier allocated with gh_rm_alloc_vmid
+ * @auth_mechanism: Authentication mechanism used by resource manager to verify
+ * the virtual machine
+ * @mem_handle: Handle to a previously shared memparcel that contains all parts
+ * of the VM image subject to authentication.
+ * @image_offset: Start address of VM image, relative to the start of memparcel
+ * @image_size: Size of the VM image
+ * @dtb_offset: Start address of the devicetree binary with VM configuration,
+ * relative to start of memparcel.
+ * @dtb_size: Maximum size of devicetree binary. Resource manager applies
+ * an overlay to the DTB and dtb_size should include room for
+ * the overlay.
+ */
+int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
+ u32 mem_handle, u64 image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)
+{
+ struct gh_rm_vm_config_image_req req_payload = { 0 };
+ size_t resp_size;
+ void *resp;
+
+ req_payload.vmid = cpu_to_le16(vmid);
+ req_payload.auth_mech = cpu_to_le16(auth_mechanism);
+ req_payload.mem_handle = cpu_to_le32(mem_handle);
+ req_payload.image_offset = cpu_to_le64(image_offset);
+ req_payload.image_size = cpu_to_le64(image_size);
+ req_payload.dtb_offset = cpu_to_le64(dtb_offset);
+ req_payload.dtb_size = cpu_to_le64(dtb_size);
+
+ return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload, sizeof(req_payload),
+ &resp, &resp_size);
+}
+
+/**
+ * gh_rm_vm_init() - Move the VM to initialized state.
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VM identifier
+ *
+ * RM will allocate needed resources for the VM.
+ */
+int gh_rm_vm_init(struct gh_rm *rm, u16 vmid)
+{
+ return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_INIT, vmid);
+}
+
+/**
+ * gh_rm_get_hyp_resources() - Retrieve hypervisor resources (capabilities) associated with a VM
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: VMID of the other VM to get the resources of
+ * @resources: Set by gh_rm_get_hyp_resources and contains the returned hypervisor resources.
+ */
+int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
+ struct gh_rm_hyp_resources **resources)
+{
+ struct gh_rm_vm_common_vmid_req req_payload = {
+ .vmid = cpu_to_le16(vmid),
+ };
+ struct gh_rm_hyp_resources *resp;
+ size_t resp_size;
+ int ret;
+
+ ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_HYP_RESOURCES,
+ &req_payload, sizeof(req_payload),
+ (void **)&resp, &resp_size);
+ if (ret)
+ return ret;
+
+ if (!resp_size)
+ return -EBADMSG;
+
+ if (resp_size < struct_size(resp, entries, 0) ||
+ resp_size != struct_size(resp, entries, le32_to_cpu(resp->n_entries))) {
+ kfree(resp);
+ return -EBADMSG;
+ }
+
+ *resources = resp;
+ return 0;
+}
+
+/**
+ * gh_rm_get_vmid() - Retrieve VMID of this virtual machine
+ * @rm: Handle to a Gunyah resource manager
+ * @vmid: Filled with the VMID of this VM
+ */
+int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid)
+{
+ static u16 cached_vmid = GH_VMID_INVAL;
+ size_t resp_size;
+ __le32 *resp;
+ int ret;
+
+ if (cached_vmid != GH_VMID_INVAL) {
+ *vmid = cached_vmid;
+ return 0;
+ }
+
+ ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_VMID, NULL, 0, (void **)&resp, &resp_size);
+ if (ret)
+ return ret;
+
+ *vmid = cached_vmid = lower_16_bits(le32_to_cpu(*resp));
+ kfree(resp);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_get_vmid);
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index c992b3188c8d..e7bd29f8be6e 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -21,4 +21,77 @@ int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
void get_gh_rm(struct gh_rm *rm);
void put_gh_rm(struct gh_rm *rm);

+struct gh_rm_vm_exited_payload {
+ __le16 vmid;
+ __le16 exit_type;
+ __le32 exit_reason_size;
+ u8 exit_reason[];
+} __packed;
+
+#define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
+
+enum gh_rm_vm_status {
+ GH_RM_VM_STATUS_NO_STATE = 0,
+ GH_RM_VM_STATUS_INIT = 1,
+ GH_RM_VM_STATUS_READY = 2,
+ GH_RM_VM_STATUS_RUNNING = 3,
+ GH_RM_VM_STATUS_PAUSED = 4,
+ GH_RM_VM_STATUS_LOAD = 5,
+ GH_RM_VM_STATUS_AUTH = 6,
+ GH_RM_VM_STATUS_INIT_FAILED = 8,
+ GH_RM_VM_STATUS_EXITED = 9,
+ GH_RM_VM_STATUS_RESETTING = 10,
+ GH_RM_VM_STATUS_RESET = 11,
+};
+
+struct gh_rm_vm_status_payload {
+ __le16 vmid;
+ u16 reserved;
+ u8 vm_status;
+ u8 os_status;
+ __le16 app_status;
+} __packed;
+
+#define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
+
+/* RPC Calls */
+int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
+int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
+int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
+int gh_rm_vm_start(struct gh_rm *rm, u16 vmid);
+int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid);
+
+enum gh_rm_vm_auth_mechanism {
+ GH_RM_VM_AUTH_NONE = 0,
+ GH_RM_VM_AUTH_QCOM_PIL_ELF = 1,
+ GH_RM_VM_AUTH_QCOM_ANDROID_PVM = 2,
+};
+
+int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
+ u32 mem_handle, u64 image_offset, u64 image_size,
+ u64 dtb_offset, u64 dtb_size);
+int gh_rm_vm_init(struct gh_rm *rm, u16 vmid);
+
+struct gh_rm_hyp_resource {
+ u8 type;
+ u8 reserved;
+ __le16 partner_vmid;
+ __le32 resource_handle;
+ __le32 resource_label;
+ __le64 cap_id;
+ __le32 virq_handle;
+ __le32 virq;
+ __le64 base;
+ __le64 size;
+} __packed;
+
+struct gh_rm_hyp_resources {
+ __le32 n_entries;
+ struct gh_rm_hyp_resource entries[];
+} __packed;
+
+int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
+ struct gh_rm_hyp_resources **resources);
+int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
+
#endif
--
2.39.1


2023-02-14 21:25:13

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions


When launching a virtual machine, Gunyah userspace allocates memory for
the guest and informs Gunyah about these memory regions through
SET_USER_MEMORY_REGION ioctl.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Makefile | 2 +-
drivers/virt/gunyah/vm_mgr.c | 44 ++++++
drivers/virt/gunyah/vm_mgr.h | 25 ++++
drivers/virt/gunyah/vm_mgr_mm.c | 235 ++++++++++++++++++++++++++++++++
include/uapi/linux/gunyah.h | 33 +++++
5 files changed, 338 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c

diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 03951cf82023..ff8bc4925392 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -2,5 +2,5 @@

obj-$(CONFIG_GUNYAH) += gunyah.o

-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index fd890a57172e..84102bac03cc 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -18,8 +18,16 @@
static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
+ struct gh_vm_mem *mapping, *tmp;
int ret;

+ mutex_lock(&ghvm->mm_lock);
+ list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
+ gh_vm_mem_reclaim(ghvm, mapping);
+ kfree(mapping);
+ }
+ mutex_unlock(&ghvm->mm_lock);
+
ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
if (ret)
pr_warn("Failed to deallocate vmid: %d\n", ret);
@@ -48,11 +56,46 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
ghvm->vmid = vmid;
ghvm->rm = rm;

+ mutex_init(&ghvm->mm_lock);
+ INIT_LIST_HEAD(&ghvm->memory_mappings);
INIT_WORK(&ghvm->free_work, gh_vm_free);

return ghvm;
}

+static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+ struct gh_vm *ghvm = filp->private_data;
+ void __user *argp = (void __user *)arg;
+ long r;
+
+ switch (cmd) {
+ case GH_VM_SET_USER_MEM_REGION: {
+ struct gh_userspace_memory_region region;
+
+ if (copy_from_user(&region, argp, sizeof(region)))
+ return -EFAULT;
+
+ /* All other flag bits are reserved for future use */
+ if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC |
+ GH_MEM_LENT))
+ return -EINVAL;
+
+
+ if (region.memory_size)
+ r = gh_vm_mem_alloc(ghvm, &region);
+ else
+ r = gh_vm_mem_free(ghvm, region.label);
+ break;
+ }
+ default:
+ r = -ENOTTY;
+ break;
+ }
+
+ return r;
+}
+
static int gh_vm_release(struct inode *inode, struct file *filp)
{
struct gh_vm *ghvm = filp->private_data;
@@ -65,6 +108,7 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
}

static const struct file_operations gh_vm_fops = {
+ .unlocked_ioctl = gh_vm_ioctl,
.release = gh_vm_release,
.compat_ioctl = compat_ptr_ioctl,
.llseek = noop_llseek,
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 76954da706e9..97bc00c34878 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -7,16 +7,41 @@
#define _GH_PRIV_VM_MGR_H

#include <linux/gunyah_rsc_mgr.h>
+#include <linux/list.h>
+#include <linux/miscdevice.h>
+#include <linux/mutex.h>

#include <uapi/linux/gunyah.h>

long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);

+enum gh_vm_mem_share_type {
+ VM_MEM_SHARE,
+ VM_MEM_LEND,
+};
+
+struct gh_vm_mem {
+ struct list_head list;
+ enum gh_vm_mem_share_type share_type;
+ struct gh_rm_mem_parcel parcel;
+
+ __u64 guest_phys_addr;
+ struct page **pages;
+ unsigned long npages;
+};
+
struct gh_vm {
u16 vmid;
struct gh_rm *rm;

struct work_struct free_work;
+ struct mutex mm_lock;
+ struct list_head memory_mappings;
};

+int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
+void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
+int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
+struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
+
#endif
diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
new file mode 100644
index 000000000000..03e71a36ea3b
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr_mm.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gh_vm_mgr: " fmt
+
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/mm.h>
+
+#include <uapi/linux/gunyah.h>
+
+#include "vm_mgr.h"
+
+static inline bool page_contiguous(phys_addr_t p, phys_addr_t t)
+{
+ return t - p == PAGE_SIZE;
+}
+
+static struct gh_vm_mem *__gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
+ __must_hold(&ghvm->mm_lock)
+{
+ struct gh_vm_mem *mapping;
+
+ list_for_each_entry(mapping, &ghvm->memory_mappings, list)
+ if (mapping->parcel.label == label)
+ return mapping;
+
+ return NULL;
+}
+
+void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
+ __must_hold(&ghvm->mm_lock)
+{
+ int i, ret = 0;
+
+ if (mapping->parcel.mem_handle != GH_MEM_HANDLE_INVAL) {
+ ret = gh_rm_mem_reclaim(ghvm->rm, &mapping->parcel);
+ if (ret)
+ pr_warn("Failed to reclaim memory parcel for label %d: %d\n",
+ mapping->parcel.label, ret);
+ }
+
+ if (!ret)
+ for (i = 0; i < mapping->npages; i++)
+ unpin_user_page(mapping->pages[i]);
+
+ kfree(mapping->pages);
+ kfree(mapping->parcel.acl_entries);
+ kfree(mapping->parcel.mem_entries);
+
+ list_del(&mapping->list);
+}
+
+struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
+{
+ struct gh_vm_mem *mapping;
+ int ret;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ERR_PTR(ret);
+ mapping = __gh_vm_mem_find(ghvm, label);
+ mutex_unlock(&ghvm->mm_lock);
+ return mapping ? : ERR_PTR(-ENODEV);
+}
+
+int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
+{
+ struct gh_vm_mem *mapping, *tmp_mapping;
+ struct gh_rm_mem_entry *mem_entries;
+ phys_addr_t curr_page, prev_page;
+ struct gh_rm_mem_parcel *parcel;
+ int i, j, pinned, ret = 0;
+ size_t entry_size;
+ u16 vmid;
+
+ if (!gh_api_has_feature(GH_API_FEATURE_MEMEXTENT))
+ return -EOPNOTSUPP;
+
+ if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
+ !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
+ return -EINVAL;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ret;
+ mapping = __gh_vm_mem_find(ghvm, region->label);
+ if (mapping) {
+ mutex_unlock(&ghvm->mm_lock);
+ return -EEXIST;
+ }
+
+ mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
+ if (!mapping) {
+ ret = -ENOMEM;
+ goto free_mapping;
+ }
+
+ mapping->parcel.label = region->label;
+ mapping->guest_phys_addr = region->guest_phys_addr;
+ mapping->npages = region->memory_size >> PAGE_SHIFT;
+ parcel = &mapping->parcel;
+ parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
+ parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
+
+ /* Check for overlap */
+ list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
+ if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
+ tmp_mapping->guest_phys_addr) ||
+ (mapping->guest_phys_addr >=
+ tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
+ ret = -EEXIST;
+ goto free_mapping;
+ }
+ }
+
+ list_add(&mapping->list, &ghvm->memory_mappings);
+
+ mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
+ if (!mapping->pages) {
+ ret = -ENOMEM;
+ mapping->npages = 0; /* update npages for reclaim */
+ goto reclaim;
+ }
+
+ pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
+ FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
+ if (pinned < 0) {
+ ret = pinned;
+ mapping->npages = 0; /* update npages for reclaim */
+ goto reclaim;
+ } else if (pinned != mapping->npages) {
+ ret = -EFAULT;
+ mapping->npages = pinned; /* update npages for reclaim */
+ goto reclaim;
+ }
+
+ if (region->flags & GH_MEM_LENT) {
+ parcel->n_acl_entries = 1;
+ mapping->share_type = VM_MEM_LEND;
+ } else {
+ parcel->n_acl_entries = 2;
+ mapping->share_type = VM_MEM_SHARE;
+ }
+ parcel->acl_entries = kcalloc(parcel->n_acl_entries, sizeof(*parcel->acl_entries),
+ GFP_KERNEL);
+ if (!parcel->acl_entries) {
+ ret = -ENOMEM;
+ goto reclaim;
+ }
+
+ parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
+ if (region->flags & GH_MEM_ALLOW_READ)
+ parcel->acl_entries[0].perms |= GH_RM_ACL_R;
+ if (region->flags & GH_MEM_ALLOW_WRITE)
+ parcel->acl_entries[0].perms |= GH_RM_ACL_W;
+ if (region->flags & GH_MEM_ALLOW_EXEC)
+ parcel->acl_entries[0].perms |= GH_RM_ACL_X;
+
+ if (mapping->share_type == VM_MEM_SHARE) {
+ ret = gh_rm_get_vmid(ghvm->rm, &vmid);
+ if (ret)
+ goto reclaim;
+
+ parcel->acl_entries[1].vmid = cpu_to_le16(vmid);
+ /* Host assumed to have all these permissions. Gunyah will not
+ * grant new permissions if host actually had less than RWX
+ */
+ parcel->acl_entries[1].perms |= GH_RM_ACL_R | GH_RM_ACL_W | GH_RM_ACL_X;
+ }
+
+ mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries), GFP_KERNEL);
+ if (!mem_entries) {
+ ret = -ENOMEM;
+ goto reclaim;
+ }
+
+ /* reduce number of entries by combining contiguous pages into single memory entry */
+ prev_page = page_to_phys(mapping->pages[0]);
+ mem_entries[0].ipa_base = cpu_to_le64(prev_page);
+ entry_size = PAGE_SIZE;
+ for (i = 1, j = 0; i < mapping->npages; i++) {
+ curr_page = page_to_phys(mapping->pages[i]);
+ if (page_contiguous(prev_page, curr_page)) {
+ entry_size += PAGE_SIZE;
+ } else {
+ mem_entries[j].size = cpu_to_le64(entry_size);
+ j++;
+ mem_entries[j].ipa_base = cpu_to_le64(curr_page);
+ entry_size = PAGE_SIZE;
+ }
+
+ prev_page = curr_page;
+ }
+ mem_entries[j].size = cpu_to_le64(entry_size);
+
+ parcel->n_mem_entries = j + 1;
+ parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) * parcel->n_mem_entries,
+ GFP_KERNEL);
+ kfree(mem_entries);
+ if (!parcel->mem_entries) {
+ ret = -ENOMEM;
+ goto reclaim;
+ }
+
+ mutex_unlock(&ghvm->mm_lock);
+ return 0;
+reclaim:
+ gh_vm_mem_reclaim(ghvm, mapping);
+free_mapping:
+ kfree(mapping);
+ mutex_unlock(&ghvm->mm_lock);
+ return ret;
+}
+
+int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
+{
+ struct gh_vm_mem *mapping;
+ int ret;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ret;
+
+ mapping = __gh_vm_mem_find(ghvm, label);
+ if (!mapping)
+ goto out;
+
+ gh_vm_mem_reclaim(ghvm, mapping);
+ kfree(mapping);
+out:
+ mutex_unlock(&ghvm->mm_lock);
+ return ret;
+}
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 10ba32d2b0a6..d85d12119a48 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -20,4 +20,37 @@
*/
#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */

+/*
+ * ioctls for VM fds
+ */
+
+/**
+ * struct gh_userspace_memory_region - Userspace memory descripion for GH_VM_SET_USER_MEM_REGION
+ * @label: Unique identifer to the region.
+ * @flags: Flags for memory parcel behavior
+ * @guest_phys_addr: Location of the memory region in guest's memory space (page-aligned)
+ * @memory_size: Size of the region (page-aligned)
+ * @userspace_addr: Location of the memory region in caller (userspace)'s memory
+ *
+ * See Documentation/virt/gunyah/vm-manager.rst for further details.
+ */
+struct gh_userspace_memory_region {
+ __u32 label;
+#define GH_MEM_ALLOW_READ (1UL << 0)
+#define GH_MEM_ALLOW_WRITE (1UL << 1)
+#define GH_MEM_ALLOW_EXEC (1UL << 2)
+/*
+ * The guest will be lent the memory instead of shared.
+ * In other words, the guest has exclusive access to the memory region and the host loses access.
+ */
+#define GH_MEM_LENT (1UL << 3)
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size;
+ __u64 userspace_addr;
+};
+
+#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
+ struct gh_userspace_memory_region)
+
#endif
--
2.39.1


2023-02-14 21:25:28

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot


Add remaining ioctls to support non-proxy VM boot:

- Gunyah Resource Manager uses the VM's devicetree to configure the
virtual machine. The location of the devicetree in the guest's
virtual memory can be declared via the SET_DTB_CONFIG ioctl.
- Trigger start of the virtual machine with VM_START ioctl.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 229 ++++++++++++++++++++++++++++++--
drivers/virt/gunyah/vm_mgr.h | 10 ++
drivers/virt/gunyah/vm_mgr_mm.c | 23 ++++
include/linux/gunyah_rsc_mgr.h | 6 +
include/uapi/linux/gunyah.h | 13 ++
5 files changed, 268 insertions(+), 13 deletions(-)

diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 84102bac03cc..fa324385ade5 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -9,37 +9,114 @@
#include <linux/file.h>
#include <linux/gunyah_rsc_mgr.h>
#include <linux/miscdevice.h>
+#include <linux/mm.h>
#include <linux/module.h>

#include <uapi/linux/gunyah.h>

#include "vm_mgr.h"

+static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
+{
+ struct gh_rm_vm_status_payload *payload = data;
+
+ if (payload->vmid != ghvm->vmid)
+ return NOTIFY_OK;
+
+ /* All other state transitions are synchronous to a corresponding RM call */
+ if (payload->vm_status == GH_RM_VM_STATUS_RESET) {
+ down_write(&ghvm->status_lock);
+ ghvm->vm_status = payload->vm_status;
+ up_write(&ghvm->status_lock);
+ wake_up(&ghvm->vm_status_wait);
+ }
+
+ return NOTIFY_DONE;
+}
+
+static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
+{
+ struct gh_rm_vm_exited_payload *payload = data;
+
+ if (payload->vmid != ghvm->vmid)
+ return NOTIFY_OK;
+
+ down_write(&ghvm->status_lock);
+ ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
+ up_write(&ghvm->status_lock);
+
+ return NOTIFY_DONE;
+}
+
+static int gh_vm_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
+{
+ struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
+
+ switch (action) {
+ case GH_RM_NOTIFICATION_VM_STATUS:
+ return gh_vm_rm_notification_status(ghvm, data);
+ case GH_RM_NOTIFICATION_VM_EXITED:
+ return gh_vm_rm_notification_exited(ghvm, data);
+ default:
+ return NOTIFY_OK;
+ }
+}
+
+static void gh_vm_stop(struct gh_vm *ghvm)
+{
+ int ret;
+
+ down_write(&ghvm->status_lock);
+ if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
+ ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
+ if (ret)
+ pr_warn("Failed to stop VM: %d\n", ret);
+ }
+
+ ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
+ up_write(&ghvm->status_lock);
+}
+
static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
struct gh_vm_mem *mapping, *tmp;
int ret;

- mutex_lock(&ghvm->mm_lock);
- list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
- gh_vm_mem_reclaim(ghvm, mapping);
- kfree(mapping);
+ switch (ghvm->vm_status) {
+unknown_state:
+ case GH_RM_VM_STATUS_RUNNING:
+ gh_vm_stop(ghvm);
+ fallthrough;
+ case GH_RM_VM_STATUS_INIT_FAILED:
+ case GH_RM_VM_STATUS_LOAD:
+ case GH_RM_VM_STATUS_LOAD_FAILED:
+ mutex_lock(&ghvm->mm_lock);
+ list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
+ gh_vm_mem_reclaim(ghvm, mapping);
+ kfree(mapping);
+ }
+ mutex_unlock(&ghvm->mm_lock);
+ fallthrough;
+ case GH_RM_VM_STATUS_NO_STATE:
+ ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
+ if (ret)
+ pr_warn("Failed to deallocate vmid: %d\n", ret);
+
+ gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
+ put_gh_rm(ghvm->rm);
+ kfree(ghvm);
+ break;
+ default:
+ pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
+ goto unknown_state;
}
- mutex_unlock(&ghvm->mm_lock);
-
- ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
- if (ret)
- pr_warn("Failed to deallocate vmid: %d\n", ret);
-
- put_gh_rm(ghvm->rm);
- kfree(ghvm);
}

static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
{
struct gh_vm *ghvm;
- int vmid;
+ int vmid, ret;

vmid = gh_rm_alloc_vmid(rm, 0);
if (vmid < 0)
@@ -56,13 +133,123 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
ghvm->vmid = vmid;
ghvm->rm = rm;

+ init_waitqueue_head(&ghvm->vm_status_wait);
+ ghvm->nb.notifier_call = gh_vm_rm_notification;
+ ret = gh_rm_notifier_register(rm, &ghvm->nb);
+ if (ret) {
+ put_gh_rm(rm);
+ gh_rm_dealloc_vmid(rm, vmid);
+ kfree(ghvm);
+ return ERR_PTR(ret);
+ }
+
mutex_init(&ghvm->mm_lock);
INIT_LIST_HEAD(&ghvm->memory_mappings);
+ init_rwsem(&ghvm->status_lock);
INIT_WORK(&ghvm->free_work, gh_vm_free);
+ ghvm->vm_status = GH_RM_VM_STATUS_LOAD;

return ghvm;
}

+static int gh_vm_start(struct gh_vm *ghvm)
+{
+ struct gh_vm_mem *mapping;
+ u64 dtb_offset;
+ u32 mem_handle;
+ int ret;
+
+ down_write(&ghvm->status_lock);
+ if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
+ up_write(&ghvm->status_lock);
+ return 0;
+ }
+
+ ghvm->vm_status = GH_RM_VM_STATUS_RESET;
+
+ list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
+ switch (mapping->share_type) {
+ case VM_MEM_LEND:
+ ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
+ break;
+ case VM_MEM_SHARE:
+ ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
+ break;
+ }
+ if (ret) {
+ pr_warn("Failed to %s parcel %d: %d\n",
+ mapping->share_type == VM_MEM_LEND ? "lend" : "share",
+ mapping->parcel.label,
+ ret);
+ goto err;
+ }
+ }
+
+ mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, ghvm->dtb_config.size);
+ if (!mapping) {
+ pr_warn("Failed to find the memory_handle for DTB\n");
+ ret = -EINVAL;
+ goto err;
+ }
+
+ mem_handle = mapping->parcel.mem_handle;
+ dtb_offset = ghvm->dtb_config.gpa - mapping->guest_phys_addr;
+
+ ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, mem_handle,
+ 0, 0, dtb_offset, ghvm->dtb_config.size);
+ if (ret) {
+ pr_warn("Failed to configure VM: %d\n", ret);
+ goto err;
+ }
+
+ ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
+ if (ret) {
+ pr_warn("Failed to initialize VM: %d\n", ret);
+ goto err;
+ }
+
+ ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
+ if (ret) {
+ pr_warn("Failed to start VM: %d\n", ret);
+ goto err;
+ }
+
+ ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
+ up_write(&ghvm->status_lock);
+ return ret;
+err:
+ ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
+ up_write(&ghvm->status_lock);
+ return ret;
+}
+
+static int gh_vm_ensure_started(struct gh_vm *ghvm)
+{
+ int ret;
+
+retry:
+ ret = down_read_interruptible(&ghvm->status_lock);
+ if (ret)
+ return ret;
+
+ /* Unlikely because VM is typically started */
+ if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
+ up_read(&ghvm->status_lock);
+ ret = gh_vm_start(ghvm);
+ if (ret)
+ goto out;
+ goto retry;
+ }
+
+ /* Unlikely because VM is typically running */
+ if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
+ ret = -ENODEV;
+
+out:
+ up_read(&ghvm->status_lock);
+ return ret;
+}
+
static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
struct gh_vm *ghvm = filp->private_data;
@@ -88,6 +275,22 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
r = gh_vm_mem_free(ghvm, region.label);
break;
}
+ case GH_VM_SET_DTB_CONFIG: {
+ struct gh_vm_dtb_config dtb_config;
+
+ if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
+ return -EFAULT;
+
+ dtb_config.size = PAGE_ALIGN(dtb_config.size);
+ ghvm->dtb_config = dtb_config;
+
+ r = 0;
+ break;
+ }
+ case GH_VM_START: {
+ r = gh_vm_ensure_started(ghvm);
+ break;
+ }
default:
r = -ENOTTY;
break;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 97bc00c34878..e9cf56647cc2 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -10,6 +10,8 @@
#include <linux/list.h>
#include <linux/miscdevice.h>
#include <linux/mutex.h>
+#include <linux/rwsem.h>
+#include <linux/wait.h>

#include <uapi/linux/gunyah.h>

@@ -33,6 +35,13 @@ struct gh_vm_mem {
struct gh_vm {
u16 vmid;
struct gh_rm *rm;
+ enum gh_rm_vm_auth_mechanism auth;
+ struct gh_vm_dtb_config dtb_config;
+
+ struct notifier_block nb;
+ enum gh_rm_vm_status vm_status;
+ wait_queue_head_t vm_status_wait;
+ struct rw_semaphore status_lock;

struct work_struct free_work;
struct mutex mm_lock;
@@ -43,5 +52,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *regio
void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
+struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size);

#endif
diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
index 03e71a36ea3b..128b90da555a 100644
--- a/drivers/virt/gunyah/vm_mgr_mm.c
+++ b/drivers/virt/gunyah/vm_mgr_mm.c
@@ -52,6 +52,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
list_del(&mapping->list);
}

+struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size)
+{
+ struct gh_vm_mem *mapping = NULL;
+ int ret;
+
+ ret = mutex_lock_interruptible(&ghvm->mm_lock);
+ if (ret)
+ return ERR_PTR(ret);
+
+ list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
+ if (gpa >= mapping->guest_phys_addr &&
+ (gpa + size <= mapping->guest_phys_addr +
+ (mapping->npages << PAGE_SHIFT))) {
+ goto unlock;
+ }
+ }
+
+ mapping = NULL;
+unlock:
+ mutex_unlock(&ghvm->mm_lock);
+ return mapping;
+}
+
struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
{
struct gh_vm_mem *mapping;
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index 2d8b8b6cc394..9cffee6f9b4e 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -32,6 +32,12 @@ struct gh_rm_vm_exited_payload {
#define GH_RM_NOTIFICATION_VM_EXITED 0x56100001

enum gh_rm_vm_status {
+ /**
+ * RM doesn't have a state where load partially failed because
+ * only Linux
+ */
+ GH_RM_VM_STATUS_LOAD_FAILED = -1,
+
GH_RM_VM_STATUS_NO_STATE = 0,
GH_RM_VM_STATUS_INIT = 1,
GH_RM_VM_STATUS_READY = 2,
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index d85d12119a48..d899bba6a4c6 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -53,4 +53,17 @@ struct gh_userspace_memory_region {
#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
struct gh_userspace_memory_region)

+/**
+ * struct gh_vm_dtb_config - Set the location of the VM's devicetree blob
+ * @gpa: Address of the VM's devicetree in guest memory.
+ * @size: Maximum size of the devicetree.
+ */
+struct gh_vm_dtb_config {
+ __u64 gpa;
+ __u64 size;
+};
+#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
+
+#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
+
#endif
--
2.39.1


2023-02-14 21:25:32

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 14/26] samples: Add sample userspace Gunyah VM Manager


Add a sample Gunyah VMM capable of launching a non-proxy scheduled VM.

Signed-off-by: Elliot Berman <[email protected]>
---
samples/Kconfig | 10 ++
samples/Makefile | 1 +
samples/gunyah/.gitignore | 2 +
samples/gunyah/Makefile | 6 +
samples/gunyah/gunyah_vmm.c | 270 +++++++++++++++++++++++++++++++++++
samples/gunyah/sample_vm.dts | 68 +++++++++
6 files changed, 357 insertions(+)
create mode 100644 samples/gunyah/.gitignore
create mode 100644 samples/gunyah/Makefile
create mode 100644 samples/gunyah/gunyah_vmm.c
create mode 100644 samples/gunyah/sample_vm.dts

diff --git a/samples/Kconfig b/samples/Kconfig
index 30ef8bd48ba3..11070bf02bd7 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -273,6 +273,16 @@ config SAMPLE_CORESIGHT_SYSCFG
This demonstrates how a user may create their own CoreSight
configurations and easily load them into the system at runtime.

+config SAMPLE_GUNYAH
+ bool "Build example Gunyah Virtual Machine Manager"
+ depends on CC_CAN_LINK && HEADERS_INSTALL
+ depends on GUNYAH
+ help
+ Build an example Gunyah VMM userspace program capable of launching
+ a basic virtual machine under the Gunyah hypervisor.
+ This demonstrates how to create a virtual machine under the Gunyah
+ hypervisor.
+
source "samples/rust/Kconfig"

endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index 7cb632ef88ee..a65555802642 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -37,3 +37,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak/
obj-$(CONFIG_SAMPLE_CORESIGHT_SYSCFG) += coresight/
obj-$(CONFIG_SAMPLE_FPROBE) += fprobe/
obj-$(CONFIG_SAMPLES_RUST) += rust/
+obj-$(CONFIG_SAMPLE_GUNYAH) += gunyah/
diff --git a/samples/gunyah/.gitignore b/samples/gunyah/.gitignore
new file mode 100644
index 000000000000..adc7d1589fde
--- /dev/null
+++ b/samples/gunyah/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+/gunyah_vmm
diff --git a/samples/gunyah/Makefile b/samples/gunyah/Makefile
new file mode 100644
index 000000000000..faf14f9bb337
--- /dev/null
+++ b/samples/gunyah/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+userprogs-always-y += gunyah_vmm
+dtb-y += sample_vm.dtb
+
+userccflags += -I usr/include
diff --git a/samples/gunyah/gunyah_vmm.c b/samples/gunyah/gunyah_vmm.c
new file mode 100644
index 000000000000..a72fab7ddc3a
--- /dev/null
+++ b/samples/gunyah/gunyah_vmm.c
@@ -0,0 +1,270 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <getopt.h>
+#include <limits.h>
+#include <stdint.h>
+#include <fcntl.h>
+#include <string.h>
+#include <sys/sysmacros.h>
+#define __USE_GNU
+#include <sys/mman.h>
+
+#include <linux/gunyah.h>
+
+struct vm_config {
+ int image_fd;
+ int dtb_fd;
+ int ramdisk_fd;
+
+ uint64_t guest_base;
+ uint64_t guest_size;
+
+ uint64_t image_offset;
+ off_t image_size;
+ uint64_t dtb_offset;
+ off_t dtb_size;
+ uint64_t ramdisk_offset;
+ off_t ramdisk_size;
+};
+
+static struct option options[] = {
+ { "help", no_argument, NULL, 'h' },
+ { "image", required_argument, NULL, 'i' },
+ { "dtb", required_argument, NULL, 'd' },
+ { "ramdisk", optional_argument, NULL, 'r' },
+ { "base", optional_argument, NULL, 'B' },
+ { "size", optional_argument, NULL, 'S' },
+ { "image_offset", optional_argument, NULL, 'I' },
+ { "dtb_offset", optional_argument, NULL, 'D' },
+ { "ramdisk_offset", optional_argument, NULL, 'R' },
+ { }
+};
+
+static void print_help(char *cmd)
+{
+ printf("gunyah_vmm, a sample tool to launch Gunyah VMs\n"
+ "Usage: %s <options>\n"
+ " --help, -h this menu\n"
+ " --image, -i <image> VM image file to load (e.g. a kernel Image) [Required]\n"
+ " --dtb, -d <dtb> Devicetree to load [Required]\n"
+ " --ramdisk, -r <ramdisk> Ramdisk to load\n"
+ " --base, -B <address> Set the base address of guest's memory [Default: 0x80000000]\n"
+ " --size, -S <number> The number of bytes large to make the guest's memory [Default: 0x6400000 (100 MB)]\n"
+ " --image_offset, -I <number> Offset into guest memory to load the VM image file [Default: 0x10000]\n"
+ " --dtb_offset, -D <number> Offset into guest memory to load the DTB [Default: 0]\n"
+ " --ramdisk_offset, -R <number> Offset into guest memory to load a ramdisk [Default: 0x4600000]\n"
+ , cmd);
+}
+
+int main(int argc, char **argv)
+{
+ int gunyah_fd, vm_fd, guest_fd;
+ struct gh_userspace_memory_region guest_mem_desc = { 0 };
+ struct gh_vm_dtb_config dtb_config = { 0 };
+ char *guest_mem;
+ struct vm_config config = {
+ /* Defaults good enough to boot static kernel and a basic ramdisk */
+ .ramdisk_fd = -1,
+ .guest_base = 0x80000000,
+ .guest_size = 0x6400000, /* 100 MB */
+ .image_offset = 0,
+ .dtb_offset = 0x45f0000,
+ .ramdisk_offset = 0x4600000, /* put at +70MB (30MB for ramdisk) */
+ };
+ struct stat st;
+ int opt, optidx, ret = 0;
+ long l;
+
+ while ((opt = getopt_long(argc, argv, "hi:d:r:B:S:I:D:R:c:", options, &optidx)) != -1) {
+ switch (opt) {
+ case 'i':
+ config.image_fd = open(optarg, O_RDONLY | O_CLOEXEC);
+ if (config.image_fd < 0) {
+ perror("Failed to open image");
+ return -1;
+ }
+ if (stat(optarg, &st) < 0) {
+ perror("Failed to stat image");
+ return -1;
+ }
+ config.image_size = st.st_size;
+ break;
+ case 'd':
+ config.dtb_fd = open(optarg, O_RDONLY | O_CLOEXEC);
+ if (config.dtb_fd < 0) {
+ perror("Failed to open dtb");
+ return -1;
+ }
+ if (stat(optarg, &st) < 0) {
+ perror("Failed to stat dtb");
+ return -1;
+ }
+ config.dtb_size = st.st_size;
+ break;
+ case 'r':
+ config.ramdisk_fd = open(optarg, O_RDONLY | O_CLOEXEC);
+ if (config.ramdisk_fd < 0) {
+ perror("Failed to open ramdisk");
+ return -1;
+ }
+ if (stat(optarg, &st) < 0) {
+ perror("Failed to stat ramdisk");
+ return -1;
+ }
+ config.ramdisk_size = st.st_size;
+ break;
+ case 'B':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse base address");
+ return -1;
+ }
+ config.guest_base = l;
+ break;
+ case 'S':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse memory size");
+ return -1;
+ }
+ config.guest_size = l;
+ break;
+ case 'I':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse image offset");
+ return -1;
+ }
+ config.image_offset = l;
+ break;
+ case 'D':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse dtb offset");
+ return -1;
+ }
+ config.dtb_offset = l;
+ break;
+ case 'R':
+ l = strtol(optarg, NULL, 0);
+ if (l == LONG_MIN) {
+ perror("Failed to parse ramdisk offset");
+ return -1;
+ }
+ config.ramdisk_offset = l;
+ break;
+ case 'h':
+ print_help(argv[0]);
+ return 0;
+ default:
+ print_help(argv[0]);
+ return -1;
+ }
+ }
+
+ if (!config.image_fd || !config.dtb_fd) {
+ print_help(argv[0]);
+ return -1;
+ }
+
+ if (config.image_offset + config.image_size > config.guest_size) {
+ fprintf(stderr, "Image offset and size puts it outside guest memory. Make image smaller or increase guest memory size.\n");
+ return -1;
+ }
+
+ if (config.dtb_offset + config.dtb_size > config.guest_size) {
+ fprintf(stderr, "DTB offset and size puts it outside guest memory. Make dtb smaller or increase guest memory size.\n");
+ return -1;
+ }
+
+ if (config.ramdisk_fd == -1 &&
+ config.ramdisk_offset + config.ramdisk_size > config.guest_size) {
+ fprintf(stderr, "Ramdisk offset and size puts it outside guest memory. Make ramdisk smaller or increase guest memory size.\n");
+ return -1;
+ }
+
+ gunyah_fd = open("/dev/gunyah", O_RDWR | O_CLOEXEC);
+ if (gunyah_fd < 0) {
+ perror("Failed to open /dev/gunyah");
+ return -1;
+ }
+
+ vm_fd = ioctl(gunyah_fd, GH_CREATE_VM, 0);
+ if (vm_fd < 0) {
+ perror("Failed to create vm");
+ return -1;
+ }
+
+ guest_fd = memfd_create("guest_memory", MFD_CLOEXEC);
+ if (guest_fd < 0) {
+ perror("Failed to create guest memfd");
+ return -1;
+ }
+
+ if (ftruncate(guest_fd, config.guest_size) < 0) {
+ perror("Failed to grow guest memory");
+ return -1;
+ }
+
+ guest_mem = mmap(NULL, config.guest_size, PROT_READ | PROT_WRITE, MAP_SHARED, guest_fd, 0);
+ if (guest_mem == MAP_FAILED) {
+ perror("Not enough memory");
+ return -1;
+ }
+
+ if (read(config.image_fd, guest_mem + config.image_offset, config.image_size) < 0) {
+ perror("Failed to read image into guest memory");
+ return -1;
+ }
+
+ if (read(config.dtb_fd, guest_mem + config.dtb_offset, config.dtb_size) < 0) {
+ perror("Failed to read dtb into guest memory");
+ return -1;
+ }
+
+ if (config.ramdisk_fd > 0 &&
+ read(config.ramdisk_fd, guest_mem + config.ramdisk_offset,
+ config.ramdisk_size) < 0) {
+ perror("Failed to read ramdisk into guest memory");
+ return -1;
+ }
+
+ guest_mem_desc.label = 0;
+ guest_mem_desc.flags = GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC;
+ guest_mem_desc.guest_phys_addr = config.guest_base;
+ guest_mem_desc.memory_size = config.guest_size;
+ guest_mem_desc.userspace_addr = (__u64)guest_mem;
+
+ if (ioctl(vm_fd, GH_VM_SET_USER_MEM_REGION, &guest_mem_desc) < 0) {
+ perror("Failed to register guest memory with VM");
+ return -1;
+ }
+
+ dtb_config.gpa = config.guest_base + config.dtb_offset;
+ dtb_config.size = config.dtb_size;
+ if (ioctl(vm_fd, GH_VM_SET_DTB_CONFIG, &dtb_config) < 0) {
+ perror("Failed to set DTB configuration for VM");
+ return -1;
+ }
+
+ ret = ioctl(vm_fd, GH_VM_START);
+ if (ret) {
+ perror("GH_VM_START failed");
+ return -1;
+ }
+
+ while (1)
+ sleep(10);
+
+ return 0;
+}
diff --git a/samples/gunyah/sample_vm.dts b/samples/gunyah/sample_vm.dts
new file mode 100644
index 000000000000..293bbc0469c8
--- /dev/null
+++ b/samples/gunyah/sample_vm.dts
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+/dts-v1/;
+
+/ {
+ #address-cells = <2>;
+ #size-cells = <2>;
+ interrupt-parent = <&intc>;
+
+ chosen {
+ bootargs = "nokaslr";
+ };
+
+ cpus {
+ #address-cells = <0x2>;
+ #size-cells = <0>;
+
+ cpu@0 {
+ device_type = "cpu";
+ compatible = "arm,armv8";
+ reg = <0 0>;
+ };
+ };
+
+ intc: interrupt-controller@3FFF0000 {
+ compatible = "arm,gic-v3";
+ #interrupt-cells = <3>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ interrupt-controller;
+ reg = <0 0x3FFF0000 0 0x10000>,
+ <0 0x3FFD0000 0 0x20000>;
+ };
+
+ timer {
+ compatible = "arm,armv8-timer";
+ always-on;
+ interrupts = <1 13 0x108>,
+ <1 14 0x108>,
+ <1 11 0x108>,
+ <1 10 0x108>;
+ clock-frequency = <19200000>;
+ };
+
+ gunyah-vm-config {
+ image-name = "linux_vm_0";
+
+ memory {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ base-address = <0 0x80000000>;
+ };
+
+ interrupts {
+ config = <&intc>;
+ };
+
+ vcpus {
+ affinity-map = < 0 >;
+ sched-priority = < (-1) >;
+ sched-timeslice = < 2000 >;
+ };
+ };
+};
--
2.39.1


2023-02-14 21:25:52

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 15/26] gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim


On Qualcomm platforms, there is a firmware entity which controls access
to physical pages. In order to share memory with another VM, this entity
needs to be informed that the guest VM should have access to the memory.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/Kconfig | 4 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
drivers/virt/gunyah/rsc_mgr.h | 3 +
drivers/virt/gunyah/rsc_mgr_rpc.c | 12 +++-
include/linux/gunyah_rsc_mgr.h | 17 +++++
6 files changed, 115 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c

diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 1a737694c333..de815189dab6 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -4,6 +4,7 @@ config GUNYAH
tristate "Gunyah Virtualization drivers"
depends on ARM64
depends on MAILBOX
+ select GUNYAH_PLATFORM_HOOKS
help
The Gunyah drivers are the helper interfaces that run in a guest VM
such as basic inter-VM IPC and signaling mechanisms, and higher level
@@ -11,3 +12,6 @@ config GUNYAH

Say Y/M here to enable the drivers needed to interact in a Gunyah
virtual environment.
+
+config GUNYAH_PLATFORM_HOOKS
+ tristate
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index ff8bc4925392..6b8f84dbfe0d 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0

obj-$(CONFIG_GUNYAH) += gunyah.o
+obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o

gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
new file mode 100644
index 000000000000..e67e2361b592
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/module.h>
+#include <linux/rwsem.h>
+#include <linux/gunyah_rsc_mgr.h>
+
+#include "rsc_mgr.h"
+
+static struct gunyah_rm_platform_ops *rm_platform_ops;
+static DECLARE_RWSEM(rm_platform_ops_lock);
+
+int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->pre_mem_share)
+ ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
+
+int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ int ret = 0;
+
+ down_read(&rm_platform_ops_lock);
+ if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
+ ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
+ up_read(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
+
+int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
+{
+ int ret = 0;
+
+ down_write(&rm_platform_ops_lock);
+ if (!rm_platform_ops)
+ rm_platform_ops = platform_ops;
+ else
+ ret = -EEXIST;
+ up_write(&rm_platform_ops_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
+
+void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
+{
+ down_write(&rm_platform_ops_lock);
+ if (rm_platform_ops == platform_ops)
+ rm_platform_ops = NULL;
+ up_write(&rm_platform_ops_lock);
+}
+EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
+
+static void _devm_gh_rm_unregister_platform_ops(void *data)
+{
+ gh_rm_unregister_platform_ops(data);
+}
+
+int devm_gh_rm_register_platform_ops(struct device *dev, struct gunyah_rm_platform_ops *ops)
+{
+ int ret;
+
+ ret = gh_rm_register_platform_ops(ops);
+ if (ret)
+ return ret;
+
+ return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops, ops);
+}
+EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Gunyah Platform Hooks");
diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
index 9b23cefe02b0..e536169df41e 100644
--- a/drivers/virt/gunyah/rsc_mgr.h
+++ b/drivers/virt/gunyah/rsc_mgr.h
@@ -74,6 +74,9 @@ struct gh_rm;
int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
void **resp_buf, size_t *resp_buff_size);

+int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+
/* Message IDs: Memory Management */
#define GH_RM_RPC_MEM_LEND 0x51000012
#define GH_RM_RPC_MEM_SHARE 0x51000013
diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
index 0c83b097fec9..0b12696bf069 100644
--- a/drivers/virt/gunyah/rsc_mgr_rpc.c
+++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
@@ -116,6 +116,12 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
if (!msg)
return -ENOMEM;

+ ret = gh_rm_platform_pre_mem_share(rm, p);
+ if (ret) {
+ kfree(msg);
+ return ret;
+ }
+
req_header = msg;
acl_section = (void *)req_header + sizeof(*req_header);
mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
@@ -139,8 +145,10 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
kfree(msg);

- if (ret)
+ if (ret) {
+ gh_rm_platform_post_mem_reclaim(rm, p);
return ret;
+ }

p->mem_handle = le32_to_cpu(*resp);

@@ -204,7 +212,7 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
if (ret)
return ret;

- return ret;
+ return gh_rm_platform_post_mem_reclaim(rm, parcel);
}

/**
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index 9cffee6f9b4e..dc05d5b1e1a3 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -147,4 +147,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
struct gh_rm_hyp_resources **resources);
int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);

+struct gunyah_rm_platform_ops {
+ int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+ int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
+};
+
+#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
+int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops);
+void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops);
+int devm_gh_rm_register_platform_ops(struct device *dev, struct gunyah_rm_platform_ops *ops);
+#else
+static inline int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
+ { return 0; }
+static inline void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops) { }
+static inline int devm_gh_rm_register_platform_ops(struct device *dev,
+ struct gunyah_rm_platform_ops *ops) { return 0; }
+#endif
+
#endif
--
2.39.1


2023-02-14 21:26:42

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 16/26] firmware: qcom_scm: Register Gunyah platform ops


Qualcomm platforms have a firmware entity which performs access control
to physical pages. Dynamically started Gunyah virtual machines use the
QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
to the memory used by guest VMs. Gunyah doesn't do this operation for us
since it is the current VM (typically VMID_HLOS) delegating the access
and not Gunyah itself. Use the Gunyah platform ops to achieve this so
that only Qualcomm platforms attempt to make the needed SCM calls.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/firmware/Kconfig | 2 +
drivers/firmware/qcom_scm.c | 100 ++++++++++++++++++++++++++++++++++++
2 files changed, 102 insertions(+)

diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index b59e3041fd62..b888068ff6f2 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -214,6 +214,8 @@ config MTK_ADSP_IPC

config QCOM_SCM
tristate
+ select VIRT_DRIVERS
+ select GUNYAH_PLATFORM_HOOKS

config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
bool "Qualcomm download mode enabled by default"
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 468d4d5ab550..875040982b48 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -20,6 +20,7 @@
#include <linux/clk.h>
#include <linux/reset-controller.h>
#include <linux/arm-smccc.h>
+#include <linux/gunyah_rsc_mgr.h>

#include "qcom_scm.h"

@@ -30,6 +31,9 @@ module_param(download_mode, bool, 0);
#define SCM_HAS_IFACE_CLK BIT(1)
#define SCM_HAS_BUS_CLK BIT(2)

+#define QCOM_SCM_RM_MANAGED_VMID 0x3A
+#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
+
struct qcom_scm {
struct device *dev;
struct clk *core_clk;
@@ -1297,6 +1301,99 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
}
EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);

+static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ struct qcom_scm_vmperm *new_perms;
+ u64 src, src_cpy;
+ int ret = 0, i, n;
+ u16 vmid;
+
+ new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
+ if (!new_perms)
+ return -ENOMEM;
+
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ new_perms[n].vmid = vmid;
+ else
+ new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
+ if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
+ new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
+ if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
+ new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
+ if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
+ new_perms[n].perm |= QCOM_SCM_PERM_READ;
+ }
+
+ src = (1ull << QCOM_SCM_VMID_HLOS);
+
+ for (i = 0; i < mem_parcel->n_mem_entries; i++) {
+ src_cpy = src;
+ ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
+ le64_to_cpu(mem_parcel->mem_entries[i].size),
+ &src_cpy, new_perms, mem_parcel->n_acl_entries);
+ if (ret) {
+ src = 0;
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ src |= (1ull << vmid);
+ else
+ src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
+ }
+
+ new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
+
+ for (i--; i >= 0; i--) {
+ src_cpy = src;
+ WARN_ON_ONCE(qcom_scm_assign_mem(
+ le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
+ le64_to_cpu(mem_parcel->mem_entries[i].size),
+ &src_cpy, new_perms, 1));
+ }
+ break;
+ }
+ }
+
+ kfree(new_perms);
+ return ret;
+}
+
+static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
+{
+ struct qcom_scm_vmperm new_perms;
+ u64 src = 0, src_cpy;
+ int ret = 0, i, n;
+ u16 vmid;
+
+ new_perms.vmid = QCOM_SCM_VMID_HLOS;
+ new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE | QCOM_SCM_PERM_READ;
+
+ for (n = 0; n < mem_parcel->n_acl_entries; n++) {
+ vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
+ if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
+ src |= (1ull << vmid);
+ else
+ src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
+ }
+
+ for (i = 0; i < mem_parcel->n_mem_entries; i++) {
+ src_cpy = src;
+ ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
+ le64_to_cpu(mem_parcel->mem_entries[i].size),
+ &src_cpy, &new_perms, 1);
+ WARN_ON_ONCE(ret);
+ }
+
+ return ret;
+}
+
+static struct gunyah_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
+ .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
+ .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
+};
+
static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
{
struct device_node *tcsr;
@@ -1500,6 +1597,9 @@ static int qcom_scm_probe(struct platform_device *pdev)
if (download_mode)
qcom_scm_set_download_mode(true);

+ if (devm_gh_rm_register_platform_ops(&pdev->dev, &qcom_scm_gh_rm_platform_ops))
+ dev_warn(__scm->dev, "Gunyah RM platform ops were already registered\n");
+
return 0;
}

--
2.39.1


2023-02-14 21:27:14

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 17/26] docs: gunyah: Document Gunyah VM Manager


Document the ioctls and usage of Gunyah VM Manager driver.

Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/index.rst | 1 +
Documentation/virt/gunyah/vm-manager.rst | 106 +++++++++++++++++++++++
2 files changed, 107 insertions(+)
create mode 100644 Documentation/virt/gunyah/vm-manager.rst

diff --git a/Documentation/virt/gunyah/index.rst b/Documentation/virt/gunyah/index.rst
index 45adbbc311db..b204b85e86db 100644
--- a/Documentation/virt/gunyah/index.rst
+++ b/Documentation/virt/gunyah/index.rst
@@ -7,6 +7,7 @@ Gunyah Hypervisor
.. toctree::
:maxdepth: 1

+ vm-manager
message-queue

Gunyah is a Type-1 hypervisor which is independent of any OS kernel, and runs in
diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
new file mode 100644
index 000000000000..c0126cfeadc7
--- /dev/null
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+Virtual Machine Manager
+=======================
+
+The Gunyah Virtual Machine Manager is a Linux driver to support launching
+virtual machines using Gunyah. It presently supports launching non-proxy
+scheduled Linux-like virtual machines.
+
+Except for some basic information about the location of initial binaries,
+most of the configuration about a Gunyah virtual machine is described in the
+VM's devicetree. The devicetree is generated by userspace. Interacting with the
+virtual machine is still done via the kernel and VM configuration requires some
+of the corresponding functionality to be set up in the kernel. For instance,
+sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
+ioctl. The VM itself is configured to use the memory region via the
+devicetree.
+
+Sample Userspace VMM
+====================
+
+A sample userspace VMM is included in samples/gunyah/ along with a minimal
+devicetree that can be used to launch a VM. To build this sample, enable
+CONFIG_SAMPLE_GUNYAH.
+
+IOCTLs and userspace VMM flows
+==============================
+
+The kernel exposes a char device interface at /dev/gunyah.
+
+To create a VM, use the GH_CREATE_VM ioctl. A successful call will return a
+"Gunyah VM" file descriptor.
+
+/dev/gunyah API Descriptions
+----------------------------
+
+GH_CREATE_VM
+~~~~~~~~~~~~
+
+Creates a Gunyah VM. The argument is reserved for future use and must be 0.
+
+Gunyah VM API Descriptions
+--------------------------
+
+GH_VM_SET_USER_MEM_REGION
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ struct gh_userspace_memory_region {
+ __u32 label;
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size;
+ __u64 userspace_addr;
+ };
+
+This ioctl allows the user to create or delete a memory parcel for a guest
+virtual machine. Each memory region is uniquely identified by a label;
+attempting to create two regions with the same label is not allowed.
+
+While VMM is guest-agnostic and allows runtime addition of memory regions,
+Linux guest virtual machines do not support accepting memory regions at runtime.
+Thus, memory regions should be provided before starting the VM and the VM must
+be configured to accept these at boot-up.
+
+The guest physical address is used by Linux kernel to check that the requested
+user regions do not overlap and to help find the corresponding memory region
+for calls like GH_VM_SET_DTB_CONFIG. It should be page aligned.
+
+memory_size and userspace_addr should be page-aligned.
+
+The flags field of gh_userspace_memory_region accepts the following bits. All
+other bits must be 0 and are reserved for future use. The ioctl will return
+-EINVAL if an unsupported bit is detected.
+
+ - GH_MEM_ALLOW_READ/GH_MEM_ALLOW_WRITE/GH_MEM_ALLOW_EXEC sets read/write/exec
+ permissions for the guest, respectively.
+ - GH_MEM_LENT means that the memory will be unmapped from the host and be
+ unaccessible by the host while the guest has the region.
+
+To add a memory region, call GH_VM_SET_USER_MEM_REGION with fields set as
+described above.
+
+To delete a memory region, call GH_VM_SET_USER_MEM_REGION with label set to the
+desired region and memory_size set to 0.
+
+GH_VM_SET_DTB_CONFIG
+~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ struct gh_vm_dtb_config {
+ __u64 gpa;
+ __u64 size;
+ };
+
+This ioctl sets the location of the VM's devicetree blob and is used by Gunyah
+Resource Manager to allocate resources. The guest physical memory should be part
+of the primary memory parcel provided to the VM prior to GH_VM_START.
+
+GH_VM_START
+~~~~~~~~~~~
+
+This ioctl starts the VM.
--
2.39.1


2023-02-14 21:27:45

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 20/26] virt: gunyah: Add resource tickets


Some VM functions need to acquire Gunyah resources. For instance, Gunyah
vCPUs are exposed to the host as a resource. The Gunyah vCPU function
will register a resource ticket and be able to interact with the
hypervisor once the resource ticket is filled.

Resource tickets are the mechanism for functions to acquire ownership of
Gunyah resources. Gunyah functions can be created before the VM's
resources are created and made available to Linux. A resource ticket
identifies a type of resource and a label of a resource which the ticket
holder is interested in.

Resources are created by Gunyah as configured in the VM's devicetree
configuration. Gunyah doesn't process the label and that makes it
possible for userspace to create multiple resources with the same label.
Resource ticket owners need to be prepared for populate to be called
multiple times if userspace created multiple resources with the same
label.

Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 110 +++++++++++++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.h | 4 ++
include/linux/gunyah_vm_mgr.h | 14 +++++
3 files changed, 127 insertions(+), 1 deletion(-)

diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index e9c55e7dd1b3..7190107a6a0d 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -189,6 +189,74 @@ static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
return r;
}

+int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket)
+{
+ struct gh_vm_resource_ticket *iter;
+ struct gunyah_resource *ghrsc;
+ int ret = 0;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry(iter, &ghvm->resource_tickets, list) {
+ if (iter->resource_type == ticket->resource_type && iter->label == ticket->label) {
+ ret = -EEXIST;
+ goto out;
+ }
+ }
+
+ if (!try_module_get(ticket->owner)) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ list_add(&ticket->list, &ghvm->resource_tickets);
+ INIT_LIST_HEAD(&ticket->resources);
+
+ list_for_each_entry(ghrsc, &ghvm->resources, list) {
+ if (ghrsc->type == ticket->resource_type && ghrsc->rm_label == ticket->label) {
+ if (!ticket->populate(ticket, ghrsc))
+ list_move(&ghrsc->list, &ticket->resources);
+ }
+ }
+out:
+ mutex_unlock(&ghvm->resources_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_vm_add_resource_ticket);
+
+void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket)
+{
+ struct gunyah_resource *ghrsc, *iter;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry_safe(ghrsc, iter, &ticket->resources, list) {
+ ticket->unpopulate(ticket, ghrsc);
+ list_move(&ghrsc->list, &ghvm->resources);
+ }
+
+ module_put(ticket->owner);
+ list_del(&ticket->list);
+ mutex_unlock(&ghvm->resources_lock);
+}
+EXPORT_SYMBOL_GPL(gh_vm_remove_resource_ticket);
+
+static void gh_vm_add_resource(struct gh_vm *ghvm, struct gunyah_resource *ghrsc)
+{
+ struct gh_vm_resource_ticket *ticket;
+
+ mutex_lock(&ghvm->resources_lock);
+ list_for_each_entry(ticket, &ghvm->resource_tickets, list) {
+ if (ghrsc->type == ticket->resource_type && ghrsc->rm_label == ticket->label) {
+ if (!ticket->populate(ticket, ghrsc)) {
+ list_add(&ghrsc->list, &ticket->resources);
+ goto found;
+ }
+ }
+ }
+ list_add(&ghrsc->list, &ghvm->resources);
+found:
+ mutex_unlock(&ghvm->resources_lock);
+}
+
static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
{
struct gh_rm_vm_status_payload *payload = data;
@@ -254,6 +322,8 @@ static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
struct gh_vm_function_instance *inst, *iiter;
+ struct gh_vm_resource_ticket *ticket, *titer;
+ struct gunyah_resource *ghrsc, *riter;
struct gh_vm_mem *mapping, *tmp;
int ret;

@@ -271,6 +341,23 @@ static void gh_vm_free(struct work_struct *work)
}
mutex_unlock(&functions_lock);

+ if (!list_empty(&ghvm->resource_tickets)) {
+ pr_warn("Dangling resource tickets:\n");
+ list_for_each_entry_safe(ticket, titer, &ghvm->resource_tickets, list) {
+ pr_warn(" %pS\n", ticket->populate);
+ gh_vm_remove_resource_ticket(ghvm, ticket);
+ }
+ }
+
+ list_for_each_entry_safe(ghrsc, riter, &ghvm->resources, list) {
+ gh_rm_free_resource(ghrsc);
+ }
+
+ ret = gh_rm_vm_reset(ghvm->rm, ghvm->vmid);
+ if (ret)
+ pr_err("Failed to reset the vm: %d\n", ret);
+ wait_event(ghvm->vm_status_wait, ghvm->vm_status == GH_RM_VM_STATUS_RESET);
+
mutex_lock(&ghvm->mm_lock);
list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
gh_vm_mem_reclaim(ghvm, mapping);
@@ -350,6 +437,9 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
init_rwsem(&ghvm->status_lock);
INIT_WORK(&ghvm->free_work, gh_vm_free);
kref_init(&ghvm->kref);
+ mutex_init(&ghvm->resources_lock);
+ INIT_LIST_HEAD(&ghvm->resources);
+ INIT_LIST_HEAD(&ghvm->resource_tickets);
INIT_LIST_HEAD(&ghvm->functions);
ghvm->vm_status = GH_RM_VM_STATUS_LOAD;

@@ -359,9 +449,11 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
static int gh_vm_start(struct gh_vm *ghvm)
{
struct gh_vm_mem *mapping;
+ struct gh_rm_hyp_resources *resources;
+ struct gunyah_resource *ghrsc;
u64 dtb_offset;
u32 mem_handle;
- int ret;
+ int ret, i, n;

down_write(&ghvm->status_lock);
if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
@@ -412,6 +504,22 @@ static int gh_vm_start(struct gh_vm *ghvm)
goto err;
}

+ ret = gh_rm_get_hyp_resources(ghvm->rm, ghvm->vmid, &resources);
+ if (ret) {
+ pr_warn("Failed to get hypervisor resources for VM: %d\n", ret);
+ goto err;
+ }
+
+ for (i = 0, n = le32_to_cpu(resources->n_entries); i < n; i++) {
+ ghrsc = gh_rm_alloc_resource(ghvm->rm, &resources->entries[i]);
+ if (!ghrsc) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ gh_vm_add_resource(ghvm, ghrsc);
+ }
+
ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
if (ret) {
pr_warn("Failed to start VM: %d\n", ret);
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 4750d56c1297..56ae97d752d6 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -7,6 +7,7 @@
#define _GH_PRIV_VM_MGR_H

#include <linux/gunyah_rsc_mgr.h>
+#include <linux/gunyah_vm_mgr.h>
#include <linux/list.h>
#include <linux/kref.h>
#include <linux/miscdevice.h>
@@ -49,6 +50,9 @@ struct gh_vm {
struct mutex mm_lock;
struct list_head memory_mappings;
struct list_head functions;
+ struct mutex resources_lock;
+ struct list_head resources;
+ struct list_head resource_tickets;
};

int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
index f0a95af50b2e..84579150d6bc 100644
--- a/include/linux/gunyah_vm_mgr.h
+++ b/include/linux/gunyah_vm_mgr.h
@@ -77,4 +77,18 @@ void gh_vm_function_unregister(struct gh_vm_function *f);
DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind); \
module_gunyah_vm_function(_name)

+struct gh_vm_resource_ticket {
+ struct list_head list; /* for gh_vm's resources list */
+ struct list_head resources; /* for gunyah_resources's list */
+ enum gunyah_resource_type resource_type;
+ u32 label;
+
+ struct module *owner;
+ int (*populate)(struct gh_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc);
+ void (*unpopulate)(struct gh_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc);
+};
+
+int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
+void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
+
#endif
--
2.39.1


2023-02-14 21:27:53

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 21/26] virt: gunyah: Add IO handlers


Add framework for VM functions to handle stage-2 write faults from Gunyah
guest virtual machines. IO handlers have a range of addresses which they
apply to. Optionally, they may apply to only when the value written
matches the IO handler's value.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
drivers/virt/gunyah/vm_mgr.c | 94 +++++++++++++++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.h | 5 ++
include/linux/gunyah_vm_mgr.h | 25 ++++++++++
3 files changed, 124 insertions(+)

diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 7190107a6a0d..24829db17a0d 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -257,6 +257,100 @@ static void gh_vm_add_resource(struct gh_vm *ghvm, struct gunyah_resource *ghrsc
mutex_unlock(&ghvm->resources_lock);
}

+static int _gh_vm_io_handler_compare(const struct rb_node *node, const struct rb_node *parent)
+{
+ struct gh_vm_io_handler *n = container_of(node, struct gh_vm_io_handler, node);
+ struct gh_vm_io_handler *p = container_of(parent, struct gh_vm_io_handler, node);
+
+ if (n->addr < p->addr)
+ return -1;
+ if (n->addr > p->addr)
+ return 1;
+ if ((n->len && !p->len) || (!n->len && p->len))
+ return 0;
+ if (n->len < p->len)
+ return -1;
+ if (n->len > p->len)
+ return 1;
+ if (n->datamatch < p->datamatch)
+ return -1;
+ if (n->datamatch > p->datamatch)
+ return 1;
+ return 0;
+}
+
+static int gh_vm_io_handler_compare(struct rb_node *node, const struct rb_node *parent)
+{
+ return _gh_vm_io_handler_compare(node, parent);
+}
+
+static int gh_vm_io_handler_find(const void *key, const struct rb_node *node)
+{
+ const struct gh_vm_io_handler *k = key;
+
+ return _gh_vm_io_handler_compare(&k->node, node);
+}
+
+static struct gh_vm_io_handler *gh_vm_mgr_find_io_hdlr(struct gh_vm *ghvm, u64 addr,
+ u64 len, u64 data)
+{
+ struct gh_vm_io_handler key = {
+ .addr = addr,
+ .len = len,
+ .datamatch = data,
+ };
+ struct rb_node *node;
+
+ node = rb_find(&key, &ghvm->mmio_handler_root, gh_vm_io_handler_find);
+ if (!node)
+ return NULL;
+
+ return container_of(node, struct gh_vm_io_handler, node);
+}
+
+int gh_vm_mmio_write(struct gh_vm *ghvm, u64 addr, u32 len, u64 data)
+{
+ struct gh_vm_io_handler *io_hdlr = NULL;
+ int ret;
+
+ down_read(&ghvm->mmio_handler_lock);
+ io_hdlr = gh_vm_mgr_find_io_hdlr(ghvm, addr, len, data);
+ if (!io_hdlr || !io_hdlr->ops || !io_hdlr->ops->write) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ ret = io_hdlr->ops->write(io_hdlr, addr, len, data);
+
+out:
+ up_read(&ghvm->mmio_handler_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_vm_mmio_write);
+
+int gh_vm_add_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_hdlr)
+{
+ struct rb_node *found;
+
+ if (io_hdlr->datamatch && (!io_hdlr->len || io_hdlr->len > sizeof(io_hdlr->data)))
+ return -EINVAL;
+
+ down_write(&ghvm->mmio_handler_lock);
+ found = rb_find_add(&io_hdlr->node, &ghvm->mmio_handler_root, gh_vm_io_handler_compare);
+ up_write(&ghvm->mmio_handler_lock);
+
+ return found ? -EEXIST : 0;
+}
+EXPORT_SYMBOL_GPL(gh_vm_add_io_handler);
+
+void gh_vm_remove_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_hdlr)
+{
+ down_write(&ghvm->mmio_handler_lock);
+ rb_erase(&io_hdlr->node, &ghvm->mmio_handler_root);
+ up_write(&ghvm->mmio_handler_lock);
+}
+EXPORT_SYMBOL_GPL(gh_vm_remove_io_handler);
+
static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
{
struct gh_rm_vm_status_payload *payload = data;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 56ae97d752d6..70802cd7cbe1 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -8,6 +8,7 @@

#include <linux/gunyah_rsc_mgr.h>
#include <linux/gunyah_vm_mgr.h>
+#include <linux/idr.h>
#include <linux/list.h>
#include <linux/kref.h>
#include <linux/miscdevice.h>
@@ -53,6 +54,8 @@ struct gh_vm {
struct mutex resources_lock;
struct list_head resources;
struct list_head resource_tickets;
+ struct rb_root mmio_handler_root;
+ struct rw_semaphore mmio_handler_lock;
};

int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
@@ -61,4 +64,6 @@ int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size);

+int gh_vm_mmio_write(struct gh_vm *ghvm, u64 addr, u32 len, u64 data);
+
#endif
diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
index 84579150d6bc..4d5bb0638d0d 100644
--- a/include/linux/gunyah_vm_mgr.h
+++ b/include/linux/gunyah_vm_mgr.h
@@ -91,4 +91,29 @@ struct gh_vm_resource_ticket {
int gh_vm_add_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);
void gh_vm_remove_resource_ticket(struct gh_vm *ghvm, struct gh_vm_resource_ticket *ticket);

+/*
+ * gh_vm_io_handler contains the info about an io device and its associated
+ * addr and the ops associated with the io device.
+ */
+struct gh_vm_io_handler {
+ struct rb_node node;
+ u64 addr;
+
+ bool datamatch;
+ u8 len;
+ u64 data;
+ struct gh_vm_io_handler_ops *ops;
+};
+
+/*
+ * gh_vm_io_handler_ops contains function pointers associated with an iodevice.
+ */
+struct gh_vm_io_handler_ops {
+ int (*read)(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data);
+ int (*write)(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data);
+};
+
+int gh_vm_add_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_dev);
+void gh_vm_remove_io_handler(struct gh_vm *ghvm, struct gh_vm_io_handler *io_dev);
+
#endif
--
2.39.1


2023-02-14 21:27:56

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 19/26] gunyah: vm_mgr: Add framework to add VM Functions


Introduce a framework for Gunyah userspace to install VM functions. VM
functions are optional interfaces to the virtual machine. vCPUs,
ioeventfs, and irqfds are examples of such VM functions and are
implemented in subsequent patches.

A generic framework is implemented instead of individual ioctls to
create vCPUs, irqfds, etc., in order to simplify the VM manager core
implementation and allow dynamic loading of VM function modules.

Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 18 ++
drivers/virt/gunyah/vm_mgr.c | 240 ++++++++++++++++++++++-
drivers/virt/gunyah/vm_mgr.h | 3 +
include/linux/gunyah_vm_mgr.h | 80 ++++++++
include/uapi/linux/gunyah.h | 17 ++
5 files changed, 353 insertions(+), 5 deletions(-)
create mode 100644 include/linux/gunyah_vm_mgr.h

diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index c0126cfeadc7..5272a6e9145c 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -17,6 +17,24 @@ sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
ioctl. The VM itself is configured to use the memory region via the
devicetree.

+Gunyah Functions
+================
+
+Components of a Gunyah VM's configuration that need kernel configuration are
+called "functions" and are built on top of a framework. Functions are identified
+by a string and have some argument(s) to configure them. They are typically
+created by the `GH_VM_ADD_FUNCTION` ioctl.
+
+Functions typically will always do at least one of these operations:
+
+1. Create resource ticket(s). Resource tickets allow a function to register
+ itself as the client for a Gunyah resource (e.g. doorbell or vCPU) and
+ the function is given the pointer to the `struct gunyah_resource` when the
+ VM is starting.
+
+2. Register IO handler(s). IO handlers allow a function to handle stage-2 faults
+ from the virtual machine.
+
Sample Userspace VMM
====================

diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index fa324385ade5..e9c55e7dd1b3 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -6,8 +6,10 @@
#define pr_fmt(fmt) "gh_vm_mgr: " fmt

#include <linux/anon_inodes.h>
+#include <linux/compat.h>
#include <linux/file.h>
#include <linux/gunyah_rsc_mgr.h>
+#include <linux/gunyah_vm_mgr.h>
#include <linux/miscdevice.h>
#include <linux/mm.h>
#include <linux/module.h>
@@ -16,6 +18,177 @@

#include "vm_mgr.h"

+static DEFINE_MUTEX(functions_lock);
+static DEFINE_IDR(functions);
+
+int gh_vm_function_register(struct gh_vm_function *drv)
+{
+ int ret = 0;
+
+ if (!drv->bind || !drv->unbind)
+ return -EINVAL;
+
+ mutex_lock(&functions_lock);
+ if (idr_find(&functions, drv->type)) {
+ ret = -EEXIST;
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&drv->instances);
+ ret = idr_alloc(&functions, drv, drv->type, drv->type + 1, GFP_KERNEL);
+ if (ret > 0)
+ ret = 0;
+out:
+ mutex_unlock(&functions_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(gh_vm_function_register);
+
+static void gh_vm_remove_function_instance(struct gh_vm_function_instance *inst)
+ __must_hold(functions_lock)
+{
+ inst->fn->unbind(inst);
+ list_del(&inst->vm_list);
+ list_del(&inst->fn_list);
+ module_put(inst->fn->mod);
+ if (inst->arg_size)
+ kfree(inst->argp);
+ kfree(inst);
+}
+
+void gh_vm_function_unregister(struct gh_vm_function *fn)
+{
+ struct gh_vm_function_instance *inst, *iter;
+
+ mutex_lock(&functions_lock);
+ list_for_each_entry_safe(inst, iter, &fn->instances, fn_list)
+ gh_vm_remove_function_instance(inst);
+ idr_remove(&functions, fn->type);
+ mutex_unlock(&functions_lock);
+}
+EXPORT_SYMBOL_GPL(gh_vm_function_unregister);
+
+static long gh_vm_add_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
+{
+ struct gh_vm_function_instance *inst;
+ void __user *argp;
+ long r = 0;
+
+ if (f->arg_size > GH_FN_MAX_ARG_SIZE)
+ return -EINVAL;
+
+ inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ inst->arg_size = f->arg_size;
+ if (inst->arg_size) {
+ inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
+ if (!inst->arg) {
+ r = -ENOMEM;
+ goto free;
+ }
+
+ argp = is_compat_task() ? compat_ptr(f->arg) : (void __user *) f->arg;
+ if (copy_from_user(inst->argp, argp, f->arg_size)) {
+ r = -EFAULT;
+ goto free_arg;
+ }
+ } else {
+ inst->arg = f->arg;
+ }
+
+ mutex_lock(&functions_lock);
+ inst->fn = idr_find(&functions, f->type);
+ if (!inst->fn) {
+ mutex_unlock(&functions_lock);
+ r = request_module("ghfunc:%d", f->type);
+ if (r)
+ goto unlock_free;
+
+ mutex_lock(&functions_lock);
+ inst->fn = idr_find(&functions, f->type);
+ }
+
+ if (!inst->fn) {
+ r = -ENOENT;
+ goto unlock_free;
+ }
+
+ if (!try_module_get(inst->fn->mod)) {
+ r = -ENOENT;
+ inst->fn = NULL;
+ goto unlock_free;
+ }
+
+ inst->ghvm = ghvm;
+ inst->rm = ghvm->rm;
+
+ r = inst->fn->bind(inst);
+ if (r < 0) {
+ module_put(inst->fn->mod);
+ goto unlock_free;
+ }
+
+ list_add(&inst->vm_list, &ghvm->functions);
+ list_add(&inst->fn_list, &inst->fn->instances);
+ mutex_unlock(&functions_lock);
+ return r;
+unlock_free:
+ mutex_unlock(&functions_lock);
+free_arg:
+ if (inst->arg_size)
+ kfree(inst->argp);
+free:
+ kfree(inst);
+ return r;
+}
+
+static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
+{
+ struct gh_vm_function_instance *inst, *iter;
+ void __user *user_argp;
+ void *argp;
+ long r = 0;
+
+ r = mutex_lock_interruptible(&functions_lock);
+ if (r)
+ return r;
+
+ if (f->arg_size) {
+ argp = kzalloc(f->arg_size, GFP_KERNEL);
+ if (!argp) {
+ r = -ENOMEM;
+ goto out;
+ }
+
+ user_argp = is_compat_task() ? compat_ptr(f->arg) : (void __user *) f->arg;
+ if (copy_from_user(argp, user_argp, f->arg_size)) {
+ r = -EFAULT;
+ kfree(argp);
+ goto out;
+ }
+
+ list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
+ if (inst->fn->type == f->type &&
+ f->arg_size == inst->arg_size &&
+ !memcmp(argp, inst->argp, f->arg_size))
+ gh_vm_remove_function_instance(inst);
+ }
+ } else {
+ list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
+ if (inst->fn->type == f->type &&
+ f->arg_size == inst->arg_size &&
+ inst->arg == f->arg)
+ gh_vm_remove_function_instance(inst);
+ }
+ }
+
+out:
+ mutex_unlock(&functions_lock);
+ return r;
+}
+
static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
{
struct gh_rm_vm_status_payload *payload = data;
@@ -80,6 +253,7 @@ static void gh_vm_stop(struct gh_vm *ghvm)
static void gh_vm_free(struct work_struct *work)
{
struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
+ struct gh_vm_function_instance *inst, *iiter;
struct gh_vm_mem *mapping, *tmp;
int ret;

@@ -90,7 +264,13 @@ static void gh_vm_free(struct work_struct *work)
fallthrough;
case GH_RM_VM_STATUS_INIT_FAILED:
case GH_RM_VM_STATUS_LOAD:
- case GH_RM_VM_STATUS_LOAD_FAILED:
+ case GH_RM_VM_STATUS_EXITED:
+ mutex_lock(&functions_lock);
+ list_for_each_entry_safe(inst, iiter, &ghvm->functions, vm_list) {
+ gh_vm_remove_function_instance(inst);
+ }
+ mutex_unlock(&functions_lock);
+
mutex_lock(&ghvm->mm_lock);
list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
gh_vm_mem_reclaim(ghvm, mapping);
@@ -113,6 +293,28 @@ static void gh_vm_free(struct work_struct *work)
}
}

+static void _gh_vm_put(struct kref *kref)
+{
+ struct gh_vm *ghvm = container_of(kref, struct gh_vm, kref);
+
+ /* VM will be reset and make RM calls which can interruptible sleep.
+ * Defer to a work so this thread can receive signal.
+ */
+ schedule_work(&ghvm->free_work);
+}
+
+int __must_check gh_vm_get(struct gh_vm *ghvm)
+{
+ return kref_get_unless_zero(&ghvm->kref);
+}
+EXPORT_SYMBOL_GPL(gh_vm_get);
+
+void gh_vm_put(struct gh_vm *ghvm)
+{
+ kref_put(&ghvm->kref, _gh_vm_put);
+}
+EXPORT_SYMBOL_GPL(gh_vm_put);
+
static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
{
struct gh_vm *ghvm;
@@ -147,6 +349,8 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
INIT_LIST_HEAD(&ghvm->memory_mappings);
init_rwsem(&ghvm->status_lock);
INIT_WORK(&ghvm->free_work, gh_vm_free);
+ kref_init(&ghvm->kref);
+ INIT_LIST_HEAD(&ghvm->functions);
ghvm->vm_status = GH_RM_VM_STATUS_LOAD;

return ghvm;
@@ -291,6 +495,35 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
r = gh_vm_ensure_started(ghvm);
break;
}
+ case GH_VM_ADD_FUNCTION: {
+ struct gh_fn_desc *f;
+
+ f = kzalloc(sizeof(*f), GFP_KERNEL);
+ if (!f)
+ return -ENOMEM;
+
+ if (copy_from_user(f, argp, sizeof(*f)))
+ return -EFAULT;
+
+ r = gh_vm_add_function(ghvm, f);
+ if (r < 0)
+ kfree(f);
+ break;
+ }
+ case GH_VM_REMOVE_FUNCTION: {
+ struct gh_fn_desc *f;
+
+ f = kzalloc(sizeof(*f), GFP_KERNEL);
+ if (!f)
+ return -ENOMEM;
+
+ if (copy_from_user(f, argp, sizeof(*f)))
+ return -EFAULT;
+
+ r = gh_vm_rm_function(ghvm, f);
+ kfree(f);
+ break;
+ }
default:
r = -ENOTTY;
break;
@@ -303,10 +536,7 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
{
struct gh_vm *ghvm = filp->private_data;

- /* VM will be reset and make RM calls which can interruptible sleep.
- * Defer to a work so this thread can receive signal.
- */
- schedule_work(&ghvm->free_work);
+ gh_vm_put(ghvm);
return 0;
}

diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index e9cf56647cc2..4750d56c1297 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -8,6 +8,7 @@

#include <linux/gunyah_rsc_mgr.h>
#include <linux/list.h>
+#include <linux/kref.h>
#include <linux/miscdevice.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
@@ -44,8 +45,10 @@ struct gh_vm {
struct rw_semaphore status_lock;

struct work_struct free_work;
+ struct kref kref;
struct mutex mm_lock;
struct list_head memory_mappings;
+ struct list_head functions;
};

int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
new file mode 100644
index 000000000000..f0a95af50b2e
--- /dev/null
+++ b/include/linux/gunyah_vm_mgr.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _GUNYAH_VM_MGR_H
+#define _GUNYAH_VM_MGR_H
+
+#include <linux/compiler_types.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_rsc_mgr.h>
+#include <linux/list.h>
+#include <linux/mod_devicetable.h>
+#include <linux/notifier.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gh_vm;
+
+int __must_check gh_vm_get(struct gh_vm *ghvm);
+void gh_vm_put(struct gh_vm *ghvm);
+
+struct gh_vm_function_instance;
+struct gh_vm_function {
+ u32 type;
+ const char *name;
+ struct module *mod;
+ long (*bind)(struct gh_vm_function_instance *f);
+ void (*unbind)(struct gh_vm_function_instance *f);
+ struct mutex instances_lock;
+ struct list_head instances;
+};
+
+/**
+ * struct gh_vm_function_instance - Represents one function instance
+ * @arg_size: size of user argument
+ * @arg: user argument to describe the function instance; arg_size is 0
+ * @argp: pointer to user argument
+ * @ghvm: Pointer to VM instance
+ * @rm: Pointer to resource manager for the VM instance
+ * @fn: The ops for the function
+ * @data: Private data for function
+ * @vm_list: for gh_vm's functions list
+ * @fn_list: for gh_vm_function's instances list
+ */
+struct gh_vm_function_instance {
+ size_t arg_size;
+ union {
+ u64 arg;
+ void *argp;
+ };
+ struct gh_vm *ghvm;
+ struct gh_rm *rm;
+ struct gh_vm_function *fn;
+ void *data;
+ struct list_head vm_list;
+ struct list_head fn_list;
+};
+
+int gh_vm_function_register(struct gh_vm_function *f);
+void gh_vm_function_unregister(struct gh_vm_function *f);
+
+#define DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind) \
+ static struct gh_vm_function _name = { \
+ .type = _type, \
+ .name = __stringify(_name), \
+ .mod = THIS_MODULE, \
+ .bind = _bind, \
+ .unbind = _unbind, \
+ }; \
+ MODULE_ALIAS("ghfunc:"__stringify(_type))
+
+#define module_gunyah_vm_function(__gf) \
+ module_driver(__gf, gh_vm_function_register, gh_vm_function_unregister)
+
+#define DECLARE_GUNYAH_VM_FUNCTION_INIT(_name, _type, _bind, _unbind) \
+ DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind); \
+ module_gunyah_vm_function(_name)
+
+#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index d899bba6a4c6..8df455a2a293 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -66,4 +66,21 @@ struct gh_vm_dtb_config {

#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)

+#define GH_FN_MAX_ARG_SIZE 256
+
+/**
+ * struct gh_fn_desc - Arguments to create a VM function
+ * @type: Type of the function. See GH_FN_* macro for supported types
+ * @arg_size: Size of argument to pass to the function
+ * @arg: Value or pointer to argument given to the function
+ */
+struct gh_fn_desc {
+ __u32 type;
+ __u32 arg_size;
+ __u64 arg;
+};
+
+#define GH_VM_ADD_FUNCTION _IOW(GH_IOCTL_TYPE, 0x4, struct gh_fn_desc)
+#define GH_VM_REMOVE_FUNCTION _IOW(GH_IOCTL_TYPE, 0x7, struct gh_fn_desc)
+
#endif
--
2.39.1


2023-02-14 21:28:13

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 22/26] virt: gunyah: Add proxy-scheduled vCPUs


Gunyah allows host virtual machines to schedule guest virtual machines
and handle their MMIO accesses. vCPUs are presented to the host as a
Gunyah resource and represented to userspace as a Gunyah VM function.

Creating the vcpu VM function will create a file descriptor that:
- can run an ioctl: GH_VCPU_RUN to schedule the guest vCPU until the
next interrupt occurs on the host or when the guest vCPU can no
longer be run.
- can be mmap'd to share a gh_vcpu_run structure which can look up the
reason why GH_VCPU_RUN returned and provide return values for MMIO
access.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 30 +-
arch/arm64/gunyah/gunyah_hypercall.c | 28 ++
drivers/virt/gunyah/Kconfig | 11 +
drivers/virt/gunyah/Makefile | 2 +
drivers/virt/gunyah/gunyah_vcpu.c | 463 +++++++++++++++++++++++
drivers/virt/gunyah/vm_mgr.c | 4 +
drivers/virt/gunyah/vm_mgr.h | 1 +
include/linux/gunyah.h | 8 +
include/uapi/linux/gunyah.h | 61 +++
9 files changed, 606 insertions(+), 2 deletions(-)
create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c

diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index 5272a6e9145c..662b8d6fec2a 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -5,8 +5,7 @@ Virtual Machine Manager
=======================

The Gunyah Virtual Machine Manager is a Linux driver to support launching
-virtual machines using Gunyah. It presently supports launching non-proxy
-scheduled Linux-like virtual machines.
+virtual machines using Gunyah.

Except for some basic information about the location of initial binaries,
most of the configuration about a Gunyah virtual machine is described in the
@@ -122,3 +121,30 @@ GH_VM_START
~~~~~~~~~~~

This ioctl starts the VM.
+
+GH_VM_ADD_FUNCTION
+~~~~~~~~~~~~~~~~~~
+
+This ioctl registers a Gunyah VM function with the VM manager. The VM function
+is described with a `type` string and some arguments for that type. Typically,
+the function is added before the VM starts, but the function doesn't "operate"
+until the VM starts with GH_VM_START: e.g. vCPU ioclts will all return an error
+until the VM starts because the vCPUs don't exist until the VM is started. This
+allows the VMM to set up all the kernel functionality needed for the VM *before*
+the VM starts.
+
+The possible types are documented below:
+
+Type: "vcpu"
+^^^^^^^^^^^^
+
+::
+
+ struct gh_fn_vcpu_arg {
+ __u32 vcpu_id;
+ };
+
+The vcpu type will register with the VM Manager to expect to control
+vCPU number `vcpu_id`. It returns a file descriptor allowing interaction with
+the vCPU. See the Gunyah vCPU API description sections for interacting with
+the Gunyah vCPU file descriptors.
diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index 2ca9ab098ff6..260d416dd006 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -40,6 +40,7 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
+#define GH_HYPERCALL_VCPU_RUN GH_HYPERCALL(0x8065)

/**
* gh_hypercall_hyp_identify() - Returns build information and feature flags
@@ -89,5 +90,32 @@ enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, siz
}
EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);

+enum gh_error gh_hypercall_vcpu_run(u64 capid, u64 *resume_data,
+ struct gh_hypercall_vcpu_run_resp *resp)
+{
+ struct arm_smccc_1_2_regs args = {
+ .a0 = GH_HYPERCALL_VCPU_RUN,
+ .a1 = capid,
+ .a2 = resume_data[0],
+ .a3 = resume_data[1],
+ .a4 = resume_data[2],
+ /* C language says this will be implictly zero. Gunyah requires 0, so be explicit */
+ .a5 = 0,
+ };
+ struct arm_smccc_1_2_regs res;
+
+ arm_smccc_1_2_hvc(&args, &res);
+
+ if (res.a0 == GH_ERROR_OK) {
+ resp->state = res.a1;
+ resp->state_data[0] = res.a2;
+ resp->state_data[1] = res.a3;
+ resp->state_data[2] = res.a4;
+ }
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_vcpu_run);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index de815189dab6..4c1c6110b50e 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -15,3 +15,14 @@ config GUNYAH

config GUNYAH_PLATFORM_HOOKS
tristate
+
+config GUNYAH_VCPU
+ tristate "Runnable Gunyah vCPUs"
+ depends on GUNYAH
+ help
+ Enable kernel support for host-scheduled vCPUs running under Gunyah.
+ When selecting this option, userspace virtual machine managers (VMM)
+ can schedule the guest VM's vCPUs instead of using Gunyah's scheduler.
+ VMMs can also handle stage 2 faults of the vCPUs.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 6b8f84dbfe0d..2d1b604a7b03 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -5,3 +5,5 @@ obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o

gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
+
+obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
diff --git a/drivers/virt/gunyah/gunyah_vcpu.c b/drivers/virt/gunyah/gunyah_vcpu.c
new file mode 100644
index 000000000000..fb1f55cb5020
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_vcpu.c
@@ -0,0 +1,463 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_vm_mgr.h>
+#include <linux/interrupt.h>
+#include <linux/kref.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/wait.h>
+
+#include "vm_mgr.h"
+
+#include <uapi/linux/gunyah.h>
+
+#define MAX_VCPU_NAME 20 /* gh-vcpu:u32_max+NUL */
+
+struct gunyah_vcpu {
+ struct gh_vm_function_instance *f;
+ struct gunyah_resource *rsc;
+ struct mutex run_lock;
+ /* Track why vcpu_run left last time around. */
+ enum {
+ GH_VCPU_UNKNOWN = 0,
+ GH_VCPU_READY,
+ GH_VCPU_MMIO_READ,
+ GH_VCPU_SYSTEM_DOWN,
+ } state;
+ u8 mmio_read_len;
+ struct gh_vcpu_run *vcpu_run;
+ struct completion ready;
+ struct gh_vm *ghvm;
+
+ struct notifier_block nb;
+ struct gh_vm_resource_ticket ticket;
+ struct kref kref;
+};
+
+/* VCPU is ready to run */
+#define GH_VCPU_STATE_READY 0
+/* VCPU is sleeping until an interrupt arrives */
+#define GH_VCPU_STATE_EXPECTS_WAKEUP 1
+/* VCPU is powered off */
+#define GH_VCPU_STATE_POWERED_OFF 2
+/* VCPU is blocked in EL2 for unspecified reason */
+#define GH_VCPU_STATE_BLOCKED 3
+/* VCPU has returned for MMIO READ */
+#define GH_VCPU_ADDRSPACE_VMMIO_READ 4
+/* VCPU has returned for MMIO WRITE */
+#define GH_VCPU_ADDRSPACE_VMMIO_WRITE 5
+
+static void vcpu_release(struct kref *kref)
+{
+ struct gunyah_vcpu *vcpu = container_of(kref, struct gunyah_vcpu, kref);
+
+ free_page((unsigned long)vcpu->vcpu_run);
+ kfree(vcpu);
+}
+
+/*
+ * When hypervisor allows us to schedule vCPU again, it gives us an interrupt
+ */
+static irqreturn_t gh_vcpu_irq_handler(int irq, void *data)
+{
+ struct gunyah_vcpu *vcpu = data;
+
+ complete(&vcpu->ready);
+ return IRQ_HANDLED;
+}
+
+static bool gh_handle_mmio(struct gunyah_vcpu *vcpu,
+ struct gh_hypercall_vcpu_run_resp *vcpu_run_resp)
+{
+ int ret = 0;
+ u64 addr = vcpu_run_resp->state_data[0],
+ len = vcpu_run_resp->state_data[1],
+ data = vcpu_run_resp->state_data[2];
+
+ if (vcpu_run_resp->state == GH_VCPU_ADDRSPACE_VMMIO_READ) {
+ vcpu->vcpu_run->mmio.is_write = 0;
+ /* Record that we need to give vCPU user's supplied value next gh_vcpu_run() */
+ vcpu->state = GH_VCPU_MMIO_READ;
+ vcpu->mmio_read_len = len;
+ } else { /* GH_VCPU_ADDRSPACE_VMMIO_WRITE */
+ /* Try internal handlers first */
+ ret = gh_vm_mmio_write(vcpu->f->ghvm, addr, len, data);
+ if (!ret)
+ return true;
+
+ /* Give userspace the info */
+ vcpu->vcpu_run->mmio.is_write = 1;
+ memcpy(vcpu->vcpu_run->mmio.data, &data, len);
+ }
+
+ vcpu->vcpu_run->mmio.phys_addr = addr;
+ vcpu->vcpu_run->mmio.len = len;
+ vcpu->vcpu_run->exit_reason = GH_VCPU_EXIT_MMIO;
+
+ return false;
+}
+
+static int gh_vcpu_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
+{
+ struct gunyah_vcpu *vcpu = container_of(nb, struct gunyah_vcpu, nb);
+ struct gh_rm_vm_exited_payload *exit_payload = data;
+
+ if (action == GH_RM_NOTIFICATION_VM_EXITED &&
+ le16_to_cpu(exit_payload->vmid) == vcpu->ghvm->vmid)
+ complete(&vcpu->ready);
+
+ return NOTIFY_OK;
+}
+
+static inline enum gh_vm_status remap_vm_status(enum gh_rm_vm_status rm_status)
+{
+ switch (rm_status) {
+ case GH_RM_VM_STATUS_INIT_FAILED:
+ return GH_VM_STATUS_LOAD_FAILED;
+ case GH_RM_VM_STATUS_EXITED:
+ return GH_VM_STATUS_EXITED;
+ default:
+ return GH_VM_STATUS_CRASHED;
+ }
+}
+
+/**
+ * gh_vcpu_check_system() - Check whether VM as a whole is running
+ * @vcpu: Pointer to gunyah_vcpu
+ *
+ * Returns true if the VM is alive.
+ * Returns false if the vCPU is the VM is not alive (can only be that VM is shutting down).
+ */
+static bool gh_vcpu_check_system(struct gunyah_vcpu *vcpu)
+ __must_hold(&vcpu->run_lock)
+{
+ bool ret = true;
+
+ down_read(&vcpu->ghvm->status_lock);
+ if (likely(vcpu->ghvm->vm_status == GH_RM_VM_STATUS_RUNNING))
+ goto out;
+
+ vcpu->vcpu_run->status.status = remap_vm_status(vcpu->ghvm->vm_status);
+ vcpu->vcpu_run->status.exit_info = vcpu->ghvm->exit_info;
+ vcpu->vcpu_run->exit_reason = GH_VCPU_EXIT_STATUS;
+ vcpu->state = GH_VCPU_SYSTEM_DOWN;
+ ret = false;
+out:
+ up_read(&vcpu->ghvm->status_lock);
+ return ret;
+}
+
+/**
+ * gh_vcpu_run() - Request Gunyah to begin scheduling this vCPU.
+ * @vcpu: The client descriptor that was obtained via gunyah_vcpu_alloc()
+ */
+static int gh_vcpu_run(struct gunyah_vcpu *vcpu)
+{
+ struct gh_hypercall_vcpu_run_resp vcpu_run_resp;
+ u64 state_data[3] = { 0 };
+ enum gh_error gh_error;
+ int ret = 0;
+
+ if (!vcpu->f)
+ return -ENODEV;
+
+ if (mutex_lock_interruptible(&vcpu->run_lock))
+ return -ERESTARTSYS;
+
+ if (!vcpu->rsc) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ switch (vcpu->state) {
+ case GH_VCPU_UNKNOWN:
+ if (vcpu->ghvm->vm_status != GH_RM_VM_STATUS_RUNNING) {
+ /* Check if VM is up. If VM is starting, will block until VM is fully up
+ * since that thread does down_write.
+ */
+ if (!gh_vcpu_check_system(vcpu))
+ goto out;
+ }
+ vcpu->state = GH_VCPU_READY;
+ break;
+ case GH_VCPU_MMIO_READ:
+ memcpy(&state_data[0], vcpu->vcpu_run->mmio.data, vcpu->mmio_read_len);
+ vcpu->state = GH_VCPU_READY;
+ break;
+ case GH_VCPU_SYSTEM_DOWN:
+ goto out;
+ default:
+ break;
+ }
+
+ while (!ret && !signal_pending(current)) {
+ if (vcpu->vcpu_run->immediate_exit) {
+ ret = -EINTR;
+ goto out;
+ }
+
+ gh_error = gh_hypercall_vcpu_run(vcpu->rsc->capid, state_data, &vcpu_run_resp);
+ if (gh_error == GH_ERROR_OK) {
+ ret = 0;
+ switch (vcpu_run_resp.state) {
+ case GH_VCPU_STATE_READY:
+ if (need_resched())
+ schedule();
+ break;
+ case GH_VCPU_STATE_POWERED_OFF:
+ /* vcpu might be off because the VM is shut down.
+ * If so, it won't ever run again: exit back to user
+ */
+ if (!gh_vcpu_check_system(vcpu))
+ goto out;
+ /* Otherwise, another vcpu will turn it on (e.g. by PSCI)
+ * and hyp sends an interrupt to wake Linux up.
+ */
+ fallthrough;
+ case GH_VCPU_STATE_EXPECTS_WAKEUP:
+ ret = wait_for_completion_interruptible(&vcpu->ready);
+ /* reinitialize completion before next hypercall. If we reinitialize
+ * after the hypercall, interrupt may have already come before
+ * re-initializing the completion and then end up waiting for
+ * event that already happened.
+ */
+ reinit_completion(&vcpu->ready);
+ /* Check system status again. Completion might've
+ * come from gh_vcpu_rm_notification
+ */
+ if (!ret && !gh_vcpu_check_system(vcpu))
+ goto out;
+ break;
+ case GH_VCPU_STATE_BLOCKED:
+ schedule();
+ break;
+ case GH_VCPU_ADDRSPACE_VMMIO_READ:
+ case GH_VCPU_ADDRSPACE_VMMIO_WRITE:
+ if (!gh_handle_mmio(vcpu, &vcpu_run_resp))
+ goto out;
+ break;
+ default:
+ pr_warn_ratelimited("Unknown vCPU state: %llx\n",
+ vcpu_run_resp.state);
+ schedule();
+ break;
+ }
+ } else if (gh_error == GH_ERROR_RETRY) {
+ schedule();
+ ret = 0;
+ } else
+ ret = gh_remap_error(gh_error);
+ }
+
+out:
+ mutex_unlock(&vcpu->run_lock);
+
+ if (signal_pending(current))
+ return -ERESTARTSYS;
+
+ return ret;
+}
+
+static long gh_vcpu_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+ struct gunyah_vcpu *vcpu = filp->private_data;
+ long ret = -EINVAL;
+
+ switch (cmd) {
+ case GH_VCPU_RUN:
+ ret = gh_vcpu_run(vcpu);
+ break;
+ case GH_VCPU_MMAP_SIZE:
+ ret = PAGE_SIZE;
+ break;
+ default:
+ break;
+ }
+ return ret;
+}
+
+static int gh_vcpu_release(struct inode *inode, struct file *filp)
+{
+ struct gunyah_vcpu *vcpu = filp->private_data;
+
+ gh_vm_put(vcpu->ghvm);
+ kref_put(&vcpu->kref, vcpu_release);
+ return 0;
+}
+
+static vm_fault_t gh_vcpu_fault(struct vm_fault *vmf)
+{
+ struct gunyah_vcpu *vcpu = vmf->vma->vm_file->private_data;
+ struct page *page = NULL;
+
+ if (vmf->pgoff == 0)
+ page = virt_to_page(vcpu->vcpu_run);
+
+ get_page(page);
+ vmf->page = page;
+ return 0;
+}
+
+static const struct vm_operations_struct gh_vcpu_ops = {
+ .fault = gh_vcpu_fault,
+};
+
+static int gh_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ vma->vm_ops = &gh_vcpu_ops;
+ return 0;
+}
+
+static const struct file_operations gh_vcpu_fops = {
+ .unlocked_ioctl = gh_vcpu_ioctl,
+ .release = gh_vcpu_release,
+ .llseek = noop_llseek,
+ .mmap = gh_vcpu_mmap,
+};
+
+static int gunyah_vcpu_populate(struct gh_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+ struct gunyah_vcpu *vcpu = container_of(ticket, struct gunyah_vcpu, ticket);
+ int ret;
+
+ mutex_lock(&vcpu->run_lock);
+ if (vcpu->rsc) {
+ ret = -1;
+ goto out;
+ }
+
+ vcpu->rsc = ghrsc;
+ init_completion(&vcpu->ready);
+
+ ret = request_irq(vcpu->rsc->irq, gh_vcpu_irq_handler, IRQF_TRIGGER_RISING, "gh_vcpu",
+ vcpu);
+ if (ret)
+ pr_warn("Failed to request vcpu irq %d: %d", vcpu->rsc->irq, ret);
+
+out:
+ mutex_unlock(&vcpu->run_lock);
+ return ret;
+}
+
+static void gunyah_vcpu_unpopulate(struct gh_vm_resource_ticket *ticket,
+ struct gunyah_resource *ghrsc)
+{
+ struct gunyah_vcpu *vcpu = container_of(ticket, struct gunyah_vcpu, ticket);
+
+ vcpu->vcpu_run->immediate_exit = true;
+ complete_all(&vcpu->ready);
+ mutex_lock(&vcpu->run_lock);
+ free_irq(vcpu->rsc->irq, vcpu);
+ vcpu->rsc = NULL;
+ mutex_unlock(&vcpu->run_lock);
+}
+
+static long gunyah_vcpu_bind(struct gh_vm_function_instance *f)
+{
+ struct gunyah_vcpu *vcpu;
+ char name[MAX_VCPU_NAME];
+ struct file *file;
+ struct page *page;
+ int fd;
+ long r;
+
+ if (!gh_api_has_feature(GH_API_FEATURE_VCPU))
+ return -EOPNOTSUPP;
+
+ if (f->arg_size)
+ return -EINVAL;
+
+ vcpu = kzalloc(sizeof(*vcpu), GFP_KERNEL);
+ if (!vcpu)
+ return -ENOMEM;
+
+ vcpu->f = f;
+ f->data = vcpu;
+ mutex_init(&vcpu->run_lock);
+ kref_init(&vcpu->kref);
+
+ page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!page) {
+ r = -ENOMEM;
+ goto err_destroy_vcpu;
+ }
+ vcpu->vcpu_run = page_address(page);
+
+ vcpu->ticket.resource_type = GUNYAH_RESOURCE_TYPE_VCPU;
+ vcpu->ticket.label = f->arg;
+ vcpu->ticket.owner = THIS_MODULE;
+ vcpu->ticket.populate = gunyah_vcpu_populate;
+ vcpu->ticket.unpopulate = gunyah_vcpu_unpopulate;
+
+ r = gh_vm_add_resource_ticket(f->ghvm, &vcpu->ticket);
+ if (r)
+ goto err_destroy_page;
+
+ fd = get_unused_fd_flags(O_CLOEXEC);
+ if (fd < 0) {
+ r = fd;
+ goto err_remove_vcpu;
+ }
+
+ if (!gh_vm_get(f->ghvm)) {
+ r = -ENODEV;
+ goto err_put_fd;
+ }
+ vcpu->ghvm = f->ghvm;
+
+ vcpu->nb.notifier_call = gh_vcpu_rm_notification;
+ /* Ensure we run after the vm_mgr handles the notification and does
+ * any necessary state changes. We wake up to check the new state.
+ */
+ vcpu->nb.priority = -1;
+ r = gh_rm_notifier_register(f->rm, &vcpu->nb);
+ if (r)
+ goto err_put_gh_vm;
+
+ kref_get(&vcpu->kref);
+ snprintf(name, sizeof(name), "gh-vcpu:%d", vcpu->ticket.label);
+ file = anon_inode_getfile(name, &gh_vcpu_fops, vcpu, O_RDWR);
+ if (IS_ERR(file)) {
+ r = PTR_ERR(file);
+ goto err_notifier;
+ }
+
+ fd_install(fd, file);
+
+ return fd;
+err_notifier:
+ gh_rm_notifier_unregister(f->rm, &vcpu->nb);
+err_put_gh_vm:
+ gh_vm_put(vcpu->ghvm);
+err_put_fd:
+ put_unused_fd(fd);
+err_remove_vcpu:
+ gh_vm_remove_resource_ticket(f->ghvm, &vcpu->ticket);
+err_destroy_page:
+ free_page((unsigned long)vcpu->vcpu_run);
+err_destroy_vcpu:
+ kfree(vcpu);
+ return r;
+}
+
+static void gunyah_vcpu_unbind(struct gh_vm_function_instance *f)
+{
+ struct gunyah_vcpu *vcpu = f->data;
+
+ gh_rm_notifier_unregister(f->rm, &vcpu->nb);
+ gh_vm_remove_resource_ticket(vcpu->f->ghvm, &vcpu->ticket);
+ vcpu->f = NULL;
+
+ kref_put(&vcpu->kref, vcpu_release);
+}
+
+DECLARE_GUNYAH_VM_FUNCTION_INIT(vcpu, GH_FN_VCPU, gunyah_vcpu_bind, gunyah_vcpu_unbind);
+MODULE_DESCRIPTION("Gunyah vCPU Driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index 24829db17a0d..7ec9f4c7a982 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -378,6 +378,10 @@ static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)

down_write(&ghvm->status_lock);
ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
+ ghvm->exit_info.type = le16_to_cpu(payload->exit_type);
+ ghvm->exit_info.reason_size = le32_to_cpu(payload->exit_reason_size);
+ memcpy(&ghvm->exit_info.reason, payload->exit_reason,
+ min(GH_VM_MAX_EXIT_REASON_SIZE, ghvm->exit_info.reason_size));
up_write(&ghvm->status_lock);

return NOTIFY_DONE;
diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 70802cd7cbe1..5e200d9f6f06 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -45,6 +45,7 @@ struct gh_vm {
enum gh_rm_vm_status vm_status;
wait_queue_head_t vm_status_wait;
struct rw_semaphore status_lock;
+ struct gh_vm_exit_info exit_info;

struct work_struct free_work;
struct kref kref;
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index a06d5fa68a65..c819df72d303 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -179,4 +179,12 @@ enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int
enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
bool *ready);

+struct gh_hypercall_vcpu_run_resp {
+ u64 state;
+ u64 state_data[3];
+};
+
+enum gh_error gh_hypercall_vcpu_run(u64 capid, u64 *resume_data,
+ struct gh_hypercall_vcpu_run_resp *resp);
+
#endif
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 8df455a2a293..1173ff0329c2 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -66,6 +66,12 @@ struct gh_vm_dtb_config {

#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)

+/**
+ * GH_FN_VCPU - create a vCPU instance to control a vCPU
+ * gh_fn_desc usage: arg_size must be 0, arg must be the vcpu's ID
+ */
+#define GH_FN_VCPU 1
+
#define GH_FN_MAX_ARG_SIZE 256

/**
@@ -83,4 +89,59 @@ struct gh_fn_desc {
#define GH_VM_ADD_FUNCTION _IOW(GH_IOCTL_TYPE, 0x4, struct gh_fn_desc)
#define GH_VM_REMOVE_FUNCTION _IOW(GH_IOCTL_TYPE, 0x7, struct gh_fn_desc)

+enum gh_vm_status {
+ GH_VM_STATUS_LOAD_FAILED = 1,
+#define GH_VM_STATUS_LOAD_FAILED GH_VM_STATUS_LOAD_FAILED
+ GH_VM_STATUS_EXITED = 2,
+#define GH_VM_STATUS_EXITED GH_VM_STATUS_EXITED
+ GH_VM_STATUS_CRASHED = 3,
+#define GH_VM_STATUS_CRASHED GH_VM_STATUS_CRASHED
+};
+
+/**
+ * Gunyah presently will send max 4 bytes of exit_reason.
+ * If that changes, this macro can be safely increased without breaking
+ * userspace so long as struct gh_vcpu_run < PAGE_SIZE.
+ */
+#define GH_VM_MAX_EXIT_REASON_SIZE 8u
+
+struct gh_vm_exit_info {
+ __u16 type;
+ __u16 reserved;
+ __u32 reason_size;
+ __u8 reason[GH_VM_MAX_EXIT_REASON_SIZE];
+};
+
+/* for GH_VCPU_RUN, returned by mmap(vcpu_fd, offset=0) */
+struct gh_vcpu_run {
+ /* in */
+ __u8 immediate_exit;
+ __u8 padding1[7];
+
+ /* out */
+#define GH_VCPU_EXIT_UNKNOWN 0
+#define GH_VCPU_EXIT_MMIO 1
+#define GH_VCPU_EXIT_STATUS 2
+ __u32 exit_reason;
+
+ union {
+ /* GH_VCPU_EXIT_MMIO */
+ struct {
+ __u64 phys_addr;
+ __u8 data[8];
+ __u32 len;
+ __u8 is_write;
+ } mmio;
+
+ /* GH_VCPU_EXIT_STATUS */
+ struct {
+ enum gh_vm_status status;
+ struct gh_vm_exit_info exit_info;
+ } status;
+ };
+};
+
+#define GH_VCPU_RUN _IO(GH_IOCTL_TYPE, 0x5)
+#define GH_VCPU_MMAP_SIZE _IO(GH_IOCTL_TYPE, 0x6)
+
#endif
--
2.39.1


2023-02-14 21:28:16

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 23/26] virt: gunyah: Add hypercalls for sending doorbell


Gunyah doorbells allow two virtual machines to signal each other using
interrupts. Add the hypercalls needed to assert the interrupt.

Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/gunyah/gunyah_hypercall.c | 25 +++++++++++++++++++++++++
include/linux/gunyah.h | 3 +++
2 files changed, 28 insertions(+)

diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
index 260d416dd006..54e561159ac7 100644
--- a/arch/arm64/gunyah/gunyah_hypercall.c
+++ b/arch/arm64/gunyah/gunyah_hypercall.c
@@ -38,6 +38,8 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
fn)

#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
+#define GH_HYPERCALL_BELL_SEND GH_HYPERCALL(0x8012)
+#define GH_HYPERCALL_BELL_SET_MASK GH_HYPERCALL(0x8015)
#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
#define GH_HYPERCALL_VCPU_RUN GH_HYPERCALL(0x8065)
@@ -60,6 +62,29 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
}
EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);

+enum gh_error gh_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_BELL_SEND, capid, new_flags, 0, &res);
+
+ if (res.a0 == GH_ERROR_OK)
+ *old_flags = res.a1;
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_bell_send);
+
+enum gh_error gh_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_1_1_hvc(GH_HYPERCALL_BELL_SET_MASK, capid, enable_mask, ack_mask, 0, &res);
+
+ return res.a0;
+}
+EXPORT_SYMBOL_GPL(gh_hypercall_bell_set_mask);
+
enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
bool *ready)
{
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index c819df72d303..bfceaa7000e5 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -172,6 +172,9 @@ struct gh_hypercall_hyp_identify_resp {

void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);

+enum gh_error gh_hypercall_bell_send(u64 capid, u64 new_flags, u64 *old_flags);
+enum gh_error gh_hypercall_bell_set_mask(u64 capid, u64 enable_mask, u64 ack_mask);
+
#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)

enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
--
2.39.1


2023-02-14 21:28:31

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 26/26] MAINTAINERS: Add Gunyah hypervisor drivers section


Add myself and Prakruthi as maintainers of Gunyah hypervisor drivers.

Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
MAINTAINERS | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 539d9359385c..fe6645099395 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8913,6 +8913,19 @@ L: [email protected]
S: Maintained
F: block/partitions/efi.*

+GUNYAH HYPERVISOR DRIVER
+M: Elliot Berman <[email protected]>
+M: Prakruthi Deepak Heragu <[email protected]>
+L: [email protected]
+S: Supported
+F: Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
+F: Documentation/virt/gunyah/
+F: arch/arm64/gunyah/
+F: drivers/mailbox/gunyah-msgq.c
+F: drivers/virt/gunyah/
+F: include/linux/gunyah*.h
+F: samples/gunyah/
+
HABANALABS PCI DRIVER
M: Oded Gabbay <[email protected]>
L: [email protected]
--
2.39.1


2023-02-14 21:46:14

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 18/26] virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource


When booting a Gunyah virtual machine, the host VM may gain capabilities
to interact with resources for the guest virtual machine. Examples of
such resources are vCPUs or message queues. To use those resources, we
need to translate the RM response into a gunyah_resource structure which
are useful to Linux drivers. Presently, Linux drivers need only to know
the type of resource, the capability ID, and an interrupt.

On ARM64 systems, the interrupt reported by Gunyah is the GIC interrupt
ID number and always a SPI.

Signed-off-by: Elliot Berman <[email protected]>
---
arch/arm64/include/asm/gunyah.h | 23 +++++
drivers/virt/gunyah/rsc_mgr.c | 161 +++++++++++++++++++++++++++++++-
include/linux/gunyah.h | 4 +
include/linux/gunyah_rsc_mgr.h | 4 +
4 files changed, 191 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/gunyah.h

diff --git a/arch/arm64/include/asm/gunyah.h b/arch/arm64/include/asm/gunyah.h
new file mode 100644
index 000000000000..64cfb964efee
--- /dev/null
+++ b/arch/arm64/include/asm/gunyah.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#ifndef __ASM_GUNYAH_H_
+#define __ASM_GUNYAH_H_
+
+#include <linux/irq.h>
+#include <dt-bindings/interrupt-controller/arm-gic.h>
+
+static inline int arch_gh_fill_irq_fwspec_params(u32 virq, struct irq_fwspec *fwspec)
+{
+ if (virq < 32 || virq > 1019)
+ return -EINVAL;
+
+ fwspec->param_count = 3;
+ fwspec->param[0] = GIC_SPI;
+ fwspec->param[1] = virq - 32;
+ fwspec->param[2] = IRQ_TYPE_EDGE_RISING;
+ return 0;
+}
+
+#endif
diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
index 73c5a6b7cbbc..eb1bc3f68792 100644
--- a/drivers/virt/gunyah/rsc_mgr.c
+++ b/drivers/virt/gunyah/rsc_mgr.c
@@ -18,6 +18,8 @@
#include <linux/platform_device.h>
#include <linux/miscdevice.h>

+#include <asm/gunyah.h>
+
#include "rsc_mgr.h"
#include "vm_mgr.h"

@@ -107,8 +109,137 @@ struct gh_rm {
struct blocking_notifier_head nh;

struct miscdevice miscdev;
+ struct irq_domain *irq_domain;
+};
+
+struct gh_irq_chip_data {
+ u32 gh_virq;
+};
+
+static struct irq_chip gh_rm_irq_chip = {
+ .name = "Gunyah",
+ .irq_enable = irq_chip_enable_parent,
+ .irq_disable = irq_chip_disable_parent,
+ .irq_ack = irq_chip_ack_parent,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_mask_ack = irq_chip_mask_ack_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+ .irq_eoi = irq_chip_eoi_parent,
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+ .irq_set_type = irq_chip_set_type_parent,
+ .irq_set_wake = irq_chip_set_wake_parent,
+ .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
+ .irq_retrigger = irq_chip_retrigger_hierarchy,
+ .irq_get_irqchip_state = irq_chip_get_parent_state,
+ .irq_set_irqchip_state = irq_chip_set_parent_state,
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int gh_rm_irq_domain_alloc(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ struct gh_irq_chip_data *chip_data, *spec = arg;
+ struct irq_fwspec parent_fwspec;
+ struct gh_rm *rm = d->host_data;
+ u32 gh_virq = spec->gh_virq;
+ int ret;
+
+ if (nr_irqs != 1 || gh_virq == U32_MAX)
+ return -EINVAL;
+
+ chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL);
+ if (!chip_data)
+ return -ENOMEM;
+
+ chip_data->gh_virq = gh_virq;
+
+ ret = irq_domain_set_hwirq_and_chip(d, virq, chip_data->gh_virq, &gh_rm_irq_chip,
+ chip_data);
+ if (ret)
+ return ret;
+
+ parent_fwspec.fwnode = d->parent->fwnode;
+ ret = arch_gh_fill_irq_fwspec_params(chip_data->gh_virq, &parent_fwspec);
+ if (ret) {
+ dev_err(rm->dev, "virq translation failed %u: %d\n", chip_data->gh_virq, ret);
+ goto err_free_irq_data;
+ }
+
+ ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, &parent_fwspec);
+ if (ret)
+ goto err_free_irq_data;
+
+ return ret;
+err_free_irq_data:
+ kfree(chip_data);
+ return ret;
+}
+
+static void gh_rm_irq_domain_free_single(struct irq_domain *d, unsigned int virq)
+{
+ struct gh_irq_chip_data *chip_data;
+ struct irq_data *irq_data;
+
+ irq_data = irq_domain_get_irq_data(d, virq);
+ if (!irq_data)
+ return;
+
+ chip_data = irq_data->chip_data;
+
+ kfree(chip_data);
+ irq_data->chip_data = NULL;
+}
+
+static void gh_rm_irq_domain_free(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs)
+{
+ unsigned int i;
+
+ for (i = 0; i < nr_irqs; i++)
+ gh_rm_irq_domain_free_single(d, virq);
+}
+
+static const struct irq_domain_ops gh_rm_irq_domain_ops = {
+ .alloc = gh_rm_irq_domain_alloc,
+ .free = gh_rm_irq_domain_free,
};

+struct gunyah_resource *gh_rm_alloc_resource(struct gh_rm *rm,
+ struct gh_rm_hyp_resource *hyp_resource)
+{
+ struct gunyah_resource *ghrsc;
+
+ ghrsc = kzalloc(sizeof(*ghrsc), GFP_KERNEL);
+ if (!ghrsc)
+ return NULL;
+
+ ghrsc->type = hyp_resource->type;
+ ghrsc->capid = le64_to_cpu(hyp_resource->cap_id);
+ ghrsc->irq = IRQ_NOTCONNECTED;
+ ghrsc->rm_label = le32_to_cpu(hyp_resource->resource_label);
+ if (hyp_resource->virq && le32_to_cpu(hyp_resource->virq) != U32_MAX) {
+ struct gh_irq_chip_data irq_data = {
+ .gh_virq = le32_to_cpu(hyp_resource->virq),
+ };
+
+ ghrsc->irq = irq_domain_alloc_irqs(rm->irq_domain, 1, NUMA_NO_NODE, &irq_data);
+ if (ghrsc->irq < 0) {
+ pr_err("Failed to allocate interrupt for resource %d label: %d: %d\n",
+ ghrsc->type, ghrsc->rm_label, ghrsc->irq);
+ ghrsc->irq = IRQ_NOTCONNECTED;
+ }
+ }
+
+ return ghrsc;
+}
+
+void gh_rm_free_resource(struct gunyah_resource *ghrsc)
+{
+ irq_dispose_mapping(ghrsc->irq);
+ kfree(ghrsc);
+}
+
static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
{
struct gh_rm_connection *connection;
@@ -553,6 +684,8 @@ static int gh_msgq_platform_probe_direction(struct platform_device *pdev,

static int gh_rm_drv_probe(struct platform_device *pdev)
{
+ struct irq_domain *parent_irq_domain;
+ struct device_node *parent_irq_node;
struct gh_msgq_tx_data *msg;
struct gh_rm *rm;
int ret;
@@ -590,15 +723,40 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
if (ret)
goto err_cache;

+ parent_irq_node = of_irq_find_parent(pdev->dev.of_node);
+ if (!parent_irq_node) {
+ dev_err(&pdev->dev, "Failed to find interrupt parent of resource manager\n");
+ ret = -ENODEV;
+ goto err_msgq;
+ }
+
+ parent_irq_domain = irq_find_host(parent_irq_node);
+ if (!parent_irq_domain) {
+ dev_err(&pdev->dev, "Failed to find interrupt parent domain of resource manager\n");
+ ret = -ENODEV;
+ goto err_msgq;
+ }
+
+ rm->irq_domain = irq_domain_add_hierarchy(parent_irq_domain, 0, 0, pdev->dev.of_node,
+ &gh_rm_irq_domain_ops, NULL);
+ if (!rm->irq_domain) {
+ dev_err(&pdev->dev, "Failed to add irq domain\n");
+ ret = -ENODEV;
+ goto err_msgq;
+ }
+ rm->irq_domain->host_data = rm;
+
rm->miscdev.name = "gunyah";
rm->miscdev.minor = MISC_DYNAMIC_MINOR;
rm->miscdev.fops = &gh_dev_fops;

ret = misc_register(&rm->miscdev);
if (ret)
- goto err_msgq;
+ goto err_irq_domain;

return 0;
+err_irq_domain:
+ irq_domain_remove(rm->irq_domain);
err_msgq:
mbox_free_channel(gh_msgq_chan(&rm->msgq));
gh_msgq_remove(&rm->msgq);
@@ -612,6 +770,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
struct gh_rm *rm = platform_get_drvdata(pdev);

misc_deregister(&rm->miscdev);
+ irq_domain_remove(rm->irq_domain);
mbox_free_channel(gh_msgq_chan(&rm->msgq));
gh_msgq_remove(&rm->msgq);
kmem_cache_destroy(rm->cache);
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index 2e13669c6363..a06d5fa68a65 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -27,6 +27,10 @@ struct gunyah_resource {
enum gunyah_resource_type type;
u64 capid;
int irq;
+
+ /* To help allocator of resource manager */
+ struct list_head list;
+ u32 rm_label;
};

/**
diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
index dc05d5b1e1a3..2fb6efbe2f70 100644
--- a/include/linux/gunyah_rsc_mgr.h
+++ b/include/linux/gunyah_rsc_mgr.h
@@ -147,6 +147,10 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
struct gh_rm_hyp_resources **resources);
int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);

+struct gunyah_resource *gh_rm_alloc_resource(struct gh_rm *rm,
+ struct gh_rm_hyp_resource *hyp_resource);
+void gh_rm_free_resource(struct gunyah_resource *ghrsc);
+
struct gunyah_rm_platform_ops {
int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
--
2.39.1


2023-02-14 21:52:38

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 25/26] virt: gunyah: Add ioeventfd


Allow userspace to attach an ioeventfd to an mmio address within the guest.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 21 +++++
drivers/virt/gunyah/Kconfig | 9 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_ioeventfd.c | 113 +++++++++++++++++++++++
include/uapi/linux/gunyah.h | 24 +++++
5 files changed, 168 insertions(+)
create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c

diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index 55d4fb466644..62d628f810f0 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -170,3 +170,24 @@ the irqfd.label.

GH_IRQFD_LEVEL configures the corresponding doorbell to behave like a level
triggered interrupt.
+
+Type: "ioeventfd"
+^^^^^^^^^^^^^^^^^
+
+::
+
+ struct gh_fn_ioeventfd_arg {
+ __u64 datamatch;
+ __u64 addr; /* legal mmio address */
+ __u32 len; /* 1, 2, 4, or 8 bytes */
+ __s32 fd;
+ #define GH_IOEVENTFD_DATAMATCH (1UL << 0)
+ __u32 flags;
+ };
+
+Attaches an ioeventfd to a legal mmio address within the guest. A guest write
+in the registered address will signal the provided event instead of triggering
+an exit on the GH_VCPU_RUN ioctl.
+
+If datamatch flag is set, the event will be signaled only if the written value
+to the registered address is equal to datamatch in struct gh_fn_ioeventfd_arg.
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 2cde24d429d1..bd8e31184962 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -35,3 +35,12 @@ config GUNYAH_IRQFD
on Gunyah virtual machine.

Say Y/M here if unsure and you want to support Gunyah VMMs.
+
+config GUNYAH_IOEVENTFD
+ tristate "Gunyah ioeventfd interface"
+ depends on GUNYAH
+ help
+ Enable kernel support for creating ioeventfds which can alert userspace
+ when a Gunyah virtual machine accesses a memory address.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 6cf756bfa3c2..7347b1470491 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o

obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
+obj-$(CONFIG_GUNYAH_IOEVENTFD) += gunyah_ioeventfd.o
diff --git a/drivers/virt/gunyah/gunyah_ioeventfd.c b/drivers/virt/gunyah/gunyah_ioeventfd.c
new file mode 100644
index 000000000000..b1d1e2d80f60
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_ioeventfd.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_vm_mgr.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gunyah_ioeventfd {
+ struct gh_vm_function_instance *f;
+ struct gh_vm_io_handler io_handler;
+
+ struct eventfd_ctx *ctx;
+};
+
+static int gh_write_ioeventfd(struct gh_vm_io_handler *io_dev, u64 addr, u32 len, u64 data)
+{
+ struct gunyah_ioeventfd *iofd = container_of(io_dev, struct gunyah_ioeventfd, io_handler);
+
+ eventfd_signal(iofd->ctx, 1);
+ return 0;
+}
+
+static struct gh_vm_io_handler_ops io_ops = {
+ .write = gh_write_ioeventfd,
+};
+
+static long gunyah_ioeventfd_bind(struct gh_vm_function_instance *f)
+{
+ const struct gh_fn_ioeventfd_arg *args = f->argp;
+ struct eventfd_ctx *ctx = NULL;
+ struct gunyah_ioeventfd *iofd;
+ int ret;
+
+ if (f->arg_size != sizeof(*args))
+ return -EINVAL;
+
+ /* must be natural-word sized, or 0 to ignore length */
+ switch (args->len) {
+ case 0:
+ case 1:
+ case 2:
+ case 4:
+ case 8:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* check for range overflow */
+ if (args->addr + args->len < args->addr)
+ return -EINVAL;
+
+ /* ioeventfd with no length can't be combined with DATAMATCH */
+ if (!args->len && (args->flags & GH_IOEVENTFD_DATAMATCH))
+ return -EINVAL;
+
+ ctx = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ iofd = kzalloc(sizeof(*iofd), GFP_KERNEL);
+ if (!iofd) {
+ ret = -ENOMEM;
+ goto err_eventfd;
+ }
+
+ f->data = iofd;
+ iofd->f = f;
+
+ iofd->ctx = ctx;
+
+ if (args->flags & GH_IOEVENTFD_DATAMATCH) {
+ iofd->io_handler.datamatch = true;
+ iofd->io_handler.len = args->len;
+ iofd->io_handler.data = args->datamatch;
+ }
+ iofd->io_handler.addr = args->addr;
+ iofd->io_handler.ops = &io_ops;
+
+ ret = gh_vm_add_io_handler(f->ghvm, &iofd->io_handler);
+ if (ret)
+ goto err_io_dev_add;
+
+ return 0;
+
+err_io_dev_add:
+ kfree(iofd);
+err_eventfd:
+ eventfd_ctx_put(ctx);
+ return ret;
+}
+
+static void gunyah_ioevent_unbind(struct gh_vm_function_instance *f)
+{
+ struct gunyah_ioeventfd *iofd = f->data;
+
+ eventfd_ctx_put(iofd->ctx);
+ gh_vm_remove_io_handler(iofd->f->ghvm, &iofd->io_handler);
+ kfree(iofd);
+}
+
+DECLARE_GUNYAH_VM_FUNCTION_INIT(ioeventfd, GH_FN_IOEVENTFD,
+ gunyah_ioeventfd_bind, gunyah_ioevent_unbind);
+MODULE_DESCRIPTION("Gunyah ioeventfds");
+MODULE_LICENSE("GPL");
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index ce2ccb71993f..63b6d275a64f 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -78,6 +78,12 @@ struct gh_vm_dtb_config {
*/
#define GH_FN_IRQFD 2

+/**
+ * GH_FN_IOEVENTFD - register ioeventfd to trigger when VM faults on parameter
+ * gh_fn_desc usage: fill arg with gh_fn_ioeventfd_arg
+ */
+#define GH_FN_IOEVENTFD 3
+
#define GH_FN_MAX_ARG_SIZE 256

/**
@@ -94,6 +100,24 @@ struct gh_fn_irqfd_arg {
__u32 reserved;
};

+/**
+ * struct gh_fn_ioeventfd_arg - Arguments to create an ioeventfd function
+ * @datamatch: data used when GH_IOEVENTFD_DATAMATCH is set
+ * @addr: Address in guest memory
+ * @len: Length of access
+ * @fd: When ioeventfd is matched, this eventfd is written
+ * @flags: See Documentation/virt/gunyah/vm-manager.rst for flag usage.
+ */
+struct gh_fn_ioeventfd_arg {
+ __u64 datamatch;
+ __u64 addr; /* legal mmio address */
+ __u32 len; /* 1, 2, 4, or 8 bytes; or 0 to ignore length */
+ __s32 fd;
+#define GH_IOEVENTFD_DATAMATCH (1UL << 0)
+ __u32 flags;
+ __u32 reserved;
+};
+
/**
* struct gh_fn_desc - Arguments to create a VM function
* @type: Type of the function. See GH_FN_* macro for supported types
--
2.39.1


2023-02-14 21:55:14

by Elliot Berman

[permalink] [raw]
Subject: [PATCH v10 24/26] virt: gunyah: Add irqfd interface


Enable support for creating irqfds which can raise an interrupt on a
Gunyah virtual machine. irqfds are exposed to userspace as a Gunyah VM
function with the name "irqfd". If the VM devicetree is not configured
to create a doorbell with the corresponding label, userspace will still
be able to assert the eventfd but no interrupt will be raised on the
guest.

Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
Documentation/virt/gunyah/vm-manager.rst | 22 ++++
drivers/virt/gunyah/Kconfig | 9 ++
drivers/virt/gunyah/Makefile | 1 +
drivers/virt/gunyah/gunyah_irqfd.c | 160 +++++++++++++++++++++++
include/linux/gunyah.h | 5 +
include/uapi/linux/gunyah.h | 20 +++
6 files changed, 217 insertions(+)
create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c

diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
index 662b8d6fec2a..55d4fb466644 100644
--- a/Documentation/virt/gunyah/vm-manager.rst
+++ b/Documentation/virt/gunyah/vm-manager.rst
@@ -148,3 +148,25 @@ The vcpu type will register with the VM Manager to expect to control
vCPU number `vcpu_id`. It returns a file descriptor allowing interaction with
the vCPU. See the Gunyah vCPU API description sections for interacting with
the Gunyah vCPU file descriptors.
+
+Type: "irqfd"
+^^^^^^^^^^^^^
+
+::
+
+ struct gh_fn_irqfd_arg {
+ __u32 fd;
+ __u32 label;
+ #define GH_IRQFD_LEVEL (1UL << 0)
+ #define GH_IRQFD_DEASSIGN (1UL << 1)
+ __u32 flags;
+ };
+
+Allows setting an eventfd to directly trigger a guest interrupt.
+irqfd.fd specifies the file descriptor to use as the eventfd.
+irqfd.label corresponds to the doorbell label used in the guest VM's devicetree.
+The irqfd is removed using the GH_IRQFD_DEASSIGN flag and specifying at least
+the irqfd.label.
+
+GH_IRQFD_LEVEL configures the corresponding doorbell to behave like a level
+triggered interrupt.
diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
index 4c1c6110b50e..2cde24d429d1 100644
--- a/drivers/virt/gunyah/Kconfig
+++ b/drivers/virt/gunyah/Kconfig
@@ -26,3 +26,12 @@ config GUNYAH_VCPU
VMMs can also handle stage 2 faults of the vCPUs.

Say Y/M here if unsure and you want to support Gunyah VMMs.
+
+config GUNYAH_IRQFD
+ tristate "Gunyah irqfd interface"
+ depends on GUNYAH
+ help
+ Enable kernel support for creating irqfds which can raise an interrupt
+ on Gunyah virtual machine.
+
+ Say Y/M here if unsure and you want to support Gunyah VMMs.
diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 2d1b604a7b03..6cf756bfa3c2 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -7,3 +7,4 @@ gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o

obj-$(CONFIG_GUNYAH_VCPU) += gunyah_vcpu.o
+obj-$(CONFIG_GUNYAH_IRQFD) += gunyah_irqfd.o
diff --git a/drivers/virt/gunyah/gunyah_irqfd.c b/drivers/virt/gunyah/gunyah_irqfd.c
new file mode 100644
index 000000000000..6a0c97dff51c
--- /dev/null
+++ b/drivers/virt/gunyah/gunyah_irqfd.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/gunyah.h>
+#include <linux/gunyah_vm_mgr.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/printk.h>
+
+#include <uapi/linux/gunyah.h>
+
+struct gunyah_irqfd {
+ struct gunyah_resource *ghrsc;
+ struct gh_vm_resource_ticket ticket;
+ struct gh_vm_function_instance *f;
+
+ bool level;
+
+ struct eventfd_ctx *ctx;
+ wait_queue_entry_t wait;
+ poll_table pt;
+};
+
+static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync, void *key)
+{
+ struct gunyah_irqfd *irqfd = container_of(wait, struct gunyah_irqfd, wait);
+ __poll_t flags = key_to_poll(key);
+ u64 enable_mask = GH_BELL_NONBLOCK;
+ u64 old_flags;
+ int ret = 0;
+
+ if (flags & EPOLLIN) {
+ if (irqfd->ghrsc) {
+ ret = gh_hypercall_bell_send(irqfd->ghrsc->capid, enable_mask, &old_flags);
+ if (ret)
+ pr_err_ratelimited("Failed to inject interrupt %d: %d\n",
+ irqfd->ticket.label, ret);
+ } else
+ pr_err_ratelimited("Premature injection of interrupt\n");
+ }
+
+ return 0;
+}
+
+static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh, poll_table *pt)
+{
+ struct gunyah_irqfd *irq_ctx = container_of(pt, struct gunyah_irqfd, pt);
+
+ add_wait_queue(wqh, &irq_ctx->wait);
+}
+
+static int gh_irqfd_populate(struct gh_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+ struct gunyah_irqfd *irqfd = container_of(ticket, struct gunyah_irqfd, ticket);
+ u64 enable_mask = GH_BELL_NONBLOCK;
+ u64 ack_mask = ~0;
+ int ret = 0;
+
+ if (irqfd->ghrsc) {
+ pr_warn("irqfd%d already got a Gunyah resource. Check if multiple resources with same label were configured.\n",
+ irqfd->ticket.label);
+ return -1;
+ }
+
+ irqfd->ghrsc = ghrsc;
+ if (irqfd->level) {
+ ret = gh_hypercall_bell_set_mask(irqfd->ghrsc->capid, enable_mask, ack_mask);
+ if (ret)
+ pr_warn("irq %d couldn't be set as level triggered. Might cause IRQ storm if asserted\n",
+ irqfd->ticket.label);
+ }
+
+ return 0;
+}
+
+static void gh_irqfd_unpopulate(struct gh_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+ struct gunyah_irqfd *irqfd = container_of(ticket, struct gunyah_irqfd, ticket);
+ u64 cnt;
+
+ eventfd_ctx_remove_wait_queue(irqfd->ctx, &irqfd->wait, &cnt);
+}
+
+static long gh_irqfd_bind(struct gh_vm_function_instance *f)
+{
+ struct gh_fn_irqfd_arg *args = f->argp;
+ struct gunyah_irqfd *irqfd;
+ __poll_t events;
+ struct fd fd;
+ long r;
+
+ if (f->arg_size != sizeof(*args))
+ return -EINVAL;
+
+ irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+ if (!irqfd)
+ return -ENOMEM;
+
+ irqfd->f = f;
+ f->data = irqfd;
+
+ fd = fdget(args->fd);
+ if (!fd.file) {
+ kfree(irqfd);
+ return -EBADF;
+ }
+
+ irqfd->ctx = eventfd_ctx_fileget(fd.file);
+ if (IS_ERR(irqfd->ctx)) {
+ r = PTR_ERR(irqfd->ctx);
+ goto err_fdput;
+ }
+
+ if (args->flags & GH_IRQFD_LEVEL)
+ irqfd->level = true;
+
+ init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
+ init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
+
+ irqfd->ticket.resource_type = GUNYAH_RESOURCE_TYPE_BELL_TX;
+ irqfd->ticket.label = args->label;
+ irqfd->ticket.owner = THIS_MODULE;
+ irqfd->ticket.populate = gh_irqfd_populate;
+ irqfd->ticket.unpopulate = gh_irqfd_unpopulate;
+
+ r = gh_vm_add_resource_ticket(f->ghvm, &irqfd->ticket);
+ if (r)
+ goto err_ctx;
+
+ events = vfs_poll(fd.file, &irqfd->pt);
+ if (events & EPOLLIN)
+ pr_warn("Premature injection of interrupt\n");
+ fdput(fd);
+
+ return 0;
+err_ctx:
+ eventfd_ctx_put(irqfd->ctx);
+err_fdput:
+ fdput(fd);
+ kfree(irqfd);
+ return r;
+}
+
+static void gh_irqfd_unbind(struct gh_vm_function_instance *f)
+{
+ struct gunyah_irqfd *irqfd = f->data;
+
+ gh_vm_remove_resource_ticket(irqfd->f->ghvm, &irqfd->ticket);
+ eventfd_ctx_put(irqfd->ctx);
+ kfree(irqfd);
+}
+
+DECLARE_GUNYAH_VM_FUNCTION_INIT(irqfd, GH_FN_IRQFD, gh_irqfd_bind, gh_irqfd_unbind);
+MODULE_DESCRIPTION("Gunyah irqfds");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
index bfceaa7000e5..e515e4afd130 100644
--- a/include/linux/gunyah.h
+++ b/include/linux/gunyah.h
@@ -33,6 +33,11 @@ struct gunyah_resource {
u32 rm_label;
};

+/**
+ * Gunyah Doorbells
+ */
+#define GH_BELL_NONBLOCK BIT(32)
+
/**
* Gunyah Message Queues
*/
diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
index 1173ff0329c2..ce2ccb71993f 100644
--- a/include/uapi/linux/gunyah.h
+++ b/include/uapi/linux/gunyah.h
@@ -72,8 +72,28 @@ struct gh_vm_dtb_config {
*/
#define GH_FN_VCPU 1

+/**
+ * GH_FN_IRQFD - register eventfd to assert a Gunyah doorbell
+ * gh_fn_desc usage: fill arg with gh_fn_irqfd_arg
+ */
+#define GH_FN_IRQFD 2
+
#define GH_FN_MAX_ARG_SIZE 256

+/**
+ * struct gh_fn_irqfd_arg - Arguments to create an irqfd function
+ * @fd: an eventfd which when written to will raise a doorbell
+ * @label: Label of the doorbell created on the guest VM
+ * @flags: See Documentation/virt/gunyah/vm-manager.rst for flag usage.
+ */
+struct gh_fn_irqfd_arg {
+ __u32 fd;
+ __u32 label;
+#define GH_IRQFD_LEVEL (1UL << 0)
+ __u32 flags;
+ __u32 reserved;
+};
+
/**
* struct gh_fn_desc - Arguments to create a VM function
* @type: Type of the function. See GH_FN_* macro for supported types
--
2.39.1


2023-02-16 00:23:48

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v10 16/26] firmware: qcom_scm: Register Gunyah platform ops

Hi Elliot,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb]

url: https://github.com/intel-lab-lkp/linux/commits/Elliot-Berman/docs-gunyah-Introduce-Gunyah-Hypervisor/20230215-055721
base: 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb
patch link: https://lore.kernel.org/r/20230214212457.3319814-1-quic_eberman%40quicinc.com
patch subject: [PATCH v10 16/26] firmware: qcom_scm: Register Gunyah platform ops
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20230216/[email protected]/config)
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/33f0c4b130c7b249a1524da8076dd12333aa7cde
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Elliot-Berman/docs-gunyah-Introduce-Gunyah-Hypervisor/20230215-055721
git checkout 33f0c4b130c7b249a1524da8076dd12333aa7cde
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

drivers/firmware/qcom_scm.c: In function 'qcom_scm_gh_rm_pre_mem_share':
>> drivers/firmware/qcom_scm.c:1335:49: error: passing argument 3 of 'qcom_scm_assign_mem' from incompatible pointer type [-Werror=incompatible-pointer-types]
1335 | &src_cpy, new_perms, mem_parcel->n_acl_entries);
| ^~~~~~~~
| |
| u64 * {aka long long unsigned int *}
drivers/firmware/qcom_scm.c:912:39: note: expected 'unsigned int *' but argument is of type 'u64 *' {aka 'long long unsigned int *'}
912 | unsigned int *srcvm,
| ~~~~~~~~~~~~~~^~~~~
In file included from include/asm-generic/bug.h:7,
from arch/m68k/include/asm/bug.h:32,
from include/linux/bug.h:5,
from include/linux/thread_info.h:13,
from include/asm-generic/preempt.h:5,
from ./arch/m68k/include/generated/asm/preempt.h:1,
from include/linux/preempt.h:78,
from arch/m68k/include/asm/irqflags.h:6,
from include/linux/irqflags.h:16,
from arch/m68k/include/asm/atomic.h:6,
from include/linux/atomic.h:7,
from include/linux/rcupdate.h:25,
from include/linux/rculist.h:11,
from include/linux/pid.h:5,
from include/linux/sched.h:14,
from include/linux/ratelimit.h:6,
from include/linux/dev_printk.h:16,
from include/linux/device.h:15,
from include/linux/platform_device.h:13,
from drivers/firmware/qcom_scm.c:5:
drivers/firmware/qcom_scm.c:1353:49: error: passing argument 3 of 'qcom_scm_assign_mem' from incompatible pointer type [-Werror=incompatible-pointer-types]
1353 | &src_cpy, new_perms, 1));
| ^~~~~~~~
| |
| u64 * {aka long long unsigned int *}
include/linux/once_lite.h:28:41: note: in definition of macro 'DO_ONCE_LITE_IF'
28 | bool __ret_do_once = !!(condition); \
| ^~~~~~~~~
drivers/firmware/qcom_scm.c:1350:33: note: in expansion of macro 'WARN_ON_ONCE'
1350 | WARN_ON_ONCE(qcom_scm_assign_mem(
| ^~~~~~~~~~~~
drivers/firmware/qcom_scm.c:912:39: note: expected 'unsigned int *' but argument is of type 'u64 *' {aka 'long long unsigned int *'}
912 | unsigned int *srcvm,
| ~~~~~~~~~~~~~~^~~~~
drivers/firmware/qcom_scm.c: In function 'qcom_scm_gh_rm_post_mem_reclaim':
drivers/firmware/qcom_scm.c:1385:49: error: passing argument 3 of 'qcom_scm_assign_mem' from incompatible pointer type [-Werror=incompatible-pointer-types]
1385 | &src_cpy, &new_perms, 1);
| ^~~~~~~~
| |
| u64 * {aka long long unsigned int *}
drivers/firmware/qcom_scm.c:912:39: note: expected 'unsigned int *' but argument is of type 'u64 *' {aka 'long long unsigned int *'}
912 | unsigned int *srcvm,
| ~~~~~~~~~~~~~~^~~~~
cc1: some warnings being treated as errors


vim +/qcom_scm_assign_mem +1335 drivers/firmware/qcom_scm.c

1303
1304 static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
1305 {
1306 struct qcom_scm_vmperm *new_perms;
1307 u64 src, src_cpy;
1308 int ret = 0, i, n;
1309 u16 vmid;
1310
1311 new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
1312 if (!new_perms)
1313 return -ENOMEM;
1314
1315 for (n = 0; n < mem_parcel->n_acl_entries; n++) {
1316 vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
1317 if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
1318 new_perms[n].vmid = vmid;
1319 else
1320 new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
1321 if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
1322 new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
1323 if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
1324 new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
1325 if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
1326 new_perms[n].perm |= QCOM_SCM_PERM_READ;
1327 }
1328
1329 src = (1ull << QCOM_SCM_VMID_HLOS);
1330
1331 for (i = 0; i < mem_parcel->n_mem_entries; i++) {
1332 src_cpy = src;
1333 ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
1334 le64_to_cpu(mem_parcel->mem_entries[i].size),
> 1335 &src_cpy, new_perms, mem_parcel->n_acl_entries);
1336 if (ret) {
1337 src = 0;
1338 for (n = 0; n < mem_parcel->n_acl_entries; n++) {
1339 vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
1340 if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
1341 src |= (1ull << vmid);
1342 else
1343 src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
1344 }
1345
1346 new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
1347
1348 for (i--; i >= 0; i--) {
1349 src_cpy = src;
1350 WARN_ON_ONCE(qcom_scm_assign_mem(
1351 le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
1352 le64_to_cpu(mem_parcel->mem_entries[i].size),
1353 &src_cpy, new_perms, 1));
1354 }
1355 break;
1356 }
1357 }
1358
1359 kfree(new_perms);
1360 return ret;
1361 }
1362

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-02-16 04:07:33

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox

Hi Elliot,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb]

url: https://github.com/intel-lab-lkp/linux/commits/Elliot-Berman/docs-gunyah-Introduce-Gunyah-Hypervisor/20230215-055721
base: 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb
patch link: https://lore.kernel.org/r/20230214212316.3309053-1-quic_eberman%40quicinc.com
patch subject: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox
config: arm64-allyesconfig (https://download.01.org/0day-ci/archive/20230216/[email protected]/config)
compiler: aarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/b427188cd418632da7b26f283f5d2c668038186f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Elliot-Berman/docs-gunyah-Introduce-Gunyah-Hypervisor/20230215-055721
git checkout b427188cd418632da7b26f283f5d2c668038186f
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

drivers/mailbox/gunyah-msgq.c: In function 'gh_msgq_init':
>> drivers/mailbox/gunyah-msgq.c:180:15: error: implicit declaration of function 'mbox_bind_client' [-Werror=implicit-function-declaration]
180 | ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
| ^~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors


vim +/mbox_bind_client +180 drivers/mailbox/gunyah-msgq.c

110
111 /**
112 * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
113 * @parent: optional, device parent used for the mailbox controller
114 * @msgq: Pointer to the gh_msgq to initialize
115 * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
116 * @tx_ghrsc: optional, the transmission side of the message queue
117 * @rx_ghrsc: optional, the receiving side of the message queue
118 *
119 * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with
120 * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
121 * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
122 * is set, the mbox_client should register an .rx_callback() and the message queue driver will
123 * push all available messages upon receiving the RX ready interrupt. The messages should be
124 * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
125 * after the callback.
126 *
127 * Returns - 0 on success, negative otherwise
128 */
129 int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
130 struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
131 {
132 int ret;
133
134 /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
135 if ((!tx_ghrsc && !rx_ghrsc) ||
136 (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
137 (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
138 return -EINVAL;
139
140 if (gh_api_version() != GUNYAH_API_V1) {
141 pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
142 gh_api_version(), GUNYAH_API_V1);
143 return -EOPNOTSUPP;
144 }
145
146 if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
147 return -EOPNOTSUPP;
148
149 msgq->tx_ghrsc = tx_ghrsc;
150 msgq->rx_ghrsc = rx_ghrsc;
151
152 msgq->mbox.dev = parent;
153 msgq->mbox.ops = &gh_msgq_ops;
154 msgq->mbox.num_chans = 1;
155 msgq->mbox.txdone_irq = true;
156 msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);
157 if (!msgq->mbox.chans)
158 return -ENOMEM;
159
160 if (msgq->tx_ghrsc) {
161 ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
162 msgq);
163 if (ret)
164 goto err_chans;
165 }
166
167 if (msgq->rx_ghrsc) {
168 ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
169 IRQF_ONESHOT, "gh_msgq_rx", msgq);
170 if (ret)
171 goto err_tx_irq;
172 }
173
174 tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
175
176 ret = mbox_controller_register(&msgq->mbox);
177 if (ret)
178 goto err_rx_irq;
179
> 180 ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
181 if (ret)
182 goto err_mbox;
183
184 return 0;
185 err_mbox:
186 mbox_controller_unregister(&msgq->mbox);
187 err_rx_irq:
188 if (msgq->rx_ghrsc)
189 free_irq(msgq->rx_ghrsc->irq, msgq);
190 err_tx_irq:
191 if (msgq->tx_ghrsc)
192 free_irq(msgq->tx_ghrsc->irq, msgq);
193 err_chans:
194 kfree(msgq->mbox.chans);
195 return ret;
196 }
197 EXPORT_SYMBOL_GPL(gh_msgq_init);
198

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-02-16 06:35:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot

On Tue, Feb 14, 2023 at 01:24:26PM -0800, Elliot Berman wrote:
> + case GH_VM_SET_DTB_CONFIG: {
> + struct gh_vm_dtb_config dtb_config;
> +
> + if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
> + return -EFAULT;
> +
> + dtb_config.size = PAGE_ALIGN(dtb_config.size);
> + ghvm->dtb_config = dtb_config;

Do you really mean to copy this tiny structure twice (once from
userspace and the second time off of the stack)? If so, why?

And where are the values of the structure checked for validity? Can any
64bit value work for size and "gpa"?

thanks,

greg k-h

2023-02-16 06:38:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

On Tue, Feb 14, 2023 at 01:24:16PM -0800, Elliot Berman wrote:
>
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.

It also frees memory, see below. Why not document that?

> + case GH_VM_SET_USER_MEM_REGION: {
> + struct gh_userspace_memory_region region;
> +
> + if (copy_from_user(&region, argp, sizeof(region)))
> + return -EFAULT;
> +
> + /* All other flag bits are reserved for future use */
> + if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC |
> + GH_MEM_LENT))
> + return -EINVAL;

Nice, thanks for validating that.


> +
> +

Nit, 2 blank lines are not needed :(


> + if (region.memory_size)
> + r = gh_vm_mem_alloc(ghvm, &region);
> + else
> + r = gh_vm_mem_free(ghvm, region.label);

So if you set the size to 0 it is freed? Wouldn't a separate ioctl make
more sense? Where is this logic documented to userspace?

thanks,

greg k-h

2023-02-16 06:40:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v10 09/26] gunyah: rsc_mgr: Add VM lifecycle RPC

On Tue, Feb 14, 2023 at 01:23:42PM -0800, Elliot Berman wrote:
>
> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr.h | 45 ++++++
> drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 73 ++++++++++
> 4 files changed, 345 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index cc864ff5abbb..de29769f2f3f 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index d4e799a7526f..7406237bc66d 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,4 +74,49 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> void **resp_buf, size_t *resp_buff_size);
>
> +/* Message IDs: VM Management */
> +#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
> +#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
> +#define GH_RM_RPC_VM_START 0x56000004
> +#define GH_RM_RPC_VM_STOP 0x56000005
> +#define GH_RM_RPC_VM_RESET 0x56000006
> +#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
> +#define GH_RM_RPC_VM_INIT 0x5600000B
> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
> +#define GH_RM_RPC_VM_GET_VMID 0x56000024
> +
> +struct gh_rm_vm_common_vmid_req {
> + __le16 vmid;
> + __le16 reserved0;

reserved for what? What is a valid value for this field? Should it be
checked for 0?

Same with other "reserved0" fields in this file.


> +} __packed;
> +
> +/* Call: VM_ALLOC */
> +struct gh_rm_vm_alloc_vmid_resp {
> + __le16 vmid;
> + __le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_STOP */
> +struct gh_rm_vm_stop_req {
> + __le16 vmid;
> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
> + u8 flags;
> + u8 reserved;

Why just "reserved" and not "reserved0"? Naming is hard :(

thanks,

greg k-h

2023-02-16 06:44:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core

On Tue, Feb 14, 2023 at 01:23:25PM -0800, Elliot Berman wrote:
> +struct gh_rm {
> + struct device *dev;

What device does this point to?

> + struct gunyah_resource tx_ghrsc, rx_ghrsc;
> + struct gh_msgq msgq;
> + struct mbox_client msgq_client;
> + struct gh_rm_connection *active_rx_connection;
> + int last_tx_ret;
> +
> + struct idr call_idr;
> + struct mutex call_idr_lock;
> +
> + struct kmem_cache *cache;
> + struct mutex send_lock;
> + struct blocking_notifier_head nh;
> +};

This obviously is the "device" that your system works on, so what are
the lifetime rules of it? Why isn't is just a real 'struct device' in
the system instead of a random memory blob with a pointer to a device?

What controls the lifetime of this structure and where is the reference
counting logic for it?

And why no documentation for this core structure?

thanks,

greg k-h

2023-02-16 11:10:44

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v10 16/26] firmware: qcom_scm: Register Gunyah platform ops

Hi Elliot,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb]

url: https://github.com/intel-lab-lkp/linux/commits/Elliot-Berman/docs-gunyah-Introduce-Gunyah-Hypervisor/20230215-055721
base: 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb
patch link: https://lore.kernel.org/r/20230214212457.3319814-1-quic_eberman%40quicinc.com
patch subject: [PATCH v10 16/26] firmware: qcom_scm: Register Gunyah platform ops
config: hexagon-randconfig-r041-20230212 (https://download.01.org/0day-ci/archive/20230216/[email protected]/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project db89896bbbd2251fff457699635acbbedeead27f)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/33f0c4b130c7b249a1524da8076dd12333aa7cde
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Elliot-Berman/docs-gunyah-Introduce-Gunyah-Hypervisor/20230215-055721
git checkout 33f0c4b130c7b249a1524da8076dd12333aa7cde
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash drivers/firmware/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

In file included from drivers/firmware/qcom_scm.c:7:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
^
In file included from drivers/firmware/qcom_scm.c:7:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
^
In file included from drivers/firmware/qcom_scm.c:7:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
>> drivers/firmware/qcom_scm.c:1335:7: error: incompatible pointer types passing 'u64 *' (aka 'unsigned long long *') to parameter of type 'unsigned int *' [-Werror,-Wincompatible-pointer-types]
&src_cpy, new_perms, mem_parcel->n_acl_entries);
^~~~~~~~
drivers/firmware/qcom_scm.c:912:18: note: passing argument to parameter 'srcvm' here
unsigned int *srcvm,
^
drivers/firmware/qcom_scm.c:1353:7: error: incompatible pointer types passing 'u64 *' (aka 'unsigned long long *') to parameter of type 'unsigned int *' [-Werror,-Wincompatible-pointer-types]
&src_cpy, new_perms, 1));
^~~~~~~~
include/asm-generic/bug.h:147:18: note: expanded from macro 'WARN_ON_ONCE'
DO_ONCE_LITE_IF(condition, WARN_ON, 1)
^~~~~~~~~
include/linux/once_lite.h:28:27: note: expanded from macro 'DO_ONCE_LITE_IF'
bool __ret_do_once = !!(condition); \
^~~~~~~~~
drivers/firmware/qcom_scm.c:912:18: note: passing argument to parameter 'srcvm' here
unsigned int *srcvm,
^
drivers/firmware/qcom_scm.c:1385:7: error: incompatible pointer types passing 'u64 *' (aka 'unsigned long long *') to parameter of type 'unsigned int *' [-Werror,-Wincompatible-pointer-types]
&src_cpy, &new_perms, 1);
^~~~~~~~
drivers/firmware/qcom_scm.c:912:18: note: passing argument to parameter 'srcvm' here
unsigned int *srcvm,
^
6 warnings and 3 errors generated.


vim +1335 drivers/firmware/qcom_scm.c

1303
1304 static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
1305 {
1306 struct qcom_scm_vmperm *new_perms;
1307 u64 src, src_cpy;
1308 int ret = 0, i, n;
1309 u16 vmid;
1310
1311 new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
1312 if (!new_perms)
1313 return -ENOMEM;
1314
1315 for (n = 0; n < mem_parcel->n_acl_entries; n++) {
1316 vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
1317 if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
1318 new_perms[n].vmid = vmid;
1319 else
1320 new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
1321 if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
1322 new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
1323 if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
1324 new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
1325 if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
1326 new_perms[n].perm |= QCOM_SCM_PERM_READ;
1327 }
1328
1329 src = (1ull << QCOM_SCM_VMID_HLOS);
1330
1331 for (i = 0; i < mem_parcel->n_mem_entries; i++) {
1332 src_cpy = src;
1333 ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
1334 le64_to_cpu(mem_parcel->mem_entries[i].size),
> 1335 &src_cpy, new_perms, mem_parcel->n_acl_entries);
1336 if (ret) {
1337 src = 0;
1338 for (n = 0; n < mem_parcel->n_acl_entries; n++) {
1339 vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
1340 if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
1341 src |= (1ull << vmid);
1342 else
1343 src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
1344 }
1345
1346 new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
1347
1348 for (i--; i >= 0; i--) {
1349 src_cpy = src;
1350 WARN_ON_ONCE(qcom_scm_assign_mem(
1351 le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
1352 le64_to_cpu(mem_parcel->mem_entries[i].size),
1353 &src_cpy, new_perms, 1));
1354 }
1355 break;
1356 }
1357 }
1358
1359 kfree(new_perms);
1360 return ret;
1361 }
1362

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-02-16 17:18:37

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 09/26] gunyah: rsc_mgr: Add VM lifecycle RPC



On 2/15/2023 10:39 PM, Greg Kroah-Hartman wrote:
> On Tue, Feb 14, 2023 at 01:23:42PM -0800, Elliot Berman wrote:
>>
>> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>> drivers/virt/gunyah/Makefile | 2 +-
>> drivers/virt/gunyah/rsc_mgr.h | 45 ++++++
>> drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
>> include/linux/gunyah_rsc_mgr.h | 73 ++++++++++
>> 4 files changed, 345 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index cc864ff5abbb..de29769f2f3f 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>>
>> obj-$(CONFIG_GUNYAH) += gunyah.o
>>
>> -gunyah_rsc_mgr-y += rsc_mgr.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
>> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
>> index d4e799a7526f..7406237bc66d 100644
>> --- a/drivers/virt/gunyah/rsc_mgr.h
>> +++ b/drivers/virt/gunyah/rsc_mgr.h
>> @@ -74,4 +74,49 @@ struct gh_rm;
>> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
>> void **resp_buf, size_t *resp_buff_size);
>>
>> +/* Message IDs: VM Management */
>> +#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
>> +#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
>> +#define GH_RM_RPC_VM_START 0x56000004
>> +#define GH_RM_RPC_VM_STOP 0x56000005
>> +#define GH_RM_RPC_VM_RESET 0x56000006
>> +#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
>> +#define GH_RM_RPC_VM_INIT 0x5600000B
>> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
>> +#define GH_RM_RPC_VM_GET_VMID 0x56000024
>> +
>> +struct gh_rm_vm_common_vmid_req {
>> + __le16 vmid;
>> + __le16 reserved0;
>
> reserved for what? What is a valid value for this field? Should it be
> checked for 0?

This struct is transmitted "over the wire" and RM makes all of its
structures 4-byte aligned. The reserved fields are padding for this
alignment and will be zero but don't need to be checked. Linux
initializes the reserved fields to zero.

>
> Same with other "reserved0" fields in this file.
>
>
>> +} __packed;
>> +
>> +/* Call: VM_ALLOC */
>> +struct gh_rm_vm_alloc_vmid_resp {
>> + __le16 vmid;
>> + __le16 reserved0;
>> +} __packed;
>> +
>> +/* Call: VM_STOP */
>> +struct gh_rm_vm_stop_req {
>> + __le16 vmid;
>> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
>> + u8 flags;
>> + u8 reserved;
>
> Why just "reserved" and not "reserved0"? Naming is hard :(
>

Some fields have multiple reserved fields. I'll clean up so "reserved0"
only appears when there are multiple padding fields.

Thanks,
Elliot

2023-02-16 17:20:26

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot



On 2/15/2023 10:35 PM, Greg Kroah-Hartman wrote:
> On Tue, Feb 14, 2023 at 01:24:26PM -0800, Elliot Berman wrote:
>> + case GH_VM_SET_DTB_CONFIG: {
>> + struct gh_vm_dtb_config dtb_config;
>> +
>> + if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
>> + return -EFAULT;
>> +
>> + dtb_config.size = PAGE_ALIGN(dtb_config.size);
>> + ghvm->dtb_config = dtb_config;
>
> Do you really mean to copy this tiny structure twice (once from
> userspace and the second time off of the stack)? If so, why?

Ah, yes this can be optimized to copy directly.
>
> And where are the values of the structure checked for validity? Can any
> 64bit value work for size and "gpa"?
>

The values get checked when starting the VM

static int gh_vm_start(struct gh_vm *ghvm)
...
mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa,
ghvm->dtb_config.size);
if (!mapping) {
pr_warn("Failed to find the memory_handle for DTB\n");
ret = -EINVAL;
goto err;
}

If user passes an address that they've not set up, then
gh_vm_mem_find_mapping returns NULL and GH_VM_START ioctl fails.

I've not done the check from the GH_VM_SET_DTB_CONFIG ioctl itself
because I didn't want to require userspace to share the memory first.
We'd need to check again anyway since user could SET_USER_MEMORY,
SET_DTB_CONFIG, SET_USER_MEMORY (remove), VM_START.

Thanks,
Elliot


2023-02-16 17:25:04

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions



On 2/15/2023 10:38 PM, Greg Kroah-Hartman wrote:
> On Tue, Feb 14, 2023 at 01:24:16PM -0800, Elliot Berman wrote:
>>
>> When launching a virtual machine, Gunyah userspace allocates memory for
>> the guest and informs Gunyah about these memory regions through
>> SET_USER_MEMORY_REGION ioctl.
>
> It also frees memory, see below. Why not document that?
>

I can mention in commit text, too.

>> + case GH_VM_SET_USER_MEM_REGION: {
>> + struct gh_userspace_memory_region region;
>> +
>> + if (copy_from_user(&region, argp, sizeof(region)))
>> + return -EFAULT;
>> +
>> + /* All other flag bits are reserved for future use */
>> + if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC |
>> + GH_MEM_LENT))
>> + return -EINVAL;
>
> Nice, thanks for validating that.
>
>
>> +
>> +
>
> Nit, 2 blank lines are not needed :(
>
>
>> + if (region.memory_size)
>> + r = gh_vm_mem_alloc(ghvm, &region);
>> + else
>> + r = gh_vm_mem_free(ghvm, region.label);
>
> So if you set the size to 0 it is freed? Wouldn't a separate ioctl make
> more sense? Where is this logic documented to userspace? >

We're following KVM convention here. The logic is documented in patch 17/26.

Thanks,
Elliot

2023-02-16 17:41:20

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 2/15/2023 10:43 PM, Greg Kroah-Hartman wrote:
> On Tue, Feb 14, 2023 at 01:23:25PM -0800, Elliot Berman wrote:
>> +struct gh_rm {
>> + struct device *dev;
>
> What device does this point to?
>

The platform device.

>> + struct gunyah_resource tx_ghrsc, rx_ghrsc;
>> + struct gh_msgq msgq;
>> + struct mbox_client msgq_client;
>> + struct gh_rm_connection *active_rx_connection;
>> + int last_tx_ret;
>> +
>> + struct idr call_idr;
>> + struct mutex call_idr_lock;
>> +
>> + struct kmem_cache *cache;
>> + struct mutex send_lock;
>> + struct blocking_notifier_head nh;
>> +};
>
> This obviously is the "device" that your system works on, so what are
> the lifetime rules of it? Why isn't is just a real 'struct device' in
> the system instead of a random memory blob with a pointer to a device?
>
> What controls the lifetime of this structure and where is the reference
> counting logic for it?
>

The lifetime of the structure is bound by the platform device that above
struct device *dev points to. get_gh_rm and put_gh_rm increments the
device ref counter and ensures lifetime of the struct is also extended.

> And why no documentation for this core structure?
>

Sure, I will add.

> thanks,
>
> greg k-h

2023-02-17 07:37:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core

On Thu, Feb 16, 2023 at 09:40:52AM -0800, Elliot Berman wrote:
>
>
> On 2/15/2023 10:43 PM, Greg Kroah-Hartman wrote:
> > On Tue, Feb 14, 2023 at 01:23:25PM -0800, Elliot Berman wrote:
> > > +struct gh_rm {
> > > + struct device *dev;
> >
> > What device does this point to?
> >
>
> The platform device.

What platform device? And why a platform device?

> > > + struct gunyah_resource tx_ghrsc, rx_ghrsc;
> > > + struct gh_msgq msgq;
> > > + struct mbox_client msgq_client;
> > > + struct gh_rm_connection *active_rx_connection;
> > > + int last_tx_ret;
> > > +
> > > + struct idr call_idr;
> > > + struct mutex call_idr_lock;
> > > +
> > > + struct kmem_cache *cache;
> > > + struct mutex send_lock;
> > > + struct blocking_notifier_head nh;
> > > +};
> >
> > This obviously is the "device" that your system works on, so what are
> > the lifetime rules of it? Why isn't is just a real 'struct device' in
> > the system instead of a random memory blob with a pointer to a device?
> >
> > What controls the lifetime of this structure and where is the reference
> > counting logic for it?
> >
>
> The lifetime of the structure is bound by the platform device that above
> struct device *dev points to. get_gh_rm and put_gh_rm increments the device
> ref counter and ensures lifetime of the struct is also extended.

But this really is "your" device, not the platform device. So make it a
real one please as that is how the kernel's driver model works. Don't
hang "magic structures" off of a random struct device and have them
control the lifetime rules of the parent without actually being a device
themself. This should make things simpler overall, not more complex,
and allow you to expose things to userspace properly (right now your
data is totally hidden.)

thanks,

greg k-h

2023-02-17 13:24:30

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 19/26] gunyah: vm_mgr: Add framework to add VM Functions

* Elliot Berman <[email protected]> [2023-02-14 13:25:30]:

> +static long gh_vm_add_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
> +{
> + struct gh_vm_function_instance *inst;
> + void __user *argp;
> + long r = 0;
> +
> + if (f->arg_size > GH_FN_MAX_ARG_SIZE)
> + return -EINVAL;
> +
> + inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + inst->arg_size = f->arg_size;
> + if (inst->arg_size) {
> + inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
> + if (!inst->arg) {

if (!inst->argp) ?


> + r = -ENOMEM;
> + goto free;
> + }

2023-02-20 09:16:28

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot

* Elliot Berman <[email protected]> [2023-02-14 13:24:26]:

> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> - mutex_lock(&ghvm->mm_lock);
> - list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> - gh_vm_mem_reclaim(ghvm, mapping);
> - kfree(mapping);
> + switch (ghvm->vm_status) {
> +unknown_state:
> + case GH_RM_VM_STATUS_RUNNING:
> + gh_vm_stop(ghvm);
> + fallthrough;
> + case GH_RM_VM_STATUS_INIT_FAILED:
> + case GH_RM_VM_STATUS_LOAD:
> + case GH_RM_VM_STATUS_LOAD_FAILED:
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> + }
> + mutex_unlock(&ghvm->mm_lock);
> + fallthrough;
> + case GH_RM_VM_STATUS_NO_STATE:
> + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> + if (ret)
> + pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> + gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> + put_gh_rm(ghvm->rm);
> + kfree(ghvm);
> + break;
> + default:
> + pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
> + goto unknown_state;

'goto unknown_state' here leads to a infinite loop AFAICS. For example consider
the case where VM_START failed (due to mem_lend operation) causing VM state to
be GH_RM_VM_STATUS_RESET. A subsequent close(vmfd) can leads to that forever
loop.

//snip


> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> + struct gh_vm_mem *mapping;
> + u64 dtb_offset;
> + u32 mem_handle;
> + int ret;
> +
> + down_write(&ghvm->status_lock);
> + if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> + up_write(&ghvm->status_lock);
> + return 0;
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list) {

We don't seem to have the right lock here while walking the list.


2023-02-20 09:55:18

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot

* Srivatsa Vaddagiri <[email protected]> [2023-02-20 14:45:55]:

> * Elliot Berman <[email protected]> [2023-02-14 13:24:26]:
>
> > static void gh_vm_free(struct work_struct *work)
> > {
> > struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> > struct gh_vm_mem *mapping, *tmp;
> > int ret;
> >
> > - mutex_lock(&ghvm->mm_lock);
> > - list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> > - gh_vm_mem_reclaim(ghvm, mapping);
> > - kfree(mapping);
> > + switch (ghvm->vm_status) {
> > +unknown_state:
> > + case GH_RM_VM_STATUS_RUNNING:
> > + gh_vm_stop(ghvm);
> > + fallthrough;
> > + case GH_RM_VM_STATUS_INIT_FAILED:
> > + case GH_RM_VM_STATUS_LOAD:
> > + case GH_RM_VM_STATUS_LOAD_FAILED:
> > + mutex_lock(&ghvm->mm_lock);
> > + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> > + gh_vm_mem_reclaim(ghvm, mapping);
> > + kfree(mapping);
> > + }
> > + mutex_unlock(&ghvm->mm_lock);
> > + fallthrough;
> > + case GH_RM_VM_STATUS_NO_STATE:
> > + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> > + if (ret)
> > + pr_warn("Failed to deallocate vmid: %d\n", ret);
> > +
> > + gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> > + put_gh_rm(ghvm->rm);
> > + kfree(ghvm);
> > + break;
> > + default:
> > + pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
> > + goto unknown_state;
>
> 'goto unknown_state' here leads to a infinite loop AFAICS. For example consider
> the case where VM_START failed (due to mem_lend operation) causing VM state to
> be GH_RM_VM_STATUS_RESET. A subsequent close(vmfd) can leads to that forever
> loop.

Hmm ..that's not a good example perhaps (VM state is set to
GH_RM_VM_STATUS_INIT_FAILED in failed case). Nevertheless I think we should
avoid the goto in case of unknown state.


- vatsa

2023-02-20 13:59:41

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox



On 14/02/2023 21:23, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> Documentation/virt/gunyah/message-queue.rst | 8 +
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/gunyah-msgq.c | 214 ++++++++++++++++++++
> include/linux/gunyah.h | 56 +++++
> 4 files changed, 280 insertions(+)
> create mode 100644 drivers/mailbox/gunyah-msgq.c
>
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index 0667b3eb1ff9..082085e981e0 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
> | | | | | |
> | | | | | |
> +---------------+ +-----------------+ +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> + :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> +obj-$(CONFIG_GUNYAH) += gunyah-msgq.o

Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not
CONFIG_GUNYAH_MBOX?

> +
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
>
> obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..03ffaa30ce9b
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c
> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>

...

> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)

Tasklets have been long deprecated, consider using workqueues in this
particular case.


> +{
> + struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
..

> + tasklet_schedule(&msgq->txdone_tasklet);
> +
> + return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> + .send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with
> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client should register an .rx_callback() and the message queue driver will
> + * push all available messages upon receiving the RX ready interrupt. The messages should be
> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
> +{
> + int ret;
> +
> + /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> + if ((!tx_ghrsc && !rx_ghrsc) ||
> + (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
> + (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
> + return -EINVAL;
> +
> + if (gh_api_version() != GUNYAH_API_V1) {
> + pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
dev_err(parent

would make this more useful

> + gh_api_version(), GUNYAH_API_V1);
> + return -EOPNOTSUPP;
> + }
> +
> + if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
> + return -EOPNOTSUPP;
> +
> + msgq->tx_ghrsc = tx_ghrsc;
> + msgq->rx_ghrsc = rx_ghrsc;
> +
> + msgq->mbox.dev = parent;
> + msgq->mbox.ops = &gh_msgq_ops;
> + msgq->mbox.num_chans = 1;
> + msgq->mbox.txdone_irq = true;
> + msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);
> + if (!msgq->mbox.chans)
> + return -ENOMEM;
> +
> + if (msgq->tx_ghrsc) {
> + ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> + msgq);
> + if (ret)
> + goto err_chans;
> + }
> +
> + if (msgq->rx_ghrsc) {
> + ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> + IRQF_ONESHOT, "gh_msgq_rx", msgq);
> + if (ret)
> + goto err_tx_irq;
> + }
> +
> + tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> + ret = mbox_controller_register(&msgq->mbox);
> + if (ret)
> + goto err_rx_irq;
> +
> + ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> + if (ret)
> + goto err_mbox;
> +
> + return 0;
> +err_mbox:
> + mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> + kfree(msgq->mbox.chans);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{
> + mbox_controller_unregister(&msgq->mbox);
> +
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> + kfree(msgq->mbox.chans);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");

2023-02-20 14:00:10

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 04/26] virt: gunyah: Add hypercalls to identify Gunyah

minor nits below,

On 14/02/2023 21:12, Elliot Berman wrote:
> Add hypercalls to identify when Linux is running a virtual machine under
> Gunyah.
>
> There are two calls to help identify Gunyah:
>
> 1. gh_hypercall_get_uid() returns a UID when running under a Gunyah
> hypervisor.
> 2. gh_hypercall_hyp_identify() returns build information and a set of
> feature flags that are supported by Gunyah.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> arch/arm64/Kbuild | 1 +
> arch/arm64/gunyah/Makefile | 3 ++
> arch/arm64/gunyah/gunyah_hypercall.c | 61 ++++++++++++++++++++++++++++
> drivers/virt/Kconfig | 2 +
> drivers/virt/gunyah/Kconfig | 13 ++++++
> include/linux/gunyah.h | 33 +++++++++++++++
> 6 files changed, 113 insertions(+)
> create mode 100644 arch/arm64/gunyah/Makefile
> create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
> create mode 100644 drivers/virt/gunyah/Kconfig
>
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..e4847ba0e3c9 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
> obj-$(CONFIG_KVM) += kvm/
> obj-$(CONFIG_XEN) += xen/
> obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> +obj-$(CONFIG_GUNYAH) += gunyah/
> obj-$(CONFIG_CRYPTO) += crypto/
>
> # for cleaning
> diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
> new file mode 100644
> index 000000000000..84f1e38cafb1
> --- /dev/null
> +++ b/arch/arm64/gunyah/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> new file mode 100644
> index 000000000000..f30d06ee80cf
> --- /dev/null
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -0,0 +1,61 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/arm-smccc.h>
> +#include <linux/module.h>
> +#include <linux/gunyah.h>
> +
> +static const uint32_t gunyah_known_uuids[][4] = {
> + {0x19bd54bd, 0x0b37571b, 0x946f609b, 0x54539de6}, /* QC_HYP (Qualcomm's build) */
> + {0x673d5f14, 0x9265ce36, 0xa4535fdb, 0xc1d58fcd}, /* GUNYAH (open source build) */
> +};
> +
> +bool arch_is_gunyah_guest(void)
> +{
> + struct arm_smccc_res res;
> + u32 uid[4];
> + int i;
> +
> + arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
> +
> + uid[0] = lower_32_bits(res.a0);
> + uid[1] = lower_32_bits(res.a1);
> + uid[2] = lower_32_bits(res.a2);
> + uid[3] = lower_32_bits(res.a3);
> +
> + for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
> + if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
> + break;
> +
> + return i != ARRAY_SIZE(gunyah_known_uuids);

you could probably make this more readable by:


for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
return true;

return false;

> +}
> +EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
> +
> +#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
> + ARM_SMCCC_OWNER_VENDOR_HYP, \
> + fn)
> +
> +#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +
> +/**
> + * gh_hypercall_hyp_identify() - Returns build information and feature flags
> + * supported by Gunyah.
> + * @hyp_identity: filled by the hypercall with the API info and feature flags.
> + */
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
> +
> + hyp_identity->api_info = res.a0;
> + hyp_identity->flags[0] = res.a1;
> + hyp_identity->flags[1] = res.a2;
> + hyp_identity->flags[2] = res.a3;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..85bd6626ffc9 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>
> source "drivers/virt/coco/tdx-guest/Kconfig"
>
> +source "drivers/virt/gunyah/Kconfig"
> +
> endif
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> new file mode 100644
> index 000000000000..1a737694c333
> --- /dev/null
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -0,0 +1,13 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config GUNYAH
> + tristate "Gunyah Virtualization drivers"
> + depends on ARM64
> + depends on MAILBOX
> + help
> + The Gunyah drivers are the helper interfaces that run in a guest VM
> + such as basic inter-VM IPC and signaling mechanisms, and higher level
> + services such as memory/device sharing, IRQ sharing, and so on.
> +
> + Say Y/M here to enable the drivers needed to interact in a Gunyah
> + virtual environment.
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 59ef4c735ae8..3fef2854c5e1 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -6,8 +6,10 @@
> #ifndef _LINUX_GUNYAH_H
> #define _LINUX_GUNYAH_H
>
> +#include <linux/bitfield.h>
> #include <linux/errno.h>
> #include <linux/limits.h>
> +#include <linux/types.h>
>
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> @@ -79,4 +81,35 @@ static inline int gh_remap_error(enum gh_error gh_error)
> }
> }
>
> +enum gh_api_feature {
> + GH_API_FEATURE_DOORBELL,
> + GH_API_FEATURE_MSGQUEUE,
> + GH_API_FEATURE_VCPU,
> + GH_API_FEATURE_MEMEXTENT,
> +};
> +
> +bool arch_is_gunyah_guest(void);
> +
> +u16 gh_api_version(void);
> +bool gh_api_has_feature(enum gh_api_feature feature);
gh_api_has_feature or arch_is_gunyah_guest is in this patch, this should
probably moved to the respecitive patch that implements these functions.

--srini
> +
> +#define GUNYAH_API_V1 1
> +
> +#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
> +#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
> +#define GH_API_INFO_IS_64BIT BIT_ULL(15)
> +#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
> +
> +#define GH_IDENTIFY_DOORBELL BIT_ULL(1)
> +#define GH_IDENTIFY_MSGQUEUE BIT_ULL(2)
> +#define GH_IDENTIFY_VCPU BIT_ULL(5)
> +#define GH_IDENTIFY_MEMEXTENT BIT_ULL(6)
> +
> +struct gh_hypercall_hyp_identify_resp {
> + u64 api_info;
> + u64 flags[3];
> +};
> +
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
> +
> #endif

2023-02-20 18:10:39

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 14/02/2023 21:23, Elliot Berman wrote:
>
> The resource manager is a special virtual machine which is always
> running on a Gunyah system. It provides APIs for creating and destroying
> VMs, secure memory management, sharing/lending of memory between VMs,
> and setup of inter-VM communication. Calls to the resource manager are
> made via message queues.
>
> This patch implements the basic probing and RPC mechanism to make those
> API calls. Request/response calls can be made with gh_rm_call.
> Drivers can also register to notifications pushed by RM via
> gh_rm_register_notifier
>
> Specific API calls that resource manager supports will be implemented in
> subsequent patches.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 3 +
> drivers/virt/gunyah/rsc_mgr.c | 604 +++++++++++++++++++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 77 +++++
> include/linux/gunyah_rsc_mgr.h | 24 ++
> 4 files changed, 708 insertions(+)
> create mode 100644 drivers/virt/gunyah/rsc_mgr.c
> create mode 100644 drivers/virt/gunyah/rsc_mgr.h
> create mode 100644 include/linux/gunyah_rsc_mgr.h
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 34f32110faf9..cc864ff5abbb 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,3 +1,6 @@
> # SPDX-License-Identifier: GPL-2.0
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> +
> +gunyah_rsc_mgr-y += rsc_mgr.o
> +obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> new file mode 100644
> index 000000000000..2a47139873a8
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -0,0 +1,604 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/of.h>
> +#include <linux/slab.h>
> +#include <linux/mutex.h>
> +#include <linux/sched.h>
> +#include <linux/gunyah.h>
> +#include <linux/module.h>
> +#include <linux/of_irq.h>
> +#include <linux/kthread.h>
why do we need this?

> +#include <linux/notifier.h>
> +#include <linux/workqueue.h>
> +#include <linux/completion.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/platform_device.h>
> +
> +#include "rsc_mgr.h"
> +

...

> +struct gh_rm {
> + struct device *dev;
> + struct gunyah_resource tx_ghrsc, rx_ghrsc;
> + struct gh_msgq msgq;
> + struct mbox_client msgq_client;
> + struct gh_rm_connection *active_rx_connection;
> + int last_tx_ret;
> +

> + struct idr call_idr;
> + struct mutex call_idr_lock;

IDR interface is deprecated you should use Xarrays instead here,

Other good thing about Xarrays is that you need not worry about locking
it uses RCU and internal spinlock, that should simiply code a bit here.

more info at
Documentation/core-api/xarray.rst

> +
> + struct kmem_cache *cache;
> + struct mutex send_lock;
> + struct blocking_notifier_head nh;
> +};
> +
> +static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
> +{
> + struct gh_rm_connection *connection;
> +
> + connection = kzalloc(sizeof(*connection), GFP_KERNEL);
> + if (!connection)
> + return ERR_PTR(-ENOMEM);
> +
> + connection->type = type;
> + connection->msg_id = msg_id;
> +
> + return connection;
> +}
> +
> +static int gh_rm_init_connection_payload(struct gh_rm_connection *connection, void *msg,
> + size_t hdr_size, size_t msg_size)
> +{
> + size_t max_buf_size, payload_size;
> + struct gh_rm_rpc_hdr *hdr = msg;
> +
> + if (hdr_size > msg_size)
> + return -EINVAL;
> +
> + payload_size = msg_size - hdr_size;
> +
> + connection->num_fragments = FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type);
> + connection->fragments_received = 0;
> +
> + /* There's not going to be any payload, no need to allocate buffer. */
> + if (!payload_size && !connection->num_fragments)
> + return 0;
> +
> + if (connection->num_fragments > GH_RM_MAX_NUM_FRAGMENTS)
> + return -EINVAL;
> +
> + max_buf_size = payload_size + (connection->num_fragments * GH_RM_MAX_MSG_SIZE);
> +
> + connection->payload = kzalloc(max_buf_size, GFP_KERNEL);
> + if (!connection->payload)
> + return -ENOMEM;
> +
> + memcpy(connection->payload, msg + hdr_size, payload_size);
> + connection->size = payload_size;
> + return 0;
> +}
> +
> +static void gh_rm_notif_work(struct work_struct *work)
> +{
> + struct gh_rm_connection *connection = container_of(work, struct gh_rm_connection,
> + notification.work);
> + struct gh_rm *rm = connection->notification.rm;
> +
> + blocking_notifier_call_chain(&rm->nh, connection->msg_id, connection->payload);
> +
> + put_gh_rm(rm);
> + kfree(connection->payload);
if (connection->size)
kfree(connection->payload);

should we check for payload size before freeing this, Normally kfree
NULL should be safe, unless connection object is allocated uninitialized.

> + kfree(connection);
> +}
> +
> +static struct gh_rm_connection *gh_rm_process_notif(struct gh_rm *rm, void *msg, size_t msg_size)
> +{
> + struct gh_rm_connection *connection;
> + struct gh_rm_rpc_hdr *hdr = msg;
> + int ret;
> +
> + connection = gh_rm_alloc_connection(hdr->msg_id, RM_RPC_TYPE_NOTIF);
> + if (IS_ERR(connection)) {
> + dev_err(rm->dev, "Failed to alloc connection for notification: %ld, dropping.\n",
> + PTR_ERR(connection));
> + return NULL;
> + }
> +
> + get_gh_rm(rm);
> + connection->notification.rm = rm;
> + INIT_WORK(&connection->notification.work, gh_rm_notif_work);
> +
> + ret = gh_rm_init_connection_payload(connection, msg, sizeof(*hdr), msg_size);
> + if (ret) {
> + dev_err(rm->dev, "Failed to initialize connection buffer for notification: %d\n",
> + ret);
put_gh_rm(rm);

is missing.
or move the get and other lines after this check

> + kfree(connection);
> + return NULL;
> + }
> +
> + return connection;
> +}
> +

> +static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
> + const void *req_buff, size_t req_buff_size,
> + struct gh_rm_connection *connection)
> +{
> + u8 msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_REQUEST);
> + size_t buff_size_remaining = req_buff_size;
> + const void *req_buff_curr = req_buff;
> + struct gh_msgq_tx_data *msg;
> + struct gh_rm_rpc_hdr *hdr;
> + u32 cont_fragments = 0;
> + size_t payload_size;
> + void *payload;
> + int ret;
> +
> + if (req_buff_size)
> + cont_fragments = (req_buff_size - 1) / GH_RM_MAX_MSG_SIZE;
> +
> + if (req_buff_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
> + pr_warn("Limit exceeded for the number of fragments: %u\n", cont_fragments);
> + dump_stack();
> + return -E2BIG;
> + }
> +
> + ret = mutex_lock_interruptible(&rm->send_lock);
> + if (ret)
> + return ret;
> +
> + /* Consider also the 'request' packet for the loop count */
> + do {
> + msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
> + if (!msg) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + /* Fill header */
> + hdr = (struct gh_rm_rpc_hdr *)msg->data;
> + hdr->api = RM_RPC_API;
> + hdr->type = msg_type | FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
> + hdr->seq = cpu_to_le16(connection->reply.seq);
> + hdr->msg_id = cpu_to_le32(message_id);
> +
> + /* Copy payload */
> + payload = hdr + 1;
> + payload_size = min(buff_size_remaining, GH_RM_MAX_MSG_SIZE);
> + memcpy(payload, req_buff_curr, payload_size);
> + req_buff_curr += payload_size;
> + buff_size_remaining -= payload_size;
> +
> + /* Force the last fragment to immediately alert the receiver */
> + msg->push = !buff_size_remaining;
> + msg->length = sizeof(*hdr) + payload_size;
> +
> + ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
> + if (ret < 0) {
> + kmem_cache_free(rm->cache, msg);
> + break;
> + }
> +
> + if (rm->last_tx_ret) {
> + ret = rm->last_tx_ret;
> + break;
> + }
> +
> + msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_CONTINUATION);
> + } while (buff_size_remaining);
> +
> +out:
> + mutex_unlock(&rm->send_lock);
> + return ret < 0 ? ret : 0;
> +}
> +
> +/**
> + * gh_rm_call: Achieve request-response type communication with RPC
> + * @rm: Pointer to Gunyah resource manager internal data
> + * @message_id: The RM RPC message-id
> + * @req_buff: Request buffer that contains the payload
> + * @req_buff_size: Total size of the payload
> + * @resp_buf: Pointer to a response buffer
> + * @resp_buff_size: Size of the response buffer
> + *
> + * Make a request to the RM-VM and wait for reply back. For a successful
> + * response, the function returns the payload. The size of the payload is set in
> + * resp_buff_size. The resp_buf should be freed by the caller.
> + *
> + * req_buff should be not NULL for req_buff_size >0. If req_buff_size == 0,
> + * req_buff *can* be NULL and no additional payload is sent.
> + *
> + * Context: Process context. Will sleep waiting for reply.
> + * Return: 0 on success. <0 if error.
> + */
> +int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff, size_t req_buff_size,
> + void **resp_buf, size_t *resp_buff_size)
> +{
> + struct gh_rm_connection *connection;
> + int ret;
> +
> + /* message_id 0 is reserved. req_buff_size implies req_buf is not NULL */
> + if (!message_id || (!req_buff && req_buff_size) || !rm)
> + return -EINVAL;
> +
> + connection = gh_rm_alloc_connection(cpu_to_le32(message_id), RM_RPC_TYPE_REPLY);
> + if (IS_ERR(connection))
> + return PTR_ERR(connection);
> +
> + init_completion(&connection->reply.seq_done);
> +
> + /* Allocate a new seq number for this connection */
> + mutex_lock(&rm->call_idr_lock);
> + ret = idr_alloc_cyclic(&rm->call_idr, connection, 0, U16_MAX,
> + GFP_KERNEL);
> + mutex_unlock(&rm->call_idr_lock);
> + if (ret < 0)
> + goto out;

new line.

> + connection->reply.seq = ret;
> +
> + /* Send the request to the Resource Manager */
> + ret = gh_rm_send_request(rm, message_id, req_buff, req_buff_size, connection);
> + if (ret < 0)
> + goto out;
> +
> + /* Wait for response */
> + ret = wait_for_completion_interruptible(&connection->reply.seq_done);
> + if (ret)
> + goto out;
> +
> + /* Check for internal (kernel) error waiting for the response */
> + if (connection->reply.ret) {
> + ret = connection->reply.ret;
> + if (ret != -ENOMEM)
> + kfree(connection->payload);
> + goto out;
> + }
> +
> + /* Got a response, did resource manager give us an error? */
> + if (connection->reply.rm_error != GH_RM_ERROR_OK) {
> + pr_warn("RM rejected message %08x. Error: %d\n", message_id,
> + connection->reply.rm_error);
> + dump_stack();
> + ret = gh_rm_remap_error(connection->reply.rm_error);
> + kfree(connection->payload);
> + goto out;
> + }
> +
> + /* Everything looks good, return the payload */
> + *resp_buff_size = connection->size;
> + if (connection->size)
> + *resp_buf = connection->payload;
> + else {
> + /* kfree in case RM sent us multiple fragments but never any data in
> + * those fragments. We would've allocated memory for it, but connection->size == 0
> + */
> + kfree(connection->payload);
> + }
> +
> +out:
> + mutex_lock(&rm->call_idr_lock);
> + idr_remove(&rm->call_idr, connection->reply.seq);
> + mutex_unlock(&rm->call_idr_lock);
> + kfree(connection);
> + return ret;
> +}
> +
> +
> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb)
> +{
> + return blocking_notifier_chain_register(&rm->nh, nb);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
> +
> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb)
> +{
> + return blocking_notifier_chain_unregister(&rm->nh, nb);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
> +
> +void get_gh_rm(struct gh_rm *rm)
> +{
> + get_device(rm->dev);
> +}
> +EXPORT_SYMBOL_GPL(get_gh_rm);

Can we have some consistency in the exported symbol naming,
we have two combinations now.

EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
EXPORT_SYMBOL_GPL(get_gh_rm);

lets stick to one.
> +
> +void put_gh_rm(struct gh_rm *rm)
> +{
> + put_device(rm->dev);
> +}
> +EXPORT_SYMBOL_GPL(put_gh_rm);
>
...

> +
> +static int gh_rm_drv_probe(struct platform_device *pdev)
> +{
> + struct gh_msgq_tx_data *msg;
> + struct gh_rm *rm;
> + int ret;
> +
How are we ensuring that gunyah driver is probed before this driver?


> + rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
> + if (!rm)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, rm);
> + rm->dev = &pdev->dev;
> +
> + mutex_init(&rm->call_idr_lock);
> + idr_init(&rm->call_idr);
> + rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data, GH_MSGQ_MAX_MSG_SIZE), 0,
> + SLAB_HWCACHE_ALIGN, NULL);
> + if (!rm->cache)
> + return -ENOMEM;
new line here would be nice.

> + mutex_init(&rm->send_lock);
> + BLOCKING_INIT_NOTIFIER_HEAD(&rm->nh);
> +
> + ret = gh_msgq_platform_probe_direction(pdev, true, 0, &rm->tx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + ret = gh_msgq_platform_probe_direction(pdev, false, 1, &rm->rx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + rm->msgq_client.dev = &pdev->dev;
> + rm->msgq_client.tx_block = true;
> + rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
> + rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
> +
> + return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +err_cache:
> + kmem_cache_destroy(rm->cache);
> + return ret;
> +}
> +
> +static int gh_rm_drv_remove(struct platform_device *pdev)
> +{
> + struct gh_rm *rm = platform_get_drvdata(pdev);
> +
> + mbox_free_channel(gh_msgq_chan(&rm->msgq));
> + gh_msgq_remove(&rm->msgq);
> + kmem_cache_destroy(rm->cache);
> +
> + return 0;
> +}
> +
> +static const struct of_device_id gh_rm_of_match[] = {
> + { .compatible = "gunyah-resource-manager" },
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, gh_rm_of_match);
> +
> +static struct platform_driver gh_rm_driver = {
> + .probe = gh_rm_drv_probe,
> + .remove = gh_rm_drv_remove,
> + .driver = {
> + .name = "gh_rsc_mgr",
> + .of_match_table = gh_rm_of_match,
> + },
> +};
> +module_platform_driver(gh_rm_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Resource Manager Driver");
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> new file mode 100644
> index 000000000000..d4e799a7526f
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -0,0 +1,77 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +#ifndef __GH_RSC_MGR_PRIV_H
> +#define __GH_RSC_MGR_PRIV_H
> +
> +#include <linux/gunyah.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/types.h>
> +
<------------------
> +/* RM Error codes */
> +enum gh_rm_error {
> + GH_RM_ERROR_OK = 0x0,
> + GH_RM_ERROR_UNIMPLEMENTED = 0xFFFFFFFF,
> + GH_RM_ERROR_NOMEM = 0x1,
> + GH_RM_ERROR_NORESOURCE = 0x2,
> + GH_RM_ERROR_DENIED = 0x3,
> + GH_RM_ERROR_INVALID = 0x4,
> + GH_RM_ERROR_BUSY = 0x5,
> + GH_RM_ERROR_ARGUMENT_INVALID = 0x6,
> + GH_RM_ERROR_HANDLE_INVALID = 0x7,
> + GH_RM_ERROR_VALIDATE_FAILED = 0x8,
> + GH_RM_ERROR_MAP_FAILED = 0x9,
> + GH_RM_ERROR_MEM_INVALID = 0xA,
> + GH_RM_ERROR_MEM_INUSE = 0xB,
> + GH_RM_ERROR_MEM_RELEASED = 0xC,
> + GH_RM_ERROR_VMID_INVALID = 0xD,
> + GH_RM_ERROR_LOOKUP_FAILED = 0xE,
> + GH_RM_ERROR_IRQ_INVALID = 0xF,
> + GH_RM_ERROR_IRQ_INUSE = 0x10,
> + GH_RM_ERROR_IRQ_RELEASED = 0x11,
> +};
> +
> +/**
> + * gh_rm_remap_error() - Remap Gunyah resource manager errors into a Linux error code
> + * @gh_error: "Standard" return value from Gunyah resource manager
> + */
> +static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
> +{
> + switch (rm_error) {
> + case GH_RM_ERROR_OK:
> + return 0;
> + case GH_RM_ERROR_UNIMPLEMENTED:
> + return -EOPNOTSUPP;
> + case GH_RM_ERROR_NOMEM:
> + return -ENOMEM;
> + case GH_RM_ERROR_NORESOURCE:
> + return -ENODEV;
> + case GH_RM_ERROR_DENIED:
> + return -EPERM;
> + case GH_RM_ERROR_BUSY:
> + return -EBUSY;
> + case GH_RM_ERROR_INVALID:
> + case GH_RM_ERROR_ARGUMENT_INVALID:
> + case GH_RM_ERROR_HANDLE_INVALID:
> + case GH_RM_ERROR_VALIDATE_FAILED:
> + case GH_RM_ERROR_MAP_FAILED:
> + case GH_RM_ERROR_MEM_INVALID:
> + case GH_RM_ERROR_MEM_INUSE:
> + case GH_RM_ERROR_MEM_RELEASED:
> + case GH_RM_ERROR_VMID_INVALID:
> + case GH_RM_ERROR_LOOKUP_FAILED:
> + case GH_RM_ERROR_IRQ_INVALID:
> + case GH_RM_ERROR_IRQ_INUSE:
> + case GH_RM_ERROR_IRQ_RELEASED:
> + return -EINVAL;
> + default:
> + return -EBADMSG;
> + }
> +}
> +
---------------->

Only user for the error code coversion is within the rm driver, you
should just move this to the .c file, I see no value of this in .h
unless there are some other users for this.



> +struct gh_rm;
> +int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> + void **resp_buf, size_t *resp_buff_size);
> +
> +#endif
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> new file mode 100644
> index 000000000000..c992b3188c8d
> --- /dev/null
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GUNYAH_RSC_MGR_H
> +#define _GUNYAH_RSC_MGR_H
> +
> +#include <linux/list.h>
> +#include <linux/notifier.h>
> +#include <linux/gunyah.h>
> +
> +#define GH_VMID_INVAL U16_MAX
> +
> +/* Gunyah recognizes VMID0 as an alias to the current VM's ID */
> +#define GH_VMID_SELF 0
> +
> +struct gh_rm;
> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
> +void get_gh_rm(struct gh_rm *rm);
> +void put_gh_rm(struct gh_rm *rm);
> +
> +#endif

2023-02-21 07:50:56

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 09/26] gunyah: rsc_mgr: Add VM lifecycle RPC



On 14/02/2023 21:23, Elliot Berman wrote:
>
> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr.h | 45 ++++++
> drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 73 ++++++++++
> 4 files changed, 345 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index cc864ff5abbb..de29769f2f3f 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index d4e799a7526f..7406237bc66d 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,4 +74,49 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> void **resp_buf, size_t *resp_buff_size);
>
<----------------------------
> +/* Message IDs: VM Management */
> +#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
> +#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
> +#define GH_RM_RPC_VM_START 0x56000004
> +#define GH_RM_RPC_VM_STOP 0x56000005
> +#define GH_RM_RPC_VM_RESET 0x56000006
> +#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
> +#define GH_RM_RPC_VM_INIT 0x5600000B
> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
> +#define GH_RM_RPC_VM_GET_VMID 0x56000024
> +
> +struct gh_rm_vm_common_vmid_req {
> + __le16 vmid;
> + __le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_ALLOC */
> +struct gh_rm_vm_alloc_vmid_resp {
> + __le16 vmid;
> + __le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_STOP */
> +struct gh_rm_vm_stop_req {
> + __le16 vmid;
> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
> + u8 flags;
> + u8 reserved;
> +#define GH_RM_VM_STOP_REASON_FORCE_STOP 3
> + __le32 stop_reason;
> +} __packed;
> +
> +/* Call: VM_CONFIG_IMAGE */
> +struct gh_rm_vm_config_image_req {
> + __le16 vmid;
> + __le16 auth_mech;
> + __le32 mem_handle;
> + __le64 image_offset;
> + __le64 image_size;
> + __le64 dtb_offset;
> + __le64 dtb_size;
> +} __packed;
> +
> +/* Call: GET_HYP_RESOURCES */
> +
-------------------------------->

All the above structures are very much internal to rsc_mgr_rpc.c and
interface to the rsc_mgr_rpc is already abstracted with function arguments

ex:

int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum
gh_rm_vm_auth_mechanism auth_mechanism, u32 mem_handle, u64
image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)

So why do we need these structs and defines in header file at all?
you should proabably consider moving them to the .c file.


> #endif
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> new file mode 100644
> index 000000000000..4515cdd80106
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -0,0 +1,226 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +

Why new line here?

> +#include "rsc_mgr.h"
> +
> +/*
...

> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
> + u32 mem_handle, u64 image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)
> +{
> + struct gh_rm_vm_config_image_req req_payload = { 0 };
> + size_t resp_size;
> + void *resp;
> +
> + req_payload.vmid = cpu_to_le16(vmid);
> + req_payload.auth_mech = cpu_to_le16(auth_mechanism);
> + req_payload.mem_handle = cpu_to_le32(mem_handle);
> + req_payload.image_offset = cpu_to_le64(image_offset);
> + req_payload.image_size = cpu_to_le64(image_size);
> + req_payload.dtb_offset = cpu_to_le64(dtb_offset);
> + req_payload.dtb_size = cpu_to_le64(dtb_size);
> +
> + return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload, sizeof(req_payload),
> + &resp, &resp_size);
> +}
> +

--srini

2023-02-21 10:46:15

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager



On 14/02/2023 21:23, Elliot Berman wrote:
>
> Gunyah VM manager is a kernel moduel which exposes an interface to
> Gunyah userspace to load, run, and interact with other Gunyah virtual
> machines. The interface is a character device at /dev/gunyah.
>
> Add a basic VM manager driver. Upcoming patches will add more ioctls
> into this driver.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr.c | 37 +++++-
> drivers/virt/gunyah/vm_mgr.c | 118 ++++++++++++++++++
> drivers/virt/gunyah/vm_mgr.h | 22 ++++
> include/uapi/linux/gunyah.h | 23 ++++
> 6 files changed, 201 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr.c
> create mode 100644 drivers/virt/gunyah/vm_mgr.h
> create mode 100644 include/uapi/linux/gunyah.h
>
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 0a1882e296ae..2513324ae7be 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -137,6 +137,7 @@ Code Seq# Include File Comments
> 'F' DD video/sstfb.h conflict!
> 'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict!
> 'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict!
> +'G' 00-0f linux/gunyah.h conflict!
> 'H' 00-7F linux/hiddev.h conflict!
> 'H' 00-0F linux/hidraw.h conflict!
> 'H' 01 linux/mei.h conflict!
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index de29769f2f3f..03951cf82023 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 2a47139873a8..73c5a6b7cbbc 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -16,8 +16,10 @@
> #include <linux/completion.h>
> #include <linux/gunyah_rsc_mgr.h>
> #include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
>
> #include "rsc_mgr.h"
> +#include "vm_mgr.h"
>
> #define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
> #define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
> @@ -103,6 +105,8 @@ struct gh_rm {
> struct kmem_cache *cache;
> struct mutex send_lock;
> struct blocking_notifier_head nh;
> +
> + struct miscdevice miscdev;
> };
>
> static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
> @@ -509,6 +513,21 @@ void put_gh_rm(struct gh_rm *rm)
> }
> EXPORT_SYMBOL_GPL(put_gh_rm);
>
> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct miscdevice *miscdev = filp->private_data;
> + struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
> +
> + return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
> +}
> +
> +static const struct file_operations gh_dev_fops = {
> + .owner = THIS_MODULE,
> + .unlocked_ioctl = gh_dev_ioctl,
> + .compat_ioctl = compat_ptr_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> static int gh_msgq_platform_probe_direction(struct platform_device *pdev,
> bool tx, int idx, struct gunyah_resource *ghrsc)
> {
> @@ -567,7 +586,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
> rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
> rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>
> - return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> + ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + rm->miscdev.name = "gunyah";
> + rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> + rm->miscdev.fops = &gh_dev_fops;
> +
> + ret = misc_register(&rm->miscdev);
> + if (ret)
> + goto err_msgq;
> +
> + return 0;
> +err_msgq:
> + mbox_free_channel(gh_msgq_chan(&rm->msgq));
> + gh_msgq_remove(&rm->msgq);
> err_cache:
> kmem_cache_destroy(rm->cache);
> return ret;
> @@ -577,6 +611,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
> {
> struct gh_rm *rm = platform_get_drvdata(pdev);
>
> + misc_deregister(&rm->miscdev);
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> kmem_cache_destroy(rm->cache);
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> new file mode 100644
> index 000000000000..fd890a57172e
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -0,0 +1,118 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static void gh_vm_free(struct work_struct *work)
> +{
> + struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + int ret;
> +
> + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> + if (ret)
> + pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> + put_gh_rm(ghvm->rm);
> + kfree(ghvm);
> +}
> +
> +static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> +{
> + struct gh_vm *ghvm;
> + int vmid;
> +
> + vmid = gh_rm_alloc_vmid(rm, 0);
> + if (vmid < 0)
> + return ERR_PTR(vmid);
> +
> + ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
> + if (!ghvm) {
> + gh_rm_dealloc_vmid(rm, vmid);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + get_gh_rm(rm);
> +
> + ghvm->vmid = vmid;
> + ghvm->rm = rm;
> +
> + INIT_WORK(&ghvm->free_work, gh_vm_free);
> +
> + return ghvm;
> +}
> +
> +static int gh_vm_release(struct inode *inode, struct file *filp)
> +{
> + struct gh_vm *ghvm = filp->private_data;
> +
> + /* VM will be reset and make RM calls which can interruptible sleep.
> + * Defer to a work so this thread can receive signal.
> + */
> + schedule_work(&ghvm->free_work);
> + return 0;
> +}
> +
> +static const struct file_operations gh_vm_fops = {
> + .release = gh_vm_release,

> + .compat_ioctl = compat_ptr_ioctl,

This line should go with the patch that adds real ioctl

> + .llseek = noop_llseek,
> +};
> +
> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
Not sure what is the gain of this multiple levels of redirection.

How about

long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
{
...
}

and rsc_mgr just call it as part of its ioctl call

static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned
long arg)
{
struct miscdevice *miscdev = filp->private_data;
struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);

switch (cmd) {
case GH_CREATE_VM:
return gh_dev_create_vm(rm, arg);
default:
return -ENOIOCTLCMD;
}
}


> +{
> + struct gh_vm *ghvm;
> + struct file *file;
> + int fd, err;
> +
> + /* arg reserved for future use. */
> + if (arg)
> + return -EINVAL;

The only code path I see here is via GH_CREATE_VM ioctl which obviously
does not take any arguments, so if you are thinking of using the
argument for architecture-specific VM flags. Then this needs to be
properly done by making the ABI aware of this.

As you mentioned zero value arg imply an "unauthenticated VM" type, but
this was not properly encoded in the userspace ABI. Why not make it
future compatible. How about adding arguments to GH_CREATE_VM and pass
the required information correctly.
Note that once the ABI is accepted then you will not be able to change
it, other than adding a new one.

> +
> + ghvm = gh_vm_alloc(rm);
> + if (IS_ERR(ghvm))
> + return PTR_ERR(ghvm);
> +
> + fd = get_unused_fd_flags(O_CLOEXEC);
> + if (fd < 0) {
> + err = fd;
> + goto err_destroy_vm;
> + }
> +
> + file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> + if (IS_ERR(file)) {
> + err = PTR_ERR(file);
> + goto err_put_fd;
> + }
> +
> + fd_install(fd, file);
> +
> + return fd;
> +
> +err_put_fd:
> + put_unused_fd(fd);
> +err_destroy_vm:
> + kfree(ghvm);
> + return err;
> +}
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
> +{
> + switch (cmd) {
> + case GH_CREATE_VM:
> + return gh_dev_ioctl_create_vm(rm, arg);
> + default:
> + return -ENOIOCTLCMD;
> + }
> +}
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> new file mode 100644
> index 000000000000..76954da706e9
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GH_PRIV_VM_MGR_H
> +#define _GH_PRIV_VM_MGR_H
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
> +
> +struct gh_vm {
> + u16 vmid;
> + struct gh_rm *rm;
> +
> + struct work_struct free_work;
> +};
> +
> +#endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> new file mode 100644
> index 000000000000..10ba32d2b0a6
> --- /dev/null
> +++ b/include/uapi/linux/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_GUNYAH
> +#define _UAPI_LINUX_GUNYAH
> +
> +/*
> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
> + */
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define GH_IOCTL_TYPE 'G'
> +
> +/*
> + * ioctls for /dev/gunyah fds:
> + */
> +#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */

Can HLOS forcefully destroy a VM?
If so should we have a corresponding DESTROY IOCTL?

--srini

> +
> +#endif

2023-02-21 11:07:51

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 11/26] gunyah: rsc_mgr: Add RPC for sharing memory



On 14/02/2023 21:24, Elliot Berman wrote:
>
> Gunyah resource manager provides API to manipulate stage 2 page tables.
> Manipulations are represented as a memory parcel. Memory parcels
> describe a list of memory regions (intermediate physical address and
> size), a list of new permissions for VMs, and the memory type (DDR or
> MMIO). Memory parcels are uniquely identified by a handle allocated by
> Gunyah. There are a few types of memory parcel sharing which Gunyah
> supports:
>
> - Sharing: the guest and host VM both have access
> - Lending: only the guest has access; host VM loses access
> - Donating: Permanently lent (not reclaimed even if guest shuts down)
>
> Memory parcels that have been shared or lent can be reclaimed by the
> host via an additional call. The reclaim operation restores the original
> access the host VM had to the memory parcel and removes the access to
> other VM.
>
> One point to note that memory parcels don't describe where in the guest
> VM the memory parcel should reside. The guest VM must accept the memory
> parcel either explicitly via a "gh_rm_mem_accept" call (not introduced
> here) or be configured to accept it automatically at boot. As the guest
> VM accepts the memory parcel, it also mentions the IPA it wants to place
> memory parcel.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/rsc_mgr.h | 44 +++++++
> drivers/virt/gunyah/rsc_mgr_rpc.c | 185 ++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 47 ++++++++
> 3 files changed, 276 insertions(+)
>
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index 7406237bc66d..9b23cefe02b0 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,6 +74,12 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> void **resp_buf, size_t *resp_buff_size);
>
> +/* Message IDs: Memory Management */
> +#define GH_RM_RPC_MEM_LEND 0x51000012
> +#define GH_RM_RPC_MEM_SHARE 0x51000013
> +#define GH_RM_RPC_MEM_RECLAIM 0x51000015
> +#define GH_RM_RPC_MEM_APPEND 0x51000018
> +
> /* Message IDs: VM Management */
> #define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
> #define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
> @@ -90,6 +96,44 @@ struct gh_rm_vm_common_vmid_req {
> __le16 reserved0;
> } __packed;
>
> +/* Call: MEM_LEND, MEM_SHARE */
> +struct gh_rm_mem_share_req_header {
> + u8 mem_type;
> + u8 reserved0;
> +#define GH_MEM_SHARE_REQ_FLAGS_APPEND BIT(1)
> + u8 flags;
> + u8 reserved1;
> + __le32 label;
> +} __packed;
> +
> +struct gh_rm_mem_share_req_acl_section {
> + __le32 n_entries;
> + struct gh_rm_mem_acl_entry entries[];
> +};
> +
> +struct gh_rm_mem_share_req_mem_section {
> + __le16 n_entries;
> + __le16 reserved0;
> + struct gh_rm_mem_entry entries[];
> +};
> +
> +/* Call: MEM_RELEASE */
> +struct gh_rm_mem_release_req {
> + __le32 mem_handle;
> + u8 flags; /* currently not used */
> + __le16 reserved0;
> + u8 reserved1;
> +} __packed;
> +
> +/* Call: MEM_APPEND */
> +struct gh_rm_mem_append_req_header {
> + __le32 mem_handle;
> +#define GH_MEM_APPEND_REQ_FLAGS_END BIT(0)
> + u8 flags;
> + __le16 reserved0;
> + u8 reserved1;
> +} __packed;
> +
> /* Call: VM_ALLOC */
> struct gh_rm_vm_alloc_vmid_resp {
> __le16 vmid;
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index 4515cdd80106..0c83b097fec9 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -7,6 +7,8 @@
>
> #include "rsc_mgr.h"
>
> +#define GH_RM_MAX_MEM_ENTRIES 512
> +
> /*
> * Several RM calls take only a VMID as a parameter and give only standard
> * response back. Deduplicate boilerplate code by using this common call.
> @@ -22,6 +24,189 @@ static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
> return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), &resp, &resp_size);
> }
>
> +static int _gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle, bool end_append,
> + struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
> +{
> + struct gh_rm_mem_share_req_mem_section *mem_section;
> + struct gh_rm_mem_append_req_header *req_header;
> + size_t msg_size = 0, resp_size;
> + void *msg, *resp;
> + int ret;
> +
> + msg_size += sizeof(struct gh_rm_mem_append_req_header);
> + msg_size += struct_size(mem_section, entries, n_mem_entries);
> +
> + msg = kzalloc(msg_size, GFP_KERNEL);
> + if (!msg)
> + return -ENOMEM;
> +
> + req_header = msg;
> + mem_section = (void *)req_header + sizeof(struct gh_rm_mem_append_req_header);
> +
> + req_header->mem_handle = cpu_to_le32(mem_handle);
> + if (end_append)
> + req_header->flags |= GH_MEM_APPEND_REQ_FLAGS_END;
> +
> + mem_section->n_entries = cpu_to_le16(n_mem_entries);
> + memcpy(mem_section->entries, mem_entries, sizeof(*mem_entries) * n_mem_entries);
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_MEM_APPEND, msg, msg_size, &resp, &resp_size);

I have seen this pattern, where we pass &resp eventhough we are not
expecting a response.

Can we make this explicit by passing NULL. This can also help to find
any leaks of resp in case they are not freed by the caller.

ret = gh_rm_call(rm, GH_RM_RPC_MEM_APPEND, msg, msg_size, NULL, NULL);

> + kfree(msg);
> +
> + return ret;
> +}
> +
> +static int gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle,
> + struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
> +{
> + bool end_append;
> + int ret = 0;
> + size_t n;
> +
> + while (n_mem_entries) {
> + if (n_mem_entries > GH_RM_MAX_MEM_ENTRIES) {
> + end_append = false;
> + n = GH_RM_MAX_MEM_ENTRIES;
> + } else {
> + end_append = true;
> + n = n_mem_entries;
> + }
> +
> + ret = _gh_rm_mem_append(rm, mem_handle, end_append, mem_entries, n);
> + if (ret)
> + break;
> +
> + mem_entries += n;
> + n_mem_entries -= n;
> + }
> +
> + return ret;
> +}
> +
> +static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_mem_parcel *p)
> +{
> + size_t msg_size = 0, initial_mem_entries = p->n_mem_entries, resp_size;
> + struct gh_rm_mem_share_req_acl_section *acl_section;
> + struct gh_rm_mem_share_req_mem_section *mem_section;
> + struct gh_rm_mem_share_req_header *req_header;
> + u32 *attr_section;
> + __le32 *resp;
> + void *msg;
> + int ret;
> +
> + if (!p->acl_entries || !p->n_acl_entries || !p->mem_entries || !p->n_mem_entries ||
> + p->n_acl_entries > U8_MAX || p->mem_handle != GH_MEM_HANDLE_INVAL)
> + return -EINVAL;
> +
> + if (initial_mem_entries > GH_RM_MAX_MEM_ENTRIES)
> + initial_mem_entries = GH_RM_MAX_MEM_ENTRIES;
> +
> + /* The format of the message goes:
> + * request header
> + * ACL entries (which VMs get what kind of access to this memory parcel)
> + * Memory entries (list of memory regions to share)
> + * Memory attributes (currently unused, we'll hard-code the size to 0)
> + */
> + msg_size += sizeof(struct gh_rm_mem_share_req_header);
> + msg_size += struct_size(acl_section, entries, p->n_acl_entries);
> + msg_size += struct_size(mem_section, entries, initial_mem_entries);
> + msg_size += sizeof(u32); /* for memory attributes, currently unused */
> +
> + msg = kzalloc(msg_size, GFP_KERNEL);
> + if (!msg)
> + return -ENOMEM;
> +
> + req_header = msg;
> + acl_section = (void *)req_header + sizeof(*req_header);
> + mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
> + attr_section = (void *)mem_section + struct_size(mem_section, entries, initial_mem_entries);
> +
> + req_header->mem_type = p->mem_type;
> + if (initial_mem_entries != p->n_mem_entries)
> + req_header->flags |= GH_MEM_SHARE_REQ_FLAGS_APPEND;
> + req_header->label = cpu_to_le32(p->label);
> +
> + acl_section->n_entries = cpu_to_le32(p->n_acl_entries);
> + memcpy(acl_section->entries, p->acl_entries, sizeof(*(p->acl_entries)) * p->n_acl_entries);
> +
> + mem_section->n_entries = cpu_to_le16(initial_mem_entries);
> + memcpy(mem_section->entries, p->mem_entries,
> + sizeof(*(p->mem_entries)) * initial_mem_entries);
> +
> + /* Set n_entries for memory attribute section to 0 */
> + *attr_section = 0;
> +
> + ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
> + kfree(msg);
> +
> + if (ret)
> + return ret;
> +
> + p->mem_handle = le32_to_cpu(*resp);
> +
> + if (initial_mem_entries != p->n_mem_entries) {
> + ret = gh_rm_mem_append(rm, p->mem_handle,
> + &p->mem_entries[initial_mem_entries],
> + p->n_mem_entries - initial_mem_entries);
> + if (ret) {
> + gh_rm_mem_reclaim(rm, p);
> + p->mem_handle = GH_MEM_HANDLE_INVAL;
> + }
> + }
> +
> + kfree(resp);
> + return ret;
> +}
> +
> +/**
> + * gh_rm_mem_lend() - Lend memory to other virtual machines.
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Package the memory information of the memory to be lent.
> + *
> + * Lending removes Linux's access to the memory while the memory parcel is lent.
> + */
> +int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> + return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_LEND, parcel);
> +}
> +
> +
> +/**
> + * gh_rm_mem_share() - Share memory with other virtual machines.
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Package the memory information of the memory to be shared.
> + *
> + * Sharing keeps Linux's access to the memory while the memory parcel is shared.
> + */
> +int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> + return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_SHARE, parcel);
> +}
> +
> +/**
> + * gh_rm_mem_reclaim() - Reclaim a memory parcel
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Package the memory information of the memory to be reclaimed.
> + *
> + * RM maps the associated memory back into the stage-2 page tables of the owner VM.
> + */
> +int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> + struct gh_rm_mem_release_req req = {
> + .mem_handle = cpu_to_le32(parcel->mem_handle),
> + };
> + size_t resp_size;
> + void *resp;
> + int ret;
> +
> + ret = gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), &resp, &resp_size);
> + /* Do not call platform mem reclaim hooks: the reclaim didn't happen*/
> + if (ret)
> + return ret;
> +
how about

return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), &resp,
&resp_size);


> + return ret;
> +}
> +
> /**
> * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
> * @rm: Handle to a Gunyah resource manager
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index e7bd29f8be6e..2d8b8b6cc394 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -11,6 +11,7 @@
> #include <linux/gunyah.h>
>
> #define GH_VMID_INVAL U16_MAX
> +#define GH_MEM_HANDLE_INVAL U32_MAX
>
> /* Gunyah recognizes VMID0 as an alias to the current VM's ID */
> #define GH_VMID_SELF 0
> @@ -54,7 +55,53 @@ struct gh_rm_vm_status_payload {
>
> #define GH_RM_NOTIFICATION_VM_STATUS 0x56100008
>
> +struct gh_rm_mem_acl_entry {
> + __le16 vmid;
> +#define GH_RM_ACL_X BIT(0)
> +#define GH_RM_ACL_W BIT(1)
> +#define GH_RM_ACL_R BIT(2)
> + u8 perms;
> + u8 reserved;
> +} __packed;
> +
> +struct gh_rm_mem_entry {
> + __le64 ipa_base;
> + __le64 size;
> +} __packed;
> +
> +enum gh_rm_mem_type {
> + GH_RM_MEM_TYPE_NORMAL = 0,
> + GH_RM_MEM_TYPE_IO = 1,
> +};
> +
> +/*
> + * struct gh_rm_mem_parcel - Package info about memory to be lent/shared/donated/reclaimed
> + * @mem_type: The type of memory: normal (DDR) or IO
> + * @label: An client-specified identifier which can be used by the other VMs to identify the purpose
> + * of the memory parcel.
> + * @acl_entries: An array of access control entries. Each entry specifies a VM and what access
> + * is allowed for the memory parcel.
> + * @n_acl_entries: Count of the number of entries in the `acl_entries` array.
> + * @mem_entries: An list of regions to be associated with the memory parcel. Addresses should be
> + * (intermediate) physical addresses from Linux's perspective.
> + * @n_mem_entries: Count of the number of entries in the `mem_entries` array.
> + * @mem_handle: On success, filled with memory handle that RM allocates for this memory parcel
> + */
> +struct gh_rm_mem_parcel {
> + enum gh_rm_mem_type mem_type;
> + u32 label;
> + size_t n_acl_entries;
> + struct gh_rm_mem_acl_entry *acl_entries;
> + size_t n_mem_entries;
> + struct gh_rm_mem_entry *mem_entries;
> + u32 mem_handle;
> +};
> +
> /* RPC Calls */
> +int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +
> int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
> int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
> int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);

2023-02-21 12:29:27

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions



On 14/02/2023 21:24, Elliot Berman wrote:
>
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/vm_mgr.c | 44 ++++++
> drivers/virt/gunyah/vm_mgr.h | 25 ++++
> drivers/virt/gunyah/vm_mgr_mm.c | 235 ++++++++++++++++++++++++++++++++
> include/uapi/linux/gunyah.h | 33 +++++
> 5 files changed, 338 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 03951cf82023..ff8bc4925392 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index fd890a57172e..84102bac03cc 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -18,8 +18,16 @@
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> + }
> + mutex_unlock(&ghvm->mm_lock);
> +
> ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> if (ret)
> pr_warn("Failed to deallocate vmid: %d\n", ret);
> @@ -48,11 +56,46 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> ghvm->vmid = vmid;
> ghvm->rm = rm;
>
> + mutex_init(&ghvm->mm_lock);
> + INIT_LIST_HEAD(&ghvm->memory_mappings);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
>
> return ghvm;
> }
>
> +static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct gh_vm *ghvm = filp->private_data;
> + void __user *argp = (void __user *)arg;
> + long r;
> +
> + switch (cmd) {
> + case GH_VM_SET_USER_MEM_REGION: {
> + struct gh_userspace_memory_region region;
> +
> + if (copy_from_user(&region, argp, sizeof(region)))
> + return -EFAULT;
> +
> + /* All other flag bits are reserved for future use */
> + if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC |
> + GH_MEM_LENT))
> + return -EINVAL;
> +
> +
> + if (region.memory_size)
> + r = gh_vm_mem_alloc(ghvm, &region);
> + else
> + r = gh_vm_mem_free(ghvm, region.label);

Looks like we are repurposing GH_VM_SET_USER_MEM_REGION for allocation
and freeing.

Should we have corresponding GH_VM_UN_SET_USER_MEM_REGION instead for
freeing? given that label is the only relevant member of struct
gh_userspace_memory_region in free case.


> + break;
> + }
> + default:
> + r = -ENOTTY;
> + break;
> + }
> +
> + return r;
> +}
> +
> static int gh_vm_release(struct inode *inode, struct file *filp)
> {
> struct gh_vm *ghvm = filp->private_data;
> @@ -65,6 +108,7 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
> }
>
> static const struct file_operations gh_vm_fops = {
> + .unlocked_ioctl = gh_vm_ioctl,
> .release = gh_vm_release,
> .compat_ioctl = compat_ptr_ioctl,
> .llseek = noop_llseek,
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 76954da706e9..97bc00c34878 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -7,16 +7,41 @@
> #define _GH_PRIV_VM_MGR_H
>
> #include <linux/gunyah_rsc_mgr.h>
> +#include <linux/list.h>
> +#include <linux/miscdevice.h>
> +#include <linux/mutex.h>
>
> #include <uapi/linux/gunyah.h>
>
> long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
>
> +enum gh_vm_mem_share_type {
> + VM_MEM_SHARE,
> + VM_MEM_LEND,
> +};
> +
> +struct gh_vm_mem {
> + struct list_head list;
> + enum gh_vm_mem_share_type share_type;
> + struct gh_rm_mem_parcel parcel;
> +
> + __u64 guest_phys_addr;
> + struct page **pages;
> + unsigned long npages;
> +};
> +
> struct gh_vm {
> u16 vmid;
> struct gh_rm *rm;
>
> struct work_struct free_work;
> + struct mutex mm_lock;
> + struct list_head memory_mappings;
> };
>
> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
> +struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
> +
> #endif
> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> new file mode 100644
> index 000000000000..03e71a36ea3b
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -0,0 +1,235 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/mm.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static inline bool page_contiguous(phys_addr_t p, phys_addr_t t)
> +{
> + return t - p == PAGE_SIZE;
> +}
> +
> +static struct gh_vm_mem *__gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
> + __must_hold(&ghvm->mm_lock)
> +{
> + struct gh_vm_mem *mapping;
> +
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list)
> + if (mapping->parcel.label == label)
> + return mapping;
> +
> + return NULL;
> +}
> +
> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
> + __must_hold(&ghvm->mm_lock)
> +{
> + int i, ret = 0;
> +
> + if (mapping->parcel.mem_handle != GH_MEM_HANDLE_INVAL) {
> + ret = gh_rm_mem_reclaim(ghvm->rm, &mapping->parcel);
> + if (ret)
> + pr_warn("Failed to reclaim memory parcel for label %d: %d\n",
> + mapping->parcel.label, ret);

what the behavoir of hypervisor if we failed to reclaim the pages?

> + }
> +
> + if (!ret)
So we will leave the user pages pinned if hypervisor call fails, but
further down we free the mapping all together.

Am not 100% sure if this will have any side-effect, but is it okay to
leave user-pages pinned with no possiblity of unpinning them in such cases?


> + for (i = 0; i < mapping->npages; i++)
> + unpin_user_page(mapping->pages[i]);
> +
> + kfree(mapping->pages);
> + kfree(mapping->parcel.acl_entries);
> + kfree(mapping->parcel.mem_entries);
> +
> + list_del(&mapping->list);
> +}
> +
> +struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
> +{
> + struct gh_vm_mem *mapping;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ERR_PTR(ret);
new line would be nice here.

> + mapping = __gh_vm_mem_find(ghvm, label);
> + mutex_unlock(&ghvm->mm_lock);
new line would be nice here.

> + return mapping ? : ERR_PTR(-ENODEV);
> +}
> +
> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
> +{
> + struct gh_vm_mem *mapping, *tmp_mapping;
> + struct gh_rm_mem_entry *mem_entries;
> + phys_addr_t curr_page, prev_page;
> + struct gh_rm_mem_parcel *parcel;
> + int i, j, pinned, ret = 0;
> + size_t entry_size;
> + u16 vmid;
> +
> + if (!gh_api_has_feature(GH_API_FEATURE_MEMEXTENT))
> + return -EOPNOTSUPP;

Should this not be first thing to do in ioctl before even entering this
function?

> +
> + if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
> + !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
> + return -EINVAL;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
new line.

> + mapping = __gh_vm_mem_find(ghvm, region->label);
> + if (mapping) {
> + mutex_unlock(&ghvm->mm_lock);
> + return -EEXIST;
> + }
> +
> + mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
> + if (!mapping) {
> + ret = -ENOMEM;
> + goto free_mapping;

how about,

mutex_unlock(&ghvm->mm_lock);
return -ENMEM;

> + }
> +
> + mapping->parcel.label = region->label;
> + mapping->guest_phys_addr = region->guest_phys_addr;
> + mapping->npages = region->memory_size >> PAGE_SHIFT;
> + parcel = &mapping->parcel;
> + parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
> + parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
> +
> + /* Check for overlap */
> + list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
> + if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
> + tmp_mapping->guest_phys_addr) ||
> + (mapping->guest_phys_addr >=
> + tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
> + ret = -EEXIST;
> + goto free_mapping;
> + }
> + }
> +
> + list_add(&mapping->list, &ghvm->memory_mappings);
> +
> + mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
> + if (!mapping->pages) {
> + ret = -ENOMEM;
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
> + }
> +
> + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
> + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
> + if (pinned < 0) {
> + ret = pinned;
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
> + } else if (pinned != mapping->npages) {
> + ret = -EFAULT;
> + mapping->npages = pinned; /* update npages for reclaim */
> + goto reclaim;
> + }
> +
> + if (region->flags & GH_MEM_LENT) {
> + parcel->n_acl_entries = 1;
> + mapping->share_type = VM_MEM_LEND;
> + } else {
> + parcel->n_acl_entries = 2;
> + mapping->share_type = VM_MEM_SHARE;
> + }
> + parcel->acl_entries = kcalloc(parcel->n_acl_entries, sizeof(*parcel->acl_entries),
> + GFP_KERNEL);
> + if (!parcel->acl_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
new line
> + if (region->flags & GH_MEM_ALLOW_READ)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_R;
> + if (region->flags & GH_MEM_ALLOW_WRITE)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_W;
> + if (region->flags & GH_MEM_ALLOW_EXEC)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_X;
> +
> + if (mapping->share_type == VM_MEM_SHARE) {
> + ret = gh_rm_get_vmid(ghvm->rm, &vmid);
> + if (ret)
> + goto reclaim;
> +
> + parcel->acl_entries[1].vmid = cpu_to_le16(vmid);
> + /* Host assumed to have all these permissions. Gunyah will not
> + * grant new permissions if host actually had less than RWX
> + */
> + parcel->acl_entries[1].perms |= GH_RM_ACL_R | GH_RM_ACL_W | GH_RM_ACL_X;
> + }
> +
> + mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries), GFP_KERNEL);
> + if (!mem_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + /* reduce number of entries by combining contiguous pages into single memory entry */
> + prev_page = page_to_phys(mapping->pages[0]);
> + mem_entries[0].ipa_base = cpu_to_le64(prev_page);
> + entry_size = PAGE_SIZE;
new line
> + for (i = 1, j = 0; i < mapping->npages; i++) {
> + curr_page = page_to_phys(mapping->pages[i]);
> + if (page_contiguous(prev_page, curr_page)) {
> + entry_size += PAGE_SIZE;
> + } else {
> + mem_entries[j].size = cpu_to_le64(entry_size);
> + j++;
> + mem_entries[j].ipa_base = cpu_to_le64(curr_page);
> + entry_size = PAGE_SIZE;
> + }
> +
> + prev_page = curr_page;
> + }
> + mem_entries[j].size = cpu_to_le64(entry_size);
> +
> + parcel->n_mem_entries = j + 1;
> + parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) * parcel->n_mem_entries,
> + GFP_KERNEL);
> + kfree(mem_entries);
> + if (!parcel->mem_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + mutex_unlock(&ghvm->mm_lock);
> + return 0;
> +reclaim:
> + gh_vm_mem_reclaim(ghvm, mapping);
> +free_mapping:
> + kfree(mapping);
> + mutex_unlock(&ghvm->mm_lock);
> + return ret;
> +}
> +
> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
> +{
> + struct gh_vm_mem *mapping;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
> +
> + mapping = __gh_vm_mem_find(ghvm, label);
> + if (!mapping)
> + goto out;
> +
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> +out:
> + mutex_unlock(&ghvm->mm_lock);
> + return ret;
> +}
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index 10ba32d2b0a6..d85d12119a48 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -20,4 +20,37 @@
> */
> #define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
>
> +/*
> + * ioctls for VM fds
> + */
> +
> +/**
> + * struct gh_userspace_memory_region - Userspace memory descripion for GH_VM_SET_USER_MEM_REGION
> + * @label: Unique identifer to the region.
> + * @flags: Flags for memory parcel behavior
> + * @guest_phys_addr: Location of the memory region in guest's memory space (page-aligned)#

Note about overlapping here would be useful.

> + * @memory_size: Size of the region (page-aligned)
> + * @userspace_addr: Location of the memory region in caller (userspace)'s memory
> + *
> + * See Documentation/virt/gunyah/vm-manager.rst for further details.
> + */
> +struct gh_userspace_memory_region {
> + __u32 label;
> +#define GH_MEM_ALLOW_READ (1UL << 0)
> +#define GH_MEM_ALLOW_WRITE (1UL << 1)
> +#define GH_MEM_ALLOW_EXEC (1UL << 2)
> +/*
> + * The guest will be lent the memory instead of shared.
> + * In other words, the guest has exclusive access to the memory region and the host loses access.
> + */
> +#define GH_MEM_LENT (1UL << 3)
> + __u32 flags;
> + __u64 guest_phys_addr;
> + __u64 memory_size;
> + __u64 userspace_addr;
> +};
> +
> +#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> + struct gh_userspace_memory_region)
> +
> #endif

2023-02-21 12:44:31

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

* Srinivas Kandagatla <[email protected]> [2023-02-21 12:28:53]:

> > +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
> > + __must_hold(&ghvm->mm_lock)
> > +{
> > + int i, ret = 0;
> > +
> > + if (mapping->parcel.mem_handle != GH_MEM_HANDLE_INVAL) {
> > + ret = gh_rm_mem_reclaim(ghvm->rm, &mapping->parcel);
> > + if (ret)
> > + pr_warn("Failed to reclaim memory parcel for label %d: %d\n",
> > + mapping->parcel.label, ret);
>
> what the behavoir of hypervisor if we failed to reclaim the pages?
>
> > + }
> > +
> > + if (!ret)
> So we will leave the user pages pinned if hypervisor call fails, but further
> down we free the mapping all together.

I think we should cleanup and bail out here, rather than try continuing past the
error. For ex: imagine userspace were to reclaim with VM still running. We would
leave the pages pinned AFAICS (even after VM terminates later) and also not
return any error to userspace indicating failure to reclaim.


2023-02-21 12:45:34

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

* Elliot Berman <[email protected]> [2023-02-14 13:24:16]:

> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
> +{
> + struct gh_vm_mem *mapping, *tmp_mapping;
> + struct gh_rm_mem_entry *mem_entries;
> + phys_addr_t curr_page, prev_page;
> + struct gh_rm_mem_parcel *parcel;
> + int i, j, pinned, ret = 0;
> + size_t entry_size;
> + u16 vmid;
> +
> + if (!gh_api_has_feature(GH_API_FEATURE_MEMEXTENT))
> + return -EOPNOTSUPP;
> +
> + if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
> + !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
> + return -EINVAL;

Check for wraps also:

region->guest_phys_addr + region->memory_size > region->guest_phys_addr


2023-02-21 13:06:45

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot

* Elliot Berman <[email protected]> [2023-02-14 13:24:26]:

> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> + struct gh_vm_mem *mapping;
> + u64 dtb_offset;
> + u32 mem_handle;
> + int ret;
> +
> + down_write(&ghvm->status_lock);
> + if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> + up_write(&ghvm->status_lock);
> + return 0;
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> + switch (mapping->share_type) {
> + case VM_MEM_LEND:
> + ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
> + break;
> + case VM_MEM_SHARE:
> + ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
> + break;
> + }
> + if (ret) {
> + pr_warn("Failed to %s parcel %d: %d\n",
> + mapping->share_type == VM_MEM_LEND ? "lend" : "share",
> + mapping->parcel.label,
> + ret);
> + goto err;
> + }
> + }
> +
> + mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, ghvm->dtb_config.size);

It may be some optimization to derive DTB 'mapping' in the first loop you have
above (that lends/shares all mappings)


> + if (!mapping) {
> + pr_warn("Failed to find the memory_handle for DTB\n");
> + ret = -EINVAL;
> + goto err;
> + }

2023-02-21 13:07:28

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager

* Elliot Berman <[email protected]> [2023-02-14 13:23:54]:

> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
> +{
> + struct gh_vm *ghvm;
> + struct file *file;
> + int fd, err;
> +
> + /* arg reserved for future use. */
> + if (arg)
> + return -EINVAL;
> +
> + ghvm = gh_vm_alloc(rm);
> + if (IS_ERR(ghvm))
> + return PTR_ERR(ghvm);
> +
> + fd = get_unused_fd_flags(O_CLOEXEC);
> + if (fd < 0) {
> + err = fd;
> + goto err_destroy_vm;
> + }
> +
> + file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> + if (IS_ERR(file)) {
> + err = PTR_ERR(file);
> + goto err_put_fd;
> + }
> +
> + fd_install(fd, file);
> +
> + return fd;
> +
> +err_put_fd:
> + put_unused_fd(fd);
> +err_destroy_vm:
> + kfree(ghvm);

Need a put_gh_rm() also in this case

> + return err;
> +}

2023-02-21 13:08:25

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH v10 19/26] gunyah: vm_mgr: Add framework to add VM Functions

* Elliot Berman <[email protected]> [2023-02-14 13:25:30]:

> +int __must_check gh_vm_get(struct gh_vm *ghvm)

Minor comment:

get_gh_rm vs gh_vm_get -> can follow some consistent convention I think.

Perhaps get_gh_vm()?


> +{
> + return kref_get_unless_zero(&ghvm->kref);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_get);

2023-02-21 14:18:10

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot



On 14/02/2023 21:24, Elliot Berman wrote:
>
> Add remaining ioctls to support non-proxy VM boot:
>
> - Gunyah Resource Manager uses the VM's devicetree to configure the
> virtual machine. The location of the devicetree in the guest's
> virtual memory can be declared via the SET_DTB_CONFIG ioctl.
> - Trigger start of the virtual machine with VM_START ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/vm_mgr.c | 229 ++++++++++++++++++++++++++++++--
> drivers/virt/gunyah/vm_mgr.h | 10 ++
> drivers/virt/gunyah/vm_mgr_mm.c | 23 ++++
> include/linux/gunyah_rsc_mgr.h | 6 +
> include/uapi/linux/gunyah.h | 13 ++
> 5 files changed, 268 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index 84102bac03cc..fa324385ade5 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -9,37 +9,114 @@
> #include <linux/file.h>
> #include <linux/gunyah_rsc_mgr.h>
> #include <linux/miscdevice.h>
> +#include <linux/mm.h>
> #include <linux/module.h>
>
> #include <uapi/linux/gunyah.h>
>
> #include "vm_mgr.h"
>
> +static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
> +{
> + struct gh_rm_vm_status_payload *payload = data;
> +
> + if (payload->vmid != ghvm->vmid)
> + return NOTIFY_OK;
Is this even possible? If yes, then this is a bug somewhere, we should
not be getting notifications for something that does not belong to this vm.
What is the typical case for such behavior? comment would be useful.


> +
> + /* All other state transitions are synchronous to a corresponding RM call */
> + if (payload->vm_status == GH_RM_VM_STATUS_RESET) {
> + down_write(&ghvm->status_lock);
> + ghvm->vm_status = payload->vm_status;
> + up_write(&ghvm->status_lock);
> + wake_up(&ghvm->vm_status_wait);
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
> +{
> + struct gh_rm_vm_exited_payload *payload = data;
> +
> + if (payload->vmid != ghvm->vmid)
> + return NOTIFY_OK;
same

> +
> + down_write(&ghvm->status_lock);
> + ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> + up_write(&ghvm->status_lock);
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int gh_vm_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
> +{
> + struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
> +
> + switch (action) {
> + case GH_RM_NOTIFICATION_VM_STATUS:
> + return gh_vm_rm_notification_status(ghvm, data);
> + case GH_RM_NOTIFICATION_VM_EXITED:
> + return gh_vm_rm_notification_exited(ghvm, data);
> + default:
> + return NOTIFY_OK;
> + }
> +}
> +
> +static void gh_vm_stop(struct gh_vm *ghvm)
> +{
> + int ret;
> +
> + down_write(&ghvm->status_lock);
> + if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
> + ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
> + if (ret)
> + pr_warn("Failed to stop VM: %d\n", ret);
Should we not bail out from this fail path?


> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> + up_write(&ghvm->status_lock);
> +}
> +
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> - mutex_lock(&ghvm->mm_lock);
> - list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> - gh_vm_mem_reclaim(ghvm, mapping);
> - kfree(mapping);
> + switch (ghvm->vm_status) {
> +unknown_state:

Never seen this style of using goto from switch to a new label in
switch case. Am sure this is some kinda trick but its not helping readers.

Can we rewrite this using a normal semantics.

may be a do while could help.


> + case GH_RM_VM_STATUS_RUNNING:
> + gh_vm_stop(ghvm);
> + fallthrough;
> + case GH_RM_VM_STATUS_INIT_FAILED:
> + case GH_RM_VM_STATUS_LOAD:
> + case GH_RM_VM_STATUS_LOAD_FAILED:
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> + }
> + mutex_unlock(&ghvm->mm_lock);
> + fallthrough;
> + case GH_RM_VM_STATUS_NO_STATE:
> + ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> + if (ret)
> + pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> + gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> + put_gh_rm(ghvm->rm);
> + kfree(ghvm);
> + break;
> + default:
> + pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
vm_status did not change do we not endup here again?

> + goto unknown_state;
> }
> - mutex_unlock(&ghvm->mm_lock);
> -
> - ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> - if (ret)
> - pr_warn("Failed to deallocate vmid: %d\n", ret);
> -
> - put_gh_rm(ghvm->rm);
> - kfree(ghvm);
> }
>
> static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> {
> struct gh_vm *ghvm;
> - int vmid;
> + int vmid, ret;
>
> vmid = gh_rm_alloc_vmid(rm, 0);
> if (vmid < 0)
> @@ -56,13 +133,123 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> ghvm->vmid = vmid;
> ghvm->rm = rm;
>
> + init_waitqueue_head(&ghvm->vm_status_wait);
> + ghvm->nb.notifier_call = gh_vm_rm_notification;
> + ret = gh_rm_notifier_register(rm, &ghvm->nb);
> + if (ret) {
> + put_gh_rm(rm);
> + gh_rm_dealloc_vmid(rm, vmid);
> + kfree(ghvm);
> + return ERR_PTR(ret);
> + }
> +
> mutex_init(&ghvm->mm_lock);
> INIT_LIST_HEAD(&ghvm->memory_mappings);
> + init_rwsem(&ghvm->status_lock);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
> + ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>
> return ghvm;
> }
>
> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> + struct gh_vm_mem *mapping;
> + u64 dtb_offset;
> + u32 mem_handle;
> + int ret;
> +
> + down_write(&ghvm->status_lock);
> + if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> + up_write(&ghvm->status_lock);
> + return 0;
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +

<------
should we not take ghvm->mm_lock here to make sure that list is
consistent while processing.
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> + switch (mapping->share_type) {
> + case VM_MEM_LEND:
> + ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
> + break;
> + case VM_MEM_SHARE:
> + ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
> + break;
> + }
> + if (ret) {
> + pr_warn("Failed to %s parcel %d: %d\n",
> + mapping->share_type == VM_MEM_LEND ? "lend" : "share",
> + mapping->parcel.label,
> + ret);
> + goto err;
> + }
> + }
--->

> +
> + mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, ghvm->dtb_config.size);
> + if (!mapping) {
> + pr_warn("Failed to find the memory_handle for DTB\n");

What wil happen to the mappings that are lend or shared?

> + ret = -EINVAL;
> + goto err;
> + }
> +
> + mem_handle = mapping->parcel.mem_handle;
> + dtb_offset = ghvm->dtb_config.gpa - mapping->guest_phys_addr;
> +
> + ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, mem_handle,

where is authentication mechanism (auth) comming from? Who is supposed
to set this value?

Should it come from userspace? if so I do not see any UAPI facility to
do that via VM_START ioctl.


> + 0, 0, dtb_offset, ghvm->dtb_config.size);
> + if (ret) {
> + pr_warn("Failed to configure VM: %d\n", ret);
> + goto err;
> + }
> +
> + ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
> + if (ret) {
> + pr_warn("Failed to initialize VM: %d\n", ret);
> + goto err;
> + }
> +
> + ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
> + if (ret) {
> + pr_warn("Failed to start VM: %d\n", ret);
> + goto err;
> + }
> +
> + ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
> + up_write(&ghvm->status_lock);
> + return ret;
> +err:
> + ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
> + up_write(&ghvm->status_lock);

Am really not sure if we are doing right thing in the error path, there
are multiple cases that seems to be not handled or if it was not
required no comments to clarify this are documented.
ex: if vm start fails then what happes with memory mapping or do we need
to un-configure vm or un-init vm from hypervisor side?

if none of this is required its useful to add come clear comments.

> + return ret;
> +}
> +
> +static int gh_vm_ensure_started(struct gh_vm *ghvm)
> +{
> + int ret;
> +
> +retry:
> + ret = down_read_interruptible(&ghvm->status_lock);
> + if (ret)
> + return ret;
> +
> + /* Unlikely because VM is typically started */
> + if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
> + up_read(&ghvm->status_lock);
> + ret = gh_vm_start(ghvm);
> + if (ret)
> + goto out;
> + goto retry;
> + }

do while will do better job here w.r.t to readablity.

> +
> + /* Unlikely because VM is typically running */
> + if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
> + ret = -ENODEV;
> +
> +out:
> + up_read(&ghvm->status_lock);
> + return ret;
> +}
> +
> static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> {
> struct gh_vm *ghvm = filp->private_data;
> @@ -88,6 +275,22 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> r = gh_vm_mem_free(ghvm, region.label);
> break;
> }
> + case GH_VM_SET_DTB_CONFIG: {
> + struct gh_vm_dtb_config dtb_config;
> +
> + if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
> + return -EFAULT;
> +
> + dtb_config.size = PAGE_ALIGN(dtb_config.size);
> + ghvm->dtb_config = dtb_config;
> +
> + r = 0;
> + break;
> + }
> + case GH_VM_START: {
> + r = gh_vm_ensure_started(ghvm);
> + break;
> + }
> default:
> r = -ENOTTY;
> break;
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 97bc00c34878..e9cf56647cc2 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -10,6 +10,8 @@
> #include <linux/list.h>
> #include <linux/miscdevice.h>
> #include <linux/mutex.h>
> +#include <linux/rwsem.h>
> +#include <linux/wait.h>
>
> #include <uapi/linux/gunyah.h>
>
> @@ -33,6 +35,13 @@ struct gh_vm_mem {
> struct gh_vm {
> u16 vmid;
> struct gh_rm *rm;
> + enum gh_rm_vm_auth_mechanism auth;
> + struct gh_vm_dtb_config dtb_config;
> +
> + struct notifier_block nb;
> + enum gh_rm_vm_status vm_status;
> + wait_queue_head_t vm_status_wait;
> + struct rw_semaphore status_lock;
>
> struct work_struct free_work;
> struct mutex mm_lock;
> @@ -43,5 +52,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *regio
> void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
> int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
> struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size);
>
> #endif
> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> index 03e71a36ea3b..128b90da555a 100644
> --- a/drivers/virt/gunyah/vm_mgr_mm.c
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -52,6 +52,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
> list_del(&mapping->list);
> }
>
> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size)
naming is bit missleading we already have
gh_vm_mem_find/__gh_vm_mem_find which is returning mapping based on label
now with gh_vm_mem_find_mapping() is doing same thing but with address.

Can we rename them clearly
gh_vm_mem_find_mapping_by_label()
gh_vm_mem_find_mapping_by_addr()

> +{

> + struct gh_vm_mem *mapping = NULL;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ERR_PTR(ret);
> +
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> + if (gpa >= mapping->guest_phys_addr &&
> + (gpa + size <= mapping->guest_phys_addr +
> + (mapping->npages << PAGE_SHIFT))) {
> + goto unlock;
> + }
> + }
> +
> + mapping = NULL;
> +unlock:
> + mutex_unlock(&ghvm->mm_lock);
> + return mapping;
> +}
> +
> struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
> {
> struct gh_vm_mem *mapping;
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 2d8b8b6cc394..9cffee6f9b4e 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -32,6 +32,12 @@ struct gh_rm_vm_exited_payload {
> #define GH_RM_NOTIFICATION_VM_EXITED 0x56100001
>
> enum gh_rm_vm_status {
> + /**
> + * RM doesn't have a state where load partially failed because
> + * only Linux
> + */
> + GH_RM_VM_STATUS_LOAD_FAILED = -1,
> +
> GH_RM_VM_STATUS_NO_STATE = 0,
> GH_RM_VM_STATUS_INIT = 1,
> GH_RM_VM_STATUS_READY = 2,
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index d85d12119a48..d899bba6a4c6 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -53,4 +53,17 @@ struct gh_userspace_memory_region {
> #define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> struct gh_userspace_memory_region)
>
> +/**
> + * struct gh_vm_dtb_config - Set the location of the VM's devicetree blob
> + * @gpa: Address of the VM's devicetree in guest memory.
> + * @size: Maximum size of the devicetree.
> + */
> +struct gh_vm_dtb_config {
> + __u64 gpa;
> + __u64 size;
> +};
> +#define GH_VM_SET_DTB_CONFIG _IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
> +
> +#define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
> +
> #endif

2023-02-21 14:52:03

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 15/26] gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim



On 14/02/2023 21:24, Elliot Berman wrote:
>
> On Qualcomm platforms, there is a firmware entity which controls access
> to physical pages. In order to share memory with another VM, this entity
> needs to be informed that the guest VM should have access to the memory.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Kconfig | 4 ++
> drivers/virt/gunyah/Makefile | 1 +
> drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 3 +
> drivers/virt/gunyah/rsc_mgr_rpc.c | 12 +++-
> include/linux/gunyah_rsc_mgr.h | 17 +++++
> 6 files changed, 115 insertions(+), 2 deletions(-)
> create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
>
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index 1a737694c333..de815189dab6 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -4,6 +4,7 @@ config GUNYAH
> tristate "Gunyah Virtualization drivers"
> depends on ARM64
> depends on MAILBOX
> + select GUNYAH_PLATFORM_HOOKS
> help
> The Gunyah drivers are the helper interfaces that run in a guest VM
> such as basic inter-VM IPC and signaling mechanisms, and higher level
> @@ -11,3 +12,6 @@ config GUNYAH
>
> Say Y/M here to enable the drivers needed to interact in a Gunyah
> virtual environment.
> +
> +config GUNYAH_PLATFORM_HOOKS
> + tristate
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index ff8bc4925392..6b8f84dbfe0d 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> +obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>
> gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
> new file mode 100644
> index 000000000000..e67e2361b592
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
> @@ -0,0 +1,80 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/rwsem.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include "rsc_mgr.h"
> +
> +static struct gunyah_rm_platform_ops *rm_platform_ops;
> +static DECLARE_RWSEM(rm_platform_ops_lock);

Why do we need this read/write lock or this global rm_platform_ops here,
AFAIU, there will be only one instance of platform_ops per platform.

This should be a core part of the gunyah and its driver early setup,
that should give us pretty much lock less behaviour.

We should be able to determine by Hypervisor UUID that its on Qualcomm
platform or not, during early gunyah setup which should help us setup
the platfrom ops accordingly.

This should also help cleanup some of the gunyah code that was added
futher down in this patchset.


--srini

> +
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + int ret = 0;
> +
> + down_read(&rm_platform_ops_lock);
> + if (rm_platform_ops && rm_platform_ops->pre_mem_share)
> + ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
> + up_read(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
> +
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + int ret = 0;
> +
> + down_read(&rm_platform_ops_lock);
> + if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
> + ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
> + up_read(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
> +
> +int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
> +{
> + int ret = 0;
> +
> + down_write(&rm_platform_ops_lock);
> + if (!rm_platform_ops)
> + rm_platform_ops = platform_ops;
> + else
> + ret = -EEXIST;
> + up_write(&rm_platform_ops_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
> +
> +void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
> +{
> + down_write(&rm_platform_ops_lock);
> + if (rm_platform_ops == platform_ops)
> + rm_platform_ops = NULL;
> + up_write(&rm_platform_ops_lock);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
> +
> +static void _devm_gh_rm_unregister_platform_ops(void *data)
> +{
> + gh_rm_unregister_platform_ops(data);
> +}
> +
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gunyah_rm_platform_ops *ops)
> +{
> + int ret;
> +
> + ret = gh_rm_register_platform_ops(ops);
> + if (ret)
> + return ret;
> +
> + return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops, ops);
> +}
> +EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Platform Hooks");
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index 9b23cefe02b0..e536169df41e 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,6 +74,9 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> void **resp_buf, size_t *resp_buff_size);
>
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +
> /* Message IDs: Memory Management */
> #define GH_RM_RPC_MEM_LEND 0x51000012
> #define GH_RM_RPC_MEM_SHARE 0x51000013
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index 0c83b097fec9..0b12696bf069 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -116,6 +116,12 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
> if (!msg)
> return -ENOMEM;
>
> + ret = gh_rm_platform_pre_mem_share(rm, p);
> + if (ret) {
> + kfree(msg);
> + return ret;
> + }
> +
> req_header = msg;
> acl_section = (void *)req_header + sizeof(*req_header);
> mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
> @@ -139,8 +145,10 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
> ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
> kfree(msg);
>
> - if (ret)
> + if (ret) {
> + gh_rm_platform_post_mem_reclaim(rm, p);
> return ret;
> + }
>
> p->mem_handle = le32_to_cpu(*resp);
>
> @@ -204,7 +212,7 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> if (ret)
> return ret;
>
> - return ret;
> + return gh_rm_platform_post_mem_reclaim(rm, parcel);
> }
>
> /**
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 9cffee6f9b4e..dc05d5b1e1a3 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -147,4 +147,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> struct gh_rm_hyp_resources **resources);
> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>
> +struct gunyah_rm_platform_ops {
> + int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> + int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +};
> +
> +#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
> +int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops);
> +void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops);
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gunyah_rm_platform_ops *ops);
> +#else
> +static inline int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
> + { return 0; }
> +static inline void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops) { }
> +static inline int devm_gh_rm_register_platform_ops(struct device *dev,
> + struct gunyah_rm_platform_ops *ops) { return 0; }
> +#endif
> +
> #endif

2023-02-21 14:55:37

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 16/26] firmware: qcom_scm: Register Gunyah platform ops



On 14/02/2023 21:24, Elliot Berman wrote:
>
> Qualcomm platforms have a firmware entity which performs access control
> to physical pages. Dynamically started Gunyah virtual machines use the
> QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
> to the memory used by guest VMs. Gunyah doesn't do this operation for us
> since it is the current VM (typically VMID_HLOS) delegating the access
> and not Gunyah itself. Use the Gunyah platform ops to achieve this so
> that only Qualcomm platforms attempt to make the needed SCM calls.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/firmware/Kconfig | 2 +
> drivers/firmware/qcom_scm.c | 100 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 102 insertions(+)
>
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index b59e3041fd62..b888068ff6f2 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -214,6 +214,8 @@ config MTK_ADSP_IPC
>
> config QCOM_SCM
> tristate
> + select VIRT_DRIVERS
> + select GUNYAH_PLATFORM_HOOKS

This is really making all the Qualcomm platforms either with Gunyah and
non-Gunyah hypervisors to enable VIRT_DRIVERS and GUNYAH_PLATFORM_HOOKS
in there kernel builds, that is not right way to do this.

SCM is used as library so lets keep it that way, I have added some
comments on platform hooks patch and potential way I see that this can
be done without making SCM aware of GUNAYAH internals.

--srini
>
> config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
> bool "Qualcomm download mode enabled by default"
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index 468d4d5ab550..875040982b48 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -20,6 +20,7 @@
> #include <linux/clk.h>
> #include <linux/reset-controller.h>
> #include <linux/arm-smccc.h>
> +#include <linux/gunyah_rsc_mgr.h>
>
> #include "qcom_scm.h"
>
> @@ -30,6 +31,9 @@ module_param(download_mode, bool, 0);
> #define SCM_HAS_IFACE_CLK BIT(1)
> #define SCM_HAS_BUS_CLK BIT(2)
>
> +#define QCOM_SCM_RM_MANAGED_VMID 0x3A
> +#define QCOM_SCM_MAX_MANAGED_VMID 0x3F
> +
> struct qcom_scm {
> struct device *dev;
> struct clk *core_clk;
> @@ -1297,6 +1301,99 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
> }
> EXPORT_SYMBOL(qcom_scm_lmh_dcvsh);
>
> +static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + struct qcom_scm_vmperm *new_perms;
> + u64 src, src_cpy;
> + int ret = 0, i, n;
> + u16 vmid;
> +
> + new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
> + if (!new_perms)
> + return -ENOMEM;
> +
> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> + new_perms[n].vmid = vmid;
> + else
> + new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;
> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
> + new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
> + new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
> + if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
> + new_perms[n].perm |= QCOM_SCM_PERM_READ;
> + }
> +
> + src = (1ull << QCOM_SCM_VMID_HLOS);
> +
> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
> + src_cpy = src;
> + ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
> + le64_to_cpu(mem_parcel->mem_entries[i].size),
> + &src_cpy, new_perms, mem_parcel->n_acl_entries);
> + if (ret) {
> + src = 0;
> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> + src |= (1ull << vmid);
> + else
> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
> + }
> +
> + new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
> +
> + for (i--; i >= 0; i--) {
> + src_cpy = src;
> + WARN_ON_ONCE(qcom_scm_assign_mem(
> + le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
> + le64_to_cpu(mem_parcel->mem_entries[i].size),
> + &src_cpy, new_perms, 1));
> + }
> + break;
> + }
> + }
> +
> + kfree(new_perms);
> + return ret;
> +}
> +
> +static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> + struct qcom_scm_vmperm new_perms;
> + u64 src = 0, src_cpy;
> + int ret = 0, i, n;
> + u16 vmid;
> +
> + new_perms.vmid = QCOM_SCM_VMID_HLOS;
> + new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE | QCOM_SCM_PERM_READ;
> +
> + for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> + vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> + if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> + src |= (1ull << vmid);
> + else
> + src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
> + }
> +
> + for (i = 0; i < mem_parcel->n_mem_entries; i++) {
> + src_cpy = src;
> + ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].ipa_base),
> + le64_to_cpu(mem_parcel->mem_entries[i].size),
> + &src_cpy, &new_perms, 1);
> + WARN_ON_ONCE(ret);
> + }
> +
> + return ret;
> +}
> +
> +static struct gunyah_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
> + .pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
> + .post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
> +};
> +
> static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
> {
> struct device_node *tcsr;
> @@ -1500,6 +1597,9 @@ static int qcom_scm_probe(struct platform_device *pdev)
> if (download_mode)
> qcom_scm_set_download_mode(true);
>
> + if (devm_gh_rm_register_platform_ops(&pdev->dev, &qcom_scm_gh_rm_platform_ops))
> + dev_warn(__scm->dev, "Gunyah RM platform ops were already registered\n");
> +
> return 0;
> }
>

2023-02-21 17:47:25

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 18/26] virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource



On 14/02/2023 21:25, Elliot Berman wrote:
>
> When booting a Gunyah virtual machine, the host VM may gain capabilities
> to interact with resources for the guest virtual machine. Examples of
> such resources are vCPUs or message queues. To use those resources, we
> need to translate the RM response into a gunyah_resource structure which
> are useful to Linux drivers. Presently, Linux drivers need only to know
> the type of resource, the capability ID, and an interrupt.
>
> On ARM64 systems, the interrupt reported by Gunyah is the GIC interrupt
> ID number and always a SPI.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> arch/arm64/include/asm/gunyah.h | 23 +++++
> drivers/virt/gunyah/rsc_mgr.c | 161 +++++++++++++++++++++++++++++++-
> include/linux/gunyah.h | 4 +
> include/linux/gunyah_rsc_mgr.h | 4 +
> 4 files changed, 191 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/gunyah.h
>
> diff --git a/arch/arm64/include/asm/gunyah.h b/arch/arm64/include/asm/gunyah.h
> new file mode 100644
> index 000000000000..64cfb964efee
> --- /dev/null
> +++ b/arch/arm64/include/asm/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +#ifndef __ASM_GUNYAH_H_
> +#define __ASM_GUNYAH_H_
> +
> +#include <linux/irq.h>
> +#include <dt-bindings/interrupt-controller/arm-gic.h>
> +
> +static inline int arch_gh_fill_irq_fwspec_params(u32 virq, struct irq_fwspec *fwspec)
> +{
> + if (virq < 32 || virq > 1019)
> + return -EINVAL;
> +
> + fwspec->param_count = 3;
> + fwspec->param[0] = GIC_SPI;
> + fwspec->param[1] = virq - 32;
> + fwspec->param[2] = IRQ_TYPE_EDGE_RISING;
> + return 0;
> +}
> +
> +#endif
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 73c5a6b7cbbc..eb1bc3f68792 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -18,6 +18,8 @@
> #include <linux/platform_device.h>
> #include <linux/miscdevice.h>
>
> +#include <asm/gunyah.h>
> +
> #include "rsc_mgr.h"
> #include "vm_mgr.h"
>
> @@ -107,8 +109,137 @@ struct gh_rm {
> struct blocking_notifier_head nh;
>
> struct miscdevice miscdev;
> + struct irq_domain *irq_domain;
> +};
> +
> +struct gh_irq_chip_data {
> + u32 gh_virq;
> +};
> +
> +static struct irq_chip gh_rm_irq_chip = {
> + .name = "Gunyah",
> + .irq_enable = irq_chip_enable_parent,
> + .irq_disable = irq_chip_disable_parent,
> + .irq_ack = irq_chip_ack_parent,
> + .irq_mask = irq_chip_mask_parent,
> + .irq_mask_ack = irq_chip_mask_ack_parent,
> + .irq_unmask = irq_chip_unmask_parent,
> + .irq_eoi = irq_chip_eoi_parent,
> + .irq_set_affinity = irq_chip_set_affinity_parent,
> + .irq_set_type = irq_chip_set_type_parent,
> + .irq_set_wake = irq_chip_set_wake_parent,
> + .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
> + .irq_retrigger = irq_chip_retrigger_hierarchy,
> + .irq_get_irqchip_state = irq_chip_get_parent_state,
> + .irq_set_irqchip_state = irq_chip_set_parent_state,
> + .flags = IRQCHIP_SET_TYPE_MASKED |
> + IRQCHIP_SKIP_SET_WAKE |
> + IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int gh_rm_irq_domain_alloc(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs,
> + void *arg)
> +{
> + struct gh_irq_chip_data *chip_data, *spec = arg;
> + struct irq_fwspec parent_fwspec;
> + struct gh_rm *rm = d->host_data;
> + u32 gh_virq = spec->gh_virq;
> + int ret;
> +
> + if (nr_irqs != 1 || gh_virq == U32_MAX)
> + return -EINVAL;
> +
> + chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL);
> + if (!chip_data)
> + return -ENOMEM;
> +
> + chip_data->gh_virq = gh_virq;
> +
> + ret = irq_domain_set_hwirq_and_chip(d, virq, chip_data->gh_virq, &gh_rm_irq_chip,
> + chip_data);
> + if (ret)

leaking chip_data?

> + return ret;
> +
> + parent_fwspec.fwnode = d->parent->fwnode;
> + ret = arch_gh_fill_irq_fwspec_params(chip_data->gh_virq, &parent_fwspec);
> + if (ret) {
> + dev_err(rm->dev, "virq translation failed %u: %d\n", chip_data->gh_virq, ret);
> + goto err_free_irq_data;
> + }
> +
> + ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, &parent_fwspec);
> + if (ret)
> + goto err_free_irq_data;
> +
> + return ret;
> +err_free_irq_data:
> + kfree(chip_data);
> + return ret;
> +}
> +
> +static void gh_rm_irq_domain_free_single(struct irq_domain *d, unsigned int virq)
> +{
> + struct gh_irq_chip_data *chip_data;
> + struct irq_data *irq_data;
> +
> + irq_data = irq_domain_get_irq_data(d, virq);
> + if (!irq_data)
> + return;
> +
> + chip_data = irq_data->chip_data;
> +
> + kfree(chip_data);
> + irq_data->chip_data = NULL;
> +}
> +
> +static void gh_rm_irq_domain_free(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < nr_irqs; i++)
> + gh_rm_irq_domain_free_single(d, virq);
> +}
> +
> +static const struct irq_domain_ops gh_rm_irq_domain_ops = {
> + .alloc = gh_rm_irq_domain_alloc,
> + .free = gh_rm_irq_domain_free,
> };
>
> +struct gunyah_resource *gh_rm_alloc_resource(struct gh_rm *rm,
> + struct gh_rm_hyp_resource *hyp_resource)
> +{
> + struct gunyah_resource *ghrsc;
> +
> + ghrsc = kzalloc(sizeof(*ghrsc), GFP_KERNEL);
> + if (!ghrsc)
> + return NULL;
return ERR_PTR(-ENOMEM);

> +
> + ghrsc->type = hyp_resource->type;
> + ghrsc->capid = le64_to_cpu(hyp_resource->cap_id);
> + ghrsc->irq = IRQ_NOTCONNECTED;
> + ghrsc->rm_label = le32_to_cpu(hyp_resource->resource_label);
> + if (hyp_resource->virq && le32_to_cpu(hyp_resource->virq) != U32_MAX) {
> + struct gh_irq_chip_data irq_data = {
> + .gh_virq = le32_to_cpu(hyp_resource->virq),
> + };
> +
> + ghrsc->irq = irq_domain_alloc_irqs(rm->irq_domain, 1, NUMA_NO_NODE, &irq_data);
> + if (ghrsc->irq < 0) {
> + pr_err("Failed to allocate interrupt for resource %d label: %d: %d\n",
> + ghrsc->type, ghrsc->rm_label, ghrsc->irq);
> + ghrsc->irq = IRQ_NOTCONNECTED;
> + }
> + }
> +
> + return ghrsc;
> +}
> +
> +void gh_rm_free_resource(struct gunyah_resource *ghrsc)
> +{
> + irq_dispose_mapping(ghrsc->irq);
> + kfree(ghrsc);
> +}
> +
> static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
> {
> struct gh_rm_connection *connection;
> @@ -553,6 +684,8 @@ static int gh_msgq_platform_probe_direction(struct platform_device *pdev,
>
> static int gh_rm_drv_probe(struct platform_device *pdev)
> {
> + struct irq_domain *parent_irq_domain;
> + struct device_node *parent_irq_node;
> struct gh_msgq_tx_data *msg;
> struct gh_rm *rm;
> int ret;
> @@ -590,15 +723,40 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
> if (ret)
> goto err_cache;
>
> + parent_irq_node = of_irq_find_parent(pdev->dev.of_node);
> + if (!parent_irq_node) {
> + dev_err(&pdev->dev, "Failed to find interrupt parent of resource manager\n");
> + ret = -ENODEV;
> + goto err_msgq;
> + }
> +
> + parent_irq_domain = irq_find_host(parent_irq_node);
> + if (!parent_irq_domain) {
> + dev_err(&pdev->dev, "Failed to find interrupt parent domain of resource manager\n");
> + ret = -ENODEV;
> + goto err_msgq;
> + }
> +
> + rm->irq_domain = irq_domain_add_hierarchy(parent_irq_domain, 0, 0, pdev->dev.of_node,
> + &gh_rm_irq_domain_ops, NULL);
> + if (!rm->irq_domain) {
> + dev_err(&pdev->dev, "Failed to add irq domain\n");
> + ret = -ENODEV;
> + goto err_msgq;
> + }
> + rm->irq_domain->host_data = rm;
> +
> rm->miscdev.name = "gunyah";
> rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> rm->miscdev.fops = &gh_dev_fops;
>
> ret = misc_register(&rm->miscdev);
> if (ret)
> - goto err_msgq;
> + goto err_irq_domain;
>
> return 0;
> +err_irq_domain:
> + irq_domain_remove(rm->irq_domain);
> err_msgq:
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> @@ -612,6 +770,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
> struct gh_rm *rm = platform_get_drvdata(pdev);
>
> misc_deregister(&rm->miscdev);
> + irq_domain_remove(rm->irq_domain);
> mbox_free_channel(gh_msgq_chan(&rm->msgq));
> gh_msgq_remove(&rm->msgq);
> kmem_cache_destroy(rm->cache);
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 2e13669c6363..a06d5fa68a65 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -27,6 +27,10 @@ struct gunyah_resource {
> enum gunyah_resource_type type;
> u64 capid;
> int irq;
> +
> + /* To help allocator of resource manager */
> + struct list_head list;

Looks like unused?


> + u32 rm_label;
> };
>
> /**
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index dc05d5b1e1a3..2fb6efbe2f70 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -147,6 +147,10 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> struct gh_rm_hyp_resources **resources);
> int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>
> +struct gunyah_resource *gh_rm_alloc_resource(struct gh_rm *rm,
> + struct gh_rm_hyp_resource *hyp_resource);
> +void gh_rm_free_resource(struct gunyah_resource *ghrsc);
> +
> struct gunyah_rm_platform_ops {
> int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);

2023-02-21 17:58:44

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 19/26] gunyah: vm_mgr: Add framework to add VM Functions



On 21/02/2023 13:07, Srivatsa Vaddagiri wrote:
> * Elliot Berman <[email protected]> [2023-02-14 13:25:30]:
>
>> +int __must_check gh_vm_get(struct gh_vm *ghvm)
>
> Minor comment:
>
> get_gh_rm vs gh_vm_get -> can follow some consistent convention I think.
>
> Perhaps get_gh_vm()?

it should be other way around

currently we have combinations of gh_vm and some other pattern, we
should stick with one, in this case gh_vm_* or gh_rm_* makes more sense

here are all the exported symbols in gunyah.

./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_function_register);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_function_unregister);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_add_resource_ticket);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_remove_resource_ticket);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_mmio_write);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_add_io_handler);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_remove_io_handler);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_get);
./drivers/virt/gunyah/vm_mgr.c:EXPORT_SYMBOL_GPL(gh_vm_put);
./drivers/virt/gunyah/rsc_mgr.c:EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
./drivers/virt/gunyah/rsc_mgr.c:EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
./drivers/virt/gunyah/rsc_mgr.c:EXPORT_SYMBOL_GPL(get_gh_rm);
./drivers/virt/gunyah/rsc_mgr.c:EXPORT_SYMBOL_GPL(put_gh_rm);
./drivers/virt/gunyah/gunyah.c:EXPORT_SYMBOL_GPL(gh_api_version);
./drivers/virt/gunyah/gunyah.c:EXPORT_SYMBOL_GPL(gh_api_has_feature);
./drivers/virt/gunyah/rsc_mgr_rpc.c:EXPORT_SYMBOL_GPL(gh_rm_get_vmid);
./drivers/virt/gunyah/gunyah_platform_hooks.c:EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
./drivers/virt/gunyah/gunyah_platform_hooks.c:EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
./drivers/virt/gunyah/gunyah_platform_hooks.c:EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
./drivers/virt/gunyah/gunyah_platform_hooks.c:EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
./drivers/virt/gunyah/gunyah_platform_hooks.c:EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);

>
>
>> +{
>> + return kref_get_unless_zero(&ghvm->kref);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_get);

2023-02-21 21:23:21

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 15/26] gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim



On 2/21/2023 6:51 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:24, Elliot Berman wrote:
[snip]
>> +
>> +static struct gunyah_rm_platform_ops *rm_platform_ops;
>> +static DECLARE_RWSEM(rm_platform_ops_lock);
>
> Why do we need this read/write lock or this global rm_platform_ops here,
> AFAIU, there will be only one instance of platform_ops per platform.
>
> This should be a core part of the gunyah and its driver early setup,
> that should give us pretty much lock less behaviour.
>
> We should be able to determine by Hypervisor UUID that its on Qualcomm
> platform or not, during early gunyah setup which should help us setup
> the platfrom ops accordingly.
>
> This should also help cleanup some of the gunyah code that was added
> futher down in this patchset.

I'm guessing the direction to take is:

config GUNYAH
select QCOM_SCM if ARCH_QCOM

and have vm_mgr call directly into qcom_scm driver if the UID matches?

We have an Android requirement to enable CONFIG_GUNYAH=y and
CONFIG_QCOM_SCM=m, but it wouldn't be possible with this design. The
platform hooks implementation allows GUNYAH and QCOM_SCM to be enabled
without setting lower bound of the other.

- Elliot

2023-02-22 00:28:03

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager



On 2/21/2023 2:46 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:23, Elliot Berman wrote:
>>
>> Gunyah VM manager is a kernel moduel which exposes an interface to
>> Gunyah userspace to load, run, and interact with other Gunyah virtual
>> machines. The interface is a character device at /dev/gunyah.
>>
>> Add a basic VM manager driver. Upcoming patches will add more ioctls
>> into this driver.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   .../userspace-api/ioctl/ioctl-number.rst      |   1 +
>>   drivers/virt/gunyah/Makefile                  |   2 +-
>>   drivers/virt/gunyah/rsc_mgr.c                 |  37 +++++-
>>   drivers/virt/gunyah/vm_mgr.c                  | 118 ++++++++++++++++++
>>   drivers/virt/gunyah/vm_mgr.h                  |  22 ++++
>>   include/uapi/linux/gunyah.h                   |  23 ++++
>>   6 files changed, 201 insertions(+), 2 deletions(-)
>>   create mode 100644 drivers/virt/gunyah/vm_mgr.c
>>   create mode 100644 drivers/virt/gunyah/vm_mgr.h
>>   create mode 100644 include/uapi/linux/gunyah.h
>>
>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst
>> b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> index 0a1882e296ae..2513324ae7be 100644
>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> @@ -137,6 +137,7 @@ Code  Seq#    Include
>> File                                           Comments
>>   'F'   DD     video/sstfb.h
>> conflict!
>>   'G'   00-3F  drivers/misc/sgi-gru/grulib.h
>> conflict!
>>   'G'   00-0F  xen/gntalloc.h, xen/gntdev.h
>> conflict!
>> +'G'   00-0f  linux/gunyah.h
>> conflict!
>>   'H'   00-7F  linux/hiddev.h
>> conflict!
>>   'H'   00-0F  linux/hidraw.h
>> conflict!
>>   'H'   01     linux/mei.h
>> conflict!
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index de29769f2f3f..03951cf82023 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>>   obj-$(CONFIG_GUNYAH) += gunyah.o
>> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
>>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr.c
>> b/drivers/virt/gunyah/rsc_mgr.c
>> index 2a47139873a8..73c5a6b7cbbc 100644
>> --- a/drivers/virt/gunyah/rsc_mgr.c
>> +++ b/drivers/virt/gunyah/rsc_mgr.c
>> @@ -16,8 +16,10 @@
>>   #include <linux/completion.h>
>>   #include <linux/gunyah_rsc_mgr.h>
>>   #include <linux/platform_device.h>
>> +#include <linux/miscdevice.h>
>>   #include "rsc_mgr.h"
>> +#include "vm_mgr.h"
>>   #define RM_RPC_API_VERSION_MASK        GENMASK(3, 0)
>>   #define RM_RPC_HEADER_WORDS_MASK    GENMASK(7, 4)
>> @@ -103,6 +105,8 @@ struct gh_rm {
>>       struct kmem_cache *cache;
>>       struct mutex send_lock;
>>       struct blocking_notifier_head nh;
>> +
>> +    struct miscdevice miscdev;
>>   };
>>   static struct gh_rm_connection *gh_rm_alloc_connection(__le32
>> msg_id, u8 type)
>> @@ -509,6 +513,21 @@ void put_gh_rm(struct gh_rm *rm)
>>   }
>>   EXPORT_SYMBOL_GPL(put_gh_rm);
>> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd,
>> unsigned long arg)
>> +{
>> +    struct miscdevice *miscdev = filp->private_data;
>> +    struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
>> +
>> +    return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
>> +}
>> +
>> +static const struct file_operations gh_dev_fops = {
>> +    .owner        = THIS_MODULE,
>> +    .unlocked_ioctl    = gh_dev_ioctl,
>> +    .compat_ioctl    = compat_ptr_ioctl,
>> +    .llseek        = noop_llseek,
>> +};
>> +
>>   static int gh_msgq_platform_probe_direction(struct platform_device
>> *pdev,
>>                       bool tx, int idx, struct gunyah_resource *ghrsc)
>>   {
>> @@ -567,7 +586,22 @@ static int gh_rm_drv_probe(struct platform_device
>> *pdev)
>>       rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
>>       rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>> -    return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client,
>> &rm->tx_ghrsc, &rm->rx_ghrsc);
>> +    ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client,
>> &rm->tx_ghrsc, &rm->rx_ghrsc);
>> +    if (ret)
>> +        goto err_cache;
>> +
>> +    rm->miscdev.name = "gunyah";
>> +    rm->miscdev.minor = MISC_DYNAMIC_MINOR;
>> +    rm->miscdev.fops = &gh_dev_fops;
>> +
>> +    ret = misc_register(&rm->miscdev);
>> +    if (ret)
>> +        goto err_msgq;
>> +
>> +    return 0;
>> +err_msgq:
>> +    mbox_free_channel(gh_msgq_chan(&rm->msgq));
>> +    gh_msgq_remove(&rm->msgq);
>>   err_cache:
>>       kmem_cache_destroy(rm->cache);
>>       return ret;
>> @@ -577,6 +611,7 @@ static int gh_rm_drv_remove(struct platform_device
>> *pdev)
>>   {
>>       struct gh_rm *rm = platform_get_drvdata(pdev);
>> +    misc_deregister(&rm->miscdev);
>>       mbox_free_channel(gh_msgq_chan(&rm->msgq));
>>       gh_msgq_remove(&rm->msgq);
>>       kmem_cache_destroy(rm->cache);
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> new file mode 100644
>> index 000000000000..fd890a57172e
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -0,0 +1,118 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
>> +
>> +#include <linux/anon_inodes.h>
>> +#include <linux/file.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/miscdevice.h>
>> +#include <linux/module.h>
>> +
>> +#include <uapi/linux/gunyah.h>
>> +
>> +#include "vm_mgr.h"
>> +
>> +static void gh_vm_free(struct work_struct *work)
>> +{
>> +    struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
>> +    int ret;
>> +
>> +    ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>> +    if (ret)
>> +        pr_warn("Failed to deallocate vmid: %d\n", ret);
>> +
>> +    put_gh_rm(ghvm->rm);
>> +    kfree(ghvm);
>> +}
>> +
>> +static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>> +{
>> +    struct gh_vm *ghvm;
>> +    int vmid;
>> +
>> +    vmid = gh_rm_alloc_vmid(rm, 0);
>> +    if (vmid < 0)
>> +        return ERR_PTR(vmid);
>> +
>> +    ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
>> +    if (!ghvm) {
>> +        gh_rm_dealloc_vmid(rm, vmid);
>> +        return ERR_PTR(-ENOMEM);
>> +    }
>> +
>> +    get_gh_rm(rm);
>> +
>> +    ghvm->vmid = vmid;
>> +    ghvm->rm = rm;
>> +
>> +    INIT_WORK(&ghvm->free_work, gh_vm_free);
>> +
>> +    return ghvm;
>> +}
>> +
>> +static int gh_vm_release(struct inode *inode, struct file *filp)
>> +{
>> +    struct gh_vm *ghvm = filp->private_data;
>> +
>> +    /* VM will be reset and make RM calls which can interruptible sleep.
>> +     * Defer to a work so this thread can receive signal.
>> +     */
>> +    schedule_work(&ghvm->free_work);
>> +    return 0;
>> +}
>> +
>> +static const struct file_operations gh_vm_fops = {
>> +    .release = gh_vm_release,
>
>> +    .compat_ioctl    = compat_ptr_ioctl,
>
> This line should go with the patch that adds real ioctl
>

Done.

>> +    .llseek = noop_llseek,
>> +};
>> +
>> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
> Not sure what is the gain of this multiple levels of redirection.
>
> How about
>
> long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
> {
> ...
> }
>
> and rsc_mgr just call it as part of its ioctl call
>
> static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned
> long arg)
> {
>     struct miscdevice *miscdev = filp->private_data;
>     struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
>
>     switch (cmd) {
>     case GH_CREATE_VM:
>         return gh_dev_create_vm(rm, arg);
>     default:
>         return -ENOIOCTLCMD;
>     }
> }
>

I'm anticipating we will add further /dev/gunyah ioctls and I thought it
would be cleaner to have all that in vm_mgr.c itself.

>
>> +{
>> +    struct gh_vm *ghvm;
>> +    struct file *file;
>> +    int fd, err;
>> +
>> +    /* arg reserved for future use. */
>> +    if (arg)
>> +        return -EINVAL;
>
> The only code path I see here is via GH_CREATE_VM ioctl which obviously
> does not take any arguments, so if you are thinking of using the
> argument for architecture-specific VM flags.  Then this needs to be
> properly done by making the ABI aware of this.

It is documented in Patch 17 (Document Gunyah VM Manager)

+GH_CREATE_VM
+~~~~~~~~~~~~
+
+Creates a Gunyah VM. The argument is reserved for future use and must be 0.

>
> As you mentioned zero value arg imply an "unauthenticated VM" type, but
> this was not properly encoded in the userspace ABI. Why not make it
> future compatible. How about adding arguments to GH_CREATE_VM and pass
> the required information correctly.
> Note that once the ABI is accepted then you will not be able to change
> it, other than adding a new one.
>

Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure yet
what arguments to add here.

The ABI can add new "long" values to GH_CREATE_VM and that wouldn't
break compatibility with old kernels; old kernels reject it as -EINVAL.

>> +
>> +    ghvm = gh_vm_alloc(rm);
>> +    if (IS_ERR(ghvm))
>> +        return PTR_ERR(ghvm);
>> +
>> +    fd = get_unused_fd_flags(O_CLOEXEC);
>> +    if (fd < 0) {
>> +        err = fd;
>> +        goto err_destroy_vm;
>> +    }
>> +
>> +    file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
>> +    if (IS_ERR(file)) {
>> +        err = PTR_ERR(file);
>> +        goto err_put_fd;
>> +    }
>> +
>> +    fd_install(fd, file);
>> +
>> +    return fd;
>> +
>> +err_put_fd:
>> +    put_unused_fd(fd);
>> +err_destroy_vm:
>> +    kfree(ghvm);
>> +    return err;
>> +}
>> +
>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned
>> long arg)
>> +{
>> +    switch (cmd) {
>> +    case GH_CREATE_VM:
>> +        return gh_dev_ioctl_create_vm(rm, arg);
>> +    default:
>> +        return -ENOIOCTLCMD;
>> +    }
>> +}
>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>> new file mode 100644
>> index 000000000000..76954da706e9
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/vm_mgr.h
>> @@ -0,0 +1,22 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _GH_PRIV_VM_MGR_H
>> +#define _GH_PRIV_VM_MGR_H
>> +
>> +#include <linux/gunyah_rsc_mgr.h>
>> +
>> +#include <uapi/linux/gunyah.h>
>> +
>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned
>> long arg);
>> +
>> +struct gh_vm {
>> +    u16 vmid;
>> +    struct gh_rm *rm;
>> +
>> +    struct work_struct free_work;
>> +};
>> +
>> +#endif
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> new file mode 100644
>> index 000000000000..10ba32d2b0a6
>> --- /dev/null
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -0,0 +1,23 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _UAPI_LINUX_GUNYAH
>> +#define _UAPI_LINUX_GUNYAH
>> +
>> +/*
>> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/ioctl.h>
>> +
>> +#define GH_IOCTL_TYPE            'G'
>> +
>> +/*
>> + * ioctls for /dev/gunyah fds:
>> + */
>> +#define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns a
>> Gunyah VM fd */
>
> Can HLOS forcefully destroy a VM?
> If so should we have a corresponding DESTROY IOCTL?

It can forcefully destroy unauthenticated and protected virtual
machines. I don't have a userspace usecase for a DESTROY ioctl yet,
maybe this can be added later? By the way, the VM is forcefully
destroyed when VM refcount is dropped to 0 (close(vm_fd) and any other
relevant file descriptors).

- Elliot

2023-02-22 10:22:02

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 15/26] gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim



On 21/02/2023 21:22, Elliot Berman wrote:
>
>
> On 2/21/2023 6:51 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 14/02/2023 21:24, Elliot Berman wrote:
> [snip]
>>> +
>>> +static struct gunyah_rm_platform_ops *rm_platform_ops;
>>> +static DECLARE_RWSEM(rm_platform_ops_lock);
>>
>> Why do we need this read/write lock or this global rm_platform_ops
>> here, AFAIU, there will be only one instance of platform_ops per
>> platform.
>>
>> This should be a core part of the gunyah and its driver early setup,
>> that should give us pretty much lock less behaviour.
>>
>> We should be able to determine by Hypervisor UUID that its on Qualcomm
>> platform or not, during early gunyah setup which should help us setup
>> the platfrom ops accordingly.
>>
>> This should also help cleanup some of the gunyah code that was added
>> futher down in this patchset.
>
> I'm guessing the direction to take is:
>
>   config GUNYAH
>     select QCOM_SCM if ARCH_QCOM

This is how other kernel drivers use SCM.

>
> and have vm_mgr call directly into qcom_scm driver if the UID matches?

Yes that is the plan, we could have these callbacks as part key data
structure like struct gh_rm and update it at very early in setup stage
based on UUID match.


>
> We have an Android requirement to enable CONFIG_GUNYAH=y and
> CONFIG_QCOM_SCM=m, but it wouldn't be possible with this design. The

Am not sure how this will work, if gunyah for QCOM Platform is depended
on SCM then there is no way that gunyah could be a inbuilt and make scm
a module.

On the other hand with the existing design gunyah will not be functional
until scm driver is loaded and platform hooks are registered. This
runtime dependency design does not express the dependency correctly and
the only way to know if gunyah is functional is keep trying which can
only work after scm driver is probed.

This also raises the design question on how much of platform hooks
dependency is captured at gunyah core and api level, with state of
current code /dev/gunyah will be created even without platform hooks and
let the userspace use it which then only fail at hyp call level.

Other issue with current design is, scm module can be unloaded under the
hood leaving gunyah with NULL pointers to those platform hook functions.
This is the kind of issues we could see if the dependency is not
expressed from bottom up.

The current design is not really capturing the depended components
accurately.

Considering platform hooks as a core resource to gunyah on Qualcomm
platform is something that needs attention. If we can fix that then it
might be doable to have QCOM_SCM=m and CONFIG_GUNYAH=y.


--srini
> platform hooks implementation allows GUNYAH and QCOM_SCM to be enabled
> without setting lower bound of the other.
>
> - Elliot

2023-02-22 14:08:24

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 19/26] gunyah: vm_mgr: Add framework to add VM Functions



On 14/02/2023 21:25, Elliot Berman wrote:
>
> Introduce a framework for Gunyah userspace to install VM functions. VM
> functions are optional interfaces to the virtual machine. vCPUs,
> ioeventfs, and irqfds are examples of such VM functions and are
> implemented in subsequent patches.
>
> A generic framework is implemented instead of individual ioctls to
> create vCPUs, irqfds, etc., in order to simplify the VM manager core
> implementation and allow dynamic loading of VM function modules.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> Documentation/virt/gunyah/vm-manager.rst | 18 ++
> drivers/virt/gunyah/vm_mgr.c | 240 ++++++++++++++++++++++-
> drivers/virt/gunyah/vm_mgr.h | 3 +
> include/linux/gunyah_vm_mgr.h | 80 ++++++++
> include/uapi/linux/gunyah.h | 17 ++
> 5 files changed, 353 insertions(+), 5 deletions(-)
> create mode 100644 include/linux/gunyah_vm_mgr.h
>
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> index c0126cfeadc7..5272a6e9145c 100644
> --- a/Documentation/virt/gunyah/vm-manager.rst
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -17,6 +17,24 @@ sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
> ioctl. The VM itself is configured to use the memory region via the
> devicetree.
>
> +Gunyah Functions
> +================
> +
> +Components of a Gunyah VM's configuration that need kernel configuration are
> +called "functions" and are built on top of a framework. Functions are identified
> +by a string and have some argument(s) to configure them. They are typically
> +created by the `GH_VM_ADD_FUNCTION` ioctl.
> +
> +Functions typically will always do at least one of these operations:
> +
> +1. Create resource ticket(s). Resource tickets allow a function to register
> + itself as the client for a Gunyah resource (e.g. doorbell or vCPU) and
> + the function is given the pointer to the `struct gunyah_resource` when the
> + VM is starting.
> +
> +2. Register IO handler(s). IO handlers allow a function to handle stage-2 faults
> + from the virtual machine.
> +
> Sample Userspace VMM
> ====================
>
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index fa324385ade5..e9c55e7dd1b3 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -6,8 +6,10 @@
> #define pr_fmt(fmt) "gh_vm_mgr: " fmt
>
> #include <linux/anon_inodes.h>
> +#include <linux/compat.h>
> #include <linux/file.h>
> #include <linux/gunyah_rsc_mgr.h>
> +#include <linux/gunyah_vm_mgr.h>
> #include <linux/miscdevice.h>
> #include <linux/mm.h>
> #include <linux/module.h>
> @@ -16,6 +18,177 @@
>
> #include "vm_mgr.h"
>
> +static DEFINE_MUTEX(functions_lock);
> +static DEFINE_IDR(functions);
Why are these global? Can these be not part of struc gh_rm?
Not to mention please move idr to xarrays.

> +
> +int gh_vm_function_register(struct gh_vm_function *drv)
> +{
> + int ret = 0;
> +
> + if (!drv->bind || !drv->unbind)
> + return -EINVAL;
> +
> + mutex_lock(&functions_lock);
> + if (idr_find(&functions, drv->type)) {
> + ret = -EEXIST;
> + goto out;
> + }
> +
> + INIT_LIST_HEAD(&drv->instances);
> + ret = idr_alloc(&functions, drv, drv->type, drv->type + 1, GFP_KERNEL);
> + if (ret > 0)
> + ret = 0;
> +out:
> + mutex_unlock(&functions_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_function_register);
> +
> +static void gh_vm_remove_function_instance(struct gh_vm_function_instance *inst)
> + __must_hold(functions_lock)
> +{
> + inst->fn->unbind(inst);
> + list_del(&inst->vm_list);
> + list_del(&inst->fn_list);
> + module_put(inst->fn->mod);
> + if (inst->arg_size)
> + kfree(inst->argp);
> + kfree(inst);
> +}
> +
> +void gh_vm_function_unregister(struct gh_vm_function *fn)
> +{
> + struct gh_vm_function_instance *inst, *iter;
> +
> + mutex_lock(&functions_lock);
> + list_for_each_entry_safe(inst, iter, &fn->instances, fn_list)
> + gh_vm_remove_function_instance(inst);

We should never have any instances as we have refcounted the module.

If there are any instances then its clearly a bug, as this will pull out
function under the hood while userspace is using it.


> + idr_remove(&functions, fn->type);
> + mutex_unlock(&functions_lock);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_function_unregister);
> +
> +static long gh_vm_add_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
> +{
> + struct gh_vm_function_instance *inst;
> + void __user *argp;
> + long r = 0;
> +
> + if (f->arg_size > GH_FN_MAX_ARG_SIZE)

lets print some useful error message to user.

> + return -EINVAL;
> +
> + inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + inst->arg_size = f->arg_size;
> + if (inst->arg_size) {
> + inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
> + if (!inst->arg) {
> + r = -ENOMEM;
> + goto free;
> + }
> +
> + argp = is_compat_task() ? compat_ptr(f->arg) : (void __user *) f->arg;

hmm, arg is not a data pointer it is a fixed size variable (__u64 arg),
so why are using compat_ptr() here?

you should be able to do

argp = u64_to_user_ptr(f->arg);

> + if (copy_from_user(inst->argp, argp, f->arg_size)) {
> + r = -EFAULT;
> + goto free_arg;
> + }
> + } else {
> + inst->arg = f->arg;
bit lost here, so, we treat the arg as both pointer and value in cases
where size is zero.

> + }
> +
<---
> + mutex_lock(&functions_lock);
> + inst->fn = idr_find(&functions, f->type);
> + if (!inst->fn) {
> + mutex_unlock(&functions_lock);
> + r = request_module("ghfunc:%d", f->type);
> + if (r)
> + goto unlock_free;
> +
> + mutex_lock(&functions_lock);
> + inst->fn = idr_find(&functions, f->type);
> + }
> +
> + if (!inst->fn) {
> + r = -ENOENT;
> + goto unlock_free;
> + }
> +
> + if (!try_module_get(inst->fn->mod)) {
> + r = -ENOENT;
> + inst->fn = NULL;
> + goto unlock_free;
> + }
> +
--->
can we do this snippet as a gh_vm_get_function() and corresponding
gh_vm_put_function(). that should make the code more cleaner.


> + inst->ghvm = ghvm;
> + inst->rm = ghvm->rm;
> +
> + r = inst->fn->bind(inst);
> + if (r < 0) {
> + module_put(inst->fn->mod);
> + goto unlock_free;
> + }
> +
> + list_add(&inst->vm_list, &ghvm->functions);

I guess its possible to add same functions with same argumentso to this
list, how are we preventing this to happen?

Is it a valid usecase?

> + list_add(&inst->fn_list, &inst->fn->instances);
> + mutex_unlock(&functions_lock);
> + return r;
> +unlock_free:
> + mutex_unlock(&functions_lock);
> +free_arg:
> + if (inst->arg_size)
> + kfree(inst->argp);
> +free:
> + kfree(inst);
> + return r;
> +}
> +
> +static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
> +{
> + struct gh_vm_function_instance *inst, *iter;
> + void __user *user_argp;
> + void *argp;
> + long r = 0;
> +
> + r = mutex_lock_interruptible(&functions_lock);
> + if (r)
> + return r;
> +
> + if (f->arg_size) {
> + argp = kzalloc(f->arg_size, GFP_KERNEL);
> + if (!argp) {
> + r = -ENOMEM;
> + goto out;
> + }
> +
> + user_argp = is_compat_task() ? compat_ptr(f->arg) : (void __user *) f->arg;

same comment as add;

> + if (copy_from_user(argp, user_argp, f->arg_size)) {
> + r = -EFAULT;
> + kfree(argp);
> + goto out;
> + }
> +
> + list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
> + if (inst->fn->type == f->type &&
> + f->arg_size == inst->arg_size &&
> + !memcmp(argp, inst->argp, f->arg_size))
> + gh_vm_remove_function_instance(inst);
> + }

leaking argp;

> + } else {
> + list_for_each_entry_safe(inst, iter, &ghvm->functions, vm_list) {
> + if (inst->fn->type == f->type &&
> + f->arg_size == inst->arg_size &&
> + inst->arg == f->arg)
> + gh_vm_remove_function_instance(inst);
> + }
> + }
> +
> +out:
> + mutex_unlock(&functions_lock);
> + return r;
> +}
> +
> static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
> {
> struct gh_rm_vm_status_payload *payload = data;
> @@ -80,6 +253,7 @@ static void gh_vm_stop(struct gh_vm *ghvm)
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + struct gh_vm_function_instance *inst, *iiter;
> struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> @@ -90,7 +264,13 @@ static void gh_vm_free(struct work_struct *work)
> fallthrough;
> case GH_RM_VM_STATUS_INIT_FAILED:
> case GH_RM_VM_STATUS_LOAD:
> - case GH_RM_VM_STATUS_LOAD_FAILED:
> + case GH_RM_VM_STATUS_EXITED:
> + mutex_lock(&functions_lock);
> + list_for_each_entry_safe(inst, iiter, &ghvm->functions, vm_list) {
> + gh_vm_remove_function_instance(inst);
> + }
> + mutex_unlock(&functions_lock);
> +
> mutex_lock(&ghvm->mm_lock);
> list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> gh_vm_mem_reclaim(ghvm, mapping);
> @@ -113,6 +293,28 @@ static void gh_vm_free(struct work_struct *work)
> }
> }
>
> +static void _gh_vm_put(struct kref *kref)
> +{
> + struct gh_vm *ghvm = container_of(kref, struct gh_vm, kref);
> +
> + /* VM will be reset and make RM calls which can interruptible sleep.
> + * Defer to a work so this thread can receive signal.
> + */
> + schedule_work(&ghvm->free_work);
> +}
> +
> +int __must_check gh_vm_get(struct gh_vm *ghvm)
> +{
> + return kref_get_unless_zero(&ghvm->kref);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_get);
> +
> +void gh_vm_put(struct gh_vm *ghvm)
> +{
> + kref_put(&ghvm->kref, _gh_vm_put);
> +}
> +EXPORT_SYMBOL_GPL(gh_vm_put);
> +
> static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> {
> struct gh_vm *ghvm;
> @@ -147,6 +349,8 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> INIT_LIST_HEAD(&ghvm->memory_mappings);
> init_rwsem(&ghvm->status_lock);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
> + kref_init(&ghvm->kref);
> + INIT_LIST_HEAD(&ghvm->functions);
> ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>
> return ghvm;
> @@ -291,6 +495,35 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> r = gh_vm_ensure_started(ghvm);
> break;
> }
> + case GH_VM_ADD_FUNCTION: {
> + struct gh_fn_desc *f;
> +
> + f = kzalloc(sizeof(*f), GFP_KERNEL);
> + if (!f)
> + return -ENOMEM;
> +
> + if (copy_from_user(f, argp, sizeof(*f)))
> + return -EFAULT;
> +
> + r = gh_vm_add_function(ghvm, f);
> + if (r < 0)
> + kfree(f);


we are memory leaking f here, we should free it irrespective of return
value. or I see no reason not to use this small struct from stack.


> + break;
> + }
> + case GH_VM_REMOVE_FUNCTION: {
> + struct gh_fn_desc *f;
> +
> + f = kzalloc(sizeof(*f), GFP_KERNEL);
> + if (!f)
> + return -ENOMEM;
> +
> + if (copy_from_user(f, argp, sizeof(*f)))
> + return -EFAULT;
> +
> + r = gh_vm_rm_function(ghvm, f);
> + kfree(f);
> + break;
> + }
> default:
> r = -ENOTTY;
> break;
> @@ -303,10 +536,7 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
> {
> struct gh_vm *ghvm = filp->private_data;
>
> - /* VM will be reset and make RM calls which can interruptible sleep.
> - * Defer to a work so this thread can receive signal.
> - */
> - schedule_work(&ghvm->free_work);
> + gh_vm_put(ghvm);
> return 0;
> }
>
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index e9cf56647cc2..4750d56c1297 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -8,6 +8,7 @@
>
> #include <linux/gunyah_rsc_mgr.h>
> #include <linux/list.h>
> +#include <linux/kref.h>
> #include <linux/miscdevice.h>
> #include <linux/mutex.h>
> #include <linux/rwsem.h>
> @@ -44,8 +45,10 @@ struct gh_vm {
> struct rw_semaphore status_lock;
>
> struct work_struct free_work;
> + struct kref kref;
> struct mutex mm_lock;
> struct list_head memory_mappings;
> + struct list_head functions;
> };
>
> int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
> diff --git a/include/linux/gunyah_vm_mgr.h b/include/linux/gunyah_vm_mgr.h
> new file mode 100644
> index 000000000000..f0a95af50b2e
> --- /dev/null
> +++ b/include/linux/gunyah_vm_mgr.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GUNYAH_VM_MGR_H
> +#define _GUNYAH_VM_MGR_H
> +
> +#include <linux/compiler_types.h>
> +#include <linux/gunyah.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/list.h>
> +#include <linux/mod_devicetable.h>
??

> +#include <linux/notifier.h>

??

> +
> +#include <uapi/linux/gunyah.h>
> +
> +struct gh_vm;
> +
> +int __must_check gh_vm_get(struct gh_vm *ghvm);
> +void gh_vm_put(struct gh_vm *ghvm);
> +
> +struct gh_vm_function_instance;
> +struct gh_vm_function {
> + u32 type; > + const char *name;
> + struct module *mod;
> + long (*bind)(struct gh_vm_function_instance *f);
> + void (*unbind)(struct gh_vm_function_instance *f);
> + struct mutex instances_lock;
> + struct list_head instances;
> +};
> +
> +/**
> + * struct gh_vm_function_instance - Represents one function instance
> + * @arg_size: size of user argument
> + * @arg: user argument to describe the function instance; arg_size is 0
> + * @argp: pointer to user argument
> + * @ghvm: Pointer to VM instance
> + * @rm: Pointer to resource manager for the VM instance
> + * @fn: The ops for the function
> + * @data: Private data for function
> + * @vm_list: for gh_vm's functions list
> + * @fn_list: for gh_vm_function's instances list
> + */
> +struct gh_vm_function_instance {
> + size_t arg_size;
> + union {
> + u64 arg;
> + void *argp;
> + };
> + struct gh_vm *ghvm;
> + struct gh_rm *rm;
> + struct gh_vm_function *fn;
> + void *data;
> + struct list_head vm_list;
> + struct list_head fn_list;
Am not seeing any advantage of storing the instance in two different
list, they look redundant to me. storing the function instances in vm
should be good IMO.


> +};
> +
> +int gh_vm_function_register(struct gh_vm_function *f);
> +void gh_vm_function_unregister(struct gh_vm_function *f);
> +
> +#define DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind) \
> + static struct gh_vm_function _name = { \
> + .type = _type, \
> + .name = __stringify(_name), \
> + .mod = THIS_MODULE, \
> + .bind = _bind, \
> + .unbind = _unbind, \
> + }; \
> + MODULE_ALIAS("ghfunc:"__stringify(_type))
> +
> +#define module_gunyah_vm_function(__gf) \
> + module_driver(__gf, gh_vm_function_register, gh_vm_function_unregister)
> +
> +#define DECLARE_GUNYAH_VM_FUNCTION_INIT(_name, _type, _bind, _unbind) \
> + DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind); \
> + module_gunyah_vm_function(_name)
> +
> +#endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index d899bba6a4c6..8df455a2a293 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -66,4 +66,21 @@ struct gh_vm_dtb_config {
>
> #define GH_VM_START _IO(GH_IOCTL_TYPE, 0x3)
>
> +#define GH_FN_MAX_ARG_SIZE 256
> +
> +/**
> + * struct gh_fn_desc - Arguments to create a VM function
> + * @type: Type of the function. See GH_FN_* macro for supported types
> + * @arg_size: Size of argument to pass to the function

a note on max arg size of 256 bytes would be useful.

> + * @arg: Value or pointer to argument given to the function

Treating this as value when arg_size == 0 is really confusing abi.
how about just use as arg as ptr to data along with arg_size;

--srini
> + */
> +struct gh_fn_desc {
> + __u32 type;
> + __u32 arg_size;
> + __u64 arg;
> +};
> +
> +#define GH_VM_ADD_FUNCTION _IOW(GH_IOCTL_TYPE, 0x4, struct gh_fn_desc)
> +#define GH_VM_REMOVE_FUNCTION _IOW(GH_IOCTL_TYPE, 0x7, struct gh_fn_desc)

Do you have an example of how add and rm ioctls are used w.r.t to arg, i
see that we check correcteness of arg in between add and remove.

--srini
> +
> #endif

2023-02-22 22:52:33

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 2/16/2023 11:37 PM, Greg Kroah-Hartman wrote:
> On Thu, Feb 16, 2023 at 09:40:52AM -0800, Elliot Berman wrote:
>>
>>
>> On 2/15/2023 10:43 PM, Greg Kroah-Hartman wrote:
>>> On Tue, Feb 14, 2023 at 01:23:25PM -0800, Elliot Berman wrote:
>>>> +struct gh_rm {
>>>> + struct device *dev;
>>>
>>> What device does this point to?
>>>
>>
>> The platform device.
>
> What platform device? And why a platform device?
>

This will be used for the dev_printk. It's presently also used for the
reference counting. From your comments below, I'll switch the reference
counting away from this platform device.

>>>> + struct gunyah_resource tx_ghrsc, rx_ghrsc;
>>>> + struct gh_msgq msgq;
>>>> + struct mbox_client msgq_client;
>>>> + struct gh_rm_connection *active_rx_connection;
>>>> + int last_tx_ret;
>>>> +
>>>> + struct idr call_idr;
>>>> + struct mutex call_idr_lock;
>>>> +
>>>> + struct kmem_cache *cache;
>>>> + struct mutex send_lock;
>>>> + struct blocking_notifier_head nh;
>>>> +};
>>>
>>> This obviously is the "device" that your system works on, so what are
>>> the lifetime rules of it? Why isn't is just a real 'struct device' in
>>> the system instead of a random memory blob with a pointer to a device?
>>>
>>> What controls the lifetime of this structure and where is the reference
>>> counting logic for it?
>>>
>>
>> The lifetime of the structure is bound by the platform device that above
>> struct device *dev points to. get_gh_rm and put_gh_rm increments the device
>> ref counter and ensures lifetime of the struct is also extended.
>
> But this really is "your" device, not the platform device. So make it a
> real one please as that is how the kernel's driver model works. Don't
> hang "magic structures" off of a random struct device and have them
> control the lifetime rules of the parent without actually being a device
> themself. This should make things simpler overall, not more complex,
> and allow you to expose things to userspace properly (right now your
> data is totally hidden.)

The "real" device I create here is the miscdev, so I think the
recommendation here is to do refcounting off that miscdev. Is this the
approach you were thinking of?

Thanks,
Elliot

2023-02-22 23:18:59

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 2/20/2023 10:10 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:23, Elliot Berman wrote:
>>
>> The resource manager is a special virtual machine which is always
>> running on a Gunyah system. It provides APIs for creating and destroying
>> VMs, secure memory management, sharing/lending of memory between VMs,
>> and setup of inter-VM communication. Calls to the resource manager are
>> made via message queues.
>>
>> This patch implements the basic probing and RPC mechanism to make those
>> API calls. Request/response calls can be made with gh_rm_call.
>> Drivers can also register to notifications pushed by RM via
>> gh_rm_register_notifier
>>
>> Specific API calls that resource manager supports will be implemented in
>> subsequent patches.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   drivers/virt/gunyah/Makefile   |   3 +
>>   drivers/virt/gunyah/rsc_mgr.c  | 604 +++++++++++++++++++++++++++++++++
>>   drivers/virt/gunyah/rsc_mgr.h  |  77 +++++
>>   include/linux/gunyah_rsc_mgr.h |  24 ++
>>   4 files changed, 708 insertions(+)
>>   create mode 100644 drivers/virt/gunyah/rsc_mgr.c
>>   create mode 100644 drivers/virt/gunyah/rsc_mgr.h
>>   create mode 100644 include/linux/gunyah_rsc_mgr.h
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index 34f32110faf9..cc864ff5abbb 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -1,3 +1,6 @@
>>   # SPDX-License-Identifier: GPL-2.0
>>   obj-$(CONFIG_GUNYAH) += gunyah.o
>> +
>> +gunyah_rsc_mgr-y += rsc_mgr.o
>> +obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr.c
>> b/drivers/virt/gunyah/rsc_mgr.c
>> new file mode 100644
>> index 000000000000..2a47139873a8
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr.c
>> @@ -0,0 +1,604 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/of.h>
>> +#include <linux/slab.h>
>> +#include <linux/mutex.h>
>> +#include <linux/sched.h>
>> +#include <linux/gunyah.h>
>> +#include <linux/module.h>
>> +#include <linux/of_irq.h>
>> +#include <linux/kthread.h>
> why do we need this?
>
>> +#include <linux/notifier.h>
>> +#include <linux/workqueue.h>
>> +#include <linux/completion.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/platform_device.h>
>> +
>> +#include "rsc_mgr.h"
>> +
>
> ...
>
>> +struct gh_rm {
>> +    struct device *dev;
>> +    struct gunyah_resource tx_ghrsc, rx_ghrsc;
>> +    struct gh_msgq msgq;
>> +    struct mbox_client msgq_client;
>> +    struct gh_rm_connection *active_rx_connection;
>> +    int last_tx_ret;
>> +
>
>> +    struct idr call_idr;
>> +    struct mutex call_idr_lock;
>
> IDR interface is deprecated you should use Xarrays instead here,
>
> Other good thing about Xarrays is that you need not worry about locking
> it uses RCU and internal spinlock, that should simiply code a bit here.
>
> more info at
> Documentation/core-api/xarray.rst
>

Done.

>> +
>> +    struct kmem_cache *cache;
>> +    struct mutex send_lock;
>> +    struct blocking_notifier_head nh;
>> +};
>> +
>> +static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id,
>> u8 type)
>> +{
>> +    struct gh_rm_connection *connection;
>> +
>> +    connection = kzalloc(sizeof(*connection), GFP_KERNEL);
>> +    if (!connection)
>> +        return ERR_PTR(-ENOMEM);
>> +
>> +    connection->type = type;
>> +    connection->msg_id = msg_id;
>> +
>> +    return connection;
>> +}
>> +
>> +static int gh_rm_init_connection_payload(struct gh_rm_connection
>> *connection, void *msg,
>> +                    size_t hdr_size, size_t msg_size)
>> +{
>> +    size_t max_buf_size, payload_size;
>> +    struct gh_rm_rpc_hdr *hdr = msg;
>> +
>> +    if (hdr_size > msg_size)
>> +        return -EINVAL;
>> +
>> +    payload_size = msg_size - hdr_size;
>> +
>> +    connection->num_fragments = FIELD_GET(RM_RPC_FRAGMENTS_MASK,
>> hdr->type);
>> +    connection->fragments_received = 0;
>> +
>> +    /* There's not going to be any payload, no need to allocate
>> buffer. */
>> +    if (!payload_size && !connection->num_fragments)
>> +        return 0;
>> +
>> +    if (connection->num_fragments > GH_RM_MAX_NUM_FRAGMENTS)
>> +        return -EINVAL;
>> +
>> +    max_buf_size = payload_size + (connection->num_fragments *
>> GH_RM_MAX_MSG_SIZE);
>> +
>> +    connection->payload = kzalloc(max_buf_size, GFP_KERNEL);
>> +    if (!connection->payload)
>> +        return -ENOMEM;
>> +
>> +    memcpy(connection->payload, msg + hdr_size, payload_size);
>> +    connection->size = payload_size;
>> +    return 0;
>> +}
>> +
>> +static void gh_rm_notif_work(struct work_struct *work)
>> +{
>> +    struct gh_rm_connection *connection = container_of(work, struct
>> gh_rm_connection,
>> +                                notification.work);
>> +    struct gh_rm *rm = connection->notification.rm;
>> +
>> +    blocking_notifier_call_chain(&rm->nh, connection->msg_id,
>> connection->payload);
>> +
>> +    put_gh_rm(rm);
>> +    kfree(connection->payload);
> if (connection->size)
>     kfree(connection->payload);
>
> should we check for payload size before freeing this, Normally kfree
> NULL should be safe, unless connection object is allocated uninitialized.
>

connection object is kzalloc'd, so it's always allocated initialized.

>> +    kfree(connection);
>> +}
>> +
>> +static struct gh_rm_connection *gh_rm_process_notif(struct gh_rm *rm,
>> void *msg, size_t msg_size)
>> +{
>> +    struct gh_rm_connection *connection;
>> +    struct gh_rm_rpc_hdr *hdr = msg;
>> +    int ret;
>> +
>> +    connection = gh_rm_alloc_connection(hdr->msg_id, RM_RPC_TYPE_NOTIF);
>> +    if (IS_ERR(connection)) {
>> +        dev_err(rm->dev, "Failed to alloc connection for
>> notification: %ld, dropping.\n",
>> +            PTR_ERR(connection));
>> +        return NULL;
>> +    }
>> +
>> +    get_gh_rm(rm);
>> +    connection->notification.rm = rm;
>> +    INIT_WORK(&connection->notification.work, gh_rm_notif_work);
>> +
>> +    ret = gh_rm_init_connection_payload(connection, msg,
>> sizeof(*hdr), msg_size);
>> +    if (ret) {
>> +        dev_err(rm->dev, "Failed to initialize connection buffer for
>> notification: %d\n",
>> +            ret);
> put_gh_rm(rm);
>
> is missing.
> or move the get and other lines after this check
>

Done.

>> +        kfree(connection);
>> +        return NULL;
>> +    }
>> +
>> +    return connection;
>> +}
>> +
>
>> +static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
>> +                  const void *req_buff, size_t req_buff_size,
>> +                  struct gh_rm_connection *connection)
>> +{
>> +    u8 msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_REQUEST);
>> +    size_t buff_size_remaining = req_buff_size;
>> +    const void *req_buff_curr = req_buff;
>> +    struct gh_msgq_tx_data *msg;
>> +    struct gh_rm_rpc_hdr *hdr;
>> +    u32 cont_fragments = 0;
>> +    size_t payload_size;
>> +    void *payload;
>> +    int ret;
>> +
>> +    if (req_buff_size)
>> +        cont_fragments = (req_buff_size - 1) / GH_RM_MAX_MSG_SIZE;
>> +
>> +    if (req_buff_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
>> +        pr_warn("Limit exceeded for the number of fragments: %u\n",
>> cont_fragments);
>> +        dump_stack();
>> +        return -E2BIG;
>> +    }
>> +
>> +    ret = mutex_lock_interruptible(&rm->send_lock);
>> +    if (ret)
>> +        return ret;
>> +
>> +    /* Consider also the 'request' packet for the loop count */
>> +    do {
>> +        msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
>> +        if (!msg) {
>> +            ret = -ENOMEM;
>> +            goto out;
>> +        }
>> +
>> +        /* Fill header */
>> +        hdr = (struct gh_rm_rpc_hdr *)msg->data;
>> +        hdr->api = RM_RPC_API;
>> +        hdr->type = msg_type | FIELD_PREP(RM_RPC_FRAGMENTS_MASK,
>> cont_fragments);
>> +        hdr->seq = cpu_to_le16(connection->reply.seq);
>> +        hdr->msg_id = cpu_to_le32(message_id);
>> +
>> +        /* Copy payload */
>> +        payload = hdr + 1;
>> +        payload_size = min(buff_size_remaining, GH_RM_MAX_MSG_SIZE);
>> +        memcpy(payload, req_buff_curr, payload_size);
>> +        req_buff_curr += payload_size;
>> +        buff_size_remaining -= payload_size;
>> +
>> +        /* Force the last fragment to immediately alert the receiver */
>> +        msg->push = !buff_size_remaining;
>> +        msg->length = sizeof(*hdr) + payload_size;
>> +
>> +        ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
>> +        if (ret < 0) {
>> +            kmem_cache_free(rm->cache, msg);
>> +            break;
>> +        }
>> +
>> +        if (rm->last_tx_ret) {
>> +            ret = rm->last_tx_ret;
>> +            break;
>> +        }
>> +
>> +        msg_type = FIELD_PREP(RM_RPC_TYPE_MASK,
>> RM_RPC_TYPE_CONTINUATION);
>> +    } while (buff_size_remaining);
>> +
>> +out:
>> +    mutex_unlock(&rm->send_lock);
>> +    return ret < 0 ? ret : 0;
>> +}
>> +
>> +/**
>> + * gh_rm_call: Achieve request-response type communication with RPC
>> + * @rm: Pointer to Gunyah resource manager internal data
>> + * @message_id: The RM RPC message-id
>> + * @req_buff: Request buffer that contains the payload
>> + * @req_buff_size: Total size of the payload
>> + * @resp_buf: Pointer to a response buffer
>> + * @resp_buff_size: Size of the response buffer
>> + *
>> + * Make a request to the RM-VM and wait for reply back. For a successful
>> + * response, the function returns the payload. The size of the
>> payload is set in
>> + * resp_buff_size. The resp_buf should be freed by the caller.
>> + *
>> + * req_buff should be not NULL for req_buff_size >0. If req_buff_size
>> == 0,
>> + * req_buff *can* be NULL and no additional payload is sent.
>> + *
>> + * Context: Process context. Will sleep waiting for reply.
>> + * Return: 0 on success. <0 if error.
>> + */
>> +int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff,
>> size_t req_buff_size,
>> +        void **resp_buf, size_t *resp_buff_size)
>> +{
>> +    struct gh_rm_connection *connection;
>> +    int ret;
>> +
>> +    /* message_id 0 is reserved. req_buff_size implies req_buf is not
>> NULL */
>> +    if (!message_id || (!req_buff && req_buff_size) || !rm)
>> +        return -EINVAL;
>> +
>> +    connection = gh_rm_alloc_connection(cpu_to_le32(message_id),
>> RM_RPC_TYPE_REPLY);
>> +    if (IS_ERR(connection))
>> +        return PTR_ERR(connection);
>> +
>> +    init_completion(&connection->reply.seq_done);
>> +
>> +    /* Allocate a new seq number for this connection */
>> +    mutex_lock(&rm->call_idr_lock);
>> +    ret = idr_alloc_cyclic(&rm->call_idr, connection, 0, U16_MAX,
>> +                        GFP_KERNEL);
>> +    mutex_unlock(&rm->call_idr_lock);
>> +    if (ret < 0)
>> +        goto out;
>
> new line.
>
>> +    connection->reply.seq = ret;
>> +
>> +    /* Send the request to the Resource Manager */
>> +    ret = gh_rm_send_request(rm, message_id, req_buff, req_buff_size,
>> connection);
>> +    if (ret < 0)
>> +        goto out;
>> +
>> +    /* Wait for response */
>> +    ret =
>> wait_for_completion_interruptible(&connection->reply.seq_done);
>> +    if (ret)
>> +        goto out;
>> +
>> +    /* Check for internal (kernel) error waiting for the response */
>> +    if (connection->reply.ret) {
>> +        ret = connection->reply.ret;
>> +        if (ret != -ENOMEM)
>> +            kfree(connection->payload);
>> +        goto out;
>> +    }
>> +
>> +    /* Got a response, did resource manager give us an error? */
>> +    if (connection->reply.rm_error != GH_RM_ERROR_OK) {
>> +        pr_warn("RM rejected message %08x. Error: %d\n", message_id,
>> +            connection->reply.rm_error);
>> +        dump_stack();
>> +        ret = gh_rm_remap_error(connection->reply.rm_error);
>> +        kfree(connection->payload);
>> +        goto out;
>> +    }
>> +
>> +    /* Everything looks good, return the payload */
>> +    *resp_buff_size = connection->size;
>> +    if (connection->size)
>> +        *resp_buf = connection->payload;
>> +    else {
>> +        /* kfree in case RM sent us multiple fragments but never any
>> data in
>> +         * those fragments. We would've allocated memory for it, but
>> connection->size == 0
>> +         */
>> +        kfree(connection->payload);
>> +    }
>> +
>> +out:
>> +    mutex_lock(&rm->call_idr_lock);
>> +    idr_remove(&rm->call_idr, connection->reply.seq);
>> +    mutex_unlock(&rm->call_idr_lock);
>> +    kfree(connection);
>> +    return ret;
>> +}
>> +
>> +
>> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb)
>> +{
>> +    return blocking_notifier_chain_register(&rm->nh, nb);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
>> +
>> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block
>> *nb)
>> +{
>> +    return blocking_notifier_chain_unregister(&rm->nh, nb);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
>> +
>> +void get_gh_rm(struct gh_rm *rm)
>> +{
>> +    get_device(rm->dev);
>> +}
>> +EXPORT_SYMBOL_GPL(get_gh_rm);
>
> Can we have some consistency in the exported symbol naming,
> we have two combinations now.
>
> EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
> EXPORT_SYMBOL_GPL(get_gh_rm);
>
> lets stick to one.

done.

>> +
>> +void put_gh_rm(struct gh_rm *rm)
>> +{
>> +    put_device(rm->dev);
>> +}
>> +EXPORT_SYMBOL_GPL(put_gh_rm);
>>
> ...
>
>> +
>> +static int gh_rm_drv_probe(struct platform_device *pdev)
>> +{
>> +    struct gh_msgq_tx_data *msg;
>> +    struct gh_rm *rm;
>> +    int ret;
>> +
> How are we ensuring that gunyah driver is probed before this driver?
>
>

Which driver?

>> +    rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
>> +    if (!rm)
>> +        return -ENOMEM;
>> +
>> +    platform_set_drvdata(pdev, rm);
>> +    rm->dev = &pdev->dev;
>> +
>> +    mutex_init(&rm->call_idr_lock);
>> +    idr_init(&rm->call_idr);
>> +    rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data,
>> GH_MSGQ_MAX_MSG_SIZE), 0,
>> +        SLAB_HWCACHE_ALIGN, NULL);
>> +    if (!rm->cache)
>> +        return -ENOMEM;
> new line here would be nice.
>

done.

>> +    mutex_init(&rm->send_lock);
>> +    BLOCKING_INIT_NOTIFIER_HEAD(&rm->nh);
>> +
>> +    ret = gh_msgq_platform_probe_direction(pdev, true, 0,
>> &rm->tx_ghrsc);
>> +    if (ret)
>> +        goto err_cache;
>> +
>> +    ret = gh_msgq_platform_probe_direction(pdev, false, 1,
>> &rm->rx_ghrsc);
>> +    if (ret)
>> +        goto err_cache;
>> +
>> +    rm->msgq_client.dev = &pdev->dev;
>> +    rm->msgq_client.tx_block = true;
>> +    rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
>> +    rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>> +
>> +    return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client,
>> &rm->tx_ghrsc, &rm->rx_ghrsc);
>> +err_cache:
>> +    kmem_cache_destroy(rm->cache);
>> +    return ret;
>> +}
>> +
>> +static int gh_rm_drv_remove(struct platform_device *pdev)
>> +{
>> +    struct gh_rm *rm = platform_get_drvdata(pdev);
>> +
>> +    mbox_free_channel(gh_msgq_chan(&rm->msgq));
>> +    gh_msgq_remove(&rm->msgq);
>> +    kmem_cache_destroy(rm->cache);
>> +
>> +    return 0;
>> +}
>> +
>> +static const struct of_device_id gh_rm_of_match[] = {
>> +    { .compatible = "gunyah-resource-manager" },
>> +    {}
>> +};
>> +MODULE_DEVICE_TABLE(of, gh_rm_of_match);
>> +
>> +static struct platform_driver gh_rm_driver = {
>> +    .probe = gh_rm_drv_probe,
>> +    .remove = gh_rm_drv_remove,
>> +    .driver = {
>> +        .name = "gh_rsc_mgr",
>> +        .of_match_table = gh_rm_of_match,
>> +    },
>> +};
>> +module_platform_driver(gh_rm_driver);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Gunyah Resource Manager Driver");
>> diff --git a/drivers/virt/gunyah/rsc_mgr.h
>> b/drivers/virt/gunyah/rsc_mgr.h
>> new file mode 100644
>> index 000000000000..d4e799a7526f
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr.h
>> @@ -0,0 +1,77 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +#ifndef __GH_RSC_MGR_PRIV_H
>> +#define __GH_RSC_MGR_PRIV_H
>> +
>> +#include <linux/gunyah.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/types.h>
>> +
> <------------------
>> +/* RM Error codes */
>> +enum gh_rm_error {
>> +    GH_RM_ERROR_OK            = 0x0,
>> +    GH_RM_ERROR_UNIMPLEMENTED    = 0xFFFFFFFF,
>> +    GH_RM_ERROR_NOMEM        = 0x1,
>> +    GH_RM_ERROR_NORESOURCE        = 0x2,
>> +    GH_RM_ERROR_DENIED        = 0x3,
>> +    GH_RM_ERROR_INVALID        = 0x4,
>> +    GH_RM_ERROR_BUSY        = 0x5,
>> +    GH_RM_ERROR_ARGUMENT_INVALID    = 0x6,
>> +    GH_RM_ERROR_HANDLE_INVALID    = 0x7,
>> +    GH_RM_ERROR_VALIDATE_FAILED    = 0x8,
>> +    GH_RM_ERROR_MAP_FAILED        = 0x9,
>> +    GH_RM_ERROR_MEM_INVALID        = 0xA,
>> +    GH_RM_ERROR_MEM_INUSE        = 0xB,
>> +    GH_RM_ERROR_MEM_RELEASED    = 0xC,
>> +    GH_RM_ERROR_VMID_INVALID    = 0xD,
>> +    GH_RM_ERROR_LOOKUP_FAILED    = 0xE,
>> +    GH_RM_ERROR_IRQ_INVALID        = 0xF,
>> +    GH_RM_ERROR_IRQ_INUSE        = 0x10,
>> +    GH_RM_ERROR_IRQ_RELEASED    = 0x11,
>> +};
>> +
>> +/**
>> + * gh_rm_remap_error() - Remap Gunyah resource manager errors into a
>> Linux error code
>> + * @gh_error: "Standard" return value from Gunyah resource manager
>> + */
>> +static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
>> +{
>> +    switch (rm_error) {
>> +    case GH_RM_ERROR_OK:
>> +        return 0;
>> +    case GH_RM_ERROR_UNIMPLEMENTED:
>> +        return -EOPNOTSUPP;
>> +    case GH_RM_ERROR_NOMEM:
>> +        return -ENOMEM;
>> +    case GH_RM_ERROR_NORESOURCE:
>> +        return -ENODEV;
>> +    case GH_RM_ERROR_DENIED:
>> +        return -EPERM;
>> +    case GH_RM_ERROR_BUSY:
>> +        return -EBUSY;
>> +    case GH_RM_ERROR_INVALID:
>> +    case GH_RM_ERROR_ARGUMENT_INVALID:
>> +    case GH_RM_ERROR_HANDLE_INVALID:
>> +    case GH_RM_ERROR_VALIDATE_FAILED:
>> +    case GH_RM_ERROR_MAP_FAILED:
>> +    case GH_RM_ERROR_MEM_INVALID:
>> +    case GH_RM_ERROR_MEM_INUSE:
>> +    case GH_RM_ERROR_MEM_RELEASED:
>> +    case GH_RM_ERROR_VMID_INVALID:
>> +    case GH_RM_ERROR_LOOKUP_FAILED:
>> +    case GH_RM_ERROR_IRQ_INVALID:
>> +    case GH_RM_ERROR_IRQ_INUSE:
>> +    case GH_RM_ERROR_IRQ_RELEASED:
>> +        return -EINVAL;
>> +    default:
>> +        return -EBADMSG;
>> +    }
>> +}
>> +
> ---------------->
>
> Only user for the error code coversion is within the rm driver, you
> should just move this to the .c file, I see no value of this in .h
> unless there are some other users for this.
>
>

Done.

>
>> +struct gh_rm;
>> +int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff,
>> size_t req_buff_size,
>> +        void **resp_buf, size_t *resp_buff_size);
>> +
>> +#endif
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> new file mode 100644
>> index 000000000000..c992b3188c8d
>> --- /dev/null
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -0,0 +1,24 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _GUNYAH_RSC_MGR_H
>> +#define _GUNYAH_RSC_MGR_H
>> +
>> +#include <linux/list.h>
>> +#include <linux/notifier.h>
>> +#include <linux/gunyah.h>
>> +
>> +#define GH_VMID_INVAL    U16_MAX
>> +
>> +/* Gunyah recognizes VMID0 as an alias to the current VM's ID */
>> +#define GH_VMID_SELF            0
>> +
>> +struct gh_rm;
>> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block
>> *nb);
>> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block
>> *nb);
>> +void get_gh_rm(struct gh_rm *rm);
>> +void put_gh_rm(struct gh_rm *rm);
>> +
>> +#endif

2023-02-23 00:15:41

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox



On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:23, Elliot Berman wrote:
>> Gunyah message queues are a unidirectional inter-VM pipe for messages up
>> to 1024 bytes. This driver supports pairing a receiver message queue and
>> a transmitter message queue to expose a single mailbox channel.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>   drivers/mailbox/Makefile                   |   2 +
>>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>>   include/linux/gunyah.h                      |  56 +++++
>>   4 files changed, 280 insertions(+)
>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>
>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>> b/Documentation/virt/gunyah/message-queue.rst
>> index 0667b3eb1ff9..082085e981e0 100644
>> --- a/Documentation/virt/gunyah/message-queue.rst
>> +++ b/Documentation/virt/gunyah/message-queue.rst
>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs
>> (and two capability IDs).
>>         |               |         |                 |
>> |               |
>>         |               |         |                 |
>> |               |
>>         +---------------+         +-----------------+
>> +---------------+
>> +
>> +Gunyah message queues are exposed as mailboxes. To create the
>> mailbox, create
>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY
>> interrupt,
>> +all messages in the RX message queue are read and pushed via the
>> `rx_callback`
>> +of the registered mbox_client.
>> +
>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>> +   :identifiers: gh_msgq_init
>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>> index fc9376117111..5f929bb55e9a 100644
>> --- a/drivers/mailbox/Makefile
>> +++ b/drivers/mailbox/Makefile
>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>
> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not
> CONFIG_GUNYAH_MBOX?
>

There was some previous discussion about this:

https://lore.kernel.org/all/[email protected]/

>> +
>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>> diff --git a/drivers/mailbox/gunyah-msgq.c
>> b/drivers/mailbox/gunyah-msgq.c
>> new file mode 100644
>> index 000000000000..03ffaa30ce9b
>> --- /dev/null
>> +++ b/drivers/mailbox/gunyah-msgq.c
>> @@ -0,0 +1,214 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/module.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/gunyah.h>
>> +#include <linux/printk.h>
>> +#include <linux/init.h>
>> +#include <linux/slab.h>
>> +#include <linux/wait.h>
>
> ...
>
>> +/* Fired when message queue transitions from "full" to "space
>> available" to send messages */
>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>> +{
>> +    struct gh_msgq *msgq = data;
>> +
>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired after sending message and hypercall told us there was more
>> space available. */
>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>
> Tasklets have been long deprecated, consider using workqueues in this
> particular case.
>

Workqueues have higher latency and tasklets came as recommendation from
Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.

I did some quick unscientific measurements of ~1000x samples. The median
latency for resource manager went from 25.5 us (tasklet) to 26 us
(workqueue) (2% slower). The mean went from 28.7 us to 32.5 us (13%
slower). Obviously, the outliers for workqueues were much more extreme.

>
>> +{
>> +    struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq,
>> txdone_tasklet);
>> +
>> +    mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
>> +}
>> +
>> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
>> +{
> ..
>
>> +    tasklet_schedule(&msgq->txdone_tasklet);
>> +
>> +    return 0;
>> +}
>> +
>> +static struct mbox_chan_ops gh_msgq_ops = {
>> +    .send_data = gh_msgq_send_data,
>> +};
>> +
>> +/**
>> + * gh_msgq_init() - Initialize a Gunyah message queue with an
>> mbox_client
>> + * @parent: optional, device parent used for the mailbox controller
>> + * @msgq: Pointer to the gh_msgq to initialize
>> + * @cl: A mailbox client to bind to the mailbox channel that the
>> message queue creates
>> + * @tx_ghrsc: optional, the transmission side of the message queue
>> + * @rx_ghrsc: optional, the receiving side of the message queue
>> + *
>> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most
>> message queue use cases come with
>> + * a pair of message queues to facilitate bidirectional
>> communication. When tx_ghrsc is set,
>> + * the client can send messages with
>> mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
>> + * is set, the mbox_client should register an .rx_callback() and the
>> message queue driver will
>> + * push all available messages upon receiving the RX ready interrupt.
>> The messages should be
>> + * consumed or copied by the client right away as the gh_msgq_rx_data
>> will be replaced/destroyed
>> + * after the callback.
>> + *
>> + * Returns - 0 on success, negative otherwise
>> + */
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct
>> mbox_client *cl,
>> +             struct gunyah_resource *tx_ghrsc, struct gunyah_resource
>> *rx_ghrsc)
>> +{
>> +    int ret;
>> +
>> +    /* Must have at least a tx_ghrsc or rx_ghrsc and that they are
>> the right device types */
>> +    if ((!tx_ghrsc && !rx_ghrsc) ||
>> +        (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
>> +        (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
>> +        return -EINVAL;
>> +
>> +    if (gh_api_version() != GUNYAH_API_V1) {
>> +        pr_err("Unrecognized gunyah version: %u. Currently supported:
>> %d\n",
> dev_err(parent
>
> would make this more useful
>

Done.

- Elliot

2023-02-23 00:51:18

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot



On 2/21/2023 6:17 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:24, Elliot Berman wrote:
>>
>> Add remaining ioctls to support non-proxy VM boot:
>>
>>   - Gunyah Resource Manager uses the VM's devicetree to configure the
>>     virtual machine. The location of the devicetree in the guest's
>>     virtual memory can be declared via the SET_DTB_CONFIGioctl.
>>   - Trigger start of the virtual machine with VM_START ioctl.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   drivers/virt/gunyah/vm_mgr.c    | 229 ++++++++++++++++++++++++++++++--
>>   drivers/virt/gunyah/vm_mgr.h    |  10 ++
>>   drivers/virt/gunyah/vm_mgr_mm.c |  23 ++++
>>   include/linux/gunyah_rsc_mgr.h  |   6 +
>>   include/uapi/linux/gunyah.h     |  13 ++
>>   5 files changed, 268 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> index 84102bac03cc..fa324385ade5 100644
>> --- a/drivers/virt/gunyah/vm_mgr.c
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -9,37 +9,114 @@
>>   #include <linux/file.h>
>>   #include <linux/gunyah_rsc_mgr.h>
>>   #include <linux/miscdevice.h>
>> +#include <linux/mm.h>
>>   #include <linux/module.h>
>>   #include <uapi/linux/gunyah.h>
>>   #include "vm_mgr.h"
>> +static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
>> +{
>> +    struct gh_rm_vm_status_payload *payload = data;
>> +
>> +    if (payload->vmid != ghvm->vmid)
>> +        return NOTIFY_OK;
> Is this even possible? If yes, then this is a bug somewhere, we should
> not be getting notifications for something that does not belong to this vm.
> What is the typical case for such behavior? comment would be useful.
>

VM manager has reigstered to receive all notifications. If there are
multiple VMs running, then the notifier callback receives notifications
about all VMs. I've not yet implemented any filtering at resource
manager level because it added lot of processing code in the resource
manager that is easily done in the notifier callback.

>
>> +
>> +    /* All other state transitions are synchronous to a corresponding
>> RM call */
>> +    if (payload->vm_status == GH_RM_VM_STATUS_RESET){
>> +        down_write(&ghvm->status_lock);
>> +        ghvm->vm_status = payload->vm_status;
>> +        up_write(&ghvm->status_lock);
>> +        wake_up(&ghvm->vm_status_wait);
>> +    }
>> +
>> +    return NOTIFY_DONE;
>> +}
>> +
>> +static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
>> +{
>> +    struct gh_rm_vm_exited_payload *payload = data;
>> +
>> +    if (payload->vmid != ghvm->vmid)
>> +        return NOTIFY_OK;
> same
>
>> +
>> +    down_write(&ghvm->status_lock);
>> +    ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
>> +    up_write(&ghvm->status_lock);
>> +
>> +    return NOTIFY_DONE;
>> +}
>> +
>> +static int gh_vm_rm_notification(struct notifier_block *nb, unsigned
>> long action, void *data)
>> +{
>> +    struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
>> +
>> +    switch (action) {
>> +    case GH_RM_NOTIFICATION_VM_STATUS:
>> +        return gh_vm_rm_notification_status(ghvm, data);
>> +    case GH_RM_NOTIFICATION_VM_EXITED:
>> +        return gh_vm_rm_notification_exited(ghvm, data);
>> +    default:
>> +        return NOTIFY_OK;
>> +    }
>> +}
>> +
>> +static void gh_vm_stop(struct gh_vm *ghvm)
>> +{
>> +    int ret;
>> +
>> +    down_write(&ghvm->status_lock);
>> +    if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
>> +        ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
>> +        if (ret)
>> +            pr_warn("Failed to stop VM: %d\n", ret);
> Should we not bail out from this fail path?
>

This is called in the gh_vm_free path and we have some options here when
we get some error while stopping a VM. So far, my strategy has been to
ignore error as best we can and continue. We might get further errors,
but we can also continue to clean up some more resources.

If there's an error, I'm not sure if there is a proper strategy to get
someone to retry later: userspace is closing all its references to the
VM and we need to stop the VM and clean up all our resources. Nitro
Enclaves and ACRN suffer similar

>
>> +    }
>> +
>> +    ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
>> +    up_write(&ghvm->status_lock);
>> +}
>> +
>>   static void gh_vm_free(struct work_struct *work)
>>   {
>>       struct gh_vm *ghvm = container_of(work,struct gh_vm, free_work);
>>       struct gh_vm_mem *mapping, *tmp;
>>       int ret;
>> -    mutex_lock(&ghvm->mm_lock);
>> -    list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings,
>> list) {
>> -        gh_vm_mem_reclaim(ghvm, mapping);
>> -        kfree(mapping);
>> +    switch (ghvm->vm_status) {
>> +unknown_state:
>
> Never seen this style of using goto from switch to a new label in switch
> case. Am sure this is some kinda trick but its not helping readers.
>
> Can we rewrite this using a normal semantics.
>
> may be a do while could help.
>

Srivatsa suggested dropping the goto, I can do that.
>
>> +    case GH_RM_VM_STATUS_RUNNING:
>> +        gh_vm_stop(ghvm);
>> +        fallthrough;
>> +    case GH_RM_VM_STATUS_INIT_FAILED:
>> +    case GH_RM_VM_STATUS_LOAD:
>> +    case GH_RM_VM_STATUS_LOAD_FAILED:
>> +        mutex_lock(&ghvm->mm_lock);
>> +        list_for_each_entry_safe(mapping, tmp,
>> &ghvm->memory_mappings, list) {
>> +            gh_vm_mem_reclaim(ghvm, mapping);
>> +            kfree(mapping);
>> +        }
>> +        mutex_unlock(&ghvm->mm_lock);
>> +        fallthrough;
>> +    case GH_RM_VM_STATUS_NO_STATE:
>> +        ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>> +        if (ret)
>> +            pr_warn("Failed to deallocate vmid: %d\n", ret);
>> +
>> +        gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
>> +        put_gh_rm(ghvm->rm);
>> +        kfree(ghvm);
>> +        break;
>> +    default:
>> +        pr_err("VM is unknown state:%d, assuming it's running.\n",
>> ghvm->vm_status);
> vm_status did not change do we not endup here again?
>
>> +        goto unknown_state;
>>       }
>> -    mutex_unlock(&ghvm->mm_lock);
>> -
>> -    ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>> -    if (ret)
>> -        pr_warn("Failed to deallocate vmid: %d\n", ret);
>> -
>> -    put_gh_rm(ghvm->rm);
>> -    kfree(ghvm);
>>   }
>>   static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>>   {
>>       struct gh_vm *ghvm;
>> -    int vmid;
>> +    int vmid, ret;
>>       vmid = gh_rm_alloc_vmid(rm, 0);
>>       if (vmid < 0)
>> @@ -56,13 +133,123 @@ static __must_check struct gh_vm
>> *gh_vm_alloc(struct gh_rm *rm)
>>       ghvm->vmid = vmid;
>>       ghvm->rm = rm;
>> +    init_waitqueue_head(&ghvm->vm_status_wait);
>> +    ghvm->nb.notifier_call = gh_vm_rm_notification;
>> +    ret = gh_rm_notifier_register(rm, &ghvm->nb);
>> +    if (ret) {
>> +        put_gh_rm(rm);
>> +        gh_rm_dealloc_vmid(rm, vmid);
>> +        kfree(ghvm);
>> +        return ERR_PTR(ret);
>> +    }
>> +
>>       mutex_init(&ghvm->mm_lock);
>>       INIT_LIST_HEAD(&ghvm->memory_mappings);
>> +    init_rwsem(&ghvm->status_lock);
>>       INIT_WORK(&ghvm->free_work, gh_vm_free);
>> +    ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>>       return ghvm;
>>   }
>> +static int gh_vm_start(struct gh_vm *ghvm)
>> +{
>> +    struct gh_vm_mem *mapping;
>> +    u64 dtb_offset;
>> +    u32 mem_handle;
>> +    int ret;
>> +
>> +    down_write(&ghvm->status_lock);
>> +    if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
>> +        up_write(&ghvm->status_lock);
>> +        return 0;
>> +    }
>> +
>> +    ghvm->vm_status = GH_RM_VM_STATUS_RESET;
>> +
>
> <------
> should we not take ghvm->mm_lock here to make sure that list is
> consistent while processing.

Done.

>> +    list_for_each_entry(mapping, &ghvm->memory_mappings,list) {
>> +        switch (mapping->share_type){
>> +        case VM_MEM_LEND:
>> +            ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
>> +            break;
>> +        case VM_MEM_SHARE:
>> +            ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
>> +            break;
>> +        }
>> +        if (ret) {
>> +            pr_warn("Failed to %s parcel %d: %d\n",
>> +                mapping->share_type == VM_MEM_LEND ? "lend" : "share",
>> +                mapping->parcel.label,
>> +                ret);
>> +            gotoerr;
>> +        }
>> +    }
> --->
>
>> +
>> +    mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa,
>> ghvm->dtb_config.size);
>> +    if (!mapping) {
>> +        pr_warn("Failed to find the memory_handle for DTB\n");
>
> What wil happen to the mappings that are lend or shared?
>

When the VM is cleaned up (on final destruction), the mappings are
reclaimed.

>> +        ret = -EINVAL;
>> +        goto err;
>> +    }
>> +
>> +    mem_handle = mapping->parcel.mem_handle;
>> +    dtb_offset = ghvm->dtb_config.gpa - mapping->guest_phys_addr;
>> +
>> +    ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth,
>> mem_handle,
>
> where is authentication mechanism (auth) comming from? Who is supposed
> to set this value?
>
> Should it come from userspace? if so I do not see any UAPI facility to
> do that via VM_START ioctl.
>

Right, we are only adding the support for unauthenticated VMs for now.
There would be further UAPI facilities to set the authentication type.

>
>> +                0, 0, dtb_offset, ghvm->dtb_config.size);
>> +    if (ret) {
>> +        pr_warn("Failed to configureVM: %d\n", ret);
>> +        goto err;
>> +    }
>> +
>> +    ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
>> +    if (ret) {
>> +        pr_warn("Failed to initialize VM: %d\n", ret);
>> +        goto err;
>> +    }
>> +
>> +    ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
>> +    if (ret) {
>> +        pr_warn("Failed to start VM:%d\n", ret);
>> +        goto err;
>> +    }
>> +
>> +    ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
>> +    up_write(&ghvm->status_lock);
>> +    return ret;
>> +err:
>> +    ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
>> +    up_write(&ghvm->status_lock);
>
> Am really not sure if we are doing right thing in the error path, there
> are multiple cases that seems to be not handled or if it was not
> required no comments to clarify this are documented.
> ex: if vm start fails then what happes with memory mapping or do we need
> to un-configure vm or un-init vm from hypervisor side?
>
> if none of this is required its useful to add come clear comments.
>

It is required and done in the VM cleanup path. I'll add comment with
this info.

>> +    return ret;
>> +}
>> +
>> +static int gh_vm_ensure_started(struct gh_vm *ghvm)
>> +{
>> +    int ret;
>> +
>> +retry:
>> +    ret = down_read_interruptible(&ghvm->status_lock);
>> +    if (ret)
>> +        return ret;
>> +
>> +    /* Unlikely because VM is typically started */
>> +    if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
>> +        up_read(&ghvm->status_lock);
>> +        ret = gh_vm_start(ghvm);
>> +        if (ret)
>> +            gotoout;
>> +        goto retry;
>> +    }
>
> do while will do better job here w.r.t to readablity.
>

I think do while and my current "goto retry" imply a long loop is
possible. The "goto retry" or while loop is guaranteed to run only once
because gh_vm_start will always bring VM out of GH_RM_VM_STATUS_LOAD.

How about this?

- goto retry;
+ /** gh_vm_start() is guaranteed to bring status out of
+ * GH_RM_VM_STATUS_LOAD, thus inifitely recursive call
is not
+ * possible
+ */
+ return gh_vm_ensure_started(ghvm);



>> +
>> +    /* Unlikely because VM is typically running */
>> +    if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
>> +        ret = -ENODEV;
>> +
>> +out:
>> +    up_read(&ghvm->status_lock);
>> +    return ret;
>> +}
>> +
>>   static long gh_vm_ioctl(struct file *filp, unsigned int cmd,
>> unsigned long arg)
>>   {
>>       struct gh_vm *ghvm = filp->private_data;
>> @@ -88,6 +275,22 @@ static long gh_vm_ioctl(struct file *filp,
>> unsigned int cmd, unsigned long arg)
>>               r = gh_vm_mem_free(ghvm, region.label);
>>           break;
>>       }
>> +    case GH_VM_SET_DTB_CONFIG: {
>> +        struct gh_vm_dtb_config dtb_config;
>> +
>> +        if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
>> +            return -EFAULT;
>> +
>> +        dtb_config.size = PAGE_ALIGN(dtb_config.size);
>> +        ghvm->dtb_config = dtb_config;
>> +
>> +        r = 0;
>> +        break;
>> +    }
>> +    case GH_VM_START: {
>> +        r = gh_vm_ensure_started(ghvm);
>> +        break;
>> +    }
>>       default:
>>           r = -ENOTTY;
>>           break;
>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>> index 97bc00c34878..e9cf56647cc2 100644
>> --- a/drivers/virt/gunyah/vm_mgr.h
>> +++ b/drivers/virt/gunyah/vm_mgr.h
>> @@ -10,6 +10,8 @@
>>   #include <linux/list.h>
>>   #include <linux/miscdevice.h>
>>   #include <linux/mutex.h>
>> +#include <linux/rwsem.h>
>> +#include <linux/wait.h>
>>   #include <uapi/linux/gunyah.h>
>> @@ -33,6 +35,13 @@ struct gh_vm_mem {
>>   struct gh_vm {
>>       u16 vmid;
>>       struct gh_rm *rm;
>> +    enum gh_rm_vm_auth_mechanism auth;
>> +    struct gh_vm_dtb_config dtb_config;
>> +
>> +    struct notifier_block nb;
>> +    enum gh_rm_vm_status vm_status;
>> +    wait_queue_head_t vm_status_wait;
>> +    struct rw_semaphore status_lock;
>>       struct work_struct free_work;
>>       struct mutex mm_lock;
>> @@ -43,5 +52,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct
>> gh_userspace_memory_region *regio
>>   void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
>>   int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
>>   struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
>> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa,
>> u32 size);
>>   #endif
>> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c
>> b/drivers/virt/gunyah/vm_mgr_mm.c
>> index 03e71a36ea3b..128b90da555a 100644
>> --- a/drivers/virt/gunyah/vm_mgr_mm.c
>> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
>> @@ -52,6 +52,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct
>> gh_vm_mem *mapping)
>>       list_del(&mapping->list);
>>   }
>> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa,
>> u32 size)
> naming is bit missleading we already have
> gh_vm_mem_find/__gh_vm_mem_find which is returning mapping based on label
> now with gh_vm_mem_find_mapping() is doing same thing but with address.
>
> Can we rename them clearly
> gh_vm_mem_find_mapping_by_label()
> gh_vm_mem_find_mapping_by_addr()
>

Done.

- Elliot

2023-02-23 01:55:38

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 15/26] gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim


On 2/22/2023 2:21 AM, Srinivas Kandagatla wrote:
>
>
> On 21/02/2023 21:22, Elliot Berman wrote:
>>
>>
>> On 2/21/2023 6:51 AM, Srinivas Kandagatla wrote:
>>>
>>>
>>> On 14/02/2023 21:24, Elliot Berman wrote:
>> [snip]
>>>> +
>>>> +static struct gunyah_rm_platform_ops *rm_platform_ops;
>>>> +static DECLARE_RWSEM(rm_platform_ops_lock);
>>>
>>> Why do we need this read/write lock or this global rm_platform_ops
>>> here, AFAIU, there will be only one instance of platform_ops per
>>> platform.
>>>
>>> This should be a core part of the gunyah and its driver early setup,
>>> that should give us pretty much lock less behaviour.
>>>
>>> We should be able to determine by Hypervisor UUID that its on
>>> Qualcomm platform or not, during early gunyah setup which should help
>>> us setup the platfrom ops accordingly.
>>>
>>> This should also help cleanup some of the gunyah code that was added
>>> futher down in this patchset.
>>
>> I'm guessing the direction to take is:
>>
>>    config GUNYAH
>>      select QCOM_SCM if ARCH_QCOM
>
> This is how other kernel drivers use SCM.
>
>>
>> and have vm_mgr call directly into qcom_scm driver if the UID matches?
>
> Yes that is the plan, we could have these callbacks as part key data
> structure like struct gh_rm and update it at very early in setup stage
> based on UUID match.
>
>
>>
>> We have an Android requirement to enable CONFIG_GUNYAH=y and
>> CONFIG_QCOM_SCM=m, but it wouldn't be possible with this design. The
>
> Am not sure how this will work, if gunyah for QCOM Platform is depended
> on SCM then there is no way that gunyah could be a inbuilt and make scm
> a module. >
> On the other hand with the existing design gunyah will not be functional
> until scm driver is loaded and platform hooks are registered. This
> runtime dependency design does not express the dependency correctly and
> the only way to know if gunyah is functional is keep trying which can
> only work after scm driver is probed.
>
> This also raises the design question on how much of platform hooks
> dependency is captured at gunyah core and api level, with state of
> current code /dev/gunyah will be created even without platform hooks and
> let the userspace use it which then only fail at hyp call level.
>
> Other issue with current design is, scm module can be unloaded under the
> hood leaving gunyah with NULL pointers to those platform hook functions.


This is not possible because SCM module can't be unloaded (except with
CONFIG_MODULE_FORCE_UNLOAD). I can also increase refcount of qcom_scm.ko
module to be more correct.

> This is the kind of issues we could see if the dependency is not
> expressed from bottom up. >
> The current design is not really capturing the depended components
> accurately.
>
> Considering platform hooks as a core resource to gunyah on Qualcomm
> platform is something that needs attention. If we can fix that then it
> might be doable to have QCOM_SCM=m and CONFIG_GUNYAH=y.
>

I'm open to ideas. I don't see this as being a real-world issue because
default defconfig has QCOM_SCM=y and all Qualcomm platforms enable
QCOM_SCM at least as =m.

Thanks,
Elliot

>
> --srini
>> platform hooks implementation allows GUNYAH and QCOM_SCM to be enabled
>> without setting lower bound of the other.
>>
>> - Elliot

2023-02-23 09:21:41

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 13/26] gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot



On 23/02/2023 00:50, Elliot Berman wrote:
>>>
>>> +
>>> +    mem_handle = mapping->parcel.mem_handle;
>>> +    dtb_offset = ghvm->dtb_config.gpa - mapping->guest_phys_addr;
>>> +
>>> +    ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth,
>>> mem_handle,
>>
>> where is authentication mechanism (auth) comming from? Who is supposed
>> to set this value?
>>
>> Should it come from userspace? if so I do not see any UAPI facility to
>> do that via VM_START ioctl.
>>
>
> Right, we are only adding the support for unauthenticated VMs for now.
> There would be further UAPI facilities to set the authentication type.
We have to be careful, please note that you can not change an existing
UAPI to accommodate new features.

There are two ways to do this properly:

1. Design UAPI to accommodate features that will be part of this in very
soon or in future. This way the UAPI is stable and does not change
over time when we add support this feature in driver.

In this particular case, vm authentication type is one that needs to
come from user, rather than kernel assuming it, so definitely this need
to be properly addressed by passing this info from userspace.
Or rename this IOCTl to something like VM_START_UNAUTH_VM to make this
more explicit.


2. For each feature add new UAPI as and when its required, which is
really the only option when we failed to design UAPIs correctly in the
first place.

--srini


>
>>
>>> +                0, 0, dtb_offset, ghvm->dtb_config.size);
>>> +    if (ret) {

2023-02-23 10:08:28

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager



On 22/02/2023 00:27, Elliot Berman wrote:
>
>>> +    .llseek = noop_llseek,
>>> +};
>>> +
>>> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
>> Not sure what is the gain of this multiple levels of redirection.
>>
>> How about
>>
>> long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
>> {
>> ...
>> }
>>
>> and rsc_mgr just call it as part of its ioctl call
>>
>> static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned
>> long arg)
>> {
>>      struct miscdevice *miscdev = filp->private_data;
>>      struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
>>
>>      switch (cmd) {
>>      case GH_CREATE_VM:
>>          return gh_dev_create_vm(rm, arg);
>>      default:
>>          return -ENOIOCTLCMD;
>>      }
>> }
>>
>
> I'm anticipating we will add further /dev/gunyah ioctls and I thought it
> would be cleaner to have all that in vm_mgr.c itself.
>
>>
>>> +{
>>> +    struct gh_vm *ghvm;
>>> +    struct file *file;
>>> +    int fd, err;
>>> +
>>> +    /* arg reserved for future use. */
>>> +    if (arg)
>>> +        return -EINVAL;
>>
>> The only code path I see here is via GH_CREATE_VM ioctl which
>> obviously does not take any arguments, so if you are thinking of using
>> the argument for architecture-specific VM flags.  Then this needs to
>> be properly done by making the ABI aware of this.
>
> It is documented in Patch 17 (Document Gunyah VM Manager)
>
> +GH_CREATE_VM
> +~~~~~~~~~~~~
> +
> +Creates a Gunyah VM. The argument is reserved for future use and must
> be 0.
>
But this conficts with the UAPIs that have been defined. GH_CREATE_VM
itself is defined to take no parameters.

#define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0)

so where are you expecting the argument to come from?

>>
>> As you mentioned zero value arg imply an "unauthenticated VM" type,
>> but this was not properly encoded in the userspace ABI. Why not make
>> it future compatible. How about adding arguments to GH_CREATE_VM and
>> pass the required information correctly.
>> Note that once the ABI is accepted then you will not be able to change
>> it, other than adding a new one.
>>
>
> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure yet
> what arguments to add here.
>
> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't

Sorry, that is exactly what we want to avoid, we can not change the UAPI
its going to break the userspace.

> break compatibility with old kernels; old kernels reject it as -EINVAL.

If you have userspace built with older kernel headers then that will
break. Am not sure about old-kernels.

What exactly is the argument that you want to add to GH_CREATE_VM?

If you want to keep GH_CREATE_VM with no arguments that is fine but
remove the conflicting comments in the code and document so that its not
misleading readers/reviewers that the UAPI is going to be modified in
near future.


>
>>> +
>>> +    ghvm = gh_vm_alloc(rm);
>>> +    if (IS_ERR(ghvm))
>>> +        return PTR_ERR(ghvm);
>>> +
>>> +    fd = get_unused_fd_flags(O_CLOEXEC);
>>> +    if (fd < 0) {
>>> +        err = fd;
>>> +        goto err_destroy_vm;
>>> +    }
>>> +
>>> +    file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
>>> +    if (IS_ERR(file)) {
>>> +        err = PTR_ERR(file);
>>> +        goto err_put_fd;
>>> +    }
>>> +
>>> +    fd_install(fd, file);
>>> +
>>> +    return fd;
>>> +
>>> +err_put_fd:
>>> +    put_unused_fd(fd);
>>> +err_destroy_vm:
>>> +    kfree(ghvm);
>>> +    return err;
>>> +}
>>> +
>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>>> unsigned long arg)
>>> +{
>>> +    switch (cmd) {
>>> +    case GH_CREATE_VM:
>>> +        return gh_dev_ioctl_create_vm(rm, arg);
>>> +    default:
>>> +        return -ENOIOCTLCMD;
>>> +    }
>>> +}
>>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>>> new file mode 100644
>>> index 000000000000..76954da706e9
>>> --- /dev/null
>>> +++ b/drivers/virt/gunyah/vm_mgr.h
>>> @@ -0,0 +1,22 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>> rights reserved.
>>> + */
>>> +
>>> +#ifndef _GH_PRIV_VM_MGR_H
>>> +#define _GH_PRIV_VM_MGR_H
>>> +
>>> +#include <linux/gunyah_rsc_mgr.h>
>>> +
>>> +#include <uapi/linux/gunyah.h>
>>> +
>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>>> unsigned long arg);
>>> +
>>> +struct gh_vm {
>>> +    u16 vmid;
>>> +    struct gh_rm *rm;
>>> +
>>> +    struct work_struct free_work;
>>> +};
>>> +
>>> +#endif
>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>> new file mode 100644
>>> index 000000000000..10ba32d2b0a6
>>> --- /dev/null
>>> +++ b/include/uapi/linux/gunyah.h
>>> @@ -0,0 +1,23 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>>> +/*
>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>> rights reserved.
>>> + */
>>> +
>>> +#ifndef _UAPI_LINUX_GUNYAH
>>> +#define _UAPI_LINUX_GUNYAH
>>> +
>>> +/*
>>> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
>>> + */
>>> +
>>> +#include <linux/types.h>
>>> +#include <linux/ioctl.h>
>>> +
>>> +#define GH_IOCTL_TYPE            'G'
>>> +
>>> +/*
>>> + * ioctls for /dev/gunyah fds:
>>> + */
>>> +#define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns a
>>> Gunyah VM fd */
>>
>> Can HLOS forcefully destroy a VM?
>> If so should we have a corresponding DESTROY IOCTL?
>
> It can forcefully destroy unauthenticated and protected virtual
> machines. I don't have a userspace usecase for a DESTROY ioctl yet,
> maybe this can be added later? By the way, the VM is forcefully
that should be fine, but its also nice to add it for completeness, but
not a compulsory atm

> destroyed when VM refcount is dropped to 0 (close(vm_fd) and any other
> relevant file descriptors).
I have noticed that path.

--srini
>
> - Elliot

2023-02-23 10:25:25

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox



On 23/02/2023 00:15, Elliot Berman wrote:
>
>
> On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 14/02/2023 21:23, Elliot Berman wrote:
>>> Gunyah message queues are a unidirectional inter-VM pipe for messages up
>>> to 1024 bytes. This driver supports pairing a receiver message queue and
>>> a transmitter message queue to expose a single mailbox channel.
>>>
>>> Signed-off-by: Elliot Berman <[email protected]>
>>> ---
>>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>>   drivers/mailbox/Makefile                   |   2 +
>>>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>>>   include/linux/gunyah.h                      |  56 +++++
>>>   4 files changed, 280 insertions(+)
>>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>>
>>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>>> b/Documentation/virt/gunyah/message-queue.rst
>>> index 0667b3eb1ff9..082085e981e0 100644
>>> --- a/Documentation/virt/gunyah/message-queue.rst
>>> +++ b/Documentation/virt/gunyah/message-queue.rst
>>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs
>>> (and two capability IDs).
>>>         |               |         |                 | |               |
>>>         |               |         |                 | |               |
>>>         +---------------+         +-----------------+ +---------------+
>>> +
>>> +Gunyah message queues are exposed as mailboxes. To create the
>>> mailbox, create
>>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY
>>> interrupt,
>>> +all messages in the RX message queue are read and pushed via the
>>> `rx_callback`
>>> +of the registered mbox_client.
>>> +
>>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>>> +   :identifiers: gh_msgq_init
>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>> index fc9376117111..5f929bb55e9a 100644
>>> --- a/drivers/mailbox/Makefile
>>> +++ b/drivers/mailbox/Makefile
>>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>>
>> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not
>> CONFIG_GUNYAH_MBOX?
>>
>
> There was some previous discussion about this:
>
> https://lore.kernel.org/all/[email protected]/
>
>>> +
>>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>>> diff --git a/drivers/mailbox/gunyah-msgq.c
>>> b/drivers/mailbox/gunyah-msgq.c
>>> new file mode 100644
>>> index 000000000000..03ffaa30ce9b
>>> --- /dev/null
>>> +++ b/drivers/mailbox/gunyah-msgq.c
>>> @@ -0,0 +1,214 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>> rights reserved.
>>> + */
>>> +
>>> +#include <linux/mailbox_controller.h>
>>> +#include <linux/module.h>
>>> +#include <linux/interrupt.h>
>>> +#include <linux/gunyah.h>
>>> +#include <linux/printk.h>
>>> +#include <linux/init.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/wait.h>
>>
>> ...
>>
>>> +/* Fired when message queue transitions from "full" to "space
>>> available" to send messages */
>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>> +{
>>> +    struct gh_msgq *msgq = data;
>>> +
>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>>> +
>>> +    return IRQ_HANDLED;
>>> +}
>>> +
>>> +/* Fired after sending message and hypercall told us there was more
>>> space available. */
>>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>>
>> Tasklets have been long deprecated, consider using workqueues in this
>> particular case.
>>
>
> Workqueues have higher latency and tasklets came as recommendation from
> Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.
>
> I did some quick unscientific measurements of ~1000x samples. The median
> latency for resource manager went from 25.5 us (tasklet) to 26 us
> (workqueue) (2% slower). The mean went from 28.7 us to 32.5 us (13%
> slower). Obviously, the outliers for workqueues were much more extreme.

TBH, this is expected because we are only testing resource manager, Note
the advantage that you will see shifting from tasket to workqueues is
on overall system latencies and some drivers performance that need to
react to events.

please take some time to read this nice article about this
https://lwn.net/Articles/830964/


--srini
>
>>
>>> +{
>>> +    struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq,
>>> txdone_tasklet);
>>> +
>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
>>> +}
>>> +
>>> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
>>> +{
>> ..
>>
>>> +    tasklet_schedule(&msgq->txdone_tasklet);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static struct mbox_chan_ops gh_msgq_ops = {
>>> +    .send_data = gh_msgq_send_data,
>>> +};
>>> +
>>> +/**
>>> + * gh_msgq_init() - Initialize a Gunyah message queue with an
>>> mbox_client
>>> + * @parent: optional, device parent used for the mailbox controller
>>> + * @msgq: Pointer to the gh_msgq to initialize
>>> + * @cl: A mailbox client to bind to the mailbox channel that the
>>> message queue creates
>>> + * @tx_ghrsc: optional, the transmission side of the message queue
>>> + * @rx_ghrsc: optional, the receiving side of the message queue
>>> + *
>>> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most
>>> message queue use cases come with
>>> + * a pair of message queues to facilitate bidirectional
>>> communication. When tx_ghrsc is set,
>>> + * the client can send messages with
>>> mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
>>> + * is set, the mbox_client should register an .rx_callback() and the
>>> message queue driver will
>>> + * push all available messages upon receiving the RX ready
>>> interrupt. The messages should be
>>> + * consumed or copied by the client right away as the
>>> gh_msgq_rx_data will be replaced/destroyed
>>> + * after the callback.
>>> + *
>>> + * Returns - 0 on success, negative otherwise
>>> + */
>>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct
>>> mbox_client *cl,
>>> +             struct gunyah_resource *tx_ghrsc, struct
>>> gunyah_resource *rx_ghrsc)
>>> +{
>>> +    int ret;
>>> +
>>> +    /* Must have at least a tx_ghrsc or rx_ghrsc and that they are
>>> the right device types */
>>> +    if ((!tx_ghrsc && !rx_ghrsc) ||
>>> +        (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
>>> +        (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
>>> +        return -EINVAL;
>>> +
>>> +    if (gh_api_version() != GUNYAH_API_V1) {
>>> +        pr_err("Unrecognized gunyah version: %u. Currently
>>> supported: %d\n",
>> dev_err(parent
>>
>> would make this more useful
>>
>
> Done.
>
> - Elliot

2023-02-23 10:29:31

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 22/02/2023 23:18, Elliot Berman wrote:
>>>
>>> +EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
>>> +
>>> +void get_gh_rm(struct gh_rm *rm)
>>> +{
>>> +    get_device(rm->dev);
>>> +}
>>> +EXPORT_SYMBOL_GPL(get_gh_rm);
>>
>> Can we have some consistency in the exported symbol naming,
>> we have two combinations now.
>>
>> EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
>> EXPORT_SYMBOL_GPL(get_gh_rm);
>>
>> lets stick to one.
>
> done.
>
>>> +
>>> +void put_gh_rm(struct gh_rm *rm)
>>> +{
>>> +    put_device(rm->dev);
>>> +}
>>> +EXPORT_SYMBOL_GPL(put_gh_rm);
>>>
>> ...
>>
>>> +
>>> +static int gh_rm_drv_probe(struct platform_device *pdev)
>>> +{
>>> +    struct gh_msgq_tx_data *msg;
>>> +    struct gh_rm *rm;
>>> +    int ret;
>>> +
>> How are we ensuring that gunyah driver is probed before this driver?
>>
>>
>
> Which driver?

Am referring to gunyah.ko

TBH, gunyah.c should be merged as part of resource manager, and check if
uuids and features in probe before proceeding further.


-srini

>
>>> +    rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
>>> +    if (!rm)
>>> +        return -ENOMEM;
>>> +
>>> +    platform_set_drvdata(pdev, rm);
>>> +    rm->dev = &pdev->dev;
>>> +
>>> +    mutex_init(&rm->call_idr_lock);
>>> +    idr_init(&rm->call_idr);
>>> +    rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data,
>>> GH_MSGQ_MAX_MSG_SIZE), 0,
>>> +        SLAB_HWCACHE_ALIGN, NULL);
>>> +    if (!rm->cache)
>>> +        return -ENOMEM;
>> new line here would be nice.
>>
>
> done.

2023-02-23 18:25:13

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox

On 2/14/23 3:23 PM, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
>
> Signed-off-by: Elliot Berman <[email protected]>

I have a general comment for "include/linux/gunyah.h".

You sometimes use "gunyah" in exported names (for example,
enum gunyah_resource_type, or struct gunyah_resource). In
many cases, though, you use "gh_ or "GH_" (such as in
struct gh_msgq, or GH_DBL_NONBLOCK). Is there a reason
that you don't pick one and use it everywhere?

I think it would be best--certainly for exported symbols
like these--to use a single symbol prefix for all cases.

Sometimes there might be a reason to distinguish two names
(maybe "gunyah_" symbols are truly public, while "gh_"
symbols are helpers meant generally to be private). But
I don't think that's the case here.

It seems that "gh" is your most frequent prefix, so that
might be easier to implement. But "gunyah" is more expressive
and is only 4 characters wider.

-Alex

> ---
> Documentation/virt/gunyah/message-queue.rst | 8 +
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/gunyah-msgq.c | 214 ++++++++++++++++++++
> include/linux/gunyah.h | 56 +++++
> 4 files changed, 280 insertions(+)
> create mode 100644 drivers/mailbox/gunyah-msgq.c
>
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index 0667b3eb1ff9..082085e981e0 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
> | | | | | |
> | | | | | |
> +---------------+ +-----------------+ +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> + :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> +obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
> +
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
>
> obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..03ffaa30ce9b
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c
> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>
> +
> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
> +
> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> + struct gh_msgq_rx_data rx_data;
> + enum gh_error err;
> + bool ready = true;
> +
> + while (ready) {
> + err = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
> + (uintptr_t)&rx_data.data, sizeof(rx_data.data),
> + &rx_data.length, &ready);
> + if (err != GH_ERROR_OK) {
> + if (err != GH_ERROR_MSGQUEUE_EMPTY)
> + pr_warn("Failed to receive data from msgq for %s: %d\n",
> + msgq->mbox.dev ? dev_name(msgq->mbox.dev) : "", err);
> + break;
> + }
> + mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
> + }
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
> +{
> + struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
> + struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
> + struct gh_msgq_tx_data *msgq_data = data;
> + u64 tx_flags = 0;
> + enum gh_error gh_error;
> + bool ready;
> +
> + if (msgq_data->push)
> + tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
> +
> + gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length,
> + (uintptr_t)msgq_data->data, tx_flags, &ready);
> +
> + /**
> + * unlikely because Linux tracks state of msgq and should not try to
> + * send message when msgq is full.
> + */
> + if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
> + return -EAGAIN;
> +
> + /**
> + * Propagate all other errors to client. If we return error to mailbox
> + * framework, then no other messages can be sent and nobody will know
> + * to retry this message.
> + */
> + msgq->last_ret = gh_remap_error(gh_error);
> +
> + /**
> + * This message was successfully sent, but message queue isn't ready to
> + * receive more messages because it's now full. Mailbox framework
> + * requires that we only report that message was transmitted when
> + * we're ready to transmit another message. We'll get that in the form
> + * of tx IRQ once the other side starts to drain the msgq.
> + */
> + if (gh_error == GH_ERROR_OK && !ready)
> + return 0;
> +
> + /**
> + * We can send more messages. Mailbox framework requires that tx done
> + * happens asynchronously to sending the message. Gunyah message queues
> + * tell us right away on the hypercall return whether we can send more
> + * messages. To work around this, defer the txdone to a tasklet.
> + */
> + tasklet_schedule(&msgq->txdone_tasklet);
> +
> + return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> + .send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with
> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client should register an .rx_callback() and the message queue driver will
> + * push all available messages upon receiving the RX ready interrupt. The messages should be
> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
> +{
> + int ret;
> +
> + /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> + if ((!tx_ghrsc && !rx_ghrsc) ||
> + (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
> + (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
> + return -EINVAL;
> +
> + if (gh_api_version() != GUNYAH_API_V1) {
> + pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
> + gh_api_version(), GUNYAH_API_V1);
> + return -EOPNOTSUPP;
> + }
> +
> + if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
> + return -EOPNOTSUPP;
> +
> + msgq->tx_ghrsc = tx_ghrsc;
> + msgq->rx_ghrsc = rx_ghrsc;
> +
> + msgq->mbox.dev = parent;
> + msgq->mbox.ops = &gh_msgq_ops;
> + msgq->mbox.num_chans = 1;
> + msgq->mbox.txdone_irq = true;
> + msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);
> + if (!msgq->mbox.chans)
> + return -ENOMEM;
> +
> + if (msgq->tx_ghrsc) {
> + ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> + msgq);
> + if (ret)
> + goto err_chans;
> + }
> +
> + if (msgq->rx_ghrsc) {
> + ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> + IRQF_ONESHOT, "gh_msgq_rx", msgq);
> + if (ret)
> + goto err_tx_irq;
> + }
> +
> + tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> + ret = mbox_controller_register(&msgq->mbox);
> + if (ret)
> + goto err_rx_irq;
> +
> + ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> + if (ret)
> + goto err_mbox;
> +
> + return 0;
> +err_mbox:
> + mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> + kfree(msgq->mbox.chans);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{
> + mbox_controller_unregister(&msgq->mbox);
> +
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> + kfree(msgq->mbox.chans);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index cb6df4eec5c2..2e13669c6363 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -8,11 +8,67 @@
>
> #include <linux/bitfield.h>
> #include <linux/errno.h>
> +#include <linux/interrupt.h>
> #include <linux/limits.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox_client.h>
> #include <linux/types.h>
>
> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
> +enum gunyah_resource_type {
> + GUNYAH_RESOURCE_TYPE_BELL_TX = 0,
> + GUNYAH_RESOURCE_TYPE_BELL_RX = 1,
> + GUNYAH_RESOURCE_TYPE_MSGQ_TX = 2,
> + GUNYAH_RESOURCE_TYPE_MSGQ_RX = 3,
> + GUNYAH_RESOURCE_TYPE_VCPU = 4,
> +};
> +
> +struct gunyah_resource {
> + enum gunyah_resource_type type;
> + u64 capid;
> + int irq;
> +};
> +
> +/**
> + * Gunyah Message Queues
> + */
> +
> +#define GH_MSGQ_MAX_MSG_SIZE 240
> +
> +struct gh_msgq_tx_data {
> + size_t length;
> + bool push;
> + char data[];
> +};
> +
> +struct gh_msgq_rx_data {
> + size_t length;
> + char data[GH_MSGQ_MAX_MSG_SIZE];
> +};
> +
> +struct gh_msgq {
> + struct gunyah_resource *tx_ghrsc;
> + struct gunyah_resource *rx_ghrsc;
> +
> + /* msgq private */
> + int last_ret; /* Linux error, not GH_STATUS_* */
> + struct mbox_controller mbox;
> + struct tasklet_struct txdone_tasklet;
> +};
> +
> +
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc);
> +void gh_msgq_remove(struct gh_msgq *msgq);
> +
> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
> +{
> + return &msgq->mbox.chans[0];
> +}
> +
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> +
> #define GH_CAPID_INVAL U64_MAX
> #define GH_VMID_ROOT_VM 0xff
>


2023-02-23 21:12:04

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox

On 2/14/23 3:23 PM, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> Documentation/virt/gunyah/message-queue.rst | 8 +
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/gunyah-msgq.c | 214 ++++++++++++++++++++
> include/linux/gunyah.h | 56 +++++
> 4 files changed, 280 insertions(+)
> create mode 100644 drivers/mailbox/gunyah-msgq.c
>
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index 0667b3eb1ff9..082085e981e0 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
> | | | | | |
> | | | | | |
> +---------------+ +-----------------+ +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> + :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX) += mtk-cmdq-mailbox.o
>
> obj-$(CONFIG_ZYNQMP_IPI_MBOX) += zynqmp-ipi-mailbox.o
>
> +obj-$(CONFIG_GUNYAH) += gunyah-msgq.o
> +
> obj-$(CONFIG_SUN6I_MSGBOX) += sun6i-msgbox.o
>
> obj-$(CONFIG_SPRD_MBOX) += sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..03ffaa30ce9b
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c

You use a dash in this source file name, but an underscore
everywhere else. Unless there's a good reason to do this,
please be consistent (use "gunyah_msgq.c").

> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>
> +
> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
> +
> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> + struct gh_msgq_rx_data rx_data;
> + enum gh_error err;
> + bool ready = true;
> +
> + while (ready) {
> + err = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
> + (uintptr_t)&rx_data.data, sizeof(rx_data.data),
> + &rx_data.length, &ready);
> + if (err != GH_ERROR_OK) {
> + if (err != GH_ERROR_MSGQUEUE_EMPTY)

Srini mentioned something about this too. In many
(all?) cases, there is a device pointer available,
so you should use dev_*() functions rather than pr_*().

In this particular case, I'm not sure why/when the
mbox.dev pointer would be null. Also, dev_*() handles
the case of a null device pointer, and it reports the
device name (just as you do here).

> + pr_warn("Failed to receive data from msgq for %s: %d\n",
> + msgq->mbox.dev ? dev_name(msgq->mbox.dev) : "", err);
> + break;
> + }
> + mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
> + }
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> + struct gh_msgq *msgq = data;
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> + return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
> +{
> + struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> + mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
> + struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
> + struct gh_msgq_tx_data *msgq_data = data;
> + u64 tx_flags = 0;
> + enum gh_error gh_error;

Above you named the variable "err". It helps readability
if you use a very consistent naming convention for variables
of a certain type when they are used a lot.

> + bool ready;
> +
> + if (msgq_data->push)
> + tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
> +
> + gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length,
> + (uintptr_t)msgq_data->data, tx_flags, &ready);
> +
> + /**
> + * unlikely because Linux tracks state of msgq and should not try to
> + * send message when msgq is full.
> + */
> + if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
> + return -EAGAIN;
> +
> + /**
> + * Propagate all other errors to client. If we return error to mailbox
> + * framework, then no other messages can be sent and nobody will know
> + * to retry this message.
> + */
> + msgq->last_ret = gh_remap_error(gh_error);
> +
> + /**
> + * This message was successfully sent, but message queue isn't ready to
> + * receive more messages because it's now full. Mailbox framework

Maybe: s/receive/accept/

> + * requires that we only report that message was transmitted when
> + * we're ready to transmit another message. We'll get that in the form
> + * of tx IRQ once the other side starts to drain the msgq.
> + */
> + if (gh_error == GH_ERROR_OK && !ready)
> + return 0;
> +
> + /**
> + * We can send more messages. Mailbox framework requires that tx done
> + * happens asynchronously to sending the message. Gunyah message queues
> + * tell us right away on the hypercall return whether we can send more
> + * messages. To work around this, defer the txdone to a tasklet.
> + */
> + tasklet_schedule(&msgq->txdone_tasklet);
> +
> + return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> + .send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with

s/should be/must be/

> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client should register an .rx_callback() and the message queue driver will

s/should register/must register/

A general comment on this code is that you sort of half define
a Gunyah message queue API. You define an initialization
function and an exit function, but you also expose the fact
that you use the mailbox framework in implementation. This
despite avoiding defining it as an mbox in the DTS file.

It might be hard to avoid that I guess. But to me it would be
nice if there were a more distinct Gunyah message queue API,
which would provide a send_message() function, for example.
And in that case, perhaps you would pass in the tx_done and/or
rx_data callbacks to this function (since they're required).

All that said, this is (currently?) only used by the resource
manager, so making a beautiful API might not be that important.
Do you envision this being used to communicate with other VMs
in the future?

> + * push all available messages upon receiving the RX ready interrupt. The messages should be

Maybe: s/push/deliver/

> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
> +{
> + int ret;
> +
> + /* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> + if ((!tx_ghrsc && !rx_ghrsc) ||
> + (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
> + (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
> + return -EINVAL;
> +
> + if (gh_api_version() != GUNYAH_API_V1) {
> + pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
> + gh_api_version(), GUNYAH_API_V1);
> + return -EOPNOTSUPP;
> + }
> +
> + if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
> + return -EOPNOTSUPP;

Can Gunyah even function if it doesn't have the MSGQUEUE feature?
Will there ever be a Gunyah implementation that does not support
it? Perhaps this test could be done in gunyah_init() instead.

For that matter, you could verify the result of gh_api_version()
at that time also.

> +
> + msgq->tx_ghrsc = tx_ghrsc;
> + msgq->rx_ghrsc = rx_ghrsc;
> +
> + msgq->mbox.dev = parent;
> + msgq->mbox.ops = &gh_msgq_ops;
> + msgq->mbox.num_chans = 1;
> + msgq->mbox.txdone_irq = true;
> + msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);

From what I can tell, you will always use exactly one mailbox channel.
So you could just do kzalloc(sizeof()...).

> + if (!msgq->mbox.chans)
> + return -ENOMEM;
> +
> + if (msgq->tx_ghrsc) {

if (tx_ghrsc) {

The irq field is assumed to be valid. Are there any
sanity checks you could perform? Again this is only
used for the resource manager right now, so maybe
it's OK.

> + ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",

ret = request_irq(tx_ghrsc->irq, ...


> + msgq);
> + if (ret)
> + goto err_chans;
> + }
> +
> + if (msgq->rx_ghrsc) {
> + ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> + IRQF_ONESHOT, "gh_msgq_rx", msgq);
> + if (ret)
> + goto err_tx_irq;
> + }
> +
> + tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> + ret = mbox_controller_register(&msgq->mbox);
> + if (ret)
> + goto err_rx_irq;
> +
> + ret = mbox_bind_client(gh_msgq_chan(msgq), cl);


> + if (ret)
> + goto err_mbox;
> +
> + return 0;
> +err_mbox:
> + mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> + kfree(msgq->mbox.chans);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{

Is there any need to un-bind the client?

> + mbox_controller_unregister(&msgq->mbox);
> +
> + if (msgq->rx_ghrsc)
> + free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> + if (msgq->tx_ghrsc)
> + free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> + kfree(msgq->mbox.chans);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index cb6df4eec5c2..2e13669c6363 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -8,11 +8,67 @@
>
> #include <linux/bitfield.h>
> #include <linux/errno.h>
> +#include <linux/interrupt.h>
> #include <linux/limits.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox_client.h>
> #include <linux/types.h>
>
> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
> +enum gunyah_resource_type {
> + GUNYAH_RESOURCE_TYPE_BELL_TX = 0,
> + GUNYAH_RESOURCE_TYPE_BELL_RX = 1,
> + GUNYAH_RESOURCE_TYPE_MSGQ_TX = 2,
> + GUNYAH_RESOURCE_TYPE_MSGQ_RX = 3,
> + GUNYAH_RESOURCE_TYPE_VCPU = 4,

The maximum value here must fit in 8 bits. I guess
there's no risk right now of using that up, but you
use negative values in some cases elsewhere.

> +};
> +
> +struct gunyah_resource {
> + enum gunyah_resource_type type;
> + u64 capid;
> + int irq;

request_irq() defines the IRQ value to be an unsigned int.

> +};
> +
> +/**
> + * Gunyah Message Queues
> + */
> +
> +#define GH_MSGQ_MAX_MSG_SIZE 240
> +
> +struct gh_msgq_tx_data {
> + size_t length;
> + bool push;
> + char data[];
> +};
> +
> +struct gh_msgq_rx_data {
> + size_t length;
> + char data[GH_MSGQ_MAX_MSG_SIZE];
> +};
> +
> +struct gh_msgq {
> + struct gunyah_resource *tx_ghrsc;
> + struct gunyah_resource *rx_ghrsc;
> +
> + /* msgq private */
> + int last_ret; /* Linux error, not GH_STATUS_* */
> + struct mbox_controller mbox;
> + struct tasklet_struct txdone_tasklet;

Can the msgq_client be embedded here too? (I don't really
know whether msgq and msgq_client are one-to one.)

> +};
> +
> +
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> + struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc);
> +void gh_msgq_remove(struct gh_msgq *msgq);

I suggested:

int gh_msgq_send(struct gh_msgq, struct gh_msgq_tx_data *data);

-Alex

> +
> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
> +{
> + return &msgq->mbox.chans[0];
> +}
> +
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> +
> #define GH_CAPID_INVAL U64_MAX
> #define GH_VMID_ROOT_VM 0xff
>


2023-02-23 21:37:20

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 09/26] gunyah: rsc_mgr: Add VM lifecycle RPC

On 2/14/23 3:23 PM, Elliot Berman wrote:
>
> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/rsc_mgr.h | 45 ++++++
> drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
> include/linux/gunyah_rsc_mgr.h | 73 ++++++++++
> 4 files changed, 345 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index cc864ff5abbb..de29769f2f3f 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index d4e799a7526f..7406237bc66d 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,4 +74,49 @@ struct gh_rm;
> int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> void **resp_buf, size_t *resp_buff_size);
>
> +/* Message IDs: VM Management */
> +#define GH_RM_RPC_VM_ALLOC_VMID 0x56000001
> +#define GH_RM_RPC_VM_DEALLOC_VMID 0x56000002
> +#define GH_RM_RPC_VM_START 0x56000004
> +#define GH_RM_RPC_VM_STOP 0x56000005
> +#define GH_RM_RPC_VM_RESET 0x56000006
> +#define GH_RM_RPC_VM_CONFIG_IMAGE 0x56000009
> +#define GH_RM_RPC_VM_INIT 0x5600000B
> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES 0x56000020
> +#define GH_RM_RPC_VM_GET_VMID 0x56000024
> +
> +struct gh_rm_vm_common_vmid_req {
> + __le16 vmid;
> + __le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_ALLOC */
> +struct gh_rm_vm_alloc_vmid_resp {
> + __le16 vmid;
> + __le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_STOP */
> +struct gh_rm_vm_stop_req {
> + __le16 vmid;
> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP BIT(0)
> + u8 flags;
> + u8 reserved;
> +#define GH_RM_VM_STOP_REASON_FORCE_STOP 3

I suggested this before and you honored it. Now I'll suggest
it again, and ask you to do it throughout the driver.

Please separate the definitions of constant values that
certain fields can take on from the structure definition.
I think doing it the way you have here makes it harder to
understand the structure definition.

You could define an anonymous enumerated type to hold
the values meant to be held by each field.

> + __le32 stop_reason;
> +} __packed;
> +
> +/* Call: VM_CONFIG_IMAGE */
> +struct gh_rm_vm_config_image_req {
> + __le16 vmid;
> + __le16 auth_mech;
> + __le32 mem_handle;
> + __le64 image_offset;
> + __le64 image_size;
> + __le64 dtb_offset;
> + __le64 dtb_size;
> +} __packed;
> +
> +/* Call: GET_HYP_RESOURCES */
> +
> #endif
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> new file mode 100644
> index 000000000000..4515cdd80106
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -0,0 +1,226 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include "rsc_mgr.h"
> +
> +/*
> + * Several RM calls take only a VMID as a parameter and give only standard
> + * response back. Deduplicate boilerplate code by using this common call.
> + */
> +static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
> +{
> + struct gh_rm_vm_common_vmid_req req_payload = {
> + .vmid = cpu_to_le16(vmid),
> + };
> + size_t resp_size;
> + void *resp;
> +
> + return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), &resp, &resp_size);
> +}
> +
> +/**
> + * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: Use GH_VMID_INVAL or 0 to dynamically allocate a VM. A reserved VMID can
> + * be supplied to request allocation of a platform-defined VM.

Honestly, I'd rather just see 0 (and *not* GH_VMID_INVAL) be the
special value to mean "dynamically allocate the VMID." It seems
0 is a reserved VMID anyway, and GH_VMID_INVAL might as well be
treated here as an invalid parameter.

Is there any definitition of which VMIDs are reserved? Like,
anything under 1024?

That's it on this patch for now.

-Alex

> + *
> + * Returns - the allocated VMID or negative value on error
> + */
> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
> +{
> + struct gh_rm_vm_common_vmid_req req_payload = { 0 };
> + struct gh_rm_vm_alloc_vmid_resp *resp_payload;
> + size_t resp_size;
> + void *resp;
> + int ret;

. . .


2023-02-23 21:59:06

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 03/26] gunyah: Common types and error codes for Gunyah hypercalls

On 2/14/23 3:12 PM, Elliot Berman wrote:
> Add architecture-independent standard error codes, types, and macros for
> Gunyah hypercalls.
>
> Reviewed-by: Dmitry Baryshkov <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> include/linux/gunyah.h | 82 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 82 insertions(+)
> create mode 100644 include/linux/gunyah.h
>
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> new file mode 100644
> index 000000000000..59ef4c735ae8
> --- /dev/null
> +++ b/include/linux/gunyah.h
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_GUNYAH_H
> +#define _LINUX_GUNYAH_H
> +
> +#include <linux/errno.h>
> +#include <linux/limits.h>
> +
> +/******************************************************************************/
> +/* Common arch-independent definitions for Gunyah hypercalls */
> +#define GH_CAPID_INVAL U64_MAX
> +#define GH_VMID_ROOT_VM 0xff
> +
> +enum gh_error {
> + GH_ERROR_OK = 0,
> + GH_ERROR_UNIMPLEMENTED = -1,
> + GH_ERROR_RETRY = -2,

Do you expect this type to have a particular size?
Since you specify negative values, it matters, and
it's possible that this forces it to be a 4-byte value
(though I'm not sure what the rules are). In other
words, UNIMPLEMENTED could conceivably have value 0xff
or 0xffffffff. I'm not even sure you can tell whether
an enum is interpreted as signed or unsigned.

It's not usually a good thing to do, but this *could*
be a case where you do a typedef to represent this as
a signed value of a certain bit width. (But don't do
that unless someone else says that's worth doing.)

-Alex

> +
> + GH_ERROR_ARG_INVAL = 1,
> + GH_ERROR_ARG_SIZE = 2,
> + GH_ERROR_ARG_ALIGN = 3,
> +
> + GH_ERROR_NOMEM = 10,
> +
> + GH_ERROR_ADDR_OVFL = 20,
> + GH_ERROR_ADDR_UNFL = 21,
> + GH_ERROR_ADDR_INVAL = 22,
> +
> + GH_ERROR_DENIED = 30,
> + GH_ERROR_BUSY = 31,
> + GH_ERROR_IDLE = 32,
> +
> + GH_ERROR_IRQ_BOUND = 40,
> + GH_ERROR_IRQ_UNBOUND = 41,
> +
> + GH_ERROR_CSPACE_CAP_NULL = 50,
> + GH_ERROR_CSPACE_CAP_REVOKED = 51,
> + GH_ERROR_CSPACE_WRONG_OBJ_TYPE = 52,
> + GH_ERROR_CSPACE_INSUF_RIGHTS = 53,
> + GH_ERROR_CSPACE_FULL = 54,
> +
> + GH_ERROR_MSGQUEUE_EMPTY = 60,
> + GH_ERROR_MSGQUEUE_FULL = 61,
> +};
> +
> +/**
> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux error code
> + * @gh_error: Gunyah hypercall return value
> + */
> +static inline int gh_remap_error(enum gh_error gh_error)
> +{
> + switch (gh_error) {
> + case GH_ERROR_OK:
> + return 0;
> + case GH_ERROR_NOMEM:
> + return -ENOMEM;
> + case GH_ERROR_DENIED:
> + case GH_ERROR_CSPACE_CAP_NULL:
> + case GH_ERROR_CSPACE_CAP_REVOKED:
> + case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
> + case GH_ERROR_CSPACE_INSUF_RIGHTS:
> + case GH_ERROR_CSPACE_FULL:
> + return -EACCES;
> + case GH_ERROR_BUSY:
> + case GH_ERROR_IDLE:
> + case GH_ERROR_IRQ_BOUND:
> + case GH_ERROR_IRQ_UNBOUND:
> + case GH_ERROR_MSGQUEUE_FULL:
> + case GH_ERROR_MSGQUEUE_EMPTY:

Is an empty message queue really busy?

> + return -EBUSY;
> + case GH_ERROR_UNIMPLEMENTED:
> + case GH_ERROR_RETRY:
> + return -EOPNOTSUPP;
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +#endif


2023-02-23 22:00:05

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 00/26] Drivers for Gunyah hypervisor

On 2/14/23 3:12 PM, Elliot Berman wrote:
> Gunyah is a Type-1 hypervisor independent of any
> high-level OS kernel, and runs in a higher CPU privilege level. It does
> not depend on any lower-privileged OS kernel/code for its core
> functionality. This increases its security and can support a much smaller
> trusted computing base than a Type-2 hypervisor.

Very general comment, across your whole series...

Although it is not strictly required any more, most kernel
code still stays within about 80 columns width, where longer
lines are the exception. Your patches go beyond 80 columns
a lot, and it's just not what I'm accustomed to looking at.

Others might disagree with this, but it would be my preference
that you try to reformat your lines so they normally fit within
80 columns. Format strings for printk-like functions are an
exception--people prefer those to be kept intact. But even
there, look at long messages and consider whether they can
be made more concise.

-Alex


> Gunyah is an open source hypervisor. The source repo is available at
> https://github.com/quic/gunyah-hypervisor.
>
> The diagram below shows the architecture.
>
> ::
>
> VM A VM B
> +-----+ +-----+ | +-----+ +-----+ +-----+
> | | | | | | | | | | |
> EL0 | APP | | APP | | | APP | | APP | | APP |
> | | | | | | | | | | |
> +-----+ +-----+ | +-----+ +-----+ +-----+
> ---------------------|-------------------------
> +--------------+ | +----------------------+
> | | | | |
> EL1 | Linux Kernel | | |Linux kernel/Other OS | ...
> | | | | |
> +--------------+ | +----------------------+
> --------hvc/smc------|------hvc/smc------------
> +----------------------------------------+
> | |
> EL2 | Gunyah Hypervisor |
> | |
> +----------------------------------------+
>
> Gunyah provides these following features.
>
> - Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on
> physical CPUs and enables time-sharing of the CPUs.
> - Memory Management: Gunyah tracks memory ownership and use of all memory
> under its control. Memory partitioning between VMs is a fundamental
> security feature.
> - Interrupt Virtualization: All interrupts are handled in the hypervisor
> and routed to the assigned VM.
> - Inter-VM Communication: There are several different mechanisms provided
> for communicating between VMs.
> - Device Virtualization: Para-virtualization of devices is supported using
> inter-VM communication. Low level system features and devices such as
> interrupt controllers are supported with emulation where required.
>
> This series adds the basic framework for detecting that Linux is running
> under Gunyah as a virtual machine, communication with the Gunyah Resource
> Manager, and a virtual machine manager capable of launching virtual machines.
>
> The series relies on two other patches posted separately:
> - https://lore.kernel.org/all/[email protected]/
> - https://lore.kernel.org/all/[email protected]/
>
> Changes in v10:
> - Fix bisectability (end result of series is same, --fixups applied to wrong commits)
> - Convert GH_ERROR_* and GH_RM_ERROR_* to enums
> - Correct race condition between allocating/freeing user memory
> - Replace offsetof with struct_size
> - Series-wide renaming of functions to be more consistent
> - VM shutdown & restart support added in vCPU and VM Manager patches
> - Convert VM function name (string) to type (number)
> - Convert VM function argument to value (which could be a pointer) to remove memory wastage for arguments
> - Remove defensive checks of hypervisor correctness
> - Clean ups to ioeventfd as suggested by Srivatsa
>
> Changes in v9: https://lore.kernel.org/all/[email protected]/
> - Refactor Gunyah API flags to be exposed as feature flags at kernel level
> - Move mbox client cleanup into gunyah_msgq_remove()
> - Simplify gh_rm_call return value and response payload
> - Missing clean-up/error handling/little endian fixes as suggested by Srivatsa and Alex in v8 series
>
> Changes in v8: https://lore.kernel.org/all/[email protected]/
> - Treat VM manager as a library of RM
> - Add patches 21-28 as RFC to support proxy-scheduled vCPUs and necessary bits to support virtio
> from Gunyah userspace
>
> Changes in v7: https://lore.kernel.org/all/[email protected]/
> - Refactor to remove gunyah RM bus
> - Refactor allow multiple RM device instances
> - Bump UAPI to start at 0x0
> - Refactor QCOM SCM's platform hooks to allow CONFIG_QCOM_SCM=Y/CONFIG_GUNYAH=M combinations
>
> Changes in v6: https://lore.kernel.org/all/[email protected]/
> - *Replace gunyah-console with gunyah VM Manager*
> - Move include/asm-generic/gunyah.h into include/linux/gunyah.h
> - s/gunyah_msgq/gh_msgq/
> - Minor tweaks and documentation tidying based on comments from Jiri, Greg, Arnd, Dmitry, and Bagas.
>
> Changes in v5: https://lore.kernel.org/all/[email protected]/
> - Dropped sysfs nodes
> - Switch from aux bus to Gunyah RM bus for the subdevices
> - Cleaning up RM console
>
> Changes in v4: https://lore.kernel.org/all/[email protected]/
> - Tidied up documentation throughout based on questions/feedback received
> - Switched message queue implementation to use mailboxes
> - Renamed "gunyah_device" as "gunyah_resource"
>
> Changes in v3: https://lore.kernel.org/all/[email protected]/
> - /Maintained/Supported/ in MAINTAINERS
> - Tidied up documentation throughout based on questions/feedback received
> - Moved hypercalls into arch/arm64/gunyah/; following hyper-v's implementation
> - Drop opaque typedefs
> - Move sysfs nodes under /sys/hypervisor/gunyah/
> - Moved Gunyah console driver to drivers/tty/
> - Reworked gunyah_device design to drop the Gunyah bus.
>
> Changes in v2: https://lore.kernel.org/all/[email protected]/
> - DT bindings clean up
> - Switch hypercalls to follow SMCCC
>
> v1: https://lore.kernel.org/all/[email protected]/
>
> Elliot Berman (26):
> docs: gunyah: Introduce Gunyah Hypervisor
> dt-bindings: Add binding for gunyah hypervisor
> gunyah: Common types and error codes for Gunyah hypercalls
> virt: gunyah: Add hypercalls to identify Gunyah
> virt: gunyah: Identify hypervisor version
> virt: gunyah: msgq: Add hypercalls to send and receive messages
> mailbox: Add Gunyah message queue mailbox
> gunyah: rsc_mgr: Add resource manager RPC core
> gunyah: rsc_mgr: Add VM lifecycle RPC
> gunyah: vm_mgr: Introduce basic VM Manager
> gunyah: rsc_mgr: Add RPC for sharing memory
> unyah: vm_mgr: Add/remove user memory regions
> gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot
> samples: Add sample userspace Gunyah VM Manager
> gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim
> firmware: qcom_scm: Register Gunyah platform ops
> docs: gunyah: Document Gunyah VM Manager
> virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource
> gunyah: vm_mgr: Add framework to add VM Functions
> virt: gunyah: Add resource tickets
> virt: gunyah: Add IO handlers
> virt: gunyah: Add proxy-scheduled vCPUs
> virt: gunyah: Add hypercalls for sending doorbell
> virt: gunyah: Add irqfd interface
> virt: gunyah: Add ioeventfd
> MAINTAINERS: Add Gunyah hypervisor drivers section
>
> .../bindings/firmware/gunyah-hypervisor.yaml | 82 ++
> .../userspace-api/ioctl/ioctl-number.rst | 1 +
> Documentation/virt/gunyah/index.rst | 114 +++
> Documentation/virt/gunyah/message-queue.rst | 69 ++
> Documentation/virt/gunyah/vm-manager.rst | 193 +++++
> Documentation/virt/index.rst | 1 +
> MAINTAINERS | 13 +
> arch/arm64/Kbuild | 1 +
> arch/arm64/gunyah/Makefile | 3 +
> arch/arm64/gunyah/gunyah_hypercall.c | 146 ++++
> arch/arm64/include/asm/gunyah.h | 23 +
> drivers/firmware/Kconfig | 2 +
> drivers/firmware/qcom_scm.c | 100 +++
> drivers/mailbox/Makefile | 2 +
> drivers/mailbox/gunyah-msgq.c | 214 +++++
> drivers/virt/Kconfig | 2 +
> drivers/virt/Makefile | 1 +
> drivers/virt/gunyah/Kconfig | 46 +
> drivers/virt/gunyah/Makefile | 11 +
> drivers/virt/gunyah/gunyah.c | 54 ++
> drivers/virt/gunyah/gunyah_ioeventfd.c | 113 +++
> drivers/virt/gunyah/gunyah_irqfd.c | 160 ++++
> drivers/virt/gunyah/gunyah_platform_hooks.c | 80 ++
> drivers/virt/gunyah/gunyah_vcpu.c | 463 ++++++++++
> drivers/virt/gunyah/rsc_mgr.c | 798 +++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 169 ++++
> drivers/virt/gunyah/rsc_mgr_rpc.c | 419 +++++++++
> drivers/virt/gunyah/vm_mgr.c | 801 ++++++++++++++++++
> drivers/virt/gunyah/vm_mgr.h | 70 ++
> drivers/virt/gunyah/vm_mgr_mm.c | 258 ++++++
> include/linux/gunyah.h | 198 +++++
> include/linux/gunyah_rsc_mgr.h | 171 ++++
> include/linux/gunyah_vm_mgr.h | 119 +++
> include/uapi/linux/gunyah.h | 191 +++++
> samples/Kconfig | 10 +
> samples/Makefile | 1 +
> samples/gunyah/.gitignore | 2 +
> samples/gunyah/Makefile | 6 +
> samples/gunyah/gunyah_vmm.c | 270 ++++++
> samples/gunyah/sample_vm.dts | 68 ++
> 40 files changed, 5445 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
> create mode 100644 Documentation/virt/gunyah/index.rst
> create mode 100644 Documentation/virt/gunyah/message-queue.rst
> create mode 100644 Documentation/virt/gunyah/vm-manager.rst
> create mode 100644 arch/arm64/gunyah/Makefile
> create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
> create mode 100644 arch/arm64/include/asm/gunyah.h
> create mode 100644 drivers/mailbox/gunyah-msgq.c
> create mode 100644 drivers/virt/gunyah/Kconfig
> create mode 100644 drivers/virt/gunyah/Makefile
> create mode 100644 drivers/virt/gunyah/gunyah.c
> create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
> create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
> create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
> create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
> create mode 100644 drivers/virt/gunyah/rsc_mgr.c
> create mode 100644 drivers/virt/gunyah/rsc_mgr.h
> create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
> create mode 100644 drivers/virt/gunyah/vm_mgr.c
> create mode 100644 drivers/virt/gunyah/vm_mgr.h
> create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
> create mode 100644 include/linux/gunyah.h
> create mode 100644 include/linux/gunyah_rsc_mgr.h
> create mode 100644 include/linux/gunyah_vm_mgr.h
> create mode 100644 include/uapi/linux/gunyah.h
> create mode 100644 samples/gunyah/.gitignore
> create mode 100644 samples/gunyah/Makefile
> create mode 100644 samples/gunyah/gunyah_vmm.c
> create mode 100644 samples/gunyah/sample_vm.dts
>
>
> base-commit: 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb


2023-02-23 22:41:26

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager



On 2/23/2023 2:08 AM, Srinivas Kandagatla wrote:
>
>
> On 22/02/2023 00:27, Elliot Berman wrote:
>>
>>>> +    .llseek = noop_llseek,
>>>> +};
>>>> +
>>>> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long
>>>> arg)
>>> Not sure what is the gain of this multiple levels of redirection.
>>>
>>> How about
>>>
>>> long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
>>> {
>>> ...
>>> }
>>>
>>> and rsc_mgr just call it as part of its ioctl call
>>>
>>> static long gh_dev_ioctl(struct file *filp, unsigned int cmd,
>>> unsigned long arg)
>>> {
>>>      struct miscdevice *miscdev = filp->private_data;
>>>      struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
>>>
>>>      switch (cmd) {
>>>      case GH_CREATE_VM:
>>>          return gh_dev_create_vm(rm, arg);
>>>      default:
>>>          return -ENOIOCTLCMD;
>>>      }
>>> }
>>>
>>
>> I'm anticipating we will add further /dev/gunyah ioctls and I thought
>> it would be cleaner to have all that in vm_mgr.c itself.
>>
>>>
>>>> +{
>>>> +    struct gh_vm *ghvm;
>>>> +    struct file *file;
>>>> +    int fd, err;
>>>> +
>>>> +    /* arg reserved for future use. */
>>>> +    if (arg)
>>>> +        return -EINVAL;
>>>
>>> The only code path I see here is via GH_CREATE_VM ioctl which
>>> obviously does not take any arguments, so if you are thinking of
>>> using the argument for architecture-specific VM flags.  Then this
>>> needs to be properly done by making the ABI aware of this.
>>
>> It is documented in Patch 17 (Document Gunyah VM Manager)
>>
>> +GH_CREATE_VM
>> +~~~~~~~~~~~~
>> +
>> +Creates a Gunyah VM. The argument is reserved for future use and must
>> be 0.
>>
> But this conficts with the UAPIs that have been defined. GH_CREATE_VM
> itself is defined to take no parameters.
>
> #define GH_CREATE_VM                    _IO(GH_IOCTL_TYPE, 0x0)
>
> so where are you expecting the argument to come from?
> >>>
>>> As you mentioned zero value arg imply an "unauthenticated VM" type,
>>> but this was not properly encoded in the userspace ABI. Why not make
>>> it future compatible. How about adding arguments to GH_CREATE_VM and
>>> pass the required information correctly.
>>> Note that once the ABI is accepted then you will not be able to
>>> change it, other than adding a new one.
>>>
>>
>> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure yet
>> what arguments to add here.
>>
>> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't
>
> Sorry, that is exactly what we want to avoid, we can not change the UAPI
> its going to break the userspace.
>
>> break compatibility with old kernels; old kernels reject it as -EINVAL.
>
> If you have userspace built with older kernel headers then that will
> break. Am not sure about old-kernels.
>
> What exactly is the argument that you want to add to GH_CREATE_VM?
>
> If you want to keep GH_CREATE_VM with no arguments that is fine but
> remove the conflicting comments in the code and document so that its not
> misleading readers/reviewers that the UAPI is going to be modified in
> near future.
>
>

The convention followed here comes from KVM_CREATE_VM. Is this ioctl
considered bad example?

>>
>>>> +
>>>> +    ghvm = gh_vm_alloc(rm);
>>>> +    if (IS_ERR(ghvm))
>>>> +        return PTR_ERR(ghvm);
>>>> +
>>>> +    fd = get_unused_fd_flags(O_CLOEXEC);
>>>> +    if (fd < 0) {
>>>> +        err = fd;
>>>> +        goto err_destroy_vm;
>>>> +    }
>>>> +
>>>> +    file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
>>>> +    if (IS_ERR(file)) {
>>>> +        err = PTR_ERR(file);
>>>> +        goto err_put_fd;
>>>> +    }
>>>> +
>>>> +    fd_install(fd, file);
>>>> +
>>>> +    return fd;
>>>> +
>>>> +err_put_fd:
>>>> +    put_unused_fd(fd);
>>>> +err_destroy_vm:
>>>> +    kfree(ghvm);
>>>> +    return err;
>>>> +}
>>>> +
>>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>>>> unsigned long arg)
>>>> +{
>>>> +    switch (cmd) {
>>>> +    case GH_CREATE_VM:
>>>> +        return gh_dev_ioctl_create_vm(rm, arg);
>>>> +    default:
>>>> +        return -ENOIOCTLCMD;
>>>> +    }
>>>> +}
>>>> diff --git a/drivers/virt/gunyah/vm_mgr.h
>>>> b/drivers/virt/gunyah/vm_mgr.h
>>>> new file mode 100644
>>>> index 000000000000..76954da706e9
>>>> --- /dev/null
>>>> +++ b/drivers/virt/gunyah/vm_mgr.h
>>>> @@ -0,0 +1,22 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>> +/*
>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>>> rights reserved.
>>>> + */
>>>> +
>>>> +#ifndef _GH_PRIV_VM_MGR_H
>>>> +#define _GH_PRIV_VM_MGR_H
>>>> +
>>>> +#include <linux/gunyah_rsc_mgr.h>
>>>> +
>>>> +#include <uapi/linux/gunyah.h>
>>>> +
>>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>>>> unsigned long arg);
>>>> +
>>>> +struct gh_vm {
>>>> +    u16 vmid;
>>>> +    struct gh_rm *rm;
>>>> +
>>>> +    struct work_struct free_work;
>>>> +};
>>>> +
>>>> +#endif
>>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>>> new file mode 100644
>>>> index 000000000000..10ba32d2b0a6
>>>> --- /dev/null
>>>> +++ b/include/uapi/linux/gunyah.h
>>>> @@ -0,0 +1,23 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>>>> +/*
>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>>> rights reserved.
>>>> + */
>>>> +
>>>> +#ifndef _UAPI_LINUX_GUNYAH
>>>> +#define _UAPI_LINUX_GUNYAH
>>>> +
>>>> +/*
>>>> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
>>>> + */
>>>> +
>>>> +#include <linux/types.h>
>>>> +#include <linux/ioctl.h>
>>>> +
>>>> +#define GH_IOCTL_TYPE            'G'
>>>> +
>>>> +/*
>>>> + * ioctls for /dev/gunyah fds:
>>>> + */
>>>> +#define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns
>>>> a Gunyah VM fd */
>>>
>>> Can HLOS forcefully destroy a VM?
>>> If so should we have a corresponding DESTROY IOCTL?
>>
>> It can forcefully destroy unauthenticated and protected virtual
>> machines. I don't have a userspace usecase for a DESTROY ioctl yet,
>> maybe this can be added later? By the way, the VM is forcefully
> that should be fine, but its also nice to add it for completeness, but
> not a compulsory atm
>
>> destroyed when VM refcount is dropped to 0 (close(vm_fd) and any other
>> relevant file descriptors).
> I have noticed that path.
>
> --srini
>>
>> - Elliot

2023-02-23 23:10:34

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 09/26] gunyah: rsc_mgr: Add VM lifecycle RPC



On 2/23/2023 1:36 PM, Alex Elder wrote:
> On 2/14/23 3:23 PM, Elliot Berman wrote:
>>
>> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   drivers/virt/gunyah/Makefile      |   2 +-
>>   drivers/virt/gunyah/rsc_mgr.h     |  45 ++++++
>>   drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
>>   include/linux/gunyah_rsc_mgr.h    |  73 ++++++++++
>>   4 files changed, 345 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index cc864ff5abbb..de29769f2f3f 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>>   obj-$(CONFIG_GUNYAH) += gunyah.o
>> -gunyah_rsc_mgr-y += rsc_mgr.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
>>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr.h
>> b/drivers/virt/gunyah/rsc_mgr.h
>> index d4e799a7526f..7406237bc66d 100644
>> --- a/drivers/virt/gunyah/rsc_mgr.h
>> +++ b/drivers/virt/gunyah/rsc_mgr.h
>> @@ -74,4 +74,49 @@ struct gh_rm;
>>   int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void
>> *req_buff, size_t req_buff_size,
>>           void **resp_buf, size_t *resp_buff_size);
>> +/* Message IDs: VM Management */
>> +#define GH_RM_RPC_VM_ALLOC_VMID            0x56000001
>> +#define GH_RM_RPC_VM_DEALLOC_VMID        0x56000002
>> +#define GH_RM_RPC_VM_START            0x56000004
>> +#define GH_RM_RPC_VM_STOP            0x56000005
>> +#define GH_RM_RPC_VM_RESET            0x56000006
>> +#define GH_RM_RPC_VM_CONFIG_IMAGE        0x56000009
>> +#define GH_RM_RPC_VM_INIT            0x5600000B
>> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES        0x56000020
>> +#define GH_RM_RPC_VM_GET_VMID            0x56000024
>> +
>> +struct gh_rm_vm_common_vmid_req {
>> +    __le16 vmid;
>> +    __le16 reserved0;
>> +} __packed;
>> +
>> +/* Call: VM_ALLOC */
>> +struct gh_rm_vm_alloc_vmid_resp {
>> +    __le16 vmid;
>> +    __le16 reserved0;
>> +} __packed;
>> +
>> +/* Call: VM_STOP */
>> +struct gh_rm_vm_stop_req {
>> +    __le16 vmid;
>> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP    BIT(0)
>> +    u8 flags;
>> +    u8 reserved;
>> +#define GH_RM_VM_STOP_REASON_FORCE_STOP        3
>
> I suggested this before and you honored it.  Now I'll suggest
> it again, and ask you to do it throughout the driver.
>
> Please separate the definitions of constant values that
> certain fields can take on from the structure definition.
> I think doing it the way you have here makes it harder to
> understand the structure definition.
>
> You could define an anonymous enumerated type to hold
> the values meant to be held by each field.
>

Done.

>> +    __le32 stop_reason;
>> +} __packed;
>> +
>> +/* Call: VM_CONFIG_IMAGE */
>> +struct gh_rm_vm_config_image_req {
>> +    __le16 vmid;
>> +    __le16 auth_mech;
>> +    __le32 mem_handle;
>> +    __le64 image_offset;
>> +    __le64 image_size;
>> +    __le64 dtb_offset;
>> +    __le64 dtb_size;
>> +} __packed;
>> +
>> +/* Call: GET_HYP_RESOURCES */
>> +
>>   #endif
>> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c
>> b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> new file mode 100644
>> index 000000000000..4515cdd80106
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> @@ -0,0 +1,226 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/gunyah_rsc_mgr.h>
>> +
>> +#include "rsc_mgr.h"
>> +
>> +/*
>> + * Several RM calls take only a VMID as a parameter and give only
>> standard
>> + * response back. Deduplicate boilerplate code by using this common
>> call.
>> + */
>> +static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id,
>> u16 vmid)
>> +{
>> +    struct gh_rm_vm_common_vmid_req req_payload = {
>> +        .vmid = cpu_to_le16(vmid),
>> +    };
>> +    size_t resp_size;
>> +    void *resp;
>> +
>> +    return gh_rm_call(rm, message_id, &req_payload,
>> sizeof(req_payload), &resp, &resp_size);
>> +}
>> +
>> +/**
>> + * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM
>> identifier.
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: Use GH_VMID_INVAL or 0 to dynamically allocate a VM. A
>> reserved VMID can
>> + *        be supplied to request allocation of a platform-defined VM.
>
> Honestly, I'd rather just see 0 (and *not* GH_VMID_INVAL) be the
> special value to mean "dynamically allocate the VMID."  It seems
> 0 is a reserved VMID anyway, and GH_VMID_INVAL might as well be
> treated here as an invalid parameter.

Done.

>
> Is there any definitition of which VMIDs are reserved?  Like,
> anything under 1024?

It's platform dependent. On Qualcomm platforms, VMIDs <= 63
(QCOM_SCM_MAX_MANAGED_VMID) are reserved. Of those reserved VMIDs,
Gunyah only allows us to allocate the "special VMs" (today: TUIVM,
CPUSYSVM, OEMVM). Passing any value except 0, tuivm_vmid, cpusysvm_vmid,
or oemvm_vmid returns an error.

On current non-Qualcomm platforms, there aren't any reserved VMIDs so
passing anything but 0 returns an error.

Thanks,
Elliot

>
> That's it on this patch for now.
>
>                     -Alex
>
>> + *
>> + * Returns - the allocated VMID or negative value on error
>> + */
>> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
>> +{
>> +    struct gh_rm_vm_common_vmid_req req_payload = { 0 };
>> +    struct gh_rm_vm_alloc_vmid_resp *resp_payload;
>> +    size_t resp_size;
>> +    void *resp;
>> +    int ret;
>
> . . .
>

2023-02-23 23:13:44

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 2/23/2023 2:29 AM, Srinivas Kandagatla wrote:
>
>
> On 22/02/2023 23:18, Elliot Berman wrote:
>>>>
>>>> +EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
>>>> +
>>>> +void get_gh_rm(struct gh_rm *rm)
>>>> +{
>>>> +    get_device(rm->dev);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(get_gh_rm);
>>>
>>> Can we have some consistency in the exported symbol naming,
>>> we have two combinations now.
>>>
>>> EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
>>> EXPORT_SYMBOL_GPL(get_gh_rm);
>>>
>>> lets stick to one.
>>
>> done.
>>
>>>> +
>>>> +void put_gh_rm(struct gh_rm *rm)
>>>> +{
>>>> +    put_device(rm->dev);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(put_gh_rm);
>>>>
>>> ...
>>>
>>>> +
>>>> +static int gh_rm_drv_probe(struct platform_device *pdev)
>>>> +{
>>>> +    struct gh_msgq_tx_data *msg;
>>>> +    struct gh_rm *rm;
>>>> +    int ret;
>>>> +
>>> How are we ensuring that gunyah driver is probed before this driver?
>>>
>>>
>>
>> Which driver?
>
> Am referring to gunyah.ko
>
> TBH, gunyah.c should be merged as part of resource manager, and check if
> uuids and features in probe before proceeding further.
>


Ah -- gunyah_rsc_mgr.ko has symbol dependency on gunyah-msgq.ko.
gunyah-msgq.ko has symbol dependency on gunyah.ko. gunyah.ko doesn't
have any probe and does all its work on module_init.

In order to merge gunyah.c with resource manager, I would need to
incorporate message queue mailbox into resource manager. IMO, this
rapidly moves towards a mega-module which was discouraged previously.

Thanks,
Elliot

2023-02-23 23:15:48

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox



On 2/23/2023 2:25 AM, Srinivas Kandagatla wrote:
>
>
> On 23/02/2023 00:15, Elliot Berman wrote:
>>
>>
>> On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>>>
>>>
>>> On 14/02/2023 21:23, Elliot Berman wrote:
>>>> Gunyah message queues are a unidirectional inter-VM pipe for
>>>> messages up
>>>> to 1024 bytes. This driver supports pairing a receiver message queue
>>>> and
>>>> a transmitter message queue to expose a single mailbox channel.
>>>>
>>>> Signed-off-by: Elliot Berman <[email protected]>
>>>> ---
>>>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>>>   drivers/mailbox/Makefile                   |   2 +
>>>>   drivers/mailbox/gunyah-msgq.c               | 214
>>>> ++++++++++++++++++++
>>>>   include/linux/gunyah.h                      |  56 +++++
>>>>   4 files changed, 280 insertions(+)
>>>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>>>
>>>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>>>> b/Documentation/virt/gunyah/message-queue.rst
>>>> index 0667b3eb1ff9..082085e981e0 100644
>>>> --- a/Documentation/virt/gunyah/message-queue.rst
>>>> +++ b/Documentation/virt/gunyah/message-queue.rst
>>>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs
>>>> (and two capability IDs).
>>>>         |               |         |                 | |               |
>>>>         |               |         |                 | |               |
>>>>         +---------------+         +-----------------+ +---------------+
>>>> +
>>>> +Gunyah message queues are exposed as mailboxes. To create the
>>>> mailbox, create
>>>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY
>>>> interrupt,
>>>> +all messages in the RX message queue are read and pushed via the
>>>> `rx_callback`
>>>> +of the registered mbox_client.
>>>> +
>>>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>>>> +   :identifiers: gh_msgq_init
>>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>>> index fc9376117111..5f929bb55e9a 100644
>>>> --- a/drivers/mailbox/Makefile
>>>> +++ b/drivers/mailbox/Makefile
>>>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>>>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>>>
>>> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not
>>> CONFIG_GUNYAH_MBOX?
>>>
>>
>> There was some previous discussion about this:
>>
>> https://lore.kernel.org/all/[email protected]/
>>
>>>> +
>>>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>>>> diff --git a/drivers/mailbox/gunyah-msgq.c
>>>> b/drivers/mailbox/gunyah-msgq.c
>>>> new file mode 100644
>>>> index 000000000000..03ffaa30ce9b
>>>> --- /dev/null
>>>> +++ b/drivers/mailbox/gunyah-msgq.c
>>>> @@ -0,0 +1,214 @@
>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>> +/*
>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>>> rights reserved.
>>>> + */
>>>> +
>>>> +#include <linux/mailbox_controller.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/interrupt.h>
>>>> +#include <linux/gunyah.h>
>>>> +#include <linux/printk.h>
>>>> +#include <linux/init.h>
>>>> +#include <linux/slab.h>
>>>> +#include <linux/wait.h>
>>>
>>> ...
>>>
>>>> +/* Fired when message queue transitions from "full" to "space
>>>> available" to send messages */
>>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>>> +{
>>>> +    struct gh_msgq *msgq = data;
>>>> +
>>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>>>> +
>>>> +    return IRQ_HANDLED;
>>>> +}
>>>> +
>>>> +/* Fired after sending message and hypercall told us there was more
>>>> space available. */
>>>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>>>
>>> Tasklets have been long deprecated, consider using workqueues in this
>>> particular case.
>>>
>>
>> Workqueues have higher latency and tasklets came as recommendation
>> from Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.
>>
>> I did some quick unscientific measurements of ~1000x samples. The
>> median latency for resource manager went from 25.5 us (tasklet) to 26
>> us (workqueue) (2% slower). The mean went from 28.7 us to 32.5 us (13%
>> slower). Obviously, the outliers for workqueues were much more extreme.
>
> TBH, this is expected because we are only testing resource manager, Note
>  the advantage that you will see shifting from tasket to workqueues is
> on overall system latencies and some drivers performance that need to
> react to events.
>
> please take some time to read this nice article about this
> https://lwn.net/Articles/830964/
>

Hmm, this article is from 2020 and there was another effort in 2007.
Neither seems to have succeeded. I'd like to stick to same mechanisms as
other mailbox controllers.

Jassi, do you have any preferences?

Thanks,
Elliot



2023-02-23 23:29:10

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core

On 2/14/23 3:23 PM, Elliot Berman wrote:
>
> The resource manager is a special virtual machine which is always
> running on a Gunyah system. It provides APIs for creating and destroying
> VMs, secure memory management, sharing/lending of memory between VMs,
> and setup of inter-VM communication. Calls to the resource manager are
> made via message queues.
>
> This patch implements the basic probing and RPC mechanism to make those
> API calls. Request/response calls can be made with gh_rm_call.
> Drivers can also register to notifications pushed by RM via
> gh_rm_register_notifier
>
> Specific API calls that resource manager supports will be implemented in
> subsequent patches.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 3 +
> drivers/virt/gunyah/rsc_mgr.c | 604 +++++++++++++++++++++++++++++++++
> drivers/virt/gunyah/rsc_mgr.h | 77 +++++
> include/linux/gunyah_rsc_mgr.h | 24 ++
> 4 files changed, 708 insertions(+)
> create mode 100644 drivers/virt/gunyah/rsc_mgr.c
> create mode 100644 drivers/virt/gunyah/rsc_mgr.h
> create mode 100644 include/linux/gunyah_rsc_mgr.h
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 34f32110faf9..cc864ff5abbb 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,3 +1,6 @@
> # SPDX-License-Identifier: GPL-2.0
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
> +
> +gunyah_rsc_mgr-y += rsc_mgr.o
> +obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> new file mode 100644
> index 000000000000..2a47139873a8
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -0,0 +1,604 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/of.h>
> +#include <linux/slab.h>
> +#include <linux/mutex.h>
> +#include <linux/sched.h>
> +#include <linux/gunyah.h>
> +#include <linux/module.h>
> +#include <linux/of_irq.h>
> +#include <linux/kthread.h>
> +#include <linux/notifier.h>
> +#include <linux/workqueue.h>
> +#include <linux/completion.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/platform_device.h>
> +
> +#include "rsc_mgr.h"
> +
> +#define RM_RPC_API_VERSION_MASK GENMASK(3, 0)
> +#define RM_RPC_HEADER_WORDS_MASK GENMASK(7, 4)
> +#define RM_RPC_API_VERSION FIELD_PREP(RM_RPC_API_VERSION_MASK, 1)
> +#define RM_RPC_HEADER_WORDS FIELD_PREP(RM_RPC_HEADER_WORDS_MASK, \
> + (sizeof(struct gh_rm_rpc_hdr) / sizeof(u32)))
> +#define RM_RPC_API (RM_RPC_API_VERSION | RM_RPC_HEADER_WORDS)
> +
> +#define RM_RPC_TYPE_CONTINUATION 0x0
> +#define RM_RPC_TYPE_REQUEST 0x1
> +#define RM_RPC_TYPE_REPLY 0x2
> +#define RM_RPC_TYPE_NOTIF 0x3
> +#define RM_RPC_TYPE_MASK GENMASK(1, 0)
> +
> +#define GH_RM_MAX_NUM_FRAGMENTS 62
> +#define RM_RPC_FRAGMENTS_MASK GENMASK(7, 2)
> +
> +struct gh_rm_rpc_hdr {
> + u8 api;
> + u8 type;
> + __le16 seq;
> + __le32 msg_id;
> +} __packed;
> +
> +struct gh_rm_rpc_reply_hdr {
> + struct gh_rm_rpc_hdr hdr;
> + __le32 err_code; /* GH_RM_ERROR_* */
> +} __packed;
> +
> +#define GH_RM_MAX_MSG_SIZE (GH_MSGQ_MAX_MSG_SIZE - sizeof(struct gh_rm_rpc_hdr))
> +
> +/**
> + * struct gh_rm_connection - Represents a complete message from resource manager
> + * @payload: Combined payload of all the fragments (msg headers stripped off).
> + * @size: Size of the payload received so far.
> + * @msg_id: Message ID from the header.

You do not define the @type field here.

> + * @num_fragments: total number of fragments expected to be received.
> + * @fragments_received: fragments received so far.
> + * @reply: Fields used for request/reply sequences
> + * @ret: Linux return code, set in case there was an error processing connection
> + * @type: RM_RPC_TYPE_REPLY or RM_RPC_TYPE_NOTIF.

Oh, I guess you should move the above line up...

> + * @rm_error: For request/reply sequences with standard replies.
> + * @seq: Sequence ID for the main message.
> + * @seq_done: Signals caller that the RM reply has been received

Now that you've defined parts of these in a union, it
might be clearer to document them as separate struct
types defined earlier (above gh_rm_connection).

> + * @notification: Fields used for notifiations

You added a new rm pointer field, not documented here.

> + * @work: Triggered when all fragments of a notification received
> + */
> +struct gh_rm_connection {
> + void *payload;
> + size_t size;
> + __le32 msg_id;
> + u8 type;
> +
> + u8 num_fragments;
> + u8 fragments_received;
> +
> + union {
> + struct {
> + int ret;
> + u16 seq;
> + enum gh_rm_error rm_error;
> + struct completion seq_done;
> + } reply;
> +
> + struct {
> + struct gh_rm *rm;
> + struct work_struct work;
> + } notification;
> + };
> +};
> +
> +struct gh_rm {
> + struct device *dev;
> + struct gunyah_resource tx_ghrsc, rx_ghrsc;

Please split the above two definitions into two separate lines.

> + struct gh_msgq msgq;
> + struct mbox_client msgq_client;
> + struct gh_rm_connection *active_rx_connection;
> + int last_tx_ret;
> +

Maybe the next two fields can just be "idr" and "idr_lock".

> + struct idr call_idr;
> + struct mutex call_idr_lock;
> +
> + struct kmem_cache *cache;
> + struct mutex send_lock;
> + struct blocking_notifier_head nh;
> +};

The next function is very simple. You don't supply a
free_connection counterpart, so maybe it's not even
needed (just open-code it in the two places it's used)?

> +static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
> +{
> + struct gh_rm_connection *connection;
> +
> + connection = kzalloc(sizeof(*connection), GFP_KERNEL);
> + if (!connection)
> + return ERR_PTR(-ENOMEM);
> +
> + connection->type = type;
> + connection->msg_id = msg_id;
> +
> + return connection;
> +}
> +
> +static int gh_rm_init_connection_payload(struct gh_rm_connection *connection, void *msg,
> + size_t hdr_size, size_t msg_size)

The value of hdr_size is *always* sizeof(*hdr), so you can
do without passing it as an argument.

> +{
> + size_t max_buf_size, payload_size;
> + struct gh_rm_rpc_hdr *hdr = msg;
> +

It probably sounds dumb, but I'd reverse the values
compared below (and the operator).

> + if (hdr_size > msg_size)
> + return -EINVAL;
> +
> + payload_size = msg_size - hdr_size;
> +
> + connection->num_fragments = FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type);
> + connection->fragments_received = 0;
> +
> + /* There's not going to be any payload, no need to allocate buffer. */
> + if (!payload_size && !connection->num_fragments)

The payload size is the same across all messages in the
"connection" right? As is the number of fragments?
It's not even possible/valid to have a zero payload size
and non-zero number of fragments. I think the second
half of the above test can be dropped.

> + return 0;
> +
> + if (connection->num_fragments > GH_RM_MAX_NUM_FRAGMENTS)
> + return -EINVAL;
> +
> + max_buf_size = payload_size + (connection->num_fragments * GH_RM_MAX_MSG_SIZE);
> +
> + connection->payload = kzalloc(max_buf_size, GFP_KERNEL);
> + if (!connection->payload)
> + return -ENOMEM;
> +
> + memcpy(connection->payload, msg + hdr_size, payload_size);

I think I suggested (hdr + 1) rather than (msg + size) elsewhere
and you took that suggestion. I'd say do it one way or the other,
consistently, everywhere.

> + connection->size = payload_size;
> + return 0;
> +}
> +
> +static void gh_rm_notif_work(struct work_struct *work)
> +{
> + struct gh_rm_connection *connection = container_of(work, struct gh_rm_connection,
> + notification.work);
> + struct gh_rm *rm = connection->notification.rm;
> +
> + blocking_notifier_call_chain(&rm->nh, connection->msg_id, connection->payload);
> +
> + put_gh_rm(rm);
> + kfree(connection->payload);
> + kfree(connection);
> +}
> +
> +static struct gh_rm_connection *gh_rm_process_notif(struct gh_rm *rm, void *msg, size_t msg_size)

I think it might be better if you do some of what the caller
does here. I.e., verify the current connection is null (and
abort if not and make it NULL), then assign it to the new
connection before you return success. And return an errno.

> +{
> + struct gh_rm_connection *connection;
> + struct gh_rm_rpc_hdr *hdr = msg;
> + int ret;
> +
> + connection = gh_rm_alloc_connection(hdr->msg_id, RM_RPC_TYPE_NOTIF);
> + if (IS_ERR(connection)) {
> + dev_err(rm->dev, "Failed to alloc connection for notification: %ld, dropping.\n",
> + PTR_ERR(connection));
> + return NULL;
> + }
> +
> + get_gh_rm(rm);
> + connection->notification.rm = rm;
> + INIT_WORK(&connection->notification.work, gh_rm_notif_work);
> +
> + ret = gh_rm_init_connection_payload(connection, msg, sizeof(*hdr), msg_size);
> + if (ret) {
> + dev_err(rm->dev, "Failed to initialize connection buffer for notification: %d\n",
> + ret);
> + kfree(connection);
> + return NULL;
> + }
> +
> + return connection;
> +}
> +
> +static struct gh_rm_connection *gh_rm_process_rply(struct gh_rm *rm, void *msg, size_t msg_size)
> +{

Here too, make sure there is no active connection and then
set it within this function if the errno returned is 0.

> + struct gh_rm_rpc_reply_hdr *reply_hdr = msg;
> + struct gh_rm_connection *connection;
> + u16 seq_id = le16_to_cpu(reply_hdr->hdr.seq);
> +
> + mutex_lock(&rm->call_idr_lock);
> + connection = idr_find(&rm->call_idr, seq_id);
> + mutex_unlock(&rm->call_idr_lock);
> +
> + if (!connection || connection->msg_id != reply_hdr->hdr.msg_id)
> + return NULL;
> +
> + if (gh_rm_init_connection_payload(connection, msg, sizeof(*reply_hdr), msg_size)) {
> + dev_err(rm->dev, "Failed to alloc connection buffer for sequence %d\n", seq_id);
> + /* Send connection complete and error the client. */
> + connection->reply.ret = -ENOMEM;
> + complete(&connection->reply.seq_done);
> + return NULL;
> + }
> +
> + connection->reply.rm_error = le32_to_cpu(reply_hdr->err_code);
> + return connection;
> +}
> +
> +static int gh_rm_process_cont(struct gh_rm *rm, struct gh_rm_connection *connection,
> + void *msg, size_t msg_size)

Similar comment here. Have this function verify there is
a non-null active connection. Then process the message
and abort if there's an error (and null the active connection
pointer).

> +{
> + struct gh_rm_rpc_hdr *hdr = msg;
> + size_t payload_size = msg_size - sizeof(*hdr);
> +
> + /*
> + * hdr->fragments and hdr->msg_id preserves the value from first reply
> + * or notif message. To detect mishandling, check it's still intact.
> + */
> + if (connection->msg_id != hdr->msg_id ||
> + connection->num_fragments != FIELD_GET(RM_RPC_FRAGMENTS_MASK, hdr->type))
> + return -EINVAL;

Maybe -EBADMSG?

> +
> + memcpy(connection->payload + connection->size, msg + sizeof(*hdr), payload_size);
> + connection->size += payload_size;
> + connection->fragments_received++;
> + return 0;
> +}
> +
> +static void gh_rm_abort_connection(struct gh_rm_connection *connection)
> +{
> + switch (connection->type) {
> + case RM_RPC_TYPE_REPLY:
> + connection->reply.ret = -EIO;
> + complete(&connection->reply.seq_done);
> + break;
> + case RM_RPC_TYPE_NOTIF:
> + fallthrough;
> + default:
> + kfree(connection->payload);
> + kfree(connection);
> + }
> +}
> +
> +static bool gh_rm_complete_connection(struct gh_rm *rm, struct gh_rm_connection *connection)

The only caller of this function passes rm->active_rx_connection
as the second argument. It is available to you here, so you
can get rid of that argument.

> +{
> + if (!connection || connection->fragments_received != connection->num_fragments)
> + return false;
> +
> + switch (connection->type) {
> + case RM_RPC_TYPE_REPLY:
> + complete(&connection->reply.seq_done);
> + break;
> + case RM_RPC_TYPE_NOTIF:
> + schedule_work(&connection->notification.work);
> + break;
> + default:
> + dev_err(rm->dev, "Invalid message type (%d) received\n", connection->type);
> + gh_rm_abort_connection(connection);
> + break;
> + }
> +
> + return true;
> +}
> +
> +static void gh_rm_msgq_rx_data(struct mbox_client *cl, void *mssg)
> +{
> + struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
> + struct gh_msgq_rx_data *rx_data = mssg;
> + size_t msg_size = rx_data->length;
> + void *msg = rx_data->data;
> + struct gh_rm_rpc_hdr *hdr;
> +

Is it required that at least one byte (past the header) will
be received? I.e., should the "<=" below just be "<"?

> + if (msg_size <= sizeof(*hdr) || msg_size > GH_MSGQ_MAX_MSG_SIZE)
> + return;

You previously reported a message here. These seem like
errors, which if they occur, maybe should be reported.
They seem like "never happen" issues, but it's defensive
to make these checks (which is good).

> +
> + hdr = msg;
> + if (hdr->api != RM_RPC_API) {

If this ever happens, is the hardware failing? It seems
like once Gunyah is initialized and you've checked the
API version once, there should be no need to check it
repeatedly.

> + dev_err(rm->dev, "Unknown RM RPC API version: %x\n", hdr->api);
> + return;
> + }
> +
> + switch (FIELD_GET(RM_RPC_TYPE_MASK, hdr->type)) {
> + case RM_RPC_TYPE_NOTIF:
> + rm->active_rx_connection = gh_rm_process_notif(rm, msg, msg_size);
> + break;
> + case RM_RPC_TYPE_REPLY:
> + rm->active_rx_connection = gh_rm_process_rply(rm, msg, msg_size);
> + break;
> + case RM_RPC_TYPE_CONTINUATION:
> + if (gh_rm_process_cont(rm, rm->active_rx_connection, msg, msg_size)) {
> + gh_rm_abort_connection(rm->active_rx_connection);
> + rm->active_rx_connection = NULL;
> + }
> + break;
> + default:
> + dev_err(rm->dev, "Invalid message type (%lu) received\n",
> + FIELD_GET(RM_RPC_TYPE_MASK, hdr->type));
> + return;
> + }
> +
> + if (gh_rm_complete_connection(rm, rm->active_rx_connection))
> + rm->active_rx_connection = NULL;
> +}
> +
> +static void gh_rm_msgq_tx_done(struct mbox_client *cl, void *mssg, int r)
> +{
> + struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
> +
> + kmem_cache_free(rm->cache, mssg);
> + rm->last_tx_ret = r;
> +}
> +
> +static int gh_rm_send_request(struct gh_rm *rm, u32 message_id,
> + const void *req_buff, size_t req_buff_size,
> + struct gh_rm_connection *connection)
> +{
> + u8 msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_REQUEST);
> + size_t buff_size_remaining = req_buff_size;
> + const void *req_buff_curr = req_buff;
> + struct gh_msgq_tx_data *msg;
> + struct gh_rm_rpc_hdr *hdr;
> + u32 cont_fragments = 0;
> + size_t payload_size;
> + void *payload;
> + int ret;
> +
> + if (req_buff_size)
> + cont_fragments = (req_buff_size - 1) / GH_RM_MAX_MSG_SIZE;

Compute this *after* verifying the size isn't too big.

> +
> + if (req_buff_size > GH_RM_MAX_NUM_FRAGMENTS * GH_RM_MAX_MSG_SIZE) {
> + pr_warn("Limit exceeded for the number of fragments: %u\n", cont_fragments);
> + dump_stack();
> + return -E2BIG;
> + }
> +
> + ret = mutex_lock_interruptible(&rm->send_lock);
> + if (ret)
> + return ret;
> +
> + /* Consider also the 'request' packet for the loop count */
> + do {
> + msg = kmem_cache_zalloc(rm->cache, GFP_KERNEL);
> + if (!msg) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + /* Fill header */
> + hdr = (struct gh_rm_rpc_hdr *)msg->data;
> + hdr->api = RM_RPC_API;
> + hdr->type = msg_type | FIELD_PREP(RM_RPC_FRAGMENTS_MASK, cont_fragments);
> + hdr->seq = cpu_to_le16(connection->reply.seq);
> + hdr->msg_id = cpu_to_le32(message_id);

Most of the above are constant for every message. I think the only
thing that changes is the type field. It might not make a difference
but you could compute the "generic" header outside the loop and
assign it as a structure, then overwrite the type field.

> +
> + /* Copy payload */
> + payload = hdr + 1;
> + payload_size = min(buff_size_remaining, GH_RM_MAX_MSG_SIZE);
> + memcpy(payload, req_buff_curr, payload_size);
> + req_buff_curr += payload_size;
> + buff_size_remaining -= payload_size;
> +
> + /* Force the last fragment to immediately alert the receiver */
> + msg->push = !buff_size_remaining;
> + msg->length = sizeof(*hdr) + payload_size;
> +
> + ret = mbox_send_message(gh_msgq_chan(&rm->msgq), msg);
> + if (ret < 0) {
> + kmem_cache_free(rm->cache, msg);
> + break;
> + }
> +
> + if (rm->last_tx_ret) {
> + ret = rm->last_tx_ret;
> + break;
> + }
> +
> + msg_type = FIELD_PREP(RM_RPC_TYPE_MASK, RM_RPC_TYPE_CONTINUATION);
> + } while (buff_size_remaining);
> +
> +out:
> + mutex_unlock(&rm->send_lock);
> + return ret < 0 ? ret : 0;
> +}
> +
> +/**
> + * gh_rm_call: Achieve request-response type communication with RPC
> + * @rm: Pointer to Gunyah resource manager internal data
> + * @message_id: The RM RPC message-id
> + * @req_buff: Request buffer that contains the payload
> + * @req_buff_size: Total size of the payload
> + * @resp_buf: Pointer to a response buffer

Is it "buf" or is it "buff"? I prefer the former, but you
should be consistent in your namings.

> + * @resp_buff_size: Size of the response buffer
> + *
> + * Make a request to the RM-VM and wait for reply back. For a successful
> + * response, the function returns the payload. The size of the payload is set in
> + * resp_buff_size. The resp_buf should be freed by the caller.

It might not matter, but resp_buf should be freed by the
caller *if 0 is returned* (no error).

> + *
> + * req_buff should be not NULL for req_buff_size >0. If req_buff_size == 0,
> + * req_buff *can* be NULL and no additional payload is sent.
> + *
> + * Context: Process context. Will sleep waiting for reply.
> + * Return: 0 on success. <0 if error.
> + */
> +int gh_rm_call(struct gh_rm *rm, u32 message_id, void *req_buff, size_t req_buff_size,
> + void **resp_buf, size_t *resp_buff_size)
> +{
> + struct gh_rm_connection *connection;
> + int ret;
> +
> + /* message_id 0 is reserved. req_buff_size implies req_buf is not NULL */
> + if (!message_id || (!req_buff && req_buff_size) || !rm)
> + return -EINVAL;
> +
> + connection = gh_rm_alloc_connection(cpu_to_le32(message_id), RM_RPC_TYPE_REPLY);
> + if (IS_ERR(connection))
> + return PTR_ERR(connection);
> +
> + init_completion(&connection->reply.seq_done);
> +
> + /* Allocate a new seq number for this connection */
> + mutex_lock(&rm->call_idr_lock);
> + ret = idr_alloc_cyclic(&rm->call_idr, connection, 0, U16_MAX,
> + GFP_KERNEL);
> + mutex_unlock(&rm->call_idr_lock);
> + if (ret < 0)
> + goto out;

You need a different error path label here. If there's an
error, the IDR allocation failed (so shoudln't be removed).
(Right?)

> + connection->reply.seq = ret;
> +
> + /* Send the request to the Resource Manager */
> + ret = gh_rm_send_request(rm, message_id, req_buff, req_buff_size, connection);
> + if (ret < 0)
> + goto out;
> +
> + /* Wait for response */
> + ret = wait_for_completion_interruptible(&connection->reply.seq_done);
> + if (ret)
> + goto out;
> +
> + /* Check for internal (kernel) error waiting for the response */
> + if (connection->reply.ret) {
> + ret = connection->reply.ret;
> + if (ret != -ENOMEM)
> + kfree(connection->payload);
> + goto out;
> + }
> +
> + /* Got a response, did resource manager give us an error? */
> + if (connection->reply.rm_error != GH_RM_ERROR_OK) {
> + pr_warn("RM rejected message %08x. Error: %d\n", message_id,
> + connection->reply.rm_error);
> + dump_stack();
> + ret = gh_rm_remap_error(connection->reply.rm_error);
> + kfree(connection->payload);
> + goto out;
> + }
> +
> + /* Everything looks good, return the payload */
> + *resp_buff_size = connection->size;
> + if (connection->size)
> + *resp_buf = connection->payload;
> + else {
> + /* kfree in case RM sent us multiple fragments but never any data in
> + * those fragments. We would've allocated memory for it, but connection->size == 0
> + */
> + kfree(connection->payload);
> + }
> +
> +out:
> + mutex_lock(&rm->call_idr_lock);
> + idr_remove(&rm->call_idr, connection->reply.seq);
> + mutex_unlock(&rm->call_idr_lock);
> + kfree(connection);
> + return ret;
> +}
> +
> +
> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb)
> +{
> + return blocking_notifier_chain_register(&rm->nh, nb);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_notifier_register);
> +
> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb)
> +{
> + return blocking_notifier_chain_unregister(&rm->nh, nb);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
> +
> +void get_gh_rm(struct gh_rm *rm)

It is often pretty handy to return the argument in
functions like this. It simultaneously takes the
reference and assigns the pointer the reference
represents.


> +{
> + get_device(rm->dev);
> +}
> +EXPORT_SYMBOL_GPL(get_gh_rm);
> +
> +void put_gh_rm(struct gh_rm *rm)
> +{
> + put_device(rm->dev);
> +}
> +EXPORT_SYMBOL_GPL(put_gh_rm);
> +
> +static int gh_msgq_platform_probe_direction(struct platform_device *pdev,
> + bool tx, int idx, struct gunyah_resource *ghrsc)
> +{
> + struct device_node *node = pdev->dev.of_node;
> + int ret;

I think you should declare idx as a local variable.

int idx = tx ? 1 : 0;

> +
> + ghrsc->type = tx ? GUNYAH_RESOURCE_TYPE_MSGQ_TX : GUNYAH_RESOURCE_TYPE_MSGQ_RX;
> +
> + ghrsc->irq = platform_get_irq(pdev, idx);

Do you suppose you could do platform_get_irq_byname(), and then
specify the names of the IRQs ("rm_tx_irq" and "rm_rx_irq" maybe)?

> + if (ghrsc->irq < 0) {
> + dev_err(&pdev->dev, "Failed to get irq%d: %d\n", idx, ghrsc->irq);

Maybe: "Failed to get %cX IRQ: %d\n", tx ? 'T' : 'R', ghrsc->irq);

> + return ghrsc->irq;
> + }
> +
> + ret = of_property_read_u64_index(node, "reg", idx, &ghrsc->capid);

Is a capability ID a simple (but large) number?

The *resource manager* (which is a very special VM) has to
have both TX and RX message queue capability IDs. Is there
'any chance that these specific capability IDs have values
that are fixed by the design? Like, 0 and 1? I don't know
what they are, but it seems like it *could* be something
fixed by the design, and if that were the case, there would
be no need to specify the "reg" property to get the "capid"
values.

> + if (ret) {
> + dev_err(&pdev->dev, "Failed to get capid%d: %d\n", idx, ret);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int gh_rm_drv_probe(struct platform_device *pdev)
> +{
> + struct gh_msgq_tx_data *msg;
> + struct gh_rm *rm;
> + int ret;
> +
> + rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
> + if (!rm)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, rm);
> + rm->dev = &pdev->dev;
> +
> + mutex_init(&rm->call_idr_lock);
> + idr_init(&rm->call_idr);
> + rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data, GH_MSGQ_MAX_MSG_SIZE), 0,
> + SLAB_HWCACHE_ALIGN, NULL);
> + if (!rm->cache)
> + return -ENOMEM;

If you abstracted the allocation interface for these messages,
you could actually survive without the slab cache here. But
if this fails, maybe you won't get far anyway.

> + mutex_init(&rm->send_lock);
> + BLOCKING_INIT_NOTIFIER_HEAD(&rm->nh);
> +
> + ret = gh_msgq_platform_probe_direction(pdev, true, 0, &rm->tx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + ret = gh_msgq_platform_probe_direction(pdev, false, 1, &rm->rx_ghrsc);
> + if (ret)
> + goto err_cache;
> +
> + rm->msgq_client.dev = &pdev->dev;
> + rm->msgq_client.tx_block = true;
> + rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
> + rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
> +
> + return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +err_cache:
> + kmem_cache_destroy(rm->cache);
> + return ret;
> +}
> +
> +static int gh_rm_drv_remove(struct platform_device *pdev)
> +{
> + struct gh_rm *rm = platform_get_drvdata(pdev);
> +
> + mbox_free_channel(gh_msgq_chan(&rm->msgq));
> + gh_msgq_remove(&rm->msgq);
> + kmem_cache_destroy(rm->cache);
> +
> + return 0;
> +}
> +
> +static const struct of_device_id gh_rm_of_match[] = {
> + { .compatible = "gunyah-resource-manager" },
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, gh_rm_of_match);
> +
> +static struct platform_driver gh_rm_driver = {
> + .probe = gh_rm_drv_probe,
> + .remove = gh_rm_drv_remove,
> + .driver = {
> + .name = "gh_rsc_mgr",
> + .of_match_table = gh_rm_of_match,
> + },
> +};
> +module_platform_driver(gh_rm_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Resource Manager Driver");
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> new file mode 100644
> index 000000000000..d4e799a7526f
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -0,0 +1,77 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +#ifndef __GH_RSC_MGR_PRIV_H
> +#define __GH_RSC_MGR_PRIV_H
> +
> +#include <linux/gunyah.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/types.h>
> +
> +/* RM Error codes */
> +enum gh_rm_error {
> + GH_RM_ERROR_OK = 0x0,
> + GH_RM_ERROR_UNIMPLEMENTED = 0xFFFFFFFF,
> + GH_RM_ERROR_NOMEM = 0x1,
> + GH_RM_ERROR_NORESOURCE = 0x2,
> + GH_RM_ERROR_DENIED = 0x3,
> + GH_RM_ERROR_INVALID = 0x4,
> + GH_RM_ERROR_BUSY = 0x5,
> + GH_RM_ERROR_ARGUMENT_INVALID = 0x6,
> + GH_RM_ERROR_HANDLE_INVALID = 0x7,
> + GH_RM_ERROR_VALIDATE_FAILED = 0x8,
> + GH_RM_ERROR_MAP_FAILED = 0x9,
> + GH_RM_ERROR_MEM_INVALID = 0xA,
> + GH_RM_ERROR_MEM_INUSE = 0xB,
> + GH_RM_ERROR_MEM_RELEASED = 0xC,
> + GH_RM_ERROR_VMID_INVALID = 0xD,
> + GH_RM_ERROR_LOOKUP_FAILED = 0xE,
> + GH_RM_ERROR_IRQ_INVALID = 0xF,
> + GH_RM_ERROR_IRQ_INUSE = 0x10,
> + GH_RM_ERROR_IRQ_RELEASED = 0x11,
> +};
> +
> +/**
> + * gh_rm_remap_error() - Remap Gunyah resource manager errors into a Linux error code
> + * @gh_error: "Standard" return value from Gunyah resource manager
> + */
> +static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
> +{
> + switch (rm_error) {
> + case GH_RM_ERROR_OK:
> + return 0;
> + case GH_RM_ERROR_UNIMPLEMENTED:
> + return -EOPNOTSUPP;
> + case GH_RM_ERROR_NOMEM:
> + return -ENOMEM;
> + case GH_RM_ERROR_NORESOURCE:
> + return -ENODEV;
> + case GH_RM_ERROR_DENIED:
> + return -EPERM;
> + case GH_RM_ERROR_BUSY:
> + return -EBUSY;
> + case GH_RM_ERROR_INVALID:
> + case GH_RM_ERROR_ARGUMENT_INVALID:
> + case GH_RM_ERROR_HANDLE_INVALID:
> + case GH_RM_ERROR_VALIDATE_FAILED:
> + case GH_RM_ERROR_MAP_FAILED:
> + case GH_RM_ERROR_MEM_INVALID:
> + case GH_RM_ERROR_MEM_INUSE:
> + case GH_RM_ERROR_MEM_RELEASED:
> + case GH_RM_ERROR_VMID_INVALID:
> + case GH_RM_ERROR_LOOKUP_FAILED:
> + case GH_RM_ERROR_IRQ_INVALID:
> + case GH_RM_ERROR_IRQ_INUSE:
> + case GH_RM_ERROR_IRQ_RELEASED:
> + return -EINVAL;
> + default:
> + return -EBADMSG;
> + }
> +}
> +
> +struct gh_rm;

This might just be my preference, but I like to see declarations
like the one above grouped at the top of the file, under includes.

> +int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
> + void **resp_buf, size_t *resp_buff_size);
> +
> +#endif
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> new file mode 100644
> index 000000000000..c992b3188c8d
> --- /dev/null
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GUNYAH_RSC_MGR_H
> +#define _GUNYAH_RSC_MGR_H
> +
> +#include <linux/list.h>
> +#include <linux/notifier.h>
> +#include <linux/gunyah.h>
> +
> +#define GH_VMID_INVAL U16_MAX
> +
> +/* Gunyah recognizes VMID0 as an alias to the current VM's ID */
> +#define GH_VMID_SELF 0

I haven't really checked very well, bur you should *use this*
definition where a VMID is being examined. I.e., if you're
going to define this, then never just compare a VMID against 0.

-Alex

> +
> +struct gh_rm;
> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
> +void get_gh_rm(struct gh_rm *rm);
> +void put_gh_rm(struct gh_rm *rm);
> +
> +#endif


2023-02-23 23:42:05

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 01/26] docs: gunyah: Introduce Gunyah Hypervisor

On 2/14/23 3:12 PM, Elliot Berman wrote:
> Gunyah is an open-source Type-1 hypervisor developed by Qualcomm. It
> does not depend on any lower-privileged OS/kernel code for its core
> functionality. This increases its security and can support a smaller
> trusted computing based when compared to Type-2 hypervisors.
>
> Add documentation describing the Gunyah hypervisor and the main
> components of the Gunyah hypervisor which are of interest to Linux
> virtualization development.
>
> Reviewed-by: Bagas Sanjaya <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> Documentation/virt/gunyah/index.rst | 113 ++++++++++++++++++++
> Documentation/virt/gunyah/message-queue.rst | 61 +++++++++++
> Documentation/virt/index.rst | 1 +
> 3 files changed, 175 insertions(+)
> create mode 100644 Documentation/virt/gunyah/index.rst
> create mode 100644 Documentation/virt/gunyah/message-queue.rst
>
> diff --git a/Documentation/virt/gunyah/index.rst b/Documentation/virt/gunyah/index.rst
> new file mode 100644
> index 000000000000..45adbbc311db
> --- /dev/null
> +++ b/Documentation/virt/gunyah/index.rst
> @@ -0,0 +1,113 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=================
> +Gunyah Hypervisor
> +=================
> +
> +.. toctree::
> + :maxdepth: 1
> +
> + message-queue
> +
> +Gunyah is a Type-1 hypervisor which is independent of any OS kernel, and runs in
> +a higher CPU privilege level. It does not depend on any lower-privileged operating system
> +for its core functionality. This increases its security and can support a much smaller
> +trusted computing base than a Type-2 hypervisor.
> +
> +Gunyah is an open source hypervisor. The source repo is available at
> +https://github.com/quic/gunyah-hypervisor.
> +
> +Gunyah provides these following features.
> +
> +- Scheduling:
> +
> + A scheduler for virtual CPUs (vCPUs) on physical CPUs enables time-sharing
> + of the CPUs. Gunyah supports two models of scheduling:
> +
> + 1. "Behind the back" scheduling in which Gunyah hypervisor schedules vCPUS on its own.
> + 2. "Proxy" scheduling in which a delegated VM can donate part of one of its vCPU slice
> + to another VM's vCPU via a hypercall.
> +
> +- Memory Management:
> +
> + APIs handling memory, abstracted as objects, limiting direct use of physical
> + addresses. Memory ownership and usage tracking of all memory under its control.
> + Memory partitioning between VMs is a fundamental security feature.
> +
> +- Interrupt Virtualization:
> +
> + Uses CPU hardware interrupt virtualization capabilities. Interrupts are handled
> + in the hypervisor and routed to the assigned VM.
> +
> +- Inter-VM Communication:
> +
> + There are several different mechanisms provided for communicating between VMs.
> +
> +- Virtual platform:
> +
> + Architectural devices such as interrupt controllers and CPU timers are directly provided
> + by the hypervisor as well as core virtual platform devices and system APIs such as ARM PSCI.
> +
> +- Device Virtualization:
> +
> + Para-virtualization of devices is supported using inter-VM communication.
> +
> +Architectures supported
> +=======================
> +AArch64 with a GIC
> +
> +Resources and Capabilities
> +==========================
> +
> +Some services or resources provided by the Gunyah hypervisor are described to a virtual machine by
> +capability IDs. For instance, inter-VM communication is performed with doorbells and message queues.
> +Gunyah allows access to manipulate that doorbell via the capability ID. These resources are
> +described in Linux as a struct gunyah_resource.
> +
> +High level management of these resources is performed by the resource manager VM. RM informs a
> +guest VM about resources it can access through either the device tree or via guest-initiated RPC.
> +
> +For each virtual machine, Gunyah maintains a table of resources which can be accessed by that VM.
> +An entry in this table is called a "capability" and VMs can only access resources via this
> +capability table. Hence, virtual Gunyah resources are referenced by a "capability IDs" and not
> +"resource IDs". If 2 VMs have access to the same resource, they might not be using the same
> +capability ID to access that resource since the capability tables are independent per VM.
> +
> +Resource Manager
> +================
> +
> +The resource manager (RM) is a privileged application VM supporting the Gunyah Hypervisor.
> +It provides policy enforcement aspects of the virtualization system. The resource manager can
> +be treated as an extension of the Hypervisor but is separated to its own partition to ensure
> +that the hypervisor layer itself remains small and secure and to maintain a separation of policy
> +and mechanism in the platform. RM runs at arm64 NS-EL1 similar to other virtual machines.
> +
> +Communication with the resource manager from each guest VM happens with message-queue.rst. Details
> +about the specific messages can be found in drivers/virt/gunyah/rsc_mgr.c
> +
> +::
> +
> + +-------+ +--------+ +--------+
> + | RM | | VM_A | | VM_B |
> + +-.-.-.-+ +---.----+ +---.----+
> + | | | |
> + +-.-.-----------.------------.----+
> + | | \==========/ | |
> + | \========================/ |
> + | Gunyah |
> + +---------------------------------+
> +
> +The source for the resource manager is available at https://github.com/quic/gunyah-resource-manager.
> +
> +The resource manager provides the following features:
> +
> +- VM lifecycle management: allocating a VM, starting VMs, destruction of VMs
> +- VM access control policy, including memory sharing and lending
> +- Interrupt routing configuration
> +- Forwarding of system-level events (e.g. VM shutdown) to owner VM
> +
> +When booting a virtual machine which uses a devicetree such as Linux, resource manager overlays a
> +/hypervisor node. This node can let Linux know it is running as a Gunyah guest VM,
> +how to communicate with resource manager, and basic description and capabilities of
> +this VM. See Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml for a description
> +of this node.
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> new file mode 100644
> index 000000000000..0667b3eb1ff9
> --- /dev/null
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -0,0 +1,61 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Message Queues
> +==============
> +Message queue is a simple low-capacity IPC channel between two VMs. It is
> +intended for sending small control and configuration messages. Each message
> +queue is unidirectional, so a full-duplex IPC channel requires a pair of queues.
> +
> +Messages can be up to 240 bytes in length. Longer messages require a further
> +protocol on top of the message queue messages themselves. For instance, communication
> +with the resource manager adds a header field for sending longer messages via multiple
> +message fragments.
> +
> +The diagram below shows how message queue works. A typical configuration involves
> +2 message queues. Message queue 1 allows VM_A to send messages to VM_B. Message
> +queue 2 allows VM_B to send messages to VM_A.
> +
> +1. VM_A sends a message of up to 240 bytes in length. It raises a hypercall

Can you clarify that the message being sent is in the VM's *own*
memory/ Maybe this is clear, but the message doesn't have to (for
example) be located in shared memory. The original message is
copied into message queue buffers in order to be transferred.

> + with the message to inform the hypervisor to add the message to
> + message queue 1's queue.
> +
> +2. Gunyah raises the corresponding interrupt for VM_B (Rx vIRQ) when any of
> + these happens:
> +
> + a. gh_msgq_send has PUSH flag. Queue is immediately flushed. This is the typical case.

Below you use gh_msgq_send() (with parentheses). I prefer that,
but whatever you do, do it consistently.

> + b. Explicility with gh_msgq_push command from VM_A.
> + c. Message queue has reached a threshold depth.
> +
> +3. VM_B calls gh_msgq_recv and Gunyah copies message to requested buffer.
> +
> +4. Gunyah buffers messages in the queue. If the queue became full when VM_A added a message,
> + the return values for gh_msgq_send() include a flag that indicates the queue is full.
> + Once VM_B receives the message and, thus, there is space in the queue, Gunyah
> + will raise the Tx vIRQ on VM_A to indicate it can continue sending messages.
> +
> +For VM_B to send a message to VM_A, the process is identical, except that hypercalls
> +reference message queue 2's capability ID. Each message queue has its own independent
> +vIRQ: two TX message queues will have two vIRQs (and two capability IDs).

Can a sender determine when a message has been delivered?
Does the TX vIRQ indicate only that the messaging system
has processed the message (taken it and queued it), but
says nothing about it being delivered/accepted/received?

-Alex

> +
> +::
> +
> + +---------------+ +-----------------+ +---------------+
> + | VM_A | |Gunyah hypervisor| | VM_B |
> + | | | | | |
> + | | | | | |
> + | | Tx | | | |
> + | |-------->| | Rx vIRQ | |
> + |gh_msgq_send() | Tx vIRQ |Message queue 1 |-------->|gh_msgq_recv() |
> + | |<------- | | | |
> + | | | | | |
> + | Message Queue | | | | Message Queue |
> + | driver | | | | driver |
> + | | | | | |
> + | | | | | |
> + | | | | Tx | |
> + | | Rx vIRQ | |<--------| |
> + |gh_msgq_recv() |<--------|Message queue 2 | Tx vIRQ |gh_msgq_send() |
> + | | | |-------->| |
> + | | | | | |
> + | | | | | |
> + +---------------+ +-----------------+ +---------------+
> diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
> index 7fb55ae08598..15869ee059b3 100644
> --- a/Documentation/virt/index.rst
> +++ b/Documentation/virt/index.rst
> @@ -16,6 +16,7 @@ Virtualization Support
> coco/sev-guest
> coco/tdx-guest
> hyperv/index
> + gunyah/index
>
> .. only:: html and subproject
>


2023-02-23 23:55:12

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 17/26] docs: gunyah: Document Gunyah VM Manager

On 2/14/23 3:25 PM, Elliot Berman wrote:
>
> Document the ioctls and usage of Gunyah VM Manager driver.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> Documentation/virt/gunyah/index.rst | 1 +
> Documentation/virt/gunyah/vm-manager.rst | 106 +++++++++++++++++++++++
> 2 files changed, 107 insertions(+)
> create mode 100644 Documentation/virt/gunyah/vm-manager.rst
>
> diff --git a/Documentation/virt/gunyah/index.rst b/Documentation/virt/gunyah/index.rst
> index 45adbbc311db..b204b85e86db 100644
> --- a/Documentation/virt/gunyah/index.rst
> +++ b/Documentation/virt/gunyah/index.rst
> @@ -7,6 +7,7 @@ Gunyah Hypervisor
> .. toctree::
> :maxdepth: 1
>
> + vm-manager
> message-queue
>
> Gunyah is a Type-1 hypervisor which is independent of any OS kernel, and runs in
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> new file mode 100644
> index 000000000000..c0126cfeadc7
> --- /dev/null
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -0,0 +1,106 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================
> +Virtual Machine Manager
> +=======================
> +
> +The Gunyah Virtual Machine Manager is a Linux driver to support launching
> +virtual machines using Gunyah. It presently supports launching non-proxy
> +scheduled Linux-like virtual machines.
> +
> +Except for some basic information about the location of initial binaries,
> +most of the configuration about a Gunyah virtual machine is described in the
> +VM's devicetree. The devicetree is generated by userspace. Interacting with the
> +virtual machine is still done via the kernel and VM configuration requires some
> +of the corresponding functionality to be set up in the kernel. For instance,
> +sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
> +ioctl. The VM itself is configured to use the memory region via the
> +devicetree.
> +
> +Sample Userspace VMM
> +====================
> +
> +A sample userspace VMM is included in samples/gunyah/ along with a minimal
> +devicetree that can be used to launch a VM. To build this sample, enable
> +CONFIG_SAMPLE_GUNYAH.
> +
> +IOCTLs and userspace VMM flows
> +==============================
> +
> +The kernel exposes a char device interface at /dev/gunyah.
> +
> +To create a VM, use the GH_CREATE_VM ioctl. A successful call will return a
> +"Gunyah VM" file descriptor.
> +
> +/dev/gunyah API Descriptions
> +----------------------------
> +
> +GH_CREATE_VM
> +~~~~~~~~~~~~
> +
> +Creates a Gunyah VM. The argument is reserved for future use and must be 0.

I wouldn't say it "must be zero". Instead maybe say it is
is currently ignored.

> +
> +Gunyah VM API Descriptions
> +--------------------------
> +
> +GH_VM_SET_USER_MEM_REGION
> +~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> + struct gh_userspace_memory_region {
> + __u32 label;
> + __u32 flags;
> + __u64 guest_phys_addr;
> + __u64 memory_size;
> + __u64 userspace_addr;
> + };
> +
> +This ioctl allows the user to create or delete a memory parcel for a guest
> +virtual machine. Each memory region is uniquely identified by a label;
> +attempting to create two regions with the same label is not allowed.

Must the label be unique across a single instance of Gunyah (on
a single physical machine)? Or is it unique within a VM? Or
something else? (It's not universally unique, right?)

> +
> +While VMM is guest-agnostic and allows runtime addition of memory regions,
> +Linux guest virtual machines do not support accepting memory regions at runtime.
> +Thus, memory regions should be provided before starting the VM and the VM must
> +be configured to accept these at boot-up.
> +
> +The guest physical address is used by Linux kernel to check that the requested
> +user regions do not overlap and to help find the corresponding memory region
> +for calls like GH_VM_SET_DTB_CONFIG. It should be page aligned.

The physical address must be page aligned. Can a page be anything
other than 4096 bytes?

> +
> +memory_size and userspace_addr should be page-aligned.

Not should, must, right? (The address isn't rounded down to a
page boundary, for example.)

> +
> +The flags field of gh_userspace_memory_region accepts the following bits. All
> +other bits must be 0 and are reserved for future use. The ioctl will return
> +-EINVAL if an unsupported bit is detected.
> +
> + - GH_MEM_ALLOW_READ/GH_MEM_ALLOW_WRITE/GH_MEM_ALLOW_EXEC sets read/write/exec
> + permissions for the guest, respectively.
> + - GH_MEM_LENT means that the memory will be unmapped from the host and be
> + unaccessible by the host while the guest has the region.
> +
> +To add a memory region, call GH_VM_SET_USER_MEM_REGION with fields set as
> +described above.
> +
> +To delete a memory region, call GH_VM_SET_USER_MEM_REGION with label set to the
> +desired region and memory_size set to 0.
> +
> +GH_VM_SET_DTB_CONFIG
> +~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> + struct gh_vm_dtb_config {
> + __u64 gpa;

What is "gpa"? Guest physical address? Gunyah pseudo address?
Can this have a longer and more descriptive name please?

> + __u64 size;
> + };
> +
> +This ioctl sets the location of the VM's devicetree blob and is used by Gunyah
> +Resource Manager to allocate resources. The guest physical memory should be part
> +of the primary memory parcel provided to the VM prior to GH_VM_START.

Any alignment constraints? (If not, you could say "there are no
alignment constraints on the address or size.")

> +
> +GH_VM_START
> +~~~~~~~~~~~
> +
> +This ioctl starts the VM.

Is there anything you can say about what gets returned for
these (at least for significant cases, like permission
problems or something)?

Are IOCTLs the normal way for virtual machine mechanisms
to set up things like this? (Noob question.)

-Alex

2023-02-24 00:09:24

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 04/26] virt: gunyah: Add hypercalls to identify Gunyah

On 2/14/23 3:12 PM, Elliot Berman wrote:
> Add hypercalls to identify when Linux is running a virtual machine under
> Gunyah.
>
> There are two calls to help identify Gunyah:
>
> 1. gh_hypercall_get_uid() returns a UID when running under a Gunyah
> hypervisor.
> 2. gh_hypercall_hyp_identify() returns build information and a set of
> feature flags that are supported by Gunyah.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> arch/arm64/Kbuild | 1 +
> arch/arm64/gunyah/Makefile | 3 ++
> arch/arm64/gunyah/gunyah_hypercall.c | 61 ++++++++++++++++++++++++++++
> drivers/virt/Kconfig | 2 +
> drivers/virt/gunyah/Kconfig | 13 ++++++
> include/linux/gunyah.h | 33 +++++++++++++++
> 6 files changed, 113 insertions(+)
> create mode 100644 arch/arm64/gunyah/Makefile
> create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
> create mode 100644 drivers/virt/gunyah/Kconfig
>
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..e4847ba0e3c9 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -3,6 +3,7 @@ obj-y += kernel/ mm/ net/
> obj-$(CONFIG_KVM) += kvm/
> obj-$(CONFIG_XEN) += xen/
> obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
> +obj-$(CONFIG_GUNYAH) += gunyah/
> obj-$(CONFIG_CRYPTO) += crypto/
>
> # for cleaning
> diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
> new file mode 100644
> index 000000000000..84f1e38cafb1
> --- /dev/null
> +++ b/arch/arm64/gunyah/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> new file mode 100644
> index 000000000000..f30d06ee80cf
> --- /dev/null
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -0,0 +1,61 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/arm-smccc.h>
> +#include <linux/module.h>
> +#include <linux/gunyah.h>
> +
> +static const uint32_t gunyah_known_uuids[][4] = {
> + {0x19bd54bd, 0x0b37571b, 0x946f609b, 0x54539de6}, /* QC_HYP (Qualcomm's build) */
> + {0x673d5f14, 0x9265ce36, 0xa4535fdb, 0xc1d58fcd}, /* GUNYAH (open source build) */
> +};

Are these really UUIDs? Standard ones? Define them using
the standard Linux way of doing it. See <linux/uuid.h>.

> +
> +bool arch_is_gunyah_guest(void)
> +{
> + struct arm_smccc_res res;
> + u32 uid[4];
> + int i;
> +
> + arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
> +
> + uid[0] = lower_32_bits(res.a0);
> + uid[1] = lower_32_bits(res.a1);
> + uid[2] = lower_32_bits(res.a2);
> + uid[3] = lower_32_bits(res.a3);
> +
> + for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
> + if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
> + break;
> +
> + return i != ARRAY_SIZE(gunyah_known_uuids);
> +}
> +EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
> +
> +#define GH_HYPERCALL(fn) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
> + ARM_SMCCC_OWNER_VENDOR_HYP, \
> + fn)
> +
> +#define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +
> +/**
> + * gh_hypercall_hyp_identify() - Returns build information and feature flags
> + * supported by Gunyah.
> + * @hyp_identity: filled by the hypercall with the API info and feature flags.
> + */
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
> +
> + hyp_identity->api_info = res.a0;
> + hyp_identity->flags[0] = res.a1;
> + hyp_identity->flags[1] = res.a2;
> + hyp_identity->flags[2] = res.a3;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..85bd6626ffc9 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>
> source "drivers/virt/coco/tdx-guest/Kconfig"
>
> +source "drivers/virt/gunyah/Kconfig"
> +
> endif
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> new file mode 100644
> index 000000000000..1a737694c333
> --- /dev/null
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -0,0 +1,13 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config GUNYAH
> + tristate "Gunyah Virtualization drivers"
> + depends on ARM64
> + depends on MAILBOX
> + help
> + The Gunyah drivers are the helper interfaces that run in a guest VM
> + such as basic inter-VM IPC and signaling mechanisms, and higher level
> + services such as memory/device sharing, IRQ sharing, and so on.
> +
> + Say Y/M here to enable the drivers needed to interact in a Gunyah
> + virtual environment.
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 59ef4c735ae8..3fef2854c5e1 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -6,8 +6,10 @@
> #ifndef _LINUX_GUNYAH_H
> #define _LINUX_GUNYAH_H
>
> +#include <linux/bitfield.h>
> #include <linux/errno.h>
> #include <linux/limits.h>
> +#include <linux/types.h>
>
> /******************************************************************************/
> /* Common arch-independent definitions for Gunyah hypercalls */
> @@ -79,4 +81,35 @@ static inline int gh_remap_error(enum gh_error gh_error)
> }
> }
>
> +enum gh_api_feature {
> + GH_API_FEATURE_DOORBELL,
> + GH_API_FEATURE_MSGQUEUE,
> + GH_API_FEATURE_VCPU,
> + GH_API_FEATURE_MEMEXTENT,
> +};

Can't you reuse these symbols, so that the same set
represents the feature and the identify response?

I'm not sure what a good naming scheme would be, but
you could easily do:
enum gh_api_feature {
GH_FEATURE_DOORBELL = 1,
GH_FEATURE_MSGQUEUE = 2,
GH_FEATURE_VCPU = 5,
GH_FEATURE_MEMEXTENT = 6,
};

And then you could do:

bool gh_api_has_feature(enum gh_api_feature feature)
{
switch (feature) {
case GH_FEATURE_DOORBELL:
case GH_FEATURES_MSGQUEUE:
case GH_FEATURE_VCPU:
case GH_FEATURE_MEMEXTENT:
return !!(gunyah_api.flags[0] & BIT_ULL(feature));

default:
return false;
}
}

> +
> +bool arch_is_gunyah_guest(void);
> +
> +u16 gh_api_version(void);
> +bool gh_api_has_feature(enum gh_api_feature feature);
> +
> +#define GUNYAH_API_V1 1
> +

Rather than _INFO_ here, maybe _IDENTIFY_?

Why is "API_" needed in these symbol names?

> +#define GH_API_INFO_API_VERSION_MASK GENMASK_ULL(13, 0)
> +#define GH_API_INFO_BIG_ENDIAN BIT_ULL(14)
> +#define GH_API_INFO_IS_64BIT BIT_ULL(15)
> +#define GH_API_INFO_VARIANT_MASK GENMASK_ULL(63, 56)
> +
> +#define GH_IDENTIFY_DOORBELL BIT_ULL(1)
> +#define GH_IDENTIFY_MSGQUEUE BIT_ULL(2)
> +#define GH_IDENTIFY_VCPU BIT_ULL(5)
> +#define GH_IDENTIFY_MEMEXTENT BIT_ULL(6)
> +
> +struct gh_hypercall_hyp_identify_resp {
> + u64 api_info;
> + u64 flags[3];
> +};
> +
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);

Since this is a user space API, you *could* consider having
this function return an int. Just in case there's a future
reason that a failure could occur, or that you want to
supply some other information. If this truly doesn't make
sense, it's fine as-is...

> +
> #endif


2023-02-24 00:15:18

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 06/26] virt: gunyah: msgq: Add hypercalls to send and receive messages

On 2/14/23 3:23 PM, Elliot Berman wrote:
> Add hypercalls to send and receive messages on a Gunyah message queue.
>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> arch/arm64/gunyah/gunyah_hypercall.c | 32 ++++++++++++++++++++++++++++
> include/linux/gunyah.h | 7 ++++++
> 2 files changed, 39 insertions(+)
>
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> index f30d06ee80cf..2ca9ab098ff6 100644
> --- a/arch/arm64/gunyah/gunyah_hypercall.c
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -38,6 +38,8 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
> fn)
>
> #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x8000)
> +#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x801B)
> +#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x801C)
>
> /**
> * gh_hypercall_hyp_identify() - Returns build information and feature flags
> @@ -57,5 +59,35 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
> }
> EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
>
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
> + bool *ready)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, buff, tx_flags, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK)
> + *ready = res.a1;
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
> +
> +enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
> + bool *ready)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, buff, size, 0, &res);
> +
> + if (res.a0 == GH_ERROR_OK) {
> + *recv_size = res.a1;

Is there any chance the 64-bit size is incompatible
with size_t? (Too big?)

> + *ready = res.a2;

*ready = !!res.a2;

> + }
> +
> + return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
> +
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 3fef2854c5e1..cb6df4eec5c2 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -112,4 +112,11 @@ struct gh_hypercall_hyp_identify_resp {
>
> void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
>
> +#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH BIT(0)
> +
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
> + bool *ready);

Why uintptr_t? Why not just pass a host pointer (void *)
and do whatever conversion is necessary inside the function?

-Alex

> +enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
> + bool *ready);
> +
> #endif


2023-02-24 00:34:37

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

On 2/14/23 3:24 PM, Elliot Berman wrote:
>
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.
>
> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
> ---
> drivers/virt/gunyah/Makefile | 2 +-
> drivers/virt/gunyah/vm_mgr.c | 44 ++++++
> drivers/virt/gunyah/vm_mgr.h | 25 ++++
> drivers/virt/gunyah/vm_mgr_mm.c | 235 ++++++++++++++++++++++++++++++++
> include/uapi/linux/gunyah.h | 33 +++++
> 5 files changed, 338 insertions(+), 1 deletion(-)
> create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
>
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 03951cf82023..ff8bc4925392 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>
> obj-$(CONFIG_GUNYAH) += gunyah.o
>
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
> obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index fd890a57172e..84102bac03cc 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -18,8 +18,16 @@
> static void gh_vm_free(struct work_struct *work)
> {
> struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> + struct gh_vm_mem *mapping, *tmp;
> int ret;
>
> + mutex_lock(&ghvm->mm_lock);
> + list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> + }
> + mutex_unlock(&ghvm->mm_lock);
> +
> ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> if (ret)
> pr_warn("Failed to deallocate vmid: %d\n", ret);
> @@ -48,11 +56,46 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> ghvm->vmid = vmid;
> ghvm->rm = rm;
>
> + mutex_init(&ghvm->mm_lock);
> + INIT_LIST_HEAD(&ghvm->memory_mappings);
> INIT_WORK(&ghvm->free_work, gh_vm_free);
>
> return ghvm;
> }
>
> +static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> + struct gh_vm *ghvm = filp->private_data;
> + void __user *argp = (void __user *)arg;
> + long r;
> +
> + switch (cmd) {
> + case GH_VM_SET_USER_MEM_REGION: {
> + struct gh_userspace_memory_region region;
> +
> + if (copy_from_user(&region, argp, sizeof(region)))
> + return -EFAULT;
> +
> + /* All other flag bits are reserved for future use */
> + if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC |
> + GH_MEM_LENT))
> + return -EINVAL;
> +
> +
> + if (region.memory_size)

Would there be any value in allowing a zero-size memory
region to be created? Maybe that doesn't make sense, but
I guess i'm questioning whether a zero memory region size
have special meaning in this interface is a good thing to
do. You could sensibly have a separate REMOVE_USER_MEM_REGION
request, and still permit 0 to be a valid size.

> + r = gh_vm_mem_alloc(ghvm, &region);
> + else
> + r = gh_vm_mem_free(ghvm, region.label);
> + break;
> + }
> + default:
> + r = -ENOTTY;
> + break;
> + }
> +
> + return r;
> +}
> +
> static int gh_vm_release(struct inode *inode, struct file *filp)
> {
> struct gh_vm *ghvm = filp->private_data;
> @@ -65,6 +108,7 @@ static int gh_vm_release(struct inode *inode, struct file *filp)
> }
>
> static const struct file_operations gh_vm_fops = {
> + .unlocked_ioctl = gh_vm_ioctl,
> .release = gh_vm_release,
> .compat_ioctl = compat_ptr_ioctl,
> .llseek = noop_llseek,
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 76954da706e9..97bc00c34878 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -7,16 +7,41 @@
> #define _GH_PRIV_VM_MGR_H
>
> #include <linux/gunyah_rsc_mgr.h>
> +#include <linux/list.h>
> +#include <linux/miscdevice.h>
> +#include <linux/mutex.h>
>
> #include <uapi/linux/gunyah.h>
>
> long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
>
> +enum gh_vm_mem_share_type {
> + VM_MEM_SHARE,
> + VM_MEM_LEND,

Are there any other share types anticipated? Even if
there were, for now you could use a Boolean to distinguish
between shared or lent (at least until a third option
materializes).

> +};
> +
> +struct gh_vm_mem {
> + struct list_head list;
> + enum gh_vm_mem_share_type share_type;
> + struct gh_rm_mem_parcel parcel;
> +
> + __u64 guest_phys_addr;
> + struct page **pages;
> + unsigned long npages;
> +};
> +
> struct gh_vm {
> u16 vmid;
> struct gh_rm *rm;
>
> struct work_struct free_work;
> + struct mutex mm_lock;
> + struct list_head memory_mappings;
> };
>
> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region);
> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
> +struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
> +
> #endif
> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> new file mode 100644
> index 000000000000..03e71a36ea3b
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -0,0 +1,235 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/mm.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static inline bool page_contiguous(phys_addr_t p, phys_addr_t t)

Is there not some existing function that captures this?
In any case, it's used in one place and I think it would
be clearer to just put the logic there rather than hiding
it behind this function.

> +{
> + return t - p == PAGE_SIZE;
> +}
> +
> +static struct gh_vm_mem *__gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
> + __must_hold(&ghvm->mm_lock)
> +{
> + struct gh_vm_mem *mapping;
> +
> + list_for_each_entry(mapping, &ghvm->memory_mappings, list)
> + if (mapping->parcel.label == label)
> + return mapping;
> +
> + return NULL;
> +}
> +

. . .

> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index 10ba32d2b0a6..d85d12119a48 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -20,4 +20,37 @@
> */
> #define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
>
> +/*
> + * ioctls for VM fds
> + */
> +
> +/**
> + * struct gh_userspace_memory_region - Userspace memory descripion for GH_VM_SET_USER_MEM_REGION
> + * @label: Unique identifer to the region.

Maybe this is described somewhere, but what is the purpose
of the label? Who uses it? Is it meant to be a value
only the current owner of a resource understands? Or does
resource manager use it internally, or what?

> + * @flags: Flags for memory parcel behavior
> + * @guest_phys_addr: Location of the memory region in guest's memory space (page-aligned)
> + * @memory_size: Size of the region (page-aligned)
> + * @userspace_addr: Location of the memory region in caller (userspace)'s memory
> + *
> + * See Documentation/virt/gunyah/vm-manager.rst for further details.
> + */
> +struct gh_userspace_memory_region {
> + __u32 label;

Define the possible permission values separate from
the structure.

-Alex

> +#define GH_MEM_ALLOW_READ (1UL << 0)
> +#define GH_MEM_ALLOW_WRITE (1UL << 1)
> +#define GH_MEM_ALLOW_EXEC (1UL << 2)
> +/*
> + * The guest will be lent the memory instead of shared.
> + * In other words, the guest has exclusive access to the memory region and the host loses access.
> + */
> +#define GH_MEM_LENT (1UL << 3)
> + __u32 flags;
> + __u64 guest_phys_addr;
> + __u64 memory_size;
> + __u64 userspace_addr;
> +};
> +
> +#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> + struct gh_userspace_memory_region)
> +
> #endif


2023-02-24 00:43:47

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions



On 2/21/2023 4:28 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:24, Elliot Berman wrote:
>>
>> When launching a virtual machine, Gunyah userspace allocates memory for
>> the guest and informs Gunyah about these memory regions through
>> SET_USER_MEMORY_REGION ioctl.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   drivers/virt/gunyah/Makefile    |   2 +-
>>   drivers/virt/gunyah/vm_mgr.c    |  44 ++++++
>>   drivers/virt/gunyah/vm_mgr.h    |  25 ++++
>>   drivers/virt/gunyah/vm_mgr_mm.c | 235 ++++++++++++++++++++++++++++++++
>>   include/uapi/linux/gunyah.h     |  33 +++++
>>   5 files changed, 338 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index 03951cf82023..ff8bc4925392 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>>   obj-$(CONFIG_GUNYAH) += gunyah.o
>> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> index fd890a57172e..84102bac03cc 100644
>> --- a/drivers/virt/gunyah/vm_mgr.c
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -18,8 +18,16 @@
>>   static void gh_vm_free(struct work_struct *work)
>>   {
>>       struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
>> +    struct gh_vm_mem *mapping, *tmp;
>>       int ret;
>> +    mutex_lock(&ghvm->mm_lock);
>> +    list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings,
>> list) {
>> +        gh_vm_mem_reclaim(ghvm, mapping);
>> +        kfree(mapping);
>> +    }
>> +    mutex_unlock(&ghvm->mm_lock);
>> +
>>       ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>>       if (ret)
>>           pr_warn("Failed to deallocate vmid: %d\n", ret);
>> @@ -48,11 +56,46 @@ static __must_check struct gh_vm
>> *gh_vm_alloc(struct gh_rm *rm)
>>       ghvm->vmid = vmid;
>>       ghvm->rm = rm;
>> +    mutex_init(&ghvm->mm_lock);
>> +    INIT_LIST_HEAD(&ghvm->memory_mappings);
>>       INIT_WORK(&ghvm->free_work, gh_vm_free);
>>       return ghvm;
>>   }
>> +static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned
>> long arg)
>> +{
>> +    struct gh_vm *ghvm = filp->private_data;
>> +    void __user *argp = (void __user *)arg;
>> +    long r;
>> +
>> +    switch (cmd) {
>> +    case GH_VM_SET_USER_MEM_REGION: {
>> +        struct gh_userspace_memory_region region;
>> +
>> +        if (copy_from_user(&region, argp, sizeof(region)))
>> +            return -EFAULT;
>> +
>> +        /* All other flag bits are reserved for future use */
>> +        if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE |
>> GH_MEM_ALLOW_EXEC |
>> +            GH_MEM_LENT))
>> +            return -EINVAL;
>> +
>> +
>> +        if (region.memory_size)
>> +            r = gh_vm_mem_alloc(ghvm, &region);
>> +        else
>> +            r = gh_vm_mem_free(ghvm, region.label);
>
> Looks like we are repurposing GH_VM_SET_USER_MEM_REGION for allocation
> and freeing.
>
> Should we have corresponding GH_VM_UN_SET_USER_MEM_REGION instead for
> freeing? given that label is the only relevant member of struct
> gh_userspace_memory_region in free case.
>
>

I'm following convention of KVM here, which re-uses
KVM_SET_USER_MEM_REGION for deleting regions as well.

One question though --

We don't have need to support removal of memory regions while VM is
running. Gunyah rejects removal of parcels that haven't been released
and no guests currently support releasing a memory parcel while it's
running. With the current series, the only time memory parcels can be
reclaimed is when VM is being disposed after shut down. With that in
mind, shall I drop the removal of memory regions in v11? I had added it
for symmetry/completeness, but I'm holding GH_VM_DESTROY for now as well
[1].

[1]:
https://lore.kernel.org/all/[email protected]/

>> +        break;
>> +    }
>> +    default:
>> +        r = -ENOTTY;
>> +        break;
>> +    }
>> +
>> +    return r;
>> +}
>> +
>>   static int gh_vm_release(struct inode *inode, struct file *filp)
>>   {
>>       struct gh_vm *ghvm = filp->private_data;
>> @@ -65,6 +108,7 @@ static int gh_vm_release(struct inode *inode,
>> struct file *filp)
>>   }
>>   static const struct file_operations gh_vm_fops = {
>> +    .unlocked_ioctl = gh_vm_ioctl,
>>       .release = gh_vm_release,
>>       .compat_ioctl    = compat_ptr_ioctl,
>>       .llseek = noop_llseek,
>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>> index 76954da706e9..97bc00c34878 100644
>> --- a/drivers/virt/gunyah/vm_mgr.h
>> +++ b/drivers/virt/gunyah/vm_mgr.h
>> @@ -7,16 +7,41 @@
>>   #define _GH_PRIV_VM_MGR_H
>>   #include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/list.h>
>> +#include <linux/miscdevice.h>
>> +#include <linux/mutex.h>
>>   #include <uapi/linux/gunyah.h>
>>   long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>> unsigned long arg);
>> +enum gh_vm_mem_share_type {
>> +    VM_MEM_SHARE,
>> +    VM_MEM_LEND,
>> +};
>> +
>> +struct gh_vm_mem {
>> +    struct list_head list;
>> +    enum gh_vm_mem_share_type share_type;
>> +    struct gh_rm_mem_parcel parcel;
>> +
>> +    __u64 guest_phys_addr;
>> +    struct page **pages;
>> +    unsigned long npages;
>> +};
>> +
>>   struct gh_vm {
>>       u16 vmid;
>>       struct gh_rm *rm;
>>       struct work_struct free_work;
>> +    struct mutex mm_lock;
>> +    struct list_head memory_mappings;
>>   };
>> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct
>> gh_userspace_memory_region *region);
>> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
>> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
>> +struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
>> +
>>   #endif
>> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c
>> b/drivers/virt/gunyah/vm_mgr_mm.c
>> new file mode 100644
>> index 000000000000..03e71a36ea3b
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
>> @@ -0,0 +1,235 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
>> +
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/mm.h>
>> +
>> +#include <uapi/linux/gunyah.h>
>> +
>> +#include "vm_mgr.h"
>> +
>> +static inline bool page_contiguous(phys_addr_t p, phys_addr_t t)
>> +{
>> +    return t - p == PAGE_SIZE;
>> +}
>> +
>> +static struct gh_vm_mem *__gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
>> +    __must_hold(&ghvm->mm_lock)
>> +{
>> +    struct gh_vm_mem *mapping;
>> +
>> +    list_for_each_entry(mapping, &ghvm->memory_mappings, list)
>> +        if (mapping->parcel.label == label)
>> +            return mapping;
>> +
>> +    return NULL;
>> +}
>> +
>> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
>> +    __must_hold(&ghvm->mm_lock)
>> +{
>> +    int i, ret = 0;
>> +
>> +    if (mapping->parcel.mem_handle != GH_MEM_HANDLE_INVAL) {
>> +        ret = gh_rm_mem_reclaim(ghvm->rm, &mapping->parcel);
>> +        if (ret)
>> +            pr_warn("Failed to reclaim memory parcel for label %d:
>> %d\n",
>> +                mapping->parcel.label, ret);
>
> what the behavoir of hypervisor if we failed to reclaim the pages?
>

Hypervisor doesn't modify access to the pages.

>> +    }
>> +
>> +    if (!ret)
> So we will leave the user pages pinned if hypervisor call fails, but
> further down we free the mapping all together.
>
> Am not 100% sure if this will have any side-effect, but is it okay to
> leave user-pages pinned with no possiblity of unpinning them in such cases?
>

I think it's not okay, but the only way this could fail is if there is a
kernel bug. I'd rather not BUG_ON?

>
>> +        for (i = 0; i < mapping->npages; i++)
>> +            unpin_user_page(mapping->pages[i]);
>> +
>> +    kfree(mapping->pages);
>> +    kfree(mapping->parcel.acl_entries);
>> +    kfree(mapping->parcel.mem_entries);
>> +
>> +    list_del(&mapping->list);
>> +}
>> +
>> +struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
>> +{
>> +    struct gh_vm_mem *mapping;
>> +    int ret;
>> +
>> +    ret = mutex_lock_interruptible(&ghvm->mm_lock);
>> +    if (ret)
>> +        return ERR_PTR(ret);
> new line would be nice here.
>
>> +    mapping = __gh_vm_mem_find(ghvm, label);
>> +    mutex_unlock(&ghvm->mm_lock);
> new line would be nice here.
>
>> +    return mapping ? : ERR_PTR(-ENODEV);
>> +}
>> +
>> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct
>> gh_userspace_memory_region *region)
>> +{
>> +    struct gh_vm_mem *mapping, *tmp_mapping;
>> +    struct gh_rm_mem_entry *mem_entries;
>> +    phys_addr_t curr_page, prev_page;
>> +    struct gh_rm_mem_parcel *parcel;
>> +    int i, j, pinned, ret = 0;
>> +    size_t entry_size;
>> +    u16 vmid;
>> +
>> +    if (!gh_api_has_feature(GH_API_FEATURE_MEMEXTENT))
>> +        return -EOPNOTSUPP;
>
> Should this not be first thing to do in ioctl before even entering this
> function?
>

I don't see why one place is better than other, but I can move.

>> +
>> +    if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
>> +        !PAGE_ALIGNED(region->userspace_addr) ||
>> !PAGE_ALIGNED(region->guest_phys_addr))
>> +        return -EINVAL;
>> +
>> +    ret = mutex_lock_interruptible(&ghvm->mm_lock);
>> +    if (ret)
>> +        return ret;
> new line.
>
>> +    mapping = __gh_vm_mem_find(ghvm, region->label);
>> +    if (mapping) {
>> +        mutex_unlock(&ghvm->mm_lock);
>> +        return -EEXIST;
>> +    }
>> +
>> +    mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
>> +    if (!mapping) {
>> +        ret = -ENOMEM;
>> +        goto free_mapping;
>
> how about,
>
> mutex_unlock(&ghvm->mm_lock);
> return -ENMEM;
>
>> +    }
>> +
>> +    mapping->parcel.label = region->label;
>> +    mapping->guest_phys_addr = region->guest_phys_addr;
>> +    mapping->npages = region->memory_size >> PAGE_SHIFT;
>> +    parcel = &mapping->parcel;
>> +    parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later
>> by mem_share/mem_lend */
>> +    parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
>> +
>> +    /* Check for overlap */
>> +    list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
>> +        if (!((mapping->guest_phys_addr + (mapping->npages <<
>> PAGE_SHIFT) <=
>> +            tmp_mapping->guest_phys_addr) ||
>> +            (mapping->guest_phys_addr >=
>> +            tmp_mapping->guest_phys_addr + (tmp_mapping->npages <<
>> PAGE_SHIFT)))) {
>> +            ret = -EEXIST;
>> +            goto free_mapping;
>> +        }
>> +    }
>> +
>> +    list_add(&mapping->list, &ghvm->memory_mappings);
>> +
>> +    mapping->pages = kcalloc(mapping->npages,
>> sizeof(*mapping->pages), GFP_KERNEL);
>> +    if (!mapping->pages) {
>> +        ret = -ENOMEM;
>> +        mapping->npages = 0; /* update npages for reclaim */
>> +        goto reclaim;
>> +    }
>> +
>> +    pinned = pin_user_pages_fast(region->userspace_addr,
>> mapping->npages,
>> +                    FOLL_WRITE | FOLL_LONGTERM, mapping->pages);
>> +    if (pinned < 0) {
>> +        ret = pinned;
>> +        mapping->npages = 0; /* update npages for reclaim */
>> +        goto reclaim;
>> +    } else if (pinned != mapping->npages) {
>> +        ret = -EFAULT;
>> +        mapping->npages = pinned; /* update npages for reclaim */
>> +        goto reclaim;
>> +    }
>> +
>> +    if (region->flags & GH_MEM_LENT) {
>> +        parcel->n_acl_entries = 1;
>> +        mapping->share_type = VM_MEM_LEND;
>> +    } else {
>> +        parcel->n_acl_entries = 2;
>> +        mapping->share_type = VM_MEM_SHARE;
>> +    }
>> +    parcel->acl_entries = kcalloc(parcel->n_acl_entries,
>> sizeof(*parcel->acl_entries),
>> +                    GFP_KERNEL);
>> +    if (!parcel->acl_entries) {
>> +        ret = -ENOMEM;
>> +        goto reclaim;
>> +    }
>> +
>> +    parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
> new line
>> +    if (region->flags & GH_MEM_ALLOW_READ)
>> +        parcel->acl_entries[0].perms |= GH_RM_ACL_R;
>> +    if (region->flags & GH_MEM_ALLOW_WRITE)
>> +        parcel->acl_entries[0].perms |= GH_RM_ACL_W;
>> +    if (region->flags & GH_MEM_ALLOW_EXEC)
>> +        parcel->acl_entries[0].perms |= GH_RM_ACL_X;
>> +
>> +    if (mapping->share_type == VM_MEM_SHARE) {
>> +        ret = gh_rm_get_vmid(ghvm->rm, &vmid);
>> +        if (ret)
>> +            goto reclaim;
>> +
>> +        parcel->acl_entries[1].vmid = cpu_to_le16(vmid);
>> +        /* Host assumed to have all these permissions. Gunyah will not
>> +         * grant new permissions if host actually had less than RWX
>> +         */
>> +        parcel->acl_entries[1].perms |= GH_RM_ACL_R | GH_RM_ACL_W |
>> GH_RM_ACL_X;
>> +    }
>> +
>> +    mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries),
>> GFP_KERNEL);
>> +    if (!mem_entries) {
>> +        ret = -ENOMEM;
>> +        goto reclaim;
>> +    }
>> +
>> +    /* reduce number of entries by combining contiguous pages into
>> single memory entry */
>> +    prev_page = page_to_phys(mapping->pages[0]);
>> +    mem_entries[0].ipa_base = cpu_to_le64(prev_page);
>> +    entry_size = PAGE_SIZE;
> new line
>> +    for (i = 1, j = 0; i < mapping->npages; i++) {
>> +        curr_page = page_to_phys(mapping->pages[i]);
>> +        if (page_contiguous(prev_page, curr_page)) {
>> +            entry_size += PAGE_SIZE;
>> +        } else {
>> +            mem_entries[j].size = cpu_to_le64(entry_size);
>> +            j++;
>> +            mem_entries[j].ipa_base = cpu_to_le64(curr_page);
>> +            entry_size = PAGE_SIZE;
>> +        }
>> +
>> +        prev_page = curr_page;
>> +    }
>> +    mem_entries[j].size = cpu_to_le64(entry_size);
>> +
>> +    parcel->n_mem_entries = j + 1;
>> +    parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) *
>> parcel->n_mem_entries,
>> +                    GFP_KERNEL);
>> +    kfree(mem_entries);
>> +    if (!parcel->mem_entries) {
>> +        ret = -ENOMEM;
>> +        goto reclaim;
>> +    }
>> +
>> +    mutex_unlock(&ghvm->mm_lock);
>> +    return 0;
>> +reclaim:
>> +    gh_vm_mem_reclaim(ghvm, mapping);
>> +free_mapping:
>> +    kfree(mapping);
>> +    mutex_unlock(&ghvm->mm_lock);
>> +    return ret;
>> +}
>> +
>> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
>> +{
>> +    struct gh_vm_mem *mapping;
>> +    int ret;
>> +
>> +    ret = mutex_lock_interruptible(&ghvm->mm_lock);
>> +    if (ret)
>> +        return ret;
>> +
>> +    mapping = __gh_vm_mem_find(ghvm, label);
>> +    if (!mapping)
>> +        goto out;
>> +
>> +    gh_vm_mem_reclaim(ghvm, mapping);
>> +    kfree(mapping);
>> +out:
>> +    mutex_unlock(&ghvm->mm_lock);
>> +    return ret;
>> +}
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> index 10ba32d2b0a6..d85d12119a48 100644
>> --- a/include/uapi/linux/gunyah.h
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -20,4 +20,37 @@
>>    */
>>   #define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns a
>> Gunyah VM fd */
>> +/*
>> + * ioctls for VM fds
>> + */
>> +
>> +/**
>> + * struct gh_userspace_memory_region - Userspace memory descripion
>> for GH_VM_SET_USER_MEM_REGION
>> + * @label: Unique identifer to the region.
>> + * @flags: Flags for memory parcel behavior
>> + * @guest_phys_addr: Location of the memory region in guest's memory
>> space (page-aligned)#
>
> Note about overlapping here would be useful.
>

I'd like to reduce duplicate documentation where possible. I was
generally following this procedure:
- include/uapi/linux/gunyah.h docstrings have basic information to
remind what the field is
- Documentation/virt/gunyah/ documentation explains how to properly
use the APIs

I think it's definitely good idea to have separate documentation beyond
what can be described in docstrings here.

Thanks,
Elliot


2023-02-24 07:49:39

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox



On 23/02/2023 23:15, Elliot Berman wrote:
>
>
> On 2/23/2023 2:25 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 23/02/2023 00:15, Elliot Berman wrote:
>>>
>>>
>>> On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>>>>
>>>>
>>>> On 14/02/2023 21:23, Elliot Berman wrote:
>>>>> Gunyah message queues are a unidirectional inter-VM pipe for
>>>>> messages up
>>>>> to 1024 bytes. This driver supports pairing a receiver message
>>>>> queue and
>>>>> a transmitter message queue to expose a single mailbox channel.
>>>>>
>>>>> Signed-off-by: Elliot Berman <[email protected]>
>>>>> ---
>>>>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>>>>   drivers/mailbox/Makefile                   |   2 +
>>>>>   drivers/mailbox/gunyah-msgq.c               | 214
>>>>> ++++++++++++++++++++
>>>>>   include/linux/gunyah.h                      |  56 +++++
>>>>>   4 files changed, 280 insertions(+)
>>>>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>>>>
>>>>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>>>>> b/Documentation/virt/gunyah/message-queue.rst
>>>>> index 0667b3eb1ff9..082085e981e0 100644
>>>>> --- a/Documentation/virt/gunyah/message-queue.rst
>>>>> +++ b/Documentation/virt/gunyah/message-queue.rst
>>>>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs
>>>>> (and two capability IDs).
>>>>>         |               |         |                 |
>>>>> |               |
>>>>>         |               |         |                 |
>>>>> |               |
>>>>>         +---------------+         +-----------------+
>>>>> +---------------+
>>>>> +
>>>>> +Gunyah message queues are exposed as mailboxes. To create the
>>>>> mailbox, create
>>>>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY
>>>>> interrupt,
>>>>> +all messages in the RX message queue are read and pushed via the
>>>>> `rx_callback`
>>>>> +of the registered mbox_client.
>>>>> +
>>>>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>>>>> +   :identifiers: gh_msgq_init
>>>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>>>> index fc9376117111..5f929bb55e9a 100644
>>>>> --- a/drivers/mailbox/Makefile
>>>>> +++ b/drivers/mailbox/Makefile
>>>>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>>>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>>>>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>>>>
>>>> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not
>>>> CONFIG_GUNYAH_MBOX?
>>>>
>>>
>>> There was some previous discussion about this:
>>>
>>> https://lore.kernel.org/all/[email protected]/
>>>
>>>>> +
>>>>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>>>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>>>>> diff --git a/drivers/mailbox/gunyah-msgq.c
>>>>> b/drivers/mailbox/gunyah-msgq.c
>>>>> new file mode 100644
>>>>> index 000000000000..03ffaa30ce9b
>>>>> --- /dev/null
>>>>> +++ b/drivers/mailbox/gunyah-msgq.c
>>>>> @@ -0,0 +1,214 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>> +/*
>>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>>>> rights reserved.
>>>>> + */
>>>>> +
>>>>> +#include <linux/mailbox_controller.h>
>>>>> +#include <linux/module.h>
>>>>> +#include <linux/interrupt.h>
>>>>> +#include <linux/gunyah.h>
>>>>> +#include <linux/printk.h>
>>>>> +#include <linux/init.h>
>>>>> +#include <linux/slab.h>
>>>>> +#include <linux/wait.h>
>>>>
>>>> ...
>>>>
>>>>> +/* Fired when message queue transitions from "full" to "space
>>>>> available" to send messages */
>>>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>>>> +{
>>>>> +    struct gh_msgq *msgq = data;
>>>>> +
>>>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>>>>> +
>>>>> +    return IRQ_HANDLED;
>>>>> +}
>>>>> +
>>>>> +/* Fired after sending message and hypercall told us there was
>>>>> more space available. */
>>>>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>>>>
>>>> Tasklets have been long deprecated, consider using workqueues in
>>>> this particular case.
>>>>
>>>
>>> Workqueues have higher latency and tasklets came as recommendation
>>> from Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.
>>>
>>> I did some quick unscientific measurements of ~1000x samples. The
>>> median latency for resource manager went from 25.5 us (tasklet) to 26
>>> us (workqueue) (2% slower). The mean went from 28.7 us to 32.5 us
>>> (13% slower). Obviously, the outliers for workqueues were much more
>>> extreme.
>>
>> TBH, this is expected because we are only testing resource manager,
>> Note   the advantage that you will see shifting from tasket to
>> workqueues is on overall system latencies and some drivers performance
>> that need to react to events.
>>
>> please take some time to read this nice article about this
>> https://lwn.net/Articles/830964/
>>
>
> Hmm, this article is from 2020 and there was another effort in 2007.
> Neither seems to have succeeded. I'd like to stick to same mechanisms as
> other mailbox controllers.

I don't want to block this series because of this. We will have more
opportunity to improve this once some system wide profiling is done.

AFAIU, In this system we will have atleast 2 tasklets between VM and RM
and 2 per inter-vm, so if the number of tasklets increase in the system
will be potentially spending more time in soft irq handling it.

At somepoint in time its good to get some profiling done using
bcc/softirqs to see how much time is spent on softirqs.


--srini

>
> Jassi, do you have any preferences?
>
> Thanks,
> Elliot
>
>

2023-02-24 10:20:31

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

Hi,

On Tue, Feb 14, 2023 at 9:26 PM Elliot Berman <[email protected]> wrote:
>
>
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.

I'm working on pKVM [1], and regarding the problem of donating private
memory to a guest, we and others working on confidential computing
have faced a similar issue that this patch is trying to address. In
pKVM, we've initially taken an approach similar to the one here by
pinning the pages being donated to prevent swapping or migration [2].
However, we've encountered issues with this approach since the memory
is still mapped by the host, which could cause the system to crash on
an errant access.

Instead, we've been working on adopting an fd-based restricted memory
approach that was initially proposed for TDX [3] and is now being
considered by others in the confidential computing space as well
(e.g., Arm CCA [4]). The basic idea is that the host manages the guest
memory via a file descriptor instead of a userspace address. It cannot
map that memory (unless explicitly shared by the guest [5]),
eliminating the possibility of the host trying to access private
memory accidentally or being tricked by a malicious actor. This is
based on memfd with some restrictions. It handles swapping and
migration by disallowing them (for now [6]), and adds a new type of
memory region to KVM to accommodate having an fd representing guest
memory.

Although the fd-based restricted memory isn't upstream yet, we've
ported the latest patches to arm64 and made changes and additions to
make it work with pKVM, to test it and see if the solution is feasible
for us (it is). I wanted to mention this work in case you find it
useful, and in the hopes that we can all work on confidential
computing using the same interfaces as much as possible.

Some comments inline below...

Cheers,
/fuad

[1] https://lore.kernel.org/kvmarm/[email protected]/
[2] https://lore.kernel.org/kvmarm/[email protected]/
[3] https://lore.kernel.org/all/[email protected]/
[4] https://lore.kernel.org/lkml/[email protected]/
[5] This is a modification we've done for the arm64 port, after
discussing it with the original authors.
[6] Nothing inherent in the proposal to stop migration and swapping.
There are some technical issues that need to be resolved.

<snip>

> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
> +{
> + struct gh_vm_mem *mapping, *tmp_mapping;
> + struct gh_rm_mem_entry *mem_entries;
> + phys_addr_t curr_page, prev_page;
> + struct gh_rm_mem_parcel *parcel;
> + int i, j, pinned, ret = 0;
> + size_t entry_size;
> + u16 vmid;
> +
> + if (!gh_api_has_feature(GH_API_FEATURE_MEMEXTENT))
> + return -EOPNOTSUPP;
> +
> + if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
> + !PAGE_ALIGNED(region->userspace_addr) || !PAGE_ALIGNED(region->guest_phys_addr))
> + return -EINVAL;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
> + mapping = __gh_vm_mem_find(ghvm, region->label);
> + if (mapping) {
> + mutex_unlock(&ghvm->mm_lock);
> + return -EEXIST;
> + }
> +
> + mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
> + if (!mapping) {
> + ret = -ENOMEM;
> + goto free_mapping;
> + }
> +
> + mapping->parcel.label = region->label;
> + mapping->guest_phys_addr = region->guest_phys_addr;
> + mapping->npages = region->memory_size >> PAGE_SHIFT;
> + parcel = &mapping->parcel;
> + parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
> + parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
> +
> + /* Check for overlap */
> + list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
> + if (!((mapping->guest_phys_addr + (mapping->npages << PAGE_SHIFT) <=
> + tmp_mapping->guest_phys_addr) ||
> + (mapping->guest_phys_addr >=
> + tmp_mapping->guest_phys_addr + (tmp_mapping->npages << PAGE_SHIFT)))) {
> + ret = -EEXIST;
> + goto free_mapping;
> + }
> + }
> +
> + list_add(&mapping->list, &ghvm->memory_mappings);
> +
> + mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL);
> + if (!mapping->pages) {
> + ret = -ENOMEM;
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
> + }

These pages should be accounted for as locked pages, e.g.,
account_locked_vm(), which would also ensure that the process hasn't
reached its limit.

> + pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
> + FOLL_WRITE | FOLL_LONGTERM, mapping->pages);

It might be good to check and avoid donating pages with pre-faulted
file mappings since it might trigger a writeback of a page after
losing access to it. Ideally, you want to only accept anonymous or
shmem pages. In pKVM, we check that the pages are SwapBacked and
reject the pinning/donation otherwise [2].

> + if (pinned < 0) {
> + ret = pinned;
> + mapping->npages = 0; /* update npages for reclaim */
> + goto reclaim;
> + } else if (pinned != mapping->npages) {
> + ret = -EFAULT;
> + mapping->npages = pinned; /* update npages for reclaim */
> + goto reclaim;
> + }
> +
> + if (region->flags & GH_MEM_LENT) {
> + parcel->n_acl_entries = 1;
> + mapping->share_type = VM_MEM_LEND;
> + } else {
> + parcel->n_acl_entries = 2;
> + mapping->share_type = VM_MEM_SHARE;
> + }
> + parcel->acl_entries = kcalloc(parcel->n_acl_entries, sizeof(*parcel->acl_entries),
> + GFP_KERNEL);
> + if (!parcel->acl_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + parcel->acl_entries[0].vmid = cpu_to_le16(ghvm->vmid);
> + if (region->flags & GH_MEM_ALLOW_READ)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_R;
> + if (region->flags & GH_MEM_ALLOW_WRITE)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_W;
> + if (region->flags & GH_MEM_ALLOW_EXEC)
> + parcel->acl_entries[0].perms |= GH_RM_ACL_X;
> +
> + if (mapping->share_type == VM_MEM_SHARE) {
> + ret = gh_rm_get_vmid(ghvm->rm, &vmid);
> + if (ret)
> + goto reclaim;
> +
> + parcel->acl_entries[1].vmid = cpu_to_le16(vmid);
> + /* Host assumed to have all these permissions. Gunyah will not
> + * grant new permissions if host actually had less than RWX
> + */
> + parcel->acl_entries[1].perms |= GH_RM_ACL_R | GH_RM_ACL_W | GH_RM_ACL_X;
> + }
> +
> + mem_entries = kcalloc(mapping->npages, sizeof(*mem_entries), GFP_KERNEL);
> + if (!mem_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + /* reduce number of entries by combining contiguous pages into single memory entry */
> + prev_page = page_to_phys(mapping->pages[0]);
> + mem_entries[0].ipa_base = cpu_to_le64(prev_page);
> + entry_size = PAGE_SIZE;
> + for (i = 1, j = 0; i < mapping->npages; i++) {
> + curr_page = page_to_phys(mapping->pages[i]);
> + if (page_contiguous(prev_page, curr_page)) {
> + entry_size += PAGE_SIZE;
> + } else {
> + mem_entries[j].size = cpu_to_le64(entry_size);
> + j++;
> + mem_entries[j].ipa_base = cpu_to_le64(curr_page);
> + entry_size = PAGE_SIZE;
> + }
> +
> + prev_page = curr_page;
> + }
> + mem_entries[j].size = cpu_to_le64(entry_size);
> +
> + parcel->n_mem_entries = j + 1;
> + parcel->mem_entries = kmemdup(mem_entries, sizeof(*mem_entries) * parcel->n_mem_entries,
> + GFP_KERNEL);
> + kfree(mem_entries);
> + if (!parcel->mem_entries) {
> + ret = -ENOMEM;
> + goto reclaim;
> + }
> +
> + mutex_unlock(&ghvm->mm_lock);
> + return 0;
> +reclaim:
> + gh_vm_mem_reclaim(ghvm, mapping);
> +free_mapping:
> + kfree(mapping);
> + mutex_unlock(&ghvm->mm_lock);
> + return ret;
> +}
> +
> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label)
> +{
> + struct gh_vm_mem *mapping;
> + int ret;
> +
> + ret = mutex_lock_interruptible(&ghvm->mm_lock);
> + if (ret)
> + return ret;
> +
> + mapping = __gh_vm_mem_find(ghvm, label);
> + if (!mapping)
> + goto out;
> +
> + gh_vm_mem_reclaim(ghvm, mapping);
> + kfree(mapping);
> +out:
> + mutex_unlock(&ghvm->mm_lock);
> + return ret;
> +}
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index 10ba32d2b0a6..d85d12119a48 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -20,4 +20,37 @@
> */
> #define GH_CREATE_VM _IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
>
> +/*
> + * ioctls for VM fds
> + */
> +
> +/**
> + * struct gh_userspace_memory_region - Userspace memory descripion for GH_VM_SET_USER_MEM_REGION

nit: s/descripion/description

> + * @label: Unique identifer to the region.

nit: s/identifer/identifier




> + * @flags: Flags for memory parcel behavior
> + * @guest_phys_addr: Location of the memory region in guest's memory space (page-aligned)
> + * @memory_size: Size of the region (page-aligned)
> + * @userspace_addr: Location of the memory region in caller (userspace)'s memory
> + *
> + * See Documentation/virt/gunyah/vm-manager.rst for further details.
> + */
> +struct gh_userspace_memory_region {
> + __u32 label;
> +#define GH_MEM_ALLOW_READ (1UL << 0)
> +#define GH_MEM_ALLOW_WRITE (1UL << 1)
> +#define GH_MEM_ALLOW_EXEC (1UL << 2)
> +/*
> + * The guest will be lent the memory instead of shared.
> + * In other words, the guest has exclusive access to the memory region and the host loses access.
> + */
> +#define GH_MEM_LENT (1UL << 3)
> + __u32 flags;
> + __u64 guest_phys_addr;
> + __u64 memory_size;
> + __u64 userspace_addr;
> +};
> +
> +#define GH_VM_SET_USER_MEM_REGION _IOW(GH_IOCTL_TYPE, 0x1, \
> + struct gh_userspace_memory_region)
> +
> #endif
> --
> 2.39.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2023-02-24 10:31:21

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager



On 23/02/2023 22:40, Elliot Berman wrote:
>
>
> On 2/23/2023 2:08 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 22/02/2023 00:27, Elliot Berman wrote:
>>>
>>>>> +    .llseek = noop_llseek,
>>>>> +};
>>>>> +
>>>>> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long
>>>>> arg)
>>>> Not sure what is the gain of this multiple levels of redirection.
>>>>
>>>> How about
>>>>
>>>> long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
>>>> {
>>>> ...
>>>> }
>>>>
>>>> and rsc_mgr just call it as part of its ioctl call
>>>>
>>>> static long gh_dev_ioctl(struct file *filp, unsigned int cmd,
>>>> unsigned long arg)
>>>> {
>>>>      struct miscdevice *miscdev = filp->private_data;
>>>>      struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
>>>>
>>>>      switch (cmd) {
>>>>      case GH_CREATE_VM:
>>>>          return gh_dev_create_vm(rm, arg);
>>>>      default:
>>>>          return -ENOIOCTLCMD;
>>>>      }
>>>> }
>>>>
>>>
>>> I'm anticipating we will add further /dev/gunyah ioctls and I thought
>>> it would be cleaner to have all that in vm_mgr.c itself.
>>>
>>>>
>>>>> +{
>>>>> +    struct gh_vm *ghvm;
>>>>> +    struct file *file;
>>>>> +    int fd, err;
>>>>> +
>>>>> +    /* arg reserved for future use. */
>>>>> +    if (arg)
>>>>> +        return -EINVAL;
>>>>
>>>> The only code path I see here is via GH_CREATE_VM ioctl which
>>>> obviously does not take any arguments, so if you are thinking of
>>>> using the argument for architecture-specific VM flags.  Then this
>>>> needs to be properly done by making the ABI aware of this.
>>>
>>> It is documented in Patch 17 (Document Gunyah VM Manager)
>>>
>>> +GH_CREATE_VM
>>> +~~~~~~~~~~~~
>>> +
>>> +Creates a Gunyah VM. The argument is reserved for future use and
>>> must be 0.
>>>
>> But this conficts with the UAPIs that have been defined. GH_CREATE_VM
>> itself is defined to take no parameters.
>>
>> #define GH_CREATE_VM                    _IO(GH_IOCTL_TYPE, 0x0)
>>
>> so where are you expecting the argument to come from?
>>  >>>
>>>> As you mentioned zero value arg imply an "unauthenticated VM" type,
>>>> but this was not properly encoded in the userspace ABI. Why not make
>>>> it future compatible. How about adding arguments to GH_CREATE_VM and
>>>> pass the required information correctly.
>>>> Note that once the ABI is accepted then you will not be able to
>>>> change it, other than adding a new one.
>>>>
>>>
>>> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure
>>> yet what arguments to add here.
>>>
>>> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't
>>
>> Sorry, that is exactly what we want to avoid, we can not change the
>> UAPI its going to break the userspace.
>>
>>> break compatibility with old kernels; old kernels reject it as -EINVAL.
>>
>> If you have userspace built with older kernel headers then that will
>> break. Am not sure about old-kernels.
>>
>> What exactly is the argument that you want to add to GH_CREATE_VM?
>>
>> If you want to keep GH_CREATE_VM with no arguments that is fine but
>> remove the conflicting comments in the code and document so that its
>> not misleading readers/reviewers that the UAPI is going to be modified
>> in near future.
>>
>>
>
> The convention followed here comes from KVM_CREATE_VM. Is this ioctl
> considered bad example?
>

It is recommended to only use _IO for commands without arguments, and
use pointers for passing data. Even though _IO can indicate either
commands with no argument or passing an integer value instead of a
pointer. Am really not sure how this works in compat case.

Am sure there are tricks that can be done with just using _IO() macro
(ex vfio), but this does not mean that we should not use _IOW to be more
explicit on the type and size of argument that we are expecting.

On the other hand If its really not possible to change this IOCTL to
_IOW and argument that you are referring would be with in integer range,
then what you have with _IO macro should work.

--srini

>>>
>>>>> +
>>>>> +    ghvm = gh_vm_alloc(rm);
>>>>> +    if (IS_ERR(ghvm))
>>>>> +        return PTR_ERR(ghvm);
>>>>> +
>>>>> +    fd = get_unused_fd_flags(O_CLOEXEC);
>>>>> +    if (fd < 0) {
>>>>> +        err = fd;
>>>>> +        goto err_destroy_vm;
>>>>> +    }
>>>>> +
>>>>> +    file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm,
>>>>> O_RDWR);
>>>>> +    if (IS_ERR(file)) {
>>>>> +        err = PTR_ERR(file);
>>>>> +        goto err_put_fd;
>>>>> +    }
>>>>> +
>>>>> +    fd_install(fd, file);
>>>>> +
>>>>> +    return fd;
>>>>> +
>>>>> +err_put_fd:
>>>>> +    put_unused_fd(fd);
>>>>> +err_destroy_vm:
>>>>> +    kfree(ghvm);
>>>>> +    return err;
>>>>> +}
>>>>> +
>>>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>>>>> unsigned long arg)
>>>>> +{
>>>>> +    switch (cmd) {
>>>>> +    case GH_CREATE_VM:
>>>>> +        return gh_dev_ioctl_create_vm(rm, arg);
>>>>> +    default:
>>>>> +        return -ENOIOCTLCMD;
>>>>> +    }
>>>>> +}
>>>>> diff --git a/drivers/virt/gunyah/vm_mgr.h
>>>>> b/drivers/virt/gunyah/vm_mgr.h
>>>>> new file mode 100644
>>>>> index 000000000000..76954da706e9
>>>>> --- /dev/null
>>>>> +++ b/drivers/virt/gunyah/vm_mgr.h
>>>>> @@ -0,0 +1,22 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>>> +/*
>>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>>>> rights reserved.
>>>>> + */
>>>>> +
>>>>> +#ifndef _GH_PRIV_VM_MGR_H
>>>>> +#define _GH_PRIV_VM_MGR_H
>>>>> +
>>>>> +#include <linux/gunyah_rsc_mgr.h>
>>>>> +
>>>>> +#include <uapi/linux/gunyah.h>
>>>>> +
>>>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>>>>> unsigned long arg);
>>>>> +
>>>>> +struct gh_vm {
>>>>> +    u16 vmid;
>>>>> +    struct gh_rm *rm;
>>>>> +
>>>>> +    struct work_struct free_work;
>>>>> +};
>>>>> +
>>>>> +#endif
>>>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>>>> new file mode 100644
>>>>> index 000000000000..10ba32d2b0a6
>>>>> --- /dev/null
>>>>> +++ b/include/uapi/linux/gunyah.h
>>>>> @@ -0,0 +1,23 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>>>>> +/*
>>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>>>>> rights reserved.
>>>>> + */
>>>>> +
>>>>> +#ifndef _UAPI_LINUX_GUNYAH
>>>>> +#define _UAPI_LINUX_GUNYAH
>>>>> +
>>>>> +/*
>>>>> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
>>>>> + */
>>>>> +
>>>>> +#include <linux/types.h>
>>>>> +#include <linux/ioctl.h>
>>>>> +
>>>>> +#define GH_IOCTL_TYPE            'G'
>>>>> +
>>>>> +/*
>>>>> + * ioctls for /dev/gunyah fds:
>>>>> + */
>>>>> +#define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns
>>>>> a Gunyah VM fd */
>>>>
>>>> Can HLOS forcefully destroy a VM?
>>>> If so should we have a corresponding DESTROY IOCTL?
>>>
>>> It can forcefully destroy unauthenticated and protected virtual
>>> machines. I don't have a userspace usecase for a DESTROY ioctl yet,
>>> maybe this can be added later? By the way, the VM is forcefully
>> that should be fine, but its also nice to add it for completeness, but
>> not a compulsory atm
>>
>>> destroyed when VM refcount is dropped to 0 (close(vm_fd) and any
>>> other relevant file descriptors).
>> I have noticed that path.
>>
>> --srini
>>>
>>> - Elliot

2023-02-24 10:38:43

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions



On 24/02/2023 00:43, Elliot Berman wrote:
>>> +/*
>>> + * ioctls for VM fds
>>> + */
>>> +
>>> +/**
>>> + * struct gh_userspace_memory_region - Userspace memory descripion
>>> for GH_VM_SET_USER_MEM_REGION
>>> + * @label: Unique identifer to the region.
>>> + * @flags: Flags for memory parcel behavior
>>> + * @guest_phys_addr: Location of the memory region in guest's memory
>>> space (page-aligned)#
>>
>> Note about overlapping here would be useful.
>>
>
> I'd like to reduce duplicate documentation where possible. I was
This is exactly what .rst files can provide.

If you have a proper kernel-doc type documentation in header/source
files, these can be directly used in .rst files.

The reStructuredText (.rst) files may contain directives to include
structured documentation comments, or kernel-doc comments, from source
files.

ex:
.. kernel-doc:: include/linux/gunyah.h
:internal:


--srini
> generally following this procedure:
>  - include/uapi/linux/gunyah.h docstrings have basic information to
> remind what the field is
>  - Documentation/virt/gunyah/ documentation explains how to properly
> use the APIs
>
> I think it's definitely good idea to have separate documentation beyond
> what can be described in docstrings here.
>
> Thanks,
> Elliot

2023-02-24 13:21:08

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager

On Fri, Feb 24, 2023, at 11:29, Srinivas Kandagatla wrote:
> On 23/02/2023 22:40, Elliot Berman wrote:

>>>> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure
>>>> yet what arguments to add here.
>>>>
>>>> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't
>>>
>>> Sorry, that is exactly what we want to avoid, we can not change the
>>> UAPI its going to break the userspace.
>>>
>>>> break compatibility with old kernels; old kernels reject it as -EINVAL.
>>>
>>> If you have userspace built with older kernel headers then that will
>>> break. Am not sure about old-kernels.
>>>
>>> What exactly is the argument that you want to add to GH_CREATE_VM?
>>>
>>> If you want to keep GH_CREATE_VM with no arguments that is fine but
>>> remove the conflicting comments in the code and document so that its
>>> not misleading readers/reviewers that the UAPI is going to be modified
>>> in near future.
>>>
>>>
>>
>> The convention followed here comes from KVM_CREATE_VM. Is this ioctl
>> considered bad example?
>>
>
> It is recommended to only use _IO for commands without arguments, and
> use pointers for passing data. Even though _IO can indicate either
> commands with no argument or passing an integer value instead of a
> pointer. Am really not sure how this works in compat case.
>
> Am sure there are tricks that can be done with just using _IO() macro
> (ex vfio), but this does not mean that we should not use _IOW to be more
> explicit on the type and size of argument that we are expecting.
>
> On the other hand If its really not possible to change this IOCTL to
> _IOW and argument that you are referring would be with in integer range,
> then what you have with _IO macro should work.

Passing an 'unsigned long' value instead of a pointer is fine for compat
mode, as a 32-bit compat_ulong_t always fits inside of the 64-bit
unsigned long. The downside is that portable code cannot have a
single ioctl handler function that takes both commands with pointers
and other commands with integer arguments, as some architectures
(i.e. s390, possibly arm64+morello in the future) need to mangle
pointer arguments using compat_ptr() but must not do that on integer
arguments.

Arnd

2023-02-24 18:08:55

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions



On 2/24/2023 2:19 AM, Fuad Tabba wrote:
> Hi,
>
> On Tue, Feb 14, 2023 at 9:26 PM Elliot Berman <[email protected]> wrote:
>>
>>
>> When launching a virtual machine, Gunyah userspace allocates memory for
>> the guest and informs Gunyah about these memory regions through
>> SET_USER_MEMORY_REGION ioctl.
>
> I'm working on pKVM [1], and regarding the problem of donating private
> memory to a guest, we and others working on confidential computing
> have faced a similar issue that this patch is trying to address. In
> pKVM, we've initially taken an approach similar to the one here by
> pinning the pages being donated to prevent swapping or migration [2].
> However, we've encountered issues with this approach since the memory
> is still mapped by the host, which could cause the system to crash on
> an errant access.
>
> Instead, we've been working on adopting an fd-based restricted memory
> approach that was initially proposed for TDX [3] and is now being
> considered by others in the confidential computing space as well
> (e.g., Arm CCA [4]). The basic idea is that the host manages the guest
> memory via a file descriptor instead of a userspace address. It cannot
> map that memory (unless explicitly shared by the guest [5]),
> eliminating the possibility of the host trying to access private
> memory accidentally or being tricked by a malicious actor. This is
> based on memfd with some restrictions. It handles swapping and
> migration by disallowing them (for now [6]), and adds a new type of
> memory region to KVM to accommodate having an fd representing guest
> memory.
>
> Although the fd-based restricted memory isn't upstream yet, we've
> ported the latest patches to arm64 and made changes and additions to
> make it work with pKVM, to test it and see if the solution is feasible
> for us (it is). I wanted to mention this work in case you find it
> useful, and in the hopes that we can all work on confidential
> computing using the same interfaces as much as possible.

Thanks for highlighting the memfd_restricted changes to us! We'll
investigate how/if it can suit Gunyah usecases. It sounds like you
might've made memfd_restricted changes as well? Are those posted on the
mailing lists? Also, are example userspace (crosvm?) changes posted?

Thanks,
Elliot

>
> Some comments inline below...
>
> Cheers,
> /fuad
>
> [1] https://lore.kernel.org/kvmarm/[email protected]/
> [2] https://lore.kernel.org/kvmarm/[email protected]/
> [3] https://lore.kernel.org/all/[email protected]/
> [4] https://lore.kernel.org/lkml/[email protected]/
> [5] This is a modification we've done for the arm64 port, after
> discussing it with the original authors.
> [6] Nothing inherent in the proposal to stop migration and swapping.
> There are some technical issues that need to be resolved.
>
> <snip>
<snip, looking at comments in parallel>

2023-02-24 18:58:58

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

On Fri, Feb 24, 2023, Elliot Berman wrote:
>
>
> On 2/24/2023 2:19 AM, Fuad Tabba wrote:
> > Hi,
> >
> > On Tue, Feb 14, 2023 at 9:26 PM Elliot Berman <[email protected]> wrote:
> > >
> > >
> > > When launching a virtual machine, Gunyah userspace allocates memory for
> > > the guest and informs Gunyah about these memory regions through
> > > SET_USER_MEMORY_REGION ioctl.
> >
> > I'm working on pKVM [1], and regarding the problem of donating private
> > memory to a guest, we and others working on confidential computing
> > have faced a similar issue that this patch is trying to address. In
> > pKVM, we've initially taken an approach similar to the one here by
> > pinning the pages being donated to prevent swapping or migration [2].
> > However, we've encountered issues with this approach since the memory
> > is still mapped by the host, which could cause the system to crash on
> > an errant access.
> >
> > Instead, we've been working on adopting an fd-based restricted memory
> > approach that was initially proposed for TDX [3] and is now being
> > considered by others in the confidential computing space as well
> > (e.g., Arm CCA [4]). The basic idea is that the host manages the guest
> > memory via a file descriptor instead of a userspace address. It cannot
> > map that memory (unless explicitly shared by the guest [5]),
> > eliminating the possibility of the host trying to access private
> > memory accidentally or being tricked by a malicious actor. This is
> > based on memfd with some restrictions. It handles swapping and
> > migration by disallowing them (for now [6]), and adds a new type of
> > memory region to KVM to accommodate having an fd representing guest
> > memory.
> >
> > Although the fd-based restricted memory isn't upstream yet, we've
> > ported the latest patches to arm64 and made changes and additions to
> > make it work with pKVM, to test it and see if the solution is feasible
> > for us (it is). I wanted to mention this work in case you find it
> > useful, and in the hopes that we can all work on confidential
> > computing using the same interfaces as much as possible.
>
> Thanks for highlighting the memfd_restricted changes to us! We'll
> investigate how/if it can suit Gunyah usecases.

Can you provide Gunyah's requirements/rules and use cases as they relate to memory
management? I agree with Fuad, this is pretty much exactly what memfd_restricted()
is intended to handle. If Gunyah has a unique requirement or use case, it'd be
helpful to find out sooner than later. E.g.

1. What is the state of memory when it's accepted by a VM? Is it undefined,
i.e. the VM's responsibility to initialize? If not, is it always
zero-initialized or can memory be populated by the RM?

2. When exclusive/private memory is reclaimed, can the VM's data be preserved,
or is it unconditionally

3. How frequently is memory transition allocated/reclaimed?

4. Are there assumptions and/or limitations on the size or granlarity of
memory objects?

5. Can memory be shared by multiple VMs but _not_ be accessible from the RM?

6. etc. :-)

Thanks!

2023-02-24 21:24:42

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 06/26] virt: gunyah: msgq: Add hypercalls to send and receive messages



On 2/23/2023 4:15 PM, Alex Elder wrote:
> On 2/14/23 3:23 PM, Elliot Berman wrote:
>> Add hypercalls to send and receive messages on a Gunyah message queue.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   arch/arm64/gunyah/gunyah_hypercall.c | 32 ++++++++++++++++++++++++++++
>>   include/linux/gunyah.h               |  7 ++++++
>>   2 files changed, 39 insertions(+)
>>
>> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c
>> b/arch/arm64/gunyah/gunyah_hypercall.c
>> index f30d06ee80cf..2ca9ab098ff6 100644
>> --- a/arch/arm64/gunyah/gunyah_hypercall.c
>> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
>> @@ -38,6 +38,8 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
>>                              fn)
>>   #define GH_HYPERCALL_HYP_IDENTIFY        GH_HYPERCALL(0x8000)
>> +#define GH_HYPERCALL_MSGQ_SEND            GH_HYPERCALL(0x801B)
>> +#define GH_HYPERCALL_MSGQ_RECV            GH_HYPERCALL(0x801C)
>>   /**
>>    * gh_hypercall_hyp_identify() - Returns build information and
>> feature flags
>> @@ -57,5 +59,35 @@ void gh_hypercall_hyp_identify(struct
>> gh_hypercall_hyp_identify_resp *hyp_identi
>>   }
>>   EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
>> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size,
>> uintptr_t buff, int tx_flags,
>> +                    bool *ready)
>> +{
>> +    struct arm_smccc_res res;
>> +
>> +    arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, buff,
>> tx_flags, 0, &res);
>> +
>> +    if (res.a0 == GH_ERROR_OK)
>> +        *ready = res.a1;
>> +
>> +    return res.a0;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
>> +
>> +enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff,
>> size_t size, size_t *recv_size,
>> +                    bool *ready)
>> +{
>> +    struct arm_smccc_res res;
>> +
>> +    arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, buff, size, 0,
>> &res);
>> +
>> +    if (res.a0 == GH_ERROR_OK) {
>> +        *recv_size = res.a1;
>
> Is there any chance the 64-bit size is incompatible
> with size_t?  (Too big?)

This is safe because size of messages <= 240.

>
>> +        *ready = res.a2;
>
>         *ready = !!res.a2;
>
>> +    }
>> +
>> +    return res.a0;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
>> +
>>   MODULE_LICENSE("GPL");
>>   MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> index 3fef2854c5e1..cb6df4eec5c2 100644
>> --- a/include/linux/gunyah.h
>> +++ b/include/linux/gunyah.h
>> @@ -112,4 +112,11 @@ struct gh_hypercall_hyp_identify_resp {
>>   void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp
>> *hyp_identity);
>> +#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH        BIT(0)
>> +
>> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size,
>> uintptr_t buff, int tx_flags,
>> +                    bool *ready);
>
> Why uintptr_t?  Why not just pass a host pointer (void *)
> and do whatever conversion is necessary inside the function?
>
>                     -Alex
>
>> +enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff,
>> size_t size, size_t *recv_size,
>> +                    bool *ready);
>> +
>>   #endif
>

2023-02-24 21:57:25

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 07/26] mailbox: Add Gunyah message queue mailbox



On 2/23/2023 1:11 PM, Alex Elder wrote:
> On 2/14/23 3:23 PM, Elliot Berman wrote:
>> Gunyah message queues are a unidirectional inter-VM pipe for messages up
>> to 1024 bytes. This driver supports pairing a receiver message queue and
>> a transmitter message queue to expose a single mailbox channel.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>   drivers/mailbox/Makefile                    |   2 +
>>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>>   include/linux/gunyah.h                      |  56 +++++
>>   4 files changed, 280 insertions(+)
>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>
>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>> b/Documentation/virt/gunyah/message-queue.rst
>> index 0667b3eb1ff9..082085e981e0 100644
>> --- a/Documentation/virt/gunyah/message-queue.rst
>> +++ b/Documentation/virt/gunyah/message-queue.rst
>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs
>> (and two capability IDs).
>>         |               |         |                 |
>> |               |
>>         |               |         |                 |
>> |               |
>>         +---------------+         +-----------------+
>> +---------------+
>> +
>> +Gunyah message queues are exposed as mailboxes. To create the
>> mailbox, create
>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY
>> interrupt,
>> +all messages in the RX message queue are read and pushed via the
>> `rx_callback`
>> +of the registered mbox_client.
>> +
>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>> +   :identifiers: gh_msgq_init
>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>> index fc9376117111..5f929bb55e9a 100644
>> --- a/drivers/mailbox/Makefile
>> +++ b/drivers/mailbox/Makefile
>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>> +
>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>   obj-$(CONFIG_SPRD_MBOX)        += sprd-mailbox.o
>> diff --git a/drivers/mailbox/gunyah-msgq.c
>> b/drivers/mailbox/gunyah-msgq.c
>> new file mode 100644
>> index 000000000000..03ffaa30ce9b
>> --- /dev/null
>> +++ b/drivers/mailbox/gunyah-msgq.c
>
> You use a dash in this source file name, but an underscore
> everywhere else.  Unless there's a good reason to do this,
> please be consistent (use "gunyah_msgq.c").
>
>> @@ -0,0 +1,214 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/module.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/gunyah.h>
>> +#include <linux/printk.h>
>> +#include <linux/init.h>
>> +#include <linux/slab.h>
>> +#include <linux/wait.h>
>> +
>> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct
>> gh_msgq, mbox))
>> +
>> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
>> +{
>> +    struct gh_msgq *msgq = data;
>> +    struct gh_msgq_rx_data rx_data;
>> +    enum gh_error err;
>> +    bool ready = true;
>> +
>> +    while (ready) {
>> +        err = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
>> +                (uintptr_t)&rx_data.data, sizeof(rx_data.data),
>> +                &rx_data.length, &ready);
>> +        if (err != GH_ERROR_OK) {
>> +            if (err != GH_ERROR_MSGQUEUE_EMPTY)
>
> Srini mentioned something about this too.  In many
> (all?) cases, there is a device pointer available,
> so you should use dev_*() functions rather than pr_*().
>
> In this particular case, I'm not sure why/when the
> mbox.dev pointer would be null.  Also, dev_*() handles
> the case of a null device pointer, and it reports the
> device name (just as you do here).
>
>> +                pr_warn("Failed to receive data from msgq for %s: %d\n",
>> +                    msgq->mbox.dev ? dev_name(msgq->mbox.dev) : "",
>> err);
>> +            break;
>> +        }
>> +        mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
>> +    }
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired when message queue transitions from "full" to "space
>> available" to send messages */
>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>> +{
>> +    struct gh_msgq *msgq = data;
>> +
>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired after sending message and hypercall told us there was more
>> space available. */
>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>> +{
>> +    struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq,
>> txdone_tasklet);
>> +
>> +    mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
>> +}
>> +
>> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
>> +{
>> +    struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
>> +    struct gh_msgq_tx_data *msgq_data = data;
>> +    u64 tx_flags = 0;
>> +    enum gh_error gh_error;
>
> Above you named the variable "err".  It helps readability
> if you use a very consistent naming convention for variables
> of a certain type when they are used a lot.
>
>> +    bool ready;
>> +
>> +    if (msgq_data->push)
>> +        tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
>> +
>> +    gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid,
>> msgq_data->length,
>> +                    (uintptr_t)msgq_data->data, tx_flags, &ready);
>> +
>> +    /**
>> +     * unlikely because Linux tracks state of msgq and should not try to
>> +     * send message when msgq is full.
>> +     */
>> +    if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
>> +        return -EAGAIN;
>> +
>> +    /**
>> +     * Propagate all other errors to client. If we return error to
>> mailbox
>> +     * framework, then no other messages can be sent and nobody will
>> know
>> +     * to retry this message.
>> +     */
>> +    msgq->last_ret = gh_remap_error(gh_error);
>> +
>> +    /**
>> +     * This message was successfully sent, but message queue isn't
>> ready to
>> +     * receive more messages because it's now full. Mailbox framework
>
> Maybe:  s/receive/accept/
>
>> +     * requires that we only report that message was transmitted when
>> +     * we're ready to transmit another message. We'll get that in the
>> form
>> +     * of tx IRQ once the other side starts to drain the msgq.
>> +     */
>> +    if (gh_error == GH_ERROR_OK && !ready)
>> +        return 0;
>> +
>> +    /**
>> +     * We can send more messages. Mailbox framework requires that tx
>> done
>> +     * happens asynchronously to sending the message. Gunyah message
>> queues
>> +     * tell us right away on the hypercall return whether we can send
>> more
>> +     * messages. To work around this, defer the txdone to a tasklet.
>> +     */
>> +    tasklet_schedule(&msgq->txdone_tasklet);
>> +
>> +    return 0;
>> +}
>> +
>> +static struct mbox_chan_ops gh_msgq_ops = {
>> +    .send_data = gh_msgq_send_data,
>> +};
>> +
>> +/**
>> + * gh_msgq_init() - Initialize a Gunyah message queue with an
>> mbox_client
>> + * @parent: optional, device parent used for the mailbox controller
>> + * @msgq: Pointer to the gh_msgq to initialize
>> + * @cl: A mailbox client to bind to the mailbox channel that the
>> message queue creates
>> + * @tx_ghrsc: optional, the transmission side of the message queue
>> + * @rx_ghrsc: optional, the receiving side of the message queue
>> + *
>> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most
>> message queue use cases come with
>
> s/should be/must be/
>
>> + * a pair of message queues to facilitate bidirectional
>> communication. When tx_ghrsc is set,
>> + * the client can send messages with
>> mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
>> + * is set, the mbox_client should register an .rx_callback() and the
>> message queue driver will
>
> s/should register/must register/
>
> A general comment on this code is that you sort of half define
> a Gunyah message queue API.  You define an initialization
> function and an exit function, but you also expose the fact
> that you use the mailbox framework in implementation.  This
> despite avoiding defining it as an mbox in the DTS file.
>
> It might be hard to avoid that I guess.  But to me it would be
> nice if there were a more distinct Gunyah message queue API,
> which would provide a send_message() function, for example.
> And in that case, perhaps you would pass in the tx_done and/or
> rx_data callbacks to this function (since they're required).

I can write a wrapper for send_message, but I think it limits the code
re-use of mailbox framework.

>
> All that said, this is (currently?) only used by the resource
> manager, so making a beautiful API might not be that important.
> Do you envision this being used to communicate with other VMs
> in the future?
>
>> + * push all available messages upon receiving the RX ready interrupt.
>> The messages should be
>
> Maybe: s/push/deliver/
>
>> + * consumed or copied by the client right away as the gh_msgq_rx_data
>> will be replaced/destroyed
>> + * after the callback.
>> + *
>> + * Returns - 0 on success, negative otherwise
>> + */
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct
>> mbox_client *cl,
>> +             struct gunyah_resource *tx_ghrsc, struct gunyah_resource
>> *rx_ghrsc)
>> +{
>> +    int ret;
>> +
>> +    /* Must have at least a tx_ghrsc or rx_ghrsc and that they are
>> the right device types */
>> +    if ((!tx_ghrsc && !rx_ghrsc) ||
>> +        (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
>> +        (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
>> +        return -EINVAL;
>> +
>> +    if (gh_api_version() != GUNYAH_API_V1) {
>> +        pr_err("Unrecognized gunyah version: %u. Currently supported:
>> %d\n",
>> +            gh_api_version(), GUNYAH_API_V1);
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
>> +        return -EOPNOTSUPP;
>
> Can Gunyah even function if it doesn't have the MSGQUEUE feature?
> Will there ever be a Gunyah implementation that does not support
> it?  Perhaps this test could be done in gunyah_init() instead.

I don't think we will ever have a Gunyah implementation that doesn't
support message queues. Perhaps some long distant Gunyah will use IPC
mechanism X instead of message queues and the message queue support is
dropped.

>
> For that matter, you could verify the result of gh_api_version()
> at that time also.
>

Moved the gh_api_version() check to gunyah_init()

>> +
>> +    msgq->tx_ghrsc = tx_ghrsc;
>> +    msgq->rx_ghrsc = rx_ghrsc;
>> +
>> +    msgq->mbox.dev = parent;
>> +    msgq->mbox.ops = &gh_msgq_ops;
>> +    msgq->mbox.num_chans = 1;
>> +    msgq->mbox.txdone_irq = true;
>> +    msgq->mbox.chans = kcalloc(msgq->mbox.num_chans,
>> sizeof(*msgq->mbox.chans), GFP_KERNEL);
>
> From what I can tell, you will always use exactly one mailbox channel.
> So you could just do kzalloc(sizeof()...).
>

If it's all the same, I'd like to keep it as kcalloc because chans is
expected to be an array with num_chans size. It seems more correct to
use kcalloc.

>> +    if (!msgq->mbox.chans)
>> +        return -ENOMEM;
>> +
>> +    if (msgq->tx_ghrsc) {
>
>     if (tx_ghrsc) {
>
> The irq field is assumed to be valid.  Are there any
> sanity checks you could perform?  Again this is only
> used for the resource manager right now, so maybe
> it's OK.
>

We should safely assume irq field is valid. If we need to be skeptical
of irq, we'd also need to be skeptical of capid and there's not validity
check to perform there. struct gunyah_resource's are either filled from
DT (in this case) or would be created by resource manager which does
validity checks.

>> +        ret = request_irq(msgq->tx_ghrsc->irq,
>> gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
>
>         ret = request_irq(tx_ghrsc->irq, ...
>
>
>> +                msgq);
>> +        if (ret)
>> +            goto err_chans;
>> +    }
>> +
>> +    if (msgq->rx_ghrsc) {
>> +        ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL,
>> gh_msgq_rx_irq_handler,
>> +                        IRQF_ONESHOT, "gh_msgq_rx", msgq);
>> +        if (ret)
>> +            goto err_tx_irq;
>> +    }
>> +
>> +    tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
>> +
>> +    ret = mbox_controller_register(&msgq->mbox);
>> +    if (ret)
>> +        goto err_rx_irq;
>> +
>> +    ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
>
>
>> +    if (ret)
>> +        goto err_mbox;
>> +
>> +    return 0;
>> +err_mbox:
>> +    mbox_controller_unregister(&msgq->mbox);
>> +err_rx_irq:
>> +    if (msgq->rx_ghrsc)
>> +        free_irq(msgq->rx_ghrsc->irq, msgq);
>> +err_tx_irq:
>> +    if (msgq->tx_ghrsc)
>> +        free_irq(msgq->tx_ghrsc->irq, msgq);
>> +err_chans:
>> +    kfree(msgq->mbox.chans);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_msgq_init);
>> +
>> +void gh_msgq_remove(struct gh_msgq *msgq)
>> +{
>
> Is there any need to un-bind the client?
>

I was leaving un-binding the client to the client (RM).

>> +    mbox_controller_unregister(&msgq->mbox);
>> +
>> +    if (msgq->rx_ghrsc)
>> +        free_irq(msgq->rx_ghrsc->irq, msgq);
>> +
>> +    if (msgq->tx_ghrsc)
>> +        free_irq(msgq->tx_ghrsc->irq, msgq);
>> +
>> +    kfree(msgq->mbox.chans);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> index cb6df4eec5c2..2e13669c6363 100644
>> --- a/include/linux/gunyah.h
>> +++ b/include/linux/gunyah.h
>> @@ -8,11 +8,67 @@
>>   #include <linux/bitfield.h>
>>   #include <linux/errno.h>
>> +#include <linux/interrupt.h>
>>   #include <linux/limits.h>
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/mailbox_client.h>
>>   #include <linux/types.h>
>> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
>> +enum gunyah_resource_type {
>> +    GUNYAH_RESOURCE_TYPE_BELL_TX    = 0,
>> +    GUNYAH_RESOURCE_TYPE_BELL_RX    = 1,
>> +    GUNYAH_RESOURCE_TYPE_MSGQ_TX    = 2,
>> +    GUNYAH_RESOURCE_TYPE_MSGQ_RX    = 3,
>> +    GUNYAH_RESOURCE_TYPE_VCPU    = 4,
>
> The maximum value here must fit in 8 bits.  I guess
> there's no risk right now of using that up, but you
> use negative values in some cases elsewhere.
>
>> +};
>> +
>> +struct gunyah_resource {
>> +    enum gunyah_resource_type type;
>> +    u64 capid;
>> +    int irq;
>
> request_irq() defines the IRQ value to be an unsigned int.
>

Done.

>> +};
>> +
>> +/**
>> + * Gunyah Message Queues
>> + */
>> +
>> +#define GH_MSGQ_MAX_MSG_SIZE    240
>> +
>> +struct gh_msgq_tx_data {
>> +    size_t length;
>> +    bool push;
>> +    char data[];
>> +};
>> +
>> +struct gh_msgq_rx_data {
>> +    size_t length;
>> +    char data[GH_MSGQ_MAX_MSG_SIZE];
>> +};
>> +
>> +struct gh_msgq {
>> +    struct gunyah_resource *tx_ghrsc;
>> +    struct gunyah_resource *rx_ghrsc;
>> +
>> +    /* msgq private */
>> +    int last_ret; /* Linux error, not GH_STATUS_* */
>> +    struct mbox_controller mbox;
>> +    struct tasklet_struct txdone_tasklet;
>
> Can the msgq_client be embedded here too?  (I don't really
> know whether msgq and msgq_client are one-to one.)
>

They are one-to-one. I can embed the struct in the struct gh_msgq and
drop the kcalloc.

Thanks,
Elliot

>> +};
>> +
>> +
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct
>> mbox_client *cl,
>> +             struct gunyah_resource *tx_ghrsc, struct gunyah_resource
>> *rx_ghrsc);
>> +void gh_msgq_remove(struct gh_msgq *msgq);
>
> I suggested:
>
> int gh_msgq_send(struct gh_msgq, struct gh_msgq_tx_data *data);
>
>                     -Alex
>
>> +
>> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
>> +{
>> +    return &msgq->mbox.chans[0];
>> +}
>> +
>>
>> /******************************************************************************/
>>   /* Common arch-independent definitions for Gunyah
>> hypercalls                  */
>> +
>>   #define GH_CAPID_INVAL    U64_MAX
>>   #define GH_VMID_ROOT_VM    0xff
>

2023-02-24 22:40:14

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 2/23/2023 3:28 PM, Alex Elder wrote:
> On 2/14/23 3:23 PM, Elliot Berman wrote:
>>
<snip>
>> +}
>> +
>> +static int gh_rm_init_connection_payload(struct gh_rm_connection
>> *connection, void *msg,
>> +                    size_t hdr_size, size_t msg_size)
>
> The value of hdr_size is *always* sizeof(*hdr), so you can
> do without passing it as an argument.
>

hdr_size is different when receiving reply (1 extra word) vs notification.

>> +{
>> +    size_t max_buf_size, payload_size;
>> +    struct gh_rm_rpc_hdr *hdr = msg;
>> +
>
> It probably sounds dumb, but I'd reverse the values
> compared below (and the operator).
>
>> +    if (hdr_size > msg_size)
>> +        return -EINVAL;
>> +
>> +    payload_size = msg_size - hdr_size;
>> +
>> +    connection->num_fragments = FIELD_GET(RM_RPC_FRAGMENTS_MASK,
>> hdr->type);
>> +    connection->fragments_received = 0;
>> +
>> +    /* There's not going to be any payload, no need to allocate
>> buffer. */
>> +    if (!payload_size && !connection->num_fragments)
>
> The payload size is the same across all messages in the
> "connection" right?  As is the number of fragments?
> It's not even possible/valid to have a zero payload size
> and non-zero number of fragments.  I think the second
> half of the above test can be dropped.
>

The RM RPC specification doesn't require that the first message have
payload. (It makes sense to do it and it does, but that's implementation
detail)

>> +        return 0;
>> +
>> +    if (connection->num_fragments > GH_RM_MAX_NUM_FRAGMENTS)
>> +        return -EINVAL;
>> +
>> +    max_buf_size = payload_size + (connection->num_fragments *
>> GH_RM_MAX_MSG_SIZE);
>> +
>> +    connection->payload = kzalloc(max_buf_size, GFP_KERNEL);
>> +    if (!connection->payload)
>> +        return -ENOMEM;
>> +
>> +    memcpy(connection->payload, msg + hdr_size, payload_size);
>
> I think I suggested (hdr + 1) rather than (msg + size) elsewhere
> and you took that suggestion.  I'd say do it one way or the other,
> consistently, everywhere.
>

hdr_size != sizeof(*hdr) when we receive a reply message.

>> +    connection->size = payload_size;
>> +    return 0;
>> +}
>> +
>> +static void gh_rm_notif_work(struct work_struct *work)
>> +{
>> +    struct gh_rm_connection *connection = container_of(work, struct
>> gh_rm_connection,
>> +                                notification.work);
>> +    struct gh_rm *rm = connection->notification.rm;
>> +
>> +    blocking_notifier_call_chain(&rm->nh, connection->msg_id,
>> connection->payload);
>> +
>> +    put_gh_rm(rm);
>> +    kfree(connection->payload);
>> +    kfree(connection);
>> +}
>> +
>> +static struct gh_rm_connection *gh_rm_process_notif(struct gh_rm *rm,
>> void *msg, size_t msg_size)
>
> I think it might be better if you do some of what the caller
> does here.  I.e., verify the current connection is null (and
> abort if not and make it NULL), then assign it to the new
> connection before you return success.  And return an errno.
>

Since you and Srini both suggest to do it, I'll cave. :-)

>> +{
>> +    struct gh_rm_connection *connection;
>> +    struct gh_rm_rpc_hdr *hdr = msg;
>> +    int ret;
>> +
>> +    connection = gh_rm_alloc_connection(hdr->msg_id, RM_RPC_TYPE_NOTIF);
>> +    if (IS_ERR(connection)) {
>> +        dev_err(rm->dev, "Failed to alloc connection for
>> notification: %ld, dropping.\n",
>> +            PTR_ERR(connection));
>> +        return NULL;
>> +    }
>> +
>> +    get_gh_rm(rm);
>> +    connection->notification.rm = rm;
>> +    INIT_WORK(&connection->notification.work, gh_rm_notif_work);
>> +
>> +    ret = gh_rm_init_connection_payload(connection, msg,
>> sizeof(*hdr), msg_size);
>> +    if (ret) {
>> +        dev_err(rm->dev, "Failed to initialize connection buffer for
>> notification: %d\n",
>> +            ret);
>> +        kfree(connection);
>> +        return NULL;
>> +    }
>> +
>> +    return connection;
>> +}
>> +
>> +static struct gh_rm_connection *gh_rm_process_rply(struct gh_rm *rm,
>> void *msg, size_t msg_size)
>> +{
>
> Here too, make sure there is no active connection and then
> set it within this function if the errno returned is 0.
>
>> +    struct gh_rm_rpc_reply_hdr *reply_hdr = msg;
>> +    struct gh_rm_connection *connection;
>> +    u16 seq_id = le16_to_cpu(reply_hdr->hdr.seq);
>> +
>> +    mutex_lock(&rm->call_idr_lock);
>> +    connection = idr_find(&rm->call_idr, seq_id);
>> +    mutex_unlock(&rm->call_idr_lock);
>> +
>> +    if (!connection || connection->msg_id != reply_hdr->hdr.msg_id)
>> +        return NULL;
>> +
>> +    if (gh_rm_init_connection_payload(connection, msg,
>> sizeof(*reply_hdr), msg_size)) {
>> +        dev_err(rm->dev, "Failed to alloc connection buffer for
>> sequence %d\n", seq_id);
>> +        /* Send connection complete and error the client. */
>> +        connection->reply.ret = -ENOMEM;
>> +        complete(&connection->reply.seq_done);
>> +        return NULL;
>> +    }
>> +
>> +    connection->reply.rm_error = le32_to_cpu(reply_hdr->err_code);
>> +    return connection;
>> +}
>> +
>> +static int gh_rm_process_cont(struct gh_rm *rm, struct
>> gh_rm_connection *connection,
>> +                void *msg, size_t msg_size)
>
> Similar comment here.  Have this function verify there is
> a non-null active connection.  Then process the message
> and abort if there's an error (and null the active connection
> pointer).
>
>> +{
>> +    struct gh_rm_rpc_hdr *hdr = msg;
>> +    size_t payload_size = msg_size - sizeof(*hdr);
>> +
>> +    /*
>> +     * hdr->fragments and hdr->msg_id preserves the value from first
>> reply
>> +     * or notif message. To detect mishandling, check it's still intact.
>> +     */
>> +    if (connection->msg_id != hdr->msg_id ||
>> +        connection->num_fragments != FIELD_GET(RM_RPC_FRAGMENTS_MASK,
>> hdr->type))
>> +        return -EINVAL;
>
> Maybe -EBADMSG?
>
>> +
>> +    memcpy(connection->payload + connection->size, msg +
>> sizeof(*hdr), payload_size);
>> +    connection->size += payload_size;
>> +    connection->fragments_received++;
>> +    return 0;
>> +}
>> +
>> +static void gh_rm_abort_connection(struct gh_rm_connection *connection)
>> +{
>> +    switch (connection->type) {
>> +    case RM_RPC_TYPE_REPLY:
>> +        connection->reply.ret = -EIO;
>> +        complete(&connection->reply.seq_done);
>> +        break;
>> +    case RM_RPC_TYPE_NOTIF:
>> +        fallthrough;
>> +    default:
>> +        kfree(connection->payload);
>> +        kfree(connection);
>> +    }
>> +}
>> +
>> +static bool gh_rm_complete_connection(struct gh_rm *rm, struct
>> gh_rm_connection *connection)
>
> The only caller of this function passes rm->active_rx_connection
> as the second argument.  It is available to you here, so you
> can get rid of that argument.
>
>> +{
>> +    if (!connection || connection->fragments_received !=
>> connection->num_fragments)
>> +        return false;
>> +
>> +    switch (connection->type) {
>> +    case RM_RPC_TYPE_REPLY:
>> +        complete(&connection->reply.seq_done);
>> +        break;
>> +    case RM_RPC_TYPE_NOTIF:
>> +        schedule_work(&connection->notification.work);
>> +        break;
>> +    default:
>> +        dev_err(rm->dev, "Invalid message type (%d) received\n",
>> connection->type);
>> +        gh_rm_abort_connection(connection);
>> +        break;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>> +static void gh_rm_msgq_rx_data(struct mbox_client *cl, void *mssg)
>> +{
>> +    struct gh_rm *rm = container_of(cl, struct gh_rm, msgq_client);
>> +    struct gh_msgq_rx_data *rx_data = mssg;
>> +    size_t msg_size = rx_data->length;
>> +    void *msg = rx_data->data;
>> +    struct gh_rm_rpc_hdr *hdr;
>> +
>
> Is it required that at least one byte (past the header) will
> be received?  I.e., should the "<=" below just be "<"?
>
>> +    if (msg_size <= sizeof(*hdr) || msg_size > GH_MSGQ_MAX_MSG_SIZE)
>> +        return;
>
> You previously reported a message here.  These seem like
> errors, which if they occur, maybe should be reported.
> They seem like "never happen" issues, but it's defensive
> to make these checks (which is good).
>
>> +
>> +    hdr = msg;
>> +    if (hdr->api != RM_RPC_API) {
>
> If this ever happens, is the hardware failing?  It seems
> like once Gunyah is initialized and you've checked the
> API version once, there should be no need to check it
> repeatedly.

I'd need to check the API version for the first message. On subsequent
messages, I'd need to check if I already checked. Might as well just
check the version every time?

<done for all the comments snipped>

>> +
>> +void get_gh_rm(struct gh_rm *rm)
>
> It is often pretty handy to return the argument in
> functions like this.  It simultaneously takes the
> reference and assigns the pointer the reference
> represents.
>
>

I've updated so that gh_rm_get() returns a struct device * (the
miscdev's device). Is this too unusual?

>> +{
>> +    get_device(rm->dev);
>> +}
>> +EXPORT_SYMBOL_GPL(get_gh_rm);
>> +
>> +void put_gh_rm(struct gh_rm *rm)
>> +{
>> +    put_device(rm->dev);
>> +}
>> +EXPORT_SYMBOL_GPL(put_gh_rm);
>> +
>> +static int gh_msgq_platform_probe_direction(struct platform_device
>> *pdev,
>> +                    bool tx, int idx, struct gunyah_resource *ghrsc)
>> +{
>> +    struct device_node *node = pdev->dev.of_node;
>> +    int ret;
>
> I think you should declare idx as a local variable.
>
>     int idx = tx ? 1 : 0;
> >> +
>> +    ghrsc->type = tx ? GUNYAH_RESOURCE_TYPE_MSGQ_TX :
>> GUNYAH_RESOURCE_TYPE_MSGQ_RX;
>> +
>> +    ghrsc->irq = platform_get_irq(pdev, idx);
>
> Do you suppose you could do platform_get_irq_byname(), and then
> specify the names of the IRQs ("rm_tx_irq" and "rm_rx_irq" maybe)?
>
>> +    if (ghrsc->irq < 0) {
>> +        dev_err(&pdev->dev, "Failed to get irq%d: %d\n", idx,
>> ghrsc->irq);
>
> Maybe:    "Failed to get %cX IRQ: %d\n", tx ? 'T' : 'R', ghrsc->irq);
>
>> +        return ghrsc->irq;
>> +    }
>> +
>> +    ret = of_property_read_u64_index(node, "reg", idx, &ghrsc->capid);
>
> Is a capability ID a simple (but large) number?
>
> The *resource manager* (which is a very special VM) has to
> have both TX and RX message queue capability IDs.  Is there
> 'any chance that these specific capability IDs have values
> that are fixed by the design?  Like, 0 and 1?  I don't know
> what they are, but it seems like it *could* be something
> fixed by the design, and if that were the case, there would
> be no need to specify the "reg" property to get the "capid"
> values.
>

They aren't fixed by the design in a production version of Gunyah.

>> +    if (ret) {
>> +        dev_err(&pdev->dev, "Failed to get capid%d: %d\n", idx, ret);
>> +        return ret;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int gh_rm_drv_probe(struct platform_device *pdev)
>> +{
>> +    struct gh_msgq_tx_data *msg;
>> +    struct gh_rm *rm;
>> +    int ret;
>> +
>> +    rm = devm_kzalloc(&pdev->dev, sizeof(*rm), GFP_KERNEL);
>> +    if (!rm)
>> +        return -ENOMEM;
>> +
>> +    platform_set_drvdata(pdev, rm);
>> +    rm->dev = &pdev->dev;
>> +
>> +    mutex_init(&rm->call_idr_lock);
>> +    idr_init(&rm->call_idr);
>> +    rm->cache = kmem_cache_create("gh_rm", struct_size(msg, data,
>> GH_MSGQ_MAX_MSG_SIZE), 0,
>> +        SLAB_HWCACHE_ALIGN, NULL);
>> +    if (!rm->cache)
>> +        return -ENOMEM;
>
> If you abstracted the allocation interface for these messages,
> you could actually survive without the slab cache here.  But
> if this fails, maybe you won't get far anyway.
>
>> +    mutex_init(&rm->send_lock);
>> +    BLOCKING_INIT_NOTIFIER_HEAD(&rm->nh);
>> +
>> +    ret = gh_msgq_platform_probe_direction(pdev, true, 0,
>> &rm->tx_ghrsc);
>> +    if (ret)
>> +        goto err_cache;
>> +
>> +    ret = gh_msgq_platform_probe_direction(pdev, false, 1,
>> &rm->rx_ghrsc);
>> +    if (ret)
>> +        goto err_cache;
>> +
>> +    rm->msgq_client.dev = &pdev->dev;
>> +    rm->msgq_client.tx_block = true;
>> +    rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
>> +    rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>> +
>> +    return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client,
>> &rm->tx_ghrsc, &rm->rx_ghrsc);
>> +err_cache:
>> +    kmem_cache_destroy(rm->cache);
>> +    return ret;
>> +}
>> +
>> +static int gh_rm_drv_remove(struct platform_device *pdev)
>> +{
>> +    struct gh_rm *rm = platform_get_drvdata(pdev);
>> +
>> +    mbox_free_channel(gh_msgq_chan(&rm->msgq));
>> +    gh_msgq_remove(&rm->msgq);
>> +    kmem_cache_destroy(rm->cache);
>> +
>> +    return 0;
>> +}
>> +
>> +static const struct of_device_id gh_rm_of_match[] = {
>> +    { .compatible = "gunyah-resource-manager" },
>> +    {}
>> +};
>> +MODULE_DEVICE_TABLE(of, gh_rm_of_match);
>> +
>> +static struct platform_driver gh_rm_driver = {
>> +    .probe = gh_rm_drv_probe,
>> +    .remove = gh_rm_drv_remove,
>> +    .driver = {
>> +        .name = "gh_rsc_mgr",
>> +        .of_match_table = gh_rm_of_match,
>> +    },
>> +};
>> +module_platform_driver(gh_rm_driver);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Gunyah Resource Manager Driver");
>> diff --git a/drivers/virt/gunyah/rsc_mgr.h
>> b/drivers/virt/gunyah/rsc_mgr.h
>> new file mode 100644
>> index 000000000000..d4e799a7526f
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr.h
>> @@ -0,0 +1,77 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +#ifndef __GH_RSC_MGR_PRIV_H
>> +#define __GH_RSC_MGR_PRIV_H
>> +
>> +#include <linux/gunyah.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/types.h>
>> +
>> +/* RM Error codes */
>> +enum gh_rm_error {
>> +    GH_RM_ERROR_OK            = 0x0,
>> +    GH_RM_ERROR_UNIMPLEMENTED    = 0xFFFFFFFF,
>> +    GH_RM_ERROR_NOMEM        = 0x1,
>> +    GH_RM_ERROR_NORESOURCE        = 0x2,
>> +    GH_RM_ERROR_DENIED        = 0x3,
>> +    GH_RM_ERROR_INVALID        = 0x4,
>> +    GH_RM_ERROR_BUSY        = 0x5,
>> +    GH_RM_ERROR_ARGUMENT_INVALID    = 0x6,
>> +    GH_RM_ERROR_HANDLE_INVALID    = 0x7,
>> +    GH_RM_ERROR_VALIDATE_FAILED    = 0x8,
>> +    GH_RM_ERROR_MAP_FAILED        = 0x9,
>> +    GH_RM_ERROR_MEM_INVALID        = 0xA,
>> +    GH_RM_ERROR_MEM_INUSE        = 0xB,
>> +    GH_RM_ERROR_MEM_RELEASED    = 0xC,
>> +    GH_RM_ERROR_VMID_INVALID    = 0xD,
>> +    GH_RM_ERROR_LOOKUP_FAILED    = 0xE,
>> +    GH_RM_ERROR_IRQ_INVALID        = 0xF,
>> +    GH_RM_ERROR_IRQ_INUSE        = 0x10,
>> +    GH_RM_ERROR_IRQ_RELEASED    = 0x11,
>> +};
>> +
>> +/**
>> + * gh_rm_remap_error() - Remap Gunyah resource manager errors into a
>> Linux error code
>> + * @gh_error: "Standard" return value from Gunyah resource manager
>> + */
>> +static inline int gh_rm_remap_error(enum gh_rm_error rm_error)
>> +{
>> +    switch (rm_error) {
>> +    case GH_RM_ERROR_OK:
>> +        return 0;
>> +    case GH_RM_ERROR_UNIMPLEMENTED:
>> +        return -EOPNOTSUPP;
>> +    case GH_RM_ERROR_NOMEM:
>> +        return -ENOMEM;
>> +    case GH_RM_ERROR_NORESOURCE:
>> +        return -ENODEV;
>> +    case GH_RM_ERROR_DENIED:
>> +        return -EPERM;
>> +    case GH_RM_ERROR_BUSY:
>> +        return -EBUSY;
>> +    case GH_RM_ERROR_INVALID:
>> +    case GH_RM_ERROR_ARGUMENT_INVALID:
>> +    case GH_RM_ERROR_HANDLE_INVALID:
>> +    case GH_RM_ERROR_VALIDATE_FAILED:
>> +    case GH_RM_ERROR_MAP_FAILED:
>> +    case GH_RM_ERROR_MEM_INVALID:
>> +    case GH_RM_ERROR_MEM_INUSE:
>> +    case GH_RM_ERROR_MEM_RELEASED:
>> +    case GH_RM_ERROR_VMID_INVALID:
>> +    case GH_RM_ERROR_LOOKUP_FAILED:
>> +    case GH_RM_ERROR_IRQ_INVALID:
>> +    case GH_RM_ERROR_IRQ_INUSE:
>> +    case GH_RM_ERROR_IRQ_RELEASED:
>> +        return -EINVAL;
>> +    default:
>> +        return -EBADMSG;
>> +    }
>> +}
>> +
>> +struct gh_rm;
>
> This might just be my preference, but I like to see declarations
> like the one above grouped at the top of the file, under includes.
>
>> +int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff,
>> size_t req_buff_size,
>> +        void **resp_buf, size_t *resp_buff_size);
>> +
>> +#endif
>> diff --git a/include/linux/gunyah_rsc_mgr.h
>> b/include/linux/gunyah_rsc_mgr.h
>> new file mode 100644
>> index 000000000000..c992b3188c8d
>> --- /dev/null
>> +++ b/include/linux/gunyah_rsc_mgr.h
>> @@ -0,0 +1,24 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _GUNYAH_RSC_MGR_H
>> +#define _GUNYAH_RSC_MGR_H
>> +
>> +#include <linux/list.h>
>> +#include <linux/notifier.h>
>> +#include <linux/gunyah.h>
>> +
>> +#define GH_VMID_INVAL    U16_MAX
>> +
>> +/* Gunyah recognizes VMID0 as an alias to the current VM's ID */
>> +#define GH_VMID_SELF            0
>
> I haven't really checked very well, bur you should *use this*
> definition where a VMID is being examined. I.e., if you're
> going to define this, then never just compare a VMID against 0.
>

I realize now the only place I *could* use GH_VMID_SELF is the one
exception to usage of VMID -- in gh_rm_vmid_alloc. There, vmid of 0
means "use dynamic allocation". Since there aren't any users of the
GH_VMID_SELF, I'll drop it.

Thanks,
Elliot

>                     -Alex
>
>> +
>> +struct gh_rm;
>> +int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block
>> *nb);
>> +int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block
>> *nb);
>> +void get_gh_rm(struct gh_rm *rm);
>> +void put_gh_rm(struct gh_rm *rm);
>> +
>> +#endif
>

2023-02-24 22:49:29

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager



On 2/24/2023 5:20 AM, Arnd Bergmann wrote:
> On Fri, Feb 24, 2023, at 11:29, Srinivas Kandagatla wrote:
>> On 23/02/2023 22:40, Elliot Berman wrote:
>
>>>>> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure
>>>>> yet what arguments to add here.
>>>>>
>>>>> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't
>>>>
>>>> Sorry, that is exactly what we want to avoid, we can not change the
>>>> UAPI its going to break the userspace.
>>>>
>>>>> break compatibility with old kernels; old kernels reject it as -EINVAL.
>>>>
>>>> If you have userspace built with older kernel headers then that will
>>>> break. Am not sure about old-kernels.
>>>>
>>>> What exactly is the argument that you want to add to GH_CREATE_VM?
>>>>
>>>> If you want to keep GH_CREATE_VM with no arguments that is fine but
>>>> remove the conflicting comments in the code and document so that its
>>>> not misleading readers/reviewers that the UAPI is going to be modified
>>>> in near future.
>>>>
>>>>
>>>
>>> The convention followed here comes from KVM_CREATE_VM. Is this ioctl
>>> considered bad example?
>>>
>>
>> It is recommended to only use _IO for commands without arguments, and
>> use pointers for passing data. Even though _IO can indicate either
>> commands with no argument or passing an integer value instead of a
>> pointer. Am really not sure how this works in compat case.
>>
>> Am sure there are tricks that can be done with just using _IO() macro
>> (ex vfio), but this does not mean that we should not use _IOW to be more
>> explicit on the type and size of argument that we are expecting.
>>
>> On the other hand If its really not possible to change this IOCTL to
>> _IOW and argument that you are referring would be with in integer range,
>> then what you have with _IO macro should work.
>
> Passing an 'unsigned long' value instead of a pointer is fine for compat
> mode, as a 32-bit compat_ulong_t always fits inside of the 64-bit
> unsigned long. The downside is that portable code cannot have a
> single ioctl handler function that takes both commands with pointers
> and other commands with integer arguments, as some architectures
> (i.e. s390, possibly arm64+morello in the future) need to mangle
> pointer arguments using compat_ptr() but must not do that on integer
> arguments.

Thanks Arnd for helping clarify here!

I'd be open to making GH_CREATE_VM take a struct argument today, but I
really don't know what size or what needs to be in that struct. My hope
is that we can get away with just an integer for future needs. If
integer doesn't suit, then new ioctl would need to be created. I think
there's same problem if I pick some struct today (the struct may not
suit tomorrow and we need to create new ioctl for the new struct).

2023-02-24 23:45:02

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 19/26] gunyah: vm_mgr: Add framework to add VM Functions



On 2/22/2023 6:08 AM, Srinivas Kandagatla wrote:
>
>
> On 14/02/2023 21:25, Elliot Berman wrote:
>>
>> Introduce a framework for Gunyah userspace to install VM functions. VM
>> functions are optional interfaces to the virtual machine. vCPUs,
>> ioeventfs, and irqfds are examples of such VM functions and are
>> implemented in subsequent patches.
>>
>> A generic framework is implemented instead of individual ioctls to
>> create vCPUs, irqfds, etc., in order to simplify the VM manager core
>> implementation and allow dynamic loading of VM function modules.
>>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   Documentation/virt/gunyah/vm-manager.rst |  18 ++
>>   drivers/virt/gunyah/vm_mgr.c             | 240 ++++++++++++++++++++++-
>>   drivers/virt/gunyah/vm_mgr.h             |   3 +
>>   include/linux/gunyah_vm_mgr.h            |  80 ++++++++
>>   include/uapi/linux/gunyah.h              |  17 ++
>>   5 files changed, 353 insertions(+), 5 deletions(-)
>>   create mode 100644 include/linux/gunyah_vm_mgr.h
>>
>> diff --git a/Documentation/virt/gunyah/vm-manager.rst
>> b/Documentation/virt/gunyah/vm-manager.rst
>> index c0126cfeadc7..5272a6e9145c 100644
>> --- a/Documentation/virt/gunyah/vm-manager.rst
>> +++ b/Documentation/virt/gunyah/vm-manager.rst
>> @@ -17,6 +17,24 @@ sharing userspace memory with a VM is done via the
>> GH_VM_SET_USER_MEM_REGION
>>   ioctl. The VM itself is configured to use the memory region via the
>>   devicetree.
>> +Gunyah Functions
>> +================
>> +
>> +Components of a Gunyah VM's configuration that need kernel
>> configuration are
>> +called "functions" and are built on top of a framework. Functions are
>> identified
>> +by a string and have some argument(s) to configure them. They are
>> typically
>> +created by the `GH_VM_ADD_FUNCTION` ioctl.
>> +
>> +Functions typically will always do at least one of these operations:
>> +
>> +1. Create resource ticket(s). Resource tickets allow a function to
>> register
>> +   itself as the client for a Gunyah resource (e.g. doorbell or vCPU)
>> and
>> +   the function is given the pointer to the `struct gunyah_resource`
>> when the
>> +   VM is starting.
>> +
>> +2. Register IO handler(s). IO handlers allow a function to handle
>> stage-2 faults
>> +   from the virtual machine.
>> +
>>   Sample Userspace VMM
>>   ====================
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> index fa324385ade5..e9c55e7dd1b3 100644
>> --- a/drivers/virt/gunyah/vm_mgr.c
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -6,8 +6,10 @@
>>   #define pr_fmt(fmt) "gh_vm_mgr: " fmt
>>   #include <linux/anon_inodes.h>
>> +#include <linux/compat.h>
>>   #include <linux/file.h>
>>   #include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/gunyah_vm_mgr.h>
>>   #include <linux/miscdevice.h>
>>   #include <linux/mm.h>
>>   #include <linux/module.h>
>> @@ -16,6 +18,177 @@
>>   #include "vm_mgr.h"
>> +static DEFINE_MUTEX(functions_lock);
>> +static DEFINE_IDR(functions);
> Why are these global? Can these be not part of struc gh_rm?

If I make the list part of gh_rm, the core gunyah framework would need
to know and be dependent on all possible VM functions. This prevents
CONFIG_GUNYAH=y and CONFIG_GUNYAH_IRQFD=m (or any other function from
being enabled as a module). I also see a 2-way dependency when enabled
all enabled as modules. This approach seems vastly simpler.

> Not to mention please move idr to xarrays.
>

Done.

>> +
>> +int gh_vm_function_register(struct gh_vm_function *drv)
>> +{
>> +    int ret = 0;
>> +
>> +    if (!drv->bind || !drv->unbind)
>> +        return -EINVAL;
>> +
>> +    mutex_lock(&functions_lock);
>> +    if (idr_find(&functions, drv->type)) {
>> +        ret = -EEXIST;
>> +        goto out;
>> +    }
>> +
>> +    INIT_LIST_HEAD(&drv->instances);
>> +    ret = idr_alloc(&functions, drv, drv->type, drv->type + 1,
>> GFP_KERNEL);
>> +    if (ret > 0)
>> +        ret = 0;
>> +out:
>> +    mutex_unlock(&functions_lock);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_function_register);
>> +
>> +static void gh_vm_remove_function_instance(struct
>> gh_vm_function_instance *inst)
>> +    __must_hold(functions_lock)
>> +{
>> +    inst->fn->unbind(inst);
>> +    list_del(&inst->vm_list);
>> +    list_del(&inst->fn_list);
>> +    module_put(inst->fn->mod);
>> +    if (inst->arg_size)
>> +        kfree(inst->argp);
>> +    kfree(inst);
>> +}
>> +
>> +void gh_vm_function_unregister(struct gh_vm_function *fn)
>> +{
>> +    struct gh_vm_function_instance *inst, *iter;
>> +
>> +    mutex_lock(&functions_lock);
>> +    list_for_each_entry_safe(inst, iter, &fn->instances,fn_list)
>> +        gh_vm_remove_function_instance(inst);
>
> We should never have any instances as we have refcounted the module.
>
> If there are any instances then its clearly a bug, as this will pull out
> function under the hood while userspace is using it.
>
>

Done.

>> +    idr_remove(&functions, fn->type);
>> +    mutex_unlock(&functions_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_function_unregister);
>> +
>> +static long gh_vm_add_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
>> +{
>> +    struct gh_vm_function_instance *inst;
>> +    void __user *argp;
>> +    long r = 0;
>> +
>> +    if (f->arg_size > GH_FN_MAX_ARG_SIZE)
>
> lets print some useful error message to user.
>
>> +        return -EINVAL;
>> +
>> +    inst = kzalloc(sizeof(*inst), GFP_KERNEL);
>> +    if (!inst)
>> +        return -ENOMEM;
>> +
>> +    inst->arg_size = f->arg_size;
>> +    if (inst->arg_size) {
>> +        inst->argp = kzalloc(inst->arg_size, GFP_KERNEL);
>> +        if (!inst->arg) {
>> +            r = -ENOMEM;
>> +            gotofree;
>> +        }
>> +
>> +        argp = is_compat_task() ? compat_ptr(f->arg) : (void __user
>> *) f->arg;
>
> hmm, arg is not a data pointer it is a fixed size variable (__u64 arg),
> so why are using compat_ptr() here?
>
> you should be able to do
>
> argp = u64_to_user_ptr(f->arg);
>

Done.

>> +        if (copy_from_user(inst->argp, argp, f->arg_size)) {
>> +            r = -EFAULT;
>> +            gotofree_arg;
>> +        }
>> +    } else {
>> +        inst->arg = f->arg;
> bit lost here, so, we treat the arg as both pointer and value in cases
> where size is zero.
>

That's correct -- arg is value in cases where size is zero. It might be
a bit too strange, and I'm not opposed to always treating arg as pointer
when size >= 0 and none of the "arg is value". In vCPU and several other
possible functions we have (presently) downstream, the only argument a
only a 32-bit integer label. I thought it would be good to optimize this
path.

>> +    }
>> +
> <---
>> +    mutex_lock(&functions_lock);
>> +    inst->fn = idr_find(&functions, f->type);
>> +    if (!inst->fn) {
>> +        mutex_unlock(&functions_lock);
>> +        r = request_module("ghfunc:%d", f->type);
>> +        if (r)
>> +            gotounlock_free;
>> +
>> +        mutex_lock(&functions_lock);
>> +        inst->fn = idr_find(&functions, f->type);
>> +    }
>> +
>> +    if (!inst->fn) {
>> +        r = -ENOENT;
>> +        goto unlock_free;
>> +    }
>> +
>> +    if (!try_module_get(inst->fn->mod)) {
>> +        r = -ENOENT;
>> +        inst->fn = NULL;
>> +        goto unlock_free;
>> +    }
>> +
> --->
> can we do this snippet as a gh_vm_get_function() and corresponding
> gh_vm_put_function(). that should make the code more cleaner.
>

Done.

>
>> +    inst->ghvm = ghvm;
>> +    inst->rm = ghvm->rm;
>> +
>> +    r = inst->fn->bind(inst);
>> +    if (r < 0) {
>> +        module_put(inst->fn->mod);
>> +        goto unlock_free;
>> +    }
>> +
>> +    list_add(&inst->vm_list, &ghvm->functions);
>
> I guess its possible to add same functions with same argumentso to this
> list, how are we preventing this to happen?
>
> Is it a valid usecase?
>

I'm not able to think of a function where this is valid. This is handled
today because the ->bind() callback returns an error. This happens if
the resource ticket was already created, e.g. same vCPU is requested twice.

This should be handled at function level because two arguments could be
different but request the same resource. For instance, two
GH_VM_ADD_FUNCTION to create irqfd with same doorbell. One call is with
LEVEL flag and the other doesn't have the flag set. Only once we try to
register the doorbell resource ticket do we realize that 2nd
GH_VM_ADD_FUNCTION should fail.

>> +    list_add(&inst->fn_list, &inst->fn->instances);
>> +    mutex_unlock(&functions_lock);
>> +    return r;
>> +unlock_free:
>> +    mutex_unlock(&functions_lock);
>> +free_arg:
>> +    if (inst->arg_size)
>> +        kfree(inst->argp);
>> +free:
>> +    kfree(inst);
>> +    return r;
>> +}
>> +
>> +static long gh_vm_rm_function(struct gh_vm *ghvm, struct gh_fn_desc *f)
>> +{
>> +    struct gh_vm_function_instance *inst, *iter;
>> +    void __user *user_argp;
>> +    void *argp;
>> +    long r = 0;
>> +
>> +    r = mutex_lock_interruptible(&functions_lock);
>> +    if (r)
>> +        return r;
>> +
>> +    if (f->arg_size) {
>> +        argp = kzalloc(f->arg_size, GFP_KERNEL);
>> +        if (!argp) {
>> +            r = -ENOMEM;
>> +            gotoout;
>> +        }
>> +
>> +        user_argp = is_compat_task() ? compat_ptr(f->arg) : (void
>> __user *) f->arg;
>
> same comment as add;
>
>> +        if (copy_from_user(argp, user_argp, f->arg_size)) {
>> +            r = -EFAULT;
>> +            kfree(argp);
>> +            gotoout;
>> +        }
>> +
>> +        list_for_each_entry_safe(inst, iter, &ghvm->functions,
>> vm_list) {
>> +            if (inst->fn->type == f->type &&
>> +                f->arg_size == inst->arg_size &&
>> +                !memcmp(argp, inst->argp, f->arg_size))
>> +                gh_vm_remove_function_instance(inst);
>> +        }
>
> leaking argp;
>

Done.

>> +    } else {
>> +        list_for_each_entry_safe(inst, iter, &ghvm->functions,
>> vm_list) {
>> +            if (inst->fn->type == f->type &&
>> +                f->arg_size == inst->arg_size &&
>> +                inst->arg == f->arg)
>> +                gh_vm_remove_function_instance(inst);
>> +        }
>> +    }
>> +
>> +out:
>> +    mutex_unlock(&functions_lock);
>> +    return r;
>> +}
>> +
>>   static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
>>   {
>>       struct gh_rm_vm_status_payload *payload = data;
>> @@ -80,6 +253,7 @@ static void gh_vm_stop(struct gh_vm *ghvm)
>>   static void gh_vm_free(struct work_struct *work)
>>   {
>>       struct gh_vm *ghvm = container_of(work,struct gh_vm, free_work);
>> +    struct gh_vm_function_instance *inst, *iiter;
>>       struct gh_vm_mem *mapping, *tmp;
>>       int ret;
>> @@ -90,7 +264,13 @@ static void gh_vm_free(struct work_struct *work)
>>           fallthrough;
>>       case GH_RM_VM_STATUS_INIT_FAILED:
>>       case GH_RM_VM_STATUS_LOAD:
>> -    case GH_RM_VM_STATUS_LOAD_FAILED:
>> +    case GH_RM_VM_STATUS_EXITED:
>> +        mutex_lock(&functions_lock);
>> +        list_for_each_entry_safe(inst, iiter, &ghvm->functions,
>> vm_list) {
>> +            gh_vm_remove_function_instance(inst);
>> +        }
>> +        mutex_unlock(&functions_lock);
>> +
>>           mutex_lock(&ghvm->mm_lock);
>>           list_for_each_entry_safe(mapping, tmp,
>> &ghvm->memory_mappings, list) {
>>               gh_vm_mem_reclaim(ghvm, mapping);
>> @@ -113,6 +293,28 @@ static void gh_vm_free(struct work_struct *work)
>>       }
>>   }
>> +static void _gh_vm_put(struct kref *kref)
>> +{
>> +    struct gh_vm *ghvm = container_of(kref, struct gh_vm, kref);
>> +
>> +    /* VM will be reset and make RM calls which can interruptible sleep.
>> +     * Defer to a work so this thread can receive signal.
>> +     */
>> +    schedule_work(&ghvm->free_work);
>> +}
>> +
>> +int __must_check gh_vm_get(struct gh_vm *ghvm)
>> +{
>> +    return kref_get_unless_zero(&ghvm->kref);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_get);
>> +
>> +void gh_vm_put(struct gh_vm *ghvm)
>> +{
>> +    kref_put(&ghvm->kref, _gh_vm_put);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_vm_put);
>> +
>>   static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>>   {
>>       struct gh_vm *ghvm;
>> @@ -147,6 +349,8 @@ static __must_check struct gh_vm
>> *gh_vm_alloc(struct gh_rm *rm)
>>       INIT_LIST_HEAD(&ghvm->memory_mappings);
>>       init_rwsem(&ghvm->status_lock);
>>       INIT_WORK(&ghvm->free_work, gh_vm_free);
>> +    kref_init(&ghvm->kref);
>> +    INIT_LIST_HEAD(&ghvm->functions);
>>       ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>>       return ghvm;
>> @@ -291,6 +495,35 @@ static long gh_vm_ioctl(struct file *filp,
>> unsigned int cmd, unsigned long arg)
>>           r = gh_vm_ensure_started(ghvm);
>>           break;
>>       }
>> +    case GH_VM_ADD_FUNCTION: {
>> +        struct gh_fn_desc *f;
>> +
>> +        f = kzalloc(sizeof(*f), GFP_KERNEL);
>> +        if (!f)
>> +            return -ENOMEM;
>> +
>> +        if (copy_from_user(f, argp, sizeof(*f)))
>> +            return -EFAULT;
>> +
>> +        r = gh_vm_add_function(ghvm, f);
>> +        if (r < 0)
>> +            kfree(f);
>
>
> we are memory leaking f here, we should free it irrespective of return
> value. or I see no reason not to use this small struct from stack.
>

I can copy to stack directly.

>
>> +        break;
>> +    }
>> +    case GH_VM_REMOVE_FUNCTION: {
>> +        struct gh_fn_desc *f;
>> +
>> +        f = kzalloc(sizeof(*f), GFP_KERNEL);
>> +        if (!f)
>> +            return -ENOMEM;
>> +
>> +        if (copy_from_user(f, argp, sizeof(*f)))
>> +            return -EFAULT;
>> +
>> +        r = gh_vm_rm_function(ghvm, f);
>> +        kfree(f);
>> +        break;
>> +    }
>>       default:
>>           r = -ENOTTY;
>>           break;
>> @@ -303,10 +536,7 @@ static int gh_vm_release(struct inode *inode,
>> struct file *filp)
>>   {
>>       struct gh_vm *ghvm = filp->private_data;
>> -    /* VM will be reset and make RM calls which can interruptible sleep.
>> -     * Defer to a work so this thread can receive signal.
>> -     */
>> -    schedule_work(&ghvm->free_work);
>> +    gh_vm_put(ghvm);
>>       return 0;
>>   }
>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>> index e9cf56647cc2..4750d56c1297 100644
>> --- a/drivers/virt/gunyah/vm_mgr.h
>> +++ b/drivers/virt/gunyah/vm_mgr.h
>> @@ -8,6 +8,7 @@
>>   #include <linux/gunyah_rsc_mgr.h>
>>   #include <linux/list.h>
>> +#include <linux/kref.h>
>>   #include <linux/miscdevice.h>
>>   #include <linux/mutex.h>
>>   #include <linux/rwsem.h>
>> @@ -44,8 +45,10 @@ struct gh_vm {
>>       struct rw_semaphore status_lock;
>>       struct work_struct free_work;
>> +    struct kref kref;
>>       struct mutex mm_lock;
>>       struct list_head memory_mappings;
>> +    struct list_head functions;
>>   };
>>   int gh_vm_mem_alloc(struct gh_vm *ghvm, struct
>> gh_userspace_memory_region *region);
>> diff --git a/include/linux/gunyah_vm_mgr.h
>> b/include/linux/gunyah_vm_mgr.h
>> new file mode 100644
>> index 000000000000..f0a95af50b2e
>> --- /dev/null
>> +++ b/include/linux/gunyah_vm_mgr.h
>> @@ -0,0 +1,80 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _GUNYAH_VM_MGR_H
>> +#define _GUNYAH_VM_MGR_H
>> +
>> +#include <linux/compiler_types.h>
>> +#include <linux/gunyah.h>
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/list.h>
>> +#include <linux/mod_devicetable.h>
> ??
>
>> +#include <linux/notifier.h>
>
> ??
>
>> +
>> +#include <uapi/linux/gunyah.h>
>> +
>> +struct gh_vm;
>> +
>> +int __must_check gh_vm_get(struct gh_vm *ghvm);
>> +void gh_vm_put(struct gh_vm *ghvm);
>> +
>> +struct gh_vm_function_instance;
>> +struct gh_vm_function {
>> +    u32 type; > +    const char *name;
>> +    struct module *mod;
>> +    long (*bind)(struct gh_vm_function_instance *f);
>> +    void (*unbind)(struct gh_vm_function_instance *f);
>> +    struct mutex instances_lock;
>> +    struct list_head instances;
>> +};
>> +
>> +/**
>> + * struct gh_vm_function_instance - Represents one function instance
>> + * @arg_size: size of user argument
>> + * @arg: user argument to describe the function instance; arg_size is 0
>> + * @argp: pointer to user argument
>> + * @ghvm: Pointer to VM instance
>> + * @rm: Pointer to resource manager for the VM instance
>> + * @fn: The ops for the function
>> + * @data: Private data for function
>> + * @vm_list: for gh_vm's functions list
>> + * @fn_list: for gh_vm_function's instances list
>> + */
>> +struct gh_vm_function_instance {
>> +    size_t arg_size;
>> +    union {
>> +        u64 arg;
>> +        void *argp;
>> +    };
>> +    struct gh_vm *ghvm;
>> +    struct gh_rm *rm;
>> +    struct gh_vm_function *fn;
>> +    void *data;
>> +    struct list_head vm_list;
>> +    struct list_head fn_list;
> Am not seeing any advantage of storing the instance in two different
> list, they look redundant to me. storing the function instances in vm
> should be good IMO.
>

I was concerned about removal of function outside of module_removal, but
we don't have reason/ability to do it.

>
>> +};
>> +
>> +int gh_vm_function_register(struct gh_vm_function *f);
>> +void gh_vm_function_unregister(struct gh_vm_function *f);
>> +
>> +#define DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind)    \
>> +    static struct gh_vm_function _name = {        \
>> +        .type = _type,                        \
>> +        .name = __stringify(_name),                \
>> +        .mod = THIS_MODULE,                    \
>> +        .bind = _bind,                        \
>> +        .unbind = _unbind,                    \
>> +    };                                \
>> +    MODULE_ALIAS("ghfunc:"__stringify(_type))
>> +
>> +#define module_gunyah_vm_function(__gf)                    \
>> +    module_driver(__gf, gh_vm_function_register,
>> gh_vm_function_unregister)
>> +
>> +#define DECLARE_GUNYAH_VM_FUNCTION_INIT(_name, _type, _bind,
>> _unbind)    \
>> +    DECLARE_GUNYAH_VM_FUNCTION(_name, _type, _bind, _unbind);    \
>> +    module_gunyah_vm_function(_name)
>> +
>> +#endif
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> index d899bba6a4c6..8df455a2a293 100644
>> --- a/include/uapi/linux/gunyah.h
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -66,4 +66,21 @@ struct gh_vm_dtb_config {
>>   #define GH_VM_START        _IO(GH_IOCTL_TYPE, 0x3)
>> +#define GH_FN_MAX_ARG_SIZE        256
>> +
>> +/**
>> + * struct gh_fn_desc - Arguments to create a VM function
>> + * @type: Type of the function. See GH_FN_* macro for supported types
>> + * @arg_size: Size of argument to pass to the function
>
> a note on max arg size  of 256 bytes would be useful.
>
>> + * @arg: Value or pointer to argument given to the function
>
> Treating this as value when arg_size == 0 is really confusing abi.
> how about just use as arg as ptr to data along with arg_size;
>
> --srini
>> + */
>> +struct gh_fn_desc {
>> +    __u32 type;
>> +    __u32 arg_size;
>> +    __u64 arg;
>> +};
>> +
>> +#define GH_VM_ADD_FUNCTION    _IOW(GH_IOCTL_TYPE, 0x4, struct
>> gh_fn_desc)
>> +#define GH_VM_REMOVE_FUNCTION    _IOW(GH_IOCTL_TYPE, 0x7, struct
>> gh_fn_desc)
>
> Do you have an example of how add and rm ioctls are used w.r.t to arg, i
> see that we check correcteness of arg in between add and remove.
>
> --srini
>> +
>>   #endif

2023-02-25 01:03:35

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions



On 2/23/2023 4:34 PM, Alex Elder wrote:
> On 2/14/23 3:24 PM, Elliot Berman wrote:
>>
>> When launching a virtual machine, Gunyah userspace allocates memory for
>> the guest and informs Gunyah about these memory regions through
>> SET_USER_MEMORY_REGION ioctl.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Prakruthi Deepak Heragu <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   drivers/virt/gunyah/Makefile    |   2 +-
>>   drivers/virt/gunyah/vm_mgr.c    |  44 ++++++
>>   drivers/virt/gunyah/vm_mgr.h    |  25 ++++
>>   drivers/virt/gunyah/vm_mgr_mm.c | 235 ++++++++++++++++++++++++++++++++
>>   include/uapi/linux/gunyah.h     |  33 +++++
>>   5 files changed, 338 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index 03951cf82023..ff8bc4925392 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>>   obj-$(CONFIG_GUNYAH) += gunyah.o
>> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> index fd890a57172e..84102bac03cc 100644
>> --- a/drivers/virt/gunyah/vm_mgr.c
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -18,8 +18,16 @@
>>   static void gh_vm_free(struct work_struct *work)
>>   {
>>       struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
>> +    struct gh_vm_mem *mapping, *tmp;
>>       int ret;
>> +    mutex_lock(&ghvm->mm_lock);
>> +    list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings,
>> list) {
>> +        gh_vm_mem_reclaim(ghvm, mapping);
>> +        kfree(mapping);
>> +    }
>> +    mutex_unlock(&ghvm->mm_lock);
>> +
>>       ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>>       if (ret)
>>           pr_warn("Failed to deallocate vmid: %d\n", ret);
>> @@ -48,11 +56,46 @@ static __must_check struct gh_vm
>> *gh_vm_alloc(struct gh_rm *rm)
>>       ghvm->vmid = vmid;
>>       ghvm->rm = rm;
>> +    mutex_init(&ghvm->mm_lock);
>> +    INIT_LIST_HEAD(&ghvm->memory_mappings);
>>       INIT_WORK(&ghvm->free_work, gh_vm_free);
>>       return ghvm;
>>   }
>> +static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned
>> long arg)
>> +{
>> +    struct gh_vm *ghvm = filp->private_data;
>> +    void __user *argp = (void __user *)arg;
>> +    long r;
>> +
>> +    switch (cmd) {
>> +    case GH_VM_SET_USER_MEM_REGION: {
>> +        struct gh_userspace_memory_region region;
>> +
>> +        if (copy_from_user(&region, argp, sizeof(region)))
>> +            return -EFAULT;
>> +
>> +        /* All other flag bits are reserved for future use */
>> +        if (region.flags & ~(GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE |
>> GH_MEM_ALLOW_EXEC |
>> +            GH_MEM_LENT))
>> +            return -EINVAL;
>> +
>> +
>> +        if (region.memory_size)
>
> Would there be any value in allowing a zero-size memory
> region to be created?  Maybe that doesn't make sense, but
> I guess i'm questioning whether a zero memory region size
> have special meaning in this interface is a good thing to
> do.  You could sensibly have a separate REMOVE_USER_MEM_REGION
> request, and still permit 0 to be a valid size.
>

I don't think zero-size memory region makes sense. At best, it only
registers an empty region with guest and causes memory overhead for
bookkeeping.

>> +            r = gh_vm_mem_alloc(ghvm, &region);
>> +        else
>> +            r = gh_vm_mem_free(ghvm, region.label);
>> +        break;
>> +    }
>> +    default:
>> +        r = -ENOTTY;
>> +        break;
>> +    }
>> +
>> +    return r;
>> +}
>> +
>>   static int gh_vm_release(struct inode *inode, struct file *filp)
>>   {
>>       struct gh_vm *ghvm = filp->private_data;
>> @@ -65,6 +108,7 @@ static int gh_vm_release(struct inode *inode,
>> struct file *filp)
>>   }
>>   static const struct file_operations gh_vm_fops = {
>> +    .unlocked_ioctl = gh_vm_ioctl,
>>       .release = gh_vm_release,
>>       .compat_ioctl    = compat_ptr_ioctl,
>>       .llseek = noop_llseek,
>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>> index 76954da706e9..97bc00c34878 100644
>> --- a/drivers/virt/gunyah/vm_mgr.h
>> +++ b/drivers/virt/gunyah/vm_mgr.h
>> @@ -7,16 +7,41 @@
>>   #define _GH_PRIV_VM_MGR_H
>>   #include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/list.h>
>> +#include <linux/miscdevice.h>
>> +#include <linux/mutex.h>
>>   #include <uapi/linux/gunyah.h>
>>   long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd,
>> unsigned long arg);
>> +enum gh_vm_mem_share_type {
>> +    VM_MEM_SHARE,
>> +    VM_MEM_LEND,
>
> Are there any other share types anticipated?  Even if
> there were, for now you could use a Boolean to distinguish
> between shared or lent (at least until a third option
> materializes).
>

There is VM_MEM_DONATE. I can add the type, but it's only used special
VMs (there's nothing really stopping a generic unauth VM to use it, but
I don't think anyone will want to).

>> +};
>> +
>> +struct gh_vm_mem {
>> +    struct list_head list;
>> +    enum gh_vm_mem_share_type share_type;
>> +    struct gh_rm_mem_parcel parcel;
>> +
>> +    __u64 guest_phys_addr;
>> +    struct page **pages;
>> +    unsigned long npages;
>> +};
>> +
>>   struct gh_vm {
>>       u16 vmid;
>>       struct gh_rm *rm;
>>       struct work_struct free_work;
>> +    struct mutex mm_lock;
>> +    struct list_head memory_mappings;
>>   };
>> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct
>> gh_userspace_memory_region *region);
>> +void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
>> +int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
>> +struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
>> +
>>   #endif
>> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c
>> b/drivers/virt/gunyah/vm_mgr_mm.c
>> new file mode 100644
>> index 000000000000..03e71a36ea3b
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
>> @@ -0,0 +1,235 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
>> +
>> +#include <linux/gunyah_rsc_mgr.h>
>> +#include <linux/mm.h>
>> +
>> +#include <uapi/linux/gunyah.h>
>> +
>> +#include "vm_mgr.h"
>> +
>> +static inline bool page_contiguous(phys_addr_t p, phys_addr_t t)
>
> Is there not some existing function that captures this?
> In any case, it's used in one place and I think it would
> be clearer to just put the logic there rather than hiding
> it behind this function.
>

Done.

>> +{
>> +    return t - p == PAGE_SIZE;
>> +}
>> +
>> +static struct gh_vm_mem *__gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
>> +    __must_hold(&ghvm->mm_lock)
>> +{
>> +    struct gh_vm_mem *mapping;
>> +
>> +    list_for_each_entry(mapping, &ghvm->memory_mappings, list)
>> +        if (mapping->parcel.label == label)
>> +            return mapping;
>> +
>> +    return NULL;
>> +}
>> +
>
> . . .
>
>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>> index 10ba32d2b0a6..d85d12119a48 100644
>> --- a/include/uapi/linux/gunyah.h
>> +++ b/include/uapi/linux/gunyah.h
>> @@ -20,4 +20,37 @@
>>    */
>>   #define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns a
>> Gunyah VM fd */
>> +/*
>> + * ioctls for VM fds
>> + */
>> +
>> +/**
>> + * struct gh_userspace_memory_region - Userspace memory descripion
>> for GH_VM_SET_USER_MEM_REGION
>> + * @label: Unique identifer to the region.
>
> Maybe this is described somewhere, but what is the purpose
> of the label?  Who uses it?  Is it meant to be a value
> only the current owner of a resource understands?  Or does
> resource manager use it internally, or what?
>

The label is used by kernel, userspace, and Gunyah. Userspace decides
all the labels and there are no special labels.

- Userspace can delete memory parcels by label (kernel looks up parcel
by label)
- The VM's DTB configuration describes where Gunyah should map memory
parcels into guest's memory. The VM DTB uses the memory parcel's label
as the reference.

Thanks,
Elliot

>> + * @flags: Flags for memory parcel behavior
>> + * @guest_phys_addr: Location of the memory region in guest's memory
>> space (page-aligned)
>> + * @memory_size: Size of the region (page-aligned)
>> + * @userspace_addr: Location of the memory region in caller
>> (userspace)'s memory
>> + *
>> + * See Documentation/virt/gunyah/vm-manager.rst for further details.
>> + */
>> +struct gh_userspace_memory_region {
>> +    __u32 label;
>
> Define the possible permission values separate from
> the structure.
>
>                     -Alex
>
>> +#define GH_MEM_ALLOW_READ    (1UL << 0)
>> +#define GH_MEM_ALLOW_WRITE    (1UL << 1)
>> +#define GH_MEM_ALLOW_EXEC    (1UL << 2)
>> +/*
>> + * The guest will be lent the memory instead of shared.
>> + * In other words, the guest has exclusive access to the memory
>> region and the host loses access.
>> + */
>> +#define GH_MEM_LENT        (1UL << 3)
>> +    __u32 flags;
>> +    __u64 guest_phys_addr;
>> +    __u64 memory_size;
>> +    __u64 userspace_addr;
>> +};
>> +
>> +#define GH_VM_SET_USER_MEM_REGION    _IOW(GH_IOCTL_TYPE, 0x1, \
>> +                        struct gh_userspace_memory_region)
>> +
>>   #endif
>

2023-02-27 09:55:55

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v10 12/26] gunyah: vm_mgr: Add/remove user memory regions

Hi,

On Fri, Feb 24, 2023 at 6:08 PM Elliot Berman <[email protected]> wrote:
>
>
>
> On 2/24/2023 2:19 AM, Fuad Tabba wrote:
> > Hi,
> >
> > On Tue, Feb 14, 2023 at 9:26 PM Elliot Berman <[email protected]> wrote:
> >>
> >>
> >> When launching a virtual machine, Gunyah userspace allocates memory for
> >> the guest and informs Gunyah about these memory regions through
> >> SET_USER_MEMORY_REGION ioctl.
> >
> > I'm working on pKVM [1], and regarding the problem of donating private
> > memory to a guest, we and others working on confidential computing
> > have faced a similar issue that this patch is trying to address. In
> > pKVM, we've initially taken an approach similar to the one here by
> > pinning the pages being donated to prevent swapping or migration [2].
> > However, we've encountered issues with this approach since the memory
> > is still mapped by the host, which could cause the system to crash on
> > an errant access.
> >
> > Instead, we've been working on adopting an fd-based restricted memory
> > approach that was initially proposed for TDX [3] and is now being
> > considered by others in the confidential computing space as well
> > (e.g., Arm CCA [4]). The basic idea is that the host manages the guest
> > memory via a file descriptor instead of a userspace address. It cannot
> > map that memory (unless explicitly shared by the guest [5]),
> > eliminating the possibility of the host trying to access private
> > memory accidentally or being tricked by a malicious actor. This is
> > based on memfd with some restrictions. It handles swapping and
> > migration by disallowing them (for now [6]), and adds a new type of
> > memory region to KVM to accommodate having an fd representing guest
> > memory.
> >
> > Although the fd-based restricted memory isn't upstream yet, we've
> > ported the latest patches to arm64 and made changes and additions to
> > make it work with pKVM, to test it and see if the solution is feasible
> > for us (it is). I wanted to mention this work in case you find it
> > useful, and in the hopes that we can all work on confidential
> > computing using the same interfaces as much as possible.
>
> Thanks for highlighting the memfd_restricted changes to us! We'll
> investigate how/if it can suit Gunyah usecases. It sounds like you
> might've made memfd_restricted changes as well? Are those posted on the
> mailing lists? Also, are example userspace (crosvm?) changes posted?

I have posted kvmtool changes to make it work with memfd_restricted
and pKVM as an RFC [1] (git [2]). I haven't posted the arm64 port, but
it's in a git repo [3]. Chao has a repository with qemu support (TDX)
as well [4].

Eventually, we're likely to have crosvm support as well. If you're
interested, I can keep you CCed on anything we post upstream.

Cheers,
/fuad

[1] https://lore.kernel.org/all/[email protected]/
[2] https://android-kvm.googlesource.com/kvmtool/+/refs/heads/tabba/fdmem-v10-core
[3] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/fdmem-v10-core
[4] https://github.com/chao-p/qemu/tree/privmem-v10

>
> Thanks,
> Elliot
>
> >
> > Some comments inline below...
> >
> > Cheers,
> > /fuad
> >
> > [1] https://lore.kernel.org/kvmarm/[email protected]/
> > [2] https://lore.kernel.org/kvmarm/[email protected]/
> > [3] https://lore.kernel.org/all/[email protected]/
> > [4] https://lore.kernel.org/lkml/[email protected]/
> > [5] This is a modification we've done for the arm64 port, after
> > discussing it with the original authors.
> > [6] Nothing inherent in the proposal to stop migration and swapping.
> > There are some technical issues that need to be resolved.
> >
> > <snip>
> <snip, looking at comments in parallel>

2023-02-28 00:52:10

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core

On 2/23/23 5:13 PM, Elliot Berman wrote:
>> TBH, gunyah.c should be merged as part of resource manager, and check
>> if uuids and features in probe before proceeding further.
>>
>
>
> Ah -- gunyah_rsc_mgr.ko has symbol dependency on gunyah-msgq.ko.
> gunyah-msgq.ko has symbol dependency on gunyah.ko. gunyah.ko doesn't
> have any probe and does all its work on module_init.
>
> In order to merge gunyah.c with resource manager, I would need to
> incorporate message queue mailbox into resource manager. IMO, this
> rapidly moves towards a mega-module which was discouraged previously.

I missed this discussion; why was it discouraged?

I can think of some reasons why I guess. But I don't see what
problem comes from linking together a "mega module" that's made
up of well-isolated source files that expose minimal APIs to
one another. All inter-dependent modules will required at once
anyway; I don't understand the benefit of implementing them
separately. Can you explain, or provide some context? Thanks.

-Alex


2023-02-28 01:06:38

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager

On 2/24/23 4:48 PM, Elliot Berman wrote:
> I'd be open to making GH_CREATE_VM take a struct argument today, but I
> really don't know what size or what needs to be in that struct. My hope
> is that we can get away with just an integer for future needs. If
> integer doesn't suit, then new ioctl would need to be created. I think
> there's same problem if I pick some struct today (the struct may not
> suit tomorrow and we need to create new ioctl for the new struct).

I'd like someone to back me up (or tell me I'm wrong), but...

I think you can still pass a void in/out pointer, which can
be interpreted in an IOCTL-specific way, as long as it can
be unambiguously processed.

So if you passed a non-null pointer, what it referred to
could contain a key that defines the way to interpret it.

You can't take away a behavior you've once supported, but I
*think* you can add a new behavior (with a new structure
that identifies itself).

So if that is correct, you can extend a single IOCTL. But
sadly I can't tell you I'm sure this is correct.

-Alex

2023-02-28 09:19:27

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v10 10/26] gunyah: vm_mgr: Introduce basic VM Manager

On Tue, Feb 28, 2023, at 02:06, Alex Elder wrote:
> On 2/24/23 4:48 PM, Elliot Berman wrote:
>> I'd be open to making GH_CREATE_VM take a struct argument today, but I
>> really don't know what size or what needs to be in that struct. My hope
>> is that we can get away with just an integer for future needs. If
>> integer doesn't suit, then new ioctl would need to be created. I think
>> there's same problem if I pick some struct today (the struct may not
>> suit tomorrow and we need to create new ioctl for the new struct).
>
> I'd like someone to back me up (or tell me I'm wrong), but...
>
> I think you can still pass a void in/out pointer, which can
> be interpreted in an IOCTL-specific way, as long as it can
> be unambiguously processed.
>
> So if you passed a non-null pointer, what it referred to
> could contain a key that defines the way to interpret it.
>
> You can't take away a behavior you've once supported, but I
> *think* you can add a new behavior (with a new structure
> that identifies itself).
>
> So if that is correct, you can extend a single IOCTL. But
> sadly I can't tell you I'm sure this is correct.

In general you are correct that the behavior of an ioctl
command can be changed by reusing a combination of inputs that
was previously prohibited. I can't think of a case where that
would be a good idea though, as this just adds more complexity
than defining a new ioctl command code.

Interface versions and multiplexed ioctl commands are
all discouraged for the same reason.

Arnd

2023-02-28 22:50:21

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 08/26] gunyah: rsc_mgr: Add resource manager RPC core



On 2/27/2023 4:52 PM, Alex Elder wrote:
> On 2/23/23 5:13 PM, Elliot Berman wrote:
>>> TBH, gunyah.c should be merged as part of resource manager, and check
>>> if uuids and features in probe before proceeding further.
>>>
>>
>>
>> Ah -- gunyah_rsc_mgr.ko has symbol dependency on gunyah-msgq.ko.
>> gunyah-msgq.ko has symbol dependency on gunyah.ko. gunyah.ko doesn't
>> have any probe and does all its work on module_init.
>>
>> In order to merge gunyah.c with resource manager, I would need to
>> incorporate message queue mailbox into resource manager. IMO, this
>> rapidly moves towards a mega-module which was discouraged previously.
>
> I missed this discussion; why was it discouraged?
>
> I can think of some reasons why I guess.  But I don't see what
> problem comes from linking together a "mega module" that's made
> up of well-isolated source files that expose minimal APIs to
> one another.  All inter-dependent modules will required at once
> anyway; I don't understand the benefit of implementing them
> separately.  Can you explain, or provide some context?  Thanks.

I came from some earlier comments from Dmitry:

https://lore.kernel.org/all/[email protected]/

Earlier comments from Dmitry were about having bus and drivers in same
module. I think same comment applies with mailbox built into the
gunyah.ko (message queue is the provider and rsr_mgr is the consumer).

Thanks,
Elliot

2023-03-01 00:01:16

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 01/26] docs: gunyah: Introduce Gunyah Hypervisor



On 2/23/2023 3:41 PM, Alex Elder wrote:
> On 2/14/23 3:12 PM, Elliot Berman wrote:
>> Gunyah is an open-source Type-1 hypervisor developed by Qualcomm. It
>> does not depend on any lower-privileged OS/kernel code for its core
>> functionality. This increases its security and can support a smaller
>> trusted computing based when compared to Type-2 hypervisors.
>>
>> Add documentation describing the Gunyah hypervisor and the main
>> components of the Gunyah hypervisor which are of interest to Linux
>> virtualization development.
>>
>> Reviewed-by: Bagas Sanjaya <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   Documentation/virt/gunyah/index.rst         | 113 ++++++++++++++++++++
>>   Documentation/virt/gunyah/message-queue.rst |  61 +++++++++++
>>   Documentation/virt/index.rst                |   1 +
>>   3 files changed, 175 insertions(+)
>>   create mode 100644 Documentation/virt/gunyah/index.rst
>>   create mode 100644 Documentation/virt/gunyah/message-queue.rst
>>
>> diff --git a/Documentation/virt/gunyah/index.rst
>> b/Documentation/virt/gunyah/index.rst
>> new file mode 100644
>> index 000000000000..45adbbc311db
>> --- /dev/null
>> +++ b/Documentation/virt/gunyah/index.rst
>> @@ -0,0 +1,113 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +=================
>> +Gunyah Hypervisor
>> +=================
>> +
>> +.. toctree::
>> +   :maxdepth: 1
>> +
>> +   message-queue
>> +
>> +Gunyah is a Type-1 hypervisor which is independent of any OS kernel,
>> and runs in
>> +a higher CPU privilege level. It does not depend on any
>> lower-privileged operating system
>> +for its core functionality. This increases its security and can
>> support a much smaller
>> +trusted computing base than a Type-2 hypervisor.
>> +
>> +Gunyah is an open source hypervisor. The source repo is available at
>> +https://github.com/quic/gunyah-hypervisor.
>> +
>> +Gunyah provides these following features.
>> +
>> +- Scheduling:
>> +
>> +  A scheduler for virtual CPUs (vCPUs) on physical CPUs enables
>> time-sharing
>> +  of the CPUs. Gunyah supports two models of scheduling:
>> +
>> +    1. "Behind the back" scheduling in which Gunyah hypervisor
>> schedules vCPUS on its own.
>> +    2. "Proxy" scheduling in which a delegated VM can donate part of
>> one of its vCPU slice
>> +       to another VM's vCPU via a hypercall.
>> +
>> +- Memory Management:
>> +
>> +  APIs handling memory, abstracted as objects, limiting direct use of
>> physical
>> +  addresses. Memory ownership and usage tracking of all memory under
>> its control.
>> +  Memory partitioning between VMs is a fundamental security feature.
>> +
>> +- Interrupt Virtualization:
>> +
>> +  Uses CPU hardware interrupt virtualization capabilities. Interrupts
>> are handled
>> +  in the hypervisor and routed to the assigned VM.
>> +
>> +- Inter-VM Communication:
>> +
>> +  There are several different mechanisms provided for communicating
>> between VMs.
>> +
>> +- Virtual platform:
>> +
>> +  Architectural devices such as interrupt controllers and CPU timers
>> are directly provided
>> +  by the hypervisor as well as core virtual platform devices and
>> system APIs such as ARM PSCI.
>> +
>> +- Device Virtualization:
>> +
>> +  Para-virtualization of devices is supported using inter-VM
>> communication.
>> +
>> +Architectures supported
>> +=======================
>> +AArch64 with a GIC
>> +
>> +Resources and Capabilities
>> +==========================
>> +
>> +Some services or resources provided by the Gunyah hypervisor are
>> described to a virtual machine by
>> +capability IDs. For instance, inter-VM communication is performed
>> with doorbells and message queues.
>> +Gunyah allows access to manipulate that doorbell via the capability
>> ID. These resources are
>> +described in Linux as a struct gunyah_resource.
>> +
>> +High level management of these resources is performed by the resource
>> manager VM. RM informs a
>> +guest VM about resources it can access through either the device tree
>> or via guest-initiated RPC.
>> +
>> +For each virtual machine, Gunyah maintains a table of resources which
>> can be accessed by that VM.
>> +An entry in this table is called a "capability" and VMs can only
>> access resources via this
>> +capability table. Hence, virtual Gunyah resources are referenced by a
>> "capability IDs" and not
>> +"resource IDs". If 2 VMs have access to the same resource, they might
>> not be using the same
>> +capability ID to access that resource since the capability tables are
>> independent per VM.
>> +
>> +Resource Manager
>> +================
>> +
>> +The resource manager (RM) is a privileged application VM supporting
>> the Gunyah Hypervisor.
>> +It provides policy enforcement aspects of the virtualization system.
>> The resource manager can
>> +be treated as an extension of the Hypervisor but is separated to its
>> own partition to ensure
>> +that the hypervisor layer itself remains small and secure and to
>> maintain a separation of policy
>> +and mechanism in the platform. RM runs at arm64 NS-EL1 similar to
>> other virtual machines.
>> +
>> +Communication with the resource manager from each guest VM happens
>> with message-queue.rst. Details
>> +about the specific messages can be found in
>> drivers/virt/gunyah/rsc_mgr.c
>> +
>> +::
>> +
>> +  +-------+   +--------+   +--------+
>> +  |  RM   |   |  VM_A  |   |  VM_B  |
>> +  +-.-.-.-+   +---.----+   +---.----+
>> +    | |           |            |
>> +  +-.-.-----------.------------.----+
>> +  | | \==========/             |    |
>> +  |  \========================/     |
>> +  |            Gunyah               |
>> +  +---------------------------------+
>> +
>> +The source for the resource manager is available at
>> https://github.com/quic/gunyah-resource-manager.
>> +
>> +The resource manager provides the following features:
>> +
>> +- VM lifecycle management: allocating a VM, starting VMs, destruction
>> of VMs
>> +- VM access control policy, including memory sharing and lending
>> +- Interrupt routing configuration
>> +- Forwarding of system-level events (e.g. VM shutdown) to owner VM
>> +
>> +When booting a virtual machine which uses a devicetree such as Linux,
>> resource manager overlays a
>> +/hypervisor node. This node can let Linux know it is running as a
>> Gunyah guest VM,
>> +how to communicate with resource manager, and basic description and
>> capabilities of
>> +this VM. See
>> Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml for
>> a description
>> +of this node.
>> diff --git a/Documentation/virt/gunyah/message-queue.rst
>> b/Documentation/virt/gunyah/message-queue.rst
>> new file mode 100644
>> index 000000000000..0667b3eb1ff9
>> --- /dev/null
>> +++ b/Documentation/virt/gunyah/message-queue.rst
>> @@ -0,0 +1,61 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +Message Queues
>> +==============
>> +Message queue is a simple low-capacity IPC channel between two VMs.
>> It is
>> +intended for sending small control and configuration messages. Each
>> message
>> +queue is unidirectional, so a full-duplex IPC channel requires a pair
>> of queues.
>> +
>> +Messages can be up to 240 bytes in length. Longer messages require a
>> further
>> +protocol on top of the message queue messages themselves. For
>> instance, communication
>> +with the resource manager adds a header field for sending longer
>> messages via multiple
>> +message fragments.
>> +
>> +The diagram below shows how message queue works. A typical
>> configuration involves
>> +2 message queues. Message queue 1 allows VM_A to send messages to
>> VM_B. Message
>> +queue 2 allows VM_B to send messages to VM_A.
>> +
>> +1. VM_A sends a message of up to 240 bytes in length. It raises a
>> hypercall
>
> Can you clarify that the message being sent is in the VM's *own*
> memory/  Maybe this is clear, but the message doesn't have to (for
> example) be located in shared memory.  The original message is
> copied into message queue buffers in order to be transferred.
>
>> +   with the message to inform the hypervisor to add the message to
>> +   message queue 1's queue.
>> +
>> +2. Gunyah raises the corresponding interrupt for VM_B (Rx vIRQ) when
>> any of
>> +   these happens:
>> +
>> +   a. gh_msgq_send has PUSH flag. Queue is immediately flushed. This
>> is the typical case.
>
> Below you use gh_msgq_send() (with parentheses).  I prefer that,
> but whatever you do, do it consistently.
>
>> +   b. Explicility with gh_msgq_push command from VM_A.
>> +   c. Message queue has reached a threshold depth.
>> +
>> +3. VM_B calls gh_msgq_recv and Gunyah copies message to requested
>> buffer.
>> +
>> +4. Gunyah buffers messages in the queue. If the queue became full
>> when VM_A added a message,
>> +   the return values for gh_msgq_send() include a flag that indicates
>> the queue is full.
>> +   Once VM_B receives the message and, thus, there is space in the
>> queue, Gunyah
>> +   will raise the Tx vIRQ on VM_A to indicate it can continue sending
>> messages.
>> +
>> +For VM_B to send a message to VM_A, the process is identical, except
>> that hypercalls
>> +reference message queue 2's capability ID. Each message queue has its
>> own independent
>> +vIRQ: two TX message queues will have two vIRQs (and two capability
>> IDs).
>
> Can a sender determine when a message has been delivered?

Sender cannot determine when the receiving VM has processed the message.

> Does the TX vIRQ indicate only that the messaging system
> has processed the message (taken it and queued it), but
> says nothing about it being delivered/accepted/received?

That's the correct interpretation.

Thanks,
Elliot

2023-03-02 01:22:18

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 04/26] virt: gunyah: Add hypercalls to identify Gunyah



On 2/23/2023 4:09 PM, Alex Elder wrote:
> On 2/14/23 3:12 PM, Elliot Berman wrote:
>> +
>> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp
>> *hyp_identity);
>
> Since this is a user space API, you *could* consider having
> this function return an int.  Just in case there's a future
> reason that a failure could occur, or that you want to
> supply some other information.  If this truly doesn't make
> sense, it's fine as-is...
>

I'm not sure what was meant by user space API. However, hypervisor API
doesn't provision a return value. r0 is usually the return value for
most other Gunyah hypercalls except for this one -- instead, it's the
api_info field.

The other kind of error we could get is at hypercall "transport" layer,
but the hvc instruction doesn't fail and if we ever change the hypercall
transport, I'm sure there will be a lot of other changes to consider as
well.

Thanks,
Elliot


2023-03-02 01:41:46

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH v10 03/26] gunyah: Common types and error codes for Gunyah hypercalls



On 2/23/2023 1:58 PM, Alex Elder wrote:
> On 2/14/23 3:12 PM, Elliot Berman wrote:
>> Add architecture-independent standard error codes, types, and macros for
>> Gunyah hypercalls.
>>
>> Reviewed-by: Dmitry Baryshkov <[email protected]>
>> Signed-off-by: Elliot Berman <[email protected]>
>> ---
>>   include/linux/gunyah.h | 82 ++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 82 insertions(+)
>>   create mode 100644 include/linux/gunyah.h
>>
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> new file mode 100644
>> index 000000000000..59ef4c735ae8
>> --- /dev/null
>> +++ b/include/linux/gunyah.h
>> @@ -0,0 +1,82 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All
>> rights reserved.
>> + */
>> +
>> +#ifndef _LINUX_GUNYAH_H
>> +#define _LINUX_GUNYAH_H
>> +
>> +#include <linux/errno.h>
>> +#include <linux/limits.h>
>> +
>> +/******************************************************************************/
>> +/* Common arch-independent definitions for Gunyah
>> hypercalls                  */
>> +#define GH_CAPID_INVAL    U64_MAX
>> +#define GH_VMID_ROOT_VM    0xff
>> +
>> +enum gh_error {
>> +    GH_ERROR_OK            = 0,
>> +    GH_ERROR_UNIMPLEMENTED        = -1,
>> +    GH_ERROR_RETRY            = -2,
>
> Do you expect this type to have a particular size?
> Since you specify negative values, it matters, and
> it's possible that this forces it to be a 4-byte value
> (though I'm not sure what the rules are).  In other
> words, UNIMPLEMENTED could conceivably have value 0xff
> or 0xffffffff.  I'm not even sure you can tell whether
> an enum is interpreted as signed or unsigned.

I'm not a C expert, but my understanding is that enums are signed.
Gunyah will be returning a signed 64-bit register, however there's no
intention to go beyond 32 bits of error codes since we want to work on
32-bit architectures.

>
> It's not usually a good thing to do, but this *could*
> be a case where you do a typedef to represent this as
> a signed value of a certain bit width.  (But don't do
> that unless someone else says that's worth doing.)
>
>                     -Alex
>
>> +
>> +    GH_ERROR_ARG_INVAL        = 1,
>> +    GH_ERROR_ARG_SIZE        = 2,
>> +    GH_ERROR_ARG_ALIGN        = 3,
>> +
>> +    GH_ERROR_NOMEM            = 10,
>> +
>> +    GH_ERROR_ADDR_OVFL        = 20,
>> +    GH_ERROR_ADDR_UNFL        = 21,
>> +    GH_ERROR_ADDR_INVAL        = 22,
>> +
>> +    GH_ERROR_DENIED            = 30,
>> +    GH_ERROR_BUSY            = 31,
>> +    GH_ERROR_IDLE            = 32,
>> +
>> +    GH_ERROR_IRQ_BOUND        = 40,
>> +    GH_ERROR_IRQ_UNBOUND        = 41,
>> +
>> +    GH_ERROR_CSPACE_CAP_NULL    = 50,
>> +    GH_ERROR_CSPACE_CAP_REVOKED    = 51,
>> +    GH_ERROR_CSPACE_WRONG_OBJ_TYPE    = 52,
>> +    GH_ERROR_CSPACE_INSUF_RIGHTS    = 53,
>> +    GH_ERROR_CSPACE_FULL        = 54,
>> +
>> +    GH_ERROR_MSGQUEUE_EMPTY        = 60,
>> +    GH_ERROR_MSGQUEUE_FULL        = 61,
>> +};
>> +
>> +/**
>> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux
>> error code
>> + * @gh_error: Gunyah hypercall return value
>> + */
>> +static inline int gh_remap_error(enum gh_error gh_error)
>> +{
>> +    switch (gh_error) {
>> +    case GH_ERROR_OK:
>> +        return 0;
>> +    case GH_ERROR_NOMEM:
>> +        return -ENOMEM;
>> +    case GH_ERROR_DENIED:
>> +    case GH_ERROR_CSPACE_CAP_NULL:
>> +    case GH_ERROR_CSPACE_CAP_REVOKED:
>> +    case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
>> +    case GH_ERROR_CSPACE_INSUF_RIGHTS:
>> +    case GH_ERROR_CSPACE_FULL:
>> +        return -EACCES;
>> +    case GH_ERROR_BUSY:
>> +    case GH_ERROR_IDLE:
>> +    case GH_ERROR_IRQ_BOUND:
>> +    case GH_ERROR_IRQ_UNBOUND:
>> +    case GH_ERROR_MSGQUEUE_FULL:
>> +    case GH_ERROR_MSGQUEUE_EMPTY:
>
> Is an empty message queue really busy?
>

Changed to -EIO.

>> +        return -EBUSY;
>> +    case GH_ERROR_UNIMPLEMENTED:
>> +    case GH_ERROR_RETRY:
>> +        return -EOPNOTSUPP;
>> +    default:
>> +        return -EINVAL;
>> +    }
>> +}
>> +
>> +#endif
>

2023-03-02 07:18:58

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v10 03/26] gunyah: Common types and error codes for Gunyah hypercalls

On Thu, Mar 2, 2023, at 02:40, Elliot Berman wrote:
> On 2/23/2023 1:58 PM, Alex Elder wrote:

>>> +enum gh_error {
>>> +    GH_ERROR_OK            = 0,
>>> +    GH_ERROR_UNIMPLEMENTED        = -1,
>>> +    GH_ERROR_RETRY            = -2,
>>
>> Do you expect this type to have a particular size?
>> Since you specify negative values, it matters, and
>> it's possible that this forces it to be a 4-byte value
>> (though I'm not sure what the rules are).  In other
>> words, UNIMPLEMENTED could conceivably have value 0xff
>> or 0xffffffff.  I'm not even sure you can tell whether
>> an enum is interpreted as signed or unsigned.
>
> I'm not a C expert, but my understanding is that enums are signed.
> Gunyah will be returning a signed 64-bit register, however there's no
> intention to go beyond 32 bits of error codes since we want to work on
> 32-bit architectures.

This came up recently because gcc-13 changes the rules.

In GNU C, the enum type will have the smallest type that fits all
values, so if it contains a negative number it ends up as a signed
type (int, long or long long), but if all values are positive and at
least one of them exceeds the signed range (e.g. UINT_MAX), it is
an unsigned type. If it contains both UINT_MAX and -1, the enum
type gets changed to a signed 64-bit type in order to fit both.

Before gcc-13, the individual constants have the smallest type
(at least 'int') that fits their value, but in gcc-13 they have
the same type as the enum type itself.

Arnd