2020-08-25 03:05:47

by Shuo Liu

Subject: [PATCH 00/17] HSM driver for ACRN hypervisor

From: Shuo Liu <[email protected]>

ACRN is a Type 1 reference hypervisor stack, running directly on the bare-metal
hardware, and is suitable for a variety of IoT and embedded device solutions.

ACRN implements a hybrid VMM architecture, using a privileged Service VM. The
Service VM manages the system resources (CPU, memory, etc.) and I/O devices of
User VMs. Multiple User VMs are supported, each running Linux, Android OS or
Windows. Both the Service VM and the User VMs are guest VMs.

The figure below shows the architecture.

              Service VM                     User VM
    +----------------------------+  |  +------------------+
    |        +--------------+    |  |  |                  |
    |        |ACRN userspace|    |  |  |                  |
    |        +--------------+    |  |  |                  |
    |-----------------ioctl------|  |  |                  |   ...
    |kernel space   +----------+ |  |  |                  |
    |               |   HSM    | |  |  |     Drivers      |
    |               +----------+ |  |  |                  |
    +--------------------|-------+  |  +------------------+
+---------------------hypercall----------------------------------------+
|                           ACRN Hypervisor                            |
+----------------------------------------------------------------------+
|                               Hardware                               |
+----------------------------------------------------------------------+

There is only one Service VM, which can run Linux as its OS.

In a typical case, the Service VM is started automatically when the ACRN
Hypervisor boots. The ACRN userspace (an application running in the Service
VM) can then start and stop User VMs by communicating with the ACRN Hypervisor
Service Module (HSM).

ACRN Hypervisor Service Module (HSM) is a middle layer that allows the ACRN
userspace and the Service VM OS kernel to communicate with the ACRN Hypervisor
and manage different User VMs. This middle layer provides the following
functionality:
- Issues hypercalls to the hypervisor to manage User VMs:
  * VM/vCPU management
  * Memory management
  * Device passthrough
  * Interrupt injection
- Handles I/O requests from User VMs.
- Exports ioctls through the HSM char device.
- Exports function calls for other kernel modules.
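
As a rough illustration of the hypercall side: the hypercall IDs used by HSM
are composed from a fixed prefix and a per-call offset by the _HC_ID() macro
in drivers/virt/acrn/hypercall.h (patch 05). The sketch below mirrors that
arithmetic in plain userspace C. The helper names hc_id() and
hc_get_api_version_id() are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Values copied from drivers/virt/acrn/hypercall.h in this series. */
#define HC_ID          0x80UL
#define HC_ID_GEN_BASE 0x0UL

/* Hypothetical helper: same arithmetic as the _HC_ID(x, y) macro. */
static inline uint64_t hc_id(uint64_t x, uint64_t y)
{
	return (x << 24) | y;
}

/* ID that hcall_get_api_version() passes to the hypervisor. */
static inline uint64_t hc_get_api_version_id(void)
{
	return hc_id(HC_ID, HC_ID_GEN_BASE + 0x00);
}
```

With these values, the ID for HC_GET_API_VERSION evaluates to 0x80000000.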

ACRN is focused on embedded systems, so it does not support some features,
e.g.:
- ACRN doesn't support VM migration.
- ACRN doesn't support vCPU migration.

This patch set adds the HSM to the Linux kernel.

The basic ACRN support has already been merged upstream:
https://lore.kernel.org/lkml/[email protected]/

Shuo Liu (16):
docs: acrn: Introduce ACRN
x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
x86/acrn: Introduce hypercall interfaces
virt: acrn: Introduce ACRN HSM basic driver
virt: acrn: Introduce VM management interfaces
virt: acrn: Introduce an ioctl to set vCPU registers state
virt: acrn: Introduce EPT mapping management
virt: acrn: Introduce I/O request management
virt: acrn: Introduce PCI configuration space PIO accesses combiner
virt: acrn: Introduce interfaces for PCI device passthrough
virt: acrn: Introduce interrupt injection interfaces
virt: acrn: Introduce interfaces to query C-states and P-states
allowed by hypervisor
virt: acrn: Introduce I/O ranges operation interfaces
virt: acrn: Introduce ioeventfd
virt: acrn: Introduce irqfd
virt: acrn: Introduce an interface for Service VM to control vCPU

Yin Fengwei (1):
x86/acrn: Introduce an API to check if a VM is privileged

.../userspace-api/ioctl/ioctl-number.rst | 1 +
Documentation/virt/acrn/index.rst | 11 +
Documentation/virt/acrn/introduction.rst | 40 ++
Documentation/virt/acrn/io-request.rst | 97 +++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 9 +
arch/x86/include/asm/acrn.h | 74 ++
arch/x86/kernel/cpu/acrn.c | 38 +-
drivers/virt/Kconfig | 2 +
drivers/virt/Makefile | 1 +
drivers/virt/acrn/Kconfig | 15 +
drivers/virt/acrn/Makefile | 3 +
drivers/virt/acrn/acrn_drv.h | 225 ++++++
drivers/virt/acrn/hsm.c | 439 ++++++++++++
drivers/virt/acrn/hypercall.h | 266 ++++++++
drivers/virt/acrn/ioeventfd.c | 275 ++++++++
drivers/virt/acrn/ioreq.c | 638 ++++++++++++++++++
drivers/virt/acrn/irqfd.c | 236 +++++++
drivers/virt/acrn/mm.c | 298 ++++++++
drivers/virt/acrn/vm.c | 120 ++++
include/uapi/linux/acrn.h | 499 ++++++++++++++
21 files changed, 3287 insertions(+), 1 deletion(-)
create mode 100644 Documentation/virt/acrn/index.rst
create mode 100644 Documentation/virt/acrn/introduction.rst
create mode 100644 Documentation/virt/acrn/io-request.rst
create mode 100644 arch/x86/include/asm/acrn.h
create mode 100644 drivers/virt/acrn/Kconfig
create mode 100644 drivers/virt/acrn/Makefile
create mode 100644 drivers/virt/acrn/acrn_drv.h
create mode 100644 drivers/virt/acrn/hsm.c
create mode 100644 drivers/virt/acrn/hypercall.h
create mode 100644 drivers/virt/acrn/ioeventfd.c
create mode 100644 drivers/virt/acrn/ioreq.c
create mode 100644 drivers/virt/acrn/irqfd.c
create mode 100644 drivers/virt/acrn/mm.c
create mode 100644 drivers/virt/acrn/vm.c
create mode 100644 include/uapi/linux/acrn.h


base-commit: 18445bf405cb331117bc98427b1ba6f12418ad17
--
2.28.0


2020-08-25 03:07:17

by Shuo Liu

Subject: [PATCH 05/17] virt: acrn: Introduce ACRN HSM basic driver

From: Shuo Liu <[email protected]>

ACRN Hypervisor Service Module (HSM) is a kernel module in the Service VM
which communicates with ACRN userspace through ioctls and talks to the ACRN
Hypervisor through hypercalls.

Add a basic HSM driver which allows Service VM userspace to communicate
with ACRN. Subsequent patches add more ioctls, guest VM memory mapping
caching, I/O request processing, ioeventfd and irqfd to this module. HSM
exports a char device interface (/dev/acrn_hsm) to userspace.
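
For illustration only, a userspace caller could query the API version through
the char device roughly as follows. This is a minimal sketch, not part of the
patch: the structure and ioctl number are copied locally from
include/uapi/linux/acrn.h as added here (with fixed-width userspace types),
and acrn_query_api_version() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Local copy of struct acrn_api_version from include/uapi/linux/acrn.h. */
struct acrn_api_version {
	uint32_t major_version;
	uint32_t minor_version;
} __attribute__((aligned(8)));

#define ACRN_IOCTL_TYPE 0xA2
#define ACRN_IOCTL_GET_API_VERSION \
	_IOR(ACRN_IOCTL_TYPE, 0, struct acrn_api_version)

/*
 * Hypothetical helper: open the HSM device node, query the API version and
 * close the node again. Returns 0 on success, -1 on any failure.
 */
static int acrn_query_api_version(const char *path,
				  struct acrn_api_version *v)
{
	int fd = open(path, O_RDWR);
	int ret;

	if (fd < 0)
		return -1;

	ret = ioctl(fd, ACRN_IOCTL_GET_API_VERSION, v);
	close(fd);
	return ret < 0 ? -1 : 0;
}
```

A caller would pass "/dev/acrn_hsm" as the path. Note that, per
acrn_dev_open() in this patch, every open allocates a 'struct acrn_vm', so a
pure version query like this one creates and frees such an object as a side
effect.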

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
.../userspace-api/ioctl/ioctl-number.rst | 1 +
MAINTAINERS | 2 +
drivers/virt/Kconfig | 2 +
drivers/virt/Makefile | 1 +
drivers/virt/acrn/Kconfig | 14 +++
drivers/virt/acrn/Makefile | 3 +
drivers/virt/acrn/acrn_drv.h | 21 ++++
drivers/virt/acrn/hsm.c | 115 ++++++++++++++++++
drivers/virt/acrn/hypercall.h | 30 +++++
include/uapi/linux/acrn.h | 33 +++++
10 files changed, 222 insertions(+)
create mode 100644 drivers/virt/acrn/Kconfig
create mode 100644 drivers/virt/acrn/Makefile
create mode 100644 drivers/virt/acrn/acrn_drv.h
create mode 100644 drivers/virt/acrn/hsm.c
create mode 100644 drivers/virt/acrn/hypercall.h
create mode 100644 include/uapi/linux/acrn.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 2a198838fca9..ac60efedb104 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -319,6 +319,7 @@ Code Seq# Include File Comments
0xA0 all linux/sdp/sdp.h Industrial Device Project
<mailto:[email protected]>
0xA1 0 linux/vtpm_proxy.h TPM Emulator Proxy Driver
+0xA2 all uapi/linux/acrn.h ACRN hypervisor
0xA3 80-8F Port ACL in development:
<mailto:[email protected]>
0xA3 90-9F linux/dtlk.h
diff --git a/MAINTAINERS b/MAINTAINERS
index e0fea5e464b4..d4c1ef303c2d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -442,6 +442,8 @@ L: [email protected]
S: Supported
W: https://projectacrn.org
F: Documentation/virt/acrn/
+F: drivers/virt/acrn/
+F: include/uapi/linux/acrn.h

AD1889 ALSA SOUND DRIVER
L: [email protected]
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index cbc1f25c79ab..d9484a2e9b46 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -32,4 +32,6 @@ config FSL_HV_MANAGER
partition shuts down.

source "drivers/virt/vboxguest/Kconfig"
+
+source "drivers/virt/acrn/Kconfig"
endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index fd331247c27a..f0491bbf0d4d 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -5,3 +5,4 @@

obj-$(CONFIG_FSL_HV_MANAGER) += fsl_hypervisor.o
obj-y += vboxguest/
+obj-$(CONFIG_ACRN_HSM) += acrn/
diff --git a/drivers/virt/acrn/Kconfig b/drivers/virt/acrn/Kconfig
new file mode 100644
index 000000000000..36c80378c30c
--- /dev/null
+++ b/drivers/virt/acrn/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+config ACRN_HSM
+ tristate "ACRN Hypervisor Service Module"
+ depends on ACRN_GUEST
+ help
+ ACRN Hypervisor Service Module (HSM) is a kernel module which
+ communicates with ACRN userspace through ioctls and talks to
+ the ACRN Hypervisor through hypercalls. HSM will only run in
+ a privileged management VM, called Service VM, to manage User
+ VMs and do I/O emulation. Not required for simply running
+ under ACRN as a User VM.
+
+ To compile as a module, choose M, the module will be called
+ acrn. If unsure, say N.
diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
new file mode 100644
index 000000000000..6920ed798aaf
--- /dev/null
+++ b/drivers/virt/acrn/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ACRN_HSM) := acrn.o
+acrn-y := hsm.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
new file mode 100644
index 000000000000..36f43d8d43d0
--- /dev/null
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ACRN_HSM_DRV_H
+#define __ACRN_HSM_DRV_H
+
+#include <linux/acrn.h>
+#include <linux/types.h>
+
+#include "hypercall.h"
+
+#define ACRN_INVALID_VMID (0xffffU)
+
+/**
+ * struct acrn_vm - Properties of ACRN User VM.
+ * @vmid: User VM ID
+ */
+struct acrn_vm {
+ u16 vmid;
+};
+
+#endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
new file mode 100644
index 000000000000..a08169f35c96
--- /dev/null
+++ b/drivers/virt/acrn/hsm.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN Hypervisor Service Module (HSM)
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ * Fengwei Yin <[email protected]>
+ * Yakui Zhao <[email protected]>
+ */
+
+#define pr_fmt(fmt) "acrn: " fmt
+
+#include <linux/miscdevice.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include <asm/acrn.h>
+#include <asm/hypervisor.h>
+
+#include "acrn_drv.h"
+
+static struct acrn_api_version api_version;
+
+/*
+ * When /dev/acrn_hsm is opened, a 'struct acrn_vm' object is created to
+ * represent a VM instance and continues to be associated with the opened file
+ * descriptor. All ioctl operations on this file descriptor will be targeted to
+ * the VM instance. Release of this file descriptor will destroy the object.
+ */
+static int acrn_dev_open(struct inode *inode, struct file *filp)
+{
+ struct acrn_vm *vm;
+
+ vm = kzalloc(sizeof(*vm), GFP_KERNEL);
+ if (!vm)
+ return -ENOMEM;
+
+ vm->vmid = ACRN_INVALID_VMID;
+ filp->private_data = vm;
+ return 0;
+}
+
+static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
+ unsigned long ioctl_param)
+{
+ if (cmd == ACRN_IOCTL_GET_API_VERSION) {
+ if (copy_to_user((void __user *)ioctl_param,
+ &api_version, sizeof(api_version)))
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int acrn_dev_release(struct inode *inode, struct file *filp)
+{
+ struct acrn_vm *vm = filp->private_data;
+
+ kfree(vm);
+ return 0;
+}
+
+static const struct file_operations acrn_fops = {
+ .owner = THIS_MODULE,
+ .open = acrn_dev_open,
+ .release = acrn_dev_release,
+ .unlocked_ioctl = acrn_dev_ioctl,
+};
+
+static struct miscdevice acrn_dev = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "acrn_hsm",
+ .fops = &acrn_fops,
+};
+
+static int __init hsm_init(void)
+{
+ int ret;
+
+ if (x86_hyper_type != X86_HYPER_ACRN)
+ return -ENODEV;
+
+ if (!acrn_is_privileged_vm())
+ return -EPERM;
+
+ ret = hcall_get_api_version(slow_virt_to_phys(&api_version));
+ if (ret < 0) {
+ pr_err("Failed to get API version from hypervisor!\n");
+ return ret;
+ }
+
+ pr_info("API version is %u.%u\n",
+ api_version.major_version, api_version.minor_version);
+
+ ret = misc_register(&acrn_dev);
+ if (ret) {
+ pr_err("Create misc dev failed!\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static void __exit hsm_exit(void)
+{
+ misc_deregister(&acrn_dev);
+}
+module_init(hsm_init);
+module_exit(hsm_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("ACRN Hypervisor Service Module (HSM)");
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
new file mode 100644
index 000000000000..3ad1b708e162
--- /dev/null
+++ b/drivers/virt/acrn/hypercall.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * ACRN HSM: hypercalls of ACRN Hypervisor
+ */
+#ifndef __ACRN_HSM_HYPERCALL_H
+#define __ACRN_HSM_HYPERCALL_H
+#include <asm/acrn.h>
+
+/*
+ * Hypercall IDs of the ACRN Hypervisor
+ */
+#define _HC_ID(x, y) (((x) << 24) | (y))
+
+#define HC_ID 0x80UL
+
+#define HC_ID_GEN_BASE 0x0UL
+#define HC_GET_API_VERSION _HC_ID(HC_ID, HC_ID_GEN_BASE + 0x00)
+
+/**
+ * hcall_get_api_version() - Get API version from hypervisor
+ * @api_version: Service VM GPA of version info
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_get_api_version(u64 api_version)
+{
+ return acrn_hypercall1(HC_GET_API_VERSION, api_version);
+}
+
+#endif /* __ACRN_HSM_HYPERCALL_H */
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
new file mode 100644
index 000000000000..c59488ad7252
--- /dev/null
+++ b/include/uapi/linux/acrn.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Userspace interface for /dev/acrn_hsm - ACRN Hypervisor Service Module
+ *
+ * This file can be used by applications that need to communicate with the HSM
+ * via the ioctl interface.
+ */
+
+#ifndef _UAPI_ACRN_H
+#define _UAPI_ACRN_H
+
+#include <linux/types.h>
+
+/**
+ * struct acrn_api_version - ACRN Hypervisor API version.
+ * @major_version: Major version of ACRN Hypervisor API.
+ * @minor_version: Minor version of ACRN Hypervisor API.
+ */
+struct acrn_api_version {
+ __u32 major_version;
+ __u32 minor_version;
+} __attribute__((aligned(8)));
+
+/* The ioctl type, documented in ioctl-number.rst */
+#define ACRN_IOCTL_TYPE 0xA2
+
+/*
+ * Common IOCTL IDs definition for ACRN userspace
+ */
+#define ACRN_IOCTL_GET_API_VERSION \
+ _IOR(ACRN_IOCTL_TYPE, 0, struct acrn_api_version)
+
+#endif /* _UAPI_ACRN_H */
--
2.28.0

2020-08-25 03:07:48

by Shuo Liu

Subject: [PATCH 01/17] docs: acrn: Introduce ACRN

From: Shuo Liu <[email protected]>

Add documentation on the following aspects of ACRN:

1) A brief introduction on the architecture of ACRN.
2) I/O request handling in ACRN.

To learn more about ACRN, see the ACRN project website
https://projectacrn.org, or the documentation page
https://projectacrn.github.io/.
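
As a reading aid for io-request.rst, the slot layout and state cycle it
describes (16 slots of 256 bytes in one 4-KByte page, indexed by vCPU ID, and
the FREE -> PENDING -> PROCESSING -> COMPLETE cycle) can be sketched in plain
C as below. All identifiers are hypothetical; the real constants and
structures are introduced by later patches and may differ.

```c
#define IOREQ_SLOT_SIZE  256u   /* bytes per I/O request slot */
#define IOREQ_SLOT_COUNT 16u    /* slots per shared page, indexed by vCPU ID */

/* Hypothetical state names; the real constants are not part of this patch. */
enum ioreq_state {
	IOREQ_FREE,
	IOREQ_PENDING,
	IOREQ_PROCESSING,
	IOREQ_COMPLETE,
};

/* Byte offset of a vCPU's slot inside the shared 4-KByte page, or -1. */
static inline long ioreq_slot_offset(unsigned int vcpu_id)
{
	if (vcpu_id >= IOREQ_SLOT_COUNT)
		return -1;
	return (long)(vcpu_id * IOREQ_SLOT_SIZE);
}

/* Next state in the cycle FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE. */
static inline enum ioreq_state ioreq_next_state(enum ioreq_state s)
{
	switch (s) {
	case IOREQ_FREE:       return IOREQ_PENDING;
	case IOREQ_PENDING:    return IOREQ_PROCESSING;
	case IOREQ_PROCESSING: return IOREQ_COMPLETE;
	default:               return IOREQ_FREE;
	}
}

/* Per the documentation, FREE and COMPLETE slots belong to the hypervisor. */
static inline int ioreq_owned_by_hypervisor(enum ioreq_state s)
{
	return s == IOREQ_FREE || s == IOREQ_COMPLETE;
}
```

The 16 slots of 256 bytes fill the 4096-byte page exactly, which is why a
vCPU ID of 16 or more has no slot in this sketch.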

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Zhi Wang <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Fengwei Yin <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
Documentation/virt/acrn/index.rst | 11 +++
Documentation/virt/acrn/introduction.rst | 40 ++++++++++
Documentation/virt/acrn/io-request.rst | 97 ++++++++++++++++++++++++
Documentation/virt/index.rst | 1 +
MAINTAINERS | 7 ++
5 files changed, 156 insertions(+)
create mode 100644 Documentation/virt/acrn/index.rst
create mode 100644 Documentation/virt/acrn/introduction.rst
create mode 100644 Documentation/virt/acrn/io-request.rst

diff --git a/Documentation/virt/acrn/index.rst b/Documentation/virt/acrn/index.rst
new file mode 100644
index 000000000000..e3cf99033bdb
--- /dev/null
+++ b/Documentation/virt/acrn/index.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+ACRN Hypervisor
+===============
+
+.. toctree::
+ :maxdepth: 1
+
+ introduction
+ io-request
diff --git a/Documentation/virt/acrn/introduction.rst b/Documentation/virt/acrn/introduction.rst
new file mode 100644
index 000000000000..6b44924d5c0e
--- /dev/null
+++ b/Documentation/virt/acrn/introduction.rst
@@ -0,0 +1,40 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+ACRN Hypervisor Introduction
+============================
+
+The ACRN Hypervisor is a Type 1 hypervisor, running directly on the bare-metal
+hardware. It has a privileged management VM, called Service VM, to manage User
+VMs and do I/O emulation.
+
+ACRN userspace is an application running in the Service VM that emulates
+devices for a User VM based on command line configurations. ACRN Hypervisor
+Service Module (HSM) is a kernel module in the Service VM which provides
+hypervisor services to the ACRN userspace.
+
+The figure below shows the architecture.
+
+::
+
+                Service VM                     User VM
+      +----------------------------+  |  +------------------+
+      |        +--------------+    |  |  |                  |
+      |        |ACRN userspace|    |  |  |                  |
+      |        +--------------+    |  |  |                  |
+      |-----------------ioctl------|  |  |                  |   ...
+      |kernel space   +----------+ |  |  |                  |
+      |               |   HSM    | |  |  |     Drivers      |
+      |               +----------+ |  |  |                  |
+      +--------------------|-------+  |  +------------------+
+  +---------------------hypercall----------------------------------------+
+  |                           ACRN Hypervisor                            |
+  +----------------------------------------------------------------------+
+  |                               Hardware                               |
+  +----------------------------------------------------------------------+
+
+ACRN userspace allocates memory for the User VM, configures and initializes the
+devices used by the User VM, loads the virtual bootloader, initializes the
+virtual CPU state and handles I/O request accesses from the User VM. It uses
+ioctls to communicate with the HSM. HSM implements hypervisor services by
+interacting with the ACRN Hypervisor via hypercalls. HSM exports a char device
+interface (/dev/acrn_hsm) to userspace.
diff --git a/Documentation/virt/acrn/io-request.rst b/Documentation/virt/acrn/io-request.rst
new file mode 100644
index 000000000000..019dc5978f7c
--- /dev/null
+++ b/Documentation/virt/acrn/io-request.rst
@@ -0,0 +1,97 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+I/O request handling
+====================
+
+An I/O request of a User VM, which is constructed by the hypervisor, is
+distributed by the ACRN Hypervisor Service Module to an I/O client
+corresponding to the address range of the I/O request. Details of I/O request
+handling are described in the following sections.
+
+1. I/O request
+--------------
+
+For each User VM, there is a shared 4-KByte memory region used for I/O request
+communication between the hypervisor and the Service VM. An I/O request is a
+256-byte structure, 'struct acrn_io_request', that is filled in by an I/O
+handler of the hypervisor when a trapped I/O access happens in a User VM. ACRN
+userspace in the Service VM first allocates a 4-KByte page and passes the GPA
+(Guest Physical Address) of the buffer to the hypervisor. The buffer is used
+as an array of 16 I/O request slots, each 256 bytes in size. This array is
+indexed by vCPU ID.
+
+2. I/O clients
+--------------
+
+An I/O client is responsible for handling User VM I/O requests whose accessed
+GPA falls in a certain range. Multiple I/O clients can be associated with each
+User VM. There is a special client associated with each User VM, called the
+default client, that handles all I/O requests that do not fit into the range of
+any other clients. The ACRN userspace acts as the default client for each User
+VM.
+
+The illustration below shows the relationship between the I/O request shared
+buffer, I/O requests and I/O clients.
+
+::
+
+  +------------------------------------------------------+
+  |                                       Service VM     |
+  |+--------------------------------------------------+  |
+  ||      +----------------------------------------+  |  |
+  ||      | shared page            ACRN userspace  |  |  |
+  ||      |    +-----------------+  +------------+ |  |  |
+  ||   +----+->| acrn_io_request |<-+  default   | |  |  |
+  ||   |  | |  +-----------------+  | I/O client | |  |  |
+  ||   |  | |  |       ...       |  +------------+ |  |  |
+  ||   |  | |  +-----------------+                 |  |  |
+  ||   |  +-|--------------------------------------+  |  |
+  ||---|----|-----------------------------------------|  |
+  ||   |    |                           kernel        |  |
+  ||   |    |            +----------------------+     |  |
+  ||   |    |            | +-------------+  HSM |     |  |
+  ||   |    +--------------+             |      |     |  |
+  ||   |                 | | I/O clients |      |     |  |
+  ||   |                 | |             |      |     |  |
+  ||   |                 | +-------------+      |     |  |
+  ||   |                 +----------------------+     |  |
+  |+---|----------------------------------------------+  |
+  +----|-------------------------------------------------+
+       |
+  +----|-------------------------------------------------+
+  |  +-+-----------+                                     |
+  |  | I/O handler |              ACRN Hypervisor        |
+  |  +-------------+                                     |
+  +------------------------------------------------------+
+
+3. I/O request state transition
+-------------------------------
+
+The state transitions of an ACRN I/O request are as follows.
+
+::
+
+ FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ...
+
+- FREE: this I/O request slot is empty
+- PENDING: a valid I/O request is pending in this slot
+- PROCESSING: the I/O request is being processed
+- COMPLETE: the I/O request has been processed
+
+An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and
+ACRN userspace are in charge of processing the others.
+
+4. Processing flow of I/O requests
+----------------------------------
+
+a. The I/O handler of the hypervisor will fill an I/O request with PENDING
+   state when a trapped I/O access happens in a User VM.
+b. The hypervisor makes an upcall, which is a notification interrupt, to
+   the Service VM.
+c. The upcall handler schedules a tasklet to dispatch I/O requests.
+d. The tasklet looks for the PENDING I/O requests, assigns them to different
+   registered clients based on the address of the I/O accesses, updates
+   their state to PROCESSING, and notifies the corresponding clients.
+e. The notified client handles the assigned I/O requests.
+f. The HSM updates I/O request states to COMPLETE and notifies the hypervisor
+   of the completion via hypercalls.
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index de1ab81df958..c10b519507f5 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -11,6 +11,7 @@ Linux Virtualization Support
uml/user_mode_linux
paravirt_ops
guest-halt-polling
+ acrn/index

.. only:: html and subproject

diff --git a/MAINTAINERS b/MAINTAINERS
index deaafb617361..e0fea5e464b4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -436,6 +436,13 @@ S: Orphan
F: drivers/platform/x86/wmi.c
F: include/uapi/linux/wmi.h

+ACRN HYPERVISOR SERVICE MODULE
+M: Shuo Liu <[email protected]>
+L: [email protected]
+S: Supported
+W: https://projectacrn.org
+F: Documentation/virt/acrn/
+
AD1889 ALSA SOUND DRIVER
L: [email protected]
S: Maintained
--
2.28.0

2020-08-25 03:09:26

by Shuo Liu

Subject: [PATCH 07/17] virt: acrn: Introduce an ioctl to set vCPU registers state

From: Shuo Liu <[email protected]>

Each virtual CPU of a User VM has its own context, defined by its register
state. ACRN userspace needs to set the virtual CPU register state (e.g. to
give an initial register state to the virtual BSP of a User VM).

HSM provides the ACRN_IOCTL_SET_VCPU_REGS ioctl to set the virtual CPU
register state. The ioctl passes the register state from ACRN userspace to
the hypervisor directly.
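
Because the structure is handed to the hypervisor as raw memory, its layout
matters. As a small sketch (userspace C, with fixed-width types standing in
for the kernel's __u16/__u64), the checks below show what the packed attribute
on struct acrn_descriptor_ptr from this patch does: 'base' sits directly after
the 2-byte 'limit' and the structure is 16 bytes in total. The unpacked
variant is hypothetical and exists only for comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Same members as struct acrn_descriptor_ptr in include/uapi/linux/acrn.h. */
struct acrn_descriptor_ptr {
	uint16_t limit;
	uint64_t base;
	uint16_t reserved[3];
} __attribute__((__packed__));

/* Hypothetical variant without the packed attribute, for comparison only. */
struct descriptor_ptr_unpacked {
	uint16_t limit;
	uint64_t base;
	uint16_t reserved[3];
};

/* Packed layout: limit at offset 0, base at 2, reserved at 10. */
_Static_assert(offsetof(struct acrn_descriptor_ptr, base) == 2,
	       "base follows limit directly");
_Static_assert(sizeof(struct acrn_descriptor_ptr) == 16,
	       "2 + 8 + 3 * 2 bytes");
```

Without the attribute, a compiler inserts padding before 'base' to satisfy
its natural alignment, so the two layouts would no longer agree.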

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Zhi Wang <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
drivers/virt/acrn/hsm.c | 13 +++++++
drivers/virt/acrn/hypercall.h | 13 +++++++
include/uapi/linux/acrn.h | 71 +++++++++++++++++++++++++++++++++++
3 files changed, 97 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index ed8921a6c68b..31dec2f1aa12 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -11,6 +11,7 @@

#define pr_fmt(fmt) "acrn: " fmt

+#include <linux/io.h>
#include <linux/miscdevice.h>
#include <linux/mm.h>
#include <linux/module.h>
@@ -47,6 +48,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
{
struct acrn_vm *vm = filp->private_data;
struct acrn_vm_creation *vm_param;
+ struct acrn_vcpu_regs *cpu_regs;
int ret = 0;

if (cmd == ACRN_IOCTL_GET_API_VERSION) {
@@ -101,6 +103,17 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
case ACRN_IOCTL_DESTROY_VM:
ret = acrn_vm_destroy(vm);
break;
+ case ACRN_IOCTL_SET_VCPU_REGS:
+ cpu_regs = memdup_user((void __user *)ioctl_param,
+ sizeof(struct acrn_vcpu_regs));
+ if (IS_ERR(cpu_regs))
+ return PTR_ERR(cpu_regs);
+
+ ret = hcall_set_vcpu_regs(vm->vmid, virt_to_phys(cpu_regs));
+ if (ret < 0)
+ pr_err("Failed to set regs state of VM%u!\n", vm->vmid);
+ kfree(cpu_regs);
+ break;
default:
pr_warn("Unknown IOCTL 0x%x!\n", cmd);
ret = -EINVAL;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index 6429e7a06e7e..5cc975db38d9 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -22,6 +22,7 @@
#define HC_START_VM _HC_ID(HC_ID, HC_ID_VM_BASE + 0x02)
#define HC_PAUSE_VM _HC_ID(HC_ID, HC_ID_VM_BASE + 0x03)
#define HC_RESET_VM _HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
+#define HC_SET_VCPU_REGS _HC_ID(HC_ID, HC_ID_VM_BASE + 0x06)

/**
* hcall_get_api_version() - Get API version from hypervisor
@@ -89,4 +90,16 @@ static inline long hcall_reset_vm(u64 vmid)
return acrn_hypercall1(HC_RESET_VM, vmid);
}

+/**
+ * hcall_set_vcpu_regs() - Set up registers of virtual CPU of a User VM
+ * @vmid: User VM ID
+ * @regs_state: Service VM GPA of registers state
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_set_vcpu_regs(u64 vmid, u64 regs_state)
+{
+ return acrn_hypercall2(HC_SET_VCPU_REGS, vmid, regs_state);
+}
+
#endif /* __ACRN_HSM_HYPERCALL_H */
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index f8f00b18cd46..392d59a46499 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -46,6 +46,75 @@ struct acrn_vm_creation {
__u8 reserved2[8];
} __attribute__((aligned(8)));

+struct acrn_gp_regs {
+ __u64 rax;
+ __u64 rcx;
+ __u64 rdx;
+ __u64 rbx;
+ __u64 rsp;
+ __u64 rbp;
+ __u64 rsi;
+ __u64 rdi;
+ __u64 r8;
+ __u64 r9;
+ __u64 r10;
+ __u64 r11;
+ __u64 r12;
+ __u64 r13;
+ __u64 r14;
+ __u64 r15;
+};
+
+struct acrn_descriptor_ptr {
+ __u16 limit;
+ __u64 base;
+ __u16 reserved[3];
+} __attribute__ ((__packed__));
+
+struct acrn_regs {
+ struct acrn_gp_regs gprs;
+ struct acrn_descriptor_ptr gdt;
+ struct acrn_descriptor_ptr idt;
+
+ __u64 rip;
+ __u64 cs_base;
+ __u64 cr0;
+ __u64 cr4;
+ __u64 cr3;
+ __u64 ia32_efer;
+ __u64 rflags;
+ __u64 reserved_64[4];
+
+ __u32 cs_ar;
+ __u32 cs_limit;
+ __u32 reserved_32[3];
+
+ __u16 cs_sel;
+ __u16 ss_sel;
+ __u16 ds_sel;
+ __u16 es_sel;
+ __u16 fs_sel;
+ __u16 gs_sel;
+ __u16 ldt_sel;
+ __u16 tr_sel;
+
+ __u16 reserved_16[4];
+};
+
+/**
+ * struct acrn_vcpu_regs - Info of vCPU registers state
+ * @vcpu_id: vCPU ID
+ * @reserved0: Reserved
+ * @vcpu_regs: vCPU registers state
+ *
+ * This structure will be passed to hypervisor directly.
+ */
+struct acrn_vcpu_regs {
+ __u16 vcpu_id;
+ __u16 reserved0[3];
+ struct acrn_regs vcpu_regs;
+} __attribute__((aligned(8)));
+
/* The ioctl type, documented in ioctl-number.rst */
#define ACRN_IOCTL_TYPE 0xA2

@@ -65,5 +134,7 @@ struct acrn_vm_creation {
_IO(ACRN_IOCTL_TYPE, 0x13)
#define ACRN_IOCTL_RESET_VM \
_IO(ACRN_IOCTL_TYPE, 0x15)
+#define ACRN_IOCTL_SET_VCPU_REGS \
+ _IOW(ACRN_IOCTL_TYPE, 0x16, struct acrn_vcpu_regs)

#endif /* _UAPI_ACRN_H */
--
2.28.0

2020-08-25 03:22:29

by Shuo Liu

Subject: [PATCH 14/17] virt: acrn: Introduce I/O ranges operation interfaces

From: Shuo Liu <[email protected]>

An I/O request of a User VM, which is constructed by the hypervisor, is
distributed by the ACRN Hypervisor Service Module to an I/O client
corresponding to the address range of the I/O request.

An I/O client maintains a list of address ranges. Introduce
acrn_ioreq_range_{add,del}() to manage these address ranges.
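
The bookkeeping can be pictured with the userspace sketch below. It is an
illustration under simplifying assumptions, not the kernel code: it uses a
plain singly linked list instead of 'struct list_head', omits the range_lock
locking, and all names (io_client, range_add, range_del, range_match) are
hypothetical. range_match() stands in for the lookup a dispatcher would do
and is not part of this patch. The sketch returns -EINVAL for an inverted
range.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct io_range {
	uint32_t type;
	uint64_t start;
	uint64_t end;
	struct io_range *next;
};

struct io_client {
	struct io_range *ranges;   /* singly linked list of monitored ranges */
};

/* Add [start, end] to the client; rejects inverted ranges. */
static int range_add(struct io_client *c, uint32_t type,
		     uint64_t start, uint64_t end)
{
	struct io_range *r;

	if (end < start)
		return -EINVAL;

	r = calloc(1, sizeof(*r));
	if (!r)
		return -ENOMEM;

	r->type = type;
	r->start = start;
	r->end = end;
	r->next = c->ranges;
	c->ranges = r;
	return 0;
}

/* Remove the first range that matches type, start and end exactly. */
static void range_del(struct io_client *c, uint32_t type,
		      uint64_t start, uint64_t end)
{
	struct io_range **pp;

	for (pp = &c->ranges; *pp; pp = &(*pp)->next) {
		struct io_range *r = *pp;

		if (r->type == type && r->start == start && r->end == end) {
			*pp = r->next;
			free(r);
			return;
		}
	}
}

/* Return 1 if an access of the given type at addr falls in any range. */
static int range_match(const struct io_client *c, uint32_t type, uint64_t addr)
{
	const struct io_range *r;

	for (r = c->ranges; r; r = r->next)
		if (r->type == type && addr >= r->start && addr <= r->end)
			return 1;
	return 0;
}
```

As in the patch, deletion requires an exact match on type, start and end, so
a caller has to remove a range with the same bounds it was added with.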

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
drivers/virt/acrn/acrn_drv.h | 4 +++
drivers/virt/acrn/ioreq.c | 59 ++++++++++++++++++++++++++++++++++++
2 files changed, 63 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index c08235ba21fc..05836dcefbd6 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -195,6 +195,10 @@ struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
void *data, bool is_default,
const char *name);
void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client);
+int acrn_ioreq_range_add(struct acrn_ioreq_client *client,
+ u32 type, u64 start, u64 end);
+void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
+ u32 type, u64 start, u64 end);

int acrn_msi_inject(u16 vmid, u64 msi_addr, u64 msi_data);

diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index 7e312b8e5edd..48ce6955699e 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -102,6 +102,65 @@ int acrn_ioreq_request_default_complete(struct acrn_vm *vm, u16 vcpu)
return ret;
}

+/**
+ * acrn_ioreq_range_add() - Add an iorange monitored by an ioreq client
+ * @client: The ioreq client
+ * @type: Type (ACRN_IOREQ_TYPE_MMIO or ACRN_IOREQ_TYPE_PORTIO)
+ * @start: Start address of iorange
+ * @end: End address of iorange
+ *
+ * Return: 0 on success, <0 on error
+ */
+int acrn_ioreq_range_add(struct acrn_ioreq_client *client,
+ u32 type, u64 start, u64 end)
+{
+ struct acrn_ioreq_range *range;
+
+ if (end < start) {
+ pr_err("Invalid IO range [0x%llx,0x%llx]\n", start, end);
+ return -EINVAL;
+ }
+
+ range = kzalloc(sizeof(*range), GFP_KERNEL);
+ if (!range)
+ return -ENOMEM;
+
+ range->type = type;
+ range->start = start;
+ range->end = end;
+
+ write_lock_bh(&client->range_lock);
+ list_add(&range->list, &client->range_list);
+ write_unlock_bh(&client->range_lock);
+
+ return 0;
+}
+
+/**
+ * acrn_ioreq_range_del() - Del an iorange monitored by an ioreq client
+ * @client: The ioreq client
+ * @type: Type (ACRN_IOREQ_TYPE_MMIO or ACRN_IOREQ_TYPE_PORTIO)
+ * @start: Start address of iorange
+ * @end: End address of iorange
+ */
+void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
+ u32 type, u64 start, u64 end)
+{
+ struct acrn_ioreq_range *range;
+
+ write_lock_bh(&client->range_lock);
+ list_for_each_entry(range, &client->range_list, list) {
+ if (type == range->type &&
+ start == range->start &&
+ end == range->end) {
+ list_del(&range->list);
+ kfree(range);
+ break;
+ }
+ }
+ write_unlock_bh(&client->range_lock);
+}
+
/*
* ioreq_task() is the execution entity of handler thread of an I/O client.
* The handler callback of the I/O client is called within the handler thread.
--
2.28.0

2020-08-25 03:22:29

by Shuo Liu

Subject: [PATCH 16/17] virt: acrn: Introduce irqfd

From: Shuo Liu <[email protected]>

irqfd is a mechanism to inject a specific interrupt to a User VM using a
decoupled eventfd mechanism.

Vhost is a kernel-level virtio server which uses eventfd for interrupt
injection. To support vhost on ACRN, irqfd is introduced in HSM.

HSM provides ioctls to associate a virtual Message Signaled Interrupt
(MSI) with an eventfd. The corresponding virtual MSI will be injected
into a User VM once the eventfd is signaled.
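
The eventfd half of this mechanism can be tried out from userspace. The
sketch below is a simplified model under stated assumptions: it polls the
eventfd instead of being woken through a kernel wait queue, and the injection
step is a placeholder that only counts calls. All identifiers ending in
_sketch are hypothetical and do not appear in the patch.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for an irqfd: an eventfd bound to one virtual MSI. */
struct irqfd_sketch {
	int eventfd;
	uint64_t msi_addr;
	uint64_t msi_data;
	unsigned int injected;   /* number of simulated injections */
};

/* Placeholder for the real injection path; here it only counts calls. */
static void inject_msi_sketch(struct irqfd_sketch *irqfd)
{
	irqfd->injected++;
}

/* Create a non-blocking eventfd and bind it to the given MSI. */
static int irqfd_sketch_init(struct irqfd_sketch *irqfd,
			     uint64_t msi_addr, uint64_t msi_data)
{
	irqfd->eventfd = eventfd(0, EFD_NONBLOCK);
	irqfd->msi_addr = msi_addr;
	irqfd->msi_data = msi_data;
	irqfd->injected = 0;
	return irqfd->eventfd < 0 ? -1 : 0;
}

/* Producer side: signal the eventfd, as a vhost-like backend would. */
static int irqfd_sketch_signal(struct irqfd_sketch *irqfd)
{
	uint64_t one = 1;
	ssize_t n = write(irqfd->eventfd, &one, sizeof(one));

	return n == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consumer side: if the eventfd was signaled, drain it and inject once. */
static int irqfd_sketch_poll(struct irqfd_sketch *irqfd)
{
	uint64_t count;
	ssize_t n = read(irqfd->eventfd, &count, sizeof(count));

	if (n != (ssize_t)sizeof(count))
		return 0;   /* not signaled (EAGAIN) or read error */

	inject_msi_sketch(irqfd);
	return 1;
}
```

Several signals that arrive before the consumer runs are coalesced by the
eventfd counter, so this model performs a single injection for them.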

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Zhi Wang <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
drivers/virt/acrn/Makefile | 2 +-
drivers/virt/acrn/acrn_drv.h | 10 ++
drivers/virt/acrn/hsm.c | 7 ++
drivers/virt/acrn/irqfd.c | 236 +++++++++++++++++++++++++++++++++++
drivers/virt/acrn/vm.c | 3 +
include/uapi/linux/acrn.h | 15 +++
6 files changed, 272 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/acrn/irqfd.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 755b583b32ca..08ce641dcfa1 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o
+acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o irqfd.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index e36e8c94139b..5d8f151cf9ba 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -157,6 +157,9 @@ extern rwlock_t acrn_vm_list_lock;
* @ioeventfds_lock: Lock to protect ioeventfds list
* @ioeventfds: List to link all hsm_ioeventfd
* @ioeventfd_client: I/O client for ioeventfds of the VM
+ * @irqfds_lock: Lock to protect irqfds list
+ * @irqfds: List to link all hsm_irqfd
+ * @irqfd_wq: Workqueue for irqfd async shutdown
*/
struct acrn_vm {
struct list_head list;
@@ -176,6 +179,9 @@ struct acrn_vm {
struct mutex ioeventfds_lock;
struct list_head ioeventfds;
struct acrn_ioreq_client *ioeventfd_client;
+ struct mutex irqfds_lock;
+ struct list_head irqfds;
+ struct workqueue_struct *irqfd_wq;
};

struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -212,4 +218,8 @@ int acrn_ioeventfd_init(struct acrn_vm *vm);
int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args);
void acrn_ioeventfd_deinit(struct acrn_vm *vm);

+int acrn_irqfd_init(struct acrn_vm *vm);
+int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args);
+void acrn_irqfd_deinit(struct acrn_vm *vm);
+
#endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 9f990929242c..81300ea19dc9 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -116,6 +116,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
struct acrn_vm_memmap memmap;
struct acrn_msi_entry *msi;
struct acrn_pcidev *pcidev;
+ struct acrn_irqfd irqfd;
struct page *page;
u64 cstate_cmd;
int ret = 0;
@@ -311,6 +312,12 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,

ret = acrn_ioeventfd_config(vm, &ioeventfd);
break;
+ case ACRN_IOCTL_IRQFD:
+ if (copy_from_user(&irqfd, (void __user *)ioctl_param,
+ sizeof(irqfd)))
+ return -EFAULT;
+ ret = acrn_irqfd_config(vm, &irqfd);
+ break;
default:
pr_warn("Unknown IOCTL 0x%x!\n", cmd);
ret = -EINVAL;
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
new file mode 100644
index 000000000000..67380c79f167
--- /dev/null
+++ b/drivers/virt/acrn/irqfd.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN HSM irqfd: use eventfd objects to inject virtual interrupts
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ * Shuo Liu <[email protected]>
+ * Yakui Zhao <[email protected]>
+ */
+#define pr_fmt(fmt) "acrn: " fmt
+
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/poll.h>
+#include <linux/slab.h>
+
+#include "acrn_drv.h"
+
+static LIST_HEAD(acrn_irqfd_clients);
+static DEFINE_MUTEX(acrn_irqfds_mutex);
+
+/**
+ * struct hsm_irqfd - Properties of HSM irqfd
+ * @vm: Associated VM pointer
+ * @wait: Entry of wait-queue
+ * @shutdown: Async shutdown work
+ * @eventfd: Associated eventfd
+ * @list: Entry within &acrn_vm.irqfds of irqfds of a VM
+ * @pt: Structure for select/poll on the associated eventfd
+ * @msi: MSI data
+ */
+struct hsm_irqfd {
+ struct acrn_vm *vm;
+ wait_queue_entry_t wait;
+ struct work_struct shutdown;
+ struct eventfd_ctx *eventfd;
+ struct list_head list;
+ poll_table pt;
+ struct acrn_msi_entry msi;
+};
+
+static void acrn_irqfd_inject(struct hsm_irqfd *irqfd)
+{
+ struct acrn_vm *vm = irqfd->vm;
+
+ acrn_msi_inject(vm->vmid, irqfd->msi.msi_addr,
+ irqfd->msi.msi_data);
+}
+
+static void hsm_irqfd_shutdown(struct hsm_irqfd *irqfd)
+{
+ u64 cnt;
+
+ lockdep_assert_held(&irqfd->vm->irqfds_lock);
+
+ /* remove from wait queue */
+ list_del_init(&irqfd->list);
+ eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt);
+ eventfd_ctx_put(irqfd->eventfd);
+ kfree(irqfd);
+}
+
+static void hsm_irqfd_shutdown_work(struct work_struct *work)
+{
+ struct hsm_irqfd *irqfd;
+ struct acrn_vm *vm;
+
+ irqfd = container_of(work, struct hsm_irqfd, shutdown);
+ vm = irqfd->vm;
+ mutex_lock(&vm->irqfds_lock);
+ if (!list_empty(&irqfd->list))
+ hsm_irqfd_shutdown(irqfd);
+ mutex_unlock(&vm->irqfds_lock);
+}
+
+/* Called with wqh->lock held and interrupts disabled */
+static int hsm_irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode,
+ int sync, void *key)
+{
+ unsigned long poll_bits = (unsigned long)key;
+ struct hsm_irqfd *irqfd;
+ struct acrn_vm *vm;
+
+ irqfd = container_of(wait, struct hsm_irqfd, wait);
+ vm = irqfd->vm;
+ if (poll_bits & POLLIN)
+ /* An event has been signaled, inject an interrupt */
+ acrn_irqfd_inject(irqfd);
+
+ if (poll_bits & POLLHUP)
+ /* Defer shutdown to a worker, which can take wqh->lock */
+ queue_work(vm->irqfd_wq, &irqfd->shutdown);
+
+ return 0;
+}
+
+static void hsm_irqfd_poll_func(struct file *file, wait_queue_head_t *wqh,
+ poll_table *pt)
+{
+ struct hsm_irqfd *irqfd;
+
+ irqfd = container_of(pt, struct hsm_irqfd, pt);
+ add_wait_queue(wqh, &irqfd->wait);
+}
+
+/*
+ * Assign an eventfd to a VM and create a HSM irqfd associated with the
+ * eventfd. The properties of the HSM irqfd are built from a &struct
+ * acrn_irqfd.
+ */
+static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
+{
+ struct eventfd_ctx *eventfd = NULL;
+ struct hsm_irqfd *irqfd, *tmp;
+ unsigned int events;
+ struct fd f;
+ int ret = 0;
+
+ irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+ if (!irqfd)
+ return -ENOMEM;
+
+ irqfd->vm = vm;
+ memcpy(&irqfd->msi, &args->msi, sizeof(args->msi));
+ INIT_LIST_HEAD(&irqfd->list);
+ INIT_WORK(&irqfd->shutdown, hsm_irqfd_shutdown_work);
+
+ f = fdget(args->fd);
+ if (!f.file) {
+ ret = -EBADF;
+ goto out;
+ }
+
+ eventfd = eventfd_ctx_fileget(f.file);
+ if (IS_ERR(eventfd)) {
+ ret = PTR_ERR(eventfd);
+ goto fail;
+ }
+
+ irqfd->eventfd = eventfd;
+
+ /*
+ * Install custom wake-up handling to be notified whenever underlying
+ * eventfd is signaled.
+ */
+ init_waitqueue_func_entry(&irqfd->wait, hsm_irqfd_wakeup);
+ init_poll_funcptr(&irqfd->pt, hsm_irqfd_poll_func);
+
+ mutex_lock(&vm->irqfds_lock);
+ list_for_each_entry(tmp, &vm->irqfds, list) {
+ if (irqfd->eventfd != tmp->eventfd)
+ continue;
+ ret = -EBUSY;
+ mutex_unlock(&vm->irqfds_lock);
+ goto fail;
+ }
+ list_add_tail(&irqfd->list, &vm->irqfds);
+ mutex_unlock(&vm->irqfds_lock);
+
+ /* Check the pending event in this stage */
+ events = f.file->f_op->poll(f.file, &irqfd->pt);
+
+ if (events & POLLIN)
+ acrn_irqfd_inject(irqfd);
+
+ fdput(f);
+ return 0;
+fail:
+ if (eventfd && !IS_ERR(eventfd))
+ eventfd_ctx_put(eventfd);
+
+ fdput(f);
+out:
+ kfree(irqfd);
+ return ret;
+}
+
+static int acrn_irqfd_deassign(struct acrn_vm *vm,
+ struct acrn_irqfd *args)
+{
+ struct hsm_irqfd *irqfd, *tmp;
+ struct eventfd_ctx *eventfd;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(eventfd);
+
+ mutex_lock(&vm->irqfds_lock);
+ list_for_each_entry_safe(irqfd, tmp, &vm->irqfds, list) {
+ if (irqfd->eventfd == eventfd) {
+ hsm_irqfd_shutdown(irqfd);
+ break;
+ }
+ }
+ mutex_unlock(&vm->irqfds_lock);
+ eventfd_ctx_put(eventfd);
+
+ return 0;
+}
+
+int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args)
+{
+ int ret;
+
+ if (args->flags & ACRN_IRQFD_FLAG_DEASSIGN)
+ ret = acrn_irqfd_deassign(vm, args);
+ else
+ ret = acrn_irqfd_assign(vm, args);
+
+ return ret;
+}
+
+int acrn_irqfd_init(struct acrn_vm *vm)
+{
+ INIT_LIST_HEAD(&vm->irqfds);
+ mutex_init(&vm->irqfds_lock);
+ vm->irqfd_wq = alloc_workqueue("acrn_irqfd-%u", 0, 0, vm->vmid);
+ if (!vm->irqfd_wq)
+ return -ENOMEM;
+
+ pr_debug("VM %u irqfd init.\n", vm->vmid);
+ return 0;
+}
+
+void acrn_irqfd_deinit(struct acrn_vm *vm)
+{
+ struct hsm_irqfd *irqfd, *next;
+
+ pr_debug("VM %u irqfd deinit.\n", vm->vmid);
+ destroy_workqueue(vm->irqfd_wq);
+ mutex_lock(&vm->irqfds_lock);
+ list_for_each_entry_safe(irqfd, next, &vm->irqfds, list)
+ hsm_irqfd_shutdown(irqfd);
+ mutex_unlock(&vm->irqfds_lock);
+}
diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
index 1a9456794663..f2b80685d82e 100644
--- a/drivers/virt/acrn/vm.c
+++ b/drivers/virt/acrn/vm.c
@@ -47,6 +47,7 @@ struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
write_unlock_bh(&acrn_vm_list_lock);

acrn_ioeventfd_init(vm);
+ acrn_irqfd_init(vm);
pr_debug("VM %u created.\n", vm->vmid);
return vm;
}
@@ -65,7 +66,9 @@ int acrn_vm_destroy(struct acrn_vm *vm)
write_unlock_bh(&acrn_vm_list_lock);

acrn_ioeventfd_deinit(vm);
+ acrn_irqfd_deinit(vm);
acrn_ioreq_deinit(vm);
+
if (vm->monitor_page) {
put_page(vm->monitor_page);
vm->monitor_page = NULL;
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index e2d5c657f8e2..322fbcdc25ac 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -421,6 +421,19 @@ struct acrn_ioeventfd {
__u64 data;
};

+#define ACRN_IRQFD_FLAG_DEASSIGN 0x01
+/**
+ * struct acrn_irqfd - Data to operate a &struct hsm_irqfd
+ * @fd: The fd of eventfd associated with a hsm_irqfd
+ * @flags: Logical-OR of ACRN_IRQFD_FLAG_*
+ * @msi: Info of MSI associated with the irqfd
+ */
+struct acrn_irqfd {
+ __s32 fd;
+ __u32 flags;
+ struct acrn_msi_entry msi;
+};
+
/* The ioctl type, documented in ioctl-number.rst */
#define ACRN_IOCTL_TYPE 0xA2

@@ -480,5 +493,7 @@ struct acrn_ioeventfd {

#define ACRN_IOCTL_IOEVENTFD \
_IOW(ACRN_IOCTL_TYPE, 0x70, struct acrn_ioeventfd)
+#define ACRN_IOCTL_IRQFD \
+ _IOW(ACRN_IOCTL_TYPE, 0x71, struct acrn_irqfd)

#endif /* _UAPI_ACRN_H */
--
2.28.0

2020-08-25 03:22:29

by Shuo Liu

Subject: [PATCH 15/17] virt: acrn: Introduce ioeventfd

From: Shuo Liu <[email protected]>

ioeventfd is a mechanism to register PIO/MMIO regions to trigger an
eventfd signal when written to by a User VM. ACRN userspace can register
any arbitrary I/O address with a corresponding eventfd and then pass the
eventfd to a specific end-point of interest for handling.

Vhost is a kernel-level virtio server which uses eventfd for signalling.
To support vhost on ACRN, ioeventfd is introduced in HSM.

A new I/O client dedicated to ioeventfd is associated with a User VM
during VM creation. HSM provides ioctls to associate an I/O region with
an eventfd. The I/O client signals an eventfd once its corresponding I/O
region is matched with an I/O request.
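
The matching rule can be modelled in plain C. This is a minimal sketch
that mirrors the comparison in hsm_ioeventfd_match() from this patch (same
type, same start address, access no wider than the registered length, and
either wildcard or equal data). The model_* names and the signals counter,
which stands in for eventfd_signal(), are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ACRN_IOREQ_TYPE_PORTIO / ACRN_IOREQ_TYPE_MMIO. */
enum model_io_type { MODEL_PORTIO, MODEL_MMIO };

struct model_ioeventfd {
	uint64_t addr;		/* start of the registered I/O range */
	uint64_t data;		/* value to match when !wildcard */
	int length;		/* registered access width in bytes */
	enum model_io_type type;
	bool wildcard;		/* true: any written value matches */
	unsigned int signals;	/* stands in for the eventfd counter */
};

/* Return the first entry matching a guest write, or NULL. */
static struct model_ioeventfd *model_match(struct model_ioeventfd *tbl,
					    size_t n, enum model_io_type type,
					    uint64_t addr, int len,
					    uint64_t data)
{
	for (size_t i = 0; i < n; i++) {
		struct model_ioeventfd *p = &tbl[i];

		if (p->type == type && p->addr == addr && p->length >= len &&
		    (p->wildcard || p->data == data))
			return p;
	}
	return NULL;
}

/* Model a guest write: signal the matching entry, if any. */
static bool model_guest_write(struct model_ioeventfd *tbl, size_t n,
			      enum model_io_type type, uint64_t addr,
			      int len, uint64_t data)
{
	struct model_ioeventfd *p = model_match(tbl, n, type, addr, len, data);

	if (!p)
		return false;
	p->signals++;
	return true;
}

static int model_ioeventfd_selfcheck(void)
{
	struct model_ioeventfd tbl[] = {
		{ .addr = 0x1000, .length = 4, .type = MODEL_MMIO,
		  .wildcard = true },
		{ .addr = 0x2000, .length = 2, .type = MODEL_MMIO, .data = 7 },
	};

	/* Wildcard entry: any value written to 0x1000 signals. */
	assert(model_guest_write(tbl, 2, MODEL_MMIO, 0x1000, 4, 0xabcd));
	/* Data-match entry: only the registered value signals. */
	assert(!model_guest_write(tbl, 2, MODEL_MMIO, 0x2000, 2, 8));
	assert(model_guest_write(tbl, 2, MODEL_MMIO, 0x2000, 2, 7));
	/* A different I/O type never matches. */
	assert(!model_guest_write(tbl, 2, MODEL_PORTIO, 0x1000, 4, 0));
	return (tbl[0].signals == 1 && tbl[1].signals == 1) ? 0 : -1;
}
```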

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Zhi Wang <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
drivers/virt/acrn/Kconfig | 1 +
drivers/virt/acrn/Makefile | 2 +-
drivers/virt/acrn/acrn_drv.h | 10 ++
drivers/virt/acrn/hsm.c | 8 +
drivers/virt/acrn/ioeventfd.c | 275 ++++++++++++++++++++++++++++++++++
drivers/virt/acrn/vm.c | 2 +
include/uapi/linux/acrn.h | 29 ++++
7 files changed, 326 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/acrn/ioeventfd.c

diff --git a/drivers/virt/acrn/Kconfig b/drivers/virt/acrn/Kconfig
index 36c80378c30c..3e1a61c9d8d8 100644
--- a/drivers/virt/acrn/Kconfig
+++ b/drivers/virt/acrn/Kconfig
@@ -2,6 +2,7 @@
config ACRN_HSM
tristate "ACRN Hypervisor Service Module"
depends on ACRN_GUEST
+ select EVENTFD
help
ACRN Hypervisor Service Module (HSM) is a kernel module which
communicates with ACRN userspace through ioctls and talks to
diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 21721cbf6a80..755b583b32ca 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o mm.o ioreq.o
+acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 05836dcefbd6..e36e8c94139b 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -154,6 +154,9 @@ extern rwlock_t acrn_vm_list_lock;
* @ioreq_page: The page of the I/O request shared buffer
* @pci_conf_addr: Address of a PCI configuration access emulation
* @monitor_page: Page of interrupt statistics of User VM
+ * @ioeventfds_lock: Lock to protect ioeventfds list
+ * @ioeventfds: List to link all hsm_ioeventfd
+ * @ioeventfd_client: I/O client for ioeventfds of the VM
*/
struct acrn_vm {
struct list_head list;
@@ -170,6 +173,9 @@ struct acrn_vm {
struct page *ioreq_page;
u32 pci_conf_addr;
struct page *monitor_page;
+ struct mutex ioeventfds_lock;
+ struct list_head ioeventfds;
+ struct acrn_ioreq_client *ioeventfd_client;
};

struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -202,4 +208,8 @@ void acrn_ioreq_range_del(struct acrn_ioreq_client *client,

int acrn_msi_inject(u16 vmid, u64 msi_addr, u64 msi_data);

+int acrn_ioeventfd_init(struct acrn_vm *vm);
+int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args);
+void acrn_ioeventfd_deinit(struct acrn_vm *vm);
+
#endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index c0f33bc505e2..9f990929242c 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -112,6 +112,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
struct acrn_vcpu_regs *cpu_regs;
struct acrn_ioreq_notify notify;
struct acrn_ptdev_irq *irq_info;
+ struct acrn_ioeventfd ioeventfd;
struct acrn_vm_memmap memmap;
struct acrn_msi_entry *msi;
struct acrn_pcidev *pcidev;
@@ -303,6 +304,13 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,

ret = pmcmd_ioctl(cstate_cmd, (void __user *)ioctl_param);
break;
+ case ACRN_IOCTL_IOEVENTFD:
+ if (copy_from_user(&ioeventfd, (void __user *)ioctl_param,
+ sizeof(ioeventfd)))
+ return -EFAULT;
+
+ ret = acrn_ioeventfd_config(vm, &ioeventfd);
+ break;
default:
pr_warn("Unknown IOCTL 0x%x!\n", cmd);
ret = -EINVAL;
diff --git a/drivers/virt/acrn/ioeventfd.c b/drivers/virt/acrn/ioeventfd.c
new file mode 100644
index 000000000000..3c575173c47c
--- /dev/null
+++ b/drivers/virt/acrn/ioeventfd.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN HSM eventfd - use eventfd objects to signal expected I/O requests
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ * Shuo Liu <[email protected]>
+ * Yakui Zhao <[email protected]>
+ */
+
+#define pr_fmt(fmt) "acrn: " fmt
+
+#include <linux/eventfd.h>
+#include <linux/slab.h>
+
+#include "acrn_drv.h"
+
+/**
+ * struct hsm_ioeventfd - Properties of HSM ioeventfd
+ * @list: Entry within &acrn_vm.ioeventfds of ioeventfds of a VM
+ * @eventfd: Eventfd of the HSM ioeventfd
+ * @addr: Address of I/O range
+ * @data: Data for matching
+ * @length: Length of I/O range
+ * @type: Type of I/O range (ACRN_IOREQ_TYPE_MMIO/ACRN_IOREQ_TYPE_PORTIO)
+ * @wildcard: Data matching or not
+ */
+struct hsm_ioeventfd {
+ struct list_head list;
+ struct eventfd_ctx *eventfd;
+ u64 addr;
+ u64 data;
+ int length;
+ int type;
+ bool wildcard;
+};
+
+static inline int ioreq_type_from_flags(int flags)
+{
+ return flags & ACRN_IOEVENTFD_FLAG_PIO ?
+ ACRN_IOREQ_TYPE_PORTIO : ACRN_IOREQ_TYPE_MMIO;
+}
+
+static void acrn_ioeventfd_shutdown(struct acrn_vm *vm, struct hsm_ioeventfd *p)
+{
+ lockdep_assert_held(&vm->ioeventfds_lock);
+
+ eventfd_ctx_put(p->eventfd);
+ list_del(&p->list);
+ kfree(p);
+}
+
+static bool hsm_ioeventfd_is_conflict(struct acrn_vm *vm,
+ struct hsm_ioeventfd *ioeventfd)
+{
+ struct hsm_ioeventfd *p;
+
+ lockdep_assert_held(&vm->ioeventfds_lock);
+
+ /* If either one is a wildcard, data matching is skipped. */
+ list_for_each_entry(p, &vm->ioeventfds, list)
+ if (p->eventfd == ioeventfd->eventfd &&
+ p->addr == ioeventfd->addr &&
+ p->type == ioeventfd->type &&
+ (p->wildcard || ioeventfd->wildcard ||
+ p->data == ioeventfd->data))
+ return true;
+
+ return false;
+}
+
+/*
+ * Assign an eventfd to a VM and create a HSM ioeventfd associated with the
+ * eventfd. The properties of the HSM ioeventfd are built from a &struct
+ * acrn_ioeventfd.
+ */
+static int acrn_ioeventfd_assign(struct acrn_vm *vm,
+ struct acrn_ioeventfd *args)
+{
+ struct eventfd_ctx *eventfd;
+ struct hsm_ioeventfd *p;
+ int ret;
+
+ /* Check for range overflow */
+ if (args->addr + args->len < args->addr)
+ return -EINVAL;
+
+ /*
+ * Currently, acrn_ioeventfd is used to support vhost. 1,2,4,8 width
+ * accesses can cover vhost's requirements.
+ */
+ if (!(args->len == 1 || args->len == 2 ||
+ args->len == 4 || args->len == 8))
+ return -EINVAL;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(eventfd);
+
+ p = kzalloc(sizeof(*p), GFP_KERNEL);
+ if (!p) {
+ ret = -ENOMEM;
+ goto fail;
+ }
+
+ INIT_LIST_HEAD(&p->list);
+ p->addr = args->addr;
+ p->length = args->len;
+ p->eventfd = eventfd;
+ p->type = ioreq_type_from_flags(args->flags);
+
+ /*
+ * ACRN_IOEVENTFD_FLAG_DATAMATCH flag is set in virtio 1.0 support, the
+ * writing of notification register of each virtqueue may trigger the
+ * notification. There is no data matching requirement.
+ */
+ if (args->flags & ACRN_IOEVENTFD_FLAG_DATAMATCH)
+ p->data = args->data;
+ else
+ p->wildcard = true;
+
+ mutex_lock(&vm->ioeventfds_lock);
+
+ if (hsm_ioeventfd_is_conflict(vm, p)) {
+ ret = -EEXIST;
+ goto unlock_fail;
+ }
+
+ /* register the I/O range into ioreq client */
+ ret = acrn_ioreq_range_add(vm->ioeventfd_client, p->type,
+ p->addr, p->addr + p->length - 1);
+ if (ret < 0)
+ goto unlock_fail;
+
+ list_add_tail(&p->list, &vm->ioeventfds);
+ mutex_unlock(&vm->ioeventfds_lock);
+
+ return 0;
+
+unlock_fail:
+ mutex_unlock(&vm->ioeventfds_lock);
+ kfree(p);
+fail:
+ eventfd_ctx_put(eventfd);
+ return ret;
+}
+
+static int acrn_ioeventfd_deassign(struct acrn_vm *vm,
+ struct acrn_ioeventfd *args)
+{
+ struct hsm_ioeventfd *p;
+ struct eventfd_ctx *eventfd;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(eventfd);
+
+ mutex_lock(&vm->ioeventfds_lock);
+ list_for_each_entry(p, &vm->ioeventfds, list) {
+ if (p->eventfd != eventfd)
+ continue;
+
+ acrn_ioreq_range_del(vm->ioeventfd_client, p->type,
+ p->addr, p->addr + p->length - 1);
+ acrn_ioeventfd_shutdown(vm, p);
+ break;
+ }
+ mutex_unlock(&vm->ioeventfds_lock);
+
+ eventfd_ctx_put(eventfd);
+ return 0;
+}
+
+static struct hsm_ioeventfd *hsm_ioeventfd_match(struct acrn_vm *vm, u64 addr,
+ u64 data, int len, int type)
+{
+ struct hsm_ioeventfd *p = NULL;
+
+ lockdep_assert_held(&vm->ioeventfds_lock);
+
+ list_for_each_entry(p, &vm->ioeventfds, list) {
+ if (p->type == type && p->addr == addr && p->length >= len &&
+ (p->wildcard || p->data == data))
+ return p;
+ }
+
+ return NULL;
+}
+
+static int acrn_ioeventfd_handler(struct acrn_ioreq_client *client,
+ struct acrn_io_request *req)
+{
+ struct hsm_ioeventfd *p;
+ u64 addr, val;
+ int size;
+
+ if (req->type == ACRN_IOREQ_TYPE_MMIO) {
+ /*
+ * I/O requests are dispatched by range check only, so an
+ * acrn_ioreq_client needs to process both READ and WRITE
+ * accesses of the same range. READ accesses can safely be
+ * ignored here because virtio PCI devices write the notify
+ * registers for notification.
+ */
+ if (req->reqs.mmio_request.direction == ACRN_IOREQ_DIR_READ) {
+ /* reading does nothing and return 0 */
+ req->reqs.mmio_request.value = 0;
+ return 0;
+ }
+ addr = req->reqs.mmio_request.address;
+ size = req->reqs.mmio_request.size;
+ val = req->reqs.mmio_request.value;
+ } else {
+ if (req->reqs.pio_request.direction == ACRN_IOREQ_DIR_READ) {
+ /* reading does nothing and return 0 */
+ req->reqs.pio_request.value = 0;
+ return 0;
+ }
+ addr = req->reqs.pio_request.address;
+ size = req->reqs.pio_request.size;
+ val = req->reqs.pio_request.value;
+ }
+
+ mutex_lock(&client->vm->ioeventfds_lock);
+ p = hsm_ioeventfd_match(client->vm, addr, val, size, req->type);
+ if (p)
+ eventfd_signal(p->eventfd, 1);
+ mutex_unlock(&client->vm->ioeventfds_lock);
+
+ return 0;
+}
+
+int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args)
+{
+ int ret;
+
+ if (args->flags & ACRN_IOEVENTFD_FLAG_DEASSIGN)
+ ret = acrn_ioeventfd_deassign(vm, args);
+ else
+ ret = acrn_ioeventfd_assign(vm, args);
+
+ return ret;
+}
+
+int acrn_ioeventfd_init(struct acrn_vm *vm)
+{
+ char name[ACRN_NAME_LEN];
+
+ mutex_init(&vm->ioeventfds_lock);
+ INIT_LIST_HEAD(&vm->ioeventfds);
+ snprintf(name, sizeof(name), "ioeventfd-%u", vm->vmid);
+ vm->ioeventfd_client = acrn_ioreq_client_create(vm,
+ acrn_ioeventfd_handler,
+ NULL, false, name);
+ if (!vm->ioeventfd_client) {
+ pr_err("Failed to create ioeventfd ioreq client!\n");
+ return -EINVAL;
+ }
+
+ pr_debug("VM %u ioeventfd init.\n", vm->vmid);
+ return 0;
+}
+
+void acrn_ioeventfd_deinit(struct acrn_vm *vm)
+{
+ struct hsm_ioeventfd *p, *next;
+
+ pr_debug("VM %u ioeventfd deinit.\n", vm->vmid);
+ acrn_ioreq_client_destroy(vm->ioeventfd_client);
+ mutex_lock(&vm->ioeventfds_lock);
+ list_for_each_entry_safe(p, next, &vm->ioeventfds, list)
+ acrn_ioeventfd_shutdown(vm, p);
+ mutex_unlock(&vm->ioeventfds_lock);
+}
diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
index 97c809490758..1a9456794663 100644
--- a/drivers/virt/acrn/vm.c
+++ b/drivers/virt/acrn/vm.c
@@ -46,6 +46,7 @@ struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
list_add(&vm->list, &acrn_vm_list);
write_unlock_bh(&acrn_vm_list_lock);

+ acrn_ioeventfd_init(vm);
pr_debug("VM %u created.\n", vm->vmid);
return vm;
}
@@ -63,6 +64,7 @@ int acrn_vm_destroy(struct acrn_vm *vm)
list_del_init(&vm->list);
write_unlock_bh(&acrn_vm_list_lock);

+ acrn_ioeventfd_deinit(vm);
acrn_ioreq_deinit(vm);
if (vm->monitor_page) {
put_page(vm->monitor_page);
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index d5bd2b7dfd85..e2d5c657f8e2 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -395,6 +395,32 @@ enum acrn_pm_cmd_type {
ACRN_PMCMD_GET_CX_DATA,
};

+#define ACRN_IOEVENTFD_FLAG_PIO 0x01
+#define ACRN_IOEVENTFD_FLAG_DATAMATCH 0x02
+#define ACRN_IOEVENTFD_FLAG_DEASSIGN 0x04
+/**
+ * struct acrn_ioeventfd - Data to operate a &struct hsm_ioeventfd
+ * @fd: The fd of eventfd associated with a hsm_ioeventfd
+ * @flags: Logical-OR of ACRN_IOEVENTFD_FLAG_*
+ * @addr: The start address of IO range of ioeventfd
+ * @len: The length of IO range of ioeventfd
+ * @reserved: Reserved
+ * @data: Data for data matching
+ *
+ * Without flag ACRN_IOEVENTFD_FLAG_DEASSIGN, ioctl ACRN_IOCTL_IOEVENTFD
+ * creates a &struct hsm_ioeventfd with properties originated from &struct
+ * acrn_ioeventfd. With flag ACRN_IOEVENTFD_FLAG_DEASSIGN, ioctl
+ * ACRN_IOCTL_IOEVENTFD destroys the &struct hsm_ioeventfd matching the fd.
+ */
+struct acrn_ioeventfd {
+ __u32 fd;
+ __u32 flags;
+ __u64 addr;
+ __u32 len;
+ __u32 reserved;
+ __u64 data;
+};
+
/* The ioctl type, documented in ioctl-number.rst */
#define ACRN_IOCTL_TYPE 0xA2

@@ -452,4 +478,7 @@ enum acrn_pm_cmd_type {
#define ACRN_IOCTL_PM_GET_CPU_STATE \
_IOWR(ACRN_IOCTL_TYPE, 0x60, __u64)

+#define ACRN_IOCTL_IOEVENTFD \
+ _IOW(ACRN_IOCTL_TYPE, 0x70, struct acrn_ioeventfd)
+
#endif /* _UAPI_ACRN_H */
--
2.28.0

2020-08-25 03:22:29

by Shuo Liu

Subject: [PATCH 09/17] virt: acrn: Introduce I/O request management

From: Shuo Liu <[email protected]>

An I/O request of a User VM, which is constructed by the hypervisor, is
distributed by the ACRN Hypervisor Service Module to an I/O client
corresponding to the address range of the I/O request.

For each User VM, there is a shared 4-KByte memory region used for I/O
request communication between the hypervisor and the Service VM. An I/O
request is a 256-byte structure, 'struct acrn_io_request', that an I/O
handler of the hypervisor fills when a trapped I/O access happens in a
User VM. ACRN userspace in the Service VM first allocates a 4-KByte page
and passes the GPA (Guest Physical Address) of the buffer to the
hypervisor. The buffer is used as an array of 16 I/O request slots of
256 bytes each, indexed by vCPU ID.
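
The arithmetic behind this layout (16 slots x 256 bytes = 4096 bytes, one
slot per vCPU ID) can be checked with a small model. The sketch below is
illustrative only: the model_* types and macros are hypothetical and do not
reproduce the fields of 'struct acrn_io_request', only its size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE		4096u
#define MODEL_IOREQ_SLOT_SIZE	256u
#define MODEL_IOREQ_SLOTS	16u

/* Opaque 256-byte slot; the real structure has named fields. */
struct model_io_request {
	uint8_t raw[MODEL_IOREQ_SLOT_SIZE];
};

/* The shared page viewed as an array of slots. */
struct model_io_request_buffer {
	struct model_io_request slot[MODEL_IOREQ_SLOTS];
};

/* 16 slots of 256 bytes fill the 4-KByte page exactly. */
_Static_assert(sizeof(struct model_io_request_buffer) == MODEL_PAGE_SIZE,
	       "slot array must fill exactly one page");

/* Byte offset of the slot that belongs to a given vCPU ID. */
static size_t model_slot_offset(unsigned int vcpu)
{
	return (size_t)vcpu * MODEL_IOREQ_SLOT_SIZE;
}

/* Look up the slot for a vCPU, or NULL if the ID is out of range. */
static struct model_io_request *
model_slot(struct model_io_request_buffer *buf, unsigned int vcpu)
{
	return vcpu < MODEL_IOREQ_SLOTS ? &buf->slot[vcpu] : NULL;
}

static int model_layout_selfcheck(void)
{
	static struct model_io_request_buffer buf;

	assert(model_slot_offset(0) == 0);
	assert(model_slot_offset(15) == 3840);
	assert((uint8_t *)model_slot(&buf, 3) ==
	       (uint8_t *)&buf + model_slot_offset(3));
	assert(model_slot(&buf, 16) == NULL);
	return 0;
}
```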

An I/O client, which is 'struct acrn_ioreq_client', is responsible for
handling User VM I/O requests whose accessed GPA falls in a certain
range. Multiple I/O clients can be associated with each User VM. There
is a special client associated with each User VM, called the default
client, that handles all I/O requests that do not fit into the range of
any other I/O clients. The ACRN userspace acts as the default client for
each User VM.
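
The range-based selection with a default fallback can be sketched as
follows. This is a simplified model under stated assumptions: the model_*
names are hypothetical, ranges use an inclusive end address (consistent
with the ioeventfd patch registering addr + len - 1 as the end), and an
access must fall entirely inside a range to match. The client names
"acrndm" and "ioeventfd-1" follow the naming used elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the I/O request types. */
enum { MODEL_TYPE_PORTIO, MODEL_TYPE_MMIO };

struct model_range {
	uint32_t type;
	uint64_t start;
	uint64_t end;		/* inclusive */
};

struct model_client {
	const char *name;
	bool is_default;
	const struct model_range *ranges;
	size_t nr_ranges;
};

/* True if the whole access [addr, addr + size - 1] lies in the range. */
static bool model_in_range(const struct model_range *r, uint32_t type,
			   uint64_t addr, uint64_t size)
{
	return r->type == type && addr >= r->start &&
	       addr + size - 1 <= r->end;
}

/* Pick the client whose range covers the access, else the default one. */
static const struct model_client *
model_find_client(const struct model_client *clients, size_t n,
		  uint32_t type, uint64_t addr, uint64_t size)
{
	const struct model_client *def = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct model_client *c = &clients[i];

		if (c->is_default) {
			def = c;
			continue;
		}
		for (size_t j = 0; j < c->nr_ranges; j++)
			if (model_in_range(&c->ranges[j], type, addr, size))
				return c;
	}
	return def;
}

static int model_client_selfcheck(void)
{
	static const struct model_range ev_ranges[] = {
		{ MODEL_TYPE_MMIO, 0x1000, 0x1003 },
	};
	const struct model_client clients[] = {
		{ "acrndm", true, NULL, 0 },
		{ "ioeventfd-1", false, ev_ranges, 1 },
	};

	/* Fully inside the registered range: the dedicated client. */
	assert(model_find_client(clients, 2, MODEL_TYPE_MMIO, 0x1000, 4) ==
	       &clients[1]);
	/* Spills past the range end: falls back to the default client. */
	assert(model_find_client(clients, 2, MODEL_TYPE_MMIO, 0x1002, 4) ==
	       &clients[0]);
	/* Different I/O type: default client. */
	assert(model_find_client(clients, 2, MODEL_TYPE_PORTIO, 0x1000, 1) ==
	       &clients[0]);
	return 0;
}
```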

The state transitions of an ACRN I/O request are as follows.

FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ...

FREE: this I/O request slot is empty
PENDING: a valid I/O request is pending in this slot
PROCESSING: the I/O request is being processed
COMPLETE: the I/O request has been processed

An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM
and ACRN userspace are in charge of processing the others.
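
The state cycle and the ownership rule can be written down as a tiny state
machine. This is a sketch for illustration: the enum names and their
numeric values are hypothetical and are not the constants defined in the
UAPI header.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state names; numeric values are arbitrary. */
enum model_ioreq_state {
	MODEL_FREE,
	MODEL_PENDING,
	MODEL_PROCESSING,
	MODEL_COMPLETE,
};

/* FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ... */
static enum model_ioreq_state model_next_state(enum model_ioreq_state s)
{
	switch (s) {
	case MODEL_FREE:
		return MODEL_PENDING;
	case MODEL_PENDING:
		return MODEL_PROCESSING;
	case MODEL_PROCESSING:
		return MODEL_COMPLETE;
	case MODEL_COMPLETE:
	default:
		return MODEL_FREE;
	}
}

/* COMPLETE and FREE slots belong to the hypervisor, the rest to HSM/userspace. */
static bool model_owned_by_hypervisor(enum model_ioreq_state s)
{
	return s == MODEL_COMPLETE || s == MODEL_FREE;
}

static int model_state_selfcheck(void)
{
	enum model_ioreq_state s = MODEL_FREE;

	assert(model_owned_by_hypervisor(s));
	s = model_next_state(s);	/* PENDING */
	assert(s == MODEL_PENDING && !model_owned_by_hypervisor(s));
	s = model_next_state(s);	/* PROCESSING */
	assert(s == MODEL_PROCESSING && !model_owned_by_hypervisor(s));
	s = model_next_state(s);	/* COMPLETE */
	assert(s == MODEL_COMPLETE && model_owned_by_hypervisor(s));
	s = model_next_state(s);	/* back to FREE */
	return s == MODEL_FREE ? 0 : -1;
}
```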

The processing flow of I/O requests is as follows:

a) The I/O handler of the hypervisor will fill an I/O request with
PENDING state when a trapped I/O access happens in a User VM.
b) The hypervisor makes an upcall, which is a notification interrupt, to
the Service VM.
c) The upcall handler schedules a tasklet to dispatch I/O requests.
d) The tasklet looks for the PENDING I/O requests, assigns them to
different registered clients based on the address of the I/O accesses,
updates their state to PROCESSING, and notifies the corresponding
client to handle.
e) The notified client handles the assigned I/O requests.
f) The HSM updates the states of the I/O requests to COMPLETE and notifies
the hypervisor of the completion via hypercalls.

Signed-off-by: Shuo Liu <[email protected]>
Reviewed-by: Zhi Wang <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Yu Wang <[email protected]>
Cc: Reinette Chatre <[email protected]>
---
drivers/virt/acrn/Makefile | 2 +-
drivers/virt/acrn/acrn_drv.h | 80 ++++++
drivers/virt/acrn/hsm.c | 26 ++
drivers/virt/acrn/hypercall.h | 28 ++
drivers/virt/acrn/ioreq.c | 503 ++++++++++++++++++++++++++++++++++
drivers/virt/acrn/vm.c | 10 +
include/uapi/linux/acrn.h | 134 +++++++++
7 files changed, 782 insertions(+), 1 deletion(-)
create mode 100644 drivers/virt/acrn/ioreq.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 38bc44b6edcd..21721cbf6a80 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o mm.o
+acrn-y := hsm.o vm.o mm.o ioreq.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index c198142376d9..c2e32e9a17b7 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -8,10 +8,15 @@

#include "hypercall.h"

+#define ACRN_NAME_LEN 16
#define ACRN_MEM_MAPPING_MAX 256

#define ACRN_MEM_REGION_ADD 0
#define ACRN_MEM_REGION_DEL 2
+
+struct acrn_vm;
+struct acrn_ioreq_client;
+
/**
* struct vm_memory_region_op - Hypervisor memory operation
* @type: Operation type (ACRN_MEM_REGION_*)
@@ -73,9 +78,61 @@ struct vm_memory_mapping {
size_t size;
};

+/**
+ * struct acrn_ioreq_buffer - Data for setting the ioreq buffer of User VM
+ * @ioreq_buf: The GPA of the IO request shared buffer of a VM
+ *
+ * The parameter for the HC_SET_IOREQ_BUFFER hypercall used to set up
+ * the shared I/O request buffer between Service VM and ACRN hypervisor.
+ */
+struct acrn_ioreq_buffer {
+ u64 ioreq_buf;
+};
+
+struct acrn_ioreq_range {
+ struct list_head list;
+ u32 type;
+ u64 start;
+ u64 end;
+};
+
+#define ACRN_IOREQ_CLIENT_DESTROYING 0U
+typedef int (*ioreq_handler_t)(struct acrn_ioreq_client *client,
+ struct acrn_io_request *req);
+/**
+ * struct acrn_ioreq_client - Structure of I/O client.
+ * @name: Client name
+ * @vm: The VM that the client belongs to
+ * @list: List node for this acrn_ioreq_client
+ * @is_default: If this client is the default one
+ * @flags: Flags (ACRN_IOREQ_CLIENT_*)
+ * @range_list: I/O ranges
+ * @range_lock: Lock to protect range_list
+ * @ioreqs_map: The pending I/O requests bitmap.
+ * @handler: I/O requests handler of this client
+ * @thread: The thread which executes the handler
+ * @wq: The wait queue for the handler thread parking
+ * @priv: Data for the thread
+ */
+struct acrn_ioreq_client {
+ char name[ACRN_NAME_LEN];
+ struct acrn_vm *vm;
+ struct list_head list;
+ bool is_default;
+ unsigned long flags;
+ struct list_head range_list;
+ rwlock_t range_lock;
+ DECLARE_BITMAP(ioreqs_map, ACRN_IO_REQUEST_MAX);
+ ioreq_handler_t handler;
+ struct task_struct *thread;
+ wait_queue_head_t wq;
+ void *priv;
+};
+
#define ACRN_INVALID_VMID (0xffffU)

#define ACRN_VM_FLAG_DESTROYED 0U
+#define ACRN_VM_FLAG_CLEARING_IOREQ 1U
extern struct list_head acrn_vm_list;
extern rwlock_t acrn_vm_list_lock;
/**
@@ -90,6 +147,11 @@ extern rwlock_t acrn_vm_list_lock;
* &acrn_vm.regions_mapping_count.
* @regions_mapping: Memory mappings of this VM.
* @regions_mapping_count: Number of memory mapping of this VM.
+ * @ioreq_clients_lock: Lock to protect ioreq_clients and default_client
+ * @ioreq_clients: The I/O request clients list of this VM
+ * @default_client: The default I/O request client
+ * @ioreq_buf: I/O request shared buffer
+ * @ioreq_page: The page of the I/O request shared buffer
*/
struct acrn_vm {
struct list_head list;
@@ -99,6 +161,11 @@ struct acrn_vm {
struct mutex regions_mapping_lock;
struct vm_memory_mapping regions_mapping[ACRN_MEM_MAPPING_MAX];
int regions_mapping_count;
+ spinlock_t ioreq_clients_lock;
+ struct list_head ioreq_clients;
+ struct acrn_ioreq_client *default_client;
+ struct acrn_io_request_buffer *ioreq_buf;
+ struct page *ioreq_page;
};

struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -112,4 +179,17 @@ int acrn_vm_memseg_unmap(struct acrn_vm *vm, struct acrn_vm_memmap *memmap);
int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap);
void acrn_vm_all_ram_unmap(struct acrn_vm *vm);

+int acrn_ioreq_init(struct acrn_vm *vm, u64 buf_vma);
+void acrn_ioreq_deinit(struct acrn_vm *vm);
+void acrn_ioreq_intr_setup(void);
+void acrn_ioreq_intr_remove(void);
+void acrn_ioreq_request_clear(struct acrn_vm *vm);
+int acrn_ioreq_client_wait(struct acrn_ioreq_client *client);
+int acrn_ioreq_request_default_complete(struct acrn_vm *vm, u16 vcpu);
+struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
+ ioreq_handler_t handler,
+ void *data, bool is_default,
+ const char *name);
+void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client);
+
#endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 980725454214..3c7bea54e476 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -49,6 +49,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
struct acrn_vm *vm = filp->private_data;
struct acrn_vm_creation *vm_param;
struct acrn_vcpu_regs *cpu_regs;
+ struct acrn_ioreq_notify notify;
struct acrn_vm_memmap memmap;
int ret = 0;

@@ -129,6 +130,29 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,

ret = acrn_vm_memseg_unmap(vm, &memmap);
break;
+ case ACRN_IOCTL_CREATE_IOREQ_CLIENT:
+ if (vm->default_client)
+ return -EEXIST;
+ if (!acrn_ioreq_client_create(vm, NULL, NULL, true, "acrndm"))
+ ret = -EFAULT;
+ break;
+ case ACRN_IOCTL_DESTROY_IOREQ_CLIENT:
+ if (vm->default_client)
+ acrn_ioreq_client_destroy(vm->default_client);
+ break;
+ case ACRN_IOCTL_ATTACH_IOREQ_CLIENT:
+ if (vm->default_client)
+ ret = acrn_ioreq_client_wait(vm->default_client);
+ break;
+ case ACRN_IOCTL_NOTIFY_REQUEST_FINISH:
+ if (copy_from_user(&notify, (void __user *)ioctl_param,
+ sizeof(struct acrn_ioreq_notify)))
+ return -EFAULT;
+ ret = acrn_ioreq_request_default_complete(vm, notify.vcpu);
+ break;
+ case ACRN_IOCTL_CLEAR_VM_IOREQ:
+ acrn_ioreq_request_clear(vm);
+ break;
default:
pr_warn("Unknown IOCTL 0x%x!\n", cmd);
ret = -EINVAL;
@@ -184,11 +208,13 @@ static int __init hsm_init(void)
return ret;
}

+ acrn_ioreq_intr_setup();
return 0;
}

static void __exit hsm_exit(void)
{
+ acrn_ioreq_intr_remove();
misc_deregister(&acrn_dev);
}
module_init(hsm_init);
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index 89eb45285728..d85dbcdb9f00 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -24,6 +24,10 @@
#define HC_RESET_VM _HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
#define HC_SET_VCPU_REGS _HC_ID(HC_ID, HC_ID_VM_BASE + 0x06)

+#define HC_ID_IOREQ_BASE 0x30UL
+#define HC_SET_IOREQ_BUFFER _HC_ID(HC_ID, HC_ID_IOREQ_BASE + 0x00)
+#define HC_NOTIFY_REQUEST_FINISH _HC_ID(HC_ID, HC_ID_IOREQ_BASE + 0x01)
+
#define HC_ID_MEM_BASE 0x40UL
#define HC_VM_SET_MEMORY_REGIONS _HC_ID(HC_ID, HC_ID_MEM_BASE + 0x02)

@@ -105,6 +109,30 @@ static inline long hcall_set_vcpu_regs(u64 vmid, u64 regs_state)
return acrn_hypercall2(HC_SET_VCPU_REGS, vmid, regs_state);
}

+/**
+ * hcall_set_ioreq_buffer() - Set up the shared buffer for I/O Requests.
+ * @vmid: User VM ID
+ * @buffer: Service VM GPA of the shared buffer
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_set_ioreq_buffer(u64 vmid, u64 buffer)
+{
+ return acrn_hypercall2(HC_SET_IOREQ_BUFFER, vmid, buffer);
+}
+
+/**
+ * hcall_notify_req_finish() - Notify ACRN Hypervisor of I/O request completion.
+ * @vmid: User VM ID
+ * @vcpu: The vCPU which initiated the I/O request
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_notify_req_finish(u64 vmid, u64 vcpu)
+{
+ return acrn_hypercall2(HC_NOTIFY_REQUEST_FINISH, vmid, vcpu);
+}
+
/**
* hcall_set_memory_regions() - Inform the hypervisor to set up EPT mappings
* @regions_pa: Service VM GPA of &struct vm_memory_region_batch
diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
new file mode 100644
index 000000000000..3339fc7c8b54
--- /dev/null
+++ b/drivers/virt/acrn/ioreq.c
@@ -0,0 +1,503 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN_HSM: Handle I/O requests
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ * Jason Chen CJ <[email protected]>
+ * Fengwei Yin <[email protected]>
+ */
+
+#define pr_fmt(fmt) "acrn: " fmt
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kthread.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+
+#include <asm/acrn.h>
+
+#include "acrn_drv.h"
+
+static void ioreq_pause(void);
+static void ioreq_resume(void);
+
+static struct tasklet_struct ioreq_tasklet;
+
+static inline bool has_pending_request(struct acrn_ioreq_client *client)
+{
+ return !bitmap_empty(client->ioreqs_map, ACRN_IO_REQUEST_MAX);
+}
+
+static inline bool is_destroying(struct acrn_ioreq_client *client)
+{
+ return test_bit(ACRN_IOREQ_CLIENT_DESTROYING, &client->flags);
+}
+
+static int ioreq_complete_request(u16 vmid, u16 vcpu,
+ struct acrn_io_request *acrn_req)
+{
+ bool polling_mode;
+ int ret = 0;
+
+ polling_mode = acrn_req->completion_polling;
+ /* Store-release so that prior writes are visible before COMPLETE is seen */
+ smp_store_release(&acrn_req->processed, ACRN_IOREQ_STATE_COMPLETE);
+
+ /*
+ * To fulfill the requirement of real-time in several industry
+ * scenarios, like automotive, ACRN can run under the partition mode,
+ * in which User VMs and Service VM are bound to dedicated CPU cores.
+ * Polling mode of handling the I/O request is introduced to achieve a
+ * faster I/O request handling. In polling mode, the hypervisor polls
+ * I/O request's completion. Once an I/O request is marked as
+ * ACRN_IOREQ_STATE_COMPLETE, hypervisor resumes from the polling point
+ * to continue the I/O request flow. Thus, the completion notification
+ * from HSM for the I/O request is not needed. Note that
+ * completion_polling must be read before the I/O request is
+ * marked as ACRN_IOREQ_STATE_COMPLETE to avoid racing with the
+ * hypervisor.
+ */
+ if (!polling_mode) {
+ ret = hcall_notify_req_finish(vmid, vcpu);
+ if (ret < 0)
+ pr_err("Failed to notify I/O request completion!\n");
+ }
+
+ return ret;
+}
+
+static int acrn_ioreq_complete_request(struct acrn_ioreq_client *client,
+ u16 vcpu,
+ struct acrn_io_request *acrn_req)
+{
+ int ret;
+
+ if (vcpu >= client->vm->vcpu_num)
+ return -EINVAL;
+
+ clear_bit(vcpu, client->ioreqs_map);
+ if (!acrn_req) {
+ acrn_req = (struct acrn_io_request *)client->vm->ioreq_buf;
+ acrn_req += vcpu;
+ }
+
+ ret = ioreq_complete_request(client->vm->vmid, vcpu, acrn_req);
+
+ return ret;
+}
+
+int acrn_ioreq_request_default_complete(struct acrn_vm *vm, u16 vcpu)
+{
+ int ret = 0;
+
+ spin_lock_bh(&vm->ioreq_clients_lock);
+ if (vm->default_client)
+ ret = acrn_ioreq_complete_request(vm->default_client,
+ vcpu, NULL);
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+
+ return ret;
+}
+
+/*
+ * ioreq_task() is the execution entity of handler thread of an I/O client.
+ * The handler callback of the I/O client is called within the handler thread.
+ */
+static int ioreq_task(void *data)
+{
+ struct acrn_ioreq_client *client = data;
+ struct acrn_io_request *req;
+ unsigned long *ioreqs_map;
+ int vcpu, ret;
+
+ /*
+ * Lockless access to ioreqs_map is safe, because
+ * 1) set_bit() and clear_bit() are atomic operations.
+ * 2) I/O requests arrive serialized. The access flow of ioreqs_map is:
+ * set_bit() - in tasklet
+ * Handler callback handles corresponding I/O request
+ * clear_bit() - in handler thread (including ACRN userspace)
+ * Mark corresponding I/O request completed
+ * Loop again if a new I/O request occurs
+ */
+ ioreqs_map = client->ioreqs_map;
+ while (!kthread_should_stop()) {
+ acrn_ioreq_client_wait(client);
+ while (has_pending_request(client)) {
+ vcpu = find_first_bit(ioreqs_map, client->vm->vcpu_num);
+ req = client->vm->ioreq_buf->req_slot + vcpu;
+ ret = client->handler(client, req);
+ if (ret < 0) {
+ pr_err("IO handle failure: %d\n", ret);
+ break;
+ }
+ acrn_ioreq_complete_request(client, vcpu, req);
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * For the non-default I/O clients, give them a chance to complete the current
+ * I/O requests if there are any. For the default I/O client, it is safe to
+ * clear all pending I/O requests because the clearing request is from ACRN
+ * userspace.
+ */
+void acrn_ioreq_request_clear(struct acrn_vm *vm)
+{
+ struct acrn_ioreq_client *client;
+ bool has_pending = false;
+ unsigned long vcpu;
+ int retry = 10;
+
+ /*
+ * IO requests of this VM will be completed directly in
+ * acrn_ioreq_dispatch if ACRN_VM_FLAG_CLEARING_IOREQ flag is set.
+ */
+ set_bit(ACRN_VM_FLAG_CLEARING_IOREQ, &vm->flags);
+
+ /*
+ * acrn_ioreq_request_clear is only called in VM reset case. Simply
+ * wait 100ms in total for the IO requests' completion.
+ */
+ do {
+ spin_lock_bh(&vm->ioreq_clients_lock);
+ list_for_each_entry(client, &vm->ioreq_clients, list) {
+ has_pending = has_pending_request(client);
+ if (has_pending)
+ break;
+ }
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+
+ if (has_pending)
+ schedule_timeout_interruptible(HZ / 100);
+ } while (has_pending && --retry > 0);
+ if (retry == 0)
+ pr_warn("%s cannot flush pending request!\n", client->name);
+
+ /* Clear all ioreqs belonging to the default client */
+ spin_lock_bh(&vm->ioreq_clients_lock);
+ client = vm->default_client;
+ if (client) {
+ vcpu = find_next_bit(client->ioreqs_map,
+ ACRN_IO_REQUEST_MAX, 0);
+ while (vcpu < ACRN_IO_REQUEST_MAX) {
+ acrn_ioreq_complete_request(client, vcpu, NULL);
+ vcpu = find_next_bit(client->ioreqs_map,
+ ACRN_IO_REQUEST_MAX, vcpu + 1);
+ }
+ }
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+
+ /* Clear ACRN_VM_FLAG_CLEARING_IOREQ flag after the clearing */
+ clear_bit(ACRN_VM_FLAG_CLEARING_IOREQ, &vm->flags);
+}
+
+int acrn_ioreq_client_wait(struct acrn_ioreq_client *client)
+{
+ if (client->is_default) {
+ /*
+ * In the default client, a user space thread waits on the
+ * waitqueue. The is_destroying() check is used to notify user
+ * space the client is going to be destroyed.
+ */
+ wait_event_interruptible(client->wq,
+ has_pending_request(client) ||
+ is_destroying(client));
+ if (is_destroying(client))
+ /* return 1 to indicate the client is being destroyed */
+ return 1;
+ } else {
+ wait_event_interruptible(client->wq,
+ has_pending_request(client) ||
+ kthread_should_stop());
+ }
+
+ return 0;
+}
+
+static bool in_range(struct acrn_ioreq_range *range,
+ struct acrn_io_request *req)
+{
+ bool ret = false;
+
+ if (range->type == req->type) {
+ switch (req->type) {
+ case ACRN_IOREQ_TYPE_MMIO:
+ if (req->reqs.mmio_request.address >= range->start &&
+ (req->reqs.mmio_request.address +
+ req->reqs.mmio_request.size - 1) <= range->end)
+ ret = true;
+ break;
+ case ACRN_IOREQ_TYPE_PORTIO:
+ if (req->reqs.pio_request.address >= range->start &&
+ (req->reqs.pio_request.address +
+ req->reqs.pio_request.size - 1) <= range->end)
+ ret = true;
+ break;
+ default:
+ break;
+ }
+ }
+
+ return ret;
+}
+
+static struct acrn_ioreq_client *find_ioreq_client(struct acrn_vm *vm,
+ struct acrn_io_request *req)
+{
+ struct acrn_ioreq_client *client, *found = NULL;
+ struct acrn_ioreq_range *range;
+
+ lockdep_assert_held(&vm->ioreq_clients_lock);
+
+ list_for_each_entry(client, &vm->ioreq_clients, list) {
+ read_lock_bh(&client->range_lock);
+ list_for_each_entry(range, &client->range_list, list) {
+ if (in_range(range, req)) {
+ found = client;
+ break;
+ }
+ }
+ read_unlock_bh(&client->range_lock);
+ if (found)
+ break;
+ }
+ return found ? found : vm->default_client;
+}
+
+/**
+ * acrn_ioreq_client_create() - Create an ioreq client
+ * @vm: The VM that this client belongs to
+ * @handler: The ioreq_handler of the ioreq client. acrn_hsm will create a
+ * kernel thread and call the handler to handle I/O requests.
+ * @priv: Private data for the handler
+ * @is_default: If it is the default client
+ * @name: The name of ioreq client
+ *
+ * Return: acrn_ioreq_client pointer on success, NULL on error
+ */
+struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
+ ioreq_handler_t handler,
+ void *priv, bool is_default,
+ const char *name)
+{
+ struct acrn_ioreq_client *client;
+
+ if (!handler && !is_default) {
+ pr_err("Cannot create non-default client w/o handler!\n");
+ return NULL;
+ }
+ client = kzalloc(sizeof(*client), GFP_KERNEL);
+ if (!client)
+ return NULL;
+
+ client->handler = handler;
+ client->vm = vm;
+ client->priv = priv;
+ client->is_default = is_default;
+ if (name)
+ strncpy(client->name, name, sizeof(client->name) - 1);
+ rwlock_init(&client->range_lock);
+ INIT_LIST_HEAD(&client->range_list);
+ init_waitqueue_head(&client->wq);
+
+ if (client->handler) {
+ client->thread = kthread_run(ioreq_task, client, "VM%u-%s",
+ client->vm->vmid, client->name);
+ if (IS_ERR(client->thread)) {
+ kfree(client);
+ return NULL;
+ }
+ }
+
+ spin_lock_bh(&vm->ioreq_clients_lock);
+ if (is_default)
+ vm->default_client = client;
+ else
+ list_add(&client->list, &vm->ioreq_clients);
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+
+ pr_debug("Created ioreq client %s.\n", name);
+ return client;
+}
+
+/**
+ * acrn_ioreq_client_destroy() - Destroy an ioreq client
+ * @client: The ioreq client
+ */
+void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client)
+{
+ struct acrn_ioreq_range *range, *next;
+ struct acrn_vm *vm = client->vm;
+
+ pr_debug("Destroy ioreq client %s.\n", client->name);
+ ioreq_pause();
+ set_bit(ACRN_IOREQ_CLIENT_DESTROYING, &client->flags);
+ if (client->is_default)
+ wake_up_interruptible(&client->wq);
+ else
+ kthread_stop(client->thread);
+
+ spin_lock_bh(&vm->ioreq_clients_lock);
+ if (client->is_default)
+ vm->default_client = NULL;
+ else
+ list_del(&client->list);
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+
+ write_lock_bh(&client->range_lock);
+ list_for_each_entry_safe(range, next, &client->range_list, list) {
+ list_del(&range->list);
+ kfree(range);
+ }
+ write_unlock_bh(&client->range_lock);
+ kfree(client);
+
+ ioreq_resume();
+}
+
+static int acrn_ioreq_dispatch(struct acrn_vm *vm)
+{
+ struct acrn_ioreq_client *client;
+ struct acrn_io_request *req;
+ int i;
+
+ for (i = 0; i < vm->vcpu_num; i++) {
+ req = vm->ioreq_buf->req_slot + i;
+
+ /* barrier the read of processed of acrn_io_request */
+ if (smp_load_acquire(&req->processed) ==
+ ACRN_IOREQ_STATE_PENDING) {
+ /* Complete the IO request directly in clearing stage */
+ if (test_bit(ACRN_VM_FLAG_CLEARING_IOREQ, &vm->flags)) {
+ ioreq_complete_request(vm->vmid, i, req);
+ continue;
+ }
+
+ spin_lock_bh(&vm->ioreq_clients_lock);
+ client = find_ioreq_client(vm, req);
+ if (!client) {
+ pr_err("Failed to find ioreq client!\n");
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+ return -EINVAL;
+ }
+ if (!client->is_default)
+ req->kernel_handled = 1;
+ else
+ req->kernel_handled = 0;
+ /*
+ * Store-release so that the writes above are visible
+ * before ACRN_IOREQ_STATE_PROCESSING is observed
+ */
+ smp_store_release(&req->processed,
+ ACRN_IOREQ_STATE_PROCESSING);
+ set_bit(i, client->ioreqs_map);
+ wake_up_interruptible(&client->wq);
+ spin_unlock_bh(&vm->ioreq_clients_lock);
+ }
+ }
+
+ return 0;
+}
+
+static void ioreq_tasklet_handler(unsigned long data)
+{
+ struct acrn_vm *vm;
+
+ read_lock(&acrn_vm_list_lock);
+ list_for_each_entry(vm, &acrn_vm_list, list) {
+ if (!vm->ioreq_buf)
+ break;
+ acrn_ioreq_dispatch(vm);
+ }
+ read_unlock(&acrn_vm_list_lock);
+}
+
+static void ioreq_pause(void)
+{
+ /* Flush and disable the tasklet to ensure no I/O requests pending */
+ tasklet_disable(&ioreq_tasklet);
+}
+
+static void ioreq_resume(void)
+{
+ /* Schedule once after enabling in case other clients miss a tasklet */
+ tasklet_enable(&ioreq_tasklet);
+ tasklet_schedule(&ioreq_tasklet);
+}
+
+static void ioreq_intr_handler(void)
+{
+ tasklet_schedule(&ioreq_tasklet);
+}
+
+void acrn_ioreq_intr_setup(void)
+{
+ acrn_setup_intr_handler(ioreq_intr_handler);
+ tasklet_init(&ioreq_tasklet, ioreq_tasklet_handler, 0);
+}
+
+void acrn_ioreq_intr_remove(void)
+{
+ acrn_remove_intr_handler();
+}
+
+int acrn_ioreq_init(struct acrn_vm *vm, u64 buf_vma)
+{
+ struct acrn_ioreq_buffer *set_buffer;
+ struct page *page;
+ int ret;
+
+ if (vm->ioreq_buf)
+ return -EEXIST;
+
+ set_buffer = kzalloc(sizeof(*set_buffer), GFP_KERNEL);
+ if (!set_buffer)
+ return -ENOMEM;
+
+ ret = get_user_pages_fast(buf_vma, 1, FOLL_WRITE, &page);
+ if (unlikely(ret != 1) || !page) {
+ pr_err("Failed to pin ioreq page!\n");
+ ret = -ENOMEM;
+ goto free_buf;
+ }
+
+ vm->ioreq_buf = page_address(page);
+ vm->ioreq_page = page;
+ set_buffer->ioreq_buf = page_to_phys(page);
+ ret = hcall_set_ioreq_buffer(vm->vmid, virt_to_phys(set_buffer));
+ if (ret < 0) {
+ pr_err("Failed to init ioreq buffer!\n");
+ put_page(page);
+ vm->ioreq_buf = NULL;
+ goto free_buf;
+ }
+
+ pr_debug("Init ioreq buffer %pK!\n", vm->ioreq_buf);
+ ret = 0;
+free_buf:
+ kfree(set_buffer);
+ return ret;
+}
+
+void acrn_ioreq_deinit(struct acrn_vm *vm)
+{
+ struct acrn_ioreq_client *client, *next;
+
+ pr_debug("Deinit ioreq buffer %pK!\n", vm->ioreq_buf);
+ /* Destroy all clients belonging to this VM */
+ list_for_each_entry_safe(client, next, &vm->ioreq_clients, list)
+ acrn_ioreq_client_destroy(client);
+ if (vm->default_client)
+ acrn_ioreq_client_destroy(vm->default_client);
+
+ if (vm->ioreq_buf && vm->ioreq_page) {
+ put_page(vm->ioreq_page);
+ vm->ioreq_buf = NULL;
+ }
+}
diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c
index 57a6b3896de6..1cd5f3b09f12 100644
--- a/drivers/virt/acrn/vm.c
+++ b/drivers/virt/acrn/vm.c
@@ -31,9 +31,17 @@ struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
}

mutex_init(&vm->regions_mapping_lock);
+ INIT_LIST_HEAD(&vm->ioreq_clients);
+ spin_lock_init(&vm->ioreq_clients_lock);
vm->vmid = vm_param->vmid;
vm->vcpu_num = vm_param->vcpu_num;

+ if (acrn_ioreq_init(vm, vm_param->ioreq_buf) < 0) {
+ hcall_destroy_vm(vm_param->vmid);
+ vm->vmid = ACRN_INVALID_VMID;
+ return NULL;
+ }
+
write_lock_bh(&acrn_vm_list_lock);
list_add(&vm->list, &acrn_vm_list);
write_unlock_bh(&acrn_vm_list_lock);
@@ -55,6 +63,8 @@ int acrn_vm_destroy(struct acrn_vm *vm)
list_del_init(&vm->list);
write_unlock_bh(&acrn_vm_list_lock);

+ acrn_ioreq_deinit(vm);
+
ret = hcall_destroy_vm(vm->vmid);
if (ret < 0) {
pr_err("Failed to destroy VM %u\n", vm->vmid);
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 4a8349229819..713b22110a99 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -11,6 +11,129 @@

#include <linux/types.h>

+#define ACRN_IO_REQUEST_MAX 16
+
+#define ACRN_IOREQ_STATE_PENDING 0
+#define ACRN_IOREQ_STATE_COMPLETE 1
+#define ACRN_IOREQ_STATE_PROCESSING 2
+#define ACRN_IOREQ_STATE_FREE 3
+
+#define ACRN_IOREQ_TYPE_PORTIO 0
+#define ACRN_IOREQ_TYPE_MMIO 1
+
+#define ACRN_IOREQ_DIR_READ 0
+#define ACRN_IOREQ_DIR_WRITE 1
+
+struct acrn_mmio_request {
+ __u32 direction;
+ __u32 reserved;
+ __u64 address;
+ __u64 size;
+ __u64 value;
+} __attribute__((aligned(8)));
+
+struct acrn_pio_request {
+ __u32 direction;
+ __u32 reserved;
+ __u64 address;
+ __u64 size;
+ __u32 value;
+} __attribute__((aligned(8)));
+
+/**
+ * struct acrn_io_request - 256-byte ACRN I/O request
+ * @type: Type of this request (ACRN_IOREQ_TYPE_*).
+ * @completion_polling: Polling flag. Hypervisor will poll completion of the
+ * I/O request if this flag is set.
+ * @reserved0: Reserved fields.
+ * @reqs: Union of different types of request. Byte offset: 64.
+ * @reqs.pio_request: PIO request data of the I/O request.
+ * @reqs.mmio_request: MMIO request data of the I/O request.
+ * @reqs.data: Raw data of the I/O request.
+ * @reserved1: Reserved fields.
+ * @kernel_handled: Flag indicating that this request must be handled in kernel.
+ * @processed: The status of this request (ACRN_IOREQ_STATE_*).
+ *
+ * The state transitions of ACRN I/O request:
+ *
+ * FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ...
+ *
+ * An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and
+ * ACRN userspace are in charge of processing the others.
+ *
+ * Based on the states illustrated above, a typical lifecycle of an ACRN I/O
+ * request looks like:
+ *
+ * Flow (assume the initial state is FREE)
+ * |
+ * | Service VM vCPU 0 Service VM vCPU x User vCPU y
+ * |
+ * | hypervisor:
+ * | fills in type, addr, etc.
+ * | pauses the User VM vCPU y
+ * | sets the state to PENDING (a)
+ * | fires an upcall to Service VM
+ * |
+ * | HSM:
+ * | scans for PENDING requests
+ * | sets the states to PROCESSING (b)
+ * | assigns the requests to clients (c)
+ * V
+ * | client:
+ * | scans for the assigned requests
+ * | handles the requests (d)
+ * | HSM:
+ * | sets states to COMPLETE
+ * | notifies the hypervisor
+ * |
+ * | hypervisor:
+ * | resumes User VM vCPU y (e)
+ * |
+ * | hypervisor:
+ * | post handling (f)
+ * V sets states to FREE
+ *
+ * Note that the procedures (a) to (f) in the illustration above must be
+ * processed strictly in order. One vCPU cannot trigger another I/O
+ * emulation request before completing the previous one.
+ *
+ * Atomic operations and barriers are required when HSM and the hypervisor
+ * access the state of &struct acrn_io_request.
+ *
+ */
+struct acrn_io_request {
+ __u32 type;
+ __u32 completion_polling;
+ __u32 reserved0[14];
+ union {
+ struct acrn_pio_request pio_request;
+ struct acrn_mmio_request mmio_request;
+ __u64 data[8];
+ } reqs;
+ __u32 reserved1;
+ __u32 kernel_handled;
+ __u32 processed;
+} __attribute__((aligned(256)));
+
+struct acrn_io_request_buffer {
+ union {
+ struct acrn_io_request req_slot[ACRN_IO_REQUEST_MAX];
+ __u8 reserved[4096];
+ };
+};
+
+/**
+ * struct acrn_ioreq_notify - The structure of ioreq completion notification
+ * @vmid: User VM ID
+ * @reserved: Reserved
+ * @vcpu: vCPU ID
+ */
+struct acrn_ioreq_notify {
+ __u16 vmid;
+ __u16 reserved;
+ __u32 vcpu;
+} __attribute__((aligned(8)));
+
/**
* struct acrn_api_version - ACRN Hypervisor API version.
* @major_version: Major version of ACRN Hypervisor API.
@@ -183,6 +306,17 @@ struct acrn_vm_memmap {
#define ACRN_IOCTL_SET_VCPU_REGS \
_IOW(ACRN_IOCTL_TYPE, 0x16, struct acrn_vcpu_regs)

+#define ACRN_IOCTL_NOTIFY_REQUEST_FINISH \
+ _IOW(ACRN_IOCTL_TYPE, 0x31, struct acrn_ioreq_notify)
+#define ACRN_IOCTL_CREATE_IOREQ_CLIENT \
+ _IO(ACRN_IOCTL_TYPE, 0x32)
+#define ACRN_IOCTL_ATTACH_IOREQ_CLIENT \
+ _IO(ACRN_IOCTL_TYPE, 0x33)
+#define ACRN_IOCTL_DESTROY_IOREQ_CLIENT \
+ _IO(ACRN_IOCTL_TYPE, 0x34)
+#define ACRN_IOCTL_CLEAR_VM_IOREQ \
+ _IO(ACRN_IOCTL_TYPE, 0x35)
+
#define ACRN_IOCTL_SET_MEMSEG \
_IOW(ACRN_IOCTL_TYPE, 0x41, struct acrn_vm_memmap)
#define ACRN_IOCTL_UNSET_MEMSEG \
--
2.28.0
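
The request lifecycle documented in the struct acrn_io_request kernel-doc
above (FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE) can be modelled in
user space as a rough sketch. This is not the driver code: struct demo_slot
and the demo_* functions are hypothetical names introduced only for
illustration, the hypervisor side is simulated, and C11 acquire/release
atomics stand in for the kernel's smp_load_acquire()/smp_store_release().
Only the state values are taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* State values copied from the patch's include/uapi/linux/acrn.h. */
#define ACRN_IOREQ_STATE_PENDING    0
#define ACRN_IOREQ_STATE_COMPLETE   1
#define ACRN_IOREQ_STATE_PROCESSING 2
#define ACRN_IOREQ_STATE_FREE       3

/* Hypothetical, simplified slot; not the real struct acrn_io_request. */
struct demo_slot {
	uint64_t address;
	uint64_t value;
	_Atomic uint32_t processed;
};

/* Simulated hypervisor: fill in the payload, then publish it as PENDING. */
static void demo_hv_post(struct demo_slot *s, uint64_t address)
{
	s->address = address;
	atomic_store_explicit(&s->processed, ACRN_IOREQ_STATE_PENDING,
			      memory_order_release);
}

/* Dispatcher: claim a PENDING slot; returns false if there is nothing to do. */
static bool demo_hsm_dispatch(struct demo_slot *s)
{
	if (atomic_load_explicit(&s->processed, memory_order_acquire) !=
	    ACRN_IOREQ_STATE_PENDING)
		return false;
	atomic_store_explicit(&s->processed, ACRN_IOREQ_STATE_PROCESSING,
			      memory_order_release);
	return true;
}

/* Client: write the result first, then mark the slot COMPLETE. */
static void demo_client_complete(struct demo_slot *s, uint64_t value)
{
	s->value = value;
	atomic_store_explicit(&s->processed, ACRN_IOREQ_STATE_COMPLETE,
			      memory_order_release);
}

/* Simulated hypervisor: consume a COMPLETE slot and return it to FREE. */
static bool demo_hv_reap(struct demo_slot *s, uint64_t *value)
{
	if (atomic_load_explicit(&s->processed, memory_order_acquire) !=
	    ACRN_IOREQ_STATE_COMPLETE)
		return false;
	*value = s->value;
	atomic_store_explicit(&s->processed, ACRN_IOREQ_STATE_FREE,
			      memory_order_release);
	return true;
}
```

The release store in each step pairs with the acquire load in the next one,
so whoever observes the new state also observes the payload written before
it. That ordering is what the comments around smp_store_release() in the
patch rely on.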

2020-08-28 10:28:36

by Greg Kroah-Hartman

Subject: Re: [PATCH 05/17] virt: acrn: Introduce ACRN HSM basic driver

On Tue, Aug 25, 2020 at 10:45:05AM +0800, [email protected] wrote:
> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
> + unsigned long ioctl_param)
> +{
> + if (cmd == ACRN_IOCTL_GET_API_VERSION) {
> + if (copy_to_user((void __user *)ioctl_param,
> + &api_version, sizeof(api_version)))
> + return -EFAULT;

Why are you versioning your api? Shouldn't that not be a thing and you
either support an ioctl or you do not?



> + }
> +
> + return 0;
> +}
> +
> +static int acrn_dev_release(struct inode *inode, struct file *filp)
> +{
> + struct acrn_vm *vm = filp->private_data;
> +
> + kfree(vm);
> + return 0;
> +}
> +
> +static const struct file_operations acrn_fops = {
> + .owner = THIS_MODULE,
> + .open = acrn_dev_open,
> + .release = acrn_dev_release,
> + .unlocked_ioctl = acrn_dev_ioctl,
> +};
> +
> +static struct miscdevice acrn_dev = {
> + .minor = MISC_DYNAMIC_MINOR,
> + .name = "acrn_hsm",
> + .fops = &acrn_fops,
> +};
> +
> +static int __init hsm_init(void)
> +{
> + int ret;
> +
> + if (x86_hyper_type != X86_HYPER_ACRN)
> + return -ENODEV;
> +
> + if (!acrn_is_privileged_vm())
> + return -EPERM;
> +
> + ret = hcall_get_api_version(slow_virt_to_phys(&api_version));
> + if (ret < 0) {
> + pr_err("Failed to get API version from hypervisor!\n");
> + return ret;
> + }
> +
> + pr_info("API version is %u.%u\n",
> + api_version.major_version, api_version.minor_version);

Shouldn't drivers be quiet when they load and all goes well? pr_debug()?

And can't you defer the "read the version" call until open happens?
Does it have to happen at module load time, increasing boot time for no
good reason if there is not a user?

thanks,

greg k-h

2020-08-29 10:47:12

by Shuo Liu

Subject: Re: [PATCH 05/17] virt: acrn: Introduce ACRN HSM basic driver

Hi Greg,

On Fri 28.Aug'20 at 12:25:59 +0200, Greg Kroah-Hartman wrote:
>On Tue, Aug 25, 2020 at 10:45:05AM +0800, [email protected] wrote:
>> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
>> + unsigned long ioctl_param)
>> +{
>> + if (cmd == ACRN_IOCTL_GET_API_VERSION) {
>> + if (copy_to_user((void __user *)ioctl_param,
>> + &api_version, sizeof(api_version)))
>> + return -EFAULT;
>
>Why are you versioning your api? Shouldn't that not be a thing and you
>either support an ioctl or you do not?

The API version here is more for the hypercalls.
The hypercalls might evolve later and the version indicates which set of
interfaces (including the parameters' format) should be used by user space
tools. Currently, it's rarely used.

>
>
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int acrn_dev_release(struct inode *inode, struct file *filp)
>> +{
>> + struct acrn_vm *vm = filp->private_data;
>> +
>> + kfree(vm);
>> + return 0;
>> +}
>> +
>> +static const struct file_operations acrn_fops = {
>> + .owner = THIS_MODULE,
>> + .open = acrn_dev_open,
>> + .release = acrn_dev_release,
>> + .unlocked_ioctl = acrn_dev_ioctl,
>> +};
>> +
>> +static struct miscdevice acrn_dev = {
>> + .minor = MISC_DYNAMIC_MINOR,
>> + .name = "acrn_hsm",
>> + .fops = &acrn_fops,
>> +};
>> +
>> +static int __init hsm_init(void)
>> +{
>> + int ret;
>> +
>> + if (x86_hyper_type != X86_HYPER_ACRN)
>> + return -ENODEV;
>> +
>> + if (!acrn_is_privileged_vm())
>> + return -EPERM;
>> +
>> + ret = hcall_get_api_version(slow_virt_to_phys(&api_version));
>> + if (ret < 0) {
>> + pr_err("Failed to get API version from hypervisor!\n");
>> + return ret;
>> + }
>> +
>> + pr_info("API version is %u.%u\n",
>> + api_version.major_version, api_version.minor_version);
>
>Shouldn't drivers be quiet when they load and all goes well? pr_debug()?
>
>And can't you defer the "read the version" call until open happens?
>Does it have to happen at module load time, increasing boot time for no
>good reason if there is not a user?

OK. I can defer the version fetch and pr_debug() until open.

Thanks
shuo

2020-08-29 16:15:32

by Dave Hansen

Subject: Re: [PATCH 05/17] virt: acrn: Introduce ACRN HSM basic driver

On 8/29/20 3:46 AM, Shuo A Liu wrote:
> On Fri 28.Aug'20 at 12:25:59 +0200, Greg Kroah-Hartman wrote:
>> On Tue, Aug 25, 2020 at 10:45:05AM +0800, [email protected] wrote:
>>> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
>>> +               unsigned long ioctl_param)
>>> +{
>>> +    if (cmd == ACRN_IOCTL_GET_API_VERSION) {
>>> +        if (copy_to_user((void __user *)ioctl_param,
>>> +                 &api_version, sizeof(api_version)))
>>> +            return -EFAULT;
>>
>> Why are you versioning your api?  Shouldn't that not be a thing and you
>> either support an ioctl or you do not?
>
> The API version here is more for the hypercalls.
> The hypercalls might evolve later

They might evolve, but the old ones must always keep working. Right?

> and the version indicates which set of interfaces (including the
> parameters' format) should be used by user space tools. Currently,
> it's used rarely.

Why do you need this when the core kernel doesn't? We add syscalls,
ioctl()s and prctl()s all the time, but nothing is versioned.

This sounds like something you need to remove from the series.

2020-08-30 08:17:58

by Shuo Liu

Subject: Re: [PATCH 05/17] virt: acrn: Introduce ACRN HSM basic driver

Hi Dave,

On Sat 29.Aug'20 at 9:12:22 -0700, Dave Hansen wrote:
>On 8/29/20 3:46 AM, Shuo A Liu wrote:
>> On Fri 28.Aug'20 at 12:25:59 +0200, Greg Kroah-Hartman wrote:
>>> On Tue, Aug 25, 2020 at 10:45:05AM +0800, [email protected] wrote:
>>>> +static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
>>>> +               unsigned long ioctl_param)
>>>> +{
>>>> +    if (cmd == ACRN_IOCTL_GET_API_VERSION) {
>>>> +        if (copy_to_user((void __user *)ioctl_param,
>>>> +                 &api_version, sizeof(api_version)))
>>>> +            return -EFAULT;
>>>
>>> Why are you versioning your api?  Shouldn't that not be a thing and you
>>> either support an ioctl or you do not?
>>
>> The API version here is more for the hypercalls.
>> The hypercalls might evolve later
>
>They might evolve, but the old ones must always keep working. Right?

Yes, it's right.

>
>> and the version indicates which set of interfaces (including the
>> parameters' format) should be used by user space tools. Currently,
>> it's used rarely.
>Why do you need this when the core kernel doesn't? We add syscalls,
>ioctl()s and prctl()s all the time, but nothing is versioned.

Indeed. It looks a bit odd.

>
>This sounds like something you need to remove from the series.

OK. I will remove the api version related code.

Thanks
shuo
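
The alternative the reviewers point to, where user space checks whether a
given ioctl is supported instead of comparing API version numbers, could
look roughly like the sketch below. It is only an illustration under stated
assumptions: demo_probe(), demo_ioctl_fn and demo_fake_ioctl() are
hypothetical names, and the ioctl call is passed in as a function pointer so
that the logic can be exercised without the acrn_hsm device. An unknown
command conventionally fails with ENOTTY, while the ioctl handler in this
series returns -EINVAL for unknown commands, so the sketch treats both as
"not supported". Treating EINVAL that way is itself an assumption, since it
can also mean bad arguments.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for ioctl(fd, cmd, arg), to keep the probe testable. */
typedef int (*demo_ioctl_fn)(int fd, unsigned long cmd, void *arg);

enum demo_support { DEMO_SUPPORTED, DEMO_UNSUPPORTED, DEMO_ERROR };

/* Issue the command once and classify the outcome by errno. */
static enum demo_support demo_probe(demo_ioctl_fn do_ioctl, int fd,
				    unsigned long cmd, void *arg)
{
	errno = 0;
	if (do_ioctl(fd, cmd, arg) == 0)
		return DEMO_SUPPORTED;
	if (errno == ENOTTY || errno == EINVAL)
		return DEMO_UNSUPPORTED;
	return DEMO_ERROR;
}

/* Fake driver for illustration: accepts only commands 0x31 and 0x32. */
static int demo_fake_ioctl(int fd, unsigned long cmd, void *arg)
{
	(void)arg;
	if (fd < 0) {
		errno = EBADF;
		return -1;
	}
	if (cmd == 0x31 || cmd == 0x32)
		return 0;
	errno = ENOTTY;
	return -1;
}
```

Because a successful probe actually executes the command, this pattern fits
best where the caller intends to perform the operation anyway and simply
falls back when it turns out to be unsupported.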