2014-02-08 17:31:38

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 00/10] VFIO support for platform devices

v4 of this series is functionally identical to v3 for VFIO_PLATFORM. The only
change is the inclusion of Kim Phillips' patch to expose driver_probe_device()
and the implementation of a binding mechanism for arbitrary devices via a file
in sysfs. The latter has been folded in the skeleton patch (04/10).

This patch series aims to implement VFIO support for platform devices that
reside behind an IOMMU. Examples of such devices are devices behind an ARM SMMU,
or behind a Samsung Exynos System MMU.

The API used is based on the existing VFIO API that is also used with PCI
devices. Only devices that include a basic set of IRQs and memory regions are
targeted; devices with complex relationships with other devices on a device tree
are not taken into account at this stage.

A copy with all the dependencies applied can be cloned from branch
vfio-platform-v4 at [email protected]:virtualopensystems/linux-kvm-arm.git

This code can also be tested on ARM FastModels using the following test cases:
- A user space implementation via VFIO for the PL330 on the FastModels:
[email protected]:virtualopensystems/pl330-vfio-test.git
- A QEMU prototype, also based on the PL330:
[email protected]:virtualopensystems/qemu.git pl330-vfio-dev

We have written detailed instructions on how to build and run these tests:
http://www.virtualopensystems.com/en/solutions/guides/vfio-on-arm/

The following IOCTLs have been found to be working on FastModels with an
ARM SMMU (MMU400). Testing was based on the ARM PL330 DMA Controller featured
on those models.
- VFIO_GET_API_VERSION
- VFIO_CHECK_EXTENSION

The TYPE1 fix proposed here enables the following IOCTLs:
- VFIO_GROUP_GET_STATUS
- VFIO_GROUP_SET_CONTAINER
- VFIO_SET_IOMMU
- VFIO_IOMMU_GET_INFO
- VFIO_IOMMU_MAP_DMA
For this ioctl specifically, a new flag has been added:
VFIO_DMA_MAP_FLAG_EXEC. This flag is taken into account on systems with an
ARM SMMU.

The VFIO platform driver proposed here implements the following:
- VFIO_GROUP_GET_DEVICE_FD
- VFIO_DEVICE_GET_INFO
- VFIO_DEVICE_GET_REGION_INFO
- VFIO_DEVICE_GET_IRQ_INFO
- VFIO_DEVICE_SET_IRQS
IRQs are implemented partially using this ioctl. Handling incoming
interrupts with an eventfd is supported, as is masking and unmasking.
Level sensitive interrupts are automasked. What is not implemented is
masking/unmasking via eventfd.

In addition, the VFIO platform driver implements the following through
the VFIO device file descriptor:
- MMAPing memory regions to the virtual address space of the VFIO user.
- Read / write of memory regions directly through the file descriptor.

What still needs to be done, includes:
- Eventfd for masking/unmasking
- Extend the driver and API for device tree metadata
- QEMU / KVM integration
- Device specific functionality (e.g. VFIO_DEVICE_RESET)
- Improve VFIO_IOMMU_TYPE1 driver to support multiple buses at the same time
- Bind to ARM AMBA devices
- IOMMUs with nested page tables (Stage 1 & 2 translation on ARM SMMUs)

Changes since v3:
- Use Kim Phillips' driver_probe_device()
Changes since v2:
- Fixed Read/Write and MMAP on device regions
- Removed dependency on Device Tree
- Interrupts support
- Interrupt masking/unmasking
- Automask level sensitive interrupts
- Introduced VFIO_DMA_MAP_FLAG_EXEC
- Code clean ups

Antonios Motakis (9):
VFIO_IOMMU_TYPE1: Introduce the VFIO_DMA_MAP_FLAG_EXEC flag
VFIO_IOMMU_TYPE1: workaround to build for platform devices
VFIO_PLATFORM: Initial skeleton of VFIO support for platform devices
VFIO_PLATFORM: Return info for device and its memory mapped IO regions
VFIO_PLATFORM: Read and write support for the device fd
VFIO_PLATFORM: Support MMAP of MMIO regions
VFIO_PLATFORM: Return IRQ info
VFIO_PLATFORM: Initial interrupts support
VFIO_PLATFORM: Support for maskable and automasked interrupts

Kim Phillips (1):
driver core: export driver_probe_device()

drivers/base/base.h | 1 -
drivers/base/dd.c | 1 +
drivers/vfio/Kconfig | 3 +-
drivers/vfio/Makefile | 1 +
drivers/vfio/platform/Kconfig | 9 +
drivers/vfio/platform/Makefile | 4 +
drivers/vfio/platform/vfio_platform.c | 493 ++++++++++++++++++++++++++
drivers/vfio/platform/vfio_platform_irq.c | 289 +++++++++++++++
drivers/vfio/platform/vfio_platform_private.h | 50 +++
drivers/vfio/vfio_iommu_type1.c | 27 +-
include/linux/device.h | 1 +
include/uapi/linux/vfio.h | 2 +
12 files changed, 874 insertions(+), 7 deletions(-)
create mode 100644 drivers/vfio/platform/Kconfig
create mode 100644 drivers/vfio/platform/Makefile
create mode 100644 drivers/vfio/platform/vfio_platform.c
create mode 100644 drivers/vfio/platform/vfio_platform_irq.c
create mode 100644 drivers/vfio/platform/vfio_platform_private.h

--
1.8.3.2


2014-02-08 17:32:10

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 02/10] VFIO_IOMMU_TYPE1: Introduce the VFIO_DMA_MAP_FLAG_EXEC flag

The ARM SMMU driver expects the IOMMU_EXEC flag, otherwise it will
set the page tables for a device as XN (execute never). This affects
devices such as the ARM PL330 DMA Controller, which fails to operate
if the XN flag is set on the memory it tries to fetch its instructions
from.

We introduce the VFIO_DMA_MAP_FLAG_EXEC to VFIO, and use it in
VFIO_IOMMU_TYPE1 to set the IOMMU_EXEC flag. This way the user can
control whether the XN flag will be set on the requested mappings.

Signed-off-by: Antonios Motakis <[email protected]>
---
drivers/vfio/vfio_iommu_type1.c | 5 ++++-
include/uapi/linux/vfio.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 4fb7a8f..ad7a1f6 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -557,6 +557,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
prot |= IOMMU_WRITE;
if (map->flags & VFIO_DMA_MAP_FLAG_READ)
prot |= IOMMU_READ;
+ if (map->flags & VFIO_DMA_MAP_FLAG_EXEC)
+ prot |= IOMMU_EXEC;

if (!prot)
return -EINVAL; /* No READ/WRITE? */
@@ -865,7 +867,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
} else if (cmd == VFIO_IOMMU_MAP_DMA) {
struct vfio_iommu_type1_dma_map map;
uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
- VFIO_DMA_MAP_FLAG_WRITE;
+ VFIO_DMA_MAP_FLAG_WRITE |
+ VFIO_DMA_MAP_FLAG_EXEC;

minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0fd47f5..d8e9e99 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -392,6 +392,7 @@ struct vfio_iommu_type1_dma_map {
__u32 flags;
#define VFIO_DMA_MAP_FLAG_READ (1 << 0) /* readable from device */
#define VFIO_DMA_MAP_FLAG_WRITE (1 << 1) /* writable from device */
+#define VFIO_DMA_MAP_FLAG_EXEC (1 << 2) /* executable from device */
__u64 vaddr; /* Process virtual address */
__u64 iova; /* IO virtual address */
__u64 size; /* Size of mapping (bytes) */
--
1.8.3.2

2014-02-08 17:32:06

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 01/10] driver core: export driver_probe_device()

From: Kim Phillips <[email protected]>

Needed by drivers, such as the vfio platform driver [1], seeking to
bypass bind_store()'s driver_match_device(), and bind to any device
via a private sysfs bind file.

[1] https://lkml.org/lkml/2013/12/11/522

note: the EXPORT_SYMBOL is needed because vfio-platform can be built
as a module.

Signed-off-by: Kim Phillips <[email protected]>
---
drivers/base/base.h | 1 -
drivers/base/dd.c | 1 +
include/linux/device.h | 1 +
3 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 24f4242..fe25ad87 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -112,7 +112,6 @@ extern int bus_add_driver(struct device_driver *drv);
extern void bus_remove_driver(struct device_driver *drv);

extern void driver_detach(struct device_driver *drv);
-extern int driver_probe_device(struct device_driver *drv, struct device *dev);
extern void driver_deferred_probe_del(struct device *dev);
static inline int driver_match_device(struct device_driver *drv,
struct device *dev)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 0605176..44f6184 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -384,6 +384,7 @@ int driver_probe_device(struct device_driver *drv, struct device *dev)

return ret;
}
+EXPORT_SYMBOL_GPL(driver_probe_device);

static int __device_attach(struct device_driver *drv, void *data)
{
diff --git a/include/linux/device.h b/include/linux/device.h
index 952b010..ad80dd2 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -257,6 +257,7 @@ extern struct device_driver *driver_find(const char *name,
struct bus_type *bus);
extern int driver_probe_done(void);
extern void wait_for_device_probe(void);
+extern int driver_probe_device(struct device_driver *drv, struct device *dev);


/* sysfs interface for exporting driver attributes */
--
1.8.3.2

2014-02-08 17:32:15

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 03/10] VFIO_IOMMU_TYPE1: workaround to build for platform devices

This is a workaround to make the VFIO_IOMMU_TYPE1 driver usable with
platform devices instead of PCI. A future permanent fix should support
both. This is required in order to use the Exynos SMMU, or ARM SMMU
driver with VFIO.

Signed-off-by: Antonios Motakis <[email protected]>
---
drivers/vfio/Kconfig | 2 +-
drivers/vfio/vfio_iommu_type1.c | 22 ++++++++++++++++++----
2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 26b3d9d..bd50721 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -11,7 +11,7 @@ config VFIO_IOMMU_SPAPR_TCE
menuconfig VFIO
tristate "VFIO Non-Privileged userspace driver framework"
depends on IOMMU_API
- select VFIO_IOMMU_TYPE1 if X86
+ select VFIO_IOMMU_TYPE1 if X86 || ARM
select VFIO_IOMMU_SPAPR_TCE if (PPC_POWERNV || PPC_PSERIES)
help
VFIO provides a framework for secure userspace device drivers.
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index ad7a1f6..81e65f4 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -30,7 +30,8 @@
#include <linux/iommu.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/pci.h> /* pci_bus_type */
+#include <linux/pci.h> /* pci_bus_type */
+#include <linux/platform_device.h> /* platform_bus_type */
#include <linux/rbtree.h>
#include <linux/sched.h>
#include <linux/slab.h>
@@ -47,6 +48,8 @@ module_param_named(allow_unsafe_interrupts,
allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(allow_unsafe_interrupts,
"Enable VFIO IOMMU support for on platforms without interrupt remapping support.");
+static struct bus_type *iommu_bus_type = NULL;
+static bool require_cap_intr_remap = false;

static bool disable_hugepages;
module_param_named(disable_hugepages,
@@ -785,7 +788,8 @@ static void *vfio_iommu_type1_open(unsigned long arg)
/*
* Wish we didn't have to know about bus_type here.
*/
- iommu->domain = iommu_domain_alloc(&pci_bus_type);
+ iommu->domain = iommu_domain_alloc(iommu_bus_type);
+
if (!iommu->domain) {
kfree(iommu);
return ERR_PTR(-EIO);
@@ -797,7 +801,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
* the way. Fortunately we know interrupt remapping is global for
* our iommus.
*/
- if (!allow_unsafe_interrupts &&
+ if (require_cap_intr_remap && !allow_unsafe_interrupts &&
!iommu_domain_has_cap(iommu->domain, IOMMU_CAP_INTR_REMAP)) {
pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
__func__);
@@ -914,7 +918,17 @@ static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {

static int __init vfio_iommu_type1_init(void)
{
- if (!iommu_present(&pci_bus_type))
+#ifdef CONFIG_PCI
+ if (iommu_present(&pci_bus_type)) {
+ iommu_bus_type = &pci_bus_type;
+ /* For PCI targets, IOMMU_CAP_INTR_REMAP is required */
+ require_cap_intr_remap = true;
+ }
+#endif
+ if (!iommu_bus_type && iommu_present(&platform_bus_type))
+ iommu_bus_type = &platform_bus_type;
+
+ if(!iommu_bus_type)
return -ENODEV;

return vfio_register_iommu_driver(&vfio_iommu_driver_ops_type1);
--
1.8.3.2

2014-02-08 17:33:19

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 04/10] VFIO_PLATFORM: Initial skeleton of VFIO support for platform devices

This patch forms the skeleton for platform devices support with VFIO.
We implement a new 'vfio_bind' sysfs file, that, once written
with a device ID, binds the platform driver to the device directly.
Bypassing the driver core's traditional bind file allows this
platform driver to bind to any device via sysfs.

Signed-off-by: Antonios Motakis <[email protected]>
Signed-off-by: Kim Phillips <[email protected]>
---
drivers/vfio/Kconfig | 1 +
drivers/vfio/Makefile | 1 +
drivers/vfio/platform/Kconfig | 9 ++
drivers/vfio/platform/Makefile | 4 +
drivers/vfio/platform/vfio_platform.c | 219 ++++++++++++++++++++++++++
drivers/vfio/platform/vfio_platform_private.h | 22 +++
include/uapi/linux/vfio.h | 1 +
7 files changed, 257 insertions(+)
create mode 100644 drivers/vfio/platform/Kconfig
create mode 100644 drivers/vfio/platform/Makefile
create mode 100644 drivers/vfio/platform/vfio_platform.c
create mode 100644 drivers/vfio/platform/vfio_platform_private.h

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index bd50721..8156fcf 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -20,3 +20,4 @@ menuconfig VFIO
If you don't know what to do here, say N.

source "drivers/vfio/pci/Kconfig"
+source "drivers/vfio/platform/Kconfig"
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 72bfabc..b5e4a33 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -2,3 +2,4 @@ obj-$(CONFIG_VFIO) += vfio.o
obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
obj-$(CONFIG_VFIO_PCI) += pci/
+obj-$(CONFIG_VFIO_PLATFORM) += platform/
diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
new file mode 100644
index 0000000..42b0022
--- /dev/null
+++ b/drivers/vfio/platform/Kconfig
@@ -0,0 +1,9 @@
+config VFIO_PLATFORM
+ tristate "VFIO support for platform devices"
+ depends on VFIO && EVENTFD
+ help
+ Support for platform devices with VFIO. This is required to make
+ use of platform devices present on the system using the VFIO
+ framework.
+
+ If you don't know what to do here, say N.
diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
new file mode 100644
index 0000000..df3a014
--- /dev/null
+++ b/drivers/vfio/platform/Makefile
@@ -0,0 +1,4 @@
+
+vfio-platform-y := vfio_platform.o
+
+obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
new file mode 100644
index 0000000..a3d8f29
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -0,0 +1,219 @@
+/*
+ * Copyright (C) 2013 - Virtual Open Systems
+ * Author: Antonios Motakis <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/device.h>
+#include <linux/eventfd.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/irq.h>
+
+#include "vfio_platform_private.h"
+
+#define DRIVER_VERSION "0.2"
+#define DRIVER_AUTHOR "Antonios Motakis <[email protected]>"
+#define DRIVER_DESC "VFIO for platform devices - User Level meta-driver"
+
+static void vfio_platform_release(void *device_data)
+{
+ module_put(THIS_MODULE);
+}
+
+static int vfio_platform_open(void *device_data)
+{
+ if (!try_module_get(THIS_MODULE))
+ return -ENODEV;
+
+ return 0;
+}
+
+static long vfio_platform_ioctl(void *device_data,
+ unsigned int cmd, unsigned long arg)
+{
+ struct vfio_platform_device *vdev = device_data;
+ unsigned long minsz;
+
+ if (cmd == VFIO_DEVICE_GET_INFO) {
+ struct vfio_device_info info;
+
+ minsz = offsetofend(struct vfio_device_info, num_irqs);
+
+ if (copy_from_user(&info, (void __user *)arg, minsz))
+ return -EFAULT;
+
+ if (info.argsz < minsz)
+ return -EINVAL;
+
+ info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
+ info.num_regions = 0;
+ info.num_irqs = 0;
+
+ return copy_to_user((void __user *)arg, &info, minsz);
+
+ } else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
+ return -EINVAL;
+
+ else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
+ return -EINVAL;
+
+ else if (cmd == VFIO_DEVICE_SET_IRQS)
+ return -EINVAL;
+
+ else if (cmd == VFIO_DEVICE_RESET)
+ return -EINVAL;
+
+ return -ENOTTY;
+}
+
+static ssize_t vfio_platform_read(void *device_data, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ return 0;
+}
+
+static ssize_t vfio_platform_write(void *device_data, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ return 0;
+}
+
+static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
+{
+ return -EINVAL;
+}
+
+static const struct vfio_device_ops vfio_platform_ops = {
+ .name = "vfio-platform",
+ .open = vfio_platform_open,
+ .release = vfio_platform_release,
+ .ioctl = vfio_platform_ioctl,
+ .read = vfio_platform_read,
+ .write = vfio_platform_write,
+ .mmap = vfio_platform_mmap,
+};
+
+static ssize_t vfio_bind_store(struct device_driver *driver, const char *buf,
+ size_t count)
+{
+ struct device *dev;
+ int ret;
+
+ dev = bus_find_device_by_name(&platform_bus_type, NULL, buf);
+ if (!dev)
+ return -ENODEV;
+
+ device_lock(dev);
+ ret = driver_probe_device(driver, dev);
+ device_unlock(dev);
+ if (ret > 0) {
+ /* success */
+ ret = count;
+ }
+
+ return ret;
+}
+static DRIVER_ATTR_WO(vfio_bind);
+
+static int vfio_platform_probe(struct platform_device *pdev)
+{
+ struct vfio_platform_device *vdev;
+ struct iommu_group *group;
+ int ret;
+
+ group = iommu_group_get(&pdev->dev);
+ if (!group) {
+ pr_err("VFIO: No IOMMU group for device %s\n", pdev->name);
+ return -EINVAL;
+ }
+
+ vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+ if (!vdev) {
+ iommu_group_put(group);
+ return -ENOMEM;
+ }
+
+ vdev->pdev = pdev;
+
+ ret = vfio_add_group_dev(&pdev->dev, &vfio_platform_ops, vdev);
+ if (ret) {
+ iommu_group_put(group);
+ kfree(vdev);
+ }
+
+ return ret;
+}
+
+static int vfio_platform_remove(struct platform_device *pdev)
+{
+ struct vfio_platform_device *vdev;
+
+ vdev = vfio_del_group_dev(&pdev->dev);
+ if (!vdev)
+ return -EINVAL;
+
+ iommu_group_put(pdev->dev.iommu_group);
+ kfree(vdev);
+
+ return 0;
+}
+
+static struct platform_driver vfio_platform_driver = {
+ .probe = vfio_platform_probe,
+ .remove = vfio_platform_remove,
+ .driver = {
+ .name = "vfio-platform",
+ .owner = THIS_MODULE,
+ },
+};
+
+static int __init vfio_platform_driver_init(void)
+{
+ int ret;
+
+ ret = platform_driver_register(&vfio_platform_driver);
+ if (ret) {
+ pr_err("Failed to register vfio platform driver, error: %d\n",
+ ret);
+ return ret;
+ }
+
+ ret = driver_create_file(&vfio_platform_driver.driver,
+ &driver_attr_vfio_bind);
+ if (ret)
+ pr_err("Failed to create vfio_bind file, error: %d\n", ret);
+
+ return ret;
+}
+
+static void __exit vfio_platform_driver_exit(void)
+{
+ platform_driver_unregister(&vfio_platform_driver);
+}
+
+module_init(vfio_platform_driver_init);
+module_exit(vfio_platform_driver_exit);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
new file mode 100644
index 0000000..6df8084
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2013 - Virtual Open Systems
+ * Author: Antonios Motakis <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef VFIO_PLATFORM_PRIVATE_H
+#define VFIO_PLATFORM_PRIVATE_H
+
+struct vfio_platform_device {
+ struct platform_device *pdev;
+};
+
+#endif /* VFIO_PCI_PRIVATE_H */
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index d8e9e99..51ebf65 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -148,6 +148,7 @@ struct vfio_device_info {
__u32 flags;
#define VFIO_DEVICE_FLAGS_RESET (1 << 0) /* Device supports reset */
#define VFIO_DEVICE_FLAGS_PCI (1 << 1) /* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2) /* vfio-platform device */
__u32 num_regions; /* Max region index + 1 */
__u32 num_irqs; /* Max IRQ index + 1 */
};
--
1.8.3.2

2014-02-08 17:33:29

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 05/10] VFIO_PLATFORM: Return info for device and its memory mapped IO regions

A VFIO userspace driver will start by opening the VFIO device
that corresponds to an IOMMU group, and will use the ioctl interface
to get the basic device info, such as number of memory regions and
interrupts, and their properties.

This patch enables the IOCTLs:
- VFIO_DEVICE_GET_INFO
- VFIO_DEVICE_GET_REGION_INFO

IRQ info is provided by one of the latter patches.

Signed-off-by: Antonios Motakis <[email protected]>
---
drivers/vfio/platform/vfio_platform.c | 74 ++++++++++++++++++++++++---
drivers/vfio/platform/vfio_platform_private.h | 8 +++
2 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index a3d8f29..f7db5c0 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -34,15 +34,62 @@
#define DRIVER_AUTHOR "Antonios Motakis <[email protected]>"
#define DRIVER_DESC "VFIO for platform devices - User Level meta-driver"

+static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
+{
+ int cnt = 0, i;
+
+ while (platform_get_resource(vdev->pdev, IORESOURCE_MEM, cnt))
+ cnt++;
+
+ vdev->num_regions = cnt;
+
+ vdev->region = kzalloc(sizeof(struct vfio_platform_region) * cnt,
+ GFP_KERNEL);
+ if (!vdev->region)
+ return -ENOMEM;
+
+ for (i = 0; i < cnt; i++) {
+ struct vfio_platform_region region;
+ struct resource *res =
+ platform_get_resource(vdev->pdev, IORESOURCE_MEM, i);
+
+ region.addr = res->start;
+ region.size = resource_size(res);
+ region.flags = 0;
+
+ vdev->region[i] = region;
+ }
+
+ return 0;
+}
+
+static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev)
+{
+ kfree(vdev->region);
+}
+
static void vfio_platform_release(void *device_data)
{
+ struct vfio_platform_device *vdev = device_data;
+
+ vfio_platform_regions_cleanup(vdev);
+
module_put(THIS_MODULE);
}

static int vfio_platform_open(void *device_data)
{
- if (!try_module_get(THIS_MODULE))
+ struct vfio_platform_device *vdev = device_data;
+ int ret;
+
+ ret = vfio_platform_regions_init(vdev);
+ if (ret)
+ return ret;
+
+ if (!try_module_get(THIS_MODULE)) {
+ vfio_platform_regions_cleanup(vdev);
return -ENODEV;
+ }

return 0;
}
@@ -65,18 +112,33 @@ static long vfio_platform_ioctl(void *device_data,
return -EINVAL;

info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
- info.num_regions = 0;
+ info.num_regions = vdev->num_regions;
info.num_irqs = 0;

return copy_to_user((void __user *)arg, &info, minsz);

- } else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
- return -EINVAL;
+ } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
+ struct vfio_region_info info;
+
+ minsz = offsetofend(struct vfio_region_info, offset);
+
+ if (copy_from_user(&info, (void __user *)arg, minsz))
+ return -EFAULT;
+
+ if (info.argsz < minsz)
+ return -EINVAL;
+
+ /* map offset to the physical address */
+ info.offset = vdev->region[info.index].addr;
+ info.size = vdev->region[info.index].size;
+ info.flags = vdev->region[info.index].flags;
+
+ return copy_to_user((void __user *)arg, &info, minsz);

- else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
+ } else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
return -EINVAL;

- else if (cmd == VFIO_DEVICE_SET_IRQS)
+ } else if (cmd == VFIO_DEVICE_SET_IRQS)
return -EINVAL;

else if (cmd == VFIO_DEVICE_RESET)
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 6df8084..4705aa5 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -15,8 +15,16 @@
#ifndef VFIO_PLATFORM_PRIVATE_H
#define VFIO_PLATFORM_PRIVATE_H

+struct vfio_platform_region {
+ u64 addr;
+ resource_size_t size;
+ u32 flags;
+};
+
struct vfio_platform_device {
struct platform_device *pdev;
+ struct vfio_platform_region *region;
+ u32 num_regions;
};

#endif /* VFIO_PCI_PRIVATE_H */
--
1.8.3.2

2014-02-08 17:33:37

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 06/10] VFIO_PLATFORM: Read and write support for the device fd

VFIO returns a file descriptor which we can use to manipulate the memory
regions of the device. Since some memory regions we cannot mmap due to
security concerns, we also allow to read and write to this file descriptor
directly.

Signed-off-by: Antonios Motakis <[email protected]>
Tested-by: Alvise Rigo <[email protected]>
---
drivers/vfio/platform/vfio_platform.c | 128 +++++++++++++++++++++++++++++++++-
1 file changed, 125 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index f7db5c0..ee96078 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -55,7 +55,8 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)

region.addr = res->start;
region.size = resource_size(res);
- region.flags = 0;
+ region.flags = VFIO_REGION_INFO_FLAG_READ
+ | VFIO_REGION_INFO_FLAG_WRITE;

vdev->region[i] = region;
}
@@ -150,13 +151,134 @@ static long vfio_platform_ioctl(void *device_data,
static ssize_t vfio_platform_read(void *device_data, char __user *buf,
size_t count, loff_t *ppos)
{
- return 0;
+ struct vfio_platform_device *vdev = device_data;
+ unsigned int *io;
+ int i;
+
+ for (i = 0; i < vdev->num_regions; i++) {
+ struct vfio_platform_region region = vdev->region[i];
+ unsigned int done = 0;
+ loff_t off;
+
+ if ((*ppos < region.addr)
+ || (*ppos + count - 1) >= (region.addr + region.size))
+ continue;
+
+ io = ioremap_nocache(region.addr, region.size);
+
+ off = *ppos - region.addr;
+
+ while (count) {
+ size_t filled;
+
+ if (count >= 4 && !(off % 4)) {
+ u32 val;
+
+ val = ioread32(io + off);
+ if (copy_to_user(buf, &val, 4))
+ goto err;
+
+ filled = 4;
+ } else if (count >= 2 && !(off % 2)) {
+ u16 val;
+
+ val = ioread16(io + off);
+ if (copy_to_user(buf, &val, 2))
+ goto err;
+
+ filled = 2;
+ } else {
+ u8 val;
+
+ val = ioread8(io + off);
+ if (copy_to_user(buf, &val, 1))
+ goto err;
+
+ filled = 1;
+ }
+
+
+ count -= filled;
+ done += filled;
+ off += filled;
+ buf += filled;
+ }
+
+ iounmap(io);
+ return done;
+ }
+
+ return -EFAULT;
+
+err:
+ iounmap(io);
+ return -EFAULT;
}

static ssize_t vfio_platform_write(void *device_data, const char __user *buf,
size_t count, loff_t *ppos)
{
- return 0;
+ struct vfio_platform_device *vdev = device_data;
+ unsigned int *io;
+ int i;
+
+ for (i = 0; i < vdev->num_regions; i++) {
+ struct vfio_platform_region region = vdev->region[i];
+ unsigned int done = 0;
+ loff_t off;
+
+ if ((*ppos < region.addr)
+ || (*ppos + count - 1) >= (region.addr + region.size))
+ continue;
+
+ io = ioremap_nocache(region.addr, region.size);
+
+ off = *ppos - region.addr;
+
+ while (count) {
+ size_t filled;
+
+ if (count >= 4 && !(off % 4)) {
+ u32 val;
+
+ if (copy_from_user(&val, buf, 4))
+ goto err;
+ iowrite32(val, io + off);
+
+ filled = 4;
+ } else if (count >= 2 && !(off % 2)) {
+ u16 val;
+
+ if (copy_from_user(&val, buf, 2))
+ goto err;
+ iowrite16(val, io + off);
+
+ filled = 2;
+ } else {
+ u8 val;
+
+ if (copy_from_user(&val, buf, 1))
+ goto err;
+ iowrite8(val, io + off);
+
+ filled = 1;
+ }
+
+ count -= filled;
+ done += filled;
+ off += filled;
+ buf += filled;
+ }
+
+ iounmap(io);
+ return done;
+ }
+
+ return -EINVAL;
+
+err:
+ iounmap(io);
+ return -EFAULT;
}

static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
--
1.8.3.2

2014-02-08 17:33:44

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 07/10] VFIO_PLATFORM: Support MMAP of MMIO regions

Allow to memory map the MMIO regions of the device so userspace can
directly access them.

Signed-off-by: Antonios Motakis <[email protected]>
Tested-by: Alvise Rigo <[email protected]>
---
drivers/vfio/platform/vfio_platform.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index ee96078..6b4b033 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -58,6 +58,11 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
region.flags = VFIO_REGION_INFO_FLAG_READ
| VFIO_REGION_INFO_FLAG_WRITE;

+ /* Only regions addressed with PAGE granularity can be MMAPed
+ * securely. */
+ if (!(region.addr & ~PAGE_MASK) && !(region.size & ~PAGE_MASK))
+ region.flags |= VFIO_REGION_INFO_FLAG_MMAP;
+
vdev->region[i] = region;
}

@@ -283,6 +288,34 @@ err:

static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
{
+ struct vfio_platform_device *vdev = device_data;
+ u64 req_len = vma->vm_end - vma->vm_start;
+ u64 addr = vma->vm_pgoff << PAGE_SHIFT;
+ int i;
+
+ if (vma->vm_end < vma->vm_start)
+ return -EINVAL;
+ if (vma->vm_start & ~PAGE_MASK)
+ return -EINVAL;
+ if (vma->vm_end & ~PAGE_MASK)
+ return -EINVAL;
+ if ((vma->vm_flags & VM_SHARED) == 0)
+ return -EINVAL;
+
+ for (i = 0; i < vdev->num_regions; i++) {
+ struct vfio_platform_region region = vdev->region[i];
+
+ if ((addr < region.addr)
+ || (addr + req_len - 1) >= (region.addr + region.size))
+ continue;
+
+ vma->vm_private_data = vdev;
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+ return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
+ req_len, vma->vm_page_prot);
+ }
+
return -EINVAL;
}

--
1.8.3.2

2014-02-08 17:33:59

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 08/10] VFIO_PLATFORM: Return IRQ info

Return information for the interrupts exposed by the device.
This patch extends VFIO_DEVICE_GET_INFO with the number of IRQs
and enables VFIO_DEVICE_GET_IRQ_INFO

Signed-off-by: Antonios Motakis <[email protected]>
---
drivers/vfio/platform/Makefile | 2 +-
drivers/vfio/platform/vfio_platform.c | 35 +++++++++++++--
drivers/vfio/platform/vfio_platform_irq.c | 63 +++++++++++++++++++++++++++
drivers/vfio/platform/vfio_platform_private.h | 11 +++++
4 files changed, 106 insertions(+), 5 deletions(-)
create mode 100644 drivers/vfio/platform/vfio_platform_irq.c

diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
index df3a014..2c53327 100644
--- a/drivers/vfio/platform/Makefile
+++ b/drivers/vfio/platform/Makefile
@@ -1,4 +1,4 @@

-vfio-platform-y := vfio_platform.o
+vfio-platform-y := vfio_platform.o vfio_platform_irq.o

obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 6b4b033..ef1ac17 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -79,6 +79,7 @@ static void vfio_platform_release(void *device_data)
struct vfio_platform_device *vdev = device_data;

vfio_platform_regions_cleanup(vdev);
+ vfio_platform_irq_cleanup(vdev);

module_put(THIS_MODULE);
}
@@ -92,12 +93,22 @@ static int vfio_platform_open(void *device_data)
if (ret)
return ret;

+ ret = vfio_platform_irq_init(vdev);
+ if (ret)
+ goto err_irq;
+
if (!try_module_get(THIS_MODULE)) {
- vfio_platform_regions_cleanup(vdev);
- return -ENODEV;
+ ret = -ENODEV;
+ goto err_mod;
}

return 0;
+
+err_mod:
+ vfio_platform_irq_cleanup(vdev);
+err_irq:
+ vfio_platform_regions_cleanup(vdev);
+ return ret;
}

static long vfio_platform_ioctl(void *device_data,
@@ -119,7 +130,7 @@ static long vfio_platform_ioctl(void *device_data,

info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
info.num_regions = vdev->num_regions;
- info.num_irqs = 0;
+ info.num_irqs = vdev->num_irqs;

return copy_to_user((void __user *)arg, &info, minsz);

@@ -142,7 +153,23 @@ static long vfio_platform_ioctl(void *device_data,
return copy_to_user((void __user *)arg, &info, minsz);

} else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
- return -EINVAL;
+ struct vfio_irq_info info;
+
+ minsz = offsetofend(struct vfio_irq_info, count);
+
+ if (copy_from_user(&info, (void __user *)arg, minsz))
+ return -EFAULT;
+
+ if (info.argsz < minsz)
+ return -EINVAL;
+
+ if (info.index >= vdev->num_irqs)
+ return -EINVAL;
+
+ info.flags = vdev->irq[info.index].flags;
+ info.count = vdev->irq[info.index].count;
+
+ return copy_to_user((void __user *)arg, &info, minsz);

} else if (cmd == VFIO_DEVICE_SET_IRQS)
return -EINVAL;
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
new file mode 100644
index 0000000..075c401
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -0,0 +1,63 @@
+/*
+ * VFIO platform devices interrupt handling
+ *
+ * Copyright (C) 2013 - Virtual Open Systems
+ * Author: Antonios Motakis <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/device.h>
+#include <linux/eventfd.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/platform_device.h>
+#include <linux/irq.h>
+
+#include "vfio_platform_private.h"
+
+int vfio_platform_irq_init(struct vfio_platform_device *vdev)
+{
+ int cnt = 0, i;
+
+ while (platform_get_irq(vdev->pdev, cnt) > 0)
+ cnt++;
+
+ vdev->num_irqs = cnt;
+
+ vdev->irq = kzalloc(sizeof(struct vfio_platform_irq) * vdev->num_irqs,
+ GFP_KERNEL);
+ if (!vdev->irq)
+ return -ENOMEM;
+
+ for (i = 0; i < cnt; i++) {
+ struct vfio_platform_irq irq;
+
+ irq.flags = 0;
+ irq.count = 1;
+
+ vdev->irq[i] = irq;
+ }
+
+ return 0;
+}
+
+void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
+{
+ kfree(vdev->irq);
+}
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 4705aa5..726f5d1 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -15,6 +15,11 @@
#ifndef VFIO_PLATFORM_PRIVATE_H
#define VFIO_PLATFORM_PRIVATE_H

+struct vfio_platform_irq {
+ u32 flags;
+ u32 count;
+};
+
struct vfio_platform_region {
u64 addr;
resource_size_t size;
@@ -25,6 +30,12 @@ struct vfio_platform_device {
struct platform_device *pdev;
struct vfio_platform_region *region;
u32 num_regions;
+ struct vfio_platform_irq *irq;
+ u32 num_irqs;
};

+extern int vfio_platform_irq_init(struct vfio_platform_device *vdev);
+
+extern void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);
+
#endif /* VFIO_PCI_PRIVATE_H */
--
1.8.3.2

2014-02-08 17:34:14

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 09/10] VFIO_PLATFORM: Initial interrupts support

This patch allows to set an eventfd for a patform device's interrupt,
and also to trigger the interrupt eventfd from userspace for testing.

Signed-off-by: Antonios Motakis <[email protected]>
Tested-by: Alvise Rigo <[email protected]>
---
drivers/vfio/platform/vfio_platform.c | 36 +++++++-
drivers/vfio/platform/vfio_platform_irq.c | 123 +++++++++++++++++++++++++-
drivers/vfio/platform/vfio_platform_private.h | 7 ++
3 files changed, 162 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index ef1ac17..ed5d678 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -171,10 +171,40 @@ static long vfio_platform_ioctl(void *device_data,

return copy_to_user((void __user *)arg, &info, minsz);

- } else if (cmd == VFIO_DEVICE_SET_IRQS)
- return -EINVAL;
+ } else if (cmd == VFIO_DEVICE_SET_IRQS) {
+ struct vfio_irq_set hdr;
+ int ret = 0;
+
+ minsz = offsetofend(struct vfio_irq_set, count);
+
+ if (copy_from_user(&hdr, (void __user *)arg, minsz))
+ return -EFAULT;
+
+ if (hdr.argsz < minsz)
+ return -EINVAL;
+
+ if (hdr.index >= vdev->num_irqs)
+ return -EINVAL;
+
+ if (hdr.start != 0 || hdr.count > 1)
+ return -EINVAL;
+
+ if (hdr.count == 0 &&
+ (!(hdr.flags & VFIO_IRQ_SET_DATA_NONE) ||
+ !(hdr.flags & VFIO_IRQ_SET_ACTION_TRIGGER)))
+ return -EINVAL;
+
+ if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
+ VFIO_IRQ_SET_ACTION_TYPE_MASK))
+ return -EINVAL;
+
+ ret = vfio_platform_set_irqs_ioctl(vdev, hdr.flags, hdr.index,
+ hdr.start, hdr.count,
+ (void *)arg+minsz);
+
+ return ret;

- else if (cmd == VFIO_DEVICE_RESET)
+ } else if (cmd == VFIO_DEVICE_RESET)
return -EINVAL;

return -ENOTTY;
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 075c401..433edc1 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -31,6 +31,9 @@

#include "vfio_platform_private.h"

+static int vfio_set_trigger(struct vfio_platform_device *vdev,
+ int index, int fd);
+
int vfio_platform_irq_init(struct vfio_platform_device *vdev)
{
int cnt = 0, i;
@@ -47,9 +50,11 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)

for (i = 0; i < cnt; i++) {
struct vfio_platform_irq irq;
+ int hwirq = platform_get_irq(vdev->pdev, i);

- irq.flags = 0;
+ irq.flags = VFIO_IRQ_INFO_EVENTFD;
irq.count = 1;
+ irq.hwirq = hwirq;

vdev->irq[i] = irq;
}
@@ -59,5 +64,121 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)

void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
{
+ int i;
+
+ for (i = 0; i < vdev->num_irqs; i++)
+ vfio_set_trigger(vdev, i, -1);
+
kfree(vdev->irq);
}
+
+static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
+{
+ struct eventfd_ctx *trigger = dev_id;
+
+ eventfd_signal(trigger, 1);
+
+ return IRQ_HANDLED;
+}
+
+static int vfio_set_trigger(struct vfio_platform_device *vdev,
+ int index, int fd)
+{
+ struct vfio_platform_irq *irq = &vdev->irq[index];
+ struct eventfd_ctx *trigger;
+ int ret;
+
+ if (irq->trigger) {
+ free_irq(irq->hwirq, irq);
+ kfree(irq->name);
+ eventfd_ctx_put(irq->trigger);
+ irq->trigger = NULL;
+ }
+
+ if (fd < 0) /* Disable only */
+ return 0;
+
+ irq->name = kasprintf(GFP_KERNEL, "vfio-irq[%d](%s)",
+ irq->hwirq, vdev->pdev->name);
+ if (!irq->name)
+ return -ENOMEM;
+
+ trigger = eventfd_ctx_fdget(fd);
+ if (IS_ERR(trigger)) {
+ kfree(irq->name);
+ return PTR_ERR(trigger);
+ }
+
+ irq->trigger = trigger;
+
+ ret = request_irq(irq->hwirq, vfio_irq_handler, 0, irq->name, irq);
+ if (ret) {
+ kfree(irq->name);
+ eventfd_ctx_put(trigger);
+ irq->trigger = NULL;
+ return ret;
+ }
+
+ return 0;
+}
+
+static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
+ unsigned index, unsigned start,
+ unsigned count, uint32_t flags, void *data)
+{
+ struct vfio_platform_irq *irq = &vdev->irq[index];
+ uint8_t arr;
+ int32_t fd;
+
+ switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+ case VFIO_IRQ_SET_DATA_NONE:
+ if (count == 0)
+ return vfio_set_trigger(vdev, index, -1);
+
+ vfio_irq_handler(irq->hwirq, irq);
+ return 0;
+
+ case VFIO_IRQ_SET_DATA_BOOL:
+ if (copy_from_user(&arr, data, sizeof(uint8_t)))
+ return -EFAULT;
+
+ if (arr == 0x1) {
+ vfio_irq_handler(irq->hwirq, irq);
+ return 0;
+ }
+
+ return -EINVAL;
+
+ case VFIO_IRQ_SET_DATA_EVENTFD:
+ if (copy_from_user(&fd, data, sizeof(int32_t)))
+ return -EFAULT;
+
+ return vfio_set_trigger(vdev, index, fd);
+ }
+
+ return -EFAULT;
+}
+
+int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
+ uint32_t flags, unsigned index, unsigned start,
+ unsigned count, void *data)
+{
+ int (*func)(struct vfio_platform_device *vdev, unsigned index,
+ unsigned start, unsigned count, uint32_t flags,
+ void *data) = NULL;
+
+ switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+ case VFIO_IRQ_SET_ACTION_MASK:
+ case VFIO_IRQ_SET_ACTION_UNMASK:
+ /* XXX not implemented */
+ break;
+ case VFIO_IRQ_SET_ACTION_TRIGGER:
+ func = vfio_platform_set_irq_trigger;
+ break;
+ }
+
+ if (!func)
+ return -ENOTTY;
+
+ return func(vdev, index, start, count, flags, data);
+}
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 726f5d1..befef01 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -16,8 +16,11 @@
#define VFIO_PLATFORM_PRIVATE_H

struct vfio_platform_irq {
+ struct eventfd_ctx *trigger;
u32 flags;
u32 count;
+ int hwirq;
+ char *name;
};

struct vfio_platform_region {
@@ -38,4 +41,8 @@ extern int vfio_platform_irq_init(struct vfio_platform_device *vdev);

extern void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);

+extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
+ uint32_t flags, unsigned index, unsigned start,
+ unsigned count, void *data);
+
#endif /* VFIO_PCI_PRIVATE_H */
--
1.8.3.2

2014-02-08 17:34:24

by Antonios Motakis

[permalink] [raw]
Subject: [RFC PATCH v4 10/10] VFIO_PLATFORM: Support for maskable and automasked interrupts

Adds support to mask interrupts, and also for automasked interrupts.
Level sensitive interrupts are exposed as automasked interrupts and
are masked and disabled automatically when they fire.

Signed-off-by: Antonios Motakis <[email protected]>
Tested-by: Alvise Rigo <[email protected]>
---
drivers/vfio/platform/vfio_platform_irq.c | 117 ++++++++++++++++++++++++--
drivers/vfio/platform/vfio_platform_private.h | 2 +
2 files changed, 113 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 433edc1..e38982f 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -52,9 +52,16 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
struct vfio_platform_irq irq;
int hwirq = platform_get_irq(vdev->pdev, i);

- irq.flags = VFIO_IRQ_INFO_EVENTFD;
+ spin_lock_init(&irq.lock);
+
+ irq.flags = VFIO_IRQ_INFO_EVENTFD | VFIO_IRQ_INFO_MASKABLE;
+
+ if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK)
+ irq.flags |= VFIO_IRQ_INFO_AUTOMASKED;
+
irq.count = 1;
irq.hwirq = hwirq;
+ irq.masked = false;

vdev->irq[i] = irq;
}
@@ -66,19 +73,39 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev)
{
int i;

- for (i = 0; i < vdev->num_irqs; i++)
+ for (i = 0; i < vdev->num_irqs; i++) {
vfio_set_trigger(vdev, i, -1);

+ if (vdev->irq[i].masked)
+ enable_irq(vdev->irq[i].hwirq);
+ }
+
kfree(vdev->irq);
}

static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
{
- struct eventfd_ctx *trigger = dev_id;
+ struct vfio_platform_irq *irq_ctx = dev_id;
+ unsigned long flags;
+ int ret = IRQ_NONE;
+
+ spin_lock_irqsave(&irq_ctx->lock, flags);
+
+ if (!irq_ctx->masked) {
+ ret = IRQ_HANDLED;
+
+ if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
+ disable_irq_nosync(irq_ctx->hwirq);
+ irq_ctx->masked = true;
+ }
+ }

- eventfd_signal(trigger, 1);
+ spin_unlock_irqrestore(&irq_ctx->lock, flags);

- return IRQ_HANDLED;
+ if (ret == IRQ_HANDLED)
+ eventfd_signal(irq_ctx->trigger, 1);
+
+ return ret;
}

static int vfio_set_trigger(struct vfio_platform_device *vdev,
@@ -159,6 +186,82 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
return -EFAULT;
}

+static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
+ unsigned index, unsigned start,
+ unsigned count, uint32_t flags, void *data)
+{
+ uint8_t arr;
+
+ if (start != 0 || count != 1)
+ return -EINVAL;
+
+ switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+ case VFIO_IRQ_SET_DATA_BOOL:
+ if (copy_from_user(&arr, data, sizeof(uint8_t)))
+ return -EFAULT;
+
+ if (arr != 0x1)
+ return -EINVAL;
+
+ case VFIO_IRQ_SET_DATA_NONE:
+
+ spin_lock_irq(&vdev->irq[index].lock);
+
+ if (vdev->irq[index].masked) {
+ enable_irq(vdev->irq[index].hwirq);
+ vdev->irq[index].masked = false;
+ }
+
+ spin_unlock_irq(&vdev->irq[index].lock);
+
+ return 0;
+
+ case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
+ default:
+ return -ENOTTY;
+ }
+
+ return 0;
+}
+
+static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
+ unsigned index, unsigned start,
+ unsigned count, uint32_t flags, void *data)
+{
+ uint8_t arr;
+
+ if (start != 0 || count != 1)
+ return -EINVAL;
+
+ switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+ case VFIO_IRQ_SET_DATA_BOOL:
+ if (copy_from_user(&arr, data, sizeof(uint8_t)))
+ return -EFAULT;
+
+ if (arr != 0x1)
+ return -EINVAL;
+
+ case VFIO_IRQ_SET_DATA_NONE:
+
+ spin_lock_irq(&vdev->irq[index].lock);
+
+ if (!vdev->irq[index].masked) {
+ disable_irq(vdev->irq[index].hwirq);
+ vdev->irq[index].masked = true;
+ }
+
+ spin_unlock_irq(&vdev->irq[index].lock);
+
+ return 0;
+
+ case VFIO_IRQ_SET_DATA_EVENTFD: /* XXX not implemented yet */
+ default:
+ return -ENOTTY;
+ }
+
+ return 0;
+}
+
int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
uint32_t flags, unsigned index, unsigned start,
unsigned count, void *data)
@@ -169,8 +272,10 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,

switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
case VFIO_IRQ_SET_ACTION_MASK:
+ func = vfio_platform_set_irq_mask;
+ break;
case VFIO_IRQ_SET_ACTION_UNMASK:
- /* XXX not implemented */
+ func = vfio_platform_set_irq_unmask;
break;
case VFIO_IRQ_SET_ACTION_TRIGGER:
func = vfio_platform_set_irq_trigger;
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index befef01..5721313 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -21,6 +21,8 @@ struct vfio_platform_irq {
u32 count;
int hwirq;
char *name;
+ bool masked;
+ spinlock_t lock;
};

struct vfio_platform_region {
--
1.8.3.2

2014-02-10 20:04:49

by Alex Williamson

[permalink] [raw]
Subject: Re: [RFC PATCH v4 02/10] VFIO_IOMMU_TYPE1: Introduce the VFIO_DMA_MAP_FLAG_EXEC flag

On Sat, 2014-02-08 at 18:29 +0100, Antonios Motakis wrote:
> The ARM SMMU driver expects the IOMMU_EXEC flag, otherwise it will
> set the page tables for a device as XN (execute never). This affects
> devices such as the ARM PL330 DMA Controller, which fails to operate
> if the XN flag is set on the memory it tries to fetch its instructions
> from.
>
> We introduce the VFIO_DMA_MAP_FLAG_EXEC to VFIO, and use it in
> VFIO_IOMMU_TYPE1 to set the IOMMU_EXEC flag. This way the user can
> control whether the XN flag will be set on the requested mappings.

Should the user be told whether this flag is available? It looks like
existing iommu drivers for x86 ignore the flag, can we count on that?
Thanks,

Alex

> Signed-off-by: Antonios Motakis <[email protected]>
> ---
> drivers/vfio/vfio_iommu_type1.c | 5 ++++-
> include/uapi/linux/vfio.h | 1 +
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 4fb7a8f..ad7a1f6 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -557,6 +557,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> prot |= IOMMU_WRITE;
> if (map->flags & VFIO_DMA_MAP_FLAG_READ)
> prot |= IOMMU_READ;
> + if (map->flags & VFIO_DMA_MAP_FLAG_EXEC)
> + prot |= IOMMU_EXEC;
>
> if (!prot)
> return -EINVAL; /* No READ/WRITE? */
> @@ -865,7 +867,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
> } else if (cmd == VFIO_IOMMU_MAP_DMA) {
> struct vfio_iommu_type1_dma_map map;
> uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
> - VFIO_DMA_MAP_FLAG_WRITE;
> + VFIO_DMA_MAP_FLAG_WRITE |
> + VFIO_DMA_MAP_FLAG_EXEC;
>
> minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
>
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 0fd47f5..d8e9e99 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -392,6 +392,7 @@ struct vfio_iommu_type1_dma_map {
> __u32 flags;
> #define VFIO_DMA_MAP_FLAG_READ (1 << 0) /* readable from device */
> #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1) /* writable from device */
> +#define VFIO_DMA_MAP_FLAG_EXEC (1 << 2) /* executable from device */
> __u64 vaddr; /* Process virtual address */
> __u64 iova; /* IO virtual address */
> __u64 size; /* Size of mapping (bytes) */


2014-02-10 22:32:52

by Alex Williamson

[permalink] [raw]
Subject: Re: [RFC PATCH v4 05/10] VFIO_PLATFORM: Return info for device and its memory mapped IO regions

On Sat, 2014-02-08 at 18:29 +0100, Antonios Motakis wrote:
> A VFIO userspace driver will start by opening the VFIO device
> that corresponds to an IOMMU group, and will use the ioctl interface
> to get the basic device info, such as number of memory regions and
> interrupts, and their properties.
>
> This patch enables the IOCTLs:
> - VFIO_DEVICE_GET_INFO
> - VFIO_DEVICE_GET_REGION_INFO
>
> IRQ info is provided by one of the latter patches.
>
> Signed-off-by: Antonios Motakis <[email protected]>
> ---
> drivers/vfio/platform/vfio_platform.c | 74 ++++++++++++++++++++++++---
> drivers/vfio/platform/vfio_platform_private.h | 8 +++
> 2 files changed, 76 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> index a3d8f29..f7db5c0 100644
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c
> @@ -34,15 +34,62 @@
> #define DRIVER_AUTHOR "Antonios Motakis <[email protected]>"
> #define DRIVER_DESC "VFIO for platform devices - User Level meta-driver"
>
> +static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
> +{
> + int cnt = 0, i;
> +
> + while (platform_get_resource(vdev->pdev, IORESOURCE_MEM, cnt))
> + cnt++;
> +
> + vdev->num_regions = cnt;
> +
> + vdev->region = kzalloc(sizeof(struct vfio_platform_region) * cnt,
> + GFP_KERNEL);
> + if (!vdev->region)
> + return -ENOMEM;
> +
> + for (i = 0; i < cnt; i++) {
> + struct vfio_platform_region region;
> + struct resource *res =
> + platform_get_resource(vdev->pdev, IORESOURCE_MEM, i);
> +
> + region.addr = res->start;
> + region.size = resource_size(res);
> + region.flags = 0;
> +
> + vdev->region[i] = region;
> + }
> +
> + return 0;
> +}
> +
> +static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev)
> +{
> + kfree(vdev->region);
> +}
> +
> static void vfio_platform_release(void *device_data)
> {
> + struct vfio_platform_device *vdev = device_data;
> +
> + vfio_platform_regions_cleanup(vdev);
> +
> module_put(THIS_MODULE);
> }
>
> static int vfio_platform_open(void *device_data)
> {
> - if (!try_module_get(THIS_MODULE))
> + struct vfio_platform_device *vdev = device_data;
> + int ret;
> +
> + ret = vfio_platform_regions_init(vdev);
> + if (ret)
> + return ret;
> +
> + if (!try_module_get(THIS_MODULE)) {
> + vfio_platform_regions_cleanup(vdev);
> return -ENODEV;
> + }
>
> return 0;
> }
> @@ -65,18 +112,33 @@ static long vfio_platform_ioctl(void *device_data,
> return -EINVAL;
>
> info.flags = VFIO_DEVICE_FLAGS_PLATFORM;
> - info.num_regions = 0;
> + info.num_regions = vdev->num_regions;
> info.num_irqs = 0;
>
> return copy_to_user((void __user *)arg, &info, minsz);
>
> - } else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
> - return -EINVAL;
> + } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
> + struct vfio_region_info info;
> +
> + minsz = offsetofend(struct vfio_region_info, offset);
> +
> + if (copy_from_user(&info, (void __user *)arg, minsz))
> + return -EFAULT;
> +
> + if (info.argsz < minsz)
> + return -EINVAL;

Missing a bounds check for info.index, user could be getting back kernel
data here. Thanks,

Alex

> +
> + /* map offset to the physical address */
> + info.offset = vdev->region[info.index].addr;
> + info.size = vdev->region[info.index].size;
> + info.flags = vdev->region[info.index].flags;
> +
> + return copy_to_user((void __user *)arg, &info, minsz);
>
> - else if (cmd == VFIO_DEVICE_GET_IRQ_INFO)
> + } else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
> return -EINVAL;
>
> - else if (cmd == VFIO_DEVICE_SET_IRQS)
> + } else if (cmd == VFIO_DEVICE_SET_IRQS)
> return -EINVAL;
>
> else if (cmd == VFIO_DEVICE_RESET)
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 6df8084..4705aa5 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -15,8 +15,16 @@
> #ifndef VFIO_PLATFORM_PRIVATE_H
> #define VFIO_PLATFORM_PRIVATE_H
>
> +struct vfio_platform_region {
> + u64 addr;
> + resource_size_t size;
> + u32 flags;
> +};
> +
> struct vfio_platform_device {
> struct platform_device *pdev;
> + struct vfio_platform_region *region;
> + u32 num_regions;
> };
>
> #endif /* VFIO_PCI_PRIVATE_H */


2014-02-10 22:45:51

by Alex Williamson

[permalink] [raw]
Subject: Re: [RFC PATCH v4 06/10] VFIO_PLATFORM: Read and write support for the device fd

On Sat, 2014-02-08 at 18:29 +0100, Antonios Motakis wrote:
> VFIO returns a file descriptor which we can use to manipulate the memory
> regions of the device. Since some memory regions we cannot mmap due to
> security concerns, we also allow to read and write to this file descriptor
> directly.
>
> Signed-off-by: Antonios Motakis <[email protected]>
> Tested-by: Alvise Rigo <[email protected]>
> ---
> drivers/vfio/platform/vfio_platform.c | 128 +++++++++++++++++++++++++++++++++-
> 1 file changed, 125 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> index f7db5c0..ee96078 100644
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c
> @@ -55,7 +55,8 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
>
> region.addr = res->start;
> region.size = resource_size(res);
> - region.flags = 0;
> + region.flags = VFIO_REGION_INFO_FLAG_READ
> + | VFIO_REGION_INFO_FLAG_WRITE;
>
> vdev->region[i] = region;
> }
> @@ -150,13 +151,134 @@ static long vfio_platform_ioctl(void *device_data,
> static ssize_t vfio_platform_read(void *device_data, char __user *buf,
> size_t count, loff_t *ppos)
> {
> - return 0;
> + struct vfio_platform_device *vdev = device_data;
> + unsigned int *io;
> + int i;
> +
> + for (i = 0; i < vdev->num_regions; i++) {
> + struct vfio_platform_region region = vdev->region[i];
> + unsigned int done = 0;
> + loff_t off;
> +
> + if ((*ppos < region.addr)
> + || (*ppos + count - 1) >= (region.addr + region.size))
> + continue;

Perhaps there's something to be said for vfio-pci's use of fixed offsets
to have a direct offset to index lookup.

> +
> + io = ioremap_nocache(region.addr, region.size);

This must incur some overhead per access.

> +
> + off = *ppos - region.addr;
> +
> + while (count) {
> + size_t filled;
> +
> + if (count >= 4 && !(off % 4)) {
> + u32 val;
> +
> + val = ioread32(io + off);
> + if (copy_to_user(buf, &val, 4))
> + goto err;

For vfio-pci we've decided that these interfaces are always little
endian, have you considered whether it makes sense to do something
similar here? Thanks,

Alex

> +
> + filled = 4;
> + } else if (count >= 2 && !(off % 2)) {
> + u16 val;
> +
> + val = ioread16(io + off);
> + if (copy_to_user(buf, &val, 2))
> + goto err;
> +
> + filled = 2;
> + } else {
> + u8 val;
> +
> + val = ioread8(io + off);
> + if (copy_to_user(buf, &val, 1))
> + goto err;
> +
> + filled = 1;
> + }
> +
> +
> + count -= filled;
> + done += filled;
> + off += filled;
> + buf += filled;
> + }
> +
> + iounmap(io);
> + return done;
> + }
> +
> + return -EFAULT;
> +
> +err:
> + iounmap(io);
> + return -EFAULT;
> }
>
> static ssize_t vfio_platform_write(void *device_data, const char __user *buf,
> size_t count, loff_t *ppos)
> {
> - return 0;
> + struct vfio_platform_device *vdev = device_data;
> + unsigned int *io;
> + int i;
> +
> + for (i = 0; i < vdev->num_regions; i++) {
> + struct vfio_platform_region region = vdev->region[i];
> + unsigned int done = 0;
> + loff_t off;
> +
> + if ((*ppos < region.addr)
> + || (*ppos + count - 1) >= (region.addr + region.size))
> + continue;
> +
> + io = ioremap_nocache(region.addr, region.size);
> +
> + off = *ppos - region.addr;
> +
> + while (count) {
> + size_t filled;
> +
> + if (count >= 4 && !(off % 4)) {
> + u32 val;
> +
> + if (copy_from_user(&val, buf, 4))
> + goto err;
> + iowrite32(val, io + off);
> +
> + filled = 4;
> + } else if (count >= 2 && !(off % 2)) {
> + u16 val;
> +
> + if (copy_from_user(&val, buf, 2))
> + goto err;
> + iowrite16(val, io + off);
> +
> + filled = 2;
> + } else {
> + u8 val;
> +
> + if (copy_from_user(&val, buf, 1))
> + goto err;
> + iowrite8(val, io + off);
> +
> + filled = 1;
> + }
> +
> + count -= filled;
> + done += filled;
> + off += filled;
> + buf += filled;
> + }
> +
> + iounmap(io);
> + return done;
> + }
> +
> + return -EINVAL;
> +
> +err:
> + iounmap(io);
> + return -EFAULT;
> }
>
> static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)


2014-02-10 23:13:01

by Scott Wood

[permalink] [raw]
Subject: Re: [RFC PATCH v4 06/10] VFIO_PLATFORM: Read and write support for the device fd

On Mon, 2014-02-10 at 15:45 -0700, Alex Williamson wrote:
> On Sat, 2014-02-08 at 18:29 +0100, Antonios Motakis wrote:
> > VFIO returns a file descriptor which we can use to manipulate the memory
> > regions of the device. Since some memory regions we cannot mmap due to
> > security concerns, we also allow to read and write to this file descriptor
> > directly.
> >
> > Signed-off-by: Antonios Motakis <[email protected]>
> > Tested-by: Alvise Rigo <[email protected]>
> > ---
> > drivers/vfio/platform/vfio_platform.c | 128 +++++++++++++++++++++++++++++++++-
> > 1 file changed, 125 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> > index f7db5c0..ee96078 100644
> > --- a/drivers/vfio/platform/vfio_platform.c
> > +++ b/drivers/vfio/platform/vfio_platform.c
> > @@ -55,7 +55,8 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
> >
> > region.addr = res->start;
> > region.size = resource_size(res);
> > - region.flags = 0;
> > + region.flags = VFIO_REGION_INFO_FLAG_READ
> > + | VFIO_REGION_INFO_FLAG_WRITE;
> >
> > vdev->region[i] = region;
> > }
> > @@ -150,13 +151,134 @@ static long vfio_platform_ioctl(void *device_data,
> > static ssize_t vfio_platform_read(void *device_data, char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > - return 0;
> > + struct vfio_platform_device *vdev = device_data;
> > + unsigned int *io;
> > + int i;
> > +
> > + for (i = 0; i < vdev->num_regions; i++) {
> > + struct vfio_platform_region region = vdev->region[i];
> > + unsigned int done = 0;
> > + loff_t off;
> > +
> > + if ((*ppos < region.addr)
> > + || (*ppos + count - 1) >= (region.addr + region.size))
> > + continue;
>
> Perhaps there's something to be said for vfio-pci's use of fixed offsets
> to have a direct offset to index lookup.
>
> > +
> > + io = ioremap_nocache(region.addr, region.size);
>
> This must incur some overhead per access.

There's mmap() if you want fast... Given the limited ioremap space on
32-bit, I can see not wanting to map everything that the user has open
all the time -- but in that case, wouldn't it be better to just map one
page here rather than the whole region?

> > +
> > + off = *ppos - region.addr;
> > +
> > + while (count) {
> > + size_t filled;
> > +
> > + if (count >= 4 && !(off % 4)) {
> > + u32 val;
> > +
> > + val = ioread32(io + off);
> > + if (copy_to_user(buf, &val, 4))
> > + goto err;
>
> For vfio-pci we've decided that these interfaces are always little
> endian, have you considered whether it makes sense to do something
> similar here? Thanks,

ioread32() is little endian -- but since read() puts its result in the
caller's memory buffer (rather than a register return), I think it makes
more sense to preserve byte-invariance -- similar to the conclusion of
the recent KVM MMIO API clarification discussion. Then the VFIO user
would use the same type of access (byte swapped or not) to access the
read() buffer that they would have used to access the register directly.

Forcing little endian is a better fit for PCI (which is inherently
little endian) than for platform devices which can be either endianness.

-Scott

2014-02-10 23:21:11

by Alex Williamson

[permalink] [raw]
Subject: Re: [RFC PATCH v4 06/10] VFIO_PLATFORM: Read and write support for the device fd

On Mon, 2014-02-10 at 17:12 -0600, Scott Wood wrote:
> On Mon, 2014-02-10 at 15:45 -0700, Alex Williamson wrote:
> > On Sat, 2014-02-08 at 18:29 +0100, Antonios Motakis wrote:
> > > VFIO returns a file descriptor which we can use to manipulate the memory
> > > regions of the device. Since some memory regions we cannot mmap due to
> > > security concerns, we also allow to read and write to this file descriptor
> > > directly.
> > >
> > > Signed-off-by: Antonios Motakis <[email protected]>
> > > Tested-by: Alvise Rigo <[email protected]>
> > > ---
> > > drivers/vfio/platform/vfio_platform.c | 128 +++++++++++++++++++++++++++++++++-
> > > 1 file changed, 125 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> > > index f7db5c0..ee96078 100644
> > > --- a/drivers/vfio/platform/vfio_platform.c
> > > +++ b/drivers/vfio/platform/vfio_platform.c
> > > @@ -55,7 +55,8 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
> > >
> > > region.addr = res->start;
> > > region.size = resource_size(res);
> > > - region.flags = 0;
> > > + region.flags = VFIO_REGION_INFO_FLAG_READ
> > > + | VFIO_REGION_INFO_FLAG_WRITE;
> > >
> > > vdev->region[i] = region;
> > > }
> > > @@ -150,13 +151,134 @@ static long vfio_platform_ioctl(void *device_data,
> > > static ssize_t vfio_platform_read(void *device_data, char __user *buf,
> > > size_t count, loff_t *ppos)
> > > {
> > > - return 0;
> > > + struct vfio_platform_device *vdev = device_data;
> > > + unsigned int *io;
> > > + int i;
> > > +
> > > + for (i = 0; i < vdev->num_regions; i++) {
> > > + struct vfio_platform_region region = vdev->region[i];
> > > + unsigned int done = 0;
> > > + loff_t off;
> > > +
> > > + if ((*ppos < region.addr)
> > > + || (*ppos + count - 1) >= (region.addr + region.size))
> > > + continue;
> >
> > Perhaps there's something to be said for vfio-pci's use of fixed offsets
> > to have a direct offset to index lookup.
> >
> > > +
> > > + io = ioremap_nocache(region.addr, region.size);
> >
> > This must incur some overhead per access.
>
> There's mmap() if you want fast... Given the limited ioremap space on
> 32-bit, I can see not wanting to map everything that the user has open
> all the time -- but in that case, wouldn't it be better to just map one
> page here rather than the whole region?
>
> > > +
> > > + off = *ppos - region.addr;
> > > +
> > > + while (count) {
> > > + size_t filled;
> > > +
> > > + if (count >= 4 && !(off % 4)) {
> > > + u32 val;
> > > +
> > > + val = ioread32(io + off);
> > > + if (copy_to_user(buf, &val, 4))
> > > + goto err;
> >
> > For vfio-pci we've decided that these interfaces are always little
> > endian, have you considered whether it makes sense to do something
> > similar here? Thanks,
>
> ioread32() is little endian -- but since read() puts its result in the
> caller's memory buffer (rather than a register return), I think it makes
> more sense to preserve byte-invariance -- similar to the conclusion of
> the recent KVM MMIO API clarification discussion. Then the VFIO user
> would use the same type of access (byte swapped or not) to access the
> read() buffer that they would have used to access the register directly.
>
> Forcing little endian is a better fit for PCI (which is inherently
> little endian) than for platform devices which can be either endianness.

Ok, works for me. Thanks,

Alex


2014-02-14 22:25:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [RFC PATCH v4 01/10] driver core: export driver_probe_device()

On Sat, Feb 08, 2014 at 06:29:31PM +0100, Antonios Motakis wrote:
> From: Kim Phillips <[email protected]>
>
> Needed by drivers, such as the vfio platform driver [1], seeking to
> bypass bind_store()'s driver_match_device(), and bind to any device
> via a private sysfs bind file.
>
> [1] https://lkml.org/lkml/2013/12/11/522
>
> note: the EXPORT_SYMBOL is needed because vfio-platform can be built
> as a module.

No code outside of drivers/base/ should be calling this function, you
are doing something wrong in your bus if you want to do this, please fix
your bus code.

sorry, I can't accept this at all.

greg k-h