This patchset provides a means by which hardware information about the
underlying PCI device can be passed up to userspace (ie, QEMU) so that
this hardware information can be used rather than previously hard-coded
assumptions. A new VFIO region type is defined which holds this
information.
A form of these patches saw some rounds last year but has been back-
tabled for a while. The original work for this feature was done by Pierre
Morel. I'd like to refresh the discussion on this and get this finished up
so that we can move forward with better-supporting additional types of
PCI-attached devices. The proposal here presents a completely different
region mapping vs the prior approach, taking inspiration from vfio info
capability chains to provide device CLP information in a way that allows
for future expansion (new CLP features).
This feature is toggled via the CONFIG_VFIO_PCI_ZDEV configuration entry.
Changes from v1:
- Added ACKs (thanks!)
- Patch 2: Minor change:s/util_avail/util_str_avail/ per Niklas
- Patch 3: removed __packed
- Patch 3: rework various descriptions / comment blocks
- New patch: MAINTAINERS hit to cover new files.
Matthew Rosato (5):
s390/pci: stash version in the zpci_dev
s390/pci: track whether util_str is valid in the zpci_dev
vfio-pci/zdev: define the vfio_zdev header
vfio-pci/zdev: use a device region to retrieve zPCI information
MAINTAINERS: Add entry for s390 vfio-pci
MAINTAINERS | 8 ++
arch/s390/include/asm/pci.h | 4 +-
arch/s390/pci/pci_clp.c | 2 +
drivers/vfio/pci/Kconfig | 13 ++
drivers/vfio/pci/Makefile | 1 +
drivers/vfio/pci/vfio_pci.c | 8 ++
drivers/vfio/pci/vfio_pci_private.h | 10 ++
drivers/vfio/pci/vfio_pci_zdev.c | 242 ++++++++++++++++++++++++++++++++++++
include/uapi/linux/vfio.h | 5 +
include/uapi/linux/vfio_zdev.h | 118 ++++++++++++++++++
10 files changed, 410 insertions(+), 1 deletion(-)
create mode 100644 drivers/vfio/pci/vfio_pci_zdev.c
create mode 100644 include/uapi/linux/vfio_zdev.h
--
1.8.3.1
We define a new device region in vfio.h to be able to get the ZPCI CLP
information by reading this region from userspace.
We create a new file, vfio_zdev.h to define the structure of the new
region defined in vfio.h
Signed-off-by: Matthew Rosato <[email protected]>
---
include/uapi/linux/vfio.h | 5 ++
include/uapi/linux/vfio_zdev.h | 118 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 123 insertions(+)
create mode 100644 include/uapi/linux/vfio_zdev.h
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 9204705..65eb367 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -326,6 +326,11 @@ struct vfio_region_info_cap_type {
* to do TLB invalidation on a GPU.
*/
#define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD (1)
+/*
+ * IBM zPCI specific hardware feature information for a devcie. The contents
+ * of this region are mapped by struct vfio_region_zpci_info.
+ */
+#define VFIO_REGION_SUBTYPE_IBM_ZPCI_CLP (2)
/* sub-types for VFIO_REGION_TYPE_GFX */
#define VFIO_REGION_SUBTYPE_GFX_EDID (1)
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
new file mode 100644
index 0000000..1c8fb62
--- /dev/null
+++ b/include/uapi/linux/vfio_zdev.h
@@ -0,0 +1,118 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Region definition for ZPCI devices
+ *
+ * Copyright IBM Corp. 2020
+ *
+ * Author(s): Pierre Morel <[email protected]>
+ * Matthew Rosato <[email protected]>
+ */
+
+#ifndef _VFIO_ZDEV_H_
+#define _VFIO_ZDEV_H_
+
+#include <linux/types.h>
+
+/**
+ * struct vfio_region_zpci_info - ZPCI information
+ *
+ * This region provides zPCI specific hardware feature information for a
+ * device.
+ *
+ * The ZPCI information structure is presented as a chain of CLP features
+ * defined below. argsz provides the size of the entire region, and offset
+ * provides the location of the first CLP feature in the chain.
+ *
+ */
+struct vfio_region_zpci_info {
+ __u32 argsz; /* Size of entire payload */
+ __u32 offset; /* Location of first entry */
+};
+
+/**
+ * struct vfio_region_zpci_info_hdr - ZPCI header information
+ *
+ * This structure is included at the top of each CLP feature to define what
+ * type of CLP feature is presented / the structure version. The next value
+ * defines the offset of the next CLP feature, and is an offset from the very
+ * beginning of the region (vfio_region_zpci_info).
+ *
+ * Each CLP feature must have it's own unique 'id'.
+ */
+struct vfio_region_zpci_info_hdr {
+ __u16 id; /* Identifies the CLP type */
+ __u16 version; /* version of the CLP data */
+ __u32 next; /* Offset of next entry */
+};
+
+/**
+ * struct vfio_region_zpci_info_pci - Base PCI Function information
+ *
+ * This region provides a set of descriptive information about the associated
+ * PCI function.
+ */
+#define VFIO_REGION_ZPCI_INFO_BASE 1
+
+struct vfio_region_zpci_info_base {
+ struct vfio_region_zpci_info_hdr hdr;
+ __u64 start_dma; /* Start of available DMA addresses */
+ __u64 end_dma; /* End of available DMA addresses */
+ __u16 pchid; /* Physical Channel ID */
+ __u16 vfn; /* Virtual function number */
+ __u16 fmb_length; /* Measurement Block Length (in bytes) */
+ __u8 pft; /* PCI Function Type */
+ __u8 gid; /* PCI function group ID */
+};
+
+
+/**
+ * struct vfio_region_zpci_info_group - Base PCI Function Group information
+ *
+ * This region provides a set of descriptive information about the group of PCI
+ * functions that the associated device belongs to.
+ */
+#define VFIO_REGION_ZPCI_INFO_GROUP 2
+
+struct vfio_region_zpci_info_group {
+ struct vfio_region_zpci_info_hdr hdr;
+ __u64 dasm; /* DMA Address space mask */
+ __u64 msi_addr; /* MSI address */
+ __u64 flags;
+#define VFIO_PCI_ZDEV_FLAGS_REFRESH 1 /* Use program-specified TLB refresh */
+ __u16 mui; /* Measurement Block Update Interval */
+ __u16 noi; /* Maximum number of MSIs */
+ __u16 maxstbl; /* Maximum Store Block Length */
+ __u8 version; /* Supported PCI Version */
+};
+
+/**
+ * struct vfio_region_zpci_info_util - Utility String
+ *
+ * This region provides the utility string for the associated device, which is
+ * a device identifier string made up of EBCDID characters. 'size' specifies
+ * the length of 'util_str'.
+ */
+#define VFIO_REGION_ZPCI_INFO_UTIL 3
+
+struct vfio_region_zpci_info_util {
+ struct vfio_region_zpci_info_hdr hdr;
+ __u32 size;
+ __u8 util_str[];
+};
+
+/**
+ * struct vfio_region_zpci_info_pfip - PCI Function Path
+ *
+ * This region provides the PCI function path string, which is an identifier
+ * that describes the internal hardware path of the device. 'size' specifies
+ * the length of 'pfip'.
+ */
+#define VFIO_REGION_ZPCI_INFO_PFIP 4
+
+struct vfio_region_zpci_info_pfip {
+struct vfio_region_zpci_info_hdr hdr;
+ __u32 size;
+ __u8 pfip[];
+};
+
+#endif
--
1.8.3.1
Define a new configuration entry VFIO_PCI_ZDEV for VFIO/PCI.
When this s390-only feature is configured we initialize a new device
region, VFIO_REGION_SUBTYPE_IBM_ZPCI_CLP, to hold information provided
by the underlying hardware.
This patch is based on work previously done by Pierre Morel.
Signed-off-by: Matthew Rosato <[email protected]>
---
drivers/vfio/pci/Kconfig | 13 ++
drivers/vfio/pci/Makefile | 1 +
drivers/vfio/pci/vfio_pci.c | 8 ++
drivers/vfio/pci/vfio_pci_private.h | 10 ++
drivers/vfio/pci/vfio_pci_zdev.c | 242 ++++++++++++++++++++++++++++++++++++
5 files changed, 274 insertions(+)
create mode 100644 drivers/vfio/pci/vfio_pci_zdev.c
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index ac3c1dd..07b4a35 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -45,3 +45,16 @@ config VFIO_PCI_NVLINK2
depends on VFIO_PCI && PPC_POWERNV
help
VFIO PCI support for P9 Witherspoon machine with NVIDIA V100 GPUs
+
+config VFIO_PCI_ZDEV
+ bool "VFIO PCI ZPCI device CLP support"
+ depends on VFIO_PCI && S390
+ default y
+ help
+ Enabling this options exposes a region containing hardware
+ configuration for zPCI devices. This enables userspace (e.g. QEMU)
+ to supply proper configuration values instead of hard-coded defaults
+ for zPCI devices passed through via VFIO on s390.
+
+ Say Y here.
+
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index f027f8a..781e080 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -3,5 +3,6 @@
vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
vfio-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
+vfio-pci-$(CONFIG_VFIO_PCI_ZDEV) += vfio_pci_zdev.o
obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 1ab1f5c..cfb04d9 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -409,6 +409,14 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
}
}
+ if (IS_ENABLED(CONFIG_VFIO_PCI_ZDEV)) {
+ ret = vfio_pci_zdev_init(vdev);
+ if (ret && ret != -ENODEV) {
+ pci_warn(pdev, "Failed to setup zPCI CLP region\n");
+ goto disable_exit;
+ }
+ }
+
vfio_pci_probe_mmaps(vdev);
return 0;
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 61ca8ab..729af20 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -213,4 +213,14 @@ static inline int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
return -ENODEV;
}
#endif
+
+#ifdef CONFIG_VFIO_PCI_ZDEV
+extern int vfio_pci_zdev_init(struct vfio_pci_device *vdev);
+#else
+static inline int vfio_pci_zdev_init(struct vfio_pci_device *vdev)
+{
+ return -ENODEV;
+}
+#endif
+
#endif /* VFIO_PCI_PRIVATE_H */
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
new file mode 100644
index 0000000..33324f0
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -0,0 +1,242 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * VFIO ZPCI devices support
+ *
+ * Copyright (C) IBM Corp. 2020. All rights reserved.
+ * Author(s): Pierre Morel <[email protected]>
+ * Matthew Rosato <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/io.h>
+#include <linux/pci.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/vfio_zdev.h>
+#include <asm/pci_clp.h>
+#include <asm/pci_io.h>
+
+#include "vfio_pci_private.h"
+
+static size_t vfio_pci_zdev_rw(struct vfio_pci_device *vdev,
+ char __user *buf, size_t count, loff_t *ppos,
+ bool iswrite)
+{
+ unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) - VFIO_PCI_NUM_REGIONS;
+ struct vfio_region_zpci_info *region = vdev->region[i].data;
+ loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+
+ if ((!vdev->pdev->bus) || (!to_zpci(vdev->pdev)))
+ return -ENODEV;
+
+ if (pos >= vdev->region[i].size || iswrite)
+ return -EINVAL;
+
+ count = min(count, (size_t)(vdev->region[i].size - pos));
+ if (copy_to_user(buf, region + pos, count))
+ return -EFAULT;
+
+ return count;
+}
+
+static void vfio_pci_zdev_release(struct vfio_pci_device *vdev,
+ struct vfio_pci_region *region)
+{
+ kfree(region->data);
+}
+
+static const struct vfio_pci_regops vfio_pci_zdev_regops = {
+ .rw = vfio_pci_zdev_rw,
+ .release = vfio_pci_zdev_release,
+};
+
+static void vfio_pci_zdev_fill_hdr(struct vfio_region_zpci_info_hdr *hdr,
+ __u16 id, __u16 version, size_t offset,
+ size_t size)
+{
+ hdr->id = id;
+ hdr->version = version;
+ hdr->next = offset + size;
+}
+
+/*
+ * Add the Base PCI Function information to the device region.
+ *
+ * zdev - the zPCI device to get information from
+ * region - start of the vfio device region
+ * offset - location within region to place the data
+ *
+ * On return, provide the offset of the end of this CLP feature.
+ */
+static size_t vfio_pci_zdev_add_base(struct zpci_dev *zdev, void *region,
+ size_t offset)
+{
+ struct vfio_region_zpci_info_base *clp;
+
+ /* Jump to the CLP region via the offset */
+ clp = (struct vfio_region_zpci_info_base *) (region + offset);
+
+ /* Fill in the clp header */
+ vfio_pci_zdev_fill_hdr(&clp->hdr, VFIO_REGION_ZPCI_INFO_BASE, 1,
+ offset, sizeof(*clp));
+
+ /* Fill in the CLP feature info */
+ clp->start_dma = zdev->start_dma;
+ clp->end_dma = zdev->end_dma;
+ clp->pchid = zdev->pchid;
+ clp->vfn = zdev->vfn;
+ clp->fmb_length = zdev->fmb_length;
+ clp->pft = zdev->pft;
+ clp->gid = zdev->pfgid;
+
+ /* Return offset to the end of this CLP feature */
+ return clp->hdr.next;
+}
+
+/*
+ * Add the Base PCI Function Group information to the device region.
+ *
+ * zdev - the zPCI device to get information from
+ * region - start of the vfio device region
+ * offset - location within region to place the data
+ *
+ * On return, provide the offset of the end of this CLP feature.
+ */
+static size_t vfio_pci_zdev_add_group(struct zpci_dev *zdev, void *region,
+ size_t offset)
+{
+ struct vfio_region_zpci_info_group *clp;
+
+ /* Jump to the CLP region via the offset */
+ clp = (struct vfio_region_zpci_info_group *) (region + offset);
+
+ /* Fill in the clp header */
+ vfio_pci_zdev_fill_hdr(&clp->hdr, VFIO_REGION_ZPCI_INFO_GROUP, 1,
+ offset, sizeof(*clp));
+
+ /* Fill in the CLP feature info */
+ clp->dasm = zdev->dma_mask;
+ clp->msi_addr = zdev->msi_addr;
+ clp->flags = VFIO_PCI_ZDEV_FLAGS_REFRESH;
+ clp->mui = zdev->fmb_update;
+ clp->noi = zdev->max_msi;
+ clp->maxstbl = ZPCI_MAX_WRITE_SIZE;
+ clp->version = zdev->version;
+
+ /* Return offset to the end of this CLP feature */
+ return clp->hdr.next;
+}
+
+/*
+ * Add the device utility string to the device region.
+ *
+ * zdev - the zPCI device to get information from
+ * region - start of the vfio device region
+ * offset - location within region to place the data
+ *
+ * On return, provide the offset of the end of this CLP feature.
+ */
+static size_t vfio_pci_zdev_add_util(struct zpci_dev *zdev, void *region,
+ size_t offset)
+{
+ struct vfio_region_zpci_info_util *clp;
+ size_t size = CLP_UTIL_STR_LEN;
+
+ /* Only add a utility string if one is available */
+ if (!zdev->util_str_avail)
+ return offset;
+
+ /* Jump to the CLP region via the offset */
+ clp = (struct vfio_region_zpci_info_util *) (region + offset);
+
+ /* Fill in the clp header */
+ vfio_pci_zdev_fill_hdr(&clp->hdr, VFIO_REGION_ZPCI_INFO_UTIL, 1,
+ offset, sizeof(*clp) + size);
+
+ /* Fill in the CLP feature info */
+ clp->size = size;
+ memcpy(clp->util_str, zdev->util_str, size);
+
+ /* Return offset to the end of this CLP feature */
+ return clp->hdr.next;
+}
+
+/*
+ * Add the function path string to the device region.
+ *
+ * zdev - the zPCI device to get information from
+ * region - start of the vfio device region
+ * offset - location within region to place the data
+ *
+ * On return, provide the offset of the end of this CLP feature.
+ */
+static size_t vfio_pci_zdev_add_pfip(struct zpci_dev *zdev, void *region,
+ size_t offset)
+{
+ struct vfio_region_zpci_info_pfip *clp;
+ size_t size = CLP_PFIP_NR_SEGMENTS;
+
+ /* Jump to the CLP region via the offset */
+ clp = (struct vfio_region_zpci_info_pfip *) (region + offset);
+
+ /* Fill in the clp header */
+ vfio_pci_zdev_fill_hdr(&clp->hdr, VFIO_REGION_ZPCI_INFO_PFIP, 1,
+ offset, sizeof(*clp) + size);
+
+ /* Fill in the CLP feature info */
+ clp->size = size;
+ memcpy(clp->pfip, zdev->pfip, size);
+
+ /* Return offset to the end of this CLP feature */
+ return clp->hdr.next;
+}
+
+int vfio_pci_zdev_init(struct vfio_pci_device *vdev)
+{
+ struct vfio_region_zpci_info *region;
+ struct zpci_dev *zdev;
+ size_t clp_offset;
+ int size;
+ int ret;
+
+ if (!vdev->pdev->bus)
+ return -ENODEV;
+
+ zdev = to_zpci(vdev->pdev);
+ if (!zdev)
+ return -ENODEV;
+
+ /* Calculate size needed for all supported CLP features */
+ size = sizeof(*region) +
+ sizeof(struct vfio_region_zpci_info_base) +
+ sizeof(struct vfio_region_zpci_info_group) +
+ (sizeof(struct vfio_region_zpci_info_util) + CLP_UTIL_STR_LEN) +
+ (sizeof(struct vfio_region_zpci_info_pfip) +
+ CLP_PFIP_NR_SEGMENTS);
+
+ region = kmalloc(size, GFP_KERNEL);
+ if (!region)
+ return -ENOMEM;
+
+ /* Fill in header */
+ region->argsz = size;
+ clp_offset = region->offset = sizeof(struct vfio_region_zpci_info);
+
+ /* Fill the supported CLP features */
+ clp_offset = vfio_pci_zdev_add_base(zdev, region, clp_offset);
+ clp_offset = vfio_pci_zdev_add_group(zdev, region, clp_offset);
+ clp_offset = vfio_pci_zdev_add_util(zdev, region, clp_offset);
+ clp_offset = vfio_pci_zdev_add_pfip(zdev, region, clp_offset);
+
+ ret = vfio_pci_register_dev_region(vdev,
+ PCI_VENDOR_ID_IBM | VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
+ VFIO_REGION_SUBTYPE_IBM_ZPCI_CLP, &vfio_pci_zdev_regops,
+ size, VFIO_REGION_INFO_FLAG_READ, region);
+ if (ret)
+ kfree(region);
+
+ return ret;
+}
--
1.8.3.1
Add myself to cover s390-specific items related to vfio-pci.
Signed-off-by: Matthew Rosato <[email protected]>
---
MAINTAINERS | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 190c7fa..389c4ad 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15162,6 +15162,14 @@ F: Documentation/s390/vfio-ccw.rst
F: drivers/s390/cio/vfio_ccw*
F: include/uapi/linux/vfio_ccw.h
+S390 VFIO-PCI DRIVER
+M: Matthew Rosato <[email protected]>
+L: [email protected]
+L: [email protected]
+S: Supported
+F: drivers/vfio/pci/vfio_pci_zdev.c
+F: include/uapi/linux/vfio_zdev.h
+
S390 ZCRYPT DRIVER
M: Harald Freudenberger <[email protected]>
L: [email protected]
--
1.8.3.1
On 10/2/20 4:00 PM, Matthew Rosato wrote:
> This patchset provides a means by which hardware information about the
> underlying PCI device can be passed up to userspace (ie, QEMU) so that
> this hardware information can be used rather than previously hard-coded
> assumptions. A new VFIO region type is defined which holds this
> information.
>
> A form of these patches saw some rounds last year but has been back-
> tabled for a while. The original work for this feature was done by Pierre
> Morel. I'd like to refresh the discussion on this and get this finished up
> so that we can move forward with better-supporting additional types of
> PCI-attached devices. The proposal here presents a completely different
> region mapping vs the prior approach, taking inspiration from vfio info
> capability chains to provide device CLP information in a way that allows
> for future expansion (new CLP features).
>
> This feature is toggled via the CONFIG_VFIO_PCI_ZDEV configuration entry.
>
> Changes from v1:
> - Added ACKs (thanks!)
> - Patch 2: Minor change:s/util_avail/util_str_avail/ per Niklas
> - Patch 3: removed __packed
> - Patch 3: rework various descriptions / comment blocks
> - New patch: MAINTAINERS hit to cover new files.
>
Link to latest QEMU patchset:
https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg00673.html
> Matthew Rosato (5):
> s390/pci: stash version in the zpci_dev
> s390/pci: track whether util_str is valid in the zpci_dev
> vfio-pci/zdev: define the vfio_zdev header
> vfio-pci/zdev: use a device region to retrieve zPCI information
> MAINTAINERS: Add entry for s390 vfio-pci
>
> MAINTAINERS | 8 ++
> arch/s390/include/asm/pci.h | 4 +-
> arch/s390/pci/pci_clp.c | 2 +
> drivers/vfio/pci/Kconfig | 13 ++
> drivers/vfio/pci/Makefile | 1 +
> drivers/vfio/pci/vfio_pci.c | 8 ++
> drivers/vfio/pci/vfio_pci_private.h | 10 ++
> drivers/vfio/pci/vfio_pci_zdev.c | 242 ++++++++++++++++++++++++++++++++++++
> include/uapi/linux/vfio.h | 5 +
> include/uapi/linux/vfio_zdev.h | 118 ++++++++++++++++++
> 10 files changed, 410 insertions(+), 1 deletion(-)
> create mode 100644 drivers/vfio/pci/vfio_pci_zdev.c
> create mode 100644 include/uapi/linux/vfio_zdev.h
>
On Fri, 2 Oct 2020 16:00:42 -0400
Matthew Rosato <[email protected]> wrote:
> We define a new device region in vfio.h to be able to get the ZPCI CLP
> information by reading this region from userspace.
>
> We create a new file, vfio_zdev.h to define the structure of the new
> region defined in vfio.h
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> include/uapi/linux/vfio.h | 5 ++
> include/uapi/linux/vfio_zdev.h | 118 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 123 insertions(+)
> create mode 100644 include/uapi/linux/vfio_zdev.h
>
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 9204705..65eb367 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -326,6 +326,11 @@ struct vfio_region_info_cap_type {
> * to do TLB invalidation on a GPU.
> */
> #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD (1)
> +/*
> + * IBM zPCI specific hardware feature information for a devcie. The contents
> + * of this region are mapped by struct vfio_region_zpci_info.
> + */
> +#define VFIO_REGION_SUBTYPE_IBM_ZPCI_CLP (2)
>
> /* sub-types for VFIO_REGION_TYPE_GFX */
> #define VFIO_REGION_SUBTYPE_GFX_EDID (1)
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> new file mode 100644
> index 0000000..1c8fb62
> --- /dev/null
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Region definition for ZPCI devices
> + *
> + * Copyright IBM Corp. 2020
> + *
> + * Author(s): Pierre Morel <[email protected]>
> + * Matthew Rosato <[email protected]>
> + */
> +
> +#ifndef _VFIO_ZDEV_H_
> +#define _VFIO_ZDEV_H_
> +
> +#include <linux/types.h>
> +
> +/**
> + * struct vfio_region_zpci_info - ZPCI information
> + *
> + * This region provides zPCI specific hardware feature information for a
> + * device.
> + *
> + * The ZPCI information structure is presented as a chain of CLP features
> + * defined below. argsz provides the size of the entire region, and offset
> + * provides the location of the first CLP feature in the chain.
> + *
> + */
> +struct vfio_region_zpci_info {
> + __u32 argsz; /* Size of entire payload */
> + __u32 offset; /* Location of first entry */
> +};
> +
> +/**
> + * struct vfio_region_zpci_info_hdr - ZPCI header information
> + *
> + * This structure is included at the top of each CLP feature to define what
> + * type of CLP feature is presented / the structure version. The next value
> + * defines the offset of the next CLP feature, and is an offset from the very
> + * beginning of the region (vfio_region_zpci_info).
> + *
> + * Each CLP feature must have it's own unique 'id'.
> + */
> +struct vfio_region_zpci_info_hdr {
> + __u16 id; /* Identifies the CLP type */
> + __u16 version; /* version of the CLP data */
> + __u32 next; /* Offset of next entry */
> +};
> +
> +/**
> + * struct vfio_region_zpci_info_pci - Base PCI Function information
> + *
> + * This region provides a set of descriptive information about the associated
> + * PCI function.
> + */
> +#define VFIO_REGION_ZPCI_INFO_BASE 1
> +
> +struct vfio_region_zpci_info_base {
> + struct vfio_region_zpci_info_hdr hdr;
> + __u64 start_dma; /* Start of available DMA addresses */
> + __u64 end_dma; /* End of available DMA addresses */
> + __u16 pchid; /* Physical Channel ID */
> + __u16 vfn; /* Virtual function number */
> + __u16 fmb_length; /* Measurement Block Length (in bytes) */
> + __u8 pft; /* PCI Function Type */
> + __u8 gid; /* PCI function group ID */
> +};
> +
> +
> +/**
> + * struct vfio_region_zpci_info_group - Base PCI Function Group information
> + *
> + * This region provides a set of descriptive information about the group of PCI
> + * functions that the associated device belongs to.
> + */
> +#define VFIO_REGION_ZPCI_INFO_GROUP 2
> +
> +struct vfio_region_zpci_info_group {
> + struct vfio_region_zpci_info_hdr hdr;
> + __u64 dasm; /* DMA Address space mask */
> + __u64 msi_addr; /* MSI address */
> + __u64 flags;
> +#define VFIO_PCI_ZDEV_FLAGS_REFRESH 1 /* Use program-specified TLB refresh */
> + __u16 mui; /* Measurement Block Update Interval */
> + __u16 noi; /* Maximum number of MSIs */
> + __u16 maxstbl; /* Maximum Store Block Length */
> + __u8 version; /* Supported PCI Version */
> +};
> +
> +/**
> + * struct vfio_region_zpci_info_util - Utility String
> + *
> + * This region provides the utility string for the associated device, which is
> + * a device identifier string made up of EBCDID characters. 'size' specifies
> + * the length of 'util_str'.
> + */
> +#define VFIO_REGION_ZPCI_INFO_UTIL 3
> +
> +struct vfio_region_zpci_info_util {
> + struct vfio_region_zpci_info_hdr hdr;
> + __u32 size;
> + __u8 util_str[];
> +};
> +
> +/**
> + * struct vfio_region_zpci_info_pfip - PCI Function Path
> + *
> + * This region provides the PCI function path string, which is an identifier
> + * that describes the internal hardware path of the device. 'size' specifies
> + * the length of 'pfip'.
> + */
> +#define VFIO_REGION_ZPCI_INFO_PFIP 4
> +
> +struct vfio_region_zpci_info_pfip {
> +struct vfio_region_zpci_info_hdr hdr;
> + __u32 size;
> + __u8 pfip[];
> +};
> +
> +#endif
Can you discuss why a region with embedded capability chain is a better
solution than extending the VFIO_DEVICE_GET_INFO ioctl to support a
capability chain and providing this info there? This all appears to be
read-only info, so what's the benefit of duplicating yet another
capability chain in a region? It would also be possible to define four
separate device specific regions, one for each of these capabilities
rather than creating this chain. It just seems like a strange approach
TBH. Thanks,
Alex
On 10/2/20 5:44 PM, Alex Williamson wrote:
> On Fri, 2 Oct 2020 16:00:42 -0400
> Matthew Rosato <[email protected]> wrote:
>
>> We define a new device region in vfio.h to be able to get the ZPCI CLP
>> information by reading this region from userspace.
>>
>> We create a new file, vfio_zdev.h to define the structure of the new
>> region defined in vfio.h
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>> include/uapi/linux/vfio.h | 5 ++
>> include/uapi/linux/vfio_zdev.h | 118 +++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 123 insertions(+)
>> create mode 100644 include/uapi/linux/vfio_zdev.h
>>
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 9204705..65eb367 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -326,6 +326,11 @@ struct vfio_region_info_cap_type {
>> * to do TLB invalidation on a GPU.
>> */
>> #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD (1)
>> +/*
>> + * IBM zPCI specific hardware feature information for a devcie. The contents
>> + * of this region are mapped by struct vfio_region_zpci_info.
>> + */
>> +#define VFIO_REGION_SUBTYPE_IBM_ZPCI_CLP (2)
>>
>> /* sub-types for VFIO_REGION_TYPE_GFX */
>> #define VFIO_REGION_SUBTYPE_GFX_EDID (1)
>> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
>> new file mode 100644
>> index 0000000..1c8fb62
>> --- /dev/null
>> +++ b/include/uapi/linux/vfio_zdev.h
>> @@ -0,0 +1,118 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/*
>> + * Region definition for ZPCI devices
>> + *
>> + * Copyright IBM Corp. 2020
>> + *
>> + * Author(s): Pierre Morel <[email protected]>
>> + * Matthew Rosato <[email protected]>
>> + */
>> +
>> +#ifndef _VFIO_ZDEV_H_
>> +#define _VFIO_ZDEV_H_
>> +
>> +#include <linux/types.h>
>> +
>> +/**
>> + * struct vfio_region_zpci_info - ZPCI information
>> + *
>> + * This region provides zPCI specific hardware feature information for a
>> + * device.
>> + *
>> + * The ZPCI information structure is presented as a chain of CLP features
>> + * defined below. argsz provides the size of the entire region, and offset
>> + * provides the location of the first CLP feature in the chain.
>> + *
>> + */
>> +struct vfio_region_zpci_info {
>> + __u32 argsz; /* Size of entire payload */
>> + __u32 offset; /* Location of first entry */
>> +};
>> +
>> +/**
>> + * struct vfio_region_zpci_info_hdr - ZPCI header information
>> + *
>> + * This structure is included at the top of each CLP feature to define what
>> + * type of CLP feature is presented / the structure version. The next value
>> + * defines the offset of the next CLP feature, and is an offset from the very
>> + * beginning of the region (vfio_region_zpci_info).
>> + *
>> + * Each CLP feature must have it's own unique 'id'.
>> + */
>> +struct vfio_region_zpci_info_hdr {
>> + __u16 id; /* Identifies the CLP type */
>> + __u16 version; /* version of the CLP data */
>> + __u32 next; /* Offset of next entry */
>> +};
>> +
>> +/**
>> + * struct vfio_region_zpci_info_pci - Base PCI Function information
>> + *
>> + * This region provides a set of descriptive information about the associated
>> + * PCI function.
>> + */
>> +#define VFIO_REGION_ZPCI_INFO_BASE 1
>> +
>> +struct vfio_region_zpci_info_base {
>> + struct vfio_region_zpci_info_hdr hdr;
>> + __u64 start_dma; /* Start of available DMA addresses */
>> + __u64 end_dma; /* End of available DMA addresses */
>> + __u16 pchid; /* Physical Channel ID */
>> + __u16 vfn; /* Virtual function number */
>> + __u16 fmb_length; /* Measurement Block Length (in bytes) */
>> + __u8 pft; /* PCI Function Type */
>> + __u8 gid; /* PCI function group ID */
>> +};
>> +
>> +
>> +/**
>> + * struct vfio_region_zpci_info_group - Base PCI Function Group information
>> + *
>> + * This region provides a set of descriptive information about the group of PCI
>> + * functions that the associated device belongs to.
>> + */
>> +#define VFIO_REGION_ZPCI_INFO_GROUP 2
>> +
>> +struct vfio_region_zpci_info_group {
>> + struct vfio_region_zpci_info_hdr hdr;
>> + __u64 dasm; /* DMA Address space mask */
>> + __u64 msi_addr; /* MSI address */
>> + __u64 flags;
>> +#define VFIO_PCI_ZDEV_FLAGS_REFRESH 1 /* Use program-specified TLB refresh */
>> + __u16 mui; /* Measurement Block Update Interval */
>> + __u16 noi; /* Maximum number of MSIs */
>> + __u16 maxstbl; /* Maximum Store Block Length */
>> + __u8 version; /* Supported PCI Version */
>> +};
>> +
>> +/**
>> + * struct vfio_region_zpci_info_util - Utility String
>> + *
>> + * This region provides the utility string for the associated device, which is
>> + * a device identifier string made up of EBCDID characters. 'size' specifies
>> + * the length of 'util_str'.
>> + */
>> +#define VFIO_REGION_ZPCI_INFO_UTIL 3
>> +
>> +struct vfio_region_zpci_info_util {
>> + struct vfio_region_zpci_info_hdr hdr;
>> + __u32 size;
>> + __u8 util_str[];
>> +};
>> +
>> +/**
>> + * struct vfio_region_zpci_info_pfip - PCI Function Path
>> + *
>> + * This region provides the PCI function path string, which is an identifier
>> + * that describes the internal hardware path of the device. 'size' specifies
>> + * the length of 'pfip'.
>> + */
>> +#define VFIO_REGION_ZPCI_INFO_PFIP 4
>> +
>> +struct vfio_region_zpci_info_pfip {
>> +struct vfio_region_zpci_info_hdr hdr;
>> + __u32 size;
>> + __u8 pfip[];
>> +};
>> +
>> +#endif
>
> Can you discuss why a region with embedded capability chain is a better
> solution than extending the VFIO_DEVICE_GET_INFO ioctl to support a
> capability chain and providing this info there? This all appears to be
> read-only info, so what's the benefit of duplicating yet another
It is indeed read-only info, and the device region was defined as such.
I would not necessarily be opposed to extending VFIO_DEVICE_GET_INFO
with these defined as capabilities; I'd say a primary motivating factor
to putting these in their own region was to avoid stuffing a bunch of
s390-specific capabilities into a general-purpose ioctl response.
But if you're OK with that notion, I can give that a crack in v3.
> capability chain in a region? It would also be possible to define four
> separate device specific regions, one for each of these capabilities
> rather than creating this chain. It just seems like a strange approach
I'm not sure if creating separate regions would be the right approach
though; these are just the first 4. There will definitely be additional
capabilities in support of new zPCI features moving forward, I'm not
sure how many regions we really want to end up with. Some might be as
small as a single field, which seems more in-line with capabilities vs
an entire region.
On 10/5/20 12:01 PM, Cornelia Huck wrote:
> On Mon, 5 Oct 2020 09:52:25 -0400
> Matthew Rosato <[email protected]> wrote:
>
>> On 10/2/20 5:44 PM, Alex Williamson wrote:
>
>>> Can you discuss why a region with embedded capability chain is a better
>>> solution than extending the VFIO_DEVICE_GET_INFO ioctl to support a
>>> capability chain and providing this info there? This all appears to be
>>> read-only info, so what's the benefit of duplicating yet another
>>
>> It is indeed read-only info, and the device region was defined as such.
>>
>> I would not necessarily be opposed to extending VFIO_DEVICE_GET_INFO
>> with these defined as capabilities; I'd say a primary motivating factor
>> to putting these in their own region was to avoid stuffing a bunch of
>> s390-specific capabilities into a general-purpose ioctl response.
>
> Can't you make the zdev code register the capabilities? That would put
> them nicely into their own configurable part.
>
I can still keep the code that adds these capabilities in the zdev .c
file, thus meaning they will only be added for s390 zpci devices -- but
the actual definition of them should probably instead be in vfio.h, no?
(maybe that's what you mean, but let's lay it out just in case)
The capability IDs would be shared with any other potential user of
VFIO_DEVICE_GET_INFO (I guess there is precedent for this already,
nvlink2 does this for vfio_region_info, see
VFIO_REGION_INFO_CAP_NVLINK2_SSATGT as an example).
Today, ZPCI would be the only users of VFIO_DEVICE_GET_INFO capability
chains. Tomorrow, some other type might use them too. Unless we want
to put a stake in the ground that says there will never be a case for a
capability that all devices share on VFIO_DEVICE_GET_INFO, I think we
should keep the IDs unique and define the capabilities in vfio.h but do
the corresponding add_capability() calls from a zdev-specific file.
>>
>> But if you're OK with that notion, I can give that a crack in v3.
>>
>>> capability chain in a region? It would also be possible to define four
>>> separate device specific regions, one for each of these capabilities
>>> rather than creating this chain. It just seems like a strange approach
>>
>> I'm not sure if creating separate regions would be the right approach
>> though; these are just the first 4. There will definitely be additional
>> capabilities in support of new zPCI features moving forward, I'm not
>> sure how many regions we really want to end up with. Some might be as
>> small as a single field, which seems more in-line with capabilities vs
>> an entire region.
>
> If we are expecting more of these in the future, going with GET_INFO
> capabilities when adding new ones seems like the best approach.
>
On Mon, 5 Oct 2020 12:16:10 -0400
Matthew Rosato <[email protected]> wrote:
> On 10/5/20 12:01 PM, Cornelia Huck wrote:
> > On Mon, 5 Oct 2020 09:52:25 -0400
> > Matthew Rosato <[email protected]> wrote:
> >
> >> On 10/2/20 5:44 PM, Alex Williamson wrote:
> >
> >>> Can you discuss why a region with embedded capability chain is a better
> >>> solution than extending the VFIO_DEVICE_GET_INFO ioctl to support a
> >>> capability chain and providing this info there? This all appears to be
> >>> read-only info, so what's the benefit of duplicating yet another
> >>
> >> It is indeed read-only info, and the device region was defined as such.
> >>
> >> I would not necessarily be opposed to extending VFIO_DEVICE_GET_INFO
> >> with these defined as capabilities; I'd say a primary motivating factor
> >> to putting these in their own region was to avoid stuffing a bunch of
> >> s390-specific capabilities into a general-purpose ioctl response.
> >
> > Can't you make the zdev code register the capabilities? That would put
> > them nicely into their own configurable part.
> >
>
> I can still keep the code that adds these capabilities in the zdev .c
> file, thus meaning they will only be added for s390 zpci devices -- but
> the actual definition of them should probably instead be in vfio.h, no?
> (maybe that's what you mean, but let's lay it out just in case)
>
> The capability IDs would be shared with any other potential user of
> VFIO_DEVICE_GET_INFO (I guess there is precedent for this already,
> nvlink2 does this for vfio_region_info, see
> VFIO_REGION_INFO_CAP_NVLINK2_SSATGT as an example).
>
> Today, ZPCI would be the only users of VFIO_DEVICE_GET_INFO capability
> chains. Tomorrow, some other type might use them too. Unless we want
> to put a stake in the ground that says there will never be a case for a
> capability that all devices share on VFIO_DEVICE_GET_INFO, I think we
> should keep the IDs unique and define the capabilities in vfio.h but do
> the corresponding add_capability() calls from a zdev-specific file.
Agreed. We should have enough space for multiple users, and I do not
consider reserving the IDs cluttering.
On Mon, 5 Oct 2020 09:52:25 -0400
Matthew Rosato <[email protected]> wrote:
> On 10/2/20 5:44 PM, Alex Williamson wrote:
> > Can you discuss why a region with embedded capability chain is a better
> > solution than extending the VFIO_DEVICE_GET_INFO ioctl to support a
> > capability chain and providing this info there? This all appears to be
> > read-only info, so what's the benefit of duplicating yet another
>
> It is indeed read-only info, and the device region was defined as such.
>
> I would not necessarily be opposed to extending VFIO_DEVICE_GET_INFO
> with these defined as capabilities; I'd say a primary motivating factor
> to putting these in their own region was to avoid stuffing a bunch of
> s390-specific capabilities into a general-purpose ioctl response.
Can't you make the zdev code register the capabilities? That would put
them nicely into their own configurable part.
>
> But if you're OK with that notion, I can give that a crack in v3.
>
> > capability chain in a region? It would also be possible to define four
> > separate device specific regions, one for each of these capabilities
> > rather than creating this chain. It just seems like a strange approach
>
> I'm not sure if creating separate regions would be the right approach
> though; these are just the first 4. There will definitely be additional
> capabilities in support of new zPCI features moving forward, I'm not
> sure how many regions we really want to end up with. Some might be as
> small as a single field, which seems more in-line with capabilities vs
> an entire region.
If we are expecting more of these in the future, going with GET_INFO
capabilities when adding new ones seems like the best approach.
On Fri, 2 Oct 2020 16:00:44 -0400
Matthew Rosato <[email protected]> wrote:
> Add myself to cover s390-specific items related to vfio-pci.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> MAINTAINERS | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 190c7fa..389c4ad 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -15162,6 +15162,14 @@ F: Documentation/s390/vfio-ccw.rst
> F: drivers/s390/cio/vfio_ccw*
> F: include/uapi/linux/vfio_ccw.h
>
> +S390 VFIO-PCI DRIVER
> +M: Matthew Rosato <[email protected]>
> +L: [email protected]
> +L: [email protected]
> +S: Supported
> +F: drivers/vfio/pci/vfio_pci_zdev.c
> +F: include/uapi/linux/vfio_zdev.h
> +
> S390 ZCRYPT DRIVER
> M: Harald Freudenberger <[email protected]>
> L: [email protected]
Acked-by: Cornelia Huck <[email protected]>