2019-04-19 19:14:44

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 0/2] introduction of version attribute for VFIO live migration

This patchset introduces a version attribute under sysfs of VFIO Mediated
devices.

This version attribute is used by user space software like libvirt to
determine whether two mdev devices are compatible for live migration
before starting live migration.

Patch 1 defines version attribute as mandatory for VFIO live migration. It
means if version attribute is missing or it returns errno, the
corresponding mdev device is regarded as not supporting live migration.
samples for vfio-mdev are modified to demonstrate it.

Patch 2 uses GVT as an example to show how to expose version attribute and
check device compatibility in vendor driver.


Yan Zhao (2):
vfio/mdev: add version field as mandatory attribute for mdev device
drm/i915/gvt: export mdev device version to sysfs for Intel vGPU

Documentation/vfio-mediated-device.txt | 36 +++++++++
drivers/gpu/drm/i915/gvt/Makefile | 2 +-
drivers/gpu/drm/i915/gvt/device_version.c | 94 +++++++++++++++++++++++
drivers/gpu/drm/i915/gvt/gvt.c | 55 +++++++++++++
drivers/gpu/drm/i915/gvt/gvt.h | 6 ++
samples/vfio-mdev/mbochs.c | 17 ++++
samples/vfio-mdev/mdpy.c | 16 ++++
samples/vfio-mdev/mtty.c | 16 ++++
8 files changed, 241 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/i915/gvt/device_version.c

--
2.17.1



2019-04-19 19:04:57

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

device version attribute in mdev sysfs is used by user space software
(e.g. libvirt) to query device compatibility for live migration of VFIO
mdev devices. This attribute is mandatory if a mdev device supports live
migration.

It consists of two parts: common part and vendor proprietary part.
common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
identifies device type. e.g., for pci device, it is
"pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
vendor proprietary part: this part is varied in length. vendor driver can
specify any string to identify a device.

When reading this attribute, it should show device version string of the
device of type <type-id>. If a device does not support live migration, it
should return errno.
When writing a string to this attribute, it returns errno for
incompatibility or returns written string length in compatibility case.
If a device does not support live migration, it always returns errno.

For user space software to use:
1.
Before starting live migration, user space software first reads source side
mdev device's version. e.g.
"#cat \
/sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
00028086-193b-i915-GVTg_V5_4

2.
Then, user space software writes the source side returned version string
to device version attribute in target side, and checks the return value.
If a negative errno is returned in the target side, then mdev devices in
source and target sides are not compatible;
If a positive number is returned and it equals to the length of written
string, then the two mdev devices in source and target side are compatible.
e.g.
(a) compatibility case
"# echo 00028086-193b-i915-GVTg_V5_4 >
/sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"

(b) incompatibility case
"#echo 00028086-193b-i915-GVTg_V5_1 >
/sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
-bash: echo: write error: Invalid argument

3. if two mdev devices are compatible, user space software can start
live migration, and vice versa.

Note: if a mdev device does not support live migration, it either does
not provide a version attribute, or always returns errno when its version
attribute is read/written.

Cc: Alex Williamson <[email protected]>
Cc: Erik Skultety <[email protected]>
Cc: "Dr. David Alan Gilbert" <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: "Tian, Kevin" <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: "Wang, Zhi A" <[email protected]>
Cc: Neo Jia <[email protected]>
Cc: Kirti Wankhede <[email protected]>

Signed-off-by: Yan Zhao <[email protected]>
---
Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
samples/vfio-mdev/mbochs.c | 17 ++++++++++++
samples/vfio-mdev/mdpy.c | 16 ++++++++++++
samples/vfio-mdev/mtty.c | 16 ++++++++++++
4 files changed, 85 insertions(+)

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
index c3f69bcaf96e..bc28471c0667 100644
--- a/Documentation/vfio-mediated-device.txt
+++ b/Documentation/vfio-mediated-device.txt
@@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
| | |--- available_instances
| | |--- device_api
| | |--- description
+ | | |--- version
| | |--- [devices]
| |--- [<type-id>]
| | |--- create
@@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
| | |--- available_instances
| | |--- device_api
| | |--- description
+ | | |--- version
| | |--- [devices]
| |--- [<type-id>]
| |--- create
@@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
| |--- available_instances
| |--- device_api
| |--- description
+ | |--- version
| |--- [devices]

* [mdev_supported_types]
@@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
[<type-id>], device_api, and available_instances are mandatory attributes
that should be provided by vendor driver.

+ version is a mandatory attribute if a mdev device supports live migration.
+
* [<type-id>]

The [<type-id>] name is created by adding the device driver string as a prefix
@@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
This attribute should show the number of devices of type <type-id> that can be
created.

+* version
+
+ This attribute is rw. It is used to check whether two devices are compatible
+ for live migration. If this attribute is missing, then the corresponding mdev
+ device is regarded as not supporting live migration.
+
+ It consists of two parts: common part and vendor proprietary part.
+ common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
+ device type. e.g., for pci device, it is
+ "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
+ vendor proprietary part: this part is varied in length. vendor driver can
+ specify any string to identify a device.
+
+ When reading this attribute, it should show device version string of the device
+ of type <type-id>. If a device does not support live migration, it should
+ return errno.
+ When writing a string to this attribute, it returns errno for incompatibility
+ or returns written string length in compatibility case. If a device does not
+ support live migration, it always returns errno.
+
+ for example.
+ # cat \
+ /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
+ 00028086-193b-i915-GVTg_V5_2
+
+ #echo 00028086-193b-i915-GVTg_V5_2 > \
+ /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
+ -bash: echo: write error: Invalid argument
+
* [device]

This directory contains links to the devices of type <type-id> that have been
@@ -327,12 +361,14 @@ card.
| | |-- available_instances
| | |-- create
| | |-- device_api
+ | | |-- version
| | |-- devices
| | `-- name
| `-- mtty-2
| |-- available_instances
| |-- create
| |-- device_api
+ | |-- version
| |-- devices
| `-- name
|-- mtty_dev
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index b038aa9f5a70..2f5ba96b91a2 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
}
MDEV_TYPE_ATTR_RO(device_api);

+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+ char *buf)
+{
+ /* do not support live migration */
+ return -EINVAL;
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+ const char *buf, size_t count)
+{
+ /* do not support live migration */
+ return -EINVAL;
+}
+
+static MDEV_TYPE_ATTR_RW(version);
+
static struct attribute *mdev_types_attrs[] = {
&mdev_type_attr_name.attr,
&mdev_type_attr_description.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_available_instances.attr,
+ &mdev_type_attr_version.attr,
NULL,
};

diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index cc86bf6566e4..ff15fdfc7d46 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
}
MDEV_TYPE_ATTR_RO(device_api);

+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+ char *buf)
+{
+ /* do not support live migration */
+ return -EINVAL;
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+ const char *buf, size_t count)
+{
+ /* do not support live migration */
+ return -EINVAL;
+}
+static MDEV_TYPE_ATTR_RW(version);
+
static struct attribute *mdev_types_attrs[] = {
&mdev_type_attr_name.attr,
&mdev_type_attr_description.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_available_instances.attr,
+ &mdev_type_attr_version.attr,
NULL,
};

diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 1c77c370c92f..4ae3aad3474d 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,

MDEV_TYPE_ATTR_RO(device_api);

+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+ char *buf)
+{
+ /* do not support live migration */
+ return -EINVAL;
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+ const char *buf, size_t count)
+{
+ /* do not support live migration */
+ return -EINVAL;
+}
+
+static MDEV_TYPE_ATTR_RW(version);
static struct attribute *mdev_types_attrs[] = {
&mdev_type_attr_name.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_available_instances.attr,
+ &mdev_type_attr_version.attr,
NULL,
};

--
2.17.1


2019-04-19 19:06:15

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 2/2] drm/i915/gvt: export mdev device version to sysfs for Intel vGPU

This feature implements the version attribute for Intel's vGPU mdev
devices.

version attribute is rw. It is queried by userspace software like libvirt
to check whether two vGPUs are compatible for live migration.

It consists of two parts: common part and vendor proprietary part.
common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
identifies device type. e.g., for pci device, it is
"pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
vendor proprietary part: this part is varied in length. vendor driver can
specify any string to identify a device.

For Intel vGPU of gen8 and gen9, the vendor proprietary part currently
consists of 2 fields: "device id" + "mdev type".

Reading from a vGPU's version attribute, a string is returned in below
format: 00028086-<device id>-<mdev type>. e.g.
00028086-193b-i915-GVTg_V5_2.

Writing a string to a vGPU's version attribute will trigger GVT to check
whether a vGPU identified by the written string is compatible with
current vGPU owning this version attribute. errno is returned if the two
vGPUs are incompatible. The length of written string is returned in
compatible case.

For other platforms, and for GVT not supporting vGPU live migration
feature, errnos are returned when read/write of mdev devices' version
attributes.

For old GVT versions where no version attributes exposed in sysfs, it is
regarded as not supporting vGPU live migration.

For future platforms, besides the current 2 fields in vendor proprietary
part, more fields may be added to identify Intel vGPU well for live
migration purpose.

Cc: Alex Williamson <[email protected]>
Cc: Erik Skultety <[email protected]>
Cc: "Dr. David Alan Gilbert" <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: "Tian, Kevin" <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: "Wang, Zhi A" <[email protected]>
c: Neo Jia <[email protected]>
Cc: Kirti Wankhede <[email protected]>

Signed-off-by: Yan Zhao <[email protected]>
---
drivers/gpu/drm/i915/gvt/Makefile | 2 +-
drivers/gpu/drm/i915/gvt/device_version.c | 94 +++++++++++++++++++++++
drivers/gpu/drm/i915/gvt/gvt.c | 55 +++++++++++++
drivers/gpu/drm/i915/gvt/gvt.h | 6 ++
4 files changed, 156 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/i915/gvt/device_version.c

diff --git a/drivers/gpu/drm/i915/gvt/Makefile b/drivers/gpu/drm/i915/gvt/Makefile
index 271fb46d4dd0..54e209a23899 100644
--- a/drivers/gpu/drm/i915/gvt/Makefile
+++ b/drivers/gpu/drm/i915/gvt/Makefile
@@ -3,7 +3,7 @@ GVT_DIR := gvt
GVT_SOURCE := gvt.o aperture_gm.o handlers.o vgpu.o trace_points.o firmware.o \
interrupt.o gtt.o cfg_space.o opregion.o mmio.o display.o edid.o \
execlist.o scheduler.o sched_policy.o mmio_context.o cmd_parser.o debugfs.o \
- fb_decoder.o dmabuf.o page_track.o
+ fb_decoder.o dmabuf.o page_track.o device_version.o

ccflags-y += -I$(src) -I$(src)/$(GVT_DIR)
i915-y += $(addprefix $(GVT_DIR)/, $(GVT_SOURCE))
diff --git a/drivers/gpu/drm/i915/gvt/device_version.c b/drivers/gpu/drm/i915/gvt/device_version.c
new file mode 100644
index 000000000000..c64010d2bc54
--- /dev/null
+++ b/drivers/gpu/drm/i915/gvt/device_version.c
@@ -0,0 +1,94 @@
+/*
+ * Copyright(c) 2011-2017 Intel Corporation. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+#include <linux/vfio.h>
+#include "i915_drv.h"
+
+#define GVT_VFIO_DEVICE_VENDOR_ID ((0x8086) | \
+ ((VFIO_DEVICE_FLAGS_PCI & 0xff) << 16))
+
+#define GVT_DEVICE_VERSION_COMMON_LEN 0x8
+#define GVT_DEVICE_VERSION_DEVICE_ID_LEN 0x4
+
+static bool is_compatible(const char *self, const char *remote)
+{
+ if (strlen(remote) != strlen(self))
+ return false;
+
+ return (strncmp(self, remote, strlen(self))) ? false : true;
+}
+
+ssize_t intel_gvt_get_vfio_device_version_len(struct drm_i915_private *dev_priv)
+{
+ if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
+ return -ENODEV;
+
+ return PAGE_SIZE;
+}
+
+ssize_t intel_gvt_get_vfio_device_version(struct drm_i915_private *dev_priv,
+ char *buf, const char *mdev_type)
+{
+ int cnt = 0, ret = 0;
+ const char *str = NULL;
+
+ /* currently only gen8 & gen9 are supported */
+ if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
+ return -ENODEV;
+
+ /* first 32 bit common part specifying vendor id and it's a pci
+ * device
+ */
+ cnt = snprintf(buf, GVT_DEVICE_VERSION_COMMON_LEN + 1,
+ "%08x", GVT_VFIO_DEVICE_VENDOR_ID);
+ buf += cnt;
+ ret += cnt;
+
+ /* vendor proprietary part: device id + mdev type */
+ /* device id */
+ cnt = snprintf(buf, GVT_DEVICE_VERSION_DEVICE_ID_LEN + 2,
+ "-%04x", INTEL_DEVID(dev_priv));
+ buf += cnt;
+ ret += cnt;
+
+ /* mdev type */
+ str = mdev_type;
+ cnt = snprintf(buf, strlen(str) + 3, "-%s\n", mdev_type);
+ buf += cnt;
+ ret += cnt;
+
+ return ret;
+}
+
+ssize_t intel_gvt_check_vfio_device_version(struct drm_i915_private *dev_priv,
+ const char *self, const char *remote)
+{
+
+ /* currently only gen8 & gen9 are supported */
+ if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
+ return -ENODEV;
+
+ if (!is_compatible(self, remote))
+ return -EINVAL;
+
+ return 0;
+}
diff --git a/drivers/gpu/drm/i915/gvt/gvt.c b/drivers/gpu/drm/i915/gvt/gvt.c
index 43f4242062dd..e720465b93d8 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.c
+++ b/drivers/gpu/drm/i915/gvt/gvt.c
@@ -105,14 +105,69 @@ static ssize_t description_show(struct kobject *kobj, struct device *dev,
type->weight);
}

+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+ char *buf)
+{
+#ifdef GVT_MIGRATION_VERSION
+ struct drm_i915_private *i915 = kdev_to_i915(dev);
+ const char *mdev_type = kobject_name(kobj);
+
+ return intel_gvt_get_vfio_device_version(i915, buf, mdev_type);
+#else
+ /* do not support live migration */
+ return -EINVAL;
+#endif
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+ const char *buf, size_t count)
+{
+#ifdef GVT_MIGRATION_VERSION
+ char *remote = NULL, *self = NULL;
+ int len, ret = 0;
+ struct drm_i915_private *i915 = kdev_to_i915(dev);
+ const char *mdev_type = kobject_name(kobj);
+
+ len = intel_gvt_get_vfio_device_version_len(i915);
+ if (len < 0)
+ return len;
+
+ self = kmalloc(len, GFP_KERNEL);
+ if (!self)
+ return -ENOMEM;
+
+ ret = intel_gvt_get_vfio_device_version(i915, self, mdev_type);
+ if (ret < 0)
+ goto out;
+
+ remote = kstrndup(buf, count, GFP_KERNEL);
+ if (!remote) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = intel_gvt_check_vfio_device_version(i915, self, remote);
+
+out:
+ kfree(self);
+ kfree(remote);
+ return (ret < 0 ? ret : count);
+#else
+ /* do not support live migration */
+ return -EINVAL;
+#endif
+}
+
static MDEV_TYPE_ATTR_RO(available_instances);
static MDEV_TYPE_ATTR_RO(device_api);
static MDEV_TYPE_ATTR_RO(description);
+static MDEV_TYPE_ATTR_RW(version);

static struct attribute *gvt_type_attrs[] = {
&mdev_type_attr_available_instances.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_description.attr,
+ &mdev_type_attr_version.attr,
NULL,
};

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index f5a328b5290a..4062f6b26acf 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -687,6 +687,12 @@ void intel_gvt_debugfs_remove_vgpu(struct intel_vgpu *vgpu);
int intel_gvt_debugfs_init(struct intel_gvt *gvt);
void intel_gvt_debugfs_clean(struct intel_gvt *gvt);

+ssize_t intel_gvt_get_vfio_device_version(struct drm_i915_private *i915,
+ char *buf, const char *mdev_type);
+ssize_t intel_gvt_check_vfio_device_version(struct drm_i915_private *dev_priv,
+ const char *self, const char *remote);
+ssize_t
+intel_gvt_get_vfio_device_version_len(struct drm_i915_private *dev_priv);

#include "trace.h"
#include "mpt.h"
--
2.17.1


2019-04-22 09:46:13

by Zhenyu Wang

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/i915/gvt: export mdev device version to sysfs for Intel vGPU

On 2019.04.19 04:35:59 -0400, Yan Zhao wrote:
> This feature implements the version attribute for Intel's vGPU mdev
> devices.
>
> version attribute is rw. It is queried by userspace software like libvirt
> to check whether two vGPUs are compatible for live migration.
>
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
> specify any string to identify a device.
>
> For Intel vGPU of gen8 and gen9, the vendor proprietary part currently
> consists of 2 fields: "device id" + "mdev type".
>
> Reading from a vGPU's version attribute, a string is returned in below
> format: 00028086-<device id>-<mdev type>. e.g.
> 00028086-193b-i915-GVTg_V5_2.
>
> Writing a string to a vGPU's version attribute will trigger GVT to check
> whether a vGPU identified by the written string is compatible with
> current vGPU owning this version attribute. errno is returned if the two
> vGPUs are incompatible. The length of written string is returned in
> compatible case.
>
> For other platforms, and for GVT not supporting vGPU live migration
> feature, errnos are returned when read/write of mdev devices' version
> attributes.
>
> For old GVT versions where no version attributes exposed in sysfs, it is
> regarded as not supporting vGPU live migration.
>
> For future platforms, besides the current 2 fields in vendor proprietary
> part, more fields may be added to identify Intel vGPU well for live
> migration purpose.
>
> Cc: Alex Williamson <[email protected]>
> Cc: Erik Skultety <[email protected]>
> Cc: "Dr. David Alan Gilbert" <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: "Tian, Kevin" <[email protected]>
> Cc: Zhenyu Wang <[email protected]>
> Cc: "Wang, Zhi A" <[email protected]>
> c: Neo Jia <[email protected]>
> Cc: Kirti Wankhede <[email protected]>
>
> Signed-off-by: Yan Zhao <[email protected]>
> ---
> drivers/gpu/drm/i915/gvt/Makefile | 2 +-
> drivers/gpu/drm/i915/gvt/device_version.c | 94 +++++++++++++++++++++++
> drivers/gpu/drm/i915/gvt/gvt.c | 55 +++++++++++++
> drivers/gpu/drm/i915/gvt/gvt.h | 6 ++
> 4 files changed, 156 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/drm/i915/gvt/device_version.c
>
> diff --git a/drivers/gpu/drm/i915/gvt/Makefile b/drivers/gpu/drm/i915/gvt/Makefile
> index 271fb46d4dd0..54e209a23899 100644
> --- a/drivers/gpu/drm/i915/gvt/Makefile
> +++ b/drivers/gpu/drm/i915/gvt/Makefile
> @@ -3,7 +3,7 @@ GVT_DIR := gvt
> GVT_SOURCE := gvt.o aperture_gm.o handlers.o vgpu.o trace_points.o firmware.o \
> interrupt.o gtt.o cfg_space.o opregion.o mmio.o display.o edid.o \
> execlist.o scheduler.o sched_policy.o mmio_context.o cmd_parser.o debugfs.o \
> - fb_decoder.o dmabuf.o page_track.o
> + fb_decoder.o dmabuf.o page_track.o device_version.o
>
> ccflags-y += -I$(src) -I$(src)/$(GVT_DIR)
> i915-y += $(addprefix $(GVT_DIR)/, $(GVT_SOURCE))
> diff --git a/drivers/gpu/drm/i915/gvt/device_version.c b/drivers/gpu/drm/i915/gvt/device_version.c
> new file mode 100644
> index 000000000000..c64010d2bc54
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gvt/device_version.c
> @@ -0,0 +1,94 @@
> +/*
> + * Copyright(c) 2011-2017 Intel Corporation. All rights reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +#include <linux/vfio.h>
> +#include "i915_drv.h"
> +
> +#define GVT_VFIO_DEVICE_VENDOR_ID ((0x8086) | \
> + ((VFIO_DEVICE_FLAGS_PCI & 0xff) << 16))
> +
> +#define GVT_DEVICE_VERSION_COMMON_LEN 0x8
> +#define GVT_DEVICE_VERSION_DEVICE_ID_LEN 0x4
> +
> +static bool is_compatible(const char *self, const char *remote)
> +{
> + if (strlen(remote) != strlen(self))
> + return false;
> +
> + return (strncmp(self, remote, strlen(self))) ? false : true;
> +}
> +
> +ssize_t intel_gvt_get_vfio_device_version_len(struct drm_i915_private *dev_priv)
> +{
> + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> + return -ENODEV;
> +
> + return PAGE_SIZE;
> +}
> +
> +ssize_t intel_gvt_get_vfio_device_version(struct drm_i915_private *dev_priv,
> + char *buf, const char *mdev_type)
> +{
> + int cnt = 0, ret = 0;
> + const char *str = NULL;
> +
> + /* currently only gen8 & gen9 are supported */
> + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> + return -ENODEV;
> +
> + /* first 32 bit common part specifying vendor id and it's a pci
> + * device
> + */
> + cnt = snprintf(buf, GVT_DEVICE_VERSION_COMMON_LEN + 1,
> + "%08x", GVT_VFIO_DEVICE_VENDOR_ID);
> + buf += cnt;
> + ret += cnt;
> +
> + /* vendor proprietary part: device id + mdev type */
> + /* device id */
> + cnt = snprintf(buf, GVT_DEVICE_VERSION_DEVICE_ID_LEN + 2,
> + "-%04x", INTEL_DEVID(dev_priv));
> + buf += cnt;
> + ret += cnt;
> +
> + /* mdev type */
> + str = mdev_type;
> + cnt = snprintf(buf, strlen(str) + 3, "-%s\n", mdev_type);
> + buf += cnt;
> + ret += cnt;
> +
> + return ret;
> +}
> +
> +ssize_t intel_gvt_check_vfio_device_version(struct drm_i915_private *dev_priv,
> + const char *self, const char *remote)
> +{
> +
> + /* currently only gen8 & gen9 are supported */
> + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> + return -ENODEV;
> +
> + if (!is_compatible(self, remote))
> + return -EINVAL;
> +
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/gvt/gvt.c b/drivers/gpu/drm/i915/gvt/gvt.c
> index 43f4242062dd..e720465b93d8 100644
> --- a/drivers/gpu/drm/i915/gvt/gvt.c
> +++ b/drivers/gpu/drm/i915/gvt/gvt.c
> @@ -105,14 +105,69 @@ static ssize_t description_show(struct kobject *kobj, struct device *dev,
> type->weight);
> }
>
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> + char *buf)
> +{
> +#ifdef GVT_MIGRATION_VERSION
> + struct drm_i915_private *i915 = kdev_to_i915(dev);
> + const char *mdev_type = kobject_name(kobj);
> +
> + return intel_gvt_get_vfio_device_version(i915, buf, mdev_type);
> +#else
> + /* do not support live migration */
> + return -EINVAL;
> +#endif
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> + const char *buf, size_t count)
> +{
> +#ifdef GVT_MIGRATION_VERSION
> + char *remote = NULL, *self = NULL;
> + int len, ret = 0;
> + struct drm_i915_private *i915 = kdev_to_i915(dev);
> + const char *mdev_type = kobject_name(kobj);
> +
> + len = intel_gvt_get_vfio_device_version_len(i915);
> + if (len < 0)
> + return len;
> +
> + self = kmalloc(len, GFP_KERNEL);
> + if (!self)
> + return -ENOMEM;
> +
> + ret = intel_gvt_get_vfio_device_version(i915, self, mdev_type);
> + if (ret < 0)
> + goto out;
> +

device version string should be allocated and specified during device instance
setup instead of version attribute r/w time.

> + remote = kstrndup(buf, count, GFP_KERNEL);
> + if (!remote) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + ret = intel_gvt_check_vfio_device_version(i915, self, remote);
> +
> +out:
> + kfree(self);
> + kfree(remote);
> + return (ret < 0 ? ret : count);
> +#else
> + /* do not support live migration */
> + return -EINVAL;
> +#endif
> +}
> +
> static MDEV_TYPE_ATTR_RO(available_instances);
> static MDEV_TYPE_ATTR_RO(device_api);
> static MDEV_TYPE_ATTR_RO(description);
> +static MDEV_TYPE_ATTR_RW(version);
>
> static struct attribute *gvt_type_attrs[] = {
> &mdev_type_attr_available_instances.attr,
> &mdev_type_attr_device_api.attr,
> &mdev_type_attr_description.attr,
> + &mdev_type_attr_version.attr,
> NULL,
> };
>
> diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
> index f5a328b5290a..4062f6b26acf 100644
> --- a/drivers/gpu/drm/i915/gvt/gvt.h
> +++ b/drivers/gpu/drm/i915/gvt/gvt.h
> @@ -687,6 +687,12 @@ void intel_gvt_debugfs_remove_vgpu(struct intel_vgpu *vgpu);
> int intel_gvt_debugfs_init(struct intel_gvt *gvt);
> void intel_gvt_debugfs_clean(struct intel_gvt *gvt);
>
> +ssize_t intel_gvt_get_vfio_device_version(struct drm_i915_private *i915,
> + char *buf, const char *mdev_type);
> +ssize_t intel_gvt_check_vfio_device_version(struct drm_i915_private *dev_priv,
> + const char *self, const char *remote);
> +ssize_t
> +intel_gvt_get_vfio_device_version_len(struct drm_i915_private *dev_priv);
>
> #include "trace.h"
> #include "mpt.h"
> --
> 2.17.1
>
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

--
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827


Attachments:
(No filename) (9.94 kB)
signature.asc (201.00 B)
Download all attachments

2019-04-22 15:31:16

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Fri, 19 Apr 2019 04:35:04 -0400
Yan Zhao <[email protected]> wrote:

> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.

The Subject: doesn't quite match what's being proposed here.

> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).

What purpose does this serve? If it's intended as some sort of
namespace feature, shouldn't we first assume that we can only support
migration to devices of the same type? Therefore each type would
already have its own namespace. Also that would make the trailing bit
of the version string listed below in the example redundant. A vendor
is still welcome to include this in their version string if they wish,
but I think the string should be entirely vendor defined.

> vendor proprietary part: this part is varied in length. vendor driver can
> specify any string to identify a device.
>
> When reading this attribute, it should show device version string of the
> device of type <type-id>. If a device does not support live migration, it
> should return errno.
> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
>
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
>
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument
>
> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
>
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.

I think it would be cleaner to do the former, not supply the
attribute. This seems to do the latter in the sample drivers. Thanks,

Alex

> Cc: Alex Williamson <[email protected]>
> Cc: Erik Skultety <[email protected]>
> Cc: "Dr. David Alan Gilbert" <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: "Tian, Kevin" <[email protected]>
> Cc: Zhenyu Wang <[email protected]>
> Cc: "Wang, Zhi A" <[email protected]>
> Cc: Neo Jia <[email protected]>
> Cc: Kirti Wankhede <[email protected]>
>
> Signed-off-by: Yan Zhao <[email protected]>
> ---
> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> samples/vfio-mdev/mtty.c | 16 ++++++++++++
> 4 files changed, 85 insertions(+)
>
> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> | | |--- available_instances
> | | |--- device_api
> | | |--- description
> + | | |--- version
> | | |--- [devices]
> | |--- [<type-id>]
> | | |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> | | |--- available_instances
> | | |--- device_api
> | | |--- description
> + | | |--- version
> | | |--- [devices]
> | |--- [<type-id>]
> | |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> | |--- available_instances
> | |--- device_api
> | |--- description
> + | |--- version
> | |--- [devices]
>
> * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> [<type-id>], device_api, and available_instances are mandatory attributes
> that should be provided by vendor driver.
>
> + version is a mandatory attribute if a mdev device supports live migration.
> +
> * [<type-id>]
>
> The [<type-id>] name is created by adding the device driver string as a prefix
> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> This attribute should show the number of devices of type <type-id> that can be
> created.
>
> +* version
> +
> + This attribute is rw. It is used to check whether two devices are compatible
> + for live migration. If this attribute is missing, then the corresponding mdev
> + device is regarded as not supporting live migration.
> +
> + It consists of two parts: common part and vendor proprietary part.
> + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> + device type. e.g., for pci device, it is
> + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> + vendor proprietary part: this part is varied in length. vendor driver can
> + specify any string to identify a device.
> +
> + When reading this attribute, it should show device version string of the device
> + of type <type-id>. If a device does not support live migration, it should
> + return errno.
> + When writing a string to this attribute, it returns errno for incompatibility
> + or returns written string length in compatibility case. If a device does not
> + support live migration, it always returns errno.
> +
> + for example.
> + # cat \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> + 00028086-193b-i915-GVTg_V5_2
> +
> + #echo 00028086-193b-i915-GVTg_V5_2 > \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> + -bash: echo: write error: Invalid argument
> +
> * [device]
>
> This directory contains links to the devices of type <type-id> that have been
> @@ -327,12 +361,14 @@ card.
> | | |-- available_instances
> | | |-- create
> | | |-- device_api
> + | | |-- version
> | | |-- devices
> | | `-- name
> | `-- mtty-2
> | |-- available_instances
> | |-- create
> | |-- device_api
> + | |-- version
> | |-- devices
> | `-- name
> |-- mtty_dev
> diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> index b038aa9f5a70..2f5ba96b91a2 100644
> --- a/samples/vfio-mdev/mbochs.c
> +++ b/samples/vfio-mdev/mbochs.c
> @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> }
> MDEV_TYPE_ATTR_RO(device_api);
>
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> + char *buf)
> +{
> + /* do not support live migration */
> + return -EINVAL;
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> + const char *buf, size_t count)
> +{
> + /* do not support live migration */
> + return -EINVAL;
> +}
> +
> +static MDEV_TYPE_ATTR_RW(version);
> +
> static struct attribute *mdev_types_attrs[] = {
> &mdev_type_attr_name.attr,
> &mdev_type_attr_description.attr,
> &mdev_type_attr_device_api.attr,
> &mdev_type_attr_available_instances.attr,
> + &mdev_type_attr_version.attr,
> NULL,
> };
>
> diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> index cc86bf6566e4..ff15fdfc7d46 100644
> --- a/samples/vfio-mdev/mdpy.c
> +++ b/samples/vfio-mdev/mdpy.c
> @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> }
> MDEV_TYPE_ATTR_RO(device_api);
>
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> + char *buf)
> +{
> + /* do not support live migration */
> + return -EINVAL;
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> + const char *buf, size_t count)
> +{
> + /* do not support live migration */
> + return -EINVAL;
> +}
> +static MDEV_TYPE_ATTR_RW(version);
> +
> static struct attribute *mdev_types_attrs[] = {
> &mdev_type_attr_name.attr,
> &mdev_type_attr_description.attr,
> &mdev_type_attr_device_api.attr,
> &mdev_type_attr_available_instances.attr,
> + &mdev_type_attr_version.attr,
> NULL,
> };
>
> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> index 1c77c370c92f..4ae3aad3474d 100644
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
> @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
>
> MDEV_TYPE_ATTR_RO(device_api);
>
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> + char *buf)
> +{
> + /* do not support live migration */
> + return -EINVAL;
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> + const char *buf, size_t count)
> +{
> + /* do not support live migration */
> + return -EINVAL;
> +}
> +
> +static MDEV_TYPE_ATTR_RW(version);
> static struct attribute *mdev_types_attrs[] = {
> &mdev_type_attr_name.attr,
> &mdev_type_attr_device_api.attr,
> &mdev_type_attr_available_instances.attr,
> + &mdev_type_attr_version.attr,
> NULL,
> };
>

2019-04-23 03:59:07

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> On Fri, 19 Apr 2019 04:35:04 -0400
> Yan Zhao <[email protected]> wrote:
>
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
>
> The Subject: doesn't quite match what's being proposed here.
>
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
>
> What purpose does this serve? If it's intended as some sort of
> namespace feature, shouldn't we first assume that we can only support
> migration to devices of the same type? Therefore each type would
> already have its own namespace. Also that would make the trailing bit
> of the version string listed below in the example redundant. A vendor
> is still welcome to include this in their version string if they wish,
> but I think the string should be entirely vendor defined.
>
hi Alex,
This common part is a kind of namespace.
Because if version string is entirely defined by vendors, I'm worried about
if there is a case that one vendor's version string happens to deceive and
interfere with another vendor's version checking?
e.g.
vendor A has a version string like: vendor id + device id + mdev type
vendor B has a version string like: device id + vendor id + mdev type
but vendor A's vendor id is 0x8086, device id is 0x1217
vendor B's vendor id is 0x1217, device id is 0x8086.

In this corner case, the two vendors may regard the two device is
migratable but actually they are not.

That's the reason for this common part that serve as a kind of namespace
that all vendors will comply with to avoid overlap.

> > vendor proprietary part: this part is varied in length. vendor driver can
> > specify any string to identify a device.
> >
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> >
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> >
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
> >
> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> >
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
>
> I think it would be cleaner to do the former, not supply the
> attribute. This seems to do the latter in the sample drivers. Thanks,
Ok. you are right!
what about just keep one sample driver to show how to do the latter,
and let the others do the former?

Thanks!
Yan


> > Cc: Alex Williamson <[email protected]>
> > Cc: Erik Skultety <[email protected]>
> > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > Cc: Cornelia Huck <[email protected]>
> > Cc: "Tian, Kevin" <[email protected]>
> > Cc: Zhenyu Wang <[email protected]>
> > Cc: "Wang, Zhi A" <[email protected]>
> > Cc: Neo Jia <[email protected]>
> > Cc: Kirti Wankhede <[email protected]>
> >
> > Signed-off-by: Yan Zhao <[email protected]>
> > ---
> > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > 4 files changed, 85 insertions(+)
> >
> > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Documentation/vfio-mediated-device.txt
> > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > | | |--- available_instances
> > | | |--- device_api
> > | | |--- description
> > + | | |--- version
> > | | |--- [devices]
> > | |--- [<type-id>]
> > | | |--- create
> > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > | | |--- available_instances
> > | | |--- device_api
> > | | |--- description
> > + | | |--- version
> > | | |--- [devices]
> > | |--- [<type-id>]
> > | |--- create
> > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > | |--- available_instances
> > | |--- device_api
> > | |--- description
> > + | |--- version
> > | |--- [devices]
> >
> > * [mdev_supported_types]
> > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > [<type-id>], device_api, and available_instances are mandatory attributes
> > that should be provided by vendor driver.
> >
> > + version is a mandatory attribute if a mdev device supports live migration.
> > +
> > * [<type-id>]
> >
> > The [<type-id>] name is created by adding the device driver string as a prefix
> > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > This attribute should show the number of devices of type <type-id> that can be
> > created.
> >
> > +* version
> > +
> > + This attribute is rw. It is used to check whether two devices are compatible
> > + for live migration. If this attribute is missing, then the corresponding mdev
> > + device is regarded as not supporting live migration.
> > +
> > + It consists of two parts: common part and vendor proprietary part.
> > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > + device type. e.g., for pci device, it is
> > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > + vendor proprietary part: this part is varied in length. vendor driver can
> > + specify any string to identify a device.
> > +
> > + When reading this attribute, it should show device version string of the device
> > + of type <type-id>. If a device does not support live migration, it should
> > + return errno.
> > + When writing a string to this attribute, it returns errno for incompatibility
> > + or returns written string length in compatibility case. If a device does not
> > + support live migration, it always returns errno.
> > +
> > + for example.
> > + # cat \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > + 00028086-193b-i915-GVTg_V5_2
> > +
> > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > + -bash: echo: write error: Invalid argument
> > +
> > * [device]
> >
> > This directory contains links to the devices of type <type-id> that have been
> > @@ -327,12 +361,14 @@ card.
> > | | |-- available_instances
> > | | |-- create
> > | | |-- device_api
> > + | | |-- version
> > | | |-- devices
> > | | `-- name
> > | `-- mtty-2
> > | |-- available_instances
> > | |-- create
> > | |-- device_api
> > + | |-- version
> > | |-- devices
> > | `-- name
> > |-- mtty_dev
> > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > index b038aa9f5a70..2f5ba96b91a2 100644
> > --- a/samples/vfio-mdev/mbochs.c
> > +++ b/samples/vfio-mdev/mbochs.c
> > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > }
> > MDEV_TYPE_ATTR_RO(device_api);
> >
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > + char *buf)
> > +{
> > + /* do not support live migration */
> > + return -EINVAL;
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > + const char *buf, size_t count)
> > +{
> > + /* do not support live migration */
> > + return -EINVAL;
> > +}
> > +
> > +static MDEV_TYPE_ATTR_RW(version);
> > +
> > static struct attribute *mdev_types_attrs[] = {
> > &mdev_type_attr_name.attr,
> > &mdev_type_attr_description.attr,
> > &mdev_type_attr_device_api.attr,
> > &mdev_type_attr_available_instances.attr,
> > + &mdev_type_attr_version.attr,
> > NULL,
> > };
> >
> > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > index cc86bf6566e4..ff15fdfc7d46 100644
> > --- a/samples/vfio-mdev/mdpy.c
> > +++ b/samples/vfio-mdev/mdpy.c
> > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > }
> > MDEV_TYPE_ATTR_RO(device_api);
> >
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > + char *buf)
> > +{
> > + /* do not support live migration */
> > + return -EINVAL;
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > + const char *buf, size_t count)
> > +{
> > + /* do not support live migration */
> > + return -EINVAL;
> > +}
> > +static MDEV_TYPE_ATTR_RW(version);
> > +
> > static struct attribute *mdev_types_attrs[] = {
> > &mdev_type_attr_name.attr,
> > &mdev_type_attr_description.attr,
> > &mdev_type_attr_device_api.attr,
> > &mdev_type_attr_available_instances.attr,
> > + &mdev_type_attr_version.attr,
> > NULL,
> > };
> >
> > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > index 1c77c370c92f..4ae3aad3474d 100644
> > --- a/samples/vfio-mdev/mtty.c
> > +++ b/samples/vfio-mdev/mtty.c
> > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> >
> > MDEV_TYPE_ATTR_RO(device_api);
> >
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > + char *buf)
> > +{
> > + /* do not support live migration */
> > + return -EINVAL;
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > + const char *buf, size_t count)
> > +{
> > + /* do not support live migration */
> > + return -EINVAL;
> > +}
> > +
> > +static MDEV_TYPE_ATTR_RW(version);
> > static struct attribute *mdev_types_attrs[] = {
> > &mdev_type_attr_name.attr,
> > &mdev_type_attr_device_api.attr,
> > &mdev_type_attr_available_instances.attr,
> > + &mdev_type_attr_version.attr,
> > NULL,
> > };
> >
>
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-23 04:00:52

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Mon, 22 Apr 2019 21:01:52 -0400
Yan Zhao <[email protected]> wrote:

> On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > On Fri, 19 Apr 2019 04:35:04 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > device version attribute in mdev sysfs is used by user space software
> > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > migration.
> >
> > The Subject: doesn't quite match what's being proposed here.
> >
> > > It consists of two parts: common part and vendor proprietary part.
> > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > identifies device type. e.g., for pci device, it is
> > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> >
> > What purpose does this serve? If it's intended as some sort of
> > namespace feature, shouldn't we first assume that we can only support
> > migration to devices of the same type? Therefore each type would
> > already have its own namespace. Also that would make the trailing bit
> > of the version string listed below in the example redundant. A vendor
> > is still welcome to include this in their version string if they wish,
> > but I think the string should be entirely vendor defined.
> >
> hi Alex,
> This common part is a kind of namespace.
> Because if version string is entirely defined by vendors, I'm worried about
> if there is a case that one vendor's version string happens to deceive and
> interfere with another vendor's version checking?
> e.g.
> vendor A has a version string like: vendor id + device id + mdev type
> vendor B has a version string like: device id + vendor id + mdev type
> but vendor A's vendor id is 0x8086, device id is 0x1217
> vendor B's vendor id is 0x1217, device id is 0x8086.
>
> In this corner case, the two vendors may regard the two device is
> migratable but actually they are not.
>
> That's the reason for this common part that serve as a kind of namespace
> that all vendors will comply with to avoid overlap.

If we assume that migration can only occur between matching mdev types,
this is redundant, each type already has their own namespace.

> > > vendor proprietary part: this part is varied in length. vendor driver can
> > > specify any string to identify a device.
> > >
> > > When reading this attribute, it should show device version string of the
> > > device of type <type-id>. If a device does not support live migration, it
> > > should return errno.
> > > When writing a string to this attribute, it returns errno for
> > > incompatibility or returns written string length in compatibility case.
> > > If a device does not support live migration, it always returns errno.
> > >
> > > For user space software to use:
> > > 1.
> > > Before starting live migration, user space software first reads source side
> > > mdev device's version. e.g.
> > > "#cat \
> > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > 00028086-193b-i915-GVTg_V5_4
> > >
> > > 2.
> > > Then, user space software writes the source side returned version string
> > > to device version attribute in target side, and checks the return value.
> > > If a negative errno is returned in the target side, then mdev devices in
> > > source and target sides are not compatible;
> > > If a positive number is returned and it equals to the length of written
> > > string, then the two mdev devices in source and target side are compatible.
> > > e.g.
> > > (a) compatibility case
> > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > >
> > > (b) incompatibility case
> > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > -bash: echo: write error: Invalid argument
> > >
> > > 3. if two mdev devices are compatible, user space software can start
> > > live migration, and vice versa.
> > >
> > > Note: if a mdev device does not support live migration, it either does
> > > not provide a version attribute, or always returns errno when its version
> > > attribute is read/written.
> >
> > I think it would be cleaner to do the former, not supply the
> > attribute. This seems to do the latter in the sample drivers. Thanks,
> Ok. you are right!
> what about just keep one sample driver to show how to do the latter,
> and let the others do the former?

I'd rather that if a vendor driver doesn't support features requiring
the version attribute that they don't implement it. It's confusing to
developers looking at the sample driver for guidance if we have
different implementations. Of course if you'd like to add migration
support to one of the sample drivers, that'd be very welcome. Thanks,

Alex

> > > Cc: Alex Williamson <[email protected]>
> > > Cc: Erik Skultety <[email protected]>
> > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > Cc: Cornelia Huck <[email protected]>
> > > Cc: "Tian, Kevin" <[email protected]>
> > > Cc: Zhenyu Wang <[email protected]>
> > > Cc: "Wang, Zhi A" <[email protected]>
> > > Cc: Neo Jia <[email protected]>
> > > Cc: Kirti Wankhede <[email protected]>
> > >
> > > Signed-off-by: Yan Zhao <[email protected]>
> > > ---
> > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > 4 files changed, 85 insertions(+)
> > >
> > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > index c3f69bcaf96e..bc28471c0667 100644
> > > --- a/Documentation/vfio-mediated-device.txt
> > > +++ b/Documentation/vfio-mediated-device.txt
> > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > | | |--- available_instances
> > > | | |--- device_api
> > > | | |--- description
> > > + | | |--- version
> > > | | |--- [devices]
> > > | |--- [<type-id>]
> > > | | |--- create
> > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > | | |--- available_instances
> > > | | |--- device_api
> > > | | |--- description
> > > + | | |--- version
> > > | | |--- [devices]
> > > | |--- [<type-id>]
> > > | |--- create
> > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > | |--- available_instances
> > > | |--- device_api
> > > | |--- description
> > > + | |--- version
> > > | |--- [devices]
> > >
> > > * [mdev_supported_types]
> > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > that should be provided by vendor driver.
> > >
> > > + version is a mandatory attribute if a mdev device supports live migration.
> > > +
> > > * [<type-id>]
> > >
> > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > This attribute should show the number of devices of type <type-id> that can be
> > > created.
> > >
> > > +* version
> > > +
> > > + This attribute is rw. It is used to check whether two devices are compatible
> > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > + device is regarded as not supporting live migration.
> > > +
> > > + It consists of two parts: common part and vendor proprietary part.
> > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > + device type. e.g., for pci device, it is
> > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > + specify any string to identify a device.
> > > +
> > > + When reading this attribute, it should show device version string of the device
> > > + of type <type-id>. If a device does not support live migration, it should
> > > + return errno.
> > > + When writing a string to this attribute, it returns errno for incompatibility
> > > + or returns written string length in compatibility case. If a device does not
> > > + support live migration, it always returns errno.
> > > +
> > > + for example.
> > > + # cat \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > + 00028086-193b-i915-GVTg_V5_2
> > > +
> > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > + -bash: echo: write error: Invalid argument
> > > +
> > > * [device]
> > >
> > > This directory contains links to the devices of type <type-id> that have been
> > > @@ -327,12 +361,14 @@ card.
> > > | | |-- available_instances
> > > | | |-- create
> > > | | |-- device_api
> > > + | | |-- version
> > > | | |-- devices
> > > | | `-- name
> > > | `-- mtty-2
> > > | |-- available_instances
> > > | |-- create
> > > | |-- device_api
> > > + | |-- version
> > > | |-- devices
> > > | `-- name
> > > |-- mtty_dev
> > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > --- a/samples/vfio-mdev/mbochs.c
> > > +++ b/samples/vfio-mdev/mbochs.c
> > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > }
> > > MDEV_TYPE_ATTR_RO(device_api);
> > >
> > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > + char *buf)
> > > +{
> > > + /* do not support live migration */
> > > + return -EINVAL;
> > > +}
> > > +
> > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > + const char *buf, size_t count)
> > > +{
> > > + /* do not support live migration */
> > > + return -EINVAL;
> > > +}
> > > +
> > > +static MDEV_TYPE_ATTR_RW(version);
> > > +
> > > static struct attribute *mdev_types_attrs[] = {
> > > &mdev_type_attr_name.attr,
> > > &mdev_type_attr_description.attr,
> > > &mdev_type_attr_device_api.attr,
> > > &mdev_type_attr_available_instances.attr,
> > > + &mdev_type_attr_version.attr,
> > > NULL,
> > > };
> > >
> > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > --- a/samples/vfio-mdev/mdpy.c
> > > +++ b/samples/vfio-mdev/mdpy.c
> > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > }
> > > MDEV_TYPE_ATTR_RO(device_api);
> > >
> > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > + char *buf)
> > > +{
> > > + /* do not support live migration */
> > > + return -EINVAL;
> > > +}
> > > +
> > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > + const char *buf, size_t count)
> > > +{
> > > + /* do not support live migration */
> > > + return -EINVAL;
> > > +}
> > > +static MDEV_TYPE_ATTR_RW(version);
> > > +
> > > static struct attribute *mdev_types_attrs[] = {
> > > &mdev_type_attr_name.attr,
> > > &mdev_type_attr_description.attr,
> > > &mdev_type_attr_device_api.attr,
> > > &mdev_type_attr_available_instances.attr,
> > > + &mdev_type_attr_version.attr,
> > > NULL,
> > > };
> > >
> > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > index 1c77c370c92f..4ae3aad3474d 100644
> > > --- a/samples/vfio-mdev/mtty.c
> > > +++ b/samples/vfio-mdev/mtty.c
> > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > >
> > > MDEV_TYPE_ATTR_RO(device_api);
> > >
> > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > + char *buf)
> > > +{
> > > + /* do not support live migration */
> > > + return -EINVAL;
> > > +}
> > > +
> > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > + const char *buf, size_t count)
> > > +{
> > > + /* do not support live migration */
> > > + return -EINVAL;
> > > +}
> > > +
> > > +static MDEV_TYPE_ATTR_RW(version);
> > > static struct attribute *mdev_types_attrs[] = {
> > > &mdev_type_attr_name.attr,
> > > &mdev_type_attr_device_api.attr,
> > > &mdev_type_attr_available_instances.attr,
> > > + &mdev_type_attr_version.attr,
> > > NULL,
> > > };
> > >
> >
> > _______________________________________________
> > intel-gvt-dev mailing list
> > [email protected]
> > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-23 05:51:38

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> On Mon, 22 Apr 2019 21:01:52 -0400
> Yan Zhao <[email protected]> wrote:
>
> > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > Yan Zhao <[email protected]> wrote:
> > >
> > > > device version attribute in mdev sysfs is used by user space software
> > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > migration.
> > >
> > > The Subject: doesn't quite match what's being proposed here.
> > >
> > > > It consists of two parts: common part and vendor proprietary part.
> > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > identifies device type. e.g., for pci device, it is
> > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > >
> > > What purpose does this serve? If it's intended as some sort of
> > > namespace feature, shouldn't we first assume that we can only support
> > > migration to devices of the same type? Therefore each type would
> > > already have its own namespace. Also that would make the trailing bit
> > > of the version string listed below in the example redundant. A vendor
> > > is still welcome to include this in their version string if they wish,
> > > but I think the string should be entirely vendor defined.
> > >
> > hi Alex,
> > This common part is a kind of namespace.
> > Because if version string is entirely defined by vendors, I'm worried about
> > if there is a case that one vendor's version string happens to deceive and
> > interfere with another vendor's version checking?
> > e.g.
> > vendor A has a version string like: vendor id + device id + mdev type
> > vendor B has a version string like: device id + vendor id + mdev type
> > but vendor A's vendor id is 0x8086, device id is 0x1217
> > vendor B's vendor id is 0x1217, device id is 0x8086.
> >
> > In this corner case, the two vendors may regard the two device is
> > migratable but actually they are not.
> >
> > That's the reason for this common part that serve as a kind of namespace
> > that all vendors will comply with to avoid overlap.
>
> If we assume that migration can only occur between matching mdev types,
> this is redundant, each type already has their own namespace.
>
hi Alex,
do you mean user space software like libvirt needs to first check whether
mdev type is matching and then check whether version is matching?

if user space software only checks version for migration, it means vendor
driver has to include mdev type in their vendor proprietary part string,
right?

Another thing is that could there be any future mdev parent driver that
applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
driver (https://lkml.org/lkml/2019/3/13/114)?

> > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > specify any string to identify a device.
> > > >
> > > > When reading this attribute, it should show device version string of the
> > > > device of type <type-id>. If a device does not support live migration, it
> > > > should return errno.
> > > > When writing a string to this attribute, it returns errno for
> > > > incompatibility or returns written string length in compatibility case.
> > > > If a device does not support live migration, it always returns errno.
> > > >
> > > > For user space software to use:
> > > > 1.
> > > > Before starting live migration, user space software first reads source side
> > > > mdev device's version. e.g.
> > > > "#cat \
> > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > 00028086-193b-i915-GVTg_V5_4
> > > >
> > > > 2.
> > > > Then, user space software writes the source side returned version string
> > > > to device version attribute in target side, and checks the return value.
> > > > If a negative errno is returned in the target side, then mdev devices in
> > > > source and target sides are not compatible;
> > > > If a positive number is returned and it equals to the length of written
> > > > string, then the two mdev devices in source and target side are compatible.
> > > > e.g.
> > > > (a) compatibility case
> > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > >
> > > > (b) incompatibility case
> > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > -bash: echo: write error: Invalid argument
> > > >
> > > > 3. if two mdev devices are compatible, user space software can start
> > > > live migration, and vice versa.
> > > >
> > > > Note: if a mdev device does not support live migration, it either does
> > > > not provide a version attribute, or always returns errno when its version
> > > > attribute is read/written.
> > >
> > > I think it would be cleaner to do the former, not supply the
> > > attribute. This seems to do the latter in the sample drivers. Thanks,
> > Ok. you are right!
> > what about just keep one sample driver to show how to do the latter,
> > and let the others do the former?
>
> I'd rather that if a vendor driver doesn't support features requiring
> the version attribute that they don't implement it. It's confusing to
> developers looking at the sample driver for guidance if we have
> different implementations. Of course if you'd like to add migration
> support to one of the sample drivers, that'd be very welcome. Thanks,
>
Got it:)

Thanks!
Yan

>
> > > > Cc: Alex Williamson <[email protected]>
> > > > Cc: Erik Skultety <[email protected]>
> > > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > > Cc: Cornelia Huck <[email protected]>
> > > > Cc: "Tian, Kevin" <[email protected]>
> > > > Cc: Zhenyu Wang <[email protected]>
> > > > Cc: "Wang, Zhi A" <[email protected]>
> > > > Cc: Neo Jia <[email protected]>
> > > > Cc: Kirti Wankhede <[email protected]>
> > > >
> > > > Signed-off-by: Yan Zhao <[email protected]>
> > > > ---
> > > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > > 4 files changed, 85 insertions(+)
> > > >
> > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > --- a/Documentation/vfio-mediated-device.txt
> > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > | | |--- available_instances
> > > > | | |--- device_api
> > > > | | |--- description
> > > > + | | |--- version
> > > > | | |--- [devices]
> > > > | |--- [<type-id>]
> > > > | | |--- create
> > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > | | |--- available_instances
> > > > | | |--- device_api
> > > > | | |--- description
> > > > + | | |--- version
> > > > | | |--- [devices]
> > > > | |--- [<type-id>]
> > > > | |--- create
> > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > | |--- available_instances
> > > > | |--- device_api
> > > > | |--- description
> > > > + | |--- version
> > > > | |--- [devices]
> > > >
> > > > * [mdev_supported_types]
> > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > that should be provided by vendor driver.
> > > >
> > > > + version is a mandatory attribute if a mdev device supports live migration.
> > > > +
> > > > * [<type-id>]
> > > >
> > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > This attribute should show the number of devices of type <type-id> that can be
> > > > created.
> > > >
> > > > +* version
> > > > +
> > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > + device is regarded as not supporting live migration.
> > > > +
> > > > + It consists of two parts: common part and vendor proprietary part.
> > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > + device type. e.g., for pci device, it is
> > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > + specify any string to identify a device.
> > > > +
> > > > + When reading this attribute, it should show device version string of the device
> > > > + of type <type-id>. If a device does not support live migration, it should
> > > > + return errno.
> > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > + or returns written string length in compatibility case. If a device does not
> > > > + support live migration, it always returns errno.
> > > > +
> > > > + for example.
> > > > + # cat \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > + 00028086-193b-i915-GVTg_V5_2
> > > > +
> > > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > + -bash: echo: write error: Invalid argument
> > > > +
> > > > * [device]
> > > >
> > > > This directory contains links to the devices of type <type-id> that have been
> > > > @@ -327,12 +361,14 @@ card.
> > > > | | |-- available_instances
> > > > | | |-- create
> > > > | | |-- device_api
> > > > + | | |-- version
> > > > | | |-- devices
> > > > | | `-- name
> > > > | `-- mtty-2
> > > > | |-- available_instances
> > > > | |-- create
> > > > | |-- device_api
> > > > + | |-- version
> > > > | |-- devices
> > > > | `-- name
> > > > |-- mtty_dev
> > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > --- a/samples/vfio-mdev/mbochs.c
> > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > }
> > > > MDEV_TYPE_ATTR_RO(device_api);
> > > >
> > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > + char *buf)
> > > > +{
> > > > + /* do not support live migration */
> > > > + return -EINVAL;
> > > > +}
> > > > +
> > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > + const char *buf, size_t count)
> > > > +{
> > > > + /* do not support live migration */
> > > > + return -EINVAL;
> > > > +}
> > > > +
> > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > +
> > > > static struct attribute *mdev_types_attrs[] = {
> > > > &mdev_type_attr_name.attr,
> > > > &mdev_type_attr_description.attr,
> > > > &mdev_type_attr_device_api.attr,
> > > > &mdev_type_attr_available_instances.attr,
> > > > + &mdev_type_attr_version.attr,
> > > > NULL,
> > > > };
> > > >
> > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > --- a/samples/vfio-mdev/mdpy.c
> > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > }
> > > > MDEV_TYPE_ATTR_RO(device_api);
> > > >
> > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > + char *buf)
> > > > +{
> > > > + /* do not support live migration */
> > > > + return -EINVAL;
> > > > +}
> > > > +
> > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > + const char *buf, size_t count)
> > > > +{
> > > > + /* do not support live migration */
> > > > + return -EINVAL;
> > > > +}
> > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > +
> > > > static struct attribute *mdev_types_attrs[] = {
> > > > &mdev_type_attr_name.attr,
> > > > &mdev_type_attr_description.attr,
> > > > &mdev_type_attr_device_api.attr,
> > > > &mdev_type_attr_available_instances.attr,
> > > > + &mdev_type_attr_version.attr,
> > > > NULL,
> > > > };
> > > >
> > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > --- a/samples/vfio-mdev/mtty.c
> > > > +++ b/samples/vfio-mdev/mtty.c
> > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > >
> > > > MDEV_TYPE_ATTR_RO(device_api);
> > > >
> > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > + char *buf)
> > > > +{
> > > > + /* do not support live migration */
> > > > + return -EINVAL;
> > > > +}
> > > > +
> > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > + const char *buf, size_t count)
> > > > +{
> > > > + /* do not support live migration */
> > > > + return -EINVAL;
> > > > +}
> > > > +
> > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > static struct attribute *mdev_types_attrs[] = {
> > > > &mdev_type_attr_name.attr,
> > > > &mdev_type_attr_device_api.attr,
> > > > &mdev_type_attr_available_instances.attr,
> > > > + &mdev_type_attr_version.attr,
> > > > NULL,
> > > > };
> > > >
> > >
> > > _______________________________________________
> > > intel-gvt-dev mailing list
> > > [email protected]
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
>
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-23 09:47:15

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 23 Apr 2019 01:41:57 -0400
Yan Zhao <[email protected]> wrote:

> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <[email protected]> wrote:
> > > >
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > migration.
> > > >
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > identifies device type. e.g., for pci device, it is
> > > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > >
> > > > What purpose does this serve? If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type? Therefore each type would
> > > > already have its own namespace. Also that would make the trailing bit
> > > > of the version string listed below in the example redundant. A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > >
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > >
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.
> >
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?
>
> if user space software only checks version for migration, it means vendor
> driver has to include mdev type in their vendor proprietary part string,
> right?

Can't userspace simply check for the driver in use and only then check
the version attribute?

> Another thing is that could there be any future mdev parent driver that
> applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> driver (https://lkml.org/lkml/2019/3/13/114)?

Hm, I think that the vfio-pci-mdev driver then needs to expose
information regarding compatibility (and not the core).

2019-04-23 10:02:15

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Fri, 19 Apr 2019 04:35:04 -0400
Yan Zhao <[email protected]> wrote:

> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.
>
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
> specify any string to identify a device.
>
> When reading this attribute, it should show device version string of the
> device of type <type-id>. If a device does not support live migration, it
> should return errno.

It might make more sense if the driver does not register the attribute
for the device in that case at all.

> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
>
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
>
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument
>
> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
>
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.
>
> Cc: Alex Williamson <[email protected]>
> Cc: Erik Skultety <[email protected]>
> Cc: "Dr. David Alan Gilbert" <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: "Tian, Kevin" <[email protected]>
> Cc: Zhenyu Wang <[email protected]>
> Cc: "Wang, Zhi A" <[email protected]>
> Cc: Neo Jia <[email protected]>
> Cc: Kirti Wankhede <[email protected]>
>
> Signed-off-by: Yan Zhao <[email protected]>
> ---
> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> samples/vfio-mdev/mtty.c | 16 ++++++++++++
> 4 files changed, 85 insertions(+)
>
> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> | | |--- available_instances
> | | |--- device_api
> | | |--- description
> + | | |--- version
> | | |--- [devices]
> | |--- [<type-id>]
> | | |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> | | |--- available_instances
> | | |--- device_api
> | | |--- description
> + | | |--- version
> | | |--- [devices]
> | |--- [<type-id>]
> | |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> | |--- available_instances
> | |--- device_api
> | |--- description
> + | |--- version
> | |--- [devices]
>
> * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> [<type-id>], device_api, and available_instances are mandatory attributes
> that should be provided by vendor driver.
>
> + version is a mandatory attribute if a mdev device supports live migration.

What about "An mdev device wishing to support live migration must
provide the version attribute."?

> +
> * [<type-id>]
>
> The [<type-id>] name is created by adding the device driver string as a prefix
> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> This attribute should show the number of devices of type <type-id> that can be
> created.
>
> +* version
> +
> + This attribute is rw. It is used to check whether two devices are compatible
> + for live migration. If this attribute is missing, then the corresponding mdev
> + device is regarded as not supporting live migration.
> +
> + It consists of two parts: common part and vendor proprietary part.
> + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> + device type. e.g., for pci device, it is
> + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> + vendor proprietary part: this part is varied in length. vendor driver can
> + specify any string to identify a device.
> +
> + When reading this attribute, it should show device version string of the device
> + of type <type-id>. If a device does not support live migration, it should
> + return errno.
> + When writing a string to this attribute, it returns errno for incompatibility
> + or returns written string length in compatibility case. If a device does not
> + support live migration, it always returns errno.

I'm not sure whether a device that does not support live migration
should expose this attribute in the first place. Or is that to cover
cases where a driver supports live migration only for some of the
devices it supports?

Also, I'm not sure if a string that has to be parsed is a good idea...
is this 'version' attribute supposed to convey some human-readable
information as well? The procedure you describe for compatibility
checking does the checking within the vendor driver which I would
expect to have a table/rules for that anyway.

I think you should also specify which errno writing an incompatible id
is supposed to return (probably best something different than if the
device does not support live migration at all, if we stick with
creating the attribute in that case.)

> +
> + for example.
> + # cat \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> + 00028086-193b-i915-GVTg_V5_2
> +
> + #echo 00028086-193b-i915-GVTg_V5_2 > \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> + -bash: echo: write error: Invalid argument
> +
> * [device]
>
> This directory contains links to the devices of type <type-id> that have been

2019-04-23 10:25:54

by Daniel P. Berrangé

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 01:41:57AM -0400, Yan Zhao wrote:
> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <[email protected]> wrote:
> > > >
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > migration.
> > > >
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > identifies device type. e.g., for pci device, it is
> > > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > >
> > > > What purpose does this serve? If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type? Therefore each type would
> > > > already have its own namespace. Also that would make the trailing bit
> > > > of the version string listed below in the example redundant. A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > >
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > >
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.
> >
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?

I would expect that libvirt (or other mgmt apps) will always first
check that the vendor id, device id, mdev type all match. So for the
version string it should suffice to be a "normal" numeric value.

Essentially version string just needs to be there to distinguish
revisions of the same mdev type over time.


Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

2019-04-23 10:42:20

by Daniel P. Berrangé

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.
>
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
> specify any string to identify a device.
>
> When reading this attribute, it should show device version string of the
> device of type <type-id>. If a device does not support live migration, it
> should return errno.
> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
>
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
>
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument

What you have written here seems to imply that each mdev type is able to
support many different versions at the same time. Writing a version into
this sysfs file then chooses which of the many versions to actually use.

This is good as it allows for live migration across driver software upgrades.

A mgmt application may well want to know what versions are supported for an
mdev type *before* starting a migration. A mgmt app can query all the 100's
of hosts it knows and thus figure out which are valid to use as the target
of a migration.

IOW, we want to avoid the ever hitting the incompatibility case in the
first place, by only choosing to migrate to a host that we know is going
to be compatible.

This would need some kind of way to report the full list of supported
versions against the mdev supported types on the host.


> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
>
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.
>
> Cc: Alex Williamson <[email protected]>
> Cc: Erik Skultety <[email protected]>
> Cc: "Dr. David Alan Gilbert" <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: "Tian, Kevin" <[email protected]>
> Cc: Zhenyu Wang <[email protected]>
> Cc: "Wang, Zhi A" <[email protected]>
> Cc: Neo Jia <[email protected]>
> Cc: Kirti Wankhede <[email protected]>
>
> Signed-off-by: Yan Zhao <[email protected]>
> ---
> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> samples/vfio-mdev/mtty.c | 16 ++++++++++++
> 4 files changed, 85 insertions(+)
>
> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> | | |--- available_instances
> | | |--- device_api
> | | |--- description
> + | | |--- version
> | | |--- [devices]
> | |--- [<type-id>]
> | | |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> | | |--- available_instances
> | | |--- device_api
> | | |--- description
> + | | |--- version
> | | |--- [devices]
> | |--- [<type-id>]
> | |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> | |--- available_instances
> | |--- device_api
> | |--- description
> + | |--- version
> | |--- [devices]
>
> * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> [<type-id>], device_api, and available_instances are mandatory attributes
> that should be provided by vendor driver.
>
> + version is a mandatory attribute if a mdev device supports live migration.
> +
> * [<type-id>]
>
> The [<type-id>] name is created by adding the device driver string as a prefix
> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> This attribute should show the number of devices of type <type-id> that can be
> created.
>
> +* version
> +
> + This attribute is rw. It is used to check whether two devices are compatible
> + for live migration. If this attribute is missing, then the corresponding mdev
> + device is regarded as not supporting live migration.
> +
> + It consists of two parts: common part and vendor proprietary part.
> + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> + device type. e.g., for pci device, it is
> + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> + vendor proprietary part: this part is varied in length. vendor driver can
> + specify any string to identify a device.
> +
> + When reading this attribute, it should show device version string of the device
> + of type <type-id>. If a device does not support live migration, it should
> + return errno.
> + When writing a string to this attribute, it returns errno for incompatibility
> + or returns written string length in compatibility case. If a device does not
> + support live migration, it always returns errno.
> +
> + for example.
> + # cat \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> + 00028086-193b-i915-GVTg_V5_2
> +
> + #echo 00028086-193b-i915-GVTg_V5_2 > \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> + -bash: echo: write error: Invalid argument
> +

IIUC this path is against the physical device. IOW, the mgmt app would have
to first write to the "version" file to choose a version, and then write to
the "create" file to actually create an virtual device. This has the obvious
concurrency problem if multiple devices are being created at the same time
and distinct versions for each device are required. There would need to be
a locking scheme defined to ensure safety.

Wouldn't it be better if we can pass the desired version when we write to
the "create" file, so that we avoid any concurrent usage problems. "version"
could be just a read-only file with a *list* of supported versions.

eg

$ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
5.0
5.1
5.2

$ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
/sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create


Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

2019-04-23 11:40:29

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/i915/gvt: export mdev device version to sysfs for Intel vGPU

On Fri, 19 Apr 2019 04:35:59 -0400
Yan Zhao <[email protected]> wrote:

> This feature implements the version attribute for Intel's vGPU mdev
> devices.
>
> version attribute is rw. It is queried by userspace software like libvirt
> to check whether two vGPUs are compatible for live migration.
>
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
> specify any string to identify a device.
>
> For Intel vGPU of gen8 and gen9, the vendor proprietary part currently
> consists of 2 fields: "device id" + "mdev type".
>
> Reading from a vGPU's version attribute, a string is returned in below
> format: 00028086-<device id>-<mdev type>. e.g.
> 00028086-193b-i915-GVTg_V5_2.
>
> Writing a string to a vGPU's version attribute will trigger GVT to check
> whether a vGPU identified by the written string is compatible with
> current vGPU owning this version attribute. errno is returned if the two
> vGPUs are incompatible. The length of written string is returned in
> compatible case.
>
> For other platforms, and for GVT not supporting vGPU live migration
> feature, errnos are returned when read/write of mdev devices' version
> attributes.
>
> For old GVT versions where no version attributes exposed in sysfs, it is
> regarded as not supporting vGPU live migration.
>
> For future platforms, besides the current 2 fields in vendor proprietary
> part, more fields may be added to identify Intel vGPU well for live
> migration purpose.
>
> Cc: Alex Williamson <[email protected]>
> Cc: Erik Skultety <[email protected]>
> Cc: "Dr. David Alan Gilbert" <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: "Tian, Kevin" <[email protected]>
> Cc: Zhenyu Wang <[email protected]>
> Cc: "Wang, Zhi A" <[email protected]>
> c: Neo Jia <[email protected]>
> Cc: Kirti Wankhede <[email protected]>
>
> Signed-off-by: Yan Zhao <[email protected]>
> ---
> drivers/gpu/drm/i915/gvt/Makefile | 2 +-
> drivers/gpu/drm/i915/gvt/device_version.c | 94 +++++++++++++++++++++++
> drivers/gpu/drm/i915/gvt/gvt.c | 55 +++++++++++++
> drivers/gpu/drm/i915/gvt/gvt.h | 6 ++
> 4 files changed, 156 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/drm/i915/gvt/device_version.c
>

(...)

> +static bool is_compatible(const char *self, const char *remote)
> +{
> + if (strlen(remote) != strlen(self))
> + return false;
> +
> + return (strncmp(self, remote, strlen(self))) ? false : true;
> +}
> +
> +ssize_t intel_gvt_get_vfio_device_version_len(struct drm_i915_private *dev_priv)
> +{
> + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> + return -ENODEV;
> +
> + return PAGE_SIZE;
> +}
> +
> +ssize_t intel_gvt_get_vfio_device_version(struct drm_i915_private *dev_priv,
> + char *buf, const char *mdev_type)
> +{
> + int cnt = 0, ret = 0;
> + const char *str = NULL;
> +
> + /* currently only gen8 & gen9 are supported */
> + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> + return -ENODEV;
> +
> + /* first 32 bit common part specifying vendor id and it's a pci
> + * device
> + */
> + cnt = snprintf(buf, GVT_DEVICE_VERSION_COMMON_LEN + 1,
> + "%08x", GVT_VFIO_DEVICE_VENDOR_ID);
> + buf += cnt;
> + ret += cnt;
> +
> + /* vendor proprietary part: device id + mdev type */
> + /* device id */
> + cnt = snprintf(buf, GVT_DEVICE_VERSION_DEVICE_ID_LEN + 2,
> + "-%04x", INTEL_DEVID(dev_priv));
> + buf += cnt;
> + ret += cnt;
> +
> + /* mdev type */
> + str = mdev_type;
> + cnt = snprintf(buf, strlen(str) + 3, "-%s\n", mdev_type);
> + buf += cnt;
> + ret += cnt;
> +
> + return ret;
> +}

Looking at this handling, it seems much easier to me to simply use a
numeric value instead of a string: You don't have to build it via
sprintf, there are generic functions for parsing a string input into a
simple number, and you have more options for compatibility (e.g.
"version must be between m and n" instead of an exact match).

If you still need to encode the device id here, you should be able to
easily do something like (device_id << 16) | migration_version -- do
you think that could work?

> +
> +ssize_t intel_gvt_check_vfio_device_version(struct drm_i915_private *dev_priv,
> + const char *self, const char *remote)
> +{
> +
> + /* currently only gen8 & gen9 are supported */
> + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> + return -ENODEV;
> +
> + if (!is_compatible(self, remote))
> + return -EINVAL;

I think the meaning of the error codes should really be standardized
across vendor drivers, if we need a value for "this device does not
support migration at all". (Your choices look reasonable for that.)

> +
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/gvt/gvt.c b/drivers/gpu/drm/i915/gvt/gvt.c
> index 43f4242062dd..e720465b93d8 100644
> --- a/drivers/gpu/drm/i915/gvt/gvt.c
> +++ b/drivers/gpu/drm/i915/gvt/gvt.c
> @@ -105,14 +105,69 @@ static ssize_t description_show(struct kobject *kobj, struct device *dev,
> type->weight);
> }
>
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> + char *buf)
> +{
> +#ifdef GVT_MIGRATION_VERSION
> + struct drm_i915_private *i915 = kdev_to_i915(dev);
> + const char *mdev_type = kobject_name(kobj);
> +
> + return intel_gvt_get_vfio_device_version(i915, buf, mdev_type);
> +#else
> + /* do not support live migration */
> + return -EINVAL;

...but this looks inconsistent. I would expect -ENODEV here, same as
for non-gen{8,9}.

Or simply do not create the attribute at all in that case?

> +#endif
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> + const char *buf, size_t count)
> +{
> +#ifdef GVT_MIGRATION_VERSION
> + char *remote = NULL, *self = NULL;
> + int len, ret = 0;
> + struct drm_i915_private *i915 = kdev_to_i915(dev);
> + const char *mdev_type = kobject_name(kobj);
> +
> + len = intel_gvt_get_vfio_device_version_len(i915);
> + if (len < 0)
> + return len;
> +
> + self = kmalloc(len, GFP_KERNEL);
> + if (!self)
> + return -ENOMEM;
> +
> + ret = intel_gvt_get_vfio_device_version(i915, self, mdev_type);
> + if (ret < 0)
> + goto out;
> +
> + remote = kstrndup(buf, count, GFP_KERNEL);
> + if (!remote) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + ret = intel_gvt_check_vfio_device_version(i915, self, remote);
> +
> +out:
> + kfree(self);
> + kfree(remote);
> + return (ret < 0 ? ret : count);
> +#else
> + /* do not support live migration */
> + return -EINVAL;
> +#endif
> +}
> +
> static MDEV_TYPE_ATTR_RO(available_instances);
> static MDEV_TYPE_ATTR_RO(device_api);
> static MDEV_TYPE_ATTR_RO(description);
> +static MDEV_TYPE_ATTR_RW(version);
>
> static struct attribute *gvt_type_attrs[] = {
> &mdev_type_attr_available_instances.attr,
> &mdev_type_attr_device_api.attr,
> &mdev_type_attr_description.attr,
> + &mdev_type_attr_version.attr,
> NULL,
> };
>
(...)

2019-04-23 12:38:26

by Alex Williamson

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 23 Apr 2019 11:39:39 +0100
Daniel P. Berrangé <[email protected]> wrote:

> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> >
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> > specify any string to identify a device.
> >
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> >
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> >
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
>
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.

That's not actually what's being proposed here, reading the version
attribute provides an opaque string. Writing the version attribute
from a different system reports whether that version string is
compatible for migration. IOW, we're not creating a device of a given
version, we're only reporting if that version is compatible with this
mdev. The version is intentionally opaque to allow vendor specific
nuances, therefore it's rather impractical create the sort of version
list requested below.

> This is good as it allows for live migration across driver software upgrades.
>
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
>
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.

This is provided, the migration source reports a version string which
can be queried against the version attribute on other hosts to insure
compatibility prior to migration.

> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.

This is not provided, matching versions are vendor specific.

> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> >
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
> >
> > Cc: Alex Williamson <[email protected]>
> > Cc: Erik Skultety <[email protected]>
> > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > Cc: Cornelia Huck <[email protected]>
> > Cc: "Tian, Kevin" <[email protected]>
> > Cc: Zhenyu Wang <[email protected]>
> > Cc: "Wang, Zhi A" <[email protected]>
> > Cc: Neo Jia <[email protected]>
> > Cc: Kirti Wankhede <[email protected]>
> >
> > Signed-off-by: Yan Zhao <[email protected]>
> > ---
> > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > 4 files changed, 85 insertions(+)
> >
> > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Documentation/vfio-mediated-device.txt
> > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > | | |--- available_instances
> > | | |--- device_api
> > | | |--- description
> > + | | |--- version
> > | | |--- [devices]
> > | |--- [<type-id>]
> > | | |--- create
> > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > | | |--- available_instances
> > | | |--- device_api
> > | | |--- description
> > + | | |--- version
> > | | |--- [devices]
> > | |--- [<type-id>]
> > | |--- create
> > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > | |--- available_instances
> > | |--- device_api
> > | |--- description
> > + | |--- version
> > | |--- [devices]
> >
> > * [mdev_supported_types]
> > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > [<type-id>], device_api, and available_instances are mandatory attributes
> > that should be provided by vendor driver.
> >
> > + version is a mandatory attribute if a mdev device supports live migration.
> > +
> > * [<type-id>]
> >
> > The [<type-id>] name is created by adding the device driver string as a prefix
> > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > This attribute should show the number of devices of type <type-id> that can be
> > created.
> >
> > +* version
> > +
> > + This attribute is rw. It is used to check whether two devices are compatible
> > + for live migration. If this attribute is missing, then the corresponding mdev
> > + device is regarded as not supporting live migration.
> > +
> > + It consists of two parts: common part and vendor proprietary part.
> > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > + device type. e.g., for pci device, it is
> > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > + vendor proprietary part: this part is varied in length. vendor driver can
> > + specify any string to identify a device.
> > +
> > + When reading this attribute, it should show device version string of the device
> > + of type <type-id>. If a device does not support live migration, it should
> > + return errno.
> > + When writing a string to this attribute, it returns errno for incompatibility
> > + or returns written string length in compatibility case. If a device does not
> > + support live migration, it always returns errno.
> > +
> > + for example.
> > + # cat \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > + 00028086-193b-i915-GVTg_V5_2
> > +
> > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > + -bash: echo: write error: Invalid argument
> > +
>
> IIUC this path is against the physical device. IOW, the mgmt app would have
> to first write to the "version" file to choose a version, and then write to
> the "create" file to actually create an virtual device. This has the obvious
> concurrency problem if multiple devices are being created at the same time
> and distinct versions for each device are required. There would need to be
> a locking scheme defined to ensure safety.

"Create a device of a given version" is not an intended feature of this
interface aiui. Writing the version attribute only indicates
migration compatibility with a binary result.

> Wouldn't it be better if we can pass the desired version when we write to
> the "create" file, so that we avoid any concurrent usage problems. "version"
> could be just a read-only file with a *list* of supported versions.
>
> eg
>
> $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> 5.0
> 5.1
> 5.2
>
> $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create

This is reminiscent of the proposed aggregation support, but again,
this sort of feature is not intended here. It's no expected that any
vendor driver would support creating device types of different
versions, but they may support migration from different versions.
Thanks,

Alex

2019-04-23 13:47:04

by Daniel P. Berrangé

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> On Tue, 23 Apr 2019 11:39:39 +0100
> Daniel P. Berrangé <[email protected]> wrote:
>
> > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > > +* version
> > > +
> > > + This attribute is rw. It is used to check whether two devices are compatible
> > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > + device is regarded as not supporting live migration.
> > > +
> > > + It consists of two parts: common part and vendor proprietary part.
> > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > + device type. e.g., for pci device, it is
> > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > + specify any string to identify a device.
> > > +
> > > + When reading this attribute, it should show device version string of the device
> > > + of type <type-id>. If a device does not support live migration, it should
> > > + return errno.
> > > + When writing a string to this attribute, it returns errno for incompatibility
> > > + or returns written string length in compatibility case. If a device does not
> > > + support live migration, it always returns errno.
> > > +
> > > + for example.
> > > + # cat \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > + 00028086-193b-i915-GVTg_V5_2
> > > +
> > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > + -bash: echo: write error: Invalid argument
> > > +
> >
> > IIUC this path is against the physical device. IOW, the mgmt app would have
> > to first write to the "version" file to choose a version, and then write to
> > the "create" file to actually create an virtual device. This has the obvious
> > concurrency problem if multiple devices are being created at the same time
> > and distinct versions for each device are required. There would need to be
> > a locking scheme defined to ensure safety.
>
> "Create a device of a given version" is not an intended feature of this
> interface aiui. Writing the version attribute only indicates
> migration compatibility with a binary result.
>
> > Wouldn't it be better if we can pass the desired version when we write to
> > the "create" file, so that we avoid any concurrent usage problems. "version"
> > could be just a read-only file with a *list* of supported versions.
> >
> > eg
> >
> > $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > 5.0
> > 5.1
> > 5.2
> >
> > $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
>
> This is reminiscent of the proposed aggregation support, but again,
> this sort of feature is not intended here. It's no expected that any
> vendor driver would support creating device types of different
> versions, but they may support migration from different versions.

Hmm, that's a subtle distinction I wasn't seeing the patch series.
IIUC from what you're saying, a device can be created with version
"C", but for an incoming migration it can (potentially) accept
serialized state matching any of versions "A", "B" or "C".

That is sufficient if migration is being used as a host upgrade
tool, to move from OS release N to N + 1.

It wouldn't cover the case where you need to support backwards
migration too. eg if you have a mixture of hosts with release
N and N + 1 and want to make sure that VMs can always move
betweeen any host. That would require that you can create
mdevs with the lowest common denominator version, not solely
the most recent.

In QEMU this is done by mgmt applications picking a versioned
machine type for QEMU that is older than most recent.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

2019-04-23 14:50:19

by Alex Williamson

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 23 Apr 2019 14:44:00 +0100
Daniel P. Berrangé <[email protected]> wrote:

> On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> > On Tue, 23 Apr 2019 11:39:39 +0100
> > Daniel P. Berrangé <[email protected]> wrote:
> >
> > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > > > +* version
> > > > +
> > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > + device is regarded as not supporting live migration.
> > > > +
> > > > + It consists of two parts: common part and vendor proprietary part.
> > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > + device type. e.g., for pci device, it is
> > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > + specify any string to identify a device.
> > > > +
> > > > + When reading this attribute, it should show device version string of the device
> > > > + of type <type-id>. If a device does not support live migration, it should
> > > > + return errno.
> > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > + or returns written string length in compatibility case. If a device does not
> > > > + support live migration, it always returns errno.
> > > > +
> > > > + for example.
> > > > + # cat \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > + 00028086-193b-i915-GVTg_V5_2
> > > > +
> > > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > + -bash: echo: write error: Invalid argument
> > > > +
> > >
> > > IIUC this path is against the physical device. IOW, the mgmt app would have
> > > to first write to the "version" file to choose a version, and then write to
> > > the "create" file to actually create an virtual device. This has the obvious
> > > concurrency problem if multiple devices are being created at the same time
> > > and distinct versions for each device are required. There would need to be
> > > a locking scheme defined to ensure safety.
> >
> > "Create a device of a given version" is not an intended feature of this
> > interface aiui. Writing the version attribute only indicates
> > migration compatibility with a binary result.
> >
> > > Wouldn't it be better if we can pass the desired version when we write to
> > > the "create" file, so that we avoid any concurrent usage problems. "version"
> > > could be just a read-only file with a *list* of supported versions.
> > >
> > > eg
> > >
> > > $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > 5.0
> > > 5.1
> > > 5.2
> > >
> > > $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > > /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> >
> > This is reminiscent of the proposed aggregation support, but again,
> > this sort of feature is not intended here. It's no expected that any
> > vendor driver would support creating device types of different
> > versions, but they may support migration from different versions.
>
> Hmm, that's a subtle distinction I wasn't seeing the patch series.
> IIUC from what you're saying, a device can be created with version
> "C", but for an incoming migration it can (potentially) accept
> serialized state matching any of versions "A", "B" or "C".
>
> That is sufficient if migration is being used as a host upgrade
> tool, to move from OS release N to N + 1.
>
> It wouldn't cover the case where you need to support backwards
> migration too. eg if you have a mixture of hosts with release
> N and N + 1 and want to make sure that VMs can always move
> betweeen any host. That would require that you can create
> mdevs with the lowest common denominator version, not solely
> the most recent.
>
> In QEMU this is done by mgmt applications picking a versioned
> machine type for QEMU that is older than most recent.

I suppose we'd need to determine how important that is, this is just a
device after all, there are always fallback mechanisms via hotplug.
There are a lot of pieces that need to line up for backwards migration
including support for it at the individual vendor driver. Nothing we
design into the API is going to require vendor drivers to support
backwards migration and we already have some vendors requiring host and
guest driver alignment. Specifying a version with a create syntax as
you've provided is reasonable, but this also balloons into whole
tangential interface providing information regarding what versions a
vendor driver is able to create, because presumably management tools
wouldn't prefer a try-and-see type interface. I believe an intentional
aspect of the proposal here is that we don't need to provide a list of
compatible versions, that's handled internally to the vendor driver.

I don't know if we could start with the proposal here and later add
these sorts of features. Ideas? Thanks,

Alex

2019-04-23 14:58:56

by Daniel P. Berrangé

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 08:48:52AM -0600, Alex Williamson wrote:
> On Tue, 23 Apr 2019 14:44:00 +0100
> Daniel P. Berrangé <[email protected]> wrote:
>
> > On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> > > On Tue, 23 Apr 2019 11:39:39 +0100
> > > Daniel P. Berrangé <[email protected]> wrote:
> > >
> > > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > > > > +* version
> > > > > +
> > > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > > + device is regarded as not supporting live migration.
> > > > > +
> > > > > + It consists of two parts: common part and vendor proprietary part.
> > > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > + device type. e.g., for pci device, it is
> > > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > > + specify any string to identify a device.
> > > > > +
> > > > > + When reading this attribute, it should show device version string of the device
> > > > > + of type <type-id>. If a device does not support live migration, it should
> > > > > + return errno.
> > > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > > + or returns written string length in compatibility case. If a device does not
> > > > > + support live migration, it always returns errno.
> > > > > +
> > > > > + for example.
> > > > > + # cat \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > + 00028086-193b-i915-GVTg_V5_2
> > > > > +
> > > > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > + -bash: echo: write error: Invalid argument
> > > > > +
> > > >
> > > > IIUC this path is against the physical device. IOW, the mgmt app would have
> > > > to first write to the "version" file to choose a version, and then write to
> > > > the "create" file to actually create an virtual device. This has the obvious
> > > > concurrency problem if multiple devices are being created at the same time
> > > > and distinct versions for each device are required. There would need to be
> > > > a locking scheme defined to ensure safety.
> > >
> > > "Create a device of a given version" is not an intended feature of this
> > > interface aiui. Writing the version attribute only indicates
> > > migration compatibility with a binary result.
> > >
> > > > Wouldn't it be better if we can pass the desired version when we write to
> > > > the "create" file, so that we avoid any concurrent usage problems. "version"
> > > > could be just a read-only file with a *list* of supported versions.
> > > >
> > > > eg
> > > >
> > > > $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > 5.0
> > > > 5.1
> > > > 5.2
> > > >
> > > > $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > > > /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> > >
> > > This is reminiscent of the proposed aggregation support, but again,
> > > this sort of feature is not intended here. It's no expected that any
> > > vendor driver would support creating device types of different
> > > versions, but they may support migration from different versions.
> >
> > Hmm, that's a subtle distinction I wasn't seeing the patch series.
> > IIUC from what you're saying, a device can be created with version
> > "C", but for an incoming migration it can (potentially) accept
> > serialized state matching any of versions "A", "B" or "C".
> >
> > That is sufficient if migration is being used as a host upgrade
> > tool, to move from OS release N to N + 1.
> >
> > It wouldn't cover the case where you need to support backwards
> > migration too. eg if you have a mixture of hosts with release
> > N and N + 1 and want to make sure that VMs can always move
> > betweeen any host. That would require that you can create
> > mdevs with the lowest common denominator version, not solely
> > the most recent.
> >
> > In QEMU this is done by mgmt applications picking a versioned
> > machine type for QEMU that is older than most recent.
>
> I suppose we'd need to determine how important that is, this is just a
> device after all, there are always fallback mechanisms via hotplug.
> There are a lot of pieces that need to line up for backwards migration
> including support for it at the individual vendor driver. Nothing we
> design into the API is going to require vendor drivers to support
> backwards migration and we already have some vendors requiring host and
> guest driver alignment. Specifying a version with a create syntax as
> you've provided is reasonable, but this also balloons into whole
> tangential interface providing information regarding what versions a
> vendor driver is able to create, because presumably management tools
> wouldn't prefer a try-and-see type interface. I believe an intentional
> aspect of the proposal here is that we don't need to provide a list of
> compatible versions, that's handled internally to the vendor driver.

Worse is that multi-versions backwards migration also balloons the
testing matrix which is particularly time consuming.

To be clear I'm /not/ saying that multi-versions backwards migration
is a must-have. I just want to be sure we understand the limitations
of what's proposed and that we're happy with them.

Certainly forwards migration is the really big win here.

Backwards migration is more of a nice-to-have and only worth doing if
we expect people are willing to actually do enough testing to make
it reliable. Claiming to support backwards migration and then
never testing it works is even worse than not supporting it, because
it will be a rarely exercised code path comparatively speaking, so
easy to bit-rot if not well tested.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

2019-04-23 15:05:56

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 23 Apr 2019 01:41:57 -0400
Yan Zhao <[email protected]> wrote:

> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <[email protected]> wrote:
> > > >
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > migration.
> > > >
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > identifies device type. e.g., for pci device, it is
> > > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > >
> > > > What purpose does this serve? If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type? Therefore each type would
> > > > already have its own namespace. Also that would make the trailing bit
> > > > of the version string listed below in the example redundant. A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > >
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > >
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.
> >
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?

Yes.

> if user space software only checks version for migration, it means vendor
> driver has to include mdev type in their vendor proprietary part string,
> right?

Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
be a failure on the part of the user.

> Another thing is that could there be any future mdev parent driver that
> applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> driver (https://lkml.org/lkml/2019/3/13/114)?

For starters, this is just a sample driver from which vendor specific
mdev drivers could be forked to support these features, but
additionally, the type is defined by the vendor driver, so even a meta
driver like vfio-pci-mdev could create types like
"vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
device. The "vfio-pci-mdev-type1" is just a sample implementation to
say "the native type of the thing bound to me" and it's going to have
limited usefulness for any sort of persistence to userspace. Thus,
it's a sample driver. Thanks,

Alex

> > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > > specify any string to identify a device.
> > > > >
> > > > > When reading this attribute, it should show device version string of the
> > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > should return errno.
> > > > > When writing a string to this attribute, it returns errno for
> > > > > incompatibility or returns written string length in compatibility case.
> > > > > If a device does not support live migration, it always returns errno.
> > > > >
> > > > > For user space software to use:
> > > > > 1.
> > > > > Before starting live migration, user space software first reads source side
> > > > > mdev device's version. e.g.
> > > > > "#cat \
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > 00028086-193b-i915-GVTg_V5_4
> > > > >
> > > > > 2.
> > > > > Then, user space software writes the source side returned version string
> > > > > to device version attribute in target side, and checks the return value.
> > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > source and target sides are not compatible;
> > > > > If a positive number is returned and it equals to the length of written
> > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > e.g.
> > > > > (a) compatibility case
> > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > >
> > > > > (b) incompatibility case
> > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > -bash: echo: write error: Invalid argument
> > > > >
> > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > live migration, and vice versa.
> > > > >
> > > > > Note: if a mdev device does not support live migration, it either does
> > > > > not provide a version attribute, or always returns errno when its version
> > > > > attribute is read/written.
> > > >
> > > > I think it would be cleaner to do the former, not supply the
> > > > attribute. This seems to do the latter in the sample drivers. Thanks,
> > > Ok. you are right!
> > > what about just keep one sample driver to show how to do the latter,
> > > and let the others do the former?
> >
> > I'd rather that if a vendor driver doesn't support features requiring
> > the version attribute that they don't implement it. It's confusing to
> > developers looking at the sample driver for guidance if we have
> > different implementations. Of course if you'd like to add migration
> > support to one of the sample drivers, that'd be very welcome. Thanks,
> >
> Got it:)
>
> Thanks!
> Yan
>
> >
> > > > > Cc: Alex Williamson <[email protected]>
> > > > > Cc: Erik Skultety <[email protected]>
> > > > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > > > Cc: Cornelia Huck <[email protected]>
> > > > > Cc: "Tian, Kevin" <[email protected]>
> > > > > Cc: Zhenyu Wang <[email protected]>
> > > > > Cc: "Wang, Zhi A" <[email protected]>
> > > > > Cc: Neo Jia <[email protected]>
> > > > > Cc: Kirti Wankhede <[email protected]>
> > > > >
> > > > > Signed-off-by: Yan Zhao <[email protected]>
> > > > > ---
> > > > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > > > 4 files changed, 85 insertions(+)
> > > > >
> > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > | | |--- available_instances
> > > > > | | |--- device_api
> > > > > | | |--- description
> > > > > + | | |--- version
> > > > > | | |--- [devices]
> > > > > | |--- [<type-id>]
> > > > > | | |--- create
> > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > | | |--- available_instances
> > > > > | | |--- device_api
> > > > > | | |--- description
> > > > > + | | |--- version
> > > > > | | |--- [devices]
> > > > > | |--- [<type-id>]
> > > > > | |--- create
> > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > | |--- available_instances
> > > > > | |--- device_api
> > > > > | |--- description
> > > > > + | |--- version
> > > > > | |--- [devices]
> > > > >
> > > > > * [mdev_supported_types]
> > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > that should be provided by vendor driver.
> > > > >
> > > > > + version is a mandatory attribute if a mdev device supports live migration.
> > > > > +
> > > > > * [<type-id>]
> > > > >
> > > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > This attribute should show the number of devices of type <type-id> that can be
> > > > > created.
> > > > >
> > > > > +* version
> > > > > +
> > > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > > + device is regarded as not supporting live migration.
> > > > > +
> > > > > + It consists of two parts: common part and vendor proprietary part.
> > > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > + device type. e.g., for pci device, it is
> > > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > > + specify any string to identify a device.
> > > > > +
> > > > > + When reading this attribute, it should show device version string of the device
> > > > > + of type <type-id>. If a device does not support live migration, it should
> > > > > + return errno.
> > > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > > + or returns written string length in compatibility case. If a device does not
> > > > > + support live migration, it always returns errno.
> > > > > +
> > > > > + for example.
> > > > > + # cat \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > + 00028086-193b-i915-GVTg_V5_2
> > > > > +
> > > > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > + -bash: echo: write error: Invalid argument
> > > > > +
> > > > > * [device]
> > > > >
> > > > > This directory contains links to the devices of type <type-id> that have been
> > > > > @@ -327,12 +361,14 @@ card.
> > > > > | | |-- available_instances
> > > > > | | |-- create
> > > > > | | |-- device_api
> > > > > + | | |-- version
> > > > > | | |-- devices
> > > > > | | `-- name
> > > > > | `-- mtty-2
> > > > > | |-- available_instances
> > > > > | |-- create
> > > > > | |-- device_api
> > > > > + | |-- version
> > > > > | |-- devices
> > > > > | `-- name
> > > > > |-- mtty_dev
> > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > }
> > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > >
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > + char *buf)
> > > > > +{
> > > > > + /* do not support live migration */
> > > > > + return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > + const char *buf, size_t count)
> > > > > +{
> > > > > + /* do not support live migration */
> > > > > + return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > +
> > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > &mdev_type_attr_name.attr,
> > > > > &mdev_type_attr_description.attr,
> > > > > &mdev_type_attr_device_api.attr,
> > > > > &mdev_type_attr_available_instances.attr,
> > > > > + &mdev_type_attr_version.attr,
> > > > > NULL,
> > > > > };
> > > > >
> > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > }
> > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > >
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > + char *buf)
> > > > > +{
> > > > > + /* do not support live migration */
> > > > > + return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > + const char *buf, size_t count)
> > > > > +{
> > > > > + /* do not support live migration */
> > > > > + return -EINVAL;
> > > > > +}
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > +
> > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > &mdev_type_attr_name.attr,
> > > > > &mdev_type_attr_description.attr,
> > > > > &mdev_type_attr_device_api.attr,
> > > > > &mdev_type_attr_available_instances.attr,
> > > > > + &mdev_type_attr_version.attr,
> > > > > NULL,
> > > > > };
> > > > >
> > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >
> > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > >
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > + char *buf)
> > > > > +{
> > > > > + /* do not support live migration */
> > > > > + return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > + const char *buf, size_t count)
> > > > > +{
> > > > > + /* do not support live migration */
> > > > > + return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > &mdev_type_attr_name.attr,
> > > > > &mdev_type_attr_device_api.attr,
> > > > > &mdev_type_attr_available_instances.attr,
> > > > > + &mdev_type_attr_version.attr,
> > > > > NULL,
> > > > > };
> > > > >
> > > >
> > > > _______________________________________________
> > > > intel-gvt-dev mailing list
> > > > [email protected]
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> >
> > _______________________________________________
> > intel-gvt-dev mailing list
> > [email protected]
> > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-24 02:41:09

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/i915/gvt: export mdev device version to sysfs for Intel vGPU

On Tue, Apr 23, 2019 at 07:39:11PM +0800, Cornelia Huck wrote:
> On Fri, 19 Apr 2019 04:35:59 -0400
> Yan Zhao <[email protected]> wrote:
>
> > This feature implements the version attribute for Intel's vGPU mdev
> > devices.
> >
> > version attribute is rw. It is queried by userspace software like libvirt
> > to check whether two vGPUs are compatible for live migration.
> >
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> > specify any string to identify a device.
> >
> > For Intel vGPU of gen8 and gen9, the vendor proprietary part currently
> > consists of 2 fields: "device id" + "mdev type".
> >
> > Reading from a vGPU's version attribute, a string is returned in below
> > format: 00028086-<device id>-<mdev type>. e.g.
> > 00028086-193b-i915-GVTg_V5_2.
> >
> > Writing a string to a vGPU's version attribute will trigger GVT to check
> > whether a vGPU identified by the written string is compatible with
> > current vGPU owning this version attribute. errno is returned if the two
> > vGPUs are incompatible. The length of written string is returned in
> > compatible case.
> >
> > For other platforms, and for GVT not supporting vGPU live migration
> > feature, errnos are returned when read/write of mdev devices' version
> > attributes.
> >
> > For old GVT versions where no version attributes exposed in sysfs, it is
> > regarded as not supporting vGPU live migration.
> >
> > For future platforms, besides the current 2 fields in vendor proprietary
> > part, more fields may be added to identify Intel vGPU well for live
> > migration purpose.
> >
> > Cc: Alex Williamson <[email protected]>
> > Cc: Erik Skultety <[email protected]>
> > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > Cc: Cornelia Huck <[email protected]>
> > Cc: "Tian, Kevin" <[email protected]>
> > Cc: Zhenyu Wang <[email protected]>
> > Cc: "Wang, Zhi A" <[email protected]>
> > c: Neo Jia <[email protected]>
> > Cc: Kirti Wankhede <[email protected]>
> >
> > Signed-off-by: Yan Zhao <[email protected]>
> > ---
> > drivers/gpu/drm/i915/gvt/Makefile | 2 +-
> > drivers/gpu/drm/i915/gvt/device_version.c | 94 +++++++++++++++++++++++
> > drivers/gpu/drm/i915/gvt/gvt.c | 55 +++++++++++++
> > drivers/gpu/drm/i915/gvt/gvt.h | 6 ++
> > 4 files changed, 156 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/gpu/drm/i915/gvt/device_version.c
> >
>
> (...)
>
> > +static bool is_compatible(const char *self, const char *remote)
> > +{
> > + if (strlen(remote) != strlen(self))
> > + return false;
> > +
> > + return (strncmp(self, remote, strlen(self))) ? false : true;
> > +}
> > +
> > +ssize_t intel_gvt_get_vfio_device_version_len(struct drm_i915_private *dev_priv)
> > +{
> > + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> > + return -ENODEV;
> > +
> > + return PAGE_SIZE;
> > +}
> > +
> > +ssize_t intel_gvt_get_vfio_device_version(struct drm_i915_private *dev_priv,
> > + char *buf, const char *mdev_type)
> > +{
> > + int cnt = 0, ret = 0;
> > + const char *str = NULL;
> > +
> > + /* currently only gen8 & gen9 are supported */
> > + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> > + return -ENODEV;
> > +
> > + /* first 32 bit common part specifying vendor id and it's a pci
> > + * device
> > + */
> > + cnt = snprintf(buf, GVT_DEVICE_VERSION_COMMON_LEN + 1,
> > + "%08x", GVT_VFIO_DEVICE_VENDOR_ID);
> > + buf += cnt;
> > + ret += cnt;
> > +
> > + /* vendor proprietary part: device id + mdev type */
> > + /* device id */
> > + cnt = snprintf(buf, GVT_DEVICE_VERSION_DEVICE_ID_LEN + 2,
> > + "-%04x", INTEL_DEVID(dev_priv));
> > + buf += cnt;
> > + ret += cnt;
> > +
> > + /* mdev type */
> > + str = mdev_type;
> > + cnt = snprintf(buf, strlen(str) + 3, "-%s\n", mdev_type);
> > + buf += cnt;
> > + ret += cnt;
> > +
> > + return ret;
> > +}
>
> Looking at this handling, it seems much easier to me to simply use a
> numeric value instead of a string: You don't have to build it via
> sprintf, there are generic functions for parsing a string input into a
> simple number, and you have more options for compatibility (e.g.
> "version must be between m and n" instead of an exact match).
>
> If you still need to encode the device id here, you should be able to
> easily do something like (device_id << 16) | migration_version -- do
> you think that could work?
>
hi Cornelia,
using string is based on the consideration that we want to make this
version string a thing that can distinguish a mdev, so we incoportate
vendor id, device id to identify parent device first, then mdev type to
describe the mdev based on the parent device. And that's only for gen8 and
gen9. For future platforms, we may incorpate more information, e.g. besides
vendor id and device id, different device revision number, or even a value
in a register on the run may be needed to identify a parent device.

I think it's cleaner than a numeric version between m and n, because in that
case we have to maintain what m's configration is and what n's is.
whenever a mdev type is added (like changing a resolution type in mdev
type) a new version is generated. it's too complicated.

That's why we use current way: version describe parent device + mdev type
elaborately, then vendor driver checks compatibility according to this
information.

Do you think it's all right?

> > +
> > +ssize_t intel_gvt_check_vfio_device_version(struct drm_i915_private *dev_priv,
> > + const char *self, const char *remote)
> > +{
> > +
> > + /* currently only gen8 & gen9 are supported */
> > + if (!IS_GEN(dev_priv, 8) && !IS_GEN(dev_priv, 9))
> > + return -ENODEV;
> > +
> > + if (!is_compatible(self, remote))
> > + return -EINVAL;
>
> I think the meaning of the error codes should really be standardized
> across vendor drivers, if we need a value for "this device does not
> support migration at all". (Your choices look reasonable for that.)
>
Agree. thank you. I'll keep error codes consistently in future.

> > +
> > + return 0;
> > +}
> > diff --git a/drivers/gpu/drm/i915/gvt/gvt.c b/drivers/gpu/drm/i915/gvt/gvt.c
> > index 43f4242062dd..e720465b93d8 100644
> > --- a/drivers/gpu/drm/i915/gvt/gvt.c
> > +++ b/drivers/gpu/drm/i915/gvt/gvt.c
> > @@ -105,14 +105,69 @@ static ssize_t description_show(struct kobject *kobj, struct device *dev,
> > type->weight);
> > }
> >
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > + char *buf)
> > +{
> > +#ifdef GVT_MIGRATION_VERSION
> > + struct drm_i915_private *i915 = kdev_to_i915(dev);
> > + const char *mdev_type = kobject_name(kobj);
> > +
> > + return intel_gvt_get_vfio_device_version(i915, buf, mdev_type);
> > +#else
> > + /* do not support live migration */
> > + return -EINVAL;
>
> ...but this looks inconsistent. I would expect -ENODEV here, same as
> for non-gen{8,9}.
Right, case "not suppporting live migration" should return -ENODEV.
Thanks:)


> Or simply do not create the attribute at all in that case?
That's also a good choice :)



> > +#endif
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > + const char *buf, size_t count)
> > +{
> > +#ifdef GVT_MIGRATION_VERSION
> > + char *remote = NULL, *self = NULL;
> > + int len, ret = 0;
> > + struct drm_i915_private *i915 = kdev_to_i915(dev);
> > + const char *mdev_type = kobject_name(kobj);
> > +
> > + len = intel_gvt_get_vfio_device_version_len(i915);
> > + if (len < 0)
> > + return len;
> > +
> > + self = kmalloc(len, GFP_KERNEL);
> > + if (!self)
> > + return -ENOMEM;
> > +
> > + ret = intel_gvt_get_vfio_device_version(i915, self, mdev_type);
> > + if (ret < 0)
> > + goto out;
> > +
> > + remote = kstrndup(buf, count, GFP_KERNEL);
> > + if (!remote) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
> > +
> > + ret = intel_gvt_check_vfio_device_version(i915, self, remote);
> > +
> > +out:
> > + kfree(self);
> > + kfree(remote);
> > + return (ret < 0 ? ret : count);
> > +#else
> > + /* do not support live migration */
> > + return -EINVAL;
> > +#endif
> > +}
> > +
> > static MDEV_TYPE_ATTR_RO(available_instances);
> > static MDEV_TYPE_ATTR_RO(device_api);
> > static MDEV_TYPE_ATTR_RO(description);
> > +static MDEV_TYPE_ATTR_RW(version);
> >
> > static struct attribute *gvt_type_attrs[] = {
> > &mdev_type_attr_available_instances.attr,
> > &mdev_type_attr_device_api.attr,
> > &mdev_type_attr_description.attr,
> > + &mdev_type_attr_version.attr,
> > NULL,
> > };
> >
> (...)
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-24 03:17:23

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> On Fri, 19 Apr 2019 04:35:04 -0400
> Yan Zhao <[email protected]> wrote:
>
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> >
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> > specify any string to identify a device.
> >
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
>
> It might make more sense if the driver does not register the attribute
> for the device in that case at all.
>
yes. what about rephrase like this:
"
When reading this attribute, it should show device version string of the
device of type <type-id>.
If a device does not support live migration, it has two choices:
1. do not register this version attribute
2. return -ENODEV on rw this version attribute
Choice 1 is preferred.
"


> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> >
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> >
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
> >
> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> >
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
> >
> > Cc: Alex Williamson <[email protected]>
> > Cc: Erik Skultety <[email protected]>
> > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > Cc: Cornelia Huck <[email protected]>
> > Cc: "Tian, Kevin" <[email protected]>
> > Cc: Zhenyu Wang <[email protected]>
> > Cc: "Wang, Zhi A" <[email protected]>
> > Cc: Neo Jia <[email protected]>
> > Cc: Kirti Wankhede <[email protected]>
> >
> > Signed-off-by: Yan Zhao <[email protected]>
> > ---
> > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > 4 files changed, 85 insertions(+)
> >
> > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Documentation/vfio-mediated-device.txt
> > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > | | |--- available_instances
> > | | |--- device_api
> > | | |--- description
> > + | | |--- version
> > | | |--- [devices]
> > | |--- [<type-id>]
> > | | |--- create
> > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > | | |--- available_instances
> > | | |--- device_api
> > | | |--- description
> > + | | |--- version
> > | | |--- [devices]
> > | |--- [<type-id>]
> > | |--- create
> > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > | |--- available_instances
> > | |--- device_api
> > | |--- description
> > + | |--- version
> > | |--- [devices]
> >
> > * [mdev_supported_types]
> > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > [<type-id>], device_api, and available_instances are mandatory attributes
> > that should be provided by vendor driver.
> >
> > + version is a mandatory attribute if a mdev device supports live migration.
>
> What about "An mdev device wishing to support live migration must
> provide the version attribute."?
yes, I just want to keep consistent with the line above it
" [<type-id>], device_api, and available_instances are mandatory attributes
that should be provided by vendor driver."
what about below one?
"version is a mandatory attribute if a mdev device wishing to support live
migration."


> > +
> > * [<type-id>]
> >
> > The [<type-id>] name is created by adding the device driver string as a prefix
> > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > This attribute should show the number of devices of type <type-id> that can be
> > created.
> >
> > +* version
> > +
> > + This attribute is rw. It is used to check whether two devices are compatible
> > + for live migration. If this attribute is missing, then the corresponding mdev
> > + device is regarded as not supporting live migration.
> > +
> > + It consists of two parts: common part and vendor proprietary part.
> > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > + device type. e.g., for pci device, it is
> > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > + vendor proprietary part: this part is varied in length. vendor driver can
> > + specify any string to identify a device.
> > +
> > + When reading this attribute, it should show device version string of the device
> > + of type <type-id>. If a device does not support live migration, it should
> > + return errno.
> > + When writing a string to this attribute, it returns errno for incompatibility
> > + or returns written string length in compatibility case. If a device does not
> > + support live migration, it always returns errno.
>
> I'm not sure whether a device that does not support live migration
> should expose this attribute in the first place. Or is that to cover
> cases where a driver supports live migration only for some of the
> devices it supports?
yes, driver returning error code is to cover the cases where only part of devices it
supports can be migrated.


> Also, I'm not sure if a string that has to be parsed is a good idea...
> is this 'version' attribute supposed to convey some human-readable
> information as well? The procedure you describe for compatibility
> checking does the checking within the vendor driver which I would
> expect to have a table/rules for that anyway.
right. if a vendor driver has the confidence to migrate between devices of
diffent platform or mdev types, it can maintain a compatibility table for that
purpose. That's the reason why we would leave the compatibility check to vendor
driver. vendor driver can freely choose its own complicated way to decide
which device is migratable to which device.

> I think you should also specify which errno writing an incompatible id
> is supposed to return (probably best something different than if the
> device does not support live migration at all, if we stick with
> creating the attribute in that case.)
Agree. I'll define it clearly in next revison.

Thanks
Yan

> > +
> > + for example.
> > + # cat \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > + 00028086-193b-i915-GVTg_V5_2
> > +
> > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > + -bash: echo: write error: Invalid argument
> > +
> > * [device]
> >
> > This directory contains links to the devices of type <type-id> that have been
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-24 03:41:15

by Yan Zhao

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 06:24:19PM +0800, Daniel P. Berrang? wrote:
> On Tue, Apr 23, 2019 at 01:41:57AM -0400, Yan Zhao wrote:
> > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > Yan Zhao <[email protected]> wrote:
> > >
> > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > Yan Zhao <[email protected]> wrote:
> > > > >
> > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > migration.
> > > > >
> > > > > The Subject: doesn't quite match what's being proposed here.
> > > > >
> > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > > identifies device type. e.g., for pci device, it is
> > > > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > >
> > > > > What purpose does this serve? If it's intended as some sort of
> > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > migration to devices of the same type? Therefore each type would
> > > > > already have its own namespace. Also that would make the trailing bit
> > > > > of the version string listed below in the example redundant. A vendor
> > > > > is still welcome to include this in their version string if they wish,
> > > > > but I think the string should be entirely vendor defined.
> > > > >
> > > > hi Alex,
> > > > This common part is a kind of namespace.
> > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > if there is a case that one vendor's version string happens to deceive and
> > > > interfere with another vendor's version checking?
> > > > e.g.
> > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > >
> > > > In this corner case, the two vendors may regard the two device is
> > > > migratable but actually they are not.
> > > >
> > > > That's the reason for this common part that serve as a kind of namespace
> > > > that all vendors will comply with to avoid overlap.
> > >
> > > If we assume that migration can only occur between matching mdev types,
> > > this is redundant, each type already has their own namespace.
> > >
> > hi Alex,
> > do you mean user space software like libvirt needs to first check whether
> > mdev type is matching and then check whether version is matching?
>
> I would expect that libvirt (or other mgmt apps) will always first
> check that the vendor id, device id, mdev type all match. So for the
> version string it should suffice to be a "normal" numeric value.
>
> Essentially version string just needs to be there to distinguish
> revisions of the same mdev type over time.
>
hi Daniel,
The way that user space software checks that the vendor id, device id, mdev
type all match and version string is just revisions is somewhat restrictive?
user space software could not have so much detailed information regarding to
which devices are compatible, especially when vendor id, device id,
revision id, and mdev types are not enough or no need to be exactly the same.
By moving decision making for compatibility from user space to vendor
driver, user space software can be more change-resistant.
Agree?

Thanks
Yan

>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

2019-04-24 04:15:22

by Neo Jia

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 23, 2019 at 11:39:39AM +0100, Daniel P. Berrang? wrote:
> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> >
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> > specify any string to identify a device.
> >
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> >
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> >
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
>
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.
>
> This is good as it allows for live migration across driver software upgrades.
>
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
>
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.
>
> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.

What would be the typical scenario / use case for mgmt layer to query the version
information? Do they expect this get done completely offline as long as the
vendor driver installed on each host?

Thanks,
Neo

>
>

2019-04-24 08:07:38

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 23 Apr 2019 23:10:37 -0400
Yan Zhao <[email protected]> wrote:

> On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > On Fri, 19 Apr 2019 04:35:04 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > device version attribute in mdev sysfs is used by user space software
> > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > migration.
> > >
> > > It consists of two parts: common part and vendor proprietary part.
> > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > identifies device type. e.g., for pci device, it is
> > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > vendor proprietary part: this part is varied in length. vendor driver can
> > > specify any string to identify a device.
> > >
> > > When reading this attribute, it should show device version string of the
> > > device of type <type-id>. If a device does not support live migration, it
> > > should return errno.
> >
> > It might make more sense if the driver does not register the attribute
> > for the device in that case at all.
> >
> yes. what about rephrase like this:
> "
> When reading this attribute, it should show device version string of the
> device of type <type-id>.
> If a device does not support live migration, it has two choices:
> 1. do not register this version attribute
> 2. return -ENODEV on rw this version attribute

"return -ENODEV when accessing the version attribute" ?

> Choice 1 is preferred.
> "
>
>
> > > When writing a string to this attribute, it returns errno for
> > > incompatibility or returns written string length in compatibility case.
> > > If a device does not support live migration, it always returns errno.
> > >
> > > For user space software to use:
> > > 1.
> > > Before starting live migration, user space software first reads source side
> > > mdev device's version. e.g.
> > > "#cat \
> > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > 00028086-193b-i915-GVTg_V5_4
> > >
> > > 2.
> > > Then, user space software writes the source side returned version string
> > > to device version attribute in target side, and checks the return value.
> > > If a negative errno is returned in the target side, then mdev devices in
> > > source and target sides are not compatible;
> > > If a positive number is returned and it equals to the length of written
> > > string, then the two mdev devices in source and target side are compatible.
> > > e.g.
> > > (a) compatibility case
> > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > >
> > > (b) incompatibility case
> > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > -bash: echo: write error: Invalid argument
> > >
> > > 3. if two mdev devices are compatible, user space software can start
> > > live migration, and vice versa.
> > >
> > > Note: if a mdev device does not support live migration, it either does
> > > not provide a version attribute, or always returns errno when its version
> > > attribute is read/written.
> > >
> > > Cc: Alex Williamson <[email protected]>
> > > Cc: Erik Skultety <[email protected]>
> > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > Cc: Cornelia Huck <[email protected]>
> > > Cc: "Tian, Kevin" <[email protected]>
> > > Cc: Zhenyu Wang <[email protected]>
> > > Cc: "Wang, Zhi A" <[email protected]>
> > > Cc: Neo Jia <[email protected]>
> > > Cc: Kirti Wankhede <[email protected]>
> > >
> > > Signed-off-by: Yan Zhao <[email protected]>
> > > ---
> > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > 4 files changed, 85 insertions(+)
> > >
> > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > index c3f69bcaf96e..bc28471c0667 100644
> > > --- a/Documentation/vfio-mediated-device.txt
> > > +++ b/Documentation/vfio-mediated-device.txt
> > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > | | |--- available_instances
> > > | | |--- device_api
> > > | | |--- description
> > > + | | |--- version
> > > | | |--- [devices]
> > > | |--- [<type-id>]
> > > | | |--- create
> > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > | | |--- available_instances
> > > | | |--- device_api
> > > | | |--- description
> > > + | | |--- version
> > > | | |--- [devices]
> > > | |--- [<type-id>]
> > > | |--- create
> > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > | |--- available_instances
> > > | |--- device_api
> > > | |--- description
> > > + | |--- version
> > > | |--- [devices]
> > >
> > > * [mdev_supported_types]
> > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > that should be provided by vendor driver.
> > >
> > > + version is a mandatory attribute if a mdev device supports live migration.
> >
> > What about "An mdev device wishing to support live migration must
> > provide the version attribute."?
> yes, I just want to keep consistent with the line above it
> " [<type-id>], device_api, and available_instances are mandatory attributes
> that should be provided by vendor driver."
> what about below one?
> "version is a mandatory attribute if a mdev device wishing to support live
> migration."

My point is that an attribute is not mandatory if it can be left out :)
(I'm not a native speaker, though; maybe this makes perfect sense
after all?)

Maybe "version is a required attribute if live migration is supported
for an mdev device"?

>
>
> > > +
> > > * [<type-id>]
> > >
> > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > This attribute should show the number of devices of type <type-id> that can be
> > > created.
> > >
> > > +* version
> > > +
> > > + This attribute is rw. It is used to check whether two devices are compatible
> > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > + device is regarded as not supporting live migration.
> > > +
> > > + It consists of two parts: common part and vendor proprietary part.
> > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > + device type. e.g., for pci device, it is
> > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > + specify any string to identify a device.
> > > +
> > > + When reading this attribute, it should show device version string of the device
> > > + of type <type-id>. If a device does not support live migration, it should
> > > + return errno.
> > > + When writing a string to this attribute, it returns errno for incompatibility
> > > + or returns written string length in compatibility case. If a device does not
> > > + support live migration, it always returns errno.
> >
> > I'm not sure whether a device that does not support live migration
> > should expose this attribute in the first place. Or is that to cover
> > cases where a driver supports live migration only for some of the
> > devices it supports?
> yes, driver returning error code is to cover the cases where only part of devices it
> supports can be migrated.
>
>
> > Also, I'm not sure if a string that has to be parsed is a good idea...
> > is this 'version' attribute supposed to convey some human-readable
> > information as well? The procedure you describe for compatibility
> > checking does the checking within the vendor driver which I would
> > expect to have a table/rules for that anyway.
> right. if a vendor driver has the confidence to migrate between devices of
> diffent platform or mdev types, it can maintain a compatibility table for that
> purpose. That's the reason why we would leave the compatibility check to vendor
> driver. vendor driver can freely choose its own complicated way to decide
> which device is migratable to which device.

I think there are two scenarios here:
- Migrating between different device types, which is unlikely to work,
except in special cases.
- Migrating between different versions of the same device type, which
may work for some drivers/devices (and at least migrating to a newer
version looks quite reasonable).

But both should be something that is decided by the individual driver;
I hope we don't want to support migration between different drivers :-O

Can we make this a driver-defined format?

>
> > I think you should also specify which errno writing an incompatible id
> > is supposed to return (probably best something different than if the
> > device does not support live migration at all, if we stick with
> > creating the attribute in that case.)
> Agree. I'll define it clearly in next revison.

Thanks!

2019-04-24 08:22:58

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Wed, Apr 24, 2019 at 03:56:24PM +0800, Cornelia Huck wrote:
> On Tue, 23 Apr 2019 23:10:37 -0400
> Yan Zhao <[email protected]> wrote:
>
> > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > Yan Zhao <[email protected]> wrote:
> > >
> > > > device version attribute in mdev sysfs is used by user space software
> > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > migration.
> > > >
> > > > It consists of two parts: common part and vendor proprietary part.
> > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > identifies device type. e.g., for pci device, it is
> > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > specify any string to identify a device.
> > > >
> > > > When reading this attribute, it should show device version string of the
> > > > device of type <type-id>. If a device does not support live migration, it
> > > > should return errno.
> > >
> > > It might make more sense if the driver does not register the attribute
> > > for the device in that case at all.
> > >
> > yes. what about rephrase like this:
> > "
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>.
> > If a device does not support live migration, it has two choices:
> > 1. do not register this version attribute
> > 2. return -ENODEV on rw this version attribute
>
> "return -ENODEV when accessing the version attribute" ?
Yeah, this one is better, thanks :)

> > Choice 1 is preferred.
> > "
> >
> >
> > > > When writing a string to this attribute, it returns errno for
> > > > incompatibility or returns written string length in compatibility case.
> > > > If a device does not support live migration, it always returns errno.
> > > >
> > > > For user space software to use:
> > > > 1.
> > > > Before starting live migration, user space software first reads source side
> > > > mdev device's version. e.g.
> > > > "#cat \
> > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > 00028086-193b-i915-GVTg_V5_4
> > > >
> > > > 2.
> > > > Then, user space software writes the source side returned version string
> > > > to device version attribute in target side, and checks the return value.
> > > > If a negative errno is returned in the target side, then mdev devices in
> > > > source and target sides are not compatible;
> > > > If a positive number is returned and it equals to the length of written
> > > > string, then the two mdev devices in source and target side are compatible.
> > > > e.g.
> > > > (a) compatibility case
> > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > >
> > > > (b) incompatibility case
> > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > -bash: echo: write error: Invalid argument
> > > >
> > > > 3. if two mdev devices are compatible, user space software can start
> > > > live migration, and vice versa.
> > > >
> > > > Note: if a mdev device does not support live migration, it either does
> > > > not provide a version attribute, or always returns errno when its version
> > > > attribute is read/written.
> > > >
> > > > Cc: Alex Williamson <[email protected]>
> > > > Cc: Erik Skultety <[email protected]>
> > > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > > Cc: Cornelia Huck <[email protected]>
> > > > Cc: "Tian, Kevin" <[email protected]>
> > > > Cc: Zhenyu Wang <[email protected]>
> > > > Cc: "Wang, Zhi A" <[email protected]>
> > > > Cc: Neo Jia <[email protected]>
> > > > Cc: Kirti Wankhede <[email protected]>
> > > >
> > > > Signed-off-by: Yan Zhao <[email protected]>
> > > > ---
> > > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > > 4 files changed, 85 insertions(+)
> > > >
> > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > --- a/Documentation/vfio-mediated-device.txt
> > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > | | |--- available_instances
> > > > | | |--- device_api
> > > > | | |--- description
> > > > + | | |--- version
> > > > | | |--- [devices]
> > > > | |--- [<type-id>]
> > > > | | |--- create
> > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > | | |--- available_instances
> > > > | | |--- device_api
> > > > | | |--- description
> > > > + | | |--- version
> > > > | | |--- [devices]
> > > > | |--- [<type-id>]
> > > > | |--- create
> > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > | |--- available_instances
> > > > | |--- device_api
> > > > | |--- description
> > > > + | |--- version
> > > > | |--- [devices]
> > > >
> > > > * [mdev_supported_types]
> > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > that should be provided by vendor driver.
> > > >
> > > > + version is a mandatory attribute if a mdev device supports live migration.
> > >
> > > What about "An mdev device wishing to support live migration must
> > > provide the version attribute."?
> > yes, I just want to keep consistent with the line above it
> > " [<type-id>], device_api, and available_instances are mandatory attributes
> > that should be provided by vendor driver."
> > what about below one?
> > "version is a mandatory attribute if a mdev device wishing to support live
> > migration."
>
> My point is that an attribute is not mandatory if it can be left out :)
> (I'm not a native speaker, though; maybe this makes perfect sense
> after all?)
>
> Maybe "version is a required attribute if live migration is supported
> for an mdev device"?
>
you are right, "mandatory" may bring some confusion.
Maybe
"vendor driver must provide version attribute for an mdev device wishing to
support live migration." ?
based on your first version :)

> >
> >
> > > > +
> > > > * [<type-id>]
> > > >
> > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > This attribute should show the number of devices of type <type-id> that can be
> > > > created.
> > > >
> > > > +* version
> > > > +
> > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > + device is regarded as not supporting live migration.
> > > > +
> > > > + It consists of two parts: common part and vendor proprietary part.
> > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > + device type. e.g., for pci device, it is
> > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > + specify any string to identify a device.
> > > > +
> > > > + When reading this attribute, it should show device version string of the device
> > > > + of type <type-id>. If a device does not support live migration, it should
> > > > + return errno.
> > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > + or returns written string length in compatibility case. If a device does not
> > > > + support live migration, it always returns errno.
> > >
> > > I'm not sure whether a device that does not support live migration
> > > should expose this attribute in the first place. Or is that to cover
> > > cases where a driver supports live migration only for some of the
> > > devices it supports?
> > yes, driver returning error code is to cover the cases where only part of devices it
> > supports can be migrated.
> >
> >
> > > Also, I'm not sure if a string that has to be parsed is a good idea...
> > > is this 'version' attribute supposed to convey some human-readable
> > > information as well? The procedure you describe for compatibility
> > > checking does the checking within the vendor driver which I would
> > > expect to have a table/rules for that anyway.
> > right. if a vendor driver has the confidence to migrate between devices of
> > diffent platform or mdev types, it can maintain a compatibility table for that
> > purpose. That's the reason why we would leave the compatibility check to vendor
> > driver. vendor driver can freely choose its own complicated way to decide
> > which device is migratable to which device.
>
> I think there are two scenarios here:
> - Migrating between different device types, which is unlikely to work,
> except in special cases.
> - Migrating between different versions of the same device type, which
> may work for some drivers/devices (and at least migrating to a newer
> version looks quite reasonable).
>
> But both should be something that is decided by the individual driver;
> I hope we don't want to support migration between different drivers :-O
>
> Can we make this a driver-defined format?
>
yes, this is indeed driver-defined format.
Actually we define it into two parts: common part and vendor proprietary part.
common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
identifies device type. e.g., for pci device, it is
"pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
vendor proprietary part: this part is varied in length. vendor driver can
specify any string to identify a device.

vendor proprietary part is defined by vendor driver. vendor driver can
define any format it wishes to use. Also it is its own responsibility to
ensure backward compatibility if it wants to update format definition in this
part.

So user space only needs to get source side's version string, and asks
target side whether the two are compatible. The decision maker is the
vendor driver:)


> >
> > > I think you should also specify which errno writing an incompatible id
> > > is supposed to return (probably best something different than if the
> > > device does not support live migration at all, if we stick with
> > > creating the attribute in that case.)
> > Agree. I'll define it clearly in next revison.
>
> Thanks!

2019-04-24 09:13:04

by Christophe de Dinechin

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device



> On 23 Apr 2019, at 12:39, Daniel P. Berrangé <[email protected]> wrote:
>
> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
>> device version attribute in mdev sysfs is used by user space software
>> (e.g. libvirt) to query device compatibility for live migration of VFIO
>> mdev devices. This attribute is mandatory if a mdev device supports live
>> migration.
>>
>> It consists of two parts: common part and vendor proprietary part.
>> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>> identifies device type. e.g., for pci device, it is
>> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
>> vendor proprietary part: this part is varied in length. vendor driver can
>> specify any string to identify a device.
>>
>> When reading this attribute, it should show device version string of the
>> device of type <type-id>. If a device does not support live migration, it
>> should return errno.
>> When writing a string to this attribute, it returns errno for
>> incompatibility or returns written string length in compatibility case.
>> If a device does not support live migration, it always returns errno.
>>
>> For user space software to use:
>> 1.
>> Before starting live migration, user space software first reads source side
>> mdev device's version. e.g.
>> "#cat \
>> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
>> 00028086-193b-i915-GVTg_V5_4
>>
>> 2.
>> Then, user space software writes the source side returned version string
>> to device version attribute in target side, and checks the return value.
>> If a negative errno is returned in the target side, then mdev devices in
>> source and target sides are not compatible;
>> If a positive number is returned and it equals to the length of written
>> string, then the two mdev devices in source and target side are compatible.
>> e.g.
>> (a) compatibility case
>> "# echo 00028086-193b-i915-GVTg_V5_4 >
>> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>>
>> (b) incompatibility case
>> "#echo 00028086-193b-i915-GVTg_V5_1 >
>> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>> -bash: echo: write error: Invalid argument
>
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.
>
> This is good as it allows for live migration across driver software upgrades.
>
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
>
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.
>
> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.
>
>
>> 3. if two mdev devices are compatible, user space software can start
>> live migration, and vice versa.
>>
>> Note: if a mdev device does not support live migration, it either does
>> not provide a version attribute, or always returns errno when its version
>> attribute is read/written.
>>
>> Cc: Alex Williamson <[email protected]>
>> Cc: Erik Skultety <[email protected]>
>> Cc: "Dr. David Alan Gilbert" <[email protected]>
>> Cc: Cornelia Huck <[email protected]>
>> Cc: "Tian, Kevin" <[email protected]>
>> Cc: Zhenyu Wang <[email protected]>
>> Cc: "Wang, Zhi A" <[email protected]>
>> Cc: Neo Jia <[email protected]>
>> Cc: Kirti Wankhede <[email protected]>
>>
>> Signed-off-by: Yan Zhao <[email protected]>
>> ---
>> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
>> samples/vfio-mdev/mbochs.c | 17 ++++++++++++
>> samples/vfio-mdev/mdpy.c | 16 ++++++++++++
>> samples/vfio-mdev/mtty.c | 16 ++++++++++++
>> 4 files changed, 85 insertions(+)
>>
>> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
>> index c3f69bcaf96e..bc28471c0667 100644
>> --- a/Documentation/vfio-mediated-device.txt
>> +++ b/Documentation/vfio-mediated-device.txt
>> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
>> | | |--- available_instances
>> | | |--- device_api
>> | | |--- description
>> + | | |--- version
>> | | |--- [devices]
>> | |--- [<type-id>]
>> | | |--- create
>> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
>> | | |--- available_instances
>> | | |--- device_api
>> | | |--- description
>> + | | |--- version
>> | | |--- [devices]
>> | |--- [<type-id>]
>> | |--- create
>> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
>> | |--- available_instances
>> | |--- device_api
>> | |--- description
>> + | |--- version
>> | |--- [devices]
>>
>> * [mdev_supported_types]
>> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
>> [<type-id>], device_api, and available_instances are mandatory attributes
>> that should be provided by vendor driver.
>>
>> + version is a mandatory attribute if a mdev device supports live migration.
>> +
>> * [<type-id>]
>>
>> The [<type-id>] name is created by adding the device driver string as a prefix
>> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
>> This attribute should show the number of devices of type <type-id> that can be
>> created.
>>
>> +* version
>> +
>> + This attribute is rw. It is used to check whether two devices are compatible
>> + for live migration. If this attribute is missing, then the corresponding mdev
>> + device is regarded as not supporting live migration.
>> +
>> + It consists of two parts: common part and vendor proprietary part.
>> + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
>> + device type. e.g., for pci device, it is
>> + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
>> + vendor proprietary part: this part is varied in length. vendor driver can
>> + specify any string to identify a device.
>> +
>> + When reading this attribute, it should show device version string of the device
>> + of type <type-id>. If a device does not support live migration, it should
>> + return errno.
>> + When writing a string to this attribute, it returns errno for incompatibility
>> + or returns written string length in compatibility case. If a device does not
>> + support live migration, it always returns errno.
>> +
>> + for example.
>> + # cat \
>> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
>> + 00028086-193b-i915-GVTg_V5_2
>> +
>> + #echo 00028086-193b-i915-GVTg_V5_2 > \
>> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
>> + -bash: echo: write error: Invalid argument
>> +
>
> IIUC this path is against the physical device. IOW, the mgmt app would have
> to first write to the "version" file to choose a version, and then write to
> the "create" file to actually create an virtual device. This has the obvious
> concurrency problem if multiple devices are being created at the same time
> and distinct versions for each device are required. There would need to be
> a locking scheme defined to ensure safety.
>
> Wouldn't it be better if we can pass the desired version when we write to
> the "create" file, so that we avoid any concurrent usage problems. "version"
> could be just a read-only file with a *list* of supported versions.
>
> eg
>
> $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> 5.0
> 5.1
> 5.2
>
> $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
>

I read Alex’s comment that creating an mdev with a specific version is not the intent here. However…

- If the goal is just to check compatibility with migration data, then I think the name should be more explicit, e.g. migration_version instead of version. Otherwise, I read the intent the way Daniel did.

- If we want to provide a list of versions and make it possible to create instances based on a version, I would rather use a directory structure for that, i.e. (replacing cat with ls):

$ ls /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
5.0
5.1
5.2

where each version is a directory that has its own available_instances, device_api, description, create, …

$ echo 83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 > /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version/5.1/create

In addition to not having the problem of two consecutive writes to distinct files, I can imagine contrived cases where available_instances might depend on the version…


Thanks
Christophe

>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

2019-04-24 14:16:48

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 23 Apr 2019 23:39:34 -0400
Yan Zhao <[email protected]> wrote:

> On Tue, Apr 23, 2019 at 11:02:56PM +0800, Alex Williamson wrote:
> > On Tue, 23 Apr 2019 01:41:57 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > > Yan Zhao <[email protected]> wrote:
> > > >
> > > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > > Yan Zhao <[email protected]> wrote:
> > > > > >
> > > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > > migration.
> > > > > >
> > > > > > The Subject: doesn't quite match what's being proposed here.
> > > > > >
> > > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > > > identifies device type. e.g., for pci device, it is
> > > > > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > >
> > > > > > What purpose does this serve? If it's intended as some sort of
> > > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > > migration to devices of the same type? Therefore each type would
> > > > > > already have its own namespace. Also that would make the trailing bit
> > > > > > of the version string listed below in the example redundant. A vendor
> > > > > > is still welcome to include this in their version string if they wish,
> > > > > > but I think the string should be entirely vendor defined.
> > > > > >
> > > > > hi Alex,
> > > > > This common part is a kind of namespace.
> > > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > > if there is a case that one vendor's version string happens to deceive and
> > > > > interfere with another vendor's version checking?
> > > > > e.g.
> > > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > > >
> > > > > In this corner case, the two vendors may regard the two device is
> > > > > migratable but actually they are not.
> > > > >
> > > > > That's the reason for this common part that serve as a kind of namespace
> > > > > that all vendors will comply with to avoid overlap.
> > > >
> > > > If we assume that migration can only occur between matching mdev types,
> > > > this is redundant, each type already has their own namespace.
> > > >
> > > hi Alex,
> > > do you mean user space software like libvirt needs to first check whether
> > > mdev type is matching and then check whether version is matching?
> >
> > Yes.
> >
> may I know the drawback of defining the this common part?
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> By doing so, user space software has no need to first check whether mdev type
> is matching. A confident vendor driver can even allow devices migrating
> between different mdev types.


It's not practical to expect userspace to test the version of every
mdev type in the system to determine a match, let alone across a data
center. Additionally, in order to be migration compatible the mdev
types must be software compatible to the VM, which is the basic
definition of the differences between mdev types. Therefore let me
flip the question around, why would a vendor driver choose to create a
new mdev type for software compatible devices? If the vendor wants to
maintain compatibility, indicate basic compatibility using the same
mdev type. The original intention of the version attribute is to be
entirely opaque to userspace, introducing "common" parts is unnecessary
as above, degrades the original concept, and only defines a solution for
PCI devices. Thanks,

Alex

> > > if user space software only checks version for migration, it means vendor
> > > driver has to include mdev type in their vendor proprietary part string,
> > > right?
> >
> > Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
> > be a failure on the part of the user.
> >
> > > Another thing is that could there be any future mdev parent driver that
> > > applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> > > driver (https://lkml.org/lkml/2019/3/13/114)?
> >
> > For starters, this is just a sample driver from which vendor specific
> > mdev drivers could be forked to support these features, but
> > additionally, the type is defined by the vendor driver, so even a meta
> > driver like vfio-pci-mdev could create types like
> > "vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
> > device. The "vfio-pci-mdev-type1" is just a sample implementation to
> > say "the native type of the thing bound to me" and it's going to have
> > limited usefulness for any sort of persistence to userspace. Thus,
> > it's a sample driver. Thanks,
> Thanks for this explanation:)
>
>
> >
> > > > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > specify any string to identify a device.
> > > > > > >
> > > > > > > When reading this attribute, it should show device version string of the
> > > > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > > > should return errno.
> > > > > > > When writing a string to this attribute, it returns errno for
> > > > > > > incompatibility or returns written string length in compatibility case.
> > > > > > > If a device does not support live migration, it always returns errno.
> > > > > > >
> > > > > > > For user space software to use:
> > > > > > > 1.
> > > > > > > Before starting live migration, user space software first reads source side
> > > > > > > mdev device's version. e.g.
> > > > > > > "#cat \
> > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > > >
> > > > > > > 2.
> > > > > > > Then, user space software writes the source side returned version string
> > > > > > > to device version attribute in target side, and checks the return value.
> > > > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > > > source and target sides are not compatible;
> > > > > > > If a positive number is returned and it equals to the length of written
> > > > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > > > e.g.
> > > > > > > (a) compatibility case
> > > > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > >
> > > > > > > (b) incompatibility case
> > > > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > > -bash: echo: write error: Invalid argument
> > > > > > >
> > > > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > > > live migration, and vice versa.
> > > > > > >
> > > > > > > Note: if a mdev device does not support live migration, it either does
> > > > > > > not provide a version attribute, or always returns errno when its version
> > > > > > > attribute is read/written.
> > > > > >
> > > > > > I think it would be cleaner to do the former, not supply the
> > > > > > attribute. This seems to do the latter in the sample drivers. Thanks,
> > > > > Ok. you are right!
> > > > > what about just keep one sample driver to show how to do the latter,
> > > > > and let the others do the former?
> > > >
> > > > I'd rather that if a vendor driver doesn't support features requiring
> > > > the version attribute that they don't implement it. It's confusing to
> > > > developers looking at the sample driver for guidance if we have
> > > > different implementations. Of course if you'd like to add migration
> > > > support to one of the sample drivers, that'd be very welcome. Thanks,
> > > >
> > > Got it:)
> > >
> > > Thanks!
> > > Yan
> > >
> > > >
> > > > > > > Cc: Alex Williamson <[email protected]>
> > > > > > > Cc: Erik Skultety <[email protected]>
> > > > > > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > > > > > Cc: Cornelia Huck <[email protected]>
> > > > > > > Cc: "Tian, Kevin" <[email protected]>
> > > > > > > Cc: Zhenyu Wang <[email protected]>
> > > > > > > Cc: "Wang, Zhi A" <[email protected]>
> > > > > > > Cc: Neo Jia <[email protected]>
> > > > > > > Cc: Kirti Wankhede <[email protected]>
> > > > > > >
> > > > > > > Signed-off-by: Yan Zhao <[email protected]>
> > > > > > > ---
> > > > > > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > > > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > > > > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > > > > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > > > > > 4 files changed, 85 insertions(+)
> > > > > > >
> > > > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > | | |--- available_instances
> > > > > > > | | |--- device_api
> > > > > > > | | |--- description
> > > > > > > + | | |--- version
> > > > > > > | | |--- [devices]
> > > > > > > | |--- [<type-id>]
> > > > > > > | | |--- create
> > > > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > | | |--- available_instances
> > > > > > > | | |--- device_api
> > > > > > > | | |--- description
> > > > > > > + | | |--- version
> > > > > > > | | |--- [devices]
> > > > > > > | |--- [<type-id>]
> > > > > > > | |--- create
> > > > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > | |--- available_instances
> > > > > > > | |--- device_api
> > > > > > > | |--- description
> > > > > > > + | |--- version
> > > > > > > | |--- [devices]
> > > > > > >
> > > > > > > * [mdev_supported_types]
> > > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > > > that should be provided by vendor driver.
> > > > > > >
> > > > > > > + version is a mandatory attribute if a mdev device supports live migration.
> > > > > > > +
> > > > > > > * [<type-id>]
> > > > > > >
> > > > > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > This attribute should show the number of devices of type <type-id> that can be
> > > > > > > created.
> > > > > > >
> > > > > > > +* version
> > > > > > > +
> > > > > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > > + device is regarded as not supporting live migration.
> > > > > > > +
> > > > > > > + It consists of two parts: common part and vendor proprietary part.
> > > > > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > > + device type. e.g., for pci device, it is
> > > > > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > + specify any string to identify a device.
> > > > > > > +
> > > > > > > + When reading this attribute, it should show device version string of the device
> > > > > > > + of type <type-id>. If a device does not support live migration, it should
> > > > > > > + return errno.
> > > > > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > > > > + or returns written string length in compatibility case. If a device does not
> > > > > > > + support live migration, it always returns errno.
> > > > > > > +
> > > > > > > + for example.
> > > > > > > + # cat \
> > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > > > + 00028086-193b-i915-GVTg_V5_2
> > > > > > > +
> > > > > > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > > > + -bash: echo: write error: Invalid argument
> > > > > > > +
> > > > > > > * [device]
> > > > > > >
> > > > > > > This directory contains links to the devices of type <type-id> that have been
> > > > > > > @@ -327,12 +361,14 @@ card.
> > > > > > > | | |-- available_instances
> > > > > > > | | |-- create
> > > > > > > | | |-- device_api
> > > > > > > + | | |-- version
> > > > > > > | | |-- devices
> > > > > > > | | `-- name
> > > > > > > | `-- mtty-2
> > > > > > > | |-- available_instances
> > > > > > > | |-- create
> > > > > > > | |-- device_api
> > > > > > > + | |-- version
> > > > > > > | |-- devices
> > > > > > > | `-- name
> > > > > > > |-- mtty_dev
> > > > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > }
> > > > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > > > >
> > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > + char *buf)
> > > > > > > +{
> > > > > > > + /* do not support live migration */
> > > > > > > + return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > + const char *buf, size_t count)
> > > > > > > +{
> > > > > > > + /* do not support live migration */
> > > > > > > + return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > +
> > > > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > > > &mdev_type_attr_name.attr,
> > > > > > > &mdev_type_attr_description.attr,
> > > > > > > &mdev_type_attr_device_api.attr,
> > > > > > > &mdev_type_attr_available_instances.attr,
> > > > > > > + &mdev_type_attr_version.attr,
> > > > > > > NULL,
> > > > > > > };
> > > > > > >
> > > > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > }
> > > > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > > > >
> > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > + char *buf)
> > > > > > > +{
> > > > > > > + /* do not support live migration */
> > > > > > > + return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > + const char *buf, size_t count)
> > > > > > > +{
> > > > > > > + /* do not support live migration */
> > > > > > > + return -EINVAL;
> > > > > > > +}
> > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > +
> > > > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > > > &mdev_type_attr_name.attr,
> > > > > > > &mdev_type_attr_description.attr,
> > > > > > > &mdev_type_attr_device_api.attr,
> > > > > > > &mdev_type_attr_available_instances.attr,
> > > > > > > + &mdev_type_attr_version.attr,
> > > > > > > NULL,
> > > > > > > };
> > > > > > >
> > > > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > >
> > > > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > > > >
> > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > + char *buf)
> > > > > > > +{
> > > > > > > + /* do not support live migration */
> > > > > > > + return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > + const char *buf, size_t count)
> > > > > > > +{
> > > > > > > + /* do not support live migration */
> > > > > > > + return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > > > &mdev_type_attr_name.attr,
> > > > > > > &mdev_type_attr_device_api.attr,
> > > > > > > &mdev_type_attr_available_instances.attr,
> > > > > > > + &mdev_type_attr_version.attr,
> > > > > > > NULL,
> > > > > > > };
> > > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > intel-gvt-dev mailing list
> > > > > > [email protected]
> > > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> > > >
> > > > _______________________________________________
> > > > intel-gvt-dev mailing list
> > > > [email protected]
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> >
> > _______________________________________________
> > intel-gvt-dev mailing list
> > [email protected]
> > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-26 02:05:16

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Wed, Apr 24, 2019 at 10:14:50PM +0800, Alex Williamson wrote:
> On Tue, 23 Apr 2019 23:39:34 -0400
> Yan Zhao <[email protected]> wrote:
>
> > On Tue, Apr 23, 2019 at 11:02:56PM +0800, Alex Williamson wrote:
> > > On Tue, 23 Apr 2019 01:41:57 -0400
> > > Yan Zhao <[email protected]> wrote:
> > >
> > > > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > > > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > > > Yan Zhao <[email protected]> wrote:
> > > > >
> > > > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > > > Yan Zhao <[email protected]> wrote:
> > > > > > >
> > > > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > > > migration.
> > > > > > >
> > > > > > > The Subject: doesn't quite match what's being proposed here.
> > > > > > >
> > > > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > > > > identifies device type. e.g., for pci device, it is
> > > > > > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > >
> > > > > > > What purpose does this serve? If it's intended as some sort of
> > > > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > > > migration to devices of the same type? Therefore each type would
> > > > > > > already have its own namespace. Also that would make the trailing bit
> > > > > > > of the version string listed below in the example redundant. A vendor
> > > > > > > is still welcome to include this in their version string if they wish,
> > > > > > > but I think the string should be entirely vendor defined.
> > > > > > >
> > > > > > hi Alex,
> > > > > > This common part is a kind of namespace.
> > > > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > > > if there is a case that one vendor's version string happens to deceive and
> > > > > > interfere with another vendor's version checking?
> > > > > > e.g.
> > > > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > > > >
> > > > > > In this corner case, the two vendors may regard the two device is
> > > > > > migratable but actually they are not.
> > > > > >
> > > > > > That's the reason for this common part that serve as a kind of namespace
> > > > > > that all vendors will comply with to avoid overlap.
> > > > >
> > > > > If we assume that migration can only occur between matching mdev types,
> > > > > this is redundant, each type already has their own namespace.
> > > > >
> > > > hi Alex,
> > > > do you mean user space software like libvirt needs to first check whether
> > > > mdev type is matching and then check whether version is matching?
> > >
> > > Yes.
> > >
> > may I know the drawback of defining the this common part?
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > By doing so, user space software has no need to first check whether mdev type
> > is matching. A confident vendor driver can even allow devices migrating
> > between different mdev types.
>
>
> It's not practical to expect userspace to test the version of every
> mdev type in the system to determine a match, let alone across a data
> center. Additionally, in order to be migration compatible the mdev
> types must be software compatible to the VM, which is the basic
> definition of the differences between mdev types. Therefore let me
> flip the question around, why would a vendor driver choose to create a
> new mdev type for software compatible devices? If the vendor wants to
> maintain compatibility, indicate basic compatibility using the same
> mdev type. The original intention of the version attribute is to be
> entirely opaque to userspace, introducing "common" parts is unnecessary
> as above, degrades the original concept, and only defines a solution for
> PCI devices. Thanks,
>
ok. Got it. Thanks for explanation.
I'll remove this common part in next revision.


>
> > > > if user space software only checks version for migration, it means vendor
> > > > driver has to include mdev type in their vendor proprietary part string,
> > > > right?
> > >
> > > Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
> > > be a failure on the part of the user.
> > >
> > > > Another thing is that could there be any future mdev parent driver that
> > > > applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> > > > driver (https://lkml.org/lkml/2019/3/13/114)?
> > >
> > > For starters, this is just a sample driver from which vendor specific
> > > mdev drivers could be forked to support these features, but
> > > additionally, the type is defined by the vendor driver, so even a meta
> > > driver like vfio-pci-mdev could create types like
> > > "vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
> > > device. The "vfio-pci-mdev-type1" is just a sample implementation to
> > > say "the native type of the thing bound to me" and it's going to have
> > > limited usefulness for any sort of persistence to userspace. Thus,
> > > it's a sample driver. Thanks,
> > Thanks for this explanation:)
> >
> >
> > >
> > > > > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > > specify any string to identify a device.
> > > > > > > >
> > > > > > > > When reading this attribute, it should show device version string of the
> > > > > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > > > > should return errno.
> > > > > > > > When writing a string to this attribute, it returns errno for
> > > > > > > > incompatibility or returns written string length in compatibility case.
> > > > > > > > If a device does not support live migration, it always returns errno.
> > > > > > > >
> > > > > > > > For user space software to use:
> > > > > > > > 1.
> > > > > > > > Before starting live migration, user space software first reads source side
> > > > > > > > mdev device's version. e.g.
> > > > > > > > "#cat \
> > > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > > > >
> > > > > > > > 2.
> > > > > > > > Then, user space software writes the source side returned version string
> > > > > > > > to device version attribute in target side, and checks the return value.
> > > > > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > > > > source and target sides are not compatible;
> > > > > > > > If a positive number is returned and it equals to the length of written
> > > > > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > > > > e.g.
> > > > > > > > (a) compatibility case
> > > > > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > > >
> > > > > > > > (b) incompatibility case
> > > > > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > > > -bash: echo: write error: Invalid argument
> > > > > > > >
> > > > > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > > > > live migration, and vice versa.
> > > > > > > >
> > > > > > > > Note: if a mdev device does not support live migration, it either does
> > > > > > > > not provide a version attribute, or always returns errno when its version
> > > > > > > > attribute is read/written.
> > > > > > >
> > > > > > > I think it would be cleaner to do the former, not supply the
> > > > > > > attribute. This seems to do the latter in the sample drivers. Thanks,
> > > > > > Ok. you are right!
> > > > > > what about just keep one sample driver to show how to do the latter,
> > > > > > and let the others do the former?
> > > > >
> > > > > I'd rather that if a vendor driver doesn't support features requiring
> > > > > the version attribute that they don't implement it. It's confusing to
> > > > > developers looking at the sample driver for guidance if we have
> > > > > different implementations. Of course if you'd like to add migration
> > > > > support to one of the sample drivers, that'd be very welcome. Thanks,
> > > > >
> > > > Got it:)
> > > >
> > > > Thanks!
> > > > Yan
> > > >
> > > > >
> > > > > > > > Cc: Alex Williamson <[email protected]>
> > > > > > > > Cc: Erik Skultety <[email protected]>
> > > > > > > > Cc: "Dr. David Alan Gilbert" <[email protected]>
> > > > > > > > Cc: Cornelia Huck <[email protected]>
> > > > > > > > Cc: "Tian, Kevin" <[email protected]>
> > > > > > > > Cc: Zhenyu Wang <[email protected]>
> > > > > > > > Cc: "Wang, Zhi A" <[email protected]>
> > > > > > > > Cc: Neo Jia <[email protected]>
> > > > > > > > Cc: Kirti Wankhede <[email protected]>
> > > > > > > >
> > > > > > > > Signed-off-by: Yan Zhao <[email protected]>
> > > > > > > > ---
> > > > > > > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > > > > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++
> > > > > > > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++
> > > > > > > > samples/vfio-mdev/mtty.c | 16 ++++++++++++
> > > > > > > > 4 files changed, 85 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > > | | |--- available_instances
> > > > > > > > | | |--- device_api
> > > > > > > > | | |--- description
> > > > > > > > + | | |--- version
> > > > > > > > | | |--- [devices]
> > > > > > > > | |--- [<type-id>]
> > > > > > > > | | |--- create
> > > > > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > > | | |--- available_instances
> > > > > > > > | | |--- device_api
> > > > > > > > | | |--- description
> > > > > > > > + | | |--- version
> > > > > > > > | | |--- [devices]
> > > > > > > > | |--- [<type-id>]
> > > > > > > > | |--- create
> > > > > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > > | |--- available_instances
> > > > > > > > | |--- device_api
> > > > > > > > | |--- description
> > > > > > > > + | |--- version
> > > > > > > > | |--- [devices]
> > > > > > > >
> > > > > > > > * [mdev_supported_types]
> > > > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > > > > that should be provided by vendor driver.
> > > > > > > >
> > > > > > > > + version is a mandatory attribute if a mdev device supports live migration.
> > > > > > > > +
> > > > > > > > * [<type-id>]
> > > > > > > >
> > > > > > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > > This attribute should show the number of devices of type <type-id> that can be
> > > > > > > > created.
> > > > > > > >
> > > > > > > > +* version
> > > > > > > > +
> > > > > > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > > > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > > > + device is regarded as not supporting live migration.
> > > > > > > > +
> > > > > > > > + It consists of two parts: common part and vendor proprietary part.
> > > > > > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > > > + device type. e.g., for pci device, it is
> > > > > > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > > + specify any string to identify a device.
> > > > > > > > +
> > > > > > > > + When reading this attribute, it should show device version string of the device
> > > > > > > > + of type <type-id>. If a device does not support live migration, it should
> > > > > > > > + return errno.
> > > > > > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > > > > > + or returns written string length in compatibility case. If a device does not
> > > > > > > > + support live migration, it always returns errno.
> > > > > > > > +
> > > > > > > > + for example.
> > > > > > > > + # cat \
> > > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > > > > + 00028086-193b-i915-GVTg_V5_2
> > > > > > > > +
> > > > > > > > + #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > > > > + -bash: echo: write error: Invalid argument
> > > > > > > > +
> > > > > > > > * [device]
> > > > > > > >
> > > > > > > > This directory contains links to the devices of type <type-id> that have been
> > > > > > > > @@ -327,12 +361,14 @@ card.
> > > > > > > > | | |-- available_instances
> > > > > > > > | | |-- create
> > > > > > > > | | |-- device_api
> > > > > > > > + | | |-- version
> > > > > > > > | | |-- devices
> > > > > > > > | | `-- name
> > > > > > > > | `-- mtty-2
> > > > > > > > | |-- available_instances
> > > > > > > > | |-- create
> > > > > > > > | |-- device_api
> > > > > > > > + | |-- version
> > > > > > > > | |-- devices
> > > > > > > > | `-- name
> > > > > > > > |-- mtty_dev
> > > > > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > > }
> > > > > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > > > > >
> > > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > > + char *buf)
> > > > > > > > +{
> > > > > > > > + /* do not support live migration */
> > > > > > > > + return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > > + const char *buf, size_t count)
> > > > > > > > +{
> > > > > > > > + /* do not support live migration */
> > > > > > > > + return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > > +
> > > > > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > > > > &mdev_type_attr_name.attr,
> > > > > > > > &mdev_type_attr_description.attr,
> > > > > > > > &mdev_type_attr_device_api.attr,
> > > > > > > > &mdev_type_attr_available_instances.attr,
> > > > > > > > + &mdev_type_attr_version.attr,
> > > > > > > > NULL,
> > > > > > > > };
> > > > > > > >
> > > > > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > > }
> > > > > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > > > > >
> > > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > > + char *buf)
> > > > > > > > +{
> > > > > > > > + /* do not support live migration */
> > > > > > > > + return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > > + const char *buf, size_t count)
> > > > > > > > +{
> > > > > > > > + /* do not support live migration */
> > > > > > > > + return -EINVAL;
> > > > > > > > +}
> > > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > > +
> > > > > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > > > > &mdev_type_attr_name.attr,
> > > > > > > > &mdev_type_attr_description.attr,
> > > > > > > > &mdev_type_attr_device_api.attr,
> > > > > > > > &mdev_type_attr_available_instances.attr,
> > > > > > > > + &mdev_type_attr_version.attr,
> > > > > > > > NULL,
> > > > > > > > };
> > > > > > > >
> > > > > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > >
> > > > > > > > MDEV_TYPE_ATTR_RO(device_api);
> > > > > > > >
> > > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > > + char *buf)
> > > > > > > > +{
> > > > > > > > + /* do not support live migration */
> > > > > > > > + return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > > + const char *buf, size_t count)
> > > > > > > > +{
> > > > > > > > + /* do not support live migration */
> > > > > > > > + return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > > static struct attribute *mdev_types_attrs[] = {
> > > > > > > > &mdev_type_attr_name.attr,
> > > > > > > > &mdev_type_attr_device_api.attr,
> > > > > > > > &mdev_type_attr_available_instances.attr,
> > > > > > > > + &mdev_type_attr_version.attr,
> > > > > > > > NULL,
> > > > > > > > };
> > > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > intel-gvt-dev mailing list
> > > > > > > [email protected]
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> > > > >
> > > > > _______________________________________________
> > > > > intel-gvt-dev mailing list
> > > > > [email protected]
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> > >
> > > _______________________________________________
> > > intel-gvt-dev mailing list
> > > [email protected]
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
>
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-04-30 15:30:57

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Wed, 24 Apr 2019 04:15:58 -0400
Yan Zhao <[email protected]> wrote:

> On Wed, Apr 24, 2019 at 03:56:24PM +0800, Cornelia Huck wrote:
> > On Tue, 23 Apr 2019 23:10:37 -0400
> > Yan Zhao <[email protected]> wrote:
> >
> > > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <[email protected]> wrote:

> > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > that should be provided by vendor driver.
> > > > >
> > > > > + version is a mandatory attribute if a mdev device supports live migration.
> > > >
> > > > What about "An mdev device wishing to support live migration must
> > > > provide the version attribute."?
> > > yes, I just want to keep consistent with the line above it
> > > " [<type-id>], device_api, and available_instances are mandatory attributes
> > > that should be provided by vendor driver."
> > > what about below one?
> > > "version is a mandatory attribute if a mdev device wishing to support live
> > > migration."
> >
> > My point is that an attribute is not mandatory if it can be left out :)
> > (I'm not a native speaker, though; maybe this makes perfect sense
> > after all?)
> >
> > Maybe "version is a required attribute if live migration is supported
> > for an mdev device"?
> >
> you are right, "mandatory" may bring some confusion.
> Maybe
> "vendor driver must provide version attribute for an mdev device wishing to
> support live migration." ?
> based on your first version :)

"The vendor driver must provide the version attribute for any mdev
device it wishes to support live migration for." ?

>
> > >
> > >
> > > > > +
> > > > > * [<type-id>]
> > > > >
> > > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > This attribute should show the number of devices of type <type-id> that can be
> > > > > created.
> > > > >
> > > > > +* version
> > > > > +
> > > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > > + device is regarded as not supporting live migration.
> > > > > +
> > > > > + It consists of two parts: common part and vendor proprietary part.
> > > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > + device type. e.g., for pci device, it is
> > > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > > + specify any string to identify a device.
> > > > > +
> > > > > + When reading this attribute, it should show device version string of the device
> > > > > + of type <type-id>. If a device does not support live migration, it should
> > > > > + return errno.
> > > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > > + or returns written string length in compatibility case. If a device does not
> > > > > + support live migration, it always returns errno.
> > > >
> > > > I'm not sure whether a device that does not support live migration
> > > > should expose this attribute in the first place. Or is that to cover
> > > > cases where a driver supports live migration only for some of the
> > > > devices it supports?
> > > yes, driver returning error code is to cover the cases where only part of devices it
> > > supports can be migrated.
> > >
> > >
> > > > Also, I'm not sure if a string that has to be parsed is a good idea...
> > > > is this 'version' attribute supposed to convey some human-readable
> > > > information as well? The procedure you describe for compatibility
> > > > checking does the checking within the vendor driver which I would
> > > > expect to have a table/rules for that anyway.
> > > right. if a vendor driver has the confidence to migrate between devices of
> > > diffent platform or mdev types, it can maintain a compatibility table for that
> > > purpose. That's the reason why we would leave the compatibility check to vendor
> > > driver. vendor driver can freely choose its own complicated way to decide
> > > which device is migratable to which device.
> >
> > I think there are two scenarios here:
> > - Migrating between different device types, which is unlikely to work,
> > except in special cases.
> > - Migrating between different versions of the same device type, which
> > may work for some drivers/devices (and at least migrating to a newer
> > version looks quite reasonable).
> >
> > But both should be something that is decided by the individual driver;
> > I hope we don't want to support migration between different drivers :-O
> >
> > Can we make this a driver-defined format?
> >
> yes, this is indeed driver-defined format.
> Actually we define it into two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> identifies device type. e.g., for pci device, it is
> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
> specify any string to identify a device.
>
> vendor proprietary part is defined by vendor driver. vendor driver can
> define any format it wishes to use. Also it is its own responsibility to
> ensure backward compatibility if it wants to update format definition in this
> part.
>
> So user space only needs to get source side's version string, and asks
> target side whether the two are compatible. The decision maker is the
> vendor driver:)

If I followed the discussion correctly, I think you plan to drop this
format, don't you? I'd be happy if a vendor driver can use a simple
number without any prefixes if it so chooses.

I also like the idea of renaming this "migration_version" so that it is
clear we're dealing with versioning of the migration capability (and
not a version of the device or so).

2019-05-07 05:46:04

by Yan Zhao

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, Apr 30, 2019 at 11:29:08PM +0800, Cornelia Huck wrote:
> On Wed, 24 Apr 2019 04:15:58 -0400
> Yan Zhao <[email protected]> wrote:
>
> > On Wed, Apr 24, 2019 at 03:56:24PM +0800, Cornelia Huck wrote:
> > > On Tue, 23 Apr 2019 23:10:37 -0400
> > > Yan Zhao <[email protected]> wrote:
> > >
> > > > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > Yan Zhao <[email protected]> wrote:
>
> > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > > that should be provided by vendor driver.
> > > > > >
> > > > > > + version is a mandatory attribute if a mdev device supports live migration.
> > > > >
> > > > > What about "An mdev device wishing to support live migration must
> > > > > provide the version attribute."?
> > > > yes, I just want to keep consistent with the line above it
> > > > " [<type-id>], device_api, and available_instances are mandatory attributes
> > > > that should be provided by vendor driver."
> > > > what about below one?
> > > > "version is a mandatory attribute if a mdev device wishing to support live
> > > > migration."
> > >
> > > My point is that an attribute is not mandatory if it can be left out :)
> > > (I'm not a native speaker, though; maybe this makes perfect sense
> > > after all?)
> > >
> > > Maybe "version is a required attribute if live migration is supported
> > > for an mdev device"?
> > >
> > you are right, "mandatory" may bring some confusion.
> > Maybe
> > "vendor driver must provide version attribute for an mdev device wishing to
> > support live migration." ?
> > based on your first version :)
>
> "The vendor driver must provide the version attribute for any mdev
> device it wishes to support live migration for." ?
>
> >
> > > >
> > > >
> > > > > > +
> > > > > > * [<type-id>]
> > > > > >
> > > > > > The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > This attribute should show the number of devices of type <type-id> that can be
> > > > > > created.
> > > > > >
> > > > > > +* version
> > > > > > +
> > > > > > + This attribute is rw. It is used to check whether two devices are compatible
> > > > > > + for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > + device is regarded as not supporting live migration.
> > > > > > +
> > > > > > + It consists of two parts: common part and vendor proprietary part.
> > > > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > + device type. e.g., for pci device, it is
> > > > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > + vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > + specify any string to identify a device.
> > > > > > +
> > > > > > + When reading this attribute, it should show device version string of the device
> > > > > > + of type <type-id>. If a device does not support live migration, it should
> > > > > > + return errno.
> > > > > > + When writing a string to this attribute, it returns errno for incompatibility
> > > > > > + or returns written string length in compatibility case. If a device does not
> > > > > > + support live migration, it always returns errno.
> > > > >
> > > > > I'm not sure whether a device that does not support live migration
> > > > > should expose this attribute in the first place. Or is that to cover
> > > > > cases where a driver supports live migration only for some of the
> > > > > devices it supports?
> > > > yes, driver returning error code is to cover the cases where only part of devices it
> > > > supports can be migrated.
> > > >
> > > >
> > > > > Also, I'm not sure if a string that has to be parsed is a good idea...
> > > > > is this 'version' attribute supposed to convey some human-readable
> > > > > information as well? The procedure you describe for compatibility
> > > > > checking does the checking within the vendor driver which I would
> > > > > expect to have a table/rules for that anyway.
> > > > right. if a vendor driver has the confidence to migrate between devices of
> > > > diffent platform or mdev types, it can maintain a compatibility table for that
> > > > purpose. That's the reason why we would leave the compatibility check to vendor
> > > > driver. vendor driver can freely choose its own complicated way to decide
> > > > which device is migratable to which device.
> > >
> > > I think there are two scenarios here:
> > > - Migrating between different device types, which is unlikely to work,
> > > except in special cases.
> > > - Migrating between different versions of the same device type, which
> > > may work for some drivers/devices (and at least migrating to a newer
> > > version looks quite reasonable).
> > >
> > > But both should be something that is decided by the individual driver;
> > > I hope we don't want to support migration between different drivers :-O
> > >
> > > Can we make this a driver-defined format?
> > >
> > yes, this is indeed driver-defined format.
> > Actually we define it into two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > identifies device type. e.g., for pci device, it is
> > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> > specify any string to identify a device.
> >
> > vendor proprietary part is defined by vendor driver. vendor driver can
> > define any format it wishes to use. Also it is its own responsibility to
> > ensure backward compatibility if it wants to update format definition in this
> > part.
> >
> > So user space only needs to get source side's version string, and asks
> > target side whether the two are compatible. The decision maker is the
> > vendor driver:)
>
> If I followed the discussion correctly, I think you plan to drop this
> format, don't you? I'd be happy if a vendor driver can use a simple
> number without any prefixes if it so chooses.
>
> I also like the idea of renaming this "migration_version" so that it is
> clear we're dealing with versioning of the migration capability (and
> not a version of the device or so).
hi Cornelia,
sorry I just saw this mail after sending v2 of this patch set...
yes, I dropped the common part and vendor driver now can define whatever it
wishes to identify a device version.
However, I don't agree to rename it to "migration_version", as it still may
bring some kind of confusing with the migration version a vendor driver is
using, e.g. vendor driver changes migration code and increases that migration
version.
In fact, what info we want to get from this attribute is whether this mdev
device is compatible with another mdev device, which is tied to device, and not
necessarily bound to migration.

do you think so?

Thanks
Yan
> _______________________________________________
> intel-gvt-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev

2019-05-07 08:54:22

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

On Tue, 7 May 2019 01:39:13 -0400
Yan Zhao <[email protected]> wrote:

> On Tue, Apr 30, 2019 at 11:29:08PM +0800, Cornelia Huck wrote:

> > If I followed the discussion correctly, I think you plan to drop this
> > format, don't you? I'd be happy if a vendor driver can use a simple
> > number without any prefixes if it so chooses.
> >
> > I also like the idea of renaming this "migration_version" so that it is
> > clear we're dealing with versioning of the migration capability (and
> > not a version of the device or so).
> hi Cornelia,
> sorry I just saw this mail after sending v2 of this patch set...
> yes, I dropped the common part and vendor driver now can define whatever it
> wishes to identify a device version.

Ok, I'll look at v2.

> However, I don't agree to rename it to "migration_version", as it still may
> bring some kind of confusing with the migration version a vendor driver is
> using, e.g. vendor driver changes migration code and increases that migration
> version.
> In fact, what info we want to get from this attribute is whether this mdev
> device is compatible with another mdev device, which is tied to device, and not
> necessarily bound to migration.
>
> do you think so?

I'm not 100% convinced; but we can continue the discussion on v2.