2021-09-08 12:29:28

by Wu Zongyong

Subject: [PATCH 0/6] vDPA driver for legacy virtio-pci device

This series implements a vDPA driver for legacy virtio-pci devices.
Currently the only vDPA driver for virtio-pci devices supports modern
devices, but some legacy virtio-pci devices conform to version 0.9.x or
older of the virtio-pci specification. For example, the ENI (Elastic
Network Interface) of Alibaba ECS bare-metal instances is a hardware
virtio network device that follows the Virtio PCI Card 0.9.5 Draft
specification. Such legacy devices behave differently from modern
virtio-pci devices in some respects, so the common code is split out
and the modern-device-specific code is moved to a separate file.

The specification provides no way to negotiate the virtqueue size for
legacy devices. So a new callback, get_vq_num_unchangeable, is
introduced to tell users not to try to change the virtqueue size of a
legacy vdpa device. For example, QEMU should not allocate memory for a
virtqueue according to the tx_queue_size and rx_queue_size properties
if a legacy virtio-pci device is used as the vhost-vdpa backend.
Instead, QEMU should first use the new callback
get_vq_num_unchangeable to check whether the vdpa device supports
changing the virtqueue size. If it does not, QEMU should call the
callback get_vq_num_max to get the fixed virtqueue size and then
allocate memory of that size for the virtqueue.
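
As a rough illustration of the size-selection flow described here, the
following is a minimal userspace sketch. It is not QEMU or kernel code:
struct backend_caps and choose_vq_size() are hypothetical names, and
clamping the request for a resizable device is an assumption made only
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical summary of what a vhost-vdpa backend reports. The fields
 * are illustrative only and do not mirror a real kernel or QEMU struct.
 */
struct backend_caps {
	bool vq_num_unchangeable;	/* result of get_vq_num_unchangeable */
	uint16_t vq_num_max;		/* result of get_vq_num_max */
};

/*
 * Pick the ring size to allocate: honour the requested size only when
 * the device allows the size to change, otherwise use the fixed size.
 */
static uint16_t choose_vq_size(const struct backend_caps *caps,
			       uint16_t requested)
{
	assert(caps->vq_num_max != 0);

	if (caps->vq_num_unchangeable)
		return caps->vq_num_max;

	/* Assumption: a resizable device still caps the size at its max. */
	return requested < caps->vq_num_max ? requested : caps->vq_num_max;
}
```

With a device that reports an unchangeable size of 256, any requested
size is ignored and 256 entries are allocated; with a resizable device
the request is used as long as it does not exceed the maximum.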

This series has been tested with the ENI on an Alibaba ECS bare-metal
instance.

These patches are still under consideration; comments are welcome.


Wu Zongyong (6):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vp_vdpa: split out reusable and device specific codes to separate file
vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
vp_vdpa: introduce legacy virtio pci driver

drivers/vdpa/Kconfig | 7 +
drivers/vdpa/virtio_pci/Makefile | 3 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
drivers/vhost/vdpa.c | 19 ++
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++-----
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
drivers/virtio/virtio_vdpa.c | 5 +-
include/linux/vdpa.h | 6 +-
include/linux/virtio_pci_legacy.h | 44 +++
include/uapi/linux/vhost.h | 2 +
18 files changed, 1320 insertions(+), 85 deletions(-)
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1


2021-09-08 12:29:36

by Wu Zongyong

Subject: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

This new callback indicates whether the vring size can be changed. It
is useful when a legacy virtio-pci device is used as the vdpa device,
because the specification provides no way to negotiate the vring size.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vhost/vdpa.c | 19 +++++++++++++++++++
drivers/virtio/virtio_vdpa.c | 5 ++++-
include/linux/vdpa.h | 4 ++++
include/uapi/linux/vhost.h | 2 ++
4 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 9479f7f79217..2204d27d1e5d 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
return 0;
}

+static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
+ u32 __user *argp)
+{
+ struct vdpa_device *vdpa = v->vdpa;
+ const struct vdpa_config_ops *ops = vdpa->config;
+ bool unchangeable = false;
+
+ if (ops->get_vq_num_unchangeable)
+ unchangeable = ops->get_vq_num_unchangeable(vdpa);
+
+ if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
+ return -EFAULT;
+
+ return 0;
+}
+
static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
void __user *argp)
{
@@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_GET_IOVA_RANGE:
r = vhost_vdpa_get_iova_range(v, argp);
break;
+ case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
+ r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
+ break;
default:
r = vhost_dev_ioctl(&v->vdev, cmd, argp);
if (r == -ENOIOCTLCMD)
diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..afb47465307a 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
struct vdpa_vq_state state = {0};
unsigned long flags;
u32 align, num;
+ bool may_reduce_num = true;
int err;

if (!name)
@@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,

/* Create the vring */
align = ops->get_vq_align(vdpa);
+ if (ops->get_vq_num_unchangeable)
+ may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 35648c11e312..f809b7ada00d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -195,6 +195,9 @@ struct vdpa_iova_range {
* @vdev: vdpa device
* Returns the iova range supported by
* the device.
+ * @get_vq_num_unchangeable: Check if size of virtqueue is unchangeable (optional)
+ * @vdev: vdpa device
+ * Returns boolean: unchangeable (true) or not (false)
* @set_map: Set device memory mapping (optional)
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
@@ -262,6 +265,7 @@ struct vdpa_config_ops {
const void *buf, unsigned int len);
u32 (*get_generation)(struct vdpa_device *vdev);
struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
+ bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);

/* DMA ops */
int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index c998860d7bbc..184f1f7f8498 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -150,4 +150,6 @@
/* Get the valid iova range */
#define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
struct vhost_vdpa_iova_range)
+/* Check if the vring size can be changed */
+#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
#endif
--
2.31.1

2021-09-08 12:29:53

by Wu Zongyong

Subject: [PATCH 4/6] vp_vdpa: split out reusable and device specific codes to separate file

Split out the code that can be reused later for legacy virtio-pci
devices into vp_vdpa_common.{h,c}, and move the modern-device-specific
code to vp_vdpa_modern.c.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/virtio_pci/Makefile | 2 +
drivers/vdpa/virtio_pci/vp_vdpa_common.c | 215 +++++++++++++++
drivers/vdpa/virtio_pci/vp_vdpa_common.h | 56 ++++
drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++++
4 files changed, 600 insertions(+)
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c

diff --git a/drivers/vdpa/virtio_pci/Makefile b/drivers/vdpa/virtio_pci/Makefile
index 231088d3af7d..a772d86952b1 100644
--- a/drivers/vdpa/virtio_pci/Makefile
+++ b/drivers/vdpa/virtio_pci/Makefile
@@ -1,2 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
+
+vp_vdpa-y += vp_vdpa_common.o vp_vdpa_modern.o
obj-$(CONFIG_VP_VDPA) += vp_vdpa.o
diff --git a/drivers/vdpa/virtio_pci/vp_vdpa_common.c b/drivers/vdpa/virtio_pci/vp_vdpa_common.c
new file mode 100644
index 000000000000..3ff24c9ad6e4
--- /dev/null
+++ b/drivers/vdpa/virtio_pci/vp_vdpa_common.c
@@ -0,0 +1,215 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for modern virtio-pci device
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang <[email protected]>
+ *
+ * Based on virtio_pci_modern.c.
+ */
+
+#include <linux/irqreturn.h>
+#include <linux/interrupt.h>
+#include "vp_vdpa_common.h"
+
+int vp_vdpa_get_vq_irq(struct vdpa_device *vdev, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdev);
+
+ return vp_vdpa->vring[idx].irq;
+}
+
+void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
+{
+ struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
+ struct pci_dev *pdev = mdev->pci_dev;
+ int i;
+
+ for (i = 0; i < vp_vdpa->queues; i++) {
+ if (vp_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_modern_queue_vector(mdev, i, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, vp_vdpa->vring[i].irq,
+ &vp_vdpa->vring[i]);
+ vp_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ }
+ }
+
+ if (vp_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_modern_config_vector(mdev, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, vp_vdpa->config_irq, vp_vdpa);
+ vp_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+ }
+
+ if (vp_vdpa->vectors) {
+ pci_free_irq_vectors(pdev);
+ vp_vdpa->vectors = 0;
+ }
+}
+
+static irqreturn_t vp_vdpa_vq_handler(int irq, void *arg)
+{
+ struct vp_vring *vring = arg;
+
+ if (vring->cb.callback)
+ return vring->cb.callback(vring->cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t vp_vdpa_config_handler(int irq, void *arg)
+{
+ struct vp_vdpa *vp_vdpa = arg;
+
+ if (vp_vdpa->config_cb.callback)
+ return vp_vdpa->config_cb.callback(vp_vdpa->config_cb.private);
+
+ return IRQ_HANDLED;
+}
+
+int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa)
+{
+ struct pci_dev *pdev = vp_vdpa->pci_dev;
+ int i, ret, irq;
+ int queues = vp_vdpa->queues;
+ int vectors = queues + 1;
+
+ ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
+ if (ret != vectors) {
+ dev_err(&pdev->dev,
+ "vp_vdpa: fail to allocate irq vectors want %d but %d\n",
+ vectors, ret);
+ return ret;
+ }
+
+ vp_vdpa->vectors = vectors;
+
+ for (i = 0; i < queues; i++) {
+ snprintf(vp_vdpa->vring[i].msix_name, VP_VDPA_NAME_SIZE,
+ "vp-vdpa[%s]-%d\n", pci_name(pdev), i);
+ irq = pci_irq_vector(pdev, i);
+ ret = devm_request_irq(&pdev->dev, irq,
+ vp_vdpa_vq_handler,
+ 0, vp_vdpa->vring[i].msix_name,
+ &vp_vdpa->vring[i]);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "vp_vdpa: fail to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_vdpa->queue_vector(vp_vdpa, i, i);
+ vp_vdpa->vring[i].irq = irq;
+ }
+
+ snprintf(vp_vdpa->msix_name, VP_VDPA_NAME_SIZE, "vp-vdpa[%s]-config\n",
+ pci_name(pdev));
+ irq = pci_irq_vector(pdev, queues);
+ ret = devm_request_irq(&pdev->dev, irq, vp_vdpa_config_handler, 0,
+ vp_vdpa->msix_name, vp_vdpa);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "vp_vdpa: fail to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_vdpa->config_vector(vp_vdpa, queues);
+ vp_vdpa->config_irq = irq;
+
+ return 0;
+err:
+ vp_vdpa_free_irq(vp_vdpa);
+ return ret;
+}
+
+int vp_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_vq_state *state)
+{
+ /* Note that this is not supported by virtio specification, so
+ * we return -EOPNOTSUPP here. This means we can't support live
+ * migration, vhost device start/stop.
+ */
+ return -EOPNOTSUPP;
+}
+
+void vp_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_callback *cb)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+
+ vp_vdpa->vring[qid].cb = *cb;
+}
+
+void vp_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+
+ vp_iowrite16(qid, vp_vdpa->vring[qid].notify);
+}
+
+u32 vp_vdpa_get_vq_align(struct vdpa_device *vdpa)
+{
+ return PAGE_SIZE;
+}
+
+void vp_vdpa_set_config_cb(struct vdpa_device *vdpa,
+ struct vdpa_callback *cb)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+
+ vp_vdpa->config_cb = *cb;
+}
+
+void vp_vdpa_free_irq_vectors(void *data)
+{
+ pci_free_irq_vectors(data);
+}
+
+static int vp_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct vp_vdpa *vp_vdpa;
+ int ret;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ vp_vdpa = vp_vdpa_modern_probe(pdev);
+ if (IS_ERR(vp_vdpa))
+ return PTR_ERR(vp_vdpa);
+
+ vp_vdpa->pci_dev = pdev;
+
+ pci_set_master(pdev);
+
+ ret = vdpa_register_device(&vp_vdpa->vdpa, vp_vdpa->queues);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to register to vdpa bus\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ put_device(&vp_vdpa->vdpa.dev);
+ return ret;
+}
+
+static void vp_vdpa_remove(struct pci_dev *pdev)
+{
+ struct vp_vdpa *vp_vdpa = pci_get_drvdata(pdev);
+
+ vdpa_unregister_device(&vp_vdpa->vdpa);
+ vp_modern_remove(&vp_vdpa->mdev);
+}
+
+static struct pci_driver vp_vdpa_driver = {
+ .name = "vp-vdpa",
+ .id_table = NULL, /* only dynamic ids */
+ .probe = vp_vdpa_probe,
+ .remove = vp_vdpa_remove,
+};
+
+module_pci_driver(vp_vdpa_driver);
+
+MODULE_AUTHOR("Jason Wang <[email protected]>");
+MODULE_DESCRIPTION("vp-vdpa");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1");
diff --git a/drivers/vdpa/virtio_pci/vp_vdpa_common.h b/drivers/vdpa/virtio_pci/vp_vdpa_common.h
new file mode 100644
index 000000000000..57886b55a2e9
--- /dev/null
+++ b/drivers/vdpa/virtio_pci/vp_vdpa_common.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _DRIVERS_VDPA_VIRTIO_PCI_VP_VDPA_COMMON_H
+#define _DRIVERS_VDPA_VIRTIO_PCI_VP_VDPA_COMMON_H
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/vdpa.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_modern.h>
+
+#define VP_VDPA_DRIVER_NAME "vp_vdpa"
+#define VP_VDPA_NAME_SIZE 256
+
+struct vp_vring {
+ void __iomem *notify;
+ char msix_name[VP_VDPA_NAME_SIZE];
+ struct vdpa_callback cb;
+ resource_size_t notify_pa;
+ int irq;
+};
+
+struct vp_vdpa {
+ struct vdpa_device vdpa;
+ struct pci_dev *pci_dev;
+ struct virtio_pci_modern_device mdev;
+ struct vp_vring *vring;
+ struct vdpa_callback config_cb;
+ char msix_name[VP_VDPA_NAME_SIZE];
+ int config_irq;
+ int queues;
+ int vectors;
+ u16 (*queue_vector)(struct vp_vdpa *vp_vdpa, u16 idx, u16 vector);
+ u16 (*config_vector)(struct vp_vdpa *vp_vdpa, u16 vector);
+};
+
+static struct vp_vdpa *vdpa_to_vp(struct vdpa_device *vdpa)
+{
+ return container_of(vdpa, struct vp_vdpa, vdpa);
+}
+
+int vp_vdpa_get_vq_irq(struct vdpa_device *vdev, u16 idx);
+void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa);
+int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa);
+int vp_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid, struct vdpa_vq_state *state);
+void vp_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid, struct vdpa_callback *cb);
+void vp_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid);
+u32 vp_vdpa_get_vq_align(struct vdpa_device *vdpa);
+void vp_vdpa_set_config_cb(struct vdpa_device *vdpa, struct vdpa_callback *cb);
+void vp_vdpa_free_irq_vectors(void *data);
+
+struct vp_vdpa *vp_vdpa_modern_probe(struct pci_dev *pdev);
+
+#endif
diff --git a/drivers/vdpa/virtio_pci/vp_vdpa_modern.c b/drivers/vdpa/virtio_pci/vp_vdpa_modern.c
new file mode 100644
index 000000000000..13b66edbb27a
--- /dev/null
+++ b/drivers/vdpa/virtio_pci/vp_vdpa_modern.c
@@ -0,0 +1,327 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for modern virtio-pci device
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang <[email protected]>
+ *
+ * Based on virtio_pci_modern.c.
+ */
+
+#include "linux/pci.h"
+#include "linux/vdpa.h"
+#include "vp_vdpa_common.h"
+
+#define VP_VDPA_QUEUE_MAX 256
+
+static struct virtio_pci_modern_device *vdpa_to_mdev(struct vdpa_device *vdpa)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+
+ return &vp_vdpa->mdev;
+}
+
+static u64 vp_vdpa_get_features(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return vp_modern_get_features(mdev);
+}
+
+static int vp_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ vp_modern_set_features(mdev, features);
+
+ return 0;
+}
+
+static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return vp_modern_get_status(mdev);
+}
+
+static void vp_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
+ u8 s = vp_vdpa_get_status(vdpa);
+
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
+ !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
+ vp_vdpa_request_irq(vp_vdpa);
+ }
+
+ vp_modern_set_status(mdev, status);
+
+ if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+ (s & VIRTIO_CONFIG_S_DRIVER_OK))
+ vp_vdpa_free_irq(vp_vdpa);
+}
+
+static u16 vp_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
+{
+ return VP_VDPA_QUEUE_MAX;
+}
+
+static int vp_vdpa_set_vq_state_split(struct vdpa_device *vdpa,
+ const struct vdpa_vq_state *state)
+{
+ const struct vdpa_vq_state_split *split = &state->split;
+
+ if (split->avail_index == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+static int vp_vdpa_set_vq_state_packed(struct vdpa_device *vdpa,
+ const struct vdpa_vq_state *state)
+{
+ const struct vdpa_vq_state_packed *packed = &state->packed;
+
+ if (packed->last_avail_counter == 1 &&
+ packed->last_avail_idx == 0 &&
+ packed->last_used_counter == 1 &&
+ packed->last_used_idx == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+static int vp_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
+ const struct vdpa_vq_state *state)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ /* Note that this is not supported by virtio specification.
+ * But if the state is by chance equal to the device initial
+ * state, we can let it go.
+ */
+ if ((vp_modern_get_status(mdev) & VIRTIO_CONFIG_S_FEATURES_OK) &&
+ !vp_modern_get_queue_enable(mdev, qid)) {
+ if (vp_modern_get_driver_features(mdev) &
+ BIT_ULL(VIRTIO_F_RING_PACKED))
+ return vp_vdpa_set_vq_state_packed(vdpa, state);
+ else
+ return vp_vdpa_set_vq_state_split(vdpa, state);
+ }
+
+ return -EOPNOTSUPP;
+}
+
+static void vp_vdpa_set_vq_ready(struct vdpa_device *vdpa,
+ u16 qid, bool ready)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ vp_modern_set_queue_enable(mdev, qid, ready);
+}
+
+static bool vp_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return vp_modern_get_queue_enable(mdev, qid);
+}
+
+static void vp_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
+ u32 num)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ vp_modern_set_queue_size(mdev, qid, num);
+}
+
+static int vp_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
+ u64 desc_area, u64 driver_area,
+ u64 device_area)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ vp_modern_queue_address(mdev, qid, desc_area,
+ driver_area, device_area);
+
+ return 0;
+}
+
+static u32 vp_vdpa_get_generation(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return vp_modern_generation(mdev);
+}
+
+static u32 vp_vdpa_get_device_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return mdev->id.device;
+}
+
+static u32 vp_vdpa_get_vendor_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return mdev->id.vendor;
+}
+
+static size_t vp_vdpa_get_config_size(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_modern_device *mdev = vdpa_to_mdev(vdpa);
+
+ return mdev->device_len;
+}
+
+static void vp_vdpa_get_config(struct vdpa_device *vdpa,
+ unsigned int offset,
+ void *buf, unsigned int len)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
+ u8 old, new;
+ u8 *p;
+ int i;
+
+ do {
+ old = vp_ioread8(&mdev->common->config_generation);
+ p = buf;
+ for (i = 0; i < len; i++)
+ *p++ = vp_ioread8(mdev->device + offset + i);
+
+ new = vp_ioread8(&mdev->common->config_generation);
+ } while (old != new);
+}
+
+static void vp_vdpa_set_config(struct vdpa_device *vdpa,
+ unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
+ const u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ vp_iowrite8(*p++, mdev->device + offset + i);
+}
+
+static struct vdpa_notification_area
+vp_vdpa_get_vq_notification(struct vdpa_device *vdpa, u16 qid)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
+ struct vdpa_notification_area notify;
+
+ notify.addr = vp_vdpa->vring[qid].notify_pa;
+ notify.size = mdev->notify_offset_multiplier;
+
+ return notify;
+}
+
+static const struct vdpa_config_ops vp_vdpa_ops = {
+ .get_features = vp_vdpa_get_features,
+ .set_features = vp_vdpa_set_features,
+ .get_status = vp_vdpa_get_status,
+ .set_status = vp_vdpa_set_status,
+ .get_vq_num_max = vp_vdpa_get_vq_num_max,
+ .get_vq_state = vp_vdpa_get_vq_state,
+ .get_vq_notification = vp_vdpa_get_vq_notification,
+ .set_vq_state = vp_vdpa_set_vq_state,
+ .set_vq_cb = vp_vdpa_set_vq_cb,
+ .set_vq_ready = vp_vdpa_set_vq_ready,
+ .get_vq_ready = vp_vdpa_get_vq_ready,
+ .set_vq_num = vp_vdpa_set_vq_num,
+ .set_vq_address = vp_vdpa_set_vq_address,
+ .kick_vq = vp_vdpa_kick_vq,
+ .get_generation = vp_vdpa_get_generation,
+ .get_device_id = vp_vdpa_get_device_id,
+ .get_vendor_id = vp_vdpa_get_vendor_id,
+ .get_vq_align = vp_vdpa_get_vq_align,
+ .get_config_size = vp_vdpa_get_config_size,
+ .get_config = vp_vdpa_get_config,
+ .set_config = vp_vdpa_set_config,
+ .set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
+};
+
+static u16 vp_vdpa_queue_vector(struct vp_vdpa *vp_vdpa, u16 idx, u16 vector)
+{
+ return vp_modern_queue_vector(&vp_vdpa->mdev, idx, vector);
+}
+
+static u16 vp_vdpa_config_vector(struct vp_vdpa *vp_vdpa, u16 vector)
+{
+ return vp_modern_config_vector(&vp_vdpa->mdev, vector);
+}
+
+struct vp_vdpa *vp_vdpa_modern_probe(struct pci_dev *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct vp_vdpa *vp_vdpa;
+ struct virtio_pci_modern_device *mdev;
+ int ret, i;
+
+ vp_vdpa = vdpa_alloc_device(struct vp_vdpa, vdpa,
+ dev, &vp_vdpa_ops, NULL);
+ if (IS_ERR(vp_vdpa)) {
+ dev_err(dev, "vp_vdpa: Failed to allocate vDPA structure\n");
+ return vp_vdpa;
+ }
+
+ mdev = &vp_vdpa->mdev;
+ mdev->pci_dev = pdev;
+
+ ret = vp_modern_probe(mdev);
+ if (ret) {
+ dev_err(dev, "Failed to probe modern PCI device\n");
+ goto err;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, vp_vdpa);
+
+ vp_vdpa->vdpa.dma_dev = dev;
+ vp_vdpa->queues = vp_modern_get_num_queues(mdev);
+
+ ret = devm_add_action_or_reset(dev, vp_vdpa_free_irq_vectors, pdev);
+ if (ret) {
+ dev_err(dev,
+ "Failed for adding devres for freeing irq vectors\n");
+ goto err;
+ }
+
+ vp_vdpa->vring = devm_kcalloc(&pdev->dev, vp_vdpa->queues,
+ sizeof(*vp_vdpa->vring),
+ GFP_KERNEL);
+ if (!vp_vdpa->vring) {
+ ret = -ENOMEM;
+ dev_err(dev, "Fail to allocate virtqueues\n");
+ goto err;
+ }
+
+ for (i = 0; i < vp_vdpa->queues; i++) {
+ vp_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ vp_vdpa->vring[i].notify =
+ vp_modern_map_vq_notify(mdev, i,
+ &vp_vdpa->vring[i].notify_pa);
+ if (!vp_vdpa->vring[i].notify) {
+ ret = -EINVAL;
+ dev_warn(dev, "Fail to map vq notify %d\n", i);
+ goto err;
+ }
+ }
+ vp_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+
+ vp_vdpa->queue_vector = vp_vdpa_queue_vector;
+ vp_vdpa->config_vector = vp_vdpa_config_vector;
+
+ return vp_vdpa;
+
+err:
+ put_device(&vp_vdpa->vdpa.dev);
+ return ERR_PTR(ret);
+}
--
2.31.1

2021-09-08 13:03:16

by Wu Zongyong

Subject: [PATCH 3/6] vp_vdpa: add vq irq offloading support

This patch implements the get_vq_irq() callback for virtio pci devices
to allow irq offloading.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/virtio_pci/vp_vdpa.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
index fe0527329857..4c512ae1fe01 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -76,6 +76,13 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
return vp_modern_get_status(mdev);
}

+static int vp_vdpa_get_vq_irq(struct vdpa_device *vdev, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdev);
+
+ return vp_vdpa->vring[idx].irq;
+}
+
static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
{
struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
@@ -416,6 +423,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
.get_config = vp_vdpa_get_config,
.set_config = vp_vdpa_set_config,
.set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
};

static void vp_vdpa_free_irq_vectors(void *data)
--
2.31.1

2021-09-08 13:03:58

by Wu Zongyong

Subject: [PATCH 2/6] vdpa: fix typo

Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 8cfe49d201dd..35648c11e312 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -242,7 +242,7 @@ struct vdpa_config_ops {
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
/* vq irq is not expected to be changed once DRIVER_OK is set */
- int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
+ int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
--
2.31.1

2021-09-09 03:00:16

by Jason Wang

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
>
> This new callback indicates whether the vring size can be changed. It
> is useful when a legacy virtio-pci device is used as the vdpa device,
> because the specification provides no way to negotiate the vring size.

So I'm not sure it's worth bothering. E.g. what if we just fail
VHOST_SET_VRING_NUM if the value doesn't match what the hardware has?
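
A minimal sketch of that alternative, assuming a handler that is
allowed to return an error: the request succeeds only when it matches
the fixed hardware ring size. struct legacy_vq, legacy_set_vq_num() and
the choice of -EINVAL are hypothetical and only illustrate the idea;
they are not taken from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state for a legacy device; not a kernel struct. */
struct legacy_vq {
	uint16_t hw_num;	/* ring size fixed by the hardware */
};

/*
 * Accept a ring-size request only when it equals the hardware ring
 * size, so no separate "unchangeable" query is needed.
 */
static int legacy_set_vq_num(const struct legacy_vq *vq, uint32_t num)
{
	assert(vq != NULL);

	if (num != vq->hw_num)
		return -EINVAL;	/* assumed error code for a mismatch */

	return 0;
}
```

Under this scheme userspace would learn about the fixed size through
the failure itself rather than through an extra ioctl.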

Thanks

>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> drivers/virtio/virtio_vdpa.c | 5 ++++-
> include/linux/vdpa.h | 4 ++++
> include/uapi/linux/vhost.h | 2 ++
> 4 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 9479f7f79217..2204d27d1e5d 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> return 0;
> }
>
> +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> + u32 __user *argp)
> +{
> + struct vdpa_device *vdpa = v->vdpa;
> + const struct vdpa_config_ops *ops = vdpa->config;
> + bool unchangeable = false;
> +
> + if (ops->get_vq_num_unchangeable)
> + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> +
> + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> void __user *argp)
> {
> @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> case VHOST_VDPA_GET_IOVA_RANGE:
> r = vhost_vdpa_get_iova_range(v, argp);
> break;
> + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> + break;
> default:
> r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> if (r == -ENOIOCTLCMD)
> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index 72eaef2caeb1..afb47465307a 100644
> --- a/drivers/virtio/virtio_vdpa.c
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> struct vdpa_vq_state state = {0};
> unsigned long flags;
> u32 align, num;
> + bool may_reduce_num = true;
> int err;
>
> if (!name)
> @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
>
> /* Create the vring */
> align = ops->get_vq_align(vdpa);
> + if (ops->get_vq_num_unchangeable)
> + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> vq = vring_create_virtqueue(index, num, align, vdev,
> - true, true, ctx,
> + true, may_reduce_num, ctx,
> virtio_vdpa_notify, callback, name);
> if (!vq) {
> err = -ENOMEM;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 35648c11e312..f809b7ada00d 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> * @vdev: vdpa device
> * Returns the iova range supported by
> * the device.
> + * @get_vq_num_unchangeable: Check if size of virtqueue is unchangeable (optional)
> + * @vdev: vdpa device
> + * Returns boolean: unchangeable (true) or not (false)
> * @set_map: Set device memory mapping (optional)
> * Needed for device that using device
> * specific DMA translation (on-chip IOMMU)
> @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> const void *buf, unsigned int len);
> u32 (*get_generation)(struct vdpa_device *vdev);
> struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
>
> /* DMA ops */
> int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> index c998860d7bbc..184f1f7f8498 100644
> --- a/include/uapi/linux/vhost.h
> +++ b/include/uapi/linux/vhost.h
> @@ -150,4 +150,6 @@
> /* Get the valid iova range */
> #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> struct vhost_vdpa_iova_range)
> +/* Check if the vring size can be changed */
> +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> #endif
> --
> 2.31.1
>

2021-09-09 03:08:15

by Jason Wang

Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
>
> This series implements a vDPA driver for legacy virtio-pci devices.
> Currently the only vDPA driver for virtio-pci devices supports modern
> devices, but some legacy virtio-pci devices conform to version 0.9.x or
> older of the virtio-pci specification. For example, the ENI (Elastic
> Network Interface) of Alibaba ECS bare-metal instances is a hardware
> virtio network device that follows the Virtio PCI Card 0.9.5 Draft
> specification. Such legacy devices behave differently from modern
> virtio-pci devices in some respects, so the common code is split out
> and the modern-device-specific code is moved to a separate file.

What worries me a little bit is:

1) vDPA requires the device to support IOMMU_PLATFORM in order to work.
If I understand ENI correctly, it's a legacy device, so it can't
support ACCESS_PLATFORM. Or is it a legacy device that supports
ACCESS_PLATFORM implicitly?
2) vDPA tries to present a 1.0 device, in which case the behavior is
ruled by the spec. If we try to present a 1.0 device on top of a
legacy device, we may suffer from a lot of issues:

- endian issue: 1.0 uses le but legacy may use native endian
- queue_enable semantics, which are missing in legacy
- virtqueue size, as you mentioned below
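
To make the endian point concrete, here is a small sketch that decodes
the same two bytes in both ways. read_le16() and read_native16() are
hypothetical helper names used only for this illustration; they are not
kernel accessors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 16-bit field stored little-endian, as virtio 1.0 specifies. */
static uint16_t read_le16(const uint8_t *p)
{
	assert(p != NULL);
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode the same bytes in the byte order of the machine running this
 * code, which is how a native-endian (legacy) field is interpreted.
 */
static uint16_t read_native16(const uint8_t *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

On a little-endian machine both helpers return the same value, so the
difference goes unnoticed; on a big-endian machine the native read
yields the byte-swapped value, which is where presenting a legacy
device as a 1.0 device would go wrong.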

I guess what the device (ENI) supports is:

1) the semantics of ACCESS_PLATFORM without a feature
2) little endian
3) but it is a legacy device

So I think it might be better to:

1) introduce the library for legacy as you did in this patch
2) have a dedicated ENI vDPA driver

3) live migration support: though it is not supported by the spec
yet, we are working on the support, and we know a legacy device can
support this.

Thanks

>
> For legacy devices, it is not supported to negotiate the virtqueue size
> by the specification. So a new callback get_vq_num_unchangeable is
> introduced to indicate user not to try change the virtqueue size of the
> legacy vdpa device. For example, QEMU should not allocate memory for
> virtqueue according to the properties tx_queue_size and rx_queue_size if
> we use legacy virtio-pci device as the vhost-vdpa backend. Instead, QEMU
> should use the new callback get_vq_num_unchangeable first to check if
> the vdpa device support to change virtqueue size. If not, QEMU should
> call the callback get_vq_num_max to get the static virtqueue size then
> allocate the same size memory for the virtqueue.
>
> This series have been tested with the ENI in Alibaba ECS baremetal
> instance.
>
> These patches may under consideration, welcome for comments.
>
>
> Wu Zongyong (6):
> virtio-pci: introduce legacy device module
> vdpa: fix typo
> vp_vdpa: add vq irq offloading support
> vp_vdpa: split out reusable and device specific codes to separate file
> vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
> vp_vdpa: introduce legacy virtio pci driver
>
> drivers/vdpa/Kconfig | 7 +
> drivers/vdpa/virtio_pci/Makefile | 3 +
> drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
> drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
> drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
> drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
> drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
> drivers/vhost/vdpa.c | 19 ++
> drivers/virtio/Kconfig | 10 +
> drivers/virtio/Makefile | 1 +
> drivers/virtio/virtio_pci_common.c | 10 +-
> drivers/virtio/virtio_pci_common.h | 9 +-
> drivers/virtio/virtio_pci_legacy.c | 101 ++-----
> drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
> drivers/virtio/virtio_vdpa.c | 5 +-
> include/linux/vdpa.h | 6 +-
> include/linux/virtio_pci_legacy.h | 44 +++
> include/uapi/linux/vhost.h | 2 +
> 18 files changed, 1320 insertions(+), 85 deletions(-)
> create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
> create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
> create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
> create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
> create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> create mode 100644 include/linux/virtio_pci_legacy.h
>
> --
> 2.31.1
>

2021-09-09 03:16:22

by Jason Wang

Subject: Re: [PATCH 3/6] vp_vdpa: add vq irq offloading support

On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
>
> This patch implements the get_vq_irq() callback for virtio pci devices
> to allow irq offloading.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vdpa/virtio_pci/vp_vdpa.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

Acked-by: Jason Wang <[email protected]>

>
> diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
> index fe0527329857..4c512ae1fe01 100644
> --- a/drivers/vdpa/virtio_pci/vp_vdpa.c
> +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
> @@ -76,6 +76,13 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
> return vp_modern_get_status(mdev);
> }
>
> +static int vp_vdpa_get_vq_irq(struct vdpa_device *vdev, u16 idx)
> +{
> + struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdev);
> +
> + return vp_vdpa->vring[idx].irq;
> +}
> +
> static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
> {
> struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
> @@ -416,6 +423,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
> .get_config = vp_vdpa_get_config,
> .set_config = vp_vdpa_set_config,
> .set_config_cb = vp_vdpa_set_config_cb,
> + .get_vq_irq = vp_vdpa_get_vq_irq,
> };
>
> static void vp_vdpa_free_irq_vectors(void *data)
> --
> 2.31.1
>

2021-09-09 03:22:27

by Jason Wang

Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 9, 2021 at 11:05 AM Jason Wang <[email protected]> wrote:
>
> On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> >
> > This series implements the vDPA driver for legacy virtio-pci device.
> > Currently we already have the vDPA driver for modern virtio-pci device
> > only, but there are some legacy virtio-pci devices conforming to the
> > virtio-pci specifications of 0.9.x or older versions. For example,
> > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > Draft specification. Such legacy virtio-pci devices have some
> > inconsistent behaviour with modern virtio-pci devices, so some common
> > codes are split out and modern device specific codes are moved to a
> > separated file.
>
> What worries me a little bit are:
>
> 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> if I understand ENI correctly, it's a legacy device so it can't
> support ACCESS_PLATFORM. Or is it a legacy device that supports
> ACCESS_PLATFORM implicitly.
> 2) vDPA tries to present a 1.0 device, in this case the behavior could
> be ruled by the spec. If we tries to present an 1.0 device on top of
> legacy device we may suffer a lot of issues:
>
> - endian issue: 1.0 use le but legacy may use native endian
> - queue_enable semantic which is missed in the legacy
> - virtqueue size, as you mentioned below
>
> I guess what the device(ENI) supports are:
>
> 1) semantic of ACCESS_PLATFORM without a feature
> 2) little endian
> 3) but a legacy device
>
> So I think it might be better:
>
> 1) introduce the library for legacy as you did in this patch
> 2) having a dedicated ENI vDPA driver
>
> 3) live migration support, though it was not supported by the spec
> yet, but we are working on the support, and we know legacy device can

I meant "can't" actually.

Thanks

> support this.
>
> Thanks
>
> >
> > For legacy devices, it is not supported to negotiate the virtqueue size
> > by the specification. So a new callback get_vq_num_unchangeable is
> > introduced to indicate user not to try change the virtqueue size of the
> > legacy vdpa device. For example, QEMU should not allocate memory for
> > virtqueue according to the properties tx_queue_size and rx_queue_size if
> > we use legacy virtio-pci device as the vhost-vdpa backend. Instead, QEMU
> > should use the new callback get_vq_num_unchangeable first to check if
> > the vdpa device support to change virtqueue size. If not, QEMU should
> > call the callback get_vq_num_max to get the static virtqueue size then
> > allocate the same size memory for the virtqueue.
> >
> > This series have been tested with the ENI in Alibaba ECS baremetal
> > instance.
> >
> > These patches may under consideration, welcome for comments.
> >
> >
> > Wu Zongyong (6):
> > virtio-pci: introduce legacy device module
> > vdpa: fix typo
> > vp_vdpa: add vq irq offloading support
> > vp_vdpa: split out reusable and device specific codes to separate file
> > vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
> > vp_vdpa: introduce legacy virtio pci driver
> >
> > drivers/vdpa/Kconfig | 7 +
> > drivers/vdpa/virtio_pci/Makefile | 3 +
> > drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
> > drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
> > drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
> > drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
> > drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
> > drivers/vhost/vdpa.c | 19 ++
> > drivers/virtio/Kconfig | 10 +
> > drivers/virtio/Makefile | 1 +
> > drivers/virtio/virtio_pci_common.c | 10 +-
> > drivers/virtio/virtio_pci_common.h | 9 +-
> > drivers/virtio/virtio_pci_legacy.c | 101 ++-----
> > drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
> > drivers/virtio/virtio_vdpa.c | 5 +-
> > include/linux/vdpa.h | 6 +-
> > include/linux/virtio_pci_legacy.h | 44 +++
> > include/uapi/linux/vhost.h | 2 +
> > 18 files changed, 1320 insertions(+), 85 deletions(-)
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
> > create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> > create mode 100644 include/linux/virtio_pci_legacy.h
> >
> > --
> > 2.31.1
> >

2021-09-09 08:04:12

by Wu Zongyong

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> >
> > This new callback is used to indicate whether the vring size can be
> > change or not. It is useful when we have a legacy virtio pci device as
> > the vdpa device for there is no way to negotiate the vring num by the
> > specification.
>
> So I'm not sure it's worth bothering. E.g what if we just fail
> VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
>
> Thanks
>
I think we should not call VHOST_SET_VRING_NUM in that case.

If the hardware reports that the virtqueue size cannot be changed, we
should first call VHOST_VDPA_GET_VRING_NUM to get the static virtqueue
size, then allocate memory of that size for the virtqueues, and finally
write the address to the hardware.

For QEMU, we will ignore the rx/tx_queue_size properties and just get the
size from the hardware if this new callback returns true.

What do you think?
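
To make the "allocate the same size memory" step concrete, here is a minimal
userspace sketch of how many bytes one legacy split virtqueue needs once the
fixed queue size is known. The arithmetic mirrors vring_size() from
include/uapi/linux/virtio_ring.h; the function name legacy_vring_bytes and
the two macros are made up for this illustration and are not part of the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split-ring element sizes from the virtio specification. */
#define VRING_DESC_BYTES      16u /* addr (8) + len (4) + flags (2) + next (2) */
#define VRING_USED_ELEM_BYTES  8u /* id (4) + len (4) */

/*
 * Bytes needed for one legacy split virtqueue with `num` entries.
 * `align` must be a power of two (legacy virtio-pci uses 4096).
 * Hypothetical helper name; same arithmetic as the kernel's vring_size().
 */
static size_t legacy_vring_bytes(unsigned int num, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Descriptor table followed directly by the available ring. */
    size_t desc_avail = VRING_DESC_BYTES * (size_t)num
                      + sizeof(uint16_t) * (3 + (size_t)num);

    /* The used ring starts at the next `align` boundary. */
    size_t used_off = (desc_avail + align - 1) & ~(align - 1);

    return used_off + sizeof(uint16_t) * 3
         + VRING_USED_ELEM_BYTES * (size_t)num;
}
```

With a queue size of 256 and 4096-byte alignment this gives 10246 bytes, so
the allocation depends only on the size the hardware reports, not on the
tx_queue_size/rx_queue_size properties.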
> >
> > Signed-off-by: Wu Zongyong <[email protected]>
> > ---
> > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > include/linux/vdpa.h | 4 ++++
> > include/uapi/linux/vhost.h | 2 ++
> > 4 files changed, 29 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > index 9479f7f79217..2204d27d1e5d 100644
> > --- a/drivers/vhost/vdpa.c
> > +++ b/drivers/vhost/vdpa.c
> > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > return 0;
> > }
> >
> > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > + u32 __user *argp)
> > +{
> > + struct vdpa_device *vdpa = v->vdpa;
> > + const struct vdpa_config_ops *ops = vdpa->config;
> > + bool unchangeable = false;
> > +
> > + if (ops->get_vq_num_unchangeable)
> > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > +
> > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
> > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > void __user *argp)
> > {
> > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > case VHOST_VDPA_GET_IOVA_RANGE:
> > r = vhost_vdpa_get_iova_range(v, argp);
> > break;
> > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > + break;
> > default:
> > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > if (r == -ENOIOCTLCMD)
> > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > index 72eaef2caeb1..afb47465307a 100644
> > --- a/drivers/virtio/virtio_vdpa.c
> > +++ b/drivers/virtio/virtio_vdpa.c
> > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > struct vdpa_vq_state state = {0};
> > unsigned long flags;
> > u32 align, num;
> > + bool may_reduce_num = true;
> > int err;
> >
> > if (!name)
> > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> >
> > /* Create the vring */
> > align = ops->get_vq_align(vdpa);
> > + if (ops->get_vq_num_unchangeable)
> > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > vq = vring_create_virtqueue(index, num, align, vdev,
> > - true, true, ctx,
> > + true, may_reduce_num, ctx,
> > virtio_vdpa_notify, callback, name);
> > if (!vq) {
> > err = -ENOMEM;
> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > index 35648c11e312..f809b7ada00d 100644
> > --- a/include/linux/vdpa.h
> > +++ b/include/linux/vdpa.h
> > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > * @vdev: vdpa device
> > * Returns the iova range supported by
> > * the device.
> > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > + * @vdev: vdpa device
> > + * Returns boolean: unchangeable (true) or not (false)
> > * @set_map: Set device memory mapping (optional)
> > * Needed for device that using device
> > * specific DMA translation (on-chip IOMMU)
> > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > const void *buf, unsigned int len);
> > u32 (*get_generation)(struct vdpa_device *vdev);
> > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> >
> > /* DMA ops */
> > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > index c998860d7bbc..184f1f7f8498 100644
> > --- a/include/uapi/linux/vhost.h
> > +++ b/include/uapi/linux/vhost.h
> > @@ -150,4 +150,6 @@
> > /* Get the valid iova range */
> > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > struct vhost_vdpa_iova_range)
> > +/* Check if the vring size can be change */
> > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > #endif
> > --
> > 2.31.1
> >

2021-09-09 08:14:28

by Wu Zongyong

Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 09, 2021 at 11:05:06AM +0800, Jason Wang wrote:
> On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> >
> > This series implements the vDPA driver for legacy virtio-pci device.
> > Currently we already have the vDPA driver for modern virtio-pci device
> > only, but there are some legacy virtio-pci devices conforming to the
> > virtio-pci specifications of 0.9.x or older versions. For example,
> > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > Draft specification. Such legacy virtio-pci devices have some
> > inconsistent behaviour with modern virtio-pci devices, so some common
> > codes are split out and modern device specific codes are moved to a
> > separated file.
>
> What worries me a little bit are:
>
> 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> if I understand ENI correctly, it's a legacy device so it can't
> support ACCESS_PLATFORM. Or is it a legacy device that supports
> ACCESS_PLATFORM implicitly.
> 2) vDPA tries to present a 1.0 device, in this case the behavior could
> be ruled by the spec. If we tries to present an 1.0 device on top of
> legacy device we may suffer a lot of issues:
>
> - endian issue: 1.0 use le but legacy may use native endian
> - queue_enable semantic which is missed in the legacy

Writing the queue_address is regarded as enabling the queue in legacy,
right?

> - virtqueue size, as you mentioned below
>
> I guess what the device(ENI) supports are:
>
> 1) semantic of ACCESS_PLATFORM without a feature
> 2) little endian
> 3) but a legacy device
>
> So I think it might be better:
>
> 1) introduce the library for legacy as you did in this patch
> 2) having a dedicated ENI vDPA driver
>
> 3) live migration support, though it was not supported by the spec
> yet, but we are working on the support, and we know legacy device can
> support this.
>
> Thanks
>

I agree.
It's better to implement a dedicated vDPA driver for ENI only. ENI is
not a standard legacy virtio-pci device.

> >
> > For legacy devices, it is not supported to negotiate the virtqueue size
> > by the specification. So a new callback get_vq_num_unchangeable is
> > introduced to indicate user not to try change the virtqueue size of the
> > legacy vdpa device. For example, QEMU should not allocate memory for
> > virtqueue according to the properties tx_queue_size and rx_queue_size if
> > we use legacy virtio-pci device as the vhost-vdpa backend. Instead, QEMU
> > should use the new callback get_vq_num_unchangeable first to check if
> > the vdpa device support to change virtqueue size. If not, QEMU should
> > call the callback get_vq_num_max to get the static virtqueue size then
> > allocate the same size memory for the virtqueue.
> >
> > This series have been tested with the ENI in Alibaba ECS baremetal
> > instance.
> >
> > These patches may under consideration, welcome for comments.
> >
> >
> > Wu Zongyong (6):
> > virtio-pci: introduce legacy device module
> > vdpa: fix typo
> > vp_vdpa: add vq irq offloading support
> > vp_vdpa: split out reusable and device specific codes to separate file
> > vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
> > vp_vdpa: introduce legacy virtio pci driver
> >
> > drivers/vdpa/Kconfig | 7 +
> > drivers/vdpa/virtio_pci/Makefile | 3 +
> > drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
> > drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
> > drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
> > drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
> > drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
> > drivers/vhost/vdpa.c | 19 ++
> > drivers/virtio/Kconfig | 10 +
> > drivers/virtio/Makefile | 1 +
> > drivers/virtio/virtio_pci_common.c | 10 +-
> > drivers/virtio/virtio_pci_common.h | 9 +-
> > drivers/virtio/virtio_pci_legacy.c | 101 ++-----
> > drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
> > drivers/virtio/virtio_vdpa.c | 5 +-
> > include/linux/vdpa.h | 6 +-
> > include/linux/virtio_pci_legacy.h | 44 +++
> > include/uapi/linux/vhost.h | 2 +
> > 18 files changed, 1320 insertions(+), 85 deletions(-)
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
> > create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> > create mode 100644 include/linux/virtio_pci_legacy.h
> >
> > --
> > 2.31.1
> >

2021-09-09 09:22:18

by Michael S. Tsirkin

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> >
> > This new callback is used to indicate whether the vring size can be
> > change or not. It is useful when we have a legacy virtio pci device as
> > the vdpa device for there is no way to negotiate the vring num by the
> > specification.
>
> So I'm not sure it's worth bothering. E.g what if we just fail
> VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
>
> Thanks

More importantly, is there an actual plan for supporting
legacy devices? I don't think they currently work, at a number
of levels.

--
MST

2021-09-09 09:22:28

by Michael S. Tsirkin

Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 09, 2021 at 11:05:06AM +0800, Jason Wang wrote:
> On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> >
> > This series implements the vDPA driver for legacy virtio-pci device.
> > Currently we already have the vDPA driver for modern virtio-pci device
> > only, but there are some legacy virtio-pci devices conforming to the
> > virtio-pci specifications of 0.9.x or older versions. For example,
> > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > Draft specification. Such legacy virtio-pci devices have some
> > inconsistent behaviour with modern virtio-pci devices, so some common
> > codes are split out and modern device specific codes are moved to a
> > separated file.
>
> What worries me a little bit are:
>
> 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> if I understand ENI correctly, it's a legacy device so it can't
> support ACCESS_PLATFORM. Or is it a legacy device that supports
> ACCESS_PLATFORM implicitly.
> 2) vDPA tries to present a 1.0 device, in this case the behavior could
> be ruled by the spec. If we tries to present an 1.0 device on top of
> legacy device we may suffer a lot of issues:
>
> - endian issue: 1.0 use le but legacy may use native endian
> - queue_enable semantic which is missed in the legacy
> - virtqueue size, as you mentioned below

So this all kind of works when guest and host are
strongly ordered and LE. Case in point: x86.
The question is how we limit this to an x86 guest.
Add a new ioctl declaring that this is the case?

--
MST

2021-09-09 09:30:45

by Jason Wang

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
>
> On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > This new callback is used to indicate whether the vring size can be
> > > change or not. It is useful when we have a legacy virtio pci device as
> > > the vdpa device for there is no way to negotiate the vring num by the
> > > specification.
> >
> > So I'm not sure it's worth bothering. E.g what if we just fail
> > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> >
> > Thanks
> >
> I think we should not call VHOST_SET_VRING_NUM in that case.
>
> If the hardware reports that the virtqueue size cannot be changed, we
> should call VHOST_GET_VRING_NUM to get the static virtqueue size
> firstly, then allocate the same size memory for the virtqueues and write
> the address to hardware finally.
>
> For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> from the hardware if this new callback return true.

This will break live migration. My understanding is that we can
advertise those capabilities/limitations via the netlink management
protocol; then the management layer can choose the correct queue
size.

Thanks
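
For comparison, the simpler alternative quoted above (fail the request when
the value doesn't match what the hardware has) could look roughly like the
sketch below. This is a standalone illustration only: struct fixed_vq_dev
and fixed_vq_set_num are hypothetical names, and where such a check would
actually live (vhost-vdpa or the parent driver) is left open here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state; not taken from the posted patches. */
struct fixed_vq_dev {
    uint16_t hw_vq_num; /* queue size reported by the hardware */
};

/*
 * Accept a requested queue size only if it equals the size the
 * hardware is fixed to; otherwise reject the request.
 */
static int fixed_vq_set_num(const struct fixed_vq_dev *dev, uint32_t requested)
{
    assert(dev != NULL);

    if (requested != dev->hw_vq_num)
        return -EINVAL;

    return 0;
}
```

Userspace would then see an error for a mismatched size instead of needing a
separate "unchangeable" query.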

>
> What do you think?
> > >
> > > Signed-off-by: Wu Zongyong <[email protected]>
> > > ---
> > > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > > include/linux/vdpa.h | 4 ++++
> > > include/uapi/linux/vhost.h | 2 ++
> > > 4 files changed, 29 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > index 9479f7f79217..2204d27d1e5d 100644
> > > --- a/drivers/vhost/vdpa.c
> > > +++ b/drivers/vhost/vdpa.c
> > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > return 0;
> > > }
> > >
> > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > + u32 __user *argp)
> > > +{
> > > + struct vdpa_device *vdpa = v->vdpa;
> > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > + bool unchangeable = false;
> > > +
> > > + if (ops->get_vq_num_unchangeable)
> > > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > +
> > > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > + return -EFAULT;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > void __user *argp)
> > > {
> > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > case VHOST_VDPA_GET_IOVA_RANGE:
> > > r = vhost_vdpa_get_iova_range(v, argp);
> > > break;
> > > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > + break;
> > > default:
> > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > if (r == -ENOIOCTLCMD)
> > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > index 72eaef2caeb1..afb47465307a 100644
> > > --- a/drivers/virtio/virtio_vdpa.c
> > > +++ b/drivers/virtio/virtio_vdpa.c
> > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > struct vdpa_vq_state state = {0};
> > > unsigned long flags;
> > > u32 align, num;
> > > + bool may_reduce_num = true;
> > > int err;
> > >
> > > if (!name)
> > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > >
> > > /* Create the vring */
> > > align = ops->get_vq_align(vdpa);
> > > + if (ops->get_vq_num_unchangeable)
> > > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > - true, true, ctx,
> > > + true, may_reduce_num, ctx,
> > > virtio_vdpa_notify, callback, name);
> > > if (!vq) {
> > > err = -ENOMEM;
> > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > index 35648c11e312..f809b7ada00d 100644
> > > --- a/include/linux/vdpa.h
> > > +++ b/include/linux/vdpa.h
> > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > * @vdev: vdpa device
> > > * Returns the iova range supported by
> > > * the device.
> > > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > > + * @vdev: vdpa device
> > > + * Returns boolean: unchangeable (true) or not (false)
> > > * @set_map: Set device memory mapping (optional)
> > > * Needed for device that using device
> > > * specific DMA translation (on-chip IOMMU)
> > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > const void *buf, unsigned int len);
> > > u32 (*get_generation)(struct vdpa_device *vdev);
> > > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > >
> > > /* DMA ops */
> > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > index c998860d7bbc..184f1f7f8498 100644
> > > --- a/include/uapi/linux/vhost.h
> > > +++ b/include/uapi/linux/vhost.h
> > > @@ -150,4 +150,6 @@
> > > /* Get the valid iova range */
> > > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > > struct vhost_vdpa_iova_range)
> > > +/* Check if the vring size can be change */
> > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > > #endif
> > > --
> > > 2.31.1
> > >
>

2021-09-09 09:32:36

by Jason Wang

Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 9, 2021 at 4:12 PM Wu Zongyong <[email protected]> wrote:
>
> On Thu, Sep 09, 2021 at 11:05:06AM +0800, Jason Wang wrote:
> > On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > This series implements the vDPA driver for legacy virtio-pci device.
> > > Currently we already have the vDPA driver for modern virtio-pci device
> > > only, but there are some legacy virtio-pci devices conforming to the
> > > virtio-pci specifications of 0.9.x or older versions. For example,
> > > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > > Draft specification. Such legacy virtio-pci devices have some
> > > inconsistent behaviour with modern virtio-pci devices, so some common
> > > codes are split out and modern device specific codes are moved to a
> > > separated file.
> >
> > What worries me a little bit are:
> >
> > 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> > if I understand ENI correctly, it's a legacy device so it can't
> > support ACCESS_PLATFORM. Or is it a legacy device that supports
> > ACCESS_PLATFORM implicitly.
> > 2) vDPA tries to present a 1.0 device, in this case the behavior could
> > be ruled by the spec. If we tries to present an 1.0 device on top of
> > legacy device we may suffer a lot of issues:
> >
> > - endian issue: 1.0 use le but legacy may use native endian
> > - queue_enable semantic which is missed in the legacy
>
> Writting the queue_address is regarded as enable queue in the legacy.
> Right?

Those are implementation-specific details that the virtio spec can't
mandate.

E.g. if your ENI behaves like this, you can delay the queue_address
write to hardware until set_vq_ready() in the ENI vDPA driver.

Thanks
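
A minimal userspace sketch of that deferral, assuming the device treats a
queue address write as "enable". Everything here is hypothetical (struct
demo_eni, the demo_* helpers, and the array standing in for the device
register); it only shows the ordering, not a real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_VQS 4

/* Hypothetical driver state; hw_queue_pfn stands in for the device register. */
struct demo_eni {
    uint32_t cached_pfn[DEMO_MAX_VQS];   /* address remembered by the driver */
    uint32_t hw_queue_pfn[DEMO_MAX_VQS]; /* what the device has been told */
};

/* set_vq_address-style step: only remember the address, touch no hardware. */
static void demo_set_vq_address(struct demo_eni *eni, uint16_t idx, uint32_t pfn)
{
    assert(idx < DEMO_MAX_VQS);
    eni->cached_pfn[idx] = pfn;
}

/* set_vq_ready-style step: the write that actually enables the queue. */
static void demo_set_vq_ready(struct demo_eni *eni, uint16_t idx, bool ready)
{
    assert(idx < DEMO_MAX_VQS);
    /* Writing 0 deactivates the queue, as the legacy virtio-pci driver does. */
    eni->hw_queue_pfn[idx] = ready ? eni->cached_pfn[idx] : 0;
}
```

The point of the split is that the device never sees an address before the
caller has declared the queue ready, which gives the 1.0-style queue_enable
ordering on top of a legacy register.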

>
> > - virtqueue size, as you mentioned below
> >
> > I guess what the device(ENI) supports are:
> >
> > 1) semantic of ACCESS_PLATFORM without a feature
> > 2) little endian
> > 3) but a legacy device
> >
> > So I think it might be better:
> >
> > 1) introduce the library for legacy as you did in this patch
> > 2) having a dedicated ENI vDPA driver
> >
> > 3) live migration support, though it was not supported by the spec
> > yet, but we are working on the support, and we know legacy device can
> > support this.
> >
> > Thanks
> >
>
> I agree.
> It's better to implement a dedicated vDPA driver for ENI only. ENI is
> not a standard legacy virtio-pci device.
>
> > >
> > > For legacy devices, it is not supported to negotiate the virtqueue size
> > > by the specification. So a new callback get_vq_num_unchangeable is
> > > introduced to indicate user not to try change the virtqueue size of the
> > > legacy vdpa device. For example, QEMU should not allocate memory for
> > > virtqueue according to the properties tx_queue_size and rx_queue_size if
> > > we use legacy virtio-pci device as the vhost-vdpa backend. Instead, QEMU
> > > should use the new callback get_vq_num_unchangeable first to check if
> > > the vdpa device support to change virtqueue size. If not, QEMU should
> > > call the callback get_vq_num_max to get the static virtqueue size then
> > > allocate the same size memory for the virtqueue.
> > >
> > > This series have been tested with the ENI in Alibaba ECS baremetal
> > > instance.
> > >
> > > These patches may under consideration, welcome for comments.
> > >
> > >
> > > Wu Zongyong (6):
> > > virtio-pci: introduce legacy device module
> > > vdpa: fix typo
> > > vp_vdpa: add vq irq offloading support
> > > vp_vdpa: split out reusable and device specific codes to separate file
> > > vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
> > > vp_vdpa: introduce legacy virtio pci driver
> > >
> > > drivers/vdpa/Kconfig | 7 +
> > > drivers/vdpa/virtio_pci/Makefile | 3 +
> > > drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
> > > drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
> > > drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
> > > drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
> > > drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
> > > drivers/vhost/vdpa.c | 19 ++
> > > drivers/virtio/Kconfig | 10 +
> > > drivers/virtio/Makefile | 1 +
> > > drivers/virtio/virtio_pci_common.c | 10 +-
> > > drivers/virtio/virtio_pci_common.h | 9 +-
> > > drivers/virtio/virtio_pci_legacy.c | 101 ++-----
> > > drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
> > > drivers/virtio/virtio_vdpa.c | 5 +-
> > > include/linux/vdpa.h | 6 +-
> > > include/linux/virtio_pci_legacy.h | 44 +++
> > > include/uapi/linux/vhost.h | 2 +
> > > 18 files changed, 1320 insertions(+), 85 deletions(-)
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
> > > create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> > > create mode 100644 include/linux/virtio_pci_legacy.h
> > >
> > > --
> > > 2.31.1
> > >
>

2021-09-09 09:33:50

by Jason Wang

Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 9, 2021 at 5:21 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, Sep 09, 2021 at 11:05:06AM +0800, Jason Wang wrote:
> > On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > This series implements the vDPA driver for legacy virtio-pci device.
> > > Currently we already have the vDPA driver for modern virtio-pci device
> > > only, but there are some legacy virtio-pci devices conforming to the
> > > virtio-pci specifications of 0.9.x or older versions. For example,
> > > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > > Draft specification. Such legacy virtio-pci devices have some
> > > inconsistent behaviour with modern virtio-pci devices, so some common
> > > codes are split out and modern device specific codes are moved to a
> > > separated file.
> >
> > What worries me a little bit are:
> >
> > 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> > if I understand ENI correctly, it's a legacy device so it can't
> > support ACCESS_PLATFORM. Or is it a legacy device that supports
> > ACCESS_PLATFORM implicitly.
> > 2) vDPA tries to present a 1.0 device, in this case the behavior could
> > be ruled by the spec. If we tries to present an 1.0 device on top of
> > legacy device we may suffer a lot of issues:
> >
> > - endian issue: 1.0 use le but legacy may use native endian
> > - queue_enable semantic which is missed in the legacy
> > - virtqueue size, as you mentioned below
>
> So this all kind of works when guest and host are
> strongly ordered and LE. Case in point x86.
> Question is how do we limit this to an x86 guest?
> Add a new ioctl declaring that this is the case?

I think the simplest way is to disable the driver on non-LE hosts
(assuming it tries to use native endian, which is kind of impossible).
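
A minimal userspace sketch of that idea, not the actual driver code: detect
the host byte order and refuse to bind when it is not little endian. The
names host_is_little_endian() and legacy_probe_allowed() are hypothetical and
exist only for illustration; in the kernel the same restriction would more
likely be expressed at build time.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    /* Inspect the byte at the lowest address of the 16-bit value. */
    memcpy(&first, &probe, sizeof(first));
    return first == 0x02;
}

/*
 * Hypothetical probe gate: returns 0 to continue probing, or -ENODEV
 * to refuse binding on a host that is not little endian.
 */
static int legacy_probe_allowed(int is_little_endian)
{
    return is_little_endian ? 0 : -ENODEV;
}
```

A probe routine would call legacy_probe_allowed(host_is_little_endian())
first and return its error code unchanged when it is non-zero.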

Thanks

>
> --
> MST
>

2021-09-09 09:34:23

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Thu, Sep 9, 2021 at 5:18 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > This new callback is used to indicate whether the vring size can be
> > > change or not. It is useful when we have a legacy virtio pci device as
> > > the vdpa device for there is no way to negotiate the vring num by the
> > > specification.
> >
> > So I'm not sure it's worth bothering. E.g what if we just fail
> > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> >
> > Thanks
>
> More importantly is there and actual plan for supporting
> legacy devices? I don't think they currently work at a number
> of levels.

I think the answer is no; it would introduce a lot of burden.

Thanks

>
> --
> MST
>

2021-09-09 09:58:21

by Wu Zongyong

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> >
> > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > >
> > > > This new callback is used to indicate whether the vring size can be
> > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > specification.
> > >
> > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > >
> > > Thanks
> > >
> > I think we should not call VHOST_SET_VRING_NUM in that case.
> >
> > If the hardware reports that the virtqueue size cannot be changed, we
> > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > firstly, then allocate the same size memory for the virtqueues and write
> > the address to hardware finally.
> >
> > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > from the hardware if this new callback return true.
>
> This will break live migration. My understanding is that we can
> advertise those capability/limitation via the netlink management
> protocol then management layer can choose to use the correct queue
> size.
>
> Thanks
I agree, it is a good idea.
BTW, can we also advertise the MAC address of the network device? I found
that the MAC address generated by libvirt or qemu breaks the network
datapath if I don't specify the right MAC explicitly in the XML or on the
qemu command line.
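
On the queue size point, the alternative raised earlier in this thread
(failing VHOST_SET_VRING_NUM when the value does not match what the hardware
has) could look roughly like the sketch below. struct fixed_vq and
fixed_vq_set_num() are hypothetical names used only for illustration; they
are not part of the vhost or vdpa API.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a legacy queue whose size is fixed by hardware. */
struct fixed_vq {
    uint16_t hw_num; /* size reported by the device */
    uint16_t num;    /* size accepted from userspace, 0 until set */
};

/*
 * Accept the request only when it equals the hardware size.
 * Any other value is rejected with -EINVAL and leaves the queue untouched.
 */
static int fixed_vq_set_num(struct fixed_vq *vq, uint32_t requested)
{
    if (requested != vq->hw_num)
        return -EINVAL;

    vq->num = (uint16_t)requested;
    return 0;
}
```

With this shape no new ioctl is needed: userspace that asks for the wrong
size gets an error, and the management layer can pick the right size up front
if the limit is advertised.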
>
> >
> > What do you think?
> > > >
> > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > ---
> > > > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > > > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > > > include/linux/vdpa.h | 4 ++++
> > > > include/uapi/linux/vhost.h | 2 ++
> > > > 4 files changed, 29 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > index 9479f7f79217..2204d27d1e5d 100644
> > > > --- a/drivers/vhost/vdpa.c
> > > > +++ b/drivers/vhost/vdpa.c
> > > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > > return 0;
> > > > }
> > > >
> > > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > > + u32 __user *argp)
> > > > +{
> > > > + struct vdpa_device *vdpa = v->vdpa;
> > > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > > + bool unchangeable = false;
> > > > +
> > > > + if (ops->get_vq_num_unchangeable)
> > > > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > > +
> > > > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > > + return -EFAULT;
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > > void __user *argp)
> > > > {
> > > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > case VHOST_VDPA_GET_IOVA_RANGE:
> > > > r = vhost_vdpa_get_iova_range(v, argp);
> > > > break;
> > > > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > > + break;
> > > > default:
> > > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > if (r == -ENOIOCTLCMD)
> > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > index 72eaef2caeb1..afb47465307a 100644
> > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > struct vdpa_vq_state state = {0};
> > > > unsigned long flags;
> > > > u32 align, num;
> > > > + bool may_reduce_num = true;
> > > > int err;
> > > >
> > > > if (!name)
> > > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > >
> > > > /* Create the vring */
> > > > align = ops->get_vq_align(vdpa);
> > > > + if (ops->get_vq_num_unchangeable)
> > > > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > - true, true, ctx,
> > > > + true, may_reduce_num, ctx,
> > > > virtio_vdpa_notify, callback, name);
> > > > if (!vq) {
> > > > err = -ENOMEM;
> > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > > index 35648c11e312..f809b7ada00d 100644
> > > > --- a/include/linux/vdpa.h
> > > > +++ b/include/linux/vdpa.h
> > > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > > * @vdev: vdpa device
> > > > * Returns the iova range supported by
> > > > * the device.
> > > > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > > > + * @vdev: vdpa device
> > > > + * Returns boolean: unchangeable (true) or not (false)
> > > > * @set_map: Set device memory mapping (optional)
> > > > * Needed for device that using device
> > > > * specific DMA translation (on-chip IOMMU)
> > > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > > const void *buf, unsigned int len);
> > > > u32 (*get_generation)(struct vdpa_device *vdev);
> > > > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > > >
> > > > /* DMA ops */
> > > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > > index c998860d7bbc..184f1f7f8498 100644
> > > > --- a/include/uapi/linux/vhost.h
> > > > +++ b/include/uapi/linux/vhost.h
> > > > @@ -150,4 +150,6 @@
> > > > /* Get the valid iova range */
> > > > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > > > struct vhost_vdpa_iova_range)
> > > > +/* Check if the vring size can be change */
> > > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > > > #endif
> > > > --
> > > > 2.31.1
> > > >
> >

2021-09-09 13:08:19

by Wu Zongyong

[permalink] [raw]
Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 09, 2021 at 11:05:06AM +0800, Jason Wang wrote:
> On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> >
> > This series implements the vDPA driver for legacy virtio-pci device.
> > Currently we already have the vDPA driver for modern virtio-pci device
> > only, but there are some legacy virtio-pci devices conforming to the
> > virtio-pci specifications of 0.9.x or older versions. For example,
> > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > Draft specification. Such legacy virtio-pci devices have some
> > inconsistent behaviour with modern virtio-pci devices, so some common
> > codes are split out and modern device specific codes are moved to a
> > separated file.
>
> What worries me a little bit are:
>
> 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> if I understand ENI correctly, it's a legacy device so it can't
> support ACCESS_PLATFORM. Or is it a legacy device that supports
> ACCESS_PLATFORM implicitly.
> 2) vDPA tries to present a 1.0 device, in this case the behavior could
> be ruled by the spec. If we tries to present an 1.0 device on top of
> legacy device we may suffer a lot of issues:
>
> - endian issue: 1.0 use le but legacy may use native endian
> - queue_enable semantic which is missed in the legacy
> - virtqueue size, as you mentioned below
>
> I guess what the device(ENI) supports are:
>
> 1) semantic of ACCESS_PLATFORM without a feature
> 2) little endian
> 3) but a legacy device
>
> So I think it might be better:
>
> 1) introduce the library for legacy as you did in this patch
> 2) having a dedicated ENI vDPA driver

Would you mind if I place the ENI vDPA driver inside the virtio_pci folder?
Or should I create a new folder for it?

>
> 3) live migration support, though it was not supported by the spec
> yet, but we are working on the support, and we know legacy device can
> support this.
>
> Thanks
>
> >
> > For legacy devices, it is not supported to negotiate the virtqueue size
> > by the specification. So a new callback get_vq_num_unchangeable is
> > introduced to indicate user not to try change the virtqueue size of the
> > legacy vdpa device. For example, QEMU should not allocate memory for
> > virtqueue according to the properties tx_queue_size and rx_queue_size if
> > we use legacy virtio-pci device as the vhost-vdpa backend. Instead, QEMU
> > should use the new callback get_vq_num_unchangeable first to check if
> > the vdpa device support to change virtqueue size. If not, QEMU should
> > call the callback get_vq_num_max to get the static virtqueue size then
> > allocate the same size memory for the virtqueue.
> >
> > This series have been tested with the ENI in Alibaba ECS baremetal
> > instance.
> >
> > These patches may under consideration, welcome for comments.
> >
> >
> > Wu Zongyong (6):
> > virtio-pci: introduce legacy device module
> > vdpa: fix typo
> > vp_vdpa: add vq irq offloading support
> > vp_vdpa: split out reusable and device specific codes to separate file
> > vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
> > vp_vdpa: introduce legacy virtio pci driver
> >
> > drivers/vdpa/Kconfig | 7 +
> > drivers/vdpa/virtio_pci/Makefile | 3 +
> > drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
> > drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
> > drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
> > drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
> > drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
> > drivers/vhost/vdpa.c | 19 ++
> > drivers/virtio/Kconfig | 10 +
> > drivers/virtio/Makefile | 1 +
> > drivers/virtio/virtio_pci_common.c | 10 +-
> > drivers/virtio/virtio_pci_common.h | 9 +-
> > drivers/virtio/virtio_pci_legacy.c | 101 ++-----
> > drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
> > drivers/virtio/virtio_vdpa.c | 5 +-
> > include/linux/vdpa.h | 6 +-
> > include/linux/virtio_pci_legacy.h | 44 +++
> > include/uapi/linux/vhost.h | 2 +
> > 18 files changed, 1320 insertions(+), 85 deletions(-)
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
> > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
> > create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> > create mode 100644 include/linux/virtio_pci_legacy.h
> >
> > --
> > 2.31.1
> >

2021-09-10 01:47:50

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
>
> On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > >
> > > > > This new callback is used to indicate whether the vring size can be
> > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > specification.
> > > >
> > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > >
> > > > Thanks
> > > >
> > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > >
> > > If the hardware reports that the virtqueue size cannot be changed, we
> > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > firstly, then allocate the same size memory for the virtqueues and write
> > > the address to hardware finally.
> > >
> > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > from the hardware if this new callback return true.
> >
> > This will break live migration. My understanding is that we can
> > advertise those capability/limitation via the netlink management
> > protocol then management layer can choose to use the correct queue
> > size.
> >
> > Thanks
> I agree, it is a good idea.
> BTW, can we also advertise mac address of network device? I found the
> mac address generated by libvirt or qemu will break the network datapath
> down if I don't specify the right mac explicitly in the XML or qemu
> commandline.

We have never seen this before. AFAIK, when vhost-vdpa is used, qemu will
currently probably ignore the MAC address set via the command line, since
the config space is read from the device instead of from qemu itself?

Thanks

> >
> > >
> > > What do you think?
> > > > >
> > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > ---
> > > > > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > > > > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > > > > include/linux/vdpa.h | 4 ++++
> > > > > include/uapi/linux/vhost.h | 2 ++
> > > > > 4 files changed, 29 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > index 9479f7f79217..2204d27d1e5d 100644
> > > > > --- a/drivers/vhost/vdpa.c
> > > > > +++ b/drivers/vhost/vdpa.c
> > > > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > > > return 0;
> > > > > }
> > > > >
> > > > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > > > + u32 __user *argp)
> > > > > +{
> > > > > + struct vdpa_device *vdpa = v->vdpa;
> > > > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > > > + bool unchangeable = false;
> > > > > +
> > > > > + if (ops->get_vq_num_unchangeable)
> > > > > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > > > +
> > > > > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > > > + return -EFAULT;
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > > > void __user *argp)
> > > > > {
> > > > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > case VHOST_VDPA_GET_IOVA_RANGE:
> > > > > r = vhost_vdpa_get_iova_range(v, argp);
> > > > > break;
> > > > > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > > > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > > > + break;
> > > > > default:
> > > > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > > if (r == -ENOIOCTLCMD)
> > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > index 72eaef2caeb1..afb47465307a 100644
> > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > struct vdpa_vq_state state = {0};
> > > > > unsigned long flags;
> > > > > u32 align, num;
> > > > > + bool may_reduce_num = true;
> > > > > int err;
> > > > >
> > > > > if (!name)
> > > > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > >
> > > > > /* Create the vring */
> > > > > align = ops->get_vq_align(vdpa);
> > > > > + if (ops->get_vq_num_unchangeable)
> > > > > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > - true, true, ctx,
> > > > > + true, may_reduce_num, ctx,
> > > > > virtio_vdpa_notify, callback, name);
> > > > > if (!vq) {
> > > > > err = -ENOMEM;
> > > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > > > index 35648c11e312..f809b7ada00d 100644
> > > > > --- a/include/linux/vdpa.h
> > > > > +++ b/include/linux/vdpa.h
> > > > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > > > * @vdev: vdpa device
> > > > > * Returns the iova range supported by
> > > > > * the device.
> > > > > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > > > > + * @vdev: vdpa device
> > > > > + * Returns boolean: unchangeable (true) or not (false)
> > > > > * @set_map: Set device memory mapping (optional)
> > > > > * Needed for device that using device
> > > > > * specific DMA translation (on-chip IOMMU)
> > > > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > > > const void *buf, unsigned int len);
> > > > > u32 (*get_generation)(struct vdpa_device *vdev);
> > > > > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > > > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > > > >
> > > > > /* DMA ops */
> > > > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > > > index c998860d7bbc..184f1f7f8498 100644
> > > > > --- a/include/uapi/linux/vhost.h
> > > > > +++ b/include/uapi/linux/vhost.h
> > > > > @@ -150,4 +150,6 @@
> > > > > /* Get the valid iova range */
> > > > > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > > > > struct vhost_vdpa_iova_range)
> > > > > +/* Check if the vring size can be change */
> > > > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > > > > #endif
> > > > > --
> > > > > 2.31.1
> > > > >
> > >
>

2021-09-10 01:48:21

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH 0/6] vDPA driver for legacy virtio-pci device

On Thu, Sep 9, 2021 at 8:53 PM Wu Zongyong <[email protected]> wrote:
>
> On Thu, Sep 09, 2021 at 11:05:06AM +0800, Jason Wang wrote:
> > On Wed, Sep 8, 2021 at 8:22 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > This series implements the vDPA driver for legacy virtio-pci device.
> > > Currently we already have the vDPA driver for modern virtio-pci device
> > > only, but there are some legacy virtio-pci devices conforming to the
> > > virtio-pci specifications of 0.9.x or older versions. For example,
> > > ENI(Elastic Network Interface) of Alibaba ECS baremetal instance is a
> > > hardware virtio network device which follows the Virtio PCI Card 0.9.5
> > > Draft specification. Such legacy virtio-pci devices have some
> > > inconsistent behaviour with modern virtio-pci devices, so some common
> > > codes are split out and modern device specific codes are moved to a
> > > separated file.
> >
> > What worries me a little bit are:
> >
> > 1) vDPA requires IOMMU_PLATFORM to be supported by the device to work,
> > if I understand ENI correctly, it's a legacy device so it can't
> > support ACCESS_PLATFORM. Or is it a legacy device that supports
> > ACCESS_PLATFORM implicitly.
> > 2) vDPA tries to present a 1.0 device, in this case the behavior could
> > be ruled by the spec. If we tries to present an 1.0 device on top of
> > legacy device we may suffer a lot of issues:
> >
> > - endian issue: 1.0 use le but legacy may use native endian
> > - queue_enable semantic which is missed in the legacy
> > - virtqueue size, as you mentioned below
> >
> > I guess what the device(ENI) supports are:
> >
> > 1) semantic of ACCESS_PLATFORM without a feature
> > 2) little endian
> > 3) but a legacy device
> >
> > So I think it might be better:
> >
> > 1) introduce the library for legacy as you did in this patch
> > 2) having a dedicated ENI vDPA driver
>
> Would you mind I place the ENI vDPA driver inside virtio_pci folder? Or
> should I create a new folder for it?

I think it's better to have a new folder.

Thanks

>
> >
> > 3) live migration support, though it was not supported by the spec
> > yet, but we are working on the support, and we know legacy device can
> > support this.
> >
> > Thanks
> >
> > >
> > > For legacy devices, it is not supported to negotiate the virtqueue size
> > > by the specification. So a new callback get_vq_num_unchangeable is
> > > introduced to indicate user not to try change the virtqueue size of the
> > > legacy vdpa device. For example, QEMU should not allocate memory for
> > > virtqueue according to the properties tx_queue_size and rx_queue_size if
> > > we use legacy virtio-pci device as the vhost-vdpa backend. Instead, QEMU
> > > should use the new callback get_vq_num_unchangeable first to check if
> > > the vdpa device support to change virtqueue size. If not, QEMU should
> > > call the callback get_vq_num_max to get the static virtqueue size then
> > > allocate the same size memory for the virtqueue.
> > >
> > > This series have been tested with the ENI in Alibaba ECS baremetal
> > > instance.
> > >
> > > These patches may under consideration, welcome for comments.
> > >
> > >
> > > Wu Zongyong (6):
> > > virtio-pci: introduce legacy device module
> > > vdpa: fix typo
> > > vp_vdpa: add vq irq offloading support
> > > vp_vdpa: split out reusable and device specific codes to separate file
> > > vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops
> > > vp_vdpa: introduce legacy virtio pci driver
> > >
> > > drivers/vdpa/Kconfig | 7 +
> > > drivers/vdpa/virtio_pci/Makefile | 3 +
> > > drivers/vdpa/virtio_pci/vp_vdpa.c | 8 +
> > > drivers/vdpa/virtio_pci/vp_vdpa_common.c | 220 ++++++++++++++
> > > drivers/vdpa/virtio_pci/vp_vdpa_common.h | 67 +++++
> > > drivers/vdpa/virtio_pci/vp_vdpa_legacy.c | 346 +++++++++++++++++++++++
> > > drivers/vdpa/virtio_pci/vp_vdpa_modern.c | 327 +++++++++++++++++++++
> > > drivers/vhost/vdpa.c | 19 ++
> > > drivers/virtio/Kconfig | 10 +
> > > drivers/virtio/Makefile | 1 +
> > > drivers/virtio/virtio_pci_common.c | 10 +-
> > > drivers/virtio/virtio_pci_common.h | 9 +-
> > > drivers/virtio/virtio_pci_legacy.c | 101 ++-----
> > > drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++++++
> > > drivers/virtio/virtio_vdpa.c | 5 +-
> > > include/linux/vdpa.h | 6 +-
> > > include/linux/virtio_pci_legacy.h | 44 +++
> > > include/uapi/linux/vhost.h | 2 +
> > > 18 files changed, 1320 insertions(+), 85 deletions(-)
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.c
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_common.h
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_legacy.c
> > > create mode 100644 drivers/vdpa/virtio_pci/vp_vdpa_modern.c
> > > create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> > > create mode 100644 include/linux/virtio_pci_legacy.h
> > >
> > > --
> > > 2.31.1
> > >
>

2021-09-10 07:34:10

by Wu Zongyong

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> >
> > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > >
> > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > >
> > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > specification.
> > > > >
> > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > >
> > > > > Thanks
> > > > >
> > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > >
> > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > the address to hardware finally.
> > > >
> > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > from the hardware if this new callback return true.
> > >
> > > This will break live migration. My understanding is that we can
> > > advertise those capability/limitation via the netlink management
> > > protocol then management layer can choose to use the correct queue
> > > size.
> > >
> > > Thanks
> > I agree, it is a good idea.
> > BTW, can we also advertise mac address of network device? I found the
> > mac address generated by libvirt or qemu will break the network datapath
> > down if I don't specify the right mac explicitly in the XML or qemu
> > commandline.
>
> We never saw this before, AFAIK when vhost-vdpa is used, currently
> qemu will probably ignore the mac address set via command line since
> the config space is read from the device instead of qemu itself?
>

I saw the code below in qemu:

static void virtio_net_device_realize(DeviceState *dev, Error **errp)
{
    ...
    if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
        struct virtio_net_config netcfg = {};
        memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
        vhost_net_set_config(get_vhost_net(nc->peer),
            (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
    }
    ...
}

This writes the MAC address set via the command line into the vdpa device
config, and the guest then reads it back.
If I remove this code, it behaves as you said.
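
To make the effect concrete, here is a small standalone model of it, assuming
a device whose config space holds only the MAC. struct dev_config,
config_write_mac() and config_read_mac() are hypothetical helpers for
illustration and are not qemu or kernel functions.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for the device config space; only the MAC is modelled. */
struct dev_config {
    uint8_t mac[ETH_ALEN];
};

/* Models a config write of ETH_ALEN bytes at offset 0. */
static void config_write_mac(struct dev_config *cfg, const uint8_t *mac)
{
    memcpy(cfg->mac, mac, ETH_ALEN);
}

/* Models what the guest reads back from the config space. */
static void config_read_mac(const struct dev_config *cfg, uint8_t *out)
{
    memcpy(out, cfg->mac, ETH_ALEN);
}
```

If the device starts with its own MAC and config_write_mac() is then called
with a generated one, config_read_mac() returns the generated address, which
is the behaviour I see when the MAC is not given explicitly.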


> Thanks
>
> > >
> > > >
> > > > What do you think?
> > > > > >
> > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > > ---
> > > > > > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > > > > > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > > > > > include/linux/vdpa.h | 4 ++++
> > > > > > include/uapi/linux/vhost.h | 2 ++
> > > > > > 4 files changed, 29 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > index 9479f7f79217..2204d27d1e5d 100644
> > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > > > > + u32 __user *argp)
> > > > > > +{
> > > > > > + struct vdpa_device *vdpa = v->vdpa;
> > > > > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > > > > + bool unchangeable = false;
> > > > > > +
> > > > > > + if (ops->get_vq_num_unchangeable)
> > > > > > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > > > > +
> > > > > > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > > > > + return -EFAULT;
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > > > > void __user *argp)
> > > > > > {
> > > > > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > case VHOST_VDPA_GET_IOVA_RANGE:
> > > > > > r = vhost_vdpa_get_iova_range(v, argp);
> > > > > > break;
> > > > > > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > > > > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > > > > + break;
> > > > > > default:
> > > > > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > > > if (r == -ENOIOCTLCMD)
> > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > index 72eaef2caeb1..afb47465307a 100644
> > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > struct vdpa_vq_state state = {0};
> > > > > > unsigned long flags;
> > > > > > u32 align, num;
> > > > > > + bool may_reduce_num = true;
> > > > > > int err;
> > > > > >
> > > > > > if (!name)
> > > > > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > >
> > > > > > /* Create the vring */
> > > > > > align = ops->get_vq_align(vdpa);
> > > > > > + if (ops->get_vq_num_unchangeable)
> > > > > > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > - true, true, ctx,
> > > > > > + true, may_reduce_num, ctx,
> > > > > > virtio_vdpa_notify, callback, name);
> > > > > > if (!vq) {
> > > > > > err = -ENOMEM;
> > > > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > > > > index 35648c11e312..f809b7ada00d 100644
> > > > > > --- a/include/linux/vdpa.h
> > > > > > +++ b/include/linux/vdpa.h
> > > > > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > > > > * @vdev: vdpa device
> > > > > > * Returns the iova range supported by
> > > > > > * the device.
> > > > > > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > > > > > + * @vdev: vdpa device
> > > > > > + * Returns boolean: unchangeable (true) or not (false)
> > > > > > * @set_map: Set device memory mapping (optional)
> > > > > > * Needed for device that using device
> > > > > > * specific DMA translation (on-chip IOMMU)
> > > > > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > > > > const void *buf, unsigned int len);
> > > > > > u32 (*get_generation)(struct vdpa_device *vdev);
> > > > > > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > > > > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > > > > >
> > > > > > /* DMA ops */
> > > > > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > > > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > > > > index c998860d7bbc..184f1f7f8498 100644
> > > > > > --- a/include/uapi/linux/vhost.h
> > > > > > +++ b/include/uapi/linux/vhost.h
> > > > > > @@ -150,4 +150,6 @@
> > > > > > /* Get the valid iova range */
> > > > > > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > > > > > struct vhost_vdpa_iova_range)
> > > > > > +/* Check if the vring size can be change */
> > > > > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > > > > > #endif
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > >
> >

2021-09-10 08:29:51

by Cindy Lu

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
<[email protected]> wrote:
>
> On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > >
> > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > >
> > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > specification.
> > > > > >
> > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > >
> > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > the address to hardware finally.
> > > > >
> > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > from the hardware if this new callback return true.
> > > >
> > > > This will break live migration. My understanding is that we can
> > > > advertise those capability/limitation via the netlink management
> > > > protocol then management layer can choose to use the correct queue
> > > > size.
> > > >
> > > > Thanks
> > > I agree, it is a good idea.
> > > BTW, can we also advertise mac address of network device? I found the
> > > mac address generated by libvirt or qemu will break the network datapath
> > > down if I don't specify the right mac explicitly in the XML or qemu
> > > commandline.
> >
> > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > qemu will probably ignore the mac address set via command line since
> > the config space is read from the device instead of qemu itself?
> >
>
> I saw the code below in qemu:
>
> static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> {
> ...
> if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> struct virtio_net_config netcfg = {};
> memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> vhost_net_set_config(get_vhost_net(nc->peer),
> (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> }
> ...
> }
>
> This write the mac address set via cmdline into vdpa device config, and
> then guest will read it back.
> If I remove these codes, it behaves like you said.
>
>
Hi Zongyong,
I think this code only takes effect when qemu gets an all-zero MAC
address from the hardware. You can find more information in the
function virtio_net_get_config.
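
As a rough sketch of that behaviour (this is not the actual qemu code;
mac_is_zero and pick_mac are made-up names used only for illustration),
the cmdline MAC would be used only as a fallback when the device
reports an all-zero address:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Return true when every byte of the MAC address is zero. */
static bool mac_is_zero(const uint8_t mac[ETH_ALEN])
{
    static const uint8_t zero[ETH_ALEN];

    assert(mac != NULL);
    return memcmp(mac, zero, ETH_ALEN) == 0;
}

/*
 * Hypothetical helper: choose the MAC the guest should see.
 * Prefer the address reported by the device and fall back to the
 * cmdline address only when the device reports all zeros.
 */
static void pick_mac(const uint8_t dev_mac[ETH_ALEN],
                     const uint8_t cmdline_mac[ETH_ALEN],
                     uint8_t out[ETH_ALEN])
{
    memcpy(out, mac_is_zero(dev_mac) ? cmdline_mac : dev_mac, ETH_ALEN);
}
```

With a non-zero hardware MAC, pick_mac() returns the hardware value and
the cmdline value is ignored.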

2021-09-10 09:22:02

by Wu Zongyong

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> ,
>
> On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
> <[email protected]> wrote:
> >
> > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > > >
> > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > > >
> > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > >
> > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > specification.
> > > > > > >
> > > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > >
> > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > the address to hardware finally.
> > > > > >
> > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > from the hardware if this new callback return true.
> > > > >
> > > > > This will break live migration. My understanding is that we can
> > > > > advertise those capability/limitation via the netlink management
> > > > > protocol then management layer can choose to use the correct queue
> > > > > size.
> > > > >
> > > > > Thanks
> > > > I agree, it is a good idea.
> > > > BTW, can we also advertise mac address of network device? I found the
> > > > mac address generated by libvirt or qemu will break the network datapath
> > > > down if I don't specify the right mac explicitly in the XML or qemu
> > > > commandline.
> > >
> > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > qemu will probably ignore the mac address set via command line since
> > > the config space is read from the device instead of qemu itself?
> > >
> >
> > I saw the code below in qemu:
> >
> > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > {
> > ...
> > if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > struct virtio_net_config netcfg = {};
> > memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > vhost_net_set_config(get_vhost_net(nc->peer),
> > (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > }
> > ...
> > }
> >
> > This write the mac address set via cmdline into vdpa device config, and
> > then guest will read it back.
> > If I remove these codes, it behaves like you said.
> >
> >
> Hi Zongyong
> I think this code only works while qemu get an all 0 mac address from
> hardware , you can get more information from the function
> virtio_net_get_config.

It depends on how vdpa_config_ops->set_config is implemented.
For mlx5, the set_config callback does nothing. But for virtio-pci, the
set_config callback writes the config registers of the vdpa device, so
qemu writes the MAC set via the cmdline to the hardware, and the MAC the
guest reads back is the value qemu has just written.
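
The difference can be sketched with a toy model. This is an
illustration only, not driver code: struct toy_dev, struct
toy_config_ops and all function names below are hypothetical, and only
the set_config/get_config argument shape is borrowed from
vdpa_config_ops.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Toy device: a config space whose first ETH_ALEN bytes hold the MAC. */
struct toy_dev {
    uint8_t config[16];
};

/* Hypothetical ops table, loosely modelled on vdpa_config_ops. */
struct toy_config_ops {
    void (*set_config)(struct toy_dev *dev, unsigned int offset,
                       const void *buf, unsigned int len);
    void (*get_config)(struct toy_dev *dev, unsigned int offset,
                       void *buf, unsigned int len);
};

/* Variant A: writes are ignored, as described for mlx5 above. */
static void set_config_nop(struct toy_dev *dev, unsigned int offset,
                           const void *buf, unsigned int len)
{
    (void)dev; (void)offset; (void)buf; (void)len;
}

/* Variant B: writes reach the device config, as described for virtio-pci. */
static void set_config_write(struct toy_dev *dev, unsigned int offset,
                             const void *buf, unsigned int len)
{
    assert(offset + len <= sizeof(dev->config));
    memcpy(dev->config + offset, buf, len);
}

static void get_config_read(struct toy_dev *dev, unsigned int offset,
                            void *buf, unsigned int len)
{
    assert(offset + len <= sizeof(dev->config));
    memcpy(buf, dev->config + offset, len);
}

static const struct toy_config_ops nop_ops = {
    .set_config = set_config_nop,
    .get_config = get_config_read,
};

static const struct toy_config_ops write_ops = {
    .set_config = set_config_write,
    .get_config = get_config_read,
};

/* Write the cmdline MAC, then read back what the guest would see. */
static void write_then_read_mac(const struct toy_config_ops *ops,
                                struct toy_dev *dev,
                                const uint8_t cmdline_mac[ETH_ALEN],
                                uint8_t seen[ETH_ALEN])
{
    ops->set_config(dev, 0, cmdline_mac, ETH_ALEN);
    ops->get_config(dev, 0, seen, ETH_ALEN);
}
```

With nop_ops the guest still sees the MAC the hardware started with;
with write_ops it sees the cmdline MAC.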


2021-09-10 15:13:43

by Cindy Lu

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Fri, Sep 10, 2021 at 5:20 PM Wu Zongyong
<[email protected]> wrote:
>
> On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> > ,
> >
> > On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
> > <[email protected]> wrote:
> > >
> > > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > > > >
> > > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > > > >
> > > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > > specification.
> > > > > > > >
> > > > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > > >
> > > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > > the address to hardware finally.
> > > > > > >
> > > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > > from the hardware if this new callback return true.
> > > > > >
> > > > > > This will break live migration. My understanding is that we can
> > > > > > advertise those capability/limitation via the netlink management
> > > > > > protocol then management layer can choose to use the correct queue
> > > > > > size.
> > > > > >
> > > > > > Thanks
> > > > > I agree, it is a good idea.
> > > > > BTW, can we also advertise mac address of network device? I found the
> > > > > mac address generated by libvirt or qemu will break the network datapath
> > > > > down if I don't specify the right mac explicitly in the XML or qemu
> > > > > commandline.
> > > >
> > > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > > qemu will probably ignore the mac address set via command line since
> > > > the config space is read from the device instead of qemu itself?
> > > >
> > >
> > > I saw the code below in qemu:
> > >
> > > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > {
> > > ...
> > > if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > struct virtio_net_config netcfg = {};
> > > memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > vhost_net_set_config(get_vhost_net(nc->peer),
> > > (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > > }
> > > ...
> > > }
> > >
> > > This write the mac address set via cmdline into vdpa device config, and
> > > then guest will read it back.
> > > If I remove these codes, it behaves like you said.
> > >
> > >
> > Hi Zongyong
> > I think this code only works while qemu get an all 0 mac address from
> > hardware , you can get more information from the function
> > virtio_net_get_config.
>
> It depends how vdpa_config_ops->set_config implements.
> For mlx5, callback set_config do nothing. But for virtio-pci, callback
> set_config will write the config register of the vdpa device, so qemu
> will write the mac set via cmdline to hardware and the mac guest read
> it back is the value writted by qemu just now.
>
So here comes a question: which MAC address has higher priority,
the MAC address in hardware or the MAC address from the cmdline?
If both of these MAC addresses exist, which one should we use?
I have checked the spec, but I am not sure whether the VIRTIO_NET_F_MAC
bit is the right one. If yes, I will post a qemu patch that checks this
bit before we set the MAC in hardware.
https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html

Thanks
cindy

2021-09-13 01:56:29

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Fri, Sep 10, 2021 at 11:11 PM Cindy Lu <[email protected]> wrote:
>
> On Fri, Sep 10, 2021 at 5:20 PM Wu Zongyong
> <[email protected]> wrote:
> >
> > On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> > > ,
> > >
> > > On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
> > > <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > > > > >
> > > > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > > > specification.
> > > > > > > > >
> > > > > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > > > >
> > > > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > > > the address to hardware finally.
> > > > > > > >
> > > > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > > > from the hardware if this new callback return true.
> > > > > > >
> > > > > > > This will break live migration. My understanding is that we can
> > > > > > > advertise those capability/limitation via the netlink management
> > > > > > > protocol then management layer can choose to use the correct queue
> > > > > > > size.
> > > > > > >
> > > > > > > Thanks
> > > > > > I agree, it is a good idea.
> > > > > > BTW, can we also advertise mac address of network device? I found the
> > > > > > mac address generated by libvirt or qemu will break the network datapath
> > > > > > down if I don't specify the right mac explicitly in the XML or qemu
> > > > > > commandline.
> > > > >
> > > > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > > > qemu will probably ignore the mac address set via command line since
> > > > > the config space is read from the device instead of qemu itself?
> > > > >
> > > >
> > > > I saw the code below in qemu:
> > > >
> > > > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > > {
> > > > ...
> > > > if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > struct virtio_net_config netcfg = {};
> > > > memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > > vhost_net_set_config(get_vhost_net(nc->peer),
> > > > (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > > > }
> > > > ...
> > > > }
> > > >
> > > > This write the mac address set via cmdline into vdpa device config, and
> > > > then guest will read it back.
> > > > If I remove these codes, it behaves like you said.
> > > >
> > > >
> > > Hi Zongyong
> > > I think this code only works while qemu get an all 0 mac address from
> > > hardware , you can get more information from the function
> > > virtio_net_get_config.
> >
> > It depends how vdpa_config_ops->set_config implements.
> > For mlx5, callback set_config do nothing. But for virtio-pci, callback
> > set_config will write the config register of the vdpa device, so qemu
> > will write the mac set via cmdline to hardware and the mac guest read
> > it back is the value writted by qemu just now.
> >
> So here comes a question, which MAC address has higher priority ?
> the MAC address in hardware or the MAC address from the cmdline?
> If both of these two MAC addresses exist, which should we use?
> I have checked the spec, not sure if the bit VIRTIO_NET_F_MAC is the right one?

I think so: if VIRTIO_NET_F_MAC is set, QEMU can override the MAC; otherwise it cannot.

Thanks

> if yes, I will post a patch in qemu and add check for this bit before
> we set the mac to hardware
> https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html
>
> Thanks
> cindy
> > > > > Thanks
> > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > What do you think?
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > > > > > > ---
> > > > > > > > > > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > > > > > > > > > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > > > > > > > > > include/linux/vdpa.h | 4 ++++
> > > > > > > > > > include/uapi/linux/vhost.h | 2 ++
> > > > > > > > > > 4 files changed, 29 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > index 9479f7f79217..2204d27d1e5d 100644
> > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > > > > > > > > return 0;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > > > > > > > > + u32 __user *argp)
> > > > > > > > > > +{
> > > > > > > > > > + struct vdpa_device *vdpa = v->vdpa;
> > > > > > > > > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > > > > > > > > + bool unchangeable = false;
> > > > > > > > > > +
> > > > > > > > > > + if (ops->get_vq_num_unchangeable)
> > > > > > > > > > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > > > > > > > > +
> > > > > > > > > > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > > > > > > > > + return -EFAULT;
> > > > > > > > > > +
> > > > > > > > > > + return 0;
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > > > > > > > > void __user *argp)
> > > > > > > > > > {
> > > > > > > > > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > case VHOST_VDPA_GET_IOVA_RANGE:
> > > > > > > > > > r = vhost_vdpa_get_iova_range(v, argp);
> > > > > > > > > > break;
> > > > > > > > > > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > > > > > > > > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > > > > > > > > + break;
> > > > > > > > > > default:
> > > > > > > > > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > > > > > > > if (r == -ENOIOCTLCMD)
> > > > > > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > index 72eaef2caeb1..afb47465307a 100644
> > > > > > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > > > struct vdpa_vq_state state = {0};
> > > > > > > > > > unsigned long flags;
> > > > > > > > > > u32 align, num;
> > > > > > > > > > + bool may_reduce_num = true;
> > > > > > > > > > int err;
> > > > > > > > > >
> > > > > > > > > > if (!name)
> > > > > > > > > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > > >
> > > > > > > > > > /* Create the vring */
> > > > > > > > > > align = ops->get_vq_align(vdpa);
> > > > > > > > > > + if (ops->get_vq_num_unchangeable)
> > > > > > > > > > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > > > > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > > > > > - true, true, ctx,
> > > > > > > > > > + true, may_reduce_num, ctx,
> > > > > > > > > > virtio_vdpa_notify, callback, name);
> > > > > > > > > > if (!vq) {
> > > > > > > > > > err = -ENOMEM;
> > > > > > > > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > > > > > > > > index 35648c11e312..f809b7ada00d 100644
> > > > > > > > > > --- a/include/linux/vdpa.h
> > > > > > > > > > +++ b/include/linux/vdpa.h
> > > > > > > > > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > > > > > > > > * @vdev: vdpa device
> > > > > > > > > > * Returns the iova range supported by
> > > > > > > > > > * the device.
> > > > > > > > > > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > > > > > > > > > + * @vdev: vdpa device
> > > > > > > > > > + * Returns boolean: unchangeable (true) or not (false)
> > > > > > > > > > * @set_map: Set device memory mapping (optional)
> > > > > > > > > > * Needed for device that using device
> > > > > > > > > > * specific DMA translation (on-chip IOMMU)
> > > > > > > > > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > > > > > > > > const void *buf, unsigned int len);
> > > > > > > > > > u32 (*get_generation)(struct vdpa_device *vdev);
> > > > > > > > > > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > > > > > > > > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > > > > > > > > >
> > > > > > > > > > /* DMA ops */
> > > > > > > > > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > > > > > > > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > > > > > > > > index c998860d7bbc..184f1f7f8498 100644
> > > > > > > > > > --- a/include/uapi/linux/vhost.h
> > > > > > > > > > +++ b/include/uapi/linux/vhost.h
> > > > > > > > > > @@ -150,4 +150,6 @@
> > > > > > > > > > /* Get the valid iova range */
> > > > > > > > > > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > > > > > > > > > struct vhost_vdpa_iova_range)
> > > > > > > > > > +/* Check if the vring size can be change */
> > > > > > > > > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > > > > > > > > > #endif
> > > > > > > > > > --
> > > > > > > > > > 2.31.1
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
>

2021-09-13 03:03:04

by Wu Zongyong

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Mon, Sep 13, 2021 at 09:43:40AM +0800, Jason Wang wrote:
> On Fri, Sep 10, 2021 at 11:11 PM Cindy Lu <[email protected]> wrote:
> >
> > On Fri, Sep 10, 2021 at 5:20 PM Wu Zongyong
> > <[email protected]> wrote:
> > >
> > > On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> > > > ,
> > > >
> > > > On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > > > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > > > > > >
> > > > > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > > > > specification.
> > > > > > > > > >
> > > > > > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > > > > >
> > > > > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > > > > the address to hardware finally.
> > > > > > > > >
> > > > > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > > > > from the hardware if this new callback return true.
> > > > > > > >
> > > > > > > > This will break live migration. My understanding is that we can
> > > > > > > > advertise those capability/limitation via the netlink management
> > > > > > > > protocol then management layer can choose to use the correct queue
> > > > > > > > size.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > I agree, it is a good idea.
> > > > > > > BTW, can we also advertise mac address of network device? I found the
> > > > > > > mac address generated by libvirt or qemu will break the network datapath
> > > > > > > down if I don't specify the right mac explicitly in the XML or qemu
> > > > > > > commandline.
> > > > > >
> > > > > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > > > > qemu will probably ignore the mac address set via command line since
> > > > > > the config space is read from the device instead of qemu itself?
> > > > > >
> > > > >
> > > > > I saw the code below in qemu:
> > > > >
> > > > > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > > > {
> > > > > ...
> > > > > if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > struct virtio_net_config netcfg = {};
> > > > > memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > > > vhost_net_set_config(get_vhost_net(nc->peer),
> > > > > (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > > > > }
> > > > > ...
> > > > > }
> > > > >
> > > > > This write the mac address set via cmdline into vdpa device config, and
> > > > > then guest will read it back.
> > > > > If I remove these codes, it behaves like you said.
> > > > >
> > > > >
> > > > Hi Zongyong
> > > > I think this code only works while qemu get an all 0 mac address from
> > > > hardware , you can get more information from the function
> > > > virtio_net_get_config.
> > >
> > > It depends how vdpa_config_ops->set_config implements.
> > > For mlx5, callback set_config do nothing. But for virtio-pci, callback
> > > set_config will write the config register of the vdpa device, so qemu
> > > will write the mac set via cmdline to hardware and the mac guest read
> > > it back is the value writted by qemu just now.
> > >
> > So here comes a question, which MAC address has higher priority ?
> > the MAC address in hardware or the MAC address from the cmdline?
> > If both of these two MAC addresses exist, which should we use?
> > I have checked the spec, not sure if the bit VIRTIO_NET_F_MAC is the right one?
>
> I think so, if VIRTIO_NET_F_MAC is set, qemu can override the mac otherwise not.
>
The spec says:
"driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it. If the driver
negotiates the VIRTIO_NET_F_MAC feature, the driver MUST set the physical address
of the NIC to mac. Otherwise, it SHOULD use a locally-administered MAC address."

If I understand correctly, you actually mean that QEMU CANNOT override the MAC
that the device provides?
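
The driver-side rule quoted from the spec can be sketched roughly as below. This is only an illustration of my reading of that paragraph, not code from any driver: choose_driver_mac() and its parameters are hypothetical names, and the caller is assumed to supply the random bytes.

```c
#include <stdint.h>
#include <string.h>

/* VIRTIO_NET_F_MAC is feature bit 5 in the virtio-net specification. */
#define VIRTIO_NET_F_MAC 5
#define ETH_ALEN 6

/*
 * Hypothetical helper, for illustration only.
 * If VIRTIO_NET_F_MAC was negotiated, use the MAC from the device config
 * space. Otherwise derive a locally-administered unicast MAC from the
 * caller-supplied random bytes.
 */
static void choose_driver_mac(uint64_t negotiated_features,
                              const uint8_t config_mac[ETH_ALEN],
                              const uint8_t random_bytes[ETH_ALEN],
                              uint8_t out[ETH_ALEN])
{
    if ((negotiated_features >> VIRTIO_NET_F_MAC) & 1) {
        memcpy(out, config_mac, ETH_ALEN);
        return;
    }

    memcpy(out, random_bytes, ETH_ALEN);
    out[0] &= 0xfe; /* clear the multicast bit */
    out[0] |= 0x02; /* set the locally-administered bit */
}
```

Note that this only covers what the guest driver does with the mac field; it says nothing yet about whether QEMU may write that field beforehand.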

2021-09-13 03:15:21

by Jason Wang

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Mon, Sep 13, 2021 at 10:59 AM Wu Zongyong
<[email protected]> wrote:
>
> On Mon, Sep 13, 2021 at 09:43:40AM +0800, Jason Wang wrote:
> > On Fri, Sep 10, 2021 at 11:11 PM Cindy Lu <[email protected]> wrote:
> > >
> > > On Fri, Sep 10, 2021 at 5:20 PM Wu Zongyong
> > > <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> > > > > ,
> > > > >
> > > > > On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > > > > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > > > > > specification.
> > > > > > > > > > >
> > > > > > > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > > > > > >
> > > > > > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > > > > > the address to hardware finally.
> > > > > > > > > >
> > > > > > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > > > > > from the hardware if this new callback return true.
> > > > > > > > >
> > > > > > > > > This will break live migration. My understanding is that we can
> > > > > > > > > advertise those capability/limitation via the netlink management
> > > > > > > > > protocol then management layer can choose to use the correct queue
> > > > > > > > > size.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > I agree, it is a good idea.
> > > > > > > > BTW, can we also advertise mac address of network device? I found the
> > > > > > > > mac address generated by libvirt or qemu will break the network datapath
> > > > > > > > down if I don't specify the right mac explicitly in the XML or qemu
> > > > > > > > commandline.
> > > > > > >
> > > > > > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > > > > > qemu will probably ignore the mac address set via command line since
> > > > > > > the config space is read from the device instead of qemu itself?
> > > > > > >
> > > > > >
> > > > > > I saw the code below in qemu:
> > > > > >
> > > > > > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > > > > {
> > > > > > ...
> > > > > > if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > > struct virtio_net_config netcfg = {};
> > > > > > memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > > > > vhost_net_set_config(get_vhost_net(nc->peer),
> > > > > > (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > > > > > }
> > > > > > ...
> > > > > > }
> > > > > >
> > > > > > This write the mac address set via cmdline into vdpa device config, and
> > > > > > then guest will read it back.
> > > > > > If I remove these codes, it behaves like you said.
> > > > > >
> > > > > >
> > > > > Hi Zongyong
> > > > > I think this code only works while qemu get an all 0 mac address from
> > > > > hardware , you can get more information from the function
> > > > > virtio_net_get_config.
> > > >
> > > > It depends how vdpa_config_ops->set_config implements.
> > > > For mlx5, callback set_config do nothing. But for virtio-pci, callback
> > > > set_config will write the config register of the vdpa device, so qemu
> > > > will write the mac set via cmdline to hardware and the mac guest read
> > > > it back is the value writted by qemu just now.
> > > >
> > > So here comes a question, which MAC address has higher priority ?
> > > the MAC address in hardware or the MAC address from the cmdline?
> > > If both of these two MAC addresses exist, which should we use?
> > > I have checked the spec, not sure if the bit VIRTIO_NET_F_MAC is the right one?
> >
> > I think so, if VIRTIO_NET_F_MAC is set, qemu can override the mac otherwise not.
> >
> The spec says:
> "driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it. If the driver
> negotiates the VIRTIO_NET_F_MAC feature, the driver MUST set the physical address
> of the NIC to mac. Otherwise, it SHOULD use a locally-administered MAC address."
>
> To my understanding, I guess you mean qemu CANNOT override the mac
> device provides actually?

It seems not: if VIRTIO_NET_F_MAC is not negotiated, the mac field in the
config space is not valid:

"The mac address field always exists (though is only valid if
VIRTIO_NET_F_MAC is set)"

So I think the right approach is:

- if no mac is specified on the cli, QEMU doesn't need to override the mac
- if a mac is specified on the cli and VIRTIO_NET_F_MAC is supported,
  QEMU can override the mac
- if a mac is specified on the cli and VIRTIO_NET_F_MAC is not
  supported, we need to fail the launch

Note that we're working on extending the netlink management API to set the
mac address during vDPA instance provisioning. The management layer can
then get the correct mac address and set it via the cli. AFAIK, Cindy's
patch is a workaround for the case where netlink doesn't support the mac
address.
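
The three cases above could be sketched as follows. This is only a rough illustration, not actual QEMU code: decide_mac_action() and the enum names are hypothetical, and the caller is assumed to already know whether a mac was given on the cli and which features the device offers.

```c
#include <stdbool.h>
#include <stdint.h>

/* VIRTIO_NET_F_MAC is feature bit 5 in the virtio-net specification. */
#define VIRTIO_NET_F_MAC 5

/* Hypothetical outcome names, not taken from QEMU. */
enum mac_action {
    MAC_KEEP_DEVICE, /* no mac on the cli: leave the device's mac alone */
    MAC_OVERRIDE,    /* mac on the cli and feature offered: write it to the config space */
    MAC_FAIL_LAUNCH, /* mac on the cli but feature missing: refuse to start */
};

static enum mac_action decide_mac_action(bool mac_on_cli,
                                         uint64_t device_features)
{
    bool has_f_mac = (device_features >> VIRTIO_NET_F_MAC) & 1;

    if (!mac_on_cli)
        return MAC_KEEP_DEVICE;

    return has_f_mac ? MAC_OVERRIDE : MAC_FAIL_LAUNCH;
}
```

Only the MAC_OVERRIDE case would end up in the vhost_net_set_config() call quoted earlier in the thread.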

Thanks


2021-09-13 06:23:45

by Cindy Lu

Subject: Re: [PATCH 5/6] vdpa: add get_vq_num_unchangeable callback in vdpa_config_ops

On Mon, Sep 13, 2021 at 11:13 AM Jason Wang <[email protected]> wrote:
>
> On Mon, Sep 13, 2021 at 10:59 AM Wu Zongyong
> <[email protected]> wrote:
> >
> > On Mon, Sep 13, 2021 at 09:43:40AM +0800, Jason Wang wrote:
> > > On Fri, Sep 10, 2021 at 11:11 PM Cindy Lu <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 10, 2021 at 5:20 PM Wu Zongyong
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Fri, Sep 10, 2021 at 04:25:18PM +0800, Cindy Lu wrote:
> > > > > > ,
> > > > > >
> > > > > > On Fri, Sep 10, 2021 at 3:33 PM Wu Zongyong
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > On Fri, Sep 10, 2021 at 09:45:53AM +0800, Jason Wang wrote:
> > > > > > > > On Thu, Sep 9, 2021 at 5:57 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Sep 09, 2021 at 05:28:26PM +0800, Jason Wang wrote:
> > > > > > > > > > On Thu, Sep 9, 2021 at 4:02 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Sep 09, 2021 at 10:55:03AM +0800, Jason Wang wrote:
> > > > > > > > > > > > On Wed, Sep 8, 2021 at 8:23 PM Wu Zongyong <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > This new callback is used to indicate whether the vring size can be
> > > > > > > > > > > > > change or not. It is useful when we have a legacy virtio pci device as
> > > > > > > > > > > > > the vdpa device for there is no way to negotiate the vring num by the
> > > > > > > > > > > > > specification.
> > > > > > > > > > > >
> > > > > > > > > > > > So I'm not sure it's worth bothering. E.g what if we just fail
> > > > > > > > > > > > VHOST_SET_VRING_NUM it the value doesn't match what hardware has?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > I think we should not call VHOST_SET_VRING_NUM in that case.
> > > > > > > > > > >
> > > > > > > > > > > If the hardware reports that the virtqueue size cannot be changed, we
> > > > > > > > > > > should call VHOST_GET_VRING_NUM to get the static virtqueue size
> > > > > > > > > > > firstly, then allocate the same size memory for the virtqueues and write
> > > > > > > > > > > the address to hardware finally.
> > > > > > > > > > >
> > > > > > > > > > > For QEMU, we will ignore the properties rx/tx_queue_size and just get it
> > > > > > > > > > > from the hardware if this new callback return true.
> > > > > > > > > >
> > > > > > > > > > This will break live migration. My understanding is that we can
> > > > > > > > > > advertise those capability/limitation via the netlink management
> > > > > > > > > > protocol then management layer can choose to use the correct queue
> > > > > > > > > > size.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > I agree, it is a good idea.
> > > > > > > > > BTW, can we also advertise mac address of network device? I found the
> > > > > > > > > mac address generated by libvirt or qemu will break the network datapath
> > > > > > > > > down if I don't specify the right mac explicitly in the XML or qemu
> > > > > > > > > commandline.
> > > > > > > >
> > > > > > > > We never saw this before, AFAIK when vhost-vdpa is used, currently
> > > > > > > > qemu will probably ignore the mac address set via command line since
> > > > > > > > the config space is read from the device instead of qemu itself?
> > > > > > > >
> > > > > > >
> > > > > > > I saw the code below in qemu:
> > > > > > >
> > > > > > > static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > > > > > {
> > > > > > > ...
> > > > > > > if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > > > struct virtio_net_config netcfg = {};
> > > > > > > memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > > > > > vhost_net_set_config(get_vhost_net(nc->peer),
> > > > > > > (uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_MASTER);
> > > > > > > }
> > > > > > > ...
> > > > > > > }
> > > > > > >
> > > > > > > This write the mac address set via cmdline into vdpa device config, and
> > > > > > > then guest will read it back.
> > > > > > > If I remove these codes, it behaves like you said.
> > > > > > >
> > > > > > >
> > > > > > Hi Zongyong
> > > > > > I think this code only works while qemu get an all 0 mac address from
> > > > > > hardware , you can get more information from the function
> > > > > > virtio_net_get_config.
> > > > >
> > > > > It depends how vdpa_config_ops->set_config implements.
> > > > > For mlx5, callback set_config do nothing. But for virtio-pci, callback
> > > > > set_config will write the config register of the vdpa device, so qemu
> > > > > will write the mac set via cmdline to hardware and the mac guest read
> > > > > it back is the value writted by qemu just now.
> > > > >
> > > > So here comes a question, which MAC address has higher priority ?
> > > > the MAC address in hardware or the MAC address from the cmdline?
> > > > If both of these two MAC addresses exist, which should we use?
> > > > I have checked the spec, not sure if the bit VIRTIO_NET_F_MAC is the right one?
> > >
> > > I think so, if VIRTIO_NET_F_MAC is set, qemu can override the mac otherwise not.
> > >
> > The spec says:
> > "driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it. If the driver
> > negotiates the VIRTIO_NET_F_MAC feature, the driver MUST set the physical address
> > of the NIC to mac. Otherwise, it SHOULD use a locally-administered MAC address."
> >
> > To my understanding, I guess you mean qemu CANNOT override the mac
> > device provides actually?
>
> Seems not, if VIRTIO_NET_F_MAC is not negotiated, mac is not valid in
> the config space:
>
> "The mac address field always exists (though is only valid if
> VIRTIO_NET_F_MAC is set)"
>
> So I think the right approach:
>
> - if mac is not specified in the cli, Qemu doesn't need to override the mac
> - if mac is specified in the cli and VIRTIO_NET_F_MAC is supported,
> Qemu can override the mac
> - if mac is specified in the cli and VIRTIO_NET_F_MAC is not
> supported, we need fail the launching
>
> Note that we're working on extending the netlink management API to set
> mac address during vDPA instance provisioning. Management layer can
> then get the correct mac address and set it via cli. AFAIK, Cindy's
> patch is a workaround when netlink doesn't support mac address.
>
> Thanks
>
sure, I will post a patch based on that
> > > Thanks
> > >
> > > > if yes, I will post a patch in qemu and add check for this bit before
> > > > we set the mac to hardware
> > > > https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html
> > > >
> > > > Thanks
> > > > cindy
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > What do you think?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > > > > > > > > > ---
> > > > > > > > > > > > > drivers/vhost/vdpa.c | 19 +++++++++++++++++++
> > > > > > > > > > > > > drivers/virtio/virtio_vdpa.c | 5 ++++-
> > > > > > > > > > > > > include/linux/vdpa.h | 4 ++++
> > > > > > > > > > > > > include/uapi/linux/vhost.h | 2 ++
> > > > > > > > > > > > > 4 files changed, 29 insertions(+), 1 deletion(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > > > index 9479f7f79217..2204d27d1e5d 100644
> > > > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > > > @@ -350,6 +350,22 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > > > > > > > > > > > return 0;
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > +static long vhost_vdpa_get_vring_num_unchangeable(struct vhost_vdpa *v,
> > > > > > > > > > > > > + u32 __user *argp)
> > > > > > > > > > > > > +{
> > > > > > > > > > > > > + struct vdpa_device *vdpa = v->vdpa;
> > > > > > > > > > > > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > > > > > > > > > > > + bool unchangeable = false;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + if (ops->get_vq_num_unchangeable)
> > > > > > > > > > > > > + unchangeable = ops->get_vq_num_unchangeable(vdpa);
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + if (copy_to_user(argp, &unchangeable, sizeof(unchangeable)))
> > > > > > > > > > > > > + return -EFAULT;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + return 0;
> > > > > > > > > > > > > +}
> > > > > > > > > > > > > +
> > > > > > > > > > > > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > > > > > > > > > > > > void __user *argp)
> > > > > > > > > > > > > {
> > > > > > > > > > > > > @@ -487,6 +503,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > > > > case VHOST_VDPA_GET_IOVA_RANGE:
> > > > > > > > > > > > > r = vhost_vdpa_get_iova_range(v, argp);
> > > > > > > > > > > > > break;
> > > > > > > > > > > > > + case VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE:
> > > > > > > > > > > > > + r = vhost_vdpa_get_vring_num_unchangeable(v, argp);
> > > > > > > > > > > > > + break;
> > > > > > > > > > > > > default:
> > > > > > > > > > > > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > > > > > > > > > > if (r == -ENOIOCTLCMD)
> > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > > > > index 72eaef2caeb1..afb47465307a 100644
> > > > > > > > > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > > > > > > > > @@ -146,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > > > > > > struct vdpa_vq_state state = {0};
> > > > > > > > > > > > > unsigned long flags;
> > > > > > > > > > > > > u32 align, num;
> > > > > > > > > > > > > + bool may_reduce_num = true;
> > > > > > > > > > > > > int err;
> > > > > > > > > > > > >
> > > > > > > > > > > > > if (!name)
> > > > > > > > > > > > > @@ -171,8 +172,10 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > > > > > >
> > > > > > > > > > > > > /* Create the vring */
> > > > > > > > > > > > > align = ops->get_vq_align(vdpa);
> > > > > > > > > > > > > + if (ops->get_vq_num_unchangeable)
> > > > > > > > > > > > > + may_reduce_num = !ops->get_vq_num_unchangeable(vdpa);
> > > > > > > > > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > > > > > > > > - true, true, ctx,
> > > > > > > > > > > > > + true, may_reduce_num, ctx,
> > > > > > > > > > > > > virtio_vdpa_notify, callback, name);
> > > > > > > > > > > > > if (!vq) {
> > > > > > > > > > > > > err = -ENOMEM;
> > > > > > > > > > > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > > > > > > > > > > > index 35648c11e312..f809b7ada00d 100644
> > > > > > > > > > > > > --- a/include/linux/vdpa.h
> > > > > > > > > > > > > +++ b/include/linux/vdpa.h
> > > > > > > > > > > > > @@ -195,6 +195,9 @@ struct vdpa_iova_range {
> > > > > > > > > > > > > * @vdev: vdpa device
> > > > > > > > > > > > > * Returns the iova range supported by
> > > > > > > > > > > > > * the device.
> > > > > > > > > > > > > + * @get_vq_num_unchangeable Check if size of virtqueue is unchangeable (optional)
> > > > > > > > > > > > > + * @vdev: vdpa device
> > > > > > > > > > > > > + * Returns boolean: unchangeable (true) or not (false)
> > > > > > > > > > > > > * @set_map: Set device memory mapping (optional)
> > > > > > > > > > > > > * Needed for device that using device
> > > > > > > > > > > > > * specific DMA translation (on-chip IOMMU)
> > > > > > > > > > > > > @@ -262,6 +265,7 @@ struct vdpa_config_ops {
> > > > > > > > > > > > > const void *buf, unsigned int len);
> > > > > > > > > > > > > u32 (*get_generation)(struct vdpa_device *vdev);
> > > > > > > > > > > > > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > > > > > > > > > > > > + bool (*get_vq_num_unchangeable)(struct vdpa_device *vdev);
> > > > > > > > > > > > >
> > > > > > > > > > > > > /* DMA ops */
> > > > > > > > > > > > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > > > > > > > > > > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > > > > > > > > > > > index c998860d7bbc..184f1f7f8498 100644
> > > > > > > > > > > > > --- a/include/uapi/linux/vhost.h
> > > > > > > > > > > > > +++ b/include/uapi/linux/vhost.h
> > > > > > > > > > > > > @@ -150,4 +150,6 @@
> > > > > > > > > > > > > /* Get the valid iova range */
> > > > > > > > > > > > > #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
> > > > > > > > > > > > > struct vhost_vdpa_iova_range)
> > > > > > > > > > > > > +/* Check if the vring size can be change */
> > > > > > > > > > > > > +#define VHOST_VDPA_GET_VRING_NUM_UNCHANGEABLE _IOR(VHOST_VIRTIO, 0X79, bool)
> > > > > > > > > > > > > #endif
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.31.1
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > > >
> >
>
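
The three-case MAC policy proposed in this message can be sketched as a small decision function. This is only an illustration under stated assumptions, not QEMU code: the enum and the function name choose_mac_action() are hypothetical, and the only value taken from the virtio specification is the VIRTIO_NET_F_MAC feature bit number (5).

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number from the virtio spec: the device has a given MAC. */
#define VIRTIO_NET_F_MAC 5

/* Hypothetical outcome names, used only for this illustration. */
enum mac_action {
	MAC_KEEP_DEVICE,	/* leave the device-provided MAC untouched */
	MAC_OVERRIDE,		/* write the command-line MAC into the device config */
	MAC_FAIL_LAUNCH,	/* refuse to start the VM */
};

/*
 * mac_on_cli:       true if the user passed a MAC on the command line
 * device_features:  feature bits offered by the vDPA device
 */
static enum mac_action choose_mac_action(bool mac_on_cli,
					 uint64_t device_features)
{
	bool has_f_mac = device_features & (1ULL << VIRTIO_NET_F_MAC);

	/* Case 1: nothing specified, so there is nothing to override. */
	if (!mac_on_cli)
		return MAC_KEEP_DEVICE;

	/* Cases 2 and 3: a MAC was specified; it needs VIRTIO_NET_F_MAC. */
	return has_f_mac ? MAC_OVERRIDE : MAC_FAIL_LAUNCH;
}
```

For example, choose_mac_action(true, 0) yields MAC_FAIL_LAUNCH, which matches the "fail the launching" case above.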

2021-09-14 12:26:24

by Wu Zongyong

Subject: [PATCH v2 0/5] vDPA driver for Alibaba ENI

This series implements the vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built on the virtio-pci 0.9.5 specification.

A new vdpa attribute, VDPA_ATTR_DEV_F_VERSION_1, is introduced so that
users can choose the right virtqueue size when the vdpa device is legacy.

Changes from v1:
- add a new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
the vdpa device is legacy
- implement a dedicated driver for Alibaba ENI instead of a legacy
virtio-pci driver, as suggested by Jason Wang
- fix some bugs

Wu Zongyong (5):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1
eni_vdpa: add vDPA driver for Alibaba ENI

drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 537 +++++++++++++++++++++++++
drivers/vdpa/vdpa.c | 6 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++---
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
drivers/virtio/virtio_vdpa.c | 7 +-
include/linux/vdpa.h | 2 +-
include/linux/virtio_pci_legacy.h | 44 ++
include/uapi/linux/vdpa.h | 1 +
16 files changed, 887 insertions(+), 85 deletions(-)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1
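
How a management layer might use the new attribute to pick a queue size can be sketched as follows. This is a minimal illustration, not code from this series: pick_vq_size() is a hypothetical helper, and clamping the requested size to the device maximum for non-legacy devices is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the virtqueue size to configure.
 *
 * is_version_1: whether the device advertises VIRTIO_F_VERSION_1
 * max_vq_size:  the maximum queue size reported by the device
 * requested:    the size the user asked for (e.g. rx/tx_queue_size)
 */
static uint16_t pick_vq_size(bool is_version_1, uint16_t max_vq_size,
			     uint16_t requested)
{
	/* Legacy device: the size is fixed, so the request is ignored. */
	if (!is_version_1)
		return max_vq_size;

	/* Modern device (assumption): honour the request up to the maximum. */
	return requested < max_vq_size ? requested : max_vq_size;
}
```

With a legacy device reporting a maximum of 256, pick_vq_size(false, 256, 1024) and pick_vq_size(false, 256, 64) both give 256, so the memory allocated for the ring always matches what the hardware expects.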

2021-09-14 12:27:04

by Wu Zongyong

Subject: [PATCH v2 3/5] vp_vdpa: add vq irq offloading support

This patch implements the get_vq_irq() callback for virtio pci devices
to allow irq offloading.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
index 5bcd00246d2e..e3ff7875e123 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
return vp_modern_get_status(mdev);
}

+static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ int irq = vp_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
{
struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
@@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
.get_config = vp_vdpa_get_config,
.set_config = vp_vdpa_set_config,
.set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
};

static void vp_vdpa_free_irq_vectors(void *data)
--
2.31.1

2021-09-14 12:27:26

by Wu Zongyong

Subject: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

This new attribute advertises whether the vdpa device is legacy or not.
Users can pick the right virtqueue size when the vdpa device is legacy,
since a legacy device does not support changing the virtqueue size.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/vdpa.c | 6 ++++++
drivers/virtio/virtio_vdpa.c | 7 ++++++-
include/uapi/linux/vdpa.h | 1 +
3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1dc121a07a93..533d7f589eee 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -12,6 +12,7 @@
#include <linux/slab.h>
#include <linux/vdpa.h>
#include <uapi/linux/vdpa.h>
+#include <uapi/linux/virtio_config.h>
#include <net/genetlink.h>
#include <linux/mod_devicetable.h>

@@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
u16 max_vq_size;
u32 device_id;
u32 vendor_id;
+ u64 features;
void *hdr;
int err;

@@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
device_id = vdev->config->get_device_id(vdev);
vendor_id = vdev->config->get_vendor_id(vdev);
max_vq_size = vdev->config->get_vq_num_max(vdev);
+ features = vdev->config->get_features(vdev);

err = -EMSGSIZE;
if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
@@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
goto msg_err;
if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
goto msg_err;
+ if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
+ nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
+ goto msg_err;

genlmsg_end(msg, hdr);
return 0;
diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..1cba957c4cdc 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -7,6 +7,7 @@
*
*/

+#include <linux/virtio_config.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/device.h>
@@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
/* Assume split virtqueue, switch to packed if necessary */
struct vdpa_vq_state state = {0};
unsigned long flags;
+ bool may_reduce_num = false;
u32 align, num;
int err;

@@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
goto error_new_virtqueue;
}

+ if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
+ may_reduce_num = true;
+
/* Create the vring */
align = ops->get_vq_align(vdpa);
vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 66a41e4ec163..ce0b74276a5b 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -32,6 +32,7 @@ enum vdpa_attr {
VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
VDPA_ATTR_DEV_MAX_VQS, /* u32 */
VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
+ VDPA_ATTR_DEV_VERSION_1, /* flag */

/* new attributes must be added above here */
VDPA_ATTR_MAX,
--
2.31.1
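
The feature test used in this patch can be tried in user space with the sketch below. It is an illustration only: BIT_ULL() is redefined here as a stand-in for the kernel macro, may_reduce_num() and legacy_features() are hypothetical helper names, and the value 32 for VIRTIO_F_VERSION_1 comes from the virtio specification. It also shows why a legacy device whose feature word is read from a 32-bit register (as vp_legacy_get_features() in patch 1/5 does with ioread32()) can never report VIRTIO_F_VERSION_1.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_VERSION_1	32		/* bit number per the virtio spec */
#define BIT_ULL(n)		(1ULL << (n))	/* user-space stand-in for the kernel macro */

/* Mirrors the check in virtio_vdpa_setup_vq(): only modern devices may shrink the ring. */
static bool may_reduce_num(uint64_t features)
{
	return (features & BIT_ULL(VIRTIO_F_VERSION_1)) != 0;
}

/* Models a legacy feature read: a 32-bit register widened to 64 bits. */
static uint64_t legacy_features(uint32_t host_features_reg)
{
	return host_features_reg;
}
```

Even with every bit of the 32-bit register set, may_reduce_num(legacy_features(0xffffffff)) is false, because bit 32 lies outside the register; a modern device that offers BIT_ULL(VIRTIO_F_VERSION_1) gets true.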

2021-09-14 12:28:39

by Wu Zongyong

Subject: [PATCH v2 1/5] virtio-pci: introduce legacy device module

Split the common code out of virtio-pci-legacy so that the vDPA driver
can reuse it later.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/Kconfig | 10 ++
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 +++---------
drivers/virtio/virtio_pci_legacy_dev.c | 220 +++++++++++++++++++++++++
include/linux/virtio_pci_legacy.h | 44 +++++
7 files changed, 312 insertions(+), 83 deletions(-)
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index ce1b3f6ec325..b14768dc9e04 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
PCI device with possible vendor specific extensions. Any
module that selects this module must depend on PCI.

+config VIRTIO_PCI_LIB_LEGACY
+ tristate
+ help
+ Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
+ implementation.
+ This module implements the basic probe and control for devices
+ which are based on legacy PCI device. Any module that selects this
+ module must depend on PCI.
+
menuconfig VIRTIO_MENU
bool "Virtio drivers"
default y
@@ -43,6 +52,7 @@ config VIRTIO_PCI_LEGACY
bool "Support for legacy virtio draft 0.9.X and older devices"
default y
depends on VIRTIO_PCI
+ select VIRTIO_PCI_LIB_LEGACY
help
Virtio PCI Card 0.9.X Draft (circa 2014) and older device support.

diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 699bbea0465f..0a82d0873248 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
+obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index b35bb2d57f62..d724f676608b 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -549,6 +549,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,

pci_set_master(pci_dev);

+ vp_dev->is_legacy = vp_dev->ldev.ioaddr ? true : false;
+
rc = register_virtio_device(&vp_dev->vdev);
reg_dev = vp_dev;
if (rc)
@@ -557,10 +559,10 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
return 0;

err_register:
- if (vp_dev->ioaddr)
- virtio_pci_legacy_remove(vp_dev);
+ if (vp_dev->is_legacy)
+ virtio_pci_legacy_remove(vp_dev);
else
- virtio_pci_modern_remove(vp_dev);
+ virtio_pci_modern_remove(vp_dev);
err_probe:
pci_disable_device(pci_dev);
err_enable_device:
@@ -587,7 +589,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)

unregister_virtio_device(&vp_dev->vdev);

- if (vp_dev->ioaddr)
+ if (vp_dev->is_legacy)
virtio_pci_legacy_remove(vp_dev);
else
virtio_pci_modern_remove(vp_dev);
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index beec047a8f8d..eb17a29fc7ef 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -25,6 +25,7 @@
#include <linux/virtio_config.h>
#include <linux/virtio_ring.h>
#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
#include <linux/virtio_pci_modern.h>
#include <linux/highmem.h>
#include <linux/spinlock.h>
@@ -44,16 +45,14 @@ struct virtio_pci_vq_info {
struct virtio_pci_device {
struct virtio_device vdev;
struct pci_dev *pci_dev;
+ struct virtio_pci_legacy_device ldev;
struct virtio_pci_modern_device mdev;

- /* In legacy mode, these two point to within ->legacy. */
+ bool is_legacy;
+
/* Where to read and clear interrupt */
u8 __iomem *isr;

- /* Legacy only field */
- /* the IO mapping for the PCI config space */
- void __iomem *ioaddr;
-
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
struct list_head virtqueues;
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index d62e9835aeec..82eb437ad920 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -14,6 +14,7 @@
* Michael S. Tsirkin <[email protected]>
*/

+#include <linux/virtio_pci_legacy.h>
#include "virtio_pci_common.h"

/* virtio config->get_features() implementation */
@@ -23,7 +24,7 @@ static u64 vp_get_features(struct virtio_device *vdev)

/* When someone needs more than 32 feature bits, we'll need to
* steal a bit to indicate that the rest are somewhere else. */
- return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+ return vp_legacy_get_features(&vp_dev->ldev);
}

/* virtio config->finalize_features() implementation */
@@ -38,7 +39,7 @@ static int vp_finalize_features(struct virtio_device *vdev)
BUG_ON((u32)vdev->features != vdev->features);

/* We only support 32 feature bits. */
- iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+ vp_legacy_set_features(&vp_dev->ldev, vdev->features);

return 0;
}
@@ -48,7 +49,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
u8 *ptr = buf;
@@ -64,7 +65,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
const void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
const u8 *ptr = buf;
@@ -78,7 +79,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
static u8 vp_get_status(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ return vp_legacy_get_status(&vp_dev->ldev);
}

static void vp_set_status(struct virtio_device *vdev, u8 status)
@@ -86,28 +87,24 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* We should never be setting status to 0. */
BUG_ON(status == 0);
- iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, status);
}

static void vp_reset(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* 0 status means a reset. */
- iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, 0);
/* Flush out the status write, and flush in device writes,
* including MSi-X interrupts, if any. */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_get_status(&vp_dev->ldev);
/* Flush pending VQ/configuration callbacks. */
vp_synchronize_vectors(vdev);
}

static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
{
- /* Setup the vector used for configuration events */
- iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
- /* Verify we had enough resources to assign the vector */
- /* Will also flush the write out to device */
- return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ return vp_legacy_config_vector(&vp_dev->ldev, vector);
}

static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
@@ -123,12 +120,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
int err;
u64 q_pfn;

- /* Select the queue we're interested in */
- iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
/* Check if queue is either not available or already active. */
- num = ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
- if (!num || ioread32(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN))
+ num = vp_legacy_get_queue_size(&vp_dev->ldev, index);
+ if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
return ERR_PTR(-ENOENT);

info->msix_vector = msix_vec;
@@ -151,13 +145,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
}

/* activate the queue */
- iowrite32(q_pfn, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);

- vq->priv = (void __force *)vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ vq->priv = (void __force *)vp_dev->ldev.ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;

if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
- iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
- msix_vec = ioread16(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ msix_vec = vp_legacy_queue_vector(&vp_dev->ldev, index, msix_vec);
if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
err = -EBUSY;
goto out_deactivate;
@@ -167,7 +160,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
return vq;

out_deactivate:
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, 0);
out_del_vq:
vring_del_virtqueue(vq);
return ERR_PTR(err);
@@ -178,17 +171,15 @@ static void del_vq(struct virtio_pci_vq_info *info)
struct virtqueue *vq = info->vq;
struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);

- iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
if (vp_dev->msix_enabled) {
- iowrite16(VIRTIO_MSI_NO_VECTOR,
- vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ vp_legacy_queue_vector(&vp_dev->ldev, vq->index,
+ VIRTIO_MSI_NO_VECTOR);
/* Flush the write out to device */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
+ ioread8(vp_dev->ldev.ioaddr + VIRTIO_PCI_ISR);
}

/* Select and deactivate the queue */
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, vq->index, 0);

vring_del_virtqueue(vq);
}
@@ -211,51 +202,18 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
/* the PCI probing function */
int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
{
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
struct pci_dev *pci_dev = vp_dev->pci_dev;
int rc;

- /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
- if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
- return -ENODEV;
-
- if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) {
- printk(KERN_ERR "virtio_pci: expected ABI version %d, got %d\n",
- VIRTIO_PCI_ABI_VERSION, pci_dev->revision);
- return -ENODEV;
- }
-
- rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
- if (rc) {
- rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
- } else {
- /*
- * The virtio ring base address is expressed as a 32-bit PFN,
- * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
- */
- dma_set_coherent_mask(&pci_dev->dev,
- DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
- }
-
- if (rc)
- dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+ ldev->pci_dev = pci_dev;

- rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ rc = vp_legacy_probe(ldev);
if (rc)
return rc;

- rc = -ENOMEM;
- vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
- if (!vp_dev->ioaddr)
- goto err_iomap;
-
- vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR;
-
- /* we use the subsystem vendor/device id as the virtio vendor/device
- * id. this allows us to use the same PCI vendor/device id for all
- * virtio devices and to identify the particular virtio driver by
- * the subsystem ids */
- vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
- vp_dev->vdev.id.device = pci_dev->subsystem_device;
+ vp_dev->isr = ldev->isr;
+ vp_dev->vdev.id = ldev->id;

vp_dev->vdev.config = &virtio_pci_config_ops;

@@ -264,16 +222,11 @@ int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
vp_dev->del_vq = del_vq;

return 0;
-
-err_iomap:
- pci_release_region(pci_dev, 0);
- return rc;
}

void virtio_pci_legacy_remove(struct virtio_pci_device *vp_dev)
{
- struct pci_dev *pci_dev = vp_dev->pci_dev;
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;

- pci_iounmap(pci_dev, vp_dev->ioaddr);
- pci_release_region(pci_dev, 0);
+ vp_legacy_remove(ldev);
}
diff --git a/drivers/virtio/virtio_pci_legacy_dev.c b/drivers/virtio/virtio_pci_legacy_dev.c
new file mode 100644
index 000000000000..9b97680dd02b
--- /dev/null
+++ b/drivers/virtio/virtio_pci_legacy_dev.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+
+/*
+ * vp_legacy_probe: probe the legacy virtio pci device, note that the
+ * caller is required to enable PCI device before calling this function.
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns 0 on success, or a negative error code on failure
+ */
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+ int rc;
+
+ /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
+ if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
+ return -ENODEV;
+
+ if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION)
+ return -ENODEV;
+
+ rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
+ if (rc) {
+ rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
+ } else {
+ /*
+ * The virtio ring base address is expressed as a 32-bit PFN,
+ * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
+ */
+ dma_set_coherent_mask(&pci_dev->dev,
+ DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
+ }
+
+ if (rc)
+ dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+
+ rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ if (rc)
+ return rc;
+
+ rc = -ENOMEM;
+ ldev->ioaddr = pci_iomap(pci_dev, 0, 0);
+ if (!ldev->ioaddr)
+ goto err_iomap;
+ ldev->isr = ldev->ioaddr + VIRTIO_PCI_ISR;
+
+ ldev->id.vendor = pci_dev->subsystem_vendor;
+ ldev->id.device = pci_dev->subsystem_device;
+
+ return 0;
+err_iomap:
+ pci_release_region(pci_dev, 0);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(vp_legacy_probe);
+
+/*
+ * vp_legacy_remove: remove and clean up the legacy virtio pci device
+ * @ldev: the legacy virtio-pci device
+ */
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+
+ pci_iounmap(pci_dev, ldev->ioaddr);
+ pci_release_region(pci_dev, 0);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_remove);
+
+/*
+ * vp_legacy_get_features - get features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the features read from the device
+ */
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev)
+{
+
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_features);
+
+/*
+ * vp_legacy_get_driver_features - get driver features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_driver_features);
+
+/*
+ * vp_legacy_set_features - set features to device
+ * @ldev: the legacy virtio-pci device
+ * @features: the features set to device
+ */
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features)
+{
+ iowrite32(features, ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_features);
+
+/*
+ * vp_legacy_get_status - get the device status
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the status read from device
+ */
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread8(ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_status);
+
+/*
+ * vp_legacy_set_status - set status to device
+ * @ldev: the legacy virtio-pci device
+ * @status: the status set to device
+ */
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status)
+{
+ iowrite8(status, ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_status);
+
+/*
+ * vp_legacy_queue_vector - set the MSIX vector for a specific virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: queue index
+ * @vector: the queue vector
+ *
+ * Returns the queue vector read back from the device
+ */
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 index, u16 vector)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ /* Flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_queue_vector);
+
+/*
+ * vp_legacy_config_vector - set the vector for config interrupt
+ * @ldev: the legacy virtio-pci device
+ * @vector: the config vector
+ *
+ * Returns the config vector read from the device
+ */
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector)
+{
+ /* Setup the vector used for configuration events */
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ /* Verify we had enough resources to assign the vector */
+ /* Will also flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_config_vector);
+
+/*
+ * vp_legacy_set_queue_address - set the virtqueue address
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ * @queue_pfn: pfn of the virtqueue
+ */
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite32(queue_pfn, ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_queue_address);
+
+/*
+ * vp_legacy_get_queue_enable - check whether a virtqueue is enabled
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns whether a virtqueue is enabled or not
+ */
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_enable);
+
+/*
+ * vp_legacy_get_queue_size - get size for a virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns the size of the virtqueue
+ */
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread16(ldev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_size);
+
+MODULE_VERSION("0.1");
+MODULE_DESCRIPTION("Legacy Virtio PCI Device");
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/virtio_pci_legacy.h b/include/linux/virtio_pci_legacy.h
new file mode 100644
index 000000000000..ee2c6157215f
--- /dev/null
+++ b/include/linux/virtio_pci_legacy.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_VIRTIO_PCI_LEGACY_H
+#define _LINUX_VIRTIO_PCI_LEGACY_H
+
+#include "linux/mod_devicetable.h"
+#include <linux/pci.h>
+#include <linux/virtio_pci.h>
+
+struct virtio_pci_legacy_device {
+ struct pci_dev *pci_dev;
+
+ /* Where to read and clear interrupt */
+ u8 __iomem *isr;
+ /* The IO mapping for the PCI config space (legacy mode only) */
+ void __iomem *ioaddr;
+
+ struct virtio_device_id id;
+};
+
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev);
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features);
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status);
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 vector);
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector);
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn);
+void vp_legacy_set_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx, bool enable);
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+void vp_legacy_set_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 size);
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev);
+
+#endif
--
2.31.1
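
The comment in vp_legacy_probe() about the ring base being "a 32-bit PFN" can be made concrete with a small userspace sketch. It assumes the shift is 12, as VIRTIO_PCI_QUEUE_ADDR_SHIFT is defined in the UAPI header. The helper names ring_addr_fits_legacy() and ring_addr_to_pfn() are hypothetical and are not part of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror VIRTIO_PCI_QUEUE_ADDR_SHIFT from the UAPI header. */
#define QUEUE_ADDR_SHIFT 12

/*
 * Hypothetical helper: true if the ring base is aligned to the legacy
 * page size and its page frame number fits in the 32-bit PFN register.
 */
static bool ring_addr_fits_legacy(uint64_t addr)
{
	if (addr & ((1ULL << QUEUE_ADDR_SHIFT) - 1))
		return false;
	return (addr >> QUEUE_ADDR_SHIFT) <= UINT32_MAX;
}

/* Hypothetical helper: the value a driver would write to the PFN register. */
static uint32_t ring_addr_to_pfn(uint64_t addr)
{
	return (uint32_t)(addr >> QUEUE_ADDR_SHIFT);
}
```

Under these assumptions a ring can sit anywhere below 1 << 44, which is why the coherent DMA mask in the probe path is 32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT bits wide.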

2021-09-14 12:28:54

by Wu Zongyong

Subject: [PATCH v2 2/5] vdpa: fix typo

Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 3972ab765de1..a896ee021e5f 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -257,7 +257,7 @@ struct vdpa_config_ops {
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
/* vq irq is not expected to be changed once DRIVER_OK is set */
- int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
+ int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
--
2.31.1

2021-09-14 12:30:48

by Wu Zongyong

Subject: [PATCH v2 5/5] eni_vdpa: add vDPA driver for Alibaba ENI

This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built upon the virtio 0.9.5 specification.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 537 ++++++++++++++++++++++++++++++++
4 files changed, 549 insertions(+)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 3d91982d8371..9587b9177b05 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -78,4 +78,12 @@ config VP_VDPA
help
This kernel module bridges virtio PCI device to vDPA bus.

+config ALIBABA_ENI_VDPA
+ tristate "vDPA driver for Alibaba ENI"
+ select VIRTIO_PCI_LIB_LEGACY
+ depends on PCI_MSI
+ help
+ vDPA driver for Alibaba ENI (Elastic Network Interface), which is built
+ upon the virtio 0.9.5 specification.
+
endif # VDPA
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index f02ebed33f19..15665563a7f4 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
obj-$(CONFIG_IFCVF) += ifcvf/
obj-$(CONFIG_MLX5_VDPA) += mlx5/
obj-$(CONFIG_VP_VDPA) += virtio_pci/
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
new file mode 100644
index 000000000000..ef4aae69f87a
--- /dev/null
+++ b/drivers/vdpa/alibaba/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
+
diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
new file mode 100644
index 000000000000..38e85a5dd62e
--- /dev/null
+++ b/drivers/vdpa/alibaba/eni_vdpa.c
@@ -0,0 +1,537 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
+ *
+ * Copyright (c) 2021, Alibaba Inc. All rights reserved.
+ * Author: Wu Zongyong <[email protected]>
+ *
+ */
+
+#include "asm-generic/errno-base.h"
+#include "asm-generic/errno.h"
+#include "linux/irqreturn.h"
+#include "linux/kernel.h"
+#include "linux/pci_ids.h"
+#include "linux/virtio_config.h"
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/vdpa.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
+#include <uapi/linux/virtio_net.h>
+
+#define ENI_MSIX_NAME_SIZE 256
+
+#define ENI_ERR(pdev, fmt, ...) \
+ dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_DBG(pdev, fmt, ...) \
+ dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_INFO(pdev, fmt, ...) \
+ dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+
+struct eni_vring {
+ void __iomem *notify;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ struct vdpa_callback cb;
+ int irq;
+};
+
+struct eni_vdpa {
+ struct vdpa_device vdpa;
+ struct virtio_pci_legacy_device ldev;
+ struct eni_vring *vring;
+ struct vdpa_callback config_cb;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ int config_irq;
+ int queues;
+ int vectors;
+};
+
+static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
+{
+ return container_of(vdpa, struct eni_vdpa, vdpa);
+}
+
+static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ return &eni_vdpa->ldev;
+}
+
+static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_features(ldev);
+}
+
+static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ vp_legacy_set_features(ldev, (u32)features);
+
+ return 0;
+}
+
+static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_status(ldev);
+}
+
+static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ int irq = eni_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
+static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i;
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
+ &eni_vdpa->vring[i]);
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ }
+ }
+
+ if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+ }
+
+ if (eni_vdpa->vectors) {
+ pci_free_irq_vectors(pdev);
+ eni_vdpa->vectors = 0;
+ }
+}
+
+static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
+{
+ struct eni_vring *vring = arg;
+
+ if (vring->cb.callback)
+ return vring->cb.callback(vring->cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
+{
+ struct eni_vdpa *eni_vdpa = arg;
+
+ if (eni_vdpa->config_cb.callback)
+ return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i, ret, irq;
+ int queues = eni_vdpa->queues;
+ int vectors = queues + 1;
+
+ ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
+ if (ret != vectors) {
+ ENI_ERR(pdev,
+ "failed to allocate irq vectors: wanted %d, got %d\n",
+ vectors, ret);
+ return ret;
+ }
+
+ eni_vdpa->vectors = vectors;
+
+ for (i = 0; i < queues; i++) {
+ snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
+ "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
+ irq = pci_irq_vector(pdev, i);
+ ret = devm_request_irq(&pdev->dev, irq,
+ eni_vdpa_vq_handler,
+ 0, eni_vdpa->vring[i].msix_name,
+ &eni_vdpa->vring[i]);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_queue_vector(ldev, i, i);
+ eni_vdpa->vring[i].irq = irq;
+ }
+
+ snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
+ pci_name(pdev));
+ irq = pci_irq_vector(pdev, queues);
+ ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
+ eni_vdpa->msix_name, eni_vdpa);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_config_vector(ldev, queues);
+ eni_vdpa->config_irq = irq;
+
+ return 0;
+err:
+ eni_vdpa_free_irq(eni_vdpa);
+ return ret;
+}
+
+static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
+ !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
+ eni_vdpa_request_irq(eni_vdpa);
+ }
+
+ vp_legacy_set_status(ldev, status);
+
+ if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+ (s & VIRTIO_CONFIG_S_DRIVER_OK))
+ eni_vdpa_free_irq(eni_vdpa);
+}
+
+static int eni_vdpa_reset(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ vp_legacy_set_status(ldev, 0);
+
+ if (s & VIRTIO_CONFIG_S_DRIVER_OK)
+ eni_vdpa_free_irq(eni_vdpa);
+
+ return 0;
+}
+
+static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_vq_state *state)
+{
+ return -EOPNOTSUPP;
+}
+
+static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
+ const struct vdpa_vq_state *state)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ const struct vdpa_vq_state_split *split = &state->split;
+
+ /* ENI is built upon the legacy virtio-pci specification, which does
+ * not support setting the state of a virtqueue. But if the requested
+ * state happens to equal the device's initial state, we can let it go.
+ */
+ if (!vp_legacy_get_queue_enable(ldev, qid)
+ && split->avail_index == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+
+static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->vring[qid].cb = *cb;
+}
+
+static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
+ bool ready)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ /* ENI is a legacy virtio-pci device, and enabling or disabling a
+ * virtqueue is not supported by the legacy specification. But we
+ * can disable a virtqueue by setting its address to 0.
+ */
+ if (!ready)
+ vp_legacy_set_queue_address(ldev, qid, 0);
+}
+
+static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_enable(ldev, qid);
+}
+
+static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
+ u32 num)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ struct pci_dev *pdev = ldev->pci_dev;
+ u16 n = vp_legacy_get_queue_size(ldev, qid);
+
+ /* ENI is a legacy virtio-pci device, which does not allow changing
+ * the virtqueue size. Just report an error if someone tries to
+ * change it.
+ */
+ if (num != n)
+ ENI_ERR(pdev,
+ "not supported to set vq %u fixed num %u to %u\n",
+ qid, n, num);
+}
+
+static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
+ u64 desc_area, u64 driver_area,
+ u64 device_area)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
+
+ vp_legacy_set_queue_address(ldev, qid, pfn);
+
+ return 0;
+}
+
+static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ iowrite16(qid, eni_vdpa->vring[qid].notify);
+}
+
+static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.device;
+}
+
+static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.vendor;
+}
+
+static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
+{
+ return PAGE_SIZE;
+}
+
+static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
+{
+ return sizeof(struct virtio_net_config);
+}
+
+
+static void eni_vdpa_get_config(struct vdpa_device *vdpa,
+ unsigned int offset,
+ void *buf, unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ *p++ = ioread8(ioaddr + i);
+}
+
+static void eni_vdpa_set_config(struct vdpa_device *vdpa,
+ unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ const u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ iowrite8(*p++, ioaddr + i);
+}
+
+static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->config_cb = *cb;
+}
+
+static const struct vdpa_config_ops eni_vdpa_ops = {
+ .get_features = eni_vdpa_get_features,
+ .set_features = eni_vdpa_set_features,
+ .get_status = eni_vdpa_get_status,
+ .set_status = eni_vdpa_set_status,
+ .reset = eni_vdpa_reset,
+ .get_vq_num_max = eni_vdpa_get_vq_num_max,
+ .get_vq_state = eni_vdpa_get_vq_state,
+ .set_vq_state = eni_vdpa_set_vq_state,
+ .set_vq_cb = eni_vdpa_set_vq_cb,
+ .set_vq_ready = eni_vdpa_set_vq_ready,
+ .get_vq_ready = eni_vdpa_get_vq_ready,
+ .set_vq_num = eni_vdpa_set_vq_num,
+ .set_vq_address = eni_vdpa_set_vq_address,
+ .kick_vq = eni_vdpa_kick_vq,
+ .get_device_id = eni_vdpa_get_device_id,
+ .get_vendor_id = eni_vdpa_get_vendor_id,
+ .get_vq_align = eni_vdpa_get_vq_align,
+ .get_config_size = eni_vdpa_get_config_size,
+ .get_config = eni_vdpa_get_config,
+ .set_config = eni_vdpa_set_config,
+ .set_config_cb = eni_vdpa_set_config_cb,
+ .get_vq_irq = eni_vdpa_get_vq_irq,
+};
+
+
+static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u32 features = vp_legacy_get_features(ldev);
+ u16 num = 2;
+
+ if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
+ __virtio16 max_virtqueue_pairs;
+
+ eni_vdpa_get_config(&eni_vdpa->vdpa,
+ offsetof(struct virtio_net_config, max_virtqueue_pairs),
+ &max_virtqueue_pairs,
+ sizeof(max_virtqueue_pairs));
+ num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
+ max_virtqueue_pairs);
+ }
+
+ if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+ num += 1;
+
+ return num;
+}
+
+static void eni_vdpa_free_irq_vectors(void *data)
+{
+ pci_free_irq_vectors(data);
+}
+
+static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct device *dev = &pdev->dev;
+ struct eni_vdpa *eni_vdpa;
+ struct virtio_pci_legacy_device *ldev;
+ int ret, i;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
+ dev, &eni_vdpa_ops, NULL, false);
+ if (IS_ERR(eni_vdpa)) {
+ ENI_ERR(pdev, "failed to allocate vDPA structure\n");
+ return PTR_ERR(eni_vdpa);
+ }
+
+ ldev = &eni_vdpa->ldev;
+ ldev->pci_dev = pdev;
+
+ ret = vp_legacy_probe(ldev);
+ if (ret) {
+ ENI_ERR(pdev, "failed to probe legacy PCI device\n");
+ goto err;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, eni_vdpa);
+
+ eni_vdpa->vdpa.dma_dev = &pdev->dev;
+ eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
+
+ ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
+ if (ret) {
+ ENI_ERR(pdev,
+ "failed to add devres action for freeing irq vectors\n");
+ goto err;
+ }
+
+ eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
+ sizeof(*eni_vdpa->vring),
+ GFP_KERNEL);
+ if (!eni_vdpa->vring) {
+ ret = -ENOMEM;
+ ENI_ERR(pdev, "failed to allocate virtqueues\n");
+ goto err;
+ }
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ }
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+
+ ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
+ if (ret) {
+ ENI_ERR(pdev, "failed to register to vdpa bus\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ put_device(&eni_vdpa->vdpa.dev);
+ return ret;
+}
+
+static void eni_vdpa_remove(struct pci_dev *pdev)
+{
+ struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
+
+ vdpa_unregister_device(&eni_vdpa->vdpa);
+ vp_legacy_remove(&eni_vdpa->ldev);
+}
+
+static struct pci_device_id eni_pci_ids[] = {
+ { PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET },
+ { 0 },
+};
+
+static struct pci_driver eni_vdpa_driver = {
+ .name = "alibaba-eni-vdpa",
+ .id_table = eni_pci_ids,
+ .probe = eni_vdpa_probe,
+ .remove = eni_vdpa_remove,
+};
+
+module_pci_driver(eni_vdpa_driver);
+
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
+MODULE_LICENSE("GPL v2");
--
2.31.1
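
Two bits of arithmetic in the driver above may be easier to follow in isolation: the offset of the device-specific config area, and the queue count derived from the feature bits. The sketch below is a userspace approximation. It assumes the legacy header is 20 bytes, or 24 with MSI-X enabled (what VIRTIO_PCI_CONFIG_OFF evaluates to), and the virtio-net bit numbers 17 (CTRL_VQ) and 22 (MQ). The names legacy_config_off() and net_num_queues() are hypothetical.

```c
#include <stdint.h>

/* Assumed feature bit numbers from the virtio-net UAPI header. */
#define NET_F_CTRL_VQ 17
#define NET_F_MQ 22

/*
 * Hypothetical helper: offset of the device config area inside BAR 0.
 * The two MSI-X vector registers add 4 bytes to the 20-byte header.
 */
static uint32_t legacy_config_off(int msix_enabled)
{
	return msix_enabled ? 24 : 20;
}

/*
 * Hypothetical helper: one RX/TX pair by default, max_pairs pairs with
 * MQ, plus one control queue if CTRL_VQ is offered.
 */
static uint16_t net_num_queues(uint64_t features, uint16_t max_pairs)
{
	uint16_t num = 2;

	if (features & (1ULL << NET_F_MQ))
		num = (uint16_t)(2 * max_pairs);
	if (features & (1ULL << NET_F_CTRL_VQ))
		num += 1;

	return num;
}
```

Note that the driver passes its vectors count as the MSI-X flag. As far as can be told from the patch, that count is still 0 when eni_vdpa_get_num_queues() runs during probe, so that read would use the 20-byte offset.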

2021-09-14 13:03:22

by Michael S. Tsirkin

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> This new attribute advertises whether the vdpa device is legacy or not.
> Users can pick right virtqueue size if the vdpa device is legacy which
> doesn't support to change virtqueue size.
>
> Signed-off-by: Wu Zongyong <[email protected]>

So if we are bothering with legacy, I think there are
several things to do when building the interface:
- support transitional devices, that is, allow userspace
  to tell the device it's in legacy mode
- support reporting/setting the supported endianness
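
To illustrate why endianness has to be part of the interface: legacy devices use the guest's native byte order for multi-byte fields, while VERSION_1 devices are always little-endian. A minimal userspace sketch of the conversion follows. It is only an approximation of what the kernel's __virtio16_to_cpu() does, and the names host_is_little_endian(), swap16() and virtio16_to_host() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Detect the byte order of the machine this code runs on. */
static bool host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 1;
}

static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * field_is_le is true for a VERSION_1 device. For a legacy device it
 * has to describe the guest's byte order, which is exactly the piece
 * of information the interface would need to carry.
 */
static uint16_t virtio16_to_host(bool field_is_le, uint16_t raw)
{
	if (field_is_le == host_is_little_endian())
		return raw;
	return swap16(raw);
}
```

With this shape, a field is passed through unchanged whenever the device-side and host byte orders agree, and swapped otherwise.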

> ---
> drivers/vdpa/vdpa.c | 6 ++++++
> drivers/virtio/virtio_vdpa.c | 7 ++++++-
> include/uapi/linux/vdpa.h | 1 +
> 3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 1dc121a07a93..533d7f589eee 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -12,6 +12,7 @@
> #include <linux/slab.h>
> #include <linux/vdpa.h>
> #include <uapi/linux/vdpa.h>
> +#include <uapi/linux/virtio_config.h>
> #include <net/genetlink.h>
> #include <linux/mod_devicetable.h>
>
> @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> u16 max_vq_size;
> u32 device_id;
> u32 vendor_id;
> + u64 features;
> void *hdr;
> int err;
>
> @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> device_id = vdev->config->get_device_id(vdev);
> vendor_id = vdev->config->get_vendor_id(vdev);
> max_vq_size = vdev->config->get_vq_num_max(vdev);
> + features = vdev->config->get_features(vdev);
>
> err = -EMSGSIZE;
> if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> goto msg_err;
> if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> goto msg_err;
> + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> + goto msg_err;
>
> genlmsg_end(msg, hdr);
> return 0;
> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index 72eaef2caeb1..1cba957c4cdc 100644
> --- a/drivers/virtio/virtio_vdpa.c
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -7,6 +7,7 @@
> *
> */
>
> +#include "linux/virtio_config.h"
> #include <linux/init.h>
> #include <linux/module.h>
> #include <linux/device.h>
> @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> /* Assume split virtqueue, switch to packed if necessary */
> struct vdpa_vq_state state = {0};
> unsigned long flags;
> + bool may_reduce_num = false;
> u32 align, num;
> int err;
>
> @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> goto error_new_virtqueue;
> }
>
> + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> + may_reduce_num = true;
> +
> /* Create the vring */
> align = ops->get_vq_align(vdpa);
> vq = vring_create_virtqueue(index, num, align, vdev,
> - true, true, ctx,
> + true, may_reduce_num, ctx,
> virtio_vdpa_notify, callback, name);
> if (!vq) {
> err = -ENOMEM;
> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> index 66a41e4ec163..ce0b74276a5b 100644
> --- a/include/uapi/linux/vdpa.h
> +++ b/include/uapi/linux/vdpa.h
> @@ -32,6 +32,7 @@ enum vdpa_attr {
> VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> + VDPA_ATTR_DEV_VERSION_1, /* flag */
>
> /* new attributes must be added above here */
> VDPA_ATTR_MAX,
> --
> 2.31.1
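
Since a legacy device fixes the queue size, the memory footprint of a split ring follows directly from get_vq_num_max() and get_vq_align(). The sketch below reproduces the layout arithmetic of vring_size() from the virtio ring UAPI header as a userspace function. The descriptor and used-element sizes (16 and 8 bytes) are assumptions taken from that header, and split_ring_bytes() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_SIZE 16u     /* assumed sizeof(struct vring_desc) */
#define USED_ELEM_SIZE 8u /* assumed sizeof(struct vring_used_elem) */

/*
 * Hypothetical helper: bytes needed for a split virtqueue with num
 * entries. The descriptor table and available ring come first, padded
 * up to align (a power of two), followed by the used ring.
 */
static size_t split_ring_bytes(unsigned int num, size_t align)
{
	size_t desc_avail = (size_t)DESC_SIZE * num +
			    sizeof(uint16_t) * (3 + (size_t)num);
	size_t used = sizeof(uint16_t) * 3 + (size_t)USED_ELEM_SIZE * num;

	desc_avail = (desc_avail + align - 1) & ~(align - 1);
	return desc_avail + used;
}
```

For a 256-entry queue with page alignment this comes to a little over 10 KiB, which has to be allocated in full because the size cannot be reduced on a legacy device.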

2021-09-14 22:37:37

by Randy Dunlap

Subject: Re: [PATCH v2 5/5] eni_vdpa: add vDPA driver for Alibaba ENI

On 9/14/21 5:24 AM, Wu Zongyong wrote:
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 3d91982d8371..9587b9177b05 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -78,4 +78,12 @@ config VP_VDPA
> help
> This kernel module bridges virtio PCI device to vDPA bus.
>
> +config ALIBABA_ENI_VDPA
> + tristate "vDPA driver for Alibaba ENI"
> + select VIRTIO_PCI_LEGACY_LIB
> + depends on PCI_MSI
> + help
> + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon

                            ENI (Elastic                            built upon the

> + virtio 0.9.5 specification.
> +
> endif # VDPA


--
~Randy

2021-09-14 22:40:06

by Randy Dunlap

Subject: Re: [PATCH v2 1/5] virtio-pci: introduce legacy device module

On 9/14/21 5:24 AM, Wu Zongyong wrote:
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index ce1b3f6ec325..b14768dc9e04 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
> PCI device with possible vendor specific extensions. Any
> module that selects this module must depend on PCI.
>
> +config VIRTIO_PCI_LIB_LEGACY
> + tristate
> + help
> + Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
> + implementation.
> + This modules implements the basic probe and control for devices

         module

> + which are based on legacy PCI device. Any module that selects this
> + module must depend on PCI.
> +


--
~Randy

2021-09-15 03:16:42

by Jason Wang

Subject: Re: [PATCH v2 5/5] eni_vdpa: add vDPA driver for Alibaba ENI

On Tue, Sep 14, 2021 at 8:26 PM Wu Zongyong
<[email protected]> wrote:
>
> This patch adds a new vDPA driver for Alibaba ENI(Elastic Network
> Interface) which is build upon virtio 0.9.5 specification.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vdpa/Kconfig | 8 +
> drivers/vdpa/Makefile | 1 +
> drivers/vdpa/alibaba/Makefile | 3 +
> drivers/vdpa/alibaba/eni_vdpa.c | 537 ++++++++++++++++++++++++++++++++
> 4 files changed, 549 insertions(+)
> create mode 100644 drivers/vdpa/alibaba/Makefile
> create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
>
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 3d91982d8371..9587b9177b05 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -78,4 +78,12 @@ config VP_VDPA
> help
> This kernel module bridges virtio PCI device to vDPA bus.
>
> +config ALIBABA_ENI_VDPA
> + tristate "vDPA driver for Alibaba ENI"
> + select VIRTIO_PCI_LEGACY_LIB
> + depends on PCI_MSI
> + help
> + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon
> + virtio 0.9.5 specification.
> +
> endif # VDPA
> diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> index f02ebed33f19..15665563a7f4 100644
> --- a/drivers/vdpa/Makefile
> +++ b/drivers/vdpa/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> obj-$(CONFIG_IFCVF) += ifcvf/
> obj-$(CONFIG_MLX5_VDPA) += mlx5/
> obj-$(CONFIG_VP_VDPA) += virtio_pci/
> +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
> new file mode 100644
> index 000000000000..ef4aae69f87a
> --- /dev/null
> +++ b/drivers/vdpa/alibaba/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> +
> diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
> new file mode 100644
> index 000000000000..38e85a5dd62e
> --- /dev/null
> +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> @@ -0,0 +1,537 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
> + *
> + * Copyright (c) 2021, Alibaba Inc. All rights reserved.
> + * Author: Wu Zongyong <[email protected]>
> + *
> + */
> +
> +#include "asm-generic/errno-base.h"
> +#include "asm-generic/errno.h"
> +#include "linux/irqreturn.h"
> +#include "linux/kernel.h"
> +#include "linux/pci_ids.h"
> +#include "linux/virtio_config.h"
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/vdpa.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_ring.h>
> +#include <linux/virtio_pci.h>
> +#include <linux/virtio_pci_legacy.h>
> +#include <uapi/linux/virtio_net.h>
> +
> +#define ENI_MSIX_NAME_SIZE 256
> +
> +#define ENI_ERR(pdev, fmt, ...) \
> + dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +#define ENI_DBG(pdev, fmt, ...) \
> + dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +#define ENI_INFO(pdev, fmt, ...) \
> + dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +
> +struct eni_vring {
> + void __iomem *notify;
> + char msix_name[ENI_MSIX_NAME_SIZE];
> + struct vdpa_callback cb;
> + int irq;
> +};
> +
> +struct eni_vdpa {
> + struct vdpa_device vdpa;
> + struct virtio_pci_legacy_device ldev;
> + struct eni_vring *vring;
> + struct vdpa_callback config_cb;
> + char msix_name[ENI_MSIX_NAME_SIZE];
> + int config_irq;
> + int queues;
> + int vectors;
> +};
> +
> +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
> +{
> + return container_of(vdpa, struct eni_vdpa, vdpa);
> +}
> +
> +static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + return &eni_vdpa->ldev;
> +}
> +
> +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_features(ldev);
> +}
> +
> +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + vp_legacy_set_features(ldev, (u32)features);
> +
> + return 0;
> +}

Interesting, I wonder how VIRTIO_F_ACCESS_PLATFORM can work in this case.

Without it, the virtio driver won't use the DMA API, which breaks
setups with an IOMMU.

Or is VIRTIO_F_ACCESS_PLATFORM mandated by the device? If so, we
need some mediation here,

e.g. report VIRTIO_F_ACCESS_PLATFORM as set in get_features().
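
A minimal userspace sketch of that kind of mediation, not driver code: the
legacy feature register is only 32 bits wide, so VIRTIO_F_ACCESS_PLATFORM
(bit 33) would have to be added on the way up and dropped again before the
negotiated features are written to the device. The function names are
hypothetical, and silently dropping the high bits on the way down is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit number as defined by the virtio specification. */
#define VIRTIO_F_ACCESS_PLATFORM 33

/*
 * Hypothetical helper: what get_features() could report to the vDPA bus.
 * The 32-bit legacy register is widened and bit 33 is set by the driver.
 */
uint64_t eni_sketch_get_features(uint32_t device_features)
{
	return (uint64_t)device_features | (1ULL << VIRTIO_F_ACCESS_PLATFORM);
}

/*
 * Hypothetical helper: what set_features() could write to the device.
 * Only the low 32 bits fit in the legacy register, so the mediated
 * high bits are dropped here.
 */
uint32_t eni_sketch_device_features(uint64_t negotiated_features)
{
	return (uint32_t)negotiated_features;
}
```

With this shape the bus driver sees ACCESS_PLATFORM and uses the DMA API,
while the device only ever sees the bits it actually implements.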

> +
> +static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_status(ldev);
> +}
> +
> +static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + int irq = eni_vdpa->vring[idx].irq;
> +
> + if (irq == VIRTIO_MSI_NO_VECTOR)
> + return -EINVAL;
> +
> + return irq;
> +}
> +
> +static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + struct pci_dev *pdev = ldev->pci_dev;
> + int i;
> +
> + for (i = 0; i < eni_vdpa->queues; i++) {
> + if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
> + vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
> + devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
> + &eni_vdpa->vring[i]);
> + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> + }
> + }
> +
> + if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
> + vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
> + devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
> + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> + }
> +
> + if (eni_vdpa->vectors) {
> + pci_free_irq_vectors(pdev);
> + eni_vdpa->vectors = 0;
> + }
> +}
> +
> +static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
> +{
> + struct eni_vring *vring = arg;
> +
> + if (vring->cb.callback)
> + return vring->cb.callback(vring->cb.private);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
> +{
> + struct eni_vdpa *eni_vdpa = arg;
> +
> + if (eni_vdpa->config_cb.callback)
> + return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + struct pci_dev *pdev = ldev->pci_dev;
> + int i, ret, irq;
> + int queues = eni_vdpa->queues;
> + int vectors = queues + 1;
> +
> + ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
> + if (ret != vectors) {
> + ENI_ERR(pdev,
> + "failed to allocate irq vectors want %d but %d\n",
> + vectors, ret);
> + return ret;
> + }
> +
> + eni_vdpa->vectors = vectors;
> +
> + for (i = 0; i < queues; i++) {
> + snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
> + "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
> + irq = pci_irq_vector(pdev, i);
> + ret = devm_request_irq(&pdev->dev, irq,
> + eni_vdpa_vq_handler,
> + 0, eni_vdpa->vring[i].msix_name,
> + &eni_vdpa->vring[i]);
> + if (ret) {
> + ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
> + goto err;
> + }
> + vp_legacy_queue_vector(ldev, i, i);
> + eni_vdpa->vring[i].irq = irq;
> + }
> +
> + snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
> + pci_name(pdev));
> + irq = pci_irq_vector(pdev, queues);
> + ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
> + eni_vdpa->msix_name, eni_vdpa);
> + if (ret) {
> + ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
> + goto err;
> + }
> + vp_legacy_config_vector(ldev, queues);
> + eni_vdpa->config_irq = irq;
> +
> + return 0;
> +err:
> + eni_vdpa_free_irq(eni_vdpa);
> + return ret;
> +}
> +
> +static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u8 s = eni_vdpa_get_status(vdpa);
> +
> + if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
> + !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
> + eni_vdpa_request_irq(eni_vdpa);
> + }
> +
> + vp_legacy_set_status(ldev, status);
> +
> + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> + (s & VIRTIO_CONFIG_S_DRIVER_OK))
> + eni_vdpa_free_irq(eni_vdpa);
> +}
> +
> +static int eni_vdpa_reset(struct vdpa_device *vdpa)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u8 s = eni_vdpa_get_status(vdpa);
> +
> + vp_legacy_set_status(ldev, 0);
> +
> + if (s & VIRTIO_CONFIG_S_DRIVER_OK)
> + eni_vdpa_free_irq(eni_vdpa);
> +
> + return 0;
> +}
> +
> +static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_size(ldev, 0);
> +}
> +
> +static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
> + struct vdpa_vq_state *state)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
> + const struct vdpa_vq_state *state)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + const struct vdpa_vq_state_split *split = &state->split;
> +
> + /* ENI is build upon virtio-pci specfication which not support
> + * to set state of virtqueue. But if the state is equal to the
> + * device initial state by chance, we can let it go.
> + */
> + if (!vp_legacy_get_queue_enable(ldev, qid)
> + && split->avail_index == 0)
> + return 0;
> +
> + return -EOPNOTSUPP;
> +}
> +
> +
> +static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
> + struct vdpa_callback *cb)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + eni_vdpa->vring[qid].cb = *cb;
> +}
> +
> +static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
> + bool ready)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + /* ENI is a legacy virtio-pci device. This is not supported
> + * by specification. But we can disable virtqueue by setting
> + * address to 0.
> + */
> + if (!ready)
> + vp_legacy_set_queue_address(ldev, qid, 0);
> +}
> +
> +static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_enable(ldev, qid);
> +}
> +
> +static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
> + u32 num)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + struct pci_dev *pdev = ldev->pci_dev;
> + u16 n = vp_legacy_get_queue_size(ldev, qid);
> +
> + /* ENI is a legacy virtio-pci device which not allow to change
> + * virtqueue size. Just report a error if someone tries to
> + * change it.
> + */
> + if (num != n)
> + ENI_ERR(pdev,
> + "not support to set vq %u fixed num %u to %u\n",
> + qid, n, num);
> +}
> +
> +static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
> + u64 desc_area, u64 driver_area,
> + u64 device_area)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> +
> + vp_legacy_set_queue_address(ldev, qid, pfn);
> +
> + return 0;
> +}
> +
> +static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + iowrite16(qid, eni_vdpa->vring[qid].notify);
> +}
> +
> +static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return ldev->id.device;
> +}
> +
> +static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return ldev->id.vendor;
> +}
> +
> +static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
> +{
> + return PAGE_SIZE;
> +}
> +
> +static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
> +{
> + return sizeof(struct virtio_net_config);
> +}
> +
> +
> +static void eni_vdpa_get_config(struct vdpa_device *vdpa,
> + unsigned int offset,
> + void *buf, unsigned int len)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + void __iomem *ioaddr = ldev->ioaddr +
> + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> + offset;
> + u8 *p = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + *p++ = ioread8(ioaddr + i);
> +}
> +
> +static void eni_vdpa_set_config(struct vdpa_device *vdpa,
> + unsigned int offset, const void *buf,
> + unsigned int len)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + void __iomem *ioaddr = ldev->ioaddr +
> + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> + offset;
> + const u8 *p = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + iowrite8(*p++, ioaddr + i);
> +}
> +
> +static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
> + struct vdpa_callback *cb)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + eni_vdpa->config_cb = *cb;
> +}
> +
> +static const struct vdpa_config_ops eni_vdpa_ops = {
> + .get_features = eni_vdpa_get_features,
> + .set_features = eni_vdpa_set_features,
> + .get_status = eni_vdpa_get_status,
> + .set_status = eni_vdpa_set_status,
> + .reset = eni_vdpa_reset,
> + .get_vq_num_max = eni_vdpa_get_vq_num_max,
> + .get_vq_state = eni_vdpa_get_vq_state,
> + .set_vq_state = eni_vdpa_set_vq_state,
> + .set_vq_cb = eni_vdpa_set_vq_cb,
> + .set_vq_ready = eni_vdpa_set_vq_ready,
> + .get_vq_ready = eni_vdpa_get_vq_ready,
> + .set_vq_num = eni_vdpa_set_vq_num,
> + .set_vq_address = eni_vdpa_set_vq_address,
> + .kick_vq = eni_vdpa_kick_vq,
> + .get_device_id = eni_vdpa_get_device_id,
> + .get_vendor_id = eni_vdpa_get_vendor_id,
> + .get_vq_align = eni_vdpa_get_vq_align,
> + .get_config_size = eni_vdpa_get_config_size,
> + .get_config = eni_vdpa_get_config,
> + .set_config = eni_vdpa_set_config,
> + .set_config_cb = eni_vdpa_set_config_cb,
> + .get_vq_irq = eni_vdpa_get_vq_irq,
> +};
> +
> +
> +static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u32 features = vp_legacy_get_features(ldev);
> + u16 num = 2;
> +
> + if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
> + __virtio16 max_virtqueue_pairs;
> +
> + eni_vdpa_get_config(&eni_vdpa->vdpa,
> + offsetof(struct virtio_net_config, max_virtqueue_pairs),
> + &max_virtqueue_pairs,
> + sizeof(max_virtqueue_pairs));
> + num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
> + max_virtqueue_pairs);
> + }
> +
> + if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
> + num += 1;
> +
> + return num;
> +}
> +
> +static void eni_vdpa_free_irq_vectors(void *data)
> +{
> + pci_free_irq_vectors(data);
> +}
> +
> +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + struct device *dev = &pdev->dev;
> + struct eni_vdpa *eni_vdpa;
> + struct virtio_pci_legacy_device *ldev;
> + int ret, i;
> +
> + ret = pcim_enable_device(pdev);
> + if (ret)
> + return ret;
> +
> + eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
> + dev, &eni_vdpa_ops, NULL, false);
> + if (IS_ERR(eni_vdpa)) {
> + ENI_ERR(pdev, "failed to allocate vDPA structure\n");
> + return PTR_ERR(eni_vdpa);
> + }
> +
> + ldev = &eni_vdpa->ldev;
> + ldev->pci_dev = pdev;
> +
> + ret = vp_legacy_probe(ldev);
> + if (ret) {
> + ENI_ERR(pdev, "failed to probe legacy PCI device\n");
> + goto err;
> + }
> +
> + pci_set_master(pdev);
> + pci_set_drvdata(pdev, eni_vdpa);
> +
> + eni_vdpa->vdpa.dma_dev = &pdev->dev;
> + eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
> +
> + ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
> + if (ret) {
> + ENI_ERR(pdev,
> + "failed for adding devres for freeing irq vectors\n");
> + goto err;
> + }
> +
> + eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
> + sizeof(*eni_vdpa->vring),
> + GFP_KERNEL);
> + if (!eni_vdpa->vring) {
> + ret = -ENOMEM;
> + ENI_ERR(pdev, "fail to allocate virtqueues\n");
> + goto err;
> + }
> +
> + for (i = 0; i < eni_vdpa->queues; i++) {
> + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> + eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> + }
> + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> +
> + ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
> + if (ret) {
> + ENI_ERR(pdev, "failed to register to vdpa bus\n");
> + goto err;
> + }
> +
> + return 0;
> +
> +err:
> + put_device(&eni_vdpa->vdpa.dev);
> + return ret;
> +}
> +
> +static void eni_vdpa_remove(struct pci_dev *pdev)
> +{
> + struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
> +
> + vdpa_unregister_device(&eni_vdpa->vdpa);
> + vp_legacy_remove(&eni_vdpa->ldev);
> +}
> +
> +static struct pci_device_id eni_pci_ids[] = {
> + { PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET },

This will cause some confusion for driver binding. I think it's better
to add subvendor matching here.

Thanks
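
To illustrate the concern, here is a small userspace model of PCI ID
matching (in the kernel, a table entry with subsystem IDs can be written
with PCI_DEVICE_SUB()). The subvendor value below is a placeholder, not the
real ENI subsystem vendor ID, and the struct and function names are made up
for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ANY_ID           0xffffffffu /* wildcard, like PCI_ANY_ID */
#define SKETCH_VENDOR_REDHAT    0x1af4u     /* PCI_VENDOR_ID_REDHAT_QUMRANET */
#define SKETCH_DEVICE_TRANS_NET 0x1000u     /* VIRTIO_TRANS_ID_NET */
#define SKETCH_SUBVENDOR_ENI    0xabcdu     /* placeholder, NOT the real value */

struct sketch_pci_id {
	uint32_t vendor, device, subvendor, subdevice;
};

/* A table field matches if it is a wildcard or equal to the device field. */
static bool sketch_field_matches(uint32_t table, uint32_t dev)
{
	return table == SKETCH_ANY_ID || table == dev;
}

/* Returns true if the device would bind to a driver using this table entry. */
bool sketch_pci_match(const struct sketch_pci_id *entry,
		      const struct sketch_pci_id *dev)
{
	return sketch_field_matches(entry->vendor, dev->vendor) &&
	       sketch_field_matches(entry->device, dev->device) &&
	       sketch_field_matches(entry->subvendor, dev->subvendor) &&
	       sketch_field_matches(entry->subdevice, dev->subdevice);
}
```

An entry with only vendor and device set leaves both subsystem fields as
wildcards, so it also claims every other transitional virtio-net device;
adding the subvendor narrows the match to the intended hardware.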

> + { 0 },
> +};
> +
> +static struct pci_driver eni_vdpa_driver = {
> + .name = "alibaba-eni-vdpa",
> + .id_table = eni_pci_ids,
> + .probe = eni_vdpa_probe,
> + .remove = eni_vdpa_remove,
> +};
> +
> +module_pci_driver(eni_vdpa_driver);
> +
> +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> +MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
> +MODULE_LICENSE("GPL v2");
> --
> 2.31.1
>

2021-09-15 03:16:54

by Jason Wang

Subject: Re: [PATCH v2 2/5] vdpa: fix typo

On Tue, Sep 14, 2021 at 8:26 PM Wu Zongyong
<[email protected]> wrote:
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---

Acked-by: Jason Wang <[email protected]>

> include/linux/vdpa.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 3972ab765de1..a896ee021e5f 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -257,7 +257,7 @@ struct vdpa_config_ops {
> struct vdpa_notification_area
> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
> /* vq irq is not expected to be changed once DRIVER_OK is set */
> - int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
> + int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);
>
> /* Device ops */
> u32 (*get_vq_align)(struct vdpa_device *vdev);
> --
> 2.31.1
>

2021-09-15 03:19:24

by Jason Wang

Subject: Re: [PATCH v2 3/5] vp_vdpa: add vq irq offloading support

On Tue, Sep 14, 2021 at 8:25 PM Wu Zongyong
<[email protected]> wrote:
>
> This patch implements the get_vq_irq() callback for virtio pci devices
> to allow irq offloading.
>
> Signed-off-by: Wu Zongyong <[email protected]>

Acked-by: Jason Wang <[email protected]>

(btw, I think I've acked this but it seems lost).

Thanks

> ---
> drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
> index 5bcd00246d2e..e3ff7875e123 100644
> --- a/drivers/vdpa/virtio_pci/vp_vdpa.c
> +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
> @@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
> return vp_modern_get_status(mdev);
> }
>
> +static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> +{
> + struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
> + int irq = vp_vdpa->vring[idx].irq;
> +
> + if (irq == VIRTIO_MSI_NO_VECTOR)
> + return -EINVAL;
> +
> + return irq;
> +}
> +
> static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
> {
> struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
> @@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
> .get_config = vp_vdpa_get_config,
> .set_config = vp_vdpa_set_config,
> .set_config_cb = vp_vdpa_set_config_cb,
> + .get_vq_irq = vp_vdpa_get_vq_irq,
> };
>
> static void vp_vdpa_free_irq_vectors(void *data)
> --
> 2.31.1
>

2021-09-15 03:19:46

by Jason Wang

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > This new attribute advertises whether the vdpa device is legacy or not.
> > Users can pick right virtqueue size if the vdpa device is legacy which
> > doesn't support to change virtqueue size.
> >
> > Signed-off-by: Wu Zongyong <[email protected]>
>
> So if we are bothering with legacy,

I think we'd better not. I guess the following may work:

1) disable the driver on big-endian (BE) hosts
2) present VERSION_1 with ACCESS_PLATFORM in get_features()
3) extend the management interface to advertise max_queue_size and
min_queue_size; for ENI they are the same, so the management layer knows
it needs to set queue_size correctly when launching QEMU
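
For step 3, the management-side logic could look roughly like the userspace
sketch below. The function name is hypothetical, and the min/max values are
assumed to come from whatever attributes the management interface ends up
advertising.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: clamp the queue size requested by the user into the
 * [min_size, max_size] range advertised by the device. For a device such
 * as ENI, where min_size == max_size, the result is always that one size.
 */
uint16_t sketch_pick_queue_size(uint16_t requested, uint16_t min_size,
				uint16_t max_size)
{
	if (requested < min_size)
		return min_size;
	if (requested > max_size)
		return max_size;
	return requested;
}
```

The returned value is what the management layer would then pass to QEMU as
the queue size.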

Thoughts?

Thanks

> I think there are
> several things to do when building the interface
> - support transitional devices, that is allow userspace
> to tell device it's in legacy mode
> - support reporting/setting supporting endian-ness
>
> > ---
> > drivers/vdpa/vdpa.c | 6 ++++++
> > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > include/uapi/linux/vdpa.h | 1 +
> > 3 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > index 1dc121a07a93..533d7f589eee 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -12,6 +12,7 @@
> > #include <linux/slab.h>
> > #include <linux/vdpa.h>
> > #include <uapi/linux/vdpa.h>
> > +#include <uapi/linux/virtio_config.h>
> > #include <net/genetlink.h>
> > #include <linux/mod_devicetable.h>
> >
> > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > u16 max_vq_size;
> > u32 device_id;
> > u32 vendor_id;
> > + u64 features;
> > void *hdr;
> > int err;
> >
> > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > device_id = vdev->config->get_device_id(vdev);
> > vendor_id = vdev->config->get_vendor_id(vdev);
> > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > + features = vdev->config->get_features(vdev);
> >
> > err = -EMSGSIZE;
> > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > goto msg_err;
> > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > goto msg_err;
> > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > + goto msg_err;
> >
> > genlmsg_end(msg, hdr);
> > return 0;
> > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > index 72eaef2caeb1..1cba957c4cdc 100644
> > --- a/drivers/virtio/virtio_vdpa.c
> > +++ b/drivers/virtio/virtio_vdpa.c
> > @@ -7,6 +7,7 @@
> > *
> > */
> >
> > +#include "linux/virtio_config.h"
> > #include <linux/init.h>
> > #include <linux/module.h>
> > #include <linux/device.h>
> > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > /* Assume split virtqueue, switch to packed if necessary */
> > struct vdpa_vq_state state = {0};
> > unsigned long flags;
> > + bool may_reduce_num = false;
> > u32 align, num;
> > int err;
> >
> > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > goto error_new_virtqueue;
> > }
> >
> > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > + may_reduce_num = true;
> > +
> > /* Create the vring */
> > align = ops->get_vq_align(vdpa);
> > vq = vring_create_virtqueue(index, num, align, vdev,
> > - true, true, ctx,
> > + true, may_reduce_num, ctx,
> > virtio_vdpa_notify, callback, name);
> > if (!vq) {
> > err = -ENOMEM;
> > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > index 66a41e4ec163..ce0b74276a5b 100644
> > --- a/include/uapi/linux/vdpa.h
> > +++ b/include/uapi/linux/vdpa.h
> > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> >
> > /* new attributes must be added above here */
> > VDPA_ATTR_MAX,
> > --
> > 2.31.1
>

2021-09-15 03:28:02

by Wu Zongyong

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Tue, Sep 14, 2021 at 08:58:28AM -0400, Michael S. Tsirkin wrote:
> On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > This new attribute advertises whether the vdpa device is legacy or not.
> > Users can pick right virtqueue size if the vdpa device is legacy which
> > doesn't support to change virtqueue size.
> >
> > Signed-off-by: Wu Zongyong <[email protected]>
>
> So if we are bothering with legacy, I think there are
> several things to do when building the interface
> - support transitional devices, that is allow userspace
> to tell device it's in legacy mode
> - support reporting/setting supporting endian-ness

That's true if we are implementing a general driver for legacy devices.
But this series only implements a driver for ENI. Is it necessary to
implement what you describe here in this series?
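
For reference, the endian-ness point can be sketched in userspace C. Legacy
virtio config fields use the guest's native byte order, while modern devices
are always little-endian, so the same two bytes decode differently depending
on which rule applies. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a 16-bit config field from its raw bytes.
 * device_is_little_endian is true for modern devices and for legacy
 * devices used by a little-endian guest; it is false for legacy devices
 * used by a big-endian guest.
 */
uint16_t sketch_read_virtio16(const uint8_t raw[2], bool device_is_little_endian)
{
	if (device_is_little_endian)
		return (uint16_t)(raw[0] | (raw[1] << 8));
	return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

A field such as max_virtqueue_pairs stored as the bytes 02 00 reads as 2
under the little-endian rule and as 512 under the big-endian rule, which is
why a general legacy interface would need to report or set the byte order.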
>
> > ---
> > drivers/vdpa/vdpa.c | 6 ++++++
> > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > include/uapi/linux/vdpa.h | 1 +
> > 3 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > index 1dc121a07a93..533d7f589eee 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -12,6 +12,7 @@
> > #include <linux/slab.h>
> > #include <linux/vdpa.h>
> > #include <uapi/linux/vdpa.h>
> > +#include <uapi/linux/virtio_config.h>
> > #include <net/genetlink.h>
> > #include <linux/mod_devicetable.h>
> >
> > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > u16 max_vq_size;
> > u32 device_id;
> > u32 vendor_id;
> > + u64 features;
> > void *hdr;
> > int err;
> >
> > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > device_id = vdev->config->get_device_id(vdev);
> > vendor_id = vdev->config->get_vendor_id(vdev);
> > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > + features = vdev->config->get_features(vdev);
> >
> > err = -EMSGSIZE;
> > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > goto msg_err;
> > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > goto msg_err;
> > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > + goto msg_err;
> >
> > genlmsg_end(msg, hdr);
> > return 0;
> > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > index 72eaef2caeb1..1cba957c4cdc 100644
> > --- a/drivers/virtio/virtio_vdpa.c
> > +++ b/drivers/virtio/virtio_vdpa.c
> > @@ -7,6 +7,7 @@
> > *
> > */
> >
> > +#include "linux/virtio_config.h"
> > #include <linux/init.h>
> > #include <linux/module.h>
> > #include <linux/device.h>
> > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > /* Assume split virtqueue, switch to packed if necessary */
> > struct vdpa_vq_state state = {0};
> > unsigned long flags;
> > + bool may_reduce_num = false;
> > u32 align, num;
> > int err;
> >
> > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > goto error_new_virtqueue;
> > }
> >
> > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > + may_reduce_num = true;
> > +
> > /* Create the vring */
> > align = ops->get_vq_align(vdpa);
> > vq = vring_create_virtqueue(index, num, align, vdev,
> > - true, true, ctx,
> > + true, may_reduce_num, ctx,
> > virtio_vdpa_notify, callback, name);
> > if (!vq) {
> > err = -ENOMEM;
> > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > index 66a41e4ec163..ce0b74276a5b 100644
> > --- a/include/uapi/linux/vdpa.h
> > +++ b/include/uapi/linux/vdpa.h
> > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> >
> > /* new attributes must be added above here */
> > VDPA_ATTR_MAX,
> > --
> > 2.31.1

2021-09-15 03:32:35

by Wu Zongyong

Subject: Re: [PATCH v2 3/5] vp_vdpa: add vq irq offloading support

On Wed, Sep 15, 2021 at 11:16:03AM +0800, Jason Wang wrote:
> On Tue, Sep 14, 2021 at 8:25 PM Wu Zongyong
> <[email protected]> wrote:
> >
> > This patch implements the get_vq_irq() callback for virtio pci devices
> > to allow irq offloading.
> >
> > Signed-off-by: Wu Zongyong <[email protected]>
>
> Acked-by: Jason Wang <[email protected]>
>
> (btw, I think I've acked this but it seems lost).
Yes, but this patch is a little different from the previous one.

Also, should I not resend a patch if a previous version of it has
already been acked by someone? This is my first time sending patches to
the kernel community.
>
> Thanks
>
> > ---
> > drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
> > index 5bcd00246d2e..e3ff7875e123 100644
> > --- a/drivers/vdpa/virtio_pci/vp_vdpa.c
> > +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
> > @@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
> > return vp_modern_get_status(mdev);
> > }
> >
> > +static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> > +{
> > + struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
> > + int irq = vp_vdpa->vring[idx].irq;
> > +
> > + if (irq == VIRTIO_MSI_NO_VECTOR)
> > + return -EINVAL;
> > +
> > + return irq;
> > +}
> > +
> > static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
> > {
> > struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
> > @@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
> > .get_config = vp_vdpa_get_config,
> > .set_config = vp_vdpa_set_config,
> > .set_config_cb = vp_vdpa_set_config_cb,
> > + .get_vq_irq = vp_vdpa_get_vq_irq,
> > };
> >
> > static void vp_vdpa_free_irq_vectors(void *data)
> > --
> > 2.31.1
> >

2021-09-15 03:50:06

by Wu Zongyong

Subject: Re: [PATCH v2 5/5] eni_vdpa: add vDPA driver for Alibaba ENI

On Wed, Sep 15, 2021 at 11:14:43AM +0800, Jason Wang wrote:
> On Tue, Sep 14, 2021 at 8:26 PM Wu Zongyong
> <[email protected]> wrote:
> >
> > This patch adds a new vDPA driver for Alibaba ENI(Elastic Network
> > Interface) which is build upon virtio 0.9.5 specification.
> >
> > Signed-off-by: Wu Zongyong <[email protected]>
> > ---
> > drivers/vdpa/Kconfig | 8 +
> > drivers/vdpa/Makefile | 1 +
> > drivers/vdpa/alibaba/Makefile | 3 +
> > drivers/vdpa/alibaba/eni_vdpa.c | 537 ++++++++++++++++++++++++++++++++
> > 4 files changed, 549 insertions(+)
> > create mode 100644 drivers/vdpa/alibaba/Makefile
> > create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> >
> > diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> > index 3d91982d8371..9587b9177b05 100644
> > --- a/drivers/vdpa/Kconfig
> > +++ b/drivers/vdpa/Kconfig
> > @@ -78,4 +78,12 @@ config VP_VDPA
> > help
> > This kernel module bridges virtio PCI device to vDPA bus.
> >
> > +config ALIBABA_ENI_VDPA
> > + tristate "vDPA driver for Alibaba ENI"
> > + select VIRTIO_PCI_LEGACY_LIB
> > + depends on PCI_MSI
> > + help
> > + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon
> > + virtio 0.9.5 specification.
> > +
> > endif # VDPA
> > diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> > index f02ebed33f19..15665563a7f4 100644
> > --- a/drivers/vdpa/Makefile
> > +++ b/drivers/vdpa/Makefile
> > @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> > obj-$(CONFIG_IFCVF) += ifcvf/
> > obj-$(CONFIG_MLX5_VDPA) += mlx5/
> > obj-$(CONFIG_VP_VDPA) += virtio_pci/
> > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> > diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
> > new file mode 100644
> > index 000000000000..ef4aae69f87a
> > --- /dev/null
> > +++ b/drivers/vdpa/alibaba/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> > +
> > diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
> > new file mode 100644
> > index 000000000000..38e85a5dd62e
> > --- /dev/null
> > +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> > @@ -0,0 +1,537 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
> > + *
> > + * Copyright (c) 2021, Alibaba Inc. All rights reserved.
> > + * Author: Wu Zongyong <[email protected]>
> > + *
> > + */
> > +
> > +#include "asm-generic/errno-base.h"
> > +#include "asm-generic/errno.h"
> > +#include "linux/irqreturn.h"
> > +#include "linux/kernel.h"
> > +#include "linux/pci_ids.h"
> > +#include "linux/virtio_config.h"
> > +#include <linux/interrupt.h>
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +#include <linux/vdpa.h>
> > +#include <linux/virtio.h>
> > +#include <linux/virtio_config.h>
> > +#include <linux/virtio_ring.h>
> > +#include <linux/virtio_pci.h>
> > +#include <linux/virtio_pci_legacy.h>
> > +#include <uapi/linux/virtio_net.h>
> > +
> > +#define ENI_MSIX_NAME_SIZE 256
> > +
> > +#define ENI_ERR(pdev, fmt, ...) \
> > + dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > +#define ENI_DBG(pdev, fmt, ...) \
> > + dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > +#define ENI_INFO(pdev, fmt, ...) \
> > + dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > +
> > +struct eni_vring {
> > + void __iomem *notify;
> > + char msix_name[ENI_MSIX_NAME_SIZE];
> > + struct vdpa_callback cb;
> > + int irq;
> > +};
> > +
> > +struct eni_vdpa {
> > + struct vdpa_device vdpa;
> > + struct virtio_pci_legacy_device ldev;
> > + struct eni_vring *vring;
> > + struct vdpa_callback config_cb;
> > + char msix_name[ENI_MSIX_NAME_SIZE];
> > + int config_irq;
> > + int queues;
> > + int vectors;
> > +};
> > +
> > +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
> > +{
> > + return container_of(vdpa, struct eni_vdpa, vdpa);
> > +}
> > +
> > +static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + return &eni_vdpa->ldev;
> > +}
> > +
> > +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_features(ldev);
> > +}
> > +
> > +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + vp_legacy_set_features(ldev, (u32)features);
> > +
> > + return 0;
> > +}
>
> Interesting, I wonder how VIRTIO_F_ACCESS_PLATFORM can work in this case.
>
> Without that, the virtio driver won't use DMA API which breaks the
> setup with IOMMU.
>
> Or is VIRTIO_F_ACCESS_PLATFORM mandated by the device? If yes, we
> need some mediation here:
>
> e.g. return with VIRTIO_F_ACCESS_PLATFORM set in get_features().
>

I will fix it.
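
A minimal sketch of the mediation suggested above, assuming the driver simply ORs VIRTIO_F_ACCESS_PLATFORM into the 32-bit legacy feature word before reporting it. The helper name eni_features_with_access_platform is hypothetical, and this is plain userspace C that models the idea rather than the actual fix; only the bit number (33) comes from the virtio specification.

```c
#include <stdint.h>

/* Feature bit number from the virtio specification. */
#define VIRTIO_F_ACCESS_PLATFORM 33

/*
 * Hypothetical helper: a legacy device only exposes 32 feature bits,
 * so the driver widens them to 64 bits and sets ACCESS_PLATFORM itself.
 */
uint64_t eni_features_with_access_platform(uint32_t legacy_features)
{
	return (uint64_t)legacy_features | (1ULL << VIRTIO_F_ACCESS_PLATFORM);
}
```

With the bit set, the virtio core would go through the DMA API, which is what an IOMMU setup needs.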
> > +
> > +static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_status(ldev);
> > +}
> > +
> > +static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + int irq = eni_vdpa->vring[idx].irq;
> > +
> > + if (irq == VIRTIO_MSI_NO_VECTOR)
> > + return -EINVAL;
> > +
> > + return irq;
> > +}
> > +
> > +static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + struct pci_dev *pdev = ldev->pci_dev;
> > + int i;
> > +
> > + for (i = 0; i < eni_vdpa->queues; i++) {
> > + if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
> > + vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
> > + devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
> > + &eni_vdpa->vring[i]);
> > + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> > + }
> > + }
> > +
> > + if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
> > + vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
> > + devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
> > + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> > + }
> > +
> > + if (eni_vdpa->vectors) {
> > + pci_free_irq_vectors(pdev);
> > + eni_vdpa->vectors = 0;
> > + }
> > +}
> > +
> > +static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
> > +{
> > + struct eni_vring *vring = arg;
> > +
> > + if (vring->cb.callback)
> > + return vring->cb.callback(vring->cb.private);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
> > +{
> > + struct eni_vdpa *eni_vdpa = arg;
> > +
> > + if (eni_vdpa->config_cb.callback)
> > + return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + struct pci_dev *pdev = ldev->pci_dev;
> > + int i, ret, irq;
> > + int queues = eni_vdpa->queues;
> > + int vectors = queues + 1;
> > +
> > + ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
> > + if (ret != vectors) {
> > + ENI_ERR(pdev,
> > + "failed to allocate irq vectors want %d but %d\n",
> > + vectors, ret);
> > + return ret;
> > + }
> > +
> > + eni_vdpa->vectors = vectors;
> > +
> > + for (i = 0; i < queues; i++) {
> > + snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
> > + "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
> > + irq = pci_irq_vector(pdev, i);
> > + ret = devm_request_irq(&pdev->dev, irq,
> > + eni_vdpa_vq_handler,
> > + 0, eni_vdpa->vring[i].msix_name,
> > + &eni_vdpa->vring[i]);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
> > + goto err;
> > + }
> > + vp_legacy_queue_vector(ldev, i, i);
> > + eni_vdpa->vring[i].irq = irq;
> > + }
> > +
> > + snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
> > + pci_name(pdev));
> > + irq = pci_irq_vector(pdev, queues);
> > + ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
> > + eni_vdpa->msix_name, eni_vdpa);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
> > + goto err;
> > + }
> > + vp_legacy_config_vector(ldev, queues);
> > + eni_vdpa->config_irq = irq;
> > +
> > + return 0;
> > +err:
> > + eni_vdpa_free_irq(eni_vdpa);
> > + return ret;
> > +}
> > +
> > +static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + u8 s = eni_vdpa_get_status(vdpa);
> > +
> > + if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
> > + !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > + eni_vdpa_request_irq(eni_vdpa);
> > + }
> > +
> > + vp_legacy_set_status(ldev, status);
> > +
> > + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > + (s & VIRTIO_CONFIG_S_DRIVER_OK))
> > + eni_vdpa_free_irq(eni_vdpa);
> > +}
> > +
> > +static int eni_vdpa_reset(struct vdpa_device *vdpa)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + u8 s = eni_vdpa_get_status(vdpa);
> > +
> > + vp_legacy_set_status(ldev, 0);
> > +
> > + if (s & VIRTIO_CONFIG_S_DRIVER_OK)
> > + eni_vdpa_free_irq(eni_vdpa);
> > +
> > + return 0;
> > +}
> > +
> > +static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_queue_size(ldev, 0);
> > +}
> > +
> > +static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
> > + struct vdpa_vq_state *state)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
> > + const struct vdpa_vq_state *state)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + const struct vdpa_vq_state_split *split = &state->split;
> > +
> > + /* ENI is built upon the legacy virtio-pci specification, which
> > + * does not support setting the state of a virtqueue. But if the
> > + * state happens to equal the device's initial state, we can let it go.
> > + */
> > + if (!vp_legacy_get_queue_enable(ldev, qid)
> > + && split->avail_index == 0)
> > + return 0;
> > +
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +
> > +static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
> > + struct vdpa_callback *cb)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + eni_vdpa->vring[qid].cb = *cb;
> > +}
> > +
> > +static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
> > + bool ready)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + /* ENI is a legacy virtio-pci device. This is not supported
> > + * by specification. But we can disable virtqueue by setting
> > + * address to 0.
> > + */
> > + if (!ready)
> > + vp_legacy_set_queue_address(ldev, qid, 0);
> > +}
> > +
> > +static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_queue_enable(ldev, qid);
> > +}
> > +
> > +static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
> > + u32 num)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + struct pci_dev *pdev = ldev->pci_dev;
> > + u16 n = vp_legacy_get_queue_size(ldev, qid);
> > +
> > + /* ENI is a legacy virtio-pci device, which does not allow changing
> > + * the virtqueue size. Just report an error if someone tries to
> > + * change it.
> > + */
> > + if (num != n)
> > + ENI_ERR(pdev,
> > + "not support to set vq %u fixed num %u to %u\n",
> > + qid, n, num);
> > +}
> > +
> > +static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
> > + u64 desc_area, u64 driver_area,
> > + u64 device_area)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> > +
> > + vp_legacy_set_queue_address(ldev, qid, pfn);
> > +
> > + return 0;
> > +}
> > +
> > +static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + iowrite16(qid, eni_vdpa->vring[qid].notify);
> > +}
> > +
> > +static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return ldev->id.device;
> > +}
> > +
> > +static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return ldev->id.vendor;
> > +}
> > +
> > +static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
> > +{
> > + return PAGE_SIZE;
> > +}
> > +
> > +static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
> > +{
> > + return sizeof(struct virtio_net_config);
> > +}
> > +
> > +
> > +static void eni_vdpa_get_config(struct vdpa_device *vdpa,
> > + unsigned int offset,
> > + void *buf, unsigned int len)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + void __iomem *ioaddr = ldev->ioaddr +
> > + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> > + offset;
> > + u8 *p = buf;
> > + int i;
> > +
> > + for (i = 0; i < len; i++)
> > + *p++ = ioread8(ioaddr + i);
> > +}
> > +
> > +static void eni_vdpa_set_config(struct vdpa_device *vdpa,
> > + unsigned int offset, const void *buf,
> > + unsigned int len)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + void __iomem *ioaddr = ldev->ioaddr +
> > + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> > + offset;
> > + const u8 *p = buf;
> > + int i;
> > +
> > + for (i = 0; i < len; i++)
> > + iowrite8(*p++, ioaddr + i);
> > +}
> > +
> > +static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
> > + struct vdpa_callback *cb)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + eni_vdpa->config_cb = *cb;
> > +}
> > +
> > +static const struct vdpa_config_ops eni_vdpa_ops = {
> > + .get_features = eni_vdpa_get_features,
> > + .set_features = eni_vdpa_set_features,
> > + .get_status = eni_vdpa_get_status,
> > + .set_status = eni_vdpa_set_status,
> > + .reset = eni_vdpa_reset,
> > + .get_vq_num_max = eni_vdpa_get_vq_num_max,
> > + .get_vq_state = eni_vdpa_get_vq_state,
> > + .set_vq_state = eni_vdpa_set_vq_state,
> > + .set_vq_cb = eni_vdpa_set_vq_cb,
> > + .set_vq_ready = eni_vdpa_set_vq_ready,
> > + .get_vq_ready = eni_vdpa_get_vq_ready,
> > + .set_vq_num = eni_vdpa_set_vq_num,
> > + .set_vq_address = eni_vdpa_set_vq_address,
> > + .kick_vq = eni_vdpa_kick_vq,
> > + .get_device_id = eni_vdpa_get_device_id,
> > + .get_vendor_id = eni_vdpa_get_vendor_id,
> > + .get_vq_align = eni_vdpa_get_vq_align,
> > + .get_config_size = eni_vdpa_get_config_size,
> > + .get_config = eni_vdpa_get_config,
> > + .set_config = eni_vdpa_set_config,
> > + .set_config_cb = eni_vdpa_set_config_cb,
> > + .get_vq_irq = eni_vdpa_get_vq_irq,
> > +};
> > +
> > +
> > +static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + u32 features = vp_legacy_get_features(ldev);
> > + u16 num = 2;
> > +
> > + if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
> > + __virtio16 max_virtqueue_pairs;
> > +
> > + eni_vdpa_get_config(&eni_vdpa->vdpa,
> > + offsetof(struct virtio_net_config, max_virtqueue_pairs),
> > + &max_virtqueue_pairs,
> > + sizeof(max_virtqueue_pairs));
> > + num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
> > + max_virtqueue_pairs);
> > + }
> > +
> > + if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
> > + num += 1;
> > +
> > + return num;
> > +}
> > +
> > +static void eni_vdpa_free_irq_vectors(void *data)
> > +{
> > + pci_free_irq_vectors(data);
> > +}
> > +
> > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > +{
> > + struct device *dev = &pdev->dev;
> > + struct eni_vdpa *eni_vdpa;
> > + struct virtio_pci_legacy_device *ldev;
> > + int ret, i;
> > +
> > + ret = pcim_enable_device(pdev);
> > + if (ret)
> > + return ret;
> > +
> > + eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
> > + dev, &eni_vdpa_ops, NULL, false);
> > + if (IS_ERR(eni_vdpa)) {
> > + ENI_ERR(pdev, "failed to allocate vDPA structure\n");
> > + return PTR_ERR(eni_vdpa);
> > + }
> > +
> > + ldev = &eni_vdpa->ldev;
> > + ldev->pci_dev = pdev;
> > +
> > + ret = vp_legacy_probe(ldev);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to probe legacy PCI device\n");
> > + goto err;
> > + }
> > +
> > + pci_set_master(pdev);
> > + pci_set_drvdata(pdev, eni_vdpa);
> > +
> > + eni_vdpa->vdpa.dma_dev = &pdev->dev;
> > + eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
> > +
> > + ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
> > + if (ret) {
> > + ENI_ERR(pdev,
> > + "failed for adding devres for freeing irq vectors\n");
> > + goto err;
> > + }
> > +
> > + eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
> > + sizeof(*eni_vdpa->vring),
> > + GFP_KERNEL);
> > + if (!eni_vdpa->vring) {
> > + ret = -ENOMEM;
> > + ENI_ERR(pdev, "fail to allocate virtqueues\n");
> > + goto err;
> > + }
> > +
> > + for (i = 0; i < eni_vdpa->queues; i++) {
> > + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> > + eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> > + }
> > + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> > +
> > + ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to register to vdpa bus\n");
> > + goto err;
> > + }
> > +
> > + return 0;
> > +
> > +err:
> > + put_device(&eni_vdpa->vdpa.dev);
> > + return ret;
> > +}
> > +
> > +static void eni_vdpa_remove(struct pci_dev *pdev)
> > +{
> > + struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
> > +
> > + vdpa_unregister_device(&eni_vdpa->vdpa);
> > + vp_legacy_remove(&eni_vdpa->ldev);
> > +}
> > +
> > +static struct pci_device_id eni_pci_ids[] = {
> > + { PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET },
>
> This will cause some confusion for driver binding. I think it's better
> to add subvendor matching here.

I will fix it.
>
> Thanks
>
> > + { 0 },
> > +};
> > +
> > +static struct pci_driver eni_vdpa_driver = {
> > + .name = "alibaba-eni-vdpa",
> > + .id_table = eni_pci_ids,
> > + .probe = eni_vdpa_probe,
> > + .remove = eni_vdpa_remove,
> > +};
> > +
> > +module_pci_driver(eni_vdpa_driver);
> > +
> > +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> > +MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
> > +MODULE_LICENSE("GPL v2");
> > --
> > 2.31.1
> >
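
To illustrate the subvendor matching requested for eni_pci_ids above, here is a small userspace model of how a PCI ID table entry with subsystem IDs narrows driver binding. The struct and function names are hypothetical, and the subvendor value 0xabcd is a placeholder, not Alibaba's real ID. In the kernel the equivalent table entry would normally be written with the PCI_DEVICE_SUB() macro.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_ANY 0xffffffffu	/* wildcard, modelled on the kernel's PCI_ANY_ID */

/* Hypothetical, simplified stand-in for struct pci_device_id. */
struct pci_id {
	uint32_t vendor, device, subvendor, subdevice;
};

static bool field_matches(uint32_t want, uint32_t have)
{
	return want == ID_ANY || want == have;
}

/* An entry binds to a device only if all four fields match. */
bool pci_id_matches(const struct pci_id *entry, const struct pci_id *dev)
{
	return field_matches(entry->vendor, dev->vendor) &&
	       field_matches(entry->device, dev->device) &&
	       field_matches(entry->subvendor, dev->subvendor) &&
	       field_matches(entry->subdevice, dev->subdevice);
}
```

An entry that leaves the subsystem fields as wildcards matches every transitional virtio-net device (vendor 0x1af4, device 0x1000), which is the binding confusion the review points out. Filling in the subvendor restricts the match to the intended hardware.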

2021-09-15 04:14:40

by Jason Wang

Subject: Re: [PATCH v2 3/5] vp_vdpa: add vq irq offloading support

On Wed, Sep 15, 2021 at 11:31 AM Wu Zongyong
<[email protected]> wrote:
>
> On Wed, Sep 15, 2021 at 11:16:03AM +0800, Jason Wang wrote:
> > On Tue, Sep 14, 2021 at 8:25 PM Wu Zongyong
> > <[email protected]> wrote:
> > >
> > > This patch implements the get_vq_irq() callback for virtio pci devices
> > > to allow irq offloading.
> > >
> > > Signed-off-by: Wu Zongyong <[email protected]>
> >
> > Acked-by: Jason Wang <[email protected]>
> >
> > (btw, I think I've acked this but it seems lost).
> Yes, but this patch is a little different from the previous one.

I see, then it's better to mention this after "---" like

---
change since v1:
- xyz
---

or in the cover letter.

>
> And should I not send a patch again if it was already acked by someone
> in a previous version of the series?

No, you need to resend the whole series.

Thanks

> It's the first time for me to
> send patches to kernel community.
> >
> > Thanks
> >
> > > ---
> > > drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
> > > index 5bcd00246d2e..e3ff7875e123 100644
> > > --- a/drivers/vdpa/virtio_pci/vp_vdpa.c
> > > +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
> > > @@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
> > > return vp_modern_get_status(mdev);
> > > }
> > >
> > > +static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> > > +{
> > > + struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
> > > + int irq = vp_vdpa->vring[idx].irq;
> > > +
> > > + if (irq == VIRTIO_MSI_NO_VECTOR)
> > > + return -EINVAL;
> > > +
> > > + return irq;
> > > +}
> > > +
> > > static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
> > > {
> > > struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
> > > @@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
> > > .get_config = vp_vdpa_get_config,
> > > .set_config = vp_vdpa_set_config,
> > > .set_config_cb = vp_vdpa_set_config_cb,
> > > + .get_vq_irq = vp_vdpa_get_vq_irq,
> > > };
> > >
> > > static void vp_vdpa_free_irq_vectors(void *data)
> > > --
> > > 2.31.1
> > >
>
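
The get_vq_irq() contract in the quoted hunk can be modelled in a few lines of userspace C: return the Linux IRQ number for a queue, or -EINVAL while no vector is assigned. The vring_model struct and model_get_vq_irq name are hypothetical stand-ins for the driver's private state; VIRTIO_MSI_NO_VECTOR is 0xffff as in the virtio PCI headers.

```c
#include <errno.h>
#include <stdint.h>

#define VIRTIO_MSI_NO_VECTOR 0xffff

/* Hypothetical stand-in for the per-queue state kept by the driver. */
struct vring_model {
	int irq;
};

/* Mirrors the patch: no vector assigned means no irq to offload. */
int model_get_vq_irq(const struct vring_model *vrings, uint16_t idx)
{
	int irq = vrings[idx].irq;

	if (irq == VIRTIO_MSI_NO_VECTOR)
		return -EINVAL;

	return irq;
}
```

A caller such as vhost-vdpa can treat a negative return as "fall back to the normal callback path" and a non-negative one as the irq to hand to the bypass machinery.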

2021-09-15 07:36:08

by Michael S. Tsirkin

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 11:24:53AM +0800, Wu Zongyong wrote:
> On Tue, Sep 14, 2021 at 08:58:28AM -0400, Michael S. Tsirkin wrote:
> > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > This new attribute advertises whether the vdpa device is legacy or not.
> > > Users can pick the right virtqueue size if the vdpa device is legacy, which
> > > means it doesn't support changing the virtqueue size.
> > >
> > > Signed-off-by: Wu Zongyong <[email protected]>
> >
> > So if we are bothering with legacy, I think there are
> > several things to do when building the interface
> > - support transitional devices, that is allow userspace
> > to tell device it's in legacy mode
> > - support reporting/setting supporting endian-ness
>
> That would be true if we were implementing a general driver for legacy devices.
> But this series only implements a driver for ENI. Is it
> necessary to implement what you describe here in this series?

To a certain degree, yes.

I am thinking about the UAPI here. The new attribute is part of that.
E.g. userspace consuming this needs to be more or less hardware agnostic
and not depend on specifics of ENI.

Otherwise if userspace assumes legacy==eni then it will break with
other hardware.

One way to test how generic it all is would be adding legacy support in
the simulator.
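
As an illustration of why hardware-agnostic userspace needs more than a "legacy" flag, consider the endian-ness item quoted above. Legacy devices use the guest's native byte order for multi-byte fields, while VERSION_1 devices are always little-endian. The sketch below is a userspace model; the function name and the guest_big_endian parameter are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode a 16-bit virtio field from its two bytes in memory order.
 * Modern (VERSION_1) fields are little-endian; legacy fields follow
 * the guest's native byte order.
 */
uint16_t virtio16_decode(uint8_t b0, uint8_t b1, bool version_1,
			 bool guest_big_endian)
{
	bool little_endian = version_1 || !guest_big_endian;

	if (little_endian)
		return (uint16_t)(b0 | (b1 << 8));

	return (uint16_t)((b0 << 8) | b1);
}
```

Userspace that assumes legacy means little-endian gets the right answer on ENI hosts but would misread the same bytes on a big-endian guest, which is the kind of breakage the UAPI has to rule out.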

> >
> > > ---
> > > drivers/vdpa/vdpa.c | 6 ++++++
> > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > include/uapi/linux/vdpa.h | 1 +
> > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > index 1dc121a07a93..533d7f589eee 100644
> > > --- a/drivers/vdpa/vdpa.c
> > > +++ b/drivers/vdpa/vdpa.c
> > > @@ -12,6 +12,7 @@
> > > #include <linux/slab.h>
> > > #include <linux/vdpa.h>
> > > #include <uapi/linux/vdpa.h>
> > > +#include <uapi/linux/virtio_config.h>
> > > #include <net/genetlink.h>
> > > #include <linux/mod_devicetable.h>
> > >
> > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > u16 max_vq_size;
> > > u32 device_id;
> > > u32 vendor_id;
> > > + u64 features;
> > > void *hdr;
> > > int err;
> > >
> > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > device_id = vdev->config->get_device_id(vdev);
> > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > + features = vdev->config->get_features(vdev);
> > >
> > > err = -EMSGSIZE;
> > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > goto msg_err;
> > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > goto msg_err;
> > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > + goto msg_err;
> > >
> > > genlmsg_end(msg, hdr);
> > > return 0;
> > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > --- a/drivers/virtio/virtio_vdpa.c
> > > +++ b/drivers/virtio/virtio_vdpa.c
> > > @@ -7,6 +7,7 @@
> > > *
> > > */
> > >
> > > +#include "linux/virtio_config.h"
> > > #include <linux/init.h>
> > > #include <linux/module.h>
> > > #include <linux/device.h>
> > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > /* Assume split virtqueue, switch to packed if necessary */
> > > struct vdpa_vq_state state = {0};
> > > unsigned long flags;
> > > + bool may_reduce_num = false;
> > > u32 align, num;
> > > int err;
> > >
> > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > goto error_new_virtqueue;
> > > }
> > >
> > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > + may_reduce_num = true;
> > > +
> > > /* Create the vring */
> > > align = ops->get_vq_align(vdpa);
> > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > - true, true, ctx,
> > > + true, may_reduce_num, ctx,
> > > virtio_vdpa_notify, callback, name);
> > > if (!vq) {
> > > err = -ENOMEM;
> > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > index 66a41e4ec163..ce0b74276a5b 100644
> > > --- a/include/uapi/linux/vdpa.h
> > > +++ b/include/uapi/linux/vdpa.h
> > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > >
> > > /* new attributes must be added above here */
> > > VDPA_ATTR_MAX,
> > > --
> > > 2.31.1

2021-09-15 07:45:33

by Michael S. Tsirkin

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > Users can pick the right virtqueue size if the vdpa device is legacy, which
> > > > means it doesn't support changing the virtqueue size.
> > >
> > > Signed-off-by: Wu Zongyong <[email protected]>
> >
> > So if we are bothering with legacy,
>
> I think we'd better not. I guess the following may work:
>
> 1) disable the driver on BE host
> 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> 3) extend the management to advertise max_queue_size and
> min_queue_size, for ENI they are the same so management layer knows it
> needs to set the queue_size correctly during launching qemu
>
> Thoughts?
>
> Thanks
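
Point 3 of the proposal quoted above can be sketched as a clamp in the management layer. The names choose_queue_size, min_queue_size and max_queue_size are assumptions taken from the wording of the proposal, not an existing API.

```c
/*
 * Hypothetical management-layer helper: clamp the requested queue size
 * into the range the device advertises. When min == max (as proposed
 * for ENI) the result is always the device's fixed size.
 */
unsigned int choose_queue_size(unsigned int requested,
			       unsigned int min_queue_size,
			       unsigned int max_queue_size)
{
	if (requested < min_queue_size)
		return min_queue_size;
	if (requested > max_queue_size)
		return max_queue_size;

	return requested;
}
```

The resulting value is what the management layer would pass to QEMU as the queue size when launching the guest.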

There are other subtle differences such as header size without
mergeable buffers for net.
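
To make the header size difference concrete: a legacy virtio-net device without VIRTIO_NET_F_MRG_RXBUF uses the 10-byte struct virtio_net_hdr, while a VERSION_1 device always uses the 12-byte layout that includes num_buffers. A small userspace sketch follows; the function name is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_NET_F_MRG_RXBUF	15
#define VIRTIO_F_VERSION_1	32

/*
 * Per-packet header length implied by the negotiated features:
 * 12 bytes with mergeable buffers or VERSION_1, 10 bytes otherwise.
 */
size_t virtio_net_hdr_len(uint64_t features)
{
	uint64_t twelve_byte_mask = (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
				    (1ULL << VIRTIO_F_VERSION_1);

	return (features & twelve_byte_mask) ? 12 : 10;
}
```

So presenting VERSION_1 on top of a legacy device would change the header length the driver expects unless mergeable buffers are negotiated as well.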


> > I think there are
> > several things to do when building the interface
> > - support transitional devices, that is allow userspace
> > to tell device it's in legacy mode
> > - support reporting/setting supporting endian-ness
> >
> > > ---
> > > drivers/vdpa/vdpa.c | 6 ++++++
> > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > include/uapi/linux/vdpa.h | 1 +
> > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > index 1dc121a07a93..533d7f589eee 100644
> > > --- a/drivers/vdpa/vdpa.c
> > > +++ b/drivers/vdpa/vdpa.c
> > > @@ -12,6 +12,7 @@
> > > #include <linux/slab.h>
> > > #include <linux/vdpa.h>
> > > #include <uapi/linux/vdpa.h>
> > > +#include <uapi/linux/virtio_config.h>
> > > #include <net/genetlink.h>
> > > #include <linux/mod_devicetable.h>
> > >
> > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > u16 max_vq_size;
> > > u32 device_id;
> > > u32 vendor_id;
> > > + u64 features;
> > > void *hdr;
> > > int err;
> > >
> > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > device_id = vdev->config->get_device_id(vdev);
> > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > + features = vdev->config->get_features(vdev);
> > >
> > > err = -EMSGSIZE;
> > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > goto msg_err;
> > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > goto msg_err;
> > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > + goto msg_err;
> > >
> > > genlmsg_end(msg, hdr);
> > > return 0;
> > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > --- a/drivers/virtio/virtio_vdpa.c
> > > +++ b/drivers/virtio/virtio_vdpa.c
> > > @@ -7,6 +7,7 @@
> > > *
> > > */
> > >
> > > +#include "linux/virtio_config.h"
> > > #include <linux/init.h>
> > > #include <linux/module.h>
> > > #include <linux/device.h>
> > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > /* Assume split virtqueue, switch to packed if necessary */
> > > struct vdpa_vq_state state = {0};
> > > unsigned long flags;
> > > + bool may_reduce_num = false;
> > > u32 align, num;
> > > int err;
> > >
> > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > goto error_new_virtqueue;
> > > }
> > >
> > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > + may_reduce_num = true;
> > > +
> > > /* Create the vring */
> > > align = ops->get_vq_align(vdpa);
> > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > - true, true, ctx,
> > > + true, may_reduce_num, ctx,
> > > virtio_vdpa_notify, callback, name);
> > > if (!vq) {
> > > err = -ENOMEM;
> > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > index 66a41e4ec163..ce0b74276a5b 100644
> > > --- a/include/uapi/linux/vdpa.h
> > > +++ b/include/uapi/linux/vdpa.h
> > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > >
> > > /* new attributes must be added above here */
> > > VDPA_ATTR_MAX,
> > > --
> > > 2.31.1
> >

2021-09-15 08:08:58

by Jason Wang

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 3:31 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Wed, Sep 15, 2021 at 11:24:53AM +0800, Wu Zongyong wrote:
> > On Tue, Sep 14, 2021 at 08:58:28AM -0400, Michael S. Tsirkin wrote:
> > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > Users can pick the right virtqueue size if the vdpa device is legacy, which
> > > > > means it doesn't support changing the virtqueue size.
> > > >
> > > > Signed-off-by: Wu Zongyong <[email protected]>
> > >
> > > So if we are bothering with legacy, I think there are
> > > several things to do when building the interface
> > > - support transitional devices, that is allow userspace
> > > to tell device it's in legacy mode
> > > - support reporting/setting supporting endian-ness
> >
> > That would be true if we were implementing a general driver for legacy devices.
> > But this series only implements a driver for ENI. Is it
> > necessary to implement what you describe here in this series?
>
> To a certain degree, yes.
>
> I am thinking about the UAPI here. The new attribute is part of that.
> E.g. userspace consuming this needs to be more or less hardware agnostic
> and not depend on specifics of ENI.
>
> Otherwise if userspace assumes legacy==eni then it will break with
> other hardware.
>
> One way to test how generic it all is would be adding legacy support in
> the simulator.

I don't see why we need to support legacy devices, e.g. they don't have
ACCESS_PLATFORM support. I think we should reconsider mandating 1.0
devices.

https://lore.kernel.org/lkml/[email protected]/

And it will complicate all of the different layers.

Thanks
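
A sketch of what mandating 1.0 devices could look like as a check on the features a parent driver reports. The function name and the choice of -EINVAL are assumptions for illustration, not code from the vDPA core.

```c
#include <errno.h>
#include <stdint.h>

#define VIRTIO_F_VERSION_1 32

/*
 * Hypothetical gate: accept a device only if it offers VERSION_1,
 * so upper layers never have to handle legacy semantics.
 */
int require_version_1(uint64_t device_features)
{
	if (!(device_features & (1ULL << VIRTIO_F_VERSION_1)))
		return -EINVAL;

	return 0;
}
```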

>
> > >
> > > > ---
> > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > include/uapi/linux/vdpa.h | 1 +
> > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > index 1dc121a07a93..533d7f589eee 100644
> > > > --- a/drivers/vdpa/vdpa.c
> > > > +++ b/drivers/vdpa/vdpa.c
> > > > @@ -12,6 +12,7 @@
> > > > #include <linux/slab.h>
> > > > #include <linux/vdpa.h>
> > > > #include <uapi/linux/vdpa.h>
> > > > +#include <uapi/linux/virtio_config.h>
> > > > #include <net/genetlink.h>
> > > > #include <linux/mod_devicetable.h>
> > > >
> > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > u16 max_vq_size;
> > > > u32 device_id;
> > > > u32 vendor_id;
> > > > + u64 features;
> > > > void *hdr;
> > > > int err;
> > > >
> > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > device_id = vdev->config->get_device_id(vdev);
> > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > + features = vdev->config->get_features(vdev);
> > > >
> > > > err = -EMSGSIZE;
> > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > goto msg_err;
> > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > goto msg_err;
> > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > + goto msg_err;
> > > >
> > > > genlmsg_end(msg, hdr);
> > > > return 0;
> > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > @@ -7,6 +7,7 @@
> > > > *
> > > > */
> > > >
> > > > +#include "linux/virtio_config.h"
> > > > #include <linux/init.h>
> > > > #include <linux/module.h>
> > > > #include <linux/device.h>
> > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > struct vdpa_vq_state state = {0};
> > > > unsigned long flags;
> > > > + bool may_reduce_num = false;
> > > > u32 align, num;
> > > > int err;
> > > >
> > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > goto error_new_virtqueue;
> > > > }
> > > >
> > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > + may_reduce_num = true;
> > > > +
> > > > /* Create the vring */
> > > > align = ops->get_vq_align(vdpa);
> > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > - true, true, ctx,
> > > > + true, may_reduce_num, ctx,
> > > > virtio_vdpa_notify, callback, name);
> > > > if (!vq) {
> > > > err = -ENOMEM;
> > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > --- a/include/uapi/linux/vdpa.h
> > > > +++ b/include/uapi/linux/vdpa.h
> > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > >
> > > > /* new attributes must be added above here */
> > > > VDPA_ATTR_MAX,
> > > > --
> > > > 2.31.1
>

2021-09-15 08:12:13

by Jason Wang

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > Users can pick the right virtqueue size if the vdpa device is legacy, which
> > > > > means it doesn't support changing the virtqueue size.
> > > >
> > > > Signed-off-by: Wu Zongyong <[email protected]>
> > >
> > > So if we are bothering with legacy,
> >
> > I think we'd better not. I guess the following may work:
> >
> > 1) disable the driver on BE host
> > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > 3) extend the management to advertise max_queue_size and
> > min_queue_size, for ENI they are the same so management layer knows it
> > needs to set the queue_size correctly during launching qemu
> >
> > Thoughts?
> >
> > Thanks
>
> There are other subtle differences such as header size without
> mergeable buffers for net.

This can be solved by mandating the feature of a mergeable buffer?

Thanks

>
>
> > > I think there are
> > > several things to do when building the interface
> > > - support transitional devices, that is allow userspace
> > > to tell device it's in legacy mode
> > > - support reporting/setting supporting endian-ness
> > >
> > > > ---
> > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > include/uapi/linux/vdpa.h | 1 +
> > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > index 1dc121a07a93..533d7f589eee 100644
> > > > --- a/drivers/vdpa/vdpa.c
> > > > +++ b/drivers/vdpa/vdpa.c
> > > > @@ -12,6 +12,7 @@
> > > > #include <linux/slab.h>
> > > > #include <linux/vdpa.h>
> > > > #include <uapi/linux/vdpa.h>
> > > > +#include <uapi/linux/virtio_config.h>
> > > > #include <net/genetlink.h>
> > > > #include <linux/mod_devicetable.h>
> > > >
> > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > u16 max_vq_size;
> > > > u32 device_id;
> > > > u32 vendor_id;
> > > > + u64 features;
> > > > void *hdr;
> > > > int err;
> > > >
> > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > device_id = vdev->config->get_device_id(vdev);
> > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > + features = vdev->config->get_features(vdev);
> > > >
> > > > err = -EMSGSIZE;
> > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > goto msg_err;
> > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > goto msg_err;
> > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > + goto msg_err;
> > > >
> > > > genlmsg_end(msg, hdr);
> > > > return 0;
> > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > @@ -7,6 +7,7 @@
> > > > *
> > > > */
> > > >
> > > > +#include "linux/virtio_config.h"
> > > > #include <linux/init.h>
> > > > #include <linux/module.h>
> > > > #include <linux/device.h>
> > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > struct vdpa_vq_state state = {0};
> > > > unsigned long flags;
> > > > + bool may_reduce_num = false;
> > > > u32 align, num;
> > > > int err;
> > > >
> > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > goto error_new_virtqueue;
> > > > }
> > > >
> > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > + may_reduce_num = true;
> > > > +
> > > > /* Create the vring */
> > > > align = ops->get_vq_align(vdpa);
> > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > - true, true, ctx,
> > > > + true, may_reduce_num, ctx,
> > > > virtio_vdpa_notify, callback, name);
> > > > if (!vq) {
> > > > err = -ENOMEM;
> > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > --- a/include/uapi/linux/vdpa.h
> > > > +++ b/include/uapi/linux/vdpa.h
> > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > >
> > > > /* new attributes must be added above here */
> > > > VDPA_ATTR_MAX,
> > > > --
> > > > 2.31.1
> > >
>

2021-09-15 11:10:12

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 04:06:57PM +0800, Jason Wang wrote:
> On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > > >
> > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > doesn't support to change virtqueue size.
> > > > >
> > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > >
> > > > So if we are bothering with legacy,
> > >
> > > I think we'd better not. I guess the following may work:
> > >
> > > 1) disable the driver on BE host
> > > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > > 3) extend the management to advertise max_queue_size and
> > > min_queue_size, for ENI they are the same so management layer knows it
> > > needs to set the queue_size correctly during launching qemu
> > >
> > > Thoughts?
> > >
> > > Thanks
> >
> > There are other subtle differences such as header size without
> > mergeable buffers for net.
>
> This can be solved by mandating the feature of a mergeable buffer?
>
> Thanks

PXE and some DPDK versions are just some of the guests that
disable the mergeable buffers feature.
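The header-size difference mentioned above can be sketched as follows. The struct layouts mirror struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf from the virtio-net UAPI header; the local type names and the helper vnet_hdr_len() are made up for this sketch. The rule it encodes is the one from the virtio specification as I understand it: a legacy device only carries the num_buffers field when mergeable buffers are negotiated, while a VERSION_1 device always carries it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as struct virtio_net_hdr (names local to this sketch). */
struct net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Same field layout as struct virtio_net_hdr_mrg_rxbuf. */
struct net_hdr_mrg_rxbuf {
	struct net_hdr hdr;
	uint16_t num_buffers;
};

static_assert(sizeof(struct net_hdr) == 10, "short header is 10 bytes");
static_assert(sizeof(struct net_hdr_mrg_rxbuf) == 12, "long header is 12 bytes");

/* Hypothetical helper: header length the guest and device must agree on. */
static size_t vnet_hdr_len(bool version_1, bool mrg_rxbuf)
{
	if (version_1 || mrg_rxbuf)
		return sizeof(struct net_hdr_mrg_rxbuf);
	return sizeof(struct net_hdr);
}
```

A guest that turns mergeable buffers off therefore expects a 10-byte header from a legacy device but a 12-byte header from a VERSION_1 device, so presenting VERSION_1 on top of legacy hardware changes the framing for exactly those guests.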

> >
> >
> > > > I think there are
> > > > several things to do when building the interface
> > > > - support transitional devices, that is allow userspace
> > > > to tell device it's in legacy mode
> > > > - support reporting/setting supporting endian-ness
> > > >
> > > > > ---
> > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > --- a/drivers/vdpa/vdpa.c
> > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > @@ -12,6 +12,7 @@
> > > > > #include <linux/slab.h>
> > > > > #include <linux/vdpa.h>
> > > > > #include <uapi/linux/vdpa.h>
> > > > > +#include <uapi/linux/virtio_config.h>
> > > > > #include <net/genetlink.h>
> > > > > #include <linux/mod_devicetable.h>
> > > > >
> > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > u16 max_vq_size;
> > > > > u32 device_id;
> > > > > u32 vendor_id;
> > > > > + u64 features;
> > > > > void *hdr;
> > > > > int err;
> > > > >
> > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > + features = vdev->config->get_features(vdev);
> > > > >
> > > > > err = -EMSGSIZE;
> > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > goto msg_err;
> > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > goto msg_err;
> > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > + goto msg_err;
> > > > >
> > > > > genlmsg_end(msg, hdr);
> > > > > return 0;
> > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > @@ -7,6 +7,7 @@
> > > > > *
> > > > > */
> > > > >
> > > > > +#include "linux/virtio_config.h"
> > > > > #include <linux/init.h>
> > > > > #include <linux/module.h>
> > > > > #include <linux/device.h>
> > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > struct vdpa_vq_state state = {0};
> > > > > unsigned long flags;
> > > > > + bool may_reduce_num = false;
> > > > > u32 align, num;
> > > > > int err;
> > > > >
> > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > goto error_new_virtqueue;
> > > > > }
> > > > >
> > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > + may_reduce_num = true;
> > > > > +
> > > > > /* Create the vring */
> > > > > align = ops->get_vq_align(vdpa);
> > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > - true, true, ctx,
> > > > > + true, may_reduce_num, ctx,
> > > > > virtio_vdpa_notify, callback, name);
> > > > > if (!vq) {
> > > > > err = -ENOMEM;
> > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > --- a/include/uapi/linux/vdpa.h
> > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > >
> > > > > /* new attributes must be added above here */
> > > > > VDPA_ATTR_MAX,
> > > > > --
> > > > > 2.31.1
> > > >
> >

2021-09-15 11:12:32

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 04:05:49PM +0800, Jason Wang wrote:
> On Wed, Sep 15, 2021 at 3:31 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Wed, Sep 15, 2021 at 11:24:53AM +0800, Wu Zongyong wrote:
> > > On Tue, Sep 14, 2021 at 08:58:28AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > doesn't support to change virtqueue size.
> > > > >
> > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > >
> > > > So if we are bothering with legacy, I think there are
> > > > several things to do when building the interface
> > > > - support transitional devices, that is allow userspace
> > > > to tell device it's in legacy mode
> > > > - support reporting/setting supporting endian-ness
> > >
> > > It's true if we try to implement a general driver for legacy.
> > > But this series is dedicated to implement a driver only for ENI. Is it
> > > necessary to implement what you said here in this series?
> >
> > To a certain degree, yes.
> >
> > I am thinking about the UAPI here. The new attribute is part of that.
> > E.g. userspace consuming this needs to be more or less hardware agnostic
> > and not depend on specifics of ENI.
> >
> > Otherwise if userspace assumes legacy==eni then it will break with
> > other hardware.
> >
> > One way to test how generic it all is would be adding legacy support in
> > the simulator.
>
> I don't get why we need to support legacy devices e.g it doesn't have
> ACCESS_PLATFORM support. I think we should re-consider to mandate 1.0
> devices.
>
> https://lore.kernel.org/lkml/[email protected]/
>
> And it will complicate all of the different layers.
>
> Thanks


It's not that we have to; it's just that, IMHO, if we do,
it's easier to do it all in the kernel rather than spreading
parts of the code around.

> >
> > > >
> > > > > ---
> > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > --- a/drivers/vdpa/vdpa.c
> > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > @@ -12,6 +12,7 @@
> > > > > #include <linux/slab.h>
> > > > > #include <linux/vdpa.h>
> > > > > #include <uapi/linux/vdpa.h>
> > > > > +#include <uapi/linux/virtio_config.h>
> > > > > #include <net/genetlink.h>
> > > > > #include <linux/mod_devicetable.h>
> > > > >
> > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > u16 max_vq_size;
> > > > > u32 device_id;
> > > > > u32 vendor_id;
> > > > > + u64 features;
> > > > > void *hdr;
> > > > > int err;
> > > > >
> > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > + features = vdev->config->get_features(vdev);
> > > > >
> > > > > err = -EMSGSIZE;
> > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > goto msg_err;
> > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > goto msg_err;
> > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > + goto msg_err;
> > > > >
> > > > > genlmsg_end(msg, hdr);
> > > > > return 0;
> > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > @@ -7,6 +7,7 @@
> > > > > *
> > > > > */
> > > > >
> > > > > +#include "linux/virtio_config.h"
> > > > > #include <linux/init.h>
> > > > > #include <linux/module.h>
> > > > > #include <linux/device.h>
> > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > struct vdpa_vq_state state = {0};
> > > > > unsigned long flags;
> > > > > + bool may_reduce_num = false;
> > > > > u32 align, num;
> > > > > int err;
> > > > >
> > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > goto error_new_virtqueue;
> > > > > }
> > > > >
> > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > + may_reduce_num = true;
> > > > > +
> > > > > /* Create the vring */
> > > > > align = ops->get_vq_align(vdpa);
> > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > - true, true, ctx,
> > > > > + true, may_reduce_num, ctx,
> > > > > virtio_vdpa_notify, callback, name);
> > > > > if (!vq) {
> > > > > err = -ENOMEM;
> > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > --- a/include/uapi/linux/vdpa.h
> > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > >
> > > > > /* new attributes must be added above here */
> > > > > VDPA_ATTR_MAX,
> > > > > --
> > > > > 2.31.1
> >

2021-09-15 12:14:34

by Wu Zongyong

[permalink] [raw]
Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 07:08:51AM -0400, Michael S. Tsirkin wrote:
> On Wed, Sep 15, 2021 at 04:06:57PM +0800, Jason Wang wrote:
> > On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > > > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > >
> > > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > > doesn't support to change virtqueue size.
> > > > > >
> > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > >
> > > > > So if we are bothering with legacy,
> > > >
> > > > I think we'd better not. I guess the following may work:
> > > >
> > > > 1) disable the driver on BE host
> > > > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > > > 3) extend the management to advertise max_queue_size and
> > > > min_queue_size, for ENI they are the same so management layer knows it
> > > > needs to set the queue_size correctly during launching qemu
> > > >
> > > > Thoughts?
> > > >
> > > > Thanks
> > >
> > > There are other subtle differences such as header size without
> > > mergeable buffers for net.
> >
> > This can be solved by mandating the feature of a mergeable buffer?
> >
> > Thanks
>
> PXE and some dpdk versions are only some of the guests that
> disable mergeable buffers feature.
>
So what about this:

1) disable the driver on BE host
AFAIK, there are no use cases for ENI on a BE machine, so simply
disabling the driver on BE machines will make things simpler.
2) present ACCESS_PLATFORM but not VERSION_1 in get_features()
3) extend the management to advertise min_queue_size
min_queue_size is the same as max_queue_size for ENI.

Another choice for 3):
extend the management to advertise the flag F_VERSION_1, just as
this patch does
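A rough sketch of how a management layer could use the limits from option 3), assuming both min_queue_size and max_queue_size are advertised. Nothing here is an existing vDPA or QEMU interface: struct vq_size_limits and pick_queue_size() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits a management layer would read from the vDPA device. */
struct vq_size_limits {
	uint16_t min_queue_size;
	uint16_t max_queue_size;
};

/*
 * Clamp the requested virtqueue size into the advertised range. When the
 * two limits are equal, as described for ENI, the request is ignored and
 * the fixed size is returned.
 */
static uint16_t pick_queue_size(const struct vq_size_limits *lim,
				uint16_t requested)
{
	assert(lim->min_queue_size <= lim->max_queue_size);

	if (requested < lim->min_queue_size)
		return lim->min_queue_size;
	if (requested > lim->max_queue_size)
		return lim->max_queue_size;
	return requested;
}
```

The result would then be passed to QEMU as the queue size at launch, so a device whose minimum equals its maximum always gets the one size it supports, without userspace needing to know that the hardware is ENI.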
> > >
> > >
> > > > > I think there are
> > > > > several things to do when building the interface
> > > > > - support transitional devices, that is allow userspace
> > > > > to tell device it's in legacy mode
> > > > > - support reporting/setting supporting endian-ness
> > > > >
> > > > > > ---
> > > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > @@ -12,6 +12,7 @@
> > > > > > #include <linux/slab.h>
> > > > > > #include <linux/vdpa.h>
> > > > > > #include <uapi/linux/vdpa.h>
> > > > > > +#include <uapi/linux/virtio_config.h>
> > > > > > #include <net/genetlink.h>
> > > > > > #include <linux/mod_devicetable.h>
> > > > > >
> > > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > u16 max_vq_size;
> > > > > > u32 device_id;
> > > > > > u32 vendor_id;
> > > > > > + u64 features;
> > > > > > void *hdr;
> > > > > > int err;
> > > > > >
> > > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > > + features = vdev->config->get_features(vdev);
> > > > > >
> > > > > > err = -EMSGSIZE;
> > > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > goto msg_err;
> > > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > > goto msg_err;
> > > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > > + goto msg_err;
> > > > > >
> > > > > > genlmsg_end(msg, hdr);
> > > > > > return 0;
> > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > @@ -7,6 +7,7 @@
> > > > > > *
> > > > > > */
> > > > > >
> > > > > > +#include "linux/virtio_config.h"
> > > > > > #include <linux/init.h>
> > > > > > #include <linux/module.h>
> > > > > > #include <linux/device.h>
> > > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > > struct vdpa_vq_state state = {0};
> > > > > > unsigned long flags;
> > > > > > + bool may_reduce_num = false;
> > > > > > u32 align, num;
> > > > > > int err;
> > > > > >
> > > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > goto error_new_virtqueue;
> > > > > > }
> > > > > >
> > > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > > + may_reduce_num = true;
> > > > > > +
> > > > > > /* Create the vring */
> > > > > > align = ops->get_vq_align(vdpa);
> > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > - true, true, ctx,
> > > > > > + true, may_reduce_num, ctx,
> > > > > > virtio_vdpa_notify, callback, name);
> > > > > > if (!vq) {
> > > > > > err = -ENOMEM;
> > > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > > --- a/include/uapi/linux/vdpa.h
> > > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > > >
> > > > > > /* new attributes must be added above here */
> > > > > > VDPA_ATTR_MAX,
> > > > > > --
> > > > > > 2.31.1
> > > > >
> > >

2021-09-15 13:41:44

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] eni_vdpa: add vDPA driver for Alibaba ENI

Hi Wu,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.15-rc1 next-20210915]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Wu-Zongyong/virtio-pci-introduce-legacy-device-module/20210914-212528
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git d0ee23f9d78be5531c4b055ea424ed0b489dfe9b
config: mips-randconfig-r021-20210915 (attached as .config)
compiler: mipsel-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/d687e424f26eca6cc8fe50165e6d2eb9d67acd45
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Wu-Zongyong/virtio-pci-introduce-legacy-device-module/20210914-212528
git checkout d687e424f26eca6cc8fe50165e6d2eb9d67acd45
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=mips SHELL=/bin/bash drivers/vdpa/alibaba/

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:18: warning: "ENOMSG" redefined
18 | #define ENOMSG 35 /* No message of desired type */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:23: note: this is the location of the previous definition
23 | #define ENOMSG 42 /* No message of desired type */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:19: warning: "EIDRM" redefined
19 | #define EIDRM 36 /* Identifier removed */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:24: note: this is the location of the previous definition
24 | #define EIDRM 43 /* Identifier removed */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:20: warning: "ECHRNG" redefined
20 | #define ECHRNG 37 /* Channel number out of range */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:25: note: this is the location of the previous definition
25 | #define ECHRNG 44 /* Channel number out of range */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:21: warning: "EL2NSYNC" redefined
21 | #define EL2NSYNC 38 /* Level 2 not synchronized */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:26: note: this is the location of the previous definition
26 | #define EL2NSYNC 45 /* Level 2 not synchronized */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:22: warning: "EL3HLT" redefined
22 | #define EL3HLT 39 /* Level 3 halted */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:27: note: this is the location of the previous definition
27 | #define EL3HLT 46 /* Level 3 halted */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:23: warning: "EL3RST" redefined
23 | #define EL3RST 40 /* Level 3 reset */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:28: note: this is the location of the previous definition
28 | #define EL3RST 47 /* Level 3 reset */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:24: warning: "ELNRNG" redefined
24 | #define ELNRNG 41 /* Link number out of range */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:29: note: this is the location of the previous definition
29 | #define ELNRNG 48 /* Link number out of range */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:25: warning: "EUNATCH" redefined
25 | #define EUNATCH 42 /* Protocol driver not attached */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:30: note: this is the location of the previous definition
30 | #define EUNATCH 49 /* Protocol driver not attached */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:26: warning: "ENOCSI" redefined
26 | #define ENOCSI 43 /* No CSI structure available */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:31: note: this is the location of the previous definition
31 | #define ENOCSI 50 /* No CSI structure available */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:27: warning: "EL2HLT" redefined
27 | #define EL2HLT 44 /* Level 2 halted */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:32: note: this is the location of the previous definition
32 | #define EL2HLT 51 /* Level 2 halted */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:28: warning: "EDEADLK" redefined
28 | #define EDEADLK 45 /* Resource deadlock would occur */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:7: note: this is the location of the previous definition
7 | #define EDEADLK 35 /* Resource deadlock would occur */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:29: warning: "ENOLCK" redefined
29 | #define ENOLCK 46 /* No record locks available */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:9: note: this is the location of the previous definition
9 | #define ENOLCK 37 /* No record locks available */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:30: warning: "EBADE" redefined
30 | #define EBADE 50 /* Invalid exchange */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:33: note: this is the location of the previous definition
33 | #define EBADE 52 /* Invalid exchange */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:31: warning: "EBADR" redefined
31 | #define EBADR 51 /* Invalid request descriptor */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:34: note: this is the location of the previous definition
34 | #define EBADR 53 /* Invalid request descriptor */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:32: warning: "EXFULL" redefined
32 | #define EXFULL 52 /* Exchange full */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:35: note: this is the location of the previous definition
35 | #define EXFULL 54 /* Exchange full */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:33: warning: "ENOANO" redefined
33 | #define ENOANO 53 /* No anode */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:36: note: this is the location of the previous definition
36 | #define ENOANO 55 /* No anode */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:34: warning: "EBADRQC" redefined
34 | #define EBADRQC 54 /* Invalid request code */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:37: note: this is the location of the previous definition
37 | #define EBADRQC 56 /* Invalid request code */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:35: warning: "EBADSLT" redefined
35 | #define EBADSLT 55 /* Invalid slot */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:38: note: this is the location of the previous definition
38 | #define EBADSLT 57 /* Invalid slot */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:36: warning: "EDEADLOCK" redefined
36 | #define EDEADLOCK 56 /* File locking deadlock error */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:40: note: this is the location of the previous definition
40 | #define EDEADLOCK EDEADLK
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
>> arch/mips/include/uapi/asm/errno.h:51: warning: "EMULTIHOP" redefined
51 | #define EMULTIHOP 74 /* Multihop attempted */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:55: note: this is the location of the previous definition
55 | #define EMULTIHOP 72 /* Multihop attempted */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:52: warning: "EBADMSG" redefined
52 | #define EBADMSG 77 /* Not a data message */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:57: note: this is the location of the previous definition
57 | #define EBADMSG 74 /* Not a data message */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:53: warning: "ENAMETOOLONG" redefined
53 | #define ENAMETOOLONG 78 /* File name too long */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:8: note: this is the location of the previous definition
8 | #define ENAMETOOLONG 36 /* File name too long */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:54: warning: "EOVERFLOW" redefined
54 | #define EOVERFLOW 79 /* Value too large for defined data type */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:58: note: this is the location of the previous definition
58 | #define EOVERFLOW 75 /* Value too large for defined data type */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:55: warning: "ENOTUNIQ" redefined
55 | #define ENOTUNIQ 80 /* Name not unique on network */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:59: note: this is the location of the previous definition
59 | #define ENOTUNIQ 76 /* Name not unique on network */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:56: warning: "EBADFD" redefined
56 | #define EBADFD 81 /* File descriptor in bad state */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:60: note: this is the location of the previous definition
60 | #define EBADFD 77 /* File descriptor in bad state */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:57: warning: "EREMCHG" redefined
57 | #define EREMCHG 82 /* Remote address changed */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:61: note: this is the location of the previous definition
61 | #define EREMCHG 78 /* Remote address changed */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:58: warning: "ELIBACC" redefined
58 | #define ELIBACC 83 /* Can not access a needed shared library */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:62: note: this is the location of the previous definition
62 | #define ELIBACC 79 /* Can not access a needed shared library */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:59: warning: "ELIBBAD" redefined
59 | #define ELIBBAD 84 /* Accessing a corrupted shared library */
|
In file included from drivers/vdpa/alibaba/eni_vdpa.c:11:
include/uapi/asm-generic/errno.h:63: note: this is the location of the previous definition
63 | #define ELIBBAD 80 /* Accessing a corrupted shared library */
|
In file included from arch/mips/include/asm/errno.h:11,
from include/linux/err.h:8,
from include/linux/virtio_config.h:5,
from drivers/vdpa/alibaba/eni_vdpa.c:15:
arch/mips/include/uapi/asm/errno.h:60: warning: "ELIBSCN" redefined
60 | #define ELIBSCN 85 /* .lib section in a.out corrupted */


vim +/ENOMSG +18 arch/mips/include/uapi/asm/errno.h

61730c538f8281 David Howells 2012-10-09 17
61730c538f8281 David Howells 2012-10-09 @18 #define ENOMSG 35 /* No message of desired type */
61730c538f8281 David Howells 2012-10-09 @19 #define EIDRM 36 /* Identifier removed */
61730c538f8281 David Howells 2012-10-09 @20 #define ECHRNG 37 /* Channel number out of range */
61730c538f8281 David Howells 2012-10-09 @21 #define EL2NSYNC 38 /* Level 2 not synchronized */
61730c538f8281 David Howells 2012-10-09 @22 #define EL3HLT 39 /* Level 3 halted */
61730c538f8281 David Howells 2012-10-09 @23 #define EL3RST 40 /* Level 3 reset */
61730c538f8281 David Howells 2012-10-09 @24 #define ELNRNG 41 /* Link number out of range */
61730c538f8281 David Howells 2012-10-09 @25 #define EUNATCH 42 /* Protocol driver not attached */
61730c538f8281 David Howells 2012-10-09 @26 #define ENOCSI 43 /* No CSI structure available */
61730c538f8281 David Howells 2012-10-09 @27 #define EL2HLT 44 /* Level 2 halted */
61730c538f8281 David Howells 2012-10-09 @28 #define EDEADLK 45 /* Resource deadlock would occur */
61730c538f8281 David Howells 2012-10-09 @29 #define ENOLCK 46 /* No record locks available */
61730c538f8281 David Howells 2012-10-09 @30 #define EBADE 50 /* Invalid exchange */
61730c538f8281 David Howells 2012-10-09 @31 #define EBADR 51 /* Invalid request descriptor */
61730c538f8281 David Howells 2012-10-09 @32 #define EXFULL 52 /* Exchange full */
61730c538f8281 David Howells 2012-10-09 @33 #define ENOANO 53 /* No anode */
61730c538f8281 David Howells 2012-10-09 @34 #define EBADRQC 54 /* Invalid request code */
61730c538f8281 David Howells 2012-10-09 @35 #define EBADSLT 55 /* Invalid slot */
61730c538f8281 David Howells 2012-10-09 @36 #define EDEADLOCK 56 /* File locking deadlock error */
61730c538f8281 David Howells 2012-10-09 37 #define EBFONT 59 /* Bad font file format */
61730c538f8281 David Howells 2012-10-09 38 #define ENOSTR 60 /* Device not a stream */
61730c538f8281 David Howells 2012-10-09 39 #define ENODATA 61 /* No data available */
61730c538f8281 David Howells 2012-10-09 40 #define ETIME 62 /* Timer expired */
61730c538f8281 David Howells 2012-10-09 41 #define ENOSR 63 /* Out of streams resources */
61730c538f8281 David Howells 2012-10-09 42 #define ENONET 64 /* Machine is not on the network */
61730c538f8281 David Howells 2012-10-09 43 #define ENOPKG 65 /* Package not installed */
61730c538f8281 David Howells 2012-10-09 44 #define EREMOTE 66 /* Object is remote */
61730c538f8281 David Howells 2012-10-09 45 #define ENOLINK 67 /* Link has been severed */
61730c538f8281 David Howells 2012-10-09 46 #define EADV 68 /* Advertise error */
61730c538f8281 David Howells 2012-10-09 47 #define ESRMNT 69 /* Srmount error */
61730c538f8281 David Howells 2012-10-09 48 #define ECOMM 70 /* Communication error on send */
61730c538f8281 David Howells 2012-10-09 49 #define EPROTO 71 /* Protocol error */
61730c538f8281 David Howells 2012-10-09 50 #define EDOTDOT 73 /* RFS specific error */
61730c538f8281 David Howells 2012-10-09 @51 #define EMULTIHOP 74 /* Multihop attempted */
61730c538f8281 David Howells 2012-10-09 @52 #define EBADMSG 77 /* Not a data message */
61730c538f8281 David Howells 2012-10-09 @53 #define ENAMETOOLONG 78 /* File name too long */
61730c538f8281 David Howells 2012-10-09 @54 #define EOVERFLOW 79 /* Value too large for defined data type */
61730c538f8281 David Howells 2012-10-09 @55 #define ENOTUNIQ 80 /* Name not unique on network */
61730c538f8281 David Howells 2012-10-09 @56 #define EBADFD 81 /* File descriptor in bad state */
61730c538f8281 David Howells 2012-10-09 @57 #define EREMCHG 82 /* Remote address changed */
61730c538f8281 David Howells 2012-10-09 @58 #define ELIBACC 83 /* Can not access a needed shared library */
61730c538f8281 David Howells 2012-10-09 @59 #define ELIBBAD 84 /* Accessing a corrupted shared library */
61730c538f8281 David Howells 2012-10-09 @60 #define ELIBSCN 85 /* .lib section in a.out corrupted */
61730c538f8281 David Howells 2012-10-09 @61 #define ELIBMAX 86 /* Attempting to link in too many shared libraries */
61730c538f8281 David Howells 2012-10-09 @62 #define ELIBEXEC 87 /* Cannot exec a shared library directly */
61730c538f8281 David Howells 2012-10-09 @63 #define EILSEQ 88 /* Illegal byte sequence */
61730c538f8281 David Howells 2012-10-09 @64 #define ENOSYS 89 /* Function not implemented */
61730c538f8281 David Howells 2012-10-09 @65 #define ELOOP 90 /* Too many symbolic links encountered */
61730c538f8281 David Howells 2012-10-09 @66 #define ERESTART 91 /* Interrupted system call should be restarted */
61730c538f8281 David Howells 2012-10-09 @67 #define ESTRPIPE 92 /* Streams pipe error */
61730c538f8281 David Howells 2012-10-09 @68 #define ENOTEMPTY 93 /* Directory not empty */
61730c538f8281 David Howells 2012-10-09 @69 #define EUSERS 94 /* Too many users */
61730c538f8281 David Howells 2012-10-09 @70 #define ENOTSOCK 95 /* Socket operation on non-socket */
61730c538f8281 David Howells 2012-10-09 @71 #define EDESTADDRREQ 96 /* Destination address required */
61730c538f8281 David Howells 2012-10-09 @72 #define EMSGSIZE 97 /* Message too long */
61730c538f8281 David Howells 2012-10-09 @73 #define EPROTOTYPE 98 /* Protocol wrong type for socket */
61730c538f8281 David Howells 2012-10-09 @74 #define ENOPROTOOPT 99 /* Protocol not available */
61730c538f8281 David Howells 2012-10-09 @75 #define EPROTONOSUPPORT 120 /* Protocol not supported */
61730c538f8281 David Howells 2012-10-09 @76 #define ESOCKTNOSUPPORT 121 /* Socket type not supported */
61730c538f8281 David Howells 2012-10-09 @77 #define EOPNOTSUPP 122 /* Operation not supported on transport endpoint */
61730c538f8281 David Howells 2012-10-09 @78 #define EPFNOSUPPORT 123 /* Protocol family not supported */
61730c538f8281 David Howells 2012-10-09 @79 #define EAFNOSUPPORT 124 /* Address family not supported by protocol */
61730c538f8281 David Howells 2012-10-09 @80 #define EADDRINUSE 125 /* Address already in use */
61730c538f8281 David Howells 2012-10-09 @81 #define EADDRNOTAVAIL 126 /* Cannot assign requested address */
61730c538f8281 David Howells 2012-10-09 @82 #define ENETDOWN 127 /* Network is down */
61730c538f8281 David Howells 2012-10-09 @83 #define ENETUNREACH 128 /* Network is unreachable */
61730c538f8281 David Howells 2012-10-09 @84 #define ENETRESET 129 /* Network dropped connection because of reset */
61730c538f8281 David Howells 2012-10-09 @85 #define ECONNABORTED 130 /* Software caused connection abort */
61730c538f8281 David Howells 2012-10-09 @86 #define ECONNRESET 131 /* Connection reset by peer */
61730c538f8281 David Howells 2012-10-09 @87 #define ENOBUFS 132 /* No buffer space available */
61730c538f8281 David Howells 2012-10-09 @88 #define EISCONN 133 /* Transport endpoint is already connected */
61730c538f8281 David Howells 2012-10-09 @89 #define ENOTCONN 134 /* Transport endpoint is not connected */
61730c538f8281 David Howells 2012-10-09 @90 #define EUCLEAN 135 /* Structure needs cleaning */
61730c538f8281 David Howells 2012-10-09 @91 #define ENOTNAM 137 /* Not a XENIX named type file */
61730c538f8281 David Howells 2012-10-09 @92 #define ENAVAIL 138 /* No XENIX semaphores available */
61730c538f8281 David Howells 2012-10-09 @93 #define EISNAM 139 /* Is a named type file */
61730c538f8281 David Howells 2012-10-09 @94 #define EREMOTEIO 140 /* Remote I/O error */
61730c538f8281 David Howells 2012-10-09 95 #define EINIT 141 /* Reserved */
61730c538f8281 David Howells 2012-10-09 96 #define EREMDEV 142 /* Error 142 */
61730c538f8281 David Howells 2012-10-09 @97 #define ESHUTDOWN 143 /* Cannot send after transport endpoint shutdown */
61730c538f8281 David Howells 2012-10-09 @98 #define ETOOMANYREFS 144 /* Too many references: cannot splice */
61730c538f8281 David Howells 2012-10-09 @99 #define ETIMEDOUT 145 /* Connection timed out */
61730c538f8281 David Howells 2012-10-09 @100 #define ECONNREFUSED 146 /* Connection refused */
61730c538f8281 David Howells 2012-10-09 @101 #define EHOSTDOWN 147 /* Host is down */
61730c538f8281 David Howells 2012-10-09 @102 #define EHOSTUNREACH 148 /* No route to host */
61730c538f8281 David Howells 2012-10-09 103 #define EWOULDBLOCK EAGAIN /* Operation would block */
61730c538f8281 David Howells 2012-10-09 @104 #define EALREADY 149 /* Operation already in progress */
61730c538f8281 David Howells 2012-10-09 @105 #define EINPROGRESS 150 /* Operation now in progress */
0ca43435188b9f Eric Sandeen 2013-11-12 @106 #define ESTALE 151 /* Stale file handle */
61730c538f8281 David Howells 2012-10-09 @107 #define ECANCELED 158 /* AIO operation canceled */
61730c538f8281 David Howells 2012-10-09 108
61730c538f8281 David Howells 2012-10-09 109 /*
61730c538f8281 David Howells 2012-10-09 110 * These error are Linux extensions.
61730c538f8281 David Howells 2012-10-09 111 */
61730c538f8281 David Howells 2012-10-09 @112 #define ENOMEDIUM 159 /* No medium found */
61730c538f8281 David Howells 2012-10-09 @113 #define EMEDIUMTYPE 160 /* Wrong medium type */
61730c538f8281 David Howells 2012-10-09 @114 #define ENOKEY 161 /* Required key not available */
61730c538f8281 David Howells 2012-10-09 @115 #define EKEYEXPIRED 162 /* Key has expired */
61730c538f8281 David Howells 2012-10-09 @116 #define EKEYREVOKED 163 /* Key has been revoked */
61730c538f8281 David Howells 2012-10-09 @117 #define EKEYREJECTED 164 /* Key was rejected by service */
61730c538f8281 David Howells 2012-10-09 118
61730c538f8281 David Howells 2012-10-09 119 /* for robust mutexes */
61730c538f8281 David Howells 2012-10-09 @120 #define EOWNERDEAD 165 /* Owner died */
61730c538f8281 David Howells 2012-10-09 @121 #define ENOTRECOVERABLE 166 /* State not recoverable */
61730c538f8281 David Howells 2012-10-09 122
61730c538f8281 David Howells 2012-10-09 @123 #define ERFKILL 167 /* Operation not possible due to RF-kill */
61730c538f8281 David Howells 2012-10-09 124
61730c538f8281 David Howells 2012-10-09 @125 #define EHWPOISON 168 /* Memory page has hardware error */
61730c538f8281 David Howells 2012-10-09 126
61730c538f8281 David Howells 2012-10-09 @127 #define EDQUOT 1133 /* Quota exceeded */
61730c538f8281 David Howells 2012-10-09 128
61730c538f8281 David Howells 2012-10-09 129

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (29.70 kB)
.config.gz (32.45 kB)

2021-09-16 01:08:05

by Jason Wang

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 7:09 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Wed, Sep 15, 2021 at 04:06:57PM +0800, Jason Wang wrote:
> > On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > > > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > >
> > > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > > doesn't support to change virtqueue size.
> > > > > >
> > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > >
> > > > > So if we are bothering with legacy,
> > > >
> > > > I think we'd better not. I guess the following may work:
> > > >
> > > > 1) disable the driver on BE host
> > > > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > > > 3) extend the management to advertise max_queue_size and
> > > > min_queue_size, for ENI they are the same so management layer knows it
> > > > needs to set the queue_size correctly during launching qemu
> > > >
> > > > Thoughts?
> > > >
> > > > Thanks
> > >
> > > There are other subtle differences such as header size without
> > > mergeable buffers for net.
> >
> > This can be solved by mandating the feature of a mergeable buffer?
> >
> > Thanks
>
> PXE and some dpdk versions are only some of the guests that
> disable mergeable buffers feature.

True, but consider:

1) the legacy stuff requires changes in several software layers
2) this is how virtio 1.0 works, e.g. the device can fail the feature negotiation
3) it has not been supported since day 0
4) the management API can be extended to advertise the mandated features

So it looks affordable.

Thanks
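
To make point 2) concrete, here is a minimal userspace sketch of a device-side handler that fails feature negotiation when mergeable rx buffers are not accepted. The names sketch_dev and sketch_set_features are hypothetical and are not taken from the series; only the bit number of VIRTIO_NET_F_MRG_RXBUF comes from the virtio specification. Accepting an empty feature set (so that a reset still succeeds) is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit number from the virtio specification. */
#define VIRTIO_NET_F_MRG_RXBUF 15

/* Hypothetical device state for this sketch; not a kernel structure. */
struct sketch_dev {
	uint64_t driver_features;
};

/*
 * Hypothetical set_features handler: reject any non-empty feature set
 * that leaves out mergeable rx buffers. An empty set is accepted
 * (assumption) so that clearing the features on reset does not fail.
 */
static int sketch_set_features(struct sketch_dev *dev, uint64_t features)
{
	assert(dev != NULL);

	if (features && !(features & (1ULL << VIRTIO_NET_F_MRG_RXBUF)))
		return -EINVAL;

	dev->driver_features = features;
	return 0;
}
```

A guest that disables mergeable buffers would then see negotiation fail instead of running with a mismatched header size.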

>
> > >
> > >
> > > > > I think there are
> > > > > several things to do when building the interface
> > > > > - support transitional devices, that is allow userspace
> > > > > to tell device it's in legacy mode
> > > > > - support reporting/setting supporting endian-ness
> > > > >
> > > > > > ---
> > > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > @@ -12,6 +12,7 @@
> > > > > > #include <linux/slab.h>
> > > > > > #include <linux/vdpa.h>
> > > > > > #include <uapi/linux/vdpa.h>
> > > > > > +#include <uapi/linux/virtio_config.h>
> > > > > > #include <net/genetlink.h>
> > > > > > #include <linux/mod_devicetable.h>
> > > > > >
> > > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > u16 max_vq_size;
> > > > > > u32 device_id;
> > > > > > u32 vendor_id;
> > > > > > + u64 features;
> > > > > > void *hdr;
> > > > > > int err;
> > > > > >
> > > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > > + features = vdev->config->get_features(vdev);
> > > > > >
> > > > > > err = -EMSGSIZE;
> > > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > goto msg_err;
> > > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > > goto msg_err;
> > > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > > + goto msg_err;
> > > > > >
> > > > > > genlmsg_end(msg, hdr);
> > > > > > return 0;
> > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > @@ -7,6 +7,7 @@
> > > > > > *
> > > > > > */
> > > > > >
> > > > > > +#include "linux/virtio_config.h"
> > > > > > #include <linux/init.h>
> > > > > > #include <linux/module.h>
> > > > > > #include <linux/device.h>
> > > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > > struct vdpa_vq_state state = {0};
> > > > > > unsigned long flags;
> > > > > > + bool may_reduce_num = false;
> > > > > > u32 align, num;
> > > > > > int err;
> > > > > >
> > > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > goto error_new_virtqueue;
> > > > > > }
> > > > > >
> > > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > > + may_reduce_num = true;
> > > > > > +
> > > > > > /* Create the vring */
> > > > > > align = ops->get_vq_align(vdpa);
> > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > - true, true, ctx,
> > > > > > + true, may_reduce_num, ctx,
> > > > > > virtio_vdpa_notify, callback, name);
> > > > > > if (!vq) {
> > > > > > err = -ENOMEM;
> > > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > > --- a/include/uapi/linux/vdpa.h
> > > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > > >
> > > > > > /* new attributes must be added above here */
> > > > > > VDPA_ATTR_MAX,
> > > > > > --
> > > > > > 2.31.1
> > > > >
> > >
>

2021-09-16 01:15:31

by Jason Wang

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Wed, Sep 15, 2021 at 8:12 PM Wu Zongyong
<[email protected]> wrote:
>
> On Wed, Sep 15, 2021 at 07:08:51AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Sep 15, 2021 at 04:06:57PM +0800, Jason Wang wrote:
> > > On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
> > > >
> > > > On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > > > > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > > > doesn't support to change virtqueue size.
> > > > > > >
> > > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > >
> > > > > > So if we are bothering with legacy,
> > > > >
> > > > > I think we'd better not. I guess the following may work:
> > > > >
> > > > > 1) disable the driver on BE host
> > > > > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > > > > 3) extend the management to advertise max_queue_size and
> > > > > min_queue_size, for ENI they are the same so management layer knows it
> > > > > needs to set the queue_size correctly during launching qemu
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Thanks
> > > >
> > > > There are other subtle differences such as header size without
> > > > mergeable buffers for net.
> > >
> > > This can be solved by mandating the feature of a mergeable buffer?
> > >
> > > Thanks
> >
> > PXE and some dpdk versions are only some of the guests that
> > disable mergeable buffers feature.
> >
> So what about this:
>
> 1) disable the driver on BE host
> AFAIK, there are no use cases for ENI to be used in a BE machine. So
> just disable the driver on BE machines; it will make things simpler.
> 2) present ACCESS_PLATFORM but not VERSION_1 in get_features()

This sounds like a violation of the virtio spec. ACCESS_PLATFORM
depends on VERSION_1.
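
The dependency can be expressed as a small check. This is only a sketch: sketch_features_valid and SKETCH_BIT are hypothetical helper names, while the two bit numbers are the ones defined by the virtio specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the virtio specification. */
#define VIRTIO_F_VERSION_1       32
#define VIRTIO_F_ACCESS_PLATFORM 33

#define SKETCH_BIT(n) (1ULL << (n))

/* Both bits live above bit 31, so a 64-bit feature word is needed. */
static_assert(VIRTIO_F_ACCESS_PLATFORM < 64,
	      "feature bits must fit in a 64-bit word");

/*
 * Hypothetical helper: returns true when the feature set respects the
 * rule that ACCESS_PLATFORM may only be offered together with VERSION_1.
 */
static bool sketch_features_valid(uint64_t features)
{
	bool version_1 = (features & SKETCH_BIT(VIRTIO_F_VERSION_1)) != 0;
	bool access_platform =
		(features & SKETCH_BIT(VIRTIO_F_ACCESS_PLATFORM)) != 0;

	return !access_platform || version_1;
}
```

With this check, the combination proposed in 2) above (ACCESS_PLATFORM without VERSION_1) is reported as invalid.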

> 3) extend the management to advertise min_queue_size
> min_queue_size is the same as with max_queue_size for ENI.
>
> Another choice for 3):
> extend the management to advertise the flag F_VERSION_1 just like
> this patch

This will bring a lot of trouble. Note that the legacy/transitional
device doesn't work in several layers (both the vdpa kernel code and qemu).

If we can afford mandating mergeable rx buffers in the driver, that is
the simplest way.

I guess you have a plan for the next-generation ENI, which
should support VERSION_1 and RING_PACKED.

Thanks

> > > >
> > > >
> > > > > > I think there are
> > > > > > several things to do when building the interface
> > > > > > - support transitional devices, that is allow userspace
> > > > > > to tell device it's in legacy mode
> > > > > > - support reporting/setting supporting endian-ness
> > > > > >
> > > > > > > ---
> > > > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > @@ -12,6 +12,7 @@
> > > > > > > #include <linux/slab.h>
> > > > > > > #include <linux/vdpa.h>
> > > > > > > #include <uapi/linux/vdpa.h>
> > > > > > > +#include <uapi/linux/virtio_config.h>
> > > > > > > #include <net/genetlink.h>
> > > > > > > #include <linux/mod_devicetable.h>
> > > > > > >
> > > > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > u16 max_vq_size;
> > > > > > > u32 device_id;
> > > > > > > u32 vendor_id;
> > > > > > > + u64 features;
> > > > > > > void *hdr;
> > > > > > > int err;
> > > > > > >
> > > > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > > > + features = vdev->config->get_features(vdev);
> > > > > > >
> > > > > > > err = -EMSGSIZE;
> > > > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > goto msg_err;
> > > > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > > > goto msg_err;
> > > > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > > > + goto msg_err;
> > > > > > >
> > > > > > > genlmsg_end(msg, hdr);
> > > > > > > return 0;
> > > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > > @@ -7,6 +7,7 @@
> > > > > > > *
> > > > > > > */
> > > > > > >
> > > > > > > +#include "linux/virtio_config.h"
> > > > > > > #include <linux/init.h>
> > > > > > > #include <linux/module.h>
> > > > > > > #include <linux/device.h>
> > > > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > > > struct vdpa_vq_state state = {0};
> > > > > > > unsigned long flags;
> > > > > > > + bool may_reduce_num = false;
> > > > > > > u32 align, num;
> > > > > > > int err;
> > > > > > >
> > > > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > goto error_new_virtqueue;
> > > > > > > }
> > > > > > >
> > > > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > > > + may_reduce_num = true;
> > > > > > > +
> > > > > > > /* Create the vring */
> > > > > > > align = ops->get_vq_align(vdpa);
> > > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > > - true, true, ctx,
> > > > > > > + true, may_reduce_num, ctx,
> > > > > > > virtio_vdpa_notify, callback, name);
> > > > > > > if (!vq) {
> > > > > > > err = -ENOMEM;
> > > > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > > > --- a/include/uapi/linux/vdpa.h
> > > > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > > > >
> > > > > > > /* new attributes must be added above here */
> > > > > > > VDPA_ATTR_MAX,
> > > > > > > --
> > > > > > > 2.31.1
> > > > > >
> > > >
>

2021-09-17 11:52:16

by Wu Zongyong

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Thu, Sep 16, 2021 at 09:05:58AM +0800, Jason Wang wrote:
> On Wed, Sep 15, 2021 at 7:09 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Wed, Sep 15, 2021 at 04:06:57PM +0800, Jason Wang wrote:
> > > On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
> > > >
> > > > On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > > > > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > > > doesn't support to change virtqueue size.
> > > > > > >
> > > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > >
> > > > > > So if we are bothering with legacy,
> > > > >
> > > > > I think we'd better not. I guess the following may work:
> > > > >
> > > > > 1) disable the driver on BE host
> > > > > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > > > > 3) extend the management to advertise max_queue_size and
> > > > > min_queue_size, for ENI they are the same so management layer knows it
> > > > > needs to set the queue_size correctly during launching qemu
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Thanks
> > > >
> > > > There are other subtle differences such as header size without
> > > > mergeable buffers for net.
> > >
> > > This can be solved by mandating the feature of a mergeable buffer?
> > >
> > > Thanks
> >
> > PXE and some dpdk versions are only some of the guests that
> > disable mergeable buffers feature.
>
> True, but consider
>
> 1) the legacy stuffs requires changes in several software layers
> 2) it is how virtio 1.0 works e.g device can fail the feature negotiation
> 3) it is not supported since day 0
> 4) management API can be extended to advertise the mandated features

So let me confirm what I should do in the next revision:
1) disable the driver on BE hosts like this:

#ifdef __LITTLE_ENDIAN
int eni_vdpa_probe()
{
...
}
#else
int eni_vdpa_probe()
{
return -ENODEV;
}
#endif

2) report F_VERSION_1 and F_ACCESS_PLATFORM in get_features()
3) introduce a new cb get_vq_num_min in vdpa_config_ops
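
To make 2) and 3) concrete, here is a rough userspace model of what I have in mind. It is only a sketch: the names eni_sketch_get_features() and eni_sketch_get_vq_num_min() are made up for illustration, and the hardware values are passed in as plain parameters instead of being read from the legacy PCI registers. The bit numbers are the ones from the virtio specification.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_F_VERSION_1        32
#define VIRTIO_F_ACCESS_PLATFORM  33

#define BIT_ULL(n) (1ULL << (n))

/*
 * Hypothetical helper: a legacy device only exposes 32 feature bits,
 * so the driver adds the two high bits itself before reporting them.
 */
static uint64_t eni_sketch_get_features(uint32_t hw_features)
{
	uint64_t features = hw_features;

	features |= BIT_ULL(VIRTIO_F_VERSION_1);
	features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);

	return features;
}

/*
 * Hypothetical helper: the queue size is fixed on a legacy device,
 * so the minimum is simply the size the device reports.
 */
static uint16_t eni_sketch_get_vq_num_min(uint16_t hw_queue_size)
{
	return hw_queue_size;
}
```

With min and max reporting the same value, the management layer can tell that the queue size is not negotiable.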

Did I miss something?

> It looks affordable.
>
> Thanks
>
> >
> > > >
> > > >
> > > > > > I think there are
> > > > > > several things to do when building the interface
> > > > > > - support transitional devices, that is allow userspace
> > > > > > to tell device it's in legacy mode
> > > > > > - support reporting/setting supporting endian-ness
> > > > > >
> > > > > > > ---
> > > > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > @@ -12,6 +12,7 @@
> > > > > > > #include <linux/slab.h>
> > > > > > > #include <linux/vdpa.h>
> > > > > > > #include <uapi/linux/vdpa.h>
> > > > > > > +#include <uapi/linux/virtio_config.h>
> > > > > > > #include <net/genetlink.h>
> > > > > > > #include <linux/mod_devicetable.h>
> > > > > > >
> > > > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > u16 max_vq_size;
> > > > > > > u32 device_id;
> > > > > > > u32 vendor_id;
> > > > > > > + u64 features;
> > > > > > > void *hdr;
> > > > > > > int err;
> > > > > > >
> > > > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > > > + features = vdev->config->get_features(vdev);
> > > > > > >
> > > > > > > err = -EMSGSIZE;
> > > > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > goto msg_err;
> > > > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > > > goto msg_err;
> > > > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > > > + goto msg_err;
> > > > > > >
> > > > > > > genlmsg_end(msg, hdr);
> > > > > > > return 0;
> > > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > > @@ -7,6 +7,7 @@
> > > > > > > *
> > > > > > > */
> > > > > > >
> > > > > > > +#include "linux/virtio_config.h"
> > > > > > > #include <linux/init.h>
> > > > > > > #include <linux/module.h>
> > > > > > > #include <linux/device.h>
> > > > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > > > struct vdpa_vq_state state = {0};
> > > > > > > unsigned long flags;
> > > > > > > + bool may_reduce_num = false;
> > > > > > > u32 align, num;
> > > > > > > int err;
> > > > > > >
> > > > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > goto error_new_virtqueue;
> > > > > > > }
> > > > > > >
> > > > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > > > + may_reduce_num = true;
> > > > > > > +
> > > > > > > /* Create the vring */
> > > > > > > align = ops->get_vq_align(vdpa);
> > > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > > - true, true, ctx,
> > > > > > > + true, may_reduce_num, ctx,
> > > > > > > virtio_vdpa_notify, callback, name);
> > > > > > > if (!vq) {
> > > > > > > err = -ENOMEM;
> > > > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > > > --- a/include/uapi/linux/vdpa.h
> > > > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > > > >
> > > > > > > /* new attributes must be added above here */
> > > > > > > VDPA_ATTR_MAX,
> > > > > > > --
> > > > > > > 2.31.1
> > > > > >
> > > >
> >

2021-09-17 13:05:22

by Jason Wang

Subject: Re: [PATCH v2 4/5] vdpa: add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1

On Fri, Sep 17, 2021 at 10:35 AM Wu Zongyong
<[email protected]> wrote:
>
> On Thu, Sep 16, 2021 at 09:05:58AM +0800, Jason Wang wrote:
> > On Wed, Sep 15, 2021 at 7:09 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Wed, Sep 15, 2021 at 04:06:57PM +0800, Jason Wang wrote:
> > > > On Wed, Sep 15, 2021 at 3:38 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > >
> > > > > On Wed, Sep 15, 2021 at 11:18:06AM +0800, Jason Wang wrote:
> > > > > > On Tue, Sep 14, 2021 at 8:58 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, Sep 14, 2021 at 08:24:51PM +0800, Wu Zongyong wrote:
> > > > > > > > This new attribute advertises whether the vdpa device is legacy or not.
> > > > > > > > Users can pick right virtqueue size if the vdpa device is legacy which
> > > > > > > > doesn't support to change virtqueue size.
> > > > > > > >
> > > > > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > > >
> > > > > > > So if we are bothering with legacy,
> > > > > >
> > > > > > I think we'd better not. I guess the following may work:
> > > > > >
> > > > > > 1) disable the driver on BE host
> > > > > > 2) present VERSION_1 with ACCESS_PLATFORM in get_features()
> > > > > > 3) extend the management to advertise max_queue_size and
> > > > > > min_queue_size, for ENI they are the same so management layer knows it
> > > > > > needs to set the queue_size correctly during launching qemu
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > There are other subtle differences such as header size without
> > > > > mergeable buffers for net.
> > > >
> > > > This can be solved by mandating the feature of a mergeable buffer?
> > > >
> > > > Thanks
> > >
> > > PXE and some dpdk versions are only some of the guests that
> > > disable mergeable buffers feature.
> >
> > True, but consider
> >
> > 1) the legacy stuffs requires changes in several software layers
> > 2) it is how virtio 1.0 works e.g device can fail the feature negotiation
> > 3) it is not supported since day 0
> > 4) management API can be extended to advertise the mandated features
>
> So let me confirm what I should do in next revision:
> 1) disable the driver on BE host like that:
>
> #ifdef __LITTLE_ENDIAN
> int eni_vdpa_probe()
> {
> ...
> }
> #else
> int eni_vdpa_probe()
> {
> return -ENODEV;
> }
> #endif

This might work but I wonder if we can disable it via Kconfig.

>
> 2) report F_VERSION_1 and F_ACCESS_PLATFORM in get_features()
> 3) introduce a new cb get_vq_num_min in vdpa_config_ops
>
> Does I miss something?

And we need this as well.

Fail the feature negotiation if mrg rxbuf is not negotiated. Otherwise
we can't meet the 1.0 requirement of header length. Or can the hardware
still present the mergeable header if the mrg rx buffer is not
negotiated?
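
Failing the negotiation could be as small as one check in the set_features hook. The following is a minimal userspace model of that idea, not driver code: eni_sketch_set_features() is a made-up name, and letting an empty feature set through is an assumption of the sketch. Bit 15 is VIRTIO_NET_F_MRG_RXBUF in the virtio-net specification.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit number from the virtio-net specification. */
#define VIRTIO_NET_F_MRG_RXBUF 15

#define BIT_ULL(n) (1ULL << (n))

/*
 * Hypothetical model of a set_features hook. A non-empty feature set
 * without MRG_RXBUF is refused, because a legacy device would then use
 * the 10-byte net header while a 1.0 driver expects the 12-byte one.
 * An empty set is accepted here so that clearing features still works;
 * that choice is an assumption of this sketch.
 */
static int eni_sketch_set_features(uint64_t features)
{
	if (features && !(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)))
		return -EINVAL;

	return 0;
}
```

A guest that disables mergeable buffers, such as the PXE case mentioned earlier, would then fail negotiation instead of silently running with the wrong header length.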

Thanks

>
> > It looks affordable.
> >
> > Thanks
> >
> > >
> > > > >
> > > > >
> > > > > > > I think there are
> > > > > > > several things to do when building the interface
> > > > > > > - support transitional devices, that is allow userspace
> > > > > > > to tell device it's in legacy mode
> > > > > > > - support reporting/setting supporting endian-ness
> > > > > > >
> > > > > > > > ---
> > > > > > > > drivers/vdpa/vdpa.c | 6 ++++++
> > > > > > > > drivers/virtio/virtio_vdpa.c | 7 ++++++-
> > > > > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > > > > 3 files changed, 13 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > > > > > > > index 1dc121a07a93..533d7f589eee 100644
> > > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > > @@ -12,6 +12,7 @@
> > > > > > > > #include <linux/slab.h>
> > > > > > > > #include <linux/vdpa.h>
> > > > > > > > #include <uapi/linux/vdpa.h>
> > > > > > > > +#include <uapi/linux/virtio_config.h>
> > > > > > > > #include <net/genetlink.h>
> > > > > > > > #include <linux/mod_devicetable.h>
> > > > > > > >
> > > > > > > > @@ -494,6 +495,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > > u16 max_vq_size;
> > > > > > > > u32 device_id;
> > > > > > > > u32 vendor_id;
> > > > > > > > + u64 features;
> > > > > > > > void *hdr;
> > > > > > > > int err;
> > > > > > > >
> > > > > > > > @@ -508,6 +510,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > > device_id = vdev->config->get_device_id(vdev);
> > > > > > > > vendor_id = vdev->config->get_vendor_id(vdev);
> > > > > > > > max_vq_size = vdev->config->get_vq_num_max(vdev);
> > > > > > > > + features = vdev->config->get_features(vdev);
> > > > > > > >
> > > > > > > > err = -EMSGSIZE;
> > > > > > > > if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> > > > > > > > @@ -520,6 +523,9 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> > > > > > > > goto msg_err;
> > > > > > > > if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> > > > > > > > goto msg_err;
> > > > > > > > + if (features & BIT_ULL(VIRTIO_F_VERSION_1) &&
> > > > > > > > + nla_put_flag(msg, VDPA_ATTR_DEV_VERSION_1))
> > > > > > > > + goto msg_err;
> > > > > > > >
> > > > > > > > genlmsg_end(msg, hdr);
> > > > > > > > return 0;
> > > > > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > > > > index 72eaef2caeb1..1cba957c4cdc 100644
> > > > > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > > > > @@ -7,6 +7,7 @@
> > > > > > > > *
> > > > > > > > */
> > > > > > > >
> > > > > > > > +#include "linux/virtio_config.h"
> > > > > > > > #include <linux/init.h>
> > > > > > > > #include <linux/module.h>
> > > > > > > > #include <linux/device.h>
> > > > > > > > @@ -145,6 +146,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > > > > struct vdpa_vq_state state = {0};
> > > > > > > > unsigned long flags;
> > > > > > > > + bool may_reduce_num = false;
> > > > > > > > u32 align, num;
> > > > > > > > int err;
> > > > > > > >
> > > > > > > > @@ -169,10 +171,13 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > > > > goto error_new_virtqueue;
> > > > > > > > }
> > > > > > > >
> > > > > > > > + if (ops->get_features(vdpa) & BIT_ULL(VIRTIO_F_VERSION_1))
> > > > > > > > + may_reduce_num = true;
> > > > > > > > +
> > > > > > > > /* Create the vring */
> > > > > > > > align = ops->get_vq_align(vdpa);
> > > > > > > > vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > > > > - true, true, ctx,
> > > > > > > > + true, may_reduce_num, ctx,
> > > > > > > > virtio_vdpa_notify, callback, name);
> > > > > > > > if (!vq) {
> > > > > > > > err = -ENOMEM;
> > > > > > > > diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> > > > > > > > index 66a41e4ec163..ce0b74276a5b 100644
> > > > > > > > --- a/include/uapi/linux/vdpa.h
> > > > > > > > +++ b/include/uapi/linux/vdpa.h
> > > > > > > > @@ -32,6 +32,7 @@ enum vdpa_attr {
> > > > > > > > VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> > > > > > > > VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> > > > > > > > VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> > > > > > > > + VDPA_ATTR_DEV_VERSION_1, /* flag */
> > > > > > > >
> > > > > > > > /* new attributes must be added above here */
> > > > > > > > VDPA_ATTR_MAX,
> > > > > > > > --
> > > > > > > > 2.31.1
> > > > > > >
> > > > >
> > >
>

2021-09-22 12:48:37

by Wu Zongyong

Subject: [PATCH v3 0/7] vDPA driver for Alibaba ENI

This series implements the vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built on the virtio-pci 0.9.5 specification.

Changes since V2:
- add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose the correct virtqueue
size, as suggested by Jason Wang
- present ACCESS_PLATFORM in the get_features callback, as suggested by
Jason Wang
- disable this driver on Big Endian hosts, as suggested by Jason Wang
- fix a typo

Changes since V1:
- add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
the vdpa device is legacy
- implement a dedicated driver for Alibaba ENI instead of a legacy
virtio-pci driver, as suggested by Jason Wang
- fix some bugs

Wu Zongyong (7):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vdpa: add new callback get_vq_num_min in vdpa_config_ops
virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
eni_vdpa: add vDPA driver for Alibaba ENI

drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 554 +++++++++++++++++++++++++
drivers/vdpa/vdpa.c | 5 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++---
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
drivers/virtio/virtio_vdpa.c | 25 +-
include/linux/vdpa.h | 6 +-
include/linux/virtio_pci_legacy.h | 44 ++
include/uapi/linux/vdpa.h | 1 +
16 files changed, 921 insertions(+), 89 deletions(-)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1

2021-09-22 12:48:58

by Wu Zongyong

Subject: [PATCH v3 5/7] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..8aa4ebe2a2a2 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
/* Assume split virtqueue, switch to packed if necessary */
struct vdpa_vq_state state = {0};
unsigned long flags;
- u32 align, num;
+ u32 align, max_num, min_num = 0;
+ bool may_reduce_num = true;
int err;

if (!name)
@@ -163,22 +164,36 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
if (!info)
return ERR_PTR(-ENOMEM);

- num = ops->get_vq_num_max(vdpa);
- if (num == 0) {
+ max_num = ops->get_vq_num_max(vdpa);
+ if (max_num == 0) {
err = -ENOENT;
goto error_new_virtqueue;
}

+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdpa);
+ if (min_num > max_num) {
+ err = -ENOENT;
+ goto error_new_virtqueue;
+ }
+
+ may_reduce_num = (max_num != min_num);
+
/* Create the vring */
align = ops->get_vq_align(vdpa);
- vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ vq = vring_create_virtqueue(index, max_num, align, vdev,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
goto error_new_virtqueue;
}

+ if (virtqueue_get_vring_size(vq) < min_num) {
+ err = -EINVAL;
+ goto err_vq;
+ }
+
/* Setup virtqueue callback */
cb.callback = virtio_vdpa_virtqueue_cb;
cb.private = info;
--
2.31.1
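
As a reading aid for the hunk above, the size handling boils down to two checks, one before and one after the vring is created. The following is a standalone userspace model, not the kernel code: vq_size_sketch_check() and vq_size_sketch_verify() are made-up names, and the sizes are passed in directly instead of coming from the get_vq_num_{max,min} callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the checks done before the vring is created.
 * Returns 0 and fills *may_reduce_num on success, or a negative errno.
 * min_num is 0 when the device has no get_vq_num_min callback.
 */
static int vq_size_sketch_check(uint32_t max_num, uint32_t min_num,
				bool *may_reduce_num)
{
	if (max_num == 0)
		return -ENOENT;
	if (min_num > max_num)
		return -ENOENT;

	/* A fixed-size queue (min == max) must not fall back to a smaller ring. */
	*may_reduce_num = (max_num != min_num);
	return 0;
}

/* Hypothetical model of the check done after the vring was created. */
static int vq_size_sketch_verify(uint32_t created_num, uint32_t min_num)
{
	return created_num < min_num ? -EINVAL : 0;
}
```

For a device like ENI, where both callbacks return the same value, may_reduce_num ends up false and the ring is created with exactly the size the device reports.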

2021-09-22 12:49:04

by Wu Zongyong

Subject: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built upon the virtio 0.9.5 specification.
This driver is not supported on BE hosts.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 554 ++++++++++++++++++++++++++++++++
4 files changed, 566 insertions(+)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 3d91982d8371..9587b9177b05 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -78,4 +78,12 @@ config VP_VDPA
help
This kernel module bridges virtio PCI device to vDPA bus.

+config ALIBABA_ENI_VDPA
+ tristate "vDPA driver for Alibaba ENI"
+ select VIRTIO_PCI_LEGACY_LIB
+ depends on PCI_MSI
+ help
+ VDPA driver for Alibaba ENI (Elastic Network Interface), which is built
+ upon the virtio 0.9.5 specification.
+
endif # VDPA
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index f02ebed33f19..15665563a7f4 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
obj-$(CONFIG_IFCVF) += ifcvf/
obj-$(CONFIG_MLX5_VDPA) += mlx5/
obj-$(CONFIG_VP_VDPA) += virtio_pci/
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
new file mode 100644
index 000000000000..ef4aae69f87a
--- /dev/null
+++ b/drivers/vdpa/alibaba/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
+
diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
new file mode 100644
index 000000000000..b6eef696cec5
--- /dev/null
+++ b/drivers/vdpa/alibaba/eni_vdpa.c
@@ -0,0 +1,554 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
+ *
+ * Copyright (c) 2021, Alibaba Inc. All rights reserved.
+ * Author: Wu Zongyong <[email protected]>
+ *
+ */
+
+#include <linux/bits.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/vdpa.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
+#include <uapi/linux/virtio_net.h>
+
+#define ENI_MSIX_NAME_SIZE 256
+
+#define ENI_ERR(pdev, fmt, ...) \
+ dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_DBG(pdev, fmt, ...) \
+ dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_INFO(pdev, fmt, ...) \
+ dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+
+struct eni_vring {
+ void __iomem *notify;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ struct vdpa_callback cb;
+ int irq;
+};
+
+struct eni_vdpa {
+ struct vdpa_device vdpa;
+ struct virtio_pci_legacy_device ldev;
+ struct eni_vring *vring;
+ struct vdpa_callback config_cb;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ int config_irq;
+ int queues;
+ int vectors;
+};
+
+static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
+{
+ return container_of(vdpa, struct eni_vdpa, vdpa);
+}
+
+static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ return &eni_vdpa->ldev;
+}
+
+static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u64 features = vp_legacy_get_features(ldev);
+
+ features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
+
+ return features;
+}
+
+static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ vp_legacy_set_features(ldev, (u32)features);
+
+ return 0;
+}
+
+static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_status(ldev);
+}
+
+static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ int irq = eni_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
+static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i;
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
+ &eni_vdpa->vring[i]);
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ }
+ }
+
+ if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+ }
+
+ if (eni_vdpa->vectors) {
+ pci_free_irq_vectors(pdev);
+ eni_vdpa->vectors = 0;
+ }
+}
+
+static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
+{
+ struct eni_vring *vring = arg;
+
+ if (vring->cb.callback)
+ return vring->cb.callback(vring->cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
+{
+ struct eni_vdpa *eni_vdpa = arg;
+
+ if (eni_vdpa->config_cb.callback)
+ return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i, ret, irq;
+ int queues = eni_vdpa->queues;
+ int vectors = queues + 1;
+
+ ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
+ if (ret != vectors) {
+ ENI_ERR(pdev,
+ "failed to allocate irq vectors want %d but %d\n",
+ vectors, ret);
+ return ret;
+ }
+
+ eni_vdpa->vectors = vectors;
+
+ for (i = 0; i < queues; i++) {
+ snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
+ "eni-vdpa[%s]-%d", pci_name(pdev), i);
+ irq = pci_irq_vector(pdev, i);
+ ret = devm_request_irq(&pdev->dev, irq,
+ eni_vdpa_vq_handler,
+ 0, eni_vdpa->vring[i].msix_name,
+ &eni_vdpa->vring[i]);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_queue_vector(ldev, i, i);
+ eni_vdpa->vring[i].irq = irq;
+ }
+
+ snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config",
+ pci_name(pdev));
+ irq = pci_irq_vector(pdev, queues);
+ ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
+ eni_vdpa->msix_name, eni_vdpa);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_config_vector(ldev, queues);
+ eni_vdpa->config_irq = irq;
+
+ return 0;
+err:
+ eni_vdpa_free_irq(eni_vdpa);
+ return ret;
+}
+
+static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
+ !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
+ eni_vdpa_request_irq(eni_vdpa);
+ }
+
+ vp_legacy_set_status(ldev, status);
+
+ if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+ (s & VIRTIO_CONFIG_S_DRIVER_OK))
+ eni_vdpa_free_irq(eni_vdpa);
+}
+
+static int eni_vdpa_reset(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ vp_legacy_set_status(ldev, 0);
+
+ if (s & VIRTIO_CONFIG_S_DRIVER_OK)
+ eni_vdpa_free_irq(eni_vdpa);
+
+ return 0;
+}
+
+static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_vq_state *state)
+{
+ return -EOPNOTSUPP;
+}
+
+static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
+ const struct vdpa_vq_state *state)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ const struct vdpa_vq_state_split *split = &state->split;
+
+ /* ENI is built upon the legacy virtio-pci specification, which does
+ * not support setting the state of a virtqueue. But if the state
+ * happens to equal the device's initial state, we can let it go.
+ */
+ if (!vp_legacy_get_queue_enable(ldev, qid)
+ && split->avail_index == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+
+static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->vring[qid].cb = *cb;
+}
+
+static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
+ bool ready)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ /* ENI is a legacy virtio-pci device. This is not supported
+ * by specification. But we can disable virtqueue by setting
+ * address to 0.
+ */
+ if (!ready)
+ vp_legacy_set_queue_address(ldev, qid, 0);
+}
+
+static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_enable(ldev, qid);
+}
+
+static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
+ u32 num)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ struct pci_dev *pdev = ldev->pci_dev;
+ u16 n = vp_legacy_get_queue_size(ldev, qid);
+
+ /* ENI is a legacy virtio-pci device which does not allow changing
+ * the virtqueue size. Just report an error if someone tries to
+ * change it.
+ */
+ if (num != n)
+ ENI_ERR(pdev,
+ "not support to set vq %u fixed num %u to %u\n",
+ qid, n, num);
+}
+
+static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
+ u64 desc_area, u64 driver_area,
+ u64 device_area)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
+
+ vp_legacy_set_queue_address(ldev, qid, pfn);
+
+ return 0;
+}
+
+static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ iowrite16(qid, eni_vdpa->vring[qid].notify);
+}
+
+static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.device;
+}
+
+static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.vendor;
+}
+
+static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
+{
+ return PAGE_SIZE;
+}
+
+static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
+{
+ return sizeof(struct virtio_net_config);
+}
+
+
+static void eni_vdpa_get_config(struct vdpa_device *vdpa,
+ unsigned int offset,
+ void *buf, unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ *p++ = ioread8(ioaddr + i);
+}
+
+static void eni_vdpa_set_config(struct vdpa_device *vdpa,
+ unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ const u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ iowrite8(*p++, ioaddr + i);
+}
+
+static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->config_cb = *cb;
+}
+
+static const struct vdpa_config_ops eni_vdpa_ops = {
+ .get_features = eni_vdpa_get_features,
+ .set_features = eni_vdpa_set_features,
+ .get_status = eni_vdpa_get_status,
+ .set_status = eni_vdpa_set_status,
+ .reset = eni_vdpa_reset,
+ .get_vq_num_max = eni_vdpa_get_vq_num_max,
+ .get_vq_num_min = eni_vdpa_get_vq_num_min,
+ .get_vq_state = eni_vdpa_get_vq_state,
+ .set_vq_state = eni_vdpa_set_vq_state,
+ .set_vq_cb = eni_vdpa_set_vq_cb,
+ .set_vq_ready = eni_vdpa_set_vq_ready,
+ .get_vq_ready = eni_vdpa_get_vq_ready,
+ .set_vq_num = eni_vdpa_set_vq_num,
+ .set_vq_address = eni_vdpa_set_vq_address,
+ .kick_vq = eni_vdpa_kick_vq,
+ .get_device_id = eni_vdpa_get_device_id,
+ .get_vendor_id = eni_vdpa_get_vendor_id,
+ .get_vq_align = eni_vdpa_get_vq_align,
+ .get_config_size = eni_vdpa_get_config_size,
+ .get_config = eni_vdpa_get_config,
+ .set_config = eni_vdpa_set_config,
+ .set_config_cb = eni_vdpa_set_config_cb,
+ .get_vq_irq = eni_vdpa_get_vq_irq,
+};
+
+
+static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u32 features = vp_legacy_get_features(ldev);
+ u16 num = 2;
+
+ if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
+ __virtio16 max_virtqueue_pairs;
+
+ eni_vdpa_get_config(&eni_vdpa->vdpa,
+ offsetof(struct virtio_net_config, max_virtqueue_pairs),
+ &max_virtqueue_pairs,
+ sizeof(max_virtqueue_pairs));
+ num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
+ max_virtqueue_pairs);
+ }
+
+ if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+ num += 1;
+
+ return num;
+}
+
+static void eni_vdpa_free_irq_vectors(void *data)
+{
+ pci_free_irq_vectors(data);
+}
+
+#ifdef __LITTLE_ENDIAN
+static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct device *dev = &pdev->dev;
+ struct eni_vdpa *eni_vdpa;
+ struct virtio_pci_legacy_device *ldev;
+ int ret, i;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
+ dev, &eni_vdpa_ops, NULL, false);
+ if (IS_ERR(eni_vdpa)) {
+ ENI_ERR(pdev, "failed to allocate vDPA structure\n");
+ return PTR_ERR(eni_vdpa);
+ }
+
+ ldev = &eni_vdpa->ldev;
+ ldev->pci_dev = pdev;
+
+ ret = vp_legacy_probe(ldev);
+ if (ret) {
+ ENI_ERR(pdev, "failed to probe legacy PCI device\n");
+ goto err;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, eni_vdpa);
+
+ eni_vdpa->vdpa.dma_dev = &pdev->dev;
+ eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
+
+ ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
+ if (ret) {
+ ENI_ERR(pdev,
+ "failed for adding devres for freeing irq vectors\n");
+ goto err;
+ }
+
+ eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
+ sizeof(*eni_vdpa->vring),
+ GFP_KERNEL);
+ if (!eni_vdpa->vring) {
+ ret = -ENOMEM;
+ ENI_ERR(pdev, "failed to allocate virtqueues\n");
+ goto err;
+ }
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ }
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+
+ ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
+ if (ret) {
+ ENI_ERR(pdev, "failed to register to vdpa bus\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ put_device(&eni_vdpa->vdpa.dev);
+ return ret;
+}
+#else
+static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ ENI_ERR(pdev, "this driver is not supported on BE hosts\n");
+ return -ENODEV;
+}
+#endif
+
+static void eni_vdpa_remove(struct pci_dev *pdev)
+{
+ struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
+
+ vdpa_unregister_device(&eni_vdpa->vdpa);
+ vp_legacy_remove(&eni_vdpa->ldev);
+}
+
+static struct pci_device_id eni_pci_ids[] = {
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_TRANS_ID_NET,
+ PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_ID_NET) },
+ { 0 },
+};
+
+static struct pci_driver eni_vdpa_driver = {
+ .name = "alibaba-eni-vdpa",
+ .id_table = eni_pci_ids,
+ .probe = eni_vdpa_probe,
+ .remove = eni_vdpa_remove,
+};
+
+module_pci_driver(eni_vdpa_driver);
+
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
+MODULE_LICENSE("GPL v2");
--
2.31.1

2021-09-22 12:50:26

by Wu Zongyong

Subject: [PATCH v3 3/7] vp_vdpa: add vq irq offloading support

This patch implements the get_vq_irq() callback for virtio pci devices
to allow irq offloading.

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
index 5bcd00246d2e..e3ff7875e123 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
return vp_modern_get_status(mdev);
}

+static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ int irq = vp_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
{
struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
@@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
.get_config = vp_vdpa_get_config,
.set_config = vp_vdpa_set_config,
.set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
};

static void vp_vdpa_free_irq_vectors(void *data)
--
2.31.1
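
As a rough illustration of the check this patch adds, the sentinel test in vp_vdpa_get_vq_irq() can be modelled outside the kernel. This is only a sketch under stated assumptions: demo_vring, demo_get_vq_irq and NO_VECTOR are hypothetical stand-ins, NO_VECTOR mirrors the 0xffff value of VIRTIO_MSI_NO_VECTOR, and the real callback takes a struct vdpa_device rather than a bare array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the value of VIRTIO_MSI_NO_VECTOR; the name is hypothetical. */
#define NO_VECTOR 0xffff

/* Hypothetical stand-in for the per-virtqueue state kept by the driver. */
struct demo_vring {
	int irq;
};

/* Return the IRQ number bound to virtqueue idx, or -EINVAL if none is bound. */
static int demo_get_vq_irq(const struct demo_vring *vring, uint16_t idx)
{
	int irq;

	assert(vring != NULL);
	irq = vring[idx].irq;
	if (irq == NO_VECTOR)
		return -EINVAL;

	return irq;
}
```

A caller that wants irq offloading would presumably treat a negative return as "no interrupt to offload for this queue" and fall back to the regular callback path.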

2021-09-22 12:50:49

by Wu Zongyong

Subject: [PATCH v3 2/7] vdpa: fix typo

Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 3972ab765de1..a896ee021e5f 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -257,7 +257,7 @@ struct vdpa_config_ops {
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
/* vq irq is not expected to be changed once DRIVER_OK is set */
- int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
+ int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
--
2.31.1

2021-09-22 12:50:59

by Wu Zongyong

Subject: [PATCH v3 4/7] vdpa: add new callback get_vq_num_min in vdpa_config_ops

This callback is optional. For vdpa devices that do not support changing
the virtqueue size, get_vq_num_min and get_vq_num_max return the same
value, so that users can choose a valid size for that device.

Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index a896ee021e5f..30864848950b 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -171,6 +171,9 @@ struct vdpa_map_file {
* @get_vq_num_max: Get the max size of virtqueue
* @vdev: vdpa device
* Returns u16: max size of virtqueue
+ * @get_vq_num_min: Get the min size of virtqueue (optional)
+ * @vdev: vdpa device
+ * Returns u16: min size of virtqueue
* @get_device_id: Get virtio device id
* @vdev: vdpa device
* Returns u32: virtio device id
@@ -266,6 +269,7 @@ struct vdpa_config_ops {
void (*set_config_cb)(struct vdpa_device *vdev,
struct vdpa_callback *cb);
u16 (*get_vq_num_max)(struct vdpa_device *vdev);
+ u16 (*get_vq_num_min)(struct vdpa_device *vdev);
u32 (*get_device_id)(struct vdpa_device *vdev);
u32 (*get_vendor_id)(struct vdpa_device *vdev);
u8 (*get_status)(struct vdpa_device *vdev);
--
2.31.1
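
One way a user of these two callbacks might pick a virtqueue size is to clamp its preferred size into the advertised range. The following is a minimal sketch, not code from this series: demo_ops, demo_pick_vq_num and demo_fixed_256 are hypothetical names, and the callbacks are simplified to take no arguments, whereas the real ones take a struct vdpa_device pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of vdpa_config_ops: only the two size callbacks. */
struct demo_ops {
	uint16_t (*get_vq_num_max)(void);
	uint16_t (*get_vq_num_min)(void);	/* optional, may be NULL */
};

/* Example device callback: a legacy-style device with a fixed size of 256. */
static uint16_t demo_fixed_256(void)
{
	return 256;
}

/* Clamp a requested virtqueue size into the range the device advertises. */
static uint16_t demo_pick_vq_num(const struct demo_ops *ops, uint16_t wanted)
{
	uint16_t max, min = 0;

	assert(ops != NULL && ops->get_vq_num_max != NULL);
	max = ops->get_vq_num_max();
	if (ops->get_vq_num_min)	/* a missing callback means no lower bound */
		min = ops->get_vq_num_min();

	if (wanted > max)
		return max;
	if (wanted < min)
		return min;
	return wanted;
}
```

When min and max are equal, as described above for devices with a fixed virtqueue size, every request collapses to that single value.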

2021-09-22 12:51:47

by Wu Zongyong

Subject: [PATCH v3 6/7] vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE

This attribute advertises the minimum virtqueue size. The value
defaults to 0 when the device does not implement get_vq_num_min.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/vdpa.c | 5 +++++
include/uapi/linux/vdpa.h | 1 +
2 files changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1dc121a07a93..6ed79fba33e4 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -492,6 +492,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
int flags, struct netlink_ext_ack *extack)
{
u16 max_vq_size;
+ u16 min_vq_size = 0;
u32 device_id;
u32 vendor_id;
void *hdr;
@@ -508,6 +509,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
device_id = vdev->config->get_device_id(vdev);
vendor_id = vdev->config->get_vendor_id(vdev);
max_vq_size = vdev->config->get_vq_num_max(vdev);
+ if (vdev->config->get_vq_num_min)
+ min_vq_size = vdev->config->get_vq_num_min(vdev);

err = -EMSGSIZE;
if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
@@ -520,6 +523,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
goto msg_err;
if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
goto msg_err;
+ if (nla_put_u16(msg, VDPA_ATTR_DEV_MIN_VQ_SIZE, min_vq_size))
+ goto msg_err;

genlmsg_end(msg, hdr);
return 0;
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 66a41e4ec163..e3b87879514c 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -32,6 +32,7 @@ enum vdpa_attr {
VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
VDPA_ATTR_DEV_MAX_VQS, /* u32 */
VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
+ VDPA_ATTR_DEV_MIN_VQ_SIZE, /* u16 */

/* new attributes must be added above here */
VDPA_ATTR_MAX,
--
2.31.1
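
The "new attributes must be added above here" comment in the uapi enum exists so that already-published attribute numbers never change. A small stand-alone sketch of that property follows; the enums below are hypothetical and deliberately shortened, and their numeric values are not the real vdpa attribute values.

```c
#include <assert.h>

/* Hypothetical attribute list before the change (not the real vdpa values). */
enum demo_attr_old {
	DEMO_OLD_ATTR_UNSPEC,
	DEMO_OLD_ATTR_MAX_VQS,
	DEMO_OLD_ATTR_MAX_VQ_SIZE,
	DEMO_OLD_ATTR_MAX,		/* sentinel, never sent on the wire */
};

/* Same list after appending one attribute just above the sentinel. */
enum demo_attr_new {
	DEMO_NEW_ATTR_UNSPEC,
	DEMO_NEW_ATTR_MAX_VQS,
	DEMO_NEW_ATTR_MAX_VQ_SIZE,
	DEMO_NEW_ATTR_MIN_VQ_SIZE,	/* newly appended */
	DEMO_NEW_ATTR_MAX,
};

/* Compile-time check: an existing attribute keeps its number. */
static_assert((int)DEMO_OLD_ATTR_MAX_VQ_SIZE == (int)DEMO_NEW_ATTR_MAX_VQ_SIZE,
	      "existing attribute values must not move");

/* Return 1 if every pre-existing attribute kept its numeric value. */
static int demo_attr_values_stable(void)
{
	return (int)DEMO_OLD_ATTR_MAX_VQS == (int)DEMO_NEW_ATTR_MAX_VQS &&
	       (int)DEMO_OLD_ATTR_MAX_VQ_SIZE == (int)DEMO_NEW_ATTR_MAX_VQ_SIZE &&
	       (int)DEMO_NEW_ATTR_MIN_VQ_SIZE == (int)DEMO_OLD_ATTR_MAX;
}
```

Only the sentinel moves: the new attribute takes the number the sentinel used to have, which is why userspace built against the older header keeps working.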

2021-09-26 02:26:24

by Jason Wang

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Wed, Sep 22, 2021 at 8:47 PM Wu Zongyong
<[email protected]> wrote:
>
> This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
> Interface), which is built upon the virtio 0.9.5 specification.
> This driver does not support running on a BE host.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vdpa/Kconfig | 8 +
> drivers/vdpa/Makefile | 1 +
> drivers/vdpa/alibaba/Makefile | 3 +
> drivers/vdpa/alibaba/eni_vdpa.c | 554 ++++++++++++++++++++++++++++++++
> 4 files changed, 566 insertions(+)
> create mode 100644 drivers/vdpa/alibaba/Makefile
> create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
>
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 3d91982d8371..9587b9177b05 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -78,4 +78,12 @@ config VP_VDPA
> help
> This kernel module bridges virtio PCI device to vDPA bus.
>
> +config ALIBABA_ENI_VDPA
> + tristate "vDPA driver for Alibaba ENI"
> + select VIRTIO_PCI_LEGACY_LIB
> + depends on PCI_MSI
> + help
> + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon
> + virtio 0.9.5 specification.
> +
> endif # VDPA
> diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> index f02ebed33f19..15665563a7f4 100644
> --- a/drivers/vdpa/Makefile
> +++ b/drivers/vdpa/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> obj-$(CONFIG_IFCVF) += ifcvf/
> obj-$(CONFIG_MLX5_VDPA) += mlx5/
> obj-$(CONFIG_VP_VDPA) += virtio_pci/
> +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
> new file mode 100644
> index 000000000000..ef4aae69f87a
> --- /dev/null
> +++ b/drivers/vdpa/alibaba/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> +
> diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
> new file mode 100644
> index 000000000000..b6eef696cec5
> --- /dev/null
> +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> @@ -0,0 +1,554 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
> + *
> + * Copyright (c) 2021, Alibaba Inc. All rights reserved.
> + * Author: Wu Zongyong <[email protected]>
> + *
> + */
> +
> +#include "linux/bits.h"
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/vdpa.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_ring.h>
> +#include <linux/virtio_pci.h>
> +#include <linux/virtio_pci_legacy.h>
> +#include <uapi/linux/virtio_net.h>
> +
> +#define ENI_MSIX_NAME_SIZE 256
> +
> +#define ENI_ERR(pdev, fmt, ...) \
> + dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +#define ENI_DBG(pdev, fmt, ...) \
> + dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +#define ENI_INFO(pdev, fmt, ...) \
> + dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +
> +struct eni_vring {
> + void __iomem *notify;
> + char msix_name[ENI_MSIX_NAME_SIZE];
> + struct vdpa_callback cb;
> + int irq;
> +};
> +
> +struct eni_vdpa {
> + struct vdpa_device vdpa;
> + struct virtio_pci_legacy_device ldev;
> + struct eni_vring *vring;
> + struct vdpa_callback config_cb;
> + char msix_name[ENI_MSIX_NAME_SIZE];
> + int config_irq;
> + int queues;
> + int vectors;
> +};
> +
> +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
> +{
> + return container_of(vdpa, struct eni_vdpa, vdpa);
> +}
> +
> +static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + return &eni_vdpa->ldev;
> +}
> +
> +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u64 features = vp_legacy_get_features(ldev);
> +
> + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> +
> + return features;

I wonder if the following can work with ENI:

-device virtio-net-pci,mrg_rxbuf=off

?

Thanks

> +}
> +
> +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + vp_legacy_set_features(ldev, (u32)features);
> +
> + return 0;
> +}
> +
> +static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_status(ldev);
> +}
> +
> +static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + int irq = eni_vdpa->vring[idx].irq;
> +
> + if (irq == VIRTIO_MSI_NO_VECTOR)
> + return -EINVAL;
> +
> + return irq;
> +}
> +
> +static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + struct pci_dev *pdev = ldev->pci_dev;
> + int i;
> +
> + for (i = 0; i < eni_vdpa->queues; i++) {
> + if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
> + vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
> + devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
> + &eni_vdpa->vring[i]);
> + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> + }
> + }
> +
> + if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
> + vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
> + devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
> + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> + }
> +
> + if (eni_vdpa->vectors) {
> + pci_free_irq_vectors(pdev);
> + eni_vdpa->vectors = 0;
> + }
> +}
> +
> +static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
> +{
> + struct eni_vring *vring = arg;
> +
> + if (vring->cb.callback)
> + return vring->cb.callback(vring->cb.private);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
> +{
> + struct eni_vdpa *eni_vdpa = arg;
> +
> + if (eni_vdpa->config_cb.callback)
> + return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + struct pci_dev *pdev = ldev->pci_dev;
> + int i, ret, irq;
> + int queues = eni_vdpa->queues;
> + int vectors = queues + 1;
> +
> + ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
> + if (ret != vectors) {
> + ENI_ERR(pdev,
> + "failed to allocate irq vectors want %d but %d\n",
> + vectors, ret);
> + return ret;
> + }
> +
> + eni_vdpa->vectors = vectors;
> +
> + for (i = 0; i < queues; i++) {
> + snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
> + "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
> + irq = pci_irq_vector(pdev, i);
> + ret = devm_request_irq(&pdev->dev, irq,
> + eni_vdpa_vq_handler,
> + 0, eni_vdpa->vring[i].msix_name,
> + &eni_vdpa->vring[i]);
> + if (ret) {
> + ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
> + goto err;
> + }
> + vp_legacy_queue_vector(ldev, i, i);
> + eni_vdpa->vring[i].irq = irq;
> + }
> +
> + snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
> + pci_name(pdev));
> + irq = pci_irq_vector(pdev, queues);
> + ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
> + eni_vdpa->msix_name, eni_vdpa);
> + if (ret) {
> + ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
> + goto err;
> + }
> + vp_legacy_config_vector(ldev, queues);
> + eni_vdpa->config_irq = irq;
> +
> + return 0;
> +err:
> + eni_vdpa_free_irq(eni_vdpa);
> + return ret;
> +}
> +
> +static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u8 s = eni_vdpa_get_status(vdpa);
> +
> + if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
> + !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
> + eni_vdpa_request_irq(eni_vdpa);
> + }
> +
> + vp_legacy_set_status(ldev, status);
> +
> + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> + (s & VIRTIO_CONFIG_S_DRIVER_OK))
> + eni_vdpa_free_irq(eni_vdpa);
> +}
> +
> +static int eni_vdpa_reset(struct vdpa_device *vdpa)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u8 s = eni_vdpa_get_status(vdpa);
> +
> + vp_legacy_set_status(ldev, 0);
> +
> + if (s & VIRTIO_CONFIG_S_DRIVER_OK)
> + eni_vdpa_free_irq(eni_vdpa);
> +
> + return 0;
> +}
> +
> +static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_size(ldev, 0);
> +}
> +
> +static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_size(ldev, 0);
> +}
> +
> +static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
> + struct vdpa_vq_state *state)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
> + const struct vdpa_vq_state *state)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + const struct vdpa_vq_state_split *split = &state->split;
> +
> + /* ENI is build upon virtio-pci specfication which not support
> + * to set state of virtqueue. But if the state is equal to the
> + * device initial state by chance, we can let it go.
> + */
> + if (!vp_legacy_get_queue_enable(ldev, qid)
> + && split->avail_index == 0)
> + return 0;
> +
> + return -EOPNOTSUPP;
> +}
> +
> +
> +static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
> + struct vdpa_callback *cb)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + eni_vdpa->vring[qid].cb = *cb;
> +}
> +
> +static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
> + bool ready)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + /* ENI is a legacy virtio-pci device. This is not supported
> + * by specification. But we can disable virtqueue by setting
> + * address to 0.
> + */
> + if (!ready)
> + vp_legacy_set_queue_address(ldev, qid, 0);
> +}
> +
> +static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_enable(ldev, qid);
> +}
> +
> +static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
> + u32 num)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + struct pci_dev *pdev = ldev->pci_dev;
> + u16 n = vp_legacy_get_queue_size(ldev, qid);
> +
> + /* ENI is a legacy virtio-pci device which not allow to change
> + * virtqueue size. Just report a error if someone tries to
> + * change it.
> + */
> + if (num != n)
> + ENI_ERR(pdev,
> + "not support to set vq %u fixed num %u to %u\n",
> + qid, n, num);
> +}
> +
> +static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
> + u64 desc_area, u64 driver_area,
> + u64 device_area)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> +
> + vp_legacy_set_queue_address(ldev, qid, pfn);
> +
> + return 0;
> +}
> +
> +static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + iowrite16(qid, eni_vdpa->vring[qid].notify);
> +}
> +
> +static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return ldev->id.device;
> +}
> +
> +static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return ldev->id.vendor;
> +}
> +
> +static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
> +{
> + return PAGE_SIZE;
> +}
> +
> +static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
> +{
> + return sizeof(struct virtio_net_config);
> +}
> +
> +
> +static void eni_vdpa_get_config(struct vdpa_device *vdpa,
> + unsigned int offset,
> + void *buf, unsigned int len)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + void __iomem *ioaddr = ldev->ioaddr +
> + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> + offset;
> + u8 *p = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + *p++ = ioread8(ioaddr + i);
> +}
> +
> +static void eni_vdpa_set_config(struct vdpa_device *vdpa,
> + unsigned int offset, const void *buf,
> + unsigned int len)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + void __iomem *ioaddr = ldev->ioaddr +
> + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> + offset;
> + const u8 *p = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + iowrite8(*p++, ioaddr + i);
> +}
> +
> +static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
> + struct vdpa_callback *cb)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + eni_vdpa->config_cb = *cb;
> +}
> +
> +static const struct vdpa_config_ops eni_vdpa_ops = {
> + .get_features = eni_vdpa_get_features,
> + .set_features = eni_vdpa_set_features,
> + .get_status = eni_vdpa_get_status,
> + .set_status = eni_vdpa_set_status,
> + .reset = eni_vdpa_reset,
> + .get_vq_num_max = eni_vdpa_get_vq_num_max,
> + .get_vq_num_min = eni_vdpa_get_vq_num_min,
> + .get_vq_state = eni_vdpa_get_vq_state,
> + .set_vq_state = eni_vdpa_set_vq_state,
> + .set_vq_cb = eni_vdpa_set_vq_cb,
> + .set_vq_ready = eni_vdpa_set_vq_ready,
> + .get_vq_ready = eni_vdpa_get_vq_ready,
> + .set_vq_num = eni_vdpa_set_vq_num,
> + .set_vq_address = eni_vdpa_set_vq_address,
> + .kick_vq = eni_vdpa_kick_vq,
> + .get_device_id = eni_vdpa_get_device_id,
> + .get_vendor_id = eni_vdpa_get_vendor_id,
> + .get_vq_align = eni_vdpa_get_vq_align,
> + .get_config_size = eni_vdpa_get_config_size,
> + .get_config = eni_vdpa_get_config,
> + .set_config = eni_vdpa_set_config,
> + .set_config_cb = eni_vdpa_set_config_cb,
> + .get_vq_irq = eni_vdpa_get_vq_irq,
> +};
> +
> +
> +static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u32 features = vp_legacy_get_features(ldev);
> + u16 num = 2;
> +
> + if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
> + __virtio16 max_virtqueue_pairs;
> +
> + eni_vdpa_get_config(&eni_vdpa->vdpa,
> + offsetof(struct virtio_net_config, max_virtqueue_pairs),
> + &max_virtqueue_pairs,
> + sizeof(max_virtqueue_pairs));
> + num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
> + max_virtqueue_pairs);
> + }
> +
> + if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
> + num += 1;
> +
> + return num;
> +}
> +
> +static void eni_vdpa_free_irq_vectors(void *data)
> +{
> + pci_free_irq_vectors(data);
> +}
> +
> +#ifdef __LITTLE_ENDIAN
> +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + struct device *dev = &pdev->dev;
> + struct eni_vdpa *eni_vdpa;
> + struct virtio_pci_legacy_device *ldev;
> + int ret, i;
> +
> + ret = pcim_enable_device(pdev);
> + if (ret)
> + return ret;
> +
> + eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
> + dev, &eni_vdpa_ops, NULL, false);
> + if (IS_ERR(eni_vdpa)) {
> + ENI_ERR(pdev, "failed to allocate vDPA structure\n");
> + return PTR_ERR(eni_vdpa);
> + }
> +
> + ldev = &eni_vdpa->ldev;
> + ldev->pci_dev = pdev;
> +
> + ret = vp_legacy_probe(ldev);
> + if (ret) {
> + ENI_ERR(pdev, "failed to probe legacy PCI device\n");
> + goto err;
> + }
> +
> + pci_set_master(pdev);
> + pci_set_drvdata(pdev, eni_vdpa);
> +
> + eni_vdpa->vdpa.dma_dev = &pdev->dev;
> + eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
> +
> + ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
> + if (ret) {
> + ENI_ERR(pdev,
> + "failed for adding devres for freeing irq vectors\n");
> + goto err;
> + }
> +
> + eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
> + sizeof(*eni_vdpa->vring),
> + GFP_KERNEL);
> + if (!eni_vdpa->vring) {
> + ret = -ENOMEM;
> + ENI_ERR(pdev, "failed to allocate virtqueues\n");
> + goto err;
> + }
> +
> + for (i = 0; i < eni_vdpa->queues; i++) {
> + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> + eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> + }
> + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> +
> + ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
> + if (ret) {
> + ENI_ERR(pdev, "failed to register to vdpa bus\n");
> + goto err;
> + }
> +
> + return 0;
> +
> +err:
> + put_device(&eni_vdpa->vdpa.dev);
> + return ret;
> +}
> +#else
> +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + ENI_ERR(pdev, "this driver not supported on BE host\n");
> + return -ENODEV;
> +}
> +#endif
> +
> +static void eni_vdpa_remove(struct pci_dev *pdev)
> +{
> + struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
> +
> + vdpa_unregister_device(&eni_vdpa->vdpa);
> + vp_legacy_remove(&eni_vdpa->ldev);
> +}
> +
> +static struct pci_device_id eni_pci_ids[] = {
> + { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
> + VIRTIO_TRANS_ID_NET,
> + PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
> + VIRTIO_ID_NET) },
> + { 0 },
> +};
> +
> +static struct pci_driver eni_vdpa_driver = {
> + .name = "alibaba-eni-vdpa",
> + .id_table = eni_pci_ids,
> + .probe = eni_vdpa_probe,
> + .remove = eni_vdpa_remove,
> +};
> +
> +module_pci_driver(eni_vdpa_driver);
> +
> +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> +MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
> +MODULE_LICENSE("GPL v2");
> --
> 2.31.1
>
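
For readers following the quoted driver, the queue-count rule in eni_vdpa_get_num_queues() can be sketched on its own. This is an illustration only: demo_num_queues and the DEMO_ macros are hypothetical names, the bit positions are assumed to match VIRTIO_NET_F_CTRL_VQ (17) and VIRTIO_NET_F_MQ (22), and max_vq_pairs is passed in directly instead of being read from the device config space.

```c
#include <assert.h>	/* only needed for sanity checks around this helper */
#include <stdint.h>

#define DEMO_NET_F_CTRL_VQ 17	/* assumed same bit as VIRTIO_NET_F_CTRL_VQ */
#define DEMO_NET_F_MQ      22	/* assumed same bit as VIRTIO_NET_F_MQ */

/*
 * Derive the number of virtqueues from the offered feature bits:
 * one rx/tx pair by default, max_vq_pairs pairs with MQ, plus one
 * control queue when CTRL_VQ is offered.
 */
static uint16_t demo_num_queues(uint32_t features, uint16_t max_vq_pairs)
{
	uint16_t num = 2;

	if (features & (UINT32_C(1) << DEMO_NET_F_MQ))
		num = (uint16_t)(2 * max_vq_pairs);

	if (features & (UINT32_C(1) << DEMO_NET_F_CTRL_VQ))
		num += 1;

	return num;
}
```

With neither feature offered the helper yields 2; with MQ, four queue pairs and a control queue it yields 9.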

2021-09-26 02:28:35

by Jason Wang

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI


On 2021/9/22 8:46 PM, Wu Zongyong wrote:
> +
> +#ifdef __LITTLE_ENDIAN


I think disabling the device via Kconfig is better than letting users
hit an error like this.

(Or, if the device always uses little endian, we don't even need to
bother with this.)

Thanks


> +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + struct device *dev = &pdev->dev;
> + struct eni_vdpa *eni_vdpa;
> + struct virtio_pci_legacy_device *ldev;
> + int ret, i;
> +
> + ret = pcim_enable_device(pdev);
> + if (ret)
> + return ret;
> +

2021-09-26 03:28:16

by Wu Zongyong

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Sun, Sep 26, 2021 at 10:24:21AM +0800, Jason Wang wrote:
> On Wed, Sep 22, 2021 at 8:47 PM Wu Zongyong
> <[email protected]> wrote:
> >
> > This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
> > Interface), which is built upon the virtio 0.9.5 specification.
> > This driver does not support running on a BE host.
> >
> > Signed-off-by: Wu Zongyong <[email protected]>
> > ---
> > drivers/vdpa/Kconfig | 8 +
> > drivers/vdpa/Makefile | 1 +
> > drivers/vdpa/alibaba/Makefile | 3 +
> > drivers/vdpa/alibaba/eni_vdpa.c | 554 ++++++++++++++++++++++++++++++++
> > 4 files changed, 566 insertions(+)
> > create mode 100644 drivers/vdpa/alibaba/Makefile
> > create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> >
> > diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> > index 3d91982d8371..9587b9177b05 100644
> > --- a/drivers/vdpa/Kconfig
> > +++ b/drivers/vdpa/Kconfig
> > @@ -78,4 +78,12 @@ config VP_VDPA
> > help
> > This kernel module bridges virtio PCI device to vDPA bus.
> >
> > +config ALIBABA_ENI_VDPA
> > + tristate "vDPA driver for Alibaba ENI"
> > + select VIRTIO_PCI_LEGACY_LIB
> > + depends on PCI_MSI
> > + help
> > + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon
> > + virtio 0.9.5 specification.
> > +
> > endif # VDPA
> > diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> > index f02ebed33f19..15665563a7f4 100644
> > --- a/drivers/vdpa/Makefile
> > +++ b/drivers/vdpa/Makefile
> > @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> > obj-$(CONFIG_IFCVF) += ifcvf/
> > obj-$(CONFIG_MLX5_VDPA) += mlx5/
> > obj-$(CONFIG_VP_VDPA) += virtio_pci/
> > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> > diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
> > new file mode 100644
> > index 000000000000..ef4aae69f87a
> > --- /dev/null
> > +++ b/drivers/vdpa/alibaba/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> > +
> > diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
> > new file mode 100644
> > index 000000000000..b6eef696cec5
> > --- /dev/null
> > +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> > @@ -0,0 +1,554 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
> > + *
> > + * Copyright (c) 2021, Alibaba Inc. All rights reserved.
> > + * Author: Wu Zongyong <[email protected]>
> > + *
> > + */
> > +
> > +#include "linux/bits.h"
> > +#include <linux/interrupt.h>
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +#include <linux/vdpa.h>
> > +#include <linux/virtio.h>
> > +#include <linux/virtio_config.h>
> > +#include <linux/virtio_ring.h>
> > +#include <linux/virtio_pci.h>
> > +#include <linux/virtio_pci_legacy.h>
> > +#include <uapi/linux/virtio_net.h>
> > +
> > +#define ENI_MSIX_NAME_SIZE 256
> > +
> > +#define ENI_ERR(pdev, fmt, ...) \
> > + dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > +#define ENI_DBG(pdev, fmt, ...) \
> > + dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > +#define ENI_INFO(pdev, fmt, ...) \
> > + dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > +
> > +struct eni_vring {
> > + void __iomem *notify;
> > + char msix_name[ENI_MSIX_NAME_SIZE];
> > + struct vdpa_callback cb;
> > + int irq;
> > +};
> > +
> > +struct eni_vdpa {
> > + struct vdpa_device vdpa;
> > + struct virtio_pci_legacy_device ldev;
> > + struct eni_vring *vring;
> > + struct vdpa_callback config_cb;
> > + char msix_name[ENI_MSIX_NAME_SIZE];
> > + int config_irq;
> > + int queues;
> > + int vectors;
> > +};
> > +
> > +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
> > +{
> > + return container_of(vdpa, struct eni_vdpa, vdpa);
> > +}
> > +
> > +static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + return &eni_vdpa->ldev;
> > +}
> > +
> > +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + u64 features = vp_legacy_get_features(ldev);
> > +
> > + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> > +
> > + return features;
>
> I wonder if the following can work with ENI:
>
> -device virtio-net-pci,mrg_rxbuf=off
>
> ?

ENI did not work with that option.
I will remove F_MRG_RXBUF in get_features.
>
> Thanks
>
> > +}
> > +
> > +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + vp_legacy_set_features(ldev, (u32)features);
> > +
> > + return 0;
> > +}
> > +
> > +static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_status(ldev);
> > +}
> > +
> > +static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + int irq = eni_vdpa->vring[idx].irq;
> > +
> > + if (irq == VIRTIO_MSI_NO_VECTOR)
> > + return -EINVAL;
> > +
> > + return irq;
> > +}
> > +
> > +static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + struct pci_dev *pdev = ldev->pci_dev;
> > + int i;
> > +
> > + for (i = 0; i < eni_vdpa->queues; i++) {
> > + if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
> > + vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
> > + devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
> > + &eni_vdpa->vring[i]);
> > + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> > + }
> > + }
> > +
> > + if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
> > + vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
> > + devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
> > + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> > + }
> > +
> > + if (eni_vdpa->vectors) {
> > + pci_free_irq_vectors(pdev);
> > + eni_vdpa->vectors = 0;
> > + }
> > +}
> > +
> > +static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
> > +{
> > + struct eni_vring *vring = arg;
> > +
> > + if (vring->cb.callback)
> > + return vring->cb.callback(vring->cb.private);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
> > +{
> > + struct eni_vdpa *eni_vdpa = arg;
> > +
> > + if (eni_vdpa->config_cb.callback)
> > + return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + struct pci_dev *pdev = ldev->pci_dev;
> > + int i, ret, irq;
> > + int queues = eni_vdpa->queues;
> > + int vectors = queues + 1;
> > +
> > + ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
> > + if (ret != vectors) {
> > + ENI_ERR(pdev,
> > + "failed to allocate irq vectors want %d but %d\n",
> > + vectors, ret);
> > + return ret;
> > + }
> > +
> > + eni_vdpa->vectors = vectors;
> > +
> > + for (i = 0; i < queues; i++) {
> > + snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
> > + "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
> > + irq = pci_irq_vector(pdev, i);
> > + ret = devm_request_irq(&pdev->dev, irq,
> > + eni_vdpa_vq_handler,
> > + 0, eni_vdpa->vring[i].msix_name,
> > + &eni_vdpa->vring[i]);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
> > + goto err;
> > + }
> > + vp_legacy_queue_vector(ldev, i, i);
> > + eni_vdpa->vring[i].irq = irq;
> > + }
> > +
> > + snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
> > + pci_name(pdev));
> > + irq = pci_irq_vector(pdev, queues);
> > + ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
> > + eni_vdpa->msix_name, eni_vdpa);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
> > + goto err;
> > + }
> > + vp_legacy_config_vector(ldev, queues);
> > + eni_vdpa->config_irq = irq;
> > +
> > + return 0;
> > +err:
> > + eni_vdpa_free_irq(eni_vdpa);
> > + return ret;
> > +}
> > +
> > +static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + u8 s = eni_vdpa_get_status(vdpa);
> > +
> > + if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
> > + !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > + eni_vdpa_request_irq(eni_vdpa);
> > + }
> > +
> > + vp_legacy_set_status(ldev, status);
> > +
> > + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > + (s & VIRTIO_CONFIG_S_DRIVER_OK))
> > + eni_vdpa_free_irq(eni_vdpa);
> > +}
> > +
> > +static int eni_vdpa_reset(struct vdpa_device *vdpa)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + u8 s = eni_vdpa_get_status(vdpa);
> > +
> > + vp_legacy_set_status(ldev, 0);
> > +
> > + if (s & VIRTIO_CONFIG_S_DRIVER_OK)
> > + eni_vdpa_free_irq(eni_vdpa);
> > +
> > + return 0;
> > +}
> > +
> > +static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_queue_size(ldev, 0);
> > +}
> > +
> > +static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_queue_size(ldev, 0);
> > +}
> > +
> > +static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
> > + struct vdpa_vq_state *state)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
> > + const struct vdpa_vq_state *state)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + const struct vdpa_vq_state_split *split = &state->split;
> > +
> > + /* ENI is build upon virtio-pci specfication which not support
> > + * to set state of virtqueue. But if the state is equal to the
> > + * device initial state by chance, we can let it go.
> > + */
> > + if (!vp_legacy_get_queue_enable(ldev, qid)
> > + && split->avail_index == 0)
> > + return 0;
> > +
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +
> > +static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
> > + struct vdpa_callback *cb)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + eni_vdpa->vring[qid].cb = *cb;
> > +}
> > +
> > +static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
> > + bool ready)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + /* ENI is a legacy virtio-pci device. This is not supported
> > + * by specification. But we can disable virtqueue by setting
> > + * address to 0.
> > + */
> > + if (!ready)
> > + vp_legacy_set_queue_address(ldev, qid, 0);
> > +}
> > +
> > +static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return vp_legacy_get_queue_enable(ldev, qid);
> > +}
> > +
> > +static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
> > + u32 num)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + struct pci_dev *pdev = ldev->pci_dev;
> > + u16 n = vp_legacy_get_queue_size(ldev, qid);
> > +
> > + /* ENI is a legacy virtio-pci device which not allow to change
> > + * virtqueue size. Just report a error if someone tries to
> > + * change it.
> > + */
> > + if (num != n)
> > + ENI_ERR(pdev,
> > + "not support to set vq %u fixed num %u to %u\n",
> > + qid, n, num);
> > +}
> > +
> > +static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
> > + u64 desc_area, u64 driver_area,
> > + u64 device_area)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > + u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> > +
> > + vp_legacy_set_queue_address(ldev, qid, pfn);
> > +
> > + return 0;
> > +}
> > +
> > +static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + iowrite16(qid, eni_vdpa->vring[qid].notify);
> > +}
> > +
> > +static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return ldev->id.device;
> > +}
> > +
> > +static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > +
> > + return ldev->id.vendor;
> > +}
> > +
> > +static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
> > +{
> > + return PAGE_SIZE;
> > +}
> > +
> > +static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
> > +{
> > + return sizeof(struct virtio_net_config);
> > +}
> > +
> > +
> > +static void eni_vdpa_get_config(struct vdpa_device *vdpa,
> > + unsigned int offset,
> > + void *buf, unsigned int len)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + void __iomem *ioaddr = ldev->ioaddr +
> > + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> > + offset;
> > + u8 *p = buf;
> > + int i;
> > +
> > + for (i = 0; i < len; i++)
> > + *p++ = ioread8(ioaddr + i);
> > +}
> > +
> > +static void eni_vdpa_set_config(struct vdpa_device *vdpa,
> > + unsigned int offset, const void *buf,
> > + unsigned int len)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + void __iomem *ioaddr = ldev->ioaddr +
> > + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> > + offset;
> > + const u8 *p = buf;
> > + int i;
> > +
> > + for (i = 0; i < len; i++)
> > + iowrite8(*p++, ioaddr + i);
> > +}
> > +
> > +static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
> > + struct vdpa_callback *cb)
> > +{
> > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > +
> > + eni_vdpa->config_cb = *cb;
> > +}
> > +
> > +static const struct vdpa_config_ops eni_vdpa_ops = {
> > + .get_features = eni_vdpa_get_features,
> > + .set_features = eni_vdpa_set_features,
> > + .get_status = eni_vdpa_get_status,
> > + .set_status = eni_vdpa_set_status,
> > + .reset = eni_vdpa_reset,
> > + .get_vq_num_max = eni_vdpa_get_vq_num_max,
> > + .get_vq_num_min = eni_vdpa_get_vq_num_min,
> > + .get_vq_state = eni_vdpa_get_vq_state,
> > + .set_vq_state = eni_vdpa_set_vq_state,
> > + .set_vq_cb = eni_vdpa_set_vq_cb,
> > + .set_vq_ready = eni_vdpa_set_vq_ready,
> > + .get_vq_ready = eni_vdpa_get_vq_ready,
> > + .set_vq_num = eni_vdpa_set_vq_num,
> > + .set_vq_address = eni_vdpa_set_vq_address,
> > + .kick_vq = eni_vdpa_kick_vq,
> > + .get_device_id = eni_vdpa_get_device_id,
> > + .get_vendor_id = eni_vdpa_get_vendor_id,
> > + .get_vq_align = eni_vdpa_get_vq_align,
> > + .get_config_size = eni_vdpa_get_config_size,
> > + .get_config = eni_vdpa_get_config,
> > + .set_config = eni_vdpa_set_config,
> > + .set_config_cb = eni_vdpa_set_config_cb,
> > + .get_vq_irq = eni_vdpa_get_vq_irq,
> > +};
> > +
> > +
> > +static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
> > +{
> > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > + u32 features = vp_legacy_get_features(ldev);
> > + u16 num = 2;
> > +
> > + if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
> > + __virtio16 max_virtqueue_pairs;
> > +
> > + eni_vdpa_get_config(&eni_vdpa->vdpa,
> > + offsetof(struct virtio_net_config, max_virtqueue_pairs),
> > + &max_virtqueue_pairs,
> > + sizeof(max_virtqueue_pairs));
> > + num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
> > + max_virtqueue_pairs);
> > + }
> > +
> > + if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
> > + num += 1;
> > +
> > + return num;
> > +}
> > +
> > +static void eni_vdpa_free_irq_vectors(void *data)
> > +{
> > + pci_free_irq_vectors(data);
> > +}
> > +
> > +#ifdef __LITTLE_ENDIAN
> > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > +{
> > + struct device *dev = &pdev->dev;
> > + struct eni_vdpa *eni_vdpa;
> > + struct virtio_pci_legacy_device *ldev;
> > + int ret, i;
> > +
> > + ret = pcim_enable_device(pdev);
> > + if (ret)
> > + return ret;
> > +
> > + eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
> > + dev, &eni_vdpa_ops, NULL, false);
> > + if (IS_ERR(eni_vdpa)) {
> > + ENI_ERR(pdev, "failed to allocate vDPA structure\n");
> > + return PTR_ERR(eni_vdpa);
> > + }
> > +
> > + ldev = &eni_vdpa->ldev;
> > + ldev->pci_dev = pdev;
> > +
> > + ret = vp_legacy_probe(ldev);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to probe legacy PCI device\n");
> > + goto err;
> > + }
> > +
> > + pci_set_master(pdev);
> > + pci_set_drvdata(pdev, eni_vdpa);
> > +
> > + eni_vdpa->vdpa.dma_dev = &pdev->dev;
> > + eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
> > +
> > + ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
> > + if (ret) {
> > + ENI_ERR(pdev,
> > + "failed for adding devres for freeing irq vectors\n");
> > + goto err;
> > + }
> > +
> > + eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
> > + sizeof(*eni_vdpa->vring),
> > + GFP_KERNEL);
> > + if (!eni_vdpa->vring) {
> > + ret = -ENOMEM;
> > + ENI_ERR(pdev, "failed to allocate virtqueues\n");
> > + goto err;
> > + }
> > +
> > + for (i = 0; i < eni_vdpa->queues; i++) {
> > + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> > + eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> > + }
> > + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> > +
> > + ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
> > + if (ret) {
> > + ENI_ERR(pdev, "failed to register to vdpa bus\n");
> > + goto err;
> > + }
> > +
> > + return 0;
> > +
> > +err:
> > + put_device(&eni_vdpa->vdpa.dev);
> > + return ret;
> > +}
> > +#else
> > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > +{
> > + ENI_ERR(pdev, "this driver not supported on BE host\n");
> > + return -ENODEV;
> > +}
> > +#endif
> > +
> > +static void eni_vdpa_remove(struct pci_dev *pdev)
> > +{
> > + struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
> > +
> > + vdpa_unregister_device(&eni_vdpa->vdpa);
> > + vp_legacy_remove(&eni_vdpa->ldev);
> > +}
> > +
> > +static struct pci_device_id eni_pci_ids[] = {
> > + { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
> > + VIRTIO_TRANS_ID_NET,
> > + PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
> > + VIRTIO_ID_NET) },
> > + { 0 },
> > +};
> > +
> > +static struct pci_driver eni_vdpa_driver = {
> > + .name = "alibaba-eni-vdpa",
> > + .id_table = eni_pci_ids,
> > + .probe = eni_vdpa_probe,
> > + .remove = eni_vdpa_remove,
> > +};
> > +
> > +module_pci_driver(eni_vdpa_driver);
> > +
> > +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> > +MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
> > +MODULE_LICENSE("GPL v2");
> > --
> > 2.31.1
> >

2021-09-26 03:30:42

by Wu Zongyong

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Sun, Sep 26, 2021 at 10:26:47AM +0800, Jason Wang wrote:
>
> On 2021/9/22 8:46 PM, Wu Zongyong wrote:
> > +
> > +#ifdef __LITTLE_ENDIAN
>
>
> I think disable the device via Kconfig is better than letting user to meet
> errors like this.
>
> (Or if the device is always using little endian, we don't even need to
> bother this).

I prefer the second suggestion, since there is no use case in which the
device uses big endian.
>
> Thanks
>
>
> > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > +{
> > + struct device *dev = &pdev->dev;
> > + struct eni_vdpa *eni_vdpa;
> > + struct virtio_pci_legacy_device *ldev;
> > + int ret, i;
> > +
> > + ret = pcim_enable_device(pdev);
> > + if (ret)
> > + return ret;
> > +

2021-09-26 04:23:39

by Jason Wang

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Sun, Sep 26, 2021 at 11:27 AM Wu Zongyong
<[email protected]> wrote:
>
> On Sun, Sep 26, 2021 at 10:26:47AM +0800, Jason Wang wrote:
> >
> > On 2021/9/22 8:46 PM, Wu Zongyong wrote:
> > > +
> > > +#ifdef __LITTLE_ENDIAN
> >
> >
> > I think disable the device via Kconfig is better than letting user to meet
> > errors like this.
> >
> > (Or if the device is always using little endian, we don't even need to
> > bother this).
>
> I prefer the second suggestion, since there is no use case in which the
> device uses big endian.

If this means the device will always use little endian, that's fine.

Thanks

> >
> > Thanks
> >
> >
> > > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > +{
> > > + struct device *dev = &pdev->dev;
> > > + struct eni_vdpa *eni_vdpa;
> > > + struct virtio_pci_legacy_device *ldev;
> > > + int ret, i;
> > > +
> > > + ret = pcim_enable_device(pdev);
> > > + if (ret)
> > > + return ret;
> > > +
>

2021-09-26 04:23:39

by Jason Wang

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Sun, Sep 26, 2021 at 11:24 AM Wu Zongyong
<[email protected]> wrote:
>
> On Sun, Sep 26, 2021 at 10:24:21AM +0800, Jason Wang wrote:
> > On Wed, Sep 22, 2021 at 8:47 PM Wu Zongyong
> > <[email protected]> wrote:
> > >
> > > This patch adds a new vDPA driver for Alibaba ENI(Elastic Network
> > > Interface) which is build upon virtio 0.9.5 specification.
> > > And this driver doesn't support to run on BE host.
> > >
> > > Signed-off-by: Wu Zongyong <[email protected]>
> > > ---
> > > drivers/vdpa/Kconfig | 8 +
> > > drivers/vdpa/Makefile | 1 +
> > > drivers/vdpa/alibaba/Makefile | 3 +
> > > drivers/vdpa/alibaba/eni_vdpa.c | 554 ++++++++++++++++++++++++++++++++
> > > 4 files changed, 566 insertions(+)
> > > create mode 100644 drivers/vdpa/alibaba/Makefile
> > > create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> > >
> > > diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> > > index 3d91982d8371..9587b9177b05 100644
> > > --- a/drivers/vdpa/Kconfig
> > > +++ b/drivers/vdpa/Kconfig
> > > @@ -78,4 +78,12 @@ config VP_VDPA
> > > help
> > > This kernel module bridges virtio PCI device to vDPA bus.
> > >
> > > +config ALIBABA_ENI_VDPA
> > > + tristate "vDPA driver for Alibaba ENI"
> > > + select VIRTIO_PCI_LEGACY_LIB
> > > + depends on PCI_MSI
> > > + help
> > > + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon
> > > + virtio 0.9.5 specification.
> > > +
> > > endif # VDPA
> > > diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> > > index f02ebed33f19..15665563a7f4 100644
> > > --- a/drivers/vdpa/Makefile
> > > +++ b/drivers/vdpa/Makefile
> > > @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> > > obj-$(CONFIG_IFCVF) += ifcvf/
> > > obj-$(CONFIG_MLX5_VDPA) += mlx5/
> > > obj-$(CONFIG_VP_VDPA) += virtio_pci/
> > > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> > > diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
> > > new file mode 100644
> > > index 000000000000..ef4aae69f87a
> > > --- /dev/null
> > > +++ b/drivers/vdpa/alibaba/Makefile
> > > @@ -0,0 +1,3 @@
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> > > +
> > > diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
> > > new file mode 100644
> > > index 000000000000..b6eef696cec5
> > > --- /dev/null
> > > +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> > > @@ -0,0 +1,554 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/*
> > > + * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
> > > + *
> > > + * Copyright (c) 2021, Alibaba Inc. All rights reserved.
> > > + * Author: Wu Zongyong <[email protected]>
> > > + *
> > > + */
> > > +
> > > +#include "linux/bits.h"
> > > +#include <linux/interrupt.h>
> > > +#include <linux/module.h>
> > > +#include <linux/pci.h>
> > > +#include <linux/vdpa.h>
> > > +#include <linux/virtio.h>
> > > +#include <linux/virtio_config.h>
> > > +#include <linux/virtio_ring.h>
> > > +#include <linux/virtio_pci.h>
> > > +#include <linux/virtio_pci_legacy.h>
> > > +#include <uapi/linux/virtio_net.h>
> > > +
> > > +#define ENI_MSIX_NAME_SIZE 256
> > > +
> > > +#define ENI_ERR(pdev, fmt, ...) \
> > > + dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > > +#define ENI_DBG(pdev, fmt, ...) \
> > > + dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > > +#define ENI_INFO(pdev, fmt, ...) \
> > > + dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> > > +
> > > +struct eni_vring {
> > > + void __iomem *notify;
> > > + char msix_name[ENI_MSIX_NAME_SIZE];
> > > + struct vdpa_callback cb;
> > > + int irq;
> > > +};
> > > +
> > > +struct eni_vdpa {
> > > + struct vdpa_device vdpa;
> > > + struct virtio_pci_legacy_device ldev;
> > > + struct eni_vring *vring;
> > > + struct vdpa_callback config_cb;
> > > + char msix_name[ENI_MSIX_NAME_SIZE];
> > > + int config_irq;
> > > + int queues;
> > > + int vectors;
> > > +};
> > > +
> > > +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
> > > +{
> > > + return container_of(vdpa, struct eni_vdpa, vdpa);
> > > +}
> > > +
> > > +static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > +
> > > + return &eni_vdpa->ldev;
> > > +}
> > > +
> > > +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > + u64 features = vp_legacy_get_features(ldev);
> > > +
> > > + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> > > +
> > > + return features;
> >
> > I wonder if the following can work with ENI:
> >
> > -device virtio-net-pci,mrg_rxbuf=off
> >
> > ?
>
> ENI didn't work.
> I will remove F_MRG_RXBUF when get_features.

I think we need to fail FEATURE_OK if F_MRG_RXBUF is not negotiated,
since VERSION_1 requires a fixed header length even if F_MRG_RXBUF is
not negotiated.

But this trick doesn't come for free. If some driver doesn't support
mrg_rxbuf, it won't work.

Thanks


> >
> > Thanks
> >
> > > +}
> > > +
> > > +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + vp_legacy_set_features(ldev, (u32)features);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + return vp_legacy_get_status(ldev);
> > > +}
> > > +
> > > +static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > + int irq = eni_vdpa->vring[idx].irq;
> > > +
> > > + if (irq == VIRTIO_MSI_NO_VECTOR)
> > > + return -EINVAL;
> > > +
> > > + return irq;
> > > +}
> > > +
> > > +static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + struct pci_dev *pdev = ldev->pci_dev;
> > > + int i;
> > > +
> > > + for (i = 0; i < eni_vdpa->queues; i++) {
> > > + if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
> > > + vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
> > > + devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
> > > + &eni_vdpa->vring[i]);
> > > + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> > > + }
> > > + }
> > > +
> > > + if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
> > > + vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
> > > + devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
> > > + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> > > + }
> > > +
> > > + if (eni_vdpa->vectors) {
> > > + pci_free_irq_vectors(pdev);
> > > + eni_vdpa->vectors = 0;
> > > + }
> > > +}
> > > +
> > > +static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
> > > +{
> > > + struct eni_vring *vring = arg;
> > > +
> > > + if (vring->cb.callback)
> > > + return vring->cb.callback(vring->cb.private);
> > > +
> > > + return IRQ_HANDLED;
> > > +}
> > > +
> > > +static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = arg;
> > > +
> > > + if (eni_vdpa->config_cb.callback)
> > > + return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
> > > +
> > > + return IRQ_HANDLED;
> > > +}
> > > +
> > > +static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + struct pci_dev *pdev = ldev->pci_dev;
> > > + int i, ret, irq;
> > > + int queues = eni_vdpa->queues;
> > > + int vectors = queues + 1;
> > > +
> > > + ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
> > > + if (ret != vectors) {
> > > + ENI_ERR(pdev,
> > > + "failed to allocate irq vectors want %d but %d\n",
> > > + vectors, ret);
> > > + return ret;
> > > + }
> > > +
> > > + eni_vdpa->vectors = vectors;
> > > +
> > > + for (i = 0; i < queues; i++) {
> > > + snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
> > > + "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
> > > + irq = pci_irq_vector(pdev, i);
> > > + ret = devm_request_irq(&pdev->dev, irq,
> > > + eni_vdpa_vq_handler,
> > > + 0, eni_vdpa->vring[i].msix_name,
> > > + &eni_vdpa->vring[i]);
> > > + if (ret) {
> > > + ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
> > > + goto err;
> > > + }
> > > + vp_legacy_queue_vector(ldev, i, i);
> > > + eni_vdpa->vring[i].irq = irq;
> > > + }
> > > +
> > > + snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
> > > + pci_name(pdev));
> > > + irq = pci_irq_vector(pdev, queues);
> > > + ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
> > > + eni_vdpa->msix_name, eni_vdpa);
> > > + if (ret) {
> > > + ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
> > > + goto err;
> > > + }
> > > + vp_legacy_config_vector(ldev, queues);
> > > + eni_vdpa->config_irq = irq;
> > > +
> > > + return 0;
> > > +err:
> > > + eni_vdpa_free_irq(eni_vdpa);
> > > + return ret;
> > > +}
> > > +
> > > +static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + u8 s = eni_vdpa_get_status(vdpa);
> > > +
> > > + if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
> > > + !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > + eni_vdpa_request_irq(eni_vdpa);
> > > + }
> > > +
> > > + vp_legacy_set_status(ldev, status);
> > > +
> > > + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > > + (s & VIRTIO_CONFIG_S_DRIVER_OK))
> > > + eni_vdpa_free_irq(eni_vdpa);
> > > +}
> > > +
> > > +static int eni_vdpa_reset(struct vdpa_device *vdpa)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + u8 s = eni_vdpa_get_status(vdpa);
> > > +
> > > + vp_legacy_set_status(ldev, 0);
> > > +
> > > + if (s & VIRTIO_CONFIG_S_DRIVER_OK)
> > > + eni_vdpa_free_irq(eni_vdpa);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + return vp_legacy_get_queue_size(ldev, 0);
> > > +}
> > > +
> > > +static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + return vp_legacy_get_queue_size(ldev, 0);
> > > +}
> > > +
> > > +static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
> > > + struct vdpa_vq_state *state)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> > > +static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
> > > + const struct vdpa_vq_state *state)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > + const struct vdpa_vq_state_split *split = &state->split;
> > > +
> > > + /* ENI is build upon virtio-pci specfication which not support
> > > + * to set state of virtqueue. But if the state is equal to the
> > > + * device initial state by chance, we can let it go.
> > > + */
> > > + if (!vp_legacy_get_queue_enable(ldev, qid)
> > > + && split->avail_index == 0)
> > > + return 0;
> > > +
> > > + return -EOPNOTSUPP;
> > > +}
> > > +
> > > +
> > > +static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
> > > + struct vdpa_callback *cb)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > +
> > > + eni_vdpa->vring[qid].cb = *cb;
> > > +}
> > > +
> > > +static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
> > > + bool ready)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + /* ENI is a legacy virtio-pci device. This is not supported
> > > + * by specification. But we can disable virtqueue by setting
> > > + * address to 0.
> > > + */
> > > + if (!ready)
> > > + vp_legacy_set_queue_address(ldev, qid, 0);
> > > +}
> > > +
> > > +static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + return vp_legacy_get_queue_enable(ldev, qid);
> > > +}
> > > +
> > > +static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
> > > + u32 num)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > + struct pci_dev *pdev = ldev->pci_dev;
> > > + u16 n = vp_legacy_get_queue_size(ldev, qid);
> > > +
> > > + /* ENI is a legacy virtio-pci device which not allow to change
> > > + * virtqueue size. Just report a error if someone tries to
> > > + * change it.
> > > + */
> > > + if (num != n)
> > > + ENI_ERR(pdev,
> > > + "not support to set vq %u fixed num %u to %u\n",
> > > + qid, n, num);
> > > +}
> > > +
> > > +static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
> > > + u64 desc_area, u64 driver_area,
> > > + u64 device_area)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > + u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> > > +
> > > + vp_legacy_set_queue_address(ldev, qid, pfn);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > +
> > > + iowrite16(qid, eni_vdpa->vring[qid].notify);
> > > +}
> > > +
> > > +static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + return ldev->id.device;
> > > +}
> > > +
> > > +static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> > > +
> > > + return ldev->id.vendor;
> > > +}
> > > +
> > > +static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
> > > +{
> > > + return PAGE_SIZE;
> > > +}
> > > +
> > > +static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
> > > +{
> > > + return sizeof(struct virtio_net_config);
> > > +}
> > > +
> > > +
> > > +static void eni_vdpa_get_config(struct vdpa_device *vdpa,
> > > + unsigned int offset,
> > > + void *buf, unsigned int len)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + void __iomem *ioaddr = ldev->ioaddr +
> > > + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> > > + offset;
> > > + u8 *p = buf;
> > > + int i;
> > > +
> > > + for (i = 0; i < len; i++)
> > > + *p++ = ioread8(ioaddr + i);
> > > +}
> > > +
> > > +static void eni_vdpa_set_config(struct vdpa_device *vdpa,
> > > + unsigned int offset, const void *buf,
> > > + unsigned int len)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + void __iomem *ioaddr = ldev->ioaddr +
> > > + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> > > + offset;
> > > + const u8 *p = buf;
> > > + int i;
> > > +
> > > + for (i = 0; i < len; i++)
> > > + iowrite8(*p++, ioaddr + i);
> > > +}
> > > +
> > > +static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
> > > + struct vdpa_callback *cb)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > +
> > > + eni_vdpa->config_cb = *cb;
> > > +}
> > > +
> > > +static const struct vdpa_config_ops eni_vdpa_ops = {
> > > + .get_features = eni_vdpa_get_features,
> > > + .set_features = eni_vdpa_set_features,
> > > + .get_status = eni_vdpa_get_status,
> > > + .set_status = eni_vdpa_set_status,
> > > + .reset = eni_vdpa_reset,
> > > + .get_vq_num_max = eni_vdpa_get_vq_num_max,
> > > + .get_vq_num_min = eni_vdpa_get_vq_num_min,
> > > + .get_vq_state = eni_vdpa_get_vq_state,
> > > + .set_vq_state = eni_vdpa_set_vq_state,
> > > + .set_vq_cb = eni_vdpa_set_vq_cb,
> > > + .set_vq_ready = eni_vdpa_set_vq_ready,
> > > + .get_vq_ready = eni_vdpa_get_vq_ready,
> > > + .set_vq_num = eni_vdpa_set_vq_num,
> > > + .set_vq_address = eni_vdpa_set_vq_address,
> > > + .kick_vq = eni_vdpa_kick_vq,
> > > + .get_device_id = eni_vdpa_get_device_id,
> > > + .get_vendor_id = eni_vdpa_get_vendor_id,
> > > + .get_vq_align = eni_vdpa_get_vq_align,
> > > + .get_config_size = eni_vdpa_get_config_size,
> > > + .get_config = eni_vdpa_get_config,
> > > + .set_config = eni_vdpa_set_config,
> > > + .set_config_cb = eni_vdpa_set_config_cb,
> > > + .get_vq_irq = eni_vdpa_get_vq_irq,
> > > +};
> > > +
> > > +
> > > +static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
> > > +{
> > > + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> > > + u32 features = vp_legacy_get_features(ldev);
> > > + u16 num = 2;
> > > +
> > > + if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
> > > + __virtio16 max_virtqueue_pairs;
> > > +
> > > + eni_vdpa_get_config(&eni_vdpa->vdpa,
> > > + offsetof(struct virtio_net_config, max_virtqueue_pairs),
> > > + &max_virtqueue_pairs,
> > > + sizeof(max_virtqueue_pairs));
> > > + num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
> > > + max_virtqueue_pairs);
> > > + }
> > > +
> > > + if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
> > > + num += 1;
> > > +
> > > + return num;
> > > +}
> > > +
> > > +static void eni_vdpa_free_irq_vectors(void *data)
> > > +{
> > > + pci_free_irq_vectors(data);
> > > +}
> > > +
> > > +#ifdef __LITTLE_ENDIAN
> > > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > +{
> > > + struct device *dev = &pdev->dev;
> > > + struct eni_vdpa *eni_vdpa;
> > > + struct virtio_pci_legacy_device *ldev;
> > > + int ret, i;
> > > +
> > > + ret = pcim_enable_device(pdev);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
> > > + dev, &eni_vdpa_ops, NULL, false);
> > > + if (IS_ERR(eni_vdpa)) {
> > > + ENI_ERR(pdev, "failed to allocate vDPA structure\n");
> > > + return PTR_ERR(eni_vdpa);
> > > + }
> > > +
> > > + ldev = &eni_vdpa->ldev;
> > > + ldev->pci_dev = pdev;
> > > +
> > > + ret = vp_legacy_probe(ldev);
> > > + if (ret) {
> > > + ENI_ERR(pdev, "failed to probe legacy PCI device\n");
> > > + goto err;
> > > + }
> > > +
> > > + pci_set_master(pdev);
> > > + pci_set_drvdata(pdev, eni_vdpa);
> > > +
> > > + eni_vdpa->vdpa.dma_dev = &pdev->dev;
> > > + eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
> > > +
> > > + ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
> > > + if (ret) {
> > > + ENI_ERR(pdev,
> > > + "failed for adding devres for freeing irq vectors\n");
> > > + goto err;
> > > + }
> > > +
> > > + eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
> > > + sizeof(*eni_vdpa->vring),
> > > + GFP_KERNEL);
> > > + if (!eni_vdpa->vring) {
> > > + ret = -ENOMEM;
> > > + ENI_ERR(pdev, "failed to allocate virtqueues\n");
> > > + goto err;
> > > + }
> > > +
> > > + for (i = 0; i < eni_vdpa->queues; i++) {
> > > + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> > > + eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> > > + }
> > > + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> > > +
> > > + ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
> > > + if (ret) {
> > > + ENI_ERR(pdev, "failed to register to vdpa bus\n");
> > > + goto err;
> > > + }
> > > +
> > > + return 0;
> > > +
> > > +err:
> > > + put_device(&eni_vdpa->vdpa.dev);
> > > + return ret;
> > > +}
> > > +#else
> > > +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > +{
> > > + ENI_ERR(pdev, "this driver not supported on BE host\n");
> > > + return -ENODEV;
> > > +}
> > > +#endif
> > > +
> > > +static void eni_vdpa_remove(struct pci_dev *pdev)
> > > +{
> > > + struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
> > > +
> > > + vdpa_unregister_device(&eni_vdpa->vdpa);
> > > + vp_legacy_remove(&eni_vdpa->ldev);
> > > +}
> > > +
> > > +static struct pci_device_id eni_pci_ids[] = {
> > > + { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
> > > + VIRTIO_TRANS_ID_NET,
> > > + PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
> > > + VIRTIO_ID_NET) },
> > > + { 0 },
> > > +};
> > > +
> > > +static struct pci_driver eni_vdpa_driver = {
> > > + .name = "alibaba-eni-vdpa",
> > > + .id_table = eni_pci_ids,
> > > + .probe = eni_vdpa_probe,
> > > + .remove = eni_vdpa_remove,
> > > +};
> > > +
> > > +module_pci_driver(eni_vdpa_driver);
> > > +
> > > +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> > > +MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
> > > +MODULE_LICENSE("GPL v2");
> > > --
> > > 2.31.1
> > >
>

2021-09-27 10:37:58

by Michael S. Tsirkin

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Sun, Sep 26, 2021 at 12:18:26PM +0800, Jason Wang wrote:
> > > I wonder if the following can work with ENI:
> > >
> > > -device virtio-net-pci,mrg_rxbuf=off
> > >
> > > ?
> >
> > ENI didn't work.
> > I will remove F_MRG_RXBUF when get_features.
>
> I think we need to fail FEATURE_OK if F_MRG_RXBUF is not negotiated.
> Since VERSION_1 requires a fixed header length even if F_MRG_RXBUF is
> not negotiated.
>
> But this trick doesn't come for free. If some driver doesn't support
> mrg_rxbuf, it won't work.
>
> Thanks

Yea. Ugh. Down the road I think we'll add legacy support to vdpa on
strongly ordered systems. Doing it in userspace is just too messy imho.
But yes, this kind of hack is probably ok for weakly ordered systems.
BTW we need to set VIRTIO_F_ORDER_PLATFORM, right?

--
MST

2021-09-28 02:19:00

by Jason Wang

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

On Mon, Sep 27, 2021 at 6:36 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Sun, Sep 26, 2021 at 12:18:26PM +0800, Jason Wang wrote:
> > > > I wonder if the following can work with ENI:
> > > >
> > > > -device virtio-net-pci,mrg_rxbuf=off
> > > >
> > > > ?
> > >
> > > ENI didn't work.
> > > I will remove F_MRG_RXBUF when get_features.
> >
> > I think we need to fail FEATURE_OK if F_MRG_RXBUF is not negotiated.
> > Since VERSION_1 requires a fixed header length even if F_MRG_RXBUF is
> > not negotiated.
> >
> > But this trick doesn't come for free. If some driver doesn't support
> > mrg_rxbuf, it won't work.
> >
> > Thanks
>
> Yea. Ugh. Down the road I think we'll add legacy support to vdpa on
> strongly ordered systems.

I don't see the connection, can you explain why?

> Doing it in userspace is just too messy imho.
> But yes, this kind of hack is probably ok for weakly ordered systems.
> BTW we need to set VIRTIO_F_ORDER_PLATFORM, right?

Right.

Thanks

>
> --
> MST
>

2021-09-29 06:13:28

by Wu Zongyong

Subject: [PATCH v4 0/7] vDPA driver for Alibaba ENI

This series implements the vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built on the virtio-pci 0.9.5 specification.

Changes since V3:
- validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
- present F_ORDER_PLATFORM in get_features
- remove the endianness check since ENI always uses little endian

Changes since V2:
- add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose the correct virtqueue
size as suggested by Jason Wang
- present ACCESS_PLATFORM in get_features callback as suggested by Jason
Wang
- disable this driver on Big Endian host as suggested by Jason Wang
- fix a typo

Changes since V1:
- add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
the vdpa device is legacy
- implement a dedicated driver for Alibaba ENI instead of a legacy
virtio-pci driver as suggested by Jason Wang
- some bugs fixed

Wu Zongyong (7):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vdpa: add new callback get_vq_num_min in vdpa_config_ops
virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
eni_vdpa: add vDPA driver for Alibaba ENI

drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
drivers/vdpa/vdpa.c | 5 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++---
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
drivers/virtio/virtio_vdpa.c | 25 +-
include/linux/vdpa.h | 6 +-
include/linux/virtio_pci_legacy.h | 44 ++
include/uapi/linux/vdpa.h | 1 +
16 files changed, 920 insertions(+), 89 deletions(-)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1
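
For readers following the V3 review, the reasoning behind the new
VIRTIO_NET_F_MRG_RXBUF check can be sketched in plain userspace C. This is
only an illustration of the header-length mismatch raised in the review, not
code from the series: the helper names (guest_hdr_len, legacy_device_hdr_len,
features_acceptable) are hypothetical, while the bit numbers and the 10/12-byte
header sizes come from the virtio specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_NET_F_MRG_RXBUF	15
#define VIRTIO_F_VERSION_1	32

#define BIT_ULL(n)	(1ULL << (n))

/*
 * Hypothetical helper: virtio-net header length used by a driver that
 * negotiated @features. A VERSION_1 driver always uses the 12-byte
 * header (with num_buffers), whether or not MRG_RXBUF is negotiated.
 */
static size_t guest_hdr_len(uint64_t features)
{
	if (features & (BIT_ULL(VIRTIO_F_VERSION_1) |
			BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)))
		return 12;
	return 10;
}

/*
 * Hypothetical helper: a legacy device only knows the low 32 feature
 * bits, so its header length depends on MRG_RXBUF alone.
 */
static size_t legacy_device_hdr_len(uint64_t features)
{
	return (features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) ? 12 : 10;
}

/*
 * Same condition as the check in eni_vdpa_set_features(): a non-empty
 * feature set without MRG_RXBUF is refused, an empty one is accepted.
 */
static bool features_acceptable(uint64_t features)
{
	if (features == 0)
		return true;
	return (features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) != 0;
}
```

With only VERSION_1 set, the two header lengths disagree (12 vs. 10), which is
the case the check refuses. Once MRG_RXBUF is negotiated both sides use 12
bytes.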

2021-09-29 06:14:24

by Wu Zongyong

Subject: [PATCH v4 4/7] vdpa: add new callback get_vq_num_min in vdpa_config_ops

This callback is optional. For vdpa devices that do not support changing
the virtqueue size, get_vq_num_min and get_vq_num_max return the same
value, so that users can choose a correct value for that device.

Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index a896ee021e5f..30864848950b 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -171,6 +171,9 @@ struct vdpa_map_file {
* @get_vq_num_max: Get the max size of virtqueue
* @vdev: vdpa device
* Returns u16: max size of virtqueue
+ * @get_vq_num_min: Get the min size of virtqueue (optional)
+ * @vdev: vdpa device
+ * Returns u16: min size of virtqueue
* @get_device_id: Get virtio device id
* @vdev: vdpa device
* Returns u32: virtio device id
@@ -266,6 +269,7 @@ struct vdpa_config_ops {
void (*set_config_cb)(struct vdpa_device *vdev,
struct vdpa_callback *cb);
u16 (*get_vq_num_max)(struct vdpa_device *vdev);
+ u16 (*get_vq_num_min)(struct vdpa_device *vdev);
u32 (*get_device_id)(struct vdpa_device *vdev);
u32 (*get_vendor_id)(struct vdpa_device *vdev);
u8 (*get_status)(struct vdpa_device *vdev);
--
2.31.1
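
As a rough illustration of how a user of the two callbacks might choose a
virtqueue size, here is a minimal userspace sketch. The helper pick_vq_size()
is hypothetical and not part of the vDPA API. It assumes, as in this series,
that a min of 0 means "no lower bound reported", and it ignores any
power-of-two requirement on ring sizes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clamp a requested virtqueue size into the
 * [min, max] range reported by get_vq_num_min/get_vq_num_max.
 * Returns 0 if the reported range is unusable.
 */
static uint16_t pick_vq_size(uint16_t requested, uint16_t min, uint16_t max)
{
	if (max == 0 || min > max)
		return 0;
	if (requested < min)
		return min;
	if (requested > max)
		return max;
	return requested;
}
```

For a fixed-size device such as ENI, min equals max, so every request
collapses to that single size: pick_vq_size(256, 1024, 1024) yields 1024.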

2021-09-29 06:15:37

by Wu Zongyong

Subject: [PATCH v4 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built upon the virtio 0.9.5 specification.
This driver is not supported on big-endian (BE) hosts.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 ++++++++++++++++++++++++++++++++
4 files changed, 565 insertions(+)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 3d91982d8371..9587b9177b05 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -78,4 +78,12 @@ config VP_VDPA
help
This kernel module bridges virtio PCI device to vDPA bus.

+config ALIBABA_ENI_VDPA
+ tristate "vDPA driver for Alibaba ENI"
+ select VIRTIO_PCI_LIB_LEGACY
+ depends on PCI_MSI
+ help
+ VDPA driver for Alibaba ENI (Elastic Network Interface), which is built
+ upon the virtio 0.9.5 specification.
+
endif # VDPA
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index f02ebed33f19..15665563a7f4 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
obj-$(CONFIG_IFCVF) += ifcvf/
obj-$(CONFIG_MLX5_VDPA) += mlx5/
obj-$(CONFIG_VP_VDPA) += virtio_pci/
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
new file mode 100644
index 000000000000..ef4aae69f87a
--- /dev/null
+++ b/drivers/vdpa/alibaba/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
+
diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
new file mode 100644
index 000000000000..6a09f157d810
--- /dev/null
+++ b/drivers/vdpa/alibaba/eni_vdpa.c
@@ -0,0 +1,553 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
+ *
+ * Copyright (c) 2021, Alibaba Inc. All rights reserved.
+ * Author: Wu Zongyong <[email protected]>
+ *
+ */
+
+#include <linux/bits.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/vdpa.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
+#include <uapi/linux/virtio_net.h>
+
+#define ENI_MSIX_NAME_SIZE 256
+
+#define ENI_ERR(pdev, fmt, ...) \
+ dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_DBG(pdev, fmt, ...) \
+ dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_INFO(pdev, fmt, ...) \
+ dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+
+struct eni_vring {
+ void __iomem *notify;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ struct vdpa_callback cb;
+ int irq;
+};
+
+struct eni_vdpa {
+ struct vdpa_device vdpa;
+ struct virtio_pci_legacy_device ldev;
+ struct eni_vring *vring;
+ struct vdpa_callback config_cb;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ int config_irq;
+ int queues;
+ int vectors;
+};
+
+static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
+{
+ return container_of(vdpa, struct eni_vdpa, vdpa);
+}
+
+static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ return &eni_vdpa->ldev;
+}
+
+static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u64 features = vp_legacy_get_features(ldev);
+
+ features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
+ features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);
+
+ return features;
+}
+
+static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ if (!(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
+ ENI_ERR(ldev->pci_dev,
+ "VIRTIO_NET_F_MRG_RXBUF is not negotiated\n");
+ return -EINVAL;
+ }
+
+ vp_legacy_set_features(ldev, (u32)features);
+
+ return 0;
+}
+
+static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_status(ldev);
+}
+
+static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ int irq = eni_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
+static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i;
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
+ &eni_vdpa->vring[i]);
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ }
+ }
+
+ if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+ }
+
+ if (eni_vdpa->vectors) {
+ pci_free_irq_vectors(pdev);
+ eni_vdpa->vectors = 0;
+ }
+}
+
+static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
+{
+ struct eni_vring *vring = arg;
+
+ if (vring->cb.callback)
+ return vring->cb.callback(vring->cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
+{
+ struct eni_vdpa *eni_vdpa = arg;
+
+ if (eni_vdpa->config_cb.callback)
+ return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i, ret, irq;
+ int queues = eni_vdpa->queues;
+ int vectors = queues + 1;
+
+ ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
+ if (ret != vectors) {
+ ENI_ERR(pdev,
+ "failed to allocate irq vectors want %d but %d\n",
+ vectors, ret);
+ return ret;
+ }
+
+ eni_vdpa->vectors = vectors;
+
+ for (i = 0; i < queues; i++) {
+ snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
+ "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
+ irq = pci_irq_vector(pdev, i);
+ ret = devm_request_irq(&pdev->dev, irq,
+ eni_vdpa_vq_handler,
+ 0, eni_vdpa->vring[i].msix_name,
+ &eni_vdpa->vring[i]);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_queue_vector(ldev, i, i);
+ eni_vdpa->vring[i].irq = irq;
+ }
+
+ snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
+ pci_name(pdev));
+ irq = pci_irq_vector(pdev, queues);
+ ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
+ eni_vdpa->msix_name, eni_vdpa);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_config_vector(ldev, queues);
+ eni_vdpa->config_irq = irq;
+
+ return 0;
+err:
+ eni_vdpa_free_irq(eni_vdpa);
+ return ret;
+}
+
+static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
+ !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
+ eni_vdpa_request_irq(eni_vdpa);
+ }
+
+ vp_legacy_set_status(ldev, status);
+
+ if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+ (s & VIRTIO_CONFIG_S_DRIVER_OK))
+ eni_vdpa_free_irq(eni_vdpa);
+}
+
+static int eni_vdpa_reset(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ vp_legacy_set_status(ldev, 0);
+
+ if (s & VIRTIO_CONFIG_S_DRIVER_OK)
+ eni_vdpa_free_irq(eni_vdpa);
+
+ return 0;
+}
+
+static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_vq_state *state)
+{
+ return -EOPNOTSUPP;
+}
+
+static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
+ const struct vdpa_vq_state *state)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ const struct vdpa_vq_state_split *split = &state->split;
+
+ /* ENI is built upon the legacy virtio-pci specification, which
+ * does not support setting the virtqueue state. But if the state
+ * happens to equal the device's initial state, we can let it go.
+ */
+ if (!vp_legacy_get_queue_enable(ldev, qid)
+ && split->avail_index == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+
+static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->vring[qid].cb = *cb;
+}
+
+static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
+ bool ready)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ /* ENI is a legacy virtio-pci device, and the legacy specification
+ * has no queue-enable operation. But we can disable a virtqueue
+ * by setting its address to 0.
+ */
+ if (!ready)
+ vp_legacy_set_queue_address(ldev, qid, 0);
+}
+
+static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_enable(ldev, qid);
+}
+
+static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
+ u32 num)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ struct pci_dev *pdev = ldev->pci_dev;
+ u16 n = vp_legacy_get_queue_size(ldev, qid);
+
+ /* ENI is a legacy virtio-pci device which does not allow the
+ * virtqueue size to be changed. Just report an error if someone
+ * tries to change it.
+ */
+ if (num != n)
+ ENI_ERR(pdev,
+ "not support to set vq %u fixed num %u to %u\n",
+ qid, n, num);
+}
+
+static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
+ u64 desc_area, u64 driver_area,
+ u64 device_area)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
+
+ vp_legacy_set_queue_address(ldev, qid, pfn);
+
+ return 0;
+}
+
+static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ iowrite16(qid, eni_vdpa->vring[qid].notify);
+}
+
+static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.device;
+}
+
+static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.vendor;
+}
+
+static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
+{
+ return PAGE_SIZE;
+}
+
+static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
+{
+ return sizeof(struct virtio_net_config);
+}
+
+
+static void eni_vdpa_get_config(struct vdpa_device *vdpa,
+ unsigned int offset,
+ void *buf, unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ *p++ = ioread8(ioaddr + i);
+}
+
+static void eni_vdpa_set_config(struct vdpa_device *vdpa,
+ unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ const u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ iowrite8(*p++, ioaddr + i);
+}
+
+static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->config_cb = *cb;
+}
+
+static const struct vdpa_config_ops eni_vdpa_ops = {
+ .get_features = eni_vdpa_get_features,
+ .set_features = eni_vdpa_set_features,
+ .get_status = eni_vdpa_get_status,
+ .set_status = eni_vdpa_set_status,
+ .reset = eni_vdpa_reset,
+ .get_vq_num_max = eni_vdpa_get_vq_num_max,
+ .get_vq_num_min = eni_vdpa_get_vq_num_min,
+ .get_vq_state = eni_vdpa_get_vq_state,
+ .set_vq_state = eni_vdpa_set_vq_state,
+ .set_vq_cb = eni_vdpa_set_vq_cb,
+ .set_vq_ready = eni_vdpa_set_vq_ready,
+ .get_vq_ready = eni_vdpa_get_vq_ready,
+ .set_vq_num = eni_vdpa_set_vq_num,
+ .set_vq_address = eni_vdpa_set_vq_address,
+ .kick_vq = eni_vdpa_kick_vq,
+ .get_device_id = eni_vdpa_get_device_id,
+ .get_vendor_id = eni_vdpa_get_vendor_id,
+ .get_vq_align = eni_vdpa_get_vq_align,
+ .get_config_size = eni_vdpa_get_config_size,
+ .get_config = eni_vdpa_get_config,
+ .set_config = eni_vdpa_set_config,
+ .set_config_cb = eni_vdpa_set_config_cb,
+ .get_vq_irq = eni_vdpa_get_vq_irq,
+};
+
+
+static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u32 features = vp_legacy_get_features(ldev);
+ u16 num = 2;
+
+ if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
+ __virtio16 max_virtqueue_pairs;
+
+ eni_vdpa_get_config(&eni_vdpa->vdpa,
+ offsetof(struct virtio_net_config, max_virtqueue_pairs),
+ &max_virtqueue_pairs,
+ sizeof(max_virtqueue_pairs));
+ num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
+ max_virtqueue_pairs);
+ }
+
+ if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+ num += 1;
+
+ return num;
+}
+
+static void eni_vdpa_free_irq_vectors(void *data)
+{
+ pci_free_irq_vectors(data);
+}
+
+static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct device *dev = &pdev->dev;
+ struct eni_vdpa *eni_vdpa;
+ struct virtio_pci_legacy_device *ldev;
+ int ret, i;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
+ dev, &eni_vdpa_ops, NULL, false);
+ if (IS_ERR(eni_vdpa)) {
+ ENI_ERR(pdev, "failed to allocate vDPA structure\n");
+ return PTR_ERR(eni_vdpa);
+ }
+
+ ldev = &eni_vdpa->ldev;
+ ldev->pci_dev = pdev;
+
+ ret = vp_legacy_probe(ldev);
+ if (ret) {
+ ENI_ERR(pdev, "failed to probe legacy PCI device\n");
+ goto err;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, eni_vdpa);
+
+ eni_vdpa->vdpa.dma_dev = &pdev->dev;
+ eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
+
+ ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
+ if (ret) {
+ ENI_ERR(pdev,
+ "failed for adding devres for freeing irq vectors\n");
+ goto err;
+ }
+
+ eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
+ sizeof(*eni_vdpa->vring),
+ GFP_KERNEL);
+ if (!eni_vdpa->vring) {
+ ret = -ENOMEM;
+ ENI_ERR(pdev, "failed to allocate virtqueues\n");
+ goto err;
+ }
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ }
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+
+ ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
+ if (ret) {
+ ENI_ERR(pdev, "failed to register to vdpa bus\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ put_device(&eni_vdpa->vdpa.dev);
+ return ret;
+}
+
+static void eni_vdpa_remove(struct pci_dev *pdev)
+{
+ struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
+
+ vdpa_unregister_device(&eni_vdpa->vdpa);
+ vp_legacy_remove(&eni_vdpa->ldev);
+}
+
+static struct pci_device_id eni_pci_ids[] = {
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_TRANS_ID_NET,
+ PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_ID_NET) },
+ { 0 },
+};
+
+static struct pci_driver eni_vdpa_driver = {
+ .name = "alibaba-eni-vdpa",
+ .id_table = eni_pci_ids,
+ .probe = eni_vdpa_probe,
+ .remove = eni_vdpa_remove,
+};
+
+module_pci_driver(eni_vdpa_driver);
+
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
+MODULE_LICENSE("GPL v2");
--
2.31.1
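
The queue and interrupt-vector accounting done by eni_vdpa_get_num_queues()
and eni_vdpa_request_irq() can be checked with a small standalone sketch. The
function names below are hypothetical stand-ins, and max_pairs stands for the
max_virtqueue_pairs value that the driver reads from the device config space.
The feature bit numbers are the ones from the virtio-net specification.

```c
#include <stdint.h>

/* virtio-net feature bit numbers from the virtio specification. */
#define VIRTIO_NET_F_CTRL_VQ	17
#define VIRTIO_NET_F_MQ		22

#define BIT_U32(n)	(1u << (n))

/*
 * Hypothetical stand-in for the driver's queue count: one RX/TX pair
 * by default, 2 * max_pairs with MQ, plus one control virtqueue when
 * CTRL_VQ is offered.
 */
static uint16_t count_queues(uint32_t features, uint16_t max_pairs)
{
	uint16_t num = 2;

	if (features & BIT_U32(VIRTIO_NET_F_MQ))
		num = 2 * max_pairs;
	if (features & BIT_U32(VIRTIO_NET_F_CTRL_VQ))
		num += 1;

	return num;
}

/* One MSI-X vector per virtqueue plus one for config changes. */
static int irq_vectors_needed(uint16_t queues)
{
	return queues + 1;
}
```

For example, a device offering MQ and CTRL_VQ with four queue pairs ends up
with nine virtqueues and ten MSI-X vectors.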

2021-09-29 06:15:38

by Wu Zongyong

Subject: [PATCH v4 6/7] vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE

This attribute advertises the minimum virtqueue size. The value is
0 by default, i.e. when the device does not implement get_vq_num_min.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/vdpa.c | 5 +++++
include/uapi/linux/vdpa.h | 1 +
2 files changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1dc121a07a93..6ed79fba33e4 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -492,6 +492,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
int flags, struct netlink_ext_ack *extack)
{
u16 max_vq_size;
+ u16 min_vq_size = 0;
u32 device_id;
u32 vendor_id;
void *hdr;
@@ -508,6 +509,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
device_id = vdev->config->get_device_id(vdev);
vendor_id = vdev->config->get_vendor_id(vdev);
max_vq_size = vdev->config->get_vq_num_max(vdev);
+ if (vdev->config->get_vq_num_min)
+ min_vq_size = vdev->config->get_vq_num_min(vdev);

err = -EMSGSIZE;
if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
@@ -520,6 +523,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
goto msg_err;
if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
goto msg_err;
+ if (nla_put_u16(msg, VDPA_ATTR_DEV_MIN_VQ_SIZE, min_vq_size))
+ goto msg_err;

genlmsg_end(msg, hdr);
return 0;
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 66a41e4ec163..e3b87879514c 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -32,6 +32,7 @@ enum vdpa_attr {
VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
VDPA_ATTR_DEV_MAX_VQS, /* u32 */
VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
+ VDPA_ATTR_DEV_MIN_VQ_SIZE, /* u16 */

/* new attributes must be added above here */
VDPA_ATTR_MAX,
--
2.31.1
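
The "0 by default" behaviour follows from get_vq_num_min being an optional
callback. A minimal userspace model of that pattern is sketched below. The
struct and function names (dev_ops, report_min_vq_size, fixed_1024) are
hypothetical and only mirror the shape of the NULL check in vdpa_dev_fill().

```c
#include <stddef.h>
#include <stdint.h>

struct dev;	/* opaque, hypothetical device type */

/* Hypothetical ops table with one mandatory and one optional callback. */
struct dev_ops {
	uint16_t (*get_vq_num_max)(struct dev *d);
	uint16_t (*get_vq_num_min)(struct dev *d);	/* may be NULL */
};

/* Report the minimum size, falling back to 0 when the callback is absent. */
static uint16_t report_min_vq_size(const struct dev_ops *ops, struct dev *d)
{
	uint16_t min_vq_size = 0;

	if (ops->get_vq_num_min)
		min_vq_size = ops->get_vq_num_min(d);

	return min_vq_size;
}

/* Example callback for a device with a fixed ring size of 1024. */
static uint16_t fixed_1024(struct dev *d)
{
	(void)d;
	return 1024;
}
```

A device that fills both slots with fixed_1024 reports min == max == 1024,
while one that leaves get_vq_num_min NULL reports a minimum of 0.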

2021-09-29 06:22:24

by Wu Zongyong

Subject: [PATCH v4 3/7] vp_vdpa: add vq irq offloading support

This patch implements the get_vq_irq() callback for virtio pci devices
to allow irq offloading.

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
index 5bcd00246d2e..e3ff7875e123 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
return vp_modern_get_status(mdev);
}

+static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ int irq = vp_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
{
struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
@@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
.get_config = vp_vdpa_get_config,
.set_config = vp_vdpa_set_config,
.set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
};

static void vp_vdpa_free_irq_vectors(void *data)
--
2.31.1

2021-09-29 06:23:57

by Wu Zongyong

Subject: [PATCH v4 5/7] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..8aa4ebe2a2a2 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
/* Assume split virtqueue, switch to packed if necessary */
struct vdpa_vq_state state = {0};
unsigned long flags;
- u32 align, num;
+ u32 align, max_num, min_num = 0;
+ bool may_reduce_num = true;
int err;

if (!name)
@@ -163,22 +164,36 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
if (!info)
return ERR_PTR(-ENOMEM);

- num = ops->get_vq_num_max(vdpa);
- if (num == 0) {
+ max_num = ops->get_vq_num_max(vdpa);
+ if (max_num == 0) {
err = -ENOENT;
goto error_new_virtqueue;
}

+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdpa);
+ if (min_num > max_num) {
+ err = -ENOENT;
+ goto error_new_virtqueue;
+ }
+
+ may_reduce_num = (max_num == min_num) ? false : true;
+
/* Create the vring */
align = ops->get_vq_align(vdpa);
- vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ vq = vring_create_virtqueue(index, max_num, align, vdev,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
goto error_new_virtqueue;
}

+ if (virtqueue_get_vring_size(vq) < min_num) {
+ err = -EINVAL;
+ goto err_vq;
+ }
+
/* Setup virtqueue callback */
cb.callback = virtio_vdpa_virtqueue_cb;
cb.private = info;
--
2.31.1

2021-09-29 06:24:54

by Wu Zongyong

Subject: [PATCH v4 2/7] vdpa: fix typo

Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 3972ab765de1..a896ee021e5f 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -257,7 +257,7 @@ struct vdpa_config_ops {
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
/* vq irq is not expected to be changed once DRIVER_OK is set */
- int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
+ int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
--
2.31.1

2021-09-29 06:51:39

by Wu Zongyong

Subject: [PATCH v4 1/7] virtio-pci: introduce legacy device module

Split the common code out of virtio-pci-legacy so that the vDPA driver can
reuse it later.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/Kconfig | 10 ++
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 +++---------
drivers/virtio/virtio_pci_legacy_dev.c | 220 +++++++++++++++++++++++++
include/linux/virtio_pci_legacy.h | 44 +++++
7 files changed, 312 insertions(+), 83 deletions(-)
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index ce1b3f6ec325..8fcf94cd2c96 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
PCI device with possible vendor specific extensions. Any
module that selects this module must depend on PCI.

+config VIRTIO_PCI_LIB_LEGACY
+ tristate
+ help
+ Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
+ implementation.
+ This module implements the basic probe and control for devices
+ which are based on legacy PCI device. Any module that selects this
+ module must depend on PCI.
+
menuconfig VIRTIO_MENU
bool "Virtio drivers"
default y
@@ -43,6 +52,7 @@ config VIRTIO_PCI_LEGACY
bool "Support for legacy virtio draft 0.9.X and older devices"
default y
depends on VIRTIO_PCI
+ select VIRTIO_PCI_LIB_LEGACY
help
Virtio PCI Card 0.9.X Draft (circa 2014) and older device support.

diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 699bbea0465f..0a82d0873248 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
+obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index b35bb2d57f62..d724f676608b 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -549,6 +549,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,

pci_set_master(pci_dev);

+ vp_dev->is_legacy = vp_dev->ldev.ioaddr ? true : false;
+
rc = register_virtio_device(&vp_dev->vdev);
reg_dev = vp_dev;
if (rc)
@@ -557,10 +559,10 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
return 0;

err_register:
- if (vp_dev->ioaddr)
- virtio_pci_legacy_remove(vp_dev);
+ if (vp_dev->is_legacy)
+ virtio_pci_legacy_remove(vp_dev);
else
- virtio_pci_modern_remove(vp_dev);
+ virtio_pci_modern_remove(vp_dev);
err_probe:
pci_disable_device(pci_dev);
err_enable_device:
@@ -587,7 +589,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)

unregister_virtio_device(&vp_dev->vdev);

- if (vp_dev->ioaddr)
+ if (vp_dev->is_legacy)
virtio_pci_legacy_remove(vp_dev);
else
virtio_pci_modern_remove(vp_dev);
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index beec047a8f8d..eb17a29fc7ef 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -25,6 +25,7 @@
#include <linux/virtio_config.h>
#include <linux/virtio_ring.h>
#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
#include <linux/virtio_pci_modern.h>
#include <linux/highmem.h>
#include <linux/spinlock.h>
@@ -44,16 +45,14 @@ struct virtio_pci_vq_info {
struct virtio_pci_device {
struct virtio_device vdev;
struct pci_dev *pci_dev;
+ struct virtio_pci_legacy_device ldev;
struct virtio_pci_modern_device mdev;

- /* In legacy mode, these two point to within ->legacy. */
+ bool is_legacy;
+
/* Where to read and clear interrupt */
u8 __iomem *isr;

- /* Legacy only field */
- /* the IO mapping for the PCI config space */
- void __iomem *ioaddr;
-
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
struct list_head virtqueues;
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index d62e9835aeec..82eb437ad920 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -14,6 +14,7 @@
* Michael S. Tsirkin <[email protected]>
*/

+#include "linux/virtio_pci_legacy.h"
#include "virtio_pci_common.h"

/* virtio config->get_features() implementation */
@@ -23,7 +24,7 @@ static u64 vp_get_features(struct virtio_device *vdev)

/* When someone needs more than 32 feature bits, we'll need to
* steal a bit to indicate that the rest are somewhere else. */
- return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+ return vp_legacy_get_features(&vp_dev->ldev);
}

/* virtio config->finalize_features() implementation */
@@ -38,7 +39,7 @@ static int vp_finalize_features(struct virtio_device *vdev)
BUG_ON((u32)vdev->features != vdev->features);

/* We only support 32 feature bits. */
- iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+ vp_legacy_set_features(&vp_dev->ldev, vdev->features);

return 0;
}
@@ -48,7 +49,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
u8 *ptr = buf;
@@ -64,7 +65,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
const void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
const u8 *ptr = buf;
@@ -78,7 +79,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
static u8 vp_get_status(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ return vp_legacy_get_status(&vp_dev->ldev);
}

static void vp_set_status(struct virtio_device *vdev, u8 status)
@@ -86,28 +87,24 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* We should never be setting status to 0. */
BUG_ON(status == 0);
- iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, status);
}

static void vp_reset(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* 0 status means a reset. */
- iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, 0);
/* Flush out the status write, and flush in device writes,
* including MSi-X interrupts, if any. */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_get_status(&vp_dev->ldev);
/* Flush pending VQ/configuration callbacks. */
vp_synchronize_vectors(vdev);
}

static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
{
- /* Setup the vector used for configuration events */
- iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
- /* Verify we had enough resources to assign the vector */
- /* Will also flush the write out to device */
- return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ return vp_legacy_config_vector(&vp_dev->ldev, vector);
}

static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
@@ -123,12 +120,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
int err;
u64 q_pfn;

- /* Select the queue we're interested in */
- iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
/* Check if queue is either not available or already active. */
- num = ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
- if (!num || ioread32(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN))
+ num = vp_legacy_get_queue_size(&vp_dev->ldev, index);
+ if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
return ERR_PTR(-ENOENT);

info->msix_vector = msix_vec;
@@ -151,13 +145,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
}

/* activate the queue */
- iowrite32(q_pfn, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);

- vq->priv = (void __force *)vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ vq->priv = (void __force *)vp_dev->ldev.ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;

if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
- iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
- msix_vec = ioread16(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ msix_vec = vp_legacy_queue_vector(&vp_dev->ldev, index, msix_vec);
if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
err = -EBUSY;
goto out_deactivate;
@@ -167,7 +160,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
return vq;

out_deactivate:
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, 0);
out_del_vq:
vring_del_virtqueue(vq);
return ERR_PTR(err);
@@ -178,17 +171,15 @@ static void del_vq(struct virtio_pci_vq_info *info)
struct virtqueue *vq = info->vq;
struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);

- iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
if (vp_dev->msix_enabled) {
- iowrite16(VIRTIO_MSI_NO_VECTOR,
- vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ vp_legacy_queue_vector(&vp_dev->ldev, vq->index,
+ VIRTIO_MSI_NO_VECTOR);
/* Flush the write out to device */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
+ ioread8(vp_dev->ldev.ioaddr + VIRTIO_PCI_ISR);
}

/* Select and deactivate the queue */
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, vq->index, 0);

vring_del_virtqueue(vq);
}
@@ -211,51 +202,18 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
/* the PCI probing function */
int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
{
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
struct pci_dev *pci_dev = vp_dev->pci_dev;
int rc;

- /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
- if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
- return -ENODEV;
-
- if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) {
- printk(KERN_ERR "virtio_pci: expected ABI version %d, got %d\n",
- VIRTIO_PCI_ABI_VERSION, pci_dev->revision);
- return -ENODEV;
- }
-
- rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
- if (rc) {
- rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
- } else {
- /*
- * The virtio ring base address is expressed as a 32-bit PFN,
- * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
- */
- dma_set_coherent_mask(&pci_dev->dev,
- DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
- }
-
- if (rc)
- dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+ ldev->pci_dev = pci_dev;

- rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ rc = vp_legacy_probe(ldev);
if (rc)
return rc;

- rc = -ENOMEM;
- vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
- if (!vp_dev->ioaddr)
- goto err_iomap;
-
- vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR;
-
- /* we use the subsystem vendor/device id as the virtio vendor/device
- * id. this allows us to use the same PCI vendor/device id for all
- * virtio devices and to identify the particular virtio driver by
- * the subsystem ids */
- vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
- vp_dev->vdev.id.device = pci_dev->subsystem_device;
+ vp_dev->isr = ldev->isr;
+ vp_dev->vdev.id = ldev->id;

vp_dev->vdev.config = &virtio_pci_config_ops;

@@ -264,16 +222,11 @@ int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
vp_dev->del_vq = del_vq;

return 0;
-
-err_iomap:
- pci_release_region(pci_dev, 0);
- return rc;
}

void virtio_pci_legacy_remove(struct virtio_pci_device *vp_dev)
{
- struct pci_dev *pci_dev = vp_dev->pci_dev;
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;

- pci_iounmap(pci_dev, vp_dev->ioaddr);
- pci_release_region(pci_dev, 0);
+ vp_legacy_remove(ldev);
}
diff --git a/drivers/virtio/virtio_pci_legacy_dev.c b/drivers/virtio/virtio_pci_legacy_dev.c
new file mode 100644
index 000000000000..9b97680dd02b
--- /dev/null
+++ b/drivers/virtio/virtio_pci_legacy_dev.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include "linux/virtio_pci.h"
+#include <linux/virtio_pci_legacy.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+
+/*
+ * vp_legacy_probe: probe the legacy virtio pci device. Note that the
+ * caller is required to enable the PCI device before calling this function.
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns 0 on success, or a negative errno on failure
+ */
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+ int rc;
+
+ /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
+ if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
+ return -ENODEV;
+
+ if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION)
+ return -ENODEV;
+
+ rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
+ if (rc) {
+ rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
+ } else {
+ /*
+ * The virtio ring base address is expressed as a 32-bit PFN,
+ * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
+ */
+ dma_set_coherent_mask(&pci_dev->dev,
+ DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
+ }
+
+ if (rc)
+ dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+
+ rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ if (rc)
+ return rc;
+
+ ldev->ioaddr = pci_iomap(pci_dev, 0, 0);
+ if (!ldev->ioaddr)
+ goto err_iomap;
+
+ ldev->isr = ldev->ioaddr + VIRTIO_PCI_ISR;
+
+ ldev->id.vendor = pci_dev->subsystem_vendor;
+ ldev->id.device = pci_dev->subsystem_device;
+
+ return 0;
+err_iomap:
+ pci_release_region(pci_dev, 0);
+ return -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(vp_legacy_probe);
+
+/*
+ * vp_legacy_remove: remove and clean up the legacy virtio pci device
+ * @ldev: the legacy virtio-pci device
+ */
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+
+ pci_iounmap(pci_dev, ldev->ioaddr);
+ pci_release_region(pci_dev, 0);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_remove);
+
+/*
+ * vp_legacy_get_features - get features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the features read from the device
+ */
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev)
+{
+
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_features);
+
+/*
+ * vp_legacy_get_driver_features - get driver features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_driver_features);
+
+/*
+ * vp_legacy_set_features - set features to device
+ * @ldev: the legacy virtio-pci device
+ * @features: the features set to device
+ */
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features)
+{
+ iowrite32(features, ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_features);
+
+/*
+ * vp_legacy_get_status - get the device status
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the status read from device
+ */
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread8(ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_status);
+
+/*
+ * vp_legacy_set_status - set status to device
+ * @ldev: the legacy virtio-pci device
+ * @status: the status set to device
+ */
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status)
+{
+ iowrite8(status, ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_status);
+
+/*
+ * vp_legacy_queue_vector - set the MSIX vector for a specific virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: queue index
+ * @vector: the queue vector
+ *
+ * Returns the queue vector read back from the device
+ */
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 index, u16 vector)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ /* Flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_queue_vector);
+
+/*
+ * vp_legacy_config_vector - set the vector for config interrupt
+ * @ldev: the legacy virtio-pci device
+ * @vector: the config vector
+ *
+ * Returns the config vector read from the device
+ */
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector)
+{
+ /* Setup the vector used for configuration events */
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ /* Verify we had enough resources to assign the vector */
+ /* Will also flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_config_vector);
+
+/*
+ * vp_legacy_set_queue_address - set the virtqueue address
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ * @queue_pfn: pfn of the virtqueue
+ */
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite32(queue_pfn, ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_queue_address);
+
+/*
+ * vp_legacy_get_queue_enable - query whether a virtqueue is enabled
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns whether a virtqueue is enabled or not
+ */
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_enable);
+
+/*
+ * vp_legacy_get_queue_size - get size for a virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns the size of the virtqueue
+ */
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread16(ldev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_size);
+
+MODULE_VERSION("0.1");
+MODULE_DESCRIPTION("Legacy Virtio PCI Device");
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/virtio_pci_legacy.h b/include/linux/virtio_pci_legacy.h
new file mode 100644
index 000000000000..ee2c6157215f
--- /dev/null
+++ b/include/linux/virtio_pci_legacy.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_VIRTIO_PCI_LEGACY_H
+#define _LINUX_VIRTIO_PCI_LEGACY_H
+
+#include "linux/mod_devicetable.h"
+#include <linux/pci.h>
+#include <linux/virtio_pci.h>
+
+struct virtio_pci_legacy_device {
+ struct pci_dev *pci_dev;
+
+ /* Where to read and clear interrupt */
+ u8 __iomem *isr;
+ /* The IO mapping for the PCI config space (legacy mode only) */
+ void __iomem *ioaddr;
+
+ struct virtio_device_id id;
+};
+
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev);
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features);
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status);
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 vector);
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector);
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn);
+void vp_legacy_set_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx, bool enable);
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+void vp_legacy_set_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 size);
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev);
+
+#endif
--
2.31.1
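
The helpers in virtio_pci_legacy_dev.c all follow the same pattern: write the queue index to VIRTIO_PCI_QUEUE_SEL, then access the per-queue register. The following is a minimal userspace sketch of that pattern, not kernel code. The struct mock_legacy_dev, the mock_* functions and the MOCK_* constants are hypothetical names made up for illustration; the register file is a plain struct instead of an I/O mapping, and returning a size of 0 for a missing queue is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for a legacy device (illustration only). */
#define MOCK_NUM_QUEUES       3
#define MOCK_QUEUE_ADDR_SHIFT 12 /* mirrors VIRTIO_PCI_QUEUE_ADDR_SHIFT */

struct mock_legacy_dev {
	uint16_t queue_sel;                  /* models VIRTIO_PCI_QUEUE_SEL */
	uint16_t queue_num[MOCK_NUM_QUEUES]; /* models VIRTIO_PCI_QUEUE_NUM */
	uint32_t queue_pfn[MOCK_NUM_QUEUES]; /* models VIRTIO_PCI_QUEUE_PFN */
};

/* Select the queue, then read its fixed size; 0 means "no such queue". */
uint16_t mock_get_queue_size(struct mock_legacy_dev *d, uint16_t index)
{
	d->queue_sel = index;
	if (d->queue_sel >= MOCK_NUM_QUEUES)
		return 0;
	return d->queue_num[d->queue_sel];
}

/* Select the queue, then write its PFN; writing 0 deactivates the queue. */
void mock_set_queue_address(struct mock_legacy_dev *d, uint16_t index,
			    uint32_t pfn)
{
	d->queue_sel = index;
	if (d->queue_sel < MOCK_NUM_QUEUES)
		d->queue_pfn[d->queue_sel] = pfn;
}

/* A legacy queue counts as enabled when its PFN register is non-zero. */
bool mock_get_queue_enable(struct mock_legacy_dev *d, uint16_t index)
{
	d->queue_sel = index;
	if (d->queue_sel >= MOCK_NUM_QUEUES)
		return false;
	return d->queue_pfn[d->queue_sel] != 0;
}

/*
 * The ring address is programmed as a 32-bit page frame number, so only
 * addresses below 1 << (32 + MOCK_QUEUE_ADDR_SHIFT) can be expressed.
 */
uint32_t mock_addr_to_pfn(uint64_t ring_addr)
{
	return (uint32_t)(ring_addr >> MOCK_QUEUE_ADDR_SHIFT);
}
```

Because the select register is shared, each helper rewrites it before touching a per-queue register. That is why setup_vq() and del_vq() in the patch can drop their explicit VIRTIO_PCI_QUEUE_SEL writes.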

2021-09-29 12:57:17

by kernel test robot

Subject: Re: [PATCH v3 7/7] eni_vdpa: add vDPA driver for Alibaba ENI

Hi Wu,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.15-rc3 next-20210922]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Wu-Zongyong/virtio-pci-introduce-legacy-device-module/20210929-115033
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a4e6f95a891ac08bd09d62e3e6dae239b150f4c1
config: xtensa-allyesconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/86ed35603fb93a4bc8c8929ff89edd5f6556ca44
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Wu-Zongyong/virtio-pci-introduce-legacy-device-module/20210929-115033
git checkout 86ed35603fb93a4bc8c8929ff89edd5f6556ca44
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=xtensa

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> drivers/vdpa/alibaba/eni_vdpa.c:446:13: error: 'eni_vdpa_free_irq_vectors' defined but not used [-Werror=unused-function]
446 | static void eni_vdpa_free_irq_vectors(void *data)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/vdpa/alibaba/eni_vdpa.c:423:12: error: 'eni_vdpa_get_num_queues' defined but not used [-Werror=unused-function]
423 | static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
| ^~~~~~~~~~~~~~~~~~~~~~~
>> drivers/vdpa/alibaba/eni_vdpa.c:396:37: error: 'eni_vdpa_ops' defined but not used [-Werror=unused-const-variable=]
396 | static const struct vdpa_config_ops eni_vdpa_ops = {
| ^~~~~~~~~~~~
cc1: all warnings being treated as errors


vim +/eni_vdpa_free_irq_vectors +446 drivers/vdpa/alibaba/eni_vdpa.c

395
> 396 static const struct vdpa_config_ops eni_vdpa_ops = {
397 .get_features = eni_vdpa_get_features,
398 .set_features = eni_vdpa_set_features,
399 .get_status = eni_vdpa_get_status,
400 .set_status = eni_vdpa_set_status,
401 .reset = eni_vdpa_reset,
402 .get_vq_num_max = eni_vdpa_get_vq_num_max,
403 .get_vq_num_min = eni_vdpa_get_vq_num_min,
404 .get_vq_state = eni_vdpa_get_vq_state,
405 .set_vq_state = eni_vdpa_set_vq_state,
406 .set_vq_cb = eni_vdpa_set_vq_cb,
407 .set_vq_ready = eni_vdpa_set_vq_ready,
408 .get_vq_ready = eni_vdpa_get_vq_ready,
409 .set_vq_num = eni_vdpa_set_vq_num,
410 .set_vq_address = eni_vdpa_set_vq_address,
411 .kick_vq = eni_vdpa_kick_vq,
412 .get_device_id = eni_vdpa_get_device_id,
413 .get_vendor_id = eni_vdpa_get_vendor_id,
414 .get_vq_align = eni_vdpa_get_vq_align,
415 .get_config_size = eni_vdpa_get_config_size,
416 .get_config = eni_vdpa_get_config,
417 .set_config = eni_vdpa_set_config,
418 .set_config_cb = eni_vdpa_set_config_cb,
419 .get_vq_irq = eni_vdpa_get_vq_irq,
420 };
421
422
> 423 static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
424 {
425 struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
426 u32 features = vp_legacy_get_features(ldev);
427 u16 num = 2;
428
429 if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
430 __virtio16 max_virtqueue_pairs;
431
432 eni_vdpa_get_config(&eni_vdpa->vdpa,
433 offsetof(struct virtio_net_config, max_virtqueue_pairs),
434 &max_virtqueue_pairs,
435 sizeof(max_virtqueue_pairs));
436 num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
437 max_virtqueue_pairs);
438 }
439
440 if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
441 num += 1;
442
443 return num;
444 }
445
> 446 static void eni_vdpa_free_irq_vectors(void *data)
447 {
448 pci_free_irq_vectors(data);
449 }
450
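
For reference, the queue-count logic of eni_vdpa_get_num_queues() shown above can be sketched as a standalone function. This is only an illustration under stated assumptions: sketch_num_queues() is a hypothetical name, and max_virtqueue_pairs is passed in as a plain host-endian value instead of being read from the device config space. The feature bit numbers are the ones defined for virtio-net.

```c
#include <assert.h>
#include <stdint.h>

/* virtio-net feature bit numbers */
#define VIRTIO_NET_F_CTRL_VQ 17
#define VIRTIO_NET_F_MQ      22

/*
 * Hypothetical standalone version of the queue-count computation:
 * one RX/TX pair by default, 2 * max_virtqueue_pairs with multiqueue,
 * plus one extra queue when the control virtqueue is offered.
 */
uint16_t sketch_num_queues(uint64_t features, uint16_t max_virtqueue_pairs)
{
	uint16_t num = 2;

	if (features & (1ULL << VIRTIO_NET_F_MQ))
		num = 2 * max_virtqueue_pairs;

	if (features & (1ULL << VIRTIO_NET_F_CTRL_VQ))
		num += 1;

	return num;
}
```

The report above only says that the function is never referenced in this configuration; it does not show why, so the sketch says nothing about the cause of the build error.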

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (4.42 kB)
.config.gz (67.34 kB)

2021-10-11 04:15:52

by Jason Wang

Subject: Re: [PATCH v4 2/7] vdpa: fix typo


On 2021/9/29 at 2:11, Wu Zongyong wrote:
> Signed-off-by: Wu Zongyong <[email protected]>


Acked-by: Jason Wang <[email protected]>


> ---
> include/linux/vdpa.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 3972ab765de1..a896ee021e5f 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -257,7 +257,7 @@ struct vdpa_config_ops {
> struct vdpa_notification_area
> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
> /* vq irq is not expected to be changed once DRIVER_OK is set */
> - int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
> + int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);
>
> /* Device ops */
> u32 (*get_vq_align)(struct vdpa_device *vdev);

2021-10-11 05:30:40

by Jason Wang

Subject: Re: [PATCH v4 1/7] virtio-pci: introduce legacy device module


On 2021/9/29 at 2:11, Wu Zongyong wrote:
> Split common codes from virtio-pci-legacy so vDPA driver can reuse it
> later.
>
> Signed-off-by: Wu Zongyong <[email protected]>


Acked-by: Jason Wang <[email protected]>


> ---
> drivers/virtio/Kconfig | 10 ++
> drivers/virtio/Makefile | 1 +
> drivers/virtio/virtio_pci_common.c | 10 +-
> drivers/virtio/virtio_pci_common.h | 9 +-
> drivers/virtio/virtio_pci_legacy.c | 101 +++---------
> drivers/virtio/virtio_pci_legacy_dev.c | 220 +++++++++++++++++++++++++
> include/linux/virtio_pci_legacy.h | 44 +++++
> 7 files changed, 312 insertions(+), 83 deletions(-)
> create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> create mode 100644 include/linux/virtio_pci_legacy.h
>
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index ce1b3f6ec325..8fcf94cd2c96 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
> PCI device with possible vendor specific extensions. Any
> module that selects this module must depend on PCI.
>
> +config VIRTIO_PCI_LIB_LEGACY
> + tristate
> + help
> + Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
> + implementation.
> + This module implements the basic probe and control for devices
> + which are based on legacy PCI device. Any module that selects this
> + module must depend on PCI.
> +
> menuconfig VIRTIO_MENU
> bool "Virtio drivers"
> default y
> @@ -43,6 +52,7 @@ config VIRTIO_PCI_LEGACY
> bool "Support for legacy virtio draft 0.9.X and older devices"
> default y
> depends on VIRTIO_PCI
> + select VIRTIO_PCI_LIB_LEGACY
> help
> Virtio PCI Card 0.9.X Draft (circa 2014) and older device support.
>
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 699bbea0465f..0a82d0873248 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
> +obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
> obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
> obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
> virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index b35bb2d57f62..d724f676608b 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -549,6 +549,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
>
> pci_set_master(pci_dev);
>
> + vp_dev->is_legacy = vp_dev->ldev.ioaddr ? true : false;
> +
> rc = register_virtio_device(&vp_dev->vdev);
> reg_dev = vp_dev;
> if (rc)
> @@ -557,10 +559,10 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
> return 0;
>
> err_register:
> - if (vp_dev->ioaddr)
> - virtio_pci_legacy_remove(vp_dev);
> + if (vp_dev->is_legacy)
> + virtio_pci_legacy_remove(vp_dev);
> else
> - virtio_pci_modern_remove(vp_dev);
> + virtio_pci_modern_remove(vp_dev);
> err_probe:
> pci_disable_device(pci_dev);
> err_enable_device:
> @@ -587,7 +589,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
>
> unregister_virtio_device(&vp_dev->vdev);
>
> - if (vp_dev->ioaddr)
> + if (vp_dev->is_legacy)
> virtio_pci_legacy_remove(vp_dev);
> else
> virtio_pci_modern_remove(vp_dev);
> diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
> index beec047a8f8d..eb17a29fc7ef 100644
> --- a/drivers/virtio/virtio_pci_common.h
> +++ b/drivers/virtio/virtio_pci_common.h
> @@ -25,6 +25,7 @@
> #include <linux/virtio_config.h>
> #include <linux/virtio_ring.h>
> #include <linux/virtio_pci.h>
> +#include <linux/virtio_pci_legacy.h>
> #include <linux/virtio_pci_modern.h>
> #include <linux/highmem.h>
> #include <linux/spinlock.h>
> @@ -44,16 +45,14 @@ struct virtio_pci_vq_info {
> struct virtio_pci_device {
> struct virtio_device vdev;
> struct pci_dev *pci_dev;
> + struct virtio_pci_legacy_device ldev;
> struct virtio_pci_modern_device mdev;
>
> - /* In legacy mode, these two point to within ->legacy. */
> + bool is_legacy;
> +
> /* Where to read and clear interrupt */
> u8 __iomem *isr;
>
> - /* Legacy only field */
> - /* the IO mapping for the PCI config space */
> - void __iomem *ioaddr;
> -
> /* a list of queues so we can dispatch IRQs */
> spinlock_t lock;
> struct list_head virtqueues;
> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> index d62e9835aeec..82eb437ad920 100644
> --- a/drivers/virtio/virtio_pci_legacy.c
> +++ b/drivers/virtio/virtio_pci_legacy.c
> @@ -14,6 +14,7 @@
> * Michael S. Tsirkin <[email protected]>
> */
>
> +#include "linux/virtio_pci_legacy.h"
> #include "virtio_pci_common.h"
>
> /* virtio config->get_features() implementation */
> @@ -23,7 +24,7 @@ static u64 vp_get_features(struct virtio_device *vdev)
>
> /* When someone needs more than 32 feature bits, we'll need to
> * steal a bit to indicate that the rest are somewhere else. */
> - return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
> + return vp_legacy_get_features(&vp_dev->ldev);
> }
>
> /* virtio config->finalize_features() implementation */
> @@ -38,7 +39,7 @@ static int vp_finalize_features(struct virtio_device *vdev)
> BUG_ON((u32)vdev->features != vdev->features);
>
> /* We only support 32 feature bits. */
> - iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
> + vp_legacy_set_features(&vp_dev->ldev, vdev->features);
>
> return 0;
> }
> @@ -48,7 +49,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
> void *buf, unsigned len)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> - void __iomem *ioaddr = vp_dev->ioaddr +
> + void __iomem *ioaddr = vp_dev->ldev.ioaddr +
> VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
> offset;
> u8 *ptr = buf;
> @@ -64,7 +65,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
> const void *buf, unsigned len)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> - void __iomem *ioaddr = vp_dev->ioaddr +
> + void __iomem *ioaddr = vp_dev->ldev.ioaddr +
> VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
> offset;
> const u8 *ptr = buf;
> @@ -78,7 +79,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
> static u8 vp_get_status(struct virtio_device *vdev)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> - return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + return vp_legacy_get_status(&vp_dev->ldev);
> }
>
> static void vp_set_status(struct virtio_device *vdev, u8 status)
> @@ -86,28 +87,24 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> /* We should never be setting status to 0. */
> BUG_ON(status == 0);
> - iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + vp_legacy_set_status(&vp_dev->ldev, status);
> }
>
> static void vp_reset(struct virtio_device *vdev)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> /* 0 status means a reset. */
> - iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + vp_legacy_set_status(&vp_dev->ldev, 0);
> /* Flush out the status write, and flush in device writes,
> * including MSi-X interrupts, if any. */
> - ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + vp_legacy_get_status(&vp_dev->ldev);
> /* Flush pending VQ/configuration callbacks. */
> vp_synchronize_vectors(vdev);
> }
>
> static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
> {
> - /* Setup the vector used for configuration events */
> - iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> - /* Verify we had enough resources to assign the vector */
> - /* Will also flush the write out to device */
> - return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> + return vp_legacy_config_vector(&vp_dev->ldev, vector);
> }
>
> static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> @@ -123,12 +120,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> int err;
> u64 q_pfn;
>
> - /* Select the queue we're interested in */
> - iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> -
> /* Check if queue is either not available or already active. */
> - num = ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
> - if (!num || ioread32(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN))
> + num = vp_legacy_get_queue_size(&vp_dev->ldev, index);
> + if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
> return ERR_PTR(-ENOENT);
>
> info->msix_vector = msix_vec;
> @@ -151,13 +145,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> }
>
> /* activate the queue */
> - iowrite32(q_pfn, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> + vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);
>
> - vq->priv = (void __force *)vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> + vq->priv = (void __force *)vp_dev->ldev.ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
>
> if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
> - iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> - msix_vec = ioread16(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> + msix_vec = vp_legacy_queue_vector(&vp_dev->ldev, index, msix_vec);
> if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
> err = -EBUSY;
> goto out_deactivate;
> @@ -167,7 +160,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> return vq;
>
> out_deactivate:
> - iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> + vp_legacy_set_queue_address(&vp_dev->ldev, index, 0);
> out_del_vq:
> vring_del_virtqueue(vq);
> return ERR_PTR(err);
> @@ -178,17 +171,15 @@ static void del_vq(struct virtio_pci_vq_info *info)
> struct virtqueue *vq = info->vq;
> struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
>
> - iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> -
> if (vp_dev->msix_enabled) {
> - iowrite16(VIRTIO_MSI_NO_VECTOR,
> - vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> + vp_legacy_queue_vector(&vp_dev->ldev, vq->index,
> + VIRTIO_MSI_NO_VECTOR);
> /* Flush the write out to device */
> - ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
> + ioread8(vp_dev->ldev.ioaddr + VIRTIO_PCI_ISR);
> }
>
> /* Select and deactivate the queue */
> - iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> + vp_legacy_set_queue_address(&vp_dev->ldev, vq->index, 0);
>
> vring_del_virtqueue(vq);
> }
> @@ -211,51 +202,18 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
> /* the PCI probing function */
> int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
> {
> + struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
> struct pci_dev *pci_dev = vp_dev->pci_dev;
> int rc;
>
> - /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
> - if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
> - return -ENODEV;
> -
> - if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) {
> - printk(KERN_ERR "virtio_pci: expected ABI version %d, got %d\n",
> - VIRTIO_PCI_ABI_VERSION, pci_dev->revision);
> - return -ENODEV;
> - }
> -
> - rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
> - if (rc) {
> - rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
> - } else {
> - /*
> - * The virtio ring base address is expressed as a 32-bit PFN,
> - * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
> - */
> - dma_set_coherent_mask(&pci_dev->dev,
> - DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
> - }
> -
> - if (rc)
> - dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
> + ldev->pci_dev = pci_dev;
>
> - rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> + rc = vp_legacy_probe(ldev);
> if (rc)
> return rc;
>
> - rc = -ENOMEM;
> - vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
> - if (!vp_dev->ioaddr)
> - goto err_iomap;
> -
> - vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR;
> -
> - /* we use the subsystem vendor/device id as the virtio vendor/device
> - * id. this allows us to use the same PCI vendor/device id for all
> - * virtio devices and to identify the particular virtio driver by
> - * the subsystem ids */
> - vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
> - vp_dev->vdev.id.device = pci_dev->subsystem_device;
> + vp_dev->isr = ldev->isr;
> + vp_dev->vdev.id = ldev->id;
>
> vp_dev->vdev.config = &virtio_pci_config_ops;
>
> @@ -264,16 +222,11 @@ int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
> vp_dev->del_vq = del_vq;
>
> return 0;
> -
> -err_iomap:
> - pci_release_region(pci_dev, 0);
> - return rc;
> }
>
> void virtio_pci_legacy_remove(struct virtio_pci_device *vp_dev)
> {
> - struct pci_dev *pci_dev = vp_dev->pci_dev;
> + struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
>
> - pci_iounmap(pci_dev, vp_dev->ioaddr);
> - pci_release_region(pci_dev, 0);
> + vp_legacy_remove(ldev);
> }
> diff --git a/drivers/virtio/virtio_pci_legacy_dev.c b/drivers/virtio/virtio_pci_legacy_dev.c
> new file mode 100644
> index 000000000000..9b97680dd02b
> --- /dev/null
> +++ b/drivers/virtio/virtio_pci_legacy_dev.c
> @@ -0,0 +1,220 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +
> +#include "linux/virtio_pci.h"
> +#include <linux/virtio_pci_legacy.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +
> +
> +/*
> + * vp_legacy_probe: probe the legacy virtio pci device. Note that the
> + * caller is required to enable the PCI device before calling this function.
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns 0 on success, otherwise a negative error code
> + */
> +int vp_legacy_probe(struct virtio_pci_legacy_device *ldev)
> +{
> + struct pci_dev *pci_dev = ldev->pci_dev;
> + int rc;
> +
> + /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
> + if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
> + return -ENODEV;
> +
> + if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION)
> + return -ENODEV;
> +
> + rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
> + if (rc) {
> + rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
> + } else {
> + /*
> + * The virtio ring base address is expressed as a 32-bit PFN,
> + * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
> + */
> + dma_set_coherent_mask(&pci_dev->dev,
> + DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
> + }
> +
> + if (rc)
> + dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
> +
> + rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> + if (rc)
> + return rc;
> + rc = -ENOMEM;
> + ldev->ioaddr = pci_iomap(pci_dev, 0, 0);
> + if (!ldev->ioaddr)
> + goto err_iomap;
> +
> + ldev->isr = ldev->ioaddr + VIRTIO_PCI_ISR;
> +
> + ldev->id.vendor = pci_dev->subsystem_vendor;
> + ldev->id.device = pci_dev->subsystem_device;
> +
> + return 0;
> +err_iomap:
> + pci_release_region(pci_dev, 0);
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_probe);
> +
> +/*
> + * vp_legacy_remove: remove and clean up the legacy virtio pci device
> + * @ldev: the legacy virtio-pci device
> + */
> +void vp_legacy_remove(struct virtio_pci_legacy_device *ldev)
> +{
> + struct pci_dev *pci_dev = ldev->pci_dev;
> +
> + pci_iounmap(pci_dev, ldev->ioaddr);
> + pci_release_region(pci_dev, 0);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_remove);
> +
> +/*
> + * vp_legacy_get_features - get features from device
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns the features read from the device
> + */
> +u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev)
> +{
> +
> + return ioread32(ldev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_features);
> +
> +/*
> + * vp_legacy_get_driver_features - get driver features from device
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns the driver features read from the device
> + */
> +u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev)
> +{
> + return ioread32(ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_driver_features);
> +
> +/*
> + * vp_legacy_set_features - set features to device
> + * @ldev: the legacy virtio-pci device
> + * @features: the features set to device
> + */
> +void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
> + u32 features)
> +{
> + iowrite32(features, ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_set_features);
> +
> +/*
> + * vp_legacy_get_status - get the device status
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns the status read from device
> + */
> +u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev)
> +{
> + return ioread8(ldev->ioaddr + VIRTIO_PCI_STATUS);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_status);
> +
> +/*
> + * vp_legacy_set_status - set status to device
> + * @ldev: the legacy virtio-pci device
> + * @status: the status set to device
> + */
> +void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
> + u8 status)
> +{
> + iowrite8(status, ldev->ioaddr + VIRTIO_PCI_STATUS);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_set_status);
> +
> +/*
> + * vp_legacy_queue_vector - set the MSIX vector for a specific virtqueue
> + * @ldev: the legacy virtio-pci device
> + * @index: queue index
> + * @vector: the queue vector
> + *
> + * Returns the queue vector read back from the device
> + */
> +u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
> + u16 index, u16 vector)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> + /* Flush the write out to device */
> + return ioread16(ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_queue_vector);
> +
> +/*
> + * vp_legacy_config_vector - set the vector for config interrupt
> + * @ldev: the legacy virtio-pci device
> + * @vector: the config vector
> + *
> + * Returns the config vector read from the device
> + */
> +u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
> + u16 vector)
> +{
> + /* Setup the vector used for configuration events */
> + iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> + /* Verify we had enough resources to assign the vector */
> + /* Will also flush the write out to device */
> + return ioread16(ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_config_vector);
> +
> +/*
> + * vp_legacy_set_queue_address - set the virtqueue address
> + * @ldev: the legacy virtio-pci device
> + * @index: the queue index
> + * @queue_pfn: pfn of the virtqueue
> + */
> +void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
> + u16 index, u32 queue_pfn)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + iowrite32(queue_pfn, ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_set_queue_address);
> +
> +/*
> + * vp_legacy_get_queue_enable - query whether a virtqueue is enabled
> + * @ldev: the legacy virtio-pci device
> + * @index: the queue index
> + *
> + * Returns whether a virtqueue is enabled or not
> + */
> +bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 index)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + return ioread32(ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_queue_enable);
> +
> +/*
> + * vp_legacy_get_queue_size - get size for a virtqueue
> + * @ldev: the legacy virtio-pci device
> + * @index: the queue index
> + *
> + * Returns the size of the virtqueue
> + */
> +u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
> + u16 index)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + return ioread16(ldev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_queue_size);
> +
> +MODULE_VERSION("0.1");
> +MODULE_DESCRIPTION("Legacy Virtio PCI Device");
> +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/virtio_pci_legacy.h b/include/linux/virtio_pci_legacy.h
> new file mode 100644
> index 000000000000..ee2c6157215f
> --- /dev/null
> +++ b/include/linux/virtio_pci_legacy.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_VIRTIO_PCI_LEGACY_H
> +#define _LINUX_VIRTIO_PCI_LEGACY_H
> +
> +#include "linux/mod_devicetable.h"
> +#include <linux/pci.h>
> +#include <linux/virtio_pci.h>
> +
> +struct virtio_pci_legacy_device {
> + struct pci_dev *pci_dev;
> +
> + /* Where to read and clear interrupt */
> + u8 __iomem *isr;
> + /* The IO mapping for the PCI config space (legacy mode only) */
> + void __iomem *ioaddr;
> +
> + struct virtio_device_id id;
> +};
> +
> +u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev);
> +u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev);
> +void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
> + u32 features);
> +u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev);
> +void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
> + u8 status);
> +u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
> + u16 idx, u16 vector);
> +u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
> + u16 vector);
> +void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
> + u16 index, u32 queue_pfn);
> +void vp_legacy_set_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 idx, bool enable);
> +bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 idx);
> +void vp_legacy_set_queue_size(struct virtio_pci_legacy_device *ldev,
> + u16 idx, u16 size);
> +u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
> + u16 idx);
> +int vp_legacy_probe(struct virtio_pci_legacy_device *ldev);
> +void vp_legacy_remove(struct virtio_pci_legacy_device *ldev);
> +
> +#endif
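
The queue helpers in the patch above all follow the same legacy register pattern: write the queue index to VIRTIO_PCI_QUEUE_SEL, then access the per-queue register. A queue counts as enabled when its PFN register is non-zero. A minimal user-space sketch of that pattern is below. It is not the kernel code: struct fake_legacy_dev, the fake_* helpers and FAKE_NUM_QUEUES are hypothetical names, and plain memory stands in for the I/O port window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_NUM_QUEUES 2	/* hypothetical queue count for the sketch */

/* Hypothetical in-memory stand-in for the legacy per-queue registers. */
struct fake_legacy_dev {
	uint16_t queue_sel;			/* models VIRTIO_PCI_QUEUE_SEL */
	uint16_t queue_num[FAKE_NUM_QUEUES];	/* models VIRTIO_PCI_QUEUE_NUM */
	uint32_t queue_pfn[FAKE_NUM_QUEUES];	/* models VIRTIO_PCI_QUEUE_PFN */
};

/* Select the queue, then read its fixed size (0 means "no such queue"). */
static uint16_t fake_get_queue_size(struct fake_legacy_dev *dev, uint16_t index)
{
	dev->queue_sel = index;
	if (dev->queue_sel >= FAKE_NUM_QUEUES)
		return 0;
	return dev->queue_num[dev->queue_sel];
}

/* Select the queue, then write its PFN; writing 0 deactivates the queue. */
static void fake_set_queue_address(struct fake_legacy_dev *dev,
				   uint16_t index, uint32_t pfn)
{
	dev->queue_sel = index;
	if (dev->queue_sel < FAKE_NUM_QUEUES)
		dev->queue_pfn[dev->queue_sel] = pfn;
}

/* Select the queue, then report "enabled" if its PFN is non-zero. */
static bool fake_get_queue_enable(struct fake_legacy_dev *dev, uint16_t index)
{
	dev->queue_sel = index;
	if (dev->queue_sel >= FAKE_NUM_QUEUES)
		return false;
	return dev->queue_pfn[dev->queue_sel] != 0;
}

/* Same test as setup_vq(): the queue must exist and must not be active yet. */
static bool fake_queue_available(struct fake_legacy_dev *dev, uint16_t index)
{
	uint16_t num = fake_get_queue_size(dev, index);

	return num && !fake_get_queue_enable(dev, index);
}
```

Because every helper selects the queue itself, callers such as setup_vq() and del_vq() no longer need their own write to the select register, which is why those writes are dropped in the diff.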

2021-10-11 05:33:07

by Jason Wang

Subject: Re: [PATCH v4 1/7] virtio-pci: introduce legacy device module


On 2021/9/29 2:11, Wu Zongyong wrote:
> Split common codes from virtio-pci-legacy so vDPA driver can reuse it
> later.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/virtio/Kconfig | 10 ++
> drivers/virtio/Makefile | 1 +
> drivers/virtio/virtio_pci_common.c | 10 +-
> drivers/virtio/virtio_pci_common.h | 9 +-
> drivers/virtio/virtio_pci_legacy.c | 101 +++---------
> drivers/virtio/virtio_pci_legacy_dev.c | 220 +++++++++++++++++++++++++
> include/linux/virtio_pci_legacy.h | 44 +++++
> 7 files changed, 312 insertions(+), 83 deletions(-)
> create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> create mode 100644 include/linux/virtio_pci_legacy.h
>
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index ce1b3f6ec325..8fcf94cd2c96 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
> PCI device with possible vendor specific extensions. Any
> module that selects this module must depend on PCI.
>
> +config VIRTIO_PCI_LIB_LEGACY
> + tristate
> + help
> + Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
> + implementation.
> + This module implements the basic probe and control for devices
> + which are based on legacy PCI device. Any module that selects this
> + module must depend on PCI.
> +
> menuconfig VIRTIO_MENU
> bool "Virtio drivers"
> default y
> @@ -43,6 +52,7 @@ config VIRTIO_PCI_LEGACY
> bool "Support for legacy virtio draft 0.9.X and older devices"
> default y
> depends on VIRTIO_PCI
> + select VIRTIO_PCI_LIB_LEGACY
> help
> Virtio PCI Card 0.9.X Draft (circa 2014) and older device support.
>
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 699bbea0465f..0a82d0873248 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,6 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
> +obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
> obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
> obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
> virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index b35bb2d57f62..d724f676608b 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -549,6 +549,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
>
> pci_set_master(pci_dev);
>
> + vp_dev->is_legacy = vp_dev->ldev.ioaddr ? true : false;
> +
> rc = register_virtio_device(&vp_dev->vdev);
> reg_dev = vp_dev;
> if (rc)
> @@ -557,10 +559,10 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
> return 0;
>
> err_register:
> - if (vp_dev->ioaddr)
> - virtio_pci_legacy_remove(vp_dev);
> + if (vp_dev->is_legacy)
> + virtio_pci_legacy_remove(vp_dev);
> else
> - virtio_pci_modern_remove(vp_dev);
> + virtio_pci_modern_remove(vp_dev);
> err_probe:
> pci_disable_device(pci_dev);
> err_enable_device:
> @@ -587,7 +589,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
>
> unregister_virtio_device(&vp_dev->vdev);
>
> - if (vp_dev->ioaddr)
> + if (vp_dev->is_legacy)
> virtio_pci_legacy_remove(vp_dev);
> else
> virtio_pci_modern_remove(vp_dev);
> diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
> index beec047a8f8d..eb17a29fc7ef 100644
> --- a/drivers/virtio/virtio_pci_common.h
> +++ b/drivers/virtio/virtio_pci_common.h
> @@ -25,6 +25,7 @@
> #include <linux/virtio_config.h>
> #include <linux/virtio_ring.h>
> #include <linux/virtio_pci.h>
> +#include <linux/virtio_pci_legacy.h>
> #include <linux/virtio_pci_modern.h>
> #include <linux/highmem.h>
> #include <linux/spinlock.h>
> @@ -44,16 +45,14 @@ struct virtio_pci_vq_info {
> struct virtio_pci_device {
> struct virtio_device vdev;
> struct pci_dev *pci_dev;
> + struct virtio_pci_legacy_device ldev;
> struct virtio_pci_modern_device mdev;
>
> - /* In legacy mode, these two point to within ->legacy. */
> + bool is_legacy;
> +
> /* Where to read and clear interrupt */
> u8 __iomem *isr;
>
> - /* Legacy only field */
> - /* the IO mapping for the PCI config space */
> - void __iomem *ioaddr;
> -
> /* a list of queues so we can dispatch IRQs */
> spinlock_t lock;
> struct list_head virtqueues;
> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> index d62e9835aeec..82eb437ad920 100644
> --- a/drivers/virtio/virtio_pci_legacy.c
> +++ b/drivers/virtio/virtio_pci_legacy.c
> @@ -14,6 +14,7 @@
> * Michael S. Tsirkin <[email protected]>
> */
>
> +#include "linux/virtio_pci_legacy.h"
> #include "virtio_pci_common.h"
>
> /* virtio config->get_features() implementation */
> @@ -23,7 +24,7 @@ static u64 vp_get_features(struct virtio_device *vdev)
>
> /* When someone needs more than 32 feature bits, we'll need to
> * steal a bit to indicate that the rest are somewhere else. */
> - return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
> + return vp_legacy_get_features(&vp_dev->ldev);
> }
>
> /* virtio config->finalize_features() implementation */
> @@ -38,7 +39,7 @@ static int vp_finalize_features(struct virtio_device *vdev)
> BUG_ON((u32)vdev->features != vdev->features);
>
> /* We only support 32 feature bits. */
> - iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
> + vp_legacy_set_features(&vp_dev->ldev, vdev->features);
>
> return 0;
> }
> @@ -48,7 +49,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
> void *buf, unsigned len)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> - void __iomem *ioaddr = vp_dev->ioaddr +
> + void __iomem *ioaddr = vp_dev->ldev.ioaddr +
> VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
> offset;
> u8 *ptr = buf;
> @@ -64,7 +65,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
> const void *buf, unsigned len)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> - void __iomem *ioaddr = vp_dev->ioaddr +
> + void __iomem *ioaddr = vp_dev->ldev.ioaddr +
> VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
> offset;
> const u8 *ptr = buf;
> @@ -78,7 +79,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
> static u8 vp_get_status(struct virtio_device *vdev)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> - return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + return vp_legacy_get_status(&vp_dev->ldev);
> }
>
> static void vp_set_status(struct virtio_device *vdev, u8 status)
> @@ -86,28 +87,24 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> /* We should never be setting status to 0. */
> BUG_ON(status == 0);
> - iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + vp_legacy_set_status(&vp_dev->ldev, status);
> }
>
> static void vp_reset(struct virtio_device *vdev)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> /* 0 status means a reset. */
> - iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + vp_legacy_set_status(&vp_dev->ldev, 0);
> /* Flush out the status write, and flush in device writes,
> * including MSi-X interrupts, if any. */
> - ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> + vp_legacy_get_status(&vp_dev->ldev);
> /* Flush pending VQ/configuration callbacks. */
> vp_synchronize_vectors(vdev);
> }
>
> static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
> {
> - /* Setup the vector used for configuration events */
> - iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> - /* Verify we had enough resources to assign the vector */
> - /* Will also flush the write out to device */
> - return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> + return vp_legacy_config_vector(&vp_dev->ldev, vector);
> }
>
> static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> @@ -123,12 +120,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> int err;
> u64 q_pfn;
>
> - /* Select the queue we're interested in */
> - iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> -
> /* Check if queue is either not available or already active. */
> - num = ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
> - if (!num || ioread32(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN))
> + num = vp_legacy_get_queue_size(&vp_dev->ldev, index);
> + if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
> return ERR_PTR(-ENOENT);
>
> info->msix_vector = msix_vec;
> @@ -151,13 +145,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> }
>
> /* activate the queue */
> - iowrite32(q_pfn, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> + vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);
>
> - vq->priv = (void __force *)vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> + vq->priv = (void __force *)vp_dev->ldev.ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
>
> if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
> - iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> - msix_vec = ioread16(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> + msix_vec = vp_legacy_queue_vector(&vp_dev->ldev, index, msix_vec);
> if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
> err = -EBUSY;
> goto out_deactivate;
> @@ -167,7 +160,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> return vq;
>
> out_deactivate:
> - iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> + vp_legacy_set_queue_address(&vp_dev->ldev, index, 0);
> out_del_vq:
> vring_del_virtqueue(vq);
> return ERR_PTR(err);
> @@ -178,17 +171,15 @@ static void del_vq(struct virtio_pci_vq_info *info)
> struct virtqueue *vq = info->vq;
> struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
>
> - iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> -
> if (vp_dev->msix_enabled) {
> - iowrite16(VIRTIO_MSI_NO_VECTOR,
> - vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> + vp_legacy_queue_vector(&vp_dev->ldev, vq->index,
> + VIRTIO_MSI_NO_VECTOR);
> /* Flush the write out to device */
> - ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
> + ioread8(vp_dev->ldev.ioaddr + VIRTIO_PCI_ISR);
> }
>
> /* Select and deactivate the queue */
> - iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> + vp_legacy_set_queue_address(&vp_dev->ldev, vq->index, 0);
>
> vring_del_virtqueue(vq);
> }
> @@ -211,51 +202,18 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
> /* the PCI probing function */
> int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
> {
> + struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
> struct pci_dev *pci_dev = vp_dev->pci_dev;
> int rc;
>
> - /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
> - if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
> - return -ENODEV;
> -
> - if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) {
> - printk(KERN_ERR "virtio_pci: expected ABI version %d, got %d\n",
> - VIRTIO_PCI_ABI_VERSION, pci_dev->revision);
> - return -ENODEV;
> - }
> -
> - rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
> - if (rc) {
> - rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
> - } else {
> - /*
> - * The virtio ring base address is expressed as a 32-bit PFN,
> - * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
> - */
> - dma_set_coherent_mask(&pci_dev->dev,
> - DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
> - }
> -
> - if (rc)
> - dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
> + ldev->pci_dev = pci_dev;
>
> - rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> + rc = vp_legacy_probe(ldev);
> if (rc)
> return rc;
>
> - rc = -ENOMEM;
> - vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
> - if (!vp_dev->ioaddr)
> - goto err_iomap;
> -
> - vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR;
> -
> - /* we use the subsystem vendor/device id as the virtio vendor/device
> - * id. this allows us to use the same PCI vendor/device id for all
> - * virtio devices and to identify the particular virtio driver by
> - * the subsystem ids */
> - vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
> - vp_dev->vdev.id.device = pci_dev->subsystem_device;
> + vp_dev->isr = ldev->isr;
> + vp_dev->vdev.id = ldev->id;
>
> vp_dev->vdev.config = &virtio_pci_config_ops;
>
> @@ -264,16 +222,11 @@ int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
> vp_dev->del_vq = del_vq;
>
> return 0;
> -
> -err_iomap:
> - pci_release_region(pci_dev, 0);
> - return rc;
> }
>
> void virtio_pci_legacy_remove(struct virtio_pci_device *vp_dev)
> {
> - struct pci_dev *pci_dev = vp_dev->pci_dev;
> + struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
>
> - pci_iounmap(pci_dev, vp_dev->ioaddr);
> - pci_release_region(pci_dev, 0);
> + vp_legacy_remove(ldev);
> }
> diff --git a/drivers/virtio/virtio_pci_legacy_dev.c b/drivers/virtio/virtio_pci_legacy_dev.c
> new file mode 100644
> index 000000000000..9b97680dd02b
> --- /dev/null
> +++ b/drivers/virtio/virtio_pci_legacy_dev.c
> @@ -0,0 +1,220 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +
> +#include "linux/virtio_pci.h"
> +#include <linux/virtio_pci_legacy.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +
> +
> +/*
> + * vp_legacy_probe: probe the legacy virtio pci device. Note that the
> + * caller is required to enable the PCI device before calling this function.
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns 0 on success, otherwise a negative error code
> + */
> +int vp_legacy_probe(struct virtio_pci_legacy_device *ldev)
> +{
> + struct pci_dev *pci_dev = ldev->pci_dev;
> + int rc;
> +
> + /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
> + if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
> + return -ENODEV;
> +
> + if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION)
> + return -ENODEV;
> +
> + rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
> + if (rc) {
> + rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
> + } else {
> + /*
> + * The virtio ring base address is expressed as a 32-bit PFN,
> + * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
> + */
> + dma_set_coherent_mask(&pci_dev->dev,
> + DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
> + }
> +
> + if (rc)
> + dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
> +
> + rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> + if (rc)
> + return rc;
> + rc = -ENOMEM;
> + ldev->ioaddr = pci_iomap(pci_dev, 0, 0);
> + if (!ldev->ioaddr)
> + goto err_iomap;
> +
> + ldev->isr = ldev->ioaddr + VIRTIO_PCI_ISR;
> +
> + ldev->id.vendor = pci_dev->subsystem_vendor;
> + ldev->id.device = pci_dev->subsystem_device;
> +
> + return 0;
> +err_iomap:
> + pci_release_region(pci_dev, 0);
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_probe);
> +
> +/*
> + * vp_legacy_remove: remove and clean up the legacy virtio pci device
> + * @ldev: the legacy virtio-pci device
> + */
> +void vp_legacy_remove(struct virtio_pci_legacy_device *ldev)
> +{
> + struct pci_dev *pci_dev = ldev->pci_dev;
> +
> + pci_iounmap(pci_dev, ldev->ioaddr);
> + pci_release_region(pci_dev, 0);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_remove);
> +
> +/*
> + * vp_legacy_get_features - get features from device
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns the features read from the device
> + */
> +u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev)
> +{
> +
> + return ioread32(ldev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_features);
> +
> +/*
> + * vp_legacy_get_driver_features - get driver features from device
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns the driver features read from the device
> + */
> +u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev)
> +{
> + return ioread32(ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_driver_features);
> +
> +/*
> + * vp_legacy_set_features - set features to device
> + * @ldev: the legacy virtio-pci device
> + * @features: the features set to device
> + */
> +void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
> + u32 features)
> +{
> + iowrite32(features, ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_set_features);
> +
> +/*
> + * vp_legacy_get_status - get the device status
> + * @ldev: the legacy virtio-pci device
> + *
> + * Returns the status read from device
> + */
> +u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev)
> +{
> + return ioread8(ldev->ioaddr + VIRTIO_PCI_STATUS);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_status);
> +
> +/*
> + * vp_legacy_set_status - set status to device
> + * @ldev: the legacy virtio-pci device
> + * @status: the status set to device
> + */
> +void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
> + u8 status)
> +{
> + iowrite8(status, ldev->ioaddr + VIRTIO_PCI_STATUS);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_set_status);
> +
> +/*
> + * vp_legacy_queue_vector - set the MSIX vector for a specific virtqueue
> + * @ldev: the legacy virtio-pci device
> + * @index: queue index
> + * @vector: the queue vector
> + *
> + * Returns the queue vector read back from the device
> + */
> +u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
> + u16 index, u16 vector)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> + /* Flush the write out to device */
> + return ioread16(ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_queue_vector);
> +
> +/*
> + * vp_legacy_config_vector - set the vector for config interrupt
> + * @ldev: the legacy virtio-pci device
> + * @vector: the config vector
> + *
> + * Returns the config vector read from the device
> + */
> +u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
> + u16 vector)
> +{
> + /* Setup the vector used for configuration events */
> + iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> + /* Verify we had enough resources to assign the vector */
> + /* Will also flush the write out to device */
> + return ioread16(ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_config_vector);
> +
> +/*
> + * vp_legacy_set_queue_address - set the virtqueue address
> + * @ldev: the legacy virtio-pci device
> + * @index: the queue index
> + * @queue_pfn: pfn of the virtqueue
> + */
> +void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
> + u16 index, u32 queue_pfn)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + iowrite32(queue_pfn, ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_set_queue_address);
> +
> +/*
> + * vp_legacy_get_queue_enable - query whether a virtqueue is enabled
> + * @ldev: the legacy virtio-pci device
> + * @index: the queue index
> + *
> + * Returns whether a virtqueue is enabled or not
> + */
> +bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 index)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + return ioread32(ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_queue_enable);
> +
> +/*
> + * vp_legacy_get_queue_size - get size for a virtqueue
> + * @ldev: the legacy virtio-pci device
> + * @index: the queue index
> + *
> + * Returns the size of the virtqueue
> + */
> +u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
> + u16 index)
> +{
> + iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
> + return ioread16(ldev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
> +}
> +EXPORT_SYMBOL_GPL(vp_legacy_get_queue_size);
> +
> +MODULE_VERSION("0.1");
> +MODULE_DESCRIPTION("Legacy Virtio PCI Device");
> +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/virtio_pci_legacy.h b/include/linux/virtio_pci_legacy.h
> new file mode 100644
> index 000000000000..ee2c6157215f
> --- /dev/null
> +++ b/include/linux/virtio_pci_legacy.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_VIRTIO_PCI_LEGACY_H
> +#define _LINUX_VIRTIO_PCI_LEGACY_H
> +
> +#include "linux/mod_devicetable.h"
> +#include <linux/pci.h>
> +#include <linux/virtio_pci.h>
> +
> +struct virtio_pci_legacy_device {
> + struct pci_dev *pci_dev;
> +
> + /* Where to read and clear interrupt */
> + u8 __iomem *isr;
> + /* The IO mapping for the PCI config space (legacy mode only) */
> + void __iomem *ioaddr;
> +
> + struct virtio_device_id id;
> +};
> +
> +u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev);
> +u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev);
> +void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
> + u32 features);
> +u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev);
> +void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
> + u8 status);
> +u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
> + u16 idx, u16 vector);
> +u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
> + u16 vector);
> +void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
> + u16 index, u32 queue_pfn);
> +void vp_legacy_set_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 idx, bool enable);


Just spotted this: vp_legacy_set_queue_enable() is never defined in this patch?

Thanks


> +bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 idx);
> +void vp_legacy_set_queue_size(struct virtio_pci_legacy_device *ldev,
> + u16 idx, u16 size);
> +u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
> + u16 idx);
> +int vp_legacy_probe(struct virtio_pci_legacy_device *ldev);
> +void vp_legacy_remove(struct virtio_pci_legacy_device *ldev);
> +
> +#endif

2021-10-11 05:33:50

by Jason Wang

Subject: Re: [PATCH v4 4/7] vdpa: add new callback get_vq_num_min in vdpa_config_ops


On 2021/9/29 2:11, Wu Zongyong wrote:
> This callback is optional. For vdpa devices that do not support changing
> the virtqueue size, get_vq_num_min and get_vq_num_max will return the same
> value, so that users can choose a correct value for that device.
>
> Suggested-by: Jason Wang <[email protected]>
> Signed-off-by: Wu Zongyong <[email protected]>


Acked-by: Jason Wang <[email protected]>
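
As a rough illustration of how a user of these two callbacks might pick a ring size, the sketch below clamps a requested size into the range the device reports. It is a userspace model rather than kernel code: struct vq_limits and pick_vq_size() are hypothetical names, and a min of 0 is assumed to model a device that does not implement get_vq_num_min. It also ignores the power-of-two requirement of split rings.

```c
#include <stdint.h>

/* Hypothetical model of what get_vq_num_min/get_vq_num_max report. */
struct vq_limits {
	uint16_t min; /* 0 models "get_vq_num_min not implemented" */
	uint16_t max;
};

/*
 * Clamp a requested virtqueue size into [min, max].
 * Returns 0 when the reported limits are unusable (max == 0 or min > max).
 */
static uint16_t pick_vq_size(const struct vq_limits *lim, uint16_t requested)
{
	if (lim->max == 0 || lim->min > lim->max)
		return 0;
	if (requested > lim->max)
		return lim->max;
	if (requested < lim->min)
		return lim->min;
	return requested;
}
```

For a device with a fixed ring size (min == max), every request collapses to that one size, which is the behaviour the commit message describes.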


> ---
> include/linux/vdpa.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index a896ee021e5f..30864848950b 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -171,6 +171,9 @@ struct vdpa_map_file {
> * @get_vq_num_max: Get the max size of virtqueue
> * @vdev: vdpa device
> * Returns u16: max size of virtqueue
> + * @get_vq_num_min: Get the min size of virtqueue (optional)
> + * @vdev: vdpa device
> + * Returns u16: min size of virtqueue
> * @get_device_id: Get virtio device id
> * @vdev: vdpa device
> * Returns u32: virtio device id
> @@ -266,6 +269,7 @@ struct vdpa_config_ops {
> void (*set_config_cb)(struct vdpa_device *vdev,
> struct vdpa_callback *cb);
> u16 (*get_vq_num_max)(struct vdpa_device *vdev);
> + u16 (*get_vq_num_min)(struct vdpa_device *vdev);
> u32 (*get_device_id)(struct vdpa_device *vdev);
> u32 (*get_vendor_id)(struct vdpa_device *vdev);
> u8 (*get_status)(struct vdpa_device *vdev);

2021-10-11 05:34:34

by Jason Wang

Subject: Re: [PATCH v4 5/7] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}


On 2021/9/29 2:11, Wu Zongyong wrote:

> Signed-off-by: Wu Zongyong <[email protected]>


Commit log please.


> ---
> drivers/virtio/virtio_vdpa.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index 72eaef2caeb1..8aa4ebe2a2a2 100644
> --- a/drivers/virtio/virtio_vdpa.c
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> /* Assume split virtqueue, switch to packed if necessary */
> struct vdpa_vq_state state = {0};
> unsigned long flags;
> - u32 align, num;
> + u32 align, max_num, min_num = 0;
> + bool may_reduce_num = true;
> int err;
>
> if (!name)
> @@ -163,22 +164,36 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> if (!info)
> return ERR_PTR(-ENOMEM);
>
> - num = ops->get_vq_num_max(vdpa);
> - if (num == 0) {
> + max_num = ops->get_vq_num_max(vdpa);
> + if (max_num == 0) {
> err = -ENOENT;
> goto error_new_virtqueue;
> }
>
> + if (ops->get_vq_num_min)
> + min_num = ops->get_vq_num_min(vdpa);
> + if (min_num > max_num) {
> + err = -ENOENT;
> + goto error_new_virtqueue;
> + }


If we really want to do this, let's move this to vdpa core during device
probing.

Or just leave it as is (the device bears the risk itself).

Thanks
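
A minimal userspace sketch of the probe-time check suggested above, assuming a simplified stand-in for vdpa_config_ops. The names model_ops, model_check_vq_limits() and the model_ret_* helpers are hypothetical and are not part of the vDPA core; the point is only that the min/max comparison can be done once per device instead of once per virtqueue.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two size callbacks in vdpa_config_ops. */
struct model_ops {
	uint16_t (*get_vq_num_max)(void *dev);
	uint16_t (*get_vq_num_min)(void *dev); /* optional, may be NULL */
};

/* Example callbacks used to build well-formed and broken devices. */
static uint16_t model_ret_256(void *dev) { (void)dev; return 256; }
static uint16_t model_ret_512(void *dev) { (void)dev; return 512; }

/*
 * Validate the reported limits once, at probe/registration time.
 * Returns 0 if they are consistent, -EINVAL if min exceeds max.
 */
static int model_check_vq_limits(const struct model_ops *ops, void *dev)
{
	uint16_t max_num = ops->get_vq_num_max(dev);
	uint16_t min_num = 0;

	if (ops->get_vq_num_min)
		min_num = ops->get_vq_num_min(dev);

	return (max_num < min_num) ? -EINVAL : 0;
}
```

With a check like this in one place, virtio_vdpa_setup_vq() would no longer need its own min_num > max_num branch.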


> +
> + may_reduce_num = (max_num == min_num) ? false : true;
> +
> /* Create the vring */
> align = ops->get_vq_align(vdpa);
> - vq = vring_create_virtqueue(index, num, align, vdev,
> - true, true, ctx,
> + vq = vring_create_virtqueue(index, max_num, align, vdev,
> + true, may_reduce_num, ctx,
> virtio_vdpa_notify, callback, name);
> if (!vq) {
> err = -ENOMEM;
> goto error_new_virtqueue;
> }
>
> + if (virtqueue_get_vring_size(vq) < min_num) {
> + err = -EINVAL;
> + goto err_vq;
> + }
> +
> /* Setup virtqueue callback */
> cb.callback = virtio_vdpa_virtqueue_cb;
> cb.private = info;

2021-10-11 05:38:07

by Jason Wang

Subject: Re: [PATCH v4 7/7] eni_vdpa: add vDPA driver for Alibaba ENI


On 2021/9/29 2:11, Wu Zongyong wrote:
> This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
> Interface), which is built upon the virtio 0.9.5 specification.
> This driver does not support running on a BE host.


If this is true, I think it's still better to exclude this driver via
Kconfig.


>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vdpa/Kconfig | 8 +
> drivers/vdpa/Makefile | 1 +
> drivers/vdpa/alibaba/Makefile | 3 +
> drivers/vdpa/alibaba/eni_vdpa.c | 553 ++++++++++++++++++++++++++++++++
> 4 files changed, 565 insertions(+)
> create mode 100644 drivers/vdpa/alibaba/Makefile
> create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
>
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 3d91982d8371..9587b9177b05 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -78,4 +78,12 @@ config VP_VDPA
> help
> This kernel module bridges virtio PCI device to vDPA bus.
>
> +config ALIBABA_ENI_VDPA
> + tristate "vDPA driver for Alibaba ENI"
> + select VIRTIO_PCI_LEGACY_LIB
> + depends on PCI_MSI
> + help
> + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon
> + virtio 0.9.5 specification.
> +
> endif # VDPA
> diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
> index f02ebed33f19..15665563a7f4 100644
> --- a/drivers/vdpa/Makefile
> +++ b/drivers/vdpa/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> obj-$(CONFIG_IFCVF) += ifcvf/
> obj-$(CONFIG_MLX5_VDPA) += mlx5/
> obj-$(CONFIG_VP_VDPA) += virtio_pci/
> +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
> new file mode 100644
> index 000000000000..ef4aae69f87a
> --- /dev/null
> +++ b/drivers/vdpa/alibaba/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> +
> diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
> new file mode 100644
> index 000000000000..6a09f157d810
> --- /dev/null
> +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> @@ -0,0 +1,553 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
> + *
> + * Copyright (c) 2021, Alibaba Inc. All rights reserved.
> + * Author: Wu Zongyong <[email protected]>
> + *
> + */
> +
> +#include "linux/bits.h"
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/vdpa.h>
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_ring.h>
> +#include <linux/virtio_pci.h>
> +#include <linux/virtio_pci_legacy.h>
> +#include <uapi/linux/virtio_net.h>
> +
> +#define ENI_MSIX_NAME_SIZE 256
> +
> +#define ENI_ERR(pdev, fmt, ...) \
> + dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +#define ENI_DBG(pdev, fmt, ...) \
> + dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +#define ENI_INFO(pdev, fmt, ...) \
> + dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
> +
> +struct eni_vring {
> + void __iomem *notify;
> + char msix_name[ENI_MSIX_NAME_SIZE];
> + struct vdpa_callback cb;
> + int irq;
> +};
> +
> +struct eni_vdpa {
> + struct vdpa_device vdpa;
> + struct virtio_pci_legacy_device ldev;
> + struct eni_vring *vring;
> + struct vdpa_callback config_cb;
> + char msix_name[ENI_MSIX_NAME_SIZE];
> + int config_irq;
> + int queues;
> + int vectors;
> +};
> +
> +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
> +{
> + return container_of(vdpa, struct eni_vdpa, vdpa);
> +}
> +
> +static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + return &eni_vdpa->ldev;
> +}
> +
> +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u64 features = vp_legacy_get_features(ldev);
> +
> + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> + features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);


VERSION_1 is also needed?


> +
> + return features;
> +}
> +
> +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + if (!(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
> + ENI_ERR(ldev->pci_dev,
> + "VIRTIO_NET_F_MRG_RXBUF is not negotiated\n");
> + return -EINVAL;


Do we need to make sure FEATURES_OK is not set in this case, or can the
ENI do this for us?

Otherwise looks good.

Thanks


> + }
> +
> + vp_legacy_set_features(ldev, (u32)features);
> +
> + return 0;
> +}
> +
> +static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_status(ldev);
> +}
> +
> +static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + int irq = eni_vdpa->vring[idx].irq;
> +
> + if (irq == VIRTIO_MSI_NO_VECTOR)
> + return -EINVAL;
> +
> + return irq;
> +}
> +
> +static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + struct pci_dev *pdev = ldev->pci_dev;
> + int i;
> +
> + for (i = 0; i < eni_vdpa->queues; i++) {
> + if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
> + vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
> + devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
> + &eni_vdpa->vring[i]);
> + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> + }
> + }
> +
> + if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
> + vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
> + devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
> + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> + }
> +
> + if (eni_vdpa->vectors) {
> + pci_free_irq_vectors(pdev);
> + eni_vdpa->vectors = 0;
> + }
> +}
> +
> +static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
> +{
> + struct eni_vring *vring = arg;
> +
> + if (vring->cb.callback)
> + return vring->cb.callback(vring->cb.private);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
> +{
> + struct eni_vdpa *eni_vdpa = arg;
> +
> + if (eni_vdpa->config_cb.callback)
> + return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + struct pci_dev *pdev = ldev->pci_dev;
> + int i, ret, irq;
> + int queues = eni_vdpa->queues;
> + int vectors = queues + 1;
> +
> + ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
> + if (ret != vectors) {
> + ENI_ERR(pdev,
> + "failed to allocate irq vectors want %d but %d\n",
> + vectors, ret);
> + return ret;
> + }
> +
> + eni_vdpa->vectors = vectors;
> +
> + for (i = 0; i < queues; i++) {
> + snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
> + "eni-vdpa[%s]-%d\n", pci_name(pdev), i);
> + irq = pci_irq_vector(pdev, i);
> + ret = devm_request_irq(&pdev->dev, irq,
> + eni_vdpa_vq_handler,
> + 0, eni_vdpa->vring[i].msix_name,
> + &eni_vdpa->vring[i]);
> + if (ret) {
> + ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
> + goto err;
> + }
> + vp_legacy_queue_vector(ldev, i, i);
> + eni_vdpa->vring[i].irq = irq;
> + }
> +
> + snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config\n",
> + pci_name(pdev));
> + irq = pci_irq_vector(pdev, queues);
> + ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
> + eni_vdpa->msix_name, eni_vdpa);
> + if (ret) {
> + ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
> + goto err;
> + }
> + vp_legacy_config_vector(ldev, queues);
> + eni_vdpa->config_irq = irq;
> +
> + return 0;
> +err:
> + eni_vdpa_free_irq(eni_vdpa);
> + return ret;
> +}
> +
> +static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u8 s = eni_vdpa_get_status(vdpa);
> +
> + if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
> + !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
> + eni_vdpa_request_irq(eni_vdpa);
> + }
> +
> + vp_legacy_set_status(ldev, status);
> +
> + if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> + (s & VIRTIO_CONFIG_S_DRIVER_OK))
> + eni_vdpa_free_irq(eni_vdpa);
> +}
> +
> +static int eni_vdpa_reset(struct vdpa_device *vdpa)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u8 s = eni_vdpa_get_status(vdpa);
> +
> + vp_legacy_set_status(ldev, 0);
> +
> + if (s & VIRTIO_CONFIG_S_DRIVER_OK)
> + eni_vdpa_free_irq(eni_vdpa);
> +
> + return 0;
> +}
> +
> +static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_size(ldev, 0);
> +}
> +
> +static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_size(ldev, 0);
> +}
> +
> +static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
> + struct vdpa_vq_state *state)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
> + const struct vdpa_vq_state *state)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + const struct vdpa_vq_state_split *split = &state->split;
> +
> + /* ENI is build upon virtio-pci specfication which not support
> + * to set state of virtqueue. But if the state is equal to the
> + * device initial state by chance, we can let it go.
> + */
> + if (!vp_legacy_get_queue_enable(ldev, qid)
> + && split->avail_index == 0)
> + return 0;
> +
> + return -EOPNOTSUPP;
> +}
> +
> +
> +static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
> + struct vdpa_callback *cb)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + eni_vdpa->vring[qid].cb = *cb;
> +}
> +
> +static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
> + bool ready)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + /* ENI is a legacy virtio-pci device. This is not supported
> + * by specification. But we can disable virtqueue by setting
> + * address to 0.
> + */
> + if (!ready)
> + vp_legacy_set_queue_address(ldev, qid, 0);
> +}
> +
> +static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return vp_legacy_get_queue_enable(ldev, qid);
> +}
> +
> +static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
> + u32 num)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + struct pci_dev *pdev = ldev->pci_dev;
> + u16 n = vp_legacy_get_queue_size(ldev, qid);
> +
> + /* ENI is a legacy virtio-pci device which not allow to change
> + * virtqueue size. Just report a error if someone tries to
> + * change it.
> + */
> + if (num != n)
> + ENI_ERR(pdev,
> + "not support to set vq %u fixed num %u to %u\n",
> + qid, n, num);
> +}
> +
> +static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
> + u64 desc_area, u64 driver_area,
> + u64 device_area)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
> +
> + vp_legacy_set_queue_address(ldev, qid, pfn);
> +
> + return 0;
> +}
> +
> +static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + iowrite16(qid, eni_vdpa->vring[qid].notify);
> +}
> +
> +static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return ldev->id.device;
> +}
> +
> +static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + return ldev->id.vendor;
> +}
> +
> +static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
> +{
> + return PAGE_SIZE;
> +}
> +
> +static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
> +{
> + return sizeof(struct virtio_net_config);
> +}
> +
> +
> +static void eni_vdpa_get_config(struct vdpa_device *vdpa,
> + unsigned int offset,
> + void *buf, unsigned int len)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + void __iomem *ioaddr = ldev->ioaddr +
> + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> + offset;
> + u8 *p = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + *p++ = ioread8(ioaddr + i);
> +}
> +
> +static void eni_vdpa_set_config(struct vdpa_device *vdpa,
> + unsigned int offset, const void *buf,
> + unsigned int len)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + void __iomem *ioaddr = ldev->ioaddr +
> + VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
> + offset;
> + const u8 *p = buf;
> + int i;
> +
> + for (i = 0; i < len; i++)
> + iowrite8(*p++, ioaddr + i);
> +}
> +
> +static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
> + struct vdpa_callback *cb)
> +{
> + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> +
> + eni_vdpa->config_cb = *cb;
> +}
> +
> +static const struct vdpa_config_ops eni_vdpa_ops = {
> + .get_features = eni_vdpa_get_features,
> + .set_features = eni_vdpa_set_features,
> + .get_status = eni_vdpa_get_status,
> + .set_status = eni_vdpa_set_status,
> + .reset = eni_vdpa_reset,
> + .get_vq_num_max = eni_vdpa_get_vq_num_max,
> + .get_vq_num_min = eni_vdpa_get_vq_num_min,
> + .get_vq_state = eni_vdpa_get_vq_state,
> + .set_vq_state = eni_vdpa_set_vq_state,
> + .set_vq_cb = eni_vdpa_set_vq_cb,
> + .set_vq_ready = eni_vdpa_set_vq_ready,
> + .get_vq_ready = eni_vdpa_get_vq_ready,
> + .set_vq_num = eni_vdpa_set_vq_num,
> + .set_vq_address = eni_vdpa_set_vq_address,
> + .kick_vq = eni_vdpa_kick_vq,
> + .get_device_id = eni_vdpa_get_device_id,
> + .get_vendor_id = eni_vdpa_get_vendor_id,
> + .get_vq_align = eni_vdpa_get_vq_align,
> + .get_config_size = eni_vdpa_get_config_size,
> + .get_config = eni_vdpa_get_config,
> + .set_config = eni_vdpa_set_config,
> + .set_config_cb = eni_vdpa_set_config_cb,
> + .get_vq_irq = eni_vdpa_get_vq_irq,
> +};
> +
> +
> +static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
> + u32 features = vp_legacy_get_features(ldev);
> + u16 num = 2;
> +
> + if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
> + __virtio16 max_virtqueue_pairs;
> +
> + eni_vdpa_get_config(&eni_vdpa->vdpa,
> + offsetof(struct virtio_net_config, max_virtqueue_pairs),
> + &max_virtqueue_pairs,
> + sizeof(max_virtqueue_pairs));
> + num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
> + max_virtqueue_pairs);
> + }
> +
> + if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
> + num += 1;
> +
> + return num;
> +}
> +
> +static void eni_vdpa_free_irq_vectors(void *data)
> +{
> + pci_free_irq_vectors(data);
> +}
> +
> +static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + struct device *dev = &pdev->dev;
> + struct eni_vdpa *eni_vdpa;
> + struct virtio_pci_legacy_device *ldev;
> + int ret, i;
> +
> + ret = pcim_enable_device(pdev);
> + if (ret)
> + return ret;
> +
> + eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
> + dev, &eni_vdpa_ops, NULL, false);
> + if (IS_ERR(eni_vdpa)) {
> + ENI_ERR(pdev, "failed to allocate vDPA structure\n");
> + return PTR_ERR(eni_vdpa);
> + }
> +
> + ldev = &eni_vdpa->ldev;
> + ldev->pci_dev = pdev;
> +
> + ret = vp_legacy_probe(ldev);
> + if (ret) {
> + ENI_ERR(pdev, "failed to probe legacy PCI device\n");
> + goto err;
> + }
> +
> + pci_set_master(pdev);
> + pci_set_drvdata(pdev, eni_vdpa);
> +
> + eni_vdpa->vdpa.dma_dev = &pdev->dev;
> + eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
> +
> + ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
> + if (ret) {
> + ENI_ERR(pdev,
> + "failed for adding devres for freeing irq vectors\n");
> + goto err;
> + }
> +
> + eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
> + sizeof(*eni_vdpa->vring),
> + GFP_KERNEL);
> + if (!eni_vdpa->vring) {
> + ret = -ENOMEM;
> + ENI_ERR(pdev, "failed to allocate virtqueues\n");
> + goto err;
> + }
> +
> + for (i = 0; i < eni_vdpa->queues; i++) {
> + eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
> + eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
> + }
> + eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
> +
> + ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
> + if (ret) {
> + ENI_ERR(pdev, "failed to register to vdpa bus\n");
> + goto err;
> + }
> +
> + return 0;
> +
> +err:
> + put_device(&eni_vdpa->vdpa.dev);
> + return ret;
> +}
> +
> +static void eni_vdpa_remove(struct pci_dev *pdev)
> +{
> + struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
> +
> + vdpa_unregister_device(&eni_vdpa->vdpa);
> + vp_legacy_remove(&eni_vdpa->ldev);
> +}
> +
> +static struct pci_device_id eni_pci_ids[] = {
> + { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
> + VIRTIO_TRANS_ID_NET,
> + PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
> + VIRTIO_ID_NET) },
> + { 0 },
> +};
> +
> +static struct pci_driver eni_vdpa_driver = {
> + .name = "alibaba-eni-vdpa",
> + .id_table = eni_pci_ids,
> + .probe = eni_vdpa_probe,
> + .remove = eni_vdpa_remove,
> +};
> +
> +module_pci_driver(eni_vdpa_driver);
> +
> +MODULE_AUTHOR("Wu Zongyong <[email protected]>");
> +MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
> +MODULE_LICENSE("GPL v2");

2021-10-11 07:03:03

by Jason Wang

Subject: Re: [PATCH v4 6/7] vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE


On 2021/9/29 2:11, Wu Zongyong wrote:
> This attribute advertises the min value of virtqueue size. The value is
> 0 by default.


I think 0 is not a correct value. If I understand the spec correctly, it
should be 1.

Thanks
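
To make the suggestion concrete, here is a small userspace sketch in which the default is 1 rather than 0 when the optional callback is absent. It assumes, as stated above, that 1 is the smallest valid virtqueue size. struct model_dev, model_fixed_min() and model_min_vq_size() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct model_dev;
typedef uint16_t (*model_get_min_fn)(const struct model_dev *dev);

/* Hypothetical device model with an optional "min size" callback. */
struct model_dev {
	model_get_min_fn get_vq_num_min; /* NULL when not implemented */
	uint16_t fixed_size;             /* used by model_fixed_min() */
};

/* Callback for a device whose ring size cannot be changed. */
static uint16_t model_fixed_min(const struct model_dev *dev)
{
	return dev->fixed_size;
}

/* Value to advertise as the minimum size: fall back to 1, not 0. */
static uint16_t model_min_vq_size(const struct model_dev *dev)
{
	if (dev->get_vq_num_min)
		return dev->get_vq_num_min(dev);
	return 1;
}
```

A device without the callback then reports 1, while a fixed-size device such as the one modelled here reports its real ring size.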


>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vdpa/vdpa.c | 5 +++++
> include/uapi/linux/vdpa.h | 1 +
> 2 files changed, 6 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 1dc121a07a93..6ed79fba33e4 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -492,6 +492,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> int flags, struct netlink_ext_ack *extack)
> {
> u16 max_vq_size;
> + u16 min_vq_size = 0;
> u32 device_id;
> u32 vendor_id;
> void *hdr;
> @@ -508,6 +509,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> device_id = vdev->config->get_device_id(vdev);
> vendor_id = vdev->config->get_vendor_id(vdev);
> max_vq_size = vdev->config->get_vq_num_max(vdev);
> + if (vdev->config->get_vq_num_min)
> + min_vq_size = vdev->config->get_vq_num_min(vdev);
>
> err = -EMSGSIZE;
> if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
> @@ -520,6 +523,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
> goto msg_err;
> if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
> goto msg_err;
> + if (nla_put_u16(msg, VDPA_ATTR_DEV_MIN_VQ_SIZE, min_vq_size))
> + goto msg_err;
>
> genlmsg_end(msg, hdr);
> return 0;
> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> index 66a41e4ec163..e3b87879514c 100644
> --- a/include/uapi/linux/vdpa.h
> +++ b/include/uapi/linux/vdpa.h
> @@ -32,6 +32,7 @@ enum vdpa_attr {
> VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
> VDPA_ATTR_DEV_MAX_VQS, /* u32 */
> VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
> + VDPA_ATTR_DEV_MIN_VQ_SIZE, /* u16 */
>
> /* new attributes must be added above here */
> VDPA_ATTR_MAX,

2021-10-15 13:23:18

by Wu Zongyong

Subject: [PATCH v5 0/8] vDPA driver for Alibaba ENI

This series implements the vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built on the virtio-pci 0.9.5 specification.

Changes since V4:
- check return values of get_vq_num_{max,min} when probing devices
- disable the driver on BE host via Kconfig
- add missing commit message

Changes since V3:
- validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
- present F_ORDER_PLATFORM in get_features
- remove endian check since ENI always uses little endian

Changes since V2:
- add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose the correct virtqueue
size, as suggested by Jason Wang
- present ACCESS_PLATFORM in get_features callback as suggested by Jason
Wang
- disable this driver on Big Endian host as suggested by Jason Wang
- fix a typo

Changes since V1:
- add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
the vdpa device is legacy
- implement a dedicated driver for Alibaba ENI instead of a legacy
virtio-pci driver, as suggested by Jason Wang
- some bugs fixed

Wu Zongyong (8):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vdpa: add new callback get_vq_num_min in vdpa_config_ops
vdpa: min vq num of vdpa device cannot be greater than max vq num
virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
eni_vdpa: add vDPA driver for Alibaba ENI

drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
drivers/vdpa/vdpa.c | 13 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++---
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
drivers/virtio/virtio_vdpa.c | 21 +-
include/linux/vdpa.h | 6 +-
include/linux/virtio_pci_legacy.h | 44 ++
include/uapi/linux/vdpa.h | 1 +
16 files changed, 924 insertions(+), 89 deletions(-)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1

2021-10-15 13:23:18

by Wu Zongyong

Subject: [PATCH v5 1/8] virtio-pci: introduce legacy device module

Split the common code out of virtio-pci-legacy so that the vDPA driver
can reuse it later.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/Kconfig | 10 ++
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 +++---------
drivers/virtio/virtio_pci_legacy_dev.c | 220 +++++++++++++++++++++++++
include/linux/virtio_pci_legacy.h | 44 +++++
7 files changed, 312 insertions(+), 83 deletions(-)
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index ce1b3f6ec325..8fcf94cd2c96 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
PCI device with possible vendor specific extensions. Any
module that selects this module must depend on PCI.

+config VIRTIO_PCI_LIB_LEGACY
+ tristate
+ help
+ Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
+ implementation.
+ This module implements the basic probe and control for devices
+ which are based on legacy PCI device. Any module that selects this
+ module must depend on PCI.
+
menuconfig VIRTIO_MENU
bool "Virtio drivers"
default y
@@ -43,6 +52,7 @@ config VIRTIO_PCI_LEGACY
bool "Support for legacy virtio draft 0.9.X and older devices"
default y
depends on VIRTIO_PCI
+ select VIRTIO_PCI_LIB_LEGACY
help
Virtio PCI Card 0.9.X Draft (circa 2014) and older device support.

diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 699bbea0465f..0a82d0873248 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
+obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index b35bb2d57f62..d724f676608b 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -549,6 +549,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,

pci_set_master(pci_dev);

+ vp_dev->is_legacy = vp_dev->ldev.ioaddr ? true : false;
+
rc = register_virtio_device(&vp_dev->vdev);
reg_dev = vp_dev;
if (rc)
@@ -557,10 +559,10 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
return 0;

err_register:
- if (vp_dev->ioaddr)
- virtio_pci_legacy_remove(vp_dev);
+ if (vp_dev->is_legacy)
+ virtio_pci_legacy_remove(vp_dev);
else
- virtio_pci_modern_remove(vp_dev);
+ virtio_pci_modern_remove(vp_dev);
err_probe:
pci_disable_device(pci_dev);
err_enable_device:
@@ -587,7 +589,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)

unregister_virtio_device(&vp_dev->vdev);

- if (vp_dev->ioaddr)
+ if (vp_dev->is_legacy)
virtio_pci_legacy_remove(vp_dev);
else
virtio_pci_modern_remove(vp_dev);
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index beec047a8f8d..eb17a29fc7ef 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -25,6 +25,7 @@
#include <linux/virtio_config.h>
#include <linux/virtio_ring.h>
#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
#include <linux/virtio_pci_modern.h>
#include <linux/highmem.h>
#include <linux/spinlock.h>
@@ -44,16 +45,14 @@ struct virtio_pci_vq_info {
struct virtio_pci_device {
struct virtio_device vdev;
struct pci_dev *pci_dev;
+ struct virtio_pci_legacy_device ldev;
struct virtio_pci_modern_device mdev;

- /* In legacy mode, these two point to within ->legacy. */
+ bool is_legacy;
+
/* Where to read and clear interrupt */
u8 __iomem *isr;

- /* Legacy only field */
- /* the IO mapping for the PCI config space */
- void __iomem *ioaddr;
-
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
struct list_head virtqueues;
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index d62e9835aeec..82eb437ad920 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -14,6 +14,7 @@
* Michael S. Tsirkin <[email protected]>
*/

+#include "linux/virtio_pci_legacy.h"
#include "virtio_pci_common.h"

/* virtio config->get_features() implementation */
@@ -23,7 +24,7 @@ static u64 vp_get_features(struct virtio_device *vdev)

/* When someone needs more than 32 feature bits, we'll need to
* steal a bit to indicate that the rest are somewhere else. */
- return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+ return vp_legacy_get_features(&vp_dev->ldev);
}

/* virtio config->finalize_features() implementation */
@@ -38,7 +39,7 @@ static int vp_finalize_features(struct virtio_device *vdev)
BUG_ON((u32)vdev->features != vdev->features);

/* We only support 32 feature bits. */
- iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+ vp_legacy_set_features(&vp_dev->ldev, vdev->features);

return 0;
}
@@ -48,7 +49,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
u8 *ptr = buf;
@@ -64,7 +65,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
const void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
const u8 *ptr = buf;
@@ -78,7 +79,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
static u8 vp_get_status(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ return vp_legacy_get_status(&vp_dev->ldev);
}

static void vp_set_status(struct virtio_device *vdev, u8 status)
@@ -86,28 +87,24 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* We should never be setting status to 0. */
BUG_ON(status == 0);
- iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, status);
}

static void vp_reset(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* 0 status means a reset. */
- iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, 0);
/* Flush out the status write, and flush in device writes,
* including MSi-X interrupts, if any. */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_get_status(&vp_dev->ldev);
/* Flush pending VQ/configuration callbacks. */
vp_synchronize_vectors(vdev);
}

static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
{
- /* Setup the vector used for configuration events */
- iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
- /* Verify we had enough resources to assign the vector */
- /* Will also flush the write out to device */
- return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ return vp_legacy_config_vector(&vp_dev->ldev, vector);
}

static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
@@ -123,12 +120,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
int err;
u64 q_pfn;

- /* Select the queue we're interested in */
- iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
/* Check if queue is either not available or already active. */
- num = ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
- if (!num || ioread32(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN))
+ num = vp_legacy_get_queue_size(&vp_dev->ldev, index);
+ if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
return ERR_PTR(-ENOENT);

info->msix_vector = msix_vec;
@@ -151,13 +145,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
}

/* activate the queue */
- iowrite32(q_pfn, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);

- vq->priv = (void __force *)vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ vq->priv = (void __force *)vp_dev->ldev.ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;

if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
- iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
- msix_vec = ioread16(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ msix_vec = vp_legacy_queue_vector(&vp_dev->ldev, index, msix_vec);
if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
err = -EBUSY;
goto out_deactivate;
@@ -167,7 +160,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
return vq;

out_deactivate:
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, 0);
out_del_vq:
vring_del_virtqueue(vq);
return ERR_PTR(err);
@@ -178,17 +171,15 @@ static void del_vq(struct virtio_pci_vq_info *info)
struct virtqueue *vq = info->vq;
struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);

- iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
if (vp_dev->msix_enabled) {
- iowrite16(VIRTIO_MSI_NO_VECTOR,
- vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ vp_legacy_queue_vector(&vp_dev->ldev, vq->index,
+ VIRTIO_MSI_NO_VECTOR);
/* Flush the write out to device */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
+ ioread8(vp_dev->ldev.ioaddr + VIRTIO_PCI_ISR);
}

/* Select and deactivate the queue */
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, vq->index, 0);

vring_del_virtqueue(vq);
}
@@ -211,51 +202,18 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
/* the PCI probing function */
int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
{
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
struct pci_dev *pci_dev = vp_dev->pci_dev;
int rc;

- /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
- if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
- return -ENODEV;
-
- if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) {
- printk(KERN_ERR "virtio_pci: expected ABI version %d, got %d\n",
- VIRTIO_PCI_ABI_VERSION, pci_dev->revision);
- return -ENODEV;
- }
-
- rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
- if (rc) {
- rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
- } else {
- /*
- * The virtio ring base address is expressed as a 32-bit PFN,
- * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
- */
- dma_set_coherent_mask(&pci_dev->dev,
- DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
- }
-
- if (rc)
- dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+ ldev->pci_dev = pci_dev;

- rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ rc = vp_legacy_probe(ldev);
if (rc)
return rc;

- rc = -ENOMEM;
- vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
- if (!vp_dev->ioaddr)
- goto err_iomap;
-
- vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR;
-
- /* we use the subsystem vendor/device id as the virtio vendor/device
- * id. this allows us to use the same PCI vendor/device id for all
- * virtio devices and to identify the particular virtio driver by
- * the subsystem ids */
- vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
- vp_dev->vdev.id.device = pci_dev->subsystem_device;
+ vp_dev->isr = ldev->isr;
+ vp_dev->vdev.id = ldev->id;

vp_dev->vdev.config = &virtio_pci_config_ops;

@@ -264,16 +222,11 @@ int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
vp_dev->del_vq = del_vq;

return 0;
-
-err_iomap:
- pci_release_region(pci_dev, 0);
- return rc;
}

void virtio_pci_legacy_remove(struct virtio_pci_device *vp_dev)
{
- struct pci_dev *pci_dev = vp_dev->pci_dev;
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;

- pci_iounmap(pci_dev, vp_dev->ioaddr);
- pci_release_region(pci_dev, 0);
+ vp_legacy_remove(ldev);
}
diff --git a/drivers/virtio/virtio_pci_legacy_dev.c b/drivers/virtio/virtio_pci_legacy_dev.c
new file mode 100644
index 000000000000..9b97680dd02b
--- /dev/null
+++ b/drivers/virtio/virtio_pci_legacy_dev.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include "linux/virtio_pci.h"
+#include <linux/virtio_pci_legacy.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+
+/*
+ * vp_legacy_probe: probe the legacy virtio pci device. Note that the
+ * caller is required to enable the PCI device before calling this function.
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns 0 on success, a negative error code otherwise
+ */
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+ int rc;
+
+ /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
+ if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
+ return -ENODEV;
+
+ if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION)
+ return -ENODEV;
+
+ rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
+ if (rc) {
+ rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
+ } else {
+ /*
+ * The virtio ring base address is expressed as a 32-bit PFN,
+ * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
+ */
+ dma_set_coherent_mask(&pci_dev->dev,
+ DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
+ }
+
+ if (rc)
+ dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+
+ rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ if (rc)
+ return rc;
+
+ ldev->ioaddr = pci_iomap(pci_dev, 0, 0);
+ if (!ldev->ioaddr)
+ goto err_iomap;
+
+ ldev->isr = ldev->ioaddr + VIRTIO_PCI_ISR;
+
+ ldev->id.vendor = pci_dev->subsystem_vendor;
+ ldev->id.device = pci_dev->subsystem_device;
+
+ return 0;
+err_iomap:
+ pci_release_region(pci_dev, 0);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(vp_legacy_probe);
+
+/*
+ * vp_legacy_remove: remove and clean up the legacy virtio pci device
+ * @ldev: the legacy virtio-pci device
+ */
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+
+ pci_iounmap(pci_dev, ldev->ioaddr);
+ pci_release_region(pci_dev, 0);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_remove);
+
+/*
+ * vp_legacy_get_features - get features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the features read from the device
+ */
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev)
+{
+
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_features);
+
+/*
+ * vp_legacy_get_driver_features - get driver features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_driver_features);
+
+/*
+ * vp_legacy_set_features - set features to device
+ * @ldev: the legacy virtio-pci device
+ * @features: the features set to device
+ */
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features)
+{
+ iowrite32(features, ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_features);
+
+/*
+ * vp_legacy_get_status - get the device status
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the status read from device
+ */
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread8(ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_status);
+
+/*
+ * vp_legacy_set_status - set status to device
+ * @ldev: the legacy virtio-pci device
+ * @status: the status set to device
+ */
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status)
+{
+ iowrite8(status, ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_status);
+
+/*
+ * vp_legacy_queue_vector - set the MSIX vector for a specific virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: queue index
+ * @vector: the queue vector
+ *
+ * Returns the queue vector read from the device
+ */
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 index, u16 vector)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ /* Flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_queue_vector);
+
+/*
+ * vp_legacy_config_vector - set the vector for config interrupt
+ * @ldev: the legacy virtio-pci device
+ * @vector: the config vector
+ *
+ * Returns the config vector read from the device
+ */
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector)
+{
+ /* Setup the vector used for configuration events */
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ /* Verify we had enough resources to assign the vector */
+ /* Will also flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_config_vector);
+
+/*
+ * vp_legacy_set_queue_address - set the virtqueue address
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ * @queue_pfn: pfn of the virtqueue
+ */
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite32(queue_pfn, ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_queue_address);
+
+/*
+ * vp_legacy_get_queue_enable - check whether a virtqueue is enabled
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns whether a virtqueue is enabled or not
+ */
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_enable);
+
+/*
+ * vp_legacy_get_queue_size - get size for a virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns the size of the virtqueue
+ */
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread16(ldev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_size);
+
+MODULE_VERSION("0.1");
+MODULE_DESCRIPTION("Legacy Virtio PCI Device");
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/virtio_pci_legacy.h b/include/linux/virtio_pci_legacy.h
new file mode 100644
index 000000000000..ee2c6157215f
--- /dev/null
+++ b/include/linux/virtio_pci_legacy.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_VIRTIO_PCI_LEGACY_H
+#define _LINUX_VIRTIO_PCI_LEGACY_H
+
+#include "linux/mod_devicetable.h"
+#include <linux/pci.h>
+#include <linux/virtio_pci.h>
+
+struct virtio_pci_legacy_device {
+ struct pci_dev *pci_dev;
+
+ /* Where to read and clear interrupt */
+ u8 __iomem *isr;
+ /* The IO mapping for the PCI config space (legacy mode only) */
+ void __iomem *ioaddr;
+
+ struct virtio_device_id id;
+};
+
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev);
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features);
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status);
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 vector);
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector);
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn);
+void vp_legacy_set_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx, bool enable);
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+void vp_legacy_set_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 size);
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev);
+
+#endif
--
2.31.1
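
The queue-indexed accessors added in virtio_pci_legacy_dev.c share one pattern: write the queue index to VIRTIO_PCI_QUEUE_SEL, then access the per-queue register. The following is a minimal user-space sketch of that pattern, not the kernel code itself. It assumes a hypothetical in-memory model (struct demo_legacy_dev and the demo_* helpers are illustrative names that do not appear in the patch), and plain struct fields stand in for the real ioread/iowrite accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_QUEUES 2

/* Hypothetical in-memory stand-in for the legacy per-queue registers. */
struct demo_legacy_dev {
	uint16_t queue_sel;                  /* models VIRTIO_PCI_QUEUE_SEL */
	uint16_t queue_num[DEMO_NUM_QUEUES]; /* models VIRTIO_PCI_QUEUE_NUM */
	uint32_t queue_pfn[DEMO_NUM_QUEUES]; /* models VIRTIO_PCI_QUEUE_PFN */
};

/* Select the queue first; the per-queue registers then refer to it. */
static void demo_select_queue(struct demo_legacy_dev *dev, uint16_t index)
{
	assert(index < DEMO_NUM_QUEUES);
	dev->queue_sel = index;
}

/* Read the fixed ring size of one queue. */
static uint16_t demo_get_queue_size(struct demo_legacy_dev *dev, uint16_t index)
{
	demo_select_queue(dev, index);
	return dev->queue_num[dev->queue_sel];
}

/* Writing a non-zero PFN activates the queue, writing 0 deactivates it. */
static void demo_set_queue_address(struct demo_legacy_dev *dev,
				   uint16_t index, uint32_t pfn)
{
	demo_select_queue(dev, index);
	dev->queue_pfn[dev->queue_sel] = pfn;
}

/* A legacy queue counts as enabled once a non-zero PFN has been written. */
static bool demo_get_queue_enable(struct demo_legacy_dev *dev, uint16_t index)
{
	demo_select_queue(dev, index);
	return dev->queue_pfn[dev->queue_sel] != 0;
}
```

Folding the select step into every accessor means callers never depend on a previously selected queue, which is why the patch can drop the standalone VIRTIO_PCI_QUEUE_SEL writes from setup_vq() and del_vq().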

2021-10-15 13:23:22

by Wu Zongyong

Subject: [PATCH v5 6/8] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

For devices that implement the get_vq_num_min callback, the driver
should not negotiate the virtqueue size with the backend vdpa device if
the value returned by get_vq_num_min equals the value returned by
get_vq_num_max.
This is useful for vdpa devices based on the legacy virtio specification.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..e42ace29daa1 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
/* Assume split virtqueue, switch to packed if necessary */
struct vdpa_vq_state state = {0};
unsigned long flags;
- u32 align, num;
+ u32 align, max_num, min_num = 0;
+ bool may_reduce_num = true;
int err;

if (!name)
@@ -163,22 +164,32 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
if (!info)
return ERR_PTR(-ENOMEM);

- num = ops->get_vq_num_max(vdpa);
- if (num == 0) {
+ max_num = ops->get_vq_num_max(vdpa);
+ if (max_num == 0) {
err = -ENOENT;
goto error_new_virtqueue;
}

+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdpa);
+
+ may_reduce_num = (max_num == min_num) ? false : true;
+
/* Create the vring */
align = ops->get_vq_align(vdpa);
- vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ vq = vring_create_virtqueue(index, max_num, align, vdev,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
goto error_new_virtqueue;
}

+ if (virtqueue_get_vring_size(vq) < min_num) {
+ err = -EINVAL;
+ goto err_vq;
+ }
+
/* Setup virtqueue callback */
cb.callback = virtio_vdpa_virtqueue_cb;
cb.private = info;
--
2.31.1
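
The sizing decision in the patch above reduces to two small rules: the ring may be shrunk only when min and max differ, and a ring that ends up smaller than the minimum is rejected. The following is a minimal stand-alone sketch of those two rules. The demo_* function names are hypothetical and exist only for illustration; the real logic lives inline in virtio_vdpa_setup_vq().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * The ring may only be shrunk when the device accepts more than one size.
 * min_num <= max_num is assumed to have been validated beforehand.
 */
static bool demo_may_reduce_num(uint32_t max_num, uint32_t min_num)
{
	assert(min_num <= max_num);
	return max_num != min_num;
}

/* Reject a ring that ended up smaller than the device's minimum. */
static int demo_check_vring_size(uint32_t vring_size, uint32_t min_num)
{
	return vring_size < min_num ? -EINVAL : 0;
}
```

When no get_vq_num_min callback is provided, min_num stays 0, so shrinking is permitted for any non-zero maximum and the size check always passes.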

2021-10-15 13:23:22

by Wu Zongyong

Subject: [PATCH v5 4/8] vdpa: add new callback get_vq_num_min in vdpa_config_ops

This callback is optional. For vdpa devices that do not support changing
the virtqueue size, get_vq_num_min and get_vq_num_max return the same
value, so that users can choose a correct value for that device.

Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Wu Zongyong <[email protected]>
---
include/linux/vdpa.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index a896ee021e5f..30864848950b 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -171,6 +171,9 @@ struct vdpa_map_file {
* @get_vq_num_max: Get the max size of virtqueue
* @vdev: vdpa device
* Returns u16: max size of virtqueue
+ * @get_vq_num_min: Get the min size of virtqueue (optional)
+ * @vdev: vdpa device
+ * Returns u16: min size of virtqueue
* @get_device_id: Get virtio device id
* @vdev: vdpa device
* Returns u32: virtio device id
@@ -266,6 +269,7 @@ struct vdpa_config_ops {
void (*set_config_cb)(struct vdpa_device *vdev,
struct vdpa_callback *cb);
u16 (*get_vq_num_max)(struct vdpa_device *vdev);
+ u16 (*get_vq_num_min)(struct vdpa_device *vdev);
u32 (*get_device_id)(struct vdpa_device *vdev);
u32 (*get_vendor_id)(struct vdpa_device *vdev);
u8 (*get_status)(struct vdpa_device *vdev);
--
2.31.1
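
One way a user of the two callbacks might pick a size is to clamp the requested value into the range the device advertises. This is only a sketch under that assumption: the patch does not prescribe a selection policy, and demo_choose_vq_size is a hypothetical helper, not a kernel or QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clamp a requested virtqueue size into [min_num, max_num].
 * For a device with a fixed size, min_num == max_num, so every
 * request collapses to that single value.
 */
static uint16_t demo_choose_vq_size(uint16_t requested,
				    uint16_t min_num, uint16_t max_num)
{
	assert(min_num <= max_num);

	if (requested < min_num)
		return min_num;
	if (requested > max_num)
		return max_num;
	return requested;
}
```

With min_num == max_num == 256, as a legacy device would report, both a request of 64 and a request of 1024 resolve to 256.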

2021-10-15 13:24:36

by Wu Zongyong

Subject: [PATCH v5 5/8] vdpa: min vq num of vdpa device cannot be greater than max vq num

Fail to probe the vdpa device if the min virtqueue num returned
by get_vq_num_min is greater than the max virtqueue num returned by
get_vq_num_max.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/vdpa.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1dc121a07a93..fd014ecec711 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -26,8 +26,16 @@ static int vdpa_dev_probe(struct device *d)
{
struct vdpa_device *vdev = dev_to_vdpa(d);
struct vdpa_driver *drv = drv_to_vdpa(vdev->dev.driver);
+ const struct vdpa_config_ops *ops = vdev->config;
+ u32 max_num, min_num = 0;
int ret = 0;

+ max_num = ops->get_vq_num_max(vdev);
+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdev);
+ if (max_num < min_num)
+ return -EINVAL;
+
if (drv && drv->probe)
ret = drv->probe(vdev);

--
2.31.1
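
The probe-time check above treats get_vq_num_min as optional and defaults the minimum to 0 when the callback is absent. The following user-space sketch shows the same shape. struct demo_config_ops, the demo_* ops tables, and the callbacks are hypothetical stand-ins for vdpa_config_ops and take no device argument to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct vdpa_config_ops. */
struct demo_config_ops {
	uint16_t (*get_vq_num_max)(void);
	uint16_t (*get_vq_num_min)(void); /* optional, may be NULL */
};

static uint16_t demo_num_256(void) { return 256; }
static uint16_t demo_num_512(void) { return 512; }

/* No minimum callback: any size up to the maximum is acceptable. */
static const struct demo_config_ops demo_modern_ops = {
	.get_vq_num_max = demo_num_256,
	.get_vq_num_min = NULL,
};

/* Fixed-size device: minimum equals maximum. */
static const struct demo_config_ops demo_legacy_ops = {
	.get_vq_num_max = demo_num_256,
	.get_vq_num_min = demo_num_256,
};

/* Inconsistent device: minimum exceeds maximum. */
static const struct demo_config_ops demo_broken_ops = {
	.get_vq_num_max = demo_num_256,
	.get_vq_num_min = demo_num_512,
};

/* Mirror of the probe check: fail when min is greater than max. */
static int demo_probe_check(const struct demo_config_ops *ops)
{
	uint32_t max_num, min_num = 0;

	assert(ops && ops->get_vq_num_max);

	max_num = ops->get_vq_num_max();
	if (ops->get_vq_num_min)
		min_num = ops->get_vq_num_min();

	return max_num < min_num ? -EINVAL : 0;
}
```

Only the inconsistent table is rejected; both the table without a minimum callback and the fixed-size table pass the check.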

2021-10-15 13:24:36

by Wu Zongyong

Subject: [PATCH v5 3/8] vp_vdpa: add vq irq offloading support

This patch implements the get_vq_irq() callback for virtio pci devices
to allow irq offloading.

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
index 5bcd00246d2e..e3ff7875e123 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
return vp_modern_get_status(mdev);
}

+static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ int irq = vp_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
{
struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
@@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
.get_config = vp_vdpa_get_config,
.set_config = vp_vdpa_set_config,
.set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
};

static void vp_vdpa_free_irq_vectors(void *data)
--
2.31.1

2021-10-15 13:24:37

by Wu Zongyong

Subject: [PATCH v5 7/8] vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE

This attribute advertises the min value of virtqueue size. The value is
0 by default.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/vdpa.c | 5 +++++
include/uapi/linux/vdpa.h | 1 +
2 files changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index fd014ecec711..4aeb1458b924 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -500,6 +500,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
int flags, struct netlink_ext_ack *extack)
{
u16 max_vq_size;
+ u16 min_vq_size = 0;
u32 device_id;
u32 vendor_id;
void *hdr;
@@ -516,6 +517,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
device_id = vdev->config->get_device_id(vdev);
vendor_id = vdev->config->get_vendor_id(vdev);
max_vq_size = vdev->config->get_vq_num_max(vdev);
+ if (vdev->config->get_vq_num_min)
+ min_vq_size = vdev->config->get_vq_num_min(vdev);

err = -EMSGSIZE;
if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
@@ -528,6 +531,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
goto msg_err;
if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
goto msg_err;
+ if (nla_put_u16(msg, VDPA_ATTR_DEV_MIN_VQ_SIZE, min_vq_size))
+ goto msg_err;

genlmsg_end(msg, hdr);
return 0;
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 66a41e4ec163..e3b87879514c 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -32,6 +32,7 @@ enum vdpa_attr {
VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
VDPA_ATTR_DEV_MAX_VQS, /* u32 */
VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
+ VDPA_ATTR_DEV_MIN_VQ_SIZE, /* u16 */

/* new attributes must be added above here */
VDPA_ATTR_MAX,
--
2.31.1

2021-10-15 13:24:38

by Wu Zongyong

Subject: [PATCH v5 8/8] eni_vdpa: add vDPA driver for Alibaba ENI

This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built upon the virtio 0.9.5 specification.
This driver does not support running on big-endian (BE) hosts.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 ++++++++++++++++++++++++++++++++
4 files changed, 565 insertions(+)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 3d91982d8371..c0232a2148a7 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -78,4 +78,12 @@ config VP_VDPA
help
This kernel module bridges virtio PCI device to vDPA bus.

+config ALIBABA_ENI_VDPA
+ tristate "vDPA driver for Alibaba ENI"
+ select VIRTIO_PCI_LIB_LEGACY
+ depends on PCI_MSI && !CPU_BIG_ENDIAN
+ help
+ VDPA driver for Alibaba ENI (Elastic Network Interface), which is built upon
+ the virtio 0.9.5 specification.
+
endif # VDPA
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index f02ebed33f19..15665563a7f4 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
obj-$(CONFIG_IFCVF) += ifcvf/
obj-$(CONFIG_MLX5_VDPA) += mlx5/
obj-$(CONFIG_VP_VDPA) += virtio_pci/
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
new file mode 100644
index 000000000000..ef4aae69f87a
--- /dev/null
+++ b/drivers/vdpa/alibaba/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
+
diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
new file mode 100644
index 000000000000..6a09f157d810
--- /dev/null
+++ b/drivers/vdpa/alibaba/eni_vdpa.c
@@ -0,0 +1,553 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
+ *
+ * Copyright (c) 2021, Alibaba Inc. All rights reserved.
+ * Author: Wu Zongyong <[email protected]>
+ *
+ */
+
+#include "linux/bits.h"
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/vdpa.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
+#include <uapi/linux/virtio_net.h>
+
+#define ENI_MSIX_NAME_SIZE 256
+
+#define ENI_ERR(pdev, fmt, ...) \
+ dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_DBG(pdev, fmt, ...) \
+ dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_INFO(pdev, fmt, ...) \
+ dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+
+struct eni_vring {
+ void __iomem *notify;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ struct vdpa_callback cb;
+ int irq;
+};
+
+struct eni_vdpa {
+ struct vdpa_device vdpa;
+ struct virtio_pci_legacy_device ldev;
+ struct eni_vring *vring;
+ struct vdpa_callback config_cb;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ int config_irq;
+ int queues;
+ int vectors;
+};
+
+static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
+{
+ return container_of(vdpa, struct eni_vdpa, vdpa);
+}
+
+static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ return &eni_vdpa->ldev;
+}
+
+static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u64 features = vp_legacy_get_features(ldev);
+
+ features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
+ features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);
+
+ return features;
+}
+
+static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ if (!(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
+ ENI_ERR(ldev->pci_dev,
+ "VIRTIO_NET_F_MRG_RXBUF is not negotiated\n");
+ return -EINVAL;
+ }
+
+ vp_legacy_set_features(ldev, (u32)features);
+
+ return 0;
+}
+
+static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_status(ldev);
+}
+
+static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ int irq = eni_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
+static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i;
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
+ &eni_vdpa->vring[i]);
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ }
+ }
+
+ if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+ }
+
+ if (eni_vdpa->vectors) {
+ pci_free_irq_vectors(pdev);
+ eni_vdpa->vectors = 0;
+ }
+}
+
+static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
+{
+ struct eni_vring *vring = arg;
+
+ if (vring->cb.callback)
+ return vring->cb.callback(vring->cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
+{
+ struct eni_vdpa *eni_vdpa = arg;
+
+ if (eni_vdpa->config_cb.callback)
+ return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i, ret, irq;
+ int queues = eni_vdpa->queues;
+ int vectors = queues + 1;
+
+ ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
+ if (ret != vectors) {
+ ENI_ERR(pdev,
+ "failed to allocate irq vectors want %d but %d\n",
+ vectors, ret);
+ return ret;
+ }
+
+ eni_vdpa->vectors = vectors;
+
+ for (i = 0; i < queues; i++) {
+ snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
+ "eni-vdpa[%s]-%d", pci_name(pdev), i);
+ irq = pci_irq_vector(pdev, i);
+ ret = devm_request_irq(&pdev->dev, irq,
+ eni_vdpa_vq_handler,
+ 0, eni_vdpa->vring[i].msix_name,
+ &eni_vdpa->vring[i]);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_queue_vector(ldev, i, i);
+ eni_vdpa->vring[i].irq = irq;
+ }
+
+ snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config",
+ pci_name(pdev));
+ irq = pci_irq_vector(pdev, queues);
+ ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
+ eni_vdpa->msix_name, eni_vdpa);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for config vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_config_vector(ldev, queues);
+ eni_vdpa->config_irq = irq;
+
+ return 0;
+err:
+ eni_vdpa_free_irq(eni_vdpa);
+ return ret;
+}
+
+static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
+ !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
+ eni_vdpa_request_irq(eni_vdpa);
+ }
+
+ vp_legacy_set_status(ldev, status);
+
+ if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+ (s & VIRTIO_CONFIG_S_DRIVER_OK))
+ eni_vdpa_free_irq(eni_vdpa);
+}
+
+static int eni_vdpa_reset(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ vp_legacy_set_status(ldev, 0);
+
+ if (s & VIRTIO_CONFIG_S_DRIVER_OK)
+ eni_vdpa_free_irq(eni_vdpa);
+
+ return 0;
+}
+
+static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_vq_state *state)
+{
+ return -EOPNOTSUPP;
+}
+
+static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
+ const struct vdpa_vq_state *state)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ const struct vdpa_vq_state_split *split = &state->split;
+
+ /* ENI is built upon the legacy virtio-pci specification, which does
+ * not support setting the state of a virtqueue. But if the state
+ * happens to equal the device's initial state, we can let it go.
+ */
+ if (!vp_legacy_get_queue_enable(ldev, qid)
+ && split->avail_index == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+
+static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->vring[qid].cb = *cb;
+}
+
+static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
+ bool ready)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ /* ENI is a legacy virtio-pci device. Setting a virtqueue ready is
+ * not supported by the legacy specification. But we can disable a
+ * virtqueue by setting its address to 0.
+ */
+ if (!ready)
+ vp_legacy_set_queue_address(ldev, qid, 0);
+}
+
+static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_enable(ldev, qid);
+}
+
+static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
+ u32 num)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ struct pci_dev *pdev = ldev->pci_dev;
+ u16 n = vp_legacy_get_queue_size(ldev, qid);
+
+ /* ENI is a legacy virtio-pci device, which does not allow changing
+ * the virtqueue size. Just report an error if someone tries to
+ * change it.
+ */
+ if (num != n)
+ ENI_ERR(pdev,
+ "not support to set vq %u fixed num %u to %u\n",
+ qid, n, num);
+}
+
+static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
+ u64 desc_area, u64 driver_area,
+ u64 device_area)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
+
+ vp_legacy_set_queue_address(ldev, qid, pfn);
+
+ return 0;
+}
+
+static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ iowrite16(qid, eni_vdpa->vring[qid].notify);
+}
+
+static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.device;
+}
+
+static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.vendor;
+}
+
+static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
+{
+ return PAGE_SIZE;
+}
+
+static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
+{
+ return sizeof(struct virtio_net_config);
+}
+
+
+static void eni_vdpa_get_config(struct vdpa_device *vdpa,
+ unsigned int offset,
+ void *buf, unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ *p++ = ioread8(ioaddr + i);
+}
+
+static void eni_vdpa_set_config(struct vdpa_device *vdpa,
+ unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ const u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ iowrite8(*p++, ioaddr + i);
+}
+
+static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->config_cb = *cb;
+}
+
+static const struct vdpa_config_ops eni_vdpa_ops = {
+ .get_features = eni_vdpa_get_features,
+ .set_features = eni_vdpa_set_features,
+ .get_status = eni_vdpa_get_status,
+ .set_status = eni_vdpa_set_status,
+ .reset = eni_vdpa_reset,
+ .get_vq_num_max = eni_vdpa_get_vq_num_max,
+ .get_vq_num_min = eni_vdpa_get_vq_num_min,
+ .get_vq_state = eni_vdpa_get_vq_state,
+ .set_vq_state = eni_vdpa_set_vq_state,
+ .set_vq_cb = eni_vdpa_set_vq_cb,
+ .set_vq_ready = eni_vdpa_set_vq_ready,
+ .get_vq_ready = eni_vdpa_get_vq_ready,
+ .set_vq_num = eni_vdpa_set_vq_num,
+ .set_vq_address = eni_vdpa_set_vq_address,
+ .kick_vq = eni_vdpa_kick_vq,
+ .get_device_id = eni_vdpa_get_device_id,
+ .get_vendor_id = eni_vdpa_get_vendor_id,
+ .get_vq_align = eni_vdpa_get_vq_align,
+ .get_config_size = eni_vdpa_get_config_size,
+ .get_config = eni_vdpa_get_config,
+ .set_config = eni_vdpa_set_config,
+ .set_config_cb = eni_vdpa_set_config_cb,
+ .get_vq_irq = eni_vdpa_get_vq_irq,
+};
+
+
+static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u32 features = vp_legacy_get_features(ldev);
+ u16 num = 2;
+
+ if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
+ __virtio16 max_virtqueue_pairs;
+
+ eni_vdpa_get_config(&eni_vdpa->vdpa,
+ offsetof(struct virtio_net_config, max_virtqueue_pairs),
+ &max_virtqueue_pairs,
+ sizeof(max_virtqueue_pairs));
+ num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
+ max_virtqueue_pairs);
+ }
+
+ if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+ num += 1;
+
+ return num;
+}
+
+static void eni_vdpa_free_irq_vectors(void *data)
+{
+ pci_free_irq_vectors(data);
+}
+
+static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct device *dev = &pdev->dev;
+ struct eni_vdpa *eni_vdpa;
+ struct virtio_pci_legacy_device *ldev;
+ int ret, i;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
+ dev, &eni_vdpa_ops, NULL, false);
+ if (IS_ERR(eni_vdpa)) {
+ ENI_ERR(pdev, "failed to allocate vDPA structure\n");
+ return PTR_ERR(eni_vdpa);
+ }
+
+ ldev = &eni_vdpa->ldev;
+ ldev->pci_dev = pdev;
+
+ ret = vp_legacy_probe(ldev);
+ if (ret) {
+ ENI_ERR(pdev, "failed to probe legacy PCI device\n");
+ goto err;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, eni_vdpa);
+
+ eni_vdpa->vdpa.dma_dev = &pdev->dev;
+ eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
+
+ ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
+ if (ret) {
+ ENI_ERR(pdev,
+ "failed for adding devres for freeing irq vectors\n");
+ goto err;
+ }
+
+ eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
+ sizeof(*eni_vdpa->vring),
+ GFP_KERNEL);
+ if (!eni_vdpa->vring) {
+ ret = -ENOMEM;
+ ENI_ERR(pdev, "failed to allocate virtqueues\n");
+ goto err;
+ }
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ }
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+
+ ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
+ if (ret) {
+ ENI_ERR(pdev, "failed to register to vdpa bus\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ put_device(&eni_vdpa->vdpa.dev);
+ return ret;
+}
+
+static void eni_vdpa_remove(struct pci_dev *pdev)
+{
+ struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
+
+ vdpa_unregister_device(&eni_vdpa->vdpa);
+ vp_legacy_remove(&eni_vdpa->ldev);
+}
+
+static struct pci_device_id eni_pci_ids[] = {
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_TRANS_ID_NET,
+ PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_ID_NET) },
+ { 0 },
+};
+
+static struct pci_driver eni_vdpa_driver = {
+ .name = "alibaba-eni-vdpa",
+ .id_table = eni_pci_ids,
+ .probe = eni_vdpa_probe,
+ .remove = eni_vdpa_remove,
+};
+
+module_pci_driver(eni_vdpa_driver);
+
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
+MODULE_LICENSE("GPL v2");
--
2.31.1
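
As an illustration of the queue counting done by eni_vdpa_get_num_queues() in the patch above, here is a minimal userspace sketch. The helper name eni_num_queues() is hypothetical and not part of the driver; the feature bit numbers are the virtio-net ones, and max_virtqueue_pairs is passed in directly rather than read from device config space.

```c
#include <stdint.h>

/* Feature bit numbers from the virtio-net specification. */
#define VIRTIO_NET_F_CTRL_VQ 17
#define VIRTIO_NET_F_MQ      22

/*
 * Hypothetical helper mirroring the arithmetic of
 * eni_vdpa_get_num_queues(): one RX/TX pair by default,
 * 2 * max_virtqueue_pairs with multiqueue, plus one control queue.
 */
static uint16_t eni_num_queues(uint64_t features, uint16_t max_virtqueue_pairs)
{
	uint16_t num = 2;

	if (features & (1ULL << VIRTIO_NET_F_MQ))
		num = 2 * max_virtqueue_pairs;

	if (features & (1ULL << VIRTIO_NET_F_CTRL_VQ))
		num += 1;

	return num;
}
```

With multiqueue and a control queue both offered and four queue pairs, this yields nine virtqueues, which is also why the IRQ setup requests queues + 1 vectors (one extra for config changes).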

2021-10-15 13:24:53

by Wu Zongyong

[permalink] [raw]
Subject: [PATCH v5 2/8] vdpa: fix typo

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
include/linux/vdpa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 3972ab765de1..a896ee021e5f 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -257,7 +257,7 @@ struct vdpa_config_ops {
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
/* vq irq is not expected to be changed once DRIVER_OK is set */
- int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
+ int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
--
2.31.1

2021-10-15 15:20:25

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v5 8/8] eni_vdpa: add vDPA driver for Alibaba ENI


On 2021/10/15 3:15 PM, Wu Zongyong wrote:
> +
> +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u64 features = vp_legacy_get_features(ldev);
> +
> + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> + features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);
> +
> + return features;
> +}
> +
> +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + if (!(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
> + ENI_ERR(ldev->pci_dev,
> + "VIRTIO_NET_F_MRG_RXBUF is not negotiated\n");
> + return -EINVAL;
> + }
> +
> + vp_legacy_set_features(ldev, (u32)features);
> +
> + return 0;
> +}


Hi:

It looks like some of my previous comments were ignored?

> +static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> + u64 features = vp_legacy_get_features(ldev);
> +
> + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> + features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);

VERSION_1 is also needed?


> +
> + return features;
> +}
> +
> +static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
> +{
> + struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
> +
> + if (!(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
> + ENI_ERR(ldev->pci_dev,
> + "VIRTIO_NET_F_MRG_RXBUF is not negotiated\n");
> + return -EINVAL;

Do we need to make sure FEATURES_OK is not set in this case, or can the ENI do
this for us?

Otherwise looks good.

Thanks
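
For readers following the feature negotiation question above, the check in the quoted eni_vdpa_set_features() can be sketched in userspace roughly as follows. The function name eni_check_features() and the out-parameter are assumptions made for illustration; the real callback writes the low 32 bits to the device through vp_legacy_set_features().

```c
#include <errno.h>
#include <stdint.h>

/* Feature bit number from the virtio-net specification. */
#define VIRTIO_NET_F_MRG_RXBUF 15

/*
 * Hypothetical stand-in for the quoted check: a non-empty feature set
 * must include VIRTIO_NET_F_MRG_RXBUF. On success the low 32 bits
 * (all a legacy device can accept) are stored in *accepted.
 */
static int eni_check_features(uint64_t features, uint32_t *accepted)
{
	if (features && !(features & (1ULL << VIRTIO_NET_F_MRG_RXBUF)))
		return -EINVAL;

	*accepted = (uint32_t)features;
	return 0;
}
```

Note that an all-zero feature set passes, and that bits above 31 such as ACCESS_PLATFORM are silently dropped by the cast. Whether the status register must also be kept from reaching FEATURES_OK on failure is the open question in this reply and is not modelled here.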

2021-10-15 15:20:39

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v5 4/8] vdpa: add new callback get_vq_num_min in vdpa_config_ops


On 2021/10/15 3:14 PM, Wu Zongyong wrote:
> This callback is optional. For vdpa devices that do not support changing
> the virtqueue size, get_vq_num_min and get_vq_num_max will return the same
> value, so that users can choose a correct value for that device.
>
> Suggested-by: Jason Wang <[email protected]>
> Signed-off-by: Wu Zongyong <[email protected]>


Acked-by: Jason Wang <[email protected]>


> ---
> include/linux/vdpa.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index a896ee021e5f..30864848950b 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -171,6 +171,9 @@ struct vdpa_map_file {
> * @get_vq_num_max: Get the max size of virtqueue
> * @vdev: vdpa device
> * Returns u16: max size of virtqueue
> + * @get_vq_num_min: Get the min size of virtqueue (optional)
> + * @vdev: vdpa device
> + * Returns u16: min size of virtqueue
> * @get_device_id: Get virtio device id
> * @vdev: vdpa device
> * Returns u32: virtio device id
> @@ -266,6 +269,7 @@ struct vdpa_config_ops {
> void (*set_config_cb)(struct vdpa_device *vdev,
> struct vdpa_callback *cb);
> u16 (*get_vq_num_max)(struct vdpa_device *vdev);
> + u16 (*get_vq_num_min)(struct vdpa_device *vdev);
> u32 (*get_device_id)(struct vdpa_device *vdev);
> u32 (*get_vendor_id)(struct vdpa_device *vdev);
> u8 (*get_status)(struct vdpa_device *vdev);

2021-10-15 15:20:39

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v5 1/8] virtio-pci: introduce legacy device module


On 2021/10/15 3:14 PM, Wu Zongyong wrote:
> +void vp_legacy_set_queue_enable(struct virtio_pci_legacy_device *ldev,
> + u16 idx, bool enable);


Similar to the previous one, this function is not implemented in this patch.

Thanks

2021-10-15 15:21:38

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] vdpa: min vq num of vdpa device cannot be greater than max vq num


On 2021/10/15 3:14 PM, Wu Zongyong wrote:
> Just fail to probe the vdpa device if the min virtqueue num returned
> by get_vq_num_min is greater than the max virtqueue num returned by
> get_vq_num_max.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/vdpa/vdpa.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 1dc121a07a93..fd014ecec711 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -26,8 +26,16 @@ static int vdpa_dev_probe(struct device *d)
> {
> struct vdpa_device *vdev = dev_to_vdpa(d);
> struct vdpa_driver *drv = drv_to_vdpa(vdev->dev.driver);
> + const struct vdpa_config_ops *ops = vdev->config;
> + u32 max_num, min_num = 0;


As discussed in the previous version, 1 seems better?

Thanks


> int ret = 0;
>
> + max_num = ops->get_vq_num_max(vdev);
> + if (ops->get_vq_num_min)
> + min_num = ops->get_vq_num_min(vdev);
> + if (max_num < min_num)
> + return -EINVAL;
> +
> if (drv && drv->probe)
> ret = drv->probe(vdev);
>
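
One way to read the suggestion above (default min of 1 rather than 0) is sketched below in userspace C. The struct vq_size_ops type and the vq_num_* callbacks are hypothetical stand-ins for vdpa_config_ops and a device's callbacks; only the range check itself follows the quoted patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct vdpa_config_ops. */
struct vq_size_ops {
	uint16_t (*get_vq_num_max)(void);
	uint16_t (*get_vq_num_min)(void); /* optional, may be NULL */
};

/* Example callbacks used only for illustration. */
static uint16_t vq_num_0(void)   { return 0; }
static uint16_t vq_num_256(void) { return 256; }
static uint16_t vq_num_512(void) { return 512; }

/* Range check as in the quoted probe path, with min defaulting to 1. */
static int check_vq_num_range(const struct vq_size_ops *ops)
{
	uint32_t max_num, min_num = 1;

	max_num = ops->get_vq_num_max();
	if (ops->get_vq_num_min)
		min_num = ops->get_vq_num_min();
	if (max_num < min_num)
		return -EINVAL;

	return 0;
}
```

A side effect of defaulting to 1 in this sketch is that a device reporting a max size of 0 is rejected even when it does not implement get_vq_num_min, which a default of 0 would let through.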

2021-10-15 16:28:27

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v5 6/8] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}


On 2021/10/15 3:14 PM, Wu Zongyong wrote:
> For devices that implement the get_vq_num_min callback, the driver
> should not negotiate the virtqueue size with the backend vdpa device if
> the value returned by get_vq_num_min equals the value returned by
> get_vq_num_max.
> This is useful for vdpa devices based on the legacy virtio specification.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/virtio/virtio_vdpa.c | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index 72eaef2caeb1..e42ace29daa1 100644
> --- a/drivers/virtio/virtio_vdpa.c
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> /* Assume split virtqueue, switch to packed if necessary */
> struct vdpa_vq_state state = {0};
> unsigned long flags;
> - u32 align, num;
> + u32 align, max_num, min_num = 0;
> + bool may_reduce_num = true;
> int err;
>
> if (!name)
> @@ -163,22 +164,32 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> if (!info)
> return ERR_PTR(-ENOMEM);
>
> - num = ops->get_vq_num_max(vdpa);
> - if (num == 0) {
> + max_num = ops->get_vq_num_max(vdpa);
> + if (max_num == 0) {
> err = -ENOENT;
> goto error_new_virtqueue;
> }
>
> + if (ops->get_vq_num_min)
> + min_num = ops->get_vq_num_min(vdpa);
> +
> + may_reduce_num = (max_num == min_num) ? false : true;
> +
> /* Create the vring */
> align = ops->get_vq_align(vdpa);
> - vq = vring_create_virtqueue(index, num, align, vdev,
> - true, true, ctx,
> + vq = vring_create_virtqueue(index, max_num, align, vdev,
> + true, may_reduce_num, ctx,
> virtio_vdpa_notify, callback, name);
> if (!vq) {
> err = -ENOMEM;
> goto error_new_virtqueue;
> }
>
> + if (virtqueue_get_vring_size(vq) < min_num) {
> + err = -EINVAL;
> + goto err_vq;
> + }


Under which condition can we hit this error?

Thanks


> +
> /* Setup virtqueue callback */
> cb.callback = virtio_vdpa_virtqueue_cb;
> cb.private = info;

2021-10-17 20:40:38

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH v5 8/8] eni_vdpa: add vDPA driver for Alibaba ENI

On 10/15/21 12:15 AM, Wu Zongyong wrote:
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 3d91982d8371..c0232a2148a7 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -78,4 +78,12 @@ config VP_VDPA
> help
> This kernel module bridges virtio PCI device to vDPA bus.
>
> +config ALIBABA_ENI_VDPA
> + tristate "vDPA driver for Alibaba ENI"
> + select VIRTIO_PCI_LEGACY_LIB
> + depends on PCI_MSI && !CPU_BIG_ENDIAN
> + help
> + VDPA driver for Alibaba ENI(Elastic Network Interface) which is build upon

ENI (Elastic Network Interface) built

> + virtio 0.9.5 specification.


--
~Randy

2021-10-22 02:48:50

by Wu Zongyong

[permalink] [raw]
Subject: [PATCH v6 0/8] vDPA driver for Alibaba ENI

This series implements the vDPA driver for the Alibaba ENI (Elastic Network
Interface), which is built on the virtio-pci 0.9.5 specification.

Changes since V5:
- remove unused code

Changes since V4:
- check return values of get_vq_num_{max,min} when probing devices
- disable the driver on BE host via Kconfig
- add missing commit message

Changes since V3:
- validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
- present F_ORDER_PLATFORM in get_features
- remove endian check since ENI always uses little endian

Changes since V2:
- add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose the correct virtqueue
size as suggested by Jason Wang
- present ACCESS_PLATFORM in get_features callback as suggested by Jason
Wang
- disable this driver on Big Endian host as suggested by Jason Wang
- fix a typo

Changes since V1:
- add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
the vdpa device is legacy
- implement a dedicated driver for Alibaba ENI instead of a legacy virtio-pci
driver as suggested by Jason Wang
- fix some bugs

Wu Zongyong (8):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vdpa: add new callback get_vq_num_min in vdpa_config_ops
vdpa: min vq num of vdpa device cannot be greater than max vq num
virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
eni_vdpa: add vDPA driver for Alibaba ENI

drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
drivers/vdpa/vdpa.c | 13 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++---
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
drivers/virtio/virtio_vdpa.c | 21 +-
include/linux/vdpa.h | 6 +-
include/linux/virtio_pci_legacy.h | 42 ++
include/uapi/linux/vdpa.h | 1 +
16 files changed, 922 insertions(+), 89 deletions(-)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1

2021-10-22 02:50:38

by Wu Zongyong

[permalink] [raw]
Subject: [PATCH v6 6/8] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

For devices that implement the get_vq_num_min callback, the driver
should not negotiate the virtqueue size with the backend vdpa device if
the value returned by get_vq_num_min equals the value returned by
get_vq_num_max.
This is useful for vdpa devices based on the legacy virtio specification.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..e42ace29daa1 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
/* Assume split virtqueue, switch to packed if necessary */
struct vdpa_vq_state state = {0};
unsigned long flags;
- u32 align, num;
+ u32 align, max_num, min_num = 0;
+ bool may_reduce_num = true;
int err;

if (!name)
@@ -163,22 +164,32 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
if (!info)
return ERR_PTR(-ENOMEM);

- num = ops->get_vq_num_max(vdpa);
- if (num == 0) {
+ max_num = ops->get_vq_num_max(vdpa);
+ if (max_num == 0) {
err = -ENOENT;
goto error_new_virtqueue;
}

+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdpa);
+
+ may_reduce_num = (max_num == min_num) ? false : true;
+
/* Create the vring */
align = ops->get_vq_align(vdpa);
- vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ vq = vring_create_virtqueue(index, max_num, align, vdev,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
goto error_new_virtqueue;
}

+ if (virtqueue_get_vring_size(vq) < min_num) {
+ err = -EINVAL;
+ goto err_vq;
+ }
+
/* Setup virtqueue callback */
cb.callback = virtio_vdpa_virtqueue_cb;
cb.private = info;
--
2.31.1

2021-10-25 02:25:44

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v6 6/8] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

On Fri, Oct 22, 2021 at 10:45 AM Wu Zongyong
<[email protected]> wrote:
>
> For devices that implement the get_vq_num_min callback, the driver
> should not negotiate the virtqueue size with the backend vdpa device if
> the value returned by get_vq_num_min equals the value returned by
> get_vq_num_max.
> This is useful for vdpa devices based on the legacy virtio specification.
>
> Signed-off-by: Wu Zongyong <[email protected]>
> ---
> drivers/virtio/virtio_vdpa.c | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index 72eaef2caeb1..e42ace29daa1 100644
> --- a/drivers/virtio/virtio_vdpa.c
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> /* Assume split virtqueue, switch to packed if necessary */
> struct vdpa_vq_state state = {0};
> unsigned long flags;
> - u32 align, num;
> + u32 align, max_num, min_num = 0;
> + bool may_reduce_num = true;
> int err;
>
> if (!name)
> @@ -163,22 +164,32 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> if (!info)
> return ERR_PTR(-ENOMEM);
>
> - num = ops->get_vq_num_max(vdpa);
> - if (num == 0) {
> + max_num = ops->get_vq_num_max(vdpa);
> + if (max_num == 0) {
> err = -ENOENT;
> goto error_new_virtqueue;
> }
>
> + if (ops->get_vq_num_min)
> + min_num = ops->get_vq_num_min(vdpa);
> +
> + may_reduce_num = (max_num == min_num) ? false : true;
> +
> /* Create the vring */
> align = ops->get_vq_align(vdpa);
> - vq = vring_create_virtqueue(index, num, align, vdev,
> - true, true, ctx,
> + vq = vring_create_virtqueue(index, max_num, align, vdev,
> + true, may_reduce_num, ctx,
> virtio_vdpa_notify, callback, name);
> if (!vq) {
> err = -ENOMEM;
> goto error_new_virtqueue;
> }
>
> + if (virtqueue_get_vring_size(vq) < min_num) {
> + err = -EINVAL;
> + goto err_vq;
> + }

I wonder in which case we can hit this?

Thanks

> +
> /* Setup virtqueue callback */
> cb.callback = virtio_vdpa_virtqueue_cb;
> cb.private = info;
> --
> 2.31.1
>

2021-10-25 04:50:59

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v6 6/8] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}


On 2021/10/25 10:44 AM, Wu Zongyong wrote:
> On Mon, Oct 25, 2021 at 10:22:30AM +0800, Jason Wang wrote:
>> On Fri, Oct 22, 2021 at 10:45 AM Wu Zongyong
>> <[email protected]> wrote:
>>> For devices that implement the get_vq_num_min callback, the driver
>>> should not negotiate the virtqueue size with the backend vdpa device if
>>> the value returned by get_vq_num_min equals the value returned by
>>> get_vq_num_max.
>>> This is useful for vdpa devices based on the legacy virtio specification.
>>>
>>> Signed-off-by: Wu Zongyong <[email protected]>
>>> ---
>>> drivers/virtio/virtio_vdpa.c | 21 ++++++++++++++++-----
>>> 1 file changed, 16 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
>>> index 72eaef2caeb1..e42ace29daa1 100644
>>> --- a/drivers/virtio/virtio_vdpa.c
>>> +++ b/drivers/virtio/virtio_vdpa.c
>>> @@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
>>> /* Assume split virtqueue, switch to packed if necessary */
>>> struct vdpa_vq_state state = {0};
>>> unsigned long flags;
>>> - u32 align, num;
>>> + u32 align, max_num, min_num = 0;
>>> + bool may_reduce_num = true;
>>> int err;
>>>
>>> if (!name)
>>> @@ -163,22 +164,32 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
>>> if (!info)
>>> return ERR_PTR(-ENOMEM);
>>>
>>> - num = ops->get_vq_num_max(vdpa);
>>> - if (num == 0) {
>>> + max_num = ops->get_vq_num_max(vdpa);
>>> + if (max_num == 0) {
>>> err = -ENOENT;
>>> goto error_new_virtqueue;
>>> }
>>>
>>> + if (ops->get_vq_num_min)
>>> + min_num = ops->get_vq_num_min(vdpa);
>>> +
>>> + may_reduce_num = (max_num == min_num) ? false : true;
>>> +
>>> /* Create the vring */
>>> align = ops->get_vq_align(vdpa);
>>> - vq = vring_create_virtqueue(index, num, align, vdev,
>>> - true, true, ctx,
>>> + vq = vring_create_virtqueue(index, max_num, align, vdev,
>>> + true, may_reduce_num, ctx,
>>> virtio_vdpa_notify, callback, name);
>>> if (!vq) {
>>> err = -ENOMEM;
>>> goto error_new_virtqueue;
>>> }
>>>
>>> + if (virtqueue_get_vring_size(vq) < min_num) {
>>> + err = -EINVAL;
>>> + goto err_vq;
>>> + }
>> I wonder under which case can we hit this?
>>
>> Thanks
> If min_vq_num < max_vq_num, may_reduce_num will be true, so it is
> possible to allocate a virtqueue whose size is smaller than
> min_vq_num, since we only set the upper bound for the virtqueue size
> when creating the virtqueue.
>
> Refer to vring_create_virtqueue_split in drivers/virtio/virtio_ring.c:
>
> for (; num && vring_size(num, vring_align) > PAGE_SIZE; num /= 2) {
> queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
> &dma_addr,
> GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO);
> if (queue)
> break;
> if (!may_reduce_num)
> return NULL;
> }


It looks to me like it would be better to fix this function instead of
checking it in the caller?


>
> BTW, I replied to this mail on Nov.18; did you receive it?


For some reason I didn't get that.

Thanks


>>> +
>>> /* Setup virtqueue callback */
>>> cb.callback = virtio_vdpa_virtqueue_cb;
>>> cb.private = info;
>>> --
>>> 2.31.1
>>>
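
To make the scenario discussed in this message concrete, here is a simplified userspace model of a size-halving allocation loop that also honours a lower bound. It is only a sketch under stated assumptions: pick_vring_size() and its alloc_limit parameter (the largest ring size the allocator can satisfy) are hypothetical, and the real vring_create_virtqueue_split() has additional fallback logic that is not modelled here.

```c
#include <stdint.h>

/*
 * Hypothetical model of choosing a ring size: start at max_num and
 * halve on allocation failure, but never go below min_num.
 * Returns the chosen size, or 0 if no acceptable size can be allocated.
 */
static uint32_t pick_vring_size(uint32_t max_num, uint32_t min_num,
				uint32_t alloc_limit, int may_reduce_num)
{
	uint32_t num;

	for (num = max_num; num && num >= min_num; num /= 2) {
		if (num <= alloc_limit)
			return num;	/* allocation would succeed */
		if (!may_reduce_num)
			return 0;	/* caller requires exactly max_num */
	}

	return 0;
}
```

With the lower bound enforced inside the loop, a caller would not need the later virtqueue_get_vring_size(vq) < min_num check; with min and max both fixed at the same value the loop runs at most once.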

2021-10-26 08:46:50

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v6 6/8] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

On Mon, Oct 25, 2021 at 2:25 PM Wu Zongyong
<[email protected]> wrote:
>
> On Mon, Oct 25, 2021 at 12:45:44PM +0800, Jason Wang wrote:
> >
> > On 2021/10/25 10:44 AM, Wu Zongyong wrote:
> > > On Mon, Oct 25, 2021 at 10:22:30AM +0800, Jason Wang wrote:
> > > > On Fri, Oct 22, 2021 at 10:45 AM Wu Zongyong
> > > > <[email protected]> wrote:
> > > > > For devices that implement the get_vq_num_min callback, the driver
> > > > > should not negotiate the virtqueue size with the backend vdpa device if
> > > > > the value returned by get_vq_num_min equals the value returned by
> > > > > get_vq_num_max.
> > > > > This is useful for vdpa devices based on the legacy virtio specification.
> > > > >
> > > > > Signed-off-by: Wu Zongyong <[email protected]>
> > > > > ---
> > > > > drivers/virtio/virtio_vdpa.c | 21 ++++++++++++++++-----
> > > > > 1 file changed, 16 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > > > > index 72eaef2caeb1..e42ace29daa1 100644
> > > > > --- a/drivers/virtio/virtio_vdpa.c
> > > > > +++ b/drivers/virtio/virtio_vdpa.c
> > > > > @@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > /* Assume split virtqueue, switch to packed if necessary */
> > > > > struct vdpa_vq_state state = {0};
> > > > > unsigned long flags;
> > > > > - u32 align, num;
> > > > > + u32 align, max_num, min_num = 0;
> > > > > + bool may_reduce_num = true;
> > > > > int err;
> > > > >
> > > > > if (!name)
> > > > > @@ -163,22 +164,32 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
> > > > > if (!info)
> > > > > return ERR_PTR(-ENOMEM);
> > > > >
> > > > > - num = ops->get_vq_num_max(vdpa);
> > > > > - if (num == 0) {
> > > > > + max_num = ops->get_vq_num_max(vdpa);
> > > > > + if (max_num == 0) {
> > > > > err = -ENOENT;
> > > > > goto error_new_virtqueue;
> > > > > }
> > > > >
> > > > > + if (ops->get_vq_num_min)
> > > > > + min_num = ops->get_vq_num_min(vdpa);
> > > > > +
> > > > > + may_reduce_num = (max_num == min_num) ? false : true;
> > > > > +
> > > > > /* Create the vring */
> > > > > align = ops->get_vq_align(vdpa);
> > > > > - vq = vring_create_virtqueue(index, num, align, vdev,
> > > > > - true, true, ctx,
> > > > > + vq = vring_create_virtqueue(index, max_num, align, vdev,
> > > > > + true, may_reduce_num, ctx,
> > > > > virtio_vdpa_notify, callback, name);
> > > > > if (!vq) {
> > > > > err = -ENOMEM;
> > > > > goto error_new_virtqueue;
> > > > > }
> > > > >
> > > > > + if (virtqueue_get_vring_size(vq) < min_num) {
> > > > > + err = -EINVAL;
> > > > > + goto err_vq;
> > > > > + }
> > > > I wonder under which case can we hit this?
> > > >
> > > > Thanks
> > > If min_vq_num < max_vq_num, may_reduce_num will be true, so it is
> > > possible to allocate a virtqueue whose size is smaller than
> > > min_vq_num, since we only set the upper bound for the virtqueue size
> > > when creating the virtqueue.
> > >
> > > Refer to vring_create_virtqueue_split in drivers/virtio/virtio_ring.c:
> > >
> > > for (; num && vring_size(num, vring_align) > PAGE_SIZE; num /= 2) {
> > > queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
> > > &dma_addr,
> > > GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO);
> > > if (queue)
> > > break;
> > > if (!may_reduce_num)
> > > return NULL;
> > > }
> >
> >
> > It looks to me it's better to fix this function instead of checking it in
> > the caller?
>
> Or we can simply remove that code since this case only exists in theory, and
> there is no real usecase for now.

(Adding list back)

Somehow, it can't happen if you stick to 256 as both min and max.

Another question: can ENI support a vring size of less than 256?

Thanks

>
> >
> >
> > >
> > > BTW, I replied to this mail on Nov.18; did you receive it?
> >
> >
> > For some reason I didn't get that.
> >
> > Thanks
> >
> >
> > > > > +
> > > > > /* Setup virtqueue callback */
> > > > > cb.callback = virtio_vdpa_virtqueue_cb;
> > > > > cb.private = info;
> > > > > --
> > > > > 2.31.1
> > > > >
>
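
Note (illustrative sketch, not part of the thread): the concern discussed above is that the allocation loop halves the ring size on failure, so with may_reduce_num set the final size can drop below the device's minimum. The user-space simulation below models only that halving behaviour. The names fake_alloc_ok, create_ring_size, check_ring_size and the alloc_limit parameter are hypothetical, and the PAGE_SIZE condition and fallback allocation of the real function are deliberately left out.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the DMA allocator: pretend that any ring
 * with more than alloc_limit entries cannot be allocated.
 */
static bool fake_alloc_ok(unsigned int num, unsigned int alloc_limit)
{
	return num <= alloc_limit;
}

/*
 * Simplified model of the halving loop: start at max_num and halve on
 * every failed allocation. Returns the ring size obtained, or 0 if no
 * ring could be allocated.
 */
static unsigned int create_ring_size(unsigned int max_num, bool may_reduce_num,
				     unsigned int alloc_limit)
{
	unsigned int num;

	for (num = max_num; num; num /= 2) {
		if (fake_alloc_ok(num, alloc_limit))
			return num;
		if (!may_reduce_num)
			return 0;	/* caller asked for exactly max_num */
	}
	return 0;
}

/*
 * Caller-side check in the spirit of the quoted patch: reject a ring
 * that is missing or smaller than the device minimum.
 */
static int check_ring_size(unsigned int got, unsigned int min_num)
{
	return (got == 0 || got < min_num) ? -1 : 0;
}
```

With max_num = 256 and an allocator that only succeeds at 64 entries or fewer, the loop settles on 64, which a device with a minimum of 128 would have to reject. With min equal to max, may_reduce_num is false and the loop either returns the full size or fails outright.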

2021-10-29 09:16:28

by Wu Zongyong

Subject: [PATCH v7 0/9] vDPA driver for Alibaba ENI

This series implements the vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built on the virtio-pci 0.9.5 specification.

Changes since V6:
- set default min vq size to 1 instead of 0
- enable eni vdpa driver only on X86 hosts
- fix some typos

Changes since V5:
- remove unused code

Changes since V4:
- check return values of get_vq_num_{max,min} when probing devices
- disable the driver on BE host via Kconfig
- add missing commit message

Changes since V3:
- validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
- present F_ORDER_PLATFORM in get_features
- remove endian check since ENI always uses little endian

Changes since V2:
- add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose the correct virtqueue
size, as suggested by Jason Wang
- present ACCESS_PLATFORM in get_features callback, as suggested by Jason
Wang
- disable this driver on Big Endian hosts, as suggested by Jason Wang
- fix a typo

Changes since V1:
- add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
the vdpa device is legacy
- implement a dedicated driver for Alibaba ENI instead of a legacy
virtio-pci driver, as suggested by Jason Wang
- fix some bugs

Wu Zongyong (9):
virtio-pci: introduce legacy device module
vdpa: fix typo
vp_vdpa: add vq irq offloading support
vdpa: add new callback get_vq_num_min in vdpa_config_ops
vdpa: min vq num of vdpa device cannot be greater than max vq num
virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
eni_vdpa: add vDPA driver for Alibaba ENI
eni_vdpa: alibaba: fix Kconfig typo

drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
drivers/vdpa/vdpa.c | 13 +
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
drivers/virtio/Kconfig | 10 +
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 ++---
drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
drivers/virtio/virtio_vdpa.c | 16 +-
include/linux/vdpa.h | 6 +-
include/linux/virtio_pci_legacy.h | 42 ++
include/uapi/linux/vdpa.h | 1 +
16 files changed, 917 insertions(+), 89 deletions(-)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

--
2.31.1

2021-10-29 09:16:50

by Wu Zongyong

Subject: [PATCH v7 2/9] vdpa: fix typo

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
include/linux/vdpa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 3972ab765de1..a896ee021e5f 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -257,7 +257,7 @@ struct vdpa_config_ops {
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
/* vq irq is not expected to be changed once DRIVER_OK is set */
- int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
+ int (*get_vq_irq)(struct vdpa_device *vdev, u16 idx);

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
--
2.31.1

2021-10-29 09:17:04

by Wu Zongyong

Subject: [PATCH v7 1/9] virtio-pci: introduce legacy device module

Split the common code out of virtio-pci-legacy so that the vDPA driver can
reuse it later.

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
drivers/virtio/Kconfig | 10 ++
drivers/virtio/Makefile | 1 +
drivers/virtio/virtio_pci_common.c | 10 +-
drivers/virtio/virtio_pci_common.h | 9 +-
drivers/virtio/virtio_pci_legacy.c | 101 +++---------
drivers/virtio/virtio_pci_legacy_dev.c | 220 +++++++++++++++++++++++++
include/linux/virtio_pci_legacy.h | 42 +++++
7 files changed, 310 insertions(+), 83 deletions(-)
create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
create mode 100644 include/linux/virtio_pci_legacy.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index ce1b3f6ec325..8fcf94cd2c96 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -20,6 +20,15 @@ config VIRTIO_PCI_LIB
PCI device with possible vendor specific extensions. Any
module that selects this module must depend on PCI.

+config VIRTIO_PCI_LIB_LEGACY
+ tristate
+ help
+ Legacy PCI device (Virtio PCI Card 0.9.x Draft and older device)
+ implementation.
+ This module implements the basic probe and control for devices
+ which are based on legacy PCI device. Any module that selects this
+ module must depend on PCI.
+
menuconfig VIRTIO_MENU
bool "Virtio drivers"
default y
@@ -43,6 +52,7 @@ config VIRTIO_PCI_LEGACY
bool "Support for legacy virtio draft 0.9.X and older devices"
default y
depends on VIRTIO_PCI
+ select VIRTIO_PCI_LIB_LEGACY
help
Virtio PCI Card 0.9.X Draft (circa 2014) and older device support.

diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 699bbea0465f..0a82d0873248 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
+obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index b35bb2d57f62..d724f676608b 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -549,6 +549,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,

pci_set_master(pci_dev);

+ vp_dev->is_legacy = vp_dev->ldev.ioaddr ? true : false;
+
rc = register_virtio_device(&vp_dev->vdev);
reg_dev = vp_dev;
if (rc)
@@ -557,10 +559,10 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
return 0;

err_register:
- if (vp_dev->ioaddr)
- virtio_pci_legacy_remove(vp_dev);
+ if (vp_dev->is_legacy)
+ virtio_pci_legacy_remove(vp_dev);
else
- virtio_pci_modern_remove(vp_dev);
+ virtio_pci_modern_remove(vp_dev);
err_probe:
pci_disable_device(pci_dev);
err_enable_device:
@@ -587,7 +589,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)

unregister_virtio_device(&vp_dev->vdev);

- if (vp_dev->ioaddr)
+ if (vp_dev->is_legacy)
virtio_pci_legacy_remove(vp_dev);
else
virtio_pci_modern_remove(vp_dev);
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index beec047a8f8d..eb17a29fc7ef 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -25,6 +25,7 @@
#include <linux/virtio_config.h>
#include <linux/virtio_ring.h>
#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
#include <linux/virtio_pci_modern.h>
#include <linux/highmem.h>
#include <linux/spinlock.h>
@@ -44,16 +45,14 @@ struct virtio_pci_vq_info {
struct virtio_pci_device {
struct virtio_device vdev;
struct pci_dev *pci_dev;
+ struct virtio_pci_legacy_device ldev;
struct virtio_pci_modern_device mdev;

- /* In legacy mode, these two point to within ->legacy. */
+ bool is_legacy;
+
/* Where to read and clear interrupt */
u8 __iomem *isr;

- /* Legacy only field */
- /* the IO mapping for the PCI config space */
- void __iomem *ioaddr;
-
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
struct list_head virtqueues;
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index d62e9835aeec..82eb437ad920 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -14,6 +14,7 @@
* Michael S. Tsirkin <[email protected]>
*/

+#include "linux/virtio_pci_legacy.h"
#include "virtio_pci_common.h"

/* virtio config->get_features() implementation */
@@ -23,7 +24,7 @@ static u64 vp_get_features(struct virtio_device *vdev)

/* When someone needs more than 32 feature bits, we'll need to
* steal a bit to indicate that the rest are somewhere else. */
- return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+ return vp_legacy_get_features(&vp_dev->ldev);
}

/* virtio config->finalize_features() implementation */
@@ -38,7 +39,7 @@ static int vp_finalize_features(struct virtio_device *vdev)
BUG_ON((u32)vdev->features != vdev->features);

/* We only support 32 feature bits. */
- iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+ vp_legacy_set_features(&vp_dev->ldev, vdev->features);

return 0;
}
@@ -48,7 +49,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
u8 *ptr = buf;
@@ -64,7 +65,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
const void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
+ void __iomem *ioaddr = vp_dev->ldev.ioaddr +
VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
offset;
const u8 *ptr = buf;
@@ -78,7 +79,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
static u8 vp_get_status(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ return vp_legacy_get_status(&vp_dev->ldev);
}

static void vp_set_status(struct virtio_device *vdev, u8 status)
@@ -86,28 +87,24 @@ static void vp_set_status(struct virtio_device *vdev, u8 status)
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* We should never be setting status to 0. */
BUG_ON(status == 0);
- iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, status);
}

static void vp_reset(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
/* 0 status means a reset. */
- iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_set_status(&vp_dev->ldev, 0);
/* Flush out the status write, and flush in device writes,
* including MSi-X interrupts, if any. */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
+ vp_legacy_get_status(&vp_dev->ldev);
/* Flush pending VQ/configuration callbacks. */
vp_synchronize_vectors(vdev);
}

static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
{
- /* Setup the vector used for configuration events */
- iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
- /* Verify we had enough resources to assign the vector */
- /* Will also flush the write out to device */
- return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ return vp_legacy_config_vector(&vp_dev->ldev, vector);
}

static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
@@ -123,12 +120,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
int err;
u64 q_pfn;

- /* Select the queue we're interested in */
- iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
/* Check if queue is either not available or already active. */
- num = ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
- if (!num || ioread32(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN))
+ num = vp_legacy_get_queue_size(&vp_dev->ldev, index);
+ if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
return ERR_PTR(-ENOENT);

info->msix_vector = msix_vec;
@@ -151,13 +145,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
}

/* activate the queue */
- iowrite32(q_pfn, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);

- vq->priv = (void __force *)vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ vq->priv = (void __force *)vp_dev->ldev.ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;

if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
- iowrite16(msix_vec, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
- msix_vec = ioread16(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ msix_vec = vp_legacy_queue_vector(&vp_dev->ldev, index, msix_vec);
if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
err = -EBUSY;
goto out_deactivate;
@@ -167,7 +160,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
return vq;

out_deactivate:
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, index, 0);
out_del_vq:
vring_del_virtqueue(vq);
return ERR_PTR(err);
@@ -178,17 +171,15 @@ static void del_vq(struct virtio_pci_vq_info *info)
struct virtqueue *vq = info->vq;
struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);

- iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
if (vp_dev->msix_enabled) {
- iowrite16(VIRTIO_MSI_NO_VECTOR,
- vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ vp_legacy_queue_vector(&vp_dev->ldev, vq->index,
+ VIRTIO_MSI_NO_VECTOR);
/* Flush the write out to device */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
+ ioread8(vp_dev->ldev.ioaddr + VIRTIO_PCI_ISR);
}

/* Select and deactivate the queue */
- iowrite32(0, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+ vp_legacy_set_queue_address(&vp_dev->ldev, vq->index, 0);

vring_del_virtqueue(vq);
}
@@ -211,51 +202,18 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
/* the PCI probing function */
int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
{
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;
struct pci_dev *pci_dev = vp_dev->pci_dev;
int rc;

- /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
- if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
- return -ENODEV;
-
- if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) {
- printk(KERN_ERR "virtio_pci: expected ABI version %d, got %d\n",
- VIRTIO_PCI_ABI_VERSION, pci_dev->revision);
- return -ENODEV;
- }
-
- rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
- if (rc) {
- rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
- } else {
- /*
- * The virtio ring base address is expressed as a 32-bit PFN,
- * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
- */
- dma_set_coherent_mask(&pci_dev->dev,
- DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
- }
-
- if (rc)
- dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+ ldev->pci_dev = pci_dev;

- rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ rc = vp_legacy_probe(ldev);
if (rc)
return rc;

- rc = -ENOMEM;
- vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
- if (!vp_dev->ioaddr)
- goto err_iomap;
-
- vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR;
-
- /* we use the subsystem vendor/device id as the virtio vendor/device
- * id. this allows us to use the same PCI vendor/device id for all
- * virtio devices and to identify the particular virtio driver by
- * the subsystem ids */
- vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
- vp_dev->vdev.id.device = pci_dev->subsystem_device;
+ vp_dev->isr = ldev->isr;
+ vp_dev->vdev.id = ldev->id;

vp_dev->vdev.config = &virtio_pci_config_ops;

@@ -264,16 +222,11 @@ int virtio_pci_legacy_probe(struct virtio_pci_device *vp_dev)
vp_dev->del_vq = del_vq;

return 0;
-
-err_iomap:
- pci_release_region(pci_dev, 0);
- return rc;
}

void virtio_pci_legacy_remove(struct virtio_pci_device *vp_dev)
{
- struct pci_dev *pci_dev = vp_dev->pci_dev;
+ struct virtio_pci_legacy_device *ldev = &vp_dev->ldev;

- pci_iounmap(pci_dev, vp_dev->ioaddr);
- pci_release_region(pci_dev, 0);
+ vp_legacy_remove(ldev);
}
diff --git a/drivers/virtio/virtio_pci_legacy_dev.c b/drivers/virtio/virtio_pci_legacy_dev.c
new file mode 100644
index 000000000000..9b97680dd02b
--- /dev/null
+++ b/drivers/virtio/virtio_pci_legacy_dev.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include "linux/virtio_pci.h"
+#include <linux/virtio_pci_legacy.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+
+/*
+ * vp_legacy_probe: probe the legacy virtio pci device, note that the
+ * caller is required to enable PCI device before calling this function.
+ * @ldev: the legacy virtio-pci device
+ *
+ * Return 0 on succeed otherwise fail
+ */
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+ int rc;
+
+ /* We only own devices >= 0x1000 and <= 0x103f: leave the rest. */
+ if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f)
+ return -ENODEV;
+
+ if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION)
+ return -ENODEV;
+
+ rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64));
+ if (rc) {
+ rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
+ } else {
+ /*
+ * The virtio ring base address is expressed as a 32-bit PFN,
+ * with a page size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT.
+ */
+ dma_set_coherent_mask(&pci_dev->dev,
+ DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT));
+ }
+
+ if (rc)
+ dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n");
+
+ rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
+ if (rc)
+ return rc;
+
+ ldev->ioaddr = pci_iomap(pci_dev, 0, 0);
+ if (!ldev->ioaddr)
+ goto err_iomap;
+
+ ldev->isr = ldev->ioaddr + VIRTIO_PCI_ISR;
+
+ ldev->id.vendor = pci_dev->subsystem_vendor;
+ ldev->id.device = pci_dev->subsystem_device;
+
+ return 0;
+err_iomap:
+ pci_release_region(pci_dev, 0);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(vp_legacy_probe);
+
+/*
+ * vp_legacy_remove: remove and clean up the legacy virtio pci device
+ * @ldev: the legacy virtio-pci device
+ */
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev)
+{
+ struct pci_dev *pci_dev = ldev->pci_dev;
+
+ pci_iounmap(pci_dev, ldev->ioaddr);
+ pci_release_region(pci_dev, 0);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_remove);
+
+/*
+ * vp_legacy_get_features - get features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the features read from the device
+ */
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev)
+{
+
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_features);
+
+/*
+ * vp_legacy_get_driver_features - get driver features from device
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the driver features read from the device
+ */
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_driver_features);
+
+/*
+ * vp_legacy_set_features - set features to device
+ * @ldev: the legacy virtio-pci device
+ * @features: the features set to device
+ */
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features)
+{
+ iowrite32(features, ldev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_features);
+
+/*
+ * vp_legacy_get_status - get the device status
+ * @ldev: the legacy virtio-pci device
+ *
+ * Returns the status read from device
+ */
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev)
+{
+ return ioread8(ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_status);
+
+/*
+ * vp_legacy_set_status - set status to device
+ * @ldev: the legacy virtio-pci device
+ * @status: the status set to device
+ */
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status)
+{
+ iowrite8(status, ldev->ioaddr + VIRTIO_PCI_STATUS);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_status);
+
+/*
+ * vp_legacy_queue_vector - set the MSIX vector for a specific virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: queue index
+ * @vector: the config vector
+ *
+ * Returns the config vector read from the device
+ */
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 index, u16 vector)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ /* Flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_queue_vector);
+
+/*
+ * vp_legacy_config_vector - set the vector for config interrupt
+ * @ldev: the legacy virtio-pci device
+ * @vector: the config vector
+ *
+ * Returns the config vector read from the device
+ */
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector)
+{
+ /* Setup the vector used for configuration events */
+ iowrite16(vector, ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+ /* Verify we had enough resources to assign the vector */
+ /* Will also flush the write out to device */
+ return ioread16(ldev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_config_vector);
+
+/*
+ * vp_legacy_set_queue_address - set the virtqueue address
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ * @queue_pfn: pfn of the virtqueue
+ */
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ iowrite32(queue_pfn, ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_set_queue_address);
+
+/*
+ * vp_legacy_get_queue_enable - check whether a virtqueue is enabled
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns whether a virtqueue is enabled or not
+ */
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread32(ldev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_enable);
+
+/*
+ * vp_legacy_get_queue_size - get size for a virtqueue
+ * @ldev: the legacy virtio-pci device
+ * @index: the queue index
+ *
+ * Returns the size of the virtqueue
+ */
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 index)
+{
+ iowrite16(index, ldev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+ return ioread16(ldev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
+}
+EXPORT_SYMBOL_GPL(vp_legacy_get_queue_size);
+
+MODULE_VERSION("0.1");
+MODULE_DESCRIPTION("Legacy Virtio PCI Device");
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/virtio_pci_legacy.h b/include/linux/virtio_pci_legacy.h
new file mode 100644
index 000000000000..e5d665faf00e
--- /dev/null
+++ b/include/linux/virtio_pci_legacy.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_VIRTIO_PCI_LEGACY_H
+#define _LINUX_VIRTIO_PCI_LEGACY_H
+
+#include "linux/mod_devicetable.h"
+#include <linux/pci.h>
+#include <linux/virtio_pci.h>
+
+struct virtio_pci_legacy_device {
+ struct pci_dev *pci_dev;
+
+ /* Where to read and clear interrupt */
+ u8 __iomem *isr;
+ /* The IO mapping for the PCI config space (legacy mode only) */
+ void __iomem *ioaddr;
+
+ struct virtio_device_id id;
+};
+
+u64 vp_legacy_get_features(struct virtio_pci_legacy_device *ldev);
+u64 vp_legacy_get_driver_features(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_features(struct virtio_pci_legacy_device *ldev,
+ u32 features);
+u8 vp_legacy_get_status(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_set_status(struct virtio_pci_legacy_device *ldev,
+ u8 status);
+u16 vp_legacy_queue_vector(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 vector);
+u16 vp_legacy_config_vector(struct virtio_pci_legacy_device *ldev,
+ u16 vector);
+void vp_legacy_set_queue_address(struct virtio_pci_legacy_device *ldev,
+ u16 index, u32 queue_pfn);
+bool vp_legacy_get_queue_enable(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+void vp_legacy_set_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx, u16 size);
+u16 vp_legacy_get_queue_size(struct virtio_pci_legacy_device *ldev,
+ u16 idx);
+int vp_legacy_probe(struct virtio_pci_legacy_device *ldev);
+void vp_legacy_remove(struct virtio_pci_legacy_device *ldev);
+
+#endif
--
2.31.1
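
Note (illustrative sketch, not part of the posted patch): the per-queue helpers in the patch above all follow the same pattern: write the queue index to the selector register, then access the per-queue register. The user-space model below mimics that with a plain struct in place of the I/O window. The sketch_* names and SKETCH_NUM_QUEUES are hypothetical, and no bounds checking on the index is done, so callers are assumed to pass a valid index.

```c
#include <stdint.h>

#define SKETCH_NUM_QUEUES 2	/* hypothetical queue count for the model */

/* Stand-in for the legacy register window: one selector, banked state. */
struct sketch_legacy_regs {
	uint16_t queue_sel;
	uint16_t queue_num[SKETCH_NUM_QUEUES];
	uint32_t queue_pfn[SKETCH_NUM_QUEUES];
};

/* Select the queue, then read its size (mirrors QUEUE_SEL + QUEUE_NUM). */
static uint16_t sketch_get_queue_size(struct sketch_legacy_regs *r,
				      uint16_t index)
{
	r->queue_sel = index;
	return r->queue_num[r->queue_sel];
}

/* Select the queue, then write its PFN; a PFN of 0 deactivates it. */
static void sketch_set_queue_address(struct sketch_legacy_regs *r,
				     uint16_t index, uint32_t pfn)
{
	r->queue_sel = index;
	r->queue_pfn[r->queue_sel] = pfn;
}

/* A queue counts as enabled when its PFN register reads non-zero. */
static int sketch_get_queue_enable(struct sketch_legacy_regs *r,
				   uint16_t index)
{
	r->queue_sel = index;
	return r->queue_pfn[r->queue_sel] != 0;
}
```

Folding the select step into every helper is what lets the callers in virtio_pci_legacy.c drop their explicit writes to the selector register.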

2021-10-29 09:17:38

by Wu Zongyong

Subject: [PATCH v7 6/9] virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}

For devices that implement the get_vq_num_min callback, the driver
should not negotiate the virtqueue size with the backend vdpa device if
the value returned by get_vq_num_min equals the value returned by
get_vq_num_max.
This is useful for vdpa devices based on the legacy virtio specification.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 72eaef2caeb1..6b62aaf08cc5 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -145,7 +145,8 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
/* Assume split virtqueue, switch to packed if necessary */
struct vdpa_vq_state state = {0};
unsigned long flags;
- u32 align, num;
+ u32 align, max_num, min_num = 1;
+ bool may_reduce_num = true;
int err;

if (!name)
@@ -163,16 +164,21 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
if (!info)
return ERR_PTR(-ENOMEM);

- num = ops->get_vq_num_max(vdpa);
- if (num == 0) {
+ max_num = ops->get_vq_num_max(vdpa);
+ if (max_num == 0) {
err = -ENOENT;
goto error_new_virtqueue;
}

+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdpa);
+
+ may_reduce_num = (max_num == min_num) ? false : true;
+
/* Create the vring */
align = ops->get_vq_align(vdpa);
- vq = vring_create_virtqueue(index, num, align, vdev,
- true, true, ctx,
+ vq = vring_create_virtqueue(index, max_num, align, vdev,
+ true, may_reduce_num, ctx,
virtio_vdpa_notify, callback, name);
if (!vq) {
err = -ENOMEM;
--
2.31.1
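
Note (illustrative sketch, not part of the posted patch): the decision made in the hunk above can be read as a small pure function: default the minimum to 1, override it only if the optional callback exists, and allow shrinking only when min and max differ. The struct sketch_ops, struct vq_bounds, vq_bounds_from_ops and fixed_256 names are hypothetical, and the callbacks take no device argument to keep the model short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down ops table; get_vq_num_min may be NULL. */
struct sketch_ops {
	uint32_t (*get_vq_num_max)(void);
	uint32_t (*get_vq_num_min)(void);
};

struct vq_bounds {
	uint32_t max_num;
	uint32_t min_num;
	bool may_reduce_num;
};

/* Example callback for a device with a fixed ring size of 256. */
static uint32_t fixed_256(void)
{
	return 256;
}

/* Returns 0 on success, -1 if the device reports no usable queue. */
static int vq_bounds_from_ops(const struct sketch_ops *ops,
			      struct vq_bounds *out)
{
	out->min_num = 1;	/* default when the callback is absent */
	out->max_num = ops->get_vq_num_max();
	if (out->max_num == 0)
		return -1;

	if (ops->get_vq_num_min)
		out->min_num = ops->get_vq_num_min();

	/* Shrinking is only allowed when the device accepts a range. */
	out->may_reduce_num = out->max_num != out->min_num;
	return 0;
}
```

A device that reports 256 for both callbacks ends up with may_reduce_num false, while a device without get_vq_num_min keeps the previous behaviour of allowing the ring to shrink.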

2021-10-29 09:17:40

by Wu Zongyong

Subject: [PATCH v7 5/9] vdpa: min vq num of vdpa device cannot be greater than max vq num

Fail to probe the vdpa device if the min virtqueue num returned
by get_vq_num_min is greater than the max virtqueue num returned by
get_vq_num_max.

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 1dc121a07a93..d783a943647d 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -26,8 +26,16 @@ static int vdpa_dev_probe(struct device *d)
{
struct vdpa_device *vdev = dev_to_vdpa(d);
struct vdpa_driver *drv = drv_to_vdpa(vdev->dev.driver);
+ const struct vdpa_config_ops *ops = vdev->config;
+ u32 max_num, min_num = 1;
int ret = 0;

+ max_num = ops->get_vq_num_max(vdev);
+ if (ops->get_vq_num_min)
+ min_num = ops->get_vq_num_min(vdev);
+ if (max_num < min_num)
+ return -EINVAL;
+
if (drv && drv->probe)
ret = drv->probe(vdev);

--
2.31.1

2021-10-29 09:18:32

by Wu Zongyong

Subject: [PATCH v7 3/9] vp_vdpa: add vq irq offloading support

This patch implements the get_vq_irq() callback for virtio pci devices
to allow irq offloading.

Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
drivers/vdpa/virtio_pci/vp_vdpa.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c
index 5bcd00246d2e..e3ff7875e123 100644
--- a/drivers/vdpa/virtio_pci/vp_vdpa.c
+++ b/drivers/vdpa/virtio_pci/vp_vdpa.c
@@ -76,6 +76,17 @@ static u8 vp_vdpa_get_status(struct vdpa_device *vdpa)
return vp_modern_get_status(mdev);
}

+static int vp_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct vp_vdpa *vp_vdpa = vdpa_to_vp(vdpa);
+ int irq = vp_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
static void vp_vdpa_free_irq(struct vp_vdpa *vp_vdpa)
{
struct virtio_pci_modern_device *mdev = &vp_vdpa->mdev;
@@ -427,6 +438,7 @@ static const struct vdpa_config_ops vp_vdpa_ops = {
.get_config = vp_vdpa_get_config,
.set_config = vp_vdpa_set_config,
.set_config_cb = vp_vdpa_set_config_cb,
+ .get_vq_irq = vp_vdpa_get_vq_irq,
};

static void vp_vdpa_free_irq_vectors(void *data)
--
2.31.1
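
Note (illustrative sketch, not part of the posted patch): the callback added above translates the "no vector" sentinel into an error so that callers never treat it as a real irq number. A user-space model of that translation follows. SKETCH_MSI_NO_VECTOR and sketch_get_vq_irq are hypothetical names, with the sentinel set to 0xffff to match VIRTIO_MSI_NO_VECTOR, and the per-vring irq storage is reduced to a plain int array.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MSI_NO_VECTOR 0xffff	/* same value as VIRTIO_MSI_NO_VECTOR */

/*
 * Return the irq recorded for queue idx, or -EINVAL if no vector has
 * been assigned to that queue yet.
 */
static int sketch_get_vq_irq(const int *vring_irq, uint16_t idx)
{
	int irq = vring_irq[idx];

	if (irq == SKETCH_MSI_NO_VECTOR)
		return -EINVAL;

	return irq;
}
```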

2021-10-29 09:18:45

by Wu Zongyong

Subject: [PATCH v7 7/9] vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE

This attribute advertises the minimum virtqueue size. The value is
1 by default.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/vdpa.c | 5 +++++
include/uapi/linux/vdpa.h | 1 +
2 files changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index d783a943647d..fcf02a364878 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -500,6 +500,7 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
int flags, struct netlink_ext_ack *extack)
{
u16 max_vq_size;
+ u16 min_vq_size = 1;
u32 device_id;
u32 vendor_id;
void *hdr;
@@ -516,6 +517,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
device_id = vdev->config->get_device_id(vdev);
vendor_id = vdev->config->get_vendor_id(vdev);
max_vq_size = vdev->config->get_vq_num_max(vdev);
+ if (vdev->config->get_vq_num_min)
+ min_vq_size = vdev->config->get_vq_num_min(vdev);

err = -EMSGSIZE;
if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev->dev)))
@@ -528,6 +531,8 @@ vdpa_dev_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid, u32 seq
goto msg_err;
if (nla_put_u16(msg, VDPA_ATTR_DEV_MAX_VQ_SIZE, max_vq_size))
goto msg_err;
+ if (nla_put_u16(msg, VDPA_ATTR_DEV_MIN_VQ_SIZE, min_vq_size))
+ goto msg_err;

genlmsg_end(msg, hdr);
return 0;
diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
index 66a41e4ec163..e3b87879514c 100644
--- a/include/uapi/linux/vdpa.h
+++ b/include/uapi/linux/vdpa.h
@@ -32,6 +32,7 @@ enum vdpa_attr {
VDPA_ATTR_DEV_VENDOR_ID, /* u32 */
VDPA_ATTR_DEV_MAX_VQS, /* u32 */
VDPA_ATTR_DEV_MAX_VQ_SIZE, /* u16 */
+ VDPA_ATTR_DEV_MIN_VQ_SIZE, /* u16 */

/* new attributes must be added above here */
VDPA_ATTR_MAX,
--
2.31.1

2021-10-29 09:18:54

by Wu Zongyong

Subject: [PATCH v7 8/9] eni_vdpa: add vDPA driver for Alibaba ENI

This patch adds a new vDPA driver for Alibaba ENI (Elastic Network
Interface), which is built upon the virtio 0.9.5 specification.
This driver is currently enabled only on X86 hosts.

Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/Kconfig | 8 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/alibaba/Makefile | 3 +
drivers/vdpa/alibaba/eni_vdpa.c | 553 ++++++++++++++++++++++++++++++++
4 files changed, 565 insertions(+)
create mode 100644 drivers/vdpa/alibaba/Makefile
create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 3d91982d8371..07b0c73212aa 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -78,4 +78,12 @@ config VP_VDPA
help
This kernel module bridges virtio PCI device to vDPA bus.

+config ALIBABA_ENI_VDPA
+ tristate "vDPA driver for Alibaba ENI"
+ select VIRTIO_PCI_LEGACY_LIB
+ depends on PCI_MSI && X86
+ help
+ VDPA driver for Alibaba ENI (Elastic Network Interface) which is built upon
+ virtio 0.9.5 specification.
+
endif # VDPA
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index f02ebed33f19..15665563a7f4 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
obj-$(CONFIG_IFCVF) += ifcvf/
obj-$(CONFIG_MLX5_VDPA) += mlx5/
obj-$(CONFIG_VP_VDPA) += virtio_pci/
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
diff --git a/drivers/vdpa/alibaba/Makefile b/drivers/vdpa/alibaba/Makefile
new file mode 100644
index 000000000000..ef4aae69f87a
--- /dev/null
+++ b/drivers/vdpa/alibaba/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
+
diff --git a/drivers/vdpa/alibaba/eni_vdpa.c b/drivers/vdpa/alibaba/eni_vdpa.c
new file mode 100644
index 000000000000..3f788794571a
--- /dev/null
+++ b/drivers/vdpa/alibaba/eni_vdpa.c
@@ -0,0 +1,553 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vDPA bridge driver for Alibaba ENI(Elastic Network Interface)
+ *
+ * Copyright (c) 2021, Alibaba Inc. All rights reserved.
+ * Author: Wu Zongyong <[email protected]>
+ *
+ */
+
+#include <linux/bits.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/vdpa.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_pci_legacy.h>
+#include <uapi/linux/virtio_net.h>
+
+#define ENI_MSIX_NAME_SIZE 256
+
+#define ENI_ERR(pdev, fmt, ...) \
+ dev_err(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_DBG(pdev, fmt, ...) \
+ dev_dbg(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+#define ENI_INFO(pdev, fmt, ...) \
+ dev_info(&pdev->dev, "%s"fmt, "eni_vdpa: ", ##__VA_ARGS__)
+
+struct eni_vring {
+ void __iomem *notify;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ struct vdpa_callback cb;
+ int irq;
+};
+
+struct eni_vdpa {
+ struct vdpa_device vdpa;
+ struct virtio_pci_legacy_device ldev;
+ struct eni_vring *vring;
+ struct vdpa_callback config_cb;
+ char msix_name[ENI_MSIX_NAME_SIZE];
+ int config_irq;
+ int queues;
+ int vectors;
+};
+
+static struct eni_vdpa *vdpa_to_eni(struct vdpa_device *vdpa)
+{
+ return container_of(vdpa, struct eni_vdpa, vdpa);
+}
+
+static struct virtio_pci_legacy_device *vdpa_to_ldev(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ return &eni_vdpa->ldev;
+}
+
+static u64 eni_vdpa_get_features(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u64 features = vp_legacy_get_features(ldev);
+
+ features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
+ features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);
+
+ return features;
+}
+
+static int eni_vdpa_set_features(struct vdpa_device *vdpa, u64 features)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ if (!(features & BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
+ ENI_ERR(ldev->pci_dev,
+ "VIRTIO_NET_F_MRG_RXBUF is not negotiated\n");
+ return -EINVAL;
+ }
+
+ vp_legacy_set_features(ldev, (u32)features);
+
+ return 0;
+}
+
+static u8 eni_vdpa_get_status(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_status(ldev);
+}
+
+static int eni_vdpa_get_vq_irq(struct vdpa_device *vdpa, u16 idx)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ int irq = eni_vdpa->vring[idx].irq;
+
+ if (irq == VIRTIO_MSI_NO_VECTOR)
+ return -EINVAL;
+
+ return irq;
+}
+
+static void eni_vdpa_free_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i;
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ if (eni_vdpa->vring[i].irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_queue_vector(ldev, i, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->vring[i].irq,
+ &eni_vdpa->vring[i]);
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ }
+ }
+
+ if (eni_vdpa->config_irq != VIRTIO_MSI_NO_VECTOR) {
+ vp_legacy_config_vector(ldev, VIRTIO_MSI_NO_VECTOR);
+ devm_free_irq(&pdev->dev, eni_vdpa->config_irq, eni_vdpa);
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+ }
+
+ if (eni_vdpa->vectors) {
+ pci_free_irq_vectors(pdev);
+ eni_vdpa->vectors = 0;
+ }
+}
+
+static irqreturn_t eni_vdpa_vq_handler(int irq, void *arg)
+{
+ struct eni_vring *vring = arg;
+
+ if (vring->cb.callback)
+ return vring->cb.callback(vring->cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t eni_vdpa_config_handler(int irq, void *arg)
+{
+ struct eni_vdpa *eni_vdpa = arg;
+
+ if (eni_vdpa->config_cb.callback)
+ return eni_vdpa->config_cb.callback(eni_vdpa->config_cb.private);
+
+ return IRQ_HANDLED;
+}
+
+static int eni_vdpa_request_irq(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ struct pci_dev *pdev = ldev->pci_dev;
+ int i, ret, irq;
+ int queues = eni_vdpa->queues;
+ int vectors = queues + 1;
+
+ ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX);
+ if (ret != vectors) {
+ ENI_ERR(pdev,
+ "failed to allocate irq vectors want %d but %d\n",
+ vectors, ret);
+ return ret;
+ }
+
+ eni_vdpa->vectors = vectors;
+
+ for (i = 0; i < queues; i++) {
+ snprintf(eni_vdpa->vring[i].msix_name, ENI_MSIX_NAME_SIZE,
+ "eni-vdpa[%s]-%d", pci_name(pdev), i);
+ irq = pci_irq_vector(pdev, i);
+ ret = devm_request_irq(&pdev->dev, irq,
+ eni_vdpa_vq_handler,
+ 0, eni_vdpa->vring[i].msix_name,
+ &eni_vdpa->vring[i]);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for vq %d\n", i);
+ goto err;
+ }
+ vp_legacy_queue_vector(ldev, i, i);
+ eni_vdpa->vring[i].irq = irq;
+ }
+
+ snprintf(eni_vdpa->msix_name, ENI_MSIX_NAME_SIZE, "eni-vdpa[%s]-config",
+ pci_name(pdev));
+ irq = pci_irq_vector(pdev, queues);
+ ret = devm_request_irq(&pdev->dev, irq, eni_vdpa_config_handler, 0,
+ eni_vdpa->msix_name, eni_vdpa);
+ if (ret) {
+ ENI_ERR(pdev, "failed to request irq for config\n");
+ goto err;
+ }
+ vp_legacy_config_vector(ldev, queues);
+ eni_vdpa->config_irq = irq;
+
+ return 0;
+err:
+ eni_vdpa_free_irq(eni_vdpa);
+ return ret;
+}
+
+static void eni_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK &&
+ !(s & VIRTIO_CONFIG_S_DRIVER_OK)) {
+ eni_vdpa_request_irq(eni_vdpa);
+ }
+
+ vp_legacy_set_status(ldev, status);
+
+ if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+ (s & VIRTIO_CONFIG_S_DRIVER_OK))
+ eni_vdpa_free_irq(eni_vdpa);
+}
+
+static int eni_vdpa_reset(struct vdpa_device *vdpa)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u8 s = eni_vdpa_get_status(vdpa);
+
+ vp_legacy_set_status(ldev, 0);
+
+ if (s & VIRTIO_CONFIG_S_DRIVER_OK)
+ eni_vdpa_free_irq(eni_vdpa);
+
+ return 0;
+}
+
+static u16 eni_vdpa_get_vq_num_max(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static u16 eni_vdpa_get_vq_num_min(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_size(ldev, 0);
+}
+
+static int eni_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_vq_state *state)
+{
+ return -EOPNOTSUPP;
+}
+
+static int eni_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 qid,
+ const struct vdpa_vq_state *state)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ const struct vdpa_vq_state_split *split = &state->split;
+
+ /* ENI is built upon the legacy virtio-pci specification, which does
+ * not support setting the state of a virtqueue. But if the state
+ * happens to equal the device's initial state, we can let it go.
+ */
+ if (!vp_legacy_get_queue_enable(ldev, qid)
+ && split->avail_index == 0)
+ return 0;
+
+ return -EOPNOTSUPP;
+}
+
+
+static void eni_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 qid,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->vring[qid].cb = *cb;
+}
+
+static void eni_vdpa_set_vq_ready(struct vdpa_device *vdpa, u16 qid,
+ bool ready)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ /* ENI is a legacy virtio-pci device, and the legacy specification
+ * has no queue-ready control. But we can disable a virtqueue by
+ * setting its address to 0.
+ */
+ if (!ready)
+ vp_legacy_set_queue_address(ldev, qid, 0);
+}
+
+static bool eni_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 qid)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return vp_legacy_get_queue_enable(ldev, qid);
+}
+
+static void eni_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 qid,
+ u32 num)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ struct pci_dev *pdev = ldev->pci_dev;
+ u16 n = vp_legacy_get_queue_size(ldev, qid);
+
+ /* ENI is a legacy virtio-pci device, which does not allow changing
+ * the virtqueue size. Just report an error if someone tries to
+ * change it.
+ */
+ if (num != n)
+ ENI_ERR(pdev,
+ "not support to set vq %u fixed num %u to %u\n",
+ qid, n, num);
+}
+
+static int eni_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 qid,
+ u64 desc_area, u64 driver_area,
+ u64 device_area)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+ u32 pfn = desc_area >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
+
+ vp_legacy_set_queue_address(ldev, qid, pfn);
+
+ return 0;
+}
+
+static void eni_vdpa_kick_vq(struct vdpa_device *vdpa, u16 qid)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ iowrite16(qid, eni_vdpa->vring[qid].notify);
+}
+
+static u32 eni_vdpa_get_device_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.device;
+}
+
+static u32 eni_vdpa_get_vendor_id(struct vdpa_device *vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = vdpa_to_ldev(vdpa);
+
+ return ldev->id.vendor;
+}
+
+static u32 eni_vdpa_get_vq_align(struct vdpa_device *vdpa)
+{
+ return VIRTIO_PCI_VRING_ALIGN;
+}
+
+static size_t eni_vdpa_get_config_size(struct vdpa_device *vdpa)
+{
+ return sizeof(struct virtio_net_config);
+}
+
+
+static void eni_vdpa_get_config(struct vdpa_device *vdpa,
+ unsigned int offset,
+ void *buf, unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ *p++ = ioread8(ioaddr + i);
+}
+
+static void eni_vdpa_set_config(struct vdpa_device *vdpa,
+ unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ void __iomem *ioaddr = ldev->ioaddr +
+ VIRTIO_PCI_CONFIG_OFF(eni_vdpa->vectors) +
+ offset;
+ const u8 *p = buf;
+ int i;
+
+ for (i = 0; i < len; i++)
+ iowrite8(*p++, ioaddr + i);
+}
+
+static void eni_vdpa_set_config_cb(struct vdpa_device *vdpa,
+ struct vdpa_callback *cb)
+{
+ struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
+
+ eni_vdpa->config_cb = *cb;
+}
+
+static const struct vdpa_config_ops eni_vdpa_ops = {
+ .get_features = eni_vdpa_get_features,
+ .set_features = eni_vdpa_set_features,
+ .get_status = eni_vdpa_get_status,
+ .set_status = eni_vdpa_set_status,
+ .reset = eni_vdpa_reset,
+ .get_vq_num_max = eni_vdpa_get_vq_num_max,
+ .get_vq_num_min = eni_vdpa_get_vq_num_min,
+ .get_vq_state = eni_vdpa_get_vq_state,
+ .set_vq_state = eni_vdpa_set_vq_state,
+ .set_vq_cb = eni_vdpa_set_vq_cb,
+ .set_vq_ready = eni_vdpa_set_vq_ready,
+ .get_vq_ready = eni_vdpa_get_vq_ready,
+ .set_vq_num = eni_vdpa_set_vq_num,
+ .set_vq_address = eni_vdpa_set_vq_address,
+ .kick_vq = eni_vdpa_kick_vq,
+ .get_device_id = eni_vdpa_get_device_id,
+ .get_vendor_id = eni_vdpa_get_vendor_id,
+ .get_vq_align = eni_vdpa_get_vq_align,
+ .get_config_size = eni_vdpa_get_config_size,
+ .get_config = eni_vdpa_get_config,
+ .set_config = eni_vdpa_set_config,
+ .set_config_cb = eni_vdpa_set_config_cb,
+ .get_vq_irq = eni_vdpa_get_vq_irq,
+};
+
+
+static u16 eni_vdpa_get_num_queues(struct eni_vdpa *eni_vdpa)
+{
+ struct virtio_pci_legacy_device *ldev = &eni_vdpa->ldev;
+ u32 features = vp_legacy_get_features(ldev);
+ u16 num = 2;
+
+ if (features & BIT_ULL(VIRTIO_NET_F_MQ)) {
+ __virtio16 max_virtqueue_pairs;
+
+ eni_vdpa_get_config(&eni_vdpa->vdpa,
+ offsetof(struct virtio_net_config, max_virtqueue_pairs),
+ &max_virtqueue_pairs,
+ sizeof(max_virtqueue_pairs));
+ num = 2 * __virtio16_to_cpu(virtio_legacy_is_little_endian(),
+ max_virtqueue_pairs);
+ }
+
+ if (features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+ num += 1;
+
+ return num;
+}
+
+static void eni_vdpa_free_irq_vectors(void *data)
+{
+ pci_free_irq_vectors(data);
+}
+
+static int eni_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct device *dev = &pdev->dev;
+ struct eni_vdpa *eni_vdpa;
+ struct virtio_pci_legacy_device *ldev;
+ int ret, i;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ eni_vdpa = vdpa_alloc_device(struct eni_vdpa, vdpa,
+ dev, &eni_vdpa_ops, NULL, false);
+ if (IS_ERR(eni_vdpa)) {
+ ENI_ERR(pdev, "failed to allocate vDPA structure\n");
+ return PTR_ERR(eni_vdpa);
+ }
+
+ ldev = &eni_vdpa->ldev;
+ ldev->pci_dev = pdev;
+
+ ret = vp_legacy_probe(ldev);
+ if (ret) {
+ ENI_ERR(pdev, "failed to probe legacy PCI device\n");
+ goto err;
+ }
+
+ pci_set_master(pdev);
+ pci_set_drvdata(pdev, eni_vdpa);
+
+ eni_vdpa->vdpa.dma_dev = &pdev->dev;
+ eni_vdpa->queues = eni_vdpa_get_num_queues(eni_vdpa);
+
+ ret = devm_add_action_or_reset(dev, eni_vdpa_free_irq_vectors, pdev);
+ if (ret) {
+ ENI_ERR(pdev,
+ "failed for adding devres for freeing irq vectors\n");
+ goto err;
+ }
+
+ eni_vdpa->vring = devm_kcalloc(&pdev->dev, eni_vdpa->queues,
+ sizeof(*eni_vdpa->vring),
+ GFP_KERNEL);
+ if (!eni_vdpa->vring) {
+ ret = -ENOMEM;
+ ENI_ERR(pdev, "failed to allocate virtqueues\n");
+ goto err;
+ }
+
+ for (i = 0; i < eni_vdpa->queues; i++) {
+ eni_vdpa->vring[i].irq = VIRTIO_MSI_NO_VECTOR;
+ eni_vdpa->vring[i].notify = ldev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY;
+ }
+ eni_vdpa->config_irq = VIRTIO_MSI_NO_VECTOR;
+
+ ret = vdpa_register_device(&eni_vdpa->vdpa, eni_vdpa->queues);
+ if (ret) {
+ ENI_ERR(pdev, "failed to register to vdpa bus\n");
+ goto err;
+ }
+
+ return 0;
+
+err:
+ put_device(&eni_vdpa->vdpa.dev);
+ return ret;
+}
+
+static void eni_vdpa_remove(struct pci_dev *pdev)
+{
+ struct eni_vdpa *eni_vdpa = pci_get_drvdata(pdev);
+
+ vdpa_unregister_device(&eni_vdpa->vdpa);
+ vp_legacy_remove(&eni_vdpa->ldev);
+}
+
+static struct pci_device_id eni_pci_ids[] = {
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_TRANS_ID_NET,
+ PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+ VIRTIO_ID_NET) },
+ { 0 },
+};
+
+static struct pci_driver eni_vdpa_driver = {
+ .name = "alibaba-eni-vdpa",
+ .id_table = eni_pci_ids,
+ .probe = eni_vdpa_probe,
+ .remove = eni_vdpa_remove,
+};
+
+module_pci_driver(eni_vdpa_driver);
+
+MODULE_AUTHOR("Wu Zongyong <[email protected]>");
+MODULE_DESCRIPTION("Alibaba ENI vDPA driver");
+MODULE_LICENSE("GPL v2");
--
2.31.1
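
Note: the queue counting done by eni_vdpa_get_num_queues() in the patch above can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: the helper name count_queues and the two macro names are hypothetical, the feature bit positions are the ones defined for virtio-net (VIRTIO_NET_F_CTRL_VQ is bit 17, VIRTIO_NET_F_MQ is bit 22), and the caller is assumed to have already read max_virtqueue_pairs from the device config space in CPU byte order.

```c
#include <stdint.h>

/* Hypothetical names; the bit positions mirror the virtio-net feature bits. */
#define F_CTRL_VQ_BIT 17 /* VIRTIO_NET_F_CTRL_VQ */
#define F_MQ_BIT      22 /* VIRTIO_NET_F_MQ */

/*
 * Number of virtqueues to set up:
 *  - one RX/TX pair (2 queues) by default,
 *  - 2 * max_virtqueue_pairs when multiqueue is offered,
 *  - plus one control virtqueue when it is offered.
 */
static uint16_t count_queues(uint64_t features, uint16_t max_virtqueue_pairs)
{
	uint16_t num = 2;

	if (features & (1ULL << F_MQ_BIT))
		num = (uint16_t)(2 * max_virtqueue_pairs);

	if (features & (1ULL << F_CTRL_VQ_BIT))
		num += 1;

	return num;
}
```

For a device offering both MQ and CTRL_VQ with 4 queue pairs this yields 9 queues; without MQ the pair count is ignored and the result is 2, or 3 with a control queue.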

2021-10-29 09:19:24

by Wu Zongyong

Subject: [PATCH v7 9/9] eni_vdpa: alibaba: fix Kconfig typo

The Kconfig symbol was misspelled, which leads to randconfig link
failures:

ld.lld: error: undefined symbol: vp_legacy_probe
>>> referenced by eni_vdpa.c
>>> vdpa/alibaba/eni_vdpa.o:(eni_vdpa_probe) in archive drivers/built-in.a

Fixes: 6a9f32c00609 ("eni_vdpa: add vDPA driver for Alibaba ENI")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>
Signed-off-by: Wu Zongyong <[email protected]>
---
drivers/vdpa/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 07b0c73212aa..50f45d037611 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -80,7 +80,7 @@ config VP_VDPA

config ALIBABA_ENI_VDPA
tristate "vDPA driver for Alibaba ENI"
- select VIRTIO_PCI_LEGACY_LIB
+ select VIRTIO_PCI_LIB_LEGACY
depends on PCI_MSI && X86
help
VDPA driver for Alibaba ENI (Elastic Network Interface) which is built upon
--
2.31.1

2021-10-29 09:20:12

by Wu Zongyong

Subject: [PATCH v7 4/9] vdpa: add new callback get_vq_num_min in vdpa_config_ops

This callback is optional. For vdpa devices that do not support changing
the virtqueue size, get_vq_num_min and get_vq_num_max return the same
value, so that users can choose a correct size for that device.

Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Wu Zongyong <[email protected]>
Acked-by: Jason Wang <[email protected]>
---
include/linux/vdpa.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index a896ee021e5f..30864848950b 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -171,6 +171,9 @@ struct vdpa_map_file {
* @get_vq_num_max: Get the max size of virtqueue
* @vdev: vdpa device
* Returns u16: max size of virtqueue
+ * @get_vq_num_min: Get the min size of virtqueue (optional)
+ * @vdev: vdpa device
+ * Returns u16: min size of virtqueue
* @get_device_id: Get virtio device id
* @vdev: vdpa device
* Returns u32: virtio device id
@@ -266,6 +269,7 @@ struct vdpa_config_ops {
void (*set_config_cb)(struct vdpa_device *vdev,
struct vdpa_callback *cb);
u16 (*get_vq_num_max)(struct vdpa_device *vdev);
+ u16 (*get_vq_num_min)(struct vdpa_device *vdev);
u32 (*get_device_id)(struct vdpa_device *vdev);
u32 (*get_vendor_id)(struct vdpa_device *vdev);
u8 (*get_status)(struct vdpa_device *vdev);
--
2.31.1
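
Note: a minimal sketch of how a caller might combine the two callbacks, assuming the semantics described in the commit message above (min == max means the size is fixed). The helper name pick_vq_size and its "0 means no valid size" return convention are hypothetical and are not part of the vdpa API; the actual handling in virtio_vdpa may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose a virtqueue size from the requested size
 * and the device's reported [min_size, max_size] range.
 * Returns 0 when the range itself is invalid.
 */
static uint16_t pick_vq_size(uint16_t requested, uint16_t min_size,
			     uint16_t max_size)
{
	uint16_t size = requested;

	/* A device must report a non-zero max, and min must not exceed max. */
	if (max_size == 0 || min_size > max_size)
		return 0;

	/* Fixed-size device (e.g. legacy virtio-pci): nothing to negotiate. */
	if (min_size == max_size)
		return max_size;

	/* No preference or too large: fall back to the maximum. */
	if (size == 0 || size > max_size)
		size = max_size;

	/* Too small: raise to the minimum. */
	if (size < min_size)
		size = min_size;

	assert(size >= min_size && size <= max_size);
	return size;
}
```

With a fixed-size device reporting min == max == 256, every request resolves to 256; with a range of 1 to 1024, a request for 256 is kept as is.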

2021-11-01 03:33:47

by Jason Wang

Subject: Re: [PATCH v7 0/9] vDPA driver for Alibaba ENI

On Fri, Oct 29, 2021 at 5:15 PM Wu Zongyong
<[email protected]> wrote:
>
> This series implements the vDPA driver for Alibaba ENI (Elastic Network
> Interface) which is built based on virtio-pci 0.9.5 specification.

It looks to me Michael has applied the patches, if this is the case,
we probably need to send patches on top.

Thanks

>
> Changes since V6:
> - set default min vq size to 1 instead of 0
> - enable eni vdpa driver only on X86 hosts
> - fix some typos
>
> Changes since V5:
> - remove unused code
>
> Changes since V4:
> - check return values of get_vq_num_{max,min} when probing devices
> - disable the driver on BE host via Kconfig
> - add missing commit message
>
> Changes since V3:
> - validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
> - present F_ORDER_PLATFORM in get_features
> - remove endian check since ENI always uses little endian
>
> Changes since V2:
> - add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
> VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose correct virtqueue
> size as suggested by Jason Wang
> - present ACCESS_PLATFORM in get_features callback as suggested by Jason
> Wang
> - disable this driver on Big Endian host as suggested by Jason Wang
> - fix a typo
>
> Changes since V1:
> - add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
> the vdpa device is legacy
> - implement dedicated driver for Alibaba ENI instead of a legacy virtio-pci
> driver as suggested by Jason Wang
> - some bugs fixed
>
> Wu Zongyong (9):
> virtio-pci: introduce legacy device module
> vdpa: fix typo
> vp_vdpa: add vq irq offloading support
> vdpa: add new callback get_vq_num_min in vdpa_config_ops
> vdpa: min vq num of vdpa device cannot be greater than max vq num
> virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
> vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
> eni_vdpa: add vDPA driver for Alibaba ENI
> eni_vdpa: alibaba: fix Kconfig typo
>
> drivers/vdpa/Kconfig | 8 +
> drivers/vdpa/Makefile | 1 +
> drivers/vdpa/alibaba/Makefile | 3 +
> drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
> drivers/vdpa/vdpa.c | 13 +
> drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
> drivers/virtio/Kconfig | 10 +
> drivers/virtio/Makefile | 1 +
> drivers/virtio/virtio_pci_common.c | 10 +-
> drivers/virtio/virtio_pci_common.h | 9 +-
> drivers/virtio/virtio_pci_legacy.c | 101 ++---
> drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
> drivers/virtio/virtio_vdpa.c | 16 +-
> include/linux/vdpa.h | 6 +-
> include/linux/virtio_pci_legacy.h | 42 ++
> include/uapi/linux/vdpa.h | 1 +
> 16 files changed, 917 insertions(+), 89 deletions(-)
> create mode 100644 drivers/vdpa/alibaba/Makefile
> create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> create mode 100644 include/linux/virtio_pci_legacy.h
>
> --
> 2.31.1
>

2021-11-01 06:24:59

by Wu Zongyong

Subject: Re: [PATCH v7 0/9] vDPA driver for Alibaba ENI

On Mon, Nov 01, 2021 at 11:31:15AM +0800, Jason Wang wrote:
> On Fri, Oct 29, 2021 at 5:15 PM Wu Zongyong
> <[email protected]> wrote:
> >
> > This series implements the vDPA driver for Alibaba ENI (Elastic Network
> > Interface) which is built based on virtio-pci 0.9.5 specification.
>
> It looks to me Michael has applied the patches, if this is the case,
> we probably need to send patches on top.

What do you mean by saying "send patches on top"?
Sorry, I'm a newbie to contribute for kernel, could you please explain
it in detail?

Thanks

2021-11-01 07:05:48

by Jason Wang

Subject: Re: [PATCH v7 0/9] vDPA driver for Alibaba ENI

On Mon, Nov 1, 2021 at 2:23 PM Wu Zongyong <[email protected]> wrote:
>
> On Mon, Nov 01, 2021 at 11:31:15AM +0800, Jason Wang wrote:
> > On Fri, Oct 29, 2021 at 5:15 PM Wu Zongyong
> > <[email protected]> wrote:
> > >
> > > This series implements the vDPA driver for Alibaba ENI (Elastic Network
> > > Interface) which is built based on virtio-pci 0.9.5 specification.
> >
> > It looks to me Michael has applied the patches, if this is the case,
> > we probably need to send patches on top.
>
> What do you mean by saying "send patches on top"?
> Sorry, I'm a newbie to contribute for kernel, could you please explain
> it in detail?

I meant you probably need to send incremental patch on top of:

git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next.

Thanks



2021-11-01 08:14:23

by Wu Zongyong

Subject: Re: [PATCH v7 0/9] vDPA driver for Alibaba ENI

On Mon, Nov 01, 2021 at 03:02:52PM +0800, Jason Wang wrote:
> On Mon, Nov 1, 2021 at 2:23 PM Wu Zongyong <[email protected]> wrote:
> >
> > On Mon, Nov 01, 2021 at 11:31:15AM +0800, Jason Wang wrote:
> > > On Fri, Oct 29, 2021 at 5:15 PM Wu Zongyong
> > > <[email protected]> wrote:
> > > >
> > > > This series implements the vDPA driver for Alibaba ENI (Elastic Network
> > > > Interface) which is built based on virtio-pci 0.9.5 specification.
> > >
> > > It looks to me Michael has applied the patches, if this is the case,
> > > we probably need to send patches on top.
> >
> > What do you mean by saying "send patches on top"?
> > Sorry, I'm a newbie to contribute for kernel, could you please explain
> > it in detail?
>
> I meant you probably need to send incremental patch on top of:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next.

Get it.

Thanks

2021-11-01 08:20:56

by Michael S. Tsirkin

Subject: Re: [PATCH v7 9/9] eni_vdpa: alibaba: fix Kconfig typo

On Fri, Oct 29, 2021 at 05:14:50PM +0800, Wu Zongyong wrote:
> The Kconfig symbol was misspelled, which leads to randconfig link
> failures:
>
> ld.lld: error: undefined symbol: vp_legacy_probe
> >>> referenced by eni_vdpa.c
> >>> vdpa/alibaba/eni_vdpa.o:(eni_vdpa_probe) in archive drivers/built-in.a
>
> Fixes: 6a9f32c00609 ("eni_vdpa: add vDPA driver for Alibaba ENI")
> Signed-off-by: Arnd Bergmann <[email protected]>
> Reviewed-by: Stefano Garzarella <[email protected]>
> Signed-off-by: Wu Zongyong <[email protected]>

This one I'll squash into the previous one. That commit hash is not
going to match anything useful.

> ---
> drivers/vdpa/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 07b0c73212aa..50f45d037611 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -80,7 +80,7 @@ config VP_VDPA
>
> config ALIBABA_ENI_VDPA
> tristate "vDPA driver for Alibaba ENI"
> - select VIRTIO_PCI_LEGACY_LIB
> + select VIRTIO_PCI_LIB_LEGACY
> depends on PCI_MSI && X86
> help
> VDPA driver for Alibaba ENI (Elastic Network Interface) which is built upon
> --
> 2.31.1

2021-11-01 08:22:00

by Michael S. Tsirkin

Subject: Re: [PATCH v7 0/9] vDPA driver for Alibaba ENI

On Fri, Oct 29, 2021 at 05:14:41PM +0800, Wu Zongyong wrote:
> This series implements the vDPA driver for Alibaba ENI (Elastic Network
> Interface) which is built based on virtio-pci 0.9.5 specification.

In the future pls do not send v7 as a reply to v6.
Start a new thread with each version.

> Changes since V6:
> - set default min vq size to 1 instead of 0
> - enable eni vdpa driver only on X86 hosts
> - fix some typos
>
> Changes since V5:
> - remove unused code
>
> Changes since V4:
> - check return values of get_vq_num_{max,min} when probing devices
> - disable the driver on BE host via Kconfig
> - add missing commit message
>
> Changes since V3:
> - validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
> - present F_ORDER_PLATFORM in get_features
> - remove endian check since ENI always uses little endian
>
> Changes since V2:
> - add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
> VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose correct virtqueue
> size as suggested by Jason Wang
> - present ACCESS_PLATFORM in get_features callback as suggested by Jason
> Wang
> - disable this driver on Big Endian host as suggested by Jason Wang
> - fix a typo
>
> Changes since V1:
> - add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
> the vdpa device is legacy
> - implement dedicated driver for Alibaba ENI instead of a legacy virtio-pci
> driver as suggested by Jason Wang
> - some bugs fixed
>
> Wu Zongyong (9):
> virtio-pci: introduce legacy device module
> vdpa: fix typo
> vp_vdpa: add vq irq offloading support
> vdpa: add new callback get_vq_num_min in vdpa_config_ops
> vdpa: min vq num of vdpa device cannot be greater than max vq num
> virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
> vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
> eni_vdpa: add vDPA driver for Alibaba ENI
> eni_vdpa: alibaba: fix Kconfig typo
>
> drivers/vdpa/Kconfig | 8 +
> drivers/vdpa/Makefile | 1 +
> drivers/vdpa/alibaba/Makefile | 3 +
> drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
> drivers/vdpa/vdpa.c | 13 +
> drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
> drivers/virtio/Kconfig | 10 +
> drivers/virtio/Makefile | 1 +
> drivers/virtio/virtio_pci_common.c | 10 +-
> drivers/virtio/virtio_pci_common.h | 9 +-
> drivers/virtio/virtio_pci_legacy.c | 101 ++---
> drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
> drivers/virtio/virtio_vdpa.c | 16 +-
> include/linux/vdpa.h | 6 +-
> include/linux/virtio_pci_legacy.h | 42 ++
> include/uapi/linux/vdpa.h | 1 +
> 16 files changed, 917 insertions(+), 89 deletions(-)
> create mode 100644 drivers/vdpa/alibaba/Makefile
> create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> create mode 100644 include/linux/virtio_pci_legacy.h
>
> --
> 2.31.1
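
The changelog items about get_vq_num_{max,min} and "min vq num of vdpa device cannot be greater than max vq num" can be illustrated with a small standalone sketch. This is not the code from the series: `struct vq_limits`, `vq_limits_validate()` and `vq_pick_size()` are hypothetical names made up for this example, and the clamping policy is an assumption about how a caller might use the two limits. A legacy device whose queue size cannot be negotiated would report min equal to max.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the values a device would report through
 * get_vq_num_min / get_vq_num_max; not a kernel structure. */
struct vq_limits {
	uint16_t min;	/* smallest queue size the device accepts */
	uint16_t max;	/* largest queue size the device accepts */
};

/* Reject limits that no queue size can satisfy. */
static int vq_limits_validate(const struct vq_limits *l)
{
	if (l->max == 0)
		return -EINVAL;		/* device exposes no usable queue */
	if (l->min > l->max)
		return -EINVAL;		/* min must not exceed max */
	return 0;
}

/* Clamp the requested size into [min, max]. When min == max the
 * request is ignored and the fixed size is returned (assumed policy). */
static int vq_pick_size(const struct vq_limits *l, uint16_t requested,
			uint16_t *out)
{
	int err = vq_limits_validate(l);

	if (err)
		return err;
	if (requested < l->min)
		requested = l->min;
	if (requested > l->max)
		requested = l->max;
	*out = requested;
	return 0;
}
```

With limits of {256, 256}, any request resolves to 256, which matches the idea that userspace should not try to choose its own size for such a device.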

2021-11-01 11:13:42

by Michael S. Tsirkin

Subject: Re: [PATCH v7 0/9] vDPA driver for Alibaba ENI

On Mon, Nov 01, 2021 at 04:11:59PM +0800, Wu Zongyong wrote:
> On Mon, Nov 01, 2021 at 03:02:52PM +0800, Jason Wang wrote:
> > On Mon, Nov 1, 2021 at 2:23 PM Wu Zongyong <[email protected]> wrote:
> > >
> > > On Mon, Nov 01, 2021 at 11:31:15AM +0800, Jason Wang wrote:
> > > > On Fri, Oct 29, 2021 at 5:15 PM Wu Zongyong
> > > > <[email protected]> wrote:
> > > > >
> > > > > This series implements the vDPA driver for Alibaba ENI (Elastic Network
> > > > > Interface) which is built based on virtio-pci 0.9.5 specification.
> > > >
> > > > > It looks to me like Michael has applied the patches. If that is the
> > > > > case, we probably need to send patches on top.
> > >
> > > What do you mean by saying "send patches on top"?
> > > > Sorry, I'm new to contributing to the kernel, could you please explain
> > > > it in detail?
> >
> > > I meant you probably need to send an incremental patch on top of:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next.
>
> > Got it.
>
> Thanks

No need, I rebased.

> >
> > Thanks
> >
> >
> > >
> > > Thanks
> > > > Thanks
> > > >
> > > > >
> > > > > Changes since V6:
> > > > > > - set default min vq size to 1 instead of 0
> > > > > - enable eni vdpa driver only on X86 hosts
> > > > > - fix some typos
> > > > >
> > > > > Changes since V5:
> > > > > > - remove unused code
> > > > >
> > > > > Changes since V4:
> > > > > - check return values of get_vq_num_{max,min} when probing devices
> > > > > - disable the driver on BE host via Kconfig
> > > > > - add missing commit message
> > > > >
> > > > > Changes since V3:
> > > > > > - validate VIRTIO_NET_F_MRG_RXBUF when negotiating features
> > > > > > - present F_ORDER_PLATFORM in get_features
> > > > > > - remove endian check since ENI always uses little endian
> > > > >
> > > > > Changes since V2:
> > > > > > - add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE instead of
> > > > > VDPA_ATTR_DEV_F_VERSION_1 to guide users to choose correct virtqueue
> > > > > size as suggested by Jason Wang
> > > > > - present ACCESS_PLATFORM in get_features callback as suggested by Jason
> > > > > Wang
> > > > > - disable this driver on Big Endian host as suggested by Jason Wang
> > > > > - fix a typo
> > > > >
> > > > > Changes since V1:
> > > > > - add new vdpa attribute VDPA_ATTR_DEV_F_VERSION_1 to indicate whether
> > > > > the vdpa device is legacy
> > > > > > - implement dedicated driver for Alibaba ENI instead of a legacy virtio-pci
> > > > > driver as suggested by Jason Wang
> > > > > - some bugs fixed
> > > > >
> > > > > Wu Zongyong (9):
> > > > > virtio-pci: introduce legacy device module
> > > > > vdpa: fix typo
> > > > > vp_vdpa: add vq irq offloading support
> > > > > vdpa: add new callback get_vq_num_min in vdpa_config_ops
> > > > > vdpa: min vq num of vdpa device cannot be greater than max vq num
> > > > > virtio_vdpa: setup correct vq size with callbacks get_vq_num_{max,min}
> > > > > vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
> > > > > eni_vdpa: add vDPA driver for Alibaba ENI
> > > > > eni_vdpa: alibaba: fix Kconfig typo
> > > > >
> > > > > drivers/vdpa/Kconfig | 8 +
> > > > > drivers/vdpa/Makefile | 1 +
> > > > > drivers/vdpa/alibaba/Makefile | 3 +
> > > > > drivers/vdpa/alibaba/eni_vdpa.c | 553 +++++++++++++++++++++++++
> > > > > drivers/vdpa/vdpa.c | 13 +
> > > > > drivers/vdpa/virtio_pci/vp_vdpa.c | 12 +
> > > > > drivers/virtio/Kconfig | 10 +
> > > > > drivers/virtio/Makefile | 1 +
> > > > > drivers/virtio/virtio_pci_common.c | 10 +-
> > > > > drivers/virtio/virtio_pci_common.h | 9 +-
> > > > > drivers/virtio/virtio_pci_legacy.c | 101 ++---
> > > > > drivers/virtio/virtio_pci_legacy_dev.c | 220 ++++++++++
> > > > > drivers/virtio/virtio_vdpa.c | 16 +-
> > > > > include/linux/vdpa.h | 6 +-
> > > > > include/linux/virtio_pci_legacy.h | 42 ++
> > > > > include/uapi/linux/vdpa.h | 1 +
> > > > > 16 files changed, 917 insertions(+), 89 deletions(-)
> > > > > create mode 100644 drivers/vdpa/alibaba/Makefile
> > > > > create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> > > > > create mode 100644 drivers/virtio/virtio_pci_legacy_dev.c
> > > > > create mode 100644 include/linux/virtio_pci_legacy.h
> > > > >
> > > > > --
> > > > > 2.31.1
> > > > >
> > >