2020-09-24 03:23:07

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 00/24] Control VQ support in vDPA

Hi All:

This series tries to add the support for control virtqueue in vDPA.

Control virtqueue is used by networking device for accepting various
commands from the driver. It's a must to support multiqueue and other
configurations.

When used by vhost-vDPA bus driver for VM, the control virtqueue
should be shadowed via userspace VMM (Qemu) instead of being assigned
directly to Guest. This is because Qemu needs to know the device state
in order to start and stop device correctly (e.g for Live Migration).

This requies to isolate the memory mapping for control virtqueue
presented by vhost-vDPA to prevent guest from accesing it directly.

To achieve this, vDPA introduce two new abstractions:

- address space: identified through address space id (ASID) and a set
of memory mapping in maintained
- virtqueue group: the minimal set of virtqueues that must share an
address space

Device needs to advertise the following attributes to vDPA:

- the number of address spaces supported in the device
- the number of virtqueue groups supported in the device
- the mappings from a specific virtqueue to its virtqueue groups

The mappings from virtqueue to virtqueue groups is fixed and defined
by vDPA device driver. E.g:

- For the device that has hardware ASID support, it can simply
advertise a per virtqueue virtqueue group.
- For the device that does not have hardware ASID support, it can
simply advertise a single virtqueue group that contains all
virtqueues. Or if it wants a software emulated control virtqueue, it
can advertise two virtqueue groups, one is for cvq, another is for
the rest virtqueues.

vDPA also allow to change the association between virtqueue group and
address space. So in the case of control virtqueue, userspace
VMM(Qemu) may use a dedicated address space for the control virtqueue
group to isolate the memory mapping.

The vhost/vhost-vDPA is also extend for the userspace to:

- query the number of virtqueue groups and address spaces supported by
the device
- query the virtqueue group for a specific virtqueue
- assocaite a virtqueue group with an address space
- send ASID based IOTLB commands

This will help userspace VMM(Qemu) to detect whether the control vq
could be supported and isolate memory mappings of control virtqueue
from the others.

To demonstrate the usage, vDPA simulator is extended to support
setting MAC address via a emulated control virtqueue. Please refer
patch 24 for more implementation details.

Please review.

Note that patch 1 and a equivalent of patch 2 have been posted in the
list. Those two are requirement for this series to work, so I add them
here.

Thank

Jason Wang (24):
vhost-vdpa: fix backend feature ioctls
vhost-vdpa: fix vqs leak in vhost_vdpa_open()
vhost: move the backend feature bits to vhost_types.h
virtio-vdpa: don't set callback if virtio doesn't need it
vhost-vdpa: passing iotlb to IOMMU mapping helpers
vhost-vdpa: switch to use vhost-vdpa specific IOTLB
vdpa: add the missing comment for nvqs in struct vdpa_device
vdpa: introduce virtqueue groups
vdpa: multiple address spaces support
vdpa: introduce config operations for associating ASID to a virtqueue
group
vhost_iotlb: split out IOTLB initialization
vhost: support ASID in IOTLB API
vhost-vdpa: introduce ASID based IOTLB
vhost-vdpa: introduce uAPI to get the number of virtqueue groups
vhost-vdpa: introduce uAPI to get the number of address spaces
vhost-vdpa: uAPI to get virtqueue group id
vhost-vdpa: introduce uAPI to set group ASID
vhost-vdpa: support ASID based IOTLB API
vdpa_sim: use separated iov for reading and writing
vdpa_sim: advertise VIRTIO_NET_F_MTU
vdpa_sim: advertise VIRTIO_NET_F_MAC
vdpa_sim: factor out buffer completion logic
vdpa_sim: filter destination mac address
vdpasim: control virtqueue support

drivers/vdpa/ifcvf/ifcvf_main.c | 9 +-
drivers/vdpa/mlx5/net/mlx5_vnet.c | 11 +-
drivers/vdpa/vdpa.c | 8 +-
drivers/vdpa/vdpa_sim/vdpa_sim.c | 293 ++++++++++++++++++++++++------
drivers/vhost/iotlb.c | 23 ++-
drivers/vhost/vdpa.c | 259 ++++++++++++++++++++------
drivers/vhost/vhost.c | 23 ++-
drivers/vhost/vhost.h | 4 +-
drivers/virtio/virtio_vdpa.c | 2 +-
include/linux/vdpa.h | 42 ++++-
include/linux/vhost_iotlb.h | 2 +
include/uapi/linux/vhost.h | 19 +-
include/uapi/linux/vhost_types.h | 10 +-
13 files changed, 556 insertions(+), 149 deletions(-)

--
2.20.1


2020-09-24 03:23:32

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 01/24] vhost-vdpa: fix backend feature ioctls

Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
introduces two malfunction backend features ioctls:

1) the ioctls was blindly added to vring ioctl instead of vdpa device
ioctl
2) vhost_set_backend_features() was called when dev mutex has already
been held which will lead a deadlock

This patch fixes the above issues.

Cc: Eli Cohen <[email protected]>
Reported-by: Zhu Lingshan <[email protected]>
Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")
Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 30 ++++++++++++++++--------------
1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 3fab94f88894..796fe979f997 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -353,8 +353,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
struct vdpa_callback cb;
struct vhost_virtqueue *vq;
struct vhost_vring_state s;
- u64 __user *featurep = argp;
- u64 features;
u32 idx;
long r;

@@ -381,18 +379,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,

vq->last_avail_idx = vq_state.avail_index;
break;
- case VHOST_GET_BACKEND_FEATURES:
- features = VHOST_VDPA_BACKEND_FEATURES;
- if (copy_to_user(featurep, &features, sizeof(features)))
- return -EFAULT;
- return 0;
- case VHOST_SET_BACKEND_FEATURES:
- if (copy_from_user(&features, featurep, sizeof(features)))
- return -EFAULT;
- if (features & ~VHOST_VDPA_BACKEND_FEATURES)
- return -EOPNOTSUPP;
- vhost_set_backend_features(&v->vdev, features);
- return 0;
}

r = vhost_vring_ioctl(&v->vdev, cmd, argp);
@@ -440,8 +426,20 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
struct vhost_vdpa *v = filep->private_data;
struct vhost_dev *d = &v->vdev;
void __user *argp = (void __user *)arg;
+ u64 __user *featurep = argp;
+ u64 features;
long r;

+ if (cmd == VHOST_SET_BACKEND_FEATURES) {
+ r = copy_from_user(&features, featurep, sizeof(features));
+ if (r)
+ return r;
+ if (features & ~VHOST_VDPA_BACKEND_FEATURES)
+ return -EOPNOTSUPP;
+ vhost_set_backend_features(&v->vdev, features);
+ return 0;
+ }
+
mutex_lock(&d->mutex);

switch (cmd) {
@@ -476,6 +474,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_SET_CONFIG_CALL:
r = vhost_vdpa_set_config_call(v, argp);
break;
+ case VHOST_GET_BACKEND_FEATURES:
+ features = VHOST_VDPA_BACKEND_FEATURES;
+ r = copy_to_user(featurep, &features, sizeof(features));
+ break;
default:
r = vhost_dev_ioctl(&v->vdev, cmd, argp);
if (r == -ENOIOCTLCMD)
--
2.20.1

2020-09-24 03:24:11

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 02/24] vhost-vdpa: fix vqs leak in vhost_vdpa_open()

We need to free vqs during the err path after it has been allocated
since vhost won't do that for us.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 796fe979f997..9c641274b9f3 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -764,6 +764,12 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)
v->domain = NULL;
}

+static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
+{
+ vhost_dev_cleanup(&v->vdev);
+ kfree(v->vdev.vqs);
+}
+
static int vhost_vdpa_open(struct inode *inode, struct file *filep)
{
struct vhost_vdpa *v;
@@ -809,7 +815,7 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
return 0;

err_init_iotlb:
- vhost_dev_cleanup(&v->vdev);
+ vhost_vdpa_cleanup(v);
err:
atomic_dec(&v->opened);
return r;
@@ -840,8 +846,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
vhost_vdpa_free_domain(v);
vhost_vdpa_config_put(v);
vhost_vdpa_clean_irq(v);
- vhost_dev_cleanup(&v->vdev);
- kfree(v->vdev.vqs);
+ vhost_vdpa_cleanup(v);
mutex_unlock(&d->mutex);

atomic_dec(&v->opened);
--
2.20.1

2020-09-24 03:24:18

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 03/24] vhost: move the backend feature bits to vhost_types.h

We should store feature bits in vhost_types.h as what has been done
for e.g VHOST_F_LOG_ALL.

Signed-off-by: Jason Wang <[email protected]>
---
include/uapi/linux/vhost.h | 5 -----
include/uapi/linux/vhost_types.h | 5 +++++
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 75232185324a..c26452782882 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -89,11 +89,6 @@

/* Set or get vhost backend capability */

-/* Use message type V2 */
-#define VHOST_BACKEND_F_IOTLB_MSG_V2 0x1
-/* IOTLB can accept batching hints */
-#define VHOST_BACKEND_F_IOTLB_BATCH 0x2
-
#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)

diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index 9a269a88a6ff..532571571b4b 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -144,4 +144,9 @@ struct vhost_vdpa_config {
/* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
#define VHOST_NET_F_VIRTIO_NET_HDR 27

+/* Use message type V2 */
+#define VHOST_BACKEND_F_IOTLB_MSG_V2 0x1
+/* IOTLB can accept batching hints */
+#define VHOST_BACKEND_F_IOTLB_BATCH 0x2
+
#endif
--
2.20.1

2020-09-24 03:24:37

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 04/24] virtio-vdpa: don't set callback if virtio doesn't need it

There's no need for setting callbacks for the driver that doesn't care
about that.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 4a9ddb44b2a7..af6ee677f319 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -175,7 +175,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
}

/* Setup virtqueue callback */
- cb.callback = virtio_vdpa_virtqueue_cb;
+ cb.callback = callback ? virtio_vdpa_virtqueue_cb : NULL;
cb.private = info;
ops->set_vq_cb(vdpa, index, &cb);
ops->set_vq_num(vdpa, index, virtqueue_get_vring_size(vq));
--
2.20.1

2020-09-24 03:24:48

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 05/24] vhost-vdpa: passing iotlb to IOMMU mapping helpers

To prepare for the ASID support for vhost-vdpa, try to pass IOTLB
object to dma helpers.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 40 ++++++++++++++++++++++------------------
1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 9c641274b9f3..74bef1c15a70 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -489,10 +489,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
return r;
}

-static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
+static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 start, u64 last)
{
struct vhost_dev *dev = &v->vdev;
- struct vhost_iotlb *iotlb = dev->iotlb;
struct vhost_iotlb_map *map;
struct page *page;
unsigned long pfn, pinned;
@@ -514,8 +515,9 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
{
struct vhost_dev *dev = &v->vdev;
+ struct vhost_iotlb *iotlb = dev->iotlb;

- vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
+ vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
kfree(dev->iotlb);
dev->iotlb = NULL;
}
@@ -542,15 +544,14 @@ static int perm_to_iommu_flags(u32 perm)
return flags | IOMMU_CACHE;
}

-static int vhost_vdpa_map(struct vhost_vdpa *v,
+static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
u64 iova, u64 size, u64 pa, u32 perm)
{
- struct vhost_dev *dev = &v->vdev;
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
int r = 0;

- r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1,
+ r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
pa, perm);
if (r)
return r;
@@ -559,7 +560,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
r = ops->dma_map(vdpa, iova, size, pa, perm);
} else if (ops->set_map) {
if (!v->in_batch)
- r = ops->set_map(vdpa, dev->iotlb);
+ r = ops->set_map(vdpa, iotlb);
} else {
r = iommu_map(v->domain, iova, pa, size,
perm_to_iommu_flags(perm));
@@ -568,29 +569,30 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
return r;
}

-static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)
+static void vhost_vdpa_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 iova, u64 size)
{
- struct vhost_dev *dev = &v->vdev;
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;

- vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);
+ vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);

if (ops->dma_map) {
ops->dma_unmap(vdpa, iova, size);
} else if (ops->set_map) {
if (!v->in_batch)
- ops->set_map(vdpa, dev->iotlb);
+ ops->set_map(vdpa, iotlb);
} else {
iommu_unmap(v->domain, iova, size);
}
}

static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
struct vhost_iotlb_msg *msg)
{
struct vhost_dev *dev = &v->vdev;
- struct vhost_iotlb *iotlb = dev->iotlb;
struct page **page_list;
unsigned long list_size = PAGE_SIZE / sizeof(struct page *);
unsigned int gup_flags = FOLL_LONGTERM;
@@ -644,7 +646,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
if (last_pfn && (this_pfn != last_pfn + 1)) {
/* Pin a contiguous chunk of memory */
csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT;
- if (vhost_vdpa_map(v, iova, csize,
+ if (vhost_vdpa_map(v, iotlb, iova, csize,
map_pfn << PAGE_SHIFT,
msg->perm))
goto out;
@@ -660,11 +662,12 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
}

/* Pin the rest chunk */
- ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT,
+ ret = vhost_vdpa_map(v, iotlb, iova,
+ (last_pfn - map_pfn + 1) << PAGE_SHIFT,
map_pfn << PAGE_SHIFT, msg->perm);
out:
if (ret) {
- vhost_vdpa_unmap(v, msg->iova, msg->size);
+ vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
atomic64_sub(npages, &dev->mm->pinned_vm);
}
mmap_read_unlock(dev->mm);
@@ -678,6 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
+ struct vhost_iotlb *iotlb = dev->iotlb;
int r = 0;

r = vhost_dev_check_owner(dev);
@@ -686,17 +690,17 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,

switch (msg->type) {
case VHOST_IOTLB_UPDATE:
- r = vhost_vdpa_process_iotlb_update(v, msg);
+ r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
break;
case VHOST_IOTLB_INVALIDATE:
- vhost_vdpa_unmap(v, msg->iova, msg->size);
+ vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
break;
case VHOST_IOTLB_BATCH_BEGIN:
v->in_batch = true;
break;
case VHOST_IOTLB_BATCH_END:
if (v->in_batch && ops->set_map)
- ops->set_map(vdpa, dev->iotlb);
+ ops->set_map(vdpa, iotlb);
v->in_batch = false;
break;
default:
--
2.20.1

2020-09-24 03:25:11

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 06/24] vhost-vdpa: switch to use vhost-vdpa specific IOTLB

To ease the implementation of per group ASID support for vDPA
device. This patch switches to use a vhost-vdpa specific IOTLB to
avoid the unnecessary refactoring of the vhost core.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 74bef1c15a70..ec3c94f706c1 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -40,6 +40,7 @@ struct vhost_vdpa {
struct vhost_virtqueue *vqs;
struct completion completion;
struct vdpa_device *vdpa;
+ struct vhost_iotlb *iotlb;
struct device dev;
struct cdev cdev;
atomic_t opened;
@@ -514,12 +515,11 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,

static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
{
- struct vhost_dev *dev = &v->vdev;
- struct vhost_iotlb *iotlb = dev->iotlb;
+ struct vhost_iotlb *iotlb = v->iotlb;

vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
- kfree(dev->iotlb);
- dev->iotlb = NULL;
+ kfree(v->iotlb);
+ v->iotlb = NULL;
}

static int perm_to_iommu_flags(u32 perm)
@@ -681,7 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
- struct vhost_iotlb *iotlb = dev->iotlb;
+ struct vhost_iotlb *iotlb = v->iotlb;
int r = 0;

r = vhost_dev_check_owner(dev);
@@ -812,12 +812,14 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)

r = vhost_vdpa_alloc_domain(v);
if (r)
- goto err_init_iotlb;
+ goto err_alloc_domain;

filep->private_data = v;

return 0;

+err_alloc_domain:
+ vhost_vdpa_iotlb_free(v);
err_init_iotlb:
vhost_vdpa_cleanup(v);
err:
--
2.20.1

2020-09-24 03:25:24

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 07/24] vdpa: add the missing comment for nvqs in struct vdpa_device

Signed-off-by: Jason Wang <[email protected]>
---
include/linux/vdpa.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index eae0bfd87d91..df169c2f5c0f 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -42,6 +42,7 @@ struct vdpa_vq_state {
* @config: the configuration ops for this device.
* @index: device index
* @features_valid: were features initialized? for legacy guests
+ * @nvqs: the number of virtqueues
*/
struct vdpa_device {
struct device dev;
--
2.20.1

2020-09-24 03:25:53

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 08/24] vdpa: introduce virtqueue groups

This patch introduces virtqueue groups to vDPA device. The virtqueue
group is the minimal set of virtqueues that must share an address
space. And the adddress space identifier could only be attached to
a specific virtqueue group.

A new mandated bus operation is introduced to get the virtqueue group
ID for a specific virtqueue.

All the vDPA device drivers were converted to simply support a single
virtqueue group.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/ifcvf/ifcvf_main.c | 9 ++++++++-
drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 +++++++-
drivers/vdpa/vdpa.c | 4 +++-
drivers/vdpa/vdpa_sim/vdpa_sim.c | 11 ++++++++++-
include/linux/vdpa.h | 12 +++++++++---
5 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 076d7ac5e723..e6a0be374e51 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -327,6 +327,11 @@ static u32 ifcvf_vdpa_get_vq_align(struct vdpa_device *vdpa_dev)
return IFCVF_QUEUE_ALIGNMENT;
}

+static u32 ifcvf_vdpa_get_vq_group(struct vdpa_device *vdpa, u16 idx)
+{
+ return 0;
+}
+
static void ifcvf_vdpa_get_config(struct vdpa_device *vdpa_dev,
unsigned int offset,
void *buf, unsigned int len)
@@ -387,6 +392,7 @@ static const struct vdpa_config_ops ifc_vdpa_ops = {
.get_device_id = ifcvf_vdpa_get_device_id,
.get_vendor_id = ifcvf_vdpa_get_vendor_id,
.get_vq_align = ifcvf_vdpa_get_vq_align,
+ .get_vq_group = ifcvf_vdpa_get_vq_group,
.get_config = ifcvf_vdpa_get_config,
.set_config = ifcvf_vdpa_set_config,
.set_config_cb = ifcvf_vdpa_set_config_cb,
@@ -434,7 +440,8 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)

adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
dev, &ifc_vdpa_ops,
- IFCVF_MAX_QUEUE_PAIRS * 2);
+ IFCVF_MAX_QUEUE_PAIRS * 2, 1);
+
if (adapter == NULL) {
IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
return -ENOMEM;
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 9df69d5efe8c..4e480f4f754e 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1428,6 +1428,11 @@ static u32 mlx5_vdpa_get_vq_align(struct vdpa_device *vdev)
return PAGE_SIZE;
}

+static u32 mlx5_vdpa_get_vq_group(struct vdpa_device *vdpa, u16 idx)
+{
+ return 0;
+}
+
enum { MLX5_VIRTIO_NET_F_GUEST_CSUM = 1 << 9,
MLX5_VIRTIO_NET_F_CSUM = 1 << 10,
MLX5_VIRTIO_NET_F_HOST_TSO6 = 1 << 11,
@@ -1838,6 +1843,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
.get_vq_notification = mlx5_get_vq_notification,
.get_vq_irq = mlx5_get_vq_irq,
.get_vq_align = mlx5_vdpa_get_vq_align,
+ .get_vq_group = mlx5_vdpa_get_vq_group,
.get_features = mlx5_vdpa_get_features,
.set_features = mlx5_vdpa_set_features,
.set_config_cb = mlx5_vdpa_set_config_cb,
@@ -1925,7 +1931,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);

ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
- 2 * mlx5_vdpa_max_qps(max_vqs));
+ 2 * mlx5_vdpa_max_qps(max_vqs), 1);
if (IS_ERR(ndev))
return ndev;

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index a69ffc991e13..46399746ec7c 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -62,6 +62,7 @@ static void vdpa_release_dev(struct device *d)
* @parent: the parent device
* @config: the bus operations that is supported by this device
* @nvqs: number of virtqueues supported by this device
+ * @ngroups: number of groups supported by this device
* @size: size of the parent structure that contains private data
*
* Driver should use vdpa_alloc_device() wrapper macro instead of
@@ -72,7 +73,7 @@ static void vdpa_release_dev(struct device *d)
*/
struct vdpa_device *__vdpa_alloc_device(struct device *parent,
const struct vdpa_config_ops *config,
- int nvqs,
+ int nvqs, unsigned int ngroups,
size_t size)
{
struct vdpa_device *vdev;
@@ -100,6 +101,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
vdev->config = config;
vdev->features_valid = false;
vdev->nvqs = nvqs;
+ vdev->ngroups = ngroups;

err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
if (err)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 62d640327145..6669c561bc6e 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -75,6 +75,7 @@ struct vdpasim {
u32 status;
u32 generation;
u64 features;
+ u32 groups;
/* spinlock to synchronize iommu table */
spinlock_t iommu_lock;
};
@@ -352,7 +353,8 @@ static struct vdpasim *vdpasim_create(void)
else
ops = &vdpasim_net_config_ops;

- vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
+ vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
+ VDPASIM_VQ_NUM, 1);
if (!vdpasim)
goto err_alloc;

@@ -481,6 +483,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
return VDPASIM_QUEUE_ALIGN;
}

+static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
+{
+ return 0;
+}
+
static u64 vdpasim_get_features(struct vdpa_device *vdpa)
{
return vdpasim_features;
@@ -646,6 +653,7 @@ static const struct vdpa_config_ops vdpasim_net_config_ops = {
.set_vq_state = vdpasim_set_vq_state,
.get_vq_state = vdpasim_get_vq_state,
.get_vq_align = vdpasim_get_vq_align,
+ .get_vq_group = vdpasim_get_vq_group,
.get_features = vdpasim_get_features,
.set_features = vdpasim_set_features,
.set_config_cb = vdpasim_set_config_cb,
@@ -672,6 +680,7 @@ static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
.set_vq_state = vdpasim_set_vq_state,
.get_vq_state = vdpasim_get_vq_state,
.get_vq_align = vdpasim_get_vq_align,
+ .get_vq_group = vdpasim_get_vq_group,
.get_features = vdpasim_get_features,
.set_features = vdpasim_set_features,
.set_config_cb = vdpasim_set_config_cb,
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index df169c2f5c0f..d829512efd27 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -51,6 +51,7 @@ struct vdpa_device {
unsigned int index;
bool features_valid;
int nvqs;
+ unsigned int ngroups;
};

/**
@@ -109,6 +110,10 @@ struct vdpa_device {
* for the device
* @vdev: vdpa device
* Returns virtqueue algin requirement
+ * @get_vq_group: Get the group id for a specific virtqueue
+ * @vdev: vdpa device
+ * @idx: virtqueue index
+ * Returns u32: group id for this virtqueue
* @get_features: Get virtio features supported by the device
* @vdev: vdpa device
* Returns the virtio features support by the
@@ -203,6 +208,7 @@ struct vdpa_config_ops {

/* Device ops */
u32 (*get_vq_align)(struct vdpa_device *vdev);
+ u32 (*get_vq_group)(struct vdpa_device *vdev, u16 idx);
u64 (*get_features)(struct vdpa_device *vdev);
int (*set_features)(struct vdpa_device *vdev, u64 features);
void (*set_config_cb)(struct vdpa_device *vdev,
@@ -230,12 +236,12 @@ struct vdpa_config_ops {

struct vdpa_device *__vdpa_alloc_device(struct device *parent,
const struct vdpa_config_ops *config,
- int nvqs,
+ int nvqs, unsigned int ngroups,
size_t size);

-#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs) \
+#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
container_of(__vdpa_alloc_device( \
- parent, config, nvqs, \
+ parent, config, nvqs, ngroups, \
sizeof(dev_struct) + \
BUILD_BUG_ON_ZERO(offsetof( \
dev_struct, member))), \
--
2.20.1

2020-09-24 03:25:59

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 11/24] vhost_iotlb: split out IOTLB initialization

This patch split out IOTLB initialization logic into a new
helper. This allows vhost to implement device specific IOTLB
allocation logic.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/iotlb.c | 23 ++++++++++++++++++-----
include/linux/vhost_iotlb.h | 2 ++
2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/iotlb.c b/drivers/vhost/iotlb.c
index 1f0ca6e44410..36d785efd038 100644
--- a/drivers/vhost/iotlb.c
+++ b/drivers/vhost/iotlb.c
@@ -98,6 +98,23 @@ void vhost_iotlb_del_range(struct vhost_iotlb *iotlb, u64 start, u64 last)
}
EXPORT_SYMBOL_GPL(vhost_iotlb_del_range);

+/**
+ * vhost_iotlb_init - initialize a vhost IOTLB
+ * @iotlb: the IOTLB that needs to be initialized
+ * @limit: maximum number of IOTLB entries
+ * @flags: VHOST_IOTLB_FLAG_XXX
+ */
+void vhost_iotlb_init(struct vhost_iotlb *iotlb, unsigned int limit,
+ unsigned int flags)
+{
+ iotlb->root = RB_ROOT_CACHED;
+ iotlb->limit = limit;
+ iotlb->nmaps = 0;
+ iotlb->flags = flags;
+ INIT_LIST_HEAD(&iotlb->list);
+}
+EXPORT_SYMBOL_GPL(vhost_iotlb_init);
+
/**
* vhost_iotlb_alloc - add a new vhost IOTLB
* @limit: maximum number of IOTLB entries
@@ -112,11 +129,7 @@ struct vhost_iotlb *vhost_iotlb_alloc(unsigned int limit, unsigned int flags)
if (!iotlb)
return NULL;

- iotlb->root = RB_ROOT_CACHED;
- iotlb->limit = limit;
- iotlb->nmaps = 0;
- iotlb->flags = flags;
- INIT_LIST_HEAD(&iotlb->list);
+ vhost_iotlb_init(iotlb, limit, flags);

return iotlb;
}
diff --git a/include/linux/vhost_iotlb.h b/include/linux/vhost_iotlb.h
index 6b09b786a762..c0df193ec3e1 100644
--- a/include/linux/vhost_iotlb.h
+++ b/include/linux/vhost_iotlb.h
@@ -33,6 +33,8 @@ int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, u64 start, u64 last,
u64 addr, unsigned int perm);
void vhost_iotlb_del_range(struct vhost_iotlb *iotlb, u64 start, u64 last);

+void vhost_iotlb_init(struct vhost_iotlb *iotlb, unsigned int limit,
+ unsigned int flags);
struct vhost_iotlb *vhost_iotlb_alloc(unsigned int limit, unsigned int flags);
void vhost_iotlb_free(struct vhost_iotlb *iotlb);
void vhost_iotlb_reset(struct vhost_iotlb *iotlb);
--
2.20.1

2020-09-24 03:26:14

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 12/24] vhost: support ASID in IOTLB API

This patches allows userspace to send ASID based IOTLB message to
vhost. This idea is to use the reserved u32 field in the existing V2
IOTLB message. Vhost device should advertise this capability via
VHOST_BACKEND_F_IOTLB_ASID backend feature.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 5 ++++-
drivers/vhost/vhost.c | 23 ++++++++++++++++++-----
drivers/vhost/vhost.h | 4 ++--
include/uapi/linux/vhost_types.h | 5 ++++-
4 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index eeefcd971e3f..6552987544d7 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -675,7 +675,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
return ret;
}

-static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
+static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
struct vhost_iotlb_msg *msg)
{
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
@@ -684,6 +684,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
struct vhost_iotlb *iotlb = v->iotlb;
int r = 0;

+ if (asid != 0)
+ return -EINVAL;
+
r = vhost_dev_check_owner(dev);
if (r)
return r;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b45519ca66a7..060662b12de3 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -463,7 +463,7 @@ void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs,
int iov_limit, int weight, int byte_weight,
bool use_worker,
- int (*msg_handler)(struct vhost_dev *dev,
+ int (*msg_handler)(struct vhost_dev *dev, u32 asid,
struct vhost_iotlb_msg *msg))
{
struct vhost_virtqueue *vq;
@@ -1079,11 +1079,14 @@ static bool umem_access_ok(u64 uaddr, u64 size, int access)
return true;
}

-static int vhost_process_iotlb_msg(struct vhost_dev *dev,
+static int vhost_process_iotlb_msg(struct vhost_dev *dev, u16 asid,
struct vhost_iotlb_msg *msg)
{
int ret = 0;

+ if (asid != 0)
+ return -EINVAL;
+
mutex_lock(&dev->mutex);
vhost_dev_lock_vqs(dev);
switch (msg->type) {
@@ -1130,6 +1133,7 @@ ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
struct vhost_iotlb_msg msg;
size_t offset;
int type, ret;
+ u16 asid = 0;

ret = copy_from_iter(&type, sizeof(type), from);
if (ret != sizeof(type)) {
@@ -1145,7 +1149,16 @@ ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
offset = offsetof(struct vhost_msg, iotlb) - sizeof(int);
break;
case VHOST_IOTLB_MSG_V2:
- offset = sizeof(__u32);
+ if (vhost_backend_has_feature(dev->vqs[0],
+ VHOST_BACKEND_F_IOTLB_ASID)) {
+ ret = copy_from_iter(&asid, sizeof(asid), from);
+ if (ret != sizeof(asid)) {
+ ret = -EINVAL;
+ goto done;
+ }
+ offset = sizeof(__u16);
+ } else
+ offset = sizeof(__u32);
break;
default:
ret = -EINVAL;
@@ -1160,9 +1173,9 @@ ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
}

if (dev->msg_handler)
- ret = dev->msg_handler(dev, &msg);
+ ret = dev->msg_handler(dev, asid, &msg);
else
- ret = vhost_process_iotlb_msg(dev, &msg);
+ ret = vhost_process_iotlb_msg(dev, asid, &msg);
if (ret) {
ret = -EFAULT;
goto done;
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 9032d3c2a9f4..05e7aaf6071b 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -162,7 +162,7 @@ struct vhost_dev {
int byte_weight;
u64 kcov_handle;
bool use_worker;
- int (*msg_handler)(struct vhost_dev *dev,
+ int (*msg_handler)(struct vhost_dev *dev, u32 asid,
struct vhost_iotlb_msg *msg);
};

@@ -170,7 +170,7 @@ bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len);
void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs,
int nvqs, int iov_limit, int weight, int byte_weight,
bool use_worker,
- int (*msg_handler)(struct vhost_dev *dev,
+ int (*msg_handler)(struct vhost_dev *dev, u32 asid,
struct vhost_iotlb_msg *msg));
long vhost_dev_set_owner(struct vhost_dev *dev);
bool vhost_dev_has_owner(struct vhost_dev *dev);
diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index 532571571b4b..2eb55fc9bf2e 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -87,7 +87,7 @@ struct vhost_msg {

struct vhost_msg_v2 {
__u32 type;
- __u32 reserved;
+ __u32 asid;
union {
struct vhost_iotlb_msg iotlb;
__u8 padding[64];
@@ -148,5 +148,8 @@ struct vhost_vdpa_config {
#define VHOST_BACKEND_F_IOTLB_MSG_V2 0x1
/* IOTLB can accept batching hints */
#define VHOST_BACKEND_F_IOTLB_BATCH 0x2
+/* IOTLB can accept address space identifier through V2 type of IOTLB
+ message */
+#define VHOST_BACKEND_F_IOTLB_ASID 0x3

#endif
--
2.20.1

2020-09-24 03:26:26

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 14/24] vhost-vdpa: introduce uAPI to get the number of virtqueue groups

Follows the vDPA support for multiple address spaces, this patch
introduce uAPI for the userspace to know the number of virtqueue
groups supported by the vDPA device.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 4 ++++
include/uapi/linux/vhost.h | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 1ba7e95619b5..4b8882f55bc9 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -528,6 +528,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_GET_VRING_NUM:
r = vhost_vdpa_get_vring_num(v, argp);
break;
+ case VHOST_VDPA_GET_GROUP_NUM:
+ r = copy_to_user(argp, &v->vdpa->ngroups,
+ sizeof(v->vdpa->ngroups));
+ break;
case VHOST_SET_LOG_BASE:
case VHOST_SET_LOG_FD:
r = -ENOIOCTLCMD;
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index c26452782882..19f1acdfe3ea 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -141,4 +141,8 @@

/* Set event fd for config interrupt*/
#define VHOST_VDPA_SET_CONFIG_CALL _IOW(VHOST_VIRTIO, 0x77, int)
+
+/* Get the number of virtqueue groups. */
+#define VHOST_VDPA_GET_GROUP_NUM _IOR(VHOST_VIRTIO, 0x78, unsigned int)
+
#endif
--
2.20.1

2020-09-24 03:26:40

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 15/24] vhost-vdpa: introduce uAPI to get the number of address spaces

A new uAPI is introduced for the userspace to know the address spaces
that is supported by a specific device.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 3 +++
include/uapi/linux/vhost.h | 2 ++
2 files changed, 5 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 4b8882f55bc9..4d97a59824a1 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -532,6 +532,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
r = copy_to_user(argp, &v->vdpa->ngroups,
sizeof(v->vdpa->ngroups));
break;
+ case VHOST_VDPA_GET_AS_NUM:
+ r = copy_to_user(argp, &v->vdpa->nas, sizeof(v->vdpa->nas));
+ break;
case VHOST_SET_LOG_BASE:
case VHOST_SET_LOG_FD:
r = -ENOIOCTLCMD;
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 19f1acdfe3ea..99bdf50efc50 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -145,4 +145,6 @@
/* Get the number of virtqueue groups. */
#define VHOST_VDPA_GET_GROUP_NUM _IOR(VHOST_VIRTIO, 0x78, unsigned int)

+/* Get the number of address spaces. */
+#define VHOST_VDPA_GET_AS_NUM _IOR(VHOST_VIRTIO, 0x79, unsigned int)
#endif
--
2.20.1

2020-09-24 03:27:17

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 09/24] vdpa: multiple address spaces support

This patches introduces the multiple address spaces support for vDPA
device. This idea is to identify a specific address space via an
dedicated identifier - ASID.

During vDPA device allocation, vDPA device driver needs to report the
number of address spaces supported by the device then the DMA mapping
ops of the vDPA device needs to be extended to support ASID.

This helps to isolate the DMA among the virtqueues. E.g in the case of
virtio-net, the control virtqueue will not be assigned directly to
guest.

This RFC patch only converts for the device that wants its own
IOMMU/DMA translation logic. So it will rejects the device with more
that 1 address space that depends on platform IOMMU. The plan to
moving all the DMA mapping logic to the vDPA device driver instead of
doing it in vhost-vDPA (otherwise it could result a very complicated
APIs and actually vhost-vDPA doesn't care about how the actual
composition/emulation were done in the device driver).

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
drivers/vdpa/vdpa.c | 4 +++-
drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
drivers/vhost/vdpa.c | 14 +++++++++-----
include/linux/vdpa.h | 23 ++++++++++++++++-------
6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index e6a0be374e51..86cdf5f8bcae 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)

adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
dev, &ifc_vdpa_ops,
- IFCVF_MAX_QUEUE_PAIRS * 2, 1);
+ IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);

if (adapter == NULL) {
IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 4e480f4f754e..db7404e121bf 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
return mvdev->generation;
}

-static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
+static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
+ struct vhost_iotlb *iotlb)
{
struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
@@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);

ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
- 2 * mlx5_vdpa_max_qps(max_vqs), 1);
+ 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
if (IS_ERR(ndev))
return ndev;

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 46399746ec7c..05195fa7865d 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
* @config: the bus operations that is supported by this device
* @nvqs: number of virtqueues supported by this device
* @ngroups: number of groups supported by this device
+ * @nas: number of address spaces supported by this device
* @size: size of the parent structure that contains private data
*
* Driver should use vdpa_alloc_device() wrapper macro instead of
@@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
struct vdpa_device *__vdpa_alloc_device(struct device *parent,
const struct vdpa_config_ops *config,
int nvqs, unsigned int ngroups,
- size_t size)
+ unsigned int nas, size_t size)
{
struct vdpa_device *vdev;
int err = -EINVAL;
@@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
vdev->features_valid = false;
vdev->nvqs = nvqs;
vdev->ngroups = ngroups;
+ vdev->nas = nas;

err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
if (err)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 6669c561bc6e..5dc04ec271bb 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
ops = &vdpasim_net_config_ops;

vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
- VDPASIM_VQ_NUM, 1);
+ VDPASIM_VQ_NUM, 1, 1);
if (!vdpasim)
goto err_alloc;

@@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
return vdpasim->generation;
}

-static int vdpasim_set_map(struct vdpa_device *vdpa,
+static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
struct vhost_iotlb *iotlb)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
@@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
return ret;
}

-static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
+static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
+ u64 iova, u64 size,
u64 pa, u32 perm)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
@@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
return ret;
}

-static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
+static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
+ u64 iova, u64 size)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index ec3c94f706c1..eeefcd971e3f 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
return r;

if (ops->dma_map) {
- r = ops->dma_map(vdpa, iova, size, pa, perm);
+ r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
} else if (ops->set_map) {
if (!v->in_batch)
- r = ops->set_map(vdpa, iotlb);
+ r = ops->set_map(vdpa, 0, iotlb);
} else {
r = iommu_map(v->domain, iova, pa, size,
perm_to_iommu_flags(perm));
@@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);

if (ops->dma_map) {
- ops->dma_unmap(vdpa, iova, size);
+ ops->dma_unmap(vdpa, 0, iova, size);
} else if (ops->set_map) {
if (!v->in_batch)
- ops->set_map(vdpa, iotlb);
+ ops->set_map(vdpa, 0, iotlb);
} else {
iommu_unmap(v->domain, iova, size);
}
@@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
break;
case VHOST_IOTLB_BATCH_END:
if (v->in_batch && ops->set_map)
- ops->set_map(vdpa, iotlb);
+ ops->set_map(vdpa, 0, iotlb);
v->in_batch = false;
break;
default:
@@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
int minor;
int r;

+ /* Only support 1 address space */
+ if (vdpa->ngroups != 1)
+ return -ENOTSUPP;
+
/* Currently, we only accept the network devices. */
if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
return -ENOTSUPP;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index d829512efd27..1e1163daa352 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -43,6 +43,8 @@ struct vdpa_vq_state {
* @index: device index
* @features_valid: were features initialized? for legacy guests
* @nvqs: the number of virtqueues
+ * @ngroups: the number of virtqueue groups
+ * @nas: the number of address spaces
*/
struct vdpa_device {
struct device dev;
@@ -52,6 +54,7 @@ struct vdpa_device {
bool features_valid;
int nvqs;
unsigned int ngroups;
+ unsigned int nas;
};

/**
@@ -161,6 +164,7 @@ struct vdpa_device {
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
* @vdev: vdpa device
+ * @asid: address space identifier
* @iotlb: vhost memory mapping to be
* used by the vDPA
* Returns integer: success (0) or error (< 0)
@@ -169,6 +173,7 @@ struct vdpa_device {
* specific DMA translation (on-chip IOMMU)
* and preferring incremental map.
* @vdev: vdpa device
+ * @asid: address space identifier
* @iova: iova to be mapped
* @size: size of the area
* @pa: physical address for the map
@@ -180,6 +185,7 @@ struct vdpa_device {
* specific DMA translation (on-chip IOMMU)
* and preferring incremental unmap.
* @vdev: vdpa device
+ * @asid: address space identifier
* @iova: iova to be unmapped
* @size: size of the area
* Returns integer: success (0) or error (< 0)
@@ -225,10 +231,12 @@ struct vdpa_config_ops {
u32 (*get_generation)(struct vdpa_device *vdev);

/* DMA ops */
- int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
- int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
- u64 pa, u32 perm);
- int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
+ int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
+ struct vhost_iotlb *iotlb);
+ int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
+ u64 iova, u64 size, u64 pa, u32 perm);
+ int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
+ u64 iova, u64 size);

/* Free device resources */
void (*free)(struct vdpa_device *vdev);
@@ -237,11 +245,12 @@ struct vdpa_config_ops {
struct vdpa_device *__vdpa_alloc_device(struct device *parent,
const struct vdpa_config_ops *config,
int nvqs, unsigned int ngroups,
- size_t size);
+ unsigned int nas, size_t size);

-#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
+#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
+ ngroups, nas) \
container_of(__vdpa_alloc_device( \
- parent, config, nvqs, ngroups, \
+ parent, config, nvqs, ngroups, nas, \
sizeof(dev_struct) + \
BUILD_BUG_ON_ZERO(offsetof( \
dev_struct, member))), \
--
2.20.1

2020-09-24 03:27:50

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group

This patch introduces a new bus operation to allow the vDPA bus driver
to associate an ASID to a virtqueue group.

Signed-off-by: Jason Wang <[email protected]>
---
include/linux/vdpa.h | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 1e1163daa352..e2394995a3cd 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -160,6 +160,12 @@ struct vdpa_device {
* @get_generation: Get device config generation (optional)
* @vdev: vdpa device
* Returns u32: device generation
+ * @set_group_asid: Set address space identifier for a
+ * virtqueue group
+ * @vdev: vdpa device
+ * @group: virtqueue group
+ * @asid: address space id for this group
+ * Returns integer: success (0) or error (< 0)
* @set_map: Set device memory mapping (optional)
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
@@ -237,6 +243,10 @@ struct vdpa_config_ops {
u64 iova, u64 size, u64 pa, u32 perm);
int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
u64 iova, u64 size);
+ int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
+ unsigned int asid);
+
+

/* Free device resources */
void (*free)(struct vdpa_device *vdev);
--
2.20.1

2020-09-24 03:27:59

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 20/24] vdpa_sim: advertise VIRTIO_NET_F_MTU

We've already reported maximum mtu via config space, so let's
advertise the feature.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index d1764a64578d..4b2d0d3fbc87 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -61,7 +61,8 @@ struct vdpasim_virtqueue {

static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
(1ULL << VIRTIO_F_VERSION_1) |
- (1ULL << VIRTIO_F_ACCESS_PLATFORM);
+ (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
+ (1ULL << VIRTIO_NET_F_MTU);

/* State of each vdpasim device */
struct vdpasim {
--
2.20.1

2020-09-24 03:28:12

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 21/24] vdpa_sim: advertise VIRTIO_NET_F_MAC

We advertise mac address via config space, so let's advertise
VIRTIO_NET_F_MAC.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 4b2d0d3fbc87..ca5c2d0db905 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -62,7 +62,8 @@ struct vdpasim_virtqueue {
static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
(1ULL << VIRTIO_F_VERSION_1) |
(1ULL << VIRTIO_F_ACCESS_PLATFORM) |
- (1ULL << VIRTIO_NET_F_MTU);
+ (1ULL << VIRTIO_NET_F_MTU) |
+ (1ULL << VIRTIO_NET_F_MAC);

/* State of each vdpasim device */
struct vdpasim {
--
2.20.1

2020-09-24 03:28:26

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 16/24] vhost-vdpa: uAPI to get virtqueue group id

Follows the support for virtqueue group in vDPA. This patches
introduces uAPI to get the virtqueue group ID for a specific virtqueue
in vhost-vdpa.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 8 ++++++++
include/uapi/linux/vhost.h | 4 ++++
2 files changed, 12 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 4d97a59824a1..a234d3783e16 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -433,6 +433,14 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
return -EFAULT;
ops->set_vq_ready(vdpa, idx, s.num);
return 0;
+ case VHOST_VDPA_GET_VRING_GROUP:
+ s.index = idx;
+ s.num = ops->get_vq_group(vdpa, idx);
+ if (s.num >= vdpa->ngroups)
+ return -EIO;
+ else if (copy_to_user(argp, &s, sizeof s))
+ return -EFAULT;
+ return 0;
case VHOST_GET_VRING_BASE:
r = ops->get_vq_state(v->vdpa, idx, &vq_state);
if (r)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 99bdf50efc50..d1c4b5561fee 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -147,4 +147,8 @@

/* Get the number of address spaces. */
#define VHOST_VDPA_GET_AS_NUM _IOR(VHOST_VIRTIO, 0x79, unsigned int)
+
+/* Get the group for a virtqueue: read index, write group in num */
+#define VHOST_VDPA_GET_VRING_GROUP _IOWR(VHOST_VIRTIO, 0x79, \
+ struct vhost_vring_state)
#endif
--
2.20.1

2020-09-24 03:28:26

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 13/24] vhost-vdpa: introduce ASID based IOTLB

This patch introduces the support of ASID based IOTLB by tagging IOTLB
with a unique ASID. This is a must for supporting ASID based vhost
IOTLB API by the following patches.

IOTLB were stored in a hlist and new IOTLB will be allocated when a
new ASID is seen via IOTLB API and destoryed when there's no mapping
associated with an ASID.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 94 +++++++++++++++++++++++++++++++++-----------
1 file changed, 72 insertions(+), 22 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 6552987544d7..1ba7e95619b5 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -34,13 +34,21 @@ enum {

#define VHOST_VDPA_DEV_MAX (1U << MINORBITS)

+#define VHOST_VDPA_IOTLB_BUCKETS 16
+
+struct vhost_vdpa_as {
+ struct hlist_node hash_link;
+ struct vhost_iotlb iotlb;
+ u32 id;
+};
+
struct vhost_vdpa {
struct vhost_dev vdev;
struct iommu_domain *domain;
struct vhost_virtqueue *vqs;
struct completion completion;
struct vdpa_device *vdpa;
- struct vhost_iotlb *iotlb;
+ struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
struct device dev;
struct cdev cdev;
atomic_t opened;
@@ -49,12 +57,64 @@ struct vhost_vdpa {
int minor;
struct eventfd_ctx *config_ctx;
int in_batch;
+ int used_as;
};

static DEFINE_IDA(vhost_vdpa_ida);

static dev_t vhost_vdpa_major;

+static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
+{
+ struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
+ struct vhost_vdpa_as *as;
+
+ hlist_for_each_entry(as, head, hash_link)
+ if (as->id == asid)
+ return as;
+
+ return NULL;
+}
+
+static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
+{
+ struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
+ struct vhost_vdpa_as *as;
+
+ if (asid_to_as(v, asid))
+ return NULL;
+
+ as = kmalloc(sizeof(*as), GFP_KERNEL);
+ if (!as)
+ return NULL;
+
+ vhost_iotlb_init(&as->iotlb, 0, 0);
+ as->id = asid;
+ hlist_add_head(&as->hash_link, head);
+ ++v->used_as;
+
+ return as;
+}
+
+static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
+{
+ struct vhost_vdpa_as *as = asid_to_as(v, asid);
+
+ /* Remove default address space is not allowed */
+ if (asid == 0)
+ return -EINVAL;
+
+ if (!as)
+ return -EINVAL;
+
+ hlist_del(&as->hash_link);
+ vhost_iotlb_reset(&as->iotlb);
+ kfree(as);
+ --v->used_as;
+
+ return 0;
+}
+
static void handle_vq_kick(struct vhost_work *work)
{
struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
@@ -513,15 +573,6 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
}
}

-static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
-{
- struct vhost_iotlb *iotlb = v->iotlb;
-
- vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
- kfree(v->iotlb);
- v->iotlb = NULL;
-}
-
static int perm_to_iommu_flags(u32 perm)
{
int flags = 0;
@@ -681,7 +732,8 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
- struct vhost_iotlb *iotlb = v->iotlb;
+ struct vhost_vdpa_as *as = asid_to_as(v, 0);
+ struct vhost_iotlb *iotlb = &as->iotlb;
int r = 0;

if (asid != 0)
@@ -775,6 +827,7 @@ static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
{
vhost_dev_cleanup(&v->vdev);
kfree(v->vdev.vqs);
+ vhost_vdpa_remove_as(v, 0);
}

static int vhost_vdpa_open(struct inode *inode, struct file *filep)
@@ -807,23 +860,18 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
vhost_vdpa_process_iotlb_msg);

- dev->iotlb = vhost_iotlb_alloc(0, 0);
- if (!dev->iotlb) {
- r = -ENOMEM;
- goto err_init_iotlb;
- }
+ if (!vhost_vdpa_alloc_as(v, 0))
+ goto err_alloc_as;

r = vhost_vdpa_alloc_domain(v);
if (r)
- goto err_alloc_domain;
+ goto err_alloc_as;

filep->private_data = v;

return 0;

-err_alloc_domain:
- vhost_vdpa_iotlb_free(v);
-err_init_iotlb:
+err_alloc_as:
vhost_vdpa_cleanup(v);
err:
atomic_dec(&v->opened);
@@ -851,7 +899,6 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
filep->private_data = NULL;
vhost_vdpa_reset(v);
vhost_dev_stop(&v->vdev);
- vhost_vdpa_iotlb_free(v);
vhost_vdpa_free_domain(v);
vhost_vdpa_config_put(v);
vhost_vdpa_clean_irq(v);
@@ -950,7 +997,7 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
const struct vdpa_config_ops *ops = vdpa->config;
struct vhost_vdpa *v;
int minor;
- int r;
+ int i, r;

/* Only support 1 address space */
if (vdpa->ngroups != 1)
@@ -1002,6 +1049,9 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
init_completion(&v->completion);
vdpa_set_drvdata(vdpa, v);

+ for (i = 0; i < VHOST_VDPA_IOTLB_BUCKETS; i++)
+ INIT_HLIST_HEAD(&v->as[i]);
+
return 0;

err:
--
2.20.1

2020-09-24 03:28:43

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 24/24] vdpasim: control virtqueue support

This patch introduces the control virtqueue support for vDPA
simulator. This is a requirement for supporting advanced features like
multiqueue.

A requirement for control virtqueue is to isolate its memory access
from the rx/tx virtqueues. This is because when using vDPA device
for VM, the control virqueue may not be assigned to VM
directly but shadowed by userspace VMM (Qemu).

The isolation is done via the virtqueue groups and ASID support in
vDPA through vhost-vdpa. The simulator is extended to have:

1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue)
2) two virtqueue groups: group 0 contains RX/TX, group 1 contains CVQ
3) two address spaces and the simulator simply implements the address
spaces by mapping it 1:1 to IOTLB.

For the VM use cases, userspace(Qemu) may set AS 0 to group 0 and AS 1
to group 1. So we have:

1) The IOTLB for virtqueue group 0 contains the mappings of guest, so
RX and TX can be assigned to guest directly.
2) The IOTLB for virtqueue group 1 contains the mappings of CVQ which
is the buffers that allocated and managed by userspace only so that
guest can not access the CVQ of vhost-vdpa.

For the other use cases, since AS 0 is associated to all virtqueue
groups by default. All virtqueues share the same mapping by default.

To demonstrate the function, VIRITO_NET_F_CTRL_MACADDR is
implemented in the simulator for the driver to set mac address.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 189 +++++++++++++++++++++++++++----
1 file changed, 166 insertions(+), 23 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 66d901fb4c57..3459539c4460 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -56,14 +56,18 @@ struct vdpasim_virtqueue {
#define VDPASIM_QUEUE_MAX 256
#define VDPASIM_DEVICE_ID 0x1
#define VDPASIM_VENDOR_ID 0
-#define VDPASIM_VQ_NUM 0x2
+#define VDPASIM_VQ_NUM 0x3
+#define VDPASIM_AS_NUM 0x2
+#define VDPASIM_GROUP_NUM 0x2
#define VDPASIM_NAME "vdpasim-netdev"

static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
(1ULL << VIRTIO_F_VERSION_1) |
(1ULL << VIRTIO_F_ACCESS_PLATFORM) |
(1ULL << VIRTIO_NET_F_MTU) |
- (1ULL << VIRTIO_NET_F_MAC);
+ (1ULL << VIRTIO_NET_F_MAC) |
+ (1ULL << VIRTIO_NET_F_CTRL_VQ) |
+ (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR);

/* State of each vdpasim device */
struct vdpasim {
@@ -143,11 +147,17 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
{
int i;

- for (i = 0; i < VDPASIM_VQ_NUM; i++)
+ spin_lock(&vdpasim->iommu_lock);
+
+ for (i = 0; i < VDPASIM_VQ_NUM; i++) {
vdpasim_vq_reset(&vdpasim->vqs[i]);
+ vringh_set_iotlb(&vdpasim->vqs[i].vring,
+ &vdpasim->iommu[0]);
+ }

- spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_reset(vdpasim->iommu);
+ for (i = 0; i < VDPASIM_AS_NUM; i++) {
+ vhost_iotlb_reset(&vdpasim->iommu[i]);
+ }
spin_unlock(&vdpasim->iommu_lock);

vdpasim->features = 0;
@@ -187,6 +197,80 @@ static bool receive_filter(struct vdpasim *vdpasim, size_t len)
return false;
}

+virtio_net_ctrl_ack vdpasim_handle_ctrl_mac(struct vdpasim *vdpasim,
+ u8 cmd)
+{
+ struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
+ virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+ size_t read;
+
+ switch (cmd) {
+ case VIRTIO_NET_CTRL_MAC_ADDR_SET:
+ read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->riov,
+ (void *)vdpasim->config.mac,
+ ETH_ALEN);
+ if (read == ETH_ALEN)
+ status = VIRTIO_NET_OK;
+ break;
+ default:
+ break;
+ }
+
+ return status;
+}
+
+static void vdpasim_handle_cvq(struct vdpasim *vdpasim)
+{
+ struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
+ virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+ struct virtio_net_ctrl_hdr ctrl;
+ size_t read, write;
+ int err;
+
+ if (!(vdpasim->features & (1ULL << VIRTIO_NET_F_CTRL_VQ)))
+ return;
+
+ if (!cvq->ready)
+ return;
+
+ while (true) {
+ err = vringh_getdesc_iotlb(&cvq->vring, &cvq->riov, &cvq->wiov,
+ &cvq->head, GFP_ATOMIC);
+ if (err <= 0)
+ break;
+
+ read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->riov, &ctrl,
+ sizeof(ctrl));
+ if (read != sizeof(ctrl))
+ break;
+
+ switch (ctrl.class) {
+ case VIRTIO_NET_CTRL_MAC:
+ status = vdpasim_handle_ctrl_mac(vdpasim, ctrl.cmd);
+ break;
+ default:
+ break;
+ }
+
+ /* Make sure data is wrote before advancing index */
+ smp_wmb();
+
+ write = vringh_iov_push_iotlb(&cvq->vring, &cvq->wiov,
+ &status, sizeof (status));
+ vringh_complete_iotlb(&cvq->vring, cvq->head, write);
+ vringh_kiov_cleanup(&cvq->riov);
+ vringh_kiov_cleanup(&cvq->wiov);
+
+ /* Make sure used is visible before rasing the interrupt. */
+ smp_wmb();
+
+ local_bh_disable();
+ if (cvq->cb)
+ cvq->cb(cvq->private);
+ local_bh_enable();
+ }
+}
+
static void vdpasim_work(struct work_struct *work)
{
struct vdpasim *vdpasim = container_of(work, struct
@@ -272,7 +356,7 @@ static dma_addr_t vdpasim_map_page(struct device *dev, struct page *page,
unsigned long attrs)
{
struct vdpasim *vdpasim = dev_to_sim(dev);
- struct vhost_iotlb *iommu = vdpasim->iommu;
+ struct vhost_iotlb *iommu = &vdpasim->iommu[0];
u64 pa = (page_to_pfn(page) << PAGE_SHIFT) + offset;
int ret, perm = dir_to_perm(dir);

@@ -297,7 +381,7 @@ static void vdpasim_unmap_page(struct device *dev, dma_addr_t dma_addr,
unsigned long attrs)
{
struct vdpasim *vdpasim = dev_to_sim(dev);
- struct vhost_iotlb *iommu = vdpasim->iommu;
+ struct vhost_iotlb *iommu = &vdpasim->iommu[0];

spin_lock(&vdpasim->iommu_lock);
vhost_iotlb_del_range(iommu, (u64)dma_addr,
@@ -310,7 +394,7 @@ static void *vdpasim_alloc_coherent(struct device *dev, size_t size,
unsigned long attrs)
{
struct vdpasim *vdpasim = dev_to_sim(dev);
- struct vhost_iotlb *iommu = vdpasim->iommu;
+ struct vhost_iotlb *iommu = &vdpasim->iommu[0];
void *addr = kmalloc(size, flag);
int ret;

@@ -340,7 +424,7 @@ static void vdpasim_free_coherent(struct device *dev, size_t size,
unsigned long attrs)
{
struct vdpasim *vdpasim = dev_to_sim(dev);
- struct vhost_iotlb *iommu = vdpasim->iommu;
+ struct vhost_iotlb *iommu = &vdpasim->iommu[0];

spin_lock(&vdpasim->iommu_lock);
vhost_iotlb_del_range(iommu, (u64)dma_addr,
@@ -366,14 +450,17 @@ static struct vdpasim *vdpasim_create(void)
struct vdpasim *vdpasim;
struct device *dev;
int ret = -ENOMEM;
+ int i;

if (batch_mapping)
ops = &vdpasim_net_batch_config_ops;
else
ops = &vdpasim_net_config_ops;

+ /* 3 virtqueues, 2 address spaces, 2 virtqueue groups */
vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
- VDPASIM_VQ_NUM, 1, 1);
+ VDPASIM_VQ_NUM, VDPASIM_AS_NUM,
+ VDPASIM_GROUP_NUM);
if (!vdpasim)
goto err_alloc;

@@ -385,18 +472,23 @@ static struct vdpasim *vdpasim_create(void)
dev->coherent_dma_mask = DMA_BIT_MASK(64);
set_dma_ops(dev, &vdpasim_dma_ops);

- vdpasim->iommu = vhost_iotlb_alloc(2048, 0);
+ vdpasim->iommu = kmalloc_array(VDPASIM_AS_NUM,
+ sizeof(*vdpasim->iommu), GFP_KERNEL);
if (!vdpasim->iommu)
goto err_iommu;

+ for (i = 0; i < VDPASIM_AS_NUM; i++)
+ vhost_iotlb_init(&vdpasim->iommu[i], 0, 0);
+
vdpasim->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
if (!vdpasim->buffer)
goto err_iommu;

eth_random_addr(vdpasim->config.mac);

- vringh_set_iotlb(&vdpasim->vqs[0].vring, vdpasim->iommu);
- vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu);
+ /* Make sure that default ASID is zero */
+ for (i = 0; i < VDPASIM_VQ_NUM; i++)
+ vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0]);

vdpasim->vdpa.dma_dev = dev;
ret = vdpa_register_device(&vdpasim->vdpa);
@@ -438,7 +530,14 @@ static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 idx)
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];

- if (vq->ready)
+ if (idx == 2) {
+ /* Kernel virtio driver will do busy waiting for the
+ * result, so we can't handle cvq in the workqueue.
+ */
+ spin_lock(&vdpasim->lock);
+ vdpasim_handle_cvq(vdpasim);
+ spin_unlock(&vdpasim->lock);
+ } else if (vq->ready)
schedule_work(&vdpasim->work);
}

@@ -504,7 +603,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)

static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
{
- return 0;
+ /* RX and TX belongs to group 0, CVQ belongs to group 1 */
+ if (idx == 2)
+ return 1;
+ else
+ return 0;
}

static u64 vdpasim_get_features(struct vdpa_device *vdpa)
@@ -600,20 +703,53 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
return vdpasim->generation;
}

+int vdpasim_set_group_asid(struct vdpa_device *vdpa, unsigned int group,
+ unsigned int asid)
+{
+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+ struct vhost_iotlb *iommu;
+ int i;
+
+ if (group > VDPASIM_GROUP_NUM)
+ return -EINVAL;
+
+ if (asid > VDPASIM_AS_NUM)
+ return -EINVAL;
+
+ iommu = &vdpasim->iommu[asid];
+
+ spin_lock(&vdpasim->lock);
+
+ for (i = 0; i < VDPASIM_VQ_NUM; i++)
+ if (vdpasim_get_vq_group(vdpa, i) == group)
+ vringh_set_iotlb(&vdpasim->vqs[i].vring, iommu);
+
+ spin_unlock(&vdpasim->lock);
+
+ return 0;
+}
+
+
static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
struct vhost_iotlb *iotlb)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
struct vhost_iotlb_map *map;
+ struct vhost_iotlb *iommu;
u64 start = 0ULL, last = 0ULL - 1;
int ret;

+ if (asid >= VDPASIM_AS_NUM)
+ return -EINVAL;
+
spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_reset(vdpasim->iommu);
+
+ iommu = &vdpasim->iommu[asid];
+ vhost_iotlb_reset(iommu);

for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
map = vhost_iotlb_itree_next(map, start, last)) {
- ret = vhost_iotlb_add_range(vdpasim->iommu, map->start,
+ ret = vhost_iotlb_add_range(iommu, map->start,
map->last, map->addr, map->perm);
if (ret)
goto err;
@@ -622,7 +758,7 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
return 0;

err:
- vhost_iotlb_reset(vdpasim->iommu);
+ vhost_iotlb_reset(iommu);
spin_unlock(&vdpasim->iommu_lock);
return ret;
}
@@ -634,9 +770,12 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
int ret;

+ if (asid >= VDPASIM_AS_NUM)
+ return -EINVAL;
+
spin_lock(&vdpasim->iommu_lock);
- ret = vhost_iotlb_add_range(vdpasim->iommu, iova, iova + size - 1, pa,
- perm);
+ ret = vhost_iotlb_add_range(&vdpasim->iommu[asid], iova,
+ iova + size - 1, pa, perm);
spin_unlock(&vdpasim->iommu_lock);

return ret;
@@ -647,8 +786,11 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);

+ if (asid >= VDPASIM_AS_NUM)
+ return -EINVAL;
+
spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_del_range(vdpasim->iommu, iova, iova + size - 1);
+ vhost_iotlb_del_range(&vdpasim->iommu[asid], iova, iova + size - 1);
spin_unlock(&vdpasim->iommu_lock);

return 0;
@@ -660,8 +802,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)

cancel_work_sync(&vdpasim->work);
kfree(vdpasim->buffer);
- if (vdpasim->iommu)
- vhost_iotlb_free(vdpasim->iommu);
+ vhost_iotlb_free(vdpasim->iommu);
}

static const struct vdpa_config_ops vdpasim_net_config_ops = {
@@ -686,6 +827,7 @@ static const struct vdpa_config_ops vdpasim_net_config_ops = {
.get_config = vdpasim_get_config,
.set_config = vdpasim_set_config,
.get_generation = vdpasim_get_generation,
+ .set_group_asid = vdpasim_set_group_asid,
.dma_map = vdpasim_dma_map,
.dma_unmap = vdpasim_dma_unmap,
.free = vdpasim_free,
@@ -713,6 +855,7 @@ static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
.get_config = vdpasim_get_config,
.set_config = vdpasim_set_config,
.get_generation = vdpasim_get_generation,
+ .set_group_asid = vdpasim_set_group_asid,
.set_map = vdpasim_set_map,
.free = vdpasim_free,
};
--
2.20.1

2020-09-24 03:28:49

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 18/24] vhost-vdpa: support ASID based IOTLB API

This patch extends the vhost-vdpa to support ASID based IOTLB API. The
vhost-vdpa device will allocated multple IOTLBs for vDPA device that
supports multiple address spaces. The IOTLBs and vDPA device memory
mappings is determined and maintained through ASID.

Note that we still don't support vDPA device with more than one
address spaces that depends on platform IOMMU. This work will be done
by moving the IOMMU logic from vhost-vDPA to vDPA device driver.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 106 ++++++++++++++++++++++++++++++++-----------
1 file changed, 79 insertions(+), 27 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 978cf97dc03a..99ac13b2ed11 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -29,7 +29,8 @@
enum {
VHOST_VDPA_BACKEND_FEATURES =
(1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2) |
- (1ULL << VHOST_BACKEND_F_IOTLB_BATCH),
+ (1ULL << VHOST_BACKEND_F_IOTLB_BATCH) |
+ (1ULL << VHOST_BACKEND_F_IOTLB_ASID),
};

#define VHOST_VDPA_DEV_MAX (1U << MINORBITS)
@@ -58,12 +59,20 @@ struct vhost_vdpa {
struct eventfd_ctx *config_ctx;
int in_batch;
int used_as;
+ u32 batch_asid;
};

static DEFINE_IDA(vhost_vdpa_ida);

static dev_t vhost_vdpa_major;

+static inline u32 iotlb_to_asid(struct vhost_iotlb *iotlb)
+{
+ struct vhost_vdpa_as *as = container_of(iotlb, struct
+ vhost_vdpa_as, iotlb);
+ return as->id;
+}
+
static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
{
struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
@@ -76,6 +85,16 @@ static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
return NULL;
}

+static struct vhost_iotlb *asid_to_iotlb(struct vhost_vdpa *v, u32 asid)
+{
+ struct vhost_vdpa_as *as = asid_to_as(v, asid);
+
+ if (!as)
+ return NULL;
+
+ return &as->iotlb;
+}
+
static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
{
struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
@@ -84,6 +103,9 @@ static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
if (asid_to_as(v, asid))
return NULL;

+ if (asid >= v->vdpa->nas)
+ return NULL;
+
as = kmalloc(sizeof(*as), GFP_KERNEL);
if (!as)
return NULL;
@@ -96,13 +118,20 @@ static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
return as;
}

-static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
+static struct vhost_vdpa_as *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
+ u32 asid)
{
struct vhost_vdpa_as *as = asid_to_as(v, asid);

- /* Remove default address space is not allowed */
- if (asid == 0)
- return -EINVAL;
+ if (as)
+ return as;
+
+ return vhost_vdpa_alloc_as(v, asid);
+}
+
+static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
+{
+ struct vhost_vdpa_as *as = asid_to_as(v, asid);

if (!as)
return -EINVAL;
@@ -623,6 +652,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
{
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
+ u32 asid = iotlb_to_asid(iotlb);
int r = 0;

r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
@@ -631,10 +661,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
return r;

if (ops->dma_map) {
- r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
+ r = ops->dma_map(vdpa, asid, iova, size, pa, perm);
} else if (ops->set_map) {
if (!v->in_batch)
- r = ops->set_map(vdpa, 0, iotlb);
+ r = ops->set_map(vdpa, asid, iotlb);
} else {
r = iommu_map(v->domain, iova, pa, size,
perm_to_iommu_flags(perm));
@@ -643,23 +673,32 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
return r;
}

-static void vhost_vdpa_unmap(struct vhost_vdpa *v,
- struct vhost_iotlb *iotlb,
- u64 iova, u64 size)
+static int vhost_vdpa_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 iova, u64 size)
{
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
+ u32 asid = iotlb_to_asid(iotlb);
+
+ if (!iotlb)
+ return -EINVAL;

vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);

if (ops->dma_map) {
- ops->dma_unmap(vdpa, 0, iova, size);
+ ops->dma_unmap(vdpa, asid, iova, size);
} else if (ops->set_map) {
if (!v->in_batch)
- ops->set_map(vdpa, 0, iotlb);
+ ops->set_map(vdpa, asid, iotlb);
} else {
iommu_unmap(v->domain, iova, size);
}
+
+ if (!iotlb->nmaps)
+ vhost_vdpa_remove_as(v, asid);
+
+ return 0;
}

static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
@@ -755,30 +794,38 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
- struct vhost_vdpa_as *as = asid_to_as(v, 0);
- struct vhost_iotlb *iotlb = &as->iotlb;
+ struct vhost_iotlb *iotlb = asid_to_iotlb(v, asid);
+ struct vhost_vdpa_as *as;
int r = 0;

- if (asid != 0)
- return -EINVAL;
-
r = vhost_dev_check_owner(dev);
if (r)
return r;

+ if ((msg->type == VHOST_IOTLB_UPDATE) && !iotlb) {
+ as = vhost_vdpa_find_alloc_as(v, asid);
+ if (!as)
+ return -EINVAL;
+ iotlb = &as->iotlb;
+ }
+
+ if (v->in_batch && v->batch_asid != asid)
+ return -EINVAL;
+
switch (msg->type) {
case VHOST_IOTLB_UPDATE:
r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
break;
case VHOST_IOTLB_INVALIDATE:
- vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
+ r = vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
break;
case VHOST_IOTLB_BATCH_BEGIN:
+ v->batch_asid = asid;
v->in_batch = true;
break;
case VHOST_IOTLB_BATCH_END:
if (v->in_batch && ops->set_map)
- ops->set_map(vdpa, 0, iotlb);
+ ops->set_map(vdpa, asid, iotlb);
v->in_batch = false;
break;
default:
@@ -848,9 +895,17 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)

static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
{
+ struct vhost_vdpa_as *as;
+ u32 asid;
+
vhost_dev_cleanup(&v->vdev);
kfree(v->vdev.vqs);
- vhost_vdpa_remove_as(v, 0);
+
+ for (asid = 0; asid < v->vdpa->nas; asid++) {
+ as = asid_to_as(v, asid);
+ if (as)
+ vhost_vdpa_remove_as(v, asid);
+ }
}

static int vhost_vdpa_open(struct inode *inode, struct file *filep)
@@ -883,18 +938,15 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
vhost_vdpa_process_iotlb_msg);

- if (!vhost_vdpa_alloc_as(v, 0))
- goto err_alloc_as;
-
r = vhost_vdpa_alloc_domain(v);
if (r)
- goto err_alloc_as;
+ goto err_alloc_domain;

filep->private_data = v;

return 0;

-err_alloc_as:
+err_alloc_domain:
vhost_vdpa_cleanup(v);
err:
atomic_dec(&v->opened);
@@ -1022,8 +1074,8 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
int minor;
int i, r;

- /* Only support 1 address space */
- if (vdpa->ngroups != 1)
+ /* We can't support platform IOMMU device with more than 1 group */
+ if (!ops->set_map && !ops->dma_map && vdpa->ngroups > 1)
return -ENOTSUPP;

/* Currently, we only accept the network devices. */
--
2.20.1

2020-09-24 03:29:04

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 19/24] vdpa_sim: use separated iov for reading and writing

In order to support control virtqueue whose commands have both in and
out descriptors, we need to use separated iov for reading and writing
in vdpa_sim.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 5dc04ec271bb..d1764a64578d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -40,7 +40,8 @@ MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 -Enable; 0 - Disable");

struct vdpasim_virtqueue {
struct vringh vring;
- struct vringh_kiov iov;
+ struct vringh_kiov riov;
+ struct vringh_kiov wiov;
unsigned short head;
bool ready;
u64 desc_addr;
@@ -173,12 +174,12 @@ static void vdpasim_work(struct work_struct *work)

while (true) {
total_write = 0;
- err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
+ err = vringh_getdesc_iotlb(&txq->vring, &txq->riov, NULL,
&txq->head, GFP_ATOMIC);
if (err <= 0)
break;

- err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
+ err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->wiov,
&rxq->head, GFP_ATOMIC);
if (err <= 0) {
vringh_complete_iotlb(&txq->vring, txq->head, 0);
@@ -186,13 +187,13 @@ static void vdpasim_work(struct work_struct *work)
}

while (true) {
- read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
+ read = vringh_iov_pull_iotlb(&txq->vring, &txq->riov,
vdpasim->buffer,
PAGE_SIZE);
if (read <= 0)
break;

- write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
+ write = vringh_iov_push_iotlb(&rxq->vring, &rxq->wiov,
vdpasim->buffer, read);
if (write <= 0)
break;
--
2.20.1

2020-09-24 03:29:11

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 17/24] vhost-vdpa: introduce uAPI to set group ASID

Follows the vDPA support for associating ASID to a specific virtqueue
group. This patch adds a uAPI to support setting them from userspace.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 8 ++++++++
include/uapi/linux/vhost.h | 4 ++++
2 files changed, 12 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index a234d3783e16..978cf97dc03a 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -441,6 +441,14 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
else if (copy_to_user(argp, &s, sizeof s))
return -EFAULT;
return 0;
+ case VHOST_VDPA_SET_GROUP_ASID:
+ if (copy_from_user(&s, argp, sizeof(s)))
+ return -EFAULT;
+ if (s.num >= vdpa->ngroups)
+ return -EINVAL;
+ if (!ops->set_group_asid)
+ return -ENOTSUPP;
+ return ops->set_group_asid(vdpa, idx, s.num);
case VHOST_GET_VRING_BASE:
r = ops->get_vq_state(v->vdpa, idx, &vq_state);
if (r)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index d1c4b5561fee..f3de9e45c518 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -151,4 +151,8 @@
/* Get the group for a virtqueue: read index, write group in num */
#define VHOST_VDPA_GET_VRING_GROUP _IOWR(VHOST_VIRTIO, 0x79, \
struct vhost_vring_state)
+/* Set the ASID for a virtqueue group. */
+#define VHOST_VDPA_SET_GROUP_ASID _IOW(VHOST_VIRTIO, 0x7A, \
+ struct vhost_vring_state)
+
#endif
--
2.20.1

2020-09-24 03:29:45

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 22/24] vdpa_sim: factor out buffer completion logic

This patch factors out the buffer completion logic in order to support
future features.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 33 +++++++++++++++++---------------
1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index ca5c2d0db905..b21670e054ba 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -155,6 +155,22 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
++vdpasim->generation;
}

+static void vdpasim_complete(struct vdpasim_virtqueue *vq, size_t len)
+{
+ /* Make sure data is wrote before advancing index */
+ smp_wmb();
+
+ vringh_complete_iotlb(&vq->vring, vq->head, len);
+
+ /* Make sure used is visible before rasing the interrupt. */
+ smp_wmb();
+
+ local_bh_disable();
+ if (vq->cb)
+ vq->cb(vq->private);
+ local_bh_enable();
+}
+
static void vdpasim_work(struct work_struct *work)
{
struct vdpasim *vdpasim = container_of(work, struct
@@ -203,21 +219,8 @@ static void vdpasim_work(struct work_struct *work)
total_write += write;
}

- /* Make sure data is wrote before advancing index */
- smp_wmb();
-
- vringh_complete_iotlb(&txq->vring, txq->head, 0);
- vringh_complete_iotlb(&rxq->vring, rxq->head, total_write);
-
- /* Make sure used is visible before rasing the interrupt. */
- smp_wmb();
-
- local_bh_disable();
- if (txq->cb)
- txq->cb(txq->private);
- if (rxq->cb)
- rxq->cb(rxq->private);
- local_bh_enable();
+ vdpasim_complete(txq, 0);
+ vdpasim_complete(rxq, total_write);

if (++pkts > 4) {
schedule_work(&vdpasim->work);
--
2.20.1

2020-09-24 03:30:50

by Jason Wang

[permalink] [raw]
Subject: [RFC PATCH 23/24] vdpa_sim: filter destination mac address

Add a simple unicast filter to filter out the dest MAC doesn't match
to the one stored in the config.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 49 ++++++++++++++++++++------------
1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index b21670e054ba..66d901fb4c57 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -171,6 +171,22 @@ static void vdpasim_complete(struct vdpasim_virtqueue *vq, size_t len)
local_bh_enable();
}

+static bool receive_filter(struct vdpasim *vdpasim, size_t len)
+{
+ bool modern = vdpasim->features & (1ULL << VIRTIO_F_VERSION_1);
+ size_t hdr_len = modern ? sizeof(struct virtio_net_hdr_v1) :
+ sizeof(struct virtio_net_hdr);
+
+ if (len < ETH_ALEN + hdr_len)
+ return false;
+
+ if (!strncmp(vdpasim->buffer + hdr_len,
+ vdpasim->config.mac, ETH_ALEN))
+ return true;
+
+ return false;
+}
+
static void vdpasim_work(struct work_struct *work)
{
struct vdpasim *vdpasim = container_of(work, struct
@@ -178,7 +194,6 @@ static void vdpasim_work(struct work_struct *work)
struct vdpasim_virtqueue *txq = &vdpasim->vqs[1];
struct vdpasim_virtqueue *rxq = &vdpasim->vqs[0];
ssize_t read, write;
- size_t total_write;
int pkts = 0;
int err;

@@ -191,36 +206,34 @@ static void vdpasim_work(struct work_struct *work)
goto out;

while (true) {
- total_write = 0;
err = vringh_getdesc_iotlb(&txq->vring, &txq->riov, NULL,
&txq->head, GFP_ATOMIC);
if (err <= 0)
break;

+ read = vringh_iov_pull_iotlb(&txq->vring, &txq->riov,
+ vdpasim->buffer,
+ PAGE_SIZE);
+
+ if (!receive_filter(vdpasim, read)) {
+ vdpasim_complete(txq, 0);
+ continue;
+ }
+
err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->wiov,
&rxq->head, GFP_ATOMIC);
if (err <= 0) {
- vringh_complete_iotlb(&txq->vring, txq->head, 0);
+ vdpasim_complete(txq, 0);
break;
}

- while (true) {
- read = vringh_iov_pull_iotlb(&txq->vring, &txq->riov,
- vdpasim->buffer,
- PAGE_SIZE);
- if (read <= 0)
- break;
-
- write = vringh_iov_push_iotlb(&rxq->vring, &rxq->wiov,
- vdpasim->buffer, read);
- if (write <= 0)
- break;
-
- total_write += write;
- }
+ write = vringh_iov_push_iotlb(&rxq->vring, &rxq->wiov,
+ vdpasim->buffer, read);
+ if (write <= 0)
+ break;

vdpasim_complete(txq, 0);
- vdpasim_complete(rxq, total_write);
+ vdpasim_complete(rxq, write);

if (++pkts > 4) {
schedule_work(&vdpasim->work);
--
2.20.1

2020-09-24 07:17:59

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 01/24] vhost-vdpa: fix backend feature ioctls

On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> introduces two malfunction backend features ioctls:
>
> 1) the ioctls was blindly added to vring ioctl instead of vdpa device
> ioctl
> 2) vhost_set_backend_features() was called when dev mutex has already
> been held which will lead a deadlock
>

I assume this patch requires some patch in qemu as well. Do you have
such patch?

> This patch fixes the above issues.
>
> Cc: Eli Cohen <[email protected]>
> Reported-by: Zhu Lingshan <[email protected]>
> Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vhost/vdpa.c | 30 ++++++++++++++++--------------
> 1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 3fab94f88894..796fe979f997 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -353,8 +353,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> struct vdpa_callback cb;
> struct vhost_virtqueue *vq;
> struct vhost_vring_state s;
> - u64 __user *featurep = argp;
> - u64 features;
> u32 idx;
> long r;
>
> @@ -381,18 +379,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
>
> vq->last_avail_idx = vq_state.avail_index;
> break;
> - case VHOST_GET_BACKEND_FEATURES:
> - features = VHOST_VDPA_BACKEND_FEATURES;
> - if (copy_to_user(featurep, &features, sizeof(features)))
> - return -EFAULT;
> - return 0;
> - case VHOST_SET_BACKEND_FEATURES:
> - if (copy_from_user(&features, featurep, sizeof(features)))
> - return -EFAULT;
> - if (features & ~VHOST_VDPA_BACKEND_FEATURES)
> - return -EOPNOTSUPP;
> - vhost_set_backend_features(&v->vdev, features);
> - return 0;
> }
>
> r = vhost_vring_ioctl(&v->vdev, cmd, argp);
> @@ -440,8 +426,20 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> struct vhost_vdpa *v = filep->private_data;
> struct vhost_dev *d = &v->vdev;
> void __user *argp = (void __user *)arg;
> + u64 __user *featurep = argp;
> + u64 features;
> long r;
>
> + if (cmd == VHOST_SET_BACKEND_FEATURES) {
> + r = copy_from_user(&features, featurep, sizeof(features));
> + if (r)
> + return r;
> + if (features & ~VHOST_VDPA_BACKEND_FEATURES)
> + return -EOPNOTSUPP;
> + vhost_set_backend_features(&v->vdev, features);
> + return 0;
> + }
> +
> mutex_lock(&d->mutex);
>
> switch (cmd) {
> @@ -476,6 +474,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> case VHOST_VDPA_SET_CONFIG_CALL:
> r = vhost_vdpa_set_config_call(v, argp);
> break;
> + case VHOST_GET_BACKEND_FEATURES:
> + features = VHOST_VDPA_BACKEND_FEATURES;
> + r = copy_to_user(featurep, &features, sizeof(features));
> + break;
> default:
> r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> if (r == -ENOIOCTLCMD)
> --
> 2.20.1
>

2020-09-24 07:28:11

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 01/24] vhost-vdpa: fix backend feature ioctls


On 2020/9/24 下午3:16, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
>> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
>> introduces two malfunction backend features ioctls:
>>
>> 1) the ioctls was blindly added to vring ioctl instead of vdpa device
>> ioctl
>> 2) vhost_set_backend_features() was called when dev mutex has already
>> been held which will lead a deadlock
>>
> I assume this patch requires some patch in qemu as well. Do you have
> such patch?
>

It's this series: [PATCH 0/3] Vhost-vDPA: batch IOTLB updating.

You were copied.

Thanks

2020-09-24 07:40:22

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 01/24] vhost-vdpa: fix backend feature ioctls

On Thu, Sep 24, 2020 at 03:26:08PM +0800, Jason Wang wrote:
>
> On 2020/9/24 下午3:16, Eli Cohen wrote:
> > On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
> > > Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> > > introduces two malfunction backend features ioctls:
> > >
> > > 1) the ioctls was blindly added to vring ioctl instead of vdpa device
> > > ioctl
> > > 2) vhost_set_backend_features() was called when dev mutex has already
> > > been held which will lead a deadlock
> > >
> > I assume this patch requires some patch in qemu as well. Do you have
> > such patch?
> >
>
> It's this series: [PATCH 0/3] Vhost-vDPA: batch IOTLB updating.
>
> You were copied.
>

Right, I miss those.
Thanks.

2020-09-24 07:50:54

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 02/24] vhost-vdpa: fix vqs leak in vhost_vdpa_open()

On Thu, Sep 24, 2020 at 11:21:03AM +0800, Jason Wang wrote:
> We need to free vqs during the err path after it has been allocated
> since vhost won't do that for us.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vhost/vdpa.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 796fe979f997..9c641274b9f3 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -764,6 +764,12 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)
> v->domain = NULL;
> }
>
> +static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
> +{
> + vhost_dev_cleanup(&v->vdev);
> + kfree(v->vdev.vqs);
> +}
> +

Wouldn't it be cleaner to call kfree(vqs) explicilty inside
vhost_vdpa_open() in case of failure and keep the symetry of
vhost_dev_init()/vhost_dev_cleanup()?

> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> {
> struct vhost_vdpa *v;
> @@ -809,7 +815,7 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> return 0;
>
> err_init_iotlb:
> - vhost_dev_cleanup(&v->vdev);
> + vhost_vdpa_cleanup(v);
> err:
> atomic_dec(&v->opened);
> return r;
> @@ -840,8 +846,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
> vhost_vdpa_free_domain(v);
> vhost_vdpa_config_put(v);
> vhost_vdpa_clean_irq(v);
> - vhost_dev_cleanup(&v->vdev);
> - kfree(v->vdev.vqs);
> + vhost_vdpa_cleanup(v);
> mutex_unlock(&d->mutex);
>
> atomic_dec(&v->opened);
> --
> 2.20.1
>

2020-09-24 07:54:35

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [RFC PATCH 01/24] vhost-vdpa: fix backend feature ioctls

On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> introduces two malfunction backend features ioctls:
>
> 1) the ioctls was blindly added to vring ioctl instead of vdpa device
> ioctl
> 2) vhost_set_backend_features() was called when dev mutex has already
> been held which will lead a deadlock
>
> This patch fixes the above issues.
>
> Cc: Eli Cohen <[email protected]>
> Reported-by: Zhu Lingshan <[email protected]>
> Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> Signed-off-by: Jason Wang <[email protected]>

Don't we want the fixes queued right now, as opposed to the rest of the
RFC?

> ---
> drivers/vhost/vdpa.c | 30 ++++++++++++++++--------------
> 1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 3fab94f88894..796fe979f997 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -353,8 +353,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> struct vdpa_callback cb;
> struct vhost_virtqueue *vq;
> struct vhost_vring_state s;
> - u64 __user *featurep = argp;
> - u64 features;
> u32 idx;
> long r;
>
> @@ -381,18 +379,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
>
> vq->last_avail_idx = vq_state.avail_index;
> break;
> - case VHOST_GET_BACKEND_FEATURES:
> - features = VHOST_VDPA_BACKEND_FEATURES;
> - if (copy_to_user(featurep, &features, sizeof(features)))
> - return -EFAULT;
> - return 0;
> - case VHOST_SET_BACKEND_FEATURES:
> - if (copy_from_user(&features, featurep, sizeof(features)))
> - return -EFAULT;
> - if (features & ~VHOST_VDPA_BACKEND_FEATURES)
> - return -EOPNOTSUPP;
> - vhost_set_backend_features(&v->vdev, features);
> - return 0;
> }
>
> r = vhost_vring_ioctl(&v->vdev, cmd, argp);
> @@ -440,8 +426,20 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> struct vhost_vdpa *v = filep->private_data;
> struct vhost_dev *d = &v->vdev;
> void __user *argp = (void __user *)arg;
> + u64 __user *featurep = argp;
> + u64 features;
> long r;
>
> + if (cmd == VHOST_SET_BACKEND_FEATURES) {
> + r = copy_from_user(&features, featurep, sizeof(features));
> + if (r)
> + return r;
> + if (features & ~VHOST_VDPA_BACKEND_FEATURES)
> + return -EOPNOTSUPP;
> + vhost_set_backend_features(&v->vdev, features);
> + return 0;
> + }
> +
> mutex_lock(&d->mutex);
>
> switch (cmd) {
> @@ -476,6 +474,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> case VHOST_VDPA_SET_CONFIG_CALL:
> r = vhost_vdpa_set_config_call(v, argp);
> break;
> + case VHOST_GET_BACKEND_FEATURES:
> + features = VHOST_VDPA_BACKEND_FEATURES;
> + r = copy_to_user(featurep, &features, sizeof(features));
> + break;
> default:
> r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> if (r == -ENOIOCTLCMD)
> --
> 2.20.1

2020-09-24 08:30:40

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 01/24] vhost-vdpa: fix backend feature ioctls


On 2020/9/24 下午3:50, Michael S. Tsirkin wrote:
> On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
>> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
>> introduces two malfunction backend features ioctls:
>>
>> 1) the ioctls was blindly added to vring ioctl instead of vdpa device
>> ioctl
>> 2) vhost_set_backend_features() was called when dev mutex has already
>> been held which will lead a deadlock
>>
>> This patch fixes the above issues.
>>
>> Cc: Eli Cohen<[email protected]>
>> Reported-by: Zhu Lingshan<[email protected]>
>> Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")
>> Signed-off-by: Jason Wang<[email protected]>
> Don't we want the fixes queued right now, as opposed to the rest of the
> RFC?


Yes, actually I've posted in before[1].

Adding the patch here is to simplify the work for the guys that want to
do the work on top. E.g for Cindy to start the Qemu prototype.

Thanks

[1] https://www.spinics.net/lists/netdev/msg681247.html


2020-09-24 09:33:13

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [RFC PATCH 02/24] vhost-vdpa: fix vqs leak in vhost_vdpa_open()

On Thu, Sep 24, 2020 at 11:21:03AM +0800, Jason Wang wrote:
> We need to free vqs during the err path after it has been allocated
> since vhost won't do that for us.
>
> Signed-off-by: Jason Wang <[email protected]>

This is a bugfix too right? I don't see it posted separately ...

> ---
> drivers/vhost/vdpa.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 796fe979f997..9c641274b9f3 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -764,6 +764,12 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)
> v->domain = NULL;
> }
>
> +static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
> +{
> + vhost_dev_cleanup(&v->vdev);
> + kfree(v->vdev.vqs);
> +}
> +
> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> {
> struct vhost_vdpa *v;
> @@ -809,7 +815,7 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> return 0;
>
> err_init_iotlb:
> - vhost_dev_cleanup(&v->vdev);
> + vhost_vdpa_cleanup(v);
> err:
> atomic_dec(&v->opened);
> return r;
> @@ -840,8 +846,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
> vhost_vdpa_free_domain(v);
> vhost_vdpa_config_put(v);
> vhost_vdpa_clean_irq(v);
> - vhost_dev_cleanup(&v->vdev);
> - kfree(v->vdev.vqs);
> + vhost_vdpa_cleanup(v);
> mutex_unlock(&d->mutex);
>
> atomic_dec(&v->opened);
> --
> 2.20.1

2020-09-24 10:22:52

by Stefan Hajnoczi

[permalink] [raw]
Subject: Re: [RFC PATCH 00/24] Control VQ support in vDPA

On Thu, Sep 24, 2020 at 11:21:01AM +0800, Jason Wang wrote:
> This series tries to add the support for control virtqueue in vDPA.

Please include documentation for both driver authors and vhost-vdpa
ioctl users. vhost-vdpa ioctls are only documented with a single
sentence. Please add full information on arguments, return values, and a
high-level explanation of the feature (like this cover letter) to
introduce the API.

What is the policy for using virtqueue groups? My guess is:
1. virtio_vdpa simply enables all virtqueue groups.
2. vhost_vdpa relies on userspace policy on how to use virtqueue groups.
Are the semantics of virtqueue groups documented somewhere so
userspace knows what to do? If a vDPA driver author decides to create
N virtqueue groups, N/2 virtqueue groups, or just 1 virtqueue group,
how will userspace know what to do?

Maybe a document is needed to describe the recommended device-specific
virtqueue groups that vDPA drivers should implement (e.g. "put the net
control vq into its own virtqueue group")?

This could become messy with guidelines. For example, drivers might be
shipped that aren't usable for certain use cases just because the author
didn't know that a certain virtqueue grouping is advantageous.

BTW I like how general this feature is. It seems to allow vDPA devices
to be split into sub-devices for further passthrough. Who will write the
first vDPA-on-vDPA driver? :)

Stefan


Attachments:
(No filename) (1.43 kB)
signature.asc (499.00 B)
Download all attachments

2020-09-25 11:31:05

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 02/24] vhost-vdpa: fix vqs leak in vhost_vdpa_open()


On 2020/9/24 下午5:31, Michael S. Tsirkin wrote:
> On Thu, Sep 24, 2020 at 11:21:03AM +0800, Jason Wang wrote:
>> We need to free vqs during the err path after it has been allocated
>> since vhost won't do that for us.
>>
>> Signed-off-by: Jason Wang <[email protected]>
> This is a bugfix too right? I don't see it posted separately ...


A patch that is functional equivalent is posted here:

https://www.mail-archive.com/[email protected]/msg42558.html

I'm a little bit lazy to use that one since this patch is probably wrote
before that one.

Thanks


>
>> ---
>> drivers/vhost/vdpa.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 796fe979f997..9c641274b9f3 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -764,6 +764,12 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)
>> v->domain = NULL;
>> }
>>
>> +static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
>> +{
>> + vhost_dev_cleanup(&v->vdev);
>> + kfree(v->vdev.vqs);
>> +}
>> +
>> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>> {
>> struct vhost_vdpa *v;
>> @@ -809,7 +815,7 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>> return 0;
>>
>> err_init_iotlb:
>> - vhost_dev_cleanup(&v->vdev);
>> + vhost_vdpa_cleanup(v);
>> err:
>> atomic_dec(&v->opened);
>> return r;
>> @@ -840,8 +846,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
>> vhost_vdpa_free_domain(v);
>> vhost_vdpa_config_put(v);
>> vhost_vdpa_clean_irq(v);
>> - vhost_dev_cleanup(&v->vdev);
>> - kfree(v->vdev.vqs);
>> + vhost_vdpa_cleanup(v);
>> mutex_unlock(&d->mutex);
>>
>> atomic_dec(&v->opened);
>> --
>> 2.20.1

2020-09-25 11:38:16

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 00/24] Control VQ support in vDPA


On 2020/9/24 下午6:17, Stefan Hajnoczi wrote:
> On Thu, Sep 24, 2020 at 11:21:01AM +0800, Jason Wang wrote:
>> This series tries to add the support for control virtqueue in vDPA.
> Please include documentation for both driver authors and vhost-vdpa
> ioctl users. vhost-vdpa ioctls are only documented with a single
> sentence. Please add full information on arguments, return values, and a
> high-level explanation of the feature (like this cover letter) to
> introduce the API.


Right, this is in the TODO list. (And we probably need to start with
documenting vDPA bus operations first).


>
> What is the policy for using virtqueue groups? My guess is:
> 1. virtio_vdpa simply enables all virtqueue groups.
> 2. vhost_vdpa relies on userspace policy on how to use virtqueue groups.
> Are the semantics of virtqueue groups documented somewhere so
> userspace knows what to do? If a vDPA driver author decides to create
> N virtqueue groups, N/2 virtqueue groups, or just 1 virtqueue group,
> how will userspace know what to do?


So the mapping from virtqueue to virtqueue group is mandated by the vDPA
device(driver). vDPA bus driver (like vhost-vDPA), can only change the
association between virtqueue groups and ASID.

By default, it is required all virtqueue groups to be associated to
address space 0. This make sure virtio_vdpa can work without any special
groups/asid configuration.

I admit we need document all those semantics/polices.


>
> Maybe a document is needed to describe the recommended device-specific
> virtqueue groups that vDPA drivers should implement (e.g. "put the net
> control vq into its own virtqueue group")?


Yes, note that this depends on the hardware capability actually. It can
only put control vq in other virtqueue group if:

1) hardware support to isolate control vq DMA from the rest virtqueues
(PASID or simply using PA (translated address) for control vq)
or
2) the control vq is emulated by vDPA device driver (like vdpa_sim did).


>
> This could become messy with guidelines. For example, drivers might be
> shipped that aren't usable for certain use cases just because the author
> didn't know that a certain virtqueue grouping is advantageous.


Right.


>
> BTW I like how general this feature is. It seems to allow vDPA devices
> to be split into sub-devices for further passthrough. Who will write the
> first vDPA-on-vDPA driver? :)


Yes, that's an interesting question. For now, I can imagine we can
emulated a SRIOV based virtio-net VFs via this.

If we want to expose the ASID setting to guest as well, it probably
needs more thought.

Thanks


>
> Stefan

2020-09-25 11:43:21

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 02/24] vhost-vdpa: fix vqs leak in vhost_vdpa_open()


On 2020/9/24 下午3:48, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:03AM +0800, Jason Wang wrote:
>> We need to free vqs during the err path after it has been allocated
>> since vhost won't do that for us.
>>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> drivers/vhost/vdpa.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 796fe979f997..9c641274b9f3 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -764,6 +764,12 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)
>> v->domain = NULL;
>> }
>>
>> +static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
>> +{
>> + vhost_dev_cleanup(&v->vdev);
>> + kfree(v->vdev.vqs);
>> +}
>> +
> Wouldn't it be cleaner to call kfree(vqs) explicilty inside
> vhost_vdpa_open() in case of failure and keep the symetry of
> vhost_dev_init()/vhost_dev_cleanup()?


That's also fine.

See
https://www.mail-archive.com/[email protected]/msg42558.html

I will use that for the next version.

Thanks.


>
>> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>> {
>> struct vhost_vdpa *v;
>> @@ -809,7 +815,7 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>> return 0;
>>
>> err_init_iotlb:
>> - vhost_dev_cleanup(&v->vdev);
>> + vhost_vdpa_cleanup(v);
>> err:
>> atomic_dec(&v->opened);
>> return r;
>> @@ -840,8 +846,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
>> vhost_vdpa_free_domain(v);
>> vhost_vdpa_config_put(v);
>> vhost_vdpa_clean_irq(v);
>> - vhost_dev_cleanup(&v->vdev);
>> - kfree(v->vdev.vqs);
>> + vhost_vdpa_cleanup(v);
>> mutex_unlock(&d->mutex);
>>
>> atomic_dec(&v->opened);
>> --
>> 2.20.1
>>

2020-09-28 15:46:49

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH 08/24] vdpa: introduce virtqueue groups

On Thu, Sep 24, 2020 at 5:23 AM Jason Wang <[email protected]> wrote:
>
> This patch introduces virtqueue groups to vDPA device. The virtqueue
> group is the minimal set of virtqueues that must share an address
> space. And the adddress space identifier could only be attached to
> a specific virtqueue group.
>
> A new mandated bus operation is introduced to get the virtqueue group
> ID for a specific virtqueue.
>
> All the vDPA device drivers were converted to simply support a single
> virtqueue group.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vdpa/ifcvf/ifcvf_main.c | 9 ++++++++-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 +++++++-
> drivers/vdpa/vdpa.c | 4 +++-
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 11 ++++++++++-
> include/linux/vdpa.h | 12 +++++++++---
> 5 files changed, 37 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index 076d7ac5e723..e6a0be374e51 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -327,6 +327,11 @@ static u32 ifcvf_vdpa_get_vq_align(struct vdpa_device *vdpa_dev)
> return IFCVF_QUEUE_ALIGNMENT;
> }
>
> +static u32 ifcvf_vdpa_get_vq_group(struct vdpa_device *vdpa, u16 idx)
> +{
> + return 0;
> +}
> +
> static void ifcvf_vdpa_get_config(struct vdpa_device *vdpa_dev,
> unsigned int offset,
> void *buf, unsigned int len)
> @@ -387,6 +392,7 @@ static const struct vdpa_config_ops ifc_vdpa_ops = {
> .get_device_id = ifcvf_vdpa_get_device_id,
> .get_vendor_id = ifcvf_vdpa_get_vendor_id,
> .get_vq_align = ifcvf_vdpa_get_vq_align,
> + .get_vq_group = ifcvf_vdpa_get_vq_group,
> .get_config = ifcvf_vdpa_get_config,
> .set_config = ifcvf_vdpa_set_config,
> .set_config_cb = ifcvf_vdpa_set_config_cb,
> @@ -434,7 +440,8 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
> dev, &ifc_vdpa_ops,
> - IFCVF_MAX_QUEUE_PAIRS * 2);
> + IFCVF_MAX_QUEUE_PAIRS * 2, 1);
> +
> if (adapter == NULL) {
> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> return -ENOMEM;
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 9df69d5efe8c..4e480f4f754e 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1428,6 +1428,11 @@ static u32 mlx5_vdpa_get_vq_align(struct vdpa_device *vdev)
> return PAGE_SIZE;
> }
>
> +static u32 mlx5_vdpa_get_vq_group(struct vdpa_device *vdpa, u16 idx)
> +{
> + return 0;
> +}
> +
> enum { MLX5_VIRTIO_NET_F_GUEST_CSUM = 1 << 9,
> MLX5_VIRTIO_NET_F_CSUM = 1 << 10,
> MLX5_VIRTIO_NET_F_HOST_TSO6 = 1 << 11,
> @@ -1838,6 +1843,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
> .get_vq_notification = mlx5_get_vq_notification,
> .get_vq_irq = mlx5_get_vq_irq,
> .get_vq_align = mlx5_vdpa_get_vq_align,
> + .get_vq_group = mlx5_vdpa_get_vq_group,
> .get_features = mlx5_vdpa_get_features,
> .set_features = mlx5_vdpa_set_features,
> .set_config_cb = mlx5_vdpa_set_config_cb,
> @@ -1925,7 +1931,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
> max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>
> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> - 2 * mlx5_vdpa_max_qps(max_vqs));
> + 2 * mlx5_vdpa_max_qps(max_vqs), 1);
> if (IS_ERR(ndev))
> return ndev;
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index a69ffc991e13..46399746ec7c 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -62,6 +62,7 @@ static void vdpa_release_dev(struct device *d)
> * @parent: the parent device
> * @config: the bus operations that is supported by this device
> * @nvqs: number of virtqueues supported by this device
> + * @ngroups: number of groups supported by this device

Hi!

Maybe the description of "ngroups" could be "number of *virtqueue*
groups supported by this device"? I think that it could be needed in
some contexts reading the code.

Thanks!

> * @size: size of the parent structure that contains private data
> *
> * Driver should use vdpa_alloc_device() wrapper macro instead of
> @@ -72,7 +73,7 @@ static void vdpa_release_dev(struct device *d)
> */
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> - int nvqs,
> + int nvqs, unsigned int ngroups,
> size_t size)
> {
> struct vdpa_device *vdev;
> @@ -100,6 +101,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> vdev->config = config;
> vdev->features_valid = false;
> vdev->nvqs = nvqs;
> + vdev->ngroups = ngroups;
>
> err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
> if (err)
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 62d640327145..6669c561bc6e 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -75,6 +75,7 @@ struct vdpasim {
> u32 status;
> u32 generation;
> u64 features;
> + u32 groups;
> /* spinlock to synchronize iommu table */
> spinlock_t iommu_lock;
> };
> @@ -352,7 +353,8 @@ static struct vdpasim *vdpasim_create(void)
> else
> ops = &vdpasim_net_config_ops;
>
> - vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
> + vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> + VDPASIM_VQ_NUM, 1);
> if (!vdpasim)
> goto err_alloc;
>
> @@ -481,6 +483,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
> return VDPASIM_QUEUE_ALIGN;
> }
>
> +static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
> +{
> + return 0;
> +}
> +
> static u64 vdpasim_get_features(struct vdpa_device *vdpa)
> {
> return vdpasim_features;
> @@ -646,6 +653,7 @@ static const struct vdpa_config_ops vdpasim_net_config_ops = {
> .set_vq_state = vdpasim_set_vq_state,
> .get_vq_state = vdpasim_get_vq_state,
> .get_vq_align = vdpasim_get_vq_align,
> + .get_vq_group = vdpasim_get_vq_group,
> .get_features = vdpasim_get_features,
> .set_features = vdpasim_set_features,
> .set_config_cb = vdpasim_set_config_cb,
> @@ -672,6 +680,7 @@ static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
> .set_vq_state = vdpasim_set_vq_state,
> .get_vq_state = vdpasim_get_vq_state,
> .get_vq_align = vdpasim_get_vq_align,
> + .get_vq_group = vdpasim_get_vq_group,
> .get_features = vdpasim_get_features,
> .set_features = vdpasim_set_features,
> .set_config_cb = vdpasim_set_config_cb,
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index df169c2f5c0f..d829512efd27 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -51,6 +51,7 @@ struct vdpa_device {
> unsigned int index;
> bool features_valid;
> int nvqs;
> + unsigned int ngroups;
> };
>
> /**
> @@ -109,6 +110,10 @@ struct vdpa_device {
> * for the device
> * @vdev: vdpa device
> * Returns virtqueue algin requirement
> + * @get_vq_group: Get the group id for a specific virtqueue
> + * @vdev: vdpa device
> + * @idx: virtqueue index
> + * Returns u32: group id for this virtqueue
> * @get_features: Get virtio features supported by the device
> * @vdev: vdpa device
> * Returns the virtio features support by the
> @@ -203,6 +208,7 @@ struct vdpa_config_ops {
>
> /* Device ops */
> u32 (*get_vq_align)(struct vdpa_device *vdev);
> + u32 (*get_vq_group)(struct vdpa_device *vdev, u16 idx);
> u64 (*get_features)(struct vdpa_device *vdev);
> int (*set_features)(struct vdpa_device *vdev, u64 features);
> void (*set_config_cb)(struct vdpa_device *vdev,
> @@ -230,12 +236,12 @@ struct vdpa_config_ops {
>
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> - int nvqs,
> + int nvqs, unsigned int ngroups,
> size_t size);
>
> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
> container_of(__vdpa_alloc_device( \
> - parent, config, nvqs, \
> + parent, config, nvqs, ngroups, \
> sizeof(dev_struct) + \
> BUILD_BUG_ON_ZERO(offsetof( \
> dev_struct, member))), \
> --
> 2.20.1
>

2020-09-28 15:47:02

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH 18/24] vhost-vdpa: support ASID based IOTLB API

On Thu, Sep 24, 2020 at 5:25 AM Jason Wang <[email protected]> wrote:
>
> This patch extends the vhost-vdpa to support ASID based IOTLB API. The
> vhost-vdpa device will allocated multple IOTLBs for vDPA device that
> supports multiple address spaces. The IOTLBs and vDPA device memory
> mappings is determined and maintained through ASID.
>
> Note that we still don't support vDPA device with more than one
> address spaces that depends on platform IOMMU. This work will be done
> by moving the IOMMU logic from vhost-vDPA to vDPA device driver.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vhost/vdpa.c | 106 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 79 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 978cf97dc03a..99ac13b2ed11 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -29,7 +29,8 @@
> enum {
> VHOST_VDPA_BACKEND_FEATURES =
> (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2) |
> - (1ULL << VHOST_BACKEND_F_IOTLB_BATCH),
> + (1ULL << VHOST_BACKEND_F_IOTLB_BATCH) |
> + (1ULL << VHOST_BACKEND_F_IOTLB_ASID),
> };
>
> #define VHOST_VDPA_DEV_MAX (1U << MINORBITS)
> @@ -58,12 +59,20 @@ struct vhost_vdpa {
> struct eventfd_ctx *config_ctx;
> int in_batch;
> int used_as;
> + u32 batch_asid;
> };
>
> static DEFINE_IDA(vhost_vdpa_ida);
>
> static dev_t vhost_vdpa_major;
>
> +static inline u32 iotlb_to_asid(struct vhost_iotlb *iotlb)
> +{
> + struct vhost_vdpa_as *as = container_of(iotlb, struct
> + vhost_vdpa_as, iotlb);
> + return as->id;
> +}
> +
> static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
> {
> struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
> @@ -76,6 +85,16 @@ static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
> return NULL;
> }
>
> +static struct vhost_iotlb *asid_to_iotlb(struct vhost_vdpa *v, u32 asid)
> +{
> + struct vhost_vdpa_as *as = asid_to_as(v, asid);
> +
> + if (!as)
> + return NULL;
> +
> + return &as->iotlb;
> +}
> +
> static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
> {
> struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
> @@ -84,6 +103,9 @@ static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
> if (asid_to_as(v, asid))
> return NULL;
>
> + if (asid >= v->vdpa->nas)
> + return NULL;
> +
> as = kmalloc(sizeof(*as), GFP_KERNEL);
> if (!as)
> return NULL;
> @@ -96,13 +118,20 @@ static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
> return as;
> }
>
> -static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> +static struct vhost_vdpa_as *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> + u32 asid)
> {
> struct vhost_vdpa_as *as = asid_to_as(v, asid);
>
> - /* Remove default address space is not allowed */
> - if (asid == 0)
> - return -EINVAL;
> + if (as)
> + return as;
> +
> + return vhost_vdpa_alloc_as(v, asid);
> +}
> +
> +static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> +{
> + struct vhost_vdpa_as *as = asid_to_as(v, asid);
>
> if (!as)
> return -EINVAL;
> @@ -623,6 +652,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> {
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> + u32 asid = iotlb_to_asid(iotlb);
> int r = 0;
>
> r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
> @@ -631,10 +661,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> return r;
>
> if (ops->dma_map) {
> - r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
> + r = ops->dma_map(vdpa, asid, iova, size, pa, perm);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - r = ops->set_map(vdpa, 0, iotlb);
> + r = ops->set_map(vdpa, asid, iotlb);
> } else {
> r = iommu_map(v->domain, iova, pa, size,
> perm_to_iommu_flags(perm));
> @@ -643,23 +673,32 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> return r;
> }
>
> -static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> - struct vhost_iotlb *iotlb,
> - u64 iova, u64 size)
> +static int vhost_vdpa_unmap(struct vhost_vdpa *v,
> + struct vhost_iotlb *iotlb,
> + u64 iova, u64 size)
> {
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> + u32 asid = (iotlb);
> +
> + if (!iotlb)
> + return -EINVAL;

This should be reorder to check for (!iotlb) before use at `asid =
iotlb_to_asid()`, isn't it?

Thanks!

>
> vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>
> if (ops->dma_map) {
> - ops->dma_unmap(vdpa, 0, iova, size);
> + ops->dma_unmap(vdpa, asid, iova, size);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - ops->set_map(vdpa, 0, iotlb);
> + ops->set_map(vdpa, asid, iotlb);
> } else {
> iommu_unmap(v->domain, iova, size);
> }
> +
> + if (!iotlb->nmaps)
> + vhost_vdpa_remove_as(v, asid);
> +
> + return 0;
> }
>
> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> @@ -755,30 +794,38 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
> struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> - struct vhost_vdpa_as *as = asid_to_as(v, 0);
> - struct vhost_iotlb *iotlb = &as->iotlb;
> + struct vhost_iotlb *iotlb = asid_to_iotlb(v, asid);
> + struct vhost_vdpa_as *as;
> int r = 0;
>
> - if (asid != 0)
> - return -EINVAL;
> -
> r = vhost_dev_check_owner(dev);
> if (r)
> return r;
>
> + if ((msg->type == VHOST_IOTLB_UPDATE) && !iotlb) {
> + as = vhost_vdpa_find_alloc_as(v, asid);
> + if (!as)
> + return -EINVAL;
> + iotlb = &as->iotlb;
> + }
> +
> + if (v->in_batch && v->batch_asid != asid)
> + return -EINVAL;
> +
> switch (msg->type) {
> case VHOST_IOTLB_UPDATE:
> r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
> break;
> case VHOST_IOTLB_INVALIDATE:
> - vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
> + r = vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
> break;
> case VHOST_IOTLB_BATCH_BEGIN:
> + v->batch_asid = asid;
> v->in_batch = true;
> break;
> case VHOST_IOTLB_BATCH_END:
> if (v->in_batch && ops->set_map)
> - ops->set_map(vdpa, 0, iotlb);
> + ops->set_map(vdpa, asid, iotlb);
> v->in_batch = false;
> break;
> default:
> @@ -848,9 +895,17 @@ static void vhost_vdpa_free_domain(struct vhost_vdpa *v)
>
> static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
> {
> + struct vhost_vdpa_as *as;
> + u32 asid;
> +
> vhost_dev_cleanup(&v->vdev);
> kfree(v->vdev.vqs);
> - vhost_vdpa_remove_as(v, 0);
> +
> + for (asid = 0; asid < v->vdpa->nas; asid++) {
> + as = asid_to_as(v, asid);
> + if (as)
> + vhost_vdpa_remove_as(v, asid);
> + }
> }
>
> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> @@ -883,18 +938,15 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
> vhost_vdpa_process_iotlb_msg);
>
> - if (!vhost_vdpa_alloc_as(v, 0))
> - goto err_alloc_as;
> -
> r = vhost_vdpa_alloc_domain(v);
> if (r)
> - goto err_alloc_as;
> + goto err_alloc_domain;
>
> filep->private_data = v;
>
> return 0;
>
> -err_alloc_as:
> +err_alloc_domain:
> vhost_vdpa_cleanup(v);
> err:
> atomic_dec(&v->opened);
> @@ -1022,8 +1074,8 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> int minor;
> int i, r;
>
> - /* Only support 1 address space */
> - if (vdpa->ngroups != 1)
> + /* We can't support platform IOMMU device with more than 1 group */
> + if (!ops->set_map && !ops->dma_map && vdpa->ngroups > 1)
> return -ENOTSUPP;
>
> /* Currently, we only accept the network devices. */
> --
> 2.20.1
>

2020-09-29 14:44:37

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH 13/24] vhost-vdpa: introduce ASID based IOTLB

On Thu, Sep 24, 2020 at 5:24 AM Jason Wang <[email protected]> wrote:
>
> This patch introduces the support of ASID based IOTLB by tagging IOTLB
> with a unique ASID. This is a must for supporting ASID based vhost
> IOTLB API by the following patches.
>
> IOTLB were stored in a hlist and new IOTLB will be allocated when a
> new ASID is seen via IOTLB API and destoryed when there's no mapping
> associated with an ASID.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vhost/vdpa.c | 94 +++++++++++++++++++++++++++++++++-----------
> 1 file changed, 72 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 6552987544d7..1ba7e95619b5 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -34,13 +34,21 @@ enum {
>
> #define VHOST_VDPA_DEV_MAX (1U << MINORBITS)
>
> +#define VHOST_VDPA_IOTLB_BUCKETS 16
> +
> +struct vhost_vdpa_as {
> + struct hlist_node hash_link;
> + struct vhost_iotlb iotlb;
> + u32 id;
> +};
> +
> struct vhost_vdpa {
> struct vhost_dev vdev;
> struct iommu_domain *domain;
> struct vhost_virtqueue *vqs;
> struct completion completion;
> struct vdpa_device *vdpa;
> - struct vhost_iotlb *iotlb;
> + struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
> struct device dev;
> struct cdev cdev;
> atomic_t opened;
> @@ -49,12 +57,64 @@ struct vhost_vdpa {
> int minor;
> struct eventfd_ctx *config_ctx;
> int in_batch;
> + int used_as;

Hi!

The variable `used_as` is not used anywhere outside this commit, and
in this commit is only tracking the number os AS added, not being able
to query it or using it by limiting them or anything like that.

If I'm right, could we consider deleting it? Or am I missing some usage of it?

I smoke tested all the series deleting that variable and everything
seems right to me.

Thanks!

> };
>
> static DEFINE_IDA(vhost_vdpa_ida);
>
> static dev_t vhost_vdpa_major;
>
> +static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
> +{
> + struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
> + struct vhost_vdpa_as *as;
> +
> + hlist_for_each_entry(as, head, hash_link)
> + if (as->id == asid)
> + return as;
> +
> + return NULL;
> +}
> +
> +static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
> +{
> + struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
> + struct vhost_vdpa_as *as;
> +
> + if (asid_to_as(v, asid))
> + return NULL;
> +
> + as = kmalloc(sizeof(*as), GFP_KERNEL);
> + if (!as)
> + return NULL;
> +
> + vhost_iotlb_init(&as->iotlb, 0, 0);
> + as->id = asid;
> + hlist_add_head(&as->hash_link, head);
> + ++v->used_as;
> +
> + return as;
> +}
> +
> +static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> +{
> + struct vhost_vdpa_as *as = asid_to_as(v, asid);
> +
> + /* Remove default address space is not allowed */
> + if (asid == 0)
> + return -EINVAL;
> +
> + if (!as)
> + return -EINVAL;
> +
> + hlist_del(&as->hash_link);
> + vhost_iotlb_reset(&as->iotlb);
> + kfree(as);
> + --v->used_as;
> +
> + return 0;
> +}
> +
> static void handle_vq_kick(struct vhost_work *work)
> {
> struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> @@ -513,15 +573,6 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
> }
> }
>
> -static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
> -{
> - struct vhost_iotlb *iotlb = v->iotlb;
> -
> - vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
> - kfree(v->iotlb);
> - v->iotlb = NULL;
> -}
> -
> static int perm_to_iommu_flags(u32 perm)
> {
> int flags = 0;
> @@ -681,7 +732,8 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
> struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> - struct vhost_iotlb *iotlb = v->iotlb;
> + struct vhost_vdpa_as *as = asid_to_as(v, 0);
> + struct vhost_iotlb *iotlb = &as->iotlb;
> int r = 0;
>
> if (asid != 0)
> @@ -775,6 +827,7 @@ static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
> {
> vhost_dev_cleanup(&v->vdev);
> kfree(v->vdev.vqs);
> + vhost_vdpa_remove_as(v, 0);
> }
>
> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> @@ -807,23 +860,18 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
> vhost_vdpa_process_iotlb_msg);
>
> - dev->iotlb = vhost_iotlb_alloc(0, 0);
> - if (!dev->iotlb) {
> - r = -ENOMEM;
> - goto err_init_iotlb;
> - }
> + if (!vhost_vdpa_alloc_as(v, 0))
> + goto err_alloc_as;
>
> r = vhost_vdpa_alloc_domain(v);
> if (r)
> - goto err_alloc_domain;
> + goto err_alloc_as;
>
> filep->private_data = v;
>
> return 0;
>
> -err_alloc_domain:
> - vhost_vdpa_iotlb_free(v);
> -err_init_iotlb:
> +err_alloc_as:
> vhost_vdpa_cleanup(v);
> err:
> atomic_dec(&v->opened);
> @@ -851,7 +899,6 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
> filep->private_data = NULL;
> vhost_vdpa_reset(v);
> vhost_dev_stop(&v->vdev);
> - vhost_vdpa_iotlb_free(v);
> vhost_vdpa_free_domain(v);
> vhost_vdpa_config_put(v);
> vhost_vdpa_clean_irq(v);
> @@ -950,7 +997,7 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> const struct vdpa_config_ops *ops = vdpa->config;
> struct vhost_vdpa *v;
> int minor;
> - int r;
> + int i, r;
>
> /* Only support 1 address space */
> if (vdpa->ngroups != 1)
> @@ -1002,6 +1049,9 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> init_completion(&v->completion);
> vdpa_set_drvdata(vdpa, v);
>
> + for (i = 0; i < VHOST_VDPA_IOTLB_BUCKETS; i++)
> + INIT_HLIST_HEAD(&v->as[i]);
> +
> return 0;
>
> err:
> --
> 2.20.1
>

2020-09-30 11:27:59

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 05/24] vhost-vdpa: passing iotlb to IOMMU mapping helpers

On Thu, Sep 24, 2020 at 11:21:06AM +0800, Jason Wang wrote:
> To prepare for the ASID support for vhost-vdpa, try to pass IOTLB
> object to dma helpers.

Maybe it's worth mentioning here that this patch does not change any
functionality and is presented as a preparation for passing different
iotlb's instead of using dev->iotlb

>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vhost/vdpa.c | 40 ++++++++++++++++++++++------------------
> 1 file changed, 22 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 9c641274b9f3..74bef1c15a70 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -489,10 +489,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> return r;
> }
>
> -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
> + struct vhost_iotlb *iotlb,
> + u64 start, u64 last)
> {
> struct vhost_dev *dev = &v->vdev;
> - struct vhost_iotlb *iotlb = dev->iotlb;
> struct vhost_iotlb_map *map;
> struct page *page;
> unsigned long pfn, pinned;
> @@ -514,8 +515,9 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
> {
> struct vhost_dev *dev = &v->vdev;
> + struct vhost_iotlb *iotlb = dev->iotlb;
>
> - vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
> + vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
> kfree(dev->iotlb);
> dev->iotlb = NULL;
> }
> @@ -542,15 +544,14 @@ static int perm_to_iommu_flags(u32 perm)
> return flags | IOMMU_CACHE;
> }
>
> -static int vhost_vdpa_map(struct vhost_vdpa *v,
> +static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> u64 iova, u64 size, u64 pa, u32 perm)
> {
> - struct vhost_dev *dev = &v->vdev;
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> int r = 0;
>
> - r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1,
> + r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
> pa, perm);
> if (r)
> return r;
> @@ -559,7 +560,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
> r = ops->dma_map(vdpa, iova, size, pa, perm);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - r = ops->set_map(vdpa, dev->iotlb);
> + r = ops->set_map(vdpa, iotlb);
> } else {
> r = iommu_map(v->domain, iova, pa, size,
> perm_to_iommu_flags(perm));
> @@ -568,29 +569,30 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
> return r;
> }
>
> -static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)
> +static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> + struct vhost_iotlb *iotlb,
> + u64 iova, u64 size)
> {
> - struct vhost_dev *dev = &v->vdev;
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
>
> - vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);
> + vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>
> if (ops->dma_map) {
> ops->dma_unmap(vdpa, iova, size);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - ops->set_map(vdpa, dev->iotlb);
> + ops->set_map(vdpa, iotlb);
> } else {
> iommu_unmap(v->domain, iova, size);
> }
> }
>
> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> + struct vhost_iotlb *iotlb,
> struct vhost_iotlb_msg *msg)
> {
> struct vhost_dev *dev = &v->vdev;
> - struct vhost_iotlb *iotlb = dev->iotlb;
> struct page **page_list;
> unsigned long list_size = PAGE_SIZE / sizeof(struct page *);
> unsigned int gup_flags = FOLL_LONGTERM;
> @@ -644,7 +646,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> if (last_pfn && (this_pfn != last_pfn + 1)) {
> /* Pin a contiguous chunk of memory */
> csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT;
> - if (vhost_vdpa_map(v, iova, csize,
> + if (vhost_vdpa_map(v, iotlb, iova, csize,
> map_pfn << PAGE_SHIFT,
> msg->perm))
> goto out;
> @@ -660,11 +662,12 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> }
>
> /* Pin the rest chunk */
> - ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT,
> + ret = vhost_vdpa_map(v, iotlb, iova,
> + (last_pfn - map_pfn + 1) << PAGE_SHIFT,
> map_pfn << PAGE_SHIFT, msg->perm);
> out:
> if (ret) {
> - vhost_vdpa_unmap(v, msg->iova, msg->size);
> + vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
> atomic64_sub(npages, &dev->mm->pinned_vm);
> }
> mmap_read_unlock(dev->mm);
> @@ -678,6 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> + struct vhost_iotlb *iotlb = dev->iotlb;
> int r = 0;
>
> r = vhost_dev_check_owner(dev);
> @@ -686,17 +690,17 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>
> switch (msg->type) {
> case VHOST_IOTLB_UPDATE:
> - r = vhost_vdpa_process_iotlb_update(v, msg);
> + r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
> break;
> case VHOST_IOTLB_INVALIDATE:
> - vhost_vdpa_unmap(v, msg->iova, msg->size);
> + vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
> break;
> case VHOST_IOTLB_BATCH_BEGIN:
> v->in_batch = true;
> break;
> case VHOST_IOTLB_BATCH_END:
> if (v->in_batch && ops->set_map)
> - ops->set_map(vdpa, dev->iotlb);
> + ops->set_map(vdpa, iotlb);
> v->in_batch = false;
> break;
> default:
> --
> 2.20.1
>

2020-09-30 12:03:50

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 06/24] vhost-vdpa: switch to use vhost-vdpa specific IOTLB

On Thu, Sep 24, 2020 at 11:21:07AM +0800, Jason Wang wrote:
> To ease the implementation of per group ASID support for vDPA
> device. This patch switches to use a vhost-vdpa specific IOTLB to
> avoid the unnecessary refactoring of the vhost core.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vhost/vdpa.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 74bef1c15a70..ec3c94f706c1 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -40,6 +40,7 @@ struct vhost_vdpa {
> struct vhost_virtqueue *vqs;
> struct completion completion;
> struct vdpa_device *vdpa;
> + struct vhost_iotlb *iotlb;
> struct device dev;
> struct cdev cdev;
> atomic_t opened;
> @@ -514,12 +515,11 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
>
> static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
> {
> - struct vhost_dev *dev = &v->vdev;
> - struct vhost_iotlb *iotlb = dev->iotlb;
> + struct vhost_iotlb *iotlb = v->iotlb;
>
> vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
> - kfree(dev->iotlb);
> - dev->iotlb = NULL;
> + kfree(v->iotlb);
> + v->iotlb = NULL;
> }
>
> static int perm_to_iommu_flags(u32 perm)
> @@ -681,7 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> - struct vhost_iotlb *iotlb = dev->iotlb;
> + struct vhost_iotlb *iotlb = v->iotlb;
> int r = 0;
>
> r = vhost_dev_check_owner(dev);
> @@ -812,12 +812,14 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>
> r = vhost_vdpa_alloc_domain(v);
> if (r)
> - goto err_init_iotlb;
> + goto err_alloc_domain;

You're still using this:
dev->iotlb = vhost_iotlb_alloc(0, 0);

Shouldn't you use
v->iotlb = host_iotlb_alloc(0, 0);

to set the vdpa device iotlb field?

>
> filep->private_data = v;
>
> return 0;
>
> +err_alloc_domain:
> + vhost_vdpa_iotlb_free(v);
> err_init_iotlb:
> vhost_vdpa_cleanup(v);
> err:
> --
> 2.20.1
>

2020-10-01 13:25:45

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 09/24] vdpa: multiple address spaces support

On Thu, Sep 24, 2020 at 11:21:10AM +0800, Jason Wang wrote:
> This patches introduces the multiple address spaces support for vDPA
> device. This idea is to identify a specific address space via an
> dedicated identifier - ASID.
>
> During vDPA device allocation, vDPA device driver needs to report the
> number of address spaces supported by the device then the DMA mapping
> ops of the vDPA device needs to be extended to support ASID.
>
> This helps to isolate the DMA among the virtqueues. E.g in the case of
> virtio-net, the control virtqueue will not be assigned directly to
> guest.
>
> This RFC patch only converts for the device that wants its own
> IOMMU/DMA translation logic. So it will rejects the device with more
> that 1 address space that depends on platform IOMMU. The plan to
> moving all the DMA mapping logic to the vDPA device driver instead of
> doing it in vhost-vDPA (otherwise it could result a very complicated
> APIs and actually vhost-vDPA doesn't care about how the actual
> composition/emulation were done in the device driver).
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
> drivers/vdpa/vdpa.c | 4 +++-
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
> drivers/vhost/vdpa.c | 14 +++++++++-----
> include/linux/vdpa.h | 23 ++++++++++++++++-------
> 6 files changed, 38 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index e6a0be374e51..86cdf5f8bcae 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
> dev, &ifc_vdpa_ops,
> - IFCVF_MAX_QUEUE_PAIRS * 2, 1);
> + IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);
>
> if (adapter == NULL) {
> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 4e480f4f754e..db7404e121bf 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
> return mvdev->generation;
> }
>
> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb)
> {
> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> @@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
> max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>
> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> - 2 * mlx5_vdpa_max_qps(max_vqs), 1);
> + 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
> if (IS_ERR(ndev))
> return ndev;
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 46399746ec7c..05195fa7865d 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
> * @config: the bus operations that is supported by this device
> * @nvqs: number of virtqueues supported by this device
> * @ngroups: number of groups supported by this device
> + * @nas: number of address spaces supported by this device
> * @size: size of the parent structure that contains private data
> *
> * Driver should use vdpa_alloc_device() wrapper macro instead of
> @@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> int nvqs, unsigned int ngroups,
> - size_t size)
> + unsigned int nas, size_t size)
> {
> struct vdpa_device *vdev;
> int err = -EINVAL;
> @@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> vdev->features_valid = false;
> vdev->nvqs = nvqs;
> vdev->ngroups = ngroups;
> + vdev->nas = nas;
>
> err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
> if (err)
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 6669c561bc6e..5dc04ec271bb 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
> ops = &vdpasim_net_config_ops;
>
> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> - VDPASIM_VQ_NUM, 1);
> + VDPASIM_VQ_NUM, 1, 1);
> if (!vdpasim)
> goto err_alloc;
>
> @@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
> return vdpasim->generation;
> }
>
> -static int vdpasim_set_map(struct vdpa_device *vdpa,
> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
> return ret;
> }
>
> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size,
> u64 pa, u32 perm)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> return ret;
> }
>
> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ec3c94f706c1..eeefcd971e3f 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> return r;
>
> if (ops->dma_map) {
> - r = ops->dma_map(vdpa, iova, size, pa, perm);
> + r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - r = ops->set_map(vdpa, iotlb);
> + r = ops->set_map(vdpa, 0, iotlb);
> } else {
> r = iommu_map(v->domain, iova, pa, size,
> perm_to_iommu_flags(perm));
> @@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>
> if (ops->dma_map) {
> - ops->dma_unmap(vdpa, iova, size);
> + ops->dma_unmap(vdpa, 0, iova, size);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> } else {
> iommu_unmap(v->domain, iova, size);
> }
> @@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> break;
> case VHOST_IOTLB_BATCH_END:
> if (v->in_batch && ops->set_map)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> v->in_batch = false;
> break;
> default:
> @@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> int minor;
> int r;
>
> + /* Only support 1 address space */
> + if (vdpa->ngroups != 1)
> + return -ENOTSUPP;

Checkpatch warning: prefer EOPNOTSUPP

> +
> /* Currently, we only accept the network devices. */
> if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
> return -ENOTSUPP;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index d829512efd27..1e1163daa352 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -43,6 +43,8 @@ struct vdpa_vq_state {
> * @index: device index
> * @features_valid: were features initialized? for legacy guests
> * @nvqs: the number of virtqueues
> + * @ngroups: the number of virtqueue groups
> + * @nas: the number of address spaces
> */
> struct vdpa_device {
> struct device dev;
> @@ -52,6 +54,7 @@ struct vdpa_device {
> bool features_valid;
> int nvqs;
> unsigned int ngroups;
> + unsigned int nas;
> };
>
> /**
> @@ -161,6 +164,7 @@ struct vdpa_device {
> * Needed for device that using device
> * specific DMA translation (on-chip IOMMU)
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iotlb: vhost memory mapping to be
> * used by the vDPA
> * Returns integer: success (0) or error (< 0)
> @@ -169,6 +173,7 @@ struct vdpa_device {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental map.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be mapped
> * @size: size of the area
> * @pa: physical address for the map
> @@ -180,6 +185,7 @@ struct vdpa_device {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental unmap.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be unmapped
> * @size: size of the area
> * Returns integer: success (0) or error (< 0)
> @@ -225,10 +231,12 @@ struct vdpa_config_ops {
> u32 (*get_generation)(struct vdpa_device *vdev);
>
> /* DMA ops */
> - int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> - int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> - u64 pa, u32 perm);
> - int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> + int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb);
> + int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size, u64 pa, u32 perm);
> + int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size);
>
> /* Free device resources */
> void (*free)(struct vdpa_device *vdev);
> @@ -237,11 +245,12 @@ struct vdpa_config_ops {
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> int nvqs, unsigned int ngroups,
> - size_t size);
> + unsigned int nas, size_t size);
>
> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
> + ngroups, nas) \
> container_of(__vdpa_alloc_device( \
> - parent, config, nvqs, ngroups, \
> + parent, config, nvqs, ngroups, nas, \
> sizeof(dev_struct) + \
> BUILD_BUG_ON_ZERO(offsetof( \
> dev_struct, member))), \
> --
> 2.20.1
>

2020-10-01 13:26:01

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 09/24] vdpa: multiple address spaces support

On Thu, Sep 24, 2020 at 11:21:10AM +0800, Jason Wang wrote:
> This patches introduces the multiple address spaces support for vDPA
> device. This idea is to identify a specific address space via an
> dedicated identifier - ASID.
>
> During vDPA device allocation, vDPA device driver needs to report the
> number of address spaces supported by the device then the DMA mapping
> ops of the vDPA device needs to be extended to support ASID.
>
> This helps to isolate the DMA among the virtqueues. E.g in the case of
> virtio-net, the control virtqueue will not be assigned directly to
> guest.
>
> This RFC patch only converts for the device that wants its own
> IOMMU/DMA translation logic. So it will rejects the device with more
> that 1 address space that depends on platform IOMMU. The plan to

This is not apparent from the code. Instead you enforce number of groups
to 1.

> moving all the DMA mapping logic to the vDPA device driver instead of
> doing it in vhost-vDPA (otherwise it could result a very complicated
> APIs and actually vhost-vDPA doesn't care about how the actual
> composition/emulation were done in the device driver).
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
> drivers/vdpa/vdpa.c | 4 +++-
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
> drivers/vhost/vdpa.c | 14 +++++++++-----
> include/linux/vdpa.h | 23 ++++++++++++++++-------
> 6 files changed, 38 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index e6a0be374e51..86cdf5f8bcae 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
> dev, &ifc_vdpa_ops,
> - IFCVF_MAX_QUEUE_PAIRS * 2, 1);
> + IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);
>
> if (adapter == NULL) {
> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 4e480f4f754e..db7404e121bf 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
> return mvdev->generation;
> }
>
> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb)
> {
> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> @@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
> max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>
> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> - 2 * mlx5_vdpa_max_qps(max_vqs), 1);
> + 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
> if (IS_ERR(ndev))
> return ndev;
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 46399746ec7c..05195fa7865d 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
> * @config: the bus operations that is supported by this device
> * @nvqs: number of virtqueues supported by this device
> * @ngroups: number of groups supported by this device
> + * @nas: number of address spaces supported by this device
> * @size: size of the parent structure that contains private data
> *
> * Driver should use vdpa_alloc_device() wrapper macro instead of
> @@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> int nvqs, unsigned int ngroups,
> - size_t size)
> + unsigned int nas, size_t size)
> {
> struct vdpa_device *vdev;
> int err = -EINVAL;
> @@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> vdev->features_valid = false;
> vdev->nvqs = nvqs;
> vdev->ngroups = ngroups;
> + vdev->nas = nas;
>
> err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
> if (err)
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 6669c561bc6e..5dc04ec271bb 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
> ops = &vdpasim_net_config_ops;
>
> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> - VDPASIM_VQ_NUM, 1);
> + VDPASIM_VQ_NUM, 1, 1);
> if (!vdpasim)
> goto err_alloc;
>
> @@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
> return vdpasim->generation;
> }
>
> -static int vdpasim_set_map(struct vdpa_device *vdpa,
> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
> return ret;
> }
>
> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size,
> u64 pa, u32 perm)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> return ret;
> }
>
> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ec3c94f706c1..eeefcd971e3f 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> return r;
>
> if (ops->dma_map) {
> - r = ops->dma_map(vdpa, iova, size, pa, perm);
> + r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - r = ops->set_map(vdpa, iotlb);
> + r = ops->set_map(vdpa, 0, iotlb);
> } else {
> r = iommu_map(v->domain, iova, pa, size,
> perm_to_iommu_flags(perm));
> @@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>
> if (ops->dma_map) {
> - ops->dma_unmap(vdpa, iova, size);
> + ops->dma_unmap(vdpa, 0, iova, size);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> } else {
> iommu_unmap(v->domain, iova, size);
> }
> @@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> break;
> case VHOST_IOTLB_BATCH_END:
> if (v->in_batch && ops->set_map)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> v->in_batch = false;
> break;
> default:
> @@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> int minor;
> int r;
>
> + /* Only support 1 address space */
> + if (vdpa->ngroups != 1)
> + return -ENOTSUPP;
> +

Did you mean to check agains vdpa->nas?

> /* Currently, we only accept the network devices. */
> if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
> return -ENOTSUPP;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index d829512efd27..1e1163daa352 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -43,6 +43,8 @@ struct vdpa_vq_state {
> * @index: device index
> * @features_valid: were features initialized? for legacy guests
> * @nvqs: the number of virtqueues
> + * @ngroups: the number of virtqueue groups
> + * @nas: the number of address spaces
> */
> struct vdpa_device {
> struct device dev;
> @@ -52,6 +54,7 @@ struct vdpa_device {
> bool features_valid;
> int nvqs;
> unsigned int ngroups;
> + unsigned int nas;
> };
>
> /**
> @@ -161,6 +164,7 @@ struct vdpa_device {
> * Needed for device that using device
> * specific DMA translation (on-chip IOMMU)
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iotlb: vhost memory mapping to be
> * used by the vDPA
> * Returns integer: success (0) or error (< 0)
> @@ -169,6 +173,7 @@ struct vdpa_device {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental map.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be mapped
> * @size: size of the area
> * @pa: physical address for the map
> @@ -180,6 +185,7 @@ struct vdpa_device {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental unmap.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be unmapped
> * @size: size of the area
> * Returns integer: success (0) or error (< 0)
> @@ -225,10 +231,12 @@ struct vdpa_config_ops {
> u32 (*get_generation)(struct vdpa_device *vdev);
>
> /* DMA ops */
> - int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> - int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> - u64 pa, u32 perm);
> - int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> + int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb);
> + int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size, u64 pa, u32 perm);
> + int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size);
>
> /* Free device resources */
> void (*free)(struct vdpa_device *vdev);
> @@ -237,11 +245,12 @@ struct vdpa_config_ops {
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> int nvqs, unsigned int ngroups,
> - size_t size);
> + unsigned int nas, size_t size);
>
> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
> + ngroups, nas) \
> container_of(__vdpa_alloc_device( \
> - parent, config, nvqs, ngroups, \
> + parent, config, nvqs, ngroups, nas, \
> sizeof(dev_struct) + \
> BUILD_BUG_ON_ZERO(offsetof( \
> dev_struct, member))), \
> --
> 2.20.1
>

2020-10-01 13:31:30

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group

On Thu, Sep 24, 2020 at 11:21:11AM +0800, Jason Wang wrote:
> This patch introduces a new bus operation to allow the vDPA bus driver
> to associate an ASID to a virtqueue group.
>

So in case of virtio_net, I would expect that all the data virtqueues
will be associated with the same address space identifier. Moreover,
this assignment should be provided before the set_map call that provides
the iotlb for the address space, correct?

> Signed-off-by: Jason Wang <[email protected]>
> ---
> include/linux/vdpa.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 1e1163daa352..e2394995a3cd 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -160,6 +160,12 @@ struct vdpa_device {
> * @get_generation: Get device config generation (optional)
> * @vdev: vdpa device
> * Returns u32: device generation
> + * @set_group_asid: Set address space identifier for a
> + * virtqueue group
> + * @vdev: vdpa device
> + * @group: virtqueue group
> + * @asid: address space id for this group
> + * Returns integer: success (0) or error (< 0)
> * @set_map: Set device memory mapping (optional)
> * Needed for device that using device
> * specific DMA translation (on-chip IOMMU)
> @@ -237,6 +243,10 @@ struct vdpa_config_ops {
> u64 iova, u64 size, u64 pa, u32 perm);
> int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> u64 iova, u64 size);
> + int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
> + unsigned int asid);
> +
> +

Extra space
>
> /* Free device resources */
> void (*free)(struct vdpa_device *vdev);
> --
> 2.20.1
>

2020-10-09 03:57:54

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 05/24] vhost-vdpa: passing iotlb to IOMMU mapping helpers


On 2020/9/30 下午7:26, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:06AM +0800, Jason Wang wrote:
>> To prepare for the ASID support for vhost-vdpa, try to pass IOTLB
>> object to dma helpers.
> Maybe it's worth mentioning here that this patch does not change any
> functionality and is presented as a preparation for passing different
> iotlb's instead of using dev->iotlb


Right, let me add them in the next version.

Thanks


>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> drivers/vhost/vdpa.c | 40 ++++++++++++++++++++++------------------
>> 1 file changed, 22 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 9c641274b9f3..74bef1c15a70 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -489,10 +489,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>> return r;
>> }
>>
>> -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>> +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
>> + struct vhost_iotlb *iotlb,
>> + u64 start, u64 last)
>> {
>> struct vhost_dev *dev = &v->vdev;
>> - struct vhost_iotlb *iotlb = dev->iotlb;
>> struct vhost_iotlb_map *map;
>> struct page *page;
>> unsigned long pfn, pinned;
>> @@ -514,8 +515,9 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>> static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
>> {
>> struct vhost_dev *dev = &v->vdev;
>> + struct vhost_iotlb *iotlb = dev->iotlb;
>>
>> - vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
>> + vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
>> kfree(dev->iotlb);
>> dev->iotlb = NULL;
>> }
>> @@ -542,15 +544,14 @@ static int perm_to_iommu_flags(u32 perm)
>> return flags | IOMMU_CACHE;
>> }
>>
>> -static int vhost_vdpa_map(struct vhost_vdpa *v,
>> +static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>> u64 iova, u64 size, u64 pa, u32 perm)
>> {
>> - struct vhost_dev *dev = &v->vdev;
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>> int r = 0;
>>
>> - r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1,
>> + r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
>> pa, perm);
>> if (r)
>> return r;
>> @@ -559,7 +560,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
>> r = ops->dma_map(vdpa, iova, size, pa, perm);
>> } else if (ops->set_map) {
>> if (!v->in_batch)
>> - r = ops->set_map(vdpa, dev->iotlb);
>> + r = ops->set_map(vdpa, iotlb);
>> } else {
>> r = iommu_map(v->domain, iova, pa, size,
>> perm_to_iommu_flags(perm));
>> @@ -568,29 +569,30 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
>> return r;
>> }
>>
>> -static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)
>> +static void vhost_vdpa_unmap(struct vhost_vdpa *v,
>> + struct vhost_iotlb *iotlb,
>> + u64 iova, u64 size)
>> {
>> - struct vhost_dev *dev = &v->vdev;
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>>
>> - vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);
>> + vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>>
>> if (ops->dma_map) {
>> ops->dma_unmap(vdpa, iova, size);
>> } else if (ops->set_map) {
>> if (!v->in_batch)
>> - ops->set_map(vdpa, dev->iotlb);
>> + ops->set_map(vdpa, iotlb);
>> } else {
>> iommu_unmap(v->domain, iova, size);
>> }
>> }
>>
>> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>> + struct vhost_iotlb *iotlb,
>> struct vhost_iotlb_msg *msg)
>> {
>> struct vhost_dev *dev = &v->vdev;
>> - struct vhost_iotlb *iotlb = dev->iotlb;
>> struct page **page_list;
>> unsigned long list_size = PAGE_SIZE / sizeof(struct page *);
>> unsigned int gup_flags = FOLL_LONGTERM;
>> @@ -644,7 +646,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>> if (last_pfn && (this_pfn != last_pfn + 1)) {
>> /* Pin a contiguous chunk of memory */
>> csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT;
>> - if (vhost_vdpa_map(v, iova, csize,
>> + if (vhost_vdpa_map(v, iotlb, iova, csize,
>> map_pfn << PAGE_SHIFT,
>> msg->perm))
>> goto out;
>> @@ -660,11 +662,12 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>> }
>>
>> /* Pin the rest chunk */
>> - ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT,
>> + ret = vhost_vdpa_map(v, iotlb, iova,
>> + (last_pfn - map_pfn + 1) << PAGE_SHIFT,
>> map_pfn << PAGE_SHIFT, msg->perm);
>> out:
>> if (ret) {
>> - vhost_vdpa_unmap(v, msg->iova, msg->size);
>> + vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
>> atomic64_sub(npages, &dev->mm->pinned_vm);
>> }
>> mmap_read_unlock(dev->mm);
>> @@ -678,6 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>> struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>> + struct vhost_iotlb *iotlb = dev->iotlb;
>> int r = 0;
>>
>> r = vhost_dev_check_owner(dev);
>> @@ -686,17 +690,17 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>>
>> switch (msg->type) {
>> case VHOST_IOTLB_UPDATE:
>> - r = vhost_vdpa_process_iotlb_update(v, msg);
>> + r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
>> break;
>> case VHOST_IOTLB_INVALIDATE:
>> - vhost_vdpa_unmap(v, msg->iova, msg->size);
>> + vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
>> break;
>> case VHOST_IOTLB_BATCH_BEGIN:
>> v->in_batch = true;
>> break;
>> case VHOST_IOTLB_BATCH_END:
>> if (v->in_batch && ops->set_map)
>> - ops->set_map(vdpa, dev->iotlb);
>> + ops->set_map(vdpa, iotlb);
>> v->in_batch = false;
>> break;
>> default:
>> --
>> 2.20.1
>>

2020-10-09 04:27:34

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 06/24] vhost-vdpa: switch to use vhost-vdpa specific IOTLB


On 2020/9/30 下午8:02, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:07AM +0800, Jason Wang wrote:
>> To ease the implementation of per group ASID support for vDPA
>> device. This patch switches to use a vhost-vdpa specific IOTLB to
>> avoid the unnecessary refactoring of the vhost core.
>>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> drivers/vhost/vdpa.c | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 74bef1c15a70..ec3c94f706c1 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -40,6 +40,7 @@ struct vhost_vdpa {
>> struct vhost_virtqueue *vqs;
>> struct completion completion;
>> struct vdpa_device *vdpa;
>> + struct vhost_iotlb *iotlb;
>> struct device dev;
>> struct cdev cdev;
>> atomic_t opened;
>> @@ -514,12 +515,11 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
>>
>> static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
>> {
>> - struct vhost_dev *dev = &v->vdev;
>> - struct vhost_iotlb *iotlb = dev->iotlb;
>> + struct vhost_iotlb *iotlb = v->iotlb;
>>
>> vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
>> - kfree(dev->iotlb);
>> - dev->iotlb = NULL;
>> + kfree(v->iotlb);
>> + v->iotlb = NULL;
>> }
>>
>> static int perm_to_iommu_flags(u32 perm)
>> @@ -681,7 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>> struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>> - struct vhost_iotlb *iotlb = dev->iotlb;
>> + struct vhost_iotlb *iotlb = v->iotlb;
>> int r = 0;
>>
>> r = vhost_dev_check_owner(dev);
>> @@ -812,12 +812,14 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>>
>> r = vhost_vdpa_alloc_domain(v);
>> if (r)
>> - goto err_init_iotlb;
>> + goto err_alloc_domain;
> You're still using this:
> dev->iotlb = vhost_iotlb_alloc(0, 0);
>
> Shouldn't you use
> v->iotlb = host_iotlb_alloc(0, 0);
>
> to set the vdpa device iotlb field?


Yes, you're right.

Will fix.

Thanks


>
>>
>> filep->private_data = v;
>>
>> return 0;
>>
>> +err_alloc_domain:
>> + vhost_vdpa_iotlb_free(v);
>> err_init_iotlb:
>> vhost_vdpa_cleanup(v);
>> err:
>> --
>> 2.20.1
>>

2020-10-09 04:30:28

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 09/24] vdpa: multiple address spaces support


On 2020/10/1 下午9:23, Eli Cohen wrote:
>>
>> + /* Only support 1 address space */
>> + if (vdpa->ngroups != 1)
>> + return -ENOTSUPP;
> Checkpatch warning: prefer EOPNOTSUPP
>

Will fix.

Thanks

2020-10-09 06:20:09

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group


On 2020/10/1 下午9:29, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:11AM +0800, Jason Wang wrote:
>> This patch introduces a new bus operation to allow the vDPA bus driver
>> to associate an ASID to a virtqueue group.
>>
> So in case of virtio_net, I would expect that all the data virtqueues
> will be associated with the same address space identifier.


Right.

I will add the codes to do this in the next version. It should be more
explicit than have this assumption by default.


> Moreover,
> this assignment should be provided before the set_map call that provides
> the iotlb for the address space, correct?


I think it's better not have this limitation, note that set_map() now
takes a asid argument.

So for hardware if the associated as is changed, the driver needs to
program the hardware to switch to the new mapping.

Does this work for mlx5?


>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> include/linux/vdpa.h | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>> index 1e1163daa352..e2394995a3cd 100644
>> --- a/include/linux/vdpa.h
>> +++ b/include/linux/vdpa.h
>> @@ -160,6 +160,12 @@ struct vdpa_device {
>> * @get_generation: Get device config generation (optional)
>> * @vdev: vdpa device
>> * Returns u32: device generation
>> + * @set_group_asid: Set address space identifier for a
>> + * virtqueue group
>> + * @vdev: vdpa device
>> + * @group: virtqueue group
>> + * @asid: address space id for this group
>> + * Returns integer: success (0) or error (< 0)
>> * @set_map: Set device memory mapping (optional)
>> * Needed for device that using device
>> * specific DMA translation (on-chip IOMMU)
>> @@ -237,6 +243,10 @@ struct vdpa_config_ops {
>> u64 iova, u64 size, u64 pa, u32 perm);
>> int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
>> u64 iova, u64 size);
>> + int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
>> + unsigned int asid);
>> +
>> +
> Extra space


Will fix.

Thanks


>>
>> /* Free device resources */
>> void (*free)(struct vdpa_device *vdev);
>> --
>> 2.20.1
>>

2020-10-09 06:20:59

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 18/24] vhost-vdpa: support ASID based IOTLB API


On 2020/9/28 下午11:44, Eugenio Perez Martin wrote:
>> - u64 iova, u64 size)
>> +static int vhost_vdpa_unmap(struct vhost_vdpa *v,
>> + struct vhost_iotlb *iotlb,
>> + u64 iova, u64 size)
>> {
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>> + u32 asid = (iotlb);
>> +
>> + if (!iotlb)
>> + return -EINVAL;
> This should be reorder to check for (!iotlb) before use at `asid =
> iotlb_to_asid()`, isn't it?
>
> Thanks!
>

Yes, will fix in the next version.

Thanks

2020-10-09 09:25:36

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 08/24] vdpa: introduce virtqueue groups


On 2020/9/28 下午11:44, Eugenio Perez Martin wrote:
> On Thu, Sep 24, 2020 at 5:23 AM Jason Wang<[email protected]> wrote:
>> This patch introduces virtqueue groups to vDPA device. The virtqueue
>> group is the minimal set of virtqueues that must share an address
>> space. And the adddress space identifier could only be attached to
>> a specific virtqueue group.
>>
>> A new mandated bus operation is introduced to get the virtqueue group
>> ID for a specific virtqueue.
>>
>> All the vDPA device drivers were converted to simply support a single
>> virtqueue group.
>>
>> Signed-off-by: Jason Wang<[email protected]>
>> ---
>> drivers/vdpa/ifcvf/ifcvf_main.c | 9 ++++++++-
>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 +++++++-
>> drivers/vdpa/vdpa.c | 4 +++-
>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 11 ++++++++++-
>> include/linux/vdpa.h | 12 +++++++++---
>> 5 files changed, 37 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
>> index 076d7ac5e723..e6a0be374e51 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
>> @@ -327,6 +327,11 @@ static u32 ifcvf_vdpa_get_vq_align(struct vdpa_device *vdpa_dev)
>> return IFCVF_QUEUE_ALIGNMENT;
>> }
>>
>> +static u32 ifcvf_vdpa_get_vq_group(struct vdpa_device *vdpa, u16 idx)
>> +{
>> + return 0;
>> +}
>> +
>> static void ifcvf_vdpa_get_config(struct vdpa_device *vdpa_dev,
>> unsigned int offset,
>> void *buf, unsigned int len)
>> @@ -387,6 +392,7 @@ static const struct vdpa_config_ops ifc_vdpa_ops = {
>> .get_device_id = ifcvf_vdpa_get_device_id,
>> .get_vendor_id = ifcvf_vdpa_get_vendor_id,
>> .get_vq_align = ifcvf_vdpa_get_vq_align,
>> + .get_vq_group = ifcvf_vdpa_get_vq_group,
>> .get_config = ifcvf_vdpa_get_config,
>> .set_config = ifcvf_vdpa_set_config,
>> .set_config_cb = ifcvf_vdpa_set_config_cb,
>> @@ -434,7 +440,8 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>
>> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
>> dev, &ifc_vdpa_ops,
>> - IFCVF_MAX_QUEUE_PAIRS * 2);
>> + IFCVF_MAX_QUEUE_PAIRS * 2, 1);
>> +
>> if (adapter == NULL) {
>> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
>> return -ENOMEM;
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index 9df69d5efe8c..4e480f4f754e 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1428,6 +1428,11 @@ static u32 mlx5_vdpa_get_vq_align(struct vdpa_device *vdev)
>> return PAGE_SIZE;
>> }
>>
>> +static u32 mlx5_vdpa_get_vq_group(struct vdpa_device *vdpa, u16 idx)
>> +{
>> + return 0;
>> +}
>> +
>> enum { MLX5_VIRTIO_NET_F_GUEST_CSUM = 1 << 9,
>> MLX5_VIRTIO_NET_F_CSUM = 1 << 10,
>> MLX5_VIRTIO_NET_F_HOST_TSO6 = 1 << 11,
>> @@ -1838,6 +1843,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
>> .get_vq_notification = mlx5_get_vq_notification,
>> .get_vq_irq = mlx5_get_vq_irq,
>> .get_vq_align = mlx5_vdpa_get_vq_align,
>> + .get_vq_group = mlx5_vdpa_get_vq_group,
>> .get_features = mlx5_vdpa_get_features,
>> .set_features = mlx5_vdpa_set_features,
>> .set_config_cb = mlx5_vdpa_set_config_cb,
>> @@ -1925,7 +1931,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>> max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>>
>> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
>> - 2 * mlx5_vdpa_max_qps(max_vqs));
>> + 2 * mlx5_vdpa_max_qps(max_vqs), 1);
>> if (IS_ERR(ndev))
>> return ndev;
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>> index a69ffc991e13..46399746ec7c 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -62,6 +62,7 @@ static void vdpa_release_dev(struct device *d)
>> * @parent: the parent device
>> * @config: the bus operations that is supported by this device
>> * @nvqs: number of virtqueues supported by this device
>> + * @ngroups: number of groups supported by this device
> Hi!
>
> Maybe the description of "ngroups" could be "number of*virtqueue*
> groups supported by this device"? I think that it could be needed in
> some contexts reading the code.


Exactly.

Will fix.

Thanks


>
> Thanks!
>

2020-10-09 09:26:18

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 09/24] vdpa: multiple address spaces support


On 2020/10/1 下午9:21, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:10AM +0800, Jason Wang wrote:
>> This patches introduces the multiple address spaces support for vDPA
>> device. This idea is to identify a specific address space via an
>> dedicated identifier - ASID.
>>
>> During vDPA device allocation, vDPA device driver needs to report the
>> number of address spaces supported by the device then the DMA mapping
>> ops of the vDPA device needs to be extended to support ASID.
>>
>> This helps to isolate the DMA among the virtqueues. E.g in the case of
>> virtio-net, the control virtqueue will not be assigned directly to
>> guest.
>>
>> This RFC patch only converts for the device that wants its own
>> IOMMU/DMA translation logic. So it will rejects the device with more
>> that 1 address space that depends on platform IOMMU. The plan to
> This is not apparent from the code. Instead you enforce number of groups
> to 1.


Yes, will fix.


>
>> moving all the DMA mapping logic to the vDPA device driver instead of
>> doing it in vhost-vDPA (otherwise it could result a very complicated
>> APIs and actually vhost-vDPA doesn't care about how the actual
>> composition/emulation were done in the device driver).
>>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
>> drivers/vdpa/vdpa.c | 4 +++-
>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
>> drivers/vhost/vdpa.c | 14 +++++++++-----
>> include/linux/vdpa.h | 23 ++++++++++++++++-------
>> 6 files changed, 38 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
>> index e6a0be374e51..86cdf5f8bcae 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
>> @@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>
>> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
>> dev, &ifc_vdpa_ops,
>> - IFCVF_MAX_QUEUE_PAIRS * 2, 1);
>> + IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);
>>
>> if (adapter == NULL) {
>> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index 4e480f4f754e..db7404e121bf 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
>> return mvdev->generation;
>> }
>>
>> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
>> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
>> + struct vhost_iotlb *iotlb)
>> {
>> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>> struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>> @@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>> max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>>
>> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
>> - 2 * mlx5_vdpa_max_qps(max_vqs), 1);
>> + 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
>> if (IS_ERR(ndev))
>> return ndev;
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>> index 46399746ec7c..05195fa7865d 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
>> * @config: the bus operations that is supported by this device
>> * @nvqs: number of virtqueues supported by this device
>> * @ngroups: number of groups supported by this device
>> + * @nas: number of address spaces supported by this device
>> * @size: size of the parent structure that contains private data
>> *
>> * Driver should use vdpa_alloc_device() wrapper macro instead of
>> @@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
>> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>> const struct vdpa_config_ops *config,
>> int nvqs, unsigned int ngroups,
>> - size_t size)
>> + unsigned int nas, size_t size)
>> {
>> struct vdpa_device *vdev;
>> int err = -EINVAL;
>> @@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>> vdev->features_valid = false;
>> vdev->nvqs = nvqs;
>> vdev->ngroups = ngroups;
>> + vdev->nas = nas;
>>
>> err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
>> if (err)
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> index 6669c561bc6e..5dc04ec271bb 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> @@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
>> ops = &vdpasim_net_config_ops;
>>
>> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
>> - VDPASIM_VQ_NUM, 1);
>> + VDPASIM_VQ_NUM, 1, 1);
>> if (!vdpasim)
>> goto err_alloc;
>>
>> @@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
>> return vdpasim->generation;
>> }
>>
>> -static int vdpasim_set_map(struct vdpa_device *vdpa,
>> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
>> struct vhost_iotlb *iotlb)
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>> @@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
>> return ret;
>> }
>>
>> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
>> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
>> + u64 iova, u64 size,
>> u64 pa, u32 perm)
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>> @@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
>> return ret;
>> }
>>
>> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
>> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
>> + u64 iova, u64 size)
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index ec3c94f706c1..eeefcd971e3f 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>> return r;
>>
>> if (ops->dma_map) {
>> - r = ops->dma_map(vdpa, iova, size, pa, perm);
>> + r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
>> } else if (ops->set_map) {
>> if (!v->in_batch)
>> - r = ops->set_map(vdpa, iotlb);
>> + r = ops->set_map(vdpa, 0, iotlb);
>> } else {
>> r = iommu_map(v->domain, iova, pa, size,
>> perm_to_iommu_flags(perm));
>> @@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
>> vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>>
>> if (ops->dma_map) {
>> - ops->dma_unmap(vdpa, iova, size);
>> + ops->dma_unmap(vdpa, 0, iova, size);
>> } else if (ops->set_map) {
>> if (!v->in_batch)
>> - ops->set_map(vdpa, iotlb);
>> + ops->set_map(vdpa, 0, iotlb);
>> } else {
>> iommu_unmap(v->domain, iova, size);
>> }
>> @@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>> break;
>> case VHOST_IOTLB_BATCH_END:
>> if (v->in_batch && ops->set_map)
>> - ops->set_map(vdpa, iotlb);
>> + ops->set_map(vdpa, 0, iotlb);
>> v->in_batch = false;
>> break;
>> default:
>> @@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>> int minor;
>> int r;
>>
>> + /* Only support 1 address space */
>> + if (vdpa->ngroups != 1)
>> + return -ENOTSUPP;
>> +
> Did you mean to check agains vdpa->nas?


I think we should check both.

Thanks


>
>> /* Currently, we only accept the network devices. */
>> if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
>> return -ENOTSUPP;
>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>> index d829512efd27..1e1163daa352 100644
>> --- a/include/linux/vdpa.h
>> +++ b/include/linux/vdpa.h
>> @@ -43,6 +43,8 @@ struct vdpa_vq_state {
>> * @index: device index
>> * @features_valid: were features initialized? for legacy guests
>> * @nvqs: the number of virtqueues
>> + * @ngroups: the number of virtqueue groups
>> + * @nas: the number of address spaces
>> */
>> struct vdpa_device {
>> struct device dev;
>> @@ -52,6 +54,7 @@ struct vdpa_device {
>> bool features_valid;
>> int nvqs;
>> unsigned int ngroups;
>> + unsigned int nas;
>> };
>>
>> /**
>> @@ -161,6 +164,7 @@ struct vdpa_device {
>> * Needed for device that using device
>> * specific DMA translation (on-chip IOMMU)
>> * @vdev: vdpa device
>> + * @asid: address space identifier
>> * @iotlb: vhost memory mapping to be
>> * used by the vDPA
>> * Returns integer: success (0) or error (< 0)
>> @@ -169,6 +173,7 @@ struct vdpa_device {
>> * specific DMA translation (on-chip IOMMU)
>> * and preferring incremental map.
>> * @vdev: vdpa device
>> + * @asid: address space identifier
>> * @iova: iova to be mapped
>> * @size: size of the area
>> * @pa: physical address for the map
>> @@ -180,6 +185,7 @@ struct vdpa_device {
>> * specific DMA translation (on-chip IOMMU)
>> * and preferring incremental unmap.
>> * @vdev: vdpa device
>> + * @asid: address space identifier
>> * @iova: iova to be unmapped
>> * @size: size of the area
>> * Returns integer: success (0) or error (< 0)
>> @@ -225,10 +231,12 @@ struct vdpa_config_ops {
>> u32 (*get_generation)(struct vdpa_device *vdev);
>>
>> /* DMA ops */
>> - int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
>> - int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
>> - u64 pa, u32 perm);
>> - int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
>> + int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
>> + struct vhost_iotlb *iotlb);
>> + int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
>> + u64 iova, u64 size, u64 pa, u32 perm);
>> + int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
>> + u64 iova, u64 size);
>>
>> /* Free device resources */
>> void (*free)(struct vdpa_device *vdev);
>> @@ -237,11 +245,12 @@ struct vdpa_config_ops {
>> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>> const struct vdpa_config_ops *config,
>> int nvqs, unsigned int ngroups,
>> - size_t size);
>> + unsigned int nas, size_t size);
>>
>> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
>> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
>> + ngroups, nas) \
>> container_of(__vdpa_alloc_device( \
>> - parent, config, nvqs, ngroups, \
>> + parent, config, nvqs, ngroups, nas, \
>> sizeof(dev_struct) + \
>> BUILD_BUG_ON_ZERO(offsetof( \
>> dev_struct, member))), \
>> --
>> 2.20.1
>>

2020-10-12 07:01:21

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group

On Fri, Oct 09, 2020 at 11:56:45AM +0800, Jason Wang wrote:
>
> On 2020/10/1 下午9:29, Eli Cohen wrote:
> > On Thu, Sep 24, 2020 at 11:21:11AM +0800, Jason Wang wrote:
> > > This patch introduces a new bus operation to allow the vDPA bus driver
> > > to associate an ASID to a virtqueue group.
> > >
> > So in case of virtio_net, I would expect that all the data virtqueues
> > will be associated with the same address space identifier.
>
>
> Right.
>
> I will add the codes to do this in the next version. It should be more
> explicit than have this assumption by default.
>
>
> > Moreover,
> > this assignment should be provided before the set_map call that provides
> > the iotlb for the address space, correct?
>
>
> I think it's better not have this limitation, note that set_map() now takes
> a asid argument.
>
> So for hardware if the associated as is changed, the driver needs to program
> the hardware to switch to the new mapping.
>
> Does this work for mlx5?
>

So in theory we can have several asid's (for different virtqueues), each
one should be followed by a specific set_map call. If this is so, how do
I know if I met all the conditions run my driver? Maybe we need another
callback to let the driver know it should not expect more set_maps().

2020-10-12 08:18:04

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group


On 2020/10/12 下午2:59, Eli Cohen wrote:
> On Fri, Oct 09, 2020 at 11:56:45AM +0800, Jason Wang wrote:
>> On 2020/10/1 下午9:29, Eli Cohen wrote:
>>> On Thu, Sep 24, 2020 at 11:21:11AM +0800, Jason Wang wrote:
>>>> This patch introduces a new bus operation to allow the vDPA bus driver
>>>> to associate an ASID to a virtqueue group.
>>>>
>>> So in case of virtio_net, I would expect that all the data virtqueues
>>> will be associated with the same address space identifier.
>>
>> Right.
>>
>> I will add the codes to do this in the next version. It should be more
>> explicit than have this assumption by default.
>>
>>
>>> Moreover,
>>> this assignment should be provided before the set_map call that provides
>>> the iotlb for the address space, correct?
>>
>> I think it's better not have this limitation, note that set_map() now takes
>> a asid argument.
>>
>> So for hardware if the associated as is changed, the driver needs to program
>> the hardware to switch to the new mapping.
>>
>> Does this work for mlx5?
>>
> So in theory we can have several asid's (for different virtqueues), each
> one should be followed by a specific set_map call. If this is so, how do
> I know if I met all the conditions run my driver? Maybe we need another
> callback to let the driver know it should not expect more set_maps().


This should work similarly as in the past. Two parts of the work is
expected to be done by the driver:

1) store the mapping somewhere (e.g hardware) during set_map()
2) associating mapping with a specific virtqueue

The only difference is that more than one mapping is used now.

For the issue of more set_maps(), driver should be always ready for the
new set_maps() call instead of not expecting new set_maps() since guest
memory topology could be changed due to several reasons.

Qemu or vhost-vDPA will try their best to avoid the frequency of
set_maps() for better performance (e.g through batched IOTLB updating).
E.g there should be at most one set_map() during one time of guest booting.

Thanks


>

2020-10-12 20:05:13

by Eli Cohen

[permalink] [raw]
Subject: Re: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group

On Mon, Oct 12, 2020 at 03:45:10PM +0800, Jason Wang wrote:
> > >
> > So in theory we can have several asid's (for different virtqueues), each
> > one should be followed by a specific set_map call. If this is so, how do
> > I know if I met all the conditions run my driver? Maybe we need another
> > callback to let the driver know it should not expect more set_maps().
>
>
> This should work similarly as in the past. Two parts of the work is expected
> to be done by the driver:
>
> 1) store the mapping somewhere (e.g hardware) during set_map()
> 2) associating mapping with a specific virtqueue
>
> The only difference is that more than one mapping is used now.

ok, so like today, I will always get DRIVER_OK after I got all the
set_maps(), right?

>
> For the issue of more set_maps(), driver should be always ready for the new
> set_maps() call instead of not expecting new set_maps() since guest memory
> topology could be changed due to several reasons.
>
> Qemu or vhost-vDPA will try their best to avoid the frequency of set_maps()
> for better performance (e.g through batched IOTLB updating). E.g there
> should be at most one set_map() during one time of guest booting.
>
>
> >
>

2020-10-13 11:33:50

by Jason Wang

[permalink] [raw]
Subject: Re: [RFC PATCH 10/24] vdpa: introduce config operations for associating ASID to a virtqueue group


On 2020/10/12 下午4:17, Eli Cohen wrote:
> On Mon, Oct 12, 2020 at 03:45:10PM +0800, Jason Wang wrote:
>>> So in theory we can have several asid's (for different virtqueues), each
>>> one should be followed by a specific set_map call. If this is so, how do
>>> I know if I met all the conditions run my driver? Maybe we need another
>>> callback to let the driver know it should not expect more set_maps().
>>
>> This should work similarly as in the past. Two parts of the work is expected
>> to be done by the driver:
>>
>> 1) store the mapping somewhere (e.g hardware) during set_map()
>> 2) associating mapping with a specific virtqueue
>>
>> The only difference is that more than one mapping is used now.
> ok, so like today, I will always get DRIVER_OK after I got all the
> set_maps(), right?


Yes.

Thanks


>
>> For the issue of more set_maps(), driver should be always ready for the new
>> set_maps() call instead of not expecting new set_maps() since guest memory
>> topology could be changed due to several reasons.
>>
>> Qemu or vhost-vDPA will try their best to avoid the frequency of set_maps()
>> for better performance (e.g through batched IOTLB updating). E.g there
>> should be at most one set_map() during one time of guest booting.
>>
>>