2022-02-24 23:10:22

by Gautam Dawar

Subject: [RFC PATCH v2 00/19] Control VQ support in vDPA

Hi All:

This series tries to add support for the control virtqueue in vDPA.

The control virtqueue is used by networking devices to accept various
commands from the driver. It is a must for supporting multiqueue and
other configurations.

When used by the vhost-vDPA bus driver for a VM, the control virtqueue
should be shadowed via the userspace VMM (Qemu) instead of being
assigned directly to the guest. This is because Qemu needs to know the
device state in order to start and stop the device correctly (e.g. for
live migration).

This requires isolating the memory mapping for the control virtqueue
presented by vhost-vDPA, to prevent the guest from accessing it
directly.

To achieve this, vDPA introduces two new abstractions:

- address space: identified through an address space id (ASID), in
which a set of memory mappings is maintained
- virtqueue group: the minimal set of virtqueues that must share an
address space

The device needs to advertise the following attributes to vDPA:

- the number of address spaces supported by the device
- the number of virtqueue groups supported by the device
- the mapping from a specific virtqueue to its virtqueue group

The mapping from virtqueues to virtqueue groups is fixed and defined
by the vDPA device driver. E.g:

- A device that has hardware ASID support can simply advertise a
per-virtqueue virtqueue group.
- A device that does not have hardware ASID support can simply
advertise a single virtqueue group that contains all virtqueues. Or,
if it wants a software emulated control virtqueue, it can advertise
two virtqueue groups: one for the cvq, another for the rest of the
virtqueues (see the sketch below).
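
For instance, a minimal sketch of the second case (the function name is
hypothetical; the indexing mirrors the simulator change in patch 19,
where the cvq is the last virtqueue):

static u32 example_get_vq_group(struct vdpa_device *vdev, u16 idx)
{
	/* Data virtqueues (0, 1) share group 0; the cvq (2) gets its
	 * own group 1 so it can later be bound to a dedicated ASID.
	 */
	return idx == 2 ? 1 : 0;
}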

vDPA also allows changing the association between a virtqueue group
and an address space. So in the case of the control virtqueue, the
userspace VMM (Qemu) may use a dedicated address space for the control
virtqueue group to isolate its memory mapping.

vhost/vhost-vDPA is also extended for userspace to:

- query the number of virtqueue groups and address spaces supported by
the device
- query the virtqueue group for a specific virtqueue
- associate a virtqueue group with an address space
- send ASID-based IOTLB commands

This will help the userspace VMM (Qemu) detect whether the control vq
can be supported and isolate the memory mappings of the control
virtqueue from the others. A rough sketch of the resulting userspace
flow follows.
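
A sketch of that flow (assuming the new ioctls from patches 11-14; the
use of struct vhost_vring_state with index = group and num = ASID for
the group ioctls is illustrative and may differ from the final uAPI):

#include <sys/ioctl.h>
#include <linux/vhost.h>

static int isolate_cvq(int vhost_vdpa_fd, unsigned int cvq_index)
{
	struct vhost_vring_state s;
	unsigned int ngroups, nas;

	if (ioctl(vhost_vdpa_fd, VHOST_VDPA_GET_GROUP_NUM, &ngroups) ||
	    ioctl(vhost_vdpa_fd, VHOST_VDPA_GET_AS_NUM, &nas))
		return -1;
	if (ngroups < 2 || nas < 2)
		return -1;		/* cvq cannot be isolated */

	s.index = cvq_index;		/* which group holds the cvq? */
	if (ioctl(vhost_vdpa_fd, VHOST_VDPA_GET_VRING_GROUP, &s))
		return -1;

	s.index = s.num;		/* group id returned in .num */
	s.num = 1;			/* dedicated ASID for the cvq */
	return ioctl(vhost_vdpa_fd, VHOST_VDPA_SET_GROUP_ASID, &s);
}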

To demonstrate the usage, the vDPA simulator is extended to support
setting the MAC address via an emulated control virtqueue.

Please review.

Changes since v1:

- Rebased the v1 patch series on the vhost branch of MST's vhost git repo:
git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
- Updated to accommodate the vdpa_sim changes from the monolithic
module used by the v1 patch series to the current modularized, class
(net, block) based approach.
- Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
propagated them from vdpa_sim_net to vdpa_sim
- Widened the data type of the "asid" member of vhost_msg_v2 to __u32
to accommodate PASID (see the sketch after this list)
- Fixed the buildbot warnings
- Resolved all checkpatch.pl errors and warnings
- Tested both control and datapath with Xilinx Smartnic SN1000 series
device using QEMU implementing the Shadow virtqueue and support for
VQ groups and ASID available at:
github.com/eugpermar/qemu/releases/tag/vdpa_sw_live_migration.d%2F
asid_groups-v1.d%2F00
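
A sketch of how an ASID-tagged IOTLB update is issued through the
widened vhost_msg_v2 (field names per this series; the values are
illustrative only):

	struct vhost_msg_v2 msg = {
		.type = VHOST_IOTLB_MSG_V2,
		.asid = 1,			/* target address space */
		.iotlb = {
			.iova = iova,
			.size = size,
			.uaddr = (uint64_t)buf,	/* userspace VA to map */
			.perm = VHOST_ACCESS_RW,
			.type = VHOST_IOTLB_UPDATE,
		},
	};
	/* writing to the vhost-vdpa fd installs the mapping */
	write(vhost_vdpa_fd, &msg, sizeof(msg));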

Changes since RFC:

- tweak vhost uAPI documentation
- actually switch to using the device-specific IOTLB in patch 4
- tweak the commit log
- fix the claim that the ASID in vhost is 32-bit when it was actually
16-bit
- fix use after free when using ASID with IOTLB batching requests
- switch to using Stefano's patch for separated iov
- remove unused "used_as" variable
- fix the iotlb/asid checking in vhost_vdpa_unmap()

Thanks

Gautam Dawar (19):
vhost: move the backend feature bits to vhost_types.h
virtio-vdpa: don't set callback if virtio doesn't need it
vhost-vdpa: passing iotlb to IOMMU mapping helpers
vhost-vdpa: switch to use vhost-vdpa specific IOTLB
vdpa: introduce virtqueue groups
vdpa: multiple address spaces support
vdpa: introduce config operations for associating ASID to a virtqueue
group
vhost_iotlb: split out IOTLB initialization
vhost: support ASID in IOTLB API
vhost-vdpa: introduce asid based IOTLB
vhost-vdpa: introduce uAPI to get the number of virtqueue groups
vhost-vdpa: introduce uAPI to get the number of address spaces
vhost-vdpa: uAPI to get virtqueue group id
vhost-vdpa: introduce uAPI to set group ASID
vhost-vdpa: support ASID based IOTLB API
vdpa_sim: advertise VIRTIO_NET_F_MTU
vdpa_sim: factor out buffer completion logic
vdpa_sim: filter destination mac address
vdpasim: control virtqueue support

drivers/vdpa/ifcvf/ifcvf_main.c | 8 +-
drivers/vdpa/mlx5/net/mlx5_vnet.c | 11 +-
drivers/vdpa/vdpa.c | 5 +
drivers/vdpa/vdpa_sim/vdpa_sim.c | 100 ++++++++--
drivers/vdpa/vdpa_sim/vdpa_sim.h | 3 +
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 169 +++++++++++++----
drivers/vhost/iotlb.c | 23 ++-
drivers/vhost/vdpa.c | 272 +++++++++++++++++++++------
drivers/vhost/vhost.c | 23 ++-
drivers/vhost/vhost.h | 4 +-
drivers/virtio/virtio_vdpa.c | 2 +-
include/linux/vdpa.h | 46 ++++-
include/linux/vhost_iotlb.h | 2 +
include/uapi/linux/vhost.h | 25 ++-
include/uapi/linux/vhost_types.h | 11 +-
15 files changed, 566 insertions(+), 138 deletions(-)

--
2.25.0


2022-02-24 23:23:34

by Gautam Dawar

Subject: [RFC PATCH v2 06/19] vdpa: multiple address spaces support

This patch introduces multiple address spaces support for vDPA
devices. The idea is to identify a specific address space via a
dedicated identifier - ASID.

During vDPA device allocation, the vDPA device driver needs to report
the number of address spaces supported by the device, and the DMA
mapping ops of the vDPA device need to be extended to support ASID.

This helps to isolate the environments for virtqueues that will not
be assigned directly. E.g. in the case of virtio-net, the control
virtqueue will not be assigned directly to the guest.

As a start, simply claim 1 virtqueue group and 1 address space for
all vDPA devices, and have vhost-vDPA simply reject devices with more
than 1 virtqueue group or address space (a hypothetical multi-AS
allocation is sketched below).
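
A device driver that did support isolation would pass larger values to
the allocation helper; a minimal sketch (the adapter/ops names are made
up; the counts mirror the simulator change in patch 19):

	/* two virtqueue groups and two address spaces, so the cvq
	 * group can later be bound to its own ASID */
	adapter = vdpa_alloc_device(struct my_adapter, vdpa, dev,
				    &my_vdpa_ops, 2, 2, name, false);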

Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Gautam Dawar <[email protected]>
---
drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
drivers/vdpa/vdpa.c | 4 +++-
drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
drivers/vhost/vdpa.c | 14 +++++++++-----
include/linux/vdpa.h | 28 +++++++++++++++++++---------
6 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index c815a2e62440..a4815c5612f9 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -513,7 +513,7 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
pdev = ifcvf_mgmt_dev->pdev;
dev = &pdev->dev;
adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
- dev, &ifc_vdpa_ops, 1, name, false);
+ dev, &ifc_vdpa_ops, 1, 1, name, false);
if (IS_ERR(adapter)) {
IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
return PTR_ERR(adapter);
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index fcfc28460b72..a76417892ef3 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -2282,7 +2282,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
return mvdev->generation;
}

-static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
+static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
+ struct vhost_iotlb *iotlb)
{
struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
bool change_map;
@@ -2581,7 +2582,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
}

ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
- 1, name, false);
+ 1, 1, name, false);
if (IS_ERR(ndev))
return PTR_ERR(ndev);

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index a07bf0130559..1793dc12b208 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -160,6 +160,7 @@ static void vdpa_release_dev(struct device *d)
* @parent: the parent device
* @config: the bus operations that is supported by this device
* @ngroups: number of groups supported by this device
+ * @nas: number of address spaces supported by this device
* @size: size of the parent structure that contains private data
* @name: name of the vdpa device; optional.
* @use_va: indicate whether virtual address must be used by this device
@@ -172,7 +173,7 @@ static void vdpa_release_dev(struct device *d)
*/
struct vdpa_device *__vdpa_alloc_device(struct device *parent,
const struct vdpa_config_ops *config,
- unsigned int ngroups,
+ unsigned int ngroups, unsigned int nas,
size_t size, const char *name,
bool use_va)
{
@@ -206,6 +207,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
vdev->features_valid = false;
vdev->use_va = use_va;
vdev->ngroups = ngroups;
+ vdev->nas = nas;

if (name)
err = dev_set_name(&vdev->dev, "%s", name);
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index c98cb1f869fa..659e2e2e4b0c 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -251,7 +251,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
ops = &vdpasim_config_ops;

vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
- dev_attr->name, false);
+ 1, dev_attr->name, false);
if (IS_ERR(vdpasim)) {
ret = PTR_ERR(vdpasim);
goto err_alloc;
@@ -539,7 +539,7 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
return range;
}

-static int vdpasim_set_map(struct vdpa_device *vdpa,
+static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
struct vhost_iotlb *iotlb)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
@@ -566,7 +566,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
return ret;
}

-static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
+static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
+ u64 iova, u64 size,
u64 pa, u32 perm, void *opaque)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
@@ -580,7 +581,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
return ret;
}

-static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
+static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
+ u64 iova, u64 size)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 655ff7029401..6bf755f84d26 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -599,10 +599,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
return r;

if (ops->dma_map) {
- r = ops->dma_map(vdpa, iova, size, pa, perm, opaque);
+ r = ops->dma_map(vdpa, 0, iova, size, pa, perm, opaque);
} else if (ops->set_map) {
if (!v->in_batch)
- r = ops->set_map(vdpa, iotlb);
+ r = ops->set_map(vdpa, 0, iotlb);
} else {
r = iommu_map(v->domain, iova, pa, size,
perm_to_iommu_flags(perm));
@@ -628,10 +628,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);

if (ops->dma_map) {
- ops->dma_unmap(vdpa, iova, size);
+ ops->dma_unmap(vdpa, 0, iova, size);
} else if (ops->set_map) {
if (!v->in_batch)
- ops->set_map(vdpa, iotlb);
+ ops->set_map(vdpa, 0, iotlb);
} else {
iommu_unmap(v->domain, iova, size);
}
@@ -863,7 +863,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
break;
case VHOST_IOTLB_BATCH_END:
if (v->in_batch && ops->set_map)
- ops->set_map(vdpa, iotlb);
+ ops->set_map(vdpa, 0, iotlb);
v->in_batch = false;
break;
default:
@@ -1128,6 +1128,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
int minor;
int r;

+ /* Only support 1 address space and 1 group */
+ if (vdpa->ngroups != 1 || vdpa->nas != 1)
+ return -EOPNOTSUPP;
+
v = kzalloc(sizeof(*v), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (!v)
return -ENOMEM;
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 026b7ad72ed7..de22ca1a8ef3 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -69,6 +69,8 @@ struct vdpa_mgmt_dev;
* @cf_mutex: Protects get and set access to configuration layout.
* @index: device index
* @features_valid: were features initialized? for legacy guests
+ * @ngroups: the number of virtqueue groups
+ * @nas: the number of address spaces
* @use_va: indicate whether virtual address must be used by this device
* @nvqs: maximum number of supported virtqueues
* @mdev: management device pointer; caller must setup when registering device as part
@@ -86,6 +88,7 @@ struct vdpa_device {
int nvqs;
struct vdpa_mgmt_dev *mdev;
unsigned int ngroups;
+ unsigned int nas;
};

/**
@@ -240,6 +243,7 @@ struct vdpa_map_file {
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
* @vdev: vdpa device
+ * @asid: address space identifier
* @iotlb: vhost memory mapping to be
* used by the vDPA
* Returns integer: success (0) or error (< 0)
@@ -248,6 +252,7 @@ struct vdpa_map_file {
* specific DMA translation (on-chip IOMMU)
* and preferring incremental map.
* @vdev: vdpa device
+ * @asid: address space identifier
* @iova: iova to be mapped
* @size: size of the area
* @pa: physical address for the map
@@ -259,6 +264,7 @@ struct vdpa_map_file {
* specific DMA translation (on-chip IOMMU)
* and preferring incremental unmap.
* @vdev: vdpa device
+ * @asid: address space identifier
* @iova: iova to be unmapped
* @size: size of the area
* Returns integer: success (0) or error (< 0)
@@ -309,10 +315,12 @@ struct vdpa_config_ops {
struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);

/* DMA ops */
- int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
- int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
- u64 pa, u32 perm, void *opaque);
- int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
+ int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
+ struct vhost_iotlb *iotlb);
+ int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
+ u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
+ int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
+ u64 iova, u64 size);

/* Free device resources */
void (*free)(struct vdpa_device *vdev);
@@ -320,7 +328,7 @@ struct vdpa_config_ops {

struct vdpa_device *__vdpa_alloc_device(struct device *parent,
const struct vdpa_config_ops *config,
- unsigned int ngroups,
+ unsigned int ngroups, unsigned int nas,
size_t size, const char *name,
bool use_va);

@@ -332,17 +340,19 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
* @parent: the parent device
* @config: the bus operations that is supported by this device
* @ngroups: the number of virtqueue groups supported by this device
+ * @nas: the number of address spaces
* @name: name of the vdpa device
* @use_va: indicate whether virtual address must be used by this device
*
* Return allocated data structure or ERR_PTR upon error
*/
-#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, name, use_va) \
+#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, nas, \
+ name, use_va) \
container_of((__vdpa_alloc_device( \
- parent, config, ngroups, \
- sizeof(dev_struct) + \
+ parent, config, ngroups, nas, \
+ (sizeof(dev_struct) + \
BUILD_BUG_ON_ZERO(offsetof( \
- dev_struct, member)), name, use_va)), \
+ dev_struct, member))), name, use_va)), \
dev_struct, member)

int vdpa_register_device(struct vdpa_device *vdev, int nvqs);
--
2.25.0

2022-02-25 00:11:13

by Gautam Dawar

Subject: [RFC PATCH v2 16/19] vdpa_sim: advertise VIRTIO_NET_F_MTU

We've already reported the maximum MTU via the config space, so let's
advertise the feature.

Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Gautam Dawar <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index d5324f6fd8c7..ff22cc56f40b 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -26,7 +26,8 @@
#define DRV_LICENSE "GPL v2"

#define VDPASIM_NET_FEATURES (VDPASIM_FEATURES | \
- (1ULL << VIRTIO_NET_F_MAC))
+ (1ULL << VIRTIO_NET_F_MAC) | \
+ (1ULL << VIRTIO_NET_F_MTU));

#define VDPASIM_NET_VQ_NUM 2

--
2.25.0

2022-02-25 00:23:27

by Gautam Dawar

Subject: [RFC PATCH v2 03/19] vhost-vdpa: passing iotlb to IOMMU mapping helpers

To prepare for the ASID support in vhost-vdpa, pass the IOTLB object
to the DMA helpers. No functional changes; this is just a preparation
for supporting multiple IOTLBs.

Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Gautam Dawar <[email protected]>
---
drivers/vhost/vdpa.c | 67 ++++++++++++++++++++++++--------------------
1 file changed, 37 insertions(+), 30 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851539807bc9..146911082514 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -503,10 +503,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
return r;
}

-static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
+static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 start, u64 last)
{
struct vhost_dev *dev = &v->vdev;
- struct vhost_iotlb *iotlb = dev->iotlb;
struct vhost_iotlb_map *map;
struct page *page;
unsigned long pfn, pinned;
@@ -525,10 +526,10 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
}
}

-static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, u64 start, u64 last)
+static void vhost_vdpa_va_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 start, u64 last)
{
- struct vhost_dev *dev = &v->vdev;
- struct vhost_iotlb *iotlb = dev->iotlb;
struct vhost_iotlb_map *map;
struct vdpa_map_file *map_file;

@@ -540,21 +541,24 @@ static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, u64 start, u64 last)
}
}

-static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
+static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 start, u64 last)
{
struct vdpa_device *vdpa = v->vdpa;

if (vdpa->use_va)
- return vhost_vdpa_va_unmap(v, start, last);
+ return vhost_vdpa_va_unmap(v, iotlb, start, last);

- return vhost_vdpa_pa_unmap(v, start, last);
+ return vhost_vdpa_pa_unmap(v, iotlb, start, last);
}

static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
{
struct vhost_dev *dev = &v->vdev;
+ struct vhost_iotlb *iotlb = dev->iotlb;

- vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
+ vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
kfree(dev->iotlb);
dev->iotlb = NULL;
}
@@ -581,15 +585,15 @@ static int perm_to_iommu_flags(u32 perm)
return flags | IOMMU_CACHE;
}

-static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
- u64 size, u64 pa, u32 perm, void *opaque)
+static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
+ u64 iova, u64 size, u64 pa, u32 perm, void *opaque)
{
struct vhost_dev *dev = &v->vdev;
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
int r = 0;

- r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
+ r = vhost_iotlb_add_range_ctx(iotlb, iova, iova + size - 1,
pa, perm, opaque);
if (r)
return r;
@@ -598,13 +602,13 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
r = ops->dma_map(vdpa, iova, size, pa, perm, opaque);
} else if (ops->set_map) {
if (!v->in_batch)
- r = ops->set_map(vdpa, dev->iotlb);
+ r = ops->set_map(vdpa, iotlb);
} else {
r = iommu_map(v->domain, iova, pa, size,
perm_to_iommu_flags(perm));
}
if (r) {
- vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
+ vhost_iotlb_del_range(iotlb, iova, iova + size - 1);
return r;
}

@@ -614,25 +618,27 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
return 0;
}

-static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)
+static void vhost_vdpa_unmap(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
+ u64 iova, u64 size)
{
- struct vhost_dev *dev = &v->vdev;
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;

- vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);
+ vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);

if (ops->dma_map) {
ops->dma_unmap(vdpa, iova, size);
} else if (ops->set_map) {
if (!v->in_batch)
- ops->set_map(vdpa, dev->iotlb);
+ ops->set_map(vdpa, iotlb);
} else {
iommu_unmap(v->domain, iova, size);
}
}

static int vhost_vdpa_va_map(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
u64 iova, u64 size, u64 uaddr, u32 perm)
{
struct vhost_dev *dev = &v->vdev;
@@ -662,7 +668,7 @@ static int vhost_vdpa_va_map(struct vhost_vdpa *v,
offset = (vma->vm_pgoff << PAGE_SHIFT) + uaddr - vma->vm_start;
map_file->offset = offset;
map_file->file = get_file(vma->vm_file);
- ret = vhost_vdpa_map(v, map_iova, map_size, uaddr,
+ ret = vhost_vdpa_map(v, iotlb, map_iova, map_size, uaddr,
perm, map_file);
if (ret) {
fput(map_file->file);
@@ -675,7 +681,7 @@ static int vhost_vdpa_va_map(struct vhost_vdpa *v,
map_iova += map_size;
}
if (ret)
- vhost_vdpa_unmap(v, iova, map_iova - iova);
+ vhost_vdpa_unmap(v, iotlb, iova, map_iova - iova);

mmap_read_unlock(dev->mm);

@@ -683,6 +689,7 @@ static int vhost_vdpa_va_map(struct vhost_vdpa *v,
}

static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
u64 iova, u64 size, u64 uaddr, u32 perm)
{
struct vhost_dev *dev = &v->vdev;
@@ -746,7 +753,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
if (last_pfn && (this_pfn != last_pfn + 1)) {
/* Pin a contiguous chunk of memory */
csize = PFN_PHYS(last_pfn - map_pfn + 1);
- ret = vhost_vdpa_map(v, iova, csize,
+ ret = vhost_vdpa_map(v, iotlb, iova, csize,
PFN_PHYS(map_pfn),
perm, NULL);
if (ret) {
@@ -776,7 +783,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
}

/* Pin the rest chunk */
- ret = vhost_vdpa_map(v, iova, PFN_PHYS(last_pfn - map_pfn + 1),
+ ret = vhost_vdpa_map(v, iotlb, iova, PFN_PHYS(last_pfn - map_pfn + 1),
PFN_PHYS(map_pfn), perm, NULL);
out:
if (ret) {
@@ -796,7 +803,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
for (pfn = map_pfn; pfn <= last_pfn; pfn++)
unpin_user_page(pfn_to_page(pfn));
}
- vhost_vdpa_unmap(v, start, size);
+ vhost_vdpa_unmap(v, iotlb, start, size);
}
unlock:
mmap_read_unlock(dev->mm);
@@ -807,11 +814,10 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
}

static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
+ struct vhost_iotlb *iotlb,
struct vhost_iotlb_msg *msg)
{
- struct vhost_dev *dev = &v->vdev;
struct vdpa_device *vdpa = v->vdpa;
- struct vhost_iotlb *iotlb = dev->iotlb;

if (msg->iova < v->range.first || !msg->size ||
msg->iova > U64_MAX - msg->size + 1 ||
@@ -823,10 +829,10 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
return -EEXIST;

if (vdpa->use_va)
- return vhost_vdpa_va_map(v, msg->iova, msg->size,
+ return vhost_vdpa_va_map(v, iotlb, msg->iova, msg->size,
msg->uaddr, msg->perm);

- return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr,
+ return vhost_vdpa_pa_map(v, iotlb, msg->iova, msg->size, msg->uaddr,
msg->perm);
}

@@ -836,6 +842,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
+ struct vhost_iotlb *iotlb = dev->iotlb;
int r = 0;

mutex_lock(&dev->mutex);
@@ -846,17 +853,17 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,

switch (msg->type) {
case VHOST_IOTLB_UPDATE:
- r = vhost_vdpa_process_iotlb_update(v, msg);
+ r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
break;
case VHOST_IOTLB_INVALIDATE:
- vhost_vdpa_unmap(v, msg->iova, msg->size);
+ vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
break;
case VHOST_IOTLB_BATCH_BEGIN:
v->in_batch = true;
break;
case VHOST_IOTLB_BATCH_END:
if (v->in_batch && ops->set_map)
- ops->set_map(vdpa, dev->iotlb);
+ ops->set_map(vdpa, iotlb);
v->in_batch = false;
break;
default:
--
2.25.0

2022-02-25 09:30:43

by Gautam Dawar

Subject: [RFC PATCH v2 02/19] virtio-vdpa: don't set callback if virtio doesn't need it

There's no need to set a callback for a driver that doesn't care
about it.

Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Gautam Dawar <[email protected]>
---
drivers/virtio/virtio_vdpa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 7767a7f0119b..bc74ba5fa717 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -184,7 +184,7 @@ virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index,
}

/* Setup virtqueue callback */
- cb.callback = virtio_vdpa_virtqueue_cb;
+ cb.callback = callback ? virtio_vdpa_virtqueue_cb : NULL;
cb.private = info;
ops->set_vq_cb(vdpa, index, &cb);
ops->set_vq_num(vdpa, index, virtqueue_get_vring_size(vq));
--
2.25.0

2022-02-25 12:09:34

by Gautam Dawar

Subject: [RFC PATCH v2 19/19] vdpasim: control virtqueue support

This patch introduces control virtqueue support for the vDPA
simulator. This is a requirement for supporting advanced features like
multiqueue.

A requirement for the control virtqueue is to isolate its memory
access from the rx/tx virtqueues. This is because when the vDPA device
is used by a VM, the control virtqueue is not directly assigned to the
VM. Userspace (Qemu) will present a shadow control virtqueue in order
to record the device state.

The isolation is done via the virtqueue group and ASID support in
vDPA through vhost-vdpa. The simulator is extended to have:

1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue)
2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1
contains CVQ
3) two address spaces; the simulator simply implements the address
spaces by mapping each 1:1 to an IOTLB

For the VM use case, userspace (Qemu) may set AS 0 to group 0 and AS 1
to group 1. So we have:

1) The IOTLB for virtqueue group 0 contains the mappings of the guest,
so RX and TX can be assigned to the guest directly.
2) The IOTLB for virtqueue group 1 contains the mappings of the CVQ,
which are buffers allocated and managed by the VMM only. So the CVQ of
vhost-vdpa is visible to the VMM only, and the guest cannot access it.

For other use cases, AS 0 is associated with all virtqueue groups by
default, so all virtqueues share the same mapping.

To demonstrate the function, VIRTIO_NET_F_CTRL_MAC_ADDR is
implemented in the simulator so the driver can set the mac address
(the expected command layout is sketched below).
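
For reference, the cvq command the simulator parses follows the
standard virtio-net control command layout (a sketch; buffer placement
within the descriptor chain is up to the driver):

	struct virtio_net_ctrl_hdr hdr = {
		.class = VIRTIO_NET_CTRL_MAC,
		.cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET,
	};
	u8 mac[ETH_ALEN];		/* device-readable: the new mac */
	virtio_net_ctrl_ack ack;	/* device-writable: status byte */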

Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Gautam Dawar <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 91 ++++++++++++++++++++++------
drivers/vdpa/vdpa_sim/vdpa_sim.h | 2 +
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 88 ++++++++++++++++++++++++++-
3 files changed, 161 insertions(+), 20 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 659e2e2e4b0c..59611f18a3a8 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -96,11 +96,17 @@ static void vdpasim_do_reset(struct vdpasim *vdpasim)
{
int i;

- for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
+ spin_lock(&vdpasim->iommu_lock);
+
+ for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
+ vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
+ &vdpasim->iommu_lock);
+ }
+
+ for (i = 0; i < vdpasim->dev_attr.nas; i++)
+ vhost_iotlb_reset(&vdpasim->iommu[i]);

- spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_reset(vdpasim->iommu);
spin_unlock(&vdpasim->iommu_lock);

vdpasim->features = 0;
@@ -145,7 +151,7 @@ static dma_addr_t vdpasim_map_range(struct vdpasim *vdpasim, phys_addr_t paddr,
dma_addr = iova_dma_addr(&vdpasim->iova, iova);

spin_lock(&vdpasim->iommu_lock);
- ret = vhost_iotlb_add_range(vdpasim->iommu, (u64)dma_addr,
+ ret = vhost_iotlb_add_range(&vdpasim->iommu[0], (u64)dma_addr,
(u64)dma_addr + size - 1, (u64)paddr, perm);
spin_unlock(&vdpasim->iommu_lock);

@@ -161,7 +167,7 @@ static void vdpasim_unmap_range(struct vdpasim *vdpasim, dma_addr_t dma_addr,
size_t size)
{
spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_del_range(vdpasim->iommu, (u64)dma_addr,
+ vhost_iotlb_del_range(&vdpasim->iommu[0], (u64)dma_addr,
(u64)dma_addr + size - 1);
spin_unlock(&vdpasim->iommu_lock);

@@ -250,8 +256,9 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
else
ops = &vdpasim_config_ops;

- vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
- 1, dev_attr->name, false);
+ vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
+ dev_attr->ngroups, dev_attr->nas,
+ dev_attr->name, false);
if (IS_ERR(vdpasim)) {
ret = PTR_ERR(vdpasim);
goto err_alloc;
@@ -278,16 +285,20 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
if (!vdpasim->vqs)
goto err_iommu;

- vdpasim->iommu = vhost_iotlb_alloc(max_iotlb_entries, 0);
+ vdpasim->iommu = kmalloc_array(vdpasim->dev_attr.nas,
+ sizeof(*vdpasim->iommu), GFP_KERNEL);
if (!vdpasim->iommu)
goto err_iommu;

+ for (i = 0; i < vdpasim->dev_attr.nas; i++)
+ vhost_iotlb_init(&vdpasim->iommu[i], 0, 0);
+
vdpasim->buffer = kvmalloc(dev_attr->buffer_size, GFP_KERNEL);
if (!vdpasim->buffer)
goto err_iommu;

for (i = 0; i < dev_attr->nvqs; i++)
- vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
+ vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
&vdpasim->iommu_lock);

ret = iova_cache_get();
@@ -401,7 +412,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)

static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
{
- return 0;
+ /* RX and TX belong to group 0, CVQ belongs to group 1 */
+ if (idx == 2)
+ return 1;
+ else
+ return 0;
}

static u64 vdpasim_get_device_features(struct vdpa_device *vdpa)
@@ -539,20 +554,53 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
return range;
}

+static int vdpasim_set_group_asid(struct vdpa_device *vdpa, unsigned int group,
+ unsigned int asid)
+{
+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+ struct vhost_iotlb *iommu;
+ int i;
+
+ if (group > vdpasim->dev_attr.ngroups)
+ return -EINVAL;
+
+ if (asid > vdpasim->dev_attr.nas)
+ return -EINVAL;
+
+ iommu = &vdpasim->iommu[asid];
+
+ spin_lock(&vdpasim->lock);
+
+ for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
+ if (vdpasim_get_vq_group(vdpa, i) == group)
+ vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
+ &vdpasim->iommu_lock);
+
+ spin_unlock(&vdpasim->lock);
+
+ return 0;
+}
+
static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
struct vhost_iotlb *iotlb)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
struct vhost_iotlb_map *map;
+ struct vhost_iotlb *iommu;
u64 start = 0ULL, last = 0ULL - 1;
int ret;

+ if (asid >= vdpasim->dev_attr.nas)
+ return -EINVAL;
+
spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_reset(vdpasim->iommu);
+
+ iommu = &vdpasim->iommu[asid];
+ vhost_iotlb_reset(iommu);

for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
map = vhost_iotlb_itree_next(map, start, last)) {
- ret = vhost_iotlb_add_range(vdpasim->iommu, map->start,
+ ret = vhost_iotlb_add_range(iommu, map->start,
map->last, map->addr, map->perm);
if (ret)
goto err;
@@ -561,7 +609,7 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
return 0;

err:
- vhost_iotlb_reset(vdpasim->iommu);
+ vhost_iotlb_reset(iommu);
spin_unlock(&vdpasim->iommu_lock);
return ret;
}
@@ -573,9 +621,12 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
int ret;

+ if (asid >= vdpasim->dev_attr.nas)
+ return -EINVAL;
+
spin_lock(&vdpasim->iommu_lock);
- ret = vhost_iotlb_add_range_ctx(vdpasim->iommu, iova, iova + size - 1,
- pa, perm, opaque);
+ ret = vhost_iotlb_add_range_ctx(&vdpasim->iommu[asid], iova,
+ iova + size - 1, pa, perm, opaque);
spin_unlock(&vdpasim->iommu_lock);

return ret;
@@ -586,8 +637,11 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);

+ if (asid >= vdpasim->dev_attr.nas)
+ return -EINVAL;
+
spin_lock(&vdpasim->iommu_lock);
- vhost_iotlb_del_range(vdpasim->iommu, iova, iova + size - 1);
+ vhost_iotlb_del_range(&vdpasim->iommu[asid], iova, iova + size - 1);
spin_unlock(&vdpasim->iommu_lock);

return 0;
@@ -611,8 +665,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
}

kvfree(vdpasim->buffer);
- if (vdpasim->iommu)
- vhost_iotlb_free(vdpasim->iommu);
+ vhost_iotlb_free(vdpasim->iommu);
kfree(vdpasim->vqs);
kfree(vdpasim->config);
}
@@ -643,6 +696,7 @@ static const struct vdpa_config_ops vdpasim_config_ops = {
.set_config = vdpasim_set_config,
.get_generation = vdpasim_get_generation,
.get_iova_range = vdpasim_get_iova_range,
+ .set_group_asid = vdpasim_set_group_asid,
.dma_map = vdpasim_dma_map,
.dma_unmap = vdpasim_dma_unmap,
.free = vdpasim_free,
@@ -674,6 +728,7 @@ static const struct vdpa_config_ops vdpasim_batch_config_ops = {
.set_config = vdpasim_set_config,
.get_generation = vdpasim_get_generation,
.get_iova_range = vdpasim_get_iova_range,
+ .set_group_asid = vdpasim_set_group_asid,
.set_map = vdpasim_set_map,
.free = vdpasim_free,
};
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index 0be7c1e7ef80..622782e92239 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -41,6 +41,8 @@ struct vdpasim_dev_attr {
size_t buffer_size;
int nvqs;
u32 id;
+ u32 ngroups;
+ u32 nas;

work_func_t work_fn;
void (*get_config)(struct vdpasim *vdpasim, void *config);
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index ed5ade4ae570..513970c05af2 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -26,10 +26,15 @@
#define DRV_LICENSE "GPL v2"

#define VDPASIM_NET_FEATURES (VDPASIM_FEATURES | \
+ (1ULL << VIRTIO_NET_F_MTU) | \
(1ULL << VIRTIO_NET_F_MAC) | \
- (1ULL << VIRTIO_NET_F_MTU));
+ (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
+ (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR));

-#define VDPASIM_NET_VQ_NUM 2
+/* 3 virtqueues, 2 address spaces, 2 virtqueue groups */
+#define VDPASIM_NET_VQ_NUM 3
+#define VDPASIM_NET_AS_NUM 2
+#define VDPASIM_NET_GROUP_NUM 2

static void vdpasim_net_complete(struct vdpasim_virtqueue *vq, size_t len)
{
@@ -63,6 +68,81 @@ static bool receive_filter(struct vdpasim *vdpasim, size_t len)
return false;
}

+static virtio_net_ctrl_ack vdpasim_handle_ctrl_mac(struct vdpasim *vdpasim,
+ u8 cmd)
+{
+ struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
+ virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+ size_t read;
+
+ switch (cmd) {
+ case VIRTIO_NET_CTRL_MAC_ADDR_SET:
+ read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov,
+ (void *)vdpasim->config.mac,
+ ETH_ALEN);
+ if (read == ETH_ALEN)
+ status = VIRTIO_NET_OK;
+ break;
+ default:
+ break;
+ }
+
+ return status;
+}
+
+static void vdpasim_handle_cvq(struct vdpasim *vdpasim)
+{
+ struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
+ virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+ struct virtio_net_ctrl_hdr ctrl;
+ size_t read, write;
+ int err;
+
+ if (!(vdpasim->features & (1ULL << VIRTIO_NET_F_CTRL_VQ)))
+ return;
+
+ if (!cvq->ready)
+ return;
+
+ while (true) {
+ err = vringh_getdesc_iotlb(&cvq->vring, &cvq->in_iov,
+ &cvq->out_iov,
+ &cvq->head, GFP_ATOMIC);
+ if (err <= 0)
+ break;
+
+ read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov, &ctrl,
+ sizeof(ctrl));
+ if (read != sizeof(ctrl))
+ break;
+
+ switch (ctrl.class) {
+ case VIRTIO_NET_CTRL_MAC:
+ status = vdpasim_handle_ctrl_mac(vdpasim, ctrl.cmd);
+ break;
+ default:
+ break;
+ }
+
+ /* Make sure data is written before advancing the index */
+ smp_wmb();
+
+ write = vringh_iov_push_iotlb(&cvq->vring, &cvq->out_iov,
+ &status, sizeof(status));
+ vringh_complete_iotlb(&cvq->vring, cvq->head, write);
+ vringh_kiov_cleanup(&cvq->in_iov);
+ vringh_kiov_cleanup(&cvq->out_iov);
+
+ /* Make sure used is visible before raising the interrupt. */
+ smp_wmb();
+
+ local_bh_disable();
+ if (cvq->cb)
+ cvq->cb(cvq->private);
+ local_bh_enable();
+ }
+}
+
static void vdpasim_net_work(struct work_struct *work)
{
struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
@@ -77,6 +157,8 @@ static void vdpasim_net_work(struct work_struct *work)
if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
goto out;

+ vdpasim_handle_cvq(vdpasim);
+
if (!txq->ready || !rxq->ready)
goto out;

@@ -162,6 +244,8 @@ static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
dev_attr.id = VIRTIO_ID_NET;
dev_attr.supported_features = VDPASIM_NET_FEATURES;
dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
+ dev_attr.ngroups = VDPASIM_NET_GROUP_NUM;
+ dev_attr.nas = VDPASIM_NET_AS_NUM;
dev_attr.config_size = sizeof(struct virtio_net_config);
dev_attr.get_config = vdpasim_net_get_config;
dev_attr.work_fn = vdpasim_net_work;
--
2.25.0

2022-02-28 09:41:35

by Jason Wang

Subject: Re: [RFC PATCH v2 00/19] Control VQ support in vDPA


On 2022/2/25 5:22 AM, Gautam Dawar wrote:
> Hi All:
>
> This series tries to add the support for control virtqueue in vDPA.
>
> Control virtqueue is used by networking device for accepting various
> commands from the driver. It's a must to support multiqueue and other
> configurations.
>
> When used by vhost-vDPA bus driver for VM, the control virtqueue
> should be shadowed via userspace VMM (Qemu) instead of being assigned
> directly to Guest. This is because Qemu needs to know the device state
> in order to start and stop device correctly (e.g for Live Migration).
>
> This requires isolating the memory mapping for the control virtqueue
> presented by vhost-vDPA to prevent the guest from accessing it directly.
>
> To achieve this, vDPA introduce two new abstractions:
>
> - address space: identified through an address space id (ASID), in
> which a set of memory mappings is maintained
> - virtqueue group: the minimal set of virtqueues that must share an
> address space
>
> Device needs to advertise the following attributes to vDPA:
>
> - the number of address spaces supported in the device
> - the number of virtqueue groups supported in the device
> - the mappings from a specific virtqueue to its virtqueue groups
>
> The mappings from virtqueue to virtqueue groups is fixed and defined
> by vDPA device driver. E.g:
>
> - For the device that has hardware ASID support, it can simply
> advertise a per virtqueue virtqueue group.
> - For the device that does not have hardware ASID support, it can
> simply advertise a single virtqueue group that contains all
> virtqueues. Or if it wants a software emulated control virtqueue, it
> can advertise two virtqueue groups, one is for cvq, another is for
> the rest virtqueues.
>
> vDPA also allow to change the association between virtqueue group and
> address space. So in the case of control virtqueue, userspace
> VMM(Qemu) may use a dedicated address space for the control virtqueue
> group to isolate the memory mapping.
>
> The vhost/vhost-vDPA is also extended for the userspace to:
>
> - query the number of virtqueue groups and address spaces supported by
> the device
> - query the virtqueue group for a specific virtqueue
> - associate a virtqueue group with an address space
> - send ASID based IOTLB commands
>
> This will help userspace VMM(Qemu) to detect whether the control vq
> could be supported and isolate memory mappings of control virtqueue
> from the others.
>
> To demonstrate the usage, vDPA simulator is extended to support
> setting MAC address via an emulated control virtqueue.
>
> Please review.
>
> Changes since v1:
>
> - Rebased the v1 patch series on vhost branch of MST vhost git repo
> git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
> - Updates to accommodate vdpa_sim changes from monolithic module in
> kernel used v1 patch series to current modularized class (net, block)
> based approach.
> - Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
> propagated them from vdpa_sim_net to vdpa_sim
> - Widened the data-type for "asid" member of vhost_msg_v2 to __u32
> to accommodate PASID


This is great. Then the semantic matches exactly the PASID proposal here[1].


> - Fixed the buildbot warnings
> - Resolved all checkpatch.pl errors and warnings
> - Tested both control and datapath with Xilinx Smartnic SN1000 series
> device using QEMU implementing the Shadow virtqueue and support for
> VQ groups and ASID available at:
> github.com/eugpermar/qemu/releases/tag/vdpa_sw_live_migration.d%2F
> asid_groups-v1.d%2F00


On top, we may extend the netlink protocol to report the mapping
between each virtqueue and its group.


Thanks

[1]
https://www.mail-archive.com/[email protected]/msg08077.html


>
> Changes since RFC:
>
> - tweak vhost uAPI documentation
> - switch to use device specific IOTLB really in patch 4
> - tweak the commit log
> - fix that ASID in vhost is claimed to be 32 actually but 16bit
> actually
> - fix use after free when using ASID with IOTLB batching requests
> - switch to use Stefano's patch for having separated iov
> - remove unused "used_as" variable
> - fix the iotlb/asid checking in vhost_vdpa_unmap()
>
> Thanks
>
> Gautam Dawar (19):
> vhost: move the backend feature bits to vhost_types.h
> virtio-vdpa: don't set callback if virtio doesn't need it
> vhost-vdpa: passing iotlb to IOMMU mapping helpers
> vhost-vdpa: switch to use vhost-vdpa specific IOTLB
> vdpa: introduce virtqueue groups
> vdpa: multiple address spaces support
> vdpa: introduce config operations for associating ASID to a virtqueue
> group
> vhost_iotlb: split out IOTLB initialization
> vhost: support ASID in IOTLB API
> vhost-vdpa: introduce asid based IOTLB
> vhost-vdpa: introduce uAPI to get the number of virtqueue groups
> vhost-vdpa: introduce uAPI to get the number of address spaces
> vhost-vdpa: uAPI to get virtqueue group id
> vhost-vdpa: introduce uAPI to set group ASID
> vhost-vdpa: support ASID based IOTLB API
> vdpa_sim: advertise VIRTIO_NET_F_MTU
> vdpa_sim: factor out buffer completion logic
> vdpa_sim: filter destination mac address
> vdpasim: control virtqueue support
>
> drivers/vdpa/ifcvf/ifcvf_main.c | 8 +-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 11 +-
> drivers/vdpa/vdpa.c | 5 +
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 100 ++++++++--
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 3 +
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 169 +++++++++++++----
> drivers/vhost/iotlb.c | 23 ++-
> drivers/vhost/vdpa.c | 272 +++++++++++++++++++++------
> drivers/vhost/vhost.c | 23 ++-
> drivers/vhost/vhost.h | 4 +-
> drivers/virtio/virtio_vdpa.c | 2 +-
> include/linux/vdpa.h | 46 ++++-
> include/linux/vhost_iotlb.h | 2 +
> include/uapi/linux/vhost.h | 25 ++-
> include/uapi/linux/vhost_types.h | 11 +-
> 15 files changed, 566 insertions(+), 138 deletions(-)
>

2022-02-28 14:21:11

by Gautam Dawar

Subject: RE: [RFC PATCH v2 00/19] Control VQ support in vDPA

On 2022/2/25 5:22 AM, Gautam Dawar wrote:
> Hi All:
>
> This series tries to add the support for control virtqueue in vDPA.
>
> Control virtqueue is used by networking device for accepting various
> commands from the driver. It's a must to support multiqueue and other
> configurations.
>
> When used by vhost-vDPA bus driver for VM, the control virtqueue
> should be shadowed via userspace VMM (Qemu) instead of being assigned
> directly to Guest. This is because Qemu needs to know the device state
> in order to start and stop device correctly (e.g for Live Migration).
>
> This requires isolating the memory mapping for the control virtqueue
> presented by vhost-vDPA to prevent the guest from accessing it directly.
>
> To achieve this, vDPA introduce two new abstractions:
>
> - address space: identified through an address space id (ASID), in
> which a set of memory mappings is maintained
> - virtqueue group: the minimal set of virtqueues that must share an
> address space
>
> Device needs to advertise the following attributes to vDPA:
>
> - the number of address spaces supported in the device
> - the number of virtqueue groups supported in the device
> - the mappings from a specific virtqueue to its virtqueue groups
>
> The mappings from virtqueue to virtqueue groups is fixed and defined
> by vDPA device driver. E.g:
>
> - For the device that has hardware ASID support, it can simply
> advertise a per virtqueue virtqueue group.
> - For the device that does not have hardware ASID support, it can
> simply advertise a single virtqueue group that contains all
> virtqueues. Or if it wants a software emulated control virtqueue, it
> can advertise two virtqueue groups, one is for cvq, another is for
> the rest virtqueues.
>
> vDPA also allow to change the association between virtqueue group and
> address space. So in the case of control virtqueue, userspace
> VMM(Qemu) may use a dedicated address space for the control virtqueue
> group to isolate the memory mapping.
>
> The vhost/vhost-vDPA is also extended for the userspace to:
>
> - query the number of virtqueue groups and address spaces supported by
> the device
> - query the virtqueue group for a specific virtqueue
> - associate a virtqueue group with an address space
> - send ASID based IOTLB commands
>
> This will help userspace VMM(Qemu) to detect whether the control vq
> could be supported and isolate memory mappings of control virtqueue
> from the others.
>
> To demonstrate the usage, vDPA simulator is extended to support
> setting MAC address via an emulated control virtqueue.
>
> Please review.
>
> Changes since v1:
>
> - Rebased the v1 patch series on vhost branch of MST vhost git repo
> git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
> - Updates to accommodate vdpa_sim changes from monolithic module in
> kernel used v1 patch series to current modularized class (net, block)
> based approach.
> - Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
> propagated them from vdpa_sim_net to vdpa_sim
> - Widened the data-type for "asid" member of vhost_msg_v2 to __u32
> to accommodate PASID


This is great. Then the semantic matches exactly the PASID proposal here[1].


> - Fixed the buildbot warnings
> - Resolved all checkpatch.pl errors and warnings
> - Tested both control and datapath with Xilinx Smartnic SN1000 series
> device using QEMU implementing the Shadow virtqueue and support for
> VQ groups and ASID available at:
> github.com/eugpermar/qemu/releases/tag/vdpa_sw_live_migration.d%2F
> asid_groups-v1.d%2F00


On top, we may extend the netlink protocol to report the mapping between each virtqueue and its group.
[GD>>] Yes, I've already discussed this with Eugenio. For testing purposes, I added the mapping in the Xilinx net driver "sfc".

Thanks

[1]
https://www.mail-archive.com/[email protected]/msg08077.html


>
> Changes since RFC:
>
> - tweak vhost uAPI documentation
> - switch to use device specific IOTLB really in patch 4
> - tweak the commit log
> - fix that ASID in vhost is claimed to be 32 actually but 16bit
> actually
> - fix use after free when using ASID with IOTLB batching requests
> - switch to use Stefano's patch for having separated iov
> - remove unused "used_as" variable
> - fix the iotlb/asid checking in vhost_vdpa_unmap()
>
> Thanks
>
> Gautam Dawar (19):
> vhost: move the backend feature bits to vhost_types.h
> virtio-vdpa: don't set callback if virtio doesn't need it
> vhost-vdpa: passing iotlb to IOMMU mapping helpers
> vhost-vdpa: switch to use vhost-vdpa specific IOTLB
> vdpa: introduce virtqueue groups
> vdpa: multiple address spaces support
> vdpa: introduce config operations for associating ASID to a virtqueue
> group
> vhost_iotlb: split out IOTLB initialization
> vhost: support ASID in IOTLB API
> vhost-vdpa: introduce asid based IOTLB
> vhost-vdpa: introduce uAPI to get the number of virtqueue groups
> vhost-vdpa: introduce uAPI to get the number of address spaces
> vhost-vdpa: uAPI to get virtqueue group id
> vhost-vdpa: introduce uAPI to set group ASID
> vhost-vdpa: support ASID based IOTLB API
> vdpa_sim: advertise VIRTIO_NET_F_MTU
> vdpa_sim: factor out buffer completion logic
> vdpa_sim: filter destination mac address
> vdpasim: control virtqueue support
>
> drivers/vdpa/ifcvf/ifcvf_main.c | 8 +-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 11 +-
> drivers/vdpa/vdpa.c | 5 +
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 100 ++++++++--
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 3 +
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 169 +++++++++++++----
> drivers/vhost/iotlb.c | 23 ++-
> drivers/vhost/vdpa.c | 272 +++++++++++++++++++++------
> drivers/vhost/vhost.c | 23 ++-
> drivers/vhost/vhost.h | 4 +-
> drivers/virtio/virtio_vdpa.c | 2 +-
> include/linux/vdpa.h | 46 ++++-
> include/linux/vhost_iotlb.h | 2 +
> include/uapi/linux/vhost.h | 25 ++-
> include/uapi/linux/vhost_types.h | 11 +-
> 15 files changed, 566 insertions(+), 138 deletions(-)
>

2022-03-03 20:59:13

by Eugenio Perez Martin

Subject: Re: [RFC PATCH v2 06/19] vdpa: multiple address spaces support

On Thu, Feb 24, 2022 at 10:25 PM Gautam Dawar <[email protected]> wrote:
>
> This patch introduces multiple address spaces support for vDPA
> devices. The idea is to identify a specific address space via a
> dedicated identifier - ASID.
>
> During vDPA device allocation, vDPA device driver needs to report the
> number of address spaces supported by the device then the DMA mapping
> ops of the vDPA device needs to be extended to support ASID.
>
> This helps to isolate the environments for the virtqueue that will not
> be assigned directly. E.g in the case of virtio-net, the control
> virtqueue will not be assigned directly to guest.
>
> As a start, simply claim 1 virtqueue group and 1 address space for
> all vDPA devices. And vhost-vDPA will simply reject devices with
> more than 1 virtqueue group or address space.
>
> Signed-off-by: Jason Wang <[email protected]>
> Signed-off-by: Gautam Dawar <[email protected]>
> ---
> drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
> drivers/vdpa/vdpa.c | 4 +++-
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
> drivers/vhost/vdpa.c | 14 +++++++++-----
> include/linux/vdpa.h | 28 +++++++++++++++++++---------
> 6 files changed, 41 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index c815a2e62440..a4815c5612f9 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -513,7 +513,7 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
> pdev = ifcvf_mgmt_dev->pdev;
> dev = &pdev->dev;
> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
> - dev, &ifc_vdpa_ops, 1, name, false);
> + dev, &ifc_vdpa_ops, 1, 1, name, false);
> if (IS_ERR(adapter)) {
> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> return PTR_ERR(adapter);
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index fcfc28460b72..a76417892ef3 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -2282,7 +2282,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
> return mvdev->generation;
> }
>
> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb)
> {
> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> bool change_map;
> @@ -2581,7 +2582,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
> }
>
> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> - 1, name, false);
> + 1, 1, name, false);
> if (IS_ERR(ndev))
> return PTR_ERR(ndev);
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index a07bf0130559..1793dc12b208 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -160,6 +160,7 @@ static void vdpa_release_dev(struct device *d)
> * @parent: the parent device
> * @config: the bus operations that is supported by this device
> * @ngroups: number of groups supported by this device
> + * @nas: number of address spaces supported by this device
> * @size: size of the parent structure that contains private data
> * @name: name of the vdpa device; optional.
> * @use_va: indicate whether virtual address must be used by this device
> @@ -172,7 +173,7 @@ static void vdpa_release_dev(struct device *d)
> */
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> - unsigned int ngroups,
> + unsigned int ngroups, unsigned int nas,
> size_t size, const char *name,
> bool use_va)
> {
> @@ -206,6 +207,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> vdev->features_valid = false;
> vdev->use_va = use_va;
> vdev->ngroups = ngroups;
> + vdev->nas = nas;
>
> if (name)
> err = dev_set_name(&vdev->dev, "%s", name);
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index c98cb1f869fa..659e2e2e4b0c 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -251,7 +251,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> ops = &vdpasim_config_ops;
>
> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
> - dev_attr->name, false);
> + 1, dev_attr->name, false);
> if (IS_ERR(vdpasim)) {
> ret = PTR_ERR(vdpasim);
> goto err_alloc;
> @@ -539,7 +539,7 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
> return range;
> }
>
> -static int vdpasim_set_map(struct vdpa_device *vdpa,
> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -566,7 +566,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
> return ret;
> }
>
> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size,
> u64 pa, u32 perm, void *opaque)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -580,7 +581,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> return ret;
> }
>
> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 655ff7029401..6bf755f84d26 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -599,10 +599,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> return r;
>
> if (ops->dma_map) {
> - r = ops->dma_map(vdpa, iova, size, pa, perm, opaque);
> + r = ops->dma_map(vdpa, 0, iova, size, pa, perm, opaque);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - r = ops->set_map(vdpa, iotlb);
> + r = ops->set_map(vdpa, 0, iotlb);
> } else {
> r = iommu_map(v->domain, iova, pa, size,
> perm_to_iommu_flags(perm));
> @@ -628,10 +628,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>
> if (ops->dma_map) {
> - ops->dma_unmap(vdpa, iova, size);
> + ops->dma_unmap(vdpa, 0, iova, size);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> } else {
> iommu_unmap(v->domain, iova, size);
> }
> @@ -863,7 +863,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> break;
> case VHOST_IOTLB_BATCH_END:
> if (v->in_batch && ops->set_map)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> v->in_batch = false;
> break;
> default:
> @@ -1128,6 +1128,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> int minor;
> int r;
>
> + /* Only support 1 address space and 1 groups */
> + if (vdpa->ngroups != 1 || vdpa->nas != 1)
> + return -EOPNOTSUPP;
> +
> v = kzalloc(sizeof(*v), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
> if (!v)
> return -ENOMEM;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 026b7ad72ed7..de22ca1a8ef3 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -69,6 +69,8 @@ struct vdpa_mgmt_dev;
> * @cf_mutex: Protects get and set access to configuration layout.
> * @index: device index
> * @features_valid: were features initialized? for legacy guests
> + * @ngroups: the number of virtqueue groups
> + * @nas: the number of address spaces
> * @use_va: indicate whether virtual address must be used by this device
> * @nvqs: maximum number of supported virtqueues
> * @mdev: management device pointer; caller must setup when registering device as part
> @@ -86,6 +88,7 @@ struct vdpa_device {
> int nvqs;
> struct vdpa_mgmt_dev *mdev;
> unsigned int ngroups;
> + unsigned int nas;
> };
>
> /**
> @@ -240,6 +243,7 @@ struct vdpa_map_file {
> * Needed for device that using device
> * specific DMA translation (on-chip IOMMU)
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iotlb: vhost memory mapping to be
> * used by the vDPA
> * Returns integer: success (0) or error (< 0)
> @@ -248,6 +252,7 @@ struct vdpa_map_file {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental map.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be mapped
> * @size: size of the area
> * @pa: physical address for the map
> @@ -259,6 +264,7 @@ struct vdpa_map_file {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental unmap.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be unmapped
> * @size: size of the area
> * Returns integer: success (0) or error (< 0)
> @@ -309,10 +315,12 @@ struct vdpa_config_ops {
> struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
>
> /* DMA ops */
> - int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> - int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> - u64 pa, u32 perm, void *opaque);
> - int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> + int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb);
> + int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
> + int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size);
>
> /* Free device resources */
> void (*free)(struct vdpa_device *vdev);
> @@ -320,7 +328,7 @@ struct vdpa_config_ops {
>
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> - unsigned int ngroups,
> + unsigned int ngroups, unsigned int nas,
> size_t size, const char *name,
> bool use_va);
>
> @@ -332,17 +340,19 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> * @parent: the parent device
> * @config: the bus operations that is supported by this device
> * @ngroups: the number of virtqueue groups supported by this device
> + * @nas: the number of address spaces
> * @name: name of the vdpa device
> * @use_va: indicate whether virtual address must be used by this device
> *
> * Return allocated data structure or ERR_PTR upon error
> */
> -#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, name, use_va) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, nas, \
> + name, use_va) \
> container_of((__vdpa_alloc_device( \
> - parent, config, ngroups, \
> - sizeof(dev_struct) + \
> + parent, config, ngroups, nas, \
> + (sizeof(dev_struct) + \

Maybe too nitpicky, or I'm missing something, but do we need to add the
parentheses around (sizeof(dev_struct) + BUILD_BUG_ON_ZERO(...))?

> BUILD_BUG_ON_ZERO(offsetof( \
> - dev_struct, member)), name, use_va)), \
> + dev_struct, member))), name, use_va)), \
> dev_struct, member)
>
> int vdpa_register_device(struct vdpa_device *vdev, int nvqs);
> --
> 2.25.0
>
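
As an aside on the macro being discussed: below is a minimal,
self-contained sketch (not the kernel source; struct vdpa_device is
stubbed here) of what the BUILD_BUG_ON_ZERO() term enforces.
__vdpa_alloc_device() hands back a struct vdpa_device * pointing at the
start of the allocation, so the macro must guarantee that the embedded
member really sits at offset zero of dev_struct; the negative-bitfield
trick turns any violation into a compile-time error.

#include <stddef.h>

/* Evaluates to 0, but fails to compile if e is non-zero, because a
 * negative bitfield width is invalid. Same trick as the kernel helper. */
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int : (-!!(e)); })))

struct vdpa_device { int dummy; };	/* stub for illustration only */

struct my_adapter {
	struct vdpa_device vdpa;	/* must be the first member */
	int private_state;
};

/* Compiles only because offsetof(struct my_adapter, vdpa) == 0 */
static const int vdpa_offset_check =
	BUILD_BUG_ON_ZERO(offsetof(struct my_adapter, vdpa));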

2022-03-04 08:40:01

by Gautam Dawar

[permalink] [raw]
Subject: RE: [RFC PATCH v2 06/19] vdpa: multiple address spaces support

-----Original Message-----
From: Eugenio Perez Martin <[email protected]>
Sent: Friday, March 4, 2022 1:10 AM
To: Gautam Dawar <[email protected]>
Cc: Gautam Dawar <[email protected]>; Martin Petrus Hubertus Habets <[email protected]>; Harpreet Singh Anand <[email protected]>; Tanuj Murlidhar Kamde <[email protected]>; Jason Wang <[email protected]>; Michael S. Tsirkin <[email protected]>; Zhu Lingshan <[email protected]>; Stefano Garzarella <[email protected]>; Xie Yongji <[email protected]>; Eli Cohen <[email protected]>; Si-Wei Liu <[email protected]>; Parav Pandit <[email protected]>; Longpeng <[email protected]>; virtualization <[email protected]>; [email protected]; kvm list <[email protected]>; [email protected]
Subject: Re: [RFC PATCH v2 06/19] vdpa: multiple address spaces support

On Thu, Feb 24, 2022 at 10:25 PM Gautam Dawar <[email protected]> wrote:
>
> This patch introduces multiple address space support for vDPA
> devices. The idea is to identify a specific address space via a
> dedicated identifier - ASID.
>
> During vDPA device allocation, the vDPA device driver needs to
> report the number of address spaces supported by the device; the
> DMA mapping ops of the vDPA device are then extended to take an
> ASID.
>
> This helps to isolate the environments for virtqueues that will not
> be assigned directly. E.g. in the case of virtio-net, the control
> virtqueue will not be assigned directly to the guest.
>
> As a start, simply claim 1 virtqueue group and 1 address space for
> all vDPA devices, and let vhost-vDPA reject any device with more
> than 1 virtqueue group or address space.
>
> Signed-off-by: Jason Wang <[email protected]>
> Signed-off-by: Gautam Dawar <[email protected]>
> ---
> drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
> drivers/vdpa/vdpa.c | 4 +++-
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
> drivers/vhost/vdpa.c | 14 +++++++++-----
> include/linux/vdpa.h | 28 +++++++++++++++++++---------
> 6 files changed, 41 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index c815a2e62440..a4815c5612f9 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -513,7 +513,7 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
> pdev = ifcvf_mgmt_dev->pdev;
> dev = &pdev->dev;
> adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
> - dev, &ifc_vdpa_ops, 1, name, false);
> + dev, &ifc_vdpa_ops, 1, 1, name, false);
> if (IS_ERR(adapter)) {
> IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> return PTR_ERR(adapter);
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index fcfc28460b72..a76417892ef3 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -2282,7 +2282,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
> return mvdev->generation;
> }
>
> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb)
> {
> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> bool change_map;
> @@ -2581,7 +2582,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
> }
>
> ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> - 1, name, false);
> + 1, 1, name, false);
> if (IS_ERR(ndev))
> return PTR_ERR(ndev);
>
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index a07bf0130559..1793dc12b208 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -160,6 +160,7 @@ static void vdpa_release_dev(struct device *d)
> * @parent: the parent device
> * @config: the bus operations that is supported by this device
> * @ngroups: number of groups supported by this device
> + * @nas: number of address spaces supported by this device
> * @size: size of the parent structure that contains private data
> * @name: name of the vdpa device; optional.
> * @use_va: indicate whether virtual address must be used by this device
> @@ -172,7 +173,7 @@ static void vdpa_release_dev(struct device *d)
> */
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> - unsigned int ngroups,
> + unsigned int ngroups, unsigned int nas,
> size_t size, const char *name,
> bool use_va)
> {
> @@ -206,6 +207,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> vdev->features_valid = false;
> vdev->use_va = use_va;
> vdev->ngroups = ngroups;
> + vdev->nas = nas;
>
> if (name)
> err = dev_set_name(&vdev->dev, "%s", name);
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index c98cb1f869fa..659e2e2e4b0c 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -251,7 +251,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> ops = &vdpasim_config_ops;
>
> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
> - dev_attr->name, false);
> + 1, dev_attr->name, false);
> if (IS_ERR(vdpasim)) {
> ret = PTR_ERR(vdpasim);
> goto err_alloc;
> @@ -539,7 +539,7 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
> return range;
> }
>
> -static int vdpasim_set_map(struct vdpa_device *vdpa,
> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -566,7 +566,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
> return ret;
> }
>
> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size,
> u64 pa, u32 perm, void *opaque)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -580,7 +581,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> return ret;
> }
>
> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> + u64 iova, u64 size)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 655ff7029401..6bf755f84d26 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -599,10 +599,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> return r;
>
> if (ops->dma_map) {
> - r = ops->dma_map(vdpa, iova, size, pa, perm, opaque);
> + r = ops->dma_map(vdpa, 0, iova, size, pa, perm, opaque);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - r = ops->set_map(vdpa, iotlb);
> + r = ops->set_map(vdpa, 0, iotlb);
> } else {
> r = iommu_map(v->domain, iova, pa, size,
> perm_to_iommu_flags(perm));
> @@ -628,10 +628,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>
> if (ops->dma_map) {
> - ops->dma_unmap(vdpa, iova, size);
> + ops->dma_unmap(vdpa, 0, iova, size);
> } else if (ops->set_map) {
> if (!v->in_batch)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> } else {
> iommu_unmap(v->domain, iova, size);
> }
> @@ -863,7 +863,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> break;
> case VHOST_IOTLB_BATCH_END:
> if (v->in_batch && ops->set_map)
> - ops->set_map(vdpa, iotlb);
> + ops->set_map(vdpa, 0, iotlb);
> v->in_batch = false;
> break;
> default:
> @@ -1128,6 +1128,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> int minor;
> int r;
>
> + /* Only support 1 address space and 1 group */
> + if (vdpa->ngroups != 1 || vdpa->nas != 1)
> + return -EOPNOTSUPP;
> +
> v = kzalloc(sizeof(*v), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
> if (!v)
> return -ENOMEM;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 026b7ad72ed7..de22ca1a8ef3 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -69,6 +69,8 @@ struct vdpa_mgmt_dev;
> * @cf_mutex: Protects get and set access to configuration layout.
> * @index: device index
> * @features_valid: were features initialized? for legacy guests
> + * @ngroups: the number of virtqueue groups
> + * @nas: the number of address spaces
> * @use_va: indicate whether virtual address must be used by this device
> * @nvqs: maximum number of supported virtqueues
> * @mdev: management device pointer; caller must setup when registering device as part
> @@ -86,6 +88,7 @@ struct vdpa_device {
> int nvqs;
> struct vdpa_mgmt_dev *mdev;
> unsigned int ngroups;
> + unsigned int nas;
> };
>
> /**
> @@ -240,6 +243,7 @@ struct vdpa_map_file {
> * Needed for device that using device
> * specific DMA translation (on-chip IOMMU)
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iotlb: vhost memory mapping to be
> * used by the vDPA
> * Returns integer: success (0) or error (< 0)
> @@ -248,6 +252,7 @@ struct vdpa_map_file {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental map.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be mapped
> * @size: size of the area
> * @pa: physical address for the map
> @@ -259,6 +264,7 @@ struct vdpa_map_file {
> * specific DMA translation (on-chip IOMMU)
> * and preferring incremental unmap.
> * @vdev: vdpa device
> + * @asid: address space identifier
> * @iova: iova to be unmapped
> * @size: size of the area
> * Returns integer: success (0) or error (< 0)
> @@ -309,10 +315,12 @@ struct vdpa_config_ops {
> struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
>
> /* DMA ops */
> - int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> - int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> - u64 pa, u32 perm, void *opaque);
> - int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> + int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> + struct vhost_iotlb *iotlb);
> + int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
> + int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> + u64 iova, u64 size);
>
> /* Free device resources */
> void (*free)(struct vdpa_device *vdev);
> @@ -320,7 +328,7 @@ struct vdpa_config_ops {
>
> struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> const struct vdpa_config_ops *config,
> - unsigned int ngroups,
> + unsigned int ngroups, unsigned int nas,
> size_t size, const char *name,
> bool use_va);
>
> @@ -332,17 +340,19 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> * @parent: the parent device
> * @config: the bus operations that is supported by this device
> * @ngroups: the number of virtqueue groups supported by this device
> + * @nas: the number of address spaces
> * @name: name of the vdpa device
> * @use_va: indicate whether virtual address must be used by this device
> *
> * Return allocated data structure or ERR_PTR upon error
> */
> -#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, name, use_va) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, nas, \
> + name, use_va) \
> container_of((__vdpa_alloc_device( \
> - parent, config, ngroups, \
> - sizeof(dev_struct) + \
> + parent, config, ngroups, nas, \
> + (sizeof(dev_struct) + \

Maybe too nitpicky, or I'm missing something, but do we need to add the parentheses around (sizeof(dev_struct) + BUILD_BUG_ON_ZERO(...))?
[GD>>] Yes, that's required: without it, checkpatch reports "ERROR: Macros with complex values should be enclosed in parentheses"

> BUILD_BUG_ON_ZERO(offsetof( \
> - dev_struct, member)), name, use_va)), \
> + dev_struct, member))), name, use_va)), \
> dev_struct, member)
>
> int vdpa_register_device(struct vdpa_device *vdev, int nvqs);
> --
> 2.25.0
>
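
To make the new calling convention concrete, this is the shape of a
parent-driver allocation after the signature change, modeled on the
ifcvf hunk quoted above (surrounding driver code is elided, so treat
it as an illustrative sketch rather than a buildable excerpt):

	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
				    dev, &ifc_vdpa_ops,
				    1,	/* ngroups: one virtqueue group */
				    1,	/* nas: one address space */
				    name, false);

A device with hardware ASID support would pass larger values here,
which vhost-vdpa then checks in vhost_vdpa_probe() (for now it rejects
anything other than 1 group and 1 address space).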

2022-03-04 19:45:07

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH v2 06/19] vdpa: multiple address spaces support

On Fri, Mar 4, 2022 at 7:30 AM Gautam Dawar <[email protected]> wrote:
>
> -----Original Message-----
> From: Eugenio Perez Martin <[email protected]>
> Sent: Friday, March 4, 2022 1:10 AM
> To: Gautam Dawar <[email protected]>
> Cc: Gautam Dawar <[email protected]>; Martin Petrus Hubertus Habets <[email protected]>; Harpreet Singh Anand <[email protected]>; Tanuj Murlidhar Kamde <[email protected]>; Jason Wang <[email protected]>; Michael S. Tsirkin <[email protected]>; Zhu Lingshan <[email protected]>; Stefano Garzarella <[email protected]>; Xie Yongji <[email protected]>; Eli Cohen <[email protected]>; Si-Wei Liu <[email protected]>; Parav Pandit <[email protected]>; Longpeng <[email protected]>; virtualization <[email protected]>; [email protected]; kvm list <[email protected]>; [email protected]
> Subject: Re: [RFC PATCH v2 06/19] vdpa: multiple address spaces support
>
> On Thu, Feb 24, 2022 at 10:25 PM Gautam Dawar <[email protected]> wrote:
> >
> > This patch introduces multiple address space support for vDPA
> > devices. The idea is to identify a specific address space via a
> > dedicated identifier - ASID.
> >
> > During vDPA device allocation, the vDPA device driver needs to
> > report the number of address spaces supported by the device; the
> > DMA mapping ops of the vDPA device are then extended to take an
> > ASID.
> >
> > This helps to isolate the environments for virtqueues that will not
> > be assigned directly. E.g. in the case of virtio-net, the control
> > virtqueue will not be assigned directly to the guest.
> >
> > As a start, simply claim 1 virtqueue group and 1 address space for
> > all vDPA devices, and let vhost-vDPA reject any device with more
> > than 1 virtqueue group or address space.
> >
> > Signed-off-by: Jason Wang <[email protected]>
> > Signed-off-by: Gautam Dawar <[email protected]>
> > ---
> > drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
> > drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++--
> > drivers/vdpa/vdpa.c | 4 +++-
> > drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
> > drivers/vhost/vdpa.c | 14 +++++++++-----
> > include/linux/vdpa.h | 28 +++++++++++++++++++---------
> > 6 files changed, 41 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> > index c815a2e62440..a4815c5612f9 100644
> > --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> > +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> > @@ -513,7 +513,7 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
> > pdev = ifcvf_mgmt_dev->pdev;
> > dev = &pdev->dev;
> > adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
> > - dev, &ifc_vdpa_ops, 1, name, false);
> > + dev, &ifc_vdpa_ops, 1, 1, name, false);
> > if (IS_ERR(adapter)) {
> > IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> > return PTR_ERR(adapter);
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index fcfc28460b72..a76417892ef3 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -2282,7 +2282,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
> > return mvdev->generation;
> > }
> >
> > -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> > +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> > + struct vhost_iotlb *iotlb)
> > {
> > struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> > bool change_map;
> > @@ -2581,7 +2582,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
> > }
> >
> > ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> > - 1, name, false);
> > + 1, 1, name, false);
> > if (IS_ERR(ndev))
> > return PTR_ERR(ndev);
> >
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > index a07bf0130559..1793dc12b208 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -160,6 +160,7 @@ static void vdpa_release_dev(struct device *d)
> > * @parent: the parent device
> > * @config: the bus operations that is supported by this device
> > * @ngroups: number of groups supported by this device
> > + * @nas: number of address spaces supported by this device
> > * @size: size of the parent structure that contains private data
> > * @name: name of the vdpa device; optional.
> > * @use_va: indicate whether virtual address must be used by this device
> > @@ -172,7 +173,7 @@ static void vdpa_release_dev(struct device *d)
> > */
> > struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> > const struct vdpa_config_ops *config,
> > - unsigned int ngroups,
> > + unsigned int ngroups, unsigned int nas,
> > size_t size, const char *name,
> > bool use_va)
> > {
> > @@ -206,6 +207,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> > vdev->features_valid = false;
> > vdev->use_va = use_va;
> > vdev->ngroups = ngroups;
> > + vdev->nas = nas;
> >
> > if (name)
> > err = dev_set_name(&vdev->dev, "%s", name);
> > diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> > index c98cb1f869fa..659e2e2e4b0c 100644
> > --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> > +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> > @@ -251,7 +251,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> > ops = &vdpasim_config_ops;
> >
> > vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
> > - dev_attr->name, false);
> > + 1, dev_attr->name, false);
> > if (IS_ERR(vdpasim)) {
> > ret = PTR_ERR(vdpasim);
> > goto err_alloc;
> > @@ -539,7 +539,7 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
> > return range;
> > }
> >
> > -static int vdpasim_set_map(struct vdpa_device *vdpa,
> > +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> > struct vhost_iotlb *iotlb)
> > {
> > struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> > @@ -566,7 +566,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
> > return ret;
> > }
> >
> > -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> > +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> > + u64 iova, u64 size,
> > u64 pa, u32 perm, void *opaque)
> > {
> > struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> > @@ -580,7 +581,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> > return ret;
> > }
> >
> > -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> > +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> > + u64 iova, u64 size)
> > {
> > struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> >
> > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > index 655ff7029401..6bf755f84d26 100644
> > --- a/drivers/vhost/vdpa.c
> > +++ b/drivers/vhost/vdpa.c
> > @@ -599,10 +599,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> > return r;
> >
> > if (ops->dma_map) {
> > - r = ops->dma_map(vdpa, iova, size, pa, perm, opaque);
> > + r = ops->dma_map(vdpa, 0, iova, size, pa, perm, opaque);
> > } else if (ops->set_map) {
> > if (!v->in_batch)
> > - r = ops->set_map(vdpa, iotlb);
> > + r = ops->set_map(vdpa, 0, iotlb);
> > } else {
> > r = iommu_map(v->domain, iova, pa, size,
> > perm_to_iommu_flags(perm));
> > @@ -628,10 +628,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> > vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
> >
> > if (ops->dma_map) {
> > - ops->dma_unmap(vdpa, iova, size);
> > + ops->dma_unmap(vdpa, 0, iova, size);
> > } else if (ops->set_map) {
> > if (!v->in_batch)
> > - ops->set_map(vdpa, iotlb);
> > + ops->set_map(vdpa, 0, iotlb);
> > } else {
> > iommu_unmap(v->domain, iova, size);
> > }
> > @@ -863,7 +863,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
> > break;
> > case VHOST_IOTLB_BATCH_END:
> > if (v->in_batch && ops->set_map)
> > - ops->set_map(vdpa, iotlb);
> > + ops->set_map(vdpa, 0, iotlb);
> > v->in_batch = false;
> > break;
> > default:
> > @@ -1128,6 +1128,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
> > int minor;
> > int r;
> >
> > + /* Only support 1 address space and 1 group */
> > + if (vdpa->ngroups != 1 || vdpa->nas != 1)
> > + return -EOPNOTSUPP;
> > +
> > v = kzalloc(sizeof(*v), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
> > if (!v)
> > return -ENOMEM;
> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > index 026b7ad72ed7..de22ca1a8ef3 100644
> > --- a/include/linux/vdpa.h
> > +++ b/include/linux/vdpa.h
> > @@ -69,6 +69,8 @@ struct vdpa_mgmt_dev;
> > * @cf_mutex: Protects get and set access to configuration layout.
> > * @index: device index
> > * @features_valid: were features initialized? for legacy guests
> > + * @ngroups: the number of virtqueue groups
> > + * @nas: the number of address spaces
> > * @use_va: indicate whether virtual address must be used by this device
> > * @nvqs: maximum number of supported virtqueues
> > * @mdev: management device pointer; caller must setup when registering device as part
> > @@ -86,6 +88,7 @@ struct vdpa_device {
> > int nvqs;
> > struct vdpa_mgmt_dev *mdev;
> > unsigned int ngroups;
> > + unsigned int nas;
> > };
> >
> > /**
> > @@ -240,6 +243,7 @@ struct vdpa_map_file {
> > * Needed for device that using device
> > * specific DMA translation (on-chip IOMMU)
> > * @vdev: vdpa device
> > + * @asid: address space identifier
> > * @iotlb: vhost memory mapping to be
> > * used by the vDPA
> > * Returns integer: success (0) or error (< 0)
> > @@ -248,6 +252,7 @@ struct vdpa_map_file {
> > * specific DMA translation (on-chip IOMMU)
> > * and preferring incremental map.
> > * @vdev: vdpa device
> > + * @asid: address space identifier
> > * @iova: iova to be mapped
> > * @size: size of the area
> > * @pa: physical address for the map
> > @@ -259,6 +264,7 @@ struct vdpa_map_file {
> > * specific DMA translation (on-chip IOMMU)
> > * and preferring incremental unmap.
> > * @vdev: vdpa device
> > + * @asid: address space identifier
> > * @iova: iova to be unmapped
> > * @size: size of the area
> > * Returns integer: success (0) or error (< 0)
> > @@ -309,10 +315,12 @@ struct vdpa_config_ops {
> > struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> >
> > /* DMA ops */
> > - int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > - int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> > - u64 pa, u32 perm, void *opaque);
> > - int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> > + int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> > + struct vhost_iotlb *iotlb);
> > + int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> > + u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
> > + int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> > + u64 iova, u64 size);
> >
> > /* Free device resources */
> > void (*free)(struct vdpa_device *vdev);
> > @@ -320,7 +328,7 @@ struct vdpa_config_ops {
> >
> > struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> > const struct vdpa_config_ops *config,
> > - unsigned int ngroups,
> > + unsigned int ngroups, unsigned int nas,
> > size_t size, const char *name,
> > bool use_va);
> >
> > @@ -332,17 +340,19 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
> > * @parent: the parent device
> > * @config: the bus operations that is supported by this device
> > * @ngroups: the number of virtqueue groups supported by this device
> > + * @nas: the number of address spaces
> > * @name: name of the vdpa device
> > * @use_va: indicate whether virtual address must be used by this device
> > *
> > * Return allocated data structure or ERR_PTR upon error
> > */
> > -#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, name, use_va) \
> > +#define vdpa_alloc_device(dev_struct, member, parent, config, ngroups, nas, \
> > + name, use_va) \
> > container_of((__vdpa_alloc_device( \
> > - parent, config, ngroups, \
> > - sizeof(dev_struct) + \
> > + parent, config, ngroups, nas, \
> > + (sizeof(dev_struct) + \
>
> Maybe too nitpicky, or I'm missing something, but do we need to add the parentheses around (sizeof(dev_struct) + BUILD_BUG_ON_ZERO(...))?
> [GD>>] Yes, that's required: without it, checkpatch reports "ERROR: Macros with complex values should be enclosed in parentheses"

Interestingly, I cannot reproduce it locally. But it's not something
that matters a lot in my opinion.

Thanks!

>
> > BUILD_BUG_ON_ZERO(offsetof( \
> > - dev_struct, member)), name, use_va)), \
> > + dev_struct, member))), name, use_va)), \
> > dev_struct, member)
> >
> > int vdpa_register_device(struct vdpa_device *vdev, int nvqs);
> > --
> > 2.25.0
> >
>

2022-03-10 22:46:09

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH v2 19/19] vdpasim: control virtqueue support

On Thu, Feb 24, 2022 at 10:29 PM Gautam Dawar <[email protected]> wrote:
>
> This patch introduces control virtqueue support for the vDPA
> simulator. This is a requirement for supporting advanced features
> like multiqueue.
>
> A requirement for the control virtqueue is to isolate its memory
> access from the rx/tx virtqueues. This is because when a vDPA device
> is used for a VM, the control virtqueue is not directly assigned to
> the VM; userspace (Qemu) presents a shadow control virtqueue so that
> it can record the device state.
>
> The isolation is done via the virtqueue groups and ASID support in
> vDPA through vhost-vdpa. The simulator is extended to have:
>
> 1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue)
> 2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1
> contains CVQ
> 3) two address spaces; the simulator simply implements the address
> spaces by mapping them 1:1 to IOTLBs.
>
> For the VM use cases, userspace (Qemu) may set AS 0 to group 0 and
> AS 1 to group 1. So we have:
>
> 1) The IOTLB for virtqueue group 0 contains the mappings of the
> guest, so RX and TX can be assigned to the guest directly.
> 2) The IOTLB for virtqueue group 1 contains the mappings of the CVQ,
> which are the buffers allocated and managed by the VMM only. So the
> CVQ of vhost-vdpa is visible to the VMM only, and the guest cannot
> access it.
>
> For the other use cases, since AS 0 is associated with all virtqueue
> groups by default, all virtqueues share the same mapping by default.
>
> To demonstrate the function, VIRTIO_NET_F_CTRL_MAC_ADDR is
> implemented in the simulator for the driver to set the mac address.
>
> Signed-off-by: Jason Wang <[email protected]>
> Signed-off-by: Gautam Dawar <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 91 ++++++++++++++++++++++------
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 2 +
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 88 ++++++++++++++++++++++++++-
> 3 files changed, 161 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 659e2e2e4b0c..59611f18a3a8 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -96,11 +96,17 @@ static void vdpasim_do_reset(struct vdpasim *vdpasim)
> {
> int i;
>
> - for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
> + spin_lock(&vdpasim->iommu_lock);
> +
> + for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
> vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> + &vdpasim->iommu_lock);
> + }
> +
> + for (i = 0; i < vdpasim->dev_attr.nas; i++)
> + vhost_iotlb_reset(&vdpasim->iommu[i]);
>
> - spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_reset(vdpasim->iommu);
> spin_unlock(&vdpasim->iommu_lock);
>
> vdpasim->features = 0;
> @@ -145,7 +151,7 @@ static dma_addr_t vdpasim_map_range(struct vdpasim *vdpasim, phys_addr_t paddr,
> dma_addr = iova_dma_addr(&vdpasim->iova, iova);
>
> spin_lock(&vdpasim->iommu_lock);
> - ret = vhost_iotlb_add_range(vdpasim->iommu, (u64)dma_addr,
> + ret = vhost_iotlb_add_range(&vdpasim->iommu[0], (u64)dma_addr,
> (u64)dma_addr + size - 1, (u64)paddr, perm);
> spin_unlock(&vdpasim->iommu_lock);
>
> @@ -161,7 +167,7 @@ static void vdpasim_unmap_range(struct vdpasim *vdpasim, dma_addr_t dma_addr,
> size_t size)
> {
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_del_range(vdpasim->iommu, (u64)dma_addr,
> + vhost_iotlb_del_range(&vdpasim->iommu[0], (u64)dma_addr,
> (u64)dma_addr + size - 1);
> spin_unlock(&vdpasim->iommu_lock);
>
> @@ -250,8 +256,9 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> else
> ops = &vdpasim_config_ops;
>
> - vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
> - 1, dev_attr->name, false);
> + vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> + dev_attr->ngroups, dev_attr->nas,
> + dev_attr->name, false);
> if (IS_ERR(vdpasim)) {
> ret = PTR_ERR(vdpasim);
> goto err_alloc;
> @@ -278,16 +285,20 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> if (!vdpasim->vqs)
> goto err_iommu;
>
> - vdpasim->iommu = vhost_iotlb_alloc(max_iotlb_entries, 0);
> + vdpasim->iommu = kmalloc_array(vdpasim->dev_attr.nas,
> + sizeof(*vdpasim->iommu), GFP_KERNEL);
> if (!vdpasim->iommu)
> goto err_iommu;
>
> + for (i = 0; i < vdpasim->dev_attr.nas; i++)
> + vhost_iotlb_init(&vdpasim->iommu[i], 0, 0);
> +
> vdpasim->buffer = kvmalloc(dev_attr->buffer_size, GFP_KERNEL);
> if (!vdpasim->buffer)
> goto err_iommu;
>
> for (i = 0; i < dev_attr->nvqs; i++)
> - vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> &vdpasim->iommu_lock);
>
> ret = iova_cache_get();
> @@ -401,7 +412,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
>
> static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
> {
> - return 0;
> + /* RX and TX belong to group 0, CVQ belongs to group 1 */
> + if (idx == 2)
> + return 1;
> + else
> + return 0;
> }
>
> static u64 vdpasim_get_device_features(struct vdpa_device *vdpa)
> @@ -539,20 +554,53 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
> return range;
> }
>
> +static int vdpasim_set_group_asid(struct vdpa_device *vdpa, unsigned int group,
> + unsigned int asid)
> +{
> + struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> + struct vhost_iotlb *iommu;
> + int i;
> +
> + if (group > vdpasim->dev_attr.ngroups)
> + return -EINVAL;
> +
> + if (asid > vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> + iommu = &vdpasim->iommu[asid];
> +
> + spin_lock(&vdpasim->lock);
> +
> + for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
> + if (vdpasim_get_vq_group(vdpa, i) == group)
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> + &vdpasim->iommu_lock);
> +
> + spin_unlock(&vdpasim->lock);
> +
> + return 0;
> +}
> +
> static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> struct vhost_iotlb_map *map;
> + struct vhost_iotlb *iommu;
> u64 start = 0ULL, last = 0ULL - 1;
> int ret;
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_reset(vdpasim->iommu);
> +
> + iommu = &vdpasim->iommu[asid];
> + vhost_iotlb_reset(iommu);
>
> for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
> map = vhost_iotlb_itree_next(map, start, last)) {
> - ret = vhost_iotlb_add_range(vdpasim->iommu, map->start,
> + ret = vhost_iotlb_add_range(iommu, map->start,
> map->last, map->addr, map->perm);
> if (ret)
> goto err;
> @@ -561,7 +609,7 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> return 0;
>
> err:
> - vhost_iotlb_reset(vdpasim->iommu);
> + vhost_iotlb_reset(iommu);
> spin_unlock(&vdpasim->iommu_lock);
> return ret;
> }
> @@ -573,9 +621,12 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> int ret;
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - ret = vhost_iotlb_add_range_ctx(vdpasim->iommu, iova, iova + size - 1,
> - pa, perm, opaque);
> + ret = vhost_iotlb_add_range_ctx(&vdpasim->iommu[asid], iova,
> + iova + size - 1, pa, perm, opaque);
> spin_unlock(&vdpasim->iommu_lock);
>
> return ret;
> @@ -586,8 +637,11 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_del_range(vdpasim->iommu, iova, iova + size - 1);
> + vhost_iotlb_del_range(&vdpasim->iommu[asid], iova, iova + size - 1);
> spin_unlock(&vdpasim->iommu_lock);
>
> return 0;
> @@ -611,8 +665,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
> }
>
> kvfree(vdpasim->buffer);
> - if (vdpasim->iommu)
> - vhost_iotlb_free(vdpasim->iommu);
> + vhost_iotlb_free(vdpasim->iommu);
> kfree(vdpasim->vqs);
> kfree(vdpasim->config);
> }
> @@ -643,6 +696,7 @@ static const struct vdpa_config_ops vdpasim_config_ops = {
> .set_config = vdpasim_set_config,
> .get_generation = vdpasim_get_generation,
> .get_iova_range = vdpasim_get_iova_range,
> + .set_group_asid = vdpasim_set_group_asid,
> .dma_map = vdpasim_dma_map,
> .dma_unmap = vdpasim_dma_unmap,
> .free = vdpasim_free,
> @@ -674,6 +728,7 @@ static const struct vdpa_config_ops vdpasim_batch_config_ops = {
> .set_config = vdpasim_set_config,
> .get_generation = vdpasim_get_generation,
> .get_iova_range = vdpasim_get_iova_range,
> + .set_group_asid = vdpasim_set_group_asid,
> .set_map = vdpasim_set_map,
> .free = vdpasim_free,
> };
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index 0be7c1e7ef80..622782e92239 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -41,6 +41,8 @@ struct vdpasim_dev_attr {
> size_t buffer_size;
> int nvqs;
> u32 id;
> + u32 ngroups;
> + u32 nas;
>
> work_func_t work_fn;
> void (*get_config)(struct vdpasim *vdpasim, void *config);
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index ed5ade4ae570..513970c05af2 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -26,10 +26,15 @@
> #define DRV_LICENSE "GPL v2"
>
> #define VDPASIM_NET_FEATURES (VDPASIM_FEATURES | \
> + (1ULL << VIRTIO_NET_F_MTU) | \

Why the reorder here?

> (1ULL << VIRTIO_NET_F_MAC) | \
> - (1ULL << VIRTIO_NET_F_MTU));
> + (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> + (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR));
>
> -#define VDPASIM_NET_VQ_NUM 2
> +/* 3 virtqueues, 2 address spaces, 2 virtqueue groups */
> +#define VDPASIM_NET_VQ_NUM 3
> +#define VDPASIM_NET_AS_NUM 2
> +#define VDPASIM_NET_GROUP_NUM 2
>
> static void vdpasim_net_complete(struct vdpasim_virtqueue *vq, size_t len)
> {
> @@ -63,6 +68,81 @@ static bool receive_filter(struct vdpasim *vdpasim, size_t len)
> return false;
> }
>
> +static virtio_net_ctrl_ack vdpasim_handle_ctrl_mac(struct vdpasim *vdpasim,
> + u8 cmd)
> +{
> + struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
> + virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> + size_t read;
> +
> + switch (cmd) {
> + case VIRTIO_NET_CTRL_MAC_ADDR_SET:
> + read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov,
> + (void *)vdpasim->config.mac,

This is not valid anymore as config is of type void *. It needs to be
converted to struct virtio_net_config. The same issue occurs in
receive_filter.

You can copy how vdpasim_net_setup_config does it, for example.

Thanks!

> + ETH_ALEN);
> + if (read == ETH_ALEN)
> + status = VIRTIO_NET_OK;
> + break;
> + default:
> + break;
> + }
> +
> + return status;
> +}
> +
> +static void vdpasim_handle_cvq(struct vdpasim *vdpasim)
> +{
> + struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
> + virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> + struct virtio_net_ctrl_hdr ctrl;
> + size_t read, write;
> + int err;
> +
> + if (!(vdpasim->features & (1ULL << VIRTIO_NET_F_CTRL_VQ)))
> + return;
> +
> + if (!cvq->ready)
> + return;
> +
> + while (true) {
> + err = vringh_getdesc_iotlb(&cvq->vring, &cvq->in_iov,
> + &cvq->out_iov,
> + &cvq->head, GFP_ATOMIC);
> + if (err <= 0)
> + break;
> +
> + read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov, &ctrl,
> + sizeof(ctrl));
> + if (read != sizeof(ctrl))
> + break;
> +
> + switch (ctrl.class) {
> + case VIRTIO_NET_CTRL_MAC:
> + status = vdpasim_handle_ctrl_mac(vdpasim, ctrl.cmd);
> + break;
> + default:
> + break;
> + }
> +
> + /* Make sure data is written before advancing index */
> + smp_wmb();
> +
> + write = vringh_iov_push_iotlb(&cvq->vring, &cvq->out_iov,
> + &status, sizeof(status));
> + vringh_complete_iotlb(&cvq->vring, cvq->head, write);
> + vringh_kiov_cleanup(&cvq->in_iov);
> + vringh_kiov_cleanup(&cvq->out_iov);
> +
> + /* Make sure used is visible before raising the interrupt. */
> + smp_wmb();
> +
> + local_bh_disable();
> + if (cvq->cb)
> + cvq->cb(cvq->private);
> + local_bh_enable();
> + }
> +}
> +
> static void vdpasim_net_work(struct work_struct *work)
> {
> struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> @@ -77,6 +157,8 @@ static void vdpasim_net_work(struct work_struct *work)
> if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> goto out;
>
> + vdpasim_handle_cvq(vdpasim);
> +
> if (!txq->ready || !rxq->ready)
> goto out;
>
> @@ -162,6 +244,8 @@ static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
> dev_attr.id = VIRTIO_ID_NET;
> dev_attr.supported_features = VDPASIM_NET_FEATURES;
> dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
> + dev_attr.ngroups = VDPASIM_NET_GROUP_NUM;
> + dev_attr.nas = VDPASIM_NET_AS_NUM;
> dev_attr.config_size = sizeof(struct virtio_net_config);
> dev_attr.get_config = vdpasim_net_get_config;
> dev_attr.work_fn = vdpasim_net_work;
> --
> 2.25.0
>
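
On the void * point above: since the modularized simulator stores the
config space as a void *, the net code needs an explicit conversion
before touching the mac field. A sketch of the fix Eugenio suggests
(types and helpers as in the quoted patch; illustrative, not the final
revision):

static virtio_net_ctrl_ack vdpasim_handle_ctrl_mac(struct vdpasim *vdpasim,
						   u8 cmd)
{
	/* vdpasim->config is void *, so convert it first, the way
	 * vdpasim_net_setup_config does */
	struct virtio_net_config *vio_config = vdpasim->config;
	struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
	virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
	size_t read;

	switch (cmd) {
	case VIRTIO_NET_CTRL_MAC_ADDR_SET:
		/* Pull the new MAC straight into the config space */
		read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov,
					     vio_config->mac, ETH_ALEN);
		if (read == ETH_ALEN)
			status = VIRTIO_NET_OK;
		break;
	default:
		break;
	}

	return status;
}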

2022-03-11 04:59:03

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH v2 16/19] vdpa_sim: advertise VIRTIO_NET_F_MTU

On Thu, Feb 24, 2022 at 10:28 PM Gautam Dawar <[email protected]> wrote:
>
> We've already reported the maximum MTU via config space, so let's
> advertise the feature.
>
> Signed-off-by: Jason Wang <[email protected]>
> Signed-off-by: Gautam Dawar <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index d5324f6fd8c7..ff22cc56f40b 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -26,7 +26,8 @@
> #define DRV_LICENSE "GPL v2"
>
> #define VDPASIM_NET_FEATURES (VDPASIM_FEATURES | \
> - (1ULL << VIRTIO_NET_F_MAC))
> + (1ULL << VIRTIO_NET_F_MAC) | \
> + (1ULL << VIRTIO_NET_F_MTU));

Extra semicolon at the end of macro.

Thanks.

>
> #define VDPASIM_NET_VQ_NUM 2
>
> --
> 2.25.0
>
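
With the stray semicolon dropped, the macro would read:

#define VDPASIM_NET_FEATURES	(VDPASIM_FEATURES | \
				 (1ULL << VIRTIO_NET_F_MAC) | \
				 (1ULL << VIRTIO_NET_F_MTU))

The semicolon happens to be harmless where the macro is currently
expanded, but it would break any use of the macro inside a larger
expression, which is why reviewers flag it.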

2022-03-18 08:57:07

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [RFC PATCH v2 19/19] vdpasim: control virtqueue support

On Thu, Feb 24, 2022 at 10:29 PM Gautam Dawar <[email protected]> wrote:
>
> This patch introduces control virtqueue support for the vDPA
> simulator. This is a requirement for supporting advanced features
> like multiqueue.
>
> A requirement for the control virtqueue is to isolate its memory
> access from the rx/tx virtqueues. This is because when a vDPA device
> is used for a VM, the control virtqueue is not directly assigned to
> the VM; userspace (Qemu) presents a shadow control virtqueue so that
> it can record the device state.
>
> The isolation is done via the virtqueue groups and ASID support in
> vDPA through vhost-vdpa. The simulator is extended to have:
>
> 1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue)
> 2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1
> contains CVQ
> 3) two address spaces; the simulator simply implements the address
> spaces by mapping them 1:1 to IOTLBs.
>
> For the VM use cases, userspace (Qemu) may set AS 0 to group 0 and
> AS 1 to group 1. So we have:
>
> 1) The IOTLB for virtqueue group 0 contains the mappings of the
> guest, so RX and TX can be assigned to the guest directly.
> 2) The IOTLB for virtqueue group 1 contains the mappings of the CVQ,
> which are the buffers allocated and managed by the VMM only. So the
> CVQ of vhost-vdpa is visible to the VMM only, and the guest cannot
> access it.
>
> For the other use cases, since AS 0 is associated with all virtqueue
> groups by default, all virtqueues share the same mapping by default.
>
> To demonstrate the function, VIRTIO_NET_F_CTRL_MAC_ADDR is
> implemented in the simulator for the driver to set the mac address.
>
> Signed-off-by: Jason Wang <[email protected]>
> Signed-off-by: Gautam Dawar <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 91 ++++++++++++++++++++++------
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 2 +
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 88 ++++++++++++++++++++++++++-
> 3 files changed, 161 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 659e2e2e4b0c..59611f18a3a8 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -96,11 +96,17 @@ static void vdpasim_do_reset(struct vdpasim *vdpasim)
> {
> int i;
>
> - for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
> + spin_lock(&vdpasim->iommu_lock);
> +
> + for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
> vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> + &vdpasim->iommu_lock);
> + }
> +
> + for (i = 0; i < vdpasim->dev_attr.nas; i++)
> + vhost_iotlb_reset(&vdpasim->iommu[i]);
>
> - spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_reset(vdpasim->iommu);
> spin_unlock(&vdpasim->iommu_lock);
>
> vdpasim->features = 0;
> @@ -145,7 +151,7 @@ static dma_addr_t vdpasim_map_range(struct vdpasim *vdpasim, phys_addr_t paddr,
> dma_addr = iova_dma_addr(&vdpasim->iova, iova);
>
> spin_lock(&vdpasim->iommu_lock);
> - ret = vhost_iotlb_add_range(vdpasim->iommu, (u64)dma_addr,
> + ret = vhost_iotlb_add_range(&vdpasim->iommu[0], (u64)dma_addr,
> (u64)dma_addr + size - 1, (u64)paddr, perm);
> spin_unlock(&vdpasim->iommu_lock);
>
> @@ -161,7 +167,7 @@ static void vdpasim_unmap_range(struct vdpasim *vdpasim, dma_addr_t dma_addr,
> size_t size)
> {
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_del_range(vdpasim->iommu, (u64)dma_addr,
> + vhost_iotlb_del_range(&vdpasim->iommu[0], (u64)dma_addr,
> (u64)dma_addr + size - 1);
> spin_unlock(&vdpasim->iommu_lock);
>
> @@ -250,8 +256,9 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> else
> ops = &vdpasim_config_ops;
>
> - vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
> - 1, dev_attr->name, false);
> + vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> + dev_attr->ngroups, dev_attr->nas,
> + dev_attr->name, false);
> if (IS_ERR(vdpasim)) {
> ret = PTR_ERR(vdpasim);
> goto err_alloc;
> @@ -278,16 +285,20 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> if (!vdpasim->vqs)
> goto err_iommu;
>
> - vdpasim->iommu = vhost_iotlb_alloc(max_iotlb_entries, 0);
> + vdpasim->iommu = kmalloc_array(vdpasim->dev_attr.nas,
> + sizeof(*vdpasim->iommu), GFP_KERNEL);
> if (!vdpasim->iommu)
> goto err_iommu;
>
> + for (i = 0; i < vdpasim->dev_attr.nas; i++)
> + vhost_iotlb_init(&vdpasim->iommu[i], 0, 0);
> +
> vdpasim->buffer = kvmalloc(dev_attr->buffer_size, GFP_KERNEL);
> if (!vdpasim->buffer)
> goto err_iommu;
>
> for (i = 0; i < dev_attr->nvqs; i++)
> - vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> &vdpasim->iommu_lock);
>
> ret = iova_cache_get();
> @@ -401,7 +412,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
>
> static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
> {
> - return 0;
> + /* RX and TX belong to group 0, CVQ belongs to group 1 */
> + if (idx == 2)
> + return 1;
> + else
> + return 0;
> }
>
> static u64 vdpasim_get_device_features(struct vdpa_device *vdpa)
> @@ -539,20 +554,53 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
> return range;
> }
>
> +static int vdpasim_set_group_asid(struct vdpa_device *vdpa, unsigned int group,
> + unsigned int asid)
> +{
> + struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> + struct vhost_iotlb *iommu;
> + int i;
> +
> + if (group > vdpasim->dev_attr.ngroups)
> + return -EINVAL;
> +
> + if (asid > vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> + iommu = &vdpasim->iommu[asid];
> +
> + spin_lock(&vdpasim->lock);
> +
> + for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
> + if (vdpasim_get_vq_group(vdpa, i) == group)
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],

The second argument to the "vringh_set_iotlb" call must be simply
"iommu". Otherwise we're effectively assigning ASID 0 to every
virtqueue group that matches "group", making it impossible to change.

Thanks!

> + &vdpasim->iommu_lock);
> +
> + spin_unlock(&vdpasim->lock);
> +
> + return 0;
> +}
> +
> static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> struct vhost_iotlb_map *map;
> + struct vhost_iotlb *iommu;
> u64 start = 0ULL, last = 0ULL - 1;
> int ret;
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_reset(vdpasim->iommu);
> +
> + iommu = &vdpasim->iommu[asid];
> + vhost_iotlb_reset(iommu);
>
> for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
> map = vhost_iotlb_itree_next(map, start, last)) {
> - ret = vhost_iotlb_add_range(vdpasim->iommu, map->start,
> + ret = vhost_iotlb_add_range(iommu, map->start,
> map->last, map->addr, map->perm);
> if (ret)
> goto err;
> @@ -561,7 +609,7 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> return 0;
>
> err:
> - vhost_iotlb_reset(vdpasim->iommu);
> + vhost_iotlb_reset(iommu);
> spin_unlock(&vdpasim->iommu_lock);
> return ret;
> }
> @@ -573,9 +621,12 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> int ret;
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - ret = vhost_iotlb_add_range_ctx(vdpasim->iommu, iova, iova + size - 1,
> - pa, perm, opaque);
> + ret = vhost_iotlb_add_range_ctx(&vdpasim->iommu[asid], iova,
> + iova + size - 1, pa, perm, opaque);
> spin_unlock(&vdpasim->iommu_lock);
>
> return ret;
> @@ -586,8 +637,11 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_del_range(vdpasim->iommu, iova, iova + size - 1);
> + vhost_iotlb_del_range(&vdpasim->iommu[asid], iova, iova + size - 1);
> spin_unlock(&vdpasim->iommu_lock);
>
> return 0;
> @@ -611,8 +665,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
> }
>
> kvfree(vdpasim->buffer);
> - if (vdpasim->iommu)
> - vhost_iotlb_free(vdpasim->iommu);
> + vhost_iotlb_free(vdpasim->iommu);
> kfree(vdpasim->vqs);
> kfree(vdpasim->config);
> }
> @@ -643,6 +696,7 @@ static const struct vdpa_config_ops vdpasim_config_ops = {
> .set_config = vdpasim_set_config,
> .get_generation = vdpasim_get_generation,
> .get_iova_range = vdpasim_get_iova_range,
> + .set_group_asid = vdpasim_set_group_asid,
> .dma_map = vdpasim_dma_map,
> .dma_unmap = vdpasim_dma_unmap,
> .free = vdpasim_free,
> @@ -674,6 +728,7 @@ static const struct vdpa_config_ops vdpasim_batch_config_ops = {
> .set_config = vdpasim_set_config,
> .get_generation = vdpasim_get_generation,
> .get_iova_range = vdpasim_get_iova_range,
> + .set_group_asid = vdpasim_set_group_asid,
> .set_map = vdpasim_set_map,
> .free = vdpasim_free,
> };
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index 0be7c1e7ef80..622782e92239 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -41,6 +41,8 @@ struct vdpasim_dev_attr {
> size_t buffer_size;
> int nvqs;
> u32 id;
> + u32 ngroups;
> + u32 nas;
>
> work_func_t work_fn;
> void (*get_config)(struct vdpasim *vdpasim, void *config);
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index ed5ade4ae570..513970c05af2 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -26,10 +26,15 @@
> #define DRV_LICENSE "GPL v2"
>
> #define VDPASIM_NET_FEATURES (VDPASIM_FEATURES | \
> + (1ULL << VIRTIO_NET_F_MTU) | \
> (1ULL << VIRTIO_NET_F_MAC) | \
> - (1ULL << VIRTIO_NET_F_MTU));
> + (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> + (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR));
>
> -#define VDPASIM_NET_VQ_NUM 2
> +/* 3 virtqueues, 2 address spaces, 2 virtqueue groups */
> +#define VDPASIM_NET_VQ_NUM 3
> +#define VDPASIM_NET_AS_NUM 2
> +#define VDPASIM_NET_GROUP_NUM 2
>
> static void vdpasim_net_complete(struct vdpasim_virtqueue *vq, size_t len)
> {
> @@ -63,6 +68,81 @@ static bool receive_filter(struct vdpasim *vdpasim, size_t len)
> return false;
> }
>
> +static virtio_net_ctrl_ack vdpasim_handle_ctrl_mac(struct vdpasim *vdpasim,
> + u8 cmd)
> +{
> + struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
> + virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> + size_t read;
> +
> + switch (cmd) {
> + case VIRTIO_NET_CTRL_MAC_ADDR_SET:
> + read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov,
> + (void *)vdpasim->config.mac,
> + ETH_ALEN);
> + if (read == ETH_ALEN)
> + status = VIRTIO_NET_OK;
> + break;
> + default:
> + break;
> + }
> +
> + return status;
> +}
> +
> +static void vdpasim_handle_cvq(struct vdpasim *vdpasim)
> +{
> + struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
> + virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> + struct virtio_net_ctrl_hdr ctrl;
> + size_t read, write;
> + int err;
> +
> + if (!(vdpasim->features & (1ULL << VIRTIO_NET_F_CTRL_VQ)))
> + return;
> +
> + if (!cvq->ready)
> + return;
> +
> + while (true) {
> + err = vringh_getdesc_iotlb(&cvq->vring, &cvq->in_iov,
> + &cvq->out_iov,
> + &cvq->head, GFP_ATOMIC);
> + if (err <= 0)
> + break;
> +
> + read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov, &ctrl,
> + sizeof(ctrl));
> + if (read != sizeof(ctrl))
> + break;
> +
> + switch (ctrl.class) {
> + case VIRTIO_NET_CTRL_MAC:
> + status = vdpasim_handle_ctrl_mac(vdpasim, ctrl.cmd);
> + break;
> + default:
> + break;
> + }
> +
> + /* Make sure data is written before advancing index */
> + smp_wmb();
> +
> + write = vringh_iov_push_iotlb(&cvq->vring, &cvq->out_iov,
> + &status, sizeof(status));
> + vringh_complete_iotlb(&cvq->vring, cvq->head, write);
> + vringh_kiov_cleanup(&cvq->in_iov);
> + vringh_kiov_cleanup(&cvq->out_iov);
> +
> + /* Make sure used is visible before raising the interrupt. */
> + smp_wmb();
> +
> + local_bh_disable();
> + if (cvq->cb)
> + cvq->cb(cvq->private);
> + local_bh_enable();
> + }
> +}
> +
> static void vdpasim_net_work(struct work_struct *work)
> {
> struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> @@ -77,6 +157,8 @@ static void vdpasim_net_work(struct work_struct *work)
> if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> goto out;
>
> + vdpasim_handle_cvq(vdpasim);
> +
> if (!txq->ready || !rxq->ready)
> goto out;
>
> @@ -162,6 +244,8 @@ static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
> dev_attr.id = VIRTIO_ID_NET;
> dev_attr.supported_features = VDPASIM_NET_FEATURES;
> dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
> + dev_attr.ngroups = VDPASIM_NET_GROUP_NUM;
> + dev_attr.nas = VDPASIM_NET_AS_NUM;
> dev_attr.config_size = sizeof(struct virtio_net_config);
> dev_attr.get_config = vdpasim_net_get_config;
> dev_attr.work_fn = vdpasim_net_work;
> --
> 2.25.0
>
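
The bug called out above is worth spelling out: with &vdpasim->iommu[0]
hard-coded, every virtqueue group that matches stays pinned to ASID 0
no matter what the caller requested. The loop presumably needs to use
the iommu already selected by @asid, along these lines (a sketch based
on the quoted function, not the final revision):

	iommu = &vdpasim->iommu[asid];

	spin_lock(&vdpasim->lock);

	/* Point every vring in this group at the IOTLB of @asid */
	for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
		if (vdpasim_get_vq_group(vdpa, i) == group)
			vringh_set_iotlb(&vdpasim->vqs[i].vring, iommu,
					 &vdpasim->iommu_lock);

	spin_unlock(&vdpasim->lock);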

2022-03-22 10:54:51

by Gautam Dawar

[permalink] [raw]
Subject: RE: [RFC PATCH v2 19/19] vdpasim: control virtqueue support

-----Original Message-----
From: Eugenio Perez Martin <[email protected]>
Sent: Friday, March 18, 2022 1:05 PM
To: Gautam Dawar <[email protected]>
Cc: Gautam Dawar <[email protected]>; Martin Petrus Hubertus Habets <[email protected]>; Harpreet Singh Anand <[email protected]>; Tanuj Murlidhar Kamde <[email protected]>; Jason Wang <[email protected]>; Michael S. Tsirkin <[email protected]>; Zhu Lingshan <[email protected]>; Stefano Garzarella <[email protected]>; Xie Yongji <[email protected]>; Eli Cohen <[email protected]>; Si-Wei Liu <[email protected]>; Parav Pandit <[email protected]>; Longpeng <[email protected]>; virtualization <[email protected]>; [email protected]; kvm list <[email protected]>; [email protected]
Subject: Re: [RFC PATCH v2 19/19] vdpasim: control virtqueue support

On Thu, Feb 24, 2022 at 10:29 PM Gautam Dawar <[email protected]> wrote:
>
> This patch introduces the control virtqueue support for vDPA
> simulator. This is a requirement for supporting advanced features like
> multiqueue.
>
> A requirement for the control virtqueue is to isolate its memory access
> from the rx/tx virtqueues. This is because when a vDPA device is used by
> a VM, the control virtqueue is not directly assigned to the VM. Userspace
> (Qemu) presents a shadow control virtqueue to the guest so that it can
> record the device state.
>
> The isolation is done via the virtqueue groups and ASID support in
> vDPA through vhost-vdpa. The simulator is extended to have:
>
> 1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue)
> 2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1
> contains CVQ
> 3) two address spaces; the simulator implements each address space by
> mapping it 1:1 to an IOTLB instance.
>
> For the VM use case, userspace (Qemu) may assign AS 0 to group 0 and AS 1
> to group 1. So we have:
>
> 1) The IOTLB for virtqueue group 0 contains the mappings of the guest, so
> RX and TX can be assigned to the guest directly.
> 2) The IOTLB for virtqueue group 1 contains the mappings of the CVQ,
> whose buffers are allocated and managed by the VMM only. So the CVQ of
> vhost-vdpa is visible to the VMM alone, and the guest cannot access it.
>
> For the other use cases, AS 0 is associated with all virtqueue groups by
> default, so all virtqueues share the same mapping.
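
To make the association flow concrete, here is a minimal userspace sketch
of how a VMM could bind the CVQ's group to a dedicated address space. It
assumes the vhost-vdpa ioctls this series adds (VHOST_VDPA_GET_GROUP_NUM,
VHOST_VDPA_GET_AS_NUM, VHOST_VDPA_GET_VRING_GROUP and
VHOST_VDPA_SET_GROUP_ASID), whose names and semantics may still change in
the final uAPI:

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/vhost.h>

    static int isolate_cvq(int vdpa_fd, uint16_t cvq_index)
    {
        struct vhost_vring_state state;
        uint32_t ngroups, nas;

        if (ioctl(vdpa_fd, VHOST_VDPA_GET_GROUP_NUM, &ngroups) ||
            ioctl(vdpa_fd, VHOST_VDPA_GET_AS_NUM, &nas))
            return -1;

        /* Isolation needs a dedicated group and a spare address space. */
        if (ngroups < 2 || nas < 2)
            return -1;

        /* Look up the virtqueue group that contains the CVQ. */
        state.index = cvq_index;
        if (ioctl(vdpa_fd, VHOST_VDPA_GET_VRING_GROUP, &state))
            return -1;

        /* Bind that group to AS 1; the other groups stay on AS 0. */
        state.index = state.num; /* group id is returned in .num */
        state.num = 1;           /* ASID to use for the CVQ mappings */
        return ioctl(vdpa_fd, VHOST_VDPA_SET_GROUP_ASID, &state);
    }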
>
> To demonstrate the function, VIRTIO_NET_F_CTRL_MAC_ADDR is implemented
> in the simulator so that the driver can set the mac address.
>
> Signed-off-by: Jason Wang <[email protected]>
> Signed-off-by: Gautam Dawar <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 91 ++++++++++++++++++++++------
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 2 +
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 88 ++++++++++++++++++++++++++-
> 3 files changed, 161 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 659e2e2e4b0c..59611f18a3a8 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -96,11 +96,17 @@ static void vdpasim_do_reset(struct vdpasim *vdpasim)
> {
> int i;
>
> - for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
> + spin_lock(&vdpasim->iommu_lock);
> +
> + for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
> vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> + &vdpasim->iommu_lock);
> + }
> +
> + for (i = 0; i < vdpasim->dev_attr.nas; i++)
> + vhost_iotlb_reset(&vdpasim->iommu[i]);
>
> - spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_reset(vdpasim->iommu);
> spin_unlock(&vdpasim->iommu_lock);
>
> vdpasim->features = 0;
> @@ -145,7 +151,7 @@ static dma_addr_t vdpasim_map_range(struct vdpasim *vdpasim, phys_addr_t paddr,
> dma_addr = iova_dma_addr(&vdpasim->iova, iova);
>
> spin_lock(&vdpasim->iommu_lock);
> - ret = vhost_iotlb_add_range(vdpasim->iommu, (u64)dma_addr,
> + ret = vhost_iotlb_add_range(&vdpasim->iommu[0], (u64)dma_addr,
> (u64)dma_addr + size - 1, (u64)paddr, perm);
> spin_unlock(&vdpasim->iommu_lock);
>
> @@ -161,7 +167,7 @@ static void vdpasim_unmap_range(struct vdpasim *vdpasim, dma_addr_t dma_addr,
> size_t size)
> {
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_del_range(vdpasim->iommu, (u64)dma_addr,
> + vhost_iotlb_del_range(&vdpasim->iommu[0], (u64)dma_addr,
> (u64)dma_addr + size - 1);
> spin_unlock(&vdpasim->iommu_lock);
>
> @@ -250,8 +256,9 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> else
> ops = &vdpasim_config_ops;
>
> - vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, 1,
> - 1, dev_attr->name, false);
> + vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> + dev_attr->ngroups, dev_attr->nas,
> + dev_attr->name, false);
> if (IS_ERR(vdpasim)) {
> ret = PTR_ERR(vdpasim);
> goto err_alloc;
> @@ -278,16 +285,20 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
> if (!vdpasim->vqs)
> goto err_iommu;
>
> - vdpasim->iommu = vhost_iotlb_alloc(max_iotlb_entries, 0);
> + vdpasim->iommu = kmalloc_array(vdpasim->dev_attr.nas,
> + sizeof(*vdpasim->iommu),
> + GFP_KERNEL);
> if (!vdpasim->iommu)
> goto err_iommu;
>
> + for (i = 0; i < vdpasim->dev_attr.nas; i++)
> + vhost_iotlb_init(&vdpasim->iommu[i], 0, 0);
> +
> vdpasim->buffer = kvmalloc(dev_attr->buffer_size, GFP_KERNEL);
> if (!vdpasim->buffer)
> goto err_iommu;
>
> for (i = 0; i < dev_attr->nvqs; i++)
> - vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, &vdpasim->iommu[0],
> &vdpasim->iommu_lock);
>
> ret = iova_cache_get();
> @@ -401,7 +412,11 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
>
> static u32 vdpasim_get_vq_group(struct vdpa_device *vdpa, u16 idx)
> {
> - return 0;
> + /* RX and TX belong to group 0, CVQ belongs to group 1 */
> + if (idx == 2)
> + return 1;
> + else
> + return 0;
> }
>
> static u64 vdpasim_get_device_features(struct vdpa_device *vdpa)
> @@ -539,20 +554,53 @@ static struct vdpa_iova_range vdpasim_get_iova_range(struct vdpa_device *vdpa)
> return range;
> }
>
> +static int vdpasim_set_group_asid(struct vdpa_device *vdpa, unsigned int group,
> + unsigned int asid)
> +{
> + struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> + struct vhost_iotlb *iommu;
> + int i;
> +
> + if (group >= vdpasim->dev_attr.ngroups)
> + return -EINVAL;
> +
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> + iommu = &vdpasim->iommu[asid];
> +
> + spin_lock(&vdpasim->lock);
> +
> + for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
> + if (vdpasim_get_vq_group(vdpa, i) == group)
> + vringh_set_iotlb(&vdpasim->vqs[i].vring,
> + &vdpasim->iommu[0],

The second argument to the "vringh_set_iotlb" call must be simply "iommu".
If not, we're effectively assigning ASID 0 to every virtqueue group that matches "group", making it impossible to change the association.
[GD>>] I agree. Will fix this.

Thanks!
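
For clarity, the agreed fix would make the loop use the "iommu" pointer
computed above (sketch, not the final patch):

    for (i = 0; i < vdpasim->dev_attr.nvqs; i++)
        if (vdpasim_get_vq_group(vdpa, i) == group)
            vringh_set_iotlb(&vdpasim->vqs[i].vring, iommu,
                             &vdpasim->iommu_lock);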

> + &vdpasim->iommu_lock);
> +
> + spin_unlock(&vdpasim->lock);
> +
> + return 0;
> +}
> +
> static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vhost_iotlb *iotlb)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> struct vhost_iotlb_map *map;
> + struct vhost_iotlb *iommu;
> u64 start = 0ULL, last = 0ULL - 1;
> int ret;
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_reset(vdpasim->iommu);
> +
> + iommu = &vdpasim->iommu[asid];
> + vhost_iotlb_reset(iommu);
>
> for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
> map = vhost_iotlb_itree_next(map, start, last)) {
> - ret = vhost_iotlb_add_range(vdpasim->iommu, map->start,
> + ret = vhost_iotlb_add_range(iommu, map->start,
> map->last, map->addr, map->perm);
> if (ret)
> goto err;
> @@ -561,7 +609,7 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
> return 0;
>
> err:
> - vhost_iotlb_reset(vdpasim->iommu);
> + vhost_iotlb_reset(iommu);
> spin_unlock(&vdpasim->iommu_lock);
> return ret;
> }
> @@ -573,9 +621,12 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> int ret;
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - ret = vhost_iotlb_add_range_ctx(vdpasim->iommu, iova, iova + size - 1,
> - pa, perm, opaque);
> + ret = vhost_iotlb_add_range_ctx(&vdpasim->iommu[asid], iova,
> + iova + size - 1, pa, perm,
> + opaque);
> spin_unlock(&vdpasim->iommu_lock);
>
> return ret;
> @@ -586,8 +637,11 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> + if (asid >= vdpasim->dev_attr.nas)
> + return -EINVAL;
> +
> spin_lock(&vdpasim->iommu_lock);
> - vhost_iotlb_del_range(vdpasim->iommu, iova, iova + size - 1);
> + vhost_iotlb_del_range(&vdpasim->iommu[asid], iova, iova + size - 1);
> spin_unlock(&vdpasim->iommu_lock);
>
> return 0;
> @@ -611,8 +665,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
> }
>
> kvfree(vdpasim->buffer);
> - if (vdpasim->iommu)
> - vhost_iotlb_free(vdpasim->iommu);
> + vhost_iotlb_free(vdpasim->iommu);
> kfree(vdpasim->vqs);
> kfree(vdpasim->config);
> }
> @@ -643,6 +696,7 @@ static const struct vdpa_config_ops vdpasim_config_ops = {
> .set_config = vdpasim_set_config,
> .get_generation = vdpasim_get_generation,
> .get_iova_range = vdpasim_get_iova_range,
> + .set_group_asid = vdpasim_set_group_asid,
> .dma_map = vdpasim_dma_map,
> .dma_unmap = vdpasim_dma_unmap,
> .free = vdpasim_free,
> @@ -674,6 +728,7 @@ static const struct vdpa_config_ops vdpasim_batch_config_ops = {
> .set_config = vdpasim_set_config,
> .get_generation = vdpasim_get_generation,
> .get_iova_range = vdpasim_get_iova_range,
> + .set_group_asid = vdpasim_set_group_asid,
> .set_map = vdpasim_set_map,
> .free = vdpasim_free,
> };
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index 0be7c1e7ef80..622782e92239 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -41,6 +41,8 @@ struct vdpasim_dev_attr {
> size_t buffer_size;
> int nvqs;
> u32 id;
> + u32 ngroups;
> + u32 nas;
>
> work_func_t work_fn;
> void (*get_config)(struct vdpasim *vdpasim, void *config);
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index ed5ade4ae570..513970c05af2 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -26,10 +26,15 @@
> #define DRV_LICENSE "GPL v2"
>
> #define VDPASIM_NET_FEATURES (VDPASIM_FEATURES | \
> + (1ULL << VIRTIO_NET_F_MTU) | \
> (1ULL << VIRTIO_NET_F_MAC) | \
> - (1ULL << VIRTIO_NET_F_MTU));
> + (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> + (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR));
>
> -#define VDPASIM_NET_VQ_NUM 2
> +/* 3 virtqueues, 2 address spaces, 2 virtqueue groups */
> +#define VDPASIM_NET_VQ_NUM 3
> +#define VDPASIM_NET_AS_NUM 2
> +#define VDPASIM_NET_GROUP_NUM 2
>
> static void vdpasim_net_complete(struct vdpasim_virtqueue *vq, size_t len)
> {
> @@ -63,6 +68,81 @@ static bool receive_filter(struct vdpasim *vdpasim, size_t len)
> return false;
> }
>
> +static virtio_net_ctrl_ack vdpasim_handle_ctrl_mac(struct vdpasim *vdpasim,
> + u8 cmd)
> +{
> + struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
> + virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> + size_t read;
> +
> + switch (cmd) {
> + case VIRTIO_NET_CTRL_MAC_ADDR_SET:
> + read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov,
> + (void *)vdpasim->config.mac,
> + ETH_ALEN);
> + if (read == ETH_ALEN)
> + status = VIRTIO_NET_OK;
> + break;
> + default:
> + break;
> + }
> +
> + return status;
> +}
> +
> +static void vdpasim_handle_cvq(struct vdpasim *vdpasim)
> +{
> + struct vdpasim_virtqueue *cvq = &vdpasim->vqs[2];
> + virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> + struct virtio_net_ctrl_hdr ctrl;
> + size_t read, write;
> + int err;
> +
> + if (!(vdpasim->features & (1ULL << VIRTIO_NET_F_CTRL_VQ)))
> + return;
> +
> + if (!cvq->ready)
> + return;
> +
> + while (true) {
> + err = vringh_getdesc_iotlb(&cvq->vring, &cvq->in_iov,
> + &cvq->out_iov,
> + &cvq->head, GFP_ATOMIC);
> + if (err <= 0)
> + break;
> +
> + read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->in_iov, &ctrl,
> + sizeof(ctrl));
> + if (read != sizeof(ctrl))
> + break;
> +
> + switch (ctrl.class) {
> + case VIRTIO_NET_CTRL_MAC:
> + status = vdpasim_handle_ctrl_mac(vdpasim, ctrl.cmd);
> + break;
> + default:
> + break;
> + }
> +
> + /* Make sure data is written before advancing the index */
> + smp_wmb();
> +
> + write = vringh_iov_push_iotlb(&cvq->vring, &cvq->out_iov,
> + &status, sizeof(status));
> + vringh_complete_iotlb(&cvq->vring, cvq->head, write);
> + vringh_kiov_cleanup(&cvq->in_iov);
> + vringh_kiov_cleanup(&cvq->out_iov);
> +
> + /* Make sure used is visible before raising the interrupt. */
> + smp_wmb();
> +
> + local_bh_disable();
> + if (cvq->cb)
> + cvq->cb(cvq->private);
> + local_bh_enable();
> + }
> +}
> +
> static void vdpasim_net_work(struct work_struct *work)
> {
> struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> @@ -77,6 +157,8 @@ static void vdpasim_net_work(struct work_struct *work)
> if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> goto out;
>
> + vdpasim_handle_cvq(vdpasim);
> +
> if (!txq->ready || !rxq->ready)
> goto out;
>
> @@ -162,6 +244,8 @@ static int vdpasim_net_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
> dev_attr.id = VIRTIO_ID_NET;
> dev_attr.supported_features = VDPASIM_NET_FEATURES;
> dev_attr.nvqs = VDPASIM_NET_VQ_NUM;
> + dev_attr.ngroups = VDPASIM_NET_GROUP_NUM;
> + dev_attr.nas = VDPASIM_NET_AS_NUM;
> dev_attr.config_size = sizeof(struct virtio_net_config);
> dev_attr.get_config = vdpasim_net_get_config;
> dev_attr.work_fn = vdpasim_net_work;
> --
> 2.25.0
>
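
As a closing illustration: once the CVQ group is bound to its own ASID,
the VMM populates that address space with ASID-tagged IOTLB messages
written to the vhost-vdpa fd. A minimal sketch, assuming the vhost_msg_v2
layout from this series (the new __u32 asid field) together with the
pre-existing VHOST_IOTLB_UPDATE message type:

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>
    #include <linux/vhost.h>
    #include <linux/vhost_types.h>

    /* Map one buffer into address space 1 (the CVQ group's ASID). */
    static int map_cvq_buffer(int vdpa_fd, uint64_t iova, void *va,
                              uint64_t size)
    {
        struct vhost_msg_v2 msg;

        memset(&msg, 0, sizeof(msg));
        msg.type = VHOST_IOTLB_MSG_V2;
        msg.asid = 1;
        msg.iotlb.iova = iova;
        msg.iotlb.size = size;
        msg.iotlb.uaddr = (uint64_t)(uintptr_t)va;
        msg.iotlb.perm = VHOST_ACCESS_RW;
        msg.iotlb.type = VHOST_IOTLB_UPDATE;

        return write(vdpa_fd, &msg, sizeof(msg)) == sizeof(msg) ? 0 : -1;
    }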