Thanks to Max, who started this work!
I took his patches, and extended the block simulator a bit.
This series moves the network device simulator into a new module
(vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
module, making it possible to add new vDPA device simulators.
Then we added a new vdpa_sim_blk module to simulate a block device.
I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
bytes when ptr is NULL"); maybe we could add a new function instead of
modifying vringh_iov_xfer().
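To make the alternative more concrete, here is a minimal user-space
sketch of what a dedicated skip helper could look like. It works on a
plain struct iovec array rather than on struct vringh_kiov, and both
struct iov_cursor and iov_skip() are hypothetical names used only for
illustration; they are not existing vringh functions.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical cursor over an iovec array: current element plus the
 * number of bytes already consumed inside that element. */
struct iov_cursor {
	struct iovec *iov;
	size_t used;      /* number of valid elements in iov[] */
	size_t i;         /* index of the current element */
	size_t consumed;  /* bytes consumed in the current element */
};

/* Advance the cursor by up to len bytes without copying any data.
 * Returns the number of bytes actually skipped, which is smaller than
 * len when the iovec array runs out first. */
static size_t iov_skip(struct iov_cursor *c, size_t len)
{
	size_t done = 0;

	assert(c && c->iov);

	while (len && c->i < c->used) {
		size_t avail = c->iov[c->i].iov_len - c->consumed;
		size_t part = len < avail ? len : avail;

		c->consumed += part;
		len -= part;
		done += part;

		/* Move to the next element once this one is exhausted */
		if (c->consumed == c->iov[c->i].iov_len) {
			c->i++;
			c->consumed = 0;
		}
	}

	return done;
}
```

A separate helper like this would keep the copy path free of NULL
checks, at the cost of duplicating the element-walking loop.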
As Max reported, I'm also seeing errors with vdpa_sim_blk related to
iotlb and vringh under high load. These are some of the error
messages that show up randomly:
vringh: Failed to access avail idx at 00000000e8deb2cc
vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
vringh: Failed to get flags at 000000006635d7a3
virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset: 0x2840000 len: 0x20000
virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset: 0x58ee000 len: 0x3000
These errors should all be related to the fact that iotlb_translate()
fails with -EINVAL, so it seems that we are missing some mappings.
I'll debug this more carefully; in the meantime, can you give it a
first review?
Thanks,
Stefano
Max Gurtovoy (4):
vhost-vdpa: add support for vDPA blk devices
vdpa: split vdpasim to core and net modules
vdpa_sim: remove hard-coded virtq count
vdpa: add vdpa simulator for block device
Stefano Garzarella (8):
vdpa_sim: remove the limit of IOTLB entries
vdpa_sim: add struct vdpasim_device to store device properties
vdpa_sim: move config management outside of the core
vdpa_sim: use kvmalloc to allocate vdpasim->buffer
vdpa_sim: make vdpasim->buffer size configurable
vdpa_sim: split vdpasim_virtqueue's iov field in riov and wiov
vringh: allow vringh_iov_xfer() to skip bytes when ptr is NULL
vdpa_sim_blk: implement ramdisk behaviour
drivers/vdpa/vdpa_sim/vdpa_sim.h | 117 +++++++++++
drivers/vdpa/vdpa_sim/vdpa_sim.c | 283 +++++----------------------
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 251 ++++++++++++++++++++++++
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 172 ++++++++++++++++
drivers/vhost/vdpa.c | 11 +-
drivers/vhost/vringh.c | 16 +-
drivers/vdpa/Kconfig | 16 +-
drivers/vdpa/vdpa_sim/Makefile | 2 +
8 files changed, 628 insertions(+), 240 deletions(-)
create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim.h
create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_net.c
--
2.26.2
From: Max Gurtovoy <[email protected]>
Currently only net devices can act as vDPA backends. Add
infrastructure for block devices with a basic feature list that will be
extended in the future.
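As a rough illustration of the per-device config size selection, the
following user-space sketch picks a size from the device ID and rejects
accesses outside it. The IDs 1 (net) and 2 (block) are the virtio
device IDs; the two size constants are placeholders rather than the
real sizeof() values, and the range check with its error codes is an
assumption made for the example, not a copy of the kernel function.

```c
#include <errno.h>
#include <stdint.h>

/* Virtio device IDs: 1 is network, 2 is block */
enum { SKETCH_ID_NET = 1, SKETCH_ID_BLOCK = 2 };

/* Placeholder sizes; the kernel uses sizeof(struct virtio_*_config) */
#define SKETCH_NET_CFG_SIZE 16
#define SKETCH_BLK_CFG_SIZE 64

/* Return 0 if [off, off + len) fits in the config space of device id */
static int sketch_config_validate(uint32_t id, uint32_t off, uint32_t len)
{
	int64_t size = 0;

	switch (id) {
	case SKETCH_ID_NET:
		size = SKETCH_NET_CFG_SIZE;
		break;
	case SKETCH_ID_BLOCK:
		size = SKETCH_BLK_CFG_SIZE;
		break;
	}

	if (len == 0)
		return -EINVAL;

	/* Unknown IDs keep size == 0, so every access is rejected */
	if ((int64_t)len > size - (int64_t)off)
		return -E2BIG;

	return 0;
}
```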
Signed-off-by: Max Gurtovoy <[email protected]>
Reviewed-by: Jason Wang <[email protected]>
Signed-off-by: Stefano Garzarella <[email protected]>
---
drivers/vhost/vdpa.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 2754f3069738..fb0411594963 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -22,6 +22,7 @@
#include <linux/nospec.h>
#include <linux/vhost.h>
#include <linux/virtio_net.h>
+#include <linux/virtio_blk.h>
#include "vhost.h"
@@ -194,6 +195,9 @@ static int vhost_vdpa_config_validate(struct vhost_vdpa *v,
case VIRTIO_ID_NET:
size = sizeof(struct virtio_net_config);
break;
+ case VIRTIO_ID_BLOCK:
+ size = sizeof(struct virtio_blk_config);
+ break;
}
if (c->len == 0)
@@ -975,12 +979,13 @@ static void vhost_vdpa_release_dev(struct device *device)
static int vhost_vdpa_probe(struct vdpa_device *vdpa)
{
const struct vdpa_config_ops *ops = vdpa->config;
+ u32 device_id = ops->get_device_id(vdpa);
struct vhost_vdpa *v;
int minor;
int r;
- /* Currently, we only accept the network devices. */
- if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
+ /* Currently, we only accept the network and block devices. */
+ if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
return -ENOTSUPP;
v = kzalloc(sizeof(*v), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
@@ -998,7 +1003,7 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
v->minor = minor;
v->vdpa = vdpa;
v->nvqs = vdpa->nvqs;
- v->virtio_id = ops->get_device_id(vdpa);
+ v->virtio_id = device_id;
device_initialize(&v->dev);
v->dev.release = vhost_vdpa_release_dev;
--
2.26.2
From: Max Gurtovoy <[email protected]>
Add a new attribute that defines the number of virtqueues to be
created for the vdpasim device.
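The idea can be sketched in user space as follows: the fixed-size
array becomes a pointer that is allocated zero-initialized from the
requested queue count. The sketch_* names are hypothetical and calloc()
stands in for kcalloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_vq {
	bool ready;
	unsigned long long desc_addr;
};

struct sketch_sim {
	struct sketch_vq *vqs;  /* was: a fixed-size array */
	int nvqs;
};

/* Allocate nvqs zeroed virtqueues; returns 0 on success, -1 on error */
static int sketch_sim_init(struct sketch_sim *sim, int nvqs)
{
	assert(sim);

	sim->vqs = NULL;
	sim->nvqs = 0;

	if (nvqs <= 0)
		return -1;

	/* Zeroed memory, so every queue starts as "not ready" */
	sim->vqs = calloc((size_t)nvqs, sizeof(*sim->vqs));
	if (!sim->vqs)
		return -1;

	sim->nvqs = nvqs;
	return 0;
}

static void sketch_sim_free(struct sketch_sim *sim)
{
	free(sim->vqs);
	sim->vqs = NULL;
	sim->nvqs = 0;
}
```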
Signed-off-by: Max Gurtovoy <[email protected]>
[sgarzare: replace kmalloc_array() with kcalloc()]
Signed-off-by: Stefano Garzarella <[email protected]>
---
v1:
- use kcalloc() instead of kmalloc_array() since some functions expect
variables initialized to zero
---
drivers/vdpa/vdpa_sim/vdpa_sim.h | 5 +++--
drivers/vdpa/vdpa_sim/vdpa_sim.c | 14 +++++++++++---
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 +++
3 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index 33613c49888c..6a1267c40d5e 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -20,7 +20,6 @@
#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
#define VDPASIM_QUEUE_MAX 256
#define VDPASIM_VENDOR_ID 0
-#define VDPASIM_VQ_NUM 0x2
#define VDPASIM_FEATURES ((1ULL << VIRTIO_F_ANY_LAYOUT) | \
(1ULL << VIRTIO_F_VERSION_1) | \
@@ -46,12 +45,13 @@ struct vdpasim_init_attr {
u64 features;
work_func_t work_fn;
int batch_mapping;
+ int nvqs;
};
/* State of each vdpasim device */
struct vdpasim {
struct vdpa_device vdpa;
- struct vdpasim_virtqueue vqs[VDPASIM_VQ_NUM];
+ struct vdpasim_virtqueue *vqs;
struct work_struct work;
/* spinlock to synchronize virtqueue state */
spinlock_t lock;
@@ -64,6 +64,7 @@ struct vdpasim {
u32 generation;
u64 features;
u64 supported_features;
+ int nvqs;
/* spinlock to synchronize iommu table */
spinlock_t iommu_lock;
};
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 04f9dc9ce8c8..2b4fea354413 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -54,7 +54,7 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
{
int i;
- for (i = 0; i < VDPASIM_VQ_NUM; i++)
+ for (i = 0; i < vdpasim->nvqs; i++)
vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
spin_lock(&vdpasim->iommu_lock);
@@ -199,7 +199,8 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
else
ops = &vdpasim_config_ops;
- vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
+ vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
+ attr->nvqs);
if (!vdpasim)
goto err_alloc;
@@ -211,8 +212,14 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
if (!vdpasim->config)
goto err_iommu;
+ vdpasim->vqs = kcalloc(attr->nvqs, sizeof(struct vdpasim_virtqueue),
+ GFP_KERNEL);
+ if (!vdpasim->vqs)
+ goto err_iommu;
+
vdpasim->device_id = device_id;
vdpasim->supported_features = attr->features;
+ vdpasim->nvqs = attr->nvqs;
INIT_WORK(&vdpasim->work, attr->work_fn);
spin_lock_init(&vdpasim->lock);
spin_lock_init(&vdpasim->iommu_lock);
@@ -231,7 +238,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
if (!vdpasim->buffer)
goto err_iommu;
- for (i = 0; i < VDPASIM_VQ_NUM; i++)
+ for (i = 0; i < vdpasim->nvqs; i++)
vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
vdpasim->vdpa.dma_dev = dev;
@@ -511,6 +518,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
kfree(vdpasim->buffer);
if (vdpasim->iommu)
vhost_iotlb_free(vdpasim->iommu);
+ kfree(vdpasim->vqs);
kfree(vdpasim->config);
}
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index c68d5488ab54..e1e57c52b108 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -14,6 +14,8 @@
#define VDPASIM_NET_FEATURES (1ULL << VIRTIO_NET_F_MAC)
+#define VDPASIM_NET_VQ_NUM 2
+
static int batch_mapping = 1;
module_param(batch_mapping, int, 0444);
MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 - Enable; 0 - Disable");
@@ -105,6 +107,7 @@ static int __init vdpasim_net_init(void)
attr.device_id = VIRTIO_ID_NET;
attr.features = VDPASIM_FEATURES | VDPASIM_NET_FEATURES;
+ attr.nvqs = VDPASIM_NET_VQ_NUM;
attr.work_fn = vdpasim_net_work;
attr.batch_mapping = batch_mapping;
vdpasim_net_dev = vdpasim_create(&attr);
--
2.26.2
In order to simplify the code of the vdpa_sim core, we move the
config management into each device simulator.
The device must provide the size of its config structure and a
callback, called during vdpasim_set_features(), to update this
structure.
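The pattern can be sketched in user space like this: the core only
knows a size and an optional callback, while the device-specific code
owns the layout. All sketch_* names and the two-field config struct are
hypothetical. Note that the sketch accepts a read ending exactly at the
end of the config space (<=), whereas the check in this patch uses a
strict <.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_sim;

/* Properties supplied by each device simulator */
struct sketch_device {
	size_t config_size;
	void (*update_config)(struct sketch_sim *sim);
};

/* Core state: the config layout is opaque here */
struct sketch_sim {
	struct sketch_device device;
	void *config;
};

/* Device-specific part (a stand-in for a net-like config) */
struct sketch_net_config {
	uint16_t mtu;
	uint16_t status;
};

static void sketch_net_update_config(struct sketch_sim *sim)
{
	struct sketch_net_config *config = sim->config;

	config->mtu = 1500;
	config->status = 1;
}

static int sketch_create(struct sketch_sim *sim,
			 const struct sketch_device *dev)
{
	sim->device = *dev;
	sim->config = calloc(1, dev->config_size);
	return sim->config ? 0 : -1;
}

/* The core calls the hook without knowing the device type */
static void sketch_set_features(struct sketch_sim *sim)
{
	if (sim->device.update_config)
		sim->device.update_config(sim);
}

static int sketch_get_config(struct sketch_sim *sim, size_t offset,
			     void *buf, size_t len)
{
	size_t size = sim->device.config_size;

	/* Overflow-safe form of: offset + len <= size */
	if (offset > size || len > size - offset)
		return -1;

	memcpy(buf, (char *)sim->config + offset, len);
	return 0;
}
```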
Signed-off-by: Stefano Garzarella <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.h | 5 +++--
drivers/vdpa/vdpa_sim/vdpa_sim.c | 29 +++++-----------------------
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 27 ++++++++++++++++----------
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 12 ++++++++++++
4 files changed, 37 insertions(+), 36 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index 76e642042eb0..f7e1fe0a88d3 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -10,8 +10,6 @@
#include <linux/vdpa.h>
#include <linux/vhost_iotlb.h>
#include <uapi/linux/virtio_config.h>
-#include <uapi/linux/virtio_net.h>
-#include <uapi/linux/virtio_blk.h>
#define DRV_VERSION "0.1"
#define DRV_AUTHOR "Jason Wang <[email protected]>"
@@ -42,8 +40,11 @@ struct vdpasim_virtqueue {
struct vdpasim_device {
u64 supported_features;
+ size_t config_size;
u32 id;
int nvqs;
+
+ void (*update_config)(struct vdpasim *vdpasim);
};
struct vdpasim_init_attr {
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index d053bd14b3f8..9c29c2013661 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -185,14 +185,8 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
{
const struct vdpa_config_ops *ops;
struct vdpasim *vdpasim;
- u32 device_id;
struct device *dev;
- int i, size, ret = -ENOMEM;
-
- device_id = attr->device.id;
- /* Currently, we only accept the network and block devices. */
- if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
- return ERR_PTR(-EOPNOTSUPP);
+ int i, ret = -ENOMEM;
if (attr->batch_mapping)
ops = &vdpasim_batch_config_ops;
@@ -206,11 +200,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
vdpasim->device = attr->device;
- if (device_id == VIRTIO_ID_NET)
- size = sizeof(struct virtio_net_config);
- else
- size = sizeof(struct virtio_blk_config);
- vdpasim->config = kzalloc(size, GFP_KERNEL);
+ vdpasim->config = kzalloc(vdpasim->device.config_size, GFP_KERNEL);
if (!vdpasim->config)
goto err_iommu;
@@ -364,13 +354,8 @@ static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
* Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
* implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
*/
- if (vdpasim->device.id == VIRTIO_ID_NET) {
- struct virtio_net_config *config =
- (struct virtio_net_config *)vdpasim->config;
-
- config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
- config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
- }
+ if (vdpasim->device.update_config)
+ vdpasim->device.update_config(vdpasim);
return 0;
}
@@ -426,11 +411,7 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
- if (vdpasim->device.id == VIRTIO_ID_BLOCK &&
- (offset + len < sizeof(struct virtio_blk_config)))
- memcpy(buf, vdpasim->config + offset, len);
- else if (vdpasim->device.id == VIRTIO_ID_NET &&
- (offset + len < sizeof(struct virtio_net_config)))
+ if (offset + len < vdpasim->device.config_size)
memcpy(buf, vdpasim->config + offset, len);
}
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index 363273d72e26..f456a0e4e097 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -7,6 +7,7 @@
*/
#include <linux/module.h>
+#include <uapi/linux/virtio_blk.h>
#include "vdpa_sim.h"
@@ -72,16 +73,31 @@ static void vdpasim_blk_work(struct work_struct *work)
}
+static void vdpasim_blk_update_config(struct vdpasim *vdpasim)
+{
+ struct virtio_blk_config *config =
+ (struct virtio_blk_config *)vdpasim->config;
+
+ config->capacity = cpu_to_vdpasim64(vdpasim, VDPASIM_BLK_CAPACITY);
+ config->size_max = cpu_to_vdpasim32(vdpasim, VDPASIM_BLK_SIZE_MAX);
+ config->seg_max = cpu_to_vdpasim32(vdpasim, VDPASIM_BLK_SEG_MAX);
+ config->num_queues = cpu_to_vdpasim16(vdpasim, VDPASIM_BLK_VQ_NUM);
+ config->min_io_size = cpu_to_vdpasim16(vdpasim, 1);
+ config->opt_io_size = cpu_to_vdpasim32(vdpasim, 1);
+ config->blk_size = cpu_to_vdpasim32(vdpasim, 512);
+}
+
static int __init vdpasim_blk_init(void)
{
struct vdpasim_init_attr attr = {};
- struct virtio_blk_config *config;
int ret;
attr.device.id = VIRTIO_ID_BLOCK;
attr.device.supported_features = VDPASIM_FEATURES |
VDPASIM_BLK_FEATURES;
attr.device.nvqs = VDPASIM_BLK_VQ_NUM;
+ attr.device.config_size = sizeof(struct virtio_blk_config);
+ attr.device.update_config = vdpasim_blk_update_config;
attr.work_fn = vdpasim_blk_work;
@@ -91,15 +107,6 @@ static int __init vdpasim_blk_init(void)
goto out;
}
- config = (struct virtio_blk_config *)vdpasim_blk_dev->config;
- config->capacity = cpu_to_vdpasim64(vdpasim_blk_dev, VDPASIM_BLK_CAPACITY);
- config->size_max = cpu_to_vdpasim32(vdpasim_blk_dev, VDPASIM_BLK_SIZE_MAX);
- config->seg_max = cpu_to_vdpasim32(vdpasim_blk_dev, VDPASIM_BLK_SEG_MAX);
- config->num_queues = cpu_to_vdpasim16(vdpasim_blk_dev, VDPASIM_BLK_VQ_NUM);
- config->min_io_size = cpu_to_vdpasim16(vdpasim_blk_dev, 1);
- config->opt_io_size = cpu_to_vdpasim32(vdpasim_blk_dev, 1);
- config->blk_size = cpu_to_vdpasim32(vdpasim_blk_dev, 512);
-
ret = vdpa_register_device(&vdpasim_blk_dev->vdpa);
if (ret)
goto put_dev;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index 88c9569f6bd3..b9372fdf2415 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -9,6 +9,7 @@
#include <linux/module.h>
#include <linux/etherdevice.h>
+#include <uapi/linux/virtio_net.h>
#include "vdpa_sim.h"
@@ -99,6 +100,15 @@ static void vdpasim_net_work(struct work_struct *work)
spin_unlock(&vdpasim->lock);
}
+static void vdpasim_net_update_config(struct vdpasim *vdpasim)
+{
+ struct virtio_net_config *config =
+ (struct virtio_net_config *)vdpasim->config;
+
+ config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
+ config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
+}
+
static int __init vdpasim_net_init(void)
{
struct vdpasim_init_attr attr = {};
@@ -109,6 +119,8 @@ static int __init vdpasim_net_init(void)
attr.device.supported_features = VDPASIM_FEATURES |
VDPASIM_NET_FEATURES;
attr.device.nvqs = VDPASIM_NET_VQ_NUM;
+ attr.device.config_size = sizeof(struct virtio_net_config);
+ attr.device.update_config = vdpasim_net_update_config;
attr.work_fn = vdpasim_net_work;
attr.batch_mapping = batch_mapping;
--
2.26.2
Move the device properties used during the entire life cycle into a
new structure to simplify copying these fields during vdpasim
initialization.
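A small user-space sketch of the effect, with hypothetical sketch_*
names: because the properties live in one embedded structure, a single
assignment copies them all, and the device keeps its own copy even
though the init attributes typically live on the caller's stack.

```c
#include <stdint.h>

/* Properties that stay constant for the whole life of the device */
struct sketch_device {
	uint64_t supported_features;
	uint32_t id;
	int nvqs;
};

/* Creation-time parameters, usually a stack variable in the caller */
struct sketch_init_attr {
	struct sketch_device device;
	int batch_mapping;
};

struct sketch_sim {
	struct sketch_device device;
	uint64_t features;  /* negotiated later, starts empty */
};

static void sketch_sim_setup(struct sketch_sim *sim,
			     const struct sketch_init_attr *attr)
{
	/* One struct assignment replaces the field-by-field copies */
	sim->device = attr->device;
	sim->features = 0;
}
```

Adding a new property then only requires a new field in the embedded
structure; the copy in the setup path does not change.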
Signed-off-by: Stefano Garzarella <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.h | 17 ++++++++------
drivers/vdpa/vdpa_sim/vdpa_sim.c | 33 ++++++++++++++--------------
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 8 +++++--
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 9 +++++---
4 files changed, 38 insertions(+), 29 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index 6a1267c40d5e..76e642042eb0 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -40,12 +40,17 @@ struct vdpasim_virtqueue {
irqreturn_t (*cb)(void *data);
};
+struct vdpasim_device {
+ u64 supported_features;
+ u32 id;
+ int nvqs;
+};
+
struct vdpasim_init_attr {
- u32 device_id;
- u64 features;
+ struct vdpasim_device device;
+ int batch_mapping;
+
work_func_t work_fn;
- int batch_mapping;
- int nvqs;
};
/* State of each vdpasim device */
@@ -53,18 +58,16 @@ struct vdpasim {
struct vdpa_device vdpa;
struct vdpasim_virtqueue *vqs;
struct work_struct work;
+ struct vdpasim_device device;
/* spinlock to synchronize virtqueue state */
spinlock_t lock;
/* virtio config according to device type */
void *config;
struct vhost_iotlb *iommu;
void *buffer;
- u32 device_id;
u32 status;
u32 generation;
u64 features;
- u64 supported_features;
- int nvqs;
/* spinlock to synchronize iommu table */
spinlock_t iommu_lock;
};
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 9c9717441bbe..d053bd14b3f8 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -28,7 +28,7 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
{
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
- vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
+ vringh_init_iotlb(&vq->vring, vdpasim->device.supported_features,
VDPASIM_QUEUE_MAX, false,
(struct vring_desc *)(uintptr_t)vq->desc_addr,
(struct vring_avail *)
@@ -46,7 +46,7 @@ static void vdpasim_vq_reset(struct vdpasim *vdpasim,
vq->device_addr = 0;
vq->cb = NULL;
vq->private = NULL;
- vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
+ vringh_init_iotlb(&vq->vring, vdpasim->device.supported_features,
VDPASIM_QUEUE_MAX, false, NULL, NULL, NULL);
}
@@ -54,7 +54,7 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
{
int i;
- for (i = 0; i < vdpasim->nvqs; i++)
+ for (i = 0; i < vdpasim->device.nvqs; i++)
vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
spin_lock(&vdpasim->iommu_lock);
@@ -189,7 +189,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
struct device *dev;
int i, size, ret = -ENOMEM;
- device_id = attr->device_id;
+ device_id = attr->device.id;
/* Currently, we only accept the network and block devices. */
if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
return ERR_PTR(-EOPNOTSUPP);
@@ -200,10 +200,12 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
ops = &vdpasim_config_ops;
vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
- attr->nvqs);
+ attr->device.nvqs);
if (!vdpasim)
goto err_alloc;
+ vdpasim->device = attr->device;
+
if (device_id == VIRTIO_ID_NET)
size = sizeof(struct virtio_net_config);
else
@@ -212,14 +214,11 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
if (!vdpasim->config)
goto err_iommu;
- vdpasim->vqs = kcalloc(attr->nvqs, sizeof(struct vdpasim_virtqueue),
- GFP_KERNEL);
+ vdpasim->vqs = kcalloc(vdpasim->device.nvqs,
+ sizeof(struct vdpasim_virtqueue), GFP_KERNEL);
if (!vdpasim->vqs)
goto err_iommu;
- vdpasim->device_id = device_id;
- vdpasim->supported_features = attr->features;
- vdpasim->nvqs = attr->nvqs;
INIT_WORK(&vdpasim->work, attr->work_fn);
spin_lock_init(&vdpasim->lock);
spin_lock_init(&vdpasim->iommu_lock);
@@ -238,7 +237,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
if (!vdpasim->buffer)
goto err_iommu;
- for (i = 0; i < vdpasim->nvqs; i++)
+ for (i = 0; i < vdpasim->device.nvqs; i++)
vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
vdpasim->vdpa.dma_dev = dev;
@@ -347,7 +346,7 @@ static u64 vdpasim_get_features(struct vdpa_device *vdpa)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
- return vdpasim->supported_features;
+ return vdpasim->device.supported_features;
}
static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
@@ -358,14 +357,14 @@ static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return -EINVAL;
- vdpasim->features = features & vdpasim->supported_features;
+ vdpasim->features = features & vdpasim->device.supported_features;
/* We generally only know whether guest is using the legacy interface
* here, so generally that's the earliest we can set config fields.
* Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
* implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
*/
- if (vdpasim->device_id == VIRTIO_ID_NET) {
+ if (vdpasim->device.id == VIRTIO_ID_NET) {
struct virtio_net_config *config =
(struct virtio_net_config *)vdpasim->config;
@@ -391,7 +390,7 @@ static u32 vdpasim_get_device_id(struct vdpa_device *vdpa)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
- return vdpasim->device_id;
+ return vdpasim->device.id;
}
static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa)
@@ -427,10 +426,10 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
- if (vdpasim->device_id == VIRTIO_ID_BLOCK &&
+ if (vdpasim->device.id == VIRTIO_ID_BLOCK &&
(offset + len < sizeof(struct virtio_blk_config)))
memcpy(buf, vdpasim->config + offset, len);
- else if (vdpasim->device_id == VIRTIO_ID_NET &&
+ else if (vdpasim->device.id == VIRTIO_ID_NET &&
(offset + len < sizeof(struct virtio_net_config)))
memcpy(buf, vdpasim->config + offset, len);
}
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index 386dbb2f7138..363273d72e26 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -78,9 +78,13 @@ static int __init vdpasim_blk_init(void)
struct virtio_blk_config *config;
int ret;
- attr.device_id = VIRTIO_ID_BLOCK;
- attr.features = VDPASIM_FEATURES | VDPASIM_BLK_FEATURES;
+ attr.device.id = VIRTIO_ID_BLOCK;
+ attr.device.supported_features = VDPASIM_FEATURES |
+ VDPASIM_BLK_FEATURES;
+ attr.device.nvqs = VDPASIM_BLK_VQ_NUM;
+
attr.work_fn = vdpasim_blk_work;
+
vdpasim_blk_dev = vdpasim_create(&attr);
if (IS_ERR(vdpasim_blk_dev)) {
ret = PTR_ERR(vdpasim_blk_dev);
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index e1e57c52b108..88c9569f6bd3 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -105,11 +105,14 @@ static int __init vdpasim_net_init(void)
struct virtio_net_config *config;
int ret;
- attr.device_id = VIRTIO_ID_NET;
- attr.features = VDPASIM_FEATURES | VDPASIM_NET_FEATURES;
- attr.nvqs = VDPASIM_NET_VQ_NUM;
+ attr.device.id = VIRTIO_ID_NET;
+ attr.device.supported_features = VDPASIM_FEATURES |
+ VDPASIM_NET_FEATURES;
+ attr.device.nvqs = VDPASIM_NET_VQ_NUM;
+
attr.work_fn = vdpasim_net_work;
attr.batch_mapping = batch_mapping;
+
vdpasim_net_dev = vdpasim_create(&attr);
if (IS_ERR(vdpasim_net_dev)) {
ret = PTR_ERR(vdpasim_net_dev);
--
2.26.2
The simulated devices can support multiple queues, so this limit
should be defined according to the number of queues supported by
the device.
Since we are in a simulator, let's simply remove that limit.
Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Stefano Garzarella <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 2b4fea354413..9c9717441bbe 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -230,7 +230,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
goto err_iommu;
set_dma_ops(dev, &vdpasim_dma_ops);
- vdpasim->iommu = vhost_iotlb_alloc(2048, 0);
+ vdpasim->iommu = vhost_iotlb_alloc(0, 0);
if (!vdpasim->iommu)
goto err_iommu;
--
2.26.2
From: Max Gurtovoy <[email protected]>
Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a
preparation for adding a vdpa simulator module for block devices.
Signed-off-by: Max Gurtovoy <[email protected]>
[sgarzare: various cleanups/fixes]
Signed-off-by: Stefano Garzarella <[email protected]>
---
v1:
- Removed unused headers
- Removed empty module_init() module_exit()
- Moved vdpasim_is_little_endian() in vdpa_sim.h
- Moved vdpasim16_to_cpu/cpu_to_vdpasim16() in vdpa_sim.h
- Added vdpasim*_to_cpu/cpu_to_vdpasim*() also for 32 and 64
- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since a selected
option cannot depend on other options [Jason]
---
drivers/vdpa/vdpa_sim/vdpa_sim.h | 110 +++++++++++
drivers/vdpa/vdpa_sim/vdpa_sim.c | 285 ++++++---------------------
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 153 ++++++++++++++
drivers/vdpa/Kconfig | 7 +-
drivers/vdpa/vdpa_sim/Makefile | 1 +
5 files changed, 329 insertions(+), 227 deletions(-)
create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim.h
create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_net.c
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
new file mode 100644
index 000000000000..33613c49888c
--- /dev/null
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ */
+
+#ifndef _VDPA_SIM_H
+#define _VDPA_SIM_H
+
+#include <linux/vringh.h>
+#include <linux/vdpa.h>
+#include <linux/vhost_iotlb.h>
+#include <uapi/linux/virtio_config.h>
+#include <uapi/linux/virtio_net.h>
+#include <uapi/linux/virtio_blk.h>
+
+#define DRV_VERSION "0.1"
+#define DRV_AUTHOR "Jason Wang <[email protected]>"
+#define DRV_LICENSE "GPL v2"
+
+#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
+#define VDPASIM_QUEUE_MAX 256
+#define VDPASIM_VENDOR_ID 0
+#define VDPASIM_VQ_NUM 0x2
+
+#define VDPASIM_FEATURES ((1ULL << VIRTIO_F_ANY_LAYOUT) | \
+ (1ULL << VIRTIO_F_VERSION_1) | \
+ (1ULL << VIRTIO_F_ACCESS_PLATFORM))
+
+struct vdpasim;
+
+struct vdpasim_virtqueue {
+ struct vringh vring;
+ struct vringh_kiov iov;
+ unsigned short head;
+ bool ready;
+ u64 desc_addr;
+ u64 device_addr;
+ u64 driver_addr;
+ u32 num;
+ void *private;
+ irqreturn_t (*cb)(void *data);
+};
+
+struct vdpasim_init_attr {
+ u32 device_id;
+ u64 features;
+ work_func_t work_fn;
+ int batch_mapping;
+};
+
+/* State of each vdpasim device */
+struct vdpasim {
+ struct vdpa_device vdpa;
+ struct vdpasim_virtqueue vqs[VDPASIM_VQ_NUM];
+ struct work_struct work;
+ /* spinlock to synchronize virtqueue state */
+ spinlock_t lock;
+ /* virtio config according to device type */
+ void *config;
+ struct vhost_iotlb *iommu;
+ void *buffer;
+ u32 device_id;
+ u32 status;
+ u32 generation;
+ u64 features;
+ u64 supported_features;
+ /* spinlock to synchronize iommu table */
+ spinlock_t iommu_lock;
+};
+
+struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr);
+
+/* TODO: cross-endian support */
+static inline bool vdpasim_is_little_endian(struct vdpasim *vdpasim)
+{
+ return virtio_legacy_is_little_endian() ||
+ (vdpasim->features & (1ULL << VIRTIO_F_VERSION_1));
+}
+
+static inline u16 vdpasim16_to_cpu(struct vdpasim *vdpasim, __virtio16 val)
+{
+ return __virtio16_to_cpu(vdpasim_is_little_endian(vdpasim), val);
+}
+
+static inline __virtio16 cpu_to_vdpasim16(struct vdpasim *vdpasim, u16 val)
+{
+ return __cpu_to_virtio16(vdpasim_is_little_endian(vdpasim), val);
+}
+
+static inline u32 vdpasim32_to_cpu(struct vdpasim *vdpasim, __virtio32 val)
+{
+ return __virtio32_to_cpu(vdpasim_is_little_endian(vdpasim), val);
+}
+
+static inline __virtio32 cpu_to_vdpasim32(struct vdpasim *vdpasim, u32 val)
+{
+ return __cpu_to_virtio32(vdpasim_is_little_endian(vdpasim), val);
+}
+
+static inline u64 vdpasim64_to_cpu(struct vdpasim *vdpasim, __virtio64 val)
+{
+ return __virtio64_to_cpu(vdpasim_is_little_endian(vdpasim), val);
+}
+
+static inline __virtio64 cpu_to_vdpasim64(struct vdpasim *vdpasim, u64 val)
+{
+ return __cpu_to_virtio64(vdpasim_is_little_endian(vdpasim), val);
+}
+
+#endif
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 6a90fdb9cbfc..04f9dc9ce8c8 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -1,107 +1,16 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
- * VDPA networking device simulator.
+ * VDPA simulator core.
*
* Copyright (c) 2020, Red Hat Inc. All rights reserved.
* Author: Jason Wang <[email protected]>
*
*/
-#include <linux/init.h>
#include <linux/module.h>
-#include <linux/device.h>
-#include <linux/kernel.h>
-#include <linux/fs.h>
-#include <linux/poll.h>
-#include <linux/slab.h>
-#include <linux/sched.h>
-#include <linux/wait.h>
-#include <linux/uuid.h>
-#include <linux/iommu.h>
#include <linux/dma-map-ops.h>
-#include <linux/sysfs.h>
-#include <linux/file.h>
-#include <linux/etherdevice.h>
-#include <linux/vringh.h>
-#include <linux/vdpa.h>
-#include <linux/virtio_byteorder.h>
-#include <linux/vhost_iotlb.h>
-#include <uapi/linux/virtio_config.h>
-#include <uapi/linux/virtio_net.h>
-
-#define DRV_VERSION "0.1"
-#define DRV_AUTHOR "Jason Wang <[email protected]>"
-#define DRV_DESC "vDPA Device Simulator"
-#define DRV_LICENSE "GPL v2"
-
-static int batch_mapping = 1;
-module_param(batch_mapping, int, 0444);
-MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 -Enable; 0 - Disable");
-
-static char *macaddr;
-module_param(macaddr, charp, 0);
-MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
-
-struct vdpasim_virtqueue {
- struct vringh vring;
- struct vringh_kiov iov;
- unsigned short head;
- bool ready;
- u64 desc_addr;
- u64 device_addr;
- u64 driver_addr;
- u32 num;
- void *private;
- irqreturn_t (*cb)(void *data);
-};
-
-#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
-#define VDPASIM_QUEUE_MAX 256
-#define VDPASIM_DEVICE_ID 0x1
-#define VDPASIM_VENDOR_ID 0
-#define VDPASIM_VQ_NUM 0x2
-#define VDPASIM_NAME "vdpasim-netdev"
-
-static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
- (1ULL << VIRTIO_F_VERSION_1) |
- (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
- (1ULL << VIRTIO_NET_F_MAC);
-
-/* State of each vdpasim device */
-struct vdpasim {
- struct vdpa_device vdpa;
- struct vdpasim_virtqueue vqs[VDPASIM_VQ_NUM];
- struct work_struct work;
- /* spinlock to synchronize virtqueue state */
- spinlock_t lock;
- struct virtio_net_config config;
- struct vhost_iotlb *iommu;
- void *buffer;
- u32 status;
- u32 generation;
- u64 features;
- /* spinlock to synchronize iommu table */
- spinlock_t iommu_lock;
-};
-
-/* TODO: cross-endian support */
-static inline bool vdpasim_is_little_endian(struct vdpasim *vdpasim)
-{
- return virtio_legacy_is_little_endian() ||
- (vdpasim->features & (1ULL << VIRTIO_F_VERSION_1));
-}
-
-static inline u16 vdpasim16_to_cpu(struct vdpasim *vdpasim, __virtio16 val)
-{
- return __virtio16_to_cpu(vdpasim_is_little_endian(vdpasim), val);
-}
-
-static inline __virtio16 cpu_to_vdpasim16(struct vdpasim *vdpasim, u16 val)
-{
- return __cpu_to_virtio16(vdpasim_is_little_endian(vdpasim), val);
-}
-static struct vdpasim *vdpasim_dev;
+#include "vdpa_sim.h"
static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
{
@@ -119,7 +28,7 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
{
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
- vringh_init_iotlb(&vq->vring, vdpasim_features,
+ vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
VDPASIM_QUEUE_MAX, false,
(struct vring_desc *)(uintptr_t)vq->desc_addr,
(struct vring_avail *)
@@ -128,7 +37,8 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
(uintptr_t)vq->device_addr);
}
-static void vdpasim_vq_reset(struct vdpasim_virtqueue *vq)
+static void vdpasim_vq_reset(struct vdpasim *vdpasim,
+ struct vdpasim_virtqueue *vq)
{
vq->ready = false;
vq->desc_addr = 0;
@@ -136,8 +46,8 @@ static void vdpasim_vq_reset(struct vdpasim_virtqueue *vq)
vq->device_addr = 0;
vq->cb = NULL;
vq->private = NULL;
- vringh_init_iotlb(&vq->vring, vdpasim_features, VDPASIM_QUEUE_MAX,
- false, NULL, NULL, NULL);
+ vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
+ VDPASIM_QUEUE_MAX, false, NULL, NULL, NULL);
}
static void vdpasim_reset(struct vdpasim *vdpasim)
@@ -145,7 +55,7 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
int i;
for (i = 0; i < VDPASIM_VQ_NUM; i++)
- vdpasim_vq_reset(&vdpasim->vqs[i]);
+ vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
spin_lock(&vdpasim->iommu_lock);
vhost_iotlb_reset(vdpasim->iommu);
@@ -156,80 +66,6 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
++vdpasim->generation;
}
-static void vdpasim_work(struct work_struct *work)
-{
- struct vdpasim *vdpasim = container_of(work, struct
- vdpasim, work);
- struct vdpasim_virtqueue *txq = &vdpasim->vqs[1];
- struct vdpasim_virtqueue *rxq = &vdpasim->vqs[0];
- ssize_t read, write;
- size_t total_write;
- int pkts = 0;
- int err;
-
- spin_lock(&vdpasim->lock);
-
- if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
- goto out;
-
- if (!txq->ready || !rxq->ready)
- goto out;
-
- while (true) {
- total_write = 0;
- err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
- &txq->head, GFP_ATOMIC);
- if (err <= 0)
- break;
-
- err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
- &rxq->head, GFP_ATOMIC);
- if (err <= 0) {
- vringh_complete_iotlb(&txq->vring, txq->head, 0);
- break;
- }
-
- while (true) {
- read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
- vdpasim->buffer,
- PAGE_SIZE);
- if (read <= 0)
- break;
-
- write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
- vdpasim->buffer, read);
- if (write <= 0)
- break;
-
- total_write += write;
- }
-
- /* Make sure data is wrote before advancing index */
- smp_wmb();
-
- vringh_complete_iotlb(&txq->vring, txq->head, 0);
- vringh_complete_iotlb(&rxq->vring, rxq->head, total_write);
-
- /* Make sure used is visible before rasing the interrupt. */
- smp_wmb();
-
- local_bh_disable();
- if (txq->cb)
- txq->cb(txq->private);
- if (rxq->cb)
- rxq->cb(rxq->private);
- local_bh_enable();
-
- if (++pkts > 4) {
- schedule_work(&vdpasim->work);
- goto out;
- }
- }
-
-out:
- spin_unlock(&vdpasim->lock);
-}
-
static int dir_to_perm(enum dma_data_direction dir)
{
int perm = -EFAULT;
@@ -342,26 +178,42 @@ static const struct dma_map_ops vdpasim_dma_ops = {
.free = vdpasim_free_coherent,
};
-static const struct vdpa_config_ops vdpasim_net_config_ops;
-static const struct vdpa_config_ops vdpasim_net_batch_config_ops;
+static const struct vdpa_config_ops vdpasim_config_ops;
+static const struct vdpa_config_ops vdpasim_batch_config_ops;
-static struct vdpasim *vdpasim_create(void)
+struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
{
const struct vdpa_config_ops *ops;
struct vdpasim *vdpasim;
+ u32 device_id;
struct device *dev;
- int ret = -ENOMEM;
+ int i, size, ret = -ENOMEM;
- if (batch_mapping)
- ops = &vdpasim_net_batch_config_ops;
+ device_id = attr->device_id;
+ /* Currently, we only accept the network and block devices. */
+ if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
+ return ERR_PTR(-EOPNOTSUPP);
+
+ if (attr->batch_mapping)
+ ops = &vdpasim_batch_config_ops;
else
- ops = &vdpasim_net_config_ops;
+ ops = &vdpasim_config_ops;
vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
if (!vdpasim)
goto err_alloc;
- INIT_WORK(&vdpasim->work, vdpasim_work);
+ if (device_id == VIRTIO_ID_NET)
+ size = sizeof(struct virtio_net_config);
+ else
+ size = sizeof(struct virtio_blk_config);
+ vdpasim->config = kzalloc(size, GFP_KERNEL);
+ if (!vdpasim->config)
+ goto err_iommu;
+
+ vdpasim->device_id = device_id;
+ vdpasim->supported_features = attr->features;
+ INIT_WORK(&vdpasim->work, attr->work_fn);
spin_lock_init(&vdpasim->lock);
spin_lock_init(&vdpasim->iommu_lock);
@@ -379,23 +231,10 @@ static struct vdpasim *vdpasim_create(void)
if (!vdpasim->buffer)
goto err_iommu;
- if (macaddr) {
- mac_pton(macaddr, vdpasim->config.mac);
- if (!is_valid_ether_addr(vdpasim->config.mac)) {
- ret = -EADDRNOTAVAIL;
- goto err_iommu;
- }
- } else {
- eth_random_addr(vdpasim->config.mac);
- }
-
- vringh_set_iotlb(&vdpasim->vqs[0].vring, vdpasim->iommu);
- vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu);
+ for (i = 0; i < VDPASIM_VQ_NUM; i++)
+ vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
vdpasim->vdpa.dma_dev = dev;
- ret = vdpa_register_device(&vdpasim->vdpa);
- if (ret)
- goto err_iommu;
return vdpasim;
@@ -404,6 +243,7 @@ static struct vdpasim *vdpasim_create(void)
err_alloc:
return ERR_PTR(ret);
}
+EXPORT_SYMBOL_GPL(vdpasim_create);
static int vdpasim_set_vq_address(struct vdpa_device *vdpa, u16 idx,
u64 desc_area, u64 driver_area,
@@ -498,28 +338,34 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
static u64 vdpasim_get_features(struct vdpa_device *vdpa)
{
- return vdpasim_features;
+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+
+ return vdpasim->supported_features;
}
static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
- struct virtio_net_config *config = &vdpasim->config;
/* DMA mapping must be done by driver */
if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return -EINVAL;
- vdpasim->features = features & vdpasim_features;
+ vdpasim->features = features & vdpasim->supported_features;
/* We generally only know whether guest is using the legacy interface
* here, so generally that's the earliest we can set config fields.
* Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
* implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
*/
+ if (vdpasim->device_id == VIRTIO_ID_NET) {
+ struct virtio_net_config *config =
+ (struct virtio_net_config *)vdpasim->config;
+
+ config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
+ config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
+ }
- config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
- config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
return 0;
}
@@ -536,7 +382,9 @@ static u16 vdpasim_get_vq_num_max(struct vdpa_device *vdpa)
static u32 vdpasim_get_device_id(struct vdpa_device *vdpa)
{
- return VDPASIM_DEVICE_ID;
+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+
+ return vdpasim->device_id;
}
static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa)
@@ -572,8 +420,12 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
- if (offset + len < sizeof(struct virtio_net_config))
- memcpy(buf, (u8 *)&vdpasim->config + offset, len);
+ if (vdpasim->device_id == VIRTIO_ID_BLOCK &&
+ (offset + len <= sizeof(struct virtio_blk_config)))
+ memcpy(buf, vdpasim->config + offset, len);
+ else if (vdpasim->device_id == VIRTIO_ID_NET &&
+ (offset + len <= sizeof(struct virtio_net_config)))
+ memcpy(buf, vdpasim->config + offset, len);
}
static void vdpasim_set_config(struct vdpa_device *vdpa, unsigned int offset,
@@ -659,9 +511,10 @@ static void vdpasim_free(struct vdpa_device *vdpa)
kfree(vdpasim->buffer);
if (vdpasim->iommu)
vhost_iotlb_free(vdpasim->iommu);
+ kfree(vdpasim->config);
}
-static const struct vdpa_config_ops vdpasim_net_config_ops = {
+static const struct vdpa_config_ops vdpasim_config_ops = {
.set_vq_address = vdpasim_set_vq_address,
.set_vq_num = vdpasim_set_vq_num,
.kick_vq = vdpasim_kick_vq,
@@ -688,7 +541,7 @@ static const struct vdpa_config_ops vdpasim_net_config_ops = {
.free = vdpasim_free,
};
-static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
+static const struct vdpa_config_ops vdpasim_batch_config_ops = {
.set_vq_address = vdpasim_set_vq_address,
.set_vq_num = vdpasim_set_vq_num,
.kick_vq = vdpasim_kick_vq,
@@ -714,27 +567,7 @@ static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
.free = vdpasim_free,
};
-static int __init vdpasim_dev_init(void)
-{
- vdpasim_dev = vdpasim_create();
-
- if (!IS_ERR(vdpasim_dev))
- return 0;
-
- return PTR_ERR(vdpasim_dev);
-}
-
-static void __exit vdpasim_dev_exit(void)
-{
- struct vdpa_device *vdpa = &vdpasim_dev->vdpa;
-
- vdpa_unregister_device(vdpa);
-}
-
-module_init(vdpasim_dev_init)
-module_exit(vdpasim_dev_exit)
-
MODULE_VERSION(DRV_VERSION);
MODULE_LICENSE(DRV_LICENSE);
MODULE_AUTHOR(DRV_AUTHOR);
-MODULE_DESCRIPTION(DRV_DESC);
+MODULE_DESCRIPTION("vDPA Simulator core");
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
new file mode 100644
index 000000000000..c68d5488ab54
--- /dev/null
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA simulator for networking device.
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang <[email protected]>
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/etherdevice.h>
+
+#include "vdpa_sim.h"
+
+#define VDPASIM_NET_FEATURES (1ULL << VIRTIO_NET_F_MAC)
+
+static int batch_mapping = 1;
+module_param(batch_mapping, int, 0444);
+MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 - Enable; 0 - Disable");
+
+static char *macaddr;
+module_param(macaddr, charp, 0);
+MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
+
+static struct vdpasim *vdpasim_net_dev;
+
+static void vdpasim_net_work(struct work_struct *work)
+{
+ struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
+ struct vdpasim_virtqueue *txq = &vdpasim->vqs[1];
+ struct vdpasim_virtqueue *rxq = &vdpasim->vqs[0];
+ ssize_t read, write;
+ size_t total_write;
+ int pkts = 0;
+ int err;
+
+ spin_lock(&vdpasim->lock);
+
+ if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
+ goto out;
+
+ if (!txq->ready || !rxq->ready)
+ goto out;
+
+ while (true) {
+ total_write = 0;
+ err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
+ &txq->head, GFP_ATOMIC);
+ if (err <= 0)
+ break;
+
+ err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
+ &rxq->head, GFP_ATOMIC);
+ if (err <= 0) {
+ vringh_complete_iotlb(&txq->vring, txq->head, 0);
+ break;
+ }
+
+ while (true) {
+ read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
+ vdpasim->buffer,
+ PAGE_SIZE);
+ if (read <= 0)
+ break;
+
+ write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
+ vdpasim->buffer, read);
+ if (write <= 0)
+ break;
+
+ total_write += write;
+ }
+
+ /* Make sure data is wrote before advancing index */
+ smp_wmb();
+
+ vringh_complete_iotlb(&txq->vring, txq->head, 0);
+ vringh_complete_iotlb(&rxq->vring, rxq->head, total_write);
+
+ /* Make sure used is visible before rasing the interrupt. */
+ smp_wmb();
+
+ local_bh_disable();
+ if (txq->cb)
+ txq->cb(txq->private);
+ if (rxq->cb)
+ rxq->cb(rxq->private);
+ local_bh_enable();
+
+ if (++pkts > 4) {
+ schedule_work(&vdpasim->work);
+ goto out;
+ }
+ }
+
+out:
+ spin_unlock(&vdpasim->lock);
+}
+
+static int __init vdpasim_net_init(void)
+{
+ struct vdpasim_init_attr attr = {};
+ struct virtio_net_config *config;
+ int ret;
+
+ attr.device_id = VIRTIO_ID_NET;
+ attr.features = VDPASIM_FEATURES | VDPASIM_NET_FEATURES;
+ attr.work_fn = vdpasim_net_work;
+ attr.batch_mapping = batch_mapping;
+ vdpasim_net_dev = vdpasim_create(&attr);
+ if (IS_ERR(vdpasim_net_dev)) {
+ ret = PTR_ERR(vdpasim_net_dev);
+ goto out;
+ }
+
+ config = (struct virtio_net_config *)vdpasim_net_dev->config;
+
+ if (macaddr) {
+ mac_pton(macaddr, config->mac);
+ if (!is_valid_ether_addr(config->mac)) {
+ ret = -EADDRNOTAVAIL;
+ goto put_dev;
+ }
+ } else {
+ eth_random_addr(config->mac);
+ }
+
+ ret = vdpa_register_device(&vdpasim_net_dev->vdpa);
+ if (ret)
+ goto put_dev;
+
+ return 0;
+
+put_dev:
+ put_device(&vdpasim_net_dev->vdpa.dev);
+out:
+ return ret;
+}
+
+static void __exit vdpasim_net_exit(void)
+{
+ struct vdpa_device *vdpa = &vdpasim_net_dev->vdpa;
+
+ vdpa_unregister_device(vdpa);
+}
+
+module_init(vdpasim_net_init);
+module_exit(vdpasim_net_exit);
+
+MODULE_VERSION(DRV_VERSION);
+MODULE_LICENSE(DRV_LICENSE);
+MODULE_AUTHOR(DRV_AUTHOR);
+MODULE_DESCRIPTION("vDPA Device Simulator for networking device");
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index d7d32b656102..fdb1a9267347 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -9,11 +9,16 @@ menuconfig VDPA
if VDPA
config VDPA_SIM
- tristate "vDPA device simulator"
+ tristate "vDPA simulator core"
depends on RUNTIME_TESTING_MENU && HAS_DMA
select DMA_OPS
select VHOST_RING
default n
+
+config VDPA_SIM_NET
+ tristate "vDPA simulator for networking device"
+ depends on VDPA_SIM
+ default n
help
vDPA networking device simulator which loop TX traffic back
to RX. This device is used for testing, prototyping and
diff --git a/drivers/vdpa/vdpa_sim/Makefile b/drivers/vdpa/vdpa_sim/Makefile
index b40278f65e04..79d4536d347e 100644
--- a/drivers/vdpa/vdpa_sim/Makefile
+++ b/drivers/vdpa/vdpa_sim/Makefile
@@ -1,2 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
+obj-$(CONFIG_VDPA_SIM_NET) += vdpa_sim_net.o
--
2.26.2
vringh_getdesc_iotlb() manages 2 iovs for writable and readable
descriptors. This is very useful for the block device, where for
each request we have both types of descriptor.
Let's split the vdpasim_virtqueue's iov field into riov and wiov
to use them with vringh_getdesc_iotlb().
Signed-off-by: Stefano Garzarella <[email protected]>
---
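To illustrate why a block request needs both iovs, here is a minimal
userspace sketch (not part of the patch). It assumes simplified stand-in
types: struct sim_desc, struct sim_split and sim_split_chain() are
hypothetical names, and only the flag value mirrors VRING_DESC_F_WRITE.
It models how one descriptor chain is accounted to a readable and a
writable side, which is the job vringh_getdesc_iotlb() does with riov
and wiov.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_DESC_F_WRITE 2	/* same value as VRING_DESC_F_WRITE */

/* Simplified stand-in for a vring descriptor (address omitted). */
struct sim_desc {
	uint32_t len;
	uint16_t flags;
};

/* Byte totals that would end up in the two iovs. */
struct sim_split {
	size_t r_bytes;	/* device-readable, i.e. riov */
	size_t w_bytes;	/* device-writable, i.e. wiov */
};

/*
 * Walk one descriptor chain and account each descriptor to the readable
 * or writable side. Returns 0 on success, -1 if a readable descriptor
 * follows a writable one (virtio requires readable descriptors first).
 */
int sim_split_chain(const struct sim_desc *chain, size_t n,
		    struct sim_split *out)
{
	int seen_write = 0;
	size_t i;

	assert(chain != NULL && out != NULL);

	out->r_bytes = 0;
	out->w_bytes = 0;

	for (i = 0; i < n; i++) {
		if (chain[i].flags & SIM_DESC_F_WRITE) {
			seen_write = 1;
			out->w_bytes += chain[i].len;
		} else {
			if (seen_write)
				return -1;
			out->r_bytes += chain[i].len;
		}
	}

	return 0;
}
```

For a read request made of a 16-byte header, a 4096-byte data buffer and
the status byte, the readable side holds 16 bytes and the writable side
4097 bytes, the last of which is the status.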
drivers/vdpa/vdpa_sim/vdpa_sim.h | 3 ++-
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 6 +++---
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 8 ++++----
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index cc21e07aa2f7..0d4629675e4b 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -27,7 +27,8 @@ struct vdpasim;
struct vdpasim_virtqueue {
struct vringh vring;
- struct vringh_kiov iov;
+ struct vringh_kiov riov;
+ struct vringh_kiov wiov;
unsigned short head;
bool ready;
u64 desc_addr;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index 122a3c039507..8e41b3ab98d5 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -41,13 +41,13 @@ static void vdpasim_blk_work(struct work_struct *work)
if (!vq->ready)
continue;
- while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
+ while (vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
&vq->head, GFP_ATOMIC) > 0) {
int write;
- vq->iov.i = vq->iov.used - 1;
- write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
+ vq->wiov.i = vq->wiov.used - 1;
+ write = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
if (write <= 0)
break;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index d0a1403f64b2..783b1e85b09c 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -47,12 +47,12 @@ static void vdpasim_net_work(struct work_struct *work)
while (true) {
total_write = 0;
- err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
+ err = vringh_getdesc_iotlb(&txq->vring, &txq->riov, NULL,
&txq->head, GFP_ATOMIC);
if (err <= 0)
break;
- err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
+ err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->wiov,
&rxq->head, GFP_ATOMIC);
if (err <= 0) {
vringh_complete_iotlb(&txq->vring, txq->head, 0);
@@ -60,13 +60,13 @@ static void vdpasim_net_work(struct work_struct *work)
}
while (true) {
- read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
+ read = vringh_iov_pull_iotlb(&txq->vring, &txq->riov,
vdpasim->buffer,
PAGE_SIZE);
if (read <= 0)
break;
- write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
+ write = vringh_iov_push_iotlb(&rxq->vring, &rxq->wiov,
vdpasim->buffer, read);
if (write <= 0)
break;
--
2.26.2
The next patch will make the buffer size configurable from each
device.
Since the buffer could be larger than a page, we use kvmalloc()
instead of kmalloc().
Signed-off-by: Stefano Garzarella <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 9c29c2013661..bd034fbf4683 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -223,7 +223,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
if (!vdpasim->iommu)
goto err_iommu;
- vdpasim->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ vdpasim->buffer = kvmalloc(PAGE_SIZE, GFP_KERNEL);
if (!vdpasim->buffer)
goto err_iommu;
@@ -495,7 +495,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
cancel_work_sync(&vdpasim->work);
- kfree(vdpasim->buffer);
+ kvfree(vdpasim->buffer);
if (vdpasim->iommu)
vhost_iotlb_free(vdpasim->iommu);
kfree(vdpasim->vqs);
--
2.26.2
Allow each device to specify the size of the buffer allocated
in vdpa_sim.
Signed-off-by: Stefano Garzarella <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +
drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 1 +
drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 1 +
4 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index f7e1fe0a88d3..cc21e07aa2f7 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -49,6 +49,7 @@ struct vdpasim_device {
struct vdpasim_init_attr {
struct vdpasim_device device;
+ size_t buffer_size;
int batch_mapping;
work_func_t work_fn;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index bd034fbf4683..3863d49e0d6d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -223,7 +223,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
if (!vdpasim->iommu)
goto err_iommu;
- vdpasim->buffer = kvmalloc(PAGE_SIZE, GFP_KERNEL);
+ vdpasim->buffer = kvmalloc(attr->buffer_size, GFP_KERNEL);
if (!vdpasim->buffer)
goto err_iommu;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index f456a0e4e097..122a3c039507 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -100,6 +100,7 @@ static int __init vdpasim_blk_init(void)
attr.device.update_config = vdpasim_blk_update_config;
attr.work_fn = vdpasim_blk_work;
+ attr.buffer_size = PAGE_SIZE;
vdpasim_blk_dev = vdpasim_create(&attr);
if (IS_ERR(vdpasim_blk_dev)) {
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
index b9372fdf2415..d0a1403f64b2 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
@@ -124,6 +124,7 @@ static int __init vdpasim_net_init(void)
attr.work_fn = vdpasim_net_work;
attr.batch_mapping = batch_mapping;
+ attr.buffer_size = PAGE_SIZE;
vdpasim_net_dev = vdpasim_create(&attr);
if (IS_ERR(vdpasim_net_dev)) {
--
2.26.2
From: Max Gurtovoy <[email protected]>
This will allow running vDPA for virtio block protocol.
Signed-off-by: Max Gurtovoy <[email protected]>
[sgarzare: various cleanups/fixes]
Signed-off-by: Stefano Garzarella <[email protected]>
---
v1:
- Removed unused headers
- Used cpu_to_vdpasim*() to store config fields
- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since a selected
option cannot depend on other options [Jason]
- Start with a single queue for now [Jason]
- Add comments to memory barriers
---
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 124 +++++++++++++++++++++++++++
drivers/vdpa/Kconfig | 9 ++
drivers/vdpa/vdpa_sim/Makefile | 1 +
3 files changed, 134 insertions(+)
create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
new file mode 100644
index 000000000000..386dbb2f7138
--- /dev/null
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA simulator for block device.
+ *
+ * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
+ *
+ */
+
+#include <linux/module.h>
+
+#include "vdpa_sim.h"
+
+#define VDPASIM_BLK_FEATURES ((1ULL << VIRTIO_BLK_F_SIZE_MAX) | \
+ (1ULL << VIRTIO_BLK_F_SEG_MAX) | \
+ (1ULL << VIRTIO_BLK_F_BLK_SIZE) | \
+ (1ULL << VIRTIO_BLK_F_TOPOLOGY) | \
+ (1ULL << VIRTIO_BLK_F_MQ))
+
+#define VDPASIM_BLK_CAPACITY 0x40000
+#define VDPASIM_BLK_SIZE_MAX 0x1000
+#define VDPASIM_BLK_SEG_MAX 32
+#define VDPASIM_BLK_VQ_NUM 1
+
+static struct vdpasim *vdpasim_blk_dev;
+
+static void vdpasim_blk_work(struct work_struct *work)
+{
+ struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
+ u8 status = VIRTIO_BLK_S_OK;
+ int i;
+
+ spin_lock(&vdpasim->lock);
+
+ if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
+ goto out;
+
+ for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
+ struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
+
+ if (!vq->ready)
+ continue;
+
+ while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
+ &vq->head, GFP_ATOMIC) > 0) {
+
+ int write;
+
+ vq->iov.i = vq->iov.used - 1;
+ write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
+ if (write <= 0)
+ break;
+
+ /* Make sure data is wrote before advancing index */
+ smp_wmb();
+
+ vringh_complete_iotlb(&vq->vring, vq->head, write);
+
+ /* Make sure used is visible before rasing the interrupt. */
+ smp_wmb();
+
+ if (vringh_need_notify_iotlb(&vq->vring) > 0)
+ vringh_notify(&vq->vring);
+
+ local_bh_disable();
+ if (vq->cb)
+ vq->cb(vq->private);
+ local_bh_enable();
+ }
+ }
+out:
+ spin_unlock(&vdpasim->lock);
+
+}
+
+static int __init vdpasim_blk_init(void)
+{
+ struct vdpasim_init_attr attr = {};
+ struct virtio_blk_config *config;
+ int ret;
+
+ attr.device_id = VIRTIO_ID_BLOCK;
+ attr.features = VDPASIM_FEATURES | VDPASIM_BLK_FEATURES;
+ attr.work_fn = vdpasim_blk_work;
+ vdpasim_blk_dev = vdpasim_create(&attr);
+ if (IS_ERR(vdpasim_blk_dev)) {
+ ret = PTR_ERR(vdpasim_blk_dev);
+ goto out;
+ }
+
+ config = (struct virtio_blk_config *)vdpasim_blk_dev->config;
+ config->capacity = cpu_to_vdpasim64(vdpasim_blk_dev, VDPASIM_BLK_CAPACITY);
+ config->size_max = cpu_to_vdpasim32(vdpasim_blk_dev, VDPASIM_BLK_SIZE_MAX);
+ config->seg_max = cpu_to_vdpasim32(vdpasim_blk_dev, VDPASIM_BLK_SEG_MAX);
+ config->num_queues = cpu_to_vdpasim16(vdpasim_blk_dev, VDPASIM_BLK_VQ_NUM);
+ config->min_io_size = cpu_to_vdpasim16(vdpasim_blk_dev, 1);
+ config->opt_io_size = cpu_to_vdpasim32(vdpasim_blk_dev, 1);
+ config->blk_size = cpu_to_vdpasim32(vdpasim_blk_dev, 512);
+
+ ret = vdpa_register_device(&vdpasim_blk_dev->vdpa);
+ if (ret)
+ goto put_dev;
+
+ return 0;
+
+put_dev:
+ put_device(&vdpasim_blk_dev->vdpa.dev);
+out:
+ return ret;
+}
+
+static void __exit vdpasim_blk_exit(void)
+{
+ struct vdpa_device *vdpa = &vdpasim_blk_dev->vdpa;
+
+ vdpa_unregister_device(vdpa);
+}
+
+module_init(vdpasim_blk_init);
+module_exit(vdpasim_blk_exit);
+
+MODULE_VERSION(DRV_VERSION);
+MODULE_LICENSE(DRV_LICENSE);
+MODULE_AUTHOR("Max Gurtovoy <[email protected]>");
+MODULE_DESCRIPTION("vDPA Device Simulator for block device");
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index fdb1a9267347..0fb63362cd5d 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -24,6 +24,15 @@ config VDPA_SIM_NET
to RX. This device is used for testing, prototyping and
development of vDPA.
+config VDPA_SIM_BLOCK
+ tristate "vDPA simulator for block device"
+ depends on VDPA_SIM
+ default n
+ help
+ vDPA block device simulator which terminates IO requests in a
+ memory buffer. This device is used for testing, prototyping and
+ development of vDPA.
+
config IFCVF
tristate "Intel IFC VF vDPA driver"
depends on PCI_MSI
diff --git a/drivers/vdpa/vdpa_sim/Makefile b/drivers/vdpa/vdpa_sim/Makefile
index 79d4536d347e..d458103302f2 100644
--- a/drivers/vdpa/vdpa_sim/Makefile
+++ b/drivers/vdpa/vdpa_sim/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
obj-$(CONFIG_VDPA_SIM_NET) += vdpa_sim_net.o
+obj-$(CONFIG_VDPA_SIM_BLOCK) += vdpa_sim_blk.o
--
2.26.2
The previous implementation wrote only the status of each request.
This patch implements a more accurate block device simulator,
providing a ramdisk-like behavior.
Also handle VIRTIO_BLK_T_GET_ID request, always answering the
"vdpa_blk_sim" string.
Signed-off-by: Stefano Garzarella <[email protected]>
---
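For reference, a minimal userspace sketch of the capacity check used by
the ramdisk path. This is an illustration, not the code in the patch:
sim_blk_range_ok() and the SIM_* names are hypothetical, the constants
copy VDPASIM_BLK_CAPACITY and SECTOR_SHIFT, and the check is written so
that a huge guest-provided sector cannot overflow the offset
computation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_SECTOR_SHIFT	9
#define SIM_BLK_CAPACITY	0x40000ULL	/* in 512-byte sectors */

/* Returns true if [sector, sector + len bytes) lies inside the ramdisk. */
bool sim_blk_range_ok(uint64_t sector, size_t len)
{
	uint64_t size, offset;

	/* The capacity itself must survive the shift to bytes. */
	assert(SIM_BLK_CAPACITY <= (UINT64_MAX >> SIM_SECTOR_SHIFT));
	size = SIM_BLK_CAPACITY << SIM_SECTOR_SHIFT;

	/* Reject before shifting, so the shift below cannot overflow. */
	if (sector > SIM_BLK_CAPACITY)
		return false;

	offset = sector << SIM_SECTOR_SHIFT;

	/* Compare against the remaining room instead of adding to offset. */
	return len <= size - offset;
}
```

With a capacity of 0x40000 sectors the ramdisk is 128 MiB, so a 512-byte
request at sector 0x3ffff is the last one accepted.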
drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 151 +++++++++++++++++++++++----
1 file changed, 133 insertions(+), 18 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index 8e41b3ab98d5..68e74383322f 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -7,6 +7,7 @@
*/
#include <linux/module.h>
+#include <linux/blkdev.h>
#include <uapi/linux/virtio_blk.h>
#include "vdpa_sim.h"
@@ -24,10 +25,137 @@
static struct vdpasim *vdpasim_blk_dev;
+static int vdpasim_blk_handle_req(struct vdpasim *vdpasim,
+ struct vdpasim_virtqueue *vq)
+{
+ size_t wrote = 0, to_read = 0, to_write = 0;
+ struct virtio_blk_outhdr hdr;
+ uint8_t status;
+ uint32_t type;
+ ssize_t bytes;
+ loff_t offset;
+ int i, ret;
+
+ vringh_kiov_cleanup(&vq->riov);
+ vringh_kiov_cleanup(&vq->wiov);
+
+ ret = vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
+ &vq->head, GFP_ATOMIC);
+ if (ret != 1)
+ return ret;
+
+ for (i = 0; i < vq->wiov.used; i++)
+ to_write += vq->wiov.iov[i].iov_len;
+ to_write -= 1; /* last byte is the status */
+
+ for (i = 0; i < vq->riov.used; i++)
+ to_read += vq->riov.iov[i].iov_len;
+
+ bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov, &hdr, sizeof(hdr));
+ if (bytes != sizeof(hdr))
+ return 0;
+
+ to_read -= bytes;
+
+ type = le32_to_cpu(hdr.type);
+ offset = le64_to_cpu(hdr.sector) << SECTOR_SHIFT;
+ status = VIRTIO_BLK_S_OK;
+
+ switch (type) {
+ case VIRTIO_BLK_T_IN:
+ if (offset + to_write > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
+ dev_err(&vdpasim->vdpa.dev,
+ "reading over the capacity - offset: 0x%llx len: 0x%zx\n",
+ offset, to_write);
+ status = VIRTIO_BLK_S_IOERR;
+ break;
+ }
+
+ bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov,
+ vdpasim->buffer + offset,
+ to_write);
+ if (bytes < 0) {
+ dev_err(&vdpasim->vdpa.dev,
+ "vringh_iov_push_iotlb() error: %zd offset: 0x%llx len: 0x%zx\n",
+ bytes, offset, to_write);
+ status = VIRTIO_BLK_S_IOERR;
+ break;
+ }
+
+ wrote += bytes;
+ break;
+
+ case VIRTIO_BLK_T_OUT:
+ if (offset + to_read > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
+ dev_err(&vdpasim->vdpa.dev,
+ "writing over the capacity - offset: 0x%llx len: 0x%zx\n",
+ offset, to_read);
+ status = VIRTIO_BLK_S_IOERR;
+ break;
+ }
+
+ bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov,
+ vdpasim->buffer + offset,
+ to_read);
+ if (bytes < 0) {
+ dev_err(&vdpasim->vdpa.dev,
+ "vringh_iov_pull_iotlb() error: %zd offset: 0x%llx len: 0x%zx\n",
+ bytes, offset, to_read);
+ status = VIRTIO_BLK_S_IOERR;
+ break;
+ }
+ break;
+
+ case VIRTIO_BLK_T_GET_ID: {
+ char id[VIRTIO_BLK_ID_BYTES] = "vdpa_blk_sim";
+
+ bytes = vringh_iov_push_iotlb(&vq->vring,
+ &vq->wiov, id,
+ VIRTIO_BLK_ID_BYTES);
+ if (bytes < 0) {
+ dev_err(&vdpasim->vdpa.dev,
+ "vringh_iov_push_iotlb() error: %zd\n", bytes);
+ status = VIRTIO_BLK_S_IOERR;
+ break;
+ }
+
+ wrote += bytes;
+ break;
+ }
+
+ default:
+ dev_warn(&vdpasim->vdpa.dev,
+ "Unsupported request type %u\n", type);
+ status = VIRTIO_BLK_S_IOERR;
+ break;
+ }
+
+ /* if VIRTIO_BLK_T_IN or VIRTIO_BLK_T_GET_ID fail, we need to skip
+ * the remaining bytes to put the status in the last byte
+ */
+ if (to_write - wrote > 0) {
+ vringh_iov_push_iotlb(&vq->vring, &vq->wiov, NULL,
+ to_write - wrote);
+ }
+
+ /* last byte is the status */
+ bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
+ if (bytes != 1)
+ return 0;
+
+ wrote += bytes;
+
+ /* Make sure data is wrote before advancing index */
+ smp_wmb();
+
+ vringh_complete_iotlb(&vq->vring, vq->head, wrote);
+
+ return ret;
+}
+
static void vdpasim_blk_work(struct work_struct *work)
{
struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
- u8 status = VIRTIO_BLK_S_OK;
int i;
spin_lock(&vdpasim->lock);
@@ -41,21 +169,7 @@ static void vdpasim_blk_work(struct work_struct *work)
if (!vq->ready)
continue;
- while (vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
- &vq->head, GFP_ATOMIC) > 0) {
-
- int write;
-
- vq->wiov.i = vq->wiov.used - 1;
- write = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
- if (write <= 0)
- break;
-
- /* Make sure data is wrote before advancing index */
- smp_wmb();
-
- vringh_complete_iotlb(&vq->vring, vq->head, write);
-
+ while (vdpasim_blk_handle_req(vdpasim, vq) > 0) {
/* Make sure used is visible before rasing the interrupt. */
smp_wmb();
@@ -67,6 +181,7 @@ static void vdpasim_blk_work(struct work_struct *work)
vq->cb(vq->private);
local_bh_enable();
}
+
}
out:
spin_unlock(&vdpasim->lock);
@@ -84,7 +199,7 @@ static void vdpasim_blk_update_config(struct vdpasim *vdpasim)
config->num_queues = cpu_to_vdpasim16(vdpasim, VDPASIM_BLK_VQ_NUM);
config->min_io_size = cpu_to_vdpasim16(vdpasim, 1);
config->opt_io_size = cpu_to_vdpasim32(vdpasim, 1);
- config->blk_size = cpu_to_vdpasim32(vdpasim, 512);
+ config->blk_size = cpu_to_vdpasim32(vdpasim, SECTOR_SIZE);
}
static int __init vdpasim_blk_init(void)
@@ -100,7 +215,7 @@ static int __init vdpasim_blk_init(void)
attr.device.update_config = vdpasim_blk_update_config;
attr.work_fn = vdpasim_blk_work;
- attr.buffer_size = PAGE_SIZE;
+ attr.buffer_size = VDPASIM_BLK_CAPACITY << SECTOR_SHIFT;
vdpasim_blk_dev = vdpasim_create(&attr);
if (IS_ERR(vdpasim_blk_dev)) {
--
2.26.2
In some cases, it may be useful to provide a way to skip a number
of bytes in a vringh_iov.
In order to keep vringh_iov consistent, let's reuse vringh_iov_xfer()
logic and skip bytes when the ptr is NULL.
Signed-off-by: Stefano Garzarella <[email protected]>
---
I'm not sure this is the best option; maybe we could add a new
vringh_iov_skip() function instead.
Suggestions?
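As a starting point for discussion, here is a rough userspace sketch of
what a dedicated skip helper could look like. It is only a model:
struct sim_kvec, struct sim_kiov and sim_kiov_skip() are hypothetical
stand-ins for struct kvec, struct vringh_kiov and a possible
vringh_iov_skip(), and the bookkeeping is modelled on the
vringh_iov_xfer() loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct kvec. */
struct sim_kvec {
	char *iov_base;
	size_t iov_len;
};

/* Simplified stand-in for struct vringh_kiov. */
struct sim_kiov {
	struct sim_kvec *iov;
	size_t consumed;	/* bytes already consumed from iov[i] */
	unsigned int i, used;
};

/*
 * Advance the iov by at most len bytes without copying anything.
 * Returns the number of bytes actually skipped.
 */
size_t sim_kiov_skip(struct sim_kiov *iov, size_t len)
{
	size_t done = 0;

	assert(iov != NULL);

	while (len && iov->i < iov->used) {
		size_t partlen = iov->iov[iov->i].iov_len;

		if (partlen > len)
			partlen = len;

		done += partlen;
		len -= partlen;
		iov->consumed += partlen;
		iov->iov[iov->i].iov_len -= partlen;
		iov->iov[iov->i].iov_base += partlen;

		if (!iov->iov[iov->i].iov_len) {
			/* Restore the element for reuse, then move on. */
			iov->iov[iov->i].iov_len = iov->consumed;
			iov->iov[iov->i].iov_base -= iov->consumed;
			iov->consumed = 0;
			iov->i++;
		}
	}

	return done;
}
```

A separate helper like this would leave the xfer callback path untouched,
at the cost of duplicating the iov bookkeeping.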
---
drivers/vhost/vringh.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index 8bd8b403f087..ed3290946ad7 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -75,7 +75,9 @@ static inline int __vringh_get_head(const struct vringh *vrh,
return head;
}
-/* Copy some bytes to/from the iovec. Returns num copied. */
+/* Copy some bytes to/from the iovec. Returns num copied.
+ * If ptr is NULL, skips at most len bytes.
+ */
static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
struct vringh_kiov *iov,
void *ptr, size_t len,
@@ -89,12 +91,16 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
size_t partlen;
partlen = min(iov->iov[iov->i].iov_len, len);
- err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
- if (err)
- return err;
+
+ if (ptr) {
+ err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
+ if (err)
+ return err;
+ ptr += partlen;
+ }
+
done += partlen;
len -= partlen;
- ptr += partlen;
iov->consumed += partlen;
iov->iov[iov->i].iov_len -= partlen;
iov->iov[iov->i].iov_base += partlen;
--
2.26.2
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> Thanks to Max that started this work!
> I took his patches, and extended the block simulator a bit.
>
> This series moves the network device simulator in a new module
> (vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
> module, allowing the possibility to add new vDPA device simulators.
> Then we added a new vdpa_sim_blk module to simulate a block device.
>
> I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
> bytes when ptr is NULL"), maybe we can add a new functions instead of
> modify vringh_iov_xfer().
>
> As Max reported, I'm also seeing errors with vdpa_sim_blk related to
> iotlb and vringh when there is high load, these are some of the error
> messages I can see randomly:
>
> vringh: Failed to access avail idx at 00000000e8deb2cc
> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
> vringh: Failed to get flags at 000000006635d7a3
>
> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset: 0x2840000 len: 0x20000
> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset: 0x58ee000 len: 0x3000
>
> These errors should all be related to the fact that iotlb_translate()
> fails with -EINVAL, so it seems that we miss some mapping.
Is this only reproducible when there are multiple concurrent accesses to
the IOTLB? If so, it probably hints that some synchronization is still
missing somewhere.
It might be useful to log dma_map/unmap in both virtio_ring and
vringh to see which side is missing the mapping.
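A toy userspace model of that kind of logging, just to show the idea.
Everything here is hypothetical (struct sim_iotlb and the sim_iotlb_*()
helpers are not kernel APIs), and a translation is assumed to succeed
only when the whole range sits inside a single mapping:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SIM_MAX_MAPS 16

struct sim_map {
	uint64_t start, last;	/* inclusive range */
	int used;
};

struct sim_iotlb {
	struct sim_map maps[SIM_MAX_MAPS];
};

/* Record a mapping and log who created it. Returns 0, or -1 if full. */
int sim_iotlb_map(struct sim_iotlb *t, const char *who,
		  uint64_t start, uint64_t size)
{
	int i;

	assert(size > 0);

	for (i = 0; i < SIM_MAX_MAPS; i++) {
		if (t->maps[i].used)
			continue;
		t->maps[i].start = start;
		t->maps[i].last = start + size - 1;
		t->maps[i].used = 1;
		fprintf(stderr, "%s: map   [0x%llx, 0x%llx]\n", who,
			(unsigned long long)start,
			(unsigned long long)t->maps[i].last);
		return 0;
	}

	return -1;
}

/* Drop every mapping fully covered by the range and log who did it. */
void sim_iotlb_unmap(struct sim_iotlb *t, const char *who,
		     uint64_t start, uint64_t size)
{
	uint64_t last = start + size - 1;
	int i;

	assert(size > 0);

	for (i = 0; i < SIM_MAX_MAPS; i++) {
		if (t->maps[i].used && t->maps[i].start >= start &&
		    t->maps[i].last <= last)
			t->maps[i].used = 0;
	}
	fprintf(stderr, "%s: unmap [0x%llx, 0x%llx]\n", who,
		(unsigned long long)start, (unsigned long long)last);
}

/* Returns 0 if the range is inside one mapping, else logs a miss, -1. */
int sim_iotlb_translate(const struct sim_iotlb *t, const char *who,
			uint64_t addr, uint64_t len)
{
	uint64_t last = addr + len - 1;
	int i;

	assert(len > 0);

	for (i = 0; i < SIM_MAX_MAPS; i++) {
		if (t->maps[i].used && addr >= t->maps[i].start &&
		    last <= t->maps[i].last)
			return 0;
	}
	fprintf(stderr, "%s: MISS  [0x%llx, 0x%llx]\n", who,
		(unsigned long long)addr, (unsigned long long)last);
	return -1;
}
```

Reading the resulting log in order should show whether a miss happens
before the matching map or after an early unmap.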
Thanks
>
> I'll debug more carefully, in the meantime can you give a first review?
>
> Thanks,
> Stefano
>
> Max Gurtovoy (4):
> vhost-vdpa: add support for vDPA blk devices
> vdpa: split vdpasim to core and net modules
> vdpa_sim: remove hard-coded virtq count
> vdpa: add vdpa simulator for block device
>
> Stefano Garzarella (8):
> vdpa_sim: remove the limit of IOTLB entries
> vdpa_sim: add struct vdpasim_device to store device properties
> vdpa_sim: move config management outside of the core
> vdpa_sim: use kvmalloc to allocate vdpasim->buffer
> vdpa_sim: make vdpasim->buffer size configurable
> vdpa_sim: split vdpasim_virtqueue's iov field in riov and wiov
> vringh: allow vringh_iov_xfer() to skip bytes when ptr is NULL
> vdpa_sim_blk: implement ramdisk behaviour
>
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 117 +++++++++++
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 283 +++++----------------------
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 251 ++++++++++++++++++++++++
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 172 ++++++++++++++++
> drivers/vhost/vdpa.c | 11 +-
> drivers/vhost/vringh.c | 16 +-
> drivers/vdpa/Kconfig | 16 +-
> drivers/vdpa/vdpa_sim/Makefile | 2 +
> 8 files changed, 628 insertions(+), 240 deletions(-)
> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim.h
> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> From: Max Gurtovoy <[email protected]>
>
> Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a
> preparation for adding a vdpa simulator module for block devices.
>
> Signed-off-by: Max Gurtovoy <[email protected]>
> [sgarzare: various cleanups/fixes]
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> v1:
> - Removed unused headers
> - Removed empty module_init() module_exit()
> - Moved vdpasim_is_little_endian() in vdpa_sim.h
> - Moved vdpasim16_to_cpu/cpu_to_vdpasim16() in vdpa_sim.h
> - Added vdpasim*_to_cpu/cpu_to_vdpasim*() also for 32 and 64
> - Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
> option can not depend on other [Jason]
If possible, I would suggest splitting this patch further:
1) convert to use void *config, and an attribute for setting the config
size during allocation
2) introduce supported_features
3) other attributes (#vqs)
4) rename config ops (more generic ones)
5) introduce ops for set|get_config, set|get_features
6) the real split
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 110 +++++++++++
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 285 ++++++---------------------
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 153 ++++++++++++++
> drivers/vdpa/Kconfig | 7 +-
> drivers/vdpa/vdpa_sim/Makefile | 1 +
> 5 files changed, 329 insertions(+), 227 deletions(-)
> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim.h
> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> new file mode 100644
> index 000000000000..33613c49888c
> --- /dev/null
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -0,0 +1,110 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2020, Red Hat Inc. All rights reserved.
> + */
> +
> +#ifndef _VDPA_SIM_H
> +#define _VDPA_SIM_H
> +
> +#include <linux/vringh.h>
> +#include <linux/vdpa.h>
> +#include <linux/vhost_iotlb.h>
> +#include <uapi/linux/virtio_config.h>
> +#include <uapi/linux/virtio_net.h>
> +#include <uapi/linux/virtio_blk.h>
> +
> +#define DRV_VERSION "0.1"
> +#define DRV_AUTHOR "Jason Wang <[email protected]>"
> +#define DRV_LICENSE "GPL v2"
> +
> +#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
> +#define VDPASIM_QUEUE_MAX 256
> +#define VDPASIM_VENDOR_ID 0
> +#define VDPASIM_VQ_NUM 0x2
> +
> +#define VDPASIM_FEATURES ((1ULL << VIRTIO_F_ANY_LAYOUT) | \
> + (1ULL << VIRTIO_F_VERSION_1) | \
> + (1ULL << VIRTIO_F_ACCESS_PLATFORM))
> +
> +struct vdpasim;
> +
> +struct vdpasim_virtqueue {
> + struct vringh vring;
> + struct vringh_kiov iov;
> + unsigned short head;
> + bool ready;
> + u64 desc_addr;
> + u64 device_addr;
> + u64 driver_addr;
> + u32 num;
> + void *private;
> + irqreturn_t (*cb)(void *data);
> +};
> +
> +struct vdpasim_init_attr {
> + u32 device_id;
> + u64 features;
> + work_func_t work_fn;
> + int batch_mapping;
> +};
> +
> +/* State of each vdpasim device */
> +struct vdpasim {
> + struct vdpa_device vdpa;
> + struct vdpasim_virtqueue vqs[VDPASIM_VQ_NUM];
> + struct work_struct work;
> + /* spinlock to synchronize virtqueue state */
> + spinlock_t lock;
> + /* virtio config according to device type */
> + void *config;
> + struct vhost_iotlb *iommu;
> + void *buffer;
> + u32 device_id;
> + u32 status;
> + u32 generation;
> + u64 features;
> + u64 supported_features;
> + /* spinlock to synchronize iommu table */
> + spinlock_t iommu_lock;
> +};
> +
> +struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr);
> +
> +/* TODO: cross-endian support */
> +static inline bool vdpasim_is_little_endian(struct vdpasim *vdpasim)
> +{
> + return virtio_legacy_is_little_endian() ||
> + (vdpasim->features & (1ULL << VIRTIO_F_VERSION_1));
> +}
> +
> +static inline u16 vdpasim16_to_cpu(struct vdpasim *vdpasim, __virtio16 val)
> +{
> + return __virtio16_to_cpu(vdpasim_is_little_endian(vdpasim), val);
> +}
> +
> +static inline __virtio16 cpu_to_vdpasim16(struct vdpasim *vdpasim, u16 val)
> +{
> + return __cpu_to_virtio16(vdpasim_is_little_endian(vdpasim), val);
> +}
> +
> +static inline u32 vdpasim32_to_cpu(struct vdpasim *vdpasim, __virtio32 val)
> +{
> + return __virtio32_to_cpu(vdpasim_is_little_endian(vdpasim), val);
> +}
> +
> +static inline __virtio32 cpu_to_vdpasim32(struct vdpasim *vdpasim, u32 val)
> +{
> + return __cpu_to_virtio32(vdpasim_is_little_endian(vdpasim), val);
> +}
> +
> +static inline u64 vdpasim64_to_cpu(struct vdpasim *vdpasim, __virtio64 val)
> +{
> + return __virtio64_to_cpu(vdpasim_is_little_endian(vdpasim), val);
> +}
> +
> +static inline __virtio64 cpu_to_vdpasim64(struct vdpasim *vdpasim, u64 val)
> +{
> + return __cpu_to_virtio64(vdpasim_is_little_endian(vdpasim), val);
> +}
> +
> +#endif
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 6a90fdb9cbfc..04f9dc9ce8c8 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -1,107 +1,16 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /*
> - * VDPA networking device simulator.
> + * VDPA simulator core.
> *
> * Copyright (c) 2020, Red Hat Inc. All rights reserved.
> * Author: Jason Wang <[email protected]>
> *
> */
>
> -#include <linux/init.h>
> #include <linux/module.h>
> -#include <linux/device.h>
> -#include <linux/kernel.h>
> -#include <linux/fs.h>
> -#include <linux/poll.h>
> -#include <linux/slab.h>
> -#include <linux/sched.h>
> -#include <linux/wait.h>
> -#include <linux/uuid.h>
> -#include <linux/iommu.h>
> #include <linux/dma-map-ops.h>
> -#include <linux/sysfs.h>
> -#include <linux/file.h>
> -#include <linux/etherdevice.h>
> -#include <linux/vringh.h>
> -#include <linux/vdpa.h>
> -#include <linux/virtio_byteorder.h>
> -#include <linux/vhost_iotlb.h>
> -#include <uapi/linux/virtio_config.h>
> -#include <uapi/linux/virtio_net.h>
> -
> -#define DRV_VERSION "0.1"
> -#define DRV_AUTHOR "Jason Wang <[email protected]>"
> -#define DRV_DESC "vDPA Device Simulator"
> -#define DRV_LICENSE "GPL v2"
> -
> -static int batch_mapping = 1;
> -module_param(batch_mapping, int, 0444);
> -MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 -Enable; 0 - Disable");
> -
> -static char *macaddr;
> -module_param(macaddr, charp, 0);
> -MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
> -
> -struct vdpasim_virtqueue {
> - struct vringh vring;
> - struct vringh_kiov iov;
> - unsigned short head;
> - bool ready;
> - u64 desc_addr;
> - u64 device_addr;
> - u64 driver_addr;
> - u32 num;
> - void *private;
> - irqreturn_t (*cb)(void *data);
> -};
> -
> -#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
> -#define VDPASIM_QUEUE_MAX 256
> -#define VDPASIM_DEVICE_ID 0x1
> -#define VDPASIM_VENDOR_ID 0
> -#define VDPASIM_VQ_NUM 0x2
> -#define VDPASIM_NAME "vdpasim-netdev"
> -
> -static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
> - (1ULL << VIRTIO_F_VERSION_1) |
> - (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
> - (1ULL << VIRTIO_NET_F_MAC);
> -
> -/* State of each vdpasim device */
> -struct vdpasim {
> - struct vdpa_device vdpa;
> - struct vdpasim_virtqueue vqs[VDPASIM_VQ_NUM];
> - struct work_struct work;
> - /* spinlock to synchronize virtqueue state */
> - spinlock_t lock;
> - struct virtio_net_config config;
> - struct vhost_iotlb *iommu;
> - void *buffer;
> - u32 status;
> - u32 generation;
> - u64 features;
> - /* spinlock to synchronize iommu table */
> - spinlock_t iommu_lock;
> -};
> -
> -/* TODO: cross-endian support */
> -static inline bool vdpasim_is_little_endian(struct vdpasim *vdpasim)
> -{
> - return virtio_legacy_is_little_endian() ||
> - (vdpasim->features & (1ULL << VIRTIO_F_VERSION_1));
> -}
> -
> -static inline u16 vdpasim16_to_cpu(struct vdpasim *vdpasim, __virtio16 val)
> -{
> - return __virtio16_to_cpu(vdpasim_is_little_endian(vdpasim), val);
> -}
> -
> -static inline __virtio16 cpu_to_vdpasim16(struct vdpasim *vdpasim, u16 val)
> -{
> - return __cpu_to_virtio16(vdpasim_is_little_endian(vdpasim), val);
> -}
>
> -static struct vdpasim *vdpasim_dev;
> +#include "vdpa_sim.h"
>
> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> {
> @@ -119,7 +28,7 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
> {
> struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
>
> - vringh_init_iotlb(&vq->vring, vdpasim_features,
> + vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
> VDPASIM_QUEUE_MAX, false,
> (struct vring_desc *)(uintptr_t)vq->desc_addr,
> (struct vring_avail *)
> @@ -128,7 +37,8 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
> (uintptr_t)vq->device_addr);
> }
>
> -static void vdpasim_vq_reset(struct vdpasim_virtqueue *vq)
> +static void vdpasim_vq_reset(struct vdpasim *vdpasim,
> + struct vdpasim_virtqueue *vq)
> {
> vq->ready = false;
> vq->desc_addr = 0;
> @@ -136,8 +46,8 @@ static void vdpasim_vq_reset(struct vdpasim_virtqueue *vq)
> vq->device_addr = 0;
> vq->cb = NULL;
> vq->private = NULL;
> - vringh_init_iotlb(&vq->vring, vdpasim_features, VDPASIM_QUEUE_MAX,
> - false, NULL, NULL, NULL);
> + vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
> + VDPASIM_QUEUE_MAX, false, NULL, NULL, NULL);
> }
>
> static void vdpasim_reset(struct vdpasim *vdpasim)
> @@ -145,7 +55,7 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
> int i;
>
> for (i = 0; i < VDPASIM_VQ_NUM; i++)
> - vdpasim_vq_reset(&vdpasim->vqs[i]);
> + vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
>
> spin_lock(&vdpasim->iommu_lock);
> vhost_iotlb_reset(vdpasim->iommu);
> @@ -156,80 +66,6 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
> ++vdpasim->generation;
> }
>
> -static void vdpasim_work(struct work_struct *work)
> -{
> - struct vdpasim *vdpasim = container_of(work, struct
> - vdpasim, work);
> - struct vdpasim_virtqueue *txq = &vdpasim->vqs[1];
> - struct vdpasim_virtqueue *rxq = &vdpasim->vqs[0];
> - ssize_t read, write;
> - size_t total_write;
> - int pkts = 0;
> - int err;
> -
> - spin_lock(&vdpasim->lock);
> -
> - if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> - goto out;
> -
> - if (!txq->ready || !rxq->ready)
> - goto out;
> -
> - while (true) {
> - total_write = 0;
> - err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
> - &txq->head, GFP_ATOMIC);
> - if (err <= 0)
> - break;
> -
> - err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
> - &rxq->head, GFP_ATOMIC);
> - if (err <= 0) {
> - vringh_complete_iotlb(&txq->vring, txq->head, 0);
> - break;
> - }
> -
> - while (true) {
> - read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
> - vdpasim->buffer,
> - PAGE_SIZE);
> - if (read <= 0)
> - break;
> -
> - write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
> - vdpasim->buffer, read);
> - if (write <= 0)
> - break;
> -
> - total_write += write;
> - }
> -
> - /* Make sure data is wrote before advancing index */
> - smp_wmb();
> -
> - vringh_complete_iotlb(&txq->vring, txq->head, 0);
> - vringh_complete_iotlb(&rxq->vring, rxq->head, total_write);
> -
> - /* Make sure used is visible before rasing the interrupt. */
> - smp_wmb();
> -
> - local_bh_disable();
> - if (txq->cb)
> - txq->cb(txq->private);
> - if (rxq->cb)
> - rxq->cb(rxq->private);
> - local_bh_enable();
> -
> - if (++pkts > 4) {
> - schedule_work(&vdpasim->work);
> - goto out;
> - }
> - }
> -
> -out:
> - spin_unlock(&vdpasim->lock);
> -}
> -
> static int dir_to_perm(enum dma_data_direction dir)
> {
> int perm = -EFAULT;
> @@ -342,26 +178,42 @@ static const struct dma_map_ops vdpasim_dma_ops = {
> .free = vdpasim_free_coherent,
> };
>
> -static const struct vdpa_config_ops vdpasim_net_config_ops;
> -static const struct vdpa_config_ops vdpasim_net_batch_config_ops;
> +static const struct vdpa_config_ops vdpasim_config_ops;
> +static const struct vdpa_config_ops vdpasim_batch_config_ops;
>
> -static struct vdpasim *vdpasim_create(void)
> +struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> {
> const struct vdpa_config_ops *ops;
> struct vdpasim *vdpasim;
> + u32 device_id;
> struct device *dev;
> - int ret = -ENOMEM;
> + int i, size, ret = -ENOMEM;
>
> - if (batch_mapping)
> - ops = &vdpasim_net_batch_config_ops;
> + device_id = attr->device_id;
> + /* Currently, we only accept the network and block devices. */
> + if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
> + return ERR_PTR(-EOPNOTSUPP);
> +
> + if (attr->batch_mapping)
> + ops = &vdpasim_batch_config_ops;
> else
> - ops = &vdpasim_net_config_ops;
> + ops = &vdpasim_config_ops;
>
> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
> if (!vdpasim)
> goto err_alloc;
>
> - INIT_WORK(&vdpasim->work, vdpasim_work);
> + if (device_id == VIRTIO_ID_NET)
> + size = sizeof(struct virtio_net_config);
> + else
> + size = sizeof(struct virtio_blk_config);
It's better to avoid this kind of if/else, considering that we may
introduce more types of devices.
Can we have a config size attribute instead?
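Something along these lines, as a minimal userspace sketch and not a proposal for the final layout. The config_size field and the sketch_* names are hypothetical, and calloc() stands in for kzalloc(). Each device module fills in the size of its own config structure, so the core never needs to know the device type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical creation attributes; config_size replaces the if/else. */
struct sketch_init_attr {
	uint32_t device_id;
	size_t config_size;
};

/* Hypothetical, stripped-down simulator state. */
struct sketch_sim {
	uint32_t device_id;
	size_t config_size;
	void *config;
};

/* Allocate the simulator and a zeroed config area of the requested size. */
static struct sketch_sim *sketch_create(const struct sketch_init_attr *attr)
{
	struct sketch_sim *sim;

	if (!attr->config_size)
		return NULL;

	sim = calloc(1, sizeof(*sim));
	if (!sim)
		return NULL;

	sim->config = calloc(1, attr->config_size);
	if (!sim->config) {
		free(sim);
		return NULL;
	}

	sim->device_id = attr->device_id;
	sim->config_size = attr->config_size;

	return sim;
}

static void sketch_destroy(struct sketch_sim *sim)
{
	if (!sim)
		return;

	free(sim->config);
	free(sim);
}
```

The net module would then pass sizeof(struct virtio_net_config) and the blk module sizeof(struct virtio_blk_config), and a new device type would not need any change in the core.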
> + vdpasim->config = kzalloc(size, GFP_KERNEL);
> + if (!vdpasim->config)
> + goto err_iommu;
> +
> + vdpasim->device_id = device_id;
> + vdpasim->supported_features = attr->features;
> + INIT_WORK(&vdpasim->work, attr->work_fn);
> spin_lock_init(&vdpasim->lock);
> spin_lock_init(&vdpasim->iommu_lock);
>
> @@ -379,23 +231,10 @@ static struct vdpasim *vdpasim_create(void)
> if (!vdpasim->buffer)
> goto err_iommu;
>
> - if (macaddr) {
> - mac_pton(macaddr, vdpasim->config.mac);
> - if (!is_valid_ether_addr(vdpasim->config.mac)) {
> - ret = -EADDRNOTAVAIL;
> - goto err_iommu;
> - }
> - } else {
> - eth_random_addr(vdpasim->config.mac);
> - }
> -
> - vringh_set_iotlb(&vdpasim->vqs[0].vring, vdpasim->iommu);
> - vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu);
> + for (i = 0; i < VDPASIM_VQ_NUM; i++)
> + vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
And an attribute for #vqs here.
>
> vdpasim->vdpa.dma_dev = dev;
> - ret = vdpa_register_device(&vdpasim->vdpa);
> - if (ret)
> - goto err_iommu;
>
> return vdpasim;
>
> @@ -404,6 +243,7 @@ static struct vdpasim *vdpasim_create(void)
> err_alloc:
> return ERR_PTR(ret);
> }
> +EXPORT_SYMBOL_GPL(vdpasim_create);
>
> static int vdpasim_set_vq_address(struct vdpa_device *vdpa, u16 idx,
> u64 desc_area, u64 driver_area,
> @@ -498,28 +338,34 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
>
> static u64 vdpasim_get_features(struct vdpa_device *vdpa)
> {
> - return vdpasim_features;
> + struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> +
> + return vdpasim->supported_features;
> }
>
> static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> - struct virtio_net_config *config = &vdpasim->config;
>
> /* DMA mapping must be done by driver */
> if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
> return -EINVAL;
>
> - vdpasim->features = features & vdpasim_features;
> + vdpasim->features = features & vdpasim->supported_features;
>
> /* We generally only know whether guest is using the legacy interface
> * here, so generally that's the earliest we can set config fields.
> * Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
> * implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
> */
> + if (vdpasim->device_id == VIRTIO_ID_NET) {
> + struct virtio_net_config *config =
> + (struct virtio_net_config *)vdpasim->config;
> +
> + config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
> + config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
> + }
Can we introduce set_features/get_features callbacks here, to avoid
handling device-type-specific code in the generic simulator code?
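Roughly like the following minimal userspace sketch. Everything named sketch_* or SKETCH_* is hypothetical, including the shape of the ops structure; the feature bit values are stand-ins, and the endianness conversion done by cpu_to_vdpasim16() is left out to keep the example short. The core keeps the generic checks and then calls an optional per-device hook.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-ins for VIRTIO_F_ACCESS_PLATFORM and VIRTIO_NET_F_MAC. */
#define SKETCH_F_ACCESS_PLATFORM	(1ULL << 33)
#define SKETCH_NET_F_MAC		(1ULL << 5)

struct sketch_sim;

/* Hypothetical per-device hooks, provided by the net/blk modules. */
struct sketch_dev_ops {
	/* Optional: called after the core stored the negotiated features. */
	void (*set_features)(struct sketch_sim *sim);
};

/* Only the fields of the net config that this sketch touches. */
struct sketch_net_config {
	uint16_t mtu;
	uint16_t status;
};

struct sketch_sim {
	uint64_t supported_features;
	uint64_t features;
	const struct sketch_dev_ops *ops;
	void *config;
};

/* Generic part: no device-type checks, only the optional hook. */
static int sketch_set_features(struct sketch_sim *sim, uint64_t features)
{
	/* DMA mapping must be done by the driver. */
	if (!(features & SKETCH_F_ACCESS_PLATFORM))
		return -EINVAL;

	sim->features = features & sim->supported_features;

	if (sim->ops && sim->ops->set_features)
		sim->ops->set_features(sim);

	return 0;
}

/* Net-specific part: lives in the net module, not in the core. */
static void sketch_net_set_features(struct sketch_sim *sim)
{
	struct sketch_net_config *config = sim->config;

	config->mtu = 1500;
	config->status = 1;	/* link up */
}

static const struct sketch_dev_ops sketch_net_ops = {
	.set_features = sketch_net_set_features,
};
```

A block device that has nothing to do at this point would simply leave the hook NULL.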
>
> - config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
> - config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
> return 0;
> }
>
> @@ -536,7 +382,9 @@ static u16 vdpasim_get_vq_num_max(struct vdpa_device *vdpa)
>
> static u32 vdpasim_get_device_id(struct vdpa_device *vdpa)
> {
> - return VDPASIM_DEVICE_ID;
> + struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> +
> + return vdpasim->device_id;
> }
>
> static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa)
> @@ -572,8 +420,12 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> - if (offset + len < sizeof(struct virtio_net_config))
> - memcpy(buf, (u8 *)&vdpasim->config + offset, len);
> + if (vdpasim->device_id == VIRTIO_ID_BLOCK &&
> + (offset + len < sizeof(struct virtio_blk_config)))
> + memcpy(buf, vdpasim->config + offset, len);
> + else if (vdpasim->device_id == VIRTIO_ID_NET &&
> + (offset + len < sizeof(struct virtio_net_config)))
> + memcpy(buf, vdpasim->config + offset, len);
Similarly, can we introduce set/get_config ops?
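For example, with the config size stored in the device, the generic read path only needs a size-driven bounds check. This is a minimal userspace sketch under that assumption; sketch_get_config() and the config_size field are hypothetical, and it returns an error code only to make the behaviour easy to check (the vdpa op itself returns void). Note that, as a choice of this sketch, a read that ends exactly at the end of the config area is accepted, and the check is written so that offset + len cannot overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down simulator state. */
struct sketch_sim {
	void *config;
	size_t config_size;
};

/*
 * Copy len bytes of config space starting at offset into buf.
 * Returns 0 on success, -EINVAL if the range is outside the config area.
 */
static int sketch_get_config(const struct sketch_sim *sim, size_t offset,
			     void *buf, size_t len)
{
	/* Written this way so that offset + len is never computed. */
	if (offset > sim->config_size || len > sim->config_size - offset)
		return -EINVAL;

	memcpy(buf, (const char *)sim->config + offset, len);

	return 0;
}
```

The same check can guard a set_config counterpart, so neither path has to look at the device id.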
> }
>
> static void vdpasim_set_config(struct vdpa_device *vdpa, unsigned int offset,
> @@ -659,9 +511,10 @@ static void vdpasim_free(struct vdpa_device *vdpa)
> kfree(vdpasim->buffer);
> if (vdpasim->iommu)
> vhost_iotlb_free(vdpasim->iommu);
> + kfree(vdpasim->config);
> }
>
> -static const struct vdpa_config_ops vdpasim_net_config_ops = {
> +static const struct vdpa_config_ops vdpasim_config_ops = {
> .set_vq_address = vdpasim_set_vq_address,
> .set_vq_num = vdpasim_set_vq_num,
> .kick_vq = vdpasim_kick_vq,
> @@ -688,7 +541,7 @@ static const struct vdpa_config_ops vdpasim_net_config_ops = {
> .free = vdpasim_free,
> };
>
> -static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
> +static const struct vdpa_config_ops vdpasim_batch_config_ops = {
> .set_vq_address = vdpasim_set_vq_address,
> .set_vq_num = vdpasim_set_vq_num,
> .kick_vq = vdpasim_kick_vq,
> @@ -714,27 +567,7 @@ static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
> .free = vdpasim_free,
> };
>
> -static int __init vdpasim_dev_init(void)
> -{
> - vdpasim_dev = vdpasim_create();
> -
> - if (!IS_ERR(vdpasim_dev))
> - return 0;
> -
> - return PTR_ERR(vdpasim_dev);
> -}
> -
> -static void __exit vdpasim_dev_exit(void)
> -{
> - struct vdpa_device *vdpa = &vdpasim_dev->vdpa;
> -
> - vdpa_unregister_device(vdpa);
> -}
> -
> -module_init(vdpasim_dev_init)
> -module_exit(vdpasim_dev_exit)
> -
> MODULE_VERSION(DRV_VERSION);
> MODULE_LICENSE(DRV_LICENSE);
> MODULE_AUTHOR(DRV_AUTHOR);
> -MODULE_DESCRIPTION(DRV_DESC);
> +MODULE_DESCRIPTION("vDPA Simulator core");
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> new file mode 100644
> index 000000000000..c68d5488ab54
> --- /dev/null
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -0,0 +1,153 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * VDPA simulator for networking device.
> + *
> + * Copyright (c) 2020, Red Hat Inc. All rights reserved.
> + * Author: Jason Wang <[email protected]>
> + *
> + */
> +
> +#include <linux/module.h>
> +#include <linux/etherdevice.h>
> +
> +#include "vdpa_sim.h"
> +
> +#define VDPASIM_NET_FEATURES (1ULL << VIRTIO_NET_F_MAC)
> +
> +static int batch_mapping = 1;
> +module_param(batch_mapping, int, 0444);
> +MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 - Enable; 0 - Disable");
I think batch_mapping should belong to the vdpa_sim core module.
> +
> +static char *macaddr;
> +module_param(macaddr, charp, 0);
> +MODULE_PARM_DESC(macaddr, "Ethernet MAC address");
> +
> +static struct vdpasim *vdpasim_net_dev;
> +
> +static void vdpasim_net_work(struct work_struct *work)
> +{
> + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> + struct vdpasim_virtqueue *txq = &vdpasim->vqs[1];
> + struct vdpasim_virtqueue *rxq = &vdpasim->vqs[0];
> + ssize_t read, write;
> + size_t total_write;
> + int pkts = 0;
> + int err;
> +
> + spin_lock(&vdpasim->lock);
> +
> + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> + goto out;
> +
> + if (!txq->ready || !rxq->ready)
> + goto out;
> +
> + while (true) {
> + total_write = 0;
> + err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
> + &txq->head, GFP_ATOMIC);
> + if (err <= 0)
> + break;
> +
> + err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
> + &rxq->head, GFP_ATOMIC);
> + if (err <= 0) {
> + vringh_complete_iotlb(&txq->vring, txq->head, 0);
> + break;
> + }
> +
> + while (true) {
> + read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
> + vdpasim->buffer,
> + PAGE_SIZE);
> + if (read <= 0)
> + break;
> +
> + write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
> + vdpasim->buffer, read);
> + if (write <= 0)
> + break;
> +
> + total_write += write;
> + }
> +
> + /* Make sure data is wrote before advancing index */
> + smp_wmb();
> +
> + vringh_complete_iotlb(&txq->vring, txq->head, 0);
> + vringh_complete_iotlb(&rxq->vring, rxq->head, total_write);
> +
> + /* Make sure used is visible before rasing the interrupt. */
> + smp_wmb();
> +
> + local_bh_disable();
> + if (txq->cb)
> + txq->cb(txq->private);
> + if (rxq->cb)
> + rxq->cb(rxq->private);
> + local_bh_enable();
> +
> + if (++pkts > 4) {
> + schedule_work(&vdpasim->work);
> + goto out;
> + }
> + }
> +
> +out:
> + spin_unlock(&vdpasim->lock);
> +}
> +
> +static int __init vdpasim_net_init(void)
> +{
> + struct vdpasim_init_attr attr = {};
> + struct virtio_net_config *config;
> + int ret;
> +
> + attr.device_id = VIRTIO_ID_NET;
> + attr.features = VDPASIM_FEATURES | VDPASIM_NET_FEATURES;
> + attr.work_fn = vdpasim_net_work;
> + attr.batch_mapping = batch_mapping;
> + vdpasim_net_dev = vdpasim_create(&attr);
> + if (IS_ERR(vdpasim_net_dev)) {
> + ret = PTR_ERR(vdpasim_net_dev);
> + goto out;
> + }
> +
> + config = (struct virtio_net_config *)vdpasim_net_dev->config;
> +
> + if (macaddr) {
> + mac_pton(macaddr, config->mac);
> + if (!is_valid_ether_addr(config->mac)) {
> + ret = -EADDRNOTAVAIL;
> + goto put_dev;
> + }
> + } else {
> + eth_random_addr(config->mac);
> + }
> +
> + ret = vdpa_register_device(&vdpasim_net_dev->vdpa);
> + if (ret)
> + goto put_dev;
> +
> + return 0;
> +
> +put_dev:
> + put_device(&vdpasim_net_dev->vdpa.dev);
> +out:
> + return ret;
> +}
> +
> +static void __exit vdpasim_net_exit(void)
> +{
> + struct vdpa_device *vdpa = &vdpasim_net_dev->vdpa;
> +
> + vdpa_unregister_device(vdpa);
> +}
> +
> +module_init(vdpasim_net_init);
> +module_exit(vdpasim_net_exit);
> +
> +MODULE_VERSION(DRV_VERSION);
> +MODULE_LICENSE(DRV_LICENSE);
> +MODULE_AUTHOR(DRV_AUTHOR);
> +MODULE_DESCRIPTION("vDPA Device Simulator for networking device");
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index d7d32b656102..fdb1a9267347 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -9,11 +9,16 @@ menuconfig VDPA
> if VDPA
>
> config VDPA_SIM
> - tristate "vDPA device simulator"
> + tristate "vDPA simulator core"
> depends on RUNTIME_TESTING_MENU && HAS_DMA
> select DMA_OPS
> select VHOST_RING
> default n
> +
> +config VDPA_SIM_NET
> + tristate "vDPA simulator for networking device"
> + depends on VDPA_SIM
> + default n
I remember somebody telling me that a module is disabled by default if
we don't enable it, so "default n" may be redundant here.
Thanks
> help
> vDPA networking device simulator which loop TX traffic back
> to RX. This device is used for testing, prototyping and
> diff --git a/drivers/vdpa/vdpa_sim/Makefile b/drivers/vdpa/vdpa_sim/Makefile
> index b40278f65e04..79d4536d347e 100644
> --- a/drivers/vdpa/vdpa_sim/Makefile
> +++ b/drivers/vdpa/vdpa_sim/Makefile
> @@ -1,2 +1,3 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
> +obj-$(CONFIG_VDPA_SIM_NET) += vdpa_sim_net.o
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> From: Max Gurtovoy <[email protected]>
>
> Add a new attribute that will define the number of virt queues to be
> created for the vdpasim device.
>
> Signed-off-by: Max Gurtovoy <[email protected]>
> [sgarzare: replace kmalloc_array() with kcalloc()]
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> v1:
> - use kcalloc() instead of kmalloc_array() since some function expects
> variables initialized to zero
Looks good. One nit: I would prefer to do this before patch 2.
Thanks
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> The simulated devices can support multiple queues, so this limit
> should be defined according to the number of queues supported by
> the device.
>
> Since we are in a simulator, let's simply remove that limit.
>
> Suggested-by: Jason Wang <[email protected]>
> Signed-off-by: Stefano Garzarella <[email protected]>
Acked-by: Jason Wang <[email protected]>
It would be good to introduce a macro instead of using the magic 0 here.
Thanks
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 2b4fea354413..9c9717441bbe 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -230,7 +230,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> goto err_iommu;
> set_dma_ops(dev, &vdpasim_dma_ops);
>
> - vdpasim->iommu = vhost_iotlb_alloc(2048, 0);
> + vdpasim->iommu = vhost_iotlb_alloc(0, 0);
> if (!vdpasim->iommu)
> goto err_iommu;
>
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> Move device properties used during the entire life cycle in a new
> structure to simplify the copy of these fields during the vdpasim
> initialization.
>
> Signed-off-by: Stefano Garzarella <[email protected]>
It would be better to do it before patch 2.
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 17 ++++++++------
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 33 ++++++++++++++--------------
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 8 +++++--
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 9 +++++---
> 4 files changed, 38 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index 6a1267c40d5e..76e642042eb0 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -40,12 +40,17 @@ struct vdpasim_virtqueue {
> irqreturn_t (*cb)(void *data);
> };
>
> +struct vdpasim_device {
> + u64 supported_features;
> + u32 id;
> + int nvqs;
> +};
> +
> struct vdpasim_init_attr {
> - u32 device_id;
> - u64 features;
> + struct vdpasim_device device;
> + int batch_mapping;
> +
> work_func_t work_fn;
> - int batch_mapping;
> - int nvqs;
> };
>
> /* State of each vdpasim device */
> @@ -53,18 +58,16 @@ struct vdpasim {
> struct vdpa_device vdpa;
> struct vdpasim_virtqueue *vqs;
> struct work_struct work;
> + struct vdpasim_device device;
> /* spinlock to synchronize virtqueue state */
> spinlock_t lock;
> /* virtio config according to device type */
> void *config;
> struct vhost_iotlb *iommu;
> void *buffer;
> - u32 device_id;
> u32 status;
> u32 generation;
> u64 features;
> - u64 supported_features;
> - int nvqs;
> /* spinlock to synchronize iommu table */
> spinlock_t iommu_lock;
> };
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 9c9717441bbe..d053bd14b3f8 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -28,7 +28,7 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
> {
> struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
>
> - vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
> + vringh_init_iotlb(&vq->vring, vdpasim->device.supported_features,
> VDPASIM_QUEUE_MAX, false,
> (struct vring_desc *)(uintptr_t)vq->desc_addr,
> (struct vring_avail *)
> @@ -46,7 +46,7 @@ static void vdpasim_vq_reset(struct vdpasim *vdpasim,
> vq->device_addr = 0;
> vq->cb = NULL;
> vq->private = NULL;
> - vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
> + vringh_init_iotlb(&vq->vring, vdpasim->device.supported_features,
> VDPASIM_QUEUE_MAX, false, NULL, NULL, NULL);
> }
>
> @@ -54,7 +54,7 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
> {
> int i;
>
> - for (i = 0; i < vdpasim->nvqs; i++)
> + for (i = 0; i < vdpasim->device.nvqs; i++)
> vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
>
> spin_lock(&vdpasim->iommu_lock);
> @@ -189,7 +189,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> struct device *dev;
> int i, size, ret = -ENOMEM;
>
> - device_id = attr->device_id;
> + device_id = attr->device.id;
> /* Currently, we only accept the network and block devices. */
> if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
> return ERR_PTR(-EOPNOTSUPP);
> @@ -200,10 +200,12 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> ops = &vdpasim_config_ops;
>
> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> - attr->nvqs);
> + attr->device.nvqs);
> if (!vdpasim)
> goto err_alloc;
>
> + vdpasim->device = attr->device;
> +
> if (device_id == VIRTIO_ID_NET)
> size = sizeof(struct virtio_net_config);
> else
> @@ -212,14 +214,11 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> if (!vdpasim->config)
> goto err_iommu;
>
> - vdpasim->vqs = kcalloc(attr->nvqs, sizeof(struct vdpasim_virtqueue),
> - GFP_KERNEL);
> + vdpasim->vqs = kcalloc(vdpasim->device.nvqs,
> + sizeof(struct vdpasim_virtqueue), GFP_KERNEL);
> if (!vdpasim->vqs)
> goto err_iommu;
>
> - vdpasim->device_id = device_id;
> - vdpasim->supported_features = attr->features;
> - vdpasim->nvqs = attr->nvqs;
> INIT_WORK(&vdpasim->work, attr->work_fn);
> spin_lock_init(&vdpasim->lock);
> spin_lock_init(&vdpasim->iommu_lock);
> @@ -238,7 +237,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> if (!vdpasim->buffer)
> goto err_iommu;
>
> - for (i = 0; i < vdpasim->nvqs; i++)
> + for (i = 0; i < vdpasim->device.nvqs; i++)
> vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
>
> vdpasim->vdpa.dma_dev = dev;
> @@ -347,7 +346,7 @@ static u64 vdpasim_get_features(struct vdpa_device *vdpa)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> - return vdpasim->supported_features;
> + return vdpasim->device.supported_features;
> }
>
> static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
> @@ -358,14 +357,14 @@ static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
> if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
> return -EINVAL;
>
> - vdpasim->features = features & vdpasim->supported_features;
> + vdpasim->features = features & vdpasim->device.supported_features;
>
> /* We generally only know whether guest is using the legacy interface
> * here, so generally that's the earliest we can set config fields.
> * Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
> * implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
> */
> - if (vdpasim->device_id == VIRTIO_ID_NET) {
> + if (vdpasim->device.id == VIRTIO_ID_NET) {
> struct virtio_net_config *config =
> (struct virtio_net_config *)vdpasim->config;
>
> @@ -391,7 +390,7 @@ static u32 vdpasim_get_device_id(struct vdpa_device *vdpa)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> - return vdpasim->device_id;
> + return vdpasim->device.id;
> }
>
> static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa)
> @@ -427,10 +426,10 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> - if (vdpasim->device_id == VIRTIO_ID_BLOCK &&
> + if (vdpasim->device.id == VIRTIO_ID_BLOCK &&
> (offset + len < sizeof(struct virtio_blk_config)))
> memcpy(buf, vdpasim->config + offset, len);
> - else if (vdpasim->device_id == VIRTIO_ID_NET &&
> + else if (vdpasim->device.id == VIRTIO_ID_NET &&
> (offset + len < sizeof(struct virtio_net_config)))
> memcpy(buf, vdpasim->config + offset, len);
> }
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index 386dbb2f7138..363273d72e26 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -78,9 +78,13 @@ static int __init vdpasim_blk_init(void)
> struct virtio_blk_config *config;
> int ret;
>
> - attr.device_id = VIRTIO_ID_BLOCK;
> - attr.features = VDPASIM_FEATURES | VDPASIM_BLK_FEATURES;
> + attr.device.id = VIRTIO_ID_BLOCK;
> + attr.device.supported_features = VDPASIM_FEATURES |
> + VDPASIM_BLK_FEATURES;
> + attr.device.nvqs = VDPASIM_BLK_VQ_NUM;
> +
> attr.work_fn = vdpasim_blk_work;
> +
> vdpasim_blk_dev = vdpasim_create(&attr);
> if (IS_ERR(vdpasim_blk_dev)) {
> ret = PTR_ERR(vdpasim_blk_dev);
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index e1e57c52b108..88c9569f6bd3 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -105,11 +105,14 @@ static int __init vdpasim_net_init(void)
> struct virtio_net_config *config;
> int ret;
>
> - attr.device_id = VIRTIO_ID_NET;
> - attr.features = VDPASIM_FEATURES | VDPASIM_NET_FEATURES;
> - attr.nvqs = VDPASIM_NET_VQ_NUM;
> + attr.device.id = VIRTIO_ID_NET;
> + attr.device.supported_features = VDPASIM_FEATURES |
> + VDPASIM_NET_FEATURES;
> + attr.device.nvqs = VDPASIM_NET_VQ_NUM;
> +
> attr.work_fn = vdpasim_net_work;
> attr.batch_mapping = batch_mapping;
> +
Unnecessary changes.
Thanks
> vdpasim_net_dev = vdpasim_create(&attr);
> if (IS_ERR(vdpasim_net_dev)) {
> ret = PTR_ERR(vdpasim_net_dev);
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> The next patch will make the buffer size configurable from each
> device.
> Since the buffer could be larger than a page, we use kvmalloc()
> instead of kmalloc().
>
> Signed-off-by: Stefano Garzarella <[email protected]>
Acked-by: Jason Wang <[email protected]>
Thanks
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 9c29c2013661..bd034fbf4683 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -223,7 +223,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> if (!vdpasim->iommu)
> goto err_iommu;
>
> - vdpasim->buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
> + vdpasim->buffer = kvmalloc(PAGE_SIZE, GFP_KERNEL);
> if (!vdpasim->buffer)
> goto err_iommu;
>
> @@ -495,7 +495,7 @@ static void vdpasim_free(struct vdpa_device *vdpa)
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>
> cancel_work_sync(&vdpasim->work);
> - kfree(vdpasim->buffer);
> + kvfree(vdpasim->buffer);
> if (vdpasim->iommu)
> vhost_iotlb_free(vdpasim->iommu);
> kfree(vdpasim->vqs);
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> Allow each device to specify the size of the buffer allocated
> in vdpa_sim.
>
> Signed-off-by: Stefano Garzarella <[email protected]>
Acked-by: Jason Wang <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 1 +
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 1 +
> 4 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index f7e1fe0a88d3..cc21e07aa2f7 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -49,6 +49,7 @@ struct vdpasim_device {
>
> struct vdpasim_init_attr {
> struct vdpasim_device device;
> + size_t buffer_size;
> int batch_mapping;
>
> work_func_t work_fn;
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index bd034fbf4683..3863d49e0d6d 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -223,7 +223,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
> if (!vdpasim->iommu)
> goto err_iommu;
>
> - vdpasim->buffer = kvmalloc(PAGE_SIZE, GFP_KERNEL);
> + vdpasim->buffer = kvmalloc(attr->buffer_size, GFP_KERNEL);
> if (!vdpasim->buffer)
> goto err_iommu;
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index f456a0e4e097..122a3c039507 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -100,6 +100,7 @@ static int __init vdpasim_blk_init(void)
> attr.device.update_config = vdpasim_blk_update_config;
>
> attr.work_fn = vdpasim_blk_work;
> + attr.buffer_size = PAGE_SIZE;
>
> vdpasim_blk_dev = vdpasim_create(&attr);
> if (IS_ERR(vdpasim_blk_dev)) {
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index b9372fdf2415..d0a1403f64b2 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -124,6 +124,7 @@ static int __init vdpasim_net_init(void)
>
> attr.work_fn = vdpasim_net_work;
> attr.batch_mapping = batch_mapping;
> + attr.buffer_size = PAGE_SIZE;
>
> vdpasim_net_dev = vdpasim_create(&attr);
> if (IS_ERR(vdpasim_net_dev)) {
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> From: Max Gurtovoy <[email protected]>
>
> This will allow running vDPA for virtio block protocol.
>
> Signed-off-by: Max Gurtovoy <[email protected]>
> [sgarzare: various cleanups/fixes]
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> v1:
> - Removed unused headers
> - Used cpu_to_vdpasim*() to store config fields
> - Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
> option can not depend on other [Jason]
> - Start with a single queue for now [Jason]
> - Add comments to memory barriers
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 124 +++++++++++++++++++++++++++
> drivers/vdpa/Kconfig | 9 ++
> drivers/vdpa/vdpa_sim/Makefile | 1 +
> 3 files changed, 134 insertions(+)
> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> new file mode 100644
> index 000000000000..386dbb2f7138
> --- /dev/null
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * VDPA simulator for block device.
> + *
> + * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
> + *
> + */
> +
> +#include <linux/module.h>
> +
> +#include "vdpa_sim.h"
> +
> +#define VDPASIM_BLK_FEATURES ((1ULL << VIRTIO_BLK_F_SIZE_MAX) | \
> + (1ULL << VIRTIO_BLK_F_SEG_MAX) | \
> + (1ULL << VIRTIO_BLK_F_BLK_SIZE) | \
> + (1ULL << VIRTIO_BLK_F_TOPOLOGY) | \
> + (1ULL << VIRTIO_BLK_F_MQ))
> +
> +#define VDPASIM_BLK_CAPACITY 0x40000
> +#define VDPASIM_BLK_SIZE_MAX 0x1000
> +#define VDPASIM_BLK_SEG_MAX 32
> +#define VDPASIM_BLK_VQ_NUM 1
> +
> +static struct vdpasim *vdpasim_blk_dev;
> +
> +static void vdpasim_blk_work(struct work_struct *work)
> +{
> + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> + u8 status = VIRTIO_BLK_S_OK;
> + int i;
> +
> + spin_lock(&vdpasim->lock);
> +
> + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> + goto out;
> +
> + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> +
> + if (!vq->ready)
> + continue;
> +
> + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
> + &vq->head, GFP_ATOMIC) > 0) {
> +
> + int write;
> +
> + vq->iov.i = vq->iov.used - 1;
> + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
> + if (write <= 0)
> + break;
> +
> + /* Make sure data is wrote before advancing index */
> + smp_wmb();
> +
> + vringh_complete_iotlb(&vq->vring, vq->head, write);
> +
> + /* Make sure used is visible before rasing the interrupt. */
> + smp_wmb();
> +
> + if (vringh_need_notify_iotlb(&vq->vring) > 0)
> + vringh_notify(&vq->vring);
Do we initialize vrh->notify anywhere? This also seems to duplicate
the vq->cb call that follows.
I think the correct way is to initialize vrh->notify and use
vringh_need_notify_iotlb()/vringh_notify() instead of the vq->cb here.
While at it, it would be better to convert the net simulator to do the same.
Thanks
> +
> + local_bh_disable();
> + if (vq->cb)
> + vq->cb(vq->private);
> + local_bh_enable();
> + }
> + }
> +out:
> + spin_unlock(&vdpasim->lock);
> +
> +}
> +
> +static int __init vdpasim_blk_init(void)
> +{
> + struct vdpasim_init_attr attr = {};
> + struct virtio_blk_config *config;
> + int ret;
> +
> + attr.device_id = VIRTIO_ID_BLOCK;
> + attr.features = VDPASIM_FEATURES | VDPASIM_BLK_FEATURES;
> + attr.work_fn = vdpasim_blk_work;
> + vdpasim_blk_dev = vdpasim_create(&attr);
> + if (IS_ERR(vdpasim_blk_dev)) {
> + ret = PTR_ERR(vdpasim_blk_dev);
> + goto out;
> + }
> +
> + config = (struct virtio_blk_config *)vdpasim_blk_dev->config;
> + config->capacity = cpu_to_vdpasim64(vdpasim_blk_dev, VDPASIM_BLK_CAPACITY);
> + config->size_max = cpu_to_vdpasim32(vdpasim_blk_dev, VDPASIM_BLK_SIZE_MAX);
> + config->seg_max = cpu_to_vdpasim32(vdpasim_blk_dev, VDPASIM_BLK_SEG_MAX);
> + config->num_queues = cpu_to_vdpasim16(vdpasim_blk_dev, VDPASIM_BLK_VQ_NUM);
> + config->min_io_size = cpu_to_vdpasim16(vdpasim_blk_dev, 1);
> + config->opt_io_size = cpu_to_vdpasim32(vdpasim_blk_dev, 1);
> + config->blk_size = cpu_to_vdpasim32(vdpasim_blk_dev, 512);
> +
> + ret = vdpa_register_device(&vdpasim_blk_dev->vdpa);
> + if (ret)
> + goto put_dev;
> +
> + return 0;
> +
> +put_dev:
> + put_device(&vdpasim_blk_dev->vdpa.dev);
> +out:
> + return ret;
> +}
> +
> +static void __exit vdpasim_blk_exit(void)
> +{
> + struct vdpa_device *vdpa = &vdpasim_blk_dev->vdpa;
> +
> + vdpa_unregister_device(vdpa);
> +}
> +
> +module_init(vdpasim_blk_init)
> +module_exit(vdpasim_blk_exit)
> +
> +MODULE_VERSION(DRV_VERSION);
> +MODULE_LICENSE(DRV_LICENSE);
> +MODULE_AUTHOR("Max Gurtovoy <[email protected]>");
> +MODULE_DESCRIPTION("vDPA Device Simulator for block device");
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index fdb1a9267347..0fb63362cd5d 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -24,6 +24,15 @@ config VDPA_SIM_NET
> to RX. This device is used for testing, prototyping and
> development of vDPA.
>
> +config VDPA_SIM_BLOCK
> + tristate "vDPA simulator for block device"
> + depends on VDPA_SIM
> + default n
> + help
> + vDPA block device simulator which terminates IO request in a
> + memory buffer. This device is used for testing, prototyping and
> + development of vDPA.
> +
> config IFCVF
> tristate "Intel IFC VF vDPA driver"
> depends on PCI_MSI
> diff --git a/drivers/vdpa/vdpa_sim/Makefile b/drivers/vdpa/vdpa_sim/Makefile
> index 79d4536d347e..d458103302f2 100644
> --- a/drivers/vdpa/vdpa_sim/Makefile
> +++ b/drivers/vdpa/vdpa_sim/Makefile
> @@ -1,3 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
> obj-$(CONFIG_VDPA_SIM_NET) += vdpa_sim_net.o
> +obj-$(CONFIG_VDPA_SIM_BLOCK) += vdpa_sim_blk.o
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> In order to simplify the code of the vdpa_sim core, we move the
> config management in each device simulator.
>
> The device must provide the size of config structure and a callback
> to update this structure called during the vdpasim_set_features().
Similarly, I suggest doing this before patch 2; then there's no need to
convert the blk device.
>
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 5 +++--
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 29 +++++-----------------------
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 27 ++++++++++++++++----------
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 12 ++++++++++++
> 4 files changed, 37 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index 76e642042eb0..f7e1fe0a88d3 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -10,8 +10,6 @@
> #include <linux/vdpa.h>
> #include <linux/vhost_iotlb.h>
> #include <uapi/linux/virtio_config.h>
> -#include <uapi/linux/virtio_net.h>
> -#include <uapi/linux/virtio_blk.h>
>
> #define DRV_VERSION "0.1"
> #define DRV_AUTHOR "Jason Wang <[email protected]>"
> @@ -42,8 +40,11 @@ struct vdpasim_virtqueue {
>
> struct vdpasim_device {
> u64 supported_features;
> + size_t config_size;
> u32 id;
> int nvqs;
> +
> + void (*update_config)(struct vdpasim *vdpasim);
Let's use set_config/get_config to align with virtio/vhost.
Everything else looks good.
Thanks
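
To illustrate the suggestion, here is a minimal userspace sketch of
per-device get_config/set_config hooks. It is not code from the series:
struct sim_device, sim_get_config() and the toy_blk_* names are
hypothetical, and the bounds check is only one possible way for the core
to validate offset and len against config_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model, not the kernel structures. */
struct sim_device {
    size_t config_size;
    void *config;
    /* Per-device hooks, named to mirror virtio/vhost. */
    void (*get_config)(struct sim_device *dev, void *config);
    void (*set_config)(struct sim_device *dev, const void *config);
};

/*
 * Core helper: let the device refresh its config, then copy the
 * requested window. Returns 0 on success, -1 if the window does not
 * fit inside config_size (written so the check cannot overflow).
 */
static int sim_get_config(struct sim_device *dev, size_t offset,
                          void *buf, size_t len)
{
    if (offset > dev->config_size || len > dev->config_size - offset)
        return -1;

    if (dev->get_config)
        dev->get_config(dev, dev->config);

    memcpy(buf, (uint8_t *)dev->config + offset, len);
    return 0;
}

/* Toy device: a block-like config with only a capacity field. */
struct toy_blk_config {
    uint64_t capacity;
};

static void toy_blk_get_config(struct sim_device *dev, void *config)
{
    struct toy_blk_config *cfg = config;

    (void)dev;
    cfg->capacity = 0x40000;
}
```

With hooks like these the core no longer needs to know the device type
when reading the config space; it only needs config_size.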
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> In some cases, it may be useful to provide a way to skip a number
> of bytes in a vringh_iov.
>
> In order to keep vringh_iov consistent, let's reuse vringh_iov_xfer()
> logic and skip bytes when the ptr is NULL.
>
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
>
> I'm not sure if this is the best option, maybe we can add a new
> function vringh_iov_skip().
>
> Suggestions?
It might be worth checking whether we can convert vringh_iov to use an
iov iterator; then we could use iov_iter_advance() here.
Thanks
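
For reference, a dedicated skip helper could look roughly like the
userspace sketch below. It follows the same bookkeeping as the quoted
vringh_iov_xfer() loop but never copies data. The toy_iovec, toy_kiov
and toy_kiov_skip() names are hypothetical stand-ins and not part of
vringh.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kvec and vringh_kiov. */
struct toy_iovec {
    char *base;
    size_t len;
};

struct toy_kiov {
    struct toy_iovec *iov;
    size_t consumed;      /* bytes consumed from the current element */
    unsigned int i, used;
};

/* Advance the iov by at most len bytes; returns the bytes skipped. */
static size_t toy_kiov_skip(struct toy_kiov *iov, size_t len)
{
    size_t done = 0;

    while (len && iov->i < iov->used) {
        struct toy_iovec *cur = &iov->iov[iov->i];
        size_t partlen = cur->len < len ? cur->len : len;

        done += partlen;
        len -= partlen;
        iov->consumed += partlen;
        cur->len -= partlen;
        cur->base += partlen;

        if (!cur->len) {
            /* Restore the finished element, then move on. */
            cur->len = iov->consumed;
            cur->base -= iov->consumed;
            iov->consumed = 0;
            iov->i++;
        }
    }

    return done;
}
```

A separate function like this would leave vringh_iov_xfer() unchanged
and avoid giving a NULL ptr a special meaning.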
> ---
> drivers/vhost/vringh.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> index 8bd8b403f087..ed3290946ad7 100644
> --- a/drivers/vhost/vringh.c
> +++ b/drivers/vhost/vringh.c
> @@ -75,7 +75,9 @@ static inline int __vringh_get_head(const struct vringh *vrh,
> return head;
> }
>
> -/* Copy some bytes to/from the iovec. Returns num copied. */
> +/* Copy some bytes to/from the iovec. Returns num copied.
> + * If ptr is NULL, skips at most len bytes.
> + */
> static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
> struct vringh_kiov *iov,
> void *ptr, size_t len,
> @@ -89,12 +91,16 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
> size_t partlen;
>
> partlen = min(iov->iov[iov->i].iov_len, len);
> - err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
> - if (err)
> - return err;
> +
> + if (ptr) {
> + err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
> + if (err)
> + return err;
> + ptr += partlen;
> + }
> +
> done += partlen;
> len -= partlen;
> - ptr += partlen;
> iov->consumed += partlen;
> iov->iov[iov->i].iov_len -= partlen;
> iov->iov[iov->i].iov_base += partlen;
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> vringh_getdesc_iotlb() manages 2 iovs for writable and readable
> descriptors. This is very useful for the block device, where for
> each request we have both types of descriptor.
>
> Let's split the vdpasim_virtqueue's iov field in riov and wiov
> to use them with vringh_getdesc_iotlb().
>
> Signed-off-by: Stefano Garzarella <[email protected]>
Acked-by: Jason Wang <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 3 ++-
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 6 +++---
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 8 ++++----
> 3 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index cc21e07aa2f7..0d4629675e4b 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -27,7 +27,8 @@ struct vdpasim;
>
> struct vdpasim_virtqueue {
> struct vringh vring;
> - struct vringh_kiov iov;
> + struct vringh_kiov riov;
> + struct vringh_kiov wiov;
> unsigned short head;
> bool ready;
> u64 desc_addr;
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index 122a3c039507..8e41b3ab98d5 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -41,13 +41,13 @@ static void vdpasim_blk_work(struct work_struct *work)
> if (!vq->ready)
> continue;
>
> - while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
> + while (vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
> &vq->head, GFP_ATOMIC) > 0) {
>
> int write;
>
> - vq->iov.i = vq->iov.used - 1;
> - write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
> + vq->wiov.i = vq->wiov.used - 1;
> + write = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
> if (write <= 0)
> break;
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> index d0a1403f64b2..783b1e85b09c 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
> @@ -47,12 +47,12 @@ static void vdpasim_net_work(struct work_struct *work)
>
> while (true) {
> total_write = 0;
> - err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL,
> + err = vringh_getdesc_iotlb(&txq->vring, &txq->riov, NULL,
> &txq->head, GFP_ATOMIC);
> if (err <= 0)
> break;
>
> - err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov,
> + err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->wiov,
> &rxq->head, GFP_ATOMIC);
> if (err <= 0) {
> vringh_complete_iotlb(&txq->vring, txq->head, 0);
> @@ -60,13 +60,13 @@ static void vdpasim_net_work(struct work_struct *work)
> }
>
> while (true) {
> - read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov,
> + read = vringh_iov_pull_iotlb(&txq->vring, &txq->riov,
> vdpasim->buffer,
> PAGE_SIZE);
> if (read <= 0)
> break;
>
> - write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov,
> + write = vringh_iov_push_iotlb(&rxq->vring, &rxq->wiov,
> vdpasim->buffer, read);
> if (write <= 0)
> break;
On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
> The previous implementation wrote only the status of each request.
> This patch implements a more accurate block device simulator,
> providing a ramdisk-like behavior.
>
> Also handle VIRTIO_BLK_T_GET_ID request, always answering the
> "vdpa_blk_sim" string.
Let's use a separate patch for this.
>
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 151 +++++++++++++++++++++++----
> 1 file changed, 133 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index 8e41b3ab98d5..68e74383322f 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/module.h>
> +#include <linux/blkdev.h>
> #include <uapi/linux/virtio_blk.h>
>
> #include "vdpa_sim.h"
> @@ -24,10 +25,137 @@
>
> static struct vdpasim *vdpasim_blk_dev;
>
> +static int vdpasim_blk_handle_req(struct vdpasim *vdpasim,
> + struct vdpasim_virtqueue *vq)
> +{
> + size_t wrote = 0, to_read = 0, to_write = 0;
> + struct virtio_blk_outhdr hdr;
> + uint8_t status;
> + uint32_t type;
> + ssize_t bytes;
> + loff_t offset;
> + int i, ret;
> +
> + vringh_kiov_cleanup(&vq->riov);
> + vringh_kiov_cleanup(&vq->wiov);
It looks to me like we should do those after vringh_getdesc_iotlb()? See
the comment above vringh_getdesc_kern().
> +
> + ret = vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
> + &vq->head, GFP_ATOMIC);
> + if (ret != 1)
> + return ret;
> +
> + for (i = 0; i < vq->wiov.used; i++)
> + to_write += vq->wiov.iov[i].iov_len;
It would be better to introduce a helper for this (or consider using an
iov iterator).
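
A minimal userspace sketch of such a helper, assuming a simplified iov
layout. The toy_iovec, toy_kiov and toy_kiov_length() names are
hypothetical and only model the two summing loops in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kvec and vringh_kiov. */
struct toy_iovec {
    void *base;
    size_t len;
};

struct toy_kiov {
    struct toy_iovec *iov;
    unsigned int i, used;
};

/* Total number of bytes described by all used elements of the iov. */
static size_t toy_kiov_length(const struct toy_kiov *iov)
{
    size_t len = 0;
    unsigned int i;

    for (i = 0; i < iov->used; i++)
        len += iov->iov[i].len;

    return len;
}
```

A helper also gives a single place to check for an empty writable iov
before subtracting the status byte, since to_write is a size_t and
would wrap around if the total were 0.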
> + to_write -= 1; /* last byte is the status */
> +
> + for (i = 0; i < vq->riov.used; i++)
> + to_read += vq->riov.iov[i].iov_len;
> +
> + bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov, &hdr, sizeof(hdr));
> + if (bytes != sizeof(hdr))
> + return 0;
> +
> + to_read -= bytes;
> +
> + type = le32_to_cpu(hdr.type);
> + offset = le64_to_cpu(hdr.sector) << SECTOR_SHIFT;
> + status = VIRTIO_BLK_S_OK;
> +
> + switch (type) {
> + case VIRTIO_BLK_T_IN:
> + if (offset + to_write > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
> + dev_err(&vdpasim->vdpa.dev,
> + "reading over the capacity - offset: 0x%llx len: 0x%lx\n",
> + offset, to_write);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov,
> + vdpasim->buffer + offset,
> + to_write);
> + if (bytes < 0) {
> + dev_err(&vdpasim->vdpa.dev,
> + "vringh_iov_push_iotlb() error: %ld offset: 0x%llx len: 0x%lx\n",
> + bytes, offset, to_write);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + wrote += bytes;
> + break;
> +
> + case VIRTIO_BLK_T_OUT:
> + if (offset + to_read > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
> + dev_err(&vdpasim->vdpa.dev,
> + "writing over the capacity - offset: 0x%llx len: 0x%lx\n",
> + offset, to_read);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov,
> + vdpasim->buffer + offset,
> + to_read);
> + if (bytes < 0) {
> + dev_err(&vdpasim->vdpa.dev,
> + "vringh_iov_pull_iotlb() error: %ld offset: 0x%llx len: 0x%lx\n",
> + bytes, offset, to_read);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> + break;
> +
> + case VIRTIO_BLK_T_GET_ID: {
> + char id[VIRTIO_BLK_ID_BYTES] = "vdpa_blk_sim";
Let's use a global static one?
> +
> + bytes = vringh_iov_push_iotlb(&vq->vring,
> + &vq->wiov, id,
> + VIRTIO_BLK_ID_BYTES);
> + if (bytes < 0) {
> + dev_err(&vdpasim->vdpa.dev,
> + "vringh_iov_push_iotlb() error: %ld\n", bytes);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + wrote += bytes;
> + break;
> + }
> +
> + default:
> + dev_warn(&vdpasim->vdpa.dev,
> + "Unsupported request type %d\n", type);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + /* if VIRTIO_BLK_T_IN or VIRTIO_BLK_T_GET_ID fail, we need to skip
> + * the remaining bytes to put the status in the last byte
> + */
> + if (to_write - wrote > 0) {
> + vringh_iov_push_iotlb(&vq->vring, &vq->wiov, NULL,
> + to_write - wrote);
> + }
> +
> + /* last byte is the status */
> + bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
> + if (bytes != 1)
> + return 0;
> +
> + wrote += bytes;
> +
> + /* Make sure data is wrote before advancing index */
> + smp_wmb();
> +
> + vringh_complete_iotlb(&vq->vring, vq->head, wrote);
> +
> + return ret;
> +}
> +
> static void vdpasim_blk_work(struct work_struct *work)
> {
> struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> - u8 status = VIRTIO_BLK_S_OK;
> int i;
>
> spin_lock(&vdpasim->lock);
> @@ -41,21 +169,7 @@ static void vdpasim_blk_work(struct work_struct *work)
> if (!vq->ready)
> continue;
>
> - while (vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
> - &vq->head, GFP_ATOMIC) > 0) {
> -
> - int write;
> -
> - vq->wiov.i = vq->wiov.used - 1;
> - write = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
> - if (write <= 0)
> - break;
> -
> - /* Make sure data is wrote before advancing index */
> - smp_wmb();
> -
> - vringh_complete_iotlb(&vq->vring, vq->head, write);
> -
> + while (vdpasim_blk_handle_req(vdpasim, vq) > 0) {
> /* Make sure used is visible before rasing the interrupt. */
> smp_wmb();
>
> @@ -67,6 +181,7 @@ static void vdpasim_blk_work(struct work_struct *work)
> vq->cb(vq->private);
> local_bh_enable();
> }
> +
Unnecessary change.
Thanks
On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>Thanks to Max that started this work!
>>I took his patches, and extended the block simulator a bit.
>>
>>This series moves the network device simulator in a new module
>>(vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
>>module, allowing the possibility to add new vDPA device simulators.
>>Then we added a new vdpa_sim_blk module to simulate a block device.
>>
>>I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
>>bytes when ptr is NULL"), maybe we can add a new functions instead of
>>modify vringh_iov_xfer().
>>
>>As Max reported, I'm also seeing errors with vdpa_sim_blk related to
>>iotlb and vringh when there is high load, these are some of the error
>>messages I can see randomly:
>>
>> vringh: Failed to access avail idx at 00000000e8deb2cc
>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>> vringh: Failed to get flags at 000000006635d7a3
>>
>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset: 0x2840000 len: 0x20000
>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset: 0x58ee000 len: 0x3000
>>
>>These errors should all be related to the fact that iotlb_translate()
>>fails with -EINVAL, so it seems that we miss some mapping.
>
>
>Is this only reproducible when there are multiple concurrent accesses
>to the IOTLB? If yes, it's probably a hint that some kind of
>synchronization is still missing somewhere.
Yeah, maybe this is the case where virtio_ring and vringh use the IOTLB
concurrently.
>
>It might be useful to log the dma_map/unmap in both virtio_ring and
>vringh to see who is missing the map.
I'll try.
Thanks for the hints,
Stefano
On Mon, Nov 16, 2020 at 12:00:11PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>From: Max Gurtovoy <[email protected]>
>>
>>Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a
>>preparation for adding a vdpa simulator module for block devices.
>>
>>Signed-off-by: Max Gurtovoy <[email protected]>
>>[sgarzare: various cleanups/fixes]
>>Signed-off-by: Stefano Garzarella <[email protected]>
>>---
>>v1:
>>- Removed unused headers
>>- Removed empty module_init() module_exit()
>>- Moved vdpasim_is_little_endian() in vdpa_sim.h
>>- Moved vdpasim16_to_cpu/cpu_to_vdpasim16() in vdpa_sim.h
>>- Added vdpasim*_to_cpu/cpu_to_vdpasim*() also for 32 and 64
>>- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>> option can not depend on other [Jason]
>
>
>If possible, I would suggest splitting this patch further:
>
>1) convert to use void *config, and an attribute for setting config
>size during allocation
>2) introduce supported_features
>3) other attributes (#vqs)
>4) rename config ops (more generic one)
>5) introduce ops for set|get_config, set|get_features
>6) real split
>
Okay, I'll try to split Max's patch following your suggestion.
It should be cleaner.
Thanks,
Stefano
On Mon, Nov 16, 2020 at 12:10:19PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>From: Max Gurtovoy <[email protected]>
>>
>>This will allow running vDPA for virtio block protocol.
>>
>>Signed-off-by: Max Gurtovoy <[email protected]>
>>[sgarzare: various cleanups/fixes]
>>Signed-off-by: Stefano Garzarella <[email protected]>
>>---
>>v1:
>>- Removed unused headers
>>- Used cpu_to_vdpasim*() to store config fields
>>- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>> option can not depend on other [Jason]
>>- Start with a single queue for now [Jason]
>>- Add comments to memory barriers
>>---
>> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 124 +++++++++++++++++++++++++++
>> drivers/vdpa/Kconfig | 9 ++
>> drivers/vdpa/vdpa_sim/Makefile | 1 +
>> 3 files changed, 134 insertions(+)
>> create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>new file mode 100644
>>index 000000000000..386dbb2f7138
>>--- /dev/null
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>@@ -0,0 +1,124 @@
>>+// SPDX-License-Identifier: GPL-2.0-only
>>+/*
>>+ * VDPA simulator for block device.
>>+ *
>>+ * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
>>+ *
>>+ */
>>+
>>+#include <linux/module.h>
>>+
>>+#include "vdpa_sim.h"
>>+
>>+#define VDPASIM_BLK_FEATURES ((1ULL << VIRTIO_BLK_F_SIZE_MAX) | \
>>+ (1ULL << VIRTIO_BLK_F_SEG_MAX) | \
>>+ (1ULL << VIRTIO_BLK_F_BLK_SIZE) | \
>>+ (1ULL << VIRTIO_BLK_F_TOPOLOGY) | \
>>+ (1ULL << VIRTIO_BLK_F_MQ))
>>+
>>+#define VDPASIM_BLK_CAPACITY 0x40000
>>+#define VDPASIM_BLK_SIZE_MAX 0x1000
>>+#define VDPASIM_BLK_SEG_MAX 32
>>+#define VDPASIM_BLK_VQ_NUM 1
>>+
>>+static struct vdpasim *vdpasim_blk_dev;
>>+
>>+static void vdpasim_blk_work(struct work_struct *work)
>>+{
>>+ struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>>+ u8 status = VIRTIO_BLK_S_OK;
>>+ int i;
>>+
>>+ spin_lock(&vdpasim->lock);
>>+
>>+ if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
>>+ goto out;
>>+
>>+ for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
>>+ struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
>>+
>>+ if (!vq->ready)
>>+ continue;
>>+
>>+ while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
>>+ &vq->head, GFP_ATOMIC) > 0) {
>>+
>>+ int write;
>>+
>>+ vq->iov.i = vq->iov.used - 1;
>>+ write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
>>+ if (write <= 0)
>>+ break;
>>+
>>+ /* Make sure data is wrote before advancing index */
>>+ smp_wmb();
>>+
>>+ vringh_complete_iotlb(&vq->vring, vq->head, write);
>>+
>>+ /* Make sure used is visible before rasing the interrupt. */
>>+ smp_wmb();
>>+
>>+ if (vringh_need_notify_iotlb(&vq->vring) > 0)
>>+ vringh_notify(&vq->vring);
>
>
>Do we initialize vrh->notify anywhere? And This seems duplicated with
>the following vq->cb.
>
>I think the correct way is to initialize vrh->notify and use
>vringh_need_notify_iotlb()/vringh_notify() instead of the vq->cb here.
Okay, so I'll set vrh->notify in the vdpasim core with a function that
calls vq->cb() (the callback set through .set_vq_cb).
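
As a rough illustration of that plan, the userspace sketch below models
a notify hook that recovers the enclosing virtqueue and forwards to its
callback. All toy_* names are hypothetical stand-ins for the kernel
types, and toy_vringh_notify() only assumes that vringh_notify() calls
the hook when one is installed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct vringh and struct vdpasim_virtqueue. */
struct toy_vringh {
    void (*notify)(struct toy_vringh *vrh);
};

struct toy_virtqueue {
    struct toy_vringh vring;
    void (*cb)(void *data);
    void *private;
};

#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Notify hook: find the virtqueue and call the driver's callback. */
static void toy_vq_notify(struct toy_vringh *vrh)
{
    struct toy_virtqueue *vq =
        toy_container_of(vrh, struct toy_virtqueue, vring);

    if (vq->cb)
        vq->cb(vq->private);
}

/* Models vringh_notify(): call the hook only if one was installed. */
static void toy_vringh_notify(struct toy_vringh *vrh)
{
    if (vrh->notify)
        vrh->notify(vrh);
}

/* Example callback that counts how often it was invoked. */
static void toy_count_cb(void *data)
{
    (*(int *)data)++;
}
```

With the hook installed once in the core, the device work functions
would only need the vringh_need_notify_iotlb()/vringh_notify() pair and
could drop the direct vq->cb call.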
>
>And while at it, it's better to convert net simulator to do the same.
Sure.
Thanks,
Stefano
On Mon, Nov 16, 2020 at 12:14:31PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>Move device properties used during the entire life cycle in a new
>>structure to simplify the copy of these fields during the vdpasim
>>initialization.
>>
>>Signed-off-by: Stefano Garzarella <[email protected]>
>
>
>It would be better to do it before patch 2.
>
Okay, I'll move this patch.
>
>>---
>> drivers/vdpa/vdpa_sim/vdpa_sim.h | 17 ++++++++------
>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 33 ++++++++++++++--------------
>> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 8 +++++--
>> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 9 +++++---
>> 4 files changed, 38 insertions(+), 29 deletions(-)
>>
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>>index 6a1267c40d5e..76e642042eb0 100644
>>--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>>@@ -40,12 +40,17 @@ struct vdpasim_virtqueue {
>> irqreturn_t (*cb)(void *data);
>> };
>>+struct vdpasim_device {
>>+ u64 supported_features;
>>+ u32 id;
>>+ int nvqs;
>>+};
>>+
>> struct vdpasim_init_attr {
>>- u32 device_id;
>>- u64 features;
>>+ struct vdpasim_device device;
>>+ int batch_mapping;
>>+
>> work_func_t work_fn;
>>- int batch_mapping;
>>- int nvqs;
>> };
>> /* State of each vdpasim device */
>>@@ -53,18 +58,16 @@ struct vdpasim {
>> struct vdpa_device vdpa;
>> struct vdpasim_virtqueue *vqs;
>> struct work_struct work;
>>+ struct vdpasim_device device;
>> /* spinlock to synchronize virtqueue state */
>> spinlock_t lock;
>> /* virtio config according to device type */
>> void *config;
>> struct vhost_iotlb *iommu;
>> void *buffer;
>>- u32 device_id;
>> u32 status;
>> u32 generation;
>> u64 features;
>>- u64 supported_features;
>>- int nvqs;
>> /* spinlock to synchronize iommu table */
>> spinlock_t iommu_lock;
>> };
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>index 9c9717441bbe..d053bd14b3f8 100644
>>--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>@@ -28,7 +28,7 @@ static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
>> {
>> struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
>>- vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
>>+ vringh_init_iotlb(&vq->vring, vdpasim->device.supported_features,
>> VDPASIM_QUEUE_MAX, false,
>> (struct vring_desc *)(uintptr_t)vq->desc_addr,
>> (struct vring_avail *)
>>@@ -46,7 +46,7 @@ static void vdpasim_vq_reset(struct vdpasim *vdpasim,
>> vq->device_addr = 0;
>> vq->cb = NULL;
>> vq->private = NULL;
>>- vringh_init_iotlb(&vq->vring, vdpasim->supported_features,
>>+ vringh_init_iotlb(&vq->vring, vdpasim->device.supported_features,
>> VDPASIM_QUEUE_MAX, false, NULL, NULL, NULL);
>> }
>>@@ -54,7 +54,7 @@ static void vdpasim_reset(struct vdpasim *vdpasim)
>> {
>> int i;
>>- for (i = 0; i < vdpasim->nvqs; i++)
>>+ for (i = 0; i < vdpasim->device.nvqs; i++)
>> vdpasim_vq_reset(vdpasim, &vdpasim->vqs[i]);
>> spin_lock(&vdpasim->iommu_lock);
>>@@ -189,7 +189,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
>> struct device *dev;
>> int i, size, ret = -ENOMEM;
>>- device_id = attr->device_id;
>>+ device_id = attr->device.id;
>> /* Currently, we only accept the network and block devices. */
>> if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
>> return ERR_PTR(-EOPNOTSUPP);
>>@@ -200,10 +200,12 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
>> ops = &vdpasim_config_ops;
>> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
>>- attr->nvqs);
>>+ attr->device.nvqs);
>> if (!vdpasim)
>> goto err_alloc;
>>+ vdpasim->device = attr->device;
>>+
>> if (device_id == VIRTIO_ID_NET)
>> size = sizeof(struct virtio_net_config);
>> else
>>@@ -212,14 +214,11 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
>> if (!vdpasim->config)
>> goto err_iommu;
>>- vdpasim->vqs = kcalloc(attr->nvqs, sizeof(struct vdpasim_virtqueue),
>>- GFP_KERNEL);
>>+ vdpasim->vqs = kcalloc(vdpasim->device.nvqs,
>>+ sizeof(struct vdpasim_virtqueue), GFP_KERNEL);
>> if (!vdpasim->vqs)
>> goto err_iommu;
>>- vdpasim->device_id = device_id;
>>- vdpasim->supported_features = attr->features;
>>- vdpasim->nvqs = attr->nvqs;
>> INIT_WORK(&vdpasim->work, attr->work_fn);
>> spin_lock_init(&vdpasim->lock);
>> spin_lock_init(&vdpasim->iommu_lock);
>>@@ -238,7 +237,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
>> if (!vdpasim->buffer)
>> goto err_iommu;
>>- for (i = 0; i < vdpasim->nvqs; i++)
>>+ for (i = 0; i < vdpasim->device.nvqs; i++)
>> vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
>> vdpasim->vdpa.dma_dev = dev;
>>@@ -347,7 +346,7 @@ static u64 vdpasim_get_features(struct vdpa_device *vdpa)
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>- return vdpasim->supported_features;
>>+ return vdpasim->device.supported_features;
>> }
>> static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
>>@@ -358,14 +357,14 @@ static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
>> if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
>> return -EINVAL;
>>- vdpasim->features = features & vdpasim->supported_features;
>>+ vdpasim->features = features & vdpasim->device.supported_features;
>> /* We generally only know whether guest is using the legacy interface
>> * here, so generally that's the earliest we can set config fields.
>> * Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
>> * implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
>> */
>>- if (vdpasim->device_id == VIRTIO_ID_NET) {
>>+ if (vdpasim->device.id == VIRTIO_ID_NET) {
>> struct virtio_net_config *config =
>> (struct virtio_net_config *)vdpasim->config;
>>@@ -391,7 +390,7 @@ static u32 vdpasim_get_device_id(struct vdpa_device *vdpa)
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>- return vdpasim->device_id;
>>+ return vdpasim->device.id;
>> }
>> static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa)
>>@@ -427,10 +426,10 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>- if (vdpasim->device_id == VIRTIO_ID_BLOCK &&
>>+ if (vdpasim->device.id == VIRTIO_ID_BLOCK &&
>> (offset + len < sizeof(struct virtio_blk_config)))
>> memcpy(buf, vdpasim->config + offset, len);
>>- else if (vdpasim->device_id == VIRTIO_ID_NET &&
>>+ else if (vdpasim->device.id == VIRTIO_ID_NET &&
>> (offset + len < sizeof(struct virtio_net_config)))
>> memcpy(buf, vdpasim->config + offset, len);
>> }
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>index 386dbb2f7138..363273d72e26 100644
>>--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>@@ -78,9 +78,13 @@ static int __init vdpasim_blk_init(void)
>> struct virtio_blk_config *config;
>> int ret;
>>- attr.device_id = VIRTIO_ID_BLOCK;
>>- attr.features = VDPASIM_FEATURES | VDPASIM_BLK_FEATURES;
>>+ attr.device.id = VIRTIO_ID_BLOCK;
>>+ attr.device.supported_features = VDPASIM_FEATURES |
>>+ VDPASIM_BLK_FEATURES;
>>+ attr.device.nvqs = VDPASIM_BLK_VQ_NUM;
>>+
>> attr.work_fn = vdpasim_blk_work;
>>+
>> vdpasim_blk_dev = vdpasim_create(&attr);
>> if (IS_ERR(vdpasim_blk_dev)) {
>> ret = PTR_ERR(vdpasim_blk_dev);
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>>index e1e57c52b108..88c9569f6bd3 100644
>>--- a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>>@@ -105,11 +105,14 @@ static int __init vdpasim_net_init(void)
>> struct virtio_net_config *config;
>> int ret;
>>- attr.device_id = VIRTIO_ID_NET;
>>- attr.features = VDPASIM_FEATURES | VDPASIM_NET_FEATURES;
>>- attr.nvqs = VDPASIM_NET_VQ_NUM;
>>+ attr.device.id = VIRTIO_ID_NET;
>>+ attr.device.supported_features = VDPASIM_FEATURES |
>>+ VDPASIM_NET_FEATURES;
>>+ attr.device.nvqs = VDPASIM_NET_VQ_NUM;
>>+
>> attr.work_fn = vdpasim_net_work;
>> attr.batch_mapping = batch_mapping;
>>+
>
>
>Unnecessary changes.
I'll remove these new lines.
Thanks,
Stefano
On Mon, Nov 16, 2020 at 12:18:19PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>In order to simplify the code of the vdpa_sim core, we move the
>>config management in each device simulator.
>>
>>The device must provide the size of config structure and a callback
>>to update this structure called during the vdpasim_set_features().
>
>
>Similarly, I suggest to do this before patch 2, then there's no need
>for the conversion of blk device.
>
I'll do.
>
>>
>>Signed-off-by: Stefano Garzarella <[email protected]>
>>---
>> drivers/vdpa/vdpa_sim/vdpa_sim.h | 5 +++--
>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 29 +++++-----------------------
>> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 27 ++++++++++++++++----------
>> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 12 ++++++++++++
>> 4 files changed, 37 insertions(+), 36 deletions(-)
>>
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>>index 76e642042eb0..f7e1fe0a88d3 100644
>>--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>>@@ -10,8 +10,6 @@
>> #include <linux/vdpa.h>
>> #include <linux/vhost_iotlb.h>
>> #include <uapi/linux/virtio_config.h>
>>-#include <uapi/linux/virtio_net.h>
>>-#include <uapi/linux/virtio_blk.h>
>> #define DRV_VERSION "0.1"
>> #define DRV_AUTHOR "Jason Wang <[email protected]>"
>>@@ -42,8 +40,11 @@ struct vdpasim_virtqueue {
>> struct vdpasim_device {
>> u64 supported_features;
>>+ size_t config_size;
>> u32 id;
>> int nvqs;
>>+
>>+ void (*update_config)(struct vdpasim *vdpasim);
>
>
>Let's use set_config/get_config to align with virtio/vhost.
Yes, that's better.
>
>Other looks good.
Thanks,
Stefano
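To make the split discussed above concrete, here is a minimal userspace sketch of a core that only knows the config size and defers the contents to a per-device callback. All names ending in _model, the config layout, and the values are assumptions for illustration; only config_size and update_config come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sim_model;

/* Hypothetical per-device attributes, loosely mirroring struct vdpasim_device. */
struct sim_dev_attr_model {
	size_t config_size;
	/* Fills the device-specific config space (update_config in the patch). */
	void (*update_config)(struct sim_model *sim, void *config);
};

struct sim_model {
	struct sim_dev_attr_model attr;
	void *config;
};

/* The core only needs the size; it never looks inside the config. */
static int sim_model_init(struct sim_model *sim,
			  const struct sim_dev_attr_model *attr)
{
	assert(attr->config_size > 0);
	sim->attr = *attr;
	sim->config = calloc(1, attr->config_size);
	return sim->config ? 0 : -1;
}

/* Stands in for the point in set_features where the config is refreshed. */
static void sim_model_features_set(struct sim_model *sim)
{
	if (sim->attr.update_config)
		sim->attr.update_config(sim, sim->config);
}

static void sim_model_destroy(struct sim_model *sim)
{
	free(sim->config);
	sim->config = NULL;
}

/* Stand-in for a block device module; the layout is illustrative only. */
struct blk_config_model {
	uint64_t capacity;
	uint32_t blk_size;
};

static void blk_model_update_config(struct sim_model *sim, void *config)
{
	struct blk_config_model *cfg = config;

	(void)sim;
	cfg->capacity = 0x40000; /* arbitrary example value */
	cfg->blk_size = 512;
}
```

With this shape the core needs no virtio_net.h or virtio_blk.h includes, which is what the header change in the patch relies on.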
On Mon, Nov 16, 2020 at 01:25:31PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>The previous implementation wrote only the status of each request.
>>This patch implements a more accurate block device simulator,
>>providing a ramdisk-like behavior.
>>
>>Also handle VIRTIO_BLK_T_GET_ID request, always answering the
>>"vdpa_blk_sim" string.
>
>
>Let's use a separate patch for this.
>
Okay, I'll do.
>
>>
>>Signed-off-by: Stefano Garzarella <[email protected]>
>>---
>> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 151 +++++++++++++++++++++++----
>> 1 file changed, 133 insertions(+), 18 deletions(-)
>>
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>index 8e41b3ab98d5..68e74383322f 100644
>>--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>@@ -7,6 +7,7 @@
>> */
>> #include <linux/module.h>
>>+#include <linux/blkdev.h>
>> #include <uapi/linux/virtio_blk.h>
>> #include "vdpa_sim.h"
>>@@ -24,10 +25,137 @@
>> static struct vdpasim *vdpasim_blk_dev;
>>+static int vdpasim_blk_handle_req(struct vdpasim *vdpasim,
>>+ struct vdpasim_virtqueue *vq)
>>+{
>>+ size_t wrote = 0, to_read = 0, to_write = 0;
>>+ struct virtio_blk_outhdr hdr;
>>+ uint8_t status;
>>+ uint32_t type;
>>+ ssize_t bytes;
>>+ loff_t offset;
>>+ int i, ret;
>>+
>>+ vringh_kiov_cleanup(&vq->riov);
>>+ vringh_kiov_cleanup(&vq->wiov);
>
>
>It looks to me we should do those after vringh_get_desc_iotlb()? See
>comment above vringh_getdesc_kern().
Do you mean after the last vringh_iov_push_iotlb()?
Because vringh_kiov_cleanup() will free the allocated iov[].
>
>
>>+
>>+ ret = vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
>>+ &vq->head, GFP_ATOMIC);
>>+ if (ret != 1)
>>+ return ret;
>>+
>>+ for (i = 0; i < vq->wiov.used; i++)
>>+ to_write += vq->wiov.iov[i].iov_len;
>
>
>It's better to introduce a helper for this (or consider to use iov
>iterator).
Okay, I'll try to find the best solution.
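One possible shape for such a helper, as a userspace sketch. The _model structs are simplified stand-ins for struct kvec and struct vringh_kiov, and kiov_model_length() is a hypothetical name, not an existing vringh function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct kvec. */
struct kvec_model {
	void *iov_base;
	size_t iov_len;
};

/* Simplified stand-in for struct vringh_kiov. */
struct kiov_model {
	struct kvec_model *iov;
	unsigned int i;    /* index of the next element to consume */
	unsigned int used; /* number of valid elements in iov[] */
};

/* Hypothetical helper: total number of bytes described by the iov. */
static size_t kiov_model_length(const struct kiov_model *kiov)
{
	size_t len = 0;
	unsigned int n;

	assert(kiov->used == 0 || kiov->iov != NULL);

	for (n = 0; n < kiov->used; n++)
		len += kiov->iov[n].iov_len;

	return len;
}
```

The two open-coded loops over wiov and riov would then each become a single call.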
>
>
>>+ to_write -= 1; /* last byte is the status */
>>+
>>+ for (i = 0; i < vq->riov.used; i++)
>>+ to_read += vq->riov.iov[i].iov_len;
>>+
>>+ bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov, &hdr, sizeof(hdr));
>>+ if (bytes != sizeof(hdr))
>>+ return 0;
>>+
>>+ to_read -= bytes;
>>+
>>+ type = le32_to_cpu(hdr.type);
>>+ offset = le64_to_cpu(hdr.sector) << SECTOR_SHIFT;
>>+ status = VIRTIO_BLK_S_OK;
>>+
>>+ switch (type) {
>>+ case VIRTIO_BLK_T_IN:
>>+ if (offset + to_write > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
>>+ dev_err(&vdpasim->vdpa.dev,
>>+ "reading over the capacity - offset: 0x%llx len: 0x%lx\n",
>>+ offset, to_write);
>>+ status = VIRTIO_BLK_S_IOERR;
>>+ break;
>>+ }
>>+
>>+ bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov,
>>+ vdpasim->buffer + offset,
>>+ to_write);
>>+ if (bytes < 0) {
>>+ dev_err(&vdpasim->vdpa.dev,
>>+ "vringh_iov_push_iotlb() error: %ld offset: 0x%llx len: 0x%lx\n",
>>+ bytes, offset, to_write);
>>+ status = VIRTIO_BLK_S_IOERR;
>>+ break;
>>+ }
>>+
>>+ wrote += bytes;
>>+ break;
>>+
>>+ case VIRTIO_BLK_T_OUT:
>>+ if (offset + to_read > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
>>+ dev_err(&vdpasim->vdpa.dev,
>>+ "writing over the capacity - offset: 0x%llx len: 0x%lx\n",
>>+ offset, to_read);
>>+ status = VIRTIO_BLK_S_IOERR;
>>+ break;
>>+ }
>>+
>>+ bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov,
>>+ vdpasim->buffer + offset,
>>+ to_read);
>>+ if (bytes < 0) {
>>+ dev_err(&vdpasim->vdpa.dev,
>>+ "vringh_iov_pull_iotlb() error: %ld offset: 0x%llx len: 0x%lx\n",
>>+ bytes, offset, to_read);
>>+ status = VIRTIO_BLK_S_IOERR;
>>+ break;
>>+ }
>>+ break;
>>+
>>+ case VIRTIO_BLK_T_GET_ID: {
>>+ char id[VIRTIO_BLK_ID_BYTES] = "vdpa_blk_sim";
>
>
>Let's use a global static one?
I'll do.
>
>
>>+
>>+ bytes = vringh_iov_push_iotlb(&vq->vring,
>>+ &vq->wiov, id,
>>+ VIRTIO_BLK_ID_BYTES);
>>+ if (bytes < 0) {
>>+ dev_err(&vdpasim->vdpa.dev,
>>+ "vringh_iov_push_iotlb() error: %ld\n", bytes);
>>+ status = VIRTIO_BLK_S_IOERR;
>>+ break;
>>+ }
>>+
>>+ wrote += bytes;
>>+ break;
>>+ }
>>+
>>+ default:
>>+ dev_warn(&vdpasim->vdpa.dev,
>>+ "Unsupported request type %d\n", type);
>>+ status = VIRTIO_BLK_S_IOERR;
>>+ break;
>>+ }
>>+
>>+ /* if VIRTIO_BLK_T_IN or VIRTIO_BLK_T_GET_ID fail, we need to skip
>>+ * the remaining bytes to put the status in the last byte
>>+ */
>>+ if (to_write - wrote > 0) {
>>+ vringh_iov_push_iotlb(&vq->vring, &vq->wiov, NULL,
>>+ to_write - wrote);
>>+ }
>>+
>>+ /* last byte is the status */
>>+ bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
>>+ if (bytes != 1)
>>+ return 0;
>>+
>>+ wrote += bytes;
>>+
>>+ /* Make sure data is wrote before advancing index */
>>+ smp_wmb();
>>+
>>+ vringh_complete_iotlb(&vq->vring, vq->head, wrote);
>>+
>>+ return ret;
>>+}
>>+
>> static void vdpasim_blk_work(struct work_struct *work)
>> {
>> struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>>- u8 status = VIRTIO_BLK_S_OK;
>> int i;
>> spin_lock(&vdpasim->lock);
>>@@ -41,21 +169,7 @@ static void vdpasim_blk_work(struct work_struct *work)
>> if (!vq->ready)
>> continue;
>>- while (vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
>>- &vq->head, GFP_ATOMIC) > 0) {
>>-
>>- int write;
>>-
>>- vq->wiov.i = vq->wiov.used - 1;
>>- write = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
>>- if (write <= 0)
>>- break;
>>-
>>- /* Make sure data is wrote before advancing index */
>>- smp_wmb();
>>-
>>- vringh_complete_iotlb(&vq->vring, vq->head, write);
>>-
>>+ while (vdpasim_blk_handle_req(vdpasim, vq) > 0) {
>> /* Make sure used is visible before rasing the interrupt. */
>> smp_wmb();
>>@@ -67,6 +181,7 @@ static void vdpasim_blk_work(struct work_struct *work)
>> vq->cb(vq->private);
>> local_bh_enable();
>> }
>>+
>
>
>Unnecessary change.
Removed.
Thanks,
Stefano
On Mon, Nov 16, 2020 at 04:50:43AM -0500, Michael S. Tsirkin wrote:
>On Fri, Nov 13, 2020 at 02:47:12PM +0100, Stefano Garzarella wrote:
>> The previous implementation wrote only the status of each request.
>> This patch implements a more accurate block device simulator,
>> providing a ramdisk-like behavior.
>>
>> Also handle VIRTIO_BLK_T_GET_ID request, always answering the
>> "vdpa_blk_sim" string.
>
>Maybe an ioctl to specify the id makes more sense.
I agree that it makes sense to make it configurable by the user, but
I'm not sure an ioctl() is the best interface for this device simulator.
Maybe we can use a module parameter, as in the net simulator, or even
better, the new vdpa management tool recently proposed (I need to look
more closely at how we can extend it).
What do you think?
Thanks,
Stefano
On Mon, Nov 16, 2020 at 12:12:21PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>The simulated devices can support multiple queues, so this limit
>>should be defined according to the number of queues supported by
>>the device.
>>
>>Since we are in a simulator, let's simply remove that limit.
>>
>>Suggested-by: Jason Wang <[email protected]>
>>Signed-off-by: Stefano Garzarella <[email protected]>
>
>
>Acked-by: Jason Wang <[email protected]>
>
>It would be good to introduce a macro instead of using the magic 0 here.
Done.
Thanks,
Stefano
On Fri, Nov 13, 2020 at 02:47:12PM +0100, Stefano Garzarella wrote:
> The previous implementation wrote only the status of each request.
> This patch implements a more accurate block device simulator,
> providing a ramdisk-like behavior.
>
> Also handle VIRTIO_BLK_T_GET_ID request, always answering the
> "vdpa_blk_sim" string.
Maybe an ioctl to specify the id makes more sense.
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 151 +++++++++++++++++++++++----
> 1 file changed, 133 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index 8e41b3ab98d5..68e74383322f 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/module.h>
> +#include <linux/blkdev.h>
> #include <uapi/linux/virtio_blk.h>
>
> #include "vdpa_sim.h"
> @@ -24,10 +25,137 @@
>
> static struct vdpasim *vdpasim_blk_dev;
>
> +static int vdpasim_blk_handle_req(struct vdpasim *vdpasim,
> + struct vdpasim_virtqueue *vq)
> +{
> + size_t wrote = 0, to_read = 0, to_write = 0;
> + struct virtio_blk_outhdr hdr;
> + uint8_t status;
> + uint32_t type;
> + ssize_t bytes;
> + loff_t offset;
> + int i, ret;
> +
> + vringh_kiov_cleanup(&vq->riov);
> + vringh_kiov_cleanup(&vq->wiov);
> +
> + ret = vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
> + &vq->head, GFP_ATOMIC);
> + if (ret != 1)
> + return ret;
> +
> + for (i = 0; i < vq->wiov.used; i++)
> + to_write += vq->wiov.iov[i].iov_len;
> + to_write -= 1; /* last byte is the status */
> +
> + for (i = 0; i < vq->riov.used; i++)
> + to_read += vq->riov.iov[i].iov_len;
> +
> + bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov, &hdr, sizeof(hdr));
> + if (bytes != sizeof(hdr))
> + return 0;
> +
> + to_read -= bytes;
> +
> + type = le32_to_cpu(hdr.type);
> + offset = le64_to_cpu(hdr.sector) << SECTOR_SHIFT;
> + status = VIRTIO_BLK_S_OK;
> +
> + switch (type) {
> + case VIRTIO_BLK_T_IN:
> + if (offset + to_write > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
> + dev_err(&vdpasim->vdpa.dev,
> + "reading over the capacity - offset: 0x%llx len: 0x%lx\n",
> + offset, to_write);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov,
> + vdpasim->buffer + offset,
> + to_write);
> + if (bytes < 0) {
> + dev_err(&vdpasim->vdpa.dev,
> + "vringh_iov_push_iotlb() error: %ld offset: 0x%llx len: 0x%lx\n",
> + bytes, offset, to_write);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + wrote += bytes;
> + break;
> +
> + case VIRTIO_BLK_T_OUT:
> + if (offset + to_read > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
> + dev_err(&vdpasim->vdpa.dev,
> + "writing over the capacity - offset: 0x%llx len: 0x%lx\n",
> + offset, to_read);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov,
> + vdpasim->buffer + offset,
> + to_read);
> + if (bytes < 0) {
> + dev_err(&vdpasim->vdpa.dev,
> + "vringh_iov_pull_iotlb() error: %ld offset: 0x%llx len: 0x%lx\n",
> + bytes, offset, to_read);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> + break;
> +
> + case VIRTIO_BLK_T_GET_ID: {
> + char id[VIRTIO_BLK_ID_BYTES] = "vdpa_blk_sim";
> +
> + bytes = vringh_iov_push_iotlb(&vq->vring,
> + &vq->wiov, id,
> + VIRTIO_BLK_ID_BYTES);
> + if (bytes < 0) {
> + dev_err(&vdpasim->vdpa.dev,
> + "vringh_iov_push_iotlb() error: %ld\n", bytes);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + wrote += bytes;
> + break;
> + }
> +
> + default:
> + dev_warn(&vdpasim->vdpa.dev,
> + "Unsupported request type %d\n", type);
> + status = VIRTIO_BLK_S_IOERR;
> + break;
> + }
> +
> + /* if VIRTIO_BLK_T_IN or VIRTIO_BLK_T_GET_ID fail, we need to skip
> + * the remaining bytes to put the status in the last byte
> + */
> + if (to_write - wrote > 0) {
> + vringh_iov_push_iotlb(&vq->vring, &vq->wiov, NULL,
> + to_write - wrote);
> + }
> +
> + /* last byte is the status */
> + bytes = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
> + if (bytes != 1)
> + return 0;
> +
> + wrote += bytes;
> +
> + /* Make sure data is wrote before advancing index */
> + smp_wmb();
> +
> + vringh_complete_iotlb(&vq->vring, vq->head, wrote);
> +
> + return ret;
> +}
> +
> static void vdpasim_blk_work(struct work_struct *work)
> {
> struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> - u8 status = VIRTIO_BLK_S_OK;
> int i;
>
> spin_lock(&vdpasim->lock);
> @@ -41,21 +169,7 @@ static void vdpasim_blk_work(struct work_struct *work)
> if (!vq->ready)
> continue;
>
> - while (vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
> - &vq->head, GFP_ATOMIC) > 0) {
> -
> - int write;
> -
> - vq->wiov.i = vq->wiov.used - 1;
> - write = vringh_iov_push_iotlb(&vq->vring, &vq->wiov, &status, 1);
> - if (write <= 0)
> - break;
> -
> - /* Make sure data is wrote before advancing index */
> - smp_wmb();
> -
> - vringh_complete_iotlb(&vq->vring, vq->head, write);
> -
> + while (vdpasim_blk_handle_req(vdpasim, vq) > 0) {
> /* Make sure used is visible before rasing the interrupt. */
> smp_wmb();
>
> @@ -67,6 +181,7 @@ static void vdpasim_blk_work(struct work_struct *work)
> vq->cb(vq->private);
> local_bh_enable();
> }
> +
> }
> out:
> spin_unlock(&vdpasim->lock);
> @@ -84,7 +199,7 @@ static void vdpasim_blk_update_config(struct vdpasim *vdpasim)
> config->num_queues = cpu_to_vdpasim16(vdpasim, VDPASIM_BLK_VQ_NUM);
> config->min_io_size = cpu_to_vdpasim16(vdpasim, 1);
> config->opt_io_size = cpu_to_vdpasim32(vdpasim, 1);
> - config->blk_size = cpu_to_vdpasim32(vdpasim, 512);
> + config->blk_size = cpu_to_vdpasim32(vdpasim, SECTOR_SIZE);
> }
>
> static int __init vdpasim_blk_init(void)
> @@ -100,7 +215,7 @@ static int __init vdpasim_blk_init(void)
> attr.device.update_config = vdpasim_blk_update_config;
>
> attr.work_fn = vdpasim_blk_work;
> - attr.buffer_size = PAGE_SIZE;
> + attr.buffer_size = VDPASIM_BLK_CAPACITY << SECTOR_SHIFT;
>
> vdpasim_blk_dev = vdpasim_create(&attr);
> if (IS_ERR(vdpasim_blk_dev)) {
> --
> 2.26.2
On Mon, Nov 16, 2020 at 12:32:02PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>In some cases, it may be useful to provide a way to skip a number
>>of bytes in a vringh_iov.
>>
>>In order to keep vringh_iov consistent, let's reuse vringh_iov_xfer()
>>logic and skip bytes when the ptr is NULL.
>>
>>Signed-off-by: Stefano Garzarella <[email protected]>
>>---
>>
>>I'm not sure if this is the best option, maybe we can add a new
>>function vringh_iov_skip().
>>
>>Suggestions?
>
>
>It might be worth checking whether we can convert vringh_iov to use iov
>iterator then we can use iov_iterator_advance() here.
Makes sense, I'll take a look.
Thanks for the suggestion,
Stefano
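For reference, a dedicated skip operation could look roughly like the userspace sketch below. It is a simplified model: the real vringh_iov_xfer() adjusts iov_base and iov_len in place, while this version only tracks an index and a consumed count. kiov_model_skip() and the _model structs are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct kvec. */
struct kvec_model {
	void *iov_base;
	size_t iov_len;
};

/* Simplified stand-in for struct vringh_kiov. */
struct kiov_model {
	struct kvec_model *iov;
	size_t consumed;   /* bytes already consumed from iov[i] */
	unsigned int i;    /* index of the current element */
	unsigned int used; /* number of valid elements in iov[] */
};

/*
 * Hypothetical helper: advance the iov by up to len bytes without copying.
 * Returns the number of bytes actually skipped.
 */
static size_t kiov_model_skip(struct kiov_model *kiov, size_t len)
{
	size_t skipped = 0;

	while (len && kiov->i < kiov->used) {
		struct kvec_model *v = &kiov->iov[kiov->i];
		size_t avail, part;

		assert(kiov->consumed <= v->iov_len);
		avail = v->iov_len - kiov->consumed;
		part = len < avail ? len : avail;

		kiov->consumed += part;
		skipped += part;
		len -= part;

		/* Move to the next element once this one is exhausted. */
		if (kiov->consumed == v->iov_len) {
			kiov->consumed = 0;
			kiov->i++;
		}
	}

	return skipped;
}
```

Keeping the skip separate from the copy path avoids giving a NULL pointer a special meaning in vringh_iov_xfer().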
On Fri, Nov 13, 2020 at 02:47:01PM +0100, Stefano Garzarella wrote:
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 2754f3069738..fb0411594963 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -22,6 +22,7 @@
> #include <linux/nospec.h>
> #include <linux/vhost.h>
> #include <linux/virtio_net.h>
> +#include <linux/virtio_blk.h>
>
> #include "vhost.h"
>
> @@ -194,6 +195,9 @@ static int vhost_vdpa_config_validate(struct vhost_vdpa *v,
> case VIRTIO_ID_NET:
> size = sizeof(struct virtio_net_config);
> break;
> + case VIRTIO_ID_BLOCK:
> + size = sizeof(struct virtio_blk_config);
> + break;
> }
>
> if (c->len == 0)
Can vdpa_config_ops->get/set_config() handle the size check instead of
hardcoding device-specific knowledge into drivers/vhost/vdpa.c?
On Fri, Nov 13, 2020 at 02:47:04PM +0100, Stefano Garzarella wrote:
> +static void vdpasim_blk_work(struct work_struct *work)
> +{
> + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> + u8 status = VIRTIO_BLK_S_OK;
> + int i;
> +
> + spin_lock(&vdpasim->lock);
> +
> + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> + goto out;
> +
> + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> +
> + if (!vq->ready)
> + continue;
> +
> + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
> + &vq->head, GFP_ATOMIC) > 0) {
> +
> + int write;
> +
> + vq->iov.i = vq->iov.used - 1;
> + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
> + if (write <= 0)
> + break;
We're lucky the guest driver doesn't crash after VIRTIO_BLK_T_GET_ID? :)
On Fri, Nov 13, 2020 at 02:47:06PM +0100, Stefano Garzarella wrote:
> Move device properties used during the entire life cycle in a new
> structure to simplify the copy of these fields during the vdpasim
> initialization.
>
> Signed-off-by: Stefano Garzarella <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 17 ++++++++------
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 33 ++++++++++++++--------------
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 8 +++++--
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 9 +++++---
> 4 files changed, 38 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> index 6a1267c40d5e..76e642042eb0 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> @@ -40,12 +40,17 @@ struct vdpasim_virtqueue {
> irqreturn_t (*cb)(void *data);
> };
>
> +struct vdpasim_device {
> + u64 supported_features;
> + u32 id;
> + int nvqs;
> +};
> +
> struct vdpasim_init_attr {
> - u32 device_id;
> - u64 features;
> + struct vdpasim_device device;
It's unclear to me what the exact purpose of struct vdpasim_device is.
At least the name reminds me of struct device, which this is not.
Should this be called just struct vdpasim_attr or struct
vdpasim_dev_attr? In other words, the attributes that are needed even
after initialization?
On Fri, Nov 13, 2020 at 02:47:10PM +0100, Stefano Garzarella wrote:
> vringh_getdesc_iotlb() manages 2 iovs for writable and readable
> descriptors. This is very useful for the block device, where for
> each request we have both types of descriptor.
>
> Let's split the vdpasim_virtqueue's iov field in riov and wiov
> to use them with vringh_getdesc_iotlb().
Is riov/wiov naming common? VIRTIO uses "in" (device-to-driver) and
"out" (driver-to-device). Using VIRTIO terminology might be clearer.
Stefan
On Fri, Nov 13, 2020 at 02:47:12PM +0100, Stefano Garzarella wrote:
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index 8e41b3ab98d5..68e74383322f 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/module.h>
> +#include <linux/blkdev.h>
> #include <uapi/linux/virtio_blk.h>
>
> #include "vdpa_sim.h"
> @@ -24,10 +25,137 @@
>
> static struct vdpasim *vdpasim_blk_dev;
>
> +static int vdpasim_blk_handle_req(struct vdpasim *vdpasim,
> + struct vdpasim_virtqueue *vq)
This function has a non-standard int return value. Please document it.
> +{
> + size_t wrote = 0, to_read = 0, to_write = 0;
> + struct virtio_blk_outhdr hdr;
> + uint8_t status;
> + uint32_t type;
> + ssize_t bytes;
> + loff_t offset;
> + int i, ret;
> +
> + vringh_kiov_cleanup(&vq->riov);
> + vringh_kiov_cleanup(&vq->wiov);
> +
> + ret = vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
> + &vq->head, GFP_ATOMIC);
> + if (ret != 1)
> + return ret;
> +
> + for (i = 0; i < vq->wiov.used; i++)
> + to_write += vq->wiov.iov[i].iov_len;
> + to_write -= 1; /* last byte is the status */
What if vq->wiov.used == 0?
> +
> + for (i = 0; i < vq->riov.used; i++)
> + to_read += vq->riov.iov[i].iov_len;
> +
> + bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov, &hdr, sizeof(hdr));
> + if (bytes != sizeof(hdr))
> + return 0;
> +
> + to_read -= bytes;
> +
> + type = le32_to_cpu(hdr.type);
> + offset = le64_to_cpu(hdr.sector) << SECTOR_SHIFT;
> + status = VIRTIO_BLK_S_OK;
> +
> + switch (type) {
> + case VIRTIO_BLK_T_IN:
> + if (offset + to_write > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
Integer overflow is not handled.
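One way to make the range check overflow-safe is to compare in sectors first and then subtract instead of add. The userspace sketch below assumes 512-byte sectors; blk_model_range_ok() and SECTOR_SHIFT_MODEL are hypothetical names, and the capacity is passed in rather than taken from VDPASIM_BLK_CAPACITY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT_MODEL 9 /* 512-byte sectors, as SECTOR_SHIFT in the kernel */

/*
 * Hypothetical helper: returns true when [start_sector, start_sector + bytes)
 * lies entirely inside a device of capacity_sectors sectors.
 */
static bool blk_model_range_ok(uint64_t start_sector, uint64_t num_bytes,
			       uint64_t capacity_sectors)
{
	uint64_t capacity_bytes, offset;

	/* Reject capacities whose byte size does not fit in 64 bits. */
	if (capacity_sectors > (UINT64_MAX >> SECTOR_SHIFT_MODEL))
		return false;
	capacity_bytes = capacity_sectors << SECTOR_SHIFT_MODEL;

	/* Compare in sectors so the shift below cannot overflow. */
	if (start_sector > capacity_sectors)
		return false;
	offset = start_sector << SECTOR_SHIFT_MODEL;

	/* offset <= capacity_bytes here, so the subtraction cannot wrap. */
	return num_bytes <= capacity_bytes - offset;
}
```

A sector value taken straight from the guest header, such as UINT64_MAX, is then rejected before any shift or addition is performed.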
On Tue, Nov 17, 2020 at 11:11:21AM +0000, Stefan Hajnoczi wrote:
>On Fri, Nov 13, 2020 at 02:47:04PM +0100, Stefano Garzarella wrote:
>> +static void vdpasim_blk_work(struct work_struct *work)
>> +{
>> + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>> + u8 status = VIRTIO_BLK_S_OK;
>> + int i;
>> +
>> + spin_lock(&vdpasim->lock);
>> +
>> + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
>> + goto out;
>> +
>> + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
>> + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
>> +
>> + if (!vq->ready)
>> + continue;
>> +
>> + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
>> + &vq->head, GFP_ATOMIC) > 0) {
>> +
>> + int write;
>> +
>> + vq->iov.i = vq->iov.used - 1;
>> + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
>> + if (write <= 0)
>> + break;
>
>We're lucky the guest driver doesn't crash after VIRTIO_BLK_T_GET_ID? :)
The crash could happen if the simulator doesn't write the string
terminator, but in virtio_blk.c, serial_show() initializes the buffer
by putting the string terminator in the VIRTIO_BLK_ID_BYTES element:
buf[VIRTIO_BLK_ID_BYTES] = '\0';
err = virtblk_get_id(disk, buf);
This should prevent the issue, right?
However, in the last patch of this series I implemented
VIRTIO_BLK_T_GET_ID support :-)
Thanks,
Stefano
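The effect of that pre-set terminator can be modelled in userspace as follows. This is only a sketch of the idea, not the virtio_blk.c code: the function names are hypothetical, and ID_BYTES_MODEL stands for VIRTIO_BLK_ID_BYTES (20 in the virtio-blk uapi header).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ID_BYTES_MODEL 20 /* stands for VIRTIO_BLK_ID_BYTES */

/* Models a device that fills every ID byte and writes no terminator. */
static void device_model_fill_all(char *buf)
{
	memset(buf, 'x', ID_BYTES_MODEL);
}

/* Models a device that answers with a short, zero-padded ID. */
static void device_model_short_id(char *buf)
{
	const char id[ID_BYTES_MODEL] = "vdpa_blk_sim";

	memcpy(buf, id, ID_BYTES_MODEL);
}

/*
 * Models the driver side: the terminator is stored one past the ID area
 * before the device is asked, so the string stays bounded either way.
 */
static size_t driver_model_serial_len(void (*get_id)(char *buf))
{
	char buf[ID_BYTES_MODEL + 1];

	buf[ID_BYTES_MODEL] = '\0';
	get_id(buf);

	return strlen(buf);
}
```

The device may write at most ID_BYTES_MODEL bytes, so the terminator at index ID_BYTES_MODEL survives in both cases.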
On Tue, Nov 17, 2020 at 11:23:05AM +0000, Stefan Hajnoczi wrote:
>On Fri, Nov 13, 2020 at 02:47:06PM +0100, Stefano Garzarella wrote:
>> Move device properties used during the entire life cycle in a new
>> structure to simplify the copy of these fields during the vdpasim
>> initialization.
>>
>> Signed-off-by: Stefano Garzarella <[email protected]>
>> ---
>> drivers/vdpa/vdpa_sim/vdpa_sim.h | 17 ++++++++------
>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 33 ++++++++++++++--------------
>> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 8 +++++--
>> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 9 +++++---
>> 4 files changed, 38 insertions(+), 29 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>> index 6a1267c40d5e..76e642042eb0 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
>> @@ -40,12 +40,17 @@ struct vdpasim_virtqueue {
>> irqreturn_t (*cb)(void *data);
>> };
>>
>> +struct vdpasim_device {
>> + u64 supported_features;
>> + u32 id;
>> + int nvqs;
>> +};
>> +
>> struct vdpasim_init_attr {
>> - u32 device_id;
>> - u64 features;
>> + struct vdpasim_device device;
>
>It's unclear to me what the exact purpose of struct vdpasim_device is.
>At least the name reminds me of struct device, which this is not.
>
>Should this be called just struct vdpasim_attr or struct
>vdpasim_dev_attr? In other words, the attributes that are needed even
>after initialization?
Yes, they are attributes that are needed even after initialization,
so I think vdpasim_dev_attr would be better.
I'll change it and try to write a better commit message.
Thanks,
Stefano
On Tue, Nov 17, 2020 at 11:36:36AM +0000, Stefan Hajnoczi wrote:
>On Fri, Nov 13, 2020 at 02:47:12PM +0100, Stefano Garzarella wrote:
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>> index 8e41b3ab98d5..68e74383322f 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>> @@ -7,6 +7,7 @@
>> */
>>
>> #include <linux/module.h>
>> +#include <linux/blkdev.h>
>> #include <uapi/linux/virtio_blk.h>
>>
>> #include "vdpa_sim.h"
>> @@ -24,10 +25,137 @@
>>
>> static struct vdpasim *vdpasim_blk_dev;
>>
>> +static int vdpasim_blk_handle_req(struct vdpasim *vdpasim,
>> + struct vdpasim_virtqueue *vq)
>
>This function has a non-standard int return value. Please document it.
Yes, I'll do.
>
>> +{
>> + size_t wrote = 0, to_read = 0, to_write = 0;
>> + struct virtio_blk_outhdr hdr;
>> + uint8_t status;
>> + uint32_t type;
>> + ssize_t bytes;
>> + loff_t offset;
>> + int i, ret;
>> +
>> + vringh_kiov_cleanup(&vq->riov);
>> + vringh_kiov_cleanup(&vq->wiov);
>> +
>> + ret = vringh_getdesc_iotlb(&vq->vring, &vq->riov, &vq->wiov,
>> + &vq->head, GFP_ATOMIC);
>> + if (ret != 1)
>> + return ret;
>> +
>> + for (i = 0; i < vq->wiov.used; i++)
>> + to_write += vq->wiov.iov[i].iov_len;
>> + to_write -= 1; /* last byte is the status */
>
>What if vq->wiov.used == 0?
Right, we should discard the descriptor.
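A sketch of the guard, in userspace C: because to_write is a size_t, subtracting the status byte from an empty writable area wraps around to a huge value, so the empty case has to be rejected first. blk_model_payload_len() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: split the writable bytes of a request into the data
 * payload plus the trailing one-byte status. Fails when there is no room
 * for the status byte, instead of letting the subtraction wrap.
 */
static int blk_model_payload_len(size_t writable_bytes, size_t *payload_len)
{
	assert(payload_len != NULL);

	if (writable_bytes < 1)
		return -EINVAL;

	*payload_len = writable_bytes - 1; /* last byte is the status */
	return 0;
}
```

On error the caller would drop the descriptor without touching the write iov.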
>
>> +
>> + for (i = 0; i < vq->riov.used; i++)
>> + to_read += vq->riov.iov[i].iov_len;
>> +
>> + bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->riov, &hdr, sizeof(hdr));
>> + if (bytes != sizeof(hdr))
>> + return 0;
>> +
>> + to_read -= bytes;
>> +
>> + type = le32_to_cpu(hdr.type);
>> + offset = le64_to_cpu(hdr.sector) << SECTOR_SHIFT;
>> + status = VIRTIO_BLK_S_OK;
>> +
>> + switch (type) {
>> + case VIRTIO_BLK_T_IN:
>> + if (offset + to_write > VDPASIM_BLK_CAPACITY << SECTOR_SHIFT) {
>
>Integer overflow is not handled.
I'll fix.
Thanks,
Stefano
On Tue, Nov 17, 2020 at 10:57:09AM +0000, Stefan Hajnoczi wrote:
>On Fri, Nov 13, 2020 at 02:47:01PM +0100, Stefano Garzarella wrote:
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 2754f3069738..fb0411594963 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -22,6 +22,7 @@
>> #include <linux/nospec.h>
>> #include <linux/vhost.h>
>> #include <linux/virtio_net.h>
>> +#include <linux/virtio_blk.h>
>>
>> #include "vhost.h"
>>
>> @@ -194,6 +195,9 @@ static int vhost_vdpa_config_validate(struct vhost_vdpa *v,
>> case VIRTIO_ID_NET:
>> size = sizeof(struct virtio_net_config);
>> break;
>> + case VIRTIO_ID_BLOCK:
>> + size = sizeof(struct virtio_blk_config);
>> + break;
>> }
>>
>> if (c->len == 0)
>
>Can vdpa_config_ops->get/set_config() handle the size check instead of
>hardcoding device-specific knowledge into drivers/vhost/vdpa.c?
I agree, that would be better. For example, we already check whether the
buffer is large enough in the simulator callbacks; we would only need to
return an error when it is not.
@Jason, do you think it's okay to add a return value to
vdpa_config_ops->get/set_config() to handle the size check?
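Something along these lines, as a userspace sketch only: struct sim_dev and sim_get_config() are made-up names, and the int return value is the hypothetical change being proposed, not the current vdpa_config_ops signature.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device state: an opaque config blob plus its size. */
struct sim_dev {
	const unsigned char *config;
	size_t config_size;
};

/*
 * Copy len bytes of config space starting at offset into buf.
 * Returns 0 on success or -EINVAL when the window falls outside the
 * config space. offset + len is never computed, so it cannot overflow.
 */
static int sim_get_config(const struct sim_dev *dev, size_t offset,
			  void *buf, size_t len)
{
	if (offset > dev->config_size || len > dev->config_size - offset)
		return -EINVAL;

	memcpy(buf, dev->config + offset, len);
	return 0;
}
```

The caller in drivers/vhost/vdpa.c could then just propagate the error, without knowing which config structure the device uses.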
Thanks,
Stefano
On Tue, Nov 17, 2020 at 03:16:20PM +0100, Stefano Garzarella wrote:
> On Tue, Nov 17, 2020 at 11:11:21AM +0000, Stefan Hajnoczi wrote:
> > On Fri, Nov 13, 2020 at 02:47:04PM +0100, Stefano Garzarella wrote:
> > > +static void vdpasim_blk_work(struct work_struct *work)
> > > +{
> > > + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> > > + u8 status = VIRTIO_BLK_S_OK;
> > > + int i;
> > > +
> > > + spin_lock(&vdpasim->lock);
> > > +
> > > + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> > > + goto out;
> > > +
> > > + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> > > + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> > > +
> > > + if (!vq->ready)
> > > + continue;
> > > +
> > > + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
> > > + &vq->head, GFP_ATOMIC) > 0) {
> > > +
> > > + int write;
> > > +
> > > + vq->iov.i = vq->iov.used - 1;
> > > + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
> > > + if (write <= 0)
> > > + break;
> >
> > We're lucky the guest driver doesn't crash after VIRTIO_BLK_T_GET_ID? :)
>
> The crash could happen if the simulator doesn't put the string terminator,
> but in virtio_blk.c, the serial_show() initialize the buffer putting the
> string terminator in the VIRTIO_BLK_ID_BYTES element:
>
> buf[VIRTIO_BLK_ID_BYTES] = '\0';
> err = virtblk_get_id(disk, buf);
>
> This should prevent the issue, right?
>
> However in the last patch of this series I implemented VIRTIO_BLK_T_GET_ID
> support :-)
Windows, BSD, macOS, etc guest drivers aren't necessarily going to
terminate or initialize the serial string buffer.
Anyway, the later patch that implements VIRTIO_BLK_T_GET_ID solves this
issue! Thanks.
Stefan
On Tue, Nov 17, 2020 at 04:43:42PM +0000, Stefan Hajnoczi wrote:
>On Tue, Nov 17, 2020 at 03:16:20PM +0100, Stefano Garzarella wrote:
>> On Tue, Nov 17, 2020 at 11:11:21AM +0000, Stefan Hajnoczi wrote:
>> > On Fri, Nov 13, 2020 at 02:47:04PM +0100, Stefano Garzarella wrote:
>> > > +static void vdpasim_blk_work(struct work_struct *work)
>> > > +{
>> > > + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>> > > + u8 status = VIRTIO_BLK_S_OK;
>> > > + int i;
>> > > +
>> > > + spin_lock(&vdpasim->lock);
>> > > +
>> > > + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
>> > > + goto out;
>> > > +
>> > > + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
>> > > + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
>> > > +
>> > > + if (!vq->ready)
>> > > + continue;
>> > > +
>> > > + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
>> > > + &vq->head, GFP_ATOMIC) > 0) {
>> > > +
>> > > + int write;
>> > > +
>> > > + vq->iov.i = vq->iov.used - 1;
>> > > + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
>> > > + if (write <= 0)
>> > > + break;
>> >
>> > We're lucky the guest driver doesn't crash after VIRTIO_BLK_T_GET_ID? :)
>>
>> The crash could happen if the simulator doesn't put the string terminator,
>> but in virtio_blk.c, the serial_show() initialize the buffer putting the
>> string terminator in the VIRTIO_BLK_ID_BYTES element:
>>
>> buf[VIRTIO_BLK_ID_BYTES] = '\0';
>> err = virtblk_get_id(disk, buf);
>>
>> This should prevent the issue, right?
>>
>> However in the last patch of this series I implemented VIRTIO_BLK_T_GET_ID
>> support :-)
>
>Windows, BSD, macOS, etc guest drivers aren't necessarily going to
>terminate or initialize the serial string buffer.
Unfortunately I discovered that VIRTIO_BLK_T_GET_ID is not in the VIRTIO
spec, so, out of curiosity, I checked the QEMU code and found this:
    case VIRTIO_BLK_T_GET_ID:
    {
        /*
         * NB: per existing s/n string convention the string is
         * terminated by '\0' only when shorter than buffer.
         */
        const char *serial = s->conf.serial ? s->conf.serial : "";
        size_t size = MIN(strlen(serial) + 1,
                          MIN(iov_size(in_iov, in_num),
                              VIRTIO_BLK_ID_BYTES));
        iov_from_buf(in_iov, in_num, 0, serial, size);
        virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
        virtio_blk_free_request(req);
        break;
    }
It seems that the device emulation in QEMU expects the driver to
terminate the serial string buffer.
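To make the convention concrete, here is a small userspace model of it. This is my own sketch, not QEMU or kernel code: sim_get_id() mimics the size computation in the QEMU snippet, and sim_read_serial() mimics a driver that reserves one extra byte and pre-terminates it, like serial_show() does.

```c
#include <stddef.h>
#include <string.h>

#define SIM_BLK_ID_BYTES 20 /* same value as VIRTIO_BLK_ID_BYTES */

static size_t sim_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Device side: copy the serial into the driver's buffer. The '\0' is
 * copied only when the serial is shorter than the buffer. Returns the
 * number of bytes written.
 */
static size_t sim_get_id(const char *serial, char *buf, size_t buf_len)
{
	size_t size = sim_min(strlen(serial) + 1,
			      sim_min(buf_len, SIM_BLK_ID_BYTES));

	memcpy(buf, serial, size);
	return size;
}

/*
 * Driver side: reserve one extra byte and pre-terminate it, so the result
 * is a valid C string even for a full 20-byte serial.
 */
static void sim_read_serial(const char *serial,
			    char out[SIM_BLK_ID_BYTES + 1])
{
	out[SIM_BLK_ID_BYTES] = '\0';
	sim_get_id(serial, out, SIM_BLK_ID_BYTES);
}
```

A driver that passes exactly 20 bytes and treats the result as a string without adding its own terminator would read past the buffer when the serial is 20 characters long.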
Do you know why VIRTIO_BLK_T_GET_ID is not in the specs?
Should we add it?
Thanks,
Stefano
>
>Anyway, the later patch that implements VIRTIO_BLK_T_GET_ID solves this
>issue! Thanks.
>
>Stefan
On Tue, Nov 17, 2020 at 06:38:11PM +0100, Stefano Garzarella wrote:
> On Tue, Nov 17, 2020 at 04:43:42PM +0000, Stefan Hajnoczi wrote:
> > On Tue, Nov 17, 2020 at 03:16:20PM +0100, Stefano Garzarella wrote:
> > > On Tue, Nov 17, 2020 at 11:11:21AM +0000, Stefan Hajnoczi wrote:
> > > > On Fri, Nov 13, 2020 at 02:47:04PM +0100, Stefano Garzarella wrote:
> > > > > +static void vdpasim_blk_work(struct work_struct *work)
> > > > > +{
> > > > > + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> > > > > + u8 status = VIRTIO_BLK_S_OK;
> > > > > + int i;
> > > > > +
> > > > > + spin_lock(&vdpasim->lock);
> > > > > +
> > > > > + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> > > > > + goto out;
> > > > > +
> > > > > + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> > > > > + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> > > > > +
> > > > > + if (!vq->ready)
> > > > > + continue;
> > > > > +
> > > > > + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
> > > > > + &vq->head, GFP_ATOMIC) > 0) {
> > > > > +
> > > > > + int write;
> > > > > +
> > > > > + vq->iov.i = vq->iov.used - 1;
> > > > > + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
> > > > > + if (write <= 0)
> > > > > + break;
> > > >
> > > > We're lucky the guest driver doesn't crash after VIRTIO_BLK_T_GET_ID? :)
> > >
> > > The crash could happen if the simulator doesn't put the string terminator,
> > > but in virtio_blk.c, the serial_show() initialize the buffer putting the
> > > string terminator in the VIRTIO_BLK_ID_BYTES element:
> > >
> > > buf[VIRTIO_BLK_ID_BYTES] = '\0';
> > > err = virtblk_get_id(disk, buf);
> > >
> > > This should prevent the issue, right?
> > >
> > > However in the last patch of this series I implemented VIRTIO_BLK_T_GET_ID
> > > support :-)
> >
> > Windows, BSD, macOS, etc guest drivers aren't necessarily going to
> > terminate or initialize the serial string buffer.
>
> Unfortunately I discovered that VIRTIO_BLK_T_GET_ID is not in the VIRTIO
> specs, so, just for curiosity, I checked the QEMU code and I found this:
>
> case VIRTIO_BLK_T_GET_ID:
> {
> /*
> * NB: per existing s/n string convention the string is
> * terminated by '\0' only when shorter than buffer.
> */
> const char *serial = s->conf.serial ? s->conf.serial : "";
> size_t size = MIN(strlen(serial) + 1,
> MIN(iov_size(in_iov, in_num),
> VIRTIO_BLK_ID_BYTES));
> iov_from_buf(in_iov, in_num, 0, serial, size);
> virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
> virtio_blk_free_request(req);
> break;
> }
>
> It seems that the device emulation in QEMU expects that the driver
> terminates the serial string buffer.
>
> Do you know why VIRTIO_BLK_T_GET_ID is not in the specs?
> Should we add it?
It's about to be merged into the VIRTIO spec:
https://github.com/oasis-tcs/virtio-spec/issues/63
Stefan
On Wed, Nov 18, 2020 at 11:23:55AM +0000, Stefan Hajnoczi wrote:
>On Tue, Nov 17, 2020 at 06:38:11PM +0100, Stefano Garzarella wrote:
>> On Tue, Nov 17, 2020 at 04:43:42PM +0000, Stefan Hajnoczi wrote:
>> > On Tue, Nov 17, 2020 at 03:16:20PM +0100, Stefano Garzarella wrote:
>> > > On Tue, Nov 17, 2020 at 11:11:21AM +0000, Stefan Hajnoczi wrote:
>> > > > On Fri, Nov 13, 2020 at 02:47:04PM +0100, Stefano Garzarella wrote:
>> > > > > +static void vdpasim_blk_work(struct work_struct *work)
>> > > > > +{
>> > > > > + struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>> > > > > + u8 status = VIRTIO_BLK_S_OK;
>> > > > > + int i;
>> > > > > +
>> > > > > + spin_lock(&vdpasim->lock);
>> > > > > +
>> > > > > + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
>> > > > > + goto out;
>> > > > > +
>> > > > > + for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
>> > > > > + struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
>> > > > > +
>> > > > > + if (!vq->ready)
>> > > > > + continue;
>> > > > > +
>> > > > > + while (vringh_getdesc_iotlb(&vq->vring, &vq->iov, &vq->iov,
>> > > > > + &vq->head, GFP_ATOMIC) > 0) {
>> > > > > +
>> > > > > + int write;
>> > > > > +
>> > > > > + vq->iov.i = vq->iov.used - 1;
>> > > > > + write = vringh_iov_push_iotlb(&vq->vring, &vq->iov, &status, 1);
>> > > > > + if (write <= 0)
>> > > > > + break;
>> > > >
>> > > > We're lucky the guest driver doesn't crash after VIRTIO_BLK_T_GET_ID? :)
>> > >
>> > > The crash could happen if the simulator doesn't put the string terminator,
>> > > but in virtio_blk.c, the serial_show() initialize the buffer putting the
>> > > string terminator in the VIRTIO_BLK_ID_BYTES element:
>> > >
>> > > buf[VIRTIO_BLK_ID_BYTES] = '\0';
>> > > err = virtblk_get_id(disk, buf);
>> > >
>> > > This should prevent the issue, right?
>> > >
>> > > However in the last patch of this series I implemented VIRTIO_BLK_T_GET_ID
>> > > support :-)
>> >
>> > Windows, BSD, macOS, etc guest drivers aren't necessarily going to
>> > terminate or initialize the serial string buffer.
>>
>> Unfortunately I discovered that VIRTIO_BLK_T_GET_ID is not in the VIRTIO
>> specs, so, just for curiosity, I checked the QEMU code and I found this:
>>
>> case VIRTIO_BLK_T_GET_ID:
>> {
>> /*
>> * NB: per existing s/n string convention the string is
>> * terminated by '\0' only when shorter than buffer.
>> */
>> const char *serial = s->conf.serial ? s->conf.serial : "";
>> size_t size = MIN(strlen(serial) + 1,
>> MIN(iov_size(in_iov, in_num),
>> VIRTIO_BLK_ID_BYTES));
>> iov_from_buf(in_iov, in_num, 0, serial, size);
>> virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
>> virtio_blk_free_request(req);
>> break;
>> }
>>
>> It seems that the device emulation in QEMU expects that the driver
>> terminates the serial string buffer.
>>
>> Do you know why VIRTIO_BLK_T_GET_ID is not in the specs?
>> Should we add it?
>
>It's about to be merged into the VIRTIO spec:
>https://github.com/oasis-tcs/virtio-spec/issues/63
>
Great! Thanks for the link!
Stefano
Hi Jason,
I just discovered that I missed the other questions in this email,
sorry for that!
On Mon, Nov 16, 2020 at 12:00:11PM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>From: Max Gurtovoy <[email protected]>
>>
>>Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a
>>preparation for adding a vdpa simulator module for block devices.
>>
>>Signed-off-by: Max Gurtovoy <[email protected]>
>>[sgarzare: various cleanups/fixes]
>>Signed-off-by: Stefano Garzarella <[email protected]>
>>---
>>v1:
>>- Removed unused headers
>>- Removed empty module_init() module_exit()
>>- Moved vdpasim_is_little_endian() in vdpa_sim.h
>>- Moved vdpasim16_to_cpu/cpu_to_vdpasim16() in vdpa_sim.h
>>- Added vdpasim*_to_cpu/cpu_to_vdpasim*() also for 32 and 64
>>- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>> option can not depend on other [Jason]
>
>
>If possible, I would suggest to split this patch further:
>
>1) convert to use void *config, and an attribute for setting config
>size during allocation
>2) introduce supported_features
>3) other attributes (#vqs)
>4) rename config ops (more generic one)
>5) introduce ops for set|get_config, set_get_features
>6) real split
>
>
[...]
>>-static const struct vdpa_config_ops vdpasim_net_config_ops;
>>-static const struct vdpa_config_ops vdpasim_net_batch_config_ops;
>>+static const struct vdpa_config_ops vdpasim_config_ops;
>>+static const struct vdpa_config_ops vdpasim_batch_config_ops;
>>-static struct vdpasim *vdpasim_create(void)
>>+struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
>> {
>> const struct vdpa_config_ops *ops;
>> struct vdpasim *vdpasim;
>>+ u32 device_id;
>> struct device *dev;
>>- int ret = -ENOMEM;
>>+ int i, size, ret = -ENOMEM;
>>- if (batch_mapping)
>>- ops = &vdpasim_net_batch_config_ops;
>>+ device_id = attr->device_id;
>>+ /* Currently, we only accept the network and block devices. */
>>+ if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
>>+ return ERR_PTR(-EOPNOTSUPP);
>>+
>>+ if (attr->batch_mapping)
>>+ ops = &vdpasim_batch_config_ops;
>> else
>>- ops = &vdpasim_net_config_ops;
>>+ ops = &vdpasim_config_ops;
>> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, VDPASIM_VQ_NUM);
>> if (!vdpasim)
>> goto err_alloc;
>>- INIT_WORK(&vdpasim->work, vdpasim_work);
>>+ if (device_id == VIRTIO_ID_NET)
>>+ size = sizeof(struct virtio_net_config);
>>+ else
>>+ size = sizeof(struct virtio_blk_config);
>
>
>It's better to avoid such if/else consider we may introduce more type
>of devices.
>
>Can we have an attribute of config size instead?
Yes, I'll move patch 7 before this one.
About config size and set/get_config ops, I'm not sure whether it is
better to hide everything under the new set/get_config ops, allocating
the config structure in each device, or to leave the allocation in the
core and update it as we do now.
>
>
>>+ vdpasim->config = kzalloc(size, GFP_KERNEL);
>>+ if (!vdpasim->config)
>>+ goto err_iommu;
>>+
>>+ vdpasim->device_id = device_id;
>>+ vdpasim->supported_features = attr->features;
>>+ INIT_WORK(&vdpasim->work, attr->work_fn);
>> spin_lock_init(&vdpasim->lock);
>> spin_lock_init(&vdpasim->iommu_lock);
>>@@ -379,23 +231,10 @@ static struct vdpasim *vdpasim_create(void)
>> if (!vdpasim->buffer)
>> goto err_iommu;
>>- if (macaddr) {
>>- mac_pton(macaddr, vdpasim->config.mac);
>>- if (!is_valid_ether_addr(vdpasim->config.mac)) {
>>- ret = -EADDRNOTAVAIL;
>>- goto err_iommu;
>>- }
>>- } else {
>>- eth_random_addr(vdpasim->config.mac);
>>- }
>>-
>>- vringh_set_iotlb(&vdpasim->vqs[0].vring, vdpasim->iommu);
>>- vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu);
>>+ for (i = 0; i < VDPASIM_VQ_NUM; i++)
>>+ vringh_set_iotlb(&vdpasim->vqs[i].vring,
>>vdpasim->iommu);
>
>
>And an attribute of #vqs here.
Yes.
>
>
>> vdpasim->vdpa.dma_dev = dev;
>>- ret = vdpa_register_device(&vdpasim->vdpa);
>>- if (ret)
>>- goto err_iommu;
>> return vdpasim;
>>@@ -404,6 +243,7 @@ static struct vdpasim *vdpasim_create(void)
>> err_alloc:
>> return ERR_PTR(ret);
>> }
>>+EXPORT_SYMBOL_GPL(vdpasim_create);
>> static int vdpasim_set_vq_address(struct vdpa_device *vdpa, u16 idx,
>> u64 desc_area, u64 driver_area,
>>@@ -498,28 +338,34 @@ static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
>> static u64 vdpasim_get_features(struct vdpa_device *vdpa)
>> {
>>- return vdpasim_features;
>>+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>+
>>+ return vdpasim->supported_features;
>> }
>> static int vdpasim_set_features(struct vdpa_device *vdpa, u64
>> features)
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>- struct virtio_net_config *config = &vdpasim->config;
>> /* DMA mapping must be done by driver */
>> if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
>> return -EINVAL;
>>- vdpasim->features = features & vdpasim_features;
>>+ vdpasim->features = features & vdpasim->supported_features;
>> /* We generally only know whether guest is using the legacy interface
>> * here, so generally that's the earliest we can set config fields.
>> * Note: We actually require VIRTIO_F_ACCESS_PLATFORM above which
>> * implies VIRTIO_F_VERSION_1, but let's not try to be clever here.
>> */
>>+ if (vdpasim->device_id == VIRTIO_ID_NET) {
>>+ struct virtio_net_config *config =
>>+ (struct virtio_net_config *)vdpasim->config;
>>+
>>+ config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
>>+ config->status = cpu_to_vdpasim16(vdpasim, VIRTIO_NET_S_LINK_UP);
>>+ }
>
>
>Can we introduce callbacks of set_features/get_features here to avoid
>dealing of device type specific codes in generic simulator code?
Yes, I'll do that.
>
>
>>- config->mtu = cpu_to_vdpasim16(vdpasim, 1500);
>>- config->status = cpu_to_vdpasim16(vdpasim,
>>VIRTIO_NET_S_LINK_UP);
>> return 0;
>> }
>>@@ -536,7 +382,9 @@ static u16 vdpasim_get_vq_num_max(struct
>>vdpa_device *vdpa)
>> static u32 vdpasim_get_device_id(struct vdpa_device *vdpa)
>> {
>>- return VDPASIM_DEVICE_ID;
>>+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>+
>>+ return vdpasim->device_id;
>> }
>> static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa)
>>@@ -572,8 +420,12 @@ static void vdpasim_get_config(struct vdpa_device
>>*vdpa, unsigned int offset,
>> {
>> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>- if (offset + len < sizeof(struct virtio_net_config))
>>- memcpy(buf, (u8 *)&vdpasim->config + offset, len);
>>+ if (vdpasim->device_id == VIRTIO_ID_BLOCK &&
>>+ (offset + len < sizeof(struct virtio_blk_config)))
>>+ memcpy(buf, vdpasim->config + offset, len);
>>+ else if (vdpasim->device_id == VIRTIO_ID_NET &&
>>+ (offset + len < sizeof(struct virtio_net_config)))
>>+ memcpy(buf, vdpasim->config + offset, len);
>
>
>Similarly, can we introduce set/get_config ops?
Ditto.
>
>
[...]
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>>b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>>new file mode 100644
>>index 000000000000..c68d5488ab54
>>--- /dev/null
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_net.c
>>@@ -0,0 +1,153 @@
>>+// SPDX-License-Identifier: GPL-2.0-only
>>+/*
>>+ * VDPA simulator for networking device.
>>+ *
>>+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
>>+ * Author: Jason Wang <[email protected]>
>>+ *
>>+ */
>>+
>>+#include <linux/module.h>
>>+#include <linux/etherdevice.h>
>>+
>>+#include "vdpa_sim.h"
>>+
>>+#define VDPASIM_NET_FEATURES (1ULL << VIRTIO_NET_F_MAC)
>>+
>>+static int batch_mapping = 1;
>>+module_param(batch_mapping, int, 0444);
>>+MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 - Enable; 0 - Disable");
>
>I think batch_mapping should belong to vpda_sim core module.
Yes, I agree, I'll leave it in the core.
>
>
[...]
>>diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
>>index d7d32b656102..fdb1a9267347 100644
>>--- a/drivers/vdpa/Kconfig
>>+++ b/drivers/vdpa/Kconfig
>>@@ -9,11 +9,16 @@ menuconfig VDPA
>> if VDPA
>> config VDPA_SIM
>>- tristate "vDPA device simulator"
>>+ tristate "vDPA simulator core"
>> depends on RUNTIME_TESTING_MENU && HAS_DMA
>> select DMA_OPS
>> select VHOST_RING
>> default n
>>+
>>+config VDPA_SIM_NET
>>+ tristate "vDPA simulator for networking device"
>>+ depends on VDPA_SIM
>>+ default n
>
>
>I remember somebody told me that if we don't enable a module it was
>disabled by default.
So, should I remove "default n" from vdpa_sim* entries?
Thanks,
Stefano
On 2020/11/18 9:14 PM, Stefano Garzarella wrote:
> Hi Jason,
> I just discovered that I missed the other questions in this email,
> sorry for that!
No problem :)
>
> On Mon, Nov 16, 2020 at 12:00:11PM +0800, Jason Wang wrote:
>>
>> On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>> From: Max Gurtovoy <[email protected]>
>>>
>>> Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a
>>> preparation for adding a vdpa simulator module for block devices.
>>>
>>> Signed-off-by: Max Gurtovoy <[email protected]>
>>> [sgarzare: various cleanups/fixes]
>>> Signed-off-by: Stefano Garzarella <[email protected]>
>>> ---
>>> v1:
>>> - Removed unused headers
>>> - Removed empty module_init() module_exit()
>>> - Moved vdpasim_is_little_endian() in vdpa_sim.h
>>> - Moved vdpasim16_to_cpu/cpu_to_vdpasim16() in vdpa_sim.h
>>> - Added vdpasim*_to_cpu/cpu_to_vdpasim*() also for 32 and 64
>>> - Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>>> option can not depend on other [Jason]
>>
>>
>> If possible, I would suggest to split this patch further:
>>
>> 1) convert to use void *config, and an attribute for setting config
>> size during allocation
>> 2) introduce supported_features
>> 3) other attributes (#vqs)
>> 4) rename config ops (more generic one)
>> 5) introduce ops for set|get_config, set_get_features
>> 6) real split
>>
>>
>
> [...]
>
>>> -static const struct vdpa_config_ops vdpasim_net_config_ops;
>>> -static const struct vdpa_config_ops vdpasim_net_batch_config_ops;
>>> +static const struct vdpa_config_ops vdpasim_config_ops;
>>> +static const struct vdpa_config_ops vdpasim_batch_config_ops;
>>> -static struct vdpasim *vdpasim_create(void)
>>> +struct vdpasim *vdpasim_create(struct vdpasim_init_attr *attr)
>>> {
>>> const struct vdpa_config_ops *ops;
>>> struct vdpasim *vdpasim;
>>> + u32 device_id;
>>> struct device *dev;
>>> - int ret = -ENOMEM;
>>> + int i, size, ret = -ENOMEM;
>>> - if (batch_mapping)
>>> - ops = &vdpasim_net_batch_config_ops;
>>> + device_id = attr->device_id;
>>> + /* Currently, we only accept the network and block devices. */
>>> + if (device_id != VIRTIO_ID_NET && device_id != VIRTIO_ID_BLOCK)
>>> + return ERR_PTR(-EOPNOTSUPP);
>>> +
>>> + if (attr->batch_mapping)
>>> + ops = &vdpasim_batch_config_ops;
>>> else
>>> - ops = &vdpasim_net_config_ops;
>>> + ops = &vdpasim_config_ops;
>>> vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
>>> VDPASIM_VQ_NUM);
>>> if (!vdpasim)
>>> goto err_alloc;
>>> - INIT_WORK(&vdpasim->work, vdpasim_work);
>>> + if (device_id == VIRTIO_ID_NET)
>>> + size = sizeof(struct virtio_net_config);
>>> + else
>>> + size = sizeof(struct virtio_blk_config);
>>
>>
>> It's better to avoid such if/else consider we may introduce more type
>> of devices.
>>
>> Can we have an attribute of config size instead?
>
> Yes, I'll move the patch 7 before this.
>
> About config size and set/get_config ops, I'm not sure if it is better
> to hidden everything under the new set/get_config ops, allocating the
> config structure in each device, or leave the allocation in the core
> and update it like now.
I think we'd better avoid having any type-specific code in the generic
sim code.
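As a rough illustration of that direction (a userspace sketch with hypothetical names; struct sim_dev_attr, sim_create() and the callback signature are assumptions for discussion, not the existing vdpa_sim API), the core could take the config size and a device callback as attributes and never look at the device type:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sim;

/* Hypothetical per-device attributes supplied by each simulator module. */
struct sim_dev_attr {
	uint32_t id;
	uint64_t supported_features;
	size_t config_size;
	int nvqs;
	/* Optional hook: refresh device-specific config fields. */
	void (*get_config)(struct sim *sim, void *config);
};

struct sim {
	struct sim_dev_attr attr;
	void *config;
};

/* Generic core: allocates by size only, never checks the device type. */
static struct sim *sim_create(const struct sim_dev_attr *attr)
{
	struct sim *sim = calloc(1, sizeof(*sim));

	if (!sim)
		return NULL;

	sim->attr = *attr;
	sim->config = calloc(1, attr->config_size);
	if (!sim->config) {
		free(sim);
		return NULL;
	}
	return sim;
}

static void sim_destroy(struct sim *sim)
{
	free(sim->config);
	free(sim);
}

/* Generic core: bounds check by size, then let the device fill the blob. */
static int sim_get_config(struct sim *sim, size_t offset, void *buf,
			  size_t len)
{
	if (offset > sim->attr.config_size ||
	    len > sim->attr.config_size - offset)
		return -EINVAL;

	if (sim->attr.get_config)
		sim->attr.get_config(sim, sim->config);

	memcpy(buf, (unsigned char *)sim->config + offset, len);
	return 0;
}

/* Device module: a block-like device exposing only a capacity field. */
struct sim_blk_config {
	uint64_t capacity;
};

static void sim_blk_get_config(struct sim *sim, void *config)
{
	struct sim_blk_config *blk = config;

	(void)sim;
	blk->capacity = 0x40000;
}
```

Adding a new device type then means filling in another sim_dev_attr, with no if/else on the device id in the core.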
[...]
>>> +config VDPA_SIM_NET
>>> + tristate "vDPA simulator for networking device"
>>> + depends on VDPA_SIM
>>> + default n
>>
>>
>> I remember somebody told me that if we don't enable a module it was
>> disabled by default.
>
> So, should I remove "default n" from vdpa_sim* entries?
Yes, but please do that in another patch.
Thanks
>
> Thanks,
> Stefano
>
On Tue, Nov 17, 2020 at 11:27:03AM +0000, Stefan Hajnoczi wrote:
>On Fri, Nov 13, 2020 at 02:47:10PM +0100, Stefano Garzarella wrote:
>> vringh_getdesc_iotlb() manages 2 iovs for writable and readable
>> descriptors. This is very useful for the block device, where for
>> each request we have both types of descriptor.
>>
>> Let's split the vdpasim_virtqueue's iov field in riov and wiov
>> to use them with vringh_getdesc_iotlb().
>
>Is riov/wiov naming common? VIRTIO uses "in" (device-to-driver) and
>"out" (driver-to-device). Using VIRTIO terminology might be clearer.
I followed the vringh_getdesc_iotlb() parameter names, but I agree that
"in" and "out" would be better. I got lost multiple times with
read/write...
I'll fix it!
Thanks,
Stefano
On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>
>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>Thanks to Max that started this work!
>>I took his patches, and extended the block simulator a bit.
>>
>>This series moves the network device simulator in a new module
>>(vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
>>module, allowing the possibility to add new vDPA device simulators.
>>Then we added a new vdpa_sim_blk module to simulate a block device.
>>
>>I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
>>bytes when ptr is NULL"), maybe we can add a new functions instead of
>>modify vringh_iov_xfer().
>>
>>As Max reported, I'm also seeing errors with vdpa_sim_blk related to
>>iotlb and vringh when there is high load, these are some of the error
>>messages I can see randomly:
>>
>> vringh: Failed to access avail idx at 00000000e8deb2cc
>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>> vringh: Failed to get flags at 000000006635d7a3
>>
>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset:
>> 0x2840000 len: 0x20000
>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset:
>> 0x58ee000 len: 0x3000
>>
>>These errors should all be related to the fact that iotlb_translate()
>>fails with -EINVAL, so it seems that we miss some mapping.
>
>
>Is this only reproducible when there's multiple co-current accessing
>of IOTLB? If yes, it's probably a hint that some kind of
>synchronization is still missed somewhere.
>
>It might be useful to log the dma_map/unmp in both virtio_ring and
>vringh to see who is missing the map.
>
Just an update about these issues with vdpa-sim-blk.
I've been focusing on these failures over the last few days and have
found two issues related to the IOTLB/IOMMU:
1. Some requests coming from the block layer fill the SG list with
multiple buffers that have the same physical address. This happens, for
example, while running 'mkfs': at some point multiple sectors are zeroed,
so multiple SG elements point to the same zeroed physical page.
Since we are using vhost_iotlb_del_range() in vdpasim_unmap_page(),
unmapping one of them removes all the overlapping ranges. I fixed it by
removing a single map in vdpasim_unmap_page(), but as an alternative we
could implement some kind of reference counting.
2. There was a race between dma_map/unmap and the worker thread, since
both access the IOMMU. Taking the iommu_lock while using the
vhost_iotlb_* API in the worker thread fixes the "vringh: Failed to *"
issues.
With these issues fixed, the vdpa-blk simulator seems to work well.
I'll send the patches next week or after the break.
Thanks,
Stefano
On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
> On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>
>> On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>> Thanks to Max that started this work!
>>> I took his patches, and extended the block simulator a bit.
>>>
>>> This series moves the network device simulator in a new module
>>> (vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
>>> module, allowing the possibility to add new vDPA device simulators.
>>> Then we added a new vdpa_sim_blk module to simulate a block device.
>>>
>>> I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
>>> bytes when ptr is NULL"), maybe we can add a new functions instead of
>>> modify vringh_iov_xfer().
>>>
>>> As Max reported, I'm also seeing errors with vdpa_sim_blk related to
>>> iotlb and vringh when there is high load, these are some of the error
>>> messages I can see randomly:
>>>
>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>> vringh: Failed to get flags at 000000006635d7a3
>>>
>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset:
>>> 0x2840000 len: 0x20000
>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset:
>>> 0x58ee000 len: 0x3000
>>>
>>> These errors should all be related to the fact that iotlb_translate()
>>> fails with -EINVAL, so it seems that we miss some mapping.
>>
>>
>> Is this only reproducible when there's multiple co-current accessing
>> of IOTLB? If yes, it's probably a hint that some kind of
>> synchronization is still missed somewhere.
>>
>> It might be useful to log the dma_map/unmp in both virtio_ring and
>> vringh to see who is missing the map.
>>
>
> Just an update about these issues with vdpa-sim-blk.
> I've been focusing a little bit on these failures over the last few
> days and have found two issues related to the IOTLB/IOMMU:
>
> 1. Some requests coming from the block layer fills the SG list with
> multiple buffers that had the same physical address. This happens for
> example while using 'mkfs', at some points multiple sectors are zeroed
> so multiple SG elements point to the same physical page that is zeroed.
> Since we are using vhost_iotlb_del_range() in the
> vdpasim_unmap_page(), this removes all the overlapped ranges. I fixed
> removing a single map in vdpasim_unmap_page(), but has an alternative
> we can implement some kind of reference counts.
I think we need to do what hardware does. So using a refcount is probably
not a good idea.
>
> 2. There was a race between dma_map/unmap and the worker thread, since
> both are accessing the IOMMU. Taking the iommu_lock while using
> vhost_iotlb_* API in the worker thread fixes the "vringh: Failed to *"
> issues.
>
> Whit these issues fixed the vdpa-blk simulator seems to work well.
> I'll send the patches next week or after the break.
Good to know this.
Thanks
>
> Thanks,
> Stefano
>
On Mon, Dec 21, 2020 at 11:16:54AM +0800, Jason Wang wrote:
>
>On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
>>On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>>
>>>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>>>Thanks to Max that started this work!
>>>>I took his patches, and extended the block simulator a bit.
>>>>
>>>>This series moves the network device simulator in a new module
>>>>(vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
>>>>module, allowing the possibility to add new vDPA device simulators.
>>>>Then we added a new vdpa_sim_blk module to simulate a block device.
>>>>
>>>>I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
>>>>bytes when ptr is NULL"), maybe we can add a new functions instead of
>>>>modify vringh_iov_xfer().
>>>>
>>>>As Max reported, I'm also seeing errors with vdpa_sim_blk related to
>>>>iotlb and vringh when there is high load, these are some of the error
>>>>messages I can see randomly:
>>>>
>>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>>> vringh: Failed to get flags at 000000006635d7a3
>>>>
>>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset:
>>>> 0x2840000 len: 0x20000
>>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset:
>>>> 0x58ee000 len: 0x3000
>>>>
>>>>These errors should all be related to the fact that iotlb_translate()
>>>>fails with -EINVAL, so it seems that we miss some mapping.
>>>
>>>
>>>Is this only reproducible when there's multiple co-current
>>>accessing of IOTLB? If yes, it's probably a hint that some kind of
>>>synchronization is still missed somewhere.
>>>
>>>It might be useful to log the dma_map/unmp in both virtio_ring and
>>>vringh to see who is missing the map.
>>>
>>
>>Just an update about these issues with vdpa-sim-blk.
>>I've been focusing a little bit on these failures over the last few
>>days and have found two issues related to the IOTLB/IOMMU:
>>
>>1. Some requests coming from the block layer fills the SG list with
>>multiple buffers that had the same physical address. This happens
>>for example while using 'mkfs', at some points multiple sectors are
>>zeroed so multiple SG elements point to the same physical page that
>>is zeroed.
>>Since we are using vhost_iotlb_del_range() in the
>>vdpasim_unmap_page(), this removes all the overlapped ranges. I
>>fixed removing a single map in vdpasim_unmap_page(), but has an
>>alternative we can implement some kind of reference counts.
>
>
>I think we need to do what hardware do. So using refcount is probably
>not a good ida.
Okay. Since we are using an identity mapping for simplicity, we end up
assigning the same dma_addr to multiple pages.
So it should be okay to remove a single mapping, also checking the other
parameters (i.e. dir, size).
I'll send a patch; it should be easier to discuss with the code :-)
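In the meantime, a rough userspace model of the idea (not the actual patch; struct sim_iotlb and the sim_map_*() helpers are invented for illustration and do not mirror the vhost_iotlb API): a range-style delete drops every overlapping entry, while an exact-match delete drops only one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_MAPS 16

/* Hypothetical mapping entry; addresses are small, so start + size fits. */
struct sim_map {
	uint64_t start;
	uint64_t size;
	int dir;
	bool used;
};

struct sim_iotlb {
	struct sim_map maps[SIM_MAX_MAPS];
};

/* Record one mapping; duplicates of the same range are allowed. */
static bool sim_map_add(struct sim_iotlb *t, uint64_t start, uint64_t size,
			int dir)
{
	size_t i;

	for (i = 0; i < SIM_MAX_MAPS; i++) {
		if (!t->maps[i].used) {
			t->maps[i] = (struct sim_map){ start, size, dir, true };
			return true;
		}
	}
	return false;
}

/* Range-style delete: drops every entry overlapping [start, start + size). */
static int sim_map_del_range(struct sim_iotlb *t, uint64_t start,
			     uint64_t size)
{
	int removed = 0;
	size_t i;

	for (i = 0; i < SIM_MAX_MAPS; i++) {
		struct sim_map *m = &t->maps[i];

		if (m->used && m->start < start + size &&
		    start < m->start + m->size) {
			m->used = false;
			removed++;
		}
	}
	return removed;
}

/* Exact-match delete: drops at most one entry with the same parameters. */
static bool sim_map_del_one(struct sim_iotlb *t, uint64_t start,
			    uint64_t size, int dir)
{
	size_t i;

	for (i = 0; i < SIM_MAX_MAPS; i++) {
		struct sim_map *m = &t->maps[i];

		if (m->used && m->start == start && m->size == size &&
		    m->dir == dir) {
			m->used = false;
			return true;
		}
	}
	return false;
}

/* Number of live mappings. */
static int sim_map_count(const struct sim_iotlb *t)
{
	int n = 0;
	size_t i;

	for (i = 0; i < SIM_MAX_MAPS; i++)
		n += t->maps[i].used ? 1 : 0;
	return n;
}
```

With two SG elements mapped at the same address, sim_map_del_one() leaves the second mapping in place for the request that is still in flight, which is the behaviour we want from vdpasim_unmap_page().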
Thanks,
Stefano
>
>
>>
>>2. There was a race between dma_map/unmap and the worker thread,
>>since both are accessing the IOMMU. Taking the iommu_lock while
>>using vhost_iotlb_* API in the worker thread fixes the "vringh:
>>Failed to *" issues.
>>
>>Whit these issues fixed the vdpa-blk simulator seems to work well.
>>I'll send the patches next week or after the break.
>
>
>Good to know this.
>
>Thanks
>
>
>>
>>Thanks,
>>Stefano
>>
>
On 2020/12/21 7:14 PM, Stefano Garzarella wrote:
> On Mon, Dec 21, 2020 at 11:16:54AM +0800, Jason Wang wrote:
>>
>> On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
>>> On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>>>
>>>> On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>>>> Thanks to Max that started this work!
>>>>> I took his patches, and extended the block simulator a bit.
>>>>>
>>>>> This series moves the network device simulator in a new module
>>>>> (vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
>>>>> module, allowing the possibility to add new vDPA device simulators.
>>>>> Then we added a new vdpa_sim_blk module to simulate a block device.
>>>>>
>>>>> I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
>>>>> bytes when ptr is NULL"), maybe we can add a new functions instead of
>>>>> modify vringh_iov_xfer().
>>>>>
>>>>> As Max reported, I'm also seeing errors with vdpa_sim_blk related to
>>>>> iotlb and vringh when there is high load, these are some of the error
>>>>> messages I can see randomly:
>>>>>
>>>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>>>> vringh: Failed to get flags at 000000006635d7a3
>>>>>
>>>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset:
>>>>> 0x2840000 len: 0x20000
>>>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset:
>>>>> 0x58ee000 len: 0x3000
>>>>>
>>>>> These errors should all be related to the fact that iotlb_translate()
>>>>> fails with -EINVAL, so it seems that we miss some mapping.
>>>>
>>>>
>>>> Is this only reproducible when there's multiple co-current
>>>> accessing of IOTLB? If yes, it's probably a hint that some kind of
>>>> synchronization is still missed somewhere.
>>>>
>>>> It might be useful to log the dma_map/unmp in both virtio_ring and
>>>> vringh to see who is missing the map.
>>>>
>>>
>>> Just an update about these issues with vdpa-sim-blk.
>>> I've been focusing a little bit on these failures over the last few
>>> days and have found two issues related to the IOTLB/IOMMU:
>>>
>>> 1. Some requests coming from the block layer fills the SG list with
>>> multiple buffers that had the same physical address. This happens
>>> for example while using 'mkfs', at some points multiple sectors are
>>> zeroed so multiple SG elements point to the same physical page that
>>> is zeroed.
>>> Since we are using vhost_iotlb_del_range() in the
>>> vdpasim_unmap_page(), this removes all the overlapped ranges. I
>>> fixed removing a single map in vdpasim_unmap_page(), but has an
>>> alternative we can implement some kind of reference counts.
>>
>>
>> I think we need to do what hardware do. So using refcount is probably
>> not a good ida.
>
> Okay, so since we are using for simplicity an identical mapping, we
> are assigning the same dma_addr to multiple pages.
I think I get you now. That's the root cause of the failure.
Then I think we need a simple IOVA allocator for the vdpa simulator, and
it might be useful for VDUSE as well.
Thanks
>
> So, it should be okay to remove a single mapping checking the others
> parameters (i.e. dir, size).
>
> I'll send a patch, so with the code it should be easier :-)
>
> Thanks,
> Stefano
>
>>
>>
>>>
>>> 2. There was a race between dma_map/unmap and the worker thread,
>>> since both are accessing the IOMMU. Taking the iommu_lock while
>>> using vhost_iotlb_* API in the worker thread fixes the "vringh:
>>> Failed to *" issues.
>>>
>>> Whit these issues fixed the vdpa-blk simulator seems to work well.
>>> I'll send the patches next week or after the break.
>>
>>
>> Good to know this.
>>
>> Thanks
>>
>>
>>>
>>> Thanks,
>>> Stefano
>>>
>>
>
On Tue, Dec 22, 2020 at 10:44:48AM +0800, Jason Wang wrote:
>
>On 2020/12/21 7:14 PM, Stefano Garzarella wrote:
>>On Mon, Dec 21, 2020 at 11:16:54AM +0800, Jason Wang wrote:
>>>
>>>On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
>>>>On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>>>>
>>>>>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>>>>>Thanks to Max that started this work!
>>>>>>I took his patches, and extended the block simulator a bit.
>>>>>>
>>>>>>This series moves the network device simulator in a new module
>>>>>>(vdpa_sim_net) and leaves the generic functions in the vdpa_sim core
>>>>>>module, allowing the possibility to add new vDPA device simulators.
>>>>>>Then we added a new vdpa_sim_blk module to simulate a block device.
>>>>>>
>>>>>>I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to skip
>>>>>>bytes when ptr is NULL"), maybe we can add a new functions instead of
>>>>>>modify vringh_iov_xfer().
>>>>>>
>>>>>>As Max reported, I'm also seeing errors with vdpa_sim_blk related to
>>>>>>iotlb and vringh when there is high load, these are some of the error
>>>>>>messages I can see randomly:
>>>>>>
>>>>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>>>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>>>>> vringh: Failed to get flags at 000000006635d7a3
>>>>>>
>>>>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14
>>>>>>offset: 0x2840000 len: 0x20000
>>>>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14
>>>>>>offset: 0x58ee000 len: 0x3000
>>>>>>
>>>>>>These errors should all be related to the fact that iotlb_translate()
>>>>>>fails with -EINVAL, so it seems that we miss some mapping.
>>>>>
>>>>>
>>>>>Is this only reproducible when there's multiple co-current
>>>>>accessing of IOTLB? If yes, it's probably a hint that some
>>>>>kind of synchronization is still missed somewhere.
>>>>>
>>>>>It might be useful to log the dma_map/unmp in both virtio_ring
>>>>>and vringh to see who is missing the map.
>>>>>
>>>>
>>>>Just an update about these issues with vdpa-sim-blk.
>>>>I've been focusing a little bit on these failures over the last
>>>>few days and have found two issues related to the IOTLB/IOMMU:
>>>>
>>>>1. Some requests coming from the block layer fills the SG list
>>>>with multiple buffers that had the same physical address. This
>>>>happens for example while using 'mkfs', at some points multiple
>>>>sectors are zeroed so multiple SG elements point to the same
>>>>physical page that is zeroed.
>>>>Since we are using vhost_iotlb_del_range() in the
>>>>vdpasim_unmap_page(), this removes all the overlapped ranges. I
>>>>fixed removing a single map in vdpasim_unmap_page(), but has an
>>>>alternative we can implement some kind of reference counts.
>>>
>>>
>>>I think we need to do what hardware do. So using refcount is
>>>probably not a good ida.
>>
>>Okay, so since we are using for simplicity an identical mapping, we
>>are assigning the same dma_addr to multiple pages.
>
>
>I think I get you now. That's the root cause for the failure.
Yes, sorry, I didn't explain it well previously.
>
>Then I think we need an simple iova allocator for vdpa simulator, and
>it might be useful for VDUSE as well.
Okay, I'll work on it.
If you have an example to follow or some pointers, they are welcome :-)
Thanks,
Stefano
On 2020/12/22 6:57 PM, Stefano Garzarella wrote:
> On Tue, Dec 22, 2020 at 10:44:48AM +0800, Jason Wang wrote:
>>
>> On 2020/12/21 7:14 PM, Stefano Garzarella wrote:
>>> On Mon, Dec 21, 2020 at 11:16:54AM +0800, Jason Wang wrote:
>>>>
>>>> On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
>>>>> On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>>>>>
>>>>>> On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>>>>>> Thanks to Max that started this work!
>>>>>>> I took his patches, and extended the block simulator a bit.
>>>>>>>
>>>>>>> This series moves the network device simulator in a new module
>>>>>>> (vdpa_sim_net) and leaves the generic functions in the vdpa_sim
>>>>>>> core
>>>>>>> module, allowing the possibility to add new vDPA device simulators.
>>>>>>> Then we added a new vdpa_sim_blk module to simulate a block device.
>>>>>>>
>>>>>>> I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer() to
>>>>>>> skip
>>>>>>> bytes when ptr is NULL"), maybe we can add a new functions
>>>>>>> instead of
>>>>>>> modify vringh_iov_xfer().
>>>>>>>
>>>>>>> As Max reported, I'm also seeing errors with vdpa_sim_blk
>>>>>>> related to
>>>>>>> iotlb and vringh when there is high load, these are some of the
>>>>>>> error
>>>>>>> messages I can see randomly:
>>>>>>>
>>>>>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>>>>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>>>>>> vringh: Failed to get flags at 000000006635d7a3
>>>>>>>
>>>>>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset:
>>>>>>> 0x2840000 len: 0x20000
>>>>>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset:
>>>>>>> 0x58ee000 len: 0x3000
>>>>>>>
>>>>>>> These errors should all be related to the fact that
>>>>>>> iotlb_translate()
>>>>>>> fails with -EINVAL, so it seems that we miss some mapping.
>>>>>>
>>>>>>
>>>>>> Is this only reproducible when there's multiple co-current
>>>>>> accessing of IOTLB? If yes, it's probably a hint that some kind
>>>>>> of synchronization is still missed somewhere.
>>>>>>
>>>>>> It might be useful to log the dma_map/unmp in both virtio_ring
>>>>>> and vringh to see who is missing the map.
>>>>>>
>>>>>
>>>>> Just an update about these issues with vdpa-sim-blk.
>>>>> I've been focusing a little bit on these failures over the last
>>>>> few days and have found two issues related to the IOTLB/IOMMU:
>>>>>
>>>>> 1. Some requests coming from the block layer fills the SG list
>>>>> with multiple buffers that had the same physical address. This
>>>>> happens for example while using 'mkfs', at some points multiple
>>>>> sectors are zeroed so multiple SG elements point to the same
>>>>> physical page that is zeroed.
>>>>> Since we are using vhost_iotlb_del_range() in the
>>>>> vdpasim_unmap_page(), this removes all the overlapped ranges. I
>>>>> fixed removing a single map in vdpasim_unmap_page(), but has an
>>>>> alternative we can implement some kind of reference counts.
>>>>
>>>>
>>>> I think we need to do what hardware do. So using refcount is
>>>> probably not a good ida.
>>>
>>> Okay, so since we are using for simplicity an identical mapping, we
>>> are assigning the same dma_addr to multiple pages.
>>
>>
>> I think I get you now. That's the root cause for the failure.
>
> Yes, sorry, I didn't explain well previously.
>
>>
>> Then I think we need an simple iova allocator for vdpa simulator, and
>> it might be useful for VDUSE as well.
>
> Okay, I'll work on it.
> If you have an example to follow or some pointers, they are welcome :-)
The kernel already implements one in iova.c, but I'm not sure we need
that much complexity. Alternatively, we can just use an rbtree or idr to
implement a simpler one.
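
To give an idea of what "simpler" could look like, here is a minimal
userspace sketch, not kernel code. All names (iova_alloc, iova_get,
iova_put) are hypothetical, and a sorted linked list stands in for the
rbtree; it is a first-fit allocator that hands out a distinct IOVA for
every mapping, so two mappings of the same physical page no longer
share one dma_addr.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define IOVA_INVALID UINT64_MAX

/* One allocated range; the list is kept sorted by start address. */
struct iova_node {
	uint64_t start;
	uint64_t size;
	struct iova_node *next;
};

struct iova_alloc {
	uint64_t base;		/* first usable IOVA */
	uint64_t limit;		/* one past the last usable IOVA */
	struct iova_node *head;	/* allocated ranges, sorted by start */
};

/* First fit: take the first gap between allocated ranges that is large
 * enough. Returns the IOVA, or IOVA_INVALID if nothing fits. */
static uint64_t iova_get(struct iova_alloc *a, uint64_t size)
{
	struct iova_node **link = &a->head;
	uint64_t cursor = a->base;
	struct iova_node *n;

	assert(size > 0);
	/* Skip ranges while the gap in front of them is too small. */
	while (*link && (*link)->start - cursor < size) {
		cursor = (*link)->start + (*link)->size;
		link = &(*link)->next;
	}
	/* Past the last range: check the space left up to the limit. */
	if (!*link && a->limit - cursor < size)
		return IOVA_INVALID;

	n = malloc(sizeof(*n));
	if (!n)
		return IOVA_INVALID;
	n->start = cursor;
	n->size = size;
	n->next = *link;
	*link = n;
	return cursor;
}

/* Release the range starting at iova; returns 0, or -1 if unknown. */
static int iova_put(struct iova_alloc *a, uint64_t iova)
{
	struct iova_node **link;

	for (link = &a->head; *link; link = &(*link)->next) {
		if ((*link)->start == iova) {
			struct iova_node *n = *link;

			*link = n->next;
			free(n);
			return 0;
		}
	}
	return -1;
}
```

With something like this, map would call iova_get() and store the
returned address in the IOTLB, and unmap would call iova_put() on
exactly that address. Lookup cost is linear here; an rbtree would only
change the container, not the idea.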
Thanks
>
> Thanks,
> Stefano
>
On 2020/12/22 8:29 PM, Jason Wang wrote:
>
> On 2020/12/22 6:57 PM, Stefano Garzarella wrote:
>> On Tue, Dec 22, 2020 at 10:44:48AM +0800, Jason Wang wrote:
>>>
>>> On 2020/12/21 7:14 PM, Stefano Garzarella wrote:
>>>> On Mon, Dec 21, 2020 at 11:16:54AM +0800, Jason Wang wrote:
>>>>>
>>>>> On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
>>>>>> On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>>>>>>
>>>>>>> On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>>>>>>> Thanks to Max that started this work!
>>>>>>>> I took his patches, and extended the block simulator a bit.
>>>>>>>>
>>>>>>>> This series moves the network device simulator in a new module
>>>>>>>> (vdpa_sim_net) and leaves the generic functions in the vdpa_sim
>>>>>>>> core
>>>>>>>> module, allowing the possibility to add new vDPA device
>>>>>>>> simulators.
>>>>>>>> Then we added a new vdpa_sim_blk module to simulate a block
>>>>>>>> device.
>>>>>>>>
>>>>>>>> I'm not sure about patch 11 ("vringh: allow vringh_iov_xfer()
>>>>>>>> to skip
>>>>>>>> bytes when ptr is NULL"), maybe we can add a new functions
>>>>>>>> instead of
>>>>>>>> modify vringh_iov_xfer().
>>>>>>>>
>>>>>>>> As Max reported, I'm also seeing errors with vdpa_sim_blk
>>>>>>>> related to
>>>>>>>> iotlb and vringh when there is high load, these are some of the
>>>>>>>> error
>>>>>>>> messages I can see randomly:
>>>>>>>>
>>>>>>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>>>>>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>>>>>>> vringh: Failed to get flags at 000000006635d7a3
>>>>>>>>
>>>>>>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14 offset:
>>>>>>>> 0x2840000 len: 0x20000
>>>>>>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14 offset:
>>>>>>>> 0x58ee000 len: 0x3000
>>>>>>>>
>>>>>>>> These errors should all be related to the fact that
>>>>>>>> iotlb_translate()
>>>>>>>> fails with -EINVAL, so it seems that we miss some mapping.
>>>>>>>
>>>>>>>
>>>>>>> Is this only reproducible when there's multiple co-current
>>>>>>> accessing of IOTLB? If yes, it's probably a hint that some kind
>>>>>>> of synchronization is still missed somewhere.
>>>>>>>
>>>>>>> It might be useful to log the dma_map/unmp in both virtio_ring
>>>>>>> and vringh to see who is missing the map.
>>>>>>>
>>>>>>
>>>>>> Just an update about these issues with vdpa-sim-blk.
>>>>>> I've been focusing a little bit on these failures over the last
>>>>>> few days and have found two issues related to the IOTLB/IOMMU:
>>>>>>
>>>>>> 1. Some requests coming from the block layer fills the SG list
>>>>>> with multiple buffers that had the same physical address. This
>>>>>> happens for example while using 'mkfs', at some points multiple
>>>>>> sectors are zeroed so multiple SG elements point to the same
>>>>>> physical page that is zeroed.
>>>>>> Since we are using vhost_iotlb_del_range() in the
>>>>>> vdpasim_unmap_page(), this removes all the overlapped ranges. I
>>>>>> fixed removing a single map in vdpasim_unmap_page(), but has an
>>>>>> alternative we can implement some kind of reference counts.
>>>>>
>>>>>
>>>>> I think we need to do what hardware do. So using refcount is
>>>>> probably not a good ida.
>>>>
>>>> Okay, so since we are using for simplicity an identical mapping, we
>>>> are assigning the same dma_addr to multiple pages.
>>>
>>>
>>> I think I get you now. That's the root cause for the failure.
>>
>> Yes, sorry, I didn't explain well previously.
>>
>>>
>>> Then I think we need an simple iova allocator for vdpa simulator,
>>> and it might be useful for VDUSE as well.
>>
>> Okay, I'll work on it.
>> If you have an example to follow or some pointers, they are welcome :-)
>
>
> Kernel had implemented one in iova.c but I'm not sure we need the
> complexity like that. Or we can just use rbtree or idr to implement a
> simpler one.
VDUSE[1] implements another allocator, but it's still complicated since
it needs to track bounce pages. I feel we'd better start with a
simple one.
Thanks
[1] https://www.spinics.net/lists/linux-mm/msg231576.html
>
> Thanks
>
>
>>
>> Thanks,
>> Stefano
>>
On Tue, Dec 22, 2020 at 08:29:20PM +0800, Jason Wang wrote:
>
>On 2020/12/22 6:57 PM, Stefano Garzarella wrote:
>>On Tue, Dec 22, 2020 at 10:44:48AM +0800, Jason Wang wrote:
>>>
>>>On 2020/12/21 7:14 PM, Stefano Garzarella wrote:
>>>>On Mon, Dec 21, 2020 at 11:16:54AM +0800, Jason Wang wrote:
>>>>>
>>>>>On 2020/12/18 7:38 PM, Stefano Garzarella wrote:
>>>>>>On Mon, Nov 16, 2020 at 11:37:48AM +0800, Jason Wang wrote:
>>>>>>>
>>>>>>>On 2020/11/13 9:47 PM, Stefano Garzarella wrote:
>>>>>>>>Thanks to Max that started this work!
>>>>>>>>I took his patches, and extended the block simulator a bit.
>>>>>>>>
>>>>>>>>This series moves the network device simulator in a new module
>>>>>>>>(vdpa_sim_net) and leaves the generic functions in the
>>>>>>>>vdpa_sim core
>>>>>>>>module, allowing the possibility to add new vDPA device simulators.
>>>>>>>>Then we added a new vdpa_sim_blk module to simulate a block device.
>>>>>>>>
>>>>>>>>I'm not sure about patch 11 ("vringh: allow
>>>>>>>>vringh_iov_xfer() to skip
>>>>>>>>bytes when ptr is NULL"), maybe we can add a new
>>>>>>>>functions instead of
>>>>>>>>modify vringh_iov_xfer().
>>>>>>>>
>>>>>>>>As Max reported, I'm also seeing errors with
>>>>>>>>vdpa_sim_blk related to
>>>>>>>>iotlb and vringh when there is high load, these are some
>>>>>>>>of the error
>>>>>>>>messages I can see randomly:
>>>>>>>>
>>>>>>>> vringh: Failed to access avail idx at 00000000e8deb2cc
>>>>>>>> vringh: Failed to read head: idx 6289 address 00000000e1ad1d50
>>>>>>>> vringh: Failed to get flags at 000000006635d7a3
>>>>>>>>
>>>>>>>> virtio_vdpa vdpa0: vringh_iov_push_iotlb() error: -14
>>>>>>>>offset: 0x2840000 len: 0x20000
>>>>>>>> virtio_vdpa vdpa0: vringh_iov_pull_iotlb() error: -14
>>>>>>>>offset: 0x58ee000 len: 0x3000
>>>>>>>>
>>>>>>>>These errors should all be related to the fact that
>>>>>>>>iotlb_translate()
>>>>>>>>fails with -EINVAL, so it seems that we miss some mapping.
>>>>>>>
>>>>>>>
>>>>>>>Is this only reproducible when there's multiple co-current
>>>>>>>accessing of IOTLB? If yes, it's probably a hint that some
>>>>>>>kind of synchronization is still missed somewhere.
>>>>>>>
>>>>>>>It might be useful to log the dma_map/unmp in both
>>>>>>>virtio_ring and vringh to see who is missing the map.
>>>>>>>
>>>>>>
>>>>>>Just an update about these issues with vdpa-sim-blk.
>>>>>>I've been focusing a little bit on these failures over the
>>>>>>last few days and have found two issues related to the
>>>>>>IOTLB/IOMMU:
>>>>>>
>>>>>>1. Some requests coming from the block layer fills the SG
>>>>>>list with multiple buffers that had the same physical
>>>>>>address. This happens for example while using 'mkfs', at
>>>>>>some points multiple sectors are zeroed so multiple SG
>>>>>>elements point to the same physical page that is zeroed.
>>>>>>Since we are using vhost_iotlb_del_range() in the
>>>>>>vdpasim_unmap_page(), this removes all the overlapped
>>>>>>ranges. I fixed removing a single map in
>>>>>>vdpasim_unmap_page(), but has an alternative we can
>>>>>>implement some kind of reference counts.
>>>>>
>>>>>
>>>>>I think we need to do what hardware do. So using refcount is
>>>>>probably not a good ida.
>>>>
>>>>Okay, so since we are using for simplicity an identical mapping,
>>>>we are assigning the same dma_addr to multiple pages.
>>>
>>>
>>>I think I get you now. That's the root cause for the failure.
>>
>>Yes, sorry, I didn't explain well previously.
>>
>>>
>>>Then I think we need an simple iova allocator for vdpa simulator,
>>>and it might be useful for VDUSE as well.
>>
>>Okay, I'll work on it.
>>If you have an example to follow or some pointers, they are welcome :-)
>
>
>Kernel had implemented one in iova.c but I'm not sure we need the
>complexity like that. Or we can just use rbtree or idr to implement a
>simpler one.
Yeah, I found it and I started to integrate it into the simulator.
Even if it looks complicated, it seems to me that it should be simple
to integrate.
I'll give it a try, and if it is too complicated, I'll switch to a simple
rbtree.
Thanks,
Stefano