2020-08-04 16:23:49

by Eli Cohen

Subject: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

Hi Michael,
please note that this series depends on mlx5 core device driver patches
in the mlx5-next branch of
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.

git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next

They also depend on Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301

Jason, I had to resolve some conflicts, so I would appreciate it if you
could verify that the result is OK.

The following series of patches provides VDPA support for Mellanox
devices. The supported devices are ConnectX-6 Dx and newer.

Currently, only a network driver is implemented; future patches will
introduce a block device driver. iperf performance on a single queue is
around 12 Gbps. Future patches will introduce multi-queue support.

The files are organized such that code that can be shared by different
VDPA implementations is placed in a common area, drivers/vdpa/mlx5/core.

Only virtual functions are currently supported. Also, certain firmware
capabilities must be set to enable the driver. Physical functions (PFs)
are skipped by the driver.

To make use of the VDPA net driver, one must load mlx5_vdpa. In that
case, the VFs will be operated by the VDPA driver. Although one can
still see a regular network driver instance on the VF, the VDPA driver
takes precedence over the NIC driver, steering-wise.

Currently, the device/interface infrastructure in mlx5_core is used to
probe drivers. Future patches will introduce virtbus as a means to
register devices and drivers, and VDPA will be adapted to it.

The mlx5 mode of operation required to support VDPA is switchdev mode
(set, for example, with "devlink dev eswitch set pci/<BDF> mode switchdev").
One can then use a Linux bridge or OVS to take care of layer 2 switching.

In order to provide virtio networking to a guest, an updated version of
qemu is required. This series has been tested with the following qemu
version:

url: https://github.com/jasowang/qemu.git
branch: vdpa
Commit ID: 6f4e59b807db


V2->V3
Fix makefile to use include path relative to the root of the kernel

V3->V4
Rebase Jason's patches on the linux-next branch
Fix kbuild robot error on the MIPS arch
Make use of the free callback to destroy resources on unload
Use VIRTIO_F_ACCESS_PLATFORM instead of legacy VIRTIO_F_IOMMU_PLATFORM
Add empty implementations for get_vq_notification() and get_vq_irq()


Eli Cohen (6):
net/vdpa: Use struct for set/get vq state
vdpa: Modify get_vq_state() to return error code
vdpa/mlx5: Add hardware descriptive header file
vdpa/mlx5: Add support library for mlx5 VDPA implementation
vdpa/mlx5: Add shared memory registration code
vdpa/mlx5: Add VDPA driver for supported mlx5 devices

Jason Wang (5):
vhost-vdpa: refine ioctl pre-processing
vhost: generalize backend features setting/getting
vhost-vdpa: support get/set backend features
vhost-vdpa: support IOTLB batching hints
vdpasim: support batch updating

Max Gurtovoy (1):
vdpa: remove hard coded virtq num

drivers/vdpa/Kconfig | 19 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/ifcvf/ifcvf_base.c | 4 +-
drivers/vdpa/ifcvf/ifcvf_base.h | 4 +-
drivers/vdpa/ifcvf/ifcvf_main.c | 13 +-
drivers/vdpa/mlx5/Makefile | 4 +
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 91 ++
drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h | 168 ++
drivers/vdpa/mlx5/core/mr.c | 484 ++++++
drivers/vdpa/mlx5/core/resources.c | 284 ++++
drivers/vdpa/mlx5/net/main.c | 76 +
drivers/vdpa/mlx5/net/mlx5_vnet.c | 1965 ++++++++++++++++++++++++
drivers/vdpa/mlx5/net/mlx5_vnet.h | 24 +
drivers/vdpa/vdpa.c | 3 +
drivers/vdpa/vdpa_sim/vdpa_sim.c | 53 +-
drivers/vhost/net.c | 18 +-
drivers/vhost/vdpa.c | 76 +-
drivers/vhost/vhost.c | 15 +
drivers/vhost/vhost.h | 2 +
include/linux/vdpa.h | 24 +-
include/uapi/linux/vhost.h | 2 +
include/uapi/linux/vhost_types.h | 11 +
22 files changed, 3284 insertions(+), 57 deletions(-)
create mode 100644 drivers/vdpa/mlx5/Makefile
create mode 100644 drivers/vdpa/mlx5/core/mlx5_vdpa.h
create mode 100644 drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h
create mode 100644 drivers/vdpa/mlx5/core/mr.c
create mode 100644 drivers/vdpa/mlx5/core/resources.c
create mode 100644 drivers/vdpa/mlx5/net/main.c
create mode 100644 drivers/vdpa/mlx5/net/mlx5_vnet.c
create mode 100644 drivers/vdpa/mlx5/net/mlx5_vnet.h

--
2.26.0


2020-08-04 16:23:49

by Eli Cohen

Subject: [PATCH V4 linux-next 07/12] net/vdpa: Use struct for set/get vq state

For now, VQ state is a 16-bit available index encoded in a u64
variable. In the future it will be extended to contain more fields. Use
a struct to contain the state, currently holding only a single u16 for
the available index, so that more fields can be added later.
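
Purely as an illustration (not part of this patch) of how the struct
form leaves room to grow, packed virtqueues would hypothetically need
wrap counters alongside the indices:

struct vdpa_vq_state {
        u16 avail_index;        /* split ring: next available index */
        /* hypothetical future fields for packed rings, e.g.:
         * u16 last_avail_counter:1, last_avail_idx:15;
         * u16 last_used_counter:1, last_used_idx:15;
         */
};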

Reviewed-by: Parav Pandit <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Eli Cohen <[email protected]>
---
drivers/vdpa/ifcvf/ifcvf_base.c | 4 ++--
drivers/vdpa/ifcvf/ifcvf_base.h | 4 ++--
drivers/vdpa/ifcvf/ifcvf_main.c | 9 +++++----
drivers/vdpa/vdpa_sim/vdpa_sim.c | 10 ++++++----
drivers/vhost/vdpa.c | 7 +++++--
include/linux/vdpa.h | 18 ++++++++++++++----
6 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 94bf0328b68d..f2a128e56de5 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -272,7 +272,7 @@ static int ifcvf_config_features(struct ifcvf_hw *hw)
return 0;
}

-u64 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid)
+u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid)
{
struct ifcvf_lm_cfg __iomem *ifcvf_lm;
void __iomem *avail_idx_addr;
@@ -287,7 +287,7 @@ u64 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid)
return last_avail_idx;
}

-int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u64 num)
+int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num)
{
struct ifcvf_lm_cfg __iomem *ifcvf_lm;
void __iomem *avail_idx_addr;
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index 24af422b5a3e..08f267a2aafe 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -116,7 +116,7 @@ void ifcvf_set_status(struct ifcvf_hw *hw, u8 status);
void io_write64_twopart(u64 val, u32 *lo, u32 *hi);
void ifcvf_reset(struct ifcvf_hw *hw);
u64 ifcvf_get_features(struct ifcvf_hw *hw);
-u64 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
-int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u64 num);
+u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
+int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num);
struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw);
#endif /* _IFCVF_H_ */
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 7c93225367db..dc311e972b9e 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -237,19 +237,20 @@ static u16 ifcvf_vdpa_get_vq_num_max(struct vdpa_device *vdpa_dev)
return IFCVF_QUEUE_MAX;
}

-static u64 ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid)
+static void ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
+ struct vdpa_vq_state *state)
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);

- return ifcvf_get_vq_state(vf, qid);
+ state->avail_index = ifcvf_get_vq_state(vf, qid);
}

static int ifcvf_vdpa_set_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
- u64 num)
+ const struct vdpa_vq_state *state)
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);

- return ifcvf_set_vq_state(vf, qid, num);
+ return ifcvf_set_vq_state(vf, qid, state->avail_index);
}

static void ifcvf_vdpa_set_vq_cb(struct vdpa_device *vdpa_dev, u16 qid,
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 676cb2f53202..f1c351d5959b 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -450,26 +450,28 @@ static bool vdpasim_get_vq_ready(struct vdpa_device *vdpa, u16 idx)
return vq->ready;
}

-static int vdpasim_set_vq_state(struct vdpa_device *vdpa, u16 idx, u64 state)
+static int vdpasim_set_vq_state(struct vdpa_device *vdpa, u16 idx,
+ const struct vdpa_vq_state *state)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
struct vringh *vrh = &vq->vring;

spin_lock(&vdpasim->lock);
- vrh->last_avail_idx = state;
+ vrh->last_avail_idx = state->avail_index;
spin_unlock(&vdpasim->lock);

return 0;
}

-static u64 vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx)
+static void vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx,
+ struct vdpa_vq_state *state)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
struct vringh *vrh = &vq->vring;

- return vrh->last_avail_idx;
+ state->avail_index = vrh->last_avail_idx;
}

static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 0049e34f7ca6..c2e1e2d55084 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -366,6 +366,7 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
{
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
+ struct vdpa_vq_state vq_state;
struct vdpa_callback cb;
struct vhost_virtqueue *vq;
struct vhost_vring_state s;
@@ -391,7 +392,8 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
ops->set_vq_ready(vdpa, idx, s.num);
return 0;
case VHOST_GET_VRING_BASE:
- vq->last_avail_idx = ops->get_vq_state(v->vdpa, idx);
+ ops->get_vq_state(v->vdpa, idx, &vq_state);
+ vq->last_avail_idx = vq_state.avail_index;
break;
case VHOST_GET_BACKEND_FEATURES:
features = VHOST_VDPA_BACKEND_FEATURES;
@@ -421,7 +423,8 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
break;

case VHOST_SET_VRING_BASE:
- if (ops->set_vq_state(vdpa, idx, vq->last_avail_idx))
+ vq_state.avail_index = vq->last_avail_idx;
+ if (ops->set_vq_state(vdpa, idx, &vq_state))
r = -EINVAL;
break;

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 7b1e9df8fc11..8f412620071d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -27,6 +27,14 @@ struct vdpa_notification_area {
resource_size_t size;
};

+/**
+ * vDPA vq_state definition
+ * @avail_index: available index
+ */
+struct vdpa_vq_state {
+ u16 avail_index;
+};
+
/**
* vDPA device - representation of a vDPA device
* @dev: underlying device
@@ -80,12 +88,12 @@ struct vdpa_device {
* @set_vq_state: Set the state for a virtqueue
* @vdev: vdpa device
* @idx: virtqueue index
- * @state: virtqueue state (last_avail_idx)
+ * @state: pointer to set virtqueue state (last_avail_idx)
* Returns integer: success (0) or error (< 0)
* @get_vq_state: Get the state for a virtqueue
* @vdev: vdpa device
* @idx: virtqueue index
- * Returns virtqueue state (last_avail_idx)
+ * @state: pointer to returned state (last_avail_idx)
* @get_vq_notification: Get the notification area for a virtqueue
* @vdev: vdpa device
* @idx: virtqueue index
@@ -182,8 +190,10 @@ struct vdpa_config_ops {
struct vdpa_callback *cb);
void (*set_vq_ready)(struct vdpa_device *vdev, u16 idx, bool ready);
bool (*get_vq_ready)(struct vdpa_device *vdev, u16 idx);
- int (*set_vq_state)(struct vdpa_device *vdev, u16 idx, u64 state);
- u64 (*get_vq_state)(struct vdpa_device *vdev, u16 idx);
+ int (*set_vq_state)(struct vdpa_device *vdev, u16 idx,
+ const struct vdpa_vq_state *state);
+ void (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
+ struct vdpa_vq_state *state);
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
--
2.26.0

2020-08-04 16:24:10

by Eli Cohen

Subject: [PATCH V4 linux-next 05/12] vdpasim: support batch updating

From: Jason Wang <[email protected]>

The vDPA simulator supports both the set_map() and
dma_map()/dma_unmap() operations, but vhost-vdpa can only use one of
them. So this patch introduces a module parameter (batch_mapping) that
makes vdpa_sim expose only one of these DMA operations. The batched
mapping via set_map() is enabled by default.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 40 +++++++++++++++++++++++++++++---
1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 13febedd73ee..07ded545197d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -34,6 +34,10 @@
#define DRV_DESC "vDPA Device Simulator"
#define DRV_LICENSE "GPL v2"

+static int batch_mapping = 1;
+module_param(batch_mapping, int, 0444);
+MODULE_PARM_DESC(batch_mapping, "Batched mapping 1 - Enable; 0 - Disable");
+
struct vdpasim_virtqueue {
struct vringh vring;
struct vringh_kiov iov;
@@ -334,15 +338,21 @@ static const struct dma_map_ops vdpasim_dma_ops = {
};

static const struct vdpa_config_ops vdpasim_net_config_ops;
+static const struct vdpa_config_ops vdpasim_net_batch_config_ops;

static struct vdpasim *vdpasim_create(void)
{
+ const struct vdpa_config_ops *ops;
struct vdpasim *vdpasim;
struct device *dev;
int ret = -ENOMEM;

- vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL,
- &vdpasim_net_config_ops);
+ if (batch_mapping)
+ ops = &vdpasim_net_batch_config_ops;
+ else
+ ops = &vdpasim_net_config_ops;
+
+ vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops);
if (!vdpasim)
goto err_alloc;

@@ -641,12 +651,36 @@ static const struct vdpa_config_ops vdpasim_net_config_ops = {
.get_config = vdpasim_get_config,
.set_config = vdpasim_set_config,
.get_generation = vdpasim_get_generation,
- .set_map = vdpasim_set_map,
.dma_map = vdpasim_dma_map,
.dma_unmap = vdpasim_dma_unmap,
.free = vdpasim_free,
};

+static const struct vdpa_config_ops vdpasim_net_batch_config_ops = {
+ .set_vq_address = vdpasim_set_vq_address,
+ .set_vq_num = vdpasim_set_vq_num,
+ .kick_vq = vdpasim_kick_vq,
+ .set_vq_cb = vdpasim_set_vq_cb,
+ .set_vq_ready = vdpasim_set_vq_ready,
+ .get_vq_ready = vdpasim_get_vq_ready,
+ .set_vq_state = vdpasim_set_vq_state,
+ .get_vq_state = vdpasim_get_vq_state,
+ .get_vq_align = vdpasim_get_vq_align,
+ .get_features = vdpasim_get_features,
+ .set_features = vdpasim_set_features,
+ .set_config_cb = vdpasim_set_config_cb,
+ .get_vq_num_max = vdpasim_get_vq_num_max,
+ .get_device_id = vdpasim_get_device_id,
+ .get_vendor_id = vdpasim_get_vendor_id,
+ .get_status = vdpasim_get_status,
+ .set_status = vdpasim_set_status,
+ .get_config = vdpasim_get_config,
+ .set_config = vdpasim_set_config,
+ .get_generation = vdpasim_get_generation,
+ .set_map = vdpasim_set_map,
+ .free = vdpasim_free,
+};
+
static int __init vdpasim_dev_init(void)
{
vdpasim_dev = vdpasim_create();
--
2.26.0

2020-08-04 16:24:38

by Eli Cohen

Subject: [PATCH V4 linux-next 10/12] vdpa/mlx5: Add support library for mlx5 VDPA implementation

The following patches introduce a VDPA network driver for Mellanox
ConnectX-6 devices. This patch provides functionality that will be used
by those patches.
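
As orientation for reviewers, here is a minimal sketch of the intended
call order of this library; the wrapper function and its unwind flow
are assumptions, not code from this series:

static int example_setup(struct mlx5_core_dev *mdev, struct mlx5_vdpa_dev *mvdev)
{
        u32 tdn;
        int err;

        mvdev->mdev = mdev;
        /* uctx, PD, null mkey and doorbell page in one call */
        err = mlx5_vdpa_alloc_resources(mvdev);
        if (err)
                return err;

        err = mlx5_vdpa_alloc_transport_domain(mvdev, &tdn);
        if (err)
                goto err_res;

        return 0;

err_res:
        mlx5_vdpa_free_resources(mvdev);
        return err;
}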

Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Eli Cohen <[email protected]>
---
drivers/vdpa/Kconfig | 9 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/mlx5/Makefile | 1 +
drivers/vdpa/mlx5/core/Makefile | 1 +
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 57 ++++++
drivers/vdpa/mlx5/core/resources.c | 281 +++++++++++++++++++++++++++++
6 files changed, 350 insertions(+)
create mode 100644 drivers/vdpa/mlx5/Makefile
create mode 100644 drivers/vdpa/mlx5/core/Makefile
create mode 100644 drivers/vdpa/mlx5/core/mlx5_vdpa.h
create mode 100644 drivers/vdpa/mlx5/core/resources.c

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 3e1ceb8e9f2b..7cb84f82feba 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -28,4 +28,13 @@ config IFCVF
To compile this driver as a module, choose M here: the module will
be called ifcvf.

+config MLX5_VDPA
+ bool "MLX5 VDPA support library for ConnectX devices"
+ depends on MLX5_CORE
+ default n
+ help
+ Support library for Mellanox VDPA drivers. Provides code that is
+ common for all types of VDPA drivers. The following drivers are planned:
+ net, block.
+
endif # VDPA
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index 8bbb686ca7a2..d160e9b63a66 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -2,3 +2,4 @@
obj-$(CONFIG_VDPA) += vdpa.o
obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
obj-$(CONFIG_IFCVF) += ifcvf/
+obj-$(CONFIG_MLX5_VDPA) += mlx5/
diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
new file mode 100644
index 000000000000..d552abb1d126
--- /dev/null
+++ b/drivers/vdpa/mlx5/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_MLX5_VDPA) += core/resources.o
diff --git a/drivers/vdpa/mlx5/core/Makefile b/drivers/vdpa/mlx5/core/Makefile
new file mode 100644
index 000000000000..7070f8c8680d
--- /dev/null
+++ b/drivers/vdpa/mlx5/core/Makefile
@@ -0,0 +1 @@
+obj-y += resources.o
diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
new file mode 100644
index 000000000000..f3571c8b257e
--- /dev/null
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#ifndef __MLX5_VDPA_H__
+#define __MLX5_VDPA_H__
+
+#include <linux/vdpa.h>
+#include <linux/mlx5/driver.h>
+
+struct mlx5_vdpa_resources {
+ u32 pdn;
+ struct mlx5_uars_page *uar;
+ void __iomem *kick_addr;
+ u16 uid;
+ u32 null_mkey;
+ bool valid;
+};
+
+struct mlx5_vdpa_dev {
+ struct vdpa_device vdev;
+ struct mlx5_core_dev *mdev;
+ struct mlx5_vdpa_resources res;
+
+ u64 mlx_features;
+ u64 actual_features;
+ u8 status;
+ u32 max_vqs;
+ u32 generation;
+};
+
+int mlx5_vdpa_alloc_pd(struct mlx5_vdpa_dev *dev, u32 *pdn, u16 uid);
+int mlx5_vdpa_dealloc_pd(struct mlx5_vdpa_dev *dev, u32 pdn, u16 uid);
+int mlx5_vdpa_get_null_mkey(struct mlx5_vdpa_dev *dev, u32 *null_mkey);
+int mlx5_vdpa_create_tis(struct mlx5_vdpa_dev *mvdev, void *in, u32 *tisn);
+void mlx5_vdpa_destroy_tis(struct mlx5_vdpa_dev *mvdev, u32 tisn);
+int mlx5_vdpa_create_rqt(struct mlx5_vdpa_dev *mvdev, void *in, int inlen, u32 *rqtn);
+void mlx5_vdpa_destroy_rqt(struct mlx5_vdpa_dev *mvdev, u32 rqtn);
+int mlx5_vdpa_create_tir(struct mlx5_vdpa_dev *mvdev, void *in, u32 *tirn);
+void mlx5_vdpa_destroy_tir(struct mlx5_vdpa_dev *mvdev, u32 tirn);
+int mlx5_vdpa_alloc_transport_domain(struct mlx5_vdpa_dev *mvdev, u32 *tdn);
+void mlx5_vdpa_dealloc_transport_domain(struct mlx5_vdpa_dev *mvdev, u32 tdn);
+int mlx5_vdpa_alloc_resources(struct mlx5_vdpa_dev *mvdev);
+void mlx5_vdpa_free_resources(struct mlx5_vdpa_dev *mvdev);
+
+#define mlx5_vdpa_warn(__dev, format, ...) \
+ dev_warn((__dev)->mdev->device, "%s:%d:(pid %d) warning: " format, __func__, __LINE__, \
+ current->pid, ##__VA_ARGS__)
+
+#define mlx5_vdpa_info(__dev, format, ...) \
+ dev_info((__dev)->mdev->device, "%s:%d:(pid %d): " format, __func__, __LINE__, \
+ current->pid, ##__VA_ARGS__)
+
+#define mlx5_vdpa_dbg(__dev, format, ...) \
+ dev_debug((__dev)->mdev->device, "%s:%d:(pid %d): " format, __func__, __LINE__, \
+ current->pid, ##__VA_ARGS__)
+
+#endif /* __MLX5_VDPA_H__ */
diff --git a/drivers/vdpa/mlx5/core/resources.c b/drivers/vdpa/mlx5/core/resources.c
new file mode 100644
index 000000000000..6c6552b7e9b5
--- /dev/null
+++ b/drivers/vdpa/mlx5/core/resources.c
@@ -0,0 +1,281 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#include <linux/mlx5/driver.h>
+#include "mlx5_vdpa.h"
+
+static int alloc_pd(struct mlx5_vdpa_dev *dev, u32 *pdn, u16 uid)
+{
+ struct mlx5_core_dev *mdev = dev->mdev;
+
+ u32 out[MLX5_ST_SZ_DW(alloc_pd_out)] = {};
+ u32 in[MLX5_ST_SZ_DW(alloc_pd_in)] = {};
+ int err;
+
+ MLX5_SET(alloc_pd_in, in, opcode, MLX5_CMD_OP_ALLOC_PD);
+ MLX5_SET(alloc_pd_in, in, uid, uid);
+
+ err = mlx5_cmd_exec_inout(mdev, alloc_pd, in, out);
+ if (!err)
+ *pdn = MLX5_GET(alloc_pd_out, out, pd);
+
+ return err;
+}
+
+static int dealloc_pd(struct mlx5_vdpa_dev *dev, u32 pdn, u16 uid)
+{
+ u32 in[MLX5_ST_SZ_DW(dealloc_pd_in)] = {};
+ struct mlx5_core_dev *mdev = dev->mdev;
+
+ MLX5_SET(dealloc_pd_in, in, opcode, MLX5_CMD_OP_DEALLOC_PD);
+ MLX5_SET(dealloc_pd_in, in, pd, pdn);
+ MLX5_SET(dealloc_pd_in, in, uid, uid);
+ return mlx5_cmd_exec_in(mdev, dealloc_pd, in);
+}
+
+static int get_null_mkey(struct mlx5_vdpa_dev *dev, u32 *null_mkey)
+{
+ u32 out[MLX5_ST_SZ_DW(query_special_contexts_out)] = {};
+ u32 in[MLX5_ST_SZ_DW(query_special_contexts_in)] = {};
+ struct mlx5_core_dev *mdev = dev->mdev;
+ int err;
+
+ MLX5_SET(query_special_contexts_in, in, opcode, MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS);
+ err = mlx5_cmd_exec_inout(mdev, query_special_contexts, in, out);
+ if (!err)
+ *null_mkey = MLX5_GET(query_special_contexts_out, out, null_mkey);
+ return err;
+}
+
+static int create_uctx(struct mlx5_vdpa_dev *mvdev, u16 *uid)
+{
+ u32 out[MLX5_ST_SZ_DW(create_uctx_out)] = {};
+ int inlen;
+ void *in;
+ int err;
+
+ /* 0 means not supported */
+ if (!MLX5_CAP_GEN(mvdev->mdev, log_max_uctx))
+ return -EOPNOTSUPP;
+
+ inlen = MLX5_ST_SZ_BYTES(create_uctx_in);
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX);
+ MLX5_SET(create_uctx_in, in, uctx.cap, MLX5_UCTX_CAP_RAW_TX);
+
+ err = mlx5_cmd_exec(mvdev->mdev, in, inlen, out, sizeof(out));
+ kfree(in);
+ if (!err)
+ *uid = MLX5_GET(create_uctx_out, out, uid);
+
+ return err;
+}
+
+static void destroy_uctx(struct mlx5_vdpa_dev *mvdev, u32 uid)
+{
+ u32 out[MLX5_ST_SZ_DW(destroy_uctx_out)] = {};
+ u32 in[MLX5_ST_SZ_DW(destroy_uctx_in)] = {};
+
+ MLX5_SET(destroy_uctx_in, in, opcode, MLX5_CMD_OP_DESTROY_UCTX);
+ MLX5_SET(destroy_uctx_in, in, uid, uid);
+
+ mlx5_cmd_exec(mvdev->mdev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_vdpa_create_tis(struct mlx5_vdpa_dev *mvdev, void *in, u32 *tisn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_tis_out)] = {};
+ int err;
+
+ MLX5_SET(create_tis_in, in, opcode, MLX5_CMD_OP_CREATE_TIS);
+ MLX5_SET(create_tis_in, in, uid, mvdev->res.uid);
+ err = mlx5_cmd_exec_inout(mvdev->mdev, create_tis, in, out);
+ if (!err)
+ *tisn = MLX5_GET(create_tis_out, out, tisn);
+
+ return err;
+}
+
+void mlx5_vdpa_destroy_tis(struct mlx5_vdpa_dev *mvdev, u32 tisn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_tis_in)] = {};
+
+ MLX5_SET(destroy_tis_in, in, opcode, MLX5_CMD_OP_DESTROY_TIS);
+ MLX5_SET(destroy_tis_in, in, uid, mvdev->res.uid);
+ MLX5_SET(destroy_tis_in, in, tisn, tisn);
+ mlx5_cmd_exec_in(mvdev->mdev, destroy_tis, in);
+}
+
+int mlx5_vdpa_create_rqt(struct mlx5_vdpa_dev *mvdev, void *in, int inlen, u32 *rqtn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_rqt_out)] = {};
+ int err;
+
+ MLX5_SET(create_rqt_in, in, opcode, MLX5_CMD_OP_CREATE_RQT);
+ err = mlx5_cmd_exec(mvdev->mdev, in, inlen, out, sizeof(out));
+ if (!err)
+ *rqtn = MLX5_GET(create_rqt_out, out, rqtn);
+
+ return err;
+}
+
+void mlx5_vdpa_destroy_rqt(struct mlx5_vdpa_dev *mvdev, u32 rqtn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_rqt_in)] = {};
+
+ MLX5_SET(destroy_rqt_in, in, opcode, MLX5_CMD_OP_DESTROY_RQT);
+ MLX5_SET(destroy_rqt_in, in, uid, mvdev->res.uid);
+ MLX5_SET(destroy_rqt_in, in, rqtn, rqtn);
+ mlx5_cmd_exec_in(mvdev->mdev, destroy_rqt, in);
+}
+
+int mlx5_vdpa_create_tir(struct mlx5_vdpa_dev *mvdev, void *in, u32 *tirn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_tir_out)] = {};
+ int err;
+
+ MLX5_SET(create_tir_in, in, opcode, MLX5_CMD_OP_CREATE_TIR);
+ err = mlx5_cmd_exec_inout(mvdev->mdev, create_tir, in, out);
+ if (!err)
+ *tirn = MLX5_GET(create_tir_out, out, tirn);
+
+ return err;
+}
+
+void mlx5_vdpa_destroy_tir(struct mlx5_vdpa_dev *mvdev, u32 tirn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_tir_in)] = {};
+
+ MLX5_SET(destroy_tir_in, in, opcode, MLX5_CMD_OP_DESTROY_TIR);
+ MLX5_SET(destroy_tir_in, in, uid, mvdev->res.uid);
+ MLX5_SET(destroy_tir_in, in, tirn, tirn);
+ mlx5_cmd_exec_in(mvdev->mdev, destroy_tir, in);
+}
+
+int mlx5_vdpa_alloc_transport_domain(struct mlx5_vdpa_dev *mvdev, u32 *tdn)
+{
+ u32 out[MLX5_ST_SZ_DW(alloc_transport_domain_out)] = {};
+ u32 in[MLX5_ST_SZ_DW(alloc_transport_domain_in)] = {};
+ int err;
+
+ MLX5_SET(alloc_transport_domain_in, in, opcode, MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN);
+ MLX5_SET(alloc_transport_domain_in, in, uid, mvdev->res.uid);
+
+ err = mlx5_cmd_exec_inout(mvdev->mdev, alloc_transport_domain, in, out);
+ if (!err)
+ *tdn = MLX5_GET(alloc_transport_domain_out, out, transport_domain);
+
+ return err;
+}
+
+void mlx5_vdpa_dealloc_transport_domain(struct mlx5_vdpa_dev *mvdev, u32 tdn)
+{
+ u32 in[MLX5_ST_SZ_DW(dealloc_transport_domain_in)] = {};
+
+ MLX5_SET(dealloc_transport_domain_in, in, opcode, MLX5_CMD_OP_DEALLOC_TRANSPORT_DOMAIN);
+ MLX5_SET(dealloc_transport_domain_in, in, uid, mvdev->res.uid);
+ MLX5_SET(dealloc_transport_domain_in, in, transport_domain, tdn);
+ mlx5_cmd_exec_in(mvdev->mdev, dealloc_transport_domain, in);
+}
+
+int mlx5_vdpa_create_mkey(struct mlx5_vdpa_dev *mvdev, struct mlx5_core_mkey *mkey, u32 *in,
+ int inlen)
+{
+ u32 lout[MLX5_ST_SZ_DW(create_mkey_out)] = {};
+ u32 mkey_index;
+ void *mkc;
+ int err;
+
+ MLX5_SET(create_mkey_in, in, opcode, MLX5_CMD_OP_CREATE_MKEY);
+ MLX5_SET(create_mkey_in, in, uid, mvdev->res.uid);
+
+ err = mlx5_cmd_exec(mvdev->mdev, in, inlen, lout, sizeof(lout));
+ if (err)
+ return err;
+
+ mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);
+ mkey_index = MLX5_GET(create_mkey_out, lout, mkey_index);
+ mkey->iova = MLX5_GET64(mkc, mkc, start_addr);
+ mkey->size = MLX5_GET64(mkc, mkc, len);
+ mkey->key |= mlx5_idx_to_mkey(mkey_index);
+ mkey->pd = MLX5_GET(mkc, mkc, pd);
+ return 0;
+}
+
+int mlx5_vdpa_destroy_mkey(struct mlx5_vdpa_dev *mvdev, struct mlx5_core_mkey *mkey)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_mkey_in)] = {};
+
+ MLX5_SET(destroy_mkey_in, in, uid, mvdev->res.uid);
+ MLX5_SET(destroy_mkey_in, in, opcode, MLX5_CMD_OP_DESTROY_MKEY);
+ MLX5_SET(destroy_mkey_in, in, mkey_index, mlx5_mkey_to_idx(mkey->key));
+ return mlx5_cmd_exec_in(mvdev->mdev, destroy_mkey, in);
+}
+
+int mlx5_vdpa_alloc_resources(struct mlx5_vdpa_dev *mvdev)
+{
+ u64 offset = MLX5_CAP64_DEV_VDPA_EMULATION(mvdev->mdev, doorbell_bar_offset);
+ struct mlx5_vdpa_resources *res = &mvdev->res;
+ struct mlx5_core_dev *mdev = mvdev->mdev;
+ u64 kick_addr;
+ int err;
+
+ if (res->valid) {
+ mlx5_vdpa_warn(mvdev, "resources already allocated\n");
+ return -EINVAL;
+ }
+ res->uar = mlx5_get_uars_page(mdev);
+ if (IS_ERR(res->uar)) {
+ err = PTR_ERR(res->uar);
+ goto err_uars;
+ }
+
+ err = create_uctx(mvdev, &res->uid);
+ if (err)
+ goto err_uctx;
+
+ err = alloc_pd(mvdev, &res->pdn, res->uid);
+ if (err)
+ goto err_pd;
+
+ err = get_null_mkey(mvdev, &res->null_mkey);
+ if (err)
+ goto err_key;
+
+ kick_addr = pci_resource_start(mdev->pdev, 0) + offset;
+ res->kick_addr = ioremap(kick_addr, PAGE_SIZE);
+ if (!res->kick_addr) {
+ err = -ENOMEM;
+ goto err_key;
+ }
+ res->valid = true;
+
+ return 0;
+
+err_key:
+ dealloc_pd(mvdev, res->pdn, res->uid);
+err_pd:
+ destroy_uctx(mvdev, res->uid);
+err_uctx:
+ mlx5_put_uars_page(mdev, res->uar);
+err_uars:
+ return err;
+}
+
+void mlx5_vdpa_free_resources(struct mlx5_vdpa_dev *mvdev)
+{
+ struct mlx5_vdpa_resources *res = &mvdev->res;
+
+ if (!res->valid)
+ return;
+
+ iounmap(res->kick_addr);
+ res->kick_addr = NULL;
+ dealloc_pd(mvdev, res->pdn, res->uid);
+ destroy_uctx(mvdev, res->uid);
+ mlx5_put_uars_page(mvdev->mdev, res->uar);
+ res->valid = false;
+}
--
2.26.0

2020-08-04 16:25:13

by Eli Cohen

Subject: [PATCH V4 linux-next 04/12] vhost-vdpa: support IOTLB batching hints

From: Jason Wang <[email protected]>

This patch extends the vhost IOTLB API to accept batch-updating hints
from userspace. When userspace wants to update the device IOTLB in a
batch, it may do the following (a userspace sketch follows the list):

1) Write a vhost_iotlb_msg with the VHOST_IOTLB_BATCH_BEGIN flag
2) Perform a batch of IOTLB updates via VHOST_IOTLB_UPDATE/INVALIDATE
3) Write a vhost_iotlb_msg with the VHOST_IOTLB_BATCH_END flag
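
A minimal userspace sketch of this flow, assuming a vhost-vdpa fd that
accepts VHOST_IOTLB_MSG_V2 messages via write() and that
VHOST_BACKEND_F_IOTLB_BATCH was already negotiated through
VHOST_SET_BACKEND_FEATURES (error handling omitted):

#include <linux/vhost_types.h>
#include <string.h>
#include <unistd.h>

/* One vhost_msg_v2 per write(); field names per vhost_types.h. */
static void iotlb_send(int fd, const struct vhost_iotlb_msg *iotlb)
{
        struct vhost_msg_v2 msg;

        memset(&msg, 0, sizeof(msg));
        msg.type = VHOST_IOTLB_MSG_V2;
        msg.iotlb = *iotlb;
        write(fd, &msg, sizeof(msg));
}

static void batch_update(int fd, __u64 iova, __u64 size, __u64 uaddr)
{
        struct vhost_iotlb_msg m;

        memset(&m, 0, sizeof(m));
        m.type = VHOST_IOTLB_BATCH_BEGIN;       /* step 1 */
        iotlb_send(fd, &m);

        m.type = VHOST_IOTLB_UPDATE;            /* step 2, repeat as needed */
        m.iova = iova;
        m.size = size;
        m.uaddr = uaddr;
        m.perm = VHOST_ACCESS_RW;
        iotlb_send(fd, &m);

        memset(&m, 0, sizeof(m));
        m.type = VHOST_IOTLB_BATCH_END;         /* step 3: set_map() runs here */
        iotlb_send(fd, &m);
}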

Vhost-vdpa may decide to batch the IOMMU/IOTLB updates in step 3 when
the vDPA device supports the set_map() op. This is useful for vDPA
devices that want to see all the mappings in order to tweak their own
DMA translation logic.

For vDPA devices that don't require set_map(), there is no behavior
change.

This capability is advertised via the VHOST_BACKEND_F_IOTLB_BATCH feature bit.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 36 ++++++++++++++++++++++++--------
include/uapi/linux/vhost.h | 2 ++
include/uapi/linux/vhost_types.h | 11 ++++++++++
3 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 2f0310d1b914..0a143615e69d 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -27,7 +27,9 @@
#include "vhost.h"

enum {
- VHOST_VDPA_BACKEND_FEATURES = (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2)
+ VHOST_VDPA_BACKEND_FEATURES =
+ (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2) |
+ (1ULL << VHOST_BACKEND_F_IOTLB_BATCH),
};

/* Currently, only network backend w/o multiqueue is supported. */
@@ -48,6 +50,7 @@ struct vhost_vdpa {
int virtio_id;
int minor;
struct eventfd_ctx *config_ctx;
+ int in_batch;
};

static DEFINE_IDA(vhost_vdpa_ida);
@@ -140,6 +143,7 @@ static void vhost_vdpa_reset(struct vhost_vdpa *v)
struct vdpa_device *vdpa = v->vdpa;

vdpa_reset(vdpa);
+ v->in_batch = 0;
}

static long vhost_vdpa_get_device_id(struct vhost_vdpa *v, u8 __user *argp)
@@ -563,13 +567,15 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
if (r)
return r;

- if (ops->dma_map)
+ if (ops->dma_map) {
r = ops->dma_map(vdpa, iova, size, pa, perm);
- else if (ops->set_map)
- r = ops->set_map(vdpa, dev->iotlb);
- else
+ } else if (ops->set_map) {
+ if (!v->in_batch)
+ r = ops->set_map(vdpa, dev->iotlb);
+ } else {
r = iommu_map(v->domain, iova, pa, size,
perm_to_iommu_flags(perm));
+ }

return r;
}
@@ -582,12 +588,14 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)

vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);

- if (ops->dma_map)
+ if (ops->dma_map) {
ops->dma_unmap(vdpa, iova, size);
- else if (ops->set_map)
- ops->set_map(vdpa, dev->iotlb);
- else
+ } else if (ops->set_map) {
+ if (!v->in_batch)
+ ops->set_map(vdpa, dev->iotlb);
+ } else {
iommu_unmap(v->domain, iova, size);
+ }
}

static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
@@ -680,6 +688,8 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
struct vhost_iotlb_msg *msg)
{
struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
+ struct vdpa_device *vdpa = v->vdpa;
+ const struct vdpa_config_ops *ops = vdpa->config;
int r = 0;

r = vhost_dev_check_owner(dev);
@@ -693,6 +703,14 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
case VHOST_IOTLB_INVALIDATE:
vhost_vdpa_unmap(v, msg->iova, msg->size);
break;
+ case VHOST_IOTLB_BATCH_BEGIN:
+ v->in_batch = true;
+ break;
+ case VHOST_IOTLB_BATCH_END:
+ if (v->in_batch && ops->set_map)
+ ops->set_map(vdpa, dev->iotlb);
+ v->in_batch = false;
+ break;
default:
r = -EINVAL;
break;
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 5d9254e2a6b6..11a4948b6216 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -91,6 +91,8 @@

/* Use message type V2 */
#define VHOST_BACKEND_F_IOTLB_MSG_V2 0x1
+/* IOTLB can accept batching hints */
+#define VHOST_BACKEND_F_IOTLB_BATCH 0x2

#define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
#define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index 669457ce5c48..9a269a88a6ff 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -60,6 +60,17 @@ struct vhost_iotlb_msg {
#define VHOST_IOTLB_UPDATE 2
#define VHOST_IOTLB_INVALIDATE 3
#define VHOST_IOTLB_ACCESS_FAIL 4
+/*
+ * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying
+ * multiple mappings in one go: beginning with
+ * VHOST_IOTLB_BATCH_BEGIN, followed by any number of
+ * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END.
+ * When one of these two values is used as the message type, the rest
+ * of the fields in the message are ignored. There's no guarantee that
+ * these changes take place automatically in the device.
+ */
+#define VHOST_IOTLB_BATCH_BEGIN 5
+#define VHOST_IOTLB_BATCH_END 6
__u8 type;
};

--
2.26.0

2020-08-04 16:26:35

by Eli Cohen

Subject: [PATCH V4 linux-next 01/12] vhost-vdpa: refine ioctl pre-processing

From: Jason Wang <[email protected]>

Switch to a 'switch' statement to make the code easier to extend.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/vdpa.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 16e0ffb115be..563ae6204052 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -377,15 +377,16 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
idx = array_index_nospec(idx, v->nvqs);
vq = &v->vqs[idx];

- if (cmd == VHOST_VDPA_SET_VRING_ENABLE) {
+ switch (cmd) {
+ case VHOST_VDPA_SET_VRING_ENABLE:
if (copy_from_user(&s, argp, sizeof(s)))
return -EFAULT;
ops->set_vq_ready(vdpa, idx, s.num);
return 0;
- }
-
- if (cmd == VHOST_GET_VRING_BASE)
+ case VHOST_GET_VRING_BASE:
vq->last_avail_idx = ops->get_vq_state(v->vdpa, idx);
+ break;
+ }

r = vhost_vring_ioctl(&v->vdev, cmd, argp);
if (r)
--
2.26.0

2020-08-04 16:27:17

by Eli Cohen

Subject: [PATCH V4 linux-next 09/12] vdpa/mlx5: Add hardware descriptive header file

Keep all vdpa-related hardware definitions in this file.

Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Eli Cohen <[email protected]>
---
drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h | 168 +++++++++++++++++++++++++
1 file changed, 168 insertions(+)
create mode 100644 drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h

diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h b/drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h
new file mode 100644
index 000000000000..f6f57a29b38e
--- /dev/null
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h
@@ -0,0 +1,168 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#ifndef __MLX5_VDPA_IFC_H_
+#define __MLX5_VDPA_IFC_H_
+
+#include <linux/mlx5/mlx5_ifc.h>
+
+enum {
+ MLX5_VIRTIO_Q_EVENT_MODE_NO_MSIX_MODE = 0x0,
+ MLX5_VIRTIO_Q_EVENT_MODE_QP_MODE = 0x1,
+ MLX5_VIRTIO_Q_EVENT_MODE_MSIX_MODE = 0x2,
+};
+
+enum {
+ MLX5_VIRTIO_EMULATION_CAP_VIRTIO_QUEUE_TYPE_SPLIT = 0x1,
+ MLX5_VIRTIO_EMULATION_CAP_VIRTIO_QUEUE_TYPE_PACKED = 0x2,
+};
+
+enum {
+ MLX5_VIRTIO_EMULATION_VIRTIO_QUEUE_TYPE_SPLIT = 0,
+ MLX5_VIRTIO_EMULATION_VIRTIO_QUEUE_TYPE_PACKED = 1,
+};
+
+struct mlx5_ifc_virtio_q_bits {
+ u8 virtio_q_type[0x8];
+ u8 reserved_at_8[0x5];
+ u8 event_mode[0x3];
+ u8 queue_index[0x10];
+
+ u8 full_emulation[0x1];
+ u8 virtio_version_1_0[0x1];
+ u8 reserved_at_22[0x2];
+ u8 offload_type[0x4];
+ u8 event_qpn_or_msix[0x18];
+
+ u8 doorbell_stride_index[0x10];
+ u8 queue_size[0x10];
+
+ u8 device_emulation_id[0x20];
+
+ u8 desc_addr[0x40];
+
+ u8 used_addr[0x40];
+
+ u8 available_addr[0x40];
+
+ u8 virtio_q_mkey[0x20];
+
+ u8 max_tunnel_desc[0x10];
+ u8 reserved_at_170[0x8];
+ u8 error_type[0x8];
+
+ u8 umem_1_id[0x20];
+
+ u8 umem_1_size[0x20];
+
+ u8 umem_1_offset[0x40];
+
+ u8 umem_2_id[0x20];
+
+ u8 umem_2_size[0x20];
+
+ u8 umem_2_offset[0x40];
+
+ u8 umem_3_id[0x20];
+
+ u8 umem_3_size[0x20];
+
+ u8 umem_3_offset[0x40];
+
+ u8 counter_set_id[0x20];
+
+ u8 reserved_at_320[0x8];
+ u8 pd[0x18];
+
+ u8 reserved_at_340[0xc0];
+};
+
+struct mlx5_ifc_virtio_net_q_object_bits {
+ u8 modify_field_select[0x40];
+
+ u8 reserved_at_40[0x20];
+
+ u8 vhca_id[0x10];
+ u8 reserved_at_70[0x10];
+
+ u8 queue_feature_bit_mask_12_3[0xa];
+ u8 dirty_bitmap_dump_enable[0x1];
+ u8 vhost_log_page[0x5];
+ u8 reserved_at_90[0xc];
+ u8 state[0x4];
+
+ u8 reserved_at_a0[0x5];
+ u8 queue_feature_bit_mask_2_0[0x3];
+ u8 tisn_or_qpn[0x18];
+
+ u8 dirty_bitmap_mkey[0x20];
+
+ u8 dirty_bitmap_size[0x20];
+
+ u8 dirty_bitmap_addr[0x40];
+
+ u8 hw_available_index[0x10];
+ u8 hw_used_index[0x10];
+
+ u8 reserved_at_160[0xa0];
+
+ struct mlx5_ifc_virtio_q_bits virtio_q_context;
+};
+
+struct mlx5_ifc_create_virtio_net_q_in_bits {
+ struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+
+ struct mlx5_ifc_virtio_net_q_object_bits obj_context;
+};
+
+struct mlx5_ifc_create_virtio_net_q_out_bits {
+ struct mlx5_ifc_general_obj_out_cmd_hdr_bits general_obj_out_cmd_hdr;
+};
+
+struct mlx5_ifc_destroy_virtio_net_q_in_bits {
+ struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_out_cmd_hdr;
+};
+
+struct mlx5_ifc_destroy_virtio_net_q_out_bits {
+ struct mlx5_ifc_general_obj_out_cmd_hdr_bits general_obj_out_cmd_hdr;
+};
+
+struct mlx5_ifc_query_virtio_net_q_in_bits {
+ struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+};
+
+struct mlx5_ifc_query_virtio_net_q_out_bits {
+ struct mlx5_ifc_general_obj_out_cmd_hdr_bits general_obj_out_cmd_hdr;
+
+ struct mlx5_ifc_virtio_net_q_object_bits obj_context;
+};
+
+enum {
+ MLX5_VIRTQ_MODIFY_MASK_STATE = (u64)1 << 0,
+ MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS = (u64)1 << 3,
+ MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
+};
+
+enum {
+ MLX5_VIRTIO_NET_Q_OBJECT_STATE_INIT = 0x0,
+ MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY = 0x1,
+ MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND = 0x2,
+ MLX5_VIRTIO_NET_Q_OBJECT_STATE_ERR = 0x3,
+};
+
+enum {
+ MLX5_RQTC_LIST_Q_TYPE_RQ = 0x0,
+ MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q = 0x1,
+};
+
+struct mlx5_ifc_modify_virtio_net_q_in_bits {
+ struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+
+ struct mlx5_ifc_virtio_net_q_object_bits obj_context;
+};
+
+struct mlx5_ifc_modify_virtio_net_q_out_bits {
+ struct mlx5_ifc_general_obj_out_cmd_hdr_bits general_obj_out_cmd_hdr;
+};
+
+#endif /* __MLX5_VDPA_IFC_H_ */
--
2.26.0

2020-08-04 16:27:38

by Eli Cohen

Subject: [PATCH V4 linux-next 02/12] vhost: generalize backend features setting/getting

From: Jason Wang <[email protected]>

Move the backend features setting/getting from net.c to vhost.c so it
can be reused by vhost-vdpa.

Signed-off-by: Jason Wang <[email protected]>
---
drivers/vhost/net.c | 18 ++----------------
drivers/vhost/vhost.c | 15 +++++++++++++++
drivers/vhost/vhost.h | 2 ++
3 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8e0921d3805d..bfbbe5c876f9 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1615,21 +1615,6 @@ static long vhost_net_reset_owner(struct vhost_net *n)
return err;
}

-static int vhost_net_set_backend_features(struct vhost_net *n, u64 features)
-{
- int i;
-
- mutex_lock(&n->dev.mutex);
- for (i = 0; i < VHOST_NET_VQ_MAX; ++i) {
- mutex_lock(&n->vqs[i].vq.mutex);
- n->vqs[i].vq.acked_backend_features = features;
- mutex_unlock(&n->vqs[i].vq.mutex);
- }
- mutex_unlock(&n->dev.mutex);
-
- return 0;
-}
-
static int vhost_net_set_features(struct vhost_net *n, u64 features)
{
size_t vhost_hlen, sock_hlen, hdr_len;
@@ -1730,7 +1715,8 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
return -EFAULT;
if (features & ~VHOST_NET_BACKEND_FEATURES)
return -EOPNOTSUPP;
- return vhost_net_set_backend_features(n, features);
+ vhost_set_backend_features(&n->dev, features);
+ return 0;
case VHOST_RESET_OWNER:
return vhost_net_reset_owner(n);
case VHOST_SET_OWNER:
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 8f622064b3e8..5e5cc3dd983e 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2591,6 +2591,21 @@ struct vhost_msg_node *vhost_dequeue_msg(struct vhost_dev *dev,
}
EXPORT_SYMBOL_GPL(vhost_dequeue_msg);

+void vhost_set_backend_features(struct vhost_dev *dev, u64 features)
+{
+ struct vhost_virtqueue *vq;
+ int i;
+
+ mutex_lock(&dev->mutex);
+ for (i = 0; i < dev->nvqs; ++i) {
+ vq = dev->vqs[i];
+ mutex_lock(&vq->mutex);
+ vq->acked_backend_features = features;
+ mutex_unlock(&vq->mutex);
+ }
+ mutex_unlock(&dev->mutex);
+}
+EXPORT_SYMBOL_GPL(vhost_set_backend_features);

static int __init vhost_init(void)
{
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 38eb1aa3b68d..9032d3c2a9f4 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -214,6 +214,8 @@ void vhost_enqueue_msg(struct vhost_dev *dev,
struct vhost_msg_node *node);
struct vhost_msg_node *vhost_dequeue_msg(struct vhost_dev *dev,
struct list_head *head);
+void vhost_set_backend_features(struct vhost_dev *dev, u64 features);
+
__poll_t vhost_chr_poll(struct file *file, struct vhost_dev *dev,
poll_table *wait);
ssize_t vhost_chr_read_iter(struct vhost_dev *dev, struct iov_iter *to,
--
2.26.0

2020-08-04 16:27:45

by Eli Cohen

Subject: [PATCH V4 linux-next 08/12] vdpa: Modify get_vq_state() to return error code

Modify get_vq_state() so that it returns an error code. With hardware
acceleration, the available index may have to be retrieved from the
device, an operation that can fail.
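
To illustrate why the int return matters, a hardware backend might
implement the op roughly as follows; query_hw_avail_idx() is a
hypothetical firmware query, not an existing function:

static int hw_get_vq_state(struct vdpa_device *vdev, u16 idx,
                           struct vdpa_vq_state *state)
{
        u16 avail;
        int err;

        /* hypothetical firmware query that may fail */
        err = query_hw_avail_idx(vdev, idx, &avail);
        if (err)
                return err;     /* propagated up through VHOST_GET_VRING_BASE */

        state->avail_index = avail;
        return 0;
}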

Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Eli Cohen <[email protected]>
---
drivers/vdpa/ifcvf/ifcvf_main.c | 5 +++--
drivers/vdpa/vdpa_sim/vdpa_sim.c | 5 +++--
drivers/vhost/vdpa.c | 5 ++++-
include/linux/vdpa.h | 4 ++--
4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index dc311e972b9e..076d7ac5e723 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -237,12 +237,13 @@ static u16 ifcvf_vdpa_get_vq_num_max(struct vdpa_device *vdpa_dev)
return IFCVF_QUEUE_MAX;
}

-static void ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
- struct vdpa_vq_state *state)
+static int ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
+ struct vdpa_vq_state *state)
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);

state->avail_index = ifcvf_get_vq_state(vf, qid);
+ return 0;
}

static int ifcvf_vdpa_set_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index f1c351d5959b..c68741363643 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -464,14 +464,15 @@ static int vdpasim_set_vq_state(struct vdpa_device *vdpa, u16 idx,
return 0;
}

-static void vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx,
- struct vdpa_vq_state *state)
+static int vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx,
+ struct vdpa_vq_state *state)
{
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
struct vringh *vrh = &vq->vring;

state->avail_index = vrh->last_avail_idx;
+ return 0;
}

static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index c2e1e2d55084..a0b7c91948e1 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -392,7 +392,10 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
ops->set_vq_ready(vdpa, idx, s.num);
return 0;
case VHOST_GET_VRING_BASE:
- ops->get_vq_state(v->vdpa, idx, &vq_state);
+ r = ops->get_vq_state(v->vdpa, idx, &vq_state);
+ if (r)
+ return r;
+
vq->last_avail_idx = vq_state.avail_index;
break;
case VHOST_GET_BACKEND_FEATURES:
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 8f412620071d..fd6e560d70f9 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -192,8 +192,8 @@ struct vdpa_config_ops {
bool (*get_vq_ready)(struct vdpa_device *vdev, u16 idx);
int (*set_vq_state)(struct vdpa_device *vdev, u16 idx,
const struct vdpa_vq_state *state);
- void (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
- struct vdpa_vq_state *state);
+ int (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
+ struct vdpa_vq_state *state);
struct vdpa_notification_area
(*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
--
2.26.0

2020-08-04 16:27:58

by Eli Cohen

Subject: [PATCH V4 linux-next 11/12] vdpa/mlx5: Add shared memory registration code

Add code to support registering an address space region for the
device. The virtio driver can run as any of the following:
1. A guest virtio driver
2. A userspace virtio driver on the host
3. A kernel virtio driver on the host

In any of these cases, a memory key object is required to provide the
device access to memory.

This code will be shared by the network and block driver implementations.
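
For orientation, a sketch of how a set_map() implementation might
consume the API added here; the rebuild-on-change flow is an
assumption modeled on the net driver introduced later in this series:

static int example_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb)
{
        bool change_map;
        int err;

        err = mlx5_vdpa_handle_set_map(mvdev, iotlb, &change_map);
        if (err)
                return err;

        if (change_map) {
                /* an existing mkey tree: tear it down and rebuild from iotlb */
                mlx5_vdpa_destroy_mr(mvdev);
                err = mlx5_vdpa_create_mr(mvdev, iotlb);
        }
        return err;
}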

Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Eli Cohen <[email protected]>
---
drivers/vdpa/mlx5/Makefile | 2 +-
drivers/vdpa/mlx5/core/Makefile | 1 -
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 34 ++
drivers/vdpa/mlx5/core/mr.c | 484 +++++++++++++++++++++++++++++
drivers/vdpa/mlx5/core/resources.c | 3 +
5 files changed, 522 insertions(+), 2 deletions(-)
delete mode 100644 drivers/vdpa/mlx5/core/Makefile
create mode 100644 drivers/vdpa/mlx5/core/mr.c

diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
index d552abb1d126..b347c62032ea 100644
--- a/drivers/vdpa/mlx5/Makefile
+++ b/drivers/vdpa/mlx5/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_MLX5_VDPA) += core/resources.o
+obj-$(CONFIG_MLX5_VDPA) += core/resources.o core/mr.o
diff --git a/drivers/vdpa/mlx5/core/Makefile b/drivers/vdpa/mlx5/core/Makefile
deleted file mode 100644
index 7070f8c8680d..000000000000
--- a/drivers/vdpa/mlx5/core/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-obj-y += resources.o
diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index f3571c8b257e..5c92a576edae 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -7,6 +7,31 @@
#include <linux/vdpa.h>
#include <linux/mlx5/driver.h>

+struct mlx5_vdpa_direct_mr {
+ u64 start;
+ u64 end;
+ u32 perm;
+ struct mlx5_core_mkey mr;
+ struct sg_table sg_head;
+ int log_size;
+ int nsg;
+ struct list_head list;
+ u64 offset;
+};
+
+struct mlx5_vdpa_mr {
+ struct mlx5_core_mkey mkey;
+
+ /* list of direct MRs descendants of this indirect mr */
+ struct list_head head;
+ unsigned long num_directs;
+ unsigned long num_klms;
+ bool initialized;
+
+ /* serialize mkey creation and destruction */
+ struct mutex mkey_mtx;
+};
+
struct mlx5_vdpa_resources {
u32 pdn;
struct mlx5_uars_page *uar;
@@ -26,6 +51,8 @@ struct mlx5_vdpa_dev {
u8 status;
u32 max_vqs;
u32 generation;
+
+ struct mlx5_vdpa_mr mr;
};

int mlx5_vdpa_alloc_pd(struct mlx5_vdpa_dev *dev, u32 *pdn, u16 uid);
@@ -41,6 +68,13 @@ int mlx5_vdpa_alloc_transport_domain(struct mlx5_vdpa_dev *mvdev, u32 *tdn);
void mlx5_vdpa_dealloc_transport_domain(struct mlx5_vdpa_dev *mvdev, u32 tdn);
int mlx5_vdpa_alloc_resources(struct mlx5_vdpa_dev *mvdev);
void mlx5_vdpa_free_resources(struct mlx5_vdpa_dev *mvdev);
+int mlx5_vdpa_create_mkey(struct mlx5_vdpa_dev *mvdev, struct mlx5_core_mkey *mkey, u32 *in,
+ int inlen);
+int mlx5_vdpa_destroy_mkey(struct mlx5_vdpa_dev *mvdev, struct mlx5_core_mkey *mkey);
+int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
+ bool *change_map);
+int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb);
+void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev);

#define mlx5_vdpa_warn(__dev, format, ...) \
dev_warn((__dev)->mdev->device, "%s:%d:(pid %d) warning: " format, __func__, __LINE__, \
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
new file mode 100644
index 000000000000..084698975c47
--- /dev/null
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -0,0 +1,484 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#include <linux/vdpa.h>
+#include <linux/gcd.h>
+#include <linux/string.h>
+#include <linux/mlx5/qp.h>
+#include "mlx5_vdpa.h"
+
+/* DIV_ROUND_UP where the divisor is a power of 2, given by its log base 2 value */
+#define MLX5_DIV_ROUND_UP_POW2(_n, _s) \
+({ \
+ u64 __s = _s; \
+ u64 _res; \
+ _res = (((_n) + (1 << (__s)) - 1) >> (__s)); \
+ _res; \
+})
+
+static int get_octo_len(u64 len, int page_shift)
+{
+ u64 page_size = 1ULL << page_shift;
+ int npages;
+
+ npages = ALIGN(len, page_size) >> page_shift;
+ return (npages + 1) / 2;
+}
+
+static void fill_sg(struct mlx5_vdpa_direct_mr *mr, void *in)
+{
+ struct scatterlist *sg;
+ __be64 *pas;
+ int i;
+
+ pas = MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt);
+ for_each_sg(mr->sg_head.sgl, sg, mr->nsg, i)
+ (*pas) = cpu_to_be64(sg_dma_address(sg));
+}
+
+static void mlx5_set_access_mode(void *mkc, int mode)
+{
+ MLX5_SET(mkc, mkc, access_mode_1_0, mode & 0x3);
+ MLX5_SET(mkc, mkc, access_mode_4_2, mode >> 2);
+}
+
+static void populate_mtts(struct mlx5_vdpa_direct_mr *mr, __be64 *mtt)
+{
+ struct scatterlist *sg;
+ int i;
+
+ for_each_sg(mr->sg_head.sgl, sg, mr->nsg, i)
+ mtt[i] = cpu_to_be64(sg_dma_address(sg));
+}
+
+static int create_direct_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_direct_mr *mr)
+{
+ int inlen;
+ void *mkc;
+ void *in;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + roundup(MLX5_ST_SZ_BYTES(mtt) * mr->nsg, 16);
+ in = kvzalloc(inlen, GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ MLX5_SET(create_mkey_in, in, uid, mvdev->res.uid);
+ fill_sg(mr, in);
+ mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);
+ MLX5_SET(mkc, mkc, lw, !!(mr->perm & VHOST_MAP_WO));
+ MLX5_SET(mkc, mkc, lr, !!(mr->perm & VHOST_MAP_RO));
+ mlx5_set_access_mode(mkc, MLX5_MKC_ACCESS_MODE_MTT);
+ MLX5_SET(mkc, mkc, qpn, 0xffffff);
+ MLX5_SET(mkc, mkc, pd, mvdev->res.pdn);
+ MLX5_SET64(mkc, mkc, start_addr, mr->offset);
+ MLX5_SET64(mkc, mkc, len, mr->end - mr->start);
+ MLX5_SET(mkc, mkc, log_page_size, mr->log_size);
+ MLX5_SET(mkc, mkc, translations_octword_size,
+ get_octo_len(mr->end - mr->start, mr->log_size));
+ MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
+ get_octo_len(mr->end - mr->start, mr->log_size));
+ populate_mtts(mr, MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt));
+ err = mlx5_vdpa_create_mkey(mvdev, &mr->mr, in, inlen);
+ kvfree(in);
+ if (err) {
+ mlx5_vdpa_warn(mvdev, "Failed to create direct MR\n");
+ return err;
+ }
+
+ return 0;
+}
+
+static void destroy_direct_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_direct_mr *mr)
+{
+ mlx5_vdpa_destroy_mkey(mvdev, &mr->mr);
+}
+
+static u64 map_start(struct vhost_iotlb_map *map, struct mlx5_vdpa_direct_mr *mr)
+{
+ return max_t(u64, map->start, mr->start);
+}
+
+static u64 map_end(struct vhost_iotlb_map *map, struct mlx5_vdpa_direct_mr *mr)
+{
+ return min_t(u64, map->last + 1, mr->end);
+}
+
+static u64 maplen(struct vhost_iotlb_map *map, struct mlx5_vdpa_direct_mr *mr)
+{
+ return map_end(map, mr) - map_start(map, mr);
+}
+
+#define MLX5_VDPA_INVALID_START_ADDR ((u64)-1)
+#define MLX5_VDPA_INVALID_LEN ((u64)-1)
+
+static u64 indir_start_addr(struct mlx5_vdpa_mr *mkey)
+{
+ struct mlx5_vdpa_direct_mr *s;
+
+ s = list_first_entry_or_null(&mkey->head, struct mlx5_vdpa_direct_mr, list);
+ if (!s)
+ return MLX5_VDPA_INVALID_START_ADDR;
+
+ return s->start;
+}
+
+static u64 indir_len(struct mlx5_vdpa_mr *mkey)
+{
+ struct mlx5_vdpa_direct_mr *s;
+ struct mlx5_vdpa_direct_mr *e;
+
+ s = list_first_entry_or_null(&mkey->head, struct mlx5_vdpa_direct_mr, list);
+ if (!s)
+ return MLX5_VDPA_INVALID_LEN;
+
+ e = list_last_entry(&mkey->head, struct mlx5_vdpa_direct_mr, list);
+
+ return e->end - s->start;
+}
+
+#define LOG_MAX_KLM_SIZE 30
+#define MAX_KLM_SIZE BIT(LOG_MAX_KLM_SIZE)
+
+static u32 klm_bcount(u64 size)
+{
+ return (u32)size;
+}
+
+static void fill_indir(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mkey, void *in)
+{
+ struct mlx5_vdpa_direct_mr *dmr;
+ struct mlx5_klm *klmarr;
+ struct mlx5_klm *klm;
+ bool first = true;
+ u64 preve;
+ int i;
+
+ klmarr = MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt);
+ i = 0;
+ list_for_each_entry(dmr, &mkey->head, list) {
+again:
+ klm = &klmarr[i++];
+ if (first) {
+ preve = dmr->start;
+ first = false;
+ }
+
+ if (preve == dmr->start) {
+ klm->key = cpu_to_be32(dmr->mr.key);
+ klm->bcount = cpu_to_be32(klm_bcount(dmr->end - dmr->start));
+ preve = dmr->end;
+ } else {
+ klm->key = cpu_to_be32(mvdev->res.null_mkey);
+ klm->bcount = cpu_to_be32(klm_bcount(dmr->start - preve));
+ preve = dmr->start;
+ goto again;
+ }
+ }
+}
+
+static int klm_byte_size(int nklms)
+{
+ return 16 * ALIGN(nklms, 4);
+}
+
+static int create_indirect_key(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr)
+{
+ int inlen;
+ void *mkc;
+ void *in;
+ int err;
+ u64 start;
+ u64 len;
+
+ start = indir_start_addr(mr);
+ len = indir_len(mr);
+ if (start == MLX5_VDPA_INVALID_START_ADDR || len == MLX5_VDPA_INVALID_LEN)
+ return -EINVAL;
+
+ inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + klm_byte_size(mr->num_klms);
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ MLX5_SET(create_mkey_in, in, uid, mvdev->res.uid);
+ mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);
+ MLX5_SET(mkc, mkc, lw, 1);
+ MLX5_SET(mkc, mkc, lr, 1);
+ mlx5_set_access_mode(mkc, MLX5_MKC_ACCESS_MODE_KLMS);
+ MLX5_SET(mkc, mkc, qpn, 0xffffff);
+ MLX5_SET(mkc, mkc, pd, mvdev->res.pdn);
+ MLX5_SET64(mkc, mkc, start_addr, start);
+ MLX5_SET64(mkc, mkc, len, len);
+ MLX5_SET(mkc, mkc, translations_octword_size, klm_byte_size(mr->num_klms) / 16);
+ MLX5_SET(create_mkey_in, in, translations_octword_actual_size, mr->num_klms);
+ fill_indir(mvdev, mr, in);
+ err = mlx5_vdpa_create_mkey(mvdev, &mr->mkey, in, inlen);
+ kfree(in);
+ return err;
+}
+
+static void destroy_indirect_key(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mkey)
+{
+ mlx5_vdpa_destroy_mkey(mvdev, &mkey->mkey);
+}
+
+static int map_direct_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_direct_mr *mr,
+ struct vhost_iotlb *iotlb)
+{
+ struct vhost_iotlb_map *map;
+ unsigned long lgcd = 0;
+ int log_entity_size;
+ unsigned long size;
+ u64 start = 0;
+ int err;
+ struct page *pg;
+ unsigned int nsg;
+ int sglen;
+ u64 pa;
+ u64 paend;
+ struct scatterlist *sg;
+ struct device *dma = mvdev->mdev->device;
+ int ret;
+
+ for (map = vhost_iotlb_itree_first(iotlb, mr->start, mr->end - 1);
+ map; map = vhost_iotlb_itree_next(map, start, mr->end - 1)) {
+ size = maplen(map, mr);
+ lgcd = gcd(lgcd, size);
+ start += size;
+ }
+ log_entity_size = ilog2(lgcd);
+
+ sglen = 1 << log_entity_size;
+ nsg = MLX5_DIV_ROUND_UP_POW2(mr->end - mr->start, log_entity_size);
+
+ err = sg_alloc_table(&mr->sg_head, nsg, GFP_KERNEL);
+ if (err)
+ return err;
+
+ sg = mr->sg_head.sgl;
+ for (map = vhost_iotlb_itree_first(iotlb, mr->start, mr->end - 1);
+ map; map = vhost_iotlb_itree_next(map, mr->start, mr->end - 1)) {
+ paend = map->addr + maplen(map, mr);
+ for (pa = map->addr; pa < paend; pa += sglen) {
+ pg = pfn_to_page(__phys_to_pfn(pa));
+ if (!sg) {
+ mlx5_vdpa_warn(mvdev, "sg null. start 0x%llx, end 0x%llx\n",
+ map->start, map->last + 1);
+ err = -ENOMEM;
+ goto err_map;
+ }
+ sg_set_page(sg, pg, sglen, 0);
+ sg = sg_next(sg);
+ if (!sg)
+ goto done;
+ }
+ }
+done:
+ mr->log_size = log_entity_size;
+ mr->nsg = nsg;
+ ret = dma_map_sg_attrs(dma, mr->sg_head.sgl, mr->nsg, DMA_BIDIRECTIONAL, 0);
+ if (!ret)
+ goto err_map;
+
+ err = create_direct_mr(mvdev, mr);
+ if (err)
+ goto err_direct;
+
+ return 0;
+
+err_direct:
+ dma_unmap_sg_attrs(dma, mr->sg_head.sgl, mr->nsg, DMA_BIDIRECTIONAL, 0);
+err_map:
+ sg_free_table(&mr->sg_head);
+ return err;
+}
+
+static void unmap_direct_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_direct_mr *mr)
+{
+ struct device *dma = mvdev->mdev->device;
+
+ destroy_direct_mr(mvdev, mr);
+ dma_unmap_sg_attrs(dma, mr->sg_head.sgl, mr->nsg, DMA_BIDIRECTIONAL, 0);
+ sg_free_table(&mr->sg_head);
+}
+
+static int add_direct_chain(struct mlx5_vdpa_dev *mvdev, u64 start, u64 size, u8 perm,
+ struct vhost_iotlb *iotlb)
+{
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+ struct mlx5_vdpa_direct_mr *dmr;
+ struct mlx5_vdpa_direct_mr *n;
+ LIST_HEAD(tmp);
+ u64 st;
+ u64 sz;
+ int err;
+ int i = 0;
+
+ st = start;
+ while (size) {
+ sz = (u32)min_t(u64, MAX_KLM_SIZE, size);
+ dmr = kzalloc(sizeof(*dmr), GFP_KERNEL);
+ if (!dmr)
+ goto err_alloc;
+
+ dmr->start = st;
+ dmr->end = st + sz;
+ dmr->perm = perm;
+ err = map_direct_mr(mvdev, dmr, iotlb);
+ if (err) {
+ kfree(dmr);
+ goto err_alloc;
+ }
+
+ list_add_tail(&dmr->list, &tmp);
+ size -= sz;
+ mr->num_directs++;
+ mr->num_klms++;
+ st += sz;
+ i++;
+ }
+ list_splice_tail(&tmp, &mr->head);
+ return 0;
+
+err_alloc:
+ list_for_each_entry_safe(dmr, n, &mr->head, list) {
+ list_del_init(&dmr->list);
+ unmap_direct_mr(mvdev, dmr);
+ kfree(dmr);
+ }
+ return err;
+}
+
+/* The iotlb pointer contains a list of maps. Go over the maps, merging
+ * adjacent maps where possible, and create direct memory keys that provide
+ * the device access to memory. The direct mkeys are then referred to by the
+ * indirect memory key that provides access to the entire address space given
+ * by iotlb.
+ */
+static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb)
+{
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+ struct mlx5_vdpa_direct_mr *dmr;
+ struct mlx5_vdpa_direct_mr *n;
+ struct vhost_iotlb_map *map;
+ u32 pperm = U16_MAX;
+ u64 last = U64_MAX;
+ u64 ps = U64_MAX;
+ u64 pe = U64_MAX;
+ u64 start = 0;
+ int err = 0;
+ int nnuls;
+
+ if (mr->initialized)
+ return 0;
+
+ INIT_LIST_HEAD(&mr->head);
+ for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
+ map = vhost_iotlb_itree_next(map, start, last)) {
+ start = map->start;
+ if (pe == map->start && pperm == map->perm) {
+ pe = map->last + 1;
+ } else {
+ if (ps != U64_MAX) {
+ if (pe < map->start) {
+ /* We have a hole in the map. Check how
+ * many null keys are required to fill it.
+ */
+ nnuls = MLX5_DIV_ROUND_UP_POW2(map->start - pe,
+ LOG_MAX_KLM_SIZE);
+ mr->num_klms += nnuls;
+ }
+ err = add_direct_chain(mvdev, ps, pe - ps, pperm, iotlb);
+ if (err)
+ goto err_chain;
+ }
+ ps = map->start;
+ pe = map->last + 1;
+ pperm = map->perm;
+ }
+ }
+ err = add_direct_chain(mvdev, ps, pe - ps, pperm, iotlb);
+ if (err)
+ goto err_chain;
+
+	/* Create the memory key that defines the guest's address space. This
+ * memory key refers to the direct keys that contain the MTT
+ * translations
+ */
+ err = create_indirect_key(mvdev, mr);
+ if (err)
+ goto err_chain;
+
+ mr->initialized = true;
+ return 0;
+
+err_chain:
+ list_for_each_entry_safe_reverse(dmr, n, &mr->head, list) {
+ list_del_init(&dmr->list);
+ unmap_direct_mr(mvdev, dmr);
+ kfree(dmr);
+ }
+ return err;
+}
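
The ps/pe/pperm scan above is easier to see with a plain array in place of the interval tree. A minimal standalone sketch, where toy_map and emit() are made-up stand-ins for vhost_iotlb_map and add_direct_chain() and not part of the driver:

#include <stdint.h>
#include <stdbool.h>

struct toy_map {
	uint64_t start;
	uint64_t last;	/* inclusive, like vhost_iotlb_map */
	uint8_t perm;
};

/* Emit maximal contiguous, same-permission ranges from a sorted array. */
static void scan_ranges(const struct toy_map *maps, int n,
			void (*emit)(uint64_t start, uint64_t len, uint8_t perm))
{
	uint64_t ps = 0, pe = 0;
	uint8_t pperm = 0;
	bool open = false;
	int i;

	for (i = 0; i < n; i++) {
		if (open && pe == maps[i].start && pperm == maps[i].perm) {
			pe = maps[i].last + 1;	/* extend the current range */
			continue;
		}
		if (open)
			emit(ps, pe - ps, pperm);	/* flush the completed range */
		ps = maps[i].start;
		pe = maps[i].last + 1;
		pperm = maps[i].perm;
		open = true;
	}
	if (open)
		emit(ps, pe - ps, pperm);
}

Holes between emitted ranges are what the nnuls computation above accounts for with null klm entries.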
+
+int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb)
+{
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+ int err;
+
+ mutex_lock(&mr->mkey_mtx);
+ err = _mlx5_vdpa_create_mr(mvdev, iotlb);
+ mutex_unlock(&mr->mkey_mtx);
+ return err;
+}
+
+void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
+{
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+ struct mlx5_vdpa_direct_mr *dmr;
+ struct mlx5_vdpa_direct_mr *n;
+
+ mutex_lock(&mr->mkey_mtx);
+ if (!mr->initialized)
+ goto out;
+
+ destroy_indirect_key(mvdev, mr);
+ list_for_each_entry_safe_reverse(dmr, n, &mr->head, list) {
+ list_del_init(&dmr->list);
+ unmap_direct_mr(mvdev, dmr);
+ kfree(dmr);
+ }
+ memset(mr, 0, sizeof(*mr));
+ mr->initialized = false;
+out:
+ mutex_unlock(&mr->mkey_mtx);
+}
+
+static bool map_empty(struct vhost_iotlb *iotlb)
+{
+ return !vhost_iotlb_itree_first(iotlb, 0, U64_MAX);
+}
+
+int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
+ bool *change_map)
+{
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+ int err;
+
+ *change_map = false;
+ if (map_empty(iotlb)) {
+ mlx5_vdpa_destroy_mr(mvdev);
+ return 0;
+ }
+ mutex_lock(&mr->mkey_mtx);
+ if (mr->initialized) {
+ mlx5_vdpa_info(mvdev, "memory map update\n");
+ *change_map = true;
+ }
+ if (!*change_map)
+ err = _mlx5_vdpa_create_mr(mvdev, iotlb);
+ mutex_unlock(&mr->mkey_mtx);
+
+ return err;
+}
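
Note the two outcomes here: the first call creates the mkey hierarchy in place, while a call that finds mr->initialized set only raises *change_map and leaves the rebuild to the caller (see mlx5_vdpa_change_map() in the net driver later in this series).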
diff --git a/drivers/vdpa/mlx5/core/resources.c b/drivers/vdpa/mlx5/core/resources.c
index 6c6552b7e9b5..96e6421c5d1c 100644
--- a/drivers/vdpa/mlx5/core/resources.c
+++ b/drivers/vdpa/mlx5/core/resources.c
@@ -227,6 +227,7 @@ int mlx5_vdpa_alloc_resources(struct mlx5_vdpa_dev *mvdev)
mlx5_vdpa_warn(mvdev, "resources already allocated\n");
return -EINVAL;
}
+ mutex_init(&mvdev->mr.mkey_mtx);
res->uar = mlx5_get_uars_page(mdev);
if (IS_ERR(res->uar)) {
err = PTR_ERR(res->uar);
@@ -262,6 +263,7 @@ int mlx5_vdpa_alloc_resources(struct mlx5_vdpa_dev *mvdev)
err_uctx:
mlx5_put_uars_page(mdev, res->uar);
err_uars:
+ mutex_destroy(&mvdev->mr.mkey_mtx);
return err;
}

@@ -277,5 +279,6 @@ void mlx5_vdpa_free_resources(struct mlx5_vdpa_dev *mvdev)
dealloc_pd(mvdev, res->pdn, res->uid);
destroy_uctx(mvdev, res->uid);
mlx5_put_uars_page(mvdev->mdev, res->uar);
+ mutex_destroy(&mvdev->mr.mkey_mtx);
res->valid = false;
}
--
2.26.0

2020-08-04 16:48:43

by Eli Cohen

[permalink] [raw]
Subject: [PATCH V4 linux-next 12/12] vdpa/mlx5: Add VDPA driver for supported mlx5 devices

Add a front end VDPA driver that registers in the VDPA bus and provides
networking to a guest. The VDPA driver creates the necessary resources
on the VF it is driving such that the data path will be offloaded.

Notifications are currently relayed through the driver.

Currently, only VFs are supported. In subsequent patches we will have
devlink support to control which VF is used for VDPA and which function
is used for regular networking.

Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Eli Cohen <[email protected]>
---
drivers/vdpa/Kconfig | 10 +
drivers/vdpa/mlx5/Makefile | 5 +-
drivers/vdpa/mlx5/core/mr.c | 2 +-
drivers/vdpa/mlx5/net/main.c | 76 ++
drivers/vdpa/mlx5/net/mlx5_vnet.c | 1965 +++++++++++++++++++++++++++++
drivers/vdpa/mlx5/net/mlx5_vnet.h | 24 +
6 files changed, 2080 insertions(+), 2 deletions(-)
create mode 100644 drivers/vdpa/mlx5/net/main.c
create mode 100644 drivers/vdpa/mlx5/net/mlx5_vnet.c
create mode 100644 drivers/vdpa/mlx5/net/mlx5_vnet.h

diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 7cb84f82feba..a8c7607fdc90 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -37,4 +37,14 @@ config MLX5_VDPA
common for all types of VDPA drivers. The following drivers are planned:
net, block.

+config MLX5_VDPA_NET
+ tristate "vDPA driver for ConnectX devices"
+ depends on MLX5_VDPA
+ default n
+ help
+ VDPA network driver for ConnectX6 and newer. Provides offloading
+ of virtio net datapath such that descriptors put on the ring will
+ be executed by the hardware. It also supports a variety of stateless
+ offloads depending on the actual device used and firmware version.
+
endif # VDPA
diff --git a/drivers/vdpa/mlx5/Makefile b/drivers/vdpa/mlx5/Makefile
index b347c62032ea..89a5bededc9f 100644
--- a/drivers/vdpa/mlx5/Makefile
+++ b/drivers/vdpa/mlx5/Makefile
@@ -1 +1,4 @@
-obj-$(CONFIG_MLX5_VDPA) += core/resources.o core/mr.o
+subdir-ccflags-y += -I$(srctree)/drivers/vdpa/mlx5/core
+
+obj-$(CONFIG_MLX5_VDPA_NET) += mlx5_vdpa.o
+mlx5_vdpa-$(CONFIG_MLX5_VDPA_NET) += net/main.o net/mlx5_vnet.o core/resources.o core/mr.o
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index 084698975c47..f5dec0274133 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -464,7 +464,7 @@ int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *io
bool *change_map)
{
struct mlx5_vdpa_mr *mr = &mvdev->mr;
- int err;
+ int err = 0;

*change_map = false;
if (map_empty(iotlb)) {
diff --git a/drivers/vdpa/mlx5/net/main.c b/drivers/vdpa/mlx5/net/main.c
new file mode 100644
index 000000000000..838cd98386ff
--- /dev/null
+++ b/drivers/vdpa/mlx5/net/main.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#include <linux/module.h>
+#include <linux/mlx5/driver.h>
+#include <linux/mlx5/device.h>
+#include "mlx5_vdpa_ifc.h"
+#include "mlx5_vnet.h"
+
+MODULE_AUTHOR("Eli Cohen <[email protected]>");
+MODULE_DESCRIPTION("Mellanox VDPA driver");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static bool required_caps_supported(struct mlx5_core_dev *mdev)
+{
+ u8 event_mode;
+ u64 got;
+
+ got = MLX5_CAP_GEN_64(mdev, general_obj_types);
+
+ if (!(got & MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q))
+ return false;
+
+ event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
+ if (!(event_mode & MLX5_VIRTIO_Q_EVENT_MODE_QP_MODE))
+ return false;
+
+ if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
+ return false;
+
+ return true;
+}
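
All three checks read the VIRTIO emulation capabilities defined in mlx5_vdpa_ifc.h: the device must expose virtio net queue objects, support the QP-based event mode used for the notification channel, and advertise an ethernet frame offload type.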
+
+static void *mlx5_vdpa_add(struct mlx5_core_dev *mdev)
+{
+ struct mlx5_vdpa_dev *vdev;
+
+ if (mlx5_core_is_pf(mdev))
+ return NULL;
+
+ if (!required_caps_supported(mdev)) {
+ dev_info(mdev->device, "virtio net emulation not supported\n");
+ return NULL;
+ }
+ vdev = mlx5_vdpa_add_dev(mdev);
+ if (IS_ERR(vdev))
+ return NULL;
+
+ return vdev;
+}
+
+static void mlx5_vdpa_remove(struct mlx5_core_dev *mdev, void *context)
+{
+ struct mlx5_vdpa_dev *vdev = context;
+
+ mlx5_vdpa_remove_dev(vdev);
+}
+
+static struct mlx5_interface mlx5_vdpa_interface = {
+ .add = mlx5_vdpa_add,
+ .remove = mlx5_vdpa_remove,
+ .protocol = MLX5_INTERFACE_PROTOCOL_VDPA,
+};
+
+static int __init mlx5_vdpa_init(void)
+{
+ return mlx5_register_interface(&mlx5_vdpa_interface);
+}
+
+static void __exit mlx5_vdpa_exit(void)
+{
+ mlx5_unregister_interface(&mlx5_vdpa_interface);
+}
+
+module_init(mlx5_vdpa_init);
+module_exit(mlx5_vdpa_exit);
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
new file mode 100644
index 000000000000..3ec44a4f0e45
--- /dev/null
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -0,0 +1,1965 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#include <linux/vdpa.h>
+#include <uapi/linux/virtio_ids.h>
+#include <linux/virtio_config.h>
+#include <linux/mlx5/qp.h>
+#include <linux/mlx5/device.h>
+#include <linux/mlx5/vport.h>
+#include <linux/mlx5/fs.h>
+#include "mlx5_vnet.h"
+#include "mlx5_vdpa_ifc.h"
+#include "mlx5_vdpa.h"
+
+#define to_mvdev(__vdev) container_of((__vdev), struct mlx5_vdpa_dev, vdev)
+
+#define VALID_FEATURES_MASK \
+	(BIT_ULL(VIRTIO_NET_F_CSUM) | BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \
+	 BIT_ULL(VIRTIO_NET_F_CTRL_GUEST_OFFLOADS) | BIT_ULL(VIRTIO_NET_F_MTU) | \
+	 BIT_ULL(VIRTIO_NET_F_MAC) | BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \
+	 BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | BIT_ULL(VIRTIO_NET_F_GUEST_ECN) | \
+	 BIT_ULL(VIRTIO_NET_F_GUEST_UFO) | BIT_ULL(VIRTIO_NET_F_HOST_TSO4) | \
+	 BIT_ULL(VIRTIO_NET_F_HOST_TSO6) | BIT_ULL(VIRTIO_NET_F_HOST_ECN) | \
+	 BIT_ULL(VIRTIO_NET_F_HOST_UFO) | BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) | \
+	 BIT_ULL(VIRTIO_NET_F_STATUS) | BIT_ULL(VIRTIO_NET_F_CTRL_VQ) | \
+	 BIT_ULL(VIRTIO_NET_F_CTRL_RX) | BIT_ULL(VIRTIO_NET_F_CTRL_VLAN) | \
+	 BIT_ULL(VIRTIO_NET_F_CTRL_RX_EXTRA) | BIT_ULL(VIRTIO_NET_F_GUEST_ANNOUNCE) | \
+	 BIT_ULL(VIRTIO_NET_F_MQ) | BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) | \
+	 BIT_ULL(VIRTIO_NET_F_HASH_REPORT) | BIT_ULL(VIRTIO_NET_F_RSS) | \
+	 BIT_ULL(VIRTIO_NET_F_RSC_EXT) | BIT_ULL(VIRTIO_NET_F_STANDBY) | \
+	 BIT_ULL(VIRTIO_NET_F_SPEED_DUPLEX) | BIT_ULL(VIRTIO_F_NOTIFY_ON_EMPTY) | \
+	 BIT_ULL(VIRTIO_F_ANY_LAYOUT) | BIT_ULL(VIRTIO_F_VERSION_1) | \
+	 BIT_ULL(VIRTIO_F_ACCESS_PLATFORM) | BIT_ULL(VIRTIO_F_RING_PACKED) | \
+	 BIT_ULL(VIRTIO_F_ORDER_PLATFORM) | BIT_ULL(VIRTIO_F_SR_IOV))
+
+#define VALID_STATUS_MASK \
+ (VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER | VIRTIO_CONFIG_S_DRIVER_OK | \
+ VIRTIO_CONFIG_S_FEATURES_OK | VIRTIO_CONFIG_S_NEEDS_RESET | VIRTIO_CONFIG_S_FAILED)
+
+struct mlx5_vdpa_net_resources {
+ u32 tisn;
+ u32 tdn;
+ u32 tirn;
+ u32 rqtn;
+ bool valid;
+};
+
+struct mlx5_vdpa_cq_buf {
+ struct mlx5_frag_buf_ctrl fbc;
+ struct mlx5_frag_buf frag_buf;
+ int cqe_size;
+ int nent;
+};
+
+struct mlx5_vdpa_cq {
+ struct mlx5_core_cq mcq;
+ struct mlx5_vdpa_cq_buf buf;
+ struct mlx5_db db;
+ int cqe;
+};
+
+struct mlx5_vdpa_umem {
+ struct mlx5_frag_buf_ctrl fbc;
+ struct mlx5_frag_buf frag_buf;
+ int size;
+ u32 id;
+};
+
+struct mlx5_vdpa_qp {
+ struct mlx5_core_qp mqp;
+ struct mlx5_frag_buf frag_buf;
+ struct mlx5_db db;
+ u16 head;
+ bool fw;
+};
+
+struct mlx5_vq_restore_info {
+ u32 num_ent;
+ u64 desc_addr;
+ u64 device_addr;
+ u64 driver_addr;
+ u16 avail_index;
+ bool ready;
+ struct vdpa_callback cb;
+ bool restore;
+};
+
+struct mlx5_vdpa_virtqueue {
+ bool ready;
+ u64 desc_addr;
+ u64 device_addr;
+ u64 driver_addr;
+ u32 num_ent;
+ struct vdpa_callback event_cb;
+
+	/* Resources for implementing the notification channel from the device
+	 * to the driver. fwqp is the firmware end of an RC connection; the
+	 * other end is vqqp, used by the driver. cq is where completions are
+	 * reported.
+	 */
+ struct mlx5_vdpa_cq cq;
+ struct mlx5_vdpa_qp fwqp;
+ struct mlx5_vdpa_qp vqqp;
+
+	/* umem resources are required for the virtqueue operation. Their use
+	 * is internal to the device, but they must be provided by the driver.
+	 */
+ struct mlx5_vdpa_umem umem1;
+ struct mlx5_vdpa_umem umem2;
+ struct mlx5_vdpa_umem umem3;
+
+ bool initialized;
+ int index;
+ u32 virtq_id;
+ struct mlx5_vdpa_net *ndev;
+ u16 avail_idx;
+ int fw_state;
+
+ /* keep last in the struct */
+ struct mlx5_vq_restore_info ri;
+};
+
+/* We will remove this limitation once mlx5_vdpa_alloc_resources()
+ * provides for driver space allocation
+ */
+#define MLX5_MAX_SUPPORTED_VQS 16
+
+struct mlx5_vdpa_net {
+ struct mlx5_vdpa_dev mvdev;
+ struct mlx5_vdpa_net_resources res;
+ struct virtio_net_config config;
+ struct mlx5_vdpa_virtqueue vqs[MLX5_MAX_SUPPORTED_VQS];
+
+	/* Serialize vq resources creation and destruction. This is required
+	 * since the memory map might change and we need to destroy and
+	 * re-create resources while the driver is operational.
+	 */
+ struct mutex reslock;
+ struct mlx5_flow_table *rxft;
+ struct mlx5_fc *rx_counter;
+ struct mlx5_flow_handle *rx_rule;
+ bool setup;
+};
+
+static void free_resources(struct mlx5_vdpa_net *ndev);
+static void init_mvqs(struct mlx5_vdpa_net *ndev);
+static int setup_driver(struct mlx5_vdpa_net *ndev);
+static void teardown_driver(struct mlx5_vdpa_net *ndev);
+
+static bool mlx5_vdpa_debug;
+
+#define MLX5_LOG_VIO_FLAG(_feature) \
+	do { \
+		if (features & BIT_ULL(_feature)) \
+			mlx5_vdpa_info(mvdev, "%s\n", #_feature); \
+	} while (0)
+
+#define MLX5_LOG_VIO_STAT(_status) \
+ do { \
+ if (status & (_status)) \
+ mlx5_vdpa_info(mvdev, "%s\n", #_status); \
+ } while (0)
+
+static void print_status(struct mlx5_vdpa_dev *mvdev, u8 status, bool set)
+{
+	if (status & ~VALID_STATUS_MASK)
+		mlx5_vdpa_warn(mvdev, "invalid status bits 0x%x\n",
+			       status & ~VALID_STATUS_MASK);
+
+ if (!mlx5_vdpa_debug)
+ return;
+
+ mlx5_vdpa_info(mvdev, "driver status %s", set ? "set" : "get");
+ if (set && !status) {
+ mlx5_vdpa_info(mvdev, "driver resets the device\n");
+ return;
+ }
+
+ MLX5_LOG_VIO_STAT(VIRTIO_CONFIG_S_ACKNOWLEDGE);
+ MLX5_LOG_VIO_STAT(VIRTIO_CONFIG_S_DRIVER);
+ MLX5_LOG_VIO_STAT(VIRTIO_CONFIG_S_DRIVER_OK);
+ MLX5_LOG_VIO_STAT(VIRTIO_CONFIG_S_FEATURES_OK);
+ MLX5_LOG_VIO_STAT(VIRTIO_CONFIG_S_NEEDS_RESET);
+ MLX5_LOG_VIO_STAT(VIRTIO_CONFIG_S_FAILED);
+}
+
+static void print_features(struct mlx5_vdpa_dev *mvdev, u64 features, bool set)
+{
+ if (features & ~VALID_FEATURES_MASK)
+ mlx5_vdpa_warn(mvdev, "There are invalid feature bits 0x%llx\n",
+ features & ~VALID_FEATURES_MASK);
+
+ if (!mlx5_vdpa_debug)
+ return;
+
+ mlx5_vdpa_info(mvdev, "driver %s feature bits:\n", set ? "sets" : "reads");
+ if (!features)
+ mlx5_vdpa_info(mvdev, "all feature bits are cleared\n");
+
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CSUM);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_GUEST_CSUM);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CTRL_GUEST_OFFLOADS);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_MTU);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_MAC);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_GUEST_TSO4);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_GUEST_TSO6);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_GUEST_ECN);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_GUEST_UFO);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_HOST_TSO4);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_HOST_TSO6);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_HOST_ECN);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_HOST_UFO);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_MRG_RXBUF);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_STATUS);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CTRL_VQ);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CTRL_RX);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CTRL_VLAN);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CTRL_RX_EXTRA);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_GUEST_ANNOUNCE);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_MQ);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_CTRL_MAC_ADDR);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_HASH_REPORT);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_RSS);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_RSC_EXT);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_STANDBY);
+ MLX5_LOG_VIO_FLAG(VIRTIO_NET_F_SPEED_DUPLEX);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_NOTIFY_ON_EMPTY);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_ANY_LAYOUT);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_VERSION_1);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_ACCESS_PLATFORM);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_RING_PACKED);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_ORDER_PLATFORM);
+ MLX5_LOG_VIO_FLAG(VIRTIO_F_SR_IOV);
+}
+
+static int create_tis(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
+ u32 in[MLX5_ST_SZ_DW(create_tis_in)] = {};
+ void *tisc;
+ int err;
+
+ tisc = MLX5_ADDR_OF(create_tis_in, in, ctx);
+ MLX5_SET(tisc, tisc, transport_domain, ndev->res.tdn);
+ err = mlx5_vdpa_create_tis(mvdev, in, &ndev->res.tisn);
+ if (err)
+ mlx5_vdpa_warn(mvdev, "create TIS (%d)\n", err);
+
+ return err;
+}
+
+static void destroy_tis(struct mlx5_vdpa_net *ndev)
+{
+ mlx5_vdpa_destroy_tis(&ndev->mvdev, ndev->res.tisn);
+}
+
+#define MLX5_VDPA_CQE_SIZE 64
+#define MLX5_VDPA_LOG_CQE_SIZE ilog2(MLX5_VDPA_CQE_SIZE)
+
+static int cq_frag_buf_alloc(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_cq_buf *buf, int nent)
+{
+ struct mlx5_frag_buf *frag_buf = &buf->frag_buf;
+ u8 log_wq_stride = MLX5_VDPA_LOG_CQE_SIZE;
+ u8 log_wq_sz = MLX5_VDPA_LOG_CQE_SIZE;
+ int err;
+
+ err = mlx5_frag_buf_alloc_node(ndev->mvdev.mdev, nent * MLX5_VDPA_CQE_SIZE, frag_buf,
+ ndev->mvdev.mdev->priv.numa_node);
+ if (err)
+ return err;
+
+ mlx5_init_fbc(frag_buf->frags, log_wq_stride, log_wq_sz, &buf->fbc);
+
+ buf->cqe_size = MLX5_VDPA_CQE_SIZE;
+ buf->nent = nent;
+
+ return 0;
+}
+
+static int umem_frag_buf_alloc(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_umem *umem, int size)
+{
+ struct mlx5_frag_buf *frag_buf = &umem->frag_buf;
+
+ return mlx5_frag_buf_alloc_node(ndev->mvdev.mdev, size, frag_buf,
+ ndev->mvdev.mdev->priv.numa_node);
+}
+
+static void cq_frag_buf_free(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_cq_buf *buf)
+{
+ mlx5_frag_buf_free(ndev->mvdev.mdev, &buf->frag_buf);
+}
+
+static void *get_cqe(struct mlx5_vdpa_cq *vcq, int n)
+{
+ return mlx5_frag_buf_get_wqe(&vcq->buf.fbc, n);
+}
+
+static void cq_frag_buf_init(struct mlx5_vdpa_cq *vcq, struct mlx5_vdpa_cq_buf *buf)
+{
+ struct mlx5_cqe64 *cqe64;
+ void *cqe;
+ int i;
+
+ for (i = 0; i < buf->nent; i++) {
+ cqe = get_cqe(vcq, i);
+ cqe64 = cqe;
+ cqe64->op_own = MLX5_CQE_INVALID << 4;
+ }
+}
+
+static void *get_sw_cqe(struct mlx5_vdpa_cq *cq, int n)
+{
+ struct mlx5_cqe64 *cqe64 = get_cqe(cq, n & (cq->cqe - 1));
+
+ if (likely(get_cqe_opcode(cqe64) != MLX5_CQE_INVALID) &&
+ !((cqe64->op_own & MLX5_CQE_OWNER_MASK) ^ !!(n & cq->cqe)))
+ return cqe64;
+
+ return NULL;
+}
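
The ownership test above alternates once per pass over the ring. For example, with cq->cqe = 256 a consumer index of 300 has (n & cq->cqe) != 0, so a CQE on that pass is only treated as valid if its owner bit is set; on even-numbered passes the owner bit must be clear.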
+
+static void rx_post(struct mlx5_vdpa_qp *vqp, int n)
+{
+ vqp->head += n;
+ vqp->db.db[0] = cpu_to_be32(vqp->head);
+}
+
+static void qp_prepare(struct mlx5_vdpa_net *ndev, bool fw, void *in,
+ struct mlx5_vdpa_virtqueue *mvq, u32 num_ent)
+{
+ struct mlx5_vdpa_qp *vqp;
+ __be64 *pas;
+ void *qpc;
+
+ vqp = fw ? &mvq->fwqp : &mvq->vqqp;
+ MLX5_SET(create_qp_in, in, uid, ndev->mvdev.res.uid);
+ qpc = MLX5_ADDR_OF(create_qp_in, in, qpc);
+ if (vqp->fw) {
+		/* The firmware QP is allocated by the driver on behalf of
+		 * firmware, so we can skip some of the parameters; firmware
+		 * will choose them.
+		 */
+ qpc = MLX5_ADDR_OF(create_qp_in, in, qpc);
+ MLX5_SET(qpc, qpc, rq_type, MLX5_ZERO_LEN_RQ);
+ MLX5_SET(qpc, qpc, no_sq, 1);
+ return;
+ }
+
+ MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC);
+ MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED);
+ MLX5_SET(qpc, qpc, pd, ndev->mvdev.res.pdn);
+ MLX5_SET(qpc, qpc, mtu, MLX5_QPC_MTU_256_BYTES);
+ MLX5_SET(qpc, qpc, uar_page, ndev->mvdev.res.uar->index);
+ MLX5_SET(qpc, qpc, log_page_size, vqp->frag_buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT);
+ MLX5_SET(qpc, qpc, no_sq, 1);
+ MLX5_SET(qpc, qpc, cqn_rcv, mvq->cq.mcq.cqn);
+ MLX5_SET(qpc, qpc, log_rq_size, ilog2(num_ent));
+ MLX5_SET(qpc, qpc, rq_type, MLX5_NON_ZERO_RQ);
+ pas = (__be64 *)MLX5_ADDR_OF(create_qp_in, in, pas);
+ mlx5_fill_page_frag_array(&vqp->frag_buf, pas);
+}
+
+static int rq_buf_alloc(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_qp *vqp, u32 num_ent)
+{
+ return mlx5_frag_buf_alloc_node(ndev->mvdev.mdev,
+ num_ent * sizeof(struct mlx5_wqe_data_seg), &vqp->frag_buf,
+ ndev->mvdev.mdev->priv.numa_node);
+}
+
+static void rq_buf_free(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_qp *vqp)
+{
+ mlx5_frag_buf_free(ndev->mvdev.mdev, &vqp->frag_buf);
+}
+
+static int qp_create(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq,
+ struct mlx5_vdpa_qp *vqp)
+{
+ struct mlx5_core_dev *mdev = ndev->mvdev.mdev;
+ int inlen = MLX5_ST_SZ_BYTES(create_qp_in);
+ u32 out[MLX5_ST_SZ_DW(create_qp_out)] = {};
+ void *qpc;
+ void *in;
+ int err;
+
+ if (!vqp->fw) {
+ vqp = &mvq->vqqp;
+ err = rq_buf_alloc(ndev, vqp, mvq->num_ent);
+ if (err)
+ return err;
+
+ err = mlx5_db_alloc(ndev->mvdev.mdev, &vqp->db);
+ if (err)
+ goto err_db;
+ inlen += vqp->frag_buf.npages * sizeof(__be64);
+ }
+
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in) {
+ err = -ENOMEM;
+ goto err_kzalloc;
+ }
+
+ qp_prepare(ndev, vqp->fw, in, mvq, mvq->num_ent);
+ qpc = MLX5_ADDR_OF(create_qp_in, in, qpc);
+ MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC);
+ MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED);
+ MLX5_SET(qpc, qpc, pd, ndev->mvdev.res.pdn);
+ MLX5_SET(qpc, qpc, mtu, MLX5_QPC_MTU_256_BYTES);
+ if (!vqp->fw)
+ MLX5_SET64(qpc, qpc, dbr_addr, vqp->db.dma);
+ MLX5_SET(create_qp_in, in, opcode, MLX5_CMD_OP_CREATE_QP);
+ err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out));
+ kfree(in);
+ if (err)
+ goto err_kzalloc;
+
+ vqp->mqp.uid = ndev->mvdev.res.uid;
+ vqp->mqp.qpn = MLX5_GET(create_qp_out, out, qpn);
+
+ if (!vqp->fw)
+ rx_post(vqp, mvq->num_ent);
+
+ return 0;
+
+err_kzalloc:
+ if (!vqp->fw)
+ mlx5_db_free(ndev->mvdev.mdev, &vqp->db);
+err_db:
+ if (!vqp->fw)
+ rq_buf_free(ndev, vqp);
+
+ return err;
+}
+
+static void qp_destroy(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_qp *vqp)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_qp_in)] = {};
+
+ MLX5_SET(destroy_qp_in, in, opcode, MLX5_CMD_OP_DESTROY_QP);
+ MLX5_SET(destroy_qp_in, in, qpn, vqp->mqp.qpn);
+ MLX5_SET(destroy_qp_in, in, uid, ndev->mvdev.res.uid);
+ if (mlx5_cmd_exec_in(ndev->mvdev.mdev, destroy_qp, in))
+ mlx5_vdpa_warn(&ndev->mvdev, "destroy qp 0x%x\n", vqp->mqp.qpn);
+ if (!vqp->fw) {
+ mlx5_db_free(ndev->mvdev.mdev, &vqp->db);
+ rq_buf_free(ndev, vqp);
+ }
+}
+
+static void *next_cqe_sw(struct mlx5_vdpa_cq *cq)
+{
+ return get_sw_cqe(cq, cq->mcq.cons_index);
+}
+
+static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
+{
+ struct mlx5_cqe64 *cqe64;
+
+ cqe64 = next_cqe_sw(vcq);
+ if (!cqe64)
+ return -EAGAIN;
+
+ vcq->mcq.cons_index++;
+ return 0;
+}
+
+static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
+{
+ mlx5_cq_set_ci(&mvq->cq.mcq);
+ rx_post(&mvq->vqqp, num);
+ if (mvq->event_cb.callback)
+ mvq->event_cb.callback(mvq->event_cb.private);
+}
+
+static void mlx5_vdpa_cq_comp(struct mlx5_core_cq *mcq, struct mlx5_eqe *eqe)
+{
+ struct mlx5_vdpa_virtqueue *mvq = container_of(mcq, struct mlx5_vdpa_virtqueue, cq.mcq);
+ struct mlx5_vdpa_net *ndev = mvq->ndev;
+ void __iomem *uar_page = ndev->mvdev.res.uar->map;
+ int num = 0;
+
+ while (!mlx5_vdpa_poll_one(&mvq->cq)) {
+ num++;
+ if (num > mvq->num_ent / 2) {
+			/* If completions keep coming while we poll, we want to
+			 * let the hardware know that we consumed them by
+			 * updating the doorbell record. We also let the vdpa
+			 * core know about this so it passes it on to the
+			 * virtio driver in the guest.
+			 */
+ mlx5_vdpa_handle_completions(mvq, num);
+ num = 0;
+ }
+ }
+
+ if (num)
+ mlx5_vdpa_handle_completions(mvq, num);
+
+ mlx5_cq_arm(&mvq->cq.mcq, MLX5_CQ_DB_REQ_NOT, uar_page, mvq->cq.mcq.cons_index);
+}
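
For a 256-entry virtqueue, for instance, the loop above updates the doorbell record and fires the vdpa callback after every 129 completions while draining, plus once for any remainder, rather than once per CQE.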
+
+static int cq_create(struct mlx5_vdpa_net *ndev, u16 idx, u32 num_ent)
+{
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+ struct mlx5_core_dev *mdev = ndev->mvdev.mdev;
+ void __iomem *uar_page = ndev->mvdev.res.uar->map;
+ u32 out[MLX5_ST_SZ_DW(create_cq_out)];
+ struct mlx5_vdpa_cq *vcq = &mvq->cq;
+ unsigned int irqn;
+ __be64 *pas;
+ int inlen;
+ void *cqc;
+ void *in;
+ int err;
+ int eqn;
+
+ err = mlx5_db_alloc(mdev, &vcq->db);
+ if (err)
+ return err;
+
+ vcq->mcq.set_ci_db = vcq->db.db;
+ vcq->mcq.arm_db = vcq->db.db + 1;
+ vcq->mcq.cqe_sz = 64;
+
+ err = cq_frag_buf_alloc(ndev, &vcq->buf, num_ent);
+ if (err)
+ goto err_db;
+
+ cq_frag_buf_init(vcq, &vcq->buf);
+
+ inlen = MLX5_ST_SZ_BYTES(create_cq_in) +
+ MLX5_FLD_SZ_BYTES(create_cq_in, pas[0]) * vcq->buf.frag_buf.npages;
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in) {
+ err = -ENOMEM;
+ goto err_vzalloc;
+ }
+
+ MLX5_SET(create_cq_in, in, uid, ndev->mvdev.res.uid);
+ pas = (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas);
+ mlx5_fill_page_frag_array(&vcq->buf.frag_buf, pas);
+
+ cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context);
+ MLX5_SET(cqc, cqc, log_page_size, vcq->buf.frag_buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT);
+
+ /* Use vector 0 by default. Consider adding code to choose least used
+ * vector.
+ */
+ err = mlx5_vector2eqn(mdev, 0, &eqn, &irqn);
+ if (err)
+ goto err_vec;
+
+ cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context);
+ MLX5_SET(cqc, cqc, log_cq_size, ilog2(num_ent));
+ MLX5_SET(cqc, cqc, uar_page, ndev->mvdev.res.uar->index);
+ MLX5_SET(cqc, cqc, c_eqn, eqn);
+ MLX5_SET64(cqc, cqc, dbr_addr, vcq->db.dma);
+
+ err = mlx5_core_create_cq(mdev, &vcq->mcq, in, inlen, out, sizeof(out));
+ if (err)
+ goto err_vec;
+
+ vcq->mcq.comp = mlx5_vdpa_cq_comp;
+ vcq->cqe = num_ent;
+ vcq->mcq.set_ci_db = vcq->db.db;
+ vcq->mcq.arm_db = vcq->db.db + 1;
+ mlx5_cq_arm(&mvq->cq.mcq, MLX5_CQ_DB_REQ_NOT, uar_page, mvq->cq.mcq.cons_index);
+ kfree(in);
+ return 0;
+
+err_vec:
+ kfree(in);
+err_vzalloc:
+ cq_frag_buf_free(ndev, &vcq->buf);
+err_db:
+ mlx5_db_free(ndev->mvdev.mdev, &vcq->db);
+ return err;
+}
+
+static void cq_destroy(struct mlx5_vdpa_net *ndev, u16 idx)
+{
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+ struct mlx5_core_dev *mdev = ndev->mvdev.mdev;
+ struct mlx5_vdpa_cq *vcq = &mvq->cq;
+
+ if (mlx5_core_destroy_cq(mdev, &vcq->mcq)) {
+ mlx5_vdpa_warn(&ndev->mvdev, "destroy CQ 0x%x\n", vcq->mcq.cqn);
+ return;
+ }
+ cq_frag_buf_free(ndev, &vcq->buf);
+ mlx5_db_free(ndev->mvdev.mdev, &vcq->db);
+}
+
+static int umem_size(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, int num,
+ struct mlx5_vdpa_umem **umemp)
+{
+ struct mlx5_core_dev *mdev = ndev->mvdev.mdev;
+ int p_a;
+ int p_b;
+
+ switch (num) {
+ case 1:
+ p_a = MLX5_CAP_DEV_VDPA_EMULATION(mdev, umem_1_buffer_param_a);
+ p_b = MLX5_CAP_DEV_VDPA_EMULATION(mdev, umem_1_buffer_param_b);
+ *umemp = &mvq->umem1;
+ break;
+ case 2:
+ p_a = MLX5_CAP_DEV_VDPA_EMULATION(mdev, umem_2_buffer_param_a);
+ p_b = MLX5_CAP_DEV_VDPA_EMULATION(mdev, umem_2_buffer_param_b);
+ *umemp = &mvq->umem2;
+ break;
+ case 3:
+ p_a = MLX5_CAP_DEV_VDPA_EMULATION(mdev, umem_3_buffer_param_a);
+ p_b = MLX5_CAP_DEV_VDPA_EMULATION(mdev, umem_3_buffer_param_b);
+ *umemp = &mvq->umem3;
+ break;
+ }
+ return p_a * mvq->num_ent + p_b;
+}
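
The umem size is thus linear in the ring size. With hypothetical capability values umem_1_buffer_param_a = 128 and umem_1_buffer_param_b = 4096, for example, a 256-entry virtqueue needs 128 * 256 + 4096 = 36864 bytes for its first umem.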
+
+static void umem_frag_buf_free(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_umem *umem)
+{
+ mlx5_frag_buf_free(ndev->mvdev.mdev, &umem->frag_buf);
+}
+
+static int create_umem(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, int num)
+{
+ int inlen;
+ u32 out[MLX5_ST_SZ_DW(create_umem_out)] = {};
+ void *um;
+ void *in;
+ int err;
+ __be64 *pas;
+ int size;
+ struct mlx5_vdpa_umem *umem;
+
+ size = umem_size(ndev, mvq, num, &umem);
+ if (size < 0)
+ return size;
+
+ umem->size = size;
+ err = umem_frag_buf_alloc(ndev, umem, size);
+ if (err)
+ return err;
+
+ inlen = MLX5_ST_SZ_BYTES(create_umem_in) + MLX5_ST_SZ_BYTES(mtt) * umem->frag_buf.npages;
+
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in) {
+ err = -ENOMEM;
+ goto err_in;
+ }
+
+ MLX5_SET(create_umem_in, in, opcode, MLX5_CMD_OP_CREATE_UMEM);
+ MLX5_SET(create_umem_in, in, uid, ndev->mvdev.res.uid);
+ um = MLX5_ADDR_OF(create_umem_in, in, umem);
+ MLX5_SET(umem, um, log_page_size, umem->frag_buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT);
+ MLX5_SET64(umem, um, num_of_mtt, umem->frag_buf.npages);
+
+ pas = (__be64 *)MLX5_ADDR_OF(umem, um, mtt[0]);
+ mlx5_fill_page_frag_array_perm(&umem->frag_buf, pas, MLX5_MTT_PERM_RW);
+
+ err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
+ if (err) {
+ mlx5_vdpa_warn(&ndev->mvdev, "create umem(%d)\n", err);
+ goto err_cmd;
+ }
+
+ kfree(in);
+ umem->id = MLX5_GET(create_umem_out, out, umem_id);
+
+ return 0;
+
+err_cmd:
+ kfree(in);
+err_in:
+ umem_frag_buf_free(ndev, umem);
+ return err;
+}
+
+static void umem_destroy(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, int num)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_umem_in)] = {};
+ u32 out[MLX5_ST_SZ_DW(destroy_umem_out)] = {};
+ struct mlx5_vdpa_umem *umem;
+
+ switch (num) {
+ case 1:
+ umem = &mvq->umem1;
+ break;
+ case 2:
+ umem = &mvq->umem2;
+ break;
+ case 3:
+ umem = &mvq->umem3;
+ break;
+ }
+
+ MLX5_SET(destroy_umem_in, in, opcode, MLX5_CMD_OP_DESTROY_UMEM);
+ MLX5_SET(destroy_umem_in, in, umem_id, umem->id);
+ if (mlx5_cmd_exec(ndev->mvdev.mdev, in, sizeof(in), out, sizeof(out)))
+ return;
+
+ umem_frag_buf_free(ndev, umem);
+}
+
+static int umems_create(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ int num;
+ int err;
+
+ for (num = 1; num <= 3; num++) {
+ err = create_umem(ndev, mvq, num);
+ if (err)
+ goto err_umem;
+ }
+ return 0;
+
+err_umem:
+ for (num--; num > 0; num--)
+ umem_destroy(ndev, mvq, num);
+
+ return err;
+}
+
+static void umems_destroy(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ int num;
+
+ for (num = 3; num > 0; num--)
+ umem_destroy(ndev, mvq, num);
+}
+
+static int get_queue_type(struct mlx5_vdpa_net *ndev)
+{
+ u32 type_mask;
+
+ type_mask = MLX5_CAP_DEV_VDPA_EMULATION(ndev->mvdev.mdev, virtio_queue_type);
+
+	/* prefer a packed virtqueue if the device supports it */
+ if (type_mask & MLX5_VIRTIO_EMULATION_CAP_VIRTIO_QUEUE_TYPE_PACKED)
+ return MLX5_VIRTIO_EMULATION_VIRTIO_QUEUE_TYPE_PACKED;
+
+ WARN_ON(!(type_mask & MLX5_VIRTIO_EMULATION_CAP_VIRTIO_QUEUE_TYPE_SPLIT));
+
+ return MLX5_VIRTIO_EMULATION_VIRTIO_QUEUE_TYPE_SPLIT;
+}
+
+static bool vq_is_tx(u16 idx)
+{
+ return idx % 2;
+}
+
+static u16 get_features_12_3(u64 features)
+{
+	return (!!(features & BIT_ULL(VIRTIO_NET_F_HOST_TSO4)) << 9) |
+	       (!!(features & BIT_ULL(VIRTIO_NET_F_HOST_TSO6)) << 8) |
+	       (!!(features & BIT_ULL(VIRTIO_NET_F_CSUM)) << 7) |
+	       (!!(features & BIT_ULL(VIRTIO_NET_F_GUEST_CSUM)) << 6);
+}
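
For example, a feature set with only HOST_TSO4 and CSUM negotiated yields (1 << 9) | (1 << 7) = 0x280 in the queue_feature_bit_mask_12_3 field.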
+
+static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ int inlen = MLX5_ST_SZ_BYTES(create_virtio_net_q_in);
+ u32 out[MLX5_ST_SZ_DW(create_virtio_net_q_out)] = {};
+ void *obj_context;
+ void *cmd_hdr;
+ void *vq_ctx;
+ void *in;
+ int err;
+
+ err = umems_create(ndev, mvq);
+ if (err)
+ return err;
+
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in) {
+ err = -ENOMEM;
+ goto err_alloc;
+ }
+
+ cmd_hdr = MLX5_ADDR_OF(create_virtio_net_q_in, in, general_obj_in_cmd_hdr);
+
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_type, MLX5_OBJ_TYPE_VIRTIO_NET_Q);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
+
+ obj_context = MLX5_ADDR_OF(create_virtio_net_q_in, in, obj_context);
+ MLX5_SET(virtio_net_q_object, obj_context, hw_available_index, mvq->avail_idx);
+ MLX5_SET(virtio_net_q_object, obj_context, queue_feature_bit_mask_12_3,
+ get_features_12_3(ndev->mvdev.actual_features));
+ vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
+ MLX5_SET(virtio_q, vq_ctx, virtio_q_type, get_queue_type(ndev));
+
+ if (vq_is_tx(mvq->index))
+ MLX5_SET(virtio_net_q_object, obj_context, tisn_or_qpn, ndev->res.tisn);
+
+ MLX5_SET(virtio_q, vq_ctx, event_mode, MLX5_VIRTIO_Q_EVENT_MODE_QP_MODE);
+ MLX5_SET(virtio_q, vq_ctx, queue_index, mvq->index);
+ MLX5_SET(virtio_q, vq_ctx, event_qpn_or_msix, mvq->fwqp.mqp.qpn);
+ MLX5_SET(virtio_q, vq_ctx, queue_size, mvq->num_ent);
+	MLX5_SET(virtio_q, vq_ctx, virtio_version_1_0,
+		 !!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_F_VERSION_1)));
+ MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
+ MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
+ MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
+ MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, ndev->mvdev.mr.mkey.key);
+ MLX5_SET(virtio_q, vq_ctx, umem_1_id, mvq->umem1.id);
+ MLX5_SET(virtio_q, vq_ctx, umem_1_size, mvq->umem1.size);
+ MLX5_SET(virtio_q, vq_ctx, umem_2_id, mvq->umem2.id);
+	MLX5_SET(virtio_q, vq_ctx, umem_2_size, mvq->umem2.size);
+ MLX5_SET(virtio_q, vq_ctx, umem_3_id, mvq->umem3.id);
+	MLX5_SET(virtio_q, vq_ctx, umem_3_size, mvq->umem3.size);
+ MLX5_SET(virtio_q, vq_ctx, pd, ndev->mvdev.res.pdn);
+ if (MLX5_CAP_DEV_VDPA_EMULATION(ndev->mvdev.mdev, eth_frame_offload_type))
+ MLX5_SET(virtio_q, vq_ctx, virtio_version_1_0, 1);
+
+ err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
+ if (err)
+ goto err_cmd;
+
+ kfree(in);
+ mvq->virtq_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
+
+ return 0;
+
+err_cmd:
+ kfree(in);
+err_alloc:
+ umems_destroy(ndev, mvq);
+ return err;
+}
+
+static void destroy_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_virtio_net_q_in)] = {};
+ u32 out[MLX5_ST_SZ_DW(destroy_virtio_net_q_out)] = {};
+
+ MLX5_SET(destroy_virtio_net_q_in, in, general_obj_out_cmd_hdr.opcode,
+ MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
+ MLX5_SET(destroy_virtio_net_q_in, in, general_obj_out_cmd_hdr.obj_id, mvq->virtq_id);
+ MLX5_SET(destroy_virtio_net_q_in, in, general_obj_out_cmd_hdr.uid, ndev->mvdev.res.uid);
+ MLX5_SET(destroy_virtio_net_q_in, in, general_obj_out_cmd_hdr.obj_type,
+ MLX5_OBJ_TYPE_VIRTIO_NET_Q);
+ if (mlx5_cmd_exec(ndev->mvdev.mdev, in, sizeof(in), out, sizeof(out))) {
+ mlx5_vdpa_warn(&ndev->mvdev, "destroy virtqueue 0x%x\n", mvq->virtq_id);
+ return;
+ }
+ umems_destroy(ndev, mvq);
+}
+
+static u32 get_rqpn(struct mlx5_vdpa_virtqueue *mvq, bool fw)
+{
+ return fw ? mvq->vqqp.mqp.qpn : mvq->fwqp.mqp.qpn;
+}
+
+static u32 get_qpn(struct mlx5_vdpa_virtqueue *mvq, bool fw)
+{
+ return fw ? mvq->fwqp.mqp.qpn : mvq->vqqp.mqp.qpn;
+}
+
+static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inlen, void **out,
+ int *outlen, u32 qpn, u32 rqpn)
+{
+ void *qpc;
+ void *pp;
+
+ switch (cmd) {
+ case MLX5_CMD_OP_2RST_QP:
+ *inlen = MLX5_ST_SZ_BYTES(qp_2rst_in);
+ *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out);
+ *in = kzalloc(*inlen, GFP_KERNEL);
+ *out = kzalloc(*outlen, GFP_KERNEL);
+		if (!*in || !*out)
+ goto outerr;
+
+ MLX5_SET(qp_2rst_in, *in, opcode, cmd);
+ MLX5_SET(qp_2rst_in, *in, uid, ndev->mvdev.res.uid);
+ MLX5_SET(qp_2rst_in, *in, qpn, qpn);
+ break;
+ case MLX5_CMD_OP_RST2INIT_QP:
+ *inlen = MLX5_ST_SZ_BYTES(rst2init_qp_in);
+ *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out);
+ *in = kzalloc(*inlen, GFP_KERNEL);
+		*out = kzalloc(*outlen, GFP_KERNEL);
+		if (!*in || !*out)
+ goto outerr;
+
+ MLX5_SET(rst2init_qp_in, *in, opcode, cmd);
+ MLX5_SET(rst2init_qp_in, *in, uid, ndev->mvdev.res.uid);
+ MLX5_SET(rst2init_qp_in, *in, qpn, qpn);
+ qpc = MLX5_ADDR_OF(rst2init_qp_in, *in, qpc);
+ MLX5_SET(qpc, qpc, remote_qpn, rqpn);
+ MLX5_SET(qpc, qpc, rwe, 1);
+ pp = MLX5_ADDR_OF(qpc, qpc, primary_address_path);
+ MLX5_SET(ads, pp, vhca_port_num, 1);
+ break;
+ case MLX5_CMD_OP_INIT2RTR_QP:
+ *inlen = MLX5_ST_SZ_BYTES(init2rtr_qp_in);
+ *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out);
+ *in = kzalloc(*inlen, GFP_KERNEL);
+		*out = kzalloc(*outlen, GFP_KERNEL);
+		if (!*in || !*out)
+ goto outerr;
+
+ MLX5_SET(init2rtr_qp_in, *in, opcode, cmd);
+ MLX5_SET(init2rtr_qp_in, *in, uid, ndev->mvdev.res.uid);
+ MLX5_SET(init2rtr_qp_in, *in, qpn, qpn);
+ qpc = MLX5_ADDR_OF(rst2init_qp_in, *in, qpc);
+ MLX5_SET(qpc, qpc, mtu, MLX5_QPC_MTU_256_BYTES);
+ MLX5_SET(qpc, qpc, log_msg_max, 30);
+ MLX5_SET(qpc, qpc, remote_qpn, rqpn);
+ pp = MLX5_ADDR_OF(qpc, qpc, primary_address_path);
+ MLX5_SET(ads, pp, fl, 1);
+ break;
+ case MLX5_CMD_OP_RTR2RTS_QP:
+ *inlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_in);
+ *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out);
+ *in = kzalloc(*inlen, GFP_KERNEL);
+		*out = kzalloc(*outlen, GFP_KERNEL);
+		if (!*in || !*out)
+ goto outerr;
+
+ MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd);
+ MLX5_SET(rtr2rts_qp_in, *in, uid, ndev->mvdev.res.uid);
+ MLX5_SET(rtr2rts_qp_in, *in, qpn, qpn);
+ qpc = MLX5_ADDR_OF(rst2init_qp_in, *in, qpc);
+ pp = MLX5_ADDR_OF(qpc, qpc, primary_address_path);
+ MLX5_SET(ads, pp, ack_timeout, 14);
+ MLX5_SET(qpc, qpc, retry_count, 7);
+ MLX5_SET(qpc, qpc, rnr_retry, 7);
+ break;
+ default:
+ goto outerr;
+ }
+ if (!*in || !*out)
+ goto outerr;
+
+ return;
+
+outerr:
+ kfree(*in);
+ kfree(*out);
+ *in = NULL;
+ *out = NULL;
+}
+
+static void free_inout(void *in, void *out)
+{
+ kfree(in);
+ kfree(out);
+}
+
+/* Two QPs are used by each virtqueue. One is used by the driver and one by
+ * firmware. The fw argument indicates whether the QP being modified is the
+ * one used by firmware.
+ */
+static int modify_qp(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, bool fw, int cmd)
+{
+ int outlen;
+ int inlen;
+ void *out;
+ void *in;
+ int err;
+
+ alloc_inout(ndev, cmd, &in, &inlen, &out, &outlen, get_qpn(mvq, fw), get_rqpn(mvq, fw));
+ if (!in || !out)
+ return -ENOMEM;
+
+ err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, outlen);
+ free_inout(in, out);
+ return err;
+}
+
+static int connect_qps(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ int err;
+
+ err = modify_qp(ndev, mvq, true, MLX5_CMD_OP_2RST_QP);
+ if (err)
+ return err;
+
+ err = modify_qp(ndev, mvq, false, MLX5_CMD_OP_2RST_QP);
+ if (err)
+ return err;
+
+ err = modify_qp(ndev, mvq, true, MLX5_CMD_OP_RST2INIT_QP);
+ if (err)
+ return err;
+
+ err = modify_qp(ndev, mvq, false, MLX5_CMD_OP_RST2INIT_QP);
+ if (err)
+ return err;
+
+ err = modify_qp(ndev, mvq, true, MLX5_CMD_OP_INIT2RTR_QP);
+ if (err)
+ return err;
+
+ err = modify_qp(ndev, mvq, false, MLX5_CMD_OP_INIT2RTR_QP);
+ if (err)
+ return err;
+
+ return modify_qp(ndev, mvq, true, MLX5_CMD_OP_RTR2RTS_QP);
+}
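
Both QPs are walked RST -> INIT -> RTR in lockstep, but only the firmware QP continues to RTS; on this notification channel the firmware end is the only sender, so RTR, which suffices for receiving, is enough for the driver QP.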
+
+struct mlx5_virtq_attr {
+ u8 state;
+ u16 available_index;
+};
+
+static int query_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq,
+ struct mlx5_virtq_attr *attr)
+{
+ int outlen = MLX5_ST_SZ_BYTES(query_virtio_net_q_out);
+ u32 in[MLX5_ST_SZ_DW(query_virtio_net_q_in)] = {};
+ void *out;
+ void *obj_context;
+ void *cmd_hdr;
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ cmd_hdr = MLX5_ADDR_OF(query_virtio_net_q_in, in, general_obj_in_cmd_hdr);
+
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, opcode, MLX5_CMD_OP_QUERY_GENERAL_OBJECT);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_type, MLX5_OBJ_TYPE_VIRTIO_NET_Q);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_id, mvq->virtq_id);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
+ err = mlx5_cmd_exec(ndev->mvdev.mdev, in, sizeof(in), out, outlen);
+ if (err)
+ goto err_cmd;
+
+ obj_context = MLX5_ADDR_OF(query_virtio_net_q_out, out, obj_context);
+ memset(attr, 0, sizeof(*attr));
+ attr->state = MLX5_GET(virtio_net_q_object, obj_context, state);
+ attr->available_index = MLX5_GET(virtio_net_q_object, obj_context, hw_available_index);
+ kfree(out);
+ return 0;
+
+err_cmd:
+ kfree(out);
+ return err;
+}
+
+static int modify_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, int state)
+{
+ int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
+ u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
+ void *obj_context;
+ void *cmd_hdr;
+ void *in;
+ int err;
+
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ cmd_hdr = MLX5_ADDR_OF(modify_virtio_net_q_in, in, general_obj_in_cmd_hdr);
+
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, opcode, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_type, MLX5_OBJ_TYPE_VIRTIO_NET_Q);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_id, mvq->virtq_id);
+ MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
+
+ obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
+ MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select,
+ MLX5_VIRTQ_MODIFY_MASK_STATE);
+ MLX5_SET(virtio_net_q_object, obj_context, state, state);
+ err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
+ kfree(in);
+ if (!err)
+ mvq->fw_state = state;
+
+ return err;
+}
+
+static int setup_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ u16 idx = mvq->index;
+ int err;
+
+ if (!mvq->num_ent)
+ return 0;
+
+ if (mvq->initialized) {
+ mlx5_vdpa_warn(&ndev->mvdev, "attempt re init\n");
+ return -EINVAL;
+ }
+
+ err = cq_create(ndev, idx, mvq->num_ent);
+ if (err)
+ return err;
+
+ err = qp_create(ndev, mvq, &mvq->fwqp);
+ if (err)
+ goto err_fwqp;
+
+ err = qp_create(ndev, mvq, &mvq->vqqp);
+ if (err)
+ goto err_vqqp;
+
+ err = connect_qps(ndev, mvq);
+ if (err)
+ goto err_connect;
+
+ err = create_virtqueue(ndev, mvq);
+ if (err)
+ goto err_connect;
+
+ if (mvq->ready) {
+ err = modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
+ if (err) {
+ mlx5_vdpa_warn(&ndev->mvdev, "failed to modify to ready vq idx %d(%d)\n",
+ idx, err);
+ goto err_connect;
+ }
+ }
+
+ mvq->initialized = true;
+ return 0;
+
+err_connect:
+ qp_destroy(ndev, &mvq->vqqp);
+err_vqqp:
+ qp_destroy(ndev, &mvq->fwqp);
+err_fwqp:
+ cq_destroy(ndev, idx);
+ return err;
+}
+
+static void suspend_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ struct mlx5_virtq_attr attr;
+
+ if (!mvq->initialized)
+ return;
+
+ if (query_virtqueue(ndev, mvq, &attr)) {
+ mlx5_vdpa_warn(&ndev->mvdev, "failed to query virtqueue\n");
+ return;
+ }
+ if (mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY)
+ return;
+
+ if (modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND))
+ mlx5_vdpa_warn(&ndev->mvdev, "modify to suspend failed\n");
+}
+
+static void suspend_vqs(struct mlx5_vdpa_net *ndev)
+{
+ int i;
+
+ for (i = 0; i < MLX5_MAX_SUPPORTED_VQS; i++)
+ suspend_vq(ndev, &ndev->vqs[i]);
+}
+
+static void teardown_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ if (!mvq->initialized)
+ return;
+
+ suspend_vq(ndev, mvq);
+ destroy_virtqueue(ndev, mvq);
+ qp_destroy(ndev, &mvq->vqqp);
+ qp_destroy(ndev, &mvq->fwqp);
+ cq_destroy(ndev, mvq->index);
+ mvq->initialized = false;
+}
+
+static int create_rqt(struct mlx5_vdpa_net *ndev)
+{
+ int log_max_rqt;
+ __be32 *list;
+ void *rqtc;
+ int inlen;
+ void *in;
+ int i, j;
+ int err;
+
+ log_max_rqt = min_t(int, 1, MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
+ if (log_max_rqt < 1)
+ return -EOPNOTSUPP;
+
+ inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + (1 << log_max_rqt) * MLX5_ST_SZ_BYTES(rq_num);
+ in = kzalloc(inlen, GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ MLX5_SET(create_rqt_in, in, uid, ndev->mvdev.res.uid);
+ rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
+
+ MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q);
+ MLX5_SET(rqtc, rqtc, rqt_max_size, 1 << log_max_rqt);
+ MLX5_SET(rqtc, rqtc, rqt_actual_size, 1);
+ list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]);
+ for (i = 0, j = 0; j < ndev->mvdev.max_vqs; j++) {
+ if (!ndev->vqs[j].initialized)
+ continue;
+
+ if (!vq_is_tx(ndev->vqs[j].index)) {
+ list[i] = cpu_to_be32(ndev->vqs[j].virtq_id);
+ i++;
+ }
+ }
+
+ err = mlx5_vdpa_create_rqt(&ndev->mvdev, in, inlen, &ndev->res.rqtn);
+ kfree(in);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static void destroy_rqt(struct mlx5_vdpa_net *ndev)
+{
+ mlx5_vdpa_destroy_rqt(&ndev->mvdev, ndev->res.rqtn);
+}
+
+static int create_tir(struct mlx5_vdpa_net *ndev)
+{
+#define HASH_IP_L4PORTS \
+ (MLX5_HASH_FIELD_SEL_SRC_IP | MLX5_HASH_FIELD_SEL_DST_IP | MLX5_HASH_FIELD_SEL_L4_SPORT | \
+ MLX5_HASH_FIELD_SEL_L4_DPORT)
+ static const u8 rx_hash_toeplitz_key[] = { 0x2c, 0xc6, 0x81, 0xd1, 0x5b, 0xdb, 0xf4, 0xf7,
+ 0xfc, 0xa2, 0x83, 0x19, 0xdb, 0x1a, 0x3e, 0x94,
+ 0x6b, 0x9e, 0x38, 0xd9, 0x2c, 0x9c, 0x03, 0xd1,
+ 0xad, 0x99, 0x44, 0xa7, 0xd9, 0x56, 0x3d, 0x59,
+ 0x06, 0x3c, 0x25, 0xf3, 0xfc, 0x1f, 0xdc, 0x2a };
+ void *rss_key;
+ void *outer;
+ void *tirc;
+ void *in;
+ int err;
+
+ in = kzalloc(MLX5_ST_SZ_BYTES(create_tir_in), GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ MLX5_SET(create_tir_in, in, uid, ndev->mvdev.res.uid);
+ tirc = MLX5_ADDR_OF(create_tir_in, in, ctx);
+ MLX5_SET(tirc, tirc, disp_type, MLX5_TIRC_DISP_TYPE_INDIRECT);
+
+ MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
+ MLX5_SET(tirc, tirc, rx_hash_fn, MLX5_RX_HASH_FN_TOEPLITZ);
+ rss_key = MLX5_ADDR_OF(tirc, tirc, rx_hash_toeplitz_key);
+ memcpy(rss_key, rx_hash_toeplitz_key, sizeof(rx_hash_toeplitz_key));
+
+ outer = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer);
+ MLX5_SET(rx_hash_field_select, outer, l3_prot_type, MLX5_L3_PROT_TYPE_IPV4);
+ MLX5_SET(rx_hash_field_select, outer, l4_prot_type, MLX5_L4_PROT_TYPE_TCP);
+ MLX5_SET(rx_hash_field_select, outer, selected_fields, HASH_IP_L4PORTS);
+
+ MLX5_SET(tirc, tirc, indirect_table, ndev->res.rqtn);
+ MLX5_SET(tirc, tirc, transport_domain, ndev->res.tdn);
+
+ err = mlx5_vdpa_create_tir(&ndev->mvdev, in, &ndev->res.tirn);
+ kfree(in);
+ return err;
+}
+
+static void destroy_tir(struct mlx5_vdpa_net *ndev)
+{
+ mlx5_vdpa_destroy_tir(&ndev->mvdev, ndev->res.tirn);
+}
+
+static int add_fwd_to_tir(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_flow_destination dest[2] = {};
+ struct mlx5_flow_table_attr ft_attr = {};
+ struct mlx5_flow_act flow_act = {};
+ struct mlx5_flow_namespace *ns;
+ int err;
+
+ /* for now, one entry, match all, forward to tir */
+ ft_attr.max_fte = 1;
+ ft_attr.autogroup.max_num_groups = 1;
+
+ ns = mlx5_get_flow_namespace(ndev->mvdev.mdev, MLX5_FLOW_NAMESPACE_BYPASS);
+ if (!ns) {
+ mlx5_vdpa_warn(&ndev->mvdev, "get flow namespace\n");
+ return -EOPNOTSUPP;
+ }
+
+ ndev->rxft = mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
+ if (IS_ERR(ndev->rxft))
+ return PTR_ERR(ndev->rxft);
+
+ ndev->rx_counter = mlx5_fc_create(ndev->mvdev.mdev, false);
+ if (IS_ERR(ndev->rx_counter)) {
+ err = PTR_ERR(ndev->rx_counter);
+ goto err_fc;
+ }
+
+ flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ dest[0].type = MLX5_FLOW_DESTINATION_TYPE_TIR;
+ dest[0].tir_num = ndev->res.tirn;
+ dest[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+ dest[1].counter_id = mlx5_fc_id(ndev->rx_counter);
+ ndev->rx_rule = mlx5_add_flow_rules(ndev->rxft, NULL, &flow_act, dest, 2);
+ if (IS_ERR(ndev->rx_rule)) {
+ err = PTR_ERR(ndev->rx_rule);
+ ndev->rx_rule = NULL;
+ goto err_rule;
+ }
+
+ return 0;
+
+err_rule:
+ mlx5_fc_destroy(ndev->mvdev.mdev, ndev->rx_counter);
+err_fc:
+ mlx5_destroy_flow_table(ndev->rxft);
+ return err;
+}
+
+static void remove_fwd_to_tir(struct mlx5_vdpa_net *ndev)
+{
+ if (!ndev->rx_rule)
+ return;
+
+ mlx5_del_flow_rules(ndev->rx_rule);
+ mlx5_fc_destroy(ndev->mvdev.mdev, ndev->rx_counter);
+ mlx5_destroy_flow_table(ndev->rxft);
+
+ ndev->rx_rule = NULL;
+}
+
+static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, u16 idx)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+
+ if (unlikely(!mvq->ready))
+ return;
+
+ iowrite16(idx, ndev->mvdev.res.kick_addr);
+}
+
+static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_area,
+ u64 driver_area, u64 device_area)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+
+ mvq->desc_addr = desc_area;
+ mvq->device_addr = device_area;
+ mvq->driver_addr = driver_area;
+ return 0;
+}
+
+static void mlx5_vdpa_set_vq_num(struct vdpa_device *vdev, u16 idx, u32 num)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq;
+
+ mvq = &ndev->vqs[idx];
+ mvq->num_ent = num;
+}
+
+static void mlx5_vdpa_set_vq_cb(struct vdpa_device *vdev, u16 idx, struct vdpa_callback *cb)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *vq = &ndev->vqs[idx];
+
+ vq->event_cb = *cb;
+}
+
+static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16 idx, bool ready)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+
+ if (!ready)
+ suspend_vq(ndev, mvq);
+
+ mvq->ready = ready;
+}
+
+static bool mlx5_vdpa_get_vq_ready(struct vdpa_device *vdev, u16 idx)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+
+ return mvq->ready;
+}
+
+static int mlx5_vdpa_set_vq_state(struct vdpa_device *vdev, u16 idx,
+ const struct vdpa_vq_state *state)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+
+ if (mvq->fw_state == MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY) {
+ mlx5_vdpa_warn(mvdev, "can't modify available index\n");
+ return -EINVAL;
+ }
+
+ mvq->avail_idx = state->avail_index;
+ return 0;
+}
+
+static int mlx5_vdpa_get_vq_state(struct vdpa_device *vdev, u16 idx, struct vdpa_vq_state *state)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
+ struct mlx5_virtq_attr attr;
+ int err;
+
+ if (!mvq->initialized)
+ return -EAGAIN;
+
+ err = query_virtqueue(ndev, mvq, &attr);
+ if (err) {
+ mlx5_vdpa_warn(mvdev, "failed to query virtqueue\n");
+ return err;
+ }
+ state->avail_index = attr.available_index;
+ return 0;
+}
+
+static u32 mlx5_vdpa_get_vq_align(struct vdpa_device *vdev)
+{
+ return PAGE_SIZE;
+}
+
+enum {
+	MLX5_VIRTIO_NET_F_GUEST_CSUM = 1 << 9,
+	MLX5_VIRTIO_NET_F_CSUM = 1 << 10,
+	MLX5_VIRTIO_NET_F_HOST_TSO6 = 1 << 11,
+	MLX5_VIRTIO_NET_F_HOST_TSO4 = 1 << 12,
+};
+
+static u64 mlx_to_virtio_features(u16 dev_features)
+{
+	u64 result = 0;
+
+	if (dev_features & MLX5_VIRTIO_NET_F_GUEST_CSUM)
+		result |= BIT_ULL(VIRTIO_NET_F_GUEST_CSUM);
+	if (dev_features & MLX5_VIRTIO_NET_F_CSUM)
+		result |= BIT_ULL(VIRTIO_NET_F_CSUM);
+	if (dev_features & MLX5_VIRTIO_NET_F_HOST_TSO6)
+		result |= BIT_ULL(VIRTIO_NET_F_HOST_TSO6);
+	if (dev_features & MLX5_VIRTIO_NET_F_HOST_TSO4)
+		result |= BIT_ULL(VIRTIO_NET_F_HOST_TSO4);
+
+ return result;
+}
+
+static u64 mlx5_vdpa_get_features(struct vdpa_device *vdev)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ u16 dev_features;
+
+ dev_features = MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, device_features_bits_mask);
+	ndev->mvdev.mlx_features = mlx_to_virtio_features(dev_features);
+	if (MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, virtio_version_1_0))
+		ndev->mvdev.mlx_features |= BIT_ULL(VIRTIO_F_VERSION_1);
+	ndev->mvdev.mlx_features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
+ print_features(mvdev, ndev->mvdev.mlx_features, false);
+ return ndev->mvdev.mlx_features;
+}
+
+static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features)
+{
+	if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+static int setup_virtqueues(struct mlx5_vdpa_net *ndev)
+{
+ int err;
+ int i;
+
+ for (i = 0; i < 2 * mlx5_vdpa_max_qps(ndev->mvdev.max_vqs); i++) {
+ err = setup_vq(ndev, &ndev->vqs[i]);
+ if (err)
+ goto err_vq;
+ }
+
+ return 0;
+
+err_vq:
+ for (--i; i >= 0; i--)
+ teardown_vq(ndev, &ndev->vqs[i]);
+
+ return err;
+}
+
+static void teardown_virtqueues(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_vdpa_virtqueue *mvq;
+ int i;
+
+ for (i = ndev->mvdev.max_vqs - 1; i >= 0; i--) {
+ mvq = &ndev->vqs[i];
+ if (!mvq->initialized)
+ continue;
+
+ teardown_vq(ndev, mvq);
+ }
+}
+
+static int mlx5_vdpa_set_features(struct vdpa_device *vdev, u64 features)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ int err;
+
+ print_features(mvdev, features, true);
+
+ err = verify_min_features(mvdev, features);
+ if (err)
+ return err;
+
+ ndev->mvdev.actual_features = features & ndev->mvdev.mlx_features;
+ return err;
+}
+
+static void mlx5_vdpa_set_config_cb(struct vdpa_device *vdev, struct vdpa_callback *cb)
+{
+ /* not implemented */
+ mlx5_vdpa_warn(to_mvdev(vdev), "set config callback not supported\n");
+}
+
+#define MLX5_VDPA_MAX_VQ_ENTRIES 256
+static u16 mlx5_vdpa_get_vq_num_max(struct vdpa_device *vdev)
+{
+ return MLX5_VDPA_MAX_VQ_ENTRIES;
+}
+
+static u32 mlx5_vdpa_get_device_id(struct vdpa_device *vdev)
+{
+ return VIRTIO_ID_NET;
+}
+
+static u32 mlx5_vdpa_get_vendor_id(struct vdpa_device *vdev)
+{
+ return PCI_VENDOR_ID_MELLANOX;
+}
+
+static u8 mlx5_vdpa_get_status(struct vdpa_device *vdev)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+
+ print_status(mvdev, ndev->mvdev.status, false);
+ return ndev->mvdev.status;
+}
+
+static int save_channel_info(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+ struct mlx5_vq_restore_info *ri = &mvq->ri;
+ struct mlx5_virtq_attr attr;
+ int err;
+
+ if (!mvq->initialized)
+ return 0;
+
+ err = query_virtqueue(ndev, mvq, &attr);
+ if (err)
+ return err;
+
+ ri->avail_index = attr.available_index;
+ ri->ready = mvq->ready;
+ ri->num_ent = mvq->num_ent;
+ ri->desc_addr = mvq->desc_addr;
+ ri->device_addr = mvq->device_addr;
+ ri->driver_addr = mvq->driver_addr;
+ ri->cb = mvq->event_cb;
+ ri->restore = true;
+ return 0;
+}
+
+static int save_channels_info(struct mlx5_vdpa_net *ndev)
+{
+ int i;
+
+ for (i = 0; i < ndev->mvdev.max_vqs; i++) {
+ memset(&ndev->vqs[i].ri, 0, sizeof(ndev->vqs[i].ri));
+ save_channel_info(ndev, &ndev->vqs[i]);
+ }
+ return 0;
+}
+
+static void mlx5_clear_vqs(struct mlx5_vdpa_net *ndev)
+{
+ int i;
+
+ for (i = 0; i < ndev->mvdev.max_vqs; i++)
+ memset(&ndev->vqs[i], 0, offsetof(struct mlx5_vdpa_virtqueue, ri));
+}
+
+static void restore_channels_info(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_vdpa_virtqueue *mvq;
+ struct mlx5_vq_restore_info *ri;
+ int i;
+
+ mlx5_clear_vqs(ndev);
+ init_mvqs(ndev);
+ for (i = 0; i < ndev->mvdev.max_vqs; i++) {
+ mvq = &ndev->vqs[i];
+ ri = &mvq->ri;
+ if (!ri->restore)
+ continue;
+
+ mvq->avail_idx = ri->avail_index;
+ mvq->ready = ri->ready;
+ mvq->num_ent = ri->num_ent;
+ mvq->desc_addr = ri->desc_addr;
+ mvq->device_addr = ri->device_addr;
+ mvq->driver_addr = ri->driver_addr;
+ mvq->event_cb = ri->cb;
+ }
+}
+
+static int mlx5_vdpa_change_map(struct mlx5_vdpa_net *ndev, struct vhost_iotlb *iotlb)
+{
+ int err;
+
+ suspend_vqs(ndev);
+ err = save_channels_info(ndev);
+ if (err)
+ goto err_mr;
+
+ teardown_driver(ndev);
+ mlx5_vdpa_destroy_mr(&ndev->mvdev);
+ err = mlx5_vdpa_create_mr(&ndev->mvdev, iotlb);
+ if (err)
+ goto err_mr;
+
+ restore_channels_info(ndev);
+ err = setup_driver(ndev);
+ if (err)
+ goto err_setup;
+
+ return 0;
+
+err_setup:
+ mlx5_vdpa_destroy_mr(&ndev->mvdev);
+err_mr:
+ return err;
+}
+
+static int setup_driver(struct mlx5_vdpa_net *ndev)
+{
+ int err;
+
+ mutex_lock(&ndev->reslock);
+ if (ndev->setup) {
+ mlx5_vdpa_warn(&ndev->mvdev, "setup driver called for already setup driver\n");
+ err = 0;
+ goto out;
+ }
+ err = setup_virtqueues(ndev);
+ if (err) {
+ mlx5_vdpa_warn(&ndev->mvdev, "setup_virtqueues\n");
+ goto out;
+ }
+
+ err = create_rqt(ndev);
+ if (err) {
+ mlx5_vdpa_warn(&ndev->mvdev, "create_rqt\n");
+ goto err_rqt;
+ }
+
+ err = create_tir(ndev);
+ if (err) {
+ mlx5_vdpa_warn(&ndev->mvdev, "create_tir\n");
+ goto err_tir;
+ }
+
+ err = add_fwd_to_tir(ndev);
+ if (err) {
+ mlx5_vdpa_warn(&ndev->mvdev, "add_fwd_to_tir\n");
+ goto err_fwd;
+ }
+ ndev->setup = true;
+ mutex_unlock(&ndev->reslock);
+
+ return 0;
+
+err_fwd:
+ destroy_tir(ndev);
+err_tir:
+ destroy_rqt(ndev);
+err_rqt:
+ teardown_virtqueues(ndev);
+out:
+ mutex_unlock(&ndev->reslock);
+ return err;
+}
+
+static void teardown_driver(struct mlx5_vdpa_net *ndev)
+{
+ mutex_lock(&ndev->reslock);
+ if (!ndev->setup)
+ goto out;
+
+ remove_fwd_to_tir(ndev);
+ destroy_tir(ndev);
+ destroy_rqt(ndev);
+ teardown_virtqueues(ndev);
+ ndev->setup = false;
+out:
+ mutex_unlock(&ndev->reslock);
+}
+
+static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ int err;
+
+ print_status(mvdev, status, true);
+ if (!status) {
+ mlx5_vdpa_info(mvdev, "performing device reset\n");
+ teardown_driver(ndev);
+ mlx5_vdpa_destroy_mr(&ndev->mvdev);
+ ndev->mvdev.status = 0;
+ ndev->mvdev.mlx_features = 0;
+ ++mvdev->generation;
+ return;
+ }
+
+ if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
+ err = setup_driver(ndev);
+ if (err) {
+ mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
+ goto err_setup;
+ }
+ } else {
+ mlx5_vdpa_warn(mvdev, "did not expect DRIVER_OK to be cleared\n");
+ return;
+ }
+ }
+
+ ndev->mvdev.status = status;
+ return;
+
+err_setup:
+ mlx5_vdpa_destroy_mr(&ndev->mvdev);
+ ndev->mvdev.status |= VIRTIO_CONFIG_S_FAILED;
+}
+
+static void mlx5_vdpa_get_config(struct vdpa_device *vdev, unsigned int offset, void *buf,
+ unsigned int len)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+
+ if (offset + len <= sizeof(struct virtio_net_config))
+ memcpy(buf, (u8 *)&ndev->config + offset, len);
+}
+
+static void mlx5_vdpa_set_config(struct vdpa_device *vdev, unsigned int offset, const void *buf,
+ unsigned int len)
+{
+ /* not supported */
+}
+
+static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+
+ return mvdev->generation;
+}
+
+static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+ bool change_map;
+ int err;
+
+ err = mlx5_vdpa_handle_set_map(mvdev, iotlb, &change_map);
+ if (err) {
+ mlx5_vdpa_warn(mvdev, "set map failed(%d)\n", err);
+ return err;
+ }
+
+ if (change_map)
+ return mlx5_vdpa_change_map(ndev, iotlb);
+
+ return 0;
+}
+
+static void mlx5_vdpa_free(struct vdpa_device *vdev)
+{
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_vdpa_net *ndev;
+
+ ndev = to_mlx5_vdpa_ndev(mvdev);
+
+ free_resources(ndev);
+ mlx5_vdpa_free_resources(&ndev->mvdev);
+ mutex_destroy(&ndev->reslock);
+}
+
+static struct vdpa_notification_area mlx5_get_vq_notification(struct vdpa_device *vdev, u16 idx)
+{
+ struct vdpa_notification_area ret = {};
+
+ return ret;
+}
+
+static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
+{
+ return -EOPNOTSUPP;
+}
+
+static const struct vdpa_config_ops mlx5_vdpa_ops = {
+ .set_vq_address = mlx5_vdpa_set_vq_address,
+ .set_vq_num = mlx5_vdpa_set_vq_num,
+ .kick_vq = mlx5_vdpa_kick_vq,
+ .set_vq_cb = mlx5_vdpa_set_vq_cb,
+ .set_vq_ready = mlx5_vdpa_set_vq_ready,
+ .get_vq_ready = mlx5_vdpa_get_vq_ready,
+ .set_vq_state = mlx5_vdpa_set_vq_state,
+ .get_vq_state = mlx5_vdpa_get_vq_state,
+ .get_vq_notification = mlx5_get_vq_notification,
+ .get_vq_irq = mlx5_get_vq_irq,
+ .get_vq_align = mlx5_vdpa_get_vq_align,
+ .get_features = mlx5_vdpa_get_features,
+ .set_features = mlx5_vdpa_set_features,
+ .set_config_cb = mlx5_vdpa_set_config_cb,
+ .get_vq_num_max = mlx5_vdpa_get_vq_num_max,
+ .get_device_id = mlx5_vdpa_get_device_id,
+ .get_vendor_id = mlx5_vdpa_get_vendor_id,
+ .get_status = mlx5_vdpa_get_status,
+ .set_status = mlx5_vdpa_set_status,
+ .get_config = mlx5_vdpa_get_config,
+ .set_config = mlx5_vdpa_set_config,
+ .get_generation = mlx5_vdpa_get_generation,
+ .set_map = mlx5_vdpa_set_map,
+ .free = mlx5_vdpa_free,
+};
+
+static int alloc_resources(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_vdpa_net_resources *res = &ndev->res;
+ int err;
+
+ if (res->valid) {
+ mlx5_vdpa_warn(&ndev->mvdev, "resources already allocated\n");
+ return -EEXIST;
+ }
+
+ err = mlx5_vdpa_alloc_transport_domain(&ndev->mvdev, &res->tdn);
+ if (err)
+ return err;
+
+ err = create_tis(ndev);
+ if (err)
+ goto err_tis;
+
+ res->valid = true;
+
+ return 0;
+
+err_tis:
+ mlx5_vdpa_dealloc_transport_domain(&ndev->mvdev, res->tdn);
+ return err;
+}
+
+static void free_resources(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_vdpa_net_resources *res = &ndev->res;
+
+ if (!res->valid)
+ return;
+
+ destroy_tis(ndev);
+ mlx5_vdpa_dealloc_transport_domain(&ndev->mvdev, res->tdn);
+ res->valid = false;
+}
+
+static void init_mvqs(struct mlx5_vdpa_net *ndev)
+{
+ struct mlx5_vdpa_virtqueue *mvq;
+ int i;
+
+ for (i = 0; i < 2 * mlx5_vdpa_max_qps(ndev->mvdev.max_vqs); ++i) {
+ mvq = &ndev->vqs[i];
+ memset(mvq, 0, offsetof(struct mlx5_vdpa_virtqueue, ri));
+ mvq->index = i;
+ mvq->ndev = ndev;
+ mvq->fwqp.fw = true;
+ }
+ for (; i < ndev->mvdev.max_vqs; i++) {
+ mvq = &ndev->vqs[i];
+ memset(mvq, 0, offsetof(struct mlx5_vdpa_virtqueue, ri));
+ mvq->index = i;
+ mvq->ndev = ndev;
+ }
+}
+
+void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
+{
+ struct virtio_net_config *config;
+ struct mlx5_vdpa_dev *mvdev;
+ struct mlx5_vdpa_net *ndev;
+ u32 max_vqs;
+ int err;
+
+ /* we reserve one virtqueue for the control virtqueue, should we require it */
+ max_vqs = MLX5_CAP_DEV_VDPA_EMULATION(mdev, max_num_virtio_queues);
+ max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
+
+ ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
+ 2 * mlx5_vdpa_max_qps(max_vqs));
+ if (IS_ERR(ndev))
+ return ndev;
+
+ ndev->mvdev.max_vqs = max_vqs;
+ mvdev = &ndev->mvdev;
+ mvdev->mdev = mdev;
+ init_mvqs(ndev);
+ mutex_init(&ndev->reslock);
+ config = &ndev->config;
+ err = mlx5_query_nic_vport_mtu(mdev, &config->mtu);
+ if (err)
+ goto err_mtu;
+
+ err = mlx5_query_nic_vport_mac_address(mdev, 0, 0, config->mac);
+ if (err)
+ goto err_mtu;
+
+ mvdev->vdev.dma_dev = mdev->device;
+ err = mlx5_vdpa_alloc_resources(&ndev->mvdev);
+ if (err)
+ goto err_mtu;
+
+ err = alloc_resources(ndev);
+ if (err)
+ goto err_res;
+
+ err = vdpa_register_device(&mvdev->vdev);
+ if (err)
+ goto err_reg;
+
+ return ndev;
+
+err_reg:
+ free_resources(ndev);
+err_res:
+ mlx5_vdpa_free_resources(&ndev->mvdev);
+err_mtu:
+ mutex_destroy(&ndev->reslock);
+ put_device(&mvdev->vdev.dev);
+ return ERR_PTR(err);
+}
+
+void mlx5_vdpa_remove_dev(struct mlx5_vdpa_dev *mvdev)
+{
+ vdpa_unregister_device(&mvdev->vdev);
+}
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.h b/drivers/vdpa/mlx5/net/mlx5_vnet.h
new file mode 100644
index 000000000000..f2d6d68b020e
--- /dev/null
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2020 Mellanox Technologies Ltd. */
+
+#ifndef __MLX5_VNET_H_
+#define __MLX5_VNET_H_
+
+#include <linux/vdpa.h>
+#include <linux/virtio_net.h>
+#include <linux/vringh.h>
+#include <linux/mlx5/driver.h>
+#include <linux/mlx5/cq.h>
+#include <linux/mlx5/qp.h>
+#include "mlx5_vdpa.h"
+
+static inline u32 mlx5_vdpa_max_qps(int max_vqs)
+{
+ return max_vqs / 2;
+}
+
+#define to_mlx5_vdpa_ndev(__mvdev) container_of(__mvdev, struct mlx5_vdpa_net, mvdev)
+void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev);
+void mlx5_vdpa_remove_dev(struct mlx5_vdpa_dev *mvdev);
+
+#endif /* __MLX5_VNET_H_ */
--
2.26.0
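
As context for the ops table above, here is a minimal, hypothetical
sketch of how a bus-side consumer such as vhost-vdpa drives these
callbacks. The helper name is made up, and the single status write is a
simplification (userspace normally sets ACKNOWLEDGE, DRIVER,
FEATURES_OK and DRIVER_OK incrementally through the vhost ioctls):

	#include <linux/vdpa.h>
	#include <linux/virtio_config.h>

	static int demo_start(struct vdpa_device *vdpa, struct vhost_iotlb *iotlb)
	{
		const struct vdpa_config_ops *ops = vdpa->config;
		int err;

		/* program the device IOTLB; if the mapping changed after the
		 * driver was set up, this may take the suspend/save/teardown/
		 * restore path in mlx5_vdpa_change_map()
		 */
		err = ops->set_map(vdpa, iotlb);
		if (err)
			return err;

		/* DRIVER_OK triggers setup_driver(): virtqueues, RQT, TIR
		 * and the forwarding rule are created at this point
		 */
		ops->set_status(vdpa, VIRTIO_CONFIG_S_DRIVER_OK);
		return 0;
	}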

2020-08-04 21:30:49

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
> Hi Michael,
> please note that this series depends on mlx5 core device driver patches
> in mlx5-next branch in
> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.

Thanks! OK so what's the plan for merging this?
Do patches at least build well enough that I can push them
upstream? Or do they have to go on top of the mellanox tree?


> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
>
> They also depend Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301

The ones you included, right?

> [...]

2020-08-05 05:02:01

by Eli Cohen

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Tue, Aug 04, 2020 at 05:29:09PM -0400, Michael S. Tsirkin wrote:
> On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
> > Hi Michael,
> > please note that this series depends on mlx5 core device driver patches
> > in mlx5-next branch in
> > git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
>
> Thanks! OK so what's the plan for merging this?
> Do patches at least build well enough that I can push them
> upstream? Or do they have to go on top of the mellanox tree?
>

The patches are built on your linux-next branch which I updated
yesterday.

I am based on this commit:
776b7b25f10b (origin/linux-next) vhost: add an RPMsg API

On top of that I merged
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

and after that I have Jason's patches (five patches), then one patch
from Max, and then my patches (seven).

It builds fine on x86_64.
I fixed some conflicts on Jason's patches.

I also tested it to verify it's working.

BTW, for some reason I did not get all the patches into my mailbox and I
suspect they were not all sent. Did you get all the series 0-13?

Please let me know, and if needed I'll resend.
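
For reference, that stack can be reconstructed roughly like this (the
mbox file name below is illustrative):

	git checkout 776b7b25f10b
	git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
	git am v4-series.mbox	# Jason's five patches, Max's one, then mine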

>
> > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> >
> > They also depend Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301
>
> The ones you included, right?
>

Right.

> > [...]

2020-08-05 08:11:49

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 08/12] vdpa: Modify get_vq_state() to return error code


On 2020/8/5 12:20 AM, Eli Cohen wrote:
> Modify get_vq_state() so it returns an error code. In case of hardware
> acceleration, the available index may be retrieved from the device, an
> operation that can possibly fail.
>
> Reviewed-by: Parav Pandit <[email protected]>
> Signed-off-by: Eli Cohen <[email protected]>


Acked-by: Jason Wang <[email protected]>


> ---
> drivers/vdpa/ifcvf/ifcvf_main.c | 5 +++--
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 5 +++--
> drivers/vhost/vdpa.c | 5 ++++-
> include/linux/vdpa.h | 4 ++--
> 4 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index dc311e972b9e..076d7ac5e723 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -237,12 +237,13 @@ static u16 ifcvf_vdpa_get_vq_num_max(struct vdpa_device *vdpa_dev)
> return IFCVF_QUEUE_MAX;
> }
>
> -static void ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
> - struct vdpa_vq_state *state)
> +static int ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
> + struct vdpa_vq_state *state)
> {
> struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
>
> state->avail_index = ifcvf_get_vq_state(vf, qid);
> + return 0;
> }
>
> static int ifcvf_vdpa_set_vq_state(struct vdpa_device *vdpa_dev, u16 qid,
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index f1c351d5959b..c68741363643 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -464,14 +464,15 @@ static int vdpasim_set_vq_state(struct vdpa_device *vdpa, u16 idx,
> return 0;
> }
>
> -static void vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx,
> - struct vdpa_vq_state *state)
> +static int vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx,
> + struct vdpa_vq_state *state)
> {
> struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
> struct vringh *vrh = &vq->vring;
>
> state->avail_index = vrh->last_avail_idx;
> + return 0;
> }
>
> static u32 vdpasim_get_vq_align(struct vdpa_device *vdpa)
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index c2e1e2d55084..a0b7c91948e1 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -392,7 +392,10 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> ops->set_vq_ready(vdpa, idx, s.num);
> return 0;
> case VHOST_GET_VRING_BASE:
> - ops->get_vq_state(v->vdpa, idx, &vq_state);
> + r = ops->get_vq_state(v->vdpa, idx, &vq_state);
> + if (r)
> + return r;
> +
> vq->last_avail_idx = vq_state.avail_index;
> break;
> case VHOST_GET_BACKEND_FEATURES:
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 8f412620071d..fd6e560d70f9 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -192,8 +192,8 @@ struct vdpa_config_ops {
> bool (*get_vq_ready)(struct vdpa_device *vdev, u16 idx);
> int (*set_vq_state)(struct vdpa_device *vdev, u16 idx,
> const struct vdpa_vq_state *state);
> - void (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
> - struct vdpa_vq_state *state);
> + int (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
> + struct vdpa_vq_state *state);
> struct vdpa_notification_area
> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
> int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
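
For a device where the available index lives in hardware, the new error
path matters. A hypothetical backend illustrating why (hw_query_avail_idx()
is a stand-in for a firmware query that can time out or fail):

	static int hw_vdpa_get_vq_state(struct vdpa_device *vdev, u16 idx,
					struct vdpa_vq_state *state)
	{
		u16 avail;
		int err;

		/* read the current available index back from the device */
		err = hw_query_avail_idx(vdev, idx, &avail);
		if (err)
			return err;

		state->avail_index = avail;
		return 0;
	}

With the old void signature, VHOST_GET_VRING_BASE had no way to report
such a failure to userspace.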

2020-08-05 08:14:35

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 12/12] vdpa/mlx5: Add VDPA driver for supported mlx5 devices


On 2020/8/5 12:20 AM, Eli Cohen wrote:
> Add a front-end VDPA driver that registers with the VDPA bus and provides
> networking to a guest. The VDPA driver creates the necessary resources
> on the VF it is driving so that the data path is offloaded.
>
> Notifications are communicated through the driver.
>
> Currently, only VFs are supported. In subsequent patches we will have
> devlink support to control which VF is used for VDPA and which function
> is used for regular networking.
>
> Reviewed-by: Parav Pandit <[email protected]>
> Signed-off-by: Eli Cohen <[email protected]>
> ---


Acked-by: Jason Wang <[email protected]>


2020-08-05 08:15:07

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices


On 2020/8/5 12:20 AM, Eli Cohen wrote:
> Hi Michael,
> please note that this series depends on mlx5 core device driver patches
> in mlx5-next branch in
> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
>
> They also depend Jason Wang's patches:https://lkml.org/lkml/2020/7/1/301
>
> Jason, I had to resolve some conflicts so I would appreciate of you can verify
> that it is ok.


Looks good to me.

Thanks

2020-08-05 08:15:22

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices


On 2020/8/5 1:01 PM, Eli Cohen wrote:
> On Tue, Aug 04, 2020 at 05:29:09PM -0400, Michael S. Tsirkin wrote:
>> On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
>>> Hi Michael,
>>> please note that this series depends on mlx5 core device driver patches
>>> in mlx5-next branch in
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
>> Thanks! OK so what's the plan for merging this?
>> Do patches at least build well enough that I can push them
>> upstream? Or do they have to go on top of the mellanox tree?
>>
> The patches are built on your linux-next branch which I updated
> yesterday.
>
> I am based on this commit:
> 776b7b25f10b (origin/linux-next) vhost: add an RPMsg API
>
> On top of that I merged
> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git
>
> and after that I have Jason's patches (five patches), then one patch
> from Max, and then my patches (seven).
>
> It builds fine on x86_64.
> I fixed some conflicts on Jason's patches.
>
> I also tested it to verify it's working.
>
> BTW, for some reason I did not get all the patches into my mailbox and I
> suspect they were not all sent. Did you get all the series 0-13?


I can see patch 0 to patch 12 but not patch 13 (I guess 12 is all).

Thanks


> [...]

2020-08-05 08:19:17

by Eli Cohen

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 04:12:46PM +0800, Jason Wang wrote:
> >
> >BTW, for some reason I did not get all the patches into my mailbox and I
> >suspect they were not all sent. Did you get all the series 0-13?
>
>
> I can see patch 0 to patch 12 but not patch 13 (I guess 12 is all).
>
> Thanks
>

Yes, I meant 13 patches 0-12.

Thanks

2020-08-05 16:35:44

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
> Hi Michael,
> please note that this series depends on mlx5 core device driver patches
> in mlx5-next branch in
> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
>
> They also depend Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301

So if I apply this to linux-next branch of my tree, I get:


CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
DESCEND objtool
CHK include/generated/compile.h
CC [M] drivers/vdpa/mlx5/net/main.o
In file included from ./include/linux/swab.h:5,
from ./include/uapi/linux/byteorder/little_endian.h:13,
from ./include/linux/byteorder/little_endian.h:5,
from ./arch/x86/include/uapi/asm/byteorder.h:5,
from ./include/asm-generic/bitops/le.h:6,
from ./arch/x86/include/asm/bitops.h:395,
from ./include/linux/bitops.h:29,
from ./include/linux/kernel.h:12,
from ./include/linux/list.h:9,
from ./include/linux/module.h:12,
from drivers/vdpa/mlx5/net/main.c:4:
drivers/vdpa/mlx5/net/main.c: In function ‘required_caps_supported’:
././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
| ^~~~~~~~~~~~~~~~~~
./include/uapi/linux/swab.h:115:54: note: in definition of macro ‘__swab32’
115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
| ^
./include/linux/byteorder/generic.h:95:21: note: in expansion of macro ‘__be32_to_cpu’
95 | #define be32_to_cpu __be32_to_cpu
| ^~~~~~~~~~~~~
./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
| ^~~~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
| ^~~~~~~~
./include/linux/mlx5/device.h:53:34: note: in expansion of macro ‘__mlx5_bit_off’
53 | #define __mlx5_dw_off(typ, fld) (__mlx5_bit_off(typ, fld) / 32)
| ^~~~~~~~~~~~~~
./include/linux/mlx5/device.h:96:1: note: in expansion of macro ‘__mlx5_dw_off’
96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
| ^~~~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/mlx5/driver.h:52,
from drivers/vdpa/mlx5/net/main.c:5:
./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
| ^~
./include/linux/mlx5/device.h:56:43: note: in expansion of macro ‘__mlx5_bit_sz’
56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
| ^~~~~~~~~~~~~
./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
| ^~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from <command-line>:
././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
| ^~~~~~~~~~~~~~~~~~
./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
| ^~~~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
| ^~~~~~~~
./include/linux/mlx5/device.h:56:70: note: in expansion of macro ‘__mlx5_bit_off’
56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
| ^~~~~~~~~~~~~~
./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
| ^~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/mlx5/driver.h:52,
from drivers/vdpa/mlx5/net/main.c:5:
./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
| ^~
./include/linux/mlx5/device.h:57:47: note: in expansion of macro ‘__mlx5_bit_sz’
57 | #define __mlx5_mask(typ, fld) ((u32)((1ull << __mlx5_bit_sz(typ, fld)) - 1))
| ^~~~~~~~~~~~~
./include/linux/mlx5/device.h:97:1: note: in expansion of macro ‘__mlx5_mask’
97 | __mlx5_mask(typ, fld))
| ^~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/swab.h:5,
from ./include/uapi/linux/byteorder/little_endian.h:13,
from ./include/linux/byteorder/little_endian.h:5,
from ./arch/x86/include/uapi/asm/byteorder.h:5,
from ./include/asm-generic/bitops/le.h:6,
from ./arch/x86/include/asm/bitops.h:395,
from ./include/linux/bitops.h:29,
from ./include/linux/kernel.h:12,
from ./include/linux/list.h:9,
from ./include/linux/module.h:12,
from drivers/vdpa/mlx5/net/main.c:4:
././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
| ^~~~~~~~~~~~~~~~~~
./include/uapi/linux/swab.h:115:54: note: in definition of macro ‘__swab32’
115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
| ^
./include/linux/byteorder/generic.h:95:21: note: in expansion of macro ‘__be32_to_cpu’
95 | #define be32_to_cpu __be32_to_cpu
| ^~~~~~~~~~~~~
./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
| ^~~~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
| ^~~~~~~~
./include/linux/mlx5/device.h:53:34: note: in expansion of macro ‘__mlx5_bit_off’
53 | #define __mlx5_dw_off(typ, fld) (__mlx5_bit_off(typ, fld) / 32)
| ^~~~~~~~~~~~~~
./include/linux/mlx5/device.h:96:1: note: in expansion of macro ‘__mlx5_dw_off’
96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
| ^~~~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/mlx5/driver.h:52,
from drivers/vdpa/mlx5/net/main.c:5:
./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
| ^~
./include/linux/mlx5/device.h:56:43: note: in expansion of macro ‘__mlx5_bit_sz’
56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
| ^~~~~~~~~~~~~
./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
| ^~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from <command-line>:
././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
| ^~~~~~~~~~~~~~~~~~
./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
| ^~~~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
| ^~~~~~~~
./include/linux/mlx5/device.h:56:70: note: in expansion of macro ‘__mlx5_bit_off’
56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
| ^~~~~~~~~~~~~~
./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
| ^~~~~~~~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ./include/linux/mlx5/driver.h:52,
from drivers/vdpa/mlx5/net/main.c:5:
./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
| ^~
./include/linux/mlx5/device.h:57:47: note: in expansion of macro ‘__mlx5_bit_sz’
57 | #define __mlx5_mask(typ, fld) ((u32)((1ull << __mlx5_bit_sz(typ, fld)) - 1))
| ^~~~~~~~~~~~~
./include/linux/mlx5/device.h:97:1: note: in expansion of macro ‘__mlx5_mask’
97 | __mlx5_mask(typ, fld))
| ^~~~~~~~~~~
./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
1355 | MLX5_GET(device_virtio_emulation_cap, \
| ^~~~~~~~
drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vdpa/mlx5/net/main.c: At top level:
drivers/vdpa/mlx5/net/main.c:62:14: error: ‘MLX5_INTERFACE_PROTOCOL_VDPA’ undeclared here (not in a function); did you mean ‘MLX5_INTERFACE_PROTOCOL_ETH’?
62 | .protocol = MLX5_INTERFACE_PROTOCOL_VDPA,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| MLX5_INTERFACE_PROTOCOL_ETH
make[3]: *** [scripts/Makefile.build:281: drivers/vdpa/mlx5/net/main.o] Error 1
make[2]: *** [scripts/Makefile.build:497: drivers/vdpa/mlx5] Error 2
make[1]: *** [scripts/Makefile.build:497: drivers/vdpa] Error 2
make: *** [Makefile:1756: drivers] Error 2


I am guessing this is because of the missing dependency, right?
So what's the plan for merging this?


--
MST

2020-08-05 17:28:22

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 03:40:54PM +0300, Eli Cohen wrote:
> On Wed, Aug 05, 2020 at 08:00:55AM -0400, Michael S. Tsirkin wrote:
> > On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
> > > Hi Michael,
> > > please note that this series depends on mlx5 core device driver patches
> > > in mlx5-next branch in
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
> > >
> > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> > >
> > > They also depend Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301
> >
> > So if I apply this to linux-next branch of my tree, I get:
> >
>
> Did you merge this?:
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next


I can only merge this tree if no one else will. Linus does not like
getting the same patches through two trees.

Is this the case? Is mlx5-next going to be merged through
my tree in this cycle?


> > [...]

2020-08-05 19:07:00

by Saeed Mahameed

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, 2020-08-05 at 09:12 -0400, Michael S. Tsirkin wrote:
> On Wed, Aug 05, 2020 at 04:01:58PM +0300, Eli Cohen wrote:
> > On Wed, Aug 05, 2020 at 08:48:52AM -0400, Michael S. Tsirkin wrote:
> > > > Did you merge this?:
> > > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> > >
> > > I can only merge this tree if no one else will. Linus does not like
> > > getting the same patches through two trees.
> > >
> > > Is this the case? Is mlx5-next going to be merged through
> > > my tree in this cycle?
> > >
> >
> > Saeed Mahameed from Mellanox (located in California) usually sends
> > out net patches. So he's supposed to send that to Dave Miller.
> >
> > I think Saeed should answer this. Let's wait a few more hours till he
> > wakes up.
>
> Alternatives:
> - merge vdpa through Saeed's tree. I can ack that; we'll need to
>   resolve any conflicts by merging the two trees and show the
>   result to Linus so he can resolve the merge in the same way.
> - extract just the necessary patches that are needed for vdpa and
>   merge through my tree.
> - if Saeed sends his pull today, it's likely it will be merged
>   early next week. Then I can rebase and send a pull with your
>   patches on top. A bit risky.
> - do some tricks with build. E.g. disable build of your code,
>   and enable in Saeed's tree when everything is merged together.
>   Can be somewhat hard.
>

Hi Michael,

We do this all the time with net-next and rdma,
mlx5-next is a very small branch based on a very early rc that includes
mlx5 shared stuff between rdma and net-next, and now virtio as well.

we send pull requests of mlx5-next to both rdma and net-next with the
respective features, exactly as we did here, and it works nicely, since
we reduce the number of conflicts to 0 between different subsystems
that rely on mlx5 core.

all the alternatives you suggested have never been tried before :),
net-next is closed, so I can't do further submissions.
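
The flow Saeed describes looks roughly like this (tree paths and the
base tag are illustrative):

	# mlx5-next: a small, never-rebased branch on an early -rc
	git checkout -b mlx5-next v5.8-rc1
	# each consumer pulls the same immutable commits, so the trees
	# share identical history and merge without conflicts
	git -C net-next pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
	git -C rdma pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
	git -C vhost pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next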


2020-08-05 19:23:36

by Eli Cohen

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 08:00:55AM -0400, Michael S. Tsirkin wrote:
> On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
> > Hi Michael,
> > please note that this series depends on mlx5 core device driver patches
> > in mlx5-next branch in
> > git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
> >
> > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> >
> > They also depend Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301
>
> So if I apply this to linux-next branch of my tree, I get:
>

Did you merge this?:
git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next

>
> CALL scripts/checksyscalls.sh
> CALL scripts/atomic/check-atomics.sh
> DESCEND objtool
> CHK include/generated/compile.h
> CC [M] drivers/vdpa/mlx5/net/main.o
> In file included from ./include/linux/swab.h:5,
> from ./include/uapi/linux/byteorder/little_endian.h:13,
> from ./include/linux/byteorder/little_endian.h:5,
> from ./arch/x86/include/uapi/asm/byteorder.h:5,
> from ./include/asm-generic/bitops/le.h:6,
> from ./arch/x86/include/asm/bitops.h:395,
> from ./include/linux/bitops.h:29,
> from ./include/linux/kernel.h:12,
> from ./include/linux/list.h:9,
> from ./include/linux/module.h:12,
> from drivers/vdpa/mlx5/net/main.c:4:
> drivers/vdpa/mlx5/net/main.c: In function ‘required_caps_supported’:
> ././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
> 129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
> | ^~~~~~~~~~~~~~~~~~
> ./include/uapi/linux/swab.h:115:54: note: in definition of macro ‘__swab32’
> 115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
> | ^
> ./include/linux/byteorder/generic.h:95:21: note: in expansion of macro ‘__be32_to_cpu’
> 95 | #define be32_to_cpu __be32_to_cpu
> | ^~~~~~~~~~~~~
> ./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
> 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
> | ^~~~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
> 51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
> | ^~~~~~~~
> ./include/linux/mlx5/device.h:53:34: note: in expansion of macro ‘__mlx5_bit_off’
> 53 | #define __mlx5_dw_off(typ, fld) (__mlx5_bit_off(typ, fld) / 32)
> | ^~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:96:1: note: in expansion of macro ‘__mlx5_dw_off’
> 96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
> | ^~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ./include/linux/mlx5/driver.h:52,
> from drivers/vdpa/mlx5/net/main.c:5:
> ./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
> 50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
> | ^~
> ./include/linux/mlx5/device.h:56:43: note: in expansion of macro ‘__mlx5_bit_sz’
> 56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
> | ^~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
> 96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
> | ^~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from <command-line>:
> ././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
> 129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
> | ^~~~~~~~~~~~~~~~~~
> ./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
> 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
> | ^~~~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
> 51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
> | ^~~~~~~~
> ./include/linux/mlx5/device.h:56:70: note: in expansion of macro ‘__mlx5_bit_off’
> 56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
> | ^~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
> 96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
> | ^~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ./include/linux/mlx5/driver.h:52,
> from drivers/vdpa/mlx5/net/main.c:5:
> ./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘event_mode’
> 50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
> | ^~
> ./include/linux/mlx5/device.h:57:47: note: in expansion of macro ‘__mlx5_bit_sz’
> 57 | #define __mlx5_mask(typ, fld) ((u32)((1ull << __mlx5_bit_sz(typ, fld)) - 1))
> | ^~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:97:1: note: in expansion of macro ‘__mlx5_mask’
> 97 | __mlx5_mask(typ, fld))
> | ^~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:24:15: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 24 | event_mode = MLX5_CAP_DEV_VDPA_EMULATION(mdev, event_mode);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ./include/linux/swab.h:5,
> from ./include/uapi/linux/byteorder/little_endian.h:13,
> from ./include/linux/byteorder/little_endian.h:5,
> from ./arch/x86/include/uapi/asm/byteorder.h:5,
> from ./include/asm-generic/bitops/le.h:6,
> from ./arch/x86/include/asm/bitops.h:395,
> from ./include/linux/bitops.h:29,
> from ./include/linux/kernel.h:12,
> from ./include/linux/list.h:9,
> from ./include/linux/module.h:12,
> from drivers/vdpa/mlx5/net/main.c:4:
> ././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
> 129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
> | ^~~~~~~~~~~~~~~~~~
> ./include/uapi/linux/swab.h:115:54: note: in definition of macro ‘__swab32’
> 115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
> | ^
> ./include/linux/byteorder/generic.h:95:21: note: in expansion of macro ‘__be32_to_cpu’
> 95 | #define be32_to_cpu __be32_to_cpu
> | ^~~~~~~~~~~~~
> ./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
> 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
> | ^~~~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
> 51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
> | ^~~~~~~~
> ./include/linux/mlx5/device.h:53:34: note: in expansion of macro ‘__mlx5_bit_off’
> 53 | #define __mlx5_dw_off(typ, fld) (__mlx5_bit_off(typ, fld) / 32)
> | ^~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:96:1: note: in expansion of macro ‘__mlx5_dw_off’
> 96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
> | ^~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ./include/linux/mlx5/driver.h:52,
> from drivers/vdpa/mlx5/net/main.c:5:
> ./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
> 50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
> | ^~
> ./include/linux/mlx5/device.h:56:43: note: in expansion of macro ‘__mlx5_bit_sz’
> 56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
> | ^~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
> 96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
> | ^~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from <command-line>:
> ././include/linux/compiler_types.h:129:35: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
> 129 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
> | ^~~~~~~~~~~~~~~~~~
> ./include/linux/stddef.h:17:32: note: in expansion of macro ‘__compiler_offsetof’
> 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
> | ^~~~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:51:35: note: in expansion of macro ‘offsetof’
> 51 | #define __mlx5_bit_off(typ, fld) (offsetof(struct mlx5_ifc_##typ##_bits, fld))
> | ^~~~~~~~
> ./include/linux/mlx5/device.h:56:70: note: in expansion of macro ‘__mlx5_bit_off’
> 56 | #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
> | ^~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:96:30: note: in expansion of macro ‘__mlx5_dw_bit_off’
> 96 | __mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
> | ^~~~~~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> In file included from ./include/linux/mlx5/driver.h:52,
> from drivers/vdpa/mlx5/net/main.c:5:
> ./include/linux/mlx5/device.h:50:57: error: ‘struct mlx5_ifc_device_virtio_emulation_cap_bits’ has no member named ‘eth_frame_offload_type’
> 50 | #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
> | ^~
> ./include/linux/mlx5/device.h:57:47: note: in expansion of macro ‘__mlx5_bit_sz’
> 57 | #define __mlx5_mask(typ, fld) ((u32)((1ull << __mlx5_bit_sz(typ, fld)) - 1))
> | ^~~~~~~~~~~~~
> ./include/linux/mlx5/device.h:97:1: note: in expansion of macro ‘__mlx5_mask’
> 97 | __mlx5_mask(typ, fld))
> | ^~~~~~~~~~~
> ./include/linux/mlx5/device.h:1355:2: note: in expansion of macro ‘MLX5_GET’
> 1355 | MLX5_GET(device_virtio_emulation_cap, \
> | ^~~~~~~~
> drivers/vdpa/mlx5/net/main.c:28:7: note: in expansion of macro ‘MLX5_CAP_DEV_VDPA_EMULATION’
> 28 | if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, eth_frame_offload_type))
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/vdpa/mlx5/net/main.c: At top level:
> drivers/vdpa/mlx5/net/main.c:62:14: error: ‘MLX5_INTERFACE_PROTOCOL_VDPA’ undeclared here (not in a function); did you mean ‘MLX5_INTERFACE_PROTOCOL_ETH’?
> 62 | .protocol = MLX5_INTERFACE_PROTOCOL_VDPA,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> | MLX5_INTERFACE_PROTOCOL_ETH
> make[3]: *** [scripts/Makefile.build:281: drivers/vdpa/mlx5/net/main.o] Error 1
> make[2]: *** [scripts/Makefile.build:497: drivers/vdpa/mlx5] Error 2
> make[1]: *** [scripts/Makefile.build:497: drivers/vdpa] Error 2
> make: *** [Makefile:1756: drivers] Error 2
>
>
> I am guessing this is because of the missing dependency, right?
> So what's the plan for merging this?
>
>
> --
> MST
>
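
For reference, the failure above goes away once the dependency is merged
first. A minimal sequence (URL and branch as in the cover letter; the
single-directory rebuild is just a convenience):

  # on top of the linux-next based tree, merge the mlx5 core bits first:
  git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
  # then apply this series and rebuild just the affected directory:
  make drivers/vdpa/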

2020-08-05 19:47:36

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 07:01:52PM +0000, Saeed Mahameed wrote:
> On Wed, 2020-08-05 at 09:12 -0400, Michael S. Tsirkin wrote:
> > On Wed, Aug 05, 2020 at 04:01:58PM +0300, Eli Cohen wrote:
> > > On Wed, Aug 05, 2020 at 08:48:52AM -0400, Michael S. Tsirkin wrote:
> > > > > Did you merge this?:
> > > > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> > > >
> > > > I can only merge this tree if no one else will. Linus does not
> > > > like getting same patches through two trees.

This is not quite the case.

Linus does not like multiple *copies* of the same patches. The same
actual git commits can be OK.
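
A quick way to check that two trees share the commits rather than carry
copies (the remote-tracking branch names here are hypothetical):

  # exits 0 only when the mlx5-next tip is already contained in net-next,
  # i.e. the exact same commits (same SHAs) exist in both trees:
  git merge-base --is-ancestor mellanox/mlx5-next net-next/master && echo shared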

Linus also does not like unnecessarily cross-linking trees; mlx5-next
is designed to be small enough and approved enough that it is not
controversial.

Linus really doesn't like it when people jam stuff together in rc7 or
the weeks of the merge window. He wants to see stuff sit in linux-next
for at least a bit. So it may be too late regardless.

> We do this all the time with net-next and rdma,
> mlx5-next is a very small branch based on a very early rc that includes
> mlx5 shared stuff between rdma and net-next, and now virtio as well.

Yes, going on two years now? Been working well

Jason

2020-08-05 20:03:40

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 04:01:58PM +0300, Eli Cohen wrote:
> On Wed, Aug 05, 2020 at 08:48:52AM -0400, Michael S. Tsirkin wrote:
> > >
> > > Did you merge this?:
> > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> >
> >
> > I can only merge this tree if no one else will. Linus does not like
> > getting same patches through two trees.
> >
> > Is this the case? Is mlx5-next going to be merged through
> > my tree in this cycle?
> >
>
> Saeed Mahameed from Mellanox (located in California) usually sends out
> net patches. So he's supposed to send that to Dave Miller.
>
> I think Saeed should answer this. Let's wait a few more hours till he
> wakes up.

Alternatives:
- merge vdpa through Saeed's tree. I can ack that; we'll need to
resolve any conflicts by merging the two trees and show the
result to Linus so he can resolve the merge in the same way.
- extract just the necessary patches that are needed for vdpa and
merge through my tree.
- if Saeed sends his pull today, it's likely it will be merged
early next week. Then I can rebase and send a pull with your patches
on top. A bit risky.
- do some tricks with build. E.g. disable build of your code,
and enable in Saeed's tree when everything is merged together.
Can be somewhat hard.

2020-08-05 20:14:54

by Eli Cohen

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 08:48:52AM -0400, Michael S. Tsirkin wrote:
> >
> > Did you merge this?:
> > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
>
>
> I can only merge this tree if no one else will. Linus does not like
> getting same patches through two trees.
>
> Is this the case? Is mlx5-next going to be merged through
> my tree in this cycle?
>

Saeed Mahameed from Mellanox (located in California) usually sends out
net patches. So he's supposed to send that to Dave Miller.

I think Saeed should answer this. Let's wait a few more hours till he
wakes up.

2020-08-05 22:32:23

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

On Wed, Aug 05, 2020 at 04:46:46PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 05, 2020 at 07:01:52PM +0000, Saeed Mahameed wrote:
> > On Wed, 2020-08-05 at 09:12 -0400, Michael S. Tsirkin wrote:
> > > On Wed, Aug 05, 2020 at 04:01:58PM +0300, Eli Cohen wrote:
> > > > On Wed, Aug 05, 2020 at 08:48:52AM -0400, Michael S. Tsirkin wrote:
> > > > > > Did you merge this?:
> > > > > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5-next
> > > > >
> > > > > I can only merge this tree if no one else will. Linus does not
> > > > > like getting same patches through two trees.
>
> This is not quite the case.
>
> Linus does not like multiple *copies* of the same patches. The same
> actual git commits can be OK.
>
> Linus also does not like unnecessarily cross-linking trees; mlx5-next
> is designed to be small enough and approved enough that it is not
> controversial.
>
> Linus really doesn't like it when people jam stuff together in rc7 or
> the weeks of the merge window. He wants to see stuff sit in linux-next
> for at least a bit. So it may be too late regardless.

I'll try; let's see what happens.

> > We do this all the time with net-next and rdma,
> > mlx5-next is a very small branch based on a very early rc that includes
> > mlx5 shared stuff between rdma and net-next, and now virtio as well.
>
> Yes, going on two years now? Been working well
>
> Jason

OK, I'll merge it then. Thanks!

--
MST