Hi,
This series attempts to add vfio live migration support for
HiSilicon ACC VF devices based on the new v2 migration protocol
definition and mlx5 v8 series discussed here[0].
RFCv4 --> v5
- Dropped RFC tag as v2 migration APIs are more stable now.
- Addressed review comments from Jason and Alex (Thanks!).
This is sanity tested on a HiSilicon platform using the Qemu branch
provided here[1].
Please take a look and let me know your feedback.
Thanks,
Shameer
[0] https://lore.kernel.org/kvm/[email protected]/
[1] https://github.com/jgunthorpe/qemu/commits/vfio_migration_v2
v3 --> RFCv4
-Based on migration v2 protocol and mlx5 v7 series.
-Added RFC tag again as migration v2 protocol is still under discussion.
-Added new patch #6 to retrieve the PF QM data.
-PRE_COPY compatibility check is now done after the migration data
transfer. This is not ideal and needs discussion.
RFC v2 --> v3
-Dropped RFC tag as the vfio_pci_core subsystem framework is now
part of 5.15-rc1.
-Added override methods for vfio_device_ops read/write/mmap calls
to limit the access within the functional register space.
-Patches 1 to 3 are code refactoring to move the common ACC QM
definitions and header around.
RFCv1 --> RFCv2
-Adds a new vendor-specific vfio_pci driver (hisi-acc-vfio-pci)
for HiSilicon ACC VF devices based on the new vfio-pci-core
framework proposal.
-Since HiSilicon ACC VF device MMIO space contains both the
functional register space and migration control register space,
override the vfio_device_ops ioctl method to report only the
functional space to VMs.
-For a successful migration, we still need access to the VF dev
functional register space, mainly to read the status registers.
But accessing these while the Guest vCPUs are running may leave
a security hole. To avoid any potential security issues, we
map/unmap the MMIO regions on an as-needed basis, which is safe to do.
(Please see the hisi_acc_vf_ioremap/unmap() fns in patch #4).
-Dropped debugfs support for now.
-Uses common QM functions for mailbox access (patch #3).
Longfang Liu (2):
crypto: hisilicon/qm: Move few definitions to common header
hisi_acc_vfio_pci: Add support for VFIO live migration
Shameer Kolothum (6):
crypto: hisilicon/qm: Move the QM header to include/linux
hisi_acc_qm: Move PCI device IDs to common header
hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices
hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region
hisi_acc_vfio_pci: Add helper to retrieve the PF qm data
hisi_acc_vfio_pci: Use its own PCI reset_done error handler
drivers/crypto/hisilicon/hpre/hpre.h | 2 +-
drivers/crypto/hisilicon/hpre/hpre_main.c | 18 +-
drivers/crypto/hisilicon/qm.c | 34 +-
drivers/crypto/hisilicon/sec2/sec.h | 2 +-
drivers/crypto/hisilicon/sec2/sec_main.c | 20 +-
drivers/crypto/hisilicon/sgl.c | 2 +-
drivers/crypto/hisilicon/zip/zip.h | 2 +-
drivers/crypto/hisilicon/zip/zip_main.c | 17 +-
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/hisilicon/Kconfig | 16 +
drivers/vfio/pci/hisilicon/Makefile | 4 +
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 1316 +++++++++++++++++
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.h | 119 ++
.../qm.h => include/linux/hisi_acc_qm.h | 44 +
include/linux/pci_ids.h | 6 +
16 files changed, 1552 insertions(+), 54 deletions(-)
create mode 100644 drivers/vfio/pci/hisilicon/Kconfig
create mode 100644 drivers/vfio/pci/hisilicon/Makefile
create mode 100644 drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
create mode 100644 drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
rename drivers/crypto/hisilicon/qm.h => include/linux/hisi_acc_qm.h (88%)
--
2.25.1
Provides a helper function to retrieve the PF QM data associated
with an ACC VF dev. This makes use of pci_iov_get_pf_drvdata() to
get the PF drvdata safely. Also introduces helpers to retrieve the
ACC PF dev struct pci_driver pointers, as these are an input to
pci_iov_get_pf_drvdata().
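For illustration, the vfio driver side can then resolve the PF QM data
from a VF and gate its migration support on it, roughly as below (sketch
only; hisi_acc_vf_migration_supported() is a hypothetical wrapper, and
the QM_HW_V3 check mirrors what the probe path in a later patch of this
series does):

	/* Sketch: decide at probe time whether migration can be offered */
	static bool hisi_acc_vf_migration_supported(struct pci_dev *vf_pdev)
	{
		/* NULL if the PF driver is not bound or the VF is unknown */
		struct hisi_qm *pf_qm = hisi_acc_get_pf_qm(vf_pdev);

		return pf_qm && pf_qm->ver >= QM_HW_V3;
	}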
Signed-off-by: Shameer Kolothum <[email protected]>
---
drivers/crypto/hisilicon/hpre/hpre_main.c | 6 ++++
drivers/crypto/hisilicon/sec2/sec_main.c | 6 ++++
drivers/crypto/hisilicon/zip/zip_main.c | 6 ++++
drivers/vfio/pci/hisilicon/Kconfig | 7 +++++
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 30 +++++++++++++++++++
include/linux/hisi_acc_qm.h | 5 ++++
6 files changed, 60 insertions(+)
diff --git a/drivers/crypto/hisilicon/hpre/hpre_main.c b/drivers/crypto/hisilicon/hpre/hpre_main.c
index ba4043447e53..80fb9ef8c571 100644
--- a/drivers/crypto/hisilicon/hpre/hpre_main.c
+++ b/drivers/crypto/hisilicon/hpre/hpre_main.c
@@ -1189,6 +1189,12 @@ static struct pci_driver hpre_pci_driver = {
.driver.pm = &hpre_pm_ops,
};
+struct pci_driver *hisi_hpre_get_pf_driver(void)
+{
+ return &hpre_pci_driver;
+}
+EXPORT_SYMBOL(hisi_hpre_get_pf_driver);
+
static void hpre_register_debugfs(void)
{
if (!debugfs_initialized())
diff --git a/drivers/crypto/hisilicon/sec2/sec_main.c b/drivers/crypto/hisilicon/sec2/sec_main.c
index ab806fb481ac..d8fb5c2b3482 100644
--- a/drivers/crypto/hisilicon/sec2/sec_main.c
+++ b/drivers/crypto/hisilicon/sec2/sec_main.c
@@ -1087,6 +1087,12 @@ static struct pci_driver sec_pci_driver = {
.driver.pm = &sec_pm_ops,
};
+struct pci_driver *hisi_sec_get_pf_driver(void)
+{
+ return &sec_pci_driver;
+}
+EXPORT_SYMBOL(hisi_sec_get_pf_driver);
+
static void sec_register_debugfs(void)
{
if (!debugfs_initialized())
diff --git a/drivers/crypto/hisilicon/zip/zip_main.c b/drivers/crypto/hisilicon/zip/zip_main.c
index f4a517728385..b6ccc7e8f37e 100644
--- a/drivers/crypto/hisilicon/zip/zip_main.c
+++ b/drivers/crypto/hisilicon/zip/zip_main.c
@@ -1010,6 +1010,12 @@ static struct pci_driver hisi_zip_pci_driver = {
.driver.pm = &hisi_zip_pm_ops,
};
+struct pci_driver *hisi_zip_get_pf_driver(void)
+{
+ return &hisi_zip_pci_driver;
+}
+EXPORT_SYMBOL(hisi_zip_get_pf_driver);
+
static void hisi_zip_register_debugfs(void)
{
if (!debugfs_initialized())
diff --git a/drivers/vfio/pci/hisilicon/Kconfig b/drivers/vfio/pci/hisilicon/Kconfig
index d5acaf74a878..02811364a7a7 100644
--- a/drivers/vfio/pci/hisilicon/Kconfig
+++ b/drivers/vfio/pci/hisilicon/Kconfig
@@ -2,6 +2,13 @@
config HISI_ACC_VFIO_PCI
tristate "VFIO PCI support for HiSilicon ACC devices"
depends on (ARM64 && VFIO_PCI_CORE) || (COMPILE_TEST && 64BIT)
+ depends on PCI && PCI_MSI
+ depends on UACCE || UACCE=n
+ depends on ACPI
+ select CRYPTO_DEV_HISI_QM
+ select CRYPTO_DEV_HISI_HPRE
+ select CRYPTO_DEV_HISI_SEC2
+ select CRYPTO_DEV_HISI_ZIP
help
This provides generic PCI support for HiSilicon ACC devices
using the VFIO framework.
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 585eb84684c9..9c87ab74bf7f 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -13,6 +13,36 @@
#include <linux/vfio.h>
#include <linux/vfio_pci_core.h>
+static struct hisi_qm *hisi_acc_get_pf_qm(struct pci_dev *pdev)
+{
+ struct hisi_qm *pf_qm;
+ struct pci_driver *pf_driver;
+
+ if (!pdev->is_virtfn)
+ return NULL;
+
+ switch (pdev->device) {
+ case PCI_DEVICE_ID_HUAWEI_SEC_VF:
+ pf_driver = hisi_sec_get_pf_driver();
+ break;
+ case PCI_DEVICE_ID_HUAWEI_HPRE_VF:
+ pf_driver = hisi_hpre_get_pf_driver();
+ break;
+ case PCI_DEVICE_ID_HUAWEI_ZIP_VF:
+ pf_driver = hisi_zip_get_pf_driver();
+ break;
+ default:
+ return NULL;
+ }
+
+ if (!pf_driver)
+ return NULL;
+
+ pf_qm = pci_iov_get_pf_drvdata(pdev, pf_driver);
+
+ return !IS_ERR(pf_qm) ? pf_qm : NULL;
+}
+
static int hisi_acc_pci_rw_access_check(struct vfio_device *core_vdev,
size_t count, loff_t *ppos,
size_t *new_count)
diff --git a/include/linux/hisi_acc_qm.h b/include/linux/hisi_acc_qm.h
index 5eb1e87ccd70..393ef17d306e 100644
--- a/include/linux/hisi_acc_qm.h
+++ b/include/linux/hisi_acc_qm.h
@@ -477,4 +477,9 @@ void hisi_qm_pm_init(struct hisi_qm *qm);
int hisi_qm_get_dfx_access(struct hisi_qm *qm);
void hisi_qm_put_dfx_access(struct hisi_qm *qm);
void hisi_qm_regs_dump(struct seq_file *s, struct debugfs_regset32 *regset);
+
+/* Used by VFIO ACC live migration driver */
+struct pci_driver *hisi_sec_get_pf_driver(void);
+struct pci_driver *hisi_hpre_get_pf_driver(void);
+struct pci_driver *hisi_zip_get_pf_driver(void);
#endif
--
2.25.1
Since we are going to introduce the VFIO PCI HiSilicon ACC
driver for live migration in subsequent patches, move the
ACC QM header file to a common include directory.
Signed-off-by: Shameer Kolothum <[email protected]>
---
drivers/crypto/hisilicon/hpre/hpre.h | 2 +-
drivers/crypto/hisilicon/qm.c | 2 +-
drivers/crypto/hisilicon/sec2/sec.h | 2 +-
drivers/crypto/hisilicon/sgl.c | 2 +-
drivers/crypto/hisilicon/zip/zip.h | 2 +-
drivers/crypto/hisilicon/qm.h => include/linux/hisi_acc_qm.h | 0
6 files changed, 5 insertions(+), 5 deletions(-)
rename drivers/crypto/hisilicon/qm.h => include/linux/hisi_acc_qm.h (100%)
diff --git a/drivers/crypto/hisilicon/hpre/hpre.h b/drivers/crypto/hisilicon/hpre/hpre.h
index e0b4a1982ee9..9a0558ed82f9 100644
--- a/drivers/crypto/hisilicon/hpre/hpre.h
+++ b/drivers/crypto/hisilicon/hpre/hpre.h
@@ -4,7 +4,7 @@
#define __HISI_HPRE_H
#include <linux/list.h>
-#include "../qm.h"
+#include <linux/hisi_acc_qm.h>
#define HPRE_SQE_SIZE sizeof(struct hpre_sqe)
#define HPRE_PF_DEF_Q_NUM 64
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index c5b84a5ea350..ed23e1d3fa27 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -15,7 +15,7 @@
#include <linux/uacce.h>
#include <linux/uaccess.h>
#include <uapi/misc/uacce/hisi_qm.h>
-#include "qm.h"
+#include <linux/hisi_acc_qm.h>
/* eq/aeq irq enable */
#define QM_VF_AEQ_INT_SOURCE 0x0
diff --git a/drivers/crypto/hisilicon/sec2/sec.h b/drivers/crypto/hisilicon/sec2/sec.h
index d97cf02b1df7..c2e9b01187a7 100644
--- a/drivers/crypto/hisilicon/sec2/sec.h
+++ b/drivers/crypto/hisilicon/sec2/sec.h
@@ -4,7 +4,7 @@
#ifndef __HISI_SEC_V2_H
#define __HISI_SEC_V2_H
-#include "../qm.h"
+#include <linux/hisi_acc_qm.h>
#include "sec_crypto.h"
/* Algorithm resource per hardware SEC queue */
diff --git a/drivers/crypto/hisilicon/sgl.c b/drivers/crypto/hisilicon/sgl.c
index 057273769f26..534687401135 100644
--- a/drivers/crypto/hisilicon/sgl.c
+++ b/drivers/crypto/hisilicon/sgl.c
@@ -3,7 +3,7 @@
#include <linux/dma-mapping.h>
#include <linux/module.h>
#include <linux/slab.h>
-#include "qm.h"
+#include <linux/hisi_acc_qm.h>
#define HISI_ACC_SGL_SGE_NR_MIN 1
#define HISI_ACC_SGL_NR_MAX 256
diff --git a/drivers/crypto/hisilicon/zip/zip.h b/drivers/crypto/hisilicon/zip/zip.h
index 517fdbdff3ea..3dfd3bac5a33 100644
--- a/drivers/crypto/hisilicon/zip/zip.h
+++ b/drivers/crypto/hisilicon/zip/zip.h
@@ -7,7 +7,7 @@
#define pr_fmt(fmt) "hisi_zip: " fmt
#include <linux/list.h>
-#include "../qm.h"
+#include <linux/hisi_acc_qm.h>
enum hisi_zip_error_type {
/* negative compression */
diff --git a/drivers/crypto/hisilicon/qm.h b/include/linux/hisi_acc_qm.h
similarity index 100%
rename from drivers/crypto/hisilicon/qm.h
rename to include/linux/hisi_acc_qm.h
--
2.25.1
From: Longfang Liu <[email protected]>
VMs assigned with HiSilicon ACC VF devices can now perform
live migration if the VF devices are bound to the hisi_acc_vfio_pci
driver.
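For reference, a rough sketch of how userspace drives a stop-copy save
through this driver, using the v2 migration uAPI names as they eventually
landed (VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE); this is illustrative only
and trims error handling:

	#include <linux/vfio.h>
	#include <sys/ioctl.h>
	#include <unistd.h>

	/* Sketch: move the source VF to STOP_COPY and read out its state */
	static ssize_t save_vf_state(int device_fd, void *buf, size_t len)
	{
		__u64 set_buf[(sizeof(struct vfio_device_feature) +
			       sizeof(struct vfio_device_feature_mig_state) + 7) / 8] = {};
		struct vfio_device_feature *feature = (struct vfio_device_feature *)set_buf;
		struct vfio_device_feature_mig_state *mig =
			(struct vfio_device_feature_mig_state *)feature->data;

		feature->argsz = sizeof(set_buf);
		feature->flags = VFIO_DEVICE_FEATURE_SET |
				 VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;

		/* RUNNING -> STOP: the driver quiesces the VF QM */
		mig->device_state = VFIO_DEVICE_STATE_STOP;
		if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature))
			return -1;

		/* STOP -> STOP_COPY: the driver hands back a read-only data fd */
		mig->device_state = VFIO_DEVICE_STATE_STOP_COPY;
		if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature))
			return -1;

		/* The fd carries struct acc_vf_data (match data + QM state) */
		return read(mig->data_fd, buf, len);
	}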
Signed-off-by: Longfang Liu <[email protected]>
Signed-off-by: Shameer Kolothum <[email protected]>
---
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 1045 ++++++++++++++++-
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.h | 117 ++
2 files changed, 1144 insertions(+), 18 deletions(-)
create mode 100644 drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 9c87ab74bf7f..6ad41375ac71 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -13,6 +13,957 @@
#include <linux/vfio.h>
#include <linux/vfio_pci_core.h>
+#include <linux/anon_inodes.h>
+
+#include "hisi_acc_vfio_pci.h"
+
+/* return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
+static int qm_wait_dev_not_ready(struct hisi_qm *qm)
+{
+ u32 val;
+
+ return readl_relaxed_poll_timeout(qm->io_base + QM_VF_STATE,
+ val, !(val & 0x1), MB_POLL_PERIOD_US,
+ MB_POLL_TIMEOUT_US);
+}
+
+/*
+ * Each state Reg is checked 100 times,
+ * with a delay of 100 microseconds after each check
+ */
+static u32 acc_check_reg_state(struct hisi_qm *qm, u32 regs)
+{
+ int check_times = 0;
+ u32 state;
+
+ state = readl(qm->io_base + regs);
+ while (state && check_times < ERROR_CHECK_TIMEOUT) {
+ udelay(CHECK_DELAY_TIME);
+ state = readl(qm->io_base + regs);
+ check_times++;
+ }
+
+ return state;
+}
+
+/* Check the PF's RAS state and Function INT state */
+static int qm_check_int_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct hisi_qm *vfqm = &hisi_acc_vdev->vf_qm;
+ struct hisi_qm *qm = hisi_acc_vdev->pf_qm;
+ struct pci_dev *vf_pdev = hisi_acc_vdev->vf_dev;
+ struct device *dev = &qm->pdev->dev;
+ u32 state;
+
+ /* Check RAS state */
+ state = acc_check_reg_state(qm, QM_ABNORMAL_INT_STATUS);
+ if (state) {
+ dev_err(dev, "failed to check QM RAS state!\n");
+ return -EBUSY;
+ }
+
+ /* Check Function Communication state between PF and VF */
+ state = acc_check_reg_state(vfqm, QM_IFC_INT_STATUS);
+ if (state) {
+ dev_err(dev, "failed to check QM IFC INT state!\n");
+ return -EBUSY;
+ }
+ state = acc_check_reg_state(vfqm, QM_IFC_INT_SET_V);
+ if (state) {
+ dev_err(dev, "failed to check QM IFC INT SET state!\n");
+ return -EBUSY;
+ }
+
+ /* Check submodule task state */
+ switch (vf_pdev->device) {
+ case PCI_DEVICE_ID_HUAWEI_SEC_VF:
+ state = acc_check_reg_state(qm, SEC_CORE_INT_STATUS);
+ if (state) {
+ dev_err(dev, "failed to check QM SEC Core INT state!\n");
+ return -EBUSY;
+ }
+ return 0;
+ case PCI_DEVICE_ID_HUAWEI_HPRE_VF:
+ state = acc_check_reg_state(qm, HPRE_HAC_INT_STATUS);
+ if (state) {
+ dev_err(dev, "failed to check QM HPRE HAC INT state!\n");
+ return -EBUSY;
+ }
+ return 0;
+ case PCI_DEVICE_ID_HUAWEI_ZIP_VF:
+ state = acc_check_reg_state(qm, HZIP_CORE_INT_STATUS);
+ if (state) {
+ dev_err(dev, "failed to check QM ZIP Core INT state!\n");
+ return -EBUSY;
+ }
+ return 0;
+ default:
+ dev_err(dev, "failed to detect acc module type!\n");
+ return -EINVAL;
+ }
+}
+
+static int qm_read_reg(struct hisi_qm *qm, u32 reg_addr,
+ u32 *data, u8 nums)
+{
+ int i;
+
+ if (nums < 1 || nums > QM_REGS_MAX_LEN)
+ return -EINVAL;
+
+ for (i = 0; i < nums; i++) {
+ data[i] = readl(qm->io_base + reg_addr);
+ reg_addr += QM_REG_ADDR_OFFSET;
+ }
+
+ return 0;
+}
+
+static int qm_write_reg(struct hisi_qm *qm, u32 reg,
+ u32 *data, u8 nums)
+{
+ int i;
+
+ if (nums < 1 || nums > QM_REGS_MAX_LEN)
+ return -EINVAL;
+
+ for (i = 0; i < nums; i++)
+ writel(data[i], qm->io_base + reg + i * QM_REG_ADDR_OFFSET);
+
+ return 0;
+}
+
+static int qm_get_vft(struct hisi_qm *qm, u32 *base)
+{
+ u64 sqc_vft;
+ u32 qp_num;
+ int ret;
+
+ ret = qm_mb(qm, QM_MB_CMD_SQC_VFT_V2, 0, 0, 1);
+ if (ret)
+ return ret;
+
+ sqc_vft = readl(qm->io_base + QM_MB_CMD_DATA_ADDR_L) |
+ ((u64)readl(qm->io_base + QM_MB_CMD_DATA_ADDR_H) <<
+ QM_XQC_ADDR_OFFSET);
+ *base = QM_SQC_VFT_BASE_MASK_V2 & (sqc_vft >> QM_SQC_VFT_BASE_SHIFT_V2);
+ qp_num = (QM_SQC_VFT_NUM_MASK_V2 &
+ (sqc_vft >> QM_SQC_VFT_NUM_SHIFT_V2)) + 1;
+
+ return qp_num;
+}
+
+static int qm_get_sqc(struct hisi_qm *qm, u64 *addr)
+{
+ int ret;
+
+ ret = qm_mb(qm, QM_MB_CMD_SQC_BT, 0, 0, 1);
+ if (ret)
+ return ret;
+
+ *addr = readl(qm->io_base + QM_MB_CMD_DATA_ADDR_L) |
+ ((u64)readl(qm->io_base + QM_MB_CMD_DATA_ADDR_H) <<
+ QM_XQC_ADDR_OFFSET);
+
+ return 0;
+}
+
+static int qm_get_cqc(struct hisi_qm *qm, u64 *addr)
+{
+ int ret;
+
+ ret = qm_mb(qm, QM_MB_CMD_CQC_BT, 0, 0, 1);
+ if (ret)
+ return ret;
+
+ *addr = readl(qm->io_base + QM_MB_CMD_DATA_ADDR_L) |
+ ((u64)readl(qm->io_base + QM_MB_CMD_DATA_ADDR_H) <<
+ QM_XQC_ADDR_OFFSET);
+
+ return 0;
+}
+
+static int qm_rw_regs_read(struct hisi_qm *qm, struct acc_vf_data *vf_data)
+{
+ struct device *dev = &qm->pdev->dev;
+ int ret;
+
+ ret = qm_read_reg(qm, QM_VF_AEQ_INT_MASK, &vf_data->aeq_int_mask, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_VF_AEQ_INT_MASK\n");
+ return ret;
+ }
+
+ ret = qm_read_reg(qm, QM_VF_EQ_INT_MASK, &vf_data->eq_int_mask, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_VF_EQ_INT_MASK\n");
+ return ret;
+ }
+
+ ret = qm_read_reg(qm, QM_IFC_INT_SOURCE_V,
+ &vf_data->ifc_int_source, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_IFC_INT_SOURCE_V\n");
+ return ret;
+ }
+
+ ret = qm_read_reg(qm, QM_IFC_INT_MASK, &vf_data->ifc_int_mask, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_IFC_INT_MASK\n");
+ return ret;
+ }
+
+ ret = qm_read_reg(qm, QM_IFC_INT_SET_V, &vf_data->ifc_int_set, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_IFC_INT_SET_V\n");
+ return ret;
+ }
+
+ ret = qm_read_reg(qm, QM_PAGE_SIZE, &vf_data->page_size, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_PAGE_SIZE\n");
+ return ret;
+ }
+
+ /* QM_EQC_DW has 7 regs */
+ ret = qm_read_reg(qm, QM_EQC_DW0, vf_data->qm_eqc_dw, 7);
+ if (ret) {
+ dev_err(dev, "failed to read QM_EQC_DW\n");
+ return ret;
+ }
+
+ /* QM_AEQC_DW has 7 regs */
+ ret = qm_read_reg(qm, QM_AEQC_DW0, vf_data->qm_aeqc_dw, 7);
+ if (ret) {
+ dev_err(dev, "failed to read QM_AEQC_DW\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static int qm_rw_regs_write(struct hisi_qm *qm, struct acc_vf_data *vf_data)
+{
+ struct device *dev = &qm->pdev->dev;
+ int ret;
+
+ /* check VF state */
+ if (unlikely(qm_wait_mb_ready(qm))) {
+ dev_err(&qm->pdev->dev, "QM device is not ready to write\n");
+ return -EBUSY;
+ }
+
+ ret = qm_write_reg(qm, QM_VF_AEQ_INT_MASK, &vf_data->aeq_int_mask, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_VF_AEQ_INT_MASK\n");
+ return ret;
+ }
+
+ ret = qm_write_reg(qm, QM_VF_EQ_INT_MASK, &vf_data->eq_int_mask, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_VF_EQ_INT_MASK\n");
+ return ret;
+ }
+
+ ret = qm_write_reg(qm, QM_IFC_INT_SOURCE_V,
+ &vf_data->ifc_int_source, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_IFC_INT_SOURCE_V\n");
+ return ret;
+ }
+
+ ret = qm_write_reg(qm, QM_IFC_INT_MASK, &vf_data->ifc_int_mask, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_IFC_INT_MASK\n");
+ return ret;
+ }
+
+ ret = qm_write_reg(qm, QM_IFC_INT_SET_V, &vf_data->ifc_int_set, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_IFC_INT_SET_V\n");
+ return ret;
+ }
+
+ ret = qm_write_reg(qm, QM_QUE_ISO_CFG_V, &vf_data->que_iso_cfg, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_QUE_ISO_CFG_V\n");
+ return ret;
+ }
+
+ ret = qm_write_reg(qm, QM_PAGE_SIZE, &vf_data->page_size, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_PAGE_SIZE\n");
+ return ret;
+ }
+
+ /* QM_EQC_DW has 7 regs */
+ ret = qm_write_reg(qm, QM_EQC_DW0, vf_data->qm_eqc_dw, 7);
+ if (ret) {
+ dev_err(dev, "failed to write QM_EQC_DW\n");
+ return ret;
+ }
+
+ /* QM_AEQC_DW has 7 regs */
+ ret = qm_write_reg(qm, QM_AEQC_DW0, vf_data->qm_aeqc_dw, 7);
+ if (ret) {
+ dev_err(dev, "failed to write QM_AEQC_DW\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static void qm_db(struct hisi_qm *qm, u16 qn, u8 cmd,
+ u16 index, u8 priority)
+{
+ u64 doorbell;
+ u64 dbase;
+ u16 randata = 0;
+
+ if (cmd == QM_DOORBELL_CMD_SQ || cmd == QM_DOORBELL_CMD_CQ)
+ dbase = QM_DOORBELL_SQ_CQ_BASE_V2;
+ else
+ dbase = QM_DOORBELL_EQ_AEQ_BASE_V2;
+
+ doorbell = qn | ((u64)cmd << QM_DB_CMD_SHIFT_V2) |
+ ((u64)randata << QM_DB_RAND_SHIFT_V2) |
+ ((u64)index << QM_DB_INDEX_SHIFT_V2) |
+ ((u64)priority << QM_DB_PRIORITY_SHIFT_V2);
+
+ writeq(doorbell, qm->io_base + dbase);
+}
+
+static int pf_qm_get_qp_num(struct hisi_qm *qm, int vf_id, u32 *rbase)
+{
+ unsigned int val;
+ u64 sqc_vft;
+ u32 qp_num;
+ int ret;
+
+ ret = readl_relaxed_poll_timeout(qm->io_base + QM_VFT_CFG_RDY, val,
+ val & BIT(0), MB_POLL_PERIOD_US,
+ MB_POLL_TIMEOUT_US);
+ if (ret)
+ return ret;
+
+ writel(0x1, qm->io_base + QM_VFT_CFG_OP_WR);
+ /* 0 means SQC VFT */
+ writel(0x0, qm->io_base + QM_VFT_CFG_TYPE);
+ writel(vf_id, qm->io_base + QM_VFT_CFG);
+
+ writel(0x0, qm->io_base + QM_VFT_CFG_RDY);
+ writel(0x1, qm->io_base + QM_VFT_CFG_OP_ENABLE);
+
+ ret = readl_relaxed_poll_timeout(qm->io_base + QM_VFT_CFG_RDY, val,
+ val & BIT(0), MB_POLL_PERIOD_US,
+ MB_POLL_TIMEOUT_US);
+ if (ret)
+ return ret;
+
+ sqc_vft = readl(qm->io_base + QM_VFT_CFG_DATA_L) |
+ ((u64)readl(qm->io_base + QM_VFT_CFG_DATA_H) <<
+ QM_XQC_ADDR_OFFSET);
+ *rbase = QM_SQC_VFT_BASE_MASK_V2 &
+ (sqc_vft >> QM_SQC_VFT_BASE_SHIFT_V2);
+ qp_num = (QM_SQC_VFT_NUM_MASK_V2 &
+ (sqc_vft >> QM_SQC_VFT_NUM_SHIFT_V2)) + 1;
+
+ return qp_num;
+}
+
+static void qm_dev_cmd_init(struct hisi_qm *qm)
+{
+ /* Clear VF communication status registers. */
+ writel(0x1, qm->io_base + QM_IFC_INT_SOURCE_V);
+
+ /* Enable pf and vf communication. */
+ writel(0x0, qm->io_base + QM_IFC_INT_MASK);
+}
+
+static int vf_qm_cache_wb(struct hisi_qm *qm)
+{
+ unsigned int val;
+
+ writel(0x1, qm->io_base + QM_CACHE_WB_START);
+ if (readl_relaxed_poll_timeout(qm->io_base + QM_CACHE_WB_DONE,
+ val, val & BIT(0), MB_POLL_PERIOD_US,
+ MB_POLL_TIMEOUT_US)) {
+ dev_err(&qm->pdev->dev, "vf QM writeback sqc cache fail\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void vf_qm_fun_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct hisi_qm *qm)
+{
+ int i;
+
+ for (i = 0; i < qm->qp_num; i++)
+ qm_db(qm, i, QM_DOORBELL_CMD_SQ, 0, 1);
+}
+
+static int vf_qm_func_stop(struct hisi_qm *qm)
+{
+ return qm_mb(qm, QM_MB_CMD_PAUSE_QM, 0, 0, 0);
+}
+
+static int vf_qm_load_data(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct hisi_acc_vf_migration_file *migf)
+{
+ struct hisi_qm *qm = &hisi_acc_vdev->vf_qm;
+ struct device *dev = &qm->pdev->dev;
+ struct acc_vf_data *vf_data = &migf->vf_data;
+ int ret;
+
+ /* Return if only match data was transferred */
+ if (migf->total_length == QM_MATCH_SIZE)
+ return 0;
+
+ if (migf->total_length < sizeof(struct acc_vf_data))
+ return -EINVAL;
+
+ qm->eqe_dma = vf_data->eqe_dma;
+ qm->aeqe_dma = vf_data->aeqe_dma;
+ qm->sqc_dma = vf_data->sqc_dma;
+ qm->cqc_dma = vf_data->cqc_dma;
+
+ qm->qp_base = vf_data->qp_base;
+ qm->qp_num = vf_data->qp_num;
+
+ ret = qm_rw_regs_write(qm, vf_data);
+ if (ret) {
+ dev_err(dev, "Set VF regs failed\n");
+ return ret;
+ }
+
+ ret = qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
+ if (ret) {
+ dev_err(dev, "Set sqc failed\n");
+ return ret;
+ }
+
+ ret = qm_mb(qm, QM_MB_CMD_CQC_BT, qm->cqc_dma, 0, 0);
+ if (ret) {
+ dev_err(dev, "Set cqc failed\n");
+ return ret;
+ }
+
+ qm_dev_cmd_init(qm);
+ return 0;
+}
+
+static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct hisi_acc_vf_migration_file *migf)
+{
+ struct acc_vf_data *vf_data = &migf->vf_data;
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+ struct hisi_qm *pf_qm = hisi_acc_vdev->pf_qm;
+ struct device *dev = &vf_qm->pdev->dev;
+ u32 que_iso_state;
+ int ret;
+
+ if (migf->total_length < QM_MATCH_SIZE)
+ return -EINVAL;
+
+ /* vf acc dev type check */
+ if (vf_data->dev_id != hisi_acc_vdev->vf_dev->device) {
+ dev_err(dev, "failed to match VF devices\n");
+ return -EINVAL;
+ }
+
+ /* vf qp num check */
+ ret = qm_get_vft(vf_qm, &vf_qm->qp_base);
+ if (ret <= 0) {
+ dev_err(dev, "failed to get vft qp nums\n");
+ return -EINVAL;
+ }
+
+ if (ret != vf_data->qp_num) {
+ dev_err(dev, "failed to match VF qp num\n");
+ return -EINVAL;
+ }
+
+ vf_qm->qp_num = ret;
+
+ /* vf isolation state check */
+ ret = qm_read_reg(pf_qm, QM_QUE_ISO_CFG_V, &que_iso_state, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_QUE_ISO_CFG_V\n");
+ return ret;
+ }
+
+ if (vf_data->que_iso_cfg != que_iso_state) {
+ dev_err(dev, "failed to match isolation state\n");
+ return -EINVAL;
+ }
+
+ ret = qm_write_reg(vf_qm, QM_VF_STATE, &vf_data->qm_state, 1);
+ if (ret) {
+ dev_err(dev, "failed to write QM_VF_STATE\n");
+ return ret;
+ }
+
+ hisi_acc_vdev->vf_qm_state = vf_data->qm_state;
+ return 0;
+}
+
+static int vf_qm_get_match_data(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct acc_vf_data *vf_data)
+{
+ struct hisi_qm *pf_qm = hisi_acc_vdev->pf_qm;
+ struct device *dev = &pf_qm->pdev->dev;
+ int vf_id = hisi_acc_vdev->vf_id;
+ int ret;
+
+ /* save device id */
+ vf_data->dev_id = hisi_acc_vdev->vf_dev->device;
+
+ /* vf qp num save from PF */
+ ret = pf_qm_get_qp_num(pf_qm, vf_id, &vf_data->qp_base);
+ if (ret <= 0) {
+ dev_err(dev, "failed to get vft qp nums!\n");
+ return -EINVAL;
+ }
+
+ vf_data->qp_num = ret;
+
+ /* VF isolation state save from PF */
+ ret = qm_read_reg(pf_qm, QM_QUE_ISO_CFG_V, &vf_data->que_iso_cfg, 1);
+ if (ret) {
+ dev_err(dev, "failed to read QM_QUE_ISO_CFG_V!\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static int vf_qm_state_save(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct hisi_acc_vf_migration_file *migf)
+{
+ struct acc_vf_data *vf_data = &migf->vf_data;
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+ struct device *dev = &vf_qm->pdev->dev;
+ int ret;
+
+ ret = vf_qm_get_match_data(hisi_acc_vdev, vf_data);
+ if (ret)
+ return ret;
+
+ if (unlikely(qm_wait_dev_not_ready(vf_qm))) {
+ /* Update state and return with match data */
+ vf_data->qm_state = QM_NOT_READY;
+ hisi_acc_vdev->vf_qm_state = vf_data->qm_state;
+ migf->total_length = QM_MATCH_SIZE;
+ return 0;
+ }
+
+ ret = vf_qm_cache_wb(vf_qm);
+ if (ret) {
+ dev_err(dev, "failed to writeback QM Cache!\n");
+ return ret;
+ }
+
+ ret = qm_rw_regs_read(vf_qm, vf_data);
+ if (ret)
+ return -EINVAL;
+
+ /* Every reg is 32 bit, the dma address is 64 bit. */
+ vf_data->eqe_dma = vf_data->qm_eqc_dw[2];
+ vf_data->eqe_dma <<= QM_XQC_ADDR_OFFSET;
+ vf_data->eqe_dma |= vf_data->qm_eqc_dw[1];
+ vf_data->aeqe_dma = vf_data->qm_aeqc_dw[2];
+ vf_data->aeqe_dma <<= QM_XQC_ADDR_OFFSET;
+ vf_data->aeqe_dma |= vf_data->qm_aeqc_dw[1];
+
+ /* Use the SQC_BT/CQC_BT mailbox cmds to get the sqc and cqc addresses */
+ ret = qm_get_sqc(vf_qm, &vf_data->sqc_dma);
+ if (ret) {
+ dev_err(dev, "failed to read SQC addr!\n");
+ return -EINVAL;
+ }
+
+ ret = qm_get_cqc(vf_qm, &vf_data->cqc_dma);
+ if (ret) {
+ dev_err(dev, "failed to read CQC addr!\n");
+ return -EINVAL;
+ }
+
+ migf->total_length = sizeof(struct acc_vf_data);
+ return 0;
+}
+
+static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+
+ if (hisi_acc_vdev->vf_qm_state != QM_READY)
+ return;
+
+ vf_qm_fun_reset(hisi_acc_vdev, vf_qm);
+}
+
+static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
+{
+ mutex_lock(&migf->lock);
+ migf->disabled = true;
+ migf->total_length = 0;
+ migf->filp->f_pos = 0;
+ mutex_unlock(&migf->lock);
+}
+
+static int hisi_acc_vf_release_file(struct inode *inode, struct file *filp)
+{
+ struct hisi_acc_vf_migration_file *migf = filp->private_data;
+
+ hisi_acc_vf_disable_fd(migf);
+ mutex_destroy(&migf->lock);
+ kfree(migf);
+ return 0;
+}
+
+static ssize_t hisi_acc_vf_resume_write(struct file *filp, const char __user *buf,
+ size_t len, loff_t *pos)
+{
+ struct hisi_acc_vf_migration_file *migf = filp->private_data;
+ loff_t requested_length;
+ ssize_t done = 0;
+ int ret;
+
+ if (pos)
+ return -ESPIPE;
+ pos = &filp->f_pos;
+
+ if (*pos < 0 ||
+ check_add_overflow((loff_t)len, *pos, &requested_length))
+ return -EINVAL;
+
+ if (requested_length > sizeof(struct acc_vf_data))
+ return -ENOMEM;
+
+ mutex_lock(&migf->lock);
+ if (migf->disabled) {
+ done = -ENODEV;
+ goto out_unlock;
+ }
+
+ ret = copy_from_user(&migf->vf_data, buf, len);
+ if (ret) {
+ done = -EFAULT;
+ goto out_unlock;
+ }
+ *pos += len;
+ done = len;
+ migf->total_length += len;
+out_unlock:
+ mutex_unlock(&migf->lock);
+ return done;
+}
+
+static const struct file_operations hisi_acc_vf_resume_fops = {
+ .owner = THIS_MODULE,
+ .write = hisi_acc_vf_resume_write,
+ .release = hisi_acc_vf_release_file,
+ .llseek = no_llseek,
+};
+
+static struct hisi_acc_vf_migration_file *
+hisi_acc_vf_pci_resume(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct hisi_acc_vf_migration_file *migf;
+
+ migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+ if (!migf)
+ return ERR_PTR(-ENOMEM);
+
+ migf->filp = anon_inode_getfile("hisi_acc_vf_mig", &hisi_acc_vf_resume_fops, migf,
+ O_WRONLY);
+ if (IS_ERR(migf->filp)) {
+ int err = PTR_ERR(migf->filp);
+
+ kfree(migf);
+ return ERR_PTR(err);
+ }
+
+ stream_open(migf->filp->f_inode, migf->filp);
+ mutex_init(&migf->lock);
+ return migf;
+}
+
+static ssize_t hisi_acc_vf_save_read(struct file *filp, char __user *buf, size_t len,
+ loff_t *pos)
+{
+ struct hisi_acc_vf_migration_file *migf = filp->private_data;
+ ssize_t done = 0;
+ int ret;
+
+ if (pos)
+ return -ESPIPE;
+ pos = &filp->f_pos;
+
+ mutex_lock(&migf->lock);
+ if (*pos > migf->total_length) {
+ done = -EINVAL;
+ goto out_unlock;
+ }
+
+ if (migf->disabled) {
+ done = -ENODEV;
+ goto out_unlock;
+ }
+
+ len = min_t(size_t, migf->total_length - *pos, len);
+ if (len) {
+ ret = copy_to_user(buf, &migf->vf_data, len);
+ if (ret) {
+ done = -EFAULT;
+ goto out_unlock;
+ }
+ *pos += len;
+ done = len;
+ }
+
+out_unlock:
+ mutex_unlock(&migf->lock);
+ return done;
+}
+
+static const struct file_operations hisi_acc_vf_save_fops = {
+ .owner = THIS_MODULE,
+ .read = hisi_acc_vf_save_read,
+ .release = hisi_acc_vf_release_file,
+ .llseek = no_llseek,
+};
+
+static struct hisi_acc_vf_migration_file *
+hisi_acc_vf_stop_copy(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct hisi_acc_vf_migration_file *migf;
+ int ret;
+
+ migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+ if (!migf)
+ return ERR_PTR(-ENOMEM);
+
+ migf->filp = anon_inode_getfile("hisi_acc_vf_mig", &hisi_acc_vf_save_fops, migf,
+ O_RDONLY);
+ if (IS_ERR(migf->filp)) {
+ int err = PTR_ERR(migf->filp);
+
+ kfree(migf);
+ return ERR_PTR(err);
+ }
+
+ stream_open(migf->filp->f_inode, migf->filp);
+ mutex_init(&migf->lock);
+
+ ret = vf_qm_state_save(hisi_acc_vdev, migf);
+ if (ret) {
+ fput(migf->filp);
+ return ERR_PTR(ret);
+ }
+
+ return migf;
+}
+
+static int hisi_acc_vf_load_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct device *dev = &hisi_acc_vdev->vf_dev->dev;
+ struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->resuming_migf;
+ int ret;
+
+ /* Check dev compatibility */
+ ret = vf_qm_check_match(hisi_acc_vdev, migf);
+ if (ret) {
+ dev_err(dev, "failed to match the VF!\n");
+ return ret;
+ }
+ /* Recover data to VF */
+ ret = vf_qm_load_data(hisi_acc_vdev, migf);
+ if (ret) {
+ dev_err(dev, "failed to recover the VF!\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static int hisi_acc_vf_stop_device(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct device *dev = &hisi_acc_vdev->vf_dev->dev;
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+ int ret;
+
+ ret = vf_qm_func_stop(vf_qm);
+ if (ret) {
+ dev_err(dev, "failed to stop QM VF function!\n");
+ return ret;
+ }
+
+ ret = qm_check_int_state(hisi_acc_vdev);
+ if (ret) {
+ dev_err(dev, "failed to check QM INT state!\n");
+ return ret;
+ }
+ return 0;
+}
+
+static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ if (hisi_acc_vdev->resuming_migf) {
+ hisi_acc_vf_disable_fd(hisi_acc_vdev->resuming_migf);
+ fput(hisi_acc_vdev->resuming_migf->filp);
+ hisi_acc_vdev->resuming_migf = NULL;
+ }
+
+ if (hisi_acc_vdev->saving_migf) {
+ hisi_acc_vf_disable_fd(hisi_acc_vdev->saving_migf);
+ fput(hisi_acc_vdev->saving_migf->filp);
+ hisi_acc_vdev->saving_migf = NULL;
+ }
+}
+
+static struct file *
+hisi_acc_vf_set_device_state(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ u32 new)
+{
+ u32 cur = hisi_acc_vdev->mig_state;
+ int ret;
+
+ if (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) {
+ ret = hisi_acc_vf_stop_device(hisi_acc_vdev);
+ if (ret)
+ return ERR_PTR(ret);
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) {
+ struct hisi_acc_vf_migration_file *migf;
+
+ migf = hisi_acc_vf_stop_copy(hisi_acc_vdev);
+ if (IS_ERR(migf))
+ return ERR_CAST(migf);
+ get_file(migf->filp);
+ hisi_acc_vdev->saving_migf = migf;
+ return migf->filp;
+ }
+
+ if ((cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP)) {
+ hisi_acc_vf_disable_fds(hisi_acc_vdev);
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) {
+ struct hisi_acc_vf_migration_file *migf;
+
+ migf = hisi_acc_vf_pci_resume(hisi_acc_vdev);
+ if (IS_ERR(migf))
+ return ERR_CAST(migf);
+ get_file(migf->filp);
+ hisi_acc_vdev->resuming_migf = migf;
+ return migf->filp;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) {
+ ret = hisi_acc_vf_load_state(hisi_acc_vdev);
+ if (ret)
+ return ERR_PTR(ret);
+ hisi_acc_vf_disable_fds(hisi_acc_vdev);
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING) {
+ hisi_acc_vf_start_device(hisi_acc_vdev);
+ return NULL;
+ }
+
+ /*
+ * vfio_mig_get_next_state() does not use arcs other than the above
+ */
+ WARN_ON(true);
+ return ERR_PTR(-EINVAL);
+}
+
+static struct file *
+hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
+ enum vfio_device_mig_state new_state)
+{
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(vdev,
+ struct hisi_acc_vf_core_device, core_device.vdev);
+ enum vfio_device_mig_state next_state;
+ struct file *res = NULL;
+ int ret;
+
+ mutex_lock(&hisi_acc_vdev->state_mutex);
+ while (new_state != hisi_acc_vdev->mig_state) {
+ ret = vfio_mig_get_next_state(vdev,
+ hisi_acc_vdev->mig_state,
+ new_state, &next_state);
+ if (ret) {
+ res = ERR_PTR(-EINVAL);
+ break;
+ }
+
+ res = hisi_acc_vf_set_device_state(hisi_acc_vdev, next_state);
+ if (IS_ERR(res))
+ break;
+ hisi_acc_vdev->mig_state = next_state;
+ if (WARN_ON(res && new_state != hisi_acc_vdev->mig_state)) {
+ fput(res);
+ res = ERR_PTR(-EINVAL);
+ break;
+ }
+ }
+ mutex_unlock(&hisi_acc_vdev->state_mutex);
+ return res;
+}
+
+static int
+hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
+ enum vfio_device_mig_state *curr_state)
+{
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(vdev,
+ struct hisi_acc_vf_core_device, core_device.vdev);
+
+ mutex_lock(&hisi_acc_vdev->state_mutex);
+ *curr_state = hisi_acc_vdev->mig_state;
+ mutex_unlock(&hisi_acc_vdev->state_mutex);
+ return 0;
+}
+
+static int hisi_acc_vf_qm_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct vfio_pci_core_device *vdev = &hisi_acc_vdev->core_device;
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+ struct pci_dev *vf_dev = vdev->pdev;
+
+ /*
+ * ACC VF dev BAR2 region consists of both functional register space
+ * and migration control register space. For migration to work, we
+ * need access to both. Hence, we map the entire BAR2 region here.
+ * But from a security point of view, we restrict access to the
+ * migration control space from Guest(Please see mmap/ioctl/read/write
+ * override functions).
+ *
+ * Also the HiSilicon ACC VF devices supported by this driver on
+ * HiSilicon hardware platforms are integrated end point devices
+ * and has no capability to perform PCIe P2P.
+ */
+
+ vf_qm->io_base =
+ ioremap(pci_resource_start(vf_dev, VFIO_PCI_BAR2_REGION_INDEX),
+ pci_resource_len(vf_dev, VFIO_PCI_BAR2_REGION_INDEX));
+ if (!vf_qm->io_base)
+ return -EIO;
+
+ vf_qm->fun_type = QM_HW_VF;
+ vf_qm->pdev = vf_dev;
+ mutex_init(&vf_qm->mailbox_lock);
+
+ return 0;
+}
+
static struct hisi_qm *hisi_acc_get_pf_qm(struct pci_dev *pdev)
{
struct hisi_qm *pf_qm;
@@ -159,23 +1110,46 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
{
- struct vfio_pci_core_device *vdev =
- container_of(core_vdev, struct vfio_pci_core_device, vdev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
+ struct hisi_acc_vf_core_device, core_device.vdev);
+ struct vfio_pci_core_device *vdev = &hisi_acc_vdev->core_device;
int ret;
ret = vfio_pci_core_enable(vdev);
if (ret)
return ret;
- vfio_pci_core_finish_enable(vdev);
+ if (core_vdev->migration_flags != VFIO_MIGRATION_STOP_COPY) {
+ vfio_pci_core_finish_enable(vdev);
+ return 0;
+ }
+
+ ret = hisi_acc_vf_qm_init(hisi_acc_vdev);
+ if (ret) {
+ vfio_pci_core_disable(vdev);
+ return ret;
+ }
+ hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
+
+ vfio_pci_core_finish_enable(vdev);
return 0;
}
+static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
+{
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
+ struct hisi_acc_vf_core_device, core_device.vdev);
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+
+ iounmap(vf_qm->io_base);
+ vfio_pci_core_close_device(core_vdev);
+}
+
static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
.name = "hisi-acc-vfio-pci",
.open_device = hisi_acc_vfio_pci_open_device,
- .close_device = vfio_pci_core_close_device,
+ .close_device = hisi_acc_vfio_pci_close_device,
.ioctl = hisi_acc_vfio_pci_ioctl,
.device_feature = vfio_pci_core_ioctl_feature,
.read = hisi_acc_vfio_pci_read,
@@ -183,6 +1157,8 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
.mmap = hisi_acc_vfio_pci_mmap,
.request = vfio_pci_core_request,
.match = vfio_pci_core_match,
+ .migration_set_state = hisi_acc_vfio_pci_set_device_state,
+ .migration_get_state = hisi_acc_vfio_pci_get_device_state,
};
static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
@@ -198,38 +1174,71 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
.match = vfio_pci_core_match,
};
+static int
+hisi_acc_vfio_pci_migrn_init(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct pci_dev *pdev, struct hisi_qm *pf_qm)
+{
+ int vf_id;
+
+ vf_id = pci_iov_vf_id(pdev);
+ if (vf_id < 0)
+ return vf_id;
+
+ hisi_acc_vdev->vf_id = vf_id + 1;
+ hisi_acc_vdev->core_device.vdev.migration_flags =
+ VFIO_MIGRATION_STOP_COPY;
+ hisi_acc_vdev->pf_qm = pf_qm;
+ hisi_acc_vdev->vf_dev = pdev;
+ mutex_init(&hisi_acc_vdev->state_mutex);
+
+ return 0;
+}
+
static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
- struct vfio_pci_core_device *vdev;
+ struct hisi_acc_vf_core_device *hisi_acc_vdev;
+ struct hisi_qm *pf_qm;
int ret;
- vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
- if (!vdev)
+ hisi_acc_vdev = kzalloc(sizeof(*hisi_acc_vdev), GFP_KERNEL);
+ if (!hisi_acc_vdev)
return -ENOMEM;
- vfio_pci_core_init_device(vdev, pdev, &hisi_acc_vfio_pci_ops);
+ pf_qm = hisi_acc_get_pf_qm(pdev);
+ if (pf_qm && pf_qm->ver >= QM_HW_V3) {
+ ret = hisi_acc_vfio_pci_migrn_init(hisi_acc_vdev, pdev, pf_qm);
+ if (ret < 0) {
+ kfree(hisi_acc_vdev);
+ return ret;
+ }
+
+ vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
+ &hisi_acc_vfio_pci_migrn_ops);
+ } else {
+ vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
+ &hisi_acc_vfio_pci_ops);
+ }
- ret = vfio_pci_core_register_device(vdev);
+ ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
if (ret)
goto out_free;
- dev_set_drvdata(&pdev->dev, vdev);
-
+ dev_set_drvdata(&pdev->dev, hisi_acc_vdev);
return 0;
out_free:
- vfio_pci_core_uninit_device(vdev);
- kfree(vdev);
+ vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
+ kfree(hisi_acc_vdev);
return ret;
}
static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
{
- struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = dev_get_drvdata(&pdev->dev);
- vfio_pci_core_unregister_device(vdev);
- vfio_pci_core_uninit_device(vdev);
- kfree(vdev);
+ vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
+ vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
+ kfree(hisi_acc_vdev);
}
static const struct pci_device_id hisi_acc_vfio_pci_table[] = {
@@ -254,4 +1263,4 @@ module_pci_driver(hisi_acc_vfio_pci_driver);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Liu Longfang <[email protected]>");
MODULE_AUTHOR("Shameer Kolothum <[email protected]>");
-MODULE_DESCRIPTION("HiSilicon VFIO PCI - Generic VFIO PCI driver for HiSilicon ACC device family");
+MODULE_DESCRIPTION("HiSilicon VFIO PCI - VFIO PCI driver with live migration support for HiSilicon ACC device family");
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
new file mode 100644
index 000000000000..4aedea108152
--- /dev/null
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2021 HiSilicon Ltd. */
+
+#ifndef HISI_ACC_VFIO_PCI_H
+#define HISI_ACC_VFIO_PCI_H
+
+#include <linux/hisi_acc_qm.h>
+
+#define MB_POLL_PERIOD_US 10
+#define MB_POLL_TIMEOUT_US 1000
+#define QM_CACHE_WB_START 0x204
+#define QM_CACHE_WB_DONE 0x208
+#define QM_MB_CMD_PAUSE_QM 0xe
+#define QM_ABNORMAL_INT_STATUS 0x100008
+#define QM_IFC_INT_STATUS 0x0028
+#define SEC_CORE_INT_STATUS 0x301008
+#define HPRE_HAC_INT_STATUS 0x301800
+#define HZIP_CORE_INT_STATUS 0x3010AC
+#define QM_QUE_ISO_CFG 0x301154
+
+#define QM_VFT_CFG_RDY 0x10006c
+#define QM_VFT_CFG_OP_WR 0x100058
+#define QM_VFT_CFG_TYPE 0x10005c
+#define QM_VFT_CFG 0x100060
+#define QM_VFT_CFG_OP_ENABLE 0x100054
+#define QM_VFT_CFG_DATA_L 0x100064
+#define QM_VFT_CFG_DATA_H 0x100068
+
+#define ERROR_CHECK_TIMEOUT 100
+#define CHECK_DELAY_TIME 100
+
+#define QM_SQC_VFT_BASE_SHIFT_V2 28
+#define QM_SQC_VFT_BASE_MASK_V2 GENMASK(15, 0)
+#define QM_SQC_VFT_NUM_SHIFT_V2 45
+#define QM_SQC_VFT_NUM_MASK_V2 GENMASK(9, 0)
+
+/* RW regs */
+#define QM_REGS_MAX_LEN 7
+#define QM_REG_ADDR_OFFSET 0x0004
+
+#define QM_XQC_ADDR_OFFSET 32U
+#define QM_VF_AEQ_INT_MASK 0x0004
+#define QM_VF_EQ_INT_MASK 0x000c
+#define QM_IFC_INT_SOURCE_V 0x0020
+#define QM_IFC_INT_MASK 0x0024
+#define QM_IFC_INT_SET_V 0x002c
+#define QM_QUE_ISO_CFG_V 0x0030
+#define QM_PAGE_SIZE 0x0034
+
+#define QM_EQC_DW0 0X8000
+#define QM_AEQC_DW0 0X8020
+
+enum qm_vf_state {
+ QM_READY = 0,
+ QM_NOT_READY,
+};
+
+struct acc_vf_data {
+#define QM_MATCH_SIZE 32L
+ /* QM match information */
+ u32 qp_num;
+ u32 dev_id;
+ u32 que_iso_cfg;
+ u32 qp_base;
+ u32 qm_state;
+ /* QM reserved match information */
+ u32 qm_rsv_state[3];
+
+ /* QM RW regs */
+ u32 aeq_int_mask;
+ u32 eq_int_mask;
+ u32 ifc_int_source;
+ u32 ifc_int_mask;
+ u32 ifc_int_set;
+ u32 page_size;
+
+ /* QM_EQC_DW has 7 regs */
+ u32 qm_eqc_dw[7];
+
+ /* QM_AEQC_DW has 7 regs */
+ u32 qm_aeqc_dw[7];
+
+ /* QM reserved 5 regs */
+ u32 qm_rsv_regs[5];
+
+ /* qm memory init information */
+ u64 eqe_dma;
+ u64 aeqe_dma;
+ u64 sqc_dma;
+ u64 cqc_dma;
+};
+
+struct hisi_acc_vf_migration_file {
+ struct file *filp;
+ struct mutex lock;
+ bool disabled;
+
+ struct acc_vf_data vf_data;
+ size_t total_length;
+};
+
+struct hisi_acc_vf_core_device {
+ struct vfio_pci_core_device core_device;
+ /* for migration state */
+ struct mutex state_mutex;
+ enum vfio_device_mig_state mig_state;
+ struct pci_dev *pf_dev;
+ struct pci_dev *vf_dev;
+ struct hisi_qm *pf_qm;
+ struct hisi_qm vf_qm;
+ u32 vf_qm_state;
+ int vf_id;
+
+ struct hisi_acc_vf_migration_file *resuming_migf;
+ struct hisi_acc_vf_migration_file *saving_migf;
+};
+#endif /* HISI_ACC_VFIO_PCI_H */
--
2.25.1
Move the PCI Device IDs of HiSilicon ACC devices to
a common header and use a uniform naming convention.
This will be useful when we introduce the vfio PCI
HiSilicon ACC live migration driver in subsequent patches.
Signed-off-by: Shameer Kolothum <[email protected]>
---
drivers/crypto/hisilicon/hpre/hpre_main.c | 12 +++++-------
drivers/crypto/hisilicon/sec2/sec_main.c | 14 ++++++--------
drivers/crypto/hisilicon/zip/zip_main.c | 11 ++++-------
include/linux/pci_ids.h | 6 ++++++
4 files changed, 21 insertions(+), 22 deletions(-)
diff --git a/drivers/crypto/hisilicon/hpre/hpre_main.c b/drivers/crypto/hisilicon/hpre/hpre_main.c
index ebfab3e14499..ba4043447e53 100644
--- a/drivers/crypto/hisilicon/hpre/hpre_main.c
+++ b/drivers/crypto/hisilicon/hpre/hpre_main.c
@@ -68,8 +68,6 @@
#define HPRE_REG_RD_INTVRL_US 10
#define HPRE_REG_RD_TMOUT_US 1000
#define HPRE_DBGFS_VAL_MAX_LEN 20
-#define HPRE_PCI_DEVICE_ID 0xa258
-#define HPRE_PCI_VF_DEVICE_ID 0xa259
#define HPRE_QM_USR_CFG_MASK GENMASK(31, 1)
#define HPRE_QM_AXI_CFG_MASK GENMASK(15, 0)
#define HPRE_QM_VFG_AX_MASK GENMASK(7, 0)
@@ -111,8 +109,8 @@
static const char hpre_name[] = "hisi_hpre";
static struct dentry *hpre_debugfs_root;
static const struct pci_device_id hpre_dev_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, HPRE_PCI_DEVICE_ID) },
- { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, HPRE_PCI_VF_DEVICE_ID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_HPRE_PF) },
+ { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_HPRE_VF) },
{ 0, }
};
@@ -242,7 +240,7 @@ MODULE_PARM_DESC(uacce_mode, UACCE_MODE_DESC);
static int pf_q_num_set(const char *val, const struct kernel_param *kp)
{
- return q_num_set(val, kp, HPRE_PCI_DEVICE_ID);
+ return q_num_set(val, kp, PCI_DEVICE_ID_HUAWEI_HPRE_PF);
}
static const struct kernel_param_ops hpre_pf_q_num_ops = {
@@ -921,7 +919,7 @@ static int hpre_debugfs_init(struct hisi_qm *qm)
qm->debug.sqe_mask_len = HPRE_SQE_MASK_LEN;
hisi_qm_debug_init(qm);
- if (qm->pdev->device == HPRE_PCI_DEVICE_ID) {
+ if (qm->pdev->device == PCI_DEVICE_ID_HUAWEI_HPRE_PF) {
ret = hpre_ctrl_debug_init(qm);
if (ret)
goto failed_to_create;
@@ -958,7 +956,7 @@ static int hpre_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
qm->sqe_size = HPRE_SQE_SIZE;
qm->dev_name = hpre_name;
- qm->fun_type = (pdev->device == HPRE_PCI_DEVICE_ID) ?
+ qm->fun_type = (pdev->device == PCI_DEVICE_ID_HUAWEI_HPRE_PF) ?
QM_HW_PF : QM_HW_VF;
if (qm->fun_type == QM_HW_PF) {
qm->qp_base = HPRE_PF_DEF_Q_BASE;
diff --git a/drivers/crypto/hisilicon/sec2/sec_main.c b/drivers/crypto/hisilicon/sec2/sec_main.c
index 26d3ab1d308b..ab806fb481ac 100644
--- a/drivers/crypto/hisilicon/sec2/sec_main.c
+++ b/drivers/crypto/hisilicon/sec2/sec_main.c
@@ -20,8 +20,6 @@
#define SEC_VF_NUM 63
#define SEC_QUEUE_NUM_V1 4096
-#define SEC_PF_PCI_DEVICE_ID 0xa255
-#define SEC_VF_PCI_DEVICE_ID 0xa256
#define SEC_BD_ERR_CHK_EN0 0xEFFFFFFF
#define SEC_BD_ERR_CHK_EN1 0x7ffff7fd
@@ -225,7 +223,7 @@ static const struct debugfs_reg32 sec_dfx_regs[] = {
static int sec_pf_q_num_set(const char *val, const struct kernel_param *kp)
{
- return q_num_set(val, kp, SEC_PF_PCI_DEVICE_ID);
+ return q_num_set(val, kp, PCI_DEVICE_ID_HUAWEI_SEC_PF);
}
static const struct kernel_param_ops sec_pf_q_num_ops = {
@@ -313,8 +311,8 @@ module_param_cb(uacce_mode, &sec_uacce_mode_ops, &uacce_mode, 0444);
MODULE_PARM_DESC(uacce_mode, UACCE_MODE_DESC);
static const struct pci_device_id sec_dev_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, SEC_PF_PCI_DEVICE_ID) },
- { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, SEC_VF_PCI_DEVICE_ID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_SEC_PF) },
+ { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_SEC_VF) },
{ 0, }
};
MODULE_DEVICE_TABLE(pci, sec_dev_ids);
@@ -717,7 +715,7 @@ static int sec_core_debug_init(struct hisi_qm *qm)
regset->base = qm->io_base;
regset->dev = dev;
- if (qm->pdev->device == SEC_PF_PCI_DEVICE_ID)
+ if (qm->pdev->device == PCI_DEVICE_ID_HUAWEI_SEC_PF)
debugfs_create_file("regs", 0444, tmp_d, regset, &sec_regs_fops);
for (i = 0; i < ARRAY_SIZE(sec_dfx_labels); i++) {
@@ -735,7 +733,7 @@ static int sec_debug_init(struct hisi_qm *qm)
struct sec_dev *sec = container_of(qm, struct sec_dev, qm);
int i;
- if (qm->pdev->device == SEC_PF_PCI_DEVICE_ID) {
+ if (qm->pdev->device == PCI_DEVICE_ID_HUAWEI_SEC_PF) {
for (i = SEC_CLEAR_ENABLE; i < SEC_DEBUG_FILE_NUM; i++) {
spin_lock_init(&sec->debug.files[i].lock);
sec->debug.files[i].index = i;
@@ -877,7 +875,7 @@ static int sec_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
qm->sqe_size = SEC_SQE_SIZE;
qm->dev_name = sec_name;
- qm->fun_type = (pdev->device == SEC_PF_PCI_DEVICE_ID) ?
+ qm->fun_type = (pdev->device == PCI_DEVICE_ID_HUAWEI_SEC_PF) ?
QM_HW_PF : QM_HW_VF;
if (qm->fun_type == QM_HW_PF) {
qm->qp_base = SEC_PF_DEF_Q_BASE;
diff --git a/drivers/crypto/hisilicon/zip/zip_main.c b/drivers/crypto/hisilicon/zip/zip_main.c
index 678f8b58ec42..f4a517728385 100644
--- a/drivers/crypto/hisilicon/zip/zip_main.c
+++ b/drivers/crypto/hisilicon/zip/zip_main.c
@@ -15,9 +15,6 @@
#include <linux/uacce.h>
#include "zip.h"
-#define PCI_DEVICE_ID_ZIP_PF 0xa250
-#define PCI_DEVICE_ID_ZIP_VF 0xa251
-
#define HZIP_QUEUE_NUM_V1 4096
#define HZIP_CLOCK_GATE_CTRL 0x301004
@@ -246,7 +243,7 @@ MODULE_PARM_DESC(uacce_mode, UACCE_MODE_DESC);
static int pf_q_num_set(const char *val, const struct kernel_param *kp)
{
- return q_num_set(val, kp, PCI_DEVICE_ID_ZIP_PF);
+ return q_num_set(val, kp, PCI_DEVICE_ID_HUAWEI_ZIP_PF);
}
static const struct kernel_param_ops pf_q_num_ops = {
@@ -268,8 +265,8 @@ module_param_cb(vfs_num, &vfs_num_ops, &vfs_num, 0444);
MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63), 0(default)");
static const struct pci_device_id hisi_zip_dev_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ZIP_PF) },
- { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ZIP_VF) },
+ { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_ZIP_PF) },
+ { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_ZIP_VF) },
{ 0, }
};
MODULE_DEVICE_TABLE(pci, hisi_zip_dev_ids);
@@ -838,7 +835,7 @@ static int hisi_zip_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
qm->sqe_size = HZIP_SQE_SIZE;
qm->dev_name = hisi_zip_name;
- qm->fun_type = (pdev->device == PCI_DEVICE_ID_ZIP_PF) ?
+ qm->fun_type = (pdev->device == PCI_DEVICE_ID_HUAWEI_ZIP_PF) ?
QM_HW_PF : QM_HW_VF;
if (qm->fun_type == QM_HW_PF) {
qm->qp_base = HZIP_PF_DEF_Q_BASE;
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index aad54c666407..6b98e0d91f0a 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2529,6 +2529,12 @@
#define PCI_DEVICE_ID_KORENIX_JETCARDF3 0x17ff
#define PCI_VENDOR_ID_HUAWEI 0x19e5
+#define PCI_DEVICE_ID_HUAWEI_ZIP_PF 0xa250
+#define PCI_DEVICE_ID_HUAWEI_ZIP_VF 0xa251
+#define PCI_DEVICE_ID_HUAWEI_SEC_PF 0xa255
+#define PCI_DEVICE_ID_HUAWEI_SEC_VF 0xa256
+#define PCI_DEVICE_ID_HUAWEI_HPRE_PF 0xa258
+#define PCI_DEVICE_ID_HUAWEI_HPRE_VF 0xa259
#define PCI_VENDOR_ID_NETRONOME 0x19ee
#define PCI_DEVICE_ID_NETRONOME_NFP4000 0x4000
--
2.25.1
On Mon, 21 Feb 2022 20:49:43 -0400
Jason Gunthorpe <[email protected]> wrote:
> On Mon, Feb 21, 2022 at 11:40:35AM +0000, Shameer Kolothum wrote:
> >
> > Hi,
> >
> > This series attempts to add vfio live migration support for
> > HiSilicon ACC VF devices based on the new v2 migration protocol
> > definition and mlx5 v8 series discussed here[0].
> >
> > RFCv4 --> v5
> > - Dropped RFC tag as v2 migration APIs are more stable now.
> > - Addressed review comments from Jason and Alex (Thanks!).
> >
> > This is sanity tested on a HiSilicon platform using the Qemu branch
> > provided here[1].
> >
> > Please take a look and let me know your feedback.
> >
> > Thanks,
> > Shameer
> > [0] https://lore.kernel.org/kvm/[email protected]/
> > [1] https://github.com/jgunthorpe/qemu/commits/vfio_migration_v2
> >
> >
> > v3 --> RFCv4
> > -Based on migration v2 protocol and mlx5 v7 series.
> > -Added RFC tag again as migration v2 protocol is still under discussion.
> > -Added new patch #6 to retrieve the PF QM data.
> > -PRE_COPY compatibility check is now done after the migration data
> > transfer. This is not ideal and needs discussion.
>
> Alex, do you want to keep the PRE_COPY in just for acc for now? Or do
> you think this is not a good temporary use for it?
>
> > We have some work toward doing the compatibility more generally, but I
> think it will be a while before that is all settled.
In the original migration protocol I recall that we discussed
using the pre-copy phase for compatibility testing, even without
additional device data, as a valid use case. The migration driver of
course needs to account for the fact that userspace is not required to
perform a pre-copy, and therefore cannot rely on that exclusively for
compatibility testing, but failing a migration earlier due to detection
of an incompatibility is generally a good thing.
If the ACC driver wants to re-incorporate this behavior into a non-RFC
proposed series and we could align accepting them into the same kernel
release, that sounds ok to me. Thanks,
Alex
> -----Original Message-----
> From: Alex Williamson [mailto:[email protected]]
> Sent: 22 February 2022 19:30
> To: Jason Gunthorpe <[email protected]>
> Cc: Shameerali Kolothum Thodi <[email protected]>;
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Linuxarm <[email protected]>; liulongfang
> <[email protected]>; Zengtao (B) <[email protected]>;
> Jonathan Cameron <[email protected]>; Wangzhou (B)
> <[email protected]>
> Subject: Re: [PATCH v5 0/8] vfio/hisilicon: add ACC live migration driver
>
> On Mon, 21 Feb 2022 20:49:43 -0400
> Jason Gunthorpe <[email protected]> wrote:
>
> > On Mon, Feb 21, 2022 at 11:40:35AM +0000, Shameer Kolothum wrote:
> > >
> > > Hi,
> > >
> > > This series attempts to add vfio live migration support for
> > > HiSilicon ACC VF devices based on the new v2 migration protocol
> > > definition and mlx5 v8 series discussed here[0].
> > >
> > > RFCv4 --> v5
> > > - Dropped RFC tag as v2 migration APIs are more stable now.
> > > - Addressed review comments from Jason and Alex (Thanks!).
> > >
> > > This is sanity tested on a HiSilicon platform using the Qemu branch
> > > provided here[1].
> > >
> > > Please take a look and let me know your feedback.
> > >
> > > Thanks,
> > > Shameer
> > > [0]
> https://lore.kernel.org/kvm/[email protected]/
> > > [1] https://github.com/jgunthorpe/qemu/commits/vfio_migration_v2
> > >
> > >
> > > v3 --> RFCv4
> > > -Based on migration v2 protocol and mlx5 v7 series.
> > > -Added RFC tag again as migration v2 protocol is still under discussion.
> > > -Added new patch #6 to retrieve the PF QM data.
> > > -PRE_COPY compatibility check is now done after the migration data
> > > transfer. This is not ideal and needs discussion.
> >
> > Alex, do you want to keep the PRE_COPY in just for acc for now? Or do
> > you think this is not a good temporary use for it?
> >
> > We have some work toward doing the compatibility more generally, but I
> > think it will be a while before that is all settled.
>
> In the original migration protocol I recall that we discussed that
> using the pre-copy phase for compatibility testing, even without
> additional device data, as a valid use case. The migration driver of
> course needs to account for the fact that userspace is not required to
> perform a pre-copy, and therefore cannot rely on that exclusively for
> compatibility testing, but failing a migration earlier due to detection
> of an incompatibility is generally a good thing.
>
> If the ACC driver wants to re-incorporate this behavior into a non-RFC
> proposed series and we could align accepting them into the same kernel
> release, that sounds ok to me. Thanks,
Ok. I will add support for PRE_COPY and check compatibility early.
From an FSM arc point of view, I guess it means adding (rough sketch below):
STATE_RUNNING --> STATE_PRE_COPY
create the saving file.
get_match_data();
return fd;
STATE_PRE_COPY --> STATE_STOP_COPY
stop_device()
get_device_data()
update the saving migf total_len;
resume_write()
check compatibility once we have enough bytes.
Also add support for the VFIO_DEVICE_MIG_PRECOPY ioctl.
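Roughly, the new arcs in hisi_acc_vf_set_device_state() could look like
the below (untested sketch; hisi_acc_vf_pre_copy() would be a new helper
along the lines of hisi_acc_vf_stop_copy(), but saving only the
QM_MATCH_SIZE match data):

	if (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_PRE_COPY) {
		struct hisi_acc_vf_migration_file *migf;

		/* Create the saving fd with only the match data in it */
		migf = hisi_acc_vf_pre_copy(hisi_acc_vdev);
		if (IS_ERR(migf))
			return ERR_CAST(migf);
		get_file(migf->filp);
		hisi_acc_vdev->saving_migf = migf;
		return migf->filp;
	}

	if (cur == VFIO_DEVICE_STATE_PRE_COPY && new == VFIO_DEVICE_STATE_STOP_COPY) {
		struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->saving_migf;

		/* stop_device() + get_device_data(), appended to the same migf */
		ret = hisi_acc_vf_stop_device(hisi_acc_vdev);
		if (ret)
			return ERR_PTR(ret);

		/* This updates migf->total_length to the full acc_vf_data size */
		ret = vf_qm_state_save(hisi_acc_vdev, migf);
		if (ret)
			return ERR_PTR(ret);
		return NULL;
	}

On the resume side, the compatibility check can then move into
hisi_acc_vf_resume_write() once QM_MATCH_SIZE bytes have been received.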
I will have a go and send out a revised one.
Thanks,
Shameer
On Wed, Feb 23, 2022 at 09:34:43AM -0700, Alex Williamson wrote:
> On Tue, 22 Feb 2022 20:52:51 -0400
> Jason Gunthorpe <[email protected]> wrote:
>
> > On Mon, Feb 21, 2022 at 11:40:42AM +0000, Shameer Kolothum wrote:
> >
> > > + /*
> > > + * ACC VF dev BAR2 region consists of both functional register space
> > > + * and migration control register space. For migration to work, we
> > > + * need access to both. Hence, we map the entire BAR2 region here.
> > > + * But from a security point of view, we restrict access to the
> > > + * migration control space from Guest(Please see mmap/ioctl/read/write
> > > + * override functions).
> > > + *
> > > + * Also the HiSilicon ACC VF devices supported by this driver on
> > > + * HiSilicon hardware platforms are integrated end point devices
> > > + * and has no capability to perform PCIe P2P.
> >
> > If that is the case why not implement the RUNNING_P2P as well as a
> > NOP?
> >
> > Alex expressed concerned about proliferation of non-P2P devices as it
> > complicates qemu to support mixes
>
> I read the above as more of a statement about isolation, ie. grouping.
> Given that all DMA from the device is translated by the IOMMU, how is
> it possible that a device can entirely lack p2p support, or even know
> that the target address post-translation is to a peer device rather
> than system memory. If this is the case, it sounds like a restriction
> of the SMMU not supporting translations that reflect back to the I/O
> bus rather than a feature of the device itself. Thanks,
This is an interesting point..
Arguably if P2P addresses are invalid in an IOPTE then
pci_p2pdma_distance() should fail and we shouldn't have installed them
into the iommu in the first place.
Jason
On Tue, 22 Feb 2022 20:52:51 -0400
Jason Gunthorpe <[email protected]> wrote:
> On Mon, Feb 21, 2022 at 11:40:42AM +0000, Shameer Kolothum wrote:
>
> > + /*
> > + * ACC VF dev BAR2 region consists of both functional register space
> > + * and migration control register space. For migration to work, we
> > + * need access to both. Hence, we map the entire BAR2 region here.
> > + * But from a security point of view, we restrict access to the
> > + * migration control space from Guest(Please see mmap/ioctl/read/write
> > + * override functions).
> > + *
> > + * Also the HiSilicon ACC VF devices supported by this driver on
> > + * HiSilicon hardware platforms are integrated end point devices
> > + * and has no capability to perform PCIe P2P.
>
> If that is the case why not implement the RUNNING_P2P as well as a
> NOP?
>
> Alex expressed concerned about proliferation of non-P2P devices as it
> complicates qemu to support mixes
I read the above as more of a statement about isolation, ie. grouping.
Given that all DMA from the device is translated by the IOMMU, how is
it possible that a device can entirely lack p2p support, or even know
that the target address post-translation is to a peer device rather
than system memory. If this is the case, it sounds like a restriction
of the SMMU not supporting translations that reflect back to the I/O
bus rather than a feature of the device itself. Thanks,
Alex
On Mon, 21 Feb 2022 11:40:42 +0000
Shameer Kolothum <[email protected]> wrote:
> @@ -159,23 +1110,46 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
>
> static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
> {
> - struct vfio_pci_core_device *vdev =
> - container_of(core_vdev, struct vfio_pci_core_device, vdev);
> + struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
> + struct hisi_acc_vf_core_device, core_device.vdev);
> + struct vfio_pci_core_device *vdev = &hisi_acc_vdev->core_device;
> int ret;
>
> ret = vfio_pci_core_enable(vdev);
> if (ret)
> return ret;
>
> - vfio_pci_core_finish_enable(vdev);
> + if (core_vdev->migration_flags != VFIO_MIGRATION_STOP_COPY) {
This looks like a minor synchronization issue with
hisi_acc_vfio_pci_migrn_init(); I think it might be cleaner to test
core_vdev->ops against the migration-enabled set.
> + vfio_pci_core_finish_enable(vdev);
> + return 0;
> + }
> +
> + ret = hisi_acc_vf_qm_init(hisi_acc_vdev);
> + if (ret) {
> + vfio_pci_core_disable(vdev);
> + return ret;
> + }
>
> + hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
Change the polarity of the if() above and encompass this all within
that branch scope so we can use the finish/return below for both cases?
> +
> + vfio_pci_core_finish_enable(vdev);
> return 0;
> }
>
> +static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
> +{
> + struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
> + struct hisi_acc_vf_core_device, core_device.vdev);
> + struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
> +
> + iounmap(vf_qm->io_base);
> + vfio_pci_core_close_device(core_vdev);
> +}
> +
> static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
> .name = "hisi-acc-vfio-pci",
> .open_device = hisi_acc_vfio_pci_open_device,
> - .close_device = vfio_pci_core_close_device,
> + .close_device = hisi_acc_vfio_pci_close_device,
> .ioctl = hisi_acc_vfio_pci_ioctl,
> .device_feature = vfio_pci_core_ioctl_feature,
> .read = hisi_acc_vfio_pci_read,
> @@ -183,6 +1157,8 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
> .mmap = hisi_acc_vfio_pci_mmap,
> .request = vfio_pci_core_request,
> .match = vfio_pci_core_match,
> + .migration_set_state = hisi_acc_vfio_pci_set_device_state,
> + .migration_get_state = hisi_acc_vfio_pci_get_device_state,
> };
>
> static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
> @@ -198,38 +1174,71 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
> .match = vfio_pci_core_match,
> };
>
> +static int
> +hisi_acc_vfio_pci_migrn_init(struct hisi_acc_vf_core_device *hisi_acc_vdev,
> + struct pci_dev *pdev, struct hisi_qm *pf_qm)
> +{
> + int vf_id;
> +
> + vf_id = pci_iov_vf_id(pdev);
> + if (vf_id < 0)
> + return vf_id;
> +
> + hisi_acc_vdev->vf_id = vf_id + 1;
> + hisi_acc_vdev->core_device.vdev.migration_flags =
> + VFIO_MIGRATION_STOP_COPY;
> + hisi_acc_vdev->pf_qm = pf_qm;
> + hisi_acc_vdev->vf_dev = pdev;
> + mutex_init(&hisi_acc_vdev->state_mutex);
> +
> + return 0;
> +}
> +
> static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
> - struct vfio_pci_core_device *vdev;
> + struct hisi_acc_vf_core_device *hisi_acc_vdev;
> + struct hisi_qm *pf_qm;
> int ret;
>
> - vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
> - if (!vdev)
> + hisi_acc_vdev = kzalloc(sizeof(*hisi_acc_vdev), GFP_KERNEL);
> + if (!hisi_acc_vdev)
> return -ENOMEM;
>
> - vfio_pci_core_init_device(vdev, pdev, &hisi_acc_vfio_pci_ops);
> + pf_qm = hisi_acc_get_pf_qm(pdev);
> + if (pf_qm && pf_qm->ver >= QM_HW_V3) {
> + ret = hisi_acc_vfio_pci_migrn_init(hisi_acc_vdev, pdev, pf_qm);
> + if (ret < 0) {
> + kfree(hisi_acc_vdev);
> + return ret;
> + }
This error path can only occur if the VF ID lookup fails, but should we
fall through to the non-migration ops, maybe with a dev_warn()? Thanks,
Alex
> +
> + vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
> + &hisi_acc_vfio_pci_migrn_ops);
> + } else {
> + vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
> + &hisi_acc_vfio_pci_ops);
> + }