2023-04-08 07:44:39

by liulongfang

Subject: [PATCH v10 0/5] add debugfs to migration driver

Add debugfs support to the VFIO migration driver to provide
step-by-step testing of the migration driver.

When a live migration fails, the user can view the state and data
of the migration process on the source and on the destination
separately, which makes it easier to analyze and locate problems.

Changes v9 -> v10
Update the debugfs file of the live migration driver.

Changes v8 -> v9
Update the debugfs directory structure of vfio.

Changes v7 -> v8
Add support for platform devices.

Changes v6 -> v7
Fix some code style issues.

Changes v5 -> v6
Control the creation of debugfs through the CONFIG_DEBUG_FS.

Changes v4 -> v5
Remove the newly added vfio_migration_ops and use seq_printf
to optimize the implementation of debugfs.

Changes v3 -> v4
Change the migration_debug_operate interface to debug_root file.

Changes v2 -> v3
Extend the debugfs function from hisilicon device to vfio.

Changes v1 -> v2
Change the registration method of root_debugfs to register
with module initialization.

Longfang Liu (5):
vfio/migration: Add debugfs to live migration driver
hisi_acc_vfio_pci: extract public functions for container_of
hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
Documentation: add debugfs description for vfio
vfio: update live migration device status

.../ABI/testing/debugfs-hisi-migration | 39 +++
Documentation/ABI/testing/debugfs-vfio | 25 ++
MAINTAINERS | 2 +
drivers/vfio/Makefile | 2 +-
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 246 +++++++++++++++++-
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.h | 11 +
drivers/vfio/pci/mlx5/main.c | 3 +
drivers/vfio/vfio.h | 14 +
drivers/vfio/vfio_debugfs.c | 78 ++++++
drivers/vfio/vfio_main.c | 9 +-
include/linux/vfio.h | 8 +
11 files changed, 425 insertions(+), 12 deletions(-)
create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
create mode 100644 Documentation/ABI/testing/debugfs-vfio
create mode 100644 drivers/vfio/vfio_debugfs.c

--
2.24.0


2023-04-08 07:45:14

by liulongfang

Subject: [PATCH v10 1/5] vfio/migration: Add debugfs to live migration driver

Live migration involves multiple devices, software components and
operational steps. An error occurring at any node may cause the
live migration operation to fail.
This complexity makes it very difficult to locate and analyze the
cause when the operation fails.

To make it possible to quickly locate the cause of a live migration
failure, add a set of debugfs files to the vfio live migration
driver.

   +-------------------------------------------+
   |                                           |
   |                                           |
   |                   QEMU                    |
   |                                           |
   |                                           |
   +---+----------------------------+----------+
       |      ^                     |      ^
       |      |                     |      |
       |      |                     |      |
       v      |                     v      |
    +---------+--+               +---------+--+
    |src vfio_dev|               |dst vfio_dev|
    +--+---------+               +--+---------+
       |      ^                     |      ^
       |      |                     |      |
       v      |                     |      |
  +-----------+----+           +-----------+----+
  |src dev debugfs |           |dst dev debugfs |
  +----------------+           +----------------+

The entire debugfs directory depends on the CONFIG_DEBUG_FS macro.
If this macro is not enabled, the interfaces in vfio.h are empty
definitions, and the debugfs directory is neither created nor
initialized.
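The stub pattern described above can be sketched in plain C as follows.
This is a minimal userspace illustration, not the patch's code:
DEMO_DEBUG_FS stands in for CONFIG_DEBUG_FS, and struct demo_device,
demo_debugfs_init() and demo_register() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * DEMO_DEBUG_FS is a hypothetical stand-in for CONFIG_DEBUG_FS.
 * Build with -DDEMO_DEBUG_FS to compile in the real hook.
 */
struct demo_device {
	const char *name;
	int registered;
	int debugfs_ready;
};

#ifdef DEMO_DEBUG_FS
/* Real implementation: would create the debugfs entries for the device. */
static inline void demo_debugfs_init(struct demo_device *dev)
{
	dev->debugfs_ready = 1;
}
#else
/* Empty stub: callers need no #ifdef of their own. */
static inline void demo_debugfs_init(struct demo_device *dev)
{
	(void)dev;
}
#endif

/* The registration path calls the hook unconditionally. */
static void demo_register(struct demo_device *dev)
{
	assert(dev != NULL);
	dev->registered = 1;
	demo_debugfs_init(dev);
}
```

Because the stub is an empty static inline function, the caller stays
free of conditional compilation whether or not the option is set.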

    vfio
     |
     +---<dev_name1>
     |    +---migration
     |         +--state
     |         +--hisi_acc
     |              +--attr
     |              +--data
     |              +--debug
     |
     +---<dev_name2>
          +---migration
               +--state
               +--hisi_acc
                    +--attr
                    +--data
                    +--debug

debugfs creates a common root directory, "vfio", and then a
dev_name() directory for each live migration device.
First, a common "migration" directory is created in the device
directory.
Then, a common live migration state lookup file, "state", is
created in it.
Finally, a directory named after the device type is created, and
the device's own debug files are created under that directory.
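The "state" file reports the migration state as a string. One way to
express that mapping is a lookup table instead of a switch statement.
The sketch below is an assumption-laden illustration: enum
demo_mig_state only mirrors the state names used in this series (its
numeric values are not the kernel's), and demo_state_name() is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the state names; the values are illustrative only. */
enum demo_mig_state {
	DEMO_STATE_ERROR,
	DEMO_STATE_STOP,
	DEMO_STATE_RUNNING,
	DEMO_STATE_STOP_COPY,
	DEMO_STATE_RESUMING,
	DEMO_STATE_RUNNING_P2P,
	DEMO_STATE_COUNT
};

static const char *const demo_state_names[DEMO_STATE_COUNT] = {
	[DEMO_STATE_ERROR]       = "ERROR",
	[DEMO_STATE_STOP]        = "STOP",
	[DEMO_STATE_RUNNING]     = "RUNNING",
	[DEMO_STATE_STOP_COPY]   = "STOP_COPY",
	[DEMO_STATE_RESUMING]    = "RESUMING",
	[DEMO_STATE_RUNNING_P2P] = "RUNNING_P2P",
};

/* Return the printable name, or "Invalid" for out-of-range values. */
static const char *demo_state_name(int state)
{
	if (state < 0 || state >= DEMO_STATE_COUNT)
		return "Invalid";
	assert(demo_state_names[state] != NULL);
	return demo_state_names[state];
}
```

Keeping each name next to its enumerator in one table makes a
mismatched label easier to spot than in a long switch.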

Here, the HiSilicon accelerator driver creates three debug files:
attr: used to obtain the attribute parameters of the
current live migration device.
data: used to get the live migration data of the current
live migration device.
debug: used to debug the current live migration device
through commands.

The live migration function of the current device can be tested by
operating the debug files, and the functional status of the device
and software at each stage can be tested step by step without
performing a complete live migration. After a live migration has
been performed, the device's migration data can be obtained through
the debug files.
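From userspace, the state file could be read with ordinary file I/O.
The following is a sketch that assumes debugfs is mounted at
/sys/kernel/debug and that the directory layout proposed in this series
is in place. demo_state_path() and demo_read_line() are hypothetical
helper names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the path of the proposed per-device "state" file; 0 on success. */
static int demo_state_path(char *buf, size_t len, const char *dev_name)
{
	int n;

	assert(buf != NULL && dev_name != NULL);
	n = snprintf(buf, len, "/sys/kernel/debug/vfio/%s/migration/state",
		     dev_name);
	/* Treat both encoding errors and truncation as failure. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of a file, dropping the newline; 0 on success. */
static int demo_read_line(const char *path, char *out, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;	/* e.g. debugfs not mounted, or no such device */
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller would pass the device name as reported by dev_name() and print
the line returned by demo_read_line(). Reading debugfs normally
requires root privileges.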

Signed-off-by: Longfang Liu <[email protected]>
---
drivers/vfio/Makefile | 2 +-
drivers/vfio/vfio.h | 14 +++++++
drivers/vfio/vfio_debugfs.c | 78 +++++++++++++++++++++++++++++++++++++
drivers/vfio/vfio_main.c | 9 ++++-
include/linux/vfio.h | 8 ++++
5 files changed, 109 insertions(+), 2 deletions(-)
create mode 100644 drivers/vfio/vfio_debugfs.c

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 70e7dcb302ef..1debcff31d30 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -7,7 +7,7 @@ vfio-y += vfio_main.o \
vfio-$(CONFIG_IOMMUFD) += iommufd.o
vfio-$(CONFIG_VFIO_CONTAINER) += container.o
vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o
-
+vfio-$(CONFIG_DEBUG_FS) += vfio_debugfs.o
obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index e9721d8424bc..8e5cafa6aa3a 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -258,4 +258,18 @@ static inline void vfio_device_put_kvm(struct vfio_device *device)
}
#endif

+#ifdef CONFIG_DEBUG_FS
+void vfio_debugfs_create_root(void);
+void vfio_debugfs_remove_root(void);
+
+void vfio_device_debugfs_init(struct vfio_device *vdev);
+void vfio_device_debugfs_exit(struct vfio_device *vdev);
+#else
+static inline void vfio_debugfs_create_root(void) { }
+static inline void vfio_debugfs_remove_root(void) { }
+
+static inline void vfio_device_debugfs_init(struct vfio_device *vdev) { }
+static inline void vfio_device_debugfs_exit(struct vfio_device *vdev) { }
+#endif /* CONFIG_DEBUG_FS */
+
#endif
diff --git a/drivers/vfio/vfio_debugfs.c b/drivers/vfio/vfio_debugfs.c
new file mode 100644
index 000000000000..7bff30f76bd9
--- /dev/null
+++ b/drivers/vfio/vfio_debugfs.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, HiSilicon Ltd.
+ */
+
+#include <linux/device.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#include <linux/vfio.h>
+#include "vfio.h"
+
+static struct dentry *vfio_debugfs_root;
+
+static int vfio_device_state_read(struct seq_file *seq, void *data)
+{
+ struct device *vf_dev = seq->private;
+ struct vfio_device *vdev = container_of(vf_dev, struct vfio_device, device);
+ enum vfio_device_mig_state state;
+ int ret;
+
+ ret = vdev->mig_ops->migration_get_state(vdev, &state);
+ if (ret)
+ return -EINVAL;
+
+ switch (state) {
+ case VFIO_DEVICE_STATE_RUNNING:
+ seq_printf(seq, "%s\n", "RUNNING");
+ break;
+ case VFIO_DEVICE_STATE_STOP_COPY:
+ seq_printf(seq, "%s\n", "STOP_COPY");
+ break;
+ case VFIO_DEVICE_STATE_STOP:
+ seq_printf(seq, "%s\n", "STOP");
+ break;
+ case VFIO_DEVICE_STATE_RESUMING:
+ seq_printf(seq, "%s\n", "RESUMING");
+ break;
+ case VFIO_DEVICE_STATE_RUNNING_P2P:
+ seq_printf(seq, "%s\n", "RUNNING_P2P");
+ break;
+ case VFIO_DEVICE_STATE_ERROR:
+ seq_printf(seq, "%s\n", "ERROR");
+ break;
+ default:
+ seq_printf(seq, "%s\n", "Invalid");
+ }
+
+ return 0;
+}
+
+void vfio_device_debugfs_init(struct vfio_device *vdev)
+{
+ struct dentry *vfio_dev_migration = NULL;
+ struct device *dev = &vdev->device;
+
+ vdev->debug_root = debugfs_create_dir(dev_name(vdev->dev), vfio_debugfs_root);
+ vfio_dev_migration = debugfs_create_dir("migration", vdev->debug_root);
+
+ debugfs_create_devm_seqfile(dev, "state", vfio_dev_migration,
+ vfio_device_state_read);
+}
+
+void vfio_device_debugfs_exit(struct vfio_device *vdev)
+{
+ debugfs_remove_recursive(vdev->debug_root);
+}
+
+void vfio_debugfs_create_root(void)
+{
+ vfio_debugfs_root = debugfs_create_dir("vfio", NULL);
+}
+
+void vfio_debugfs_remove_root(void)
+{
+ debugfs_remove_recursive(vfio_debugfs_root);
+ vfio_debugfs_root = NULL;
+}
+
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 3a597e799918..e9ddf6612e44 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -274,7 +274,8 @@ static int __vfio_register_dev(struct vfio_device *device,

/* Refcounting can't start until the driver calls register */
refcount_set(&device->refcount, 1);
-
+ if (device->mig_ops)
+ vfio_device_debugfs_init(device);
vfio_device_group_register(device);

return 0;
@@ -331,6 +332,8 @@ void vfio_unregister_group_dev(struct vfio_device *device)
}
}

+ if (device->mig_ops)
+ vfio_device_debugfs_exit(device);
vfio_device_group_unregister(device);

/* Balances device_add in register path */
@@ -1407,7 +1410,10 @@ static int __init vfio_init(void)
goto err_dev_class;
}

+
+ vfio_debugfs_create_root();
pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
+
return 0;

err_dev_class:
@@ -1425,6 +1431,7 @@ static void __exit vfio_cleanup(void)
vfio_virqfd_exit();
vfio_group_cleanup();
xa_destroy(&vfio_device_set_xa);
+ vfio_debugfs_remove_root();
}

module_init(vfio_init);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 93134b023968..fa6b898ebb58 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -63,6 +63,14 @@ struct vfio_device {
struct iommufd_ctx *iommufd_ictx;
bool iommufd_attached;
#endif
+
+#ifdef CONFIG_DEBUG_FS
+ /*
+ * debug_root is a static property of the vfio_device
+ * which must be set prior to registering the vfio_device.
+ */
+ struct dentry *debug_root;
+#endif
};

/**
--
2.24.0

2023-04-08 07:45:23

by liulongfang

Subject: [PATCH v10 2/5] hisi_acc_vfio_pci: extract public functions for container_of

In the current driver, struct hisi_acc_vf_core_device is obtained
from vdev through the container_of macro.
This pattern is repeated in many places in the driver. To reduce
the repetition, extract a common helper function to replace it.
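The idea behind the helper can be shown with a small userspace sketch.
The demo_* structures below are hypothetical stand-ins for the driver's
nested structures, and demo_container_of() is a simplified version of
the kernel macro without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of container_of(), without type checking. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_vfio_device {
	int open_count;
};

struct demo_core_device {
	int irq;
	struct demo_vfio_device vdev;
};

struct demo_vf_device {
	int mig_state;
	struct demo_core_device core_device;
};

/* One helper replaces the repeated container_of() expression. */
static struct demo_vf_device *demo_get_vf_dev(struct demo_vfio_device *vdev)
{
	assert(vdev != NULL);
	return demo_container_of(vdev, struct demo_vf_device,
				 core_device.vdev);
}
```

Every callback that receives the inner pointer can then call the helper
instead of spelling out the nested member path.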

Signed-off-by: Longfang Liu <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
---
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 21 ++++++++++---------
1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index a117eaf21c14..a1589947e721 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -630,6 +630,12 @@ static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vde
}
}

+static struct hisi_acc_vf_core_device *hisi_acc_get_vf_dev(struct vfio_device *vdev)
+{
+ return container_of(vdev, struct hisi_acc_vf_core_device,
+ core_device.vdev);
+}
+
/*
* This function is called in all state_mutex unlock cases to
* handle a 'deferred_reset' if exists.
@@ -1042,8 +1048,7 @@ static struct file *
hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
enum vfio_device_mig_state new_state)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(vdev,
- struct hisi_acc_vf_core_device, core_device.vdev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
enum vfio_device_mig_state next_state;
struct file *res = NULL;
int ret;
@@ -1084,8 +1089,7 @@ static int
hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
enum vfio_device_mig_state *curr_state)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(vdev,
- struct hisi_acc_vf_core_device, core_device.vdev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);

mutex_lock(&hisi_acc_vdev->state_mutex);
*curr_state = hisi_acc_vdev->mig_state;
@@ -1301,8 +1305,7 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int

static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
- struct hisi_acc_vf_core_device, core_device.vdev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
struct vfio_pci_core_device *vdev = &hisi_acc_vdev->core_device;
int ret;

@@ -1325,8 +1328,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)

static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
- struct hisi_acc_vf_core_device, core_device.vdev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;

iounmap(vf_qm->io_base);
@@ -1341,8 +1343,7 @@ static const struct vfio_migration_ops hisi_acc_vfio_pci_migrn_state_ops = {

static int hisi_acc_vfio_pci_migrn_init_dev(struct vfio_device *core_vdev)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
- struct hisi_acc_vf_core_device, core_device.vdev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
struct hisi_qm *pf_qm = hisi_acc_get_pf_qm(pdev);

--
2.24.0

2023-04-08 07:46:02

by liulongfang

Subject: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

With the CONFIG_DEBUG_FS macro enabled, register debug functions for
the live migration driver of the HiSilicon accelerator device on top of
the VFIO debugfs framework.

After the HiSilicon accelerator device is registered with the vfio
live migration debugfs framework, a debugfs directory "hisi_acc" is
created, and three debug function files are then created in this
directory:

data file: used to get the migration data of the live migration device
attr file: used to get device attributes of the live migration device
debug file: used to test acquiring and writing device state data
for the VF device.

Signed-off-by: Longfang Liu <[email protected]>
---
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 222 ++++++++++++++++++
.../vfio/pci/hisilicon/hisi_acc_vfio_pci.h | 11 +
2 files changed, 233 insertions(+)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index a1589947e721..35abe5face47 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -15,6 +15,7 @@
#include <linux/anon_inodes.h>

#include "hisi_acc_vfio_pci.h"
+#include "../../vfio.h"

/* Return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
static int qm_wait_dev_not_ready(struct hisi_qm *qm)
@@ -606,6 +607,18 @@ hisi_acc_check_int_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
}
}

+static void hisi_acc_vf_migf_save(struct hisi_acc_vf_migration_file *src_migf,
+ struct hisi_acc_vf_migration_file *dst_migf)
+{
+ if (!dst_migf)
+ return;
+
+ dst_migf->disabled = false;
+ dst_migf->total_length = src_migf->total_length;
+ memcpy(&dst_migf->vf_data, &src_migf->vf_data,
+ sizeof(struct acc_vf_data));
+}
+
static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
{
mutex_lock(&migf->lock);
@@ -618,12 +631,16 @@ static void hisi_acc_vf_disable_fd(struct hisi_acc_vf_migration_file *migf)
static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vdev)
{
if (hisi_acc_vdev->resuming_migf) {
+ hisi_acc_vf_migf_save(hisi_acc_vdev->resuming_migf,
+ hisi_acc_vdev->debug_migf);
hisi_acc_vf_disable_fd(hisi_acc_vdev->resuming_migf);
fput(hisi_acc_vdev->resuming_migf->filp);
hisi_acc_vdev->resuming_migf = NULL;
}

if (hisi_acc_vdev->saving_migf) {
+ hisi_acc_vf_migf_save(hisi_acc_vdev->saving_migf,
+ hisi_acc_vdev->debug_migf);
hisi_acc_vf_disable_fd(hisi_acc_vdev->saving_migf);
fput(hisi_acc_vdev->saving_migf->filp);
hisi_acc_vdev->saving_migf = NULL;
@@ -1303,6 +1320,204 @@ static long hisi_acc_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int
return vfio_pci_core_ioctl(core_vdev, cmd, arg);
}

+static int hisi_acc_vf_debug_check(struct seq_file *seq, struct vfio_device *vdev)
+{
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+ struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
+
+ if (!vdev->mig_ops || !migf) {
+ seq_printf(seq, "%s\n", "device not support debugfs!");
+ return -EINVAL;
+ }
+
+ /* If device not opened, the debugfs operation will trigger calltrace */
+ if (!vdev->open_count) {
+ seq_printf(seq, "%s\n", "device not opened!");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hisi_acc_vf_debug_io(struct seq_file *seq, void *data)
+{
+ struct device *vf_dev = seq->private;
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+ struct vfio_device *vdev = &core_device->vdev;
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+ struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
+ u64 value;
+ int ret;
+
+ ret = hisi_acc_vf_debug_check(seq, vdev);
+ if (ret)
+ goto io_err;
+
+ ret = qm_wait_dev_not_ready(vf_qm);
+ if (ret) {
+ seq_printf(seq, "%s\n", "VF device not ready!");
+ goto io_err;
+ }
+
+ value = readl(vf_qm->io_base + QM_MB_CMD_SEND_BASE);
+ seq_printf(seq, "%s:0x%llx\n", "debug mailbox val", value);
+
+io_err:
+ return 0;
+}
+
+static int hisi_acc_vf_debug_restore(struct seq_file *seq, void *data)
+{
+ struct device *vf_dev = seq->private;
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+ struct vfio_device *vdev = &core_device->vdev;
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+ struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
+ int ret;
+
+ ret = hisi_acc_vf_debug_check(seq, vdev);
+ if (ret)
+ goto restore_err;
+
+ ret = vf_qm_state_save(hisi_acc_vdev, migf);
+ if (ret) {
+ seq_printf(seq, "%s\n", "failed to save device data!");
+ goto restore_err;
+ }
+
+ ret = vf_qm_check_match(hisi_acc_vdev, migf);
+ if (ret) {
+ seq_printf(seq, "%s\n", "failed to match the VF!");
+ goto restore_err;
+ }
+
+ ret = vf_qm_load_data(hisi_acc_vdev, migf);
+ if (ret) {
+ seq_printf(seq, "%s\n", "failed to recover the VF!");
+ goto restore_err;
+ }
+
+ vf_qm_fun_reset(&hisi_acc_vdev->vf_qm);
+ seq_printf(seq, "%s\n", "successful to resume device data!");
+
+restore_err:
+ return 0;
+}
+
+static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
+{
+ struct device *vf_dev = seq->private;
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+ struct vfio_device *vdev = &core_device->vdev;
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+ struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
+ int ret;
+
+ ret = hisi_acc_vf_debug_check(seq, vdev);
+ if (ret)
+ goto save_err;
+
+ ret = vf_qm_state_save(hisi_acc_vdev, migf);
+ if (ret) {
+ seq_printf(seq, "%s\n", "failed to save device data!");
+ goto save_err;
+ }
+ seq_printf(seq, "%s\n", "successful to save device data!");
+
+save_err:
+ return 0;
+}
+
+static int hisi_acc_vf_data_read(struct seq_file *seq, void *data)
+{
+ struct device *vf_dev = seq->private;
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+ struct vfio_device *vdev = &core_device->vdev;
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+ struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
+ size_t vf_data_sz = offsetofend(struct acc_vf_data, padding);
+
+ if (debug_migf && debug_migf->total_length)
+ seq_hex_dump(seq, "Mig Data:", DUMP_PREFIX_OFFSET, 16, 1,
+ (unsigned char *)&debug_migf->vf_data,
+ vf_data_sz, false);
+ else
+ seq_printf(seq, "%s\n", "device not migrated!");
+
+ return 0;
+}
+
+static int hisi_acc_vf_attr_read(struct seq_file *seq, void *data)
+{
+ struct device *vf_dev = seq->private;
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
+ struct vfio_device *vdev = &core_device->vdev;
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
+ struct hisi_acc_vf_migration_file *debug_migf = hisi_acc_vdev->debug_migf;
+
+ if (debug_migf && debug_migf->total_length) {
+ seq_printf(seq,
+ "acc device:\n"
+ "device state: %d\n"
+ "device ready: %u\n"
+ "data valid: %d\n"
+ "data size: %lu\n",
+ hisi_acc_vdev->mig_state,
+ hisi_acc_vdev->vf_qm_state,
+ debug_migf->disabled,
+ debug_migf->total_length);
+ } else {
+ seq_printf(seq, "%s\n", "device not migrated!");
+ }
+
+ return 0;
+}
+
+static int hisi_acc_vfio_debug_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ struct vfio_device *vdev = &hisi_acc_vdev->core_device.vdev;
+ struct dentry *vfio_dev_migration = NULL;
+ struct dentry *vfio_hisi_acc = NULL;
+ struct device *dev = vdev->dev;
+ void *migf = NULL;
+
+ if (!debugfs_initialized())
+ return 0;
+
+ migf = kzalloc(sizeof(struct hisi_acc_vf_migration_file), GFP_KERNEL);
+ if (!migf)
+ return -ENOMEM;
+ hisi_acc_vdev->debug_migf = migf;
+
+ vfio_dev_migration = debugfs_lookup("migration", vdev->debug_root);
+ if (!vfio_dev_migration) {
+ dev_err(dev, "failed to lookup migration debugfs file!\n");
+ return -ENODEV;
+ }
+
+ vfio_hisi_acc = debugfs_create_dir("hisi_acc", vfio_dev_migration);
+ debugfs_create_devm_seqfile(dev, "data", vfio_hisi_acc,
+ hisi_acc_vf_data_read);
+ debugfs_create_devm_seqfile(dev, "attr", vfio_hisi_acc,
+ hisi_acc_vf_attr_read);
+ debugfs_create_devm_seqfile(dev, "io_test", vfio_hisi_acc,
+ hisi_acc_vf_debug_io);
+ debugfs_create_devm_seqfile(dev, "save", vfio_hisi_acc,
+ hisi_acc_vf_debug_save);
+ debugfs_create_devm_seqfile(dev, "restore", vfio_hisi_acc,
+ hisi_acc_vf_debug_restore);
+
+ return 0;
+}
+
+static void hisi_acc_vf_debugfs_exit(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+{
+ if (!debugfs_initialized())
+ return;
+
+ kfree(hisi_acc_vdev->debug_migf);
+}
+
static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
@@ -1323,6 +1538,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
}

vfio_pci_core_finish_enable(vdev);
+
return 0;
}

@@ -1420,9 +1636,14 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
if (ret)
goto out_put_vdev;
+
+ if (ops == &hisi_acc_vfio_pci_migrn_ops)
+ hisi_acc_vfio_debug_init(hisi_acc_vdev);
return 0;

out_put_vdev:
+ if (ops == &hisi_acc_vfio_pci_migrn_ops)
+ hisi_acc_vf_debugfs_exit(hisi_acc_vdev);
vfio_put_device(&hisi_acc_vdev->core_device.vdev);
return ret;
}
@@ -1431,6 +1652,7 @@ static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);

+ hisi_acc_vf_debugfs_exit(hisi_acc_vdev);
vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
vfio_put_device(&hisi_acc_vdev->core_device.vdev);
}
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
index dcabfeec6ca1..ef50b12f018d 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
@@ -49,6 +49,14 @@
#define QM_EQC_DW0 0X8000
#define QM_AEQC_DW0 0X8020

+#define VFIO_DEV_DBG_LEN 256
+
+enum mig_debug_cmd {
+ STATE_SAVE,
+ STATE_RESUME,
+ RW_IO_TEST,
+};
+
struct acc_vf_data {
#define QM_MATCH_SIZE offsetofend(struct acc_vf_data, qm_rsv_state)
/* QM match information */
@@ -113,5 +121,8 @@ struct hisi_acc_vf_core_device {
spinlock_t reset_lock;
struct hisi_acc_vf_migration_file *resuming_migf;
struct hisi_acc_vf_migration_file *saving_migf;
+
+ /* For debugfs */
+ struct hisi_acc_vf_migration_file *debug_migf;
};
#endif /* HISI_ACC_VFIO_PCI_H */
--
2.24.0

2023-04-08 07:46:57

by liulongfang

Subject: [PATCH v10 4/5] Documentation: add debugfs description for vfio

1. Add two debugfs ABI description files to help users understand
how to use the accelerator live migration driver's debugfs.
2. Update the file paths that need to be maintained in MAINTAINERS.

Signed-off-by: Longfang Liu <[email protected]>
---
.../ABI/testing/debugfs-hisi-migration | 39 +++++++++++++++++++
Documentation/ABI/testing/debugfs-vfio | 25 ++++++++++++
MAINTAINERS | 2 +
3 files changed, 66 insertions(+)
create mode 100644 Documentation/ABI/testing/debugfs-hisi-migration
create mode 100644 Documentation/ABI/testing/debugfs-vfio

diff --git a/Documentation/ABI/testing/debugfs-hisi-migration b/Documentation/ABI/testing/debugfs-hisi-migration
new file mode 100644
index 000000000000..e67478685ed0
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-hisi-migration
@@ -0,0 +1,39 @@
+What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/data
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: Read the live migration data of the vfio device.
+ The output format of the data is defined by the live
+ migration driver.
+
+What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/attr
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: Read the live migration attributes of the vfio device.
+ The output format of the attributes is defined by the live
+ migration driver.
+
+What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/io_test
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: Trigger the HiSilicon accelerator device to perform
+ the io test through the read operation, and directly output
+ the test result.
+
+What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/save
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: Trigger the Hisilicon accelerator device to perform
+ the state saving operation of live migration through the read
+ operation, and directly output the operation result.
+
+What: /sys/kernel/debug/vfio/<device>/migration/hisi_acc/restore
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: Trigger the Hisilicon accelerator device to perform
+ the state restoration operation of live migration through
+ the read operation, and directly output the operation result.
\ No newline at end of file
diff --git a/Documentation/ABI/testing/debugfs-vfio b/Documentation/ABI/testing/debugfs-vfio
new file mode 100644
index 000000000000..85d2b676cb87
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-vfio
@@ -0,0 +1,25 @@
+What: /sys/kernel/debug/vfio
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: This debugfs file directory is used for debugging
+ of vfio devices.
+ Each device can create a device subdirectory under this
+ directory by referencing the public registration interface.
+
+What: /sys/kernel/debug/vfio/<device>/migration
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: This debugfs file directory is used for debugging
+ of vfio devices that support live migration.
+ The debugfs of each vfio device that supports live migration
+ could be created under this directory.
+
+What: /sys/kernel/debug/vfio/<device>/migration/state
+Date: April 2023
+KernelVersion: 6.2
+Contact: Longfang Liu <[email protected]>
+Description: Read the live migration status of the vfio device.
+ The status of these live migrations includes:
+ ERROR, RUNNING, STOP, STOP_COPY, RESUMING.
diff --git a/MAINTAINERS b/MAINTAINERS
index 61d3d019cfc7..147c597c9239 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21860,6 +21860,7 @@ L: [email protected]
S: Maintained
T: git https://github.com/awilliam/linux-vfio.git
F: Documentation/ABI/testing/sysfs-devices-vfio-dev
+F: Documentation/ABI/testing/debugfs-vfio
F: Documentation/driver-api/vfio.rst
F: drivers/vfio/
F: include/linux/vfio.h
@@ -21877,6 +21878,7 @@ M: Longfang Liu <[email protected]>
M: Shameer Kolothum <[email protected]>
L: [email protected]
S: Maintained
+F: Documentation/ABI/testing/debugfs-hisi-migration
F: drivers/vfio/pci/hisilicon/

VFIO MEDIATED DEVICE DRIVERS
--
2.24.0

2023-04-08 07:47:33

by liulongfang

Subject: [PATCH v10 5/5] vfio: update live migration device status

migration debugfs needs to perform debug operations based on the
status of the current device. If the device is not loaded or has
stopped, debugfs does not allow operations.

so, after the live migration function is executed and the device is
turned off, the device no longer needs to be accessed. At this time,
the status of the device needs to be set to stop.

Signed-off-by: Longfang Liu <[email protected]>
---
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 3 +++
drivers/vfio/pci/mlx5/main.c | 3 +++
2 files changed, 6 insertions(+)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 35abe5face47..f15d5bfd3550 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -1547,6 +1547,9 @@ static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(core_vdev);
struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;

+ if (core_vdev->mig_ops)
+ hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_STOP;
+
iounmap(vf_qm->io_base);
vfio_pci_core_close_device(core_vdev);
}
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index e897537a9e8a..dc3564436946 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -1269,6 +1269,9 @@ static void mlx5vf_pci_close_device(struct vfio_device *core_vdev)
struct mlx5vf_pci_core_device *mvdev = container_of(
core_vdev, struct mlx5vf_pci_core_device, core_device.vdev);

+ if (mvdev->migrate_cap)
+ mvdev->mig_state = VFIO_DEVICE_STATE_STOP;
+
mlx5vf_cmd_close_migratable(mvdev);
vfio_pci_core_close_device(core_vdev);
}
--
2.24.0

2023-04-14 12:25:42

by Jason Gunthorpe

Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On Sat, Apr 08, 2023 at 03:42:22PM +0800, Longfang Liu wrote:
> +static int hisi_acc_vf_debug_restore(struct seq_file *seq, void *data)
> +{
> + struct device *vf_dev = seq->private;
> + struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
> + struct vfio_device *vdev = &core_device->vdev;
> + struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> + struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
> + int ret;
> +
> + ret = hisi_acc_vf_debug_check(seq, vdev);
> + if (ret)
> + goto restore_err;
> +
> + ret = vf_qm_state_save(hisi_acc_vdev, migf);
> + if (ret) {
> + seq_printf(seq, "%s\n", "failed to save device data!");
> + goto restore_err;
> + }
> +
> + ret = vf_qm_check_match(hisi_acc_vdev, migf);
> + if (ret) {
> + seq_printf(seq, "%s\n", "failed to match the VF!");
> + goto restore_err;
> + }
> +
> + ret = vf_qm_load_data(hisi_acc_vdev, migf);
> + if (ret) {
> + seq_printf(seq, "%s\n", "failed to recover the VF!");
> + goto restore_err;
> + }
> +
> + vf_qm_fun_reset(&hisi_acc_vdev->vf_qm);
> + seq_printf(seq, "%s\n", "successful to resume device data!");
> +
> +restore_err:
> + return 0;
> +}

This is basically an in-kernel self test, it should be protected with
some kind of VFIO selftest kconfig.

Though, I wonder why we need it??? Can't you write a trivial userspace
program under tools/testing to do this sequence with the ioctls?

> +static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
> +{
> + struct device *vf_dev = seq->private;
> + struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
> + struct vfio_device *vdev = &core_device->vdev;
> + struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
> + struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
> + int ret;
> +
> + ret = hisi_acc_vf_debug_check(seq, vdev);
> + if (ret)
> + goto save_err;
> +
> + ret = vf_qm_state_save(hisi_acc_vdev, migf);
> + if (ret) {
> + seq_printf(seq, "%s\n", "failed to save device data!");
> + goto save_err;
> + }
> + seq_printf(seq, "%s\n", "successful to save device data!");
> +
> +save_err:
> + return 0;
> +}

Same kind of comment here, this is a selftest, why does it need a
special kernel interface?

.. and so on..

I thought the non-selftesty bits were OK, maybe split the patch to
match progress

Jason

2023-04-14 12:26:21

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v10 5/5] vfio: update live migration device status

On Sat, Apr 08, 2023 at 03:42:24PM +0800, Longfang Liu wrote:
> migration debugfs needs to perform debug operations based on the
> status of the current device. If the device is not loaded or has
> stopped, debugfs does not allow operations.
>
> so, after the live migration function is executed and the device is
> turned off, the device no longer needs to be accessed. At this time,
> the status of the device needs to be set to stop.

STOP means the device isn't functioning.

An idle device that has just been reset is RUNNING by definition.

Jason

2023-04-21 03:12:15

by liulongfang

[permalink] [raw]
Subject: Re: [PATCH v10 5/5] vfio: update live migration device status

On 2023/4/14 20:25, Jason Gunthorpe wrote:
> On Sat, Apr 08, 2023 at 03:42:24PM +0800, Longfang Liu wrote:
>> migration debugfs needs to perform debug operations based on the
>> status of the current device. If the device is not loaded or has
>> stopped, debugfs does not allow operations.
>>
>> so, after the live migration function is executed and the device is
>> turned off, the device no longer needs to be accessed. At this time,
>> the status of the device needs to be set to stop.
>
> STOP means the device isn't functioning.
>
> An idle device that has just been reset is RUNNING by definition.
>

After the vfio device is opened, it will be set to RUNNING, and after it
is closed, it should be set to STOP according to the function of the device.
Or redefine an IDLE state?

Thanks,
Longfang.
> Jason
> .
>

2023-04-21 03:44:35

by liulongfang

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On 2023/4/14 20:24, Jason Gunthorpe wrote:
> On Sat, Apr 08, 2023 at 03:42:22PM +0800, Longfang Liu wrote:
>> +static int hisi_acc_vf_debug_restore(struct seq_file *seq, void *data)
>> +{
>> + struct device *vf_dev = seq->private;
>> + struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>> + struct vfio_device *vdev = &core_device->vdev;
>> + struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> + struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
>> + int ret;
>> +
>> + ret = hisi_acc_vf_debug_check(seq, vdev);
>> + if (ret)
>> + goto restore_err;
>> +
>> + ret = vf_qm_state_save(hisi_acc_vdev, migf);
>> + if (ret) {
>> + seq_printf(seq, "%s\n", "failed to save device data!");
>> + goto restore_err;
>> + }
>> +
>> + ret = vf_qm_check_match(hisi_acc_vdev, migf);
>> + if (ret) {
>> + seq_printf(seq, "%s\n", "failed to match the VF!");
>> + goto restore_err;
>> + }
>> +
>> + ret = vf_qm_load_data(hisi_acc_vdev, migf);
>> + if (ret) {
>> + seq_printf(seq, "%s\n", "failed to recover the VF!");
>> + goto restore_err;
>> + }
>> +
>> + vf_qm_fun_reset(&hisi_acc_vdev->vf_qm);
>> + seq_printf(seq, "%s\n", "successful to resume device data!");
>> +
>> +restore_err:
>> + return 0;
>> +}
>
> This is basically an in-kernel self test, it should be protected with
> some kind of VFIO selftest kconfig.
>
As a debugfs function, its usage will be more flexible for users.

> Though, I wonder why we need it???
After a live migration error occurs, this debugfs function lets you
perform separate functional tests on the source and destination
to locate the cause of the error.

> Can't you write a trivial userspace
> program under tools/testing to do this sequence with the ioctls?
>
Sorry, I still wish this feature was a simple debugfs feature.
If you want the userspace testing tool you mentioned,
you can try it on mlx5.

Thanks,
Longfang.
>> +static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
>> +{
>> + struct device *vf_dev = seq->private;
>> + struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>> + struct vfio_device *vdev = &core_device->vdev;
>> + struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>> + struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
>> + int ret;
>> +
>> + ret = hisi_acc_vf_debug_check(seq, vdev);
>> + if (ret)
>> + goto save_err;
>> +
>> + ret = vf_qm_state_save(hisi_acc_vdev, migf);
>> + if (ret) {
>> + seq_printf(seq, "%s\n", "failed to save device data!");
>> + goto save_err;
>> + }
>> + seq_printf(seq, "%s\n", "successful to save device data!");
>> +
>> +save_err:
>> + return 0;
>> +}
>
> Same kind of comment here, this is a selftest, why does it need a
> special kernel interface?
>
> .. and so on..
>
> I thought the non-selftesty bits were OK, maybe split the patch to
> match progress
>
> Jason
> .
>

2023-04-21 03:44:51

by liulongfang

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On 2023/4/21 11:27, liulongfang wrote:
> On 2023/4/14 20:24, Jason Gunthorpe wrote:
>> On Sat, Apr 08, 2023 at 03:42:22PM +0800, Longfang Liu wrote:
>>> +static int hisi_acc_vf_debug_restore(struct seq_file *seq, void *data)
>>> +{
>>> + struct device *vf_dev = seq->private;
>>> + struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>>> + struct vfio_device *vdev = &core_device->vdev;
>>> + struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>>> + struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
>>> + int ret;
>>> +
>>> + ret = hisi_acc_vf_debug_check(seq, vdev);
>>> + if (ret)
>>> + goto restore_err;
>>> +
>>> + ret = vf_qm_state_save(hisi_acc_vdev, migf);
>>> + if (ret) {
>>> + seq_printf(seq, "%s\n", "failed to save device data!");
>>> + goto restore_err;
>>> + }
>>> +
>>> + ret = vf_qm_check_match(hisi_acc_vdev, migf);
>>> + if (ret) {
>>> + seq_printf(seq, "%s\n", "failed to match the VF!");
>>> + goto restore_err;
>>> + }
>>> +
>>> + ret = vf_qm_load_data(hisi_acc_vdev, migf);
>>> + if (ret) {
>>> + seq_printf(seq, "%s\n", "failed to recover the VF!");
>>> + goto restore_err;
>>> + }
>>> +
>>> + vf_qm_fun_reset(&hisi_acc_vdev->vf_qm);
>>> + seq_printf(seq, "%s\n", "successful to resume device data!");
>>> +
>>> +restore_err:
>>> + return 0;
>>> +}
>>
>> This is basically an in-kernel self test, it should be protected with
>> some kind of VFIO selftest kconfig.
>>
> As a debugfs function, its usage will be more flexible for users.
>
>> Though, I wonder why we need it???
> After a live migration error occurs, this debugfs function lets you
> perform separate functional tests on the source and destination
> to locate the cause of the error.
>
>> Can't you write a trivial userspace
>> program under tools/testing to do this sequence with the ioctls?
>>
> Sorry, I still wish this feature was a simple debugfs feature.
> If you want the userspace testing tool you mentioned,
> you can try it on mlx5.
>
> Thanks,
> Longfang.
>>> +static int hisi_acc_vf_debug_save(struct seq_file *seq, void *data)
>>> +{
>>> + struct device *vf_dev = seq->private;
>>> + struct vfio_pci_core_device *core_device = dev_get_drvdata(vf_dev);
>>> + struct vfio_device *vdev = &core_device->vdev;
>>> + struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_get_vf_dev(vdev);
>>> + struct hisi_acc_vf_migration_file *migf = hisi_acc_vdev->debug_migf;
>>> + int ret;
>>> +
>>> + ret = hisi_acc_vf_debug_check(seq, vdev);
>>> + if (ret)
>>> + goto save_err;
>>> +
>>> + ret = vf_qm_state_save(hisi_acc_vdev, migf);
>>> + if (ret) {
>>> + seq_printf(seq, "%s\n", "failed to save device data!");
>>> + goto save_err;
>>> + }
>>> + seq_printf(seq, "%s\n", "successful to save device data!");
>>> +
>>> +save_err:
>>> + return 0;
>>> +}
>>
>> Same kind of comment here, this is a selftest, why does it need a
>> special kernel interface?
>>
>> .. and so on..
>>
>> I thought the non-selftesty bits were OK, maybe split the patch to
>> match progress
>>

Thank you for your suggestion, but the current debugfs method can already
meet the functional requirements of verification testing and
problem location.

Thanks,
Longfang.
>> Jason
>> .
>>

2023-04-21 14:31:48

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v10 5/5] vfio: update live migration device status

On Fri, Apr 21, 2023 at 11:07:53AM +0800, liulongfang wrote:
> On 2023/4/14 20:25, Jason Gunthorpe wrote:
> > On Sat, Apr 08, 2023 at 03:42:24PM +0800, Longfang Liu wrote:
> >> migration debugfs needs to perform debug operations based on the
> >> status of the current device. If the device is not loaded or has
> >> stopped, debugfs does not allow operations.
> >>
> >> so, after the live migration function is executed and the device is
> >> turned off, the device no longer needs to be accessed. At this time,
> >> the status of the device needs to be set to stop.
> >
> > STOP means the device isn't functioning.
> >
> > An idle device that has just been reset is RUNNING by definition.
> >
>
> After the vfio device is opened, it will be set to RUNNING, and after it
> is closed, it should be set to STOP according to the function of the
> device.

No, VFIO does not STOP anything, RUNNING means the PCI function is
operating normally, which is always how VFIO leaves it when the FD is
closed.

Jason

2023-04-21 14:37:36

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On Fri, Apr 21, 2023 at 11:32:47AM +0800, liulongfang wrote:

> Thank you for your suggestion, but the current debugfs method can already
> meet the functional requirements of verification testing and
> problem location.

To be clear, I'm against adding selftest code in this manner. We have
many frameworks for kernel testing, please pick one and integrate
with it.

Jason

2023-05-16 10:09:59

by liulongfang

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On 2023/4/21 22:31, Jason Gunthorpe wrote:
> On Fri, Apr 21, 2023 at 11:32:47AM +0800, liulongfang wrote:
>
>> Thank you for your suggestion, but the current debugfs method can already
>> meet the functional requirements of verification testing and
>> problem location.
>
> To be clear, I'm against adding selftest code in this manner. We have
> many frameworks for kernel testing, please pick one and integrate
> with it.
>

Hi, Jason:
The purpose of this hisi_acc_vf_debug_restore function is to obtain the
migration status data of the migration device. It is a debug operation.
To obtain this status data, the user needs to complete a few steps
of live migration.
Therefore, it is a debug function here, not a self-test function.

Thanks,
Longfang.
> Jason
> .
>

2023-05-16 11:01:27

by liulongfang

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On 2023/5/16 17:40, liulongfang wrote:
> On 2023/4/21 22:31, Jason Gunthorpe wrote:
>> On Fri, Apr 21, 2023 at 11:32:47AM +0800, liulongfang wrote:
>>
>>> Thank you for your suggestion, but the current debugfs method can already
>>> meet the functional requirements of verification testing and
>>> problem location.
>>
>> To be clear, I'm against adding selftest code in this manner. We have
>> many frameworks for kernel testing, please pick one and integrate
>> with it.
>>
>
> Hi, Jason:
> The purpose of this hisi_acc_vf_debug_restore function is to obtain the
> migration status data of the migration device. It is a debug operation.
> To obtain this status data, the user needs to complete a few steps
> of live migration.
> Therefore, it is a debug function here, not a self-test function.
>
Here it should be:
In order to test whether the current state of the device is normal,
a recovery test is performed using the data saved during the previous
live migration. It is paired with the previous save.
If you still insist that it is a self-test, how about I delete this
recovery debugfs?

Thanks,
Longfang.
> Thanks,
> Longfang.
>> Jason
>> .
>>

2023-05-16 12:33:42

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On Tue, May 16, 2023 at 05:40:36PM +0800, liulongfang wrote:
> On 2023/4/21 22:31, Jason Gunthorpe wrote:
> > On Fri, Apr 21, 2023 at 11:32:47AM +0800, liulongfang wrote:
> >
> >> Thank you for your suggestion, but the current debugfs method can already
> >> meet the functional requirements of verification testing and
> >> problem location.
> >
> > To be clear, I'm against adding selftest code in this manner. We have
> > many frameworks for kernel testing, please pick one and integrate
> > with it.
> >
>
> Hi, Jason:
> The purpose of this hisi_acc_vf_debug_restore function is to obtain the
> migration status data of the migration device. It is a debug operation.
> To obtain this status data, the user needs to complete a few steps
> of live migration.
> Therefore, it is a debug function here, not a self-test function.

A debug function should not alter the device state, or do a trial migration.
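
As a rough userspace illustration of that principle (not kernel code, and not taken from this series), a read-only show routine can take a const pointer so that it can only report state. The type and function names below are hypothetical; in the kernel the equivalent would be a seq_file show callback that only calls seq_printf().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a driver's per-device data. */
enum demo_mig_state { DEMO_STOP, DEMO_RUNNING };

struct demo_dev {
	enum demo_mig_state state;
	unsigned int qp_num;
};

/*
 * Read-only "show" routine: the const pointer means it can report the
 * device state but cannot save, load, or reset anything.
 * Returns the snprintf() result (length of the formatted text).
 */
static int demo_debug_show(const struct demo_dev *dev, char *buf, size_t len)
{
	return snprintf(buf, len, "state: %s\nqp_num: %u\n",
			dev->state == DEMO_RUNNING ? "RUNNING" : "STOP",
			dev->qp_num);
}
```

Calling demo_debug_show() any number of times leaves the demo_dev fields unchanged, which is the property a debugfs read would be expected to have.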

Jason

2023-06-02 07:23:30

by liulongfang

[permalink] [raw]
Subject: Re: [PATCH v10 3/5] hisi_acc_vfio_pci: register debugfs for hisilicon migration driver

On 2023/5/16 19:50, Jason Gunthorpe wrote:
> On Tue, May 16, 2023 at 05:40:36PM +0800, liulongfang wrote:
>> On 2023/4/21 22:31, Jason Gunthorpe wrote:
>>> On Fri, Apr 21, 2023 at 11:32:47AM +0800, liulongfang wrote:
>>>
>>>> Thank you for your suggestion, but the current debugfs method can already
>>>> meet the functional requirements of verification testing and
>>>> problem location.
>>>
>>> To be clear, I'm against adding selftest code in this manner. We have
>>> many frameworks for kernel testing, please pick one and integrate
>>> with it.
>>>
>>
>> Hi, Jason:
>> The purpose of this hisi_acc_vf_debug_restore function is to obtain the
>> migration status data of the migration device. It is a debug operation.
>> To obtain this status data, the user needs to complete a few steps
>> of live migration.
>> Therefore, it is a debug function here, not a self-test function.
>
> A debug function should not alter the device state, or do a trial migration.
>

OK. Then I will delete the migration save and restore test operations.
Does this meet your requirements?

Thanks,
Longfang.
> Jason
> .
>