2022-09-15 01:50:01

by liulongfang

Subject: [PATCH 0/5] Fix some bugs and clean code issues

There are some software bugs in the accelerator live migration
driver that need to be fixed, and there are still some clean
code issues that need to be resolved.

Longfang Liu (5):
  hisi_acc_vfio_pci: Fixes a memory leak bug
  hisi_acc_vfio_pci: Fixes error return code issue
  hisi_acc_vfio_pci: Remove useless function parameter
  hisi_acc_vfio_pci: Fix device data address combination problem
  hisi_acc_vfio_pci: Fix some clean code issues

 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 66 ++++++++++---------
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h    |  1 -
 2 files changed, 34 insertions(+), 33 deletions(-)

--
2.33.0


2022-09-15 01:50:21

by liulongfang

Subject: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug

During the stop copy phase of live migration, the driver allocates
a memory for the migrated data to save the data.

When an exception occurs when the driver reads device data, the driver
will report an error to qemu and exit the current migration state.
But this memory is not released, which will lead to a memory
leak problem.

So we need to add a memory release operation.

Reviewed-by: Shameer Kolothum <[email protected]>
Signed-off-by: Longfang Liu <[email protected]>
---
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index ea762e28c1cc..8fd68af2ed5f 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -828,15 +828,15 @@ hisi_acc_vf_stop_copy(struct hisi_acc_vf_core_device *hisi_acc_vdev)
 		return ERR_PTR(err);
 	}

-	stream_open(migf->filp->f_inode, migf->filp);
-	mutex_init(&migf->lock);
-
 	ret = vf_qm_state_save(hisi_acc_vdev, migf);
 	if (ret) {
-		fput(migf->filp);
+		kfree(migf);
 		return ERR_PTR(ret);
 	}

+	stream_open(migf->filp->f_inode, migf->filp);
+	mutex_init(&migf->lock);
+
 	return migf;
 }

--
2.33.0

2022-09-15 01:50:28

by liulongfang

Subject: [PATCH 4/5] hisi_acc_vfio_pci: Fix device data address combination problem

The queue address of the accelerator device must be assembled into
a 64-bit DMA address by combining its low and high 32-bit words.
The previous combination used the wrong words and needs to be corrected.

Signed-off-by: Longfang Liu <[email protected]>
---
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index c172a52088b7..fce49c7f5db8 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -527,12 +527,12 @@ static int vf_qm_state_save(struct hisi_acc_vf_core_device *hisi_acc_vdev,
 		return -EINVAL;

 	/* Every reg is 32 bit, the dma address is 64 bit. */
-	vf_data->eqe_dma = vf_data->qm_eqc_dw[2];
+	vf_data->eqe_dma = vf_data->qm_eqc_dw[1];
 	vf_data->eqe_dma <<= QM_XQC_ADDR_OFFSET;
-	vf_data->eqe_dma |= vf_data->qm_eqc_dw[1];
-	vf_data->aeqe_dma = vf_data->qm_aeqc_dw[2];
+	vf_data->eqe_dma |= vf_data->qm_eqc_dw[0];
+	vf_data->aeqe_dma = vf_data->qm_aeqc_dw[1];
 	vf_data->aeqe_dma <<= QM_XQC_ADDR_OFFSET;
-	vf_data->aeqe_dma |= vf_data->qm_aeqc_dw[1];
+	vf_data->aeqe_dma |= vf_data->qm_aeqc_dw[0];

 	/* Through SQC_BT/CQC_BT to get sqc and cqc address */
 	ret = qm_get_sqc(vf_qm, &vf_data->sqc_dma);
--
2.33.0
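
Editorial sketch: the address combination described in patch 4/5 can be
tried in isolation in user space. This is only an illustration, not the
driver code. combine_dma_addr(), check_combine_dma_addr() and
XQC_ADDR_SHIFT are hypothetical names, the shift of 32 is an assumption
based on the "every reg is 32 bit" comment in the hunk, and the register
values are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the driver's QM_XQC_ADDR_OFFSET (32-bit words). */
#define XQC_ADDR_SHIFT 32U

/*
 * Build a 64-bit DMA address from a context-word array: dw[1] supplies
 * the high half and dw[0] the low half, matching the "+" lines above.
 */
static uint64_t combine_dma_addr(const uint32_t *dw)
{
	uint64_t addr = dw[1];	/* widen to 64 bits before shifting */

	addr <<= XQC_ADDR_SHIFT;
	addr |= dw[0];
	return addr;
}

/* Self-check with made-up register contents. */
static void check_combine_dma_addr(void)
{
	const uint32_t dw[2] = { 0x89abcdefU, 0x01234567U };

	assert(combine_dma_addr(dw) == 0x0123456789abcdefULL);
}
```

Assigning the high word to the 64-bit variable first matters: shifting a
32-bit value left by 32 would be undefined behaviour in C.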

2022-09-15 02:10:03

by liulongfang

Subject: [PATCH 3/5] hisi_acc_vfio_pci: Remove useless function parameter

Remove unused function parameters for vf_qm_fun_reset() and
ensure the device is enabled before the reset operation
is performed.

Signed-off-by: Longfang Liu <[email protected]>
---
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 3790b76a578e..c172a52088b7 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -345,8 +345,7 @@ static struct hisi_acc_vf_core_device *hssi_acc_drvdata(struct pci_dev *pdev)
 			    core_device);
 }

-static void vf_qm_fun_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev,
-			    struct hisi_qm *qm)
+static void vf_qm_fun_reset(struct hisi_qm *qm)
 {
 	int i;

@@ -662,7 +661,10 @@ static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vd
 	if (hisi_acc_vdev->vf_qm_state != QM_READY)
 		return;

-	vf_qm_fun_reset(hisi_acc_vdev, vf_qm);
+	/* Make sure the device is enabled */
+	qm_dev_cmd_init(vf_qm);
+
+	vf_qm_fun_reset(vf_qm);
 }

 static int hisi_acc_vf_load_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
--
2.33.0

2022-09-20 16:50:50

by Alex Williamson

Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug

On Thu, 15 Sep 2022 09:31:53 +0800
Longfang Liu <[email protected]> wrote:

> During the stop copy phase of live migration, the driver allocates
> a memory for the migrated data to save the data.
>
> When an exception occurs when the driver reads device data, the driver
> will report an error to qemu and exit the current migration state.
> But this memory is not released, which will lead to a memory
> leak problem.
>
> So we need to add a memory release operation.
>
> Reviewed-by: Shameer Kolothum <[email protected]>
> Signed-off-by: Longfang Liu <[email protected]>
> ---
> drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> index ea762e28c1cc..8fd68af2ed5f 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> @@ -828,15 +828,15 @@ hisi_acc_vf_stop_copy(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> return ERR_PTR(err);
> }
>
> - stream_open(migf->filp->f_inode, migf->filp);
> - mutex_init(&migf->lock);
> -
> ret = vf_qm_state_save(hisi_acc_vdev, migf);
> if (ret) {
> - fput(migf->filp);

Sorry, why did this fput() get removed? Thanks,

Alex

> + kfree(migf);
> return ERR_PTR(ret);
> }
>
> + stream_open(migf->filp->f_inode, migf->filp);
> + mutex_init(&migf->lock);
> +
> return migf;
> }
>

2022-09-20 17:10:58

by Jason Gunthorpe

Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug

On Tue, Sep 20, 2022 at 10:34:43AM -0600, Alex Williamson wrote:
> On Thu, 15 Sep 2022 09:31:53 +0800
> Longfang Liu <[email protected]> wrote:
>
> > During the stop copy phase of live migration, the driver allocates
> > a memory for the migrated data to save the data.
> >
> > When an exception occurs when the driver reads device data, the driver
> > will report an error to qemu and exit the current migration state.
> > But this memory is not released, which will lead to a memory
> > leak problem.

Why isn't it released? The fput() releases it:

static int hisi_acc_vf_release_file(struct inode *inode, struct file *filp)
{
	struct hisi_acc_vf_migration_file *migf = filp->private_data;

	hisi_acc_vf_disable_fd(migf);
	mutex_destroy(&migf->lock);
	kfree(migf);
	^^^^^^^^^^

This patch looks wrong to me.

Jason

by Shameerali Kolothum Thodi

Subject: RE: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug



> -----Original Message-----
> From: Jason Gunthorpe [mailto:[email protected]]
> Sent: 20 September 2022 17:38
> To: Alex Williamson <[email protected]>
> Cc: liulongfang <[email protected]>; Shameerali Kolothum Thodi
> <[email protected]>; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug
>
> On Tue, Sep 20, 2022 at 10:34:43AM -0600, Alex Williamson wrote:
> > On Thu, 15 Sep 2022 09:31:53 +0800
> > Longfang Liu <[email protected]> wrote:
> >
> > > During the stop copy phase of live migration, the driver allocates a
> > > memory for the migrated data to save the data.
> > >
> > > When an exception occurs when the driver reads device data, the
> > > driver will report an error to qemu and exit the current migration state.
> > > But this memory is not released, which will lead to a memory leak
> > > problem.
>
> Why isn't it released? The fput() releases it:
>
> static int hisi_acc_vf_release_file(struct inode *inode, struct file *filp) {
> struct hisi_acc_vf_migration_file *migf = filp->private_data;
>
> hisi_acc_vf_disable_fd(migf);
> mutex_destroy(&migf->lock);
> kfree(migf);
> ^^^^^^^^^^
>
> This patch looks wrong to me.

That's right. Missed that. Sorry for the oversight.

Thanks,
Shameer

2022-09-21 03:25:54

by liulongfang

Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug

On 2022/9/21 1:03, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: Jason Gunthorpe [mailto:[email protected]]
>> Sent: 20 September 2022 17:38
>> To: Alex Williamson <[email protected]>
>> Cc: liulongfang <[email protected]>; Shameerali Kolothum Thodi
>> <[email protected]>; [email protected];
>> [email protected]; [email protected]
>> Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug
>>
>> On Tue, Sep 20, 2022 at 10:34:43AM -0600, Alex Williamson wrote:
>>> On Thu, 15 Sep 2022 09:31:53 +0800
>>> Longfang Liu <[email protected]> wrote:
>>>
>>>> During the stop copy phase of live migration, the driver allocates a
>>>> memory for the migrated data to save the data.
>>>>
>>>> When an exception occurs when the driver reads device data, the
>>>> driver will report an error to qemu and exit the current migration state.
>>>> But this memory is not released, which will lead to a memory leak
>>>> problem.
>>
>> Why isn't it released? The fput() releases it:
>>
>> static int hisi_acc_vf_release_file(struct inode *inode, struct file *filp) {
>> struct hisi_acc_vf_migration_file *migf = filp->private_data;
>>
>> hisi_acc_vf_disable_fd(migf);
>> mutex_destroy(&migf->lock);
>> kfree(migf);
>> ^^^^^^^^^^
>>
>> This patch looks wrong to me.
>
> That's right. Missed that. Sorry for the oversight.
>
Yes, fput() calls the release op of the file, which here is
hisi_acc_vf_release_file(), and that completes the release of migf, so this
patch is unnecessary.

But I think there is another place that needs to be modified: the
hisi_acc_vf_disable_fd() call in hisi_acc_vf_disable_fds() is not needed,
because it is followed by an fput(). Is this correct?

> Thanks,
> Shameer
>
> .
Thanks,
Longfang.
>

by Shameerali Kolothum Thodi

Subject: RE: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug



> -----Original Message-----
> From: liulongfang
> Sent: 21 September 2022 04:13
> To: Shameerali Kolothum Thodi <[email protected]>;
> Jason Gunthorpe <[email protected]>; Alex Williamson
> <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug
>
> On 2022/9/21 1:03, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jason Gunthorpe [mailto:[email protected]]
> >> Sent: 20 September 2022 17:38
> >> To: Alex Williamson <[email protected]>
> >> Cc: liulongfang <[email protected]>; Shameerali Kolothum Thodi
> >> <[email protected]>; [email protected];
> >> [email protected]; [email protected]
> >> Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug
> >>
> >> On Tue, Sep 20, 2022 at 10:34:43AM -0600, Alex Williamson wrote:
> >>> On Thu, 15 Sep 2022 09:31:53 +0800
> >>> Longfang Liu <[email protected]> wrote:
> >>>
> >>>> During the stop copy phase of live migration, the driver allocates a
> >>>> memory for the migrated data to save the data.
> >>>>
> >>>> When an exception occurs when the driver reads device data, the
> >>>> driver will report an error to qemu and exit the current migration state.
> >>>> But this memory is not released, which will lead to a memory leak
> >>>> problem.
> >>
> >> Why isn't it released? The fput() releases it:
> >>
> >> static int hisi_acc_vf_release_file(struct inode *inode, struct file *filp) {
> >> struct hisi_acc_vf_migration_file *migf = filp->private_data;
> >>
> >> hisi_acc_vf_disable_fd(migf);
> >> mutex_destroy(&migf->lock);
> >> kfree(migf);
> >> ^^^^^^^^^^
> >>
> >> This patch looks wrong to me.
> >
> > That's right. Missed that. Sorry for the oversight.
> >
> Yes, fput() calls the release op of the file, which here is
> hisi_acc_vf_release_file(), and that completes the release of migf, so this
> patch is unnecessary.
>
> But I think there is another place that needs to be modified: the
> hisi_acc_vf_disable_fd() call in hisi_acc_vf_disable_fds() is not needed,
> because it is followed by an fput(). Is this correct?

I don't think that is correct either. fput() decrements the reference count
and calls release() only when the count reaches zero. We take an explicit
get_file() reference for hisi_acc_vf_disable_fds(), don't we?

Thanks,
Shameer



> > Thanks,
> > Shameer
> >
> > .
> Thanks,
> Longfang.
> >

2022-09-22 08:18:34

by liulongfang

Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug

On 2022/9/21 15:27, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: liulongfang
>> Sent: 21 September 2022 04:13
>> To: Shameerali Kolothum Thodi <[email protected]>;
>> Jason Gunthorpe <[email protected]>; Alex Williamson
>> <[email protected]>
>> Cc: [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug
>>
>> On 2022/9/21 1:03, Shameerali Kolothum Thodi wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Jason Gunthorpe [mailto:[email protected]]
>>>> Sent: 20 September 2022 17:38
>>>> To: Alex Williamson <[email protected]>
>>>> Cc: liulongfang <[email protected]>; Shameerali Kolothum Thodi
>>>> <[email protected]>; [email protected];
>>>> [email protected]; [email protected]
>>>> Subject: Re: [PATCH 1/5] hisi_acc_vfio_pci: Fixes a memory leak bug
>>>>
>>>> On Tue, Sep 20, 2022 at 10:34:43AM -0600, Alex Williamson wrote:
>>>>> On Thu, 15 Sep 2022 09:31:53 +0800
>>>>> Longfang Liu <[email protected]> wrote:
>>>>>
>>>>>> During the stop copy phase of live migration, the driver allocates a
>>>>>> memory for the migrated data to save the data.
>>>>>>
>>>>>> When an exception occurs when the driver reads device data, the
>>>>>> driver will report an error to qemu and exit the current migration state.
>>>>>> But this memory is not released, which will lead to a memory leak
>>>>>> problem.
>>>>
>>>> Why isn't it released? The fput() releases it:
>>>>
>>>> static int hisi_acc_vf_release_file(struct inode *inode, struct file *filp) {
>>>> struct hisi_acc_vf_migration_file *migf = filp->private_data;
>>>>
>>>> hisi_acc_vf_disable_fd(migf);
>>>> mutex_destroy(&migf->lock);
>>>> kfree(migf);
>>>> ^^^^^^^^^^
>>>>
>>>> This patch looks wrong to me.
>>>
>>> That's right. Missed that. Sorry for the oversight.
>>>
>> Yes, fput() calls the release op of the file, which here is
>> hisi_acc_vf_release_file(), and that completes the release of migf, so this
>> patch is unnecessary.
>>
>> But I think there is another place that needs to be modified: the
>> hisi_acc_vf_disable_fd() call in hisi_acc_vf_disable_fds() is not needed,
>> because it is followed by an fput(). Is this correct?
>
> I don't think that is correct either. fput() decrements the reference count
> and calls release() only when the count reaches zero. We take an explicit
> get_file() reference for hisi_acc_vf_disable_fds(), don't we?
>
> Thanks,
> Shameer
>
>

OK! Neither of these needs to be modified, so there is no need to add them to
the patchset. I will update the patchset and send it out as the next
version.

>
>>> Thanks,
>>> Shameer
>>>
>>> .
>> Thanks,
>> Longfang.
>>>
> .
Thanks,
Longfang.
>