2021-06-25 22:09:49

by Anthony Krowiak

[permalink] [raw]
Subject: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

The fix to resolve a lockdep splat while handling the
VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
the vfio_ap device driver is busy setting or unsetting the KVM pointer.
A wait queue was employed to allow functions requiring access to the KVM
pointer to wait for the kvm_busy flag to be cleared. For the duration of
the wait period, the mdev lock was unlocked then acquired again after the
kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
really resolve the problem.

This patch removes the the kvm_busy flag and wait queue as they are not
necessary to resolve the lockdep splat problem. The wait queue was
introduced to prevent changes to the matrix used to update the guest's
AP configuration. The idea was that whenever an adapter, domain or control
domain was being assigned to or unassigned from the matrix, the function
would wait until the group notifier function was no longer busy with the
KVM pointer.

The thing is, the KVM pointer value (matrix_mdev->kvm) is always set and
cleared while holding the matrix_dev->lock mutex. The assignment and
unassignment interfaces also lock the matrix_dev->lock mutex prior to
checking whether the matrix_mdev->kvm pointer is set and if so, returns
the -EBUSY error from the function. Consequently, there is no chance for
an update to the matrix to occur while the guest's AP configuration is
being updated.

Fixes: 0cc00c8d4050 ("s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks")
Cc: [email protected]
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 77 +++++++--------------------
drivers/s390/crypto/vfio_ap_private.h | 2 -
2 files changed, 20 insertions(+), 59 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 742277bc8d1c..99a58f54c076 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -294,15 +294,6 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
struct ap_matrix_mdev, pqap_hook);

- /*
- * If the KVM pointer is in the process of being set, wait until the
- * process has completed.
- */
- wait_event_cmd(matrix_mdev->wait_for_kvm,
- !matrix_mdev->kvm_busy,
- mutex_unlock(&matrix_dev->lock),
- mutex_lock(&matrix_dev->lock));
-
/* If the there is no guest using the mdev, there is nothing to do */
if (!matrix_mdev->kvm)
goto out_unlock;
@@ -350,7 +341,6 @@ static int vfio_ap_mdev_create(struct mdev_device *mdev)

matrix_mdev->mdev = mdev;
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
- init_waitqueue_head(&matrix_mdev->wait_for_kvm);
mdev_set_drvdata(mdev, matrix_mdev);
matrix_mdev->pqap_hook = handle_pqap;
mutex_lock(&matrix_dev->lock);
@@ -623,7 +613,7 @@ static ssize_t assign_adapter_store(struct device *dev,
* If the KVM pointer is in flux or the guest is running, disallow
* un-assignment of adapter
*/
- if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ if (matrix_mdev->kvm) {
ret = -EBUSY;
goto done;
}
@@ -696,7 +686,7 @@ static ssize_t unassign_adapter_store(struct device *dev,
* If the KVM pointer is in flux or the guest is running, disallow
* un-assignment of adapter
*/
- if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ if (matrix_mdev->kvm) {
ret = -EBUSY;
goto done;
}
@@ -786,7 +776,7 @@ static ssize_t assign_domain_store(struct device *dev,
* If the KVM pointer is in flux or the guest is running, disallow
* assignment of domain
*/
- if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ if (matrix_mdev->kvm) {
ret = -EBUSY;
goto done;
}
@@ -854,7 +844,7 @@ static ssize_t unassign_domain_store(struct device *dev,
* If the KVM pointer is in flux or the guest is running, disallow
* un-assignment of domain
*/
- if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ if (matrix_mdev->kvm) {
ret = -EBUSY;
goto done;
}
@@ -908,7 +898,7 @@ static ssize_t assign_control_domain_store(struct device *dev,
* If the KVM pointer is in flux or the guest is running, disallow
* assignment of control domain.
*/
- if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ if (matrix_mdev->kvm) {
ret = -EBUSY;
goto done;
}
@@ -967,7 +957,7 @@ static ssize_t unassign_control_domain_store(struct device *dev,
* If the KVM pointer is in flux or the guest is running, disallow
* un-assignment of control domain.
*/
- if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ if (matrix_mdev->kvm) {
ret = -EBUSY;
goto done;
}
@@ -1108,14 +1098,17 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
struct ap_matrix_mdev *m;

if (kvm->arch.crypto.crycbd) {
+ mutex_lock(&matrix_dev->lock);
+
list_for_each_entry(m, &matrix_dev->mdev_list, node) {
- if (m != matrix_mdev && m->kvm == kvm)
+ if (m != matrix_mdev && m->kvm == kvm) {
+ mutex_unlock(&matrix_dev->lock);
return -EPERM;
+ }
}

kvm_get_kvm(kvm);
matrix_mdev->kvm = kvm;
- matrix_mdev->kvm_busy = true;
mutex_unlock(&matrix_dev->lock);

down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
@@ -1126,10 +1119,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
matrix_mdev->matrix.apm,
matrix_mdev->matrix.aqm,
matrix_mdev->matrix.adm);
-
- mutex_lock(&matrix_dev->lock);
- matrix_mdev->kvm_busy = false;
- wake_up_all(&matrix_mdev->wait_for_kvm);
}

return 0;
@@ -1181,33 +1170,21 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
*/
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
- /*
- * If the KVM pointer is in the process of being set, wait until the
- * process has completed.
- */
- wait_event_cmd(matrix_mdev->wait_for_kvm,
- !matrix_mdev->kvm_busy,
- mutex_unlock(&matrix_dev->lock),
- mutex_lock(&matrix_dev->lock));
+ mutex_lock(&matrix_dev->lock);

- if (matrix_mdev->kvm) {
- matrix_mdev->kvm_busy = true;
+ if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
mutex_unlock(&matrix_dev->lock);
+ down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
+ matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+ up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);

- if (matrix_mdev->kvm->arch.crypto.crycbd) {
- down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
- matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
- up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
-
- kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
- }
+ kvm_arch_crypto_clear_masks(matrix_mdev->kvm);

mutex_lock(&matrix_dev->lock);
vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
kvm_put_kvm(matrix_mdev->kvm);
matrix_mdev->kvm = NULL;
- matrix_mdev->kvm_busy = false;
- wake_up_all(&matrix_mdev->wait_for_kvm);
+ mutex_unlock(&matrix_dev->lock);
}
}

@@ -1220,7 +1197,6 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
if (action != VFIO_GROUP_NOTIFY_SET_KVM)
return NOTIFY_OK;

- mutex_lock(&matrix_dev->lock);
matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);

if (!data)
@@ -1228,8 +1204,6 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
else if (vfio_ap_mdev_set_kvm(matrix_mdev, data))
notify_rc = NOTIFY_DONE;

- mutex_unlock(&matrix_dev->lock);
-
return notify_rc;
}

@@ -1363,14 +1337,12 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
{
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

- mutex_lock(&matrix_dev->lock);
- vfio_ap_mdev_unset_kvm(matrix_mdev);
- mutex_unlock(&matrix_dev->lock);
-
vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
&matrix_mdev->iommu_notifier);
vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
&matrix_mdev->group_notifier);
+
+ vfio_ap_mdev_unset_kvm(matrix_mdev);
module_put(THIS_MODULE);
}

@@ -1412,15 +1384,6 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
break;
}

- /*
- * If the KVM pointer is in the process of being set, wait until
- * the process has completed.
- */
- wait_event_cmd(matrix_mdev->wait_for_kvm,
- !matrix_mdev->kvm_busy,
- mutex_unlock(&matrix_dev->lock),
- mutex_lock(&matrix_dev->lock));
-
ret = vfio_ap_mdev_reset_queues(mdev);
break;
default:
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index e12218e5a629..22d2e0ca3ae5 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -83,8 +83,6 @@ struct ap_matrix_mdev {
struct ap_matrix matrix;
struct notifier_block group_notifier;
struct notifier_block iommu_notifier;
- bool kvm_busy;
- wait_queue_head_t wait_for_kvm;
struct kvm *kvm;
crypto_hook pqap_hook;
struct mdev_device *mdev;
--
2.30.2


2021-06-28 23:40:10

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Mon, Jun 28, 2021 at 02:20:55PM -0400, Tony Krowiak wrote:
>
>
> On 6/28/21 1:34 PM, Jason Gunthorpe wrote:
> > On Fri, Jun 25, 2021 at 06:07:58PM -0400, Tony Krowiak wrote:
> > > static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> > > {
> > > + mutex_lock(&matrix_dev->lock);
> > > + if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
> > > mutex_unlock(&matrix_dev->lock);
> > > + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> > > + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> > > + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> > > + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> > > mutex_lock(&matrix_dev->lock);
> > > vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> > > kvm_put_kvm(matrix_mdev->kvm);
> > > matrix_mdev->kvm = NULL;
> > > + mutex_unlock(&matrix_dev->lock);
> > > }
> > Doesn't a flow exit the function with matrix_dev->lock held he
>
> How can that happen? What flow?

When the if isn't taken

> > Write it with 'success oriented flow'
>
> I'm not sure what you mean, can you clarify this statement?

Basically, don't write the bulk of the function under an if statement

mutex_lock(&matrix_dev->lock);
if (!matrix_mdev->kvm || !matrix_mdev->kvm->arch.crypto.crycbd) {
mutex_unlock(&matrix_dev->lock);
return;
}

Jason

2021-06-28 23:40:11

by Anthony Krowiak

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification



On 6/28/21 1:34 PM, Jason Gunthorpe wrote:
> On Fri, Jun 25, 2021 at 06:07:58PM -0400, Tony Krowiak wrote:
>> static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>> {
>> + mutex_lock(&matrix_dev->lock);
>>
>> + if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
>> mutex_unlock(&matrix_dev->lock);
>> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>>
>> + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>
>> mutex_lock(&matrix_dev->lock);
>> vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>> kvm_put_kvm(matrix_mdev->kvm);
>> matrix_mdev->kvm = NULL;
>> + mutex_unlock(&matrix_dev->lock);
>> }
> Doesn't a flow exit the function with matrix_dev->lock held he

How can that happen? What flow?

>
> Write it with 'success oriented flow'

I'm not sure what you mean, can you clarify this statement?

>
> I didn't check if everything makes sense, but it sure looks clean.
>
> Jason

2021-06-28 23:40:28

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Fri, Jun 25, 2021 at 06:07:58PM -0400, Tony Krowiak wrote:
> static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> {
> + mutex_lock(&matrix_dev->lock);
>
> + if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
> mutex_unlock(&matrix_dev->lock);
> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>
> + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>
> mutex_lock(&matrix_dev->lock);
> vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> kvm_put_kvm(matrix_mdev->kvm);
> matrix_mdev->kvm = NULL;
> + mutex_unlock(&matrix_dev->lock);
> }

Doesn't a flow exit the function with matrix_dev->lock held here?

Write it with 'success oriented flow'

I didn't check if everything makes sense, but it sure looks clean.

Jason

2021-06-29 00:02:06

by Anthony Krowiak

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification



On 6/28/21 2:22 PM, Jason Gunthorpe wrote:
> On Mon, Jun 28, 2021 at 02:20:55PM -0400, Tony Krowiak wrote:
>>
>> On 6/28/21 1:34 PM, Jason Gunthorpe wrote:
>>> On Fri, Jun 25, 2021 at 06:07:58PM -0400, Tony Krowiak wrote:
>>>> static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>>>> {
>>>> + mutex_lock(&matrix_dev->lock);
>>>> + if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
>>>> mutex_unlock(&matrix_dev->lock);
>>>> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>>>> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>>>> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>>>> + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>> mutex_lock(&matrix_dev->lock);
>>>> vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>>>> kvm_put_kvm(matrix_mdev->kvm);
>>>> matrix_mdev->kvm = NULL;
>>>> + mutex_unlock(&matrix_dev->lock);
>>>> }
>>> Doesn't a flow exit the function with matrix_dev->lock held he

Yes, you are correct. Stupid mistake.

>> How can that happen? What flow?
> When the if isn't taken
>
>>> Write it with 'success oriented flow'
>> I'm not sure what you mean, can you clarify this statement?
> Basically, don't write the bulk of the function under an if statement
>
> mutex_lock(&matrix_dev->lock);
> if (!matrix_mdev->kvm || !matrix_mdev->kvm->arch.crypto.crycbd) {
> mutex_unlock(&matrix_dev->lock);
> return;
> }

Sure.

>
> Jason
>

2021-06-29 01:07:03

by Halil Pasic

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Fri, 25 Jun 2021 18:07:58 -0400
Tony Krowiak <[email protected]> wrote:

What is a suitable base for this patch. I've tried the usual suspects,
but none of them worked.

> The fix to resolve a lockdep splat while handling the
> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
> the vfio_ap device driver is busy setting or unsetting the KVM pointer.
> A wait queue was employed to allow functions requiring access to the KVM
> pointer to wait for the kvm_busy flag to be cleared. For the duration of
> the wait period, the mdev lock was unlocked then acquired again after the
> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
> really resolve the problem.

Can you please elaborate on the last point. You mean that we can have
circular locking even after 0cc00c8d4050, but instead of getting stuck in
on a lock we will get stuck on wait_event_cmd()? If that is it, please
state it clearly in the description, and if you can to it in the short
description.

>
> This patch removes the the kvm_busy flag and wait queue as they are not
> necessary to resolve the lockdep splat problem. The wait queue was
> introduced to prevent changes to the matrix used to update the guest's
> AP configuration. The idea was that whenever an adapter, domain or control
> domain was being assigned to or unassigned from the matrix, the function
> would wait until the group notifier function was no longer busy with the
> KVM pointer.
>
> The thing is, the KVM pointer value (matrix_mdev->kvm) is always set and
> cleared while holding the matrix_dev->lock mutex. The assignment and
> unassignment interfaces also lock the matrix_dev->lock mutex prior to
> checking whether the matrix_mdev->kvm pointer is set and if so, returns
> the -EBUSY error from the function. Consequently, there is no chance for
> an update to the matrix to occur while the guest's AP configuration is
> being updated.
>
> Fixes: 0cc00c8d4050 ("s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks")
> Cc: [email protected]
> Signed-off-by: Tony Krowiak <[email protected]>

2021-06-29 13:24:30

by Jason J. Herne

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On 6/25/21 6:07 PM, Tony Krowiak wrote:
> The fix to resolve a lockdep splat while handling the
> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
> the vfio_ap device driver is busy setting or unsetting the KVM pointer.
> A wait queue was employed to allow functions requiring access to the KVM
> pointer to wait for the kvm_busy flag to be cleared. For the duration of
> the wait period, the mdev lock was unlocked then acquired again after the
> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
> really resolve the problem.
>
> This patch removes the the kvm_busy flag and wait queue as they are not
> necessary to resolve the lockdep splat problem. The wait queue was
> introduced to prevent changes to the matrix used to update the guest's
> AP configuration. The idea was that whenever an adapter, domain or control
> domain was being assigned to or unassigned from the matrix, the function
> would wait until the group notifier function was no longer busy with the
> KVM pointer.
>
> The thing is, the KVM pointer value (matrix_mdev->kvm) is always set and
> cleared while holding the matrix_dev->lock mutex. The assignment and
> unassignment interfaces also lock the matrix_dev->lock mutex prior to
> checking whether the matrix_mdev->kvm pointer is set and if so, returns
> the -EBUSY error from the function. Consequently, there is no chance for
> an update to the matrix to occur while the guest's AP configuration is
> being updated.
>
> Fixes: 0cc00c8d4050 ("s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks")
> Cc: [email protected]
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 77 +++++++--------------------
> drivers/s390/crypto/vfio_ap_private.h | 2 -
> 2 files changed, 20 insertions(+), 59 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 742277bc8d1c..99a58f54c076 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -294,15 +294,6 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
> matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
> struct ap_matrix_mdev, pqap_hook);
>
> - /*
> - * If the KVM pointer is in the process of being set, wait until the
> - * process has completed.
> - */
> - wait_event_cmd(matrix_mdev->wait_for_kvm,
> - !matrix_mdev->kvm_busy,
> - mutex_unlock(&matrix_dev->lock),
> - mutex_lock(&matrix_dev->lock));
> -
> /* If the there is no guest using the mdev, there is nothing to do */
> if (!matrix_mdev->kvm)
> goto out_unlock;
> @@ -350,7 +341,6 @@ static int vfio_ap_mdev_create(struct mdev_device *mdev)
>
> matrix_mdev->mdev = mdev;
> vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
> - init_waitqueue_head(&matrix_mdev->wait_for_kvm);
> mdev_set_drvdata(mdev, matrix_mdev);
> matrix_mdev->pqap_hook = handle_pqap;
> mutex_lock(&matrix_dev->lock);
> @@ -623,7 +613,7 @@ static ssize_t assign_adapter_store(struct device *dev,
> * If the KVM pointer is in flux or the guest is running, disallow
> * un-assignment of adapter
> */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> + if (matrix_mdev->kvm) {
> ret = -EBUSY;
> goto done;
> }
> @@ -696,7 +686,7 @@ static ssize_t unassign_adapter_store(struct device *dev,
> * If the KVM pointer is in flux or the guest is running, disallow
> * un-assignment of adapter
> */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> + if (matrix_mdev->kvm) {
> ret = -EBUSY;
> goto done;
> }
> @@ -786,7 +776,7 @@ static ssize_t assign_domain_store(struct device *dev,
> * If the KVM pointer is in flux or the guest is running, disallow
> * assignment of domain
> */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> + if (matrix_mdev->kvm) {
> ret = -EBUSY;
> goto done;
> }
> @@ -854,7 +844,7 @@ static ssize_t unassign_domain_store(struct device *dev,
> * If the KVM pointer is in flux or the guest is running, disallow
> * un-assignment of domain
> */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> + if (matrix_mdev->kvm) {
> ret = -EBUSY;
> goto done;
> }
> @@ -908,7 +898,7 @@ static ssize_t assign_control_domain_store(struct device *dev,
> * If the KVM pointer is in flux or the guest is running, disallow
> * assignment of control domain.
> */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> + if (matrix_mdev->kvm) {
> ret = -EBUSY;
> goto done;
> }
> @@ -967,7 +957,7 @@ static ssize_t unassign_control_domain_store(struct device *dev,
> * If the KVM pointer is in flux or the guest is running, disallow
> * un-assignment of control domain.
> */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> + if (matrix_mdev->kvm) {
> ret = -EBUSY;
> goto done;
> }
> @@ -1108,14 +1098,17 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
> struct ap_matrix_mdev *m;
>
> if (kvm->arch.crypto.crycbd) {
> + mutex_lock(&matrix_dev->lock);
> +
> list_for_each_entry(m, &matrix_dev->mdev_list, node) {
> - if (m != matrix_mdev && m->kvm == kvm)
> + if (m != matrix_mdev && m->kvm == kvm) {
> + mutex_unlock(&matrix_dev->lock);
> return -EPERM;
> + }
> }
>
> kvm_get_kvm(kvm);
> matrix_mdev->kvm = kvm;
> - matrix_mdev->kvm_busy = true;
> mutex_unlock(&matrix_dev->lock);
>
> down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> @@ -1126,10 +1119,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
> matrix_mdev->matrix.apm,
> matrix_mdev->matrix.aqm,
> matrix_mdev->matrix.adm);
> -
> - mutex_lock(&matrix_dev->lock);
> - matrix_mdev->kvm_busy = false;
> - wake_up_all(&matrix_mdev->wait_for_kvm);
> }
>
> return 0;
> @@ -1181,33 +1170,21 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
> */
> static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> {
> - /*
> - * If the KVM pointer is in the process of being set, wait until the
> - * process has completed.
> - */
> - wait_event_cmd(matrix_mdev->wait_for_kvm,
> - !matrix_mdev->kvm_busy,
> - mutex_unlock(&matrix_dev->lock),
> - mutex_lock(&matrix_dev->lock));
> + mutex_lock(&matrix_dev->lock);
>
> - if (matrix_mdev->kvm) {
> - matrix_mdev->kvm_busy = true;
> + if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
> mutex_unlock(&matrix_dev->lock);


If this function cannot be entered concurrently on separate threads then I think we can
remove this mutex_unlock of matrix_dev->lock, (and the above mutex_lock) All that happens
while holding the lock is the examination of the matrix_mdev->kvm pointer and then the
subsequent examination of matrix_mdev->kvm->arch.crypto.crycbd. And since this function
is the only place that the kvm pointer is NULLed I don't see how the kvm pointer could go
away between the two parts of the conditional. Again, this is only true if this function
cannot be entered concurrently.


--
-- Jason J. Herne ([email protected])

2021-06-30 11:56:37

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Tue, Jun 29, 2021 at 09:21:31AM -0400, Jason J. Herne wrote:

> > + mutex_lock(&matrix_dev->lock);
> > - if (matrix_mdev->kvm) {
> > - matrix_mdev->kvm_busy = true;
> > + if ((matrix_mdev->kvm) && (matrix_mdev->kvm->arch.crypto.crycbd)) {
> > mutex_unlock(&matrix_dev->lock);
>
>
> If this function cannot be entered concurrently on separate threads then I
> think we can remove this mutex_unlock of matrix_dev->lock,

Don't remove locking around data. If the data is written under a mutex
it should be read under a mutex too.

Jason

2021-06-30 14:34:51

by Anthony Krowiak

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification



On 6/28/21 4:29 PM, Halil Pasic wrote:
> On Fri, 25 Jun 2021 18:07:58 -0400
> Tony Krowiak <[email protected]> wrote:
>
> What is a suitable base for this patch. I've tried the usual suspects,
> but none of them worked.

I discovered what the problem is here. The patch is based on our
master branch along with the two pre-requisite patches that were
recently reviewed and are currently being merged. The two patches
of which I speak are:
* [PATCH v6 1/2] s390/vfio-ap: clean up mdev resources when remove
callback invoked
   Message ID: <[email protected]>

* [PATCH v6 2/2] s390/vfio-ap: r/w lock for PQAP interception handler
function pointer
   <[email protected]>

I probably should have included those along with this one.

>
>> The fix to resolve a lockdep splat while handling the
>> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
>> the vfio_ap device driver is busy setting or unsetting the KVM pointer.
>> A wait queue was employed to allow functions requiring access to the KVM
>> pointer to wait for the kvm_busy flag to be cleared. For the duration of
>> the wait period, the mdev lock was unlocked then acquired again after the
>> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
>> really resolve the problem.
> Can you please elaborate on the last point. You mean that we can have
> circular locking even after 0cc00c8d4050, but instead of getting stuck in
> on a lock we will get stuck on wait_event_cmd()? If that is it, please
> state it clearly in the description, and if you can to it in the short
> description.

This patch was in response to the following review comments made by Jason
Gunthorpe:

* Message ID: <[email protected]>
   "... the kvm_busy should be replaced by a proper rwsem,
    don't try to open code locks like that - it just defeats lockdep
    analysis".

* Message ID: <[email protected]>
   "Usually when people start open coding locks it is often
   because lockdep complained. Open coding a lock makes
   lockdep stop because the lockdep code
   is removed, but it doesn't fix anything. The kvm_busy
   should be replaced by a proper rwsem, don't try to
   open code locks like that - it just defeats lockdep
   analysis."

I will paraphrase and include the information from Jason's
comments in the description.

>> This patch removes the the kvm_busy flag and wait queue as they are not
>> necessary to resolve the lockdep splat problem. The wait queue was
>> introduced to prevent changes to the matrix used to update the guest's
>> AP configuration. The idea was that whenever an adapter, domain or control
>> domain was being assigned to or unassigned from the matrix, the function
>> would wait until the group notifier function was no longer busy with the
>> KVM pointer.
>>
>> The thing is, the KVM pointer value (matrix_mdev->kvm) is always set and
>> cleared while holding the matrix_dev->lock mutex. The assignment and
>> unassignment interfaces also lock the matrix_dev->lock mutex prior to
>> checking whether the matrix_mdev->kvm pointer is set and if so, returns
>> the -EBUSY error from the function. Consequently, there is no chance for
>> an update to the matrix to occur while the guest's AP configuration is
>> being updated.
>>
>> Fixes: 0cc00c8d4050 ("s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks")
>> Cc: [email protected]
>> Signed-off-by: Tony Krowiak <[email protected]>

2021-06-30 22:44:19

by Halil Pasic

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Wed, 30 Jun 2021 10:31:22 -0400
Tony Krowiak <[email protected]> wrote:

> On 6/28/21 4:29 PM, Halil Pasic wrote:
> > On Fri, 25 Jun 2021 18:07:58 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> > What is a suitable base for this patch. I've tried the usual suspects,
> > but none of them worked.
>
> I discovered what the problem is here. The patch is based on our
> master branch along with the two pre-requisite patches that were
> recently reviewed and are currently being merged. The two patches
> of which I speak are:
> * [PATCH v6 1/2] s390/vfio-ap: clean up mdev resources when remove
> callback invoked
>    Message ID: <[email protected]>
>
> * [PATCH v6 2/2] s390/vfio-ap: r/w lock for PQAP interception handler
> function pointer
>    <[email protected]>
>
> I probably should have included those along with this one.

Either that, or state in the cover letter that those are prerequisites.

>
> >
> >> The fix to resolve a lockdep splat while handling the
> >> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
> >> the vfio_ap device driver is busy setting or unsetting the KVM pointer.
> >> A wait queue was employed to allow functions requiring access to the KVM
> >> pointer to wait for the kvm_busy flag to be cleared. For the duration of
> >> the wait period, the mdev lock was unlocked then acquired again after the
> >> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
> >> really resolve the problem.
> > Can you please elaborate on the last point. You mean that we can have
> > circular locking even after 0cc00c8d4050, but instead of getting stuck in
> > on a lock we will get stuck on wait_event_cmd()? If that is it, please
> > state it clearly in the description, and if you can to it in the short
> > description.
>
> This patch was in response to the following review comments made by Jason
> Gunthorpe:
>
> * Message ID: <[email protected]>
>    "... the kvm_busy should be replaced by a proper rwsem,
>     don't try to open code locks like that - it just defeats lockdep
>     analysis".
>
> * Message ID: <[email protected]>
>    "Usually when people start open coding locks it is often
>    because lockdep complained. Open coding a lock makes
>    lockdep stop because the lockdep code
>    is removed, but it doesn't fix anything. The kvm_busy
>    should be replaced by a proper rwsem, don't try to
>    open code locks like that - it just defeats lockdep
>    analysis."
>
> I will paraphrase and include the information from Jason's
> comments in the description.
>

This does not answer my questions.

I'm in favor of Jason's proposal, because it is much easier to
comprehend simple rwsem protected than a mutex + wait_queue dance.

I think Jason was talking about open coding locks in general. I don't
consider it as proof of commit 0cc00c8d4050 not doing what it
advertised. You can add a Suggested-by tag if you like, but you should
be able to tell us what is the merit of your patch.

Regards,
Halil

2021-07-01 14:31:33

by Anthony Krowiak

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification



On 6/30/21 6:39 PM, Halil Pasic wrote:
> On Wed, 30 Jun 2021 10:31:22 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 6/28/21 4:29 PM, Halil Pasic wrote:
>>> On Fri, 25 Jun 2021 18:07:58 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>> What is a suitable base for this patch. I've tried the usual suspects,
>>> but none of them worked.
>> I discovered what the problem is here. The patch is based on our
>> master branch along with the two pre-requisite patches that were
>> recently reviewed and are currently being merged. The two patches
>> of which I speak are:
>> * [PATCH v6 1/2] s390/vfio-ap: clean up mdev resources when remove
>> callback invoked
>>    Message ID: <[email protected]>
>>
>> * [PATCH v6 2/2] s390/vfio-ap: r/w lock for PQAP interception handler
>> function pointer
>>    <[email protected]>
>>
>> I probably should have included those along with this one.
> Either that, or state in the cover letter that those are prerequisites.
>
>>>
>>>> The fix to resolve a lockdep splat while handling the
>>>> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
>>>> the vfio_ap device driver is busy setting or unsetting the KVM pointer.
>>>> A wait queue was employed to allow functions requiring access to the KVM
>>>> pointer to wait for the kvm_busy flag to be cleared. For the duration of
>>>> the wait period, the mdev lock was unlocked then acquired again after the
>>>> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
>>>> really resolve the problem.
>>> Can you please elaborate on the last point. You mean that we can have
>>> circular locking even after 0cc00c8d4050, but instead of getting stuck in
>>> on a lock we will get stuck on wait_event_cmd()? If that is it, please
>>> state it clearly in the description, and if you can to it in the short
>>> description.

I created this patch in response to Jason G's review comments I copied
below. He did not mention anything about getting stuck in a
wait_event_cmd(),
so you may want to ask him about that. To answer your
question, I don't see how we can get stuck in a wait_event_cmd() unless
one of the functions that set the matrix_mdev->kvm_busy flag does not
complete for some reason.

You asked for me to elaborate on the last point in the preceding paragraph;
I did so in my response to your comments/question above.

>>>
>> This patch was in response to the following review comments made by Jason
>> Gunthorpe:
>>
>> * Message ID: <[email protected]>
>>    "... the kvm_busy should be replaced by a proper rwsem,
>>     don't try to open code locks like that - it just defeats lockdep
>>     analysis".
>>
>> * Message ID: <[email protected]>
>>    "Usually when people start open coding locks it is often
>>    because lockdep complained. Open coding a lock makes
>>    lockdep stop because the lockdep code
>>    is removed, but it doesn't fix anything. The kvm_busy
>>    should be replaced by a proper rwsem, don't try to
>>    open code locks like that - it just defeats lockdep
>>    analysis."
>>
>> I will paraphrase and include the information from Jason's
>> comments in the description.
>>
> This does not answer my questions.

See above.

>
> I'm in favor of Jason's proposal, because it is much easier to
> comprehend simple rwsem protected than a mutex + wait_queue dance.

That is a matter of opinion. I have no trouble understanding
the "mutex + wait_queue dance", but then again, I wrote the
code, so maybe that is why.

>
>
> I think Jason was talking about open coding locks in general.

That may be so, but his comments were in support of his
statement that the  mutex + wait_queue did not resolve
the issue reported vai the lockdep splat because it turned
off lockdep.

> I don't
> consider it as proof of commit 0cc00c8d4050 not doing what it
> advertised.

I think I agree with this statement. Maybe I misunderstood what
Jason meant by "open coding locks like that". Since that comment
directly related to replacing the kvm->busy, I assumed that he
was referring to the "mutex + wait_queue dance" as you called it.
I probably should have probed deeper to discern exactly what
Jason meant by "open coding locks". I took Jason at his word
that "the kvm->busy ... just defeats lockdep analysis" because
I don't have deep knowledge about lockdep.

Even if the kvm->busy does defeat lockdep analysis, I still believe
it fixes the problem for which commit 0cc00c8d4050 was created.
If we don't hold the matrix_dev->lock while the kvm->lock is held
during update of the guest's matrix, lockdep does not report a
problem. That is proven by the tests of this patch.

If commit 0cc00c8d4050 does in fact resolve the issue for which
it was created, then there really is no need for this patch. It
certainly would reduce the amount of change that will be required
to integrate this patch with the "s390/vfio-ap: dynamic configuration
support" patch series currently under review.

> You can add a Suggested-by tag if you like, but you should
> be able to tell us what is the merit of your patch.

The patch removes the matrix_mdev->kvm_busy flag and the
wait_queue. The reason for removing them was because, as
Jason stated, "the kvm->busy ... just defeats lockdep analysis".

>
> Regards,
> Halil
>

2021-07-05 14:15:36

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Thu, Jul 01, 2021 at 10:28:52AM -0400, Tony Krowiak wrote:

> > I think Jason was talking about open coding locks in general.
>
> That may be so, but his comments were in support of his
> statement that the  mutex + wait_queue did not resolve
> the issue reported vai the lockdep splat because it turned
> off lockdep.

Rgiht, if this used to be proper locks and lockdep complained then
whatever potential deadlock it found is not magically removed by going
to a wait_queue. It just removes the lockdep annotations that would
identify the issue early.

This is why people should not open code locks, it completely defeats
lockdep. That alone is merit enough for this patch.

Jason

2021-07-06 14:42:52

by Anthony Krowiak

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification



On 7/5/21 10:13 AM, Jason Gunthorpe wrote:
> On Thu, Jul 01, 2021 at 10:28:52AM -0400, Tony Krowiak wrote:
>
>>> I think Jason was talking about open coding locks in general.
>> That may be so, but his comments were in support of his
>> statement that the  mutex + wait_queue did not resolve
>> the issue reported vai the lockdep splat because it turned
>> off lockdep.
> Rgiht, if this used to be proper locks and lockdep complained then
> whatever potential deadlock it found is not magically removed by going
> to a wait_queue. It just removes the lockdep annotations that would
> identify the issue early.
>
> This is why people should not open code locks, it completely defeats
> lockdep. That alone is merit enough for this patch.

When you use the phrase "open code locks", to what are you
specifically referring? I am confused by the use of the phrase
"open code" in this context because open coding, at least as
I understand it, has to do with data analysis.

>
> Jason

2021-07-06 15:13:40

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification

On Tue, Jul 06, 2021 at 09:39:29AM -0400, Tony Krowiak wrote:
>
>
> On 7/5/21 10:13 AM, Jason Gunthorpe wrote:
> > On Thu, Jul 01, 2021 at 10:28:52AM -0400, Tony Krowiak wrote:
> >
> > > > I think Jason was talking about open coding locks in general.
> > > That may be so, but his comments were in support of his
> > > statement that the  mutex + wait_queue did not resolve
> > > the issue reported vai the lockdep splat because it turned
> > > off lockdep.
> > Rgiht, if this used to be proper locks and lockdep complained then
> > whatever potential deadlock it found is not magically removed by going
> > to a wait_queue. It just removes the lockdep annotations that would
> > identify the issue early.
> >
> > This is why people should not open code locks, it completely defeats
> > lockdep. That alone is merit enough for this patch.
>
> When you use the phrase "open code locks", to what are you
> specifically referring? I am confused by the use of the phrase
> "open code" in this context because open coding, at least as
> I understand it, has to do with data analysis.

"open code" here means you write the algorithm of a standard lock in
your own functions instead of calling the standard library.

Testing/setting the busy and sleeping on a wait_event is exactly a
standard lock.

Ie if I write

for (len = 0; str[len] != 0; len++)
;

Then I have open coded strlen()

Jason

2021-07-06 22:46:36

by Anthony Krowiak

[permalink] [raw]
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during VFIO_GROUP_NOTIFY_SET_KVM notification



On 7/6/21 9:49 AM, Jason Gunthorpe wrote:
> On Tue, Jul 06, 2021 at 09:39:29AM -0400, Tony Krowiak wrote:
>>
>> On 7/5/21 10:13 AM, Jason Gunthorpe wrote:
>>> On Thu, Jul 01, 2021 at 10:28:52AM -0400, Tony Krowiak wrote:
>>>
>>>>> I think Jason was talking about open coding locks in general.
>>>> That may be so, but his comments were in support of his
>>>> statement that the  mutex + wait_queue did not resolve
>>>> the issue reported vai the lockdep splat because it turned
>>>> off lockdep.
>>> Rgiht, if this used to be proper locks and lockdep complained then
>>> whatever potential deadlock it found is not magically removed by going
>>> to a wait_queue. It just removes the lockdep annotations that would
>>> identify the issue early.
>>>
>>> This is why people should not open code locks, it completely defeats
>>> lockdep. That alone is merit enough for this patch.
>> When you use the phrase "open code locks", to what are you
>> specifically referring? I am confused by the use of the phrase
>> "open code" in this context because open coding, at least as
>> I understand it, has to do with data analysis.
> "open code" here means you write the algorithm of a standard lock in
> your own functions instead of calling the standard library.
>
> Testing/setting the busy and sleeping on a wait_event is exactly a
> standard lock.
>
> Ie if I write
>
> for (len = 0; str[len] != 0; len++)
> ;
>
> Then I have open coded strlen()
>
> Jason

Thanks for the explanation.