2021-03-10 15:08:08

by Anthony Krowiak

Subject: [PATCH v4 0/1] s390/vfio-ap: fix circular lockdep when starting

Commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated") introduced a change that results in a circular
locking dependency, reported by lockdep, when a Secure Execution guest
that is configured with crypto devices is started. The problem arises
because that commit moved the setting of the guest's AP masks under the
protection of the matrix_dev->lock when the vfio_ap driver is notified
that the KVM pointer has been set. Since it is not critical that the
guest's AP masks be set or cleared under the matrix_dev->lock when the
driver is notified, the masks will no longer be updated under that lock.
The lock is still necessary for setting/unsetting the KVM pointer,
however, so that will remain in place.

The dependency chain for the circular lockdep resolved by this patch
is (in reverse order):

2:  vfio_ap_mdev_group_notifier:  kvm->lock
                                  matrix_dev->lock

1:  handle_pqap:                  matrix_dev->lock
    kvm_vcpu_ioctl:               vcpu->mutex

0:  kvm_s390_cpus_to_pv:          vcpu->mutex
    kvm_vm_ioctl:                 kvm->lock
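
The chain above can be modelled as a small directed graph. The sketch below is only an illustration, not part of the patch: it assumes each entry reads as "the first lock is acquired while the second is held", and every name in it (lock_graph, add_dependency, has_cycle, chain_before_patch, chain_after_patch) is hypothetical. It is plain userspace C, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers for the three locks named in the chain. */
enum lock_id { KVM_LOCK, VCPU_MUTEX, MATRIX_DEV_LOCK, NR_LOCKS };

/* edge[a][b] is true when some code path acquires b while holding a. */
struct lock_graph {
	bool edge[NR_LOCKS][NR_LOCKS];
};

static void add_dependency(struct lock_graph *g, enum lock_id held,
			   enum lock_id acquired)
{
	assert(held < NR_LOCKS && acquired < NR_LOCKS);
	g->edge[held][acquired] = true;
}

/* Depth-first search: can 'to' be reached from 'from' along the edges? */
static bool reaches(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (g->edge[from][to])
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	return false;
}

/* A lock that can reach itself means the ordering is circular. */
static bool has_cycle(const struct lock_graph *g)
{
	for (int l = 0; l < NR_LOCKS; l++) {
		bool seen[NR_LOCKS] = { false };

		if (reaches(g, l, l, seen))
			return true;
	}
	return false;
}

/* The three dependencies listed in the cover letter. */
static struct lock_graph chain_before_patch(void)
{
	struct lock_graph g = { 0 };

	add_dependency(&g, KVM_LOCK, VCPU_MUTEX);        /* entry 0 */
	add_dependency(&g, VCPU_MUTEX, MATRIX_DEV_LOCK); /* entry 1 */
	add_dependency(&g, MATRIX_DEV_LOCK, KVM_LOCK);   /* entry 2 */
	return g;
}

/* Same graph without entry 2: kvm->lock is no longer taken under
 * matrix_dev->lock, which is what the patch aims for. */
static struct lock_graph chain_after_patch(void)
{
	struct lock_graph g = { 0 };

	add_dependency(&g, KVM_LOCK, VCPU_MUTEX);
	add_dependency(&g, VCPU_MUTEX, MATRIX_DEV_LOCK);
	return g;
}
```

Under that reading, has_cycle() reports a cycle for chain_before_patch() and none for chain_after_patch(), since dropping the matrix_dev->lock before the masks are set removes the only edge that leads back to kvm->lock.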

Please note:
-----------
* If checkpatch is run against this patch series, you may
get a "WARNING: Unknown commit id 'f21916ec4826', maybe rebased or not
pulled?" message. The commit 'f21916ec4826', however, is definitely
in the master branch on top of which this patch series was built, so
I'm not sure why this message is being output by checkpatch.
* All acks granted from previous review of this patch have been removed
due to the fact that this patch introduces non-trivial changes (see
change log below).

Change log v3=> v4:
------------------
* In vfio_ap_mdev_set_kvm() function, moved the setting of
matrix_mdev->kvm_busy just prior to unlocking matrix_dev->lock.

* Reset queues regardless of the value of matrix_mdev->kvm
in response to the VFIO_DEVICE_RESET ioctl.

Change log v2=> v3:
------------------
* Added two fields - 'bool kvm_busy' and 'wait_queue_head_t wait_for_kvm'
to struct ap_matrix_mdev. The former indicates that the KVM
pointer is in the process of being updated and the second allows a
function that needs access to the KVM pointer to wait until it is
no longer being updated. This resolves the problem of synchronization
between the functions that change the KVM pointer value and the functions
that require access to it.

Change log v1=> v2:
------------------
* No longer holding the matrix_dev->lock prior to setting/clearing the
masks supplying the AP configuration to a KVM guest.
* Make all updates to the data in the matrix mdev that is used to manage
AP resources used by the KVM guest in the vfio_ap_mdev_set_kvm()
function instead of the group notifier callback.
* Check for the matrix mdev's KVM pointer in the vfio_ap_mdev_unset_kvm()
function instead of the vfio_ap_mdev_release() function.

Tony Krowiak (1):
s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

drivers/s390/crypto/vfio_ap_ops.c | 309 ++++++++++++++++++--------
drivers/s390/crypto/vfio_ap_private.h | 2 +
2 files changed, 215 insertions(+), 96 deletions(-)

--
2.21.3


2021-03-10 15:08:08

by Anthony Krowiak

Subject: [PATCH v4 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

This patch fixes a lockdep splat introduced by commit f21916ec4826
("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated").
The lockdep splat only occurs when starting a Secure Execution guest.
Crypto virtualization (vfio_ap) is not yet supported for SE guests;
however, in order to avoid this problem when support becomes available,
this fix is being provided.

The circular locking dependency was introduced when the setting of the
masks in the guest's APCB was executed while holding the matrix_dev->lock.
While the lock is definitely needed to protect the setting/unsetting of the
matrix_mdev->kvm pointer, it is not necessarily critical for setting the
masks; so, the matrix_dev->lock will be released while the masks are being
set or cleared.

Keep in mind, however, that another process that takes the matrix_dev->lock
can get control while the masks in the guest's APCB are being set or
cleared as a result of the driver being notified that the KVM pointer
has been set or unset. This could result in invalid access to the
matrix_mdev->kvm pointer by the intervening process. To avoid this
scenario, two new fields are being added to the ap_matrix_mdev struct:

struct ap_matrix_mdev {
...
bool kvm_busy;
wait_queue_head_t wait_for_kvm;
...
};

The functions that handle notification that the KVM pointer value has
been set or cleared will set the kvm_busy flag to true until they are done
processing, at which time they will set it to false and wake up the tasks
on the matrix_mdev->wait_for_kvm wait queue. Functions that require
access to matrix_mdev->kvm will sleep on the wait queue until they are
awakened, at which time they can safely access the matrix_mdev->kvm
field.
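
As a rough illustration of the busy-flag handshake described above, here is a userspace analogue using POSIX threads. It is a sketch under stated assumptions rather than the driver code: a pthread mutex stands in for matrix_dev->lock, a condition variable stands in for the wait queue, the demo_* names are all hypothetical, and unlike vfio_ap_mdev_set_kvm() the updater takes the lock itself instead of expecting the caller to hold it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct ap_matrix_mdev. */
struct demo_mdev {
	pthread_mutex_t lock;         /* plays the role of matrix_dev->lock */
	pthread_cond_t wait_for_kvm;  /* plays the role of the wait queue */
	bool kvm_busy;                /* true while the pointer is in flux */
	void *kvm;                    /* plays the role of matrix_mdev->kvm */
};

static void demo_init(struct demo_mdev *m)
{
	assert(m != NULL);
	pthread_mutex_init(&m->lock, NULL);
	pthread_cond_init(&m->wait_for_kvm, NULL);
	m->kvm_busy = false;
	m->kvm = NULL;
}

/* Caller must hold m->lock; the lock is released while sleeping and
 * re-acquired before returning, similar in spirit to wait_event_cmd(). */
static void demo_wait_until_idle(struct demo_mdev *m)
{
	while (m->kvm_busy)
		pthread_cond_wait(&m->wait_for_kvm, &m->lock);
}

/* Placeholder for work that needs other locks, e.g. setting the masks. */
static void demo_slow_update(void *kvm)
{
	(void)kvm;
}

static void demo_set_kvm(struct demo_mdev *m, void *kvm)
{
	pthread_mutex_lock(&m->lock);
	demo_wait_until_idle(m);
	m->kvm_busy = true;              /* announce the update ... */
	pthread_mutex_unlock(&m->lock);  /* ... then drop the lock */

	demo_slow_update(kvm);           /* runs without the lock held */

	pthread_mutex_lock(&m->lock);
	m->kvm = kvm;
	m->kvm_busy = false;
	pthread_cond_broadcast(&m->wait_for_kvm); /* wake every waiter */
	pthread_mutex_unlock(&m->lock);
}

/* Readers sleep until no update is in progress, then read the pointer. */
static void *demo_get_kvm(struct demo_mdev *m)
{
	void *kvm;

	pthread_mutex_lock(&m->lock);
	demo_wait_until_idle(m);
	kvm = m->kvm;
	pthread_mutex_unlock(&m->lock);
	return kvm;
}
```

A caller would run demo_init() once, then use demo_set_kvm() from the notification path and demo_get_kvm() wherever the pointer is needed. The point of the flag is that a reader never sees the pointer while the updater has dropped the lock partway through its work.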

Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: [email protected]
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 309 ++++++++++++++++++--------
drivers/s390/crypto/vfio_ap_private.h | 2 +
2 files changed, 215 insertions(+), 96 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 41fc2e4135fe..445d1457faa8 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -294,6 +294,19 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
struct ap_matrix_mdev, pqap_hook);

+ /*
+ * If the KVM pointer is in the process of being set, wait until the
+ * process has completed.
+ */
+ wait_event_cmd(matrix_mdev->wait_for_kvm,
+ matrix_mdev->kvm_busy == false,
+ mutex_unlock(&matrix_dev->lock),
+ mutex_lock(&matrix_dev->lock));
+
+ /* If there is no guest using the mdev, there is nothing to do */
+ if (!matrix_mdev->kvm)
+ goto out_unlock;
+
q = vfio_ap_get_queue(matrix_mdev, apqn);
if (!q)
goto out_unlock;
@@ -337,6 +350,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)

matrix_mdev->mdev = mdev;
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
+ init_waitqueue_head(&matrix_mdev->wait_for_kvm);
mdev_set_drvdata(mdev, matrix_mdev);
matrix_mdev->pqap_hook.hook = handle_pqap;
matrix_mdev->pqap_hook.owner = THIS_MODULE;
@@ -351,17 +365,23 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
{
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

- if (matrix_mdev->kvm)
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * removal of the mdev.
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ mutex_unlock(&matrix_dev->lock);
return -EBUSY;
+ }

- mutex_lock(&matrix_dev->lock);
vfio_ap_mdev_reset_queues(mdev);
list_del(&matrix_mdev->node);
- mutex_unlock(&matrix_dev->lock);
-
kfree(matrix_mdev);
mdev_set_drvdata(mdev, NULL);
atomic_inc(&matrix_dev->available_instances);
+ mutex_unlock(&matrix_dev->lock);

return 0;
}
@@ -606,24 +626,31 @@ static ssize_t assign_adapter_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

- /* If the guest is running, disallow assignment of adapter */
- if (matrix_mdev->kvm)
- return -EBUSY;
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * assignment of adapter
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ ret = -EBUSY;
+ goto done;
+ }

ret = kstrtoul(buf, 0, &apid);
if (ret)
- return ret;
+ goto done;

- if (apid > matrix_mdev->matrix.apm_max)
- return -ENODEV;
+ if (apid > matrix_mdev->matrix.apm_max) {
+ ret = -ENODEV;
+ goto done;
+ }

/*
* Set the bit in the AP mask (APM) corresponding to the AP adapter
* number (APID). The bits in the mask, from most significant to least
* significant bit, correspond to APIDs 0-255.
*/
- mutex_lock(&matrix_dev->lock);
-
ret = vfio_ap_mdev_verify_queues_reserved_for_apid(matrix_mdev, apid);
if (ret)
goto done;
@@ -672,22 +699,31 @@ static ssize_t unassign_adapter_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

- /* If the guest is running, disallow un-assignment of adapter */
- if (matrix_mdev->kvm)
- return -EBUSY;
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * un-assignment of adapter
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ ret = -EBUSY;
+ goto done;
+ }

ret = kstrtoul(buf, 0, &apid);
if (ret)
- return ret;
+ goto done;

- if (apid > matrix_mdev->matrix.apm_max)
- return -ENODEV;
+ if (apid > matrix_mdev->matrix.apm_max) {
+ ret = -ENODEV;
+ goto done;
+ }

- mutex_lock(&matrix_dev->lock);
clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm);
+ ret = count;
+done:
mutex_unlock(&matrix_dev->lock);
-
- return count;
+ return ret;
}
static DEVICE_ATTR_WO(unassign_adapter);

@@ -753,17 +789,24 @@ static ssize_t assign_domain_store(struct device *dev,
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
unsigned long max_apqi = matrix_mdev->matrix.aqm_max;

- /* If the guest is running, disallow assignment of domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * assignment of domain
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ ret = -EBUSY;
+ goto done;
+ }

ret = kstrtoul(buf, 0, &apqi);
if (ret)
- return ret;
- if (apqi > max_apqi)
- return -ENODEV;
-
- mutex_lock(&matrix_dev->lock);
+ goto done;
+ if (apqi > max_apqi) {
+ ret = -ENODEV;
+ goto done;
+ }

ret = vfio_ap_mdev_verify_queues_reserved_for_apqi(matrix_mdev, apqi);
if (ret)
@@ -814,22 +857,32 @@ static ssize_t unassign_domain_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

- /* If the guest is running, disallow un-assignment of domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * un-assignment of domain
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ ret = -EBUSY;
+ goto done;
+ }

ret = kstrtoul(buf, 0, &apqi);
if (ret)
- return ret;
+ goto done;

- if (apqi > matrix_mdev->matrix.aqm_max)
- return -ENODEV;
+ if (apqi > matrix_mdev->matrix.aqm_max) {
+ ret = -ENODEV;
+ goto done;
+ }

- mutex_lock(&matrix_dev->lock);
clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);
- mutex_unlock(&matrix_dev->lock);
+ ret = count;

- return count;
+done:
+ mutex_unlock(&matrix_dev->lock);
+ return ret;
}
static DEVICE_ATTR_WO(unassign_domain);

@@ -858,27 +911,36 @@ static ssize_t assign_control_domain_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

- /* If the guest is running, disallow assignment of control domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * assignment of control domain.
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ ret = -EBUSY;
+ goto done;
+ }

ret = kstrtoul(buf, 0, &id);
if (ret)
- return ret;
+ goto done;

- if (id > matrix_mdev->matrix.adm_max)
- return -ENODEV;
+ if (id > matrix_mdev->matrix.adm_max) {
+ ret = -ENODEV;
+ goto done;
+ }

/* Set the bit in the ADM (bitmask) corresponding to the AP control
* domain number (id). The bits in the mask, from most significant to
* least significant, correspond to IDs 0 up to the one less than the
* number of control domains that can be assigned.
*/
- mutex_lock(&matrix_dev->lock);
set_bit_inv(id, matrix_mdev->matrix.adm);
+ ret = count;
+done:
mutex_unlock(&matrix_dev->lock);
-
- return count;
+ return ret;
}
static DEVICE_ATTR_WO(assign_control_domain);

@@ -908,21 +970,30 @@ static ssize_t unassign_control_domain_store(struct device *dev,
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
unsigned long max_domid = matrix_mdev->matrix.adm_max;

- /* If the guest is running, disallow un-assignment of control domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
+ mutex_lock(&matrix_dev->lock);
+
+ /*
+ * If the KVM pointer is in flux or the guest is running, disallow
+ * un-assignment of control domain.
+ */
+ if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
+ ret = -EBUSY;
+ goto done;
+ }

ret = kstrtoul(buf, 0, &domid);
if (ret)
- return ret;
- if (domid > max_domid)
- return -ENODEV;
+ goto done;
+ if (domid > max_domid) {
+ ret = -ENODEV;
+ goto done;
+ }

- mutex_lock(&matrix_dev->lock);
clear_bit_inv(domid, matrix_mdev->matrix.adm);
+ ret = count;
+done:
mutex_unlock(&matrix_dev->lock);
-
- return count;
+ return ret;
}
static DEVICE_ATTR_WO(unassign_control_domain);

@@ -1027,8 +1098,15 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
* @matrix_mdev: a mediated matrix device
* @kvm: reference to KVM instance
*
- * Verifies no other mediated matrix device has @kvm and sets a reference to
- * it in @matrix_mdev->kvm.
+ * Sets all data for @matrix_mdev that are needed to manage AP resources
+ * for the guest whose state is represented by @kvm.
+ *
+ * Note: The matrix_dev->lock must be taken prior to calling
+ * this function; however, the lock will be temporarily released while the
+ * guest's AP configuration is set to avoid a potential lockdep splat.
+ * The kvm->lock is taken to set the guest's AP configuration which, under
+ * certain circumstances, will result in a circular lock dependency if this is
+ * done under the matrix_dev->lock.
*
* Return 0 if no other mediated matrix device has a reference to @kvm;
* otherwise, returns an -EPERM.
@@ -1038,14 +1116,25 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
{
struct ap_matrix_mdev *m;

- list_for_each_entry(m, &matrix_dev->mdev_list, node) {
- if ((m != matrix_mdev) && (m->kvm == kvm))
- return -EPERM;
- }
+ if (kvm->arch.crypto.crycbd) {
+ list_for_each_entry(m, &matrix_dev->mdev_list, node) {
+ if ((m != matrix_mdev) && (m->kvm == kvm))
+ return -EPERM;
+ }

- matrix_mdev->kvm = kvm;
- kvm_get_kvm(kvm);
- kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+ kvm_get_kvm(kvm);
+ matrix_mdev->kvm_busy = true;
+ mutex_unlock(&matrix_dev->lock);
+ kvm_arch_crypto_set_masks(kvm,
+ matrix_mdev->matrix.apm,
+ matrix_mdev->matrix.aqm,
+ matrix_mdev->matrix.adm);
+ mutex_lock(&matrix_dev->lock);
+ kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+ matrix_mdev->kvm = kvm;
+ matrix_mdev->kvm_busy = false;
+ wake_up_all(&matrix_mdev->wait_for_kvm);
+ }

return 0;
}
@@ -1079,51 +1168,65 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
return NOTIFY_DONE;
}

+/**
+ * vfio_ap_mdev_unset_kvm
+ *
+ * @matrix_mdev: a matrix mediated device
+ *
+ * Performs clean-up of resources no longer needed by @matrix_mdev.
+ *
+ * Note: The matrix_dev->lock must be taken prior to calling
+ * this function; however, the lock will be temporarily released while the
+ * guest's AP configuration is cleared to avoid a potential lockdep splat.
+ * The kvm->lock is taken to clear the guest's AP configuration which, under
+ * certain circumstances, will result in a circular lock dependency if this is
+ * done under the matrix_dev->lock.
+ *
+ */
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
- kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
- matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
- vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
- kvm_put_kvm(matrix_mdev->kvm);
- matrix_mdev->kvm = NULL;
+ /*
+ * If the KVM pointer is in the process of being set, wait until the
+ * process has completed.
+ */
+ wait_event_cmd(matrix_mdev->wait_for_kvm,
+ matrix_mdev->kvm_busy == false,
+ mutex_unlock(&matrix_dev->lock),
+ mutex_lock(&matrix_dev->lock));
+
+ if (matrix_mdev->kvm) {
+ matrix_mdev->kvm_busy = true;
+ mutex_unlock(&matrix_dev->lock);
+ kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+ mutex_lock(&matrix_dev->lock);
+ vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+ matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+ kvm_put_kvm(matrix_mdev->kvm);
+ matrix_mdev->kvm = NULL;
+ matrix_mdev->kvm_busy = false;
+ wake_up_all(&matrix_mdev->wait_for_kvm);
+ }
}

static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
- int ret, notify_rc = NOTIFY_OK;
+ int notify_rc = NOTIFY_OK;
struct ap_matrix_mdev *matrix_mdev;

if (action != VFIO_GROUP_NOTIFY_SET_KVM)
return NOTIFY_OK;

- matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
mutex_lock(&matrix_dev->lock);
+ matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);

- if (!data) {
- if (matrix_mdev->kvm)
- vfio_ap_mdev_unset_kvm(matrix_mdev);
- goto notify_done;
- }
-
- ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
- if (ret) {
- notify_rc = NOTIFY_DONE;
- goto notify_done;
- }
-
- /* If there is no CRYCB pointer, then we can't copy the masks */
- if (!matrix_mdev->kvm->arch.crypto.crycbd) {
+ if (!data)
+ vfio_ap_mdev_unset_kvm(matrix_mdev);
+ else if (vfio_ap_mdev_set_kvm(matrix_mdev, data))
notify_rc = NOTIFY_DONE;
- goto notify_done;
- }

- kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
- matrix_mdev->matrix.aqm,
- matrix_mdev->matrix.adm);
-
-notify_done:
mutex_unlock(&matrix_dev->lock);
+
return notify_rc;
}

@@ -1258,8 +1361,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

mutex_lock(&matrix_dev->lock);
- if (matrix_mdev->kvm)
- vfio_ap_mdev_unset_kvm(matrix_mdev);
+ vfio_ap_mdev_unset_kvm(matrix_mdev);
mutex_unlock(&matrix_dev->lock);

vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
@@ -1293,6 +1395,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
unsigned int cmd, unsigned long arg)
{
int ret;
+ struct ap_matrix_mdev *matrix_mdev;

mutex_lock(&matrix_dev->lock);
switch (cmd) {
@@ -1300,7 +1403,21 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
ret = vfio_ap_mdev_get_device_info(arg);
break;
case VFIO_DEVICE_RESET:
- ret = vfio_ap_mdev_reset_queues(mdev);
+ matrix_mdev = mdev_get_drvdata(mdev);
+
+ /*
+ * If the KVM pointer is in the process of being set, wait until
+ * the process has completed.
+ */
+ wait_event_cmd(matrix_mdev->wait_for_kvm,
+ matrix_mdev->kvm_busy == false,
+ mutex_unlock(&matrix_dev->lock),
+ mutex_lock(&matrix_dev->lock));
+
+ if (matrix_mdev->kvm)
+ ret = vfio_ap_mdev_reset_queues(mdev);
+ else
+ ret = -ENODEV;
break;
default:
ret = -EOPNOTSUPP;
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 28e9d9989768..f82a6396acae 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -83,6 +83,8 @@ struct ap_matrix_mdev {
struct ap_matrix matrix;
struct notifier_block group_notifier;
struct notifier_block iommu_notifier;
+ bool kvm_busy;
+ wait_queue_head_t wait_for_kvm;
struct kvm *kvm;
struct kvm_s390_module_hook pqap_hook;
struct mdev_device *mdev;
--
2.21.3

2021-03-17 23:18:59

by Halil Pasic

Subject: Re: [PATCH v4 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

On Wed, 10 Mar 2021 10:05:59 -0500
Tony Krowiak <[email protected]> wrote:

> - ret = vfio_ap_mdev_reset_queues(mdev);
> + matrix_mdev = mdev_get_drvdata(mdev);

Is it guaranteed that matrix_mdev can't be NULL here? If yes, please
remind me of the mechanism that ensures this.

> +
> + /*
> + * If the KVM pointer is in the process of being set, wait until
> + * the process has completed.
> + */
> + wait_event_cmd(matrix_mdev->wait_for_kvm,
> + matrix_mdev->kvm_busy == false,
> + mutex_unlock(&matrix_dev->lock),
> + mutex_lock(&matrix_dev->lock));
> +
> + if (matrix_mdev->kvm)
> + ret = vfio_ap_mdev_reset_queues(mdev);
> + else
> + ret = -ENODEV;

Didn't we agree to make the call to vfio_ap_mdev_reset_queues()
unconditional again (for reference please take a look at
Message-ID: <[email protected]>)?

Regards,
Halil

2021-03-18 17:55:42

by Anthony Krowiak

Subject: Re: [PATCH v4 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks



On 3/17/21 7:17 PM, Halil Pasic wrote:
> On Wed, 10 Mar 2021 10:05:59 -0500
> Tony Krowiak <[email protected]> wrote:
>
>> - ret = vfio_ap_mdev_reset_queues(mdev);
>> + matrix_mdev = mdev_get_drvdata(mdev);
> Is it guaranteed that matrix_mdev can't be NULL here? If yes, please
> remind me of the mechanism that ensures this.

The matrix_mdev is set as drvdata when the mdev is created and
is only cleared when the mdev is removed. Likewise, this function
is a callback defined by vfio in the vfio_ap_matrix_ops structure
when the matrix_dev is registered and is intended to handle ioctl
calls from userspace during the lifetime of the mdev. While I can't
speak definitively to the guarantee, I think it is extremely unlikely
that matrix_mdev would be NULL at this point. On the other hand,
it wouldn't hurt to check for NULL and log an error or warning
message (I prefer an error here) if NULL.

>
>> +
>> + /*
>> + * If the KVM pointer is in the process of being set, wait until
>> + * the process has completed.
>> + */
>> + wait_event_cmd(matrix_mdev->wait_for_kvm,
>> + matrix_mdev->kvm_busy == false,
>> + mutex_unlock(&matrix_dev->lock),
>> + mutex_lock(&matrix_dev->lock));
>> +
>> + if (matrix_mdev->kvm)
>> + ret = vfio_ap_mdev_reset_queues(mdev);
>> + else
>> + ret = -ENODEV;
> Didn't we agree to make the call to vfio_ap_mdev_reset_queues()
> unconditional again (for reference please take a look at
> Message-ID: <[email protected]>)?

Yes, we did agree to that and I changed it at the time. That change
got lost somehow; I'll reinstate it.

>
> Regards,
> Halil

2021-03-18 18:40:46

by Anthony Krowiak

Subject: Re: [PATCH v4 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks



On 3/17/21 7:17 PM, Halil Pasic wrote:
> On Wed, 10 Mar 2021 10:05:59 -0500
> Tony Krowiak <[email protected]> wrote:
>
>> - ret = vfio_ap_mdev_reset_queues(mdev);
>> + matrix_mdev = mdev_get_drvdata(mdev);
> Is it guaranteed that matrix_mdev can't be NULL here? If yes, please
> remind me of the mechanism that ensures this.
>
>> +
>> + /*
>> + * If the KVM pointer is in the process of being set, wait until
>> + * the process has completed.
>> + */
>> + wait_event_cmd(matrix_mdev->wait_for_kvm,
>> + matrix_mdev->kvm_busy == false,
>> + mutex_unlock(&matrix_dev->lock),
>> + mutex_lock(&matrix_dev->lock));
>> +
>> + if (matrix_mdev->kvm)
>> + ret = vfio_ap_mdev_reset_queues(mdev);
>> + else
>> + ret = -ENODEV;
> Didn't we agree to make the call to vfio_ap_mdev_reset_queues()
> unconditional again (for reference please take a look at
> Message-ID: <[email protected]>)?

How about this:

static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
                    unsigned int cmd, unsigned long arg)
{
    int ret = 0;
    struct ap_matrix_mdev *matrix_mdev;

    ...
    case VFIO_DEVICE_RESET:
        matrix_mdev = mdev_get_drvdata(mdev);
        WARN(!matrix_mdev, "Driver data missing from mdev!!");

        if (matrix_mdev) {
            /*
             * If the KVM pointer is in the process of being set, wait
             * until the process has completed.
             */
            wait_event_cmd(matrix_mdev->wait_for_kvm,
                       matrix_mdev->kvm_busy == false,
                       mutex_unlock(&matrix_dev->lock),
                       mutex_lock(&matrix_dev->lock));

            ret = vfio_ap_mdev_reset_queues(mdev);
        }
        break;
    ...

    return ret;
}

>
> Regards,
> Halil

2021-03-18 19:26:14

by Halil Pasic

Subject: Re: [PATCH v4 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

On Thu, 18 Mar 2021 13:54:06 -0400
Tony Krowiak <[email protected]> wrote:

> > Is it guaranteed that matrix_mdev can't be NULL here? If yes, please
> > remind me of the mechanism that ensures this.
>
> The matrix_mdev is set as drvdata when the mdev is created and
> is only cleared when the mdev is removed. Likewise, this function
> is a callback defined by vfio in the vfio_ap_matrix_ops structure
> when the matrix_dev is registered and is intended to handle ioctl
> calls from userspace during the lifetime of the mdev.

Yes, I've checked that these are all callbacks in the same struct, so
the callbacks are all registered simultaneously, i.e. the ioctl callback
getting registered only when drv_data is already set is not the case.
If there isn't a mechanism in core mdev, then I think we better be
careful. I don't see what would guarantee the pointer is always in the
vfio_ap code.

> While I can't
> speak definitively to the guarantee, I think it is extremely unlikely
> that matrix_mdev would be NULL at this point. On the other hand,
> it wouldn't hurt to check for NULL and log an error or warning
> message (I prefer an error here) if NULL.

If we aren't absolutely sure this pointer is going to be always a valid
one, let's check it!

Regards,
Halil

2021-03-19 00:02:09

by Halil Pasic

Subject: Re: [PATCH v4 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

On Thu, 18 Mar 2021 14:38:53 -0400
Tony Krowiak <[email protected]> wrote:

> On 3/17/21 7:17 PM, Halil Pasic wrote:
> > On Wed, 10 Mar 2021 10:05:59 -0500
> > Tony Krowiak <[email protected]> wrote:
> >
> >> - ret = vfio_ap_mdev_reset_queues(mdev);
> >> + matrix_mdev = mdev_get_drvdata(mdev);
> > Is it guaranteed that matrix_mdev can't be NULL here? If yes, please
> > remind me of the mechanism that ensures this.
> >
> >> +
> >> + /*
> >> + * If the KVM pointer is in the process of being set, wait until
> >> + * the process has completed.
> >> + */
> >> + wait_event_cmd(matrix_mdev->wait_for_kvm,
> >> + matrix_mdev->kvm_busy == false,
> >> + mutex_unlock(&matrix_dev->lock),
> >> + mutex_lock(&matrix_dev->lock));
> >> +
> >> + if (matrix_mdev->kvm)
> >> + ret = vfio_ap_mdev_reset_queues(mdev);
> >> + else
> >> + ret = -ENODEV;
> > Didn't we agree to make the call to vfio_ap_mdev_reset_queues()
> > unconditional again (for reference please take a look at
> > Message-ID: <[email protected]>)?
>
> How about this:

Looks good. I will check the mdev code to see whether the check is really
needed. I'm curious when the sysfs files associated with a new mdev are
created. My guess is that this one comes in via a device specific file
(not the parent like in case of the create), and that those may be
created after the create. But we can get rid of the check any time so I
really don't see it as something that would preclude merging this.

Regards,
Halil

>
> static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>                     unsigned int cmd, unsigned long arg)
> {
>     int ret = 0;
>     struct ap_matrix_mdev *matrix_mdev;
>
>     ...
>     case VFIO_DEVICE_RESET:
>         matrix_mdev = mdev_get_drvdata(mdev);
>         WARN(!matrix_mdev, "Driver data missing from mdev!!");
>
>         if (matrix_mdev) {
>             /*
>              * If the KVM pointer is in the process of being set, wait
>              * until the process has completed.
>              */
>             wait_event_cmd(matrix_mdev->wait_for_kvm,
>                        matrix_mdev->kvm_busy == false,
>                        mutex_unlock(&matrix_dev->lock),
>                        mutex_lock(&matrix_dev->lock));
>
>             ret = vfio_ap_mdev_reset_queues(mdev);
>         }
>         break;
>     ...
>
>     return ret;
> }
>
> >
> > Regards,
> > Halil
>