The current design for AP pass-through does not support making dynamic
changes to the AP matrix of a running guest resulting in a few
deficiencies this patch series is intended to mitigate:
1. Adapters, domains and control domains can not be added to or removed
from a running guest. In order to modify a guest's AP configuration,
the guest must be terminated; only then can AP resources be assigned
to or unassigned from the guest's matrix mdev. The new AP
configuration becomes available to the guest when it is subsequently
restarted.
2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
be modified by a root user without any restrictions. A change to
either mask can result in AP queue devices being unbound from the
vfio_ap device driver and bound to a zcrypt device driver even if a
guest is using the queues, thus giving the host access to the guest's
private crypto data and vice versa.
3. The APQNs derived from the Cartesian product of the APIDs of the
adapters and APQIs of the domains assigned to a matrix mdev must
reference an AP queue device bound to the vfio_ap device driver. The
AP architecture allows assignment of AP resources that are not
available to the system, so this artificial restriction is not
compliant with the architecture.
4. The AP configuration profile can be dynamically changed for the linux
host after a KVM guest is started. For example, a new domain can be
dynamically added to the configuration profile via the SE or an HMC
connected to a DPM enabled lpar. Likewise, AP adapters can be
dynamically configured (online state) and deconfigured (standby state)
using the SE, an SCLP command or an HMC connected to a DPM enabled
lpar. This can result in inadvertent sharing of AP queues between the
guest and host.
5. A root user can manually unbind an AP queue device representing a
queue in use by a KVM guest via the vfio_ap device driver's sysfs
unbind attribute. In this case, the guest will be using a queue that
is not bound to the driver which violates the device model.
This patch series introduces the following changes to the current design
to alleviate the shortcomings described above as well as to implement
more of the AP architecture:
1. A root user will be prevented from making changes to the AP bus's
/sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
changes from the vfio_ap device driver to a zcrypt driver when the
APQN is assigned to a matrix mdev.
2. Allow a root user to hot plug/unplug AP adapters, domains and control
domains using the matrix mdev's assign/unassign attributes.
4. Allow assignment of an AP adapter or domain to a matrix mdev even if
it results in assignment of an APQN that does not reference an AP
queue device bound to the vfio_ap device driver, as long as the APQN
is not reserved for use by the default zcrypt drivers (also known as
over-provisioning of AP resources). Allowing over-provisioning of AP
resources better models the architecture which does not preclude
assigning AP resources that are not yet available in the system.
5. Handle dynamic changes to the AP device model.
1. Rationale for changes to AP bus's apmask/aqmask interfaces:
----------------------------------------------------------
Due to the extremely sensitive nature of cryptographic data, it is
imperative that great care be taken to ensure that such data is secured.
Allowing a root user, either inadvertently or maliciously, to configure
these masks such that a queue is shared between the host and a guest is
not only avoidable, it is advisable. It was suggested that this scenario
is better handled in user space with management software, but that does
not preclude a malicious administrator from using the sysfs interfaces
to gain access to a guest's crypto data. It was also suggested that this
scenario could be avoided by taking access to the adapter away from the
guest and zeroing out the queues prior to the vfio_ap driver releasing the
device; however, stealing an adapter in use from a guest as a by-product
of an operation is bad and will likely cause problems for the guest
unnecessarily. It was decided that the most effective solution with the
least number of negative side effects is to prevent the situation at the
source.
2. Rationale for hot plug/unplug using matrix mdev sysfs interfaces:
----------------------------------------------------------------
Allowing a user to hot plug/unplug AP resources using the matrix mdev
sysfs interfaces circumvents the need to terminate the guest in order to
modify its AP configuration. Allowing dynamic configuration makes
reconfiguring a guest's AP matrix much less disruptive.
3. Rationale for allowing over-provisioning of AP resources:
-----------------------------------------------------------
Allowing assignment of AP resources to a matrix mdev and ultimately to a
guest better models the AP architecture. The architecture does not
preclude assignment of unavailable AP resources. If a queue subsequently
becomes available while a guest using the matrix mdev to which its APQN
is assigned, the guest will be given access to it. If an APQN
is dynamically unassigned from the underlying host system, it will
automatically become unavailable to the guest.
Change log v6-v7:
----------------
* Added callbacks to AP bus:
- on_config_changed: Notifies implementing drivers that
the AP configuration has changed since last AP device scan.
- on_scan_complete: Notifies implementing drivers that the device scan
has completed.
- implemented on_config_changed and on_scan_complete callbacks for
vfio_ap device driver.
- updated vfio_ap device driver's probe and remove callbacks to handle
dynamic changes to the AP device model.
* Added code to filter APQNs when assigning AP resources to a KVM guest's
CRYCB
Change log v5-v6:
----------------
* Fixed a bug in ap_bus.c introduced with patch 2/7 of the v5
series. Harald Freudenberer pointed out that the mutex lock
for ap_perms_mutex in the apmask_store and aqmask_store functions
was not being freed.
* Removed patch 6/7 which added logging to the vfio_ap driver
to expedite acceptance of this series. The logging will be introduced
with a separate patch series to allow more time to explore options
such as DBF logging vs. tracepoints.
* Added 3 patches related to ensuring that APQNs that do not reference
AP queue devices bound to the vfio_ap device driver are not assigned
to the guest CRYCB:
Patch 4: Filter CRYCB bits for unavailable queue devices
Patch 5: sysfs attribute to display the guest CRYCB
Patch 6: update guest CRYCB in vfio_ap probe and remove callbacks
* Added a patch (Patch 9) to version the vfio_ap module.
* Reshuffled patches to allow the in_use callback implementation to
invoke the vfio_ap_mdev_verify_no_sharing() function introduced in
patch 2.
Change log v4-v5:
----------------
* Added a patch to provide kernel s390dbf debug logs for VFIO AP
Change log v3->v4:
-----------------
* Restored patches preventing root user from changing ownership of
APQNs from zcrypt drivers to the vfio_ap driver if the APQN is
assigned to an mdev.
* No longer enforcing requirement restricting guest access to
queues represented by a queue device bound to the vfio_ap
device driver.
* Removed shadow CRYCB and now directly updating the guest CRYCB
from the matrix mdev's matrix.
* Rebased the patch series on top of 'vfio: ap: AP Queue Interrupt
Control' patches.
* Disabled bind/unbind sysfs interfaces for vfio_ap driver
Change log v2->v3:
-----------------
* Allow guest access to an AP queue only if the queue is bound to
the vfio_ap device driver.
* Removed the patch to test CRYCB masks before taking the vCPUs
out of SIE. Now checking the shadow CRYCB in the vfio_ap driver.
Change log v1->v2:
-----------------
* Removed patches preventing root user from unbinding AP queues from
the vfio_ap device driver
* Introduced a shadow CRYCB in the vfio_ap driver to manage dynamic
changes to the AP guest configuration due to root user interventions
or hardware anomalies.
Harald Freudenberger (1):
s390/zcrypt: Notify driver on config changed and scan complete
callbacks
Tony Krowiak (14):
s390/vfio-ap: store queue struct in hash table for quick access
s390/vfio-ap: manage link between queue struct and matrix mdev
s390/zcrypt: driver callback to indicate resource in use
s390/vfio-ap: implement in-use callback for vfio_ap driver
s390/vfio-ap: introduce shadow CRYCB
s390/vfio-ap: sysfs attribute to display the guest CRYCB
s390/vfio-ap: filter CRYCB bits for unavailable queue devices
s390/vfio_ap: add qlink from ap_matrix_mdev struct to vfio_ap_queue
struct
s390/vfio-ap: allow assignment of unavailable AP queues to mdev device
s390/vfio-ap: allow configuration of matrix mdev in use by a KVM guest
s390/vfio-ap: allow hot plug/unplug of AP resources using mdev device
s390/vfio-ap: handle host AP config change notification
s390/vfio-ap: handle AP bus scan completed notification
s390/vfio-ap: handle probe/remove not due to host AP config changes
drivers/s390/crypto/ap_bus.c | 301 ++++++-
drivers/s390/crypto/ap_bus.h | 17 +
drivers/s390/crypto/vfio_ap_drv.c | 35 +-
drivers/s390/crypto/vfio_ap_ops.c | 1103 +++++++++++++++++++------
drivers/s390/crypto/vfio_ap_private.h | 24 +-
5 files changed, 1167 insertions(+), 313 deletions(-)
--
2.21.1
The current support for pass-through crypto adapters does not allow
configuration of a matrix mdev when it is in use by a KVM guest. Let's
allow AP resources - i.e., adapters, domains and control domains - to be
assigned to or unassigned from a matrix mdev while it is in use by a guest.
This is in preparation for the introduction of support for dynamic
configuration of the AP matrix for a running KVM guest.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 24 ------------------------
1 file changed, 24 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index b5cfa3a967c0..4b16d45b702b 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -665,10 +665,6 @@ static ssize_t assign_adapter_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
- /* If the guest is running, disallow assignment of adapter */
- if (matrix_mdev->kvm)
- return -EBUSY;
-
ret = kstrtoul(buf, 0, &apid);
if (ret)
return ret;
@@ -720,10 +716,6 @@ static ssize_t unassign_adapter_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
- /* If the guest is running, disallow un-assignment of adapter */
- if (matrix_mdev->kvm)
- return -EBUSY;
-
ret = kstrtoul(buf, 0, &apid);
if (ret)
return ret;
@@ -810,10 +802,6 @@ static ssize_t assign_domain_store(struct device *dev,
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
unsigned long max_apqi = matrix_mdev->matrix.aqm_max;
- /* If the guest is running, disallow assignment of domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
-
ret = kstrtoul(buf, 0, &apqi);
if (ret)
return ret;
@@ -865,10 +853,6 @@ static ssize_t unassign_domain_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
- /* If the guest is running, disallow un-assignment of domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
-
ret = kstrtoul(buf, 0, &apqi);
if (ret)
return ret;
@@ -910,10 +894,6 @@ static ssize_t assign_control_domain_store(struct device *dev,
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
- /* If the guest is running, disallow assignment of control domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
-
ret = kstrtoul(buf, 0, &id);
if (ret)
return ret;
@@ -955,10 +935,6 @@ static ssize_t unassign_control_domain_store(struct device *dev,
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
unsigned long max_domid = matrix_mdev->matrix.adm_max;
- /* If the guest is running, disallow un-assignment of control domain */
- if (matrix_mdev->kvm)
- return -EBUSY;
-
ret = kstrtoul(buf, 0, &domid);
if (ret)
return ret;
--
2.21.1
AP queue devices are probed or removed for reasons other than changes
to the host AP configuration:
* Each queue device associated with a card device will get created and
probed when the state of the AP adapter represented by the card device
dynamically changes from standby to online.
* Each queue device associated with a card device will get removed
when the state of the AP adapter to which the queue represented by the
queue device dynamically changes from online to standby.
* Each queue device associated with a card device will get removed
when the type of the AP adapter to which the queue represented by the
queue device dynamically changes.
* Each queue device associated with a card device will get removed
when the status of the queue represented by the queue device changes
from operating to check stop.
* AP queue devices can be manually bound to or unbound from the vfio_ap
device driver by a root user via the sysfs bind/unbind attributes of the
driver.
In response to a queue device probe or remove that is not the result of a
change to the host's AP configuration, if a KVM guest is using the matrix
mdev to which the APQN of the queue device is assigned, the vfio_ap device
driver must respond accordingly. In an ideal world, the queue corresponding
to the queue device being probed would be hot plugged into the guest.
Likewise, the queue corresponding to the queue device being removed would
be hot unplugged from the guest. Unfortunately, the AP architecture
precludes plugging or unplugging individual queues, so let's handle
the probe or remove of an AP queue device as follows:
Handling Probe
--------------
There are two requirements that must be met in order to give a
guest access to the queue corresponding to the queue device being probed:
* Each APQN derived from the APID of the queue device and the APQIs of the
domains already assigned to the guest's AP configuration must reference
a queue device bound to the vfio_ap device driver.
* Each APQN derived from the APQI of the queue device and the APIDs of the
adapters assigned to the guest's AP configuration must reference a queue
device bound to the vfio_ap device driver.
If the above conditions are met, the APQN will be assigned to the guest's
AP configuration and the guest will be given access to the queue.
Handling Remove
---------------
Since the AP architecture precludes us from taking access to an individual
queue from a guest, we are left with the choice of taking access away from
either the adapter or the domain to which the queue is connected. Access to
the adapter will be taken away because it is likely that most of the time,
the remove callback will be invoked because the adapter state has
transitioned from online to standby. In such a case, no queue connected
to the adapter will be available to access.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 38 +++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index ccc58daf82f6..918b735d5d56 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1601,6 +1601,15 @@ static void vfio_ap_mdev_for_queue(struct vfio_ap_queue *q)
}
}
+void vfio_ap_mdev_hot_plug_queue(struct vfio_ap_queue *q)
+{
+ if ((q->matrix_mdev == NULL) || !vfio_ap_mdev_has_crycb(q->matrix_mdev))
+ return;
+
+ if (vfio_ap_mdev_configure_crycb(q->matrix_mdev))
+ vfio_ap_mdev_commit_crycb(q->matrix_mdev);
+}
+
int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
{
struct vfio_ap_queue *q;
@@ -1615,11 +1624,35 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
q->saved_isc = VFIO_AP_ISC_INVALID;
vfio_ap_mdev_for_queue(q);
hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
+ /* Make sure we're not in the middle of an AP configuration change. */
+ if (!(matrix_dev->flags & AP_MATRIX_CFG_CHG))
+ vfio_ap_mdev_hot_plug_queue(q);
mutex_unlock(&matrix_dev->lock);
return 0;
}
+void vfio_ap_mdev_hot_unplug_queue(struct vfio_ap_queue *q)
+{
+ unsigned long apid = AP_QID_CARD(q->apqn);
+ unsigned long apqi = AP_QID_QUEUE(q->apqn);
+
+ if ((q->matrix_mdev == NULL) || !vfio_ap_mdev_has_crycb(q->matrix_mdev))
+ return;
+
+ /*
+ * If the APQN is assigned to the guest, then let's
+ * go ahead and unplug the adapter since the
+ * architecture does not provide a means to unplug
+ * an individual queue.
+ */
+ if (test_bit_inv(apid, q->matrix_mdev->shadow_crycb.apm) &&
+ test_bit_inv(apqi, q->matrix_mdev->shadow_crycb.aqm)) {
+ if (vfio_ap_mdev_unassign_guest_apid(q->matrix_mdev, apid))
+ vfio_ap_mdev_commit_crycb(q->matrix_mdev);
+ }
+}
+
void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
{
struct vfio_ap_queue *q;
@@ -1627,6 +1660,11 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
mutex_lock(&matrix_dev->lock);
q = dev_get_drvdata(&queue->ap_dev.device);
+
+ /* Make sure we're not in the middle of an AP configuration change. */
+ if (!(matrix_dev->flags & AP_MATRIX_CFG_CHG))
+ vfio_ap_mdev_hot_unplug_queue(q);
+
dev_set_drvdata(&queue->ap_dev.device, NULL);
apid = AP_QID_CARD(q->apqn);
apqi = AP_QID_QUEUE(q->apqn);
--
2.21.1
Implements the driver callback invoked by the AP bus when the host
AP configuration has changed. Since this callback is invoked prior to
unbinding a device from its device driver, the vfio_ap driver will
respond by unplugging the AP adapters, domains and control domains
removed from the host's AP configuration from the guests using them.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_drv.c | 5 +-
drivers/s390/crypto/vfio_ap_ops.c | 136 ++++++++++++++++++++++++--
drivers/s390/crypto/vfio_ap_private.h | 7 +-
3 files changed, 140 insertions(+), 8 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index 5197a1fe14d4..9f6c5d82dfb5 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -113,9 +113,11 @@ static int vfio_ap_matrix_dev_create(void)
/* Fill in config info via PQAP(QCI), if available */
if (test_facility(12)) {
- ret = ap_qci(&matrix_dev->info);
+ ret = ap_qci(&matrix_dev->config_info);
if (ret)
goto matrix_alloc_err;
+ memcpy(&matrix_dev->config_info_prev, &matrix_dev->config_info,
+ sizeof(struct ap_config_info));
}
mutex_init(&matrix_dev->lock);
@@ -176,6 +178,7 @@ static int __init vfio_ap_init(void)
vfio_ap_drv.remove = vfio_ap_queue_dev_remove;
vfio_ap_drv.in_use = vfio_ap_mdev_resource_in_use;
vfio_ap_drv.ids = ap_queue_ids;
+ vfio_ap_drv.on_config_changed = vfio_ap_on_cfg_changed;
ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME);
if (ret) {
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 88a4aef5193f..f1dd77729dd9 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -336,8 +336,9 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
}
matrix_mdev->mdev = mdev;
- vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
- vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_crycb);
+ vfio_ap_matrix_init(&matrix_dev->config_info, &matrix_mdev->matrix);
+ vfio_ap_matrix_init(&matrix_dev->config_info,
+ &matrix_mdev->shadow_crycb);
hash_init(matrix_mdev->qtable);
mdev_set_drvdata(mdev, matrix_mdev);
matrix_mdev->pqap_hook.hook = handle_pqap;
@@ -485,8 +486,8 @@ static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
* If the APID is not assigned to the host AP configuration,
* we can not assign it to the guest's AP configuration
*/
- if (!test_bit_inv(apid,
- (unsigned long *)matrix_dev->info.apm)) {
+ if (!test_bit_inv(apid, (unsigned long *)
+ matrix_dev->config_info.apm)) {
clear_bit_inv(apid, shadow_crycb->apm);
continue;
}
@@ -499,7 +500,7 @@ static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
* guest's AP configuration
*/
if (!test_bit_inv(apqi, (unsigned long *)
- matrix_dev->info.aqm)) {
+ matrix_dev->config_info.aqm)) {
clear_bit_inv(apqi, shadow_crycb->aqm);
continue;
}
@@ -1617,7 +1618,7 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
{
struct vfio_ap_queue *q;
- int apid, apqi;
+ unsigned long apid, apqi;
mutex_lock(&matrix_dev->lock);
q = dev_get_drvdata(&queue->ap_dev.device);
@@ -1643,3 +1644,126 @@ bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
return in_use;
}
+
+/**
+ * vfio_ap_mdev_unassign_apids
+ *
+ * @matrix_mdev: The matrix mediated device
+ *
+ * @aqm: A bitmap with 256 bits. Each bit in the map represents an APID from 0
+ * to 255 (with the leftmost bit corresponding to APID 0).
+ *
+ * Unassigns each APID specified in @aqm that is assigned to the shadow CRYCB
+ * of @matrix_mdev. Returns true if at least one APID is unassigned; otherwise,
+ * returns false.
+ */
+bool vfio_ap_mdev_unassign_apids(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *apm_unassign)
+{
+ unsigned long apid;
+ bool unassigned = false;
+
+ /*
+ * If the matrix mdev is not in use by a KVM guest, return indicating
+ * that no APIDs have been unassigned.
+ */
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev))
+ return false;
+
+ for_each_set_bit_inv(apid, apm_unassign, AP_DEVICES) {
+ unassigned |= vfio_ap_mdev_unassign_guest_apid(matrix_mdev,
+ apid);
+ }
+
+ return unassigned;
+}
+
+/**
+ * vfio_ap_mdev_unassign_apqis
+ *
+ * @matrix_mdev: The matrix mediated device
+ *
+ * @aqm: A bitmap with 256 bits. Each bit in the map represents an APQI from 0
+ * to 255 (with the leftmost bit corresponding to APQI 0).
+ *
+ * Unassigns each APQI specified in @aqm that is assigned to the shadow CRYCB
+ * of @matrix_mdev. Returns true if at least one APQI is unassigned; otherwise,
+ * returns false.
+ */
+bool vfio_ap_mdev_unassign_apqis(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *aqm_unassign)
+{
+ unsigned long apqi;
+ bool unassigned = false;
+
+ /*
+ * If the matrix mdev is not in use by a KVM guest, return indicating
+ * that no APQIs have been unassigned.
+ */
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev))
+ return false;
+
+ for_each_set_bit_inv(apqi, aqm_unassign, AP_DOMAINS) {
+ unassigned |= vfio_ap_mdev_unassign_guest_apqi(matrix_mdev,
+ apqi);
+ }
+
+ return unassigned;
+}
+
+void vfio_ap_on_cfg_changed(struct ap_config_info *new_config_info,
+ struct ap_config_info *old_config_info)
+{
+ bool unassigned;
+ int ap_remove, aq_remove;
+ struct ap_matrix_mdev *matrix_mdev;
+ DECLARE_BITMAP(apm_unassign, AP_DEVICES);
+ DECLARE_BITMAP(aqm_unassign, AP_DOMAINS);
+
+ unsigned long *cur_apm, *cur_aqm, *prev_apm, *prev_aqm;
+
+ if (matrix_dev->flags & AP_MATRIX_CFG_CHG) {
+ WARN_ONCE(1, "AP host configuration change already reported");
+ return;
+ }
+
+ memcpy(&matrix_dev->config_info, new_config_info,
+ sizeof(struct ap_config_info));
+ memcpy(&matrix_dev->config_info_prev, old_config_info,
+ sizeof(struct ap_config_info));
+
+ cur_apm = (unsigned long *)matrix_dev->config_info.apm;
+ cur_aqm = (unsigned long *)matrix_dev->config_info.aqm;
+ prev_apm = (unsigned long *)matrix_dev->config_info_prev.apm;
+ prev_aqm = (unsigned long *)matrix_dev->config_info_prev.aqm;
+
+ ap_remove = bitmap_andnot(apm_unassign, prev_apm, cur_apm, AP_DEVICES);
+ aq_remove = bitmap_andnot(aqm_unassign, prev_aqm, cur_aqm, AP_DOMAINS);
+
+ mutex_lock(&matrix_dev->lock);
+ matrix_dev->flags |= AP_MATRIX_CFG_CHG;
+
+ list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) {
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev))
+ continue;
+
+ unassigned = false;
+
+ if (ap_remove)
+ if (bitmap_intersects(matrix_mdev->shadow_crycb.apm,
+ apm_unassign, AP_DEVICES))
+ if (vfio_ap_mdev_unassign_apids(matrix_mdev,
+ apm_unassign))
+ unassigned = true;
+ if (aq_remove)
+ if (bitmap_intersects(matrix_mdev->shadow_crycb.aqm,
+ aqm_unassign, AP_DOMAINS))
+ if (vfio_ap_mdev_unassign_apqis(matrix_mdev,
+ aqm_unassign))
+ unassigned = true;
+
+ if (unassigned)
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
+ }
+ mutex_unlock(&matrix_dev->lock);
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 794c60a767d2..82abbf03781f 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -40,11 +40,14 @@
struct ap_matrix_dev {
struct device device;
atomic_t available_instances;
- struct ap_config_info info;
+ struct ap_config_info config_info;
+ struct ap_config_info config_info_prev;
struct list_head mdev_list;
struct mutex lock;
struct ap_driver *vfio_ap_drv;
DECLARE_HASHTABLE(qtable, 8);
+ #define AP_MATRIX_CFG_CHG (1UL << 0)
+ unsigned long flags;
};
extern struct ap_matrix_dev *matrix_dev;
@@ -109,5 +112,7 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
void vfio_ap_mdev_remove_queue(struct ap_queue *queue);
bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm);
+void vfio_ap_on_cfg_changed(struct ap_config_info *new_config_info,
+ struct ap_config_info *old_config_info);
#endif /* _VFIO_AP_PRIVATE_H_ */
--
2.21.1
The current implementation does not allow assignment of an AP adapter or
domain to an mdev device if the APQNs resulting from the assignment
do not reference AP queue devices that are bound to the vfio_ap device
driver. This patch allows assignment of AP resources to the matrix mdev as
long as the APQNs resulting from the assignment:
1. Are not reserved by the AP BUS for use by the zcrypt device drivers.
2. Are not assigned to another matrix mdev.
The rationale behind this is twofold:
1. The AP architecture does not preclude assignment of APQNs to an AP
configuration that are not available to the system.
2. APQNs that do not reference a queue device bound to the vfio_ap
device driver will not be assigned to the guest's CRYCB, so the
guest will not get access to queues not bound to the vfio_ap driver.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 213 ++++++------------------------
1 file changed, 37 insertions(+), 176 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 6ee1ebe3f207..b5cfa3a967c0 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -409,122 +409,6 @@ static struct attribute_group *vfio_ap_mdev_type_groups[] = {
NULL,
};
-struct vfio_ap_queue_reserved {
- unsigned long *apid;
- unsigned long *apqi;
- bool reserved;
-};
-
-/**
- * vfio_ap_has_queue
- *
- * @dev: an AP queue device
- * @data: a struct vfio_ap_queue_reserved reference
- *
- * Flags whether the AP queue device (@dev) has a queue ID containing the APQN,
- * apid or apqi specified in @data:
- *
- * - If @data contains both an apid and apqi value, then @data will be flagged
- * as reserved if the APID and APQI fields for the AP queue device matches
- *
- * - If @data contains only an apid value, @data will be flagged as
- * reserved if the APID field in the AP queue device matches
- *
- * - If @data contains only an apqi value, @data will be flagged as
- * reserved if the APQI field in the AP queue device matches
- *
- * Returns 0 to indicate the input to function succeeded. Returns -EINVAL if
- * @data does not contain either an apid or apqi.
- */
-static int vfio_ap_has_queue(struct device *dev, void *data)
-{
- struct vfio_ap_queue_reserved *qres = data;
- struct ap_queue *ap_queue = to_ap_queue(dev);
- ap_qid_t qid;
- unsigned long id;
-
- if (qres->apid && qres->apqi) {
- qid = AP_MKQID(*qres->apid, *qres->apqi);
- if (qid == ap_queue->qid)
- qres->reserved = true;
- } else if (qres->apid && !qres->apqi) {
- id = AP_QID_CARD(ap_queue->qid);
- if (id == *qres->apid)
- qres->reserved = true;
- } else if (!qres->apid && qres->apqi) {
- id = AP_QID_QUEUE(ap_queue->qid);
- if (id == *qres->apqi)
- qres->reserved = true;
- } else {
- return -EINVAL;
- }
-
- return 0;
-}
-
-/**
- * vfio_ap_verify_queue_reserved
- *
- * @matrix_dev: a mediated matrix device
- * @apid: an AP adapter ID
- * @apqi: an AP queue index
- *
- * Verifies that the AP queue with @apid/@apqi is reserved by the VFIO AP device
- * driver according to the following rules:
- *
- * - If both @apid and @apqi are not NULL, then there must be an AP queue
- * device bound to the vfio_ap driver with the APQN identified by @apid and
- * @apqi
- *
- * - If only @apid is not NULL, then there must be an AP queue device bound
- * to the vfio_ap driver with an APQN containing @apid
- *
- * - If only @apqi is not NULL, then there must be an AP queue device bound
- * to the vfio_ap driver with an APQN containing @apqi
- *
- * Returns 0 if the AP queue is reserved; otherwise, returns -EADDRNOTAVAIL.
- */
-static int vfio_ap_verify_queue_reserved(unsigned long *apid,
- unsigned long *apqi)
-{
- int ret;
- struct vfio_ap_queue_reserved qres;
-
- qres.apid = apid;
- qres.apqi = apqi;
- qres.reserved = false;
-
- ret = driver_for_each_device(&matrix_dev->vfio_ap_drv->driver, NULL,
- &qres, vfio_ap_has_queue);
- if (ret)
- return ret;
-
- if (qres.reserved)
- return 0;
-
- return -EADDRNOTAVAIL;
-}
-
-static int
-vfio_ap_mdev_verify_queues_reserved_for_apid(struct ap_matrix_mdev *matrix_mdev,
- unsigned long apid)
-{
- int ret;
- unsigned long apqi;
- unsigned long nbits = matrix_mdev->matrix.aqm_max + 1;
-
- if (find_first_bit_inv(matrix_mdev->matrix.aqm, nbits) >= nbits)
- return vfio_ap_verify_queue_reserved(&apid, NULL);
-
- for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, nbits) {
- ret = vfio_ap_verify_queue_reserved(&apid, &apqi);
- if (ret)
- return ret;
- }
-
- return 0;
-}
-
/**
* vfio_ap_mdev_verify_no_sharing
*
@@ -533,6 +417,8 @@ vfio_ap_mdev_verify_queues_reserved_for_apid(struct ap_matrix_mdev *matrix_mdev,
* mediated device. AP queue sharing is not allowed.
*
* @matrix_mdev: the mediated matrix device
+ * @mdev_apm: the mask identifying the adapters assigned to mdev
+ * @mdev_aqm: the mask identifying the domains (queues) assigned to mdev
*
* Returns 0 if the APQNs are not shared, otherwise; returns -EADDRINUSE.
*/
@@ -545,6 +431,11 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev,
DECLARE_BITMAP(aqm, AP_DOMAINS);
list_for_each_entry(lstdev, &matrix_dev->mdev_list, node) {
+ /*
+ * If either of the input masks belongs to the mdev to which an
+ * AP resource is being assigned, then we don't need to verify
+ * that mdev's masks.
+ */
if (matrix_mdev == lstdev)
continue;
@@ -567,6 +458,20 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev,
return 0;
}
+static int vfio_ap_mdev_validate_masks(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *mdev_apm,
+ unsigned long *mdev_aqm)
+{
+ DECLARE_BITMAP(apm, AP_DEVICES);
+ DECLARE_BITMAP(aqm, AP_DOMAINS);
+
+ if (bitmap_and(apm, mdev_apm, ap_perms.apm, AP_DEVICES) &&
+ bitmap_and(aqm, mdev_aqm, ap_perms.aqm, AP_DOMAINS))
+ return -EADDRNOTAVAIL;
+
+ return vfio_ap_mdev_verify_no_sharing(matrix_mdev, mdev_apm, mdev_aqm);
+}
+
static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
struct ap_matrix *shadow_crycb,
bool filter_apids)
@@ -771,33 +676,21 @@ static ssize_t assign_adapter_store(struct device *dev,
if (apid > matrix_mdev->matrix.apm_max)
return -ENODEV;
- /*
- * Set the bit in the AP mask (APM) corresponding to the AP adapter
- * number (APID). The bits in the mask, from most significant to least
- * significant bit, correspond to APIDs 0-255.
- */
- mutex_lock(&matrix_dev->lock);
-
- ret = vfio_ap_mdev_verify_queues_reserved_for_apid(matrix_mdev, apid);
- if (ret)
- goto done;
-
memset(apm, 0, sizeof(apm));
set_bit_inv(apid, apm);
- ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev, apm,
- matrix_mdev->matrix.aqm);
- if (ret)
- goto done;
-
+ mutex_lock(&matrix_dev->lock);
+ ret = vfio_ap_mdev_validate_masks(matrix_mdev, apm,
+ matrix_mdev->matrix.aqm);
+ if (ret) {
+ mutex_unlock(&matrix_dev->lock);
+ return ret;
+ }
set_bit_inv(apid, matrix_mdev->matrix.apm);
vfio_ap_mdev_qlinks_for_apid(matrix_mdev, apid);
- ret = count;
-
-done:
mutex_unlock(&matrix_dev->lock);
- return ret;
+ return count;
}
static DEVICE_ATTR_WO(assign_adapter);
@@ -847,26 +740,6 @@ static ssize_t unassign_adapter_store(struct device *dev,
}
static DEVICE_ATTR_WO(unassign_adapter);
-static int
-vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev,
- unsigned long apqi)
-{
- int ret;
- unsigned long apid;
- unsigned long nbits = matrix_mdev->matrix.apm_max + 1;
-
- if (find_first_bit_inv(matrix_mdev->matrix.apm, nbits) >= nbits)
- return vfio_ap_verify_queue_reserved(NULL, &apqi);
-
- for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, nbits) {
- ret = vfio_ap_verify_queue_reserved(&apid, &apqi);
- if (ret)
- return ret;
- }
-
- return 0;
-}
-
/**
* vfio_ap_mdev_qlinks_for_apqi
*
@@ -947,28 +820,21 @@ static ssize_t assign_domain_store(struct device *dev,
if (apqi > max_apqi)
return -ENODEV;
- mutex_lock(&matrix_dev->lock);
-
- ret = vfio_ap_mdev_verify_queues_reserved_for_apqi(matrix_mdev, apqi);
- if (ret)
- goto done;
-
memset(aqm, 0, sizeof(aqm));
set_bit_inv(apqi, aqm);
- ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev,
- matrix_mdev->matrix.apm, aqm);
- if (ret)
- goto done;
-
+ mutex_lock(&matrix_dev->lock);
+ ret = vfio_ap_mdev_validate_masks(matrix_mdev, matrix_mdev->matrix.apm,
+ aqm);
+ if (ret) {
+ mutex_unlock(&matrix_dev->lock);
+ return ret;
+ }
set_bit_inv(apqi, matrix_mdev->matrix.aqm);
vfio_ap_mdev_qlinks_for_apqi(matrix_mdev, apqi);
- ret = count;
-
-done:
mutex_unlock(&matrix_dev->lock);
- return ret;
+ return count;
}
static DEVICE_ATTR_WO(assign_domain);
@@ -1055,11 +921,6 @@ static ssize_t assign_control_domain_store(struct device *dev,
if (id > matrix_mdev->matrix.adm_max)
return -ENODEV;
- /* Set the bit in the ADM (bitmask) corresponding to the AP control
- * domain number (id). The bits in the mask, from most significant to
- * least significant, correspond to IDs 0 up to the one less than the
- * number of control domains that can be assigned.
- */
mutex_lock(&matrix_dev->lock);
set_bit_inv(id, matrix_mdev->matrix.adm);
mutex_unlock(&matrix_dev->lock);
--
2.21.1
Even though APQNs for queues that are not in the host's AP configuration
may be assigned to a matrix mdev, we do not want to set bits in the guest's
CRYCB for APQNs that do not reference AP queue devices bound to the vfio_ap
device driver. Ideally, it would be great if such APQNs could be filtered
out before setting the bits in the guest's CRYCB; however, the architecture
precludes filtering individual APQNs. Consequently, either the APID or APQI
must be filtered.
This patch introduces code to filter the APIDs or APQIs assigned to the
matrix mdev's AP configuration before assigning them to the guest's AP
configuration (i.e., CRYCB). We'll start by filtering the APIDs:
If an APQN assigned to the matrix mdev's AP configuration does not
reference a queue device bound to the vfio_ap device driver, the APID
will be filtered out (i.e., not assigned to the guest's CRYCB).
If every APID assigned to the matrix mdev is filtered out, then we'll try
filtering the APQI's:
If an APQN assigned to the matrix mdev's AP configuration does not
reference a queue device bound to the vfio_ap device driver, the APQI
will be filtered out (i.e., not assigned to the guest's CRYCB).
In any case, if after filtering either the APIDs or APQIs there are any
APQNs that can be assigned to the guest's CRYCB, they will be assigned and
the CRYCB will be hot plugged into the guest.
Example
=======
APQNs bound to vfio_ap device driver:
04.0004
04.0047
04.0054
05.0005
05.0047
05.0054
Assignments to matrix mdev:
APIDs APQIs -> APQNs
04 0004 04.0004
05 0005 04.0005
0047 04.0047
0054 04.0054
05.0004
05.0005
05.0047
04.0054
Filter APIDs:
APID 04 will be filtered because APQN 04.0005 is not bound.
APID 05 will be filtered because APQN 05.0004 is not bound.
APQNs remaining: None
Filter APQIs:
APQI 04 will be filtered because APQN 05.0004 is not bound.
APQI 05 will be filtered because APQN 04.0005 is not bound.
APQNs remaining: 04.0047, 04.0054, 05.0047, 05.0054
APQNs 04.0047, 04.0054, 05.0047, 05.0054 will be assigned to the CRYCB and
hot plugged into the KVM guest.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 150 ++++++++++++++++++++++++++++--
1 file changed, 140 insertions(+), 10 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index beb2e68a2b1c..25b7d978e3fd 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -296,14 +296,17 @@ static void vfio_ap_matrix_init(struct ap_config_info *info,
matrix->adm_max = info->apxa ? info->Nd : 15;
}
-static bool vfio_ap_mdev_commit_crycb(struct ap_matrix_mdev *matrix_mdev)
+static bool vfio_ap_mdev_has_crycb(struct ap_matrix_mdev *matrix_mdev)
{
- if (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd) {
- kvm_arch_crypto_set_masks(matrix_mdev->kvm,
- matrix_mdev->shadow_crycb.apm,
- matrix_mdev->shadow_crycb.aqm,
- matrix_mdev->shadow_crycb.adm);
- }
+ return (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd);
+}
+
+static void vfio_ap_mdev_commit_crycb(struct ap_matrix_mdev *matrix_mdev)
+{
+ kvm_arch_crypto_set_masks(matrix_mdev->kvm,
+ matrix_mdev->shadow_crycb.apm,
+ matrix_mdev->shadow_crycb.aqm,
+ matrix_mdev->shadow_crycb.adm);
}
static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
@@ -550,6 +553,130 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev,
return 0;
}
+static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
+ struct ap_matrix *shadow_crycb,
+ bool filter_apids)
+{
+ unsigned long apid, apqi, apqn;
+
+ memcpy(shadow_crycb, &matrix_mdev->matrix, sizeof(*shadow_crycb));
+
+ for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) {
+ /*
+ * If the APID is not assigned to the host AP configuration,
+ * we can not assign it to the guest's AP configuration
+ */
+ if (!test_bit_inv(apid,
+ (unsigned long *)matrix_dev->info.apm)) {
+ clear_bit_inv(apid, shadow_crycb->apm);
+ continue;
+ }
+
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
+ AP_DOMAINS) {
+ /*
+ * If the APQI is not assigned to the host AP
+ * configuration, then it can not be assigned to the
+ * guest's AP configuration
+ */
+ if (!test_bit_inv(apqi, (unsigned long *)
+ matrix_dev->info.aqm)) {
+ clear_bit_inv(apqi, shadow_crycb->aqm);
+ continue;
+ }
+
+ /*
+ * If the APQN is not bound to the vfio_ap device
+ * driver, then we can't assign it to the guest's
+ * AP configuration. The AP architecture won't
+ * allow filtering of a single APQN, so if we're
+ * filtering APIDs, then filter the APID; otherwise,
+ * filter the APQI.
+ */
+ apqn = AP_MKQID(apid, apqi);
+ if (!vfio_ap_get_queue(apqn)) {
+ if (filter_apids)
+ clear_bit_inv(apid, shadow_crycb->apm);
+ else
+ clear_bit_inv(apqi, shadow_crycb->aqm);
+ break;
+ }
+ }
+
+ /*
+ * If we're filtering APQIs and all of them have been filtered,
+ * there's no need to continue filtering.
+ */
+ if (!filter_apids)
+ if (bitmap_empty(shadow_crycb->aqm, AP_DOMAINS))
+ break;
+ }
+
+ return bitmap_weight(shadow_crycb->apm, AP_DEVICES) *
+ bitmap_weight(shadow_crycb->aqm, AP_DOMAINS);
+}
+
+static bool vfio_ap_mdev_configure_crycb(struct ap_matrix_mdev *matrix_mdev)
+{
+ int napm, naqm;
+ struct ap_matrix shadow_crycb;
+
+ vfio_ap_matrix_init(&matrix_dev->info, &shadow_crycb);
+ napm = bitmap_weight(matrix_mdev->matrix.apm, AP_DEVICES);
+ naqm = bitmap_weight(matrix_mdev->matrix.aqm, AP_DOMAINS);
+
+ /*
+ * If there are no APIDs or no APQIs assigned to the matrix mdev,
+ * then no APQNs shall be assigned to the guest CRYCB.
+ */
+ if ((napm != 0) || (naqm != 0)) {
+ /*
+ * Filter the APIDs assigned to the matrix mdev for APQNs that
+ * do not reference an AP queue device bound to the driver.
+ */
+ napm = vfio_ap_mdev_filter_matrix(matrix_mdev, &shadow_crycb,
+ true);
+ /*
+ * If there are no APQNs that can be assigned to the guest's
+ * CRYCB after filtering, then try filtering the APQIs.
+ */
+ if (napm == 0) {
+ naqm = vfio_ap_mdev_filter_matrix(matrix_mdev,
+ &shadow_crycb, false);
+
+ /*
+ * If there are no APQNs that can be assigned to the
+ * matrix mdev after filtering the APQIs, then no APQNs
+ * shall be assigned to the guest's CRYCB.
+ */
+ if (naqm == 0) {
+ bitmap_clear(shadow_crycb.apm, 0, AP_DEVICES);
+ bitmap_clear(shadow_crycb.aqm, 0, AP_DOMAINS);
+ }
+ }
+ }
+
+ /*
+ * If the guest's AP configuration has not changed, then return
+ * indicating such.
+ */
+ if (bitmap_equal(matrix_mdev->shadow_crycb.apm, shadow_crycb.apm,
+ AP_DEVICES) &&
+ bitmap_equal(matrix_mdev->shadow_crycb.aqm, shadow_crycb.aqm,
+ AP_DOMAINS) &&
+ bitmap_equal(matrix_mdev->shadow_crycb.adm, shadow_crycb.adm,
+ AP_DOMAINS))
+ return false;
+
+ /*
+ * Copy the changes to the guest's CRYCB, then return indicating that
+ * the guest's AP configuration has changed.
+ */
+ memcpy(&matrix_mdev->shadow_crycb, &shadow_crycb, sizeof(shadow_crycb));
+
+ return true;
+}
+
/**
* vfio_ap_mdev_qlinks_for_apid
*
@@ -1203,9 +1330,11 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
if (ret)
return NOTIFY_DONE;
- memcpy(&matrix_mdev->shadow_crycb, &matrix_mdev->matrix,
- sizeof(matrix_mdev->shadow_crycb));
- vfio_ap_mdev_commit_crycb(matrix_mdev);
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev))
+ return NOTIFY_DONE;
+
+ if (vfio_ap_mdev_configure_crycb(matrix_mdev))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
return NOTIFY_OK;
}
@@ -1315,6 +1444,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
mutex_lock(&matrix_dev->lock);
if (matrix_mdev->kvm) {
kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+ vfio_ap_matrix_clear(&matrix_mdev->shadow_crycb);
matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
vfio_ap_mdev_reset_queues(mdev);
kvm_put_kvm(matrix_mdev->kvm);
--
2.21.1
From: Harald Freudenberger <[email protected]>
This patch intruduces an extension to the ap bus to notify drivers
on crypto config changed and bus scan complete events.
Two new callbacks are introduced for ap_drivers:
void (*on_config_changed)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
void (*on_scan_complete)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
Both callbacks are optional. Both callbacks are only triggered
when QCI information is available (facility bit 12):
* The on_config_changed callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan. This is done by storing
an old and current QCI info struct and comparing them. If there is any
difference, the callback is invoked.
Note that when the AP bus scan detects that AP adapters or domains have
been removed from the host's AP configuration, it will remove the
associated devices from the AP bus subsystem's device model. This
callback gives the device driver a chance to respond to the removal
of the AP devices in bulk rather than one at a time as its remove
callback is invoked. It will also allow the device driver to do any
any cleanup prior to giving control back to the bus piecemeal. This is
particularly important for the vfio_ap driver because there may be
guests using the queues at the time.
* The on_scan_complete callback is invoked after the ap bus scan is
complete if the host AP configuration data has changed.
Note that when the AP bus scan detects that adapters or domains have
been added to the host's configuration, it will create new devices in
the AP bus subsystem's device model. This callback also allows the driver
to process all of the new devices in bulk.
Please note that changes to the apmask and aqmask do not trigger
these two callbacks since the bus scan function is not invoked by changes
to those masks.
Signed-off-by: Harald Freudenberger <[email protected]>
---
drivers/s390/crypto/ap_bus.c | 157 +++++++++++++++++++++++++++--------
drivers/s390/crypto/ap_bus.h | 13 +++
2 files changed, 134 insertions(+), 36 deletions(-)
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index af15c095e76a..9b3fc4ea9c4d 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -71,8 +71,12 @@ struct ap_perms ap_perms;
EXPORT_SYMBOL(ap_perms);
DEFINE_MUTEX(ap_perms_mutex);
EXPORT_SYMBOL(ap_perms_mutex);
+DEFINE_MUTEX(ap_config_lock);
+
+/* current and old qci info structs */
+static struct ap_config_info *ap_config_info;
+static struct ap_config_info *ap_old_config_info;
-static struct ap_config_info *ap_configuration;
static bool initialised;
/*
@@ -188,8 +192,8 @@ static int ap_apft_available(void)
*/
static inline int ap_qact_available(void)
{
- if (ap_configuration)
- return ap_configuration->qact;
+ if (ap_config_info)
+ return ap_config_info->qact;
return 0;
}
@@ -218,13 +222,15 @@ static void ap_init_configuration(void)
if (!ap_configuration_available())
return;
- ap_configuration = kzalloc(sizeof(*ap_configuration), GFP_KERNEL);
- if (!ap_configuration)
- return;
- if (ap_query_configuration(ap_configuration) != 0) {
- kfree(ap_configuration);
- ap_configuration = NULL;
+ /* allocate current qci info struct */
+ ap_config_info = kzalloc(sizeof(*ap_config_info), GFP_KERNEL);
+ if (!ap_config_info)
return;
+
+ /* fetch qci info into the current qci info struct */
+ if (ap_query_configuration(ap_config_info)) {
+ kfree(ap_config_info);
+ ap_config_info = NULL;
}
}
@@ -247,10 +253,10 @@ static inline int ap_test_config(unsigned int *field, unsigned int nr)
*/
static inline int ap_test_config_card_id(unsigned int id)
{
- if (!ap_configuration) /* QCI not supported */
- /* only ids 0...3F may be probed */
+ if (!ap_config_info)
+ /* QCI not available, only ids 0...3F may be probed */
return id < 0x40 ? 1 : 0;
- return ap_test_config(ap_configuration->apm, id);
+ return ap_test_config(ap_config_info->apm, id);
}
/*
@@ -264,9 +270,9 @@ static inline int ap_test_config_card_id(unsigned int id)
*/
int ap_test_config_usage_domain(unsigned int domain)
{
- if (!ap_configuration) /* QCI not supported */
+ if (!ap_config_info) /* QCI not supported */
return domain < 16;
- return ap_test_config(ap_configuration->aqm, domain);
+ return ap_test_config(ap_config_info->aqm, domain);
}
EXPORT_SYMBOL(ap_test_config_usage_domain);
@@ -280,9 +286,9 @@ EXPORT_SYMBOL(ap_test_config_usage_domain);
*/
int ap_test_config_ctrl_domain(unsigned int domain)
{
- if (!ap_configuration) /* QCI not supported */
+ if (!ap_config_info) /* QCI not supported */
return 0;
- return ap_test_config(ap_configuration->adm, domain);
+ return ap_test_config(ap_config_info->adm, domain);
}
EXPORT_SYMBOL(ap_test_config_ctrl_domain);
@@ -1052,45 +1058,45 @@ static BUS_ATTR_RW(ap_domain);
static ssize_t ap_control_domain_mask_show(struct bus_type *bus, char *buf)
{
- if (!ap_configuration) /* QCI not supported */
+ if (!ap_config_info) /* QCI not supported */
return snprintf(buf, PAGE_SIZE, "not supported\n");
return snprintf(buf, PAGE_SIZE,
"0x%08x%08x%08x%08x%08x%08x%08x%08x\n",
- ap_configuration->adm[0], ap_configuration->adm[1],
- ap_configuration->adm[2], ap_configuration->adm[3],
- ap_configuration->adm[4], ap_configuration->adm[5],
- ap_configuration->adm[6], ap_configuration->adm[7]);
+ ap_config_info->adm[0], ap_config_info->adm[1],
+ ap_config_info->adm[2], ap_config_info->adm[3],
+ ap_config_info->adm[4], ap_config_info->adm[5],
+ ap_config_info->adm[6], ap_config_info->adm[7]);
}
static BUS_ATTR_RO(ap_control_domain_mask);
static ssize_t ap_usage_domain_mask_show(struct bus_type *bus, char *buf)
{
- if (!ap_configuration) /* QCI not supported */
+ if (!ap_config_info) /* QCI not supported */
return snprintf(buf, PAGE_SIZE, "not supported\n");
return snprintf(buf, PAGE_SIZE,
"0x%08x%08x%08x%08x%08x%08x%08x%08x\n",
- ap_configuration->aqm[0], ap_configuration->aqm[1],
- ap_configuration->aqm[2], ap_configuration->aqm[3],
- ap_configuration->aqm[4], ap_configuration->aqm[5],
- ap_configuration->aqm[6], ap_configuration->aqm[7]);
+ ap_config_info->aqm[0], ap_config_info->aqm[1],
+ ap_config_info->aqm[2], ap_config_info->aqm[3],
+ ap_config_info->aqm[4], ap_config_info->aqm[5],
+ ap_config_info->aqm[6], ap_config_info->aqm[7]);
}
static BUS_ATTR_RO(ap_usage_domain_mask);
static ssize_t ap_adapter_mask_show(struct bus_type *bus, char *buf)
{
- if (!ap_configuration) /* QCI not supported */
+ if (!ap_config_info) /* QCI not supported */
return snprintf(buf, PAGE_SIZE, "not supported\n");
return snprintf(buf, PAGE_SIZE,
"0x%08x%08x%08x%08x%08x%08x%08x%08x\n",
- ap_configuration->apm[0], ap_configuration->apm[1],
- ap_configuration->apm[2], ap_configuration->apm[3],
- ap_configuration->apm[4], ap_configuration->apm[5],
- ap_configuration->apm[6], ap_configuration->apm[7]);
+ ap_config_info->apm[0], ap_config_info->apm[1],
+ ap_config_info->apm[2], ap_config_info->apm[3],
+ ap_config_info->apm[4], ap_config_info->apm[5],
+ ap_config_info->apm[6], ap_config_info->apm[7]);
}
static BUS_ATTR_RO(ap_adapter_mask);
@@ -1178,7 +1184,7 @@ static ssize_t ap_max_domain_id_show(struct bus_type *bus, char *buf)
{
int max_domain_id;
- if (ap_configuration)
+ if (ap_config_info)
max_domain_id = ap_max_domain_id ? : -1;
else
max_domain_id = 15;
@@ -1481,6 +1487,50 @@ static int ap_get_compatible_type(ap_qid_t qid, int rawtype, unsigned int func)
return comp_type;
}
+/* Helper function for notify_config_changed */
+static int __drv_notify_config_changed(struct device_driver *drv, void *data)
+{
+ struct ap_driver *ap_drv = to_ap_drv(drv);
+
+ if (try_module_get(drv->owner)) {
+ if (ap_drv->on_config_changed)
+ ap_drv->on_config_changed(ap_config_info,
+ ap_old_config_info);
+ module_put(drv->owner);
+ }
+
+ return 0;
+}
+
+/* Notify all drivers about an qci config change */
+static inline void notify_config_changed(void)
+{
+ bus_for_each_drv(&ap_bus_type, NULL, NULL,
+ __drv_notify_config_changed);
+}
+
+/* Helper function for notify_scan_complete */
+static int __drv_notify_scan_complete(struct device_driver *drv, void *data)
+{
+ struct ap_driver *ap_drv = to_ap_drv(drv);
+
+ if (try_module_get(drv->owner)) {
+ if (ap_drv->on_scan_complete)
+ ap_drv->on_scan_complete(ap_config_info,
+ ap_old_config_info);
+ module_put(drv->owner);
+ }
+
+ return 0;
+}
+
+/* Notify all drivers about bus scan complete */
+static inline void notify_scan_complete(void)
+{
+ bus_for_each_drv(&ap_bus_type, NULL, NULL,
+ __drv_notify_scan_complete);
+}
+
/*
* Helper function to be used with bus_find_dev
* matches for the card device with the given id
@@ -1663,23 +1713,57 @@ static void _ap_scan_bus_adapter(int id)
put_device(&ac->ap_dev.device);
}
+static int ap_config_changed(void)
+{
+ int cfg_chg = 0;
+
+ if (ap_config_info) {
+ if (!ap_old_config_info) {
+ ap_old_config_info = kzalloc(
+ sizeof(*ap_old_config_info), GFP_KERNEL);
+ if (!ap_old_config_info)
+ return 0;
+ } else {
+ memcpy(ap_old_config_info, ap_config_info,
+ sizeof(struct ap_config_info));
+ }
+ ap_query_configuration(ap_config_info);
+ cfg_chg = memcmp(ap_config_info,
+ ap_old_config_info,
+ sizeof(struct ap_config_info)) != 0;
+ }
+
+ return cfg_chg;
+}
+
/**
* ap_scan_bus(): Scan the AP bus for new devices
* Runs periodically, workqueue timer (ap_config_time)
*/
static void ap_scan_bus(struct work_struct *unused)
{
- int id;
+ int id, config_changed = 0;
AP_DBF(DBF_DEBUG, "%s running\n", __func__);
- ap_query_configuration(ap_configuration);
+ mutex_lock(&ap_config_lock);
+
+ /* config change notify */
+ config_changed = ap_config_changed();
+ if (config_changed)
+ notify_config_changed();
ap_select_domain();
/* loop over all possible adapters */
for (id = 0; id < AP_DEVICES; id++)
_ap_scan_bus_adapter(id);
+ /* scan complete notify */
+ if (config_changed)
+ notify_scan_complete();
+
+ mutex_unlock(&ap_config_lock);
+
/* check if there is at least one queue available with default domain */
if (ap_domain_index >= 0) {
struct device *dev =
@@ -1761,7 +1845,7 @@ static int __init ap_module_init(void)
/* Get AP configuration data if available */
ap_init_configuration();
- if (ap_configuration)
+ if (ap_config_info)
max_domain_id =
ap_max_domain_id ? ap_max_domain_id : AP_DOMAINS - 1;
else
@@ -1841,7 +1925,8 @@ static int __init ap_module_init(void)
out:
if (ap_using_interrupts())
unregister_adapter_interrupt(&ap_airq);
- kfree(ap_configuration);
+ kfree(ap_config_info);
+ kfree(ap_old_config_info);
return rc;
}
device_initcall(ap_module_init);
diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h
index 86bf5e224ba4..c94c97c6ea09 100644
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -139,6 +139,19 @@ struct ap_driver {
void (*suspend)(struct ap_device *);
void (*resume)(struct ap_device *);
bool (*in_use)(unsigned long *apm, unsigned long *aqm);
+
+ /*
+ * Called at the start of the ap bus scan function when
+ * the crypto config information (qci) has changed.
+ */
+ void (*on_config_changed)(struct ap_config_info *new_config_info,
+ struct ap_config_info *old_config_info);
+ /*
+ * Called at the end of the ap bus scan function when
+ * the crypto config information (qci) has changed.
+ */
+ void (*on_scan_complete)(struct ap_config_info *new_config_info,
+ struct ap_config_info *old_config_info);
};
#define to_ap_drv(x) container_of((x), struct ap_driver, driver)
--
2.21.1
Introduces a new driver callback to prevent a root user from unbinding
an AP queue from its device driver if the queue is in use. The intent of
this callback is to provide a driver with the means to prevent a root user
from inadvertently taking a queue away from a guest and giving it to the
host while the guest is still using it. The callback will
be invoked whenever a change to the AP bus's sysfs apmask or aqmask
attributes would result in one or more AP queues being removed from its
driver. If the callback responds in the affirmative for any driver
queried, the change to the apmask or aqmask will be rejected with a device
in use error.
For this patch, only non-default drivers will be queried. Currently,
there is only one non-default driver, the vfio_ap device driver. The
vfio_ap device driver manages AP queues passed through to one or more
guests and we don't want to unexpectedly take AP resources away from
guests which are most likely independently administered.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/ap_bus.c | 144 +++++++++++++++++++++++++++++++++--
drivers/s390/crypto/ap_bus.h | 4 +
2 files changed, 142 insertions(+), 6 deletions(-)
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 5256e3ce84e5..af15c095e76a 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -35,6 +35,7 @@
#include <linux/mod_devicetable.h>
#include <linux/debugfs.h>
#include <linux/ctype.h>
+#include <linux/module.h>
#include "ap_bus.h"
#include "ap_debug.h"
@@ -995,9 +996,11 @@ int ap_parse_mask_str(const char *str,
newmap = kmalloc(size, GFP_KERNEL);
if (!newmap)
return -ENOMEM;
- if (mutex_lock_interruptible(lock)) {
- kfree(newmap);
- return -ERESTARTSYS;
+ if (lock) {
+ if (mutex_lock_interruptible(lock)) {
+ kfree(newmap);
+ return -ERESTARTSYS;
+ }
}
if (*str == '+' || *str == '-') {
@@ -1009,7 +1012,10 @@ int ap_parse_mask_str(const char *str,
}
if (rc == 0)
memcpy(bitmap, newmap, size);
- mutex_unlock(lock);
+
+ if (lock)
+ mutex_unlock(lock);
+
kfree(newmap);
return rc;
}
@@ -1196,12 +1202,75 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf)
return rc;
}
+int __verify_card_reservations(struct device_driver *drv, void *data)
+{
+ int rc = 0;
+ struct ap_driver *ap_drv = to_ap_drv(drv);
+ unsigned long *newapm = (unsigned long *)data;
+
+ /*
+ * If the reserved bits do not identify cards reserved for use by the
+ * non-default driver, there is no need to verify the driver is using
+ * the queues.
+ */
+ if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
+ return 0;
+
+ /* The non-default driver's module must be loaded */
+ if (!try_module_get(drv->owner))
+ return 0;
+
+ if (ap_drv->in_use)
+ if (ap_drv->in_use(newapm, ap_perms.aqm))
+ rc = -EADDRINUSE;
+
+ module_put(drv->owner);
+
+ return rc;
+}
+
+static int apmask_commit(unsigned long *newapm)
+{
+ int rc;
+ unsigned long reserved[BITS_TO_LONGS(AP_DEVICES)];
+
+ /*
+ * Check if any bits in the apmask have been set which will
+ * result in queues being removed from non-default drivers
+ */
+ if (bitmap_andnot(reserved, newapm, ap_perms.apm, AP_DEVICES)) {
+ rc = bus_for_each_drv(&ap_bus_type, NULL, reserved,
+ __verify_card_reservations);
+ if (rc)
+ return rc;
+ }
+
+ memcpy(ap_perms.apm, newapm, APMASKSIZE);
+
+ return 0;
+}
+
static ssize_t apmask_store(struct bus_type *bus, const char *buf,
size_t count)
{
int rc;
+ unsigned long newapm[BITS_TO_LONGS(AP_DEVICES)];
+
+ if (mutex_lock_interruptible(&ap_perms_mutex))
+ return -ERESTARTSYS;
+
+ memcpy(newapm, ap_perms.apm, APMASKSIZE);
+
+ rc = ap_parse_mask_str(buf, newapm, AP_DEVICES, NULL);
+ if (rc)
+ goto done;
+
+ rc = apmask_commit(newapm);
+ if (rc)
+ goto done;
- rc = ap_parse_mask_str(buf, ap_perms.apm, AP_DEVICES, &ap_perms_mutex);
+done:
+ mutex_unlock(&ap_perms_mutex);
if (rc)
return rc;
@@ -1227,12 +1296,75 @@ static ssize_t aqmask_show(struct bus_type *bus, char *buf)
return rc;
}
+int __verify_queue_reservations(struct device_driver *drv, void *data)
+{
+ int rc = 0;
+ struct ap_driver *ap_drv = to_ap_drv(drv);
+ unsigned long *newaqm = (unsigned long *)data;
+
+ /*
+ * If the reserved bits do not identify queues reserved for use by the
+ * non-default driver, there is no need to verify the driver is using
+ * the queues.
+ */
+ if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
+ return 0;
+
+ /* The non-default driver's module must be loaded */
+ if (!try_module_get(drv->owner))
+ return 0;
+
+ if (ap_drv->in_use)
+ if (ap_drv->in_use(ap_perms.apm, newaqm))
+ rc = -EADDRINUSE;
+
+ module_put(drv->owner);
+
+ return rc;
+}
+
+static int aqmask_commit(unsigned long *newaqm)
+{
+ int rc;
+ unsigned long reserved[BITS_TO_LONGS(AP_DOMAINS)];
+
+ /*
+ * Check if any bits in the aqmask have been set which will
+ * result in queues being removed from non-default drivers
+ */
+ if (bitmap_andnot(reserved, newaqm, ap_perms.aqm, AP_DOMAINS)) {
+ rc = bus_for_each_drv(&ap_bus_type, NULL, reserved,
+ __verify_queue_reservations);
+ if (rc)
+ return rc;
+ }
+
+ memcpy(ap_perms.aqm, newaqm, AQMASKSIZE);
+
+ return 0;
+}
+
static ssize_t aqmask_store(struct bus_type *bus, const char *buf,
size_t count)
{
int rc;
+ unsigned long newaqm[BITS_TO_LONGS(AP_DEVICES)];
- rc = ap_parse_mask_str(buf, ap_perms.aqm, AP_DOMAINS, &ap_perms_mutex);
+ if (mutex_lock_interruptible(&ap_perms_mutex))
+ return -ERESTARTSYS;
+
+ memcpy(newaqm, ap_perms.aqm, AQMASKSIZE);
+
+ rc = ap_parse_mask_str(buf, newaqm, AP_DOMAINS, NULL);
+ if (rc)
+ goto done;
+
+ rc = aqmask_commit(newaqm);
+ if (rc)
+ goto done;
+
+done:
+ mutex_unlock(&ap_perms_mutex);
if (rc)
return rc;
diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h
index 4348fdff1c61..86bf5e224ba4 100644
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -138,6 +138,7 @@ struct ap_driver {
void (*remove)(struct ap_device *);
void (*suspend)(struct ap_device *);
void (*resume)(struct ap_device *);
+ bool (*in_use)(unsigned long *apm, unsigned long *aqm);
};
#define to_ap_drv(x) container_of((x), struct ap_driver, driver)
@@ -266,6 +267,9 @@ void ap_queue_init_state(struct ap_queue *aq);
struct ap_card *ap_card_create(int id, int queue_depth, int raw_device_type,
int comp_device_type, unsigned int functions);
+#define APMASKSIZE (BITS_TO_LONGS(AP_DEVICES) * sizeof(unsigned long))
+#define AQMASKSIZE (BITS_TO_LONGS(AP_DOMAINS) * sizeof(unsigned long))
+
struct ap_perms {
unsigned long ioctlm[BITS_TO_LONGS(AP_IOCTLS)];
unsigned long apm[BITS_TO_LONGS(AP_DEVICES)];
--
2.21.1
The matrix of adapters and domains configured in a guest's CRYCB may
differ from the matrix of adapters and domains assigned to the matrix mdev,
so this patch introduces a sysfs attribute to display the CRYCB of a guest
using the matrix mdev. For a matrix mdev denoted by $uuid, the crycb for a
guest using the matrix mdev can be displayed as follows:
cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix
If a guest is not using the matrix mdev at the time the crycb is displayed,
an error (ENODEV) will be returned.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 58 +++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index b8b678032ab7..beb2e68a2b1c 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1041,6 +1041,63 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RO(matrix);
+static ssize_t guest_matrix_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ char *bufpos = buf;
+ unsigned long apid;
+ unsigned long apqi;
+ unsigned long apid1;
+ unsigned long apqi1;
+ unsigned long napm_bits = matrix_mdev->shadow_crycb.apm_max + 1;
+ unsigned long naqm_bits = matrix_mdev->shadow_crycb.aqm_max + 1;
+ int nchars = 0;
+ int n;
+
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev))
+ return -ENODEV;
+
+ apid1 = find_first_bit_inv(matrix_mdev->shadow_crycb.apm, napm_bits);
+ apqi1 = find_first_bit_inv(matrix_mdev->shadow_crycb.aqm, naqm_bits);
+
+ mutex_lock(&matrix_dev->lock);
+
+ if ((apid1 < napm_bits) && (apqi1 < naqm_bits)) {
+ for_each_set_bit_inv(apid, matrix_mdev->shadow_crycb.apm,
+ napm_bits) {
+ for_each_set_bit_inv(apqi,
+ matrix_mdev->shadow_crycb.aqm,
+ naqm_bits) {
+ n = sprintf(bufpos, "%02lx.%04lx\n", apid,
+ apqi);
+ bufpos += n;
+ nchars += n;
+ }
+ }
+ } else if (apid1 < napm_bits) {
+ for_each_set_bit_inv(apid, matrix_mdev->shadow_crycb.apm,
+ napm_bits) {
+ n = sprintf(bufpos, "%02lx.\n", apid);
+ bufpos += n;
+ nchars += n;
+ }
+ } else if (apqi1 < naqm_bits) {
+ for_each_set_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm,
+ naqm_bits) {
+ n = sprintf(bufpos, ".%04lx\n", apqi);
+ bufpos += n;
+ nchars += n;
+ }
+ }
+
+ mutex_unlock(&matrix_dev->lock);
+
+ return nchars;
+}
+static DEVICE_ATTR_RO(guest_matrix);
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
@@ -1050,6 +1107,7 @@ static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_unassign_control_domain.attr,
&dev_attr_control_domains.attr,
&dev_attr_matrix.attr,
+ &dev_attr_guest_matrix.attr,
NULL,
};
--
2.21.1
Currently, a vfio_ap_queue struct is created for every queue device probed
which is then added to a hash table of all queues probed by the vfio_ap
device driver. This list could get quite large making retrieval
expensive. In order to make retrieval of a vfio_ap_queue struct more
efficient when we already have a pointer to the ap_matrix_mdev to which the
queue's APQN is assigned, let's go ahead and add a link from the
ap_matrix_mdev struct to the vfio_ap_queue struct.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 29 +++++++++++++++++++++++----
drivers/s390/crypto/vfio_ap_private.h | 2 ++
2 files changed, 27 insertions(+), 4 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 25b7d978e3fd..6ee1ebe3f207 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -38,6 +38,19 @@ struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
return NULL;
}
+struct vfio_ap_queue *vfio_ap_get_mdev_queue(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqn)
+{
+ struct vfio_ap_queue *q;
+
+ hash_for_each_possible(matrix_mdev->qtable, q, mdev_qnode, apqn) {
+ if (q->apqn == apqn)
+ return q;
+ }
+
+ return NULL;
+}
+
/**
* vfio_ap_wait_for_irqclear
* @apqn: The AP Queue number
@@ -325,6 +338,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
matrix_mdev->mdev = mdev;
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_crycb);
+ hash_init(matrix_mdev->qtable);
mdev_set_drvdata(mdev, matrix_mdev);
matrix_mdev->pqap_hook.hook = handle_pqap;
matrix_mdev->pqap_hook.owner = THIS_MODULE;
@@ -594,7 +608,7 @@ static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
* filter the APQI.
*/
apqn = AP_MKQID(apid, apqi);
- if (!vfio_ap_get_queue(apqn)) {
+ if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn)) {
if (filter_apids)
clear_bit_inv(apid, shadow_crycb->apm);
else
@@ -624,7 +638,6 @@ static bool vfio_ap_mdev_configure_crycb(struct ap_matrix_mdev *matrix_mdev)
vfio_ap_matrix_init(&matrix_dev->info, &shadow_crycb);
napm = bitmap_weight(matrix_mdev->matrix.apm, AP_DEVICES);
naqm = bitmap_weight(matrix_mdev->matrix.aqm, AP_DOMAINS);
-
/*
* If there are no APIDs or no APQIs assigned to the matrix mdev,
* then no APQNs shall be assigned to the guest CRYCB.
@@ -636,6 +649,7 @@ static bool vfio_ap_mdev_configure_crycb(struct ap_matrix_mdev *matrix_mdev)
*/
napm = vfio_ap_mdev_filter_matrix(matrix_mdev, &shadow_crycb,
true);
+
/*
* If there are no APQNs that can be assigned to the guest's
* CRYCB after filtering, then try filtering the APQIs.
@@ -697,8 +711,10 @@ static void vfio_ap_mdev_qlinks_for_apid(struct ap_matrix_mdev *matrix_mdev,
for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
matrix_mdev->matrix.aqm_max + 1) {
q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
- if (q)
+ if (q) {
q->matrix_mdev = matrix_mdev;
+ hash_add(matrix_mdev->qtable, &q->mdev_qnode, q->apqn);
+ }
}
}
@@ -871,8 +887,10 @@ static void vfio_ap_mdev_qlinks_for_apqi(struct ap_matrix_mdev *matrix_mdev,
for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
matrix_mdev->matrix.apm_max + 1) {
q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
- if (q)
+ if (q) {
q->matrix_mdev = matrix_mdev;
+ hash_add(matrix_mdev->qtable, &q->mdev_qnode, q->apqn);
+ }
}
}
@@ -1536,6 +1554,7 @@ static void vfio_ap_mdev_for_queue(struct vfio_ap_queue *q)
if (test_bit_inv(apid, matrix_mdev->matrix.apm) &&
test_bit_inv(apqi, matrix_mdev->matrix.aqm)) {
q->matrix_mdev = matrix_mdev;
+ hash_add(matrix_mdev->qtable, &q->mdev_qnode, q->apqn);
break;
}
}
@@ -1573,6 +1592,8 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
vfio_ap_mdev_reset_queue(apid, apqi, 1);
vfio_ap_irq_disable(q);
hash_del(&q->qnode);
+ if (q->matrix_mdev)
+ hash_del(&q->mdev_qnode);
kfree(q);
mutex_unlock(&matrix_dev->lock);
}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 87cc270c3212..794c60a767d2 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -89,6 +89,7 @@ struct ap_matrix_mdev {
struct kvm *kvm;
struct kvm_s390_module_hook pqap_hook;
struct mdev_device *mdev;
+ DECLARE_HASHTABLE(qtable, 8);
};
extern int vfio_ap_mdev_register(void);
@@ -101,6 +102,7 @@ struct vfio_ap_queue {
#define VFIO_AP_ISC_INVALID 0xff
unsigned char saved_isc;
struct hlist_node qnode;
+ struct hlist_node mdev_qnode;
};
int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
--
2.21.1
Implements the driver callback invoked by the AP bus when the AP bus
scan has completed. Since this callback is invoked after binding the newly
added devices to their respective device drivers, the vfio_ap driver will
attempt to plug the adapters, domains and control domains into each guest
using a matrix mdev to which they are assigned. Keep in mind that an
adapter or domain can be plugged in only if each APQN with the APID of the
adapter or the APQI of the domain references a queue device bound to the
vfio_ap device driver. Consequently, not all newly added adapters and
domains will necessarily get hot plugged.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_drv.c | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 132 +++++++++++++++++++++++---
drivers/s390/crypto/vfio_ap_private.h | 2 +
3 files changed, 124 insertions(+), 11 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index 9f6c5d82dfb5..0ed557634302 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -179,6 +179,7 @@ static int __init vfio_ap_init(void)
vfio_ap_drv.in_use = vfio_ap_mdev_resource_in_use;
vfio_ap_drv.ids = ap_queue_ids;
vfio_ap_drv.on_config_changed = vfio_ap_on_cfg_changed;
+ vfio_ap_drv.on_scan_complete = vfio_ap_on_scan_complete;
ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME);
if (ret) {
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index f1dd77729dd9..ccc58daf82f6 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -489,6 +489,7 @@ static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
if (!test_bit_inv(apid, (unsigned long *)
matrix_dev->config_info.apm)) {
clear_bit_inv(apid, shadow_crycb->apm);
+
continue;
}
@@ -502,6 +503,7 @@ static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
if (!test_bit_inv(apqi, (unsigned long *)
matrix_dev->config_info.aqm)) {
clear_bit_inv(apqi, shadow_crycb->aqm);
+
continue;
}
@@ -515,11 +517,12 @@ static int vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev,
*/
apqn = AP_MKQID(apid, apqi);
if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn)) {
- if (filter_apids)
+ if (filter_apids) {
clear_bit_inv(apid, shadow_crycb->apm);
- else
- clear_bit_inv(apqi, shadow_crycb->aqm);
- break;
+ break;
+ }
+
+ clear_bit_inv(apqi, shadow_crycb->aqm);
}
}
@@ -541,7 +544,7 @@ static bool vfio_ap_mdev_configure_crycb(struct ap_matrix_mdev *matrix_mdev)
int napm, naqm;
struct ap_matrix shadow_crycb;
- vfio_ap_matrix_init(&matrix_dev->info, &shadow_crycb);
+ vfio_ap_matrix_init(&matrix_dev->config_info, &shadow_crycb);
napm = bitmap_weight(matrix_mdev->matrix.apm, AP_DEVICES);
naqm = bitmap_weight(matrix_mdev->matrix.aqm, AP_DOMAINS);
/*
@@ -569,6 +572,8 @@ static bool vfio_ap_mdev_configure_crycb(struct ap_matrix_mdev *matrix_mdev)
* matrix mdev after filtering the APQIs, then no APQNs
* shall be assigned to the guest's CRYCB.
*/
+ naqm = vfio_ap_mdev_filter_matrix(matrix_mdev,
+ &shadow_crycb, false);
if (naqm == 0) {
bitmap_clear(shadow_crycb.apm, 0, AP_DEVICES);
bitmap_clear(shadow_crycb.aqm, 0, AP_DOMAINS);
@@ -633,8 +638,8 @@ static bool vfio_ap_mdev_assign_apqis_4_apid(struct ap_matrix_mdev *matrix_mdev,
bitmap_copy(aqm, matrix_mdev->matrix.aqm, AP_DOMAINS);
for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) {
- if (!test_bit_inv(apqi,
- (unsigned long *) matrix_dev->info.aqm))
+ if (!test_bit_inv(apqi, (unsigned long *)
+ matrix_dev->config_info.aqm))
clear_bit_inv(apqi, aqm);
apqn = AP_MKQID(apid, apqi);
@@ -657,7 +662,7 @@ static bool vfio_ap_mdev_assign_guest_apid(struct ap_matrix_mdev *matrix_mdev,
unsigned long apqi, apqn;
if (!vfio_ap_mdev_has_crycb(matrix_mdev) ||
- !test_bit_inv(apid, (unsigned long *)matrix_dev->info.apm))
+ !test_bit_inv(apid, (unsigned long *)matrix_dev->config_info.apm))
return false;
if (bitmap_empty(matrix_mdev->shadow_crycb.aqm, AP_DOMAINS))
@@ -853,8 +858,8 @@ static bool vfio_ap_mdev_assign_apids_4_apqi(struct ap_matrix_mdev *matrix_mdev,
bitmap_copy(apm, matrix_mdev->matrix.apm, AP_DEVICES);
for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) {
- if (!test_bit_inv(apid,
- (unsigned long *) matrix_dev->info.apm))
+ if (!test_bit_inv(apid, (unsigned long *)
+ matrix_dev->config_info.apm))
clear_bit_inv(apqi, apm);
apqn = AP_MKQID(apid, apqi);
@@ -877,7 +882,7 @@ static bool vfio_ap_mdev_assign_guest_apqi(struct ap_matrix_mdev *matrix_mdev,
unsigned long apid, apqn;
if (!vfio_ap_mdev_has_crycb(matrix_mdev) ||
- !test_bit_inv(apqi, (unsigned long *)matrix_dev->info.aqm))
+ !test_bit_inv(apqi, (unsigned long *)matrix_dev->config_info.aqm))
return false;
if (bitmap_empty(matrix_mdev->shadow_crycb.apm, AP_DEVICES))
@@ -1673,6 +1678,16 @@ bool vfio_ap_mdev_unassign_apids(struct ap_matrix_mdev *matrix_mdev,
for_each_set_bit_inv(apid, apm_unassign, AP_DEVICES) {
unassigned |= vfio_ap_mdev_unassign_guest_apid(matrix_mdev,
apid);
+ /*
+ * If the APID is not assigned to the matrix mdev's shadow
+ * CRYCB, continue with the next APID.
+ */
+ if (!test_bit_inv(apid, matrix_mdev->shadow_crycb.apm))
+ continue;
+
+ /* Unassign the APID from the matrix mdev's shadow CRYCB */
+ clear_bit_inv(apid, matrix_mdev->shadow_crycb.apm);
+ unassigned = true;
}
return unassigned;
@@ -1706,6 +1721,17 @@ bool vfio_ap_mdev_unassign_apqis(struct ap_matrix_mdev *matrix_mdev,
for_each_set_bit_inv(apqi, aqm_unassign, AP_DOMAINS) {
unassigned |= vfio_ap_mdev_unassign_guest_apqi(matrix_mdev,
apqi);
+
+ /*
+ * If the APQI is not assigned to the matrix mdev's shadow
+ * CRYCB, continue with the next APQI
+ */
+ if (!test_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm))
+ continue;
+
+ /* Unassign the APQI from the matrix mdev's shadow CRYCB */
+ clear_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm);
+ unassigned = true;
}
return unassigned;
@@ -1767,3 +1793,87 @@ void vfio_ap_on_cfg_changed(struct ap_config_info *new_config_info,
}
mutex_unlock(&matrix_dev->lock);
}
+
+bool vfio_ap_mdev_assign_apids(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *apm_assign)
+{
+ unsigned long apid;
+ bool assigned = false;
+
+ for_each_set_bit_inv(apid, apm_assign, AP_DEVICES)
+ if (test_bit_inv(apid, matrix_mdev->matrix.apm))
+ if (vfio_ap_mdev_assign_guest_apid(matrix_mdev, apid))
+ assigned = true;
+
+ return assigned;
+}
+
+bool vfio_ap_mdev_assign_apqis(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *aqm_assign)
+{
+ unsigned long apqi;
+ bool assigned = false;
+
+ for_each_set_bit_inv(apqi, aqm_assign, AP_DOMAINS)
+ if (test_bit_inv(apqi, matrix_mdev->matrix.aqm))
+ if (vfio_ap_mdev_assign_guest_apqi(matrix_mdev, apqi))
+ assigned = true;
+
+ return assigned;
+}
+
+void vfio_ap_on_scan_complete(struct ap_config_info *new_config_info,
+ struct ap_config_info *old_config_info)
+{
+ struct ap_matrix_mdev *matrix_mdev;
+ DECLARE_BITMAP(apm_assign, AP_DEVICES);
+ DECLARE_BITMAP(aqm_assign, AP_DOMAINS);
+ int ap_add, aq_add;
+ bool assign;
+ unsigned long *cur_apm, *cur_aqm, *prev_apm, *prev_aqm;
+
+ /*
+ * If we are not in the middle of a host configuration change scan it is
+ * likely that the vfio_ap driver was loaded mid-scan, so let's handle
+ * this scenario by calling the vfio_ap_on_cfg_changed function which
+ * gets called at the start of an AP bus scan when the host AP
+ * configuration has changed.
+ */
+ if (!(matrix_dev->flags & AP_MATRIX_CFG_CHG))
+ vfio_ap_on_cfg_changed(new_config_info, old_config_info);
+
+ cur_apm = (unsigned long *)matrix_dev->config_info.apm;
+ cur_aqm = (unsigned long *)matrix_dev->config_info.aqm;
+
+ prev_apm = (unsigned long *)matrix_dev->config_info_prev.apm;
+ prev_aqm = (unsigned long *)matrix_dev->config_info_prev.aqm;
+
+ ap_add = bitmap_andnot(apm_assign, cur_apm, prev_apm, AP_DEVICES);
+ aq_add = bitmap_andnot(aqm_assign, cur_aqm, prev_aqm, AP_DOMAINS);
+
+ mutex_lock(&matrix_dev->lock);
+ list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) {
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev))
+ continue;
+
+ assign = false;
+
+ if (ap_add)
+ if (bitmap_intersects(matrix_mdev->matrix.apm,
+ apm_assign, AP_DEVICES))
+ assign |= vfio_ap_mdev_assign_apids(matrix_mdev,
+ apm_assign);
+
+ if (aq_add)
+ if (bitmap_intersects(matrix_mdev->matrix.aqm,
+ aqm_assign, AP_DOMAINS))
+ assign |= vfio_ap_mdev_assign_apqis(matrix_mdev,
+ aqm_assign);
+
+ if (assign)
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
+ }
+
+ matrix_dev->flags &= ~AP_MATRIX_CFG_CHG;
+ mutex_unlock(&matrix_dev->lock);
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 82abbf03781f..38974c37591a 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -114,5 +114,7 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue);
bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm);
void vfio_ap_on_cfg_changed(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
+void vfio_ap_on_scan_complete(struct ap_config_info *new_config_info,
+ struct ap_config_info *old_config_info);
#endif /* _VFIO_AP_PRIVATE_H_ */
--
2.21.1
Let's implement the callback to indicate when an APQN
is in use by the vfio_ap device driver. The callback is
invoked whenever a change to the apmask or aqmask would
result in one or more queue devices being removed from the driver. The
vfio_ap device driver will indicate a resource is in use
if the APQN of any of the queue devices to be removed are assigned to
any of the matrix mdevs under the driver's control.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_drv.c | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 47 +++++++++++++++++----------
drivers/s390/crypto/vfio_ap_private.h | 2 ++
3 files changed, 33 insertions(+), 17 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index e9c226c0730e..5197a1fe14d4 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -174,6 +174,7 @@ static int __init vfio_ap_init(void)
memset(&vfio_ap_drv, 0, sizeof(vfio_ap_drv));
vfio_ap_drv.probe = vfio_ap_queue_dev_probe;
vfio_ap_drv.remove = vfio_ap_queue_dev_remove;
+ vfio_ap_drv.in_use = vfio_ap_mdev_resource_in_use;
vfio_ap_drv.ids = ap_queue_ids;
ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME);
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 00699bd72d2b..8ece0d52ff4c 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -500,7 +500,9 @@ vfio_ap_mdev_verify_queues_reserved_for_apid(struct ap_matrix_mdev *matrix_mdev,
*
* Returns 0 if the APQNs are not shared, otherwise; returns -EADDRINUSE.
*/
-static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
+static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *mdev_apm,
+ unsigned long *mdev_aqm)
{
struct ap_matrix_mdev *lstdev;
DECLARE_BITMAP(apm, AP_DEVICES);
@@ -517,12 +519,10 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
* We work on full longs, as we can only exclude the leftover
* bits in non-inverse order. The leftover is all zeros.
*/
- if (!bitmap_and(apm, matrix_mdev->matrix.apm,
- lstdev->matrix.apm, AP_DEVICES))
+ if (!bitmap_and(apm, mdev_apm, lstdev->matrix.apm, AP_DEVICES))
continue;
- if (!bitmap_and(aqm, matrix_mdev->matrix.aqm,
- lstdev->matrix.aqm, AP_DOMAINS))
+ if (!bitmap_and(aqm, mdev_aqm, lstdev->matrix.aqm, AP_DOMAINS))
continue;
return -EADDRINUSE;
@@ -594,6 +594,7 @@ static ssize_t assign_adapter_store(struct device *dev,
{
int ret;
unsigned long apid;
+ DECLARE_BITMAP(apm, AP_DEVICES);
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
@@ -619,18 +620,18 @@ static ssize_t assign_adapter_store(struct device *dev,
if (ret)
goto done;
- set_bit_inv(apid, matrix_mdev->matrix.apm);
+ memset(apm, 0, sizeof(apm));
+ set_bit_inv(apid, apm);
- ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev);
+ ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev, apm,
+ matrix_mdev->matrix.aqm);
if (ret)
- goto share_err;
+ goto done;
+ set_bit_inv(apid, matrix_mdev->matrix.apm);
vfio_ap_mdev_qlinks_for_apid(matrix_mdev, apid);
ret = count;
- goto done;
-share_err:
- clear_bit_inv(apid, matrix_mdev->matrix.apm);
done:
mutex_unlock(&matrix_dev->lock);
@@ -767,6 +768,7 @@ static ssize_t assign_domain_store(struct device *dev,
{
int ret;
unsigned long apqi;
+ DECLARE_BITMAP(aqm, AP_DOMAINS);
struct mdev_device *mdev = mdev_from_dev(dev);
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
unsigned long max_apqi = matrix_mdev->matrix.aqm_max;
@@ -787,18 +789,18 @@ static ssize_t assign_domain_store(struct device *dev,
if (ret)
goto done;
- set_bit_inv(apqi, matrix_mdev->matrix.aqm);
+ memset(aqm, 0, sizeof(aqm));
+ set_bit_inv(apqi, aqm);
- ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev);
+ ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev,
+ matrix_mdev->matrix.apm, aqm);
if (ret)
- goto share_err;
+ goto done;
+ set_bit_inv(apqi, matrix_mdev->matrix.aqm);
vfio_ap_mdev_qlinks_for_apqi(matrix_mdev, apqi);
ret = count;
- goto done;
-share_err:
- clear_bit_inv(apqi, matrix_mdev->matrix.aqm);
done:
mutex_unlock(&matrix_dev->lock);
@@ -1369,3 +1371,14 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
kfree(q);
mutex_unlock(&matrix_dev->lock);
}
+
+bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
+{
+ bool in_use;
+
+ mutex_lock(&matrix_dev->lock);
+ in_use = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm) ? true : false;
+ mutex_unlock(&matrix_dev->lock);
+
+ return in_use;
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index e1f8c82cc55d..4b6e144bab17 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -105,4 +105,6 @@ struct vfio_ap_queue {
int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
void vfio_ap_mdev_remove_queue(struct ap_queue *queue);
+bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm);
+
#endif /* _VFIO_AP_PRIVATE_H_ */
--
2.21.1
A vfio_ap_queue structure is created for each queue device probed. To
ensure that the matrix mdev to which a queue's APQN is assigned is linked
to the queue structure as long as the queue device is bound to the vfio_ap
device driver, let's go ahead and manage these links when the queue device
is probed and removed as well as whenever an adapter or domain is assigned
to or unassigned from the matrix mdev.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++++++++++++---
1 file changed, 70 insertions(+), 5 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 134860934fe7..00699bd72d2b 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -148,7 +148,6 @@ struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
status.response_code);
end_free:
vfio_ap_free_aqic_resources(q);
- q->matrix_mdev = NULL;
return status;
}
@@ -250,7 +249,6 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
struct vfio_ap_queue *q;
struct ap_queue_status qstatus = {
.response_code = AP_RESPONSE_Q_NOT_AVAIL, };
- struct ap_matrix_mdev *matrix_mdev;
/* If we do not use the AIV facility just go to userland */
if (!(vcpu->arch.sie_block->eca & ECA_AIV))
@@ -261,14 +259,11 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
if (!vcpu->kvm->arch.crypto.pqap_hook)
goto out_unlock;
- matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
- struct ap_matrix_mdev, pqap_hook);
q = vfio_ap_get_queue(apqn);
if (!q)
goto out_unlock;
- q->matrix_mdev = matrix_mdev;
status = vcpu->run->s.regs.gprs[1];
/* If IR bit(16) is set we enable the interrupt */
@@ -536,6 +531,31 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
return 0;
}
+/**
+ * vfio_ap_mdev_qlinks_for_apid
+ *
+ * @matrix_mdev: a matrix mediated device
+ * @apqi: the APID of one or more APQNs assigned to @matrix_mdev
+ *
+ * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
+ * device driver with an APQN assigned to @matrix_mdev with the specified @apid.
+ *
+ * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
+ */
+static void vfio_ap_mdev_qlinks_for_apid(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ unsigned long apqi;
+ struct vfio_ap_queue *q;
+
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
+ matrix_mdev->matrix.aqm_max + 1) {
+ q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
+ if (q)
+ q->matrix_mdev = matrix_mdev;
+ }
+}
+
/**
* assign_adapter_store
*
@@ -605,6 +625,7 @@ static ssize_t assign_adapter_store(struct device *dev,
if (ret)
goto share_err;
+ vfio_ap_mdev_qlinks_for_apid(matrix_mdev, apid);
ret = count;
goto done;
@@ -656,6 +677,7 @@ static ssize_t unassign_adapter_store(struct device *dev,
mutex_lock(&matrix_dev->lock);
clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm);
+ vfio_ap_mdev_qlinks_for_apid(NULL, apid);
mutex_unlock(&matrix_dev->lock);
return count;
@@ -682,6 +704,31 @@ vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev,
return 0;
}
+/**
+ * vfio_ap_mdev_qlinks_for_apqi
+ *
+ * @matrix_mdev: a matrix mediated device
+ * @apqi: the APQI of one or more APQNs assigned to @matrix_mdev
+ *
+ * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
+ * device driver with an APQN assigned to @matrix_mdev with the specified @apqi.
+ *
+ * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
+ */
+static void vfio_ap_mdev_qlinks_for_apqi(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ unsigned long apid;
+ struct vfio_ap_queue *q;
+
+ for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
+ matrix_mdev->matrix.apm_max + 1) {
+ q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
+ if (q)
+ q->matrix_mdev = matrix_mdev;
+ }
+}
+
/**
* assign_domain_store
*
@@ -746,6 +793,7 @@ static ssize_t assign_domain_store(struct device *dev,
if (ret)
goto share_err;
+ vfio_ap_mdev_qlinks_for_apqi(matrix_mdev, apqi);
ret = count;
goto done;
@@ -798,6 +846,7 @@ static ssize_t unassign_domain_store(struct device *dev,
mutex_lock(&matrix_dev->lock);
clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);
+ vfio_ap_mdev_qlinks_for_apqi(NULL, apqi);
mutex_unlock(&matrix_dev->lock);
return count;
@@ -1270,6 +1319,21 @@ void vfio_ap_mdev_unregister(void)
mdev_unregister_device(&matrix_dev->device);
}
+static void vfio_ap_mdev_for_queue(struct vfio_ap_queue *q)
+{
+ unsigned long apid = AP_QID_CARD(q->apqn);
+ unsigned long apqi = AP_QID_QUEUE(q->apqn);
+ struct ap_matrix_mdev *matrix_mdev;
+
+ list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) {
+ if (test_bit_inv(apid, matrix_mdev->matrix.apm) &&
+ test_bit_inv(apqi, matrix_mdev->matrix.aqm)) {
+ q->matrix_mdev = matrix_mdev;
+ break;
+ }
+ }
+}
+
int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
{
struct vfio_ap_queue *q;
@@ -1282,6 +1346,7 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
dev_set_drvdata(&queue->ap_dev.device, q);
q->apqn = queue->qid;
q->saved_isc = VFIO_AP_ISC_INVALID;
+ vfio_ap_mdev_for_queue(q);
hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
mutex_unlock(&matrix_dev->lock);
--
2.21.1
Rather than looping over potentially 65535 objects, let's store the
structures for caching information about queue devices bound to the
vfio_ap device driver in a hash table keyed by APQN.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
drivers/s390/crypto/vfio_ap_private.h | 10 ++-
3 files changed, 60 insertions(+), 68 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index be2520cc010b..e9c226c0730e 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -51,15 +51,9 @@ MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
*/
static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
{
- struct vfio_ap_queue *q;
-
- q = kzalloc(sizeof(*q), GFP_KERNEL);
- if (!q)
- return -ENOMEM;
- dev_set_drvdata(&apdev->device, q);
- q->apqn = to_ap_queue(&apdev->device)->qid;
- q->saved_isc = VFIO_AP_ISC_INVALID;
- return 0;
+ struct ap_queue *queue = to_ap_queue(&apdev->device);
+
+ return vfio_ap_mdev_probe_queue(queue);
}
/**
@@ -70,18 +64,9 @@ static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
*/
static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
{
- struct vfio_ap_queue *q;
- int apid, apqi;
-
- mutex_lock(&matrix_dev->lock);
- q = dev_get_drvdata(&apdev->device);
- dev_set_drvdata(&apdev->device, NULL);
- apid = AP_QID_CARD(q->apqn);
- apqi = AP_QID_QUEUE(q->apqn);
- vfio_ap_mdev_reset_queue(apid, apqi, 1);
- vfio_ap_irq_disable(q);
- kfree(q);
- mutex_unlock(&matrix_dev->lock);
+ struct ap_queue *queue = to_ap_queue(&apdev->device);
+
+ vfio_ap_mdev_remove_queue(queue);
}
static void vfio_ap_matrix_dev_release(struct device *dev)
@@ -135,6 +120,7 @@ static int vfio_ap_matrix_dev_create(void)
mutex_init(&matrix_dev->lock);
INIT_LIST_HEAD(&matrix_dev->mdev_list);
+ hash_init(matrix_dev->qtable);
dev_set_name(&matrix_dev->device, "%s", VFIO_AP_DEV_NAME);
matrix_dev->device.parent = root_device;
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 5c0f53c6dde7..134860934fe7 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -26,45 +26,16 @@
static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
-static int match_apqn(struct device *dev, const void *data)
-{
- struct vfio_ap_queue *q = dev_get_drvdata(dev);
-
- return (q->apqn == *(int *)(data)) ? 1 : 0;
-}
-
-/**
- * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
- * @matrix_mdev: the associated mediated matrix
- * @apqn: The queue APQN
- *
- * Retrieve a queue with a specific APQN from the list of the
- * devices of the vfio_ap_drv.
- * Verify that the APID and the APQI are set in the matrix.
- *
- * Returns the pointer to the associated vfio_ap_queue
- */
-static struct vfio_ap_queue *vfio_ap_get_queue(
- struct ap_matrix_mdev *matrix_mdev,
- int apqn)
+struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
{
struct vfio_ap_queue *q;
- struct device *dev;
-
- if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
- return NULL;
- if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
- return NULL;
-
- dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
- &apqn, match_apqn);
- if (!dev)
- return NULL;
- q = dev_get_drvdata(dev);
- q->matrix_mdev = matrix_mdev;
- put_device(dev);
- return q;
+ hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
+ if (q && (apqn == q->apqn))
+ return q;
+ }
+
+ return NULL;
}
/**
@@ -293,10 +264,11 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
struct ap_matrix_mdev, pqap_hook);
- q = vfio_ap_get_queue(matrix_mdev, apqn);
+ q = vfio_ap_get_queue(apqn);
if (!q)
goto out_unlock;
+ q->matrix_mdev = matrix_mdev;
status = vcpu->run->s.regs.gprs[1];
/* If IR bit(16) is set we enable the interrupt */
@@ -1116,16 +1088,11 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
static void vfio_ap_irq_disable_apqn(int apqn)
{
- struct device *dev;
struct vfio_ap_queue *q;
- dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
- &apqn, match_apqn);
- if (dev) {
- q = dev_get_drvdata(dev);
+ q = vfio_ap_get_queue(apqn);
+ if (q)
vfio_ap_irq_disable(q);
- put_device(dev);
- }
}
int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
@@ -1302,3 +1269,38 @@ void vfio_ap_mdev_unregister(void)
{
mdev_unregister_device(&matrix_dev->device);
}
+
+int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
+{
+ struct vfio_ap_queue *q;
+
+ q = kzalloc(sizeof(*q), GFP_KERNEL);
+ if (!q)
+ return -ENOMEM;
+
+ mutex_lock(&matrix_dev->lock);
+ dev_set_drvdata(&queue->ap_dev.device, q);
+ q->apqn = queue->qid;
+ q->saved_isc = VFIO_AP_ISC_INVALID;
+ hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
+ mutex_unlock(&matrix_dev->lock);
+
+ return 0;
+}
+
+void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
+{
+ struct vfio_ap_queue *q;
+ int apid, apqi;
+
+ mutex_lock(&matrix_dev->lock);
+ q = dev_get_drvdata(&queue->ap_dev.device);
+ dev_set_drvdata(&queue->ap_dev.device, NULL);
+ apid = AP_QID_CARD(q->apqn);
+ apqi = AP_QID_QUEUE(q->apqn);
+ vfio_ap_mdev_reset_queue(apid, apqi, 1);
+ vfio_ap_irq_disable(q);
+ hash_del(&q->qnode);
+ kfree(q);
+ mutex_unlock(&matrix_dev->lock);
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index f46dde56b464..e1f8c82cc55d 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -18,6 +18,7 @@
#include <linux/delay.h>
#include <linux/mutex.h>
#include <linux/kvm_host.h>
+#include <linux/hashtable.h>
#include "ap_bus.h"
@@ -43,6 +44,7 @@ struct ap_matrix_dev {
struct list_head mdev_list;
struct mutex lock;
struct ap_driver *vfio_ap_drv;
+ DECLARE_HASHTABLE(qtable, 8);
};
extern struct ap_matrix_dev *matrix_dev;
@@ -90,8 +92,6 @@ struct ap_matrix_mdev {
extern int vfio_ap_mdev_register(void);
extern void vfio_ap_mdev_unregister(void);
-int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
- unsigned int retry);
struct vfio_ap_queue {
struct ap_matrix_mdev *matrix_mdev;
@@ -99,6 +99,10 @@ struct vfio_ap_queue {
int apqn;
#define VFIO_AP_ISC_INVALID 0xff
unsigned char saved_isc;
+ struct hlist_node qnode;
};
-struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
+
+int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
+void vfio_ap_mdev_remove_queue(struct ap_queue *queue);
+
#endif /* _VFIO_AP_PRIVATE_H_ */
--
2.21.1
Let's hot plug/unplug adapters, domains and control domains assigned to or
unassigned from an AP matrix mdev device while it is in use by a guest per
the following:
* When the APID of an adapter is assigned to a matrix mdev in use by a KVM
guest, the adapter will be hot plugged into the KVM guest as long as each
APQN derived from the Cartesian product of the APID being assigned and
the APQIs already assigned to the guest's CRYCB references a queue device
bound to the vfio_ap device driver.
* When the APID of an adapter is unassigned from a matrix mdev in use by a
KVM guest, the adapter will be hot unplugged from the KVM guest.
* When the APQI of a domain is assigned to a matrix mdev in use by a KVM
guest, the domain will be hot plugged into the KVM guest as long as each
APQN derived from the Cartesian product of the APQI being assigned and
the APIDs already assigned to the guest's CRYCB references a queue device
bound to the vfio_ap device driver.
* When the APQI of a domain is unassigned from a matrix mdev in use by a
KVM guest, the domain will be hot unplugged from the KVM guest
* When the domain number of a control domain is assigned to a matrix mdev
in use by a KVM guest, the control domain will be hot plugged into the
KVM guest.
* When the domain number of a control domain is unassigned from a matrix
mdev in use by a KVM guest, the control domain will be hot unplugged
from the KVM guest.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 198 ++++++++++++++++++++++++++++++
1 file changed, 198 insertions(+)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 4b16d45b702b..88a4aef5193f 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -623,6 +623,56 @@ static void vfio_ap_mdev_qlinks_for_apid(struct ap_matrix_mdev *matrix_mdev,
}
}
+static bool vfio_ap_mdev_assign_apqis_4_apid(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ DECLARE_BITMAP(aqm, AP_DOMAINS);
+ unsigned long apqi, apqn;
+
+ bitmap_copy(aqm, matrix_mdev->matrix.aqm, AP_DOMAINS);
+
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) {
+ if (!test_bit_inv(apqi,
+ (unsigned long *) matrix_dev->info.aqm))
+ clear_bit_inv(apqi, aqm);
+
+ apqn = AP_MKQID(apid, apqi);
+ if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
+ clear_bit_inv(apqi, aqm);
+ }
+
+ if (bitmap_empty(aqm, AP_DOMAINS))
+ return false;
+
+ set_bit_inv(apid, matrix_mdev->shadow_crycb.apm);
+ bitmap_copy(matrix_mdev->shadow_crycb.aqm, aqm, AP_DOMAINS);
+
+ return true;
+}
+
+static bool vfio_ap_mdev_assign_guest_apid(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ unsigned long apqi, apqn;
+
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev) ||
+ !test_bit_inv(apid, (unsigned long *)matrix_dev->info.apm))
+ return false;
+
+ if (bitmap_empty(matrix_mdev->shadow_crycb.aqm, AP_DOMAINS))
+ return vfio_ap_mdev_assign_apqis_4_apid(matrix_mdev, apid);
+
+ for_each_set_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm, AP_DOMAINS) {
+ apqn = AP_MKQID(apid, apqi);
+ if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
+ return false;
+ }
+
+ set_bit_inv(apid, matrix_mdev->shadow_crycb.apm);
+
+ return true;
+}
+
/**
* assign_adapter_store
*
@@ -684,12 +734,44 @@ static ssize_t assign_adapter_store(struct device *dev,
}
set_bit_inv(apid, matrix_mdev->matrix.apm);
vfio_ap_mdev_qlinks_for_apid(matrix_mdev, apid);
+
+ if (vfio_ap_mdev_assign_guest_apid(matrix_mdev, apid))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
+
mutex_unlock(&matrix_dev->lock);
return count;
}
static DEVICE_ATTR_WO(assign_adapter);
+static bool vfio_ap_mdev_unassign_guest_apid(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
+ if (test_bit_inv(apid, matrix_mdev->shadow_crycb.apm)) {
+ clear_bit_inv(apid, matrix_mdev->shadow_crycb.apm);
+
+ /*
+ * If there are no APIDs assigned to the guest, then
+ * the guest will not have access to any queues, so
+ * let's also go ahead and unassign the APQIs. Keeping
+ * them around may yield unpredictable results during
+ * a probe that is not related to a host AP
+ * configuration change (i.e., an AP adapter is
+ * configured online).
+ */
+ if (bitmap_empty(matrix_mdev->shadow_crycb.apm,
+ AP_DEVICES))
+ bitmap_clear(matrix_mdev->shadow_crycb.aqm, 0,
+ AP_DOMAINS);
+
+ return true;
+ }
+ }
+
+ return false;
+}
+
/**
* unassign_adapter_store
*
@@ -726,6 +808,8 @@ static ssize_t unassign_adapter_store(struct device *dev,
mutex_lock(&matrix_dev->lock);
clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm);
vfio_ap_mdev_qlinks_for_apid(NULL, apid);
+ if (vfio_ap_mdev_unassign_guest_apid(matrix_mdev, apid))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
mutex_unlock(&matrix_dev->lock);
return count;
@@ -759,6 +843,56 @@ static void vfio_ap_mdev_qlinks_for_apqi(struct ap_matrix_mdev *matrix_mdev,
}
}
+static bool vfio_ap_mdev_assign_apids_4_apqi(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ DECLARE_BITMAP(apm, AP_DEVICES);
+ unsigned long apid, apqn;
+
+ bitmap_copy(apm, matrix_mdev->matrix.apm, AP_DEVICES);
+
+ for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) {
+ if (!test_bit_inv(apid,
+ (unsigned long *) matrix_dev->info.apm))
+ clear_bit_inv(apqi, apm);
+
+ apqn = AP_MKQID(apid, apqi);
+ if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
+ clear_bit_inv(apid, apm);
+ }
+
+ if (bitmap_empty(apm, AP_DEVICES))
+ return false;
+
+ set_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm);
+ bitmap_copy(matrix_mdev->shadow_crycb.apm, apm, AP_DEVICES);
+
+ return true;
+}
+
+static bool vfio_ap_mdev_assign_guest_apqi(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ unsigned long apid, apqn;
+
+ if (!vfio_ap_mdev_has_crycb(matrix_mdev) ||
+ !test_bit_inv(apqi, (unsigned long *)matrix_dev->info.aqm))
+ return false;
+
+ if (bitmap_empty(matrix_mdev->shadow_crycb.apm, AP_DEVICES))
+ return vfio_ap_mdev_assign_apids_4_apqi(matrix_mdev, apqi);
+
+ for_each_set_bit_inv(apid, matrix_mdev->shadow_crycb.apm, AP_DEVICES) {
+ apqn = AP_MKQID(apid, apqi);
+ if (!vfio_ap_get_mdev_queue(matrix_mdev, apqn))
+ return false;
+ }
+
+ set_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm);
+
+ return true;
+}
+
/**
* assign_domain_store
*
@@ -820,12 +954,41 @@ static ssize_t assign_domain_store(struct device *dev,
}
set_bit_inv(apqi, matrix_mdev->matrix.aqm);
vfio_ap_mdev_qlinks_for_apqi(matrix_mdev, apqi);
+ if (vfio_ap_mdev_assign_guest_apqi(matrix_mdev, apqi))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
mutex_unlock(&matrix_dev->lock);
return count;
}
static DEVICE_ATTR_WO(assign_domain);
+static bool vfio_ap_mdev_unassign_guest_apqi(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
+ if (test_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm)) {
+ clear_bit_inv(apqi, matrix_mdev->shadow_crycb.aqm);
+
+ /*
+ * If there are no APQIs assigned to the guest, then
+ * the guest will not have access to any queues, so
+ * let's also go ahead and unassign the APIDs. Keeping
+ * them around may yield unpredictable results during
+ * a probe that is not related to a host AP
+ * configuration change (i.e., an AP adapter is
+ * configured online).
+ */
+ if (bitmap_empty(matrix_mdev->shadow_crycb.aqm,
+ AP_DOMAINS))
+ bitmap_clear(matrix_mdev->shadow_crycb.apm, 0,
+ AP_DEVICES);
+
+ return true;
+ }
+ }
+
+ return false;
+}
/**
* unassign_domain_store
@@ -863,12 +1026,28 @@ static ssize_t unassign_domain_store(struct device *dev,
mutex_lock(&matrix_dev->lock);
clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);
vfio_ap_mdev_qlinks_for_apqi(NULL, apqi);
+ if (vfio_ap_mdev_unassign_guest_apqi(matrix_mdev, apqi))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
mutex_unlock(&matrix_dev->lock);
return count;
}
static DEVICE_ATTR_WO(unassign_domain);
+static bool vfio_ap_mdev_assign_guest_cdom(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long domid)
+{
+ if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
+ if (!test_bit_inv(domid, matrix_mdev->shadow_crycb.adm)) {
+ set_bit_inv(domid, matrix_mdev->shadow_crycb.adm);
+
+ return true;
+ }
+ }
+
+ return false;
+}
+
/**
* assign_control_domain_store
*
@@ -903,12 +1082,29 @@ static ssize_t assign_control_domain_store(struct device *dev,
mutex_lock(&matrix_dev->lock);
set_bit_inv(id, matrix_mdev->matrix.adm);
+ if (vfio_ap_mdev_assign_guest_cdom(matrix_mdev, id))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
mutex_unlock(&matrix_dev->lock);
return count;
}
static DEVICE_ATTR_WO(assign_control_domain);
+static bool
+vfio_ap_mdev_unassign_guest_cdom(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long domid)
+{
+ if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
+ if (test_bit_inv(domid, matrix_mdev->shadow_crycb.adm)) {
+ clear_bit_inv(domid, matrix_mdev->shadow_crycb.adm);
+
+ return true;
+ }
+ }
+
+ return false;
+}
+
/**
* unassign_control_domain_store
*
@@ -943,6 +1139,8 @@ static ssize_t unassign_control_domain_store(struct device *dev,
mutex_lock(&matrix_dev->lock);
clear_bit_inv(domid, matrix_mdev->matrix.adm);
+ if (vfio_ap_mdev_unassign_guest_cdom(matrix_mdev, domid))
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
mutex_unlock(&matrix_dev->lock);
return count;
--
2.21.1
Let's introduce a shadow copy of the KVM guest's CRYCB and maintain it for
the lifespan of the guest. The shadow CRYCB will be used to provide the
AP configuration for a KVM guest.
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 31 +++++++++++++++++++++------
drivers/s390/crypto/vfio_ap_private.h | 1 +
2 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 8ece0d52ff4c..b8b678032ab7 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -280,14 +280,32 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
return 0;
}
+static void vfio_ap_matrix_clear(struct ap_matrix *matrix)
+{
+ bitmap_clear(matrix->apm, 0, AP_DEVICES);
+ bitmap_clear(matrix->aqm, 0, AP_DOMAINS);
+ bitmap_clear(matrix->adm, 0, AP_DOMAINS);
+}
+
static void vfio_ap_matrix_init(struct ap_config_info *info,
struct ap_matrix *matrix)
{
+ vfio_ap_matrix_clear(matrix);
matrix->apm_max = info->apxa ? info->Na : 63;
matrix->aqm_max = info->apxa ? info->Nd : 15;
matrix->adm_max = info->apxa ? info->Nd : 15;
}
+static bool vfio_ap_mdev_commit_crycb(struct ap_matrix_mdev *matrix_mdev)
+{
+ if (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd) {
+ kvm_arch_crypto_set_masks(matrix_mdev->kvm,
+ matrix_mdev->shadow_crycb.apm,
+ matrix_mdev->shadow_crycb.aqm,
+ matrix_mdev->shadow_crycb.adm);
+ }
+}
+
static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
{
struct ap_matrix_mdev *matrix_mdev;
@@ -303,6 +321,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
matrix_mdev->mdev = mdev;
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
+ vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_crycb);
mdev_set_drvdata(mdev, matrix_mdev);
matrix_mdev->pqap_hook.hook = handle_pqap;
matrix_mdev->pqap_hook.owner = THIS_MODULE;
@@ -1126,13 +1145,9 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
if (ret)
return NOTIFY_DONE;
- /* If there is no CRYCB pointer, then we can't copy the masks */
- if (!matrix_mdev->kvm->arch.crypto.crycbd)
- return NOTIFY_DONE;
-
- kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
- matrix_mdev->matrix.aqm,
- matrix_mdev->matrix.adm);
+ memcpy(&matrix_mdev->shadow_crycb, &matrix_mdev->matrix,
+ sizeof(matrix_mdev->shadow_crycb));
+ vfio_ap_mdev_commit_crycb(matrix_mdev);
return NOTIFY_OK;
}
@@ -1247,6 +1262,8 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
kvm_put_kvm(matrix_mdev->kvm);
matrix_mdev->kvm = NULL;
}
+
+ vfio_ap_matrix_clear(&matrix_mdev->shadow_crycb);
mutex_unlock(&matrix_dev->lock);
vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 4b6e144bab17..87cc270c3212 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -83,6 +83,7 @@ struct ap_matrix {
struct ap_matrix_mdev {
struct list_head node;
struct ap_matrix matrix;
+ struct ap_matrix shadow_crycb;
struct notifier_block group_notifier;
struct notifier_block iommu_notifier;
struct kvm *kvm;
--
2.21.1
On Tue, 7 Apr 2020 15:20:01 -0400
Tony Krowiak <[email protected]> wrote:
> Rather than looping over potentially 65535 objects, let's store the
> structures for caching information about queue devices bound to the
> vfio_ap device driver in a hash table keyed by APQN.
This also looks like a nice code simplification.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
> drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
> drivers/s390/crypto/vfio_ap_private.h | 10 ++-
> 3 files changed, 60 insertions(+), 68 deletions(-)
>
(...)
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 5c0f53c6dde7..134860934fe7 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -26,45 +26,16 @@
>
> static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
>
> -static int match_apqn(struct device *dev, const void *data)
> -{
> - struct vfio_ap_queue *q = dev_get_drvdata(dev);
> -
> - return (q->apqn == *(int *)(data)) ? 1 : 0;
> -}
> -
> -/**
> - * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
> - * @matrix_mdev: the associated mediated matrix
> - * @apqn: The queue APQN
> - *
> - * Retrieve a queue with a specific APQN from the list of the
> - * devices of the vfio_ap_drv.
> - * Verify that the APID and the APQI are set in the matrix.
> - *
> - * Returns the pointer to the associated vfio_ap_queue
Any reason you're killing this comment, instead of adapting it? The
function is even no longer static...
> - */
> -static struct vfio_ap_queue *vfio_ap_get_queue(
> - struct ap_matrix_mdev *matrix_mdev,
> - int apqn)
> +struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
> {
> struct vfio_ap_queue *q;
> - struct device *dev;
> -
> - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
> - return NULL;
> - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
> - return NULL;
These were just optimizations and therefore can be dropped now?
> -
> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> - &apqn, match_apqn);
> - if (!dev)
> - return NULL;
> - q = dev_get_drvdata(dev);
> - q->matrix_mdev = matrix_mdev;
> - put_device(dev);
>
> - return q;
> + hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
> + if (q && (apqn == q->apqn))
> + return q;
> + }
Do we need any serialization here? Previously, the driver core made
sure we could get a reference only if the device was still registered;
not sure if we need any further guarantees now.
> +
> + return NULL;
> }
>
> /**
(...)
On Tue, 7 Apr 2020 15:20:06 -0400
Tony Krowiak <[email protected]> wrote:
> The matrix of adapters and domains configured in a guest's CRYCB may
> differ from the matrix of adapters and domains assigned to the matrix mdev,
> so this patch introduces a sysfs attribute to display the CRYCB of a guest
> using the matrix mdev. For a matrix mdev denoted by $uuid, the crycb for a
> guest using the matrix mdev can be displayed as follows:
>
> cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix
>
> If a guest is not using the matrix mdev at the time the crycb is displayed,
> an error (ENODEV) will be returned.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 58 +++++++++++++++++++++++++++++++
> 1 file changed, 58 insertions(+)
> +static DEVICE_ATTR_RO(guest_matrix);
Hm... should information like the guest configuration be readable by
everyone? Or should it be restricted a bit more?
> +
> static struct attribute *vfio_ap_mdev_attrs[] = {
> &dev_attr_assign_adapter.attr,
> &dev_attr_unassign_adapter.attr,
> @@ -1050,6 +1107,7 @@ static struct attribute *vfio_ap_mdev_attrs[] = {
> &dev_attr_unassign_control_domain.attr,
> &dev_attr_control_domains.attr,
> &dev_attr_matrix.attr,
> + &dev_attr_guest_matrix.attr,
> NULL,
> };
>
On 4/8/20 6:48 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:01 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Rather than looping over potentially 65535 objects, let's store the
>> structures for caching information about queue devices bound to the
>> vfio_ap device driver in a hash table keyed by APQN.
> This also looks like a nice code simplification.
>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
>> drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
>> drivers/s390/crypto/vfio_ap_private.h | 10 ++-
>> 3 files changed, 60 insertions(+), 68 deletions(-)
>>
> (...)
>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>> index 5c0f53c6dde7..134860934fe7 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -26,45 +26,16 @@
>>
>> static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
>>
>> -static int match_apqn(struct device *dev, const void *data)
>> -{
>> - struct vfio_ap_queue *q = dev_get_drvdata(dev);
>> -
>> - return (q->apqn == *(int *)(data)) ? 1 : 0;
>> -}
>> -
>> -/**
>> - * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
>> - * @matrix_mdev: the associated mediated matrix
>> - * @apqn: The queue APQN
>> - *
>> - * Retrieve a queue with a specific APQN from the list of the
>> - * devices of the vfio_ap_drv.
>> - * Verify that the APID and the APQI are set in the matrix.
>> - *
>> - * Returns the pointer to the associated vfio_ap_queue
> Any reason you're killing this comment, instead of adapting it? The
> function is even no longer static...
I can update and restore the comment and the function should be static.
>
>> - */
>> -static struct vfio_ap_queue *vfio_ap_get_queue(
>> - struct ap_matrix_mdev *matrix_mdev,
>> - int apqn)
>> +struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
>> {
>> struct vfio_ap_queue *q;
>> - struct device *dev;
>> -
>> - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
>> - return NULL;
>> - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
>> - return NULL;
> These were just optimizations and therefore can be dropped now?
The purpose of this function has changed from its previous incarnation.
This function was originally called from the handle_pqap() function and
served two purposes: It retrieved the struct vfio_ap_queue as driver data
and linked the matrix_mdev to the vfio_ap_queue. The linking of the
matrix_mdev and the vfio_ap_queue are now done when queue devices
are probed and when adapters and domains are assigned; so now, the
handle_pqap() function calls this function to retrieve both the
vfio_ap_queue as well as the matrix_mdev to which it is linked.
Consequently,
the above code is no longer needed.
>
>> -
>> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
>> - &apqn, match_apqn);
>> - if (!dev)
>> - return NULL;
>> - q = dev_get_drvdata(dev);
>> - q->matrix_mdev = matrix_mdev;
>> - put_device(dev);
>>
>> - return q;
>> + hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
>> + if (q && (apqn == q->apqn))
>> + return q;
>> + }
> Do we need any serialization here? Previously, the driver core made
> sure we could get a reference only if the device was still registered;
> not sure if we need any further guarantees now.
The vfio_ap_queue structs are created when the queue device is
probed and removed when the queue device is removed.
>
>> +
>> + return NULL;
>> }
>>
>> /**
> (...)
>
On 4/8/20 12:27 PM, Cornelia Huck wrote:
> On Wed, 8 Apr 2020 11:38:07 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 4/8/20 6:48 AM, Cornelia Huck wrote:
>>> On Tue, 7 Apr 2020 15:20:01 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> Rather than looping over potentially 65535 objects, let's store the
>>>> structures for caching information about queue devices bound to the
>>>> vfio_ap device driver in a hash table keyed by APQN.
>>> This also looks like a nice code simplification.
>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
>>>> drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
>>>> drivers/s390/crypto/vfio_ap_private.h | 10 ++-
>>>> 3 files changed, 60 insertions(+), 68 deletions(-)
>>>>
>>> (...)
>>>> - */
>>>> -static struct vfio_ap_queue *vfio_ap_get_queue(
>>>> - struct ap_matrix_mdev *matrix_mdev,
>>>> - int apqn)
>>>> +struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
>>>> {
>>>> struct vfio_ap_queue *q;
>>>> - struct device *dev;
>>>> -
>>>> - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
>>>> - return NULL;
>>>> - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
>>>> - return NULL;
>>> These were just optimizations and therefore can be dropped now?
>> The purpose of this function has changed from its previous incarnation.
>> This function was originally called from the handle_pqap() function and
>> served two purposes: It retrieved the struct vfio_ap_queue as driver data
>> and linked the matrix_mdev to the vfio_ap_queue. The linking of the
>> matrix_mdev and the vfio_ap_queue are now done when queue devices
>> are probed and when adapters and domains are assigned; so now, the
>> handle_pqap() function calls this function to retrieve both the
>> vfio_ap_queue as well as the matrix_mdev to which it is linked.
>> Consequently,
>> the above code is no longer needed.
> Thanks for the explanation, that makes sense.
>
>>>
>>>> -
>>>> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
>>>> - &apqn, match_apqn);
>>>> - if (!dev)
>>>> - return NULL;
>>>> - q = dev_get_drvdata(dev);
>>>> - q->matrix_mdev = matrix_mdev;
>>>> - put_device(dev);
>>>>
>>>> - return q;
>>>> + hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
>>>> + if (q && (apqn == q->apqn))
>>>> + return q;
>>>> + }
>>> Do we need any serialization here? Previously, the driver core made
>>> sure we could get a reference only if the device was still registered;
>>> not sure if we need any further guarantees now.
>> The vfio_ap_queue structs are created when the queue device is
>> probed and removed when the queue device is removed.
> Ok, so anything further is not needed.
>
>>>
>>>> +
>>>> + return NULL;
>>>> }
>>>>
>>>> /**
>>> (...)
>>>
> Looks good to me, then. With vfio_ap_get_queue made static and the
> kerneldoc restored/updated:
Already done, thanks for the review.
>
> Reviewed-by: Cornelia Huck <[email protected]>
>
On Wed, 8 Apr 2020 11:38:07 -0400
Tony Krowiak <[email protected]> wrote:
> On 4/8/20 6:48 AM, Cornelia Huck wrote:
> > On Tue, 7 Apr 2020 15:20:01 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Rather than looping over potentially 65535 objects, let's store the
> >> structures for caching information about queue devices bound to the
> >> vfio_ap device driver in a hash table keyed by APQN.
> > This also looks like a nice code simplification.
> >
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
> >> drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
> >> drivers/s390/crypto/vfio_ap_private.h | 10 ++-
> >> 3 files changed, 60 insertions(+), 68 deletions(-)
> >>
> > (...)
> >> - */
> >> -static struct vfio_ap_queue *vfio_ap_get_queue(
> >> - struct ap_matrix_mdev *matrix_mdev,
> >> - int apqn)
> >> +struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
> >> {
> >> struct vfio_ap_queue *q;
> >> - struct device *dev;
> >> -
> >> - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
> >> - return NULL;
> >> - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
> >> - return NULL;
> > These were just optimizations and therefore can be dropped now?
>
> The purpose of this function has changed from its previous incarnation.
> This function was originally called from the handle_pqap() function and
> served two purposes: It retrieved the struct vfio_ap_queue as driver data
> and linked the matrix_mdev to the vfio_ap_queue. The linking of the
> matrix_mdev and the vfio_ap_queue are now done when queue devices
> are probed and when adapters and domains are assigned; so now, the
> handle_pqap() function calls this function to retrieve both the
> vfio_ap_queue as well as the matrix_mdev to which it is linked.
> Consequently,
> the above code is no longer needed.
Thanks for the explanation, that makes sense.
>
> >
> >> -
> >> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> >> - &apqn, match_apqn);
> >> - if (!dev)
> >> - return NULL;
> >> - q = dev_get_drvdata(dev);
> >> - q->matrix_mdev = matrix_mdev;
> >> - put_device(dev);
> >>
> >> - return q;
> >> + hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
> >> + if (q && (apqn == q->apqn))
> >> + return q;
> >> + }
> > Do we need any serialization here? Previously, the driver core made
> > sure we could get a reference only if the device was still registered;
> > not sure if we need any further guarantees now.
>
> The vfio_ap_queue structs are created when the queue device is
> probed and removed when the queue device is removed.
Ok, so anything further is not needed.
>
> >
> >> +
> >> + return NULL;
> >> }
> >>
> >> /**
> > (...)
> >
>
Looks good to me, then. With vfio_ap_get_queue made static and the
kerneldoc restored/updated:
Reviewed-by: Cornelia Huck <[email protected]>
On 4/8/20 6:33 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:06 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> The matrix of adapters and domains configured in a guest's CRYCB may
>> differ from the matrix of adapters and domains assigned to the matrix mdev,
>> so this patch introduces a sysfs attribute to display the CRYCB of a guest
>> using the matrix mdev. For a matrix mdev denoted by $uuid, the crycb for a
>> guest using the matrix mdev can be displayed as follows:
>>
>> cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix
>>
>> If a guest is not using the matrix mdev at the time the crycb is displayed,
>> an error (ENODEV) will be returned.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 58 +++++++++++++++++++++++++++++++
>> 1 file changed, 58 insertions(+)
>> +static DEVICE_ATTR_RO(guest_matrix);
> Hm... should information like the guest configuration be readable by
> everyone? Or should it be restricted a bit more?
Why? The matrix attribute already displays the APQNs of the queues
assigned to the matrix mdev. The guest_matrix attribute merely displays
a subset of the matrix (i.e., the APQNs assigned to the mdev that reference
queue devices bound to the vfio_ap device driver).
How can this be restricted?
>
>> +
>> static struct attribute *vfio_ap_mdev_attrs[] = {
>> &dev_attr_assign_adapter.attr,
>> &dev_attr_unassign_adapter.attr,
>> @@ -1050,6 +1107,7 @@ static struct attribute *vfio_ap_mdev_attrs[] = {
>> &dev_attr_unassign_control_domain.attr,
>> &dev_attr_control_domains.attr,
>> &dev_attr_matrix.attr,
>> + &dev_attr_guest_matrix.attr,
>> NULL,
>> };
>>
On Wed, 8 Apr 2020 12:38:49 -0400
Tony Krowiak <[email protected]> wrote:
> On 4/8/20 6:33 AM, Cornelia Huck wrote:
> > On Tue, 7 Apr 2020 15:20:06 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> The matrix of adapters and domains configured in a guest's CRYCB may
> >> differ from the matrix of adapters and domains assigned to the matrix mdev,
> >> so this patch introduces a sysfs attribute to display the CRYCB of a guest
> >> using the matrix mdev. For a matrix mdev denoted by $uuid, the crycb for a
> >> guest using the matrix mdev can be displayed as follows:
> >>
> >> cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix
> >>
> >> If a guest is not using the matrix mdev at the time the crycb is displayed,
> >> an error (ENODEV) will be returned.
> >>
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> drivers/s390/crypto/vfio_ap_ops.c | 58 +++++++++++++++++++++++++++++++
> >> 1 file changed, 58 insertions(+)
> >> +static DEVICE_ATTR_RO(guest_matrix);
> > Hm... should information like the guest configuration be readable by
> > everyone? Or should it be restricted a bit more?
>
> Why? The matrix attribute already displays the APQNs of the queues
> assigned to the matrix mdev. The guest_matrix attribute merely displays
> a subset of the matrix (i.e., the APQNs assigned to the mdev that reference
> queue devices bound to the vfio_ap device driver).
>
> How can this be restricted?
I was thinking of using e.g. 400 instead of 444 for the permissions.
I'm not sure if the info about what subset of the queues the guest
actually uses is all that interesting (except for managing guests); but
if I see guest information being exposed, I get a little wary, so I
just stumbled over this.
Maybe I'll come back to that once I have looked at the series in more
detail, but this might not be a problem at all.
>
> >
> >> +
> >> static struct attribute *vfio_ap_mdev_attrs[] = {
> >> &dev_attr_assign_adapter.attr,
> >> &dev_attr_unassign_adapter.attr,
> >> @@ -1050,6 +1107,7 @@ static struct attribute *vfio_ap_mdev_attrs[] = {
> >> &dev_attr_unassign_control_domain.attr,
> >> &dev_attr_control_domains.attr,
> >> &dev_attr_matrix.attr,
> >> + &dev_attr_guest_matrix.attr,
> >> NULL,
> >> };
> >>
>
On 4/8/20 12:46 PM, Cornelia Huck wrote:
> On Wed, 8 Apr 2020 12:38:49 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 4/8/20 6:33 AM, Cornelia Huck wrote:
>>> On Tue, 7 Apr 2020 15:20:06 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> The matrix of adapters and domains configured in a guest's CRYCB may
>>>> differ from the matrix of adapters and domains assigned to the matrix mdev,
>>>> so this patch introduces a sysfs attribute to display the CRYCB of a guest
>>>> using the matrix mdev. For a matrix mdev denoted by $uuid, the crycb for a
>>>> guest using the matrix mdev can be displayed as follows:
>>>>
>>>> cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix
>>>>
>>>> If a guest is not using the matrix mdev at the time the crycb is displayed,
>>>> an error (ENODEV) will be returned.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> drivers/s390/crypto/vfio_ap_ops.c | 58 +++++++++++++++++++++++++++++++
>>>> 1 file changed, 58 insertions(+)
>>>> +static DEVICE_ATTR_RO(guest_matrix);
>>> Hm... should information like the guest configuration be readable by
>>> everyone? Or should it be restricted a bit more?
>> Why? The matrix attribute already displays the APQNs of the queues
>> assigned to the matrix mdev. The guest_matrix attribute merely displays
>> a subset of the matrix (i.e., the APQNs assigned to the mdev that reference
>> queue devices bound to the vfio_ap device driver).
>>
>> How can this be restricted?
> I was thinking of using e.g. 400 instead of 444 for the permissions.
>
> I'm not sure if the info about what subset of the queues the guest
> actually uses is all that interesting (except for managing guests); but
> if I see guest information being exposed, I get a little wary, so I
> just stumbled over this.
>
> Maybe I'll come back to that once I have looked at the series in more
> detail, but this might not be a problem at all.
I created this patch primarily because I found it very helpful when
testing. It saved me from logging on to the guest and executing
lszcrypt to verify the AP configuration for the guest. I thought this
might also add value for a system administrator responsible for
KVM guests. It would not be a problem to remove this patch or
change the permissions for the attribute. Let me know when you
complete the series review. Thanks.
>
>>>
>>>> +
>>>> static struct attribute *vfio_ap_mdev_attrs[] = {
>>>> &dev_attr_assign_adapter.attr,
>>>> &dev_attr_unassign_adapter.attr,
>>>> @@ -1050,6 +1107,7 @@ static struct attribute *vfio_ap_mdev_attrs[] = {
>>>> &dev_attr_unassign_control_domain.attr,
>>>> &dev_attr_control_domains.attr,
>>>> &dev_attr_matrix.attr,
>>>> + &dev_attr_guest_matrix.attr,
>>>> NULL,
>>>> };
>>>>
On Tue, 7 Apr 2020 15:20:02 -0400
Tony Krowiak <[email protected]> wrote:
> A vfio_ap_queue structure is created for each queue device probed. To
> ensure that the matrix mdev to which a queue's APQN is assigned is linked
> to the queue structure as long as the queue device is bound to the vfio_ap
> device driver, let's go ahead and manage these links when the queue device
> is probed and removed as well as whenever an adapter or domain is assigned
> to or unassigned from the matrix mdev.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++++++++++++---
> 1 file changed, 70 insertions(+), 5 deletions(-)
(...)
> @@ -536,6 +531,31 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
> return 0;
> }
>
> +/**
> + * vfio_ap_mdev_qlinks_for_apid
Hm... maybe the function name should express that there's some actual
(un)linking going on?
vfio_ap_mdev_link_by_apid?
Or make this vfio_ap_mdev_link_queues() and pass in an indicator whether
the passed value is an apid or an aqid? Both function names look so
very similar to be easily confused (at least to me).
> + *
> + * @matrix_mdev: a matrix mediated device
> + * @apqi: the APID of one or more APQNs assigned to @matrix_mdev
> + *
> + * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
> + * device driver with an APQN assigned to @matrix_mdev with the specified @apid.
> + *
> + * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
> + */
> +static void vfio_ap_mdev_qlinks_for_apid(struct ap_matrix_mdev *matrix_mdev,
> + unsigned long apid)
> +{
> + unsigned long apqi;
> + struct vfio_ap_queue *q;
> +
> + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
> + matrix_mdev->matrix.aqm_max + 1) {
> + q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
> + if (q)
> + q->matrix_mdev = matrix_mdev;
> + }
> +}
> +
> /**
> * assign_adapter_store
> *
(...)
> @@ -682,6 +704,31 @@ vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev,
> return 0;
> }
>
> +/**
> + * vfio_ap_mdev_qlinks_for_apqi
See my comment above.
> + *
> + * @matrix_mdev: a matrix mediated device
> + * @apqi: the APQI of one or more APQNs assigned to @matrix_mdev
> + *
> + * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
> + * device driver with an APQN assigned to @matrix_mdev with the specified @apqi.
> + *
> + * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
> + */
> +static void vfio_ap_mdev_qlinks_for_apqi(struct ap_matrix_mdev *matrix_mdev,
> + unsigned long apqi)
> +{
> + unsigned long apid;
> + struct vfio_ap_queue *q;
> +
> + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
> + matrix_mdev->matrix.apm_max + 1) {
> + q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
> + if (q)
> + q->matrix_mdev = matrix_mdev;
> + }
> +}
> +
> /**
> * assign_domain_store
> *
(...)
> @@ -1270,6 +1319,21 @@ void vfio_ap_mdev_unregister(void)
> mdev_unregister_device(&matrix_dev->device);
> }
>
> +static void vfio_ap_mdev_for_queue(struct vfio_ap_queue *q)
vfio_ap_queue_link_mdev()? It is the other direction from the linking
above.
> +{
> + unsigned long apid = AP_QID_CARD(q->apqn);
> + unsigned long apqi = AP_QID_QUEUE(q->apqn);
> + struct ap_matrix_mdev *matrix_mdev;
> +
> + list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) {
> + if (test_bit_inv(apid, matrix_mdev->matrix.apm) &&
> + test_bit_inv(apqi, matrix_mdev->matrix.aqm)) {
> + q->matrix_mdev = matrix_mdev;
> + break;
> + }
> + }
> +}
> +
> int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
> {
> struct vfio_ap_queue *q;
> @@ -1282,6 +1346,7 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
> dev_set_drvdata(&queue->ap_dev.device, q);
> q->apqn = queue->qid;
> q->saved_isc = VFIO_AP_ISC_INVALID;
> + vfio_ap_mdev_for_queue(q);
> hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
> mutex_unlock(&matrix_dev->lock);
>
In general, looks sane.
On 4/9/20 11:06 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:02 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> A vfio_ap_queue structure is created for each queue device probed. To
>> ensure that the matrix mdev to which a queue's APQN is assigned is linked
>> to the queue structure as long as the queue device is bound to the vfio_ap
>> device driver, let's go ahead and manage these links when the queue device
>> is probed and removed as well as whenever an adapter or domain is assigned
>> to or unassigned from the matrix mdev.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++++++++++++---
>> 1 file changed, 70 insertions(+), 5 deletions(-)
> (...)
>
>> @@ -536,6 +531,31 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>> return 0;
>> }
>>
>> +/**
>> + * vfio_ap_mdev_qlinks_for_apid
> Hm... maybe the function name should express that there's some actual
> (un)linking going on?
>
> vfio_ap_mdev_link_by_apid?
>
> Or make this vfio_ap_mdev_link_queues() and pass in an indicator whether
> the passed value is an apid or an aqid? Both function names look so
> very similar to be easily confused (at least to me).
I like this idea, how about vfio_ap_link_mdev_to_queues()?
>
>> + *
>> + * @matrix_mdev: a matrix mediated device
>> + * @apqi: the APID of one or more APQNs assigned to @matrix_mdev
>> + *
>> + * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
>> + * device driver with an APQN assigned to @matrix_mdev with the specified @apid.
>> + *
>> + * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
>> + */
>> +static void vfio_ap_mdev_qlinks_for_apid(struct ap_matrix_mdev *matrix_mdev,
>> + unsigned long apid)
>> +{
>> + unsigned long apqi;
>> + struct vfio_ap_queue *q;
>> +
>> + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>> + matrix_mdev->matrix.aqm_max + 1) {
>> + q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
>> + if (q)
>> + q->matrix_mdev = matrix_mdev;
>> + }
>> +}
>> +
>> /**
>> * assign_adapter_store
>> *
> (...)
>
>> @@ -682,6 +704,31 @@ vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev,
>> return 0;
>> }
>>
>> +/**
>> + * vfio_ap_mdev_qlinks_for_apqi
> See my comment above.
>
>> + *
>> + * @matrix_mdev: a matrix mediated device
>> + * @apqi: the APQI of one or more APQNs assigned to @matrix_mdev
>> + *
>> + * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
>> + * device driver with an APQN assigned to @matrix_mdev with the specified @apqi.
>> + *
>> + * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
>> + */
>> +static void vfio_ap_mdev_qlinks_for_apqi(struct ap_matrix_mdev *matrix_mdev,
>> + unsigned long apqi)
>> +{
>> + unsigned long apid;
>> + struct vfio_ap_queue *q;
>> +
>> + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
>> + matrix_mdev->matrix.apm_max + 1) {
>> + q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
>> + if (q)
>> + q->matrix_mdev = matrix_mdev;
>> + }
>> +}
>> +
>> /**
>> * assign_domain_store
>> *
> (...)
>
>> @@ -1270,6 +1319,21 @@ void vfio_ap_mdev_unregister(void)
>> mdev_unregister_device(&matrix_dev->device);
>> }
>>
>> +static void vfio_ap_mdev_for_queue(struct vfio_ap_queue *q)
> vfio_ap_queue_link_mdev()? It is the other direction from the linking
> above.
How about vfio_ap_link_queue_to_mdevs()?
>
>> +{
>> + unsigned long apid = AP_QID_CARD(q->apqn);
>> + unsigned long apqi = AP_QID_QUEUE(q->apqn);
>> + struct ap_matrix_mdev *matrix_mdev;
>> +
>> + list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) {
>> + if (test_bit_inv(apid, matrix_mdev->matrix.apm) &&
>> + test_bit_inv(apqi, matrix_mdev->matrix.aqm)) {
>> + q->matrix_mdev = matrix_mdev;
>> + break;
>> + }
>> + }
>> +}
>> +
>> int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
>> {
>> struct vfio_ap_queue *q;
>> @@ -1282,6 +1346,7 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
>> dev_set_drvdata(&queue->ap_dev.device, q);
>> q->apqn = queue->qid;
>> q->saved_isc = VFIO_AP_ISC_INVALID;
>> + vfio_ap_mdev_for_queue(q);
>> hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
>> mutex_unlock(&matrix_dev->lock);
>>
> In general, looks sane.
>
On 4/9/20 11:06 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:02 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> A vfio_ap_queue structure is created for each queue device probed. To
>> ensure that the matrix mdev to which a queue's APQN is assigned is linked
>> to the queue structure as long as the queue device is bound to the vfio_ap
>> device driver, let's go ahead and manage these links when the queue device
>> is probed and removed as well as whenever an adapter or domain is assigned
>> to or unassigned from the matrix mdev.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++++++++++++---
>> 1 file changed, 70 insertions(+), 5 deletions(-)
> (...)
>
>> @@ -536,6 +531,31 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>> return 0;
>> }
>>
>> +/**
>> + * vfio_ap_mdev_qlinks_for_apid
> Hm... maybe the function name should express that there's some actual
> (un)linking going on?
>
> vfio_ap_mdev_link_by_apid?
>
> Or make this vfio_ap_mdev_link_queues() and pass in an indicator whether
> the passed value is an apid or an aqid? Both function names look so
> very similar to be easily confused (at least to me).
Forget my last response, I like your function names better.
>
>> + *
>> + * @matrix_mdev: a matrix mediated device
>> + * @apqi: the APID of one or more APQNs assigned to @matrix_mdev
>> + *
>> + * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
>> + * device driver with an APQN assigned to @matrix_mdev with the specified @apid.
>> + *
>> + * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
>> + */
>> +static void vfio_ap_mdev_qlinks_for_apid(struct ap_matrix_mdev *matrix_mdev,
>> + unsigned long apid)
>> +{
>> + unsigned long apqi;
>> + struct vfio_ap_queue *q;
>> +
>> + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>> + matrix_mdev->matrix.aqm_max + 1) {
>> + q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
>> + if (q)
>> + q->matrix_mdev = matrix_mdev;
>> + }
>> +}
>> +
>> /**
>> * assign_adapter_store
>> *
> (...)
>
>> @@ -682,6 +704,31 @@ vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev,
>> return 0;
>> }
>>
>> +/**
>> + * vfio_ap_mdev_qlinks_for_apqi
> See my comment above.
>
>> + *
>> + * @matrix_mdev: a matrix mediated device
>> + * @apqi: the APQI of one or more APQNs assigned to @matrix_mdev
>> + *
>> + * Set the link to @matrix_mdev for each queue device bound to the vfio_ap
>> + * device driver with an APQN assigned to @matrix_mdev with the specified @apqi.
>> + *
>> + * Note: If @matrix_mdev is NULL, the link to @matrix_mdev will be severed.
>> + */
>> +static void vfio_ap_mdev_qlinks_for_apqi(struct ap_matrix_mdev *matrix_mdev,
>> + unsigned long apqi)
>> +{
>> + unsigned long apid;
>> + struct vfio_ap_queue *q;
>> +
>> + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
>> + matrix_mdev->matrix.apm_max + 1) {
>> + q = vfio_ap_get_queue(AP_MKQID(apid, apqi));
>> + if (q)
>> + q->matrix_mdev = matrix_mdev;
>> + }
>> +}
>> +
>> /**
>> * assign_domain_store
>> *
> (...)
>
>> @@ -1270,6 +1319,21 @@ void vfio_ap_mdev_unregister(void)
>> mdev_unregister_device(&matrix_dev->device);
>> }
>>
>> +static void vfio_ap_mdev_for_queue(struct vfio_ap_queue *q)
> vfio_ap_queue_link_mdev()? It is the other direction from the linking
> above.
See my comment above.
>
>> +{
>> + unsigned long apid = AP_QID_CARD(q->apqn);
>> + unsigned long apqi = AP_QID_QUEUE(q->apqn);
>> + struct ap_matrix_mdev *matrix_mdev;
>> +
>> + list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) {
>> + if (test_bit_inv(apid, matrix_mdev->matrix.apm) &&
>> + test_bit_inv(apqi, matrix_mdev->matrix.aqm)) {
>> + q->matrix_mdev = matrix_mdev;
>> + break;
>> + }
>> + }
>> +}
>> +
>> int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
>> {
>> struct vfio_ap_queue *q;
>> @@ -1282,6 +1346,7 @@ int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
>> dev_set_drvdata(&queue->ap_dev.device, q);
>> q->apqn = queue->qid;
>> q->saved_isc = VFIO_AP_ISC_INVALID;
>> + vfio_ap_mdev_for_queue(q);
>> hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
>> mutex_unlock(&matrix_dev->lock);
>>
> In general, looks sane.
>
On Tue, 7 Apr 2020 15:20:03 -0400
Tony Krowiak <[email protected]> wrote:
> Introduces a new driver callback to prevent a root user from unbinding
> an AP queue from its device driver if the queue is in use. The intent of
> this callback is to provide a driver with the means to prevent a root user
> from inadvertently taking a queue away from a guest and giving it to the
> host while the guest is still using it. The callback will
> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
> attributes would result in one or more AP queues being removed from its
> driver. If the callback responds in the affirmative for any driver
> queried, the change to the apmask or aqmask will be rejected with a device
> in use error.
>
> For this patch, only non-default drivers will be queried. Currently,
> there is only one non-default driver, the vfio_ap device driver. The
> vfio_ap device driver manages AP queues passed through to one or more
> guests and we don't want to unexpectedly take AP resources away from
> guests which are most likely independently administered.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/ap_bus.c | 144 +++++++++++++++++++++++++++++++++--
> drivers/s390/crypto/ap_bus.h | 4 +
> 2 files changed, 142 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 5256e3ce84e5..af15c095e76a 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -35,6 +35,7 @@
> #include <linux/mod_devicetable.h>
> #include <linux/debugfs.h>
> #include <linux/ctype.h>
> +#include <linux/module.h>
>
> #include "ap_bus.h"
> #include "ap_debug.h"
> @@ -995,9 +996,11 @@ int ap_parse_mask_str(const char *str,
> newmap = kmalloc(size, GFP_KERNEL);
> if (!newmap)
> return -ENOMEM;
> - if (mutex_lock_interruptible(lock)) {
> - kfree(newmap);
> - return -ERESTARTSYS;
> + if (lock) {
> + if (mutex_lock_interruptible(lock)) {
> + kfree(newmap);
> + return -ERESTARTSYS;
> + }
This whole function is a bit odd. It seems all masks we want to
manipulate are always guarded by the ap_perms_mutex, and the need for
allowing lock == NULL comes from wanting to call this function with the
ap_perms_mutex already held.
That would argue for a locked/unlocked version of this function... but
looking at it, why do we lock the way we do? The one thing this
function (prior to this patch) does outside of the holding of the mutex
is the allocation and freeing of newmap. But with this patch, we do the
allocation and freeing of newmap while holding the mutex. Something
seems a bit weird here.
> }
>
> if (*str == '+' || *str == '-') {
On Tue, 7 Apr 2020 15:20:03 -0400
Tony Krowiak <[email protected]> wrote:
> Introduces a new driver callback to prevent a root user from unbinding
> an AP queue from its device driver if the queue is in use. The intent of
> this callback is to provide a driver with the means to prevent a root user
> from inadvertently taking a queue away from a guest and giving it to the
> host while the guest is still using it. The callback will
> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
> attributes would result in one or more AP queues being removed from its
> driver. If the callback responds in the affirmative for any driver
> queried, the change to the apmask or aqmask will be rejected with a device
> in use error.
>
> For this patch, only non-default drivers will be queried. Currently,
> there is only one non-default driver, the vfio_ap device driver. The
> vfio_ap device driver manages AP queues passed through to one or more
> guests and we don't want to unexpectedly take AP resources away from
> guests which are most likely independently administered.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/ap_bus.c | 144 +++++++++++++++++++++++++++++++++--
> drivers/s390/crypto/ap_bus.h | 4 +
> 2 files changed, 142 insertions(+), 6 deletions(-)
(...)
> @@ -1196,12 +1202,75 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf)
> return rc;
> }
>
> +int __verify_card_reservations(struct device_driver *drv, void *data)
> +{
> + int rc = 0;
> + struct ap_driver *ap_drv = to_ap_drv(drv);
> + unsigned long *newapm = (unsigned long *)data;
> +
> + /*
> + * If the reserved bits do not identify cards reserved for use by the
> + * non-default driver, there is no need to verify the driver is using
> + * the queues.
I had to read that one several times... what about
"No need to verify whether the driver is using the queues if it is the
default driver."
?
> + */
> + if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
> + return 0;
> +
> + /* The non-default driver's module must be loaded */
> + if (!try_module_get(drv->owner))
> + return 0;
Is that really needed? I would have thought that the driver core's
klist usage would make sure that the callback would not be invoked for
drivers that are not registered anymore. Or am I missing a window?
> +
> + if (ap_drv->in_use)
> + if (ap_drv->in_use(newapm, ap_perms.aqm))
Can we log the offending apm somewhere, preferably with additional info
that allows the admin to figure out why an error was returned?
> + rc = -EADDRINUSE;
> +
> + module_put(drv->owner);
> +
> + return rc;
> +}
(Same comments for the other changes further along in this patch.)
On 14.04.20 14:58, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:03 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Introduces a new driver callback to prevent a root user from unbinding
>> an AP queue from its device driver if the queue is in use. The intent of
>> this callback is to provide a driver with the means to prevent a root user
>> from inadvertently taking a queue away from a guest and giving it to the
>> host while the guest is still using it. The callback will
>> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
>> attributes would result in one or more AP queues being removed from its
>> driver. If the callback responds in the affirmative for any driver
>> queried, the change to the apmask or aqmask will be rejected with a device
>> in use error.
>>
>> For this patch, only non-default drivers will be queried. Currently,
>> there is only one non-default driver, the vfio_ap device driver. The
>> vfio_ap device driver manages AP queues passed through to one or more
>> guests and we don't want to unexpectedly take AP resources away from
>> guests which are most likely independently administered.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/ap_bus.c | 144 +++++++++++++++++++++++++++++++++--
>> drivers/s390/crypto/ap_bus.h | 4 +
>> 2 files changed, 142 insertions(+), 6 deletions(-)
> (...)
>
>> @@ -1196,12 +1202,75 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf)
>> return rc;
>> }
>>
>> +int __verify_card_reservations(struct device_driver *drv, void *data)
>> +{
>> + int rc = 0;
>> + struct ap_driver *ap_drv = to_ap_drv(drv);
>> + unsigned long *newapm = (unsigned long *)data;
>> +
>> + /*
>> + * If the reserved bits do not identify cards reserved for use by the
>> + * non-default driver, there is no need to verify the driver is using
>> + * the queues.
> I had to read that one several times... what about
>
> "No need to verify whether the driver is using the queues if it is the
> default driver."
>
> ?
>
>> + */
>> + if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
>> + return 0;
>> +
>> + /* The non-default driver's module must be loaded */
>> + if (!try_module_get(drv->owner))
>> + return 0;
> Is that really needed? I would have thought that the driver core's
> klist usage would make sure that the callback would not be invoked for
> drivers that are not registered anymore. Or am I missing a window?
The try_module_get() and module_put() is a result of review feedback from
my side. The ap bus core is static in the kernel whereas the
vfio dd is a kernel module. So there may be a race condition between
calling the callback function and removal of the vfio dd module.
There is similar code in zcrypt_api which does the same for the zcrypt
device drivers before using some variables or functions from the modules.
Help me, it this is outdated code and there is no need to adjust the
module reference counter any more, then I would be happy to remove
this code :-)
>
>> +
>> + if (ap_drv->in_use)
>> + if (ap_drv->in_use(newapm, ap_perms.aqm))
> Can we log the offending apm somewhere, preferably with additional info
> that allows the admin to figure out why an error was returned?
>
>> + rc = -EADDRINUSE;
>> +
>> + module_put(drv->owner);
>> +
>> + return rc;
>> +}
> (Same comments for the other changes further along in this patch.)
>
On 4/14/20 8:58 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:03 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Introduces a new driver callback to prevent a root user from unbinding
>> an AP queue from its device driver if the queue is in use. The intent of
>> this callback is to provide a driver with the means to prevent a root user
>> from inadvertently taking a queue away from a guest and giving it to the
>> host while the guest is still using it. The callback will
>> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
>> attributes would result in one or more AP queues being removed from its
>> driver. If the callback responds in the affirmative for any driver
>> queried, the change to the apmask or aqmask will be rejected with a device
>> in use error.
>>
>> For this patch, only non-default drivers will be queried. Currently,
>> there is only one non-default driver, the vfio_ap device driver. The
>> vfio_ap device driver manages AP queues passed through to one or more
>> guests and we don't want to unexpectedly take AP resources away from
>> guests which are most likely independently administered.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/ap_bus.c | 144 +++++++++++++++++++++++++++++++++--
>> drivers/s390/crypto/ap_bus.h | 4 +
>> 2 files changed, 142 insertions(+), 6 deletions(-)
> (...)
>
>> @@ -1196,12 +1202,75 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf)
>> return rc;
>> }
>>
>> +int __verify_card_reservations(struct device_driver *drv, void *data)
>> +{
>> + int rc = 0;
>> + struct ap_driver *ap_drv = to_ap_drv(drv);
>> + unsigned long *newapm = (unsigned long *)data;
>> +
>> + /*
>> + * If the reserved bits do not identify cards reserved for use by the
>> + * non-default driver, there is no need to verify the driver is using
>> + * the queues.
> I had to read that one several times... what about
> "No need to verify whether the driver is using the queues if it is the
> default driver."
>
> ?
Sure, that's better.
>
>> + */
>> + if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
>> + return 0;
>> +
>> + /* The non-default driver's module must be loaded */
>> + if (!try_module_get(drv->owner))
>> + return 0;
> Is that really needed? I would have thought that the driver core's
> klist usage would make sure that the callback would not be invoked for
> drivers that are not registered anymore. Or am I missing a window?
>
>> +
>> + if (ap_drv->in_use)
>> + if (ap_drv->in_use(newapm, ap_perms.aqm))
> Can we log the offending apm somewhere, preferably with additional info
> that allows the admin to figure out why an error was returned?
One of the things on my TODO list is to add logging to the vfio_ap
module which will track all significant activity within the device
driver. I plan to do that with a patch or set of patches specifically
put together for that purpose. Having said that, the best place to
log this would be in the in_use callback in the vfio_ap device driver
(see next patch) where the APQNs that are in use can be identified.
For now, I will log a message to the dmesg log indicating which
APQNs are in use by the matrix mdev.
>
>> + rc = -EADDRINUSE;
>> +
>> + module_put(drv->owner);
>> +
>> + return rc;
>> +}
> (Same comments for the other changes further along in this patch.)
>
On 4/14/20 8:08 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:03 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Introduces a new driver callback to prevent a root user from unbinding
>> an AP queue from its device driver if the queue is in use. The intent of
>> this callback is to provide a driver with the means to prevent a root user
>> from inadvertently taking a queue away from a guest and giving it to the
>> host while the guest is still using it. The callback will
>> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
>> attributes would result in one or more AP queues being removed from its
>> driver. If the callback responds in the affirmative for any driver
>> queried, the change to the apmask or aqmask will be rejected with a device
>> in use error.
>>
>> For this patch, only non-default drivers will be queried. Currently,
>> there is only one non-default driver, the vfio_ap device driver. The
>> vfio_ap device driver manages AP queues passed through to one or more
>> guests and we don't want to unexpectedly take AP resources away from
>> guests which are most likely independently administered.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/ap_bus.c | 144 +++++++++++++++++++++++++++++++++--
>> drivers/s390/crypto/ap_bus.h | 4 +
>> 2 files changed, 142 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
>> index 5256e3ce84e5..af15c095e76a 100644
>> --- a/drivers/s390/crypto/ap_bus.c
>> +++ b/drivers/s390/crypto/ap_bus.c
>> @@ -35,6 +35,7 @@
>> #include <linux/mod_devicetable.h>
>> #include <linux/debugfs.h>
>> #include <linux/ctype.h>
>> +#include <linux/module.h>
>>
>> #include "ap_bus.h"
>> #include "ap_debug.h"
>> @@ -995,9 +996,11 @@ int ap_parse_mask_str(const char *str,
>> newmap = kmalloc(size, GFP_KERNEL);
>> if (!newmap)
>> return -ENOMEM;
>> - if (mutex_lock_interruptible(lock)) {
>> - kfree(newmap);
>> - return -ERESTARTSYS;
>> + if (lock) {
>> + if (mutex_lock_interruptible(lock)) {
>> + kfree(newmap);
>> + return -ERESTARTSYS;
>> + }
> This whole function is a bit odd. It seems all masks we want to
> manipulate are always guarded by the ap_perms_mutex, and the need for
> allowing lock == NULL comes from wanting to call this function with the
> ap_perms_mutex already held.
>
> That would argue for a locked/unlocked version of this function... but
> looking at it, why do we lock the way we do? The one thing this
> function (prior to this patch) does outside of the holding of the mutex
> is the allocation and freeing of newmap. But with this patch, we do the
> allocation and freeing of newmap while holding the mutex. Something
> seems a bit weird here.
Note that the ap_parse_mask function copies the newmap
to the bitmap passed in as a parameter to the function.
Prior to the introduction of this patch, the calling functions - i.e.,
apmask_store(), aqmask_store() and ap_perms_init() - passed
in the actual bitmap (i.e., ap_perms.apm or ap_perms aqm),
so the ap_perms were changed directly by this function.
With this patch, the apmask_store() and aqmask_store()
functions now pass in a copy of those bitmaps. This is so
we can verify that any APQNs being removed are not
in use by the vfio_ap device driver before committing the
change to ap_perms. Consequently, it is now necessary
to take the lock for the until the changes are committed.
Having explained that, you make a valid argument that
this calls for a locked/unlocked version of this function, so
I will modify this patch to that effect.
>
>> }
>>
>> if (*str == '+' || *str == '-') {
On Wed, 15 Apr 2020 08:08:24 +0200
Harald Freudenberger <[email protected]> wrote:
> On 14.04.20 14:58, Cornelia Huck wrote:
> > On Tue, 7 Apr 2020 15:20:03 -0400
> > Tony Krowiak <[email protected]> wrote:
> >> + /* The non-default driver's module must be loaded */
> >> + if (!try_module_get(drv->owner))
> >> + return 0;
> > Is that really needed? I would have thought that the driver core's
> > klist usage would make sure that the callback would not be invoked for
> > drivers that are not registered anymore. Or am I missing a window?
> The try_module_get() and module_put() is a result of review feedback from
> my side. The ap bus core is static in the kernel whereas the
> vfio dd is a kernel module. So there may be a race condition between
> calling the callback function and removal of the vfio dd module.
> There is similar code in zcrypt_api which does the same for the zcrypt
> device drivers before using some variables or functions from the modules.
> Help me, it this is outdated code and there is no need to adjust the
> module reference counter any more, then I would be happy to remove
> this code :-)
I think the driver core already should keep us safe. A built-in bus
calling a driver in a module is a very common pattern, and I think
->owner was introduced exactly for that case.
Unless I'm really missing something obvious?
On Wed, 15 Apr 2020 13:10:10 -0400
Tony Krowiak <[email protected]> wrote:
> On 4/14/20 8:58 AM, Cornelia Huck wrote:
> > On Tue, 7 Apr 2020 15:20:03 -0400
> > Tony Krowiak <[email protected]> wrote:
> >> +
> >> + if (ap_drv->in_use)
> >> + if (ap_drv->in_use(newapm, ap_perms.aqm))
> > Can we log the offending apm somewhere, preferably with additional info
> > that allows the admin to figure out why an error was returned?
>
> One of the things on my TODO list is to add logging to the vfio_ap
> module which will track all significant activity within the device
> driver. I plan to do that with a patch or set of patches specifically
> put together for that purpose. Having said that, the best place to
> log this would be in the in_use callback in the vfio_ap device driver
> (see next patch) where the APQNs that are in use can be identified.
> For now, I will log a message to the dmesg log indicating which
> APQNs are in use by the matrix mdev.
Sounds reasonable. My main issue was what an admin was supposed to do
until logging was in place :)
>
> >
> >> + rc = -EADDRINUSE;
> >> +
> >> + module_put(drv->owner);
> >> +
> >> + return rc;
> >> +}
On Wed, 15 Apr 2020 13:10:18 -0400
Tony Krowiak <[email protected]> wrote:
> On 4/14/20 8:08 AM, Cornelia Huck wrote:
> > On Tue, 7 Apr 2020 15:20:03 -0400
> > Tony Krowiak <[email protected]> wrote:
> >> @@ -995,9 +996,11 @@ int ap_parse_mask_str(const char *str,
> >> newmap = kmalloc(size, GFP_KERNEL);
> >> if (!newmap)
> >> return -ENOMEM;
> >> - if (mutex_lock_interruptible(lock)) {
> >> - kfree(newmap);
> >> - return -ERESTARTSYS;
> >> + if (lock) {
> >> + if (mutex_lock_interruptible(lock)) {
> >> + kfree(newmap);
> >> + return -ERESTARTSYS;
> >> + }
> > This whole function is a bit odd. It seems all masks we want to
> > manipulate are always guarded by the ap_perms_mutex, and the need for
> > allowing lock == NULL comes from wanting to call this function with the
> > ap_perms_mutex already held.
> >
> > That would argue for a locked/unlocked version of this function... but
> > looking at it, why do we lock the way we do? The one thing this
> > function (prior to this patch) does outside of the holding of the mutex
> > is the allocation and freeing of newmap. But with this patch, we do the
> > allocation and freeing of newmap while holding the mutex. Something
> > seems a bit weird here.
>
> Note that the ap_parse_mask function copies the newmap
> to the bitmap passed in as a parameter to the function.
> Prior to the introduction of this patch, the calling functions - i.e.,
> apmask_store(), aqmask_store() and ap_perms_init() - passed
> in the actual bitmap (i.e., ap_perms.apm or ap_perms aqm),
> so the ap_perms were changed directly by this function.
>
> With this patch, the apmask_store() and aqmask_store()
> functions now pass in a copy of those bitmaps. This is so
> we can verify that any APQNs being removed are not
> in use by the vfio_ap device driver before committing the
> change to ap_perms. Consequently, it is now necessary
> to take the lock for the until the changes are committed.
Yes, but every caller actually takes the mutex before calling this
function already :)
> Having explained that, you make a valid argument that
> this calls for a locked/unlocked version of this function, so
> I will modify this patch to that effect.
Ok.
The other thing I found weird is that the function does
alloc newmap -> grab mutex -> do manipulation -> release mutex -> free newmap
while the new callers do
(mutex already held) -> alloc newmap
so why grab/release the mutex the way the function does now? IOW, why
not have an unlocked __ap_parse_mask_string() and do
int ap_parse_mask_string(...)
{
int rc;
if (mutex_lock_interruptible(&ap_perms_mutex))
return -ERESTARTSYS;
rc = __ap_parse_mask_string(...);
mutex_unlock(&ap_perms_mutex);
return rc;
}
On Tue, 7 Apr 2020 15:20:04 -0400
Tony Krowiak <[email protected]> wrote:
> Let's implement the callback to indicate when an APQN
> is in use by the vfio_ap device driver. The callback is
> invoked whenever a change to the apmask or aqmask would
> result in one or more queue devices being removed from the driver. The
> vfio_ap device driver will indicate a resource is in use
> if the APQN of any of the queue devices to be removed are assigned to
> any of the matrix mdevs under the driver's control.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_drv.c | 1 +
> drivers/s390/crypto/vfio_ap_ops.c | 47 +++++++++++++++++----------
> drivers/s390/crypto/vfio_ap_private.h | 2 ++
> 3 files changed, 33 insertions(+), 17 deletions(-)
> @@ -1369,3 +1371,14 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
> kfree(q);
> mutex_unlock(&matrix_dev->lock);
> }
> +
> +bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
> +{
> + bool in_use;
> +
> + mutex_lock(&matrix_dev->lock);
> + in_use = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm) ? true : false;
Maybe
in_use = !!vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
?
> + mutex_unlock(&matrix_dev->lock);
> +
> + return in_use;
> +}
On Tue, 7 Apr 2020 15:20:05 -0400
Tony Krowiak <[email protected]> wrote:
> Let's introduce a shadow copy of the KVM guest's CRYCB and maintain it for
> the lifespan of the guest. The shadow CRYCB will be used to provide the
> AP configuration for a KVM guest.
'shadow CRYCB' seems to be a bit of a misnomer, as the real CRYCB has a
different format (for starters, it also contains key wrapping stuff).
It seems to be more of a 'shadow matrix'.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 31 +++++++++++++++++++++------
> drivers/s390/crypto/vfio_ap_private.h | 1 +
> 2 files changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 8ece0d52ff4c..b8b678032ab7 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -280,14 +280,32 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
> return 0;
> }
>
> +static void vfio_ap_matrix_clear(struct ap_matrix *matrix)
vfio_ap_matrix_clear_masks()?
> +{
> + bitmap_clear(matrix->apm, 0, AP_DEVICES);
> + bitmap_clear(matrix->aqm, 0, AP_DOMAINS);
> + bitmap_clear(matrix->adm, 0, AP_DOMAINS);
> +}
> +
> static void vfio_ap_matrix_init(struct ap_config_info *info,
> struct ap_matrix *matrix)
> {
> + vfio_ap_matrix_clear(matrix);
> matrix->apm_max = info->apxa ? info->Na : 63;
> matrix->aqm_max = info->apxa ? info->Nd : 15;
> matrix->adm_max = info->apxa ? info->Nd : 15;
> }
>
> +static bool vfio_ap_mdev_commit_crycb(struct ap_matrix_mdev *matrix_mdev)
vfio_ap_mdev_commit_masks()?
And it does not seem to return anything? (Maybe it should, to be
consumed below?)
> +{
> + if (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd) {
> + kvm_arch_crypto_set_masks(matrix_mdev->kvm,
> + matrix_mdev->shadow_crycb.apm,
> + matrix_mdev->shadow_crycb.aqm,
> + matrix_mdev->shadow_crycb.adm);
> + }
> +}
> +
> static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
> {
> struct ap_matrix_mdev *matrix_mdev;
> @@ -303,6 +321,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
>
> matrix_mdev->mdev = mdev;
> vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
> + vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_crycb);
> mdev_set_drvdata(mdev, matrix_mdev);
> matrix_mdev->pqap_hook.hook = handle_pqap;
> matrix_mdev->pqap_hook.owner = THIS_MODULE;
> @@ -1126,13 +1145,9 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
> if (ret)
> return NOTIFY_DONE;
>
> - /* If there is no CRYCB pointer, then we can't copy the masks */
> - if (!matrix_mdev->kvm->arch.crypto.crycbd)
> - return NOTIFY_DONE;
> -
> - kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
> - matrix_mdev->matrix.aqm,
> - matrix_mdev->matrix.adm);
> + memcpy(&matrix_mdev->shadow_crycb, &matrix_mdev->matrix,
> + sizeof(matrix_mdev->shadow_crycb));
> + vfio_ap_mdev_commit_crycb(matrix_mdev);
You are changing the return code for !crycb; maybe that's where a good
return code for vfio_ap_mdev_commit_crycb() would come in handy :)
>
> return NOTIFY_OK;
> }
> @@ -1247,6 +1262,8 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
> kvm_put_kvm(matrix_mdev->kvm);
> matrix_mdev->kvm = NULL;
> }
> +
> + vfio_ap_matrix_clear(&matrix_mdev->shadow_crycb);
> mutex_unlock(&matrix_dev->lock);
>
> vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index 4b6e144bab17..87cc270c3212 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -83,6 +83,7 @@ struct ap_matrix {
> struct ap_matrix_mdev {
> struct list_head node;
> struct ap_matrix matrix;
> + struct ap_matrix shadow_crycb;
I think shadow_matrix would be a better name.
> struct notifier_block group_notifier;
> struct notifier_block iommu_notifier;
> struct kvm *kvm;
On 4/16/20 6:05 AM, Cornelia Huck wrote:
> On Wed, 15 Apr 2020 13:10:18 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 4/14/20 8:08 AM, Cornelia Huck wrote:
>>> On Tue, 7 Apr 2020 15:20:03 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>> @@ -995,9 +996,11 @@ int ap_parse_mask_str(const char *str,
>>>> newmap = kmalloc(size, GFP_KERNEL);
>>>> if (!newmap)
>>>> return -ENOMEM;
>>>> - if (mutex_lock_interruptible(lock)) {
>>>> - kfree(newmap);
>>>> - return -ERESTARTSYS;
>>>> + if (lock) {
>>>> + if (mutex_lock_interruptible(lock)) {
>>>> + kfree(newmap);
>>>> + return -ERESTARTSYS;
>>>> + }
>>> This whole function is a bit odd. It seems all masks we want to
>>> manipulate are always guarded by the ap_perms_mutex, and the need for
>>> allowing lock == NULL comes from wanting to call this function with the
>>> ap_perms_mutex already held.
>>>
>>> That would argue for a locked/unlocked version of this function... but
>>> looking at it, why do we lock the way we do? The one thing this
>>> function (prior to this patch) does outside of the holding of the mutex
>>> is the allocation and freeing of newmap. But with this patch, we do the
>>> allocation and freeing of newmap while holding the mutex. Something
>>> seems a bit weird here.
>> Note that the ap_parse_mask function copies the newmap
>> to the bitmap passed in as a parameter to the function.
>> Prior to the introduction of this patch, the calling functions - i.e.,
>> apmask_store(), aqmask_store() and ap_perms_init() - passed
>> in the actual bitmap (i.e., ap_perms.apm or ap_perms aqm),
>> so the ap_perms were changed directly by this function.
>>
>> With this patch, the apmask_store() and aqmask_store()
>> functions now pass in a copy of those bitmaps. This is so
>> we can verify that any APQNs being removed are not
>> in use by the vfio_ap device driver before committing the
>> change to ap_perms. Consequently, it is now necessary
>> to take the lock for the until the changes are committed.
> Yes, but every caller actually takes the mutex before calling this
> function already :)
That is not a true statement, the ap_perms_init() function
does not take the mutex prior to calling this function. Keep in
mind, the ap_parse_mask function is not static and is exported,
I was precluded from removing the lock parameter from the function
definition.
>
>> Having explained that, you make a valid argument that
>> this calls for a locked/unlocked version of this function, so
>> I will modify this patch to that effect.
> Ok.
>
> The other thing I found weird is that the function does
> alloc newmap -> grab mutex -> do manipulation -> release mutex -> free newmap
> while the new callers do
> (mutex already held) -> alloc newmap
>
> so why grab/release the mutex the way the function does now? IOW, why
> not have an unlocked __ap_parse_mask_string() and do
In my last comment above, I agreed to create an unlocked version
of this function. Your example below is similar to what I
implemented after responding to your comment yesterday.
>
> int ap_parse_mask_string(...)
> {
> int rc;
>
> if (mutex_lock_interruptible(&ap_perms_mutex))
> return -ERESTARTSYS;
> rc = __ap_parse_mask_string(...);
> mutex_unlock(&ap_perms_mutex);
> return rc;
> }
On 4/16/20 7:18 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:04 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Let's implement the callback to indicate when an APQN
>> is in use by the vfio_ap device driver. The callback is
>> invoked whenever a change to the apmask or aqmask would
>> result in one or more queue devices being removed from the driver. The
>> vfio_ap device driver will indicate a resource is in use
>> if the APQN of any of the queue devices to be removed are assigned to
>> any of the matrix mdevs under the driver's control.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_drv.c | 1 +
>> drivers/s390/crypto/vfio_ap_ops.c | 47 +++++++++++++++++----------
>> drivers/s390/crypto/vfio_ap_private.h | 2 ++
>> 3 files changed, 33 insertions(+), 17 deletions(-)
>> @@ -1369,3 +1371,14 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
>> kfree(q);
>> mutex_unlock(&matrix_dev->lock);
>> }
>> +
>> +bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
>> +{
>> + bool in_use;
>> +
>> + mutex_lock(&matrix_dev->lock);
>> + in_use = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm) ? true : false;
> Maybe
>
> in_use = !!vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
>
> ?
To be honest, I find the !! expression very confusing. Every time I see
it, I have
to spend time thinking about what the result of !! is going to be. I think
the statement should be left as-is because it more clearly expresses
the intent.
>
>> + mutex_unlock(&matrix_dev->lock);
>> +
>> + return in_use;
>> +}
On 2020-04-16 16:45, Tony Krowiak wrote:
>
>
> On 4/16/20 7:18 AM, Cornelia Huck wrote:
>> On Tue, 7 Apr 2020 15:20:04 -0400
>> Tony Krowiak <[email protected]> wrote:
>>
>>> Let's implement the callback to indicate when an APQN
>>> is in use by the vfio_ap device driver. The callback is
>>> invoked whenever a change to the apmask or aqmask would
>>> result in one or more queue devices being removed from the driver. The
>>> vfio_ap device driver will indicate a resource is in use
>>> if the APQN of any of the queue devices to be removed are assigned to
>>> any of the matrix mdevs under the driver's control.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> drivers/s390/crypto/vfio_ap_drv.c | 1 +
>>> drivers/s390/crypto/vfio_ap_ops.c | 47 +++++++++++++++++----------
>>> drivers/s390/crypto/vfio_ap_private.h | 2 ++
>>> 3 files changed, 33 insertions(+), 17 deletions(-)
>>> @@ -1369,3 +1371,14 @@ void vfio_ap_mdev_remove_queue(struct ap_queue
>>> *queue)
>>> kfree(q);
>>> mutex_unlock(&matrix_dev->lock);
>>> }
>>> +
>>> +bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long
>>> *aqm)
>>> +{
>>> + bool in_use;
>>> +
>>> + mutex_lock(&matrix_dev->lock);
>>> + in_use = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm) ? true :
>>> false;
>> Maybe
>>
>> in_use = !!vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
>>
>> ?
>
> To be honest, I find the !! expression very confusing. Every time I see
> it, I have
> to spend time thinking about what the result of !! is going to be. I think
> the statement should be left as-is because it more clearly expresses
> the intent.
In other places you use
"
ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev);
if (ret)
goto share_err;
"
then why use a boolean here?
If you want to return a boolean and you do not want to use !! you can do:
...
ret = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
...
return (ret) ? false : true;
>
>>
>>> + mutex_unlock(&matrix_dev->lock);
>>> +
>>> + return in_use;
>>> +}
>
--
Pierre Morel
IBM Lab Boeblingen
On 16.04.20 11:33, Cornelia Huck wrote:
> On Wed, 15 Apr 2020 08:08:24 +0200
> Harald Freudenberger <[email protected]> wrote:
>
>> On 14.04.20 14:58, Cornelia Huck wrote:
>>> On Tue, 7 Apr 2020 15:20:03 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>> + /* The non-default driver's module must be loaded */
>>>> + if (!try_module_get(drv->owner))
>>>> + return 0;
>>> Is that really needed? I would have thought that the driver core's
>>> klist usage would make sure that the callback would not be invoked for
>>> drivers that are not registered anymore. Or am I missing a window?
>> The try_module_get() and module_put() is a result of review feedback from
>> my side. The ap bus core is static in the kernel whereas the
>> vfio dd is a kernel module. So there may be a race condition between
>> calling the callback function and removal of the vfio dd module.
>> There is similar code in zcrypt_api which does the same for the zcrypt
>> device drivers before using some variables or functions from the modules.
>> Help me, it this is outdated code and there is no need to adjust the
>> module reference counter any more, then I would be happy to remove
>> this code :-)
> I think the driver core already should keep us safe. A built-in bus
> calling a driver in a module is a very common pattern, and I think
> ->owner was introduced exactly for that case.
>
> Unless I'm really missing something obvious?
Hm. I tested a similar code (see zcrypt_api.c where try_module_get() and module_put()
is called surrounding use of functions related to the implementing driver.
The driver module has a reference count of 0 when not used and can get removed
- because refcount is 0 - at any time when there is nothing related to the driver pending.
As soon as the driver is actually used the try_module_get(...driver.owner) increases
the reference counter and makes it impossible to remove the module. After use the
module_put() reduces the reference count.
When I now remove the try_module_get() and module_put() calls and run this modified
code I immediately face a crash when the module is removed during use.
I see code in the kernel which does an initial try_module_get() on the driver to increase
the reference count, for example when the driver registers. However, I see no clear
way to remove such a driver module any more.
I know I had a fight with a tester some years ago where he stated that it is a valid
testcase to remove a device driver module 'during use of the driver'. So I'd like
to have the try_module_get() and module_put() invokations in the ap bus code
until you convince me there are other maybe better ways to make sure the
driver and it's functions are available at the time of the call.
Maybe we can discuss this offline if you wish :-)
On 4/16/20 7:58 AM, Cornelia Huck wrote:
> On Tue, 7 Apr 2020 15:20:05 -0400
> Tony Krowiak<[email protected]> wrote:
>
>> Let's introduce a shadow copy of the KVM guest's CRYCB and maintain it for
>> the lifespan of the guest. The shadow CRYCB will be used to provide the
>> AP configuration for a KVM guest.
> 'shadow CRYCB' seems to be a bit of a misnomer, as the real CRYCB has a
> different format (for starters, it also contains key wrapping stuff).
> It seems to be more of a 'shadow matrix'.
You make a valid point; however in reality, matrix - as it is used
throughout vfio ap - is a misnomer. The
matrix is actually comprised of the assigned APIDs and APQIs
(i.e., APQNs) and does not include the control domain assignments.
In reality, the APM, AQM and ADM are fields within the AP control
block (APCB) which is embedded within the CRYCB, so a more
accurate name might be 'shadow_apcb'. I think I'll go with that.
>> Signed-off-by: Tony Krowiak<[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 31 +++++++++++++++++++++------
>> drivers/s390/crypto/vfio_ap_private.h | 1 +
>> 2 files changed, 25 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>> index 8ece0d52ff4c..b8b678032ab7 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -280,14 +280,32 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>> return 0;
>> }
>>
>> +static void vfio_ap_matrix_clear(struct ap_matrix *matrix)
> vfio_ap_matrix_clear_masks()?
Sure, that works.
>
>> +{
>> + bitmap_clear(matrix->apm, 0, AP_DEVICES);
>> + bitmap_clear(matrix->aqm, 0, AP_DOMAINS);
>> + bitmap_clear(matrix->adm, 0, AP_DOMAINS);
>> +}
>> +
>> static void vfio_ap_matrix_init(struct ap_config_info *info,
>> struct ap_matrix *matrix)
>> {
>> + vfio_ap_matrix_clear(matrix);
>> matrix->apm_max = info->apxa ? info->Na : 63;
>> matrix->aqm_max = info->apxa ? info->Nd : 15;
>> matrix->adm_max = info->apxa ? info->Nd : 15;
>> }
>>
>> +static bool vfio_ap_mdev_commit_crycb(struct ap_matrix_mdev *matrix_mdev)
> vfio_ap_mdev_commit_masks()?
Since I am changing the name of shadow_crycb to shadow_apcb,
it probably makes more sense to rename this to
vfio_ap_mdev_commit_apcb().
>
> And it does not seem to return anything? (Maybe it should, to be
> consumed below?)
In patch 7, the check at the beginning of this function (for a CRYCB)
is moved into the vfio_ap_mdev_has_crycb() function, so there is
no need to return anything from this function. I will introduce the
vfio_ap_mdev_has_crycb() function in this patch instead for the
next submission.
>
>> +{
>> + if (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd) {
>> + kvm_arch_crypto_set_masks(matrix_mdev->kvm,
>> + matrix_mdev->shadow_crycb.apm,
>> + matrix_mdev->shadow_crycb.aqm,
>> + matrix_mdev->shadow_crycb.adm);
>> + }
>> +}
>> +
>> static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
>> {
>> struct ap_matrix_mdev *matrix_mdev;
>> @@ -303,6 +321,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
>>
>> matrix_mdev->mdev = mdev;
>> vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
>> + vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_crycb);
>> mdev_set_drvdata(mdev, matrix_mdev);
>> matrix_mdev->pqap_hook.hook = handle_pqap;
>> matrix_mdev->pqap_hook.owner = THIS_MODULE;
>> @@ -1126,13 +1145,9 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>> if (ret)
>> return NOTIFY_DONE;
>>
>> - /* If there is no CRYCB pointer, then we can't copy the masks */
>> - if (!matrix_mdev->kvm->arch.crypto.crycbd)
>> - return NOTIFY_DONE;
>> -
>> - kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
>> - matrix_mdev->matrix.aqm,
>> - matrix_mdev->matrix.adm);
>> + memcpy(&matrix_mdev->shadow_crycb, &matrix_mdev->matrix,
>> + sizeof(matrix_mdev->shadow_crycb));
>> + vfio_ap_mdev_commit_crycb(matrix_mdev);
> You are changing the return code for !crycb; maybe that's where a good
> return code for vfio_ap_mdev_commit_crycb() would come in handy :)
See my comments above regarding moving the introduction of the
vfio_ap_mdev_has_crycb() function from patch 7 to this patch.
In that case, that function will be called here before committing.
>
>>
>> return NOTIFY_OK;
>> }
>> @@ -1247,6 +1262,8 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>> kvm_put_kvm(matrix_mdev->kvm);
>> matrix_mdev->kvm = NULL;
>> }
>> +
>> + vfio_ap_matrix_clear(&matrix_mdev->shadow_crycb);
>> mutex_unlock(&matrix_dev->lock);
>>
>> vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
>> index 4b6e144bab17..87cc270c3212 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -83,6 +83,7 @@ struct ap_matrix {
>> struct ap_matrix_mdev {
>> struct list_head node;
>> struct ap_matrix matrix;
>> + struct ap_matrix shadow_crycb;
> I think shadow_matrix would be a better name.
Changing to shadow_apcb as per comments above.
>
>> struct notifier_block group_notifier;
>> struct notifier_block iommu_notifier;
>> struct kvm *kvm;
On Thu, 16 Apr 2020 10:45:20 -0400
Tony Krowiak <[email protected]> wrote:
>
>
> On 4/16/20 7:18 AM, Cornelia Huck wrote:
> > On Tue, 7 Apr 2020 15:20:04 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Let's implement the callback to indicate when an APQN
> >> is in use by the vfio_ap device driver. The callback is
> >> invoked whenever a change to the apmask or aqmask would
> >> result in one or more queue devices being removed from the driver. The
> >> vfio_ap device driver will indicate a resource is in use
> >> if the APQN of any of the queue devices to be removed are assigned to
> >> any of the matrix mdevs under the driver's control.
> >>
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> drivers/s390/crypto/vfio_ap_drv.c | 1 +
> >> drivers/s390/crypto/vfio_ap_ops.c | 47 +++++++++++++++++----------
> >> drivers/s390/crypto/vfio_ap_private.h | 2 ++
> >> 3 files changed, 33 insertions(+), 17 deletions(-)
> >> @@ -1369,3 +1371,14 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
> >> kfree(q);
> >> mutex_unlock(&matrix_dev->lock);
> >> }
> >> +
> >> +bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
> >> +{
> >> + bool in_use;
> >> +
> >> + mutex_lock(&matrix_dev->lock);
> >> + in_use = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm) ? true : false;
> > Maybe
> >
> > in_use = !!vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
> >
> > ?
>
> To be honest, I find the !! expression very confusing. Every time I see
> it, I have
> to spend time thinking about what the result of !! is going to be. I think
> the statement should be left as-is because it more clearly expresses
> the intent.
>
This is discussion is just about cosmetics, I believe. Just a piece of
advice: try to be sensitive about the community. In this community, and
I believe in C general !! is the idiomatic way to convert number to
boolean. Why would one want to do that is a bit longer story. The short
version is in logic condition context the value 0 is false and any
other value is true. !! keeps false value (0) false, and forces a true to
the most true true value. If you keep getting confused every time you
run across a !! that won't help with reading other peoples C.
Regards,
Halil
> >
> >> + mutex_unlock(&matrix_dev->lock);
> >> +
> >> + return in_use;
> >> +}
>
On Thu, 16 Apr 2020 11:37:21 +0200
Cornelia Huck <[email protected]> wrote:
> On Wed, 15 Apr 2020 13:10:10 -0400
> Tony Krowiak <[email protected]> wrote:
>
> > On 4/14/20 8:58 AM, Cornelia Huck wrote:
> > > On Tue, 7 Apr 2020 15:20:03 -0400
> > > Tony Krowiak <[email protected]> wrote:
>
> > >> +
> > >> + if (ap_drv->in_use)
> > >> + if (ap_drv->in_use(newapm, ap_perms.aqm))
> > > Can we log the offending apm somewhere, preferably with additional info
> > > that allows the admin to figure out why an error was returned?
> >
> > One of the things on my TODO list is to add logging to the vfio_ap
> > module which will track all significant activity within the device
> > driver. I plan to do that with a patch or set of patches specifically
> > put together for that purpose. Having said that, the best place to
> > log this would be in the in_use callback in the vfio_ap device driver
> > (see next patch) where the APQNs that are in use can be identified.
> > For now, I will log a message to the dmesg log indicating which
> > APQNs are in use by the matrix mdev.
>
> Sounds reasonable. My main issue was what an admin was supposed to do
> until logging was in place :)
Logging may not be the right answer here. Imagine somebody wants to build
a nice web-tool for managing this stuff at scale -- e.g. something HMC. I
don't think the solution is to let this tool parse the kernel messages
and try to relate that to its own transactions.
But I do agree, having a way to report why "won't do" to the end user is
important for usability.
Regards,
Halil
>
> >
> > >
> > >> + rc = -EADDRINUSE;
> > >> +
> > >> + module_put(drv->owner);
> > >> +
> > >> + return rc;
> > >> +}
>
On Tue, 7 Apr 2020 15:20:01 -0400
Tony Krowiak <[email protected]> wrote:
> Rather than looping over potentially 65535 objects, let's store the
> structures for caching information about queue devices bound to the
> vfio_ap device driver in a hash table keyed by APQN.
@Harald:
Would it make sense to make the efficient lookup of an apqueue base
on its APQN core AP functionality instead of each driver figuring it out
on it's own?
If I'm not wrong the zcrypt device/driver(s) must the problem of
looking up a queue based on its APQN as well.
For instance struct ep11_cprb has a target_id filed
(arch/s390/include/uapi/asm/zcrypt.h).
Regards,
Halil
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
> drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
> drivers/s390/crypto/vfio_ap_private.h | 10 ++-
> 3 files changed, 60 insertions(+), 68 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
> index be2520cc010b..e9c226c0730e 100644
> --- a/drivers/s390/crypto/vfio_ap_drv.c
> +++ b/drivers/s390/crypto/vfio_ap_drv.c
> @@ -51,15 +51,9 @@ MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
> */
> static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
> {
> - struct vfio_ap_queue *q;
> -
> - q = kzalloc(sizeof(*q), GFP_KERNEL);
> - if (!q)
> - return -ENOMEM;
> - dev_set_drvdata(&apdev->device, q);
> - q->apqn = to_ap_queue(&apdev->device)->qid;
> - q->saved_isc = VFIO_AP_ISC_INVALID;
> - return 0;
> + struct ap_queue *queue = to_ap_queue(&apdev->device);
> +
> + return vfio_ap_mdev_probe_queue(queue);
> }
>
> /**
> @@ -70,18 +64,9 @@ static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
> */
> static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
> {
> - struct vfio_ap_queue *q;
> - int apid, apqi;
> -
> - mutex_lock(&matrix_dev->lock);
> - q = dev_get_drvdata(&apdev->device);
> - dev_set_drvdata(&apdev->device, NULL);
> - apid = AP_QID_CARD(q->apqn);
> - apqi = AP_QID_QUEUE(q->apqn);
> - vfio_ap_mdev_reset_queue(apid, apqi, 1);
> - vfio_ap_irq_disable(q);
> - kfree(q);
> - mutex_unlock(&matrix_dev->lock);
> + struct ap_queue *queue = to_ap_queue(&apdev->device);
> +
> + vfio_ap_mdev_remove_queue(queue);
> }
>
> static void vfio_ap_matrix_dev_release(struct device *dev)
> @@ -135,6 +120,7 @@ static int vfio_ap_matrix_dev_create(void)
>
> mutex_init(&matrix_dev->lock);
> INIT_LIST_HEAD(&matrix_dev->mdev_list);
> + hash_init(matrix_dev->qtable);
>
> dev_set_name(&matrix_dev->device, "%s", VFIO_AP_DEV_NAME);
> matrix_dev->device.parent = root_device;
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 5c0f53c6dde7..134860934fe7 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -26,45 +26,16 @@
>
> static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
>
> -static int match_apqn(struct device *dev, const void *data)
> -{
> - struct vfio_ap_queue *q = dev_get_drvdata(dev);
> -
> - return (q->apqn == *(int *)(data)) ? 1 : 0;
> -}
> -
> -/**
> - * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
> - * @matrix_mdev: the associated mediated matrix
> - * @apqn: The queue APQN
> - *
> - * Retrieve a queue with a specific APQN from the list of the
> - * devices of the vfio_ap_drv.
> - * Verify that the APID and the APQI are set in the matrix.
> - *
> - * Returns the pointer to the associated vfio_ap_queue
> - */
> -static struct vfio_ap_queue *vfio_ap_get_queue(
> - struct ap_matrix_mdev *matrix_mdev,
> - int apqn)
> +struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
> {
> struct vfio_ap_queue *q;
> - struct device *dev;
> -
> - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
> - return NULL;
> - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
> - return NULL;
> -
> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> - &apqn, match_apqn);
> - if (!dev)
> - return NULL;
> - q = dev_get_drvdata(dev);
> - q->matrix_mdev = matrix_mdev;
> - put_device(dev);
>
> - return q;
> + hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
> + if (q && (apqn == q->apqn))
> + return q;
> + }
> +
> + return NULL;
> }
>
> /**
> @@ -293,10 +264,11 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
> matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
> struct ap_matrix_mdev, pqap_hook);
>
> - q = vfio_ap_get_queue(matrix_mdev, apqn);
> + q = vfio_ap_get_queue(apqn);
> if (!q)
> goto out_unlock;
>
> + q->matrix_mdev = matrix_mdev;
> status = vcpu->run->s.regs.gprs[1];
>
> /* If IR bit(16) is set we enable the interrupt */
> @@ -1116,16 +1088,11 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>
> static void vfio_ap_irq_disable_apqn(int apqn)
> {
> - struct device *dev;
> struct vfio_ap_queue *q;
>
> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> - &apqn, match_apqn);
> - if (dev) {
> - q = dev_get_drvdata(dev);
> + q = vfio_ap_get_queue(apqn);
> + if (q)
> vfio_ap_irq_disable(q);
> - put_device(dev);
> - }
> }
>
> int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> @@ -1302,3 +1269,38 @@ void vfio_ap_mdev_unregister(void)
> {
> mdev_unregister_device(&matrix_dev->device);
> }
> +
> +int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
> +{
> + struct vfio_ap_queue *q;
> +
> + q = kzalloc(sizeof(*q), GFP_KERNEL);
> + if (!q)
> + return -ENOMEM;
> +
> + mutex_lock(&matrix_dev->lock);
> + dev_set_drvdata(&queue->ap_dev.device, q);
> + q->apqn = queue->qid;
> + q->saved_isc = VFIO_AP_ISC_INVALID;
> + hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
> + mutex_unlock(&matrix_dev->lock);
> +
> + return 0;
> +}
> +
> +void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
> +{
> + struct vfio_ap_queue *q;
> + int apid, apqi;
> +
> + mutex_lock(&matrix_dev->lock);
> + q = dev_get_drvdata(&queue->ap_dev.device);
> + dev_set_drvdata(&queue->ap_dev.device, NULL);
> + apid = AP_QID_CARD(q->apqn);
> + apqi = AP_QID_QUEUE(q->apqn);
> + vfio_ap_mdev_reset_queue(apid, apqi, 1);
> + vfio_ap_irq_disable(q);
> + hash_del(&q->qnode);
> + kfree(q);
> + mutex_unlock(&matrix_dev->lock);
> +}
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index f46dde56b464..e1f8c82cc55d 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -18,6 +18,7 @@
> #include <linux/delay.h>
> #include <linux/mutex.h>
> #include <linux/kvm_host.h>
> +#include <linux/hashtable.h>
>
> #include "ap_bus.h"
>
> @@ -43,6 +44,7 @@ struct ap_matrix_dev {
> struct list_head mdev_list;
> struct mutex lock;
> struct ap_driver *vfio_ap_drv;
> + DECLARE_HASHTABLE(qtable, 8);
> };
>
> extern struct ap_matrix_dev *matrix_dev;
> @@ -90,8 +92,6 @@ struct ap_matrix_mdev {
>
> extern int vfio_ap_mdev_register(void);
> extern void vfio_ap_mdev_unregister(void);
> -int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> - unsigned int retry);
>
> struct vfio_ap_queue {
> struct ap_matrix_mdev *matrix_mdev;
> @@ -99,6 +99,10 @@ struct vfio_ap_queue {
> int apqn;
> #define VFIO_AP_ISC_INVALID 0xff
> unsigned char saved_isc;
> + struct hlist_node qnode;
> };
> -struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
> +
> +int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
> +void vfio_ap_mdev_remove_queue(struct ap_queue *queue);
> +
> #endif /* _VFIO_AP_PRIVATE_H_ */
On 4/23/20 11:13 PM, Halil Pasic wrote:
> On Thu, 16 Apr 2020 10:45:20 -0400
> Tony Krowiak <[email protected]> wrote:
>
>>
>> On 4/16/20 7:18 AM, Cornelia Huck wrote:
>>> On Tue, 7 Apr 2020 15:20:04 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> Let's implement the callback to indicate when an APQN
>>>> is in use by the vfio_ap device driver. The callback is
>>>> invoked whenever a change to the apmask or aqmask would
>>>> result in one or more queue devices being removed from the driver. The
>>>> vfio_ap device driver will indicate a resource is in use
>>>> if the APQN of any of the queue devices to be removed are assigned to
>>>> any of the matrix mdevs under the driver's control.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> drivers/s390/crypto/vfio_ap_drv.c | 1 +
>>>> drivers/s390/crypto/vfio_ap_ops.c | 47 +++++++++++++++++----------
>>>> drivers/s390/crypto/vfio_ap_private.h | 2 ++
>>>> 3 files changed, 33 insertions(+), 17 deletions(-)
>>>> @@ -1369,3 +1371,14 @@ void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
>>>> kfree(q);
>>>> mutex_unlock(&matrix_dev->lock);
>>>> }
>>>> +
>>>> +bool vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
>>>> +{
>>>> + bool in_use;
>>>> +
>>>> + mutex_lock(&matrix_dev->lock);
>>>> + in_use = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm) ? true : false;
>>> Maybe
>>>
>>> in_use = !!vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
>>>
>>> ?
>> To be honest, I find the !! expression very confusing. Every time I see
>> it, I have
>> to spend time thinking about what the result of !! is going to be. I think
>> the statement should be left as-is because it more clearly expresses
>> the intent.
>>
> This is discussion is just about cosmetics, I believe. Just a piece of
> advice: try to be sensitive about the community. In this community, and
> I believe in C general !! is the idiomatic way to convert number to
> boolean. Why would one want to do that is a bit longer story. The short
> version is in logic condition context the value 0 is false and any
> other value is true. !! keeps false value (0) false, and forces a true to
> the most true true value. If you keep getting confused every time you
> run across a !! that won't help with reading other peoples C.
>
> Regards,
> Halil
The point is moot. After seeing that Conny's comment generated a
discussion, I decided to avoid wasting additional time discussing
personal preferences and am now using the !! syntax. Unfortunately,
I've been having some odd problems with my email client and my
response to Pierre's comment never made it to the list, so I apologize
that you had to waste valuable time on your tutorial.
>
>>>> + mutex_unlock(&matrix_dev->lock);
>>>> +
>>>> + return in_use;
>>>> +}
On 4/23/20 11:33 PM, Halil Pasic wrote:
> On Thu, 16 Apr 2020 11:37:21 +0200
> Cornelia Huck <[email protected]> wrote:
>
>> On Wed, 15 Apr 2020 13:10:10 -0400
>> Tony Krowiak <[email protected]> wrote:
>>
>>> On 4/14/20 8:58 AM, Cornelia Huck wrote:
>>>> On Tue, 7 Apr 2020 15:20:03 -0400
>>>> Tony Krowiak <[email protected]> wrote:
>>>>> +
>>>>> + if (ap_drv->in_use)
>>>>> + if (ap_drv->in_use(newapm, ap_perms.aqm))
>>>> Can we log the offending apm somewhere, preferably with additional info
>>>> that allows the admin to figure out why an error was returned?
>>> One of the things on my TODO list is to add logging to the vfio_ap
>>> module which will track all significant activity within the device
>>> driver. I plan to do that with a patch or set of patches specifically
>>> put together for that purpose. Having said that, the best place to
>>> log this would be in the in_use callback in the vfio_ap device driver
>>> (see next patch) where the APQNs that are in use can be identified.
>>> For now, I will log a message to the dmesg log indicating which
>>> APQNs are in use by the matrix mdev.
>> Sounds reasonable. My main issue was what an admin was supposed to do
>> until logging was in place :)
> Logging may not be the right answer here. Imagine somebody wants to build
> a nice web-tool for managing this stuff at scale -- e.g. something HMC. I
> don't think the solution is to let this tool parse the kernel messages
> and try to relate that to its own transactions.
I don't believe there is no right or wrong answer here; I simply don't
see the relevance of discussing a tool in this context. We are talking
about a sysfs attribute interface here, so - correct me if I'm
mistaken - our options for notifying the user that a queue is in use are
limited to the return code from the sysfs interface and logging. I would
expect that a tool would have to do something similar to the callback
implemented in the vfio_ap device driver and check the APQNs
removed against the APQNs assigned to the mdevs to determine which
is in use.
>
> But I do agree, having a way to report why "won't do" to the end user is
> important for usability.
>
> Regards,
> Halil
>
>>>>
>>>>> + rc = -EADDRINUSE;
>>>>> +
>>>>> + module_put(drv->owner);
>>>>> +
>>>>> + return rc;
>>>>> +}
On Fri, 24 Apr 2020 13:07:38 -0400
Tony Krowiak <[email protected]> wrote:
>
>
> On 4/23/20 11:33 PM, Halil Pasic wrote:
> > On Thu, 16 Apr 2020 11:37:21 +0200
> > Cornelia Huck <[email protected]> wrote:
> >
> >> On Wed, 15 Apr 2020 13:10:10 -0400
> >> Tony Krowiak <[email protected]> wrote:
> >>
> >>> On 4/14/20 8:58 AM, Cornelia Huck wrote:
> >>>> On Tue, 7 Apr 2020 15:20:03 -0400
> >>>> Tony Krowiak <[email protected]> wrote:
> >>>>> +
> >>>>> + if (ap_drv->in_use)
> >>>>> + if (ap_drv->in_use(newapm, ap_perms.aqm))
> >>>> Can we log the offending apm somewhere, preferably with additional info
> >>>> that allows the admin to figure out why an error was returned?
> >>> One of the things on my TODO list is to add logging to the vfio_ap
> >>> module which will track all significant activity within the device
> >>> driver. I plan to do that with a patch or set of patches specifically
> >>> put together for that purpose. Having said that, the best place to
> >>> log this would be in the in_use callback in the vfio_ap device driver
> >>> (see next patch) where the APQNs that are in use can be identified.
> >>> For now, I will log a message to the dmesg log indicating which
> >>> APQNs are in use by the matrix mdev.
> >> Sounds reasonable. My main issue was what an admin was supposed to do
> >> until logging was in place :)
> > Logging may not be the right answer here. Imagine somebody wants to build
> > a nice web-tool for managing this stuff at scale -- e.g. something HMC. I
> > don't think the solution is to let this tool parse the kernel messages
> > and try to relate that to its own transactions.
>
> I don't believe there is no right or wrong answer here; I simply don't
> see the relevance of discussing a tool in this context. We are talking
> about a sysfs attribute interface here, so - correct me if I'm
> mistaken - our options for notifying the user that a queue is in use are
> limited to the return code from the sysfs interface and logging. I would
> expect that a tool would have to do something similar to the callback
> implemented in the vfio_ap device driver and check the APQNs
> removed against the APQNs assigned to the mdevs to determine which
> is in use.
>
We are talking interface design. The relevance of discussing a tool is
that any userspace tool must come by with whatever interface we come up
now. IMHO thinking about the usage (and the client code) is very helpful
in avoiding broken interface designs. AFAIK this is one of the basic
ideas behind test driven development.
Regards,
Halil
On 2020-04-07 21:20, Tony Krowiak wrote:
> Introduces a new driver callback to prevent a root user from unbinding
> an AP queue from its device driver if the queue is in use. The intent of
> this callback is to provide a driver with the means to prevent a root user
> from inadvertently taking a queue away from a guest and giving it to the
> host while the guest is still using it.
How can we know, at this point if the guest uses or not the queue?
Do you want to say that this prevents to take away a queue when it is
currently assigned to a VFIO device?
and with a guest currently using this VFIO device?
> The callback will
> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
> attributes would result in one or more AP queues being removed from its
> driver. If the callback responds in the affirmative for any driver
> queried, the change to the apmask or aqmask will be rejected with a device
> in use error.
AFAIU you mean that Linux's driver's binding and unbinding mechanism is
not sufficient to avoid this issue because unbind can not be refused by
the driver.
The reason why we do not want a single queue to be removed from the VFIO
driver is because the VFIO drivers works on a matrix, not on queues, and
for the matrix to be consistent it needs to acquire all queues defined
by the cross product of all APID and AQID assigned to the matrix.
This functionality is valid for the host as for the guests and is
handled automatically by the firmware with the CRYCB.
The AP bus uses QCI to retrieve the host CRYCB and build the hosts AP
queues.
If instead to mix VFIO CRYCB matrix handling and queues at the same
level inside the AP bus we separate these different firmware entities in
two different software entities.
If we make the AP bus sit above a CRYCB/Matrix bus, and in the way
virtualize the QCI and test AP queue instructions:
- we can directly pass a matrix device to the guest though a VFIO matrix
device
- the consistence will be automatic
- the VFIO device and parent device will be of the same kind which would
make the design much more clearer.
- there will be no need for these callback because the consistence of
the matrix will be guaranteed by firmware
>
> For this patch, only non-default drivers will be queried. Currently,
> there is only one non-default driver, the vfio_ap device driver.
You mean that the admin may take queues away from the "default driver",
while the queue is in use, to give it to an other driver?
Why is it to avoid in one way and not in the other way?
> The
> vfio_ap device driver manages AP queues passed through to one or more
> guests
I read this as if a queue may be passed to several guest...
please, rephrase or explain.
> and we don't want to unexpectedly take AP resources away from
> guests which are most likely independently administered.
When you say "independently administered", you mean as a second admin
inside the host, don't you?
Regards,
Pierre
--
Pierre Morel
IBM Lab Boeblingen
On 24.04.20 05:57, Halil Pasic wrote:
> On Tue, 7 Apr 2020 15:20:01 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Rather than looping over potentially 65535 objects, let's store the
>> structures for caching information about queue devices bound to the
>> vfio_ap device driver in a hash table keyed by APQN.
> @Harald:
> Would it make sense to make the efficient lookup of an apqueue base
> on its APQN core AP functionality instead of each driver figuring it out
> on it's own?
>
> If I'm not wrong the zcrypt device/driver(s) must the problem of
> looking up a queue based on its APQN as well.
>
> For instance struct ep11_cprb has a target_id filed
> (arch/s390/include/uapi/asm/zcrypt.h).
>
> Regards,
> Halil
Hi Halil
no, the zcrypt drivers don't have this problem. They build up their own device object which
includes a pointer to the base ap device.
However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
ap_card object there exists a list of ap_queues. So if Tony want's I could provide a function like
struct ap_queue *ap_get_ap_queue_dev(ap_qid_t qid);
Which returns a ptr to an ap_queue device or an error ptr. It would also do a get_device() on the
device within the ap_queue struct (and would need to run a put_device() after the caller has
finished using the ap_queue).
Tony just tell me, if I should go forward and provide this to you.
regards
Harald
>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_drv.c | 28 +++------
>> drivers/s390/crypto/vfio_ap_ops.c | 90 ++++++++++++++-------------
>> drivers/s390/crypto/vfio_ap_private.h | 10 ++-
>> 3 files changed, 60 insertions(+), 68 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
>> index be2520cc010b..e9c226c0730e 100644
>> --- a/drivers/s390/crypto/vfio_ap_drv.c
>> +++ b/drivers/s390/crypto/vfio_ap_drv.c
>> @@ -51,15 +51,9 @@ MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
>> */
>> static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
>> {
>> - struct vfio_ap_queue *q;
>> -
>> - q = kzalloc(sizeof(*q), GFP_KERNEL);
>> - if (!q)
>> - return -ENOMEM;
>> - dev_set_drvdata(&apdev->device, q);
>> - q->apqn = to_ap_queue(&apdev->device)->qid;
>> - q->saved_isc = VFIO_AP_ISC_INVALID;
>> - return 0;
>> + struct ap_queue *queue = to_ap_queue(&apdev->device);
>> +
>> + return vfio_ap_mdev_probe_queue(queue);
>> }
>>
>> /**
>> @@ -70,18 +64,9 @@ static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
>> */
>> static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
>> {
>> - struct vfio_ap_queue *q;
>> - int apid, apqi;
>> -
>> - mutex_lock(&matrix_dev->lock);
>> - q = dev_get_drvdata(&apdev->device);
>> - dev_set_drvdata(&apdev->device, NULL);
>> - apid = AP_QID_CARD(q->apqn);
>> - apqi = AP_QID_QUEUE(q->apqn);
>> - vfio_ap_mdev_reset_queue(apid, apqi, 1);
>> - vfio_ap_irq_disable(q);
>> - kfree(q);
>> - mutex_unlock(&matrix_dev->lock);
>> + struct ap_queue *queue = to_ap_queue(&apdev->device);
>> +
>> + vfio_ap_mdev_remove_queue(queue);
>> }
>>
>> static void vfio_ap_matrix_dev_release(struct device *dev)
>> @@ -135,6 +120,7 @@ static int vfio_ap_matrix_dev_create(void)
>>
>> mutex_init(&matrix_dev->lock);
>> INIT_LIST_HEAD(&matrix_dev->mdev_list);
>> + hash_init(matrix_dev->qtable);
>>
>> dev_set_name(&matrix_dev->device, "%s", VFIO_AP_DEV_NAME);
>> matrix_dev->device.parent = root_device;
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>> index 5c0f53c6dde7..134860934fe7 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -26,45 +26,16 @@
>>
>> static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
>>
>> -static int match_apqn(struct device *dev, const void *data)
>> -{
>> - struct vfio_ap_queue *q = dev_get_drvdata(dev);
>> -
>> - return (q->apqn == *(int *)(data)) ? 1 : 0;
>> -}
>> -
>> -/**
>> - * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
>> - * @matrix_mdev: the associated mediated matrix
>> - * @apqn: The queue APQN
>> - *
>> - * Retrieve a queue with a specific APQN from the list of the
>> - * devices of the vfio_ap_drv.
>> - * Verify that the APID and the APQI are set in the matrix.
>> - *
>> - * Returns the pointer to the associated vfio_ap_queue
>> - */
>> -static struct vfio_ap_queue *vfio_ap_get_queue(
>> - struct ap_matrix_mdev *matrix_mdev,
>> - int apqn)
>> +struct vfio_ap_queue *vfio_ap_get_queue(unsigned long apqn)
>> {
>> struct vfio_ap_queue *q;
>> - struct device *dev;
>> -
>> - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
>> - return NULL;
>> - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
>> - return NULL;
>> -
>> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
>> - &apqn, match_apqn);
>> - if (!dev)
>> - return NULL;
>> - q = dev_get_drvdata(dev);
>> - q->matrix_mdev = matrix_mdev;
>> - put_device(dev);
>>
>> - return q;
>> + hash_for_each_possible(matrix_dev->qtable, q, qnode, apqn) {
>> + if (q && (apqn == q->apqn))
>> + return q;
>> + }
>> +
>> + return NULL;
>> }
>>
>> /**
>> @@ -293,10 +264,11 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>> matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
>> struct ap_matrix_mdev, pqap_hook);
>>
>> - q = vfio_ap_get_queue(matrix_mdev, apqn);
>> + q = vfio_ap_get_queue(apqn);
>> if (!q)
>> goto out_unlock;
>>
>> + q->matrix_mdev = matrix_mdev;
>> status = vcpu->run->s.regs.gprs[1];
>>
>> /* If IR bit(16) is set we enable the interrupt */
>> @@ -1116,16 +1088,11 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>
>> static void vfio_ap_irq_disable_apqn(int apqn)
>> {
>> - struct device *dev;
>> struct vfio_ap_queue *q;
>>
>> - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
>> - &apqn, match_apqn);
>> - if (dev) {
>> - q = dev_get_drvdata(dev);
>> + q = vfio_ap_get_queue(apqn);
>> + if (q)
>> vfio_ap_irq_disable(q);
>> - put_device(dev);
>> - }
>> }
>>
>> int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
>> @@ -1302,3 +1269,38 @@ void vfio_ap_mdev_unregister(void)
>> {
>> mdev_unregister_device(&matrix_dev->device);
>> }
>> +
>> +int vfio_ap_mdev_probe_queue(struct ap_queue *queue)
>> +{
>> + struct vfio_ap_queue *q;
>> +
>> + q = kzalloc(sizeof(*q), GFP_KERNEL);
>> + if (!q)
>> + return -ENOMEM;
>> +
>> + mutex_lock(&matrix_dev->lock);
>> + dev_set_drvdata(&queue->ap_dev.device, q);
>> + q->apqn = queue->qid;
>> + q->saved_isc = VFIO_AP_ISC_INVALID;
>> + hash_add(matrix_dev->qtable, &q->qnode, q->apqn);
>> + mutex_unlock(&matrix_dev->lock);
>> +
>> + return 0;
>> +}
>> +
>> +void vfio_ap_mdev_remove_queue(struct ap_queue *queue)
>> +{
>> + struct vfio_ap_queue *q;
>> + int apid, apqi;
>> +
>> + mutex_lock(&matrix_dev->lock);
>> + q = dev_get_drvdata(&queue->ap_dev.device);
>> + dev_set_drvdata(&queue->ap_dev.device, NULL);
>> + apid = AP_QID_CARD(q->apqn);
>> + apqi = AP_QID_QUEUE(q->apqn);
>> + vfio_ap_mdev_reset_queue(apid, apqi, 1);
>> + vfio_ap_irq_disable(q);
>> + hash_del(&q->qnode);
>> + kfree(q);
>> + mutex_unlock(&matrix_dev->lock);
>> +}
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
>> index f46dde56b464..e1f8c82cc55d 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -18,6 +18,7 @@
>> #include <linux/delay.h>
>> #include <linux/mutex.h>
>> #include <linux/kvm_host.h>
>> +#include <linux/hashtable.h>
>>
>> #include "ap_bus.h"
>>
>> @@ -43,6 +44,7 @@ struct ap_matrix_dev {
>> struct list_head mdev_list;
>> struct mutex lock;
>> struct ap_driver *vfio_ap_drv;
>> + DECLARE_HASHTABLE(qtable, 8);
>> };
>>
>> extern struct ap_matrix_dev *matrix_dev;
>> @@ -90,8 +92,6 @@ struct ap_matrix_mdev {
>>
>> extern int vfio_ap_mdev_register(void);
>> extern void vfio_ap_mdev_unregister(void);
>> -int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
>> - unsigned int retry);
>>
>> struct vfio_ap_queue {
>> struct ap_matrix_mdev *matrix_mdev;
>> @@ -99,6 +99,10 @@ struct vfio_ap_queue {
>> int apqn;
>> #define VFIO_AP_ISC_INVALID 0xff
>> unsigned char saved_isc;
>> + struct hlist_node qnode;
>> };
>> -struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
>> +
>> +int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
>> +void vfio_ap_mdev_remove_queue(struct ap_queue *queue);
>> +
>> #endif /* _VFIO_AP_PRIVATE_H_ */
On Mon, 27 Apr 2020 15:05:23 +0200
Harald Freudenberger <[email protected]> wrote:
> On 24.04.20 05:57, Halil Pasic wrote:
> > On Tue, 7 Apr 2020 15:20:01 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Rather than looping over potentially 65535 objects, let's store the
> >> structures for caching information about queue devices bound to the
> >> vfio_ap device driver in a hash table keyed by APQN.
> > @Harald:
> > Would it make sense to make the efficient lookup of an apqueue base
> > on its APQN core AP functionality instead of each driver figuring it out
> > on it's own?
> >
> > If I'm not wrong the zcrypt device/driver(s) must the problem of
> > looking up a queue based on its APQN as well.
> >
> > For instance struct ep11_cprb has a target_id filed
> > (arch/s390/include/uapi/asm/zcrypt.h).
> >
> > Regards,
> > Halil
>
> Hi Halil
>
> no, the zcrypt drivers don't have this problem. They build up their own device object which
> includes a pointer to the base ap device.
I'm a bit confused. Doesn't your code loop first trough the ap_card
objects to find the APID portion of the APQN, and then loop the queue
list of the matching card to find the right ap_queue object? Or did I
miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
point me to the code that avoids the lookup (by apqn) for zcrypt?
If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
it basically about finding the queue based on the apqn, with the
difference that it is vfio specific.
Regards,
Halil
>
> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
> ap_card object there exists a list of ap_queues.
On 4/24/20 2:23 PM, Halil Pasic wrote:
> On Fri, 24 Apr 2020 13:07:38 -0400
> Tony Krowiak <[email protected]> wrote:
>
>>
>> On 4/23/20 11:33 PM, Halil Pasic wrote:
>>> On Thu, 16 Apr 2020 11:37:21 +0200
>>> Cornelia Huck <[email protected]> wrote:
>>>
>>>> On Wed, 15 Apr 2020 13:10:10 -0400
>>>> Tony Krowiak <[email protected]> wrote:
>>>>
>>>>> On 4/14/20 8:58 AM, Cornelia Huck wrote:
>>>>>> On Tue, 7 Apr 2020 15:20:03 -0400
>>>>>> Tony Krowiak <[email protected]> wrote:
>>>>>>> +
>>>>>>> + if (ap_drv->in_use)
>>>>>>> + if (ap_drv->in_use(newapm, ap_perms.aqm))
>>>>>> Can we log the offending apm somewhere, preferably with additional info
>>>>>> that allows the admin to figure out why an error was returned?
>>>>> One of the things on my TODO list is to add logging to the vfio_ap
>>>>> module which will track all significant activity within the device
>>>>> driver. I plan to do that with a patch or set of patches specifically
>>>>> put together for that purpose. Having said that, the best place to
>>>>> log this would be in the in_use callback in the vfio_ap device driver
>>>>> (see next patch) where the APQNs that are in use can be identified.
>>>>> For now, I will log a message to the dmesg log indicating which
>>>>> APQNs are in use by the matrix mdev.
>>>> Sounds reasonable. My main issue was what an admin was supposed to do
>>>> until logging was in place :)
>>> Logging may not be the right answer here. Imagine somebody wants to build
>>> a nice web-tool for managing this stuff at scale -- e.g. something HMC. I
>>> don't think the solution is to let this tool parse the kernel messages
>>> and try to relate that to its own transactions.
>> I don't believe there is no right or wrong answer here; I simply don't
>> see the relevance of discussing a tool in this context. We are talking
>> about a sysfs attribute interface here, so - correct me if I'm
>> mistaken - our options for notifying the user that a queue is in use are
>> limited to the return code from the sysfs interface and logging. I would
>> expect that a tool would have to do something similar to the callback
>> implemented in the vfio_ap device driver and check the APQNs
>> removed against the APQNs assigned to the mdevs to determine which
>> is in use.
>>
> We are talking interface design. The relevance of discussing a tool is
> that any userspace tool must come by with whatever interface we come up
> now. IMHO thinking about the usage (and the client code) is very helpful
> in avoiding broken interface designs. AFAIK this is one of the basic
> ideas behind test driven development.
What can a sysfs interface, such as apmask/aqmask, do other than reply with
a return code given sysfs attributes that store data can only return a
count?
I'm sorry, but I still don't see the relevance.
>
> Regards,
> Halil
On 4/27/20 11:17 AM, Halil Pasic wrote:
> On Mon, 27 Apr 2020 15:05:23 +0200
> Harald Freudenberger <[email protected]> wrote:
>
>> On 24.04.20 05:57, Halil Pasic wrote:
>>> On Tue, 7 Apr 2020 15:20:01 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> Rather than looping over potentially 65535 objects, let's store the
>>>> structures for caching information about queue devices bound to the
>>>> vfio_ap device driver in a hash table keyed by APQN.
>>> @Harald:
>>> Would it make sense to make the efficient lookup of an apqueue base
>>> on its APQN core AP functionality instead of each driver figuring it out
>>> on it's own?
>>>
>>> If I'm not wrong the zcrypt device/driver(s) must the problem of
>>> looking up a queue based on its APQN as well.
>>>
>>> For instance struct ep11_cprb has a target_id filed
>>> (arch/s390/include/uapi/asm/zcrypt.h).
>>>
>>> Regards,
>>> Halil
>> Hi Halil
>>
>> no, the zcrypt drivers don't have this problem. They build up their own device object which
>> includes a pointer to the base ap device.
> I'm a bit confused. Doesn't your code loop first trough the ap_card
> objects to find the APID portion of the APQN, and then loop the queue
> list of the matching card to find the right ap_queue object? Or did I
> miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
> point me to the code that avoids the lookup (by apqn) for zcrypt?
The code you reference, _zcrypt_send_ep11_cprb(), does loop through
each queue associated with each card, but it doesn't appear to be
looking for
a queue with a particular APQN. It appears to be looking for a queue
meeting a specific set of conditions. At least that's my take after
taking a very
brief look at the code, so I'm not sure that applies here.
>
>
> If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
> it basically about finding the queue based on the apqn, with the
> difference that it is vfio specific.
>
> Regards,
> Halil
>
>> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
>> ap_card object there exists a list of ap_queues.
>
>
>
On 4/27/20 4:20 AM, Pierre Morel wrote:
>
>
> On 2020-04-07 21:20, Tony Krowiak wrote:
>> Introduces a new driver callback to prevent a root user from unbinding
>> an AP queue from its device driver if the queue is in use. The intent of
>> this callback is to provide a driver with the means to prevent a root
>> user
>> from inadvertently taking a queue away from a guest and giving it to the
>> host while the guest is still using it.
>
> How can we know, at this point if the guest uses or not the queue?
The struct ap_matrix_mdev has a field, struct kvm *kvm, which holds a
pointer to KVM when
the matrix mdev is in use by a guest. This patch series also introduces
a shadow_crycb (soon to
be shadow_apcb) which holds the AP configuration for the guest. Between
those two things,
the driver can detect when a queue is in use by a guest.
> Do you want to say that this prevents to take away a queue when it is
> currently assigned to a VFIO device?
> and with a guest currently using this VFIO device?
No, I do not. The intent here is to enforce the proper procedure for
giving up a queue so it is done
deliberately. Before taking a queue away from the matrix mdev, its APQN
should be unassigned
from the matrix mdev. That is not to say that if there are major
objections to this that we can't
base in_use upon the queue being in use by a guest at the time. Maybe
that is preferable to
the community. I'll leave it to them to state their case.
>
>> The callback will
>> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
>> attributes would result in one or more AP queues being removed from its
>> driver. If the callback responds in the affirmative for any driver
>> queried, the change to the apmask or aqmask will be rejected with a
>> device
>> in use error.
>
> AFAIU you mean that Linux's driver's binding and unbinding mechanism
> is not sufficient to avoid this issue because unbind can not be
> refused by the driver.
Correct!
>
>
> The reason why we do not want a single queue to be removed from the
> VFIO driver is because the VFIO drivers works on a matrix, not on
> queues, and for the matrix to be consistent it needs to acquire all
> queues defined by the cross product of all APID and AQID assigned to
> the matrix.
Not correct. The reason why is because we do not want a queue to be
surreptitiously removed
without the guest administrator being aware of its removal.
>
> This functionality is valid for the host as for the guests and is
> handled automatically by the firmware with the CRYCB.
> The AP bus uses QCI to retrieve the host CRYCB and build the hosts AP
> queues.
>
> If instead to mix VFIO CRYCB matrix handling and queues at the same
> level inside the AP bus we separate these different firmware entities
> in two different software entities.
>
> If we make the AP bus sit above a CRYCB/Matrix bus, and in the way
> virtualize the QCI and test AP queue instructions:
> - we can directly pass a matrix device to the guest though a VFIO
> matrix device
> - the consistence will be automatic
> - the VFIO device and parent device will be of the same kind which
> would make the design much more clearer.
> - there will be no need for these callback because the consistence of
> the matrix will be guaranteed by firmware
As stated in my response above, the issue here is not consistency. While
the design you describe
may be reasonable, it is a major departure from what is out in the
field. In other words, that ship
has sailed.
>
>
>>
>> For this patch, only non-default drivers will be queried. Currently,
>> there is only one non-default driver, the vfio_ap device driver.
>
> You mean that the admin may take queues away from the "default
> driver", while the queue is in use, to give it to an other driver?
> Why is it to avoid in one way and not in the other way?
Because the default drivers have direct control over the queues and can
ensure they are empty
and reset before giving up control. The vfio driver does not have direct
control over the queues
because they have been passed through to the guest.
>
>> The
>> vfio_ap device driver manages AP queues passed through to one or more
>> guests
>
> I read this as if a queue may be passed to several guest...
> please, rephrase or explain.
AP queues is plural, so it is true that AP queues can be passed through
to more than one guest. I see your point, however, so I'll reword that
to be more clear.
>
>> and we don't want to unexpectedly take AP resources away from
>> guests which are most likely independently administered.
>
> When you say "independently administered", you mean as a second admin
> inside the host, don't you?
I mean that a guest can be administered by a different person than the
host administrator.
Again, I'll try to clarify this.
>
>
> Regards,
> Pierre
>
On 2020-04-28 00:24, Tony Krowiak wrote:
>
>
> On 4/27/20 4:20 AM, Pierre Morel wrote:
>>
>>
>> On 2020-04-07 21:20, Tony Krowiak wrote:
>>> Introduces a new driver callback to prevent a root user from unbinding
>>> an AP queue from its device driver if the queue is in use. The intent of
>>> this callback is to provide a driver with the means to prevent a root
>>> user
>>> from inadvertently taking a queue away from a guest and giving it to the
>>> host while the guest is still using it.
...snip...
>>
>> This functionality is valid for the host as for the guests and is
>> handled automatically by the firmware with the CRYCB.
>> The AP bus uses QCI to retrieve the host CRYCB and build the hosts AP
>> queues.
>>
>> If instead to mix VFIO CRYCB matrix handling and queues at the same
>> level inside the AP bus we separate these different firmware entities
>> in two different software entities.
>>
>> If we make the AP bus sit above a CRYCB/Matrix bus, and in the way
>> virtualize the QCI and test AP queue instructions:
>> - we can directly pass a matrix device to the guest though a VFIO
>> matrix device
>> - the consistence will be automatic
>> - the VFIO device and parent device will be of the same kind which
>> would make the design much more clearer.
>> - there will be no need for these callback because the consistence of
>> the matrix will be guaranteed by firmware
>
> As stated in my response above, the issue here is not consistency. While
> the design you describe
> may be reasonable, it is a major departure from what is out in the
> field. In other words, that ship
> has sailed.
The current VFIO-AP driver works as before, without any change, above
the Matrix device I suggest.
Aside the old scheme which can continue, the Matrix device can be used
directly to build a VFIO Matrix device, usable by QEMU without any
modification.
Once the dynamic extensions proposed in this series and the associated
tools are out on the field, then yes the ship is really far.
For now, the existing user's API do not change, the existing tools do
not need modifications and we can repair the ship for its long journey.
The inconsistency between device and VFIO device and the resulting
complexity is not going to ease future enhancement.
Regards,
Pierre
--
Pierre Morel
IBM Lab Boeblingen
On Mon, 27 Apr 2020 17:48:58 -0400
Tony Krowiak <[email protected]> wrote:
>
>
> On 4/27/20 11:17 AM, Halil Pasic wrote:
> > On Mon, 27 Apr 2020 15:05:23 +0200
> > Harald Freudenberger <[email protected]> wrote:
> >
> >> On 24.04.20 05:57, Halil Pasic wrote:
> >>> On Tue, 7 Apr 2020 15:20:01 -0400
> >>> Tony Krowiak <[email protected]> wrote:
> >>>
> >>>> Rather than looping over potentially 65535 objects, let's store the
> >>>> structures for caching information about queue devices bound to the
> >>>> vfio_ap device driver in a hash table keyed by APQN.
> >>> @Harald:
> >>> Would it make sense to make the efficient lookup of an apqueue base
> >>> on its APQN core AP functionality instead of each driver figuring it out
> >>> on it's own?
> >>>
> >>> If I'm not wrong the zcrypt device/driver(s) must the problem of
> >>> looking up a queue based on its APQN as well.
> >>>
> >>> For instance struct ep11_cprb has a target_id filed
> >>> (arch/s390/include/uapi/asm/zcrypt.h).
> >>>
> >>> Regards,
> >>> Halil
> >> Hi Halil
> >>
> >> no, the zcrypt drivers don't have this problem. They build up their own device object which
> >> includes a pointer to the base ap device.
> > I'm a bit confused. Doesn't your code loop first trough the ap_card
> > objects to find the APID portion of the APQN, and then loop the queue
> > list of the matching card to find the right ap_queue object? Or did I
> > miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
> > point me to the code that avoids the lookup (by apqn) for zcrypt?
>
> The code you reference, _zcrypt_send_ep11_cprb(), does loop through
> each queue associated with each card, but it doesn't appear to be
> looking for
> a queue with a particular APQN. It appears to be looking for a queue
> meeting a specific set of conditions. At least that's my take after
> taking a very
> brief look at the code, so I'm not sure that applies here.
>
One of the possible conditions is that the APQN is in the targets array.
Please have another look at the code below, is_desired_ep11_queue()
and is_desired_ep11_card() do APQI and APID part of the check
respectively:
for_each_zcrypt_card(zc) {
/* Check for online EP11 cards */
if (!zc->online || !(zc->card->functions & 0x04000000))
continue;
/* Check for user selected EP11 card */
if (targets &&
!is_desired_ep11_card(zc->card->id, target_num, targets))
continue;
/* check if device node has admission for this card */
if (!zcrypt_check_card(perms, zc->card->id))
continue;
/* get weight index of the card device */
weight = speed_idx_ep11(func_code) * zc->speed_rating[SECKEY];
if (zcrypt_card_compare(zc, pref_zc, weight, pref_weight))
continue;
for_each_zcrypt_queue(zq, zc) {
/* check if device is online and eligible */
if (!zq->online ||
!zq->ops->send_ep11_cprb ||
(targets &&
!is_desired_ep11_queue(zq->queue->qid,
target_num, targets)))
Yes the size of targets may or may not be 1 (example for size == 1 is
the invocation form ep11_cryptsingle()) and the respective costs
depend on the usual size of the array. Since the goal of the whole
exercise seems to be to pick a single queue, and we settle with the first
suitable (first not in the input array, but in our lists) that is
suitable, I assumed we wouldn't need many hashtable lookups.
Regards,
Halil
> >
> >
> > If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
> > it basically about finding the queue based on the apqn, with the
> > difference that it is vfio specific.
> >
> > Regards,
> > Halil
> >
> >> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
> >> ap_card object there exists a list of ap_queues.
> >
> >
> >
>
On 27.04.20 17:17, Halil Pasic wrote:
> On Mon, 27 Apr 2020 15:05:23 +0200
> Harald Freudenberger <[email protected]> wrote:
>
>> On 24.04.20 05:57, Halil Pasic wrote:
>>> On Tue, 7 Apr 2020 15:20:01 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> Rather than looping over potentially 65535 objects, let's store the
>>>> structures for caching information about queue devices bound to the
>>>> vfio_ap device driver in a hash table keyed by APQN.
>>> @Harald:
>>> Would it make sense to make the efficient lookup of an apqueue base
>>> on its APQN core AP functionality instead of each driver figuring it out
>>> on it's own?
>>>
>>> If I'm not wrong the zcrypt device/driver(s) must the problem of
>>> looking up a queue based on its APQN as well.
>>>
>>> For instance struct ep11_cprb has a target_id filed
>>> (arch/s390/include/uapi/asm/zcrypt.h).
>>>
>>> Regards,
>>> Halil
>> Hi Halil
>>
>> no, the zcrypt drivers don't have this problem. They build up their own device object which
>> includes a pointer to the base ap device.
> I'm a bit confused. Doesn't your code loop first trough the ap_card
> objects to find the APID portion of the APQN, and then loop the queue
> list of the matching card to find the right ap_queue object? Or did I
> miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
> point me to the code that avoids the lookup (by apqn) for zcrypt?
No. The code is looping through zcrypt_card and zcrypt_queue objects which
are build up and held by the zcrypt api and the zcrypt driver(s).
It does not deal with ap_card and ap_queue devices.
>
>
> If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
> it basically about finding the queue based on the apqn, with the
> difference that it is vfio specific.
>
> Regards,
> Halil
>
>> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
>> ap_card object there exists a list of ap_queues.
>
>
>
On 28.04.20 12:07, Halil Pasic wrote:
> On Mon, 27 Apr 2020 17:48:58 -0400
> Tony Krowiak <[email protected]> wrote:
>
>>
>> On 4/27/20 11:17 AM, Halil Pasic wrote:
>>> On Mon, 27 Apr 2020 15:05:23 +0200
>>> Harald Freudenberger <[email protected]> wrote:
>>>
>>>> On 24.04.20 05:57, Halil Pasic wrote:
>>>>> On Tue, 7 Apr 2020 15:20:01 -0400
>>>>> Tony Krowiak <[email protected]> wrote:
>>>>>
>>>>>> Rather than looping over potentially 65535 objects, let's store the
>>>>>> structures for caching information about queue devices bound to the
>>>>>> vfio_ap device driver in a hash table keyed by APQN.
>>>>> @Harald:
>>>>> Would it make sense to make the efficient lookup of an apqueue base
>>>>> on its APQN core AP functionality instead of each driver figuring it out
>>>>> on it's own?
>>>>>
>>>>> If I'm not wrong the zcrypt device/driver(s) must the problem of
>>>>> looking up a queue based on its APQN as well.
>>>>>
>>>>> For instance struct ep11_cprb has a target_id filed
>>>>> (arch/s390/include/uapi/asm/zcrypt.h).
>>>>>
>>>>> Regards,
>>>>> Halil
>>>> Hi Halil
>>>>
>>>> no, the zcrypt drivers don't have this problem. They build up their own device object which
>>>> includes a pointer to the base ap device.
>>> I'm a bit confused. Doesn't your code loop first trough the ap_card
>>> objects to find the APID portion of the APQN, and then loop the queue
>>> list of the matching card to find the right ap_queue object? Or did I
>>> miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
>>> point me to the code that avoids the lookup (by apqn) for zcrypt?
>> The code you reference, _zcrypt_send_ep11_cprb(), does loop through
>> each queue associated with each card, but it doesn't appear to be
>> looking for
>> a queue with a particular APQN. It appears to be looking for a queue
>> meeting a specific set of conditions. At least that's my take after
>> taking a very
>> brief look at the code, so I'm not sure that applies here.
>>
> One of the possible conditions is that the APQN is in the targets array.
> Please have another look at the code below, is_desired_ep11_queue()
> and is_desired_ep11_card() do APQI and APID part of the check
> respectively:
>
> for_each_zcrypt_card(zc) {
> /* Check for online EP11 cards */
> if (!zc->online || !(zc->card->functions & 0x04000000))
> continue;
> /* Check for user selected EP11 card */
> if (targets &&
> !is_desired_ep11_card(zc->card->id, target_num, targets))
> continue;
> /* check if device node has admission for this card */
> if (!zcrypt_check_card(perms, zc->card->id))
> continue;
> /* get weight index of the card device */
> weight = speed_idx_ep11(func_code) * zc->speed_rating[SECKEY];
> if (zcrypt_card_compare(zc, pref_zc, weight, pref_weight))
> continue;
> for_each_zcrypt_queue(zq, zc) {
> /* check if device is online and eligible */
> if (!zq->online ||
> !zq->ops->send_ep11_cprb ||
> (targets &&
> !is_desired_ep11_queue(zq->queue->qid,
> target_num, targets)))
>
>
> Yes the size of targets may or may not be 1 (example for size == 1 is
> the invocation form ep11_cryptsingle()) and the respective costs
> depend on the usual size of the array. Since the goal of the whole
> exercise seems to be to pick a single queue, and we settle with the first
> suitable (first not in the input array, but in our lists) that is
> suitable, I assumed we wouldn't need many hashtable lookups.
>
> Regards,
> Halil
again, this is all code related to zcrypt card and queues and has nothing directly to do with ap queue and ap cards.
If you want to have a look how this works for ap devices, have a look into the scan routines for the ap bus in ap_bus.c
There you can find a bus_for_each_device() which would fit together with the right matching function for your needs.
And this is exactly what Tony implemented in the first shot. However, as written I can provide something like that
for you.
One note for the improvement via hash list with the argument about the max 65535 objects.
Think about a real big machine which has currently up to 30 crypto cards (z15 GA1.5) which when CEX7S are
plugged appear as 60 crypto adapters and have up to 85 domains each. When all these crypto resources
are assigned to one LPAR we end up in 60x85 = 5100 APQNs. Well, of course with a hash you can improve
the linear search through an array or list but can you measure the performance gain and then compare this
to the complexity. ... just some thoughts about beautifying code ...
>>>
>>> If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
>>> it basically about finding the queue based on the apqn, with the
>>> difference that it is vfio specific.
>>>
>>> Regards,
>>> Halil
>>>
>>>> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
>>>> ap_card object there exists a list of ap_queues.
>>>
>>>
On 28.04.20 00:24, Tony Krowiak wrote:
>
>
> On 4/27/20 4:20 AM, Pierre Morel wrote:
>>
>>
>> On 2020-04-07 21:20, Tony Krowiak wrote:
>>> Introduces a new driver callback to prevent a root user from unbinding
>>> an AP queue from its device driver if the queue is in use. The intent of
>>> this callback is to provide a driver with the means to prevent a root user
>>> from inadvertently taking a queue away from a guest and giving it to the
>>> host while the guest is still using it.
>>
>> How can we know, at this point if the guest uses or not the queue?
>
> The struct ap_matrix_mdev has a field, struct kvm *kvm, which holds a pointer to KVM when
> the matrix mdev is in use by a guest. This patch series also introduces a shadow_crycb (soon to
> be shadow_apcb) which holds the AP configuration for the guest. Between those two things,
> the driver can detect when a queue is in use by a guest.
>
>> Do you want to say that this prevents to take away a queue when it is currently assigned to a VFIO device?
>> and with a guest currently using this VFIO device?
>
> No, I do not. The intent here is to enforce the proper procedure for giving up a queue so it is done
> deliberately. Before taking a queue away from the matrix mdev, its APQN should be unassigned
> from the matrix mdev. That is not to say that if there are major objections to this that we can't
> base in_use upon the queue being in use by a guest at the time. Maybe that is preferable to
> the community. I'll leave it to them to state their case.
>
>>
>>> The callback will
>>> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
>>> attributes would result in one or more AP queues being removed from its
>>> driver. If the callback responds in the affirmative for any driver
>>> queried, the change to the apmask or aqmask will be rejected with a device
>>> in use error.
>>
>> AFAIU you mean that Linux's driver's binding and unbinding mechanism is not sufficient to avoid this issue because unbind can not be refused by the driver.
>
> Correct!
>
>>
>>
>> The reason why we do not want a single queue to be removed from the VFIO driver is because the VFIO drivers works on a matrix, not on queues, and for the matrix to be consistent it needs to acquire all queues defined by the cross product of all APID and AQID assigned to the matrix.
>
> Not correct. The reason why is because we do not want a queue to be surreptitiously removed
> without the guest administrator being aware of its removal.
>
>>
>> This functionality is valid for the host as for the guests and is handled automatically by the firmware with the CRYCB.
>> The AP bus uses QCI to retrieve the host CRYCB and build the hosts AP queues.
>>
>> If instead to mix VFIO CRYCB matrix handling and queues at the same level inside the AP bus we separate these different firmware entities in two different software entities.
>>
>> If we make the AP bus sit above a CRYCB/Matrix bus, and in the way virtualize the QCI and test AP queue instructions:
>> - we can directly pass a matrix device to the guest though a VFIO matrix device
>> - the consistence will be automatic
>> - the VFIO device and parent device will be of the same kind which would make the design much more clearer.
>> - there will be no need for these callback because the consistence of the matrix will be guaranteed by firmware
>
> As stated in my response above, the issue here is not consistency. While the design you describe
> may be reasonable, it is a major departure from what is out in the field. In other words, that ship
> has sailed.
>
>>
>>
>>>
>>> For this patch, only non-default drivers will be queried. Currently,
>>> there is only one non-default driver, the vfio_ap device driver.
>>
>> You mean that the admin may take queues away from the "default driver", while the queue is in use, to give it to an other driver?
>> Why is it to avoid in one way and not in the other way?
>
> Because the default drivers have direct control over the queues and can ensure they are empty
> and reset before giving up control. The vfio driver does not have direct control over the queues
> because they have been passed through to the guest.
No, that's not true. The 'default' drivers have no change to do anything with an APQN when it is removed
from the driver. They get the very same notification which is the remove() callback as the vfio dd gets
and have the very same change to do something here. The more interesting thing here is, that the remove()
callback invocation is usually because a hardware HAS BEEN GONE AWAY. Neither the 'default' drivers
nor the vfio dd can do a reset on a not-any-more existing APQN.
And it is also not true that the vfio dd has no direct control over the queue because they have been passed
through to the guest. It's the job of the vfio dd to modify the guest's APM, AQM, ADM masks to disable
the guest's access to the APQN and then the vfio can (try to) do a reset.
>
>>
>>> The
>>> vfio_ap device driver manages AP queues passed through to one or more
>>> guests
>>
>> I read this as if a queue may be passed to several guest...
>> please, rephrase or explain.
>
> AP queues is plural, so it is true that AP queues can be passed through
> to more than one guest. I see your point, however, so I'll reword that
> to be more clear.
>
>>
>>> and we don't want to unexpectedly take AP resources away from
>>> guests which are most likely independently administered.
>>
>> When you say "independently administered", you mean as a second admin inside the host, don't you?
>
> I mean that a guest can be administered by a different person than the host administrator.
> Again, I'll try to clarify this.
>
>>
>>
>> Regards,
>> Pierre
>>
>
On 4/28/20 7:07 AM, Harald Freudenberger wrote:
> On 28.04.20 00:24, Tony Krowiak wrote:
>>
>> On 4/27/20 4:20 AM, Pierre Morel wrote:
>>>
>>> On 2020-04-07 21:20, Tony Krowiak wrote:
>>>> Introduces a new driver callback to prevent a root user from unbinding
>>>> an AP queue from its device driver if the queue is in use. The intent of
>>>> this callback is to provide a driver with the means to prevent a root user
>>>> from inadvertently taking a queue away from a guest and giving it to the
>>>> host while the guest is still using it.
>>> How can we know, at this point if the guest uses or not the queue?
>> The struct ap_matrix_mdev has a field, struct kvm *kvm, which holds a pointer to KVM when
>> the matrix mdev is in use by a guest. This patch series also introduces a shadow_crycb (soon to
>> be shadow_apcb) which holds the AP configuration for the guest. Between those two things,
>> the driver can detect when a queue is in use by a guest.
>>
>>> Do you want to say that this prevents to take away a queue when it is currently assigned to a VFIO device?
>>> and with a guest currently using this VFIO device?
>> No, I do not. The intent here is to enforce the proper procedure for giving up a queue so it is done
>> deliberately. Before taking a queue away from the matrix mdev, its APQN should be unassigned
>> from the matrix mdev. That is not to say that if there are major objections to this that we can't
>> base in_use upon the queue being in use by a guest at the time. Maybe that is preferable to
>> the community. I'll leave it to them to state their case.
>>
>>>> The callback will
>>>> be invoked whenever a change to the AP bus's sysfs apmask or aqmask
>>>> attributes would result in one or more AP queues being removed from its
>>>> driver. If the callback responds in the affirmative for any driver
>>>> queried, the change to the apmask or aqmask will be rejected with a device
>>>> in use error.
>>> AFAIU you mean that Linux's driver's binding and unbinding mechanism is not sufficient to avoid this issue because unbind can not be refused by the driver.
>> Correct!
>>
>>>
>>> The reason why we do not want a single queue to be removed from the VFIO driver is because the VFIO drivers works on a matrix, not on queues, and for the matrix to be consistent it needs to acquire all queues defined by the cross product of all APID and AQID assigned to the matrix.
>> Not correct. The reason why is because we do not want a queue to be surreptitiously removed
>> without the guest administrator being aware of its removal.
>>
>>> This functionality is valid for the host as for the guests and is handled automatically by the firmware with the CRYCB.
>>> The AP bus uses QCI to retrieve the host CRYCB and build the hosts AP queues.
>>>
>>> If instead to mix VFIO CRYCB matrix handling and queues at the same level inside the AP bus we separate these different firmware entities in two different software entities.
>>>
>>> If we make the AP bus sit above a CRYCB/Matrix bus, and in the way virtualize the QCI and test AP queue instructions:
>>> - we can directly pass a matrix device to the guest though a VFIO matrix device
>>> - the consistence will be automatic
>>> - the VFIO device and parent device will be of the same kind which would make the design much more clearer.
>>> - there will be no need for these callback because the consistence of the matrix will be guaranteed by firmware
>> As stated in my response above, the issue here is not consistency. While the design you describe
>> may be reasonable, it is a major departure from what is out in the field. In other words, that ship
>> has sailed.
>>
>>>
>>>> For this patch, only non-default drivers will be queried. Currently,
>>>> there is only one non-default driver, the vfio_ap device driver.
>>> You mean that the admin may take queues away from the "default driver", while the queue is in use, to give it to an other driver?
>>> Why is it to avoid in one way and not in the other way?
>> Because the default drivers have direct control over the queues and can ensure they are empty
>> and reset before giving up control. The vfio driver does not have direct control over the queues
>> because they have been passed through to the guest.
> No, that's not true. The 'default' drivers have no change to do anything with an APQN when it is removed
> from the driver. They get the very same notification which is the remove() callback as the vfio dd gets
> and have the very same change to do something here. The more interesting thing here is, that the remove()
> callback invocation is usually because a hardware HAS BEEN GONE AWAY. Neither the 'default' drivers
> nor the vfio dd can do a reset on a not-any-more existing APQN.
> And it is also not true that the vfio dd has no direct control over the queue because they have been passed
> through to the guest. It's the job of the vfio dd to modify the guest's APM, AQM, ADM masks to disable
> the guest's access to the APQN and then the vfio can (try to) do a reset.
The context here is when a sysadmin deliberately takes one or more
queues away from a
guest by changing the apmask or aqmask; we are not talking about the the
case where an
adapter is deconfigured or disappears. The idea here is to prevent a
sysadmin for the host
from taking a queue away from a KVM guest that is using it. IMHO, control
over that queue should belong to the guest until such time as the guest
gives it up or the
guest is terminated. Since the zcrypt drivers are directly responsible
for their AP queues,
it is not necessary to implement this callback, although there is
nothing precluding that.
>>>> The
>>>> vfio_ap device driver manages AP queues passed through to one or more
>>>> guests
>>> I read this as if a queue may be passed to several guest...
>>> please, rephrase or explain.
>> AP queues is plural, so it is true that AP queues can be passed through
>> to more than one guest. I see your point, however, so I'll reword that
>> to be more clear.
>>
>>>> and we don't want to unexpectedly take AP resources away from
>>>> guests which are most likely independently administered.
>>> When you say "independently administered", you mean as a second admin inside the host, don't you?
>> I mean that a guest can be administered by a different person than the host administrator.
>> Again, I'll try to clarify this.
>>
>>>
>>> Regards,
>>> Pierre
>>>
On 4/28/20 6:57 AM, Harald Freudenberger wrote:
> On 28.04.20 12:07, Halil Pasic wrote:
>> On Mon, 27 Apr 2020 17:48:58 -0400
>> Tony Krowiak <[email protected]> wrote:
>>
>>> On 4/27/20 11:17 AM, Halil Pasic wrote:
>>>> On Mon, 27 Apr 2020 15:05:23 +0200
>>>> Harald Freudenberger <[email protected]> wrote:
>>>>
>>>>> On 24.04.20 05:57, Halil Pasic wrote:
>>>>>> On Tue, 7 Apr 2020 15:20:01 -0400
>>>>>> Tony Krowiak <[email protected]> wrote:
>>>>>>
>>>>>>> Rather than looping over potentially 65535 objects, let's store the
>>>>>>> structures for caching information about queue devices bound to the
>>>>>>> vfio_ap device driver in a hash table keyed by APQN.
>>>>>> @Harald:
>>>>>> Would it make sense to make the efficient lookup of an apqueue base
>>>>>> on its APQN core AP functionality instead of each driver figuring it out
>>>>>> on it's own?
>>>>>>
>>>>>> If I'm not wrong the zcrypt device/driver(s) must the problem of
>>>>>> looking up a queue based on its APQN as well.
>>>>>>
>>>>>> For instance struct ep11_cprb has a target_id filed
>>>>>> (arch/s390/include/uapi/asm/zcrypt.h).
>>>>>>
>>>>>> Regards,
>>>>>> Halil
>>>>> Hi Halil
>>>>>
>>>>> no, the zcrypt drivers don't have this problem. They build up their own device object which
>>>>> includes a pointer to the base ap device.
>>>> I'm a bit confused. Doesn't your code loop first trough the ap_card
>>>> objects to find the APID portion of the APQN, and then loop the queue
>>>> list of the matching card to find the right ap_queue object? Or did I
>>>> miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
>>>> point me to the code that avoids the lookup (by apqn) for zcrypt?
>>> The code you reference, _zcrypt_send_ep11_cprb(), does loop through
>>> each queue associated with each card, but it doesn't appear to be
>>> looking for
>>> a queue with a particular APQN. It appears to be looking for a queue
>>> meeting a specific set of conditions. At least that's my take after
>>> taking a very
>>> brief look at the code, so I'm not sure that applies here.
>>>
>> One of the possible conditions is that the APQN is in the targets array.
>> Please have another look at the code below, is_desired_ep11_queue()
>> and is_desired_ep11_card() do APQI and APID part of the check
>> respectively:
>>
>> for_each_zcrypt_card(zc) {
>> /* Check for online EP11 cards */
>> if (!zc->online || !(zc->card->functions & 0x04000000))
>> continue;
>> /* Check for user selected EP11 card */
>> if (targets &&
>> !is_desired_ep11_card(zc->card->id, target_num, targets))
>> continue;
>> /* check if device node has admission for this card */
>> if (!zcrypt_check_card(perms, zc->card->id))
>> continue;
>> /* get weight index of the card device */
>> weight = speed_idx_ep11(func_code) * zc->speed_rating[SECKEY];
>> if (zcrypt_card_compare(zc, pref_zc, weight, pref_weight))
>> continue;
>> for_each_zcrypt_queue(zq, zc) {
>> /* check if device is online and eligible */
>> if (!zq->online ||
>> !zq->ops->send_ep11_cprb ||
>> (targets &&
>> !is_desired_ep11_queue(zq->queue->qid,
>> target_num, targets)))
>>
>>
>> Yes the size of targets may or may not be 1 (example for size == 1 is
>> the invocation form ep11_cryptsingle()) and the respective costs
>> depend on the usual size of the array. Since the goal of the whole
>> exercise seems to be to pick a single queue, and we settle with the first
>> suitable (first not in the input array, but in our lists) that is
>> suitable, I assumed we wouldn't need many hashtable lookups.
>>
>> Regards,
>> Halil
> again, this is all code related to zcrypt card and queues and has nothing directly to do with ap queue and ap cards.
> If you want to have a look how this works for ap devices, have a look into the scan routines for the ap bus in ap_bus.c
> There you can find a bus_for_each_device() which would fit together with the right matching function for your needs.
> And this is exactly what Tony implemented in the first shot. However, as written I can provide something like that
> for you.
> One note for the improvement via hash list with the argument about the max 65535 objects.
> Think about a real big machine which has currently up to 30 crypto cards (z15 GA1.5) which when CEX7S are
> plugged appear as 60 crypto adapters and have up to 85 domains each. When all these crypto resources
> are assigned to one LPAR we end up in 60x85 = 5100 APQNs. Well, of course with a hash you can improve
> the linear search through an array or list but can you measure the performance gain and then compare this
> to the complexity. ... just some thoughts about beautifying code ...
I set up a test case to compare searching using a hashtable verses using
a list.
I created both a hashtable and a list of 5100 objects. Each structure
had a single
APQN field. I then randomly searched both the hashtable and the list for
each APQN. The following table contains the result of 5 test runs. The
elapsed
times are in nanoseconds.
Test: List Search Hashtable Search
------ ----------- ----------------
Avg. Per APQN: 11651 81
Total per 5500 APQNs: 60164268 1085368
Avg. Per APQN: 10925 78
Total per 5500 APQNs: 56482780 1084590
Avg. Per APQN: 10190 80
Total per 5500 APQNs: 52714920 1123205
Avg. Per APQN: 8431 76
Total per 5500 APQNs: 43748838 1061414
Avg. Per APQN: 9678 75
Total per 5500 APQNs: 50103437 1044427
-----------------------------------------------
Per APQN Search Avg: 10175 78 Hashtable is 130
times faster
Total Search 5500 Avg: 52642848 1079800 Hashtable is 49 times faster
Note that the list search was just a straight search of an object in a
list, not
a device attached to a bus. I don't know if that would add time, but it
seems
that the savings using a hashtable are significant.
So I have two questions:
1. Would it make more sense to provide AP bus interfaces to search for
queue devices by APQN?
2. If so, shall we store the queue devices in a hashtable to make the
searches more efficient?
>>>> If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
>>>> it basically about finding the queue based on the apqn, with the
>>>> difference that it is vfio specific.
>>>>
>>>> Regards,
>>>> Halil
>>>>
>>>>> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
>>>>> ap_card object there exists a list of ap_queues.
>>>>
On 29.04.20 00:30, Tony Krowiak wrote:
>
>
> On 4/28/20 6:57 AM, Harald Freudenberger wrote:
>> On 28.04.20 12:07, Halil Pasic wrote:
>>> On Mon, 27 Apr 2020 17:48:58 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> On 4/27/20 11:17 AM, Halil Pasic wrote:
>>>>> On Mon, 27 Apr 2020 15:05:23 +0200
>>>>> Harald Freudenberger <[email protected]> wrote:
>>>>>
>>>>>> On 24.04.20 05:57, Halil Pasic wrote:
>>>>>>> On Tue, 7 Apr 2020 15:20:01 -0400
>>>>>>> Tony Krowiak <[email protected]> wrote:
>>>>>>>
>>>>>>>> Rather than looping over potentially 65535 objects, let's store the
>>>>>>>> structures for caching information about queue devices bound to the
>>>>>>>> vfio_ap device driver in a hash table keyed by APQN.
>>>>>>> @Harald:
>>>>>>> Would it make sense to make the efficient lookup of an apqueue base
>>>>>>> on its APQN core AP functionality instead of each driver figuring it out
>>>>>>> on it's own?
>>>>>>>
>>>>>>> If I'm not wrong the zcrypt device/driver(s) must the problem of
>>>>>>> looking up a queue based on its APQN as well.
>>>>>>>
>>>>>>> For instance struct ep11_cprb has a target_id filed
>>>>>>> (arch/s390/include/uapi/asm/zcrypt.h).
>>>>>>>
>>>>>>> Regards,
>>>>>>> Halil
>>>>>> Hi Halil
>>>>>>
>>>>>> no, the zcrypt drivers don't have this problem. They build up their own device object which
>>>>>> includes a pointer to the base ap device.
>>>>> I'm a bit confused. Doesn't your code loop first trough the ap_card
>>>>> objects to find the APID portion of the APQN, and then loop the queue
>>>>> list of the matching card to find the right ap_queue object? Or did I
>>>>> miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
>>>>> point me to the code that avoids the lookup (by apqn) for zcrypt?
>>>> The code you reference, _zcrypt_send_ep11_cprb(), does loop through
>>>> each queue associated with each card, but it doesn't appear to be
>>>> looking for
>>>> a queue with a particular APQN. It appears to be looking for a queue
>>>> meeting a specific set of conditions. At least that's my take after
>>>> taking a very
>>>> brief look at the code, so I'm not sure that applies here.
>>>>
>>> One of the possible conditions is that the APQN is in the targets array.
>>> Please have another look at the code below, is_desired_ep11_queue()
>>> and is_desired_ep11_card() do APQI and APID part of the check
>>> respectively:
>>>
>>> for_each_zcrypt_card(zc) {
>>> /* Check for online EP11 cards */
>>> if (!zc->online || !(zc->card->functions & 0x04000000))
>>> continue;
>>> /* Check for user selected EP11 card */
>>> if (targets &&
>>> !is_desired_ep11_card(zc->card->id, target_num, targets))
>>> continue;
>>> /* check if device node has admission for this card */
>>> if (!zcrypt_check_card(perms, zc->card->id))
>>> continue;
>>> /* get weight index of the card device */
>>> weight = speed_idx_ep11(func_code) * zc->speed_rating[SECKEY];
>>> if (zcrypt_card_compare(zc, pref_zc, weight, pref_weight))
>>> continue;
>>> for_each_zcrypt_queue(zq, zc) {
>>> /* check if device is online and eligible */
>>> if (!zq->online ||
>>> !zq->ops->send_ep11_cprb ||
>>> (targets &&
>>> !is_desired_ep11_queue(zq->queue->qid,
>>> target_num, targets)))
>>>
>>>
>>> Yes the size of targets may or may not be 1 (example for size == 1 is
>>> the invocation form ep11_cryptsingle()) and the respective costs
>>> depend on the usual size of the array. Since the goal of the whole
>>> exercise seems to be to pick a single queue, and we settle with the first
>>> suitable (first not in the input array, but in our lists) that is
>>> suitable, I assumed we wouldn't need many hashtable lookups.
>>>
>>> Regards,
>>> Halil
>> again, this is all code related to zcrypt card and queues and has nothing directly to do with ap queue and ap cards.
>> If you want to have a look how this works for ap devices, have a look into the scan routines for the ap bus in ap_bus.c
>> There you can find a bus_for_each_device() which would fit together with the right matching function for your needs.
>> And this is exactly what Tony implemented in the first shot. However, as written I can provide something like that
>> for you.
>> One note for the improvement via hash list with the argument about the max 65535 objects.
>> Think about a real big machine which has currently up to 30 crypto cards (z15 GA1.5) which when CEX7S are
>> plugged appear as 60 crypto adapters and have up to 85 domains each. When all these crypto resources
>> are assigned to one LPAR we end up in 60x85 = 5100 APQNs. Well, of course with a hash you can improve
>> the linear search through an array or list but can you measure the performance gain and then compare this
>> to the complexity. ... just some thoughts about beautifying code ...
>
> I set up a test case to compare searching using a hashtable verses using a list.
> I created both a hashtable and a list of 5100 objects. Each structure had a single
> APQN field. I then randomly searched both the hashtable and the list for
> each APQN. The following table contains the result of 5 test runs. The elapsed
> times are in nanoseconds.
>
> Test: List Search Hashtable Search
> ------ ----------- ----------------
> Avg. Per APQN: 11651 81
> Total per 5500 APQNs: 60164268 1085368
>
> Avg. Per APQN: 10925 78
> Total per 5500 APQNs: 56482780 1084590
>
> Avg. Per APQN: 10190 80
> Total per 5500 APQNs: 52714920 1123205
>
> Avg. Per APQN: 8431 76
> Total per 5500 APQNs: 43748838 1061414
>
> Avg. Per APQN: 9678 75
> Total per 5500 APQNs: 50103437 1044427
> -----------------------------------------------
> Per APQN Search Avg: 10175 78 Hashtable is 130 times faster
> Total Search 5500 Avg: 52642848 1079800 Hashtable is 49 times faster
>
> Note that the list search was just a straight search of an object in a list, not
> a device attached to a bus. I don't know if that would add time, but it seems
> that the savings using a hashtable are significant.
Halil, I did not say that a hashtable is not faster than a linear list. The only thing
I wanted to express is that we are adding complexity and performance improving
code which is not even integrated somewhere. We are beautifying here.
>
> So I have two questions:
>
> 1. Would it make more sense to provide AP bus interfaces to search for
> queue devices by APQN?
>
> 2. If so, shall we store the queue devices in a hashtable to make the
> searches more efficient?
If there is a decision to implement this as a feature function within the AP
bus base code, I will use the bus functions provided by the kernel common
code. So here this will be a bus_for_each_device() together with a filter
function. I don't know how this is implemented within the common bus code.
>
>
>>>>> If you look at the new function of vfio_ap_get_queue(unsigned long apqn)
>>>>> it basically about finding the queue based on the apqn, with the
>>>>> difference that it is vfio specific.
>>>>>
>>>>> Regards,
>>>>> Halil
>>>>>
>>>>>> However, this is not a big issue, as the ap_bus holds a list of ap_card objects and within each
>>>>>> ap_card object there exists a list of ap_queues.
>>>>>
>
On Tue, 28 Apr 2020 12:57:12 +0200
Harald Freudenberger <[email protected]> wrote:
> On 28.04.20 12:07, Halil Pasic wrote:
> > On Mon, 27 Apr 2020 17:48:58 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >>
> >> On 4/27/20 11:17 AM, Halil Pasic wrote:
> >>> On Mon, 27 Apr 2020 15:05:23 +0200
> >>> Harald Freudenberger <[email protected]> wrote:
> >>>
> >>>> On 24.04.20 05:57, Halil Pasic wrote:
> >>>>> On Tue, 7 Apr 2020 15:20:01 -0400
> >>>>> Tony Krowiak <[email protected]> wrote:
> >>>>>
> >>>>>> Rather than looping over potentially 65535 objects, let's store the
> >>>>>> structures for caching information about queue devices bound to the
> >>>>>> vfio_ap device driver in a hash table keyed by APQN.
> >>>>> @Harald:
> >>>>> Would it make sense to make the efficient lookup of an apqueue base
> >>>>> on its APQN core AP functionality instead of each driver figuring it out
> >>>>> on it's own?
> >>>>>
> >>>>> If I'm not wrong the zcrypt device/driver(s) must the problem of
> >>>>> looking up a queue based on its APQN as well.
> >>>>>
> >>>>> For instance struct ep11_cprb has a target_id filed
> >>>>> (arch/s390/include/uapi/asm/zcrypt.h).
> >>>>>
> >>>>> Regards,
> >>>>> Halil
> >>>> Hi Halil
> >>>>
> >>>> no, the zcrypt drivers don't have this problem. They build up their own device object which
> >>>> includes a pointer to the base ap device.
> >>> I'm a bit confused. Doesn't your code loop first trough the ap_card
> >>> objects to find the APID portion of the APQN, and then loop the queue
> >>> list of the matching card to find the right ap_queue object? Or did I
> >>> miss something? Isn't that what _zcrypt_send_ep11_cprb() does? Can you
> >>> point me to the code that avoids the lookup (by apqn) for zcrypt?
> >> The code you reference, _zcrypt_send_ep11_cprb(), does loop through
> >> each queue associated with each card, but it doesn't appear to be
> >> looking for
> >> a queue with a particular APQN. It appears to be looking for a queue
> >> meeting a specific set of conditions. At least that's my take after
> >> taking a very
> >> brief look at the code, so I'm not sure that applies here.
> >>
> > One of the possible conditions is that the APQN is in the targets array.
> > Please have another look at the code below, is_desired_ep11_queue()
> > and is_desired_ep11_card() do APQI and APID part of the check
> > respectively:
> >
> > for_each_zcrypt_card(zc) {
> > /* Check for online EP11 cards */
> > if (!zc->online || !(zc->card->functions & 0x04000000))
> > continue;
> > /* Check for user selected EP11 card */
> > if (targets &&
> > !is_desired_ep11_card(zc->card->id, target_num, targets))
> > continue;
> > /* check if device node has admission for this card */
> > if (!zcrypt_check_card(perms, zc->card->id))
> > continue;
> > /* get weight index of the card device */
> > weight = speed_idx_ep11(func_code) * zc->speed_rating[SECKEY];
> > if (zcrypt_card_compare(zc, pref_zc, weight, pref_weight))
> > continue;
> > for_each_zcrypt_queue(zq, zc) {
> > /* check if device is online and eligible */
> > if (!zq->online ||
> > !zq->ops->send_ep11_cprb ||
> > (targets &&
> > !is_desired_ep11_queue(zq->queue->qid,
> > target_num, targets)))
> >
> >
> > Yes the size of targets may or may not be 1 (example for size == 1 is
> > the invocation form ep11_cryptsingle()) and the respective costs
> > depend on the usual size of the array. Since the goal of the whole
> > exercise seems to be to pick a single queue, and we settle with the first
> > suitable (first not in the input array, but in our lists) that is
> > suitable, I assumed we wouldn't need many hashtable lookups.
> >
> > Regards,
> > Halil
> again, this is all code related to zcrypt card and queues and has nothing directly to do with ap queue and ap cards.
Well, if you look at "struct vfio_ap_queue* vfio_ap_get_queue(unsigned
long apqn)" it also works with vfio_ap_queue and "has nothing directly
to do with ap queue". But ap_queue->private points to zcrypt_queue
and vfio_ap_queue when the queue is driven by a zcrypt and a vfio_ap
driver respectively.
> If you want to have a look how this works for ap devices, have a look into the scan routines for the ap bus in ap_bus.c
> There you can find a bus_for_each_device() which would fit together with the right matching function for your needs.
> And this is exactly what Tony implemented in the first shot. However, as written I can provide something like that
> for you.
> One note for the improvement via hash list with the argument about the max 65535 objects.
> Think about a real big machine which has currently up to 30 crypto cards (z15 GA1.5) which when CEX7S are
> plugged appear as 60 crypto adapters and have up to 85 domains each. When all these crypto resources
> are assigned to one LPAR we end up in 60x85 = 5100 APQNs. Well, of course with a hash you can improve
> the linear search through an array or list but can you measure the performance gain and then compare this
> to the complexity. ... just some thoughts about beautifying code ...
My train of thought is that looking up a queue by its APQN is a
functionality potentially common to several drivers. I was hoping for a
simplification, not for a ton of added complexity.
Also I was thinking about the 256 buckets. I mean
"DECLARE_HASHTABLE(qtable, 8);". It would be much easier to reason about
the table size at a bus level.
Regards,
Halil
The review of this series seems to have come to a standstill, so I'm
pinging the list to see if we can't get things going again.
On 4/14/20 8:08 AM, Cornelia Huck wrote:
> The current design for AP pass-through does not support making dynamic
> changes to the AP matrix of a running guest resulting in a few
> deficiencies this patch series is intended to mitigate:
>
> 1. Adapters, domains and control domains can not be added to or removed
> from a running guest. In order to modify a guest's AP configuration,
> the guest must be terminated; only then can AP resources be assigned
> to or unassigned from the guest's matrix mdev. The new AP
> configuration becomes available to the guest when it is subsequently
> restarted.
>
> 2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
> be modified by a root user without any restrictions. A change to
> either mask can result in AP queue devices being unbound from the
> vfio_ap device driver and bound to a zcrypt device driver even if a
> guest is using the queues, thus giving the host access to the guest's
> private crypto data and vice versa.
>
> 3. The APQNs derived from the Cartesian product of the APIDs of the
> adapters and APQIs of the domains assigned to a matrix mdev must
> reference an AP queue device bound to the vfio_ap device driver. The
> AP architecture allows assignment of AP resources that are not
> available to the system, so this artificial restriction is not
> compliant with the architecture.
>
> 4. The AP configuration profile can be dynamically changed for the linux
> host after a KVM guest is started. For example, a new domain can be
> dynamically added to the configuration profile via the SE or an HMC
> connected to a DPM enabled lpar. Likewise, AP adapters can be
> dynamically configured (online state) and deconfigured (standby state)
> using the SE, an SCLP command or an HMC connected to a DPM enabled
> lpar. This can result in inadvertent sharing of AP queues between the
> guest and host.
>
> 5. A root user can manually unbind an AP queue device representing a
> queue in use by a KVM guest via the vfio_ap device driver's sysfs
> unbind attribute. In this case, the guest will be using a queue that
> is not bound to the driver which violates the device model.
>
> This patch series introduces the following changes to the current design
> to alleviate the shortcomings described above as well as to implement
> more of the AP architecture:
>
> 1. A root user will be prevented from making changes to the AP bus's
> /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
> changes from the vfio_ap device driver to a zcrypt driver when the
> APQN is assigned to a matrix mdev.
>
> 2. Allow a root user to hot plug/unplug AP adapters, domains and control
> domains using the matrix mdev's assign/unassign attributes.
>
> 4. Allow assignment of an AP adapter or domain to a matrix mdev even if
> it results in assignment of an APQN that does not reference an AP
> queue device bound to the vfio_ap device driver, as long as the APQN
> is not reserved for use by the default zcrypt drivers (also known as
> over-provisioning of AP resources). Allowing over-provisioning of AP
> resources better models the architecture which does not preclude
> assigning AP resources that are not yet available in the system.
>
> 5. Handle dynamic changes to the AP device model.
>
> 1. Rationale for changes to AP bus's apmask/aqmask interfaces:
> ----------------------------------------------------------
> Due to the extremely sensitive nature of cryptographic data, it is
> imperative that great care be taken to ensure that such data is secured.
> Allowing a root user, either inadvertently or maliciously, to configure
> these masks such that a queue is shared between the host and a guest is
> not only avoidable, it is advisable. It was suggested that this scenario
> is better handled in user space with management software, but that does
> not preclude a malicious administrator from using the sysfs interfaces
> to gain access to a guest's crypto data. It was also suggested that this
> scenario could be avoided by taking access to the adapter away from the
> guest and zeroing out the queues prior to the vfio_ap driver releasing the
> device; however, stealing an adapter in use from a guest as a by-product
> of an operation is bad and will likely cause problems for the guest
> unnecessarily. It was decided that the most effective solution with the
> least number of negative side effects is to prevent the situation at the
> source.
>
> 2. Rationale for hot plug/unplug using matrix mdev sysfs interfaces:
> ----------------------------------------------------------------
> Allowing a user to hot plug/unplug AP resources using the matrix mdev
> sysfs interfaces circumvents the need to terminate the guest in order to
> modify its AP configuration. Allowing dynamic configuration makes
> reconfiguring a guest's AP matrix much less disruptive.
>
> 3. Rationale for allowing over-provisioning of AP resources:
> -----------------------------------------------------------
> Allowing assignment of AP resources to a matrix mdev and ultimately to a
> guest better models the AP architecture. The architecture does not
> preclude assignment of unavailable AP resources. If a queue subsequently
> becomes available while a guest using the matrix mdev to which its APQN
> is assigned, the guest will be given access to it. If an APQN
> is dynamically unassigned from the underlying host system, it will
> automatically become unavailable to the guest.
>
> Change log v6-v7:
> ----------------
> * Added callbacks to AP bus:
> - on_config_changed: Notifies implementing drivers that
> the AP configuration has changed since last AP device scan.
> - on_scan_complete: Notifies implementing drivers that the device scan
> has completed.
> - implemented on_config_changed and on_scan_complete callbacks for
> vfio_ap device driver.
> - updated vfio_ap device driver's probe and remove callbacks to handle
> dynamic changes to the AP device model.
> * Added code to filter APQNs when assigning AP resources to a KVM guest's
> CRYCB
>
> Change log v5-v6:
> ----------------
> * Fixed a bug in ap_bus.c introduced with patch 2/7 of the v5
> series. Harald Freudenberer pointed out that the mutex lock
> for ap_perms_mutex in the apmask_store and aqmask_store functions
> was not being freed.
>
> * Removed patch 6/7 which added logging to the vfio_ap driver
> to expedite acceptance of this series. The logging will be introduced
> with a separate patch series to allow more time to explore options
> such as DBF logging vs. tracepoints.
>
> * Added 3 patches related to ensuring that APQNs that do not reference
> AP queue devices bound to the vfio_ap device driver are not assigned
> to the guest CRYCB:
>
> Patch 4: Filter CRYCB bits for unavailable queue devices
> Patch 5: sysfs attribute to display the guest CRYCB
> Patch 6: update guest CRYCB in vfio_ap probe and remove callbacks
>
> * Added a patch (Patch 9) to version the vfio_ap module.
>
> * Reshuffled patches to allow the in_use callback implementation to
> invoke the vfio_ap_mdev_verify_no_sharing() function introduced in
> patch 2.
>
> Change log v4-v5:
> ----------------
> * Added a patch to provide kernel s390dbf debug logs for VFIO AP
>
> Change log v3->v4:
> -----------------
> * Restored patches preventing root user from changing ownership of
> APQNs from zcrypt drivers to the vfio_ap driver if the APQN is
> assigned to an mdev.
>
> * No longer enforcing requirement restricting guest access to
> queues represented by a queue device bound to the vfio_ap
> device driver.
>
> * Removed shadow CRYCB and now directly updating the guest CRYCB
> from the matrix mdev's matrix.
>
> * Rebased the patch series on top of 'vfio: ap: AP Queue Interrupt
> Control' patches.
>
> * Disabled bind/unbind sysfs interfaces for vfio_ap driver
>
> Change log v2->v3:
> -----------------
> * Allow guest access to an AP queue only if the queue is bound to
> the vfio_ap device driver.
>
> * Removed the patch to test CRYCB masks before taking the vCPUs
> out of SIE. Now checking the shadow CRYCB in the vfio_ap driver.
>
> Change log v1->v2:
> -----------------
> * Removed patches preventing root user from unbinding AP queues from
> the vfio_ap device driver
> * Introduced a shadow CRYCB in the vfio_ap driver to manage dynamic
> changes to the AP guest configuration due to root user interventions
> or hardware anomalies.
>
> Harald Freudenberger (1):
> s390/zcrypt: Notify driver on config changed and scan complete
> callbacks
>
> Tony Krowiak (14):
> s390/vfio-ap: store queue struct in hash table for quick access
> s390/vfio-ap: manage link between queue struct and matrix mdev
> s390/zcrypt: driver callback to indicate resource in use
> s390/vfio-ap: implement in-use callback for vfio_ap driver
> s390/vfio-ap: introduce shadow CRYCB
> s390/vfio-ap: sysfs attribute to display the guest CRYCB
> s390/vfio-ap: filter CRYCB bits for unavailable queue devices
> s390/vfio_ap: add qlink from ap_matrix_mdev struct to vfio_ap_queue
> struct
> s390/vfio-ap: allow assignment of unavailable AP queues to mdev device
> s390/vfio-ap: allow configuration of matrix mdev in use by a KVM guest
> s390/vfio-ap: allow hot plug/unplug of AP resources using mdev device
> s390/vfio-ap: handle host AP config change notification
> s390/vfio-ap: handle AP bus scan completed notification
> s390/vfio-ap: handle probe/remove not due to host AP config changes
>
> drivers/s390/crypto/ap_bus.c | 301 ++++++-
> drivers/s390/crypto/ap_bus.h | 17 +
> drivers/s390/crypto/vfio_ap_drv.c | 35 +-
> drivers/s390/crypto/vfio_ap_ops.c | 1103 +++++++++++++++++++------
> drivers/s390/crypto/vfio_ap_private.h | 24 +-
> 5 files changed, 1167 insertions(+), 313 deletions(-)
>