On s390, we have cryptographic coprocessor cards, which are modeled on
Linux as devices on the AP bus. Each card can be partitioned into domains
which can be thought of as a set of hardware registers for processing
crypto commands. Crypto commands are sent to a specific domain within a
card via a queue, which is identified by a (card,domain) tuple. We model
this something like the following (assuming we have access to cards 3 and
4 and domains 1 and 2):
AP -> card3 -> queue (3,1)
            -> queue (3,2)
   -> card4 -> queue (4,1)
            -> queue (4,2)
If we want to virtualize this, we can use a feature provided by the
hardware. We basically attach a satellite control block to our main
hardware virtualization control block and the hardware takes care of
most of the rest.
For this control block, we don't specify explicit tuples, but a list of
cards and a list of domains. The guest will get access to the cross
product.
Because of this, we need to take care that the lists provided to
different guests don't overlap; i.e., we need to enforce sane
configurations. Otherwise, one guest may get access to things like
secret keys for another guest.
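To illustrate the overlap rule, here is a minimal userspace sketch; it is
not code from this patch set. It assumes each guest's configuration is kept
as two 256-bit masks, and struct guest_matrix and all helper names are
hypothetical. Because a guest gets the cross product of its two lists, two
guests share a queue exactly when they have both a card and a domain in
common.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AP_WORDS 4 /* 4 x 64 bits = 256 possible IDs */

/* Hypothetical per-guest configuration: one card list, one domain list. */
struct guest_matrix {
	uint64_t cards[AP_WORDS];
	uint64_t domains[AP_WORDS];
};

/* Mark ID `id` (0..255) as present in a 256-bit mask. */
static void mask_add(uint64_t *mask, unsigned int id)
{
	assert(id < AP_WORDS * 64);
	mask[id / 64] |= (uint64_t)1 << (id % 64);
}

/* True if the two masks have at least one ID in common. */
static bool masks_intersect(const uint64_t *a, const uint64_t *b)
{
	for (int i = 0; i < AP_WORDS; i++)
		if (a[i] & b[i])
			return true;
	return false;
}

/*
 * Each guest sees cards x domains, so two guests share a queue exactly
 * when they have a card in common AND a domain in common.
 */
static bool guests_share_queue(const struct guest_matrix *g1,
			       const struct guest_matrix *g2)
{
	return masks_intersect(g1->cards, g2->cards) &&
	       masks_intersect(g1->domains, g2->domains);
}
```

With card 3 and domains 1 and 2 for one guest, and card 4 and domains 1 and
2 for another, guests_share_queue() reports no overlap even though the
domain lists are identical.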
The idea of this patch set is to introduce a new device, the matrix
device. This matrix device hangs off a different root and acts as the
parent node for mdev devices.
If you now want to give the tuples (4,1) and (4,2) to a guest, you need
to do the following:
- Unbind the (4,1) and (4,2) tuples from their AP bus driver.
- Bind the (4,1) and (4,2) tuples to the vfio_ap driver.
- Create the mediated device.
- Assign card 4 and domains 1 and 2 to the mediated device.
QEMU will now simply consume the mediated device and things should work.
For a complete description of the architecture and concepts underlying the
design, see the Documentation/s390/vfio-ap.txt file included with this
patch set.
Change log v3 -> v4
===================
* Resolved issue with enabling ZCRYPT when KVM is enabled by using
#ifdef ZCRYPT in relevant functions
* Added patch with a new function for resetting the crypto attributes
for all vcpus to resolve the issue raised with running vcpus getting out
of sync.
* Removed KVM_S390_VM_CRYPTO_INTERPRET_AP: interpretive execution mode is
now set by the vfio_ap driver when the mdev device is opened.
Tony Krowiak (15):
s390: zcrypt: externalize AP instructions available function
KVM: s390: reset crypto attributes for all vcpus
KVM: s390: refactor crypto initialization
KVM: s390: CPU model support for AP virtualization
KVM: s390: enable/disable AP interpretive execution
s390: vfio-ap: base implementation of VFIO AP device driver
s390: vfio-ap: register matrix device with VFIO mdev framework
KVM: s390: interfaces to (de)configure guest's AP matrix
s390: vfio-ap: sysfs interfaces to configure adapters
s390: vfio-ap: sysfs interfaces to configure domains
s390: vfio-ap: sysfs interfaces to configure control domains
s390: vfio-ap: sysfs interface to view matrix mdev matrix
KVM: s390: configure the guest's AP devices
s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl
s390: doc: detailed specifications for AP virtualization
Documentation/s390/vfio-ap.txt | 567 +++++++++++++++++++++
MAINTAINERS | 12 +
arch/s390/Kconfig | 11 +
arch/s390/include/asm/ap.h | 7 +
arch/s390/include/asm/kvm-ap.h | 136 +++++
arch/s390/include/asm/kvm_host.h | 3 +
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/Makefile | 2 +-
arch/s390/kvm/kvm-ap.c | 339 +++++++++++++
arch/s390/kvm/kvm-s390.c | 93 ++---
arch/s390/kvm/kvm-s390.h | 14 +
arch/s390/tools/gen_facilities.c | 2 +
drivers/s390/crypto/Makefile | 4 +
drivers/s390/crypto/ap_bus.c | 6 +
drivers/s390/crypto/vfio_ap_drv.c | 143 ++++++
drivers/s390/crypto/vfio_ap_ops.c | 873 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 46 ++
include/uapi/linux/vfio.h | 2 +
18 files changed, 2200 insertions(+), 61 deletions(-)
create mode 100644 Documentation/s390/vfio-ap.txt
create mode 100644 arch/s390/include/asm/kvm-ap.h
create mode 100644 arch/s390/kvm/kvm-ap.c
create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
create mode 100644 drivers/s390/crypto/vfio_ap_ops.c
create mode 100644 drivers/s390/crypto/vfio_ap_private.h
Provides a sysfs interface to view the AP matrix configured for the
mediated matrix device.
The relevant sysfs structures are:
/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. matrix
To view the matrix configured for the mediated matrix device,
print the matrix file:
cat matrix
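As a rough illustration of what the matrix file contains, the following
userspace sketch formats one line per (adapter, domain) pair with the same
"%02lx.%04lx" layout that matrix_show() uses. The function name
format_matrix and its array-based parameters are hypothetical; the driver
itself walks the APM and AQM bitmasks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write one "apid.apqi" line per (adapter, domain) pair into buf.
 * Returns the number of characters written (excluding the trailing NUL),
 * or -1 if buf is too small to hold the whole matrix.
 */
static int format_matrix(char *buf, size_t len,
			 const unsigned long *apids, size_t n_apids,
			 const unsigned long *apqis, size_t n_apqis)
{
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < n_apids; i++) {
		for (size_t j = 0; j < n_apqis; j++) {
			int n = snprintf(buf + used, len - used,
					 "%02lx.%04lx\n", apids[i], apqis[j]);

			/* snprintf reports the length it wanted to write */
			if (n < 0 || (size_t)n >= len - used)
				return -1;
			used += (size_t)n;
		}
	}

	return (int)used;
}
```

For adapter 4 with domains 1 and 2 this yields the lines 04.0001 and
04.0002. The sketch uses a bounded write because a sysfs show buffer is a
single page; whether the driver needs the same guard for large matrices is
left open here.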
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 27 +++++++++++++++++++++++++++
1 files changed, 27 insertions(+), 0 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 413ecbb..bc2b05e 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -701,6 +701,32 @@ static ssize_t control_domains_show(struct device *dev,
}
DEVICE_ATTR_RO(control_domains);
+static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ char *bufpos = buf;
+ unsigned long apid;
+ unsigned long apqi;
+ int nchars = 0;
+ int n;
+
+ for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
+ matrix_mdev->matrix->apm_max) {
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
+ matrix_mdev->matrix->aqm_max) {
+ n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi);
+ bufpos += n;
+ nchars += n;
+ }
+ }
+
+ return nchars;
+}
+DEVICE_ATTR_RO(matrix);
+
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
@@ -709,6 +735,7 @@ static ssize_t control_domains_show(struct device *dev,
&dev_attr_assign_control_domain.attr,
&dev_attr_unassign_control_domain.attr,
&dev_attr_control_domains.attr,
+ &dev_attr_matrix.attr,
NULL,
};
--
1.7.1
Introduces ioctl access to the VFIO AP Matrix device driver
by implementing the VFIO_DEVICE_GET_INFO ioctl. This ioctl
provides the VFIO AP Matrix device driver information to the
guest machine.
Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 43 +++++++++++++++++++++++++++++++++++++
1 files changed, 43 insertions(+), 0 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index e3ff5ab..00179cd 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -101,6 +101,48 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
&matrix_mdev->group_notifier);
}
+static int vfio_ap_mdev_get_device_info(unsigned long arg)
+{
+ unsigned long minsz;
+ struct vfio_device_info info;
+
+ minsz = offsetofend(struct vfio_device_info, num_irqs);
+
+ if (copy_from_user(&info, (void __user *)arg, minsz))
+ return -EFAULT;
+
+ if (info.argsz < minsz) {
+ pr_err("%s: Argument size %u less than min size %lu",
+ VFIO_AP_MODULE_NAME, info.argsz, minsz);
+ return -EINVAL;
+ }
+
+ info.flags = VFIO_DEVICE_FLAGS_AP;
+ info.num_regions = 0;
+ info.num_irqs = 0;
+
+ return copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
+}
+
+static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
+ unsigned int cmd, unsigned long arg)
+{
+ int ret;
+
+ switch (cmd) {
+ case VFIO_DEVICE_GET_INFO:
+ ret = vfio_ap_mdev_get_device_info(arg);
+ break;
+ default:
+ pr_err("%s: ioctl command %d is not a supported command",
+ VFIO_AP_MODULE_NAME, cmd);
+ ret = -EOPNOTSUPP;
+ break;
+ }
+
+ return ret;
+}
+
static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
{
return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
@@ -804,6 +846,7 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
.remove = vfio_ap_mdev_remove,
.open = vfio_ap_mdev_open,
.release = vfio_ap_mdev_release,
+ .ioctl = vfio_ap_mdev_ioctl,
};
int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
--
1.7.1
Provides the sysfs interfaces for assigning AP control domains
to and unassigning AP control domains from a mediated matrix device.
The IDs of the AP control domains assigned to the mediated matrix
device are stored in an AP domain mask (ADM). The bits in the ADM,
from most significant to least significant bit, correspond to
AP domain numbers 0 to 255. When a control domain is assigned,
the bit corresponding to its domain ID will be set in the ADM.
Likewise, when a domain is unassigned, the bit corresponding
to its domain ID will be cleared in the ADM.
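The bit numbering can be sketched in portable userspace C as follows. This
is only a model of the layout described above, using a byte array in which
domain 0 is the most significant bit of the first byte; the helper names
are hypothetical and the driver itself uses set_bit_inv() and
clear_bit_inv().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADM_BITS  256
#define ADM_BYTES (ADM_BITS / 8)

/*
 * Domain 0 is the most significant bit of byte 0; domain 255 is the
 * least significant bit of byte 31.
 */
static void adm_assign(uint8_t *adm, unsigned int domain)
{
	assert(domain < ADM_BITS);
	adm[domain / 8] |= (uint8_t)(0x80u >> (domain % 8));
}

static void adm_unassign(uint8_t *adm, unsigned int domain)
{
	assert(domain < ADM_BITS);
	adm[domain / 8] &= (uint8_t)~(0x80u >> (domain % 8));
}

static bool adm_is_assigned(const uint8_t *adm, unsigned int domain)
{
	assert(domain < ADM_BITS);
	return (adm[domain / 8] & (0x80u >> (domain % 8))) != 0;
}
```

For example, assigning control domain 173 sets bit 5 (counted from the
most significant end) of byte 21, because 173 = 21 * 8 + 5.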
The relevant sysfs structures are:
/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. assign_control_domain
.................. unassign_control_domain
To assign a control domain to the $uuid mediated matrix device's
ADM, write its domain number to the assign_control_domain file.
To unassign a domain, write its domain number to the
unassign_control_domain file. The domain number is specified
using conventional semantics: If it begins with 0x the number
will be parsed as a hexadecimal (case insensitive) number;
otherwise, it will be parsed as a decimal number.
For example, to assign control domain 173 (0xad) to the mediated
matrix device $uuid:
echo 173 > assign_control_domain
or
echo 0xad > assign_control_domain
To unassign control domain 173 (0xad):
echo 173 > unassign_control_domain
or
echo 0xad > unassign_control_domain
The assignment will be rejected if the APQI exceeds the maximum
value for an AP domain:
* If the AP Extended Addressing (APXA) facility is installed,
the max value is 255
* Else the max value is 15
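The parsing and range check can be sketched in userspace C as below. This
is an illustration under assumptions, not the driver code: strtoul() with
base 0 stands in for kstrtoul(), and the apxa_installed flag is a
hypothetical stand-in for the APXA facility test. Note that base 0 also
treats a leading 0 without x as octal, in both strtoul() and kstrtoul().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a control domain number and check it against the maximum.
 * Returns 0 and stores the ID on success, -EINVAL otherwise.
 */
static int parse_domain_id(const char *s, bool apxa_installed,
			   unsigned long *id)
{
	unsigned long max = apxa_installed ? 255 : 15;
	unsigned long val;
	char *end;

	assert(s != NULL && id != NULL);

	errno = 0;
	val = strtoul(s, &end, 0); /* base 0: 0x prefix selects hex */
	if (errno != 0 || end == s)
		return -EINVAL;

	/* Allow one trailing newline, as written by echo. */
	if (*end == '\n')
		end++;
	if (*end != '\0' || val > max)
		return -EINVAL;

	*id = val;
	return 0;
}
```

Both "173" and "0xad" parse to the same ID when APXA is installed; without
APXA the same input is rejected because it exceeds 15.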
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 107 +++++++++++++++++++++++++++++++++++++
2 files changed, 108 insertions(+), 0 deletions(-)
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 8ee196e..3cc305b 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -19,6 +19,7 @@
#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
#define KVM_AP_MAX_APM_INDEX(matrix) (matrix->apm_max - 1)
#define KVM_AP_MAX_AQM_INDEX(matrix) (matrix->aqm_max - 1)
+#define KVM_AP_MAX_ADM_INDEX(matrix) (matrix->adm_max - 1)
/**
* The AP matrix is comprised of three bit masks identifying the adapters,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index d4f9310..413ecbb 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -597,11 +597,118 @@ static ssize_t unassign_domain_store(struct device *dev,
}
DEVICE_ATTR_WO(unassign_domain);
+
+/**
+ * assign_control_domain_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the control domain ID to be assigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the domain ID from @buf and assigns it to the mediated matrix device.
+ *
+ * Returns the number of bytes processed if the domain ID is valid; otherwise
+ * returns an error.
+ */
+static ssize_t assign_control_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long id;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_ADM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &id);
+ if (ret || (id > maxid)) {
+ pr_err("%s: control domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ /* Set the bit in the ADM (bitmask) corresponding to the AP control
+ * domain number (id). The bits in the mask, from most significant to
+ * least significant, correspond to IDs 0 up to the one less than the
+ * number of control domains that can be assigned.
+ */
+ set_bit_inv(id, matrix_mdev->matrix->adm);
+
+ return count;
+}
+DEVICE_ATTR_WO(assign_control_domain);
+
+/**
+ * unassign_control_domain_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the control domain ID to be unassigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the domain ID from @buf and unassigns it from the mediated matrix
+ * device.
+ *
+ * Returns the number of bytes processed if the domain ID is valid; otherwise
+ * returns an error.
+ */
+static ssize_t unassign_control_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apqi;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_ADM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apqi);
+ if (ret || (apqi > maxid)) {
+ pr_err("%s: control domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ clear_bit_inv((unsigned long)apqi,
+ (unsigned long *)matrix_mdev->matrix->adm);
+
+ return count;
+}
+DEVICE_ATTR_WO(unassign_control_domain);
+
+static ssize_t control_domains_show(struct device *dev,
+ struct device_attribute *dev_attr,
+ char *buf)
+{
+ unsigned long id;
+ int nchars = 0;
+ int n;
+ char *bufpos = buf;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+
+ for_each_set_bit_inv(id, matrix_mdev->matrix->adm,
+ matrix_mdev->matrix->adm_max) {
+ n = sprintf(bufpos, "%04lx\n", id);
+ bufpos += n;
+ nchars += n;
+ }
+
+ return nchars;
+}
+DEVICE_ATTR_RO(control_domains);
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
&dev_attr_assign_domain.attr,
&dev_attr_unassign_domain.attr,
+ &dev_attr_assign_control_domain.attr,
+ &dev_attr_unassign_control_domain.attr,
+ &dev_attr_control_domains.attr,
NULL,
};
--
1.7.1
Registers the matrix device created by the VFIO AP device
driver with the VFIO mediated device framework.
Registering the matrix device will create the sysfs
structures needed to create mediated matrix devices
each of which will be used to configure the AP matrix
for a guest and connect it to the VFIO AP device driver.
Registering the matrix device with the VFIO mediated device
framework will create the following sysfs structures:
/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ create
To create a mediated device for the AP matrix device, write a UUID
to the create file:
uuidgen > create
A symbolic link to the mediated device's directory will be created in the
devices subdirectory named after the generated $uuid:
/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
............... [$uuid]
Signed-off-by: Tony Krowiak <[email protected]>
---
MAINTAINERS | 1 +
drivers/s390/crypto/Makefile | 2 +-
drivers/s390/crypto/vfio_ap_drv.c | 9 +++
drivers/s390/crypto/vfio_ap_ops.c | 105 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 17 +++++
5 files changed, 133 insertions(+), 1 deletions(-)
create mode 100644 drivers/s390/crypto/vfio_ap_ops.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 1124c4c..6eb8d8a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12226,6 +12226,7 @@ W: http://www.ibm.com/developerworks/linux/linux390/
S: Supported
F: drivers/s390/crypto/vfio_ap_drv.c
F: drivers/s390/crypto/vfio_ap_private.h
+F: drivers/s390/crypto/vfio_ap_ops.c
S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
diff --git a/drivers/s390/crypto/Makefile b/drivers/s390/crypto/Makefile
index 48e466e..8d36b05 100644
--- a/drivers/s390/crypto/Makefile
+++ b/drivers/s390/crypto/Makefile
@@ -17,5 +17,5 @@ pkey-objs := pkey_api.o
obj-$(CONFIG_PKEY) += pkey.o
# adjunct processor matrix
-vfio_ap-objs := vfio_ap_drv.o
+vfio_ap-objs := vfio_ap_drv.o vfio_ap_ops.o
obj-$(CONFIG_VFIO_AP) += vfio_ap.o
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index 014d70f..cc7fbd7 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -121,11 +121,20 @@ int __init vfio_ap_init(void)
return ret;
}
+ ret = vfio_ap_mdev_register(ap_matrix);
+ if (ret) {
+ ap_driver_unregister(&vfio_ap_drv);
+ vfio_ap_matrix_dev_destroy(ap_matrix);
+
+ return ret;
+ }
+
return 0;
}
void __exit vfio_ap_exit(void)
{
+ vfio_ap_mdev_unregister(ap_matrix);
ap_driver_unregister(&vfio_ap_drv);
vfio_ap_matrix_dev_destroy(ap_matrix);
}
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
new file mode 100644
index 0000000..d41b0b8
--- /dev/null
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Adjunct processor matrix VFIO device driver callbacks.
+ *
+ * Copyright IBM Corp. 2018
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+#include <linux/string.h>
+#include <linux/vfio.h>
+#include <linux/device.h>
+#include <linux/list.h>
+#include <linux/ctype.h>
+
+#include "vfio_ap_private.h"
+
+#define VFOP_AP_MDEV_TYPE_HWVIRT "passthrough"
+#define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device"
+
+static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
+{
+ struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+
+ ap_matrix->available_instances--;
+
+ return 0;
+}
+
+static int vfio_ap_mdev_remove(struct mdev_device *mdev)
+{
+ struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+
+ ap_matrix->available_instances++;
+
+ return 0;
+}
+
+static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
+{
+ return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
+}
+
+MDEV_TYPE_ATTR_RO(name);
+
+static ssize_t available_instances_show(struct kobject *kobj,
+ struct device *dev, char *buf)
+{
+ struct ap_matrix *ap_matrix;
+
+ ap_matrix = to_ap_matrix(dev);
+
+ return sprintf(buf, "%d\n", ap_matrix->available_instances);
+}
+
+MDEV_TYPE_ATTR_RO(available_instances);
+
+static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
+ char *buf)
+{
+ return sprintf(buf, "%s\n", VFIO_DEVICE_API_AP_STRING);
+}
+
+MDEV_TYPE_ATTR_RO(device_api);
+
+static struct attribute *vfio_ap_mdev_type_attrs[] = {
+ &mdev_type_attr_name.attr,
+ &mdev_type_attr_device_api.attr,
+ &mdev_type_attr_available_instances.attr,
+ NULL,
+};
+
+static struct attribute_group vfio_ap_mdev_hwvirt_type_group = {
+ .name = VFOP_AP_MDEV_TYPE_HWVIRT,
+ .attrs = vfio_ap_mdev_type_attrs,
+};
+
+static struct attribute_group *vfio_ap_mdev_type_groups[] = {
+ &vfio_ap_mdev_hwvirt_type_group,
+ NULL,
+};
+
+static const struct mdev_parent_ops vfio_ap_matrix_ops = {
+ .owner = THIS_MODULE,
+ .supported_type_groups = vfio_ap_mdev_type_groups,
+ .create = vfio_ap_mdev_create,
+ .remove = vfio_ap_mdev_remove,
+};
+
+int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
+{
+ int ret;
+
+ ret = mdev_register_device(&ap_matrix->device, &vfio_ap_matrix_ops);
+ if (ret)
+ return ret;
+
+ ap_matrix->available_instances = AP_MATRIX_MAX_AVAILABLE_INSTANCES;
+
+ return 0;
+}
+
+void vfio_ap_mdev_unregister(struct ap_matrix *ap_matrix)
+{
+ ap_matrix->available_instances--;
+ mdev_unregister_device(&ap_matrix->device);
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 15ed458..c47aeec 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -10,14 +10,31 @@
#define _VFIO_AP_PRIVATE_H_
#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/mdev.h>
#include "ap_bus.h"
#define VFIO_AP_MODULE_NAME "vfio_ap"
#define VFIO_AP_DRV_NAME "vfio_ap"
+/**
+ * There must be one mediated matrix device per guest. If every APQN is assigned
+ * to a guest, then the maximum number of guests with a unique APQN assigned
+ * would be 256 adapters x 256 domains = 65536 guests.
+ */
+#define AP_MATRIX_MAX_AVAILABLE_INSTANCES 65536
struct ap_matrix {
struct device device;
+ int available_instances;
};
+static inline struct ap_matrix *to_ap_matrix(struct device *dev)
+{
+ return container_of(dev, struct ap_matrix, device);
+}
+
+extern int vfio_ap_mdev_register(struct ap_matrix *ap_matrix);
+extern void vfio_ap_mdev_unregister(struct ap_matrix *ap_matrix);
+
#endif /* _VFIO_AP_PRIVATE_H_ */
--
1.7.1
Provides the sysfs interfaces for assigning AP domains to
and unassigning AP domains from a mediated matrix device.
An AP domain ID corresponds to an AP queue index (APQI). For
each domain assigned to the mediated matrix device, its
corresponding APQI is stored in an AP queue mask (AQM).
The bits in the AQM, from most significant to least
significant bit, correspond to AP domain numbers 0 to 255.
When a domain is assigned, the bit corresponding to its
APQI will be set in the AQM. Likewise, when a domain is
unassigned, the bit corresponding to its APQI will be
cleared from the AQM.
The relevant sysfs structures are:
/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. assign_domain
.................. unassign_domain
To assign a domain to the $uuid mediated matrix device,
write the domain's ID to the assign_domain file. To
unassign a domain, write the domain's ID to the
unassign_domain file. The ID is specified using
conventional semantics: If it begins with 0x, the number
will be parsed as a hexadecimal (case insensitive) number;
otherwise, it will be parsed as a decimal number.
For example, to assign domain 173 (0xad) to the mediated matrix
device $uuid:
echo 173 > assign_domain
or
echo 0xad > assign_domain
To unassign domain 173 (0xad):
echo 173 > unassign_domain
or
echo 0xad > unassign_domain
The assignment will be rejected:
* If the domain ID exceeds the maximum value for an AP domain:
* If the AP Extended Addressing (APXA) facility is installed,
the max value is 255
* Else the max value is 15
* If no AP adapters have yet been assigned and there are
no AP queues reserved by the VFIO AP driver that have an APQN
with an APQI matching that of the AP domain number being
assigned.
* If any of the APQNs that can be derived from the intersection
of the APQI being assigned and the AP adapter ID (APID) of
each of the AP adapters previously assigned can not be matched
with an APQN of an AP queue device reserved by the VFIO AP
driver.
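The last two rejection rules can be sketched in userspace C as follows.
This is a simplified model, not the driver code: struct apqn and the
array-based parameters are hypothetical stand-ins for the AP queue devices
bound to the vfio_ap driver and for the adapters already assigned in the
APM. The error codes mirror the ones used by the patch (-ENODEV and
-ENXIO).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an AP queue reserved by the vfio_ap driver. */
struct apqn {
	unsigned int apid;
	unsigned int apqi;
};

static bool queue_reserved(const struct apqn *reserved, size_t n,
			   unsigned int apid, unsigned int apqi)
{
	for (size_t i = 0; i < n; i++)
		if (reserved[i].apid == apid && reserved[i].apqi == apqi)
			return true;
	return false;
}

/* Returns 0 if domain `apqi` may be assigned, a negative errno if not. */
static int validate_domain(const struct apqn *reserved, size_t n_reserved,
			   const unsigned int *apids, size_t n_apids,
			   unsigned int apqi)
{
	assert(reserved != NULL || n_reserved == 0);

	/* No adapters yet: some reserved queue must carry this APQI. */
	if (n_apids == 0) {
		for (size_t i = 0; i < n_reserved; i++)
			if (reserved[i].apqi == apqi)
				return 0;
		return -ENODEV;
	}

	/* Otherwise every (assigned APID, apqi) pair must be reserved. */
	for (size_t i = 0; i < n_apids; i++)
		if (!queue_reserved(reserved, n_reserved, apids[i], apqi))
			return -ENXIO;

	return 0;
}
```

With queues (4,1), (4,2) and (3,1) reserved and adapters 3 and 4 already
assigned, domain 1 is accepted but domain 2 is rejected, because queue
(3,2) is not reserved.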
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 221 ++++++++++++++++++++++++++++++++++++-
2 files changed, 221 insertions(+), 1 deletions(-)
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 5ebb171..8ee196e 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -18,6 +18,7 @@
#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
#define KVM_AP_MAX_APM_INDEX(matrix) (matrix->apm_max - 1)
+#define KVM_AP_MAX_AQM_INDEX(matrix) (matrix->aqm_max - 1)
/**
* The AP matrix is comprised of three bit masks identifying the adapters,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 6d32adb..d4f9310 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -380,10 +380,229 @@ static ssize_t unassign_adapter_store(struct device *dev,
}
DEVICE_ATTR_WO(unassign_adapter);
+/**
+ * vfio_ap_validate_queues_for_apqi
+ *
+ * @ap_matrix: the matrix device
+ * @matrix_mdev: the mediated matrix device
+ * @apqi: an AP queue index (APQI) - corresponds to a domain ID
+ *
+ * Verifies that each APQN that is derived from the intersection of @apqi and
+ * the AP adapter ID (APID) of each AP adapter assigned to the @matrix_mdev
+ * matches the APQN of an AP queue reserved by the VFIO AP device
+ * driver.
+ *
+ * Returns 0 if validation succeeds; otherwise, returns an error.
+ */
+static int vfio_ap_validate_queues_for_apqi(struct ap_matrix *ap_matrix,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ int ret;
+ struct vfio_ap_qid_match qid_match;
+ unsigned long apid;
+ struct device_driver *drv = ap_matrix->device.driver;
+
+ /**
+ * Examine each APQN with the specified APQI
+ */
+ for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
+ matrix_mdev->matrix->apm_max) {
+ qid_match.qid = AP_MKQID(apid, apqi);
+ qid_match.dev = NULL;
+
+ ret = driver_for_each_device(drv, NULL, &qid_match,
+ vfio_ap_queue_match);
+ if (ret) {
+ pr_err("%s: Error %d validating AP queue %02lx.%04lx reservation",
+ VFIO_AP_MODULE_NAME, ret, apid, apqi);
+
+ return ret;
+ }
+
+ /*
+ * If the APQN identifies an AP queue that is reserved by the
+ * VFIO AP device driver, continue processing.
+ */
+ if (qid_match.dev)
+ continue;
+
+ pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver",
+ VFIO_AP_MODULE_NAME, apid, apqi, VFIO_AP_DRV_NAME);
+
+ return -ENXIO;
+ }
+
+ return 0;
+}
+
+struct vfio_ap_apqi_reserved {
+ unsigned long apqi;
+ bool reserved;
+};
+
+/**
+ * vfio_ap_queue_id_contains_apqi
+ *
+ * @dev: an AP queue device
+ * @data: a struct vfio_ap_apqi_reserved holding the APQI to look for
+ *
+ * Sets the reserved flag in @data if the APQI matches the queue index of the
+ * AP queue's identifier. Always returns 0 so that iteration continues.
+ */
+static int vfio_ap_queue_id_contains_apqi(struct device *dev, void *data)
+{
+ struct vfio_ap_apqi_reserved *apqi_res = data;
+ struct ap_queue *ap_queue = to_ap_queue(dev);
+
+ if (apqi_res->apqi == AP_QID_QUEUE(ap_queue->qid))
+ apqi_res->reserved = true;
+
+ return 0;
+}
+
+/**
+ * vfio_ap_verify_apqi_reserved
+ *
+ * @ap_matrix: the AP matrix configured for the mediated matrix device
+ * @apqi: the AP queue index (APQI) - corresponds to domain ID
+ *
+ * Verifies that at least one AP queue reserved by the VFIO AP device driver
+ * has an APQN containing @apqi.
+ *
+ * Returns 0 if the APQI is reserved; otherwise, returns -ENODEV.
+ */
+static int vfio_ap_verify_apqi_reserved(struct ap_matrix *ap_matrix,
+ unsigned long apqi)
+{
+ int ret;
+ struct vfio_ap_apqi_reserved apqi_res = { .reserved = false };
+
+ apqi_res.apqi = apqi;
+
+ ret = driver_for_each_device(ap_matrix->device.driver, NULL,
+ &apqi_res,
+ vfio_ap_queue_id_contains_apqi);
+ if (ret) {
+ pr_err("%s: Error %d validating AP queue index %04lx reservation",
+ VFIO_AP_MODULE_NAME, ret, apqi);
+ return ret;
+ }
+
+ if (apqi_res.reserved)
+ return 0;
+
+ pr_err("%s: no APQNs with domain ID %02lx are reserved by %s driver",
+ VFIO_AP_MODULE_NAME, apqi, VFIO_AP_DRV_NAME);
+
+ return -ENODEV;
+}
+
+/**
+ * vfio_ap_validate_apqi
+ *
+ * @matrix_mdev: the mediated matrix device
+ * @apqi: the APQI (domain ID) to validate
+ *
+ * Validates the value of @apqi:
+ * * If there are no AP adapters assigned, then there must be at least
+ * one AP queue device reserved by the VFIO AP device driver with an
+ * APQN containing @apqi.
+ *
+ * * Else each APQN that can be derived from the intersection of @apqi and
+ * the IDs of the AP adapters already assigned must identify an AP queue
+ * that has been reserved by the VFIO AP device driver.
+ *
+ * Returns 0 if the value of @apqi is valid; otherwise, returns an error.
+ */
+static int vfio_ap_validate_apqi(struct mdev_device *mdev,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ int ret;
+ struct device *dev = mdev_parent_dev(mdev);
+ struct ap_matrix *ap_matrix = to_ap_matrix(dev);
+ unsigned long apid;
+
+ apid = find_first_bit_inv(matrix_mdev->matrix->apm,
+ matrix_mdev->matrix->apm_max);
+ /* If there are no adapters assigned */
+ if (apid == matrix_mdev->matrix->apm_max) {
+ ret = vfio_ap_verify_apqi_reserved(ap_matrix, apqi);
+ } else {
+ ret = vfio_ap_validate_queues_for_apqi(ap_matrix, matrix_mdev,
+ apqi);
+ }
+
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static ssize_t assign_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apqi;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_AQM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apqi);
+ if (ret || (apqi > maxid)) {
+ pr_err("%s: domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ ret = vfio_ap_validate_apqi(mdev, matrix_mdev, apqi);
+ if (ret)
+ return ret;
+
+ /* Set the bit in the AQM (bitmask) corresponding to the AP domain
+ * number (APQI). The bits in the mask, from most significant to least
+ * significant, correspond to numbers 0-255.
+ */
+ set_bit_inv(apqi, matrix_mdev->matrix->aqm);
+
+ return count;
+}
+DEVICE_ATTR_WO(assign_domain);
+
+static ssize_t unassign_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apqi;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_AQM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apqi);
+ if (ret || (apqi > maxid)) {
+ pr_err("%s: domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ clear_bit_inv((unsigned long)apqi,
+ (unsigned long *)matrix_mdev->matrix->aqm);
+
+ return count;
+}
+DEVICE_ATTR_WO(unassign_domain);
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
- NULL
+ &dev_attr_assign_domain.attr,
+ &dev_attr_unassign_domain.attr,
+ NULL,
};
static struct attribute_group vfio_ap_mdev_attr_group = {
--
1.7.1
This patch provides documentation describing the AP architecture and
design concepts behind the virtualization of AP devices. It also
includes an example of how to configure AP devices for exclusive
use of KVM guests.
Signed-off-by: Tony Krowiak <[email protected]>
---
Documentation/s390/vfio-ap.txt | 567 ++++++++++++++++++++++++++++++++++++++++
MAINTAINERS | 1 +
2 files changed, 568 insertions(+), 0 deletions(-)
create mode 100644 Documentation/s390/vfio-ap.txt
diff --git a/Documentation/s390/vfio-ap.txt b/Documentation/s390/vfio-ap.txt
new file mode 100644
index 0000000..a1e888a
--- /dev/null
+++ b/Documentation/s390/vfio-ap.txt
@@ -0,0 +1,567 @@
+Introduction:
+============
+The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprised
+of three AP instructions and from 1 up to 256 PCIe cryptographic adapter cards.
+The AP devices provide cryptographic functions to all CPUs assigned to a
+Linux system running in an IBM Z system LPAR.
+
+The AP adapter cards are exposed via the AP bus. The motivation for vfio-ap
+is to make AP cards available to KVM guests using the VFIO mediated device
+framework. This implementation relies considerably on the s390 virtualization
+facilities which do most of the hard work of providing direct access to AP
+devices.
+
+AP Architectural Overview:
+=========================
+To facilitate the comprehension of the design, let's start with some
+definitions:
+
+* AP adapter
+
+ An AP adapter is an IBM Z adapter card that can perform cryptographic
+ functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
+ assigned to the LPAR in which a Linux host is running will be available to
+ the Linux host. Each adapter is identified by a number from 0 to 255. When
+ installed, an AP adapter is accessed by AP instructions executed by any CPU.
+
+ The AP adapter cards are assigned to a given LPAR via the system's Activation
+ Profile which can be edited via the HMC. When the system is IPL'd, the AP bus
+ module is loaded and detects the AP adapter cards assigned to the LPAR. The AP
+ bus creates a sysfs device for each adapter as they are detected. For example,
+ if AP adapters 4 and 10 (0x0a) are assigned to the LPAR, the AP bus will
+ create the following sysfs entries:
+
+ /sys/devices/ap/card04
+ /sys/devices/ap/card0a
+
+ Symbolic links to these devices will also be created in the AP bus devices
+ sub-directory:
+
+ /sys/bus/ap/devices/[card04]
+ /sys/bus/ap/devices/[card0a]
+
+* AP domain
+
+ An adapter is partitioned into domains. Each domain can be thought of as
+ a set of hardware registers for processing AP instructions. An adapter can
+ hold up to 256 domains. Each domain is identified by a number from 0 to 255.
+ Domains can be further classified into two types:
+
+ * Usage domains are domains that can be accessed directly to process AP
+ commands.
+
+ * Control domains are domains that are accessed indirectly by AP
+ commands sent to a usage domain to control or change the domain; for
+ example, to set a secure private key for the domain.
+
+ The AP usage and control domains are assigned to a given LPAR via the system's
+ Activation Profile which can be edited via the HMC. When the system is IPL'd,
+ the AP bus module is loaded and detects the AP usage and control domains
+ assigned to the LPAR. The domain number of each usage domain will be coupled
+ with the adapter number of each AP adapter assigned to the LPAR to identify
+ the AP queues (see AP Queue section below). The domain number of each control
+ domain will be represented in a bitmask and stored in a sysfs file
+ /sys/bus/ap/ap_control_domain_mask created by the bus. The bits in the mask,
+ from most to least significant bit, correspond to domains 0-255.
+
+ A domain may be assigned to a system as both a usage and control domain, or
+ as a control domain only. Consequently, all domains assigned as both a usage
+ and control domain can both process AP commands as well as be changed by an AP
+ command sent to any usage domain assigned to the same system. Domains assigned
+ only as control domains can not process AP commands but can be changed by AP
+ commands sent to any usage domain assigned to the system.
+
+* AP Queue
+
+ An AP queue is the means by which an AP command-request message is sent to a
+ usage domain inside a specific adapter. An AP queue is identified by a tuple
+ comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
+ APQI corresponds to a given usage domain number within the adapter. This tuple
+ forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
+ instructions include a field containing the APQN to identify the AP queue to
+ which the AP command-request message is to be sent for processing.
+
+ The AP bus will create a sysfs device for each APQN that can be derived from
+ the intersection of the AP adapter and usage domain numbers detected when the
+ AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
+ domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
+ following sysfs entries:
+
+ /sys/devices/ap/card04/04.0006
+ /sys/devices/ap/card04/04.0047
+ /sys/devices/ap/card0a/0a.0006
+ /sys/devices/ap/card0a/0a.0047
+
+ The following symbolic links to these devices will be created in the AP bus
+ devices subdirectory:
+
+ /sys/bus/ap/devices/[04.0006]
+ /sys/bus/ap/devices/[04.0047]
+ /sys/bus/ap/devices/[0a.0006]
+ /sys/bus/ap/devices/[0a.0047]
+
+* AP Instructions:
+
+ There are three AP instructions:
+
+ * NQAP: to enqueue an AP command-request message to a queue
+ * DQAP: to dequeue an AP command-reply message from a queue
+ * PQAP: to administer the queues
+
+AP and SIE:
+==========
+Let's now see how AP instructions are interpreted by the hardware.
+
+A satellite control block called the Crypto Control Block (CRYCB) is attached
+to our main hardware virtualization control block. The CRYCB contains three
+fields to identify the adapters, usage domains and control domains assigned to
+the KVM guest:
+
+* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
+ to the KVM guest. Each bit in the mask, from most significant to least
+ significant bit, corresponds to an APID from 0-255. If a bit is set, the
+ corresponding adapter is valid for use by the KVM guest.
+
+* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
+ assigned to the KVM guest. Each bit in the mask, from most significant to
+ least significant bit, corresponds to an AP queue index (APQI) from 0-255. If
+ a bit is set, the corresponding queue is valid for use by the KVM guest.
+
+* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
+ domains assigned to the KVM guest. The ADM bit mask controls which domains can
+ be changed by an AP command-request message sent to a usage domain from the
+ guest. Each bit in the mask, from most significant to least significant bit,
+ corresponds to a domain from 0-255. If a bit is set, the corresponding domain
+ can be modified by an AP command-request message sent to a usage domain
+ configured for the KVM guest.
+
+If you recall from the description of an AP Queue, AP instructions include
+an APQN to identify the AP adapter and AP queue to which an AP command-request
+message is to be sent (NQAP and PQAP instructions), or from which a
+command-reply message is to be received (DQAP instruction). The validity of an
+APQN is defined by the matrix calculated from the APM and AQM; it is the
+cross product of all assigned adapter numbers (APM) with all assigned queue
+indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
+assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
+the guest.
+
+The APQNs can provide secure key functionality - i.e., a private key is stored
+on the adapter card for each of its domains - so each APQN must be assigned to
+at most one guest or the linux host.
+
+ Example 1: Valid configuration:
+ ------------------------------
+ Guest1: adapters 1,2 domains 5,6
+ Guest2: adapters 1,2 domain 7
+
+ This is valid because both guests have a unique set of APQNs: Guest1 has
+ APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7).
+
+ Example 2: Invalid configuration:
+ --------------------------------
+ Guest1: adapters 1,2 domains 5,6
+ Guest2: adapter 1 domains 6,7
+
+ This is an invalid configuration because both guests have access to
+ APQN (1,6).
+
+The Design:
+===========
+The design introduces three new objects:
+
+1. AP matrix device
+2. VFIO AP device driver (vfio_ap.ko)
+3. AP mediated matrix passthrough device
+
+The VFIO AP device driver
+-------------------------
+The VFIO AP (vfio_ap) device driver serves the following purposes:
+
+1. Provides the interfaces to reserve APQNs for exclusive use of KVM guests.
+
+2. Sets up the VFIO mediated device interfaces to manage the mediated matrix
+ device and create the sysfs interfaces for assigning adapters, usage domains,
+ and control domains comprising the matrix for a KVM guest.
+
+3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's
+ SIE state description to grant the guest access to AP devices.
+
+4. Initializes the CPU model feature indicating that a KVM guest may use
+ AP facilities installed on the linux host.
+
+5. Enables interpretive execution mode for the KVM guest.
+
+Reserve APQNs for exclusive use of KVM guests
+---------------------------------------------
+The following block diagram illustrates the mechanism by which APQNs are
+reserved:
+
+ +------------------+
+ remove | | unbind
+ +------------------->+ cex4queue driver +<-----------+
+ | | | |
+ | +------------------+ |
+ | |
+ | |
+ | |
++--------+---------+ register +------------------+ +-----+------+
+| +<---------+ | bind | |
+| ap_bus | | vfio_ap driver +<-----+ admin |
+| +--------->+ | | |
++------------------+ probe +---+--------+-----+ +------------+
+ | |
+ create | | store APQN
+ | |
+ v v
+ +---+--------+-----+
+ | |
+ | matrix device |
+ | |
+ +------------------+
+
+The process for reserving an AP queue for use by a KVM guest is:
+
+* The vfio_ap driver during its initialization will perform the following:
+ * Create the 'vfio_ap' root device - /sys/devices/vfio_ap
+ * Create the 'matrix' device in the 'vfio_ap' root
+ * Register the matrix device with the device core
+* The vfio_ap driver registers with the ap_bus for AP queue devices of type
+ CEX4, CEX5 and CEX6, providing its probe and remove callback interfaces.
+* The admin unbinds queue cc.qqqq from the cex4queue device driver. This results
+ in the ap_bus calling the device driver's remove interface which
+ unbinds the cc.qqqq queue device from the driver.
+* The admin binds the cc.qqqq queue to the vfio_ap device driver. This results
+ in the ap_bus calling the vfio_ap device driver's probe interface to bind
+ queue cc.qqqq to the driver. The vfio_ap device driver will store the APQN for
+ the queue in the matrix device.
+
+Set up the VFIO mediated device interfaces
+------------------------------------------
+The VFIO AP device driver utilizes the common interface of the VFIO mediated
+device core driver to:
+* Register an AP mediated bus driver to add a mediated matrix device to and
+ remove it from a VFIO group.
+* Create and destroy a mediated matrix device
+* Add a mediated matrix device to and remove it from the AP mediated bus driver
+* Add a mediated matrix device to and remove it from an IOMMU group
+
+The following high-level block diagram shows the main components and interfaces
+of the VFIO AP mediated matrix device driver:
+
+ +-------------+
+ | |
+ | +---------+ | mdev_register_driver() +--------------+
+ | | Mdev | +<-----------------------+ |
+ | | bus | | | vfio_mdev.ko |
+ | | driver | +----------------------->+ |<-> VFIO user
+ | +---------+ | probe()/remove() +--------------+ APIs
+ | |
+ | MDEV CORE |
+ | MODULE |
+ | mdev.ko |
+ | +---------+ | mdev_register_device() +--------------+
+ | |Physical | +<-----------------------+ |
+ | | device | | | vfio_ap.ko |<-> matrix
+ | |interface| +----------------------->+ | device
+ | +---------+ | callback +--------------+
+ +-------------+
+
+During initialization of the vfio_ap module, the matrix device is registered
+with an 'mdev_parent_ops' structure that provides the sysfs attribute
+structures, mdev functions and callback interfaces for managing the mediated
+matrix device.
+
+* sysfs attribute structures:
+ * supported_type_groups
+ The VFIO mediated device framework supports creation of user-defined
+ mediated device types. These mediated device types are specified
+ via the 'supported_type_groups' structure when a device is registered
+ with the mediated device framework. The registration process creates the
+ sysfs structures for each mediated device type specified in the
+ 'mdev_supported_types' sub-directory of the device being registered. Along
+ with the device type, the sysfs attributes of the mediated device type are
+ provided.
+
+ The VFIO AP device driver will register one mediated device type for
+ passthrough devices:
+ /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough
+ Only the three read-only attributes required by the VFIO mdev framework will
+ be provided:
+ /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough
+ ... name
+ ... device_api
+ ... available_instances
+ Where:
+ * name: specifies the name of the mediated device type
+ * device_api: the mediated device type's API
+ * available_instances: the number of mediated matrix passthrough devices
+ that can be created
+ * mdev_attr_groups
+ This attribute group identifies the user-defined sysfs attributes of the
+ mediated device. When a device is registered with the VFIO mediated device
+ framework, the sysfs attributes files identified in the 'mdev_attr_groups'
+ structure will be created in the mediated matrix device's directory. The
+ sysfs attributes for a mediated matrix device are:
+ * assign_adapter:
+ A write-only file for assigning an AP adapter to the mediated matrix
+ device. To assign an adapter, the APID of the adapter is written to the
+ file.
+ * assign_domain:
+ A write-only file for assigning an AP usage domain to the mediated matrix
+ device. To assign a domain, the APQI of the AP queue corresponding to a
+ usage domain is written to the file.
+ * assign_control_domain:
+ A write-only file for assigning an AP control domain to the mediated
+ matrix device. To assign a control domain, the ID of a domain to be
+ controlled is written to the file. For the initial implementation, the set
+ of control domains will always include the set of usage domains, so it is
+ only necessary to assign control domains that are not also assigned as
+ usage domains.
+
+* functions:
+ * create:
+ allocates the ap_matrix_mdev structure used by the vfio_ap driver to:
+ * Keep track of the available instances
+ * Store the reference to the struct kvm for the KVM guest
+ * Provide the notifier callback that will get invoked to handle the
+ VFIO_GROUP_NOTIFY_SET_KVM event. When received, the vfio_ap driver will
+ store the reference in the mediated matrix device's ap_matrix_mdev
+ structure and enable the interpretive execution mode for the KVM guest.
+ * remove:
+ deallocates the mediated matrix device's ap_matrix_mdev structure.
+
+* callback interfaces
+ * open:
+ The vfio_ap driver uses this callback to register a
+ VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the mdev matrix
+ device. The notifier is invoked when QEMU connects the VFIO iommu group
+ for the mdev matrix device to the MDEV bus. Access to the KVM structure used
+ to set up the KVM guest is provided via this callback.
+ * release:
+ unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the
+ mdev matrix device.
+
+Configure the APM, AQM and ADM in the CRYCB:
+-------------------------------------------
+Configuring the AP matrix for a KVM guest will be performed when the
+VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier callback
+function is called when QEMU connects the VFIO iommu group for the mdev matrix
+device to the MDEV bus. The CRYCB is configured by:
+* Setting the bits in the APM corresponding to the APIDs assigned to the
+ mediated matrix device via its 'assign_adapter' interface.
+* Setting the bits in the AQM corresponding to the APQIs assigned to the
+ mediated matrix device via its 'assign_domain' interface.
+* Setting the bits in the ADM corresponding to the domain IDs assigned to the
+ mediated matrix device via its 'assign_control_domain' interface.
+
+Initialize the CPU model feature for AP
+---------------------------------------
+This design exploits a feature of the SIE architecture called interpretive
+execution (IE). When IE is enabled for a KVM guest, the AP instructions
+executed in the guest will be interpreted by the firmware and the commands
+contained therein will be passed directly through to an AP device assigned to
+the linux host. In order to enable interpretive execution for a KVM guest, SIE
+must have access to the AP facilities installed on the linux host. A new CPU
+model feature is introduced by this design to indicate that the guest will
+directly access the host AP facilities. This feature will be enabled by the
+kernel only if the AP facilities are installed on the linux host. The feature
+must be turned on for the guest in order to access AP devices from the guest.
+For example, to turn the AP facilities on from the QEMU command line:
+
+ /usr/bin/qemu-system-s390x ... -cpu xxx,ap=on
+
+ Where xxx is the CPU model being used.
+
+ If the CPU model feature is not enabled by the kernel, QEMU will fail and
+ report that the feature is not supported.
+
+Example:
+=======
+Let's now provide an example to illustrate how KVM guests may be given
+access to AP facilities. For this example, we will show how to configure
+two guests such that executing the lszcrypt command on the guests would
+look like this:
+
+Guest1
+------
+CARD.DOMAIN TYPE MODE
+------------------------------
+05 CEX5C CCA-Coproc
+05.0004 CEX5C CCA-Coproc
+05.00ab CEX5C CCA-Coproc
+06 CEX5A Accelerator
+06.0004 CEX5A Accelerator
+06.00ab CEX5A Accelerator
+
+Guest2
+------
+CARD.DOMAIN TYPE MODE
+------------------------------
+05 CEX5C CCA-Coproc
+05.0047 CEX5C CCA-Coproc
+05.00ff CEX5C CCA-Coproc
+
+These are the steps:
+
+1. Install the vfio_ap module on the linux host. The dependency chain for the
+ vfio_ap module is:
+ * vfio
+ * mdev
+ * vfio_mdev
+ * vfio_ap
+
+2. Secure the AP queues to be used by the two guests so that the host can not
+ access them. This is done by unbinding each AP Queue device from its
+ respective AP driver. In our example, these queues are bound to the cex4queue
+ driver. The sysfs location of these devices is:
+
+ /sys/bus/ap
+ --- [drivers]
+ ------ [cex4queue]
+ --------- [05.0004]
+ --------- [05.0047]
+ --------- [05.00ab]
+ --------- [05.00ff]
+ --------- [06.0004]
+ --------- [06.00ab]
+ --------- unbind
+
+ To unbind AP queue 05.0004 from the cex4queue device driver:
+
+ echo 05.0004 > unbind
+
+ This must also be done for AP queues 05.00ab, 05.0047, 05.00ff, 06.0004,
+ and 06.00ab.
+
+3. Reserve the queues for use by the two KVM guests. This is accomplished by
+ binding them to the vfio_ap device driver. The sysfs location of the
+ device driver is:
+
+ /sys/bus/ap
+ ---[drivers]
+ ------ [vfio_ap]
+ ---------- bind
+
+ To bind queue 05.0004 to the vfio_ap driver:
+
+ echo 05.0004 > bind
+
+ This must also be done for AP queues 05.00ab, 05.0047, 05.00ff, 06.0004,
+ and 06.00ab.
+
+ Take note that the AP queues bound to the vfio_ap driver will be available
+ for guest usage until they are unbound from the driver, the vfio_ap module
+ is unloaded, or the host system is shut down.
+
+4. Create the mediated devices needed to configure the AP matrixes for the
+ two guests and to provide an interface to the vfio_ap driver for
+ use by the guests:
+
+ /sys/devices/
+ --- [vfio_ap]
+ ------ [matrix] (this is the matrix device)
+ --------- [mdev_supported_types]
+ ------------ [vfio_ap-passthrough] (passthrough mediated matrix device type)
+ --------------- create
+ --------------- [devices]
+
+ To create the mediated devices for the two guests:
+
+ uuidgen > create
+ uuidgen > create
+
+ This will create two mediated devices in the [devices] subdirectory named
+ with the UUID written to the create attribute file. We call them $uuid1
+ and $uuid2:
+
+ /sys/devices/
+ --- [vfio_ap]
+ ------ [matrix]
+ --------- [mdev_supported_types]
+ ------------ [vfio_ap-passthrough]
+ --------------- [devices]
+ ------------------ [$uuid1]
+ --------------------- assign_adapter
+ --------------------- assign_control_domain
+ --------------------- assign_domain
+ --------------------- matrix
+ --------------------- unassign_adapter
+ --------------------- unassign_control_domain
+ --------------------- unassign_domain
+
+ ------------------ [$uuid2]
+ --------------------- assign_adapter
+ --------------------- assign_control_domain
+ --------------------- assign_domain
+ --------------------- matrix
+ --------------------- unassign_adapter
+ --------------------- unassign_control_domain
+ --------------------- unassign_domain
+
+5. The administrator now needs to configure the matrixes for mediated
+ devices $uuid1 (for Guest1) and $uuid2 (for Guest2).
+
+ This is how the matrix is configured for Guest1:
+
+ echo 5 > assign_adapter
+ echo 6 > assign_adapter
+ echo 4 > assign_domain
+ echo 0xab > assign_domain
+
+ For this implementation, all usage domains - i.e., domains assigned
+ via the assign_domain attribute file - will also be configured in the ADM
+ field of the KVM guest's CRYCB, so there is no need to assign control
+ domains here unless you want to assign control domains that are not
+ assigned as usage domains.
+
+ If a mistake is made configuring an adapter, domain or control domain,
+ you can use the unassign_xxx files to unassign the adapter, domain or
+ control domain.
+
+ To display the matrix configuration for Guest1:
+
+ cat matrix
+
+ This is how the matrix is configured for Guest2:
+
+ echo 5 > assign_adapter
+ echo 0x47 > assign_domain
+ echo 0xff > assign_domain
+
+6. Start Guest1:
+
+ /usr/bin/qemu-system-s390x ... -cpu xxx,ap=on \
+ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...
+
+7. Start Guest2:
+
+ /usr/bin/qemu-system-s390x ... -cpu xxx,ap=on \
+ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...
+
+When the guest is shut down, the mediated matrix device may be removed.
+
+Using our example again, to remove the mediated matrix device $uuid1:
+
+ /sys/devices/
+ --- [vfio_ap]
+ ------ [matrix]
+ --------- [mdev_supported_types]
+ ------------ [vfio_ap-passthrough]
+ --------------- [devices]
+ ------------------ [$uuid1]
+ --------------------- remove
+
+ echo 1 > remove
+
+ This will remove all of the mdev matrix device's sysfs structures. To
+ recreate and reconfigure the mdev matrix device, all of the steps starting
+ with step 4 will have to be performed again.
+
+ It is not necessary to remove an mdev matrix device, but one may want to
+ remove it if no guest will use it during the lifetime of the linux host. If
+ the mdev matrix device is removed, one may want to unbind the AP queues the
+ guest was using from the vfio_ap device driver and bind them back to the
+ default driver. Alternatively, the AP queues can be configured for another
+ mdev matrix (i.e., guest). In either case, one must take care to change the
+ secure key configured for the domain to which the queue is connected.
\ No newline at end of file
diff --git a/MAINTAINERS b/MAINTAINERS
index 6eb8d8a..b40243f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12227,6 +12227,7 @@ S: Supported
F: drivers/s390/crypto/vfio_ap_drv.c
F: drivers/s390/crypto/vfio_ap_private.h
F: drivers/s390/crypto/vfio_ap_ops.c
+F: Documentation/s390/vfio-ap.txt
S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
--
1.7.1
Introduces a new function to reset the crypto attributes for all
vcpus whether they are running or not. Each vcpu in KVM will
be removed from SIE prior to resetting the crypto attributes in its
SIE state description. After all vcpus have had their crypto attributes
reset the vcpus will be restored to SIE.
This function will be used in a later patch to set the ECA.28
bit in the SIE state description to enable interpretive execution of
AP instructions. It will also be incorporated into the
kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
key wrapping attributes could potentially get out of sync for running
vcpus.
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/kvm/kvm-s390.c | 19 +++++++++++++------
arch/s390/kvm/kvm-s390.h | 14 ++++++++++++++
2 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 64c9862..d0c3518 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -791,11 +791,21 @@ static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *att
static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu);
-static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
+void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm)
{
- struct kvm_vcpu *vcpu;
int i;
+ struct kvm_vcpu *vcpu;
+
+ kvm_s390_vcpu_block_all(kvm);
+
+ kvm_for_each_vcpu(i, vcpu, kvm)
+ kvm_s390_vcpu_crypto_setup(vcpu);
+ kvm_s390_vcpu_unblock_all(kvm);
+}
+
+static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
+{
if (!test_kvm_facility(kvm, 76))
return -EINVAL;
@@ -832,10 +842,7 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
return -ENXIO;
}
- kvm_for_each_vcpu(i, vcpu, kvm) {
- kvm_s390_vcpu_crypto_setup(vcpu);
- exit_sie(vcpu);
- }
+ kvm_s390_vcpu_crypto_reset_all(kvm);
mutex_unlock(&kvm->lock);
return 0;
}
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 1b5621f..76324b7 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -410,4 +410,18 @@ static inline int kvm_s390_use_sca_entries(void)
}
void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
struct mcck_volatile_info *mcck_info);
+
+/**
+ * kvm_s390_vcpu_crypto_reset_all
+ *
+ * Reset the crypto attributes for each vcpu. This can be done while the vcpus
+ * are running as each vcpu will be removed from SIE before resetting the crypto
+ * attributes and restored to SIE afterward.
+ *
+ * Note: The kvm->lock mutex must be locked prior to calling this function and
+ * unlocked after it returns.
+ *
+ * @kvm: the KVM guest
+ */
+void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
#endif
--
1.7.1
Registers a group notifier during the open of the mediated
matrix device to get information on KVM presence through the
VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
to the kvm structure is saved inside the mediated matrix
device. Once the VFIO AP device driver has access to KVM,
access to the APs can be configured for the guest.
Access to APs is configured when the file descriptor for the
mediated matrix device is opened by userspace. The items to be
configured are:
1. The ECA.28 bit in the SIE state description determines whether
AP instructions are interpreted by the hardware or intercepted.
The VFIO AP device driver relies on interpretive execution of
AP instructions, so the ECA.28 bit will be set.
2. Guest access to AP adapters, usage domains and control domains
is controlled by three bit masks referenced from the
Crypto Control Block (CRYCB) referenced from the guest's SIE state
description:
* The AP Mask (APM) controls access to the AP adapters. Each bit
in the APM represents an adapter number - from most significant
to least significant bit - from 0 to 255. The bits in the APM
are set according to the adapter numbers assigned to the mediated
matrix device via its 'assign_adapter' sysfs attribute file.
* The AP Queue Mask (AQM) controls access to the AP queues. Each bit
in the AQM represents an AP queue index - from most significant
to least significant bit - from 0 to 255. A queue index references
a specific domain and is synonymous with the domain number. The
bits in the AQM are set according to the domain numbers assigned
to the mediated matrix device via its 'assign_domain' sysfs
attribute file.
* The AP Domain Mask (ADM) controls access to the AP control domains.
Each bit in the ADM represents a control domain - from most
significant to least significant bit - from 0-255. The
bits in the ADM are set according to the domain numbers assigned
to the mediated matrix device via its 'assign_control_domain'
sysfs attribute file.
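
The most-significant-bit-first numbering shared by the three masks can be
illustrated with a small user-space sketch. This is not code from this patch:
the helper names ap_mask_set() and ap_mask_test() and the flat 32-byte array
layout are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define AP_MASK_BITS  256
#define AP_MASK_BYTES (AP_MASK_BITS / 8)

/*
 * Hypothetical helper: mark ID `id` (0-255) in a mask where ID 0 is the
 * most significant bit of byte 0. Returns false if id is out of range.
 */
static bool ap_mask_set(uint8_t mask[AP_MASK_BYTES], unsigned int id)
{
	if (id >= AP_MASK_BITS)
		return false;
	mask[id / 8] |= (uint8_t)(0x80u >> (id % 8));
	return true;
}

/* Hypothetical helper: report whether ID `id` is marked in the mask. */
static bool ap_mask_test(const uint8_t mask[AP_MASK_BYTES], unsigned int id)
{
	if (id >= AP_MASK_BITS)
		return false;
	return (mask[id / 8] & (0x80u >> (id % 8))) != 0;
}
```

Under this assumed layout, assigning adapter 10 (0x0a) sets the value 0x20 in
byte 1 of the mask, and ID 255 maps to the least significant bit of byte 31.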
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 50 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 2 +
2 files changed, 52 insertions(+), 0 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index bc2b05e..e3ff5ab 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
return 0;
}
+static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct ap_matrix_mdev *matrix_mdev;
+
+ if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
+ matrix_mdev = container_of(nb, struct ap_matrix_mdev,
+ group_notifier);
+ matrix_mdev->kvm = data;
+ }
+
+ return NOTIFY_OK;
+}
+
+static int vfio_ap_mdev_open(struct mdev_device *mdev)
+{
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ unsigned long events;
+ int ret;
+
+ matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier;
+ events = VFIO_GROUP_NOTIFY_SET_KVM;
+
+ ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
+ &events, &matrix_mdev->group_notifier);
+ if (ret)
+ return ret;
+
+ ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
+ if (ret)
+ return ret;
+
+ ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
+ matrix_mdev->matrix);
+
+ return ret;
+}
+
+static void vfio_ap_mdev_release(struct mdev_device *mdev)
+{
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+
+ kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
+ kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
+ vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
+ &matrix_mdev->group_notifier);
+}
+
static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
{
return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
@@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
.mdev_attr_groups = vfio_ap_mdev_attr_groups,
.create = vfio_ap_mdev_create,
.remove = vfio_ap_mdev_remove,
+ .open = vfio_ap_mdev_open,
+ .release = vfio_ap_mdev_release,
};
int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index f248faf..48e2806 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -31,6 +31,8 @@ struct ap_matrix {
struct ap_matrix_mdev {
struct kvm_ap_matrix *matrix;
+ struct notifier_block group_notifier;
+ struct kvm *kvm;
};
static inline struct ap_matrix *to_ap_matrix(struct device *dev)
--
1.7.1
Provides interfaces to assign AP adapters, usage domains
and control domains to a KVM guest.
A KVM guest is started by executing the Start Interpretive Execution (SIE)
instruction. The SIE state description is a control block that contains the
state information for a KVM guest and is supplied as input to the SIE
instruction. The SIE state description has a satellite structure called the
Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
identifying the adapters, queues (domains) and control domains assigned to
the KVM guest:
* The AP Adapter Mask (APM) field identifies the AP adapters assigned to
the KVM guest
* The AP Queue Mask (AQM) field identifies the AP queues assigned to
the KVM guest. Each AP queue is connected to a usage domain within
an AP adapter.
* The AP Domain Mask (ADM) field identifies the control domains
assigned to the KVM guest.
Each adapter, queue (usage domain) and control domain are identified by
a number from 0 to 255. The bits in each mask, from most significant to
least significant bit, correspond to the numbers 0-255. When a bit is
set, the corresponding adapter, queue (usage domain) or control domain
is assigned to the KVM guest.
This patch will set the bits in the APM, AQM and ADM fields of the
CRYCB referenced by the KVM guest's SIE state description. The process
used is:
1. Verify that the bits to be set do not exceed the maximum bit
number for the given mask.
2. Verify that the APQNs that can be derived from the intersection
of the bits set in the APM and AQM fields of the KVM guest's CRYCB
are not assigned to any other KVM guest running on the same linux
host.
3. Set the APM, AQM and ADM in the CRYCB according to the matrix
configured for the mediated matrix device via its sysfs
adapter, domain and control domain attribute files respectively.
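
Step 2 amounts to asking whether two matrices share an APQN. Because a guest
is given the cross product of its adapters and queue indexes, two matrices
share an APQN exactly when they have at least one adapter in common and at
least one queue index in common. The user-space sketch below shows that check;
struct demo_matrix and the demo_* functions are hypothetical names and are not
the interfaces added by this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MASK_BYTES 32	/* 256 bits, one per adapter or queue index */

/* Hypothetical stand-in for a guest's matrix (control domains omitted). */
struct demo_matrix {
	uint8_t apm[DEMO_MASK_BYTES];	/* adapters */
	uint8_t aqm[DEMO_MASK_BYTES];	/* queue indexes (usage domains) */
};

/*
 * Set bit `id`, numbering bits from the most significant bit of byte 0.
 * The caller must pass id < 256; no range check is done here.
 */
static void demo_set(uint8_t *mask, unsigned int id)
{
	mask[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* True if the two masks have at least one bit in common. */
static bool demo_masks_intersect(const uint8_t *a, const uint8_t *b)
{
	for (size_t i = 0; i < DEMO_MASK_BYTES; i++)
		if (a[i] & b[i])
			return true;
	return false;
}

/* True if some APQN (adapter, queue index) is valid in both matrices. */
static bool demo_matrices_share_apqn(const struct demo_matrix *m1,
				     const struct demo_matrix *m2)
{
	return demo_masks_intersect(m1->apm, m2->apm) &&
	       demo_masks_intersect(m1->aqm, m2->aqm);
}
```

For instance, a matrix with adapters 1,2 and domains 5,6 does not collide with
one holding adapters 1,2 and domain 7, but it does collide with one holding
adapter 1 and domains 6,7, since both contain APQN (1,6).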
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 79 ++++++++++
arch/s390/kvm/kvm-ap.c | 259 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_ops.c | 19 +++
drivers/s390/crypto/vfio_ap_private.h | 4 +
4 files changed, 361 insertions(+), 0 deletions(-)
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index a6c092e..a068244 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -12,6 +12,34 @@
#include <linux/types.h>
#include <linux/kvm_host.h>
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+#include <linux/bitops.h>
+
+#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
+
+/**
+ * The AP matrix is comprised of three bit masks identifying the adapters,
+ * queues (domains) and control domains that belong to an AP matrix. The bits in
+ * each mask, from most significant to least significant bit, correspond to IDs
+ * 0 to the maximum ID allowed for a given mask. When a bit is set, the
+ * corresponding ID belongs to the matrix.
+ *
+ * @apm_max: max number of bits in @apm
+ * @apm: identifies the AP adapters in the matrix
+ * @aqm_max: max number of bits in @aqm
+ * @aqm: identifies the AP queues (domains) in the matrix
+ * @adm_max: max number of bits in @adm
+ * @adm: identifies the AP control domains in the matrix
+ */
+struct kvm_ap_matrix {
+ int apm_max;
+ unsigned long *apm;
+ int aqm_max;
+ unsigned long *aqm;
+ int adm_max;
+ unsigned long *adm;
+};
/**
* kvm_ap_instructions_installed()
@@ -51,4 +79,55 @@
*/
int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
+/**
+ * kvm_ap_matrix_create
+ *
+ * Create an AP matrix to hold a configuration of AP adapters, domains and
+ * control domains.
+ *
+ * @ap_matrix: holds the matrix that is created
+ *
+ * Returns 0 if the matrix is successfully created. Returns -ENOMEM if storage
+ * for the matrix cannot be allocated, or the error returned when querying the
+ * AP configuration fails.
+ */
+int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix);
+
+/**
+ * kvm_ap_matrix_destroy
+ *
+ * Destroy an AP matrix by de-allocating all storage allocated by the
+ * kvm_ap_matrix_create function.
+ *
+ * @ap_matrix: the matrix to destroy
+ */
+void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix);
+
+/**
+ * kvm_ap_configure_matrix
+ *
+ * Configure the AP matrix for a KVM guest.
+ *
+ * @kvm: the KVM guest
+ * @matrix: the matrix configuration information
+ *
+ * Returns 0 if the APQNs derived from the cross product of the set of
+ * adapter IDs (APM) and queue indexes (AQM) in @matrix are not configured
+ * for any other KVM guest running on the same Linux host. In that case,
+ * the APM, AQM and ADM of the guest's CRYCB are set from @matrix.
+ *
+ * Otherwise, returns -EBUSY.
+ */
+int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix);
+
+/**
+ * kvm_ap_deconfigure_matrix
+ *
+ * Deconfigure the AP matrix for a KVM guest. Clears all of the bits in the
+ * APM, AQM and ADM in the guest's CRYCB.
+ *
+ * @kvm: the KVM guest
+ */
+void kvm_ap_deconfigure_matrix(struct kvm *kvm);
+
#endif /* _ASM_KVM_AP */
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
index 55d11b5..f7226d8 100644
--- a/arch/s390/kvm/kvm-ap.c
+++ b/arch/s390/kvm/kvm-ap.c
@@ -9,6 +9,7 @@
#include <linux/kernel.h>
#include <asm/kvm-ap.h>
#include <asm/ap.h>
+#include <linux/bitops.h>
#include "kvm-s390.h"
@@ -78,3 +79,261 @@ int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
return ret;
}
EXPORT_SYMBOL(kvm_ap_interpret_instructions);
+
+static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
+{
+ memset(&kvm->arch.crypto.crycb->apcb0, 0,
+ sizeof(kvm->arch.crypto.crycb->apcb0));
+ memset(&kvm->arch.crypto.crycb->apcb1, 0,
+ sizeof(kvm->arch.crypto.crycb->apcb1));
+}
+
+static inline unsigned long *kvm_ap_get_crycb_apm(struct kvm *kvm)
+{
+ unsigned long *apm;
+
+ switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
+ case CRYCB_FORMAT1:
+ apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
+ break;
+ case CRYCB_FORMAT2:
+ apm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.apm;
+ break;
+ default:
+ apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
+ break;
+ }
+
+ return apm;
+}
+
+static inline unsigned long *kvm_ap_get_crycb_aqm(struct kvm *kvm)
+{
+ unsigned long *aqm;
+
+ switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
+ case CRYCB_FORMAT1:
+ aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
+ break;
+ case CRYCB_FORMAT2:
+ aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.aqm;
+ break;
+ default:
+ aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
+ break;
+ }
+
+ return aqm;
+}
+
+static inline unsigned long *kvm_ap_get_crycb_adm(struct kvm *kvm)
+{
+ unsigned long *adm;
+
+ switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
+ case CRYCB_FORMAT1:
+ adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
+ break;
+ case CRYCB_FORMAT2:
+ adm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.adm;
+ break;
+ default:
+ adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
+ break;
+ }
+
+ return adm;
+}
+
+static void kvm_ap_set_crycb_masks(struct kvm *kvm,
+ struct kvm_ap_matrix *matrix)
+{
+ unsigned long *apm = kvm_ap_get_crycb_apm(kvm);
+ unsigned long *aqm = kvm_ap_get_crycb_aqm(kvm);
+ unsigned long *adm = kvm_ap_get_crycb_adm(kvm);
+
+ kvm_ap_clear_crycb_masks(kvm);
+ memcpy(apm, matrix->apm, KVM_AP_MASK_BYTES(matrix->apm_max));
+ memcpy(aqm, matrix->aqm, KVM_AP_MASK_BYTES(matrix->aqm_max));
+
+ /*
+ * Merge the AQM and ADM since the ADM is a superset of the
+ * AQM by agreed-upon convention.
+ */
+ bitmap_or(adm, adm, aqm, matrix->adm_max);
+}
+
+static void kvm_ap_log_sharing_err(struct kvm *kvm, unsigned long apid,
+ unsigned long apqi)
+{
+ pr_err("%s: AP queue %02lx.%04lx is registered to guest %s", __func__,
+ apid, apqi, kvm->arch.dbf->name);
+}
+
+/**
+ * kvm_ap_validate_queue_sharing
+ *
+ * Verifies that the APQNs derived from the cross product of the AP adapter IDs
+ * and AP queue indexes comprising the AP matrix are not configured for
+ * another guest. AP queue sharing is not allowed.
+ *
+ * @kvm: the KVM guest
+ * @matrix: the AP matrix
+ *
+ * Returns 0 if the APQNs are valid; otherwise, returns -EBUSY.
+ */
+static int kvm_ap_validate_queue_sharing(struct kvm *kvm,
+ struct kvm_ap_matrix *matrix)
+{
+ struct kvm *vm;
+ unsigned long *apm, *aqm;
+ unsigned long apid, apqi;
+
+
+ /* No other VM may share an AP Queue with the input VM */
+ list_for_each_entry(vm, &vm_list, vm_list) {
+ if (kvm == vm)
+ continue;
+
+ apm = kvm_ap_get_crycb_apm(vm);
+ if (!bitmap_and(apm, apm, matrix->apm, matrix->apm_max))
+ continue;
+
+ aqm = kvm_ap_get_crycb_aqm(vm);
+ if (!bitmap_and(aqm, aqm, matrix->aqm, matrix->aqm_max))
+ continue;
+
+ for_each_set_bit_inv(apid, apm, matrix->apm_max)
+ for_each_set_bit_inv(apqi, aqm, matrix->aqm_max)
+ kvm_ap_log_sharing_err(kvm, apid, apqi);
+
+ return -EBUSY;
+ }
+
+ return 0;
+}
+
+static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
+ struct ap_config_info *config)
+{
+ int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
+
+ ap_matrix->apm = kzalloc(KVM_AP_MASK_BYTES(apm_max), GFP_KERNEL);
+ if (!ap_matrix->apm)
+ return -ENOMEM;
+
+ ap_matrix->apm_max = apm_max;
+
+ return 0;
+}
+
+static int kvm_ap_matrix_aqm_create(struct kvm_ap_matrix *ap_matrix,
+ struct ap_config_info *config)
+{
+ int aqm_max = (config && config->apxa) ? config->Nd + 1 : 16;
+
+ ap_matrix->aqm = kzalloc(KVM_AP_MASK_BYTES(aqm_max), GFP_KERNEL);
+ if (!ap_matrix->aqm)
+ return -ENOMEM;
+
+ ap_matrix->aqm_max = aqm_max;
+
+ return 0;
+}
+
+static int kvm_ap_matrix_adm_create(struct kvm_ap_matrix *ap_matrix,
+ struct ap_config_info *config)
+{
+ int adm_max = (config && config->apxa) ? config->Nd + 1 : 16;
+
+ ap_matrix->adm = kzalloc(KVM_AP_MASK_BYTES(adm_max), GFP_KERNEL);
+ if (!ap_matrix->adm)
+ return -ENOMEM;
+
+ ap_matrix->adm_max = adm_max;
+
+ return 0;
+}
+
+static void kvm_ap_matrix_masks_destroy(struct kvm_ap_matrix *ap_matrix)
+{
+ kfree(ap_matrix->apm);
+ kfree(ap_matrix->aqm);
+ kfree(ap_matrix->adm);
+}
+
+int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix)
+{
+ int ret;
+ struct kvm_ap_matrix *matrix;
+ struct ap_config_info config;
+ struct ap_config_info *config_info = NULL;
+
+ memset(&config, 0, sizeof(config));
+
+ ret = ap_query_configuration(&config);
+ if (ret) {
+ if (ret != -EOPNOTSUPP)
+ return ret;
+ } else {
+ config_info = &config;
+ }
+
+ matrix = kzalloc(sizeof(*matrix), GFP_KERNEL);
+ if (!matrix)
+ return -ENOMEM;
+
+ ret = kvm_ap_matrix_apm_create(matrix, config_info);
+ if (ret)
+ goto mask_create_err;
+
+ ret = kvm_ap_matrix_aqm_create(matrix, config_info);
+ if (ret)
+ goto mask_create_err;
+
+ ret = kvm_ap_matrix_adm_create(matrix, config_info);
+ if (ret)
+ goto mask_create_err;
+
+ *ap_matrix = matrix;
+
+ return 0;
+
+mask_create_err:
+ kvm_ap_matrix_masks_destroy(matrix);
+ kfree(matrix);
+ return ret;
+}
+EXPORT_SYMBOL(kvm_ap_matrix_create);
+
+void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix)
+{
+ kvm_ap_matrix_masks_destroy(ap_matrix);
+ kfree(ap_matrix);
+}
+EXPORT_SYMBOL(kvm_ap_matrix_destroy);
+
+int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix)
+{
+ int ret = 0;
+
+ mutex_lock(&kvm->lock);
+
+ ret = kvm_ap_validate_queue_sharing(kvm, matrix);
+ if (ret)
+ goto done;
+
+ kvm_ap_set_crycb_masks(kvm, matrix);
+
+done:
+ mutex_unlock(&kvm->lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(kvm_ap_configure_matrix);
+
+void kvm_ap_deconfigure_matrix(struct kvm *kvm)
+{
+ kvm_ap_clear_crycb_masks(kvm);
+}
+EXPORT_SYMBOL(kvm_ap_deconfigure_matrix);
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index d41b0b8..647ea24 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -10,6 +10,7 @@
#include <linux/device.h>
#include <linux/list.h>
#include <linux/ctype.h>
+#include <asm/kvm-ap.h>
#include "vfio_ap_private.h"
@@ -18,8 +19,23 @@
static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
{
+ int ret;
+ struct ap_matrix_mdev *matrix_mdev;
struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+ struct kvm_ap_matrix *matrix;
+
+ ret = kvm_ap_matrix_create(&matrix);
+ if (ret)
+ return ret;
+
+ matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
+ if (!matrix_mdev) {
+ kvm_ap_matrix_destroy(matrix);
+ return -ENOMEM;
+ }
+ matrix_mdev->matrix = matrix;
+ mdev_set_drvdata(mdev, matrix_mdev);
ap_matrix->available_instances--;
return 0;
@@ -28,7 +44,10 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
static int vfio_ap_mdev_remove(struct mdev_device *mdev)
{
struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ kvm_ap_matrix_destroy(matrix_mdev->matrix);
+ kfree(matrix_mdev);
ap_matrix->available_instances++;
return 0;
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index c47aeec..f248faf 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -29,6 +29,10 @@ struct ap_matrix {
int available_instances;
};
+struct ap_matrix_mdev {
+ struct kvm_ap_matrix *matrix;
+};
+
static inline struct ap_matrix *to_ap_matrix(struct device *dev)
{
return container_of(dev, struct ap_matrix, device);
--
1.7.1
Introduces a new AP device driver. This device driver
is built on the VFIO mediated device framework. The framework
provides sysfs interfaces that facilitate passthrough
access by guests to devices installed on the Linux host.
The VFIO AP device driver will serve two purposes:
1. Provide the interfaces to reserve AP devices for exclusive
use by KVM guests. This is accomplished by unbinding the
devices to be reserved for guest usage from the default AP
device driver and binding them to the VFIO AP device driver.
2. Implement the functions, callbacks and sysfs attribute
interfaces required to create one or more VFIO mediated
devices each of which will be used to configure the AP
matrix for a guest and serve as a file descriptor
for facilitating communication between QEMU and the
VFIO AP device driver.
When the VFIO AP device driver is initialized:
* It registers with the AP bus for control of type 10 (CEX4
and newer) AP queue devices. The probe and remove callbacks
will be provided to support the binding/unbinding of
AP queue devices to/from the VFIO AP device driver.
* It creates a /sys/devices/vfio_ap/matrix device to hold
the APQNs of the AP devices bound to the VFIO
AP device driver and to serve as the parent of the
mediated devices created for each guest.
Signed-off-by: Tony Krowiak <[email protected]>
---
MAINTAINERS | 10 +++
arch/s390/Kconfig | 11 +++
drivers/s390/crypto/Makefile | 4 +
drivers/s390/crypto/vfio_ap_drv.c | 134 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 23 ++++++
include/uapi/linux/vfio.h | 2 +
6 files changed, 184 insertions(+), 0 deletions(-)
create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
create mode 100644 drivers/s390/crypto/vfio_ap_private.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 0a1410d..1124c4c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12217,6 +12217,16 @@ W: http://www.ibm.com/developerworks/linux/linux390/
S: Supported
F: drivers/s390/crypto/
+S390 VFIO AP DRIVER
+M: Tony Krowiak <[email protected]>
+M: Christian Borntraeger <[email protected]>
+M: Martin Schwidefsky <[email protected]>
+L: [email protected]
+W: http://www.ibm.com/developerworks/linux/linux390/
+S: Supported
+F: drivers/s390/crypto/vfio_ap_drv.c
+F: drivers/s390/crypto/vfio_ap_private.h
+
S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
M: Benjamin Block <[email protected]>
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 32a0d5b..9b7c87e 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -770,6 +770,17 @@ config VFIO_CCW
To compile this driver as a module, choose M here: the
module will be called vfio_ccw.
+config VFIO_AP
+ def_tristate n
+ prompt "VFIO support for AP devices"
+ depends on ZCRYPT && VFIO_MDEV_DEVICE
+ help
+ This driver grants access to Adjunct Processor (AP) devices
+ via the VFIO mediated device interface.
+
+ To compile this driver as a module, choose M here: the module
+ will be called vfio_ap.
+
endmenu
menu "Dump support"
diff --git a/drivers/s390/crypto/Makefile b/drivers/s390/crypto/Makefile
index b59af54..48e466e 100644
--- a/drivers/s390/crypto/Makefile
+++ b/drivers/s390/crypto/Makefile
@@ -15,3 +15,7 @@ obj-$(CONFIG_ZCRYPT) += zcrypt_pcixcc.o zcrypt_cex2a.o zcrypt_cex4.o
# pkey kernel module
pkey-objs := pkey_api.o
obj-$(CONFIG_PKEY) += pkey.o
+
+# adjunct processor matrix
+vfio_ap-objs := vfio_ap_drv.o
+obj-$(CONFIG_VFIO_AP) += vfio_ap.o
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
new file mode 100644
index 0000000..014d70f
--- /dev/null
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * VFIO based AP device driver
+ *
+ * Copyright IBM Corp. 2018
+ *
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/slab.h>
+
+#include "vfio_ap_private.h"
+
+#define VFIO_AP_ROOT_NAME "vfio_ap"
+#define VFIO_AP_DEV_TYPE_NAME "ap_matrix"
+#define VFIO_AP_DEV_NAME "matrix"
+
+MODULE_AUTHOR("IBM Corporation");
+MODULE_DESCRIPTION("VFIO AP device driver, Copyright IBM Corp. 2017");
+MODULE_LICENSE("GPL v2");
+
+static struct device *vfio_ap_root_device;
+
+static struct ap_driver vfio_ap_drv;
+
+static struct ap_matrix *ap_matrix;
+
+static struct device_type vfio_ap_dev_type = {
+ .name = VFIO_AP_DEV_TYPE_NAME,
+};
+
+/* Only type 10 adapters (CEX4 and later) are supported
+ * by the AP matrix device driver
+ */
+static struct ap_device_id ap_queue_ids[] = {
+ { .dev_type = AP_DEVICE_TYPE_CEX4,
+ .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
+ { .dev_type = AP_DEVICE_TYPE_CEX5,
+ .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
+ { .dev_type = AP_DEVICE_TYPE_CEX6,
+ .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
+ { /* end of sibling */ },
+};
+
+MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
+
+static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
+{
+ return 0;
+}
+
+static void vfio_ap_matrix_dev_release(struct device *dev)
+{
+ struct ap_matrix *ap_matrix = dev_get_drvdata(dev);
+
+ kfree(ap_matrix);
+}
+
+static int vfio_ap_matrix_dev_create(void)
+{
+ int ret;
+
+ vfio_ap_root_device = root_device_register(VFIO_AP_ROOT_NAME);
+
+ if (IS_ERR(vfio_ap_root_device)) {
+ ret = PTR_ERR(vfio_ap_root_device);
+ goto done;
+ }
+
+ ap_matrix = kzalloc(sizeof(*ap_matrix), GFP_KERNEL);
+ if (!ap_matrix) {
+ ret = -ENOMEM;
+ goto matrix_alloc_err;
+ }
+
+ ap_matrix->device.type = &vfio_ap_dev_type;
+ dev_set_name(&ap_matrix->device, "%s", VFIO_AP_DEV_NAME);
+ ap_matrix->device.parent = vfio_ap_root_device;
+ ap_matrix->device.release = vfio_ap_matrix_dev_release;
+ ap_matrix->device.driver = &vfio_ap_drv.driver;
+
+ ret = device_register(&ap_matrix->device);
+ if (ret)
+ goto matrix_reg_err;
+
+ goto done;
+
+matrix_reg_err:
+ put_device(&ap_matrix->device);
+
+matrix_alloc_err:
+ root_device_unregister(vfio_ap_root_device);
+
+done:
+ return ret;
+}
+
+static void vfio_ap_matrix_dev_destroy(struct ap_matrix *ap_matrix)
+{
+ device_unregister(&ap_matrix->device);
+ root_device_unregister(vfio_ap_root_device);
+}
+
+int __init vfio_ap_init(void)
+{
+ int ret;
+
+ ret = vfio_ap_matrix_dev_create();
+ if (ret)
+ return ret;
+
+ memset(&vfio_ap_drv, 0, sizeof(vfio_ap_drv));
+ vfio_ap_drv.probe = vfio_ap_queue_dev_probe;
+ vfio_ap_drv.ids = ap_queue_ids;
+
+ ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME);
+ if (ret) {
+ vfio_ap_matrix_dev_destroy(ap_matrix);
+ return ret;
+ }
+
+ return 0;
+}
+
+void __exit vfio_ap_exit(void)
+{
+ ap_driver_unregister(&vfio_ap_drv);
+ vfio_ap_matrix_dev_destroy(ap_matrix);
+}
+
+module_init(vfio_ap_init);
+module_exit(vfio_ap_exit);
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
new file mode 100644
index 0000000..15ed458
--- /dev/null
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Private data and functions for adjunct processor VFIO matrix driver.
+ *
+ * Copyright IBM Corp. 2018
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#ifndef _VFIO_AP_PRIVATE_H_
+#define _VFIO_AP_PRIVATE_H_
+
+#include <linux/types.h>
+
+#include "ap_bus.h"
+
+#define VFIO_AP_MODULE_NAME "vfio_ap"
+#define VFIO_AP_DRV_NAME "vfio_ap"
+
+struct ap_matrix {
+ struct device device;
+};
+
+#endif /* _VFIO_AP_PRIVATE_H_ */
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 1aa7b82..f378b98 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -200,6 +200,7 @@ struct vfio_device_info {
#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2) /* vfio-platform device */
#define VFIO_DEVICE_FLAGS_AMBA (1 << 3) /* vfio-amba device */
#define VFIO_DEVICE_FLAGS_CCW (1 << 4) /* vfio-ccw device */
+#define VFIO_DEVICE_FLAGS_AP (1 << 5) /* vfio-ap device */
__u32 num_regions; /* Max region index + 1 */
__u32 num_irqs; /* Max IRQ index + 1 */
};
@@ -215,6 +216,7 @@ struct vfio_device_info {
#define VFIO_DEVICE_API_PLATFORM_STRING "vfio-platform"
#define VFIO_DEVICE_API_AMBA_STRING "vfio-amba"
#define VFIO_DEVICE_API_CCW_STRING "vfio-ccw"
+#define VFIO_DEVICE_API_AP_STRING "vfio-ap"
/**
* VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
--
1.7.1
The VFIO AP device model exploits interpretive execution of AP
instructions (APIE) to provide guests passthrough access to AP
devices. This patch introduces a new interface to enable and
disable APIE.
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
arch/s390/kvm/kvm-s390.c | 9 +++++++++
4 files changed, 46 insertions(+), 0 deletions(-)
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 736e93e..a6c092e 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -35,4 +35,20 @@
*/
void kvm_ap_build_crycbd(struct kvm *kvm);
+/**
+ * kvm_ap_interpret_instructions
+ *
+ * Indicate whether AP instructions shall be interpreted. If they are not
+ * interpreted, all AP instructions will be intercepted and routed back to
+ * userspace.
+ *
+ * @kvm: the virtual machine attributes
+ * @enable: indicates whether AP instructions are to be interpreted (true) or
+ * not (false).
+ *
+ * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
+ * indicating that AP instructions are not installed on the guest.
+ */
+int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
+
#endif /* _ASM_KVM_AP */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3162783..5470685 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -715,6 +715,7 @@ struct kvm_s390_crypto {
__u32 crycbd;
__u8 aes_kw;
__u8 dea_kw;
+ __u8 apie;
};
#define APCB0_MASK_SIZE 1
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
index 991bae4..55d11b5 100644
--- a/arch/s390/kvm/kvm-ap.c
+++ b/arch/s390/kvm/kvm-ap.c
@@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
}
}
EXPORT_SYMBOL(kvm_ap_build_crycbd);
+
+int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
+{
+ int ret = 0;
+
+ mutex_lock(&kvm->lock);
+
+ if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
+ ret = -EOPNOTSUPP;
+ goto done;
+ }
+
+ kvm->arch.crypto.apie = enable;
+ kvm_s390_vcpu_crypto_reset_all(kvm);
+
+done:
+ mutex_unlock(&kvm->lock);
+ return ret;
+}
+EXPORT_SYMBOL(kvm_ap_interpret_instructions);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 55cd897..1dc8566 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
kvm_ap_build_crycbd(kvm);
+ /* Default setting indicating SIE shall interpret AP instructions */
+ kvm->arch.crypto.apie = 1;
+
if (!test_kvm_facility(kvm, 76))
return;
@@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
{
vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
+ vcpu->arch.sie_block->eca &= ~ECA_APIE;
+ if (vcpu->kvm->arch.crypto.apie &&
+ test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
+ vcpu->arch.sie_block->eca |= ECA_APIE;
+
+
if (!test_kvm_facility(vcpu->kvm, 76))
return;
--
1.7.1
This patch refactors the code that initializes the crypto
configuration for a guest. The crypto configuration is contained in
a crypto control block (CRYCB) which is a satellite control block to
our main hardware virtualization control block. The CRYCB is
attached to the main virtualization control block via a CRYCB
designation (CRYCBD) field containing the address of
the CRYCB as well as its format.
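
As a rough illustration of how one 32-bit field can carry both the address and the format, consider the sketch below. The format constants are the ones defined in kvm_host.h by this series; the helper names `crycbd_build`, `crycbd_format` and `crycbd_addr` are hypothetical and exist only in this sketch. It assumes the CRYCB address has its two low-order bits clear, so that they are free to hold the format.

```c
#include <assert.h>
#include <stdint.h>

#define CRYCB_FORMAT1     0x00000001u
#define CRYCB_FORMAT2     0x00000003u
#define CRYCB_FORMAT_MASK 0x00000003u

/* Hypothetical helper: combine a CRYCB address and a format. */
static uint32_t crycbd_build(uint32_t crycb_addr, uint32_t format)
{
	/* The format must fit entirely within the format mask. */
	assert((format & ~CRYCB_FORMAT_MASK) == 0);
	return (crycb_addr & ~CRYCB_FORMAT_MASK) | format;
}

/* Hypothetical helper: extract the format bits from a CRYCBD. */
static uint32_t crycbd_format(uint32_t crycbd)
{
	return crycbd & CRYCB_FORMAT_MASK;
}

/* Hypothetical helper: extract the CRYCB address from a CRYCBD. */
static uint32_t crycbd_addr(uint32_t crycbd)
{
	return crycbd & ~CRYCB_FORMAT_MASK;
}
```

Clearing the format bits before OR-ing in the new format means that rebuilding the field cannot leave stale format bits behind.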
Prior to the introduction of AP device virtualization, there was
no need to provide access to or specify the format of the CRYCB for
a guest unless the MSA extension 3 (MSAX3) facility was installed
on the host system. With the introduction of AP device virtualization,
the CRYCB and its format must be made accessible to the guest
regardless of the presence of the MSAX3 facility.
The crypto initialization code is restructured as follows:
* A new compilation unit is introduced to contain all interfaces
and data structures related to configuring a guest's CRYCB for
both the refactoring of crypto initialization as well as all
subsequent patches introducing AP virtualization support.
* Currently, the asm code for querying the AP configuration is
duplicated in the AP bus as well as in KVM. Since the KVM
code was introduced, the AP bus has externalized the interface
for querying the AP configuration. The KVM interface will be
replaced with a call to the AP bus interface. Of course, this
will be moved to the new compilation unit mentioned above.
* An interface to format the CRYCBD field will be provided via
the new compilation unit and called from the KVM vm
initialization.
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 15 +++++++++
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/kvm/kvm-ap.c | 39 ++++++++++++++++++++++++
arch/s390/kvm/kvm-s390.c | 60 ++++----------------------------------
4 files changed, 61 insertions(+), 54 deletions(-)
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 84412a9..736e93e 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -10,6 +10,9 @@
#ifndef _ASM_KVM_AP
#define _ASM_KVM_AP
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+
/**
* kvm_ap_instructions_installed()
*
@@ -20,4 +23,16 @@
*/
int kvm_ap_instructions_installed(void);
+/**
+ * kvm_ap_build_crycbd
+ *
+ * The crypto control block designation (CRYCBD) is a 32-bit field that
+ * designates both the host real address and format of the CRYCB. This function
+ * builds the CRYCBD field for use by the KVM guest.
+ *
+ * @kvm: the KVM guest
+ * @crycbd: reference to the CRYCBD
+ */
+void kvm_ap_build_crycbd(struct kvm *kvm);
+
#endif /* _ASM_KVM_AP */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 81cdb6b..c990a1d 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
__u8 reservedf0[12]; /* 0x00f0 */
#define CRYCB_FORMAT1 0x00000001
#define CRYCB_FORMAT2 0x00000003
+#define CRYCB_FORMAT_MASK 0x00000003
__u32 crycbd; /* 0x00fc */
__u64 gcr[16]; /* 0x0100 */
__u64 gbea; /* 0x0180 */
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
index 1267588..991bae4 100644
--- a/arch/s390/kvm/kvm-ap.c
+++ b/arch/s390/kvm/kvm-ap.c
@@ -10,6 +10,8 @@
#include <asm/kvm-ap.h>
#include <asm/ap.h>
+#include "kvm-s390.h"
+
int kvm_ap_instructions_installed(void)
{
#ifdef CONFIG_ZCRYPT
@@ -19,3 +21,40 @@ int kvm_ap_instructions_installed(void)
#endif
}
EXPORT_SYMBOL(kvm_ap_instructions_installed);
+
+static inline int kvm_ap_query_config(struct ap_config_info *config)
+{
+ memset(config, 0, sizeof(*config));
+
+#ifdef CONFIG_ZCRYPT
+ if (kvm_ap_instructions_installed())
+ return ap_query_configuration(config);
+#endif
+
+ return -EOPNOTSUPP;
+}
+
+static int kvm_ap_apxa_installed(void)
+{
+ struct ap_config_info config;
+
+ if (kvm_ap_query_config(&config) == 0)
+ return (config.apxa == 1);
+
+ return 0;
+}
+
+void kvm_ap_build_crycbd(struct kvm *kvm)
+{
+ kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
+ kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
+
+ /* check whether MSAX3 is installed */
+ if (kvm_ap_instructions_installed() && test_kvm_facility(kvm, 76)) {
+ if (kvm_ap_apxa_installed())
+ kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
+ else
+ kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
+ }
+}
+EXPORT_SYMBOL(kvm_ap_build_crycbd);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d0c3518..b47ff11 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -40,6 +40,7 @@
#include <asm/sclp.h>
#include <asm/cpacf.h>
#include <asm/timex.h>
+#include <asm/kvm-ap.h>
#include "kvm-s390.h"
#include "gaccess.h"
@@ -1881,55 +1882,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
return r;
}
-static int kvm_s390_query_ap_config(u8 *config)
-{
- u32 fcn_code = 0x04000000UL;
- u32 cc = 0;
-
- memset(config, 0, 128);
- asm volatile(
- "lgr 0,%1\n"
- "lgr 2,%2\n"
- ".long 0xb2af0000\n" /* PQAP(QCI) */
- "0: ipm %0\n"
- "srl %0,28\n"
- "1:\n"
- EX_TABLE(0b, 1b)
- : "+r" (cc)
- : "r" (fcn_code), "r" (config)
- : "cc", "0", "2", "memory"
- );
-
- return cc;
-}
-
-static int kvm_s390_apxa_installed(void)
-{
- u8 config[128];
- int cc;
-
- if (test_facility(12)) {
- cc = kvm_s390_query_ap_config(config);
-
- if (cc)
- pr_err("PQAP(QCI) failed with cc=%d", cc);
- else
- return config[0] & 0x40;
- }
-
- return 0;
-}
-
-static void kvm_s390_set_crycb_format(struct kvm *kvm)
-{
- kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
-
- if (kvm_s390_apxa_installed())
- kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
- else
- kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
-}
-
static u64 kvm_s390_get_initial_cpuid(void)
{
struct cpuid cpuid;
@@ -1941,12 +1893,12 @@ static u64 kvm_s390_get_initial_cpuid(void)
static void kvm_s390_crypto_init(struct kvm *kvm)
{
+ kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
+ kvm_ap_build_crycbd(kvm);
+
if (!test_kvm_facility(kvm, 76))
return;
- kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
- kvm_s390_set_crycb_format(kvm);
-
/* Enable AES/DEA protected key functions by default */
kvm->arch.crypto.aes_kw = 1;
kvm->arch.crypto.dea_kw = 1;
@@ -2475,6 +2427,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
{
+ vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
+
if (!test_kvm_facility(vcpu->kvm, 76))
return;
@@ -2484,8 +2438,6 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->ecb3 |= ECB3_AES;
if (vcpu->kvm->arch.crypto.dea_kw)
vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
-
- vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
}
void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
--
1.7.1
Provides the sysfs interfaces for assigning AP adapters to
and unassigning AP adapters from a mediated matrix device.
The IDs of the AP adapters assigned to the mediated matrix
device are stored in an AP mask (APM). The bits in the APM,
from most significant to least significant bit, correspond to
AP adapter numbers 0 to 255. When an adapter is assigned, the
bit corresponding to the adapter ID will be set in the APM. Likewise,
when an adapter is unassigned, the bit corresponding to the
adapter ID will be cleared from the APM.
The relevant sysfs structures are:
/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. assign_adapter
.................. unassign_adapter
To assign an adapter to the $uuid mediated matrix device's APM,
write the adapter ID (APID) to the assign_adapter file. To
unassign an adapter, write the APID to the unassign_adapter
file. The APID is specified using conventional semantics: If
it begins with 0x the number will be parsed as a hexadecimal
(case insensitive) number; otherwise, it will be parsed as a
decimal number.
For example, to assign adapter 173 (0xad) to the mediated matrix
device $uuid:
echo 173 > assign_adapter
or
echo 0xad > assign_adapter
To unassign adapter 173 (0xad):
echo 173 > unassign_adapter
or
echo 0xad > unassign_adapter
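
The parsing rule described above (a leading 0x selects hexadecimal, anything else is decimal) can be sketched in user-space C as follows. `parse_apid` is a hypothetical helper written for this illustration; it is not the function the driver uses, and the choice of -EINVAL for malformed input is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an APID written to a sysfs attribute.
 * Returns 0 and stores the value in @apid on success, -EINVAL otherwise.
 */
static int parse_apid(const char *s, unsigned long *apid)
{
	char *end;
	int base = 10;

	assert(s && apid);

	/* A leading "0x" or "0X" selects hexadecimal. */
	if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
		base = 16;

	/* Reject signs, whitespace and bare hex digits up front. */
	if (!isdigit((unsigned char)s[0]))
		return -EINVAL;

	errno = 0;
	*apid = strtoul(s, &end, base);
	if (errno || end == s)
		return -EINVAL;

	/* echo appends a newline; tolerate exactly one. */
	if (*end == '\n')
		end++;

	return *end == '\0' ? 0 : -EINVAL;
}
```

Both `echo 173` and `echo 0xad` yield the value 173 with this helper; a bare `ad` is rejected because it lacks the 0x prefix.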
The assignment will be rejected:
* If the APID exceeds the maximum value for an AP adapter:
  * If the AP Extended Addressing (APXA) facility is
    installed, the max value is 255
  * Else the max value is 64
* If no AP domains have yet been assigned and there are
no AP queues bound to the VFIO AP driver that have an APQN
with an APID matching that of the AP adapter being assigned.
* If any of the APQNs that can be derived from the cross product
of the APID being assigned and the AP queue index (APQI) of
each of the AP domains previously assigned cannot be matched
with an APQN of an AP queue device reserved by the VFIO AP
driver.
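
The last two rejection rules can be modelled along these lines. This is a sketch under assumptions: `mk_apqn` simply places the APID in the high byte and the APQI in the low byte of a 16-bit value, the list of reserved queues is a plain array, and -ENXIO is used for both rules, whereas the driver walks the devices bound to it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_BYTES 32   /* 256 domain bits, most significant bit first */

/* Assumed APQN layout for this sketch: APID in the high byte. */
static uint16_t mk_apqn(unsigned int apid, unsigned int apqi)
{
	return (uint16_t)((apid << 8) | apqi);
}

/* Returns 1 if @apqn is in the list of queues reserved for the driver. */
static int apqn_reserved(const uint16_t *reserved, size_t n, uint16_t apqn)
{
	for (size_t i = 0; i < n; i++)
		if (reserved[i] == apqn)
			return 1;
	return 0;
}

/* Hypothetical check run before @apid is added to the APM. */
static int validate_apid(unsigned int apid, const uint8_t aqm[MASK_BYTES],
			 const uint16_t *reserved, size_t n)
{
	int have_domains = 0;

	assert(apid < 256);

	/* Every (apid, apqi) pair for an assigned domain must be reserved. */
	for (unsigned int apqi = 0; apqi < MASK_BYTES * 8; apqi++) {
		if (!(aqm[apqi / 8] & (0x80u >> (apqi % 8))))
			continue;
		have_domains = 1;
		if (!apqn_reserved(reserved, n, mk_apqn(apid, apqi)))
			return -ENXIO;
	}
	if (have_domains)
		return 0;

	/* No domains assigned yet: any reserved queue on this adapter will do. */
	for (size_t i = 0; i < n; i++)
		if ((unsigned int)(reserved[i] >> 8) == apid)
			return 0;

	return -ENXIO;
}
```

With queues (4,1) and (4,2) reserved, adapter 4 is accepted while domains 1 and 2 are assigned, and rejected once domain 3 is assigned as well, because queue (4,3) is not reserved.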
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 299 +++++++++++++++++++++++++++++++++++++
2 files changed, 300 insertions(+), 0 deletions(-)
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index a068244..5ebb171 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -17,6 +17,7 @@
#include <linux/bitops.h>
#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
+#define KVM_AP_MAX_APM_INDEX(matrix) (matrix->apm_max - 1)
/**
* The AP matrix is comprised of three bit masks identifying the adapters,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 647ea24..6d32adb 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -97,9 +97,308 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
NULL,
};
+struct vfio_apid_reserved {
+ unsigned long apid;
+ int reserved;
+};
+
+struct vfio_ap_qid_match {
+ qid_t qid;
+ struct device *dev;
+};
+
+/**
+ * vfio_ap_queue_match
+ *
+ * @dev: an AP queue device that has been reserved by the VFIO AP device
+ * driver
+ * @data: the match data (struct vfio_ap_qid_match) holding the qid to find
+ *
+ * Stores @dev in @data if the qid in @data matches the AP queue identifier
+ * of @dev. Always returns 0 so that every device bound to the driver is examined.
+ */
+static int vfio_ap_queue_match(struct device *dev, void *data)
+{
+ struct vfio_ap_qid_match *qid_match = data;
+ struct ap_queue *ap_queue;
+
+ ap_queue = to_ap_queue(dev);
+ if (ap_queue->qid == qid_match->qid)
+ qid_match->dev = dev;
+
+ return 0;
+}
+
+/**
+ * vfio_ap_validate_queues_for_apid
+ *
+ * @ap_matrix: the matrix device
+ * @matrix_mdev: the mediated matrix device
+ * @apid: an AP adapter ID (APID)
+ *
+ * Verifies that each APQN that is derived from the intersection of @apid and
+ * each AP queue index (APQI) corresponding to an AP domain assigned to the
+ * @matrix_mdev matches the APQN of an AP queue reserved by the VFIO AP device
+ * driver.
+ *
+ * Returns 0 if validation succeeds; otherwise, returns an error.
+ */
+static int vfio_ap_validate_queues_for_apid(struct ap_matrix *ap_matrix,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ int ret;
+ struct vfio_ap_qid_match qid_match;
+ unsigned long apqi;
+ struct device_driver *drv = ap_matrix->device.driver;
+
+ /**
+ * Examine each APQN with the specified APID
+ */
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
+ matrix_mdev->matrix->aqm_max) {
+ qid_match.qid = AP_MKQID(apid, apqi);
+ qid_match.dev = NULL;
+
+ ret = driver_for_each_device(drv, NULL, &qid_match,
+ vfio_ap_queue_match);
+ if (ret) {
+ pr_err("%s: Error %d validating AP queue %02lx.%04lx reservation\n",
+ VFIO_AP_MODULE_NAME, ret, apid, apqi);
+ return ret;
+ }
+
+ /*
+ * If the APQN identifies an AP queue that is reserved by the
+ * VFIO AP device driver, continue processing.
+ */
+ if (qid_match.dev)
+ continue;
+
+ pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver\n",
+ VFIO_AP_MODULE_NAME, apid, apqi,
+ VFIO_AP_DRV_NAME);
+
+ return -ENXIO;
+ }
+
+ return 0;
+}
+
+struct vfio_ap_apid_reserved {
+ unsigned long apid;
+ bool reserved;
+};
+
+/**
+ * vfio_ap_queue_id_contains_apid
+ *
+ * @dev: an AP queue device
+ * @data: an AP adapter ID (APID)
+ *
+ * Marks the APID in @data as reserved if it is contained in the identifier of
+ * the AP queue (@dev). Always returns 0 so that every device is examined.
+ */
+static int vfio_ap_queue_id_contains_apid(struct device *dev, void *data)
+{
+ struct vfio_ap_apid_reserved *apid_res = data;
+ struct ap_queue *ap_queue = to_ap_queue(dev);
+
+ if (apid_res->apid == AP_QID_CARD(ap_queue->qid))
+ apid_res->reserved = true;
+
+ return 0;
+}
+
+/**
+ * vfio_ap_verify_apid_reserved
+ *
+ * @ap_matrix: the AP matrix configured for the mediated matrix device
+ * @apid: the AP adapter ID
+ *
+ * Verifies that at least one AP queue reserved by the VFIO AP device driver
+ * has an APQN containing @apid.
+ *
+ * Returns 0 if the APID is reserved; otherwise, returns -ENODEV.
+ */
+static int vfio_ap_verify_apid_reserved(struct ap_matrix *ap_matrix,
+ unsigned long apid)
+{
+ int ret;
+ struct vfio_ap_apid_reserved apid_res = {
+ .apid = apid, /* .reserved starts out false */
+ };
+
+ ret = driver_for_each_device(ap_matrix->device.driver, NULL,
+ &apid_res,
+ vfio_ap_queue_id_contains_apid);
+ if (ret)
+ return ret;
+
+ if (apid_res.reserved)
+ return 0;
+
+ pr_err("%s: no APQNs with adapter ID %02lx are reserved by %s driver\n",
+ VFIO_AP_MODULE_NAME, apid, VFIO_AP_DRV_NAME);
+
+ return -ENODEV;
+}
+
+/**
+ * vfio_ap_validate_apid
+ *
+ * @mdev: the mediated device
+ * @matrix_mdev: the mediated matrix device
+ * @apid: the APID to validate
+ *
+ * Validates the value of @apid:
+ * * If there are no AP domains assigned, then there must be at least
+ * one AP queue device reserved by the VFIO AP device driver with an
+ * APQN containing @apid.
+ *
+ * * Else each APQN that can be derived from the intersection of @apid and
+ * the IDs of the AP domains already assigned must identify an AP queue
+ * that has been reserved by the VFIO AP device driver.
+ *
+ * Returns 0 if the value of @apid is valid; otherwise, returns an error.
+ */
+static int vfio_ap_validate_apid(struct mdev_device *mdev,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ int ret;
+ struct device *dev = mdev_parent_dev(mdev);
+ struct ap_matrix *ap_matrix = to_ap_matrix(dev);
+ unsigned long apqi;
+
+ apqi = find_first_bit_inv(matrix_mdev->matrix->aqm,
+ matrix_mdev->matrix->aqm_max);
+ if (apqi == matrix_mdev->matrix->aqm_max) {
+ ret = vfio_ap_verify_apid_reserved(ap_matrix, apid);
+ } else {
+ ret = vfio_ap_validate_queues_for_apid(ap_matrix, matrix_mdev,
+ apid);
+ }
+
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+/**
+ * assign_adapter_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the adapter ID (APID) to be assigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the APID from @buf and assigns it to the mediated matrix device. The
+ * APID must be a valid value:
+ * * The APID value must not exceed the maximum allowable AP adapter ID
+ *
+ * * If there are no AP domains assigned, then there must be at least
+ * one AP queue device reserved by the VFIO AP device driver with an
+ * APQN containing @apid.
+ *
+ * * Else each APQN that can be derived from the intersection of @apid and
+ * the IDs of the AP domains already assigned must identify an AP queue
+ * that has been reserved by the VFIO AP device driver.
+ *
+ * Returns the number of bytes processed if the APID is valid; otherwise returns
+ * an error.
+ */
+static ssize_t assign_adapter_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apid;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_APM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apid);
+ if (ret || (apid > maxid)) {
+ pr_err("%s: adapter id '%s' must be a value from 0 to %02d(%#04x)\n",
+ VFIO_AP_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ ret = vfio_ap_validate_apid(mdev, matrix_mdev, apid);
+ if (ret)
+ return ret;
+
+ /* Set the bit in the AP mask (APM) corresponding to the AP adapter
+ * number (APID). The bits in the mask, from most significant to least
+ * significant bit, correspond to APIDs 0-255.
+ */
+ set_bit_inv(apid, matrix_mdev->matrix->apm);
+
+ return count;
+}
+static DEVICE_ATTR_WO(assign_adapter);
+
+/**
+ * unassign_adapter_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the adapter ID (APID) to be unassigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the APID from @buf and unassigns it from the mediated matrix device.
+ * The APID must be a valid value.
+ *
+ * Returns the number of bytes processed if the APID is valid; otherwise returns
+ * an error.
+ */
+static ssize_t unassign_adapter_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apid;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_APM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apid);
+ if (ret || (apid > maxid)) {
+ pr_err("%s: adapter id '%s' must be a value from 0 to %02d(%#04x)\n",
+ VFIO_AP_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ clear_bit_inv((unsigned long)apid,
+ (unsigned long *)matrix_mdev->matrix->apm);
+
+ return count;
+}
+static DEVICE_ATTR_WO(unassign_adapter);
+
+static struct attribute *vfio_ap_mdev_attrs[] = {
+ &dev_attr_assign_adapter.attr,
+ &dev_attr_unassign_adapter.attr,
+ NULL
+};
+
+static struct attribute_group vfio_ap_mdev_attr_group = {
+ .attrs = vfio_ap_mdev_attrs
+};
+
+static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
+ &vfio_ap_mdev_attr_group,
+ NULL
+};
+
static const struct mdev_parent_ops vfio_ap_matrix_ops = {
.owner = THIS_MODULE,
.supported_type_groups = vfio_ap_mdev_type_groups,
+ .mdev_attr_groups = vfio_ap_mdev_attr_groups,
.create = vfio_ap_mdev_create,
.remove = vfio_ap_mdev_remove,
};
--
1.7.1
Introduces a new CPU model feature and two CPU model
facilities to support AP virtualization for KVM guests.
CPU model feature:
The KVM_S390_VM_CPU_FEAT_AP feature indicates that
AP instructions are available on the guest. This
feature will be enabled by the kernel only if the AP
instructions are installed on the linux host. This feature
must be specifically turned on for the KVM guest from
userspace to use the VFIO AP device driver for guest
access to AP devices.
CPU model facilities:
1. AP Query Configuration Information (QCI) facility is installed.
This is indicated by setting facilities bit 12 for
the guest. The kernel will not enable this facility
for the guest if it is not set on the host. This facility
must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
feature is not installed.
2. AP Facilities Test facility (APFT) is installed.
This is indicated by setting facilities bit 15 for
the guest. The kernel will not enable this facility for
the guest if it is not set on the host. This facility
must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
feature is not installed.
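The constraints on the two facility bits can be sketched in user-space C as
follows. This is only a model under stated assumptions: test_fac(), set_fac()
and guest_ap_facilities_valid() are hypothetical helpers introduced here, and
the facility list is treated as a byte array in which bit 0 is the leftmost
bit of byte 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAC_AP_QCI	12	/* AP Query Configuration Information */
#define FAC_AP_APFT	15	/* AP Facilities Test */

/* Test facility bit nr; bit 0 is the most significant bit of byte 0. */
static bool test_fac(const uint8_t *fac, unsigned int nr)
{
	return (fac[nr >> 3] & (0x80u >> (nr & 7))) != 0;
}

/* Set facility bit nr using the same left-to-right numbering. */
static void set_fac(uint8_t *fac, unsigned int nr)
{
	fac[nr >> 3] |= (uint8_t)(0x80u >> (nr & 7));
}

/*
 * Hypothetical check: an AP facility may be set for the guest only if it
 * is also set on the host and the AP CPU model feature is enabled.
 */
static bool guest_ap_facilities_valid(const uint8_t *host,
				      const uint8_t *guest, bool feat_ap)
{
	static const unsigned int ap_facs[] = { FAC_AP_QCI, FAC_AP_APFT };
	size_t i;

	for (i = 0; i < sizeof(ap_facs) / sizeof(ap_facs[0]); i++) {
		if (!test_fac(guest, ap_facs[i]))
			continue;
		if (!feat_ap || !test_fac(host, ap_facs[i]))
			return false;
	}
	return true;
}
```

In this model a guest facility list with bit 12 set is valid only when the
host list also has bit 12 set and the AP feature is turned on; setting bit 15
for the guest while it is absent on the host makes the check fail.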
Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Halil Pasic <[email protected]>
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/kvm-s390.c | 5 +++++
arch/s390/tools/gen_facilities.c | 2 ++
4 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index c990a1d..3162783 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -186,6 +186,7 @@ struct kvm_s390_sie_block {
#define ECA_AIV 0x00200000
#define ECA_VX 0x00020000
#define ECA_PROTEXCI 0x00002000
+#define ECA_APIE 0x00000008
#define ECA_SII 0x00000001
__u32 eca; /* 0x004c */
#define ICPT_INST 0x04
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 4cdaa55..a580dec 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
#define KVM_S390_VM_CPU_FEAT_PFMFI 11
#define KVM_S390_VM_CPU_FEAT_SIGPIF 12
#define KVM_S390_VM_CPU_FEAT_KSS 13
+#define KVM_S390_VM_CPU_FEAT_AP 14
struct kvm_s390_vm_cpu_feat {
__u64 feat[16];
};
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index b47ff11..55cd897 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -367,6 +367,11 @@ static void kvm_s390_cpu_feat_init(void)
if (MACHINE_HAS_ESOP)
allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
+
+ /* Check if AP instructions installed on host */
+ if (kvm_ap_instructions_installed())
+ allow_cpu_feat(KVM_S390_VM_CPU_FEAT_AP);
+
/*
* We need SIE support, ESOP (PROT_READ protection for gmap_shadow),
* 64bit SCAO (SCA passthrough) and IDTE (for gmap_shadow unshadowing).
diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
index 90a8c9e..a52290b 100644
--- a/arch/s390/tools/gen_facilities.c
+++ b/arch/s390/tools/gen_facilities.c
@@ -106,6 +106,8 @@ struct facility_def {
.name = "FACILITIES_KVM_CPUMODEL",
.bits = (int[]){
+ 12, /* AP Query Configuration Information */
+ 15, /* AP Facilities Test */
-1 /* END */
}
},
--
1.7.1
If the AP instructions are not available on the linux host, then
AP devices can not be interpreted by the SIE. The AP bus has a
function it uses to determine if the AP instructions are
available. This patch provides a new function that wraps the
AP bus's function to externalize it for use by KVM.
Signed-off-by: Tony Krowiak <[email protected]>
Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Harald Freudenberger <[email protected]>
---
arch/s390/include/asm/ap.h | 7 +++++++
arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
arch/s390/kvm/Makefile | 2 +-
arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
drivers/s390/crypto/ap_bus.c | 6 ++++++
5 files changed, 58 insertions(+), 1 deletions(-)
create mode 100644 arch/s390/include/asm/kvm-ap.h
create mode 100644 arch/s390/kvm/kvm-ap.c
diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h
index c1bedb4..7773bfd 100644
--- a/arch/s390/include/asm/ap.h
+++ b/arch/s390/include/asm/ap.h
@@ -120,4 +120,11 @@ struct ap_queue_status ap_queue_irq_ctrl(ap_qid_t qid,
struct ap_qirq_ctrl qirqctrl,
void *ind);
+/**
+ * ap_instructions_installed() - Tests whether AP instructions are installed
+ *
+ * Returns 1 if the AP instructions are installed; otherwise, returns 0
+ */
+int ap_instructions_installed(void);
+
#endif /* _ASM_S390_AP_H_ */
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
new file mode 100644
index 0000000..84412a9
--- /dev/null
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Adjunct Processor (AP) configuration management for KVM guests
+ *
+ * Copyright IBM Corp. 2018
+ *
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#ifndef _ASM_KVM_AP
+#define _ASM_KVM_AP
+
+/**
+ * kvm_ap_instructions_installed()
+ *
+ * Tests whether AP instructions are installed on the linux host
+ *
+ * Returns 1 if the AP instructions are installed on the host; otherwise,
+ * returns 0
+ */
+int kvm_ap_instructions_installed(void);
+
+#endif /* _ASM_KVM_AP */
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 05ee90a..1876bfe 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o $(KVM)/irqch
ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
-kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
+kvm-objs += diag.o gaccess.o guestdbg.o vsie.o kvm-ap.o
obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
new file mode 100644
index 0000000..1267588
--- /dev/null
+++ b/arch/s390/kvm/kvm-ap.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Adjunct Processor (AP) configuration management for KVM guests
+ *
+ * Copyright IBM Corp. 2018
+ *
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+#include <linux/kernel.h>
+#include <asm/kvm-ap.h>
+#include <asm/ap.h>
+
+int kvm_ap_instructions_installed(void)
+{
+#ifdef CONFIG_ZCRYPT
+ return ap_instructions_installed();
+#else
+ return 0;
+#endif
+}
+EXPORT_SYMBOL(kvm_ap_instructions_installed);
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 35a0c2b..9d108b6 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
}
EXPORT_SYMBOL(ap_query_configuration);
+int ap_instructions_installed(void)
+{
+ return (ap_instructions_available() == 0);
+}
+EXPORT_SYMBOL(ap_instructions_installed);
+
/**
* ap_init_configuration(): Allocate and query configuration array.
*/
--
1.7.1
Hi Tony,
I love your patch! Yet something to improve:
[auto build test ERROR on s390/features]
[also build test ERROR on v4.17-rc1 next-20180413]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Tony-Krowiak/s390-vfio-ap-guest-dedicated-crypto-adapters/20180416-052759
base: https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
config: s390-alldefconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=s390
All errors (new ones prefixed by >>):
arch/s390/kvm/kvm-ap.o: In function `kvm_ap_matrix_create':
>> kvm-ap.c:(.text+0x176): undefined reference to `ap_query_configuration'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
On 15/04/2018 23:22, Tony Krowiak wrote:
> If the AP instructions are not available on the linux host, then
> AP devices can not be interpreted by the SIE. The AP bus has a
> function it uses to determine if the AP instructions are
> available. This patch provides a new function that wraps the
> AP bus's function to externalize it for use by KVM.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> Reviewed-by: Pierre Morel <[email protected]>
> Reviewed-by: Harald Freudenberger <[email protected]>
> ---
> arch/s390/include/asm/ap.h | 7 +++++++
> arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
> drivers/s390/crypto/ap_bus.c | 6 ++++++
> 5 files changed, 58 insertions(+), 1 deletions(-)
> create mode 100644 arch/s390/include/asm/kvm-ap.h
> create mode 100644 arch/s390/kvm/kvm-ap.c
>
> diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h
> index c1bedb4..7773bfd 100644
> --- a/arch/s390/include/asm/ap.h
> +++ b/arch/s390/include/asm/ap.h
> @@ -120,4 +120,11 @@ struct ap_queue_status ap_queue_irq_ctrl(ap_qid_t qid,
> struct ap_qirq_ctrl qirqctrl,
> void *ind);
>
> +/**
> + * ap_instructions_installed() - Tests whether AP instructions are installed
> + *
> + * Returns 1 if the AP instructions are installed, otherwise; returns 0
> + */
> +int ap_instructions_installed(void);
> +
> #endif /* _ASM_S390_AP_H_ */
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> new file mode 100644
> index 0000000..84412a9
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2018
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +
> +#ifndef _ASM_KVM_AP
> +#define _ASM_KVM_AP
> +
> +/**
> + * kvm_ap_instructions_installed()
> + *
> + * Tests whether AP instructions are installed on the linux host
> + *
> + * Returns 1 if the AP instructions are installed on the host, otherwise;
> + * returns 0
> + */
> +int kvm_ap_instructions_installed(void);
> +
> +#endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index 05ee90a..1876bfe 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o $(KVM)/irqch
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o kvm-ap.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> new file mode 100644
> index 0000000..1267588
> --- /dev/null
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2018
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +#include <linux/kernel.h>
> +#include <asm/kvm-ap.h>
> +#include <asm/ap.h>
> +
> +int kvm_ap_instructions_installed(void)
> +{
> +#ifdef CONFIG_ZCRYPT
I did not give my R-B for this.
Please change it or drop my R-B.
I think you should review the way you wrap functions
calling the AP interface.
Having all of them together would simplify code and review.
> + return ap_instructions_installed();
> +#else
> + return 0;
> +#endif
> +}
> +EXPORT_SYMBOL(kvm_ap_instructions_installed);
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 35a0c2b..9d108b6 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
> }
> EXPORT_SYMBOL(ap_query_configuration);
>
> +int ap_instructions_installed(void)
> +{
> + return (ap_instructions_available() == 0);
> +}
> +EXPORT_SYMBOL(ap_instructions_installed);
> +
> /**
> * ap_init_configuration(): Allocate and query configuration array.
> */
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 15/04/2018 23:22, Tony Krowiak wrote:
> This patch refactors the code that initializes the crypto
> configuration for a guest. The crypto configuration is contained in
> a crypto control block (CRYCB) which is a satellite control block to
> our main hardware virtualization control block. The CRYCB is
> attached to the main virtualization control block via a CRYCB
> designation (CRYCBD) field containing the address of
> the CRYCB as well as its format.
>
> Prior to the introduction of AP device virtualization, there was
> no need to provide access to or specify the format of the CRYCB for
> a guest unless the MSA extension 3 (MSAX3) facility was installed
> on the host system. With the introduction of AP device virtualization,
> the CRYCB and its format must be made accessible to the guest
> regardless of the presence of the MSAX3 facility.
>
> The crypto initialization code is restructured as follows:
>
> * A new compilation unit is introduced to contain all interfaces
> and data structures related to configuring a guest's CRYCB for
> both the refactoring of crypto initialization as well as all
> subsequent patches introducing AP virtualization support.
>
> * Currently, the asm code for querying the AP configuration is
> duplicated in the AP bus as well as in KVM. Since the KVM
> code was introduced, the AP bus has externalized the interface
> for querying the AP configuration. The KVM interface will be
> replaced with a call to the AP bus interface. Of course, this
> will be moved to the new compilation unit mentioned above.
>
> * An interface to format the CRYCBD field will be provided via
> the new compilation unit and called from the KVM vm
> initialization.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm-ap.h | 15 +++++++++
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/kvm/kvm-ap.c | 39 ++++++++++++++++++++++++
> arch/s390/kvm/kvm-s390.c | 60 ++++----------------------------------
> 4 files changed, 61 insertions(+), 54 deletions(-)
>
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> index 84412a9..736e93e 100644
> --- a/arch/s390/include/asm/kvm-ap.h
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -10,6 +10,9 @@
> #ifndef _ASM_KVM_AP
> #define _ASM_KVM_AP
>
> +#include <linux/types.h>
> +#include <linux/kvm_host.h>
> +
> /**
> * kvm_ap_instructions_installed()
> *
> @@ -20,4 +23,16 @@
> */
> int kvm_ap_instructions_installed(void);
>
> +/**
> + * kvm_ap_build_crycbd
> + *
> + * The crypto control block designation (CRYCBD) is a 32-bit field that
> + * designates both the host real address and format of the CRYCB. This function
> + * builds the CRYCBD field for use by the KVM guest.
> + *
> + * @kvm: the KVM guest
> + * @crycbd: reference to the CRYCBD
> + */
> +void kvm_ap_build_crycbd(struct kvm *kvm);
> +
> #endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 81cdb6b..c990a1d 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
> __u8 reservedf0[12]; /* 0x00f0 */
> #define CRYCB_FORMAT1 0x00000001
> #define CRYCB_FORMAT2 0x00000003
> +#define CRYCB_FORMAT_MASK 0x00000003
> __u32 crycbd; /* 0x00fc */
> __u64 gcr[16]; /* 0x0100 */
> __u64 gbea; /* 0x0180 */
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> index 1267588..991bae4 100644
> --- a/arch/s390/kvm/kvm-ap.c
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -10,6 +10,8 @@
> #include <asm/kvm-ap.h>
> #include <asm/ap.h>
>
> +#include "kvm-s390.h"
> +
> int kvm_ap_instructions_installed(void)
> {
> #ifdef CONFIG_ZCRYPT
> @@ -19,3 +21,40 @@ int kvm_ap_instructions_installed(void)
> #endif
> }
> EXPORT_SYMBOL(kvm_ap_instructions_installed);
> +
> +static inline int kvm_ap_query_config(struct ap_config_info *config)
> +{
> + memset(config, 0, sizeof(*config));
> +
> +#ifdef CONFIG_ZCRYPT
I would prefer that you define the interface in an include file
with stubs for the case where CONFIG_ZCRYPT is not set.
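A minimal sketch of what such a stub arrangement could look like, written as
stand-alone C so it can be compiled outside the kernel. The function names
follow the patch, but HAVE_AP_BUS (standing in for CONFIG_ZCRYPT) and the
trimmed-down struct ap_config_info are assumptions made for illustration only.

```c
#include <errno.h>
#include <string.h>

/* Simplified stand-in for the real structure; only apxa is modelled. */
struct ap_config_info {
	unsigned int apxa;
};

#ifdef HAVE_AP_BUS	/* stands in for CONFIG_ZCRYPT in this sketch */
/* Real implementations would be provided by the AP bus. */
int ap_instructions_installed(void);
int ap_query_configuration(struct ap_config_info *info);
#else
/* Stubs: without the AP bus, nothing is installed or queryable. */
static inline int ap_instructions_installed(void)
{
	return 0;
}

static inline int ap_query_configuration(struct ap_config_info *info)
{
	memset(info, 0, sizeof(*info));
	return -EOPNOTSUPP;
}
#endif

/* The caller needs no #ifdef; it only sees one interface. */
static int kvm_ap_apxa_installed(void)
{
	struct ap_config_info config;

	if (!ap_instructions_installed())
		return 0;
	if (ap_query_configuration(&config) != 0)
		return 0;
	return config.apxa == 1;
}
```

The design point is that the conditional compilation lives in one header,
while callers such as kvm_ap_apxa_installed() stay free of #ifdef blocks.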
> + if (kvm_ap_instructions_installed())
> + return ap_query_configuration(config);
> +#endif
> +
> + return -EOPNOTSUPP;
> +}
> +
> +static int kvm_ap_apxa_installed(void)
> +{
> + struct ap_config_info config;
> +
> + if (kvm_ap_query_config(&config) == 0)
> + return (config.apxa == 1);
> +
> + return 0;
> +}
> +
> +void kvm_ap_build_crycbd(struct kvm *kvm)
> +{
> + kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
> + kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
> +
> + /* check whether MSAX3 is installed */
This means we do not support AP virtualization without MSAX3.
It follows that we do not support CRYCB_FORMAT0.
This differs from what you explain in the comment.
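To make the condition under discussion easier to follow, here is a small
stand-alone model of the format selection in the quoted function. It is only
a sketch: build_crycbd() and its boolean parameters are hypothetical, the
format 1 and format 2 values are taken from the quoted kvm_host.h, and
treating format 0 as "both format bits clear" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define CRYCB_FORMAT1		0x00000001u
#define CRYCB_FORMAT2		0x00000003u
#define CRYCB_FORMAT_MASK	0x00000003u

/*
 * Model of the quoted logic: the low bits of the CRYCB address are
 * cleared, then a format is set only when both the AP instructions and
 * MSAX3 (facility 76) are available. In every other case the format
 * bits stay zero.
 */
static uint32_t build_crycbd(uint32_t crycb_addr, bool ap_installed,
			     bool msax3, bool apxa)
{
	uint32_t crycbd = crycb_addr & ~CRYCB_FORMAT_MASK;

	if (ap_installed && msax3)
		crycbd |= apxa ? CRYCB_FORMAT2 : CRYCB_FORMAT1;

	return crycbd;
}
```

In this model, build_crycbd(0x1000, true, false, true) leaves the format bits
clear even though APXA is available, which is the case the remark above is
about.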
> + if (kvm_ap_instructions_installed() && test_kvm_facility(kvm, 76)) {
> + if (kvm_ap_apxa_installed())
> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
> + else
> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
> + }
> +}
> +EXPORT_SYMBOL(kvm_ap_build_crycbd);
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index d0c3518..b47ff11 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -40,6 +40,7 @@
> #include <asm/sclp.h>
> #include <asm/cpacf.h>
> #include <asm/timex.h>
> +#include <asm/kvm-ap.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
>
> @@ -1881,55 +1882,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
> return r;
> }
>
> -static int kvm_s390_query_ap_config(u8 *config)
> -{
> - u32 fcn_code = 0x04000000UL;
> - u32 cc = 0;
> -
> - memset(config, 0, 128);
> - asm volatile(
> - "lgr 0,%1\n"
> - "lgr 2,%2\n"
> - ".long 0xb2af0000\n" /* PQAP(QCI) */
> - "0: ipm %0\n"
> - "srl %0,28\n"
> - "1:\n"
> - EX_TABLE(0b, 1b)
> - : "+r" (cc)
> - : "r" (fcn_code), "r" (config)
> - : "cc", "0", "2", "memory"
> - );
> -
> - return cc;
> -}
> -
> -static int kvm_s390_apxa_installed(void)
> -{
> - u8 config[128];
> - int cc;
> -
> - if (test_facility(12)) {
> - cc = kvm_s390_query_ap_config(config);
> -
> - if (cc)
> - pr_err("PQAP(QCI) failed with cc=%d", cc);
> - else
> - return config[0] & 0x40;
> - }
> -
> - return 0;
> -}
> -
> -static void kvm_s390_set_crycb_format(struct kvm *kvm)
> -{
> - kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
> -
> - if (kvm_s390_apxa_installed())
> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
> - else
> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
> -}
> -
> static u64 kvm_s390_get_initial_cpuid(void)
> {
> struct cpuid cpuid;
> @@ -1941,12 +1893,12 @@ static u64 kvm_s390_get_initial_cpuid(void)
>
> static void kvm_s390_crypto_init(struct kvm *kvm)
> {
> + kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
> + kvm_ap_build_crycbd(kvm);
> +
> if (!test_kvm_facility(kvm, 76))
> return;
>
> - kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
> - kvm_s390_set_crycb_format(kvm);
> -
> /* Enable AES/DEA protected key functions by default */
> kvm->arch.crypto.aes_kw = 1;
> kvm->arch.crypto.dea_kw = 1;
> @@ -2475,6 +2427,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>
> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
> {
> + vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
> +
> if (!test_kvm_facility(vcpu->kvm, 76))
> return;
>
> @@ -2484,8 +2438,6 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
> vcpu->arch.sie_block->ecb3 |= ECB3_AES;
> if (vcpu->kvm->arch.crypto.dea_kw)
> vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
> -
> - vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
> }
>
> void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 15/04/2018 23:22, Tony Krowiak wrote:
> The VFIO AP device model exploits interpretive execution of AP
> instructions (APIE) to provide guests passthrough access to AP
> devices. This patch introduces a new interface to enable and
> disable APIE.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
> arch/s390/kvm/kvm-s390.c | 9 +++++++++
> 4 files changed, 46 insertions(+), 0 deletions(-)
>
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> index 736e93e..a6c092e 100644
> --- a/arch/s390/include/asm/kvm-ap.h
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -35,4 +35,20 @@
> */
> void kvm_ap_build_crycbd(struct kvm *kvm);
>
> +/**
> + * kvm_ap_interpret_instructions
> + *
> + * Indicate whether AP instructions shall be interpreted. If they are not
> + * interpreted, all AP instructions will be intercepted and routed back to
> + * userspace.
> + *
> + * @kvm: the virtual machine attributes
> + * @enable: indicates whether AP instructions are to be interpreted (true) or
> + * not (false).
> + *
> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
> + * indicating that AP instructions are not installed on the guest.
> + */
> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
> +
> #endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 3162783..5470685 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
> __u32 crycbd;
> __u8 aes_kw;
> __u8 dea_kw;
> + __u8 apie;
> };
>
> #define APCB0_MASK_SIZE 1
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> index 991bae4..55d11b5 100644
> --- a/arch/s390/kvm/kvm-ap.c
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
> }
> }
> EXPORT_SYMBOL(kvm_ap_build_crycbd);
> +
> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
> +{
> + int ret = 0;
> +
> + mutex_lock(&kvm->lock);
> +
> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
Do we really need to test CPU_FEAT_AP?
I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
interpreted.
Shouldn't we add this information to the name,
like KVM_S390_VM_CPU_FEAT_APIE?
> + ret = -EOPNOTSUPP;
> + goto done;
> + }
> +
> + kvm->arch.crypto.apie = enable;
> + kvm_s390_vcpu_crypto_reset_all(kvm);
> +
> +done:
> + mutex_unlock(&kvm->lock);
> + return ret;
> +}
> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 55cd897..1dc8566 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
> kvm_ap_build_crycbd(kvm);
>
> + /* Default setting indicating SIE shall interpret AP instructions */
> + kvm->arch.crypto.apie = 1;
> +
> if (!test_kvm_facility(kvm, 76))
> return;
>
> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
> {
> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>
> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
> + if (vcpu->kvm->arch.crypto.apie &&
> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
> + vcpu->arch.sie_block->eca |= ECA_APIE;
> +
> +
> if (!test_kvm_facility(vcpu->kvm, 76))
> return;
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On Mon, 16 Apr 2018 10:44:53 +0200
Pierre Morel <[email protected]> wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
> > If the AP instructions are not available on the linux host, then
> > AP devices can not be interpreted by the SIE. The AP bus has a
> > function it uses to determine if the AP instructions are
> > available. This patch provides a new function that wraps the
> > AP bus's function to externalize it for use by KVM.
> >
> > Signed-off-by: Tony Krowiak <[email protected]>
> > Reviewed-by: Pierre Morel <[email protected]>
> > Reviewed-by: Harald Freudenberger <[email protected]>
> > ---
> > arch/s390/include/asm/ap.h | 7 +++++++
> > arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
> > arch/s390/kvm/Makefile | 2 +-
> > arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
> > drivers/s390/crypto/ap_bus.c | 6 ++++++
> > 5 files changed, 58 insertions(+), 1 deletions(-)
> > create mode 100644 arch/s390/include/asm/kvm-ap.h
> > create mode 100644 arch/s390/kvm/kvm-ap.c
> > diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> > new file mode 100644
> > index 0000000..1267588
> > --- /dev/null
> > +++ b/arch/s390/kvm/kvm-ap.c
> > @@ -0,0 +1,21 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +/*
> > + * Adjunct Processor (AP) configuration management for KVM guests
> > + *
> > + * Copyright IBM Corp. 2018
> > + *
> > + * Author(s): Tony Krowiak <[email protected]>
> > + */
> > +#include <linux/kernel.h>
> > +#include <asm/kvm-ap.h>
> > +#include <asm/ap.h>
> > +
> > +int kvm_ap_instructions_installed(void)
> > +{
> > +#ifdef CONFIG_ZCRYPT
>
> I did not give my R-B for this.
> please change it or suppress my R-B
>
> I think you should review the way you wrap functions
> calling the AP interface.
> Having all of them together would simplify code and review.
I don't like the ifdeffery either (especially as there's more later).
Consolidating all functions for querying basic ap capabilities sounds
like a good idea. What about collecting them in an ap-util file and
either always building it or selecting it from both zcrypt and kvm?
>
> > + return ap_instructions_installed();
> > +#else
> > + return 0;
> > +#endif
> > +}
> > +EXPORT_SYMBOL(kvm_ap_instructions_installed);
> > diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> > index 35a0c2b..9d108b6 100644
> > --- a/drivers/s390/crypto/ap_bus.c
> > +++ b/drivers/s390/crypto/ap_bus.c
> > @@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
> > }
> > EXPORT_SYMBOL(ap_query_configuration);
> >
> > +int ap_instructions_installed(void)
> > +{
> > + return (ap_instructions_available() == 0);
> > +}
> > +EXPORT_SYMBOL(ap_instructions_installed);
> > +
> > /**
> > * ap_init_configuration(): Allocate and query configuration array.
> > */
>
>
On 15/04/2018 23:22, Tony Krowiak wrote:
> Registers a group notifier during the open of the mediated
> matrix device to get information on KVM presence through the
> VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
> to the kvm structure is saved inside the mediated matrix
> device. Once the VFIO AP device driver has access to KVM,
> access to the APs can be configured for the guest.
>
> Access to APs is configured when the file descriptor for the
> mediated matrix device is opened by userspace. The items to be
> configured are:
>
> 1. The ECA.28 bit in the SIE state description determines whether
> AP instructions are interpreted by the hardware or intercepted.
> The VFIO AP device driver relies on interpretive execution of
> AP instructions, so the ECA.28 bit will be set.
>
> 2. Guest access to AP adapters, usage domains and control domains
> is controlled by three bit masks referenced from the
> Crypto Control Block (CRYCB) referenced from the guest's SIE state
> description:
>
> * The AP Mask (APM) controls access to the AP adapters. Each bit
> in the APM represents an adapter number - from most significant
> to least significant bit - from 0 to 255. The bits in the APM
> are set according to the adapter numbers assigned to the mediated
> matrix device via its 'assign_adapter' sysfs attribute file.
>
> * The AP Queue Mask (AQM) controls access to the AP queues. Each bit
> in the AQM represents an AP queue index - from most significant
> to least significant bit - from 0 to 255. A queue index references
> a specific domain and is synonymous with the domain number. The
> bits in the AQM are set according to the domain numbers assigned
> to the mediated matrix device via its 'assign_domain' sysfs
> attribute file.
>
> * The AP Domain Mask (ADM) controls access to the AP control domains.
> Each bit in the ADM represents a control domain - from most
> significant to least significant bit - from 0-255. The
> bits in the ADM are set according to the domain numbers assigned
> to the mediated matrix device via its 'assign_control_domain'
> sysfs attribute file.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 50 +++++++++++++++++++++++++++++++++
> drivers/s390/crypto/vfio_ap_private.h | 2 +
> 2 files changed, 52 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index bc2b05e..e3ff5ab 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
> return 0;
> }
>
> +static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
> + unsigned long action, void *data)
> +{
> + struct ap_matrix_mdev *matrix_mdev;
> +
> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
> + matrix_mdev = container_of(nb, struct ap_matrix_mdev,
> + group_notifier);
> + matrix_mdev->kvm = data;
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> +static int vfio_ap_mdev_open(struct mdev_device *mdev)
> +{
> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> + unsigned long events;
> + int ret;
> +
> + matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier;
> + events = VFIO_GROUP_NOTIFY_SET_KVM;
> +
> + ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
> + &events, &matrix_mdev->group_notifier);
> + if (ret)
> + return ret;
> +
> + ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
Do you need this call?
APIE is always enabled in KVM if AP instructions are available.
Interpretation is enabled or disabled for the VM as a whole, so this
is not the right place to do it, since open is per device.
Or do we only have one device in the VM at a time?
In that case, shouldn't we make it official by returning -EEXIST for the
second call?
> + if (ret)
> + return ret;
> +
> + ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
> + matrix_mdev->matrix);
> +
> + return ret;
> +}
> +
> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
> +{
> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> +
> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
This call clears apie in KVM.
This is only OK if a single device stays present until the end of the VM;
otherwise AP instructions in the guest will fail after the release, either
until the end of the VM or until a new device is plugged.
> + vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
> + &matrix_mdev->group_notifier);
> +}
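The "make it official" option raised in the comments above can be sketched in user space as follows. This is only an illustration of the single-device-per-VM idea, not the patch's code: struct vm_sketch and both functions are hypothetical, and a real implementation would need locking around the flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-VM state: at most one matrix device may be open. */
struct vm_sketch {
	bool has_matrix_dev;
	bool apie;
};

/* Refuse a second open for the same VM instead of silently sharing apie. */
static int matrix_open_sketch(struct vm_sketch *vm)
{
	if (vm->has_matrix_dev)
		return -EEXIST;
	vm->has_matrix_dev = true;
	vm->apie = true;
	return 0;
}

/* With only one device allowed, clearing apie on release is safe. */
static void matrix_release_sketch(struct vm_sketch *vm)
{
	vm->has_matrix_dev = false;
	vm->apie = false;
}
```

With this guard, release can no longer disable interpretation underneath another device that is still open, because no such device can exist.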
> +
> static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
> {
> return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
> @@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
> .mdev_attr_groups = vfio_ap_mdev_attr_groups,
> .create = vfio_ap_mdev_create,
> .remove = vfio_ap_mdev_remove,
> + .open = vfio_ap_mdev_open,
> + .release = vfio_ap_mdev_release,
> };
>
> int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index f248faf..48e2806 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -31,6 +31,8 @@ struct ap_matrix {
>
> struct ap_matrix_mdev {
> struct kvm_ap_matrix *matrix;
> + struct notifier_block group_notifier;
> + struct kvm *kvm;
> };
>
> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 15/04/2018 23:22, Tony Krowiak wrote:
> This patch provides documentation describing the AP architecture and
> design concepts behind the virtualization of AP devices. It also
> includes an example of how to configure AP devices for exclusive
> use of KVM guests.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> Documentation/s390/vfio-ap.txt | 567 ++++++++++++++++++++++++++++++++++++++++
> MAINTAINERS | 1 +
> 2 files changed, 568 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/s390/vfio-ap.txt
>
> diff --git a/Documentation/s390/vfio-ap.txt b/Documentation/s390/vfio-ap.txt
> new file mode 100644
> index 0000000..a1e888a
> --- /dev/null
> +++ b/Documentation/s390/vfio-ap.txt
> @@ -0,0 +1,567 @@
> +Introduction:
> +============
> +The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprised
> +of three AP instructions and from 1 up to 256 PCIe cryptographic adapter cards.
> +The AP devices provide cryptographic functions to all CPUs assigned to a
> +linux system running in an IBM Z system LPAR.
> +
> +The AP adapter cards are exposed via the AP bus. The motivation for vfio-ap
> +is to make AP cards available to KVM guests using the VFIO mediated device
> +framework. This implementation relies considerably on the s390 virtualization
> +facilities which do most of the hard work of providing direct access to AP
> +devices.
> +
> +AP Architectural Overview:
> +=========================
> +To facilitate the comprehension of the design, let's start with some
> +definitions:
> +
> +* AP adapter
> +
> + An AP adapter is an IBM Z adapter card that can perform cryptographic
> + functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
> + assigned to the LPAR in which a linux host is running will be available to
> + the linux host. Each adapter is identified by a number from 0 to 255. When
> + installed, an AP adapter is accessed by AP instructions executed by any CPU.
> +
> + The AP adapter cards are assigned to a given LPAR via the system's Activation
> + Profile which can be edited via the HMC. When the system is IPL'd, the AP bus
> + module is loaded and detects the AP adapter cards assigned to the LPAR. The AP
> + bus creates a sysfs device for each adapter as they are detected. For example,
> + if AP adapters 4 and 10 (0x0a) are assigned to the LPAR, the AP bus will
> + create the following sysfs entries:
> +
> + /sys/devices/ap/card04
> + /sys/devices/ap/card0a
> +
> + Symbolic links to these devices will also be created in the AP bus devices
> + sub-directory:
> +
> + /sys/bus/ap/devices/[card04]
> + /sys/bus/ap/devices/[card0a]
> +
> +* AP domain
> +
> + An adapter is partitioned into domains. Each domain can be thought of as
> + a set of hardware registers for processing AP instructions. An adapter can
> + hold up to 256 domains. Each domain is identified by a number from 0 to 255.
> + Domains can be further classified into two types:
> +
> + * Usage domains are domains that can be accessed directly to process AP
> + commands.
> +
> + * Control domains are domains that are accessed indirectly by AP
> + commands sent to a usage domain to control or change the domain; for
> + example, to set a secure private key for the domain.
> +
> + The AP usage and control domains are assigned to a given LPAR via the system's
> + Activation Profile which can be edited via the HMC. When the system is IPL'd,
> + the AP bus module is loaded and detects the AP usage and control domains
> + assigned to the LPAR. The domain number of each usage domain will be coupled
> + with the adapter number of each AP adapter assigned to the LPAR to identify
> + the AP queues (see AP Queue section below). The domain number of each control
> + domain will be represented in a bitmask and stored in a sysfs file
> + /sys/bus/ap/ap_control_domain_mask created by the bus. The bits in the mask,
> + from most to least significant bit, correspond to domains 0-255.
> +
> + A domain may be assigned to a system as both a usage and control domain, or
> + as a control domain only. Consequently, all domains assigned as both a usage
> + and control domain can both process AP commands as well as be changed by an AP
> + command sent to any usage domain assigned to the same system. Domains assigned
> + only as control domains can not process AP commands but can be changed by AP
> + commands sent to any usage domain assigned to the system.
> +
> +* AP Queue
> +
> + An AP queue is the means by which an AP command-request message is sent to a
> + usage domain inside a specific adapter. An AP queue is identified by a tuple
> + comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
> + APQI corresponds to a given usage domain number within the adapter. This tuple
> + forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
> + instructions include a field containing the APQN to identify the AP queue to
> + which the AP command-request message is to be sent for processing.
> +
> + The AP bus will create a sysfs device for each APQN that can be derived from
> + the cross product of the AP adapter and usage domain numbers detected when the
> + AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
> + domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
> + following sysfs entries:
> +
> + /sys/devices/ap/card04/04.0006
> + /sys/devices/ap/card04/04.0047
> + /sys/devices/ap/card0a/0a.0006
> + /sys/devices/ap/card0a/0a.0047
> +
> + The following symbolic links to these devices will be created in the AP bus
> + devices subdirectory:
> +
> + /sys/bus/ap/devices/[04.0006]
> + /sys/bus/ap/devices/[04.0047]
> + /sys/bus/ap/devices/[0a.0006]
> + /sys/bus/ap/devices/[0a.0047]
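Judging only from the examples above, a queue device name is the adapter number as two hex digits, a dot, and the domain number as four hex digits. A minimal sketch under that assumption; apqn_sysfs_name() is a hypothetical helper, not an AP bus function.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format an (adapter, domain) pair the way the sysfs examples show it,
 * e.g. adapter 10, domain 71 -> "0a.0047". Returns snprintf()'s result.
 */
static int apqn_sysfs_name(char *buf, size_t len,
			   unsigned int apid, unsigned int apqi)
{
	assert(apid < 256 && apqi < 256);
	return snprintf(buf, len, "%02x.%04x", apid, apqi);
}
```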
> +
> +* AP Instructions:
> +
> + There are three AP instructions:
> +
> + * NQAP: to enqueue an AP command-request message to a queue
> + * DQAP: to dequeue an AP command-reply message from a queue
> + * PQAP: to administer the queues
> +
> +AP and SIE:
> +==========
> +Let's now see how AP instructions are interpreted by the hardware.
> +
> +A satellite control block called the Crypto Control Block is attached to our
> +main hardware virtualization control block. The CRYCB contains three fields to
> +identify the adapters, usage domains and control domains assigned to the KVM
> +guest:
> +
> +* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
> + to the KVM guest. Each bit in the mask, from most significant to least
> + significant bit, corresponds to an APID from 0-255. If a bit is set, the
> + corresponding adapter is valid for use by the KVM guest.
> +
> +* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
> + assigned to the KVM guest. Each bit in the mask, from most significant to
> + least significant bit, corresponds to an AP queue index (APQI) from 0-255. If
> + a bit is set, the corresponding queue is valid for use by the KVM guest.
> +
> +* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
> + domains assigned to the KVM guest. The ADM bit mask controls which domains can
> + be changed by an AP command-request message sent to a usage domain from the
> + guest. Each bit in the mask, from most significant to least significant bit,
> + corresponds to a domain from 0-255. If a bit is set, the corresponding domain
> + can be modified by an AP command-request message sent to a usage domain
> + configured for the KVM guest.
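The most-significant-bit-first numbering used by the three masks can be sketched like this. It assumes a 256-bit mask stored as 32 bytes with ID 0 in the top bit of byte 0; struct ap_mask_sketch and its helpers are hypothetical and are not the CRYCB layout from the patches.

```c
#include <assert.h>
#include <stdint.h>

#define AP_MASK_BITS  256                  /* IDs 0..255, as in the text */
#define AP_MASK_BYTES (AP_MASK_BITS / 8)

/* Hypothetical container for one mask (APM, AQM or ADM). */
struct ap_mask_sketch {
	uint8_t bytes[AP_MASK_BYTES];
};

/* Set the bit for `id`, counting from the MSB of byte 0. */
static void ap_mask_set(struct ap_mask_sketch *m, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	m->bytes[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* Return nonzero if the bit for `id` is set. */
static int ap_mask_test(const struct ap_mask_sketch *m, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	return (m->bytes[id / 8] & (0x80u >> (id % 8))) != 0;
}
```

With this numbering, ID 0 is the top bit of the first byte (0x80) and ID 10 lands in the second byte as 0x20.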
> +
> +If you recall from the description of an AP Queue, AP instructions include
> +an APQN to identify the AP adapter and AP queue to which an AP command-request
> +message is to be sent (NQAP and PQAP instructions), or from which a
> +command-reply message is to be received (DQAP instruction). The validity of an
> +APQN is defined by the matrix calculated from the APM and AQM; it is the
> +cross product of all assigned adapter numbers (APM) with all assigned queue
> +indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
> +assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
> +the guest.
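The cross product described above can be enumerated with two nested loops. A minimal sketch using plain arrays of adapter and domain numbers rather than the real bit masks; struct apqn_sketch and apqn_cross_product() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical (adapter, queue index) pair. */
struct apqn_sketch {
	unsigned int apid;
	unsigned int apqi;
};

/* Write every (adapter, domain) pair into out[]; return the pair count. */
static size_t apqn_cross_product(const unsigned int *adapters, size_t n_adapters,
				 const unsigned int *domains, size_t n_domains,
				 struct apqn_sketch *out, size_t out_len)
{
	size_t n = 0;

	for (size_t i = 0; i < n_adapters; i++) {
		for (size_t j = 0; j < n_domains; j++) {
			assert(n < out_len);
			out[n].apid = adapters[i];
			out[n].apqi = domains[j];
			n++;
		}
	}
	return n;
}
```

For adapters 1 and 2 with domains 5 and 6 this yields the four APQNs listed in the text, in the order (1,5), (1,6), (2,5), (2,6).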
> +
> +The APQNs can provide secure key functionality - i.e., a private key is stored
> +on the adapter card for each of its domains - so each APQN must be assigned to
> +at most one guest or the linux host.
> +
> + Example 1: Valid configuration:
> + ------------------------------
> + Guest1: adapters 1,2 domains 5,6
> + Guest2: adapters 1,2 domain 7
> +
> + This is valid because both guests have a unique set of APQNs: Guest1 has
> + APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7).
> +
> + Example 2: Invalid configuration:
> + --------------------------------
> + Guest1: adapters 1,2 domains 5,6
> + Guest2: adapter 1 domains 6,7
> +
> + This is an invalid configuration because both guests have access to
> + APQN (1,6).
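Since each guest gets the cross product of its adapters and domains, two guests share an APQN exactly when they have at least one adapter and at least one domain in common. A minimal sketch of that check, assuming 256-bit masks stored MSB-first in 32 bytes; struct guest_matrix_sketch and the helpers are hypothetical, not the driver's validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_BYTES 32   /* 256 bits, IDs 0..255 */

/* Hypothetical per-guest matrix: adapter mask and usage-domain mask. */
struct guest_matrix_sketch {
	uint8_t apm[MASK_BYTES];
	uint8_t aqm[MASK_BYTES];
};

/* Set the bit for `id`, counting from the MSB of byte 0. */
static void mask_set(uint8_t *mask, unsigned int id)
{
	assert(id < MASK_BYTES * 8);
	mask[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* True if the two masks have any bit in common. */
static bool masks_intersect(const uint8_t *a, const uint8_t *b)
{
	for (int i = 0; i < MASK_BYTES; i++)
		if (a[i] & b[i])
			return true;
	return false;
}

/* Two guests share an APQN iff they share an adapter AND a domain. */
static bool matrices_share_apqn(const struct guest_matrix_sketch *g1,
				const struct guest_matrix_sketch *g2)
{
	return masks_intersect(g1->apm, g2->apm) &&
	       masks_intersect(g1->aqm, g2->aqm);
}
```

Applied to the two examples: Example 1 shares adapters but no domain, so there is no conflict; Example 2 shares adapter 1 and domain 6, so it is rejected.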
> +
> +The Design:
> +===========
> +The design introduces three new objects:
> +
> +1. AP matrix device
> +2. VFIO AP device driver (vfio_ap.ko)
> +3. AP mediated matrix passthrough device
> +
> +The VFIO AP device driver
> +-------------------------
> +The VFIO AP (vfio_ap) device driver serves the following purposes:
> +
> +1. Provides the interfaces to reserve APQNs for exclusive use of KVM guests.
> +
> +2. Sets up the VFIO mediated device interfaces to manage the mediated matrix
> + device and create the sysfs interfaces for assigning adapters, usage domains,
> + and control domains comprising the matrix for a KVM guest.
> +
> +3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's
> + SIE state description to grant the guest access to AP devices.
> +
> +4. Initializes the CPU model feature indicating that a KVM guest may use
> + AP facilities installed on the linux host.
> +
> +5. Enables interpretive execution mode for the KVM guest.
> +
> +Reserve APQNs for exclusive use of KVM guests
> +---------------------------------------------
> +The following block diagram illustrates the mechanism by which APQNs are
> +reserved:
> +
> + +------------------+
> + remove | | unbind
> + +------------------->+ cex4queue driver +<-----------+
> + | | | |
> + | +------------------+ |
> + | |
> + | |
> + | |
> ++--------+---------+ register +------------------+ +-----+------+
> +| +<---------+ | bind | |
> +| ap_bus | | vfio_ap driver +<-----+ admin |
> +| +--------->+ | | |
> ++------------------+ probe +---+--------+-----+ +------------+
> + | |
> + create | | store APQN
> + | |
> + v v
> + +---+--------+-----+
> + | |
> + | matrix device |
> + | |
> + +------------------+
> +
> +The process for reserving an AP queue for use by a KVM guest is:
> +
> +* The vfio-ap driver during its initialization will perform the following:
> + * Create the 'vfio_ap' root device - /sys/devices/vfio_ap
> + * Create the 'matrix' device in the 'vfio_ap' root
> + * Register the matrix device with the device core
> +* Register with the ap_bus for AP queue devices of type CEX4, CEX5 and
> + CEX6 and to provide the vfio_ap driver's probe and remove callback interfaces.
I wonder why the type of card has anything to do with this driver.
It should be transparent: the driver should be able to provide the
matrix (APM/AQM/ADM) independently of the type of card in the slot.
Regards,
Pierre
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/15/2018 11:22 PM, Tony Krowiak wrote:
> The VFIO AP device model exploits interpretive execution of AP
> instructions (APIE) to provide guests passthrough access to AP
> devices. This patch introduces a new interface to enable and
> disable APIE.
>
> Signed-off-by: Tony Krowiak <[email protected]>
LGTM, but this should be squashed into #4. (As it stands, you have a
kernel that supports the cpu model feature 'ap' but does not
make interpretation the default.)
On 16/04/2018 12:51, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> The VFIO AP device model exploits interpretive execution of AP
>> instructions (APIE) to provide guests passthrough access to AP
>> devices. This patch introduces a new interface to enable and
>> disable APIE.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>> arch/s390/include/asm/kvm_host.h | 1 +
>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
>> index 736e93e..a6c092e 100644
>> --- a/arch/s390/include/asm/kvm-ap.h
>> +++ b/arch/s390/include/asm/kvm-ap.h
>> @@ -35,4 +35,20 @@
>> */
>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>
>> +/**
>> + * kvm_ap_interpret_instructions
>> + *
>> + * Indicate whether AP instructions shall be interpreted. If they are not
>> + * interpreted, all AP instructions will be intercepted and routed back to
>> + * userspace.
>> + *
>> + * @kvm: the virtual machine attributes
>> + * @enable: indicates whether AP instructions are to be interpreted (true)
>> + * or not (false).
>> + *
>> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
>> + * indicating that AP instructions are not installed on the guest.
>> + */
>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>> +
>> #endif /* _ASM_KVM_AP */
>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>> index 3162783..5470685 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>> __u32 crycbd;
>> __u8 aes_kw;
>> __u8 dea_kw;
>> + __u8 apie;
>> };
>>
>> #define APCB0_MASK_SIZE 1
>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>> index 991bae4..55d11b5 100644
>> --- a/arch/s390/kvm/kvm-ap.c
>> +++ b/arch/s390/kvm/kvm-ap.c
>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>> }
>> }
>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>> +
>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>> +{
>> + int ret = 0;
>> +
>> + mutex_lock(&kvm->lock);
>> +
>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>
> Do we really need to test CPU_FEAT_AP?
>
> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
> interpreted.
> shouldn't we add this information in the name?
> like KVM_S390_VM_CPU_FEAT_APIE
If I misunderstood and FEAT_AP really means that AP instructions are
available in the guest, then the same question applies:
is this function called if AP instructions are not available in the guest?
>
>> + ret = -EOPNOTSUPP;
>> + goto done;
>> + }
>> +
>> + kvm->arch.crypto.apie = enable;
>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>> +
>> +done:
>> + mutex_unlock(&kvm->lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 55cd897..1dc8566 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>> kvm_ap_build_crycbd(kvm);
>>
>> + /* Default setting indicating SIE shall interpret AP instructions */
>> + kvm->arch.crypto.apie = 1;
>> +
>> if (!test_kvm_facility(kvm, 76))
>> return;
>>
>> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>> {
>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>
>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>> + if (vcpu->kvm->arch.crypto.apie &&
>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>
> Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
Sorry, I should have written "AP instructions" here:
is this function called if AP instructions are not available in the guest?
>
>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>> +
>> +
>> if (!test_kvm_facility(vcpu->kvm, 76))
>> return;
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/16/2018 01:13 PM, Pierre Morel wrote:
> On 16/04/2018 12:51, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> The VFIO AP device model exploits interpretive execution of AP
>>> instructions (APIE) to provide guests passthrough access to AP
>>> devices. This patch introduces a new interface to enable and
>>> disable APIE.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>> arch/s390/include/asm/kvm_host.h | 1 +
>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
>>> index 736e93e..a6c092e 100644
>>> --- a/arch/s390/include/asm/kvm-ap.h
>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>> @@ -35,4 +35,20 @@
>>> */
>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>
>>> +/**
>>> + * kvm_ap_interpret_instructions
>>> + *
>>> + * Indicate whether AP instructions shall be interpreted. If they are not
>>> + * interpreted, all AP instructions will be intercepted and routed back to
>>> + * userspace.
>>> + *
>>> + * @kvm: the virtual machine attributes
>>> + * @enable: indicates whether AP instructions are to be interpreted (true)
>>> + * or not (false).
>>> + *
>>> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
>>> + * indicating that AP instructions are not installed on the guest.
>>> + */
>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>> +
>>> #endif /* _ASM_KVM_AP */
>>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>>> index 3162783..5470685 100644
>>> --- a/arch/s390/include/asm/kvm_host.h
>>> +++ b/arch/s390/include/asm/kvm_host.h
>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>> __u32 crycbd;
>>> __u8 aes_kw;
>>> __u8 dea_kw;
>>> + __u8 apie;
>>> };
>>>
>>> #define APCB0_MASK_SIZE 1
>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>> index 991bae4..55d11b5 100644
>>> --- a/arch/s390/kvm/kvm-ap.c
>>> +++ b/arch/s390/kvm/kvm-ap.c
>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>> }
>>> }
>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>> +
>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>> +{
>>> + int ret = 0;
>>> +
>>> + mutex_lock(&kvm->lock);
>>> +
>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>
>> Do we really need to test CPU_FEAT_AP?
>>
>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are interpreted.
>> shouldn't we add this information in the name?
>> like KVM_S390_VM_CPU_FEAT_APIE
>
> If I misunderstood and FEAT_AP really mean AP instructions available in the guest,
> same question:
> is this function called if AP instructions are not available in the guest?
>
See patch #13. In any case, I think the check above is good defensive
programming. This implementation should be sane regardless of
the answer to your question.
>>
>>> + ret = -EOPNOTSUPP;
>>> + goto done;
>>> + }
>>> +
>>> + kvm->arch.crypto.apie = enable;
>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>> +
>>> +done:
>>> + mutex_unlock(&kvm->lock);
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 55cd897..1dc8566 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>> kvm_ap_build_crycbd(kvm);
>>>
>>> + /* Default setting indicating SIE shall interpret AP instructions */
>>> + kvm->arch.crypto.apie = 1;
>>> +
>>> if (!test_kvm_facility(kvm, 76))
>>> return;
>>>
>>> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>> {
>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>
>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>> + if (vcpu->kvm->arch.crypto.apie &&
>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>
>> Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
>
> sorry, I should have written AP instructions here:
> is this function called if AP instructions are not available in the guest?
>
Yes, this function can be called with AP instructions available to the guest.
Please have a look at patch 2 (kvm_s390_vm_set_crypto and the rest).
Also this function is called on initialization regardless of AP instructions.
>>
>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>> +
>>> +
>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>> return;
>>>
>>
>
On Mon, 16 Apr 2018 15:13:59 +0200
Pierre Morel <[email protected]> wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
> > +* Register with the ap_bus for AP queue devices of type CEX4, CEX5 and
> > + CEX6 and to provide the vfio_ap driver's probe and remove callback interfaces.
>
> I wonder why the type of card has anything to do with this driver.
> It should be transparent, the driver should be able to provide the
> matrix (APM/AQM/ADM)
> independently from the type of card in the slot.
I would also be interested in why this is limited to certain, newer cards.
Did some kind of interface change (I dimly recall something like that),
or are there simply no old systems with those older card types around to
check whether it works?
In either case, a short note would be good (does not need to go into
any details).
On 04/16/2018 03:05 PM, Pierre Morel wrote:
>> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> +
>> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
>> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
>
> This call clears the apie in KVM.
> This is only OK if we have a single device present until the end of the VM,
> otherwise AP instructions in the guest will fail after the release until the end of the VM
> or until a new device is plugged.
I agree, this seems wrong.
On 15/04/2018 23:22, Tony Krowiak wrote:
> If the AP instructions are not available on the linux host, then
> AP devices can not be interpreted by the SIE. The AP bus has a
> function it uses to determine if the AP instructions are
> available. This patch provides a new function that wraps the
> AP bus's function to externalize it for use by KVM.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> Reviewed-by: Pierre Morel <[email protected]>
> Reviewed-by: Harald Freudenberger <[email protected]>
> ---
> arch/s390/include/asm/ap.h | 7 +++++++
> arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
> drivers/s390/crypto/ap_bus.c | 6 ++++++
> 5 files changed, 58 insertions(+), 1 deletions(-)
> create mode 100644 arch/s390/include/asm/kvm-ap.h
> create mode 100644 arch/s390/kvm/kvm-ap.c
>
> diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h
> index c1bedb4..7773bfd 100644
> --- a/arch/s390/include/asm/ap.h
> +++ b/arch/s390/include/asm/ap.h
> @@ -120,4 +120,11 @@ struct ap_queue_status ap_queue_irq_ctrl(ap_qid_t qid,
> struct ap_qirq_ctrl qirqctrl,
> void *ind);
>
> +/**
> + * ap_instructions_installed() - Tests whether AP instructions are installed
> + *
> + * Returns 1 if the AP instructions are installed; otherwise, returns 0
> + */
> +int ap_instructions_installed(void);
> +
> #endif /* _ASM_S390_AP_H_ */
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> new file mode 100644
> index 0000000..84412a9
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2018
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +
> +#ifndef _ASM_KVM_AP
> +#define _ASM_KVM_AP
> +
> +/**
> + * kvm_ap_instructions_installed()
> + *
> + * Tests whether AP instructions are installed on the linux host
> + *
> + * Returns 1 if the AP instructions are installed on the host; otherwise,
> + * returns 0
> + */
> +int kvm_ap_instructions_installed(void);
> +
> +#endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index 05ee90a..1876bfe 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o $(KVM)/irqch
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o kvm-ap.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> new file mode 100644
> index 0000000..1267588
> --- /dev/null
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2018
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +#include <linux/kernel.h>
> +#include <asm/kvm-ap.h>
> +#include <asm/ap.h>
> +
> +int kvm_ap_instructions_installed(void)
> +{
> +#ifdef CONFIG_ZCRYPT
If you do this, take care that ZCRYPT may be built as a module ;)
> + return ap_instructions_installed();
> +#else
> + return 0;
> +#endif
> +}
> +EXPORT_SYMBOL(kvm_ap_instructions_installed);
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 35a0c2b..9d108b6 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
> }
> EXPORT_SYMBOL(ap_query_configuration);
>
> +int ap_instructions_installed(void)
> +{
> + return (ap_instructions_available() == 0);
> +}
> +EXPORT_SYMBOL(ap_instructions_installed);
> +
> /**
> * ap_init_configuration(): Allocate and query configuration array.
> */
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On Tue, 17 Apr 2018 09:49:58 +0200
"Harald Freudenberger" <[email protected]> wrote:
> Didn't we say that when APXA is not available there is no Crypto support
> for KVM ?
[Going by the code, as I don't have access to the architecture]
Current status seems to be:
- setup crycb if facility 76 is available (that's MSAX3, I guess?)
- use format 2 if APXA is available, else use format 1
From Tony's patch description, the goal seems to be:
- setup crycb even if MSAX3 is not available
So my understanding is that we use APXA only to decide on the format of
the crycb, but provide it in any case?
(Not providing a crycb if APXA is not available would be loss of
functionality, I guess? Deciding not to provide vfio-ap if APXA is not
available is a different game, of course.)
On Sun, 15 Apr 2018 17:22:12 -0400
Tony Krowiak <[email protected]> wrote:
> Introduces a new function to reset the crypto attributes for all
> vcpus whether they are running or not. Each vcpu in KVM will
> be removed from SIE prior to resetting the crypto attributes in its
> SIE state description. After all vcpus have had their crypto attributes
> reset the vcpus will be restored to SIE.
>
> This function will be used in a later patch to set the ECA.28
> bit in the SIE state description to enable interpretive execution of
> AP instructions. It will also be incorporated into the
> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
> key wrapping attributes could potentially get out of synch for running
> vcpus.
So, this description leads me to think it would make sense to queue
this patch (fixing the key wrapping) independently of this series,
wouldn't it?
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/kvm/kvm-s390.c | 19 +++++++++++++------
> arch/s390/kvm/kvm-s390.h | 14 ++++++++++++++
> 2 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 64c9862..d0c3518 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -791,11 +791,21 @@ static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *att
>
> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu);
>
> -static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
> +void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm)
_reset_all() or _set_all()? Don't really care much, tbh.
> {
> - struct kvm_vcpu *vcpu;
> int i;
> + struct kvm_vcpu *vcpu;
I'd avoid swapping the order of the declarations.
> +
> + kvm_s390_vcpu_block_all(kvm);
> +
> + kvm_for_each_vcpu(i, vcpu, kvm)
> + kvm_s390_vcpu_crypto_setup(vcpu);
>
> + kvm_s390_vcpu_unblock_all(kvm);
> +}
> +
> +static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
> +{
> if (!test_kvm_facility(kvm, 76))
> return -EINVAL;
>
> @@ -832,10 +842,7 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
> return -ENXIO;
> }
>
> - kvm_for_each_vcpu(i, vcpu, kvm) {
> - kvm_s390_vcpu_crypto_setup(vcpu);
> - exit_sie(vcpu);
> - }
> + kvm_s390_vcpu_crypto_reset_all(kvm);
> mutex_unlock(&kvm->lock);
> return 0;
> }
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index 1b5621f..76324b7 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -410,4 +410,18 @@ static inline int kvm_s390_use_sca_entries(void)
> }
> void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
> struct mcck_volatile_info *mcck_info);
> +
> +/**
> + * kvm_s390_vcpu_crypto_reset_all
> + *
> + * Reset the crypto attributes for each vcpu. This can be done while the vcpus
> + * are running as each vcpu will be removed from SIE before resetting the crypto
> + * attributes and restored to SIE afterward.
> + *
> + * Note: The kvm->lock mutex must be locked prior to calling this function and
> + * unlocked after it returns.
"Must be called with kvm->lock held"?
> + *
> + * @kvm: the KVM guest
> + */
> +void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
> #endif
Other than the nits above, looks good to me.
On 17/04/2018 09:01, Harald Freudenberger wrote:
> Hi Pierre
>
> The AP bus can no longer get compiled as a module. There is a (unbeautiful)
> trick done in the Makefile:
>
> ...
> ap-objs := ap_bus.o ap_card.o ap_queue.o
> obj-$(subst m,y,$(CONFIG_ZCRYPT)) += ap.o
> # zcrypt_api.o and zcrypt_msgtype*.o depend on ap.o
> ...
>
> which makes sure there is either no AP support in the kernel or it is
> always static.
Hi,
The AP bus cannot be compiled as a kernel module, but we can still
set ZCRYPT=m in the configuration. In that case the preprocessor
symbol that gets defined is CONFIG_ZCRYPT_MODULE, not
CONFIG_ZCRYPT.
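To make the pitfall concrete, here is a minimal userspace sketch of why a plain #ifdef misses the =m case. It assumes a simplified re-creation of the idea behind the kernel's IS_ENABLED() macro; the SKETCH_* helper names and the two functions are made up for illustration and are not the kernel's actual definitions.

```c
/*
 * Simplified userspace re-creation of the idea behind IS_ENABLED().
 * All SKETCH_* names are hypothetical helpers for this example.
 */

/* Simulate a .config with CONFIG_ZCRYPT=m: only the _MODULE symbol exists. */
#define CONFIG_ZCRYPT_MODULE 1

/* Expands to "0," only when pasted onto a symbol that is defined as 1. */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND_ARG(ignored, val, ...) val
#define SKETCH_IS_DEFINED(x) SKETCH_IS_DEFINED_(x)
#define SKETCH_IS_DEFINED_(val) SKETCH_IS_DEFINED__(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED__(arg1_or_junk) \
	SKETCH_SECOND_ARG(arg1_or_junk 1, 0, 0)

#define SKETCH_IS_BUILTIN(opt) SKETCH_IS_DEFINED(opt)
#define SKETCH_IS_MODULE(opt)  SKETCH_IS_DEFINED(opt##_MODULE)
#define SKETCH_IS_ENABLED(opt) \
	(SKETCH_IS_BUILTIN(opt) || SKETCH_IS_MODULE(opt))

/* Mirrors the #ifdef used in the patch. */
int guarded_by_ifdef(void)
{
#ifdef CONFIG_ZCRYPT
	return 1;
#else
	return 0;	/* taken for =m: CONFIG_ZCRYPT itself is undefined */
#endif
}

/* Sees both the built-in (=y) and the module (=m) setting. */
int guarded_by_is_enabled(void)
{
	return SKETCH_IS_ENABLED(CONFIG_ZCRYPT);
}
```

With the simulated =m configuration, guarded_by_ifdef() takes the #else branch while guarded_by_is_enabled() reports the option as enabled. Since the Makefile trick quoted below forces ap.o to be built in whenever ZCRYPT is y or m, a check of this second kind would presumably match the actual availability of the AP bus symbols.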
Regards,
Pierre
>
> Mit freundlichen Grüßen / Kind regards
>
> Harald Freudenberger
>
> From: Pierre Morel <[email protected]>
> To: Tony Krowiak <[email protected]>,
> [email protected], [email protected],
> [email protected]
> Cc: Harald Freudenberger/Germany/IBM@IBMDE,
> [email protected], [email protected],
> [email protected], [email protected],
> [email protected], [email protected],
> [email protected], [email protected],
> [email protected], [email protected],
> [email protected], [email protected],
> [email protected], [email protected],
> [email protected], Reinhard Buendgen/Germany/IBM@IBMDE
> Date: 16.04.2018 17:59
> Subject: Re: [PATCH v4 01/15] s390: zcrypt: externalize AP instructions
> available function
>
>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/16/2018 08:11 AM, Cornelia Huck wrote:
> On Mon, 16 Apr 2018 10:44:53 +0200
> Pierre Morel <[email protected]> wrote:
>
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> If the AP instructions are not available on the linux host, then
>>> AP devices can not be interpreted by the SIE. The AP bus has a
>>> function it uses to determine if the AP instructions are
>>> available. This patch provides a new function that wraps the
>>> AP bus's function to externalize it for use by KVM.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> Reviewed-by: Pierre Morel <[email protected]>
>>> Reviewed-by: Harald Freudenberger <[email protected]>
>>> ---
>>> arch/s390/include/asm/ap.h | 7 +++++++
>>> arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
>>> arch/s390/kvm/Makefile | 2 +-
>>> arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
>>> drivers/s390/crypto/ap_bus.c | 6 ++++++
>>> 5 files changed, 58 insertions(+), 1 deletions(-)
>>> create mode 100644 arch/s390/include/asm/kvm-ap.h
>>> create mode 100644 arch/s390/kvm/kvm-ap.c
>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>> new file mode 100644
>>> index 0000000..1267588
>>> --- /dev/null
>>> +++ b/arch/s390/kvm/kvm-ap.c
>>> @@ -0,0 +1,21 @@
>>> +// SPDX-License-Identifier: GPL-2.0+
>>> +/*
>>> + * Adjunct Processor (AP) configuration management for KVM guests
>>> + *
>>> + * Copyright IBM Corp. 2018
>>> + *
>>> + * Author(s): Tony Krowiak <[email protected]>
>>> + */
>>> +#include <linux/kernel.h>
>>> +#include <asm/kvm-ap.h>
>>> +#include <asm/ap.h>
>>> +
>>> +int kvm_ap_instructions_installed(void)
>>> +{
>>> +#ifdef CONFIG_ZCRYPT
>> I did not give my R-B for this.
>> please change it or suppress my R-B
>>
>> I think you should review the way you wrap functions
>> calling the AP interface.
>> Having all of them together would simplify code and review.
> I don't like the ifdeffery either (especially as there's more later).
I'm not crazy about it myself (see below)
>
> Consolidating all functions for querying basic ap capabilities sounds
> like a good idea. What about collecting them in a ap-util file and
> either always building it or selecting it from both zcrypt and kvm?
My preference would be one of the following:
1. All of the interfaces declared in arch/s390/include/asm/ap.h
are implemented in a file that is built whether or not ZCRYPT
is built.
2. The drivers/s390/crypto/ap_asm.h file, which contains the functions
that execute the AP instructions, is made available outside of
the AP bus, for example in arch/s390/include/asm.
I requested this from the maintainer but was told we don't want to
have any crypto adapter support when the host AP functionality is
disabled (CONFIG_ZCRYPT=n). This makes sense; however, I think it is
a bit confusing to have a header file (arch/s390/include/asm/ap.h)
with interfaces that are conditionally built.
This is why I chose the ifdeffery (as you call it) approach. The
only other solution I can think of is to duplicate the asm code for
the AP instructions needed in KVM and bypass the AP bus
interfaces.
>
>>> + return ap_instructions_installed();
>>> +#else
>>> + return 0;
>>> +#endif
>>> +}
>>> +EXPORT_SYMBOL(kvm_ap_instructions_installed);
>>> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
>>> index 35a0c2b..9d108b6 100644
>>> --- a/drivers/s390/crypto/ap_bus.c
>>> +++ b/drivers/s390/crypto/ap_bus.c
>>> @@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
>>> }
>>> EXPORT_SYMBOL(ap_query_configuration);
>>>
>>> +int ap_instructions_installed(void)
>>> +{
>>> + return (ap_instructions_available() == 0);
>>> +}
>>> +EXPORT_SYMBOL(ap_instructions_installed);
>>> +
>>> /**
>>> * ap_init_configuration(): Allocate and query configuration array.
>>> */
>>
On 04/17/2018 07:34 AM, Cornelia Huck wrote:
> On Sun, 15 Apr 2018 17:22:12 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Introduces a new function to reset the crypto attributes for all
>> vcpus whether they are running or not. Each vcpu in KVM will
>> be removed from SIE prior to resetting the crypto attributes in its
>> SIE state description. After all vcpus have had their crypto attributes
>> reset the vcpus will be restored to SIE.
>>
>> This function will be used in a later patch to set the ECA.28
>> bit in the SIE state description to enable interpretive execution of
>> AP instructions. It will also be incorporated into the
>> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
>> key wrapping attributes could potentially get out of synch for running
>> vcpus.
> So, this description leads me to think it would make sense to queue
> this patch (fixing the key wrapping) independently of this series,
> wouldn't it?
I considered that because I figured there might be objections, but
since separating them would create dependency issues I didn't see
any harm in including it here. I can remove this from the explanation
above and the code below and create a separate patch for the key
wrapping if you'd prefer.
>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/kvm/kvm-s390.c | 19 +++++++++++++------
>> arch/s390/kvm/kvm-s390.h | 14 ++++++++++++++
>> 2 files changed, 27 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 64c9862..d0c3518 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -791,11 +791,21 @@ static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *att
>>
>> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu);
>>
>> -static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>> +void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm)
> _reset_all() or _set_all()? Don't really care much, tbh.
Then why bring it up? :) I chose _reset_all because in both places from which
this is called, we are changing a crypto attribute value and are thus
resetting the crypto settings for all the vcpus.
>
>> {
>> - struct kvm_vcpu *vcpu;
>> int i;
>> + struct kvm_vcpu *vcpu;
> I'd avoid swapping the order of the declarations.
This was unintentional, I can revert it.
>
>> +
>> + kvm_s390_vcpu_block_all(kvm);
>> +
>> + kvm_for_each_vcpu(i, vcpu, kvm)
>> + kvm_s390_vcpu_crypto_setup(vcpu);
>>
>> + kvm_s390_vcpu_unblock_all(kvm);
>> +}
>> +
>> +static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> if (!test_kvm_facility(kvm, 76))
>> return -EINVAL;
>>
>> @@ -832,10 +842,7 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>> return -ENXIO;
>> }
>>
>> - kvm_for_each_vcpu(i, vcpu, kvm) {
>> - kvm_s390_vcpu_crypto_setup(vcpu);
>> - exit_sie(vcpu);
>> - }
>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>> mutex_unlock(&kvm->lock);
>> return 0;
>> }
>> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
>> index 1b5621f..76324b7 100644
>> --- a/arch/s390/kvm/kvm-s390.h
>> +++ b/arch/s390/kvm/kvm-s390.h
>> @@ -410,4 +410,18 @@ static inline int kvm_s390_use_sca_entries(void)
>> }
>> void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
>> struct mcck_volatile_info *mcck_info);
>> +
>> +/**
>> + * kvm_s390_vcpu_crypto_reset_all
>> + *
>> + * Reset the crypto attributes for each vcpu. This can be done while the vcpus
>> + * are running as each vcpu will be removed from SIE before resetting the crypto
>> + * attributes and restored to SIE afterward.
>> + *
>> + * Note: The kvm->lock mutex must be locked prior to calling this function and
>> + * unlocked after it returns.
> "Must be called with kvm->lock held"?
Yes. The kvm->lock must be held to set the crypto attributes that will be
copied to the vcpus via the kvm_s390_vcpu_crypto_reset_all() function,
so it made sense to hold the lock across the entire operation.
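As a rough illustration of the ordering discussed here (take the lock, block every vcpu, copy the VM-wide attributes, unblock), the following toy model may help. Everything in it is hypothetical: the toy_* types stand in for struct kvm and struct kvm_vcpu, the lock_held flag stands in for kvm->lock, and the explicit lock check is only a teaching device, since the real function returns void and relies on the caller's contract.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical stand-in for a vcpu and its SIE state description. */
struct toy_vcpu {
	bool blocked;	/* true while the vcpu is kept out of "SIE" */
	int aes_kw;	/* per-vcpu copy of the VM-wide attribute */
};

/* Hypothetical stand-in for struct kvm. */
struct toy_vm {
	bool lock_held;	/* stands in for kvm->lock */
	int aes_kw;	/* VM-wide attribute set by the caller */
	size_t nr_vcpus;
	struct toy_vcpu vcpus[TOY_MAX_VCPUS];
};

/* Returns 0 on success, -1 if the caller did not take the lock first. */
int toy_crypto_reset_all(struct toy_vm *vm)
{
	size_t i;

	if (!vm->lock_held)
		return -1;

	/* 1. Block all vcpus so none can observe a half-updated state. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = true;

	/* 2. Copy the VM-wide attribute into every vcpu. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].aes_kw = vm->aes_kw;

	/* 3. Let the vcpus run again. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = false;

	return 0;
}
```

The point of the model is only the ordering: the VM-wide value is changed and propagated under one lock, so a vcpu never runs with attributes that differ from the VM's.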
>
>> + *
>> + * @kvm: the KVM guest
>> + */
>> +void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
>> #endif
> Other than the nits above, looks good to me.
Great!
>
On Tue, 17 Apr 2018 09:47:58 -0400
Tony Krowiak <[email protected]> wrote:
> On 04/17/2018 07:34 AM, Cornelia Huck wrote:
> > On Sun, 15 Apr 2018 17:22:12 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Introduces a new function to reset the crypto attributes for all
> >> vcpus whether they are running or not. Each vcpu in KVM will
> >> be removed from SIE prior to resetting the crypto attributes in its
> >> SIE state description. After all vcpus have had their crypto attributes
> >> reset the vcpus will be restored to SIE.
> >>
> >> This function will be used in a later patch to set the ECA.28
> >> bit in the SIE state description to enable interpretive execution of
> >> AP instructions. It will also be incorporated into the
> >> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
> >> key wrapping attributes could potentially get out of synch for running
> >> vcpus.
> > So, this description leads me to think it would make sense to queue
> > this patch (fixing the key wrapping) independently of this series,
> > wouldn't it?
> I considered that because I figured there might be objections, but
> since separating them would create dependency issues I didn't see
> any harm in including it here. I can remove this from the explanation
> above and the code below and create a separate patch for the key
> wrapping if you'd prefer.
Well, I think this makes sense as an individual patch, but I'll leave
that to the maintainers to decide.
> >
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> arch/s390/kvm/kvm-s390.c | 19 +++++++++++++------
> >> arch/s390/kvm/kvm-s390.h | 14 ++++++++++++++
> >> 2 files changed, 27 insertions(+), 6 deletions(-)
> >> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> >> index 1b5621f..76324b7 100644
> >> --- a/arch/s390/kvm/kvm-s390.h
> >> +++ b/arch/s390/kvm/kvm-s390.h
> >> @@ -410,4 +410,18 @@ static inline int kvm_s390_use_sca_entries(void)
> >> }
> >> void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
> >> struct mcck_volatile_info *mcck_info);
> >> +
> >> +/**
> >> + * kvm_s390_vcpu_crypto_reset_all
> >> + *
> >> + * Reset the crypto attributes for each vcpu. This can be done while the vcpus
> >> + * are running as each vcpu will be removed from SIE before resetting the crypto
> >> + * attributes and restored to SIE afterward.
> >> + *
> >> + * Note: The kvm->lock mutex must be locked prior to calling this function and
> >> + * unlocked after it returns.
> > "Must be called with kvm->lock held"?
> Yes. The kvm->lock must be held to set the crypto attributes that will be
> copied to the vcpus via the kvm_s390_vcpu_crypto_reset_all() function,
> so it made sense to hold the lock across the entire operation.
This was intended as a suggestion for a more compact usage note :)
On 04/16/2018 04:56 AM, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> This patch refactors the code that initializes the crypto
>> configuration for a guest. The crypto configuration is contained in
>> a crypto control block (CRYCB) which is a satellite control block to
>> our main hardware virtualization control block. The CRYCB is
>> attached to the main virtualization control block via a CRYCB
>> designation (CRYCBD) designation field containing the address of
>> the CRYCB as well as its format.
>>
>> Prior to the introduction of AP device virtualization, there was
>> no need to provide access to or specify the format of the CRYCB for
>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>> on the host system. With the introduction of AP device virtualization,
>> the CRYCB and its format must be made accessible to the guest
>> regardless of the presence of the MSAX3 facility.
>>
>> The crypto initialization code is restructured as follows:
>>
>> * A new compilation unit is introduced to contain all interfaces
>> and data structures related to configuring a guest's CRYCB for
>> both the refactoring of crypto initialization as well as all
>> subsequent patches introducing AP virtualization support.
>>
>> * Currently, the asm code for querying the AP configuration is
>> duplicated in the AP bus as well as in KVM. Since the KVM
>> code was introduced, the AP bus has externalized the interface
>> for querying the AP configuration. The KVM interface will be
>> replaced with a call to the AP bus interface. Of course, this
>> will be moved to the new compilation unit mentioned above.
>>
>> * An interface to format the CRYCBD field will be provided via
>> the new compilation unit and called from the KVM vm
>> initialization.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/include/asm/kvm-ap.h | 15 +++++++++
>> arch/s390/include/asm/kvm_host.h | 1 +
>> arch/s390/kvm/kvm-ap.c | 39 ++++++++++++++++++++++++
>> arch/s390/kvm/kvm-s390.c | 60
>> ++++----------------------------------
>> 4 files changed, 61 insertions(+), 54 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm-ap.h
>> b/arch/s390/include/asm/kvm-ap.h
>> index 84412a9..736e93e 100644
>> --- a/arch/s390/include/asm/kvm-ap.h
>> +++ b/arch/s390/include/asm/kvm-ap.h
>> @@ -10,6 +10,9 @@
>> #ifndef _ASM_KVM_AP
>> #define _ASM_KVM_AP
>>
>> +#include <linux/types.h>
>> +#include <linux/kvm_host.h>
>> +
>> /**
>> * kvm_ap_instructions_installed()
>> *
>> @@ -20,4 +23,16 @@
>> */
>> int kvm_ap_instructions_installed(void);
>>
>> +/**
>> + * kvm_ap_build_crycbd
>> + *
>> + * The crypto control block designation (CRYCBD) is a 32-bit field that
>> + * designates both the host real address and format of the CRYCB.
>> This function
>> + * builds the CRYCBD field for use by the KVM guest.
>> + *
>> + * @kvm: the KVM guest
>> + * @crycbd: reference to the CRYCBD
>> + */
>> +void kvm_ap_build_crycbd(struct kvm *kvm);
>> +
>> #endif /* _ASM_KVM_AP */
>> diff --git a/arch/s390/include/asm/kvm_host.h
>> b/arch/s390/include/asm/kvm_host.h
>> index 81cdb6b..c990a1d 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
>> __u8 reservedf0[12]; /* 0x00f0 */
>> #define CRYCB_FORMAT1 0x00000001
>> #define CRYCB_FORMAT2 0x00000003
>> +#define CRYCB_FORMAT_MASK 0x00000003
>> __u32 crycbd; /* 0x00fc */
>> __u64 gcr[16]; /* 0x0100 */
>> __u64 gbea; /* 0x0180 */
>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>> index 1267588..991bae4 100644
>> --- a/arch/s390/kvm/kvm-ap.c
>> +++ b/arch/s390/kvm/kvm-ap.c
>> @@ -10,6 +10,8 @@
>> #include <asm/kvm-ap.h>
>> #include <asm/ap.h>
>>
>> +#include "kvm-s390.h"
>> +
>> int kvm_ap_instructions_installed(void)
>> {
>> #ifdef CONFIG_ZCRYPT
>> @@ -19,3 +21,40 @@ int kvm_ap_instructions_installed(void)
>> #endif
>> }
>> EXPORT_SYMBOL(kvm_ap_instructions_installed);
>> +
>> +static inline int kvm_ap_query_config(struct ap_config_info *config)
>> +{
>> + memset(config, 0, sizeof(*config));
>> +
>> +#ifdef CONFIG_ZCRYPT
>
> I would prefer that you define the interface in an include file
> with stubs for the case ZCRYPT is not set.
This is a static function only called internally, but I suppose there is
no harm in defining it as an interface in kvm-ap.h ... it may come
in handy down the road.
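For reference, the stub approach Pierre suggests could look roughly like the sketch below. It is a self-contained userspace illustration, not the proposed kernel code: SKETCH_AP_BUS_BUILT stands in for the real Kconfig test, and struct sketch_ap_config_info and the sketch_* functions are hypothetical stand-ins for struct ap_config_info, ap_query_configuration() and kvm_ap_apxa_installed().

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the real struct ap_config_info. */
struct sketch_ap_config_info {
	unsigned int apxa;
};

/* Set to 1 to model a kernel built with AP bus support. */
#define SKETCH_AP_BUS_BUILT 0

#if SKETCH_AP_BUS_BUILT
/* Real implementation would live in the AP bus code. */
int sketch_ap_query_configuration(struct sketch_ap_config_info *info);
#else
/* Stub: callers compile unchanged and simply see "not supported". */
static inline int
sketch_ap_query_configuration(struct sketch_ap_config_info *info)
{
	(void)info;
	return -EOPNOTSUPP;
}
#endif

/* The caller needs no #ifdef of its own. */
int sketch_apxa_installed(void)
{
	struct sketch_ap_config_info config;

	memset(&config, 0, sizeof(config));
	if (sketch_ap_query_configuration(&config) == 0)
		return config.apxa == 1;

	return 0;
}
```

The design choice is that the conditional lives in exactly one place, the header, and every .c file calling the interface stays free of #ifdef blocks.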
>
>
>> + if (kvm_ap_instructions_installed())
>> + return ap_query_configuration(config);
>> +#endif
>> +
>> + return -EOPNOTSUPP;
>> +}
>> +
>> +static int kvm_ap_apxa_installed(void)
>> +{
>> + struct ap_config_info config;
>> +
>> + if (kvm_ap_query_config(&config) == 0)
>> + return (config.apxa == 1);
>> +
>> + return 0;
>> +}
>> +
>> +void kvm_ap_build_crycbd(struct kvm *kvm)
>> +{
>> + kvm->arch.crypto.crycbd = (__u32)(unsigned long)
>> kvm->arch.crypto.crycb;
>> + kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
>> +
>> + /* check whether MSAX3 is installed */
>
> It means we do not support AP virtualization without MSA3.
> It follows we do not support CRYCB_FORMAT0
If MSAX3 is not installed, that means there is no key wrapping support,
hence CRYCB_FORMAT0. The CRYCB_FORMAT1 and CRYCB_FORMAT2 CRYCBs
both include wrapping key masks. I don't follow your logic here.
>
>
> It is different from what you explain in the comment.
How is it different? Above, we are setting the CRYCBD value regardless
of whether MSAX3 is installed or not. Previously, the CRYCBD value
was set only if MSAX3 is installed (see comments below)
>
>
>> + if (kvm_ap_instructions_installed() && test_kvm_facility(kvm,
>> 76)) {
>> + if (kvm_ap_apxa_installed())
>> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
>> + else
>> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
>> + }
>> +}
>> +EXPORT_SYMBOL(kvm_ap_build_crycbd);
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index d0c3518..b47ff11 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -40,6 +40,7 @@
>> #include <asm/sclp.h>
>> #include <asm/cpacf.h>
>> #include <asm/timex.h>
>> +#include <asm/kvm-ap.h>
>> #include "kvm-s390.h"
>> #include "gaccess.h"
>>
>> @@ -1881,55 +1882,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
>> return r;
>> }
>>
>> -static int kvm_s390_query_ap_config(u8 *config)
>> -{
>> - u32 fcn_code = 0x04000000UL;
>> - u32 cc = 0;
>> -
>> - memset(config, 0, 128);
>> - asm volatile(
>> - "lgr 0,%1\n"
>> - "lgr 2,%2\n"
>> - ".long 0xb2af0000\n" /* PQAP(QCI) */
>> - "0: ipm %0\n"
>> - "srl %0,28\n"
>> - "1:\n"
>> - EX_TABLE(0b, 1b)
>> - : "+r" (cc)
>> - : "r" (fcn_code), "r" (config)
>> - : "cc", "0", "2", "memory"
>> - );
>> -
>> - return cc;
>> -}
>> -
>> -static int kvm_s390_apxa_installed(void)
>> -{
>> - u8 config[128];
>> - int cc;
>> -
>> - if (test_facility(12)) {
>> - cc = kvm_s390_query_ap_config(config);
>> -
>> - if (cc)
>> - pr_err("PQAP(QCI) failed with cc=%d", cc);
>> - else
>> - return config[0] & 0x40;
>> - }
>> -
>> - return 0;
>> -}
>> -
>> -static void kvm_s390_set_crycb_format(struct kvm *kvm)
>> -{
>> - kvm->arch.crypto.crycbd = (__u32)(unsigned long)
>> kvm->arch.crypto.crycb;
>> -
>> - if (kvm_s390_apxa_installed())
>> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
>> - else
>> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
>> -}
>> -
>> static u64 kvm_s390_get_initial_cpuid(void)
>> {
>> struct cpuid cpuid;
>> @@ -1941,12 +1893,12 @@ static u64 kvm_s390_get_initial_cpuid(void)
>>
>> static void kvm_s390_crypto_init(struct kvm *kvm)
>> {
>> + kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>> + kvm_ap_build_crycbd(kvm);
>> +
Notice the call to kvm_ap_build_crycbd(kvm) above was added, so
the CRYCBD is being set regardless of the presence of MSAX3.
>> if (!test_kvm_facility(kvm, 76))
>> return;
>>
>> - kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>> - kvm_s390_set_crycb_format(kvm);
Notice that the removed code that set the CRYCBD was called
only if MSAX3 is installed; see the if statement
immediately preceding the two removed statements above.
>> -
>> /* Enable AES/DEA protected key functions by default */
>> kvm->arch.crypto.aes_kw = 1;
>> kvm->arch.crypto.dea_kw = 1;
>> @@ -2475,6 +2427,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu
>> *vcpu)
>>
>> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>> {
>> + vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>> +
>> if (!test_kvm_facility(vcpu->kvm, 76))
>> return;
>>
>> @@ -2484,8 +2438,6 @@ static void kvm_s390_vcpu_crypto_setup(struct
>> kvm_vcpu *vcpu)
>> vcpu->arch.sie_block->ecb3 |= ECB3_AES;
>> if (vcpu->kvm->arch.crypto.dea_kw)
>> vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
>> -
>> - vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>> }
>>
>> void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
>
>
On 04/17/2018 06:10 AM, Cornelia Huck wrote:
> On Tue, 17 Apr 2018 09:49:58 +0200
> "Harald Freudenberger" <[email protected]> wrote:
>
>> Didn't we say that when APXA is not available there is no Crypto support
>> for KVM ?
> [Going by the code, as I don't have access to the architecture]
>
> Current status seems to be:
> - setup crycb if facility 76 is available (that's MSAX3, I guess?)
The crycb is set up regardless of whether STFLE.76 (MSAX3) is
installed or not.
> - use format 2 if APXA is available, else use format 1
Use format 0 if MSAX3 is not available
Use format 1 if MSAX3 is available but APXA is not
Use format 2 if MSAX3 and APXA are available
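As a rough, standalone illustration of the three rules above (this is
not the kernel code): the helper name and its boolean parameters are
made up for the sketch, and CRYCB_FORMAT0 is an assumed name for "no
format bits set". The FORMAT1, FORMAT2 and mask values are the ones
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* FORMAT1, FORMAT2 and the mask use the values from the patch.
 * CRYCB_FORMAT0 is an assumed name for "no format bits set". */
#define CRYCB_FORMAT0     0x00000000u
#define CRYCB_FORMAT1     0x00000001u
#define CRYCB_FORMAT2     0x00000003u
#define CRYCB_FORMAT_MASK 0x00000003u

/* Hypothetical helper: combine a CRYCB address with the format bits. */
uint32_t sketch_build_crycbd(uint32_t crycb_addr, bool msax3, bool apxa)
{
	/* Start from the address with the format bits cleared. */
	uint32_t crycbd = crycb_addr & ~CRYCB_FORMAT_MASK;

	assert((crycbd & CRYCB_FORMAT_MASK) == 0);

	if (!msax3)
		return crycbd | CRYCB_FORMAT0;  /* no MSAX3: format 0 */
	if (apxa)
		return crycbd | CRYCB_FORMAT2;  /* MSAX3 and APXA: format 2 */
	return crycbd | CRYCB_FORMAT1;          /* MSAX3 only: format 1 */
}
```

In this sketch APXA only matters once MSAX3 is present, which matches
the rules as listed.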
>
> From Tony's patch description, the goal seems to be:
> - setup crycb even if MSAX3 is not available
Yes, that is true
>
> So my understanding is that we use APXA only to decide on the format of
> the crycb, but provide it in any case?
Yes, that is true
>
> (Not providing a crycb if APXA is not available would be loss of
> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
> available is a different game, of course.)
This would require a change to enabling the CPU model feature for
AP.
>
On 04/15/2018 11:22 PM, Tony Krowiak wrote:
> Introduces a new function to reset the crypto attributes for all
> vcpus whether they are running or not. Each vcpu in KVM will
> be removed from SIE prior to resetting the crypto attributes in its
> SIE state description. After all vcpus have had their crypto attributes
> reset the vcpus will be restored to SIE.
>
> This function will be used in a later patch to set the ECA.28
> bit in the SIE state description to enable interpretive execution of
> AP instructions. It will also be incorporated into the
> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
> key wrapping attributes could potentially get out of synch for running
> vcpus.
>
Wasn't this 'issue' reported by me by any chance?
I agree with Connie, we don't need the forward reference to
ECA.28.
Regards,
Halil
> Signed-off-by: Tony Krowiak<[email protected]>
On 04/15/2018 05:22 PM, Tony Krowiak wrote:
> This patch refactors the code that initializes the crypto
> configuration for a guest. The crypto configuration is contained in
> a crypto control block (CRYCB) which is a satellite control block to
> our main hardware virtualization control block. The CRYCB is
> attached to the main virtualization control block via a CRYCB
> designation (CRYCBD) field containing the address of
> the CRYCB as well as its format.
>
> Prior to the introduction of AP device virtualization, there was
> no need to provide access to or specify the format of the CRYCB for
> a guest unless the MSA extension 3 (MSAX3) facility was installed
> on the host system. With the introduction of AP device virtualization,
> the CRYCB and its format must be made accessible to the guest
> regardless of the presence of the MSAX3 facility.
>
> The crypto initialization code is restructured as follows:
>
> * A new compilation unit is introduced to contain all interfaces
> and data structures related to configuring a guest's CRYCB for
> both the refactoring of crypto initialization as well as all
> subsequent patches introducing AP virtualization support.
>
> * Currently, the asm code for querying the AP configuration is
> duplicated in the AP bus as well as in KVM. Since the KVM
> code was introduced, the AP bus has externalized the interface
> for querying the AP configuration. The KVM interface will be
> replaced with a call to the AP bus interface. Of course, this
> will be moved to the new compilation unit mentioned above.
>
> * An interface to format the CRYCBD field will be provided via
> the new compilation unit and called from the KVM vm
> initialization.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm-ap.h | 15 +++++++++
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/kvm/kvm-ap.c | 39 ++++++++++++++++++++++++
> arch/s390/kvm/kvm-s390.c | 60 ++++----------------------------------
> 4 files changed, 61 insertions(+), 54 deletions(-)
>
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> index 84412a9..736e93e 100644
> --- a/arch/s390/include/asm/kvm-ap.h
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -10,6 +10,9 @@
> #ifndef _ASM_KVM_AP
> #define _ASM_KVM_AP
>
> +#include <linux/types.h>
> +#include <linux/kvm_host.h>
> +
> /**
> * kvm_ap_instructions_installed()
> *
> @@ -20,4 +23,16 @@
> */
> int kvm_ap_instructions_installed(void);
>
> +/**
> + * kvm_ap_build_crycbd
> + *
> + * The crypto control block designation (CRYCBD) is a 32-bit field that
> + * designates both the host real address and format of the CRYCB. This function
> + * builds the CRYCBD field for use by the KVM guest.
> + *
> + * @kvm: the KVM guest
> + * @crycbd: reference to the CRYCBD
> + */
> +void kvm_ap_build_crycbd(struct kvm *kvm);
> +
> #endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 81cdb6b..c990a1d 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
> __u8 reservedf0[12]; /* 0x00f0 */
> #define CRYCB_FORMAT1 0x00000001
> #define CRYCB_FORMAT2 0x00000003
> +#define CRYCB_FORMAT_MASK 0x00000003
> __u32 crycbd; /* 0x00fc */
> __u64 gcr[16]; /* 0x0100 */
> __u64 gbea; /* 0x0180 */
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> index 1267588..991bae4 100644
> --- a/arch/s390/kvm/kvm-ap.c
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -10,6 +10,8 @@
> #include <asm/kvm-ap.h>
> #include <asm/ap.h>
>
> +#include "kvm-s390.h"
> +
> int kvm_ap_instructions_installed(void)
> {
> #ifdef CONFIG_ZCRYPT
> @@ -19,3 +21,40 @@ int kvm_ap_instructions_installed(void)
> #endif
> }
> EXPORT_SYMBOL(kvm_ap_instructions_installed);
> +
> +static inline int kvm_ap_query_config(struct ap_config_info *config)
> +{
> + memset(config, 0, sizeof(*config));
> +
> +#ifdef CONFIG_ZCRYPT
> + if (kvm_ap_instructions_installed())
> + return ap_query_configuration(config);
> +#endif
> +
> + return -EOPNOTSUPP;
> +}
> +
> +static int kvm_ap_apxa_installed(void)
> +{
> + struct ap_config_info config;
> +
> + if (kvm_ap_query_config(&config) == 0)
> + return (config.apxa == 1);
> +
> + return 0;
> +}
> +
> +void kvm_ap_build_crycbd(struct kvm *kvm)
> +{
> + kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
> + kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
Now that I look at this again, I think the check for
kvm_ap_instructions_installed() needs to be at the beginning of this
function. If the AP instructions are not installed, then we probably
shouldn't be making a CRYCB available to the guest.
> +
> + /* check whether MSAX3 is installed */
> + if (kvm_ap_instructions_installed() && test_kvm_facility(kvm, 76)) {
> + if (kvm_ap_apxa_installed())
> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
> + else
> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
> + }
> +}
> +EXPORT_SYMBOL(kvm_ap_build_crycbd);
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index d0c3518..b47ff11 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -40,6 +40,7 @@
> #include <asm/sclp.h>
> #include <asm/cpacf.h>
> #include <asm/timex.h>
> +#include <asm/kvm-ap.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
>
> @@ -1881,55 +1882,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
> return r;
> }
>
> -static int kvm_s390_query_ap_config(u8 *config)
> -{
> - u32 fcn_code = 0x04000000UL;
> - u32 cc = 0;
> -
> - memset(config, 0, 128);
> - asm volatile(
> - "lgr 0,%1\n"
> - "lgr 2,%2\n"
> - ".long 0xb2af0000\n" /* PQAP(QCI) */
> - "0: ipm %0\n"
> - "srl %0,28\n"
> - "1:\n"
> - EX_TABLE(0b, 1b)
> - : "+r" (cc)
> - : "r" (fcn_code), "r" (config)
> - : "cc", "0", "2", "memory"
> - );
> -
> - return cc;
> -}
> -
> -static int kvm_s390_apxa_installed(void)
> -{
> - u8 config[128];
> - int cc;
> -
> - if (test_facility(12)) {
> - cc = kvm_s390_query_ap_config(config);
> -
> - if (cc)
> - pr_err("PQAP(QCI) failed with cc=%d", cc);
> - else
> - return config[0] & 0x40;
> - }
> -
> - return 0;
> -}
> -
> -static void kvm_s390_set_crycb_format(struct kvm *kvm)
> -{
> - kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
> -
> - if (kvm_s390_apxa_installed())
> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
> - else
> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
> -}
> -
> static u64 kvm_s390_get_initial_cpuid(void)
> {
> struct cpuid cpuid;
> @@ -1941,12 +1893,12 @@ static u64 kvm_s390_get_initial_cpuid(void)
>
> static void kvm_s390_crypto_init(struct kvm *kvm)
> {
> + kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
> + kvm_ap_build_crycbd(kvm);
> +
> if (!test_kvm_facility(kvm, 76))
> return;
>
> - kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
> - kvm_s390_set_crycb_format(kvm);
> -
> /* Enable AES/DEA protected key functions by default */
> kvm->arch.crypto.aes_kw = 1;
> kvm->arch.crypto.dea_kw = 1;
> @@ -2475,6 +2427,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>
> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
> {
> + vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
> +
> if (!test_kvm_facility(vcpu->kvm, 76))
> return;
>
> @@ -2484,8 +2438,6 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
> vcpu->arch.sie_block->ecb3 |= ECB3_AES;
> if (vcpu->kvm->arch.crypto.dea_kw)
> vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
> -
> - vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
> }
>
> void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
On 04/17/2018 10:29 AM, Halil Pasic wrote:
>
>
> On 04/15/2018 11:22 PM, Tony Krowiak wrote:
>> Introduces a new function to reset the crypto attributes for all
>> vcpus whether they are running or not. Each vcpu in KVM will
>> be removed from SIE prior to resetting the crypto attributes in its
>> SIE state description. After all vcpus have had their crypto attributes
>> reset the vcpus will be restored to SIE.
>>
>> This function will be used in a later patch to set the ECA.28
>> bit in the SIE state description to enable interpretive execution of
>> AP instructions. It will also be incorporated into the
>> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
>> key wrapping attributes could potentially get out of synch for running
>> vcpus.
>>
>
> Wasn't this 'issue' reported by me by any chance?
Yes it was .... was I supposed to include that fact in the commit message?
>
> I agree with Connie, we don't need the forward reference to
> ECA.28.
I'm not sure that's exactly what she said, but I'd be more than happy
to remove it.
>
>
> Regards,
> Halil
>
>> Signed-off-by: Tony Krowiak<[email protected]>
>
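For reference, the reset sequence described in the commit message
quoted above could be modeled roughly as below. This is a toy,
single-threaded model and not the kernel implementation: all struct
and function names are hypothetical, and "blocked" merely stands in
for a vcpu being kept out of SIE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical per-vcpu state: its own copy of the crypto attributes. */
struct toy_vcpu {
	bool blocked;           /* true while kept out of SIE */
	unsigned char aes_kw;
	unsigned char dea_kw;
};

/* Hypothetical VM state: the authoritative crypto attributes. */
struct toy_vm {
	unsigned char aes_kw;
	unsigned char dea_kw;
	size_t nr_vcpus;
	struct toy_vcpu vcpus[TOY_MAX_VCPUS];
};

void toy_crypto_reset_all(struct toy_vm *vm)
{
	size_t i;

	assert(vm->nr_vcpus <= TOY_MAX_VCPUS);

	/* Step 1: keep every vcpu out of SIE, running or not. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = true;

	/* Step 2: refresh each vcpu's attributes from the VM settings. */
	for (i = 0; i < vm->nr_vcpus; i++) {
		vm->vcpus[i].aes_kw = vm->aes_kw;
		vm->vcpus[i].dea_kw = vm->dea_kw;
	}

	/* Step 3: allow the vcpus back into SIE. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = false;
}
```

The point of doing all three steps in one place is that no vcpu can run
with stale key wrapping attributes while the VM-wide values change.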
On 04/16/2018 06:51 AM, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> The VFIO AP device model exploits interpretive execution of AP
>> instructions (APIE) to provide guests passthrough access to AP
>> devices. This patch introduces a new interface to enable and
>> disable APIE.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>> arch/s390/include/asm/kvm_host.h | 1 +
>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm-ap.h
>> b/arch/s390/include/asm/kvm-ap.h
>> index 736e93e..a6c092e 100644
>> --- a/arch/s390/include/asm/kvm-ap.h
>> +++ b/arch/s390/include/asm/kvm-ap.h
>> @@ -35,4 +35,20 @@
>> */
>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>
>> +/**
>> + * kvm_ap_interpret_instructions
>> + *
>> + * Indicate whether AP instructions shall be interpreted. If they
>> are not
>> + * interpreted, all AP instructions will be intercepted and routed
>> back to
>> + * userspace.
>> + *
>> + * @kvm: the virtual machine attributes
>> + * @enable: indicates whether AP instructions are to be interpreted
>> (true) or
>> + * not (false).
>> + *
>> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
>> + * indicating that AP instructions are not installed on the guest.
>> + */
>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>> +
>> #endif /* _ASM_KVM_AP */
>> diff --git a/arch/s390/include/asm/kvm_host.h
>> b/arch/s390/include/asm/kvm_host.h
>> index 3162783..5470685 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>> __u32 crycbd;
>> __u8 aes_kw;
>> __u8 dea_kw;
>> + __u8 apie;
>> };
>>
>> #define APCB0_MASK_SIZE 1
>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>> index 991bae4..55d11b5 100644
>> --- a/arch/s390/kvm/kvm-ap.c
>> +++ b/arch/s390/kvm/kvm-ap.c
>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>> }
>> }
>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>> +
>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>> +{
>> + int ret = 0;
>> +
>> + mutex_lock(&kvm->lock);
>> +
>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>
> Do we really need to test CPU_FEAT_AP?
Yes we do.
>
>
> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
> interpreted.
> shouldn't we add this information in the name?
> like KVM_S390_VM_CPU_FEAT_APIE
KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are interpreted;
it means AP instructions are installed.
>
>> + ret = -EOPNOTSUPP;
>> + goto done;
>> + }
>> +
>> + kvm->arch.crypto.apie = enable;
>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>> +
>> +done:
>> + mutex_unlock(&kvm->lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 55cd897..1dc8566 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>> kvm_ap_build_crycbd(kvm);
>>
>> + /* Default setting indicating SIE shall interpret AP
>> instructions */
>> + kvm->arch.crypto.apie = 1;
>> +
>> if (!test_kvm_facility(kvm, 76))
>> return;
>>
>> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct
>> kvm_vcpu *vcpu)
>> {
>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>
>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>> + if (vcpu->kvm->arch.crypto.apie &&
>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>
> Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by kvm_arch_vcpu_setup(vcpu)
as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it has nothing
to do with whether AP interpretation is supported or not as it does much
more than that, including setting up of wrapping keys and the CRYCBD.
>
>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>> +
>> +
>> if (!test_kvm_facility(vcpu->kvm, 76))
>> return;
>>
>
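To make the distinction between "AP instructions installed" and "AP
instructions interpreted" concrete, here is a small standalone sketch
of the ECA handling from the hunk above. It is not the kernel code: the
struct, the helper and the SKETCH_ECA_APIE value are assumptions for
illustration, and feat_ap stands in for KVM_S390_VM_CPU_FEAT_AP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit value for this sketch only. */
#define SKETCH_ECA_APIE 0x00000008u

/* Hypothetical stand-in for the VM-wide settings. */
struct sketch_vm {
	bool apie;     /* VM-wide "interpret AP instructions" setting */
	bool feat_ap;  /* AP instructions installed on the guest */
};

/* Return the ECA value a vcpu would end up with after crypto setup. */
uint32_t sketch_setup_eca(uint32_t eca, const struct sketch_vm *vm)
{
	assert(vm != 0);

	/* Always start from "not interpreted" ... */
	eca &= ~SKETCH_ECA_APIE;

	/* ... and enable interpretation only if both conditions hold. */
	if (vm->apie && vm->feat_ap)
		eca |= SKETCH_ECA_APIE;

	return eca;
}
```

Because the bit is cleared first, the function can be called for every
vcpu regardless of whether AP instructions are available, which is the
case described in the reply above.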
On 04/16/2018 07:13 AM, Pierre Morel wrote:
> On 16/04/2018 12:51, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> The VFIO AP device model exploits interpretive execution of AP
>>> instructions (APIE) to provide guests passthrough access to AP
>>> devices. This patch introduces a new interface to enable and
>>> disable APIE.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>> arch/s390/include/asm/kvm_host.h | 1 +
>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>> b/arch/s390/include/asm/kvm-ap.h
>>> index 736e93e..a6c092e 100644
>>> --- a/arch/s390/include/asm/kvm-ap.h
>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>> @@ -35,4 +35,20 @@
>>> */
>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>
>>> +/**
>>> + * kvm_ap_interpret_instructions
>>> + *
>>> + * Indicate whether AP instructions shall be interpreted. If they
>>> are not
>>> + * interpreted, all AP instructions will be intercepted and routed
>>> back to
>>> + * userspace.
>>> + *
>>> + * @kvm: the virtual machine attributes
>>> + * @enable: indicates whether AP instructions are to be interpreted
>>> (true) or
>>> + * not (false).
>>> + *
>>> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
>>> + * indicating that AP instructions are not installed on the guest.
>>> + */
>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>> +
>>> #endif /* _ASM_KVM_AP */
>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>> b/arch/s390/include/asm/kvm_host.h
>>> index 3162783..5470685 100644
>>> --- a/arch/s390/include/asm/kvm_host.h
>>> +++ b/arch/s390/include/asm/kvm_host.h
>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>> __u32 crycbd;
>>> __u8 aes_kw;
>>> __u8 dea_kw;
>>> + __u8 apie;
>>> };
>>>
>>> #define APCB0_MASK_SIZE 1
>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>> index 991bae4..55d11b5 100644
>>> --- a/arch/s390/kvm/kvm-ap.c
>>> +++ b/arch/s390/kvm/kvm-ap.c
>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>> }
>>> }
>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>> +
>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>> +{
>>> + int ret = 0;
>>> +
>>> + mutex_lock(&kvm->lock);
>>> +
>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>
>> Do we really need to test CPU_FEAT_AP?
>>
>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
>> interpreted.
>> shouldn't we add this information in the name?
>> like KVM_S390_VM_CPU_FEAT_APIE
>
> If I misunderstood and FEAT_AP really mean AP instructions available
> in the guest,
> same question:
> is this function called if AP instructions are not available in the
> guest?
>
>>
>>> + ret = -EOPNOTSUPP;
>>> + goto done;
>>> + }
>>> +
>>> + kvm->arch.crypto.apie = enable;
>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>> +
>>> +done:
>>> + mutex_unlock(&kvm->lock);
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 55cd897..1dc8566 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>> kvm_ap_build_crycbd(kvm);
>>>
>>> + /* Default setting indicating SIE shall interpret AP
>>> instructions */
>>> + kvm->arch.crypto.apie = 1;
>>> +
>>> if (!test_kvm_facility(kvm, 76))
>>> return;
>>>
>>> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct
>>> kvm_vcpu *vcpu)
>>> {
>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>
>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>> + if (vcpu->kvm->arch.crypto.apie &&
>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>
>> Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
>
> sorry, I should have written AP instructions here:
> is this function called if AP instructions are not available in the guest?
Yes, as I stated in a previous email, this function is called by
kvm_arch_vcpu_setup(vcpu) regardless of whether AP instructions are
available on the guest or not.
>
>
>>
>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>> +
>>> +
>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>> return;
>>>
>>
>
On Tue, 17 Apr 2018 10:55:30 -0400
Tony Krowiak <[email protected]> wrote:
> On 04/17/2018 10:29 AM, Halil Pasic wrote:
> >
> >
> > On 04/15/2018 11:22 PM, Tony Krowiak wrote:
> >> Introduces a new function to reset the crypto attributes for all
> >> vcpus whether they are running or not. Each vcpu in KVM will
> >> be removed from SIE prior to resetting the crypto attributes in its
> >> SIE state description. After all vcpus have had their crypto attributes
> >> reset the vcpus will be restored to SIE.
> >>
> >> This function will be used in a later patch to set the ECA.28
> >> bit in the SIE state description to enable interpretive execution of
> >> AP instructions. It will also be incorporated into the
> >> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
> >> key wrapping attributes could potentially get out of synch for running
> >> vcpus.
> >>
> >
> > Wasn't this 'issue' reported by me by any chance?
>
> Yes it was .... was I supposed to include that fact in the commit message?
A Reported-by: is usually nice.
>
> >
> > I agree with Connie, we don't need the forward reference to
> > ECA.28.
>
> I'm not sure that's exactly what she said, but I'd be more than happy
> to remove it.
It was kind of implied :)
On 04/16/2018 07:12 AM, Halil Pasic wrote:
>
> On 04/15/2018 11:22 PM, Tony Krowiak wrote:
>> The VFIO AP device model exploits interpretive execution of AP
>> instructions (APIE) to provide guests passthrough access to AP
>> devices. This patch introduces a new interface to enable and
>> disable APIE.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
> LGTM, but should be squashed into #4. (Like this you have a
> kernel that supports the cpu model feature 'ap' but does not
> do 'interpretation is default'.)
Okay
On 04/16/2018 07:52 AM, Halil Pasic wrote:
>
> On 04/16/2018 01:13 PM, Pierre Morel wrote:
>> On 16/04/2018 12:51, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> The VFIO AP device model exploits interpretive execution of AP
>>>> instructions (APIE) to provide guests passthrough access to AP
>>>> devices. This patch introduces a new interface to enable and
>>>> disable APIE.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
>>>> index 736e93e..a6c092e 100644
>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>> @@ -35,4 +35,20 @@
>>>> */
>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>
>>>> +/**
>>>> + * kvm_ap_interpret_instructions
>>>> + *
>>>> + * Indicate whether AP instructions shall be interpreted. If they are not
>>>> + * interpreted, all AP instructions will be intercepted and routed back to
>>>> + * userspace.
>>>> + *
>>>> + * @kvm: the virtual machine attributes
>>>> + * @enable: indicates whether AP instructions are to be interpreted (true) or
>>>> + * not (false).
>>>> + *
>>>> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
>>>> + * indicating that AP instructions are not installed on the guest.
>>>> + */
>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>> +
>>>> #endif /* _ASM_KVM_AP */
>>>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>>>> index 3162783..5470685 100644
>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>> __u32 crycbd;
>>>> __u8 aes_kw;
>>>> __u8 dea_kw;
>>>> + __u8 apie;
>>>> };
>>>>
>>>> #define APCB0_MASK_SIZE 1
>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>> index 991bae4..55d11b5 100644
>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>> }
>>>> }
>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>> +
>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>> +{
>>>> + int ret = 0;
>>>> +
>>>> + mutex_lock(&kvm->lock);
>>>> +
>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>> Do we really need to test CPU_FEAT_AP?
>>>
>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are interpreted.
>>> shouldn't we add this information in the name?
>>> like KVM_S390_VM_CPU_FEAT_APIE
>> If I misunderstood and FEAT_AP really mean AP instructions available in the guest,
>> same question:
>> is this function called if AP instructions are not available in the guest?
>>
> See patch #13. I guess the check above is anyway good as defensive
> programming. This implementation should be sane regardless of
> the answer to your question.
I agree.
>
>>>> + ret = -EOPNOTSUPP;
>>>> + goto done;
>>>> + }
>>>> +
>>>> + kvm->arch.crypto.apie = enable;
>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>> +
>>>> +done:
>>>> + mutex_unlock(&kvm->lock);
>>>> + return ret;
>>>> +}
>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index 55cd897..1dc8566 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>> kvm_ap_build_crycbd(kvm);
>>>>
>>>> + /* Default setting indicating SIE shall interpret AP instructions */
>>>> + kvm->arch.crypto.apie = 1;
>>>> +
>>>> if (!test_kvm_facility(kvm, 76))
>>>> return;
>>>>
>>>> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>> {
>>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>>
>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>> Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
>> sorry, I should have written AP instructions here:
>> is this function called if AP instructions are not available in the guest?
>>
> Yes, this function can be called with AP instructions available to the guest.
> Please have a look at patch 2 (kvm_s390_vm_set_crypto and the rest).
>
> Also this function is called on initialization regardless of AP instructions.
>
>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>> +
>>>> +
>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>> return;
>>>>
On Tue, 17 Apr 2018 10:26:57 -0400
Tony Krowiak <[email protected]> wrote:
> On 04/17/2018 06:10 AM, Cornelia Huck wrote:
> > On Tue, 17 Apr 2018 09:49:58 +0200
> > "Harald Freudenberger" <[email protected]> wrote:
> >
> >> Didn't we say that when APXA is not available there is no Crypto support
> >> for KVM ?
> > [Going by the code, as I don't have access to the architecture]
> >
> > Current status seems to be:
> > - setup crycb if facility 76 is available (that's MSAX3, I guess?)
>
> The crycb is set up regardless of whether STFLE.76 (MSAX3) is
> installed or not.
Hm, the current code does a quick exit if bit 76 is not set, doesn't
it?
>
> > - use format 2 if APXA is available, else use format 1
>
> Use format 0 if MSAX3 is not available
> Use format 1 if MSAX3 is available but APXA is not
> Use format 2 if MSAX3 and APXA are available
>
> >
> > From Tony's patch description, the goal seems to be:
> > - setup crycb even if MSAX3 is not available
>
> Yes, that is true
>
> >
> > So my understanding is that we use APXA only to decide on the format of
> > the crycb, but provide it in any case?
>
> Yes, that is true
With the format selection you outlined above, I guess. Makes sense from
my point of view (just looking at the source code).
>
> >
> > (Not providing a crycb if APXA is not available would be loss of
> > functionality, I guess? Deciding not to provide vfio-ap if APXA is not
> > available is a different game, of course.)
>
> This would require a change to enabling the CPU model feature for
> AP.
But would it actually make sense to tie vfio-ap to APXA? This needs to
be answered by folks with access to the architecture :)
[Personally, I think we should go with the version that uses the least
restrictions without introducing over-complex code. What constitutes
"over-complex code" is of course in the eye of the beholder...]
On 17/04/2018 16:15, Tony Krowiak wrote:
> On 04/16/2018 04:56 AM, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> This patch refactors the code that initializes the crypto
>>> configuration for a guest. The crypto configuration is contained in
>>> a crypto control block (CRYCB) which is a satellite control block to
>>> our main hardware virtualization control block. The CRYCB is
>>> attached to the main virtualization control block via a CRYCB
>>> designation (CRYCBD) field containing the address of
>>> the CRYCB as well as its format.
>>>
>>> Prior to the introduction of AP device virtualization, there was
>>> no need to provide access to or specify the format of the CRYCB for
>>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>>> on the host system. With the introduction of AP device virtualization,
>>> the CRYCB and its format must be made accessible to the guest
>>> regardless of the presence of the MSAX3 facility.
>>>
>>> The crypto initialization code is restructured as follows:
>>>
>>> * A new compilation unit is introduced to contain all interfaces
>>> and data structures related to configuring a guest's CRYCB for
>>> both the refactoring of crypto initialization as well as all
>>> subsequent patches introducing AP virtualization support.
>>>
>>> * Currently, the asm code for querying the AP configuration is
>>> duplicated in the AP bus as well as in KVM. Since the KVM
>>> code was introduced, the AP bus has externalized the interface
>>> for querying the AP configuration. The KVM interface will be
>>> replaced with a call to the AP bus interface. Of course, this
>>> will be moved to the new compilation unit mentioned above.
>>>
>>> * An interface to format the CRYCBD field will be provided via
>>> the new compilation unit and called from the KVM vm
>>> initialization.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> arch/s390/include/asm/kvm-ap.h | 15 +++++++++
>>> arch/s390/include/asm/kvm_host.h | 1 +
>>> arch/s390/kvm/kvm-ap.c | 39 ++++++++++++++++++++++++
>>> arch/s390/kvm/kvm-s390.c | 60
>>> ++++----------------------------------
>>> 4 files changed, 61 insertions(+), 54 deletions(-)
>>>
>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>> b/arch/s390/include/asm/kvm-ap.h
>>> index 84412a9..736e93e 100644
>>> --- a/arch/s390/include/asm/kvm-ap.h
>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>> @@ -10,6 +10,9 @@
>>> #ifndef _ASM_KVM_AP
>>> #define _ASM_KVM_AP
>>>
>>> +#include <linux/types.h>
>>> +#include <linux/kvm_host.h>
>>> +
>>> /**
>>> * kvm_ap_instructions_installed()
>>> *
>>> @@ -20,4 +23,16 @@
>>> */
>>> int kvm_ap_instructions_installed(void);
>>>
>>> +/**
>>> + * kvm_ap_build_crycbd
>>> + *
>>> + * The crypto control block designation (CRYCBD) is a 32-bit field
>>> that
>>> + * designates both the host real address and format of the CRYCB.
>>> This function
>>> + * builds the CRYCBD field for use by the KVM guest.
>>> + *
>>> + * @kvm: the KVM guest
>>> + * @crycbd: reference to the CRYCBD
>>> + */
>>> +void kvm_ap_build_crycbd(struct kvm *kvm);
>>> +
>>> #endif /* _ASM_KVM_AP */
>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>> b/arch/s390/include/asm/kvm_host.h
>>> index 81cdb6b..c990a1d 100644
>>> --- a/arch/s390/include/asm/kvm_host.h
>>> +++ b/arch/s390/include/asm/kvm_host.h
>>> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
>>> __u8 reservedf0[12]; /* 0x00f0 */
>>> #define CRYCB_FORMAT1 0x00000001
>>> #define CRYCB_FORMAT2 0x00000003
>>> +#define CRYCB_FORMAT_MASK 0x00000003
>>> __u32 crycbd; /* 0x00fc */
>>> __u64 gcr[16]; /* 0x0100 */
>>> __u64 gbea; /* 0x0180 */
>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>> index 1267588..991bae4 100644
>>> --- a/arch/s390/kvm/kvm-ap.c
>>> +++ b/arch/s390/kvm/kvm-ap.c
>>> @@ -10,6 +10,8 @@
>>> #include <asm/kvm-ap.h>
>>> #include <asm/ap.h>
>>>
>>> +#include "kvm-s390.h"
>>> +
>>> int kvm_ap_instructions_installed(void)
>>> {
>>> #ifdef CONFIG_ZCRYPT
>>> @@ -19,3 +21,40 @@ int kvm_ap_instructions_installed(void)
>>> #endif
>>> }
>>> EXPORT_SYMBOL(kvm_ap_instructions_installed);
>>> +
>>> +static inline int kvm_ap_query_config(struct ap_config_info *config)
>>> +{
>>> + memset(config, 0, sizeof(*config));
>>> +
>>> +#ifdef CONFIG_ZCRYPT
>>
>> I would prefer that you define the interface in an include file
>> with stubs for the case ZCRYPT is not set.
>
> This is a static function only called internally, but I suppose there is
> no harm in defining it as an interface in kvm-ap.h ... it may come
> in handy down the road.
>
>>
>>
>>> + if (kvm_ap_instructions_installed())
>>> + return ap_query_configuration(config);
>>> +#endif
>>> +
>>> + return -EOPNOTSUPP;
>>> +}
>>> +
>>> +static int kvm_ap_apxa_installed(void)
>>> +{
>>> + struct ap_config_info config;
>>> +
>>> + if (kvm_ap_query_config(&config) == 0)
>>> + return (config.apxa == 1);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +void kvm_ap_build_crycbd(struct kvm *kvm)
>>> +{
>>> + kvm->arch.crypto.crycbd = (__u32)(unsigned long)
>>> kvm->arch.crypto.crycb;
>>> + kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
>>> +
>>> + /* check whether MSAX3 is installed */
>>
>> It means we do not support AP virtualization without MSA3.
>> It follows we do not support CRYCB_FORMAT0
>
> If MSAX3 is not installed, that means there is no key wrapping support,
> hence CRYCB_FORMAT0. The CRYCB_FORMAT1 and CRYCB_FORMAT2 CRYCBs
> both include wrapping key masks. I don't follow your logic here.
>
>>
>>
>> It is different from what you explain in the comment.
>
> How is it different? Above, we are setting the CRYCBD value regardless
> of whether MSAX3 is installed or not. Previously, the CRYCBD value
> was set only if MSAX3 is installed (see comments below)
>
>>
>>
>>> + if (kvm_ap_instructions_installed() && test_kvm_facility(kvm,
>>> 76)) {
>>> + if (kvm_ap_apxa_installed())
>>> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
>>> + else
>>> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
>>> + }
Sorry, I was fooled by the test on kvm_ap_instructions_installed() and by
the fact that CRYCB_FORMAT0 = 0. Since you cleared the format above, it is
0 by default.
Since we cannot use CRYCB_FORMAT0 if we have no AP instructions, the logic
of the test seems wrong even though the result is right.
I think you can make it more readable if you put all the CRYCB
initialization together inside the kvm_s390_crypto_init() function instead
of exporting part of it inside kvm_ap_build_crycbd().
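To make the discussion concrete, here is a minimal standalone sketch of the
format selection with everything in one place. CRYCB_FORMAT1, CRYCB_FORMAT2
and CRYCB_FORMAT_MASK are taken from the patch. The name CRYCB_FORMAT0_SKETCH,
the helper crycbd_build_sketch() and its boolean parameters are hypothetical
stand-ins for kvm_ap_instructions_installed(), test_kvm_facility(kvm, 76) and
kvm_ap_apxa_installed(); this is an illustration, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define CRYCB_FORMAT0_SKETCH 0x00000000u /* assumed name: no format bits set */
#define CRYCB_FORMAT1        0x00000001u
#define CRYCB_FORMAT2        0x00000003u
#define CRYCB_FORMAT_MASK    0x00000003u

/*
 * Hypothetical helper: combine the CRYCB origin with a format designation.
 * The three flags model the checks made in the patch.
 */
static uint32_t crycbd_build_sketch(uint32_t crycb_origin, bool ap_installed,
                                    bool msax3, bool apxa)
{
	uint32_t format = CRYCB_FORMAT0_SKETCH;

	/* Same condition as in the patch: AP instructions and MSAX3 present */
	if (ap_installed && msax3)
		format = apxa ? CRYCB_FORMAT2 : CRYCB_FORMAT1;

	/* Clear the low-order format bits, then merge in the chosen format */
	return (crycb_origin & ~CRYCB_FORMAT_MASK) | format;
}
```

Written this way, format 0 is visibly the default rather than a side effect
of clearing the mask.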
Regards,
Pierre
>>> +}
>>> +EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index d0c3518..b47ff11 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -40,6 +40,7 @@
>>> #include <asm/sclp.h>
>>> #include <asm/cpacf.h>
>>> #include <asm/timex.h>
>>> +#include <asm/kvm-ap.h>
>>> #include "kvm-s390.h"
>>> #include "gaccess.h"
>>>
>>> @@ -1881,55 +1882,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>> return r;
>>> }
>>>
>>> -static int kvm_s390_query_ap_config(u8 *config)
>>> -{
>>> - u32 fcn_code = 0x04000000UL;
>>> - u32 cc = 0;
>>> -
>>> - memset(config, 0, 128);
>>> - asm volatile(
>>> - "lgr 0,%1\n"
>>> - "lgr 2,%2\n"
>>> - ".long 0xb2af0000\n" /* PQAP(QCI) */
>>> - "0: ipm %0\n"
>>> - "srl %0,28\n"
>>> - "1:\n"
>>> - EX_TABLE(0b, 1b)
>>> - : "+r" (cc)
>>> - : "r" (fcn_code), "r" (config)
>>> - : "cc", "0", "2", "memory"
>>> - );
>>> -
>>> - return cc;
>>> -}
>>> -
>>> -static int kvm_s390_apxa_installed(void)
>>> -{
>>> - u8 config[128];
>>> - int cc;
>>> -
>>> - if (test_facility(12)) {
>>> - cc = kvm_s390_query_ap_config(config);
>>> -
>>> - if (cc)
>>> - pr_err("PQAP(QCI) failed with cc=%d", cc);
>>> - else
>>> - return config[0] & 0x40;
>>> - }
>>> -
>>> - return 0;
>>> -}
>>> -
>>> -static void kvm_s390_set_crycb_format(struct kvm *kvm)
>>> -{
>>> - kvm->arch.crypto.crycbd = (__u32)(unsigned long)
>>> kvm->arch.crypto.crycb;
>>> -
>>> - if (kvm_s390_apxa_installed())
>>> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
>>> - else
>>> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
>>> -}
>>> -
>>> static u64 kvm_s390_get_initial_cpuid(void)
>>> {
>>> struct cpuid cpuid;
>>> @@ -1941,12 +1893,12 @@ static u64 kvm_s390_get_initial_cpuid(void)
>>>
>>> static void kvm_s390_crypto_init(struct kvm *kvm)
>>> {
>>> + kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>> + kvm_ap_build_crycbd(kvm);
>>> +
>
> Notice the call to kvm_ap_build_crycbd(kvm) above was added, so
> the CRYCBD is being set regardless of the presence of MSAX3.
>
>>> if (!test_kvm_facility(kvm, 76))
>>> return;
>>>
>>> - kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>> - kvm_s390_set_crycb_format(kvm);
> Notice that this code that was removed to set the CRYCBD was executed
> only if MSAX3 is installed - i.e., see the if statement
> immediately preceding the two statements above.
>>> -
>>> /* Enable AES/DEA protected key functions by default */
>>> kvm->arch.crypto.aes_kw = 1;
>>> kvm->arch.crypto.dea_kw = 1;
>>> @@ -2475,6 +2427,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu
>>> *vcpu)
>>>
>>> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>> {
>>> + vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>> +
>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>> return;
>>>
>>> @@ -2484,8 +2438,6 @@ static void kvm_s390_vcpu_crypto_setup(struct
>>> kvm_vcpu *vcpu)
>>> vcpu->arch.sie_block->ecb3 |= ECB3_AES;
>>> if (vcpu->kvm->arch.crypto.dea_kw)
>>> vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
>>> -
>>> - vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>> }
>>>
>>> void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
>>
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/16/2018 10:51 AM, Halil Pasic wrote:
>
> On 04/16/2018 03:05 PM, Pierre Morel wrote:
>>> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>> +{
>>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> +
>>> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
>>> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
>> This call clears the apie in KVM.
>> This is only OK if we have a single device present until the end of the VM,
>> otherwise AP instructions in the guest will fail after the release until the end of the VM
>> or until a new device is plugged.
> I agree, this seems wrong.
As I think about this more, you may be correct. I believe that one can
remove a VFIO mediated device via a sysfs file descriptor. I suppose that
could happen while the guest is still running, which would mean AP
instructions executed on the guest would meet with an operation exception.
I will have to explore this some more.
On 04/16/2018 09:13 AM, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> This patch provides documentation describing the AP architecture and
>> design concepts behind the virtualization of AP devices. It also
>> includes an example of how to configure AP devices for exclusive
>> use of KVM guests.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> Documentation/s390/vfio-ap.txt | 567
>> ++++++++++++++++++++++++++++++++++++++++
>> MAINTAINERS | 1 +
>> 2 files changed, 568 insertions(+), 0 deletions(-)
>> create mode 100644 Documentation/s390/vfio-ap.txt
>>
>> diff --git a/Documentation/s390/vfio-ap.txt
>> b/Documentation/s390/vfio-ap.txt
>> new file mode 100644
>> index 0000000..a1e888a
>> --- /dev/null
>> +++ b/Documentation/s390/vfio-ap.txt
>> @@ -0,0 +1,567 @@
>> +Introduction:
>> +============
>> +The Adjunct Processor (AP) facility is an IBM Z cryptographic
>> facility comprised
>> +of three AP instructions and from 1 up to 256 PCIe cryptographic
>> adapter cards.
>> +The AP devices provide cryptographic functions to all CPUs assigned
>> to a
>> +linux system running in an IBM Z system LPAR.
>> +
>> +The AP adapter cards are exposed via the AP bus. The motivation for
>> vfio-ap
>> +is to make AP cards available to KVM guests using the VFIO mediated
>> device
>> +framework. This implementation relies considerably on the s390
>> virtualization
>> +facilities which do most of the hard work of providing direct access
>> to AP
>> +devices.
>> +
>> +AP Architectural Overview:
>> +=========================
>> +To facilitate the comprehension of the design, let's start with some
>> +definitions:
>> +
>> +* AP adapter
>> +
>> + An AP adapter is an IBM Z adapter card that can perform cryptographic
>> + functions. There can be from 0 to 256 adapters assigned to an
>> LPAR. Adapters
>> + assigned to the LPAR in which a linux host is running will be
>> available to
>> + the linux host. Each adapter is identified by a number from 0 to
>> 255. When
>> + installed, an AP adapter is accessed by AP instructions executed
>> by any CPU.
>> +
>> + The AP adapter cards are assigned to a given LPAR via the system's
>> Activation
>> + Profile which can be edited via the HMC. When the system is IPL'd,
>> the AP bus
>> + module is loaded and detects the AP adapter cards assigned to the
>> LPAR. The AP
>> + bus creates a sysfs device for each adapter as they are detected.
>> For example,
>> + if AP adapters 4 and 10 (0x0a) are assigned to the LPAR, the AP
>> bus will
>> + create the following sysfs entries:
>> +
>> + /sys/devices/ap/card04
>> + /sys/devices/ap/card0a
>> +
>> + Symbolic links to these devices will also be created in the AP bus
>> devices
>> + sub-directory:
>> +
>> + /sys/bus/ap/devices/[card04]
>> + /sys/bus/ap/devices/[card0a]
>> +
>> +* AP domain
>> +
>> + An adapter is partitioned into domains. Each domain can be thought
>> of as
>> + a set of hardware registers for processing AP instructions. An
>> adapter can
>> + hold up to 256 domains. Each domain is identified by a number from
>> 0 to 255.
>> + Domains can be further classified into two types:
>> +
>> + * Usage domains are domains that can be accessed directly to
>> process AP
>> + commands.
>> +
>> + * Control domains are domains that are accessed indirectly by AP
>> + commands sent to a usage domain to control or change the
>> domain; for
>> + example, to set a secure private key for the domain.
>> +
>> + The AP usage and control domains are assigned to a given LPAR via
>> the system's
>> + Activation Profile which can be edited via the HMC. When the
>> system is IPL'd,
>> + the AP bus module is loaded and detects the AP usage and control
>> domains
>> + assigned to the LPAR. The domain number of each usage domain will
>> be coupled
>> + with the adapter number of each AP adapter assigned to the LPAR to
>> identify
>> + the AP queues (see AP Queue section below). The domain number of
>> each control
>> + domain will be represented in a bitmask and stored in a sysfs file
>> + /sys/bus/ap/ap_control_domain_mask created by the bus. The bits in
>> the mask,
>> + from most to least significant bit, correspond to domains 0-255.
>> +
>> + A domain may be assigned to a system as both a usage and control
>> domain, or
>> + as a control domain only. Consequently, all domains assigned as
>> both a usage
>> + and control domain can both process AP commands as well as be
>> changed by an AP
>> + command sent to any usage domain assigned to the same system.
>> Domains assigned
>> + only as control domains can not process AP commands but can be
>> changed by AP
>> + commands sent to any usage domain assigned to the system.
>> +
>> +* AP Queue
>> +
>> + An AP queue is the means by which an AP command-request message is
>> sent to a
>> + usage domain inside a specific adapter. An AP queue is identified
>> by a tuple
>> + comprised of an AP adapter ID (APID) and an AP queue index (APQI).
>> The
>> + APQI corresponds to a given usage domain number within the
>> adapter. This tuple
>> + forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
>> + instructions include a field containing the APQN to identify the
>> AP queue to
>> + which the AP command-request message is to be sent for processing.
>> +
>> + The AP bus will create a sysfs device for each APQN that can be
>> derived from
>> + the intersection of the AP adapter and usage domain numbers
>> detected when the
>> + AP bus module is loaded. For example, if adapters 4 and 10 (0x0a)
>> and usage
>> + domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will
>> create the
>> + following sysfs entries:
>> +
>> + /sys/devices/ap/card04/04.0006
>> + /sys/devices/ap/card04/04.0047
>> + /sys/devices/ap/card0a/0a.0006
>> + /sys/devices/ap/card0a/0a.0047
>> +
>> + The following symbolic links to these devices will be created in
>> the AP bus
>> + devices subdirectory:
>> +
>> + /sys/bus/ap/devices/[04.0006]
>> + /sys/bus/ap/devices/[04.0047]
>> + /sys/bus/ap/devices/[0a.0006]
>> + /sys/bus/ap/devices/[0a.0047]
>> +
>> +* AP Instructions:
>> +
>> + There are three AP instructions:
>> +
>> + * NQAP: to enqueue an AP command-request message to a queue
>> + * DQAP: to dequeue an AP command-reply message from a queue
>> + * PQAP: to administer the queues
>> +
>> +AP and SIE:
>> +==========
>> +Let's now see how AP instructions are interpreted by the hardware.
>> +
>> +A satellite control block called the Crypto Control Block is
>> attached to our
>> +main hardware virtualization control block. The CRYCB contains three
>> fields to
>> +identify the adapters, usage domains and control domains assigned to
>> the KVM
>> +guest:
>> +
>> +* The AP Mask (APM) field is a bit mask that identifies the AP
>> adapters assigned
>> + to the KVM guest. Each bit in the mask, from most significant to
>> least
>> + significant bit, corresponds to an APID from 0-255. If a bit is
>> set, the
>> + corresponding adapter is valid for use by the KVM guest.
>> +
>> +* The AP Queue Mask (AQM) field is a bit mask identifying the AP
>> usage domains
>> + assigned to the KVM guest. Each bit in the mask, from most
>> significant to
>> + least significant bit, corresponds to an AP queue index (APQI)
>> from 0-255. If
>> + a bit is set, the corresponding queue is valid for use by the KVM
>> guest.
>> +
>> +* The AP Domain Mask field is a bit mask that identifies the AP
>> control domains
>> + assigned to the KVM guest. The ADM bit mask controls which domains
>> can be
>> + changed by an AP command-request message sent to a usage domain
>> from the
>> + guest. Each bit in the mask, from most significant to least
>> significant bit,
>> + corresponds to a domain from 0-255. If a bit is set, the
>> corresponding domain
>> + can be modified by an AP command-request message sent to a usage
>> domain
>> + configured for the KVM guest.
>> +
>> +If you recall from the description of an AP Queue, AP instructions
>> include
>> +an APQN to identify the AP adapter and AP queue to which an AP
>> command-request
>> +message is to be sent (NQAP and PQAP instructions), or from which a
>> +command-reply message is to be received (DQAP instruction). The
>> validity of an
>> +APQN is defined by the matrix calculated from the APM and AQM; it is
>> the
>> +cross product of all assigned adapter numbers (APM) with all
>> assigned queue
>> +indexes (AQM). For example, if adapters 1 and 2 and usage domains 5
>> and 6 are
>> +assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be
>> valid for
>> +the guest.
>> +
>> +The APQNs can provide secure key functionality - i.e., a private key
>> is stored
>> +on the adapter card for each of its domains - so each APQN must be
>> assigned to
>> +at most one guest or the linux host.
>> +
>> + Example 1: Valid configuration:
>> + ------------------------------
>> + Guest1: adapters 1,2 domains 5,6
>> + Guest2: adapter 1,2 domain 7
>> +
>> + This is valid because both guests have a unique set of APQNs:
>> Guest1 has
>> + APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and
>> (2,7).
>> +
>> + Example 2: Invalid configuration:
>> + --------------------------------
>> + Guest1: adapters 1,2 domains 5,6
>> + Guest2: adapter 1 domains 6,7
>> +
>> + This is an invalid configuration because both guests have access to
>> + APQN (1,6).
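The rule behind the two examples above can be sketched as a small standalone
check. Because a guest gets the cross product of its adapters and domains,
two guests share an APQN exactly when both their adapter masks and their
usage-domain masks intersect. The struct and function names below are
hypothetical and exist only for this illustration; the bit numbering (bit 0
is the most significant bit) follows the APM/AQM description in the document.

```c
#include <stdbool.h>
#include <stdint.h>

#define AP_MASK_WORDS_SKETCH 4 /* 4 x 64 bits = 256 adapters or domains */

/* Hypothetical container for one guest's adapter and usage-domain masks */
struct ap_matrix_sketch {
	uint64_t apm[AP_MASK_WORDS_SKETCH];
	uint64_t aqm[AP_MASK_WORDS_SKETCH];
};

/* Set bit 'nr' (0-255), numbering from the most significant bit */
static void mask_set_sketch(uint64_t *mask, unsigned int nr)
{
	mask[nr / 64] |= 1ULL << (63 - (nr % 64));
}

static bool masks_intersect_sketch(const uint64_t *a, const uint64_t *b)
{
	for (int i = 0; i < AP_MASK_WORDS_SKETCH; i++) {
		if (a[i] & b[i])
			return true;
	}
	return false;
}

/* True if at least one APQN (adapter, domain) is visible to both guests */
static bool matrices_share_apqn_sketch(const struct ap_matrix_sketch *a,
				       const struct ap_matrix_sketch *b)
{
	return masks_intersect_sketch(a->apm, b->apm) &&
	       masks_intersect_sketch(a->aqm, b->aqm);
}
```

With Guest1 holding adapters 1,2 and domains 5,6, the check reports no
conflict against a guest holding adapters 1,2 and domain 7 (Example 1), and a
conflict against a guest holding adapter 1 and domains 6,7 (Example 2).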
>> +
>> +The Design:
>> +===========
>> +The design introduces three new objects:
>> +
>> +1. AP matrix device
>> +2. VFIO AP device driver (vfio_ap.ko)
>> +3. AP mediated matrix passthrough device
>> +
>> +The VFIO AP device driver
>> +-------------------------
>> +The VFIO AP (vfio_ap) device driver serves the following purposes:
>> +
>> +1. Provides the interfaces to reserve APQNs for exclusive use of KVM
>> guests.
>> +
>> +2. Sets up the VFIO mediated device interfaces to manage the
>> mediated matrix
>> + device and create the sysfs interfaces for assigning adapters,
>> usage domains,
>> + and control domains comprising the matrix for a KVM guest.
>> +
>> +3. Configure the APM, AQM and ADM in the CRYCB referenced by a KVM
>> guest's
>> + SIE state description to grant the guest access to AP devices
>> +
>> +4. Initialize the CPU model feature indicating that a KVM guest may use
>> + AP facilities installed on the linux host.
>> +
>> +5. Enable interpretive execution mode for the KVM guest.
>> +
>> +Reserve APQNs for exclusive use of KVM guests
>> +---------------------------------------------
>> +The following block diagram illustrates the mechanism by which APQNs
>> are
>> +reserved:
>> +
>> + +------------------+
>> + remove | | unbind
>> + +------------------->+ cex4queue driver +<-----------+
>> + | | | |
>> + | +------------------+ |
>> + | |
>> + | |
>> + | |
>> ++--------+---------+ register +------------------+ +-----+------+
>> +| +<---------+ | bind | |
>> +| ap_bus | | vfio_ap driver +<-----+ admin |
>> +| +--------->+ | | |
>> ++------------------+ probe +---+--------+-----+ +------------+
>> + | |
>> + create | | store APQN
>> + | |
>> + v v
>> + +---+--------+-----+
>> + | |
>> + | matrix device |
>> + | |
>> + +------------------+
>> +
>> +The process for reserving an AP queue for use by a KVM guest is:
>> +
>> +* The vfio-ap driver during its initialization will perform the
>> following:
>> + * Create the 'vfio_ap' root device - /sys/devices/vfio_ap
>> + * Create the 'matrix' device in the 'vfio_ap' root
>> + * Register the matrix device with the device core
>> +* Register with the ap_bus for AP queue devices of type CEX4, CEX5 and
>> + CEX6 and to provide the vfio_ap driver's probe and remove callback
>> interfaces.
>
> I wonder why the type of card has anything to do with this driver.
> It should be transparent, the driver should be able to provide the
> matrix (APM/AQM/ADM)
> independently from the type of card in the slot.
We've been down this road several times before. We are only supporting
virtualization of CEX4 and newer cards. Also, the AP bus requires
registering for specific card types.
>
>
> Regards,
>
> Pierre
>
On 17/04/2018 17:02, Tony Krowiak wrote:
> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> The VFIO AP device model exploits interpretive execution of AP
>>> instructions (APIE) to provide guests passthrough access to AP
>>> devices. This patch introduces a new interface to enable and
>>> disable APIE.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>> arch/s390/include/asm/kvm_host.h | 1 +
>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>> b/arch/s390/include/asm/kvm-ap.h
>>> index 736e93e..a6c092e 100644
>>> --- a/arch/s390/include/asm/kvm-ap.h
>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>> @@ -35,4 +35,20 @@
>>> */
>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>
>>> +/**
>>> + * kvm_ap_interpret_instructions
>>> + *
>>> + * Indicate whether AP instructions shall be interpreted. If they
>>> are not
>>> + * interpreted, all AP instructions will be intercepted and routed
>>> back to
>>> + * userspace.
>>> + *
>>> + * @kvm: the virtual machine attributes
>>> + * @enable: indicates whether AP instructions are to be interpreted
>>> (true) or
>>> + * not (false).
>>> + *
>>> + * Returns 0 if completed successfully; otherwise, returns -EOPNOTSUPP
>>> + * indicating that AP instructions are not installed on the guest.
>>> + */
>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>> +
>>> #endif /* _ASM_KVM_AP */
>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>> b/arch/s390/include/asm/kvm_host.h
>>> index 3162783..5470685 100644
>>> --- a/arch/s390/include/asm/kvm_host.h
>>> +++ b/arch/s390/include/asm/kvm_host.h
>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>> __u32 crycbd;
>>> __u8 aes_kw;
>>> __u8 dea_kw;
>>> + __u8 apie;
>>> };
>>>
>>> #define APCB0_MASK_SIZE 1
>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>> index 991bae4..55d11b5 100644
>>> --- a/arch/s390/kvm/kvm-ap.c
>>> +++ b/arch/s390/kvm/kvm-ap.c
>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>> }
>>> }
>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>> +
>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>> +{
>>> + int ret = 0;
>>> +
>>> + mutex_lock(&kvm->lock);
>>> +
>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>
>> Do we really need to test CPU_FEAT_AP?
>
> Yes we do.
Really? Why?
>
>>
>>
>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
>> interpreted.
>> shouldn't we add this information in the name?
>> like KVM_S390_VM_CPU_FEAT_APIE
>
> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are interpreted,
> it means
> AP instructions are installed.
Right, this is the same error I made all along this review.
But AFAIK it means AP instructions are provided to the guest.
Then should this function be called if the guest has no AP instructions?
>
>>
>>> + ret = -EOPNOTSUPP;
>>> + goto done;
>>> + }
>>> +
>>> + kvm->arch.crypto.apie = enable;
>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>> +
>>> +done:
>>> + mutex_unlock(&kvm->lock);
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 55cd897..1dc8566 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>> kvm_ap_build_crycbd(kvm);
>>>
>>> + /* Default setting indicating SIE shall interpret AP
>>> instructions */
>>> + kvm->arch.crypto.apie = 1;
>>> +
>>> if (!test_kvm_facility(kvm, 76))
>>> return;
>>>
>>> @@ -2434,6 +2437,12 @@ static void kvm_s390_vcpu_crypto_setup(struct
>>> kvm_vcpu *vcpu)
>>> {
>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>
>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>> + if (vcpu->kvm->arch.crypto.apie &&
>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>
>> Do we call xxx_crypto_setup() if KVM does not support AP interpretation?
>
> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
> kvm_arch_vcpu_setup(vcpu)
> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it has
> nothing
> to do with whether AP interpretation is supported or not as it does much
> more than that, including setting up of wrapping keys and the CRYCBD.
Sorry, still the same error I made about CPU_FEAT_AP: it means AP
instructions are available in the guest, not that AP interpretation is
available.
Could apie be set if AP instructions are not supported?
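For reference, the vcpu setup hunk quoted here boils down to the following
standalone sketch. It assumes ECA is a 32-bit field with bit 0 as the most
significant bit, so that ECA.28 (mentioned elsewhere in this series) is
1 << (31 - 28); the macro and function names are hypothetical and used only
for illustration. It shows that the apie field alone does not set the bit:
the CPU feature has to be present as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: ECA.28 in a 32-bit field whose bit 0 is the MSB */
#define ECA_APIE_SKETCH (1u << (31 - 28))

/*
 * Hypothetical model of the quoted hunk: always clear the bit first, then
 * set it only if interpretation is requested AND the guest has AP
 * instructions (the KVM_S390_VM_CPU_FEAT_AP test in the patch).
 */
static uint32_t eca_apply_apie_sketch(uint32_t eca, bool apie, bool feat_ap)
{
	eca &= ~ECA_APIE_SKETCH;
	if (apie && feat_ap)
		eca |= ECA_APIE_SKETCH;
	return eca;
}
```

So with the default apie = 1 and no AP feature, the bit in the state
description stays clear, even though the apie field itself is set.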
>
>>
>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>> +
>>> +
>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>> return;
>>>
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/16/2018 09:53 AM, Cornelia Huck wrote:
> On Mon, 16 Apr 2018 15:13:59 +0200
> Pierre Morel <[email protected]> wrote:
>
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> +* Register with the ap_bus for AP queue devices of type CEX4, CEX5 and
>>> + CEX6 and to provide the vfio_ap driver's probe and remove callback interfaces.
>> I wonder why the type of card has anything to do with this driver.
>> It should be transparent, the driver should be able to provide the
>> matrix (APM/AQM/ADM)
>> independently from the type of card in the slot.
> Would also be interested why this is limited to certain, newer cards.
> Did some kind of interface change (I dimly recall something like that),
> or are there simply no old systems with those older card types around to check
> whether it works?
That was a restriction recommended by our crypto architect.
>
> In either case, a short note would be good (does not need to go into
> any details).
I can do that.
>
On 17/04/2018 18:08, Tony Krowiak wrote:
> On 04/16/2018 09:05 AM, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> Registers a group notifier during the open of the mediated
>>> matrix device to get information on KVM presence through the
>>> VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
>>> to the kvm structure is saved inside the mediated matrix
>>> device. Once the VFIO AP device driver has access to KVM,
>>> access to the APs can be configured for the guest.
>>>
>>> Access to APs is configured when the file descriptor for the
>>> mediated matrix device is opened by userspace. The items to be
>>> configured are:
>>>
>>> 1. The ECA.28 bit in the SIE state description determines whether
>>> AP instructions are interpreted by the hardware or intercepted.
>>> The VFIO AP device driver relies on interpretive execution of
>>> AP instructions, so the ECA.28 bit will be set.
>>>
>>> 2. Guest access to AP adapters, usage domains and control domains
>>> is controlled by three bit masks referenced from the
>>> Crypto Control Block (CRYCB) referenced from the guest's SIE state
>>> description:
>>>
>>> * The AP Mask (APM) controls access to the AP adapters. Each bit
>>> in the APM represents an adapter number - from most significant
>>> to least significant bit - from 0 to 255. The bits in the APM
>>> are set according to the adapter numbers assigned to the mediated
>>> matrix device via its 'assign_adapter' sysfs attribute file.
>>>
>>> * The AP Queue Mask (AQM) controls access to the AP queues. Each bit
>>> in the AQM represents an AP queue index - from most significant
>>> to least significant bit - from 0 to 255. A queue index references
>>> a specific domain and is synonymous with the domain number. The
>>> bits in the AQM are set according to the domain numbers assigned
>>> to the mediated matrix device via its 'assign_domain' sysfs
>>> attribute file.
>>>
>>> * The AP Domain Mask (ADM) controls access to the AP control
>>> domains.
>>> Each bit in the ADM represents a control domain - from most
>>> significant to least significant bit - from 0-255. The
>>> bits in the ADM are set according to the domain numbers assigned
>>> to the mediated matrix device via its 'assign_control_domain'
>>> sysfs attribute file.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> drivers/s390/crypto/vfio_ap_ops.c | 50
>>> +++++++++++++++++++++++++++++++++
>>> drivers/s390/crypto/vfio_ap_private.h | 2 +
>>> 2 files changed, 52 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>>> b/drivers/s390/crypto/vfio_ap_ops.c
>>> index bc2b05e..e3ff5ab 100644
>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>> @@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device
>>> *mdev)
>>> return 0;
>>> }
>>>
>>> +static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>> + unsigned long action, void *data)
>>> +{
>>> + struct ap_matrix_mdev *matrix_mdev;
>>> +
>>> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>>> + matrix_mdev = container_of(nb, struct ap_matrix_mdev,
>>> + group_notifier);
>>> + matrix_mdev->kvm = data;
>>> + }
>>> +
>>> + return NOTIFY_OK;
>>> +}
>>> +
>>> +static int vfio_ap_mdev_open(struct mdev_device *mdev)
>>> +{
>>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> + unsigned long events;
>>> + int ret;
>>> +
>>> + matrix_mdev->group_notifier.notifier_call =
>>> vfio_ap_mdev_group_notifier;
>>> + events = VFIO_GROUP_NOTIFY_SET_KVM;
>>> +
>>> + ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>> + &events, &matrix_mdev->group_notifier);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
>>
>> Do you need this call ?
>> apie is always enabled in KVM if AP instructions are available.
>
> I suppose we don't, in which case we don't need the
> kvm_ap_interpret_instructions()
> function either ... at least not until we implement interception.
>
>>
>>
>> Whether or not instructions are interpreted is set for the VM as a whole.
>> This is not the right place to do it, since open is device dependent.
>
> As I stated above, at this time we probably do not need this; however,
> that will not always be the case. The setting is and always will be for
> the VM as a whole - unless the architecture changes - because it is
> controlled by a single bit (ECA.28). If you recall, I originally set
> interpretation in the vfio_ap device driver when notified of the
> VFIO_GROUP_NOTIFY_SET_KVM event. I believe ultimately that it is the
> device driver that should set the value for apie.
>
>
>
>>
>>
>> Or we only have one device in the VM at a time.
>> In this case, shouldn't we make it official by returning -EEXIST for
>> the second call?
>
> We do allow only one vfio-ap device at a time. QEMU will allow only
> one vfio-ap device
> to be configured for a guest. Should we also put a check in here?
QEMU is not the only possible user of this interface.
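A minimal sketch of what such a check could look like, assuming a per-VM
flag that records whether a vfio-ap device is already open. The struct, the
flag and the function names are hypothetical and not part of the patch; real
code would also need to take a lock (kvm->lock, for instance) around the
test and the update.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-VM state: is a vfio-ap device currently open? */
struct vm_sketch {
	bool ap_dev_in_use;
};

/* Refuse a second open for the same VM instead of relying on userspace */
static int ap_sketch_open(struct vm_sketch *vm)
{
	if (vm->ap_dev_in_use)
		return -EEXIST;

	vm->ap_dev_in_use = true;
	return 0;
}

/* Releasing the device makes the VM available for a new open */
static void ap_sketch_release(struct vm_sketch *vm)
{
	vm->ap_dev_in_use = false;
}
```

This would enforce the one-device restriction in the kernel, whichever
userspace program opens the mediated device.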
>
>>
>>
>>
>>> + if (ret)
>>> + return ret;
>>> +
>>> + ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
>>> + matrix_mdev->matrix);
>>> +
>>> + return ret;
>>> +}
>>> +
>>> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>> +{
>>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> +
>>> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
>>> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
>>
>> This call clears the apie in KVM.
>> This is only OK if we have a single device present until the end of
>> the VM,
>> otherwise AP instructions in the guest will fail after the release
>> until the end of the VM
>> or until a new device is plugged.
>
> See Message ID:
> <[email protected]> on the
> qemu mailing list. There will be only one vfio-ap device allowed for
> the MVP model.
Ditto.
Anyone can write a userland application using this interface.
>
>>
>>
>>
>>> + vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>> + &matrix_mdev->group_notifier);
>>> +}
>>> +
>>> static ssize_t name_show(struct kobject *kobj, struct device *dev,
>>> char *buf)
>>> {
>>> return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
>>> @@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev,
>>> struct device_attribute *attr,
>>> .mdev_attr_groups = vfio_ap_mdev_attr_groups,
>>> .create = vfio_ap_mdev_create,
>>> .remove = vfio_ap_mdev_remove,
>>> + .open = vfio_ap_mdev_open,
>>> + .release = vfio_ap_mdev_release,
>>> };
>>>
>>> int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
>>> diff --git a/drivers/s390/crypto/vfio_ap_private.h
>>> b/drivers/s390/crypto/vfio_ap_private.h
>>> index f248faf..48e2806 100644
>>> --- a/drivers/s390/crypto/vfio_ap_private.h
>>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>>> @@ -31,6 +31,8 @@ struct ap_matrix {
>>>
>>> struct ap_matrix_mdev {
>>> struct kvm_ap_matrix *matrix;
>>> + struct notifier_block group_notifier;
>>> + struct kvm *kvm;
>>> };
>>>
>>> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
>>
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/17/2018 12:13 PM, Pierre Morel wrote:
> On 17/04/2018 17:02, Tony Krowiak wrote:
>> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> The VFIO AP device model exploits interpretive execution of AP
>>>> instructions (APIE) to provide guests passthrough access to AP
>>>> devices. This patch introduces a new interface to enable and
>>>> disable APIE.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>> b/arch/s390/include/asm/kvm-ap.h
>>>> index 736e93e..a6c092e 100644
>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>> @@ -35,4 +35,20 @@
>>>> */
>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>
>>>> +/**
>>>> + * kvm_ap_interpret_instructions
>>>> + *
>>>> + * Indicate whether AP instructions shall be interpreted. If they
>>>> are not
>>>> + * interpreted, all AP instructions will be intercepted and routed
>>>> back to
>>>> + * userspace.
>>>> + *
>>>> + * @kvm: the virtual machine attributes
>>>> + * @enable: indicates whether AP instructions are to be
>>>> interpreted (true) or
>>>> + * or not (false).
>>>> + *
>>>> + * Returns 0 if completed successfully; otherwise, returns
>>>> -EOPNOTSUPP
>>>> + * indicating that AP instructions are not installed on the guest.
>>>> + */
>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>> +
>>>> #endif /* _ASM_KVM_AP */
>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>> b/arch/s390/include/asm/kvm_host.h
>>>> index 3162783..5470685 100644
>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>> __u32 crycbd;
>>>> __u8 aes_kw;
>>>> __u8 dea_kw;
>>>> + __u8 apie;
>>>> };
>>>>
>>>> #define APCB0_MASK_SIZE 1
>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>> index 991bae4..55d11b5 100644
>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>> }
>>>> }
>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>> +
>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>> +{
>>>> + int ret = 0;
>>>> +
>>>> + mutex_lock(&kvm->lock);
>>>> +
>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>>
>>> Do we really need to test CPU_FEAT_AP?
>>
>> Yes we do.
>
> really? why?
KVM will not enable KVM_S390_VM_CPU_FEAT_AP if the AP instructions are
not installed on the host. I assume - but have no way of verifying -
that interpretation would fail if the AP instructions are not installed
on the host. Do you know what would happen if AP instructions are
interpreted when not installed on the host?
>
>
>>
>>>
>>>
>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
>>> interpreted.
>>> shouldn't we add this information in the name?
>>> like KVM_S390_VM_CPU_FEAT_APIE
>>
>> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are
>> interpreted, it means
>> AP instructions are installed.
>
> Right same error I made all along this review.
> But AFAIK it means AP instructions are provided to the guest.
> Then should this function be called if the guest has no AP instructions ?
>
>
>>
>>>
>>>> + ret = -EOPNOTSUPP;
>>>> + goto done;
>>>> + }
>>>> +
>>>> + kvm->arch.crypto.apie = enable;
>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>> +
>>>> +done:
>>>> + mutex_unlock(&kvm->lock);
>>>> + return ret;
>>>> +}
>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index 55cd897..1dc8566 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm
>>>> *kvm)
>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>> kvm_ap_build_crycbd(kvm);
>>>>
>>>> + /* Default setting indicating SIE shall interpret AP
>>>> instructions */
>>>> + kvm->arch.crypto.apie = 1;
>>>> +
>>>> if (!test_kvm_facility(kvm, 76))
>>>> return;
>>>>
>>>> @@ -2434,6 +2437,12 @@ static void
>>>> kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>> {
>>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>>
>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>
>>> Do we call xxx_crypto_setup() if KVM does not support AP
>>> interpretation?
>>
>> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
>> kvm_arch_vcpu_setup(vcpu)
>> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it has
>> nothing
>> to do with whether AP interpretation is supported or not as it does much
>> more than that, including setting up of wrapping keys and the CRYCBD.
>
> Sorry, still the same error I made about CPU_FEAT_AP meaning AP
> instructions in the guest
> and not AP interpretation available.
> Could apie be set if AP instruction are not supported?
>
>>
>>>
>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>> +
>>>> +
>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>> return;
>>>>
>>>
>>
>
On 04/16/2018 09:05 AM, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> Registers a group notifier during the open of the mediated
>> matrix device to get information on KVM presence through the
>> VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
>> to the kvm structure is saved inside the mediated matrix
>> device. Once the VFIO AP device driver has access to KVM,
>> access to the APs can be configured for the guest.
>>
>> Access to APs is configured when the file descriptor for the
>> mediated matrix device is opened by userspace. The items to be
>> configured are:
>>
>> 1. The ECA.28 bit in the SIE state description determines whether
>> AP instructions are interpreted by the hardware or intercepted.
>> The VFIO AP device driver relies on interpretive execution of
>> AP instructions, so the ECA.28 bit will be set.
>>
>> 2. Guest access to AP adapters, usage domains and control domains
>> is controlled by three bit masks referenced from the
>> Crypto Control Block (CRYCB) referenced from the guest's SIE state
>> description:
>>
>> * The AP Mask (APM) controls access to the AP adapters. Each bit
>> in the APM represents an adapter number - from most significant
>> to least significant bit - from 0 to 255. The bits in the APM
>> are set according to the adapter numbers assigned to the mediated
>> matrix device via its 'assign_adapter' sysfs attribute file.
>>
>> * The AP Queue Mask (AQM) controls access to the AP queues. Each bit
>> in the AQM represents an AP queue index - from most significant
>> to least significant bit - from 0 to 255. A queue index references
>> a specific domain and is synonymous with the domain number. The
>> bits in the AQM are set according to the domain numbers assigned
>> to the mediated matrix device via its 'assign_domain' sysfs
>> attribute file.
>>
>> * The AP Domain Mask (ADM) controls access to the AP control
>> domains.
>> Each bit in the ADM represents a control domain - from most
>> significant to least significant bit - from 0-255. The
>> bits in the ADM are set according to the domain numbers assigned
>> to the mediated matrix device via its 'assign_control_domain'
>> sysfs attribute file.
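
As a rough illustration of the mask layout described above, the sketch
below sets and tests bits in a 256-bit mask whose bits are numbered from
the most significant bit. The type ap_mask256 and the helpers
ap_mask_set() and ap_mask_test() are hypothetical names used only for
this note; they are not part of the patch or of the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 256-bit mask: bit 0 is the most significant bit of byte 0. */
struct ap_mask256 {
	uint8_t bytes[32];
};

/* Mark adapter/domain number 'nr' (0..255) as assigned; ignore others. */
static void ap_mask_set(struct ap_mask256 *m, unsigned int nr)
{
	if (nr > 255)
		return;
	m->bytes[nr / 8] |= (uint8_t)(0x80u >> (nr % 8));
}

/* Report whether adapter/domain number 'nr' is assigned in the mask. */
static bool ap_mask_test(const struct ap_mask256 *m, unsigned int nr)
{
	if (nr > 255)
		return false;
	return (m->bytes[nr / 8] & (0x80u >> (nr % 8))) != 0;
}
```

With this numbering, assigning adapter 4 sets the value 0x08 in the first
byte of the mask, and assigning number 255 sets the lowest bit of the
last byte.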
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 50
>> +++++++++++++++++++++++++++++++++
>> drivers/s390/crypto/vfio_ap_private.h | 2 +
>> 2 files changed, 52 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index bc2b05e..e3ff5ab 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device
>> *mdev)
>> return 0;
>> }
>>
>> +static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>> + unsigned long action, void *data)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev;
>> +
>> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>> + matrix_mdev = container_of(nb, struct ap_matrix_mdev,
>> + group_notifier);
>> + matrix_mdev->kvm = data;
>> + }
>> +
>> + return NOTIFY_OK;
>> +}
>> +
>> +static int vfio_ap_mdev_open(struct mdev_device *mdev)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> + unsigned long events;
>> + int ret;
>> +
>> + matrix_mdev->group_notifier.notifier_call =
>> vfio_ap_mdev_group_notifier;
>> + events = VFIO_GROUP_NOTIFY_SET_KVM;
>> +
>> + ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>> + &events, &matrix_mdev->group_notifier);
>> + if (ret)
>> + return ret;
>> +
>> + ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
>
> Do you need this call ?
> apie is always enabled in KVM if AP instructions are available.
I suppose we don't, in which case we don't need the
kvm_ap_interpret_instructions()
function either ... at least not until we implement interception.
>
>
> Setting or not the interpretation is done for the VM as a whole.
> It is not the right place to do it here since open is device dependent.
As I stated above, we probably do not need this at this time; however,
that will not always be the case. The setting is, and always will be, for
the VM as a whole - unless the architecture changes - because it is
controlled by a single bit (ECA.28). If you recall, I originally set
interpretation in the vfio_ap device driver when notified of the
VFIO_GROUP_NOTIFY_SET_KVM event. I believe that ultimately it is the
device driver that should set the value for apie.
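
To make the notifier-driven approach mentioned above concrete, here is a
minimal user-space sketch of a callback that recovers its enclosing
device structure from the embedded notifier block, records the KVM
pointer and picks the interpretation mode. All types, the NOTIFY_SET_KVM
constant and the callback name are simplified stand-ins invented for this
sketch; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions for this sketch). */
struct kvm {
	int apie;
};

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
};

struct matrix_mdev {
	struct notifier_block group_notifier;
	struct kvm *kvm;
};

#define NOTIFY_SET_KVM	1UL	/* hypothetical event number */
#define NOTIFY_OK	1	/* stand-in return code */

/* Same idea as the kernel's container_of(), written out in plain C. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int group_notifier_cb(struct notifier_block *nb,
			     unsigned long action, void *data)
{
	struct matrix_mdev *m;

	if (action == NOTIFY_SET_KVM) {
		m = container_of(nb, struct matrix_mdev, group_notifier);
		m->kvm = data;
		/* The driver could choose the interpretation mode here. */
		if (m->kvm)
			m->kvm->apie = 1;
	}

	return NOTIFY_OK;
}
```

In this arrangement the decision is taken at the point where the driver
first learns which VM it belongs to, rather than in open().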
>
>
> Or we only have one device in the VM at a time.
> In this case, shouldn't we make it official by returning -EEXIST for
> the second call?
We do allow only one vfio-ap device at a time. QEMU will allow only one
vfio-ap device to be configured for a guest. Should we also put a check
in here?
>
>
>
>> + if (ret)
>> + return ret;
>> +
>> + ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
>> + matrix_mdev->matrix);
>> +
>> + return ret;
>> +}
>> +
>> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> +
>> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
>> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
>
> This call clears the apie in KVM.
> This is only OK if we have a single device present until the end of
> the VM,
> otherwise AP instructions in the guest will fail after the release
> until the end of the VM
> or until a new device is plugged.
See Message ID:
<[email protected]> on the
qemu mailing list. There will be only one vfio-ap device allowed for the
MVP model.
>
>
>
>> + vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>> + &matrix_mdev->group_notifier);
>> +}
>> +
>> static ssize_t name_show(struct kobject *kobj, struct device *dev,
>> char *buf)
>> {
>> return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
>> @@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev,
>> struct device_attribute *attr,
>> .mdev_attr_groups = vfio_ap_mdev_attr_groups,
>> .create = vfio_ap_mdev_create,
>> .remove = vfio_ap_mdev_remove,
>> + .open = vfio_ap_mdev_open,
>> + .release = vfio_ap_mdev_release,
>> };
>>
>> int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h
>> b/drivers/s390/crypto/vfio_ap_private.h
>> index f248faf..48e2806 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -31,6 +31,8 @@ struct ap_matrix {
>>
>> struct ap_matrix_mdev {
>> struct kvm_ap_matrix *matrix;
>> + struct notifier_block group_notifier;
>> + struct kvm *kvm;
>> };
>>
>> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
>
>
On 17/04/2018 18:14, Tony Krowiak wrote:
> On 04/16/2018 09:13 AM, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> This patch provides documentation describing the AP architecture and
>>> design concepts behind the virtualization of AP devices. It also
>>>
...snip...
>>> +The process for reserving an AP queue for use by a KVM guest is:
>>> +
>>> +* The vfio-ap driver during its initialization will perform the
>>> following:
>>> + * Create the 'vfio_ap' root device - /sys/devices/vfio_ap
>>> + * Create the 'matrix' device in the 'vfio_ap' root
>>> + * Register the matrix device with the device core
>>> +* Register with the ap_bus for AP queue devices of type CEX4, CEX5 and
>>> + CEX6 and to provide the vfio_ap driver's probe and remove
>>> callback interfaces.
>>
>> I wonder why the type of card has anything to do with this driver.
>> It should be transparent, the driver should be able to provide the
>> matrix (APM/AQM/ADM)
>> independently from the type of card in the slot.
>
> We've been down this road several times before. We are only supporting
> virtualization of
> CEX4 and newer cards. Also, the AP bus requires registering for
> specific card types.
Yes, I know, but the AP bus design may not be optimal for the AP matrix.
The AP matrix is device independent.
We just set/clear bits in a matrix and do not care what is plugged in.
So maybe make it clear that this device dependence is due to the current
AP bus interface design.
>
>>
>>
>> Regards,
>>
>> Pierre
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/17/2018 12:13 PM, Pierre Morel wrote:
> On 17/04/2018 17:02, Tony Krowiak wrote:
>> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> The VFIO AP device model exploits interpretive execution of AP
>>>> instructions (APIE) to provide guests passthrough access to AP
>>>> devices. This patch introduces a new interface to enable and
>>>> disable APIE.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>> b/arch/s390/include/asm/kvm-ap.h
>>>> index 736e93e..a6c092e 100644
>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>> @@ -35,4 +35,20 @@
>>>> */
>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>
>>>> +/**
>>>> + * kvm_ap_interpret_instructions
>>>> + *
>>>> + * Indicate whether AP instructions shall be interpreted. If they
>>>> are not
>>>> + * interpreted, all AP instructions will be intercepted and routed
>>>> back to
>>>> + * userspace.
>>>> + *
>>>> + * @kvm: the virtual machine attributes
>>>> + * @enable: indicates whether AP instructions are to be
>>>> interpreted (true) or
>>>> + * or not (false).
>>>> + *
>>>> + * Returns 0 if completed successfully; otherwise, returns
>>>> -EOPNOTSUPP
>>>> + * indicating that AP instructions are not installed on the guest.
>>>> + */
>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>> +
>>>> #endif /* _ASM_KVM_AP */
>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>> b/arch/s390/include/asm/kvm_host.h
>>>> index 3162783..5470685 100644
>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>> __u32 crycbd;
>>>> __u8 aes_kw;
>>>> __u8 dea_kw;
>>>> + __u8 apie;
>>>> };
>>>>
>>>> #define APCB0_MASK_SIZE 1
>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>> index 991bae4..55d11b5 100644
>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>> }
>>>> }
>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>> +
>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>> +{
>>>> + int ret = 0;
>>>> +
>>>> + mutex_lock(&kvm->lock);
>>>> +
>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>>
>>> Do we really need to test CPU_FEAT_AP?
>>
>> Yes we do.
>
> really? why?
Answered this in Message ID:
<[email protected]>
>
>>
>>>
>>>
>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
>>> interpreted.
>>> shouldn't we add this information in the name?
>>> like KVM_S390_VM_CPU_FEAT_APIE
>>
>> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are
>> interpreted, it means
>> AP instructions are installed.
>
> Right same error I made all along this review.
> But AFAIK it means AP instructions are provided to the guest.
> Then should this function be called if the guest has no AP instructions ?
Same answer as below. We have no control over who calls this interface, so
it behooves us to make sure it isn't called erroneously. I despise reading
code where I have to search all of the callers to ensure they perform a
required check ... why not just do it in the interface.
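
The design choice argued for here, checking the precondition once inside
the interface instead of in every caller, can be sketched as follows. The
vm structure and vm_set_apie() are hypothetical stand-ins written for
this note, not the functions from the patch, and the locking done by the
real function is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a VM's crypto state. */
struct vm {
	bool has_ap_feature;	/* stands in for KVM_S390_VM_CPU_FEAT_AP */
	bool apie;		/* requested interpretation mode */
};

/*
 * Validate inside the interface so that no caller can change the
 * interpretation mode of a VM that has no AP instructions.
 */
static int vm_set_apie(struct vm *vm, bool enable)
{
	if (!vm->has_ap_feature)
		return -EOPNOTSUPP;

	vm->apie = enable;
	return 0;
}
```

A caller then only has to check the return value; it does not need to
know which feature bits the interface depends on.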
>
>
>
>>
>>>
>>>> + ret = -EOPNOTSUPP;
>>>> + goto done;
>>>> + }
>>>> +
>>>> + kvm->arch.crypto.apie = enable;
>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>> +
>>>> +done:
>>>> + mutex_unlock(&kvm->lock);
>>>> + return ret;
>>>> +}
>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index 55cd897..1dc8566 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm
>>>> *kvm)
>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>> kvm_ap_build_crycbd(kvm);
>>>>
>>>> + /* Default setting indicating SIE shall interpret AP
>>>> instructions */
>>>> + kvm->arch.crypto.apie = 1;
>>>> +
>>>> if (!test_kvm_facility(kvm, 76))
>>>> return;
>>>>
>>>> @@ -2434,6 +2437,12 @@ static void
>>>> kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>> {
>>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>>
>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>
>>> Do we call xxx_crypto_setup() if KVM does not support AP
>>> interpretation?
>>
>> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
>> kvm_arch_vcpu_setup(vcpu)
>> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it has
>> nothing
>> to do with whether AP interpretation is supported or not as it does much
>> more than that, including setting up of wrapping keys and the CRYCBD.
>
> Sorry, still the same error I made about CPU_FEAT_AP meaning AP
> instructions in the guest
> and not AP interpretation available.
> Could apie be set if AP instruction are not supported?
Only if code authors and reviewers ensure that no future code changes
set the apie flag when the CPU_FEAT_AP is not set. Why do you see this
as a problem? I see it as defensive coding since we have no control over
who calls this interface.
>
>
>>
>>>
>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>> +
>>>> +
>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>> return;
>>>>
>>>
>>
>
On 04/17/2018 12:25 PM, Pierre Morel wrote:
> On 17/04/2018 18:14, Tony Krowiak wrote:
>> On 04/16/2018 09:13 AM, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> This patch provides documentation describing the AP architecture and
>>>> design concepts behind the virtualization of AP devices. It also
>>>>
> ...snip...
>>>> +The process for reserving an AP queue for use by a KVM guest is:
>>>> +
>>>> +* The vfio-ap driver during its initialization will perform the
>>>> following:
>>>> + * Create the 'vfio_ap' root device - /sys/devices/vfio_ap
>>>> + * Create the 'matrix' device in the 'vfio_ap' root
>>>> + * Register the matrix device with the device core
>>>> +* Register with the ap_bus for AP queue devices of type CEX4, CEX5
>>>> and
>>>> + CEX6 and to provide the vfio_ap driver's probe and remove
>>>> callback interfaces.
>>>
>>> I wonder why the type of card has anything to do with this driver.
>>> It should be transparent, the driver should be able to provide the
>>> matrix (APM/AQM/ADM)
>>> independently from the type of card in the slot.
>>
>> We've been down this road several times before. We are only
>> supporting virtualization of
>> CEX4 and newer cards. Also, the AP bus requires registering for
>> specific card types.
>
> Yes I know, but the AP BUS design may be not the optimal for the AP
> Matrix.
> The AP Matrix is device independent.
> We just write/clear bits in a matrix and we do not care what is plugged.
>
> So may be make clear that this device dependence is due to the actual
> AP BUS interface design.
Okay, will do.
>
>>
>>>
>>>
>>> Regards,
>>>
>>> Pierre
>>>
>>
>
On 04/17/2018 12:18 PM, Pierre Morel wrote:
> On 17/04/2018 18:08, Tony Krowiak wrote:
>> On 04/16/2018 09:05 AM, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> Registers a group notifier during the open of the mediated
>>>> matrix device to get information on KVM presence through the
>>>> VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
>>>> to the kvm structure is saved inside the mediated matrix
>>>> device. Once the VFIO AP device driver has access to KVM,
>>>> access to the APs can be configured for the guest.
>>>>
>>>> Access to APs is configured when the file descriptor for the
>>>> mediated matrix device is opened by userspace. The items to be
>>>> configured are:
>>>>
>>>> 1. The ECA.28 bit in the SIE state description determines whether
>>>> AP instructions are interpreted by the hardware or intercepted.
>>>> The VFIO AP device driver relies on interpretive execution of
>>>> AP instructions, so the ECA.28 bit will be set.
>>>>
>>>> 2. Guest access to AP adapters, usage domains and control domains
>>>> is controlled by three bit masks referenced from the
>>>> Crypto Control Block (CRYCB) referenced from the guest's SIE state
>>>> description:
>>>>
>>>> * The AP Mask (APM) controls access to the AP adapters. Each bit
>>>> in the APM represents an adapter number - from most significant
>>>> to least significant bit - from 0 to 255. The bits in the APM
>>>> are set according to the adapter numbers assigned to the
>>>> mediated
>>>> matrix device via its 'assign_adapter' sysfs attribute file.
>>>>
>>>> * The AP Queue Mask (AQM) controls access to the AP queues. Each bit
>>>> in the AQM represents an AP queue index - from most significant
>>>> to least significant bit - from 0 to 255. A queue index
>>>> references
>>>> a specific domain and is synonymous with the domain number. The
>>>> bits in the AQM are set according to the domain numbers assigned
>>>> to the mediated matrix device via its 'assign_domain' sysfs
>>>> attribute file.
>>>>
>>>> * The AP Domain Mask (ADM) controls access to the AP control
>>>> domains.
>>>> Each bit in the ADM represents a control domain - from most
>>>> significant to least significant bit - from 0-255. The
>>>> bits in the ADM are set according to the domain numbers assigned
>>>> to the mediated matrix device via its 'assign_control_domain'
>>>> sysfs attribute file.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> drivers/s390/crypto/vfio_ap_ops.c | 50
>>>> +++++++++++++++++++++++++++++++++
>>>> drivers/s390/crypto/vfio_ap_private.h | 2 +
>>>> 2 files changed, 52 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>>>> b/drivers/s390/crypto/vfio_ap_ops.c
>>>> index bc2b05e..e3ff5ab 100644
>>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>>> @@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct
>>>> mdev_device *mdev)
>>>> return 0;
>>>> }
>>>>
>>>> +static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>>> + unsigned long action, void *data)
>>>> +{
>>>> + struct ap_matrix_mdev *matrix_mdev;
>>>> +
>>>> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>>>> + matrix_mdev = container_of(nb, struct ap_matrix_mdev,
>>>> + group_notifier);
>>>> + matrix_mdev->kvm = data;
>>>> + }
>>>> +
>>>> + return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +static int vfio_ap_mdev_open(struct mdev_device *mdev)
>>>> +{
>>>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>>> + unsigned long events;
>>>> + int ret;
>>>> +
>>>> + matrix_mdev->group_notifier.notifier_call =
>>>> vfio_ap_mdev_group_notifier;
>>>> + events = VFIO_GROUP_NOTIFY_SET_KVM;
>>>> +
>>>> + ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>>> + &events, &matrix_mdev->group_notifier);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
>>>
>>> Do you need this call ?
>>> apie is always enabled in KVM if AP instructions are available.
>>
>> I suppose we don't, in which case we don't need the
>> kvm_ap_interpret_instructions()
>> function either ... at least not until we implement interception.
>>
>>>
>>>
>>> Setting or not the interpretation is done for the VM as a whole.
>>> It is not the right place to do it here since open is device dependent.
>>
>> As I stated above, at this time we probably do not need this, however;
>> that will not always be the case. The setting is and always will be
>> for the
>> VM in all - unless the architecture changes - because it is
>> controlled by a
>> single bit (ECA.28). If you recall, I originally set interpretation
>> in the
>> vfio_ap device driver when notified of the VFIO_GROUP_NOTIFY_SET_KVM
>> event.
>> I believe ultimately that it is the device driver that should set the
>> value
>> for apie.
>>
>>
>>
>>>
>>>
>>> Or we only have one device in the VM at a time.
>>> In this case, shouldn't we make it official by returning -EEXIST for
>>> the second call?
>>
>> We do allow only one vfio-ap device at a time. QEMU will allow only
>> one vfio-ap device
>> to be configured for a guest. Should we also put a check in here?
>
> QEMU is not the only possible user of this interface.
True .... I will put a check in here to make sure only one device is
created.
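
One possible shape for such a check, returning -EEXIST on a second open
as suggested earlier in this thread, is sketched below. The guest
structure, its ap_in_use field and both function names are assumptions
made for this sketch; a real implementation would also need to take a
lock around the test and the update.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-guest state; the field name is an assumption. */
struct guest {
	bool ap_in_use;
};

/* Allow a single AP device per guest; reject a second open. */
static int ap_device_open(struct guest *g)
{
	if (g->ap_in_use)
		return -EEXIST;

	g->ap_in_use = true;
	return 0;
}

/* Releasing the device makes the slot available again. */
static void ap_device_release(struct guest *g)
{
	g->ap_in_use = false;
}
```

Because the check lives in the driver, it applies to any userspace
program that opens the device, not only to QEMU.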
>
>
>>
>>>
>>>
>>>
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
>>>> + matrix_mdev->matrix);
>>>> +
>>>> + return ret;
>>>> +}
>>>> +
>>>> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>>> +{
>>>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>>> +
>>>> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
>>>> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
>>>
>>> This call clears the apie in KVM.
>>> This is only OK if we have a single device present until the end of
>>> the VM,
>>> otherwise AP instructions in the guest will fail after the release
>>> until the end of the VM
>>> or until a new device is plugged.
>>
>> See Message ID:
>> <[email protected]> on the
>> qemu mailing list. There will be only one vfio-ap device allowed for
>> the MVP model.
>
> Ditto.
> Anyone can write a userland application using this interface.
See comments above, not to mention I will probably remove this call.
>
> <[email protected]>
>>
>>>
>>>
>>>
>>>> + vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>>> + &matrix_mdev->group_notifier);
>>>> +}
>>>> +
>>>> static ssize_t name_show(struct kobject *kobj, struct device
>>>> *dev, char *buf)
>>>> {
>>>> return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
>>>> @@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev,
>>>> struct device_attribute *attr,
>>>> .mdev_attr_groups = vfio_ap_mdev_attr_groups,
>>>> .create = vfio_ap_mdev_create,
>>>> .remove = vfio_ap_mdev_remove,
>>>> + .open = vfio_ap_mdev_open,
>>>> + .release = vfio_ap_mdev_release,
>>>> };
>>>>
>>>> int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
>>>> diff --git a/drivers/s390/crypto/vfio_ap_private.h
>>>> b/drivers/s390/crypto/vfio_ap_private.h
>>>> index f248faf..48e2806 100644
>>>> --- a/drivers/s390/crypto/vfio_ap_private.h
>>>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>>>> @@ -31,6 +31,8 @@ struct ap_matrix {
>>>>
>>>> struct ap_matrix_mdev {
>>>> struct kvm_ap_matrix *matrix;
>>>> + struct notifier_block group_notifier;
>>>> + struct kvm *kvm;
>>>> };
>>>>
>>>> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
>>>
>>>
>>
>
On 17/04/2018 18:22, Tony Krowiak wrote:
> On 04/17/2018 12:13 PM, Pierre Morel wrote:
>> On 17/04/2018 17:02, Tony Krowiak wrote:
>>> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>> devices. This patch introduces a new interface to enable and
>>>>> disable APIE.
>>>>>
>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>> ---
>>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>>
>>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>>> b/arch/s390/include/asm/kvm-ap.h
>>>>> index 736e93e..a6c092e 100644
>>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>>> @@ -35,4 +35,20 @@
>>>>> */
>>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>>
>>>>> +/**
>>>>> + * kvm_ap_interpret_instructions
>>>>> + *
>>>>> + * Indicate whether AP instructions shall be interpreted. If they
>>>>> are not
>>>>> + * interpreted, all AP instructions will be intercepted and
>>>>> routed back to
>>>>> + * userspace.
>>>>> + *
>>>>> + * @kvm: the virtual machine attributes
>>>>> + * @enable: indicates whether AP instructions are to be
>>>>> interpreted (true) or
>>>>> + * or not (false).
>>>>> + *
>>>>> + * Returns 0 if completed successfully; otherwise, returns
>>>>> -EOPNOTSUPP
>>>>> + * indicating that AP instructions are not installed on the guest.
>>>>> + */
>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>>> +
>>>>> #endif /* _ASM_KVM_AP */
>>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>>> b/arch/s390/include/asm/kvm_host.h
>>>>> index 3162783..5470685 100644
>>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>>> __u32 crycbd;
>>>>> __u8 aes_kw;
>>>>> __u8 dea_kw;
>>>>> + __u8 apie;
>>>>> };
>>>>>
>>>>> #define APCB0_MASK_SIZE 1
>>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>>> index 991bae4..55d11b5 100644
>>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>>> }
>>>>> }
>>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>>> +
>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>>> +{
>>>>> + int ret = 0;
>>>>> +
>>>>> + mutex_lock(&kvm->lock);
>>>>> +
>>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>>>
>>>> Do we really need to test CPU_FEAT_AP?
>>>
>>> Yes we do.
>>
>> really? why?
>
> The KVM_S390_VM_CPU_FEAT_AP will not be enabled by KVM if the AP
> instructions are not installed on the host. I assume - but have
> no way of verifying - that if the AP instructions are not installed
> on the host, that interpretation would fail. Do you know what would
> happen if AP instructions are interpreted when not installed on
> the host?
If the host has no AP instructions (its ECA.28=0) but sets ECA.28 for
a guest, no AP instructions will be available in the guest.
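
For reference, the guard the patch places around the ECA bit can be
modelled in a few lines. This is only a sketch: the helper name
eca_apply_apie() is made up, and the mask value is derived on the
assumption that ECA is a 32-bit word whose bits are numbered from the
most significant bit, so that ECA.28 corresponds to 0x8.

```c
#include <stdbool.h>
#include <stdint.h>

/* ECA.28 in a 32-bit word numbered from the most significant bit. */
#define ECA_APIE_BIT (UINT32_C(1) << (31 - 28))

/*
 * Mirror of the logic in the patch: clear the bit first, then set it
 * only when interpretation is requested and the AP feature is present.
 */
static uint32_t eca_apply_apie(uint32_t eca, bool apie, bool feat_ap)
{
	eca &= ~ECA_APIE_BIT;
	if (apie && feat_ap)
		eca |= ECA_APIE_BIT;
	return eca;
}
```

With the feature test in place, the bit is never set for a guest on a
host without AP instructions, so the case described above does not
arise through this path.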
>
>>
>>
>>>
>>>>
>>>>
>>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions are
>>>> interpreted.
>>>> shouldn't we add this information in the name?
>>>> like KVM_S390_VM_CPU_FEAT_APIE
>>>
>>> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are
>>> interpreted, it means
>>> AP instructions are installed.
>>
>> Right same error I made all along this review.
>> But AFAIK it means AP instructions are provided to the guest.
>> Then should this function be called if the guest has no AP
>> instructions ?
>>
>>
>>>
>>>>
>>>>> + ret = -EOPNOTSUPP;
>>>>> + goto done;
>>>>> + }
>>>>> +
>>>>> + kvm->arch.crypto.apie = enable;
>>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>>> +
>>>>> +done:
>>>>> + mutex_unlock(&kvm->lock);
>>>>> + return ret;
>>>>> +}
>>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>> index 55cd897..1dc8566 100644
>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm
>>>>> *kvm)
>>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>>> kvm_ap_build_crycbd(kvm);
>>>>>
>>>>> + /* Default setting indicating SIE shall interpret AP
>>>>> instructions */
>>>>> + kvm->arch.crypto.apie = 1;
>>>>> +
>>>>> if (!test_kvm_facility(kvm, 76))
>>>>> return;
>>>>>
>>>>> @@ -2434,6 +2437,12 @@ static void
>>>>> kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>>> {
>>>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>>>
>>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>
>>>> Do we call xxx_crypto_setup() if KVM does not support AP
>>>> interpretation?
>>>
>>> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
>>> kvm_arch_vcpu_setup(vcpu)
>>> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it has
>>> nothing
>>> to do with whether AP interpretation is supported or not as it does
>>> much
>>> more than that, including setting up of wrapping keys and the CRYCBD.
>>
>> Sorry, still the same error I made about CPU_FEAT_AP meaning AP
>> instructions in the guest
>> and not AP interpretation available.
>> Could apie be set if AP instruction are not supported?
>>
>>>
>>>>
>>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>>> +
>>>>> +
>>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>>> return;
>>>>>
>>>>
>>>
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On Tue, 17 Apr 2018 09:31:00 -0400
Tony Krowiak <[email protected]> wrote:
> My preference would be one of the following:
>
> 1. All of the interfaces defined in arch/s390/include/asm/ap.h
> are implemented in a file that is built whether ZCRYPT is
> built or not.
>
> 2. The drivers/s390/crypto/ap_asm.h file containing the functions
> that execute the AP instructions are made available outside of
> the AP bus, for example; arch/s390/include/asm
>
> I requested this from the maintainer but was told we don't want to
> have any crypto adapter support when the host AP functionality is
> disabled (CONFIG_ZCRYPT=n). This makes sense, however; I think it is
> a bit confusing to have a header file (arch/s390/include/asm/ap.h)
> with interfaces that are conditionally built.
>
> This is why I chose the ifdeffery (as you call it) approach. The
> only other solution I can conjure is to duplicate the asm code for
> the AP instructions needed in KVM and bypass using the AP bus
> interfaces.
I think at the root of this is an unfortunate mixup in the
architecture: The format of the crycb changes depending on some ap
feature being installed. Providing the crycb does not have anything to
do with ap device usage in the host, but we need to issue an ap
instruction to get this right. [Correct me if I'm wrong; but that's
what I get without being able to consult the actual architecture.]
So, exporting *all* of the interfaces is probably not needed anyway. I
think it boils down to either "export the interfaces where a mixup
happened, and keep the rest to zcrypt only", or "duplicate the
instructions for kvm usage".
I hope we can find a solution here, as this seems to be one of the main
discussion points :/ (FWIW, I think the basic driver interface is sane.)
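To make the trade-off concrete, here is a minimal user-space sketch of the
usual way to confine the ifdeffery to a header: the one query KVM needs is
always declared, and a static inline stub takes over when the AP bus is not
built. All names (CONFIG_ZCRYPT_SKETCH, ap_query_apxa_sketch,
kvm_apxa_installed_sketch) are hypothetical stand-ins, not existing kernel
symbols, and the "hardware answer" is a placeholder.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_ZCRYPT; define it to model ZCRYPT=y. */
/* #define CONFIG_ZCRYPT_SKETCH 1 */

#ifdef CONFIG_ZCRYPT_SKETCH
/* With the AP bus built, real code would issue the AP query instruction. */
static inline int ap_query_apxa_sketch(bool *apxa)
{
	*apxa = true;	/* placeholder for the hardware answer */
	return 0;
}
#else
/* Without the AP bus, callers get a defined error, not a link failure. */
static inline int ap_query_apxa_sketch(bool *apxa)
{
	*apxa = false;
	return -EOPNOTSUPP;
}
#endif

/* KVM side: treat APXA as installed only if the query succeeds. */
static inline bool kvm_apxa_installed_sketch(void)
{
	bool apxa;

	return ap_query_apxa_sketch(&apxa) == 0 && apxa;
}
```

Note that with the stub, a host built without the AP bus never reports
APXA, even if the hardware has it. That is exactly the mixup described
above, so the sketch illustrates the problem rather than resolving it.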
On 04/17/2018 11:10 AM, Cornelia Huck wrote:
> On Tue, 17 Apr 2018 10:55:30 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 04/17/2018 10:29 AM, Halil Pasic wrote:
>>>
>>> On 04/15/2018 11:22 PM, Tony Krowiak wrote:
>>>> Introduces a new function to reset the crypto attributes for all
>>>> vcpus whether they are running or not. Each vcpu in KVM will
>>>> be removed from SIE prior to resetting the crypto attributes in its
>>>> SIE state description. After all vcpus have had their crypto attributes
>>>> reset the vcpus will be restored to SIE.
>>>>
>>>> This function will be used in a later patch to set the ECA.28
>>>> bit in the SIE state description to enable interpretive execution of
>>>> AP instructions. It will also be incorporated into the
>>>> kvm_s390_vm_set_crypto(kvm) function to fix an issue whereby the crypto
>>>> key wrapping attributes could potentially get out of synch for running
>>>> vcpus.
>>>>
>>> Wasn't this 'issue' reported by me by any chance?
>> Yes it was .... was I supposed to include that fact in the commit message?
> A Reported-by: is usually nice.
I wasn't aware of such a tag.
>
>>> I agree with Connie, we don't need the forward reference to
>>> ECA.28.
>> I'm not sure that's exactly what she said, but I'd be more than happy
>> to remove it.
> It was kind of implied :)
>
On 04/17/2018 11:21 AM, Cornelia Huck wrote:
> On Tue, 17 Apr 2018 10:26:57 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 04/17/2018 06:10 AM, Cornelia Huck wrote:
>>> On Tue, 17 Apr 2018 09:49:58 +0200
>>> "Harald Freudenberger" <[email protected]> wrote:
>>>
>>>> Didn't we say that when APXA is not available there is no Crypto support
>>>> for KVM ?
>>> [Going by the code, as I don't have access to the architecture]
>>>
>>> Current status seems to be:
>>> - setup crycb if facility 76 is available (that's MSAX3, I guess?)
>> The crycb is set up regardless of whether STFLE.76 (MSAX3) is
>> installed or not.
> Hm, the current code does a quick exit if bit 76 is not set, doesn't
> it?
I guess that depends upon what you mean by current code. If you are talking
about the code as it is distributed today - i.e., before my patch series -
then you are correct. This patch changes that; it initializes the
kvm->arch.crypto.crycbd to point to the CRYCB, then clears the format bits
(kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK)) which is the same as
setting the CRYCB format to format 0. It is only after this that the
check is done to determine whether STFLE.76 is set.
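As a rough illustration of that ordering, here is a self-contained sketch
(user-space C, not the patch itself). The function name and the numeric
encodings of the format bits are assumptions made for the example; only the
sequence "store address, clear format bits, then check MSAX3, then check
APXA" is taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encodings of the CRYCB format bits in the low bits of the CRYCBD. */
#define CRYCB_FORMAT_MASK 0x00000003u
#define CRYCB_FORMAT0     0x00000000u
#define CRYCB_FORMAT1     0x00000001u
#define CRYCB_FORMAT2     0x00000003u

/*
 * Build a CRYCB designation from the CRYCB address.
 * msax3: STFLE.76 is set; apxa: the APXA facility is installed.
 */
static uint32_t build_crycbd_sketch(uint32_t crycb_addr, bool msax3, bool apxa)
{
	uint32_t crycbd = crycb_addr;

	/* Clearing the format bits is equivalent to selecting format 0. */
	crycbd &= ~CRYCB_FORMAT_MASK;

	/* Only after that is MSAX3 checked; without it, format 0 stays. */
	if (!msax3)
		return crycbd;

	/* MSAX3 without APXA gives format 1, MSAX3 with APXA gives format 2. */
	crycbd |= apxa ? CRYCB_FORMAT2 : CRYCB_FORMAT1;
	return crycbd;
}
```

With msax3 == false the result carries format 0 regardless of apxa, which
is the behavior change relative to current upstream.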
>
>>> - use format 2 if APXA is available, else use format 1
>> Use format 0 if MSAX3 is not available
>> Use format 1 if MSAX3 is available but APXA is not
>> Use format 2 if MSAX3 and APXA is available
>>
>>> From Tony's patch description, the goal seems to be:
>>> - setup crycb even if MSAX3 is not available
>> Yes, that is true
>>
>>> So my understanding is that we use APXA only to decide on the format of
>>> the crycb, but provide it in any case?
>> Yes, that is true
> With the format selection you outlined above, I guess. Makes sense from
> my point of view (just looking at the source code).
It also implements what is stated in the architecture doc.
>
>>> (Not providing a crycb if APXA is not available would be loss of
>>> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
>>> available is a different game, of course.)
>> This would require a change to enabling the CPU model feature for
>> AP.
> But would it actually make sense to tie vfio-ap to APXA? This needs to
> be answered by folks with access to the architecture :)
I don't see any reason to do that from an architectural perspective.
One can access AP devices whether or not APXA is installed; its absence
just limits the range of devices that can be addressed.
>
> [Personally, I think we should go with the version that uses the least
> restrictions without introducing over-complex code. What constitutes
> "over-complex code" is of course in the eye of the beholder...]
I agree.
>
On 04/17/2018 12:55 PM, Pierre Morel wrote:
> On 17/04/2018 18:22, Tony Krowiak wrote:
>> On 04/17/2018 12:13 PM, Pierre Morel wrote:
>>> On 17/04/2018 17:02, Tony Krowiak wrote:
>>>> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>>>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>> devices. This patch introduces a new interface to enable and
>>>>>> disable APIE.
>>>>>>
>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>> ---
>>>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>>>> b/arch/s390/include/asm/kvm-ap.h
>>>>>> index 736e93e..a6c092e 100644
>>>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>>>> @@ -35,4 +35,20 @@
>>>>>> */
>>>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>>>
>>>>>> +/**
>>>>>> + * kvm_ap_interpret_instructions
>>>>>> + *
>>>>>> + * Indicate whether AP instructions shall be interpreted. If
>>>>>> they are not
>>>>>> + * interpreted, all AP instructions will be intercepted and
>>>>>> routed back to
>>>>>> + * userspace.
>>>>>> + *
>>>>>> + * @kvm: the virtual machine attributes
>>>>>> + * @enable: indicates whether AP instructions are to be
>>>>>> interpreted (true) or
>>>>>> + * or not (false).
>>>>>> + *
>>>>>> + * Returns 0 if completed successfully; otherwise, returns
>>>>>> -EOPNOTSUPP
>>>>>> + * indicating that AP instructions are not installed on the guest.
>>>>>> + */
>>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>>>> +
>>>>>> #endif /* _ASM_KVM_AP */
>>>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>>>> b/arch/s390/include/asm/kvm_host.h
>>>>>> index 3162783..5470685 100644
>>>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>>>> __u32 crycbd;
>>>>>> __u8 aes_kw;
>>>>>> __u8 dea_kw;
>>>>>> + __u8 apie;
>>>>>> };
>>>>>>
>>>>>> #define APCB0_MASK_SIZE 1
>>>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>>>> index 991bae4..55d11b5 100644
>>>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>>>> }
>>>>>> }
>>>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>>>> +
>>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>>>> +{
>>>>>> + int ret = 0;
>>>>>> +
>>>>>> + mutex_lock(&kvm->lock);
>>>>>> +
>>>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>>>>
>>>>> Do we really need to test CPU_FEAT_AP?
>>>>
>>>> Yes we do.
>>>
>>> really? why?
>>
>> The KVM_S390_VM_CPU_FEAT_AP will not be enabled by KVM if the AP
>> instructions are not installed on the host. I assume - but have
>> no way of verifying - that if the AP instructions are not installed
>> on the host, that interpretation would fail. Do you know what would
>> happen if AP instructions are interpreted when not installed on
>> the host?
>
> If the host has no AP instructions (its ECA.28=0) but sets ECA.28 for a
> guest, there will be no AP instructions available in the guest.
Then there's the answer to your question; this is why we need to test
CPU_FEAT_AP.
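To spell out the gating, here is a small user-space model of the two checks
in the patch. The struct and function names are hypothetical stand-ins for
the KVM structures; ECA_APIE is derived from "ECA.28" under the assumption
of IBM bit numbering (bit 0 is the most significant bit of a 32-bit word).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* ECA.28 in IBM bit numbering (bit 0 is the MSB of the 32-bit word). */
#define ECA_APIE (1u << (31 - 28))

/* Hypothetical, stripped-down stand-in for the per-VM crypto state. */
struct vm_sketch {
	bool feat_ap;	/* models KVM_S390_VM_CPU_FEAT_AP */
	bool apie;	/* models kvm->arch.crypto.apie */
};

/* Models the enable path: refuse when the guest has no AP instructions. */
static int interpret_instructions_sketch(struct vm_sketch *vm, bool enable)
{
	if (!vm->feat_ap)
		return -EOPNOTSUPP;
	vm->apie = enable;
	return 0;
}

/* Models the per-vcpu setup: ECA.28 is set only if both conditions hold. */
static uint32_t setup_eca_sketch(const struct vm_sketch *vm, uint32_t eca)
{
	eca &= ~ECA_APIE;
	if (vm->apie && vm->feat_ap)
		eca |= ECA_APIE;
	return eca;
}
```

Since apie defaults to 1 at crypto init, the feature test in the setup
path is what keeps ECA.28 clear for a guest without AP instructions.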
>
>
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions
>>>>> are interpreted.
>>>>> shouldn't we add this information in the name?
>>>>> like KVM_S390_VM_CPU_FEAT_APIE
>>>>
>>>> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are
>>>> interpreted, it means
>>>> AP instructions are installed.
>>>
>>> Right same error I made all along this review.
>>> But AFAIK it means AP instructions are provided to the guest.
>>> Then should this function be called if the guest has no AP
>>> instructions ?
>>>
>>>
>>>>
>>>>>
>>>>>> + ret = -EOPNOTSUPP;
>>>>>> + goto done;
>>>>>> + }
>>>>>> +
>>>>>> + kvm->arch.crypto.apie = enable;
>>>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>>>> +
>>>>>> +done:
>>>>>> + mutex_unlock(&kvm->lock);
>>>>>> + return ret;
>>>>>> +}
>>>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>> index 55cd897..1dc8566 100644
>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct kvm
>>>>>> *kvm)
>>>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>>>> kvm_ap_build_crycbd(kvm);
>>>>>>
>>>>>> + /* Default setting indicating SIE shall interpret AP
>>>>>> instructions */
>>>>>> + kvm->arch.crypto.apie = 1;
>>>>>> +
>>>>>> if (!test_kvm_facility(kvm, 76))
>>>>>> return;
>>>>>>
>>>>>> @@ -2434,6 +2437,12 @@ static void
>>>>>> kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>>>> {
>>>>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>>>>
>>>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>>
>>>>> Do we call xxx_crypto_setup() if KVM does not support AP
>>>>> interpretation?
>>>>
>>>> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
>>>> kvm_arch_vcpu_setup(vcpu)
>>>> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it has
>>>> nothing
>>>> to do with whether AP interpretation is supported or not as it does
>>>> much
>>>> more than that, including setting up of wrapping keys and the CRYCBD.
>>>
>>> Sorry, still the same error I made about CPU_FEAT_AP meaning AP
>>> instructions in the guest
>>> and not AP interpretation available.
>>> Could apie be set if AP instruction are not supported?
>>>
>>>>
>>>>>
>>>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>>>> +
>>>>>> +
>>>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>>>> return;
>>>>>>
>>>>>
>>>>
>>>
>>
>
On 04/17/2018 12:56 PM, Cornelia Huck wrote:
> On Tue, 17 Apr 2018 09:31:00 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> My preference would be one of the following:
>>
>> 1. All of the interfaces defined in arch/s390/include/asm/ap.h
>> are implemented in a file that is built whether ZCRYPT is
>> built or not.
>>
>> 2. The drivers/s390/crypto/ap_asm.h file containing the functions
>> that execute the AP instructions are made available outside of
>> the AP bus, for example; arch/s390/include/asm
>>
>> I requested this from the maintainer but was told we don't want to
>> have any crypto adapter support when the host AP functionality is
>> disabled (CONFIG_ZCRYPT=n). This makes sense, however; I think it is
>> a bit confusing to have a header file (arch/s390/include/asm/ap.h)
>> with interfaces that are conditionally built.
>>
>> This is why I chose the ifdeffery (as you call it) approach. The
>> only other solution I can conjure is to duplicate the asm code for
>> the AP instructions needed in KVM and bypass using the AP bus
>> interfaces.
> I think at the root of this is an unfortunate mixup in the
> architecture: The format of the crycb changes depending on some ap
> feature being installed. Providing the crycb does not have anything to
> do with ap device usage in the host, but we need to issue an ap
> instruction to get this right. [Correct me if I'm wrong; but that's
> what I get without being able to consult the actual architecture.]
That sums it up.
>
> So, exporting *all* of the interfaces is probably not needed anyway. I
> think it boils down to either "export the interfaces where a mixup
> happened, and keep the rest to zcrypt only", or "duplicate the
> instructions for kvm usage".
I only suggested exporting all of the interfaces because the others may
be needed down the road when interception is implemented for full
virtualization of AP devices.
>
> I hope we can find a solution here, as this seems to be one of the main
> discussion points :/ (FWIW, I think the basic driver interface is sane.)
I will work on coming up with something that attempts to take into
consideration all of the comments thus far. In the meantime, I will keep
my eyes on this space in case anybody comes up with a better, concrete
resolution.
>
On Tue, 17 Apr 2018 14:08:59 -0400
Tony Krowiak <[email protected]> wrote:
> On 04/17/2018 11:21 AM, Cornelia Huck wrote:
> > On Tue, 17 Apr 2018 10:26:57 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> On 04/17/2018 06:10 AM, Cornelia Huck wrote:
> >>> On Tue, 17 Apr 2018 09:49:58 +0200
> >>> "Harald Freudenberger" <[email protected]> wrote:
> >>>
> >>>> Didn't we say that when APXA is not available there is no Crypto support
> >>>> for KVM ?
> >>> [Going by the code, as I don't have access to the architecture]
> >>>
> >>> Current status seems to be:
> >>> - setup crycb if facility 76 is available (that's MSAX3, I guess?)
> >> The crycb is set up regardless of whether STFLE.76 (MSAX3) is
> >> installed or not.
> > Hm, the current code does a quick exit if bit 76 is not set, doesn't
> > it?
>
> I guess that depends upon what you mean by current code. If you are talking
> about the code as it is distributed today - i.e., before my patch series -
> then you are correct. This patch changes that; it initializes the
> kvm->arch.crypto.crycbd to point to the CRYCB, then clears the format bits
> (kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK)) which is the same as
> setting the CRYCB format to format 0. It is only after this that the
> check is done to determine whether STFLE.76 is set.
Ah yes, with "current" I referred to current upstream.
>
> >
> >>> - use format 2 if APXA is available, else use format 1
> >> Use format 0 if MSAX3 is not available
> >> Use format 1 if MSAX3 is available but APXA is not
> >> Use format 2 if MSAX3 and APXA is available
> >>
> >>> From Tony's patch description, the goal seems to be:
> >>> - setup crycb even if MSAX3 is not available
> >> Yes, that is true
> >>
> >>> So my understanding is that we use APXA only to decide on the format of
> >>> the crycb, but provide it in any case?
> >> Yes, that is true
> > With the format selection you outlined above, I guess. Makes sense from
> > my point of view (just looking at the source code).
> It also implements what is stated in the architecture doc.
OK, great.
> >
> >>> (Not providing a crycb if APXA is not available would be loss of
> >>> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
> >>> available is a different game, of course.)
> >> This would require a change to enabling the CPU model feature for
> >> AP.
> > But would it actually make sense to tie vfio-ap to APXA? This needs to
> > be answered by folks with access to the architecture :)
>
> I don't see any reason to do that from an architectural perspective.
> One can access AP devices whether APXA is installed or not, it just limits
> the range of devices that can be addressed
So I guess we should not introduce a tie-in then (unless it radically
simplifies the code...)
On 17/04/2018 20:11, Tony Krowiak wrote:
> On 04/17/2018 12:55 PM, Pierre Morel wrote:
>> On 17/04/2018 18:22, Tony Krowiak wrote:
>>> On 04/17/2018 12:13 PM, Pierre Morel wrote:
>>>> On 17/04/2018 17:02, Tony Krowiak wrote:
>>>>> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>>>>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>>> devices. This patch introduces a new interface to enable and
>>>>>>> disable APIE.
>>>>>>>
>>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>>> ---
>>>>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>>>>> b/arch/s390/include/asm/kvm-ap.h
>>>>>>> index 736e93e..a6c092e 100644
>>>>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>>>>> @@ -35,4 +35,20 @@
>>>>>>> */
>>>>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>>>>
>>>>>>> +/**
>>>>>>> + * kvm_ap_interpret_instructions
>>>>>>> + *
>>>>>>> + * Indicate whether AP instructions shall be interpreted. If
>>>>>>> they are not
>>>>>>> + * interpreted, all AP instructions will be intercepted and
>>>>>>> routed back to
>>>>>>> + * userspace.
>>>>>>> + *
>>>>>>> + * @kvm: the virtual machine attributes
>>>>>>> + * @enable: indicates whether AP instructions are to be
>>>>>>> interpreted (true) or
>>>>>>> + * or not (false).
>>>>>>> + *
>>>>>>> + * Returns 0 if completed successfully; otherwise, returns
>>>>>>> -EOPNOTSUPP
>>>>>>> + * indicating that AP instructions are not installed on the guest.
>>>>>>> + */
>>>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>>>>> +
>>>>>>> #endif /* _ASM_KVM_AP */
>>>>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>>>>> b/arch/s390/include/asm/kvm_host.h
>>>>>>> index 3162783..5470685 100644
>>>>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>>>>> __u32 crycbd;
>>>>>>> __u8 aes_kw;
>>>>>>> __u8 dea_kw;
>>>>>>> + __u8 apie;
>>>>>>> };
>>>>>>>
>>>>>>> #define APCB0_MASK_SIZE 1
>>>>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>>>>> index 991bae4..55d11b5 100644
>>>>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>>>>> }
>>>>>>> }
>>>>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>>>>> +
>>>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>>>>> +{
>>>>>>> + int ret = 0;
>>>>>>> +
>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>> +
>>>>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>>>>>
>>>>>> Do we really need to test CPU_FEAT_AP?
>>>>>
>>>>> Yes we do.
>>>>
>>>> really? why?
>>>
>>> The KVM_S390_VM_CPU_FEAT_AP will not be enabled by KVM if the AP
>>> instructions are not installed on the host. I assume - but have
>>> no way of verifying - that if the AP instructions are not installed
>>> on the host, that interpretation would fail. Do you know what would
>>> happen if AP instructions are interpreted when not installed on
>>> the host?
>>
>> If the host has no AP instructions (its ECA.28=0) but sets ECA.28 for a
>> guest, there will be no AP instructions available in the guest.
>
> Then there's the answer to your question; this is why we need to test
> CPU_FEAT_AP.
We can postpone this discussion until we discuss VSIE.
For this specific call I just wanted to point out that, obviously, this
function should not be called if the guest has no AP instructions.
>
>>
>>
>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions
>>>>>> are interpreted.
>>>>>> shouldn't we add this information in the name?
>>>>>> like KVM_S390_VM_CPU_FEAT_APIE
>>>>>
>>>>> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are
>>>>> interpreted, it means
>>>>> AP instructions are installed.
>>>>
>>>> Right same error I made all along this review.
>>>> But AFAIK it means AP instructions are provided to the guest.
>>>> Then should this function be called if the guest has no AP
>>>> instructions ?
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>> + ret = -EOPNOTSUPP;
>>>>>>> + goto done;
>>>>>>> + }
>>>>>>> +
>>>>>>> + kvm->arch.crypto.apie = enable;
>>>>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>>>>> +
>>>>>>> +done:
>>>>>>> + mutex_unlock(&kvm->lock);
>>>>>>> + return ret;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>>> index 55cd897..1dc8566 100644
>>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct
>>>>>>> kvm *kvm)
>>>>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>>>>> kvm_ap_build_crycbd(kvm);
>>>>>>>
>>>>>>> + /* Default setting indicating SIE shall interpret AP
>>>>>>> instructions */
>>>>>>> + kvm->arch.crypto.apie = 1;
>>>>>>> +
>>>>>>> if (!test_kvm_facility(kvm, 76))
>>>>>>> return;
>>>>>>>
>>>>>>> @@ -2434,6 +2437,12 @@ static void
>>>>>>> kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>>>>> {
>>>>>>> vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>>>>>
>>>>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>>>
>>>>>> Do we call xxx_crypto_setup() if KVM does not support AP
>>>>>> interpretation?
>>>>>
>>>>> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
>>>>> kvm_arch_vcpu_setup(vcpu)
>>>>> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it
>>>>> has nothing
>>>>> to do with whether AP interpretation is supported or not as it
>>>>> does much
>>>>> more than that, including setting up of wrapping keys and the CRYCBD.
>>>>
>>>> Sorry, still the same error I made about CPU_FEAT_AP meaning AP
>>>> instructions in the guest
>>>> and not AP interpretation available.
>>>> Could apie be set if AP instruction are not supported?
>>>>
>>>>>
>>>>>>
>>>>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>>>>> +
>>>>>>> +
>>>>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>>>>> return;
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 15/04/2018 23:22, Tony Krowiak wrote:
> Registers a group notifier during the open of the mediated
> matrix device to get information on KVM presence through the
> VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
> to the kvm structure is saved inside the mediated matrix
> device. Once the VFIO AP device driver has access to KVM,
> access to the APs can be configured for the guest.
>
> Access to APs is configured when the file descriptor for the
> mediated matrix device is opened by userspace. The items to be
> configured are:
>
> 1. The ECA.28 bit in the SIE state description determines whether
> AP instructions are interpreted by the hardware or intercepted.
> The VFIO AP device driver relies on interpretive execution of
> AP instructions, so the ECA.28 bit will be set.
>
> 2. Guest access to AP adapters, usage domains and control domains
> is controlled by three bit masks referenced from the
> Crypto Control Block (CRYCB) referenced from the guest's SIE state
> description:
>
> * The AP Mask (APM) controls access to the AP adapters. Each bit
> in the APM represents an adapter number - from most significant
> to least significant bit - from 0 to 255. The bits in the APM
> are set according to the adapter numbers assigned to the mediated
> matrix device via its 'assign_adapter' sysfs attribute file.
>
> * The AP Queue Mask (AQM) controls access to the AP queues. Each bit
> in the AQM represents an AP queue index - from most significant
> to least significant bit - from 0 to 255. A queue index references
> a specific domain and is synonymous with the domain number. The
> bits in the AQM are set according to the domain numbers assigned
> to the mediated matrix device via its 'assign_domain' sysfs
> attribute file.
>
> * The AP Domain Mask (ADM) controls access to the AP control domains.
> Each bit in the ADM represents a control domain - from most
> significant to least significant bit - from 0-255. The
> bits in the ADM are set according to the domain numbers assigned
> to the mediated matrix device via its 'assign_control_domain'
> sysfs attribute file.
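The "most significant bit first" numbering used by the three masks can be
sketched as below. This is a user-space illustration under my own
assumptions: the type and helper names are hypothetical, and the masks are
modeled as plain byte arrays rather than the kernel's bitmap helpers.

```c
#include <stdbool.h>
#include <stdint.h>

#define AP_MASK_BITS 256

/* A 256-bit mask stored as 32 bytes; bit 0 is the MSB of byte 0. */
struct ap_mask_sketch {
	uint8_t bytes[AP_MASK_BITS / 8];
};

/* Set the bit for adapter/domain number id; returns -1 if out of range. */
static int ap_mask_set(struct ap_mask_sketch *m, unsigned int id)
{
	if (id >= AP_MASK_BITS)
		return -1;
	m->bytes[id / 8] |= (uint8_t)(0x80u >> (id % 8));
	return 0;
}

/* Test the bit for adapter/domain number id. */
static bool ap_mask_test(const struct ap_mask_sketch *m, unsigned int id)
{
	if (id >= AP_MASK_BITS)
		return false;
	return (m->bytes[id / 8] & (0x80u >> (id % 8))) != 0;
}

/* A queue (adapter, domain) is usable only if both masks have the bit set. */
static bool ap_queue_allowed(const struct ap_mask_sketch *apm,
			     const struct ap_mask_sketch *aqm,
			     unsigned int adapter, unsigned int domain)
{
	return ap_mask_test(apm, adapter) && ap_mask_test(aqm, domain);
}
```

Assigning adapter 4 and domains 1 and 2 this way yields exactly the queues
(4,1) and (4,2), i.e. the cross product of the two lists.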
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 50 +++++++++++++++++++++++++++++++++
> drivers/s390/crypto/vfio_ap_private.h | 2 +
> 2 files changed, 52 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index bc2b05e..e3ff5ab 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
> return 0;
> }
>
> +static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
> + unsigned long action, void *data)
> +{
> + struct ap_matrix_mdev *matrix_mdev;
> +
> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
> + matrix_mdev = container_of(nb, struct ap_matrix_mdev,
> + group_notifier);
> + matrix_mdev->kvm = data;
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> +static int vfio_ap_mdev_open(struct mdev_device *mdev)
> +{
> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> + unsigned long events;
> + int ret;
> +
> + matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier;
> + events = VFIO_GROUP_NOTIFY_SET_KVM;
> +
> + ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
> + &events, &matrix_mdev->group_notifier);
> + if (ret)
> + return ret;
> +
> + ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
> + if (ret)
> + return ret;
> +
> + ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
> + matrix_mdev->matrix);
If all went OK, you may want to increase the module reference count
to avoid removing the module while in use by QEMU.
> +
> + return ret;
> +}
> +
> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
> +{
> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> +
> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
> + vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
> + &matrix_mdev->group_notifier);
... and also decrease the reference count.
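A sketch of what I mean, as a user-space model only. In the driver this
would presumably be try_module_get(THIS_MODULE) in open and
module_put(THIS_MODULE) in release; the struct and the *_sketch helpers
below are hypothetical and just model the counting. The sketch takes the
reference first and drops it again on failure, which ends up in the same
state as taking it only after success.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a module: a use count and an "unloading" flag. */
struct module_sketch {
	int refcnt;
	bool going;
};

/* Models try_module_get(): fails once unloading has started. */
static bool module_get_sketch(struct module_sketch *mod)
{
	if (mod->going)
		return false;
	mod->refcnt++;
	return true;
}

/* Models module_put(). */
static void module_put_sketch(struct module_sketch *mod)
{
	if (mod->refcnt > 0)
		mod->refcnt--;
}

/* open(): take the reference, drop it again if configuration fails. */
static int mdev_open_sketch(struct module_sketch *mod, int configure_ret)
{
	if (!module_get_sketch(mod))
		return -ENODEV;
	if (configure_ret) {
		module_put_sketch(mod);
		return configure_ret;
	}
	return 0;
}

/* release(): the counterpart that drops the reference taken in open(). */
static void mdev_release_sketch(struct module_sketch *mod)
{
	module_put_sketch(mod);
}
```

The point is only that every successful open is paired with exactly one
put in release, and that failed opens leave the count unchanged.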
> +}
> +
> static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
> {
> return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
> @@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
> .mdev_attr_groups = vfio_ap_mdev_attr_groups,
> .create = vfio_ap_mdev_create,
> .remove = vfio_ap_mdev_remove,
> + .open = vfio_ap_mdev_open,
> + .release = vfio_ap_mdev_release,
> };
>
> int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index f248faf..48e2806 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -31,6 +31,8 @@ struct ap_matrix {
>
> struct ap_matrix_mdev {
> struct kvm_ap_matrix *matrix;
> + struct notifier_block group_notifier;
> + struct kvm *kvm;
> };
>
> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/18/2018 04:31 AM, Pierre Morel wrote:
> On 17/04/2018 20:11, Tony Krowiak wrote:
>> On 04/17/2018 12:55 PM, Pierre Morel wrote:
>>> On 17/04/2018 18:22, Tony Krowiak wrote:
>>>> On 04/17/2018 12:13 PM, Pierre Morel wrote:
>>>>> On 17/04/2018 17:02, Tony Krowiak wrote:
>>>>>> On 04/16/2018 06:51 AM, Pierre Morel wrote:
>>>>>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>>>> devices. This patch introduces a new interface to enable and
>>>>>>>> disable APIE.
>>>>>>>>
>>>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>>>> ---
>>>>>>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++++++++
>>>>>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>>>>>> arch/s390/kvm/kvm-ap.c | 20 ++++++++++++++++++++
>>>>>>>> arch/s390/kvm/kvm-s390.c | 9 +++++++++
>>>>>>>> 4 files changed, 46 insertions(+), 0 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>>>>>> b/arch/s390/include/asm/kvm-ap.h
>>>>>>>> index 736e93e..a6c092e 100644
>>>>>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>>>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>>>>>> @@ -35,4 +35,20 @@
>>>>>>>> */
>>>>>>>> void kvm_ap_build_crycbd(struct kvm *kvm);
>>>>>>>>
>>>>>>>> +/**
>>>>>>>> + * kvm_ap_interpret_instructions
>>>>>>>> + *
>>>>>>>> + * Indicate whether AP instructions shall be interpreted. If
>>>>>>>> they are not
>>>>>>>> + * interpreted, all AP instructions will be intercepted and
>>>>>>>> routed back to
>>>>>>>> + * userspace.
>>>>>>>> + *
>>>>>>>> + * @kvm: the virtual machine attributes
>>>>>>>> + * @enable: indicates whether AP instructions are to be
>>>>>>>> interpreted (true) or
>>>>>>>> + * or not (false).
>>>>>>>> + *
>>>>>>>> + * Returns 0 if completed successfully; otherwise, returns
>>>>>>>> -EOPNOTSUPP
>>>>>>>> + * indicating that AP instructions are not installed on the
>>>>>>>> guest.
>>>>>>>> + */
>>>>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>>>>>>> +
>>>>>>>> #endif /* _ASM_KVM_AP */
>>>>>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>>>>>> b/arch/s390/include/asm/kvm_host.h
>>>>>>>> index 3162783..5470685 100644
>>>>>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>>>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>>>>>> @@ -715,6 +715,7 @@ struct kvm_s390_crypto {
>>>>>>>> __u32 crycbd;
>>>>>>>> __u8 aes_kw;
>>>>>>>> __u8 dea_kw;
>>>>>>>> + __u8 apie;
>>>>>>>> };
>>>>>>>>
>>>>>>>> #define APCB0_MASK_SIZE 1
>>>>>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>>>>>> index 991bae4..55d11b5 100644
>>>>>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>>>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>>>>>> @@ -58,3 +58,23 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
>>>>>>>> }
>>>>>>>> }
>>>>>>>> EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>>>>>> +
>>>>>>>> +int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
>>>>>>>> +{
>>>>>>>> + int ret = 0;
>>>>>>>> +
>>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>>> +
>>>>>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
>>>>>>>
>>>>>>> Do we really need to test CPU_FEAT_AP?
>>>>>>
>>>>>> Yes we do.
>>>>>
>>>>> really? why?
>>>>
>>>> The KVM_S390_VM_CPU_FEAT_AP will not be enabled by KVM if the AP
>>>> instructions are not installed on the host. I assume - but have
>>>> no way of verifying - that if the AP instructions are not installed
>>>> on the host, that interpretation would fail. Do you know what would
>>>> happen if AP instructions are interpreted when not installed on
>>>> the host?
>>>
>>> If the host has no AP instructions (its ECA.28=0) but sets ECA.28
>>> for a guest, there will be no AP instructions available in the
>>> guest.
>>
>> Then there's the answer to your question; this is why we need to test
>> CPU_FEAT_AP.
>
> We can postpone this discussion until we discuss VSIE.
> For this specific call I just wanted to point out that obviously this
> function should not be called if the guest has no AP instructions.
I disagree, at least as far as the way things are currently designed.
Whether AP instructions are interpreted or not is determined by the
vfio_ap device driver. The device driver should not be required to have
"knowledge" about how a guest is configured in KVM which is why I
encapsulated most of the AP guest configuration in kvm-ap.c. Besides,
the function above allows setting of AP interpretation only if
CPU_FEAT_AP is enabled.

Having said that, I may remove this function since - as you pointed out
earlier - AP instructions are interpreted by default if CPU_FEAT_AP is
enabled, so there will be no need to set this at this time.
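
To make the behavior under discussion concrete, here is a minimal
user-space sketch (not kernel code) of the two pieces quoted above: the
guard in kvm_ap_interpret_instructions() and the ECA update in
kvm_s390_vcpu_crypto_setup(). The struct and the model_* names are
hypothetical stand-ins, and the ECA_APIE value is an assumption derived
from "ECA.28" with bits counted from the most significant bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: bit 28 of a 32-bit field, counting from the MSB. */
#define ECA_APIE 0x00000008u

/* Hypothetical stand-in for the per-guest state touched by the patch. */
struct guest_model {
	bool feat_ap;	/* models KVM_S390_VM_CPU_FEAT_AP */
	bool apie;	/* models kvm->arch.crypto.apie */
	uint32_t eca;	/* models vcpu->arch.sie_block->eca */
};

/* Models the guard: refuse to change APIE unless the AP feature is set. */
static int model_interpret_instructions(struct guest_model *g, bool enable)
{
	assert(g);
	if (!g->feat_ap)
		return -EOPNOTSUPP;
	g->apie = enable;
	return 0;
}

/* Models the ECA update: clear the bit, then set it only if both hold. */
static void model_crypto_setup(struct guest_model *g)
{
	assert(g);
	g->eca &= ~ECA_APIE;
	if (g->apie && g->feat_ap)
		g->eca |= ECA_APIE;
}
```

With this shape the caller (here, the device driver) never inspects the
feature bit itself; it only sees 0 or -EOPNOTSUPP.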
>
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I understand that KVM_S390_VM_CPU_FEAT_AP means AP instructions
>>>>>>> are interpreted.
>>>>>>> shouldn't we add this information in the name?
>>>>>>> like KVM_S390_VM_CPU_FEAT_APIE
>>>>>>
>>>>>> KVM_S390_VM_CPU_FEAT_AP does NOT mean AP instructions are
>>>>>> interpreted, it means
>>>>>> AP instructions are installed.
>>>>>
>>>>> Right same error I made all along this review.
>>>>> But AFAIK it means AP instructions are provided to the guest.
>>>>> Then should this function be called if the guest has no AP
>>>>> instructions ?
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> + ret = -EOPNOTSUPP;
>>>>>>>> + goto done;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + kvm->arch.crypto.apie = enable;
>>>>>>>> + kvm_s390_vcpu_crypto_reset_all(kvm);
>>>>>>>> +
>>>>>>>> +done:
>>>>>>>> + mutex_unlock(&kvm->lock);
>>>>>>>> + return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>>>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>>>> index 55cd897..1dc8566 100644
>>>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>>>> @@ -1901,6 +1901,9 @@ static void kvm_s390_crypto_init(struct
>>>>>>>> kvm *kvm)
>>>>>>>> kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>>>>>> kvm_ap_build_crycbd(kvm);
>>>>>>>>
>>>>>>>> + /* Default setting indicating SIE shall interpret AP
>>>>>>>> instructions */
>>>>>>>> + kvm->arch.crypto.apie = 1;
>>>>>>>> +
>>>>>>>> if (!test_kvm_facility(kvm, 76))
>>>>>>>> return;
>>>>>>>>
>>>>>>>> @@ -2434,6 +2437,12 @@ static void
>>>>>>>> kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>>>>>> {
>>>>>>>> vcpu->arch.sie_block->crycbd =
>>>>>>>> vcpu->kvm->arch.crypto.crycbd;
>>>>>>>>
>>>>>>>> + vcpu->arch.sie_block->eca &= ~ECA_APIE;
>>>>>>>> + if (vcpu->kvm->arch.crypto.apie &&
>>>>>>>> + test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>>>>
>>>>>>> Do we call xxx_crypto_setup() if KVM does not support AP
>>>>>>> interpretation?
>>>>>>
>>>>>> Yes, kvm_s390_vcpu_crypto_setup(vcpu) is called by
>>>>>> kvm_arch_vcpu_setup(vcpu)
>>>>>> as well as from kvm_s390_vcpu_crypto_reset_all(kvm). Calling it
>>>>>> has nothing
>>>>>> to do with whether AP interpretation is supported or not as it
>>>>>> does much
>>>>>> more than that, including setting up of wrapping keys and the
>>>>>> CRYCBD.
>>>>>
>>>>> Sorry, still the same error I made about CPU_FEAT_AP meaning AP
>>>>> instructions in the guest
>>>>> and not AP interpretation available.
>>>>> Could apie be set if AP instructions are not supported?
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> + vcpu->arch.sie_block->eca |= ECA_APIE;
>>>>>>>> +
>>>>>>>> +
>>>>>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>>>>>> return;
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
On 04/18/2018 03:49 AM, Cornelia Huck wrote:
> On Tue, 17 Apr 2018 14:08:59 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 04/17/2018 11:21 AM, Cornelia Huck wrote:
>>> On Tue, 17 Apr 2018 10:26:57 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> On 04/17/2018 06:10 AM, Cornelia Huck wrote:
>>>>> On Tue, 17 Apr 2018 09:49:58 +0200
>>>>> "Harald Freudenberger" <[email protected]> wrote:
>>>>>
>>>>>> Didn't we say that when APXA is not available there is no Crypto support
>>>>>> for KVM ?
>>>>> [Going by the code, as I don't have access to the architecture]
>>>>>
>>>>> Current status seems to be:
>>>>> - setup crycb if facility 76 is available (that's MSAX3, I guess?)
>>>> The crycb is set up regardless of whether STFLE.76 (MSAX3) is
>>>> installed or not.
>>> Hm, the current code does a quick exit if bit 76 is not set, doesn't
>>> it?
>> I guess that depends upon what you mean by current code. If you are talking
>> about the code as it is distributed today - i.e., before my patch series -
>> then you are correct. This patch changes that; it initializes the
>> kvm->arch.crypto.crycbd to point to the CRYCB, then clears the format bits
>> (kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK)) which is the same as
>> setting the CRYCB format to format 0. It is only after this that the
>> check is done to determine whether STFLE.76 is set.
> Ah yes, with "current" I referred to current upstream.
>
>>>
>>>>> - use format 2 if APXA is available, else use format 1
>>>> Use format 0 if MSAX3 is not available
>>>> Use format 1 if MSAX3 is available but APXA is not
>>>> Use format 2 if MSAX3 and APXA are available
>>>>
>>>>> From Tony's patch description, the goal seems to be:
>>>>> - setup crycb even if MSAX3 is not available
>>>> Yes, that is true
>>>>
>>>>> So my understanding is that we use APXA only to decide on the format of
>>>>> the crycb, but provide it in any case?
>>>> Yes, that is true
>>> With the format selection you outlined above, I guess. Makes sense from
>>> my point of view (just looking at the source code).
>> It also implements what is stated in the architecture doc.
> OK, great.
>
>>>
>>>>> (Not providing a crycb if APXA is not available would be loss of
>>>>> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
>>>>> available is a different game, of course.)
>>>> This would require a change to enabling the CPU model feature for
>>>> AP.
>>> But would it actually make sense to tie vfio-ap to APXA? This needs to
>>> be answered by folks with access to the architecture :)
>> I don't see any reason to do that from an architectural perspective.
>> One can access AP devices whether APXA is installed or not, it just limits
>> the range of devices that can be addressed
> So I guess we should not introduce a tie-in then (unless it radically
> simplifies the code...)
I'm not clear about what you mean by introducing a tie-in. Can you
clarify that?
>
On 04/18/2018 07:56 AM, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> Registers a group notifier during the open of the mediated
>> matrix device to get information on KVM presence through the
>> VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
>> to the kvm structure is saved inside the mediated matrix
>> device. Once the VFIO AP device driver has access to KVM,
>> access to the APs can be configured for the guest.
>>
>> Access to APs is configured when the file descriptor for the
>> mediated matrix device is opened by userspace. The items to be
>> configured are:
>>
>> 1. The ECA.28 bit in the SIE state description determines whether
>> AP instructions are interpreted by the hardware or intercepted.
>> The VFIO AP device driver relies on interpretive execution of
>> AP instructions so the ECA.28 bit will be set.
>>
>> 2. Guest access to AP adapters, usage domains and control domains
>> is controlled by three bit masks referenced from the
>> Crypto Control Block (CRYCB) referenced from the guest's SIE state
>> description:
>>
>> * The AP Mask (APM) controls access to the AP adapters. Each bit
>> in the APM represents an adapter number - from most significant
>> to least significant bit - from 0 to 255. The bits in the APM
>> are set according to the adapter numbers assigned to the mediated
>> matrix device via its 'assign_adapter' sysfs attribute file.
>>
>> * The AP Queue Mask (AQM) controls access to the AP queues. Each bit
>> in the AQM represents an AP queue index - from most significant
>> to least significant bit - from 0 to 255. A queue index references
>> a specific domain and is synonymous with the domain number. The
>> bits in the AQM are set according to the domain numbers assigned
>> to the mediated matrix device via its 'assign_domain' sysfs
>> attribute file.
>>
>> * The AP Domain Mask (ADM) controls access to the AP control
>> domains.
>> Each bit in the ADM represents a control domain - from most
>> significant to least significant bit - from 0-255. The
>> bits in the ADM are set according to the domain numbers assigned
>> to the mediated matrix device via its 'assign_control_domain'
>> sysfs attribute file.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 50
>> +++++++++++++++++++++++++++++++++
>> drivers/s390/crypto/vfio_ap_private.h | 2 +
>> 2 files changed, 52 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index bc2b05e..e3ff5ab 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device
>> *mdev)
>> return 0;
>> }
>>
>> +static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>> + unsigned long action, void *data)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev;
>> +
>> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>> + matrix_mdev = container_of(nb, struct ap_matrix_mdev,
>> + group_notifier);
>> + matrix_mdev->kvm = data;
>> + }
>> +
>> + return NOTIFY_OK;
>> +}
>> +
>> +static int vfio_ap_mdev_open(struct mdev_device *mdev)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> + unsigned long events;
>> + int ret;
>> +
>> + matrix_mdev->group_notifier.notifier_call =
>> vfio_ap_mdev_group_notifier;
>> + events = VFIO_GROUP_NOTIFY_SET_KVM;
>> +
>> + ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>> + &events, &matrix_mdev->group_notifier);
>> + if (ret)
>> + return ret;
>> +
>> + ret = kvm_ap_interpret_instructions(matrix_mdev->kvm, true);
>> + if (ret)
>> + return ret;
>> +
>> + ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
>> + matrix_mdev->matrix);
>
> If all went OK, you may want to increase the module reference count
> to avoid removing the module while in use by QEMU.
Sounds reasonable.
>
>
>> +
>> + return ret;
>> +}
>> +
>> +static void vfio_ap_mdev_release(struct mdev_device *mdev)
>> +{
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> +
>> + kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
>> + kvm_ap_interpret_instructions(matrix_mdev->kvm, false);
>> + vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>> + &matrix_mdev->group_notifier);
>
> ... and also decrease the reference count.
Ditto.
>
>
>> +}
>> +
>> static ssize_t name_show(struct kobject *kobj, struct device *dev,
>> char *buf)
>> {
>> return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
>> @@ -754,6 +802,8 @@ static ssize_t matrix_show(struct device *dev,
>> struct device_attribute *attr,
>> .mdev_attr_groups = vfio_ap_mdev_attr_groups,
>> .create = vfio_ap_mdev_create,
>> .remove = vfio_ap_mdev_remove,
>> + .open = vfio_ap_mdev_open,
>> + .release = vfio_ap_mdev_release,
>> };
>>
>> int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h
>> b/drivers/s390/crypto/vfio_ap_private.h
>> index f248faf..48e2806 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -31,6 +31,8 @@ struct ap_matrix {
>>
>> struct ap_matrix_mdev {
>> struct kvm_ap_matrix *matrix;
>> + struct notifier_block group_notifier;
>> + struct kvm *kvm;
>> };
>>
>> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
>
>
On 04/17/2018 11:52 AM, Pierre Morel wrote:
> On 17/04/2018 16:15, Tony Krowiak wrote:
>> On 04/16/2018 04:56 AM, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> This patch refactors the code that initializes the crypto
>>>> configuration for a guest. The crypto configuration is contained in
>>>> a crypto control block (CRYCB) which is a satellite control block to
>>>> our main hardware virtualization control block. The CRYCB is
>>>> attached to the main virtualization control block via a CRYCB
>>>> designation (CRYCBD) field containing the address of
>>>> the CRYCB as well as its format.
>>>>
>>>> Prior to the introduction of AP device virtualization, there was
>>>> no need to provide access to or specify the format of the CRYCB for
>>>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>>>> on the host system. With the introduction of AP device virtualization,
>>>> the CRYCB and its format must be made accessible to the guest
>>>> regardless of the presence of the MSAX3 facility.
>>>>
>>>> The crypto initialization code is restructured as follows:
>>>>
>>>> * A new compilation unit is introduced to contain all interfaces
>>>> and data structures related to configuring a guest's CRYCB for
>>>> both the refactoring of crypto initialization as well as all
>>>> subsequent patches introducing AP virtualization support.
>>>>
>>>> * Currently, the asm code for querying the AP configuration is
>>>> duplicated in the AP bus as well as in KVM. Since the KVM
>>>> code was introduced, the AP bus has externalized the interface
>>>> for querying the AP configuration. The KVM interface will be
>>>> replaced with a call to the AP bus interface. Of course, this
>>>> will be moved to the new compilation unit mentioned above.
>>>>
>>>> * An interface to format the CRYCBD field will be provided via
>>>> the new compilation unit and called from the KVM vm
>>>> initialization.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> arch/s390/include/asm/kvm-ap.h | 15 +++++++++
>>>> arch/s390/include/asm/kvm_host.h | 1 +
>>>> arch/s390/kvm/kvm-ap.c | 39 ++++++++++++++++++++++++
>>>> arch/s390/kvm/kvm-s390.c | 60
>>>> ++++----------------------------------
>>>> 4 files changed, 61 insertions(+), 54 deletions(-)
>>>>
>>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>>> b/arch/s390/include/asm/kvm-ap.h
>>>> index 84412a9..736e93e 100644
>>>> --- a/arch/s390/include/asm/kvm-ap.h
>>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>>> @@ -10,6 +10,9 @@
>>>> #ifndef _ASM_KVM_AP
>>>> #define _ASM_KVM_AP
>>>>
>>>> +#include <linux/types.h>
>>>> +#include <linux/kvm_host.h>
>>>> +
>>>> /**
>>>> * kvm_ap_instructions_installed()
>>>> *
>>>> @@ -20,4 +23,16 @@
>>>> */
>>>> int kvm_ap_instructions_installed(void);
>>>>
>>>> +/**
>>>> + * kvm_ap_build_crycbd
>>>> + *
>>>> + * The crypto control block designation (CRYCBD) is a 32-bit field
>>>> that
>>>> + * designates both the host real address and format of the CRYCB.
>>>> This function
>>>> + * builds the CRYCBD field for use by the KVM guest.
>>>> + *
>>>> + * @kvm: the KVM guest
>>>> + * @crycbd: reference to the CRYCBD
>>>> + */
>>>> +void kvm_ap_build_crycbd(struct kvm *kvm);
>>>> +
>>>> #endif /* _ASM_KVM_AP */
>>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>>> b/arch/s390/include/asm/kvm_host.h
>>>> index 81cdb6b..c990a1d 100644
>>>> --- a/arch/s390/include/asm/kvm_host.h
>>>> +++ b/arch/s390/include/asm/kvm_host.h
>>>> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
>>>> __u8 reservedf0[12]; /* 0x00f0 */
>>>> #define CRYCB_FORMAT1 0x00000001
>>>> #define CRYCB_FORMAT2 0x00000003
>>>> +#define CRYCB_FORMAT_MASK 0x00000003
>>>> __u32 crycbd; /* 0x00fc */
>>>> __u64 gcr[16]; /* 0x0100 */
>>>> __u64 gbea; /* 0x0180 */
>>>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>>>> index 1267588..991bae4 100644
>>>> --- a/arch/s390/kvm/kvm-ap.c
>>>> +++ b/arch/s390/kvm/kvm-ap.c
>>>> @@ -10,6 +10,8 @@
>>>> #include <asm/kvm-ap.h>
>>>> #include <asm/ap.h>
>>>>
>>>> +#include "kvm-s390.h"
>>>> +
>>>> int kvm_ap_instructions_installed(void)
>>>> {
>>>> #ifdef CONFIG_ZCRYPT
>>>> @@ -19,3 +21,40 @@ int kvm_ap_instructions_installed(void)
>>>> #endif
>>>> }
>>>> EXPORT_SYMBOL(kvm_ap_instructions_installed);
>>>> +
>>>> +static inline int kvm_ap_query_config(struct ap_config_info *config)
>>>> +{
>>>> + memset(config, 0, sizeof(*config));
>>>> +
>>>> +#ifdef CONFIG_ZCRYPT
>>>
>>> I would prefer that you define the interface in an include file
>>> with stubs for the case ZCRYPT is not set.
>>
>> This is a static function only called internally, but I suppose there is
>> no harm in defining it as an interface in kvm-ap.h ... it may come
>> in handy down the road.
>>
>>>
>>>
>>>> + if (kvm_ap_instructions_installed())
>>>> + return ap_query_configuration(config);
>>>> +#endif
>>>> +
>>>> + return -EOPNOTSUPP;
>>>> +}
>>>> +
>>>> +static int kvm_ap_apxa_installed(void)
>>>> +{
>>>> + struct ap_config_info config;
>>>> +
>>>> + if (kvm_ap_query_config(&config) == 0)
>>>> + return (config.apxa == 1);
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +void kvm_ap_build_crycbd(struct kvm *kvm)
>>>> +{
>>>> + kvm->arch.crypto.crycbd = (__u32)(unsigned long)
>>>> kvm->arch.crypto.crycb;
>>>> + kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
>>>> +
>>>> + /* check whether MSAX3 is installed */
>>>
>>> It means we do not support AP virtualization without MSA3.
>>> It follows we do not support CRYCB_FORMAT0
>>
>> If MSAX3 is not installed, that means there is no key wrapping support,
>> hence CRYCB_FORMAT0. The CRYCB_FORMAT1 and CRYCB_FORMAT2 CRYCBs
>> both include wrapping key masks. I don't follow your logic here.
>>
>>>
>>>
>>> It is different from what you explain in the comment.
>>
>> How is it different? Above, we are setting the CRYCBD value regardless
>> of whether MSAX3 is installed or not. Previously, the CRYCBD value
>> was set only if MSAX3 is installed (see comments below)
>>
>>>
>>>
>>>> + if (kvm_ap_instructions_installed() && test_kvm_facility(kvm,
>>>> 76)) {
>>>> + if (kvm_ap_apxa_installed())
>>>> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
>>>> + else
>>>> + kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
>>>> + }
>
> sorry, I was fooled by the test on kvm_ap_instructions_installed() and
> the fact that CRYCB_FORMAT0 = 0.
> Since you cleared the format above, it is 0 by default.
>
> Since we cannot use CRYCB_FORMAT0 if we have no AP instructions, the
> logic of the test seems wrong even if the result is right.
The logic needs to be changed:

    If neither the AP instructions nor MSAX3 is installed
        return
    Set the address of the CRYCB into the CRYCBD
    Clear the format mask in the CRYCBD - i.e., set to format 0 by default
    If MSAX3 is installed
        If APXA is installed
            Set format 2
        Else
            Set format 1
        Set up protected key support (i.e., key wrapping)
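
As a rough illustration of the revised logic, here is a user-space
sketch of the format selection only (key wrapping setup is left out).
The CRYCB_FORMAT* values are the ones from the patch; the function
name, its boolean parameters, and returning 0 to model "no CRYCBD set"
are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CRYCB_FORMAT1		0x00000001u
#define CRYCB_FORMAT2		0x00000003u
#define CRYCB_FORMAT_MASK	0x00000003u

/*
 * Returns the CRYCBD for a CRYCB located at crycb_addr. Returns 0 when
 * neither the AP instructions nor MSAX3 are installed, which models the
 * early return (no CRYCB designated) in the outline.
 */
static uint32_t model_build_crycbd(uint32_t crycb_addr, bool ap_installed,
				   bool msax3, bool apxa)
{
	uint32_t crycbd;

	/* The low bits carry the format, so the address must not use them. */
	assert((crycb_addr & CRYCB_FORMAT_MASK) == 0);

	if (!ap_installed && !msax3)
		return 0;

	/* Address with the format bits cleared is format 0 by default. */
	crycbd = crycb_addr & ~CRYCB_FORMAT_MASK;

	if (msax3)
		crycbd |= apxa ? CRYCB_FORMAT2 : CRYCB_FORMAT1;

	return crycbd;
}
```

For example, AP instructions without MSAX3 yields format 0 even when
APXA is reported, which is the case the original test handled only by
accident.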
>
>
> I think you can make it more readable if you put all the crycb
> initialization together inside the kvm_s390_crypto_init() function
> instead of exporting part of it inside kvm_ap_build_crycbd()
I agree, I am going to do that.
>
>
> Regards,
>
> Pierre
>
>>>> +}
>>>> +EXPORT_SYMBOL(kvm_ap_build_crycbd);
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index d0c3518..b47ff11 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -40,6 +40,7 @@
>>>> #include <asm/sclp.h>
>>>> #include <asm/cpacf.h>
>>>> #include <asm/timex.h>
>>>> +#include <asm/kvm-ap.h>
>>>> #include "kvm-s390.h"
>>>> #include "gaccess.h"
>>>>
>>>> @@ -1881,55 +1882,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>>> return r;
>>>> }
>>>>
>>>> -static int kvm_s390_query_ap_config(u8 *config)
>>>> -{
>>>> - u32 fcn_code = 0x04000000UL;
>>>> - u32 cc = 0;
>>>> -
>>>> - memset(config, 0, 128);
>>>> - asm volatile(
>>>> - "lgr 0,%1\n"
>>>> - "lgr 2,%2\n"
>>>> - ".long 0xb2af0000\n" /* PQAP(QCI) */
>>>> - "0: ipm %0\n"
>>>> - "srl %0,28\n"
>>>> - "1:\n"
>>>> - EX_TABLE(0b, 1b)
>>>> - : "+r" (cc)
>>>> - : "r" (fcn_code), "r" (config)
>>>> - : "cc", "0", "2", "memory"
>>>> - );
>>>> -
>>>> - return cc;
>>>> -}
>>>> -
>>>> -static int kvm_s390_apxa_installed(void)
>>>> -{
>>>> - u8 config[128];
>>>> - int cc;
>>>> -
>>>> - if (test_facility(12)) {
>>>> - cc = kvm_s390_query_ap_config(config);
>>>> -
>>>> - if (cc)
>>>> - pr_err("PQAP(QCI) failed with cc=%d", cc);
>>>> - else
>>>> - return config[0] & 0x40;
>>>> - }
>>>> -
>>>> - return 0;
>>>> -}
>>>> -
>>>> -static void kvm_s390_set_crycb_format(struct kvm *kvm)
>>>> -{
>>>> - kvm->arch.crypto.crycbd = (__u32)(unsigned long)
>>>> kvm->arch.crypto.crycb;
>>>> -
>>>> - if (kvm_s390_apxa_installed())
>>>> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
>>>> - else
>>>> - kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
>>>> -}
>>>> -
>>>> static u64 kvm_s390_get_initial_cpuid(void)
>>>> {
>>>> struct cpuid cpuid;
>>>> @@ -1941,12 +1893,12 @@ static u64 kvm_s390_get_initial_cpuid(void)
>>>>
>>>> static void kvm_s390_crypto_init(struct kvm *kvm)
>>>> {
>>>> + kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>> + kvm_ap_build_crycbd(kvm);
>>>> +
>>
>> Notice the call to kvm_ap_build_crycbd(kvm) above was added, so
>> the CRYCBD is being set regardless of the presence of MSAX3.
>>
>>>> if (!test_kvm_facility(kvm, 76))
>>>> return;
>>>>
>>>> - kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>>> - kvm_s390_set_crycb_format(kvm);
>> Notice that the code that was removed to set the CRYCBD was called
>> only if MSAX3 is installed - i.e., see the if statement
>> immediately preceding the two statements above.
>>>> -
>>>> /* Enable AES/DEA protected key functions by default */
>>>> kvm->arch.crypto.aes_kw = 1;
>>>> kvm->arch.crypto.dea_kw = 1;
>>>> @@ -2475,6 +2427,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu
>>>> *vcpu)
>>>>
>>>> static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
>>>> {
>>>> + vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>> +
>>>> if (!test_kvm_facility(vcpu->kvm, 76))
>>>> return;
>>>>
>>>> @@ -2484,8 +2438,6 @@ static void kvm_s390_vcpu_crypto_setup(struct
>>>> kvm_vcpu *vcpu)
>>>> vcpu->arch.sie_block->ecb3 |= ECB3_AES;
>>>> if (vcpu->kvm->arch.crypto.dea_kw)
>>>> vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
>>>> -
>>>> - vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
>>>> }
>>>>
>>>> void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
>>>
>>>
>>
>
On Sun, 22 Apr 2018 10:52:55 -0400
Tony Krowiak <[email protected]> wrote:
> >>>>> (Not providing a crycb if APXA is not available would be loss of
> >>>>> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
> >>>>> available is a different game, of course.)
> >>>> This would require a change to enabling the CPU model feature for
> >>>> AP.
> >>> But would it actually make sense to tie vfio-ap to APXA? This needs to
> >>> be answered by folks with access to the architecture :)
> >> I don't see any reason to do that from an architectural perspective.
> >> One can access AP devices whether APXA is installed or not, it just limits
> >> the range of devices that can be addressed
> > So I guess we should not introduce a tie-in then (unless it radically
> > simplifies the code...)
>
> I'm not clear about what you mean by introducing a tie-in. Can you
> clarify that?
Making vfio-ap depend on APXA.
On Sun, 22 Apr 2018 13:21:16 -0400
Tony Krowiak <[email protected]> wrote:
> On 04/17/2018 12:56 PM, Cornelia Huck wrote:
> > On Tue, 17 Apr 2018 09:31:00 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> My preference would be one of the following:
> >>
> >> 1. All of the interfaces defined in arch/s390/include/asm/ap.h
> >> are implemented in a file that is built whether ZCRYPT is
> >> built or not.
> >>
> > >> 2. The drivers/s390/crypto/ap_asm.h file containing the functions
> > >> that execute the AP instructions is made available outside of
> > >> the AP bus, for example in arch/s390/include/asm
> >>
> >> I requested this from the maintainer but was told we don't want to
> >> have any crypto adapter support when the host AP functionality is
> > >> disabled (CONFIG_ZCRYPT=n). This makes sense; however, I think it is
> >> a bit confusing to have a header file (arch/s390/include/asm/ap.h)
> >> with interfaces that are conditionally built.
> >>
> >> This is why I chose the ifdeffery (as you call it) approach. The
> >> only other solution I can conjure is to duplicate the asm code for
> >> the AP instructions needed in KVM and bypass using the AP bus
> >> interfaces.
> > I think at the root of this is an unfortunate mixup in the
> > architecture: The format of the crycb changes depending on some ap
> > feature being installed. Providing the crycb does not have anything to
> > do with ap device usage in the host, but we need to issue an ap
> > instruction to get this right. [Correct me if I'm wrong; but that's
> > what I get without being able to consult the actual architecture.]
> >
> > So, exporting *all* of the interfaces is probably not needed anyway. I
> > think it boils down to either "export the interfaces where a mixup
> > happened, and keep the rest to zcrypt only", or "duplicate the
> > instructions for kvm usage".
> >
> > I hope we can find a solution here, as this seems to be one of the main
> > discussion points :/ (FWIW, I think the basic driver interface is sane.)
>
> I spoke with Harald Freudenberger and he is going to refactor the AP bus
> code to make the two functions I need static in the kernel:
>
> int ap_instructions_installed(void);
> int ap_query_configuration(struct ap_config_info *info);
> >
>
Excellent, looking forward to it.
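
For comparison, the header-stub pattern suggested earlier in the review
(declare the interface in a header and provide an inline stub when
ZCRYPT is not configured) could look roughly like the user-space sketch
below. Everything ending in _model is hypothetical; the real
struct ap_config_info has more fields, and this is not the refactoring
Harald has agreed to do.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for struct ap_config_info. */
struct ap_config_model {
	unsigned int apxa;
};

#ifdef CONFIG_ZCRYPT
/* Real implementation would live in the AP bus code. */
int ap_query_configuration_model(struct ap_config_model *info);
#else
/* Stub used when the AP bus is not built: report "not supported". */
static inline int ap_query_configuration_model(struct ap_config_model *info)
{
	assert(info);
	memset(info, 0, sizeof(*info));
	return -EOPNOTSUPP;
}
#endif

/* The caller needs no #ifdef of its own. */
static int apxa_installed_model(void)
{
	struct ap_config_model config;

	if (ap_query_configuration_model(&config) == 0)
		return config.apxa == 1;

	return 0;
}
```

The point of the pattern is that the #ifdef appears once, in the header,
rather than in every function that queries the configuration.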
On 15/04/2018 23:22, Tony Krowiak wrote:
> Provides interfaces to assign AP adapters, usage domains
> and control domains to a KVM guest.
>
> A KVM guest is started by executing the Start Interpretive Execution (SIE)
> instruction. The SIE state description is a control block that contains the
> state information for a KVM guest and is supplied as input to the SIE
> instruction. The SIE state description has a satellite structure called the
> Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
> identifying the adapters, queues (domains) and control domains assigned to
> the KVM guest:
>
> * The AP Adapter Mask (APM) field identifies the AP adapters assigned to
> the KVM guest
>
> * The AP Queue Mask (AQM) field identifies the AP queues assigned to
> the KVM guest. Each AP queue is connected to a usage domain within
> an AP adapter.
>
> * The AP Domain Mask (ADM) field identifies the control domains
> assigned to the KVM guest.
>
> Each adapter, queue (usage domain) and control domain are identified by
> a number from 0 to 255. The bits in each mask, from most significant to
> least significant bit, correspond to the numbers 0-255. When a bit is
> set, the corresponding adapter, queue (usage domain) or control domain
> is assigned to the KVM guest.
>
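
The bit numbering described above (ID 0 is the most significant bit of
the first byte) can be sketched in user space as follows. The helper
names and the byte-array representation are assumptions for
illustration; the patch itself works on the CRYCB fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_BITS	256
#define MASK_BYTES	(MASK_BITS / 8)

/* Set the bit for @id, where id 0 is the MSB of byte 0. */
static void mask_set(uint8_t mask[MASK_BYTES], unsigned int id)
{
	assert(id < MASK_BITS);
	mask[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* Test the bit for @id using the same MSB-first numbering. */
static bool mask_test(const uint8_t mask[MASK_BYTES], unsigned int id)
{
	assert(id < MASK_BITS);
	return (mask[id / 8] & (0x80u >> (id % 8))) != 0;
}
```

So assigning adapter 4 sets 0x08 in the first byte of the APM, and ID
255 is the least significant bit of the last byte.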
> This patch will set the bits in the APM, AQM and ADM fields of the
> CRYCB referenced by the KVM guest's SIE state description. The process
> used is:
>
> 1. Verify that the bits to be set do not exceed the maximum bit
> number for the given mask.
>
> 2. Verify that the APQNs that can be derived from the intersection
> of the bits set in the APM and AQM fields of the KVM guest's CRYCB
> are not assigned to any other KVM guest running on the same linux
> host.
>
> 3. Set the APM, AQM and ADM in the CRYCB according to the matrix
> configured for the mediated matrix device via its sysfs
> adapter, domain and control domain attribute files respectively.
>
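
Since a guest gets every (adapter, domain) pair formed from its APM and
AQM, step 2 can be reduced to a simple check: two matrices share an
APQN exactly when they share at least one adapter and at least one
usage domain. A user-space sketch of that check, with a hypothetical
struct and helper names, and MSB-first bit numbering in the example
values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_BYTES 32	/* 256 bits per mask */

/* Hypothetical reduced matrix: adapters and usage domains only. */
struct matrix_model {
	uint8_t apm[MASK_BYTES];
	uint8_t aqm[MASK_BYTES];
};

/* True if the two masks have at least one bit in common. */
static bool masks_intersect(const uint8_t *a, const uint8_t *b)
{
	for (size_t i = 0; i < MASK_BYTES; i++)
		if (a[i] & b[i])
			return true;
	return false;
}

/* Two matrices share an APQN iff they share an adapter and a domain. */
static bool matrices_share_apqn(const struct matrix_model *x,
				const struct matrix_model *y)
{
	assert(x && y);
	return masks_intersect(x->apm, y->apm) &&
	       masks_intersect(x->aqm, y->aqm);
}
```

Note that sharing only an adapter, or only a domain, is not a conflict
under this check; only a common (adapter, domain) pair is.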
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm-ap.h | 79 ++++++++++
> arch/s390/kvm/kvm-ap.c | 259 +++++++++++++++++++++++++++++++++
> drivers/s390/crypto/vfio_ap_ops.c | 19 +++
> drivers/s390/crypto/vfio_ap_private.h | 4 +
> 4 files changed, 361 insertions(+), 0 deletions(-)
>
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> index a6c092e..a068244 100644
> --- a/arch/s390/include/asm/kvm-ap.h
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -12,6 +12,34 @@
>
> #include <linux/types.h>
> #include <linux/kvm_host.h>
> +#include <linux/types.h>
> +#include <linux/kvm_host.h>
> +#include <linux/bitops.h>
> +
> +#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
> +
> +/**
> + * The AP matrix is comprised of three bit masks identifying the adapters,
> + * queues (domains) and control domains that belong to an AP matrix. The bits in
> + * each mask, from least significant to most significant bit, correspond to IDs
> + * 0 to the maximum ID allowed for a given mask. When a bit is set, the
> + * corresponding ID belongs to the matrix.
> + *
> + * @apm_max: max number of bits in @apm
> + * @apm identifies the AP adapters in the matrix
> + * @aqm_max: max number of bits in @aqm
> + * @aqm identifies the AP queues (domains) in the matrix
> + * @adm_max: max number of bits in @adm
> + * @adm identifies the AP control domains in the matrix
> + */
> +struct kvm_ap_matrix {
> + int apm_max;
> + unsigned long *apm;
> + int aqm_max;
> + unsigned long *aqm;
> + int adm_max;
> + unsigned long *adm;
> +};
>
> /**
> * kvm_ap_instructions_installed()
> @@ -51,4 +79,55 @@
> */
> int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>
> +/**
> + * kvm_ap_matrix_create
> + *
> + * Create an AP matrix to hold a configuration of AP adapters, domains and
> + * control domains.
> + *
> + * @ap_matrix: holds the matrix that is created
> + *
> + * Returns 0 if the matrix is successfully created. Returns an error if an APQN
> + * derived from the cross product of the AP adapter IDs and AP queue indexes
> + * comprising the AP matrix is configured for another guest.
> + */
> +int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix);
why not simply return the pointer?
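
Something along these lines, shown as a user-space sketch with calloc()
and a NULL return (in the kernel one would more likely use kzalloc() and
either NULL or ERR_PTR()). The matrix_model names and the 256-bit mask
size are assumptions for the example:

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MASK_BITS	256
#define BITS_PER_ULONG	(8 * sizeof(unsigned long))
#define MASK_LONGS	((MODEL_MASK_BITS + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Hypothetical user-space mirror of struct kvm_ap_matrix. */
struct matrix_model {
	int apm_max;
	unsigned long *apm;
	int aqm_max;
	unsigned long *aqm;
	int adm_max;
	unsigned long *adm;
};

/* Frees a matrix and its masks; accepts NULL and partial allocations. */
static void matrix_model_destroy(struct matrix_model *m)
{
	if (!m)
		return;
	free(m->apm);
	free(m->aqm);
	free(m->adm);
	free(m);
}

/* Returns a zeroed matrix, or NULL if any allocation fails. */
static struct matrix_model *matrix_model_create(void)
{
	struct matrix_model *m;

	/* The rounded-up word count must cover every mask bit. */
	assert(MASK_LONGS * BITS_PER_ULONG >= MODEL_MASK_BITS);

	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;

	m->apm = calloc(MASK_LONGS, sizeof(unsigned long));
	m->aqm = calloc(MASK_LONGS, sizeof(unsigned long));
	m->adm = calloc(MASK_LONGS, sizeof(unsigned long));
	if (!m->apm || !m->aqm || !m->adm) {
		matrix_model_destroy(m);
		return NULL;
	}

	m->apm_max = m->aqm_max = m->adm_max = MODEL_MASK_BITS;
	return m;
}
```

The caller then checks a single return value instead of both an error
code and an output parameter.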
> +
> +/**
> + * kvm_ap_matrix_destroy
> + *
> + * Destroy an AP matrix by de-allocating all storage allocated by the
> + * kvm_ap_matrix_create function.
> + *
> + * @ap_matrix: the matrix to destroy
> + */
> +void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix);
> +
> +/**
> + * kvm_ap_configure_matrix
> + *
> + * Configure the AP matrix for a KVM guest.
> + *
> + * @kvm: the KVM guest
> + * @matrix: the matrix configuration information
> + *
> + * Returns 0 if:
> + * 1. The AP instructions are installed on the guest
> + * 2. The APQNs derived from the intersection of the set of adapter
> + * IDs (APM) and queue indexes (AQM) in @matrix are not configured for
> + * any other KVM guest running on the same linux host.
> + * Otherwise returns an error code.
> + */
> +int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix);
> +
> +/**
> + * kvm_ap_deconfigure_matrix
> + *
> + * Deconfigure the AP matrix for a KVM guest. Clears all of the bits in the
> + * APM, AQM and ADM in the guest's CRYCB.
> + *
> + * @kvm: the KVM guest
> + */
> +void kvm_ap_deconfigure_matrix(struct kvm *kvm);
> +
> #endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> index 55d11b5..f7226d8 100644
> --- a/arch/s390/kvm/kvm-ap.c
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -9,6 +9,7 @@
> #include <linux/kernel.h>
> #include <asm/kvm-ap.h>
> #include <asm/ap.h>
> +#include <linux/bitops.h>
>
> #include "kvm-s390.h"
>
> @@ -78,3 +79,261 @@ int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable)
> return ret;
> }
> EXPORT_SYMBOL(kvm_ap_interpret_instructions);
> +
> +static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
> +{
> + memset(&kvm->arch.crypto.crycb->apcb0, 0,
> + sizeof(kvm->arch.crypto.crycb->apcb0));
> + memset(&kvm->arch.crypto.crycb->apcb1, 0,
> + sizeof(kvm->arch.crypto.crycb->apcb1));
> +}
> +
> +static inline unsigned long *kvm_ap_get_crycb_apm(struct kvm *kvm)
> +{
> + unsigned long *apm;
> +
> + switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
> + case CRYCB_FORMAT1:
> + apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
> + break;
> + case CRYCB_FORMAT2:
> + apm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.apm;
> + break;
> + default:
> + apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
> + break;
> + }
> +
> + return apm;
> +}
> +
> +static inline unsigned long *kvm_ap_get_crycb_aqm(struct kvm *kvm)
> +{
> + unsigned long *aqm;
> +
> + switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
> + case CRYCB_FORMAT1:
> + aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
> + break;
> + case CRYCB_FORMAT2:
> + aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.aqm;
> + break;
> + default:
> + aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
> + break;
> + }
> +
> + return aqm;
> +}
> +
> +static inline unsigned long *kvm_ap_get_crycb_adm(struct kvm *kvm)
> +{
> + unsigned long *adm;
> +
> + switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
> + case CRYCB_FORMAT1:
> + adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
> + break;
> + case CRYCB_FORMAT2:
> + adm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.adm;
> + break;
> + default:
> + adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
> + break;
> + }
> +
> + return adm;
> +}
> +
> +static void kvm_ap_set_crycb_masks(struct kvm *kvm,
> + struct kvm_ap_matrix *matrix)
> +{
> + unsigned long *apm = kvm_ap_get_crycb_apm(kvm);
> + unsigned long *aqm = kvm_ap_get_crycb_aqm(kvm);
> + unsigned long *adm = kvm_ap_get_crycb_adm(kvm);
> +
> + kvm_ap_clear_crycb_masks(kvm);
> + memcpy(apm, matrix->apm, KVM_AP_MASK_BYTES(matrix->apm_max));
> + memcpy(aqm, matrix->aqm, KVM_AP_MASK_BYTES(matrix->aqm_max));
> +
> + /*
> + * Merge the AQM and ADM since the ADM is a superset of the
> + * AQM by agreed-upon convention.
> + */
> + bitmap_or(adm, adm, aqm, matrix->adm_max);
> +}
> +
> +static void kvm_ap_log_sharing_err(struct kvm *kvm, unsigned long apid,
> + unsigned long apqi)
> +{
> + pr_err("%s: AP queue %02lx.%04lx is registered to guest %s", __func__,
> + apid, apqi, kvm->arch.dbf->name);
> +}
> +
> +/**
> + * kvm_ap_validate_queue_sharing
> + *
> + * Verifies that the APQNs derived from the cross product of the AP adapter IDs
> + * and AP queue indexes comprising the AP matrix are not configured for
> + * another guest. AP queue sharing is not allowed.
> + *
> + * @kvm: the KVM guest
> + * @matrix: the AP matrix
> + *
> + * Returns 0 if the APQNs are valid, otherwise; returns -EBUSY.
> + */
> +static int kvm_ap_validate_queue_sharing(struct kvm *kvm,
> + struct kvm_ap_matrix *matrix)
> +{
> + struct kvm *vm;
> + unsigned long *apm, *aqm;
> + unsigned long apid, apqi;
> +
> +
> + /* No other VM may share an AP Queue with the input VM */
I wonder whether these functions and structures really belong in KVM.
They only make sense with the VFIO driver.
My opinion is that they belong there, in the VFIO driver code.
This would also make it easier to handle the case where the AP code is not
present but KVM is.
> + list_for_each_entry(vm, &vm_list, vm_list) {
> + if (kvm == vm)
> + continue;
> +
> + apm = kvm_ap_get_crycb_apm(vm);
> + if (!bitmap_and(apm, apm, matrix->apm, matrix->apm_max))
> + continue;
> +
> + aqm = kvm_ap_get_crycb_aqm(vm);
> + if (!bitmap_and(aqm, aqm, matrix->aqm, matrix->aqm_max))
> + continue;
> +
> + for_each_set_bit_inv(apid, apm, matrix->apm_max)
> + for_each_set_bit_inv(apqi, aqm, matrix->aqm_max)
> + kvm_ap_log_sharing_err(kvm, apid, apqi);
> +
> + return -EBUSY;
> + }
> +
> + return 0;
> +}
> +
> +static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
> + struct ap_config_info *config)
> +{
> + int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
At this point you already know the format of the CRYCB.
Why don't you just use a virtual CRYCB instead of allocating bitmaps?
It would be much easier to handle and compare.
> +
> + ap_matrix->apm = kzalloc(KVM_AP_MASK_BYTES(apm_max), GFP_KERNEL);
> + if (!ap_matrix->apm)
> + return -ENOMEM;
> +
> + ap_matrix->apm_max = apm_max;
> +
> + return 0;
> +}
> +
> +static int kvm_ap_matrix_aqm_create(struct kvm_ap_matrix *ap_matrix,
> + struct ap_config_info *config)
> +{
> + int aqm_max = (config && config->apxa) ? config->Nd + 1 : 16;
> +
> + ap_matrix->aqm = kzalloc(KVM_AP_MASK_BYTES(aqm_max), GFP_KERNEL);
> + if (!ap_matrix->aqm)
> + return -ENOMEM;
> +
> + ap_matrix->aqm_max = aqm_max;
> +
> + return 0;
> +}
> +
> +static int kvm_ap_matrix_adm_create(struct kvm_ap_matrix *ap_matrix,
> + struct ap_config_info *config)
> +{
> + int adm_max = (config && config->apxa) ? config->Nd + 1 : 16;
> +
> + ap_matrix->adm = kzalloc(KVM_AP_MASK_BYTES(adm_max), GFP_KERNEL);
> + if (!ap_matrix->adm)
> + return -ENOMEM;
> +
> + ap_matrix->adm_max = adm_max;
> +
> + return 0;
> +}
> +
> +static void kvm_ap_matrix_masks_destroy(struct kvm_ap_matrix *ap_matrix)
> +{
> + kfree(ap_matrix->apm);
> + kfree(ap_matrix->aqm);
> + kfree(ap_matrix->adm);
> +}
> +
> +int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix)
> +{
It seems much easier to me to return a pointer to the allocated
kvm_ap_matrix than to provide the address of the pointer to update.
> + int ret;
> + struct kvm_ap_matrix *matrix;
> + struct ap_config_info config;
> + struct ap_config_info *config_info = NULL;
> +
> + memset(&config, 0, sizeof(config));
> +
> + ret = ap_query_configuration(&config);
> + if (ret) {
> + if (ret != -EOPNOTSUPP)
> + return ret;
> + } else {
> + config_info = &config;
> + }
Since you pass a non-NULL argument, you can simplify this test to
	if (ret)
		return ret;
I do not think that you really need both config and config_info.
> +
> + matrix = kzalloc(sizeof(*matrix), GFP_KERNEL);
> + if (!matrix)
> + return -ENOMEM;
> +
> + ret = kvm_ap_matrix_apm_create(matrix, config_info);
> + if (ret)
> + goto mask_create_err;
> +
> + ret = kvm_ap_matrix_aqm_create(matrix, config_info);
> + if (ret)
> + goto mask_create_err;
> +
> + ret = kvm_ap_matrix_adm_create(matrix, config_info);
> + if (ret)
> + goto mask_create_err;
> +
> + *ap_matrix = matrix;
> +
> + return 0;
> +
> +mask_create_err:
> + kvm_ap_matrix_masks_destroy(matrix);
> + kfree(matrix);
> + return ret;
> +}
> +EXPORT_SYMBOL(kvm_ap_matrix_create);
> +
> +void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix)
> +{
> + kvm_ap_matrix_masks_destroy(ap_matrix);
> + kfree(ap_matrix);
> +}
> +EXPORT_SYMBOL(kvm_ap_matrix_destroy);
> +
> +int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix)
> +{
> + int ret = 0;
> +
> + mutex_lock(&kvm->lock);
> +
> + ret = kvm_ap_validate_queue_sharing(kvm, matrix);
> + if (ret)
> + goto done;
> +
> + kvm_ap_set_crycb_masks(kvm, matrix);
> +
> +done:
> + mutex_unlock(&kvm->lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(kvm_ap_configure_matrix);
> +
> +void kvm_ap_deconfigure_matrix(struct kvm *kvm)
> +{
> + kvm_ap_clear_crycb_masks(kvm);
> +}
> +EXPORT_SYMBOL(kvm_ap_deconfigure_matrix);
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index d41b0b8..647ea24 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -10,6 +10,7 @@
> #include <linux/device.h>
> #include <linux/list.h>
> #include <linux/ctype.h>
> +#include <asm/kvm-ap.h>
>
> #include "vfio_ap_private.h"
>
> @@ -18,8 +19,23 @@
>
> static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
> {
> + int ret;
> + struct ap_matrix_mdev *matrix_mdev;
> struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
> + struct kvm_ap_matrix *matrix;
> +
> + ret = kvm_ap_matrix_create(&matrix);
> + if (ret)
> + return ret;
> +
> + matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
> + if (!matrix_mdev) {
> + kvm_ap_matrix_destroy(matrix);
> + return -ENOMEM;
> + }
>
> + matrix_mdev->matrix = matrix;
> + mdev_set_drvdata(mdev, matrix_mdev);
> ap_matrix->available_instances--;
>
> return 0;
> @@ -28,7 +44,10 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
> static int vfio_ap_mdev_remove(struct mdev_device *mdev)
> {
> struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>
> + kvm_ap_matrix_destroy(matrix_mdev->matrix);
> + kfree(matrix_mdev);
> ap_matrix->available_instances++;
>
> return 0;
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index c47aeec..f248faf 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -29,6 +29,10 @@ struct ap_matrix {
> int available_instances;
> };
>
> +struct ap_matrix_mdev {
> + struct kvm_ap_matrix *matrix;
> +};
> +
> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
> {
> return container_of(dev, struct ap_matrix, device);
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 04/23/2018 03:03 AM, Cornelia Huck wrote:
> On Sun, 22 Apr 2018 10:52:55 -0400
> Tony Krowiak <[email protected]> wrote:
>
>>>>>>> (Not providing a crycb if APXA is not available would be loss of
>>>>>>> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
>>>>>>> available is a different game, of course.)
>>>>>> This would require a change to enabling the CPU model feature for
>>>>>> AP.
>>>>> But would it actually make sense to tie vfio-ap to APXA? This needs to
>>>>> be answered by folks with access to the architecture :)
>>>> I don't see any reason to do that from an architectural perspective.
>>>> One can access AP devices whether APXA is installed or not, it just limits
>>>> the range of devices that can be addressed
>>> So I guess we should not introduce a tie-in then (unless it radically
>>> simplifies the code...)
>> I'm not clear about what you mean by introducing a tie-in. Can you
>> clarify that?
> Making vfio-ap depend on APXA.
I don't think vfio-ap should be dependent upon APXA for the reasons I
stated above.
>
On Tue, 24 Apr 2018 09:01:12 -0400
Tony Krowiak <[email protected]> wrote:
> On 04/23/2018 03:03 AM, Cornelia Huck wrote:
> > On Sun, 22 Apr 2018 10:52:55 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >>>>>>> (Not providing a crycb if APXA is not available would be loss of
> >>>>>>> functionality, I guess? Deciding not to provide vfio-ap if APXA is not
> >>>>>>> available is a different game, of course.)
> >>>>>> This would require a change to enabling the CPU model feature for
> >>>>>> AP.
> >>>>> But would it actually make sense to tie vfio-ap to APXA? This needs to
> >>>>> be answered by folks with access to the architecture :)
> >>>> I don't see any reason to do that from an architectural perspective.
> >>>> One can access AP devices whether APXA is installed or not, it just limits
> >>>> the range of devices that can be addressed
> >>> So I guess we should not introduce a tie-in then (unless it radically
> >>> simplifies the code...)
> >> I'm not clear about what you mean by introducing a tie-in. Can you
> >> clarify that?
> > Making vfio-ap depend on APXA.
>
> I don't think vfio-ap should be dependent upon APXA for the reasons I
> stated above.
>
> >
>
It seems we are in violent agreement :)
On 04/23/2018 09:46 AM, Pierre Morel wrote:
> On 15/04/2018 23:22, Tony Krowiak wrote:
>> Provides interfaces to assign AP adapters, usage domains
>> and control domains to a KVM guest.
>>
>> A KVM guest is started by executing the Start Interpretive Execution
>> (SIE)
>> instruction. The SIE state description is a control block that
>> contains the
>> state information for a KVM guest and is supplied as input to the SIE
>> instruction. The SIE state description has a satellite structure
>> called the
>> Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
>> identifying the adapters, queues (domains) and control domains
>> assigned to
>> the KVM guest:
>>
>> * The AP Adapter Mask (APM) field identifies the AP adapters assigned to
>> the KVM guest
>>
>> * The AP Queue Mask (AQM) field identifies the AP queues assigned to
>> the KVM guest. Each AP queue is connected to a usage domain within
>> an AP adapter.
>>
>> * The AP Domain Mask (ADM) field identifies the control domains
>> assigned to the KVM guest.
>>
>> Each adapter, queue (usage domain) and control domain are identified by
>> a number from 0 to 255. The bits in each mask, from most significant to
>> least significant bit, correspond to the numbers 0-255. When a bit is
>> set, the corresponding adapter, queue (usage domain) or control domain
>> is assigned to the KVM guest.
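To make the bit numbering concrete, here is a minimal userspace sketch. It
assumes a 256-bit mask stored as 32 bytes, with ID 0 mapped to the most
significant bit of the first byte, as described above. The helper names
mask_set_inv() and mask_test_inv() are made up for this illustration; in the
kernel, the s390 set_bit_inv()/test_bit_inv() helpers play this role on
unsigned long arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AP_MASK_BITS	256
#define AP_MASK_BYTES	(AP_MASK_BITS / 8)

/* Hypothetical helper: mark ID `id` as assigned (ID 0 is the MSB of byte 0). */
static void mask_set_inv(uint8_t *mask, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	mask[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* Hypothetical helper: return 1 if ID `id` is assigned, 0 otherwise. */
static int mask_test_inv(const uint8_t *mask, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	return (mask[id / 8] & (0x80u >> (id % 8))) != 0;
}
```

With this numbering, assigning adapter 4 sets the value 0x08 in the first
byte of the mask, not 0x10.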
>>
>> This patch will set the bits in the APM, AQM and ADM fields of the
>> CRYCB referenced by the KVM guest's SIE state description. The process
>> used is:
>>
>> 1. Verify that the bits to be set do not exceed the maximum bit
>> number for the given mask.
>>
>> 2. Verify that the APQNs that can be derived from the intersection
>> of the bits set in the APM and AQM fields of the KVM guest's CRYCB
>> are not assigned to any other KVM guest running on the same linux
>> host.
>>
>> 3. Set the APM, AQM and ADM in the CRYCB according to the matrix
>> configured for the mediated matrix device via its sysfs
>> adapter, domain and control domain attribute files respectively.
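Step 2 can be sketched in userspace as follows. Because a guest gets the
cross product of its adapters and domains, two configurations share an APQN
exactly when both their adapter masks and their domain masks overlap. The
struct ap_masks type and both function names are assumptions made for this
sketch and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define AP_MASK_BYTES 32

/* Hypothetical stand-in for the APM/AQM pair held in a guest's CRYCB. */
struct ap_masks {
	uint8_t apm[AP_MASK_BYTES];
	uint8_t aqm[AP_MASK_BYTES];
};

/* Return 1 if the two masks have at least one bit in common; read-only. */
static int masks_intersect(const uint8_t *a, const uint8_t *b)
{
	for (size_t i = 0; i < AP_MASK_BYTES; i++)
		if (a[i] & b[i])
			return 1;
	return 0;
}

/*
 * Return -EBUSY if any APQN of `req` is already held by one of the `n`
 * configurations in `others`, otherwise return 0.
 */
static int validate_queue_sharing(const struct ap_masks *req,
				  const struct ap_masks *others, size_t n)
{
	assert(req != NULL && (others != NULL || n == 0));

	for (size_t i = 0; i < n; i++)
		if (masks_intersect(req->apm, others[i].apm) &&
		    masks_intersect(req->aqm, others[i].aqm))
			return -EBUSY;
	return 0;
}
```

The sketch only reads the masks of the other configurations; it never
writes to them.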
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/include/asm/kvm-ap.h | 79 ++++++++++
>> arch/s390/kvm/kvm-ap.c | 259
>> +++++++++++++++++++++++++++++++++
>> drivers/s390/crypto/vfio_ap_ops.c | 19 +++
>> drivers/s390/crypto/vfio_ap_private.h | 4 +
>> 4 files changed, 361 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm-ap.h
>> b/arch/s390/include/asm/kvm-ap.h
>> index a6c092e..a068244 100644
>> --- a/arch/s390/include/asm/kvm-ap.h
>> +++ b/arch/s390/include/asm/kvm-ap.h
>> @@ -12,6 +12,34 @@
>>
>> #include <linux/types.h>
>> #include <linux/kvm_host.h>
>> +#include <linux/types.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/bitops.h>
>> +
>> +#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
>> +
>> +/**
>> + * The AP matrix is comprised of three bit masks identifying the
>> adapters,
>> + * queues (domains) and control domains that belong to an AP matrix.
>> The bits in
>> + * each mask, from least significant to most significant bit,
>> correspond to IDs
>> + * 0 to the maximum ID allowed for a given mask. When a bit is set, the
>> + * corresponding ID belongs to the matrix.
>> + *
>> + * @apm_max: max number of bits in @apm
>> + * @apm identifies the AP adapters in the matrix
>> + * @aqm_max: max number of bits in @aqm
>> + * @aqm identifies the AP queues (domains) in the matrix
>> + * @adm_max: max number of bits in @adm
>> + * @adm identifies the AP control domains in the matrix
>> + */
>> +struct kvm_ap_matrix {
>> + int apm_max;
>> + unsigned long *apm;
>> + int aqm_max;
>> + unsigned long *aqm;
>> + int adm_max;
>> + unsigned long *adm;
>> +};
>>
>> /**
>> * kvm_ap_instructions_installed()
>> @@ -51,4 +79,55 @@
>> */
>> int kvm_ap_interpret_instructions(struct kvm *kvm, bool enable);
>>
>> +/**
>> + * kvm_ap_matrix_create
>> + *
>> + * Create an AP matrix to hold a configuration of AP adapters,
>> domains and
>> + * control domains.
>> + *
>> + * @ap_matrix: holds the matrix that is created
>> + *
>> + * Returns 0 if the matrix is successfully created. Returns an error
>> if an APQN
>> + * derived from the cross product of the AP adapter IDs and AP queue
>> indexes
>> + * comprising the AP matrix is configured for another guest.
>> + */
>> +int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix);
>
> why not simply return the pointer?
The function returns a value indicating the reason a matrix could not be
created. Returning a NULL pointer provides no clue as to why the call failed.
>
>
>> +
>> +/**
>> + * kvm_ap_matrix_destroy
>> + *
>> + * Destroy an AP matrix by de-allocating all storage allocated by the
>> + * kvm_ap_matrix_create function.
>> + *
>> + * @ap_matrix: the matrix to destroy
>> + */
>> +void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix);
>> +
>> +/**
>> + * kvm_ap_configure_matrix
>> + *
>> + * Configure the AP matrix for a KVM guest.
>> + *
>> + * @kvm: the KVM guest
>> + * @matrix: the matrix configuration information
>> + *
>> + * Returns 0 if:
>> + * 1. The AP instructions are installed on the guest
>> + * 2. The APQNs derived from the intersection of the set of adapter
>> + * IDs (APM) and queue indexes (AQM) in @matrix are not
>> configured for
>> + * any other KVM guest running on the same linux host.
>> + * Otherwise returns an error code.
>> + */
>> +int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix
>> *matrix);
>> +
>> +/**
>> + * kvm_ap_deconfigure_matrix
>> + *
>> + * Deconfigure the AP matrix for a KVM guest. Clears all of the bits
>> in the
>> + * APM, AQM and ADM in the guest's CRYCB.
>> + *
>> + * @kvm: the KVM guest
>> + */
>> +void kvm_ap_deconfigure_matrix(struct kvm *kvm);
>> +
>> #endif /* _ASM_KVM_AP */
>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>> index 55d11b5..f7226d8 100644
>> --- a/arch/s390/kvm/kvm-ap.c
>> +++ b/arch/s390/kvm/kvm-ap.c
>> @@ -9,6 +9,7 @@
>> #include <linux/kernel.h>
>> #include <asm/kvm-ap.h>
>> #include <asm/ap.h>
>> +#include <linux/bitops.h>
>>
>> #include "kvm-s390.h"
>>
>> @@ -78,3 +79,261 @@ int kvm_ap_interpret_instructions(struct kvm
>> *kvm, bool enable)
>> return ret;
>> }
>> EXPORT_SYMBOL(kvm_ap_interpret_instructions);
>> +
>> +static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
>> +{
>> + memset(&kvm->arch.crypto.crycb->apcb0, 0,
>> + sizeof(kvm->arch.crypto.crycb->apcb0));
>> + memset(&kvm->arch.crypto.crycb->apcb1, 0,
>> + sizeof(kvm->arch.crypto.crycb->apcb1));
>> +}
>> +
>> +static inline unsigned long *kvm_ap_get_crycb_apm(struct kvm *kvm)
>> +{
>> + unsigned long *apm;
>> +
>> + switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
>> + case CRYCB_FORMAT1:
>> + apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
>> + break;
>> + case CRYCB_FORMAT2:
>> + apm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.apm;
>> + break;
>> + default:
>> + apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
>> + break;
>> + }
>> +
>> + return apm;
>> +}
>> +
>> +static inline unsigned long *kvm_ap_get_crycb_aqm(struct kvm *kvm)
>> +{
>> + unsigned long *aqm;
>> +
>> + switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
>> + case CRYCB_FORMAT1:
>> + aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
>> + break;
>> + case CRYCB_FORMAT2:
>> + aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.aqm;
>> + break;
>> + default:
>> + aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
>> + break;
>> + }
>> +
>> + return aqm;
>> +}
>> +
>> +static inline unsigned long *kvm_ap_get_crycb_adm(struct kvm *kvm)
>> +{
>> + unsigned long *adm;
>> +
>> + switch (kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK) {
>> + case CRYCB_FORMAT1:
>> + adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
>> + break;
>> + case CRYCB_FORMAT2:
>> + adm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.adm;
>> + break;
>> + default:
>> + adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
>> + break;
>> + }
>> +
>> + return adm;
>> +}
>> +
>> +static void kvm_ap_set_crycb_masks(struct kvm *kvm,
>> + struct kvm_ap_matrix *matrix)
>> +{
>> + unsigned long *apm = kvm_ap_get_crycb_apm(kvm);
>> + unsigned long *aqm = kvm_ap_get_crycb_aqm(kvm);
>> + unsigned long *adm = kvm_ap_get_crycb_adm(kvm);
>> +
>> + kvm_ap_clear_crycb_masks(kvm);
>> + memcpy(apm, matrix->apm, KVM_AP_MASK_BYTES(matrix->apm_max));
>> + memcpy(aqm, matrix->aqm, KVM_AP_MASK_BYTES(matrix->aqm_max));
>> +
>> + /*
>> + * Merge the AQM and ADM since the ADM is a superset of the
>> + * AQM by agreed-upon convention.
>> + */
>> + bitmap_or(adm, adm, aqm, matrix->adm_max);
>> +}
>> +
>> +static void kvm_ap_log_sharing_err(struct kvm *kvm, unsigned long apid,
>> + unsigned long apqi)
>> +{
>> + pr_err("%s: AP queue %02lx.%04lx is registered to guest %s",
>> __func__,
>> + apid, apqi, kvm->arch.dbf->name);
>> +}
>> +
>> +/**
>> + * kvm_ap_validate_queue_sharing
>> + *
>> + * Verifies that the APQNs derived from the cross product of the AP
>> adapter IDs
>> + * and AP queue indexes comprising the AP matrix are not configured for
>> + * another guest. AP queue sharing is not allowed.
>> + *
>> + * @kvm: the KVM guest
>> + * @matrix: the AP matrix
>> + *
>> + * Returns 0 if the APQNs are valid, otherwise; returns -EBUSY.
>> + */
>> +static int kvm_ap_validate_queue_sharing(struct kvm *kvm,
>> + struct kvm_ap_matrix *matrix)
>> +{
>> + struct kvm *vm;
>> + unsigned long *apm, *aqm;
>> + unsigned long apid, apqi;
>> +
>> +
>> + /* No other VM may share an AP Queue with the input VM */
>
> I wonder if these functions and structures should really belong to KVM.
> The only have sense with the VFIO driver.
> My opinion is that they belong there, in the VFIO driver code.
I disagree for two reasons:
1. The vfio_ap driver should not have to know how to configure the KVM
guest's matrix nor anything else about KVM for that matter.
2. The interfaces and structures defined in kvm-ap.h and implemented
in kvm-ap.c don't have anything to do with VFIO and can stand alone
to be used by any client code to configure a guest's matrix.
>
>
> This will also make it easier to handle cases where AP code is not
> present
> but KVM is.
That issue is being resolved by making the required zcrypt interfaces
static.
>
>
>> + list_for_each_entry(vm, &vm_list, vm_list) {
>> + if (kvm == vm)
>> + continue;
>> +
>> + apm = kvm_ap_get_crycb_apm(vm);
>> + if (!bitmap_and(apm, apm, matrix->apm, matrix->apm_max))
>> + continue;
>> +
>> + aqm = kvm_ap_get_crycb_aqm(vm);
>> + if (!bitmap_and(aqm, aqm, matrix->aqm, matrix->aqm_max))
>> + continue;
>> +
>> + for_each_set_bit_inv(apid, apm, matrix->apm_max)
>> + for_each_set_bit_inv(apqi, aqm, matrix->aqm_max)
>> + kvm_ap_log_sharing_err(kvm, apid, apqi);
>> +
>> + return -EBUSY;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
>> + struct ap_config_info *config)
>> +{
>> + int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
>
> At this moment you already know the format of the crycb.
How?
> Why don't you just use a virtual CRYCB instead of allocating bitmaps.
> It would be much easier to handle and compare.
What do you mean by a virtual CRYCB?
After thinking about this more, I think allocating bitmaps is a bit of
overkill. I think I can get rid of this function entirely to simplify
things a great deal.
>
>
>
>> +
>> + ap_matrix->apm = kzalloc(KVM_AP_MASK_BYTES(apm_max), GFP_KERNEL);
>> + if (!ap_matrix->apm)
>> + return -ENOMEM;
>> +
>> + ap_matrix->apm_max = apm_max;
>> +
>> + return 0;
>> +}
>> +
>> +static int kvm_ap_matrix_aqm_create(struct kvm_ap_matrix *ap_matrix,
>> + struct ap_config_info *config)
>> +{
>> + int aqm_max = (config && config->apxa) ? config->Nd + 1 : 16;
>> +
>> + ap_matrix->aqm = kzalloc(KVM_AP_MASK_BYTES(aqm_max), GFP_KERNEL);
>> + if (!ap_matrix->aqm)
>> + return -ENOMEM;
>> +
>> + ap_matrix->aqm_max = aqm_max;
>> +
>> + return 0;
>> +}
>> +
>> +static int kvm_ap_matrix_adm_create(struct kvm_ap_matrix *ap_matrix,
>> + struct ap_config_info *config)
>> +{
>> + int adm_max = (config && config->apxa) ? config->Nd + 1 : 16;
>> +
>> + ap_matrix->adm = kzalloc(KVM_AP_MASK_BYTES(adm_max), GFP_KERNEL);
>> + if (!ap_matrix->adm)
>> + return -ENOMEM;
>> +
>> + ap_matrix->adm_max = adm_max;
>> +
>> + return 0;
>> +}
>> +
>> +static void kvm_ap_matrix_masks_destroy(struct kvm_ap_matrix
>> *ap_matrix)
>> +{
>> + kfree(ap_matrix->apm);
>> + kfree(ap_matrix->aqm);
>> + kfree(ap_matrix->adm);
>> +}
>> +
>> +int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix)
>> +{
>
> It seems much easier to me to return a pointer to the allocated
> kvm_ap_matrix
> than to provide the address of the pointer to update.
See my reasons above. The point may be moot, because I think I'm going to
get rid of the create and destroy functions.
>
>
>> + int ret;
>> + struct kvm_ap_matrix *matrix;
>> + struct ap_config_info config;
>> + struct ap_config_info *config_info = NULL;
>> +
>> + memset(&config, 0, sizeof(config));
>> +
>> + ret = ap_query_configuration(&config);
>> + if (ret) {
>> + if (ret != -EOPNOTSUPP)
>> + return ret;
>> + } else {
>> + config_info = &config;
>> + }
> since you give a non NULL argument you can make this test easier as
> if (ret)
> return ret;
Not true. The query function can return an error due to an exception, in
which case we simply want to return the error. It can also return
-EOPNOTSUPP because STFLE.12 is not set, in which case we want to continue.
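The intended handling can be sketched in userspace like this. The struct
cfg_info type and apm_bits_from_query() are hypothetical stand-ins invented
for the sketch; only the -EOPNOTSUPP distinction, the Na + 1 calculation and
the default of 16 are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct ap_config_info. */
struct cfg_info {
	unsigned int apxa;	/* extended addressing facility installed */
	unsigned int na;	/* highest adapter ID when apxa is set */
};

/*
 * Map the result of a configuration query to the number of APM bits.
 * `query_ret` is what the query returned, `cfg` is what it filled in.
 * Returns 0 and stores the bit count, or returns a negative errno.
 */
static int apm_bits_from_query(int query_ret, const struct cfg_info *cfg,
			       int *apm_max)
{
	assert(cfg != NULL && apm_max != NULL);

	if (query_ret && query_ret != -EOPNOTSUPP)
		return query_ret;	/* real failure: propagate it */

	if (query_ret == 0 && cfg->apxa)
		*apm_max = (int)cfg->na + 1;
	else
		*apm_max = 16;		/* no query facility or no APXA */

	return 0;
}
```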
>
>
> I do not think that you really need both config and config_info.
True.
>
>
>> +
>> + matrix = kzalloc(sizeof(*matrix), GFP_KERNEL);
>> + if (!matrix)
>> + return -ENOMEM;
>> +
>> + ret = kvm_ap_matrix_apm_create(matrix, config_info);
>> + if (ret)
>> + goto mask_create_err;
>> +
>> + ret = kvm_ap_matrix_aqm_create(matrix, config_info);
>> + if (ret)
>> + goto mask_create_err;
>> +
>> + ret = kvm_ap_matrix_adm_create(matrix, config_info);
>> + if (ret)
>> + goto mask_create_err;
>> +
>> + *ap_matrix = matrix;
>> +
>> + return 0;
>> +
>> +mask_create_err:
>> + kvm_ap_matrix_masks_destroy(matrix);
>> + kfree(matrix);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(kvm_ap_matrix_create);
>> +
>> +void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix)
>> +{
>> + kvm_ap_matrix_masks_destroy(ap_matrix);
>> + kfree(ap_matrix);
>> +}
>> +EXPORT_SYMBOL(kvm_ap_matrix_destroy);
>> +
>> +int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix
>> *matrix)
>> +{
>> + int ret = 0;
>> +
>> + mutex_lock(&kvm->lock);
>> +
>> + ret = kvm_ap_validate_queue_sharing(kvm, matrix);
>> + if (ret)
>> + goto done;
>> +
>> + kvm_ap_set_crycb_masks(kvm, matrix);
>> +
>> +done:
>> + mutex_unlock(&kvm->lock);
>> +
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(kvm_ap_configure_matrix);
>> +
>> +void kvm_ap_deconfigure_matrix(struct kvm *kvm)
>> +{
>> + kvm_ap_clear_crycb_masks(kvm);
>> +}
>> +EXPORT_SYMBOL(kvm_ap_deconfigure_matrix);
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index d41b0b8..647ea24 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -10,6 +10,7 @@
>> #include <linux/device.h>
>> #include <linux/list.h>
>> #include <linux/ctype.h>
>> +#include <asm/kvm-ap.h>
>>
>> #include "vfio_ap_private.h"
>>
>> @@ -18,8 +19,23 @@
>>
>> static int vfio_ap_mdev_create(struct kobject *kobj, struct
>> mdev_device *mdev)
>> {
>> + int ret;
>> + struct ap_matrix_mdev *matrix_mdev;
>> struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
>> + struct kvm_ap_matrix *matrix;
>> +
>> + ret = kvm_ap_matrix_create(&matrix);
>> + if (ret)
>> + return ret;
>> +
>> + matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
>> + if (!matrix_mdev) {
>> + kvm_ap_matrix_destroy(matrix);
>> + return -ENOMEM;
>> + }
>>
>> + matrix_mdev->matrix = matrix;
>> + mdev_set_drvdata(mdev, matrix_mdev);
>> ap_matrix->available_instances--;
>>
>> return 0;
>> @@ -28,7 +44,10 @@ static int vfio_ap_mdev_create(struct kobject
>> *kobj, struct mdev_device *mdev)
>> static int vfio_ap_mdev_remove(struct mdev_device *mdev)
>> {
>> struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>
>> + kvm_ap_matrix_destroy(matrix_mdev->matrix);
>> + kfree(matrix_mdev);
>> ap_matrix->available_instances++;
>>
>> return 0;
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h
>> b/drivers/s390/crypto/vfio_ap_private.h
>> index c47aeec..f248faf 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -29,6 +29,10 @@ struct ap_matrix {
>> int available_instances;
>> };
>>
>> +struct ap_matrix_mdev {
>> + struct kvm_ap_matrix *matrix;
>> +};
>> +
>> static inline struct ap_matrix *to_ap_matrix(struct device *dev)
>> {
>> return container_of(dev, struct ap_matrix, device);
>
>
On 25/04/2018 18:21, Tony Krowiak wrote:
> On 04/23/2018 09:46 AM, Pierre Morel wrote:
>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>> Provides interfaces to assign AP adapters, usage domains
>>> and control domains to a KVM guest.
>>>
...
>>> +/**
>>> + * kvm_ap_matrix_create
>>> + *
>>> + * Create an AP matrix to hold a configuration of AP adapters,
>>> domains and
>>> + * control domains.
>>> + *
>>> + * @ap_matrix: holds the matrix that is created
>>> + *
>>> + * Returns 0 if the matrix is successfully created. Returns an
>>> error if an APQN
>>> + * derived from the cross product of the AP adapter IDs and AP
>>> queue indexes
>>> + * comprising the AP matrix is configured for another guest.
>>> + */
>>> +int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix);
>>
>> why not simply return the pointer?
>
> The function returns a value indicating the reason a matrix could not
> be created.
> Returning a NULL pointer provides no clue as to why the call failed.
That is why ERR_PTR exists :)
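For illustration, a minimal userspace sketch of the idiom. The three helpers
below are simplified re-implementations of what the kernel provides in
<linux/err.h>; struct matrix and matrix_create() are hypothetical and only
serve to show a function that returns either a valid pointer or an encoded
error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace re-implementations of the kernel's ERR_PTR helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for error codes. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object type for the example. */
struct matrix {
	int apm_max;
};

/* Hypothetical allocator: a non-zero `fail_with` simulates a failure. */
static struct matrix *matrix_create(int fail_with)
{
	struct matrix *m;

	assert(fail_with <= 0 && fail_with >= -MAX_ERRNO);

	if (fail_with)
		return ERR_PTR(fail_with);

	m = calloc(1, sizeof(*m));
	if (!m)
		return ERR_PTR(-ENOMEM);

	m->apm_max = 16;
	return m;
}
```

The caller checks IS_ERR() on the result and recovers the reason for the
failure with PTR_ERR(), so no out-parameter is needed.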
...
>>> + * Returns 0 if the APQNs are valid, otherwise; returns -EBUSY.
>>> + */
>>> +static int kvm_ap_validate_queue_sharing(struct kvm *kvm,
>>> + struct kvm_ap_matrix *matrix)
>>> +{
>>> + struct kvm *vm;
>>> + unsigned long *apm, *aqm;
>>> + unsigned long apid, apqi;
>>> +
>>> +
>>> + /* No other VM may share an AP Queue with the input VM */
>>
>> I wonder if these functions and structures should really belong to KVM.
>> The only have sense with the VFIO driver.
>> My opinion is that they belong there, in the VFIO driver code.
>
> I disagree for two reasons:
>
> 1. The vfio_ap driver should not have to know how to configure the KVM
> guest's matrix nor anything else about KVM for that matter.
>
> 2. The interfaces and structures defined in kvm-ap.h and implemented
> in kvm-ap.c don't have anything to do with VFIO and can stand alone
> to be used by any client code to configure a guest's matrix.
If you do this, you will have to change KVM whenever the AP VFIO matrix
protocol for accessing the queues changes, e.g. if some day queues may be
shared between guests.
...
>>> +static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
>>> + struct ap_config_info *config)
>>> +{
>>> + int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
>>
>> At this moment you already know the format of the crycb.
>
> How?
You calculated this in kvm_ap_build_crycbd(), which is called from
kvm_s390_crypto_init(), itself called from kvm_arch_init_vm().
That happens when the VM is started.
kvm_ap_matrix_apm_create() is called much later, when the device is realized.
...
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 05/02/2018 10:57 AM, Pierre Morel wrote:
> On 25/04/2018 18:21, Tony Krowiak wrote:
>> On 04/23/2018 09:46 AM, Pierre Morel wrote:
>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>> Provides interfaces to assign AP adapters, usage domains
>>>> and control domains to a KVM guest.
>>>>
> ...
>>>> +/**
>>>> + * kvm_ap_matrix_create
>>>> + *
>>>> + * Create an AP matrix to hold a configuration of AP adapters,
>>>> domains and
>>>> + * control domains.
>>>> + *
>>>> + * @ap_matrix: holds the matrix that is created
>>>> + *
>>>> + * Returns 0 if the matrix is successfully created. Returns an
>>>> error if an APQN
>>>> + * derived from the cross product of the AP adapter IDs and AP
>>>> queue indexes
>>>> + * comprising the AP matrix is configured for another guest.
>>>> + */
>>>> +int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix);
>>>
>>> why not simply return the pointer?
>>
>> The function returns a value indicating the reason a matrix could not
>> be created.
>> Returning a NULL pointer provides no clue as to why the call failed.
>
> That is why ERR_PTR exists :)
The point is moot: I'm getting rid of this function call and including
struct kvm_ap_matrix as a static member of the parent struct ap_matrix_mdev,
not as a pointer.
>
>
>
> ...
>>>> + * Returns 0 if the APQNs are valid, otherwise; returns -EBUSY.
>>>> + */
>>>> +static int kvm_ap_validate_queue_sharing(struct kvm *kvm,
>>>> + struct kvm_ap_matrix *matrix)
>>>> +{
>>>> + struct kvm *vm;
>>>> + unsigned long *apm, *aqm;
>>>> + unsigned long apid, apqi;
>>>> +
>>>> +
>>>> + /* No other VM may share an AP Queue with the input VM */
>>>
>>> I wonder if these functions and structures should really belong to KVM.
>>> They only make sense with the VFIO driver.
>>> My opinion is that they belong there, in the VFIO driver code.
>>
>> I disagree for two reasons:
>>
>> 1. The vfio_ap driver should not have to know how to configure the KVM
>> guest's matrix nor anything else about KVM for that matter.
>>
>> 2. The interfaces and structures defined in kvm-ap.h and implemented
>> in kvm-ap.c don't have anything to do with VFIO and can stand alone
>> to be used by any client code to configure a guest's matrix.
>
> If you do this, you will have to change KVM whenever the AP VFIO matrix
> protocol for accessing the queues changes.
> For example, suppose that some day the queues may be shared between guests.
> ...
The kvm_ap_configure_matrix(kvm, matrix) interface configures the APM, AQM
and ADM in the guest's CRYCB, which implies AP instructions are being
interpreted. There is nothing in SIE precluding the sharing of AP queues
between guests using SIE to interpret AP instructions, but it is my
opinion - along with several others - that this is not advisable given
that the results are not predictable, not to mention the security
concerns. If the protocol to access queues changes, then we create a
different interface. The other option is to include a flag on the
kvm_ap_configure_matrix(kvm, matrix) interface to indicate whether sharing
is allowed. I don't like this, because we have no way of knowing if the
caller has taken the proper care to ensure the VM sharing the queue should
be allowed access. Besides, when queue sharing is implemented, it is my
opinion that we will intercept the AP instructions and the matrix will not
be configured in the CRYCB. I stick by my response above.
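To illustrate the kind of check being discussed, here is a minimal sketch,
assuming each guest's matrix is reduced to two hypothetical 64-bit masks
(the real APM/AQM are larger bitmaps with their own bit numbering, and the
names below are not from the patch). Because a guest gets the full cross
product of its adapters and domains, two guests share an APQN only if they
have both an adapter and a domain in common:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical masks: bit n set means adapter/domain n is assigned. */
struct ap_masks_sketch {
	uint64_t apm; /* adapter mask */
	uint64_t aqm; /* usage domain (queue index) mask */
};

/* Nonzero if the cross products of a and b have at least one APQN in common. */
static int ap_masks_share_queue(const struct ap_masks_sketch *a,
				const struct ap_masks_sketch *b)
{
	return (a->apm & b->apm) != 0 && (a->aqm & b->aqm) != 0;
}

/*
 * Returns 0 if the candidate shares no queue with any already configured
 * guest, otherwise -EBUSY.
 */
static int ap_validate_no_sharing(const struct ap_masks_sketch *candidate,
				  const struct ap_masks_sketch *others,
				  int nr_others)
{
	int i;

	for (i = 0; i < nr_others; i++) {
		if (ap_masks_share_queue(candidate, &others[i]))
			return -EBUSY;
	}

	return 0;
}
```

With one guest holding card 4 and domains 1 and 2, a second guest asking for
card 3 and domain 1 passes the check, while one asking for card 4 and
domain 2 gets -EBUSY.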
>
>>>> +static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
>>>> + struct ap_config_info *config)
>>>> +{
>>>> + int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
>>>
>>> At this moment you already know the format of the crycb.
>>
>> How?
>
> you calculated this in kvm_ap_build_crycbd() which is called from
> kvm_s390_crypto_init()
> itself called from kvm_arch_init_vm().
> It is when starting the VM.
This structure is used by the vfio_ap driver to store the mediated matrix
device's matrix configuration as well as to configure the CRYCB. The
mediated device's matrix is configured before the guest is started ... it
is the means for configuring the guest's matrix after all. The bottom line
is, this function will be called long before the kvm_ap_build_crycbd()
function is called.
Having said that, I am including the struct kvm_ap_matrix as a static field
in struct ap_matrix_mdev - i.e., not a pointer. Consequently, the apm_max,
aqm_max and adm_max fields will be set by the driver when the mediated
matrix device is created.
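Roughly, and only as a sketch: the struct layouts and the init function
below are hypothetical, and the aqm_max/adm_max formulas are an assumption
made by analogy with the apm_max expression in the patch.

```c
#include <stddef.h>

/* Hypothetical subset of the AP configuration info used by this sketch. */
struct ap_config_sketch {
	unsigned int apxa; /* extended addressing facility installed */
	unsigned int na;   /* highest adapter number (config->Na in the patch) */
	unsigned int nd;   /* highest domain number (assumed analogous field) */
};

/* Simplified matrix: only the size fields discussed above. */
struct kvm_ap_matrix_sketch {
	int apm_max;
	int aqm_max;
	int adm_max;
};

/* The matrix is embedded by value, so there is no allocation that can fail. */
struct ap_matrix_mdev_sketch {
	const char *name;
	struct kvm_ap_matrix_sketch matrix;
};

/* Called when the mediated matrix device is created. */
static void ap_matrix_mdev_sketch_init(struct ap_matrix_mdev_sketch *mdev,
				       const char *name,
				       const struct ap_config_sketch *config)
{
	int ext = config && config->apxa;

	mdev->name = name;
	/* Same shape as the patch: Na + 1 with APXA, otherwise 16. */
	mdev->matrix.apm_max = ext ? (int)config->na + 1 : 16;
	/* Assumption: the domain masks are sized the same way from Nd. */
	mdev->matrix.aqm_max = ext ? (int)config->nd + 1 : 16;
	mdev->matrix.adm_max = ext ? (int)config->nd + 1 : 16;
}
```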
>
>
> kvm_ap_matrix_apm_create() is called much later when realizing the device
kvm_ap_matrix_apm_create() is called by kvm_ap_matrix_create(), which is
called when the mediated matrix device is created - i.e., by the
vfio_ap_mdev_create() function in vfio_ap_ops.c. The mediated device is
created when the UUID is written to the sysfs create file. The mediated
device is used when the guest is started to configure its CRYCB, so
kvm_ap_matrix_apm_create() is called long before the device is realized.
Again, the point is moot because I am getting rid of the dynamic allocation
of struct kvm_ap_matrix.
>
>
> ...
>
>
>
On 03/05/2018 16:41, Tony Krowiak wrote:
> On 05/02/2018 10:57 AM, Pierre Morel wrote:
>> On 25/04/2018 18:21, Tony Krowiak wrote:
>>> On 04/23/2018 09:46 AM, Pierre Morel wrote:
>>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>>> Provides interfaces to assign AP adapters, usage domains
>>>>> and control domains to a KVM guest.
...
> The kvm_ap_configure_matrix(kvm, matrix) interface configures the APM, AQM
> and ADM in the guest's CRYCB, which implies AP instructions are being
> interpreted. There is nothing in SIE precluding the sharing of AP queues
> between guests using SIE to interpret AP instructions, but it is my
> opinion - along with several others - that this is not advisable given
> that the results are not predictable, not to mention the security
> concerns. If the protocol to access queues changes, then we create a
> different interface. The other option is to include a flag on the
> kvm_ap_configure_matrix(kvm, matrix) interface to indicate whether sharing
> is allowed. I don't like this, because we have no way of knowing if the
> caller has taken the proper care to ensure the VM sharing the queue should
> be allowed access. Besides, when queue sharing is implemented, it is my
> opinion that we will intercept the AP instructions and the matrix will not
> be configured in the CRYCB. I stick by my response above.
I mean, validating the queue sharing is a matter for the VFIO driver.
This code is not needed if the VFIO driver is not used.
But it is not very important.
>
>>
>>>>> +static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
>>>>> + struct ap_config_info *config)
>>>>> +{
>>>>> + int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
>>>>
>>>> At this moment you already know the format of the crycb.
>>>
>>> How?
>>
>> you calculated this in kvm_ap_build_crycbd() which is called from
>> kvm_s390_crypto_init()
>> itself called from kvm_arch_init_vm().
>> It is when starting the VM.
>
> This structure is used by the vfio_ap driver to store the mediated matrix
> device's matrix configuration as well as to configure the CRYCB. The
> mediated device's matrix is configured before the guest is started ... it
> is the means for configuring the guest's matrix after all. The bottom line
> is, this function will be called long before the kvm_ap_build_crycbd()
> function is called.
You are right, I was thinking about open; I should have paid more attention.
Sorry.
Pierre
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 15.04.2018 23:22, Tony Krowiak wrote:
> If the AP instructions are not available on the linux host, then
> AP devices can not be interpreted by the SIE. The AP bus has a
This statement is wrong. The instructions can be interpreted by SIE e.g.
if there are no devices assigned to a guest. This is e.g. the case for
!CONFIG_ZCRYPT.
Also, doesn't this directly imply that the other execution control
should also not be used ("intercept AP instructions")? This would be bad.
!CONFIG_ZCRYPT does not imply that you can't emulate AP devices for a
guest.
Why isn't it sufficient to glue CONFIG_ZCRYPT to vfio-ap? This would
make more sense in my opinion. You have no "host devices" that you can
"pass through". But you can still emulate devices or emulate an empty bus.
> function it uses to determine if the AP instructions are
> available. This patch provides a new function that wraps the
> AP bus's function to externalize it for use by KVM.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> Reviewed-by: Pierre Morel <[email protected]>
> Reviewed-by: Harald Freudenberger <[email protected]>
> ---
> arch/s390/include/asm/ap.h | 7 +++++++
> arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
> drivers/s390/crypto/ap_bus.c | 6 ++++++
> 5 files changed, 58 insertions(+), 1 deletions(-)
> create mode 100644 arch/s390/include/asm/kvm-ap.h
> create mode 100644 arch/s390/kvm/kvm-ap.c
>
> diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h
> index c1bedb4..7773bfd 100644
> --- a/arch/s390/include/asm/ap.h
> +++ b/arch/s390/include/asm/ap.h
> @@ -120,4 +120,11 @@ struct ap_queue_status ap_queue_irq_ctrl(ap_qid_t qid,
> struct ap_qirq_ctrl qirqctrl,
> void *ind);
>
> +/**
> + * ap_instructions_installed() - Tests whether AP instructions are installed
> + *
> + * Returns 1 if the AP instructions are installed, otherwise; returns 0
> + */
> +int ap_instructions_installed(void);
> +
> #endif /* _ASM_S390_AP_H_ */
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> new file mode 100644
> index 0000000..84412a9
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2018
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +
> +#ifndef _ASM_KVM_AP
> +#define _ASM_KVM_AP
> +
> +/**
> + * kvm_ap_instructions_installed()
> + *
> + * Tests whether AP instructions are installed on the linux host
> + *
> + * Returns 1 if the AP instructions are installed on the host, otherwise;
> + * returns 0
> + */
> +int kvm_ap_instructions_installed(void);
> +
> +#endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index 05ee90a..1876bfe 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o $(KVM)/irqch
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o kvm-ap.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> new file mode 100644
> index 0000000..1267588
> --- /dev/null
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2018
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +#include <linux/kernel.h>
> +#include <asm/kvm-ap.h>
> +#include <asm/ap.h>
> +
> +int kvm_ap_instructions_installed(void)
> +{
> +#ifdef CONFIG_ZCRYPT
> + return ap_instructions_installed();
> +#else
> + return 0;
> +#endif
> +}
> +EXPORT_SYMBOL(kvm_ap_instructions_installed);
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 35a0c2b..9d108b6 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
> }
> EXPORT_SYMBOL(ap_query_configuration);
>
> +int ap_instructions_installed(void)
> +{
> + return (ap_instructions_available() == 0);
> +}
> +EXPORT_SYMBOL(ap_instructions_installed);
> +
> /**
> * ap_init_configuration(): Allocate and query configuration array.
> */
>
--
Thanks,
David / dhildenb
On 05/04/2018 03:19 AM, David Hildenbrand wrote:
> On 15.04.2018 23:22, Tony Krowiak wrote:
>> If the AP instructions are not available on the linux host, then
>> AP devices can not be interpreted by the SIE. The AP bus has a
> This statement is wrong. The instructions can be interpreted by SIE e.g.
> if there are no devices assigned to a guest. This is e.g. the case for
> !CONFIG_ZCRYPT.
While the statement is admittedly poorly worded, it is not wrong.
Without going into architectural details, if the AP instructions
are not available, they will not be interpreted for guest
level 1 - i.e., the linux host. If AP instructions are not interpreted
for guest level 1, then they will not be interpreted for guest
level 2 regardless of whether ECA_APIE is set for guest level 2 or
not. I don't see how CONFIG_ZCRYPT has anything to do with this.
>
> Also, doesn't this directly imply that the other execution control
> should also not be used ("intercept AP instructions")? This would be bad.
> !CONFIG_ZCRYPT does not imply that you can't emulate AP devices for a
> guest.
Setting CONFIG_ZCRYPT=n simply means that the AP bus will not be built
and therefore the AP bus interfaces will not be available to KVM.
As far as ECA_APIE goes, there are only two choices: Set the bit to
enable SIE interpretation of AP instructions; Clear the bit to use
interception. We are only supporting SIE interpretation of AP
instructions at this time, so we need a sure-fire way to determine
if the AP instructions are installed, which is the point of this patch.
Since there are no intercept handlers at this time, when the AP bus
module on the guest is initialized, the init function will fail and
the bus will not come up. There are protections built into userspace
(QEMU in this case) to ensure that a guest is not started if the CPU
model feature for AP instructions is not turned on for the guest. The
CPU model feature will be enabled by the KVM only if the AP instructions
are installed on the linux host. Again, that is the reason for this
patch.
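As a minimal sketch of the two choices described above: the struct, the
function name and the bit value below are placeholders for illustration and
are not taken from the patch or from the KVM headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit value for this sketch only. */
#define ECA_APIE_SKETCH 0x00000008u

/* Hypothetical stand-in for the per-vCPU SIE control block. */
struct sie_block_sketch {
	uint32_t eca;
};

/*
 * Set the bit to have SIE interpret AP instructions; clear it to fall back
 * to interception. Interpretation is only requested when the AP
 * instructions are installed on the host.
 */
static void ap_sketch_set_interpretation(struct sie_block_sketch *scb,
					 bool ap_installed)
{
	if (ap_installed)
		scb->eca |= ECA_APIE_SKETCH;
	else
		scb->eca &= ~ECA_APIE_SKETCH;
}
```

In the patch, the ap_installed input would come from
kvm_ap_instructions_installed().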
>
> Why isn't it sufficient to glue CONFIG_ZCRYPT to vfio-ap? This would
> make more sense in my opinion. You have no "host devices" that you can
> "pass through". But you can still emulate devices or emulate an empty bus.
As I commented above, we are supporting only pass through AP devices
at this time.
>
>> function it uses to determine if the AP instructions are
>> available. This patch provides a new function that wraps the
>> AP bus's function to externalize it for use by KVM.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> Reviewed-by: Pierre Morel <[email protected]>
>> Reviewed-by: Harald Freudenberger <[email protected]>
>> ---
>> arch/s390/include/asm/ap.h | 7 +++++++
>> arch/s390/include/asm/kvm-ap.h | 23 +++++++++++++++++++++++
>> arch/s390/kvm/Makefile | 2 +-
>> arch/s390/kvm/kvm-ap.c | 21 +++++++++++++++++++++
>> drivers/s390/crypto/ap_bus.c | 6 ++++++
>> 5 files changed, 58 insertions(+), 1 deletions(-)
>> create mode 100644 arch/s390/include/asm/kvm-ap.h
>> create mode 100644 arch/s390/kvm/kvm-ap.c
>>
>> diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h
>> index c1bedb4..7773bfd 100644
>> --- a/arch/s390/include/asm/ap.h
>> +++ b/arch/s390/include/asm/ap.h
>> @@ -120,4 +120,11 @@ struct ap_queue_status ap_queue_irq_ctrl(ap_qid_t qid,
>> struct ap_qirq_ctrl qirqctrl,
>> void *ind);
>>
>> +/**
>> + * ap_instructions_installed() - Tests whether AP instructions are installed
>> + *
>> + * Returns 1 if the AP instructions are installed, otherwise; returns 0
>> + */
>> +int ap_instructions_installed(void);
>> +
>> #endif /* _ASM_S390_AP_H_ */
>> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
>> new file mode 100644
>> index 0000000..84412a9
>> --- /dev/null
>> +++ b/arch/s390/include/asm/kvm-ap.h
>> @@ -0,0 +1,23 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * Adjunct Processor (AP) configuration management for KVM guests
>> + *
>> + * Copyright IBM Corp. 2018
>> + *
>> + * Author(s): Tony Krowiak <[email protected]>
>> + */
>> +
>> +#ifndef _ASM_KVM_AP
>> +#define _ASM_KVM_AP
>> +
>> +/**
>> + * kvm_ap_instructions_installed()
>> + *
>> + * Tests whether AP instructions are installed on the linux host
>> + *
>> + * Returns 1 if the AP instructions are installed on the host, otherwise;
>> + * returns 0
>> + */
>> +int kvm_ap_instructions_installed(void);
>> +
>> +#endif /* _ASM_KVM_AP */
>> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
>> index 05ee90a..1876bfe 100644
>> --- a/arch/s390/kvm/Makefile
>> +++ b/arch/s390/kvm/Makefile
>> @@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o $(KVM)/irqch
>> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>>
>> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
>> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
>> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o kvm-ap.o
>>
>> obj-$(CONFIG_KVM) += kvm.o
>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>> new file mode 100644
>> index 0000000..1267588
>> --- /dev/null
>> +++ b/arch/s390/kvm/kvm-ap.c
>> @@ -0,0 +1,21 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * Adjunct Processor (AP) configuration management for KVM guests
>> + *
>> + * Copyright IBM Corp. 2018
>> + *
>> + * Author(s): Tony Krowiak <[email protected]>
>> + */
>> +#include <linux/kernel.h>
>> +#include <asm/kvm-ap.h>
>> +#include <asm/ap.h>
>> +
>> +int kvm_ap_instructions_installed(void)
>> +{
>> +#ifdef CONFIG_ZCRYPT
>> + return ap_instructions_installed();
>> +#else
>> + return 0;
>> +#endif
>> +}
>> +EXPORT_SYMBOL(kvm_ap_instructions_installed);
>> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
>> index 35a0c2b..9d108b6 100644
>> --- a/drivers/s390/crypto/ap_bus.c
>> +++ b/drivers/s390/crypto/ap_bus.c
>> @@ -210,6 +210,12 @@ int ap_query_configuration(struct ap_config_info *info)
>> }
>> EXPORT_SYMBOL(ap_query_configuration);
>>
>> +int ap_instructions_installed(void)
>> +{
>> + return (ap_instructions_available() == 0);
>> +}
>> +EXPORT_SYMBOL(ap_instructions_installed);
>> +
>> /**
>> * ap_init_configuration(): Allocate and query configuration array.
>> */
>>
>
On 05/03/2018 12:01 PM, Pierre Morel wrote:
> On 03/05/2018 16:41, Tony Krowiak wrote:
>> On 05/02/2018 10:57 AM, Pierre Morel wrote:
>>> On 25/04/2018 18:21, Tony Krowiak wrote:
>>>> On 04/23/2018 09:46 AM, Pierre Morel wrote:
>>>>> On 15/04/2018 23:22, Tony Krowiak wrote:
>>>>>> Provides interfaces to assign AP adapters, usage domains
>>>>>> and control domains to a KVM guest.
> ...
>> The kvm_ap_configure_matrix(kvm, matrix) interface configures the APM,
>> AQM and ADM in the guest's CRYCB, which implies AP instructions are being
>> interpreted. There is nothing in SIE precluding the sharing of AP queues
>> between guests using SIE to interpret AP instructions, but it is my
>> opinion - along with several others - that this is not advisable given
>> that the results are not predictable, not to mention the security
>> concerns. If the protocol to access queues changes, then we create a
>> different interface. The other option is to include a flag on the
>> kvm_ap_configure_matrix(kvm, matrix) interface to indicate whether
>> sharing is allowed. I don't like this, because we have no way of knowing
>> if the caller has taken the proper care to ensure the VM sharing the
>> queue should be allowed access. Besides, when queue sharing is
>> implemented, it is my opinion that we will intercept the AP instructions
>> and the matrix will not be configured in the CRYCB. I stick by my
>> response above.
>
> I mean, validating the queue sharing is a matter for the VFIO driver.
> This code is not needed if the VFIO driver is not used.
> But it is not very important.
Yes, this check could have been implemented in the VFIO driver, but as I
stated above, that would require the driver to "know" the internals of KVM.
I think the KVM logic should be encapsulated in KVM. If we want to allow
sharing of interpreted AP devices, then we can always add a flag to the
interface.
>
>
>>
>>>
>>>>>> +static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix
>>>>>> *ap_matrix,
>>>>>> + struct ap_config_info *config)
>>>>>> +{
>>>>>> + int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
>>>>>
>>>>> At this moment you already know the format of the crycb.
>>>>
>>>> How?
>>>
>>> you calculated this in kvm_ap_build_crycbd() which is called from
>>> kvm_s390_crypto_init()
>>> itself called from kvm_arch_init_vm().
>>> It is when starting the VM.
>>
>> This structure is used by the vfio_ap driver to store the mediated
>> matrix device's matrix configuration as well as to configure the CRYCB.
>> The mediated device's matrix is configured before the guest is started
>> ... it is the means for configuring the guest's matrix after all. The
>> bottom line is, this function will be called long before the
>> kvm_ap_build_crycbd() function is called.
>
> You are right, I was thinking about open; I should have paid more
> attention.
> Sorry.
>
> Pierre
>
On 07.05.2018 16:02, Tony Krowiak wrote:
> On 05/04/2018 03:19 AM, David Hildenbrand wrote:
>> On 15.04.2018 23:22, Tony Krowiak wrote:
>>> If the AP instructions are not available on the linux host, then
>>> AP devices can not be interpreted by the SIE. The AP bus has a
>> This statement is wrong. The instructions can be interpreted by SIE e.g.
>> if there are no devices assigned to a guest. This is e.g. the case for
>> !CONFIG_ZCRYPT.
>
> While the statement is admittedly poorly worded, it is not wrong.
> Without going into architectural details, if the AP instructions
> are not available, they will not be interpreted for guest
> level 1 - i.e., the linux host. If AP instructions are not interpreted
> for guest level 1, then they will not be interpreted for guest
> level 2 regardless of whether ECA_APIE is set for guest level 2 or
> not. I don't see how CONFIG_ZCRYPT has anything to do with this.
>
>
>>
>> Also, doesn't this directly imply that the other execution control
>> should also not be used ("intercept AP instructions")? This would be bad.
>> !CONFIG_ZCRYPT does not imply that you can't emulate AP devices for a
>> guest.
>
> Setting CONFIG_ZCRYPT=n simply means that the AP bus will not be built
> and therefore the AP bus interfaces will not be available to KVM.
> As far as ECA_APIE goes, there are only two choices: Set the bit to
> enable SIE interpretation of AP instructions; Clear the bit to use
I thought somebody once mentioned in one of these threads that there are
actually 2 different bits: one to control interpretation and one to
control interception.
> interception. We are only supporting SIE interpretation of AP
> instructions at this time, so we need a sure-fire way to determine
> if the AP instructions are installed, which is the point of this patch.
> Since there are no intercept handlers at this time, when the AP bus
> module on the guest is initialized, the init function will fail and
> the bus will not come up. There are protections built into userspace
> (QEMU in this case) to ensure that a guest is not started if the CPU
> model feature for AP instructions is not turned on for the guest. The
> CPU model feature will be enabled by the KVM only if the AP instructions
> are installed on the linux host. Again, that is the reason for this
> patch.
>
--
Thanks,
David / dhildenb