2018-03-14 18:29:11

by Tony Krowiak

Subject: [PATCH v3 00/14] s390: vfio-ap: guest dedicated crypto adapters

On s390, we have cryptographic coprocessor cards, which are modeled on
Linux as devices on the AP bus. Each card can be partitioned into domains
which can be thought of as a set of hardware registers for processing
crypto commands. Crypto commands are sent to a specific domain within a
card via a queue, which is identified by a (card,domain) tuple. We model
this something like the following (assuming we have access to cards 3 and
4 and domains 1 and 2):

AP -> card3 -> queue (3,1)
            -> queue (3,2)
   -> card4 -> queue (4,1)
            -> queue (4,2)

If we want to virtualize this, we can use a feature provided by the
hardware. We basically attach a satellite control block to our main
hardware virtualization control block and the hardware takes care of
most of the rest.

For this control block, we don't specify explicit tuples, but a list of
cards and a list of domains. The guest will get access to the cross
product.
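As a rough illustration of what "cross product" means here, the following
userspace C sketch enumerates the queue tuples a guest would see for a given
list of cards and list of domains. The struct and function names are
hypothetical and exist only for this example; they are not part of the patch
set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration: one (card, domain) queue tuple. */
struct apqn {
	unsigned int card;
	unsigned int domain;
};

/*
 * Fill @out with every (card, domain) pair formed from the two lists and
 * return the number of pairs written. @out must have room for
 * ncards * ndomains entries.
 */
size_t apqn_cross_product(const unsigned int *cards, size_t ncards,
			  const unsigned int *domains, size_t ndomains,
			  struct apqn *out)
{
	size_t n = 0;

	assert(out != NULL);
	for (size_t i = 0; i < ncards; i++) {
		for (size_t j = 0; j < ndomains; j++) {
			out[n].card = cards[i];
			out[n].domain = domains[j];
			n++;
		}
	}
	return n;
}
```

With cards {3, 4} and domains {1, 2}, this yields the four tuples from the
diagram above: (3,1), (3,2), (4,1) and (4,2). There is no way to express
"(3,1) and (4,2) only" with two lists.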

Because of this, we need to take care that the lists provided to
different guests don't overlap; i.e., we need to enforce sane
configurations. Otherwise, one guest may get access to things like
secret keys for another guest.

The idea of this patch set is to introduce a new device, the matrix
device. This matrix device hangs off a different root and acts as the
parent node for mdev devices.

If you now want to give the tuples (4,1) and (4,2) to a guest, you need to
do the following:

- Unbind the (4,1) and (4,2) tuples from their ap bus driver.
- Bind the (4,1) and (4,2) tuples to the vfio_ap driver.
- Create the mediated device.
- Assign card 4 and domains 1 and 2 to the mediated device.

QEMU will now simply consume the mediated device and things should work.

For a complete description of the architecture and concepts underlying the
design, see the Documentation/s390/vfio-ap.txt file included with this
patch set.

v2 => v3 Change log:
===================
* Set APIE in VCPU setup function
* Renamed patch 13/15 from
"KVM: s390: Configure the guest's CRYCB" to
"KVM: s390: Configure the guest's AP devices"
* Fixed problem with building arch/s390/kvm/kvm-ap.c when CONFIG_ZCRYPT
not selected
* Removed patch introducing VSIE support for AP pending further
investigation
* Initialized AP maximum mask sizes - i.e., APM, AQM and ADM - from info
returned from PQAP(QCI) function
* Introduced a new device attribute to the KVM_S390_VM_CRYPTO attribute
group for setting a flag via the KVM_SET_DEVICE_ATTR ioctl to indicate
whether ECA_APIE should be set or not. The flag is used in the
kvm_s390_vcpu_crypto_setup() function to set ECA_APIE in the SIE block.
* Misc. formatting etc.

Tony Krowiak (14):
KVM: s390: refactor crypto initialization
s390: zcrypt: externalize AP instructions available function
KVM: s390: CPU model support for AP virtualization
KVM: s390: device attribute to set AP interpretive execution
s390: vfio-ap: base implementation of VFIO AP device driver
s390: vfio-ap: register matrix device with VFIO mdev framework
KVM: s390: interfaces to configure/deconfigure guest's AP matrix
s390: vfio-ap: sysfs interfaces to configure adapters
s390: vfio-ap: sysfs interfaces to configure domains
s390: vfio-ap: sysfs interfaces to configure control domains
s390: vfio-ap: sysfs interface to view matrix mdev matrix
KVM: s390: configure the guest's AP devices
s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl
s390: doc: detailed specifications for AP virtualization

Documentation/s390/vfio-ap.txt | 560 +++++++++++++++++++++
MAINTAINERS | 14 +
arch/s390/Kconfig | 11 +
arch/s390/include/asm/ap.h | 7 +
arch/s390/include/asm/kvm-ap.h | 57 +++
arch/s390/include/asm/kvm_host.h | 3 +
arch/s390/include/uapi/asm/kvm.h | 2 +
arch/s390/kvm/Kconfig | 1 +
arch/s390/kvm/Makefile | 2 +-
arch/s390/kvm/kvm-ap.c | 330 +++++++++++++
arch/s390/kvm/kvm-s390.c | 84 ++--
arch/s390/tools/gen_facilities.c | 2 +
drivers/s390/crypto/Makefile | 4 +
drivers/s390/crypto/ap_bus.c | 6 +
drivers/s390/crypto/vfio_ap_drv.c | 144 ++++++
drivers/s390/crypto/vfio_ap_ops.c | 872 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 47 ++
include/uapi/linux/vfio.h | 2 +
18 files changed, 2093 insertions(+), 55 deletions(-)
create mode 100644 Documentation/s390/vfio-ap.txt
create mode 100644 arch/s390/include/asm/kvm-ap.h
create mode 100644 arch/s390/kvm/kvm-ap.c
create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
create mode 100644 drivers/s390/crypto/vfio_ap_ops.c
create mode 100644 drivers/s390/crypto/vfio_ap_private.h



2018-03-14 18:28:13

by Tony Krowiak

Subject: [PATCH v3 06/14] s390: vfio-ap: register matrix device with VFIO mdev framework

Registers the matrix device created by the VFIO AP device
driver with the VFIO mediated device framework.
Registering the matrix device will create the sysfs
structures needed to create mediated matrix devices,
each of which will be used to configure the AP matrix
for a guest and connect it to the VFIO AP device driver.

Registering the matrix device with the VFIO mediated device
framework will create the following sysfs structures:

/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ create

To create a mediated device for the AP matrix device, write a UUID
to the create file:

uuidgen > create

A symbolic link to the mediated device's directory will be created in the
devices subdirectory named after the generated $uuid:

/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
............... [$uuid]

Signed-off-by: Tony Krowiak <[email protected]>
---
MAINTAINERS | 1 +
drivers/s390/crypto/Makefile | 2 +-
drivers/s390/crypto/vfio_ap_drv.c | 9 +++
drivers/s390/crypto/vfio_ap_ops.c | 105 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 18 ++++++
5 files changed, 134 insertions(+), 1 deletions(-)
create mode 100644 drivers/s390/crypto/vfio_ap_ops.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f129253..2af3815 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11886,6 +11886,7 @@ F: arch/s390/include/asm/kvm/kvm-ap.h
F: arch/s390/kvm/kvm-ap.c
F: drivers/s390/crypto/vfio_ap_drv.c
F: drivers/s390/crypto/vfio_ap_private.h
+F: drivers/s390/crypto/vfio_ap_ops.c

S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
diff --git a/drivers/s390/crypto/Makefile b/drivers/s390/crypto/Makefile
index 48e466e..8d36b05 100644
--- a/drivers/s390/crypto/Makefile
+++ b/drivers/s390/crypto/Makefile
@@ -17,5 +17,5 @@ pkey-objs := pkey_api.o
obj-$(CONFIG_PKEY) += pkey.o

# adjunct processor matrix
-vfio_ap-objs := vfio_ap_drv.o
+vfio_ap-objs := vfio_ap_drv.o vfio_ap_ops.o
obj-$(CONFIG_VFIO_AP) += vfio_ap.o
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index 459e595..b138cb0 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -122,11 +122,20 @@ int __init vfio_ap_init(void)
return ret;
}

+ ret = vfio_ap_mdev_register(ap_matrix);
+ if (ret) {
+ ap_driver_unregister(&vfio_ap_drv);
+ vfio_ap_matrix_dev_destroy(ap_matrix);
+
+ return ret;
+ }
+
return 0;
}

void __exit vfio_ap_exit(void)
{
+ vfio_ap_mdev_unregister(ap_matrix);
ap_driver_unregister(&vfio_ap_drv);
vfio_ap_matrix_dev_destroy(ap_matrix);
}
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
new file mode 100644
index 0000000..4292a5e
--- /dev/null
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -0,0 +1,105 @@
+/*
+ * Adjunct processor matrix VFIO device driver callbacks.
+ *
+ * Copyright IBM Corp. 2017
+ * Author(s): Tony Krowiak <[email protected]>
+ *
+ */
+#include <linux/string.h>
+#include <linux/vfio.h>
+#include <linux/device.h>
+#include <linux/list.h>
+#include <linux/ctype.h>
+
+#include "vfio_ap_private.h"
+
+#define VFOP_AP_MDEV_TYPE_HWVIRT "passthrough"
+#define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device"
+
+static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
+{
+ struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+
+ ap_matrix->available_instances--;
+
+ return 0;
+}
+
+static int vfio_ap_mdev_remove(struct mdev_device *mdev)
+{
+ struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+
+ ap_matrix->available_instances++;
+
+ return 0;
+}
+
+static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
+{
+ return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
+}
+
+MDEV_TYPE_ATTR_RO(name);
+
+static ssize_t available_instances_show(struct kobject *kobj,
+ struct device *dev, char *buf)
+{
+ struct ap_matrix *ap_matrix;
+
+ ap_matrix = to_ap_matrix(dev);
+
+ return sprintf(buf, "%d\n", ap_matrix->available_instances);
+}
+
+MDEV_TYPE_ATTR_RO(available_instances);
+
+static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
+ char *buf)
+{
+ return sprintf(buf, "%s\n", VFIO_DEVICE_API_AP_STRING);
+}
+
+MDEV_TYPE_ATTR_RO(device_api);
+
+static struct attribute *vfio_ap_mdev_type_attrs[] = {
+ &mdev_type_attr_name.attr,
+ &mdev_type_attr_device_api.attr,
+ &mdev_type_attr_available_instances.attr,
+ NULL,
+};
+
+static struct attribute_group vfio_ap_mdev_hwvirt_type_group = {
+ .name = VFOP_AP_MDEV_TYPE_HWVIRT,
+ .attrs = vfio_ap_mdev_type_attrs,
+};
+
+static struct attribute_group *vfio_ap_mdev_type_groups[] = {
+ &vfio_ap_mdev_hwvirt_type_group,
+ NULL,
+};
+
+static const struct mdev_parent_ops vfio_ap_matrix_ops = {
+ .owner = THIS_MODULE,
+ .supported_type_groups = vfio_ap_mdev_type_groups,
+ .create = vfio_ap_mdev_create,
+ .remove = vfio_ap_mdev_remove,
+};
+
+int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
+{
+ int ret;
+
+ ret = mdev_register_device(&ap_matrix->device, &vfio_ap_matrix_ops);
+ if (ret)
+ return ret;
+
+ ap_matrix->available_instances = AP_MATRIX_MAX_AVAILABLE_INSTANCES;
+
+ return 0;
+}
+
+void vfio_ap_mdev_unregister(struct ap_matrix *ap_matrix)
+{
+ ap_matrix->available_instances--;
+ mdev_unregister_device(&ap_matrix->device);
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 21f3697..f11e5e1 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -9,14 +9,32 @@
#define _VFIO_AP_PRIVATE_H_

#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/mdev.h>

#include "ap_bus.h"

#define VFIO_AP_MODULE_NAME "vfio_ap"
#define VFIO_AP_DRV_NAME "vfio_ap"
+#define VFIO_AP_MATRIX_MODULE_NAME "vfio_ap_matrix"
+/**
+ * There must be one mediated matrix device per guest. If every APQN is assigned
+ * to a guest, then the maximum number of guests with a unique APQN assigned
+ * would be 256 adapters x 256 domains = 65536 guests.
+ */
+#define AP_MATRIX_MAX_AVAILABLE_INSTANCES 65536

struct ap_matrix {
struct device device;
+ int available_instances;
};

+static inline struct ap_matrix *to_ap_matrix(struct device *dev)
+{
+ return container_of(dev, struct ap_matrix, device);
+}
+
+extern int vfio_ap_mdev_register(struct ap_matrix *ap_matrix);
+extern void vfio_ap_mdev_unregister(struct ap_matrix *ap_matrix);
+
#endif /* _VFIO_AP_PRIVATE_H_ */
--
1.7.1


2018-03-14 18:28:28

by Tony Krowiak

Subject: [PATCH v3 07/14] KVM: s390: interfaces to configure/deconfigure guest's AP matrix

Provides interfaces to assign AP adapters, usage domains
and control domains to a KVM guest.

A KVM guest is started by executing the Start Interpretive Execution (SIE)
instruction. The SIE state description is a control block that contains the
state information for a KVM guest and is supplied as input to the SIE
instruction. The SIE state description has a satellite structure called the
Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
identifying the adapters, queues (domains) and control domains assigned to
the KVM guest:

* The AP Adapter Mask (APM) field identifies the AP adapters assigned to
the KVM guest

* The AP Queue Mask (AQM) field identifies the AP queues assigned to
the KVM guest. Each AP queue is connected to a usage domain within
an AP adapter.

* The AP Domain Mask (ADM) field identifies the control domains
assigned to the KVM guest.

Each adapter, queue (usage domain) and control domain is identified by
a number from 0 to 255. The bits in each mask, from most significant to
least significant bit, correspond to the numbers 0-255. When a bit is
set, the corresponding adapter, queue (usage domain) or control domain
is assigned to the KVM guest.
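The most-significant-bit-first numbering is easy to get backwards, so here is
a small userspace C sketch of it over a plain byte array. The helper names and
the fixed 256-bit size are assumptions made for this example; the patch itself
works on unsigned long bitmaps with the kernel's inverted-bit helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AP_MASK_BITS 256 /* IDs 0-255, as described above */

/* Set the bit for @id, where ID 0 is the most significant bit of byte 0. */
void ap_mask_set(uint8_t mask[AP_MASK_BITS / 8], unsigned int id)
{
	assert(id < AP_MASK_BITS);
	mask[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* Return true if the bit for @id is set, using the same numbering. */
bool ap_mask_test(const uint8_t mask[AP_MASK_BITS / 8], unsigned int id)
{
	assert(id < AP_MASK_BITS);
	return (mask[id / 8] & (0x80u >> (id % 8))) != 0;
}
```

Under this numbering, assigning adapter 0 sets the top bit of the first byte
(0x80), and assigning adapter 10 sets 0x20 in the second byte.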

This patch will set the bits in the APM, AQM and ADM fields of the
CRYCB referenced by the KVM guest's SIE state description. The process
used is:

1. Verify that the bits to be set do not exceed the maximum bit
number for the given mask.

2. Verify that the APQNs that can be derived from the intersection
of the bits set in the APM and AQM fields of the KVM guest's CRYCB
are not assigned to any other KVM guest running on the same linux
host.

3. Set the APM, AQM and ADM in the CRYCB according to the matrix
configured for the mediated matrix device via its sysfs
adapter, domain and control domain attribute files respectively.
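The three steps above can be sketched in userspace C as follows. This is a
simplified model, not the kernel code: the struct, the function names, the
single shared maximum ID and the -EINVAL return for step 1 are all assumptions
made for illustration (the patch tracks a separate maximum per mask and
returns -EBUSY when a queue is already in use).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_BYTES 32 /* 256 bits: IDs 0-255, most significant bit first */

/* Hypothetical stand-in for the three CRYCB masks of one guest. */
struct guest_masks {
	uint8_t apm[MASK_BYTES];
	uint8_t aqm[MASK_BYTES];
	uint8_t adm[MASK_BYTES];
};

/* Return 1 if any ID above @max_id is set in @mask. */
int mask_exceeds(const uint8_t *mask, unsigned int max_id)
{
	for (unsigned int id = max_id + 1; id < MASK_BYTES * 8; id++)
		if (mask[id / 8] & (0x80u >> (id % 8)))
			return 1;
	return 0;
}

/* Return 1 if the two masks have at least one bit in common. */
int masks_intersect(const uint8_t *a, const uint8_t *b)
{
	for (int i = 0; i < MASK_BYTES; i++)
		if (a[i] & b[i])
			return 1;
	return 0;
}

int configure_guest(struct guest_masks *guest, const struct guest_masks *want,
		    const struct guest_masks *others, size_t nothers,
		    unsigned int max_id)
{
	assert(guest != NULL && want != NULL);

	/* Step 1: reject bits beyond the maximum ID (one limit, for brevity). */
	if (mask_exceeds(want->apm, max_id) ||
	    mask_exceeds(want->aqm, max_id) ||
	    mask_exceeds(want->adm, max_id))
		return -EINVAL;

	/* Step 2: a queue is shared only if adapter AND domain both overlap. */
	for (size_t i = 0; i < nothers; i++)
		if (masks_intersect(want->apm, others[i].apm) &&
		    masks_intersect(want->aqm, others[i].aqm))
			return -EBUSY;

	/* Step 3: install the requested masks for the guest. */
	*guest = *want;
	return 0;
}
```

Note that two guests may share an adapter as long as their domain sets are
disjoint, which is why step 2 requires both masks to intersect before
reporting a conflict.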

Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 36 +++++
arch/s390/kvm/kvm-ap.c | 268 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_ops.c | 19 +++
drivers/s390/crypto/vfio_ap_private.h | 4 +
4 files changed, 327 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 362846c..268f3b2 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -10,7 +10,43 @@
#define _ASM_KVM_AP
#include <linux/types.h>
#include <linux/kvm_host.h>
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+#include <linux/bitops.h>
+
+#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
+
+/**
+ * The AP matrix is comprised of three bit masks identifying the adapters,
+ * queues (domains) and control domains that belong to an AP matrix. The bits in
+ * each mask, from most significant to least significant bit, correspond to IDs
+ * 0 to the maximum ID allowed for a given mask. When a bit is set, the
+ * corresponding ID belongs to the matrix.
+ *
+ * @apm_max: max number of bits in @apm
+ * @apm: identifies the AP adapters in the matrix
+ * @aqm_max: max number of bits in @aqm
+ * @aqm: identifies the AP queues (domains) in the matrix
+ * @adm_max: max number of bits in @adm
+ * @adm: identifies the AP control domains in the matrix
+ */
+struct kvm_ap_matrix {
+ int apm_max;
+ unsigned long *apm;
+ int aqm_max;
+ unsigned long *aqm;
+ int adm_max;
+ unsigned long *adm;
+};

void kvm_ap_build_crycbd(struct kvm *kvm);

+int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix);
+
+void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix);
+
+int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix);
+
+void kvm_ap_deconfigure_matrix(struct kvm *kvm);
+
#endif /* _ASM_KVM_AP */
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
index a2c6ad2..eb365e2 100644
--- a/arch/s390/kvm/kvm-ap.c
+++ b/arch/s390/kvm/kvm-ap.c
@@ -8,9 +8,129 @@

#include <asm/kvm-ap.h>
#include <asm/ap.h>
+#include <linux/bitops.h>

#include "kvm-s390.h"

+static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
+{
+ int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
+
+ if (crycb_fmt == CRYCB_FORMAT2)
+ memset(&kvm->arch.crypto.crycb->apcb1, 0,
+ sizeof(kvm->arch.crypto.crycb->apcb1));
+ else
+ memset(&kvm->arch.crypto.crycb->apcb0, 0,
+ sizeof(kvm->arch.crypto.crycb->apcb0));
+}
+
+static inline unsigned long *kvm_ap_get_crycb_apm(struct kvm *kvm)
+{
+ unsigned long *apm;
+ int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
+
+ if (crycb_fmt == CRYCB_FORMAT2)
+ apm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.apm;
+ else
+ apm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.apm;
+
+ return apm;
+}
+
+static inline unsigned long *kvm_ap_get_crycb_aqm(struct kvm *kvm)
+{
+ unsigned long *aqm;
+ int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
+
+ if (crycb_fmt == CRYCB_FORMAT2)
+ aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.aqm;
+ else
+ aqm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.aqm;
+
+ return aqm;
+}
+
+static inline unsigned long *kvm_ap_get_crycb_adm(struct kvm *kvm)
+{
+ unsigned long *adm;
+ int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
+
+ if (crycb_fmt == CRYCB_FORMAT2)
+ adm = (unsigned long *)kvm->arch.crypto.crycb->apcb1.adm;
+ else
+ adm = (unsigned long *)kvm->arch.crypto.crycb->apcb0.adm;
+
+ return adm;
+}
+
+static void kvm_ap_set_crycb_masks(struct kvm *kvm,
+ struct kvm_ap_matrix *matrix)
+{
+ unsigned long *apm = kvm_ap_get_crycb_apm(kvm);
+ unsigned long *aqm = kvm_ap_get_crycb_aqm(kvm);
+ unsigned long *adm = kvm_ap_get_crycb_adm(kvm);
+
+ kvm_ap_clear_crycb_masks(kvm);
+ memcpy(apm, matrix->apm, KVM_AP_MASK_BYTES(matrix->apm_max));
+ memcpy(aqm, matrix->aqm, KVM_AP_MASK_BYTES(matrix->aqm_max));
+
+ /*
+ * Merge the AQM and ADM since the ADM is a superset of the
+ * AQM by architectural convention.
+ */
+ bitmap_or(adm, matrix->adm, aqm, matrix->adm_max);
+}
+
+static void kvm_ap_log_sharing_err(struct kvm *kvm, unsigned long apid,
+ unsigned long apqi)
+{
+ pr_err("%s: AP queue %02lx.%04lx is registered to guest %s", __func__,
+ apid, apqi, kvm->arch.dbf->name);
+}
+
+/**
+ * kvm_ap_validate_queue_sharing
+ *
+ * Verifies that the APQNs derived from the intersection of the AP adapter IDs
+ * and AP queue indexes comprising the AP matrix are not configured for
+ * another guest. AP queue sharing is not allowed.
+ *
+ * @kvm: the KVM guest
+ * @matrix: the AP matrix
+ *
+ * Returns 0 if the APQNs are valid; otherwise, returns -EBUSY.
+ */
+static int kvm_ap_validate_queue_sharing(struct kvm *kvm,
+ struct kvm_ap_matrix *matrix)
+{
+ struct kvm *vm;
+ unsigned long *apm, *aqm;
+ unsigned long apid, apqi;
+
+
+ /* No other VM may share an AP Queue with the input VM */
+ list_for_each_entry(vm, &vm_list, vm_list) {
+ if (kvm == vm)
+ continue;
+
+ apm = kvm_ap_get_crycb_apm(vm);
+ if (!bitmap_and(apm, apm, matrix->apm, matrix->apm_max))
+ continue;
+
+ aqm = kvm_ap_get_crycb_aqm(vm);
+ if (!bitmap_and(aqm, aqm, matrix->aqm, matrix->aqm_max))
+ continue;
+
+ for_each_set_bit_inv(apid, apm, matrix->apm_max)
+ for_each_set_bit_inv(apqi, aqm, matrix->aqm_max)
+ kvm_ap_log_sharing_err(kvm, apid, apqi);
+
+ return -EBUSY;
+ }
+
+ return 0;
+}
+
static int kvm_ap_apxa_installed(void)
{
int ret;
@@ -46,3 +166,151 @@ void kvm_ap_build_crycbd(struct kvm *kvm)
kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
}
}
+
+static int kvm_ap_matrix_apm_create(struct kvm_ap_matrix *ap_matrix,
+ struct ap_config_info *config)
+{
+ int apm_max = (config && config->apxa) ? config->Na + 1 : 16;
+
+ ap_matrix->apm = kzalloc(KVM_AP_MASK_BYTES(apm_max), GFP_KERNEL);
+ if (!ap_matrix->apm)
+ return -ENOMEM;
+
+ ap_matrix->apm_max = apm_max;
+
+ return 0;
+}
+
+static int kvm_ap_matrix_aqm_create(struct kvm_ap_matrix *ap_matrix,
+ struct ap_config_info *config)
+{
+ int aqm_max = (config && config->apxa) ? config->Nd + 1 : 16;
+
+ ap_matrix->aqm = kzalloc(KVM_AP_MASK_BYTES(aqm_max), GFP_KERNEL);
+ if (!ap_matrix->aqm)
+ return -ENOMEM;
+
+ ap_matrix->aqm_max = aqm_max;
+
+ return 0;
+}
+
+static int kvm_ap_matrix_adm_create(struct kvm_ap_matrix *ap_matrix,
+ struct ap_config_info *config)
+{
+ int adm_max = (config && config->apxa) ? config->Nd + 1 : 16;
+
+ ap_matrix->adm = kzalloc(KVM_AP_MASK_BYTES(adm_max), GFP_KERNEL);
+ if (!ap_matrix->adm)
+ return -ENOMEM;
+
+ ap_matrix->adm_max = adm_max;
+
+ return 0;
+}
+
+static void kvm_ap_matrix_masks_destroy(struct kvm_ap_matrix *ap_matrix)
+{
+ kfree(ap_matrix->apm);
+ kfree(ap_matrix->aqm);
+ kfree(ap_matrix->adm);
+}
+
+int kvm_ap_matrix_create(struct kvm_ap_matrix **ap_matrix)
+{
+ int ret;
+ struct kvm_ap_matrix *matrix;
+ struct ap_config_info config;
+ struct ap_config_info *config_info = NULL;
+
+ memset(&config, 0, sizeof(config));
+
+ ret = ap_query_configuration(&config);
+ if (ret) {
+ if (ret != -EOPNOTSUPP)
+ return ret;
+ } else {
+ config_info = &config;
+ }
+
+ matrix = kzalloc(sizeof(*matrix), GFP_KERNEL);
+ if (!matrix)
+ return -ENOMEM;
+
+ ret = kvm_ap_matrix_apm_create(matrix, config_info);
+ if (ret)
+ goto mask_create_err;
+
+ ret = kvm_ap_matrix_aqm_create(matrix, config_info);
+ if (ret)
+ goto mask_create_err;
+
+ ret = kvm_ap_matrix_adm_create(matrix, config_info);
+ if (ret)
+ goto mask_create_err;
+
+ *ap_matrix = matrix;
+
+ return 0;
+
+mask_create_err:
+ kvm_ap_matrix_masks_destroy(matrix);
+ kfree(matrix);
+ return ret;
+}
+EXPORT_SYMBOL(kvm_ap_matrix_create);
+
+void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix)
+{
+ kvm_ap_matrix_masks_destroy(ap_matrix);
+ kfree(ap_matrix);
+}
+EXPORT_SYMBOL(kvm_ap_matrix_destroy);
+
+/**
+ * kvm_ap_configure_matrix
+ *
+ * Configure the AP matrix for a KVM guest.
+ *
+ * @kvm: the KVM guest
+ * @matrix: the matrix configuration information
+ *
+ * Returns 0 if:
+ * 1. The AP instructions are installed on the guest
+ * 2. The APQNs derived from the intersection of the set of adapter
+ * IDs (APM) and queue indexes (AQM) in @matrix are not configured for
+ * any other KVM guest running on the same linux host.
+ * Otherwise returns an error code.
+ */
+int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix)
+{
+ int ret = 0;
+
+ mutex_lock(&kvm->lock);
+
+ ret = kvm_ap_validate_queue_sharing(kvm, matrix);
+ if (ret)
+ goto done;
+
+ kvm_ap_set_crycb_masks(kvm, matrix);
+
+done:
+ mutex_unlock(&kvm->lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(kvm_ap_configure_matrix);
+
+/**
+ * kvm_ap_deconfigure_matrix
+ *
+ * Deconfigure the AP matrix for a KVM guest. Clears all of the bits in the
+ * APM, AQM and ADM in the guest's CRYCB.
+ *
+ * @kvm: the KVM guest
+ */
+void kvm_ap_deconfigure_matrix(struct kvm *kvm)
+{
+ kvm_ap_clear_crycb_masks(kvm);
+}
+EXPORT_SYMBOL(kvm_ap_deconfigure_matrix);
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 4292a5e..4fda44e 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -10,6 +10,7 @@
#include <linux/device.h>
#include <linux/list.h>
#include <linux/ctype.h>
+#include <asm/kvm-ap.h>

#include "vfio_ap_private.h"

@@ -18,8 +19,23 @@

static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
{
+ int ret;
+ struct ap_matrix_mdev *matrix_mdev;
struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+ struct kvm_ap_matrix *matrix;
+
+ ret = kvm_ap_matrix_create(&matrix);
+ if (ret)
+ return ret;
+
+ matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
+ if (!matrix_mdev) {
+ kvm_ap_matrix_destroy(matrix);
+ return -ENOMEM;
+ }

+ matrix_mdev->matrix = matrix;
+ mdev_set_drvdata(mdev, matrix_mdev);
ap_matrix->available_instances--;

return 0;
@@ -28,7 +44,10 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
static int vfio_ap_mdev_remove(struct mdev_device *mdev)
{
struct ap_matrix *ap_matrix = to_ap_matrix(mdev_parent_dev(mdev));
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

+ kvm_ap_matrix_destroy(matrix_mdev->matrix);
+ kfree(matrix_mdev);
ap_matrix->available_instances++;

return 0;
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index f11e5e1..a388b66 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -29,6 +29,10 @@ struct ap_matrix {
int available_instances;
};

+struct ap_matrix_mdev {
+ struct kvm_ap_matrix *matrix;
+};
+
static inline struct ap_matrix *to_ap_matrix(struct device *dev)
{
return container_of(dev, struct ap_matrix, device);
--
1.7.1


2018-03-14 18:28:59

by Tony Krowiak

Subject: [PATCH v3 13/14] s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl

Introduces ioctl access to the VFIO AP Matrix device driver
by implementing the VFIO_DEVICE_GET_INFO ioctl. This ioctl
provides the VFIO AP Matrix device driver information to the
guest machine.
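The size handshake this ioctl relies on (the caller states how large its
buffer is in argsz, and the handler refuses anything smaller than the fields
it intends to fill) can be modelled in plain userspace C. Everything below is
a stand-in for illustration: the struct mirrors only the leading fields of
struct vfio_device_info, memcpy() takes the place of copy_from_user() and
copy_to_user(), and DEV_FLAGS_AP is a made-up value rather than the real
VFIO_DEVICE_FLAGS_AP definition.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative mirror of the leading fields of struct vfio_device_info. */
struct dev_info {
	uint32_t argsz;
	uint32_t flags;
	uint32_t num_regions;
	uint32_t num_irqs;
};

#define DEV_FLAGS_AP (1u << 5) /* hypothetical flag value for this sketch */

int get_device_info(void *user_buf)
{
	/* Smallest buffer that still contains num_irqs. */
	const size_t minsz = offsetof(struct dev_info, num_irqs) +
			     sizeof(uint32_t);
	struct dev_info info;

	assert(user_buf != NULL);
	memcpy(&info, user_buf, minsz);	/* stands in for copy_from_user() */

	if (info.argsz < minsz)
		return -EINVAL;

	info.flags = DEV_FLAGS_AP;
	info.num_regions = 0;
	info.num_irqs = 0;

	memcpy(user_buf, &info, minsz);	/* stands in for copy_to_user() */
	return 0;
}
```

A caller fills in argsz with the size of its structure before the call and
reads the flags and counts back afterwards; a caller that passes a too-small
argsz gets -EINVAL and its buffer is left unchanged.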

Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 43 +++++++++++++++++++++++++++++++++++++
1 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index c7911da..7223a1c 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -101,6 +101,48 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
&matrix_mdev->group_notifier);
}

+static int vfio_ap_mdev_get_device_info(unsigned long arg)
+{
+ unsigned long minsz;
+ struct vfio_device_info info;
+
+ minsz = offsetofend(struct vfio_device_info, num_irqs);
+
+ if (copy_from_user(&info, (void __user *)arg, minsz))
+ return -EFAULT;
+
+ if (info.argsz < minsz) {
+ pr_err("%s: Argument size %u less than min size %li",
+ VFIO_AP_MATRIX_MODULE_NAME, info.argsz, minsz);
+ return -EINVAL;
+ }
+
+ info.flags = VFIO_DEVICE_FLAGS_AP;
+ info.num_regions = 0;
+ info.num_irqs = 0;
+
+ return copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
+}
+
+static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
+ unsigned int cmd, unsigned long arg)
+{
+ int ret;
+
+ switch (cmd) {
+ case VFIO_DEVICE_GET_INFO:
+ ret = vfio_ap_mdev_get_device_info(arg);
+ break;
+ default:
+ pr_err("%s: ioctl command %d is not a supported command",
+ VFIO_AP_MATRIX_MODULE_NAME, cmd);
+ ret = -EOPNOTSUPP;
+ break;
+ }
+
+ return ret;
+}
+
static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
{
return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
@@ -807,6 +849,7 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
.remove = vfio_ap_mdev_remove,
.open = vfio_ap_mdev_open,
.release = vfio_ap_mdev_release,
+ .ioctl = vfio_ap_mdev_ioctl,
};

int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
--
1.7.1


2018-03-14 18:29:11

by Tony Krowiak

Subject: [PATCH v3 14/14] s390: doc: detailed specifications for AP virtualization

This patch provides documentation describing the AP architecture and
design concepts behind the virtualization of AP devices. It also
includes an example of how to configure AP devices for exclusive
use by KVM guests.

Signed-off-by: Tony Krowiak <[email protected]>
---
Documentation/s390/vfio-ap.txt | 560 ++++++++++++++++++++++++++++++++++++++++
MAINTAINERS | 1 +
2 files changed, 561 insertions(+), 0 deletions(-)
create mode 100644 Documentation/s390/vfio-ap.txt

diff --git a/Documentation/s390/vfio-ap.txt b/Documentation/s390/vfio-ap.txt
new file mode 100644
index 0000000..6752d72
--- /dev/null
+++ b/Documentation/s390/vfio-ap.txt
@@ -0,0 +1,560 @@
+Introduction:
+============
+The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprised
+of three AP instructions and from 1 up to 256 PCIe cryptographic adapter cards.
+The AP devices provide cryptographic functions to all CPUs assigned to a
+linux system running in an IBM Z system LPAR.
+
+The AP adapter cards are exposed via the AP bus. The motivation for vfio-ap
+is to make AP cards available to KVM guests using the VFIO mediated device
+framework. This implementation relies considerably on the s390 virtualization
+facilities which do most of the hard work of providing direct access to AP
+devices.
+
+AP Architectural Overview:
+=========================
+To facilitate the comprehension of the design, let's start with some
+definitions:
+
+* AP adapter
+
+ An AP adapter is an IBM Z adapter card that can perform cryptographic
+ functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
+ assigned to the LPAR in which a linux host is running will be available to
+ the linux host. Each adapter is identified by a number from 0 to 255. When
+ installed, an AP adapter is accessed by AP instructions executed by any CPU.
+
+ The AP adapter cards are assigned to a given LPAR via the system's Activation
+ Profile which can be edited via the HMC. When the system is IPL'd, the AP bus
+ module is loaded and detects the AP adapter cards assigned to the LPAR. The AP
+ bus creates a sysfs device for each adapter as they are detected. For example,
+ if AP adapters 4 and 10 (0x0a) are assigned to the LPAR, the AP bus will
+ create the following sysfs entries:
+
+ /sys/devices/ap/card04
+ /sys/devices/ap/card0a
+
+ Symbolic links to these devices will also be created in the AP bus devices
+ sub-directory:
+
+ /sys/bus/ap/devices/[card04]
+ /sys/bus/ap/devices/[card0a]
+
+* AP domain
+
+ An adapter is partitioned into domains. Each domain can be thought of as
+ a set of hardware registers for processing AP instructions. An adapter can
+ hold up to 256 domains. Each domain is identified by a number from 0 to 255.
+ Domains can be further classified into two types:
+
+ * Usage domains are domains that can be accessed directly to process AP
+ commands
+
+ * Control domains are domains that are accessed indirectly by AP
+ commands sent to a usage domain to control or change the domain, for
+ example, to set a secure private key for the domain.
+
+ The AP usage and control domains are assigned to a given LPAR via the system's
+ Activation Profile which can be edited via the HMC. When the system is IPL'd,
+ the AP bus module is loaded and detects the AP usage and control domains
+ assigned to the LPAR. The domain number of each usage domain will be coupled
+ with the adapter number of each AP adapter assigned to the LPAR to identify
+ the AP queues (see AP Queue section below). The domain number of each control
+ domain will be represented in a bitmask and stored in a sysfs file
+ /sys/bus/ap/ap_control_domain_mask created by the bus. The bits in the mask,
+ from most to least significant bit, correspond to domains 0-255.
+
+* AP Queue
+
+ An AP queue is the means by which an AP command-request message is sent to a
+ usage domain inside a specific adapter. An AP queue is identified by a tuple
+ comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
+ APQI corresponds to a given usage domain number within the adapter. This tuple
+ forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
+ instructions include a field containing the APQN to identify the AP queue to
+ which the AP command-request message is to be sent for processing.
+
+ The AP bus will create a sysfs device for each APQN that can be derived from
+ the intersection of the AP adapter and usage domain numbers detected when the
+ AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
+ domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
+ following sysfs entries:
+
+ /sys/devices/ap/card04/04.0006
+ /sys/devices/ap/card04/04.0047
+ /sys/devices/ap/card0a/0a.0006
+ /sys/devices/ap/card0a/0a.0047
+
+ The following symbolic links to these devices will be created in the AP bus
+ devices subdirectory:
+
+ /sys/bus/ap/devices/[04.0006]
+ /sys/bus/ap/devices/[04.0047]
+ /sys/bus/ap/devices/[0a.0006]
+ /sys/bus/ap/devices/[0a.0047]
+
+* AP Instructions:
+
+ There are three AP instructions:
+
+ * NQAP: to enqueue an AP command-request message to a queue
+ * DQAP: to dequeue an AP command-reply message from a queue
+ * PQAP: to administer the queues
+
+AP and SIE:
+==========
+Let's now see how AP instructions are interpreted by the hardware.
+
+A satellite control block called the Crypto Control Block (CRYCB) is attached
+to our main hardware virtualization control block. The CRYCB contains three
+fields to identify the adapters, usage domains and control domains assigned to
+the KVM guest:
+
+* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
+ to the KVM guest. Each bit in the mask, from most significant to least
+ significant bit, corresponds to an APID from 0-255. If a bit is set, the
+ corresponding adapter is valid for use by the KVM guest.
+
+* The AP Queue Mask (AQM) field is a bit mask identifying the AP queues assigned
+ to the KVM guest. Each bit in the mask, from most significant to least
+ significant bit, corresponds to an AP queue index (APQI) from 0-255. If a bit
+ is set, the corresponding queue is valid for use by the KVM guest.
+
+* The AP Domain Mask field is a bit mask that identifies the AP control domains
+ assigned to the KVM guest. The ADM bit mask controls which domains can be
+ changed by an AP command-request message sent to a usage domain from the
+ guest. Each bit in the mask, from most significant to least significant bit,
+ corresponds to a domain from 0-255. If a bit is set, the corresponding domain
+ can be modified by an AP command-request message sent to a usage domain
+ configured for the KVM guest.
+
+If you recall from the description of an AP Queue, AP instructions include
+an APQN to identify the AP adapter and AP queue to which an AP command-request
+message is to be sent (NQAP and PQAP instructions), or from which a
+command-reply message is to be received (DQAP instruction). The validity of an
+APQN is defined by the matrix calculated from the APM and AQM; it is the
+intersection of all assigned adapter numbers (APM) with all assigned queue
+indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
+assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
+the guest.
+
+The APQNs provide secure key functionality - i.e., a private key is stored on
+the adapter card for each of its domains - so each APQN must be assigned to at
+most one guest or the linux host.
+
+ Example 1: Valid configuration:
+ ------------------------------
+ Guest1: adapters 1,2 domains 5,6
+ Guest2: adapter 1,2 domain 7
+
+ This is valid because both guests have a unique set of APQNs: Guest1 has
+ APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7).
+
+ Example 2: Invalid configuration:
+ --------------------------------
+ Guest1: adapters 1,2 domains 5,6
+ Guest2: adapter 1 domains 6,7
+
+ This is an invalid configuration because both guests have access to
+ APQN (1,6).
+
+The Design:
+===========
+The design introduces three new objects:
+
+1. AP matrix device
+2. VFIO AP device driver (vfio_ap.ko)
+3. AP mediated matrix passthrough device
+
+The VFIO AP device driver
+-------------------------
+The VFIO AP (vfio_ap) device driver serves the following purposes:
+
+1. Provides the interfaces to reserve APQNs for exclusive use of KVM guests.
+
+2. Sets up the VFIO mediated device interfaces to manage the mediated matrix
+ device and create the sysfs interfaces for assigning adapters, usage domains,
+ and control domains comprising the matrix for a KVM guest.
+
+3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's
+ SIE state description to grant the guest access to AP devices.
+
+4. Initializes the CPU model feature indicating that a KVM guest may use
+ AP facilities installed on the linux host.
+
+5. Enables interpretive execution mode for the KVM guest.
+
+Reserve APQNs for exclusive use of KVM guests
+---------------------------------------------
+The following block diagram illustrates the mechanism by which APQNs are
+reserved:
+
+ +------------------+
+ remove | | unbind
+ +------------------->+ cex4queue driver +<-----------+
+ | | | |
+ | +------------------+ |
+ | |
+ | |
+ | |
++--------+---------+ register +------------------+ +-----+------+
+| +<---------+ | bind | |
+| ap_bus | | vfio_ap driver +<-----+ admin |
+| +--------->+ | | |
++------------------+ probe +---+--------+-----+ +------------+
+ | |
+ create | | store APQN
+ | |
+ v v
+ +---+--------+-----+
+ | |
+ | matrix device |
+ | |
+ +------------------+
+
+The process for reserving an AP queue for use by a KVM guest is:
+
+* The vfio_ap driver during its initialization will perform the following:
+ * Create the 'vfio_ap' root device - /sys/devices/vfio_ap
+ * Create the 'matrix' device in the 'vfio_ap' root
+ * Register the matrix device with the device core
+* Register with the ap_bus for AP queue devices of type CEX4, CEX5 and
+ CEX6, providing the vfio_ap driver's probe and remove callback interfaces.
+* The admin unbinds queue cc.qqqq from the cex4queue device driver. This results
+ in the ap_bus calling the device driver's remove interface which
+ unbinds the cc.qqqq queue device from the driver.
+* The admin binds the cc.qqqq queue to the vfio_ap device driver. This results
+ in the ap_bus calling the vfio_ap device driver's probe interface to bind
+ queue cc.qqqq to the driver. The vfio_ap device driver will store the APQN for
+ the queue in the matrix device.
+
+Set up the VFIO mediated device interfaces
+------------------------------------------
+The VFIO AP device driver utilizes the common interface of the VFIO mediated
+device core driver to:
+* Register an AP mediated bus driver to add a mediated matrix device to and
+ remove it from a VFIO group.
+* Create and destroy a mediated matrix device
+* Add a mediated matrix device to and remove it from the AP mediated bus driver
+* Add a mediated matrix device to and remove it from an IOMMU group
+
+The following high-level block diagram shows the main components and interfaces
+of the VFIO AP mediated matrix device driver:
+
+ +-------------+
+ | |
+ | +---------+ | mdev_register_driver() +--------------+
+ | | Mdev | +<-----------------------+ |
+ | | bus | | | vfio_mdev.ko |
+ | | driver | +----------------------->+ |<-> VFIO user
+ | +---------+ | probe()/remove() +--------------+ APIs
+ | |
+ | MDEV CORE |
+ | MODULE |
+ | mdev.ko |
+ | +---------+ | mdev_register_device() +--------------+
+ | |Physical | +<-----------------------+ |
+ | | device | | | vfio_ap.ko |<-> matrix
+ | |interface| +----------------------->+ | device
+ | +---------+ | callback +--------------+
+ +-------------+
+
+During initialization of the vfio_ap module, the matrix device is registered
+with an 'mdev_parent_ops' structure that provides the sysfs attribute
+structures, mdev functions and callback interfaces for managing the mediated
+matrix device.
+
+* sysfs attribute structures:
+ * supported_type_groups
+ The VFIO mediated device framework supports creation of user-defined
+ mediated device types. These mediated device types are specified
+ via the 'supported_type_groups' structure when a device is registered
+ with the mediated device framework. The registration process creates the
+ sysfs structures for each mediated device type specified in the
+ 'mdev_supported_types' sub-directory of the device being registered. Along
+ with the device type, the sysfs attributes of the mediated device type are
+ provided.
+
+ The VFIO AP device driver will register one mediated device type for
+ passthrough devices:
+ /sys/devices/vfio_ap/mdev_supported_types/vfio_ap-passthrough
+ Only the three read-only attributes required by the VFIO mdev framework will
+ be provided:
+ /sys/devices/vfio_ap/mdev_supported_types
+ ... name
+ ... device_api
+ ... available_instances
+ Where:
+ * name: specifies the name of the mediated device type
+ * device_api: the mediated device type's API
+ * available_instances: the number of mediated matrix passthrough devices
+ that can be created
+ * mdev_attr_groups
+ This attribute group identifies the user-defined sysfs attributes of the
+ mediated device. When a device is registered with the VFIO mediated device
+ framework, the sysfs attributes files identified in the 'mdev_attr_groups'
+ structure will be created in the mediated matrix device's directory. The
+ sysfs attributes for a mediated matrix device are:
+ * assign_adapter:
+ A write-only file for assigning an AP adapter to the mediated matrix
+ device. To assign an adapter, the APID of the adapter is written to the
+ file.
+ * assign_domain:
+ A write-only file for assigning an AP usage domain to the mediated matrix
+ device. To assign a domain, the APQI of the AP queue corresponding to a
+ usage domain is written to the file.
+ * assign_control_domain:
+ A write-only file for assigning an AP control domain to the mediated
+ matrix device. To assign a control domain, the ID of a domain to be
+ controlled is written to the file. By architectural convention, the set of
+ control domains will always include the set of usage domains, so it is
+ only necessary to assign control domains that are not also assigned as
+ usage domains.
+
+* functions:
+ * create:
+ allocates the ap_matrix_mdev structure used by the vfio_ap driver to:
+ * Keep track of the available instances
+ * Store the reference to the struct kvm for the KVM guest
+ * Provide the notifier callback that will get invoked to handle the
+ VFIO_GROUP_NOTIFY_SET_KVM event. When received, the vfio_ap driver will
+ store the reference in the mediated matrix device's ap_matrix_mdev
+ structure and enable the interpretive execution mode for the KVM guest.
+ * remove:
+ deallocates the mediated matrix device's ap_matrix_mdev structure.
+
+* callback interfaces
+ * open:
+ The vfio_ap driver uses this callback to register a
+ VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the mdev matrix
+ device. The notifier is invoked when QEMU connects the VFIO iommu group
+ for the mdev matrix device to the MDEV bus. Access to the KVM structure used
+ to set up the KVM guest is provided via this callback.
+ * release:
+ unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the
+ mdev matrix device.
+
+Configure the APM, AQM and ADM in the CRYCB:
+-------------------------------------------
+Configuring the AP matrix for a KVM guest will be performed when the
+VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier callback
+function is called when QEMU connects the VFIO iommu group for the mdev matrix
+device to the MDEV bus. The CRYCB is configured by:
+* Setting the bits in the APM corresponding to the APIDs assigned to the
+ mediated matrix device via its 'assign_adapter' interface.
+* Setting the bits in the AQM corresponding to the APQIs assigned to the
+ mediated matrix device via its 'assign_domain' interface.
+* Setting the bits in the ADM corresponding to the domain IDs assigned to the
+ mediated matrix device via its 'assign_control_domain' interface.
+
+Initialize the CPU model feature for AP
+---------------------------------------
+This design exploits a feature of the SIE architecture called interpretive
+execution (IE). When IE is enabled for a KVM guest, the AP instructions
+executed in the guest will be interpreted by the firmware and the commands
+contained therein will be passed directly through to an AP device assigned to
+the linux host. In order to enable interpretive execution for a KVM guest, SIE
+must have access to the AP facilities installed on the linux host. A new CPU
+model feature is introduced by this design to indicate that the guest will
+directly access the host AP facilities. This feature will be enabled by the
+kernel only if the AP facilities are installed on the linux host. The feature
+must be turned on for the guest in order to access AP devices from the guest.
+For example, to turn the AP facilities on from the QEMU command line:
+
+ /usr/bin/qemu-system-s390x ... -cpu xxx,ap=on
+
+ Where xxx is the CPU model being used.
+
+ If the CPU model feature is not enabled by the kernel, QEMU will fail and
+ report that the feature is not supported.
+
+Example:
+=======
+Let's now provide an example to illustrate how KVM guests may be given
+access to AP facilities. For this example, we will show how to configure
+two guests such that executing the lszcrypt command on the guests would
+look like this:
+
+Guest1
+------
+CARD.DOMAIN TYPE MODE
+------------------------------
+05 CEX5C CCA-Coproc
+05.0004 CEX5C CCA-Coproc
+05.00ab CEX5C CCA-Coproc
+06 CEX5A Accelerator
+06.0004 CEX5A Accelerator
+06.00ab CEX5A Accelerator
+
+Guest2
+------
+CARD.DOMAIN TYPE MODE
+------------------------------
+05 CEX5A Accelerator
+05.0047 CEX5A Accelerator
+05.00ff CEX5A Accelerator
+
+These are the steps:
+
+1. Install the vfio_ap module on the linux host. The dependency chain for the
+ vfio_ap module is:
+ * vfio
+ * mdev
+ * vfio_mdev
+ * vfio_ap
+
+2. Secure the AP queues to be used by the two guests so that the host can not
+ access them. This is done by unbinding each AP Queue device from its
+ respective AP driver. In our example, these queues are bound to the cex4queue
+ driver. The sysfs location of these devices is:
+
+ /sys/bus/ap
+ --- [drivers]
+ ------ [cex4queue]
+ --------- [05.0004]
+ --------- [05.0047]
+ --------- [05.00ab]
+ --------- [05.00ff]
+ --------- [06.0004]
+ --------- [06.00ab]
+ --------- unbind
+
+ To unbind AP queue 05.0004 from the cex4queue device driver:
+
+ echo 05.0004 > unbind
+
+ This must also be done for AP queues 05.00ab, 05.0047, 05.00ff, 06.0004,
+ and 06.00ab.
+
+3. Reserve the queues for use by the two KVM guests. This is accomplished by
+ binding them to the vfio_ap device driver. The sysfs location of the
+ device driver is:
+
+ /sys/bus/ap
+ ---[drivers]
+ ------ [vfio_ap]
+ ---------- bind
+
+ To bind queue 05.0004 to the vfio_ap driver:
+
+ echo 05.0004 > bind
+
+ This must also be done for AP queues 05.00ab, 05.0047, 05.00ff, 06.0004,
+ and 06.00ab.
+
+ Take note that the AP queues bound to the vfio_ap driver will be available
+ for guest usage until they are unbound from the driver, the vfio_ap module
+ is unloaded, or the host system is shut down.
+
+4. Create the mediated devices needed to configure the AP matrixes for the
+ two guests and to provide an interface to the vfio_ap driver for
+ use by the guests:
+
+ /sys/devices/
+ --- [vfio_ap]
+ ------ [matrix] (this is the matrix device)
+ --------- [mdev_supported_types]
+ ------------ [vfio_ap-passthrough] (passthrough mediated matrix device type)
+ --------------- create
+ --------------- [devices]
+
+ To create the mediated devices for the two guests:
+
+ uuidgen > create
+ uuidgen > create
+
+ This will create two mediated devices in the [devices] subdirectory named
+ with the UUID written to the create attribute file. We call them $uuid1
+ and $uuid2:
+
+ /sys/devices/
+ --- [vfio_ap]
+ ------ [matrix]
+ --------- [mdev_supported_types]
+ ------------ [vfio_ap-passthrough]
+ --------------- [devices]
+ ------------------ [$uuid1]
+ --------------------- assign_adapter
+ --------------------- assign_control_domain
+ --------------------- assign_domain
+ --------------------- matrix
+ --------------------- unassign_adapter
+ --------------------- unassign_control_domain
+ --------------------- unassign_domain
+
+ ------------------ [$uuid2]
+ --------------------- assign_adapter
+ --------------------- assign_control_domain
+ --------------------- assign_domain
+ --------------------- matrix
+ --------------------- unassign_adapter
+ --------------------- unassign_control_domain
+ --------------------- unassign_domain
+
+5. The administrator now needs to configure the matrixes for mediated
+ devices $uuid1 (for Guest1) and $uuid2 (for Guest2).
+
+ This is how the matrix is configured for Guest1:
+
+ echo 5 > assign_adapter
+ echo 6 > assign_adapter
+ echo 4 > assign_domain
+ echo 0xab > assign_domain
+
+ By architectural convention, all usage domains - i.e., domains assigned
+ via the assign_domain attribute file - will also be configured in the ADM
+ field of the KVM guest's CRYCB, so there is no need to assign control
+ domains here unless you want to assign control domains that are not
+ assigned as usage domains.
+
+ If a mistake is made configuring an adapter, domain or control domain,
+ you can use the unassign_xxx files to unassign the adapter, domain or
+ control domain.
+
+ To display the matrix configuration for Guest1:
+
+ cat matrix
+
+ This is how the matrix is configured for Guest2:
+
+ echo 5 > assign_adapter
+ echo 0x47 > assign_domain
+ echo 0xff > assign_domain
+
+6. Start Guest1:
+
+ /usr/bin/qemu-system-s390x ... -cpu xxx,ap=on \
+ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...
+
+7. Start Guest2:
+
+ /usr/bin/qemu-system-s390x ... -cpu xxx,ap=on \
+ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...
+
+When the guest is shut down, the mediated matrix device may be removed.
+
+Using our example again, to remove the mediated matrix device $uuid1:
+
+ /sys/devices/
+ --- [vfio_ap]
+ ------ [matrix]
+ --------- [mdev_supported_types]
+ ------------ [vfio_ap-passthrough]
+ --------------- [devices]
+ ------------------ [$uuid1]
+ --------------------- remove
+
+ echo 1 > remove
+
+ This will remove all of the mdev matrix device's sysfs structures. To
+ recreate and reconfigure the mdev matrix device, all of the steps starting
+ with step 4 will have to be performed again.
+
+ It is not necessary to remove an mdev matrix device, but one may want to
+ remove it if no guest will use it during the lifetime of the linux host. If
+ the mdev matrix device is removed, one may want to unbind the AP queues the
+ guest was using from the vfio_ap device driver and bind them back to the
+ default driver. Alternatively, the AP queues can be configured for another
+ mdev matrix (i.e., guest). In either case, one must take care to change the
+ secure key configured for the domain to which the queue is connected.
\ No newline at end of file
diff --git a/MAINTAINERS b/MAINTAINERS
index 2af3815..399da92 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11887,6 +11887,7 @@ F: arch/s390/kvm/kvm-ap.c
F: drivers/s390/crypto/vfio_ap_drv.c
F: drivers/s390/crypto/vfio_ap_private.h
F: drivers/s390/crypto/vfio_ap_ops.c
+F: Documentation/s390/vfio-ap.txt

S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
--
1.7.1


2018-03-14 18:29:23

by Tony Krowiak

Subject: [PATCH v3 12/14] KVM: s390: configure the guest's AP devices

Registers a group notifier during the open of the mediated
matrix device to get information on KVM presence through the
VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the pointer
to the kvm structure is saved inside the mediated matrix
device. Once the VFIO AP device driver has access to KVM,
the AP matrix for the guest can be configured.

Guest access to AP adapters, usage domains and control domains
is controlled by three bit masks referenced from the
Crypto Control Block (CRYCB) referenced from the guest's SIE state
description:

* The AP Mask (APM) controls access to the AP adapters. Each bit
in the APM represents an adapter number - from most significant
to least significant bit - from 0 to 255. The bits in the APM
are set according to the adapter numbers assigned to the mediated
matrix device via its 'assign_adapter' sysfs attribute file.

* The AP Queue Mask (AQM) controls access to the AP queues. Each bit
in the AQM represents an AP queue index - from most significant
to least significant bit - from 0 to 255. A queue index references
a specific domain and is synonymous with the domain number. The
bits in the AQM are set according to the domain numbers assigned
to the mediated matrix device via its 'assign_domain' sysfs
attribute file.

* The AP Domain Mask (ADM) controls access to the AP control domains.
Each bit in the ADM represents a control domain - from most
significant to least significant bit - from 0-255. The
bits in the ADM are set according to the domain numbers assigned
to the mediated matrix device via its 'assign_control_domain'
sysfs attribute file.
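
The "most significant to least significant" numbering used by all
three masks can be illustrated with a small userspace sketch. This is
not the kernel code: the ap_mask structure and the ap_mask_*() helpers
are hypothetical names chosen for illustration, and a plain byte array
stands in for the kernel's bitmap type. It only assumes that ID 0 maps
to the leftmost bit of the first byte, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AP_MASK_BITS 256	/* IDs 0-255 */

/* Hypothetical userspace stand-in for an APM, AQM or ADM bit mask. */
struct ap_mask {
	uint8_t bytes[AP_MASK_BITS / 8];
};

/* Set the bit for @id; ID 0 is the most significant bit of byte 0. */
static void ap_mask_set(struct ap_mask *m, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	m->bytes[id / 8] |= (uint8_t)(0x80u >> (id % 8));
}

/* Clear the bit for @id. */
static void ap_mask_clear(struct ap_mask *m, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	m->bytes[id / 8] &= (uint8_t)~(0x80u >> (id % 8));
}

/* Return true if the bit for @id is set. */
static bool ap_mask_test(const struct ap_mask *m, unsigned int id)
{
	assert(id < AP_MASK_BITS);
	return (m->bytes[id / 8] & (0x80u >> (id % 8))) != 0;
}
```

With this layout, assigning adapter 0 sets the leftmost bit of the
first byte, and assigning adapter 173 (0xad) sets a bit in byte 21 of
the mask.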

The guest will be configured when the file descriptor for the mediated
matrix device is opened. If AP interpretive execution (APIE) is
not turned on for the guest, then the open will fail since the
VFIO AP device driver is dependent upon APIE.

Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 2 +
arch/s390/kvm/kvm-ap.c | 14 +++++++++
drivers/s390/crypto/vfio_ap_ops.c | 50 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 2 +
4 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 679e026..e2d45ed 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -48,6 +48,8 @@ struct kvm_ap_matrix {

void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix);

+int kvm_ap_instructions_interpreted(struct kvm *kvm);
+
int kvm_ap_configure_matrix(struct kvm *kvm, struct kvm_ap_matrix *matrix);

void kvm_ap_deconfigure_matrix(struct kvm *kvm);
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
index eb365e2..c331d53 100644
--- a/arch/s390/kvm/kvm-ap.c
+++ b/arch/s390/kvm/kvm-ap.c
@@ -268,6 +268,20 @@ void kvm_ap_matrix_destroy(struct kvm_ap_matrix *ap_matrix)
EXPORT_SYMBOL(kvm_ap_matrix_destroy);

/**
+ * kvm_ap_instructions_interpreted
+ *
+ * Indicates whether AP instructions are being interpreted on the guest
+ *
+ * Returns 1 if instructions are being interpreted; otherwise, returns 0
+ */
+int kvm_ap_instructions_interpreted(struct kvm *kvm)
+{
+ return test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP) &&
+ kvm->arch.crypto.apie;
+}
+EXPORT_SYMBOL(kvm_ap_instructions_interpreted);
+
+/**
* kvm_ap_configure_matrix
*
* Configure the AP matrix for a KVM guest.
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 04f7a92..c7911da 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -53,6 +53,54 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
return 0;
}

+static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct ap_matrix_mdev *matrix_mdev;
+
+ if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
+ matrix_mdev = container_of(nb, struct ap_matrix_mdev,
+ group_notifier);
+ matrix_mdev->kvm = data;
+ }
+
+ return NOTIFY_OK;
+}
+
+static int vfio_ap_mdev_open(struct mdev_device *mdev)
+{
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ unsigned long events;
+ int ret;
+
+ matrix_mdev->group_notifier.notifier_call = vfio_ap_mdev_group_notifier;
+ events = VFIO_GROUP_NOTIFY_SET_KVM;
+
+ ret = vfio_register_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
+ &events, &matrix_mdev->group_notifier);
+ if (ret)
+ return ret;
+
+ if (!kvm_ap_instructions_interpreted(matrix_mdev->kvm))
+ return -EOPNOTSUPP;
+
+ ret = kvm_ap_configure_matrix(matrix_mdev->kvm,
+ matrix_mdev->matrix);
+ if (ret)
+ return ret;
+
+ return ret;
+}
+
+static void vfio_ap_mdev_release(struct mdev_device *mdev)
+{
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+
+ kvm_ap_deconfigure_matrix(matrix_mdev->kvm);
+ vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
+ &matrix_mdev->group_notifier);
+}
+
static ssize_t name_show(struct kobject *kobj, struct device *dev, char *buf)
{
return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
@@ -757,6 +805,8 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
.mdev_attr_groups = vfio_ap_mdev_attr_groups,
.create = vfio_ap_mdev_create,
.remove = vfio_ap_mdev_remove,
+ .open = vfio_ap_mdev_open,
+ .release = vfio_ap_mdev_release,
};

int vfio_ap_mdev_register(struct ap_matrix *ap_matrix)
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index f6e7ed1..1133735 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -32,6 +32,8 @@ struct ap_matrix {

struct ap_matrix_mdev {
struct kvm_ap_matrix *matrix;
+ struct notifier_block group_notifier;
+ struct kvm *kvm;
};

static inline struct ap_matrix *to_ap_matrix(struct device *dev)
--
1.7.1


2018-03-14 18:30:36

by Tony Krowiak

Subject: [PATCH v3 08/14] s390: vfio-ap: sysfs interfaces to configure adapters

Provides the sysfs interfaces for assigning AP adapters to
and unassigning AP adapters from a mediated matrix device.

The IDs of the AP adapters assigned to the mediated matrix
device are stored in an AP mask (APM). The bits in the APM,
from most significant to least significant bit, correspond to
AP adapter numbers 0 to 255. When an adapter is assigned, the
bit corresponding to the adapter ID will be set in the APM. Likewise,
when an adapter is unassigned, the bit corresponding to the
adapter ID will be cleared from the APM.

The relevant sysfs structures are:

/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. assign_adapter
.................. unassign_adapter

To assign an adapter to the $uuid mediated matrix device's APM,
write the adapter ID (APID) to the assign_adapter file. To
unassign an adapter, write the APID to the unassign_adapter
file. The APID is specified using conventional semantics: If
it begins with 0x the number will be parsed as a hexadecimal
(case insensitive) number; if it begins with 0 it will be parsed
as an octal number; otherwise, it will be parsed as a decimal
number.
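
The parsing rule can be sketched in userspace with strtoul() and a
base of 0, which selects the radix from the prefix in the same way.
This is only an illustration: parse_ap_id() is a hypothetical helper,
not part of the patch, and the driver itself calls kstrtoul(). Note
that with automatic base detection a leading "0" without an "x"
selects octal, so "010" yields 8.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an adapter or domain ID from @buf using
 * automatic base detection and reject values greater than @max_id.
 * Returns 0 and stores the value in @id on success, -EINVAL otherwise.
 */
static int parse_ap_id(const char *buf, unsigned long max_id,
		       unsigned long *id)
{
	char *end;
	unsigned long val;

	assert(buf && id);

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: 0x => hex, 0 => octal */
	if (errno || end == buf)
		return -EINVAL;

	/* Accept a single trailing newline, as written by echo. */
	if (*end == '\n')
		end++;
	if (*end != '\0' || val > max_id)
		return -EINVAL;

	*id = val;
	return 0;
}
```

Under these assumptions, "173", "0xad" and "0XAD" all yield adapter
173, while "256" is rejected when the maximum ID is 255.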

For example, to assign adapter 173 (0xad) to the mediated matrix
device $uuid:

echo 173 > assign_adapter

or

echo 0xad > assign_adapter

To unassign adapter 173 (0xad):

echo 173 > unassign_adapter

or

echo 0xad > unassign_adapter

The assignment will be rejected:

* If the APID exceeds the maximum value for an AP adapter:
* If the AP Extended Addressing (APXA) facility is
installed, the max value is 255
* Else the max value is 63

* If no AP domains have yet been assigned and there are
no AP queues bound to the VFIO AP driver that have an APQN
with an APID matching that of the AP adapter being assigned.

* If any of the APQNs that can be derived from the intersection
of the APID being assigned and the AP queue index (APQI) of
each of the AP domains previously assigned can not be matched
with an APQN of an AP queue device reserved by the VFIO AP
driver.
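
The last two rules can be modeled in userspace as follows. This is a
simplified sketch, not the driver code: struct apqn, apqn_reserved()
and adapter_assignable() are hypothetical names, and plain arrays
stand in for the driver's device list and the AQM bit mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical representation of an AP queue reserved by vfio_ap. */
struct apqn {
	unsigned int apid;	/* adapter ID */
	unsigned int apqi;	/* queue index (usage domain) */
};

/* Return true if (@apid, @apqi) is in the list of reserved queues. */
static bool apqn_reserved(const struct apqn *reserved, size_t n_reserved,
			  unsigned int apid, unsigned int apqi)
{
	size_t i;

	for (i = 0; i < n_reserved; i++)
		if (reserved[i].apid == apid && reserved[i].apqi == apqi)
			return true;
	return false;
}

/*
 * Return true if adapter @apid may be assigned, given the queues
 * reserved by the driver and the domains already assigned.
 */
static bool adapter_assignable(const struct apqn *reserved, size_t n_reserved,
			       const unsigned int *domains, size_t n_domains,
			       unsigned int apid)
{
	size_t i;

	assert(reserved || n_reserved == 0);
	assert(domains || n_domains == 0);

	/* No domains yet: one reserved queue on the adapter suffices. */
	if (n_domains == 0) {
		for (i = 0; i < n_reserved; i++)
			if (reserved[i].apid == apid)
				return true;
		return false;
	}

	/* Otherwise every (apid, domain) pair must be reserved. */
	for (i = 0; i < n_domains; i++)
		if (!apqn_reserved(reserved, n_reserved, apid, domains[i]))
			return false;
	return true;
}
```

For example, if queues 05.0004, 05.00ab, 06.0004 and 06.00ab are
reserved and domains 4 and 0xab are already assigned, adapters 5 and 6
can both be assigned. If domain 0x47 were assigned instead, neither
adapter could be assigned because no queue with that index is
reserved.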

Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 296 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 1 +
3 files changed, 298 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 268f3b2..2052329 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -15,6 +15,7 @@
#include <linux/bitops.h>

#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
+#define KVM_AP_MAX_APM_INDEX(matrix) (matrix->apm_max - 1)

/**
* The AP matrix is comprised of three bit masks identifying the adapters,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 4fda44e..90512a6 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -97,9 +97,305 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
NULL,
};

+struct vfio_apid_reserved {
+ unsigned long apid;
+ int reserved;
+};
+
+struct vfio_ap_qid_match {
+ qid_t qid;
+ struct device *dev;
+};
+
+/**
+ * vfio_ap_queue_match
+ *
+ * @dev: an AP queue device that has been reserved by the VFIO AP device
+ * driver
+ * @data: an AP queue identifier
+ *
+ * Records @dev in @data if the AP queue identifier in @data matches that of
+ * @dev. Always returns 0 so that iteration over the devices continues.
+ */
+static int vfio_ap_queue_match(struct device *dev, void *data)
+{
+ struct vfio_ap_qid_match *qid_match = data;
+ struct ap_queue *ap_queue;
+
+ ap_queue = to_ap_queue(dev);
+ if (ap_queue->qid == qid_match->qid)
+ qid_match->dev = dev;
+
+ return 0;
+}
+
+/**
+ * vfio_ap_validate_queues_for_apid
+ *
+ * @ap_matrix: the matrix device
+ * @matrix_mdev: the mediated matrix device
+ * @apid: an AP adapter ID (APID)
+ *
+ * Verifies that each APQN that is derived from the intersection of @apid and
+ * each AP queue index (APQI) corresponding to an AP domain assigned to the
+ * @matrix_mdev matches the APQN of an AP queue reserved by the VFIO AP device
+ * driver.
+ *
+ * Returns 0 if validation succeeds; otherwise, returns an error.
+ */
+static int vfio_ap_validate_queues_for_apid(struct ap_matrix *ap_matrix,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ int ret;
+ struct vfio_ap_qid_match qid_match;
+ unsigned long apqi;
+ struct device_driver *drv = ap_matrix->device.driver;
+
+ /**
+ * Examine each APQN with the specified APID
+ */
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
+ matrix_mdev->matrix->aqm_max) {
+ qid_match.qid = AP_MKQID(apid, apqi);
+ qid_match.dev = NULL;
+
+ ret = driver_for_each_device(drv, NULL, &qid_match,
+ vfio_ap_queue_match);
+ if (ret)
+ return ret;
+
+ /*
+ * If the APQN identifies an AP queue that is reserved by the
+ * VFIO AP device driver, continue processing.
+ */
+ if (qid_match.dev)
+ continue;
+
+ pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver",
+ VFIO_AP_MATRIX_MODULE_NAME, apid, apqi,
+ VFIO_AP_DRV_NAME);
+
+ return -ENXIO;
+ }
+
+ return 0;
+}
+
+struct vfio_ap_apid_reserved {
+ unsigned long apid;
+ bool reserved;
+};
+
+/**
+ * vfio_ap_queue_id_contains_apid
+ *
+ * @dev: an AP queue device
+ * @data: an AP adapter ID (APID)
+ *
+ * Marks the APID in @data as reserved if it is contained in the AP queue's
+ * (@dev) identifier. Always returns 0 so that iteration continues.
+ */
+static int vfio_ap_queue_id_contains_apid(struct device *dev, void *data)
+{
+ struct vfio_ap_apid_reserved *apid_res = data;
+ struct ap_queue *ap_queue = to_ap_queue(dev);
+
+ if (apid_res->apid == AP_QID_CARD(ap_queue->qid))
+ apid_res->reserved = true;
+
+ return 0;
+}
+
+/**
+ * vfio_ap_verify_apid_reserved
+ *
+ * @ap_matrix: the AP matrix configured for the mediated matrix device
+ * @apid: the AP adapter ID
+ *
+ * Verifies that at least one AP queue reserved by the VFIO AP device driver
+ * has an APQN containing @apid.
+ *
+ * Returns 0 if the APID is reserved; otherwise, returns -ENODEV.
+ */
+static int vfio_ap_verify_apid_reserved(struct ap_matrix *ap_matrix,
+ unsigned long apid)
+{
+ int ret;
+ struct vfio_ap_apid_reserved apid_res = { .apid = apid };
+
+ /* .reserved starts out false; it is set by the match callback */
+
+ ret = driver_for_each_device(ap_matrix->device.driver, NULL,
+ &apid_res,
+ vfio_ap_queue_id_contains_apid);
+ if (ret)
+ return ret;
+
+ if (apid_res.reserved)
+ return 0;
+
+ pr_err("%s: no APQNs with adapter ID %02lx are reserved by %s driver",
+ VFIO_AP_MATRIX_MODULE_NAME, apid, VFIO_AP_DRV_NAME);
+
+ return -ENODEV;
+}
+
+/**
+ * vfio_ap_validate_apid
+ *
+ * @mdev: the mediated device
+ * @matrix_mdev: the mediated matrix device
+ * @apid: the APID to validate
+ *
+ * Validates the value of @apid:
+ * * If there are no AP domains assigned, then there must be at least
+ * one AP queue device reserved by the VFIO AP device driver with an
+ * APQN containing @apid.
+ *
+ * * Else each APQN that can be derived from the intersection of @apid and
+ * the IDs of the AP domains already assigned must identify an AP queue
+ * that has been reserved by the VFIO AP device driver.
+ *
+ * Returns 0 if the value of @apid is valid; otherwise, returns an error.
+ */
+static int vfio_ap_validate_apid(struct mdev_device *mdev,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ int ret;
+ struct device *dev = mdev_parent_dev(mdev);
+ struct ap_matrix *ap_matrix = to_ap_matrix(dev);
+ unsigned long apqi;
+
+ apqi = find_first_bit_inv(matrix_mdev->matrix->aqm,
+ matrix_mdev->matrix->aqm_max);
+ if (apqi == matrix_mdev->matrix->aqm_max) {
+ ret = vfio_ap_verify_apid_reserved(ap_matrix, apid);
+ } else {
+ ret = vfio_ap_validate_queues_for_apid(ap_matrix, matrix_mdev,
+ apid);
+ }
+
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+/**
+ * assign_adapter_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the adapter ID (APID) to be assigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the APID from @buf and assigns it to the mediated matrix device. The
+ * APID must be a valid value:
+ * * The APID value must not exceed the maximum allowable AP adapter ID
+ *
+ * * If there are no AP domains assigned, then there must be at least
+ * one AP queue device reserved by the VFIO AP device driver with an
+ * APQN containing @apid.
+ *
+ * * Else each APQN that can be derived from the intersection of @apid and
+ * the IDs of the AP domains already assigned must identify an AP queue
+ * that has been reserved by the VFIO AP device driver.
+ *
+ * Returns the number of bytes processed if the APID is valid; otherwise returns
+ * an error.
+ */
+static ssize_t assign_adapter_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apid;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_APM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apid);
+ if (ret || (apid > maxid)) {
+ pr_err("%s: adapter id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MATRIX_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ ret = vfio_ap_validate_apid(mdev, matrix_mdev, apid);
+ if (ret)
+ return ret;
+
+ /* Set the bit in the AP mask (APM) corresponding to the AP adapter
+ * number (APID). The bits in the mask, from most significant to least
+ * significant bit, correspond to APIDs 0-255.
+ */
+ set_bit_inv(apid, matrix_mdev->matrix->apm);
+
+ return count;
+}
+static DEVICE_ATTR_WO(assign_adapter);
+
+/**
+ * unassign_adapter_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the adapter ID (APID) to be unassigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the APID from @buf and unassigns it from the mediated matrix device.
+ * The APID must not exceed the maximum allowable AP adapter ID.
+ *
+ * Returns the number of bytes processed if the APID is valid; otherwise returns
+ * an error.
+ */
+static ssize_t unassign_adapter_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apid;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_APM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apid);
+ if (ret || (apid > maxid)) {
+ pr_err("%s: adapter id '%s' must be a value from 0 to %02d(%#04x)",
+ VFIO_AP_MATRIX_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ clear_bit_inv((unsigned long)apid,
+ (unsigned long *)matrix_mdev->matrix->apm);
+
+ return count;
+}
+DEVICE_ATTR_WO(unassign_adapter);
+
+static struct attribute *vfio_ap_mdev_attrs[] = {
+ &dev_attr_assign_adapter.attr,
+ &dev_attr_unassign_adapter.attr,
+ NULL
+};
+
+static struct attribute_group vfio_ap_mdev_attr_group = {
+ .attrs = vfio_ap_mdev_attrs
+};
+
+static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
+ &vfio_ap_mdev_attr_group,
+ NULL
+};
+
static const struct mdev_parent_ops vfio_ap_matrix_ops = {
.owner = THIS_MODULE,
.supported_type_groups = vfio_ap_mdev_type_groups,
+ .mdev_attr_groups = vfio_ap_mdev_attr_groups,
.create = vfio_ap_mdev_create,
.remove = vfio_ap_mdev_remove,
};
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index a388b66..f6e7ed1 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -11,6 +11,7 @@
#include <linux/types.h>
#include <linux/device.h>
#include <linux/mdev.h>
+#include <asm/kvm-ap.h>

#include "ap_bus.h"

--
1.7.1


2018-03-14 18:30:43

by Tony Krowiak

Subject: [PATCH v3 09/14] s390: vfio-ap: sysfs interfaces to configure domains

Provides the sysfs interfaces for assigning AP domains to
and unassigning AP domains from a mediated matrix device.

An AP domain ID corresponds to an AP queue index (APQI). For
each domain assigned to the mediated matrix device, its
corresponding APQI is stored in an AP queue mask (AQM).
The bits in the AQM, from most significant to least
significant bit, correspond to AP domain numbers 0 to 255.
When a domain is assigned, the bit corresponding to its
APQI will be set in the AQM. Likewise, when a domain is
unassigned, the bit corresponding to its APQI will be
cleared from the AQM.
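
The most-significant-bit-first numbering described above can be
illustrated with a small user-space sketch. This is not the kernel's
set_bit_inv()/clear_bit_inv() implementation; the byte-array layout and
the helper names (aqm_assign, aqm_unassign, aqm_test) are assumptions
made purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AQM_BITS 256 /* domains 0-255, as in the description above */

/* Hypothetical user-space stand-in for the AP queue mask (AQM). */
struct aqm {
	uint8_t bytes[AQM_BITS / 8];
};

/* Domain 0 maps to the most significant bit of byte 0. */
static uint8_t aqm_bit(unsigned int apqi)
{
	return (uint8_t)(0x80u >> (apqi % 8));
}

/* Assigning a domain sets its bit in the mask. */
static void aqm_assign(struct aqm *m, unsigned int apqi)
{
	assert(apqi < AQM_BITS);
	m->bytes[apqi / 8] |= aqm_bit(apqi);
}

/* Unassigning a domain clears its bit in the mask. */
static void aqm_unassign(struct aqm *m, unsigned int apqi)
{
	assert(apqi < AQM_BITS);
	m->bytes[apqi / 8] &= (uint8_t)~aqm_bit(apqi);
}

static bool aqm_test(const struct aqm *m, unsigned int apqi)
{
	assert(apqi < AQM_BITS);
	return (m->bytes[apqi / 8] & aqm_bit(apqi)) != 0;
}
```

In this model, assigning domain 173 (0xad) sets the bit with value
0x04 in byte 21 of the mask.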

The relevant sysfs structures are:

/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. assign_domain
.................. unassign_domain

To assign a domain to the $uuid mediated matrix device,
write the domain's ID to the assign_domain file. To
unassign a domain, write the domain's ID to the
unassign_domain file. The ID is parsed with base auto-detection:
if it begins with 0x, it is parsed as a hexadecimal (case
insensitive) number; if it begins with 0, as an octal number;
otherwise, as a decimal number.
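
The parsing rule can be approximated in user space as follows. The
patch itself calls kstrtoul() with base 0; this sketch substitutes
strtoul() with base 0, which also treats a leading 0 as octal. The
function name parse_ap_id and its max_id parameter are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical user-space approximation of how an ID written to
 * assign_domain is parsed: base 0 selects hexadecimal for a "0x" prefix,
 * octal for a leading "0", and decimal otherwise. One trailing newline
 * (as written by echo) is tolerated.
 *
 * Returns 0 and stores the ID in *id, or a negative errno value.
 */
static int parse_ap_id(const char *buf, unsigned long max_id,
		       unsigned long *id)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && id != NULL);

	/* strtoul() would accept whitespace and a sign; reject them here. */
	if (buf[0] < '0' || buf[0] > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* The maximum is 255 with APXA installed, 15 otherwise. */
	if (val > max_id)
		return -EINVAL;

	*id = val;
	return 0;
}
```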

For example, to assign domain 173 (0xad) to the mediated matrix
device $uuid:

echo 173 > assign_domain

or

echo 0xad > assign_domain

To unassign domain 173 (0xad):

echo 173 > unassign_domain

or

echo 0xad > unassign_domain

The assignment will be rejected:

* If the domain ID exceeds the maximum value for an AP domain:

* If the AP Extended Addressing (APXA) facility is installed,
the max value is 255

* Else the max value is 15

* If no AP adapters have yet been assigned and there are
no AP queues reserved by the VFIO AP driver that have an APQN
with an APQI matching that of the AP domain number being
assigned.

* If any APQN formed by combining the APQI being assigned with
the AP adapter ID (APID) of a previously assigned AP adapter
does not match the APQN of an AP queue device reserved by the
VFIO AP driver.
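
The last two rejection rules can be modelled with the following
sketch. It uses plain arrays instead of the driver's device list and
bit masks; struct apqn, queue_reserved and validate_domain are
hypothetical names, and the range check against the maximum domain ID
is left out. The error codes mirror those returned by the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an AP queue reserved by the VFIO AP driver. */
struct apqn {
	unsigned int apid; /* adapter ID */
	unsigned int apqi; /* domain ID (queue index) */
};

static bool queue_reserved(const struct apqn *reserved, size_t nr_reserved,
			   unsigned int apid, unsigned int apqi)
{
	for (size_t i = 0; i < nr_reserved; i++)
		if (reserved[i].apid == apid && reserved[i].apqi == apqi)
			return true;
	return false;
}

/*
 * Returns 0 if @apqi may be assigned, -ENODEV if no adapters are assigned
 * and no reserved queue uses @apqi, or -ENXIO if an (adapter, @apqi) pair
 * is not reserved.
 */
static int validate_domain(const struct apqn *reserved, size_t nr_reserved,
			   const unsigned int *apids, size_t nr_apids,
			   unsigned int apqi)
{
	assert(reserved != NULL || nr_reserved == 0);

	/* No adapters assigned yet: any reserved queue with @apqi suffices. */
	if (nr_apids == 0) {
		for (size_t i = 0; i < nr_reserved; i++)
			if (reserved[i].apqi == apqi)
				return 0;
		return -ENODEV;
	}

	/* Otherwise every (assigned adapter, @apqi) pair must be reserved. */
	for (size_t i = 0; i < nr_apids; i++)
		if (!queue_reserved(reserved, nr_reserved, apids[i], apqi))
			return -ENXIO;

	return 0;
}
```

With queues (4,1), (4,2) and (3,1) reserved and adapters 3 and 4
already assigned, domain 1 is accepted while domain 2 is rejected,
because queue (3,2) is not reserved.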

Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 215 ++++++++++++++++++++++++++++++++++++-
2 files changed, 215 insertions(+), 1 deletions(-)

diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 2052329..8ec42e7 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -16,6 +16,7 @@

#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
#define KVM_AP_MAX_APM_INDEX(matrix) (matrix->apm_max - 1)
+#define KVM_AP_MAX_AQM_INDEX(matrix) (matrix->aqm_max - 1)

/**
* The AP matrix is comprised of three bit masks identifying the adapters,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 90512a6..c448835 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -377,10 +377,223 @@ static ssize_t unassign_adapter_store(struct device *dev,
}
DEVICE_ATTR_WO(unassign_adapter);

+/**
+ * vfio_ap_validate_queues_for_apqi
+ *
+ * @ap_matrix: the matrix device
+ * @matrix_mdev: the mediated matrix device
+ * @apqi: an AP queue index (APQI) - corresponds to a domain ID
+ *
+ * Verifies that each APQN that is derived from the combination of @apqi
+ * and the AP adapter ID (APID) of each AP adapter assigned to the
+ * @matrix_mdev matches the APQN of an AP queue reserved by the VFIO AP
+ * device driver.
+ *
+ * Returns 0 if validation succeeds; otherwise, returns an error.
+ */
+static int vfio_ap_validate_queues_for_apqi(struct ap_matrix *ap_matrix,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ int ret;
+ struct vfio_ap_qid_match qid_match;
+ unsigned long apid;
+ struct device_driver *drv = ap_matrix->device.driver;
+
+ /**
+ * Examine each APQN with the specified APQI
+ */
+ for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
+ matrix_mdev->matrix->apm_max) {
+ qid_match.qid = AP_MKQID(apid, apqi);
+ qid_match.dev = NULL;
+
+ ret = driver_for_each_device(drv, NULL, &qid_match,
+ vfio_ap_queue_match);
+ if (ret)
+ return ret;
+
+ /*
+ * If the APQN identifies an AP queue that is reserved by the
+ * VFIO AP device driver, continue processing.
+ */
+ if (qid_match.dev)
+ continue;
+
+ pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver",
+ VFIO_AP_MATRIX_MODULE_NAME, apid, apqi,
+ VFIO_AP_DRV_NAME);
+
+ return -ENXIO;
+ }
+
+ return 0;
+}
+
+struct vfio_ap_apqi_reserved {
+ unsigned long apqi;
+ bool reserved;
+};
+
+/**
+ * vfio_ap_queue_id_contains_apqi
+ *
+ * @dev: an AP queue device
+ * @data: a struct vfio_ap_apqi_reserved holding the APQI to look for
+ *
+ * Sets the reserved flag in @data if the APQI is contained in the AP queue's
+ * identifier. Always returns 0 so that iteration over the devices continues.
+ */
+static int vfio_ap_queue_id_contains_apqi(struct device *dev, void *data)
+{
+ struct vfio_ap_apqi_reserved *apqi_res = data;
+ struct ap_queue *ap_queue = to_ap_queue(dev);
+
+ if (apqi_res->apqi == AP_QID_QUEUE(ap_queue->qid))
+ apqi_res->reserved = true;
+
+ return 0;
+}
+
+/**
+ * vfio_ap_verify_apqi_reserved
+ *
+ * @ap_matrix: the AP matrix configured for the mediated matrix device
+ * @apqi: the AP queue index (APQI) - corresponds to domain ID
+ *
+ * Verifies that at least one AP queue reserved by the VFIO AP device driver
+ * has an APQN containing @apqi.
+ *
+ * Returns 0 if the APQI is reserved; otherwise, returns -ENODEV.
+ */
+static int vfio_ap_verify_apqi_reserved(struct ap_matrix *ap_matrix,
+ unsigned long apqi)
+{
+ int ret;
+ struct vfio_ap_apqi_reserved apqi_res = { .reserved = false };
+
+ apqi_res.apqi = apqi;
+
+ ret = driver_for_each_device(ap_matrix->device.driver, NULL,
+ &apqi_res,
+ vfio_ap_queue_id_contains_apqi);
+ if (ret)
+ return ret;
+
+ if (apqi_res.reserved)
+ return 0;
+
+ pr_err("%s: no APQNs with domain ID %02lx are reserved by %s driver",
+ VFIO_AP_MATRIX_MODULE_NAME, apqi, VFIO_AP_DRV_NAME);
+
+ return -ENODEV;
+}
+
+/**
+ * vfio_ap_validate_apqi
+ *
+ * @matrix_mdev: the mediated matrix device
+ * @apqi: the APQI (domain ID) to validate
+ *
+ * Validates the value of @apqi:
+ * * If there are no AP adapters assigned, then there must be at least
+ * one AP queue device reserved by the VFIO AP device driver with an
+ * APQN containing @apqi.
+ *
+ * * Else each APQN that can be derived from the intersection of @apqi and
+ * the IDs of the AP adapters already assigned must identify an AP queue
+ * that has been reserved by the VFIO AP device driver.
+ *
+ * Returns 0 if the value of @apqi is valid; otherwise, returns an error.
+ */
+static int vfio_ap_validate_apqi(struct mdev_device *mdev,
+ struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ int ret;
+ struct device *dev = mdev_parent_dev(mdev);
+ struct ap_matrix *ap_matrix = to_ap_matrix(dev);
+ unsigned long apid;
+
+ apid = find_first_bit_inv(matrix_mdev->matrix->apm,
+ matrix_mdev->matrix->apm_max);
+ /* If there are no adapters assigned */
+ if (apid == matrix_mdev->matrix->apm_max) {
+ ret = vfio_ap_verify_apqi_reserved(ap_matrix, apqi);
+ } else {
+ ret = vfio_ap_validate_queues_for_apqi(ap_matrix, matrix_mdev,
+ apqi);
+ }
+
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static ssize_t assign_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apqi;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_AQM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apqi);
+ if (ret || (apqi > maxid)) {
+ pr_err("%s: domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MATRIX_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ ret = vfio_ap_validate_apqi(mdev, matrix_mdev, apqi);
+ if (ret)
+ return ret;
+
+ /* Set the bit in the AQM (bitmask) corresponding to the AP domain
+ * number (APQI). The bits in the mask, from most significant to least
+ * significant, correspond to numbers 0-255.
+ */
+ set_bit_inv(apqi, matrix_mdev->matrix->aqm);
+
+ return count;
+}
+DEVICE_ATTR_WO(assign_domain);
+
+static ssize_t unassign_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apqi;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_AQM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apqi);
+ if (ret || (apqi > maxid)) {
+ pr_err("%s: domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MATRIX_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ clear_bit_inv((unsigned long)apqi,
+ (unsigned long *)matrix_mdev->matrix->aqm);
+
+ return count;
+}
+DEVICE_ATTR_WO(unassign_domain);
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
- NULL
+ &dev_attr_assign_domain.attr,
+ &dev_attr_unassign_domain.attr,
+ NULL,
};

static struct attribute_group vfio_ap_mdev_attr_group = {
--
1.7.1


2018-03-14 18:31:06

by Tony Krowiak

Subject: [PATCH v3 11/14] s390: vfio-ap: sysfs interface to view matrix mdev matrix

Provides a sysfs interface to view the AP matrix configured for the
mediated matrix device.

The relevant sysfs structures are:

/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. matrix

To view the matrix configured for the mediated matrix device,
print the matrix file:

cat matrix
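
The layout of the output can be reproduced with the sketch below. It
follows the format strings used by matrix_show() in this patch, but
the array-based inputs, the function name format_matrix and the
explicit bounds check on the buffer are assumptions of the sketch and
are not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Formats the adapter/domain cross product in the layout produced by the
 * matrix attribute. Returns the number of characters written, or -1 if
 * @len is too small to hold the complete output.
 */
static int format_matrix(char *buf, size_t len,
			 const unsigned long *apids, size_t nr_apids,
			 const unsigned long *apqis, size_t nr_apqis)
{
	size_t pos = 0;
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "ADAPTER.DOMAIN\n--------------\n");
	if (n < 0 || (size_t)n >= len)
		return -1;
	pos += (size_t)n;

	for (size_t i = 0; i < nr_apids; i++) {
		/* One line for the adapter itself ... */
		n = snprintf(buf + pos, len - pos, "%02lx\n", apids[i]);
		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += (size_t)n;

		/* ... followed by one line per adapter.domain pair. */
		for (size_t j = 0; j < nr_apqis; j++) {
			n = snprintf(buf + pos, len - pos, "%02lx.%04lx\n",
				     apids[i], apqis[j]);
			if (n < 0 || (size_t)n >= len - pos)
				return -1;
			pos += (size_t)n;
		}
	}

	return (int)pos;
}
```

For adapter 4 with domains 1 and 2, the body of the output consists of
the lines 04, 04.0001 and 04.0002 below the two header lines.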

Signed-off-by: Tony Krowiak <[email protected]>
---
drivers/s390/crypto/vfio_ap_ops.c | 39 +++++++++++++++++++++++++++++++++++++
1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 461d450..04f7a92 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -692,6 +692,44 @@ static ssize_t control_domains_show(struct device *dev,
}
DEVICE_ATTR_RO(control_domains);

+static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ char *bufpos = buf;
+ unsigned long apid;
+ unsigned long apqi;
+ int nchars = 0;
+ int n;
+
+ n = sprintf(bufpos, "ADAPTER.DOMAIN\n");
+ bufpos += n;
+ nchars += n;
+
+ n = sprintf(bufpos, "--------------\n");
+ bufpos += n;
+ nchars += n;
+
+ for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
+ matrix_mdev->matrix->apm_max) {
+ n = sprintf(bufpos, "%02lx\n", apid);
+ bufpos += n;
+ nchars += n;
+
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
+ matrix_mdev->matrix->aqm_max) {
+ n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi);
+ bufpos += n;
+ nchars += n;
+ }
+ }
+
+ return nchars;
+}
+DEVICE_ATTR_RO(matrix);
+
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
@@ -700,6 +738,7 @@ static ssize_t control_domains_show(struct device *dev,
&dev_attr_assign_control_domain.attr,
&dev_attr_unassign_control_domain.attr,
&dev_attr_control_domains.attr,
+ &dev_attr_matrix.attr,
NULL,
};

--
1.7.1


2018-03-14 18:31:08

by Tony Krowiak

Subject: [PATCH v3 10/14] s390: vfio-ap: sysfs interfaces to configure control domains

Provides the sysfs interfaces for assigning AP control domains
to and unassigning AP control domains from a mediated matrix device.

The IDs of the AP control domains assigned to the mediated matrix
device are stored in an AP domain mask (ADM). The bits in the ADM,
from most significant to least significant bit, correspond to
AP domain numbers 0 to 255. When a control domain is assigned,
the bit corresponding to its domain ID will be set in the ADM.
Likewise, when a domain is unassigned, the bit corresponding
to its domain ID will be cleared in the ADM.

The relevant sysfs structures are:

/sys/devices/vfio_ap
... [matrix]
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
...............[$uuid]
.................. assign_control_domain
.................. unassign_control_domain

To assign a control domain to the $uuid mediated matrix device's
ADM, write its domain number to the assign_control_domain file.
To unassign a domain, write its domain number to the
unassign_control_domain file. The domain number is parsed with
base auto-detection: if it begins with 0x, it is parsed as a
hexadecimal (case insensitive) number; if it begins with 0, as
an octal number; otherwise, as a decimal number.

For example, to assign control domain 173 (0xad) to the mediated
matrix device $uuid:

echo 173 > assign_control_domain

or

echo 0xad > assign_control_domain

To unassign control domain 173 (0xad):

echo 173 > unassign_control_domain

or

echo 0xad > unassign_control_domain

The assignment will be rejected if the APQI exceeds the maximum
value for an AP domain:
* If the AP Extended Addressing (APXA) facility is installed,
the max value is 255
* Else the max value is 15

Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm-ap.h | 1 +
drivers/s390/crypto/vfio_ap_ops.c | 107 +++++++++++++++++++++++++++++++++++++
2 files changed, 108 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
index 8ec42e7..679e026 100644
--- a/arch/s390/include/asm/kvm-ap.h
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -17,6 +17,7 @@
#define KVM_AP_MASK_BYTES(n) DIV_ROUND_UP(n, BITS_PER_BYTE)
#define KVM_AP_MAX_APM_INDEX(matrix) (matrix->apm_max - 1)
#define KVM_AP_MAX_AQM_INDEX(matrix) (matrix->aqm_max - 1)
+#define KVM_AP_MAX_ADM_INDEX(matrix) (matrix->adm_max - 1)

/**
* The AP matrix is comprised of three bit masks identifying the adapters,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index c448835..461d450 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -588,11 +588,118 @@ static ssize_t unassign_domain_store(struct device *dev,
}
DEVICE_ATTR_WO(unassign_domain);

+
+/**
+ * assign_control_domain_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the control domain ID to be assigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the domain ID from @buf and assigns it to the mediated matrix device.
+ *
+ * Returns the number of bytes processed if the domain ID is valid; otherwise
+ * returns an error.
+ */
+static ssize_t assign_control_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long id;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_ADM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &id);
+ if (ret || (id > maxid)) {
+ pr_err("%s: control domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MATRIX_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ /* Set the bit in the ADM (bitmask) corresponding to the AP control
+ * domain number (id). The bits in the mask, from most significant to
+ * least significant, correspond to IDs 0 up to one less than the
+ * number of control domains that can be assigned.
+ */
+ set_bit_inv(id, matrix_mdev->matrix->adm);
+
+ return count;
+}
+DEVICE_ATTR_WO(assign_control_domain);
+
+/**
+ * unassign_control_domain_store
+ *
+ * @dev: the matrix device
+ * @attr: a mediated matrix device attribute
+ * @buf: a buffer containing the control domain ID to be unassigned
+ * @count: the number of bytes in @buf
+ *
+ * Parses the domain ID from @buf and unassigns it from the mediated matrix
+ * device.
+ *
+ * Returns the number of bytes processed if the domain ID is valid; otherwise
+ * returns an error.
+ */
+static ssize_t unassign_control_domain_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ unsigned long apqi;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+ int maxid = KVM_AP_MAX_ADM_INDEX(matrix_mdev->matrix);
+
+ ret = kstrtoul(buf, 0, &apqi);
+ if (ret || (apqi > maxid)) {
+ pr_err("%s: control domain id '%s' not a value from 0 to %02d(%#04x)",
+ VFIO_AP_MATRIX_MODULE_NAME, buf, maxid, maxid);
+
+ return ret ? ret : -EINVAL;
+ }
+
+ clear_bit_inv((unsigned long)apqi,
+ (unsigned long *)matrix_mdev->matrix->adm);
+
+ return count;
+}
+DEVICE_ATTR_WO(unassign_control_domain);
+
+static ssize_t control_domains_show(struct device *dev,
+ struct device_attribute *dev_attr,
+ char *buf)
+{
+ unsigned long id;
+ int nchars = 0;
+ int n;
+ char *bufpos = buf;
+ struct mdev_device *mdev = mdev_from_dev(dev);
+ struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
+
+ for_each_set_bit_inv(id, matrix_mdev->matrix->adm,
+ matrix_mdev->matrix->adm_max) {
+ n = sprintf(bufpos, "%04lx\n", id);
+ bufpos += n;
+ nchars += n;
+ }
+
+ return nchars;
+}
+DEVICE_ATTR_RO(control_domains);
+
static struct attribute *vfio_ap_mdev_attrs[] = {
&dev_attr_assign_adapter.attr,
&dev_attr_unassign_adapter.attr,
&dev_attr_assign_domain.attr,
&dev_attr_unassign_domain.attr,
+ &dev_attr_assign_control_domain.attr,
+ &dev_attr_unassign_control_domain.attr,
+ &dev_attr_control_domains.attr,
NULL,
};

--
1.7.1


2018-03-14 18:31:32

by Tony Krowiak

Subject: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

The VFIO AP device model exploits interpretive execution of AP
instructions (APIE) to provide guests passthrough access to AP
devices. This patch introduces a new device attribute in the
KVM_S390_VM_CRYPTO device attribute group to set APIE from
the VFIO AP device defined on the guest.

Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/kvm-s390.c | 19 +++++++++++++++++++
3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 98957c2..bbac5a1 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -699,6 +699,7 @@ struct kvm_s390_crypto {
__u32 crycbd;
__u8 aes_kw;
__u8 dea_kw;
+ __u8 apie;
};

#define APCB0_MASK_SIZE 1
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index a580dec..fdcbeb9 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -161,6 +161,7 @@ struct kvm_s390_vm_cpu_subfunc {
#define KVM_S390_VM_CRYPTO_ENABLE_DEA_KW 1
#define KVM_S390_VM_CRYPTO_DISABLE_AES_KW 2
#define KVM_S390_VM_CRYPTO_DISABLE_DEA_KW 3
+#define KVM_S390_VM_CRYPTO_INTERPRET_AP 4

/* kvm attributes for migration mode */
#define KVM_S390_VM_MIGRATION_STOP 0
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a60c45b..bc46b67 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
break;
+ case KVM_S390_VM_CRYPTO_INTERPRET_AP:
+ if (attr->addr) {
+ if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP)) {
+ mutex_unlock(&kvm->lock);
+ return -EOPNOTSUPP;
+ }
+ kvm->arch.crypto.apie = 1;
+ VM_EVENT(kvm, 3, "%s", "ENABLE: AP interpretive execution");
+ } else {
+ kvm->arch.crypto.apie = 0;
+ VM_EVENT(kvm, 3, "%s", "DISABLE: AP interpretive execution");
+ }
+ break;
default:
mutex_unlock(&kvm->lock);
return -ENXIO;
@@ -1453,6 +1466,7 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_CRYPTO_ENABLE_DEA_KW:
case KVM_S390_VM_CRYPTO_DISABLE_AES_KW:
case KVM_S390_VM_CRYPTO_DISABLE_DEA_KW:
+ case KVM_S390_VM_CRYPTO_INTERPRET_AP:
ret = 0;
break;
default:
@@ -2409,6 +2423,11 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
{
vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;

+ if (vcpu->kvm->arch.crypto.apie)
+ vcpu->arch.sie_block->eca |= ECA_APIE;
+ else
+ vcpu->arch.sie_block->eca &= ~ECA_APIE;
+
if (!test_kvm_facility(vcpu->kvm, 76))
return;

--
1.7.1


2018-03-14 18:32:10

by Tony Krowiak

Subject: [PATCH v3 03/14] KVM: s390: CPU model support for AP virtualization

Introduces a new CPU model feature and two CPU model
facilities to support AP virtualization for KVM guests.

CPU model feature:

The KVM_S390_VM_CPU_FEAT_AP feature indicates that
AP instructions are available on the guest. This
feature will be enabled by the kernel only if the AP
instructions are installed on the Linux host. This feature
must be specifically turned on for the KVM guest from
userspace to use the VFIO AP device driver for guest
access to AP devices.

CPU model facilities:

1. AP Query Configuration Information (QCI) facility is installed.

This is indicated by setting facilities bit 12 for
the guest. The kernel will not enable this facility
for the guest if it is not set on the host. This facility
must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
feature is not installed.

2. AP Facilities Test facility (APFT) is installed.

This is indicated by setting facilities bit 15 for
the guest. The kernel will not enable this facility for
the guest if it is not set on the host. This facility
must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
feature is not installed.
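
As a rough illustration of the two facility bits, the sketch below
tests bits 12 and 15 in a facility list in which bit 0 is the most
significant bit of the first byte. The helper names and the combined
policy check are hypothetical; in the patch set the kernel only
restricts the facilities to those of the host, and honouring the
dependency on KVM_S390_VM_CPU_FEAT_AP is left to userspace:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAC_AP_QCI  12 /* AP Query Configuration Information */
#define FAC_AP_APFT 15 /* AP Facilities Test */

/* Facility bits are numbered from the most significant bit of byte 0. */
static bool facility_set(const uint8_t *list, size_t len, unsigned int nr)
{
	assert(list != NULL);

	if (nr / 8 >= len)
		return false;
	return (list[nr / 8] & (0x80u >> (nr % 8))) != 0;
}

/*
 * Hypothetical policy check: a facility is offered to the guest only if
 * the host has it and the AP feature has been enabled for the guest.
 */
static bool guest_may_use_facility(const uint8_t *host_list, size_t len,
				   unsigned int nr, bool guest_has_ap_feat)
{
	return guest_has_ap_feat && facility_set(host_list, len, nr);
}
```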

Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Halil Pasic <[email protected]>
Signed-off-by: Tony Krowiak <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/kvm-s390.c | 4 ++++
arch/s390/tools/gen_facilities.c | 2 ++
4 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 65a944e..98957c2 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -186,6 +186,7 @@ struct kvm_s390_sie_block {
#define ECA_AIV 0x00200000
#define ECA_VX 0x00020000
#define ECA_PROTEXCI 0x00002000
+#define ECA_APIE 0x00000008
#define ECA_SII 0x00000001
__u32 eca; /* 0x004c */
#define ICPT_INST 0x04
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 4cdaa55..a580dec 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
#define KVM_S390_VM_CPU_FEAT_PFMFI 11
#define KVM_S390_VM_CPU_FEAT_SIGPIF 12
#define KVM_S390_VM_CPU_FEAT_KSS 13
+#define KVM_S390_VM_CPU_FEAT_AP 14
struct kvm_s390_vm_cpu_feat {
__u64 feat[16];
};
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c47731d..a60c45b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -350,6 +350,10 @@ static void kvm_s390_cpu_feat_init(void)

if (MACHINE_HAS_ESOP)
allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
+
+ if (ap_instructions_installed()) /* AP instructions installed on host */
+ allow_cpu_feat(KVM_S390_VM_CPU_FEAT_AP);
+
/*
* We need SIE support, ESOP (PROT_READ protection for gmap_shadow),
* 64bit SCAO (SCA passthrough) and IDTE (for gmap_shadow unshadowing).
diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
index 83ffefc..a8ae59a 100644
--- a/arch/s390/tools/gen_facilities.c
+++ b/arch/s390/tools/gen_facilities.c
@@ -107,6 +107,8 @@ struct facility_def {

.name = "FACILITIES_KVM_CPUMODEL",
.bits = (int[]){
+ 12, /* AP Query Configuration Information */
+ 15, /* AP Facilities Test */
-1 /* END */
}
},
--
1.7.1


2018-03-14 18:32:20

by Tony Krowiak

Subject: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

This patch refactors the code that initializes the crypto
configuration for a guest. The crypto configuration is contained in
a crypto control block (CRYCB) which is a satellite control block to
our main hardware virtualization control block. The CRYCB is
attached to the main virtualization control block via a CRYCB
designation (CRYCBD) field containing the address of the CRYCB
as well as its format.

Prior to the introduction of AP device virtualization, there was
no need to provide access to or specify the format of the CRYCB for
a guest unless the MSA extension 3 (MSAX3) facility was installed
on the host system. With the introduction of AP device virtualization,
the CRYCB and its format must be made accessible to the guest
regardless of the presence of the MSAX3 facility.

The crypto initialization code is restructured as follows:

* A new compilation unit is introduced to contain all interfaces
and data structures related to configuring a guest's CRYCB for
both the refactoring of crypto initialization as well as all
subsequent patches introducing AP virtualization support.

* Currently, the asm code for querying the AP configuration is
duplicated in the AP bus as well as in KVM. Since the KVM
code was introduced, the AP bus has externalized the interface
for querying the AP configuration. The KVM interface will be
replaced with a call to the AP bus interface. Of course, this
will be moved to the new compilation unit mentioned above.

* An interface to format the CRYCBD field will be provided via
the new compilation unit and called from the KVM vm
initialization.
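
The way the CRYCBD combines the CRYCB address and format can be
sketched in user space as follows. The constants are taken from this
patch; the function build_crycbd and its boolean parameters are
hypothetical stand-ins for the facility tests performed by
kvm_ap_build_crycbd(), and the alignment assertion is an addition of
this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CRYCB_FORMAT1     0x00000001u
#define CRYCB_FORMAT2     0x00000003u
#define CRYCB_FORMAT_MASK 0x00000003u

/*
 * Builds a CRYCBD value from a CRYCB address: the low-order bits hold the
 * format, the remaining bits hold the address.
 */
static uint32_t build_crycbd(uint32_t crycb_addr, bool msax3, bool apxa)
{
	uint32_t crycbd;

	/* A misaligned address would lose its low bits to the format field. */
	assert((crycb_addr & CRYCB_FORMAT_MASK) == 0);
	crycbd = crycb_addr & ~CRYCB_FORMAT_MASK;

	/* As in the patch, a format is only set if MSAX3 is installed. */
	if (msax3)
		crycbd |= apxa ? CRYCB_FORMAT2 : CRYCB_FORMAT1;

	return crycbd;
}
```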

Signed-off-by: Tony Krowiak <[email protected]>
---
MAINTAINERS | 10 ++++++
arch/s390/include/asm/kvm-ap.h | 16 ++++++++++
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/kvm/Kconfig | 1 +
arch/s390/kvm/Makefile | 2 +-
arch/s390/kvm/kvm-ap.c | 48 +++++++++++++++++++++++++++++
arch/s390/kvm/kvm-s390.c | 61 ++++---------------------------------
7 files changed, 84 insertions(+), 55 deletions(-)
create mode 100644 arch/s390/include/asm/kvm-ap.h
create mode 100644 arch/s390/kvm/kvm-ap.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 0ec5881..72742d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11875,6 +11875,16 @@ W: http://www.ibm.com/developerworks/linux/linux390/
S: Supported
F: drivers/s390/crypto/

+S390 VFIO AP DRIVER
+M: Tony Krowiak <[email protected]>
+M: Christian Borntraeger <[email protected]>
+M: Martin Schwidefsky <[email protected]>
+L: [email protected]
+W: http://www.ibm.com/developerworks/linux/linux390/
+S: Supported
+F: arch/s390/include/asm/kvm-ap.h
+F: arch/s390/kvm/kvm-ap.c
+
S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
M: Benjamin Block <[email protected]>
diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
new file mode 100644
index 0000000..362846c
--- /dev/null
+++ b/arch/s390/include/asm/kvm-ap.h
@@ -0,0 +1,16 @@
+/*
+ * Adjunct Processor (AP) configuration management for KVM guests
+ *
+ * Copyright IBM Corp. 2017
+ *
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#ifndef _ASM_KVM_AP
+#define _ASM_KVM_AP
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+
+void kvm_ap_build_crycbd(struct kvm *kvm);
+
+#endif /* _ASM_KVM_AP */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 34c9b5b..65a944e 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
__u8 reservedf0[12]; /* 0x00f0 */
#define CRYCB_FORMAT1 0x00000001
#define CRYCB_FORMAT2 0x00000003
+#define CRYCB_FORMAT_MASK 0x00000003
__u32 crycbd; /* 0x00fc */
__u64 gcr[16]; /* 0x0100 */
__u64 gbea; /* 0x0180 */
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index a3dbd45..4ca9077 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -33,6 +33,7 @@ config KVM
select HAVE_KVM_INVALID_WAKEUPS
select SRCU
select KVM_VFIO
+ select ZCRYPT
---help---
Support hosting paravirtualized guest machines using the SIE
virtualization capability on the mainframe. This should work
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 05ee90a..1876bfe 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o $(KVM)/irqch
ccflags-y := -Ivirt/kvm -Iarch/s390/kvm

kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
-kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
+kvm-objs += diag.o gaccess.o guestdbg.o vsie.o kvm-ap.o

obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
new file mode 100644
index 0000000..a2c6ad2
--- /dev/null
+++ b/arch/s390/kvm/kvm-ap.c
@@ -0,0 +1,48 @@
+/*
+ * Adjunct Processor (AP) configuration management for KVM guests
+ *
+ * Copyright IBM Corp. 2017
+ *
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#include <asm/kvm-ap.h>
+#include <asm/ap.h>
+
+#include "kvm-s390.h"
+
+static int kvm_ap_apxa_installed(void)
+{
+ int ret;
+ struct ap_config_info config;
+
+ ret = ap_query_configuration(&config);
+ if (ret)
+ return 0;
+
+ return (config.apxa == 1);
+}
+
+/**
+ * kvm_ap_build_crycbd
+ *
+ * The crypto control block designation (CRYCBD) is a 32-bit field that
+ * designates both the host real address and format of the CRYCB. This function
+ * builds the CRYCBD field for use by the KVM guest.
+ *
+ * @kvm: the KVM guest
+ * The result is stored in kvm->arch.crypto.crycbd.
+ */
+void kvm_ap_build_crycbd(struct kvm *kvm)
+{
+ kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
+ kvm->arch.crypto.crycbd &= ~(CRYCB_FORMAT_MASK);
+
+ /* check whether MSAX3 is installed */
+ if (test_kvm_facility(kvm, 76)) {
+ if (kvm_ap_apxa_installed())
+ kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
+ else
+ kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
+ }
+}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 23c4767..c47731d 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -40,6 +40,8 @@
#include <asm/sclp.h>
#include <asm/cpacf.h>
#include <asm/timex.h>
+#include <asm/ap.h>
+#include <asm/kvm-ap.h>
#include "kvm-s390.h"
#include "gaccess.h"

@@ -1856,55 +1858,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
return r;
}

-static int kvm_s390_query_ap_config(u8 *config)
-{
- u32 fcn_code = 0x04000000UL;
- u32 cc = 0;
-
- memset(config, 0, 128);
- asm volatile(
- "lgr 0,%1\n"
- "lgr 2,%2\n"
- ".long 0xb2af0000\n" /* PQAP(QCI) */
- "0: ipm %0\n"
- "srl %0,28\n"
- "1:\n"
- EX_TABLE(0b, 1b)
- : "+r" (cc)
- : "r" (fcn_code), "r" (config)
- : "cc", "0", "2", "memory"
- );
-
- return cc;
-}
-
-static int kvm_s390_apxa_installed(void)
-{
- u8 config[128];
- int cc;
-
- if (test_facility(12)) {
- cc = kvm_s390_query_ap_config(config);
-
- if (cc)
- pr_err("PQAP(QCI) failed with cc=%d", cc);
- else
- return config[0] & 0x40;
- }
-
- return 0;
-}
-
-static void kvm_s390_set_crycb_format(struct kvm *kvm)
-{
- kvm->arch.crypto.crycbd = (__u32)(unsigned long) kvm->arch.crypto.crycb;
-
- if (kvm_s390_apxa_installed())
- kvm->arch.crypto.crycbd |= CRYCB_FORMAT2;
- else
- kvm->arch.crypto.crycbd |= CRYCB_FORMAT1;
-}
-
static u64 kvm_s390_get_initial_cpuid(void)
{
struct cpuid cpuid;
@@ -1916,12 +1869,12 @@ static u64 kvm_s390_get_initial_cpuid(void)

static void kvm_s390_crypto_init(struct kvm *kvm)
{
+ kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
+ kvm_ap_build_crycbd(kvm);
+
if (!test_kvm_facility(kvm, 76))
return;

- kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
- kvm_s390_set_crycb_format(kvm);
-
/* Enable AES/DEA protected key functions by default */
kvm->arch.crypto.aes_kw = 1;
kvm->arch.crypto.dea_kw = 1;
@@ -2450,6 +2403,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)

static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
{
+ vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
+
if (!test_kvm_facility(vcpu->kvm, 76))
return;

@@ -2459,8 +2414,6 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->ecb3 |= ECB3_AES;
if (vcpu->kvm->arch.crypto.dea_kw)
vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
-
- vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
}

void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu)
--
1.7.1
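
As an illustration of the CRYCBD layout handled by kvm_ap_build_crycbd()
in the patch above, here is a minimal userspace sketch. It assumes the
CRYCB address fits in 32 bits and is aligned so that its two low-order
bits are free to carry the format. model_build_crycbd() and its boolean
parameters are hypothetical stand-ins for test_kvm_facility(kvm, 76) and
the APXA query; they are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the kvm_host.h hunk in this patch. */
#define CRYCB_FORMAT1     0x00000001u
#define CRYCB_FORMAT2     0x00000003u
#define CRYCB_FORMAT_MASK 0x00000003u

/*
 * Hypothetical userspace model of the CRYCBD construction:
 * the CRYCB address goes in the upper bits and the format in the
 * two low-order bits. Without MSAX3 the format bits stay zero.
 */
static uint32_t model_build_crycbd(uint32_t crycb_addr, bool msax3, bool apxa)
{
	/* Clear the format bits before setting the designation. */
	uint32_t crycbd = crycb_addr & ~CRYCB_FORMAT_MASK;

	if (msax3)
		crycbd |= apxa ? CRYCB_FORMAT2 : CRYCB_FORMAT1;

	return crycbd;
}
```

Masking first means a stale format value can never leak into the new
designation, which mirrors the reason CRYCB_FORMAT_MASK is introduced.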


2018-03-14 18:32:56

by Tony Krowiak

Subject: [PATCH v3 02/14] s390: zcrypt: externalize AP instructions available function

If the AP instructions are not available on the Linux host, then
AP devices cannot be interpreted by the SIE. The AP bus has a
function it uses to determine if the AP instructions are
available. This patch provides a new function that wraps the
AP bus's function to externalize it for use by KVM.
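
The wrapper described above can be sketched in userspace as follows.
This is only a model: model_host_has_ap and the model_* functions are
hypothetical names, and the assumption that the AP bus function returns
0 when the instructions are available and -ENODEV otherwise is inferred
from the "== 0" comparison in the diff below.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host's hardware state. */
static bool model_host_has_ap = true;

/* Models the AP bus function: 0 on success, negative errno otherwise. */
static int model_ap_instructions_available(void)
{
	return model_host_has_ap ? 0 : -ENODEV;
}

/* Models the exported wrapper: turns the errno-style result into 1/0. */
static int model_ap_instructions_installed(void)
{
	return model_ap_instructions_available() == 0;
}
```

The point of the wrapper is that callers such as KVM get a plain
boolean and do not need to know the AP bus's return convention.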

Signed-off-by: Tony Krowiak <[email protected]>
Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Harald Freudenberger <[email protected]>
---
arch/s390/include/asm/ap.h | 7 +++++++
drivers/s390/crypto/ap_bus.c | 6 ++++++
2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/ap.h b/arch/s390/include/asm/ap.h
index cfce683..1df6b59 100644
--- a/arch/s390/include/asm/ap.h
+++ b/arch/s390/include/asm/ap.h
@@ -120,4 +120,11 @@ struct ap_queue_status ap_queue_irq_ctrl(ap_qid_t qid,
struct ap_qirq_ctrl qirqctrl,
void *ind);

+/**
+ * ap_instructions_installed() - Tests whether AP instructions are installed
+ *
+ * Returns 1 if the AP instructions are installed, otherwise returns 0
+ */
+int ap_instructions_installed(void);
+
#endif /* _ASM_S390_AP_H_ */
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 48d55dc..089b1cf 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -211,6 +211,12 @@ int ap_query_configuration(struct ap_config_info *info)
}
EXPORT_SYMBOL(ap_query_configuration);

+int ap_instructions_installed(void)
+{
+ return (ap_instructions_available() == 0);
+}
+EXPORT_SYMBOL(ap_instructions_installed);
+
/**
* ap_init_configuration(): Allocate and query configuration array.
*/
--
1.7.1


2018-03-14 18:33:28

by Tony Krowiak

Subject: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

Introduces a new AP device driver. This device driver
is built on the VFIO mediated device framework. The framework
provides sysfs interfaces that facilitate passthrough
access by guests to devices installed on the Linux host.

The VFIO AP device driver will serve two purposes:

1. Provide the interfaces to reserve AP devices for exclusive
use by KVM guests. This is accomplished by unbinding the
devices to be reserved for guest usage from the default AP
device driver and binding them to the VFIO AP device driver.

2. Implement the functions, callbacks and sysfs attribute
interfaces required to create one or more VFIO mediated
devices each of which will be used to configure the AP
matrix for a guest and serve as a file descriptor
for facilitating communication between QEMU and the
VFIO AP device driver.

When the VFIO AP device driver is initialized:

* It registers with the AP bus for control of type 10 (CEX4
and newer) AP queue devices. The probe and remove callbacks
will be provided to support the binding/unbinding of
AP queue devices to/from the VFIO AP device driver.

* It creates a /sys/devices/vfio_ap/matrix device to hold
the APQNs of the AP devices bound to the VFIO
AP device driver and to serve as the parent of the
mediated devices created for each guest.
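
To illustrate the "type 10 and newer" registration, here is a small
userspace model of matching a queue's device type against a
zero-terminated ID table. The model_* names are hypothetical, and the
numeric values for CEX5 and CEX6 (11 and 12) are assumptions; only
type 10 for CEX4 is stated above.

```c
#include <stdbool.h>

/* Device type numbers; 11 and 12 are assumed values for CEX5/CEX6. */
enum {
	MODEL_TYPE_CEX4 = 10,
	MODEL_TYPE_CEX5 = 11,
	MODEL_TYPE_CEX6 = 12,
};

struct model_ap_id {
	unsigned int dev_type;
};

/* Zero-terminated table, in the style of the driver's ID table. */
static const struct model_ap_id model_ids[] = {
	{ MODEL_TYPE_CEX4 },
	{ MODEL_TYPE_CEX5 },
	{ MODEL_TYPE_CEX6 },
	{ 0 },	/* terminator */
};

/* Returns true if a queue of the given type would match the table. */
static bool model_queue_matches(unsigned int dev_type)
{
	const struct model_ap_id *id;

	for (id = model_ids; id->dev_type; id++)
		if (id->dev_type == dev_type)
			return true;

	return false;
}
```

An explicit table, rather than a "type >= 10" comparison, means a
future adapter type is only claimed once an entry is added for it.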

Signed-off-by: Tony Krowiak <[email protected]>
---
MAINTAINERS | 2 +
arch/s390/Kconfig | 11 +++
drivers/s390/crypto/Makefile | 4 +
drivers/s390/crypto/vfio_ap_drv.c | 135 +++++++++++++++++++++++++++++++++
drivers/s390/crypto/vfio_ap_private.h | 22 ++++++
include/uapi/linux/vfio.h | 2 +
6 files changed, 176 insertions(+), 0 deletions(-)
create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
create mode 100644 drivers/s390/crypto/vfio_ap_private.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 72742d5..f129253 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11884,6 +11884,8 @@ W: http://www.ibm.com/developerworks/linux/linux390/
S: Supported
F: arch/s390/include/asm/kvm/kvm-ap.h
F: arch/s390/kvm/kvm-ap.c
+F: drivers/s390/crypto/vfio_ap_drv.c
+F: drivers/s390/crypto/vfio_ap_private.h

S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index cbe1d97..58509db 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -771,6 +771,17 @@ config VFIO_CCW
To compile this driver as a module, choose M here: the
module will be called vfio_ccw.

+config VFIO_AP
+ def_tristate m
+ prompt "VFIO support for AP devices"
+ depends on ZCRYPT && VFIO_MDEV_DEVICE
+ help
+ This driver grants access to Adjunct Processor (AP) devices
+ via the VFIO mediated device interface.
+
+ To compile this driver as a module, choose M here: the module
+ will be called vfio_ap.
+
endmenu

menu "Dump support"
diff --git a/drivers/s390/crypto/Makefile b/drivers/s390/crypto/Makefile
index b59af54..48e466e 100644
--- a/drivers/s390/crypto/Makefile
+++ b/drivers/s390/crypto/Makefile
@@ -15,3 +15,7 @@ obj-$(CONFIG_ZCRYPT) += zcrypt_pcixcc.o zcrypt_cex2a.o zcrypt_cex4.o
# pkey kernel module
pkey-objs := pkey_api.o
obj-$(CONFIG_PKEY) += pkey.o
+
+# adjunct processor matrix
+vfio_ap-objs := vfio_ap_drv.o
+obj-$(CONFIG_VFIO_AP) += vfio_ap.o
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
new file mode 100644
index 0000000..459e595
--- /dev/null
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -0,0 +1,135 @@
+/*
+ * VFIO based AP device driver
+ *
+ * Copyright IBM Corp. 2017
+ *
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/slab.h>
+
+#include "vfio_ap_private.h"
+
+#define VFIO_AP_ROOT_NAME "vfio_ap"
+#define VFIO_AP_DEV_TYPE_NAME "ap_matrix"
+#define VFIO_AP_DEV_NAME "matrix"
+
+MODULE_AUTHOR("IBM Corporation");
+MODULE_DESCRIPTION("VFIO AP device driver, Copyright IBM Corp. 2017");
+MODULE_LICENSE("GPL v2");
+
+static struct device *vfio_ap_root_device;
+
+static struct ap_driver vfio_ap_drv;
+
+static struct ap_matrix *ap_matrix;
+
+static struct device_type vfio_ap_dev_type = {
+ .name = VFIO_AP_DEV_TYPE_NAME,
+};
+
+/* Only type 10 adapters (CEX4 and later) are supported
+ * by the AP matrix device driver
+ */
+static struct ap_device_id ap_queue_ids[] = {
+ { .dev_type = AP_DEVICE_TYPE_CEX4,
+ .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
+ { .dev_type = AP_DEVICE_TYPE_CEX5,
+ .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
+ { .dev_type = AP_DEVICE_TYPE_CEX6,
+ .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
+ { /* end of sibling */ },
+};
+
+MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
+
+static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
+{
+ return 0;
+}
+
+static void vfio_ap_matrix_dev_release(struct device *dev)
+{
+ struct ap_matrix *ap_matrix = dev_get_drvdata(dev);
+
+ kfree(ap_matrix);
+}
+
+static int vfio_ap_matrix_dev_create(void)
+{
+ int ret;
+
+ vfio_ap_root_device = root_device_register(VFIO_AP_ROOT_NAME);
+
+ ret = IS_ERR(vfio_ap_root_device);
+ if (ret) {
+ ret = PTR_ERR(vfio_ap_root_device);
+ goto done;
+ }
+
+ ap_matrix = kzalloc(sizeof(*ap_matrix), GFP_KERNEL);
+ if (!ap_matrix) {
+ ret = -ENOMEM;
+ goto matrix_alloc_err;
+ }
+
+ ap_matrix->device.type = &vfio_ap_dev_type;
+ dev_set_name(&ap_matrix->device, "%s", VFIO_AP_DEV_NAME);
+ ap_matrix->device.parent = vfio_ap_root_device;
+ ap_matrix->device.release = vfio_ap_matrix_dev_release;
+ ap_matrix->device.driver = &vfio_ap_drv.driver;
+
+ ret = device_register(&ap_matrix->device);
+ if (ret)
+ goto matrix_reg_err;
+
+ goto done;
+
+matrix_reg_err:
+ put_device(&ap_matrix->device);
+ kfree(ap_matrix);
+
+matrix_alloc_err:
+ root_device_unregister(vfio_ap_root_device);
+
+done:
+ return ret;
+}
+
+static void vfio_ap_matrix_dev_destroy(struct ap_matrix *ap_matrix)
+{
+ device_unregister(&ap_matrix->device);
+ root_device_unregister(vfio_ap_root_device);
+}
+
+int __init vfio_ap_init(void)
+{
+ int ret;
+
+ ret = vfio_ap_matrix_dev_create();
+ if (ret)
+ return ret;
+
+ memset(&vfio_ap_drv, 0, sizeof(vfio_ap_drv));
+ vfio_ap_drv.probe = vfio_ap_queue_dev_probe;
+ vfio_ap_drv.ids = ap_queue_ids;
+
+ ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME);
+ if (ret) {
+ vfio_ap_matrix_dev_destroy(ap_matrix);
+ return ret;
+ }
+
+ return 0;
+}
+
+void __exit vfio_ap_exit(void)
+{
+ ap_driver_unregister(&vfio_ap_drv);
+ vfio_ap_matrix_dev_destroy(ap_matrix);
+}
+
+module_init(vfio_ap_init);
+module_exit(vfio_ap_exit);
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
new file mode 100644
index 0000000..21f3697
--- /dev/null
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -0,0 +1,22 @@
+/*
+ * Private data and functions for adjunct processor VFIO matrix driver.
+ *
+ * Copyright IBM Corp. 2017
+ * Author(s): Tony Krowiak <[email protected]>
+ */
+
+#ifndef _VFIO_AP_PRIVATE_H_
+#define _VFIO_AP_PRIVATE_H_
+
+#include <linux/types.h>
+
+#include "ap_bus.h"
+
+#define VFIO_AP_MODULE_NAME "vfio_ap"
+#define VFIO_AP_DRV_NAME "vfio_ap"
+
+struct ap_matrix {
+ struct device device;
+};
+
+#endif /* _VFIO_AP_PRIVATE_H_ */
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index e3301db..cf2a5e9 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -200,6 +200,7 @@ struct vfio_device_info {
#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2) /* vfio-platform device */
#define VFIO_DEVICE_FLAGS_AMBA (1 << 3) /* vfio-amba device */
#define VFIO_DEVICE_FLAGS_CCW (1 << 4) /* vfio-ccw device */
+#define VFIO_DEVICE_FLAGS_AP (1 << 5) /* vfio-ap device */
__u32 num_regions; /* Max region index + 1 */
__u32 num_irqs; /* Max IRQ index + 1 */
};
@@ -215,6 +216,7 @@ struct vfio_device_info {
#define VFIO_DEVICE_API_PLATFORM_STRING "vfio-platform"
#define VFIO_DEVICE_API_AMBA_STRING "vfio-amba"
#define VFIO_DEVICE_API_CCW_STRING "vfio-ccw"
+#define VFIO_DEVICE_API_AP_STRING "vfio-ap"

/**
* VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
--
1.7.1


2018-03-14 21:58:45

by Halil Pasic

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution



On 03/14/2018 07:25 PM, Tony Krowiak wrote:
> The VFIO AP device model exploits interpretive execution of AP
> instructions (APIE) to provide guests passthrough access to AP
> devices. This patch introduces a new device attribute in the
> KVM_S390_VM_CRYPTO device attribute group to set APIE from
> the VFIO AP device defined on the guest.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---

[..]

> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index a60c45b..bc46b67 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
> break;
> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
> + if (attr->addr) {
> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))

Unlock mutex before returning?

Maybe flip conditions (don't allow manipulating apie if feature not there).
Clearing apie anyway when the feature is not there isn't too bad, but
rejecting the operation appears nicer to me.
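
A minimal userspace sketch of that shape, assuming a single exit path
that always drops the lock. struct model_kvm, its lock_held flag (a
stand-in for kvm->lock) and model_set_interpret_ap() are hypothetical
and only model the control flow, not the real KVM code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the relevant VM state. */
struct model_kvm {
	bool lock_held;		/* stands in for kvm->lock */
	bool has_ap_feat;	/* stands in for the AP CPU feature test */
	int apie;
};

/*
 * Check the feature first, for both enable and disable, and leave
 * through one label so the lock is released on every path.
 */
static int model_set_interpret_ap(struct model_kvm *kvm, bool enable)
{
	int ret = 0;

	kvm->lock_held = true;		/* mutex_lock() in the real code */

	if (!kvm->has_ap_feat) {
		ret = -EOPNOTSUPP;
		goto out;
	}

	kvm->apie = enable ? 1 : 0;
out:
	kvm->lock_held = false;		/* mutex_unlock() in the real code */
	return ret;
}
```

With the check hoisted, apie is never touched when the feature is
missing, and the early return can no longer leave the lock held.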

> + return -EOPNOTSUPP;
> + kvm->arch.crypto.apie = 1;
> + VM_EVENT(kvm, 3, "%s",
> + "ENABLE: AP interpretive execution");
> + } else {
> + kvm->arch.crypto.apie = 0;
> + VM_EVENT(kvm, 3, "%s",
> + "DISABLE: AP interpretive execution");
> + }
> + break;
> default:
> mutex_unlock(&kvm->lock);
> return -ENXIO;

I wonder how the loop after this switch works for KVM_S390_VM_CRYPTO_INTERPRET_AP:

kvm_for_each_vcpu(i, vcpu, kvm) {
kvm_s390_vcpu_crypto_setup(vcpu);
exit_sie(vcpu);
}

Since nothing like the following is done for KVM_S390_VM_CRYPTO_INTERPRET_AP:

	if (kvm->created_vcpus) {
		mutex_unlock(&kvm->lock);
		return -EBUSY;
	}

and given the aforementioned loop, I guess ECA.28 can be changed
for a running guest.

If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
changed (set), these will be taken out of SIE by exit_sie(). Then, for the
corresponding threads, control probably returns to QEMU (the emulator in
userspace), which puts that vcpu back into SIE. That vcpu then starts
acting according to the new ECA.28 value, while other vcpus
may still be working with the old value of ECA.28.

I'm not saying that what I describe above is necessarily broken.
But I would like to have it explained why it is OK, provided I did not
make any errors in my reasoning (assumptions included).

Can you help me understand this code?

Regards,
Halil

[..]


2018-03-15 09:43:54

by Pierre Morel

Subject: Re: [PATCH v3 11/14] s390: vfio-ap: sysfs interface to view matrix mdev matrix

On 14/03/2018 19:25, Tony Krowiak wrote:
> Provides a sysfs interface to view the AP matrix configured for the
> mediated matrix device.
>
> The relevant sysfs structures are:
>
> /sys/devices/vfio_ap
> ... [matrix]
> ...... [mdev_supported_types]
> ......... [vfio_ap-passthrough]
> ............ [devices]
> ...............[$uuid]
> .................. matrix
>
> To view the matrix configured for the mediated matrix device,
> print the matrix file:
>
> cat matrix
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 39 +++++++++++++++++++++++++++++++++++++
> 1 files changed, 39 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 461d450..04f7a92 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -692,6 +692,44 @@ static ssize_t control_domains_show(struct device *dev,
> }
> DEVICE_ATTR_RO(control_domains);
>
> +static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct mdev_device *mdev = mdev_from_dev(dev);
> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> + char *bufpos = buf;
> + unsigned long apid;
> + unsigned long apqi;
> + int nchars = 0;
> + int n;
> +
> + n = sprintf(bufpos, "ADAPTER.DOMAIN\n");

For easy parsing, it is better to report only the interesting data
and let a user space utility take care of the fancy presentation.
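
As a rough userspace sketch of what a parse-friendly output could look
like: one "apid.apqi" pair per line and nothing else. The function
model_matrix_show() and its parameters are hypothetical. For simplicity
the masks are plain 64-bit values with bit n (counted from the least
significant bit) standing for adapter or domain n, which is not the bit
order used by the kernel bitmaps.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Writes one "apid.apqi" line per assigned queue, without a header.
 * Returns the number of characters written, or -1 if buf is too small
 * or nbits exceeds the width of the masks.
 */
static int model_matrix_show(char *buf, size_t size, uint64_t apm,
			     uint64_t aqm, unsigned int nbits)
{
	size_t pos = 0;
	unsigned int apid, apqi;

	if (size == 0 || nbits > 64)
		return -1;
	buf[0] = '\0';

	for (apid = 0; apid < nbits; apid++) {
		if (!((apm >> apid) & 1))
			continue;
		for (apqi = 0; apqi < nbits; apqi++) {
			int n;

			if (!((aqm >> apqi) & 1))
				continue;
			/* Bounded write; bail out instead of truncating. */
			n = snprintf(buf + pos, size - pos, "%02x.%04x\n",
				     apid, apqi);
			if (n < 0 || (size_t)n >= size - pos)
				return -1;
			pos += (size_t)n;
		}
	}

	return (int)pos;
}
```

A consumer can then split each line on '.' without having to skip a
title row, a separator row or adapter-only lines.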

> + bufpos += n;
> + nchars += n;
> +
> + n = sprintf(bufpos, "--------------\n");
> + bufpos += n;
> + nchars += n;
> +
> + for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
> + matrix_mdev->matrix->apm_max) {
> + n = sprintf(bufpos, "%02lx\n", apid);
> + bufpos += n;
> + nchars += n;
> +
> + for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
> + matrix_mdev->matrix->aqm_max) {
> + n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi);
> + bufpos += n;
> + nchars += n;
> + }
> + }
> +
> + return nchars;
> +}
> +DEVICE_ATTR_RO(matrix);
> +
> +
> static struct attribute *vfio_ap_mdev_attrs[] = {
> &dev_attr_assign_adapter.attr,
> &dev_attr_unassign_adapter.attr,
> @@ -700,6 +738,7 @@ static ssize_t control_domains_show(struct device *dev,
> &dev_attr_assign_control_domain.attr,
> &dev_attr_unassign_control_domain.attr,
> &dev_attr_control_domains.attr,
> + &dev_attr_matrix.attr,
> NULL,
> };
>

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 12:28:26

by Pierre Morel

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On 14/03/2018 19:25, Tony Krowiak wrote:
> This patch refactors the code that initializes the crypto
> configuration for a guest. The crypto configuration is contained in
> a crypto control block (CRYCB) which is a satellite control block to
> our main hardware virtualization control block. The CRYCB is
> attached to the main virtualization control block via a CRYCB
> designation (CRYCBD) field containing the address of
> the CRYCB as well as its format.
>
> Prior to the introduction of AP device virtualization, there was
> no need to provide access to or specify the format of the CRYCB for
> a guest unless the MSA extension 3 (MSAX3) facility was installed
> on the host system. With the introduction of AP device virtualization,
> the CRYCB and its format must be made accessible to the guest
> regardless of the presence of the MSAX3 facility.
>
> The crypto initialization code is restructured as follows:
>
> * A new compilation unit is introduced to contain all interfaces
> and data structures related to configuring a guest's CRYCB for
> both the refactoring of crypto initialization as well as all
> subsequent patches introducing AP virtualization support.
>
> * Currently, the asm code for querying the AP configuration is
> duplicated in the AP bus as well as in KVM. Since the KVM
> code was introduced, the AP bus has externalized the interface
> for querying the AP configuration. The KVM interface will be
> replaced with a call to the AP bus interface. Of course, this
> will be moved to the new compilation unit mentioned above.
>
> * An interface to format the CRYCBD field will be provided via
> the new compilation unit and called from the KVM vm
> initialization.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> MAINTAINERS | 10 ++++++
> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/kvm/Kconfig | 1 +
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/kvm-ap.c | 48 +++++++++++++++++++++++++++++
> arch/s390/kvm/kvm-s390.c | 61 ++++---------------------------------
> 7 files changed, 84 insertions(+), 55 deletions(-)
> create mode 100644 arch/s390/include/asm/kvm-ap.h
> create mode 100644 arch/s390/kvm/kvm-ap.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0ec5881..72742d5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11875,6 +11875,16 @@ W: http://www.ibm.com/developerworks/linux/linux390/
> S: Supported
> F: drivers/s390/crypto/
>
> +S390 VFIO AP DRIVER
> +M: Tony Krowiak <[email protected]>
> +M: Christian Borntraeger <[email protected]>
> +M: Martin Schwidefsky <[email protected]>
> +L: [email protected]
> +W: http://www.ibm.com/developerworks/linux/linux390/
> +S: Supported
> +F: arch/s390/include/asm/kvm/kvm-ap.h
> +F: arch/s390/kvm/kvm-ap.c
> +
> S390 ZFCP DRIVER
> M: Steffen Maier <[email protected]>
> M: Benjamin Block <[email protected]>
> diff --git a/arch/s390/include/asm/kvm-ap.h b/arch/s390/include/asm/kvm-ap.h
> new file mode 100644
> index 0000000..362846c
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm-ap.h
> @@ -0,0 +1,16 @@
> +/*
> + * Adjunct Processor (AP) configuration management for KVM guests
> + *
> + * Copyright IBM Corp. 2017
> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +
> +#ifndef _ASM_KVM_AP
> +#define _ASM_KVM_AP
> +#include <linux/types.h>
> +#include <linux/kvm_host.h>
> +
> +void kvm_ap_build_crycbd(struct kvm *kvm);
> +
> +#endif /* _ASM_KVM_AP */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 34c9b5b..65a944e 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
> __u8 reservedf0[12]; /* 0x00f0 */
> #define CRYCB_FORMAT1 0x00000001
> #define CRYCB_FORMAT2 0x00000003
> +#define CRYCB_FORMAT_MASK 0x00000003
> __u32 crycbd; /* 0x00fc */
> __u64 gcr[16]; /* 0x0100 */
> __u64 gbea; /* 0x0180 */
> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
> index a3dbd45..4ca9077 100644
> --- a/arch/s390/kvm/Kconfig
> +++ b/arch/s390/kvm/Kconfig
> @@ -33,6 +33,7 @@ config KVM
> select HAVE_KVM_INVALID_WAKEUPS
> select SRCU
> select KVM_VFIO
> + select ZCRYPT

I do not think it is a good solution to *always* enable ZCRYPT
when we have KVM.

Pierre


--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 13:04:32

by Pierre Morel

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 14/03/2018 22:57, Halil Pasic wrote:
>
> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>> The VFIO AP device model exploits interpretive execution of AP
>> instructions (APIE) to provide guests passthrough access to AP
>> devices. This patch introduces a new device attribute in the
>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>> the VFIO AP device defined on the guest.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
> [..]
>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index a60c45b..bc46b67 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>> break;
>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>> + if (attr->addr) {
>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
> Unlock mutex before returning?
>
> Maybe flip conditions (don't allow manipulating apie if feature not there).
> Clearing apie anyway when the feature is not there isn't too bad, but
> rejecting the operation appears nicer to me.
>
>> + return -EOPNOTSUPP;
>> + kvm->arch.crypto.apie = 1;
>> + VM_EVENT(kvm, 3, "%s",
>> + "ENABLE: AP interpretive execution");
>> + } else {
>> + kvm->arch.crypto.apie = 0;
>> + VM_EVENT(kvm, 3, "%s",
>> + "DISABLE: AP interpretive execution");
>> + }
>> + break;
>> default:
>> mutex_unlock(&kvm->lock);
>> return -ENXIO;
> I wonder how the loop after this switch works for KVM_S390_VM_CRYPTO_INTERPRET_AP:
>
> kvm_for_each_vcpu(i, vcpu, kvm) {
> kvm_s390_vcpu_crypto_setup(vcpu);
> exit_sie(vcpu);
> }
>
> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>
> if (kvm->created_vcpus) {
> mutex_unlock(&kvm->lock);
> return -EBUSY;
> and from the aforementioned loop I guess ECA.28 can be changed
> for a running guest.
>
> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
> changed (set) these will be taken out of SIE by exit_sie(). Then for the
> corresponding threads the control probably goes to QEMU (the emulator in
> the userspace). And it puts that vcpu back into the SIE, and then that
> cpu starts acting according to the new ECA.28 value. While other vcpus
> may still work with the old value of ECA.28.
>
> I'm not saying what I describe above is necessarily something broken.
> But I would like to have it explained, why is it OK -- provided I did not
> make any errors in my reasoning (assumptions included).
>
> Can you help me understand this code?
>
> Regards,
> Halil
>
> [..]
>

I have the same concerns as Halil.

We do not need to change the virtualization type
(hardware/software) on the fly for the current use case.

Couldn't we delay this until we have one and, in the meantime, only make the
vCPU hotplug clean?

We only need to leave the door open for the day we have such a use case.

Pierre



--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 13:27:57

by Pierre Morel

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On 14/03/2018 19:25, Tony Krowiak wrote:
> Introduces a new AP device driver. This device driver
> is built on the VFIO mediated device framework. The framework
> provides sysfs interfaces that facilitate passthrough
> access by guests to devices installed on the linux host.
>
> The VFIO AP device driver will serve two purposes:
>
> 1. Provide the interfaces to reserve AP devices for exclusive
> use by KVM guests. This is accomplished by unbinding the
> devices to be reserved for guest usage from the default AP
> device driver and binding them to the VFIO AP device driver.
>
> 2. Implements the functions, callbacks and sysfs attribute
> interfaces required to create one or more VFIO mediated
> devices each of which will be used to configure the AP
> matrix for a guest and serve as a file descriptor
> for facilitating communication between QEMU and the
> VFIO AP device driver.
>
> When the VFIO AP device driver is initialized:
>
> * It registers with the AP bus for control of type 10 (CEX4
> and newer) AP queue devices. The probe and remove callbacks
> will be provided to support the binding/unbinding of
> AP queue devices to/from the VFIO AP device driver.
>
> * Creates a /sys/devices/vfio_ap/matrix device to hold
> the APQNs of the AP devices bound to the VFIO
> AP device driver and serves as the parent of the
> mediated devices created for each guest.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> MAINTAINERS | 2 +
> arch/s390/Kconfig | 11 +++
> drivers/s390/crypto/Makefile | 4 +
> drivers/s390/crypto/vfio_ap_drv.c | 135 +++++++++++++++++++++++++++++++++
> drivers/s390/crypto/vfio_ap_private.h | 22 ++++++
> include/uapi/linux/vfio.h | 2 +
> 6 files changed, 176 insertions(+), 0 deletions(-)
> create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
> create mode 100644 drivers/s390/crypto/vfio_ap_private.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 72742d5..f129253 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11884,6 +11884,8 @@ W: http://www.ibm.com/developerworks/linux/linux390/
> S: Supported
> F: arch/s390/include/asm/kvm/kvm-ap.h
> F: arch/s390/kvm/kvm-ap.c
> +F: drivers/s390/crypto/vfio_ap_drv.c
> +F: drivers/s390/crypto/vfio_ap_private.h
>
> S390 ZFCP DRIVER
> M: Steffen Maier <[email protected]>
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index cbe1d97..58509db 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -771,6 +771,17 @@ config VFIO_CCW
> To compile this driver as a module, choose M here: the
> module will be called vfio_ccw.
>
> +config VFIO_AP
> + def_tristate m
Not sure it must be a module by default.
I would not set it by default.
> + prompt "VFIO support for AP devices"
> + depends on ZCRYPT && VFIO_MDEV_DEVICE

VFIO_MDEV_DEVICE is a general feature *needed* by VFIO_AP
and has no use case on its own. If it is set, it is obviously because some
mediated device driver needs it,
while ZCRYPT is a Z feature which may be set without VFIO_AP.

So you need:

config VFIO_AP
    def_tristate n
    prompt "VFIO support for AP devices"
    depends on ZCRYPT
    select VFIO_MDEV
    select VFIO_MDEV_DEVICE
...


Pierre

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 14:49:54

by Tony Krowiak

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On 03/15/2018 08:26 AM, Pierre Morel wrote:
> On 14/03/2018 19:25, Tony Krowiak wrote:
>> This patch refactors the code that initializes the crypto
>> configuration for a guest. The crypto configuration is contained in
>> a crypto control block (CRYCB) which is a satellite control block to
>> our main hardware virtualization control block. The CRYCB is
>> attached to the main virtualization control block via a CRYCB
>> designation (CRYCBD) field containing the address of
>> the CRYCB as well as its format.
>>
>> Prior to the introduction of AP device virtualization, there was
>> no need to provide access to or specify the format of the CRYCB for
>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>> on the host system. With the introduction of AP device virtualization,
>> the CRYCB and its format must be made accessible to the guest
>> regardless of the presence of the MSAX3 facility.
>>
>> The crypto initialization code is restructured as follows:
>>
>> * A new compilation unit is introduced to contain all interfaces
>> and data structures related to configuring a guest's CRYCB for
>> both the refactoring of crypto initialization as well as all
>> subsequent patches introducing AP virtualization support.
>>
>> * Currently, the asm code for querying the AP configuration is
>> duplicated in the AP bus as well as in KVM. Since the KVM
>> code was introduced, the AP bus has externalized the interface
>> for querying the AP configuration. The KVM interface will be
>> replaced with a call to the AP bus interface. Of course, this
>> will be moved to the new compilation unit mentioned above.
>>
>> * An interface to format the CRYCBD field will be provided via
>> the new compilation unit and called from the KVM vm
>> initialization.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> MAINTAINERS | 10 ++++++
>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++
>> arch/s390/include/asm/kvm_host.h | 1 +
>> arch/s390/kvm/Kconfig | 1 +
>> arch/s390/kvm/Makefile | 2 +-
>> arch/s390/kvm/kvm-ap.c | 48 +++++++++++++++++++++++++++++
>> arch/s390/kvm/kvm-s390.c | 61
>> ++++---------------------------------
>> 7 files changed, 84 insertions(+), 55 deletions(-)
>> create mode 100644 arch/s390/include/asm/kvm-ap.h
>> create mode 100644 arch/s390/kvm/kvm-ap.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 0ec5881..72742d5 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -11875,6 +11875,16 @@ W:
>> http://www.ibm.com/developerworks/linux/linux390/
>> S: Supported
>> F: drivers/s390/crypto/
>>
>> +S390 VFIO AP DRIVER
>> +M: Tony Krowiak <[email protected]>
>> +M: Christian Borntraeger <[email protected]>
>> +M: Martin Schwidefsky <[email protected]>
>> +L: [email protected]
>> +W: http://www.ibm.com/developerworks/linux/linux390/
>> +S: Supported
>> +F: arch/s390/include/asm/kvm/kvm-ap.h
>> +F: arch/s390/kvm/kvm-ap.c
>> +
>> S390 ZFCP DRIVER
>> M: Steffen Maier <[email protected]>
>> M: Benjamin Block <[email protected]>
>> diff --git a/arch/s390/include/asm/kvm-ap.h
>> b/arch/s390/include/asm/kvm-ap.h
>> new file mode 100644
>> index 0000000..362846c
>> --- /dev/null
>> +++ b/arch/s390/include/asm/kvm-ap.h
>> @@ -0,0 +1,16 @@
>> +/*
>> + * Adjunct Processor (AP) configuration management for KVM guests
>> + *
>> + * Copyright IBM Corp. 2017
>> + *
>> + * Author(s): Tony Krowiak <[email protected]>
>> + */
>> +
>> +#ifndef _ASM_KVM_AP
>> +#define _ASM_KVM_AP
>> +#include <linux/types.h>
>> +#include <linux/kvm_host.h>
>> +
>> +void kvm_ap_build_crycbd(struct kvm *kvm);
>> +
>> +#endif /* _ASM_KVM_AP */
>> diff --git a/arch/s390/include/asm/kvm_host.h
>> b/arch/s390/include/asm/kvm_host.h
>> index 34c9b5b..65a944e 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
>> __u8 reservedf0[12]; /* 0x00f0 */
>> #define CRYCB_FORMAT1 0x00000001
>> #define CRYCB_FORMAT2 0x00000003
>> +#define CRYCB_FORMAT_MASK 0x00000003
>> __u32 crycbd; /* 0x00fc */
>> __u64 gcr[16]; /* 0x0100 */
>> __u64 gbea; /* 0x0180 */
>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
>> index a3dbd45..4ca9077 100644
>> --- a/arch/s390/kvm/Kconfig
>> +++ b/arch/s390/kvm/Kconfig
>> @@ -33,6 +33,7 @@ config KVM
>> select HAVE_KVM_INVALID_WAKEUPS
>> select SRCU
>> select KVM_VFIO
>> + select ZCRYPT
>
> I do not think it is a good solution to *always* enable ZCRYPT
> when we have KVM.
If CONFIG_ZCRYPT is not selected, then the kvm_ap_apxa_installed() function
will not compile because it calls a zcrypt interface. How would you suggest
we make sure zcrypt interfaces used in KVM are built if CONFIG_ZCRYPT is not
selected?
>
> Pierre
>
>


2018-03-15 14:53:37

by Tony Krowiak

Subject: Re: [PATCH v3 11/14] s390: vfio-ap: sysfs interface to view matrix mdev matrix

On 03/15/2018 05:42 AM, Pierre Morel wrote:
> On 14/03/2018 19:25, Tony Krowiak wrote:
>> Provides a sysfs interface to view the AP matrix configured for the
>> mediated matrix device.
>>
>> The relevant sysfs structures are:
>>
>> /sys/devices/vfio_ap
>> ... [matrix]
>> ...... [mdev_supported_types]
>> ......... [vfio_ap-passthrough]
>> ............ [devices]
>> ...............[$uuid]
>> .................. matrix
>>
>> To view the matrix configured for the mediated matrix device,
>> print the matrix file:
>>
>> cat matrix
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 39
>> +++++++++++++++++++++++++++++++++++++
>> 1 files changed, 39 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index 461d450..04f7a92 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -692,6 +692,44 @@ static ssize_t control_domains_show(struct
>> device *dev,
>> }
>> DEVICE_ATTR_RO(control_domains);
>>
>> +static ssize_t matrix_show(struct device *dev, struct
>> device_attribute *attr,
>> + char *buf)
>> +{
>> + struct mdev_device *mdev = mdev_from_dev(dev);
>> + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> + char *bufpos = buf;
>> + unsigned long apid;
>> + unsigned long apqi;
>> + int nchars = 0;
>> + int n;
>> +
>> + n = sprintf(bufpos, "ADAPTER.DOMAIN\n");
>
> For easy parsing it is better to only report the interesting data
> and let a user space utility make fancy presentation.
Is that your way of saying take the above line out?
>
>
>> + bufpos += n;
>> + nchars += n;
>> +
>> + n = sprintf(bufpos, "--------------\n");
>> + bufpos += n;
>> + nchars += n;
>> +
>> + for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
>> + matrix_mdev->matrix->apm_max) {
>> + n = sprintf(bufpos, "%02lx\n", apid);
>> + bufpos += n;
>> + nchars += n;
>> +
>> + for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
>> + matrix_mdev->matrix->aqm_max) {
>> + n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi);
>> + bufpos += n;
>> + nchars += n;
>> + }
>> + }
>> +
>> + return nchars;
>> +}
>> +DEVICE_ATTR_RO(matrix);
>> +
>> +
>> static struct attribute *vfio_ap_mdev_attrs[] = {
>> &dev_attr_assign_adapter.attr,
>> &dev_attr_unassign_adapter.attr,
>> @@ -700,6 +738,7 @@ static ssize_t control_domains_show(struct device
>> *dev,
>> &dev_attr_assign_control_domain.attr,
>> &dev_attr_unassign_control_domain.attr,
>> &dev_attr_control_domains.attr,
>> + &dev_attr_matrix.attr,
>> NULL,
>> };
>>
>


2018-03-15 14:57:11

by Pierre Morel

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On 15/03/2018 15:48, Tony Krowiak wrote:
> On 03/15/2018 08:26 AM, Pierre Morel wrote:
>> On 14/03/2018 19:25, Tony Krowiak wrote:
>>> This patch refactors the code that initializes the crypto
>>> configuration for a guest. The crypto configuration is contained in
>>> a crypto control block (CRYCB) which is a satellite control block to
>>> our main hardware virtualization control block. The CRYCB is
>>> attached to the main virtualization control block via a CRYCB
>>> designation (CRYCBD) field containing the address of
>>> the CRYCB as well as its format.
>>>
>>> Prior to the introduction of AP device virtualization, there was
>>> no need to provide access to or specify the format of the CRYCB for
>>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>>> on the host system. With the introduction of AP device virtualization,
>>> the CRYCB and its format must be made accessible to the guest
>>> regardless of the presence of the MSAX3 facility.
>>>
>>> The crypto initialization code is restructured as follows:
>>>
>>> * A new compilation unit is introduced to contain all interfaces
>>>    and data structures related to configuring a guest's CRYCB for
>>>    both the refactoring of crypto initialization as well as all
>>>    subsequent patches introducing AP virtualization support.
>>>
>>> * Currently, the asm code for querying the AP configuration is
>>>    duplicated in the AP bus as well as in KVM. Since the KVM
>>>    code was introduced, the AP bus has externalized the interface
>>>    for querying the AP configuration. The KVM interface will be
>>>    replaced with a call to the AP bus interface. Of course, this
>>>    will be moved to the new compilation unit mentioned above.
>>>
>>> * An interface to format the CRYCBD field will be provided via
>>>    the new compilation unit and called from the KVM vm
>>>    initialization.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>>   MAINTAINERS                      |   10 ++++++
>>>   arch/s390/include/asm/kvm-ap.h   |   16 ++++++++++
>>>   arch/s390/include/asm/kvm_host.h |    1 +
>>>   arch/s390/kvm/Kconfig            |    1 +
>>>   arch/s390/kvm/Makefile           |    2 +-
>>>   arch/s390/kvm/kvm-ap.c           |   48 +++++++++++++++++++++++++++++
>>>   arch/s390/kvm/kvm-s390.c         |   61
>>> ++++---------------------------------
>>>   7 files changed, 84 insertions(+), 55 deletions(-)
>>>   create mode 100644 arch/s390/include/asm/kvm-ap.h
>>>   create mode 100644 arch/s390/kvm/kvm-ap.c
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 0ec5881..72742d5 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -11875,6 +11875,16 @@ W:
>>> http://www.ibm.com/developerworks/linux/linux390/
>>>   S:    Supported
>>>   F:    drivers/s390/crypto/
>>>
>>> +S390 VFIO AP DRIVER
>>> +M:    Tony Krowiak <[email protected]>
>>> +M:    Christian Borntraeger <[email protected]>
>>> +M:    Martin Schwidefsky <[email protected]>
>>> +L:    [email protected]
>>> +W:    http://www.ibm.com/developerworks/linux/linux390/
>>> +S:    Supported
>>> +F:    arch/s390/include/asm/kvm/kvm-ap.h
>>> +F:    arch/s390/kvm/kvm-ap.c
>>> +
>>>   S390 ZFCP DRIVER
>>>   M:    Steffen Maier <[email protected]>
>>>   M:    Benjamin Block <[email protected]>
>>> diff --git a/arch/s390/include/asm/kvm-ap.h
>>> b/arch/s390/include/asm/kvm-ap.h
>>> new file mode 100644
>>> index 0000000..362846c
>>> --- /dev/null
>>> +++ b/arch/s390/include/asm/kvm-ap.h
>>> @@ -0,0 +1,16 @@
>>> +/*
>>> + * Adjunct Processor (AP) configuration management for KVM guests
>>> + *
>>> + * Copyright IBM Corp. 2017
>>> + *
>>> + * Author(s): Tony Krowiak <[email protected]>
>>> + */
>>> +
>>> +#ifndef _ASM_KVM_AP
>>> +#define _ASM_KVM_AP
>>> +#include <linux/types.h>
>>> +#include <linux/kvm_host.h>
>>> +
>>> +void kvm_ap_build_crycbd(struct kvm *kvm);
>>> +
>>> +#endif /* _ASM_KVM_AP */
>>> diff --git a/arch/s390/include/asm/kvm_host.h
>>> b/arch/s390/include/asm/kvm_host.h
>>> index 34c9b5b..65a944e 100644
>>> --- a/arch/s390/include/asm/kvm_host.h
>>> +++ b/arch/s390/include/asm/kvm_host.h
>>> @@ -257,6 +257,7 @@ struct kvm_s390_sie_block {
>>>       __u8    reservedf0[12];        /* 0x00f0 */
>>>   #define CRYCB_FORMAT1 0x00000001
>>>   #define CRYCB_FORMAT2 0x00000003
>>> +#define CRYCB_FORMAT_MASK 0x00000003
>>>       __u32    crycbd;            /* 0x00fc */
>>>       __u64    gcr[16];        /* 0x0100 */
>>>       __u64    gbea;            /* 0x0180 */
>>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
>>> index a3dbd45..4ca9077 100644
>>> --- a/arch/s390/kvm/Kconfig
>>> +++ b/arch/s390/kvm/Kconfig
>>> @@ -33,6 +33,7 @@ config KVM
>>>       select HAVE_KVM_INVALID_WAKEUPS
>>>       select SRCU
>>>       select KVM_VFIO
>>> +    select ZCRYPT
>>
>> I do not think it is a good solution to *always* enable ZCRYPT
>> when we have KVM.
> If CONFIG_ZCRYPT is not selected, then the kvm_ap_apxa_installed() function
> will not compile because it calls a zcrypt interface. How would you suggest
> we make sure zcrypt interfaces used in KVM are built if CONFIG_ZCRYPT is not
> selected?

If zcrypt is not configured, I suppose that the KVM code initializing the
CRYCB has no use, but the function will still be called from KVM.
So I would do something like:

#ifdef CONFIG_ZCRYPT
external definitions.
#else
stubs returning error -ENOZCRYPT (or whatever)
#endif
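
The stub pattern suggested above could be sketched as follows. This is a
standalone illustration, not kernel code: HAVE_ZCRYPT_SKETCH stands in for
the Kconfig symbol, ap_query_sketch() and crypto_init_sketch() are
hypothetical names, and -ENODEV is an assumed error code, since -ENOZCRYPT
does not exist.

```c
#include <errno.h>

/*
 * HAVE_ZCRYPT_SKETCH is a placeholder for the real Kconfig symbol.
 * All function names in this sketch are hypothetical.
 */
#ifdef HAVE_ZCRYPT_SKETCH
/* The real implementation would be provided by the AP bus code. */
int ap_query_sketch(int *apxa_installed);
#else
/* Stub used when the AP bus is not built: report "no device". */
static inline int ap_query_sketch(int *apxa_installed)
{
	*apxa_installed = 0;
	return -ENODEV;
}
#endif

/*
 * The caller compiles either way. If the query fails, it falls back
 * to CRYCB format 1 (assumed here to be the safe default).
 */
static int crypto_init_sketch(void)
{
	int apxa = 0;

	if (ap_query_sketch(&apxa) < 0)
		return 1;
	return apxa ? 2 : 1;
}
```

With this shape the caller needs no #ifdef of its own; only the header
that declares the interface does.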



>>
>> Pierre
>>
>>
>

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 15:24:47

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/14/2018 05:57 PM, Halil Pasic wrote:
>
> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>> The VFIO AP device model exploits interpretive execution of AP
>> instructions (APIE) to provide guests passthrough access to AP
>> devices. This patch introduces a new device attribute in the
>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>> the VFIO AP device defined on the guest.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
> [..]
>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index a60c45b..bc46b67 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>> break;
>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>> + if (attr->addr) {
>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
> Unlock mutex before returning?
The mutex is unlocked prior to return at the end of the function.
>
> Maybe flip conditions (don't allow manipulating apie if feature not there).
> Clearing the anyways clear apie if feature not there ain't too bad, but
> rejecting the operation appears nicer to me.
I think what you're saying is something like this:

if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
return -EOPNOTSUPP;

kvm->arch.crypto.apie = (attr->addr) ? 1 : 0;

I can make arguments for doing this either way, but since the attribute
will most likely only be set by an AP device in userspace, I suppose
it makes sense to allow setting of the attribute if the AP feature is
installed. It certainly makes sense for the dedicated implementation.
>
>> + return -EOPNOTSUPP;
>> + kvm->arch.crypto.apie = 1;
>> + VM_EVENT(kvm, 3, "%s",
>> + "ENABLE: AP interpretive execution");
>> + } else {
>> + kvm->arch.crypto.apie = 0;
>> + VM_EVENT(kvm, 3, "%s",
>> + "DISABLE: AP interpretive execution");
>> + }
>> + break;
>> default:
>> mutex_unlock(&kvm->lock);
>> return -ENXIO;
> I wonder how the loop after this switch works for KVM_S390_VM_CRYPTO_INTERPRET_AP:
>
> kvm_for_each_vcpu(i, vcpu, kvm) {
> kvm_s390_vcpu_crypto_setup(vcpu);
> exit_sie(vcpu);
> }
>
> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>
> if (kvm->created_vcpus) {
> mutex_unlock(&kvm->lock);
> return -EBUSY;
> and from the aforementioned loop I guess ECA.28 can be changed
> for a running guest.
>
> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
> changed (set) these will be taken out of SIE by exit_sie(). Then for the
> corresponding threads the control probably goes to QEMU (the emulator in
> the userspace). And it puts that vcpu back into the SIE, and then that
> cpu starts acting according to the new ECA.28 value. While other vcpus
> may still work with the old value of ECA.28.
Assuming the scenario plays out as you described, why would the other vcpus
be using the old ECA.28 value if the kvm_s390_vcpu_crypto_setup() function
is executed for each of them to set the new value for ECA.28?
>
> I'm not saying what I describe above is necessarily something broken.
> But I would like to have it explained, why is it OK -- provided I did not
> make any errors in my reasoning (assumptions included).
>
> Can you help me understand this code?
Unless I am missing something in the scenario you described, it seems that
the reason the exit_sie(vcpu) function is called is to ensure that the vcpus
that are already running acquire the new attribute values changed by this
function when they are restored to SIE. Of course, my assumption is that
the kvm_arch_vcpu_setup() function - which calls the
kvm_s390_vcpu_crypto_setup() function - is invoked when the vcpu is
restored to SIE.
>
> Regards,
> Halil
>
> [..]



2018-03-15 15:28:37

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/15/2018 09:00 AM, Pierre Morel wrote:
> On 14/03/2018 22:57, Halil Pasic wrote:
>>
>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>> The VFIO AP device model exploits interpretive execution of AP
>>> instructions (APIE) to provide guests passthrough access to AP
>>> devices. This patch introduces a new device attribute in the
>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>> the VFIO AP device defined on the guest.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>> [..]
>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index a60c45b..bc46b67 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm
>>> *kvm, struct kvm_device_attr *attr)
>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>>> break;
>>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>> + if (attr->addr) {
>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>> Unlock mutex before returning?
>>
>> Maybe flip conditions (don't allow manipulating apie if feature not
>> there).
>> Clearing the anyways clear apie if feature not there ain't too bad, but
>> rejecting the operation appears nicer to me.
>>
>>> + return -EOPNOTSUPP;
>>> + kvm->arch.crypto.apie = 1;
>>> + VM_EVENT(kvm, 3, "%s",
>>> + "ENABLE: AP interpretive execution");
>>> + } else {
>>> + kvm->arch.crypto.apie = 0;
>>> + VM_EVENT(kvm, 3, "%s",
>>> + "DISABLE: AP interpretive execution");
>>> + }
>>> + break;
>>> default:
>>> mutex_unlock(&kvm->lock);
>>> return -ENXIO;
>> I wonder how the loop after this switch works for
>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>
>> kvm_for_each_vcpu(i, vcpu, kvm) {
>> kvm_s390_vcpu_crypto_setup(vcpu);
>> exit_sie(vcpu);
>> }
>>
>> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>
>> if (kvm->created_vcpus) {
>> mutex_unlock(&kvm->lock);
>> return -EBUSY;
>> and from the aforementioned loop I guess ECA.28 can be changed
>> for a running guest.
>>
>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>> changed (set) these will be taken out of SIE by exit_sie(). Then for the
>> corresponding threads the control probably goes to QEMU (the emulator in
>> the userspace). And it puts that vcpu back into the SIE, and then that
>> cpu starts acting according to the new ECA.28 value. While other vcpus
>> may still work with the old value of ECA.28.
>>
>> I'm not saying what I describe above is necessarily something broken.
>> But I would like to have it explained, why is it OK -- provided I did
>> not make any errors in my reasoning (assumptions included).
>>
>> Can you help me understand this code?
>>
>> Regards,
>> Halil
>>
>> [..]
>>
>
> I have the same concerns as Halil.
>
> We do not need to change the virtualization type
> (hardware/software) on the fly for the current use case.
>
> Couldn't we delay this until we have one and in between only make the
> vCPU hotplug clean?
>
> We only need to let the door open for the day we have such a use case.
Are you suggesting this code be removed? If so, then where and under
what conditions would you suggest setting ECA.28 given you objected to
setting it based on whether the AP feature is installed?
>
>
> Pierre
>
>
>


2018-03-15 15:38:56

by Pierre Morel

Subject: Re: [PATCH v3 11/14] s390: vfio-ap: sysfs interface to view matrix mdev matrix

On 15/03/2018 15:52, Tony Krowiak wrote:
> On 03/15/2018 05:42 AM, Pierre Morel wrote:
>> On 14/03/2018 19:25, Tony Krowiak wrote:
>>> Provides a sysfs interface to view the AP matrix configured for the
>>> mediated matrix device.
>>>
>>> The relevant sysfs structures are:
>>>
>>> /sys/devices/vfio_ap
>>> ... [matrix]
>>> ...... [mdev_supported_types]
>>> ......... [vfio_ap-passthrough]
>>> ............ [devices]
>>> ...............[$uuid]
>>> .................. matrix
>>>
>>> To view the matrix configured for the mediated matrix device,
>>> print the matrix file:
>>>
>>>     cat matrix
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>>   drivers/s390/crypto/vfio_ap_ops.c |   39
>>> +++++++++++++++++++++++++++++++++++++
>>>   1 files changed, 39 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>>> b/drivers/s390/crypto/vfio_ap_ops.c
>>> index 461d450..04f7a92 100644
>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>> @@ -692,6 +692,44 @@ static ssize_t control_domains_show(struct
>>> device *dev,
>>>   }
>>>   DEVICE_ATTR_RO(control_domains);
>>>
>>> +static ssize_t matrix_show(struct device *dev, struct
>>> device_attribute *attr,
>>> +               char *buf)
>>> +{
>>> +    struct mdev_device *mdev = mdev_from_dev(dev);
>>> +    struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> +    char *bufpos = buf;
>>> +    unsigned long apid;
>>> +    unsigned long apqi;
>>> +    int nchars = 0;
>>> +    int n;
>>> +
>>> +    n = sprintf(bufpos, "ADAPTER.DOMAIN\n");
>>
>> For easy parsing it is better to only report the interesting data
>> and let a user space utility make fancy presentation.
> Is that your way of saying take the above line out?

Yes (I also wanted to explain why).
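
A data-only variant of the output could look like the sketch below. It is
a userspace illustration under stated assumptions: format_matrix_sketch()
is a hypothetical name, the adapters and domains are passed as plain
arrays instead of being read from the matrix bitmaps, and only the
APID.APQI pairs are printed, without header or separator lines. It also
uses snprintf() with an explicit bound rather than sprintf().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write one "APID.APQI" line per (adapter, domain) pair into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 * Hypothetical helper; not taken from the patch.
 */
static int format_matrix_sketch(char *buf, size_t size,
				const unsigned long *apids, size_t napids,
				const unsigned long *apqis, size_t napqis)
{
	size_t pos = 0;

	for (size_t i = 0; i < napids; i++) {
		for (size_t j = 0; j < napqis; j++) {
			int n = snprintf(buf + pos, size - pos,
					 "%02lx.%04lx\n",
					 apids[i], apqis[j]);

			/* Stop on error or if the line would not fit. */
			if (n < 0 || (size_t)n >= size - pos)
				return -1;
			pos += (size_t)n;
		}
	}
	return (int)pos;
}
```

Each output line then has the same fixed shape, so a user space tool can
split it on the dot without skipping decorative lines first.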

>>
>>
>>> +    bufpos += n;
>>> +    nchars += n;
>>> +
>>> +    n = sprintf(bufpos, "--------------\n");
>>> +    bufpos += n;
>>> +    nchars += n;
>>> +
>>> +    for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
>>> +                 matrix_mdev->matrix->apm_max) {
>>> +        n = sprintf(bufpos, "%02lx\n", apid);
>>> +        bufpos += n;
>>> +        nchars += n;
>>> +
>>> +        for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
>>> +                     matrix_mdev->matrix->aqm_max) {
>>> +            n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi);
>>> +            bufpos += n;
>>> +            nchars += n;
>>> +        }
>>> +    }
>>> +
>>> +    return nchars;
>>> +}
>>> +DEVICE_ATTR_RO(matrix);
>>> +
>>> +
>>>   static struct attribute *vfio_ap_mdev_attrs[] = {
>>>       &dev_attr_assign_adapter.attr,
>>>       &dev_attr_unassign_adapter.attr,
>>> @@ -700,6 +738,7 @@ static ssize_t control_domains_show(struct
>>> device *dev,
>>>       &dev_attr_assign_control_domain.attr,
>>>       &dev_attr_unassign_control_domain.attr,
>>>       &dev_attr_control_domains.attr,
>>> +    &dev_attr_matrix.attr,
>>>       NULL,
>>>   };
>>>
>>
>

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 15:47:25

by Pierre Morel

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 15/03/2018 16:26, Tony Krowiak wrote:
> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>
>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>> The VFIO AP device model exploits interpretive execution of AP
>>>> instructions (APIE) to provide guests passthrough access to AP
>>>> devices. This patch introduces a new device attribute in the
>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>> the VFIO AP device defined on the guest.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>> [..]
>>>
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index a60c45b..bc46b67 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm
>>>> *kvm, struct kvm_device_attr *attr)
>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>           VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>>>>           break;
>>>> +    case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>> +        if (attr->addr) {
>>>> +            if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>>> Unlock mutex before returning?
>>>
>>> Maybe flip conditions (don't allow manipulating apie if feature not
>>> there).
>>> Clearing the anyways clear apie if feature not there ain't too bad, but
>>> rejecting the operation appears nicer to me.
>>>
>>>> +                return -EOPNOTSUPP;
>>>> +            kvm->arch.crypto.apie = 1;
>>>> +            VM_EVENT(kvm, 3, "%s",
>>>> +                 "ENABLE: AP interpretive execution");
>>>> +        } else {
>>>> +            kvm->arch.crypto.apie = 0;
>>>> +            VM_EVENT(kvm, 3, "%s",
>>>> +                 "DISABLE: AP interpretive execution");
>>>> +        }
>>>> +        break;
>>>>       default:
>>>>           mutex_unlock(&kvm->lock);
>>>>           return -ENXIO;
>>> I wonder how the loop after this switch works for
>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>
>>>          kvm_for_each_vcpu(i, vcpu, kvm) {
>>>                  kvm_s390_vcpu_crypto_setup(vcpu);
>>>                  exit_sie(vcpu);
>>>          }
>>>
>>>  From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>
>>>          if (kvm->created_vcpus) {
>>>                  mutex_unlock(&kvm->lock);
>>>                  return -EBUSY;
>>> and from the aforementioned loop I guess ECA.28 can be changed
>>> for a running guest.
>>>
>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>> changed (set) these will be taken out of SIE by exit_sie(). Then for
>>> the corresponding threads the control probably goes to QEMU (the
>>> emulator in the userspace). And it puts that vcpu back into the SIE,
>>> and then that cpu starts acting according to the new ECA.28 value.
>>> While other vcpus may still work with the old value of ECA.28.
>>>
>>> I'm not saying what I describe above is necessarily something broken.
>>> But I would like to have it explained, why is it OK -- provided I
>>> did not make any errors in my reasoning (assumptions included).
>>>
>>> Can you help me understand this code?
>>>
>>> Regards,
>>> Halil
>>>
>>> [..]
>>>
>>
>> I have the same concerns as Halil.
>>
>> We do not need to change the virtualization type
>> (hardware/software) on the fly for the current use case.
>>
>> Couldn't we delay this until we have one and in between only make the
>> vCPU hotplug clean?
>>
>> We only need to let the door open for the day we have such a use case.
> Are you suggesting this code be removed? If so, then where and under
> what conditions would you suggest setting ECA.28 given you objected to
> setting it based on whether the AP feature is installed?

I would only call kvm_s390_vcpu_crypto_setup() from inside
kvm_arch_vcpu_init(), as is already the case.


>>
>>
>> Pierre
>>
>>
>>
>

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 16:02:58

by Pierre Morel

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 15/03/2018 16:23, Tony Krowiak wrote:
> On 03/14/2018 05:57 PM, Halil Pasic wrote:
>>
>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>> The VFIO AP device model exploits interpretive execution of AP
>>> instructions (APIE) to provide guests passthrough access to AP
>>> devices. This patch introduces a new device attribute in the
>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>> the VFIO AP device defined on the guest.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>> [..]
>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index a60c45b..bc46b67 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm
>>> *kvm, struct kvm_device_attr *attr)
>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>           VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>>>           break;
>>> +    case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>> +        if (attr->addr) {
>>> +            if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>> Unlock mutex before returning?
> The mutex is unlocked prior to return at the end of the function.
>>
>> Maybe flip conditions (don't allow manipulating apie if feature not
>> there).
>> Clearing the anyways clear apie if feature not there ain't too bad, but
>> rejecting the operation appears nicer to me.
> I think what you're saying is something like this:
>
>     if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>         return -EOPNOTSUPP;
>
>     kvm->arch.crypto.apie = (attr->addr) ? 1 : 0;
>
> I can make arguments for doing this either way, but since the attribute
> will most likely only be set by an AP device in userspace, I suppose
> it makes sense to allow setting of the attribute if the AP feature is
> installed. It certainly makes sense for the dedicated implementation.
>>
>>> +                return -EOPNOTSUPP;

Obviously Halil is speaking about this return statement, which returns
without unlocking the mutex.
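
The usual way to avoid this kind of early return is a single unlock label
that every exit path goes through. A minimal userspace sketch, assuming
hypothetical names throughout (struct vm_sketch, vm_lock(), vm_unlock(),
set_apie_sketch()) and a plain flag in place of the real mutex:

```c
#include <errno.h>

/* Hypothetical stand-ins for struct kvm and its mutex. */
struct vm_sketch {
	int locked;	/* 1 while the "mutex" is held */
	int ap_feature;	/* nonzero if the AP feature is available */
	int apie;	/* AP interpretive execution setting */
};

static void vm_lock(struct vm_sketch *vm)   { vm->locked = 1; }
static void vm_unlock(struct vm_sketch *vm) { vm->locked = 0; }

static int set_apie_sketch(struct vm_sketch *vm, unsigned long addr)
{
	int ret = 0;

	vm_lock(vm);
	if (!vm->ap_feature) {
		/* Record the error, then leave via the unlock path. */
		ret = -EOPNOTSUPP;
		goto out_unlock;
	}
	vm->apie = addr ? 1 : 0;
out_unlock:
	vm_unlock(vm);
	return ret;
}
```

The error case and the success case both pass through vm_unlock(), so
the lock cannot be left held by a rejected request.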



--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 16:27:22

by Halil Pasic

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution



On 03/15/2018 04:23 PM, Tony Krowiak wrote:
> On 03/14/2018 05:57 PM, Halil Pasic wrote:
>>
>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>> The VFIO AP device model exploits interpretive execution of AP
>>> instructions (APIE) to provide guests passthrough access to AP
>>> devices. This patch introduces a new device attribute in the
>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>> the VFIO AP device defined on the guest.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>> [..]
>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index a60c45b..bc46b67 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>>>               sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>           VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>>>           break;
>>> +    case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>> +        if (attr->addr) {
>>> +            if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>> Unlock mutex before returning?
> The mutex is unlocked prior to return at the end of the function.

Pierre already pointed out what I mean.

>>
>> Maybe flip conditions (don't allow manipulating apie if feature not there).
>> Clearing the anyways clear apie if feature not there ain't too bad, but
>> rejecting the operation appears nicer to me.
> I think what you're saying is something like this:
>
>     if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>         return -EOPNOTSUPP;
>
>     kvm->arch.crypto.apie = (attr->addr) ? 1 : 0;
>
> I can make arguments for doing this either way, but since the attribute
> will most likely only be set by an AP device in userspace, I suppose
> it makes sense to allow setting of the attribute if the AP feature is
> installed. It certainly makes sense for the dedicated implementation.

No strong opinion here.

>>
>>> +                return -EOPNOTSUPP;
>>> +            kvm->arch.crypto.apie = 1;
>>> +            VM_EVENT(kvm, 3, "%s",
>>> +                 "ENABLE: AP interpretive execution");
>>> +        } else {
>>> +            kvm->arch.crypto.apie = 0;
>>> +            VM_EVENT(kvm, 3, "%s",
>>> +                 "DISABLE: AP interpretive execution");
>>> +        }
>>> +        break;
>>>       default:
>>>           mutex_unlock(&kvm->lock);
>>>           return -ENXIO;
>> I wonder how the loop after this switch works for KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>
>>          kvm_for_each_vcpu(i, vcpu, kvm) {
>>                  kvm_s390_vcpu_crypto_setup(vcpu);
>>                  exit_sie(vcpu);
>>          }
>>
>>  From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>
>>          if (kvm->created_vcpus) {
>>                  mutex_unlock(&kvm->lock);
>>                  return -EBUSY;
>> and from the aforementioned loop I guess ECA.28 can be changed
>> for a running guest.
>>
>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>> changed (set) these will be taken out of SIE by exit_sie().  Then for the
>> corresponding threads the control probably goes to QEMU (the emulator in
>> the userspace). And it puts that vcpu back into the SIE, and then that
>> cpu starts acting according to the new ECA.28 value.  While other vcpus
>> may still work with the old value of ECA.28.
> Assuming the scenario plays out as you described, why would the other vcpus
> be using the old ECA.28 value if the kvm_s390_vcpu_crypto_setup() function
> is executed for each of them to set the new value for ECA.28?

I'm puzzled; I thought I had just described that. The threads implementing
the vcpus are, or at least may be, concurrent with the thread doing the loop
and kvm_s390_vcpu_crypto_setup() for each vcpu.

Changing the ECA.28 for each vcpu in the configuration ain't likely to be
simultaneous (we do the kvm_s390_vcpu_crypto_setup() in the loop), but even
if it were simultaneous, what would guarantee that the change is observed
as one atomic change (that is: no mix is observed by the guest)?

(And please read the documentation.)
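
The concern above can be illustrated with a small stand-alone model. This is
only a sketch: struct model_vcpu, MODEL_NR_VCPUS and the mask value used for
ECA.28 are assumptions made for illustration, not kernel definitions. It shows
that after the loop has handled only some vcpus, the vcpus disagree about the
bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_VCPUS 4            /* hypothetical vcpu count */
#define MODEL_ECA_APIE 0x00000008u  /* assumed mask standing in for ECA.28 */

/* Stand-in for the per-vcpu state description; not a kernel structure. */
struct model_vcpu {
	unsigned int eca;
};

/*
 * Apply the new setting to the first 'done' vcpus only, i.e. the state
 * the update loop has produced after 'done' iterations.
 */
static void model_update_first_n(struct model_vcpu *vcpus, size_t done,
				 bool apie)
{
	assert(done <= MODEL_NR_VCPUS);
	for (size_t i = 0; i < done; i++) {
		if (apie)
			vcpus[i].eca |= MODEL_ECA_APIE;
		else
			vcpus[i].eca &= ~MODEL_ECA_APIE;
	}
}

/* True if the vcpus currently disagree about the APIE bit. */
static bool model_guest_sees_mix(const struct model_vcpu *vcpus, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if ((vcpus[i].eca ^ vcpus[0].eca) & MODEL_ECA_APIE)
			return true;
	}
	return false;
}
```

Calling model_update_first_n() with done = 2 on four cleared vcpus leaves two
with the bit set and two without, which is the mixed state a guest could
observe if an already-updated vcpu re-enters SIE before the loop finishes.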

>>
>> I'm not saying what I describe above is necessarily something broken.
>> But I would like to have it explained, why is it OK -- provided I did not
>> make any errors in my reasoning (assumptions included).
>>
>> Can you help me understand this code?
> Unless I am missing something in the scenario you described, it seems that
> the reason the exit_sie(vcpu) function is called is to ensure that the vcpus
> that are already running acquire the new attribute values changed by this
> function when they are restored to SIE. Of course, my assumption is that
> the kvm_arch_vcpu_setup() function - which calls the kvm_s390_vcpu_crypto_setup()
> function - is invoked when the vcpu is restored to SIE.

I don't know what you are talking about. kvm_s390_vcpu_crypto_setup(vcpu) is
invoked in the loop, and that changes the State Description.

How is it guaranteed that no vCPU is going to work according to the
new ECA.28 value before *all* vCPUs are taken out of SIE by exit_sie()?
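
One way to obtain such a guarantee is a two-phase update: keep every vcpu
out of SIE first, change the setting everywhere, and only then let them back
in. The following is a sketch of that idea as a stand-alone model. All names
(model_vcpu, model_try_enter_sie(), model_set_apie_all()) and the mask value
are hypothetical; nothing in this thread says the patch does it this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ECA_APIE 0x00000008u  /* assumed mask standing in for ECA.28 */

/* Stand-in for a vcpu; not a kernel structure. */
struct model_vcpu {
	bool in_sie;       /* currently executing guest code */
	bool blocked;      /* not allowed to (re-)enter SIE */
	unsigned int eca;
};

/* A blocked vcpu may not enter SIE. */
static bool model_try_enter_sie(struct model_vcpu *vcpu)
{
	if (vcpu->blocked)
		return false;
	vcpu->in_sie = true;
	return true;
}

static void model_set_apie_all(struct model_vcpu *vcpus, size_t n, bool apie)
{
	size_t i;

	/*
	 * Phase 1: block and kick every vcpu. A real implementation would
	 * have to wait here until each vcpu has actually left SIE; the
	 * model treats that as immediate.
	 */
	for (i = 0; i < n; i++) {
		vcpus[i].blocked = true;
		vcpus[i].in_sie = false;
	}

	/* Phase 2: no vcpu is in SIE, so the guest cannot observe a mix. */
	for (i = 0; i < n; i++) {
		assert(!vcpus[i].in_sie);
		if (apie)
			vcpus[i].eca |= MODEL_ECA_APIE;
		else
			vcpus[i].eca &= ~MODEL_ECA_APIE;
	}

	/* Phase 3: allow the vcpus back into SIE. */
	for (i = 0; i < n; i++)
		vcpus[i].blocked = false;
}
```

The difference from the loop under discussion is that the update and the
kick are not interleaved per vcpu: no vcpu can run with the new value while
another still runs with the old one.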

Sadly, your answers didn't contribute much to my understanding. I hope
mine will be more successful in contributing to yours.

Regards,
Halil


2018-03-15 17:24:01

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/15/2018 11:45 AM, Pierre Morel wrote:
> On 15/03/2018 16:26, Tony Krowiak wrote:
>> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>>
>>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>> devices. This patch introduces a new device attribute in the
>>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>>> the VFIO AP device defined on the guest.
>>>>>
>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>> ---
>>>> [..]
>>>>
>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>> index a60c45b..bc46b67 100644
>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm
>>>>> *kvm, struct kvm_device_attr *attr)
>>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>>>>> break;
>>>>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>> + if (attr->addr) {
>>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>> Unlock mutex before returning?
>>>>
>>>> Maybe flip conditions (don't allow manipulating apie if feature not
>>>> there).
>>>> Clearing the anyways clear apie if feature not there ain't too bad,
>>>> but
>>>> rejecting the operation appears nicer to me.
>>>>
>>>>> + return -EOPNOTSUPP;
>>>>> + kvm->arch.crypto.apie = 1;
>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>> + "ENABLE: AP interpretive execution");
>>>>> + } else {
>>>>> + kvm->arch.crypto.apie = 0;
>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>> + "DISABLE: AP interpretive execution");
>>>>> + }
>>>>> + break;
>>>>> default:
>>>>> mutex_unlock(&kvm->lock);
>>>>> return -ENXIO;
>>>> I wonder how the loop after this switch works for
>>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>
>>>> kvm_for_each_vcpu(i, vcpu, kvm) {
>>>> kvm_s390_vcpu_crypto_setup(vcpu);
>>>> exit_sie(vcpu);
>>>> }
>>>>
>>>> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>>
>>>> if (kvm->created_vcpus) {
>>>> mutex_unlock(&kvm->lock);
>>>> return -EBUSY;
>>>> and from the aforementioned loop I guess ECA.28 can be changed
>>>> for a running guest.
>>>>
>>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>>> changed (set) these will be taken out of SIE by exit_sie(). Then
>>>> for the
>>>> corresponding threads the control probably goes to QEMU (the
>>>> emulator in
>>>> the userspace). And it puts that vcpu back into the SIE, and then that
>>>> cpu starts acting according to the new ECA.28 value. While other
>>>> vcpus
>>>> may still work with the old value of ECA.28.
>>>>
>>>> I'm not saying what I describe above is necessarily something broken.
>>>> But I would like to have it explained, why is it OK -- provided I
>>>> did not
>>>> make any errors in my reasoning (assumptions included).
>>>>
>>>> Can you help me understand this code?
>>>>
>>>> Regards,
>>>> Halil
>>>>
>>>> [..]
>>>>
>>>
>>> I have the same concerns as Halil.
>>>
>>> We do not need to change the virtualization type
>>> (hardware/software) on the fly for the current use case.
>>>
>>> Couldn't we delay this until we have one and in between only make
>>> the vCPU hotplug clean?
>>>
>>> We only need to let the door open for the day we have such a use case.
>> Are you suggesting this code be removed? If so, then where and under
>> what conditions would
>> you suggest setting ECA.28 given you objected to setting it based on
>> whether the
>> AP feature is installed?
>
> I would only call kvm_s390_vcpu_crypto_setup() from inside
> kvm_arch_vcpu_init()
> as it is already.
It is not called from kvm_arch_vcpu_init(); it is called from
kvm_arch_vcpu_setup(). Also, this loop was already here; I did not put it
in. Assuming whoever put it there did so for a reason, it is not my place
to remove it. According to a trace I ran, the calls to this function occur
after the vcpus are created. Consequently, the kvm_s390_vcpu_crypto_setup()
function would not be called without the loop, and neither the key wrapping
support nor ECA_APIE would be configured in the vcpu's SIE descriptor.

If you have a better idea for where/how to set this flag, I'm all ears. It
would be nice if it could be set before the vcpus are created, but I haven't
found a good candidate. I suspect that the loop was put in to make sure that
all vcpus get updated regardless of whether they are running or not, but I
don't know what happens after a vcpu is kicked out of SIE. I suspect, as
Halil surmised, that QEMU restores the vcpus to SIE. This would seemingly
cause kvm_arch_vcpu_setup() to get called, at which time the ECA_APIE value
as well as the key wrapping values will get set. If somebody has knowledge
of the flow here, please feel free to pitch in.
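
For readers following the ECA.28 / ECA_APIE references, here is a minimal
stand-alone sketch of what copying the VM-wide setting into a vcpu's SIE
descriptor amounts to. It assumes that ECA is a 32-bit field and that "ECA.28"
uses IBM bit numbering (bit 0 is the most significant bit). The names
model_ibm_bit32(), model_sie_block, model_vm and model_crypto_setup() are
hypothetical and only mirror the roles discussed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of the field. */
static uint32_t model_ibm_bit32(unsigned int bit)
{
	assert(bit < 32);
	return UINT32_C(1) << (31u - bit);
}

/* Stand-ins for the SIE descriptor and the VM-wide crypto settings. */
struct model_sie_block {
	uint32_t eca;
};

struct model_vm {
	bool apie;
};

/*
 * Copy the VM-wide APIE setting into one vcpu's descriptor, leaving all
 * other ECA bits untouched. Plays the role that the thread attributes to
 * kvm_s390_vcpu_crypto_setup(), under the assumptions stated above.
 */
static void model_crypto_setup(struct model_sie_block *sb,
			       const struct model_vm *vm)
{
	sb->eca &= ~model_ibm_bit32(28);
	if (vm->apie)
		sb->eca |= model_ibm_bit32(28);
}
```

Under these assumptions, bit 28 of a 32-bit field corresponds to the mask
0x00000008. The function only takes effect for a vcpu when it is called for
that vcpu, which is why the ordering of the ioctl relative to vcpu creation
matters in this discussion.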
>
>
>
>>>
>>>
>>> Pierre
>>>
>>>
>>>
>>
>


2018-03-15 17:28:52

by Tony Krowiak

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On 03/15/2018 09:25 AM, Pierre Morel wrote:
> On 14/03/2018 19:25, Tony Krowiak wrote:
>> Introduces a new AP device driver. This device driver
>> is built on the VFIO mediated device framework. The framework
>> provides sysfs interfaces that facilitate passthrough
>> access by guests to devices installed on the linux host.
>>
>> The VFIO AP device driver will serve two purposes:
>>
>> 1. Provide the interfaces to reserve AP devices for exclusive
>> use by KVM guests. This is accomplished by unbinding the
>> devices to be reserved for guest usage from the default AP
>> device driver and binding them to the VFIO AP device driver.
>>
>> 2. Implements the functions, callbacks and sysfs attribute
>> interfaces required to create one or more VFIO mediated
>> devices each of which will be used to configure the AP
>> matrix for a guest and serve as a file descriptor
>> for facilitating communication between QEMU and the
>> VFIO AP device driver.
>>
>> When the VFIO AP device driver is initialized:
>>
>> * It registers with the AP bus for control of type 10 (CEX4
>> and newer) AP queue devices. The probe and remove callbacks
>> will be provided to support the binding/unbinding of
>> AP queue devices to/from the VFIO AP device driver.
>>
>> * Creates a /sys/devices/vfio-ap/matrix device to hold
>> the APQNs of the AP devices bound to the VFIO
>> AP device driver and serves as the parent of the
>> mediated devices created for each guest.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> MAINTAINERS | 2 +
>> arch/s390/Kconfig | 11 +++
>> drivers/s390/crypto/Makefile | 4 +
>> drivers/s390/crypto/vfio_ap_drv.c | 135
>> +++++++++++++++++++++++++++++++++
>> drivers/s390/crypto/vfio_ap_private.h | 22 ++++++
>> include/uapi/linux/vfio.h | 2 +
>> 6 files changed, 176 insertions(+), 0 deletions(-)
>> create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
>> create mode 100644 drivers/s390/crypto/vfio_ap_private.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 72742d5..f129253 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -11884,6 +11884,8 @@ W:
>> http://www.ibm.com/developerworks/linux/linux390/
>> S: Supported
>> F: arch/s390/include/asm/kvm/kvm-ap.h
>> F: arch/s390/kvm/kvm-ap.c
>> +F: drivers/s390/crypto/vfio_ap_drv.c
>> +F: drivers/s390/crypto/vfio_ap_private.h
>>
>> S390 ZFCP DRIVER
>> M: Steffen Maier <[email protected]>
>> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
>> index cbe1d97..58509db 100644
>> --- a/arch/s390/Kconfig
>> +++ b/arch/s390/Kconfig
>> @@ -771,6 +771,17 @@ config VFIO_CCW
>> To compile this driver as a module, choose M here: the
>> module will be called vfio_ccw.
>>
>> +config VFIO_AP
>> + def_tristate m
> not sure it must be module by default.
> I would not set it by default.
Connie also asked about this in the last review, so I will go ahead
and change it.
>
>> + prompt "VFIO support for AP devices"
>> + depends on ZCRYPT && VFIO_MDEV_DEVICE
>
> VFIO_MDEV_DEVICE is a general feature *needed* by VFIO_AP
> and has no use case by its own. If it is set it is obviously because some
> mediated device drivers needs it.
> while ZCRYPT is a Z feature which may be set without VFIO_AP.
>
> So you need:
>
> config VFIO_AP
> def_tristate n
> prompt "VFIO support for AP devices"
> depends on ZCRYPT
> select VFIO_MDEV
> select VFIO_MDEV_DEVICE
> ...
I was thinking the same just yesterday and I agree, this makes sense.
>
>
>
> Pierre
>


2018-03-15 18:00:13

by Pierre Morel

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 15/03/2018 18:21, Tony Krowiak wrote:
> On 03/15/2018 11:45 AM, Pierre Morel wrote:
>> On 15/03/2018 16:26, Tony Krowiak wrote:
>>> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>>>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>>>
>>>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>> devices. This patch introduces a new device attribute in the
>>>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>>>> the VFIO AP device defined on the guest.
>>>>>>
>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>> ---
>>>>> [..]
>>>>>
>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>> index a60c45b..bc46b67 100644
>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm
>>>>>> *kvm, struct kvm_device_attr *attr)
>>>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>>>           VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping
>>>>>> support");
>>>>>>           break;
>>>>>> +    case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>> +        if (attr->addr) {
>>>>>> +            if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>> Unlock mutex before returning?
>>>>>
>>>>> Maybe flip conditions (don't allow manipulating apie if feature
>>>>> not there).
>>>>> Clearing the anyways clear apie if feature not there ain't too
>>>>> bad, but
>>>>> rejecting the operation appears nicer to me.
>>>>>
>>>>>> +                return -EOPNOTSUPP;
>>>>>> +            kvm->arch.crypto.apie = 1;
>>>>>> +            VM_EVENT(kvm, 3, "%s",
>>>>>> +                 "ENABLE: AP interpretive execution");
>>>>>> +        } else {
>>>>>> +            kvm->arch.crypto.apie = 0;
>>>>>> +            VM_EVENT(kvm, 3, "%s",
>>>>>> +                 "DISABLE: AP interpretive execution");
>>>>>> +        }
>>>>>> +        break;
>>>>>>       default:
>>>>>>           mutex_unlock(&kvm->lock);
>>>>>>           return -ENXIO;
>>>>> I wonder how the loop after this switch works for
>>>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>
>>>>>          kvm_for_each_vcpu(i, vcpu, kvm) {
>>>>>                  kvm_s390_vcpu_crypto_setup(vcpu);
>>>>>                  exit_sie(vcpu);
>>>>>          }
>>>>>
>>>>>  From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>>>
>>>>>          if (kvm->created_vcpus) {
>>>>>                  mutex_unlock(&kvm->lock);
>>>>>                  return -EBUSY;
>>>>> and from the aforementioned loop I guess ECA.28 can be changed
>>>>> for a running guest.
>>>>>
>>>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>>>> changed (set) these will be taken out of SIE by exit_sie(). Then
>>>>> for the
>>>>> corresponding threads the control probably goes to QEMU (the
>>>>> emulator in
>>>>> the userspace). And it puts that vcpu back into the SIE, and then
>>>>> that
>>>>> cpu starts acting according to the new ECA.28 value. While other
>>>>> vcpus
>>>>> may still work with the old value of ECA.28.
>>>>>
>>>>> I'm not saying what I describe above is necessarily something broken.
>>>>> But I would like to have it explained, why is it OK -- provided I
>>>>> did not
>>>>> make any errors in my reasoning (assumptions included).
>>>>>
>>>>> Can you help me understand this code?
>>>>>
>>>>> Regards,
>>>>> Halil
>>>>>
>>>>> [..]
>>>>>
>>>>
>>>> I have the same concerns as Halil.
>>>>
>>>> We do not need to change the virtualization type
>>>> (hardware/software) on the fly for the current use case.
>>>>
>>>> Couldn't we delay this until we have one and in between only make
>>>> the vCPU hotplug clean?
>>>>
>>>> We only need to let the door open for the day we have such a use case.
>>> Are you suggesting this code be removed? If so, then where and under
>>> what conditions would
>>> you suggest setting ECA.28 given you objected to setting it based on
>>> whether the
>>> AP feature is installed?
>>
>> I would only call kvm_s390_vcpu_crypto_setup() from inside
>> kvm_arch_vcpu_init()
>> as it is already.
> It is not called from kvm_arch_vcpu_init(), it is called from
> kvm_arch_vcpu_setup().

Hm, sorry about that.
However, the idea still stands: do not call this function from inside an
ioctl changing crypto parameters, but only during vcpu creation.


> Also,
> this loop was already here, I did not put it in. Assuming whomever put
> it there did so
> for a reason, it is not my place to remove it. According to a trace I
> ran, the calls to this
> function occur after the vcpus are created. Consequently, the
> kvm_s390_vcpu_crypto_setup()
> function would not be called without the loop and neither the key
> wrapping support nor the
> ECA_APIE would be configured in the vcpu's SIE descriptor.
>
> If you have a better idea for where/how to set this flag, I'm all
> ears. It would be nice if it could be set before the vcpus are
> created, but I haven't
> found a good candidate. I suspect that the loop was put in to make
> sure that all vcpus
> get updated regardless of whether they are running or not, but I don't
> know what happens
> after a vcpu is kicked out of SIE. I suspect, as Halil surmised, that
> QEMU
> restores the vcpus to SIE. This would seemingly cause the
> kvm_arch_vcpu_setup() to get
> called at which time the ECA_APIE value as well as the key wrapping
> values will get set.
> If somebody has knowledge of the flow here, please feel free to pitch in.
>>
>>
>>
>>>>
>>>>
>>>> Pierre
>>>>
>>>>
>>>>
>>>
>>
>

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-15 23:40:27

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/15/2018 12:00 PM, Pierre Morel wrote:
> On 15/03/2018 16:23, Tony Krowiak wrote:
>> On 03/14/2018 05:57 PM, Halil Pasic wrote:
>>>
>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>> The VFIO AP device model exploits interpretive execution of AP
>>>> instructions (APIE) to provide guests passthrough access to AP
>>>> devices. This patch introduces a new device attribute in the
>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>> the VFIO AP device defined on the guest.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>> [..]
>>>
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index a60c45b..bc46b67 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct kvm
>>>> *kvm, struct kvm_device_attr *attr)
>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping support");
>>>> break;
>>>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>> + if (attr->addr) {
>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>>> Unlock mutex before returning?
>> The mutex is unlocked prior to return at the end of the function.
>>>
>>> Maybe flip conditions (don't allow manipulating apie if feature not
>>> there).
>>> Clearing the anyways clear apie if feature not there ain't too bad, but
>>> rejecting the operation appears nicer to me.
>> I think what you're saying is something like this:
>>
>> if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>> return -EOPNOTSUPP;
>>
>> kvm->arch.crypto.apie = (attr->addr) ? 1 : 0;
>>
>> I can make arguments for doing this either way, but since the attribute
>> is will most likely only be set by an AP device in userspace, I suppose
>> it makes sense to allow setting of the attribute if the AP feature is
>> installed. It certainly makes sense for the dedicated implementation.
>>>
>>>> + return -EOPNOTSUPP;
>
> Obviously Halil is speaking on this return statement.
> Which returns without unlocking the mutex.
Got it.
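
The problem pointed out above (an early return that leaves the mutex held) is
commonly avoided with a single exit path. Below is a stand-alone sketch of
that pattern. The lock is a simple counter so the example can be checked
without threads, and every name in it (model_lock, model_vm,
model_set_apie()) is hypothetical rather than taken from the patch. It also
applies the "flipped" check suggested earlier: the operation is rejected
whenever the feature is missing, whether enabling or disabling.

```c
#include <errno.h>
#include <stdbool.h>

/* Counter-based stand-in for a mutex, so leaks are easy to detect. */
struct model_lock {
	int depth;
};

static void model_lock_acquire(struct model_lock *l)
{
	l->depth++;
}

static void model_lock_release(struct model_lock *l)
{
	l->depth--;
}

struct model_vm {
	struct model_lock lock;
	bool has_ap_feature;   /* stands in for the CPU feature test */
	bool apie;
};

static int model_set_apie(struct model_vm *vm, bool enable)
{
	int ret = 0;

	model_lock_acquire(&vm->lock);

	/* Reject the request up front if the feature is not available. */
	if (!vm->has_ap_feature) {
		ret = -EOPNOTSUPP;
		goto out;
	}

	vm->apie = enable;
out:
	/* Every path, including the error path, releases the lock here. */
	model_lock_release(&vm->lock);
	return ret;
}
```

With this shape, adding further error checks later cannot reintroduce the
leak as long as they jump to the out label instead of returning directly.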
>
>
>
>


2018-03-15 23:41:28

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/15/2018 01:56 PM, Pierre Morel wrote:
> On 15/03/2018 18:21, Tony Krowiak wrote:
>> On 03/15/2018 11:45 AM, Pierre Morel wrote:
>>> On 15/03/2018 16:26, Tony Krowiak wrote:
>>>> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>>>>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>>>>
>>>>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>>> devices. This patch introduces a new device attribute in the
>>>>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>>>>> the VFIO AP device defined on the guest.
>>>>>>>
>>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>>> ---
>>>>>> [..]
>>>>>>
>>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>>> index a60c45b..bc46b67 100644
>>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct
>>>>>>> kvm *kvm, struct kvm_device_attr *attr)
>>>>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>>>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping
>>>>>>> support");
>>>>>>> break;
>>>>>>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>> + if (attr->addr) {
>>>>>>> + if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>>> Unlock mutex before returning?
>>>>>>
>>>>>> Maybe flip conditions (don't allow manipulating apie if feature
>>>>>> not there).
>>>>>> Clearing the anyways clear apie if feature not there ain't too
>>>>>> bad, but
>>>>>> rejecting the operation appears nicer to me.
>>>>>>
>>>>>>> + return -EOPNOTSUPP;
>>>>>>> + kvm->arch.crypto.apie = 1;
>>>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>>>> + "ENABLE: AP interpretive execution");
>>>>>>> + } else {
>>>>>>> + kvm->arch.crypto.apie = 0;
>>>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>>>> + "DISABLE: AP interpretive execution");
>>>>>>> + }
>>>>>>> + break;
>>>>>>> default:
>>>>>>> mutex_unlock(&kvm->lock);
>>>>>>> return -ENXIO;
>>>>>> I wonder how the loop after this switch works for
>>>>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>
>>>>>> kvm_for_each_vcpu(i, vcpu, kvm) {
>>>>>> kvm_s390_vcpu_crypto_setup(vcpu);
>>>>>> exit_sie(vcpu);
>>>>>> }
>>>>>>
>>>>>> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>>>>
>>>>>> if (kvm->created_vcpus) {
>>>>>> mutex_unlock(&kvm->lock);
>>>>>> return -EBUSY;
>>>>>> and from the aforementioned loop I guess ECA.28 can be changed
>>>>>> for a running guest.
>>>>>>
>>>>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>>>>> changed (set) these will be taken out of SIE by exit_sie(). Then
>>>>>> for the
>>>>>> corresponding threads the control probably goes to QEMU (the
>>>>>> emulator in
>>>>>> the userspace). And it puts that vcpu back into the SIE, and then
>>>>>> that
>>>>>> cpu starts acting according to the new ECA.28 value. While other
>>>>>> vcpus
>>>>>> may still work with the old value of ECA.28.
>>>>>>
>>>>>> I'm not saying what I describe above is necessarily something
>>>>>> broken.
>>>>>> But I would like to have it explained, why is it OK -- provided I
>>>>>> did not
>>>>>> make any errors in my reasoning (assumptions included).
>>>>>>
>>>>>> Can you help me understand this code?
>>>>>>
>>>>>> Regards,
>>>>>> Halil
>>>>>>
>>>>>> [..]
>>>>>>
>>>>>
>>>>> I have the same concerns as Halil.
>>>>>
>>>>> We do not need to change the virtualization type
>>>>> (hardware/software) on the fly for the current use case.
>>>>>
>>>>> Couldn't we delay this until we have one and in between only make
>>>>> the vCPU hotplug clean?
>>>>>
>>>>> We only need to let the door open for the day we have such a use
>>>>> case.
>>>> Are you suggesting this code be removed? If so, then where and
>>>> under what conditions would
>>>> you suggest setting ECA.28 given you objected to setting it based
>>>> on whether the
>>>> AP feature is installed?
>>>
>>> I would only call kvm_s390_vcpu_crypto_setup() from inside
>>> kvm_arch_vcpu_init()
>>> as it is already.
>> It is not called from kvm_arch_vcpu_init(), it is called from
>> kvm_arch_vcpu_setup().
>
> hum, sorry for this.
> However, the idea pertains, not to call this function from inside an
> ioctl changing crypto parameters, but only during vcpu creation.
Unfortunately, the ioctl does not get called until after the vcpus are
created (see my comments below).
>
>
>
>> Also,
>> this loop was already here, I did not put it in. Assuming whomever
>> put it there did so
>> for a reason, it is not my place to remove it. According to a trace I
>> ran, the calls to this
>> function occur after the vcpus are created. Consequently, the
>> kvm_s390_vcpu_crypto_setup()
>> function would not be called without the loop and neither the key
>> wrapping support nor the
>> ECA_APIE would be configured in the vcpu's SIE descriptor.
>>
>> If you have a better idea for where/how to set this flag, I'm all
>> ears. It would be nice if it could be set before the vcpus are
>> created, but I haven't
>> found a good candidate. I suspect that the loop was put in to make
>> sure that all vcpus
>> get updated regardless of whether they are running or not, but I
>> don't know what happens
>> after a vcpu is kicked out of SIE. I suspect, as Halil surmised, that
>> QEMU
>> restores the vcpus to SIE. This would seemingly cause the
>> kvm_arch_vcpu_setup() to get
>> called at which time the ECA_APIE value as well as the key wrapping
>> values will get set.
>> If somebody has knowledge of the flow here, please feel free to pitch
>> in.
>>>
>>>
>>>
>>>>>
>>>>>
>>>>> Pierre
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


2018-03-16 07:52:56

by Pierre Morel

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 16/03/2018 00:39, Tony Krowiak wrote:
> On 03/15/2018 01:56 PM, Pierre Morel wrote:
>> On 15/03/2018 18:21, Tony Krowiak wrote:
>>> On 03/15/2018 11:45 AM, Pierre Morel wrote:
>>>> On 15/03/2018 16:26, Tony Krowiak wrote:
>>>>> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>>>>>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>>>>>
>>>>>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>>>> devices. This patch introduces a new device attribute in the
>>>>>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>>>>>> the VFIO AP device defined on the guest.
>>>>>>>>
>>>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>>>> ---
>>>>>>> [..]
>>>>>>>
>>>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>>>> index a60c45b..bc46b67 100644
>>>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct
>>>>>>>> kvm *kvm, struct kvm_device_attr *attr)
>>>>>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>>>>>           VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping
>>>>>>>> support");
>>>>>>>>           break;
>>>>>>>> +    case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>>> +        if (attr->addr) {
>>>>>>>> +            if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>>>>>>> Unlock mutex before returning?
>>>>>>>
>>>>>>> Maybe flip conditions (don't allow manipulating apie if feature
>>>>>>> not there).
>>>>>>> Clearing the anyways clear apie if feature not there ain't too
>>>>>>> bad, but
>>>>>>> rejecting the operation appears nicer to me.
>>>>>>>
>>>>>>>> +                return -EOPNOTSUPP;
>>>>>>>> +            kvm->arch.crypto.apie = 1;
>>>>>>>> +            VM_EVENT(kvm, 3, "%s",
>>>>>>>> +                 "ENABLE: AP interpretive execution");
>>>>>>>> +        } else {
>>>>>>>> +            kvm->arch.crypto.apie = 0;
>>>>>>>> +            VM_EVENT(kvm, 3, "%s",
>>>>>>>> +                 "DISABLE: AP interpretive execution");
>>>>>>>> +        }
>>>>>>>> +        break;
>>>>>>>>       default:
>>>>>>>>           mutex_unlock(&kvm->lock);
>>>>>>>>           return -ENXIO;
>>>>>>> I wonder how the loop after this switch works for
>>>>>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>>
>>>>>>>          kvm_for_each_vcpu(i, vcpu, kvm) {
>>>>>>>                  kvm_s390_vcpu_crypto_setup(vcpu);
>>>>>>>                  exit_sie(vcpu);
>>>>>>>          }
>>>>>>>
>>>>>>>  From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>>>>>
>>>>>>>          if (kvm->created_vcpus) {
>>>>>>>                  mutex_unlock(&kvm->lock);
>>>>>>>                  return -EBUSY;
>>>>>>> and from the aforementioned loop I guess ECA.28 can be changed
>>>>>>> for a running guest.
>>>>>>>
>>>>>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>>>>>> changed (set) these will be taken out of SIE by exit_sie(). Then
>>>>>>> for the
>>>>>>> corresponding threads the control probably goes to QEMU (the
>>>>>>> emulator in
>>>>>>> the userspace). And it puts that vcpu back into the SIE, and
>>>>>>> then that
>>>>>>> cpu starts acting according to the new ECA.28 value. While other
>>>>>>> vcpus
>>>>>>> may still work with the old value of ECA.28.
>>>>>>>
>>>>>>> I'm not saying what I describe above is necessarily something
>>>>>>> broken.
>>>>>>> But I would like to have it explained, why is it OK -- provided
>>>>>>> I did not
>>>>>>> make any errors in my reasoning (assumptions included).
>>>>>>>
>>>>>>> Can you help me understand this code?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Halil
>>>>>>>
>>>>>>> [..]
>>>>>>>
>>>>>>
>>>>>> I have the same concerns as Halil.
>>>>>>
>>>>>> We do not need to change the virtualization type
>>>>>> (hardware/software) on the fly for the current use case.
>>>>>>
>>>>>> Couldn't we delay this until we have one and in between only make
>>>>>> the vCPU hotplug clean?
>>>>>>
>>>>>> We only need to let the door open for the day we have such a use
>>>>>> case.
>>>>> Are you suggesting this code be removed? If so, then where and
>>>>> under what conditions would
>>>>> you suggest setting ECA.28 given you objected to setting it based
>>>>> on whether the
>>>>> AP feature is installed?
>>>>
>>>> I would only call kvm_s390_vcpu_crypto_setup() from inside
>>>> kvm_arch_vcpu_init()
>>>> as it is already.
>>> It is not called from kvm_arch_vcpu_init(), it is called from
>>> kvm_arch_vcpu_setup().
>>
>> hum, sorry for this.
>> However, the idea pertains, not to call this function from inside an
>> ioctl changing crypto parameters, but only during vcpu creation.
> Unfortunately, the ioctl does not get called until after the vcpus are
> created (see my comments below)

That is why I think you should not change the ECA field from the crypto
ioctl but only during the vcpu initialization phase.
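
A sketch of what this could look like, as a stand-alone model: the ioctl
only records the VM-wide setting and refuses once vcpus exist, and the ECA
bit is derived from that setting when a vcpu is created. The names
(model_vm, model_vcpu, model_ioctl_set_apie(), model_vcpu_create()) and the
mask value are assumptions for illustration. As noted in the quoted text,
userspace currently issues the ioctl after the vcpus are created, so this
ordering would also require a change on that side.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ECA_APIE UINT32_C(0x00000008)  /* assumed mask for ECA.28 */

/* Stand-ins for the VM and vcpu; not kernel structures. */
struct model_vm {
	unsigned int created_vcpus;
	bool apie;
};

struct model_vcpu {
	uint32_t eca;
};

/* The ioctl never touches vcpu state; it is refused once vcpus exist. */
static int model_ioctl_set_apie(struct model_vm *vm, bool enable)
{
	if (vm->created_vcpus)
		return -EBUSY;
	vm->apie = enable;
	return 0;
}

/* The ECA bit is derived from the VM-wide setting at creation time only. */
static void model_vcpu_create(struct model_vm *vm, struct model_vcpu *vcpu)
{
	vcpu->eca = vm->apie ? MODEL_ECA_APIE : 0;
	vm->created_vcpus++;
}
```

Because the setting can no longer change once a vcpu exists, every vcpu,
including one that is hotplugged later, starts with the same value and no
running vcpu ever has its ECA field modified.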

>>
>>
>>
>>> Also,
>>> this loop was already here, I did not put it in. Assuming whomever
>>> put it there did so
>>> for a reason, it is not my place to remove it. According to a trace
>>> I ran, the calls to this
>>> function occur after the vcpus are created. Consequently, the
>>> kvm_s390_vcpu_crypto_setup()
>>> function would not be called without the loop and neither the key
>>> wrapping support nor the
>>> ECA_APIE would be configured in the vcpu's SIE descriptor.
>>>
>>> If you have a better idea for where/how to set this flag, I'm all
>>> ears. It would be nice if it could be set before the vcpus are
>>> created, but I haven't
>>> found a good candidate. I suspect that the loop was put in to make
>>> sure that all vcpus
>>> get updated regardless of whether they are running or not, but I
>>> don't know what happens
>>> after a vcpu is kicked out of SIE. I suspect, as Halil surmised,
>>> that QEMU
>>> restores the vcpus to SIE. This would seemingly cause the
>>> kvm_arch_vcpu_setup() to get
>>> called at which time the ECA_APIE value as well as the key wrapping
>>> values will get set.
>>> If somebody has knowledge of the flow here, please feel free to
>>> pitch in.
>>>>
>>>>
>>>>
>>>>>>
>>>>>>
>>>>>> Pierre
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-16 16:22:18

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/16/2018 03:51 AM, Pierre Morel wrote:
> On 16/03/2018 00:39, Tony Krowiak wrote:
>> On 03/15/2018 01:56 PM, Pierre Morel wrote:
>>> On 15/03/2018 18:21, Tony Krowiak wrote:
>>>> On 03/15/2018 11:45 AM, Pierre Morel wrote:
>>>>> On 15/03/2018 16:26, Tony Krowiak wrote:
>>>>>> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>>>>>>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>>>>>>
>>>>>>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>>>>> devices. This patch introduces a new device attribute in the
>>>>>>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>>>>>>> the VFIO AP device defined on the guest.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>>>>> ---
>>>>>>>> [..]
>>>>>>>>
>>>>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>>>>> index a60c45b..bc46b67 100644
>>>>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct
>>>>>>>>> kvm *kvm, struct kvm_device_attr *attr)
>>>>>>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>>>>>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping
>>>>>>>>> support");
>>>>>>>>> break;
>>>>>>>>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>>>> + if (attr->addr) {
>>>>>>>>> + if (!test_kvm_cpu_feat(kvm,
>>>>>>>>> KVM_S390_VM_CPU_FEAT_AP))
>>>>>>>> Unlock mutex before returning?
>>>>>>>>
>>>>>>>> Maybe flip conditions (don't allow manipulating apie if feature
>>>>>>>> not there).
>>>>>>>> Clearing the anyways clear apie if feature not there ain't too
>>>>>>>> bad, but
>>>>>>>> rejecting the operation appears nicer to me.
>>>>>>>>
>>>>>>>>> + return -EOPNOTSUPP;
>>>>>>>>> + kvm->arch.crypto.apie = 1;
>>>>>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>>>>>> + "ENABLE: AP interpretive execution");
>>>>>>>>> + } else {
>>>>>>>>> + kvm->arch.crypto.apie = 0;
>>>>>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>>>>>> + "DISABLE: AP interpretive execution");
>>>>>>>>> + }
>>>>>>>>> + break;
>>>>>>>>> default:
>>>>>>>>> mutex_unlock(&kvm->lock);
>>>>>>>>> return -ENXIO;
>>>>>>>> I wonder how the loop after this switch works for
>>>>>>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>>>
>>>>>>>> kvm_for_each_vcpu(i, vcpu, kvm) {
>>>>>>>> kvm_s390_vcpu_crypto_setup(vcpu);
>>>>>>>> exit_sie(vcpu);
>>>>>>>> }
>>>>>>>>
>>>>>>>> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>>>>>>
>>>>>>>> if (kvm->created_vcpus) {
>>>>>>>> mutex_unlock(&kvm->lock);
>>>>>>>> return -EBUSY;
>>>>>>>> and from the aforementioned loop I guess ECA.28 can be changed
>>>>>>>> for a running guest.
>>>>>>>>
>>>>>>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>>>>>>> changed (set) these will be taken out of SIE by exit_sie().
>>>>>>>> Then for the
>>>>>>>> corresponding threads the control probably goes to QEMU (the
>>>>>>>> emulator in
>>>>>>>> the userspace). And it puts that vcpu back into the SIE, and
>>>>>>>> then that
>>>>>>>> cpu starts acting according to the new ECA.28 value. While
>>>>>>>> other vcpus
>>>>>>>> may still work with the old value of ECA.28.
>>>>>>>>
>>>>>>>> I'm not saying what I describe above is necessarily something
>>>>>>>> broken.
>>>>>>>> But I would like to have it explained, why is it OK -- provided
>>>>>>>> I did not
>>>>>>>> make any errors in my reasoning (assumptions included).
>>>>>>>>
>>>>>>>> Can you help me understand this code?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Halil
>>>>>>>>
>>>>>>>> [..]
>>>>>>>>
>>>>>>>
>>>>>>> I have the same concerns as Halil.
>>>>>>>
>>>>>>> We do not need to change the virtulization type
>>>>>>> (hardware/software) on the fly for the current use case.
>>>>>>>
>>>>>>> Couldn't we delay this until we have one and in between only
>>>>>>> make the vCPU hotplug clean?
>>>>>>>
>>>>>>> We only need to let the door open for the day we have such a use
>>>>>>> case.
>>>>>> Are you suggesting this code be removed? If so, then where and
>>>>>> under what conditions would
>>>>>> you suggest setting ECA.28 given you objected to setting it based
>>>>>> on whether the
>>>>>> AP feature is installed?
>>>>>
>>>>> I would only call kvm_s390_vcpu_crypto_setup() from inside
>>>>> kvm_arch_vcpu_init()
>>>>> as it is already.
>>>> It is not called from kvm_arch_vcpu_init(), it is called from
>>>> kvm_arch_vcpu_setup().
>>>
>>> hum, sorry for this.
>>> However, the idea pertains, not to call this function from inside an
>>> ioctl changing crypto parameters, but only during vcpu creation.
>> Unfortunately, the ioctl does not get called until after the vcpus
>> are created (see my comments below)
>
> That is why I think you should not change the ECA field from the
> crypto ioctl but only during the vcpu initialization phase.
By what means do you suggest we do that?
>
>
>>>
>>>
>>>
>>>> Also,
>>>> this loop was already here, I did not put it in. Assuming whomever
>>>> put it there did so
>>>> for a reason, it is not my place to remove it. According to a trace
>>>> I ran, the calls to this
>>>> function occur after the vcpus are created. Consequently, the
>>>> kvm_s390_vcpu_crypto_setup()
>>>> function would not be called without the loop and neither the key
>>>> wrapping support nor the
>>>> ECA_APIE would be configured in the vcpu's SIE descriptor.
>>>>
>>>> If you have a better idea for where/how to set this flag, I'm all
>>>> ears. It would be nice if it could be set before the vcpus are
>>>> created, but I haven't
>>>> found a good candidate. I suspect that the loop was put in to make
>>>> sure that all vcpus
>>>> get updated regardless of whether they are running or not, but I
>>>> don't know what happens
>>>> after a vcpu is kicked out of SIE. I suspect, as Halil surmised,
>>>> that QEMU
>>>> restores the vcpus to SIE. This would seemingly cause the
>>>> kvm_arch_vcpu_setup() to get
>>>> called at which time the ECA_APIE value as well as the key wrapping
>>>> values will get set.
>>>> If somebody has knowledge of the flow here, please feel free to
>>>> pitch in.
>>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Pierre
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


2018-03-20 18:00:20

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/16/2018 03:51 AM, Pierre Morel wrote:
> On 16/03/2018 00:39, Tony Krowiak wrote:
>> On 03/15/2018 01:56 PM, Pierre Morel wrote:
>>> On 15/03/2018 18:21, Tony Krowiak wrote:
>>>> On 03/15/2018 11:45 AM, Pierre Morel wrote:
>>>>> On 15/03/2018 16:26, Tony Krowiak wrote:
>>>>>> On 03/15/2018 09:00 AM, Pierre Morel wrote:
>>>>>>> On 14/03/2018 22:57, Halil Pasic wrote:
>>>>>>>>
>>>>>>>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>>>>>>>> The VFIO AP device model exploits interpretive execution of AP
>>>>>>>>> instructions (APIE) to provide guests passthrough access to AP
>>>>>>>>> devices. This patch introduces a new device attribute in the
>>>>>>>>> KVM_S390_VM_CRYPTO device attribute group to set APIE from
>>>>>>>>> the VFIO AP device defined on the guest.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>>>>>>> ---
>>>>>>>> [..]
>>>>>>>>
>>>>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>>>>> index a60c45b..bc46b67 100644
>>>>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>>>>> @@ -815,6 +815,19 @@ static int kvm_s390_vm_set_crypto(struct
>>>>>>>>> kvm *kvm, struct kvm_device_attr *attr)
>>>>>>>>> sizeof(kvm->arch.crypto.crycb->dea_wrapping_key_mask));
>>>>>>>>> VM_EVENT(kvm, 3, "%s", "DISABLE: DEA keywrapping
>>>>>>>>> support");
>>>>>>>>> break;
>>>>>>>>> + case KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>>>> + if (attr->addr) {
>>>>>>>>> + if (!test_kvm_cpu_feat(kvm,
>>>>>>>>> KVM_S390_VM_CPU_FEAT_AP))
>>>>>>>> Unlock mutex before returning?
>>>>>>>>
>>>>>>>> Maybe flip conditions (don't allow manipulating apie if feature
>>>>>>>> not there).
>>>>>>>> Clearing the anyways clear apie if feature not there ain't too
>>>>>>>> bad, but
>>>>>>>> rejecting the operation appears nicer to me.
>>>>>>>>
>>>>>>>>> + return -EOPNOTSUPP;
>>>>>>>>> + kvm->arch.crypto.apie = 1;
>>>>>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>>>>>> + "ENABLE: AP interpretive execution");
>>>>>>>>> + } else {
>>>>>>>>> + kvm->arch.crypto.apie = 0;
>>>>>>>>> + VM_EVENT(kvm, 3, "%s",
>>>>>>>>> + "DISABLE: AP interpretive execution");
>>>>>>>>> + }
>>>>>>>>> + break;
>>>>>>>>> default:
>>>>>>>>> mutex_unlock(&kvm->lock);
>>>>>>>>> return -ENXIO;
>>>>>>>> I wonder how the loop after this switch works for
>>>>>>>> KVM_S390_VM_CRYPTO_INTERPRET_AP:
>>>>>>>>
>>>>>>>> kvm_for_each_vcpu(i, vcpu, kvm) {
>>>>>>>> kvm_s390_vcpu_crypto_setup(vcpu);
>>>>>>>> exit_sie(vcpu);
>>>>>>>> }
>>>>>>>>
>>>>>>>> From not doing something like for KVM_S390_VM_CRYPTO_INTERPRET_AP
>>>>>>>>
>>>>>>>> if (kvm->created_vcpus) {
>>>>>>>> mutex_unlock(&kvm->lock);
>>>>>>>> return -EBUSY;
>>>>>>>> and from the aforementioned loop I guess ECA.28 can be changed
>>>>>>>> for a running guest.
>>>>>>>>
>>>>>>>> If there are running vcpus when KVM_S390_VM_CRYPTO_INTERPRET_AP is
>>>>>>>> changed (set) these will be taken out of SIE by exit_sie().
>>>>>>>> Then for the
>>>>>>>> corresponding threads the control probably goes to QEMU (the
>>>>>>>> emulator in
>>>>>>>> the userspace). And it puts that vcpu back into the SIE, and
>>>>>>>> then that
>>>>>>>> cpu starts acting according to the new ECA.28 value. While
>>>>>>>> other vcpus
>>>>>>>> may still work with the old value of ECA.28.
>>>>>>>>
>>>>>>>> I'm not saying what I describe above is necessarily something
>>>>>>>> broken.
>>>>>>>> But I would like to have it explained, why is it OK -- provided
>>>>>>>> I did not
>>>>>>>> make any errors in my reasoning (assumptions included).
>>>>>>>>
>>>>>>>> Can you help me understand this code?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Halil
>>>>>>>>
>>>>>>>> [..]
>>>>>>>>
>>>>>>>
>>>>>>> I have the same concerns as Halil.
>>>>>>>
>>>>>>> We do not need to change the virtulization type
>>>>>>> (hardware/software) on the fly for the current use case.
>>>>>>>
>>>>>>> Couldn't we delay this until we have one and in between only
>>>>>>> make the vCPU hotplug clean?
>>>>>>>
>>>>>>> We only need to let the door open for the day we have such a use
>>>>>>> case.
>>>>>> Are you suggesting this code be removed? If so, then where and
>>>>>> under what conditions would
>>>>>> you suggest setting ECA.28 given you objected to setting it based
>>>>>> on whether the
>>>>>> AP feature is installed?
>>>>>
>>>>> I would only call kvm_s390_vcpu_crypto_setup() from inside
>>>>> kvm_arch_vcpu_init()
>>>>> as it is already.
>>>> It is not called from kvm_arch_vcpu_init(), it is called from
>>>> kvm_arch_vcpu_setup().
>>>
>>> hum, sorry for this.
>>> However, the idea pertains, not to call this function from inside an
>>> ioctl changing crypto parameters, but only during vcpu creation.
>> Unfortunately, the ioctl does not get called until after the vcpus
>> are created (see my comments below)
>
> That is why I think you should not change the ECA field from the
> crypto ioctl but only during the vcpu initialization phase.
I spoke with Christian this morning and he made a suggestion which I
think would provide the best solution here.
This is my proposal:
1. Get rid of the KVM_S390_VM_CRYPTO_INTERPRET_AP device attribute and
   return to setting ECA.28 from the mdev device open callback.
2. Since there may be vcpus online at the time the mdev device open is
   called, we must first take all running vcpus out of SIE and block
   them. Christian suggested the kvm_s390_vcpu_block_all(struct kvm *kvm)
   function will do the trick. So I propose introducing a function like
   the following to be called during mdev open:

    int kvm_ap_set_interpretive_exec(struct kvm *kvm, bool enable)
    {
        int i;
        struct kvm_vcpu *vcpu;

        if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
            return -EOPNOTSUPP;

        mutex_lock(&kvm->lock);

        kvm_s390_vcpu_block_all(kvm);

        kvm_for_each_vcpu(i, vcpu, kvm) {
            if (enable)
                vcpu->arch.sie_block->eca |= ECA_APIE;
            else
                vcpu->arch.sie_block->eca &= ~ECA_APIE;
        }

        kvm_s390_vcpu_unblock_all(kvm);

        mutex_unlock(&kvm->lock);

        return 0;
    }

This interface allows us to set ECA.28 even if vcpus are running.
>
>
>>>
>>>
>>>
>>>> Also,
>>>> this loop was already here, I did not put it in. Assuming whomever
>>>> put it there did so
>>>> for a reason, it is not my place to remove it. According to a trace
>>>> I ran, the calls to this
>>>> function occur after the vcpus are created. Consequently, the
>>>> kvm_s390_vcpu_crypto_setup()
>>>> function would not be called without the loop and neither the key
>>>> wrapping support nor the
>>>> ECA_APIE would be configured in the vcpu's SIE descriptor.
>>>>
>>>> If you have a better idea for where/how to set this flag, I'm all
>>>> ears. It would be nice if it could be set before the vcpus are
>>>> created, but I haven't
>>>> found a good candidate. I suspect that the loop was put in to make
>>>> sure that all vcpus
>>>> get updated regardless of whether they are running or not, but I
>>>> don't know what happens
>>>> after a vcpu is kicked out of SIE. I suspect, as Halil surmised,
>>>> that QEMU
>>>> restores the vcpus to SIE. This would seemingly cause the
>>>> kvm_arch_vcpu_setup() to get
>>>> called at which time the ECA_APIE value as well as the key wrapping
>>>> values will get set.
>>>> If somebody has knowledge of the flow here, please feel free to
>>>> pitch in.
>>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Pierre
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


2018-03-20 22:50:02

by Halil Pasic

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution



On 03/20/2018 06:58 PM, Tony Krowiak wrote:
> I spoke with Christian this morning and he made a suggestion which I think would provide the best solution here.
> This is my proposal:
> 1. Get rid of the KVM_S390_VM_CRYPTO_INTERPRET_AP device attribute and return to setting ECA.28 from the
>    mdev device open callback.
> 2. Since there may be vcpus online at the time the mdev device open is called, we must first take all running vcpus out of
>    SIE and block them. Christian suggested the kvm_s390_vcpu_block_all(struct kvm *kvm) function will do the trick. So I
>    propose introducing a function like the following to be called during mdev open:

There is one thing you missed, otherwise I'm *very* satisfied with this
proposal.

What you have missed IMHO is vcpu hotplug. So IMHO you should keep
kvm->arch.crypto.apie, and update it accordingly ...


>
>     int kvm_ap_set_interpretive_exec(struct kvm *kvm, bool enable)
>     {
>         int i;
>         struct kvm_vcpu *vcpu;
>
>         if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>             return -EOPNOTSUPP;
>
>         mutex_lock(&kvm->lock);
>
>         kvm_s390_vcpu_block_all(kvm);

... let's say here.

>
>         kvm_for_each_vcpu(i, vcpu, kvm) {

And here you can call kvm_s390_vcpu_crypto_setup(vcpu) (the changes to
this function will be required for hotplug) if you like

>             if (enable)
>                 vcpu->arch.sie_block->eca |= ECA_APIE;
>             else
>                 vcpu->arch.sie_block->eca &= ~ECA_APIE;

or keep this stuff, it does not really matter to me.

>         }
>
>         kvm_s390_vcpu_unblock_all(kvm);
>
>         mutex_unlock(&kvm->lock);
>
>         return 0;
>     }
>
>    This interface allows us to set ECA.28 even if vcpus are running

I tend to agree. I will give it a proper review when this gets more
formal (e.g. v4 (preferably) or patches to be fixed up to this series).

Please don't forget to revisit the discussion on kvm_s390_vm_set_crypto:
if the mechanism there isn't right for ECA.28 I think you should tell
us why it's OK for the other attributes if it's OK. If it is not then
I guess you will want to do a stand alone patch for that.
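
The combined idea (block all vcpus, update a VM-wide flag, apply it per
vcpu, and let hotplugged vcpus pick it up during setup) can be sketched
as a small userspace model. This is only an illustration under stated
assumptions: the model_* names are hypothetical, the structs stand in for
struct kvm and the SIE block, and locking via kvm->lock is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* ECA.28 in s390 bit numbering, where bit 0 is the leftmost of 32 bits. */
#define ECA_APIE (1u << (31 - 28))
#define MODEL_MAX_VCPUS 8

struct model_vcpu {
	uint32_t eca;	/* stands in for vcpu->arch.sie_block->eca */
	bool blocked;	/* true while the vcpu is kept out of SIE */
};

struct model_vm {
	bool apie;	/* stands in for kvm->arch.crypto.apie */
	int nr_vcpus;
	struct model_vcpu vcpus[MODEL_MAX_VCPUS];
};

/* Per-vcpu setup: derive the ECA bit from the VM-wide flag. */
void model_vcpu_crypto_setup(const struct model_vm *vm, struct model_vcpu *vcpu)
{
	if (vm->apie)
		vcpu->eca |= ECA_APIE;
	else
		vcpu->eca &= ~ECA_APIE;
}

/* Update the VM-wide flag while all vcpus are blocked, then apply it. */
void model_set_interpretive_exec(struct model_vm *vm, bool enable)
{
	int i;

	/* The real code would hold kvm->lock around this whole sequence. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = true;

	vm->apie = enable;
	for (i = 0; i < vm->nr_vcpus; i++)
		model_vcpu_crypto_setup(vm, &vm->vcpus[i]);

	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = false;
}

/* Hotplug: a new vcpu inherits the current setting. Returns its index or -1. */
int model_vcpu_hotplug(struct model_vm *vm)
{
	struct model_vcpu *vcpu;

	if (vm->nr_vcpus >= MODEL_MAX_VCPUS)
		return -1;
	vcpu = &vm->vcpus[vm->nr_vcpus];
	vcpu->eca = 0;
	vcpu->blocked = false;
	model_vcpu_crypto_setup(vm, vcpu);
	return vm->nr_vcpus++;
}
```

Keeping the flag in the VM structure is what makes the hotplug case
work: a vcpu created after the mdev open still sees the right value when
its setup function runs.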


2018-03-26 08:46:26

by Cornelia Huck

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On Thu, 15 Mar 2018 15:55:39 +0100
Pierre Morel <[email protected]> wrote:

> On 15/03/2018 15:48, Tony Krowiak wrote:
> > On 03/15/2018 08:26 AM, Pierre Morel wrote:
> >> On 14/03/2018 19:25, Tony Krowiak wrote:

> >>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
> >>> index a3dbd45..4ca9077 100644
> >>> --- a/arch/s390/kvm/Kconfig
> >>> +++ b/arch/s390/kvm/Kconfig
> >>> @@ -33,6 +33,7 @@ config KVM
> >>>       select HAVE_KVM_INVALID_WAKEUPS
> >>>       select SRCU
> >>>       select KVM_VFIO
> >>> +    select ZCRYPT
> >>
> >> I do not think it is a good solution to *always* enable ZCRYPT
> >> when we have KVM.
> > If CONFIG_ZCRYPT is not selected, then the kvm_ap_apxa_installed()
> > function will not compile
> > because it calls a zcrypt interface. How would you suggest we make
> > sure zcrypt interfaces
> > used in KVM are built if CONFIG_ZCRYPT is not selected?
>
> if zcrypt is not configured, I suppose that the KVM code initializaing CRYCB
> has no use but the function will be called from KVM.
> So I would do something like:
>
> #ifdef ZCRYPT
> external definitions.
> #else
> stubs returning error -ENOZCRYPT (or whatever)
> #endif

The kvm code used some kind of detection for crycb before (IIRC it was
for the key-wrapping stuff). I assume that usage is independent of
zcrypt driver usage in the host?

So, I think that the apxa detection function should be moved to s390
architecture base code and not be conditional on anything.
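
For comparison, the stub alternative quoted above might look roughly
like the following standalone sketch. The names zcrypt_query_apxa() and
model_kvm_ap_apxa_installed() are hypothetical, and -ENODEV is only a
stand-in for the "-ENOZCRYPT (or whatever)" placeholder, which is not an
existing error code. CONFIG_ZCRYPT is left undefined here so the stub
path is the one that compiles.

```c
#include <errno.h>

#ifdef CONFIG_ZCRYPT
/* Provided by the zcrypt code when it is part of the build. */
int zcrypt_query_apxa(void);
#else
/* Stub: callers still compile and get a well-defined error. */
static inline int zcrypt_query_apxa(void)
{
	return -ENODEV;
}
#endif

/* Hypothetical KVM-side caller: any error is treated as "not installed". */
int model_kvm_ap_apxa_installed(void)
{
	return zcrypt_query_apxa() > 0;
}
```

In-kernel code would typically test a tristate symbol with
IS_ENABLED(CONFIG_ZCRYPT) rather than a plain #ifdef, since a modular
build defines CONFIG_ZCRYPT_MODULE instead. Moving the detection into
always-built architecture code, as suggested above, avoids the need for
a stub entirely.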

2018-03-27 11:00:25

by Cornelia Huck

Subject: Re: [PATCH v3 03/14] KVM: s390: CPU model support for AP virtualization

On Wed, 14 Mar 2018 14:25:43 -0400
Tony Krowiak <[email protected]> wrote:

> Introduces a new CPU model feature and two CPU model
> facilities to support AP virtualization for KVM guests.
>
> CPU model feature:
>
> The KVM_S390_VM_CPU_FEAT_AP feature indicates that
> AP instructions are available on the guest. This
> feature will be enabled by the kernel only if the AP
> instructions are installed on the linux host. This feature
> must be specifically turned on for the KVM guest from
> userspace to use the VFIO AP device driver for guest
> access to AP devices.
>
> CPU model facilities:
>
> 1. AP Query Configuration Information (QCI) facility is installed.
>
> This is indicated by setting facilities bit 12 for
> the guest. The kernel will not enable this facility
> for the guest if it is not set on the host. This facility
> must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
> feature is not installed.
>
> 2. AP Facilities Test facility (APFT) is installed.
>
> This is indicated by setting facilities bit 15 for
> the guest. The kernel will not enable this facility for
> the guest if it is not set on the host. This facility
> must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
> feature is not installed.
>
> Reviewed-by: Christian Borntraeger <[email protected]>
> Reviewed-by: Halil Pasic <[email protected]>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/include/uapi/asm/kvm.h | 1 +
> arch/s390/kvm/kvm-s390.c | 4 ++++
> arch/s390/tools/gen_facilities.c | 2 ++
> 4 files changed, 8 insertions(+), 0 deletions(-)
>

> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index c47731d..a60c45b 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -350,6 +350,10 @@ static void kvm_s390_cpu_feat_init(void)
>
> if (MACHINE_HAS_ESOP)
> allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
> +
> + if (ap_instructions_installed()) /* AP instructions installed on host */
> + allow_cpu_feat(KVM_S390_VM_CPU_FEAT_AP);

That's another dependency of the base kvm-s390 module on zcrypt, which
I don't like at all.

There are two possibilities here:
- Exposing the features makes sense even if no zcrypt driver is active
in the host. Then, ap_instructions_installed() needs to be moved into
always-built code (see my comments for the interface in patch 1).
- Exposing the features makes sense only if we actually want to make
vfio-ap available. Then we should provide the proper check in the
vfio-ap parts (which depends on zcrypt) and stub it out if vfio-ap is
not configured.

> +
> /*
> * We need SIE support, ESOP (PROT_READ protection for gmap_shadow),
> * 64bit SCAO (SCA passthrough) and IDTE (for gmap_shadow unshadowing).

2018-03-27 11:18:44

by Cornelia Huck

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On Thu, 15 Mar 2018 13:25:25 -0400
Tony Krowiak <[email protected]> wrote:

> On 03/15/2018 09:25 AM, Pierre Morel wrote:
> > On 14/03/2018 19:25, Tony Krowiak wrote:

> >> +config VFIO_AP
> >> + def_tristate m
> > not sure it must be module by default.
> > I would not set it by default.
> Connie also asked about this in the last review, so I will go ahead
> and change it.
> >
> >> + prompt "VFIO support for AP devices"
> >> + depends on ZCRYPT && VFIO_MDEV_DEVICE
> >
> > VFIO_MDEV_DEVICE is a general feature *needed* by VFIO_AP
> > and has no use case by its own. If it is set it is obviously because some
> > mediated device drivers needs it.
> > while ZCRYPT is a Z feature which may be set without VFIO_AP.
> >
> > So you need:
> >
> > config VFIO_AP
> > def_tristate n
> > prompt "VFIO support for AP devices"
> > depends on ZCRYPT
> > select VFIO_MDEV
> > select VFIO_MDEV_DEVICE
> > ...
> I was thinking the same just yesterday and I agree, this makes sense.

OTOH, nobody else seems to do a select on these symbols so far.

If you decide to go that route, you'll also need to depend on VFIO
(otherwise you could end up selecting symbols with unmet dependencies).
All in all, I prefer the 'depends' approach.
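
As a sketch of the 'depends' variant with the default switched off (this
is only the entry quoted above with def_tristate changed, not a tested
configuration):

```kconfig
config VFIO_AP
	def_tristate n
	prompt "VFIO support for AP devices"
	depends on ZCRYPT && VFIO_MDEV_DEVICE
```

With this form the option only becomes visible once the mdev
infrastructure has been enabled, and no symbol can be forced on with
unmet dependencies.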

2018-03-27 11:20:41

by Cornelia Huck

Subject: Re: [PATCH v3 11/14] s390: vfio-ap: sysfs interface to view matrix mdev matrix

On Thu, 15 Mar 2018 10:42:33 +0100
Pierre Morel <[email protected]> wrote:

> On 14/03/2018 19:25, Tony Krowiak wrote:
> > Provides a sysfs interface to view the AP matrix configured for the
> > mediated matrix device.
> >
> > The relevant sysfs structures are:
> >
> > /sys/devices/vfio_ap
> > ... [matrix]
> > ...... [mdev_supported_types]
> > ......... [vfio_ap-passthrough]
> > ............ [devices]
> > ...............[$uuid]
> > .................. matrix
> >
> > To view the matrix configured for the mediated matrix device,
> > print the matrix file:
> >
> > cat matrix
> >
> > Signed-off-by: Tony Krowiak <[email protected]>
> > ---
> > drivers/s390/crypto/vfio_ap_ops.c | 39 +++++++++++++++++++++++++++++++++++++
> > 1 files changed, 39 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> > index 461d450..04f7a92 100644
> > --- a/drivers/s390/crypto/vfio_ap_ops.c
> > +++ b/drivers/s390/crypto/vfio_ap_ops.c
> > @@ -692,6 +692,44 @@ static ssize_t control_domains_show(struct device *dev,
> > }
> > DEVICE_ATTR_RO(control_domains);
> >
> > +static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
> > + char *buf)
> > +{
> > + struct mdev_device *mdev = mdev_from_dev(dev);
> > + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> > + char *bufpos = buf;
> > + unsigned long apid;
> > + unsigned long apqi;
> > + int nchars = 0;
> > + int n;
> > +
> > + n = sprintf(bufpos, "ADAPTER.DOMAIN\n");
>
> For easy parsing it is better to only report the interesting data
> and let a user space utility make fancy presentation.

+1. Attributes should normally be simple, one-value things.

>
> > + bufpos += n;
> > + nchars += n;
> > +
> > + n = sprintf(bufpos, "--------------\n");
> > + bufpos += n;
> > + nchars += n;
> > +
> > + for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
> > + matrix_mdev->matrix->apm_max) {
> > + n = sprintf(bufpos, "%02lx\n", apid);
> > + bufpos += n;
> > + nchars += n;
> > +
> > + for_each_set_bit_inv(apqi, matrix_mdev->matrix->aqm,
> > + matrix_mdev->matrix->aqm_max) {
> > + n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi);
> > + bufpos += n;
> > + nchars += n;
> > + }
> > + }
> > +
> > + return nchars;
> > +}
> > +DEVICE_ATTR_RO(matrix);
> > +
> > +
> > static struct attribute *vfio_ap_mdev_attrs[] = {
> > &dev_attr_assign_adapter.attr,
> > &dev_attr_unassign_adapter.attr,
> > @@ -700,6 +738,7 @@ static ssize_t control_domains_show(struct device *dev,
> > &dev_attr_assign_control_domain.attr,
> > &dev_attr_unassign_control_domain.attr,
> > &dev_attr_control_domains.attr,
> > + &dev_attr_matrix.attr,
> > NULL,
> > };
> >
>
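
To illustrate the review comment about keeping the attribute easy to
parse, here is a userspace sketch that emits only "adapter.domain" pairs,
one per line, with no header or separator. It is an assumption-laden
model, not the driver code: format_matrix() and test_bit_inv64() are
hypothetical names, the masks are limited to 64 bits, and the
per-adapter line of the original patch is dropped.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit 0 is the leftmost (most significant) bit, as in s390 numbering. */
int test_bit_inv64(unsigned int nr, uint64_t mask)
{
	return (int)((mask >> (63 - nr)) & 1);
}

/*
 * Write one "apid.apqi" pair per line into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_matrix(char *buf, size_t size, uint64_t apm, uint64_t aqm)
{
	size_t used = 0;
	unsigned int apid, apqi;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	for (apid = 0; apid < 64; apid++) {
		if (!test_bit_inv64(apid, apm))
			continue;
		for (apqi = 0; apqi < 64; apqi++) {
			int n;

			if (!test_bit_inv64(apqi, aqm))
				continue;
			n = snprintf(buf + used, size - used,
				     "%02x.%04x\n", apid, apqi);
			if (n < 0 || (size_t)n >= size - used)
				return -1;
			used += (size_t)n;
		}
	}
	return (int)used;
}
```

A userspace tool can then split each line on the dot and build whatever
table layout it wants.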


2018-03-27 11:25:21

by Pierre Morel

Subject: Re: [PATCH v3 03/14] KVM: s390: CPU model support for AP virtualization

On 27/03/2018 12:59, Cornelia Huck wrote:
> On Wed, 14 Mar 2018 14:25:43 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Introduces a new CPU model feature and two CPU model
>> facilities to support AP virtualization for KVM guests.
>>
>> CPU model feature:
>>
>> The KVM_S390_VM_CPU_FEAT_AP feature indicates that
>> AP instructions are available on the guest. This
>> feature will be enabled by the kernel only if the AP
>> instructions are installed on the linux host. This feature
>> must be specifically turned on for the KVM guest from
>> userspace to use the VFIO AP device driver for guest
>> access to AP devices.
>>
>> CPU model facilities:
>>
>> 1. AP Query Configuration Information (QCI) facility is installed.
>>
>> This is indicated by setting facilities bit 12 for
>> the guest. The kernel will not enable this facility
>> for the guest if it is not set on the host. This facility
>> must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
>> feature is not installed.
>>
>> 2. AP Facilities Test facility (APFT) is installed.
>>
>> This is indicated by setting facilities bit 15 for
>> the guest. The kernel will not enable this facility for
>> the guest if it is not set on the host. This facility
>> must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
>> feature is not installed.
>>
>> Reviewed-by: Christian Borntraeger <[email protected]>
>> Reviewed-by: Halil Pasic <[email protected]>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/include/asm/kvm_host.h | 1 +
>> arch/s390/include/uapi/asm/kvm.h | 1 +
>> arch/s390/kvm/kvm-s390.c | 4 ++++
>> arch/s390/tools/gen_facilities.c | 2 ++
>> 4 files changed, 8 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index c47731d..a60c45b 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -350,6 +350,10 @@ static void kvm_s390_cpu_feat_init(void)
>>
>> if (MACHINE_HAS_ESOP)
>> allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
>> +
>> + if (ap_instructions_installed()) /* AP instructions installed on host */
>> + allow_cpu_feat(KVM_S390_VM_CPU_FEAT_AP);
> That's another dependency of the base kvm-s390 module on zcrypt, which
> I don't like at all.

In fact there is a tricky thing about zcrypt: even if it is
configured as a module (CONFIG_ZCRYPT=m), the AP bus is built
statically into the kernel.
See drivers/s390/crypto/Makefile
"
ap-objs := ap_bus.o ap_card.o ap_queue.o
obj-$(subst m,y,$(CONFIG_ZCRYPT)) += ap.o
"
ugly isn't it?

>
> There are two possibilities here:
> - Exposing the features makes sense even if no zcrypt driver is active
> in the host. Then, ap_instructions_installed() needs to be moved into
> always-built code (see my comments for the interface in patch 1).

This is what we need for future enhancement I think.

> - Exposing the features makes sense only if we actually want to make
> vfio-ap available. Then we should provide the proper check in the
> vfio-ap parts (which depends on zcrypt) and stub it out if vfio-ap is
> not configured.
>
>> +
>> /*
>> * We need SIE support, ESOP (PROT_READ protection for gmap_shadow),
>> * 64bit SCAO (SCA passthrough) and IDTE (for gmap_shadow unshadowing).


--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-27 11:32:16

by Cornelia Huck

Subject: Re: [PATCH v3 03/14] KVM: s390: CPU model support for AP virtualization

On Tue, 27 Mar 2018 13:22:56 +0200
Pierre Morel <[email protected]> wrote:

> On 27/03/2018 12:59, Cornelia Huck wrote:
> > On Wed, 14 Mar 2018 14:25:43 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Introduces a new CPU model feature and two CPU model
> >> facilities to support AP virtualization for KVM guests.
> >>
> >> CPU model feature:
> >>
> >> The KVM_S390_VM_CPU_FEAT_AP feature indicates that
> >> AP instructions are available on the guest. This
> >> feature will be enabled by the kernel only if the AP
> >> instructions are installed on the linux host. This feature
> >> must be specifically turned on for the KVM guest from
> >> userspace to use the VFIO AP device driver for guest
> >> access to AP devices.
> >>
> >> CPU model facilities:
> >>
> >> 1. AP Query Configuration Information (QCI) facility is installed.
> >>
> >> This is indicated by setting facilities bit 12 for
> >> the guest. The kernel will not enable this facility
> >> for the guest if it is not set on the host. This facility
> >> must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
> >> feature is not installed.
> >>
> >> 2. AP Facilities Test facility (APFT) is installed.
> >>
> >> This is indicated by setting facilities bit 15 for
> >> the guest. The kernel will not enable this facility for
> >> the guest if it is not set on the host. This facility
> >> must not be set by userspace if the KVM_S390_VM_CPU_FEAT_AP
> >> feature is not installed.
> >>
> >> Reviewed-by: Christian Borntraeger <[email protected]>
> >> Reviewed-by: Halil Pasic <[email protected]>
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> arch/s390/include/asm/kvm_host.h | 1 +
> >> arch/s390/include/uapi/asm/kvm.h | 1 +
> >> arch/s390/kvm/kvm-s390.c | 4 ++++
> >> arch/s390/tools/gen_facilities.c | 2 ++
> >> 4 files changed, 8 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> >> index c47731d..a60c45b 100644
> >> --- a/arch/s390/kvm/kvm-s390.c
> >> +++ b/arch/s390/kvm/kvm-s390.c
> >> @@ -350,6 +350,10 @@ static void kvm_s390_cpu_feat_init(void)
> >>
> >> if (MACHINE_HAS_ESOP)
> >> allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
> >> +
> >> + if (ap_instructions_installed()) /* AP instructions installed on host */
> >> + allow_cpu_feat(KVM_S390_VM_CPU_FEAT_AP);
> > That's another dependency of the base kvm-s390 module on zcrypt, which
> > I don't like at all.
>
> In fact there is a tricky thing about zcrypt it is that even it is
> configured a a module
> CONFIG_ZCRYPT=M, the AP BUS is built statically with the kernel.
> See drivers/s390/crypto/Makefile
> "
> ap-objs := ap_bus.o ap_card.o ap_queue.o
> obj-$(subst m,y,$(CONFIG_ZCRYPT)) += ap.o
> "
> ugly isn't it?

Yeah, I found it... interesting the first time I saw it.

>
> >
> > There are two possibilities here:
> > - Exposing the features makes sense even if no zcrypt driver is active
> > in the host. Then, ap_instructions_installed() needs to be moved into
> > always-built code (see my comments for the interface in patch 1).
>
> This is what we need for future enhancement I think.

OK, so that function needs to go into whatever place the interface used
in patch 1 goes to as well.

>
> > - Exposing the features makes sense only if we actually want to make
> > vfio-ap available. Then we should provide the proper check in the
> > vfio-ap parts (which depends on zcrypt) and stub it out if vfio-ap is
> > not configured.
> >
> >> +
> >> /*
> >> * We need SIE support, ESOP (PROT_READ protection for gmap_shadow),
> >> * 64bit SCAO (SCA passthrough) and IDTE (for gmap_shadow unshadowing).
>
>


2018-03-27 14:46:37

by Pierre Morel

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On 27/03/2018 13:17, Cornelia Huck wrote:
> On Thu, 15 Mar 2018 13:25:25 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 03/15/2018 09:25 AM, Pierre Morel wrote:
>>> On 14/03/2018 19:25, Tony Krowiak wrote:
>>>> +config VFIO_AP
>>>> + def_tristate m
>>> not sure it must be module by default.
>>> I would not set it by default.
>> Connie also asked about this in the last review, so I will go ahead
>> and change it.
>>>
>>>> + prompt "VFIO support for AP devices"
>>>> + depends on ZCRYPT && VFIO_MDEV_DEVICE
>>> VFIO_MDEV_DEVICE is a general feature *needed* by VFIO_AP
>>> and has no use case by its own. If it is set it is obviously because some
>>> mediated device drivers needs it.
>>> while ZCRYPT is a Z feature which may be set without VFIO_AP.
>>>
>>> So you need:
>>>
>>> config VFIO_AP
>>> def_tristate n
>>> prompt "VFIO support for AP devices"
>>> depends on ZCRYPT
>>> select VFIO_MDEV
>>> select VFIO_MDEV_DEVICE
>>> ...
>> I was thinking the same just yesterday and I agree, this makes sense.
> OTOH, nobody else seems to do a select on these symbols so far.
>
> If you decide to go that route, you'll also need to depend on VFIO

I think a select is better (again).

> (otherwise you could end up selecting symbols with unmet dependencies).
> All in all, I prefer the 'depends' approach.
>
Why do you prefer this approach?


I can tell you why I prefer a mixed approach:

We have two tools, depends and select.

It seems to me that depends should be used for things we can not choose
to be there or not, but things that just are there, like hardware
dependencies. For example MMU, CPU type, CRYPTO hardware...

Select on the other hand is useful to choose things that we need like
libraries, VFIO, VIRTIO, crypto libraries etc.

This policy is clear and makes it easy to choose functionalities and
get the utilities automatically.

On the other hand, using only depends hides the
functionalities behind the utilities.


--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


2018-03-29 18:59:08

by Tony Krowiak

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On 03/26/2018 04:44 AM, Cornelia Huck wrote:
> On Thu, 15 Mar 2018 15:55:39 +0100
> Pierre Morel <[email protected]> wrote:
>
>> On 15/03/2018 15:48, Tony Krowiak wrote:
>>> On 03/15/2018 08:26 AM, Pierre Morel wrote:
>>>> On 14/03/2018 19:25, Tony Krowiak wrote:
>>>>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
>>>>> index a3dbd45..4ca9077 100644
>>>>> --- a/arch/s390/kvm/Kconfig
>>>>> +++ b/arch/s390/kvm/Kconfig
>>>>> @@ -33,6 +33,7 @@ config KVM
>>>>> select HAVE_KVM_INVALID_WAKEUPS
>>>>> select SRCU
>>>>> select KVM_VFIO
>>>>> + select ZCRYPT
>>>> I do not think it is a good solution to *always* enable ZCRYPT
>>>> when we have KVM.
>>> If CONFIG_ZCRYPT is not selected, then the kvm_ap_apxa_installed()
>>> function will not compile
>>> because it calls a zcrypt interface. How would you suggest we make
>>> sure zcrypt interfaces
>>> used in KVM are built if CONFIG_ZCRYPT is not selected?
>> if zcrypt is not configured, I suppose that the KVM code initializing CRYCB
>> has no use but the function will be called from KVM.
>> So I would do something like:
>>
>> #ifdef ZCRYPT
>> external definitions.
>> #else
>> stubs returning error -ENOZCRYPT (or whatever)
>> #endif
> The kvm code used some kind of detection for crycb before (IIRC it was
> for the key-wrapping stuff). I assume that usage is independent of
> zcrypt driver usage in the host?
A function in kvm-s390.c was replaced with a call to the function in
ap_bus.c that was externalized in patch 2/14. This was done to remove
duplicate code. Since zcrypt is built into the kernel, I didn't think
it would be a problem, but apparently because of the way zcrypt is
configured, it is still possible to remove it from the kernel build.
>
> So, I think that apxa detection function should be moved to s390
> architecture base code and not be conditional on anything.
I am convinced that the original function from kvm_s390.c should be
restored.
>


2018-04-02 19:55:21

by Tony Krowiak

Subject: Re: [PATCH v3 04/14] KVM: s390: device attribute to set AP interpretive execution

On 03/20/2018 06:48 PM, Halil Pasic wrote:
>
> On 03/20/2018 06:58 PM, Tony Krowiak wrote:
>> I spoke with Christian this morning and he made a suggestion which I think would provide the best solution here.
>> This is my proposal:
>> 1. Get rid of the KVM_S390_VM_CRYPTO_INTERPRET_AP device attribute and return to setting ECA.28 from the
>> mdev device open callback.
>> 2. Since there may be vcpus online at the time the mdev device open is called, we must first take all running vcpus out of
>> SIE and block them. Christian suggested the kvm_s390_vcpu_block_all(struct kvm *kvm) function will do the trick. So I
>> propose introducing a function like the following to be called during mdev open:
> There is one thing you missed, otherwise I'm *very* satisfied with this
> proposal.
>
> What you have missed IMHO is vcpu hotplug. So IMHO you should keep
> kvm->arch.crypto.apie, and update it accordingly ...
I agree, I will fix it.
>
>
>> int kvm_ap_set_interpretive_exec(struct kvm *kvm, bool enable)
>> {
>> int i;
>> struct kvm_vcpu *vcpu;
>>
>> if (!test_kvm_cpu_feat(kvm, KVM_S390_VM_CPU_FEAT_AP))
>> return -EOPNOTSUPP;
>>
>> mutex_lock(&kvm->lock);
>>
>> kvm_s390_vcpu_block_all(kvm);
> ... let's say here.
Yep
>
>> kvm_for_each_vcpu(i, vcpu, kvm) {
> And here you can call kvm_s390_vcpu_crypto_setup(vcpu) (the changes to
> this function will be required for hotplug) if you like
Sounds good to me.
>
>> if (enable)
>> vcpu->arch.sie_block->eca |= ECA_APIE;
>> else
>> vcpu->arch.sie_block->eca &= ~ECA_APIE;
> or keep this stuff, it does not really matter to me.
I'll call the kvm_s390_vcpu_crypto_setup(vcpu) to set ECA_APIE.
>
>> }
>>
>> kvm_s390_vcpu_unblock_all(kvm);
>>
>> mutex_unlock(&kvm->lock);
>>
>> return 0;
>> }
>>
>> This interface allows us to set ECA.28 even if vcpus are running
> I tend to agree. I will give it a proper review when this gets more
> formal (e.g. v4 (preferably) or patches to be fixed up to this series).
>
> Please don't forget to revisit the discussion on kvm_s390_vm_set_crypto:
> if the mechanism there isn't right for ECA.28 I think you should tell
> us why it's OK for the other attributes if it's OK. If it is not then
> I guess you will want to do a stand alone patch for that.
That will no longer be a part of this patch series. We can revisit that as
a separate issue at a future time.
>
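For illustration only, the agreed flow can be modelled in plain user-space C: block every vcpu, record the VM-wide setting, let a per-vcpu setup helper derive ECA from that setting, then unblock. Keeping the VM-wide flag is what lets a hotplugged vcpu pick up the right value later. All `model_*` names are hypothetical stand-ins rather than kernel interfaces, and the value used for `ECA_APIE` (ECA.28) is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ECA_APIE  0x00000008u  /* assumed encoding of ECA.28 */
#define MAX_VCPUS 4

struct model_vcpu {
	uint32_t eca;      /* stand-in for vcpu->arch.sie_block->eca */
	bool blocked;      /* true while the vcpu is kept out of SIE */
};

struct model_vm {
	bool apie;         /* stand-in for kvm->arch.crypto.apie */
	size_t nr_vcpus;
	struct model_vcpu vcpus[MAX_VCPUS];
};

/* Derive the per-vcpu ECA bit from the VM-wide setting. */
static void model_vcpu_crypto_setup(const struct model_vm *vm,
				    struct model_vcpu *vcpu)
{
	if (vm->apie)
		vcpu->eca |= ECA_APIE;
	else
		vcpu->eca &= ~ECA_APIE;
}

/* Block all vcpus, update the VM-wide flag, reapply it, unblock. */
static void model_set_interpretive_exec(struct model_vm *vm, bool enable)
{
	size_t i;

	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = true;

	vm->apie = enable;

	for (i = 0; i < vm->nr_vcpus; i++)
		model_vcpu_crypto_setup(vm, &vm->vcpus[i]);

	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].blocked = false;
}

/* A hotplugged vcpu inherits the setting through the same helper. */
static int model_vcpu_hotplug(struct model_vm *vm)
{
	size_t idx;

	if (vm->nr_vcpus >= MAX_VCPUS)
		return -1;

	idx = vm->nr_vcpus++;
	vm->vcpus[idx].eca = 0;
	vm->vcpus[idx].blocked = false;
	model_vcpu_crypto_setup(vm, &vm->vcpus[idx]);
	return (int)idx;
}
```

The model leaves out locking; in the proposal quoted above the whole sequence runs under kvm->lock.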


2018-04-03 09:57:49

by Cornelia Huck

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On Tue, 27 Mar 2018 16:45:02 +0200
Pierre Morel <[email protected]> wrote:

> On 27/03/2018 13:17, Cornelia Huck wrote:
> > On Thu, 15 Mar 2018 13:25:25 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> On 03/15/2018 09:25 AM, Pierre Morel wrote:
> >>> On 14/03/2018 19:25, Tony Krowiak wrote:
> >>>> +config VFIO_AP
> >>>> + def_tristate m
> >>> not sure it must be module by default.
> >>> I would not set it by default.
> >> Connie also asked about this in the last review, so I will go ahead
> >> and change it.
> >>>
> >>>> + prompt "VFIO support for AP devices"
> >>>> + depends on ZCRYPT && VFIO_MDEV_DEVICE
> >>> VFIO_MDEV_DEVICE is a general feature *needed* by VFIO_AP
> >>> and has no use case by its own. If it is set it is obviously because some
> >>> mediated device drivers needs it.
> >>> while ZCRYPT is a Z feature which may be set without VFIO_AP.
> >>>
> >>> So you need:
> >>>
> >>> config VFIO_AP
> >>> def_tristate n
> >>> prompt "VFIO support for AP devices"
> >>> depends on ZCRYPT
> >>> select VFIO_MDEV
> >>> select VFIO_MDEV_DEVICE
> >>> ...
> >> I was thinking the same just yesterday and I agree, this makes sense.
> > OTOH, nobody else seems to do a select on these symbols so far.
> >
> > If you decide to go that route, you'll also need to depend on VFIO
>
> I think a select is better (again).
>
> > (otherwise you could end up selecting symbols with unmet dependencies).
> > All in all, I prefer the 'depends' approach.
> >
> Why do you prefer this approach?

Hm, I thought I had already written a mail, but apparently I didn't....

> I can tell you why I prefer a mixed approach:
>
> We have two tools, depends and select.
>
> It seems to me that depends should be used for things we can not choose
> to be there or not, but things that just are there, like hardware
> dependencies. For example MMU, CPU type, CRYPTO hardware...
>
> Select on the other hand is useful to choose things that we need like
> libraries, VFIO, VIRTIO, crypto libraries etc.
>
> Using this policy is clear and makes easy to choose functionalities and
> get the utilities automatically.
>
> On the other hand, only using depends makes things to hide the
> functionalities behind the utilities.

My view is the following:
- select is useful for library functionality or for enabling
architecture-specific optimizations (the HAVE_xxx symbols),
especially things you don't want the user to deal with. If you select
something, you need to take care of any dependencies yourself.
- depends is useful for more complex dependencies, and especially
things you don't want automagically enabled. [In modern menuconfig,
it is easy to figure out any missing dependencies for a config option
anyway.]

The mdev infrastructure is too complex to be considered a simple
library IMO (cf. the missing VFIO dependency).

2018-04-03 10:59:39

by Cornelia Huck

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On Wed, 14 Mar 2018 14:25:45 -0400
Tony Krowiak <[email protected]> wrote:

> Introduces a new AP device driver. This device driver
> is built on the VFIO mediated device framework. The framework
> provides sysfs interfaces that facilitate passthrough
> access by guests to devices installed on the linux host.
>
> The VFIO AP device driver will serve two purposes:
>
> 1. Provide the interfaces to reserve AP devices for exclusive
> use by KVM guests. This is accomplished by unbinding the
> devices to be reserved for guest usage from the default AP
> device driver and binding them to the VFIO AP device driver.
>
> 2. Implements the functions, callbacks and sysfs attribute
> interfaces required to create one or more VFIO mediated
> devices each of which will be used to configure the AP
> matrix for a guest and serve as a file descriptor
> for facilitating communication between QEMU and the
> VFIO AP device driver.
>
> When the VFIO AP device driver is initialized:
>
> * It registers with the AP bus for control of type 10 (CEX4
> and newer) AP queue devices. The probe and remove callbacks
> will be provided to support the binding/unbinding of
> AP queue devices to/from the VFIO AP device driver.
>
> * Creates a /sys/devices/vfio-ap/matrix device to hold
> the APQNs of the AP devices bound to the VFIO
> AP device driver and serves as the parent of the
> mediated devices created for each guest.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> MAINTAINERS | 2 +
> arch/s390/Kconfig | 11 +++
> drivers/s390/crypto/Makefile | 4 +
> drivers/s390/crypto/vfio_ap_drv.c | 135 +++++++++++++++++++++++++++++++++
> drivers/s390/crypto/vfio_ap_private.h | 22 ++++++
> include/uapi/linux/vfio.h | 2 +
> 6 files changed, 176 insertions(+), 0 deletions(-)
> create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
> create mode 100644 drivers/s390/crypto/vfio_ap_private.h
>

> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
> new file mode 100644
> index 0000000..459e595
> --- /dev/null
> +++ b/drivers/s390/crypto/vfio_ap_drv.c
> @@ -0,0 +1,135 @@
> +/*
> + * VFIO based AP device driver
> + *
> + * Copyright IBM Corp. 2017

Update to 2018?

> + *
> + * Author(s): Tony Krowiak <[email protected]>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/mod_devicetable.h>
> +#include <linux/slab.h>
> +
> +#include "vfio_ap_private.h"
> +
> +#define VFIO_AP_ROOT_NAME "vfio_ap"
> +#define VFIO_AP_DEV_TYPE_NAME "ap_matrix"
> +#define VFIO_AP_DEV_NAME "matrix"
> +
> +MODULE_AUTHOR("IBM Corporation");
> +MODULE_DESCRIPTION("VFIO AP device driver, Copyright IBM Corp. 2017");
> +MODULE_LICENSE("GPL v2");
> +
> +static struct device *vfio_ap_root_device;
> +
> +static struct ap_driver vfio_ap_drv;
> +
> +static struct ap_matrix *ap_matrix;
> +
> +static struct device_type vfio_ap_dev_type = {
> + .name = VFIO_AP_DEV_TYPE_NAME,
> +};
> +
> +/* Only type 10 adapters (CEX4 and later) are supported
> + * by the AP matrix device driver
> + */
> +static struct ap_device_id ap_queue_ids[] = {
> + { .dev_type = AP_DEVICE_TYPE_CEX4,
> + .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
> + { .dev_type = AP_DEVICE_TYPE_CEX5,
> + .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
> + { .dev_type = AP_DEVICE_TYPE_CEX6,
> + .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
> + { /* end of sibling */ },
> +};
> +
> +MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
> +
> +static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
> +{
> + return 0;
> +}
> +
> +static void vfio_ap_matrix_dev_release(struct device *dev)
> +{
> + struct ap_matrix *ap_matrix = dev_get_drvdata(dev);
> +
> + kfree(ap_matrix);
> +}
> +
> +static int vfio_ap_matrix_dev_create(void)
> +{
> + int ret;
> +
> + vfio_ap_root_device = root_device_register(VFIO_AP_ROOT_NAME);
> +
> + ret = IS_ERR(vfio_ap_root_device);
> + if (ret) {

Minor nit: I'd contract that to

if (IS_ERR(vfio_ap_root_device)) {

(you're writing ret in any case)

> + ret = PTR_ERR(vfio_ap_root_device);
> + goto done;
> + }
> +
> + ap_matrix = kzalloc(sizeof(*ap_matrix), GFP_KERNEL);
> + if (!ap_matrix) {
> + ret = -ENOMEM;
> + goto matrix_alloc_err;
> + }
> +
> + ap_matrix->device.type = &vfio_ap_dev_type;
> + dev_set_name(&ap_matrix->device, "%s", VFIO_AP_DEV_NAME);
> + ap_matrix->device.parent = vfio_ap_root_device;
> + ap_matrix->device.release = vfio_ap_matrix_dev_release;
> + ap_matrix->device.driver = &vfio_ap_drv.driver;
> +
> + ret = device_register(&ap_matrix->device);
> + if (ret)
> + goto matrix_reg_err;
> +
> + goto done;
> +
> +matrix_reg_err:
> + put_device(&ap_matrix->device);
> + kfree(ap_matrix);

The kfree() is wrong: If you called device_register for the embedded
struct device, this needs to be handled via the ->release callback
exclusively (IOW, the put_device() is enough and the kfree needs to go).

> +
> +matrix_alloc_err:
> + root_device_unregister(vfio_ap_root_device);
> +
> +done:
> + return ret;
> +}
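
To make the ownership rule concrete, here is a small user-space model (a sketch, not the driver code): once the embedded device is reference counted, the final put invokes the release callback, and that callback is the only place the containing structure is freed. The `model_*` names are hypothetical, and `register_ret` simply simulates the return value of device_register().

```c
#include <errno.h>
#include <stdlib.h>

struct model_device {
	int refcount;
	void (*release)(struct model_device *dev);
};

struct model_matrix {
	struct model_device device;  /* embedded, first member */
};

static int release_calls;            /* counts release invocations */

static void model_matrix_release(struct model_device *dev)
{
	/* device is the first member, so the pointers coincide */
	struct model_matrix *matrix = (struct model_matrix *)dev;

	release_calls++;
	free(matrix);
}

static void model_put_device(struct model_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* register_ret simulates the return value of device_register(). */
static int model_matrix_create(int register_ret, struct model_matrix **out)
{
	struct model_matrix *matrix = calloc(1, sizeof(*matrix));

	*out = NULL;
	if (!matrix)
		return -ENOMEM;

	matrix->device.refcount = 1;
	matrix->device.release = model_matrix_release;

	if (register_ret) {
		/* Dropping the last reference frees matrix via release(). */
		model_put_device(&matrix->device);
		/* No free(matrix) here: it would be a double free. */
		return register_ret;
	}

	*out = matrix;
	return 0;
}
```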

2018-04-03 11:10:25

by Cornelia Huck

Subject: Re: [PATCH v3 07/14] KVM: s390: interfaces to configure/deconfigure guest's AP matrix

On Wed, 14 Mar 2018 14:25:47 -0400
Tony Krowiak <[email protected]> wrote:

> Provides interfaces to assign AP adapters, usage domains
> and control domains to a KVM guest.
>
> A KVM guest is started by executing the Start Interpretive Execution (SIE)
> instruction. The SIE state description is a control block that contains the
> state information for a KVM guest and is supplied as input to the SIE
> instruction. The SIE state description has a satellite structure called the
> Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
> identifying the adapters, queues (domains) and control domains assigned to
> the KVM guest:
>
> * The AP Adapter Mask (APM) field identifies the AP adapters assigned to
> the KVM guest
>
> * The AP Queue Mask (AQM) field identifies the AP queues assigned to
> the KVM guest. Each AP queue is connected to a usage domain within
> an AP adapter.
>
> * The AP Domain Mask (ADM) field identifies the control domains
> assigned to the KVM guest.
>
> Each adapter, queue (usage domain) and control domain are identified by
> a number from 0 to 255. The bits in each mask, from most significant to
> least significant bit, correspond to the numbers 0-255. When a bit is
> set, the corresponding adapter, queue (usage domain) or control domain
> is assigned to the KVM guest.
>
> This patch will set the bits in the APM, AQM and ADM fields of the
> CRYCB referenced by the KVM guest's SIE state description. The process
> used is:
>
> 1. Verify that the bits to be set do not exceed the maximum bit
> number for the given mask.
>
> 2. Verify that the APQNs that can be derived from the intersection
> of the bits set in the APM and AQM fields of the KVM guest's CRYCB
> are not assigned to any other KVM guest running on the same linux
> host.
>
> 3. Set the APM, AQM and ADM in the CRYCB according to the matrix
> configured for the mediated matrix device via its sysfs
> adapter, domain and control domain attribute files respectively.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm-ap.h | 36 +++++
> arch/s390/kvm/kvm-ap.c | 268 +++++++++++++++++++++++++++++++++
> drivers/s390/crypto/vfio_ap_ops.c | 19 +++
> drivers/s390/crypto/vfio_ap_private.h | 4 +
> 4 files changed, 327 insertions(+), 0 deletions(-)
>

> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> index a2c6ad2..eb365e2 100644
> --- a/arch/s390/kvm/kvm-ap.c
> +++ b/arch/s390/kvm/kvm-ap.c
> @@ -8,9 +8,129 @@
>
> #include <asm/kvm-ap.h>
> #include <asm/ap.h>
> +#include <linux/bitops.h>
>
> #include "kvm-s390.h"
>
> +static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
> +{
> + int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
> +
> + if (crycb_fmt == CRYCB_FORMAT2)
> + memset(&kvm->arch.crypto.crycb->apcb1, 0,
> + sizeof(kvm->arch.crypto.crycb->apcb1));
> + else
> + memset(&kvm->arch.crypto.crycb->apcb0, 0,
> + sizeof(kvm->arch.crypto.crycb->apcb0));
> +}

Should that rather be a switch/case? If there's a CRYCB_FORMAT3 in the
future, I'd think that it's more likely that it uses apcb1 and not
apcb0. Can't comment further without the architecture, obviously.
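
A rough sketch of what a switch-based variant could look like, written as stand-alone C so it can be compiled outside the kernel. The format constants, their values and the structure layouts below are assumptions made for the example; an unrecognized format is reported to the caller instead of silently falling through to apcb0.

```c
#include <stdint.h>
#include <string.h>

/* Assumed encodings; hypothetical stand-ins for the CRYCB_FORMAT* macros. */
#define MODEL_CRYCB_FORMAT_MASK 0x3u
#define MODEL_CRYCB_FORMAT0     0x0u
#define MODEL_CRYCB_FORMAT1     0x1u
#define MODEL_CRYCB_FORMAT2     0x3u

struct model_apcb0 {
	uint64_t apm[1];
	uint16_t aqm[1];
	uint16_t adm[1];
};

struct model_apcb1 {
	uint64_t apm[4];
	uint64_t aqm[4];
	uint64_t adm[4];
};

struct model_crycb {
	struct model_apcb0 apcb0;
	struct model_apcb1 apcb1;
};

/* Returns 0 on success, -1 if the format is not one we know about. */
static int model_clear_crycb_masks(struct model_crycb *crycb, uint32_t crycbd)
{
	switch (crycbd & MODEL_CRYCB_FORMAT_MASK) {
	case MODEL_CRYCB_FORMAT2:
		memset(&crycb->apcb1, 0, sizeof(crycb->apcb1));
		return 0;
	case MODEL_CRYCB_FORMAT1:
	case MODEL_CRYCB_FORMAT0:
		memset(&crycb->apcb0, 0, sizeof(crycb->apcb0));
		return 0;
	default:
		/* Unknown (future) format: make the caller decide. */
		return -1;
	}
}
```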

(...)

> +static void kvm_ap_set_crycb_masks(struct kvm *kvm,
> + struct kvm_ap_matrix *matrix)
> +{
> + unsigned long *apm = kvm_ap_get_crycb_apm(kvm);
> + unsigned long *aqm = kvm_ap_get_crycb_aqm(kvm);
> + unsigned long *adm = kvm_ap_get_crycb_adm(kvm);
> +
> + kvm_ap_clear_crycb_masks(kvm);
> + memcpy(apm, matrix->apm, KVM_AP_MASK_BYTES(matrix->apm_max));
> + memcpy(aqm, matrix->aqm, KVM_AP_MASK_BYTES(matrix->aqm_max));
> +
> + /*
> + * Merge the AQM and ADM since the ADM is a superset of the
> + * AQM by architectural convention.

Is this 'architectural convention' in the sense of 'there's a statement
in the architecture that it always is like that', or in the sense of
'all real-life systems are like that'?

[From my sketchy memory, this convention makes sense but is not
enshrined; but I might misremember.]

> + */
> + bitmap_or(adm, adm, aqm, matrix->adm_max);
> +}
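
For readers following along, the bit numbering from the commit message (number 0 is the most significant bit of the mask) and the AQM-into-ADM merge can be modelled in portable C as below. This is a sketch under assumptions: the masks are taken to be 256 bits held in four 64-bit words, and the `model_*` helpers are hypothetical stand-ins, not the kernel's bitops.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MASK_BITS  256
#define MODEL_MASK_WORDS (MODEL_MASK_BITS / 64)

/* Number 0 maps to the most significant bit of word 0. */
static void model_set_bit_inv(unsigned int nr, uint64_t *mask)
{
	mask[nr / 64] |= UINT64_C(1) << (63 - (nr % 64));
}

static bool model_test_bit_inv(unsigned int nr, const uint64_t *mask)
{
	return (mask[nr / 64] >> (63 - (nr % 64))) & 1;
}

/* Make the control domain mask a superset of the usage domain mask. */
static void model_merge_adm(uint64_t *adm, const uint64_t *aqm)
{
	unsigned int i;

	for (i = 0; i < MODEL_MASK_WORDS; i++)
		adm[i] |= aqm[i];
}
```

Because the merge is a plain word-wise OR, it does not depend on which end of the word the numbering starts from.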

2018-04-03 11:11:51

by Cornelia Huck

Subject: Re: [PATCH v3 08/14] s390: vfio-ap: sysfs interfaces to configure adapters

On Wed, 14 Mar 2018 14:25:48 -0400
Tony Krowiak <[email protected]> wrote:

> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index a388b66..f6e7ed1 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -11,6 +11,7 @@
> #include <linux/types.h>
> #include <linux/device.h>
> #include <linux/mdev.h>
> +#include <asm/kvm-ap.h>
>
> #include "ap_bus.h"
>

Does this hunk belong in this patch?

2018-04-03 11:20:42

by Cornelia Huck

Subject: Re: [PATCH v3 09/14] s390: vfio-ap: sysfs interfaces to configure domains

On Wed, 14 Mar 2018 14:25:49 -0400
Tony Krowiak <[email protected]> wrote:

> Provides the sysfs interfaces for assigning AP domains to
> and unassigning AP domains from a mediated matrix device.
>
> An AP domain ID corresponds to an AP queue index (APQI). For
> each domain assigned to the mediated matrix device, its
> corresponding APQI is stored in an AP queue mask (AQM).
> The bits in the AQM, from most significant to least
> significant bit, correspond to AP domain numbers 0 to 255.
> When a domain is assigned, the bit corresponding to its
> APQI will be set in the AQM. Likewise, when a domain is
> unassigned, the bit corresponding to its APQI will be
> cleared from the AQM.
>
> The relevant sysfs structures are:
>
> /sys/devices/vfio_ap
> ... [matrix]
> ...... [mdev_supported_types]
> ......... [vfio_ap-passthrough]
> ............ [devices]
> ...............[$uuid]
> .................. assign_domain
> .................. unassign_domain
>
> To assign a domain to the $uuid mediated matrix device,
> write the domain's ID to the assign_domain file. To
> unassign a domain, write the domain's ID to the
> unassign_domain file. The ID is specified using
> conventional semantics: If it begins with 0x, the number
> will be parsed as a hexadecimal (case insensitive) number;
> otherwise, it will be parsed as a decimal number.
>
> For example, to assign domain 173 (0xad) to the mediated matrix
> device $uuid:
>
> echo 173 > assign_domain
>
> or
>
> echo 0xad > assign_domain
>
> To unassign domain 173 (0xad):
>
> echo 173 > unassign_domain
>
> or
>
> echo 0xad > unassign_domain
>
> The assignment will be rejected:
>
> * If the domain ID exceeds the maximum value for an AP domain:
>
> * If the AP Extended Addressing (APXA) facility is installed,
> the max value is 255
>
> * Else the max value is 15
>
> * If no AP adapters have yet been assigned and there are
> no AP queues reserved by the VFIO AP driver that have an APQN
> with an APQI matching that of the AP domain number being
> assigned.
>
> * If any of the APQNs that can be derived from the intersection
> of the APQI being assigned and the AP adapter ID (APID) of
> each of the AP adapters previously assigned can not be matched
> with an APQN of an AP queue device reserved by the VFIO AP
> driver.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> arch/s390/include/asm/kvm-ap.h | 1 +
> drivers/s390/crypto/vfio_ap_ops.c | 215 ++++++++++++++++++++++++++++++++++++-
> 2 files changed, 215 insertions(+), 1 deletions(-)
>

> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 90512a6..c448835 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -377,10 +377,223 @@ static ssize_t unassign_adapter_store(struct device *dev,
> }
> DEVICE_ATTR_WO(unassign_adapter);
>
> +/**
> + * vfio_ap_validate_queues_for_apqi
> + *
> + * @ap_matrix: the matrix device
> + * @matrix_mdev: the mediated matrix device
> + * @apqi: an AP queue index (APQI) - corresponds to a domain ID
> + *
> + * Verifies that each APQN that is derived from the intersection of @apqi and
> + * each AP adapter ID (APID) corresponding to an AP domain assigned to the
> + * @matrix_mdev matches the APQN of an AP queue reserved by the VFIO AP device
> + * driver.
> + *
> + * Returns 0 if validation succeeds; otherwise, returns an error.
> + */
> +static int vfio_ap_validate_queues_for_apqi(struct ap_matrix *ap_matrix,
> + struct ap_matrix_mdev *matrix_mdev,
> + unsigned long apqi)
> +{
> + int ret;
> + struct vfio_ap_qid_match qid_match;
> + unsigned long apid;
> + struct device_driver *drv = ap_matrix->device.driver;
> +
> + /**
> + * Examine each APQN with the specified APQI
> + */
> + for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
> + matrix_mdev->matrix->apm_max) {
> + qid_match.qid = AP_MKQID(apid, apqi);
> + qid_match.dev = NULL;
> +
> + ret = driver_for_each_device(drv, NULL, &qid_match,
> + vfio_ap_queue_match);
> + if (ret)
> + return ret;

Hm, I'm wondering whether jumping out of the outer loop is the correct
thing to do here - and if yes, whether we should log an error?

> +
> + /*
> + * If the APQN identifies an AP queue that is reserved by the
> + * VFIO AP device driver, continue processing.
> + */
> + if (qid_match.dev)
> + continue;
> +
> + pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver",
> + VFIO_AP_MATRIX_MODULE_NAME, apqi, apqi,
> + VFIO_AP_DRV_NAME);
> +
> + return -ENXIO;
> + }
> +
> + return 0;
> +}
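
As a way to reason about the loop, the check can be modelled in user space: walk the adapters set in the APM, form the APQN for each (adapter, domain) pair, and stop at the first queue that is not in the reserved set. This is a sketch only. `MODEL_MKQID` assumes the APQN is the adapter ID in the high byte and the domain ID in the low byte, the reserved queues are passed in as a plain array, and the model reports which adapter failed so the caller can log it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed APQN layout: adapter ID in bits 8-15, domain ID in bits 0-7. */
#define MODEL_MKQID(apid, apqi) \
	((uint16_t)((((apid) & 0xffu) << 8) | ((apqi) & 0xffu)))

static bool model_queue_reserved(const uint16_t *reserved, size_t n,
				 uint16_t qid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (reserved[i] == qid)
			return true;
	return false;
}

/*
 * Returns 0 if every queue (apid, apqi) for the adapters set in apm is
 * reserved; otherwise logs the first missing queue, stores its adapter
 * ID in *missing_apid (if non-NULL) and returns -ENXIO.
 */
static int model_validate_queues_for_apqi(const uint64_t apm[4],
					  const uint16_t *reserved, size_t n,
					  unsigned int apqi,
					  unsigned int *missing_apid)
{
	unsigned int apid;

	for (apid = 0; apid < 256; apid++) {
		/* adapter 0 is the most significant bit of apm[0] */
		if (!((apm[apid / 64] >> (63 - (apid % 64))) & 1))
			continue;

		if (model_queue_reserved(reserved, n,
					 MODEL_MKQID(apid, apqi)))
			continue;

		fprintf(stderr, "AP queue %02x.%04x not reserved\n",
			apid, apqi);
		if (missing_apid)
			*missing_apid = apid;
		return -ENXIO;
	}

	return 0;
}
```

With adapters 3 and 4 assigned and queues (3,1), (4,1) and (4,2) reserved, assigning domain 1 passes while assigning domain 2 fails on adapter 3.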

2018-04-03 11:31:04

by Cornelia Huck

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On Thu, 29 Mar 2018 14:57:22 -0400
Tony Krowiak <[email protected]> wrote:

> On 03/26/2018 04:44 AM, Cornelia Huck wrote:
> > On Thu, 15 Mar 2018 15:55:39 +0100
> > Pierre Morel <[email protected]> wrote:
> >
> >> On 15/03/2018 15:48, Tony Krowiak wrote:
> >>> On 03/15/2018 08:26 AM, Pierre Morel wrote:
> >>>> On 14/03/2018 19:25, Tony Krowiak wrote:
> >>>>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
> >>>>> index a3dbd45..4ca9077 100644
> >>>>> --- a/arch/s390/kvm/Kconfig
> >>>>> +++ b/arch/s390/kvm/Kconfig
> >>>>> @@ -33,6 +33,7 @@ config KVM
> >>>>> select HAVE_KVM_INVALID_WAKEUPS
> >>>>> select SRCU
> >>>>> select KVM_VFIO
> >>>>> + select ZCRYPT
> >>>> I do not think it is a good solution to *always* enable ZCRYPT
> >>>> when we have KVM.
> >>> If CONFIG_ZCRYPT is not selected, then the kvm_ap_apxa_installed()
> >>> function will not compile
> >>> because it calls a zcrypt interface. How would you suggest we make
> >>> sure zcrypt interfaces
> >>> used in KVM are built if CONFIG_ZCRYPT is not selected?
> >> if zcrypt is not configured, I suppose that the KVM code initializing CRYCB
> >> has no use but the function will be called from KVM.
> >> So I would do something like:
> >>
> >> #ifdef ZCRYPT
> >> external definitions.
> >> #else
> >> stubs returning error -ENOZCRYPT (or whatever)
> >> #endif
> > The kvm code used some kind of detection for crycb before (IIRC it was
> > for the key-wrapping stuff). I assume that usage is independent of
> > zcrypt driver usage in the host?
> A function in kvm-s390.c was replaced with a call to the function in
> ap_bus.c that was externalized in patch 2/14. This was done to remove
> duplicate code. Since zcrypt is built into the kernel, I didn't think
> it would be a problem, but apparently because of the way zcrypt is
> configured, it is still possible to remove it from the kernel build.

Yes.

> >
> > So, I think that apxa detection function should be moved to s390
> > architecture base code and not be conditional on anything.
> I am convinced that the original function from kvm_s390.c should be
> restored.

That would work as well, but removing the code duplication via moving
to s390 architecture code should not be that bad, either. Leaving the
decision to the respective maintainers.
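
For completeness, the stub alternative suggested earlier in the thread would follow the usual pattern sketched below. This is illustrative user-space C rather than a proposal for the actual header: `MODEL_CONFIG_ZCRYPT` stands in for the real config symbol, `model_ap_apxa_installed()` is a hypothetical name, and -ENODEV is just one possible error value. Without the macro defined, the caller compiles against the stub and gets a clean error; with it defined, the real implementation would have to be linked in.

```c
#include <errno.h>

#ifdef MODEL_CONFIG_ZCRYPT
/* Real implementation would be provided by the driver. */
int model_ap_apxa_installed(void);
#else
/* Stub used when the driver is not configured. */
static inline int model_ap_apxa_installed(void)
{
	return -ENODEV;
}
#endif

/*
 * Caller side: works in both configurations without any #ifdef.
 * Returns the highest usable domain ID, or a negative error code.
 */
static int model_max_domain_id(void)
{
	int ret = model_ap_apxa_installed();

	if (ret < 0)
		return ret;

	return ret ? 255 : 15;
}
```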

2018-04-03 13:06:15

by Tony Krowiak

Subject: Re: [PATCH v3 05/14] s390: vfio-ap: base implementation of VFIO AP device driver

On 04/03/2018 06:57 AM, Cornelia Huck wrote:
> On Wed, 14 Mar 2018 14:25:45 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Introduces a new AP device driver. This device driver
>> is built on the VFIO mediated device framework. The framework
>> provides sysfs interfaces that facilitate passthrough
>> access by guests to devices installed on the linux host.
>>
>> The VFIO AP device driver will serve two purposes:
>>
>> 1. Provide the interfaces to reserve AP devices for exclusive
>> use by KVM guests. This is accomplished by unbinding the
>> devices to be reserved for guest usage from the default AP
>> device driver and binding them to the VFIO AP device driver.
>>
>> 2. Implements the functions, callbacks and sysfs attribute
>> interfaces required to create one or more VFIO mediated
>> devices each of which will be used to configure the AP
>> matrix for a guest and serve as a file descriptor
>> for facilitating communication between QEMU and the
>> VFIO AP device driver.
>>
>> When the VFIO AP device driver is initialized:
>>
>> * It registers with the AP bus for control of type 10 (CEX4
>> and newer) AP queue devices. The probe and remove callbacks
>> will be provided to support the binding/unbinding of
>> AP queue devices to/from the VFIO AP device driver.
>>
>> * Creates a /sys/devices/vfio-ap/matrix device to hold
>> the APQNs of the AP devices bound to the VFIO
>> AP device driver and serves as the parent of the
>> mediated devices created for each guest.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> MAINTAINERS | 2 +
>> arch/s390/Kconfig | 11 +++
>> drivers/s390/crypto/Makefile | 4 +
>> drivers/s390/crypto/vfio_ap_drv.c | 135 +++++++++++++++++++++++++++++++++
>> drivers/s390/crypto/vfio_ap_private.h | 22 ++++++
>> include/uapi/linux/vfio.h | 2 +
>> 6 files changed, 176 insertions(+), 0 deletions(-)
>> create mode 100644 drivers/s390/crypto/vfio_ap_drv.c
>> create mode 100644 drivers/s390/crypto/vfio_ap_private.h
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
>> new file mode 100644
>> index 0000000..459e595
>> --- /dev/null
>> +++ b/drivers/s390/crypto/vfio_ap_drv.c
>> @@ -0,0 +1,135 @@
>> +/*
>> + * VFIO based AP device driver
>> + *
>> + * Copyright IBM Corp. 2017
> Update to 2018?
Okay, will do.
>
>> + *
>> + * Author(s): Tony Krowiak <[email protected]>
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/mod_devicetable.h>
>> +#include <linux/slab.h>
>> +
>> +#include "vfio_ap_private.h"
>> +
>> +#define VFIO_AP_ROOT_NAME "vfio_ap"
>> +#define VFIO_AP_DEV_TYPE_NAME "ap_matrix"
>> +#define VFIO_AP_DEV_NAME "matrix"
>> +
>> +MODULE_AUTHOR("IBM Corporation");
>> +MODULE_DESCRIPTION("VFIO AP device driver, Copyright IBM Corp. 2017");
>> +MODULE_LICENSE("GPL v2");
>> +
>> +static struct device *vfio_ap_root_device;
>> +
>> +static struct ap_driver vfio_ap_drv;
>> +
>> +static struct ap_matrix *ap_matrix;
>> +
>> +static struct device_type vfio_ap_dev_type = {
>> + .name = VFIO_AP_DEV_TYPE_NAME,
>> +};
>> +
>> +/* Only type 10 adapters (CEX4 and later) are supported
>> + * by the AP matrix device driver
>> + */
>> +static struct ap_device_id ap_queue_ids[] = {
>> + { .dev_type = AP_DEVICE_TYPE_CEX4,
>> + .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
>> + { .dev_type = AP_DEVICE_TYPE_CEX5,
>> + .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
>> + { .dev_type = AP_DEVICE_TYPE_CEX6,
>> + .match_flags = AP_DEVICE_ID_MATCH_QUEUE_TYPE },
>> + { /* end of sibling */ },
>> +};
>> +
>> +MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
>> +
>> +static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
>> +{
>> + return 0;
>> +}
>> +
>> +static void vfio_ap_matrix_dev_release(struct device *dev)
>> +{
>> + struct ap_matrix *ap_matrix = dev_get_drvdata(dev);
>> +
>> + kfree(ap_matrix);
>> +}
>> +
>> +static int vfio_ap_matrix_dev_create(void)
>> +{
>> + int ret;
>> +
>> + vfio_ap_root_device = root_device_register(VFIO_AP_ROOT_NAME);
>> +
>> + ret = IS_ERR(vfio_ap_root_device);
>> + if (ret) {
> Minor nit: I'd contract that to
>
> if (IS_ERR(vfio_ap_root_device)) {
>
> (you're writing ret in any case)
Okay, will do.
>
>> + ret = PTR_ERR(vfio_ap_root_device);
>> + goto done;
>> + }
>> +
>> + ap_matrix = kzalloc(sizeof(*ap_matrix), GFP_KERNEL);
>> + if (!ap_matrix) {
>> + ret = -ENOMEM;
>> + goto matrix_alloc_err;
>> + }
>> +
>> + ap_matrix->device.type = &vfio_ap_dev_type;
>> + dev_set_name(&ap_matrix->device, "%s", VFIO_AP_DEV_NAME);
>> + ap_matrix->device.parent = vfio_ap_root_device;
>> + ap_matrix->device.release = vfio_ap_matrix_dev_release;
>> + ap_matrix->device.driver = &vfio_ap_drv.driver;
>> +
>> + ret = device_register(&ap_matrix->device);
>> + if (ret)
>> + goto matrix_reg_err;
>> +
>> + goto done;
>> +
>> +matrix_reg_err:
>> + put_device(&ap_matrix->device);
>> + kfree(ap_matrix);
> The kfree() is wrong: If you called device_register for the embedded
> struct device, this needs to be handled via the ->release callback
> exclusively (IOW, the put_device() is enough and the kfree needs to go).
Ah yes, I see that. I will fix it.
>
>> +
>> +matrix_alloc_err:
>> + root_device_unregister(vfio_ap_root_device);
>> +
>> +done:
>> + return ret;
>> +}



2018-04-03 13:20:39

by Tony Krowiak

Subject: Re: [PATCH v3 07/14] KVM: s390: interfaces to configure/deconfigure guest's AP matrix

On 04/03/2018 07:07 AM, Cornelia Huck wrote:
> On Wed, 14 Mar 2018 14:25:47 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> Provides interfaces to assign AP adapters, usage domains
>> and control domains to a KVM guest.
>>
>> A KVM guest is started by executing the Start Interpretive Execution (SIE)
>> instruction. The SIE state description is a control block that contains the
>> state information for a KVM guest and is supplied as input to the SIE
>> instruction. The SIE state description has a satellite structure called the
>> Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
>> identifying the adapters, queues (domains) and control domains assigned to
>> the KVM guest:
>>
>> * The AP Adapter Mask (APM) field identifies the AP adapters assigned to
>> the KVM guest
>>
>> * The AP Queue Mask (AQM) field identifies the AP queues assigned to
>> the KVM guest. Each AP queue is connected to a usage domain within
>> an AP adapter.
>>
>> * The AP Domain Mask (ADM) field identifies the control domains
>> assigned to the KVM guest.
>>
>> Each adapter, queue (usage domain) and control domain are identified by
>> a number from 0 to 255. The bits in each mask, from most significant to
>> least significant bit, correspond to the numbers 0-255. When a bit is
>> set, the corresponding adapter, queue (usage domain) or control domain
>> is assigned to the KVM guest.
>>
>> This patch will set the bits in the APM, AQM and ADM fields of the
>> CRYCB referenced by the KVM guest's SIE state description. The process
>> used is:
>>
>> 1. Verify that the bits to be set do not exceed the maximum bit
>> number for the given mask.
>>
>> 2. Verify that the APQNs that can be derived from the intersection
>> of the bits set in the APM and AQM fields of the KVM guest's CRYCB
>> are not assigned to any other KVM guest running on the same linux
>> host.
>>
>> 3. Set the APM, AQM and ADM in the CRYCB according to the matrix
>> configured for the mediated matrix device via its sysfs
>> adapter, domain and control domain attribute files respectively.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> arch/s390/include/asm/kvm-ap.h | 36 +++++
>> arch/s390/kvm/kvm-ap.c | 268 +++++++++++++++++++++++++++++++++
>> drivers/s390/crypto/vfio_ap_ops.c | 19 +++
>> drivers/s390/crypto/vfio_ap_private.h | 4 +
>> 4 files changed, 327 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
>> index a2c6ad2..eb365e2 100644
>> --- a/arch/s390/kvm/kvm-ap.c
>> +++ b/arch/s390/kvm/kvm-ap.c
>> @@ -8,9 +8,129 @@
>>
>> #include <asm/kvm-ap.h>
>> #include <asm/ap.h>
>> +#include <linux/bitops.h>
>>
>> #include "kvm-s390.h"
>>
>> +static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
>> +{
>> + int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
>> +
>> + if (crycb_fmt == CRYCB_FORMAT2)
>> + memset(&kvm->arch.crypto.crycb->apcb1, 0,
>> + sizeof(kvm->arch.crypto.crycb->apcb1));
>> + else
>> + memset(&kvm->arch.crypto.crycb->apcb0, 0,
>> + sizeof(kvm->arch.crypto.crycb->apcb0));
>> +}
> Should that rather be a switch/case? If there's a CRYCB_FORMAT3 in the
> future, I'd think that it's more likely that it uses apcb1 and not
> apcb0. Can't comment further without the architecture, obviously.
Maybe we should just clear both structures without regard to the CRYCB
format.
>
> (...)
>
>> +static void kvm_ap_set_crycb_masks(struct kvm *kvm,
>> + struct kvm_ap_matrix *matrix)
>> +{
>> + unsigned long *apm = kvm_ap_get_crycb_apm(kvm);
>> + unsigned long *aqm = kvm_ap_get_crycb_aqm(kvm);
>> + unsigned long *adm = kvm_ap_get_crycb_adm(kvm);
>> +
>> + kvm_ap_clear_crycb_masks(kvm);
>> + memcpy(apm, matrix->apm, KVM_AP_MASK_BYTES(matrix->apm_max));
>> + memcpy(aqm, matrix->aqm, KVM_AP_MASK_BYTES(matrix->aqm_max));
>> +
>> + /*
>> + * Merge the AQM and ADM since the ADM is a superset of the
>> + * AQM by architectural convention.
> Is this 'architectural convention' in the sense of 'there's a statement
> in the architecture that it always is like that', or in the sense of
> 'all real-life systems are like that'?
> [From my sketchy memory, this convention makes sense but is not
> enshrined; but I might misremember.]
The documentation states it is an agreed upon convention.
>
>> + */
>> + bitmap_or(adm, adm, aqm, matrix->adm_max);
>> +}
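
The mask layout described in the commit message (bit numbers 0-255 counted from the most significant bit) and the merge of the AQM into the ADM can be sketched in user-space C. This is only an illustration: the masks are modeled as byte arrays rather than the kernel's unsigned long bitmaps, and the toy_* helpers are hypothetical, not the set_bit_inv()/bitmap_or() implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MASK_BITS  256
#define TOY_MASK_BYTES (TOY_MASK_BITS / 8)

/* Bit 0 is the most significant bit of byte 0 (MSB-first numbering). */
static void toy_set_bit_inv(unsigned int nr, uint8_t *mask)
{
	assert(nr < TOY_MASK_BITS);
	mask[nr / 8] |= (uint8_t)(0x80u >> (nr % 8));
}

static int toy_test_bit_inv(unsigned int nr, const uint8_t *mask)
{
	assert(nr < TOY_MASK_BITS);
	return (mask[nr / 8] >> (7 - nr % 8)) & 1;
}

/*
 * Make the ADM a superset of the AQM: every usage domain also becomes
 * a control domain, while control domains already set are kept.
 */
static void toy_merge_aqm_into_adm(uint8_t *adm, const uint8_t *aqm)
{
	for (size_t i = 0; i < TOY_MASK_BYTES; i++)
		adm[i] |= aqm[i];
}
```

With domains 1 and 173 set in the AQM and control domain 5 set in the ADM, the merged ADM has bits 1, 5 and 173 set.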



2018-04-03 13:36:16

by Tony Krowiak

Subject: Re: [PATCH v3 08/14] s390: vfio-ap: sysfs interfaces to configure adapters

On 04/03/2018 07:10 AM, Cornelia Huck wrote:
> On Wed, 14 Mar 2018 14:25:48 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
>> index a388b66..f6e7ed1 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -11,6 +11,7 @@
>> #include <linux/types.h>
>> #include <linux/device.h>
>> #include <linux/mdev.h>
>> +#include <asm/kvm-ap.h>
>>
>> #include "ap_bus.h"
>>
> Does this hunk belong in this patch?
It does look out of place here. I'll restore it to its rightful resting
place.
>


2018-04-03 13:39:38

by Cornelia Huck

Subject: Re: [PATCH v3 07/14] KVM: s390: interfaces to configure/deconfigure guest's AP matrix

On Tue, 3 Apr 2018 09:17:59 -0400
Tony Krowiak <[email protected]> wrote:

> On 04/03/2018 07:07 AM, Cornelia Huck wrote:
> > On Wed, 14 Mar 2018 14:25:47 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Provides interfaces to assign AP adapters, usage domains
> >> and control domains to a KVM guest.
> >>
> >> A KVM guest is started by executing the Start Interpretive Execution (SIE)
> >> instruction. The SIE state description is a control block that contains the
> >> state information for a KVM guest and is supplied as input to the SIE
> >> instruction. The SIE state description has a satellite structure called the
> >> Crypto Control Block (CRYCB). The CRYCB contains three bitmask fields
> >> identifying the adapters, queues (domains) and control domains assigned to
> >> the KVM guest:
> >>
> >> * The AP Adapter Mask (APM) field identifies the AP adapters assigned to
> >> the KVM guest
> >>
> >> * The AP Queue Mask (AQM) field identifies the AP queues assigned to
> >> the KVM guest. Each AP queue is connected to a usage domain within
> >> an AP adapter.
> >>
> >> * The AP Domain Mask (ADM) field identifies the control domains
> >> assigned to the KVM guest.
> >>
> >> Each adapter, queue (usage domain) and control domain are identified by
> >> a number from 0 to 255. The bits in each mask, from most significant to
> >> least significant bit, correspond to the numbers 0-255. When a bit is
> >> set, the corresponding adapter, queue (usage domain) or control domain
> >> is assigned to the KVM guest.
> >>
> >> This patch will set the bits in the APM, AQM and ADM fields of the
> >> CRYCB referenced by the KVM guest's SIE state description. The process
> >> used is:
> >>
> >> 1. Verify that the bits to be set do not exceed the maximum bit
> >> number for the given mask.
> >>
> >> 2. Verify that the APQNs that can be derived from the intersection
> >> of the bits set in the APM and AQM fields of the KVM guest's CRYCB
> >> are not assigned to any other KVM guest running on the same linux
> >> host.
> >>
> >> 3. Set the APM, AQM and ADM in the CRYCB according to the matrix
> >> configured for the mediated matrix device via its sysfs
> >> adapter, domain and control domain attribute files respectively.
> >>
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> arch/s390/include/asm/kvm-ap.h | 36 +++++
> >> arch/s390/kvm/kvm-ap.c | 268 +++++++++++++++++++++++++++++++++
> >> drivers/s390/crypto/vfio_ap_ops.c | 19 +++
> >> drivers/s390/crypto/vfio_ap_private.h | 4 +
> >> 4 files changed, 327 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/arch/s390/kvm/kvm-ap.c b/arch/s390/kvm/kvm-ap.c
> >> index a2c6ad2..eb365e2 100644
> >> --- a/arch/s390/kvm/kvm-ap.c
> >> +++ b/arch/s390/kvm/kvm-ap.c
> >> @@ -8,9 +8,129 @@
> >>
> >> #include <asm/kvm-ap.h>
> >> #include <asm/ap.h>
> >> +#include <linux/bitops.h>
> >>
> >> #include "kvm-s390.h"
> >>
> >> +static inline void kvm_ap_clear_crycb_masks(struct kvm *kvm)
> >> +{
> >> + int crycb_fmt = kvm->arch.crypto.crycbd & CRYCB_FORMAT_MASK;
> >> +
> >> + if (crycb_fmt == CRYCB_FORMAT2)
> >> + memset(&kvm->arch.crypto.crycb->apcb1, 0,
> >> + sizeof(kvm->arch.crypto.crycb->apcb1));
> >> + else
> >> + memset(&kvm->arch.crypto.crycb->apcb0, 0,
> >> + sizeof(kvm->arch.crypto.crycb->apcb0));
> >> +}
> > Should that rather be a switch/case? If there's a CRYCB_FORMAT3 in the
> > future, I'd think that it's more likely that it uses apcb1 and not
> > apcb0. Can't comment further without the architecture, obviously.
> Maybe we should just clear both structures without regard to the CRYCB
> format.

Yes; but my concern applies to the other checks for CRYCB_FORMAT2 as
well (snipped).
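
As a rough sketch of the two ideas in this exchange: clearing can ignore the format entirely, while code that selects a mask by format can use a switch with an explicit default so that an unknown future format is not silently mapped to apcb0. The structure layouts, the toy_* names and the assumption that formats 0 and 1 share apcb0 are simplifications for illustration, not the real CRYCB definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the two APCB layouts. */
struct toy_apcb0 { unsigned long apm[1], aqm[1], adm[1]; };
struct toy_apcb1 { unsigned long apm[4], aqm[4], adm[4]; };

struct toy_crycb {
	struct toy_apcb0 apcb0;
	struct toy_apcb1 apcb1;
};

enum toy_crycb_format { TOY_FORMAT0, TOY_FORMAT1, TOY_FORMAT2 };

/* Clear both blocks regardless of the format in use. */
static void toy_clear_crycb_masks(struct toy_crycb *crycb)
{
	memset(&crycb->apcb0, 0, sizeof(crycb->apcb0));
	memset(&crycb->apcb1, 0, sizeof(crycb->apcb1));
}

/* Select the adapter mask by format; unknown formats are rejected. */
static unsigned long *toy_get_apm(struct toy_crycb *crycb,
				  enum toy_crycb_format fmt)
{
	switch (fmt) {
	case TOY_FORMAT2:
		return crycb->apcb1.apm;
	case TOY_FORMAT0:
	case TOY_FORMAT1:
		return crycb->apcb0.apm;
	default:
		/* A future format must be handled explicitly. */
		return NULL;
	}
}
```

Returning NULL for an unrecognized format forces each caller to decide what to do, instead of inheriting the else branch by accident.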

2018-04-03 15:20:50

by Cornelia Huck

Subject: Re: [PATCH v3 09/14] s390: vfio-ap: sysfs interfaces to configure domains

On Tue, 3 Apr 2018 11:12:45 -0400
Tony Krowiak <[email protected]> wrote:

> On 04/03/2018 07:17 AM, Cornelia Huck wrote:
> > On Wed, 14 Mar 2018 14:25:49 -0400
> > Tony Krowiak <[email protected]> wrote:
> >
> >> Provides the sysfs interfaces for assigning AP domains to
> >> and unassigning AP domains from a mediated matrix device.
> >>
> >> An AP domain ID corresponds to an AP queue index (APQI). For
> >> each domain assigned to the mediated matrix device, its
> >> corresponding APQI is stored in an AP queue mask (AQM).
> >> The bits in the AQM, from most significant to least
> >> significant bit, correspond to AP domain numbers 0 to 255.
> >> When a domain is assigned, the bit corresponding to its
> >> APQI will be set in the AQM. Likewise, when a domain is
> >> unassigned, the bit corresponding to its APQI will be
> >> cleared from the AQM.
> >>
> >> The relevant sysfs structures are:
> >>
> >> /sys/devices/vfio_ap
> >> ... [matrix]
> >> ...... [mdev_supported_types]
> >> ......... [vfio_ap-passthrough]
> >> ............ [devices]
> >> ...............[$uuid]
> >> .................. assign_domain
> >> .................. unassign_domain
> >>
> >> To assign a domain to the $uuid mediated matrix device,
> >> write the domain's ID to the assign_domain file. To
> >> unassign a domain, write the domain's ID to the
> >> unassign_domain file. The ID is specified using
> >> conventional semantics: If it begins with 0x, the number
> >> will be parsed as a hexadecimal (case insensitive) number;
> >> otherwise, it will be parsed as a decimal number.
> >>
> >> For example, to assign domain 173 (0xad) to the mediated matrix
> >> device $uuid:
> >>
> >> echo 173 > assign_domain
> >>
> >> or
> >>
> >> echo 0xad > assign_domain
> >>
> >> To unassign domain 173 (0xad):
> >>
> >> echo 173 > unassign_domain
> >>
> >> or
> >>
> >> echo 0xad > unassign_domain
> >>
> >> The assignment will be rejected:
> >>
> >> * If the domain ID exceeds the maximum value for an AP domain:
> >>
> >> * If the AP Extended Addressing (APXA) facility is installed,
> >> the max value is 255
> >>
> >> * Else the max value is 15
> >>
> >> * If no AP adapters have yet been assigned and there are
> >> no AP queues reserved by the VFIO AP driver that have an APQN
> >> with an APQI matching that of the AP domain number being
> >> assigned.
> >>
> >> * If any of the APQNs that can be derived from the intersection
> >> of the APQI being assigned and the AP adapter ID (APID) of
> >> each of the AP adapters previously assigned can not be matched
> >> with an APQN of an AP queue device reserved by the VFIO AP
> >> driver.
> >>
> >> Signed-off-by: Tony Krowiak <[email protected]>
> >> ---
> >> arch/s390/include/asm/kvm-ap.h | 1 +
> >> drivers/s390/crypto/vfio_ap_ops.c | 215 ++++++++++++++++++++++++++++++++++++-
> >> 2 files changed, 215 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> >> index 90512a6..c448835 100644
> >> --- a/drivers/s390/crypto/vfio_ap_ops.c
> >> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> >> @@ -377,10 +377,223 @@ static ssize_t unassign_adapter_store(struct device *dev,
> >> }
> >> DEVICE_ATTR_WO(unassign_adapter);
> >>
> >> +/**
> >> + * vfio_ap_validate_queues_for_apqi
> >> + *
> >> + * @ap_matrix: the matrix device
> >> + * @matrix_mdev: the mediated matrix device
> >> + * @apqi: an AP queue index (APQI) - corresponds to a domain ID
> >> + *
> >> + * Verifies that each APQN that is derived from the intersection of @apqi and
> >> + * each AP adapter ID (APID) corresponding to an AP domain assigned to the
> >> + * @matrix_mdev matches the APQN of an AP queue reserved by the VFIO AP device
> >> + * driver.
> >> + *
> >> + * Returns 0 if validation succeeds; otherwise, returns an error.
> >> + */
> >> +static int vfio_ap_validate_queues_for_apqi(struct ap_matrix *ap_matrix,
> >> + struct ap_matrix_mdev *matrix_mdev,
> >> + unsigned long apqi)
> >> +{
> >> + int ret;
> >> + struct vfio_ap_qid_match qid_match;
> >> + unsigned long apid;
> >> + struct device_driver *drv = ap_matrix->device.driver;
> >> +
> >> + /**
> >> + * Examine each APQN with the specified APQI
> >> + */
> >> + for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
> >> + matrix_mdev->matrix->apm_max) {
> >> + qid_match.qid = AP_MKQID(apid, apqi);
> >> + qid_match.dev = NULL;
> >> +
> >> + ret = driver_for_each_device(drv, NULL, &qid_match,
> >> + vfio_ap_queue_match);
> >> + if (ret)
> >> + return ret;
> > Hm, I'm wondering whether jumping out of the outer loop is the correct
> > thing to do here - and if yes, whether we should log an error?
> If you look at the vfio_ap_queue_match() function which is passed to the
> driver_for_each_device() function, it never returns an error. The
> driver_for_each_device() function only returns an error if the function
> passed in returns an error, so in reality, the value of ret will never
> be anything but 0. Having said that, there are no guarantees that the
> vfio_ap_queue_match() function will never change, so it would probably
> be a good idea to log an error if ret is not 0. I think returning at
> this point is valid because a non-zero is returned from
> driver_for_each_device() function as soon as the input function returns
> a non-zero value for a specific device. This means that subsequent
> devices will not be processed, so we may not know whether an AP queue
> has been reserved or not - see check below.

OK, then logging an error makes the most sense.

Is there a source tree with the patches somewhere, btw? Checking out a
branch is less time-consuming than applying a series (and helps
review). Same applies to the qemu patches; maybe one of the IBM
maintainers can push to a branch?

>
> >
> >> +
> >> + /*
> >> + * If the APQN identifies an AP queue that is reserved by the
> >> + * VFIO AP device driver, continue processing.
> >> + */
> >> + if (qid_match.dev)
> >> + continue;
> >> +
> >> + pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver",
> >> + VFIO_AP_MATRIX_MODULE_NAME, apqi, apqi,
> >> + VFIO_AP_DRV_NAME);
> >> +
> >> + return -ENXIO;
> >> + }
> >> +
> >> + return 0;
> >> +}
>
>
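
The control flow agreed on here (log and return when the iteration itself fails, then check whether the queue was found) can be sketched in user-space C. The iterator below only imitates the stop-at-first-nonzero behaviour of driver_for_each_device(). The toy_* names, the list of reserved queue IDs and the APQN encoding in TOY_MKQID are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed APQN encoding: adapter ID in the high byte, queue index low. */
#define TOY_MKQID(apid, apqi) ((((apid) & 0xffu) << 8) | ((apqi) & 0xffu))

typedef int (*toy_match_fn)(unsigned int qid, void *data);

struct toy_qid_match {
	unsigned int qid;	/* APQN searched for */
	int found;		/* set when a reserved queue matches */
};

/* Stops at the first nonzero callback result and returns it. */
static int toy_for_each_queue(const unsigned int *qids, size_t n,
			      void *data, toy_match_fn fn)
{
	for (size_t i = 0; i < n; i++) {
		int ret = fn(qids[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Records a match; never fails, like the callback discussed above. */
static int toy_queue_match(unsigned int qid, void *data)
{
	struct toy_qid_match *match = data;

	if (qid == match->qid)
		match->found = 1;
	return 0;
}

static int toy_validate_apqi(const unsigned int *reserved, size_t n_reserved,
			     const unsigned int *apids, size_t n_apids,
			     unsigned int apqi)
{
	for (size_t i = 0; i < n_apids; i++) {
		struct toy_qid_match match = {
			.qid = TOY_MKQID(apids[i], apqi),
			.found = 0,
		};
		int ret = toy_for_each_queue(reserved, n_reserved, &match,
					     toy_queue_match);

		if (ret) {
			/* Iteration was cut short, so the result is unknown. */
			fprintf(stderr, "queue lookup failed: %d\n", ret);
			return ret;
		}
		if (!match.found) {
			fprintf(stderr, "AP queue %02x.%04x not reserved\n",
				apids[i], apqi);
			return -ENXIO;
		}
	}
	return 0;
}
```

With queues (4,1) and (4,2) reserved, validating APQI 1 against adapter 4 succeeds, while adding adapter 3 to the list fails with -ENXIO because (3,1) is not reserved.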


2018-04-03 15:44:45

by Tony Krowiak

Subject: Re: [PATCH v3 09/14] s390: vfio-ap: sysfs interfaces to configure domains

On 04/03/2018 11:19 AM, Cornelia Huck wrote:
> On Tue, 3 Apr 2018 11:12:45 -0400
> Tony Krowiak <[email protected]> wrote:
>
>> On 04/03/2018 07:17 AM, Cornelia Huck wrote:
>>> On Wed, 14 Mar 2018 14:25:49 -0400
>>> Tony Krowiak <[email protected]> wrote:
>>>
>>>> Provides the sysfs interfaces for assigning AP domains to
>>>> and unassigning AP domains from a mediated matrix device.
>>>>
>>>> An AP domain ID corresponds to an AP queue index (APQI). For
>>>> each domain assigned to the mediated matrix device, its
>>>> corresponding APQI is stored in an AP queue mask (AQM).
>>>> The bits in the AQM, from most significant to least
>>>> significant bit, correspond to AP domain numbers 0 to 255.
>>>> When a domain is assigned, the bit corresponding to its
>>>> APQI will be set in the AQM. Likewise, when a domain is
>>>> unassigned, the bit corresponding to its APQI will be
>>>> cleared from the AQM.
>>>>
>>>> The relevant sysfs structures are:
>>>>
>>>> /sys/devices/vfio_ap
>>>> ... [matrix]
>>>> ...... [mdev_supported_types]
>>>> ......... [vfio_ap-passthrough]
>>>> ............ [devices]
>>>> ...............[$uuid]
>>>> .................. assign_domain
>>>> .................. unassign_domain
>>>>
>>>> To assign a domain to the $uuid mediated matrix device,
>>>> write the domain's ID to the assign_domain file. To
>>>> unassign a domain, write the domain's ID to the
>>>> unassign_domain file. The ID is specified using
>>>> conventional semantics: If it begins with 0x, the number
>>>> will be parsed as a hexadecimal (case insensitive) number;
>>>> otherwise, it will be parsed as a decimal number.
>>>>
>>>> For example, to assign domain 173 (0xad) to the mediated matrix
>>>> device $uuid:
>>>>
>>>> echo 173 > assign_domain
>>>>
>>>> or
>>>>
>>>> echo 0xad > assign_domain
>>>>
>>>> To unassign domain 173 (0xad):
>>>>
>>>> echo 173 > unassign_domain
>>>>
>>>> or
>>>>
>>>> echo 0xad > unassign_domain
>>>>
>>>> The assignment will be rejected:
>>>>
>>>> * If the domain ID exceeds the maximum value for an AP domain:
>>>>
>>>> * If the AP Extended Addressing (APXA) facility is installed,
>>>> the max value is 255
>>>>
>>>> * Else the max value is 15
>>>>
>>>> * If no AP adapters have yet been assigned and there are
>>>> no AP queues reserved by the VFIO AP driver that have an APQN
>>>> with an APQI matching that of the AP domain number being
>>>> assigned.
>>>>
>>>> * If any of the APQNs that can be derived from the intersection
>>>> of the APQI being assigned and the AP adapter ID (APID) of
>>>> each of the AP adapters previously assigned can not be matched
>>>> with an APQN of an AP queue device reserved by the VFIO AP
>>>> driver.
>>>>
>>>> Signed-off-by: Tony Krowiak <[email protected]>
>>>> ---
>>>> arch/s390/include/asm/kvm-ap.h | 1 +
>>>> drivers/s390/crypto/vfio_ap_ops.c | 215 ++++++++++++++++++++++++++++++++++++-
>>>> 2 files changed, 215 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>>>> index 90512a6..c448835 100644
>>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>>> @@ -377,10 +377,223 @@ static ssize_t unassign_adapter_store(struct device *dev,
>>>> }
>>>> DEVICE_ATTR_WO(unassign_adapter);
>>>>
>>>> +/**
>>>> + * vfio_ap_validate_queues_for_apqi
>>>> + *
>>>> + * @ap_matrix: the matrix device
>>>> + * @matrix_mdev: the mediated matrix device
>>>> + * @apqi: an AP queue index (APQI) - corresponds to a domain ID
>>>> + *
>>>> + * Verifies that each APQN that is derived from the intersection of @apqi and
>>>> + * each AP adapter ID (APID) corresponding to an AP domain assigned to the
>>>> + * @matrix_mdev matches the APQN of an AP queue reserved by the VFIO AP device
>>>> + * driver.
>>>> + *
>>>> + * Returns 0 if validation succeeds; otherwise, returns an error.
>>>> + */
>>>> +static int vfio_ap_validate_queues_for_apqi(struct ap_matrix *ap_matrix,
>>>> + struct ap_matrix_mdev *matrix_mdev,
>>>> + unsigned long apqi)
>>>> +{
>>>> + int ret;
>>>> + struct vfio_ap_qid_match qid_match;
>>>> + unsigned long apid;
>>>> + struct device_driver *drv = ap_matrix->device.driver;
>>>> +
>>>> + /**
>>>> + * Examine each APQN with the specified APQI
>>>> + */
>>>> + for_each_set_bit_inv(apid, matrix_mdev->matrix->apm,
>>>> + matrix_mdev->matrix->apm_max) {
>>>> + qid_match.qid = AP_MKQID(apid, apqi);
>>>> + qid_match.dev = NULL;
>>>> +
>>>> + ret = driver_for_each_device(drv, NULL, &qid_match,
>>>> + vfio_ap_queue_match);
>>>> + if (ret)
>>>> + return ret;
>>> Hm, I'm wondering whether jumping out of the outer loop is the correct
>>> thing to do here - and if yes, whether we should log an error?
>> If you look at the vfio_ap_queue_match() function which is passed to the
>> driver_for_each_device() function, it never returns an error. The
>> driver_for_each_device() function only returns an error if the function
>> passed in returns an error, so in reality, the value of ret will never
>> be anything but 0. Having said that, there are no guarantees that the
>> vfio_ap_queue_match() function will never change, so it would probably
>> be a good idea to log an error if ret is not 0. I think returning at
>> this point is valid because a non-zero is returned from
>> driver_for_each_device() function as soon as the input function returns
>> a non-zero value for a specific device. This means that subsequent
>> devices will not be processed, so we may not know whether an AP queue
>> has been reserved or not - see check below.
> OK, then logging an error makes the most sense.
>
> Is there a source tree with the patches somewhere, btw? Checking out a
> branch is less time-consuming than applying a series (and helps
> review). Same applies to the qemu patches; maybe one of the IBM
> maintainers can push to a branch?
I'll check with Christian for the forthcoming v4 patches.
>
>>>
>>>> +
>>>> + /*
>>>> + * If the APQN identifies an AP queue that is reserved by the
>>>> + * VFIO AP device driver, continue processing.
>>>> + */
>>>> + if (qid_match.dev)
>>>> + continue;
>>>> +
>>>> + pr_err("%s: AP queue %02lx.%04lx not reserved by %s driver",
>>>> + VFIO_AP_MATRIX_MODULE_NAME, apqi, apqi,
>>>> + VFIO_AP_DRV_NAME);
>>>> +
>>>> + return -ENXIO;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>
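
The input rules in the quoted commit message (an ID starting with 0x is parsed as hexadecimal, anything else as decimal, and the maximum is 255 with APXA installed, otherwise 15) can be sketched in user-space C as follows. The function name, its parameters and the choice of -ERANGE for an out-of-range ID are assumptions made for illustration. The patch itself is not shown parsing the value this way.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a domain ID as written to assign_domain/unassign_domain.
 * Returns 0 and stores the ID on success, a negative errno otherwise.
 */
static int toy_parse_domain_id(const char *buf, int apxa_installed,
			       unsigned long *id)
{
	unsigned long max = apxa_installed ? 255 : 15;
	unsigned long val;
	char *end;
	int base;

	/* "0x"/"0X" prefix selects hexadecimal, everything else decimal. */
	base = (buf[0] == '0' && (buf[1] == 'x' || buf[1] == 'X')) ? 16 : 10;

	errno = 0;
	val = strtoul(buf, &end, base);
	if (errno || end == buf)
		return -EINVAL;

	/* Tolerate the trailing newline that echo appends. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	if (val > max)
		return -ERANGE;	/* assumed error code */

	*id = val;
	return 0;
}
```

Both "173" and "0xad" yield domain 173 when APXA is installed. Without APXA the same input is rejected because it exceeds 15.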


2018-04-05 10:44:11

by Christian Borntraeger

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization



On 03/14/2018 07:25 PM, Tony Krowiak wrote:
> This patch refactors the code that initializes the crypto
> configuration for a guest. The crypto configuration is contained in
> a crypto control block (CRYCB) which is a satellite control block to
> our main hardware virtualization control block. The CRYCB is
> attached to the main virtualization control block via a CRYCB
> designation (CRYCBD) field containing the address of
> the CRYCB as well as its format.
>
> Prior to the introduction of AP device virtualization, there was
> no need to provide access to or specify the format of the CRYCB for
> a guest unless the MSA extension 3 (MSAX3) facility was installed
> on the host system. With the introduction of AP device virtualization,
> the CRYCB and its format must be made accessible to the guest
> regardless of the presence of the MSAX3 facility.
>
> The crypto initialization code is restructured as follows:
>
> * A new compilation unit is introduced to contain all interfaces
> and data structures related to configuring a guest's CRYCB for
> both the refactoring of crypto initialization as well as all
> subsequent patches introducing AP virtualization support.
>
> * Currently, the asm code for querying the AP configuration is
> duplicated in the AP bus as well as in KVM. Since the KVM
> code was introduced, the AP bus has externalized the interface
> for querying the AP configuration. The KVM interface will be
> replaced with a call to the AP bus interface. Of course, this
> will be moved to the new compilation unit mentioned above.
>
> * An interface to format the CRYCBD field will be provided via
> the new compilation unit and called from the KVM vm
> initialization.
>
> Signed-off-by: Tony Krowiak <[email protected]>
> ---
> MAINTAINERS | 10 ++++++
> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/kvm/Kconfig | 1 +
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/kvm-ap.c | 48 +++++++++++++++++++++++++++++
> arch/s390/kvm/kvm-s390.c | 61 ++++---------------------------------
> 7 files changed, 84 insertions(+), 55 deletions(-)
> create mode 100644 arch/s390/include/asm/kvm-ap.h
> create mode 100644 arch/s390/kvm/kvm-ap.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0ec5881..72742d5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11875,6 +11875,16 @@ W: http://www.ibm.com/developerworks/linux/linux390/
> S: Supported
> F: drivers/s390/crypto/
>
> +S390 VFIO AP DRIVER
> +M: Tony Krowiak <[email protected]>
> +M: Christian Borntraeger <[email protected]>
> +M: Martin Schwidefsky <[email protected]>
> +L: [email protected]
> +W: http://www.ibm.com/developerworks/linux/linux390/
> +S: Supported
> +F: arch/s390/include/asm/kvm/kvm-ap.h
> +F: arch/s390/kvm/kvm-ap.c
> +
> S390 ZFCP DRIVER
> M: Steffen Maier <[email protected]>
> M: Benjamin Block <[email protected]>


The MAINTAINERS update belongs in a different patch (e.g., when you introduce
drivers/s390/crypto/vfio_ap_drv.c).


2018-04-05 10:46:45

by Christian Borntraeger

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization



On 04/05/2018 12:42 PM, Christian Borntraeger wrote:
>
>
> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>> This patch refactors the code that initializes the crypto
>> configuration for a guest. The crypto configuration is contained in
>> a crypto control block (CRYCB) which is a satellite control block to
>> our main hardware virtualization control block. The CRYCB is
>> attached to the main virtualization control block via a CRYCB
>> designation (CRYCBD) field containing the address of
>> the CRYCB as well as its format.
>>
>> Prior to the introduction of AP device virtualization, there was
>> no need to provide access to or specify the format of the CRYCB for
>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>> on the host system. With the introduction of AP device virtualization,
>> the CRYCB and its format must be made accessible to the guest
>> regardless of the presence of the MSAX3 facility.
>>
>> The crypto initialization code is restructured as follows:
>>
>> * A new compilation unit is introduced to contain all interfaces
>> and data structures related to configuring a guest's CRYCB for
>> both the refactoring of crypto initialization as well as all
>> subsequent patches introducing AP virtualization support.
>>
>> * Currently, the asm code for querying the AP configuration is
>> duplicated in the AP bus as well as in KVM. Since the KVM
>> code was introduced, the AP bus has externalized the interface
>> for querying the AP configuration. The KVM interface will be
>> replaced with a call to the AP bus interface. Of course, this
>> will be moved to the new compilation unit mentioned above.
>>
>> * An interface to format the CRYCBD field will be provided via
>> the new compilation unit and called from the KVM vm
>> initialization.
>>
>> Signed-off-by: Tony Krowiak <[email protected]>
>> ---
>> MAINTAINERS | 10 ++++++
>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++
>> arch/s390/include/asm/kvm_host.h | 1 +
>> arch/s390/kvm/Kconfig | 1 +
>> arch/s390/kvm/Makefile | 2 +-
>> arch/s390/kvm/kvm-ap.c | 48 +++++++++++++++++++++++++++++
>> arch/s390/kvm/kvm-s390.c | 61 ++++---------------------------------
>> 7 files changed, 84 insertions(+), 55 deletions(-)
>> create mode 100644 arch/s390/include/asm/kvm-ap.h
>> create mode 100644 arch/s390/kvm/kvm-ap.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 0ec5881..72742d5 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -11875,6 +11875,16 @@ W: http://www.ibm.com/developerworks/linux/linux390/
>> S: Supported
>> F: drivers/s390/crypto/
>>
>> +S390 VFIO AP DRIVER
>> +M: Tony Krowiak <[email protected]>
>> +M: Christian Borntraeger <[email protected]>
>> +M: Martin Schwidefsky <[email protected]>
>> +L: [email protected]
>> +W: http://www.ibm.com/developerworks/linux/linux390/
>> +S: Supported
>> +F: arch/s390/include/asm/kvm/kvm-ap.h
>> +F: arch/s390/kvm/kvm-ap.c
>> +
>> S390 ZFCP DRIVER
>> M: Steffen Maier <[email protected]>
>> M: Benjamin Block <[email protected]>
>
>
> The MAINTAINERS update belongs in a different patch (e.g., when you introduce
> drivers/s390/crypto/vfio_ap_drv.c).

To put it differently: I think the kvm-ap code here is more related to KVM than
to vfio-ap.


2018-04-05 13:19:16

by Tony Krowiak

Subject: Re: [PATCH v3 01/14] KVM: s390: refactor crypto initialization

On 04/05/2018 06:45 AM, Christian Borntraeger wrote:
>
> On 04/05/2018 12:42 PM, Christian Borntraeger wrote:
>>
>> On 03/14/2018 07:25 PM, Tony Krowiak wrote:
>>> This patch refactors the code that initializes the crypto
>>> configuration for a guest. The crypto configuration is contained in
>>> a crypto control block (CRYCB) which is a satellite control block to
>>> our main hardware virtualization control block. The CRYCB is
>>> attached to the main virtualization control block via a CRYCB
>>> designation (CRYCBD) field containing the address of
>>> the CRYCB as well as its format.
>>>
>>> Prior to the introduction of AP device virtualization, there was
>>> no need to provide access to or specify the format of the CRYCB for
>>> a guest unless the MSA extension 3 (MSAX3) facility was installed
>>> on the host system. With the introduction of AP device virtualization,
>>> the CRYCB and its format must be made accessible to the guest
>>> regardless of the presence of the MSAX3 facility.
>>>
>>> The crypto initialization code is restructured as follows:
>>>
>>> * A new compilation unit is introduced to contain all interfaces
>>> and data structures related to configuring a guest's CRYCB for
>>> both the refactoring of crypto initialization as well as all
>>> subsequent patches introducing AP virtualization support.
>>>
>>> * Currently, the asm code for querying the AP configuration is
>>> duplicated in the AP bus as well as in KVM. Since the KVM
>>> code was introduced, the AP bus has externalized the interface
>>> for querying the AP configuration. The KVM interface will be
>>> replaced with a call to the AP bus interface. Of course, this
>>> will be moved to the new compilation unit mentioned above.
>>>
>>> * An interface to format the CRYCBD field will be provided via
>>> the new compilation unit and called from the KVM vm
>>> initialization.
>>>
>>> Signed-off-by: Tony Krowiak <[email protected]>
>>> ---
>>> MAINTAINERS | 10 ++++++
>>> arch/s390/include/asm/kvm-ap.h | 16 ++++++++++
>>> arch/s390/include/asm/kvm_host.h | 1 +
>>> arch/s390/kvm/Kconfig | 1 +
>>> arch/s390/kvm/Makefile | 2 +-
>>> arch/s390/kvm/kvm-ap.c | 48 +++++++++++++++++++++++++++++
>>> arch/s390/kvm/kvm-s390.c | 61 ++++---------------------------------
>>> 7 files changed, 84 insertions(+), 55 deletions(-)
>>> create mode 100644 arch/s390/include/asm/kvm-ap.h
>>> create mode 100644 arch/s390/kvm/kvm-ap.c
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 0ec5881..72742d5 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -11875,6 +11875,16 @@ W: http://www.ibm.com/developerworks/linux/linux390/
>>> S: Supported
>>> F: drivers/s390/crypto/
>>>
>>> +S390 VFIO AP DRIVER
>>> +M: Tony Krowiak <[email protected]>
>>> +M: Christian Borntraeger <[email protected]>
>>> +M: Martin Schwidefsky <[email protected]>
>>> +L: [email protected]
>>> +W: http://www.ibm.com/developerworks/linux/linux390/
>>> +S: Supported
>>> +F: arch/s390/include/asm/kvm/kvm-ap.h
>>> +F: arch/s390/kvm/kvm-ap.c
>>> +
>>> S390 ZFCP DRIVER
>>> M: Steffen Maier <[email protected]>
>>> M: Benjamin Block <[email protected]>
>>
>> The MAINTAINERS update belongs in a different patch (e.g., when you introduce
>> drivers/s390/crypto/vfio_ap_drv.c).
> To put it differently: I think the kvm-ap code here is more related to KVM than
> to vfio-ap.
Okay, I'll remove this from here. It looks like it is already covered under
KERNEL VIRTUAL MACHINE for s390 (KVM/s390).
>