2022-04-27 09:39:48

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 00/21] KVM: s390: enable zPCI for interpretive execution

Enable interpretive execution of zPCI instructions + adapter interruption
forwarding for s390x KVM vfio-pci. This is done by triggering a routine
when the VFIO group is associated with the KVM guest, transmitting to
firmware a special token (GISA designation) to enable that specific guest
for interpretive execution on that zPCI device. Load/store interpreation
enablement is then controlled by userspace (based upon whether or not a
SHM bit is placed in the virtual function handle). Adapter Event
Notification interpretation is controlled from userspace via a new KVM
ioctl.

By allowing intepretation of zPCI instructions and firmware delivery of
interrupts to guests, we can reduce the frequency of guest SIE exits for
zPCI.

From the perspective of guest configuration, you passthrough zPCI devices
in the same manner as before, with intepretation support being used by
default if available in kernel+qemu.

Some minor changes are already suggested for the QEMU series, but for
now the v5 one is still compatible:

https://lore.kernel.org/kvm/[email protected]/

Changelog v5->v6:
- Add more R-b tags
- Address a leak of aift during rmmod kvm (Christian) also destroy the
associated mutex at this time
- only need to check nisc match during 2nd insmod of kvm (Pierre)
- Register for kvm notifier in vfio_pci_zdev open_device, use this to
trigger GISA (un)registration instead of via iommu_group_for_each_dev
from virt/kvm/vfio.c (Jason)
- Since we now always enable the interpretation facilities once a
passthrough PCI device is detected, let's just set the ECB during
init vm -- SHM bits will always be used to override load/store
interpretation
- Still using KVM_S390_ZPCI_OP for AEN setup/teardown; we can continue
to discuss alternatives if it makes sense, but wanted to update the
rest of the series so review can continue.

Matthew Rosato (21):
s390/sclp: detect the zPCI load/store interpretation facility
s390/sclp: detect the AISII facility
s390/sclp: detect the AENI facility
s390/sclp: detect the AISI facility
s390/airq: pass more TPI info to airq handlers
s390/airq: allow for airq structure that uses an input vector
s390/pci: externalize the SIC operation controls and routine
s390/pci: stash associated GISA designation
s390/pci: stash dtsm and maxstbl
KVM: s390: pci: add basic kvm_zdev structure
KVM: s390: pci: do initial setup for AEN interpretation
KVM: s390: pci: enable host forwarding of Adapter Event Notifications
KVM: s390: mechanism to enable guest zPCI Interpretation
KVM: s390: pci: provide routines for enabling/disabling interrupt
forwarding
KVM: s390: pci: add routines to start/stop interpretive execution
vfio-pci/zdev: add open/close device hooks
vfio-pci/zdev: add function handle to clp base capability
vfio-pci/zdev: different maxstbl for interpreted devices
KVM: s390: add KVM_S390_ZPCI_OP to manage guest zPCI devices
KVM: s390: introduce CPU feature for zPCI Interpretation
MAINTAINERS: additional files related kvm s390 pci passthrough

Documentation/virt/kvm/api.rst | 45 +++
MAINTAINERS | 1 +
arch/s390/include/asm/airq.h | 7 +-
arch/s390/include/asm/kvm_host.h | 18 +
arch/s390/include/asm/pci.h | 12 +
arch/s390/include/asm/pci_clp.h | 9 +-
arch/s390/include/asm/pci_insn.h | 29 +-
arch/s390/include/asm/sclp.h | 4 +
arch/s390/include/asm/tpi.h | 13 +
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/Makefile | 1 +
arch/s390/kvm/interrupt.c | 95 ++++-
arch/s390/kvm/kvm-s390.c | 87 +++-
arch/s390/kvm/kvm-s390.h | 10 +
arch/s390/kvm/pci.c | 667 +++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 87 ++++
arch/s390/pci/pci.c | 15 +
arch/s390/pci/pci_clp.c | 7 +
arch/s390/pci/pci_insn.c | 4 +-
arch/s390/pci/pci_irq.c | 48 ++-
drivers/s390/char/sclp_early.c | 4 +
drivers/s390/cio/airq.c | 12 +-
drivers/s390/cio/qdio_thinint.c | 6 +-
drivers/s390/crypto/ap_bus.c | 9 +-
drivers/s390/virtio/virtio_ccw.c | 6 +-
drivers/vfio/pci/vfio_pci_core.c | 2 +
drivers/vfio/pci/vfio_pci_zdev.c | 61 ++-
include/linux/vfio_pci_core.h | 10 +
include/uapi/linux/kvm.h | 31 ++
include/uapi/linux/vfio_zdev.h | 7 +
30 files changed, 1256 insertions(+), 52 deletions(-)
create mode 100644 arch/s390/kvm/pci.c
create mode 100644 arch/s390/kvm/pci.h

--
2.27.0


2022-04-27 09:55:40

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 15/21] KVM: s390: pci: add routines to start/stop interpretive execution

These routines will be invoked at the time an s390x vfio-pci device is
associated with a KVM (or when the association is removed), allowing
the zPCI device to enable or disable load/store intepretation mode;
this requires the host zPCI device to inform firmware of the unique
token (GISA designation) that is associated with the owning KVM.

Furthemore, add/remove these devices from a list associated with the
kvm and ensure proper cleanup always occurs during vm exit.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 13 +++
arch/s390/kvm/kvm-s390.c | 9 ++
arch/s390/kvm/pci.c | 158 +++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 5 +
arch/s390/pci/pci.c | 3 +
5 files changed, 188 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 8e381603b6a7..cd58f204305e 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -19,6 +19,7 @@
#include <linux/kvm.h>
#include <linux/seqlock.h>
#include <linux/module.h>
+#include <linux/pci.h>
#include <asm/debug.h>
#include <asm/cpu.h>
#include <asm/fpu/api.h>
@@ -967,6 +968,8 @@ struct kvm_arch{
DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS);
struct kvm_s390_gisa_interrupt gisa_int;
struct kvm_s390_pv pv;
+ struct list_head kzdev_list;
+ spinlock_t kzdev_list_lock;
};

#define KVM_HVA_ERR_BAD (-1UL)
@@ -1017,4 +1020,14 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}

+#ifdef CONFIG_PCI
+int kvm_s390_pci_register_kvm(struct zpci_dev *zdev, struct kvm *kvm);
+#else
+static inline int kvm_s390_pci_register_kvm(struct zpci_dev *dev,
+ struct kvm *kvm)
+{
+ return -EPERM;
+}
+#endif
+
#endif
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 504f312c1c27..bd6c8aeabc24 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2866,6 +2866,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)

kvm_s390_crypto_init(kvm);

+ if (IS_ENABLED(CONFIG_VFIO_PCI)) {
+ mutex_lock(&kvm->lock);
+ kvm_s390_pci_init_list(kvm);
+ kvm_s390_vcpu_pci_enable_interp(kvm);
+ mutex_unlock(&kvm->lock);
+ }
+
mutex_init(&kvm->arch.float_int.ais_lock);
spin_lock_init(&kvm->arch.float_int.lock);
for (i = 0; i < FIRQ_LIST_COUNT; i++)
@@ -2951,6 +2958,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
if (!kvm_is_ucontrol(kvm))
gmap_remove(kvm->arch.gmap);
kvm_s390_destroy_adapters(kvm);
+ if (IS_ENABLED(CONFIG_VFIO_PCI))
+ kvm_s390_pci_clear_list(kvm);
kvm_s390_clear_float_irqs(kvm);
kvm_s390_vsie_destroy(kvm);
KVM_EVENT(3, "vm 0x%pK destroyed", kvm);
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 72a9c6bfae86..2aee7f11db1d 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -12,7 +12,9 @@
#include <asm/pci.h>
#include <asm/pci_insn.h>
#include <asm/pci_io.h>
+#include <asm/sclp.h>
#include "pci.h"
+#include "kvm-s390.h"

struct zpci_aift *aift;

@@ -424,6 +426,162 @@ void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
}
EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);

+static inline int register_kvm(struct zpci_dev *zdev, struct kvm *kvm)
+{
+ int rc;
+
+ if (zdev->kzdev || zdev->gisa != 0)
+ return -EINVAL;
+
+ mutex_lock(&kvm->lock);
+
+ rc = kvm_s390_pci_dev_open(zdev);
+ if (rc)
+ goto err;
+
+ /*
+ * If interpretation facilities aren't available, add the device to
+ * the kzdev list but don't enable for interpretation.
+ */
+ if (!kvm_s390_pci_interp_allowed())
+ goto out;
+
+ /*
+ * If this is the first request to use an interpreted device, make the
+ * necessary vcpu changes
+ */
+ if (!kvm->arch.use_zpci_interp)
+ kvm_s390_vcpu_pci_enable_interp(kvm);
+
+ if (zdev_enabled(zdev)) {
+ rc = zpci_disable_device(zdev);
+ if (rc)
+ goto err;
+ }
+
+ /*
+ * Store information about the identity of the kvm guest allowed to
+ * access this device via interpretation to be used by host CLP
+ */
+ zdev->gisa = (u32)virt_to_phys(&kvm->arch.sie_page2->gisa);
+
+ rc = zpci_enable_device(zdev);
+ if (rc)
+ goto clear_gisa;
+
+ /* Re-register the IOMMU that was already created */
+ rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+ virt_to_phys(zdev->dma_table));
+ if (rc)
+ goto clear_gisa;
+
+out:
+ zdev->kzdev->kvm = kvm;
+
+ spin_lock(&kvm->arch.kzdev_list_lock);
+ list_add_tail(&zdev->kzdev->entry, &kvm->arch.kzdev_list);
+ spin_unlock(&kvm->arch.kzdev_list_lock);
+
+ mutex_unlock(&kvm->lock);
+ return 0;
+
+clear_gisa:
+ zdev->gisa = 0;
+err:
+ if (zdev->kzdev)
+ kvm_s390_pci_dev_release(zdev);
+ mutex_unlock(&kvm->lock);
+ return rc;
+}
+
+static inline int unregister_kvm(struct zpci_dev *zdev)
+{
+ struct kvm *kvm;
+ int rc;
+
+ if (!zdev->kzdev)
+ return -EINVAL;
+
+ kvm = zdev->kzdev->kvm;
+ mutex_lock(&kvm->lock);
+
+ /*
+ * A 0 gisa means interpretation was never enabled, just remove the
+ * device from the list.
+ */
+ if (zdev->gisa == 0)
+ goto out;
+
+ /* Forwarding must be turned off before interpretation */
+ if (zdev->kzdev->fib.fmt0.aibv != 0)
+ kvm_s390_pci_aif_disable(zdev, true);
+
+ /* Remove the host CLP guest designation */
+ zdev->gisa = 0;
+
+ if (zdev_enabled(zdev)) {
+ rc = zpci_disable_device(zdev);
+ if (rc)
+ goto out;
+ }
+
+ rc = zpci_enable_device(zdev);
+ if (rc)
+ goto out;
+
+ /* Re-register the IOMMU that was already created */
+ rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+ virt_to_phys(zdev->dma_table));
+
+out:
+ spin_lock(&kvm->arch.kzdev_list_lock);
+ list_del(&zdev->kzdev->entry);
+ spin_unlock(&kvm->arch.kzdev_list_lock);
+ kvm_s390_pci_dev_release(zdev);
+
+ mutex_unlock(&kvm->lock);
+
+ return rc;
+}
+
+int kvm_s390_pci_register_kvm(struct zpci_dev *zdev, struct kvm *kvm)
+{
+ if (!zdev)
+ return 0;
+
+ /*
+ * Register device with this KVM (or remove the KVM association if 0).
+ * If interpetation facilities are available, enable them and let
+ * userspace indicate whether or not they will be used (specify SHM bit
+ * to disable).
+ */
+ if (kvm)
+ return register_kvm(zdev, kvm);
+ else
+ return unregister_kvm(zdev);
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_register_kvm);
+
+void kvm_s390_pci_init_list(struct kvm *kvm)
+{
+ spin_lock_init(&kvm->arch.kzdev_list_lock);
+ INIT_LIST_HEAD(&kvm->arch.kzdev_list);
+}
+
+void kvm_s390_pci_clear_list(struct kvm *kvm)
+{
+ struct kvm_zdev *tmp, *kzdev;
+ LIST_HEAD(remove);
+
+ spin_lock(&kvm->arch.kzdev_list_lock);
+ list_for_each_entry_safe(kzdev, tmp, &kvm->arch.kzdev_list, entry)
+ list_move_tail(&kzdev->entry, &remove);
+ spin_unlock(&kvm->arch.kzdev_list_lock);
+
+ list_for_each_entry_safe(kzdev, tmp, &remove, entry)
+ unregister_kvm(kzdev->zdev);
+}
+
int kvm_s390_pci_init(void)
{
aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index f567c3189a40..d11662fadb78 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -13,6 +13,7 @@
#include <linux/kvm_host.h>
#include <linux/pci.h>
#include <linux/mutex.h>
+#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include <asm/airq.h>
#include <asm/cpu.h>
@@ -21,6 +22,7 @@ struct kvm_zdev {
struct zpci_dev *zdev;
struct kvm *kvm;
struct zpci_fib fib;
+ struct list_head entry;
};

struct zpci_gaite {
@@ -54,6 +56,9 @@ static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
int kvm_s390_pci_aen_init(u8 nisc);
void kvm_s390_pci_aen_exit(void);

+void kvm_s390_pci_init_list(struct kvm *kvm);
+void kvm_s390_pci_clear_list(struct kvm *kvm);
+
int kvm_s390_pci_init(void);
void kvm_s390_pci_exit(void);

diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index f0a439c43395..d9b021fb84d5 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -132,6 +132,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
return cc;
}
+EXPORT_SYMBOL_GPL(zpci_register_ioat);

/* Modify PCI: Unregister I/O address translation parameters */
int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
@@ -712,6 +713,7 @@ int zpci_enable_device(struct zpci_dev *zdev)
zpci_update_fh(zdev, fh);
return rc;
}
+EXPORT_SYMBOL_GPL(zpci_enable_device);

int zpci_disable_device(struct zpci_dev *zdev)
{
@@ -735,6 +737,7 @@ int zpci_disable_device(struct zpci_dev *zdev)
}
return rc;
}
+EXPORT_SYMBOL_GPL(zpci_disable_device);

/**
* zpci_hot_reset_device - perform a reset of the given zPCI function
--
2.27.0

2022-04-27 09:57:30

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 01/21] s390/sclp: detect the zPCI load/store interpretation facility

Detect the zPCI Load/Store Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index 04cb1e7582a6..06c0b5eb8ef5 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -87,6 +87,7 @@ struct sclp_info {
unsigned char has_diag318 : 1;
unsigned char has_sipl : 1;
unsigned char has_dirq : 1;
+ unsigned char has_zpci_lsi : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index e9943a86c361..b88dd0da1231 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_gisaf = !!(sccb->fac118 & 0x08);
sclp.has_hvs = !!(sccb->fac119 & 0x80);
sclp.has_kss = !!(sccb->fac98 & 0x01);
+ sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
if (sccb->fac91 & 0x40)
--
2.27.0

2022-04-27 09:59:28

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 10/21] KVM: s390: pci: add basic kvm_zdev structure

This structure will be used to carry kvm passthrough information related to
zPCI devices.

Reviewed-by: Niklas Schnelle <[email protected]>
Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 3 +++
arch/s390/kvm/Makefile | 1 +
arch/s390/kvm/pci.c | 38 +++++++++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 21 ++++++++++++++++++++
4 files changed, 63 insertions(+)
create mode 100644 arch/s390/kvm/pci.c
create mode 100644 arch/s390/kvm/pci.h

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 4c5b8fbc2079..50f1851edfbe 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -97,6 +97,7 @@ struct zpci_bar_struct {
};

struct s390_domain;
+struct kvm_zdev;

#define ZPCI_FUNCTIONS_PER_BUS 256
struct zpci_bus {
@@ -189,7 +190,9 @@ struct zpci_dev {

struct dentry *debugfs_dev;

+ /* IOMMU and passthrough */
struct s390_domain *s390_domain; /* s390 IOMMU domain data */
+ struct kvm_zdev *kzdev;
};

static inline bool zdev_enabled(struct zpci_dev *zdev)
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 26f4a74e5ce4..00cf6853d93f 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -10,4 +10,5 @@ ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
kvm-y += kvm-s390.o intercept.o interrupt.o priv.o sigp.o
kvm-y += diag.o gaccess.o guestdbg.o vsie.o pv.o

+kvm-$(CONFIG_PCI) += pci.o
obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
new file mode 100644
index 000000000000..213be236c05a
--- /dev/null
+++ b/arch/s390/kvm/pci.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * s390 kvm PCI passthrough support
+ *
+ * Copyright IBM Corp. 2022
+ *
+ * Author(s): Matthew Rosato <[email protected]>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/pci.h>
+#include "pci.h"
+
+int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
+{
+ struct kvm_zdev *kzdev;
+
+ kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
+ if (!kzdev)
+ return -ENOMEM;
+
+ kzdev->zdev = zdev;
+ zdev->kzdev = kzdev;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
+
+void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
+{
+ struct kvm_zdev *kzdev;
+
+ kzdev = zdev->kzdev;
+ WARN_ON(kzdev->zdev != zdev);
+ zdev->kzdev = 0;
+ kfree(kzdev);
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
new file mode 100644
index 000000000000..ce93978e8913
--- /dev/null
+++ b/arch/s390/kvm/pci.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * s390 kvm PCI passthrough support
+ *
+ * Copyright IBM Corp. 2022
+ *
+ * Author(s): Matthew Rosato <[email protected]>
+ */
+
+#ifndef __KVM_S390_PCI_H
+#define __KVM_S390_PCI_H
+
+#include <linux/kvm_host.h>
+#include <linux/pci.h>
+
+struct kvm_zdev {
+ struct zpci_dev *zdev;
+ struct kvm *kvm;
+};
+
+#endif /* __KVM_S390_PCI_H */
--
2.27.0

2022-04-27 10:11:18

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 17/21] vfio-pci/zdev: add function handle to clp base capability

The function handle is a system-wide unique identifier for a zPCI
device. With zPCI instruction interpretation, the host will no
longer be executing the zPCI instructions on behalf of the guest.
As a result, the guest needs to use the real function handle in
order for firmware to associate the instruction with the proper
PCI function. Let's provide that handle to the guest.

Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
drivers/vfio/pci/vfio_pci_zdev.c | 5 +++--
include/uapi/linux/vfio_zdev.h | 3 +++
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 112c1f820f8e..00183e8aa393 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -24,14 +24,15 @@ static int zpci_base_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
{
struct vfio_device_info_cap_zpci_base cap = {
.header.id = VFIO_DEVICE_INFO_CAP_ZPCI_BASE,
- .header.version = 1,
+ .header.version = 2,
.start_dma = zdev->start_dma,
.end_dma = zdev->end_dma,
.pchid = zdev->pchid,
.vfn = zdev->vfn,
.fmb_length = zdev->fmb_length,
.pft = zdev->pft,
- .gid = zdev->pfgid
+ .gid = zdev->pfgid,
+ .fh = zdev->fh
};

return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index b4309397b6b2..78c022af3d29 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -29,6 +29,9 @@ struct vfio_device_info_cap_zpci_base {
__u16 fmb_length; /* Measurement Block Length (in bytes) */
__u8 pft; /* PCI Function Type */
__u8 gid; /* PCI function group ID */
+ /* End of version 1 */
+ __u32 fh; /* PCI function handle */
+ /* End of version 2 */
};

/**
--
2.27.0

2022-04-27 10:11:57

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 19/21] KVM: s390: add KVM_S390_ZPCI_OP to manage guest zPCI devices

The KVM_S390_ZPCI_OP ioctl provides a mechanism for managing
hardware-assisted virtualization features for s390X zPCI passthrough.
Add the first 2 operations, which can be used to enable/disable
the specified device for Adapter Event Notification interpretation.

Signed-off-by: Matthew Rosato <[email protected]>
---
Documentation/virt/kvm/api.rst | 45 +++++++++++++++++++++++
arch/s390/kvm/kvm-s390.c | 23 ++++++++++++
arch/s390/kvm/pci.c | 65 ++++++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 2 ++
include/uapi/linux/kvm.h | 31 ++++++++++++++++
5 files changed, 166 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 85c7abc51af5..c5689df07d71 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5645,6 +5645,51 @@ enabled with ``arch_prctl()``, but this may change in the future.
The offsets of the state save areas in struct kvm_xsave follow the contents
of CPUID leaf 0xD on the host.

+4.135 KVM_S390_ZPCI_OP
+--------------------
+
+:Capability: KVM_CAP_S390_ZPCI_OP
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_zpci_op (in)
+:Returns: 0 on success, <0 on error
+
+Used to manage hardware-assisted virtualization features for zPCI devices.
+
+Parameters are specified via the following structure::
+
+ struct kvm_s390_zpci_op {
+ /* in */
+ __u32 fh; /* target device */
+ __u8 op; /* operation to perform */
+ __u8 pad[3];
+ union {
+ /* for KVM_S390_ZPCIOP_REG_AEN */
+ struct {
+ __u64 ibv; /* Guest addr of interrupt bit vector */
+ __u64 sb; /* Guest addr of summary bit */
+ __u32 flags;
+ __u32 noi; /* Number of interrupts */
+ __u8 isc; /* Guest interrupt subclass */
+ __u8 sbo; /* Offset of guest summary bit vector */
+ __u16 pad;
+ } reg_aen;
+ __u64 reserved[8];
+ } u;
+ };
+
+The type of operation is specified in the "op" field.
+KVM_S390_ZPCIOP_REG_AEN is used to register the VM for adapter event
+notification interpretation, which will allow firmware delivery of adapter
+events directly to the vm, with KVM providing a backup delivery mechanism;
+KVM_S390_ZPCIOP_DEREG_AEN is used to subsequently disable interpretation of
+adapter event notifications.
+
+The target zPCI function must also be specified via the "fh" field. For the
+KVM_S390_ZPCIOP_REG_AEN operation, additional information to establish firmware
+delivery must be provided via the "reg_aen" struct.
+
+The "reserved" field is meant for future extensions.

5. The kvm_run structure
========================
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index bd6c8aeabc24..69e435b06425 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -618,6 +618,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_S390_PROTECTED:
r = is_prot_virt_host();
break;
+ case KVM_CAP_S390_ZPCI_OP:
+ if (kvm_s390_pci_interp_allowed())
+ r = 1;
+ else
+ r = 0;
+ break;
default:
r = 0;
}
@@ -2623,6 +2629,23 @@ long kvm_arch_vm_ioctl(struct file *filp,
r = -EFAULT;
break;
}
+ case KVM_S390_ZPCI_OP: {
+ struct kvm_s390_zpci_op args;
+
+ r = -EINVAL;
+ if (!IS_ENABLED(CONFIG_VFIO_PCI))
+ break;
+ if (copy_from_user(&args, argp, sizeof(args))) {
+ r = -EFAULT;
+ break;
+ }
+ r = kvm_s390_pci_zpci_op(kvm, &args);
+ if (r)
+ break;
+ if (copy_to_user(argp, &args, sizeof(args)))
+ r = -EFAULT;
+ break;
+ }
default:
r = -ENOTTY;
}
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 2aee7f11db1d..fdf8a0f8dcd3 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -582,6 +582,71 @@ void kvm_s390_pci_clear_list(struct kvm *kvm)
unregister_kvm(kzdev->zdev);
}

+static struct kvm_zdev *get_kzdev_by_fh(struct kvm *kvm, u32 fh)
+{
+ struct kvm_zdev *kzdev, *retval = NULL;
+
+ spin_lock(&kvm->arch.kzdev_list_lock);
+ list_for_each_entry(kzdev, &kvm->arch.kzdev_list, entry) {
+ if (kzdev->zdev->fh == fh) {
+ retval = kzdev;
+ break;
+ }
+ }
+ spin_unlock(&kvm->arch.kzdev_list_lock);
+
+ return retval;
+}
+
+static int kvm_s390_pci_zpci_reg_aen(struct zpci_dev *zdev,
+ struct kvm_s390_zpci_op *args)
+{
+ struct zpci_fib fib = {};
+
+ fib.fmt0.aibv = args->u.reg_aen.ibv;
+ fib.fmt0.isc = args->u.reg_aen.isc;
+ fib.fmt0.noi = args->u.reg_aen.noi;
+ if (args->u.reg_aen.sb != 0) {
+ fib.fmt0.aisb = args->u.reg_aen.sb;
+ fib.fmt0.aisbo = args->u.reg_aen.sbo;
+ fib.fmt0.sum = 1;
+ } else {
+ fib.fmt0.aisb = 0;
+ fib.fmt0.aisbo = 0;
+ fib.fmt0.sum = 0;
+ }
+
+ if (args->u.reg_aen.flags & KVM_S390_ZPCIOP_REGAEN_HOST)
+ return kvm_s390_pci_aif_enable(zdev, &fib, true);
+ else
+ return kvm_s390_pci_aif_enable(zdev, &fib, false);
+}
+
+int kvm_s390_pci_zpci_op(struct kvm *kvm, struct kvm_s390_zpci_op *args)
+{
+ struct kvm_zdev *kzdev;
+ struct zpci_dev *zdev;
+ int r;
+
+ kzdev = get_kzdev_by_fh(kvm, args->fh);
+ if (!kzdev)
+ return -ENODEV;
+ zdev = kzdev->zdev;
+
+ switch (args->op) {
+ case KVM_S390_ZPCIOP_REG_AEN:
+ r = kvm_s390_pci_zpci_reg_aen(zdev, args);
+ break;
+ case KVM_S390_ZPCIOP_DEREG_AEN:
+ r = kvm_s390_pci_aif_disable(zdev, false);
+ break;
+ default:
+ r = -EINVAL;
+ }
+
+ return r;
+}
+
int kvm_s390_pci_init(void)
{
aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index d11662fadb78..f099c790a15a 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -59,6 +59,8 @@ void kvm_s390_pci_aen_exit(void);
void kvm_s390_pci_init_list(struct kvm *kvm);
void kvm_s390_pci_clear_list(struct kvm *kvm);

+int kvm_s390_pci_zpci_op(struct kvm *kvm, struct kvm_s390_zpci_op *args);
+
int kvm_s390_pci_init(void);
void kvm_s390_pci_exit(void);

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 91a6fe4e02c0..014adb30d42c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1144,6 +1144,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_S390_MEM_OP_EXTENSION 211
#define KVM_CAP_PMU_CAPABILITY 212
#define KVM_CAP_DISABLE_QUIRKS2 213
+#define KVM_CAP_S390_ZPCI_OP 214

#ifdef KVM_CAP_IRQ_ROUTING

@@ -2060,4 +2061,34 @@ struct kvm_stats_desc {
/* Available with KVM_CAP_XSAVE2 */
#define KVM_GET_XSAVE2 _IOR(KVMIO, 0xcf, struct kvm_xsave)

+/* Available with KVM_CAP_S390_ZPCI_OP */
+#define KVM_S390_ZPCI_OP _IOW(KVMIO, 0xd0, struct kvm_s390_zpci_op)
+
+struct kvm_s390_zpci_op {
+ /* in */
+ __u32 fh; /* target device */
+ __u8 op; /* operation to perform */
+ __u8 pad[3];
+ union {
+ /* for KVM_S390_ZPCIOP_REG_AEN */
+ struct {
+ __u64 ibv; /* Guest addr of interrupt bit vector */
+ __u64 sb; /* Guest addr of summary bit */
+ __u32 flags;
+ __u32 noi; /* Number of interrupts */
+ __u8 isc; /* Guest interrupt subclass */
+ __u8 sbo; /* Offset of guest summary bit vector */
+ __u16 pad;
+ } reg_aen;
+ __u64 reserved[8];
+ } u;
+};
+
+/* types for kvm_s390_zpci_op->op */
+#define KVM_S390_ZPCIOP_REG_AEN 0
+#define KVM_S390_ZPCIOP_DEREG_AEN 1
+
+/* flags for kvm_s390_zpci_op->u.reg_aen.flags */
+#define KVM_S390_ZPCIOP_REGAEN_HOST (1 << 0)
+
#endif /* __LINUX_KVM_H */
--
2.27.0

2022-04-27 10:12:40

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 06/21] s390/airq: allow for airq structure that uses an input vector

When doing device passthrough where interrupts are being forwarded from
host to guest, we wish to use a pinned section of guest memory as the
vector (the same memory used by the guest as the vector). To accomplish
this, add a new parameter for airq_iv_create which allows passing an
existing vector to be used instead of allocating a new one. The caller
is responsible for ensuring the vector is pinned in memory as well as for
unpinning the memory when the vector is no longer needed.

A subsequent patch will use this new parameter for zPCI interpretation.

Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
Acked-by: Cornelia Huck <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/airq.h | 4 +++-
arch/s390/pci/pci_irq.c | 8 ++++----
drivers/s390/cio/airq.c | 10 +++++++---
drivers/s390/virtio/virtio_ccw.c | 2 +-
4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 7918a7d09028..e82e5626e139 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -47,8 +47,10 @@ struct airq_iv {
#define AIRQ_IV_PTR 4 /* Allocate the ptr array */
#define AIRQ_IV_DATA 8 /* Allocate the data array */
#define AIRQ_IV_CACHELINE 16 /* Cacheline alignment for the vector */
+#define AIRQ_IV_GUESTVEC 32 /* Vector is a pinned guest page */

-struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags);
+struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
+ unsigned long *vec);
void airq_iv_release(struct airq_iv *iv);
unsigned long airq_iv_alloc(struct airq_iv *iv, unsigned long num);
void airq_iv_free(struct airq_iv *iv, unsigned long bit, unsigned long num);
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index b805c75252ed..87c7d121c255 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -296,7 +296,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
zdev->aisb = bit;

/* Create adapter interrupt vector */
- zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK);
+ zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK, NULL);
if (!zdev->aibv)
return -ENOMEM;

@@ -419,7 +419,7 @@ static int __init zpci_directed_irq_init(void)
union zpci_sic_iib iib = {{0}};
unsigned int cpu;

- zpci_sbv = airq_iv_create(num_possible_cpus(), 0);
+ zpci_sbv = airq_iv_create(num_possible_cpus(), 0, NULL);
if (!zpci_sbv)
return -ENOMEM;

@@ -441,7 +441,7 @@ static int __init zpci_directed_irq_init(void)
zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
AIRQ_IV_DATA |
AIRQ_IV_CACHELINE |
- (!cpu ? AIRQ_IV_ALLOC : 0));
+ (!cpu ? AIRQ_IV_ALLOC : 0), NULL);
if (!zpci_ibv[cpu])
return -ENOMEM;
}
@@ -458,7 +458,7 @@ static int __init zpci_floating_irq_init(void)
if (!zpci_ibv)
return -ENOMEM;

- zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC);
+ zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, NULL);
if (!zpci_sbv)
goto out_free;

diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index 230eb5b5b64e..34967e67249e 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -122,10 +122,12 @@ static inline unsigned long iv_size(unsigned long bits)
* airq_iv_create - create an interrupt vector
* @bits: number of bits in the interrupt vector
* @flags: allocation flags
+ * @vec: pointer to pinned guest memory if AIRQ_IV_GUESTVEC
*
* Returns a pointer to an interrupt vector structure
*/
-struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
+struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
+ unsigned long *vec)
{
struct airq_iv *iv;
unsigned long size;
@@ -146,6 +148,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
&iv->vector_dma);
if (!iv->vector)
goto out_free;
+ } else if (flags & AIRQ_IV_GUESTVEC) {
+ iv->vector = vec;
} else {
iv->vector = cio_dma_zalloc(size);
if (!iv->vector)
@@ -185,7 +189,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
kfree(iv->avail);
if (iv->flags & AIRQ_IV_CACHELINE && iv->vector)
dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
- else
+ else if (!(iv->flags & AIRQ_IV_GUESTVEC))
cio_dma_free(iv->vector, size);
kfree(iv);
out:
@@ -204,7 +208,7 @@ void airq_iv_release(struct airq_iv *iv)
kfree(iv->bitlock);
if (iv->flags & AIRQ_IV_CACHELINE)
dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
- else
+ else if (!(iv->flags & AIRQ_IV_GUESTVEC))
cio_dma_free(iv->vector, iv_size(iv->bits));
kfree(iv->avail);
kfree(iv);
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 52c376d15978..410498d693f8 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -241,7 +241,7 @@ static struct airq_info *new_airq_info(int index)
return NULL;
rwlock_init(&info->lock);
info->aiv = airq_iv_create(VIRTIO_IV_BITS, AIRQ_IV_ALLOC | AIRQ_IV_PTR
- | AIRQ_IV_CACHELINE);
+ | AIRQ_IV_CACHELINE, NULL);
if (!info->aiv) {
kfree(info);
return NULL;
--
2.27.0

2022-04-27 10:14:43

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 18/21] vfio-pci/zdev: different maxstbl for interpreted devices

When doing load/store interpretation, the maximum store block length is
determined by the underlying firmware, not the host kernel API. Reflect
that in the associated Query PCI Function Group clp capability and let
userspace decide which is appropriate to present to the guest.

Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
drivers/vfio/pci/vfio_pci_zdev.c | 6 ++++--
include/uapi/linux/vfio_zdev.h | 4 ++++
2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 00183e8aa393..ea20d4d36eef 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -45,14 +45,16 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
{
struct vfio_device_info_cap_zpci_group cap = {
.header.id = VFIO_DEVICE_INFO_CAP_ZPCI_GROUP,
- .header.version = 1,
+ .header.version = 2,
.dasm = zdev->dma_mask,
.msi_addr = zdev->msi_addr,
.flags = VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH,
.mui = zdev->fmb_update,
.noi = zdev->max_msi,
.maxstbl = ZPCI_MAX_WRITE_SIZE,
- .version = zdev->version
+ .version = zdev->version,
+ .reserved = 0,
+ .imaxstbl = zdev->maxstbl
};

return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index 78c022af3d29..77f2aff1f27e 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -50,6 +50,10 @@ struct vfio_device_info_cap_zpci_group {
__u16 noi; /* Maximum number of MSIs */
__u16 maxstbl; /* Maximum Store Block Length */
__u8 version; /* Supported PCI Version */
+ /* End of version 1 */
+ __u8 reserved;
+ __u16 imaxstbl; /* Maximum Interpreted Store Block Length */
+ /* End of version 2 */
};

/**
--
2.27.0

2022-04-27 10:22:02

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 13/21] KVM: s390: mechanism to enable guest zPCI Interpretation

The guest must have access to certain facilities in order to allow
interpretive execution of zPCI instructions and adapter event
notifications. However, there are some cases where a guest might
disable interpretation -- provide a mechanism via which we can defer
enabling the associated zPCI interpretation facilities until the guest
indicates it wishes to use them.

Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 4 ++++
arch/s390/kvm/kvm-s390.c | 37 ++++++++++++++++++++++++++++++++
arch/s390/kvm/kvm-s390.h | 10 +++++++++
3 files changed, 51 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index c1518a505060..8e381603b6a7 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -254,7 +254,10 @@ struct kvm_s390_sie_block {
#define ECB2_IEP 0x20
#define ECB2_PFMFI 0x08
#define ECB2_ESCA 0x04
+#define ECB2_ZPCI_LSI 0x02
__u8 ecb2; /* 0x0062 */
+#define ECB3_AISI 0x20
+#define ECB3_AISII 0x10
#define ECB3_DEA 0x08
#define ECB3_AES 0x04
#define ECB3_RI 0x01
@@ -940,6 +943,7 @@ struct kvm_arch{
int use_cmma;
int use_pfmfi;
int use_skf;
+ int use_zpci_interp;
int user_cpu_state_ctrl;
int user_sigp;
int user_stsi;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 856f55ffd98c..504f312c1c27 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1031,6 +1031,41 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
return 0;
}

+static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
+{
+ /* Only set the ECB bits after guest requests zPCI interpretation */
+ if (!vcpu->kvm->arch.use_zpci_interp)
+ return;
+
+ vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
+ vcpu->arch.sie_block->ecb3 |= ECB3_AISII + ECB3_AISI;
+}
+
+/* Must be called with kvm->lock held */
+void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
+{
+ struct kvm_vcpu *vcpu;
+ unsigned long i;
+
+ /*
+ * If host is configured for PCI and the necessary facilities are
+ * available, turn on interpretation for the life of this guest
+ */
+ if (!kvm_s390_pci_interp_allowed())
+ return;
+
+ kvm->arch.use_zpci_interp = 1;
+
+ kvm_s390_vcpu_block_all(kvm);
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ kvm_s390_vcpu_pci_setup(vcpu);
+ kvm_s390_sync_request(KVM_REQ_VSIE_RESTART, vcpu);
+ }
+
+ kvm_s390_vcpu_unblock_all(kvm);
+}
+
static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req)
{
unsigned long cx;
@@ -3331,6 +3366,8 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)

kvm_s390_vcpu_crypto_setup(vcpu);

+ kvm_s390_vcpu_pci_setup(vcpu);
+
mutex_lock(&vcpu->kvm->lock);
if (kvm_s390_pv_is_protected(vcpu->kvm)) {
rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 497d52a83c78..975cffaf13de 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -507,6 +507,16 @@ void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
*/
void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);

+/**
+ * kvm_s390_vcpu_pci_enable_interp
+ *
+ * Set the associated PCI attributes for each vcpu to allow for zPCI Load/Store
+ * interpretation as well as adapter interruption forwarding.
+ *
+ * @kvm: the KVM guest
+ */
+void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm);
+
/**
* diag9c_forwarding_hz
*
--
2.27.0

2022-04-27 10:30:49

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 11/21] KVM: s390: pci: do initial setup for AEN interpretation

Initial setup for Adapter Event Notification Interpretation for zPCI
passthrough devices. Specifically, allocate a structure for forwarding of
adapter events and pass the address of this structure to firmware.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 4 +
arch/s390/include/asm/pci_insn.h | 12 +++
arch/s390/kvm/interrupt.c | 14 +++
arch/s390/kvm/kvm-s390.c | 11 +++
arch/s390/kvm/pci.c | 159 +++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 48 ++++++++++
arch/s390/pci/pci.c | 6 ++
7 files changed, 254 insertions(+)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 50f1851edfbe..322060a75d9f 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -9,6 +9,7 @@
#include <asm-generic/pci.h>
#include <asm/pci_clp.h>
#include <asm/pci_debug.h>
+#include <asm/pci_insn.h>
#include <asm/sclp.h>

#define PCIBIOS_MIN_IO 0x1000
@@ -204,6 +205,9 @@ extern const struct attribute_group *zpci_attr_groups[];
extern unsigned int s390_pci_force_floating __initdata;
extern unsigned int s390_pci_no_rid;

+extern union zpci_sic_iib *zpci_aipb;
+extern struct airq_iv *zpci_aif_sbv;
+
/* -----------------------------------------------------------------------------
Prototypes
----------------------------------------------------------------------------- */
diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 5331082fa516..e5f57cfe1d45 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -101,6 +101,7 @@ struct zpci_fib {
/* Set Interruption Controls Operation Controls */
#define SIC_IRQ_MODE_ALL 0
#define SIC_IRQ_MODE_SINGLE 1
+#define SIC_SET_AENI_CONTROLS 2
#define SIC_IRQ_MODE_DIRECT 4
#define SIC_IRQ_MODE_D_ALL 16
#define SIC_IRQ_MODE_D_SINGLE 17
@@ -127,9 +128,20 @@ struct zpci_cdiib {
u64 : 64;
} __packed __aligned(8);

+/* adapter interruption parameters block */
+struct zpci_aipb {
+ u64 faisb;
+ u64 gait;
+ u16 : 13;
+ u16 afi : 3;
+ u32 : 32;
+ u16 faal;
+} __packed __aligned(8);
+
union zpci_sic_iib {
struct zpci_diib diib;
struct zpci_cdiib cdiib;
+ struct zpci_aipb aipb;
};

DECLARE_STATIC_KEY_FALSE(have_mio);
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 17ff475157d8..37ff4358121a 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -32,6 +32,7 @@
#include "kvm-s390.h"
#include "gaccess.h"
#include "trace-s390.h"
+#include "pci.h"

#define PFAULT_INIT 0x0600
#define PFAULT_DONE 0x0680
@@ -3328,6 +3329,11 @@ void kvm_s390_gib_destroy(void)
{
if (!gib)
return;
+ if (kvm_s390_pci_interp_allowed() && aift) {
+ mutex_lock(&aift->aift_lock);
+ kvm_s390_pci_aen_exit();
+ mutex_unlock(&aift->aift_lock);
+ }
chsc_sgib(0);
unregister_adapter_interrupt(&gib_alert_irq);
free_page((unsigned long)gib);
@@ -3365,6 +3371,14 @@ int kvm_s390_gib_init(u8 nisc)
goto out_unreg_gal;
}

+ if (kvm_s390_pci_interp_allowed()) {
+ if (kvm_s390_pci_aen_init(nisc)) {
+ pr_err("Initializing AEN for PCI failed\n");
+ rc = -EIO;
+ goto out_unreg_gal;
+ }
+ }
+
KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
goto out;

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index da3dabda1a12..20882b3c08a5 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -47,6 +47,7 @@
#include <asm/fpu/api.h>
#include "kvm-s390.h"
#include "gaccess.h"
+#include "pci.h"

#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -502,6 +503,14 @@ int kvm_arch_init(void *opaque)
goto out;
}

+ if (kvm_s390_pci_interp_allowed()) {
+ rc = kvm_s390_pci_init();
+ if (rc) {
+ pr_err("Unable to allocate AIFT for PCI\n");
+ goto out;
+ }
+ }
+
rc = kvm_s390_gib_init(GAL_ISC);
if (rc)
goto out;
@@ -516,6 +525,8 @@ int kvm_arch_init(void *opaque)
void kvm_arch_exit(void)
{
kvm_s390_gib_destroy();
+ if (kvm_s390_pci_interp_allowed())
+ kvm_s390_pci_exit();
debug_unregister(kvm_s390_dbf);
debug_unregister(kvm_s390_dbf_uv);
}
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 213be236c05a..6e6636c20fa2 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -9,8 +9,148 @@

#include <linux/kvm_host.h>
#include <linux/pci.h>
+#include <asm/pci.h>
+#include <asm/pci_insn.h>
#include "pci.h"

+struct zpci_aift *aift;
+
+static inline int __set_irq_noiib(u16 ctl, u8 isc)
+{
+ union zpci_sic_iib iib = {{0}};
+
+ return zpci_set_irq_ctrl(ctl, isc, &iib);
+}
+
+/* Caller must hold the aift lock before calling this function */
+void kvm_s390_pci_aen_exit(void)
+{
+ unsigned long flags;
+ struct kvm_zdev **gait_kzdev;
+
+ /*
+ * Contents of the aipb remain registered for the life of the host
+ * kernel, the information preserved in zpci_aipb and zpci_aif_sbv
+ * in case we insert the KVM module again later. Clear the AIFT
+ * information and free anything not registered with underlying
+ * firmware.
+ */
+ spin_lock_irqsave(&aift->gait_lock, flags);
+ gait_kzdev = aift->kzdev;
+ aift->gait = 0;
+ aift->sbv = 0;
+ aift->kzdev = 0;
+ spin_unlock_irqrestore(&aift->gait_lock, flags);
+
+ kfree(gait_kzdev);
+}
+
+static int zpci_setup_aipb(u8 nisc)
+{
+ struct page *page;
+ int size, rc;
+
+ zpci_aipb = kzalloc(sizeof(union zpci_sic_iib), GFP_KERNEL);
+ if (!zpci_aipb)
+ return -ENOMEM;
+
+ aift->sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
+ if (!aift->sbv) {
+ rc = -ENOMEM;
+ goto free_aipb;
+ }
+ zpci_aif_sbv = aift->sbv;
+ size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
+ sizeof(struct zpci_gaite)));
+ page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
+ if (!page) {
+ rc = -ENOMEM;
+ goto free_sbv;
+ }
+ aift->gait = (struct zpci_gaite *)page_to_phys(page);
+
+ zpci_aipb->aipb.faisb = virt_to_phys(aift->sbv->vector);
+ zpci_aipb->aipb.gait = virt_to_phys(aift->gait);
+ zpci_aipb->aipb.afi = nisc;
+ zpci_aipb->aipb.faal = ZPCI_NR_DEVICES;
+
+ /* Setup Adapter Event Notification Interpretation */
+ if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, zpci_aipb)) {
+ rc = -EIO;
+ goto free_gait;
+ }
+
+ return 0;
+
+free_gait:
+ free_pages((unsigned long)aift->gait, size);
+free_sbv:
+ airq_iv_release(aift->sbv);
+ zpci_aif_sbv = 0;
+free_aipb:
+ kfree(zpci_aipb);
+ zpci_aipb = 0;
+
+ return rc;
+}
+
+static int zpci_reset_aipb(u8 nisc)
+{
+ /*
+ * AEN registration can only happen once per system boot. If
+ * an aipb already exists then AEN was already registered and
+ * we can re-use the aipb contents. This can only happen if
+ * the KVM module was removed and re-inserted. However, we must
+ * ensure that the same forwarding ISC is used as this is assigned
+ * during KVM module load.
+ */
+ if (zpci_aipb->aipb.afi != nisc)
+ return -EINVAL;
+
+ aift->sbv = zpci_aif_sbv;
+ aift->gait = (struct zpci_gaite *)zpci_aipb->aipb.gait;
+
+ return 0;
+}
+
+int kvm_s390_pci_aen_init(u8 nisc)
+{
+ int rc = 0;
+
+ /* If already enabled for AEN, bail out now */
+ if (aift->gait || aift->sbv)
+ return -EPERM;
+
+ mutex_lock(&aift->aift_lock);
+ aift->kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev),
+ GFP_KERNEL);
+ if (!aift->kzdev) {
+ rc = -ENOMEM;
+ goto unlock;
+ }
+
+ if (!zpci_aipb)
+ rc = zpci_setup_aipb(nisc);
+ else
+ rc = zpci_reset_aipb(nisc);
+ if (rc)
+ goto free_zdev;
+
+ /* Enable floating IRQs */
+ if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
+ rc = -EIO;
+ kvm_s390_pci_aen_exit();
+ }
+
+ goto unlock;
+
+free_zdev:
+ kfree(aift->kzdev);
+unlock:
+ mutex_unlock(&aift->aift_lock);
+ return rc;
+}
+
int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
{
struct kvm_zdev *kzdev;
@@ -36,3 +176,22 @@ void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
kfree(kzdev);
}
EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
+
+int kvm_s390_pci_init(void)
+{
+ aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
+ if (!aift)
+ return -ENOMEM;
+
+ spin_lock_init(&aift->gait_lock);
+ mutex_init(&aift->aift_lock);
+
+ return 0;
+}
+
+void kvm_s390_pci_exit(void)
+{
+ mutex_destroy(&aift->aift_lock);
+
+ kfree(aift);
+}
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index ce93978e8913..1d8f11e442f6 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -12,10 +12,58 @@

#include <linux/kvm_host.h>
#include <linux/pci.h>
+#include <linux/mutex.h>
+#include <asm/airq.h>
+#include <asm/cpu.h>

struct kvm_zdev {
struct zpci_dev *zdev;
struct kvm *kvm;
};

+struct zpci_gaite {
+ u32 gisa;
+ u8 gisc;
+ u8 count;
+ u8 reserved;
+ u8 aisbo;
+ u64 aisb;
+};
+
+struct zpci_aift {
+ struct zpci_gaite *gait;
+ struct airq_iv *sbv;
+ struct kvm_zdev **kzdev;
+ spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
+ struct mutex aift_lock; /* Protects the other structures in aift */
+};
+
+extern struct zpci_aift *aift;
+
+int kvm_s390_pci_aen_init(u8 nisc);
+void kvm_s390_pci_aen_exit(void);
+
+int kvm_s390_pci_init(void);
+void kvm_s390_pci_exit(void);
+
+static inline bool kvm_s390_pci_interp_allowed(void)
+{
+ struct cpuid cpu_id;
+
+ get_cpu_id(&cpu_id);
+ switch (cpu_id.machine) {
+ case 0x2817:
+ case 0x2818:
+ case 0x2827:
+ case 0x2828:
+ case 0x2964:
+ case 0x2965:
+ /* No SHM on certain machines */
+ return false;
+ default:
+ return (IS_ENABLED(CONFIG_VFIO_PCI) && sclp.has_zpci_lsi &&
+ sclp.has_aeni && sclp.has_aisi && sclp.has_aisii);
+ }
+}
+
#endif /* __KVM_S390_PCI_H */
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index a86cd1cbb80e..f0a439c43395 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -61,6 +61,12 @@ DEFINE_STATIC_KEY_FALSE(have_mio);

static struct kmem_cache *zdev_fmb_cache;

+/* AEN structures that must be preserved over KVM module re-insertion */
+union zpci_sic_iib *zpci_aipb;
+EXPORT_SYMBOL_GPL(zpci_aipb);
+struct airq_iv *zpci_aif_sbv;
+EXPORT_SYMBOL_GPL(zpci_aif_sbv);
+
struct zpci_dev *get_zdev_by_fid(u32 fid)
{
struct zpci_dev *tmp, *zdev = NULL;
--
2.27.0

2022-04-27 10:39:49

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 02/21] s390/sclp: detect the AISII facility

Detect the Adapter Interruption Source ID Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index 06c0b5eb8ef5..be14644fe113 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -88,6 +88,7 @@ struct sclp_info {
unsigned char has_sipl : 1;
unsigned char has_dirq : 1;
unsigned char has_zpci_lsi : 1;
+ unsigned char has_aisii : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index b88dd0da1231..29fee179e197 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_gisaf = !!(sccb->fac118 & 0x08);
sclp.has_hvs = !!(sccb->fac119 & 0x80);
sclp.has_kss = !!(sccb->fac98 & 0x01);
+ sclp.has_aisii = !!(sccb->fac118 & 0x40);
sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
--
2.27.0

2022-04-27 10:48:54

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 07/21] s390/pci: externalize the SIC operation controls and routine

A subsequent patch will be issuing SIC from KVM -- export the necessary
routine and make the operation control definitions available from a header.
Because the routine will now be exported, let's rename __zpci_set_irq_ctrl
to zpci_set_irq_ctrl and get rid of the zero'd iib wrapper function of
the same name.

Reviewed-by: Niklas Schnelle <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
arch/s390/pci/pci_insn.c | 3 ++-
arch/s390/pci/pci_irq.c | 26 ++++++++++++--------------
3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 61cf9531f68f..5331082fa516 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -98,6 +98,14 @@ struct zpci_fib {
u32 gd;
} __packed __aligned(8);

+/* Set Interruption Controls Operation Controls */
+#define SIC_IRQ_MODE_ALL 0
+#define SIC_IRQ_MODE_SINGLE 1
+#define SIC_IRQ_MODE_DIRECT 4
+#define SIC_IRQ_MODE_D_ALL 16
+#define SIC_IRQ_MODE_D_SINGLE 17
+#define SIC_IRQ_MODE_SET_CPU 18
+
/* directed interruption information block */
struct zpci_diib {
u32 : 1;
@@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
int __zpci_store_block(const u64 *data, u64 req, u64 offset);
void zpci_barrier(void);
-int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
-
-static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
-{
- union zpci_sic_iib iib = {{0}};
-
- return __zpci_set_irq_ctrl(ctl, isc, &iib);
-}
+int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);

#endif
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 1710d006ee93..4c6967b73932 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -98,7 +98,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
}

/* Set Interruption Controls */
-int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
+int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
{
if (!test_facility(72))
return -EIO;
@@ -109,6 +109,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)

return 0;
}
+EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);

/* PCI Load */
static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 87c7d121c255..f2b3145b6697 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -15,13 +15,6 @@

static enum {FLOATING, DIRECTED} irq_delivery;

-#define SIC_IRQ_MODE_ALL 0
-#define SIC_IRQ_MODE_SINGLE 1
-#define SIC_IRQ_MODE_DIRECT 4
-#define SIC_IRQ_MODE_D_ALL 16
-#define SIC_IRQ_MODE_D_SINGLE 17
-#define SIC_IRQ_MODE_SET_CPU 18
-
/*
* summary bit vector
* FLOATING - summary bit per function
@@ -154,6 +147,7 @@ static struct irq_chip zpci_irq_chip = {
static void zpci_handle_cpu_local_irq(bool rescan)
{
struct airq_iv *dibv = zpci_ibv[smp_processor_id()];
+ union zpci_sic_iib iib = {{0}};
unsigned long bit;
int irqs_on = 0;

@@ -165,7 +159,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
/* End of second scan with interrupts on. */
break;
/* First scan complete, reenable interrupts. */
- if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
+ if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &iib))
break;
bit = 0;
continue;
@@ -193,6 +187,7 @@ static void zpci_handle_remote_irq(void *data)
static void zpci_handle_fallback_irq(void)
{
struct cpu_irq_data *cpu_data;
+ union zpci_sic_iib iib = {{0}};
unsigned long cpu;
int irqs_on = 0;

@@ -203,7 +198,7 @@ static void zpci_handle_fallback_irq(void)
/* End of second scan with interrupts on. */
break;
/* First scan complete, reenable interrupts. */
- if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
+ if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
break;
cpu = 0;
continue;
@@ -234,6 +229,7 @@ static void zpci_directed_irq_handler(struct airq_struct *airq,
static void zpci_floating_irq_handler(struct airq_struct *airq,
struct tpi_info *tpi_info)
{
+ union zpci_sic_iib iib = {{0}};
unsigned long si, ai;
struct airq_iv *aibv;
int irqs_on = 0;
@@ -247,7 +243,7 @@ static void zpci_floating_irq_handler(struct airq_struct *airq,
/* End of second scan with interrupts on. */
break;
/* First scan complete, reenable interrupts. */
- if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
+ if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
break;
si = 0;
continue;
@@ -407,11 +403,12 @@ static struct airq_struct zpci_airq = {
static void __init cpu_enable_directed_irq(void *unused)
{
union zpci_sic_iib iib = {{0}};
+ union zpci_sic_iib ziib = {{0}};

iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector;

- __zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
- zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &ziib);
}

static int __init zpci_directed_irq_init(void)
@@ -426,7 +423,7 @@ static int __init zpci_directed_irq_init(void)
iib.diib.isc = PCI_ISC;
iib.diib.nr_cpus = num_possible_cpus();
iib.diib.disb_addr = virt_to_phys(zpci_sbv->vector);
- __zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);

zpci_ibv = kcalloc(num_possible_cpus(), sizeof(*zpci_ibv),
GFP_KERNEL);
@@ -471,6 +468,7 @@ static int __init zpci_floating_irq_init(void)

int __init zpci_irq_init(void)
{
+ union zpci_sic_iib iib = {{0}};
int rc;

irq_delivery = sclp.has_dirq ? DIRECTED : FLOATING;
@@ -502,7 +500,7 @@ int __init zpci_irq_init(void)
* Enable floating IRQs (with suppression after one IRQ). When using
* directed IRQs this enables the fallback path.
*/
- zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib);

return 0;
out_airq:
--
2.27.0

2022-04-27 10:58:57

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 04/21] s390/sclp: detect the AISI facility

Detect the Adapter Interruption Suppression Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index ff359df62154..bdd365407176 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -90,6 +90,7 @@ struct sclp_info {
unsigned char has_zpci_lsi : 1;
unsigned char has_aisii : 1;
unsigned char has_aeni : 1;
+ unsigned char has_aisi : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index e9af01b4c97a..c13e55cc4a5d 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -47,6 +47,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_kss = !!(sccb->fac98 & 0x01);
sclp.has_aisii = !!(sccb->fac118 & 0x40);
sclp.has_aeni = !!(sccb->fac118 & 0x20);
+ sclp.has_aisi = !!(sccb->fac118 & 0x10);
sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
--
2.27.0

2022-04-27 11:11:39

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 08/21] s390/pci: stash associated GISA designation

For passthrough devices, we will need to know the GISA designation of the
guest if interpretation facilities are to be used. Setup to stash this in
the zdev and set a default of 0 (no GISA designation) for now; a subsequent
patch will set a valid GISA designation for passthrough devices.
Also, extend mpcific routines to specify this stashed designation as part
of the mpcific command.

Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Niklas Schnelle <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 1 +
arch/s390/include/asm/pci_clp.h | 3 ++-
arch/s390/pci/pci.c | 6 ++++++
arch/s390/pci/pci_clp.c | 5 +++++
arch/s390/pci/pci_irq.c | 5 +++++
5 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index fdb9745ee998..42a4a312a6dd 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -123,6 +123,7 @@ struct zpci_dev {
enum zpci_state state;
u32 fid; /* function ID, used by sclp */
u32 fh; /* function handle, used by insn's */
+ u32 gisa; /* GISA designation for passthrough */
u16 vfn; /* virtual function number */
u16 pchid; /* physical channel ID */
u8 pfgid; /* function group ID */
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 1f4b666e85ee..f3286bc5ba6e 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -173,7 +173,8 @@ struct clp_req_set_pci {
u16 reserved2;
u8 oc; /* operation controls */
u8 ndas; /* number of dma spaces */
- u64 reserved3;
+ u32 reserved3;
+ u32 gisa; /* GISA designation */
} __packed;

/* Set PCI function response */
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index e563cb65c0c4..a86cd1cbb80e 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -120,6 +120,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
fib.pba = base;
fib.pal = limit;
fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
+ fib.gd = zdev->gisa;
cc = zpci_mod_fc(req, &fib, &status);
if (cc)
zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
@@ -133,6 +134,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
struct zpci_fib fib = {0};
u8 cc, status;

+ fib.gd = zdev->gisa;
+
cc = zpci_mod_fc(req, &fib, &status);
if (cc)
zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
@@ -160,6 +163,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
atomic64_set(&zdev->unmapped_pages, 0);

fib.fmb_addr = virt_to_phys(zdev->fmb);
+ fib.gd = zdev->gisa;
cc = zpci_mod_fc(req, &fib, &status);
if (cc) {
kmem_cache_free(zdev_fmb_cache, zdev->fmb);
@@ -178,6 +182,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
if (!zdev->fmb)
return -EINVAL;

+ fib.gd = zdev->gisa;
+
/* Function measurement is disabled if fmb address is zero */
cc = zpci_mod_fc(req, &fib, &status);
if (cc == 3) /* Function already gone. */
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index 1057d7af4a55..deed35edea14 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -229,12 +229,16 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
{
struct clp_req_rsp_set_pci *rrb;
int rc, retries = 100;
+ u32 gisa = 0;

*fh = 0;
rrb = clp_alloc_block(GFP_KERNEL);
if (!rrb)
return -ENOMEM;

+ if (command != CLP_SET_DISABLE_PCI_FN)
+ gisa = zdev->gisa;
+
do {
memset(rrb, 0, sizeof(*rrb));
rrb->request.hdr.len = sizeof(rrb->request);
@@ -243,6 +247,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
rrb->request.fh = zdev->fh;
rrb->request.oc = command;
rrb->request.ndas = nr_dma_as;
+ rrb->request.gisa = gisa;

rc = clp_req(rrb, CLP_LPS_PCI);
if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index f2b3145b6697..a2b42a63a53b 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
fib.fmt0.aibvo = 0; /* each zdev has its own interrupt vector */
fib.fmt0.aisb = virt_to_phys(zpci_sbv->vector) + (zdev->aisb / 64) * 8;
fib.fmt0.aisbo = zdev->aisb & 63;
+ fib.gd = zdev->gisa;

return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
}
@@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
struct zpci_fib fib = {0};
u8 cc, status;

+ fib.gd = zdev->gisa;
+
cc = zpci_mod_fc(req, &fib, &status);
if (cc == 3 || (cc == 1 && status == 24))
/* Function already gone or IRQs already deregistered. */
@@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
fib.fmt = 1;
fib.fmt1.noi = zdev->msi_nr_irqs;
fib.fmt1.dibvo = zdev->msi_first_bit;
+ fib.gd = zdev->gisa;

return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
}
@@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
u8 cc, status;

fib.fmt = 1;
+ fib.gd = zdev->gisa;
cc = zpci_mod_fc(req, &fib, &status);
if (cc == 3 || (cc == 1 && status == 24))
/* Function already gone or IRQs already deregistered. */
--
2.27.0

2022-04-27 11:12:25

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 09/21] s390/pci: stash dtsm and maxstbl

Store information about what IOAT designation types are supported by
underlying hardware as well as the largest store block size allowed.
These values will be needed by passthrough.

Reviewed-by: Niklas Schnelle <[email protected]>
Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 2 ++
arch/s390/include/asm/pci_clp.h | 6 ++++--
arch/s390/pci/pci_clp.c | 2 ++
3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 42a4a312a6dd..4c5b8fbc2079 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -126,9 +126,11 @@ struct zpci_dev {
u32 gisa; /* GISA designation for passthrough */
u16 vfn; /* virtual function number */
u16 pchid; /* physical channel ID */
+ u16 maxstbl; /* Maximum store block size */
u8 pfgid; /* function group ID */
u8 pft; /* pci function type */
u8 port;
+ u8 dtsm; /* Supported DT mask */
u8 rid_available : 1;
u8 has_hp_slot : 1;
u8 has_resources : 1;
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index f3286bc5ba6e..d6189ed14f84 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -153,9 +153,11 @@ struct clp_rsp_query_pci_grp {
u8 : 6;
u8 frame : 1;
u8 refresh : 1; /* TLB refresh mode */
- u16 reserved2;
+ u16 : 3;
+ u16 maxstbl : 13; /* Maximum store block size */
u16 mui;
- u16 : 16;
+ u8 dtsm; /* Supported DT mask */
+ u8 reserved3;
u16 maxfaal;
u16 : 4;
u16 dnoi : 12;
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index deed35edea14..1edb07ceb766 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -106,6 +106,8 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
zdev->max_msi = response->noi;
zdev->fmb_update = response->mui;
zdev->version = response->version;
+ zdev->maxstbl = response->maxstbl;
+ zdev->dtsm = response->dtsm;

switch (response->version) {
case 1:
--
2.27.0

2022-04-27 11:18:04

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 12/21] KVM: s390: pci: enable host forwarding of Adapter Event Notifications

In cases where interrupts are not forwarded to the guest via firmware,
KVM is responsible for ensuring delivery. When an interrupt presents
with the forwarding bit, we must process the forwarding tables until
all interrupts are delivered.

Reviewed-by: Pierre Morel <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/include/asm/tpi.h | 13 ++++++
arch/s390/kvm/interrupt.c | 77 +++++++++++++++++++++++++++++++-
arch/s390/kvm/kvm-s390.c | 3 +-
arch/s390/kvm/pci.h | 10 +++++
5 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 766028d54a3e..c1518a505060 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -759,6 +759,7 @@ struct kvm_vm_stat {
u64 inject_pfault_done;
u64 inject_service_signal;
u64 inject_virtio;
+ u64 aen_forward;
};

struct kvm_arch_memory_slot {
diff --git a/arch/s390/include/asm/tpi.h b/arch/s390/include/asm/tpi.h
index 1ac538b8cbf5..f76e5fdff23a 100644
--- a/arch/s390/include/asm/tpi.h
+++ b/arch/s390/include/asm/tpi.h
@@ -19,6 +19,19 @@ struct tpi_info {
u32 :12;
} __packed __aligned(4);

+/* I/O-Interruption Code as stored by TPI for an Adapter I/O */
+struct tpi_adapter_info {
+ u32 aism:8;
+ u32 :22;
+ u32 error:1;
+ u32 forward:1;
+ u32 reserved;
+ u32 adapter_IO:1;
+ u32 directed_irq:1;
+ u32 isc:3;
+ u32 :27;
+} __packed __aligned(4);
+
#endif /* __ASSEMBLY__ */

#endif /* _ASM_S390_TPI_H */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 37ff4358121a..7a59cece6992 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -3313,11 +3313,86 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
}
EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);

+static void aen_host_forward(unsigned long si)
+{
+ struct kvm_s390_gisa_interrupt *gi;
+ struct zpci_gaite *gaite;
+ struct kvm *kvm;
+
+ gaite = (struct zpci_gaite *)aift->gait +
+ (si * sizeof(struct zpci_gaite));
+ if (gaite->count == 0)
+ return;
+ if (gaite->aisb != 0)
+ set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
+
+ kvm = kvm_s390_pci_si_to_kvm(aift, si);
+ if (!kvm)
+ return;
+ gi = &kvm->arch.gisa_int;
+
+ if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
+ !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
+ gisa_set_ipm_gisc(gi->origin, gaite->gisc);
+ if (hrtimer_active(&gi->timer))
+ hrtimer_cancel(&gi->timer);
+ hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
+ kvm->stat.aen_forward++;
+ }
+}
+
+static void aen_process_gait(u8 isc)
+{
+ bool found = false, first = true;
+ union zpci_sic_iib iib = {{0}};
+ unsigned long si, flags;
+
+ spin_lock_irqsave(&aift->gait_lock, flags);
+
+ if (!aift->gait) {
+ spin_unlock_irqrestore(&aift->gait_lock, flags);
+ return;
+ }
+
+ for (si = 0;;) {
+ /* Scan adapter summary indicator bit vector */
+ si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
+ if (si == -1UL) {
+ if (first || found) {
+ /* Re-enable interrupts. */
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
+ &iib);
+ first = found = false;
+ } else {
+ /* Interrupts on and all bits processed */
+ break;
+ }
+ found = false;
+ si = 0;
+ /* Scan again after re-enabling interrupts */
+ continue;
+ }
+ found = true;
+ aen_host_forward(si);
+ }
+
+ spin_unlock_irqrestore(&aift->gait_lock, flags);
+}
+
static void gib_alert_irq_handler(struct airq_struct *airq,
struct tpi_info *tpi_info)
{
+ struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
+
inc_irq_stat(IRQIO_GAL);
- process_gib_alert_list();
+
+ if (IS_ENABLED(CONFIG_VFIO_PCI) && (info->forward || info->error)) {
+ aen_process_gait(info->isc);
+ if (info->aism != 0)
+ process_gib_alert_list();
+ } else {
+ process_gib_alert_list();
+ }
}

static struct airq_struct gib_alert_irq = {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 20882b3c08a5..856f55ffd98c 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -64,7 +64,8 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_DESC_COUNTER(VM, inject_float_mchk),
STATS_DESC_COUNTER(VM, inject_pfault_done),
STATS_DESC_COUNTER(VM, inject_service_signal),
- STATS_DESC_COUNTER(VM, inject_virtio)
+ STATS_DESC_COUNTER(VM, inject_virtio),
+ STATS_DESC_COUNTER(VM, aen_forward)
};

const struct kvm_stats_header kvm_vm_stats_header = {
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 1d8f11e442f6..d97cc99ebd46 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -13,6 +13,7 @@
#include <linux/kvm_host.h>
#include <linux/pci.h>
#include <linux/mutex.h>
+#include <linux/kvm_host.h>
#include <asm/airq.h>
#include <asm/cpu.h>

@@ -40,6 +41,15 @@ struct zpci_aift {

extern struct zpci_aift *aift;

+static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
+ unsigned long si)
+{
+ if (!IS_ENABLED(CONFIG_VFIO_PCI) || aift->kzdev == 0 ||
+ aift->kzdev[si] == 0)
+ return 0;
+ return aift->kzdev[si]->kvm;
+};
+
int kvm_s390_pci_aen_init(u8 nisc);
void kvm_s390_pci_aen_exit(void);

--
2.27.0

2022-04-27 11:34:08

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v6 10/21] KVM: s390: pci: add basic kvm_zdev structure

Hi Matthew,

I love your patch! Perhaps something to improve:

[auto build test WARNING on v5.18-rc4]
[cannot apply to s390/features kvms390/next awilliam-vfio/next next-20220427]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Rosato/KVM-s390-enable-zPCI-for-interpretive-execution/20220427-041853
base: af2d861d4cd2a4da5137f795ee3509e6f944a25b
config: s390-defconfig (https://download.01.org/0day-ci/archive/20220427/[email protected]/config)
compiler: s390-linux-gcc (GCC) 11.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/e6d8c620090a7b184afdf5b5123d10ac45776eaf
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Matthew-Rosato/KVM-s390-enable-zPCI-for-interpretive-execution/20220427-041853
git checkout e6d8c620090a7b184afdf5b5123d10ac45776eaf
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash arch/s390/kvm/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> arch/s390/kvm/pci.c:14:5: warning: no previous prototype for 'kvm_s390_pci_dev_open' [-Wmissing-prototypes]
14 | int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
| ^~~~~~~~~~~~~~~~~~~~~
>> arch/s390/kvm/pci.c:29:6: warning: no previous prototype for 'kvm_s390_pci_dev_release' [-Wmissing-prototypes]
29 | void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
| ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/kvm_s390_pci_dev_open +14 arch/s390/kvm/pci.c

13
> 14 int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
15 {
16 struct kvm_zdev *kzdev;
17
18 kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
19 if (!kzdev)
20 return -ENOMEM;
21
22 kzdev->zdev = zdev;
23 zdev->kzdev = kzdev;
24
25 return 0;
26 }
27 EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
28
> 29 void kvm_s390_pci_dev_release(struct zpci_dev *zdev)

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-27 11:34:15

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks

During vfio-pci open_device, register a notifier for the zPCI device to
catch KVM registration. This is needed in order to pass a special
indicator (GISA) to firmware to allow zPCI interpretation facilities to be
used for only the specific KVM associated with the vfio-pci device.
During vfio-pci close_device, unregister the notifier.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 2 ++
drivers/vfio/pci/vfio_pci_core.c | 2 ++
drivers/vfio/pci/vfio_pci_zdev.c | 50 ++++++++++++++++++++++++++++++++
include/linux/vfio_pci_core.h | 10 +++++++
4 files changed, 64 insertions(+)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 322060a75d9f..fdcc95b36edb 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -5,6 +5,7 @@
#include <linux/pci.h>
#include <linux/mutex.h>
#include <linux/iommu.h>
+#include <linux/notifier.h>
#include <linux/pci_hotplug.h>
#include <asm-generic/pci.h>
#include <asm/pci_clp.h>
@@ -194,6 +195,7 @@ struct zpci_dev {
/* IOMMU and passthrough */
struct s390_domain *s390_domain; /* s390 IOMMU domain data */
struct kvm_zdev *kzdev;
+ struct notifier_block nb; /* vfio notifications */
};

static inline bool zdev_enabled(struct zpci_dev *zdev)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 06b6f3594a13..d53125b308f0 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -449,6 +449,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
vdev->sriov_pf_core_dev->vf_token->users--;
mutex_unlock(&vdev->sriov_pf_core_dev->vf_token->lock);
}
+ vfio_pci_zdev_release(vdev);
vfio_spapr_pci_eeh_release(vdev->pdev);
vfio_pci_core_disable(vdev);

@@ -469,6 +470,7 @@ void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
{
vfio_pci_probe_mmaps(vdev);
vfio_spapr_pci_eeh_open(vdev->pdev);
+ vfio_pci_zdev_open(vdev);

if (vdev->sriov_pf_core_dev) {
mutex_lock(&vdev->sriov_pf_core_dev->vf_token->lock);
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index ea4c0d2b0663..112c1f820f8e 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -11,6 +11,7 @@
#include <linux/uaccess.h>
#include <linux/vfio.h>
#include <linux/vfio_zdev.h>
+#include <linux/kvm_host.h>
#include <asm/pci_clp.h>
#include <asm/pci_io.h>

@@ -136,3 +137,52 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,

return ret;
}
+
+static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct zpci_dev *zdev = container_of(nb, struct zpci_dev, nb);
+ int (*fn)(struct zpci_dev *zdev, struct kvm *kvm);
+ int rc = NOTIFY_OK;
+
+ if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
+ if (!zdev)
+ return NOTIFY_DONE;
+
+ fn = symbol_get(kvm_s390_pci_register_kvm);
+ if (!fn)
+ return NOTIFY_DONE;
+
+ if (fn(zdev, (struct kvm *)data))
+ rc = NOTIFY_BAD;
+
+ symbol_put(kvm_s390_pci_register_kvm);
+ }
+
+ return rc;
+}
+
+void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
+{
+ unsigned long events = VFIO_GROUP_NOTIFY_SET_KVM;
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+
+ if (!zdev)
+ return;
+
+ zdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
+
+ vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
+ &events, &zdev->nb);
+}
+
+void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+
+ if (!zdev)
+ return;
+
+ vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
+ &zdev->nb);
+}
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 48f2dd3c568c..b1b285421c18 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -209,12 +209,22 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
#ifdef CONFIG_S390
extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps);
+void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
+void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
#else
static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps)
{
return -ENODEV;
}
+
+static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
+{
+}
+
+static inline void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
+{
+}
#endif

/* Will be exported for vfio pci drivers usage */
--
2.27.0

2022-04-27 11:35:13

by Matthew Rosato

[permalink] [raw]
Subject: [PATCH v6 21/21] MAINTAINERS: additional files related kvm s390 pci passthrough

Add entries from the s390 kvm subdirectory related to pci passthrough.

Acked-by: Christian Borntraeger <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e8c2f611766..617f5bb34139 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17257,6 +17257,7 @@ M: Eric Farman <[email protected]>
L: [email protected]
L: [email protected]
S: Supported
+F: arch/s390/kvm/pci*
F: drivers/vfio/pci/vfio_pci_zdev.c
F: include/uapi/linux/vfio_zdev.h

--
2.27.0

2022-04-27 13:53:53

by Matthew Rosato

[permalink] [raw]
Subject: Re: [PATCH v6 10/21] KVM: s390: pci: add basic kvm_zdev structure

On 4/27/22 4:41 AM, kernel test robot wrote:
> Hi Matthew,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on v5.18-rc4]
> [cannot apply to s390/features kvms390/next awilliam-vfio/next next-20220427]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Rosato/KVM-s390-enable-zPCI-for-interpretive-execution/20220427-041853
> base: af2d861d4cd2a4da5137f795ee3509e6f944a25b
> config: s390-defconfig (https://download.01.org/0day-ci/archive/20220427/[email protected]/config)
> compiler: s390-linux-gcc (GCC) 11.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/intel-lab-lkp/linux/commit/e6d8c620090a7b184afdf5b5123d10ac45776eaf
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review Matthew-Rosato/KVM-s390-enable-zPCI-for-interpretive-execution/20220427-041853
> git checkout e6d8c620090a7b184afdf5b5123d10ac45776eaf
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash arch/s390/kvm/
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All warnings (new ones prefixed by >>):
>
>>> arch/s390/kvm/pci.c:14:5: warning: no previous prototype for 'kvm_s390_pci_dev_open' [-Wmissing-prototypes]
> 14 | int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> | ^~~~~~~~~~~~~~~~~~~~~
>>> arch/s390/kvm/pci.c:29:6: warning: no previous prototype for 'kvm_s390_pci_dev_release' [-Wmissing-prototypes]
> 29 | void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
> | ^~~~~~~~~~~~~~~~~~~~~~~~
>

Oops, these 2 functions no longer need to be externalized and can simply
be marked static.

>
> vim +/kvm_s390_pci_dev_open +14 arch/s390/kvm/pci.c
>
> 13
> > 14 int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> 15 {
> 16 struct kvm_zdev *kzdev;
> 17
> 18 kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
> 19 if (!kzdev)
> 20 return -ENOMEM;
> 21
> 22 kzdev->zdev = zdev;
> 23 zdev->kzdev = kzdev;
> 24
> 25 return 0;
> 26 }
> 27 EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
> 28
> > 29 void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
>

2022-04-27 14:32:59

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks

On Tue, Apr 26, 2022 at 04:08:37PM -0400, Matthew Rosato wrote:

> +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> + unsigned long action, void *data)
> +{
> + struct zpci_dev *zdev = container_of(nb, struct zpci_dev, nb);
> + int (*fn)(struct zpci_dev *zdev, struct kvm *kvm);
> + int rc = NOTIFY_OK;
> +
> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
> + if (!zdev)
> + return NOTIFY_DONE;
> +
> + fn = symbol_get(kvm_s390_pci_register_kvm);
> + if (!fn)
> + return NOTIFY_DONE;
> +
> + if (fn(zdev, (struct kvm *)data))
> + rc = NOTIFY_BAD;
> +
> + symbol_put(kvm_s390_pci_register_kvm);

Is it possible this function can be in statically linked arch code?

Or, actually, is zPCI useful anyhow without kvm ie can you just have a
direct dependency here?

Jason

2022-04-27 15:10:23

by Matthew Rosato

[permalink] [raw]
Subject: Re: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks

On 4/27/22 10:04 AM, Jason Gunthorpe wrote:
> On Tue, Apr 26, 2022 at 04:08:37PM -0400, Matthew Rosato wrote:
>
>> +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>> + unsigned long action, void *data)
>> +{
>> + struct zpci_dev *zdev = container_of(nb, struct zpci_dev, nb);
>> + int (*fn)(struct zpci_dev *zdev, struct kvm *kvm);
>> + int rc = NOTIFY_OK;
>> +
>> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>> + if (!zdev)
>> + return NOTIFY_DONE;
>> +
>> + fn = symbol_get(kvm_s390_pci_register_kvm);
>> + if (!fn)
>> + return NOTIFY_DONE;
>> +
>> + if (fn(zdev, (struct kvm *)data))
>> + rc = NOTIFY_BAD;
>> +
>> + symbol_put(kvm_s390_pci_register_kvm);
>
> Is it possible this function can be in statically linked arch code?
>
> Or, actually, is zPCI useful anyhow without kvm ie can you just have a
> direct dependency here?
>

zPCI devices (zpci_dev) exist regardless of whether kvm is configured or
not, and you can e.g. bind the associated PCI device to vfio-pci when
KVM is not configured (or module not loaded) and get the existing
vfio-pci-zdev extensions for that device (extra VFIO_DEVICE_INFO
response data). Making a direct dependency on KVM would remove that;
this was discussed in a prior version because this extra info is not
used today outside of a KVM usecase -- but it could be useful in the
future (or we may have other s390-isms that are not specific to kvm that
need vfio-pci-zdev).

As far as statically linking in arch... The fundamental
(un)registration task being done here -- (dis)associating the guest GISA
with the firmware and thus allowing this particular guest to use
firmware assists (or turning it off when kvm == 0) -- is only relevant
to guest passthrough with kvm and calls a number of different routines
that reside in the kvm module during the (un)registration process.
Without a direct dependency I think a symbol lookup still has to
inevitably happen at some point in the call chain.

2022-04-27 15:29:16

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks

On Wed, Apr 27, 2022 at 10:42:07AM -0400, Matthew Rosato wrote:
> On 4/27/22 10:04 AM, Jason Gunthorpe wrote:
> > On Tue, Apr 26, 2022 at 04:08:37PM -0400, Matthew Rosato wrote:
> >
> > > +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> > > + unsigned long action, void *data)
> > > +{
> > > + struct zpci_dev *zdev = container_of(nb, struct zpci_dev, nb);
> > > + int (*fn)(struct zpci_dev *zdev, struct kvm *kvm);
> > > + int rc = NOTIFY_OK;
> > > +
> > > + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
> > > + if (!zdev)
> > > + return NOTIFY_DONE;
> > > +
> > > + fn = symbol_get(kvm_s390_pci_register_kvm);
> > > + if (!fn)
> > > + return NOTIFY_DONE;
> > > +
> > > + if (fn(zdev, (struct kvm *)data))
> > > + rc = NOTIFY_BAD;
> > > +
> > > + symbol_put(kvm_s390_pci_register_kvm);
> >
> > Is it possible this function can be in statically linked arch code?
> >
> > Or, actually, is zPCI useful anyhow without kvm ie can you just have a
> > direct dependency here?
> >
>
> zPCI devices (zpci_dev) exist regardless of whether kvm is configured or
> not, and you can e.g. bind the associated PCI device to vfio-pci when KVM is
> not configured (or module not loaded) and get the existing vfio-pci-zdev
> extensions for that device (extra VFIO_DEVICE_INFO response data). Making a
> direct dependency on KVM would remove that; this was discussed in a prior
> version because this extra info is not used today outside of a KVM usecase
> are not specific to kvm that need vfio-pci-zdev).

I'm a bit confused, what is the drawback of just having a direct
symbol dependency here? It means vfio loads a little extra kernel
module code, but is that really a big worry given almost all vfio
users on s390 will be using it with kvm?

Or is there some technical blocker? (circular dep or something?)

Jason

2022-04-27 15:41:21

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v6 15/21] KVM: s390: pci: add routines to start/stop interpretive execution

On Tue, Apr 26, 2022 at 04:08:36PM -0400, Matthew Rosato wrote:

> +int kvm_s390_pci_register_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> +{
> + if (!zdev)
> + return 0;
> +
> + /*
> + * Register device with this KVM (or remove the KVM association if 0).
> + * If interpetation facilities are available, enable them and let
> + * userspace indicate whether or not they will be used (specify SHM bit
> + * to disable).
> + */
> + if (kvm)
> + return register_kvm(zdev, kvm);
> + else
> + return unregister_kvm(zdev);
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_register_kvm);

I think it is cleaner to expose both the register/unregister APIs and
not multiplex them like this

> +void kvm_s390_pci_clear_list(struct kvm *kvm)
> +{
> + struct kvm_zdev *tmp, *kzdev;
> + LIST_HEAD(remove);
> +
> + spin_lock(&kvm->arch.kzdev_list_lock);
> + list_for_each_entry_safe(kzdev, tmp, &kvm->arch.kzdev_list, entry)
> + list_move_tail(&kzdev->entry, &remove);
> + spin_unlock(&kvm->arch.kzdev_list_lock);
> +
> + list_for_each_entry_safe(kzdev, tmp, &remove, entry)
> + unregister_kvm(kzdev->zdev);

Hum, I wonder if this is a mistake in kvm:

static void kvm_destroy_vm(struct kvm *kvm)
{
[..]
kvm_arch_destroy_vm(kvm);
kvm_destroy_devices(kvm);

kvm_destroy_devices() triggers the VFIO notifier with NULL. Indeed for
correctness I would expect the VFIO users to have been notified to
release the kvm before the kvm object becomes partially destroyed?

Maybe you should investigate re-ordering this at the KVM side and just
WARN_ON(!list_empty) in the arch code?

(vfio has this odd usage model where it should use the kvm pointer
without taking a ref on it so long as the unregister hasn't been
called)

If you keep it like this then the locking in register/unregister looks
not broad enough and has to cover the zdev->kzdev also.

Overall I think it is OK designed like this, aside from the ugly
symbol_get in vfio which I hope you can resolve.

Jason

2022-04-27 15:53:56

by Matthew Rosato

[permalink] [raw]
Subject: Re: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks

On 4/27/22 11:01 AM, Jason Gunthorpe wrote:
> On Wed, Apr 27, 2022 at 10:42:07AM -0400, Matthew Rosato wrote:
>> On 4/27/22 10:04 AM, Jason Gunthorpe wrote:
>>> On Tue, Apr 26, 2022 at 04:08:37PM -0400, Matthew Rosato wrote:
>>>
>>>> +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>>>> + unsigned long action, void *data)
>>>> +{
>>>> + struct zpci_dev *zdev = container_of(nb, struct zpci_dev, nb);
>>>> + int (*fn)(struct zpci_dev *zdev, struct kvm *kvm);
>>>> + int rc = NOTIFY_OK;
>>>> +
>>>> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>>>> + if (!zdev)
>>>> + return NOTIFY_DONE;
>>>> +
>>>> + fn = symbol_get(kvm_s390_pci_register_kvm);
>>>> + if (!fn)
>>>> + return NOTIFY_DONE;
>>>> +
>>>> + if (fn(zdev, (struct kvm *)data))
>>>> + rc = NOTIFY_BAD;
>>>> +
>>>> + symbol_put(kvm_s390_pci_register_kvm);
>>>
>>> Is it possible this function can be in statically linked arch code?
>>>
>>> Or, actually, is zPCI useful anyhow without kvm ie can you just have a
>>> direct dependency here?
>>>
>>
>> zPCI devices (zpci_dev) exist regardless of whether kvm is configured or
>> not, and you can e.g. bind the associated PCI device to vfio-pci when KVM is
>> not configured (or module not loaded) and get the existing vfio-pci-zdev
>> extensions for that device (extra VFIO_DEVICE_INFO response data). Making a
>> direct dependency on KVM would remove that; this was discussed in a prior
>> version because this extra info is not used today outside of a KVM usecase
>> are not specific to kvm that need vfio-pci-zdev).
>
> I'm a bit confused, what is the drawback of just having a direct
> symbol dependency here? It means vfio loads a little extra kernel
> module code, but is that really a big worry given almost all vfio
> users on s390 will be using it with kvm?

It's about trying to avoid loading unnecessary code (or at least giving
a way to turn it off).

Previously I did something like....

https://lore.kernel.org/kvm/[email protected]/

And could do so again; as discussed in the thread there, I can use e.g.
CONFIG_VFIO_PCI_ZDEV_KVM and make vfio-pci-zdev depend on KVM in this
series. You only get the vfio-pci-zdev extensions when you configure KVM.

Then if we find a usecase for a vfio-pci-zdev extension without KVM
later, we can further split vfio-pci-zdev and go back to building
vfio-pci-zdev when CONFIG_S390 but only build the kvm pieces (like the
code above) when you specify CONFIG_VFIO_PCI_ZDEV_KVM. This would
eliminate this symbol_get. Sound OK?

>
> Or is there some technical blocker? (circular dep or something?)
>
> Jason

2022-04-27 16:08:22

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks

On Wed, Apr 27, 2022 at 11:26:40AM -0400, Matthew Rosato wrote:
> > > zPCI devices (zpci_dev) exist regardless of whether kvm is configured or
> > > not, and you can e.g. bind the associated PCI device to vfio-pci when KVM is
> > > not configured (or module not loaded) and get the existing vfio-pci-zdev
> > > extensions for that device (extra VFIO_DEVICE_INFO response data). Making a
> > > direct dependency on KVM would remove that; this was discussed in a prior
> > > version because this extra info is not used today outside of a KVM usecase
> > > are not specific to kvm that need vfio-pci-zdev).
> >
> > I'm a bit confused, what is the drawback of just having a direct
> > symbol dependency here? It means vfio loads a little extra kernel
> > module code, but is that really a big worry given almost all vfio
> > users on s390 will be using it with kvm?
>
> It's about trying to avoid loading unnecessary code (or at least giving a
> way to turn it off).
>
> Previously I did something like....
>
> https://lore.kernel.org/kvm/[email protected]/
>
> And could do so again; as discussed in the thread there, I can use e.g.
> CONFIG_VFIO_PCI_ZDEV_KVM and make vfio-pci-zdev depend on KVM in this
> series. You only get the vfio-pci-zdev extensions when you configure KVM.

That make sense to me, I'd rather see that then the symbol_get/put here

Thanks,
Jason

2022-04-27 21:44:47

by Matthew Rosato

[permalink] [raw]
Subject: Re: [PATCH v6 15/21] KVM: s390: pci: add routines to start/stop interpretive execution

On 4/27/22 11:14 AM, Jason Gunthorpe wrote:
> On Tue, Apr 26, 2022 at 04:08:36PM -0400, Matthew Rosato wrote:
>
>> +int kvm_s390_pci_register_kvm(struct zpci_dev *zdev, struct kvm *kvm)
>> +{
>> + if (!zdev)
>> + return 0;
>> +
>> + /*
>> + * Register device with this KVM (or remove the KVM association if 0).
>> + * If interpetation facilities are available, enable them and let
>> + * userspace indicate whether or not they will be used (specify SHM bit
>> + * to disable).
>> + */
>> + if (kvm)
>> + return register_kvm(zdev, kvm);
>> + else
>> + return unregister_kvm(zdev);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_register_kvm);
>
> I think it is cleaner to expose both the register/unregister APIs and
> not multiplex them like this
>

OK

>> +void kvm_s390_pci_clear_list(struct kvm *kvm)
>> +{
>> + struct kvm_zdev *tmp, *kzdev;
>> + LIST_HEAD(remove);
>> +
>> + spin_lock(&kvm->arch.kzdev_list_lock);
>> + list_for_each_entry_safe(kzdev, tmp, &kvm->arch.kzdev_list, entry)
>> + list_move_tail(&kzdev->entry, &remove);
>> + spin_unlock(&kvm->arch.kzdev_list_lock);
>> +
>> + list_for_each_entry_safe(kzdev, tmp, &remove, entry)
>> + unregister_kvm(kzdev->zdev);
>
> Hum, I wonder if this is a mistake in kvm:
>
> static void kvm_destroy_vm(struct kvm *kvm)
> {
> [..]
> kvm_arch_destroy_vm(kvm);
> kvm_destroy_devices(kvm);
>
> kvm_destroy_devices() triggers the VFIO notifier with NULL. Indeed for
> correctness I would expect the VFIO users to have been notified to
> release the kvm before the kvm object becomes partially destroyed?
>
> Maybe you should investigate re-ordering this at the KVM side and just
> WARN_ON(!list_empty) in the arch code?
>
> (vfio has this odd usage model where it should use the kvm pointer
> without taking a ref on it so long as the unregister hasn't been
> called)
>

The issue there is that I am unregistering the notifier during
close_device for each zPCI dev, which will have already happened -- so
by the time we get to kvm_destroy_devices(), whether it's before or
after kvm_arch_destroy_vm, there are no longer any zPCI notifiers
registered that will trigger.

One way to solve this is to have the zpci close_device hook also trigger
the work that a KVM_DEV_VFIO_GROUP_DEL would (if the device is being
closed, the KVM association for that device isn't applicable anymore so
go ahead and clean up).

Then, since we know each device will get closed (either by userspace or
by kvm), I don't need something like kvm_s390_pci_clear_list at all.

> If you keep it like this then the locking in register/unregister looks
> not broad enough and has to cover the zdev->kzdev also.

But I would still need to revisit the locking with the above idea.

>
> Overall I think it is OK designed like this, aside from the ugly
> symbol_get in vfio which I hope you can resolve.
>
> Jason

2022-04-29 12:32:25

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v6 15/21] KVM: s390: pci: add routines to start/stop interpretive execution

On Wed, Apr 27, 2022 at 04:20:10PM -0400, Matthew Rosato wrote:
> > > +void kvm_s390_pci_clear_list(struct kvm *kvm)
> > > +{
> > > + struct kvm_zdev *tmp, *kzdev;
> > > + LIST_HEAD(remove);
> > > +
> > > + spin_lock(&kvm->arch.kzdev_list_lock);
> > > + list_for_each_entry_safe(kzdev, tmp, &kvm->arch.kzdev_list, entry)
> > > + list_move_tail(&kzdev->entry, &remove);
> > > + spin_unlock(&kvm->arch.kzdev_list_lock);
> > > +
> > > + list_for_each_entry_safe(kzdev, tmp, &remove, entry)
> > > + unregister_kvm(kzdev->zdev);
> >
> > Hum, I wonder if this is a mistake in kvm:
> >
> > static void kvm_destroy_vm(struct kvm *kvm)
> > {
> > [..]
> > kvm_arch_destroy_vm(kvm);
> > kvm_destroy_devices(kvm);
> >
> > kvm_destroy_devices() triggers the VFIO notifier with NULL. Indeed for
> > correctness I would expect the VFIO users to have been notified to
> > release the kvm before the kvm object becomes partially destroyed?
> >
> > Maybe you should investigate re-ordering this at the KVM side and just
> > WARN_ON(!list_empty) in the arch code?
> >
> > (vfio has this odd usage model where it should use the kvm pointer
> > without taking a ref on it so long as the unregister hasn't been
> > called)
> >
>
> The issue there is that I am unregistering the notifier during close_device
> for each zPCI dev, which will have already happened

And at that moment you have to clean up the arch stuff too, it
shouldn't be left floating around once the driver that installed it
looses access to the kvm.

> -- so by the time we get to kvm_destroy_devices(), whether it's
> before or after kvm_arch_destroy_vm, there are no longer any zPCI
> notifiers registered that will trigger.

I don't think that is strictly true, there is no enforced linkage
between the lifetime of the kvm FD and the lifetime of the VFIO device
FD. qemu probably orders them the way you say.

> One way to solve this is to have the zpci close_device hook also trigger the
> work that a KVM_DEV_VFIO_GROUP_DEL would (if the device is being closed, the
> KVM association for that device isn't applicable anymore so go ahead and
> clean up).

That makes the most sense - but think about what happens if the KVM fd
is closed while the device FD is still open..

Jason

2022-05-06 20:17:12

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH v6 12/21] KVM: s390: pci: enable host forwarding of Adapter Event Notifications



Am 26.04.22 um 22:08 schrieb Matthew Rosato:
[...]

Reviewed-by: Christian Borntraeger <[email protected]>

one comment below.


> +static void aen_process_gait(u8 isc)
> +{
> + bool found = false, first = true;
> + union zpci_sic_iib iib = {{0}};
> + unsigned long si, flags;
> +
> + spin_lock_irqsave(&aift->gait_lock, flags);

I think we can live with this for now, but if it turns out to create too long periods of disabled interrupts or lock hold times we might want to consider moving gaite access to an rcu-like approach.

> +
> + if (!aift->gait) {
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> + return;
> + }
> +
> + for (si = 0;;) {
> + /* Scan adapter summary indicator bit vector */
> + si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
> + if (si == -1UL) {
> + if (first || found) {
> + /* Re-enable interrupts. */
> + zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
> + &iib);
> + first = found = false;
> + } else {
> + /* Interrupts on and all bits processed */
> + break;
> + }
> + found = false;
> + si = 0;
> + /* Scan again after re-enabling interrupts */
> + continue;
> + }
> + found = true;
> + aen_host_forward(si);
> + }
> +
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> +}
> +


2022-05-07 01:02:44

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH v6 13/21] KVM: s390: mechanism to enable guest zPCI Interpretation

Am 26.04.22 um 22:08 schrieb Matthew Rosato:

Reviewed-by: Christian Borntraeger <[email protected]>



> +/* Must be called with kvm->lock held */

maybe lockdep_assert_held?

> +void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
> +{
> + struct kvm_vcpu *vcpu;
> + unsigned long i;
> +
> + /*
> + * If host is configured for PCI and the necessary facilities are
> + * available, turn on interpretation for the life of this guest
> + */
> + if (!kvm_s390_pci_interp_allowed())
> + return;
> +
> + kvm->arch.use_zpci_interp = 1;
> +
> + kvm_s390_vcpu_block_all(kvm);
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + kvm_s390_vcpu_pci_setup(vcpu);
> + kvm_s390_sync_request(KVM_REQ_VSIE_RESTART, vcpu);
> + }
> +
> + kvm_s390_vcpu_unblock_all(kvm);
> +}

2022-05-09 04:14:28

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH v6 11/21] KVM: s390: pci: do initial setup for AEN interpretation

Am 26.04.22 um 22:08 schrieb Matthew Rosato:
[...]
> +/* Caller must hold the aift lock before calling this function */

When you do a next round, maybe use lockdep_assert_help instead of this comment

> +void kvm_s390_pci_aen_exit(void)
> +{
> + unsigned long flags;
> + struct kvm_zdev **gait_kzdev;
> +
> + /*
> + * Contents of the aipb remain registered for the life of the host
> + * kernel, the information preserved in zpci_aipb and zpci_aif_sbv
> + * in case we insert the KVM module again later. Clear the AIFT
> + * information and free anything not registered with underlying
> + * firmware.
> + */
> + spin_lock_irqsave(&aift->gait_lock, flags);
> + gait_kzdev = aift->kzdev;
> + aift->gait = 0;
> + aift->sbv = 0;
> + aift->kzdev = 0;
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> +
> + kfree(gait_kzdev);
> +}
> +

Otherwise,

Reviewed-by: Christian Borntraeger <[email protected]>


2022-05-09 06:56:40

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH v6 16/21] vfio-pci/zdev: add open/close device hooks



Am 27.04.22 um 17:39 schrieb Jason Gunthorpe:
> On Wed, Apr 27, 2022 at 11:26:40AM -0400, Matthew Rosato wrote:
>>>> zPCI devices (zpci_dev) exist regardless of whether kvm is configured or
>>>> not, and you can e.g. bind the associated PCI device to vfio-pci when KVM is
>>>> not configured (or module not loaded) and get the existing vfio-pci-zdev
>>>> extensions for that device (extra VFIO_DEVICE_INFO response data). Making a
>>>> direct dependency on KVM would remove that; this was discussed in a prior
>>>> version because this extra info is not used today outside of a KVM usecase
>>>> are not specific to kvm that need vfio-pci-zdev).
>>>
>>> I'm a bit confused, what is the drawback of just having a direct
>>> symbol dependency here? It means vfio loads a little extra kernel
>>> module code, but is that really a big worry given almost all vfio
>>> users on s390 will be using it with kvm?
>>
>> It's about trying to avoid loading unnecessary code (or at least giving a
>> way to turn it off).
>>
>> Previously I did something like....
>>
>> https://lore.kernel.org/kvm/[email protected]/
>>
>> And could do so again; as discussed in the thread there, I can use e.g.
>> CONFIG_VFIO_PCI_ZDEV_KVM and make vfio-pci-zdev depend on KVM in this
>> series. You only get the vfio-pci-zdev extensions when you configure KVM.
>
> That make sense to me, I'd rather see that then the symbol_get/put here

I agree with Jason here.

2022-05-09 09:30:16

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH v6 17/21] vfio-pci/zdev: add function handle to clp base capability



Am 26.04.22 um 22:08 schrieb Matthew Rosato:
> The function handle is a system-wide unique identifier for a zPCI
> device. With zPCI instruction interpretation, the host will no
> longer be executing the zPCI instructions on behalf of the guest.
> As a result, the guest needs to use the real function handle in
> order for firmware to associate the instruction with the proper
> PCI function. Let's provide that handle to the guest.
>
> Reviewed-by: Pierre Morel <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>