2021-12-07 20:58:04

by Matthew Rosato

Subject: [PATCH 00/32] KVM: s390: enable zPCI for interpretive execution

Enable interpretive execution of zPCI instructions + adapter interruption
forwarding for s390x KVM vfio-pci. This is done by introducing a series
of new vfio-pci feature ioctls that are unique to vfio-pci-zdev (s390x) and
are used to negotiate the various aspects of zPCI interpretation setup.
By allowing interpretation of zPCI instructions and firmware delivery of
interrupts to guests, we can significantly reduce the frequency of guest
SIE exits for zPCI. We then see additional gains by handling a hot-path
instruction that can still intercept to the hypervisor (RPCIT) directly
in KVM.

From the perspective of guest configuration, you pass through zPCI devices
in the same manner as before, with interpretation support being used by
default if available in both the kernel and QEMU.

Will reply with a link to the associated QEMU series.

Matthew Rosato (32):
s390/sclp: detect the zPCI interpretation facility
s390/sclp: detect the AISII facility
s390/sclp: detect the AENI facility
s390/sclp: detect the AISI facility
s390/airq: pass more TPI info to airq handlers
s390/airq: allow for airq structure that uses an input vector
s390/pci: externalize the SIC operation controls and routine
s390/pci: stash associated GISA designation
s390/pci: export some routines related to RPCIT processing
s390/pci: stash dtsm and maxstbl
s390/pci: add helper function to find device by handle
s390/pci: get SHM information from list pci
KVM: s390: pci: add basic kvm_zdev structure
KVM: s390: pci: do initial setup for AEN interpretation
KVM: s390: pci: enable host forwarding of Adapter Event Notifications
KVM: s390: expose the guest zPCI interpretation facility
KVM: s390: expose the guest Adapter Interruption Source ID facility
KVM: s390: expose guest Adapter Event Notification Interpretation
facility
KVM: s390: mechanism to enable guest zPCI Interpretation
KVM: s390: pci: provide routines for enabling/disabling interpretation
KVM: s390: pci: provide routines for enabling/disabling interrupt
forwarding
KVM: s390: pci: provide routines for enabling/disabling IOAT assist
KVM: s390: pci: handle refresh of PCI translations
KVM: s390: intercept the rpcit instruction
vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
vfio-pci/zdev: wire up group notifier
vfio-pci/zdev: wire up zPCI interpretive execution support
vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
vfio-pci/zdev: wire up zPCI IOAT assist support
vfio-pci/zdev: add DTSM to clp group capability
KVM: s390: introduce CPU feature for zPCI Interpretation
MAINTAINERS: additional files related kvm s390 pci passthrough

MAINTAINERS | 2 +
arch/s390/include/asm/airq.h | 7 +-
arch/s390/include/asm/kvm_host.h | 5 +
arch/s390/include/asm/kvm_pci.h | 62 +++
arch/s390/include/asm/pci.h | 13 +
arch/s390/include/asm/pci_clp.h | 11 +-
arch/s390/include/asm/pci_dma.h | 3 +
arch/s390/include/asm/pci_insn.h | 29 +-
arch/s390/include/asm/sclp.h | 4 +
arch/s390/include/asm/tpi.h | 14 +
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/Makefile | 2 +-
arch/s390/kvm/interrupt.c | 97 +++-
arch/s390/kvm/kvm-s390.c | 65 ++-
arch/s390/kvm/kvm-s390.h | 10 +
arch/s390/kvm/pci.c | 784 +++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 59 +++
arch/s390/kvm/priv.c | 41 ++
arch/s390/pci/pci.c | 47 ++
arch/s390/pci/pci_clp.c | 19 +-
arch/s390/pci/pci_dma.c | 1 +
arch/s390/pci/pci_insn.c | 5 +-
arch/s390/pci/pci_irq.c | 50 +-
drivers/s390/char/sclp_early.c | 4 +
drivers/s390/cio/airq.c | 12 +-
drivers/s390/cio/qdio_thinint.c | 6 +-
drivers/s390/crypto/ap_bus.c | 9 +-
drivers/s390/virtio/virtio_ccw.c | 6 +-
drivers/vfio/pci/Kconfig | 11 +
drivers/vfio/pci/Makefile | 2 +-
drivers/vfio/pci/vfio_pci_core.c | 8 +
drivers/vfio/pci/vfio_pci_zdev.c | 292 +++++++++++-
include/linux/vfio_pci_core.h | 44 +-
include/uapi/linux/vfio.h | 22 +
include/uapi/linux/vfio_zdev.h | 51 ++
35 files changed, 1738 insertions(+), 60 deletions(-)
create mode 100644 arch/s390/include/asm/kvm_pci.h
create mode 100644 arch/s390/kvm/pci.c
create mode 100644 arch/s390/kvm/pci.h

--
2.27.0



2021-12-07 20:58:11

by Matthew Rosato

Subject: [PATCH 01/32] s390/sclp: detect the zPCI interpretation facility

Detect the zPCI Load/Store Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index c68ea35de498..c84e8e0ca344 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -88,6 +88,7 @@ struct sclp_info {
unsigned char has_diag318 : 1;
unsigned char has_sipl : 1;
unsigned char has_dirq : 1;
+ unsigned char has_zpci_interp : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index b64feab62caa..2e8199b7ae50 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_gisaf = !!(sccb->fac118 & 0x08);
sclp.has_hvs = !!(sccb->fac119 & 0x80);
sclp.has_kss = !!(sccb->fac98 & 0x01);
+ sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
if (sccb->fac91 & 0x40)
--
2.27.0


2021-12-07 20:58:14

by Matthew Rosato

Subject: [PATCH 02/32] s390/sclp: detect the AISII facility

Detect the Adapter Interruption Source ID Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index c84e8e0ca344..524a99baf221 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -89,6 +89,7 @@ struct sclp_info {
unsigned char has_sipl : 1;
unsigned char has_dirq : 1;
unsigned char has_zpci_interp : 1;
+ unsigned char has_aisii : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index 2e8199b7ae50..a73120b8a5de 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_gisaf = !!(sccb->fac118 & 0x08);
sclp.has_hvs = !!(sccb->fac119 & 0x80);
sclp.has_kss = !!(sccb->fac98 & 0x01);
+ sclp.has_aisii = !!(sccb->fac118 & 0x40);
sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
--
2.27.0


2021-12-07 20:58:21

by Matthew Rosato

Subject: [PATCH 03/32] s390/sclp: detect the AENI facility

Detect the Adapter Event Notification Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index 524a99baf221..a763563bb3e7 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -90,6 +90,7 @@ struct sclp_info {
unsigned char has_dirq : 1;
unsigned char has_zpci_interp : 1;
unsigned char has_aisii : 1;
+ unsigned char has_aeni : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index a73120b8a5de..52a203ea23cc 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -46,6 +46,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_hvs = !!(sccb->fac119 & 0x80);
sclp.has_kss = !!(sccb->fac98 & 0x01);
sclp.has_aisii = !!(sccb->fac118 & 0x40);
+ sclp.has_aeni = !!(sccb->fac118 & 0x20);
sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
--
2.27.0
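The three detection patches above share one idiom: mask a single bit out of the `fac118` byte and normalize it to 0 or 1 with `!!`. The following is a minimal userspace sketch of that idiom, not kernel code. `struct pci_facilities` and `decode_fac118()` are hypothetical names introduced for illustration; only the masks 0x01, 0x40 and 0x20 are taken from the patches.

```c
#include <stdint.h>

/* Hypothetical container; the kernel keeps these bits in struct sclp_info. */
struct pci_facilities {
	unsigned char has_zpci_interp : 1;
	unsigned char has_aisii : 1;
	unsigned char has_aeni : 1;
};

/* Decode the fac118 byte using the masks from patches 01-03. */
static struct pci_facilities decode_fac118(uint8_t fac118)
{
	struct pci_facilities f = {0};

	/* "!!" collapses any non-zero masked value to exactly 1. */
	f.has_aisii = !!(fac118 & 0x40);
	f.has_aeni = !!(fac118 & 0x20);
	f.has_zpci_interp = !!(fac118 & 0x01);
	return f;
}
```

The `!!` matters because the targets are 1-bit fields: assigning the raw masked value 0x40 to such a field would truncate it to 0.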


2021-12-07 20:58:35

by Matthew Rosato

Subject: [PATCH 06/32] s390/airq: allow for airq structure that uses an input vector

When doing device passthrough where interrupts are being forwarded
from host to guest, we wish to use a pinned section of guest memory
as the vector (the same memory used by the guest as the vector).

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/airq.h | 4 +++-
arch/s390/pci/pci_irq.c | 8 ++++----
drivers/s390/cio/airq.c | 10 +++++++---
drivers/s390/virtio/virtio_ccw.c | 2 +-
4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 7918a7d09028..e82e5626e139 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -47,8 +47,10 @@ struct airq_iv {
#define AIRQ_IV_PTR 4 /* Allocate the ptr array */
#define AIRQ_IV_DATA 8 /* Allocate the data array */
#define AIRQ_IV_CACHELINE 16 /* Cacheline alignment for the vector */
+#define AIRQ_IV_GUESTVEC 32 /* Vector is a pinned guest page */

-struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags);
+struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
+ unsigned long *vec);
void airq_iv_release(struct airq_iv *iv);
unsigned long airq_iv_alloc(struct airq_iv *iv, unsigned long num);
void airq_iv_free(struct airq_iv *iv, unsigned long bit, unsigned long num);
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 880bcd73f11a..dfd4f3276a6d 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -296,7 +296,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
zdev->aisb = bit;

/* Create adapter interrupt vector */
- zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK);
+ zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK, 0);
if (!zdev->aibv)
return -ENOMEM;

@@ -421,7 +421,7 @@ static int __init zpci_directed_irq_init(void)
union zpci_sic_iib iib = {{0}};
unsigned int cpu;

- zpci_sbv = airq_iv_create(num_possible_cpus(), 0);
+ zpci_sbv = airq_iv_create(num_possible_cpus(), 0, 0);
if (!zpci_sbv)
return -ENOMEM;

@@ -443,7 +443,7 @@ static int __init zpci_directed_irq_init(void)
zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
AIRQ_IV_DATA |
AIRQ_IV_CACHELINE |
- (!cpu ? AIRQ_IV_ALLOC : 0));
+ (!cpu ? AIRQ_IV_ALLOC : 0), 0);
if (!zpci_ibv[cpu])
return -ENOMEM;
}
@@ -460,7 +460,7 @@ static int __init zpci_floating_irq_init(void)
if (!zpci_ibv)
return -ENOMEM;

- zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC);
+ zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
if (!zpci_sbv)
goto out_free;

diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index 2f2226786319..375a58b1c838 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -122,10 +122,12 @@ static inline unsigned long iv_size(unsigned long bits)
* airq_iv_create - create an interrupt vector
* @bits: number of bits in the interrupt vector
* @flags: allocation flags
+ * @vec: pointer to pinned guest memory if AIRQ_IV_GUESTVEC
*
* Returns a pointer to an interrupt vector structure
*/
-struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
+struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
+ unsigned long *vec)
{
struct airq_iv *iv;
unsigned long size;
@@ -146,6 +148,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
&iv->vector_dma);
if (!iv->vector)
goto out_free;
+ } else if (flags & AIRQ_IV_GUESTVEC) {
+ iv->vector = vec;
} else {
iv->vector = cio_dma_zalloc(size);
if (!iv->vector)
@@ -185,7 +189,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
kfree(iv->avail);
if (iv->flags & AIRQ_IV_CACHELINE && iv->vector)
dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
- else
+ else if (!(iv->flags & AIRQ_IV_GUESTVEC))
cio_dma_free(iv->vector, size);
kfree(iv);
out:
@@ -204,7 +208,7 @@ void airq_iv_release(struct airq_iv *iv)
kfree(iv->bitlock);
if (iv->flags & AIRQ_IV_CACHELINE)
dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
- else
+ else if (!(iv->flags & AIRQ_IV_GUESTVEC))
cio_dma_free(iv->vector, iv_size(iv->bits));
kfree(iv->avail);
kfree(iv);
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 52c376d15978..ff84f45587be 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -241,7 +241,7 @@ static struct airq_info *new_airq_info(int index)
return NULL;
rwlock_init(&info->lock);
info->aiv = airq_iv_create(VIRTIO_IV_BITS, AIRQ_IV_ALLOC | AIRQ_IV_PTR
- | AIRQ_IV_CACHELINE);
+ | AIRQ_IV_CACHELINE, 0);
if (!info->aiv) {
kfree(info);
return NULL;
--
2.27.0
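The ownership rule this patch introduces (a vector supplied by the caller is borrowed, never freed) can be sketched in userspace as follows. This is a simplified model under stated assumptions: `struct iv`, `iv_create()`, `iv_release()` and `IV_GUESTVEC` are hypothetical stand-ins for `struct airq_iv`, `airq_iv_create()`, `airq_iv_release()` and `AIRQ_IV_GUESTVEC`, and `calloc()` stands in for the kernel's DMA allocators.

```c
#include <stdlib.h>

#define IV_GUESTVEC 32	/* mirrors AIRQ_IV_GUESTVEC: vector is caller-owned */

/* Hypothetical, reduced stand-in for struct airq_iv. */
struct iv {
	unsigned long *vector;
	unsigned long flags;
	unsigned long bits;
};

/* Bytes needed to hold "bits" bits, rounded up to whole unsigned longs. */
static size_t iv_bytes(unsigned long bits)
{
	size_t bits_per_long = 8 * sizeof(unsigned long);

	return ((bits + bits_per_long - 1) / bits_per_long) * sizeof(unsigned long);
}

static struct iv *iv_create(unsigned long bits, unsigned long flags,
			    unsigned long *vec)
{
	struct iv *iv = calloc(1, sizeof(*iv));

	if (!iv)
		return NULL;
	iv->bits = bits;
	iv->flags = flags;
	if (flags & IV_GUESTVEC) {
		/* Borrow the caller's memory; nothing is allocated here. */
		iv->vector = vec;
	} else {
		iv->vector = calloc(1, iv_bytes(bits));
		if (!iv->vector) {
			free(iv);
			return NULL;
		}
	}
	return iv;
}

static void iv_release(struct iv *iv)
{
	/* Only free what iv_create() allocated itself. */
	if (!(iv->flags & IV_GUESTVEC))
		free(iv->vector);
	free(iv);
}
```

Like the patch, the sketch does not validate `vec`: a caller that sets the guest-vector flag but passes a null pointer gets a structure with an unusable vector.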


2021-12-07 20:58:39

by Matthew Rosato

Subject: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

A subsequent patch will be issuing SIC from KVM -- export the necessary
routine and make the operation control definitions available from a header.
Because the routine will now be exported, let's swap the purpose of
zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
within pci_irq.c only for SIC calls that don't specify an iib.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
arch/s390/pci/pci_insn.c | 3 ++-
arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
3 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 61cf9531f68f..5331082fa516 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -98,6 +98,14 @@ struct zpci_fib {
u32 gd;
} __packed __aligned(8);

+/* Set Interruption Controls Operation Controls */
+#define SIC_IRQ_MODE_ALL 0
+#define SIC_IRQ_MODE_SINGLE 1
+#define SIC_IRQ_MODE_DIRECT 4
+#define SIC_IRQ_MODE_D_ALL 16
+#define SIC_IRQ_MODE_D_SINGLE 17
+#define SIC_IRQ_MODE_SET_CPU 18
+
/* directed interruption information block */
struct zpci_diib {
u32 : 1;
@@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
int __zpci_store_block(const u64 *data, u64 req, u64 offset);
void zpci_barrier(void);
-int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
-
-static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
-{
- union zpci_sic_iib iib = {{0}};
-
- return __zpci_set_irq_ctrl(ctl, isc, &iib);
-}
+int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);

#endif
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 28d863aaafea..d1a8bd43ce26 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
}

/* Set Interruption Controls */
-int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
+int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
{
if (!test_facility(72))
return -EIO;
@@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)

return 0;
}
+EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);

/* PCI Load */
static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index dfd4f3276a6d..6b29e39496d1 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -15,13 +15,6 @@

static enum {FLOATING, DIRECTED} irq_delivery;

-#define SIC_IRQ_MODE_ALL 0
-#define SIC_IRQ_MODE_SINGLE 1
-#define SIC_IRQ_MODE_DIRECT 4
-#define SIC_IRQ_MODE_D_ALL 16
-#define SIC_IRQ_MODE_D_SINGLE 17
-#define SIC_IRQ_MODE_SET_CPU 18
-
/*
* summary bit vector
* FLOATING - summary bit per function
@@ -145,6 +138,13 @@ static int zpci_set_irq_affinity(struct irq_data *data, const struct cpumask *de
return IRQ_SET_MASK_OK;
}

+static inline int __zpci_set_irq_ctrl(u16 ctl, u8 isc)
+{
+ union zpci_sic_iib iib = {{0}};
+
+ return zpci_set_irq_ctrl(ctl, isc, &iib);
+}
+
static struct irq_chip zpci_irq_chip = {
.name = "PCI-MSI",
.irq_unmask = pci_msi_unmask_irq,
@@ -165,7 +165,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
/* End of second scan with interrupts on. */
break;
/* First scan complete, reenable interrupts. */
- if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
+ if (__zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
break;
bit = 0;
continue;
@@ -203,7 +203,7 @@ static void zpci_handle_fallback_irq(void)
/* End of second scan with interrupts on. */
break;
/* First scan complete, reenable interrupts. */
- if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
+ if (__zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
break;
cpu = 0;
continue;
@@ -247,7 +247,7 @@ static void zpci_floating_irq_handler(struct airq_struct *airq,
/* End of second scan with interrupts on. */
break;
/* First scan complete, reenable interrupts. */
- if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
+ if (__zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
break;
si = 0;
continue;
@@ -412,8 +412,8 @@ static void __init cpu_enable_directed_irq(void *unused)

iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector;

- __zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
- zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
+ __zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
}

static int __init zpci_directed_irq_init(void)
@@ -428,7 +428,7 @@ static int __init zpci_directed_irq_init(void)
iib.diib.isc = PCI_ISC;
iib.diib.nr_cpus = num_possible_cpus();
iib.diib.disb_addr = (u64) zpci_sbv->vector;
- __zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
+ zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);

zpci_ibv = kcalloc(num_possible_cpus(), sizeof(*zpci_ibv),
GFP_KERNEL);
@@ -504,7 +504,7 @@ int __init zpci_irq_init(void)
* Enable floating IRQs (with suppression after one IRQ). When using
* directed IRQs this enables the fallback path.
*/
- zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
+ __zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);

return 0;
out_airq:
--
2.27.0
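The role swap described in this patch can be illustrated with a small userspace model: the externally visible function takes the information block, and a file-local wrapper supplies a zeroed block for callers that have nothing to pass. All names here (`union sic_iib`, `set_irq_ctrl()`, `set_irq_ctrl_default()`) are hypothetical stand-ins, and the "instruction" is simulated by recording its arguments.

```c
#include <stdint.h>

/* Hypothetical stand-in for union zpci_sic_iib. */
union sic_iib {
	uint64_t raw[2];
};

static union sic_iib last_iib;	/* last block handed to the simulated SIC */
static uint16_t last_ctl;	/* last operation control seen */

/* Full interface (exported in the patch): caller supplies the block. */
int set_irq_ctrl(uint16_t ctl, uint8_t isc, union sic_iib *iib)
{
	(void)isc;
	last_ctl = ctl;
	last_iib = *iib;
	return 0;
}

/* File-local wrapper for calls that do not specify a block. */
static inline int set_irq_ctrl_default(uint16_t ctl, uint8_t isc)
{
	union sic_iib iib = {{0}};

	return set_irq_ctrl(ctl, isc, &iib);
}
```

Keeping the three-argument form as the exported symbol means an out-of-file caller such as KVM never needs the wrapper.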


2021-12-07 20:58:52

by Matthew Rosato

Subject: [PATCH 08/32] s390/pci: stash associated GISA designation

For passthrough devices, we will need to know the GISA designation of the
guest if interpretation facilities are to be used. Set up to stash this in
the zdev and set a default of 0 (no GISA designation) for now; a subsequent
patch will set a valid GISA designation for passthrough devices.
Also, extend the mpcifc routines to specify this stashed designation as part
of the mpcifc command.

Reviewed-by: Niklas Schnelle <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 1 +
arch/s390/include/asm/pci_clp.h | 3 ++-
arch/s390/pci/pci.c | 9 +++++++++
arch/s390/pci/pci_clp.c | 1 +
arch/s390/pci/pci_irq.c | 5 +++++
5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 90824be5ce9a..2474b8d30f2a 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -123,6 +123,7 @@ struct zpci_dev {
enum zpci_state state;
u32 fid; /* function ID, used by sclp */
u32 fh; /* function handle, used by insn's */
+ u32 gd; /* GISA designation for passthrough */
u16 vfn; /* virtual function number */
u16 pchid; /* physical channel ID */
u8 pfgid; /* function group ID */
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 1f4b666e85ee..3af8d196da74 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -173,7 +173,8 @@ struct clp_req_set_pci {
u16 reserved2;
u8 oc; /* operation controls */
u8 ndas; /* number of dma spaces */
- u64 reserved3;
+ u32 reserved3;
+ u32 gd; /* GISA designation */
} __packed;

/* Set PCI function response */
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 2f9b78fa82a5..9b4d3d78b444 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -119,6 +119,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
fib.pba = base;
fib.pal = limit;
fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
+ fib.gd = zdev->gd;
cc = zpci_mod_fc(req, &fib, &status);
if (cc)
zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
@@ -132,6 +133,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
struct zpci_fib fib = {0};
u8 cc, status;

+ fib.gd = zdev->gd;
+
cc = zpci_mod_fc(req, &fib, &status);
if (cc)
zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
@@ -159,6 +162,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
atomic64_set(&zdev->unmapped_pages, 0);

fib.fmb_addr = virt_to_phys(zdev->fmb);
+ fib.gd = zdev->gd;
cc = zpci_mod_fc(req, &fib, &status);
if (cc) {
kmem_cache_free(zdev_fmb_cache, zdev->fmb);
@@ -177,6 +181,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
if (!zdev->fmb)
return -EINVAL;

+ fib.gd = zdev->gd;
+
/* Function measurement is disabled if fmb address is zero */
cc = zpci_mod_fc(req, &fib, &status);
if (cc == 3) /* Function already gone. */
@@ -807,6 +813,9 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, enum zpci_state state)
zdev->fid = fid;
zdev->fh = fh;

+ /* For now, assume it is not a passthrough device */
+ zdev->gd = 0;
+
/* Query function properties and update zdev */
rc = clp_query_pci_fn(zdev);
if (rc)
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index be077b39da33..e9ed0e4a5cf0 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -240,6 +240,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
rrb->request.fh = zdev->fh;
rrb->request.oc = command;
rrb->request.ndas = nr_dma_as;
+ rrb->request.gd = zdev->gd;

rc = clp_req(rrb, CLP_LPS_PCI);
if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 6b29e39496d1..9e8b4507234d 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
fib.fmt0.aibvo = 0; /* each zdev has its own interrupt vector */
fib.fmt0.aisb = (unsigned long) zpci_sbv->vector + (zdev->aisb/64)*8;
fib.fmt0.aisbo = zdev->aisb & 63;
+ fib.gd = zdev->gd;

return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
}
@@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
struct zpci_fib fib = {0};
u8 cc, status;

+ fib.gd = zdev->gd;
+
cc = zpci_mod_fc(req, &fib, &status);
if (cc == 3 || (cc == 1 && status == 24))
/* Function already gone or IRQs already deregistered. */
@@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
fib.fmt = 1;
fib.fmt1.noi = zdev->msi_nr_irqs;
fib.fmt1.dibvo = zdev->msi_first_bit;
+ fib.gd = zdev->gd;

return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
}
@@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
u8 cc, status;

fib.fmt = 1;
+ fib.gd = zdev->gd;
cc = zpci_mod_fc(req, &fib, &status);
if (cc == 3 || (cc == 1 && status == 24))
/* Function already gone or IRQs already deregistered. */
--
2.27.0
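Splitting the 8-byte `reserved3` into a 4-byte reserved field plus a 4-byte `gd` must not change the size of the request block. The compile-time check below sketches that property on a reduced, hypothetical copy of the struct tail (`struct req_tail_old` and `struct req_tail_new` are illustrative names, and the real `struct clp_req_set_pci` has further leading fields). It assumes a GCC- or Clang-compatible compiler with C11 `_Static_assert`.

```c
#include <stddef.h>
#include <stdint.h>

/* Tail of the request before the patch (hypothetical reduced struct). */
struct req_tail_old {
	uint16_t reserved2;
	uint8_t oc;		/* operation controls */
	uint8_t ndas;		/* number of dma spaces */
	uint64_t reserved3;
} __attribute__((packed));

/* Tail of the request after the patch. */
struct req_tail_new {
	uint16_t reserved2;
	uint8_t oc;		/* operation controls */
	uint8_t ndas;		/* number of dma spaces */
	uint32_t reserved3;
	uint32_t gd;		/* GISA designation */
} __attribute__((packed));

/* The block size is unchanged by the split. */
_Static_assert(sizeof(struct req_tail_old) == sizeof(struct req_tail_new),
	       "splitting reserved3 must not change the layout size");

/* gd occupies the second half of the formerly reserved 8 bytes. */
_Static_assert(offsetof(struct req_tail_new, gd) ==
	       offsetof(struct req_tail_old, reserved3) + 4,
	       "gd must sit in the upper-addressed half of old reserved3");
```

Because the default of 0 means "no GISA designation", existing callers see the same bytes as before the split.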


2021-12-07 20:58:58

by Matthew Rosato

Subject: [PATCH 09/32] s390/pci: export some routines related to RPCIT processing

KVM will re-use dma_walk_cpu_trans to walk the host shadow table and
will also need to be able to call zpci_refresh_trans to re-issue an RPCIT.

Reviewed-by: Niklas Schnelle <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/pci/pci_dma.c | 1 +
arch/s390/pci/pci_insn.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 1f4540d6bd2d..ae55f2f2ecd9 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -116,6 +116,7 @@ unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr)
px = calc_px(dma_addr);
return &pto[px];
}
+EXPORT_SYMBOL_GPL(dma_walk_cpu_trans);

void dma_update_cpu_trans(unsigned long *entry, void *page_addr, int flags)
{
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index d1a8bd43ce26..0d1ab268ec24 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -95,6 +95,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)

return (cc) ? -EIO : 0;
}
+EXPORT_SYMBOL_GPL(zpci_refresh_trans);

/* Set Interruption Controls */
int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
--
2.27.0


2021-12-07 20:59:05

by Matthew Rosato

Subject: [PATCH 10/32] s390/pci: stash dtsm and maxstbl

Store information about what IOAT designation types are supported by
underlying hardware as well as the largest store block size allowed.
These values will be needed by passthrough.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 2 ++
arch/s390/include/asm/pci_clp.h | 6 ++++--
arch/s390/pci/pci_clp.c | 2 ++
3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 2474b8d30f2a..1a8f9f42da3a 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -126,9 +126,11 @@ struct zpci_dev {
u32 gd; /* GISA designation for passthrough */
u16 vfn; /* virtual function number */
u16 pchid; /* physical channel ID */
+ u16 maxstbl; /* Maximum store block size */
u8 pfgid; /* function group ID */
u8 pft; /* pci function type */
u8 port;
+ u8 dtsm; /* Supported DT mask */
u8 rid_available : 1;
u8 has_hp_slot : 1;
u8 has_resources : 1;
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 3af8d196da74..124fadfb74b9 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -153,9 +153,11 @@ struct clp_rsp_query_pci_grp {
u8 : 6;
u8 frame : 1;
u8 refresh : 1; /* TLB refresh mode */
- u16 reserved2;
+ u16 : 3;
+ u16 maxstbl : 13; /* Maximum store block size */
u16 mui;
- u16 : 16;
+ u8 dtsm; /* Supported DT mask */
+ u8 reserved3;
u16 maxfaal;
u16 : 4;
u16 dnoi : 12;
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index e9ed0e4a5cf0..bc7446566cbc 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -103,6 +103,8 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
zdev->max_msi = response->noi;
zdev->fmb_update = response->mui;
zdev->version = response->version;
+ zdev->maxstbl = response->maxstbl;
+ zdev->dtsm = response->dtsm;

switch (response->version) {
case 1:
--
2.27.0
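The new `u16 : 3; u16 maxstbl : 13;` pair relies on bit-field allocation order. Assuming the big-endian, most-significant-bit-first allocation used on s390, the three unnamed bits are the top bits of the halfword and `maxstbl` is the low 13 bits. A portable sketch of the same extraction using explicit masks (`maxstbl_from_bytes()` is a hypothetical helper, not part of the patch):

```c
#include <stdint.h>

/* Extract maxstbl from the two response bytes as they appear in memory. */
static uint16_t maxstbl_from_bytes(const uint8_t b[2])
{
	/* Assemble the halfword in big-endian order, as s390 would load it. */
	uint16_t halfword = (uint16_t)((b[0] << 8) | b[1]);

	/* Drop the 3 leading unnamed bits, keep the low 13. */
	return halfword & 0x1fff;
}
```

The mask form gives the same answer on any host, which can be handy when inspecting a captured response on a little-endian machine.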


2021-12-07 20:59:08

by Matthew Rosato

Subject: [PATCH 11/32] s390/pci: add helper function to find device by handle

Intercepted zPCI instructions will specify the desired function via a
function handle. Add a routine to find the device with the specified
handle.

Acked-by: Niklas Schnelle <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 1 +
arch/s390/pci/pci.c | 16 ++++++++++++++++
2 files changed, 17 insertions(+)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 1a8f9f42da3a..00a2c24d6d2b 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -275,6 +275,7 @@ static inline struct zpci_dev *to_zpci_dev(struct device *dev)
}

struct zpci_dev *get_zdev_by_fid(u32);
+struct zpci_dev *get_zdev_by_fh(u32 fh);

/* DMA */
int zpci_dma_init(void);
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 9b4d3d78b444..af1c0ae017b1 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -76,6 +76,22 @@ struct zpci_dev *get_zdev_by_fid(u32 fid)
return zdev;
}

+struct zpci_dev *get_zdev_by_fh(u32 fh)
+{
+ struct zpci_dev *tmp, *zdev = NULL;
+
+ spin_lock(&zpci_list_lock);
+ list_for_each_entry(tmp, &zpci_list, entry) {
+ if (tmp->fh == fh) {
+ zdev = tmp;
+ break;
+ }
+ }
+ spin_unlock(&zpci_list_lock);
+ return zdev;
+}
+EXPORT_SYMBOL_GPL(get_zdev_by_fh);
+
void zpci_remove_reserved_devices(void)
{
struct zpci_dev *tmp, *zdev;
--
2.27.0
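The lookup added here is a linear walk of the device list under the list lock, returning the first entry whose handle matches. A userspace sketch of the same pattern follows; `struct dev`, `dev_add()` and `dev_by_fh()` are hypothetical names, and a C11 `atomic_flag` spin loop stands in for the kernel's `zpci_list_lock`.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct zpci_dev. */
struct dev {
	uint32_t fh;		/* function handle */
	struct dev *next;
};

static struct dev *dev_list;
static atomic_flag dev_list_lock = ATOMIC_FLAG_INIT;

static void list_lock(void)
{
	while (atomic_flag_test_and_set(&dev_list_lock))
		;	/* spin until the flag was previously clear */
}

static void list_unlock(void)
{
	atomic_flag_clear(&dev_list_lock);
}

/* Push a device onto the front of the list. */
static void dev_add(struct dev *d)
{
	list_lock();
	d->next = dev_list;
	dev_list = d;
	list_unlock();
}

/* Return the device with the given handle, or NULL if none matches. */
static struct dev *dev_by_fh(uint32_t fh)
{
	struct dev *tmp, *found = NULL;

	list_lock();
	for (tmp = dev_list; tmp; tmp = tmp->next) {
		if (tmp->fh == fh) {
			found = tmp;
			break;
		}
	}
	list_unlock();
	return found;
}
```

In this sketch the lock covers only the walk. The returned pointer is used after the lock is dropped, so the caller has to ensure the entry stays alive for as long as it is used.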


2021-12-07 20:59:09

by Matthew Rosato

Subject: [PATCH 12/32] s390/pci: get SHM information from list pci

KVM will need information on the special handle mask used to indicate
emulated devices. In order to obtain this, a new type of list pci call
must be made to gather the information. Remove the unused data pointer
from clp_list_pci and __clp_add and instead optionally pass a pointer to
a model-dependent-data field. Additionally, allow for clp_list_pci calls
that don't specify a callback - in this case, just do the first pass of
list pci and exit.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci.h | 6 ++++++
arch/s390/include/asm/pci_clp.h | 2 +-
arch/s390/pci/pci.c | 19 +++++++++++++++++++
arch/s390/pci/pci_clp.c | 16 ++++++++++------
4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 00a2c24d6d2b..86f43644756d 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -219,12 +219,18 @@ int zpci_unregister_ioat(struct zpci_dev *, u8);
void zpci_remove_reserved_devices(void);
void zpci_update_fh(struct zpci_dev *zdev, u32 fh);

+int zpci_get_mdd(u32 *mdd);
+
/* CLP */
+void *clp_alloc_block(gfp_t gfp_mask);
+void clp_free_block(void *ptr);
int clp_setup_writeback_mio(void);
int clp_scan_pci_devices(void);
int clp_query_pci_fn(struct zpci_dev *zdev);
int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
+int clp_list_pci(struct clp_req_rsp_list_pci *rrb, u32 *mdd,
+ void (*cb)(struct clp_fh_list_entry *));
int clp_get_state(u32 fid, enum zpci_state *state);
int clp_refresh_fh(u32 fid, u32 *fh);

diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 124fadfb74b9..d6bc324763f3 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -76,7 +76,7 @@ struct clp_req_list_pci {
struct clp_rsp_list_pci {
struct clp_rsp_hdr hdr;
u64 resume_token;
- u32 reserved2;
+ u32 mdd;
u16 max_fn;
u8 : 7;
u8 uid_checking : 1;
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index af1c0ae017b1..175854c861cd 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -531,6 +531,25 @@ void zpci_update_fh(struct zpci_dev *zdev, u32 fh)
zpci_do_update_iomap_fh(zdev, fh);
}

+int zpci_get_mdd(u32 *mdd)
+{
+ struct clp_req_rsp_list_pci *rrb;
+ int rc;
+
+ if (!mdd)
+ return -EINVAL;
+
+ rrb = clp_alloc_block(GFP_KERNEL);
+ if (!rrb)
+ return -ENOMEM;
+
+ rc = clp_list_pci(rrb, mdd, NULL);
+
+ clp_free_block(rrb);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(zpci_get_mdd);
+
static struct resource *__alloc_res(struct zpci_dev *zdev, unsigned long start,
unsigned long size, unsigned long flags)
{
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index bc7446566cbc..e18a548ac22d 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -84,12 +84,12 @@ static __always_inline int clp_req(void *data, unsigned int lps)
return cc;
}

-static void *clp_alloc_block(gfp_t gfp_mask)
+void *clp_alloc_block(gfp_t gfp_mask)
{
return (void *) __get_free_pages(gfp_mask, get_order(CLP_BLK_SIZE));
}

-static void clp_free_block(void *ptr)
+void clp_free_block(void *ptr)
{
free_pages((unsigned long) ptr, get_order(CLP_BLK_SIZE));
}
@@ -358,8 +358,8 @@ static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
return rc;
}

-static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
- void (*cb)(struct clp_fh_list_entry *, void *))
+int clp_list_pci(struct clp_req_rsp_list_pci *rrb, u32 *mdd,
+ void (*cb)(struct clp_fh_list_entry *))
{
u64 resume_token = 0;
int nentries, i, rc;
@@ -368,8 +368,12 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
rc = clp_list_pci_req(rrb, &resume_token, &nentries);
if (rc)
return rc;
+ if (mdd)
+ *mdd = rrb->response.mdd;
+ if (!cb)
+ return 0;
for (i = 0; i < nentries; i++)
- cb(&rrb->response.fh_list[i], data);
+ cb(&rrb->response.fh_list[i]);
} while (resume_token);

return rc;
@@ -398,7 +402,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
return -ENODEV;
}

-static void __clp_add(struct clp_fh_list_entry *entry, void *data)
+static void __clp_add(struct clp_fh_list_entry *entry)
{
struct zpci_dev *zdev;

--
2.27.0
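
The clp_list_pci() change above makes both the mdd output pointer and the callback optional. A minimal userspace sketch of that calling convention follows; struct list_rsp, struct fh_entry, list_pci() and count_entry() are hypothetical stand-ins, not the kernel's types, and the resume-token loop is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fh_entry { uint32_t fid; };

/* Hypothetical response block: a header field plus a few entries. */
struct list_rsp {
	uint32_t mdd;
	int nentries;
	struct fh_entry fh_list[4];
};

static int visited;

/* Example callback: only counts how many entries it was shown. */
static void count_entry(struct fh_entry *entry)
{
	(void)entry;
	visited++;
}

/*
 * Copy out mdd when requested; walk the entries only when a callback
 * is supplied. Either argument may be NULL.
 */
static int list_pci(struct list_rsp *rsp, uint32_t *mdd,
		    void (*cb)(struct fh_entry *))
{
	int i;

	if (mdd)
		*mdd = rsp->mdd;
	if (!cb)
		return 0;
	for (i = 0; i < rsp->nentries; i++)
		cb(&rsp->fh_list[i]);
	return 0;
}
```

A caller that only wants the header value passes NULL for the callback, which is how zpci_get_mdd() uses the real routine.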


2021-12-07 20:59:17

by Matthew Rosato

Subject: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure

This structure will be used to carry kvm passthrough information related to
zPCI devices.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++
arch/s390/include/asm/pci.h | 3 ++
arch/s390/kvm/Makefile | 2 +-
arch/s390/kvm/pci.c | 57 +++++++++++++++++++++++++++++++++
4 files changed, 90 insertions(+), 1 deletion(-)
create mode 100644 arch/s390/include/asm/kvm_pci.h
create mode 100644 arch/s390/kvm/pci.c

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
new file mode 100644
index 000000000000..3e491a39704c
--- /dev/null
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KVM PCI Passthrough for virtual machines on s390
+ *
+ * Copyright IBM Corp. 2021
+ *
+ * Author(s): Matthew Rosato <[email protected]>
+ */
+
+
+#ifndef ASM_KVM_PCI_H
+#define ASM_KVM_PCI_H
+
+#include <linux/types.h>
+#include <linux/kvm_types.h>
+#include <linux/kvm_host.h>
+#include <linux/kvm.h>
+#include <linux/pci.h>
+
+struct kvm_zdev {
+ struct zpci_dev *zdev;
+ struct kvm *kvm;
+};
+
+extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
+extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
+extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
+
+#endif /* ASM_KVM_PCI_H */
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 86f43644756d..32810e1ed308 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -97,6 +97,7 @@ struct zpci_bar_struct {
};

struct s390_domain;
+struct kvm_zdev;

#define ZPCI_FUNCTIONS_PER_BUS 256
struct zpci_bus {
@@ -190,6 +191,8 @@ struct zpci_dev {
struct dentry *debugfs_dev;

struct s390_domain *s390_domain; /* s390 IOMMU domain data */
+
+ struct kvm_zdev *kzdev; /* passthrough data */
};

static inline bool zdev_enabled(struct zpci_dev *zdev)
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index b3aaadc60ead..95ea865e5d29 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
ccflags-y := -Ivirt/kvm -Iarch/s390/kvm

kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
-kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
+kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o

obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
new file mode 100644
index 000000000000..ecfc458a5b39
--- /dev/null
+++ b/arch/s390/kvm/pci.c
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * s390 kvm PCI passthrough support
+ *
+ * Copyright IBM Corp. 2021
+ *
+ * Author(s): Matthew Rosato <[email protected]>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/pci.h>
+#include <asm/kvm_pci.h>
+
+int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
+{
+ struct kvm_zdev *kzdev;
+
+ if (zdev == NULL)
+ return -ENODEV;
+
+ kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
+ if (!kzdev)
+ return -ENOMEM;
+
+ kzdev->zdev = zdev;
+ zdev->kzdev = kzdev;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
+
+void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
+{
+ struct kvm_zdev *kzdev;
+
+ if (!zdev || !zdev->kzdev)
+ return;
+
+ kzdev = zdev->kzdev;
+ WARN_ON(kzdev->zdev != zdev);
+ zdev->kzdev = NULL;
+ kfree(kzdev);
+
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
+
+int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
+{
+ struct kvm_zdev *kzdev = zdev->kzdev;
+
+ if (!kzdev)
+ return -ENODEV;
+
+ kzdev->kvm = kvm;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
--
2.27.0
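
The open/attach/release lifecycle introduced by patch 13 can be sketched in userspace as follows. This is only an illustration: calloc()/free() stand in for kzalloc()/kfree(), the structures are reduced to the fields the patch touches, and dev_open(), dev_attach() and dev_release() are hypothetical names. Unlike the patch, dev_attach() here also checks the device pointer itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct kvm { int id; };
struct kvm_zdev;

struct zpci_dev {
	struct kvm_zdev *kzdev;	/* passthrough data, NULL when closed */
};

struct kvm_zdev {
	struct zpci_dev *zdev;	/* back-pointer to the owning device */
	struct kvm *kvm;	/* guest the device is attached to */
};

/* Allocate the passthrough structure and link it to the device. */
static int dev_open(struct zpci_dev *zdev)
{
	struct kvm_zdev *kzdev;

	if (!zdev)
		return -ENODEV;
	kzdev = calloc(1, sizeof(*kzdev));
	if (!kzdev)
		return -ENOMEM;
	kzdev->zdev = zdev;
	zdev->kzdev = kzdev;
	return 0;
}

/* Record which guest owns the device; requires a prior dev_open(). */
static int dev_attach(struct zpci_dev *zdev, struct kvm *kvm)
{
	if (!zdev || !zdev->kzdev)
		return -ENODEV;
	zdev->kzdev->kvm = kvm;
	return 0;
}

/* Unlink and free the passthrough structure; safe to call when closed. */
static void dev_release(struct zpci_dev *zdev)
{
	if (!zdev || !zdev->kzdev)
		return;
	free(zdev->kzdev);
	zdev->kzdev = NULL;
}
```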


2021-12-07 20:59:20

by Matthew Rosato

Subject: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation

Initial setup for Adapter Event Notification Interpretation for zPCI
passthrough devices. Specifically, allocate a structure for forwarding of
adapter events and pass the address of this structure to firmware.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/pci_insn.h | 12 ++++
arch/s390/kvm/interrupt.c | 17 +++++
arch/s390/kvm/kvm-s390.c | 3 +
arch/s390/kvm/pci.c | 113 +++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 42 ++++++++++++
5 files changed, 187 insertions(+)
create mode 100644 arch/s390/kvm/pci.h

diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 5331082fa516..e5f57cfe1d45 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -101,6 +101,7 @@ struct zpci_fib {
/* Set Interruption Controls Operation Controls */
#define SIC_IRQ_MODE_ALL 0
#define SIC_IRQ_MODE_SINGLE 1
+#define SIC_SET_AENI_CONTROLS 2
#define SIC_IRQ_MODE_DIRECT 4
#define SIC_IRQ_MODE_D_ALL 16
#define SIC_IRQ_MODE_D_SINGLE 17
@@ -127,9 +128,20 @@ struct zpci_cdiib {
u64 : 64;
} __packed __aligned(8);

+/* adapter interruption parameters block */
+struct zpci_aipb {
+ u64 faisb;
+ u64 gait;
+ u16 : 13;
+ u16 afi : 3;
+ u32 : 32;
+ u16 faal;
+} __packed __aligned(8);
+
union zpci_sic_iib {
struct zpci_diib diib;
struct zpci_cdiib cdiib;
+ struct zpci_aipb aipb;
};

DECLARE_STATIC_KEY_FALSE(have_mio);
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index f9b872e358c6..4efe0e95a40f 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -32,6 +32,7 @@
#include "kvm-s390.h"
#include "gaccess.h"
#include "trace-s390.h"
+#include "pci.h"

#define PFAULT_INIT 0x0600
#define PFAULT_DONE 0x0680
@@ -3276,8 +3277,16 @@ static struct airq_struct gib_alert_irq = {

void kvm_s390_gib_destroy(void)
{
+ struct zpci_aift *aift;
+
if (!gib)
return;
+ aift = kvm_s390_pci_get_aift();
+ if (aift) {
+ mutex_lock(&aift->lock);
+ kvm_s390_pci_aen_exit();
+ mutex_unlock(&aift->lock);
+ }
chsc_sgib(0);
unregister_adapter_interrupt(&gib_alert_irq);
free_page((unsigned long)gib);
@@ -3315,6 +3324,14 @@ int kvm_s390_gib_init(u8 nisc)
goto out_unreg_gal;
}

+ if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
+ if (kvm_s390_pci_aen_init(nisc)) {
+ pr_err("Initializing AEN for PCI failed\n");
+ rc = -EIO;
+ goto out_unreg_gal;
+ }
+ }
+
KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
goto out;

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 14a18ba5ff2c..9cd3c8eb59e8 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -48,6 +48,7 @@
#include <asm/fpu/api.h>
#include "kvm-s390.h"
#include "gaccess.h"
+#include "pci.h"

#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -503,6 +504,8 @@ int kvm_arch_init(void *opaque)
goto out;
}

+ kvm_s390_pci_init();
+
rc = kvm_s390_gib_init(GAL_ISC);
if (rc)
goto out;
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index ecfc458a5b39..f0e5386ff943 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -10,6 +10,113 @@
#include <linux/kvm_host.h>
#include <linux/pci.h>
#include <asm/kvm_pci.h>
+#include "pci.h"
+
+static struct zpci_aift aift;
+
+static inline int __set_irq_noiib(u16 ctl, u8 isc)
+{
+ union zpci_sic_iib iib = {{0}};
+
+ return zpci_set_irq_ctrl(ctl, isc, &iib);
+}
+
+struct zpci_aift *kvm_s390_pci_get_aift(void)
+{
+ return &aift;
+}
+
+/* Caller must hold the aift lock before calling this function */
+void kvm_s390_pci_aen_exit(void)
+{
+ struct zpci_gaite *gait;
+ unsigned long flags;
+ struct airq_iv *sbv;
+ struct kvm_zdev **gait_kzdev;
+ int size;
+
+ /* Clear the GAIT and forwarding summary vector */
+ __set_irq_noiib(SIC_SET_AENI_CONTROLS, 0);
+
+ spin_lock_irqsave(&aift.gait_lock, flags);
+ gait = aift.gait;
+ sbv = aift.sbv;
+ gait_kzdev = aift.kzdev;
+ aift.gait = NULL;
+ aift.sbv = NULL;
+ aift.kzdev = NULL;
+ spin_unlock_irqrestore(&aift.gait_lock, flags);
+
+ if (sbv)
+ airq_iv_release(sbv);
+ size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
+ sizeof(struct zpci_gaite)));
+ free_pages((unsigned long)gait, size);
+ kfree(gait_kzdev);
+}
+
+int kvm_s390_pci_aen_init(u8 nisc)
+{
+ union zpci_sic_iib iib = {{0}};
+ struct page *page;
+ int rc = 0, size;
+
+ /* If already enabled for AEN, bail out now */
+ if (aift.gait || aift.sbv)
+ return -EPERM;
+
+ mutex_lock(&aift.lock);
+ aift.kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev *),
+ GFP_KERNEL);
+ if (!aift.kzdev) {
+ rc = -ENOMEM;
+ goto unlock;
+ }
+ aift.sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
+ if (!aift.sbv) {
+ rc = -ENOMEM;
+ goto free_zdev;
+ }
+ size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
+ sizeof(struct zpci_gaite)));
+ page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
+ if (!page) {
+ rc = -ENOMEM;
+ goto free_sbv;
+ }
+ aift.gait = (struct zpci_gaite *)page_to_phys(page);
+
+ iib.aipb.faisb = (u64)aift.sbv->vector;
+ iib.aipb.gait = (u64)aift.gait;
+ iib.aipb.afi = nisc;
+ iib.aipb.faal = ZPCI_NR_DEVICES;
+
+ /* Setup Adapter Event Notification Interpretation */
+ if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, &iib)) {
+ rc = -EIO;
+ goto free_gait;
+ }
+
+ /* Enable floating IRQs */
+ if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
+ rc = -EIO;
+ kvm_s390_pci_aen_exit();
+ }
+
+ goto unlock;
+
+free_gait:
+ size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
+ sizeof(struct zpci_gaite)));
+ free_pages((unsigned long)aift.gait, size);
+free_sbv:
+ airq_iv_release(aift.sbv);
+free_zdev:
+ kfree(aift.kzdev);
+unlock:
+ mutex_unlock(&aift.lock);
+ return rc;
+}

int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
{
@@ -55,3 +162,9 @@ int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
return 0;
}
EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
+
+void kvm_s390_pci_init(void)
+{
+ spin_lock_init(&aift.gait_lock);
+ mutex_init(&aift.lock);
+}
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
new file mode 100644
index 000000000000..74b06d39be3b
--- /dev/null
+++ b/arch/s390/kvm/pci.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * s390 kvm PCI passthrough support
+ *
+ * Copyright IBM Corp. 2021
+ *
+ * Author(s): Matthew Rosato <[email protected]>
+ */
+
+#ifndef __KVM_S390_PCI_H
+#define __KVM_S390_PCI_H
+
+#include <linux/pci.h>
+#include <linux/mutex.h>
+#include <asm/airq.h>
+#include <asm/kvm_pci.h>
+
+struct zpci_gaite {
+ unsigned int gisa;
+ u8 gisc;
+ u8 count;
+ u8 reserved;
+ u8 aisbo;
+ unsigned long aisb;
+};
+
+struct zpci_aift {
+ struct zpci_gaite *gait;
+ struct airq_iv *sbv;
+ struct kvm_zdev **kzdev;
+ spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
+ struct mutex lock; /* Protects the other structures in aift */
+};
+
+struct zpci_aift *kvm_s390_pci_get_aift(void);
+
+int kvm_s390_pci_aen_init(u8 nisc);
+void kvm_s390_pci_aen_exit(void);
+
+void kvm_s390_pci_init(void);
+
+#endif /* __KVM_S390_PCI_H */
--
2.27.0
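
kvm_s390_pci_aen_init() allocates three objects and unwinds them in reverse order on failure. The pattern can be sketched in userspace as below; this is a sketch under stated assumptions, with calloc() standing in for the kernel allocators, NR_DEVICES standing in for ZPCI_NR_DEVICES, and the fail_at parameter added purely so the error paths can be exercised. Note that the pointer array is sized with sizeof(*t->devs), i.e. the size of one pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NR_DEVICES 16	/* hypothetical; stands in for ZPCI_NR_DEVICES */

struct entry { unsigned long aisb; };
struct dev;

struct fwd_table {
	struct dev **devs;	/* array of pointers, one per summary bit */
	unsigned char *sbv;	/* summary bit vector */
	struct entry *gait;	/* forwarding table entries */
};

/* fail_at lets a caller force the Nth allocation (1..3) to fail. */
static int table_init(struct fwd_table *t, int fail_at)
{
	int rc = -ENOMEM;

	t->devs = fail_at == 1 ? NULL : calloc(NR_DEVICES, sizeof(*t->devs));
	if (!t->devs)
		goto out;
	t->sbv = fail_at == 2 ? NULL : calloc(NR_DEVICES / 8, 1);
	if (!t->sbv)
		goto free_devs;
	t->gait = fail_at == 3 ? NULL : calloc(NR_DEVICES, sizeof(*t->gait));
	if (!t->gait)
		goto free_sbv;
	return 0;

	/* Unwind in reverse order of allocation. */
free_sbv:
	free(t->sbv);
	t->sbv = NULL;
free_devs:
	free(t->devs);
	t->devs = NULL;
out:
	return rc;
}

/* Release everything and clear the pointers so init can run again. */
static void table_exit(struct fwd_table *t)
{
	free(t->gait);
	free(t->sbv);
	free(t->devs);
	t->gait = NULL;
	t->sbv = NULL;
	t->devs = NULL;
}
```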


2021-12-07 20:59:26

by Matthew Rosato

Subject: [PATCH 04/32] s390/sclp: detect the AISI facility

Detect the Adapter Interruption Suppression Interpretation facility.

Reviewed-by: Eric Farman <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/sclp.h | 1 +
drivers/s390/char/sclp_early.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index a763563bb3e7..559adb28a24c 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -91,6 +91,7 @@ struct sclp_info {
unsigned char has_zpci_interp : 1;
unsigned char has_aisii : 1;
unsigned char has_aeni : 1;
+ unsigned char has_aisi : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index 52a203ea23cc..9b29ed850d39 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -47,6 +47,7 @@ static void __init sclp_early_facilities_detect(void)
sclp.has_kss = !!(sccb->fac98 & 0x01);
sclp.has_aisii = !!(sccb->fac118 & 0x40);
sclp.has_aeni = !!(sccb->fac118 & 0x20);
+ sclp.has_aisi = !!(sccb->fac118 & 0x10);
sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
if (sccb->fac85 & 0x02)
S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
--
2.27.0
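
The facility detection added in patches 01-04 reduces to masking individual bits of one SCLP facility byte. As a small illustration, the sketch below decodes a byte with the same masks that sclp_early.c applies to fac118; struct zpci_facilities and decode_fac118() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Flags mirroring the bitfields the series adds to struct sclp_info. */
struct zpci_facilities {
	unsigned char has_aisii : 1;
	unsigned char has_aeni : 1;
	unsigned char has_aisi : 1;
	unsigned char has_zpci_interp : 1;
};

/* Decode one facility byte using the masks from sclp_early.c. */
static struct zpci_facilities decode_fac118(uint8_t fac118)
{
	struct zpci_facilities f = {
		.has_aisii = !!(fac118 & 0x40),
		.has_aeni = !!(fac118 & 0x20),
		.has_aisi = !!(fac118 & 0x10),
		.has_zpci_interp = !!(fac118 & 0x01),
	};

	return f;
}
```

The double negation normalizes any non-zero mask result to 1, which is what a one-bit bitfield needs.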


2021-12-07 20:59:31

by Matthew Rosato

Subject: [PATCH 05/32] s390/airq: pass more TPI info to airq handlers

A subsequent patch will introduce an airq handler that requires additional
TPI information beyond directed vs floating, so pass the entire tpi_info
structure via the handler. Only PCI actually uses this information today;
for the other airq handlers this is effectively a no-op.

Reviewed-by: Eric Farman <[email protected]>
Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/airq.h | 3 ++-
arch/s390/kvm/interrupt.c | 4 +++-
arch/s390/pci/pci_irq.c | 9 +++++++--
drivers/s390/cio/airq.c | 2 +-
drivers/s390/cio/qdio_thinint.c | 6 ++++--
drivers/s390/crypto/ap_bus.c | 9 ++++++---
drivers/s390/virtio/virtio_ccw.c | 4 +++-
7 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 01936fdfaddb..7918a7d09028 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -12,10 +12,11 @@

#include <linux/bit_spinlock.h>
#include <linux/dma-mapping.h>
+#include <asm/tpi.h>

struct airq_struct {
struct hlist_node list; /* Handler queueing. */
- void (*handler)(struct airq_struct *airq, bool floating);
+ void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
u8 *lsi_ptr; /* Local-Summary-Indicator pointer */
u8 lsi_mask; /* Local-Summary-Indicator mask */
u8 isc; /* Interrupt-subclass */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index c3bd993fdd0c..f9b872e358c6 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -28,6 +28,7 @@
#include <asm/switch_to.h>
#include <asm/nmi.h>
#include <asm/airq.h>
+#include <asm/tpi.h>
#include "kvm-s390.h"
#include "gaccess.h"
#include "trace-s390.h"
@@ -3261,7 +3262,8 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
}
EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);

-static void gib_alert_irq_handler(struct airq_struct *airq, bool floating)
+static void gib_alert_irq_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info)
{
inc_irq_stat(IRQIO_GAL);
process_gib_alert_list();
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 954bb7a83124..880bcd73f11a 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -11,6 +11,7 @@

#include <asm/isc.h>
#include <asm/airq.h>
+#include <asm/tpi.h>

static enum {FLOATING, DIRECTED} irq_delivery;

@@ -216,8 +217,11 @@ static void zpci_handle_fallback_irq(void)
}
}

-static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
+static void zpci_directed_irq_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info)
{
+ bool floating = !tpi_info->directed_irq;
+
if (floating) {
inc_irq_stat(IRQIO_PCF);
zpci_handle_fallback_irq();
@@ -227,7 +231,8 @@ static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
}
}

-static void zpci_floating_irq_handler(struct airq_struct *airq, bool floating)
+static void zpci_floating_irq_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info)
{
unsigned long si, ai;
struct airq_iv *aibv;
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index e56535c99888..2f2226786319 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -99,7 +99,7 @@ static irqreturn_t do_airq_interrupt(int irq, void *dummy)
rcu_read_lock();
hlist_for_each_entry_rcu(airq, head, list)
if ((*airq->lsi_ptr & airq->lsi_mask) != 0)
- airq->handler(airq, !tpi_info->directed_irq);
+ airq->handler(airq, tpi_info);
rcu_read_unlock();

return IRQ_HANDLED;
diff --git a/drivers/s390/cio/qdio_thinint.c b/drivers/s390/cio/qdio_thinint.c
index 8e09bf3a2fcd..9b9335dd06db 100644
--- a/drivers/s390/cio/qdio_thinint.c
+++ b/drivers/s390/cio/qdio_thinint.c
@@ -15,6 +15,7 @@
#include <asm/qdio.h>
#include <asm/airq.h>
#include <asm/isc.h>
+#include <asm/tpi.h>

#include "cio.h"
#include "ioasm.h"
@@ -93,9 +94,10 @@ static inline u32 clear_shared_ind(void)
/**
* tiqdio_thinint_handler - thin interrupt handler for qdio
* @airq: pointer to adapter interrupt descriptor
- * @floating: flag to recognize floating vs. directed interrupts (unused)
+ * @tpi_info: interrupt information (e.g. floating vs directed -- unused)
*/
-static void tiqdio_thinint_handler(struct airq_struct *airq, bool floating)
+static void tiqdio_thinint_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info)
{
u64 irq_time = S390_lowcore.int_clock;
u32 si_used = clear_shared_ind();
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 1986243f9cd3..df1a038442db 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -27,6 +27,7 @@
#include <linux/kthread.h>
#include <linux/mutex.h>
#include <asm/airq.h>
+#include <asm/tpi.h>
#include <linux/atomic.h>
#include <asm/isc.h>
#include <linux/hrtimer.h>
@@ -129,7 +130,8 @@ static int ap_max_adapter_id = 63;
static struct bus_type ap_bus_type;

/* Adapter interrupt definitions */
-static void ap_interrupt_handler(struct airq_struct *airq, bool floating);
+static void ap_interrupt_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info);

static bool ap_irq_flag;

@@ -442,9 +444,10 @@ static enum hrtimer_restart ap_poll_timeout(struct hrtimer *unused)
/**
* ap_interrupt_handler() - Schedule ap_tasklet on interrupt
* @airq: pointer to adapter interrupt descriptor
- * @floating: ignored
+ * @tpi_info: ignored
*/
-static void ap_interrupt_handler(struct airq_struct *airq, bool floating)
+static void ap_interrupt_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info)
{
inc_irq_stat(IRQIO_APB);
tasklet_schedule(&ap_tasklet);
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index d35e7a3f7067..52c376d15978 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -33,6 +33,7 @@
#include <asm/virtio-ccw.h>
#include <asm/isc.h>
#include <asm/airq.h>
+#include <asm/tpi.h>

/*
* virtio related functions
@@ -203,7 +204,8 @@ static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
write_unlock_irqrestore(&info->lock, flags);
}

-static void virtio_airq_handler(struct airq_struct *airq, bool floating)
+static void virtio_airq_handler(struct airq_struct *airq,
+ struct tpi_info *tpi_info)
{
struct airq_info *info = container_of(airq, struct airq_info, airq);
unsigned long ai;
--
2.27.0
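
The signature change in patch 05 moves the floating-vs-directed decision from the dispatcher into the handler. A minimal userspace sketch of that shape follows; the reduced struct tpi_info, the counters in struct airq_struct, and the dispatch() helper are assumptions made for illustration and do not match the kernel layouts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the kernel's struct tpi_info. */
struct tpi_info {
	bool directed_irq;
	unsigned int isc;
};

struct airq_struct {
	void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
	int floating_count;
	int directed_count;
};

/* The handler derives "floating" itself instead of receiving it. */
static void pci_handler(struct airq_struct *airq, struct tpi_info *tpi_info)
{
	bool floating = !tpi_info->directed_irq;

	if (floating)
		airq->floating_count++;
	else
		airq->directed_count++;
}

/* The dispatcher passes the whole structure through unchanged. */
static void dispatch(struct airq_struct *airq, struct tpi_info *tpi_info)
{
	airq->handler(airq, tpi_info);
}
```

Handlers that never looked at the old flag simply ignore the new pointer, while a later handler can read further fields without another signature change.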


2021-12-07 20:59:34

by Matthew Rosato

Subject: [PATCH 15/32] KVM: s390: pci: enable host forwarding of Adapter Event Notifications

In cases where interrupts are not forwarded to the guest via firmware,
KVM is responsible for ensuring delivery. When an interrupt presents
with the forwarding bit, we must process the forwarding tables until
all interrupts are delivered.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/include/asm/tpi.h | 14 ++++++
arch/s390/kvm/interrupt.c | 76 +++++++++++++++++++++++++++++++-
arch/s390/kvm/kvm-s390.c | 3 +-
arch/s390/kvm/pci.h | 9 ++++
5 files changed, 101 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index a604d51acfc8..3f147b8d050b 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -757,6 +757,7 @@ struct kvm_vm_stat {
u64 inject_pfault_done;
u64 inject_service_signal;
u64 inject_virtio;
+ u64 aen_forward;
};

struct kvm_arch_memory_slot {
diff --git a/arch/s390/include/asm/tpi.h b/arch/s390/include/asm/tpi.h
index 1ac538b8cbf5..47a531fdb15b 100644
--- a/arch/s390/include/asm/tpi.h
+++ b/arch/s390/include/asm/tpi.h
@@ -19,6 +19,20 @@ struct tpi_info {
u32 :12;
} __packed __aligned(4);

+/* I/O-Interruption Code as stored by TPI for an Adapter I/O */
+struct tpi_adapter_info {
+ u32 :1;
+ u32 pci:1;
+ u32 :28;
+ u32 error:1;
+ u32 forward:1;
+ u32 reserved;
+ u32 adapter_IO:1;
+ u32 directed_irq:1;
+ u32 isc:3;
+ u32 :27;
+} __packed __aligned(4);
+
#endif /* __ASSEMBLY__ */

#endif /* _ASM_S390_TPI_H */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 4efe0e95a40f..c6ff871a6ed1 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -3263,11 +3263,85 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
}
EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);

+static void aen_host_forward(struct zpci_aift *aift, unsigned long si)
+{
+ struct kvm_s390_gisa_interrupt *gi;
+ struct zpci_gaite *gaite;
+ struct kvm *kvm;
+
+ /* Pointer arithmetic already scales by sizeof(struct zpci_gaite) */
+ gaite = aift->gait + si;
+ if (gaite->count == 0)
+ return;
+ if (gaite->aisb != 0)
+ set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
+
+ kvm = kvm_s390_pci_si_to_kvm(aift, si);
+ if (!kvm)
+ return;
+ gi = &kvm->arch.gisa_int;
+
+ if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
+ !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
+ gisa_set_ipm_gisc(gi->origin, gaite->gisc);
+ if (hrtimer_active(&gi->timer))
+ hrtimer_cancel(&gi->timer);
+ hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
+ kvm->stat.aen_forward++;
+ }
+}
+
+static void aen_process_gait(u8 isc)
+{
+ bool found = false, first = true;
+ union zpci_sic_iib iib = {{0}};
+ unsigned long si, flags;
+ struct zpci_aift *aift;
+
+ aift = kvm_s390_pci_get_aift();
+ spin_lock_irqsave(&aift->gait_lock, flags);
+
+ if (!aift->gait) {
+ spin_unlock_irqrestore(&aift->gait_lock, flags);
+ return;
+ }
+
+ for (si = 0;;) {
+ /* Scan adapter summary indicator bit vector */
+ si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
+ if (si == -1UL) {
+ if (first || found) {
+ /* Reenable interrupts. */
+ if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
+ &iib))
+ break;
+ first = found = false;
+ } else {
+ /* Interrupts on and all bits processed */
+ break;
+ }
+ found = false;
+ si = 0;
+ continue;
+ }
+ found = true;
+ aen_host_forward(aift, si);
+ }
+
+ spin_unlock_irqrestore(&aift->gait_lock, flags);
+}
+
static void gib_alert_irq_handler(struct airq_struct *airq,
struct tpi_info *tpi_info)
{
+ struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
+
inc_irq_stat(IRQIO_GAL);
- process_gib_alert_list();
+
+ if (info->forward || info->error)
+ aen_process_gait(info->isc);
+ else
+ process_gib_alert_list();
}

static struct airq_struct gib_alert_irq = {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 9cd3c8eb59e8..c8fe9b7c2395 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -65,7 +65,8 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_DESC_COUNTER(VM, inject_float_mchk),
STATS_DESC_COUNTER(VM, inject_pfault_done),
STATS_DESC_COUNTER(VM, inject_service_signal),
- STATS_DESC_COUNTER(VM, inject_virtio)
+ STATS_DESC_COUNTER(VM, inject_virtio),
+ STATS_DESC_COUNTER(VM, aen_forward)
};

const struct kvm_stats_header kvm_vm_stats_header = {
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 74b06d39be3b..776b2745c675 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -12,6 +12,7 @@

#include <linux/pci.h>
#include <linux/mutex.h>
+#include <linux/kvm_host.h>
#include <asm/airq.h>
#include <asm/kvm_pci.h>

@@ -32,6 +33,14 @@ struct zpci_aift {
struct mutex lock; /* Protects the other structures in aift */
};

+static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
+ unsigned long si)
+{
+ if (!aift->kzdev || !aift->kzdev[si])
+ return NULL;
+ return aift->kzdev[si]->kvm;
+}
+
struct zpci_aift *kvm_s390_pci_get_aift(void);

int kvm_s390_pci_aen_init(u8 nisc);
--
2.27.0
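
aen_host_forward() in patch 15 looks up the GAIT entry that belongs to summary bit si. As a reminder of how C scales pointer arithmetic on a typed pointer, here is a minimal sketch; struct gaite is a simplified, hypothetical entry rather than the real GAIT layout, and gait_entry() and byte_offset() exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified table entry; not the real GAIT layout. */
struct gaite {
	unsigned int gisa;
	unsigned char gisc;
	unsigned char count;
};

/* Entry lookup by summary-bit index: the compiler scales by sizeof. */
static struct gaite *gait_entry(struct gaite *gait, unsigned long si)
{
	return gait + si;
}

/* Byte distance between the table base and an entry. */
static size_t byte_offset(struct gaite *gait, struct gaite *entry)
{
	return (size_t)((char *)entry - (char *)gait);
}
```

Because the addition is already in units of one entry, multiplying the index by the entry size before adding it to a typed pointer would scale the offset twice.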


2021-12-07 20:59:40

by Matthew Rosato

Subject: [PATCH 16/32] KVM: s390: expose the guest zPCI interpretation facility

This facility will be used to enable interpretive execution of zPCI
instructions.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/kvm/kvm-s390.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c8fe9b7c2395..09991d05c871 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2751,6 +2751,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
set_kvm_facility(kvm->arch.model.fac_mask, 147);
set_kvm_facility(kvm->arch.model.fac_list, 147);
}
+ if (sclp.has_zpci_interp && test_facility(69)) {
+ set_kvm_facility(kvm->arch.model.fac_mask, 69);
+ set_kvm_facility(kvm->arch.model.fac_list, 69);
+ }

if (css_general_characteristics.aiv && test_facility(65))
set_kvm_facility(kvm->arch.model.fac_mask, 65);
--
2.27.0


2021-12-07 20:59:43

by Matthew Rosato

Subject: [PATCH 17/32] KVM: s390: expose the guest Adapter Interruption Source ID facility

This facility will be used to enable forwarding of PCI interrupts from
firmware directly to guests.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/kvm/kvm-s390.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 09991d05c871..d44ca313a1b7 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2755,6 +2755,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
set_kvm_facility(kvm->arch.model.fac_mask, 69);
set_kvm_facility(kvm->arch.model.fac_list, 69);
}
+ if (sclp.has_aisii && test_facility(70)) {
+ set_kvm_facility(kvm->arch.model.fac_mask, 70);
+ set_kvm_facility(kvm->arch.model.fac_list, 70);
+ }

if (css_general_characteristics.aiv && test_facility(65))
set_kvm_facility(kvm->arch.model.fac_mask, 65);
--
2.27.0


2021-12-07 20:59:49

by Matthew Rosato

Subject: [PATCH 18/32] KVM: s390: expose guest Adapter Event Notification Interpretation facility

This facility will be used to enable forwarding of PCI interrupts from
firmware directly to guests.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/kvm/kvm-s390.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d44ca313a1b7..a680f2a02b67 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2759,6 +2759,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
set_kvm_facility(kvm->arch.model.fac_mask, 70);
set_kvm_facility(kvm->arch.model.fac_list, 70);
}
+ if (sclp.has_aeni && test_facility(71)) {
+ set_kvm_facility(kvm->arch.model.fac_mask, 71);
+ set_kvm_facility(kvm->arch.model.fac_list, 71);
+ }

if (css_general_characteristics.aiv && test_facility(65))
set_kvm_facility(kvm->arch.model.fac_mask, 65);
--
2.27.0
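
Patches 16-18 all follow the same pattern: if the host reports a facility, set the matching bit in both the guest's facility mask and facility list. A rough userspace sketch of that pattern is shown below. It assumes facility bits are numbered from the most significant bit of each byte, and it folds the separate sclp.has_* check into a single host list for brevity; FAC_BYTES and expose_if_present() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FAC_BYTES 32	/* hypothetical size, enough for bits 0..255 */

/* Set facility bit nr, numbering bits from the most significant bit. */
static void set_facility(uint8_t *list, unsigned int nr)
{
	list[nr >> 3] |= 0x80 >> (nr & 7);
}

/* Return non-zero if facility bit nr is set in the list. */
static int test_facility(const uint8_t *list, unsigned int nr)
{
	return (list[nr >> 3] & (0x80 >> (nr & 7))) != 0;
}

/* Expose nr to the guest only when the host reports it. */
static void expose_if_present(const uint8_t *host, uint8_t *mask,
			      uint8_t *guest, unsigned int nr)
{
	if (!test_facility(host, nr))
		return;
	set_facility(mask, nr);
	set_facility(guest, nr);
}
```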


2021-12-07 20:59:57

by Matthew Rosato

Subject: [PATCH 19/32] KVM: s390: mechanism to enable guest zPCI Interpretation

The guest must have access to certain facilities in order to allow
interpretive execution of zPCI instructions and adapter event
notifications. However, there are some cases where a guest might
disable interpretation -- provide a mechanism via which we can defer
enabling the associated zPCI interpretation facilities until the guest
indicates it wishes to use them.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 4 +++
arch/s390/kvm/kvm-s390.c | 43 ++++++++++++++++++++++++++++++++
arch/s390/kvm/kvm-s390.h | 10 ++++++++
3 files changed, 57 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3f147b8d050b..38982c1de413 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
#define ECB2_IEP 0x20
#define ECB2_PFMFI 0x08
#define ECB2_ESCA 0x04
+#define ECB2_ZPCI_LSI 0x02
__u8 ecb2; /* 0x0062 */
+#define ECB3_AISI 0x20
+#define ECB3_AISII 0x10
#define ECB3_DEA 0x08
#define ECB3_AES 0x04
#define ECB3_RI 0x01
@@ -938,6 +941,7 @@ struct kvm_arch{
int use_cmma;
int use_pfmfi;
int use_skf;
+ int use_zpci_interp;
int user_cpu_state_ctrl;
int user_sigp;
int user_stsi;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a680f2a02b67..361d742cdf0d 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1023,6 +1023,47 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
return 0;
}

+static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
+{
+ /*
+ * If the facilities aren't available for PCI interpretation and
+ * interrupt forwarding, we shouldn't be here.
+ */
+ if (!vcpu->kvm->arch.use_zpci_interp)
+ return;
+
+ vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
+ vcpu->arch.sie_block->ecb3 |= ECB3_AISII | ECB3_AISI;
+}
+
+void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
+{
+ struct kvm_vcpu *vcpu;
+ int i;
+
+ /*
+ * If host facilities are available, turn on interpretation for the
+ * life of this guest
+ */
+ if (!test_facility(69) || !test_facility(70) || !test_facility(71) ||
+ !test_facility(72))
+ return;
+
+ mutex_lock(&kvm->lock);
+
+ kvm->arch.use_zpci_interp = 1;
+
+ kvm_s390_vcpu_block_all(kvm);
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ kvm_s390_vcpu_pci_setup(vcpu);
+ kvm_s390_sync_request(KVM_REQ_VSIE_RESTART, vcpu);
+ }
+
+ kvm_s390_vcpu_unblock_all(kvm);
+ mutex_unlock(&kvm->lock);
+}
+
static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req)
{
int cx;
@@ -3288,6 +3329,8 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)

kvm_s390_vcpu_crypto_setup(vcpu);

+ kvm_s390_vcpu_pci_setup(vcpu);
+
mutex_lock(&vcpu->kvm->lock);
if (kvm_s390_pv_is_protected(vcpu->kvm)) {
rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index c07a050d757d..a2eccb8b977e 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -481,6 +481,16 @@ void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
*/
void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);

+/**
+ * kvm_s390_vcpu_pci_enable_interp
+ *
+ * Set the associated PCI attributes for each vcpu to allow for zPCI Load/Store
+ * interpretation as well as adapter interruption forwarding.
+ *
+ * @kvm: the KVM guest
+ */
+void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm);
+
/**
* diag9c_forwarding_hz
*
--
2.27.0
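
The deferred enablement in patch 19 can be pictured as a per-VM flag that gates which control bits each vcpu's SIE block receives. The sketch below is a simplified userspace model: the ECB values are copied from the kvm_host.h hunk, while struct sie_block, struct vm, vcpu_pci_setup() and vm_enable_interp() are hypothetical reductions that leave out locking, vcpu blocking and the host facility checks.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the kvm_host.h hunk in this patch. */
#define ECB2_ZPCI_LSI 0x02
#define ECB3_AISI 0x20
#define ECB3_AISII 0x10

/* Hypothetical, reduced stand-ins for the SIE block and VM state. */
struct sie_block {
	uint8_t ecb2;
	uint8_t ecb3;
};

struct vm {
	int use_zpci_interp;
	struct sie_block vcpu[4];
	int nr_vcpus;
};

/* No-op until the VM has opted in to interpretation. */
static void vcpu_pci_setup(const struct vm *vm, struct sie_block *sb)
{
	if (!vm->use_zpci_interp)
		return;
	sb->ecb2 |= ECB2_ZPCI_LSI;
	sb->ecb3 |= ECB3_AISII | ECB3_AISI;
}

/* Flip the VM-wide flag, then update every existing vcpu. */
static void vm_enable_interp(struct vm *vm)
{
	int i;

	vm->use_zpci_interp = 1;
	for (i = 0; i < vm->nr_vcpus; i++)
		vcpu_pci_setup(vm, &vm->vcpu[i]);
}
```

Calling the per-vcpu setup from both the vcpu creation path and the enable path means vcpus created before and after the opt-in end up with the same bits.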


2021-12-07 21:00:00

by Matthew Rosato

Subject: [PATCH 20/32] KVM: s390: pci: provide routines for enabling/disabling interpretation

These routines will be wired into the vfio_pci_zdev ioctl handlers to
respond to requests to enable / disable a device for zPCI Load/Store
interpretation.

The first time such a request is received, enable the necessary facilities
for the guest.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 4 ++
arch/s390/kvm/pci.c | 91 +++++++++++++++++++++++++++++++++
arch/s390/pci/pci.c | 3 ++
3 files changed, 98 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 3e491a39704c..5d6283acb54c 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -26,4 +26,8 @@ extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);

+extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
+extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
+extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
+
#endif /* ASM_KVM_PCI_H */
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index f0e5386ff943..57cbe3827ea6 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -10,7 +10,9 @@
#include <linux/kvm_host.h>
#include <linux/pci.h>
#include <asm/kvm_pci.h>
+#include <asm/sclp.h>
#include "pci.h"
+#include "kvm-s390.h"

static struct zpci_aift aift;

@@ -118,6 +120,95 @@ int kvm_s390_pci_aen_init(u8 nisc)
return rc;
}

+int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
+{
+ if (!(sclp.has_zpci_interp && test_facility(69)))
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_probe);
+
+int kvm_s390_pci_interp_enable(struct zpci_dev *zdev)
+{
+ u32 gd;
+ int rc;
+
+ /*
+ * If this is the first request to use an interpreted device, make the
+ * necessary vcpu changes
+ */
+ if (!zdev->kzdev->kvm->arch.use_zpci_interp)
+ kvm_s390_vcpu_pci_enable_interp(zdev->kzdev->kvm);
+
+ /*
+ * In the event of a system reset in userspace, the GISA designation
+ * may still be assigned because the device is still enabled.
+ * Verify it's the same guest before proceeding.
+ */
+ gd = (u32)(u64)&zdev->kzdev->kvm->arch.sie_page2->gisa;
+ if (zdev->gd != 0 && zdev->gd != gd)
+ return -EPERM;
+
+ if (zdev_enabled(zdev)) {
+ zdev->gd = 0;
+ rc = zpci_disable_device(zdev);
+ if (rc)
+ return rc;
+ }
+
+ /*
+ * Store information about the identity of the kvm guest allowed to
+ * access this device via interpretation to be used by host CLP
+ */
+ zdev->gd = gd;
+
+ rc = zpci_enable_device(zdev);
+ if (rc)
+ goto err;
+
+ /* Re-register the IOMMU that was already created */
+ rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+ (u64)zdev->dma_table);
+ if (rc)
+ goto err;
+
+ return rc;
+
+err:
+ zdev->gd = 0;
+ return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_enable);
+
+int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
+{
+ int rc;
+
+ if (zdev->gd == 0)
+ return -EINVAL;
+
+ /* Remove the host CLP guest designation */
+ zdev->gd = 0;
+
+ if (zdev_enabled(zdev)) {
+ rc = zpci_disable_device(zdev);
+ if (rc)
+ return rc;
+ }
+
+ rc = zpci_enable_device(zdev);
+ if (rc)
+ return rc;
+
+ /* Re-register the IOMMU that was already created */
+ rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+ (u64)zdev->dma_table);
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_disable);
+
int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
{
struct kvm_zdev *kzdev;
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 175854c861cd..0eac84387f3c 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -141,6 +141,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
return cc;
}
+EXPORT_SYMBOL_GPL(zpci_register_ioat);

/* Modify PCI: Unregister I/O address translation parameters */
int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
@@ -740,6 +741,7 @@ int zpci_enable_device(struct zpci_dev *zdev)
zpci_update_fh(zdev, fh);
return rc;
}
+EXPORT_SYMBOL_GPL(zpci_enable_device);

int zpci_disable_device(struct zpci_dev *zdev)
{
@@ -763,6 +765,7 @@ int zpci_disable_device(struct zpci_dev *zdev)
}
return rc;
}
+EXPORT_SYMBOL_GPL(zpci_disable_device);

/**
* zpci_hot_reset_device - perform a reset of the given zPCI function
--
2.27.0


2021-12-07 21:00:15

by Matthew Rosato

Subject: [PATCH 21/32] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding

These routines will be wired into the vfio_pci_zdev ioctl handlers to
respond to requests to enable / disable a device for Adapter Event
Notifications / Adapter Interruption Forwarding.
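
For reference, the FIB built for forwarding locates a device's summary bit
with simple word/bit arithmetic on the allocated bit number. A stand-alone
sketch of that calculation follows; toy_sbv_loc and toy_locate_summary_bit
are hypothetical names used only for illustration:

```c
#include <stdint.h>

/* Where a summary bit lives relative to the start of the summary vector. */
struct toy_sbv_loc {
	uint64_t byte_offset;	/* offset of the containing 64-bit word */
	uint8_t bit_offset;	/* bit index within that word, 0..63 */
};

/*
 * Same arithmetic as the patch uses for fmt0.aisb and fmt0.aisbo:
 * (bit / 64) * 8 selects the doubleword, bit & 63 selects the bit in it.
 */
static struct toy_sbv_loc toy_locate_summary_bit(unsigned long bit)
{
	struct toy_sbv_loc loc;

	loc.byte_offset = (uint64_t)(bit / 64) * 8;
	loc.bit_offset = (uint8_t)(bit & 63);
	return loc;
}
```

In the patch the byte offset is added to the address of aift.sbv->vector to
form the AISB address, while the bit offset is stored separately as the AISB
offset.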

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 7 ++
arch/s390/kvm/pci.c | 199 ++++++++++++++++++++++++++++++++
arch/s390/pci/pci_insn.c | 1 +
3 files changed, 207 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 5d6283acb54c..54a0afdbe7d0 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -16,16 +16,23 @@
#include <linux/kvm_host.h>
#include <linux/kvm.h>
#include <linux/pci.h>
+#include <asm/pci_insn.h>

struct kvm_zdev {
struct zpci_dev *zdev;
struct kvm *kvm;
+ struct zpci_fib fib;
};

extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);

+extern int kvm_s390_pci_aif_probe(struct zpci_dev *zdev);
+extern int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
+ bool assist);
+extern int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
+
extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 57cbe3827ea6..3a29398dd53b 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -10,6 +10,8 @@
#include <linux/kvm_host.h>
#include <linux/pci.h>
#include <asm/kvm_pci.h>
+#include <asm/pci.h>
+#include <asm/pci_insn.h>
#include <asm/sclp.h>
#include "pci.h"
#include "kvm-s390.h"
@@ -120,6 +122,199 @@ int kvm_s390_pci_aen_init(u8 nisc)
return rc;
}

+/* Modify PCI: Register floating adapter interruption forwarding */
+static int kvm_zpci_set_airq(struct zpci_dev *zdev)
+{
+ u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_REG_INT);
+ struct zpci_fib fib = {0};
+ u8 status;
+
+ fib.fmt0.isc = zdev->kzdev->fib.fmt0.isc;
+ fib.fmt0.sum = 1; /* enable summary notifications */
+ fib.fmt0.noi = airq_iv_end(zdev->aibv);
+ fib.fmt0.aibv = (unsigned long) zdev->aibv->vector;
+ fib.fmt0.aibvo = 0;
+ fib.fmt0.aisb = (unsigned long) aift.sbv->vector + (zdev->aisb/64) * 8;
+ fib.fmt0.aisbo = zdev->aisb & 63;
+ fib.gd = zdev->gd;
+
+ return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
+}
+
+/* Modify PCI: Unregister floating adapter interruption forwarding */
+static int kvm_zpci_clear_airq(struct zpci_dev *zdev)
+{
+ u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_DEREG_INT);
+ struct zpci_fib fib = {0};
+ u8 cc, status;
+
+ fib.gd = zdev->gd;
+
+ cc = zpci_mod_fc(req, &fib, &status);
+ if (cc == 3 || (cc == 1 && status == 24))
+ /* Function already gone or IRQs already deregistered. */
+ cc = 0;
+
+ return cc ? -EIO : 0;
+}
+
+int kvm_s390_pci_aif_probe(struct zpci_dev *zdev)
+{
+ if (!(sclp.has_aeni && test_facility(71)))
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_probe);
+
+int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
+ bool assist)
+{
+ struct page *aibv_page, *aisb_page = NULL;
+ unsigned int msi_vecs, idx;
+ struct zpci_gaite *gaite;
+ unsigned long bit;
+ struct kvm *kvm;
+ void *gaddr;
+ int rc = 0;
+
+ /*
+ * Interrupt forwarding is only applicable if the device is already
+ * enabled for interpretation
+ */
+ if (zdev->gd == 0)
+ return -EINVAL;
+
+ kvm = zdev->kzdev->kvm;
+ msi_vecs = min_t(unsigned int, fib->fmt0.noi, zdev->max_msi);
+
+ /* Replace AIBV address */
+ idx = srcu_read_lock(&kvm->srcu);
+ aibv_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aibv));
+ srcu_read_unlock(&kvm->srcu, idx);
+ if (is_error_page(aibv_page)) {
+ rc = -EIO;
+ goto out;
+ }
+ gaddr = page_to_virt(aibv_page) + (fib->fmt0.aibv & ~PAGE_MASK);
+ fib->fmt0.aibv = (u64)gaddr;
+
+ /* Pin the guest AISB if one was specified */
+ if (fib->fmt0.sum == 1) {
+ idx = srcu_read_lock(&kvm->srcu);
+ aisb_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aisb));
+ srcu_read_unlock(&kvm->srcu, idx);
+ if (is_error_page(aisb_page)) {
+ rc = -EIO;
+ goto unpin1;
+ }
+ }
+
+ /* AISB must be allocated before we can fill in GAITE */
+ mutex_lock(&aift.lock);
+ bit = airq_iv_alloc_bit(aift.sbv);
+ if (bit == -1UL)
+ goto unpin2;
+ zdev->aisb = bit;
+ zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA |
+ AIRQ_IV_BITLOCK |
+ AIRQ_IV_GUESTVEC,
+ (unsigned long *)fib->fmt0.aibv);
+
+ spin_lock_irq(&aift.gait_lock);
+ gaite = (struct zpci_gaite *) aift.gait + (zdev->aisb *
+ sizeof(struct zpci_gaite));
+
+ /* If assist not requested, host will get all alerts */
+ if (assist)
+ gaite->gisa = (u32)(u64)&kvm->arch.sie_page2->gisa;
+ else
+ gaite->gisa = 0;
+
+ gaite->gisc = fib->fmt0.isc;
+ gaite->count++;
+ gaite->aisbo = fib->fmt0.aisbo;
+ gaite->aisb = (u64)(page_address(aisb_page) + (fib->fmt0.aisb &
+ ~PAGE_MASK));
+ aift.kzdev[zdev->aisb] = zdev->kzdev;
+ spin_unlock_irq(&aift.gait_lock);
+
+ /* Update guest FIB for re-issue */
+ fib->fmt0.aisbo = zdev->aisb & 63;
+ fib->fmt0.aisb = (unsigned long) aift.sbv->vector + (zdev->aisb/64)*8;
+ fib->fmt0.isc = kvm_s390_gisc_register(kvm, gaite->gisc);
+
+ /* Save some guest fib values in the host for later use */
+ zdev->kzdev->fib.fmt0.isc = fib->fmt0.isc;
+ zdev->kzdev->fib.fmt0.aibv = fib->fmt0.aibv;
+ mutex_unlock(&aift.lock);
+
+ /* Issue the clp to setup the irq now */
+ rc = kvm_zpci_set_airq(zdev);
+ return rc;
+
+unpin2:
+ mutex_unlock(&aift.lock);
+ if (fib->fmt0.sum == 1) {
+ gaddr = page_to_virt(aisb_page);
+ kvm_release_pfn_dirty((u64)gaddr >> PAGE_SHIFT);
+ }
+unpin1:
+ kvm_release_pfn_dirty(fib->fmt0.aibv >> PAGE_SHIFT);
+out:
+ return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_enable);
+
+int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
+{
+ struct kvm_zdev *kzdev = zdev->kzdev;
+ struct zpci_gaite *gaite;
+ int rc;
+ u8 isc;
+
+ if (zdev->gd == 0)
+ return -EINVAL;
+
+ /* Even if the clear fails due to an error, clear the GAITE */
+ rc = kvm_zpci_clear_airq(zdev);
+
+ mutex_lock(&aift.lock);
+ if (zdev->kzdev->fib.fmt0.aibv == 0)
+ goto out;
+ spin_lock_irq(&aift.gait_lock);
+ gaite = (struct zpci_gaite *) aift.gait + (zdev->aisb *
+ sizeof(struct zpci_gaite));
+ isc = gaite->gisc;
+ gaite->count--;
+ if (gaite->count == 0) {
+ /* Release guest AIBV and AISB */
+ kvm_release_pfn_dirty(kzdev->fib.fmt0.aibv >> PAGE_SHIFT);
+ if (gaite->aisb != 0)
+ kvm_release_pfn_dirty(gaite->aisb >> PAGE_SHIFT);
+ /* Clear the GAIT entry */
+ gaite->aisb = 0;
+ gaite->gisc = 0;
+ gaite->aisbo = 0;
+ gaite->gisa = 0;
+ aift.kzdev[zdev->aisb] = 0;
+ /* Clear zdev info */
+ airq_iv_free_bit(aift.sbv, zdev->aisb);
+ airq_iv_release(zdev->aibv);
+ zdev->aisb = 0;
+ zdev->aibv = NULL;
+ }
+ spin_unlock_irq(&aift.gait_lock);
+ kvm_s390_gisc_unregister(kzdev->kvm, isc);
+ kzdev->fib.fmt0.isc = 0;
+ kzdev->fib.fmt0.aibv = 0;
+out:
+ mutex_unlock(&aift.lock);
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
+
int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
{
if (!(sclp.has_zpci_interp && test_facility(69)))
@@ -188,6 +383,10 @@ int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
if (zdev->gd == 0)
return -EINVAL;

+ /* Forwarding must be turned off before interpretation */
+ if (zdev->kzdev->fib.fmt0.aibv != 0)
+ kvm_s390_pci_aif_disable(zdev);
+
/* Remove the host CLP guest designation */
zdev->gd = 0;

diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 0d1ab268ec24..b57d3f594113 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -59,6 +59,7 @@ u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status)

return cc;
}
+EXPORT_SYMBOL_GPL(zpci_mod_fc);

/* Refresh PCI Translations */
static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
--
2.27.0


2021-12-07 21:00:19

by Matthew Rosato

Subject: [PATCH 22/32] KVM: s390: pci: provide routines for enabling/disabling IOAT assist

These routines will be wired into the vfio_pci_zdev ioctl handlers to
respond to requests to enable / disable a device for PCI I/O Address
Translation assistance.
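
To make the sizing of the pin-tracking arrays easier to follow, here is a
small sketch of the page arithmetic behind the new ZPCI_TABLE_PAGES and
ZPCI_TABLE_ENTRIES_PAGES macros. The numeric values (16K tables, 4K pages,
an address mask clearing the low 12 bits) are assumptions for illustration
and should be checked against asm/pci_dma.h; all TOY_* names are hypothetical:

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define TOY_PAGE_SHIFT		12
#define TOY_PAGE_SIZE		(1ULL << TOY_PAGE_SHIFT)
#define TOY_TABLE_SIZE		0x4000ULL
#define TOY_TABLE_ENTRY_SIZE	sizeof(uint64_t)
#define TOY_TABLE_ENTRIES	(TOY_TABLE_SIZE / TOY_TABLE_ENTRY_SIZE)
#define TOY_RTE_ADDR_MASK	0xfffffffffffff000ULL

/* Same shape as the macros added by the patch. */
#define TOY_TABLE_PAGES		(TOY_TABLE_SIZE >> TOY_PAGE_SHIFT)
#define TOY_TABLE_ENTRIES_PAGES	(TOY_TABLE_ENTRIES * TOY_TABLE_PAGES)

/*
 * Compute the guest-physical address of every page that backs the root
 * table named by an IOTA; each of these pages has to be pinned separately.
 */
static void toy_root_table_pages(uint64_t iota, uint64_t gpas[TOY_TABLE_PAGES])
{
	uint64_t gpa = iota & TOY_RTE_ADDR_MASK;
	unsigned int i;

	for (i = 0; i < TOY_TABLE_PAGES; i++) {
		gpas[i] = gpa;
		gpa += TOY_PAGE_SIZE;
	}
}
```

Under these assumptions a table spans 4 pages, so head[] holds 4 pointers and
seg needs one pointer per page of each possible segment table, that is,
2048 * 4 = 8192 slots.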

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 15 ++++
arch/s390/include/asm/pci_dma.h | 2 +
arch/s390/kvm/pci.c | 133 ++++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 2 +
4 files changed, 152 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 54a0afdbe7d0..254275399f21 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -16,11 +16,21 @@
#include <linux/kvm_host.h>
#include <linux/kvm.h>
#include <linux/pci.h>
+#include <linux/mutex.h>
#include <asm/pci_insn.h>
+#include <asm/pci_dma.h>
+
+struct kvm_zdev_ioat {
+ unsigned long *head[ZPCI_TABLE_PAGES];
+ unsigned long **seg;
+ unsigned long ***pt;
+ struct mutex lock;
+};

struct kvm_zdev {
struct zpci_dev *zdev;
struct kvm *kvm;
+ struct kvm_zdev_ioat ioat;
struct zpci_fib fib;
};

@@ -33,6 +43,11 @@ extern int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
bool assist);
extern int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);

+extern int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
+extern int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
+extern int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
+extern u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
+
extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 3b8e89d4578a..e1d3c1d3fc8a 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
#define ZPCI_TABLE_ALIGN ZPCI_TABLE_SIZE
#define ZPCI_TABLE_ENTRY_SIZE (sizeof(unsigned long))
#define ZPCI_TABLE_ENTRIES (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
+#define ZPCI_TABLE_PAGES (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
+#define ZPCI_TABLE_ENTRIES_PAGES (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)

#define ZPCI_TABLE_BITS 11
#define ZPCI_PT_BITS 8
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 3a29398dd53b..a1c0c0881332 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -12,6 +12,7 @@
#include <asm/kvm_pci.h>
#include <asm/pci.h>
#include <asm/pci_insn.h>
+#include <asm/pci_dma.h>
#include <asm/sclp.h>
#include "pci.h"
#include "kvm-s390.h"
@@ -315,6 +316,131 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
}
EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);

+int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_probe);
+
+int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota)
+{
+ gpa_t gpa = (gpa_t)(iota & ZPCI_RTE_ADDR_MASK);
+ struct kvm_zdev_ioat *ioat;
+ struct page *page;
+ struct kvm *kvm;
+ unsigned int idx;
+ void *iaddr;
+ int i, rc = 0;
+
+ if (!zdev->kzdev || !zdev->kzdev->kvm || zdev->kzdev->ioat.head[0])
+ return -EINVAL;
+
+ /* Ensure supported type specified */
+ if ((iota & ZPCI_IOTA_RTTO_FLAG) != ZPCI_IOTA_RTTO_FLAG)
+ return -EINVAL;
+
+ kvm = zdev->kzdev->kvm;
+ ioat = &zdev->kzdev->ioat;
+ mutex_lock(&ioat->lock);
+ idx = srcu_read_lock(&kvm->srcu);
+ for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+ page = gfn_to_page(kvm, gpa_to_gfn(gpa));
+ if (is_error_page(page)) {
+ srcu_read_unlock(&kvm->srcu, idx);
+ rc = -EIO;
+ goto out;
+ }
+ iaddr = page_to_virt(page) + (gpa & ~PAGE_MASK);
+ ioat->head[i] = (unsigned long *)iaddr;
+ gpa += PAGE_SIZE;
+ }
+ srcu_read_unlock(&kvm->srcu, idx);
+
+ zdev->kzdev->ioat.seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
+ sizeof(unsigned long *), GFP_KERNEL);
+ if (!zdev->kzdev->ioat.seg)
+ goto unpin;
+ zdev->kzdev->ioat.pt = kcalloc(ZPCI_TABLE_ENTRIES,
+ sizeof(unsigned long **), GFP_KERNEL);
+ if (!zdev->kzdev->ioat.pt)
+ goto free_seg;
+
+out:
+ mutex_unlock(&ioat->lock);
+ return rc;
+
+free_seg:
+ kfree(zdev->kzdev->ioat.seg);
+unpin:
+ for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+ kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
+ ioat->head[i] = 0;
+ }
+ mutex_unlock(&ioat->lock);
+ return -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_enable);
+
+static void free_pt_entry(struct kvm_zdev_ioat *ioat, int st, int pt)
+{
+ if (!ioat->pt[st][pt])
+ return;
+
+ kvm_release_pfn_dirty((u64)ioat->pt[st][pt]);
+}
+
+static void free_seg_entry(struct kvm_zdev_ioat *ioat, int entry)
+{
+ int i, st, count = 0;
+
+ for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+ if (ioat->seg[entry + i]) {
+ kvm_release_pfn_dirty((u64)ioat->seg[entry + i]);
+ count++;
+ }
+ }
+
+ if (count == 0)
+ return;
+
+ st = entry / ZPCI_TABLE_PAGES;
+ for (i = 0; i < ZPCI_TABLE_ENTRIES; i++)
+ free_pt_entry(ioat, st, i);
+ kfree(ioat->pt[st]);
+}
+
+int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev)
+{
+ struct kvm_zdev_ioat *ioat;
+ int i;
+
+ if (!zdev->kzdev || !zdev->kzdev->kvm || !zdev->kzdev->ioat.head[0])
+ return -EINVAL;
+
+ ioat = &zdev->kzdev->ioat;
+ mutex_lock(&ioat->lock);
+ for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+ kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
+ ioat->head[i] = 0;
+ }
+
+ for (i = 0; i < ZPCI_TABLE_ENTRIES_PAGES; i += ZPCI_TABLE_PAGES)
+ free_seg_entry(ioat, i);
+
+ kfree(ioat->seg);
+ kfree(ioat->pt);
+ mutex_unlock(&ioat->lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_disable);
+
+u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev)
+{
+ return (zdev->dtsm & KVM_S390_PCI_DTSM_MASK);
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_get_dtsm);
+
int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
{
if (!(sclp.has_zpci_interp && test_facility(69)))
@@ -387,6 +513,10 @@ int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
if (zdev->kzdev->fib.fmt0.aibv != 0)
kvm_s390_pci_aif_disable(zdev);

+ /* If we are using the IOAT assist, disable it now */
+ if (zdev->kzdev->ioat.head[0])
+ kvm_s390_pci_ioat_disable(zdev);
+
/* Remove the host CLP guest designation */
zdev->gd = 0;

@@ -419,6 +549,8 @@ int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
if (!kzdev)
return -ENOMEM;

+ mutex_init(&kzdev->ioat.lock);
+
kzdev->zdev = zdev;
zdev->kzdev = kzdev;

@@ -436,6 +568,7 @@ void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
kzdev = zdev->kzdev;
WARN_ON(kzdev->zdev != zdev);
zdev->kzdev = 0;
+ mutex_destroy(&kzdev->ioat.lock);
kfree(kzdev);

}
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 776b2745c675..3c86888fe1b3 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -16,6 +16,8 @@
#include <asm/airq.h>
#include <asm/kvm_pci.h>

+#define KVM_S390_PCI_DTSM_MASK 0x40
+
struct zpci_gaite {
unsigned int gisa;
u8 gisc;
--
2.27.0


2021-12-07 21:00:24

by Matthew Rosato

Subject: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

Add a routine that shadows a range of guest I/O address translation (IOAT)
entries into the host IOAT. A subsequent patch will invoke it in response
to an 04 (instruction) intercept for RPCIT.
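
The per-entry decision made while shadowing can be summarized with a small
user-space model. This is only a sketch: toy_pte and toy_shadow_pte are
hypothetical, addresses are treated as already translated, and page pinning
and release are left out entirely:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified page-table entry: a valid flag plus a target address. */
struct toy_pte {
	bool valid;
	uint64_t addr;
};

enum toy_shadow_action {
	TOY_SHADOW_NOOP,	/* neither entry valid */
	TOY_SHADOW_NEW,		/* host entry was invalid, guest entry valid */
	TOY_SHADOW_DUPLICATE,	/* both valid, same target */
	TOY_SHADOW_REPLACE,	/* both valid, different target */
	TOY_SHADOW_INVALIDATE,	/* host entry valid, guest entry invalid */
};

/*
 * Bring one host entry in line with the guest entry. *needs_refresh is set
 * only when a previously valid host entry changed, which is the case where
 * the patch's helper returns 1.
 */
static enum toy_shadow_action toy_shadow_pte(struct toy_pte *host,
					     const struct toy_pte *guest,
					     bool *needs_refresh)
{
	*needs_refresh = false;

	if (host->valid) {
		if (!guest->valid) {
			host->valid = false;
			*needs_refresh = true;
			return TOY_SHADOW_INVALIDATE;
		}
		if (host->addr == guest->addr)
			return TOY_SHADOW_DUPLICATE;
		host->addr = guest->addr;
		*needs_refresh = true;
		return TOY_SHADOW_REPLACE;
	}

	if (guest->valid) {
		host->addr = guest->addr;
		host->valid = true;
		return TOY_SHADOW_NEW;
	}

	return TOY_SHADOW_NOOP;
}
```

In the patch, the sum of these per-entry results decides whether the host
issues its own translation refresh after the walk.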

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 1 +
arch/s390/include/asm/pci_dma.h | 1 +
arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
arch/s390/kvm/pci.h | 4 +-
4 files changed, 196 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 254275399f21..97e3a369135d 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
struct kvm_zdev {
struct zpci_dev *zdev;
struct kvm *kvm;
+ u64 rpcit_count;
struct kvm_zdev_ioat ioat;
struct zpci_fib fib;
};
diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index e1d3c1d3fc8a..0ca15e5db3d9 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
#define ZPCI_TABLE_ENTRIES (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
#define ZPCI_TABLE_PAGES (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
#define ZPCI_TABLE_ENTRIES_PAGES (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
+#define ZPCI_TABLE_ENTRIES_PER_PAGE (ZPCI_TABLE_ENTRIES / ZPCI_TABLE_PAGES)

#define ZPCI_TABLE_BITS 11
#define ZPCI_PT_BITS 8
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index a1c0c0881332..858c5ecdc8b9 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -123,6 +123,195 @@ int kvm_s390_pci_aen_init(u8 nisc)
return rc;
}

+static int dma_shadow_cpu_trans(struct kvm_vcpu *vcpu, unsigned long *entry,
+ unsigned long *gentry)
+{
+ unsigned long idx;
+ struct page *page;
+ void *gaddr = NULL;
+ kvm_pfn_t pfn;
+ gpa_t addr;
+ int rc = 0;
+
+ if (pt_entry_isvalid(*gentry)) {
+ /* pin and validate */
+ addr = *gentry & ZPCI_PTE_ADDR_MASK;
+ idx = srcu_read_lock(&vcpu->kvm->srcu);
+ page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
+ if (is_error_page(page))
+ return -EIO;
+ gaddr = page_to_virt(page) + (addr & ~PAGE_MASK);
+ }
+
+ if (pt_entry_isvalid(*entry)) {
+ /* Either we are invalidating, replacing or no-op */
+ if (gaddr) {
+ if ((*entry & ZPCI_PTE_ADDR_MASK) ==
+ (unsigned long)gaddr) {
+ /* Duplicate */
+ kvm_release_pfn_dirty(*entry >> PAGE_SHIFT);
+ } else {
+ /* Replace */
+ pfn = (*entry >> PAGE_SHIFT);
+ invalidate_pt_entry(entry);
+ set_pt_pfaa(entry, gaddr);
+ validate_pt_entry(entry);
+ kvm_release_pfn_dirty(pfn);
+ rc = 1;
+ }
+ } else {
+ /* Invalidate */
+ pfn = (*entry >> PAGE_SHIFT);
+ invalidate_pt_entry(entry);
+ kvm_release_pfn_dirty(pfn);
+ rc = 1;
+ }
+ } else if (gaddr) {
+ /* New Entry */
+ set_pt_pfaa(entry, gaddr);
+ validate_pt_entry(entry);
+ }
+
+ return rc;
+}
+
+unsigned long *dma_walk_guest_cpu_trans(struct kvm_vcpu *vcpu,
+ struct kvm_zdev_ioat *ioat,
+ dma_addr_t dma_addr)
+{
+ unsigned long *rto, *sto, *pto;
+ unsigned int rtx, rts, sx, px, idx;
+ struct page *page;
+ gpa_t addr;
+ int i;
+
+ /* Pin guest segment table if needed */
+ rtx = calc_rtx(dma_addr);
+ rto = ioat->head[(rtx / ZPCI_TABLE_ENTRIES_PER_PAGE)];
+ rts = rtx * ZPCI_TABLE_PAGES;
+ if (!ioat->seg[rts]) {
+ if (!reg_entry_isvalid(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
+ return NULL;
+ sto = get_rt_sto(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
+ addr = ((u64)sto & ZPCI_RTE_ADDR_MASK);
+ idx = srcu_read_lock(&vcpu->kvm->srcu);
+ for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+ page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
+ if (is_error_page(page)) {
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
+ return NULL;
+ }
+ ioat->seg[rts + i] = page_to_virt(page) +
+ (addr & ~PAGE_MASK);
+ addr += PAGE_SIZE;
+ }
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
+ }
+
+ /* Allocate pin pointers for another segment table if needed */
+ if (!ioat->pt[rtx]) {
+ ioat->pt[rtx] = kcalloc(ZPCI_TABLE_ENTRIES,
+ (sizeof(unsigned long *)), GFP_KERNEL);
+ if (!ioat->pt[rtx])
+ return NULL;
+ }
+ /* Pin guest page table if needed */
+ sx = calc_sx(dma_addr);
+ sto = ioat->seg[(rts + (sx / ZPCI_TABLE_ENTRIES_PER_PAGE))];
+ if (!ioat->pt[rtx][sx]) {
+ if (!reg_entry_isvalid(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
+ return NULL;
+ pto = get_st_pto(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
+ if (!pto)
+ return NULL;
+ addr = ((u64)pto & ZPCI_STE_ADDR_MASK);
+ idx = srcu_read_lock(&vcpu->kvm->srcu);
+ page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
+ if (is_error_page(page))
+ return NULL;
+ ioat->pt[rtx][sx] = page_to_virt(page) + (addr & ~PAGE_MASK);
+ }
+ pto = ioat->pt[rtx][sx];
+
+ /* Return guest PTE */
+ px = calc_px(dma_addr);
+ return &pto[px];
+}
+
+
+static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev *zdev,
+ dma_addr_t dma_addr, size_t size)
+{
+ unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+ struct kvm_zdev *kzdev = zdev->kzdev;
+ unsigned long *entry, *gentry;
+ int i, rc = 0, rc2;
+
+ if (!nr_pages || !kzdev)
+ return -EINVAL;
+
+ mutex_lock(&kzdev->ioat.lock);
+ if (!zdev->dma_table || !kzdev->ioat.head[0]) {
+ rc = -EINVAL;
+ goto out_unlock;
+ }
+
+ for (i = 0; i < nr_pages; i++) {
+ gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, dma_addr);
+ if (!gentry)
+ continue;
+ entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
+
+ if (!entry) {
+ rc = -ENOMEM;
+ goto out_unlock;
+ }
+
+ rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
+ if (rc2 < 0) {
+ rc = -EIO;
+ goto out_unlock;
+ }
+ dma_addr += PAGE_SIZE;
+ rc += rc2;
+ }
+
+out_unlock:
+ mutex_unlock(&kzdev->ioat.lock);
+ return rc;
+}
+
+int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
+ unsigned long start, unsigned long size)
+{
+ struct zpci_dev *zdev;
+ u32 fh;
+ int rc;
+
+ /* If the device has a SHM bit on, let userspace take care of this */
+ fh = req >> 32;
+ if ((fh & aift.mdd) != 0)
+ return -EOPNOTSUPP;
+
+ /* Make sure this is a valid device associated with this guest */
+ zdev = get_zdev_by_fh(fh);
+ if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
+ return -EINVAL;
+
+ /* Only proceed if the device is using the assist */
+ if (zdev->kzdev->ioat.head[0] == 0)
+ return -EOPNOTSUPP;
+
+ rc = dma_table_shadow(vcpu, zdev, start, size);
+ if (rc > 0)
+ rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);
+ zdev->kzdev->rpcit_count++;
+
+ return rc;
+}
+
/* Modify PCI: Register floating adapter interruption forwarding */
static int kvm_zpci_set_airq(struct zpci_dev *zdev)
{
@@ -590,4 +779,6 @@ void kvm_s390_pci_init(void)
{
spin_lock_init(&aift.gait_lock);
mutex_init(&aift.lock);
+
+ WARN_ON(zpci_get_mdd(&aift.mdd));
}
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 3c86888fe1b3..d252a631b693 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -33,6 +33,7 @@ struct zpci_aift {
struct kvm_zdev **kzdev;
spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
struct mutex lock; /* Protects the other structures in aift */
+ u32 mdd;
};

static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
@@ -47,7 +48,8 @@ struct zpci_aift *kvm_s390_pci_get_aift(void);

int kvm_s390_pci_aen_init(u8 nisc);
void kvm_s390_pci_aen_exit(void);
-
+int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
+ unsigned long start, unsigned long end);
void kvm_s390_pci_init(void);

#endif /* __KVM_S390_PCI_H */
--
2.27.0


2021-12-07 21:00:44

by Matthew Rosato

Subject: [PATCH 24/32] KVM: s390: intercept the rpcit instruction

For faster handling of PCI translation refreshes, intercept in KVM
and call the associated handler.
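
The way the handler turns the refresh result into guest-visible state can be
sketched outside the kernel as follows. The toy_* names are hypothetical, the
constants mirror the ones this patch adds (written as 64-bit values here),
and the "to_userspace" flag stands in for returning -EOPNOTSUPP to the caller:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RPCIT_STAT_MASK	0xffffffff00ffffffULL
#define TOY_RPCIT_INS_RES	(0x10ULL << 24)	/* insufficient resources */
#define TOY_RPCIT_ERR		(0x28ULL << 24)	/* generic error */

struct toy_rpcit_outcome {
	bool to_userspace;	/* let userspace handle the instruction */
	int cc;			/* condition code presented to the guest */
	uint64_t reg1;		/* first operand register, status updated */
};

/* Map the refresh return code to a condition code and status byte. */
static struct toy_rpcit_outcome toy_rpcit_complete(int rc, uint64_t reg1)
{
	struct toy_rpcit_outcome out = { false, 0, reg1 };

	switch (rc) {
	case 0:
		out.cc = 0;
		break;
	case -EOPNOTSUPP:
		out.to_userspace = true;
		break;
	case -EINVAL:
		out.cc = 3;
		break;
	case -ENOMEM:
		out.reg1 = (reg1 & TOY_RPCIT_STAT_MASK) | TOY_RPCIT_INS_RES;
		out.cc = 1;
		break;
	default:
		out.reg1 = (reg1 & TOY_RPCIT_STAT_MASK) | TOY_RPCIT_ERR;
		out.cc = 1;
		break;
	}

	return out;
}
```

Only the status byte selected by the mask is rewritten; the function handle
in the upper half of the register is preserved.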

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/kvm/pci.h | 4 ++++
arch/s390/kvm/priv.c | 41 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)

diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index d252a631b693..3f96eff432aa 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -18,6 +18,10 @@

#define KVM_S390_PCI_DTSM_MASK 0x40

+#define KVM_S390_RPCIT_STAT_MASK 0xffffffff00ffffffUL
+#define KVM_S390_RPCIT_INS_RES (0x10 << 24)
+#define KVM_S390_RPCIT_ERR (0x28 << 24)
+
struct zpci_gaite {
unsigned int gisa;
u8 gisc;
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 417154b314a6..768ae92ecc59 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -29,6 +29,7 @@
#include <asm/ap.h>
#include "gaccess.h"
#include "kvm-s390.h"
+#include "pci.h"
#include "trace.h"

static int handle_ri(struct kvm_vcpu *vcpu)
@@ -335,6 +336,44 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
return 0;
}

+static int handle_rpcit(struct kvm_vcpu *vcpu)
+{
+ int reg1, reg2;
+ int rc;
+
+ if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+ return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+
+ kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
+
+ rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
+ vcpu->run->s.regs.gprs[reg2],
+ vcpu->run->s.regs.gprs[reg2+1]);
+
+ switch (rc) {
+ case 0:
+ kvm_s390_set_psw_cc(vcpu, 0);
+ break;
+ case -EOPNOTSUPP:
+ return -EOPNOTSUPP;
+ case -EINVAL:
+ kvm_s390_set_psw_cc(vcpu, 3);
+ break;
+ case -ENOMEM:
+ vcpu->run->s.regs.gprs[reg1] &= KVM_S390_RPCIT_STAT_MASK;
+ vcpu->run->s.regs.gprs[reg1] |= KVM_S390_RPCIT_INS_RES;
+ kvm_s390_set_psw_cc(vcpu, 1);
+ break;
+ default:
+ vcpu->run->s.regs.gprs[reg1] &= KVM_S390_RPCIT_STAT_MASK;
+ vcpu->run->s.regs.gprs[reg1] |= KVM_S390_RPCIT_ERR;
+ kvm_s390_set_psw_cc(vcpu, 1);
+ break;
+ }
+
+ return 0;
+}
+
#define SSKE_NQ 0x8
#define SSKE_MR 0x4
#define SSKE_MC 0x2
@@ -1275,6 +1314,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
return handle_essa(vcpu);
case 0xaf:
return handle_pfmf(vcpu);
+ case 0xd3:
+ return handle_rpcit(vcpu);
default:
return -EOPNOTSUPP;
}
--
2.27.0


2021-12-07 21:00:52

by Matthew Rosato

Subject: [PATCH 25/32] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV

CONFIG_VFIO_PCI_ZDEV was previously removed as unnecessary. That was true
at the time, but subsequent changes will make KVM an additional required
component for vfio-pci-zdev. Re-introduce CONFIG_VFIO_PCI_ZDEV, as there is
now a reason to say 'n' to it (when CONFIG_KVM will not be enabled).

Signed-off-by: Matthew Rosato <[email protected]>
---
drivers/vfio/pci/Kconfig | 11 +++++++++++
drivers/vfio/pci/Makefile | 2 +-
include/linux/vfio_pci_core.h | 2 +-
3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 860424ccda1b..fedd1d4cb592 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -42,5 +42,16 @@ config VFIO_PCI_IGD
and LPC bridge config space.

To enable Intel IGD assignment through vfio-pci, say Y.
+
+config VFIO_PCI_ZDEV
+ bool "VFIO PCI extensions for s390x KVM passthrough"
+ depends on S390 && KVM
+ default y
+ help
+ Support s390x-specific extensions to enable support for enhancements
+ to KVM passthrough capabilities, such as interpretive execution of
+ zPCI instructions.
+
+ To enable s390x KVM vfio-pci extensions, say Y.
endif
endif
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 349d68d242b4..01b1f83d83d7 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only

vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
-vfio-pci-core-$(CONFIG_S390) += vfio_pci_zdev.o
+vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV) += vfio_pci_zdev.o
obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o

vfio-pci-y := vfio_pci.o
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index ef9a44b6cf5d..5e2bca3b89db 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -195,7 +195,7 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
}
#endif

-#ifdef CONFIG_S390
+#ifdef CONFIG_VFIO_PCI_ZDEV
extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps);
#else
--
2.27.0


2021-12-07 21:00:53

by Matthew Rosato

Subject: [PATCH 26/32] vfio-pci/zdev: wire up group notifier

KVM zPCI passthrough device logic will need a reference to the associated
kvm guest that has access to the device. Let's register a group notifier
for VFIO_GROUP_NOTIFY_SET_KVM to catch this information in order to create
an association between a kvm guest and the host zdev.
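
The notifier relies on the usual embedded-member pattern: the callback gets
a pointer to the notifier block and recovers the enclosing structure from it.
A minimal user-space sketch of that pattern is shown below; every toy_* name
and the numeric constants are hypothetical stand-ins, not the kernel's
definitions:

```c
#include <stddef.h>

/* User-space equivalent of the kernel's container_of() helper. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define TOY_NOTIFY_DONE			0
#define TOY_NOTIFY_OK			1
#define TOY_GROUP_NOTIFY_SET_KVM	1UL

struct toy_notifier_block {
	int (*notifier_call)(struct toy_notifier_block *nb,
			     unsigned long action, void *data);
};

/* Per-device state with the notifier block embedded in it. */
struct toy_kzdev {
	void *kvm;			/* guest the device is attached to */
	struct toy_notifier_block nb;
};

static int toy_group_notifier(struct toy_notifier_block *nb,
			      unsigned long action, void *data)
{
	/* Recover the enclosing toy_kzdev from its embedded member. */
	struct toy_kzdev *kzdev = toy_container_of(nb, struct toy_kzdev, nb);

	if (action == TOY_GROUP_NOTIFY_SET_KVM) {
		if (!data)
			return TOY_NOTIFY_DONE;
		kzdev->kvm = data;	/* remember the associated guest */
	}

	return TOY_NOTIFY_OK;
}
```

Embedding the block avoids a separate lookup table from notifier to device,
at the cost of tying the notifier's lifetime to that of the per-device state.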

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 2 ++
drivers/vfio/pci/vfio_pci_core.c | 2 ++
drivers/vfio/pci/vfio_pci_zdev.c | 54 ++++++++++++++++++++++++++++++++
include/linux/vfio_pci_core.h | 12 +++++++
4 files changed, 70 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 97e3a369135d..6526908ac834 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -17,6 +17,7 @@
#include <linux/kvm.h>
#include <linux/pci.h>
#include <linux/mutex.h>
+#include <linux/notifier.h>
#include <asm/pci_insn.h>
#include <asm/pci_dma.h>

@@ -33,6 +34,7 @@ struct kvm_zdev {
u64 rpcit_count;
struct kvm_zdev_ioat ioat;
struct zpci_fib fib;
+ struct notifier_block nb;
};

extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index f948e6cd2993..fc57d4d0abbe 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -452,6 +452,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)

vfio_pci_vf_token_user_add(vdev, -1);
vfio_spapr_pci_eeh_release(vdev->pdev);
+ vfio_pci_zdev_release(vdev);
vfio_pci_core_disable(vdev);

mutex_lock(&vdev->igate);
@@ -470,6 +471,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
{
vfio_pci_probe_mmaps(vdev);
+ vfio_pci_zdev_open(vdev);
vfio_spapr_pci_eeh_open(vdev->pdev);
vfio_pci_vf_token_user_add(vdev, 1);
}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index ea4c0d2b0663..cfd7f44b06c1 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -13,6 +13,7 @@
#include <linux/vfio_zdev.h>
#include <asm/pci_clp.h>
#include <asm/pci_io.h>
+#include <asm/kvm_pci.h>

#include <linux/vfio_pci_core.h>

@@ -136,3 +137,56 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,

return ret;
}
+
+static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct kvm_zdev *kzdev = container_of(nb, struct kvm_zdev, nb);
+
+ if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
+ if (!data || !kzdev->zdev)
+ return NOTIFY_DONE;
+ if (kvm_s390_pci_attach_kvm(kzdev->zdev, data))
+ return NOTIFY_DONE;
+ }
+
+ return NOTIFY_OK;
+}
+
+int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
+{
+ unsigned long events = VFIO_GROUP_NOTIFY_SET_KVM;
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+ int ret;
+
+ if (!zdev)
+ return -ENODEV;
+
+ ret = kvm_s390_pci_dev_open(zdev);
+ if (ret)
+ return -ENODEV;
+
+ zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
+
+ ret = vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
+ &events, &zdev->kzdev->nb);
+ if (ret)
+ kvm_s390_pci_dev_release(zdev);
+
+ return ret;
+}
+
+int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+
+ if (!zdev || !zdev->kzdev)
+ return -ENODEV;
+
+ vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
+ &zdev->kzdev->nb);
+
+ kvm_s390_pci_dev_release(zdev);
+
+ return 0;
+}
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 5e2bca3b89db..14079da409f1 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -198,12 +198,24 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
#ifdef CONFIG_VFIO_PCI_ZDEV
extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps);
+int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
+int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
#else
static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps)
{
return -ENODEV;
}
+
+static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
+{
+ return -ENODEV;
+}
+
+static inline int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
+{
+ return -ENODEV;
+}
#endif

/* Will be exported for vfio pci drivers usage */
--
2.27.0
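
For readers unfamiliar with the notifier pattern used in patch 26, here is a minimal userspace sketch of how a callback recovers its enclosing structure from an embedded notifier_block. The kzdev_sketch type, the GROUP_NOTIFY_SET_KVM constant, and the notify() helper are stand-ins invented for illustration; the real code uses struct kvm_zdev, VFIO_GROUP_NOTIFY_SET_KVM, and the kernel notifier chain, and calls kvm_s390_pci_attach_kvm() instead of storing a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel definitions; values are illustrative only. */
#define NOTIFY_DONE 0
#define NOTIFY_OK   1
#define GROUP_NOTIFY_SET_KVM 1UL /* hypothetical stand-in for VFIO_GROUP_NOTIFY_SET_KVM */

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
};

struct kvm { int id; };

/* Hypothetical miniature of struct kvm_zdev: the notifier is embedded. */
struct kzdev_sketch {
	struct kvm *kvm;
	struct notifier_block nb;
};

/* Recover the enclosing structure from the embedded notifier_block. */
static int group_notifier(struct notifier_block *nb,
			  unsigned long action, void *data)
{
	struct kzdev_sketch *kzdev = container_of(nb, struct kzdev_sketch, nb);

	if (action == GROUP_NOTIFY_SET_KVM) {
		if (!data)
			return NOTIFY_DONE;
		kzdev->kvm = data; /* the patch attaches the kvm to the zdev here */
	}

	return NOTIFY_OK;
}

/* Hypothetical helper standing in for the notifier chain dispatch. */
static int notify(struct notifier_block *nb, unsigned long action, void *data)
{
	assert(nb && nb->notifier_call);
	return nb->notifier_call(nb, action, data);
}
```

Embedding the notifier_block in the per-device structure means no lookup table is needed: the callback only receives the notifier_block pointer, and container_of() is enough to get back to the device state.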


2021-12-07 21:00:58

by Matthew Rosato

Subject: [PATCH 27/32] vfio-pci/zdev: wire up zPCI interpretive execution support

Introduce support for VFIO_DEVICE_FEATURE_ZPCI_INTERP, which is a new
VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
s390x vfio-pci device wishes to enable/disable zPCI interpretive
execution, which allows zPCI instructions to be executed directly by
underlying firmware without KVM involvement.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 1 +
drivers/vfio/pci/vfio_pci_core.c | 2 +
drivers/vfio/pci/vfio_pci_zdev.c | 76 ++++++++++++++++++++++++++++++++
include/linux/vfio_pci_core.h | 10 +++++
include/uapi/linux/vfio.h | 7 +++
include/uapi/linux/vfio_zdev.h | 15 +++++++
6 files changed, 111 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 6526908ac834..062bac720428 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -35,6 +35,7 @@ struct kvm_zdev {
struct kvm_zdev_ioat ioat;
struct zpci_fib fib;
struct notifier_block nb;
+ bool interp;
};

extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index fc57d4d0abbe..2b2d64a2190c 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1172,6 +1172,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
mutex_unlock(&vdev->vf_token->lock);

return 0;
+ case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
+ return vfio_pci_zdev_feat_interp(vdev, feature, arg);
default:
return -ENOTTY;
}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index cfd7f44b06c1..b205e0ad1fd3 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -54,6 +54,10 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
.version = zdev->version
};

+ /* Some values are different for interpreted devices */
+ if (zdev->kzdev && zdev->kzdev->interp)
+ cap.maxstbl = zdev->maxstbl;
+
return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
}

@@ -138,6 +142,70 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
return ret;
}

+int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+ struct vfio_device_zpci_interp *data;
+ struct vfio_device_feature *feat;
+ unsigned long minsz;
+ int size, rc;
+
+ if (!zdev || !zdev->kzdev)
+ return -EINVAL;
+
+ /*
+ * If PROBE requested and feature not found, leave immediately.
+ * Otherwise, keep going as GET or SET may also be specified.
+ */
+ if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
+ rc = kvm_s390_pci_interp_probe(zdev);
+ if (rc)
+ return rc;
+ }
+ if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
+ VFIO_DEVICE_FEATURE_SET)))
+ return 0;
+
+ size = sizeof(*feat) + sizeof(*data);
+ feat = kzalloc(size, GFP_KERNEL);
+ if (!feat)
+ return -ENOMEM;
+
+ data = (struct vfio_device_zpci_interp *)&feat->data;
+ minsz = offsetofend(struct vfio_device_feature, flags);
+
+ /* Get the rest of the payload for GET/SET */
+ rc = copy_from_user(data, (void __user *)(arg + minsz),
+ sizeof(*data));
+ if (rc)
+ rc = -EINVAL;
+
+ if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
+ if (zdev->gd != 0)
+ data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
+ else
+ data->flags = 0;
+ data->fh = zdev->fh;
+ /* userspace is using host fh, give interpreted clp values */
+ zdev->kzdev->interp = true;
+
+ if (copy_to_user((void __user *)arg, feat, size))
+ rc = -EFAULT;
+ } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
+ if (data->flags == VFIO_DEVICE_ZPCI_FLAG_INTERP)
+ rc = kvm_s390_pci_interp_enable(zdev);
+ else if (data->flags == 0)
+ rc = kvm_s390_pci_interp_disable(zdev);
+ else
+ rc = -EINVAL;
+ }
+
+ kfree(feat);
+ return rc;
+}
+
static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -167,6 +235,7 @@ int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
return -ENODEV;

zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
+ zdev->kzdev->interp = false;

ret = vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
&events, &zdev->kzdev->nb);
@@ -186,6 +255,13 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
&zdev->kzdev->nb);

+ /*
+ * If the device was using interpretation, don't trust that userspace
+ * did the appropriate cleanup
+ */
+ if (zdev->gd != 0)
+ kvm_s390_pci_interp_disable(zdev);
+
kvm_s390_pci_dev_release(zdev);

return 0;
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 14079da409f1..92dc43c827c9 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -198,6 +198,9 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
#ifdef CONFIG_VFIO_PCI_ZDEV
extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps);
+int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg);
int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
#else
@@ -207,6 +210,13 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
return -ENODEV;
}

+static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg)
+{
+ return -ENOTTY;
+}
+
static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
{
return -ENODEV;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index ef33ea002b0b..b9a75485b8e7 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1002,6 +1002,13 @@ struct vfio_device_feature {
*/
#define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN (0)

+/*
+ * Provide support for enabling interpretation of zPCI instructions. This
+ * feature is only valid for s390x PCI devices. Data provided when setting
+ * and getting this feature is further described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_INTERP (1)
+
/* -------- API for Type1 VFIO IOMMU -------- */

/**
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index b4309397b6b2..575f0410dc66 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -75,4 +75,19 @@ struct vfio_device_info_cap_zpci_pfip {
__u8 pfip[];
};

+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_INTERP
+ *
+ * This feature is used for enabling zPCI instruction interpretation for a
+ * device. No data is provided when setting this feature. When getting
+ * this feature, the following structure is provided which details whether
+ * or not interpretation is active and provides the guest with host device
+ * information necessary to enable interpretation.
+ */
+struct vfio_device_zpci_interp {
+ __u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
+ __u32 fh; /* Host device function handle */
+};
+
#endif
--
2.27.0
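
As a rough illustration of how userspace might build the request buffer for the feature proposed in patch 27, the sketch below lays out a feature header followed by the interpretation payload. The struct and macro names are local mirrors written for this example, not the uapi definitions; the flag values follow the existing VFIO_DEVICE_FEATURE layout (index in the low 16 bits, GET/SET/PROBE above), and the feature index 1 and payload layout are taken from the patch as posted. The ioctl call itself is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of struct vfio_device_feature (assumed layout). */
struct feature_hdr {
	uint32_t argsz;
	uint32_t flags;
	uint8_t data[];
};

#define FEATURE_MASK   0xffffu     /* low 16 bits: feature index */
#define FEATURE_GET    (1u << 16)
#define FEATURE_SET    (1u << 17)
#define FEATURE_PROBE  (1u << 18)

#define FEATURE_ZPCI_INTERP 1u     /* index proposed by the patch */
#define ZPCI_FLAG_INTERP    1u     /* payload flag proposed by the patch */

/* Local mirror of the proposed struct vfio_device_zpci_interp. */
struct zpci_interp {
	uint64_t flags;
	uint32_t fh;
};

/*
 * Build a request for the given operation (GET, SET, optionally with
 * PROBE). For SET, 'enable' selects whether interpretation is requested.
 * The caller owns the returned buffer and must free() it.
 */
static struct feature_hdr *build_interp_request(uint32_t op, int enable)
{
	size_t size = sizeof(struct feature_hdr) + sizeof(struct zpci_interp);
	struct zpci_interp payload = { 0 };
	struct feature_hdr *feat;

	assert((op & FEATURE_MASK) == 0); /* op carries only operation bits */

	feat = calloc(1, size);
	if (!feat)
		return NULL;

	feat->argsz = (uint32_t)size;
	feat->flags = op | FEATURE_ZPCI_INTERP;
	payload.flags = enable ? ZPCI_FLAG_INTERP : 0;
	memcpy(feat->data, &payload, sizeof(payload));

	return feat;
}
```

A caller would then pass the buffer to the VFIO_DEVICE_FEATURE ioctl on the device file descriptor; on a GET, the handler in the patch writes the flags and the host function handle back into the same payload.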


2021-12-07 21:01:01

by Matthew Rosato

Subject: [PATCH 28/32] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support

Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
forwarding, which allows underlying firmware to deliver interrupts
directly to the associated kvm guest.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 2 +
drivers/vfio/pci/vfio_pci_core.c | 2 +
drivers/vfio/pci/vfio_pci_zdev.c | 96 +++++++++++++++++++++++++++++++-
include/linux/vfio_pci_core.h | 10 ++++
include/uapi/linux/vfio.h | 7 +++
include/uapi/linux/vfio_zdev.h | 20 +++++++
6 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 062bac720428..0a0e42e1db1c 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -36,6 +36,8 @@ struct kvm_zdev {
struct zpci_fib fib;
struct notifier_block nb;
bool interp;
+ bool aif;
+ bool fhost;
};

extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 2b2d64a2190c..01658de660bd 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1174,6 +1174,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
return 0;
case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
return vfio_pci_zdev_feat_interp(vdev, feature, arg);
+ case VFIO_DEVICE_FEATURE_ZPCI_AIF:
+ return vfio_pci_zdev_feat_aif(vdev, feature, arg);
default:
return -ENOTTY;
}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index b205e0ad1fd3..dd98808b9139 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -13,6 +13,7 @@
#include <linux/vfio_zdev.h>
#include <asm/pci_clp.h>
#include <asm/pci_io.h>
+#include <asm/pci_insn.h>
#include <asm/kvm_pci.h>

#include <linux/vfio_pci_core.h>
@@ -206,6 +207,97 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
return rc;
}

+int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+ struct vfio_device_zpci_aif *data;
+ struct vfio_device_feature *feat;
+ unsigned long minsz;
+ int size, rc = 0;
+
+ if (!zdev || !zdev->kzdev)
+ return -EINVAL;
+
+ /*
+ * If PROBE requested and feature not found, leave immediately.
+ * Otherwise, keep going as GET or SET may also be specified.
+ */
+ if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
+ rc = kvm_s390_pci_aif_probe(zdev);
+ if (rc)
+ return rc;
+ }
+ if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
+ VFIO_DEVICE_FEATURE_SET)))
+ return 0;
+
+ size = sizeof(*feat) + sizeof(*data);
+ feat = kzalloc(size, GFP_KERNEL);
+ if (!feat)
+ return -ENOMEM;
+
+ data = (struct vfio_device_zpci_aif *)&feat->data;
+ minsz = offsetofend(struct vfio_device_feature, flags);
+
+ /* Get the rest of the payload for GET/SET */
+ rc = copy_from_user(data, (void __user *)(arg + minsz),
+ sizeof(*data));
+ if (rc)
+ rc = -EINVAL;
+
+ if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
+ if (zdev->kzdev->aif)
+ data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
+ if (zdev->kzdev->fhost)
+ data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
+
+ if (copy_to_user((void __user *)arg, feat, size))
+ rc = -EFAULT;
+ } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
+ if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) {
+ /* create a guest fib */
+ struct zpci_fib fib;
+
+ fib.fmt0.aibv = data->ibv;
+ fib.fmt0.isc = data->isc;
+ fib.fmt0.noi = data->noi;
+ if (data->sb != 0) {
+ fib.fmt0.aisb = data->sb;
+ fib.fmt0.aisbo = data->sbo;
+ fib.fmt0.sum = 1;
+ } else {
+ fib.fmt0.aisb = 0;
+ fib.fmt0.aisbo = 0;
+ fib.fmt0.sum = 0;
+ }
+ if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) {
+ rc = kvm_s390_pci_aif_enable(zdev, &fib, false);
+ if (!rc) {
+ zdev->kzdev->aif = true;
+ zdev->kzdev->fhost = true;
+ }
+ } else {
+ rc = kvm_s390_pci_aif_enable(zdev, &fib, true);
+ if (!rc)
+ zdev->kzdev->aif = true;
+ }
+ } else if (data->flags == 0) {
+ rc = kvm_s390_pci_aif_disable(zdev);
+ if (!rc) {
+ zdev->kzdev->aif = false;
+ zdev->kzdev->fhost = false;
+ }
+ } else {
+ rc = -EINVAL;
+ }
+ }
+
+ kfree(feat);
+ return rc;
+}
+
static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -259,8 +351,10 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
* If the device was using interpretation, don't trust that userspace
* did the appropriate cleanup
*/
- if (zdev->gd != 0)
+ if (zdev->gd != 0) {
+ kvm_s390_pci_aif_disable(zdev);
kvm_s390_pci_interp_disable(zdev);
+ }

kvm_s390_pci_dev_release(zdev);

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 92dc43c827c9..5442d3fa1662 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -201,6 +201,9 @@ extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
struct vfio_device_feature feature,
unsigned long arg);
+int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg);
int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
#else
@@ -217,6 +220,13 @@ static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
return -ENOTTY;
}

+static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg)
+{
+ return -ENOTTY;
+}
+
static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
{
return -ENODEV;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index b9a75485b8e7..fe3bfd99bf50 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1009,6 +1009,13 @@ struct vfio_device_feature {
*/
#define VFIO_DEVICE_FEATURE_ZPCI_INTERP (1)

+/*
+ * Provide support for enabling adapter interruption forwarding for zPCI
+ * devices. This feature is only valid for s390x PCI devices. Data provided
+ * when setting and getting this feature is further described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_AIF (2)
+
/* -------- API for Type1 VFIO IOMMU -------- */

/**
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index 575f0410dc66..c574e23f9385 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
__u32 fh; /* Host device function handle */
};

+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_AIF
+ *
+ * This feature is used for enabling forwarding of adapter interrupts directly
+ * from firmware to the guest. When setting this feature, the flags indicate
+ * whether to enable/disable the feature and the structure defined below is
+ * used to setup the forwarding structures. When getting this feature, only
+ * the flags are used to indicate the current state.
+ */
+struct vfio_device_zpci_aif {
+ __u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
+ __u64 ibv; /* Address of guest interrupt bit vector */
+ __u64 sb; /* Address of guest summary bit */
+ __u32 noi; /* Number of interrupts */
+ __u8 isc; /* Guest interrupt subclass */
+ __u8 sbo; /* Offset of guest summary bit vector */
+};
+
#endif
--
2.27.0
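
The SET path of the AIF handler in patch 28 is easier to follow when the flag decoding is pulled out of the ioctl plumbing. The following is a simplified sketch of that decision logic only: aif_req, fib_sketch, and aif_action are hypothetical types for this example (the patch fills fmt0 of struct zpci_fib and calls kvm_s390_pci_aif_enable()/kvm_s390_pci_aif_disable() directly).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define AIF_FLOAT 1u /* mirrors VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT */
#define AIF_HOST  2u /* mirrors VFIO_DEVICE_ZPCI_FLAG_AIF_HOST */

/* Hypothetical mirror of the proposed struct vfio_device_zpci_aif. */
struct aif_req {
	uint64_t flags;
	uint64_t ibv;
	uint64_t sb;
	uint32_t noi;
	uint8_t isc;
	uint8_t sbo;
};

/* Hypothetical subset of the guest FIB fields set by the handler. */
struct fib_sketch {
	uint64_t aibv;
	uint64_t aisb;
	uint32_t noi;
	uint8_t isc;
	uint8_t aisbo;
	uint8_t sum;
};

enum aif_action { AIF_DISABLE, AIF_ENABLE_FLOAT, AIF_ENABLE_HOST };

/*
 * Decode a SET request: FLOAT (optionally with HOST) enables forwarding,
 * flags == 0 disables it, anything else is rejected. The summary bit
 * fields are only used when a summary bit address was supplied.
 */
static int aif_decode(const struct aif_req *req, struct fib_sketch *fib,
		      enum aif_action *action)
{
	assert(req && fib && action);

	if (req->flags & AIF_FLOAT) {
		fib->aibv = req->ibv;
		fib->isc = req->isc;
		fib->noi = req->noi;
		fib->sum = req->sb != 0;
		fib->aisb = fib->sum ? req->sb : 0;
		fib->aisbo = fib->sum ? req->sbo : 0;
		*action = (req->flags & AIF_HOST) ? AIF_ENABLE_HOST
						  : AIF_ENABLE_FLOAT;
		return 0;
	}

	if (req->flags == 0) {
		*action = AIF_DISABLE;
		return 0;
	}

	return -EINVAL;
}
```

Note that, as in the patch, a request with only the HOST flag set is rejected: HOST is a modifier on FLOAT rather than a mode of its own.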


2021-12-07 21:01:05

by Matthew Rosato

Subject: [PATCH 29/32] vfio-pci/zdev: wire up zPCI IOAT assist support

Introduce support for VFIO_DEVICE_FEATURE_ZPCI_IOAT, which is a new
VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
s390x vfio-pci device wishes to enable/disable zPCI I/O Address
Translation assistance, allowing the host to perform address translation
and shadowing.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/asm/kvm_pci.h | 1 +
drivers/vfio/pci/vfio_pci_core.c | 2 ++
drivers/vfio/pci/vfio_pci_zdev.c | 61 ++++++++++++++++++++++++++++++++
include/linux/vfio_pci_core.h | 10 ++++++
include/uapi/linux/vfio.h | 8 +++++
include/uapi/linux/vfio_zdev.h | 13 +++++++
6 files changed, 95 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 0a0e42e1db1c..0b362d55c7b2 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -32,6 +32,7 @@ struct kvm_zdev {
struct zpci_dev *zdev;
struct kvm *kvm;
u64 rpcit_count;
+ u64 iota;
struct kvm_zdev_ioat ioat;
struct zpci_fib fib;
struct notifier_block nb;
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 01658de660bd..709d9ba22a60 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1176,6 +1176,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
return vfio_pci_zdev_feat_interp(vdev, feature, arg);
case VFIO_DEVICE_FEATURE_ZPCI_AIF:
return vfio_pci_zdev_feat_aif(vdev, feature, arg);
+ case VFIO_DEVICE_FEATURE_ZPCI_IOAT:
+ return vfio_pci_zdev_feat_ioat(vdev, feature, arg);
default:
return -ENOTTY;
}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index dd98808b9139..85be77492a6d 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -298,6 +298,66 @@ int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
return rc;
}

+int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+ struct vfio_device_zpci_ioat *data;
+ struct vfio_device_feature *feat;
+ unsigned long minsz;
+ int size, rc = 0;
+
+ if (!zdev || !zdev->kzdev)
+ return -EINVAL;
+
+ /*
+ * If PROBE requested and feature not found, leave immediately.
+ * Otherwise, keep going as GET or SET may also be specified.
+ */
+ if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
+ rc = kvm_s390_pci_ioat_probe(zdev);
+ if (rc)
+ return rc;
+ }
+ if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
+ VFIO_DEVICE_FEATURE_SET)))
+ return 0;
+
+ size = sizeof(*feat) + sizeof(*data);
+ feat = kzalloc(size, GFP_KERNEL);
+ if (!feat)
+ return -ENOMEM;
+
+ data = (struct vfio_device_zpci_ioat *)&feat->data;
+ minsz = offsetofend(struct vfio_device_feature, flags);
+
+ /* Get the rest of the payload for GET/SET */
+ rc = copy_from_user(data, (void __user *)(arg + minsz),
+ sizeof(*data));
+ if (rc)
+ rc = -EINVAL;
+
+ if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
+ data->iota = (u64)zdev->kzdev->iota;
+ if (copy_to_user((void __user *)arg, feat, size))
+ rc = -EFAULT;
+ } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
+ if (data->iota != 0) {
+ rc = kvm_s390_pci_ioat_enable(zdev, data->iota);
+ if (!rc)
+ zdev->kzdev->iota = data->iota;
+ } else if (zdev->kzdev->iota != 0) {
+ rc = kvm_s390_pci_ioat_disable(zdev);
+ if (!rc)
+ zdev->kzdev->iota = 0;
+ }
+ }
+
+ kfree(feat);
+ return rc;
+}
+
static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -353,6 +413,7 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
*/
if (zdev->gd != 0) {
kvm_s390_pci_aif_disable(zdev);
+ kvm_s390_pci_ioat_disable(zdev);
kvm_s390_pci_interp_disable(zdev);
}

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 5442d3fa1662..7c45a425e7f8 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -204,6 +204,9 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
struct vfio_device_feature feature,
unsigned long arg);
+int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg);
int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
#else
@@ -227,6 +230,13 @@ static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
return -ENOTTY;
}

+static inline int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
+ struct vfio_device_feature feature,
+ unsigned long arg)
+{
+ return -ENOTTY;
+}
+
static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
{
return -ENODEV;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index fe3bfd99bf50..32c687388f48 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1016,6 +1016,14 @@ struct vfio_device_feature {
*/
#define VFIO_DEVICE_FEATURE_ZPCI_AIF (2)

+/*
+ * Provide support for enabling guest I/O address translation assistance for
+ * zPCI devices. This feature is only valid for s390x PCI devices. Data
+ * provided when setting and getting this feature is further described in
+ * vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_IOAT (3)
+
/* -------- API for Type1 VFIO IOMMU -------- */

/**
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index c574e23f9385..1a5229b7bb18 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -110,4 +110,17 @@ struct vfio_device_zpci_aif {
__u8 sbo; /* Offset of guest summary bit vector */
};

+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_IOAT
+ *
+ * This feature is used for enabling guest I/O translation assistance for
+ * passthrough zPCI devices using instruction interpretation. When setting
+ * this feature, the iota specifies a KVM guest I/O translation anchor. When
+ * getting this feature, the most recently set anchor (or 0) is returned in
+ * iota.
+ */
+struct vfio_device_zpci_ioat {
+ __u64 iota;
+};
+
#endif
--
2.27.0
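
The IOAT SET handling in patch 29 amounts to a small state transition on the stored anchor. The sketch below models just that transition, assuming the underlying enable/disable call succeeds (the patch only updates the stored iota when kvm_s390_pci_ioat_enable() or kvm_s390_pci_ioat_disable() returns 0). The ioat_action enum and ioat_set() are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

enum ioat_action { IOAT_NOOP, IOAT_ENABLE, IOAT_DISABLE };

/*
 * Apply a SET request to the currently stored anchor.
 * - non-zero request: enable the assist with the new anchor
 * - zero request while an anchor is set: disable the assist
 * - zero request with no anchor set: nothing to do
 */
static enum ioat_action ioat_set(uint64_t *current, uint64_t requested)
{
	assert(current);

	if (requested != 0) {
		*current = requested;
		return IOAT_ENABLE;
	}

	if (*current != 0) {
		*current = 0;
		return IOAT_DISABLE;
	}

	return IOAT_NOOP;
}
```

A GET simply reports the stored value, so userspace reads back 0 both before any SET and after a successful disable.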


2021-12-07 21:01:10

by Matthew Rosato

Subject: [PATCH 30/32] vfio-pci/zdev: add DTSM to clp group capability

The DTSM, or designation type supported mask, indicates what IOAT formats
are available to the guest. For an interpreted device, userspace will not
know what format(s) the IOAT assist supports, so pass it via the
capability chain. Since the value belongs to the Query PCI Function Group
clp, let's extend the existing capability with a new version.

Signed-off-by: Matthew Rosato <[email protected]>
---
drivers/vfio/pci/vfio_pci_zdev.c | 9 ++++++---
include/uapi/linux/vfio_zdev.h | 3 +++
2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 85be77492a6d..342b59ed36c9 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -45,19 +45,22 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
{
struct vfio_device_info_cap_zpci_group cap = {
.header.id = VFIO_DEVICE_INFO_CAP_ZPCI_GROUP,
- .header.version = 1,
+ .header.version = 2,
.dasm = zdev->dma_mask,
.msi_addr = zdev->msi_addr,
.flags = VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH,
.mui = zdev->fmb_update,
.noi = zdev->max_msi,
.maxstbl = ZPCI_MAX_WRITE_SIZE,
- .version = zdev->version
+ .version = zdev->version,
+ .dtsm = 0
};

/* Some values are different for interpreted devices */
- if (zdev->kzdev && zdev->kzdev->interp)
+ if (zdev->kzdev && zdev->kzdev->interp) {
cap.maxstbl = zdev->maxstbl;
+ cap.dtsm = kvm_s390_pci_get_dtsm(zdev);
+ }

return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
}
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index 1a5229b7bb18..b4c2ba8e71f0 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
__u16 noi; /* Maximum number of MSIs */
__u16 maxstbl; /* Maximum Store Block Length */
__u8 version; /* Supported PCI Version */
+ /* End of version 1 */
+ __u8 dtsm; /* Supported IOAT Designations */
+ /* End of version 2 */
};

/**
--
2.27.0
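
Because patch 30 extends an existing capability rather than adding a new one, a consumer has to check the capability version before trusting the new field. A minimal sketch of that check follows; the structure is a trimmed, hypothetical mirror of vfio_device_info_cap_zpci_group (most version 1 fields are omitted), and group_cap_dtsm() is an illustrative helper, not part of any existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct vfio_info_cap_header (assumed layout). */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;
};

/* Trimmed, hypothetical mirror of the zPCI group capability. */
struct zpci_group_cap_sketch {
	struct cap_header header;
	uint16_t maxstbl;
	uint8_t version; /* last field of capability version 1 */
	uint8_t dtsm;    /* only meaningful from capability version 2 on */
};

/* Return the DTSM, or 0 when the kernel only provides version 1. */
static uint8_t group_cap_dtsm(const struct zpci_group_cap_sketch *cap)
{
	assert(cap);
	return cap->header.version >= 2 ? cap->dtsm : 0;
}
```

Treating a version 1 capability as "no designation types reported" matches the default of 0 that the patch uses for non-interpreted devices.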


2021-12-07 21:01:18

by Matthew Rosato

Subject: [PATCH 31/32] KVM: s390: introduce CPU feature for zPCI Interpretation

KVM_S390_VM_CPU_FEAT_ZPCI_INTERP relays whether zPCI interpretive
execution is possible based on the available hardware facilities.

Signed-off-by: Matthew Rosato <[email protected]>
---
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/kvm-s390.c | 4 ++++
2 files changed, 5 insertions(+)

diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 7a6b14874d65..ed06458a871f 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
#define KVM_S390_VM_CPU_FEAT_PFMFI 11
#define KVM_S390_VM_CPU_FEAT_SIGPIF 12
#define KVM_S390_VM_CPU_FEAT_KSS 13
+#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
struct kvm_s390_vm_cpu_feat {
__u64 feat[16];
};
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 361d742cdf0d..45d1bd295b38 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -434,6 +434,10 @@ static void kvm_s390_cpu_feat_init(void)
if (test_facility(151)) /* DFLTCC */
__insn32_query(INSN_DFLTCC, kvm_s390_available_subfunc.dfltcc);

+ if (test_facility(69) && test_facility(70) && test_facility(71) &&
+ test_facility(72)) /* zPCI Interpretation */
+ allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ZPCI_INTERP);
+
if (MACHINE_HAS_ESOP)
allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
/*
--
2.27.0
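
To show how a consumer might test for the feature bit added in patch 31, here is a small sketch operating on the feat[16] array from struct kvm_s390_vm_cpu_feat. It assumes the bitmap uses MSB-0 numbering within each 64-bit word when read natively on the s390 host, which is how the s390 set_bit_inv() helper numbers bits; that layout is an assumption of this example and is not stated in the patch. cpu_feat_test() and cpu_feat_set() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPU_FEAT_ZPCI_INTERP 14 /* value proposed by the patch */
#define CPU_FEAT_WORDS       16 /* feat[16] in struct kvm_s390_vm_cpu_feat */

/* Test feature 'nr', counting bit 0 as the most significant bit of feat[0]. */
static bool cpu_feat_test(const uint64_t feat[CPU_FEAT_WORDS], unsigned int nr)
{
	assert(nr < CPU_FEAT_WORDS * 64);
	return (feat[nr / 64] >> (63 - nr % 64)) & 1;
}

/* Set feature 'nr' using the same MSB-0 numbering. */
static void cpu_feat_set(uint64_t feat[CPU_FEAT_WORDS], unsigned int nr)
{
	assert(nr < CPU_FEAT_WORDS * 64);
	feat[nr / 64] |= UINT64_C(1) << (63 - nr % 64);
}
```

Under that assumption, feature 14 lands in bit 49 (counting from the least significant bit) of the first word.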


2021-12-07 21:01:29

by Matthew Rosato

Subject: [PATCH 32/32] MAINTAINERS: additional files related kvm s390 pci passthrough

Add entries from the s390 kvm subdirectory related to pci passthrough.

Signed-off-by: Matthew Rosato <[email protected]>
---
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 43007f2d29e0..a88f8e4f2c80 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16689,6 +16689,8 @@ M: Eric Farman <[email protected]>
L: [email protected]
L: [email protected]
S: Supported
+F: arch/s390/include/asm/kvm_pci.h
+F: arch/s390/kvm/pci*
F: drivers/vfio/pci/vfio_pci_zdev.c
F: include/uapi/linux/vfio_zdev.h

--
2.27.0


2021-12-07 21:16:58

by Matthew Rosato

Subject: Re: [PATCH 00/32] KVM: s390: enable zPCI for interpretive execution

On 12/7/21 3:57 PM, Matthew Rosato wrote:
> Enable interpretive execution of zPCI instructions + adapter interruption
> forwarding for s390x KVM vfio-pci. This is done by introducing a series
> of new vfio-pci feature ioctls that are unique to vfio-pci-zdev (s390x) and
> are used to negotiate the various aspects of zPCI interpretation setup.
> By allowing interpretation of zPCI instructions and firmware delivery of
> interrupts to guests, we can significantly reduce the frequency of guest
> SIE exits for zPCI. We then see additional gains by handling a hot-path
> instruction that can still intercept to the hypervisor (RPCIT) directly
> in kvm.
>
> From the perspective of guest configuration, you pass through zPCI devices
> in the same manner as before, with interpretation support being used by
> default if available in kernel+qemu.
>
> Will reply with a link to the associated QEMU series.

https://lists.gnu.org/archive/html/qemu-devel/2021-12/msg00873.html

>
> Matthew Rosato (32):
> s390/sclp: detect the zPCI interpretation facility
> s390/sclp: detect the AISII facility
> s390/sclp: detect the AENI facility
> s390/sclp: detect the AISI facility
> s390/airq: pass more TPI info to airq handlers
> s390/airq: allow for airq structure that uses an input vector
> s390/pci: externalize the SIC operation controls and routine
> s390/pci: stash associated GISA designation
> s390/pci: export some routines related to RPCIT processing
> s390/pci: stash dtsm and maxstbl
> s390/pci: add helper function to find device by handle
> s390/pci: get SHM information from list pci
> KVM: s390: pci: add basic kvm_zdev structure
> KVM: s390: pci: do initial setup for AEN interpretation
> KVM: s390: pci: enable host forwarding of Adapter Event Notifications
> KVM: s390: expose the guest zPCI interpretation facility
> KVM: s390: expose the guest Adapter Interruption Source ID facility
> KVM: s390: expose guest Adapter Event Notification Interpretation
> facility
> KVM: s390: mechanism to enable guest zPCI Interpretation
> KVM: s390: pci: provide routines for enabling/disabling interpretation
> KVM: s390: pci: provide routines for enabling/disabling interrupt
> forwarding
> KVM: s390: pci: provide routines for enabling/disabling IOAT assist
> KVM: s390: pci: handle refresh of PCI translations
> KVM: s390: intercept the rpcit instruction
> vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
> vfio-pci/zdev: wire up group notifier
> vfio-pci/zdev: wire up zPCI interpretive execution support
> vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
> vfio-pci/zdev: wire up zPCI IOAT assist support
> vfio-pci/zdev: add DTSM to clp group capability
> KVM: s390: introduce CPU feature for zPCI Interpretation
> MAINTAINERS: additional files related kvm s390 pci passthrough
>
> MAINTAINERS | 2 +
> arch/s390/include/asm/airq.h | 7 +-
> arch/s390/include/asm/kvm_host.h | 5 +
> arch/s390/include/asm/kvm_pci.h | 62 +++
> arch/s390/include/asm/pci.h | 13 +
> arch/s390/include/asm/pci_clp.h | 11 +-
> arch/s390/include/asm/pci_dma.h | 3 +
> arch/s390/include/asm/pci_insn.h | 29 +-
> arch/s390/include/asm/sclp.h | 4 +
> arch/s390/include/asm/tpi.h | 14 +
> arch/s390/include/uapi/asm/kvm.h | 1 +
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/interrupt.c | 97 +++-
> arch/s390/kvm/kvm-s390.c | 65 ++-
> arch/s390/kvm/kvm-s390.h | 10 +
> arch/s390/kvm/pci.c | 784 +++++++++++++++++++++++++++++++
> arch/s390/kvm/pci.h | 59 +++
> arch/s390/kvm/priv.c | 41 ++
> arch/s390/pci/pci.c | 47 ++
> arch/s390/pci/pci_clp.c | 19 +-
> arch/s390/pci/pci_dma.c | 1 +
> arch/s390/pci/pci_insn.c | 5 +-
> arch/s390/pci/pci_irq.c | 50 +-
> drivers/s390/char/sclp_early.c | 4 +
> drivers/s390/cio/airq.c | 12 +-
> drivers/s390/cio/qdio_thinint.c | 6 +-
> drivers/s390/crypto/ap_bus.c | 9 +-
> drivers/s390/virtio/virtio_ccw.c | 6 +-
> drivers/vfio/pci/Kconfig | 11 +
> drivers/vfio/pci/Makefile | 2 +-
> drivers/vfio/pci/vfio_pci_core.c | 8 +
> drivers/vfio/pci/vfio_pci_zdev.c | 292 +++++++++++-
> include/linux/vfio_pci_core.h | 44 +-
> include/uapi/linux/vfio.h | 22 +
> include/uapi/linux/vfio_zdev.h | 51 ++
> 35 files changed, 1738 insertions(+), 60 deletions(-)
> create mode 100644 arch/s390/include/asm/kvm_pci.h
> create mode 100644 arch/s390/kvm/pci.c
> create mode 100644 arch/s390/kvm/pci.h
>


2021-12-08 09:44:31

by Niklas Schnelle

Subject: Re: [PATCH 20/32] KVM: s390: pci: provide routines for enabling/disabling interpretation

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for zPCI Load/Store
> interpretation.
>
> The first time such a request is received, enable the necessary facilities
> for the guest.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 4 ++
> arch/s390/kvm/pci.c | 91 +++++++++++++++++++++++++++++++++
> arch/s390/pci/pci.c | 3 ++
> 3 files changed, 98 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 3e491a39704c..5d6283acb54c 100644
> --- a/arch/s390/include/asm/kvm_pci.h
>
---8<---
> return rc;
> + }
> +
> + /*
> + * Store information about the identity of the kvm guest allowed to
> + * access this device via interpretation to be used by host CLP
> + */
> + zdev->gd = gd;
> +
> + rc = zpci_enable_device(zdev);
> + if (rc)
> + goto err;
> +
> + /* Re-register the IOMMU that was already created */
> + rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
> + (u64)zdev->dma_table);

The zdev->dma_table is a virtual address but we need an absolute
address in the MPCIFC so the above should use
virt_to_phys(zdev->dma_table) to be compatible with future V != R
kernel memory. As of now since virtual and absolute kernel addresses
are the same this is not a bug and we've had this (wrong) pattern in
the rest of the code but let's get it right here from the start.

See also my commit "s390/pci: use physical addresses in DMA tables"
that is currently in the s390 feature branch.
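
To illustrate the concern, here is a minimal userspace sketch, not kernel
code. mock_virt_to_phys() and MOCK_VIRT_OFFSET are hypothetical stand-ins
that model a kernel where virtual and absolute addresses differ by a fixed
offset; the real virt_to_phys() is not implemented this way. The point is
only that a plain cast stops being correct once the offset is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: model a V != R kernel where virt = phys + fixed offset. */
#define MOCK_VIRT_OFFSET 0x100000000ULL

typedef uint64_t mock_phys_addr_t;

/* Stand-in for virt_to_phys(); NOT the real kernel implementation. */
static mock_phys_addr_t mock_virt_to_phys(uint64_t vaddr)
{
    assert(vaddr >= MOCK_VIRT_OFFSET);
    return vaddr - MOCK_VIRT_OFFSET;
}

/* What a plain (u64) cast of the pointer value would pass along. */
static uint64_t mock_cast_only(uint64_t vaddr)
{
    return vaddr;
}
```

In the patch this would amount to passing virt_to_phys(zdev->dma_table)
rather than (u64)zdev->dma_table to zpci_register_ioat().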

> + if (rc)
> + goto err;
> +
> + return rc;
> +
> +err:
> + zdev->gd = 0;
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_enable);
> +
> +int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
> +{
> + int rc;
> +
> + if (zdev->gd == 0)
> + return -EINVAL;
> +
> + /* Remove the host CLP guest designation */
> + zdev->gd = 0;
> +
> + if (zdev_enabled(zdev)) {
> + rc = zpci_disable_device(zdev);
> + if (rc)
> + return rc;
> + }
> +
> + rc = zpci_enable_device(zdev);
> + if (rc)
> + return rc;
> +
> + /* Re-register the IOMMU that was already created */
> + rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
> + (u64)zdev->dma_table);

Same as above

> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_disable);
> +
>
---8<---


2021-12-08 10:04:57

by Thomas Huth

Subject: Re: [PATCH 06/32] s390/airq: allow for airq structure that uses an input vector

On 07/12/2021 21.57, Matthew Rosato wrote:
> When doing device passthrough where interrupts are being forwarded
> from host to guest, we wish to use a pinned section of guest memory
> as the vector (the same memory used by the guest as the vector).
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
[...]
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 880bcd73f11a..dfd4f3276a6d 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
[...]
> @@ -443,7 +443,7 @@ static int __init zpci_directed_irq_init(void)
> zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
> AIRQ_IV_DATA |
> AIRQ_IV_CACHELINE |
> - (!cpu ? AIRQ_IV_ALLOC : 0));
> + (!cpu ? AIRQ_IV_ALLOC : 0), 0);

Nit: Indentation changed

> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 52c376d15978..ff84f45587be 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -241,7 +241,7 @@ static struct airq_info *new_airq_info(int index)
> return NULL;
> rwlock_init(&info->lock);
> info->aiv = airq_iv_create(VIRTIO_IV_BITS, AIRQ_IV_ALLOC | AIRQ_IV_PTR
> - | AIRQ_IV_CACHELINE);
> + | AIRQ_IV_CACHELINE, 0);

Ditto

> if (!info->aiv) {
> kfree(info);
> return NULL;
>

Thomas


2021-12-08 10:30:29

by Niklas Schnelle

Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> Add a routine that will perform a shadow operation between a guest
> and host IOAT. A subsequent patch will invoke this in response to
> an 04 RPCIT instruction intercept.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 1 +
> arch/s390/include/asm/pci_dma.h | 1 +
> arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
> arch/s390/kvm/pci.h | 4 +-
> 4 files changed, 196 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 254275399f21..97e3a369135d 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
> struct kvm_zdev {
> struct zpci_dev *zdev;
> struct kvm *kvm;
> + u64 rpcit_count;
> struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> };
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index e1d3c1d3fc8a..0ca15e5db3d9 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
> #define ZPCI_TABLE_ENTRIES (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
> #define ZPCI_TABLE_PAGES (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
> #define ZPCI_TABLE_ENTRIES_PAGES (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
> +#define ZPCI_TABLE_ENTRIES_PER_PAGE (ZPCI_TABLE_ENTRIES / ZPCI_TABLE_PAGES)
>
> #define ZPCI_TABLE_BITS 11
> #define ZPCI_PT_BITS 8
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index a1c0c0881332..858c5ecdc8b9 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -123,6 +123,195 @@ int kvm_s390_pci_aen_init(u8 nisc)
> return rc;
> }
>
> +static int dma_shadow_cpu_trans(struct kvm_vcpu *vcpu, unsigned long *entry,
> + unsigned long *gentry)
> +{
> + unsigned long idx;
> + struct page *page;
> + void *gaddr = NULL;
> + kvm_pfn_t pfn;
> + gpa_t addr;
> + int rc = 0;
> +
> + if (pt_entry_isvalid(*gentry)) {
> + /* pin and validate */
> + addr = *gentry & ZPCI_PTE_ADDR_MASK;
> + idx = srcu_read_lock(&vcpu->kvm->srcu);
> + page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
> + srcu_read_unlock(&vcpu->kvm->srcu, idx);
> + if (is_error_page(page))
> + return -EIO;
> + gaddr = page_to_virt(page) + (addr & ~PAGE_MASK);

Hmm, this looks like a virtual vs physical address mixup to me that is
currently not a problem because kernel virtual addresses are equal to
their physical address. Here page_to_virt(page) gives us a virtual
address but the entries in the I/O translation table have to be
physical (aka absolute) addresses.

With my commit "s390/pci: use physical addresses in DMA tables"
currently in the s390 feature branch this is also reflected in the
argument types taken by set_pt_pfaa() below so gaddr should have type
phys_addr_t not void *. That should also remove the need for the cast
to unsigned long for the duplicate check.
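
A rough userspace sketch of what the duplicate check could look like once
gaddr is a physical address type. Everything here is an assumption made for
illustration: mock_phys_addr_t stands in for phys_addr_t, the mask and valid
bit are made up and do not match the real ZPCI_PTE_* definitions, and the
explicit guest_valid flag replaces the "gaddr != NULL" test, which no longer
makes sense for a non-pointer type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t mock_phys_addr_t;   /* stand-in for phys_addr_t */

/* Hypothetical layout, chosen for this sketch only. */
#define MOCK_PTE_ADDR_MASK 0xfffffffffffff000ULL
#define MOCK_PTE_VALID     0x1ULL

enum shadow_action {
    SHADOW_NOOP,
    SHADOW_NEW,
    SHADOW_DUPLICATE,
    SHADOW_REPLACE,
    SHADOW_INVALIDATE,
};

/* Decide what to do with a host entry given the guest's entry. */
static enum shadow_action mock_classify(uint64_t host_entry, bool guest_valid,
                                        mock_phys_addr_t gaddr)
{
    bool host_valid = (host_entry & MOCK_PTE_VALID) != 0;

    if (host_valid) {
        if (!guest_valid)
            return SHADOW_INVALIDATE;
        /* Both sides are physical addresses, so no pointer cast is needed. */
        if ((host_entry & MOCK_PTE_ADDR_MASK) == (gaddr & MOCK_PTE_ADDR_MASK))
            return SHADOW_DUPLICATE;
        return SHADOW_REPLACE;
    }
    return guest_valid ? SHADOW_NEW : SHADOW_NOOP;
}
```

The sketch only classifies the cases; pinning, releasing the old pfn and the
return code handling from the patch are left out.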

> + }
> +
> + if (pt_entry_isvalid(*entry)) {
> + /* Either we are invalidating, replacing or no-op */
> + if (gaddr) {
> + if ((*entry & ZPCI_PTE_ADDR_MASK) ==
> + (unsigned long)gaddr) {
> + /* Duplicate */
> + kvm_release_pfn_dirty(*entry >> PAGE_SHIFT);
> + } else {
> + /* Replace */
> + pfn = (*entry >> PAGE_SHIFT);
> + invalidate_pt_entry(entry);
> + set_pt_pfaa(entry, gaddr);
> + validate_pt_entry(entry);
> + kvm_release_pfn_dirty(pfn);
> + rc = 1;
> + }
> + } else {
> + /* Invalidate */
> + pfn = (*entry >> PAGE_SHIFT);
> + invalidate_pt_entry(entry);
> + kvm_release_pfn_dirty(pfn);
> + rc = 1;
> + }
> + } else if (gaddr) {
> + /* New Entry */
> + set_pt_pfaa(entry, gaddr);
> + validate_pt_entry(entry);
> + }
> +
> + return rc;
> +}
> +
> +unsigned long *dma_walk_guest_cpu_trans(struct kvm_vcpu *vcpu,
> + struct kvm_zdev_ioat *ioat,
> + dma_addr_t dma_addr)
> +{
> + unsigned long *rto, *sto, *pto;
> + unsigned int rtx, rts, sx, px, idx;
> + struct page *page;
> + gpa_t addr;
> + int i;
> +
> + /* Pin guest segment table if needed */
> + rtx = calc_rtx(dma_addr);
> + rto = ioat->head[(rtx / ZPCI_TABLE_ENTRIES_PER_PAGE)];
> + rts = rtx * ZPCI_TABLE_PAGES;
> + if (!ioat->seg[rts]) {
> + if (!reg_entry_isvalid(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
> + return NULL;
> + sto = get_rt_sto(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
> + addr = ((u64)sto & ZPCI_RTE_ADDR_MASK);
> + idx = srcu_read_lock(&vcpu->kvm->srcu);
> + for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
> + page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
> + if (is_error_page(page)) {
> + srcu_read_unlock(&vcpu->kvm->srcu, idx);
> + return NULL;
> + }
> + ioat->seg[rts + i] = page_to_virt(page) +
> + (addr & ~PAGE_MASK);

Here on the other hand I think the page_to_virt() is correct since you
want the virtual address to be able to dereference it, correct?

> + addr += PAGE_SIZE;
> + }
> + srcu_read_unlock(&vcpu->kvm->srcu, idx);
> + }
> +
> + /* Allocate pin pointers for another segment table if needed */
> + if (!ioat->pt[rtx]) {
> + ioat->pt[rtx] = kcalloc(ZPCI_TABLE_ENTRIES,
> + (sizeof(unsigned long *)), GFP_KERNEL);
> + if (!ioat->pt[rtx])
> + return NULL;
> + }
> + /* Pin guest page table if needed */
> + sx = calc_sx(dma_addr);
> + sto = ioat->seg[(rts + (sx / ZPCI_TABLE_ENTRIES_PER_PAGE))];
> + if (!ioat->pt[rtx][sx]) {
> + if (!reg_entry_isvalid(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
> + return NULL;
> + pto = get_st_pto(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
> + if (!pto)
> + return NULL;
> + addr = ((u64)pto & ZPCI_STE_ADDR_MASK);
> + idx = srcu_read_lock(&vcpu->kvm->srcu);
> + page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
> + srcu_read_unlock(&vcpu->kvm->srcu, idx);
> + if (is_error_page(page))
> + return NULL;
> + ioat->pt[rtx][sx] = page_to_virt(page) + (addr & ~PAGE_MASK);

Same as above.

> + }
> + pto = ioat->pt[rtx][sx];
> +
> + /* Return guest PTE */
> + px = calc_px(dma_addr);
> + return &pto[px];
> +}
> +
>
---8<---


2021-12-08 11:13:07

by Christian Borntraeger

Subject: Re: [PATCH 01/32] s390/sclp: detect the zPCI interpretation facility

On 07.12.21 at 21:57, Matthew Rosato wrote:
> Detect the zPCI Load/Store Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index c68ea35de498..c84e8e0ca344 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -88,6 +88,7 @@ struct sclp_info {
> unsigned char has_diag318 : 1;
> unsigned char has_sipl : 1;
> unsigned char has_dirq : 1;
> + unsigned char has_zpci_interp : 1;

Maybe use zpci_lsi (load/store interpretation), as PCI interpretation would be something else (also fix the subject line).
With that

Reviewed-by: Christian Borntraeger <[email protected]>
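
For reference, a small userspace sketch of how the detection could read with
the suggested name. struct mock_sclp_info and mock_detect_facilities() are
hypothetical stand-ins for struct sclp_info and
sclp_early_facilities_detect(); the fac118 masks are the ones used in
patches 01 to 04 of this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of struct sclp_info, using the suggested name. */
struct mock_sclp_info {
    unsigned char has_zpci_lsi : 1;  /* zPCI load/store interpretation */
    unsigned char has_aisii : 1;
    unsigned char has_aeni : 1;
    unsigned char has_aisi : 1;
};

/* Decode the facility bits from byte 118, as the series does. */
static struct mock_sclp_info mock_detect_facilities(uint8_t fac118)
{
    struct mock_sclp_info info = {0};

    info.has_aisii    = !!(fac118 & 0x40);
    info.has_aeni     = !!(fac118 & 0x20);
    info.has_aisi     = !!(fac118 & 0x10);
    info.has_zpci_lsi = !!(fac118 & 0x01);
    return info;
}
```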


> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index b64feab62caa..2e8199b7ae50 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_gisaf = !!(sccb->fac118 & 0x08);
> sclp.has_hvs = !!(sccb->fac119 & 0x80);
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> + sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
> if (sccb->fac91 & 0x40)
>

2021-12-08 11:14:01

by Christian Borntraeger

Subject: Re: [PATCH 02/32] s390/sclp: detect the AISII facility

On 07.12.21 at 21:57, Matthew Rosato wrote:
> Detect the Adapter Interruption Source ID Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index c84e8e0ca344..524a99baf221 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -89,6 +89,7 @@ struct sclp_info {
> unsigned char has_sipl : 1;
> unsigned char has_dirq : 1;
> unsigned char has_zpci_interp : 1;
> + unsigned char has_aisii : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index 2e8199b7ae50..a73120b8a5de 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_gisaf = !!(sccb->fac118 & 0x08);
> sclp.has_hvs = !!(sccb->fac119 & 0x80);
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> + sclp.has_aisii = !!(sccb->fac118 & 0x40);
> sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
>

2021-12-08 11:17:47

by Christian Borntraeger

Subject: Re: [PATCH 03/32] s390/sclp: detect the AENI facility



On 07.12.21 at 21:57, Matthew Rosato wrote:
> Detect the Adapter Event Notification Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index 524a99baf221..a763563bb3e7 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -90,6 +90,7 @@ struct sclp_info {
> unsigned char has_dirq : 1;
> unsigned char has_zpci_interp : 1;
> unsigned char has_aisii : 1;
> + unsigned char has_aeni : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index a73120b8a5de..52a203ea23cc 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -46,6 +46,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_hvs = !!(sccb->fac119 & 0x80);
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> sclp.has_aisii = !!(sccb->fac118 & 0x40);
> + sclp.has_aeni = !!(sccb->fac118 & 0x20);
> sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
>

2021-12-08 11:18:17

by Christian Borntraeger

Subject: Re: [PATCH 04/32] s390/sclp: detect the AISI facility



On 07.12.21 at 21:57, Matthew Rosato wrote:
> Detect the Adapter Interruption Suppression Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Christian Borntraeger <[email protected]>
> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index a763563bb3e7..559adb28a24c 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -91,6 +91,7 @@ struct sclp_info {
> unsigned char has_zpci_interp : 1;
> unsigned char has_aisii : 1;
> unsigned char has_aeni : 1;
> + unsigned char has_aisi : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index 52a203ea23cc..9b29ed850d39 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -47,6 +47,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> sclp.has_aisii = !!(sccb->fac118 & 0x40);
> sclp.has_aeni = !!(sccb->fac118 & 0x20);
> + sclp.has_aisi = !!(sccb->fac118 & 0x10);
> sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
>

2021-12-08 11:25:56

by Christian Borntraeger

Subject: Re: [PATCH 05/32] s390/airq: pass more TPI info to airq handlers



On 07.12.21 at 21:57, Matthew Rosato wrote:
> A subsequent patch will introduce an airq handler that requires additional
> TPI information beyond directed vs floating, so pass the entire tpi_info
> structure via the handler. Only pci actually uses this information today,
> for the other airq handlers this is effectively a no-op.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Looks sane.

Acked-by: Christian Borntraeger <[email protected]>
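
A simplified userspace model of the new handler signature, for readers
skimming the diff below. The mock_* types and functions are hypothetical;
only the idea is taken from the patch: the dispatcher hands over the whole
tpi_info and a handler that cares derives "floating" from directed_irq
itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of struct tpi_info to the one bit used here. */
struct mock_tpi_info {
    unsigned int directed_irq : 1;
};

struct mock_airq {
    void (*handler)(struct mock_airq *airq, struct mock_tpi_info *tpi_info);
    int floating_seen;
    int directed_seen;
};

/* Handler that needs the distinction computes it locally. */
static void mock_pci_handler(struct mock_airq *airq,
                             struct mock_tpi_info *tpi_info)
{
    bool floating = !tpi_info->directed_irq;

    if (floating)
        airq->floating_seen++;
    else
        airq->directed_seen++;
}

/* Dispatcher no longer pre-digests the TPI information. */
static void mock_dispatch(struct mock_airq *airq,
                          struct mock_tpi_info *tpi_info)
{
    assert(airq != NULL && airq->handler != NULL);
    airq->handler(airq, tpi_info);
}
```

Handlers that ignore the argument, as most do in this patch, are unaffected
apart from the signature change.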
> ---
> arch/s390/include/asm/airq.h | 3 ++-
> arch/s390/kvm/interrupt.c | 4 +++-
> arch/s390/pci/pci_irq.c | 9 +++++++--
> drivers/s390/cio/airq.c | 2 +-
> drivers/s390/cio/qdio_thinint.c | 6 ++++--
> drivers/s390/crypto/ap_bus.c | 9 ++++++---
> drivers/s390/virtio/virtio_ccw.c | 4 +++-
> 7 files changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 01936fdfaddb..7918a7d09028 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -12,10 +12,11 @@
>
> #include <linux/bit_spinlock.h>
> #include <linux/dma-mapping.h>
> +#include <asm/tpi.h>
>
> struct airq_struct {
> struct hlist_node list; /* Handler queueing. */
> - void (*handler)(struct airq_struct *airq, bool floating);
> + void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
> u8 *lsi_ptr; /* Local-Summary-Indicator pointer */
> u8 lsi_mask; /* Local-Summary-Indicator mask */
> u8 isc; /* Interrupt-subclass */
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index c3bd993fdd0c..f9b872e358c6 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -28,6 +28,7 @@
> #include <asm/switch_to.h>
> #include <asm/nmi.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
> #include "trace-s390.h"
> @@ -3261,7 +3262,8 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
> }
> EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
>
> -static void gib_alert_irq_handler(struct airq_struct *airq, bool floating)
> +static void gib_alert_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> inc_irq_stat(IRQIO_GAL);
> process_gib_alert_list();
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 954bb7a83124..880bcd73f11a 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -11,6 +11,7 @@
>
> #include <asm/isc.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
>
> static enum {FLOATING, DIRECTED} irq_delivery;
>
> @@ -216,8 +217,11 @@ static void zpci_handle_fallback_irq(void)
> }
> }
>
> -static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
> +static void zpci_directed_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> + bool floating = !tpi_info->directed_irq;
> +
> if (floating) {
> inc_irq_stat(IRQIO_PCF);
> zpci_handle_fallback_irq();
> @@ -227,7 +231,8 @@ static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
> }
> }
>
> -static void zpci_floating_irq_handler(struct airq_struct *airq, bool floating)
> +static void zpci_floating_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> unsigned long si, ai;
> struct airq_iv *aibv;
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index e56535c99888..2f2226786319 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -99,7 +99,7 @@ static irqreturn_t do_airq_interrupt(int irq, void *dummy)
> rcu_read_lock();
> hlist_for_each_entry_rcu(airq, head, list)
> if ((*airq->lsi_ptr & airq->lsi_mask) != 0)
> - airq->handler(airq, !tpi_info->directed_irq);
> + airq->handler(airq, tpi_info);
> rcu_read_unlock();
>
> return IRQ_HANDLED;
> diff --git a/drivers/s390/cio/qdio_thinint.c b/drivers/s390/cio/qdio_thinint.c
> index 8e09bf3a2fcd..9b9335dd06db 100644
> --- a/drivers/s390/cio/qdio_thinint.c
> +++ b/drivers/s390/cio/qdio_thinint.c
> @@ -15,6 +15,7 @@
> #include <asm/qdio.h>
> #include <asm/airq.h>
> #include <asm/isc.h>
> +#include <asm/tpi.h>
>
> #include "cio.h"
> #include "ioasm.h"
> @@ -93,9 +94,10 @@ static inline u32 clear_shared_ind(void)
> /**
> * tiqdio_thinint_handler - thin interrupt handler for qdio
> * @airq: pointer to adapter interrupt descriptor
> - * @floating: flag to recognize floating vs. directed interrupts (unused)
> + * @tpi_info: interrupt information (e.g. floating vs directed -- unused)
> */
> -static void tiqdio_thinint_handler(struct airq_struct *airq, bool floating)
> +static void tiqdio_thinint_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> u64 irq_time = S390_lowcore.int_clock;
> u32 si_used = clear_shared_ind();
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 1986243f9cd3..df1a038442db 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -27,6 +27,7 @@
> #include <linux/kthread.h>
> #include <linux/mutex.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
> #include <linux/atomic.h>
> #include <asm/isc.h>
> #include <linux/hrtimer.h>
> @@ -129,7 +130,8 @@ static int ap_max_adapter_id = 63;
> static struct bus_type ap_bus_type;
>
> /* Adapter interrupt definitions */
> -static void ap_interrupt_handler(struct airq_struct *airq, bool floating);
> +static void ap_interrupt_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info);
>
> static bool ap_irq_flag;
>
> @@ -442,9 +444,10 @@ static enum hrtimer_restart ap_poll_timeout(struct hrtimer *unused)
> /**
> * ap_interrupt_handler() - Schedule ap_tasklet on interrupt
> * @airq: pointer to adapter interrupt descriptor
> - * @floating: ignored
> + * @tpi_info: ignored
> */
> -static void ap_interrupt_handler(struct airq_struct *airq, bool floating)
> +static void ap_interrupt_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> inc_irq_stat(IRQIO_APB);
> tasklet_schedule(&ap_tasklet);
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..52c376d15978 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -33,6 +33,7 @@
> #include <asm/virtio-ccw.h>
> #include <asm/isc.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
>
> /*
> * virtio related functions
> @@ -203,7 +204,8 @@ static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
> write_unlock_irqrestore(&info->lock, flags);
> }
>
> -static void virtio_airq_handler(struct airq_struct *airq, bool floating)
> +static void virtio_airq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> struct airq_info *info = container_of(airq, struct airq_info, airq);
> unsigned long ai;
>

2021-12-08 12:21:46

by Niklas Schnelle

Subject: Re: [PATCH 12/32] s390/pci: get SHM information from list pci

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> KVM will need information on the special handle mask used to indicate
> emulated devices. In order to obtain this, a new type of list pci call
> must be made to gather the information. Remove the unused data pointer
> from clp_list_pci and __clp_add and instead optionally pass a pointer to
> a model-dependent-data field. Additionally, allow for clp_list_pci calls
> that don't specify a callback - in this case, just do the first pass of
> list pci and exit.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci.h | 6 ++++++
> arch/s390/include/asm/pci_clp.h | 2 +-
> arch/s390/pci/pci.c | 19 +++++++++++++++++++
> arch/s390/pci/pci_clp.c | 16 ++++++++++------
> 4 files changed, 36 insertions(+), 7 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 00a2c24d6d2b..86f43644756d 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -219,12 +219,18 @@ int zpci_unregister_ioat(struct zpci_dev *, u8);
> void zpci_remove_reserved_devices(void);
> void zpci_update_fh(struct zpci_dev *zdev, u32 fh);
>
---8<---
> -static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
> - void (*cb)(struct clp_fh_list_entry *, void *))
> +int clp_list_pci(struct clp_req_rsp_list_pci *rrb, u32 *mdd,
> + void (*cb)(struct clp_fh_list_entry *))
> {
> u64 resume_token = 0;
> int nentries, i, rc;
> @@ -368,8 +368,12 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
> rc = clp_list_pci_req(rrb, &resume_token, &nentries);
> if (rc)
> return rc;
> + if (mdd)
> + *mdd = rrb->response.mdd;
> + if (!cb)
> + return 0;

I think it would be slightly cleaner to instead de-static
clp_list_pci_req() and call that directly. That way clp_list_pci()
still lists all PCI functions and we can get rid of the data parameter
completely.

Also, I've been thinking about moving clp_scan_devices(),
clp_get_state(), and clp_refresh_fh() out of pci_clp.c because they are
higher level. I think that would nicely fit your zpci_get_mdd() in
pci.c with or without the above suggestion. Then we could do the
removal of the unused data parameter in that series as a cleanup. What
do you think?
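
Roughly what I have in mind, as a userspace sketch only. All mock_* names,
the request/response layout and the placeholder MDD value are hypothetical;
the actual zpci_get_mdd() from this series is not reproduced here. The idea
is a single list-pci request whose response carries the MDD, with no
callback and no iteration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MDD 0x12345678u  /* arbitrary placeholder value */

/* Hypothetical, heavily reduced request/response block. */
struct mock_rrb {
    uint32_t mdd;
};

/* Stand-in for a non-static clp_list_pci_req(): one request, one response. */
static int mock_clp_list_pci_req(struct mock_rrb *rrb, uint64_t *resume_token,
                                 int *nentries)
{
    rrb->mdd = MOCK_MDD;
    *resume_token = 0;
    *nentries = 0;
    return 0;
}

/* Fetch only the model-dependent data from the first response. */
static int mock_zpci_get_mdd(uint32_t *mdd)
{
    struct mock_rrb rrb = {0};
    uint64_t resume_token = 0;
    int nentries, rc;

    if (!mdd)
        return -EINVAL;

    rc = mock_clp_list_pci_req(&rrb, &resume_token, &nentries);
    if (!rc)
        *mdd = rrb.mdd;
    return rc;
}
```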

> for (i = 0; i < nentries; i++)
> - cb(&rrb->response.fh_list[i], data);
> + cb(&rrb->response.fh_list[i]);
> } while (resume_token);
>
> return rc;
> @@ -398,7 +402,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
> return -ENODEV;
> }
>
> -static void __clp_add(struct clp_fh_list_entry *entry, void *data)
> +static void __clp_add(struct clp_fh_list_entry *entry)
> {
> struct zpci_dev *zdev;
>


2021-12-08 12:59:28

by Christian Borntraeger

Subject: Re: [PATCH 06/32] s390/airq: allow for airq structure that uses an input vector



On 07.12.21 at 21:57, Matthew Rosato wrote:
> When doing device passthrough where interrupts are being forwarded
> from host to guest, we wish to use a pinned section of guest memory
> as the vector (the same memory used by the guest as the vector).
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/airq.h | 4 +++-
> arch/s390/pci/pci_irq.c | 8 ++++----
> drivers/s390/cio/airq.c | 10 +++++++---
> drivers/s390/virtio/virtio_ccw.c | 2 +-
> 4 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 7918a7d09028..e82e5626e139 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -47,8 +47,10 @@ struct airq_iv {
> #define AIRQ_IV_PTR 4 /* Allocate the ptr array */
> #define AIRQ_IV_DATA 8 /* Allocate the data array */
> #define AIRQ_IV_CACHELINE 16 /* Cacheline alignment for the vector */
> +#define AIRQ_IV_GUESTVEC 32 /* Vector is a pinned guest page */
>
> -struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags);
> +struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
> + unsigned long *vec);
> void airq_iv_release(struct airq_iv *iv);
> unsigned long airq_iv_alloc(struct airq_iv *iv, unsigned long num);
> void airq_iv_free(struct airq_iv *iv, unsigned long bit, unsigned long num);
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 880bcd73f11a..dfd4f3276a6d 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -296,7 +296,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
> zdev->aisb = bit;
>
> /* Create adapter interrupt vector */
> - zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK);
> + zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK, 0);
> if (!zdev->aibv)
> return -ENOMEM;
>
> @@ -421,7 +421,7 @@ static int __init zpci_directed_irq_init(void)
> union zpci_sic_iib iib = {{0}};
> unsigned int cpu;
>
> - zpci_sbv = airq_iv_create(num_possible_cpus(), 0);
> + zpci_sbv = airq_iv_create(num_possible_cpus(), 0, 0);

For a pointer use NULL? Also in other places. With the indentation fix this looks sane.
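
A userspace sketch of the calling convention I mean. mock_iv_create() and
mock_iv_release() are hypothetical and only loosely modelled on
airq_iv_create(); the flag value mirrors AIRQ_IV_GUESTVEC from the hunk
above. Callers that do not supply a vector pass NULL, which reads as "no
pointer" rather than as the integer 0.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MOCK_IV_GUESTVEC 32  /* mirrors AIRQ_IV_GUESTVEC */
#define MOCK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

struct mock_iv {
    unsigned long *vector;
    unsigned long flags;
};

/* Use the caller's vector if GUESTVEC is set, otherwise allocate one. */
static struct mock_iv *mock_iv_create(unsigned long bits, unsigned long flags,
                                      unsigned long *vec)
{
    struct mock_iv *iv = calloc(1, sizeof(*iv));
    size_t words = (bits + MOCK_BITS_PER_LONG - 1) / MOCK_BITS_PER_LONG;

    if (!iv)
        return NULL;
    iv->flags = flags;
    if (flags & MOCK_IV_GUESTVEC)
        iv->vector = vec;  /* caller-owned memory */
    else
        iv->vector = calloc(words ? words : 1, sizeof(unsigned long));
    if (!iv->vector) {
        free(iv);
        return NULL;
    }
    return iv;
}

/* Free only what mock_iv_create() allocated itself. */
static void mock_iv_release(struct mock_iv *iv)
{
    if (!iv)
        return;
    if (!(iv->flags & MOCK_IV_GUESTVEC))
        free(iv->vector);
    free(iv);
}
```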

2021-12-08 13:09:47

by Christian Borntraeger

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On 07.12.21 at 21:57, Matthew Rosato wrote:
> A subsequent patch will be issuing SIC from KVM -- export the necessary
> routine and make the operation control definitions available from a header.
> Because the routine will now be exported, let's swap the purpose of
> zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
> within pci_irq.c only for SIC calls that don't specify an iib.

Maybe it would be simpler to export the __ version instead of renaming everything.
Whatever Niklas prefers.
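
A sketch of that alternative, as a userspace mock rather than kernel code.
The mock_* names, the iib layout and the recording variables are
hypothetical. The three-argument variant keeps its double-underscore name
and would be the one exported, while the two-argument inline wrapper stays
as it is today, so existing callers need no change.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union mock_sic_iib {
    uint64_t raw[2];  /* hypothetical layout */
};

/* Recorded for the sketch only, so the call can be observed. */
static uint16_t mock_last_ctl;
static uint8_t mock_last_isc;
static int mock_last_iib_was_zero;

/* Would carry the EXPORT_SYMBOL_GPL() under this alternative. */
static int mock__zpci_set_irq_ctrl(uint16_t ctl, uint8_t isc,
                                   union mock_sic_iib *iib)
{
    static const union mock_sic_iib zero;

    assert(iib != NULL);
    mock_last_ctl = ctl;
    mock_last_isc = isc;
    mock_last_iib_was_zero = !memcmp(iib, &zero, sizeof(zero));
    return 0;
}

/* Unchanged convenience wrapper for calls without an iib. */
static inline int mock_zpci_set_irq_ctrl(uint16_t ctl, uint8_t isc)
{
    union mock_sic_iib iib = {{0}};

    return mock__zpci_set_irq_ctrl(ctl, isc, &iib);
}
```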
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
> arch/s390/pci/pci_insn.c | 3 ++-
> arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
> 3 files changed, 25 insertions(+), 23 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 61cf9531f68f..5331082fa516 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -98,6 +98,14 @@ struct zpci_fib {
> u32 gd;
> } __packed __aligned(8);
>
> +/* Set Interruption Controls Operation Controls */
> +#define SIC_IRQ_MODE_ALL 0
> +#define SIC_IRQ_MODE_SINGLE 1
> +#define SIC_IRQ_MODE_DIRECT 4
> +#define SIC_IRQ_MODE_D_ALL 16
> +#define SIC_IRQ_MODE_D_SINGLE 17
> +#define SIC_IRQ_MODE_SET_CPU 18
> +
> /* directed interruption information block */
> struct zpci_diib {
> u32 : 1;
> @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
> int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
> int __zpci_store_block(const u64 *data, u64 req, u64 offset);
> void zpci_barrier(void);
> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> -
> -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> -{
> - union zpci_sic_iib iib = {{0}};
> -
> - return __zpci_set_irq_ctrl(ctl, isc, &iib);
> -}
> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>
> #endif
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 28d863aaafea..d1a8bd43ce26 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
> }
>
> /* Set Interruption Controls */
> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> {
> if (!test_facility(72))
> return -EIO;
> @@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>
> return 0;
> }
> +EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);
>
> /* PCI Load */
> static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index dfd4f3276a6d..6b29e39496d1 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -15,13 +15,6 @@
>
> static enum {FLOATING, DIRECTED} irq_delivery;
>
> -#define SIC_IRQ_MODE_ALL 0
> -#define SIC_IRQ_MODE_SINGLE 1
> -#define SIC_IRQ_MODE_DIRECT 4
> -#define SIC_IRQ_MODE_D_ALL 16
> -#define SIC_IRQ_MODE_D_SINGLE 17
> -#define SIC_IRQ_MODE_SET_CPU 18
> -
> /*
> * summary bit vector
> * FLOATING - summary bit per function
> @@ -145,6 +138,13 @@ static int zpci_set_irq_affinity(struct irq_data *data, const struct cpumask *de
> return IRQ_SET_MASK_OK;
> }
>
> +static inline int __zpci_set_irq_ctrl(u16 ctl, u8 isc)
> +{
> + union zpci_sic_iib iib = {{0}};
> +
> + return zpci_set_irq_ctrl(ctl, isc, &iib);
> +}
> +
> static struct irq_chip zpci_irq_chip = {
> .name = "PCI-MSI",
> .irq_unmask = pci_msi_unmask_irq,
> @@ -165,7 +165,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
> /* End of second scan with interrupts on. */
> break;
> /* First scan complete, reenable interrupts. */
> - if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
> + if (__zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
> break;
> bit = 0;
> continue;
> @@ -203,7 +203,7 @@ static void zpci_handle_fallback_irq(void)
> /* End of second scan with interrupts on. */
> break;
> /* First scan complete, reenable interrupts. */
> - if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> + if (__zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> break;
> cpu = 0;
> continue;
> @@ -247,7 +247,7 @@ static void zpci_floating_irq_handler(struct airq_struct *airq,
> /* End of second scan with interrupts on. */
> break;
> /* First scan complete, reenable interrupts. */
> - if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> + if (__zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> break;
> si = 0;
> continue;
> @@ -412,8 +412,8 @@ static void __init cpu_enable_directed_irq(void *unused)
>
> iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector;
>
> - __zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
> - zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
> + zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
> + __zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
> }
>
> static int __init zpci_directed_irq_init(void)
> @@ -428,7 +428,7 @@ static int __init zpci_directed_irq_init(void)
> iib.diib.isc = PCI_ISC;
> iib.diib.nr_cpus = num_possible_cpus();
> iib.diib.disb_addr = (u64) zpci_sbv->vector;
> - __zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
> + zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
>
> zpci_ibv = kcalloc(num_possible_cpus(), sizeof(*zpci_ibv),
> GFP_KERNEL);
> @@ -504,7 +504,7 @@ int __init zpci_irq_init(void)
> * Enable floating IRQs (with suppression after one IRQ). When using
> * directed IRQs this enables the fallback path.
> */
> - zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
> + __zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
>
> return 0;
> out_airq:
>

2021-12-08 13:53:15

by Niklas Schnelle

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On Wed, 2021-12-08 at 14:09 +0100, Christian Borntraeger wrote:
> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> > A subsequent patch will be issuing SIC from KVM -- export the necessary
> > routine and make the operation control definitions available from a header.
> > Because the routine will now be exported, let's swap the purpose of
> > zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
> > within pci_irq.c only for SIC calls that don't specify an iib.
>
> Maybe it would be simpler to export the __ version instead of renaming everything.
> Whatever Niklas prefers.

See below; I think it's just not worth having both variants at all.

> > Signed-off-by: Matthew Rosato <[email protected]>
> > ---
> > arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
> > arch/s390/pci/pci_insn.c | 3 ++-
> > arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
> > 3 files changed, 25 insertions(+), 23 deletions(-)
> >
> > diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> > index 61cf9531f68f..5331082fa516 100644
> > --- a/arch/s390/include/asm/pci_insn.h
> > +++ b/arch/s390/include/asm/pci_insn.h
> > @@ -98,6 +98,14 @@ struct zpci_fib {
> > u32 gd;
> > } __packed __aligned(8);
> >
> > +/* Set Interruption Controls Operation Controls */
> > +#define SIC_IRQ_MODE_ALL 0
> > +#define SIC_IRQ_MODE_SINGLE 1
> > +#define SIC_IRQ_MODE_DIRECT 4
> > +#define SIC_IRQ_MODE_D_ALL 16
> > +#define SIC_IRQ_MODE_D_SINGLE 17
> > +#define SIC_IRQ_MODE_SET_CPU 18
> > +
> > /* directed interruption information block */
> > struct zpci_diib {
> > u32 : 1;
> > @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
> > int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
> > int __zpci_store_block(const u64 *data, u64 req, u64 offset);
> > void zpci_barrier(void);
> > -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> > -
> > -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> > -{
> > - union zpci_sic_iib iib = {{0}};
> > -
> > - return __zpci_set_irq_ctrl(ctl, isc, &iib);
> > -}
> > +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);

Since __zpci_set_irq_ctrl() was already non-static and non-inline, the
inline to non-inline change above shouldn't make a performance difference.

Looking at this makes me wonder, though: wouldn't it make sense to just
have the zpci_set_irq_ctrl() function inline in the header? Its body is
a single-instruction inline asm plus a test_facility(). The latter, by
the way, also looks rather out of place there, considering that we
call zpci_set_irq_ctrl() in the interrupt handler and facilities can't
go away, so it's pretty silly to check for it on every single
interrupt, unless I'm totally missing something.

> >
> > #endif
> > diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> > index 28d863aaafea..d1a8bd43ce26 100644
> > --- a/arch/s390/pci/pci_insn.c
> > +++ b/arch/s390/pci/pci_insn.c
> > @@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
> > }
> >
> > /* Set Interruption Controls */
> > -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> > +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> > {
> > if (!test_facility(72))
> > return -EIO;
> > @@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> >
> > return 0;
> > }
> > +EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);
> >
> > /* PCI Load */
> > static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
> > diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> > index dfd4f3276a6d..6b29e39496d1 100644
> > --- a/arch/s390/pci/pci_irq.c
> > +++ b/arch/s390/pci/pci_irq.c
> > @@ -15,13 +15,6 @@
> >
> > static enum {FLOATING, DIRECTED} irq_delivery;
> >
> > -#define SIC_IRQ_MODE_ALL 0
> > -#define SIC_IRQ_MODE_SINGLE 1
> > -#define SIC_IRQ_MODE_DIRECT 4
> > -#define SIC_IRQ_MODE_D_ALL 16
> > -#define SIC_IRQ_MODE_D_SINGLE 17
> > -#define SIC_IRQ_MODE_SET_CPU 18
> > -
> > /*
> > * summary bit vector
> > * FLOATING - summary bit per function
> > @@ -145,6 +138,13 @@ static int zpci_set_irq_affinity(struct irq_data *data, const struct cpumask *de
> > return IRQ_SET_MASK_OK;
> > }
> >
> > +static inline int __zpci_set_irq_ctrl(u16 ctl, u8 isc)
> > +{
> > + union zpci_sic_iib iib = {{0}};
> > +
> > + return zpci_set_irq_ctrl(ctl, isc, &iib);
> > +}
> > +

I would be totally fine with, and slightly prefer, having the zeroed iib
repeated at those 3 call sites that don't need it. At first glance that
should come out to pretty much the same number of lines of code, and it
removes the potential confusion of swapping the __ prefixed and
non-prefixed variants. What do you think?
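
For illustration only, a call site that passes a zeroed iib directly
might look roughly like the userspace sketch below. The simplified
union, the stubbed zpci_set_irq_ctrl(), DEMO_PCI_ISC and
demo_reenable_floating_irqs() are stand-ins invented for this example,
not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Operation control value as defined in the patch. */
#define SIC_IRQ_MODE_SINGLE 1

/* Placeholder for PCI_ISC (assumption: the real value comes from asm/isc.h). */
#define DEMO_PCI_ISC 2

/* Simplified stand-in for union zpci_sic_iib; NOT the real layout. */
union zpci_sic_iib {
	uint64_t raw[2];
};

/* Bookkeeping so the stub's behaviour can be observed. */
uint16_t last_ctl;
int last_iib_was_zero;

/* Stub for the exported routine: it only records its arguments. */
static int zpci_set_irq_ctrl(uint16_t ctl, uint8_t isc, union zpci_sic_iib *iib)
{
	static const union zpci_sic_iib zero;

	assert(iib != NULL);
	(void)isc;
	last_ctl = ctl;
	last_iib_was_zero = memcmp(iib, &zero, sizeof(zero)) == 0;
	return 0;
}

/*
 * Hypothetical call site: no wrapper, the caller declares a zeroed iib
 * itself and hands it to the single remaining function.
 */
int demo_reenable_floating_irqs(void)
{
	union zpci_sic_iib iib = {{0}};

	return zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, DEMO_PCI_ISC, &iib);
}
```

The cost is two extra lines per call site; the gain is that only one
function name exists, so nothing has to be swapped or renamed.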

> > static struct irq_chip zpci_irq_chip = {
> > .name = "PCI-MSI",
> > .irq_unmask = pci_msi_unmask_irq,
> > @@ -165,7 +165,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
> >
---8<---


2021-12-08 14:12:36

by Claudio Imbrenda

Subject: Re: [PATCH 05/32] s390/airq: pass more TPI info to airq handlers

On Tue, 7 Dec 2021 15:57:16 -0500
Matthew Rosato <[email protected]> wrote:

> A subsequent patch will introduce an airq handler that requires additional
> TPI information beyond directed vs floating, so pass the entire tpi_info
> structure via the handler. Only pci actually uses this information today,
> for the other airq handlers this is effectively a no-op.
>
> Reviewed-by: Eric Farman <[email protected]>

Reviewed-by: Claudio Imbrenda <[email protected]>

> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/airq.h | 3 ++-
> arch/s390/kvm/interrupt.c | 4 +++-
> arch/s390/pci/pci_irq.c | 9 +++++++--
> drivers/s390/cio/airq.c | 2 +-
> drivers/s390/cio/qdio_thinint.c | 6 ++++--
> drivers/s390/crypto/ap_bus.c | 9 ++++++---
> drivers/s390/virtio/virtio_ccw.c | 4 +++-
> 7 files changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 01936fdfaddb..7918a7d09028 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -12,10 +12,11 @@
>
> #include <linux/bit_spinlock.h>
> #include <linux/dma-mapping.h>
> +#include <asm/tpi.h>
>
> struct airq_struct {
> struct hlist_node list; /* Handler queueing. */
> - void (*handler)(struct airq_struct *airq, bool floating);
> + void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
> u8 *lsi_ptr; /* Local-Summary-Indicator pointer */
> u8 lsi_mask; /* Local-Summary-Indicator mask */
> u8 isc; /* Interrupt-subclass */
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index c3bd993fdd0c..f9b872e358c6 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -28,6 +28,7 @@
> #include <asm/switch_to.h>
> #include <asm/nmi.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
> #include "trace-s390.h"
> @@ -3261,7 +3262,8 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
> }
> EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
>
> -static void gib_alert_irq_handler(struct airq_struct *airq, bool floating)
> +static void gib_alert_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> inc_irq_stat(IRQIO_GAL);
> process_gib_alert_list();
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 954bb7a83124..880bcd73f11a 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -11,6 +11,7 @@
>
> #include <asm/isc.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
>
> static enum {FLOATING, DIRECTED} irq_delivery;
>
> @@ -216,8 +217,11 @@ static void zpci_handle_fallback_irq(void)
> }
> }
>
> -static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
> +static void zpci_directed_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> + bool floating = !tpi_info->directed_irq;
> +
> if (floating) {
> inc_irq_stat(IRQIO_PCF);
> zpci_handle_fallback_irq();
> @@ -227,7 +231,8 @@ static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
> }
> }
>
> -static void zpci_floating_irq_handler(struct airq_struct *airq, bool floating)
> +static void zpci_floating_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> unsigned long si, ai;
> struct airq_iv *aibv;
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index e56535c99888..2f2226786319 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -99,7 +99,7 @@ static irqreturn_t do_airq_interrupt(int irq, void *dummy)
> rcu_read_lock();
> hlist_for_each_entry_rcu(airq, head, list)
> if ((*airq->lsi_ptr & airq->lsi_mask) != 0)
> - airq->handler(airq, !tpi_info->directed_irq);
> + airq->handler(airq, tpi_info);
> rcu_read_unlock();
>
> return IRQ_HANDLED;
> diff --git a/drivers/s390/cio/qdio_thinint.c b/drivers/s390/cio/qdio_thinint.c
> index 8e09bf3a2fcd..9b9335dd06db 100644
> --- a/drivers/s390/cio/qdio_thinint.c
> +++ b/drivers/s390/cio/qdio_thinint.c
> @@ -15,6 +15,7 @@
> #include <asm/qdio.h>
> #include <asm/airq.h>
> #include <asm/isc.h>
> +#include <asm/tpi.h>
>
> #include "cio.h"
> #include "ioasm.h"
> @@ -93,9 +94,10 @@ static inline u32 clear_shared_ind(void)
> /**
> * tiqdio_thinint_handler - thin interrupt handler for qdio
> * @airq: pointer to adapter interrupt descriptor
> - * @floating: flag to recognize floating vs. directed interrupts (unused)
> + * @tpi_info: interrupt information (e.g. floating vs directed -- unused)
> */
> -static void tiqdio_thinint_handler(struct airq_struct *airq, bool floating)
> +static void tiqdio_thinint_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> u64 irq_time = S390_lowcore.int_clock;
> u32 si_used = clear_shared_ind();
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 1986243f9cd3..df1a038442db 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -27,6 +27,7 @@
> #include <linux/kthread.h>
> #include <linux/mutex.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
> #include <linux/atomic.h>
> #include <asm/isc.h>
> #include <linux/hrtimer.h>
> @@ -129,7 +130,8 @@ static int ap_max_adapter_id = 63;
> static struct bus_type ap_bus_type;
>
> /* Adapter interrupt definitions */
> -static void ap_interrupt_handler(struct airq_struct *airq, bool floating);
> +static void ap_interrupt_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info);
>
> static bool ap_irq_flag;
>
> @@ -442,9 +444,10 @@ static enum hrtimer_restart ap_poll_timeout(struct hrtimer *unused)
> /**
> * ap_interrupt_handler() - Schedule ap_tasklet on interrupt
> * @airq: pointer to adapter interrupt descriptor
> - * @floating: ignored
> + * @tpi_info: ignored
> */
> -static void ap_interrupt_handler(struct airq_struct *airq, bool floating)
> +static void ap_interrupt_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> inc_irq_stat(IRQIO_APB);
> tasklet_schedule(&ap_tasklet);
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..52c376d15978 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -33,6 +33,7 @@
> #include <asm/virtio-ccw.h>
> #include <asm/isc.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
>
> /*
> * virtio related functions
> @@ -203,7 +204,8 @@ static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
> write_unlock_irqrestore(&info->lock, flags);
> }
>
> -static void virtio_airq_handler(struct airq_struct *airq, bool floating)
> +static void virtio_airq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> struct airq_info *info = container_of(airq, struct airq_info, airq);
> unsigned long ai;


2021-12-08 14:12:42

by Claudio Imbrenda

Subject: Re: [PATCH 02/32] s390/sclp: detect the AISII facility

On Tue, 7 Dec 2021 15:57:13 -0500
Matthew Rosato <[email protected]> wrote:

> Detect the Adapter Interruption Source ID Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>

Reviewed-by: Claudio Imbrenda <[email protected]>

> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index c84e8e0ca344..524a99baf221 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -89,6 +89,7 @@ struct sclp_info {
> unsigned char has_sipl : 1;
> unsigned char has_dirq : 1;
> unsigned char has_zpci_interp : 1;
> + unsigned char has_aisii : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index 2e8199b7ae50..a73120b8a5de 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_gisaf = !!(sccb->fac118 & 0x08);
> sclp.has_hvs = !!(sccb->fac119 & 0x80);
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> + sclp.has_aisii = !!(sccb->fac118 & 0x40);
> sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;


2021-12-08 14:12:48

by Claudio Imbrenda

Subject: Re: [PATCH 04/32] s390/sclp: detect the AISI facility

On Tue, 7 Dec 2021 15:57:15 -0500
Matthew Rosato <[email protected]> wrote:

> Detect the Adapter Interruption Suppression Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Claudio Imbrenda <[email protected]>

> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index a763563bb3e7..559adb28a24c 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -91,6 +91,7 @@ struct sclp_info {
> unsigned char has_zpci_interp : 1;
> unsigned char has_aisii : 1;
> unsigned char has_aeni : 1;
> + unsigned char has_aisi : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index 52a203ea23cc..9b29ed850d39 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -47,6 +47,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> sclp.has_aisii = !!(sccb->fac118 & 0x40);
> sclp.has_aeni = !!(sccb->fac118 & 0x20);
> + sclp.has_aisi = !!(sccb->fac118 & 0x10);
> sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;


2021-12-08 14:12:50

by Claudio Imbrenda

Subject: Re: [PATCH 01/32] s390/sclp: detect the zPCI interpretation facility

On Tue, 7 Dec 2021 15:57:12 -0500
Matthew Rosato <[email protected]> wrote:

> Detect the zPCI Load/Store Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

I have the same comment as Christian; with that fixed:

Reviewed-by: Claudio Imbrenda <[email protected]>

> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index c68ea35de498..c84e8e0ca344 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -88,6 +88,7 @@ struct sclp_info {
> unsigned char has_diag318 : 1;
> unsigned char has_sipl : 1;
> unsigned char has_dirq : 1;
> + unsigned char has_zpci_interp : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index b64feab62caa..2e8199b7ae50 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_gisaf = !!(sccb->fac118 & 0x08);
> sclp.has_hvs = !!(sccb->fac119 & 0x80);
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> + sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
> if (sccb->fac91 & 0x40)


2021-12-08 14:12:53

by Claudio Imbrenda

Subject: Re: [PATCH 03/32] s390/sclp: detect the AENI facility

On Tue, 7 Dec 2021 15:57:14 -0500
Matthew Rosato <[email protected]> wrote:

> Detect the Adapter Event Notification Interpretation facility.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Claudio Imbrenda <[email protected]>

> ---
> arch/s390/include/asm/sclp.h | 1 +
> drivers/s390/char/sclp_early.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index 524a99baf221..a763563bb3e7 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -90,6 +90,7 @@ struct sclp_info {
> unsigned char has_dirq : 1;
> unsigned char has_zpci_interp : 1;
> unsigned char has_aisii : 1;
> + unsigned char has_aeni : 1;
> unsigned int ibc;
> unsigned int mtid;
> unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index a73120b8a5de..52a203ea23cc 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -46,6 +46,7 @@ static void __init sclp_early_facilities_detect(void)
> sclp.has_hvs = !!(sccb->fac119 & 0x80);
> sclp.has_kss = !!(sccb->fac98 & 0x01);
> sclp.has_aisii = !!(sccb->fac118 & 0x40);
> + sclp.has_aeni = !!(sccb->fac118 & 0x20);
> sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> if (sccb->fac85 & 0x02)
> S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;


2021-12-08 14:33:42

by Eric Farman

Subject: Re: [PATCH 01/32] s390/sclp: detect the zPCI interpretation facility

On Wed, 2021-12-08 at 12:12 +0100, Christian Borntraeger wrote:
> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> > Detect the zPCI Load/Store Interpretation facility.
> >
> > Reviewed-by: Eric Farman <[email protected]>
> > Signed-off-by: Matthew Rosato <[email protected]>
> > ---
> > arch/s390/include/asm/sclp.h | 1 +
> > drivers/s390/char/sclp_early.c | 1 +
> > 2 files changed, 2 insertions(+)
> >
> > diff --git a/arch/s390/include/asm/sclp.h
> > b/arch/s390/include/asm/sclp.h
> > index c68ea35de498..c84e8e0ca344 100644
> > --- a/arch/s390/include/asm/sclp.h
> > +++ b/arch/s390/include/asm/sclp.h
> > @@ -88,6 +88,7 @@ struct sclp_info {
> > unsigned char has_diag318 : 1;
> > unsigned char has_sipl : 1;
> > unsigned char has_dirq : 1;
> > + unsigned char has_zpci_interp : 1;
>
> maybe use zpci_lsi (load store interpretation) as pci interpretation
> would be something else (also fix the subject line).
> With that
>
> Reviewed-by: Christian Borntraeger <[email protected]>

My r-b can stay with Christian's suggested change.

>
>
> > unsigned int ibc;
> > unsigned int mtid;
> > unsigned int mtid_cp;
> > diff --git a/drivers/s390/char/sclp_early.c
> > b/drivers/s390/char/sclp_early.c
> > index b64feab62caa..2e8199b7ae50 100644
> > --- a/drivers/s390/char/sclp_early.c
> > +++ b/drivers/s390/char/sclp_early.c
> > @@ -45,6 +45,7 @@ static void __init
> > sclp_early_facilities_detect(void)
> > sclp.has_gisaf = !!(sccb->fac118 & 0x08);
> > sclp.has_hvs = !!(sccb->fac119 & 0x80);
> > sclp.has_kss = !!(sccb->fac98 & 0x01);
> > + sclp.has_zpci_interp = !!(sccb->fac118 & 0x01);
> > if (sccb->fac85 & 0x02)
> > S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
> > if (sccb->fac91 & 0x40)
> >


2021-12-08 15:04:58

by Matthew Rosato

Subject: Re: [PATCH 20/32] KVM: s390: pci: provide routines for enabling/disabling interpretation

On 12/8/21 4:44 AM, Niklas Schnelle wrote:
> On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
>> These routines will be wired into the vfio_pci_zdev ioctl handlers to
>> respond to requests to enable / disable a device for zPCI Load/Store
>> interpretation.
>>
>> The first time such a request is received, enable the necessary facilities
>> for the guest.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>> arch/s390/include/asm/kvm_pci.h | 4 ++
>> arch/s390/kvm/pci.c | 91 +++++++++++++++++++++++++++++++++
>> arch/s390/pci/pci.c | 3 ++
>> 3 files changed, 98 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
>> index 3e491a39704c..5d6283acb54c 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>>
> ---8<---
>> return rc;
>> + }
>> +
>> + /*
>> + * Store information about the identity of the kvm guest allowed to
>> + * access this device via interpretation to be used by host CLP
>> + */
>> + zdev->gd = gd;
>> +
>> + rc = zpci_enable_device(zdev);
>> + if (rc)
>> + goto err;
>> +
>> + /* Re-register the IOMMU that was already created */
>> + rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
>> + (u64)zdev->dma_table);
>
> The zdev->dma_table is a virtual address but we need an absolute
> address in the MPCIFC so the above should use
> virt_to_phys(zdev->dma_table) to be compatible with future V != R
> kernel memory. As of now since virtual and absolute kernel addresses
> are the same this is not a bug and we've had this (wrong) pattern in
> the rest of the code but let's get it right here from the start.
>
> See also my commit "s390/pci: use physical addresses in DMA tables"
> that is currently in the s390 feature branch.

You're right of course -- I saw those changes happening as I prepared
this series but I didn't want to delay getting comments any longer, what
with the holidays approaching. Of course, I didn't realize they were
already out on the feature branch.

I suspect the same pattern also shows up in the code related to
handling RPCIT, and in the AEN setup too.
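
As a rough userspace illustration of the point above, the sketch below
models a kernel where virtual and physical addresses differ by a fixed
offset and shows the address being converted before it is registered.
Every demo_* name and the offset value are assumptions made up for this
example; in the kernel the conversion would be virt_to_phys() on
zdev->dma_table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: a nonzero offset models a future V != R kernel. */
uint64_t demo_virt_offset = 0x1000;

/* Stand-in for virt_to_phys(): strip the (assumed) linear offset. */
static uint64_t demo_virt_to_phys(uint64_t vaddr)
{
	return vaddr - demo_virt_offset;
}

/* Records the table origin that would be handed to the hardware. */
uint64_t registered_iota;

/* Stand-in for zpci_register_ioat(): only remembers the iota argument. */
static int demo_register_ioat(uint64_t base, uint64_t limit, uint64_t iota)
{
	(void)base;
	(void)limit;
	registered_iota = iota;
	return 0;
}

/* Re-register the table using the converted address, not the raw one. */
int demo_reregister(uint64_t dma_table_vaddr)
{
	return demo_register_ioat(0, 0xffff,
				  demo_virt_to_phys(dma_table_vaddr));
}
```

With the offset set to 0 the converted and unconverted values are
identical, which is why the unconverted form works today.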

>
>> + if (rc)
>> + goto err;
>> +
>> + return rc;
>> +
>> +err:
>> + zdev->gd = 0;
>> + return rc;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_enable);
>> +
>> +int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
>> +{
>> + int rc;
>> +
>> + if (zdev->gd == 0)
>> + return -EINVAL;
>> +
>> + /* Remove the host CLP guest designation */
>> + zdev->gd = 0;
>> +
>> + if (zdev_enabled(zdev)) {
>> + rc = zpci_disable_device(zdev);
>> + if (rc)
>> + return rc;
>> + }
>> +
>> + rc = zpci_enable_device(zdev);
>> + if (rc)
>> + return rc;
>> +
>> + /* Re-register the IOMMU that was already created */
>> + rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
>> + (u64)zdev->dma_table);
>
> Same as above
>
>> +
>> + return rc;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_disable);
>> +
>>
> ---8<---
>


2021-12-08 15:33:35

by Matthew Rosato

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On 12/8/21 8:53 AM, Niklas Schnelle wrote:
> On Wed, 2021-12-08 at 14:09 +0100, Christian Borntraeger wrote:
>> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>>> A subsequent patch will be issuing SIC from KVM -- export the necessary
>>> routine and make the operation control definitions available from a header.
>>> Because the routine will now be exported, let's swap the purpose of
>>> zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
>>> within pci_irq.c only for SIC calls that don't specify an iib.
>>
>> Maybe it would be simpler to export the __ version instead of renaming everything.
>> Whatever Niklas prefers.
>
> See below; I think it's just not worth having both variants at all.
>
>>> Signed-off-by: Matthew Rosato <[email protected]>
>>> ---
>>> arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
>>> arch/s390/pci/pci_insn.c | 3 ++-
>>> arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
>>> 3 files changed, 25 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
>>> index 61cf9531f68f..5331082fa516 100644
>>> --- a/arch/s390/include/asm/pci_insn.h
>>> +++ b/arch/s390/include/asm/pci_insn.h
>>> @@ -98,6 +98,14 @@ struct zpci_fib {
>>> u32 gd;
>>> } __packed __aligned(8);
>>>
>>> +/* Set Interruption Controls Operation Controls */
>>> +#define SIC_IRQ_MODE_ALL 0
>>> +#define SIC_IRQ_MODE_SINGLE 1
>>> +#define SIC_IRQ_MODE_DIRECT 4
>>> +#define SIC_IRQ_MODE_D_ALL 16
>>> +#define SIC_IRQ_MODE_D_SINGLE 17
>>> +#define SIC_IRQ_MODE_SET_CPU 18
>>> +
>>> /* directed interruption information block */
>>> struct zpci_diib {
>>> u32 : 1;
>>> @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
>>> int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
>>> int __zpci_store_block(const u64 *data, u64 req, u64 offset);
>>> void zpci_barrier(void);
>>> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>>> -
>>> -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
>>> -{
>>> - union zpci_sic_iib iib = {{0}};
>>> -
>>> - return __zpci_set_irq_ctrl(ctl, isc, &iib);
>>> -}
>>> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>
> Since __zpci_set_irq_ctrl() was already non-static and non-inline, the
> inline to non-inline change above shouldn't make a performance difference.
>
> Looking at this makes me wonder, though: wouldn't it make sense to just
> have the zpci_set_irq_ctrl() function inline in the header? Its body is
> a single-instruction inline asm plus a test_facility(). The latter, by
> the way, also looks rather out of place there, considering that we
> call zpci_set_irq_ctrl() in the interrupt handler and facilities can't
> go away, so it's pretty silly to check for it on every single
> interrupt, unless I'm totally missing something.

This test_facility isn't new to this patch, it was added via

commit 48070c73058be6de9c0d754d441ed7092dfc8f12
Author: Christian Borntraeger <[email protected]>
Date: Mon Oct 30 14:38:58 2017 +0100

s390/pci: do not require AIS facility

It looks like in the past, we would not even initialize zpci at all if
AIS wasn't available. With this, we initialize PCI but only do the SIC
when we have AIS, which makes sense.

So for this patch, the sane thing to do is probably to keep the
test_facility() in place and move the function to the header as an inline.

Maybe there's a subsequent optimization to be made (set up a static key
like have_mio instead of calling test_facility() every time?).
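
To make the idea concrete, here is a minimal userspace sketch of
checking the facility once at init and testing only a cached flag on the
hot path. It uses a plain bool; in the kernel this would presumably be a
static key (DEFINE_STATIC_KEY_FALSE(), static_branch_enable(),
static_branch_unlikely()). All demo_* names are made up for this
example, and demo_test_facility() merely imitates test_facility(72).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Counts how often the (comparatively slow) facility lookup runs. */
int facility_checks;

/* Stand-in for test_facility(): pretend only facility 72 is installed. */
static bool demo_test_facility(int nr)
{
	facility_checks++;
	return nr == 72;
}

/* Cached result; a static key would take this role in the kernel. */
static bool demo_have_ais;

/* Run once during init: facilities cannot go away afterwards. */
void demo_irq_init(void)
{
	demo_have_ais = demo_test_facility(72);
}

/* Hot path: only the cached flag is consulted. */
int demo_set_irq_ctrl(void)
{
	if (!demo_have_ais)
		return -EIO;
	/* The SIC instruction would be issued here. */
	return 0;
}
```

However many interrupts arrive, the facility lookup runs once; whether
that is measurable on real hardware is not something this sketch shows.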

>
>>>
>>> #endif
>>> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
>>> index 28d863aaafea..d1a8bd43ce26 100644
>>> --- a/arch/s390/pci/pci_insn.c
>>> +++ b/arch/s390/pci/pci_insn.c
>>> @@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
>>> }
>>>
>>> /* Set Interruption Controls */
>>> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>>> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>>> {
>>> if (!test_facility(72))
>>> return -EIO;
>>> @@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>>>
>>> return 0;
>>> }
>>> +EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);
>>>
>>> /* PCI Load */
>>> static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
>>> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
>>> index dfd4f3276a6d..6b29e39496d1 100644
>>> --- a/arch/s390/pci/pci_irq.c
>>> +++ b/arch/s390/pci/pci_irq.c
>>> @@ -15,13 +15,6 @@
>>>
>>> static enum {FLOATING, DIRECTED} irq_delivery;
>>>
>>> -#define SIC_IRQ_MODE_ALL 0
>>> -#define SIC_IRQ_MODE_SINGLE 1
>>> -#define SIC_IRQ_MODE_DIRECT 4
>>> -#define SIC_IRQ_MODE_D_ALL 16
>>> -#define SIC_IRQ_MODE_D_SINGLE 17
>>> -#define SIC_IRQ_MODE_SET_CPU 18
>>> -
>>> /*
>>> * summary bit vector
>>> * FLOATING - summary bit per function
>>> @@ -145,6 +138,13 @@ static int zpci_set_irq_affinity(struct irq_data *data, const struct cpumask *de
>>> return IRQ_SET_MASK_OK;
>>> }
>>>
>>> +static inline int __zpci_set_irq_ctrl(u16 ctl, u8 isc)
>>> +{
>>> + union zpci_sic_iib iib = {{0}};
>>> +
>>> + return zpci_set_irq_ctrl(ctl, isc, &iib);
>>> +}
>>> +
>
> I would be totally fine with, and slightly prefer, having the zeroed iib
> repeated at those 3 call sites that don't need it. At first glance that
> should come out to pretty much the same number of lines of code, and it
> removes the potential confusion of swapping the __ prefixed and
> non-prefixed variants. What do you think?

Sure, I can do that.

>
>>> static struct irq_chip zpci_irq_chip = {
>>> .name = "PCI-MSI",
>>> .irq_unmask = pci_msi_unmask_irq,
>>> @@ -165,7 +165,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
>>>
> ---8<---
>


2021-12-08 15:59:18

by Niklas Schnelle

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On Wed, 2021-12-08 at 10:33 -0500, Matthew Rosato wrote:
> On 12/8/21 8:53 AM, Niklas Schnelle wrote:
> > On Wed, 2021-12-08 at 14:09 +0100, Christian Borntraeger wrote:
> > > Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> > > > A subsequent patch will be issuing SIC from KVM -- export the necessary
> > > > routine and make the operation control definitions available from a header.
> > > > Because the routine will now be exported, let's swap the purpose of
> > > > zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
> > > > within pci_irq.c only for SIC calls that don't specify an iib.
> > >
> > > Maybe it would be simpler to export the __ version instead of renaming everything.
> > > Whatever Niklas prefers.
> >
> > See below I think it's just not worth it having both variants at all.
> >
> > > > Signed-off-by: Matthew Rosato <[email protected]>
> > > > ---
> > > > arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
> > > > arch/s390/pci/pci_insn.c | 3 ++-
> > > > arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
> > > > 3 files changed, 25 insertions(+), 23 deletions(-)
> > > >
> > > > diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> > > > index 61cf9531f68f..5331082fa516 100644
> > > > --- a/arch/s390/include/asm/pci_insn.h
> > > > +++ b/arch/s390/include/asm/pci_insn.h
> > > > @@ -98,6 +98,14 @@ struct zpci_fib {
> > > > u32 gd;
> > > > } __packed __aligned(8);
> > > >
> > > > +/* Set Interruption Controls Operation Controls */
> > > > +#define SIC_IRQ_MODE_ALL 0
> > > > +#define SIC_IRQ_MODE_SINGLE 1
> > > > +#define SIC_IRQ_MODE_DIRECT 4
> > > > +#define SIC_IRQ_MODE_D_ALL 16
> > > > +#define SIC_IRQ_MODE_D_SINGLE 17
> > > > +#define SIC_IRQ_MODE_SET_CPU 18
> > > > +
> > > > /* directed interruption information block */
> > > > struct zpci_diib {
> > > > u32 : 1;
> > > > @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
> > > > int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
> > > > int __zpci_store_block(const u64 *data, u64 req, u64 offset);
> > > > void zpci_barrier(void);
> > > > -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> > > > -
> > > > -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> > > > -{
> > > > - union zpci_sic_iib iib = {{0}};
> > > > -
> > > > - return __zpci_set_irq_ctrl(ctl, isc, &iib);
> > > > -}
> > > > +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> >
> > Since the __zpci_set_irq_ctrl() was already non static/inline the above
> > inline to non-inline change shouldn't make a performance difference.
> >
> > Looking at this makes me wonder though. Wouldn't it make sense to just
> > have the zpci_set_irq_ctrl() function inline in the header. Its body is
> > a single instruction inline asm plus a test_facility(). The latter by
> > the way I think also looks rather out of place there considering we
> > call zpci_set_irq_ctrl() in the interrupt handler and facilities can't
> > go away so it's pretty silly to check for it on every single
> > interrupt.. unless I'm totally missing something.
>
> This test_facility isn't new to this patch

Yeah I got that part, your patch just made me look.

> , it was added via
>
> commit 48070c73058be6de9c0d754d441ed7092dfc8f12
> Author: Christian Borntraeger <[email protected]>
> Date: Mon Oct 30 14:38:58 2017 +0100
>
> s390/pci: do not require AIS facility
>
> It looks like in the past, we would not even initialize zpci at all if
> AIS wasn't available. With this, we initialize PCI but only do the SIC
> when we have AIS, which makes sense.

Ah yes I guess that is the something I was missing. I was wondering why
that wasn't just tested for during init.

>
> So for this patch, the sane thing to do is probably just keep the
> test_facility() in place and move to header, inline.

Yes sounds good.

>
> Maybe there's a subsequent optimization to be made (setup a static key
> like have_mio vs doing test_facility all the time?)

Yeah, looking again more closely at test_facility(), it's probably not
that expensive either; I'll do some tests. Maybe we can also just add a
comment and a normal unlikely() macro since with this series KVM would
also support AIS, correct?

>

---8<---


2021-12-08 16:19:05

by Niklas Schnelle

Subject: Re: [PATCH 10/32] s390/pci: stash dtsm and maxstbl

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> Store information about what IOAT designation types are supported by
> underlying hardware as well as the largest store block size allowed.
> These values will be needed by passthrough.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci.h | 2 ++
> arch/s390/include/asm/pci_clp.h | 6 ++++--
> arch/s390/pci/pci_clp.c | 2 ++
> 3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 2474b8d30f2a..1a8f9f42da3a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -126,9 +126,11 @@ struct zpci_dev {
> u32 gd; /* GISA designation for passthrough */
> u16 vfn; /* virtual function number */
> u16 pchid; /* physical channel ID */
> + u16 maxstbl; /* Maximum store block size */
> u8 pfgid; /* function group ID */
> u8 pft; /* pci function type */
> u8 port;
> + u8 dtsm; /* Supported DT mask */
> u8 rid_available : 1;
> u8 has_hp_slot : 1;
> u8 has_resources : 1;
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 3af8d196da74..124fadfb74b9 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -153,9 +153,11 @@ struct clp_rsp_query_pci_grp {
> u8 : 6;
> u8 frame : 1;
> u8 refresh : 1; /* TLB refresh mode */
> - u16 reserved2;
> + u16 : 3;
> + u16 maxstbl : 13; /* Maximum store block size */
> u16 mui;
> - u16 : 16;
> + u8 dtsm; /* Supported DT mask */
> + u8 reserved3;
> u16 maxfaal;
> u16 : 4;
> u16 dnoi : 12;
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index e9ed0e4a5cf0..bc7446566cbc 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -103,6 +103,8 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
> zdev->max_msi = response->noi;
> zdev->fmb_update = response->mui;
> zdev->version = response->version;
> + zdev->maxstbl = response->maxstbl;
> + zdev->dtsm = response->dtsm;
>
> switch (response->version) {
> case 1:

Reviewed-by: Niklas Schnelle <[email protected]>
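
The pci_clp.h hunk above splits a formerly reserved halfword into 3 reserved bits followed by a 13-bit maxstbl field. As a rough user-space sketch (not kernel code), the same value can be pulled out of the raw halfword with a mask. This assumes the s390 big-endian convention that the first declared bit-field occupies the most significant bits; decode_maxstbl() and MAXSTBL_MASK are hypothetical names used only for illustration.

```c
#include <stdint.h>

/*
 * Illustration only: decode the 16-bit halfword that holds maxstbl.
 * Assumed layout (from the quoted hunk, big-endian bit-field order):
 * 3 reserved high-order bits, then the 13-bit maxstbl value.
 */
#define MAXSTBL_MASK 0x1fffu	/* low 13 bits of the halfword */

/* raw[0] is the high-order byte, raw[1] the low-order byte. */
static inline uint16_t decode_maxstbl(const uint8_t raw[2])
{
	uint16_t halfword = (uint16_t)((raw[0] << 8) | raw[1]);

	/* Drop the 3 reserved high-order bits. */
	return halfword & MAXSTBL_MASK;
}
```

In the kernel itself the bit-field declaration does this work, so no explicit mask is needed there.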


2021-12-08 16:20:15

by Matthew Rosato

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On 12/8/21 10:59 AM, Niklas Schnelle wrote:
> On Wed, 2021-12-08 at 10:33 -0500, Matthew Rosato wrote:
>> On 12/8/21 8:53 AM, Niklas Schnelle wrote:
>>> On Wed, 2021-12-08 at 14:09 +0100, Christian Borntraeger wrote:
>>>> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>>>>> A subsequent patch will be issuing SIC from KVM -- export the necessary
>>>>> routine and make the operation control definitions available from a header.
>>>>> Because the routine will now be exported, let's swap the purpose of
>>>>> zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
>>>>> within pci_irq.c only for SIC calls that don't specify an iib.
>>>>
>>>> Maybe it would be simpler to export the __ version instead of renaming everything.
>>>> Whatever Niklas prefers.
>>>
>>> See below I think it's just not worth it having both variants at all.
>>>
>>>>> Signed-off-by: Matthew Rosato <[email protected]>
>>>>> ---
>>>>> arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
>>>>> arch/s390/pci/pci_insn.c | 3 ++-
>>>>> arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
>>>>> 3 files changed, 25 insertions(+), 23 deletions(-)
>>>>>
>>>>> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
>>>>> index 61cf9531f68f..5331082fa516 100644
>>>>> --- a/arch/s390/include/asm/pci_insn.h
>>>>> +++ b/arch/s390/include/asm/pci_insn.h
>>>>> @@ -98,6 +98,14 @@ struct zpci_fib {
>>>>> u32 gd;
>>>>> } __packed __aligned(8);
>>>>>
>>>>> +/* Set Interruption Controls Operation Controls */
>>>>> +#define SIC_IRQ_MODE_ALL 0
>>>>> +#define SIC_IRQ_MODE_SINGLE 1
>>>>> +#define SIC_IRQ_MODE_DIRECT 4
>>>>> +#define SIC_IRQ_MODE_D_ALL 16
>>>>> +#define SIC_IRQ_MODE_D_SINGLE 17
>>>>> +#define SIC_IRQ_MODE_SET_CPU 18
>>>>> +
>>>>> /* directed interruption information block */
>>>>> struct zpci_diib {
>>>>> u32 : 1;
>>>>> @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
>>>>> int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
>>>>> int __zpci_store_block(const u64 *data, u64 req, u64 offset);
>>>>> void zpci_barrier(void);
>>>>> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>>>>> -
>>>>> -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
>>>>> -{
>>>>> - union zpci_sic_iib iib = {{0}};
>>>>> -
>>>>> - return __zpci_set_irq_ctrl(ctl, isc, &iib);
>>>>> -}
>>>>> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>>>
>>> Since the __zpci_set_irq_ctrl() was already non static/inline the above
>>> inline to non-inline change shouldn't make a performance difference.
>>>
>>> Looking at this makes me wonder though. Wouldn't it make sense to just
>>> have the zpci_set_irq_ctrl() function inline in the header. Its body is
>>> a single instruction inline asm plus a test_facility(). The latter by
>>> the way I think also looks rather out of place there considering we
>>> call zpci_set_irq_ctrl() in the interrupt handler and facilities can't
>>> go away so it's pretty silly to check for it on every single
>>> interrupt.. unless I'm totally missing something.
>>
>> This test_facility isn't new to this patch
>
> Yeah I got that part, your patch just made me look.
>
>> , it was added via
>>
>> commit 48070c73058be6de9c0d754d441ed7092dfc8f12
>> Author: Christian Borntraeger <[email protected]>
>> Date: Mon Oct 30 14:38:58 2017 +0100
>>
>> s390/pci: do not require AIS facility
>>
>> It looks like in the past, we would not even initialize zpci at all if
>> AIS wasn't available. With this, we initialize PCI but only do the SIC
>> when we have AIS, which makes sense.
>
> Ah yes I guess that is the something I was missing. I was wondering why
> that wasn't just tested for during init.
>
>>
>> So for this patch, the sane thing to do is probably just keep the
>> test_facility() in place and move to header, inline.
>
> Yes sounds good.
>
>>
>> Maybe there's a subsequent optimization to be made (setup a static key
>> like have_mio vs doing test_facility all the time?)
>
> Yeah, looking again more closely at test_facility(), it's probably not
> that expensive either; I'll do some tests. Maybe we can also just add a
> comment and a normal unlikely() macro since with this series KVM would
> also support AIS, correct?

AIS was already being set as a KVM facility / allowed as QEMU capability
before this series, however there was a period of time where QEMU was
disabling it (disabled in QEMU 3f2d07b3b01e, enabled again in QEMU
a5c8617af691) which I suspect was the impetus for this kernel change;
this means that there are older machines that won't have it, but moving
forward we should be OK in the standard case. Of course the kernel
should still be able to tolerate the case where AIS is unavailable (old
machine, intentionally forced off, etc), so maybe the unlikely indeed
makes the most sense.

As for a comment on the unlikely(), I could add something like 'some
virtualized environments may have disabled the AIS facility'?



2021-12-08 16:41:25

by Niklas Schnelle

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On Wed, 2021-12-08 at 11:20 -0500, Matthew Rosato wrote:
> On 12/8/21 10:59 AM, Niklas Schnelle wrote:
> > On Wed, 2021-12-08 at 10:33 -0500, Matthew Rosato wrote:
> > > On 12/8/21 8:53 AM, Niklas Schnelle wrote:
> > > > On Wed, 2021-12-08 at 14:09 +0100, Christian Borntraeger wrote:
> > > > > Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> > > > > > A subsequent patch will be issuing SIC from KVM -- export the necessary
> > > > > > routine and make the operation control definitions available from a header.
> > > > > > Because the routine will now be exported, let's swap the purpose of
> > > > > > zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
> > > > > > within pci_irq.c only for SIC calls that don't specify an iib.
> > > > >
> > > > > Maybe it would be simpler to export the __ version instead of renaming everything.
> > > > > Whatever Niklas prefers.
> > > >
> > > > See below I think it's just not worth it having both variants at all.
> > > >
> > > > > > Signed-off-by: Matthew Rosato <[email protected]>
> > > > > > ---
> > > > > > arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
> > > > > > arch/s390/pci/pci_insn.c | 3 ++-
> > > > > > arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
> > > > > > 3 files changed, 25 insertions(+), 23 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> > > > > > index 61cf9531f68f..5331082fa516 100644
> > > > > > --- a/arch/s390/include/asm/pci_insn.h
> > > > > > +++ b/arch/s390/include/asm/pci_insn.h
> > > > > > @@ -98,6 +98,14 @@ struct zpci_fib {
> > > > > > u32 gd;
> > > > > > } __packed __aligned(8);
> > > > > >
> > > > > > +/* Set Interruption Controls Operation Controls */
> > > > > > +#define SIC_IRQ_MODE_ALL 0
> > > > > > +#define SIC_IRQ_MODE_SINGLE 1
> > > > > > +#define SIC_IRQ_MODE_DIRECT 4
> > > > > > +#define SIC_IRQ_MODE_D_ALL 16
> > > > > > +#define SIC_IRQ_MODE_D_SINGLE 17
> > > > > > +#define SIC_IRQ_MODE_SET_CPU 18
> > > > > > +
> > > > > > /* directed interruption information block */
> > > > > > struct zpci_diib {
> > > > > > u32 : 1;
> > > > > > @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
> > > > > > int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
> > > > > > int __zpci_store_block(const u64 *data, u64 req, u64 offset);
> > > > > > void zpci_barrier(void);
> > > > > > -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> > > > > > -
> > > > > > -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> > > > > > -{
> > > > > > - union zpci_sic_iib iib = {{0}};
> > > > > > -
> > > > > > - return __zpci_set_irq_ctrl(ctl, isc, &iib);
> > > > > > -}
> > > > > > +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> > > >
> > > > Since the __zpci_set_irq_ctrl() was already non static/inline the above
> > > > inline to non-inline change shouldn't make a performance difference.
> > > >
> > > > Looking at this makes me wonder though. Wouldn't it make sense to just
> > > > have the zpci_set_irq_ctrl() function inline in the header. Its body is
> > > > a single instruction inline asm plus a test_facility(). The latter by
> > > > the way I think also looks rather out of place there considering we
> > > > call zpci_set_irq_ctrl() in the interrupt handler and facilities can't
> > > > go away so it's pretty silly to check for it on every single
> > > > interrupt.. unless I'm totally missing something.
> > >
> > > This test_facility isn't new to this patch
> >
> > Yeah I got that part, your patch just made me look.
> >
> > > , it was added via
> > >
> > > commit 48070c73058be6de9c0d754d441ed7092dfc8f12
> > > Author: Christian Borntraeger <[email protected]>
> > > Date: Mon Oct 30 14:38:58 2017 +0100
> > >
> > > s390/pci: do not require AIS facility
> > >
> > > It looks like in the past, we would not even initialize zpci at all if
> > > AIS wasn't available. With this, we initialize PCI but only do the SIC
> > > when we have AIS, which makes sense.
> >
> > Ah yes I guess that is the something I was missing. I was wondering why
> > that wasn't just tested for during init.
> >
> > > So for this patch, the sane thing to do is probably just keep the
> > > test_facility() in place and move to header, inline.
> >
> > Yes sounds good.
> >
> > > Maybe there's a subsequent optimization to be made (setup a static key
> > > like have_mio vs doing test_facility all the time?)
> >
> > Yeah, looking again more closely at test_facility(), it's probably not
> > that expensive either; I'll do some tests. Maybe we can also just add a
> > comment and a normal unlikely() macro since with this series KVM would
> > also support AIS, correct?
> AIS was already being set as a KVM facility / allowed as QEMU capability
> before this series, however there was a period of time where QEMU was
> disabling it (disabled in QEMU 3f2d07b3b01e, enabled again in QEMU
> a5c8617af691) which I suspect was the impetus for this kernel change;
> this means that there are older machines that won't have it, but moving
> forward we should be OK in the standard case. Of course the kernel
> should still be able to tolerate the case where AIS is unavailable (old
> machine, intentionally forced off, etc), so maybe the unlikely indeed
> makes the most sense.

Thanks for the background!

>
> As far as a comment for the unlikely I could add something like 'some
> virtualized environments may have disabled the AIS facility'?

I think we should add the unlikely() and comment in a separate patch
so that this one really doesn't change behavior, only the call
signature and export.
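
To make the proposed follow-up concrete, here is a minimal user-space sketch of the facility check wrapped in unlikely(), with the comment suggested earlier in the thread. It is not the kernel implementation: mock_facilities, mock_set_facility(), mock_test_facility() and mock_set_irq_ctrl() are hypothetical stand-ins, unlikely() is defined locally via the GCC/Clang __builtin_expect() builtin, and facility number 72 for AIS is an assumption taken from the existing check in pci_insn.c.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel macro; needs GCC or Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Assumed AIS facility number, for illustration. */
#define MOCK_AIS_FACILITY 72

/* Hypothetical stand-in for the facility bit list (256 facility bits). */
static uint8_t mock_facilities[32];

/* Facility bits are numbered from the most significant bit of byte 0. */
static void mock_set_facility(unsigned int nr)
{
	mock_facilities[nr >> 3] |= (uint8_t)(0x80 >> (nr & 7));
}

static bool mock_test_facility(unsigned int nr)
{
	return (mock_facilities[nr >> 3] & (0x80 >> (nr & 7))) != 0;
}

static int mock_set_irq_ctrl(void)
{
	/* Some virtualized environments may have disabled the AIS facility. */
	if (unlikely(!mock_test_facility(MOCK_AIS_FACILITY)))
		return -EIO;

	/* The real routine would issue the SIC instruction here. */
	return 0;
}
```

The hint only affects code layout for the expected case; the check still tolerates a missing facility, which is the behavior the thread wants to preserve.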


2021-12-08 18:04:33

by Matthew Rosato

Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

On 12/8/21 5:30 AM, Niklas Schnelle wrote:
> On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
>> Add a routine that will perform a shadow operation between a guest
>> and host IOAT. A subsequent patch will invoke this in response to
>> an 04 RPCIT instruction intercept.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>> arch/s390/include/asm/kvm_pci.h | 1 +
>> arch/s390/include/asm/pci_dma.h | 1 +
>> arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
>> arch/s390/kvm/pci.h | 4 +-
>> 4 files changed, 196 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
>> index 254275399f21..97e3a369135d 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
>> struct kvm_zdev {
>> struct zpci_dev *zdev;
>> struct kvm *kvm;
>> + u64 rpcit_count;
>> struct kvm_zdev_ioat ioat;
>> struct zpci_fib fib;
>> };
>> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
>> index e1d3c1d3fc8a..0ca15e5db3d9 100644
>> --- a/arch/s390/include/asm/pci_dma.h
>> +++ b/arch/s390/include/asm/pci_dma.h
>> @@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
>> #define ZPCI_TABLE_ENTRIES (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
>> #define ZPCI_TABLE_PAGES (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
>> #define ZPCI_TABLE_ENTRIES_PAGES (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
>> +#define ZPCI_TABLE_ENTRIES_PER_PAGE (ZPCI_TABLE_ENTRIES / ZPCI_TABLE_PAGES)
>>
>> #define ZPCI_TABLE_BITS 11
>> #define ZPCI_PT_BITS 8
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index a1c0c0881332..858c5ecdc8b9 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -123,6 +123,195 @@ int kvm_s390_pci_aen_init(u8 nisc)
>> return rc;
>> }
>>
>> +static int dma_shadow_cpu_trans(struct kvm_vcpu *vcpu, unsigned long *entry,
>> + unsigned long *gentry)
>> +{
>> + unsigned long idx;
>> + struct page *page;
>> + void *gaddr = NULL;
>> + kvm_pfn_t pfn;
>> + gpa_t addr;
>> + int rc = 0;
>> +
>> + if (pt_entry_isvalid(*gentry)) {
>> + /* pin and validate */
>> + addr = *gentry & ZPCI_PTE_ADDR_MASK;
>> + idx = srcu_read_lock(&vcpu->kvm->srcu);
>> + page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
>> + srcu_read_unlock(&vcpu->kvm->srcu, idx);
>> + if (is_error_page(page))
>> + return -EIO;
>> + gaddr = page_to_virt(page) + (addr & ~PAGE_MASK);
>
> Hmm, this looks like a virtual vs physical address mixup to me that is
> currently not a problem because kernel virtual addresses are equal to
> their physical address. Here page_to_virt(page) gives us a virtual
> address but the entries in the I/O translation table have to be
> physical (aka absolute) addresses.
>
> With my commit "s390/pci: use physical addresses in DMA tables"
> currently in the s390 feature branch this is also reflected in the
> argument types taken by set_pt_pfaa() below so gaddr should have type
> phys_addr_t not void *. That should also remove the need for the cast
> to unsigned long for the duplicate check.

Right... Like the other comment re: virtual vs physical address I will
take a look and fix for v2.
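
One possible shape of the v2 fix, sketched in user space under stated assumptions: keep the address as a physical address type and add the in-page offset of the guest address to the physical address of the pinned page (in the kernel, presumably what page_to_phys() would supply instead of page_to_virt()). mock_phys_addr_t, the MOCK_PAGE_* macros and mock_host_phys() are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <stdint.h>

/* Hypothetical stand-ins; 4 KiB pages assumed. */
typedef uint64_t mock_phys_addr_t;

#define MOCK_PAGE_SHIFT	12
#define MOCK_PAGE_SIZE	((uint64_t)1 << MOCK_PAGE_SHIFT)
#define MOCK_PAGE_MASK	(~(MOCK_PAGE_SIZE - 1))

/*
 * page_phys:  physical address of the pinned host page
 * guest_addr: guest address taken from the guest table entry
 *
 * Returns the physical address to store in the shadow entry: the page
 * base plus the offset of guest_addr within its page.
 */
static inline mock_phys_addr_t mock_host_phys(mock_phys_addr_t page_phys,
					      uint64_t guest_addr)
{
	return page_phys + (guest_addr & ~MOCK_PAGE_MASK);
}
```

With an integer physical-address type throughout, the duplicate check mentioned in the review can compare values directly, without a cast from void *.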

2021-12-08 18:19:09

by Niklas Schnelle

Subject: Re: [PATCH 07/32] s390/pci: externalize the SIC operation controls and routine

On Wed, 2021-12-08 at 17:41 +0100, Niklas Schnelle wrote:
> On Wed, 2021-12-08 at 11:20 -0500, Matthew Rosato wrote:
> > On 12/8/21 10:59 AM, Niklas Schnelle wrote:
> > > On Wed, 2021-12-08 at 10:33 -0500, Matthew Rosato wrote:
> > > > On 12/8/21 8:53 AM, Niklas Schnelle wrote:
> > > > > On Wed, 2021-12-08 at 14:09 +0100, Christian Borntraeger wrote:
> > > > > > Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> > > > > > > A subsequent patch will be issuing SIC from KVM -- export the necessary
> > > > > > > routine and make the operation control definitions available from a header.
> > > > > > > Because the routine will now be exported, let's swap the purpose of
> > > > > > > zpci_set_irq_ctrl and __zpci_set_irq_ctrl, leaving the latter as a static
> > > > > > > within pci_irq.c only for SIC calls that don't specify an iib.
> > > > > >
> > > > > > Maybe it would be simpler to export the __ version instead of renaming everything.
> > > > > > Whatever Niklas prefers.
> > > > >
> > > > > See below I think it's just not worth it having both variants at all.
> > > > >
> > > > > > > Signed-off-by: Matthew Rosato <[email protected]>
> > > > > > > ---
> > > > > > > arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
> > > > > > > arch/s390/pci/pci_insn.c | 3 ++-
> > > > > > > arch/s390/pci/pci_irq.c | 28 ++++++++++++++--------------
> > > > > > > 3 files changed, 25 insertions(+), 23 deletions(-)
> > > > > > >
> > > > > > > diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> > > > > > > index 61cf9531f68f..5331082fa516 100644
> > > > > > > --- a/arch/s390/include/asm/pci_insn.h
> > > > > > > +++ b/arch/s390/include/asm/pci_insn.h
> > > > > > > @@ -98,6 +98,14 @@ struct zpci_fib {
> > > > > > > u32 gd;
> > > > > > > } __packed __aligned(8);
> > > > > > >
> > > > > > > +/* Set Interruption Controls Operation Controls */
> > > > > > > +#define SIC_IRQ_MODE_ALL 0
> > > > > > > +#define SIC_IRQ_MODE_SINGLE 1
> > > > > > > +#define SIC_IRQ_MODE_DIRECT 4
> > > > > > > +#define SIC_IRQ_MODE_D_ALL 16
> > > > > > > +#define SIC_IRQ_MODE_D_SINGLE 17
> > > > > > > +#define SIC_IRQ_MODE_SET_CPU 18
> > > > > > > +
> > > > > > > /* directed interruption information block */
> > > > > > > struct zpci_diib {
> > > > > > > u32 : 1;
> > > > > > > @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
> > > > > > > int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
> > > > > > > int __zpci_store_block(const u64 *data, u64 req, u64 offset);
> > > > > > > void zpci_barrier(void);
> > > > > > > -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> > > > > > > -
> > > > > > > -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> > > > > > > -{
> > > > > > > - union zpci_sic_iib iib = {{0}};
> > > > > > > -
> > > > > > > - return __zpci_set_irq_ctrl(ctl, isc, &iib);
> > > > > > > -}
> > > > > > > +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> > > > >
> > > > > Since the __zpci_set_irq_ctrl() was already non static/inline the above
> > > > > inline to non-inline change shouldn't make a performance difference.
> > > > >
> > > > > Looking at this makes me wonder though. Wouldn't it make sense to just
> > > > > have the zpci_set_irq_ctrl() function inline in the header. Its body is
> > > > > a single instruction inline asm plus a test_facility(). The latter by
> > > > > the way I think also looks rather out of place there considering we
> > > > > call zpci_set_irq_ctrl() in the interrupt handler and facilities can't
> > > > > go away so it's pretty silly to check for it on every single
> > > > > interrupt.. unless I'm totally missing something.
> > > >
> > > > This test_facility isn't new to this patch
> > >
> > > Yeah I got that part, your patch just made me look.
> > >
> > > > , it was added via
> > > >
> > > > commit 48070c73058be6de9c0d754d441ed7092dfc8f12
> > > > Author: Christian Borntraeger <[email protected]>
> > > > Date: Mon Oct 30 14:38:58 2017 +0100
> > > >
> > > > s390/pci: do not require AIS facility
> > > >
> > > > It looks like in the past, we would not even initialize zpci at all if
> > > > AIS wasn't available. With this, we initialize PCI but only do the SIC
> > > > when we have AIS, which makes sense.
> > >
> > > Ah yes I guess that is the something I was missing. I was wondering why
> > > that wasn't just tested for during init.
> > >
> > > > So for this patch, the sane thing to do is probably just keep the
> > > > test_facility() in place and move to header, inline.
> > >
> > > Yes sounds good.

As discussed out of band, slight change of plan. Let's keep the
implementation in pci_insn.c for now but remove the __* prefix and the
iib 0 wrapper. This way we get rid of potential confusion of swapping
what each variant does and we also don't need to export a __* prefixed
function. I tried it out locally and having the iib 0 at the callsites
indeed doesn't look worse.
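
A minimal user-space sketch of what such a call site might look like, assuming the single three-argument signature from the quoted diff. The union layout, mock_zpci_set_irq_ctrl() and mock_rearm_single() are hypothetical stand-ins; only the SIC_IRQ_MODE_SINGLE value and the `{{0}}` initializer come from the patch.

```c
#include <stdint.h>
#include <string.h>

#define SIC_IRQ_MODE_SINGLE 1	/* value as quoted in the patch */

/* Hypothetical stand-in for union zpci_sic_iib; the real layout differs. */
union mock_sic_iib {
	uint64_t raw[2];
};

/* Records what the mock was called with so a call site can be checked. */
static uint16_t mock_seen_ctl;
static int mock_seen_zero_iib;

/* Stand-in for the single, un-prefixed routine taking an explicit iib. */
static int mock_zpci_set_irq_ctrl(uint16_t ctl, uint8_t isc,
				  union mock_sic_iib *iib)
{
	static const union mock_sic_iib zero;

	(void)isc;
	mock_seen_ctl = ctl;
	mock_seen_zero_iib = (memcmp(iib, &zero, sizeof(zero)) == 0);
	return 0;
}

/* Call site that needs no iib contents: pass a zeroed one explicitly. */
static int mock_rearm_single(uint8_t isc)
{
	union mock_sic_iib iib = {{0}};

	return mock_zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc, &iib);
}
```

Each such call site gains one local declaration, which matches the observation that the zeroed iib at the call sites does not look worse than the wrapper.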


2021-12-09 09:12:56

by Pierre Morel

Subject: Re: [PATCH 10/32] s390/pci: stash dtsm and maxstbl



On 12/7/21 21:57, Matthew Rosato wrote:
> Store information about what IOAT designation types are supported by
> underlying hardware as well as the largest store block size allowed.
> These values will be needed by passthrough.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci.h | 2 ++
> arch/s390/include/asm/pci_clp.h | 6 ++++--
> arch/s390/pci/pci_clp.c | 2 ++
> 3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 2474b8d30f2a..1a8f9f42da3a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -126,9 +126,11 @@ struct zpci_dev {
> u32 gd; /* GISA designation for passthrough */
> u16 vfn; /* virtual function number */
> u16 pchid; /* physical channel ID */
> + u16 maxstbl; /* Maximum store block size */
> u8 pfgid; /* function group ID */
> u8 pft; /* pci function type */
> u8 port;
> + u8 dtsm; /* Supported DT mask */
> u8 rid_available : 1;
> u8 has_hp_slot : 1;
> u8 has_resources : 1;
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 3af8d196da74..124fadfb74b9 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -153,9 +153,11 @@ struct clp_rsp_query_pci_grp {
> u8 : 6;
> u8 frame : 1;
> u8 refresh : 1; /* TLB refresh mode */
> - u16 reserved2;
> + u16 : 3;
> + u16 maxstbl : 13; /* Maximum store block size */
> u16 mui;
> - u16 : 16;
> + u8 dtsm; /* Supported DT mask */
> + u8 reserved3;
> u16 maxfaal;
> u16 : 4;
> u16 dnoi : 12;
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index e9ed0e4a5cf0..bc7446566cbc 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -103,6 +103,8 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
> zdev->max_msi = response->noi;
> zdev->fmb_update = response->mui;
> zdev->version = response->version;
> + zdev->maxstbl = response->maxstbl;
> + zdev->dtsm = response->dtsm;
>
> switch (response->version) {
> case 1:
>

Reviewed-by: Pierre Morel <[email protected]>


--
Pierre Morel
IBM Lab Boeblingen

2021-12-09 15:08:08

by Christian Borntraeger

Subject: Re: [PATCH 08/32] s390/pci: stash associated GISA designation



Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> For passthrough devices, we will need to know the GISA designation of the
> guest if interpretation facilities are to be used. Setup to stash this in
> the zdev and set a default of 0 (no GISA designation) for now; a subsequent
> patch will set a valid GISA designation for passthrough devices.
> Also, extend mpcifc routines to specify this stashed designation as part
> of the mpcifc command.
>
> Reviewed-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/include/asm/pci.h | 1 +
> arch/s390/include/asm/pci_clp.h | 3 ++-
> arch/s390/pci/pci.c | 9 +++++++++
> arch/s390/pci/pci_clp.c | 1 +
> arch/s390/pci/pci_irq.c | 5 +++++
> 5 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 90824be5ce9a..2474b8d30f2a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -123,6 +123,7 @@ struct zpci_dev {
> enum zpci_state state;
> u32 fid; /* function ID, used by sclp */
> u32 fh; /* function handle, used by insn's */
> + u32 gd; /* GISA designation for passthrough */
> u16 vfn; /* virtual function number */
> u16 pchid; /* physical channel ID */
> u8 pfgid; /* function group ID */
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 1f4b666e85ee..3af8d196da74 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -173,7 +173,8 @@ struct clp_req_set_pci {
> u16 reserved2;
> u8 oc; /* operation controls */
> u8 ndas; /* number of dma spaces */
> - u64 reserved3;
> + u32 reserved3;
> + u32 gd; /* GISA designation */
> } __packed;
>
> /* Set PCI function response */
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 2f9b78fa82a5..9b4d3d78b444 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -119,6 +119,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
> fib.pba = base;
> fib.pal = limit;
> fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc)
> zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -132,6 +133,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
> struct zpci_fib fib = {0};
> u8 cc, status;
>
> + fib.gd = zdev->gd;
> +
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc)
> zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -159,6 +162,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
> atomic64_set(&zdev->unmapped_pages, 0);
>
> fib.fmb_addr = virt_to_phys(zdev->fmb);
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc) {
> kmem_cache_free(zdev_fmb_cache, zdev->fmb);
> @@ -177,6 +181,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
> if (!zdev->fmb)
> return -EINVAL;
>
> + fib.gd = zdev->gd;
> +
> /* Function measurement is disabled if fmb address is zero */
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3) /* Function already gone. */
> @@ -807,6 +813,9 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, enum zpci_state state)
> zdev->fid = fid;
> zdev->fh = fh;
>
> + /* For now, assume it is not a passthrough device */
> + zdev->gd = 0;
> +
> /* Query function properties and update zdev */
> rc = clp_query_pci_fn(zdev);
> if (rc)
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index be077b39da33..e9ed0e4a5cf0 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -240,6 +240,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
> rrb->request.fh = zdev->fh;
> rrb->request.oc = command;
> rrb->request.ndas = nr_dma_as;
> + rrb->request.gd = zdev->gd;
>
> rc = clp_req(rrb, CLP_LPS_PCI);
> if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 6b29e39496d1..9e8b4507234d 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
> fib.fmt0.aibvo = 0; /* each zdev has its own interrupt vector */
> fib.fmt0.aisb = (unsigned long) zpci_sbv->vector + (zdev->aisb/64)*8;
> fib.fmt0.aisbo = zdev->aisb & 63;
> + fib.gd = zdev->gd;
>
> return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> }
> @@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
> struct zpci_fib fib = {0};
> u8 cc, status;
>
> + fib.gd = zdev->gd;
> +
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3 || (cc == 1 && status == 24))
> /* Function already gone or IRQs already deregistered. */
> @@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
> fib.fmt = 1;
> fib.fmt1.noi = zdev->msi_nr_irqs;
> fib.fmt1.dibvo = zdev->msi_first_bit;
> + fib.gd = zdev->gd;
>
> return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> }
> @@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
> u8 cc, status;
>
> fib.fmt = 1;
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3 || (cc == 1 && status == 24))
> /* Function already gone or IRQs already deregistered. */
>

2021-12-09 15:21:11

by Christian Borntraeger

Subject: Re: [PATCH 09/32] s390/pci: export some routines related to RPCIT processing



On 07.12.21 at 21:57, Matthew Rosato wrote:
> KVM will re-use dma_walk_cpu_trans to walk the host shadow table and
> will also need to be able to call zpci_refresh_trans to re-issue a RPCIT.
>
> Reviewed-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Makes sense

Acked-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/pci/pci_dma.c | 1 +
> arch/s390/pci/pci_insn.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
> index 1f4540d6bd2d..ae55f2f2ecd9 100644
> --- a/arch/s390/pci/pci_dma.c
> +++ b/arch/s390/pci/pci_dma.c
> @@ -116,6 +116,7 @@ unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr)
> px = calc_px(dma_addr);
> return &pto[px];
> }
> +EXPORT_SYMBOL_GPL(dma_walk_cpu_trans);
>
> void dma_update_cpu_trans(unsigned long *entry, void *page_addr, int flags)
> {
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index d1a8bd43ce26..0d1ab268ec24 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -95,6 +95,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
>
> return (cc) ? -EIO : 0;
> }
> +EXPORT_SYMBOL_GPL(zpci_refresh_trans);
>
> /* Set Interruption Controls */
> int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>

2021-12-09 15:25:33

by Christian Borntraeger

Subject: Re: [PATCH 10/32] s390/pci: stash dtsm and maxstbl



On 07.12.21 at 21:57, Matthew Rosato wrote:
> Store information about what IOAT designation types are supported by
> underlying hardware as well as the largest store block size allowed.
> These values will be needed by passthrough.
>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/include/asm/pci.h | 2 ++
> arch/s390/include/asm/pci_clp.h | 6 ++++--
> arch/s390/pci/pci_clp.c | 2 ++
> 3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 2474b8d30f2a..1a8f9f42da3a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -126,9 +126,11 @@ struct zpci_dev {
> u32 gd; /* GISA designation for passthrough */
> u16 vfn; /* virtual function number */
> u16 pchid; /* physical channel ID */
> + u16 maxstbl; /* Maximum store block size */
> u8 pfgid; /* function group ID */
> u8 pft; /* pci function type */
> u8 port;
> + u8 dtsm; /* Supported DT mask */
> u8 rid_available : 1;
> u8 has_hp_slot : 1;
> u8 has_resources : 1;
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 3af8d196da74..124fadfb74b9 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -153,9 +153,11 @@ struct clp_rsp_query_pci_grp {
> u8 : 6;
> u8 frame : 1;
> u8 refresh : 1; /* TLB refresh mode */
> - u16 reserved2;
> + u16 : 3;
> + u16 maxstbl : 13; /* Maximum store block size */
> u16 mui;
> - u16 : 16;
> + u8 dtsm; /* Supported DT mask */
> + u8 reserved3;
> u16 maxfaal;
> u16 : 4;
> u16 dnoi : 12;
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index e9ed0e4a5cf0..bc7446566cbc 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -103,6 +103,8 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
> zdev->max_msi = response->noi;
> zdev->fmb_update = response->mui;
> zdev->version = response->version;
> + zdev->maxstbl = response->maxstbl;
> + zdev->dtsm = response->dtsm;
>
> switch (response->version) {
> case 1:
>

2021-12-09 15:28:48

by Christian Borntraeger

Subject: Re: [PATCH 11/32] s390/pci: add helper function to find device by handle



On 07.12.21 at 21:57, Matthew Rosato wrote:
> Intercepted zPCI instructions will specify the desired function via a
> function handle. Add a routine to find the device with the specified
> handle.
>
> Acked-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

I guess we do not have hundreds of devices, so this should be fast enough.
Long term, with hundreds of VFs, we might want to turn zpci_list into a
tree, but for now this is fine, as it works just like get_zdev_by_fid().

Reviewed-by: Christian Borntraeger <[email protected]>
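
For readers following along, the linear lookup discussed above can be sketched in plain userspace C. This is only an illustration of the O(n) scan, not the kernel code: struct demo_zdev, demo_add() and demo_get_by_fh() are hypothetical names, the list is a simple singly linked stand-in for zpci_list, and the locking done under zpci_list_lock in the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct zpci_dev; only the fields used here. */
struct demo_zdev {
	uint32_t fid;           /* function ID */
	uint32_t fh;            /* function handle */
	struct demo_zdev *next; /* singly linked stand-in for zpci_list */
};

static struct demo_zdev *demo_list;

/* Push a device onto the front of the list (no locking in this sketch). */
static void demo_add(struct demo_zdev *zdev)
{
	zdev->next = demo_list;
	demo_list = zdev;
}

/* Linear scan by function handle; returns NULL if no device matches. */
static struct demo_zdev *demo_get_by_fh(uint32_t fh)
{
	struct demo_zdev *tmp;

	for (tmp = demo_list; tmp; tmp = tmp->next) {
		if (tmp->fh == fh)
			return tmp;
	}
	return NULL;
}
```

Each lookup walks at most the whole list, which is why the cost only starts to matter once the number of functions grows into the hundreds.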

> ---
> arch/s390/include/asm/pci.h | 1 +
> arch/s390/pci/pci.c | 16 ++++++++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 1a8f9f42da3a..00a2c24d6d2b 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -275,6 +275,7 @@ static inline struct zpci_dev *to_zpci_dev(struct device *dev)
> }
>
> struct zpci_dev *get_zdev_by_fid(u32);
> +struct zpci_dev *get_zdev_by_fh(u32 fh);
>
> /* DMA */
> int zpci_dma_init(void);
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 9b4d3d78b444..af1c0ae017b1 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -76,6 +76,22 @@ struct zpci_dev *get_zdev_by_fid(u32 fid)
> return zdev;
> }
>
> +struct zpci_dev *get_zdev_by_fh(u32 fh)
> +{
> + struct zpci_dev *tmp, *zdev = NULL;
> +
> + spin_lock(&zpci_list_lock);
> + list_for_each_entry(tmp, &zpci_list, entry) {
> + if (tmp->fh == fh) {
> + zdev = tmp;
> + break;
> + }
> + }
> + spin_unlock(&zpci_list_lock);
> + return zdev;
> +}
> +EXPORT_SYMBOL_GPL(get_zdev_by_fh);
> +
> void zpci_remove_reserved_devices(void)
> {
> struct zpci_dev *tmp, *zdev;
>

2021-12-09 15:47:14

by Christian Borntraeger

Subject: Re: [PATCH 12/32] s390/pci: get SHM information from list pci

On 07.12.21 at 21:57, Matthew Rosato wrote:
> KVM will need information on the special handle mask used to indicate
> emulated devices. In order to obtain this, a new type of list pci call
> must be made to gather the information. Remove the unused data pointer
> from clp_list_pci and __clp_add and instead optionally pass a pointer to
> a model-dependent-data field. Additionally, allow for clp_list_pci calls
> that don't specify a callback - in this case, just do the first pass of
> list pci and exit.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci.h | 6 ++++++
> arch/s390/include/asm/pci_clp.h | 2 +-
> arch/s390/pci/pci.c | 19 +++++++++++++++++++
> arch/s390/pci/pci_clp.c | 16 ++++++++++------
> 4 files changed, 36 insertions(+), 7 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 00a2c24d6d2b..86f43644756d 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -219,12 +219,18 @@ int zpci_unregister_ioat(struct zpci_dev *, u8);
> void zpci_remove_reserved_devices(void);
> void zpci_update_fh(struct zpci_dev *zdev, u32 fh);
>
> +int zpci_get_mdd(u32 *mdd);
> +
> /* CLP */
> +void *clp_alloc_block(gfp_t gfp_mask);
> +void clp_free_block(void *ptr);
> int clp_setup_writeback_mio(void);
> int clp_scan_pci_devices(void);
> int clp_query_pci_fn(struct zpci_dev *zdev);
> int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
> int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
> +int clp_list_pci(struct clp_req_rsp_list_pci *rrb, u32 *mdd,
> + void (*cb)(struct clp_fh_list_entry *));
> int clp_get_state(u32 fid, enum zpci_state *state);
> int clp_refresh_fh(u32 fid, u32 *fh);
>
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 124fadfb74b9..d6bc324763f3 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -76,7 +76,7 @@ struct clp_req_list_pci {
> struct clp_rsp_list_pci {
> struct clp_rsp_hdr hdr;
> u64 resume_token;
> - u32 reserved2;
> + u32 mdd;
> u16 max_fn;
> u8 : 7;
> u8 uid_checking : 1;
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index af1c0ae017b1..175854c861cd 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -531,6 +531,25 @@ void zpci_update_fh(struct zpci_dev *zdev, u32 fh)
> zpci_do_update_iomap_fh(zdev, fh);
> }
>
> +int zpci_get_mdd(u32 *mdd)
> +{
> + struct clp_req_rsp_list_pci *rrb;
> + int rc;
> +
> + if (!mdd)
> + return -EINVAL;
> +
> + rrb = clp_alloc_block(GFP_KERNEL);
> + if (!rrb)
> + return -ENOMEM;
> +
> + rc = clp_list_pci(rrb, mdd, NULL);
> +
> + clp_free_block(rrb);
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(zpci_get_mdd);

Maybe move this into pci_clp.c to avoid exporting clp_alloc_block() and clp_free_block()?
Niklas?
In any case, the code looks correct from a HW perspective.

Reviewed-by: Christian Borntraeger <[email protected]>


> +
> static struct resource *__alloc_res(struct zpci_dev *zdev, unsigned long start,
> unsigned long size, unsigned long flags)
> {
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index bc7446566cbc..e18a548ac22d 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -84,12 +84,12 @@ static __always_inline int clp_req(void *data, unsigned int lps)
> return cc;
> }
>
> -static void *clp_alloc_block(gfp_t gfp_mask)
> +void *clp_alloc_block(gfp_t gfp_mask)
> {
> return (void *) __get_free_pages(gfp_mask, get_order(CLP_BLK_SIZE));
> }
>
> -static void clp_free_block(void *ptr)
> +void clp_free_block(void *ptr)
> {
> free_pages((unsigned long) ptr, get_order(CLP_BLK_SIZE));
> }
> @@ -358,8 +358,8 @@ static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
> return rc;
> }
>
> -static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
> - void (*cb)(struct clp_fh_list_entry *, void *))
> +int clp_list_pci(struct clp_req_rsp_list_pci *rrb, u32 *mdd,
> + void (*cb)(struct clp_fh_list_entry *))
> {
> u64 resume_token = 0;
> int nentries, i, rc;
> @@ -368,8 +368,12 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
> rc = clp_list_pci_req(rrb, &resume_token, &nentries);
> if (rc)
> return rc;
> + if (mdd)
> + *mdd = rrb->response.mdd;
> + if (!cb)
> + return 0;
> for (i = 0; i < nentries; i++)
> - cb(&rrb->response.fh_list[i], data);
> + cb(&rrb->response.fh_list[i]);
> } while (resume_token);
>
> return rc;
> @@ -398,7 +402,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
> return -ENODEV;
> }
>
> -static void __clp_add(struct clp_fh_list_entry *entry, void *data)
> +static void __clp_add(struct clp_fh_list_entry *entry)
> {
> struct zpci_dev *zdev;
>
>

2021-12-09 15:54:23

by Christian Borntraeger

Subject: Re: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure

On 07.12.21 at 21:57, Matthew Rosato wrote:
> This structure will be used to carry kvm passthrough information related to
> zPCI devices.
>
> Signed-off-by: Matthew Rosato <[email protected]>

Mostly a skeleton, but looks OK.

Reviewed-by: Christian Borntraeger <[email protected]>

> ---
> arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++
> arch/s390/include/asm/pci.h | 3 ++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/pci.c | 57 +++++++++++++++++++++++++++++++++
> 4 files changed, 90 insertions(+), 1 deletion(-)
> create mode 100644 arch/s390/include/asm/kvm_pci.h
> create mode 100644 arch/s390/kvm/pci.c
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> new file mode 100644
> index 000000000000..3e491a39704c
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KVM PCI Passthrough for virtual machines on s390
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +
> +#ifndef ASM_KVM_PCI_H
> +#define ASM_KVM_PCI_H
> +
> +#include <linux/types.h>
> +#include <linux/kvm_types.h>
> +#include <linux/kvm_host.h>
> +#include <linux/kvm.h>
> +#include <linux/pci.h>
> +
> +struct kvm_zdev {
> + struct zpci_dev *zdev;
> + struct kvm *kvm;
> +};
> +
> +extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> +extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
> +
> +#endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 86f43644756d..32810e1ed308 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
> };
>
> struct s390_domain;
> +struct kvm_zdev;
>
> #define ZPCI_FUNCTIONS_PER_BUS 256
> struct zpci_bus {
> @@ -190,6 +191,8 @@ struct zpci_dev {
> struct dentry *debugfs_dev;
>
> struct s390_domain *s390_domain; /* s390 IOMMU domain data */
> +
> + struct kvm_zdev *kzdev; /* passthrough data */
> };
>
> static inline bool zdev_enabled(struct zpci_dev *zdev)
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index b3aaadc60ead..95ea865e5d29 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> new file mode 100644
> index 000000000000..ecfc458a5b39
> --- /dev/null
> +++ b/arch/s390/kvm/pci.c
> @@ -0,0 +1,57 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <asm/kvm_pci.h>
> +
> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (zdev == NULL)
> + return -ENODEV;
> +
> + kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
> + if (!kzdev)
> + return -ENOMEM;
> +
> + kzdev->zdev = zdev;
> + zdev->kzdev = kzdev;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
> +
> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (!zdev || !zdev->kzdev)
> + return;
> +
> + kzdev = zdev->kzdev;
> + WARN_ON(kzdev->zdev != zdev);
> + zdev->kzdev = 0;
> + kfree(kzdev);
> +
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
> +
> +int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> +{
> + struct kvm_zdev *kzdev = zdev->kzdev;
> +
> + if (!kzdev)
> + return -ENODEV;
> +
> + kzdev->kvm = kvm;
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>

2021-12-09 18:25:28

by Matthew Rosato

Subject: Re: [PATCH 12/32] s390/pci: get SHM information from list pci

On 12/8/21 7:21 AM, Niklas Schnelle wrote:
> On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
>> KVM will need information on the special handle mask used to indicate
>> emulated devices. In order to obtain this, a new type of list pci call
>> must be made to gather the information. Remove the unused data pointer
>> from clp_list_pci and __clp_add and instead optionally pass a pointer to
>> a model-dependent-data field. Additionally, allow for clp_list_pci calls
>> that don't specify a callback - in this case, just do the first pass of
>> list pci and exit.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>> arch/s390/include/asm/pci.h | 6 ++++++
>> arch/s390/include/asm/pci_clp.h | 2 +-
>> arch/s390/pci/pci.c | 19 +++++++++++++++++++
>> arch/s390/pci/pci_clp.c | 16 ++++++++++------
>> 4 files changed, 36 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>> index 00a2c24d6d2b..86f43644756d 100644
>> --- a/arch/s390/include/asm/pci.h
>> +++ b/arch/s390/include/asm/pci.h
>> @@ -219,12 +219,18 @@ int zpci_unregister_ioat(struct zpci_dev *, u8);
>> void zpci_remove_reserved_devices(void);
>> void zpci_update_fh(struct zpci_dev *zdev, u32 fh);
>>
> ---8<---
>> -static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
>> - void (*cb)(struct clp_fh_list_entry *, void *))
>> +int clp_list_pci(struct clp_req_rsp_list_pci *rrb, u32 *mdd,
>> + void (*cb)(struct clp_fh_list_entry *))
>> {
>> u64 resume_token = 0;
>> int nentries, i, rc;
>> @@ -368,8 +368,12 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
>> rc = clp_list_pci_req(rrb, &resume_token, &nentries);
>> if (rc)
>> return rc;
>> + if (mdd)
>> + *mdd = rrb->response.mdd;
>> + if (!cb)
>> + return 0;
>
> I think it would be slightly cleaner to instead de-static
> clp_list_pci_req() and call that directly. Just because that makes the
> clp_list_pci() still list all PCI functions and allows us to get rid of
> the data parameter completely.
>

Oops, I must have missed this comment before. Sure, makes sense.

> Also, I've been thinking about moving clp_scan_devices(),
> clp_get_state(), and clp_refresh_fh() out of pci_clp.c because they are
> higher level. I think that would nicely fit your zpci_get_mdd() in
> pci.c with or without the above suggestion. Then we could do the
> removal of the unused data parameter in that series as a cleanup. What
> do you think?

Sure, that would be fine. So then this patch instead just exports
alloc/free/clp_list_pci_req and the new zpci_get_mdd calls
clp_list_pci_req directly. I'll drop the changes to clp_list_pci() and
__clp_add (and re-word the commit message)
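
A rough userspace sketch of that plan, under stated assumptions: demo_list_pci_req() stands in for clp_list_pci_req() (same argument order as the call in the patch), struct demo_list_rsp is a hypothetical trimmed-down response block, calloc()/free() replace clp_alloc_block()/clp_free_block(), and the mdd value is made up. It only shows the control flow of "issue one list pci request, read mdd, stop", not the real CLP interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down response block for illustration only. */
struct demo_list_rsp {
	uint64_t resume_token;
	uint32_t mdd; /* model-dependent data */
	int nentries;
};

/* Stand-in for clp_list_pci_req(): performs a single list pci pass. */
static int demo_list_pci_req(struct demo_list_rsp *rsp,
			     uint64_t *resume_token, int *nentries)
{
	rsp->mdd = 0xfc000000u; /* made-up value */
	rsp->nentries = 4;
	*resume_token = 1;      /* pretend more entries would follow */
	*nentries = rsp->nentries;
	return 0;
}

/* Fetch mdd from the first pass only; remaining entries are ignored. */
static int demo_get_mdd(uint32_t *mdd)
{
	struct demo_list_rsp *rsp;
	uint64_t resume_token = 0;
	int nentries, rc;

	if (!mdd)
		return -EINVAL;

	rsp = calloc(1, sizeof(*rsp));
	if (!rsp)
		return -ENOMEM;

	rc = demo_list_pci_req(rsp, &resume_token, &nentries);
	if (!rc)
		*mdd = rsp->mdd;

	free(rsp);
	return rc;
}
```

With this shape, the full-listing routine keeps its callback-driven loop untouched, and the one-shot query never needs a NULL-callback special case.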

>
>> for (i = 0; i < nentries; i++)
>> - cb(&rrb->response.fh_list[i], data);
>> + cb(&rrb->response.fh_list[i]);
>> } while (resume_token);
>>
>> return rc;
>> @@ -398,7 +402,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
>> return -ENODEV;
>> }
>>
>> -static void __clp_add(struct clp_fh_list_entry *entry, void *data)
>> +static void __clp_add(struct clp_fh_list_entry *entry)
>> {
>> struct zpci_dev *zdev;
>>
>


2021-12-09 19:54:45

by Christian Borntraeger

Subject: Re: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation

On 07.12.21 at 21:57, Matthew Rosato wrote:
> Initial setup for Adapter Event Notification Interpretation for zPCI
> passthrough devices. Specifically, allocate a structure for forwarding of
> adapter events and pass the address of this structure to firmware.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci_insn.h | 12 ++++
> arch/s390/kvm/interrupt.c | 17 +++++
> arch/s390/kvm/kvm-s390.c | 3 +
> arch/s390/kvm/pci.c | 113 +++++++++++++++++++++++++++++++
> arch/s390/kvm/pci.h | 42 ++++++++++++
> 5 files changed, 187 insertions(+)
> create mode 100644 arch/s390/kvm/pci.h
>
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 5331082fa516..e5f57cfe1d45 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -101,6 +101,7 @@ struct zpci_fib {
> /* Set Interruption Controls Operation Controls */
> #define SIC_IRQ_MODE_ALL 0
> #define SIC_IRQ_MODE_SINGLE 1
> +#define SIC_SET_AENI_CONTROLS 2
> #define SIC_IRQ_MODE_DIRECT 4
> #define SIC_IRQ_MODE_D_ALL 16
> #define SIC_IRQ_MODE_D_SINGLE 17
> @@ -127,9 +128,20 @@ struct zpci_cdiib {
> u64 : 64;
> } __packed __aligned(8);
>
> +/* adapter interruption parameters block */
> +struct zpci_aipb {
> + u64 faisb;
> + u64 gait;
> + u16 : 13;
> + u16 afi : 3;
> + u32 : 32;
> + u16 faal;
> +} __packed __aligned(8);
> +
> union zpci_sic_iib {
> struct zpci_diib diib;
> struct zpci_cdiib cdiib;
> + struct zpci_aipb aipb;
> };
>
> DECLARE_STATIC_KEY_FALSE(have_mio);
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index f9b872e358c6..4efe0e95a40f 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -32,6 +32,7 @@
> #include "kvm-s390.h"
> #include "gaccess.h"
> #include "trace-s390.h"
> +#include "pci.h"
>
> #define PFAULT_INIT 0x0600
> #define PFAULT_DONE 0x0680
> @@ -3276,8 +3277,16 @@ static struct airq_struct gib_alert_irq = {
>
> void kvm_s390_gib_destroy(void)
> {
> + struct zpci_aift *aift;
> +
> if (!gib)
> return;
> + aift = kvm_s390_pci_get_aift();
> + if (aift) {
> + mutex_lock(&aift->lock);

aift is a static variable, and later patches seem to access it directly without the wrapper.
Can we get rid of kvm_s390_pci_get_aift()?
> + kvm_s390_pci_aen_exit();
> + mutex_unlock(&aift->lock);
> + }
> chsc_sgib(0);
> unregister_adapter_interrupt(&gib_alert_irq);
> free_page((unsigned long)gib);
> @@ -3315,6 +3324,14 @@ int kvm_s390_gib_init(u8 nisc)
> goto out_unreg_gal;
> }
>
> + if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
> + if (kvm_s390_pci_aen_init(nisc)) {
> + pr_err("Initializing AEN for PCI failed\n");
> + rc = -EIO;
> + goto out_unreg_gal;
> + }
> + }
> +
> KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
> goto out;
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 14a18ba5ff2c..9cd3c8eb59e8 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -48,6 +48,7 @@
> #include <asm/fpu/api.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
> +#include "pci.h"
>
> #define CREATE_TRACE_POINTS
> #include "trace.h"
> @@ -503,6 +504,8 @@ int kvm_arch_init(void *opaque)
> goto out;
> }
>
> + kvm_s390_pci_init();
> +
> rc = kvm_s390_gib_init(GAL_ISC);
> if (rc)
> goto out;
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index ecfc458a5b39..f0e5386ff943 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -10,6 +10,113 @@
> #include <linux/kvm_host.h>
> #include <linux/pci.h>
> #include <asm/kvm_pci.h>
> +#include "pci.h"
> +
> +static struct zpci_aift aift;

see below.
> +
> +static inline int __set_irq_noiib(u16 ctl, u8 isc)
> +{
> + union zpci_sic_iib iib = {{0}};
> +
> + return zpci_set_irq_ctrl(ctl, isc, &iib);
> +}
> +
> +struct zpci_aift *kvm_s390_pci_get_aift(void)
> +{
> + return &aift;
> +}
> +
> +/* Caller must hold the aift lock before calling this function */
> +void kvm_s390_pci_aen_exit(void)
> +{
> + struct zpci_gaite *gait;
> + unsigned long flags;
> + struct airq_iv *sbv;
> + struct kvm_zdev **gait_kzdev;
> + int size;
> +
> + /* Clear the GAIT and forwarding summary vector */
> + __set_irq_noiib(SIC_SET_AENI_CONTROLS, 0);
> +
> + spin_lock_irqsave(&aift.gait_lock, flags);
> + gait = aift.gait;
> + sbv = aift.sbv;
> + gait_kzdev = aift.kzdev;
> + aift.gait = 0;
> + aift.sbv = 0;
> + aift.kzdev = 0;
> + spin_unlock_irqrestore(&aift.gait_lock, flags);
> +
> + if (sbv)
> + airq_iv_release(sbv);
> + size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
> + sizeof(struct zpci_gaite)));
> + free_pages((unsigned long)gait, size);
> + kfree(gait_kzdev);
> +}
> +
> +int kvm_s390_pci_aen_init(u8 nisc)
> +{
> + union zpci_sic_iib iib = {{0}};
> + struct page *page;
> + int rc = 0, size;
> +
> + /* If already enabled for AEN, bail out now */
> + if (aift.gait || aift.sbv)
> + return -EPERM;
> +
> + mutex_lock(&aift.lock);
> + aift.kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev),
> + GFP_KERNEL);
> + if (!aift.kzdev) {
> + rc = -ENOMEM;
> + goto unlock;
> + }
> + aift.sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
> + if (!aift.sbv) {
> + rc = -ENOMEM;
> + goto free_zdev;
> + }
> + size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
> + sizeof(struct zpci_gaite)));
> + page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
> + if (!page) {
> + rc = -ENOMEM;
> + goto free_sbv;
> + }
> + aift.gait = (struct zpci_gaite *)page_to_phys(page);
> +
> + iib.aipb.faisb = (u64)aift.sbv->vector;
> + iib.aipb.gait = (u64)aift.gait;
> + iib.aipb.afi = nisc;
> + iib.aipb.faal = ZPCI_NR_DEVICES;
> +
> + /* Setup Adapter Event Notification Interpretation */
> + if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, &iib)) {
> + rc = -EIO;
> + goto free_gait;
> + }
> +
> + /* Enable floating IRQs */
> + if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
> + rc = -EIO;
> + kvm_s390_pci_aen_exit();
> + }
> +
> + goto unlock;
> +
> +free_gait:
> + size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
> + sizeof(struct zpci_gaite)));
> + free_pages((unsigned long)aift.gait, size);
> +free_sbv:
> + airq_iv_release(aift.sbv);
> +free_zdev:
> + kfree(aift.kzdev);
> +unlock:
> + mutex_unlock(&aift.lock);
> + return rc;
> +}
>
> int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> {
> @@ -55,3 +162,9 @@ int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> return 0;
> }
> EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
> +
> +void kvm_s390_pci_init(void)
> +{
> + spin_lock_init(&aift.gait_lock);
> + mutex_init(&aift.lock);
> +}

Can we maybe use a designated initializer for the static definition of aift, e.g. something
like
static struct zpci_aift aift = {
	.gait_lock = __SPIN_LOCK_UNLOCKED(aift.gait_lock),
	.lock = __MUTEX_INITIALIZER(aift.lock),
};
and get rid of the init function?
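
As a userspace analogy only (not kernel code), the same idea looks like this with POSIX mutexes: both locks are initialized at definition time, so no separate init call has to run before first use. struct demo_aift and demo_aift_in_use() are hypothetical names, and pthread_mutex_t stands in for both spinlock_t and struct mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct zpci_aift. */
struct demo_aift {
	void *gait;
	void *sbv;
	pthread_mutex_t gait_lock; /* stands in for spinlock_t gait_lock */
	pthread_mutex_t lock;      /* stands in for struct mutex lock */
};

/* Locks are usable from the start; no runtime init function is needed. */
static struct demo_aift demo_aift = {
	.gait_lock = PTHREAD_MUTEX_INITIALIZER,
	.lock = PTHREAD_MUTEX_INITIALIZER,
};

/* Returns nonzero if the table or summary vector is already set up. */
static int demo_aift_in_use(void)
{
	int busy;

	pthread_mutex_lock(&demo_aift.lock);
	busy = demo_aift.gait != NULL || demo_aift.sbv != NULL;
	pthread_mutex_unlock(&demo_aift.lock);
	return busy;
}
```

The benefit is mostly ordering: there is no window in which some caller could take a lock before the init function has run.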


> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
> new file mode 100644
> index 000000000000..74b06d39be3b
> --- /dev/null
> +++ b/arch/s390/kvm/pci.h
> @@ -0,0 +1,42 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +#ifndef __KVM_S390_PCI_H
> +#define __KVM_S390_PCI_H
> +
> +#include <linux/pci.h>
> +#include <linux/mutex.h>
> +#include <asm/airq.h>
> +#include <asm/kvm_pci.h>
> +
> +struct zpci_gaite {
> + unsigned int gisa;

Since we use u8 below, what about u32 here?
> + u8 gisc;
> + u8 count;
> + u8 reserved;
> + u8 aisbo;
> + unsigned long aisb;

And u64 here?
> +};
> +
> +struct zpci_aift {
> + struct zpci_gaite *gait;
> + struct airq_iv *sbv;
> + struct kvm_zdev **kzdev;
> + spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
> + struct mutex lock; /* Protects the other structures in aift */
> +};
> +
> +struct zpci_aift *kvm_s390_pci_get_aift(void);
> +
> +int kvm_s390_pci_aen_init(u8 nisc);
> +void kvm_s390_pci_aen_exit(void);
> +
> +void kvm_s390_pci_init(void);
> +
> +#endif /* __KVM_S390_PCI_H */
>
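
To illustrate the u32/u64 suggestion for struct zpci_gaite above, here is a small userspace sketch using the <stdint.h> equivalents. struct demo_gaite is a hypothetical copy of the layout from the patch. On s390x, unsigned int is 32 bits and unsigned long is 64 bits, so switching to fixed-width types should not change the layout; it just states the field widths explicitly, and the compile-time checks document them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of struct zpci_gaite using fixed-width types. */
struct demo_gaite {
	uint32_t gisa;
	uint8_t gisc;
	uint8_t count;
	uint8_t reserved;
	uint8_t aisbo;
	uint64_t aisb;
};

/* Compile-time checks: 4 + 4 * 1 + 8 bytes, with no padding expected. */
_Static_assert(sizeof(struct demo_gaite) == 16,
	       "entry is expected to be 16 bytes");
_Static_assert(offsetof(struct demo_gaite, aisb) == 8,
	       "aisb is expected at offset 8");
```

Since this table is shared with firmware, spelling out the widths makes the expected layout visible in the declaration rather than implied by the ABI.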

2021-12-09 20:20:09

by Matthew Rosato

Subject: Re: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation

On 12/9/21 2:54 PM, Christian Borntraeger wrote:
> On 07.12.21 at 21:57, Matthew Rosato wrote:
>> Initial setup for Adapter Event Notification Interpretation for zPCI
>> passthrough devices.  Specifically, allocate a structure for
>> forwarding of
>> adapter events and pass the address of this structure to firmware.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/include/asm/pci_insn.h |  12 ++++
>>   arch/s390/kvm/interrupt.c        |  17 +++++
>>   arch/s390/kvm/kvm-s390.c         |   3 +
>>   arch/s390/kvm/pci.c              | 113 +++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h              |  42 ++++++++++++
>>   5 files changed, 187 insertions(+)
>>   create mode 100644 arch/s390/kvm/pci.h
>>
>> diff --git a/arch/s390/include/asm/pci_insn.h
>> b/arch/s390/include/asm/pci_insn.h
>> index 5331082fa516..e5f57cfe1d45 100644
>> --- a/arch/s390/include/asm/pci_insn.h
>> +++ b/arch/s390/include/asm/pci_insn.h
>> @@ -101,6 +101,7 @@ struct zpci_fib {
>>   /* Set Interruption Controls Operation Controls  */
>>   #define    SIC_IRQ_MODE_ALL        0
>>   #define    SIC_IRQ_MODE_SINGLE        1
>> +#define    SIC_SET_AENI_CONTROLS        2
>>   #define    SIC_IRQ_MODE_DIRECT        4
>>   #define    SIC_IRQ_MODE_D_ALL        16
>>   #define    SIC_IRQ_MODE_D_SINGLE        17
>> @@ -127,9 +128,20 @@ struct zpci_cdiib {
>>       u64 : 64;
>>   } __packed __aligned(8);
>> +/* adapter interruption parameters block */
>> +struct zpci_aipb {
>> +    u64 faisb;
>> +    u64 gait;
>> +    u16 : 13;
>> +    u16 afi : 3;
>> +    u32 : 32;
>> +    u16 faal;
>> +} __packed __aligned(8);
>> +
>>   union zpci_sic_iib {
>>       struct zpci_diib diib;
>>       struct zpci_cdiib cdiib;
>> +    struct zpci_aipb aipb;
>>   };
>>   DECLARE_STATIC_KEY_FALSE(have_mio);
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index f9b872e358c6..4efe0e95a40f 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -32,6 +32,7 @@
>>   #include "kvm-s390.h"
>>   #include "gaccess.h"
>>   #include "trace-s390.h"
>> +#include "pci.h"
>>   #define PFAULT_INIT 0x0600
>>   #define PFAULT_DONE 0x0680
>> @@ -3276,8 +3277,16 @@ static struct airq_struct gib_alert_irq = {
>>   void kvm_s390_gib_destroy(void)
>>   {
>> +    struct zpci_aift *aift;
>> +
>>       if (!gib)
>>           return;
>> +    aift = kvm_s390_pci_get_aift();
>> +    if (aift) {
>> +        mutex_lock(&aift->lock)
>
> aift is a static variable and later patches seem to access that directly
> without the wrapper.
> Can we get rid of kvm_s390_pci_get_aift?

kvm/interrupt.c must also access it when handling AEN forwarding (next
patch).

> ;
>> +        kvm_s390_pci_aen_exit();
>> +        mutex_unlock(&aift->lock);
>> +    }
>>       chsc_sgib(0);
>>       unregister_adapter_interrupt(&gib_alert_irq);
>>       free_page((unsigned long)gib);
>> @@ -3315,6 +3324,14 @@ int kvm_s390_gib_init(u8 nisc)
>>           goto out_unreg_gal;
>>       }
>> +    if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
>> +        if (kvm_s390_pci_aen_init(nisc)) {
>> +            pr_err("Initializing AEN for PCI failed\n");
>> +            rc = -EIO;
>> +            goto out_unreg_gal;
>> +        }
>> +    }
>> +
>>       KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
>>       goto out;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 14a18ba5ff2c..9cd3c8eb59e8 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -48,6 +48,7 @@
>>   #include <asm/fpu/api.h>
>>   #include "kvm-s390.h"
>>   #include "gaccess.h"
>> +#include "pci.h"
>>   #define CREATE_TRACE_POINTS
>>   #include "trace.h"
>> @@ -503,6 +504,8 @@ int kvm_arch_init(void *opaque)
>>           goto out;
>>       }
>> +    kvm_s390_pci_init();
>> +
>>       rc = kvm_s390_gib_init(GAL_ISC);
>>       if (rc)
>>           goto out;
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index ecfc458a5b39..f0e5386ff943 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -10,6 +10,113 @@
>>   #include <linux/kvm_host.h>
>>   #include <linux/pci.h>
>>   #include <asm/kvm_pci.h>
>> +#include "pci.h"
>> +
>> +static struct zpci_aift aift;
>
> see below.
>> +
>> +static inline int __set_irq_noiib(u16 ctl, u8 isc)
>> +{
>> +    union zpci_sic_iib iib = {{0}};
>> +
>> +    return zpci_set_irq_ctrl(ctl, isc, &iib);
>> +}
>> +
>> +struct zpci_aift *kvm_s390_pci_get_aift(void)
>> +{
>> +    return &aift;
>> +}
>> +
>> +/* Caller must hold the aift lock before calling this function */
>> +void kvm_s390_pci_aen_exit(void)
>> +{
>> +    struct zpci_gaite *gait;
>> +    unsigned long flags;
>> +    struct airq_iv *sbv;
>> +    struct kvm_zdev **gait_kzdev;
>> +    int size;
>> +
>> +    /* Clear the GAIT and forwarding summary vector */
>> +    __set_irq_noiib(SIC_SET_AENI_CONTROLS, 0);
>> +
>> +    spin_lock_irqsave(&aift.gait_lock, flags);
>> +    gait = aift.gait;
>> +    sbv = aift.sbv;
>> +    gait_kzdev = aift.kzdev;
>> +    aift.gait = 0;
>> +    aift.sbv = 0;
>> +    aift.kzdev = 0;
>> +    spin_unlock_irqrestore(&aift.gait_lock, flags);
>> +
>> +    if (sbv)
>> +        airq_iv_release(sbv);
>> +    size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
>> +                    sizeof(struct zpci_gaite)));
>> +    free_pages((unsigned long)gait, size);
>> +    kfree(gait_kzdev);
>> +}
>> +
>> +int kvm_s390_pci_aen_init(u8 nisc)
>> +{
>> +    union zpci_sic_iib iib = {{0}};
>> +    struct page *page;
>> +    int rc = 0, size;
>> +
>> +    /* If already enabled for AEN, bail out now */
>> +    if (aift.gait || aift.sbv)
>> +        return -EPERM;
>> +
>> +    mutex_lock(&aift.lock);
>> +    aift.kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev),
>> +                 GFP_KERNEL);
>> +    if (!aift.kzdev) {
>> +        rc = -ENOMEM;
>> +        goto unlock;
>> +    }
>> +    aift.sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
>> +    if (!aift.sbv) {
>> +        rc = -ENOMEM;
>> +        goto free_zdev;
>> +    }
>> +    size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
>> +                    sizeof(struct zpci_gaite)));
>> +    page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
>> +    if (!page) {
>> +        rc = -ENOMEM;
>> +        goto free_sbv;
>> +    }
>> +    aift.gait = (struct zpci_gaite *)page_to_phys(page);
>> +
>> +    iib.aipb.faisb = (u64)aift.sbv->vector;
>> +    iib.aipb.gait = (u64)aift.gait;
>> +    iib.aipb.afi = nisc;
>> +    iib.aipb.faal = ZPCI_NR_DEVICES;
>> +
>> +    /* Setup Adapter Event Notification Interpretation */
>> +    if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, &iib)) {
>> +        rc = -EIO;
>> +        goto free_gait;
>> +    }
>> +
>> +    /* Enable floating IRQs */
>> +    if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
>> +        rc = -EIO;
>> +        kvm_s390_pci_aen_exit();
>> +    }
>> +
>> +    goto unlock;
>> +
>> +free_gait:
>> +    size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
>> +                    sizeof(struct zpci_gaite)));
>> +    free_pages((unsigned long)aift.gait, size);
>> +free_sbv:
>> +    airq_iv_release(aift.sbv);
>> +free_zdev:
>> +    kfree(aift.kzdev);
>> +unlock:
>> +    mutex_unlock(&aift.lock);
>> +    return rc;
>> +}
>>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
>>   {
>> @@ -55,3 +162,9 @@ int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev,
>> struct kvm *kvm)
>>       return 0;
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>> +
>> +void kvm_s390_pci_init(void)
>> +{
>> +    spin_lock_init(&aift.gait_lock);
>> +    mutex_init(&aift.lock);
>> +}
>
> Can we maybe use designated initializer for the static definition of
> aift, e.g. something
> like
> static struct zpci_aift aift = {
>     .gait_lock = __SPIN_LOCK_UNLOCKED(aift.gait_lock),
>     .lock    = __MUTEX_INITIALIZER(aift.lock),
> }
> and get rid of the init function?

Maybe -- I can certainly do the above, but I do add a call to
zpci_get_mdd() in the init function (patch 23), so if I want to in patch
23 instead add .mdd = zpci_get_mdd() to this designated initializer I'd
have to re-work zpci_get_mdd (patch 12) to return the mdd rather than
the CLP LIST PCI return code. We want at least a warning if we're
setting a 0 for mdd because the CLP failed for some bizarre reason.

I guess one option would be to move the WARN_ON into the zpci_get_mdd()
function itself, and then we can do

u32 zpci_get_mdd(void);

Niklas, what do you think?

>
>> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
>> new file mode 100644
>> index 000000000000..74b06d39be3b
>> --- /dev/null
>> +++ b/arch/s390/kvm/pci.h
>> @@ -0,0 +1,42 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * s390 kvm PCI passthrough support
>> + *
>> + * Copyright IBM Corp. 2021
>> + *
>> + *    Author(s): Matthew Rosato <[email protected]>
>> + */
>> +
>> +#ifndef __KVM_S390_PCI_H
>> +#define __KVM_S390_PCI_H
>> +
>> +#include <linux/pci.h>
>> +#include <linux/mutex.h>
>> +#include <asm/airq.h>
>> +#include <asm/kvm_pci.h>
>> +
>> +struct zpci_gaite {
>> +    unsigned int gisa;
>
> since we use u8 below, what about u32
>> +    u8 gisc;
>> +    u8 count;
>> +    u8 reserved;
>> +    u8 aisbo;
>> +    unsigned long aisb;
>
> and u64 ?
>> +};
>> +
>> +struct zpci_aift {
>> +    struct zpci_gaite *gait;
>> +    struct airq_iv *sbv;
>> +    struct kvm_zdev **kzdev;
>> +    spinlock_t gait_lock; /* Protects the gait, used during AEN
>> forward */
>> +    struct mutex lock; /* Protects the other structures in aift */
>> +};
>> +
>> +struct zpci_aift *kvm_s390_pci_get_aift(void);
>> +
>> +int kvm_s390_pci_aen_init(u8 nisc);
>> +void kvm_s390_pci_aen_exit(void);
>> +
>> +void kvm_s390_pci_init(void);
>> +
>> +#endif /* __KVM_S390_PCI_H */
>>
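
The fixed-width suggestion for struct zpci_gaite quoted above can be sketched in plain userspace C. This is only an illustration, not the patch itself: the _model suffix, the stdint types standing in for the kernel's u32/u64, and the gait_bytes() helper are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model of the table entry from the quoted pci.h, with the
 * two non-fixed-width members replaced as suggested in the review.
 */
struct zpci_gaite_model {
	uint32_t gisa;		/* was: unsigned int */
	uint8_t gisc;
	uint8_t count;
	uint8_t reserved;
	uint8_t aisbo;
	uint64_t aisb;		/* was: unsigned long */
};

/* With these types the entry is 16 bytes and needs no padding. */
static_assert(sizeof(struct zpci_gaite_model) == 16, "unexpected entry size");
static_assert(offsetof(struct zpci_gaite_model, aisb) == 8, "unexpected aisb offset");

/* Hypothetical helper: bytes needed for a table of nr entries. */
static size_t gait_bytes(size_t nr)
{
	return nr * sizeof(struct zpci_gaite_model);
}
```

The compile-time checks make the intended layout explicit, which is the main benefit of fixed-width types for a structure that is shared with firmware.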


2021-12-09 20:24:12

by Christian Borntraeger

Subject: Re: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation



On 09.12.21 at 21:20, Matthew Rosato wrote:
> On 12/9/21 2:54 PM, Christian Borntraeger wrote:
>> On 07.12.21 at 21:57, Matthew Rosato wrote:
>>> Initial setup for Adapter Event Notification Interpretation for zPCI
>>> passthrough devices.  Specifically, allocate a structure for forwarding of
>>> adapter events and pass the address of this structure to firmware.
>>>
>>> Signed-off-by: Matthew Rosato <[email protected]>
>>> ---
>>>   arch/s390/include/asm/pci_insn.h |  12 ++++
>>>   arch/s390/kvm/interrupt.c        |  17 +++++
>>>   arch/s390/kvm/kvm-s390.c         |   3 +
>>>   arch/s390/kvm/pci.c              | 113 +++++++++++++++++++++++++++++++
>>>   arch/s390/kvm/pci.h              |  42 ++++++++++++
>>>   5 files changed, 187 insertions(+)
>>>   create mode 100644 arch/s390/kvm/pci.h
>>>
>>> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
>>> index 5331082fa516..e5f57cfe1d45 100644
>>> --- a/arch/s390/include/asm/pci_insn.h
>>> +++ b/arch/s390/include/asm/pci_insn.h
>>> @@ -101,6 +101,7 @@ struct zpci_fib {
>>>   /* Set Interruption Controls Operation Controls  */
>>>   #define    SIC_IRQ_MODE_ALL        0
>>>   #define    SIC_IRQ_MODE_SINGLE        1
>>> +#define    SIC_SET_AENI_CONTROLS        2
>>>   #define    SIC_IRQ_MODE_DIRECT        4
>>>   #define    SIC_IRQ_MODE_D_ALL        16
>>>   #define    SIC_IRQ_MODE_D_SINGLE        17
>>> @@ -127,9 +128,20 @@ struct zpci_cdiib {
>>>       u64 : 64;
>>>   } __packed __aligned(8);
>>> +/* adapter interruption parameters block */
>>> +struct zpci_aipb {
>>> +    u64 faisb;
>>> +    u64 gait;
>>> +    u16 : 13;
>>> +    u16 afi : 3;
>>> +    u32 : 32;
>>> +    u16 faal;
>>> +} __packed __aligned(8);
>>> +
>>>   union zpci_sic_iib {
>>>       struct zpci_diib diib;
>>>       struct zpci_cdiib cdiib;
>>> +    struct zpci_aipb aipb;
>>>   };
>>>   DECLARE_STATIC_KEY_FALSE(have_mio);
>>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>>> index f9b872e358c6..4efe0e95a40f 100644
>>> --- a/arch/s390/kvm/interrupt.c
>>> +++ b/arch/s390/kvm/interrupt.c
>>> @@ -32,6 +32,7 @@
>>>   #include "kvm-s390.h"
>>>   #include "gaccess.h"
>>>   #include "trace-s390.h"
>>> +#include "pci.h"
>>>   #define PFAULT_INIT 0x0600
>>>   #define PFAULT_DONE 0x0680
>>> @@ -3276,8 +3277,16 @@ static struct airq_struct gib_alert_irq = {
>>>   void kvm_s390_gib_destroy(void)
>>>   {
>>> +    struct zpci_aift *aift;
>>> +
>>>       if (!gib)
>>>           return;
>>> +    aift = kvm_s390_pci_get_aift();
>>> +    if (aift) {
>>> +        mutex_lock(&aift->lock);
>>
>> aift is a static variable and later patches seem to access that directly without the wrapper.
>> Can we get rid of kvm_s390_pci_get_aift?
>
> kvm/interrupt.c must also access it when handling AEN forwarding (next patch)

So maybe just make it non-static and declare it in the header file?
[...]
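
A minimal sketch of that alternative, as a single-file userspace model: the object is declared extern (as it would be in pci.h), defined once (as it would be in pci.c), and read directly by another user (as interrupt.c would). The struct zpci_aift_model type, its reduced member list, and aen_is_set_up() are hypothetical names for illustration only.

```c
#include <stddef.h>

/* --- would live in pci.h --- */
struct zpci_aift_model {
	void *gait;	/* stand-in for struct zpci_gaite * */
	void *sbv;	/* stand-in for struct airq_iv * */
};
extern struct zpci_aift_model aift;

/* --- would live in pci.c: the single, non-static definition --- */
struct zpci_aift_model aift;

/* --- would live in interrupt.c: direct access, no getter needed --- */
static int aen_is_set_up(void)
{
	return aift.gait != NULL && aift.sbv != NULL;
}
```

Compared with the kvm_s390_pci_get_aift() wrapper, this drops one function and the "if (aift)" test on a pointer that can never be NULL.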

>> Can we maybe use designated initializer for the static definition of aift, e.g. something
>> like
>> static struct zpci_aift aift = {
>>      .gait_lock = __SPIN_LOCK_UNLOCKED(aift.gait_lock),
>>      .lock    = __MUTEX_INITIALIZER(aift.lock),
>> }
>> and get rid of the init function?
>
> Maybe -- I can certainly do the above, but I do add a call to zpci_get_mdd() in the init function (patch 23), so if I want to in patch 23 instead add .mdd = zpci_get_mdd() to this designated initializer I'd have to re-work zpci_get_mdd (patch 12) to return the mdd rather than the CLP LIST PCI return code.  We want at least a warning if we're setting a 0 for mdd because the CLP failed for some bizarre reason.
>
> I guess one option would be to move the WARN_ON into the zpci_get_mdd() function itself and then now we can do

So maybe leave this as is.
>
> u32 zpci_get_mdd(void);
>
> Niklas, what do you think?
>

2021-12-10 08:36:40

by Niklas Schnelle

Subject: Re: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation

On Thu, 2021-12-09 at 15:20 -0500, Matthew Rosato wrote:
> On 12/9/21 2:54 PM, Christian Borntraeger wrote:
> > On 07.12.21 at 21:57, Matthew Rosato wrote:
> > > Initial setup for Adapter Event Notification Interpretation for zPCI
> > > passthrough devices. Specifically, allocate a structure for
> > > forwarding of
> > > adapter events and pass the address of this structure to firmware.
> > >
> > > Signed-off-by: Matthew Rosato <[email protected]>
> > > ---
> > > arch/s390/include/asm/pci_insn.h | 12 ++++
> > > arch/s390/kvm/interrupt.c | 17 +++++
> > > arch/s390/kvm/kvm-s390.c | 3 +
> > > arch/s390/kvm/pci.c | 113 +++++++++++++++++++++++++++++++
> > > arch/s390/kvm/pci.h | 42 ++++++++++++
> > > 5 files changed, 187 insertions(+)
> > > create mode 100644 arch/s390/kvm/pci.h
> > >
---8<---
> > > int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> > > {
> > > @@ -55,3 +162,9 @@ int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev,
> > > struct kvm *kvm)
> > > return 0;
> > > }
> > > EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
> > > +
> > > +void kvm_s390_pci_init(void)
> > > +{
> > > + spin_lock_init(&aift.gait_lock);
> > > + mutex_init(&aift.lock);
> > > +}
> >
> > Can we maybe use designated initializer for the static definition of
> > aift, e.g. something
> > like
> > static struct zpci_aift aift = {
> > .gait_lock = __SPIN_LOCK_UNLOCKED(aift.gait_lock),
> > .lock = __MUTEX_INITIALIZER(aift.lock),
> > }
> > > and get rid of the init function?
>
> Maybe -- I can certainly do the above, but I do add a call to
> zpci_get_mdd() in the init function (patch 23), so if I want to in patch
> 23 instead add .mdd = zpci_get_mdd() to this designated initializer I'd
> have to re-work zpci_get_mdd (patch 12) to return the mdd rather than
> the CLP LIST PCI return code. We want at least a warning if we're
> setting a 0 for mdd because the CLP failed for some bizarre reason.
>
> I guess one option would be to move the WARN_ON into the zpci_get_mdd()
> function itself and then now we can do

Hmm, if we do change zpci_get_mdd(), which I'm generally fine with, I
feel like the initializer would be a weird mix of truly static lock
initialization and a function that actually does a CLP.
I'm also a little worried about initialization order if kvm is
built-in. The CLP should work even with PCI not initialized, but what
if, for example, the facility isn't even there?

Also, if you do change zpci_get_mdd(), I'd prefer a pr_err() instead of
a WARN_ON(); there is no reason to crash the system for this if it runs
with panic-on-warn. So I think overall keeping it as is makes more sense.
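
For comparison, the two shapes of zpci_get_mdd() discussed in this sub-thread can be modelled in userspace roughly as follows. Everything here is a sketch under assumptions: clp_list_pci_model() is a hypothetical stub for the CLP LIST PCI call, the simulate_failure parameter and the 0x1000 value exist only for the example, and fprintf() stands in for pr_err().

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the CLP LIST PCI call: 0 on success. */
static int clp_list_pci_model(uint32_t *mdd, int simulate_failure)
{
	if (simulate_failure)
		return -EIO;
	*mdd = 0x1000;	/* arbitrary example value, not a real mask */
	return 0;
}

/* Shape A (as in patch 12): return code out, mdd through a pointer. */
static int get_mdd_rc(uint32_t *mdd, int simulate_failure)
{
	if (!mdd)
		return -EINVAL;
	return clp_list_pci_model(mdd, simulate_failure);
}

/* Shape B (the discussed alternative): return mdd, log on failure. */
static uint32_t get_mdd_value(int simulate_failure)
{
	uint32_t mdd = 0;

	if (clp_list_pci_model(&mdd, simulate_failure)) {
		/* pr_err()-style message instead of a WARN_ON() */
		fprintf(stderr, "CLP LIST PCI failed, using mdd=0\n");
		return 0;
	}
	return mdd;
}
```

Shape B is what a designated initializer would need, but it hides the error code from the caller, which is part of why keeping shape A and the init function reads as the simpler choice.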


2021-12-10 08:46:07

by Niklas Schnelle

Subject: Re: [PATCH 12/32] s390/pci: get SHM information from list pci

On Thu, 2021-12-09 at 16:47 +0100, Christian Borntraeger wrote:
> On 07.12.21 at 21:57, Matthew Rosato wrote:
> > KVM will need information on the special handle mask used to indicate
> > emulated devices. In order to obtain this, a new type of list pci call
> > must be made to gather the information. Remove the unused data pointer
> > from clp_list_pci and __clp_add and instead optionally pass a pointer to
> > a model-dependent-data field. Additionally, allow for clp_list_pci calls
> > that don't specify a callback - in this case, just do the first pass of
> > list pci and exit.
> >
> > Signed-off-by: Matthew Rosato <[email protected]>
> > ---
> > arch/s390/include/asm/pci.h | 6 ++++++
> > arch/s390/include/asm/pci_clp.h | 2 +-
> > arch/s390/pci/pci.c | 19 +++++++++++++++++++
> > arch/s390/pci/pci_clp.c | 16 ++++++++++------
> > 4 files changed, 36 insertions(+), 7 deletions(-)
> >
---8<---
> >
> > +int zpci_get_mdd(u32 *mdd)
> > +{
> > + struct clp_req_rsp_list_pci *rrb;
> > + int rc;
> > +
> > + if (!mdd)
> > + return -EINVAL;
> > +
> > + rrb = clp_alloc_block(GFP_KERNEL);
> > + if (!rrb)
> > + return -ENOMEM;
> > +
> > + rc = clp_list_pci(rrb, mdd, NULL);
> > +
> > + clp_free_block(rrb);
> > + return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(zpci_get_mdd);
>
> Maybe move this into pci_clp.c to avoid the export of clp_alloc_block and void clp_free_block?
> Niklas?

That was actually my idea. I'm thinking of moving clp_get_state(),
clp_scan_pci_devices(), and clp_refresh_fh() to pci.c too, because I
feel these deal with higher-level concerns than the rest of pci_clp.c.

I have no strong opinion though, and might be thinking ahead too much
here. With the change discussed in the other mail of not modifying
clp_list_pci(), maybe it would be better to keep it here, and thus keep
this patch more focused and minimal, and then possibly move it with the
other similar functions.


2021-12-10 09:53:25

by Pierre Morel

Subject: Re: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation



On 12/7/21 21:57, Matthew Rosato wrote:
> Initial setup for Adapter Event Notification Interpretation for zPCI
> passthrough devices. Specifically, allocate a structure for forwarding of
> adapter events and pass the address of this structure to firmware.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci_insn.h | 12 ++++
> arch/s390/kvm/interrupt.c | 17 +++++
> arch/s390/kvm/kvm-s390.c | 3 +
> arch/s390/kvm/pci.c | 113 +++++++++++++++++++++++++++++++
> arch/s390/kvm/pci.h | 42 ++++++++++++
> 5 files changed, 187 insertions(+)
> create mode 100644 arch/s390/kvm/pci.h
>
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 5331082fa516..e5f57cfe1d45 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -101,6 +101,7 @@ struct zpci_fib {
> /* Set Interruption Controls Operation Controls */
> #define SIC_IRQ_MODE_ALL 0
> #define SIC_IRQ_MODE_SINGLE 1
> +#define SIC_SET_AENI_CONTROLS 2
> #define SIC_IRQ_MODE_DIRECT 4
> #define SIC_IRQ_MODE_D_ALL 16
> #define SIC_IRQ_MODE_D_SINGLE 17
> @@ -127,9 +128,20 @@ struct zpci_cdiib {
> u64 : 64;
> } __packed __aligned(8);
>
> +/* adapter interruption parameters block */
> +struct zpci_aipb {
> + u64 faisb;
> + u64 gait;
> + u16 : 13;
> + u16 afi : 3;
> + u32 : 32;
> + u16 faal;
> +} __packed __aligned(8);
> +
> union zpci_sic_iib {
> struct zpci_diib diib;
> struct zpci_cdiib cdiib;
> + struct zpci_aipb aipb;
> };
>
> DECLARE_STATIC_KEY_FALSE(have_mio);
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index f9b872e358c6..4efe0e95a40f 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -32,6 +32,7 @@
> #include "kvm-s390.h"
> #include "gaccess.h"
> #include "trace-s390.h"
> +#include "pci.h"
>
> #define PFAULT_INIT 0x0600
> #define PFAULT_DONE 0x0680
> @@ -3276,8 +3277,16 @@ static struct airq_struct gib_alert_irq = {
>
> void kvm_s390_gib_destroy(void)
> {
> + struct zpci_aift *aift;
> +
> if (!gib)
> return;
> + aift = kvm_s390_pci_get_aift();
> + if (aift) {
> + mutex_lock(&aift->lock);
> + kvm_s390_pci_aen_exit();

Shouldn't we check for CONFIG_PCI and sclp.has_aeni here, as in gib_init?

> + mutex_unlock(&aift->lock);
> + }
> chsc_sgib(0);
> unregister_adapter_interrupt(&gib_alert_irq);
> free_page((unsigned long)gib);
> @@ -3315,6 +3324,14 @@ int kvm_s390_gib_init(u8 nisc)
> goto out_unreg_gal;
> }
>
> + if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
> + if (kvm_s390_pci_aen_init(nisc)) {
> + pr_err("Initializing AEN for PCI failed\n");
> + rc = -EIO;
> + goto out_unreg_gal;
> + }
> + }
> +
> KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
> goto out;
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 14a18ba5ff2c..9cd3c8eb59e8 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -48,6 +48,7 @@
> #include <asm/fpu/api.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
> +#include "pci.h"
>
> #define CREATE_TRACE_POINTS
> #include "trace.h"
> @@ -503,6 +504,8 @@ int kvm_arch_init(void *opaque)
> goto out;
> }
>
> + kvm_s390_pci_init();
> +
> rc = kvm_s390_gib_init(GAL_ISC);
> if (rc)
> goto out;
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index ecfc458a5b39..f0e5386ff943 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -10,6 +10,113 @@
> #include <linux/kvm_host.h>
> #include <linux/pci.h>
> #include <asm/kvm_pci.h>
> +#include "pci.h"
> +
> +static struct zpci_aift aift;
> +
> +static inline int __set_irq_noiib(u16 ctl, u8 isc)
> +{
> + union zpci_sic_iib iib = {{0}};
> +
> + return zpci_set_irq_ctrl(ctl, isc, &iib);
> +}
> +
> +struct zpci_aift *kvm_s390_pci_get_aift(void)
> +{
> + return &aift;
> +}
> +
> +/* Caller must hold the aift lock before calling this function */
> +void kvm_s390_pci_aen_exit(void)
> +{
> + struct zpci_gaite *gait;
> + unsigned long flags;
> + struct airq_iv *sbv;
> + struct kvm_zdev **gait_kzdev;
> + int size;
> +
> + /* Clear the GAIT and forwarding summary vector */
> + __set_irq_noiib(SIC_SET_AENI_CONTROLS, 0);

Why don't we use the PCI ISC here?

...snip...

--
Pierre Morel
IBM Lab Boeblingen

2021-12-10 10:54:41

by Pierre Morel

Subject: Re: [PATCH 14/32] KVM: s390: pci: do initial setup for AEN interpretation



On 12/10/21 10:54, Pierre Morel wrote:
>
>
> On 12/7/21 21:57, Matthew Rosato wrote:
>> Initial setup for Adapter Event Notification Interpretation for zPCI
>> passthrough devices.  Specifically, allocate a structure for
>> forwarding of
>> adapter events and pass the address of this structure to firmware.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/include/asm/pci_insn.h |  12 ++++
>>   arch/s390/kvm/interrupt.c        |  17 +++++
>>   arch/s390/kvm/kvm-s390.c         |   3 +
>>   arch/s390/kvm/pci.c              | 113 +++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h              |  42 ++++++++++++
>>   5 files changed, 187 insertions(+)
>>   create mode 100644 arch/s390/kvm/pci.h
>>
>> diff --git a/arch/s390/include/asm/pci_insn.h
>> b/arch/s390/include/asm/pci_insn.h
>> index 5331082fa516..e5f57cfe1d45 100644
>> --- a/arch/s390/include/asm/pci_insn.h
>> +++ b/arch/s390/include/asm/pci_insn.h
>> @@ -101,6 +101,7 @@ struct zpci_fib {
>>   /* Set Interruption Controls Operation Controls  */
>>   #define    SIC_IRQ_MODE_ALL        0
>>   #define    SIC_IRQ_MODE_SINGLE        1
>> +#define    SIC_SET_AENI_CONTROLS        2
>>   #define    SIC_IRQ_MODE_DIRECT        4
>>   #define    SIC_IRQ_MODE_D_ALL        16
>>   #define    SIC_IRQ_MODE_D_SINGLE        17
>> @@ -127,9 +128,20 @@ struct zpci_cdiib {
>>       u64 : 64;
>>   } __packed __aligned(8);
>> +/* adapter interruption parameters block */
>> +struct zpci_aipb {
>> +    u64 faisb;
>> +    u64 gait;
>> +    u16 : 13;
>> +    u16 afi : 3;
>> +    u32 : 32;
>> +    u16 faal;
>> +} __packed __aligned(8);
>> +
>>   union zpci_sic_iib {
>>       struct zpci_diib diib;
>>       struct zpci_cdiib cdiib;
>> +    struct zpci_aipb aipb;
>>   };
>>   DECLARE_STATIC_KEY_FALSE(have_mio);
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index f9b872e358c6..4efe0e95a40f 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -32,6 +32,7 @@
>>   #include "kvm-s390.h"
>>   #include "gaccess.h"
>>   #include "trace-s390.h"
>> +#include "pci.h"
>>   #define PFAULT_INIT 0x0600
>>   #define PFAULT_DONE 0x0680
>> @@ -3276,8 +3277,16 @@ static struct airq_struct gib_alert_irq = {
>>   void kvm_s390_gib_destroy(void)
>>   {
>> +    struct zpci_aift *aift;
>> +
>>       if (!gib)
>>           return;
>> +    aift = kvm_s390_pci_get_aift();
>> +    if (aift) {
>> +        mutex_lock(&aift->lock);
>> +        kvm_s390_pci_aen_exit();
>
> Shouldn't we check for CONFIG_PCI and sclp.has_aeni here, as in gib_init?
>
>> +        mutex_unlock(&aift->lock);
>> +    }
>>       chsc_sgib(0);
>>       unregister_adapter_interrupt(&gib_alert_irq);
>>       free_page((unsigned long)gib);
>> @@ -3315,6 +3324,14 @@ int kvm_s390_gib_init(u8 nisc)
>>           goto out_unreg_gal;
>>       }
>> +    if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
>> +        if (kvm_s390_pci_aen_init(nisc)) {
>> +            pr_err("Initializing AEN for PCI failed\n");
>> +            rc = -EIO;
>> +            goto out_unreg_gal;
>> +        }
>> +    }
>> +
>>       KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
>>       goto out;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 14a18ba5ff2c..9cd3c8eb59e8 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -48,6 +48,7 @@
>>   #include <asm/fpu/api.h>
>>   #include "kvm-s390.h"
>>   #include "gaccess.h"
>> +#include "pci.h"
>>   #define CREATE_TRACE_POINTS
>>   #include "trace.h"
>> @@ -503,6 +504,8 @@ int kvm_arch_init(void *opaque)
>>           goto out;
>>       }
>> +    kvm_s390_pci_init();
>> +
>>       rc = kvm_s390_gib_init(GAL_ISC);
>>       if (rc)
>>           goto out;
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index ecfc458a5b39..f0e5386ff943 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -10,6 +10,113 @@
>>   #include <linux/kvm_host.h>
>>   #include <linux/pci.h>
>>   #include <asm/kvm_pci.h>
>> +#include "pci.h"
>> +
>> +static struct zpci_aift aift;
>> +
>> +static inline int __set_irq_noiib(u16 ctl, u8 isc)
>> +{
>> +    union zpci_sic_iib iib = {{0}};
>> +
>> +    return zpci_set_irq_ctrl(ctl, isc, &iib);
>> +}
>> +
>> +struct zpci_aift *kvm_s390_pci_get_aift(void)
>> +{
>> +    return &aift;
>> +}
>> +
>> +/* Caller must hold the aift lock before calling this function */
>> +void kvm_s390_pci_aen_exit(void)
>> +{
>> +    struct zpci_gaite *gait;
>> +    unsigned long flags;
>> +    struct airq_iv *sbv;
>> +    struct kvm_zdev **gait_kzdev;
>> +    int size;
>> +
>> +    /* Clear the GAIT and forwarding summary vector */
>> +    __set_irq_noiib(SIC_SET_AENI_CONTROLS, 0);
>
> Why don't we use the PCI ISC here?

Hmm, OK, sorry: isc is ignored for SIC_SET_AENI_CONTROLS.

>
> ...snip...
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-10 13:27:13

by Christian Borntraeger

Subject: Re: [PATCH 19/32] KVM: s390: mechanism to enable guest zPCI Interpretation



On 07.12.21 at 21:57, Matthew Rosato wrote:
> The guest must have access to certain facilities in order to allow
> interpretive execution of zPCI instructions and adapter event
> notifications. However, there are some cases where a guest might
> disable interpretation -- provide a mechanism via which we can defer
> enabling the associated zPCI interpretation facilities until the guest
> indicates it wishes to use them.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_host.h | 4 +++
> arch/s390/kvm/kvm-s390.c | 43 ++++++++++++++++++++++++++++++++
> arch/s390/kvm/kvm-s390.h | 10 ++++++++
> 3 files changed, 57 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 3f147b8d050b..38982c1de413 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
> #define ECB2_IEP 0x20
> #define ECB2_PFMFI 0x08
> #define ECB2_ESCA 0x04
> +#define ECB2_ZPCI_LSI 0x02
> __u8 ecb2; /* 0x0062 */
> +#define ECB3_AISI 0x20
> +#define ECB3_AISII 0x10
> #define ECB3_DEA 0x08
> #define ECB3_AES 0x04
> #define ECB3_RI 0x01
> @@ -938,6 +941,7 @@ struct kvm_arch{
> int use_cmma;
> int use_pfmfi;
> int use_skf;
> + int use_zpci_interp;
> int user_cpu_state_ctrl;
> int user_sigp;
> int user_stsi;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index a680f2a02b67..361d742cdf0d 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1023,6 +1023,47 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
> return 0;
> }
>
> +static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
> +{
> + /*
> + * If the facilities aren't available for PCI interpretation and
> + * interrupt forwarding, we shouldn't be here.
> + */

This reads like we want a WARN_ON or BUG_ON, but as we call this unconditionally, this is
actually a valid check. So instead of "shouldn't be here", say something like "bail out
if interpretation is not active"?

> + if (!vcpu->kvm->arch.use_zpci_interp)
> + return;
> +
> + vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
> + vcpu->arch.sie_block->ecb3 |= ECB3_AISII + ECB3_AISI;
> +}
> +
> +void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
> +{
> + struct kvm_vcpu *vcpu;
> + int i;
> +
> + /*
> + * If host facilities are available, turn on interpretation for the
> + * life of this guest
> + */
> + if (!test_facility(69) || !test_facility(70) || !test_facility(71) ||
> + !test_facility(72))
> + return;

Wouldn't that also enable interpretation for VSIE? I guess we should check for the
sclp facilities from patches 1, 2, 3, and 4 instead.


> +
> + mutex_lock(&kvm->lock);
> +
> + kvm->arch.use_zpci_interp = 1;
> +
> + kvm_s390_vcpu_block_all(kvm);
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + kvm_s390_vcpu_pci_setup(vcpu);
> + kvm_s390_sync_request(KVM_REQ_VSIE_RESTART, vcpu);
> + }
> +
> + kvm_s390_vcpu_unblock_all(kvm);
> + mutex_unlock(&kvm->lock);
> +}
> +
> static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req)
> {
> int cx;
> @@ -3288,6 +3329,8 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>
> kvm_s390_vcpu_crypto_setup(vcpu);
>
> + kvm_s390_vcpu_pci_setup(vcpu);
> +
> mutex_lock(&vcpu->kvm->lock);
> if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index c07a050d757d..a2eccb8b977e 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -481,6 +481,16 @@ void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
> */
> void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
>
> +/**
> + * kvm_s390_vcpu_pci_enable_interp
> + *
> + * Set the associated PCI attributes for each vcpu to allow for zPCI Load/Store
> + * interpretation as well as adapter interruption forwarding.
> + *
> + * @kvm: the KVM guest
> + */
> +void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm);
> +
> /**
> * diag9c_forwarding_hz
> *
>

2021-12-10 14:22:02

by Matthew Rosato

Subject: Re: [PATCH 19/32] KVM: s390: mechanism to enable guest zPCI Interpretation

On 12/10/21 8:27 AM, Christian Borntraeger wrote:
>
>
> On 07.12.21 at 21:57, Matthew Rosato wrote:
>> The guest must have access to certain facilities in order to allow
>> interpretive execution of zPCI instructions and adapter event
>> notifications.  However, there are some cases where a guest might
>> disable interpretation -- provide a mechanism via which we can defer
>> enabling the associated zPCI interpretation facilities until the guest
>> indicates it wishes to use them.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/include/asm/kvm_host.h |  4 +++
>>   arch/s390/kvm/kvm-s390.c         | 43 ++++++++++++++++++++++++++++++++
>>   arch/s390/kvm/kvm-s390.h         | 10 ++++++++
>>   3 files changed, 57 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h
>> b/arch/s390/include/asm/kvm_host.h
>> index 3f147b8d050b..38982c1de413 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
>>   #define ECB2_IEP    0x20
>>   #define ECB2_PFMFI    0x08
>>   #define ECB2_ESCA    0x04
>> +#define ECB2_ZPCI_LSI    0x02
>>       __u8    ecb2;                   /* 0x0062 */
>> +#define ECB3_AISI    0x20
>> +#define ECB3_AISII    0x10
>>   #define ECB3_DEA 0x08
>>   #define ECB3_AES 0x04
>>   #define ECB3_RI  0x01
>> @@ -938,6 +941,7 @@ struct kvm_arch{
>>       int use_cmma;
>>       int use_pfmfi;
>>       int use_skf;
>> +    int use_zpci_interp;
>>       int user_cpu_state_ctrl;
>>       int user_sigp;
>>       int user_stsi;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index a680f2a02b67..361d742cdf0d 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1023,6 +1023,47 @@ static int kvm_s390_vm_set_crypto(struct kvm
>> *kvm, struct kvm_device_attr *attr)
>>       return 0;
>>   }
>> +static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
>> +{
>> +    /*
>> +     * If the facilities aren't available for PCI interpretation and
>> +     * interrupt forwarding, we shouldn't be here.
>> +     */
>
> This reads like we want a WARN_ON or BUG_ON, but as we call this
> unconditionally, this is actually a valid check. So instead of
> "shouldn't be here", say something like "bail out if interpretation
> is not active"?

Right, this comment block is plain wrong. We expect to get here under
multiple circumstances, and it's OK for this bit to be off:
- initial vcpu setup (use_zpci_interp is off)
- Right after we set use_zpci_interp=1 (turn on ECB for all vcpu)
- hotplug vcpu setup (use_zpci_interp might be on or off)

Will re-word.
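
One possible re-wording, shown as a userspace model rather than the actual follow-up patch: the ECB bit values are taken from the quoted kvm_host.h hunk, while struct vcpu_model and the explicit use_zpci_interp parameter are simplifications assumed for the example.

```c
#include <stdint.h>

#define ECB2_ZPCI_LSI	0x02
#define ECB3_AISI	0x20
#define ECB3_AISII	0x10

/* Reduced stand-in for the two SIE block bytes touched here. */
struct vcpu_model {
	uint8_t ecb2;
	uint8_t ecb3;
};

static void vcpu_pci_setup(struct vcpu_model *vcpu, int use_zpci_interp)
{
	/*
	 * This runs on every vcpu setup (initial, hotplug, and right
	 * after interpretation is switched on), so simply bail out if
	 * zPCI interpretation is not active for this guest.
	 */
	if (!use_zpci_interp)
		return;

	vcpu->ecb2 |= ECB2_ZPCI_LSI;
	vcpu->ecb3 |= ECB3_AISII | ECB3_AISI;
}
```

The sketch uses | rather than + to combine the two ECB3 flags; the result is the same for distinct bits, but | is the more conventional way to build a mask.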

>
>> +    if (!vcpu->kvm->arch.use_zpci_interp)
>> +        return;
>> +
>> +    vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
>> +    vcpu->arch.sie_block->ecb3 |= ECB3_AISII + ECB3_AISI;
>> +}
>> +
>> +void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
>> +{
>> +    struct kvm_vcpu *vcpu;
>> +    int i;
>> +
>> +    /*
>> +     * If host facilities are available, turn on interpretation for the
>> +     * life of this guest
>> +     */
>> +    if (!test_facility(69) || !test_facility(70) ||
>> !test_facility(71) ||
>> +        !test_facility(72))
>> +        return;
>
> Wouldn't that also enable interpretation for VSIE? I guess we should
> check for the sclp facilities from patches 1, 2, 3, and 4 instead.
>

Good point -- will change.
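
A rough sketch of what gating on the SCLP-detected facilities could look like, again as a userspace model. Only has_aeni appears in the quoted hunks; the other three member names, struct sclp_model, and zpci_interp_allowed() are assumptions for illustration, loosely derived from the titles of patches 1-4.

```c
#include <stdbool.h>

/* Reduced stand-in for the host's SCLP facility information. */
struct sclp_model {
	bool has_zpci_lsi;	/* assumed name: zPCI interpretation */
	bool has_aisii;		/* assumed name: AISII facility */
	bool has_aeni;		/* name as used in the quoted gib_init hunk */
	bool has_aisi;		/* assumed name: AISI facility */
};

/* Allow interpretation only if all four host facilities are present. */
static bool zpci_interp_allowed(const struct sclp_model *s)
{
	return s->has_zpci_lsi && s->has_aisii &&
	       s->has_aeni && s->has_aisi;
}
```

The point of checking these bits instead of test_facility(69..72) is that they describe what the host itself detected at boot, so a nested (VSIE) setup is not enabled by accident.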



2021-12-10 19:04:02

by Eric Farman

Subject: Re: [PATCH 08/32] s390/pci: stash associated GISA designation

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> For passthrough devices, we will need to know the GISA designation of
> the
> guest if interpretation facilities are to be used. Setup to stash
> this in
> the zdev and set a default of 0 (no GISA designation) for now; a
> subsequent
> patch will set a valid GISA designation for passthrough devices.
> Also, extend mpcific routines to specify this stashed designation as
> part
> of the mpcific command.
>
> Reviewed-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Eric Farman <[email protected]>

> ---
> arch/s390/include/asm/pci.h | 1 +
> arch/s390/include/asm/pci_clp.h | 3 ++-
> arch/s390/pci/pci.c | 9 +++++++++
> arch/s390/pci/pci_clp.c | 1 +
> arch/s390/pci/pci_irq.c | 5 +++++
> 5 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 90824be5ce9a..2474b8d30f2a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -123,6 +123,7 @@ struct zpci_dev {
> enum zpci_state state;
> u32 fid; /* function ID, used by sclp */
> u32 fh; /* function handle, used by insn's */
> + u32 gd; /* GISA designation for passthrough */
> u16 vfn; /* virtual function number */
> u16 pchid; /* physical channel ID */
> u8 pfgid; /* function group ID */
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 1f4b666e85ee..3af8d196da74 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -173,7 +173,8 @@ struct clp_req_set_pci {
> u16 reserved2;
> u8 oc; /* operation controls */
> u8 ndas; /* number of dma spaces */
> - u64 reserved3;
> + u32 reserved3;
> + u32 gd; /* GISA designation */
> } __packed;
>
> /* Set PCI function response */
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 2f9b78fa82a5..9b4d3d78b444 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -119,6 +119,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
> fib.pba = base;
> fib.pal = limit;
> fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc)
> zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -132,6 +133,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
> struct zpci_fib fib = {0};
> u8 cc, status;
>
> + fib.gd = zdev->gd;
> +
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc)
> zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -159,6 +162,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
> atomic64_set(&zdev->unmapped_pages, 0);
>
> fib.fmb_addr = virt_to_phys(zdev->fmb);
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc) {
> kmem_cache_free(zdev_fmb_cache, zdev->fmb);
> @@ -177,6 +181,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
> if (!zdev->fmb)
> return -EINVAL;
>
> + fib.gd = zdev->gd;
> +
> /* Function measurement is disabled if fmb address is zero */
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3) /* Function already gone. */
> @@ -807,6 +813,9 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, enum zpci_state state)
> zdev->fid = fid;
> zdev->fh = fh;
>
> + /* For now, assume it is not a passthrough device */
> + zdev->gd = 0;
> +
> /* Query function properties and update zdev */
> rc = clp_query_pci_fn(zdev);
> if (rc)
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index be077b39da33..e9ed0e4a5cf0 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -240,6 +240,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
> rrb->request.fh = zdev->fh;
> rrb->request.oc = command;
> rrb->request.ndas = nr_dma_as;
> + rrb->request.gd = zdev->gd;
>
> rc = clp_req(rrb, CLP_LPS_PCI);
> if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 6b29e39496d1..9e8b4507234d 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
> fib.fmt0.aibvo = 0; /* each zdev has its own interrupt vector */
> fib.fmt0.aisb = (unsigned long) zpci_sbv->vector + (zdev->aisb/64)*8;
> fib.fmt0.aisbo = zdev->aisb & 63;
> + fib.gd = zdev->gd;
>
> return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> }
> @@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
> struct zpci_fib fib = {0};
> u8 cc, status;
>
> + fib.gd = zdev->gd;
> +
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3 || (cc == 1 && status == 24))
> /* Function already gone or IRQs already deregistered. */
> @@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
> fib.fmt = 1;
> fib.fmt1.noi = zdev->msi_nr_irqs;
> fib.fmt1.dibvo = zdev->msi_first_bit;
> + fib.gd = zdev->gd;
>
> return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> }
> @@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
> u8 cc, status;
>
> fib.fmt = 1;
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3 || (cc == 1 && status == 24))
> /* Function already gone or IRQs already deregistered. */


2021-12-10 19:04:50

by Eric Farman

Subject: Re: [PATCH 11/32] s390/pci: add helper function to find device by handle

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> Intercepted zPCI instructions will specify the desired function via a
> function handle. Add a routine to find the device with the specified
> handle.
>
> Acked-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Eric Farman <[email protected]>

> ---
> arch/s390/include/asm/pci.h | 1 +
> arch/s390/pci/pci.c | 16 ++++++++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 1a8f9f42da3a..00a2c24d6d2b 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -275,6 +275,7 @@ static inline struct zpci_dev *to_zpci_dev(struct device *dev)
> }
>
> struct zpci_dev *get_zdev_by_fid(u32);
> +struct zpci_dev *get_zdev_by_fh(u32 fh);
>
> /* DMA */
> int zpci_dma_init(void);
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 9b4d3d78b444..af1c0ae017b1 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -76,6 +76,22 @@ struct zpci_dev *get_zdev_by_fid(u32 fid)
> return zdev;
> }
>
> +struct zpci_dev *get_zdev_by_fh(u32 fh)
> +{
> + struct zpci_dev *tmp, *zdev = NULL;
> +
> + spin_lock(&zpci_list_lock);
> + list_for_each_entry(tmp, &zpci_list, entry) {
> + if (tmp->fh == fh) {
> + zdev = tmp;
> + break;
> + }
> + }
> + spin_unlock(&zpci_list_lock);
> + return zdev;
> +}
> +EXPORT_SYMBOL_GPL(get_zdev_by_fh);
> +
> void zpci_remove_reserved_devices(void)
> {
> struct zpci_dev *tmp, *zdev;


2021-12-10 19:05:08

by Eric Farman

Subject: Re: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> This structure will be used to carry kvm passthrough information related to
> zPCI devices.
>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Eric Farman <[email protected]>

> ---
> arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++
> arch/s390/include/asm/pci.h | 3 ++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/pci.c | 57 +++++++++++++++++++++++++++++++++
> 4 files changed, 90 insertions(+), 1 deletion(-)
> create mode 100644 arch/s390/include/asm/kvm_pci.h
> create mode 100644 arch/s390/kvm/pci.c
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> new file mode 100644
> index 000000000000..3e491a39704c
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KVM PCI Passthrough for virtual machines on s390
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +
> +#ifndef ASM_KVM_PCI_H
> +#define ASM_KVM_PCI_H
> +
> +#include <linux/types.h>
> +#include <linux/kvm_types.h>
> +#include <linux/kvm_host.h>
> +#include <linux/kvm.h>
> +#include <linux/pci.h>
> +
> +struct kvm_zdev {
> + struct zpci_dev *zdev;
> + struct kvm *kvm;
> +};
> +
> +extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> +extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
> +
> +#endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 86f43644756d..32810e1ed308 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
> };
>
> struct s390_domain;
> +struct kvm_zdev;
>
> #define ZPCI_FUNCTIONS_PER_BUS 256
> struct zpci_bus {
> @@ -190,6 +191,8 @@ struct zpci_dev {
> struct dentry *debugfs_dev;
>
> struct s390_domain *s390_domain; /* s390 IOMMU domain data */
> +
> + struct kvm_zdev *kzdev; /* passthrough data */
> };
>
> static inline bool zdev_enabled(struct zpci_dev *zdev)
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index b3aaadc60ead..95ea865e5d29 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> new file mode 100644
> index 000000000000..ecfc458a5b39
> --- /dev/null
> +++ b/arch/s390/kvm/pci.c
> @@ -0,0 +1,57 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <asm/kvm_pci.h>
> +
> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (zdev == NULL)
> + return -ENODEV;
> +
> + kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
> + if (!kzdev)
> + return -ENOMEM;
> +
> + kzdev->zdev = zdev;
> + zdev->kzdev = kzdev;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
> +
> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (!zdev || !zdev->kzdev)
> + return;
> +
> + kzdev = zdev->kzdev;
> + WARN_ON(kzdev->zdev != zdev);
> + zdev->kzdev = 0;
> + kfree(kzdev);
> +
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
> +
> +int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> +{
> + struct kvm_zdev *kzdev = zdev->kzdev;
> +
> + if (!kzdev)
> + return -ENODEV;
> +
> + kzdev->kvm = kvm;
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);


2021-12-10 21:51:52

by Eric Farman

Subject: Re: [PATCH 15/32] KVM: s390: pci: enable host forwarding of Adapter Event Notifications

On Tue, 2021-12-07 at 15:57 -0500, Matthew Rosato wrote:
> In cases where interrupts are not forwarded to the guest via firmware,
> KVM is responsible for ensuring delivery. When an interrupt presents
> with the forwarding bit, we must process the forwarding tables until
> all interrupts are delivered.
>
> Signed-off-by: Matthew Rosato <[email protected]>

Nits below regarding 0-vs-NULL, but otherwise:

Reviewed-by: Eric Farman <[email protected]>

> ---
> arch/s390/include/asm/kvm_host.h | 1 +
> arch/s390/include/asm/tpi.h | 14 ++++++
> arch/s390/kvm/interrupt.c | 76 +++++++++++++++++++++++++++++++-
> arch/s390/kvm/kvm-s390.c | 3 +-
> arch/s390/kvm/pci.h | 9 ++++
> 5 files changed, 101 insertions(+), 2 deletions(-)
>
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index a604d51acfc8..3f147b8d050b 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -757,6 +757,7 @@ struct kvm_vm_stat {
> u64 inject_pfault_done;
> u64 inject_service_signal;
> u64 inject_virtio;
> + u64 aen_forward;
> };
>
> struct kvm_arch_memory_slot {
> diff --git a/arch/s390/include/asm/tpi.h b/arch/s390/include/asm/tpi.h
> index 1ac538b8cbf5..47a531fdb15b 100644
> --- a/arch/s390/include/asm/tpi.h
> +++ b/arch/s390/include/asm/tpi.h
> @@ -19,6 +19,20 @@ struct tpi_info {
> u32 :12;
> } __packed __aligned(4);
>
> +/* I/O-Interruption Code as stored by TPI for an Adapter I/O */
> +struct tpi_adapter_info {
> + u32 :1;
> + u32 pci:1;
> + u32 :28;
> + u32 error:1;
> + u32 forward:1;
> + u32 reserved;
> + u32 adapter_IO:1;
> + u32 directed_irq:1;
> + u32 isc:3;
> + u32 :27;
> +} __packed __aligned(4);
> +
> #endif /* __ASSEMBLY__ */
>
> #endif /* _ASM_S390_TPI_H */
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 4efe0e95a40f..c6ff871a6ed1 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -3263,11 +3263,85 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
> }
> EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
>
> +static void aen_host_forward(struct zpci_aift *aift, unsigned long si)
> +{
> + struct kvm_s390_gisa_interrupt *gi;
> + struct zpci_gaite *gaite;
> + struct kvm *kvm;
> +
> + gaite = (struct zpci_gaite *)aift->gait +
> + (si * sizeof(struct zpci_gaite));
> + if (gaite->count == 0)
> + return;
> + if (gaite->aisb != 0)
> + set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
> +
> + kvm = kvm_s390_pci_si_to_kvm(aift, si);
> + if (kvm == 0)

if (!kvm)

> + return;
> + gi = &kvm->arch.gisa_int;
> +
> + if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
> + !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
> + gisa_set_ipm_gisc(gi->origin, gaite->gisc);
> + if (hrtimer_active(&gi->timer))
> + hrtimer_cancel(&gi->timer);
> + hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
> + kvm->stat.aen_forward++;
> + }
> +}
> +
> +static void aen_process_gait(u8 isc)
> +{
> + bool found = false, first = true;
> + union zpci_sic_iib iib = {{0}};
> + unsigned long si, flags;
> + struct zpci_aift *aift;
> +
> + aift = kvm_s390_pci_get_aift();
> + spin_lock_irqsave(&aift->gait_lock, flags);
> +
> + if (!aift->gait) {
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> + return;
> + }
> +
> + for (si = 0;;) {
> + /* Scan adapter summary indicator bit vector */
> + si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
> + if (si == -1UL) {
> + if (first || found) {
> + /* Reenable interrupts. */
> + if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
> + &iib))
> + break;
> + first = found = false;
> + } else {
> + /* Interrupts on and all bits processed */
> + break;
> + }
> + found = false;
> + si = 0;
> + continue;
> + }
> + found = true;
> + aen_host_forward(aift, si);
> + }
> +
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> +}
> +
> static void gib_alert_irq_handler(struct airq_struct *airq,
> struct tpi_info *tpi_info)
> {
> + struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
> +
> inc_irq_stat(IRQIO_GAL);
> - process_gib_alert_list();
> +
> + if (info->forward || info->error)
> + aen_process_gait(info->isc);
> + else
> + process_gib_alert_list();
> }
>
> static struct airq_struct gib_alert_irq = {
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 9cd3c8eb59e8..c8fe9b7c2395 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -65,7 +65,8 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
> STATS_DESC_COUNTER(VM, inject_float_mchk),
> STATS_DESC_COUNTER(VM, inject_pfault_done),
> STATS_DESC_COUNTER(VM, inject_service_signal),
> - STATS_DESC_COUNTER(VM, inject_virtio)
> + STATS_DESC_COUNTER(VM, inject_virtio),
> + STATS_DESC_COUNTER(VM, aen_forward)
> };
>
> const struct kvm_stats_header kvm_vm_stats_header = {
> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
> index 74b06d39be3b..776b2745c675 100644
> --- a/arch/s390/kvm/pci.h
> +++ b/arch/s390/kvm/pci.h
> @@ -12,6 +12,7 @@
>
> #include <linux/pci.h>
> #include <linux/mutex.h>
> +#include <linux/kvm_host.h>
> #include <asm/airq.h>
> #include <asm/kvm_pci.h>
>
> @@ -32,6 +33,14 @@ struct zpci_aift {
> struct mutex lock; /* Protects the other structures in aift */
> };
>
> +static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
> + unsigned long si)
> +{
> + if (aift->kzdev == 0 || aift->kzdev[si] == 0)
> + return 0;

Check against and return NULL here instead of 0.

> + return aift->kzdev[si]->kvm;
> +};
> +
> struct zpci_aift *kvm_s390_pci_get_aift(void);
>
> int kvm_s390_pci_aen_init(u8 nisc);
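
For reference, the NULL-style form of the lookup helper could look something like the sketch below. This is a userspace approximation only: the demo_ types are hypothetical stand-ins for struct kvm, struct kvm_zdev and struct zpci_aift, and just the pointer checks are meant to carry over.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct demo_kvm   { int id; };
struct demo_kzdev { struct demo_kvm *kvm; };
struct demo_aift  { struct demo_kzdev **kzdev; };

/* Pointers are tested with '!' and the "not found" result is NULL. */
static struct demo_kvm *demo_si_to_kvm(const struct demo_aift *aift,
				       unsigned long si)
{
	if (!aift->kzdev || !aift->kzdev[si])
		return NULL;
	return aift->kzdev[si]->kvm;
}
```

The caller side then reads `if (!kvm) return;`, matching the suggestion made inline above.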


2021-12-13 14:34:15

by Pierre Morel

Subject: Re: [PATCH 05/32] s390/airq: pass more TPI info to airq handlers



On 12/7/21 21:57, Matthew Rosato wrote:
> A subsequent patch will introduce an airq handler that requires additional
> TPI information beyond directed vs floating, so pass the entire tpi_info
> structure via the handler. Only pci actually uses this information today,
> for the other airq handlers this is effectively a no-op.
>
> Reviewed-by: Eric Farman <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>

Reviewed-by: Pierre Morel <[email protected]>


> ---
> arch/s390/include/asm/airq.h | 3 ++-
> arch/s390/kvm/interrupt.c | 4 +++-
> arch/s390/pci/pci_irq.c | 9 +++++++--
> drivers/s390/cio/airq.c | 2 +-
> drivers/s390/cio/qdio_thinint.c | 6 ++++--
> drivers/s390/crypto/ap_bus.c | 9 ++++++---
> drivers/s390/virtio/virtio_ccw.c | 4 +++-
> 7 files changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 01936fdfaddb..7918a7d09028 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -12,10 +12,11 @@
>
> #include <linux/bit_spinlock.h>
> #include <linux/dma-mapping.h>
> +#include <asm/tpi.h>
>
> struct airq_struct {
> struct hlist_node list; /* Handler queueing. */
> - void (*handler)(struct airq_struct *airq, bool floating);
> + void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
> u8 *lsi_ptr; /* Local-Summary-Indicator pointer */
> u8 lsi_mask; /* Local-Summary-Indicator mask */
> u8 isc; /* Interrupt-subclass */
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index c3bd993fdd0c..f9b872e358c6 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -28,6 +28,7 @@
> #include <asm/switch_to.h>
> #include <asm/nmi.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
> #include "kvm-s390.h"
> #include "gaccess.h"
> #include "trace-s390.h"
> @@ -3261,7 +3262,8 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
> }
> EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
>
> -static void gib_alert_irq_handler(struct airq_struct *airq, bool floating)
> +static void gib_alert_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> inc_irq_stat(IRQIO_GAL);
> process_gib_alert_list();
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 954bb7a83124..880bcd73f11a 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -11,6 +11,7 @@
>
> #include <asm/isc.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
>
> static enum {FLOATING, DIRECTED} irq_delivery;
>
> @@ -216,8 +217,11 @@ static void zpci_handle_fallback_irq(void)
> }
> }
>
> -static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
> +static void zpci_directed_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> + bool floating = !tpi_info->directed_irq;
> +
> if (floating) {
> inc_irq_stat(IRQIO_PCF);
> zpci_handle_fallback_irq();
> @@ -227,7 +231,8 @@ static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
> }
> }
>
> -static void zpci_floating_irq_handler(struct airq_struct *airq, bool floating)
> +static void zpci_floating_irq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> unsigned long si, ai;
> struct airq_iv *aibv;
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index e56535c99888..2f2226786319 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -99,7 +99,7 @@ static irqreturn_t do_airq_interrupt(int irq, void *dummy)
> rcu_read_lock();
> hlist_for_each_entry_rcu(airq, head, list)
> if ((*airq->lsi_ptr & airq->lsi_mask) != 0)
> - airq->handler(airq, !tpi_info->directed_irq);
> + airq->handler(airq, tpi_info);
> rcu_read_unlock();
>
> return IRQ_HANDLED;
> diff --git a/drivers/s390/cio/qdio_thinint.c b/drivers/s390/cio/qdio_thinint.c
> index 8e09bf3a2fcd..9b9335dd06db 100644
> --- a/drivers/s390/cio/qdio_thinint.c
> +++ b/drivers/s390/cio/qdio_thinint.c
> @@ -15,6 +15,7 @@
> #include <asm/qdio.h>
> #include <asm/airq.h>
> #include <asm/isc.h>
> +#include <asm/tpi.h>
>
> #include "cio.h"
> #include "ioasm.h"
> @@ -93,9 +94,10 @@ static inline u32 clear_shared_ind(void)
> /**
> * tiqdio_thinint_handler - thin interrupt handler for qdio
> * @airq: pointer to adapter interrupt descriptor
> - * @floating: flag to recognize floating vs. directed interrupts (unused)
> + * @tpi_info: interrupt information (e.g. floating vs directed -- unused)
> */
> -static void tiqdio_thinint_handler(struct airq_struct *airq, bool floating)
> +static void tiqdio_thinint_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> u64 irq_time = S390_lowcore.int_clock;
> u32 si_used = clear_shared_ind();
> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
> index 1986243f9cd3..df1a038442db 100644
> --- a/drivers/s390/crypto/ap_bus.c
> +++ b/drivers/s390/crypto/ap_bus.c
> @@ -27,6 +27,7 @@
> #include <linux/kthread.h>
> #include <linux/mutex.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
> #include <linux/atomic.h>
> #include <asm/isc.h>
> #include <linux/hrtimer.h>
> @@ -129,7 +130,8 @@ static int ap_max_adapter_id = 63;
> static struct bus_type ap_bus_type;
>
> /* Adapter interrupt definitions */
> -static void ap_interrupt_handler(struct airq_struct *airq, bool floating);
> +static void ap_interrupt_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info);
>
> static bool ap_irq_flag;
>
> @@ -442,9 +444,10 @@ static enum hrtimer_restart ap_poll_timeout(struct hrtimer *unused)
> /**
> * ap_interrupt_handler() - Schedule ap_tasklet on interrupt
> * @airq: pointer to adapter interrupt descriptor
> - * @floating: ignored
> + * @tpi_info: ignored
> */
> -static void ap_interrupt_handler(struct airq_struct *airq, bool floating)
> +static void ap_interrupt_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> inc_irq_stat(IRQIO_APB);
> tasklet_schedule(&ap_tasklet);
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..52c376d15978 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -33,6 +33,7 @@
> #include <asm/virtio-ccw.h>
> #include <asm/isc.h>
> #include <asm/airq.h>
> +#include <asm/tpi.h>
>
> /*
> * virtio related functions
> @@ -203,7 +204,8 @@ static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
> write_unlock_irqrestore(&info->lock, flags);
> }
>
> -static void virtio_airq_handler(struct airq_struct *airq, bool floating)
> +static void virtio_airq_handler(struct airq_struct *airq,
> + struct tpi_info *tpi_info)
> {
> struct airq_info *info = container_of(airq, struct airq_info, airq);
> unsigned long ai;
>
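
As a quick illustration of the interface change, here is a small userspace sketch of a handler that takes the whole TPI info and derives "floating" itself. The demo_ types are hypothetical and only model the directed_irq and isc fields; the real struct tpi_info has more fields and a fixed layout.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of struct tpi_info. */
struct demo_tpi_info {
	unsigned int directed_irq:1;
	unsigned int isc:3;
};

struct demo_airq;
typedef void (*demo_handler_t)(struct demo_airq *airq,
			       struct demo_tpi_info *info);

/* Hypothetical, reduced model of struct airq_struct plus two counters. */
struct demo_airq {
	demo_handler_t handler;
	int floating_seen;
	int directed_seen;
};

/* The handler derives "floating" from the TPI info it is handed. */
static void demo_pci_handler(struct demo_airq *airq,
			     struct demo_tpi_info *info)
{
	bool floating = !info->directed_irq;

	if (floating)
		airq->floating_seen++;
	else
		airq->directed_seen++;
}

/* The dispatcher no longer reduces the info to a single bool. */
static void demo_dispatch(struct demo_airq *airq,
			  struct demo_tpi_info *info)
{
	airq->handler(airq, info);
}
```

Handlers that ignored the old bool can ignore the new pointer in the same way, which is why the change is a no-op for everything except pci.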

--
Pierre Morel
IBM Lab Boeblingen

2021-12-13 14:58:48

by Pierre Morel

Subject: Re: [PATCH 08/32] s390/pci: stash associated GISA designation



On 12/7/21 21:57, Matthew Rosato wrote:
> For passthrough devices, we will need to know the GISA designation of the
> guest if interpretation facilities are to be used. Setup to stash this in
> the zdev and set a default of 0 (no GISA designation) for now; a subsequent
> patch will set a valid GISA designation for passthrough devices.
> Also, extend mpcific routines to specify this stashed designation as part
> of the mpcific command.
>
> Reviewed-by: Niklas Schnelle <[email protected]>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/pci.h | 1 +
> arch/s390/include/asm/pci_clp.h | 3 ++-
> arch/s390/pci/pci.c | 9 +++++++++
> arch/s390/pci/pci_clp.c | 1 +
> arch/s390/pci/pci_irq.c | 5 +++++
> 5 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 90824be5ce9a..2474b8d30f2a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -123,6 +123,7 @@ struct zpci_dev {
> enum zpci_state state;
> u32 fid; /* function ID, used by sclp */
> u32 fh; /* function handle, used by insn's */
> + u32 gd; /* GISA designation for passthrough */
> u16 vfn; /* virtual function number */
> u16 pchid; /* physical channel ID */
> u8 pfgid; /* function group ID */
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 1f4b666e85ee..3af8d196da74 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -173,7 +173,8 @@ struct clp_req_set_pci {
> u16 reserved2;
> u8 oc; /* operation controls */
> u8 ndas; /* number of dma spaces */
> - u64 reserved3;
> + u32 reserved3;
> + u32 gd; /* GISA designation */
> } __packed;
>
> /* Set PCI function response */
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 2f9b78fa82a5..9b4d3d78b444 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -119,6 +119,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
> fib.pba = base;
> fib.pal = limit;
> fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc)
> zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -132,6 +133,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
> struct zpci_fib fib = {0};
> u8 cc, status;
>
> + fib.gd = zdev->gd;
> +
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc)
> zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -159,6 +162,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
> atomic64_set(&zdev->unmapped_pages, 0);
>
> fib.fmb_addr = virt_to_phys(zdev->fmb);
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc) {
> kmem_cache_free(zdev_fmb_cache, zdev->fmb);
> @@ -177,6 +181,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
> if (!zdev->fmb)
> return -EINVAL;
>
> + fib.gd = zdev->gd;
> +
> /* Function measurement is disabled if fmb address is zero */
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3) /* Function already gone. */
> @@ -807,6 +813,9 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, enum zpci_state state)
> zdev->fid = fid;
> zdev->fh = fh;
>
> + /* For now, assume it is not a passthrough device */
> + zdev->gd = 0;

This assignment is unnecessary: zdev is allocated with kzalloc(), so gd is
already zero.


> +
> /* Query function properties and update zdev */
> rc = clp_query_pci_fn(zdev);
> if (rc)
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index be077b39da33..e9ed0e4a5cf0 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -240,6 +240,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
> rrb->request.fh = zdev->fh;
> rrb->request.oc = command;
> rrb->request.ndas = nr_dma_as;
> + rrb->request.gd = zdev->gd;
>
> rc = clp_req(rrb, CLP_LPS_PCI);
> if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 6b29e39496d1..9e8b4507234d 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
> fib.fmt0.aibvo = 0; /* each zdev has its own interrupt vector */
> fib.fmt0.aisb = (unsigned long) zpci_sbv->vector + (zdev->aisb/64)*8;
> fib.fmt0.aisbo = zdev->aisb & 63;
> + fib.gd = zdev->gd;
>
> return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> }
> @@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
> struct zpci_fib fib = {0};
> u8 cc, status;
>
> + fib.gd = zdev->gd;
> +
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3 || (cc == 1 && status == 24))
> /* Function already gone or IRQs already deregistered. */
> @@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
> fib.fmt = 1;
> fib.fmt1.noi = zdev->msi_nr_irqs;
> fib.fmt1.dibvo = zdev->msi_first_bit;
> + fib.gd = zdev->gd;
>
> return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> }
> @@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
> u8 cc, status;
>
> fib.fmt = 1;
> + fib.gd = zdev->gd;
> cc = zpci_mod_fc(req, &fib, &status);
> if (cc == 3 || (cc == 1 && status == 24))
> /* Function already gone or IRQs already deregistered. */
>

With that correction:
Reviewed-by: Pierre Morel <[email protected]>
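
To spell out the remark about the zeroed allocation, a small userspace sketch follows. calloc() is used here as a stand-in for kzalloc(), and struct demo_zdev is a hypothetical three-field reduction of struct zpci_dev.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reduction of struct zpci_dev. */
struct demo_zdev {
	uint32_t fid;
	uint32_t fh;
	uint32_t gd;
};

static struct demo_zdev *demo_create_device(uint32_t fid, uint32_t fh)
{
	/* calloc() zero-fills the object, as kzalloc() does in the kernel. */
	struct demo_zdev *zdev = calloc(1, sizeof(*zdev));

	if (!zdev)
		return NULL;

	zdev->fid = fid;
	zdev->fh = fh;
	/* No "zdev->gd = 0;" needed: the field is already zero. */
	return zdev;
}
```

If the default ever needs documenting, a comment at the field declaration would do that without the extra store.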


--
Pierre Morel
IBM Lab Boeblingen

2021-12-13 15:18:52

by Pierre Morel

Subject: Re: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure



On 12/7/21 21:57, Matthew Rosato wrote:
> This structure will be used to carry kvm passthrough information related to
> zPCI devices.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++
> arch/s390/include/asm/pci.h | 3 ++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/pci.c | 57 +++++++++++++++++++++++++++++++++
> 4 files changed, 90 insertions(+), 1 deletion(-)
> create mode 100644 arch/s390/include/asm/kvm_pci.h
> create mode 100644 arch/s390/kvm/pci.c
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> new file mode 100644
> index 000000000000..3e491a39704c
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KVM PCI Passthrough for virtual machines on s390
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +
> +#ifndef ASM_KVM_PCI_H
> +#define ASM_KVM_PCI_H
> +
> +#include <linux/types.h>
> +#include <linux/kvm_types.h>
> +#include <linux/kvm_host.h>
> +#include <linux/kvm.h>
> +#include <linux/pci.h>
> +
> +struct kvm_zdev {
> + struct zpci_dev *zdev;
> + struct kvm *kvm;
> +};
> +
> +extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> +extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
> +
> +#endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 86f43644756d..32810e1ed308 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
> };
>
> struct s390_domain;
> +struct kvm_zdev;
>
> #define ZPCI_FUNCTIONS_PER_BUS 256
> struct zpci_bus {
> @@ -190,6 +191,8 @@ struct zpci_dev {
> struct dentry *debugfs_dev;
>
> struct s390_domain *s390_domain; /* s390 IOMMU domain data */
> +
> + struct kvm_zdev *kzdev; /* passthrough data */
> };
>
> static inline bool zdev_enabled(struct zpci_dev *zdev)
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index b3aaadc60ead..95ea865e5d29 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> new file mode 100644
> index 000000000000..ecfc458a5b39
> --- /dev/null
> +++ b/arch/s390/kvm/pci.c
> @@ -0,0 +1,57 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <asm/kvm_pci.h>
> +
> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (zdev == NULL)
> + return -ENODEV;

This check is not needed: why would this function be called with a NULL
argument? The only caller at the moment already checks it.

> +
> + kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
> + if (!kzdev)
> + return -ENOMEM;
> +
> + kzdev->zdev = zdev;
> + zdev->kzdev = kzdev;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
> +
> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (!zdev || !zdev->kzdev)
> + return;

same here

> +
> + kzdev = zdev->kzdev;
> + WARN_ON(kzdev->zdev != zdev);
> + zdev->kzdev = 0;
> + kfree(kzdev);
> +
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
> +
> +int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> +{
> + struct kvm_zdev *kzdev = zdev->kzdev;
> +
> + if (!kzdev)
> + return -ENODEV;

and here

> +
> + kzdev->kvm = kvm;
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 09:15:21

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 20/32] KVM: s390: pci: provide routines for enabling/disabling interpretation



On 12/7/21 21:57, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for zPCI Load/Store
> interpretation.
>
> The first time such a request is received, enable the necessary facilities
> for the guest.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 4 ++
> arch/s390/kvm/pci.c | 91 +++++++++++++++++++++++++++++++++
> arch/s390/pci/pci.c | 3 ++
> 3 files changed, 98 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 3e491a39704c..5d6283acb54c 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -26,4 +26,8 @@ extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>
> +extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);

The "extern" keyword should be avoided on function prototypes in .h files.
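
For illustration only: in C a function declaration has external linkage
by default, so the keyword adds nothing, and the kernel coding style asks
to leave it out of function declarations. A minimal userspace sketch; the
function names are hypothetical and not taken from the patch:

```c
#include <assert.h>

/* Both declarations below have external linkage. */
int add_one(int x);        /* no "extern": external linkage is the default */
extern int add_two(int x); /* explicit "extern": same meaning, longer line */

int add_one(int x)
{
	return x + 1;
}

int add_two(int x)
{
	return x + 2;
}

/* Both functions are callable in exactly the same way. */
static void extern_example(void)
{
	assert(add_one(1) == 2);
	assert(add_two(1) == 3);
}
```

Applied to the header above, this would simply mean dropping the keyword
from each of the prototypes.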


> +
> #endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index f0e5386ff943..57cbe3827ea6 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -10,7 +10,9 @@
> #include <linux/kvm_host.h>
> #include <linux/pci.h>
> #include <asm/kvm_pci.h>
> +#include <asm/sclp.h>
> #include "pci.h"
> +#include "kvm-s390.h"
>
> static struct zpci_aift aift;
>
> @@ -118,6 +120,95 @@ int kvm_s390_pci_aen_init(u8 nisc)
> return rc;
> }
>
> +int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
> +{
> + if (!(sclp.has_zpci_interp && test_facility(69)))
> + return -EINVAL;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_probe);
> +
> +int kvm_s390_pci_interp_enable(struct zpci_dev *zdev)
> +{
> + u32 gd;
> + int rc;
> +
> + /*
> + * If this is the first request to use an interpreted device, make the
> + * necessary vcpu changes
> + */
> + if (!zdev->kzdev->kvm->arch.use_zpci_interp)
> + kvm_s390_vcpu_pci_enable_interp(zdev->kzdev->kvm);
> +
> + /*
> + * In the event of a system reset in userspace, the GISA designation
> + * may still be assigned because the device is still enabled.
> + * Verify it's the same guest before proceeding.
> + */
> + gd = (u32)(u64)&zdev->kzdev->kvm->arch.sie_page2->gisa;
> + if (zdev->gd != 0 && zdev->gd != gd)
> + return -EPERM;
> +
> + if (zdev_enabled(zdev)) {
> + zdev->gd = 0;
> + rc = zpci_disable_device(zdev);
> + if (rc)
> + return rc;
> + }
> +
> + /*
> + * Store information about the identity of the kvm guest allowed to
> + * access this device via interpretation to be used by host CLP
> + */
> + zdev->gd = gd;
> +
> + rc = zpci_enable_device(zdev);
> + if (rc)
> + goto err;
> +
> + /* Re-register the IOMMU that was already created */
> + rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
> + (u64)zdev->dma_table);
> + if (rc)
> + goto err;
> +
> + return rc;
> +
> +err:
> + zdev->gd = 0;
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_enable);
> +
> +int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
> +{
> + int rc;
> +
> + if (zdev->gd == 0)
> + return -EINVAL;
> +
> + /* Remove the host CLP guest designation */
> + zdev->gd = 0;
> +
> + if (zdev_enabled(zdev)) {
> + rc = zpci_disable_device(zdev);
> + if (rc)
> + return rc;
> + }
> +
> + rc = zpci_enable_device(zdev);
> + if (rc)
> + return rc;
> +
> + /* Re-register the IOMMU that was already created */
> + rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
> + (u64)zdev->dma_table);
> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_disable);
> +
> int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> {
> struct kvm_zdev *kzdev;
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 175854c861cd..0eac84387f3c 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -141,6 +141,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
> zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> return cc;
> }
> +EXPORT_SYMBOL_GPL(zpci_register_ioat);
>
> /* Modify PCI: Unregister I/O address translation parameters */
> int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
> @@ -740,6 +741,7 @@ int zpci_enable_device(struct zpci_dev *zdev)
> zpci_update_fh(zdev, fh);
> return rc;
> }
> +EXPORT_SYMBOL_GPL(zpci_enable_device);
>
> int zpci_disable_device(struct zpci_dev *zdev)
> {
> @@ -763,6 +765,7 @@ int zpci_disable_device(struct zpci_dev *zdev)
> }
> return rc;
> }
> +EXPORT_SYMBOL_GPL(zpci_disable_device);
>
> /**
> * zpci_hot_reset_device - perform a reset of the given zPCI function
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 09:15:34

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure



On 12/7/21 21:57, Matthew Rosato wrote:
> This structure will be used to carry kvm passthrough information related to
> zPCI devices.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++
> arch/s390/include/asm/pci.h | 3 ++
> arch/s390/kvm/Makefile | 2 +-
> arch/s390/kvm/pci.c | 57 +++++++++++++++++++++++++++++++++
> 4 files changed, 90 insertions(+), 1 deletion(-)
> create mode 100644 arch/s390/include/asm/kvm_pci.h
> create mode 100644 arch/s390/kvm/pci.c
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> new file mode 100644
> index 000000000000..3e491a39704c
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KVM PCI Passthrough for virtual machines on s390
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +
> +#ifndef ASM_KVM_PCI_H
> +#define ASM_KVM_PCI_H
> +
> +#include <linux/types.h>
> +#include <linux/kvm_types.h>
> +#include <linux/kvm_host.h>
> +#include <linux/kvm.h>
> +#include <linux/pci.h>
> +
> +struct kvm_zdev {
> + struct zpci_dev *zdev;
> + struct kvm *kvm;
> +};
> +
> +extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> +extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);

No need for "extern" in the prototype definition.


> +
> +#endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 86f43644756d..32810e1ed308 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
> };
>
> struct s390_domain;
> +struct kvm_zdev;
>
> #define ZPCI_FUNCTIONS_PER_BUS 256
> struct zpci_bus {
> @@ -190,6 +191,8 @@ struct zpci_dev {
> struct dentry *debugfs_dev;
>
> struct s390_domain *s390_domain; /* s390 IOMMU domain data */
> +
> + struct kvm_zdev *kzdev; /* passthrough data */
> };
>
> static inline bool zdev_enabled(struct zpci_dev *zdev)
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index b3aaadc60ead..95ea865e5d29 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o
>
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> new file mode 100644
> index 000000000000..ecfc458a5b39
> --- /dev/null
> +++ b/arch/s390/kvm/pci.c
> @@ -0,0 +1,57 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + * Author(s): Matthew Rosato <[email protected]>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <asm/kvm_pci.h>
> +
> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (zdev == NULL)
> + return -ENODEV;
> +
> + kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
> + if (!kzdev)
> + return -ENOMEM;
> +
> + kzdev->zdev = zdev;
> + zdev->kzdev = kzdev;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
> +
> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev;
> +
> + if (!zdev || !zdev->kzdev)
> + return;
> +
> + kzdev = zdev->kzdev;
> + WARN_ON(kzdev->zdev != zdev);
> + zdev->kzdev = 0;
> + kfree(kzdev);
> +
> +}

No need for a blank line before the end of the function.


> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
> +
> +int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> +{
> + struct kvm_zdev *kzdev = zdev->kzdev;
> +
> + if (!kzdev)
> + return -ENODEV;
> +
> + kzdev->kvm = kvm;
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 09:25:48

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 21/32] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding



On 12/7/21 21:57, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for Adapter Event
> Notifications / Adapter Interuption Forwarding.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 7 ++
> arch/s390/kvm/pci.c | 199 ++++++++++++++++++++++++++++++++
> arch/s390/pci/pci_insn.c | 1 +
> 3 files changed, 207 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 5d6283acb54c..54a0afdbe7d0 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -16,16 +16,23 @@
> #include <linux/kvm_host.h>
> #include <linux/kvm.h>
> #include <linux/pci.h>
> +#include <asm/pci_insn.h>
>
> struct kvm_zdev {
> struct zpci_dev *zdev;
> struct kvm *kvm;
> + struct zpci_fib fib;
> };
>
> extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> extern void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> extern int kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>
> +extern int kvm_s390_pci_aif_probe(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
> + bool assist);
> +extern int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
> +

No need for extern in the prototype definition.


> extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
> extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
> extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index 57cbe3827ea6..3a29398dd53b 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -10,6 +10,8 @@
> #include <linux/kvm_host.h>
> #include <linux/pci.h>
> #include <asm/kvm_pci.h>
> +#include <asm/pci.h>
> +#include <asm/pci_insn.h>
> #include <asm/sclp.h>
> #include "pci.h"
> #include "kvm-s390.h"
> @@ -120,6 +122,199 @@ int kvm_s390_pci_aen_init(u8 nisc)
> return rc;
> }
>
> +/* Modify PCI: Register floating adapter interruption forwarding */
> +static int kvm_zpci_set_airq(struct zpci_dev *zdev)
> +{
> + u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_REG_INT);
> + struct zpci_fib fib = {0};
> + u8 status;
> +
> + fib.fmt0.isc = zdev->kzdev->fib.fmt0.isc;
> + fib.fmt0.sum = 1; /* enable summary notifications */
> + fib.fmt0.noi = airq_iv_end(zdev->aibv);
> + fib.fmt0.aibv = (unsigned long) zdev->aibv->vector;

No blank needed after the cast.

> + fib.fmt0.aibvo = 0;
> + fib.fmt0.aisb = (unsigned long) aift.sbv->vector + (zdev->aisb/64) * 8;

Same here, and blanks are needed around /.
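
For reference, the expression under discussion splits a summary bit
number into the byte offset of its 64-bit word and the bit position
inside that word. A minimal userspace sketch of that arithmetic, written
with the spacing suggested here; the helper names are hypothetical and
not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of the 64-bit word holding bit 'aisb'. */
static inline uint64_t aisb_word_offset(uint64_t aisb)
{
	return (aisb / 64) * 8;
}

/* Hypothetical helper: position of bit 'aisb' inside that 64-bit word. */
static inline uint64_t aisb_bit_offset(uint64_t aisb)
{
	return aisb & 63;
}

/* Worked values: bits 0 and 63 share word 0, bit 64 starts the next word. */
static void aisb_example(void)
{
	assert(aisb_word_offset(0) == 0 && aisb_bit_offset(0) == 0);
	assert(aisb_word_offset(63) == 0 && aisb_bit_offset(63) == 63);
	assert(aisb_word_offset(64) == 8 && aisb_bit_offset(64) == 0);
	assert(aisb_word_offset(130) == 16 && aisb_bit_offset(130) == 2);
}
```

In the patch the word offset is added to the address of the summary bit
vector, and the bit offset is what ends up in fmt0.aisbo.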

> + fib.fmt0.aisbo = zdev->aisb & 63;
> + fib.gd = zdev->gd;
> +
> + return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> +}
> +
> +/* Modify PCI: Unregister floating adapter interruption forwarding */
> +static int kvm_zpci_clear_airq(struct zpci_dev *zdev)
> +{
> + u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_DEREG_INT);
> + struct zpci_fib fib = {0};
> + u8 cc, status;
> +
> + fib.gd = zdev->gd;
> +
> + cc = zpci_mod_fc(req, &fib, &status);
> + if (cc == 3 || (cc == 1 && status == 24))
> + /* Function already gone or IRQs already deregistered. */
> + cc = 0;
> +
> + return cc ? -EIO : 0;
> +}
> +
> +int kvm_s390_pci_aif_probe(struct zpci_dev *zdev)
> +{
> + if (!(sclp.has_aeni && test_facility(71)))
> + return -EINVAL;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_probe);
> +
> +int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
> + bool assist)
> +{
> + struct page *aibv_page, *aisb_page = NULL;
> + unsigned int msi_vecs, idx;
> + struct zpci_gaite *gaite;
> + unsigned long bit;
> + struct kvm *kvm;
> + void *gaddr;
> + int rc = 0;
> +
> + /*
> + * Interrupt forwarding is only applicable if the device is already
> + * enabled for interpretation
> + */
> + if (zdev->gd == 0)
> + return -EINVAL;
> +
> + kvm = zdev->kzdev->kvm;
> + msi_vecs = min_t(unsigned int, fib->fmt0.noi, zdev->max_msi);
> +
> + /* Replace AIBV address */
> + idx = srcu_read_lock(&kvm->srcu);
> + aibv_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aibv));
> + srcu_read_unlock(&kvm->srcu, idx);
> + if (is_error_page(aibv_page)) {
> + rc = -EIO;
> + goto out;
> + }
> + gaddr = page_to_virt(aibv_page) + (fib->fmt0.aibv & ~PAGE_MASK);
> + fib->fmt0.aibv = (u64)gaddr;
> +
> + /* Pin the guest AISB if one was specified */
> + if (fib->fmt0.sum == 1) {
> + idx = srcu_read_lock(&kvm->srcu);
> + aisb_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aisb));
> + srcu_read_unlock(&kvm->srcu, idx);
> + if (is_error_page(aisb_page)) {
> + rc = -EIO;
> + goto unpin1;
> + }
> + }
> +
> + /* AISB must be allocated before we can fill in GAITE */
> + mutex_lock(&aift.lock);
> + bit = airq_iv_alloc_bit(aift.sbv);
> + if (bit == -1UL)
> + goto unpin2;
> + zdev->aisb = bit;
> + zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA |
> + AIRQ_IV_BITLOCK |
> + AIRQ_IV_GUESTVEC,
> + (unsigned long *)fib->fmt0.aibv);
> +
> + spin_lock_irq(&aift.gait_lock);
> + gaite = (struct zpci_gaite *) aift.gait + (zdev->aisb *
no blank after cast

> + sizeof(struct zpci_gaite));
> +
> + /* If assist not requested, host will get all alerts */
> + if (assist)
> + gaite->gisa = (u32)(u64)&kvm->arch.sie_page2->gisa;
> + else
> + gaite->gisa = 0;
> +
> + gaite->gisc = fib->fmt0.isc;
> + gaite->count++;
> + gaite->aisbo = fib->fmt0.aisbo;
> + gaite->aisb = (u64)(page_address(aisb_page) + (fib->fmt0.aisb &
> + ~PAGE_MASK));
> + aift.kzdev[zdev->aisb] = zdev->kzdev;
> + spin_unlock_irq(&aift.gait_lock);
> +
> + /* Update guest FIB for re-issue */
> + fib->fmt0.aisbo = zdev->aisb & 63;
> + fib->fmt0.aisb = (unsigned long) aift.sbv->vector + (zdev->aisb/64)*8;

No blank after the cast, and blanks are needed around / and *.

> + fib->fmt0.isc = kvm_s390_gisc_register(kvm, gaite->gisc);
> +
> + /* Save some guest fib values in the host for later use */
> + zdev->kzdev->fib.fmt0.isc = fib->fmt0.isc;
> + zdev->kzdev->fib.fmt0.aibv = fib->fmt0.aibv;
> + mutex_unlock(&aift.lock);
> +
> + /* Issue the clp to setup the irq now */
> + rc = kvm_zpci_set_airq(zdev);
> + return rc;
> +
> +unpin2:
> + mutex_unlock(&aift.lock);
> + if (fib->fmt0.sum == 1) {
> + gaddr = page_to_virt(aisb_page);
> + kvm_release_pfn_dirty((u64)gaddr >> PAGE_SHIFT);
> + }
> +unpin1:
> + kvm_release_pfn_dirty(fib->fmt0.aibv >> PAGE_SHIFT);
> +out:
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_enable);
> +
> +int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
> +{
> + struct kvm_zdev *kzdev = zdev->kzdev;
> + struct zpci_gaite *gaite;
> + int rc;
> + u8 isc;
> +
> + if (zdev->gd == 0)
> + return -EINVAL;
> +
> + /* Even if the clear fails due to an error, clear the GAITE */
> + rc = kvm_zpci_clear_airq(zdev);
> +
> + mutex_lock(&aift.lock);
> + if (zdev->kzdev->fib.fmt0.aibv == 0)
> + goto out;
> + spin_lock_irq(&aift.gait_lock);
> + gaite = (struct zpci_gaite *) aift.gait + (zdev->aisb *
ditto for the cast
> + sizeof(struct zpci_gaite));
> + isc = gaite->gisc;
> + gaite->count--;
> + if (gaite->count == 0) {
> + /* Release guest AIBV and AISB */
> + kvm_release_pfn_dirty(kzdev->fib.fmt0.aibv >> PAGE_SHIFT);
> + if (gaite->aisb != 0)
> + kvm_release_pfn_dirty(gaite->aisb >> PAGE_SHIFT);
> + /* Clear the GAIT entry */
> + gaite->aisb = 0;
> + gaite->gisc = 0;
> + gaite->aisbo = 0;
> + gaite->gisa = 0;
> + aift.kzdev[zdev->aisb] = 0;
> + /* Clear zdev info */
> + airq_iv_free_bit(aift.sbv, zdev->aisb);
> + airq_iv_release(zdev->aibv);
> + zdev->aisb = 0;
> + zdev->aibv = NULL;
> + }
> + spin_unlock_irq(&aift.gait_lock);
> + kvm_s390_gisc_unregister(kzdev->kvm, isc);
> + kzdev->fib.fmt0.isc = 0;
> + kzdev->fib.fmt0.aibv = 0;
> +out:
> + mutex_unlock(&aift.lock);
> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
> +
> int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
> {
> if (!(sclp.has_zpci_interp && test_facility(69)))
> @@ -188,6 +383,10 @@ int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
> if (zdev->gd == 0)
> return -EINVAL;
>
> + /* Forwarding must be turned off before interpretation */
> + if (zdev->kzdev->fib.fmt0.aibv != 0)
> + kvm_s390_pci_aif_disable(zdev);
> +
> /* Remove the host CLP guest designation */
> zdev->gd = 0;
>
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 0d1ab268ec24..b57d3f594113 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -59,6 +59,7 @@ u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status)
>
> return cc;
> }
> +EXPORT_SYMBOL_GPL(zpci_mod_fc);
>
> /* Refresh PCI Translations */
> static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 09:45:24

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 29/32] vfio-pci/zdev: wire up zPCI IOAT assist support



On 12/7/21 21:57, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_IOAT, which is a new
> VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI I/O Address
> Translation assistance, allowing the host to perform address translation
> and shadowing.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 1 +
> drivers/vfio/pci/vfio_pci_core.c | 2 ++
> drivers/vfio/pci/vfio_pci_zdev.c | 61 ++++++++++++++++++++++++++++++++
> include/linux/vfio_pci_core.h | 10 ++++++
> include/uapi/linux/vfio.h | 8 +++++
> include/uapi/linux/vfio_zdev.h | 13 +++++++
> 6 files changed, 95 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 0a0e42e1db1c..0b362d55c7b2 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -32,6 +32,7 @@ struct kvm_zdev {
> struct zpci_dev *zdev;
> struct kvm *kvm;
> u64 rpcit_count;
> + u64 iota;
> struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> struct notifier_block nb;
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 01658de660bd..709d9ba22a60 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1176,6 +1176,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
> return vfio_pci_zdev_feat_interp(vdev, feature, arg);
> case VFIO_DEVICE_FEATURE_ZPCI_AIF:
> return vfio_pci_zdev_feat_aif(vdev, feature, arg);
> + case VFIO_DEVICE_FEATURE_ZPCI_IOAT:
> + return vfio_pci_zdev_feat_ioat(vdev, feature, arg);
> default:
> return -ENOTTY;
> }
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index dd98808b9139..85be77492a6d 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -298,6 +298,66 @@ int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> return rc;
> }
>
> +int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> + struct vfio_device_zpci_ioat *data;
> + struct vfio_device_feature *feat;
> + unsigned long minsz;
> + int size, rc = 0;
> +
> + if (!zdev || !zdev->kzdev)
> + return -EINVAL;
> +
> + /*
> + * If PROBE requested and feature not found, leave immediately.
> + * Otherwise, keep going as GET or SET may also be specified.
> + */
> + if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
> + rc = kvm_s390_pci_ioat_probe(zdev);
> + if (rc)
> + return rc;
> + }
> + if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
> + VFIO_DEVICE_FEATURE_SET)))
> + return 0;
> +

I think you should verify the argsz.
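
A minimal userspace sketch of the kind of argsz check meant here,
assuming the usual vfio_device_feature layout (argsz, flags, then the
payload). The struct and helper names are hypothetical stand-ins, not
the kernel's own; in the handler the comparison would presumably run
before the copy_from_user() of the payload:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the feature header (hypothetical name). */
struct feature_hdr_sketch {
	uint32_t argsz;
	uint32_t flags;
	uint8_t data[];
};

/* Userspace stand-in for the IOAT payload (hypothetical name). */
struct zpci_ioat_sketch {
	uint64_t iota;
};

/* Reject a request whose argsz cannot hold the header plus the payload. */
static int check_argsz_sketch(uint32_t argsz, size_t payload_size)
{
	size_t minsz = offsetof(struct feature_hdr_sketch, data);

	if (argsz < minsz + payload_size)
		return -EINVAL;
	return 0;
}

/* Exactly header + payload passes; anything shorter is refused. */
static void argsz_example(void)
{
	size_t payload = sizeof(struct zpci_ioat_sketch);
	size_t need = offsetof(struct feature_hdr_sketch, data) + payload;

	assert(check_argsz_sketch((uint32_t)need, payload) == 0);
	assert(check_argsz_sketch((uint32_t)need - 1, payload) == -EINVAL);
	assert(check_argsz_sketch(0, payload) == -EINVAL);
}
```

With the 8-byte header and the 8-byte iota payload, any argsz below 16
would be refused in this sketch.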

> + size = sizeof(*feat) + sizeof(*data);
> + feat = kzalloc(size, GFP_KERNEL);
> + if (!feat)
> + return -ENOMEM;
> +
> + data = (struct vfio_device_zpci_ioat *)&feat->data;
> + minsz = offsetofend(struct vfio_device_feature, flags);
> +
> + /* Get the rest of the payload for GET/SET */
> + rc = copy_from_user(data, (void __user *)(arg + minsz),
> + sizeof(*data));

Alignment of the continuation line.

> + if (rc)
> + rc = -EINVAL;
> +
> + if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> + data->iota = (u64)zdev->kzdev->iota;
> + if (copy_to_user((void __user *)arg, feat, size))
> + rc = -EFAULT;
> + } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> + if (data->iota != 0) {
> + rc = kvm_s390_pci_ioat_enable(zdev, data->iota);
> + if (!rc)
> + zdev->kzdev->iota = data->iota;
> + } else if (zdev->kzdev->iota != 0) {
> + rc = kvm_s390_pci_ioat_disable(zdev);
> + if (!rc)
> + zdev->kzdev->iota = 0;
> + }
> + }
> +
> + kfree(feat);
> + return rc;
> +}
> +
> static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> unsigned long action, void *data)
> {
> @@ -353,6 +413,7 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> */
> if (zdev->gd != 0) {
> kvm_s390_pci_aif_disable(zdev);
> + kvm_s390_pci_ioat_disable(zdev);
> kvm_s390_pci_interp_disable(zdev);
> }
>
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 5442d3fa1662..7c45a425e7f8 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -204,6 +204,9 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> struct vfio_device_feature feature,
> unsigned long arg);
> +int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg);
> int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
> int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
> #else
> @@ -227,6 +230,13 @@ static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> return -ENOTTY;
> }
>
> +static inline int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + return -ENOTTY;
> +}
> +
> static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> {
> return -ENODEV;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index fe3bfd99bf50..32c687388f48 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1016,6 +1016,14 @@ struct vfio_device_feature {
> */
> #define VFIO_DEVICE_FEATURE_ZPCI_AIF (2)
>
> +/*
> + * Provide support for enabling guest I/O address translation assistance for
> + * zPCI devices. This feature is only valid for s390x PCI devices. Data
> + * provided when setting and getting this feature is further described in
> + * vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_IOAT (3)
> +
> /* -------- API for Type1 VFIO IOMMU -------- */
>
> /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index c574e23f9385..1a5229b7bb18 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -110,4 +110,17 @@ struct vfio_device_zpci_aif {
> __u8 sbo; /* Offset of guest summary bit vector */
> };
>
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_IOAT
> + *
> + * This feature is used for enabling guest I/O translation assistance for
> + * passthrough zPCI devices using instruction interpretation. When setting
> + * this feature, the iota specifies a KVM guest I/O translation anchor. When
> + * getting this feature, the most recently set anchor (or 0) is returned in
> + * iota.
> + */
> +struct vfio_device_zpci_ioat {
> + __u64 iota;
> +};
> +
> #endif
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 09:47:52

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 28/32] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support



On 12/7/21 21:57, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
> VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
> forwarding, which allows underlying firmware to deliver interrupts
> directly to the associated kvm guest.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 2 +
> drivers/vfio/pci/vfio_pci_core.c | 2 +
> drivers/vfio/pci/vfio_pci_zdev.c | 96 +++++++++++++++++++++++++++++++-
> include/linux/vfio_pci_core.h | 10 ++++
> include/uapi/linux/vfio.h | 7 +++
> include/uapi/linux/vfio_zdev.h | 20 +++++++
> 6 files changed, 136 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 062bac720428..0a0e42e1db1c 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -36,6 +36,8 @@ struct kvm_zdev {
> struct zpci_fib fib;
> struct notifier_block nb;
> bool interp;
> + bool aif;
> + bool fhost;
> };
>
> extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 2b2d64a2190c..01658de660bd 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1174,6 +1174,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
> return 0;
> case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
> return vfio_pci_zdev_feat_interp(vdev, feature, arg);
> + case VFIO_DEVICE_FEATURE_ZPCI_AIF:
> + return vfio_pci_zdev_feat_aif(vdev, feature, arg);
> default:
> return -ENOTTY;
> }
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index b205e0ad1fd3..dd98808b9139 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -13,6 +13,7 @@
> #include <linux/vfio_zdev.h>
> #include <asm/pci_clp.h>
> #include <asm/pci_io.h>
> +#include <asm/pci_insn.h>
> #include <asm/kvm_pci.h>
>
> #include <linux/vfio_pci_core.h>
> @@ -206,6 +207,97 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> return rc;
> }
>
> +int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> + struct vfio_device_zpci_aif *data;
> + struct vfio_device_feature *feat;
> + unsigned long minsz;
> + int size, rc = 0;
> +
> + if (!zdev || !zdev->kzdev)
> + return -EINVAL;
> +
> + /*
> + * If PROBE requested and feature not found, leave immediately.
> + * Otherwise, keep going as GET or SET may also be specified.
> + */
> + if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
> + rc = kvm_s390_pci_aif_probe(zdev);
> + if (rc)
> + return rc;
> + }
> + if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
> + VFIO_DEVICE_FEATURE_SET)))
> + return 0;
> +
> + size = sizeof(*feat) + sizeof(*data);
> + feat = kzalloc(size, GFP_KERNEL);
> + if (!feat)
> + return -ENOMEM;
> +
> + data = (struct vfio_device_zpci_aif *)&feat->data;
> + minsz = offsetofend(struct vfio_device_feature, flags);

I think you should check the argsz.

> +
> + /* Get the rest of the payload for GET/SET */
> + rc = copy_from_user(data, (void __user *)(arg + minsz),
> + sizeof(*data));
> + if (rc)
> + rc = -EINVAL;
> +
> + if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> + if (zdev->kzdev->aif)
> + data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
> + if (zdev->kzdev->fhost)
> + data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
> +
> + if (copy_to_user((void __user *)arg, feat, size))
> + rc = -EFAULT;
> + } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> + if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) {
> + /* create a guest fib */
> + struct zpci_fib fib;
> +
> + fib.fmt0.aibv = data->ibv;
> + fib.fmt0.isc = data->isc;
> + fib.fmt0.noi = data->noi;
> + if (data->sb != 0) {
> + fib.fmt0.aisb = data->sb;
> + fib.fmt0.aisbo = data->sbo;
> + fib.fmt0.sum = 1;
> + } else {
> + fib.fmt0.aisb = 0;
> + fib.fmt0.aisbo = 0;
> + fib.fmt0.sum = 0;
> + }
> + if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) {
> + rc = kvm_s390_pci_aif_enable(zdev, &fib, false);
> + if (!rc) {
> + zdev->kzdev->aif = true;
> + zdev->kzdev->fhost = true;
> + }
> + } else {
> + rc = kvm_s390_pci_aif_enable(zdev, &fib, true);
> + if (!rc)
> + zdev->kzdev->aif = true;
> + }
> + } else if (data->flags == 0) {
> + rc = kvm_s390_pci_aif_disable(zdev);
> + if (!rc) {
> + zdev->kzdev->aif = false;
> + zdev->kzdev->fhost = false;
> + }
> + } else {
> + rc = -EINVAL;
> + }
> + }
> +
> + kfree(feat);
> + return rc;
> +}
> +
> static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> unsigned long action, void *data)
> {
> @@ -259,8 +351,10 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> * If the device was using interpretation, don't trust that userspace
> * did the appropriate cleanup
> */
> - if (zdev->gd != 0)
> + if (zdev->gd != 0) {
> + kvm_s390_pci_aif_disable(zdev);
> kvm_s390_pci_interp_disable(zdev);
> + }
>
> kvm_s390_pci_dev_release(zdev);
>
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 92dc43c827c9..5442d3fa1662 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -201,6 +201,9 @@ extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> struct vfio_device_feature feature,
> unsigned long arg);
> +int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg);
> int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
> int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
> #else
> @@ -217,6 +220,13 @@ static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> return -ENOTTY;
> }
>
> +static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + return -ENOTTY;
> +}
> +
> static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> {
> return -ENODEV;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index b9a75485b8e7..fe3bfd99bf50 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1009,6 +1009,13 @@ struct vfio_device_feature {
> */
> #define VFIO_DEVICE_FEATURE_ZPCI_INTERP (1)
>
> +/*
> + * Provide support for enbaling adapter interruption forwarding for zPCI
> + * devices. This feature is only valid for s390x PCI devices. Data provided
> + * when setting and getting this feature is further described in vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_AIF (2)
> +
> /* -------- API for Type1 VFIO IOMMU -------- */
>
> /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index 575f0410dc66..c574e23f9385 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
> __u32 fh; /* Host device function handle */
> };
>
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_AIF
> + *
> + * This feature is used for enabling forwarding of adapter interrupts directly
> + * from firmware to the guest. When setting this feature, the flags indicate
> + * whether to enable/disable the feature and the structure defined below is
> + * used to setup the forwarding structures. When getting this feature, only
> + * the flags are used to indicate the current state.
> + */
> +struct vfio_device_zpci_aif {
> + __u64 flags;
> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
> + __u64 ibv; /* Address of guest interrupt bit vector */
> + __u64 sb; /* Address of guest summary bit */
> + __u32 noi; /* Number of interrupts */
> + __u8 isc; /* Guest interrupt subclass */
> + __u8 sbo; /* Offset of guest summary bit vector */
> +};
> +
> #endif
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 09:57:28

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 30/32] vfio-pci/zdev: add DTSM to clp group capability



On 12/7/21 21:57, Matthew Rosato wrote:
> The DTSM, or designation type supported mask, indicates what IOAT formats
> are available to the guest. For an interpreted device, userspace will not
> know what format(s) the IOAT assist supports, so pass it via the
> capability chain. Since the value belongs to the Query PCI Function Group
> clp, let's extend the existing capability with a new version.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> drivers/vfio/pci/vfio_pci_zdev.c | 9 ++++++---
> include/uapi/linux/vfio_zdev.h | 3 +++
> 2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index 85be77492a6d..342b59ed36c9 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -45,19 +45,22 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
> {
> struct vfio_device_info_cap_zpci_group cap = {
> .header.id = VFIO_DEVICE_INFO_CAP_ZPCI_GROUP,
> - .header.version = 1,
> + .header.version = 2,
> .dasm = zdev->dma_mask,
> .msi_addr = zdev->msi_addr,
> .flags = VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH,
> .mui = zdev->fmb_update,
> .noi = zdev->max_msi,
> .maxstbl = ZPCI_MAX_WRITE_SIZE,

This, maxstbl, is not part of the patch but shouldn't we consider it too?
The maxstbl is fixed for intercepted VFIO because the kernel is handling
the STBL instruction on behalf of the guest.
Here the guest will use STBL directly.

I think we should report the right maxstbl value.

> - .version = zdev->version
> + .version = zdev->version,
> + .dtsm = 0
> };
>
> /* Some values are different for interpreted devices */
> - if (zdev->kzdev && zdev->kzdev->interp)
> + if (zdev->kzdev && zdev->kzdev->interp) {
> cap.maxstbl = zdev->maxstbl;
> + cap.dtsm = kvm_s390_pci_get_dtsm(zdev);
> + }
>
> return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
> }
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index 1a5229b7bb18..b4c2ba8e71f0 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
> __u16 noi; /* Maximum number of MSIs */
> __u16 maxstbl; /* Maximum Store Block Length */
> __u8 version; /* Supported PCI Version */
> + /* End of version 1 */
> + __u8 dtsm; /* Supported IOAT Designations */
> + /* End of version 2 */
> };
>
> /**
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 14:59:27

by Matthew Rosato

[permalink] [raw]
Subject: Re: [PATCH 30/32] vfio-pci/zdev: add DTSM to clp group capability

On 12/14/21 4:58 AM, Pierre Morel wrote:
>
>
> On 12/7/21 21:57, Matthew Rosato wrote:
>> The DTSM, or designation type supported mask, indicates what IOAT formats
>> are available to the guest.  For an interpreted device, userspace will
>> not
>> know what format(s) the IOAT assist supports, so pass it via the
>> capability chain.  Since the value belongs to the Query PCI Function
>> Group
>> clp, let's extend the existing capability with a new version.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   drivers/vfio/pci/vfio_pci_zdev.c | 9 ++++++---
>>   include/uapi/linux/vfio_zdev.h   | 3 +++
>>   2 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c
>> b/drivers/vfio/pci/vfio_pci_zdev.c
>> index 85be77492a6d..342b59ed36c9 100644
>> --- a/drivers/vfio/pci/vfio_pci_zdev.c
>> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
>> @@ -45,19 +45,22 @@ static int zpci_group_cap(struct zpci_dev *zdev,
>> struct vfio_info_cap *caps)
>>   {
>>       struct vfio_device_info_cap_zpci_group cap = {
>>           .header.id = VFIO_DEVICE_INFO_CAP_ZPCI_GROUP,
>> -        .header.version = 1,
>> +        .header.version = 2,
>>           .dasm = zdev->dma_mask,
>>           .msi_addr = zdev->msi_addr,
>>           .flags = VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH,
>>           .mui = zdev->fmb_update,
>>           .noi = zdev->max_msi,
>>           .maxstbl = ZPCI_MAX_WRITE_SIZE,
>
> This, maxstbl, is not part of the patch but shouldn't we consider it too?
> The maxstbl is fixed for intercepted VFIO because the kernel is handling
> the STBL instruction in behalf of the guest.
> Here the guest will use STBL directly.
>
> I think we should report the right maxstbl value.
>

I think we are OK; you may have missed the line that already does this.
It was added in patch 27, when we wire up interpretive execution. So,
here we are defaulting to reporting ZPCI_MAX_WRITE_SIZE, and then ...

>> -        .version = zdev->version
>> +        .version = zdev->version,
>> +        .dtsm = 0
>>       };
>>       /* Some values are different for interpreted devices */
>> -    if (zdev->kzdev && zdev->kzdev->interp)
>> +    if (zdev->kzdev && zdev->kzdev->interp) {
>>           cap.maxstbl = zdev->maxstbl;

... here we overwrite this with the hardware value, but only for
interpreted devices, just as this patch now also does for DTSM.
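
For readers following along, the default-then-override pattern described
above can be sketched in plain C. This is an illustrative userspace sketch
under stated assumptions, not the kernel code: the struct layouts, the
fill_group_cap() helper and SKETCH_MAX_WRITE_SIZE (with its value) are
hypothetical stand-ins; only the field names maxstbl, dtsm and interp
mirror the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ZPCI_MAX_WRITE_SIZE; the value is illustrative. */
#define SKETCH_MAX_WRITE_SIZE 128

/* Hypothetical, flattened view of the device state used by the capability. */
struct sketch_zdev {
	bool interp;		/* mirrors zdev->kzdev->interp */
	uint16_t maxstbl;	/* value reported by the hardware */
	uint8_t dtsm;		/* value reported by the hardware */
};

/* Hypothetical subset of the group capability. */
struct sketch_group_cap {
	uint16_t maxstbl;
	uint8_t dtsm;
};

static struct sketch_group_cap fill_group_cap(const struct sketch_zdev *zdev)
{
	/* Defaults that suit an intercepted (non-interpreted) device. */
	struct sketch_group_cap cap = {
		.maxstbl = SKETCH_MAX_WRITE_SIZE,
		.dtsm = 0,
	};

	/* Only interpreted devices see the hardware-reported values. */
	if (zdev->interp) {
		cap.maxstbl = zdev->maxstbl;
		cap.dtsm = zdev->dtsm;
	}

	return cap;
}
```

Keeping the defaults in the initializer means an intercepted device never
sees the hardware values by accident.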

>> +        cap.dtsm = kvm_s390_pci_get_dtsm(zdev);
>> +    }
>>       return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
>>   }
>> diff --git a/include/uapi/linux/vfio_zdev.h
>> b/include/uapi/linux/vfio_zdev.h
>> index 1a5229b7bb18..b4c2ba8e71f0 100644
>> --- a/include/uapi/linux/vfio_zdev.h
>> +++ b/include/uapi/linux/vfio_zdev.h
>> @@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
>>       __u16 noi;        /* Maximum number of MSIs */
>>       __u16 maxstbl;        /* Maximum Store Block Length */
>>       __u8 version;        /* Supported PCI Version */
>> +    /* End of version 1 */
>> +    __u8 dtsm;        /* Supported IOAT Designations */
>> +    /* End of version 2 */
>>   };
>>   /**
>>
>


2021-12-14 16:29:57

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 27/32] vfio-pci/zdev: wire up zPCI interpretive execution support



On 12/7/21 21:57, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_INTERP, which is a new
> VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI interpretive
> execution, which allows zPCI instructions to be executed directly by
> underlying firmware without KVM involvement.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 1 +
> drivers/vfio/pci/vfio_pci_core.c | 2 +
> drivers/vfio/pci/vfio_pci_zdev.c | 76 ++++++++++++++++++++++++++++++++
> include/linux/vfio_pci_core.h | 10 +++++
> include/uapi/linux/vfio.h | 7 +++
> include/uapi/linux/vfio_zdev.h | 15 +++++++
> 6 files changed, 111 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 6526908ac834..062bac720428 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -35,6 +35,7 @@ struct kvm_zdev {
> struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> struct notifier_block nb;
> + bool interp;
> };
>
> extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index fc57d4d0abbe..2b2d64a2190c 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1172,6 +1172,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
> mutex_unlock(&vdev->vf_token->lock);
>
> return 0;
> + case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
> + return vfio_pci_zdev_feat_interp(vdev, feature, arg);
> default:
> return -ENOTTY;
> }
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index cfd7f44b06c1..b205e0ad1fd3 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -54,6 +54,10 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
> .version = zdev->version
> };
>
> + /* Some values are different for interpreted devices */
> + if (zdev->kzdev && zdev->kzdev->interp)
> + cap.maxstbl = zdev->maxstbl;

Right, I did not see that, so my comment on patch 30 does not apply.

> +
> return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
> }
>
> @@ -138,6 +142,70 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> return ret;
> }
>
> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> + struct vfio_device_zpci_interp *data;
> + struct vfio_device_feature *feat;
> + unsigned long minsz;
> + int size, rc;
> +
> + if (!zdev || !zdev->kzdev)
> + return -EINVAL;
> +
> + /*
> + * If PROBE requested and feature not found, leave immediately.
> + * Otherwise, keep going as GET or SET may also be specified.
> + */
> + if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
> + rc = kvm_s390_pci_interp_probe(zdev);
> + if (rc)
> + return rc;
> + }
> + if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
> + VFIO_DEVICE_FEATURE_SET)))
> + return 0;
> +
> + size = sizeof(*feat) + sizeof(*data);
> + feat = kzalloc(size, GFP_KERNEL);
> + if (!feat)
> + return -ENOMEM;
> +
> + data = (struct vfio_device_zpci_interp *)&feat->data;
> + minsz = offsetofend(struct vfio_device_feature, flags);
> +
> + /* Get the rest of the payload for GET/SET */
> + rc = copy_from_user(data, (void __user *)(arg + minsz),
> + sizeof(*data));

Here, as in patch 28, I think you should take care of feature.argsz.
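
One possible shape for such a check, as a minimal userspace sketch and not
the actual fix: verify that the size the caller declared covers both the
feature header and the payload before copying anything. The struct names
and the check_argsz() helper are hypothetical; the header layout (argsz
followed by flags) is assumed to match struct vfio_device_feature.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the feature header: argsz, flags, then the data. */
struct sketch_feature_hdr {
	uint32_t argsz;
	uint32_t flags;
};

/* Hypothetical payload, shaped like struct vfio_device_zpci_interp. */
struct sketch_interp_payload {
	uint64_t flags;
	uint32_t fh;
};

/*
 * Return 0 when the caller-declared size covers the header plus the
 * payload, -EINVAL otherwise. The payload would only be copied after
 * this check has passed.
 */
static int check_argsz(uint32_t argsz, size_t payload_size)
{
	size_t minsz = sizeof(struct sketch_feature_hdr);

	if (argsz < minsz + payload_size)
		return -EINVAL;

	return 0;
}
```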


> + if (rc)
> + rc = -EINVAL;
> +
> + if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> + if (zdev->gd != 0)
> + data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
> + else
> + data->flags = 0;
> + data->fh = zdev->fh;
> + /* userspace is using host fh, give interpreted clp values */
> + zdev->kzdev->interp = true;
> +
> + if (copy_to_user((void __user *)arg, feat, size))
> + rc = -EFAULT;
> + } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> + if (data->flags == VFIO_DEVICE_ZPCI_FLAG_INTERP)
> + rc = kvm_s390_pci_interp_enable(zdev);
> + else if (data->flags == 0)
> + rc = kvm_s390_pci_interp_disable(zdev);
> + else
> + rc = -EINVAL;
> + }
> +
> + kfree(feat);
> + return rc;
> +}
> +
> static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> unsigned long action, void *data)
> {
> @@ -167,6 +235,7 @@ int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> return -ENODEV;
>
> zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
> + zdev->kzdev->interp = false;
>
> ret = vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> &events, &zdev->kzdev->nb);
> @@ -186,6 +255,13 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> &zdev->kzdev->nb);
>
> + /*
> + * If the device was using interpretation, don't trust that userspace
> + * did the appropriate cleanup
> + */
> + if (zdev->gd != 0)
> + kvm_s390_pci_interp_disable(zdev);
> +
> kvm_s390_pci_dev_release(zdev);
>
> return 0;
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 14079da409f1..92dc43c827c9 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -198,6 +198,9 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
> #ifdef CONFIG_VFIO_PCI_ZDEV
> extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> struct vfio_info_cap *caps);
> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg);
> int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
> int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
> #else
> @@ -207,6 +210,13 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> return -ENODEV;
> }
>
> +static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + return -ENOTTY;
> +}
> +
> static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> {
> return -ENODEV;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index ef33ea002b0b..b9a75485b8e7 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1002,6 +1002,13 @@ struct vfio_device_feature {
> */
> #define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN (0)
>
> +/*
> + * Provide support for enabling interpretation of zPCI instructions. This
> + * feature is only valid for s390x PCI devices. Data provided when setting
> + * and getting this feature is futher described in vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_INTERP (1)
> +
> /* -------- API for Type1 VFIO IOMMU -------- */
>
> /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index b4309397b6b2..575f0410dc66 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -75,4 +75,19 @@ struct vfio_device_info_cap_zpci_pfip {
> __u8 pfip[];
> };
>
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_INTERP
> + *
> + * This feature is used for enabling zPCI instruction interpretation for a
> + * device. No data is provided when setting this feature. When getting
> + * this feature, the following structure is provided which details whether
> + * or not interpretation is active and provides the guest with host device
> + * information necessary to enable interpretation.
> + */
> +struct vfio_device_zpci_interp {
> + __u64 flags;
> +#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
> + __u32 fh; /* Host device function handle */
> +};
> +
> #endif
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 16:58:38

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations



On 12/7/21 21:57, Matthew Rosato wrote:
> Add a routine that will perform a shadow operation between a guest
> and host IOAT. A subsequent patch will invoke this in response to
> an 04 RPCIT instruction intercept.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 1 +
> arch/s390/include/asm/pci_dma.h | 1 +
> arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
> arch/s390/kvm/pci.h | 4 +-
> 4 files changed, 196 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 254275399f21..97e3a369135d 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
> struct kvm_zdev {
> struct zpci_dev *zdev;
> struct kvm *kvm;
> + u64 rpcit_count;
> struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> };
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index e1d3c1d3fc8a..0ca15e5db3d9 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
> #define ZPCI_TABLE_ENTRIES (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
> #define ZPCI_TABLE_PAGES (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
> #define ZPCI_TABLE_ENTRIES_PAGES (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
> +#define ZPCI_TABLE_ENTRIES_PER_PAGE (ZPCI_TABLE_ENTRIES / ZPCI_TABLE_PAGES)
>
> #define ZPCI_TABLE_BITS 11
> #define ZPCI_PT_BITS 8
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index a1c0c0881332..858c5ecdc8b9 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -123,6 +123,195 @@ int kvm_s390_pci_aen_init(u8 nisc)
> return rc;
> }

...snip...

> +
> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> + unsigned long start, unsigned long size)
> +{
> + struct zpci_dev *zdev;
> + u32 fh;
> + int rc;
> +
> + /* If the device has a SHM bit on, let userspace take care of this */
> + fh = req >> 32;
> + if ((fh & aift.mdd) != 0)
> + return -EOPNOTSUPP;

I think you should make this check in the caller.

> +
> + /* Make sure this is a valid device associated with this guest */
> + zdev = get_zdev_by_fh(fh);
> + if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
> + return -EINVAL;
> +
> + /* Only proceed if the device is using the assist */
> + if (zdev->kzdev->ioat.head[0] == 0)
> + return -EOPNOTSUPP;

Using the assist means using interpretation rather than interception and
legacy vfio-pci, right?

> +
> + rc = dma_table_shadow(vcpu, zdev, start, size);
> + if (rc > 0)
> + rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);

Here you lose the status reported by the hardware.
You should directly use __rpcit(fn, addr, range, &status);
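
A rough sketch of what keeping the status could look like, written as
plain C so it can be tried outside the kernel. The function-pointer type,
the result struct and fake_rpcit_busy() are hypothetical stand-ins; the
only thing taken from the comment above is the argument order
(fn, addr, range, &status), and the primitive is assumed to return the
condition code.

```c
#include <stdint.h>

/* Hypothetical type for an RPCIT-like primitive that also reports status. */
typedef uint8_t (*rpcit_fn)(uint64_t fn, uint64_t addr, uint64_t range,
			    uint8_t *status);

/* Both values are handed back so the caller can forward them to the guest. */
struct sketch_refresh {
	uint8_t cc;
	uint8_t status;
};

static struct sketch_refresh refresh_keep_status(rpcit_fn op, uint64_t fn,
						 uint64_t addr, uint64_t range)
{
	struct sketch_refresh r = { .cc = 0, .status = 0 };

	/* Keep the status instead of folding it into a single errno. */
	r.cc = op(fn, addr, range, &r.status);

	return r;
}

/* Fake primitive for demonstration: always reports cc 1 and status 0x10. */
static uint8_t fake_rpcit_busy(uint64_t fn, uint64_t addr, uint64_t range,
			       uint8_t *status)
{
	(void)fn;
	(void)addr;
	(void)range;
	*status = 0x10;
	return 1;
}
```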


> + zdev->kzdev->rpcit_count++;
> +
> + return rc;
> +}
> +
> /* Modify PCI: Register floating adapter interruption forwarding */
> static int kvm_zpci_set_airq(struct zpci_dev *zdev)
> {
> @@ -590,4 +779,6 @@ void kvm_s390_pci_init(void)
> {
> spin_lock_init(&aift.gait_lock);
> mutex_init(&aift.lock);
> +
> + WARN_ON(zpci_get_mdd(&aift.mdd));
> }
> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
> index 3c86888fe1b3..d252a631b693 100644
> --- a/arch/s390/kvm/pci.h
> +++ b/arch/s390/kvm/pci.h
> @@ -33,6 +33,7 @@ struct zpci_aift {
> struct kvm_zdev **kzdev;
> spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
> struct mutex lock; /* Protects the other structures in aift */
> + u32 mdd;
> };
>
> static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
> @@ -47,7 +48,8 @@ struct zpci_aift *kvm_s390_pci_get_aift(void);
>
> int kvm_s390_pci_aen_init(u8 nisc);
> void kvm_s390_pci_aen_exit(void);
> -
> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> + unsigned long start, unsigned long end);
> void kvm_s390_pci_init(void);
>
> #endif /* __KVM_S390_PCI_H */
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 17:03:08

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 24/32] KVM: s390: intercept the rpcit instruction



On 12/7/21 21:57, Matthew Rosato wrote:
> For faster handling of PCI translation refreshes, intercept in KVM
> and call the associated handler.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/kvm/pci.h | 4 ++++
> arch/s390/kvm/priv.c | 41 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+)
>
> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
> index d252a631b693..3f96eff432aa 100644
> --- a/arch/s390/kvm/pci.h
> +++ b/arch/s390/kvm/pci.h
> @@ -18,6 +18,10 @@
>
> #define KVM_S390_PCI_DTSM_MASK 0x40
>
> +#define KVM_S390_RPCIT_STAT_MASK 0xffffffff00ffffffUL
> +#define KVM_S390_RPCIT_INS_RES (0x10 << 24)
> +#define KVM_S390_RPCIT_ERR (0x28 << 24)

I

> +
> struct zpci_gaite {
> unsigned int gisa;
> u8 gisc;
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 417154b314a6..768ae92ecc59 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -29,6 +29,7 @@
> #include <asm/ap.h>
> #include "gaccess.h"
> #include "kvm-s390.h"
> +#include "pci.h"
> #include "trace.h"
>
> static int handle_ri(struct kvm_vcpu *vcpu)
> @@ -335,6 +336,44 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
> return 0;
> }
>
> +static int handle_rpcit(struct kvm_vcpu *vcpu)
> +{
> + int reg1, reg2;
> + int rc;
> +
> + if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> + return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> +
> + kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
> +

I would prefer to take care of the interception immediately here:

	fh = vcpu->run->s.regs.gprs[reg1] >> 32;
	if ((fh & aift.mdd) != 0)
		return -EOPNOTSUPP;

instead of doing it inside kvm_s390_pci_refresh_trans.
That would simplify the code, in my opinion.
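
The same check can be exercised as standalone C. This is only a sketch
under assumptions: rpcit_req_to_fh() and rpcit_precheck() are hypothetical
helper names, and mdd_mask stands in for aift.mdd; any mask value passed
in by a caller is illustrative.

```c
#include <errno.h>
#include <stdint.h>

/* The function handle sits in the upper 32 bits of the request register. */
static uint32_t rpcit_req_to_fh(uint64_t req)
{
	return (uint32_t)(req >> 32);
}

/*
 * Caller-side check: if any bit covered by the mask is set in the
 * handle, leave the instruction to userspace.
 */
static int rpcit_precheck(uint64_t req, uint32_t mdd_mask)
{
	uint32_t fh = rpcit_req_to_fh(req);

	if ((fh & mdd_mask) != 0)
		return -EOPNOTSUPP;

	return 0;
}
```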

> + rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
> + vcpu->run->s.regs.gprs[reg2],
> + vcpu->run->s.regs.gprs[reg2+1]);
> +


> + switch (rc) {
> + case 0:
> + kvm_s390_set_psw_cc(vcpu, 0);
> + break;
> + case -EOPNOTSUPP:
> + return -EOPNOTSUPP;
> + case -EINVAL:
> + kvm_s390_set_psw_cc(vcpu, 3);
> + break;
> + case -ENOMEM:
> + vcpu->run->s.regs.gprs[reg1] &= KVM_S390_RPCIT_STAT_MASK;
> + vcpu->run->s.regs.gprs[reg1] |= KVM_S390_RPCIT_INS_RES;
> + kvm_s390_set_psw_cc(vcpu, 1);
> + break;
> + default:
> + vcpu->run->s.regs.gprs[reg1] &= KVM_S390_RPCIT_STAT_MASK;
> + vcpu->run->s.regs.gprs[reg1] |= KVM_S390_RPCIT_ERR;

I think you should use the status reported by the hardware; reporting
"Error recovery in progress" whatever the hardware error was does not
seem right.

> + kvm_s390_set_psw_cc(vcpu, 1);
> + break;
> + }

NIT: The switch above could be much simpler if you set the CC after the
switch.
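
A sketch of that restructuring, in plain C so it can be compiled on its
own. The SKETCH_* constants copy the values from the quoted hunk; the
result struct and rpcit_finish() are hypothetical, and the cc field marks
the value that a single kvm_s390_set_psw_cc() call after the switch would
consume.

```c
#include <errno.h>
#include <stdint.h>

/* Values copied from the quoted hunk; the names are local to this sketch. */
#define SKETCH_STAT_MASK 0xffffffff00ffffffULL
#define SKETCH_INS_RES   (0x10ULL << 24)
#define SKETCH_ERR       (0x28ULL << 24)

struct sketch_result {
	uint64_t reg1;	/* updated guest register */
	int cc;		/* condition code to set once, after the switch */
	int ret;	/* handler return value */
};

static struct sketch_result rpcit_finish(int rc, uint64_t reg1)
{
	struct sketch_result r = { .reg1 = reg1, .cc = 0, .ret = 0 };

	switch (rc) {
	case 0:
		break;
	case -EOPNOTSUPP:
		/* Passed on to userspace; no condition code is set. */
		r.ret = -EOPNOTSUPP;
		return r;
	case -EINVAL:
		r.cc = 3;
		break;
	case -ENOMEM:
		r.reg1 = (reg1 & SKETCH_STAT_MASK) | SKETCH_INS_RES;
		r.cc = 1;
		break;
	default:
		r.reg1 = (reg1 & SKETCH_STAT_MASK) | SKETCH_ERR;
		r.cc = 1;
		break;
	}

	/* In the handler, the single CC update would go here, using r.cc. */
	return r;
}
```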

> +
> + return 0;
> +}
> +
> #define SSKE_NQ 0x8
> #define SSKE_MR 0x4
> #define SSKE_MC 0x2
> @@ -1275,6 +1314,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
> return handle_essa(vcpu);
> case 0xaf:
> return handle_pfmf(vcpu);
> + case 0xd3:
> + return handle_rpcit(vcpu);
> default:
> return -EOPNOTSUPP;
> }
>





--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 17:45:57

by Pierre Morel

[permalink] [raw]
Subject: Re: [PATCH 22/32] KVM: s390: pci: provide routines for enabling/disabling IOAT assist



On 12/7/21 21:57, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for PCI I/O Address
> Translation assistance.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 15 ++++
> arch/s390/include/asm/pci_dma.h | 2 +
> arch/s390/kvm/pci.c | 133 ++++++++++++++++++++++++++++++++
> arch/s390/kvm/pci.h | 2 +
> 4 files changed, 152 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 54a0afdbe7d0..254275399f21 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -16,11 +16,21 @@
> #include <linux/kvm_host.h>
> #include <linux/kvm.h>
> #include <linux/pci.h>
> +#include <linux/mutex.h>
> #include <asm/pci_insn.h>
> +#include <asm/pci_dma.h>
> +
> +struct kvm_zdev_ioat {
> + unsigned long *head[ZPCI_TABLE_PAGES];
> + unsigned long **seg;
> + unsigned long ***pt;
> + struct mutex lock;
> +};
>
> struct kvm_zdev {
> struct zpci_dev *zdev;
> struct kvm *kvm;
> + struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> };
>
> @@ -33,6 +43,11 @@ extern int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
> bool assist);
> extern int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
>
> +extern int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
> +extern int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
> +extern int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
> +extern u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
> +
> extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
> extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
> extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index 3b8e89d4578a..e1d3c1d3fc8a 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
> #define ZPCI_TABLE_ALIGN ZPCI_TABLE_SIZE
> #define ZPCI_TABLE_ENTRY_SIZE (sizeof(unsigned long))
> #define ZPCI_TABLE_ENTRIES (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
> +#define ZPCI_TABLE_PAGES (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
> +#define ZPCI_TABLE_ENTRIES_PAGES (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
>
> #define ZPCI_TABLE_BITS 11
> #define ZPCI_PT_BITS 8
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index 3a29398dd53b..a1c0c0881332 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -12,6 +12,7 @@
> #include <asm/kvm_pci.h>
> #include <asm/pci.h>
> #include <asm/pci_insn.h>
> +#include <asm/pci_dma.h>
> #include <asm/sclp.h>
> #include "pci.h"
> #include "kvm-s390.h"
> @@ -315,6 +316,131 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
> }
> EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
>
> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
> +{
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_probe);
> +
> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota)
> +{
> + gpa_t gpa = (gpa_t)(iota & ZPCI_RTE_ADDR_MASK);
> + struct kvm_zdev_ioat *ioat;
> + struct page *page;
> + struct kvm *kvm;
> + unsigned int idx;
> + void *iaddr;
> + int i, rc = 0;
> +
> + if (!zdev->kzdev || !zdev->kzdev->kvm || zdev->kzdev->ioat.head[0])
> + return -EINVAL;

The only caller has already checked zdev->kzdev.
Could we use a macro to replace zdev->kzdev->ioat.head[0]? Something
like:
#define shadow_pgtbl_initialized zdev->kzdev->ioat.head[0]

That would be clearer to me.

> +
> + /* Ensure supported type specified */
> + if ((iota & ZPCI_IOTA_RTTO_FLAG) != ZPCI_IOTA_RTTO_FLAG)
> + return -EINVAL;
> +
> + kvm = zdev->kzdev->kvm;
> + ioat = &zdev->kzdev->ioat;
> + mutex_lock(&ioat->lock);
> + idx = srcu_read_lock(&kvm->srcu);
> + for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
> + page = gfn_to_page(kvm, gpa_to_gfn(gpa));
> + if (is_error_page(page)) {
> + srcu_read_unlock(&kvm->srcu, idx);
> + rc = -EIO;
> + goto out;
> + }
> + iaddr = page_to_virt(page) + (gpa & ~PAGE_MASK);
> + ioat->head[i] = (unsigned long *)iaddr;
> + gpa += PAGE_SIZE;
> + }
> + srcu_read_unlock(&kvm->srcu, idx);
> +
> + zdev->kzdev->ioat.seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
> + sizeof(unsigned long *), GFP_KERNEL);
> + if (!zdev->kzdev->ioat.seg)
> + goto unpin;
> + zdev->kzdev->ioat.pt = kcalloc(ZPCI_TABLE_ENTRIES,
> + sizeof(unsigned long **), GFP_KERNEL);
> + if (!zdev->kzdev->ioat.pt)
> + goto free_seg;
> +
> +out:
> + mutex_unlock(&ioat->lock);
> + return rc;
> +
> +free_seg:
> + kfree(zdev->kzdev->ioat.seg);
> +unpin:
> + for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
> + kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);

I did not find where the pages are pinned.

> + ioat->head[i] = 0;
> + }
> + mutex_unlock(&ioat->lock);
> + return -ENOMEM;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_enable);
> +

...snip...
>

--
Pierre Morel
IBM Lab Boeblingen

2021-12-14 17:55:10

by Matthew Rosato

[permalink] [raw]
Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

On 12/14/21 11:59 AM, Pierre Morel wrote:
>
>
> On 12/7/21 21:57, Matthew Rosato wrote:
>> Add a routine that will perform a shadow operation between a guest
>> and host IOAT.  A subsequent patch will invoke this in response to
>> an 04 RPCIT instruction intercept.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/include/asm/kvm_pci.h |   1 +
>>   arch/s390/include/asm/pci_dma.h |   1 +
>>   arch/s390/kvm/pci.c             | 191 ++++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h             |   4 +-
>>   4 files changed, 196 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h
>> b/arch/s390/include/asm/kvm_pci.h
>> index 254275399f21..97e3a369135d 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
>>   struct kvm_zdev {
>>       struct zpci_dev *zdev;
>>       struct kvm *kvm;
>> +    u64 rpcit_count;
>>       struct kvm_zdev_ioat ioat;
>>       struct zpci_fib fib;
>>   };
>> diff --git a/arch/s390/include/asm/pci_dma.h
>> b/arch/s390/include/asm/pci_dma.h
>> index e1d3c1d3fc8a..0ca15e5db3d9 100644
>> --- a/arch/s390/include/asm/pci_dma.h
>> +++ b/arch/s390/include/asm/pci_dma.h
>> @@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
>>   #define ZPCI_TABLE_ENTRIES        (ZPCI_TABLE_SIZE /
>> ZPCI_TABLE_ENTRY_SIZE)
>>   #define ZPCI_TABLE_PAGES        (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
>>   #define ZPCI_TABLE_ENTRIES_PAGES    (ZPCI_TABLE_ENTRIES *
>> ZPCI_TABLE_PAGES)
>> +#define ZPCI_TABLE_ENTRIES_PER_PAGE    (ZPCI_TABLE_ENTRIES /
>> ZPCI_TABLE_PAGES)
>>   #define ZPCI_TABLE_BITS            11
>>   #define ZPCI_PT_BITS            8
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index a1c0c0881332..858c5ecdc8b9 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -123,6 +123,195 @@ int kvm_s390_pci_aen_init(u8 nisc)
>>       return rc;
>>   }
>
> ...snip...
>
>> +
>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
>> +                   unsigned long start, unsigned long size)
>> +{
>> +    struct zpci_dev *zdev;
>> +    u32 fh;
>> +    int rc;
>> +
>> +    /* If the device has a SHM bit on, let userspace take care of
>> this */
>> +    fh = req >> 32;
>> +    if ((fh & aift.mdd) != 0)
>> +        return -EOPNOTSUPP;
>
> I think you should make this check in the caller.

OK

>
>> +
>> +    /* Make sure this is a valid device associated with this guest */
>> +    zdev = get_zdev_by_fh(fh);
>> +    if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
>> +        return -EINVAL;
>> +
>> +    /* Only proceed if the device is using the assist */
>> +    if (zdev->kzdev->ioat.head[0] == 0)
>> +        return -EOPNOTSUPP;
>
> Using the assist means using interpretation over using interception and
> legacy vfio-pci. right?

Right. More specifically, if the IOAT assist feature was never set via
the vfio feature ioctl, we can't handle the RPCIT for this device, so we
hand it off to userspace.

The way the QEMU series is being implemented, a device using
interpretation will always have the IOAT feature set on.

>
>> +
>> +    rc = dma_table_shadow(vcpu, zdev, start, size);
>> +    if (rc > 0)
>> +        rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);
>
> Here you lose the status reported by the hardware.
> You should directly use __rpcit(fn, addr, range, &status);

OK, I can have a look at doing this.

@Niklas, any thoughts on how you would want this exported? Renamed to
zpci_rpcit or so?

>
>
>> +    zdev->kzdev->rpcit_count++;
>> +
>> +    return rc;
>> +}
>> +
>>   /* Modify PCI: Register floating adapter interruption forwarding */
>>   static int kvm_zpci_set_airq(struct zpci_dev *zdev)
>>   {
>> @@ -590,4 +779,6 @@ void kvm_s390_pci_init(void)
>>   {
>>       spin_lock_init(&aift.gait_lock);
>>       mutex_init(&aift.lock);
>> +
>> +    WARN_ON(zpci_get_mdd(&aift.mdd));
>>   }
>> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
>> index 3c86888fe1b3..d252a631b693 100644
>> --- a/arch/s390/kvm/pci.h
>> +++ b/arch/s390/kvm/pci.h
>> @@ -33,6 +33,7 @@ struct zpci_aift {
>>       struct kvm_zdev **kzdev;
>>       spinlock_t gait_lock; /* Protects the gait, used during AEN
>> forward */
>>       struct mutex lock; /* Protects the other structures in aift */
>> +    u32 mdd;
>>   };
>>   static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift
>> *aift,
>> @@ -47,7 +48,8 @@ struct zpci_aift *kvm_s390_pci_get_aift(void);
>>   int kvm_s390_pci_aen_init(u8 nisc);
>>   void kvm_s390_pci_aen_exit(void);
>> -
>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
>> +                   unsigned long start, unsigned long end);
>>   void kvm_s390_pci_init(void);
>>   #endif /* __KVM_S390_PCI_H */
>>
>


2021-12-14 18:00:15

by Matthew Rosato

Subject: Re: [PATCH 24/32] KVM: s390: intercept the rpcit instruction

On 12/14/21 12:04 PM, Pierre Morel wrote:
>
>
> On 12/7/21 21:57, Matthew Rosato wrote:
>> For faster handling of PCI translation refreshes, intercept in KVM
>> and call the associated handler.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/kvm/pci.h  |  4 ++++
>>   arch/s390/kvm/priv.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 45 insertions(+)
>>
>> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
>> index d252a631b693..3f96eff432aa 100644
>> --- a/arch/s390/kvm/pci.h
>> +++ b/arch/s390/kvm/pci.h
>> @@ -18,6 +18,10 @@
>>   #define KVM_S390_PCI_DTSM_MASK 0x40
>> +#define KVM_S390_RPCIT_STAT_MASK 0xffffffff00ffffffUL
>> +#define KVM_S390_RPCIT_INS_RES (0x10 << 24)
>> +#define KVM_S390_RPCIT_ERR (0x28 << 24)
>
> I
>
>> +
>>   struct zpci_gaite {
>>       unsigned int gisa;
>>       u8 gisc;
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 417154b314a6..768ae92ecc59 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -29,6 +29,7 @@
>>   #include <asm/ap.h>
>>   #include "gaccess.h"
>>   #include "kvm-s390.h"
>> +#include "pci.h"
>>   #include "trace.h"
>>   static int handle_ri(struct kvm_vcpu *vcpu)
>> @@ -335,6 +336,44 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
>>       return 0;
>>   }
>> +static int handle_rpcit(struct kvm_vcpu *vcpu)
>> +{
>> +    int reg1, reg2;
>> +    int rc;
>> +
>> +    if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>> +        return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>> +
>> +    kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
>> +
>
> I would prefer to take care of the interception immediately here
>
>         fh = vcpu->run->s.regs.gprs[reg1] >> 32;
>         if ((fh & aift.mdd) != 0)
>                 return -EOPNOTSUPP;
>
> instead of doing it inside kvm_s390_pci_refresh_trans.
> It would simplify in my opinion.

OK

>
>> +    rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
>> +                    vcpu->run->s.regs.gprs[reg2],
>> +                    vcpu->run->s.regs.gprs[reg2+1]);
>> +
>
>
>> +    switch (rc) {
>> +    case 0:
>> +        kvm_s390_set_psw_cc(vcpu, 0);
>> +        break;
>> +    case -EOPNOTSUPP:
>> +        return -EOPNOTSUPP;
>> +    case -EINVAL:
>> +        kvm_s390_set_psw_cc(vcpu, 3);
>> +        break;
>> +    case -ENOMEM:
>> +        vcpu->run->s.regs.gprs[reg1] &= KVM_S390_RPCIT_STAT_MASK;
>> +        vcpu->run->s.regs.gprs[reg1] |= KVM_S390_RPCIT_INS_RES;
>> +        kvm_s390_set_psw_cc(vcpu, 1);
>> +        break;
>> +    default:
>> +        vcpu->run->s.regs.gprs[reg1] &= KVM_S390_RPCIT_STAT_MASK;
>> +        vcpu->run->s.regs.gprs[reg1] |= KVM_S390_RPCIT_ERR;
>
> I think you should use the status reported by the hardware, reporting
> "Error recovery in progress" what ever the hardware error was does not
> seem right.
>

OK, this ties into your other comment about calling __rpcit() directly
so we have a status to look at -- will look into it

>> +        kvm_s390_set_psw_cc(vcpu, 1);
>> +        break;
>> +    }
>
> NIT: This switch above could be much more simple if you set CC after the
> switch.

We are setting 3 different CCs over 4 cases, so there's only 1
duplication in the switch; I'm not sure how much simpler it would get.

But anyway this might not be relevant if I change to call __rpcit()
directly.
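
For illustration only, here is a minimal user-space sketch of how the
mapping in that switch could be factored so the shared register update
and CC 1 happen in one place. The helper name rpcit_rc_to_cc() is
hypothetical and not part of the series; the mask and status values are
copied from the patch (with ULL suffixes so the sketch is portable), and
the caller is assumed to set the PSW condition code once from the
return value.

```c
#include <errno.h>
#include <stdint.h>

/* Values copied from the patch; ULL used so the sketch is portable. */
#define RPCIT_STAT_MASK 0xffffffff00ffffffULL
#define RPCIT_INS_RES   (0x10ULL << 24)
#define RPCIT_ERR       (0x28ULL << 24)

/*
 * Hypothetical helper (not part of the series): map the handler's rc to a
 * guest condition code and, for CC 1, store the status byte into *reg1.
 * Returns 0, 1 or 3, or -1 when the intercept should go to userspace.
 */
int rpcit_rc_to_cc(int rc, uint64_t *reg1)
{
    uint64_t status;

    switch (rc) {
    case 0:
        return 0;
    case -EINVAL:
        return 3;
    case -EOPNOTSUPP:
        return -1;
    case -ENOMEM:
        status = RPCIT_INS_RES;
        break;
    default:
        status = RPCIT_ERR;
        break;
    }

    /* Both remaining cases share the register update and CC 1. */
    *reg1 = (*reg1 & RPCIT_STAT_MASK) | status;
    return 1;
}
```

With this shape the handler would return -EOPNOTSUPP for a negative
result and otherwise make a single kvm_s390_set_psw_cc() call.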

>
>> +
>> +    return 0;
>> +}
>> +
>>   #define SSKE_NQ 0x8
>>   #define SSKE_MR 0x4
>>   #define SSKE_MC 0x2
>> @@ -1275,6 +1314,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
>>           return handle_essa(vcpu);
>>       case 0xaf:
>>           return handle_pfmf(vcpu);
>> +    case 0xd3:
>> +        return handle_rpcit(vcpu);
>>       default:
>>           return -EOPNOTSUPP;
>>       }
>>
>
>
>
>
>


2021-12-14 18:13:24

by Matthew Rosato

Subject: Re: [PATCH 22/32] KVM: s390: pci: provide routines for enabling/disabling IOAT assist

On 12/14/21 12:46 PM, Pierre Morel wrote:
>
>
> On 12/7/21 21:57, Matthew Rosato wrote:
>> These routines will be wired into the vfio_pci_zdev ioctl handlers to
>> respond to requests to enable / disable a device for PCI I/O Address
>> Translation assistance.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/include/asm/kvm_pci.h |  15 ++++
>>   arch/s390/include/asm/pci_dma.h |   2 +
>>   arch/s390/kvm/pci.c             | 133 ++++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h             |   2 +
>>   4 files changed, 152 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h
>> b/arch/s390/include/asm/kvm_pci.h
>> index 54a0afdbe7d0..254275399f21 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -16,11 +16,21 @@
>>   #include <linux/kvm_host.h>
>>   #include <linux/kvm.h>
>>   #include <linux/pci.h>
>> +#include <linux/mutex.h>
>>   #include <asm/pci_insn.h>
>> +#include <asm/pci_dma.h>
>> +
>> +struct kvm_zdev_ioat {
>> +    unsigned long *head[ZPCI_TABLE_PAGES];
>> +    unsigned long **seg;
>> +    unsigned long ***pt;
>> +    struct mutex lock;
>> +};
>>   struct kvm_zdev {
>>       struct zpci_dev *zdev;
>>       struct kvm *kvm;
>> +    struct kvm_zdev_ioat ioat;
>>       struct zpci_fib fib;
>>   };
>> @@ -33,6 +43,11 @@ extern int kvm_s390_pci_aif_enable(struct zpci_dev
>> *zdev, struct zpci_fib *fib,
>>                      bool assist);
>>   extern int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
>> +extern int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
>> +extern int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
>> +extern int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
>> +extern u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
>> +
>>   extern int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
>>   extern int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
>>   extern int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
>> diff --git a/arch/s390/include/asm/pci_dma.h
>> b/arch/s390/include/asm/pci_dma.h
>> index 3b8e89d4578a..e1d3c1d3fc8a 100644
>> --- a/arch/s390/include/asm/pci_dma.h
>> +++ b/arch/s390/include/asm/pci_dma.h
>> @@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
>>   #define ZPCI_TABLE_ALIGN        ZPCI_TABLE_SIZE
>>   #define ZPCI_TABLE_ENTRY_SIZE        (sizeof(unsigned long))
>>   #define ZPCI_TABLE_ENTRIES        (ZPCI_TABLE_SIZE /
>> ZPCI_TABLE_ENTRY_SIZE)
>> +#define ZPCI_TABLE_PAGES        (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
>> +#define ZPCI_TABLE_ENTRIES_PAGES    (ZPCI_TABLE_ENTRIES *
>> ZPCI_TABLE_PAGES)
>>   #define ZPCI_TABLE_BITS            11
>>   #define ZPCI_PT_BITS            8
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index 3a29398dd53b..a1c0c0881332 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -12,6 +12,7 @@
>>   #include <asm/kvm_pci.h>
>>   #include <asm/pci.h>
>>   #include <asm/pci_insn.h>
>> +#include <asm/pci_dma.h>
>>   #include <asm/sclp.h>
>>   #include "pci.h"
>>   #include "kvm-s390.h"
>> @@ -315,6 +316,131 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
>> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
>> +{
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_probe);
>> +
>> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota)
>> +{
>> +    gpa_t gpa = (gpa_t)(iota & ZPCI_RTE_ADDR_MASK);
>> +    struct kvm_zdev_ioat *ioat;
>> +    struct page *page;
>> +    struct kvm *kvm;
>> +    unsigned int idx;
>> +    void *iaddr;
>> +    int i, rc = 0;
>> +
>> +    if (!zdev->kzdev || !zdev->kzdev->kvm || zdev->kzdev->ioat.head[0])
>> +        return -EINVAL;
>
> The only caller already checked zdev->kzdev.

I tend to get overzealous with these checks..

> Could we use a macro to replace zdev->kzdev->ioat.head[0] ?
> like
> #define shadow_pgtbl_initialized zdev->kzdev->ioat.head[0]
>
> Would be clearer for me.

Sure

>
>> +
>> +    /* Ensure supported type specified */
>> +    if ((iota & ZPCI_IOTA_RTTO_FLAG) != ZPCI_IOTA_RTTO_FLAG)
>> +        return -EINVAL;
>> +
>> +    kvm = zdev->kzdev->kvm;
>> +    ioat = &zdev->kzdev->ioat;
>> +    mutex_lock(&ioat->lock);
>> +    idx = srcu_read_lock(&kvm->srcu);
>> +    for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
>> +        page = gfn_to_page(kvm, gpa_to_gfn(gpa));

In relation to your question below about where things are being pinned...

Here the call to gfn_to_page does the pin (this call eventually drives
hva_to_pfn for pinning)

>> +        if (is_error_page(page)) {
>> +            srcu_read_unlock(&kvm->srcu, idx);
>> +            rc = -EIO;
>> +            goto out;
>> +        }
>> +        iaddr = page_to_virt(page) + (gpa & ~PAGE_MASK);
>> +        ioat->head[i] = (unsigned long *)iaddr;

^^ here we store what was pinned above in ioat->head[] and can use it
later for unpinning.

But looking again now, I think the is_error_page() case above should
also go to the unpin: label to clean up, in case we were somewhere in
the middle of the loop and so have some pages pinned already.
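
A minimal user-space sketch of that cleanup idea follows. It is an
assumption-laden illustration, not the kernel code: pin_page() and
unpin_page() are hypothetical stand-ins for the gfn_to_page() and
kvm_release_pfn_dirty() calls, and TABLE_PAGES stands in for
ZPCI_TABLE_PAGES. The point is only that the error path releases
exactly the pages pinned before the failure.

```c
#include <errno.h>

#define TABLE_PAGES 4   /* stand-in for ZPCI_TABLE_PAGES */

/* Hypothetical pin simulator: pinning page index fail_at fails. */
struct pin_sim {
    int fail_at;    /* index that fails to pin, or -1 for none */
    int pinned;     /* number of pages currently pinned */
};

int pin_page(struct pin_sim *s, int i)
{
    if (i == s->fail_at)
        return -1;
    s->pinned++;
    return 0;
}

void unpin_page(struct pin_sim *s)
{
    s->pinned--;
}

/* Pin TABLE_PAGES pages; on failure release only those pinned so far. */
int pin_table(struct pin_sim *s, int head[TABLE_PAGES])
{
    int i;

    for (i = 0; i < TABLE_PAGES; i++) {
        if (pin_page(s, i))
            goto unpin;
        head[i] = 1;    /* remember that this slot holds a pin */
    }
    return 0;

unpin:
    /* i is the index that failed; unwind slots i-1 down to 0. */
    while (--i >= 0) {
        unpin_page(s);
        head[i] = 0;
    }
    return -EIO;
}
```

Unwinding from the failing index downward avoids touching slots that
were never pinned, which a loop over all TABLE_PAGES entries would do.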

>> +        gpa += PAGE_SIZE;
>> +    }
>> +    srcu_read_unlock(&kvm->srcu, idx);
>> +
>> +    zdev->kzdev->ioat.seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
>> +                    sizeof(unsigned long *), GFP_KERNEL);
>> +    if (!zdev->kzdev->ioat.seg)
>> +        goto unpin;
>> +    zdev->kzdev->ioat.pt = kcalloc(ZPCI_TABLE_ENTRIES,
>> +                       sizeof(unsigned long **), GFP_KERNEL);
>> +    if (!zdev->kzdev->ioat.pt)
>> +        goto free_seg;
>> +
>> +out:
>> +    mutex_unlock(&ioat->lock);
>> +    return rc;
>> +
>> +free_seg:
>> +    kfree(zdev->kzdev->ioat.seg);
>> +unpin:
>> +    for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
>> +        kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
>
> I did not find when the pages are pinned.

See above.

>
>> +        ioat->head[i] = 0;
>> +    }
>> +    mutex_unlock(&ioat->lock);
>> +    return -ENOMEM;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_enable);
>> +
>
> ...snip...
>>
>


2021-12-16 14:39:47

by Niklas Schnelle

Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

On Tue, 2021-12-14 at 12:54 -0500, Matthew Rosato wrote:
> On 12/14/21 11:59 AM, Pierre Morel wrote:
> >
> > On 12/7/21 21:57, Matthew Rosato wrote:
> > > Add a routine that will perform a shadow operation between a guest
> > > and host IOAT. A subsequent patch will invoke this in response to
> > > an 04 RPCIT instruction intercept.
> > >
> > > Signed-off-by: Matthew Rosato <[email protected]>
> > > ---
> > > arch/s390/include/asm/kvm_pci.h | 1 +
> > > arch/s390/include/asm/pci_dma.h | 1 +
> > > arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
> > > arch/s390/kvm/pci.h | 4 +-
> > > 4 files changed, 196 insertions(+), 1 deletion(-)
> > >
---8<---
> >
> > > +
> > > +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> > > + unsigned long start, unsigned long size)
> > > +{
> > > + struct zpci_dev *zdev;
> > > + u32 fh;
> > > + int rc;
> > > +
> > > + /* If the device has a SHM bit on, let userspace take care of
> > > this */
> > > + fh = req >> 32;
> > > + if ((fh & aift.mdd) != 0)
> > > + return -EOPNOTSUPP;
> >
> > I think you should make this check in the caller.
>
> OK
>
> > > +
> > > + /* Make sure this is a valid device associated with this guest */
> > > + zdev = get_zdev_by_fh(fh);
> > > + if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
> > > + return -EINVAL;
> > > +
> > > + /* Only proceed if the device is using the assist */
> > > + if (zdev->kzdev->ioat.head[0] == 0)
> > > + return -EOPNOTSUPP;
> >
> > Using the assist means using interpretation over using interception and
> > legacy vfio-pci. right?
>
> Right - more specifically that the IOAT assist feature was never set via
> the vfio feature ioctl, so we can't handle the RPCIT for this device and
> so throw to userspace.
>
> The way the QEMU series is being implemented, a device using
> interpretation will always have the IOAT feature set on.
>
> > > +
> > > + rc = dma_table_shadow(vcpu, zdev, start, size);
> > > + if (rc > 0)
> > > + rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);
> >
> > Here you lose the status reported by the hardware.
> > You should directly use __rpcit(fn, addr, range, &status);
>
> OK, I can have a look at doing this.
>
> @Niklas thoughts on how you would want this exported. Renamed to
> zpci_rpcit or so?

Hmm with using __rpcit() directly we would lose the error reporting in
s390dbf and this is still kind of an RPCIT in the host. How about we
add the status as an out parameter to zpci_refresh_trans()? But yes if
you prefer to use __rpcit() directly I would rename it to zpci_rpcit().

>

---8<---


2021-12-16 14:51:32

by Matthew Rosato

Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

On 12/16/21 9:39 AM, Niklas Schnelle wrote:
> On Tue, 2021-12-14 at 12:54 -0500, Matthew Rosato wrote:
>> On 12/14/21 11:59 AM, Pierre Morel wrote:
>>>
>>> On 12/7/21 21:57, Matthew Rosato wrote:
>>>> Add a routine that will perform a shadow operation between a guest
>>>> and host IOAT. A subsequent patch will invoke this in response to
>>>> an 04 RPCIT instruction intercept.
>>>>
>>>> Signed-off-by: Matthew Rosato <[email protected]>
>>>> ---
>>>> arch/s390/include/asm/kvm_pci.h | 1 +
>>>> arch/s390/include/asm/pci_dma.h | 1 +
>>>> arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
>>>> arch/s390/kvm/pci.h | 4 +-
>>>> 4 files changed, 196 insertions(+), 1 deletion(-)
>>>>
> ---8<---
>>>
>>>> +
>>>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
>>>> + unsigned long start, unsigned long size)
>>>> +{
>>>> + struct zpci_dev *zdev;
>>>> + u32 fh;
>>>> + int rc;
>>>> +
>>>> + /* If the device has a SHM bit on, let userspace take care of
>>>> this */
>>>> + fh = req >> 32;
>>>> + if ((fh & aift.mdd) != 0)
>>>> + return -EOPNOTSUPP;
>>>
>>> I think you should make this check in the caller.
>>
>> OK
>>
>>>> +
>>>> + /* Make sure this is a valid device associated with this guest */
>>>> + zdev = get_zdev_by_fh(fh);
>>>> + if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
>>>> + return -EINVAL;
>>>> +
>>>> + /* Only proceed if the device is using the assist */
>>>> + if (zdev->kzdev->ioat.head[0] == 0)
>>>> + return -EOPNOTSUPP;
>>>
>>> Using the assist means using interpretation over using interception and
>>> legacy vfio-pci. right?
>>
>> Right - more specifically that the IOAT assist feature was never set via
>> the vfio feature ioctl, so we can't handle the RPCIT for this device and
>> so throw to userspace.
>>
>> The way the QEMU series is being implemented, a device using
>> interpretation will always have the IOAT feature set on.
>>
>>>> +
>>>> + rc = dma_table_shadow(vcpu, zdev, start, size);
>>>> + if (rc > 0)
>>>> + rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);
>>>
>>> Here you lose the status reported by the hardware.
>>> You should directly use __rpcit(fn, addr, range, &status);
>>
>> OK, I can have a look at doing this.
>>
>> @Niklas thoughts on how you would want this exported. Renamed to
>> zpci_rpcit or so?
>
> Hmm with using __rpcit() directly we would lose the error reporting in
> s390dbf and this is still kind of an RPCIT in the host. How about we
> add the status as an out parameter to zpci_refresh_trans()? But yes if

Another advantage of doing this would be that we then also keep the cc2
retry logic in zpci_refresh_trans(), which would be nice.

However we do still lose the returned CC value from the instruction.
But I think we can infer a CC1 from a nonzero status and a CC3 from a
zero status so maybe this is OK too.

I think I will add the status parameter to zpci_refresh_trans().

FWIW, I do also think it is likely we will end up with a s390dbf for
kvm-pci at some point after this initial series.
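
As a rough sketch of the CC inference described above (user-space C,
for illustration only): rpcit_infer_cc() is a hypothetical name, rc is
assumed to be zero on success, and the status byte is assumed to live
in bits 24-31 of the guest register, as implied by
KVM_S390_RPCIT_STAT_MASK in patch 24.

```c
#include <stdint.h>

/* Mask from patch 24; the shift matches the byte it clears. */
#define RPCIT_STAT_MASK  0xffffffff00ffffffULL
#define RPCIT_STAT_SHIFT 24

/*
 * Hypothetical helper: derive the guest CC from the host result.
 *   rc == 0              -> CC 0
 *   rc != 0, status == 0 -> CC 3 (no status to report)
 *   rc != 0, status != 0 -> CC 1, status stored into *reg1
 */
int rpcit_infer_cc(int rc, uint8_t status, uint64_t *reg1)
{
    if (rc == 0)
        return 0;
    if (status == 0)
        return 3;

    *reg1 = (*reg1 & RPCIT_STAT_MASK) |
            ((uint64_t)status << RPCIT_STAT_SHIFT);
    return 1;
}
```

The guest register is only modified in the CC 1 case, so a CC 0 or CC 3
result leaves it as the guest supplied it.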


> you prefer to use __rpcit() directly I would rename it to zpci_rpcit().
>

>>
>
> ---8<---
>


2021-12-17 09:41:22

by Niklas Schnelle

Subject: Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

On Thu, 2021-12-16 at 09:51 -0500, Matthew Rosato wrote:
> On 12/16/21 9:39 AM, Niklas Schnelle wrote:
> > On Tue, 2021-12-14 at 12:54 -0500, Matthew Rosato wrote:
> > > On 12/14/21 11:59 AM, Pierre Morel wrote:
> > > > On 12/7/21 21:57, Matthew Rosato wrote:
> > > > > Add a routine that will perform a shadow operation between a guest
> > > > > and host IOAT. A subsequent patch will invoke this in response to
> > > > > an 04 RPCIT instruction intercept.
> > > > >
> > > > > Signed-off-by: Matthew Rosato <[email protected]>
> > > > > ---
> > > > > arch/s390/include/asm/kvm_pci.h | 1 +
> > > > > arch/s390/include/asm/pci_dma.h | 1 +
> > > > > arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
> > > > > arch/s390/kvm/pci.h | 4 +-
> > > > > 4 files changed, 196 insertions(+), 1 deletion(-)
> > > > >
> > ---8<---
> > > > > +
> > > > > +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> > > > > + unsigned long start, unsigned long size)
> > > > > +{
> > > > > + struct zpci_dev *zdev;
> > > > > + u32 fh;
> > > > > + int rc;
> > > > > +
> > > > > + /* If the device has a SHM bit on, let userspace take care of
> > > > > this */
> > > > > + fh = req >> 32;
> > > > > + if ((fh & aift.mdd) != 0)
> > > > > + return -EOPNOTSUPP;
> > > >
> > > > I think you should make this check in the caller.
> > >
> > > OK
> > >
> > > > > +
> > > > > + /* Make sure this is a valid device associated with this guest */
> > > > > + zdev = get_zdev_by_fh(fh);
> > > > > + if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + /* Only proceed if the device is using the assist */
> > > > > + if (zdev->kzdev->ioat.head[0] == 0)
> > > > > + return -EOPNOTSUPP;
> > > >
> > > > Using the assist means using interpretation over using interception and
> > > > legacy vfio-pci. right?
> > >
> > > Right - more specifically that the IOAT assist feature was never set via
> > > the vfio feature ioctl, so we can't handle the RPCIT for this device and
> > > so throw to userspace.
> > >
> > > The way the QEMU series is being implemented, a device using
> > > interpretation will always have the IOAT feature set on.
> > >
> > > > > +
> > > > > + rc = dma_table_shadow(vcpu, zdev, start, size);
> > > > > + if (rc > 0)
> > > > > + rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);
> > > >
> > > > Here you lose the status reported by the hardware.
> > > > You should directly use __rpcit(fn, addr, range, &status);
> > >
> > > OK, I can have a look at doing this.
> > >
> > > @Niklas thoughts on how you would want this exported. Renamed to
> > > zpci_rpcit or so?
> >
> > Hmm with using __rpcit() directly we would lose the error reporting in
> > s390dbf and this is still kind of an RPCIT in the host. How about we
> > add the status as an out parameter to zpci_refresh_trans()? But yes if
>
> Another advantage of doing this would be that we then also keep the cc2
> retry logic in zpci_refresh_trans(), which would be nice.

Yeah thought about that too. If we don't have that I believe the guest
would retry but that means doing two full intercepts and going through
all the other logic too. Since these retries are afaik extremely rare
it shouldn't matter much but on the other hand I would expect them to
only happen when the system is overloaded and then doing all this extra
work surely isn't helpful.
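
To make the trade-off concrete, here is a small user-space sketch of a
host-side retry on CC 2, assuming a stubbed instruction. The names
rpcit_once(), refresh_trans() and struct rpcit_sim are hypothetical and
this is not the kernel's zpci_refresh_trans(); it only shows that the
busy retries stay inside one call, with the hardware status handed back
through an out parameter.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical instruction stub: busy (CC 2) busy_left times, then CC 0. */
struct rpcit_sim {
    int busy_left;  /* remaining busy responses */
    int calls;      /* how often the instruction was issued */
};

int rpcit_once(struct rpcit_sim *s, uint8_t *status)
{
    s->calls++;
    if (s->busy_left > 0) {
        s->busy_left--;
        return 2;
    }
    *status = 0;
    return 0;
}

/* Retry while busy; report the final status through *status. */
int refresh_trans(struct rpcit_sim *s, uint8_t *status)
{
    int cc;

    do {
        cc = rpcit_once(s, status);
        /* A real implementation would delay briefly before retrying. */
    } while (cc == 2);

    return cc ? -EIO : 0;
}
```

With three busy responses the instruction is issued four times within a
single call, rather than the guest taking four separate intercepts.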

>
> However we do still lose the returned CC value from the instruction.
> But I think we can infer a CC1 from a nonzero status and a CC3 from a
> zero status so maybe this is OK too.

I agree.



2021-12-17 14:55:18

by Christian Borntraeger

Subject: Re: [PATCH 32/32] MAINTAINERS: additional files related kvm s390 pci passthrough



Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> Add entries from the s390 kvm subdirectory related to pci passthrough.
>
> Signed-off-by: Matthew Rosato <[email protected]>

Acked-by: Christian Borntraeger <[email protected]>

Question for Alex. Shall I take these and future patches regarding KVM hw support for PCI passthru via my tree or via your vfio tree?

> ---
> MAINTAINERS | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 43007f2d29e0..a88f8e4f2c80 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16689,6 +16689,8 @@ M: Eric Farman <[email protected]>
> L: [email protected]
> L: [email protected]
> S: Supported
> +F: arch/s390/include/asm/kvm_pci.h
> +F: arch/s390/kvm/pci*
> F: drivers/vfio/pci/vfio_pci_zdev.c
> F: include/uapi/linux/vfio_zdev.h
>
>

2021-12-17 15:05:44

by Christian Borntraeger

Subject: Re: [PATCH 16/32] KVM: s390: expose the guest zPCI interpretation facility



Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> This facility will be used to enable interpretive execution of zPCI
> instructions.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/kvm/kvm-s390.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index c8fe9b7c2395..09991d05c871 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2751,6 +2751,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> set_kvm_facility(kvm->arch.model.fac_mask, 147);
> set_kvm_facility(kvm->arch.model.fac_list, 147);
> }
> + if (sclp.has_zpci_interp && test_facility(69)) {
> + set_kvm_facility(kvm->arch.model.fac_mask, 69);
> + set_kvm_facility(kvm->arch.model.fac_list, 69);
> + }


Do we need the setting of these stfle bits somewhere? I think QEMU sets them as well for the guest.
We only need this when the kernel probes for this (test_kvm_facility). But then the question is, shouldn't
we then simply check for sclp bits in those places?
See also patch 19. We need to build it in a way that allows VSIE support later on.

>
> if (css_general_characteristics.aiv && test_facility(65))
> set_kvm_facility(kvm->arch.model.fac_mask, 65);
>

2021-12-17 15:11:21

by Christian Borntraeger

Subject: Re: [PATCH 17/32] KVM: s390: expose the guest Adapter Interruption Source ID facility



Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> This facility will be used to enable forwarding of PCI interrupts from
> firmware directly to guests.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/kvm/kvm-s390.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 09991d05c871..d44ca313a1b7 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2755,6 +2755,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> set_kvm_facility(kvm->arch.model.fac_mask, 69);
> set_kvm_facility(kvm->arch.model.fac_list, 69);
> }
> + if (sclp.has_aisii && test_facility(70)) {
> + set_kvm_facility(kvm->arch.model.fac_mask, 70);
> + set_kvm_facility(kvm->arch.model.fac_list, 70);
> + }
>
same as patch 16 (as well as 18)

> if (css_general_characteristics.aiv && test_facility(65))
> set_kvm_facility(kvm->arch.model.fac_mask, 65);
>

2021-12-17 15:19:12

by Matthew Rosato

Subject: Re: [PATCH 16/32] KVM: s390: expose the guest zPCI interpretation facility

On 12/17/21 10:05 AM, Christian Borntraeger wrote:
>
>
> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>> This facility will be used to enable interpretive execution of zPCI
>> instructions.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   arch/s390/kvm/kvm-s390.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index c8fe9b7c2395..09991d05c871 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2751,6 +2751,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned
>> long type)
>>           set_kvm_facility(kvm->arch.model.fac_mask, 147);
>>           set_kvm_facility(kvm->arch.model.fac_list, 147);
>>       }
>> +    if (sclp.has_zpci_interp && test_facility(69)) {
>> +        set_kvm_facility(kvm->arch.model.fac_mask, 69);
>> +        set_kvm_facility(kvm->arch.model.fac_list, 69);
>> +    }
>
>
> Do we need the setting of these stfle bits somewhere? I think QEMU sets
> them as well for the guest.
> We only need this when the kernel probes for this (test_kvm_facility)
> But then the question is, shouldnt
> we then simply check for sclp bits in those places?
> See also patch 19. We need to build it in a way that allows VSIE support
> later on.
>

Right, so this currently sets the facility bits but we don't set the
associated guest SCLP bits. I guess since we are not enabling for VSIE
now it would make sense to not set either.

So then just to confirm we are on the same page: I will drop these
patches 16-18 and leave the kvm facilities unset until we wish to enable
VSIE. And then also make sure we are checking sclp bits (e.g. patch
19). OK?

>>       if (css_general_characteristics.aiv && test_facility(65))
>>           set_kvm_facility(kvm->arch.model.fac_mask, 65);
>>


2021-12-17 16:49:23

by Christian Borntraeger

Subject: Re: [PATCH 25/32] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV



Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> This was previously removed as unnecessary; while that was true, subsequent
> changes will make KVM an additional required component for vfio-pci-zdev.
> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a reason
> to say 'n' for it (when not planning to CONFIG_KVM).
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> drivers/vfio/pci/Kconfig | 11 +++++++++++
> drivers/vfio/pci/Makefile | 2 +-
> include/linux/vfio_pci_core.h | 2 +-
> 3 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 860424ccda1b..fedd1d4cb592 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -42,5 +42,16 @@ config VFIO_PCI_IGD
> and LPC bridge config space.
>
> To enable Intel IGD assignment through vfio-pci, say Y.
> +
> +config VFIO_PCI_ZDEV
> + bool "VFIO PCI extensions for s390x KVM passthrough"
> + depends on S390 && KVM

does this also depend on vfio-pci?

> + default y
> + help
> + Support s390x-specific extensions to enable support for enhancements
> + to KVM passthrough capabilities, such as interpretive execution of
> + zPCI instructions.
> +
> + To enable s390x KVM vfio-pci extensions, say Y.
> endif
> endif
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 349d68d242b4..01b1f83d83d7 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -1,7 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0-only
>
> vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
> -vfio-pci-core-$(CONFIG_S390) += vfio_pci_zdev.o
> +vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV) += vfio_pci_zdev.o
> obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
>
> vfio-pci-y := vfio_pci.o
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index ef9a44b6cf5d..5e2bca3b89db 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -195,7 +195,7 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
> }
> #endif
>
> -#ifdef CONFIG_S390
> +#ifdef CONFIG_VFIO_PCI_ZDEV
> extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> struct vfio_info_cap *caps);
> #else
>

2021-12-17 16:56:46

by Christian Borntraeger

Subject: Re: [PATCH 15/32] KVM: s390: pci: enable host forwarding of Adapter Event Notifications

Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> In cases where interrupts are not forwarded to the guest via firmware,
> KVM is responsible for ensuring delivery. When an interrupt presents
> with the forwarding bit, we must process the forwarding tables until
> all interrupts are delivered.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
[...]

> +static void aen_host_forward(struct zpci_aift *aift, unsigned long si)
> +{
> + struct kvm_s390_gisa_interrupt *gi;
> + struct zpci_gaite *gaite;
> + struct kvm *kvm;
> +
> + gaite = (struct zpci_gaite *)aift->gait +
> + (si * sizeof(struct zpci_gaite));
> + if (gaite->count == 0)
> + return;
> + if (gaite->aisb != 0)
> + set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
> +
> + kvm = kvm_s390_pci_si_to_kvm(aift, si);
> + if (kvm == 0)
> + return;
> + gi = &kvm->arch.gisa_int;
> +
> + if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
> + !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
> + gisa_set_ipm_gisc(gi->origin, gaite->gisc);
> + if (hrtimer_active(&gi->timer))
> + hrtimer_cancel(&gi->timer);
> + hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
> + kvm->stat.aen_forward++;
> + }
> +}
> +
> +static void aen_process_gait(u8 isc)
> +{
> + bool found = false, first = true;
> + union zpci_sic_iib iib = {{0}};
> + unsigned long si, flags;
> + struct zpci_aift *aift;
> +
> + aift = kvm_s390_pci_get_aift();
> + spin_lock_irqsave(&aift->gait_lock, flags);
> +
> + if (!aift->gait) {
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> + return;
> + }
> +
> + for (si = 0;;) {
> + /* Scan adapter summary indicator bit vector */
> + si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
> + if (si == -1UL) {
> + if (first || found) {
> + /* Reenable interrupts. */
> + if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
> + &iib))
> + break;
> + first = found = false;
> + } else {
> + /* Interrupts on and all bits processed */
> + break;
> + }
> + found = false;
> + si = 0;
> + continue;
> + }
> + found = true;
> + aen_host_forward(aift, si);
> + }
> +
> + spin_unlock_irqrestore(&aift->gait_lock, flags);
> +}
> +
> static void gib_alert_irq_handler(struct airq_struct *airq,
> struct tpi_info *tpi_info)
> {
> + struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
> +
> inc_irq_stat(IRQIO_GAL);
> - process_gib_alert_list();
> +
> + if (info->forward || info->error)
> + aen_process_gait(info->isc);
> + else
> + process_gib_alert_list();
> }

Not sure, would it make sense to actually do both after an alert interrupt or do we always get a separate interrupt for event vs. irq?
[..]

2021-12-17 16:58:29

by Christian Borntraeger

Subject: Re: [PATCH 16/32] KVM: s390: expose the guest zPCI interpretation facility



Am 17.12.21 um 16:19 schrieb Matthew Rosato:
> On 12/17/21 10:05 AM, Christian Borntraeger wrote:
>>
>>
>> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>>> This facility will be used to enable interpretive execution of zPCI
>>> instructions.
>>>
>>> Signed-off-by: Matthew Rosato <[email protected]>
>>> ---
>>>   arch/s390/kvm/kvm-s390.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index c8fe9b7c2395..09991d05c871 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -2751,6 +2751,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>>           set_kvm_facility(kvm->arch.model.fac_mask, 147);
>>>           set_kvm_facility(kvm->arch.model.fac_list, 147);
>>>       }
>>> +    if (sclp.has_zpci_interp && test_facility(69)) {
>>> +        set_kvm_facility(kvm->arch.model.fac_mask, 69);
>>> +        set_kvm_facility(kvm->arch.model.fac_list, 69);
>>> +    }
>>
>>
>> Do we need the setting of these stfle bits somewhere? I think QEMU sets them as well for the guest.
>> We only need this when the kernel probes for this (test_kvm_facility).
>> But then the question is, shouldn't we then simply check for sclp bits in those places?
>> See also patch 19. We need to build it in a way that allows VSIE support later on.
>>
>
> Right, so this currently sets the facility bits but we don't set the associated guest SCLP bits.  I guess since we are not enabling for VSIE now it would make sense to not set either.
>
> So then just to confirm we are on the same page:  I will drop these patches 16-18 and leave the kvm facilities unset until we wish to enable VSIE.  And then also make sure we are checking sclp bits (e.g. patch 19).  OK?

Right, drop these patches and change patch 19. When we later enable VSIE we need QEMU to set the sclp bits. Not sure, does this work as of today or do we need additional vsie changes (I would assume so)?

2021-12-17 17:13:39

by Matthew Rosato

Subject: Re: [PATCH 16/32] KVM: s390: expose the guest zPCI interpretation facility

On 12/17/21 11:58 AM, Christian Borntraeger wrote:
>
>
> Am 17.12.21 um 16:19 schrieb Matthew Rosato:
>> On 12/17/21 10:05 AM, Christian Borntraeger wrote:
>>>
>>>
>>> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>>>> This facility will be used to enable interpretive execution of zPCI
>>>> instructions.
>>>>
>>>> Signed-off-by: Matthew Rosato <[email protected]>
>>>> ---
>>>>   arch/s390/kvm/kvm-s390.c | 4 ++++
>>>>   1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index c8fe9b7c2395..09991d05c871 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -2751,6 +2751,10 @@ int kvm_arch_init_vm(struct kvm *kvm,
>>>> unsigned long type)
>>>>           set_kvm_facility(kvm->arch.model.fac_mask, 147);
>>>>           set_kvm_facility(kvm->arch.model.fac_list, 147);
>>>>       }
>>>> +    if (sclp.has_zpci_interp && test_facility(69)) {
>>>> +        set_kvm_facility(kvm->arch.model.fac_mask, 69);
>>>> +        set_kvm_facility(kvm->arch.model.fac_list, 69);
>>>> +    }
>>>
>>>
>>> Do we need the setting of these stfle bits somewhere? I think QEMU
>>> sets them as well for the guest.
>>> We only need this when the kernel probes for this (test_kvm_facility).
>>> But then the question is, shouldn't
>>> we then simply check for sclp bits in those places?
>>> See also patch 19. We need to build it in a way that allows VSIE
>>> support later on.
>>>
>>
>> Right, so this currently sets the facility bits but we don't set the
>> associated guest SCLP bits.  I guess since we are not enabling for
>> VSIE now it would make sense to not set either.
>>
>> So then just to confirm we are on the same page:  I will drop these
>> patches 16-18 and leave the kvm facilities unset until we wish to
>> enable VSIE.  And then also make sure we are checking sclp bits (e.g.
>> patch 19).  OK?
>
> Right, drop these patches and change patch 19. When we later enable VSIE
> we need QEMU to set the sclp bits. Not sure, does this work as of today
> or do we need additional vsie changes (I would assume so)?

No, we will need some additional work to be able to enable for VSIE
(e.g. adapter interrupt source ID)

2021-12-17 17:43:02

by Matthew Rosato

Subject: Re: [PATCH 15/32] KVM: s390: pci: enable host forwarding of Adapter Event Notifications

On 12/17/21 11:56 AM, Christian Borntraeger wrote:
> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>> In cases where interrupts are not forwarded to the guest via firmware,
>> KVM is responsible for ensuring delivery.  When an interrupt presents
>> with the forwarding bit, we must process the forwarding tables until
>> all interrupts are delivered.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
> [...]
>
>> +static void aen_host_forward(struct zpci_aift *aift, unsigned long si)
>> +{
>> +    struct kvm_s390_gisa_interrupt *gi;
>> +    struct zpci_gaite *gaite;
>> +    struct kvm *kvm;
>> +
>> +    gaite = (struct zpci_gaite *)aift->gait +
>> +        (si * sizeof(struct zpci_gaite));
>> +    if (gaite->count == 0)
>> +        return;
>> +    if (gaite->aisb != 0)
>> +        set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
>> +
>> +    kvm = kvm_s390_pci_si_to_kvm(aift, si);
>> +    if (kvm == 0)
>> +        return;
>> +    gi = &kvm->arch.gisa_int;
>> +
>> +    if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
>> +        !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
>> +        gisa_set_ipm_gisc(gi->origin, gaite->gisc);
>> +        if (hrtimer_active(&gi->timer))
>> +            hrtimer_cancel(&gi->timer);
>> +        hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
>> +        kvm->stat.aen_forward++;
>> +    }
>> +}
>> +
>> +static void aen_process_gait(u8 isc)
>> +{
>> +    bool found = false, first = true;
>> +    union zpci_sic_iib iib = {{0}};
>> +    unsigned long si, flags;
>> +    struct zpci_aift *aift;
>> +
>> +    aift = kvm_s390_pci_get_aift();
>> +    spin_lock_irqsave(&aift->gait_lock, flags);
>> +
>> +    if (!aift->gait) {
>> +        spin_unlock_irqrestore(&aift->gait_lock, flags);
>> +        return;
>> +    }
>> +
>> +    for (si = 0;;) {
>> +        /* Scan adapter summary indicator bit vector */
>> +        si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
>> +        if (si == -1UL) {
>> +            if (first || found) {
>> +                /* Reenable interrupts. */
>> +                if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
>> +                              &iib))
>> +                    break;
>> +                first = found = false;
>> +            } else {
>> +                /* Interrupts on and all bits processed */
>> +                break;
>> +            }
>> +            found = false;
>> +            si = 0;
>> +            continue;
>> +        }
>> +        found = true;
>> +        aen_host_forward(aift, si);
>> +    }
>> +
>> +    spin_unlock_irqrestore(&aift->gait_lock, flags);
>> +}
>> +
>>   static void gib_alert_irq_handler(struct airq_struct *airq,
>>                     struct tpi_info *tpi_info)
>>   {
>> +    struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
>> +
>>       inc_irq_stat(IRQIO_GAL);
>> -    process_gib_alert_list();
>> +
>> +    if (info->forward || info->error)
>> +        aen_process_gait(info->isc);
>> +    else
>> +        process_gib_alert_list();
>>   }
>
> Not sure, would it make sense to actually do both after an alert
> interrupt or do we always get a separate interrupt for event vs. irq?
> [..]

Good point - I thought this was an either/or scenario but I went back
and double-checked -- looks like it is indeed possible to get a single
interrupt that indicates processing of both AEN events and the alert
list is required. (It is also possible to get interrupts that indicate
processing of only one or the other is required). So, my code above is
wrong.

However, we also don't need to call process_gib_alert_list()
unconditionally after handling AEN -- there is more information we can
check in tpi_adapter_info to decide whether that is necessary (aism); I
will add this.
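
For illustration only, the revised control flow described above might look
roughly like the following user-space sketch. The struct, the stub
functions and the counters are hypothetical stand-ins, not the kernel
definitions, and the reading that a nonzero aism means the alert list
also needs processing is an assumption drawn from the paragraph above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct tpi_adapter_info; layout is assumed. */
struct tpi_adapter_info_sketch {
	bool forward;       /* AEN forwarding work is pending */
	bool error;         /* forwarding reported an error */
	unsigned char isc;  /* interruption subclass */
	unsigned char aism; /* assumed: nonzero means alert-list work is pending */
};

/* Counters replace the real work so the control flow can be observed. */
static int gait_runs;
static int alert_list_runs;

static void aen_process_gait_stub(unsigned char isc)
{
	(void)isc;
	gait_runs++;
}

static void process_gib_alert_list_stub(void)
{
	alert_list_runs++;
}

static void gib_alert_irq_handler_sketch(const struct tpi_adapter_info_sketch *info)
{
	if (info->forward || info->error) {
		aen_process_gait_stub(info->isc);
		/* The same interrupt may also indicate alert-list work. */
		if (info->aism != 0)
			process_gib_alert_list_stub();
	} else {
		process_gib_alert_list_stub();
	}
}
```

With this shape a single interrupt can drive both paths, while an
interrupt that only indicates AEN work skips the alert list.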

2021-12-17 17:54:42

by Matthew Rosato

Subject: Re: [PATCH 25/32] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV

On 12/17/21 11:49 AM, Christian Borntraeger wrote:
>
>
> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
>> This was previously removed as unnecessary; while that was true,
>> subsequent
>> changes will make KVM an additional required component for vfio-pci-zdev.
>> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a reason
>> to say 'n' for it (when not planning to CONFIG_KVM).
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
>>   drivers/vfio/pci/Kconfig      | 11 +++++++++++
>>   drivers/vfio/pci/Makefile     |  2 +-
>>   include/linux/vfio_pci_core.h |  2 +-
>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
>> index 860424ccda1b..fedd1d4cb592 100644
>> --- a/drivers/vfio/pci/Kconfig
>> +++ b/drivers/vfio/pci/Kconfig
>> @@ -42,5 +42,16 @@ config VFIO_PCI_IGD
>>         and LPC bridge config space.
>>         To enable Intel IGD assignment through vfio-pci, say Y.
>> +
>> +config VFIO_PCI_ZDEV
>> +    bool "VFIO PCI extensions for s390x KVM passthrough"
>> +    depends on S390 && KVM
>
> does this also depend on vfio-pci?
>

Yes - but this config statement is already contained within an 'if
VFIO_PCI' block along with config VFIO_PCI_VGA and config VFIO_PCI_IGD.



2021-12-17 20:27:05

by Matthew Rosato

Subject: Re: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure

On 12/7/21 3:57 PM, Matthew Rosato wrote:
> This structure will be used to carry kvm passthrough information related to
> zPCI devices.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
...
> static inline bool zdev_enabled(struct zpci_dev *zdev)
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index b3aaadc60ead..95ea865e5d29 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
> ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>
> kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o

This should instead be

kvm-objs-$(CONFIG_PCI) += pci.o

I think this makes sense as we aren't about to do PCI passthrough
support anyway if the host kernel doesn't support PCI (no vfio-pci,
etc). This will quiet the kernel test robot complaints about
CONFIG_PCI_NR_FUNCTIONS seen on the next patch in this series.

>
> obj-$(CONFIG_KVM) += kvm.o

2021-12-20 17:25:17

by Pierre Morel

Subject: Re: [PATCH 13/32] KVM: s390: pci: add basic kvm_zdev structure



On 12/17/21 21:26, Matthew Rosato wrote:
> On 12/7/21 3:57 PM, Matthew Rosato wrote:
>> This structure will be used to carry kvm passthrough information
>> related to
>> zPCI devices.
>>
>> Signed-off-by: Matthew Rosato <[email protected]>
>> ---
> ...
>>   static inline bool zdev_enabled(struct zpci_dev *zdev)
>> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
>> index b3aaadc60ead..95ea865e5d29 100644
>> --- a/arch/s390/kvm/Makefile
>> +++ b/arch/s390/kvm/Makefile
>> @@ -10,6 +10,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
>> $(KVM)/async_pf.o \
>>   ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>>   kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o
>> sigp.o
>> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
>> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o pci.o
>
> This should instead be
>
> kvm-objs-$(CONFIG_PCI) += pci.o
>
> I think this makes sense as we aren't about to do PCI passthrough
> support anyway if the host kernel doesn't support PCI (no vfio-pci,
> etc).   This will quiet the kernel test robot complaints about
> CONFIG_PCI_NR_FUNCTIONS seen on the next patch in this series.

Hum, then you will need more than this: all the PCI references in
priv.c and kvm-s390.c would have to be compiled out as well.
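
One common way to handle this, sketched below in plain user-space C, is
to give the always-built callers a static inline stub when the object is
not compiled in. All names here (CONFIG_PCI_SKETCH, the struct and both
functions) are hypothetical and only illustrate the pattern; they are
not taken from the series.

```c
#include <errno.h>

/* Hypothetical placeholder for the device structure. */
struct zpci_dev_sketch {
	int placeholder;
};

#ifdef CONFIG_PCI_SKETCH
/* Real implementation would live in the conditionally built pci.c. */
int kvm_pci_dev_open_sketch(struct zpci_dev_sketch *zdev);
#else
/* Stub keeps callers compiling and linking when pci.c is not built. */
static inline int kvm_pci_dev_open_sketch(struct zpci_dev_sketch *zdev)
{
	(void)zdev;
	return -ENODEV;
}
#endif

/* Stands in for a caller in always-built code such as priv.c. */
static int handle_pci_request_sketch(struct zpci_dev_sketch *zdev)
{
	return kvm_pci_dev_open_sketch(zdev);
}
```

With the stub in the header, the callers need no #ifdef of their own.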

>
>>   obj-$(CONFIG_KVM) += kvm.o



--
Pierre Morel
IBM Lab Boeblingen

2021-12-21 18:47:38

by Alex Williamson

Subject: Re: [PATCH 26/32] vfio-pci/zdev: wire up group notifier

On Tue, 7 Dec 2021 15:57:37 -0500
Matthew Rosato <[email protected]> wrote:

> KVM zPCI passthrough device logic will need a reference to the associated
> kvm guest that has access to the device. Let's register a group notifier
> for VFIO_GROUP_NOTIFY_SET_KVM to catch this information in order to create
> an association between a kvm guest and the host zdev.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 2 ++
> drivers/vfio/pci/vfio_pci_core.c | 2 ++
> drivers/vfio/pci/vfio_pci_zdev.c | 54 ++++++++++++++++++++++++++++++++
> include/linux/vfio_pci_core.h | 12 +++++++
> 4 files changed, 70 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 97e3a369135d..6526908ac834 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -17,6 +17,7 @@
> #include <linux/kvm.h>
> #include <linux/pci.h>
> #include <linux/mutex.h>
> +#include <linux/notifier.h>
> #include <asm/pci_insn.h>
> #include <asm/pci_dma.h>
>
> @@ -33,6 +34,7 @@ struct kvm_zdev {
> u64 rpcit_count;
> struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> + struct notifier_block nb;
> };
>
> extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index f948e6cd2993..fc57d4d0abbe 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -452,6 +452,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
>
> vfio_pci_vf_token_user_add(vdev, -1);
> vfio_spapr_pci_eeh_release(vdev->pdev);
> + vfio_pci_zdev_release(vdev);
> vfio_pci_core_disable(vdev);
>
> mutex_lock(&vdev->igate);
> @@ -470,6 +471,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
> void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
> {
> vfio_pci_probe_mmaps(vdev);
> + vfio_pci_zdev_open(vdev);
> vfio_spapr_pci_eeh_open(vdev->pdev);
> vfio_pci_vf_token_user_add(vdev, 1);
> }
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index ea4c0d2b0663..cfd7f44b06c1 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -13,6 +13,7 @@
> #include <linux/vfio_zdev.h>
> #include <asm/pci_clp.h>
> #include <asm/pci_io.h>
> +#include <asm/kvm_pci.h>
>
> #include <linux/vfio_pci_core.h>
>
> @@ -136,3 +137,56 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>
> return ret;
> }
> +
> +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> + unsigned long action, void *data)
> +{
> + struct kvm_zdev *kzdev = container_of(nb, struct kvm_zdev, nb);
> +
> + if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
> + if (!data || !kzdev->zdev)
> + return NOTIFY_DONE;
> + if (kvm_s390_pci_attach_kvm(kzdev->zdev, data))
> + return NOTIFY_DONE;
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> +int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> +{
> + unsigned long events = VFIO_GROUP_NOTIFY_SET_KVM;
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> + int ret;
> +
> + if (!zdev)
> + return -ENODEV;
> +
> + ret = kvm_s390_pci_dev_open(zdev);
> + if (ret)
> + return -ENODEV;
> +
> + zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
> +
> + ret = vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> + &events, &zdev->kzdev->nb);
> + if (ret)
> + kvm_s390_pci_dev_release(zdev);
> +
> + return ret;

None of these error return paths are realized by the call site. Thanks,

Alex

> +}
> +
> +int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> +{
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +
> + if (!zdev || !zdev->kzdev)
> + return -ENODEV;
> +
> + vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> + &zdev->kzdev->nb);
> +
> + kvm_s390_pci_dev_release(zdev);
> +
> + return 0;
> +}
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 5e2bca3b89db..14079da409f1 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -198,12 +198,24 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
> #ifdef CONFIG_VFIO_PCI_ZDEV
> extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> struct vfio_info_cap *caps);
> +int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
> +int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
> #else
> static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> struct vfio_info_cap *caps)
> {
> return -ENODEV;
> }
> +
> +static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> +{
> + return -ENODEV;
> +}
> +
> +static inline int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> +{
> + return -ENODEV;
> +}
> #endif
>
> /* Will be exported for vfio pci drivers usage */
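
To illustrate the remark about the error return paths: in the quoted
patch vfio_pci_core_finish_enable() returns void and calls
vfio_pci_zdev_open() without looking at the result. A minimal sketch of
a call site that does propagate the error follows; the struct and
function names are hypothetical stand-ins, not the vfio code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state, just enough to observe the behaviour. */
struct vdev_sketch {
	bool zdev_should_fail; /* simulate a failing zdev open */
	bool zdev_open;        /* set once the zdev open succeeded */
};

static int zdev_open_sketch(struct vdev_sketch *v)
{
	if (v->zdev_should_fail)
		return -ENODEV;
	v->zdev_open = true;
	return 0;
}

/* Unlike a void caller, this one hands the error back to its caller. */
static int finish_enable_sketch(struct vdev_sketch *v)
{
	int ret = zdev_open_sketch(v);

	if (ret)
		return ret;
	/* remaining enable steps would follow here */
	return 0;
}
```

If the result were propagated like this, a stub that returns -ENODEV
when the feature is compiled out would fail every open on such
configurations, so it would presumably have to return 0 instead.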


2021-12-21 18:48:16

by Alex Williamson

Subject: Re: [PATCH 27/32] vfio-pci/zdev: wire up zPCI interpretive execution support

On Tue, 7 Dec 2021 15:57:38 -0500
Matthew Rosato <[email protected]> wrote:

> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_INTERP, which is a new
> VFIO_DEVICE_FEATURE ioctl. This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI interpretive
> execution, which allows zPCI instructions to be executed directly by
> underlying firmware without KVM involvement.
>
> Signed-off-by: Matthew Rosato <[email protected]>
> ---
> arch/s390/include/asm/kvm_pci.h | 1 +
> drivers/vfio/pci/vfio_pci_core.c | 2 +
> drivers/vfio/pci/vfio_pci_zdev.c | 76 ++++++++++++++++++++++++++++++++
> include/linux/vfio_pci_core.h | 10 +++++
> include/uapi/linux/vfio.h | 7 +++
> include/uapi/linux/vfio_zdev.h | 15 +++++++
> 6 files changed, 111 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 6526908ac834..062bac720428 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -35,6 +35,7 @@ struct kvm_zdev {
> struct kvm_zdev_ioat ioat;
> struct zpci_fib fib;
> struct notifier_block nb;
> + bool interp;
> };
>
> extern int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index fc57d4d0abbe..2b2d64a2190c 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1172,6 +1172,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
> mutex_unlock(&vdev->vf_token->lock);
>
> return 0;
> + case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
> + return vfio_pci_zdev_feat_interp(vdev, feature, arg);
> default:
> return -ENOTTY;
> }
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index cfd7f44b06c1..b205e0ad1fd3 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -54,6 +54,10 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
> .version = zdev->version
> };
>
> + /* Some values are different for interpreted devices */
> + if (zdev->kzdev && zdev->kzdev->interp)
> + cap.maxstbl = zdev->maxstbl;
> +
> return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
> }
>
> @@ -138,6 +142,70 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> return ret;
> }
>
> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> + struct vfio_device_zpci_interp *data;
> + struct vfio_device_feature *feat;
> + unsigned long minsz;
> + int size, rc;
> +
> + if (!zdev || !zdev->kzdev)
> + return -EINVAL;
> +
> + /*
> + * If PROBE requested and feature not found, leave immediately.
> + * Otherwise, keep going as GET or SET may also be specified.
> + */
> + if (feature.flags & VFIO_DEVICE_FEATURE_PROBE) {
> + rc = kvm_s390_pci_interp_probe(zdev);
> + if (rc)
> + return rc;
> + }
> + if (!(feature.flags & (VFIO_DEVICE_FEATURE_GET +
> + VFIO_DEVICE_FEATURE_SET)))
> + return 0;
> +
> + size = sizeof(*feat) + sizeof(*data);
> + feat = kzalloc(size, GFP_KERNEL);
> + if (!feat)
> + return -ENOMEM;
> +
> + data = (struct vfio_device_zpci_interp *)&feat->data;
> + minsz = offsetofend(struct vfio_device_feature, flags);
> +
> + /* Get the rest of the payload for GET/SET */
> + rc = copy_from_user(data, (void __user *)(arg + minsz),
> + sizeof(*data));

argsz as noted by Pierre.

> + if (rc)
> + rc = -EINVAL;
> +
> + if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> + if (zdev->gd != 0)
> + data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
> + else
> + data->flags = 0;
> + data->fh = zdev->fh;
> + /* userspace is using host fh, give interpreted clp values */
> + zdev->kzdev->interp = true;
> +
> + if (copy_to_user((void __user *)arg, feat, size))
> + rc = -EFAULT;
> + } else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> + if (data->flags == VFIO_DEVICE_ZPCI_FLAG_INTERP)
> + rc = kvm_s390_pci_interp_enable(zdev);
> + else if (data->flags == 0)
> + rc = kvm_s390_pci_interp_disable(zdev);

I see kvm_s390_pci_interp_enable() dereferencing through
zdev->kzdev->kvm without testing it. How do you know the device is
being used with KVM and that the user has registered the group through
the kvm-vfio device? If these features are dependent on a previously
registered KVM association, shouldn't the feature probing reflect that?
VFIO_GROUP_NOTIFY_SET_KVM can also be called with a NULL KVM pointer.
Thanks,

Alex


> + else
> + rc = -EINVAL;
> + }
> +
> + kfree(feat);
> + return rc;
> +}
> +
> static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> unsigned long action, void *data)
> {
> @@ -167,6 +235,7 @@ int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> return -ENODEV;
>
> zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
> + zdev->kzdev->interp = false;
>
> ret = vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> &events, &zdev->kzdev->nb);
> @@ -186,6 +255,13 @@ int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> &zdev->kzdev->nb);
>
> + /*
> + * If the device was using interpretation, don't trust that userspace
> + * did the appropriate cleanup
> + */
> + if (zdev->gd != 0)
> + kvm_s390_pci_interp_disable(zdev);
> +
> kvm_s390_pci_dev_release(zdev);
>
> return 0;
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 14079da409f1..92dc43c827c9 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -198,6 +198,9 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
> #ifdef CONFIG_VFIO_PCI_ZDEV
> extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> struct vfio_info_cap *caps);
> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg);
> int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
> int vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
> #else
> @@ -207,6 +210,13 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
> return -ENODEV;
> }
>
> +static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> + struct vfio_device_feature feature,
> + unsigned long arg)
> +{
> + return -ENOTTY;
> +}
> +
> static inline int vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> {
> return -ENODEV;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index ef33ea002b0b..b9a75485b8e7 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1002,6 +1002,13 @@ struct vfio_device_feature {
> */
> #define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN (0)
>
> +/*
> + * Provide support for enabling interpretation of zPCI instructions. This
> + * feature is only valid for s390x PCI devices. Data provided when setting
> + * and getting this feature is further described in vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_INTERP (1)
> +
> /* -------- API for Type1 VFIO IOMMU -------- */
>
> /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index b4309397b6b2..575f0410dc66 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -75,4 +75,19 @@ struct vfio_device_info_cap_zpci_pfip {
> __u8 pfip[];
> };
>
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_INTERP
> + *
> + * This feature is used for enabling zPCI instruction interpretation for a
> + * device. No data is provided when setting this feature. When getting
> + * this feature, the following structure is provided which details whether
> + * or not interpretation is active and provides the guest with host device
> + * information necessary to enable interpretation.
> + */
> +struct vfio_device_zpci_interp {
> + __u64 flags;
> +#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
> + __u32 fh; /* Host device function handle */
> +};
> +
> #endif
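
The two review comments above (argsz, and the untested
zdev->kzdev->kvm) could both be addressed by a check before any payload
is copied or interpretation is enabled. The sketch below is a
user-space illustration under assumptions: the types are hypothetical
mirrors of the structures in the patch, the argsz rule (the
user-declared size must cover header plus payload) is one reading of
the comment, since Pierre's original note is not shown here, and the
-ENODEV code for a missing KVM association is an arbitrary choice.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the feature header and the interp payload. */
struct feature_hdr_sketch {
	uint32_t argsz; /* size of the buffer as declared by userspace */
	uint32_t flags;
};

struct interp_payload_sketch {
	uint64_t flags;
	uint32_t fh;
};

struct kzdev_sketch {
	void *kvm; /* NULL until a KVM association has been registered */
};

struct zdev_sketch {
	struct kzdev_sketch *kzdev;
};

static int interp_precheck_sketch(const struct zdev_sketch *zdev,
				  const struct feature_hdr_sketch *hdr)
{
	size_t need = sizeof(*hdr) + sizeof(struct interp_payload_sketch);

	if (!zdev || !zdev->kzdev)
		return -EINVAL;
	/* No KVM registered for the group: do not dereference it later. */
	if (!zdev->kzdev->kvm)
		return -ENODEV; /* assumed error code */
	/* Refuse buffers too small to hold the payload. */
	if (hdr->argsz < need)
		return -EINVAL;
	return 0;
}
```

Running the same check on the PROBE path would make probing reflect
whether a KVM association exists, which is what the review asks about.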


2021-12-21 19:11:35

by Alex Williamson

Subject: Re: [PATCH 32/32] MAINTAINERS: additional files related kvm s390 pci passthrough

On Fri, 17 Dec 2021 15:55:08 +0100
Christian Borntraeger <[email protected]> wrote:

> Am 07.12.21 um 21:57 schrieb Matthew Rosato:
> > Add entries from the s390 kvm subdirectory related to pci passthrough.
> >
> > Signed-off-by: Matthew Rosato <[email protected]>
>
> Acked-by: Christian Borntraeger <[email protected]>
>
> Question for Alex. Shall I take these and future patches regarding
> KVM hw support for PCI passthru via my tree or via your vfio tree?

Looks like there will be another rev of this series but the diffstat of
this one would suggest your tree. For future patches, I don't need to
slow down the process for anything entirely internal to zpci,
especially since I don't know the intricacies anyway, but I'd like to
at least get a chance to look at anything exposing new vfio interfaces.
Thanks,

Alex