2022-07-11 08:44:08

by Pierre Morel

Subject: [PATCH v12 0/3] s390x: KVM: CPU Topology

Hi all,

This new spin suppresses the check for real CPU migration and
modifies the check for valid function codes inside the interception
of the STSI instruction.

The series provides:
0- Modification of the ipte lock handling to use KVM instead of the
vcpu as an argument, because the ipte lock works on the SCA, which is
unique per KVM structure and common to all vCPUs.
1- interception of the STSI instruction forwarding the CPU topology
2- interpretation of the PTF instruction
3- a KVM capability for the userland hypervisor to ask KVM to
set up PTF interpretation.
4- KVM ioctl to get and set the MTCR bit of the SCA in order to
migrate this bit during a migration.


0- Foreword

The S390 CPU topology is reported using two instructions:
- PTF, to find out whether the CPU topology changed since the last
PTF instruction or a subsystem reset.
- STSI, to get the topology information, consisting of the topology
of the CPU inside the sockets, of the sockets inside the books etc.

The PTF(2) instruction reports a change if the STSI(15.1.2) instruction
would report a difference from the last STSI(15.1.2) instruction*.
With the SIE interpretation, the PTF(2) instruction will report a
change to the guest if the host sets the SCA.MTCR bit.

*The STSI(15.1.2) instruction reports:
- The core addresses within a socket
- The polarization of the cores
- The CPU type of the cores
- Whether the cores are dedicated or not

We decided to implement the CPU topology for S390 in several steps:

- first we report CPU hotplug

In future development we will provide:

- modification of the CPU mask inside sockets
- handling of shared CPUs
- reporting of the CPU Type
- reporting of the polarization


1- Interception of STSI

To provide Topology information to the guest through the STSI
instruction, we forward STSI with Function Code 15 to the
userland hypervisor which will take care to provide the right
information to the guest.

To let the guest use both the PTF instruction, to check whether a
topology change occurred, and the STSI(15.x.x) instruction, we add a new
KVM capability to enable the topology facility.
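As a rough illustration of the gating described above, here is a small
userspace sketch. It is not the code from the series: the struct, enum
and function names are made up for this example, and the handling of
function codes 0 to 3 is reduced to a single "handled in kernel" result.
Only the three conditions on function code 15 follow what the series
checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-VM state consulted by the check. */
struct vm_state {
	bool has_facility_11;	/* CPU topology facility enabled for the guest */
	bool user_stsi;		/* userland asked to handle STSI itself */
	bool is_protected;	/* guest runs under protected virtualization */
};

enum stsi_action {
	STSI_NO_DATA,		/* nothing is stored for the guest */
	STSI_IN_KERNEL,		/* function codes 0..3, handled by KVM */
	STSI_TO_USERLAND,	/* function code 15, forwarded to userland */
};

/* Decide what to do with an intercepted STSI based on its function code. */
static enum stsi_action stsi_dispatch(const struct vm_state *vm, unsigned int fc)
{
	assert(vm != NULL);

	/* Anything above 3 other than 15 is not provided at all. */
	if (fc > 3 && fc != 15)
		return STSI_NO_DATA;
	if (fc != 15)
		return STSI_IN_KERNEL;
	/* fc 15 needs the facility, user STSI handling, and no PV. */
	if (!vm->has_facility_11 || !vm->user_stsi || vm->is_protected)
		return STSI_NO_DATA;
	return STSI_TO_USERLAND;
}
```

With all three conditions met, stsi_dispatch(&vm, 15) yields
STSI_TO_USERLAND; dropping any one of them yields STSI_NO_DATA.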

2- Interpretation of PTF with FC(2)

The PTF instruction reports a topology change if there is any change
with a previous STSI(15.1.2) SYSIB.

Changes inside a STSI(15.1.2) SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

Considering that KVM guests currently support only:
- horizontal polarization
- type 3 (Linux) CPU

And that we decide to support only:
- dedicated CPUs on the host
- pinned vCPUs on the guest

the creation of a vCPU is the only trigger for setting the MTCR bit for
a guest.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.
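
The set-by-host, cleared-by-PTF behaviour can be modelled in a few lines
of userspace C. This is only a sketch of the semantics described in this
letter: the struct and function names are invented for the example, the
bit is a plain bool instead of a field in the real SCA, and no locking
or interpretation by the machine is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the one piece of SCA state involved. */
struct sca_model {
	bool mtcr;	/* Multiprocessor Topology-Change-Report bit */
};

/* Host side: record that the topology changed (e.g. a vCPU was created). */
static void host_report_topology_change(struct sca_model *sca)
{
	assert(sca != NULL);
	sca->mtcr = true;
}

/*
 * Guest side: model of PTF with function code 2.
 * Returns true if a change report was pending and clears the report,
 * so that a second call without a new change returns false.
 */
static bool guest_ptf_check_topology_change(struct sca_model *sca)
{
	bool changed;

	assert(sca != NULL);
	changed = sca->mtcr;
	sca->mtcr = false;
	return changed;
}
```

A guest that sees true from the check would then issue STSI(15.1.x) to
fetch the new topology.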

Regards,
Pierre

Pierre Morel (3):
KVM: s390: Cleanup ipte lock access and SIIF facility checks
KVM: s390: guest support for topology function
KVM: s390: resetting the Topology-Change-Report

Documentation/virt/kvm/api.rst | 25 +++++++++
arch/s390/include/asm/kvm_host.h | 18 +++++-
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/gaccess.c | 96 ++++++++++++++++----------------
arch/s390/kvm/gaccess.h | 6 +-
arch/s390/kvm/kvm-s390.c | 87 +++++++++++++++++++++++++++++
arch/s390/kvm/priv.c | 28 +++++++---
arch/s390/kvm/vsie.c | 8 +++
include/uapi/linux/kvm.h | 1 +
9 files changed, 209 insertions(+), 61 deletions(-)

--
2.31.1

Changelog:

from v11 to v12

- protect sca pointer
(Janis)

- check for user_stsi before returning information
to userland
(Janis)

- check for protected virtualization
(Pierre)

from v10 to v11

- access MTCR with interlocked access instead of ipte_lock
(Janis)

- set MTCR in kvm_arch_vcpu_destroy
(Nico)

- better function documentation
(Claudio)

- use a single function to set and clear
(Janosch)

- Use u8 as API data
(David, Janis)

- Check KVM_CAP_S390_USER_STSI before returning
data to userspace
(Nico)

from v9 to v10

- Suppression of the check on real CPU migration
(Christian)

- Changed the check on fc in handle_stsi
(David)

from v8 to v9

- bug correction in kvm_s390_topology_changed
(Heiko)

- simplification for ipte_lock/unlock to use kvm
as arg instead of vcpu and test on sclp.has_siif
instead of the SIE ECA_SII.
(David)

- use of a single value for reporting if the
topology changed instead of a structure
(David)

from v7 to v8

- implement reset handling
(Janosch)

- change the way to check if the topology changed
(Nico, Heiko)

from v6 to v7

- rebase

from v5 to v6

- make the subject more accurate
(Claudio)

- Change the kvm_s390_set_mtcr() function to have vcpu in the name
(Janosch)

- Replace the checks on ECB_PTF with the check of facility 11
(Janosch)

- modify kvm_arch_vcpu_load, move the check in a function in
the header file
(Janosch)

- No magic number: replace the "new cpu value" of -1 with a define
(Janosch)

- Make the checks for STSI validity clearer
(Janosch)

from v4 to v5

- modify the way KVM_CAP is tested to be OK with vsie
(David)

from v3 to v4

- squash both patches
(David)

- Added Documentation
(David)

- Modified the detection for new vCPUs
(Pierre)

from v2 to v3

- use PTF interpretation
(Christian)

- optimize arch_update_cpu_topology using PTF
(Pierre)

from v1 to v2:

- Add a KVM capability to let QEMU know we support PTF and STSI 15
(David)

- check KVM facility 11 before accepting STSI fc 15
(David)

- handle all we can in userland
(David)

- add tracing to STSI fc 15
(Connie)


2022-07-11 09:05:09

by Pierre Morel

Subject: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report

During a subsystem reset the Topology-Change-Report is cleared.

Let's give userland the possibility to clear the MTCR in the case
of a subsystem reset.

To migrate the MTCR, we give userland the possibility to
query the MTCR state.

We indicate KVM support for the CPU topology facility with a new
KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
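
For illustration, userland could drive this interface roughly as
follows. This is a sketch that assumes the interface exactly as proposed
in this version of the patch: the value to set is passed in attr->attr,
and the value to get is stored into the byte that attr->addr points to.
The helper names are made up, and the fallback define only repeats the
group number added by this patch for builds where the s390 UAPI header
is not in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_S390_VM_CPU_TOPOLOGY
#define KVM_S390_VM_CPU_TOPOLOGY 5	/* group number from this patch */
#endif

/* Query the MTCR state of a VM; returns 0 on success, -1 on error. */
static int topology_report_get(int vm_fd, uint8_t *val)
{
	struct kvm_device_attr attr = {
		.group = KVM_S390_VM_CPU_TOPOLOGY,
		.addr = (uint64_t)(uintptr_t)val,	/* byte to be filled in */
	};

	assert(val != NULL);
	return ioctl(vm_fd, KVM_GET_DEVICE_ATTR, &attr);
}

/* Set (val != 0) or clear (val == 0) the MTCR; returns 0 or -1. */
static int topology_report_set(int vm_fd, int val)
{
	struct kvm_device_attr attr = {
		.group = KVM_S390_VM_CPU_TOPOLOGY,
		.attr = !!val,				/* new value of the bit */
	};

	return ioctl(vm_fd, KVM_SET_DEVICE_ATTR, &attr);
}
```

On a subsystem reset userland would call topology_report_set(vm_fd, 0);
for migration it would read the bit with topology_report_get() on the
source and write it back with topology_report_set() on the target.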

Signed-off-by: Pierre Morel <[email protected]>
---
Documentation/virt/kvm/api.rst | 25 ++++++++++++++
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/kvm-s390.c | 56 ++++++++++++++++++++++++++++++++
include/uapi/linux/kvm.h | 1 +
4 files changed, 83 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 11e00a46c610..5e086125d8ad 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.

+8.37 KVM_CAP_S390_CPU_TOPOLOGY
+------------------------------
+
+:Capability: KVM_CAP_S390_CPU_TOPOLOGY
+:Architectures: s390
+:Type: vm
+
+This capability indicates that KVM will provide the S390 CPU Topology
+facility which consist of the interpretation of the PTF instruction for
+the function code 2 along with interception and forwarding of both the
+PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
+instruction to the userland hypervisor.
+
+The stfle facility 11, CPU Topology facility, should not be indicated
+to the guest without this capability.
+
+When this capability is present, KVM provides a new attribute group
+on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
+This new attribute allows to get, set or clear the Modified Change
+Topology Report (MTCR) bit of the SCA through the kvm_device_attr
+structure.
+
+When getting the Modified Change Topology Report value, the attr->addr
+must point to a byte where the value will be stored.
+
9. Known KVM API problems
=========================

diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 7a6b14874d65..a73cf01a1606 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
#define KVM_S390_VM_CRYPTO 2
#define KVM_S390_VM_CPU_MODEL 3
#define KVM_S390_VM_MIGRATION 4
+#define KVM_S390_VM_CPU_TOPOLOGY 5

/* kvm attributes for mem_ctrl */
#define KVM_S390_VM_MEM_ENABLE_CMMA 0
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 70436bfff53a..b18e0b940b26 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_S390_PROTECTED:
r = is_prot_virt_host();
break;
+ case KVM_CAP_S390_CPU_TOPOLOGY:
+ r = test_facility(11);
+ break;
default:
r = 0;
}
@@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
icpt_operexc_on_all_vcpus(kvm);
r = 0;
break;
+ case KVM_CAP_S390_CPU_TOPOLOGY:
+ r = -EINVAL;
+ mutex_lock(&kvm->lock);
+ if (kvm->created_vcpus) {
+ r = -EBUSY;
+ } else if (test_facility(11)) {
+ set_kvm_facility(kvm->arch.model.fac_mask, 11);
+ set_kvm_facility(kvm->arch.model.fac_list, 11);
+ r = 0;
+ }
+ mutex_unlock(&kvm->lock);
+ VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
+ r ? "(not available)" : "(success)");
+ break;
default:
r = -EINVAL;
break;
@@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
read_unlock(&kvm->arch.sca_lock);
}

+static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+ if (!test_kvm_facility(kvm, 11))
+ return -ENXIO;
+
+ kvm_s390_update_topology_change_report(kvm, !!attr->attr);
+ return 0;
+}
+
+static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+ union sca_utility utility;
+ struct bsca_block *sca;
+ __u8 topo;
+
+ if (!test_kvm_facility(kvm, 11))
+ return -ENXIO;
+
+ read_lock(&kvm->arch.sca_lock);
+ sca = kvm->arch.sca;
+ utility.val = READ_ONCE(sca->utility.val);
+ read_unlock(&kvm->arch.sca_lock);
+ topo = utility.mtcr;
+
+ if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
+ return -EFAULT;
+
+ return 0;
+}
+
static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
{
int ret;
@@ -1737,6 +1784,9 @@ static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_MIGRATION:
ret = kvm_s390_vm_set_migration(kvm, attr);
break;
+ case KVM_S390_VM_CPU_TOPOLOGY:
+ ret = kvm_s390_set_topology(kvm, attr);
+ break;
default:
ret = -ENXIO;
break;
@@ -1762,6 +1812,9 @@ static int kvm_s390_vm_get_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_MIGRATION:
ret = kvm_s390_vm_get_migration(kvm, attr);
break;
+ case KVM_S390_VM_CPU_TOPOLOGY:
+ ret = kvm_s390_get_topology(kvm, attr);
+ break;
default:
ret = -ENXIO;
break;
@@ -1835,6 +1888,9 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_MIGRATION:
ret = 0;
break;
+ case KVM_S390_VM_CPU_TOPOLOGY:
+ ret = test_kvm_facility(kvm, 11) ? 0 : -ENXIO;
+ break;
default:
ret = -ENXIO;
break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5088bd9f1922..33317d820032 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1157,6 +1157,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_VM_TSC_CONTROL 214
#define KVM_CAP_SYSTEM_EVENT_DATA 215
#define KVM_CAP_ARM_SYSTEM_SUSPEND 216
+#define KVM_CAP_S390_CPU_TOPOLOGY 217

#ifdef KVM_CAP_IRQ_ROUTING

--
2.31.1

2022-07-11 09:05:23

by Pierre Morel

Subject: [PATCH v12 2/3] KVM: s390: guest support for topology function

We report a topology change to the guest for any CPU hotplug.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry in the guest's
SCA which will be cleared during the interpretation of PTF.

On every vCPU creation we set the MTCR bit to let the guest know, the
next time it uses the PTF instruction with function code 2, that the
topology changed and that it should use the STSI(15.1.x) instruction
to get the topology details.

STSI(15.1.x) gives information on the CPU configuration topology.
Let's accept the interception of STSI with the function code 15 and
let the userland part of the hypervisor handle it when userland
supports the CPU Topology facility.

Signed-off-by: Pierre Morel <[email protected]>
Reviewed-by: Nico Boehr <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 18 +++++++++++++++---
arch/s390/kvm/kvm-s390.c | 31 +++++++++++++++++++++++++++++++
arch/s390/kvm/priv.c | 22 ++++++++++++++++++----
arch/s390/kvm/vsie.c | 8 ++++++++
4 files changed, 72 insertions(+), 7 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 766028d54a3e..ae6bd3d607de 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -93,19 +93,30 @@ union ipte_control {
};
};

+union sca_utility {
+ __u16 val;
+ struct {
+ __u16 mtcr : 1;
+ __u16 reserved : 15;
+ };
+};
+
struct bsca_block {
union ipte_control ipte_control;
__u64 reserved[5];
__u64 mcn;
- __u64 reserved2;
+ union sca_utility utility;
+ __u8 reserved2[6];
struct bsca_entry cpu[KVM_S390_BSCA_CPU_SLOTS];
};

struct esca_block {
union ipte_control ipte_control;
- __u64 reserved1[7];
+ __u64 reserved1[6];
+ union sca_utility utility;
+ __u8 reserved2[6];
__u64 mcn[4];
- __u64 reserved2[20];
+ __u64 reserved3[20];
struct esca_entry cpu[KVM_S390_ESCA_CPU_SLOTS];
};

@@ -249,6 +260,7 @@ struct kvm_s390_sie_block {
#define ECB_SPECI 0x08
#define ECB_SRSI 0x04
#define ECB_HOSTPROTINT 0x02
+#define ECB_PTF 0x01
__u8 ecb; /* 0x0061 */
#define ECB2_CMMA 0x80
#define ECB2_IEP 0x20
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 8fcb56141689..70436bfff53a 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1691,6 +1691,32 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
return ret;
}

+/**
+ * kvm_s390_update_topology_change_report - update CPU topology change report
+ * @kvm: guest KVM description
+ * @val: set or clear the MTCR bit
+ *
+ * Updates the Multiprocessor Topology-Change-Report bit to signal
+ * the guest with a topology change.
+ * This is only relevant if the topology facility is present.
+ *
+ * The SCA version, bsca or esca, doesn't matter as offset is the same.
+ */
+static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
+{
+ union sca_utility new, old;
+ struct bsca_block *sca;
+
+ read_lock(&kvm->arch.sca_lock);
+ do {
+ sca = kvm->arch.sca;
+ old = READ_ONCE(sca->utility);
+ new = old;
+ new.mtcr = val;
+ } while (cmpxchg(&sca->utility.val, old.val, new.val) != old.val);
+ read_unlock(&kvm->arch.sca_lock);
+}
+
static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
{
int ret;
@@ -2877,6 +2903,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
kvm_clear_async_pf_completion_queue(vcpu);
if (!kvm_is_ucontrol(vcpu->kvm))
sca_del_vcpu(vcpu);
+ kvm_s390_update_topology_change_report(vcpu->kvm, 1);

if (kvm_is_ucontrol(vcpu->kvm))
gmap_remove(vcpu->arch.gmap);
@@ -3272,6 +3299,8 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
if (test_kvm_facility(vcpu->kvm, 9))
vcpu->arch.sie_block->ecb |= ECB_SRSI;
+ if (test_kvm_facility(vcpu->kvm, 11))
+ vcpu->arch.sie_block->ecb |= ECB_PTF;
if (test_kvm_facility(vcpu->kvm, 73))
vcpu->arch.sie_block->ecb |= ECB_TE;
if (!kvm_is_ucontrol(vcpu->kvm))
@@ -3403,6 +3432,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
rc = kvm_s390_vcpu_setup(vcpu);
if (rc)
goto out_ucontrol_uninit;
+
+ kvm_s390_update_topology_change_report(vcpu->kvm, 1);
return 0;

out_ucontrol_uninit:
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 12c464c7cddf..a0f41f65a4f1 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -873,10 +873,20 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);

- if (fc > 3) {
- kvm_s390_set_psw_cc(vcpu, 3);
- return 0;
- }
+ /* Bailout forbidden function codes */
+ if (fc > 3 && fc != 15)
+ goto out_no_data;
+
+ /*
+ * fc 15 is fully provided only with
+ * - PTF/CPU topology support through facility 11
+ * - KVM_CAP_S390_USER_STSI
+ * - and is not provided with protected virtualization
+ */
+ if (fc == 15 && (!test_kvm_facility(vcpu->kvm, 11) ||
+ !vcpu->kvm->arch.user_stsi ||
+ kvm_s390_pv_cpu_is_protected(vcpu)))
+ goto out_no_data;

if (vcpu->run->s.regs.gprs[0] & 0x0fffff00
|| vcpu->run->s.regs.gprs[1] & 0xffff0000)
@@ -910,6 +920,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
goto out_no_data;
handle_stsi_3_2_2(vcpu, (void *) mem);
break;
+ case 15: /* fc 15 is fully handled in userspace */
+ insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
+ trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
+ return -EREMOTE;
}
if (kvm_s390_pv_cpu_is_protected(vcpu)) {
memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index dada78b92691..94138f8f0c1c 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -503,6 +503,14 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
/* Host-protection-interruption introduced with ESOP */
if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
+ /*
+ * CPU Topology
+ * This facility only uses the utility field of the SCA and none of
+ * the cpu entries that are problematic with the other interpretation
+ * facilities so we can pass it through
+ */
+ if (test_kvm_facility(vcpu->kvm, 11))
+ scb_s->ecb |= scb_o->ecb & ECB_PTF;
/* transactional execution */
if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
/* remap the prefix is tx is toggled on */
--
2.31.1

2022-07-11 09:10:06

by Pierre Morel

Subject: [PATCH v12 1/3] KVM: s390: Cleanup ipte lock access and SIIF facility checks

We can check if SIIF is enabled by testing the sclp_info struct
instead of testing the sie control block eca variable as that
facility is always enabled if available.

Also let's cleanup all the ipte related struct member accesses
which currently happen by referencing the KVM struct via the
VCPU struct.
Making the KVM struct the parameter to the ipte_* functions
removes one level of indirection which makes the code more readable.

Signed-off-by: Pierre Morel <[email protected]>
Reviewed-by: Janosch Frank <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Nico Boehr <[email protected]>
---
arch/s390/kvm/gaccess.c | 96 ++++++++++++++++++++---------------------
arch/s390/kvm/gaccess.h | 6 +--
arch/s390/kvm/priv.c | 6 +--
3 files changed, 54 insertions(+), 54 deletions(-)

diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index 227ed0009354..082ec5f2c3a5 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -262,77 +262,77 @@ struct aste {
/* .. more fields there */
};

-int ipte_lock_held(struct kvm_vcpu *vcpu)
+int ipte_lock_held(struct kvm *kvm)
{
- if (vcpu->arch.sie_block->eca & ECA_SII) {
+ if (sclp.has_siif) {
int rc;

- read_lock(&vcpu->kvm->arch.sca_lock);
- rc = kvm_s390_get_ipte_control(vcpu->kvm)->kh != 0;
- read_unlock(&vcpu->kvm->arch.sca_lock);
+ read_lock(&kvm->arch.sca_lock);
+ rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
+ read_unlock(&kvm->arch.sca_lock);
return rc;
}
- return vcpu->kvm->arch.ipte_lock_count != 0;
+ return kvm->arch.ipte_lock_count != 0;
}

-static void ipte_lock_simple(struct kvm_vcpu *vcpu)
+static void ipte_lock_simple(struct kvm *kvm)
{
union ipte_control old, new, *ic;

- mutex_lock(&vcpu->kvm->arch.ipte_mutex);
- vcpu->kvm->arch.ipte_lock_count++;
- if (vcpu->kvm->arch.ipte_lock_count > 1)
+ mutex_lock(&kvm->arch.ipte_mutex);
+ kvm->arch.ipte_lock_count++;
+ if (kvm->arch.ipte_lock_count > 1)
goto out;
retry:
- read_lock(&vcpu->kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(vcpu->kvm);
+ read_lock(&kvm->arch.sca_lock);
+ ic = kvm_s390_get_ipte_control(kvm);
do {
old = READ_ONCE(*ic);
if (old.k) {
- read_unlock(&vcpu->kvm->arch.sca_lock);
+ read_unlock(&kvm->arch.sca_lock);
cond_resched();
goto retry;
}
new = old;
new.k = 1;
} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
- read_unlock(&vcpu->kvm->arch.sca_lock);
+ read_unlock(&kvm->arch.sca_lock);
out:
- mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
+ mutex_unlock(&kvm->arch.ipte_mutex);
}

-static void ipte_unlock_simple(struct kvm_vcpu *vcpu)
+static void ipte_unlock_simple(struct kvm *kvm)
{
union ipte_control old, new, *ic;

- mutex_lock(&vcpu->kvm->arch.ipte_mutex);
- vcpu->kvm->arch.ipte_lock_count--;
- if (vcpu->kvm->arch.ipte_lock_count)
+ mutex_lock(&kvm->arch.ipte_mutex);
+ kvm->arch.ipte_lock_count--;
+ if (kvm->arch.ipte_lock_count)
goto out;
- read_lock(&vcpu->kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(vcpu->kvm);
+ read_lock(&kvm->arch.sca_lock);
+ ic = kvm_s390_get_ipte_control(kvm);
do {
old = READ_ONCE(*ic);
new = old;
new.k = 0;
} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
- read_unlock(&vcpu->kvm->arch.sca_lock);
- wake_up(&vcpu->kvm->arch.ipte_wq);
+ read_unlock(&kvm->arch.sca_lock);
+ wake_up(&kvm->arch.ipte_wq);
out:
- mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
+ mutex_unlock(&kvm->arch.ipte_mutex);
}

-static void ipte_lock_siif(struct kvm_vcpu *vcpu)
+static void ipte_lock_siif(struct kvm *kvm)
{
union ipte_control old, new, *ic;

retry:
- read_lock(&vcpu->kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(vcpu->kvm);
+ read_lock(&kvm->arch.sca_lock);
+ ic = kvm_s390_get_ipte_control(kvm);
do {
old = READ_ONCE(*ic);
if (old.kg) {
- read_unlock(&vcpu->kvm->arch.sca_lock);
+ read_unlock(&kvm->arch.sca_lock);
cond_resched();
goto retry;
}
@@ -340,15 +340,15 @@ static void ipte_lock_siif(struct kvm_vcpu *vcpu)
new.k = 1;
new.kh++;
} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
- read_unlock(&vcpu->kvm->arch.sca_lock);
+ read_unlock(&kvm->arch.sca_lock);
}

-static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
+static void ipte_unlock_siif(struct kvm *kvm)
{
union ipte_control old, new, *ic;

- read_lock(&vcpu->kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(vcpu->kvm);
+ read_lock(&kvm->arch.sca_lock);
+ ic = kvm_s390_get_ipte_control(kvm);
do {
old = READ_ONCE(*ic);
new = old;
@@ -356,25 +356,25 @@ static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
if (!new.kh)
new.k = 0;
} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
- read_unlock(&vcpu->kvm->arch.sca_lock);
+ read_unlock(&kvm->arch.sca_lock);
if (!new.kh)
- wake_up(&vcpu->kvm->arch.ipte_wq);
+ wake_up(&kvm->arch.ipte_wq);
}

-void ipte_lock(struct kvm_vcpu *vcpu)
+void ipte_lock(struct kvm *kvm)
{
- if (vcpu->arch.sie_block->eca & ECA_SII)
- ipte_lock_siif(vcpu);
+ if (sclp.has_siif)
+ ipte_lock_siif(kvm);
else
- ipte_lock_simple(vcpu);
+ ipte_lock_simple(kvm);
}

-void ipte_unlock(struct kvm_vcpu *vcpu)
+void ipte_unlock(struct kvm *kvm)
{
- if (vcpu->arch.sie_block->eca & ECA_SII)
- ipte_unlock_siif(vcpu);
+ if (sclp.has_siif)
+ ipte_unlock_siif(kvm);
else
- ipte_unlock_simple(vcpu);
+ ipte_unlock_simple(kvm);
}

static int ar_translation(struct kvm_vcpu *vcpu, union asce *asce, u8 ar,
@@ -1086,7 +1086,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
try_storage_prot_override = storage_prot_override_applicable(vcpu);
need_ipte_lock = psw_bits(*psw).dat && !asce.r;
if (need_ipte_lock)
- ipte_lock(vcpu);
+ ipte_lock(vcpu->kvm);
/*
* Since we do the access further down ultimately via a move instruction
* that does key checking and returns an error in case of a protection
@@ -1127,7 +1127,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
}
out_unlock:
if (need_ipte_lock)
- ipte_unlock(vcpu);
+ ipte_unlock(vcpu->kvm);
if (nr_pages > ARRAY_SIZE(gpa_array))
vfree(gpas);
return rc;
@@ -1199,10 +1199,10 @@ int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
rc = get_vcpu_asce(vcpu, &asce, gva, ar, mode);
if (rc)
return rc;
- ipte_lock(vcpu);
+ ipte_lock(vcpu->kvm);
rc = guest_range_to_gpas(vcpu, gva, ar, NULL, length, asce, mode,
access_key);
- ipte_unlock(vcpu);
+ ipte_unlock(vcpu->kvm);

return rc;
}
@@ -1465,7 +1465,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
* tables/pointers we read stay valid - unshadowing is however
* always possible - only guest_table_lock protects us.
*/
- ipte_lock(vcpu);
+ ipte_lock(vcpu->kvm);

rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);
if (rc)
@@ -1499,7 +1499,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
pte.p |= dat_protection;
if (!rc)
rc = gmap_shadow_page(sg, saddr, __pte(pte.val));
- ipte_unlock(vcpu);
+ ipte_unlock(vcpu->kvm);
mmap_read_unlock(sg->mm);
return rc;
}
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index 1124ff282012..9408d6cc8e2c 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -440,9 +440,9 @@ int read_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
return access_guest_real(vcpu, gra, data, len, 0);
}

-void ipte_lock(struct kvm_vcpu *vcpu);
-void ipte_unlock(struct kvm_vcpu *vcpu);
-int ipte_lock_held(struct kvm_vcpu *vcpu);
+void ipte_lock(struct kvm *kvm);
+void ipte_unlock(struct kvm *kvm);
+int ipte_lock_held(struct kvm *kvm);
int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra);

/* MVPG PEI indication bits */
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 83bb5cf97282..12c464c7cddf 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -442,7 +442,7 @@ static int handle_ipte_interlock(struct kvm_vcpu *vcpu)
vcpu->stat.instruction_ipte_interlock++;
if (psw_bits(vcpu->arch.sie_block->gpsw).pstate)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
- wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu));
+ wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu->kvm));
kvm_s390_retry_instr(vcpu);
VCPU_EVENT(vcpu, 4, "%s", "retrying ipte interlock operation");
return 0;
@@ -1471,7 +1471,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
access_key = (operand2 & 0xf0) >> 4;

if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
- ipte_lock(vcpu);
+ ipte_lock(vcpu->kvm);

ret = guest_translate_address_with_key(vcpu, address, ar, &gpa,
GACC_STORE, access_key);
@@ -1508,7 +1508,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
}

if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
- ipte_unlock(vcpu);
+ ipte_unlock(vcpu->kvm);
return ret;
}

--
2.31.1

2022-07-11 13:01:29

by Janis Schoetterl-Glausch

Subject: Re: [PATCH v12 2/3] KVM: s390: guest support for topology function

On 7/11/22 10:41, Pierre Morel wrote:
> We report a topology change to the guest for any CPU hotplug.
>
> The reporting to the guest is done using the Multiprocessor
> Topology-Change-Report (MTCR) bit of the utility entry in the guest's
> SCA which will be cleared during the interpretation of PTF.
>
> On every vCPU creation we set the MCTR bit to let the guest know the
> next time it uses the PTF with command 2 instruction that the
> topology changed and that it should use the STSI(15.1.x) instruction
> to get the topology details.
>
> STSI(15.1.x) gives information on the CPU configuration topology.
> Let's accept the interception of STSI with the function code 15 and
> let the userland part of the hypervisor handle it when userland
> supports the CPU Topology facility.
>
> Signed-off-by: Pierre Morel <[email protected]>
> Reviewed-by: Nico Boehr <[email protected]>

Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
See nit below.
> ---
> arch/s390/include/asm/kvm_host.h | 18 +++++++++++++++---
> arch/s390/kvm/kvm-s390.c | 31 +++++++++++++++++++++++++++++++
> arch/s390/kvm/priv.c | 22 ++++++++++++++++++----
> arch/s390/kvm/vsie.c | 8 ++++++++
> 4 files changed, 72 insertions(+), 7 deletions(-)
>

[...]

> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 8fcb56141689..70436bfff53a 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1691,6 +1691,32 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
> return ret;
> }
>
> +/**
> + * kvm_s390_update_topology_change_report - update CPU topology change report
> + * @kvm: guest KVM description
> + * @val: set or clear the MTCR bit
> + *
> + * Updates the Multiprocessor Topology-Change-Report bit to signal
> + * the guest with a topology change.
> + * This is only relevant if the topology facility is present.
> + *
> + * The SCA version, bsca or esca, doesn't matter as offset is the same.
> + */
> +static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
> +{
> + union sca_utility new, old;
> + struct bsca_block *sca;
> +
> + read_lock(&kvm->arch.sca_lock);
> + do {
> + sca = kvm->arch.sca;

I find this assignment being in the loop unintuitive, but it should not make a difference.

> + old = READ_ONCE(sca->utility);
> + new = old;
> + new.mtcr = val;
> + } while (cmpxchg(&sca->utility.val, old.val, new.val) != old.val);
> + read_unlock(&kvm->arch.sca_lock);
> +}
> +
[...]
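
To make the nit concrete, here is a userspace model of the same update
with the pointer taken once, before the loop. It is a sketch, not kernel
code: it uses C11 atomics in place of cmpxchg(), the struct name is
invented, the sca_lock that keeps the SCA pointer stable in the kernel
is left out, and the mask assumes that mtcr is the most significant bit
of the 16-bit utility field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: MTCR is the most significant bit of the utility halfword. */
#define UTILITY_MTCR ((uint16_t)0x8000)

/* Hypothetical userspace stand-in for the SCA utility field. */
struct sca_model {
	_Atomic uint16_t utility;
};

/* Set or clear MTCR without disturbing the other bits of the field. */
static void update_topology_change_report(struct sca_model *sca, bool val)
{
	uint16_t old, new;

	assert(sca != NULL);
	old = atomic_load(&sca->utility);
	do {
		new = val ? (uint16_t)(old | UTILITY_MTCR)
			  : (uint16_t)(old & ~UTILITY_MTCR);
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&sca->utility, &old, new));
}
```

The loop only retries when another writer changed the field between the
load and the exchange, which is the same guarantee the cmpxchg() loop
gives.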

2022-07-11 13:53:08

by Janis Schoetterl-Glausch

Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report

On 7/11/22 10:41, Pierre Morel wrote:
> During a subsystem reset the Topology-Change-Report is cleared.
>
> Let's give userland the possibility to clear the MTCR in the case
> of a subsystem reset.
>
> To migrate the MTCR, we give userland the possibility to
> query the MTCR state.
>
> We indicate KVM support for the CPU topology facility with a new
> KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>
> Signed-off-by: Pierre Morel <[email protected]>

Reviewed-by: Janis Schoetterl-Glausch <[email protected]>

See nits/comments below.

> ---
> Documentation/virt/kvm/api.rst | 25 ++++++++++++++
> arch/s390/include/uapi/asm/kvm.h | 1 +
> arch/s390/kvm/kvm-s390.c | 56 ++++++++++++++++++++++++++++++++
> include/uapi/linux/kvm.h | 1 +
> 4 files changed, 83 insertions(+)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 11e00a46c610..5e086125d8ad 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
> When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
> type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
>
> +8.37 KVM_CAP_S390_CPU_TOPOLOGY
> +------------------------------
> +
> +:Capability: KVM_CAP_S390_CPU_TOPOLOGY
> +:Architectures: s390
> +:Type: vm
> +
> +This capability indicates that KVM will provide the S390 CPU Topology
> +facility which consist of the interpretation of the PTF instruction for
> +the function code 2 along with interception and forwarding of both the
> +PTF instruction with function codes 0 or 1 and the STSI(15,1,x)

Is the architecture allowed to extend STSI without a facility?
If so, if we say here that STSI 15.1.x is passed to user space, then
I think we should have a

if (sel1 != 1)
goto out_no_data;

or maybe even

if (sel1 != 1 || sel2 < 2 || sel2 > 6)
goto out_no_data;

in priv.c
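
The stricter of the two checks could be factored into a small helper
along these lines. This is a sketch only: the helper names are made up
and are not part of the series, and the accepted range simply repeats
the condition suggested above.

```c
#include <assert.h>
#include <stdbool.h>

/* Strict variant: only STSI(15.1.2) through STSI(15.1.6) are accepted. */
static bool stsi_15_selectors_valid(unsigned int sel1, unsigned int sel2)
{
	return sel1 == 1 && sel2 >= 2 && sel2 <= 6;
}

/* A few spot checks of accepted and rejected selector pairs. */
void stsi_15_selector_examples(void)
{
	assert(stsi_15_selectors_valid(1, 2));
	assert(stsi_15_selectors_valid(1, 6));
	assert(!stsi_15_selectors_valid(1, 1));
	assert(!stsi_15_selectors_valid(1, 7));
	assert(!stsi_15_selectors_valid(2, 2));
}
```

handle_stsi() would then branch to out_no_data for fc 15 whenever the
helper returns false.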

> +instruction to the userland hypervisor.
> +
> +The stfle facility 11, CPU Topology facility, should not be indicated
> +to the guest without this capability.
> +
> +When this capability is present, KVM provides a new attribute group
> +on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
> +This new attribute allows to get, set or clear the Modified Change

get or set, now that there is no explicit clear anymore.

> +Topology Report (MTCR) bit of the SCA through the kvm_device_attr
> +structure.
> +
> +When getting the Modified Change Topology Report value, the attr->addr

When getting/setting the...

> +must point to a byte where the value will be stored.

... will be stored/retrieved from.
> +
> 9. Known KVM API problems
> =========================
>
> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> index 7a6b14874d65..a73cf01a1606 100644
> --- a/arch/s390/include/uapi/asm/kvm.h
> +++ b/arch/s390/include/uapi/asm/kvm.h
> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
> #define KVM_S390_VM_CRYPTO 2
> #define KVM_S390_VM_CPU_MODEL 3
> #define KVM_S390_VM_MIGRATION 4
> +#define KVM_S390_VM_CPU_TOPOLOGY 5
>
> /* kvm attributes for mem_ctrl */
> #define KVM_S390_VM_MEM_ENABLE_CMMA 0
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 70436bfff53a..b18e0b940b26 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_S390_PROTECTED:
> r = is_prot_virt_host();
> break;
> + case KVM_CAP_S390_CPU_TOPOLOGY:
> + r = test_facility(11);
> + break;
> default:
> r = 0;
> }
> @@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> icpt_operexc_on_all_vcpus(kvm);
> r = 0;
> break;
> + case KVM_CAP_S390_CPU_TOPOLOGY:
> + r = -EINVAL;
> + mutex_lock(&kvm->lock);
> + if (kvm->created_vcpus) {
> + r = -EBUSY;
> + } else if (test_facility(11)) {
> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
> + set_kvm_facility(kvm->arch.model.fac_list, 11);
> + r = 0;
> + }
> + mutex_unlock(&kvm->lock);
> + VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
> + r ? "(not available)" : "(success)");
> + break;
> default:
> r = -EINVAL;
> break;
> @@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
> read_unlock(&kvm->arch.sca_lock);
> }
>
> +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)

kvm_s390_set_topology_changed maybe?
kvm_s390_get_topology_changed below then.

> +{
> + if (!test_kvm_facility(kvm, 11))
> + return -ENXIO;
> +
> + kvm_s390_update_topology_change_report(kvm, !!attr->attr);
> + return 0;
> +}
> +
> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
> +{
> + union sca_utility utility;
> + struct bsca_block *sca;
> + __u8 topo;
> +
> + if (!test_kvm_facility(kvm, 11))
> + return -ENXIO;
> +
> + read_lock(&kvm->arch.sca_lock);
> + sca = kvm->arch.sca;
> + utility.val = READ_ONCE(sca->utility.val);

I don't think you need the READ_ONCE anymore; now that there is a lock, it should act as a compiler barrier.
> + read_unlock(&kvm->arch.sca_lock);
> + topo = utility.mtcr;
> +
> + if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))

Why void not u8?

> + return -EFAULT;
> +
> + return 0;
> +}
> +
[...]

2022-07-12 07:32:13

by Pierre Morel

Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report



On 7/11/22 15:22, Janis Schoetterl-Glausch wrote:
> On 7/11/22 10:41, Pierre Morel wrote:
>> During a subsystem reset the Topology-Change-Report is cleared.
>>
>> Let's give userland the possibility to clear the MTCR in the case
>> of a subsystem reset.
>>
>> To migrate the MTCR, we give userland the possibility to
>> query the MTCR state.
>>
>> We indicate KVM support for the CPU topology facility with a new
>> KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> Signed-off-by: Pierre Morel <[email protected]>
>
> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
>

Thanks!

> See nits/comments below.
>
>> ---
>> Documentation/virt/kvm/api.rst | 25 ++++++++++++++
>> arch/s390/include/uapi/asm/kvm.h | 1 +
>> arch/s390/kvm/kvm-s390.c | 56 ++++++++++++++++++++++++++++++++
>> include/uapi/linux/kvm.h | 1 +
>> 4 files changed, 83 insertions(+)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 11e00a46c610..5e086125d8ad 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>> When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
>> type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
>>
>> +8.37 KVM_CAP_S390_CPU_TOPOLOGY
>> +------------------------------
>> +
>> +:Capability: KVM_CAP_S390_CPU_TOPOLOGY
>> +:Architectures: s390
>> +:Type: vm
>> +
>> +This capability indicates that KVM will provide the S390 CPU Topology
>> +facility which consist of the interpretation of the PTF instruction for
>> +the function code 2 along with interception and forwarding of both the
>> +PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
>
> Is the architecture allowed to extend STSI without a facility?
> If so, if we say here that STSI 15.1.x is passed to user space, then
> I think we should have a
>
> if (sel1 != 1)
> goto out_no_data;
>
> or maybe even
>
> if (sel1 != 1 || sel2 < 2 || sel2 > 6)
> goto out_no_data;
>
> in priv.c

I am not a big fan of doing everything in the kernel.
Here we have no performance issue, since it is an error of the guest if
it sends a wrong selector.

Even testing the facility or PV in the kernel is, in my opinion, arguable
when we do not do any processing in the kernel.

I do not see what it brings us; it increases the LOC and makes the
implementation harder to evolve.


>
>> +instruction to the userland hypervisor.
>> +
>> +The stfle facility 11, CPU Topology facility, should not be indicated
>> +to the guest without this capability.
>> +
>> +When this capability is present, KVM provides a new attribute group
>> +on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
>> +This new attribute allows to get, set or clear the Modified Change
>
> get or set, now that there is no explicit clear anymore.

Yes, now it is a set to 0, but the action of clearing remains.

>
>> +Topology Report (MTCR) bit of the SCA through the kvm_device_attr
>> +structure.
>> +
>> +When getting the Modified Change Topology Report value, the attr->addr
>
> When getting/setting the...
>
>> +must point to a byte where the value will be stored.
>
> ... will be stored/retrieved from.

OK


>> +
>> 9. Known KVM API problems
>> =========================
>>
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index 7a6b14874d65..a73cf01a1606 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>> #define KVM_S390_VM_CRYPTO 2
>> #define KVM_S390_VM_CPU_MODEL 3
>> #define KVM_S390_VM_MIGRATION 4
>> +#define KVM_S390_VM_CPU_TOPOLOGY 5
>>
>> /* kvm attributes for mem_ctrl */
>> #define KVM_S390_VM_MEM_ENABLE_CMMA 0
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 70436bfff53a..b18e0b940b26 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> case KVM_CAP_S390_PROTECTED:
>> r = is_prot_virt_host();
>> break;
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>> + r = test_facility(11);
>> + break;
>> default:
>> r = 0;
>> }
>> @@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>> icpt_operexc_on_all_vcpus(kvm);
>> r = 0;
>> break;
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>> + r = -EINVAL;
>> + mutex_lock(&kvm->lock);
>> + if (kvm->created_vcpus) {
>> + r = -EBUSY;
>> + } else if (test_facility(11)) {
>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>> + r = 0;
>> + }
>> + mutex_unlock(&kvm->lock);
>> + VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
>> + r ? "(not available)" : "(success)");
>> + break;
>> default:
>> r = -EINVAL;
>> break;
>> @@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>> read_unlock(&kvm->arch.sca_lock);
>> }
>>
>> +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>
> kvm_s390_set_topology_changed maybe?
> kvm_s390_get_topology_changed below then.

No strong opinion; if you prefer, I will change this.

>
>> +{
>> + if (!test_kvm_facility(kvm, 11))
>> + return -ENXIO;
>> +
>> + kvm_s390_update_topology_change_report(kvm, !!attr->attr);
>> + return 0;
>> +}
>> +
>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> + union sca_utility utility;
>> + struct bsca_block *sca;
>> + __u8 topo;
>> +
>> + if (!test_kvm_facility(kvm, 11))
>> + return -ENXIO;
>> +
>> + read_lock(&kvm->arch.sca_lock);
>> + sca = kvm->arch.sca;
>> + utility.val = READ_ONCE(sca->utility.val);
>
> I don't think you need the READ_ONCE anymore, now that there is a lock it should act as a compile barrier.

I think you are right.

>> + read_unlock(&kvm->arch.sca_lock);
>> + topo = utility.mtcr;
>> +
>> + if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
>
> Why void not u8?

I like to say that we write "topo" with the size of "topo",
so we do not need to verify the effective size of topo.
But I understand: it is a UAPI, so using u8 in the copy_to_user makes
sense too.
In my personal opinion, I would have preferred that userland tell us the
size it expects even here. In this special case, since we use a byte, we
cannot really go wrong.

>
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
> [...]
>

--
Pierre Morel
IBM Lab Boeblingen

2022-07-12 07:46:46

by Pierre Morel

Subject: Re: [PATCH v12 2/3] KVM: s390: guest support for topology function



On 7/11/22 14:30, Janis Schoetterl-Glausch wrote:
> On 7/11/22 10:41, Pierre Morel wrote:
>> We report a topology change to the guest for any CPU hotplug.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry in the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> On every vCPU creation we set the MCTR bit to let the guest know the
>> next time it uses the PTF with command 2 instruction that the
>> topology changed and that it should use the STSI(15.1.x) instruction
>> to get the topology details.
>>
>> STSI(15.1.x) gives information on the CPU configuration topology.
>> Let's accept the interception of STSI with the function code 15 and
>> let the userland part of the hypervisor handle it when userland
>> supports the CPU Topology facility.
>>
>> Signed-off-by: Pierre Morel <[email protected]>
>> Reviewed-by: Nico Boehr <[email protected]>
>
> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>

Thanks.


> See nit below.
>> ---
>> arch/s390/include/asm/kvm_host.h | 18 +++++++++++++++---
>> arch/s390/kvm/kvm-s390.c | 31 +++++++++++++++++++++++++++++++
>> arch/s390/kvm/priv.c | 22 ++++++++++++++++++----
>> arch/s390/kvm/vsie.c | 8 ++++++++
>> 4 files changed, 72 insertions(+), 7 deletions(-)
>>
>
> [...]
>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 8fcb56141689..70436bfff53a 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1691,6 +1691,32 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
>> return ret;
>> }
>>
>> +/**
>> + * kvm_s390_update_topology_change_report - update CPU topology change report
>> + * @kvm: guest KVM description
>> + * @val: set or clear the MTCR bit
>> + *
>> + * Updates the Multiprocessor Topology-Change-Report bit to signal
>> + * the guest with a topology change.
>> + * This is only relevant if the topology facility is present.
>> + *
>> + * The SCA version, bsca or esca, doesn't matter as offset is the same.
>> + */
>> +static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>> +{
>> + union sca_utility new, old;
>> + struct bsca_block *sca;
>> +
>> + read_lock(&kvm->arch.sca_lock);
>> + do {
>> + sca = kvm->arch.sca;
>
> I find this assignment being in the loop unintuitive, but it should not make a difference.

The price would be an ugly cast.


>
>> + old = READ_ONCE(sca->utility);
>> + new = old;
>> + new.mtcr = val;
>> + } while (cmpxchg(&sca->utility.val, old.val, new.val) != old.val);
>> + read_unlock(&kvm->arch.sca_lock);
>> +}
>> +
> [...]
>


--
Pierre Morel
IBM Lab Boeblingen

2022-07-12 09:10:12

by Janis Schoetterl-Glausch

Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report

On 7/12/22 09:24, Pierre Morel wrote:
>
>
> On 7/11/22 15:22, Janis Schoetterl-Glausch wrote:
>> On 7/11/22 10:41, Pierre Morel wrote:
>>> During a subsystem reset the Topology-Change-Report is cleared.
>>>
>>> Let's give userland the possibility to clear the MTCR in the case
>>> of a subsystem reset.
>>>
>>> To migrate the MTCR, we give userland the possibility to
>>> query the MTCR state.
>>>
>>> We indicate KVM support for the CPU topology facility with a new
>>> KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> Signed-off-by: Pierre Morel <[email protected]>
>>
>> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
>>
>
> Thanks!
>
>> See nits/comments below.
>>
>>> ---
>>>   Documentation/virt/kvm/api.rst   | 25 ++++++++++++++
>>>   arch/s390/include/uapi/asm/kvm.h |  1 +
>>>   arch/s390/kvm/kvm-s390.c         | 56 ++++++++++++++++++++++++++++++++
>>>   include/uapi/linux/kvm.h         |  1 +
>>>   4 files changed, 83 insertions(+)
>>>
>>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>>> index 11e00a46c610..5e086125d8ad 100644
>>> --- a/Documentation/virt/kvm/api.rst
>>> +++ b/Documentation/virt/kvm/api.rst
>>> @@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>>>   When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
>>>   type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
>>>
>>> +8.37 KVM_CAP_S390_CPU_TOPOLOGY
>>> +------------------------------
>>> +
>>> +:Capability: KVM_CAP_S390_CPU_TOPOLOGY
>>> +:Architectures: s390
>>> +:Type: vm
>>> +
>>> +This capability indicates that KVM will provide the S390 CPU Topology
>>> +facility which consist of the interpretation of the PTF instruction for
>>> +the function code 2 along with interception and forwarding of both the
>>> +PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
>>
>> Is the architecture allowed to extend STSI without a facility?
>> If so, if we say here that STSI 15.1.x is passed to user space, then
>> I think we should have a
>>
>> if (sel1 != 1)
>>     goto out_no_data;
>>
>> or maybe even
>>
>> if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>>     goto out_no_data;
>>
>> in priv.c
>
> I am not a big fan of doing everything in the kernel.
> Here we have no performance issue since it is an error of the guest if it sends a wrong selector.
>
I agree, but I didn't suggest it for performance reasons.
I was thinking about future-proofing, that is, the case where the architecture is extended.
We don't know whether future extensions are best handled in the kernel or in user space,
so if we prevent them from going to user space now, we can defer the decision until we know more.
But that's only relevant if STSI can be extended without a capability, which is why I asked about that.

> Even testing the facility or PV in the kernel is for my opinion arguable in the case we do not do any treatment in the kernel.
>
> I do not see what it brings to us, it increase the LOCs and makes the implementation less easy to evolve.
>
>
>>
>>> +instruction to the userland hypervisor.
>>> +
>>> +The stfle facility 11, CPU Topology facility, should not be indicated
>>> +to the guest without this capability.
>>> +
>>> +When this capability is present, KVM provides a new attribute group
>>> +on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
>>> +This new attribute allows to get, set or clear the Modified Change
>>
>> get or set, now that there is no explicit clear anymore.
>
> Yes now it is a set to 0 but the action of clearing remains.
>
>>
>>> +Topology Report (MTCR) bit of the SCA through the kvm_device_attr
>>> +structure.
>>> +
>>> +When getting the Modified Change Topology Report value, the attr->addr
>>
>> When getting/setting the...
>>
>>> +must point to a byte where the value will be stored.
>>
>> ... will be stored/retrieved from.
>
> OK

Wait no, I didn't get how that works. You're passing the value via attr->attr, not reading it from addr.
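
To illustrate the asymmetry pointed out here, a minimal userspace model of the two paths follows. It is only a sketch: struct attr_model, mtcr_model and both function names are hypothetical stand-ins for struct kvm_device_attr, the MTCR bit in the SCA and the kernel handlers, and memcpy() stands in for copy_to_user(). The set path takes the value from attr->attr, the get path writes one byte through attr->addr.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the two kvm_device_attr fields that matter here. */
struct attr_model {
	uint64_t attr;	/* set path: the new MTCR value travels in this field */
	uint64_t addr;	/* get path: address of a one-byte buffer */
};

/* Hypothetical stand-in for the MTCR bit kept in the SCA. */
static uint8_t mtcr_model;

/* Set: the value comes from attr->attr and is normalised to 0 or 1. */
static int set_topology_model(const struct attr_model *attr)
{
	mtcr_model = !!attr->attr;
	return 0;
}

/* Get: exactly one byte is written to the buffer attr->addr points at. */
static int get_topology_model(const struct attr_model *attr)
{
	uint8_t topo = mtcr_model;

	/* memcpy() plays the role of copy_to_user() in this model. */
	memcpy((void *)(uintptr_t)attr->addr, &topo, sizeof(topo));
	return 0;
}
```

With this model, the documentation sentence about attr->addr only applies to the get path; a set never reads the buffer.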
>
>
>>> +
>>>   9. Known KVM API problems
>>>   =========================
>>>
>>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>>> index 7a6b14874d65..a73cf01a1606 100644
>>> --- a/arch/s390/include/uapi/asm/kvm.h
>>> +++ b/arch/s390/include/uapi/asm/kvm.h
>>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>>   #define KVM_S390_VM_CRYPTO        2
>>>   #define KVM_S390_VM_CPU_MODEL        3
>>>   #define KVM_S390_VM_MIGRATION        4
>>> +#define KVM_S390_VM_CPU_TOPOLOGY    5
>>>
>>>   /* kvm attributes for mem_ctrl */
>>>   #define KVM_S390_VM_MEM_ENABLE_CMMA    0
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 70436bfff53a..b18e0b940b26 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>       case KVM_CAP_S390_PROTECTED:
>>>           r = is_prot_virt_host();
>>>           break;
>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>> +        r = test_facility(11);
>>> +        break;
>>>       default:
>>>           r = 0;
>>>       }
>>> @@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>           icpt_operexc_on_all_vcpus(kvm);
>>>           r = 0;
>>>           break;
>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>> +        r = -EINVAL;
>>> +        mutex_lock(&kvm->lock);
>>> +        if (kvm->created_vcpus) {
>>> +            r = -EBUSY;
>>> +        } else if (test_facility(11)) {
>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>> +            r = 0;
>>> +        }
>>> +        mutex_unlock(&kvm->lock);
>>> +        VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
>>> +             r ? "(not available)" : "(success)");
>>> +        break;
>>>       default:
>>>           r = -EINVAL;
>>>           break;
>>> @@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>>       read_unlock(&kvm->arch.sca_lock);
>>>   }
>>>
>>> +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>
>> kvm_s390_set_topology_changed maybe?
>> kvm_s390_get_topology_changed below then.
>

I won't insist on it, but I do think it's more readable.

> No strong opinion, if you prefer I change this.
>
>>
>>> +{
>>> +    if (!test_kvm_facility(kvm, 11))
>>> +        return -ENXIO;
>>> +
>>> +    kvm_s390_update_topology_change_report(kvm, !!attr->attr);
>>> +    return 0;
>>> +}
>>> +
>>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>> +{
>>> +    union sca_utility utility;
>>> +    struct bsca_block *sca;
>>> +    __u8 topo;
>>> +
>>> +    if (!test_kvm_facility(kvm, 11))
>>> +        return -ENXIO;
>>> +
>>> +    read_lock(&kvm->arch.sca_lock);
>>> +    sca = kvm->arch.sca;
>>> +    utility.val = READ_ONCE(sca->utility.val);
>>
>> I don't think you need the READ_ONCE anymore, now that there is a lock it should act as a compile barrier.
>
> I think you are right.
>
>>> +    read_unlock(&kvm->arch.sca_lock);
>>> +    topo = utility.mtcr;
>>> +
>>> +    if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
>>
>> Why void not u8?
>
> I like to say we write on "topo" with the size of "topo".
> So we do not need to verify the effective size of topo.
> But I understand, it is a UAPI, setting u8 in the copy_to_user makes sense too.
> For my personal opinion, I would have prefer that userland tell us the size it awaits even here, for this special case, since we use a byte, we can not do really wrong.

You're right, it doesn't make a difference.
What about doing put_user(topo, (u8 *)attr->addr)? That seems more straightforward.
>
>>
>>> +        return -EFAULT;
>>> +
>>> +    return 0;
>>> +}
>>> +
>> [...]
>>
>

2022-07-12 09:23:18

by Janis Schoetterl-Glausch

Subject: Re: [PATCH v12 2/3] KVM: s390: guest support for topology function

On 7/12/22 09:45, Pierre Morel wrote:
>
>
> On 7/11/22 14:30, Janis Schoetterl-Glausch wrote:
>> On 7/11/22 10:41, Pierre Morel wrote:
>>> We report a topology change to the guest for any CPU hotplug.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry in the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> On every vCPU creation we set the MCTR bit to let the guest know the
>>> next time it uses the PTF with command 2 instruction that the
>>> topology changed and that it should use the STSI(15.1.x) instruction
>>> to get the topology details.
>>>
>>> STSI(15.1.x) gives information on the CPU configuration topology.
>>> Let's accept the interception of STSI with the function code 15 and
>>> let the userland part of the hypervisor handle it when userland
>>> supports the CPU Topology facility.
>>>
>>> Signed-off-by: Pierre Morel <[email protected]>
>>> Reviewed-by: Nico Boehr <[email protected]>
>>
>> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
>
> Thanks.
>
>
>> See nit below.
>>> ---
>>>   arch/s390/include/asm/kvm_host.h | 18 +++++++++++++++---
>>>   arch/s390/kvm/kvm-s390.c         | 31 +++++++++++++++++++++++++++++++
>>>   arch/s390/kvm/priv.c             | 22 ++++++++++++++++++----
>>>   arch/s390/kvm/vsie.c             |  8 ++++++++
>>>   4 files changed, 72 insertions(+), 7 deletions(-)
>>>
>>
>> [...]
>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 8fcb56141689..70436bfff53a 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -1691,6 +1691,32 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
>>>       return ret;
>>>   }
>>>
>>> +/**
>>> + * kvm_s390_update_topology_change_report - update CPU topology change report
>>> + * @kvm: guest KVM description
>>> + * @val: set or clear the MTCR bit
>>> + *
>>> + * Updates the Multiprocessor Topology-Change-Report bit to signal
>>> + * the guest with a topology change.
>>> + * This is only relevant if the topology facility is present.
>>> + *
>>> + * The SCA version, bsca or esca, doesn't matter as offset is the same.
>>> + */
>>> +static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>> +{
>>> +    union sca_utility new, old;
>>> +    struct bsca_block *sca;
>>> +
>>> +    read_lock(&kvm->arch.sca_lock);
>>> +    do {
>>> +        sca = kvm->arch.sca;
>>
>> I find this assignment being in the loop unintuitive, but it should not make a difference.
>
> The price would be an ugly cast.

I don't get what you mean. Nothing about the types changes if you move it before the loop.
>
>
>>
>>> +        old = READ_ONCE(sca->utility);
>>> +        new = old;
>>> +        new.mtcr = val;
>>> +    } while (cmpxchg(&sca->utility.val, old.val, new.val) != old.val);
>>> +    read_unlock(&kvm->arch.sca_lock);
>>> +}
>>> +
>> [...]
>>
>
>

2022-07-12 09:54:34

by Pierre Morel

Subject: Re: [PATCH v12 2/3] KVM: s390: guest support for topology function



On 7/12/22 10:50, Janis Schoetterl-Glausch wrote:
> On 7/12/22 09:45, Pierre Morel wrote:
>>
>>
>> On 7/11/22 14:30, Janis Schoetterl-Glausch wrote:
>>> On 7/11/22 10:41, Pierre Morel wrote:
>>>> We report a topology change to the guest for any CPU hotplug.
>>>>
>>>> The reporting to the guest is done using the Multiprocessor
>>>> Topology-Change-Report (MTCR) bit of the utility entry in the guest's
>>>> SCA which will be cleared during the interpretation of PTF.
>>>>
>>>> On every vCPU creation we set the MCTR bit to let the guest know the
>>>> next time it uses the PTF with command 2 instruction that the
>>>> topology changed and that it should use the STSI(15.1.x) instruction
>>>> to get the topology details.
>>>>
>>>> STSI(15.1.x) gives information on the CPU configuration topology.
>>>> Let's accept the interception of STSI with the function code 15 and
>>>> let the userland part of the hypervisor handle it when userland
>>>> supports the CPU Topology facility.
>>>>
>>>> Signed-off-by: Pierre Morel <[email protected]>
>>>> Reviewed-by: Nico Boehr <[email protected]>
>>>
>>> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
>>
>> Thanks.
>>
>>
>>> See nit below.
>>>> ---
>>>>   arch/s390/include/asm/kvm_host.h | 18 +++++++++++++++---
>>>>   arch/s390/kvm/kvm-s390.c         | 31 +++++++++++++++++++++++++++++++
>>>>   arch/s390/kvm/priv.c             | 22 ++++++++++++++++++----
>>>>   arch/s390/kvm/vsie.c             |  8 ++++++++
>>>>   4 files changed, 72 insertions(+), 7 deletions(-)
>>>>
>>>
>>> [...]
>>>
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index 8fcb56141689..70436bfff53a 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -1691,6 +1691,32 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
>>>>       return ret;
>>>>   }
>>>>
>>>> +/**
>>>> + * kvm_s390_update_topology_change_report - update CPU topology change report
>>>> + * @kvm: guest KVM description
>>>> + * @val: set or clear the MTCR bit
>>>> + *
>>>> + * Updates the Multiprocessor Topology-Change-Report bit to signal
>>>> + * the guest with a topology change.
>>>> + * This is only relevant if the topology facility is present.
>>>> + *
>>>> + * The SCA version, bsca or esca, doesn't matter as offset is the same.
>>>> + */
>>>> +static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>>> +{
>>>> +    union sca_utility new, old;
>>>> +    struct bsca_block *sca;
>>>> +
>>>> +    read_lock(&kvm->arch.sca_lock);
>>>> +    do {
>>>> +        sca = kvm->arch.sca;
>>>
>>> I find this assignment being in the loop unintuitive, but it should not make a difference.
>>
>> The price would be an ugly cast.
>
> I don't get what you mean. Nothing about the types changes if you move it before the loop.

Yes, right, I misunderstood.
It is better before the loop.
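
For illustration, the loop shape agreed on here can be modelled in userspace with C11 atomics. This is a sketch under assumptions, not the kernel code: struct sca_model, MTCR_BIT (MTCR taken as the top bit of a 16-bit utility word) and the function name are hypothetical, and atomic_compare_exchange_weak() replaces the kernel's cmpxchg(). The point is only that the pointer is fixed once, before the loop, while the value is re-read on every retry.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: MTCR modelled as the top bit of a 16-bit word. */
#define MTCR_BIT ((uint16_t)0x8000)

/* Hypothetical stand-in for the SCA block; only the utility word is kept. */
struct sca_model {
	_Atomic uint16_t utility;
};

/* Set or clear the MTCR bit without disturbing the other utility bits. */
static void update_topology_change_report(struct sca_model *sca, bool val)
{
	/* The sca pointer is already fixed here; only the value is retried. */
	uint16_t old = atomic_load(&sca->utility);
	uint16_t new;

	do {
		new = val ? (old | MTCR_BIT) : (old & (uint16_t)~MTCR_BIT);
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&sca->utility, &old, new));
}
```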

>>
>>
>>>
>>>> +        old = READ_ONCE(sca->utility);
>>>> +        new = old;
>>>> +        new.mtcr = val;
>>>> +    } while (cmpxchg(&sca->utility.val, old.val, new.val) != old.val);
>>>> +    read_unlock(&kvm->arch.sca_lock);
>>>> +}
>>>> +
>>> [...]
>>>
>>
>>
>

--
Pierre Morel
IBM Lab Boeblingen

2022-07-12 11:16:18

by Pierre Morel

Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report



On 7/12/22 10:47, Janis Schoetterl-Glausch wrote:
> On 7/12/22 09:24, Pierre Morel wrote:
>>
>>
>> On 7/11/22 15:22, Janis Schoetterl-Glausch wrote:
>>> On 7/11/22 10:41, Pierre Morel wrote:
>>>> During a subsystem reset the Topology-Change-Report is cleared.
>>>>
>>>> Let's give userland the possibility to clear the MTCR in the case
>>>> of a subsystem reset.
>>>>
>>>> To migrate the MTCR, we give userland the possibility to
>>>> query the MTCR state.
>>>>
>>>> We indicate KVM support for the CPU topology facility with a new
>>>> KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>>
>>>> Signed-off-by: Pierre Morel <[email protected]>
>>>
>>> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
>>>
>>
>> Thanks!
>>
>>> See nits/comments below.
>>>
>>>> ---
>>>>   Documentation/virt/kvm/api.rst   | 25 ++++++++++++++
>>>>   arch/s390/include/uapi/asm/kvm.h |  1 +
>>>>   arch/s390/kvm/kvm-s390.c         | 56 ++++++++++++++++++++++++++++++++
>>>>   include/uapi/linux/kvm.h         |  1 +
>>>>   4 files changed, 83 insertions(+)
>>>>
>>>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>>>> index 11e00a46c610..5e086125d8ad 100644
>>>> --- a/Documentation/virt/kvm/api.rst
>>>> +++ b/Documentation/virt/kvm/api.rst
>>>> @@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>>>>   When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
>>>>   type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
>>>>
>>>> +8.37 KVM_CAP_S390_CPU_TOPOLOGY
>>>> +------------------------------
>>>> +
>>>> +:Capability: KVM_CAP_S390_CPU_TOPOLOGY
>>>> +:Architectures: s390
>>>> +:Type: vm
>>>> +
>>>> +This capability indicates that KVM will provide the S390 CPU Topology
>>>> +facility which consist of the interpretation of the PTF instruction for
>>>> +the function code 2 along with interception and forwarding of both the
>>>> +PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
>>>
>>> Is the architecture allowed to extend STSI without a facility?
>>> If so, if we say here that STSI 15.1.x is passed to user space, then
>>> I think we should have a
>>>
>>> if (sel1 != 1)
>>>     goto out_no_data;
>>>
>>> or maybe even
>>>
>>> if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>>>     goto out_no_data;
>>>
>>> in priv.c
>>
>> I am not a big fan of doing everything in the kernel.
>> Here we have no performance issue since it is an error of the guest if it sends a wrong selector.
>>
> I agree, but I didn't suggest it for performance reasons.

Yes, and that is why I do not agree ;)

> I was thinking about future proofing, that is if the architecture is extended.
> We don't know if future extensions are best handled in the kernel or user space,
> so if we prevent it from going to user space, we can defer the decision to when we know more.

If future extensions are better handled in the kernel, we will handle them
in the kernel, obviously; in this case we will need a patch.

If they are not better handled in the kernel, we will handle the extensions
in userland and will not need a kernel patch, which makes updating the
virtual architecture easier and faster.

If we prohibit the extensions in the kernel, we will need a kernel patch in
both cases, and a userland patch if it is not completely handled in the kernel.

In userland we check for any wrong selector before the instruction goes back
to the guest.
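
Wherever the check ends up living, the selector range under discussion amounts to the following. This is a minimal sketch: the function name and parameter types are assumptions for illustration, and the bounds are simply those of the priv.c check suggested earlier in the thread (sel1 == 1, sel2 in 2..6), expressed as the accepting condition.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if STSI 15.sel1.sel2 is inside the range
 * from the suggested check, i.e. the negation of
 * (sel1 != 1 || sel2 < 2 || sel2 > 6).
 */
static bool stsi_15_selectors_valid(uint8_t sel1, uint16_t sel2)
{
	return sel1 == 1 && sel2 >= 2 && sel2 <= 6;
}
```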

> But that's only relevant if STSI can be extended without a capability, which is why I asked about that.

Logically, any change or extension in the architecture should be signaled
by a facility bit or something similar.

>
>> Even testing the facility or PV in the kernel is for my opinion arguable in the case we do not do any treatment in the kernel.
>>
>> I do not see what it brings to us, it increase the LOCs and makes the implementation less easy to evolve.
>>
>>
>>>
>>>> +instruction to the userland hypervisor.
>>>> +
>>>> +The stfle facility 11, CPU Topology facility, should not be indicated
>>>> +to the guest without this capability.
>>>> +
>>>> +When this capability is present, KVM provides a new attribute group
>>>> +on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
>>>> +This new attribute allows to get, set or clear the Modified Change
>>>
>>> get or set, now that there is no explicit clear anymore.
>>
>> Yes now it is a set to 0 but the action of clearing remains.
>>
>>>
>>>> +Topology Report (MTCR) bit of the SCA through the kvm_device_attr
>>>> +structure.
>>>> +
>>>> +When getting the Modified Change Topology Report value, the attr->addr
>>>
>>> When getting/setting the...
>>>
>>>> +must point to a byte where the value will be stored.
>>>
>>> ... will be stored/retrieved from.
>>
>> OK
>
> Wait no, I didn't get how that works. You're passing the value via attr->attr, not reading it from addr.

:) OK

>>
>>
>>>> +
>>>>   9. Known KVM API problems
>>>>   =========================
>>>>
>>>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>>>> index 7a6b14874d65..a73cf01a1606 100644
>>>> --- a/arch/s390/include/uapi/asm/kvm.h
>>>> +++ b/arch/s390/include/uapi/asm/kvm.h
>>>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>>>   #define KVM_S390_VM_CRYPTO        2
>>>>   #define KVM_S390_VM_CPU_MODEL        3
>>>>   #define KVM_S390_VM_MIGRATION        4
>>>> +#define KVM_S390_VM_CPU_TOPOLOGY    5
>>>>     /* kvm attributes for mem_ctrl */
>>>>   #define KVM_S390_VM_MEM_ENABLE_CMMA    0
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index 70436bfff53a..b18e0b940b26 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>>       case KVM_CAP_S390_PROTECTED:
>>>>           r = is_prot_virt_host();
>>>>           break;
>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>> +        r = test_facility(11);
>>>> +        break;
>>>>       default:
>>>>           r = 0;
>>>>       }
>>>> @@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>>           icpt_operexc_on_all_vcpus(kvm);
>>>>           r = 0;
>>>>           break;
>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>> +        r = -EINVAL;
>>>> +        mutex_lock(&kvm->lock);
>>>> +        if (kvm->created_vcpus) {
>>>> +            r = -EBUSY;
>>>> +        } else if (test_facility(11)) {
>>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>> +            r = 0;
>>>> +        }
>>>> +        mutex_unlock(&kvm->lock);
>>>> +        VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
>>>> +             r ? "(not available)" : "(success)");
>>>> +        break;
>>>>       default:
>>>>           r = -EINVAL;
>>>>           break;
>>>> @@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>>>       read_unlock(&kvm->arch.sca_lock);
>>>>   }
>>>>   +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>>
>>> kvm_s390_set_topology_changed maybe?
>>> kvm_s390_get_topology_changed below then.
>>
>
> I won't insist on it, but I do think it's more readable.

OK, I can change it

>
>> No strong opinion, if you prefer I change this.
>>
>>>
>>>> +{
>>>> +    if (!test_kvm_facility(kvm, 11))
>>>> +        return -ENXIO;
>>>> +
>>>> +    kvm_s390_update_topology_change_report(kvm, !!attr->attr);
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>>> +{
>>>> +    union sca_utility utility;
>>>> +    struct bsca_block *sca;
>>>> +    __u8 topo;
>>>> +
>>>> +    if (!test_kvm_facility(kvm, 11))
>>>> +        return -ENXIO;
>>>> +
>>>> +    read_lock(&kvm->arch.sca_lock);
>>>> +    sca = kvm->arch.sca;
>>>> +    utility.val = READ_ONCE(sca->utility.val);
>>>
>>> I don't think you need the READ_ONCE anymore; now that there is a lock, it should act as a compiler barrier.
>>
>> I think you are right.
>>
>>>> +    read_unlock(&kvm->arch.sca_lock);
>>>> +    topo = utility.mtcr;
>>>> +
>>>> +    if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
>>>
>>> Why void not u8?
>>
>> I like to say we write on "topo" with the size of "topo".
>> So we do not need to verify the effective size of topo.
>> But I understand, it is a UAPI, setting u8 in the copy_to_user makes sense too.
>> In my personal opinion, I would have preferred that userland tell us the size it expects even here; in this special case, since we use a byte, we cannot really go wrong.
> You're right, it doesn't make a difference.
> What about doing put_user(topo, (u8 *)attr->addr)? It seems more straightforward.

OK

>>
>>>
>>>> +        return -EFAULT;
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>> [...]
>>>
>>
>

--
Pierre Morel
IBM Lab Boeblingen

2022-07-13 09:04:28

by Janosch Frank

Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report

On 7/12/22 13:17, Pierre Morel wrote:
>
>
> On 7/12/22 10:47, Janis Schoetterl-Glausch wrote:
>> On 7/12/22 09:24, Pierre Morel wrote:
>>>
>>>
>>> On 7/11/22 15:22, Janis Schoetterl-Glausch wrote:
>>>> On 7/11/22 10:41, Pierre Morel wrote:
>>>>> During a subsystem reset the Topology-Change-Report is cleared.
>>>>>
>>>>> Let's give userland the possibility to clear the MTCR in the case
>>>>> of a subsystem reset.
>>>>>
>>>>> To migrate the MTCR, we give userland the possibility to
>>>>> query the MTCR state.
>>>>>
>>>>> We indicate KVM support for the CPU topology facility with a new
>>>>> KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>
>>>>> Signed-off-by: Pierre Morel <[email protected]>
>>>>
>>>> Reviewed-by: Janis Schoetterl-Glausch <[email protected]>
>>>>
>>>
>>> Thanks!
>>>
>>>> See nits/comments below.
>>>>
>>>>> ---
>>>>>   Documentation/virt/kvm/api.rst   | 25 ++++++++++++++
>>>>>   arch/s390/include/uapi/asm/kvm.h |  1 +
>>>>>   arch/s390/kvm/kvm-s390.c         | 56 ++++++++++++++++++++++++++++++++
>>>>>   include/uapi/linux/kvm.h         |  1 +
>>>>>   4 files changed, 83 insertions(+)
>>>>>
>>>>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>>>>> index 11e00a46c610..5e086125d8ad 100644
>>>>> --- a/Documentation/virt/kvm/api.rst
>>>>> +++ b/Documentation/virt/kvm/api.rst
>>>>> @@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>>>>>   When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
>>>>>   type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
>>>>>   +8.37 KVM_CAP_S390_CPU_TOPOLOGY
>>>>> +------------------------------
>>>>> +
>>>>> +:Capability: KVM_CAP_S390_CPU_TOPOLOGY
>>>>> +:Architectures: s390
>>>>> +:Type: vm
>>>>> +
>>>>> +This capability indicates that KVM will provide the S390 CPU Topology
>>>>> +facility which consist of the interpretation of the PTF instruction for
>>>>> +the function code 2 along with interception and forwarding of both the
>>>>> +PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
>>>>
>>>> Is the architecture allowed to extend STSI without a facility?
>>>> If so, if we say here that STSI 15.1.x is passed to user space, then
>>>> I think we should have a
>>>>
>>>> if (sel1 != 1)
>>>>     goto out_no_data;
>>>>
>>>> or maybe even
>>>>
>>>> if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>>>>     goto out_no_data;
>>>>
>>>> in priv.c
>>>
>>> I am not a big fan of doing everything in the kernel.
>>> Here we have no performance issue since it is an error of the guest if it sends a wrong selector.
>>>
>> I agree, but I didn't suggest it for performance reasons.
>
> Yes, and that is why I do not agree ;)
>
>> I was thinking about future proofing, that is if the architecture is extended.
>> We don't know if future extensions are best handled in the kernel or user space,
>> so if we prevent it from going to user space, we can defer the decision to when we know more.
>
> If future extensions are better handled in the kernel we will handle them in
> the kernel, obviously; in this case we will need a patch.
>
> If they are not better handled in the kernel we will handle the extensions in
> userland and we will not need a kernel patch, making the update of the
> virtual architecture easier and faster.
>
> If we prohibit the extensions in the kernel we will need a kernel patch in
> both cases and a userland patch if it is not completely handled in the kernel.
>
> In userland we check for any wrong selector before the instruction goes back
> to the guest.

I opt for passing the lower selectors down for QEMU to handle.

>
>> But that's only relevant if STSI can be extended without a capability, which is why I asked about that.
>
> Logically, any change or extension in the architecture should be signaled
> by a facility bit or something.
>
>>
>>> Even testing the facility or PV in the kernel is, in my opinion, arguable in the case we do not do any treatment in the kernel.

That's actually a good point.

New instruction interceptions for PV will need to be enabled by KVM via
a switch somewhere, since the UV can't rely on KVM handling them
correctly without an enablement.


So please remove the PV check.

>>>
>>> I do not see what it brings us; it increases the LOC and makes the implementation less easy to evolve.
>>>
>>>
>>>>
>>>>> +instruction to the userland hypervisor.
>>>>> +
>>>>> +The stfle facility 11, CPU Topology facility, should not be indicated
>>>>> +to the guest without this capability.
>>>>> +
>>>>> +When this capability is present, KVM provides a new attribute group
>>>>> +on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
>>>>> +This new attribute allows to get, set or clear the Modified Change
>>>>
>>>> get or set, now that there is no explicit clear anymore.
>>>
>>> Yes now it is a set to 0 but the action of clearing remains.

Yes

>>>
>>>>
>>>>> +Topology Report (MTCR) bit of the SCA through the kvm_device_attr
>>>>> +structure.
>>>>> +
>>>>> +When getting the Modified Change Topology Report value, the attr->addr
>>>>
>>>> When getting/setting the...
>>>>
>>>>> +must point to a byte where the value will be stored.
>>>>
>>>> ... will be stored/retrieved from.
>>>
>>> OK
>>
>> Wait no, I didn't get how that works. You're passing the value via attr->attr, not reading it from addr.
>
> :) OK
>
>>>
>>>
>>>>> +
>>>>>   9. Known KVM API problems
>>>>>   =========================
>>>>>   diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>>>>> index 7a6b14874d65..a73cf01a1606 100644
>>>>> --- a/arch/s390/include/uapi/asm/kvm.h
>>>>> +++ b/arch/s390/include/uapi/asm/kvm.h
>>>>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>>>>   #define KVM_S390_VM_CRYPTO        2
>>>>>   #define KVM_S390_VM_CPU_MODEL        3
>>>>>   #define KVM_S390_VM_MIGRATION        4
>>>>> +#define KVM_S390_VM_CPU_TOPOLOGY    5
>>>>>     /* kvm attributes for mem_ctrl */
>>>>>   #define KVM_S390_VM_MEM_ENABLE_CMMA    0
>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>> index 70436bfff53a..b18e0b940b26 100644
>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>> @@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>>>       case KVM_CAP_S390_PROTECTED:
>>>>>           r = is_prot_virt_host();
>>>>>           break;
>>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>> +        r = test_facility(11);
>>>>> +        break;
>>>>>       default:
>>>>>           r = 0;
>>>>>       }
>>>>> @@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>>>           icpt_operexc_on_all_vcpus(kvm);
>>>>>           r = 0;
>>>>>           break;
>>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>> +        r = -EINVAL;
>>>>> +        mutex_lock(&kvm->lock);
>>>>> +        if (kvm->created_vcpus) {
>>>>> +            r = -EBUSY;
>>>>> +        } else if (test_facility(11)) {
>>>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>> +            r = 0;
>>>>> +        }
>>>>> +        mutex_unlock(&kvm->lock);
>>>>> +        VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
>>>>> +             r ? "(not available)" : "(success)");
>>>>> +        break;
>>>>>       default:
>>>>>           r = -EINVAL;
>>>>>           break;
>>>>> @@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>>>>       read_unlock(&kvm->arch.sca_lock);
>>>>>   }
>>>>>   +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>>>
>>>> kvm_s390_set_topology_changed maybe?
>>>> kvm_s390_get_topology_changed below then.

kvm_s390_set_topology_change_indication

It's long but it's rarely used.
Maybe shorten "topology" to "topo".

[..]
>>>> I don't think you need the READ_ONCE anymore; now that there is a lock, it should act as a compiler barrier.
>>>
>>> I think you are right.
>>>
>>>>> +    read_unlock(&kvm->arch.sca_lock);
>>>>> +    topo = utility.mtcr;
>>>>> +
>>>>> +    if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
>>>>
>>>> Why void not u8?
>>>
>>> I like to say we write on "topo" with the size of "topo".
>>> So we do not need to verify the effective size of topo.
>>> But I understand, it is a UAPI, setting u8 in the copy_to_user makes sense too.
>>> In my personal opinion, I would have preferred that userland tell us the size it expects even here; in this special case, since we use a byte, we cannot really go wrong.
>> You're right, it doesn't make a difference.
>> What about doing put_user(topo, (u8 *)attr->addr)? It seems more straightforward.
>
> OK

(u8 __user *)

Always take the explicit route if possible.

>
>>>
>>>>
>>>>> +        return -EFAULT;
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>> [...]
>>>>
>>>
>>
>

2022-07-13 09:13:33

by Janosch Frank

Subject: Re: [PATCH v12 2/3] KVM: s390: guest support for topology function

[...]
>>>>>   +/**
>>>>> + * kvm_s390_update_topology_change_report - update CPU topology change report
>>>>> + * @kvm: guest KVM description
>>>>> + * @val: set or clear the MTCR bit
>>>>> + *
>>>>> + * Updates the Multiprocessor Topology-Change-Report bit to signal
>>>>> + * the guest with a topology change.
>>>>> + * This is only relevant if the topology facility is present.
>>>>> + *
>>>>> + * The SCA version, bsca or esca, doesn't matter as offset is the same.
>>>>> + */
>>>>> +static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>>>> +{
>>>>> +    union sca_utility new, old;
>>>>> +    struct bsca_block *sca;
>>>>> +
>>>>> +    read_lock(&kvm->arch.sca_lock);
>>>>> +    do {
>>>>> +        sca = kvm->arch.sca;
>>>>
>>>> I find this assignment being in the loop unintuitive, but it should not make a difference.
>>>
>>> The price would be an ugly cast.
>>
>> I don't get what you mean. Nothing about the types changes if you move it before the loop.
>
> Yes, right, I misunderstood.
> It is better before the loop.
With the assignment moved one line up:
Reviewed-by: Janosch Frank <[email protected]>
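
For readers following along, the loop with the pointer load hoisted out
might look like the userspace sketch below, which uses C11 atomics in place
of the kernel's cmpxchg(). The mask value and the function name are
assumptions made for illustration, and the SCA lock is omitted:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of MTCR within the 16-bit utility word. */
#define MTCR_MASK 0x8000u

/* Set or clear the MTCR bit without disturbing the other bits. */
static void update_mtcr(_Atomic uint16_t *utility, bool val)
{
    uint16_t old, next;

    assert(utility);
    old = atomic_load(utility);    /* loaded once, before the loop */
    do {
        next = val ? (uint16_t)(old | MTCR_MASK)
                   : (uint16_t)(old & ~MTCR_MASK);
        /* On failure, old is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(utility, &old, next));
}
```

The retry only repeats the bit computation; the location being updated
stays the same for the whole loop, which is why the assignment can sit
before it.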

>
>>>
>>>
>>>>
>>>>> +        old = READ_ONCE(sca->utility);
>>>>> +        new = old;
>>>>> +        new.mtcr = val;
>>>>> +    } while (cmpxchg(&sca->utility.val, old.val, new.val) != old.val);
>>>>> +    read_unlock(&kvm->arch.sca_lock);
>>>>> +}
>>>>> +
>>>> [...]
>>>>
>>>
>>>
>>
>

2022-07-13 09:20:04

by Pierre Morel

Subject: Re: [PATCH v12 2/3] KVM: s390: guest support for topology function



On 7/13/22 10:34, Janosch Frank wrote:
> [...]
>>>>>>     +/**
>>>>>> + * kvm_s390_update_topology_change_report - update CPU topology
>>>>>> change report
>>>>>> + * @kvm: guest KVM description
>>>>>> + * @val: set or clear the MTCR bit
>>>>>> + *
>>>>>> + * Updates the Multiprocessor Topology-Change-Report bit to signal
>>>>>> + * the guest with a topology change.
>>>>>> + * This is only relevant if the topology facility is present.
>>>>>> + *
>>>>>> + * The SCA version, bsca or esca, doesn't matter as offset is the
>>>>>> same.
>>>>>> + */
>>>>>> +static void kvm_s390_update_topology_change_report(struct kvm
>>>>>> *kvm, bool val)
>>>>>> +{
>>>>>> +    union sca_utility new, old;
>>>>>> +    struct bsca_block *sca;
>>>>>> +
>>>>>> +    read_lock(&kvm->arch.sca_lock);
>>>>>> +    do {
>>>>>> +        sca = kvm->arch.sca;
>>>>>
>>>>> I find this assignment being in the loop unintuitive, but it should
>>>>> not make a difference.
>>>>
>>>> The price would be an ugly cast.
>>>
>>> I don't get what you mean. Nothing about the types changes if you
>>> move it before the loop.
>>
>> Yes, right, I misunderstood.
>> It is better before the loop.
> With the assignment moved one line up:
> Reviewed-by: Janosch Frank <[email protected]>

Thanks

>
>>
>>>>
>>>>
>>>>>
>>>>>> +        old = READ_ONCE(sca->utility);
>>>>>> +        new = old;
>>>>>> +        new.mtcr = val;
>>>>>> +    } while (cmpxchg(&sca->utility.val, old.val, new.val) !=
>>>>>> old.val);
>>>>>> +    read_unlock(&kvm->arch.sca_lock);
>>>>>> +}
>>>>>> +
>>>>> [...]
>>>>>
>>>>
>>>>
>>>
>>
>

--
Pierre Morel
IBM Lab Boeblingen

2022-07-13 09:25:28

by Janosch Frank

Subject: Re: [PATCH v12 0/3] s390x: KVM: CPU Topology

On 7/11/22 10:41, Pierre Morel wrote:
> Hi all,
>
> This new spin suppresses the check for real CPU migration and
> modifies the checking of valid function codes inside the interception
> of the STSI instruction.
>
> The series provides:
> 0- Modification of the ipte lock handling to use KVM instead of the
> vcpu as an argument because the ipte lock works on the SCA, which is unique
> per KVM structure and common to all vCPUs.
> 1- interception of the STSI instruction forwarding the CPU topology
> 2- interpretation of the PTF instruction
> 3- a KVM capability for the userland hypervisor to ask KVM to
> setup PTF interpretation.
> 4- KVM ioctl to get and set the MTCR bit of the SCA in order to
> migrate this bit during a migration.

Please rebase before sending the next version.

2022-07-14 08:37:48

by Pierre Morel

Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report



On 7/13/22 11:01, Janosch Frank wrote:
> On 7/12/22 13:17, Pierre Morel wrote:
>>
>>
>> On 7/12/22 10:47, Janis Schoetterl-Glausch wrote:
>>> On 7/12/22 09:24, Pierre Morel wrote:
>>>>
>>>>

...

>> kernel.
>>
>> In userland we check for any wrong selector before the instruction goes back
>> to the guest.
>
> I opt for passing the lower selectors down for QEMU to handle.

OK

>
>>
>>> But that's only relevant if STSI can be extended without a
>>> capability, which is why I asked about that.
>>
>> Logically, any change or extension in the architecture should be signaled
>> by a facility bit or something.
>>
>>>
>>>> Even testing the facility or PV in the kernel is, in my opinion,
>>>> arguable in the case we do not do any treatment in the kernel.
>
> That's actually a good point.
>
> New instruction interceptions for PV will need to be enabled by KVM via
> a switch somewhere since the UV can't rely on the fact that KVM will
> correctly handle it without an enablement.
>
>
> So please remove the pv check

OK

>

...

>>>>>>     +static int kvm_s390_set_topology(struct kvm *kvm, struct
>>>>>> kvm_device_attr *attr)
>>>>>
>>>>> kvm_s390_set_topology_changed maybe?
>>>>> kvm_s390_get_topology_changed below then.
>
> kvm_s390_set_topology_change_indication
>
> It's long but it's rarely used.
> Maybe shorten topology to "topo"

OK, I will use
kvm_s390_get_topo_change_indication()


Thanks.

Regards,
Pierre

--
Pierre Morel
IBM Lab Boeblingen