2022-04-21 05:27:22

by Pierre Morel

Subject: [PATCH v8 0/2] s390x: KVM: CPU Topology

Hi all,

This new spin adds reset handling and bug fixes to the series
implementing interpretation of the PTF instruction.

The series provides:
1- interception of the STSI instruction, forwarding the CPU topology
2- interpretation of the PTF instruction
3- a KVM capability for the userland hypervisor to ask KVM to
set up PTF interpretation.
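
As a rough illustration of item 3, the checks made when the capability is
enabled (as in patch 1/2 of this series) can be modeled in user space.
This is only a sketch: struct vm_model and enable_cpu_topology() are
hypothetical stand-ins, not kernel interfaces.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the per-VM state that is checked. */
struct vm_model {
	bool host_has_facility_11;	/* host offers the CPU topology facility */
	unsigned int created_vcpus;	/* vCPUs already created for this VM */
	bool guest_facility_11;		/* facility 11 exposed to the guest */
};

/*
 * Mirrors the order of the checks in patch 1/2:
 * -EBUSY once vCPUs exist, -EINVAL without host support, 0 on success.
 */
static int enable_cpu_topology(struct vm_model *vm)
{
	if (vm->created_vcpus)
		return -EBUSY;
	if (!vm->host_has_facility_11)
		return -EINVAL;
	vm->guest_facility_11 = true;
	return 0;
}
```

The point of the ordering is that the capability has to be enabled before
the first vCPU is created, since the facility bit is consulted during vCPU
setup.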


0- Foreword

The S390 CPU topology is reported using two instructions:
- PTF, to find out whether the CPU topology has changed since the last
PTF instruction or subsystem reset.
- STSI, to get the topology information itself: the topology of the
CPUs inside the sockets, of the sockets inside the books, etc.

The PTF(2) instruction reports a change if the STSI(15.1.2) instruction
would report a difference from the last STSI(15.1.2) instruction*.
With SIE interpretation, the PTF(2) instruction reports a change to
the guest if the host has set the SCA.MTCR bit.

*The STSI(15.1.2) instruction reports:
- The addresses of the cores within a socket
- The polarization of the cores
- The CPU type of the cores
- Whether the cores are dedicated or not

We decided to implement the CPU topology for S390 in several steps.
This series reports:

- CPU hotplug
- modification of the CPU mask inside sockets

In future development we will provide:

- handling of shared CPUs
- reporting of the CPU Type
- reporting of the polarization


1- Interception of STSI

To provide topology information to the guest through the STSI
instruction, we forward STSI with Function Code 15 to the
userland hypervisor, which takes care of providing the right
information to the guest.

To let the guest use both the PTF instruction, to check whether a
topology change occurred, and the STSI_15.x.x instruction, we add a new
KVM capability to enable the topology facility.
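
The function-code filtering this implies can be sketched as a small
user-space model. It is a simplification of the checks in handle_stsi()
in patch 1/2; the enum and the stsi_dispatch() helper are hypothetical
names used only for illustration, and the selector validation done for
the other function codes is left out.

```c
#include <stdbool.h>

enum stsi_action {
	STSI_NO_DATA,		/* condition code 3, nothing stored */
	STSI_EXISTING_PATH,	/* function codes 0..3, handled as before */
	STSI_FORWARD_TO_USER,	/* function code 15, handed to userland */
};

/* Decide what happens to a STSI with function code fc. */
static enum stsi_action stsi_dispatch(unsigned int fc, bool has_facility_11)
{
	/* Only function codes 0..3 and 15 are recognized at all. */
	if (fc > 3 && fc != 15)
		return STSI_NO_DATA;

	/* fc 15 is only provided together with the topology facility. */
	if (fc == 15 && !has_facility_11)
		return STSI_NO_DATA;

	if (fc == 15)
		return STSI_FORWARD_TO_USER;

	return STSI_EXISTING_PATH;
}
```

A guest without facility 11 therefore sees the same behavior for fc 15
as before the series.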

2- Interpretation of PTF with FC(2)

The PTF instruction reports a topology change if there is any change
relative to the previous STSI(15.1.2) SYSIB.
Changes inside a STSI(15.1.2) SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.
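
The life cycle of the MTCR bit can be modeled in plain C as follows. This
is a user-space sketch only: struct sca_model and both helpers are
hypothetical, the bit value is the SCA_UTILITY_MTCR define from patch 1/2,
and the real clearing is done by the machine while interpreting PTF, not
by software.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCA_UTILITY_MTCR 0x8000	/* bit value taken from patch 1/2 */

/* Hypothetical model of the utility halfword of the guest's SCA. */
struct sca_model {
	uint16_t utility;
};

/* Host side: flag a topology change for the guest. */
static void host_report_topology_change(struct sca_model *sca)
{
	sca->utility |= SCA_UTILITY_MTCR;
}

/*
 * Guest side, modeling interpreted PTF with function code 2:
 * returns true if a change was pending and clears the report.
 */
static bool guest_ptf_check_change(struct sca_model *sca)
{
	bool changed = sca->utility & SCA_UTILITY_MTCR;

	sca->utility &= (uint16_t)~SCA_UTILITY_MTCR;
	return changed;
}
```

Because the report is consumed by the first PTF, a second PTF without an
intervening host-side change reports no change.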

To check whether the topology has been modified we use a new field of
the arch vCPU, prev_cpu, to save the previous real CPU ID at the end of
a schedule and verify on the next schedule that the CPU used is in the
same socket. This field is initialized to -1 on vCPU creation.
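
The intended prev_cpu bookkeeping can be illustrated with a small
user-space model. Everything here is an assumption made for the sketch:
the fixed CPUS_PER_SOCKET layout, socket_of() and the *_model names are
hypothetical, whereas the kernel code consults the host topology masks.

```c
#include <stdbool.h>

#define TOPOLOGY_NEW_CPU (-1)	/* prev_cpu value of a freshly created vCPU */
#define CPUS_PER_SOCKET 4	/* hypothetical host layout for this sketch */

struct vcpu_model {
	int cpu;	/* host CPU currently backing the vCPU, -1 if none */
	int prev_cpu;	/* host CPU used during the previous schedule */
};

/* Hypothetical mapping: host CPU n lives in socket n / CPUS_PER_SOCKET. */
static int socket_of(int cpu)
{
	return cpu / CPUS_PER_SOCKET;
}

/* True if the guest-visible topology changed since the last schedule. */
static bool topology_changed(const struct vcpu_model *v)
{
	if (v->prev_cpu == TOPOLOGY_NEW_CPU)
		return true;	/* a new vCPU has been hotplugged */
	return socket_of(v->cpu) != socket_of(v->prev_cpu);
}

/* Schedule in: record the backing CPU and report whether to set MTCR. */
static bool vcpu_load(struct vcpu_model *v, int cpu)
{
	v->cpu = cpu;
	return topology_changed(v);
}

/* Schedule out: remember which CPU was backing the vCPU. */
static void vcpu_put(struct vcpu_model *v)
{
	v->prev_cpu = v->cpu;
	v->cpu = -1;
}
```

With this layout, moving from host CPU 1 to host CPU 2 stays inside
socket 0 and reports nothing, while a later move to host CPU 5 crosses
into socket 1 and reports a change.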


Regards,
Pierre


Pierre Morel (2):
s390x: KVM: guest support for topology function
s390x: KVM: resetting the Topology-Change-Report

Documentation/virt/kvm/api.rst | 16 +++
arch/s390/include/asm/kvm_host.h | 12 ++-
arch/s390/include/uapi/asm/kvm.h | 9 ++
arch/s390/kvm/kvm-s390.c | 177 ++++++++++++++++++++++++++++++-
arch/s390/kvm/kvm-s390.h | 25 +++++
arch/s390/kvm/priv.c | 14 ++-
arch/s390/kvm/vsie.c | 3 +
include/uapi/linux/kvm.h | 1 +
8 files changed, 249 insertions(+), 8 deletions(-)

--
2.27.0

Changelog:

from v7 to v8

- implement reset handling
(Janosch)

- change the way to check if the topology changed
(Nico, Heiko)

from v6 to v7

- rebase

from v5 to v6

- make the subject more accurate
(Claudio)

- Change the kvm_s390_set_mtcr() function to have vcpu in the name
(Janosch)

- Replace the checks on ECB_PTF with the check of facility 11
(Janosch)

- modify kvm_arch_vcpu_load, move the check in a function in
the header file
(Janosch)

- No magic number: replace the "new cpu value" of -1 with a define
(Janosch)

- Make the checks for STSI validity clearer
(Janosch)

from v4 to v5

- modify the way KVM_CAP is tested to be OK with vsie
(David)

from v3 to v4

- squash both patches
(David)

- Added Documentation
(David)

- Modified the detection for new vCPUs
(Pierre)

from v2 to v3

- use PTF interpretation
(Christian)

- optimize arch_update_cpu_topology using PTF
(Pierre)

from v1 to v2:

- Add a KVM capability to let QEMU know we support PTF and STSI 15
(David)

- check KVM facility 11 before accepting STSI fc 15
(David)

- handle all we can in userland
(David)

- add tracing to STSI fc 15
(Connie)


2022-04-22 18:55:41

by Pierre Morel

Subject: [PATCH v8 2/2] s390x: KVM: resetting the Topology-Change-Report

During a subsystem reset the Topology-Change-Report is cleared.
Let's give userland the possibility to clear the MTCR in the case
of a subsystem reset.

To migrate the MTCR, let's give userland the possibility to
query the MTCR state.

Signed-off-by: Pierre Morel <[email protected]>
---
arch/s390/include/uapi/asm/kvm.h | 9 +++
arch/s390/kvm/kvm-s390.c | 103 +++++++++++++++++++++++++++++++
2 files changed, 112 insertions(+)

diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 7a6b14874d65..bb3df6d49f27 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
#define KVM_S390_VM_CRYPTO 2
#define KVM_S390_VM_CPU_MODEL 3
#define KVM_S390_VM_MIGRATION 4
+#define KVM_S390_VM_CPU_TOPOLOGY 5

/* kvm attributes for mem_ctrl */
#define KVM_S390_VM_MEM_ENABLE_CMMA 0
@@ -171,6 +172,14 @@ struct kvm_s390_vm_cpu_subfunc {
#define KVM_S390_VM_MIGRATION_START 1
#define KVM_S390_VM_MIGRATION_STATUS 2

+/* kvm attributes for cpu topology */
+#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR 0
+#define KVM_S390_VM_CPU_TOPO_MTR_SET 1
+
+struct kvm_s390_cpu_topology {
+ __u16 mtcr;
+};
+
/* for KVM_GET_REGS and KVM_SET_REGS */
struct kvm_regs {
/* general purpose regs for s390 */
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 925ccc59f283..755f325c9e70 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1756,6 +1756,100 @@ static int kvm_s390_sca_set_mtcr(struct kvm *kvm)
return 0;
}

+/**
+ * kvm_s390_sca_clear_mtcr
+ * @kvm: guest KVM description
+ *
+ * Is only relevant if the topology facility is present,
+ * the caller should check KVM facility 11
+ *
+ * Updates the Multiprocessor Topology-Change-Report to signal
+ * the guest with a topology change.
+ */
+static int kvm_s390_sca_clear_mtcr(struct kvm *kvm)
+{
+ struct bsca_block *sca = kvm->arch.sca;
+ struct kvm_vcpu *vcpu;
+ int val;
+
+ vcpu = kvm_s390_get_first_vcpu(kvm);
+ if (!vcpu)
+ return -ENODEV;
+
+ ipte_lock(vcpu);
+ val = READ_ONCE(sca->utility);
+ WRITE_ONCE(sca->utility, sca->utility & ~SCA_UTILITY_MTCR);
+ ipte_unlock(vcpu);
+
+ return 0;
+}
+
+/**
+ * kvm_s390_sca_get_mtcr
+ * @kvm: guest KVM description
+ *
+ * Is only relevant if the topology facility is present,
+ * the caller should check KVM facility 11
+ *
+ * reports to QEMU the Multiprocessor Topology-Change-Report.
+ */
+static int kvm_s390_sca_get_mtcr(struct kvm *kvm)
+{
+ struct bsca_block *sca = kvm->arch.sca;
+ struct kvm_vcpu *vcpu;
+ int val;
+
+ vcpu = kvm_s390_get_first_vcpu(kvm);
+ if (!vcpu)
+ return -ENODEV;
+
+ ipte_lock(vcpu);
+ val = READ_ONCE(sca->utility);
+ ipte_unlock(vcpu);
+
+ return val;
+}
+
+static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+ int ret = -EFAULT;
+
+ if (!test_kvm_facility(kvm, 11))
+ return -ENXIO;
+
+ switch (attr->attr) {
+ case KVM_S390_VM_CPU_TOPO_MTR_SET:
+ ret = kvm_s390_sca_set_mtcr(kvm);
+ break;
+ case KVM_S390_VM_CPU_TOPO_MTR_CLEAR:
+ ret = kvm_s390_sca_clear_mtcr(kvm);
+ break;
+ }
+
+ return ret;
+}
+
+static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+ struct kvm_s390_cpu_topology *topology;
+ int ret = 0;
+
+ if (!test_kvm_facility(kvm, 11))
+ return -ENXIO;
+
+ topology = kzalloc(sizeof(*topology), GFP_KERNEL);
+ if (!topology)
+ return -ENOMEM;
+
+ topology->mtcr = kvm_s390_sca_get_mtcr(kvm);
+ if (copy_to_user((void __user *)attr->addr, topology,
+ sizeof(struct kvm_s390_cpu_topology)))
+ ret = -EFAULT;
+
+ kfree(topology);
+ return ret;
+}
+
static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
{
int ret;
@@ -1776,6 +1870,9 @@ static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_MIGRATION:
ret = kvm_s390_vm_set_migration(kvm, attr);
break;
+ case KVM_S390_VM_CPU_TOPOLOGY:
+ ret = kvm_s390_set_topology(kvm, attr);
+ break;
default:
ret = -ENXIO;
break;
@@ -1801,6 +1898,9 @@ static int kvm_s390_vm_get_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_MIGRATION:
ret = kvm_s390_vm_get_migration(kvm, attr);
break;
+ case KVM_S390_VM_CPU_TOPOLOGY:
+ ret = kvm_s390_get_topology(kvm, attr);
+ break;
default:
ret = -ENXIO;
break;
@@ -1874,6 +1974,9 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_MIGRATION:
ret = 0;
break;
+ case KVM_S390_VM_CPU_TOPOLOGY:
+ ret = test_kvm_facility(kvm, 11) ? 0 : -ENXIO;
+ break;
default:
ret = -ENXIO;
break;
--
2.27.0

2022-04-22 20:23:14

by kernel test robot

Subject: Re: [PATCH v8 2/2] s390x: KVM: resetting the Topology-Change-Report

Hi Pierre,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvm/master]
[also build test WARNING on v5.18-rc3]
[cannot apply to kvms390/next mst-vhost/linux-next next-20220420]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Pierre-Morel/s390x-KVM-CPU-Topology/20220420-194302
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master
config: s390-randconfig-r044-20220420 (https://download.01.org/0day-ci/archive/20220420/[email protected]/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project bac6cd5bf85669e3376610cfc4c4f9ca015e7b9b)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/0bdeef651636ac2ef4918fb6e3230614e2fb3581
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Pierre-Morel/s390x-KVM-CPU-Topology/20220420-194302
git checkout 0bdeef651636ac2ef4918fb6e3230614e2fb3581
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash arch/s390/kvm/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

In file included from arch/s390/kvm/kvm-s390.c:22:
In file included from include/linux/kvm_host.h:41:
In file included from include/linux/kvm_para.h:5:
In file included from include/uapi/linux/kvm_para.h:37:
In file included from arch/s390/include/asm/kvm_para.h:25:
In file included from arch/s390/include/asm/diag.h:12:
In file included from include/linux/if_ether.h:19:
In file included from include/linux/skbuff.h:31:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
^
include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
#define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
^
In file included from arch/s390/kvm/kvm-s390.c:22:
In file included from include/linux/kvm_host.h:41:
In file included from include/linux/kvm_para.h:5:
In file included from include/uapi/linux/kvm_para.h:37:
In file included from arch/s390/include/asm/kvm_para.h:25:
In file included from arch/s390/include/asm/diag.h:12:
In file included from include/linux/if_ether.h:19:
In file included from include/linux/skbuff.h:31:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
^
include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
#define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
^
In file included from arch/s390/kvm/kvm-s390.c:22:
In file included from include/linux/kvm_host.h:41:
In file included from include/linux/kvm_para.h:5:
In file included from include/uapi/linux/kvm_para.h:37:
In file included from arch/s390/include/asm/kvm_para.h:25:
In file included from arch/s390/include/asm/diag.h:12:
In file included from include/linux/if_ether.h:19:
In file included from include/linux/skbuff.h:31:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
>> arch/s390/kvm/kvm-s390.c:1773:6: warning: variable 'val' set but not used [-Wunused-but-set-variable]
int val;
^
13 warnings generated.


vim +/val +1773 arch/s390/kvm/kvm-s390.c

1758
1759 /**
1760 * kvm_s390_sca_clear_mtcr
1761 * @kvm: guest KVM description
1762 *
1763 * Is only relevant if the topology facility is present,
1764 * the caller should check KVM facility 11
1765 *
1766 * Updates the Multiprocessor Topology-Change-Report to signal
1767 * the guest with a topology change.
1768 */
1769 static int kvm_s390_sca_clear_mtcr(struct kvm *kvm)
1770 {
1771 struct bsca_block *sca = kvm->arch.sca;
1772 struct kvm_vcpu *vcpu;
> 1773 int val;
1774
1775 vcpu = kvm_s390_get_first_vcpu(kvm);
1776 if (!vcpu)
1777 return -ENODEV;
1778
1779 ipte_lock(vcpu);
1780 val = READ_ONCE(sca->utility);
1781 WRITE_ONCE(sca->utility, sca->utility & ~SCA_UTILITY_MTCR);
1782 ipte_unlock(vcpu);
1783
1784 return 0;
1785 }
1786

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-22 20:33:04

by kernel test robot

Subject: Re: [PATCH v8 2/2] s390x: KVM: resetting the Topology-Change-Report

Hi Pierre,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvm/master]
[also build test WARNING on v5.18-rc3]
[cannot apply to kvms390/next mst-vhost/linux-next next-20220420]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Pierre-Morel/s390x-KVM-CPU-Topology/20220420-194302
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20220421/[email protected]/config)
compiler: s390-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/0bdeef651636ac2ef4918fb6e3230614e2fb3581
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Pierre-Morel/s390x-KVM-CPU-Topology/20220420-194302
git checkout 0bdeef651636ac2ef4918fb6e3230614e2fb3581
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash arch/s390/kvm/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

arch/s390/kvm/kvm-s390.c: In function 'kvm_s390_sca_clear_mtcr':
>> arch/s390/kvm/kvm-s390.c:1773:13: warning: variable 'val' set but not used [-Wunused-but-set-variable]
1773 | int val;
| ^~~


vim +/val +1773 arch/s390/kvm/kvm-s390.c

1758
1759 /**
1760 * kvm_s390_sca_clear_mtcr
1761 * @kvm: guest KVM description
1762 *
1763 * Is only relevant if the topology facility is present,
1764 * the caller should check KVM facility 11
1765 *
1766 * Updates the Multiprocessor Topology-Change-Report to signal
1767 * the guest with a topology change.
1768 */
1769 static int kvm_s390_sca_clear_mtcr(struct kvm *kvm)
1770 {
1771 struct bsca_block *sca = kvm->arch.sca;
1772 struct kvm_vcpu *vcpu;
> 1773 int val;
1774
1775 vcpu = kvm_s390_get_first_vcpu(kvm);
1776 if (!vcpu)
1777 return -ENODEV;
1778
1779 ipte_lock(vcpu);
1780 val = READ_ONCE(sca->utility);
1781 WRITE_ONCE(sca->utility, sca->utility & ~SCA_UTILITY_MTCR);
1782 ipte_unlock(vcpu);
1783
1784 return 0;
1785 }
1786

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-22 21:05:32

by Pierre Morel

Subject: [PATCH v8 1/2] s390x: KVM: guest support for topology function

We let the userland hypervisor know if the machine supports the CPU
topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.

The PTF instruction will report a topology change if there is any change
with a previous STSI_15_1_2 SYSIB.
Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.

To check if the topology has been modified we use a new field of the
arch vCPU to save the previous real CPU ID at the end of a schedule
and verify on next schedule that the CPU used is in the same socket.
We do not report polarization, CPU Type or dedication change.

STSI(15.1.x) gives information on the CPU configuration topology.
Let's accept the interception of STSI with the function code 15 and
let the userland part of the hypervisor handle it when userland
supports the CPU Topology facility.

Signed-off-by: Pierre Morel <[email protected]>
---
Documentation/virt/kvm/api.rst | 16 +++++++
arch/s390/include/asm/kvm_host.h | 12 ++++--
arch/s390/kvm/kvm-s390.c | 74 +++++++++++++++++++++++++++++++-
arch/s390/kvm/kvm-s390.h | 25 +++++++++++
arch/s390/kvm/priv.c | 14 ++++--
arch/s390/kvm/vsie.c | 3 ++
include/uapi/linux/kvm.h | 1 +
7 files changed, 137 insertions(+), 8 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 85c7abc51af5..3499bc8d205e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7769,3 +7769,19 @@ Ordering of KVM_GET_*/KVM_SET_* ioctls
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TBD
+
+8.17 KVM_CAP_S390_CPU_TOPOLOGY
+------------------------------
+
+:Capability: KVM_CAP_S390_CPU_TOPOLOGY
+:Architectures: s390
+:Type: vm
+
+This capability indicates that KVM will provide the S390 CPU Topology facility
+which consists of the interpretation of the PTF instruction for the Function
+Code 2 along with interception and forwarding of both the PTF instruction
+with Function Codes 0 or 1 and the STSI(15,1,x) instruction to the userland
+hypervisor.
+
+The stfle facility 11, CPU Topology facility, should not be provided to the
+guest without this capability.
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 766028d54a3e..04653b43ccee 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -97,15 +97,19 @@ struct bsca_block {
union ipte_control ipte_control;
__u64 reserved[5];
__u64 mcn;
- __u64 reserved2;
+#define SCA_UTILITY_MTCR 0x8000
+ __u16 utility;
+ __u8 reserved2[6];
struct bsca_entry cpu[KVM_S390_BSCA_CPU_SLOTS];
};

struct esca_block {
union ipte_control ipte_control;
- __u64 reserved1[7];
+ __u64 reserved1[6];
+ __u16 utility;
+ __u8 reserved2[6];
__u64 mcn[4];
- __u64 reserved2[20];
+ __u64 reserved3[20];
struct esca_entry cpu[KVM_S390_ESCA_CPU_SLOTS];
};

@@ -249,6 +253,7 @@ struct kvm_s390_sie_block {
#define ECB_SPECI 0x08
#define ECB_SRSI 0x04
#define ECB_HOSTPROTINT 0x02
+#define ECB_PTF 0x01
__u8 ecb; /* 0x0061 */
#define ECB2_CMMA 0x80
#define ECB2_IEP 0x20
@@ -750,6 +755,7 @@ struct kvm_vcpu_arch {
bool skey_enabled;
struct kvm_s390_pv_vcpu pv;
union diag318_info diag318_info;
+ int prev_cpu;
};

struct kvm_vm_stat {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 156d1c25a3c1..925ccc59f283 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_S390_PROTECTED:
r = is_prot_virt_host();
break;
+ case KVM_CAP_S390_CPU_TOPOLOGY:
+ r = test_facility(11);
+ break;
default:
r = 0;
}
@@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
icpt_operexc_on_all_vcpus(kvm);
r = 0;
break;
+ case KVM_CAP_S390_CPU_TOPOLOGY:
+ r = -EINVAL;
+ mutex_lock(&kvm->lock);
+ if (kvm->created_vcpus) {
+ r = -EBUSY;
+ } else if (test_facility(11)) {
+ set_kvm_facility(kvm->arch.model.fac_mask, 11);
+ set_kvm_facility(kvm->arch.model.fac_list, 11);
+ r = 0;
+ }
+ mutex_unlock(&kvm->lock);
+ VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
+ r ? "(not available)" : "(success)");
+ break;
default:
r = -EINVAL;
break;
@@ -1695,6 +1712,50 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
return ret;
}

+/**
+ * kvm_s390_get_first_vcpu
+ * @kvm: guest KVM description
+ *
+ * returns the first online vcpu
+ */
+static struct kvm_vcpu *kvm_s390_get_first_vcpu(struct kvm *kvm)
+{
+ struct kvm_vcpu *vcpu;
+ unsigned long i;
+
+ kvm_for_each_vcpu(i, vcpu, kvm)
+ return vcpu;
+ return NULL;
+}
+
+/**
+ * kvm_s390_sca_set_mtcr
+ * @kvm: guest KVM description
+ *
+ * Is only relevant if the topology facility is present,
+ * the caller should check KVM facility 11
+ *
+ * Updates the Multiprocessor Topology-Change-Report to signal
+ * the guest with a topology change.
+ * Note that both bsca and esca have the utility half word at
+ * the same offset.
+ */
+static int kvm_s390_sca_set_mtcr(struct kvm *kvm)
+{
+ struct bsca_block *sca = kvm->arch.sca;
+ struct kvm_vcpu *vcpu;
+
+ vcpu = kvm_s390_get_first_vcpu(kvm);
+ if (!vcpu)
+ return -ENODEV;
+
+ ipte_lock(vcpu);
+ WRITE_ONCE(sca->utility, sca->utility | SCA_UTILITY_MTCR);
+ ipte_unlock(vcpu);
+
+ return 0;
+}
+
static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
{
int ret;
@@ -3138,16 +3199,20 @@ __u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu)

void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
-
gmap_enable(vcpu->arch.enabled_gmap);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_RUNNING);
if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
__start_cpu_timer_accounting(vcpu);
vcpu->cpu = cpu;
+
+ if (kvm_s390_topology_changed(vcpu))
+ kvm_s390_sca_set_mtcr(vcpu->kvm);
}

void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
+ /* Remember which CPU was backing the vCPU */
+ vcpu->arch.prev_cpu = vcpu->cpu;
vcpu->cpu = -1;
if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
__stop_cpu_timer_accounting(vcpu);
@@ -3267,6 +3332,13 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
if (test_kvm_facility(vcpu->kvm, 9))
vcpu->arch.sie_block->ecb |= ECB_SRSI;
+
+ /* PTF needs guest facilities to enable interpretation */
+ if (test_kvm_facility(vcpu->kvm, 11))
+ vcpu->arch.sie_block->ecb |= ECB_PTF;
+ /* Indicate this is a new vcpu */
+ vcpu->arch.prev_cpu = S390_KVM_TOPOLOGY_NEW_CPU;
+
if (test_kvm_facility(vcpu->kvm, 73))
vcpu->arch.sie_block->ecb |= ECB_TE;
if (!kvm_is_ucontrol(vcpu->kvm))
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 497d52a83c78..897767652b5c 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -514,4 +514,29 @@ void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
*/
extern unsigned int diag9c_forwarding_hz;

+#define S390_KVM_TOPOLOGY_NEW_CPU -1
+/**
+ * kvm_s390_topology_changed
+ * @vcpu: the virtual CPU
+ *
+ * If the topology facility is present, checks if the CPU topology
+ * viewed by the guest changed due to load balancing or CPU hotplug.
+ */
+static inline bool kvm_s390_topology_changed(struct kvm_vcpu *vcpu)
+{
+ if (!test_kvm_facility(vcpu->kvm, 11))
+ return false;
+
+ /* A new vCPU has been hotplugged */
+ if (vcpu->arch.prev_cpu == S390_KVM_TOPOLOGY_NEW_CPU)
+ return true;
+
+ /* The real CPU backing up the vCPU moved to another socket */
+ if (!cpumask_test_cpu(vcpu->cpu,
+ topology_core_cpumask(vcpu->arch.prev_cpu)))
+ return true;
+
+ return false;
+}
+
#endif
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 5beb7a4a11b3..5ab7173c3909 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -874,10 +874,12 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);

- if (fc > 3) {
- kvm_s390_set_psw_cc(vcpu, 3);
- return 0;
- }
+ if (fc > 3 && fc != 15)
+ goto out_no_data;
+
+ /* fc 15 is provided with PTF/CPU topology support */
+ if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
+ goto out_no_data;

if (vcpu->run->s.regs.gprs[0] & 0x0fffff00
|| vcpu->run->s.regs.gprs[1] & 0xffff0000)
@@ -911,6 +913,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
goto out_no_data;
handle_stsi_3_2_2(vcpu, (void *) mem);
break;
+ case 15:
+ trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
+ insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
+ return -EREMOTE;
}
if (kvm_s390_pv_cpu_is_protected(vcpu)) {
memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index acda4b6fc851..da0397cf2cc7 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
/* Host-protection-interruption introduced with ESOP */
if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
+ /* CPU Topology */
+ if (test_kvm_facility(vcpu->kvm, 11))
+ scb_s->ecb |= scb_o->ecb & ECB_PTF;
/* transactional execution */
if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
/* remap the prefix is tx is toggled on */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 91a6fe4e02c0..9640cfa9a92a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1144,6 +1144,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_S390_MEM_OP_EXTENSION 211
#define KVM_CAP_PMU_CAPABILITY 212
#define KVM_CAP_DISABLE_QUIRKS2 213
+#define KVM_CAP_S390_CPU_TOPOLOGY 214

#ifdef KVM_CAP_IRQ_ROUTING

--
2.27.0

2022-04-28 16:08:46

by David Hildenbrand

Subject: Re: [PATCH v8 2/2] s390x: KVM: resetting the Topology-Change-Report

On 20.04.22 13:34, Pierre Morel wrote:
> During a subsystem reset the Topology-Change-Report is cleared.
> Let's give userland the possibility to clear the MTCR in the case
> of a subsystem reset.
>
> To migrate the MTCR, let's give userland the possibility to
> query the MTCR state.
>
> Signed-off-by: Pierre Morel <[email protected]>
> ---
> arch/s390/include/uapi/asm/kvm.h | 9 +++
> arch/s390/kvm/kvm-s390.c | 103 +++++++++++++++++++++++++++++++
> 2 files changed, 112 insertions(+)
>
> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> index 7a6b14874d65..bb3df6d49f27 100644
> --- a/arch/s390/include/uapi/asm/kvm.h
> +++ b/arch/s390/include/uapi/asm/kvm.h
> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
> #define KVM_S390_VM_CRYPTO 2
> #define KVM_S390_VM_CPU_MODEL 3
> #define KVM_S390_VM_MIGRATION 4
> +#define KVM_S390_VM_CPU_TOPOLOGY 5
>
> /* kvm attributes for mem_ctrl */
> #define KVM_S390_VM_MEM_ENABLE_CMMA 0
> @@ -171,6 +172,14 @@ struct kvm_s390_vm_cpu_subfunc {
> #define KVM_S390_VM_MIGRATION_START 1
> #define KVM_S390_VM_MIGRATION_STATUS 2
>
> +/* kvm attributes for cpu topology */
> +#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR 0
> +#define KVM_S390_VM_CPU_TOPO_MTR_SET 1
> +
> +struct kvm_s390_cpu_topology {
> + __u16 mtcr;
> +};

Just wondering:

1) Do we really need a struct for that
2) Do we want to leave some room for later expansion?

> +
> /* for KVM_GET_REGS and KVM_SET_REGS */
> struct kvm_regs {
> /* general purpose regs for s390 */
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 925ccc59f283..755f325c9e70 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1756,6 +1756,100 @@ static int kvm_s390_sca_set_mtcr(struct kvm *kvm)
> return 0;
> }
>
> +/**
> + * kvm_s390_sca_clear_mtcr
> + * @kvm: guest KVM description
> + *
> + * Is only relevant if the topology facility is present,
> + * the caller should check KVM facility 11
> + *
> + * Updates the Multiprocessor Topology-Change-Report to signal
> + * the guest with a topology change.
> + */
> +static int kvm_s390_sca_clear_mtcr(struct kvm *kvm)
> +{
> + struct bsca_block *sca = kvm->arch.sca;
> + struct kvm_vcpu *vcpu;
> + int val;
> +
> + vcpu = kvm_s390_get_first_vcpu(kvm);
> + if (!vcpu)
> + return -ENODEV;

It would be cleaner to have ipte_lock/ipte_unlock variants that are
independent of a vcpu.

Instead of checking for "vcpu->arch.sie_block->eca & ECA_SII" we might
just check for sclp.has_siif. Everything else that performs the
lock/unlock should be contained in "struct kvm" directly, unless I am
missing something.

[...]

> +
> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
> +{
> +	struct kvm_s390_cpu_topology *topology;
> +	int ret = 0;
> +
> +	if (!test_kvm_facility(kvm, 11))
> +		return -ENXIO;
> +
> +	topology = kzalloc(sizeof(*topology), GFP_KERNEL);
> +	if (!topology)
> +		return -ENOMEM;

I'm confused. We're allocating a __u16 to then free it again below? Why
not simply use a value on the stack like in kvm_s390_vm_get_migration()?



	u16 mtcr;
	...
	mtcr = kvm_s390_sca_get_mtcr(kvm);

	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
		return -EFAULT;
	return 0;



> +
> +	topology->mtcr = kvm_s390_sca_get_mtcr(kvm);

s/  / /

> +	if (copy_to_user((void __user *)attr->addr, topology,
> +			 sizeof(struct kvm_s390_cpu_topology)))
> +		ret = -EFAULT;
> +
> +	kfree(topology);
> +	return ret;
> +}
> +


--
Thanks,

David / dhildenb

2022-05-05 23:06:13

by Pierre Morel

Subject: Re: [PATCH v8 2/2] s390x: KVM: resetting the Topology-Change-Report



On 4/28/22 15:50, David Hildenbrand wrote:
> On 20.04.22 13:34, Pierre Morel wrote:
>> During a subsystem reset the Topology-Change-Report is cleared.
>> Let's give userland the possibility to clear the MTCR in the case
>> of a subsystem reset.
>>
>> To migrate the MTCR, let's give userland the possibility to
>> query the MTCR state.
>>
>> Signed-off-by: Pierre Morel <[email protected]>
>> ---
>> arch/s390/include/uapi/asm/kvm.h | 9 +++
>> arch/s390/kvm/kvm-s390.c | 103 +++++++++++++++++++++++++++++++
>> 2 files changed, 112 insertions(+)
>>
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index 7a6b14874d65..bb3df6d49f27 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>> #define KVM_S390_VM_CRYPTO 2
>> #define KVM_S390_VM_CPU_MODEL 3
>> #define KVM_S390_VM_MIGRATION 4
>> +#define KVM_S390_VM_CPU_TOPOLOGY 5
>>
>> /* kvm attributes for mem_ctrl */
>> #define KVM_S390_VM_MEM_ENABLE_CMMA 0
>> @@ -171,6 +172,14 @@ struct kvm_s390_vm_cpu_subfunc {
>> #define KVM_S390_VM_MIGRATION_START 1
>> #define KVM_S390_VM_MIGRATION_STATUS 2
>>
>> +/* kvm attributes for cpu topology */
>> +#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR 0
>> +#define KVM_S390_VM_CPU_TOPO_MTR_SET 1
>> +
>> +struct kvm_s390_cpu_topology {
>> +	__u16 mtcr;
>> +};
>
> Just wondering:
>
> 1) Do we really need a struct for that
> 2) Do we want to leave some room for later expansion?

Yes, that is the goal: we may want to report more topology information
for the case where the vCPUs are not pinned to real CPUs.
In that case I think we need to report more information on the vCPU
topology to the guest.
For now I explicitly limited the series to pinned vCPUs.

But the change from a u16 to a structure can be done at that point.
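
To illustrate the expansion question, here is a minimal userspace sketch of
one common way to leave room in a uapi structure: pad it with reserved
fields that must be zero today. The names topo_attr_sketch and
topo_attr_check_reserved are hypothetical and are not part of this patch;
the layout is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, not the one from the patch: mtcr plus reserved room. */
struct topo_attr_sketch {
	uint16_t mtcr;        /* topology-change-report state */
	uint16_t reserved[3]; /* must be zero today; free for future fields */
};

/* The size becomes ABI once userspace copies it, so pin it at build time. */
static_assert(sizeof(struct topo_attr_sketch) == 8, "ABI size changed");

/* Return 0 if every reserved field is zero, -1 otherwise. */
int topo_attr_check_reserved(const struct topo_attr_sketch *attr)
{
	static const uint16_t zero[3];

	return memcmp(attr->reserved, zero, sizeof(zero)) ? -1 : 0;
}
```

Rejecting non-zero reserved fields is what lets them be given a meaning
later; a plain u16, as discussed above, avoids the question until more
information really needs to be reported.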

>
>> +
>> /* for KVM_GET_REGS and KVM_SET_REGS */
>> struct kvm_regs {
>> /* general purpose regs for s390 */
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 925ccc59f283..755f325c9e70 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1756,6 +1756,100 @@ static int kvm_s390_sca_set_mtcr(struct kvm *kvm)
>> return 0;
>> }
>>
>> +/**
>> + * kvm_s390_sca_clear_mtcr
>> + * @kvm: guest KVM description
>> + *
>> + * Is only relevant if the topology facility is present,
>> + * the caller should check KVM facility 11
>> + *
>> + * Updates the Multiprocessor Topology-Change-Report to signal
>> + * the guest with a topology change.
>> + */
>> +static int kvm_s390_sca_clear_mtcr(struct kvm *kvm)
>> +{
>> +	struct bsca_block *sca = kvm->arch.sca;
>> +	struct kvm_vcpu *vcpu;
>> +	int val;
>> +
>> +	vcpu = kvm_s390_get_first_vcpu(kvm);
>> +	if (!vcpu)
>> +		return -ENODEV;
>
> It would be cleaner to have ipte_lock/ipte_unlock variants that are
> independent of a vcpu.
>
> Instead of checking for "vcpu->arch.sie_block->eca & ECA_SII" we might
> just check for sclp.has_siif. Everything else that performs the
> lock/unlock should be contained in "struct kvm" directly, unless I am
> missing something.

No, you are right, ipte_lock/unlock are independent of the vcpu.
I already had a patch for this, but I did not think about sclp.has_siif
and it was still heavy.
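
As a rough userspace sketch of the idea, assuming the lock flavour can be
chosen from a machine-wide flag rather than from a vCPU's ECA bits: the
types sclp_sketch and kvm_sketch and the function ipte_lock_kvm are
simplified stand-ins invented here, and the two lock helpers only count
calls instead of taking a real lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel state; all names here are simplified assumptions. */
struct sclp_sketch {
	bool has_siif;
};

struct kvm_sketch {
	int simple_locks; /* times the simple path was taken */
	int siif_locks;   /* times the SIIF path was taken */
};

static struct sclp_sketch sclp = { .has_siif = false };

/* Placeholders: the real helpers would operate on the SCA. */
static void ipte_lock_simple(struct kvm_sketch *kvm) { kvm->simple_locks++; }
static void ipte_lock_siif(struct kvm_sketch *kvm) { kvm->siif_locks++; }

/* Pick the lock flavour without needing any vCPU. */
void ipte_lock_kvm(struct kvm_sketch *kvm)
{
	if (sclp.has_siif)
		ipte_lock_siif(kvm);
	else
		ipte_lock_simple(kvm);
}
```

With a helper shaped like this, a caller such as kvm_s390_sca_clear_mtcr()
would no longer need to look up a first vCPU or return -ENODEV when none
exists.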

>
> [...]
>
>> +
>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	struct kvm_s390_cpu_topology *topology;
>> +	int ret = 0;
>> +
>> +	if (!test_kvm_facility(kvm, 11))
>> +		return -ENXIO;
>> +
>> +	topology = kzalloc(sizeof(*topology), GFP_KERNEL);
>> +	if (!topology)
>> +		return -ENOMEM;
>
> I'm confused. We're allocating a __u16 to then free it again below? Why
> not simply use a value on the stack like in kvm_s390_vm_get_migration()?

It comes from the idea of reporting more information later.
But done like this, it serves no purpose.


>
>
>
> 	u16 mtcr;
> 	...
> 	mtcr = kvm_s390_sca_get_mtcr(kvm);
>
> 	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
> 		return -EFAULT;
> 	return 0;

yes, thanks.
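
For reference, the stack-based shape can be modelled in plain userspace C.
This is only a sketch: copy_to_user_sketch and get_mtcr_sketch are
hypothetical stand-ins for copy_to_user() and kvm_s390_sca_get_mtcr(), and
a NULL destination is used to model a faulting user address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for copy_to_user(): returns the number of bytes NOT copied. */
static size_t copy_to_user_sketch(void *dst, const void *src, size_t n)
{
	if (!dst)
		return n; /* model a faulting user address */
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical stand-in for kvm_s390_sca_get_mtcr(). */
static uint16_t get_mtcr_sketch(void)
{
	return 1;
}

/* No allocation and no cleanup path: the value lives on the stack. */
int get_topology_sketch(void *user_addr)
{
	uint16_t mtcr = get_mtcr_sketch();

	if (copy_to_user_sketch(user_addr, &mtcr, sizeof(mtcr)))
		return -EFAULT;
	return 0;
}
```

Dropping the allocation also removes the -ENOMEM failure path and the
kfree() that every exit would otherwise have to reach.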

>
>
>
>> +
>> +	topology->mtcr = kvm_s390_sca_get_mtcr(kvm);
>
> s/  / /

Yes, this too.

>
>> +	if (copy_to_user((void __user *)attr->addr, topology,
>> +			 sizeof(struct kvm_s390_cpu_topology)))
>> +		ret = -EFAULT;
>> +
>> +	kfree(topology);
>> +	return ret;
>> +}
>> +
>
>


Thanks a lot David,

Regards,
Pierre

--
Pierre Morel
IBM Lab Boeblingen