This is a comtinuation of "KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS"
(https://lore.kernel.org/kvm/[email protected]/)
work.
1) Enforce KVM_CAP_NR_VCPUS <= KVM_CAP_MAX_VCPUS rule on all
architectures. [Sean Christopherson]
2) Make KVM_CAP_NR_VCPUS return num_online_cpus() and not an arbitrary
value of '710' on x86.
Everything but x86 was only 'eyeball tested', the change is trivial
but sorry in advance if I screwed up)
Vitaly Kuznetsov (5):
KVM: arm64: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS
arch/arm64/kvm/arm.c | 7 ++++++-
arch/mips/kvm/mips.c | 2 +-
arch/powerpc/kvm/powerpc.c | 4 ++--
arch/riscv/kvm/vm.c | 2 +-
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/x86.c | 2 +-
6 files changed, 11 insertions(+), 7 deletions(-)
--
2.33.1
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/arm64/kvm/arm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 7838e9fb693e..391dc7a921d5 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = 1;
break;
case KVM_CAP_NR_VCPUS:
- r = num_online_cpus();
+ if (kvm)
+ r = min_t(unsigned int, num_online_cpus(),
+ kvm->arch.max_vcpus);
+ else
+ r = min_t(unsigned int, num_online_cpus(),
+ kvm_arm_default_max_vcpus());
break;
case KVM_CAP_MAX_VCPUS:
case KVM_CAP_MAX_VCPU_ID:
--
2.33.1
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/mips/kvm/mips.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 562aa878b266..aa20d074d388 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -1067,7 +1067,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = 1;
break;
case KVM_CAP_NR_VCPUS:
- r = num_online_cpus();
+ r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
break;
case KVM_CAP_MAX_VCPUS:
r = KVM_MAX_VCPUS;
--
2.33.1
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/powerpc/kvm/powerpc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 8ab90ce8738f..ccac8d5686ff 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -641,9 +641,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
* implementations just count online CPUs.
*/
if (hv_enabled)
- r = num_present_cpus();
+ r = min_t(unsigned int, num_present_cpus(), KVM_MAX_VCPUS);
else
- r = num_online_cpus();
+ r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
break;
case KVM_CAP_MAX_VCPUS:
r = KVM_MAX_VCPUS;
--
2.33.1
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/riscv/kvm/vm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
index 26399df15b63..fb18af34a4b5 100644
--- a/arch/riscv/kvm/vm.c
+++ b/arch/riscv/kvm/vm.c
@@ -74,7 +74,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = 1;
break;
case KVM_CAP_NR_VCPUS:
- r = num_online_cpus();
+ r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
break;
case KVM_CAP_MAX_VCPUS:
r = KVM_MAX_VCPUS;
--
2.33.1
KVM_CAP_NR_VCPUS is used to get the "recommended" maximum number of
VCPUs and arm64/mips/riscv report num_online_cpus(). Powerpc reports
either num_online_cpus() or num_present_cpus(), s390 has multiple
constants depending on hardware features. On x86, KVM reports an
arbitrary value of '710' which is supposed to be the maximum tested
value but it's possible to test all KVM_MAX_VCPUS even when there are
less physical CPUs available.
Drop the arbitrary '710' value and return num_online_cpus() on x86 as
well. The recommendation will match other architectures and will mean
'no CPU overcommit'.
For reference, QEMU only queries KVM_CAP_NR_VCPUS to print a warning
when the requested vCPU number exceeds it. The static limit of '710'
is quite weird as smaller systems with just a few physical CPUs should
certainly "recommend" less.
Suggested-by: Eduardo Habkost <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/x86.c | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88fce6ab4bbd..0232a00598f2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -38,7 +38,6 @@
#define __KVM_HAVE_ARCH_VCPU_DEBUGFS
#define KVM_MAX_VCPUS 1024
-#define KVM_SOFT_MAX_VCPUS 710
/*
* In x86, the VCPU ID corresponds to the APIC ID, and APIC IDs
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..18a00a7c23bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4137,7 +4137,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = !static_call(kvm_x86_cpu_has_accelerated_tpr)();
break;
case KVM_CAP_NR_VCPUS:
- r = KVM_SOFT_MAX_VCPUS;
+ r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
break;
case KVM_CAP_MAX_VCPUS:
r = KVM_MAX_VCPUS;
--
2.33.1
On 11/11/21 17:27, Vitaly Kuznetsov wrote:
> This is a comtinuation of "KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS"
> (https://lore.kernel.org/kvm/[email protected]/)
> work.
>
> 1) Enforce KVM_CAP_NR_VCPUS <= KVM_CAP_MAX_VCPUS rule on all
> architectures. [Sean Christopherson]
> 2) Make KVM_CAP_NR_VCPUS return num_online_cpus() and not an arbitrary
> value of '710' on x86.
>
> Everything but x86 was only 'eyeball tested', the change is trivial
> but sorry in advance if I screwed up)
Christian, can you look at this for s390? Returning a fixed value seems
wrong for KVM_CAP_NR_VCPUS.
Thanks,
Paolo
> Vitaly Kuznetsov (5):
> KVM: arm64: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
> KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
> KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
> KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
> KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS
>
> arch/arm64/kvm/arm.c | 7 ++++++-
> arch/mips/kvm/mips.c | 2 +-
> arch/powerpc/kvm/powerpc.c | 4 ++--
> arch/riscv/kvm/vm.c | 2 +-
> arch/x86/include/asm/kvm_host.h | 1 -
> arch/x86/kvm/x86.c | 2 +-
> 6 files changed, 11 insertions(+), 7 deletions(-)
>
Hi Vitaly,
On 2021-11-11 16:27, Vitaly Kuznetsov wrote:
> It doesn't make sense to return the recommended maximum number of
> vCPUs which exceeds the maximum possible number of vCPUs.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/arm64/kvm/arm.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 7838e9fb693e..391dc7a921d5 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
> long ext)
> r = 1;
> break;
> case KVM_CAP_NR_VCPUS:
> - r = num_online_cpus();
> + if (kvm)
> + r = min_t(unsigned int, num_online_cpus(),
> + kvm->arch.max_vcpus);
> + else
> + r = min_t(unsigned int, num_online_cpus(),
> + kvm_arm_default_max_vcpus());
> break;
> case KVM_CAP_MAX_VCPUS:
> case KVM_CAP_MAX_VCPU_ID:
This looks odd. This means that depending on the phase userspace is
in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing
or the other.
For example, I create a VM on a 32 CPU system, NR_VCPUS says 32.
I create a GICv2 interrupt controller, it now says 8.
That's a change in behaviour that is visible by userspace, which
I'm keen on avoiding. I'd rather have the kvm and !kvm cases
return the same thing.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
Marc Zyngier <[email protected]> writes:
> Hi Vitaly,
>
> On 2021-11-11 16:27, Vitaly Kuznetsov wrote:
>> It doesn't make sense to return the recommended maximum number of
>> vCPUs which exceeds the maximum possible number of vCPUs.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> arch/arm64/kvm/arm.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 7838e9fb693e..391dc7a921d5 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
>> long ext)
>> r = 1;
>> break;
>> case KVM_CAP_NR_VCPUS:
>> - r = num_online_cpus();
>> + if (kvm)
>> + r = min_t(unsigned int, num_online_cpus(),
>> + kvm->arch.max_vcpus);
>> + else
>> + r = min_t(unsigned int, num_online_cpus(),
>> + kvm_arm_default_max_vcpus());
>> break;
>> case KVM_CAP_MAX_VCPUS:
>> case KVM_CAP_MAX_VCPU_ID:
>
> This looks odd. This means that depending on the phase userspace is
> in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing
> or the other.
>
> For example, I create a VM on a 32 CPU system, NR_VCPUS says 32.
> I create a GICv2 interrupt controller, it now says 8.
>
> That's a change in behaviour that is visible by userspace
Yes, I realize this is a userspace visible change. The reason I suggest
it is that logically, it seems very odd that the maximum recommended
number of vCPUs (KVM_CAP_NR_VCPUS) can be higher, than the maximum
supported number of vCPUs (KVM_CAP_MAX_VCPUS). All userspaces which use
this information somehow should already contain some workaround for this
case. (maybe it's a rare one and nobody hit it yet or maybe there are no
userspaces using KVM_CAP_NR_VCPUS for anything besides complaining --
like QEMU).
I'd like KVM to be consistent across architectures and have the same
(similar) meaning for KVM_CAP_NR_VCPUS.
> which I'm keen on avoiding. I'd rather have the kvm and !kvm cases
> return the same thing.
Forgive me my (ARM?) ignorance but what would it be then? If we go for
min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat
this can still go above KVM_CAP_MAX_VCPUS after vGIC is created?
Thanks for the feedback!
--
Vitaly
On Fri, Nov 12, 2021 at 10:51:10AM +0100, Vitaly Kuznetsov wrote:
> Marc Zyngier <[email protected]> writes:
>
> > Hi Vitaly,
> >
> > On 2021-11-11 16:27, Vitaly Kuznetsov wrote:
> >> It doesn't make sense to return the recommended maximum number of
> >> vCPUs which exceeds the maximum possible number of vCPUs.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> >> ---
> >> arch/arm64/kvm/arm.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> >> index 7838e9fb693e..391dc7a921d5 100644
> >> --- a/arch/arm64/kvm/arm.c
> >> +++ b/arch/arm64/kvm/arm.c
> >> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
> >> long ext)
> >> r = 1;
> >> break;
> >> case KVM_CAP_NR_VCPUS:
> >> - r = num_online_cpus();
> >> + if (kvm)
> >> + r = min_t(unsigned int, num_online_cpus(),
> >> + kvm->arch.max_vcpus);
> >> + else
> >> + r = min_t(unsigned int, num_online_cpus(),
> >> + kvm_arm_default_max_vcpus());
> >> break;
> >> case KVM_CAP_MAX_VCPUS:
> >> case KVM_CAP_MAX_VCPU_ID:
> >
> > This looks odd. This means that depending on the phase userspace is
> > in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing
> > or the other.
> >
> > For example, I create a VM on a 32 CPU system, NR_VCPUS says 32.
> > I create a GICv2 interrupt controller, it now says 8.
> >
> > That's a change in behaviour that is visible by userspace
>
> Yes, I realize this is a userspace visible change. The reason I suggest
> it is that logically, it seems very odd that the maximum recommended
> number of vCPUs (KVM_CAP_NR_VCPUS) can be higher, than the maximum
> supported number of vCPUs (KVM_CAP_MAX_VCPUS). All userspaces which use
> this information somehow should already contain some workaround for this
> case. (maybe it's a rare one and nobody hit it yet or maybe there are no
> userspaces using KVM_CAP_NR_VCPUS for anything besides complaining --
> like QEMU).
>
> I'd like KVM to be consistent across architectures and have the same
> (similar) meaning for KVM_CAP_NR_VCPUS.
KVM_CAP_NR_VCPUS seems pretty useless if we just want to tell userspace
the same thing it can get with _SC_NPROCESSORS_ONLN. In fact, if userspace
knows something we don't about the future onlining of some pcpus, then
maybe userspace would prefer to check _SC_NPROCESSORS_CONF.
>
> > which I'm keen on avoiding. I'd rather have the kvm and !kvm cases
> > return the same thing.
>
> Forgive me my (ARM?) ignorance but what would it be then? If we go for
> min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat
> this can still go above KVM_CAP_MAX_VCPUS after vGIC is created?
So the GIC version case looks like the type of thing that could make
KVM_CAP_NR_VCPUS useful, i.e. being able to tell userspace a maximum
number of vcpus supported for a given configuration. However, even
in that case the concept of "recommended" number doesn't make sense,
because, for the GICv2 example, a VM cannot configure more than 8 VCPUs,
so it's a real limit, not a recommendation. Maybe KVM_CAP_NR_VCPUS should
just be left alone, but deprecated, and, if there's need, a new CAP could
be created for a config-vcpu-max.
Thanks,
drew
On 11/12/21 11:38, Andrew Jones wrote:
>>
>> I'd like KVM to be consistent across architectures and have the same
>> (similar) meaning for KVM_CAP_NR_VCPUS.
> KVM_CAP_NR_VCPUS seems pretty useless if we just want to tell userspace
> the same thing it can get with _SC_NPROCESSORS_ONLN. In fact, if userspace
> knows something we don't about the future onlining of some pcpus, then
> maybe userspace would prefer to check _SC_NPROCESSORS_CONF.
It's the same for most architectures, but for example PPC has to take
into account the handling of threads, which can exist while the VMs run
but not in the host. So KVM_CAP_NR_VCPUS on PPC is _SC_NPROCESSORS_CONF
times the number of threads per core.
If you don't count PPC (not sure about s390), it _is_ pretty useless indeed.
Paolo
On Fri, 12 Nov 2021 09:51:10 +0000,
Vitaly Kuznetsov <[email protected]> wrote:
>
> Marc Zyngier <[email protected]> writes:
>
> > Hi Vitaly,
> >
> > On 2021-11-11 16:27, Vitaly Kuznetsov wrote:
> >> It doesn't make sense to return the recommended maximum number of
> >> vCPUs which exceeds the maximum possible number of vCPUs.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> >> ---
> >> arch/arm64/kvm/arm.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> >> index 7838e9fb693e..391dc7a921d5 100644
> >> --- a/arch/arm64/kvm/arm.c
> >> +++ b/arch/arm64/kvm/arm.c
> >> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
> >> long ext)
> >> r = 1;
> >> break;
> >> case KVM_CAP_NR_VCPUS:
> >> - r = num_online_cpus();
> >> + if (kvm)
> >> + r = min_t(unsigned int, num_online_cpus(),
> >> + kvm->arch.max_vcpus);
> >> + else
> >> + r = min_t(unsigned int, num_online_cpus(),
> >> + kvm_arm_default_max_vcpus());
> >> break;
> >> case KVM_CAP_MAX_VCPUS:
> >> case KVM_CAP_MAX_VCPU_ID:
> >
> > This looks odd. This means that depending on the phase userspace is
> > in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing
> > or the other.
> >
> > For example, I create a VM on a 32 CPU system, NR_VCPUS says 32.
> > I create a GICv2 interrupt controller, it now says 8.
> >
> > That's a change in behaviour that is visible by userspace
>
> Yes, I realize this is a userspace visible change. The reason I suggest
> it is that logically, it seems very odd that the maximum recommended
> number of vCPUs (KVM_CAP_NR_VCPUS) can be higher, than the maximum
> supported number of vCPUs (KVM_CAP_MAX_VCPUS).
I'm all for this change.
> All userspaces which use
> this information somehow should already contain some workaround for this
> case. (maybe it's a rare one and nobody hit it yet or maybe there are no
> userspaces using KVM_CAP_NR_VCPUS for anything besides complaining --
> like QEMU).
>
> I'd like KVM to be consistent across architectures and have the same
> (similar) meaning for KVM_CAP_NR_VCPUS.
Sure, but this is a pretty useless piece of information anyway. As
Andrew pointed out, the information is available somewhere else, and
all we need to do is to cap it to the number of supported vcpus, which
is effectively a KVM limitation.
Also, we are talking about representing the architecture to userspace.
No amount of massaging is going to make an arm64 box look like an x86.
> > which I'm keen on avoiding. I'd rather have the kvm and !kvm cases
> > return the same thing.
>
> Forgive me my (ARM?) ignorance but what would it be then? If we go for
> min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat
> this can still go above KVM_CAP_MAX_VCPUS after vGIC is created?
"min(num_online_cpus(), kvm_arm_default_max_vcpus())" is probably the
right thing in all cases. Yes, KVM_CAP_NR_VCPUS will keep reporting
more than the VM can actually support. But that's why we have
KVM_CAP_MAX_VCPUS, which tells you now many vcpus you can create for a
given configuration.
This shows how useless KVM_CAP_NR_VCPUS is, and I wouldn't mind a
documentation patch stating this.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
On 11/12/21 15:02, Marc Zyngier wrote:
>> I'd like KVM to be consistent across architectures and have the same
>> (similar) meaning for KVM_CAP_NR_VCPUS.
> Sure, but this is a pretty useless piece of information anyway. As
> Andrew pointed out, the information is available somewhere else, and
> all we need to do is to cap it to the number of supported vcpus, which
> is effectively a KVM limitation.
>
> Also, we are talking about representing the architecture to userspace.
> No amount of massaging is going to make an arm64 box look like an x86.
Not sure what you mean? The API is about providing a piece of
information independent of the architecture, while catering for a ppc
weirdness. Yes it's mostly useless if you don't care about ppc, but
it's not about making arm64 look like x86 or ppc; it's about not having
to special case ppc in userspace.
If anything, if KVM_CAP_NR_VCPUS returns the same for kvm and !kvm, then
*that* is making an arm64 box look like an x86. On ARM the max vCPUs
depends on VM's GIC configuration, so KVM_CAP_NR_VCPUS should take that
into account. Or KVM_CAP_NR_VCPUS should have been only for !kvm; but
the ship for that has sailed.
Paolo
>>> which I'm keen on avoiding. I'd rather have the kvm and !kvm cases
>>> return the same thing.
>> Forgive me my (ARM?) ignorance but what would it be then? If we go for
>> min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat
>> this can still go above KVM_CAP_MAX_VCPUS after vGIC is created?
> "min(num_online_cpus(), kvm_arm_default_max_vcpus())" is probably the
> right thing in all cases. Yes, KVM_CAP_NR_VCPUS will keep reporting
> more than the VM can actually support. But that's why we have
> KVM_CAP_MAX_VCPUS, which tells you now many vcpus you can create for a
> given configuration.
Am 11.11.21 um 17:32 schrieb Paolo Bonzini:
> On 11/11/21 17:27, Vitaly Kuznetsov wrote:
>> This is a comtinuation of "KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS"
>> (https://lore.kernel.org/kvm/[email protected]/)
>> work.
>>
>> 1) Enforce KVM_CAP_NR_VCPUS <= KVM_CAP_MAX_VCPUS rule on all
>> architectures. [Sean Christopherson]
>> 2) Make KVM_CAP_NR_VCPUS return num_online_cpus() and not an arbitrary
>> value of '710' on x86.
>>
>> Everything but x86 was only 'eyeball tested', the change is trivial
>> but sorry in advance if I screwed up)
>
> Christian, can you look at this for s390? Returning a fixed value seems wrong for KVM_CAP_NR_VCPUS.
will do. (Sorry I was OOO the last days).
>
> Thanks,
>
> Paolo
>
>> Vitaly Kuznetsov (5):
>> KVM: arm64: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
>> KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
>> KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
>> KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
>> KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS
>>
>> arch/arm64/kvm/arm.c | 7 ++++++-
>> arch/mips/kvm/mips.c | 2 +-
>> arch/powerpc/kvm/powerpc.c | 4 ++--
>> arch/riscv/kvm/vm.c | 2 +-
>> arch/x86/include/asm/kvm_host.h | 1 -
>> arch/x86/kvm/x86.c | 2 +-
>> 6 files changed, 11 insertions(+), 7 deletions(-)
>>
>
Am 11.11.21 um 17:32 schrieb Paolo Bonzini:
> On 11/11/21 17:27, Vitaly Kuznetsov wrote:
>> This is a comtinuation of "KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS"
>> (https://lore.kernel.org/kvm/[email protected]/)
>> work.
>>
>> 1) Enforce KVM_CAP_NR_VCPUS <= KVM_CAP_MAX_VCPUS rule on all
>> architectures. [Sean Christopherson]
>> 2) Make KVM_CAP_NR_VCPUS return num_online_cpus() and not an arbitrary
>> value of '710' on x86.
>>
>> Everything but x86 was only 'eyeball tested', the change is trivial
>> but sorry in advance if I screwed up)
>
> Christian, can you look at this for s390? Returning a fixed value seems wrong for KVM_CAP_NR_VCPUS.
If we talk about recommended number, then num_online_cpus() also seems to make sense on s390 so
if you change that for s390 as well I can ACK this.
Christian Borntraeger <[email protected]> writes:
> Am 11.11.21 um 17:32 schrieb Paolo Bonzini:
>> On 11/11/21 17:27, Vitaly Kuznetsov wrote:
>>> This is a comtinuation of "KVM: x86: Drop arbitraty KVM_SOFT_MAX_VCPUS"
>>> (https://lore.kernel.org/kvm/[email protected]/)
>>> work.
>>>
>>> 1) Enforce KVM_CAP_NR_VCPUS <= KVM_CAP_MAX_VCPUS rule on all
>>> architectures. [Sean Christopherson]
>>> 2) Make KVM_CAP_NR_VCPUS return num_online_cpus() and not an arbitrary
>>> value of '710' on x86.
>>>
>>> Everything but x86 was only 'eyeball tested', the change is trivial
>>> but sorry in advance if I screwed up)
>>
>> Christian, can you look at this for s390? Returning a fixed value seems wrong for KVM_CAP_NR_VCPUS.
>
> If we talk about recommended number, then num_online_cpus() also seems to make sense on s390 so
> if you change that for s390 as well I can ACK this.
Thanks!
For KVM_CAP_MAX_VCPUS s390 code returns one of the three things:
KVM_S390_BSCA_CPU_SLOTS(64), KVM_MAX_VCPUS(255) or
KVM_S390_ESCA_CPU_SLOTS(248).
For KVM_CAP_NR_VCPUS, would it be better to return raw
num_online_cpus():
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6a6dd5e1daf6..fcecbb762a1a 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -578,6 +578,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = MEM_OP_MAX_SIZE;
break;
case KVM_CAP_NR_VCPUS:
+ r = num_online_cpus();
+ break;
case KVM_CAP_MAX_VCPUS:
case KVM_CAP_MAX_VCPU_ID:
r = KVM_S390_BSCA_CPU_SLOTS;
or cap KVM_CAP_MAX_VCPUS value with num_online_cpus(), e.g.
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6a6dd5e1daf6..1cfe36f6432e 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -585,6 +585,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = KVM_MAX_VCPUS;
else if (sclp.has_esca && sclp.has_64bscao)
r = KVM_S390_ESCA_CPU_SLOTS;
+ if (ext == KVM_CAP_NR_VCPUS)
+ r = min_t(unsigned int, num_online_cpus(), r);
break;
case KVM_CAP_S390_COW:
r = MACHINE_HAS_ESOP;
For reference, see our ARM discussion:
https://lore.kernel.org/kvm/[email protected]/
though 390's situation is different, the returned value for
KVM_CAP_MAX_VCPUS is not VM-dependent.
--
Vitaly
Am 15.11.21 um 17:04 schrieb Vitaly Kuznetsov:
[...]
> or cap KVM_CAP_MAX_VCPUS value with num_online_cpus(), e.g.
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 6a6dd5e1daf6..1cfe36f6432e 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -585,6 +585,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> r = KVM_MAX_VCPUS;
> else if (sclp.has_esca && sclp.has_64bscao)
> r = KVM_S390_ESCA_CPU_SLOTS;
> + if (ext == KVM_CAP_NR_VCPUS)
> + r = min_t(unsigned int, num_online_cpus(), r);
> break;
> case KVM_CAP_S390_COW:
> r = MACHINE_HAS_ESOP;
Acked-by: Christian Borntraeger <[email protected]>
I think this is the better variant. Thanks.
Paolo Bonzini <[email protected]> writes:
> On 11/12/21 15:02, Marc Zyngier wrote:
>>> I'd like KVM to be consistent across architectures and have the same
>>> (similar) meaning for KVM_CAP_NR_VCPUS.
>> Sure, but this is a pretty useless piece of information anyway. As
>> Andrew pointed out, the information is available somewhere else, and
>> all we need to do is to cap it to the number of supported vcpus, which
>> is effectively a KVM limitation.
>>
>> Also, we are talking about representing the architecture to userspace.
>> No amount of massaging is going to make an arm64 box look like an x86.
>
> Not sure what you mean? The API is about providing a piece of
> information independent of the architecture, while catering for a ppc
> weirdness. Yes it's mostly useless if you don't care about ppc, but
> it's not about making arm64 look like x86 or ppc; it's about not having
> to special case ppc in userspace.
>
> If anything, if KVM_CAP_NR_VCPUS returns the same for kvm and !kvm, then
> *that* is making an arm64 box look like an x86. On ARM the max vCPUs
> depends on VM's GIC configuration, so KVM_CAP_NR_VCPUS should take that
> into account.
(I'm about to send v2 as we have s390 sorted out.)
So what do we decide about ARM?
- Current approach (kvm->arch.max_vcpus/kvm_arm_default_max_vcpus()
depending on 'if (kvm)') - that would be my preference.
- Always kvm_arm_default_max_vcpus to make the output independent on 'if
(kvm)'.
- keep the status quo (drop the patch).
Please advise)
--
Vitaly
On 11/16/21 14:23, Vitaly Kuznetsov wrote:
> (I'm about to send v2 as we have s390 sorted out.)
>
> So what do we decide about ARM?
> - Current approach (kvm->arch.max_vcpus/kvm_arm_default_max_vcpus()
> depending on 'if (kvm)') - that would be my preference.
That would be mine too.
Paolo
> - Always kvm_arm_default_max_vcpus to make the output independent on 'if
> (kvm)'.
> - keep the status quo (drop the patch).
On Tue, 16 Nov 2021 13:23:25 +0000,
Vitaly Kuznetsov <[email protected]> wrote:
>
> Paolo Bonzini <[email protected]> writes:
>
> > On 11/12/21 15:02, Marc Zyngier wrote:
> >>> I'd like KVM to be consistent across architectures and have the same
> >>> (similar) meaning for KVM_CAP_NR_VCPUS.
> >> Sure, but this is a pretty useless piece of information anyway. As
> >> Andrew pointed out, the information is available somewhere else, and
> >> all we need to do is to cap it to the number of supported vcpus, which
> >> is effectively a KVM limitation.
> >>
> >> Also, we are talking about representing the architecture to userspace.
> >> No amount of massaging is going to make an arm64 box look like an x86.
> >
> > Not sure what you mean? The API is about providing a piece of
> > information independent of the architecture, while catering for a ppc
> > weirdness. Yes it's mostly useless if you don't care about ppc, but
> > it's not about making arm64 look like x86 or ppc; it's about not having
> > to special case ppc in userspace.
> >
> > If anything, if KVM_CAP_NR_VCPUS returns the same for kvm and !kvm, then
> > *that* is making an arm64 box look like an x86. On ARM the max vCPUs
> > depends on VM's GIC configuration, so KVM_CAP_NR_VCPUS should take that
> > into account.
>
> (I'm about to send v2 as we have s390 sorted out.)
>
> So what do we decide about ARM?
[...]
> - Always kvm_arm_default_max_vcpus to make the output independent on 'if
> (kvm)'.
This. Between two useless numbers, I prefer the one that doesn't
introduce any userspace visible changes.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
On 11/16/21 16:55, Marc Zyngier wrote:
>> - Always kvm_arm_default_max_vcpus to make the output independent on 'if
>> (kvm)'.
> This. Between two useless numbers, I prefer the one that doesn't
> introduce any userspace visible changes.
Fair enough, I'm not going to override you---but please add a comment
that says
/*
* arm64 treats KVM_CAP_NR_CPUS different from all other
* architectures, as it does not bound it to num_online_cpus().
* It should not matter much because this is just an advisory
* value.
*/
or something like that.
Paolo