2023-09-04 15:23:47

by Marc Zyngier

[permalink] [raw]
Subject: Re: [RFC v2] KVM: arm/arm64: optimize vSGI injection performance

On Fri, 25 Aug 2023 02:58:11 +0100,
Xu Zhao <[email protected]> wrote:
>
> In a VM with more than 16 vCPUs (with multiple aff0 groups), if the target
> vCPU of a vSGI exceeds 16th vCPU, kvm needs to iterate from vCPU0 until
> the target vCPU is found. However, affinity routing information is provided
> by the ICC_SGI* register, which allows kvm to bypass other aff0 groups,
> iterating only on the aff0 group that the target vCPU located. It reduces
> the maximum iteration times from the total number of vCPUs to 16, or even
> 8 times.
>
> This patch aims to optimize the vSGI injecting performance of injecting
> target exceeds 16th vCPU in vm with more than 16 vCPUs.

The problem is that you optimise it for the default case, and break it
for *everything* else.

[...]

> The performance of VM witch 32 cores improvement can be observed. When
> injecting SGI into the first vCPU of the first aff0 group, the performance
> remains the same as before (because the number of iteration is also 1),
> but there is an improvement in performance when injecting interrupts into
> the last vCPU. When injecting vSGI into the first and last vCPU of the
> second aff0 group, the performance improvement is significant because
> compared to the original algorithm, it skipped iterates the first aff0
> group.
>
> BTW, performance improvement can also be observed by microbench in
> kvm-unit-test with little modification :add 32 cores initialization,
> then change IPI target CPU in function ipi_exec.
>
> The more vcpu a VM has, the greater the performance improvement of injecting
> vSGI into the vCPU in the last aff0 group.
>
> Signed-off-by: Xu Zhao <[email protected]>
> ---
> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 152 ++++++++++++++---------------
> include/linux/kvm_host.h | 5 +
> 2 files changed, 78 insertions(+), 79 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 188d2187eede..af8f2d6b18c3 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -1013,44 +1013,64 @@ int vgic_v3_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr)
>
> return 0;
> }
> +
> /*
> - * Compare a given affinity (level 1-3 and a level 0 mask, from the SGI
> - * generation register ICC_SGI1R_EL1) with a given VCPU.
> - * If the VCPU's MPIDR matches, return the level0 affinity, otherwise
> - * return -1.
> + * Get affinity routing index from ICC_SGI_* register
> + * format:
> + * aff3 aff2 aff1 aff0
> + * |- 8 bits -|- 8 bits -|- 8 bits -|- 4 bits -|
> */
> -static int match_mpidr(u64 sgi_aff, u16 sgi_cpu_mask, struct kvm_vcpu *vcpu)
> +static unsigned long sgi_to_affinity(unsigned long reg)
> {
> - unsigned long affinity;
> - int level0;
> + u64 aff;
>
> - /*
> - * Split the current VCPU's MPIDR into affinity level 0 and the
> - * rest as this is what we have to compare against.
> - */
> - affinity = kvm_vcpu_get_mpidr_aff(vcpu);
> - level0 = MPIDR_AFFINITY_LEVEL(affinity, 0);
> - affinity &= ~MPIDR_LEVEL_MASK;
> + /* aff3 - aff1 */
> + aff = (((reg) & ICC_SGI1R_AFFINITY_3_MASK) >> ICC_SGI1R_AFFINITY_3_SHIFT) << 16 |
> + (((reg) & ICC_SGI1R_AFFINITY_2_MASK) >> ICC_SGI1R_AFFINITY_2_SHIFT) << 8 |
> + (((reg) & ICC_SGI1R_AFFINITY_1_MASK) >> ICC_SGI1R_AFFINITY_1_SHIFT);

Here, you assume that you can directly map a vcpu index to an
affinity. It would be awesome if that was the case. However, this is
only valid at reset time, and userspace is perfectly allowed to change
this mapping by writing to the vcpu's MPIDR_EL1.

So this won't work at all if userspace wants to set its own specific
CPU numbering.

M.

--
Without deviation from the norm, progress is not possible.


2023-09-12 10:42:15

by zhaoxu

[permalink] [raw]
Subject: Re: [RFC v2] KVM: arm/arm64: optimize vSGI injection performance



On 2023/9/4 17:57, Marc Zyngier wrote:
> On Fri, 25 Aug 2023 02:58:11 +0100,
> Xu Zhao <[email protected]> wrote:
[...]
>> - unsigned long affinity;
>> - int level0;
>> + u64 aff;
>>
>> - /*
>> - * Split the current VCPU's MPIDR into affinity level 0 and the
>> - * rest as this is what we have to compare against.
>> - */
>> - affinity = kvm_vcpu_get_mpidr_aff(vcpu);
>> - level0 = MPIDR_AFFINITY_LEVEL(affinity, 0);
>> - affinity &= ~MPIDR_LEVEL_MASK;
>> + /* aff3 - aff1 */
>> + aff = (((reg) & ICC_SGI1R_AFFINITY_3_MASK) >> ICC_SGI1R_AFFINITY_3_SHIFT) << 16 |
>> + (((reg) & ICC_SGI1R_AFFINITY_2_MASK) >> ICC_SGI1R_AFFINITY_2_SHIFT) << 8 |
>> + (((reg) & ICC_SGI1R_AFFINITY_1_MASK) >> ICC_SGI1R_AFFINITY_1_SHIFT);
>
> Here, you assume that you can directly map a vcpu index to an
> affinity. It would be awesome if that was the case. However, this is
> only valid at reset time, and userspace is perfectly allowed to change
> this mapping by writing to the vcpu's MPIDR_EL1.
>
> So this won't work at all if userspace wants to set its own specific
> CPU numbering.
>
> M.
>
Hi Marc,

Yes, i don't think too much about userspace can change MPIDR value, I
checked the source code of qemu, qemu create vcpu sequentially, so in
this case, vcpu_id is equivalent to vcpu_idx which means vcpu_id
represents the position in vcpu array.

These days, I'm still thinking about whether it is because of the
content related to future vcpu hot-plug feature that vcpu_id can be
modified, but now it seems that's not entirely the case.

I have read your latest patch and have been deeply inspired, and Thanks
for agreeing with this issue.

With Regards
Xu.

2023-09-12 13:24:40

by Marc Zyngier

[permalink] [raw]
Subject: Re: [RFC v2] KVM: arm/arm64: optimize vSGI injection performance

On Tue, 12 Sep 2023 05:13:19 +0100,
zhaoxu <[email protected]> wrote:
>
>
>
> On 2023/9/4 17:57, Marc Zyngier wrote:
> > On Fri, 25 Aug 2023 02:58:11 +0100,
> > Xu Zhao <[email protected]> wrote:
> [...]
> >> - unsigned long affinity;
> >> - int level0;
> >> + u64 aff;
> >> - /*
> >> - * Split the current VCPU's MPIDR into affinity level 0 and the
> >> - * rest as this is what we have to compare against.
> >> - */
> >> - affinity = kvm_vcpu_get_mpidr_aff(vcpu);
> >> - level0 = MPIDR_AFFINITY_LEVEL(affinity, 0);
> >> - affinity &= ~MPIDR_LEVEL_MASK;
> >> + /* aff3 - aff1 */
> >> + aff = (((reg) & ICC_SGI1R_AFFINITY_3_MASK) >> ICC_SGI1R_AFFINITY_3_SHIFT) << 16 |
> >> + (((reg) & ICC_SGI1R_AFFINITY_2_MASK) >> ICC_SGI1R_AFFINITY_2_SHIFT) << 8 |
> >> + (((reg) & ICC_SGI1R_AFFINITY_1_MASK) >> ICC_SGI1R_AFFINITY_1_SHIFT);
> >
> > Here, you assume that you can directly map a vcpu index to an
> > affinity. It would be awesome if that was the case. However, this is
> > only valid at reset time, and userspace is perfectly allowed to change
> > this mapping by writing to the vcpu's MPIDR_EL1.
> >
> > So this won't work at all if userspace wants to set its own specific
> > CPU numbering.
> >
> > M.
> >
> Hi Marc,
>
> Yes, i don't think too much about userspace can change MPIDR value, I
> checked the source code of qemu, qemu create vcpu sequentially, so in
> this case, vcpu_id is equivalent to vcpu_idx which means vcpu_id
> represents the position in vcpu array.

The problem is that this is only a convention, and userspace is
totally free to use vcpu_id in a different way. Note that we have
other bugs in the KVM code that treat them interchangeably, but I'm
trying to fix that.

> These days, I'm still thinking about whether it is because of the
> content related to future vcpu hot-plug feature that vcpu_id can be
> modified, but now it seems that's not entirely the case.

There are 3 levels of identification:

- vcpu_idx, which is an internal KVM index that userspace is not
suppose to rely on (or know)

- vcpu_id, which is provided by userspace as a CPU number, and from
which we derive the default MPIDR_EL1 value. This is used all over
the code to identify a CPU from userspace.

- MPIDR_EL1, which is how the architecture identifies a CPU.

For CPU hotplug, I expect usespace to set vcpu_id and MPIDR_EL1 as it
sees fit, knowing that the vpcu must have been allocated upfront (and
vcpu_idx set).

> I have read your latest patch and have been deeply inspired, and
> Thanks for agreeing with this issue.

No worries. I'd appreciate you testing it and reporting whether this
matches the results you are observing with your own patch.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.