2020-02-25 04:12:53

by 何容光(邦采)

Subject: [RFC] Question about async TLB flush and KVM pv tlb improvements

Hi there,

I came across this async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and I am wondering, one year on, whether you think this patch is practical or whether it has functional flaws.
From my point of view, Nadav's patch seems to have no obvious flaw, but I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to software. Under what conditions would a machine check occur? Is there a reference I can study?
BTW, I am trying to improve the KVM PV TLB flush. Today, when a vCPU is preempted, the initiating CPU does not send an IPI to it or wait for it; instead, the VMM flushes the TLB on that vCPU's behalf. My idea is that when the preempted vCPU resumes, the VMM injects an interrupt (perhaps an NMI) into it and lets the vCPU flush its own TLB, unless the vCPU is not in kernel mode or has interrupts disabled, in which case we stick to the VMM-side flush. The motivation is that the VMM flush uses INVVPID, which flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. Does this idea have the same problem as the async TLB flush patch?
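To make the idea concrete, here is a rough, hypothetical sketch of the host-side decision I have in mind when a preempted vCPU is about to run again. All helper names here (flush_was_requested, guest_in_kernel_mode, guest_irqs_enabled, inject_flush_interrupt, vmm_side_flush) are invented for illustration and do not exist upstream; this only shows the intended control flow, not a working implementation.

/* Hypothetical sketch only -- all helpers below are made-up names. */
static void pv_tlb_flush_on_resume(struct kvm_vcpu *vcpu)
{
	/* A remote vCPU asked for a flush while this vCPU was preempted. */
	if (!flush_was_requested(vcpu))
		return;

	if (guest_in_kernel_mode(vcpu) && guest_irqs_enabled(vcpu)) {
		/*
		 * Let the guest flush only what it needs (e.g. a single
		 * PCID) by injecting an interrupt/NMI that runs a flush
		 * handler inside the guest.
		 */
		inject_flush_interrupt(vcpu);
	} else {
		/*
		 * Fall back to the existing VMM-side flush, which on VMX
		 * uses INVVPID and drops the entries of every PCID of
		 * this vCPU.
		 */
		vmm_side_flush(vcpu);
	}
}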
Thanks in advance.


2020-02-25 06:42:28

by Wanpeng Li

Subject: Re: [RFC] Question about async TLB flush and KVM pv tlb improvements

On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <[email protected]> wrote:
>
> Hi there,
>
> I came across this async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and I am wondering, one year on, whether you think this patch is practical or whether it has functional flaws.
> From my point of view, Nadav's patch seems to have no obvious flaw, but I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to software. Under what conditions would a machine check occur? Is there a reference I can study?
> BTW, I am trying to improve the KVM PV TLB flush. Today, when a vCPU is preempted, the initiating CPU does not send an IPI to it or wait for it; instead, the VMM flushes the TLB on that vCPU's behalf. My idea is that when the preempted vCPU resumes, the VMM injects an interrupt (perhaps an NMI) into it and lets the vCPU flush its own TLB, unless the vCPU is not in kernel mode or has interrupts disabled, in which case we stick to the VMM-side flush. The motivation is that the VMM flush uses INVVPID, which flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. Does this idea have the same problem as the async TLB flush patch?

PV TLB shootdown is disabled in the dedicated (non-overcommitted)
scenario. I believe there were already heavy TLB misses in overcommit
scenarios before this feature was introduced, so flushing all TLB
entries associated with one specific VPID does not make things that
much worse.
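
For reference, the guest-side half of PV TLB shootdown that this thread is about looks roughly like the sketch below. It is simplified from kvm_flush_tlb_others() in arch/x86/kernel/kvm.c as of ~v5.x kernels, so variable names and details may differ from your exact tree, and it is not meant to be compile-ready.

/*
 * Simplified sketch of the guest-side PV TLB shootdown path
 * (arch/x86/kernel/kvm.c, ~v5.x); details vary by kernel version.
 */
static void kvm_flush_tlb_others(const struct cpumask *cpumask,
				 const struct flush_tlb_info *info)
{
	u8 state;
	int cpu;
	struct kvm_steal_time *src;
	struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);

	cpumask_copy(flushmask, cpumask);
	/*
	 * Only IPI vCPUs that are actually running; for preempted vCPUs,
	 * set KVM_VCPU_FLUSH_TLB so the host flushes on their behalf
	 * when they are scheduled back in.
	 */
	for_each_cpu(cpu, flushmask) {
		src = &per_cpu(steal_time, cpu);
		state = READ_ONCE(src->preempted);
		if ((state & KVM_VCPU_PREEMPTED) &&
		    cmpxchg(&src->preempted, state,
			    state | KVM_VCPU_FLUSH_TLB) == state)
			__cpumask_clear_cpu(cpu, flushmask);
	}

	native_flush_tlb_others(flushmask, info);
}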

Wanpeng

2020-02-25 07:55:07

by 何容光(邦采)

Subject: Re: [RFC] Question about async TLB flush and KVM pv tlb improvements

> On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <[email protected]> wrote:
>>
>> Hi there,
>>
>> I came across this async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and I am wondering, one year on, whether you think this patch is practical or whether it has functional flaws.
>> From my point of view, Nadav's patch seems to have no obvious flaw, but I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to software. Under what conditions would a machine check occur? Is there a reference I can study?
>> BTW, I am trying to improve the KVM PV TLB flush. Today, when a vCPU is preempted, the initiating CPU does not send an IPI to it or wait for it; instead, the VMM flushes the TLB on that vCPU's behalf. My idea is that when the preempted vCPU resumes, the VMM injects an interrupt (perhaps an NMI) into it and lets the vCPU flush its own TLB, unless the vCPU is not in kernel mode or has interrupts disabled, in which case we stick to the VMM-side flush. The motivation is that the VMM flush uses INVVPID, which flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. Does this idea have the same problem as the async TLB flush patch?

> PV TLB shootdown is disabled in the dedicated (non-overcommitted)
> scenario. I believe there were already heavy TLB misses in overcommit
> scenarios before this feature was introduced, so flushing all TLB
> entries associated with one specific VPID does not make things that
> much worse.

If the number of vCPUs running on one pCPU is limited to a few, my
tests show there can still be some benefit, especially if we can move
all the logic into the VMM and eliminate the wait for the IPI. However,
functional correctness is a concern; that is also why I looked into
Nadav's patch. Do you have any advice on this?
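
For context, this is roughly what the VMM-side flush I would like to avoid looks like today when the preempted vCPU is scheduled back in. The sketch below is loosely based on record_steal_time() in arch/x86/kvm/x86.c (~v5.x); the exact code, field names and the guest-memory mapping differ by kernel version, and map_guest_steal_time() is a made-up placeholder. On VMX with VPID the resulting flush is an INVVPID that drops the entries of every PCID of that vCPU, which is the cost in question.

/*
 * Loose sketch of the host-side path (arch/x86/kvm/x86.c, ~v5.x);
 * not an exact copy of any particular kernel version.
 */
static void record_steal_time(struct kvm_vcpu *vcpu)
{
	struct kvm_steal_time *st;

	/* Mapping of the guest's steal-time area elided (placeholder). */
	st = map_guest_steal_time(vcpu);

	/*
	 * A remote vCPU set KVM_VCPU_FLUSH_TLB instead of sending an IPI
	 * while we were preempted: flush on the guest's behalf before it
	 * runs again. On VMX with VPID this ends up as an INVVPID
	 * single-context flush, i.e. it hits all PCIDs of this vCPU.
	 */
	if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
		kvm_vcpu_flush_tlb(vcpu, false);

	/* ... steal time accounting elided ... */
}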

2020-02-25 08:53:32

by Wanpeng Li

Subject: Re: [RFC] Question about async TLB flush and KVM pv tlb improvements

On Tue, 25 Feb 2020 at 15:53, 何容光(邦采) <[email protected]> wrote:
>
> > On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <[email protected]> wrote:
> >>
> >> Hi there,
> >>
> >> I came across this async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and I am wondering, one year on, whether you think this patch is practical or whether it has functional flaws.
> >> From my point of view, Nadav's patch seems to have no obvious flaw, but I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to software. Under what conditions would a machine check occur? Is there a reference I can study?
> >> BTW, I am trying to improve the KVM PV TLB flush. Today, when a vCPU is preempted, the initiating CPU does not send an IPI to it or wait for it; instead, the VMM flushes the TLB on that vCPU's behalf. My idea is that when the preempted vCPU resumes, the VMM injects an interrupt (perhaps an NMI) into it and lets the vCPU flush its own TLB, unless the vCPU is not in kernel mode or has interrupts disabled, in which case we stick to the VMM-side flush. The motivation is that the VMM flush uses INVVPID, which flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. Does this idea have the same problem as the async TLB flush patch?
>
> > PV TLB shootdown is disabled in the dedicated (non-overcommitted)
> > scenario. I believe there were already heavy TLB misses in overcommit
> > scenarios before this feature was introduced, so flushing all TLB
> > entries associated with one specific VPID does not make things that
> > much worse.
>
> If the number of vCPUs running on one pCPU is limited to a few, my
> tests show there can still be some benefit, especially if we can move

Unless the vCPU is preempted.

> all the logic into the VMM and eliminate the wait for the IPI. However,
> functional correctness is a concern; that is also why I looked into
> Nadav's patch. Do you have any advice on this?

2020-02-25 09:43:31

by He Rongguang

Subject: Re: [RFC] Question about async TLB flush and KVM pv tlb improvements


On 2020/2/25 16:41, Wanpeng Li wrote:
> On Tue, 25 Feb 2020 at 15:53, 何容光(邦采) <[email protected]> wrote:
>>> On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <[email protected]> wrote:
>>>> Hi there,
>>>>
>>>> I came across this async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and I am wondering, one year on, whether you think this patch is practical or whether it has functional flaws.
>>>> From my point of view, Nadav's patch seems to have no obvious flaw, but I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to software. Under what conditions would a machine check occur? Is there a reference I can study?
>>>> BTW, I am trying to improve the KVM PV TLB flush. Today, when a vCPU is preempted, the initiating CPU does not send an IPI to it or wait for it; instead, the VMM flushes the TLB on that vCPU's behalf. My idea is that when the preempted vCPU resumes, the VMM injects an interrupt (perhaps an NMI) into it and lets the vCPU flush its own TLB, unless the vCPU is not in kernel mode or has interrupts disabled, in which case we stick to the VMM-side flush. The motivation is that the VMM flush uses INVVPID, which flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. Does this idea have the same problem as the async TLB flush patch?
>>> PV TLB shootdown is disabled in the dedicated (non-overcommitted)
>>> scenario. I believe there were already heavy TLB misses in overcommit
>>> scenarios before this feature was introduced, so flushing all TLB
>>> entries associated with one specific VPID does not make things that
>>> much worse.
>> If the number of vCPUs running on one pCPU is limited to a few, my
>> tests show there can still be some benefit, especially if we can move
> Unless the vCPU is preempted.

Correct; in fact I am using a no-IPI-in-VM approach, which is why I am
asking about the async approach.

>> all the logic into the VMM and eliminate the wait for the IPI. However,
>> functional correctness is a concern; that is also why I looked into
>> Nadav's patch. Do you have any advice on this?