The Hyper-V enlightened TLB remote flush function does not exclude
lazy TLB mode CPUs like the equivalent native function. Limited
telemetry shows that up to 80% of the CPUs being flushed are in
lazy mode, so flushing them is unnecessary and wasteful.

The best place to exclude the lazy TLB mode CPUs is when copying
the Linux cpumask to the Hyper-V VPset data structure, since the
copying already processes CPUs one-by-one. Currently this copying
function has the capability to exclude the calling CPU. Generalize
this exclusion functionality to exclude CPUs based on a callback
function that is invoked for each CPU. Then for TLB flushing,
use this callback function to check the lazy TLB mode status of
each targeted CPU.

Patch 1 of this series does the generalization, and fixes up the
one caller of the existing "exclude self" capability.

Patch 2 then implements the exclusion based on lazy TLB mode,
using the generalization from Patch 1.
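
For illustration only, the callback-based exclusion described above can
be modeled in user space roughly as follows. This is a sketch, not the
kernel code: struct vpset_model, model_mask_to_vpset_skip(), lazy_state[]
and the 256-CPU limit are hypothetical names and values invented for this
example, and it assumes that a CPU number equals its VP number, which the
real code does not assume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_CPUS  256                 /* hypothetical limit for this model */
#define BANK_BITS 64                  /* each bank is one 64-bit word */
#define MAX_BANKS (MAX_CPUS / BANK_BITS)

/* Simplified stand-in for a sparse VP set; not the kernel's struct. */
struct vpset_model {
	uint64_t valid_bank_mask;
	uint64_t bank_contents[MAX_BANKS];
};

/* Per-CPU filter: return true to leave this CPU out of the set. */
typedef bool (*skip_fn)(int cpu);

/* Model of a per-CPU "is lazy" flag (hypothetical). */
static bool lazy_state[MAX_CPUS];

static bool cpu_is_lazy_model(int cpu)
{
	return lazy_state[cpu];
}

/*
 * Copy the CPUs flagged in cpu_in_mask[] into the VP set one by one,
 * skipping any CPU for which skip(cpu) returns true. A NULL skip
 * callback means "exclude nothing". Returns the highest bank index
 * used plus one, or 0 if the resulting set is empty.
 */
static int model_mask_to_vpset_skip(struct vpset_model *set,
				    const bool *cpu_in_mask, int nr_cpus,
				    skip_fn skip)
{
	int cpu, bank, nr_bank = 0;

	assert(nr_cpus <= MAX_CPUS);
	memset(set, 0, sizeof(*set));

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (!cpu_in_mask[cpu])
			continue;
		if (skip && skip(cpu))
			continue;

		bank = cpu / BANK_BITS;
		set->bank_contents[bank] |= 1ULL << (cpu % BANK_BITS);
		set->valid_bank_mask |= 1ULL << bank;
		if (bank + 1 > nr_bank)
			nr_bank = bank + 1;
	}

	return nr_bank;
}
```

The point of the callback is that the filter runs inside the loop that
already visits every CPU, so excluding CPUs needs no second pass and no
temporary cpumask. An "exclude self" caller would pass a callback that
compares against the calling CPU, and a TLB flush caller would pass
something like cpu_is_lazy_model().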

Michael Kelley (2):
  x86/hyperv: Add callback filter to cpumask_to_vpset()
  x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes

 arch/x86/hyperv/hv_apic.c      | 12 ++++++++----
 arch/x86/hyperv/mmu.c          | 11 ++++++++++-
 include/asm-generic/mshyperv.h | 22 ++++++++++++++--------
 3 files changed, 32 insertions(+), 13 deletions(-)

--
1.8.3.1

In the case where page tables are not freed, native_flush_tlb_multi()
does not do a remote TLB flush on CPUs in lazy TLB mode because the
CPU will flush itself at the next context switch. By comparison, the
Hyper-V enlightened TLB flush does not exclude CPUs in lazy TLB mode
and so performs unnecessary flushes.

If we're not freeing page tables, add logic to test for lazy TLB
mode when adding CPUs to the input argument to the Hyper-V TLB
flush hypercall. Exclude lazy TLB mode CPUs so the behavior
matches native_flush_tlb_multi() and the unnecessary flushes are
avoided. Handle both the <=64 vCPU case and the _ex case for >64
vCPUs.

Signed-off-by: Michael Kelley <[email protected]>
---
 arch/x86/hyperv/mmu.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 0ad2378..8460bd3 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -52,6 +52,11 @@ static inline int fill_gva_list(u64 gva_list[], int offset,
 	return gva_n - offset;
 }
 
+static bool cpu_is_lazy(int cpu)
+{
+	return per_cpu(cpu_tlbstate_shared.is_lazy, cpu);
+}
+
 static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 				   const struct flush_tlb_info *info)
 {
@@ -60,6 +65,7 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 	struct hv_tlb_flush *flush;
 	u64 status;
 	unsigned long flags;
+	bool do_lazy = !info->freed_tables;
 
 	trace_hyperv_mmu_flush_tlb_multi(cpus, info);
 
@@ -112,6 +118,8 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 			goto do_ex_hypercall;
 
 		for_each_cpu(cpu, cpus) {
+			if (do_lazy && cpu_is_lazy(cpu))
+				continue;
 			vcpu = hv_cpu_number_to_vp_number(cpu);
 			if (vcpu == VP_INVAL) {
 				local_irq_restore(flags);
@@ -198,7 +206,8 @@ static u64 hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
 	flush->hv_vp_set.valid_bank_mask = 0;
 
 	flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
-	nr_bank = cpumask_to_vpset(&(flush->hv_vp_set), cpus);
+	nr_bank = cpumask_to_vpset_skip(&flush->hv_vp_set, cpus,
+			info->freed_tables ? NULL : cpu_is_lazy);
 	if (nr_bank < 0)
 		return HV_STATUS_INVALID_PARAMETER;
 
--
1.8.3.1

On Mon, Mar 27, 2023 at 06:16:05AM -0700, Michael Kelley wrote:
> The Hyper-V enlightened TLB remote flush function does not exclude
> lazy TLB mode CPUs like the equivalent native function. Limited
> telemetry shows that up to 80% of the CPUs being flushed are in
> lazy mode, so flushing them is unnecessary and wasteful.
>
> The best place to exclude the lazy TLB mode CPUs is when copying
> the Linux cpumask to the Hyper-V VPset data structure, since the
> copying already processes CPUs one-by-one. Currently this copying
> function has the capability to exclude the calling CPU. Generalize
> this exclusion functionality to exclude CPUs based on a callback
> function that is invoked for each CPU. Then for TLB flushing,
> use this callback function to check the lazy TLB mode status of
> each targeted CPU.
>
> Patch 1 of this series does the generalization, and fixes up the
> one caller of the existing "exclude self" capability.
>
> Patch 2 then implements the exclusion based on lazy TLB mode,
> using the generalization from Patch 1.
>
> Michael Kelley (2):
> x86/hyperv: Add callback filter to cpumask_to_vpset()
> x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
>
> arch/x86/hyperv/hv_apic.c | 12 ++++++++----
> arch/x86/hyperv/mmu.c | 11 ++++++++++-
> include/asm-generic/mshyperv.h | 22 ++++++++++++++--------
> 3 files changed, 32 insertions(+), 13 deletions(-)

Applied to hyperv-next. Thanks.

>
> --
> 1.8.3.1
>