2017-06-09 13:27:43

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 00/10] Hyper-V: paravirtualized remote TLB flushing and hypercall improvements

Changes since v7:
- Minor code style fixes (drop explicit casting, reformat code a bit)
in PATCH3 and PATCH9 [Andy Shevchenko]

Original description:

Hyper-V supports hypercalls for doing local and remote TLB flushing and
gives its guests hints when using hypercall is preferred. While doing
hypercalls for local TLB flushes is probably not practical (and is not
being suggested by modern Hyper-V versions) remote TLB flush with a
hypercall brings significant improvement.

To test the series I wrote a special 'TLB trasher': on a 16 vCPU guest I
was creating 32 threads which were doing 100000 mmap/munmaps each on some
big file. Here are the results:

Before:
# time ./pthread_mmap ./randfile
real 3m33.118s
user 0m3.698s
sys 3m16.624s

After:
# time ./pthread_mmap ./randfile
real 2m19.920s
user 0m2.662s
sys 2m9.948s

This series brings a number of small improvements along the way: fast
hypercall implementation and using it for event signaling, rep hypercalls
implementation, hyperv tracing subsystem (which only traces the newly added
remote TLB flush for now).

Vitaly Kuznetsov (10):
x86/hyper-v: include hyperv/ only when CONFIG_HYPERV is set
x86/hyper-v: stash the max number of virtual/logical processor
x86/hyper-v: make hv_do_hypercall() inline
x86/hyper-v: fast hypercall implementation
hyper-v: use fast hypercall for HVCALL_SIGNAL_EVENT
x86/hyper-v: implement rep hypercalls
hyper-v: globalize vp_index
x86/hyper-v: use hypercall for remote TLB flush
x86/hyper-v: support extended CPU ranges for TLB flush hypercalls
tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

MAINTAINERS | 1 +
arch/x86/Kbuild | 2 +-
arch/x86/hyperv/Makefile | 2 +-
arch/x86/hyperv/hv_init.c | 90 ++++++------
arch/x86/hyperv/mmu.c | 268 ++++++++++++++++++++++++++++++++++++
arch/x86/include/asm/mshyperv.h | 148 +++++++++++++++++++-
arch/x86/include/asm/trace/hyperv.h | 38 +++++
arch/x86/include/uapi/asm/hyperv.h | 17 +++
arch/x86/kernel/cpu/mshyperv.c | 13 +-
drivers/hv/channel_mgmt.c | 20 +--
drivers/hv/connection.c | 7 +-
drivers/hv/hv.c | 9 --
drivers/hv/hyperv_vmbus.h | 11 --
drivers/hv/vmbus_drv.c | 17 ---
drivers/pci/host/pci-hyperv.c | 10 +-
include/linux/hyperv.h | 17 +--
16 files changed, 539 insertions(+), 131 deletions(-)
create mode 100644 arch/x86/hyperv/mmu.c
create mode 100644 arch/x86/include/asm/trace/hyperv.h

--
2.9.4


2017-06-09 13:27:50

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 01/10] x86/hyper-v: include hyperv/ only when CONFIG_HYPERV is set

Code is arch/x86/hyperv/ is only needed when CONFIG_HYPERV is set, the
'basic' support and detection lives in arch/x86/kernel/cpu/mshyperv.c
which is included when CONFIG_HYPERVISOR_GUEST is set.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
arch/x86/Kbuild | 2 +-
arch/x86/include/asm/mshyperv.h | 7 ++++++-
2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kbuild b/arch/x86/Kbuild
index 586b786..3e6f640 100644
--- a/arch/x86/Kbuild
+++ b/arch/x86/Kbuild
@@ -8,7 +8,7 @@ obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/

# Hyper-V paravirtualization support
-obj-$(CONFIG_HYPERVISOR_GUEST) += hyperv/
+obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/

# lguest paravirtualization support
obj-$(CONFIG_LGUEST_GUEST) += lguest/
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 18325dc..66f9f2ac 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -174,7 +174,12 @@ void hyperv_init(void);
void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
void hyperv_cleanup(void);
-#endif
+#else /* CONFIG_HYPERV */
+static inline void hyperv_init(void) {}
+static inline bool hv_is_hypercall_page_setup(void) { return false; }
+static inline hyperv_cleanup(void) {}
+#endif /* CONFIG_HYPERV */
+
#ifdef CONFIG_HYPERV_TSCPAGE
struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
--
2.9.4

2017-06-09 13:28:04

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 02/10] x86/hyper-v: stash the max number of virtual/logical processor

Max virtual processor will be needed for 'extended' hypercalls supporting
more than 64 vCPUs. While on it, unify on 'Hyper-V' in mshyperv.c as we
currently have a mix, report acquired misc features as well.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 2 ++
arch/x86/kernel/cpu/mshyperv.c | 12 +++++++++---
2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 66f9f2ac..115a0e2 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -29,6 +29,8 @@ struct ms_hyperv_info {
u32 features;
u32 misc_features;
u32 hints;
+ u32 max_vp_index;
+ u32 max_lp_index;
};

extern struct ms_hyperv_info ms_hyperv;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 04cb8d3..f259e01 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -175,9 +175,15 @@ static void __init ms_hyperv_init_platform(void)
ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
ms_hyperv.hints = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);

- pr_info("HyperV: features 0x%x, hints 0x%x\n",
+ pr_info("Hyper-V: features 0x%x, hints 0x%x\n",
ms_hyperv.features, ms_hyperv.hints);

+ ms_hyperv.max_vp_index = cpuid_eax(HVCPUID_IMPLEMENTATION_LIMITS);
+ ms_hyperv.max_lp_index = cpuid_ebx(HVCPUID_IMPLEMENTATION_LIMITS);
+
+ pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
+ ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
+
/*
* Extract host information.
*/
@@ -203,7 +209,7 @@ static void __init ms_hyperv_init_platform(void)
rdmsrl(HV_X64_MSR_APIC_FREQUENCY, hv_lapic_frequency);
hv_lapic_frequency = div_u64(hv_lapic_frequency, HZ);
lapic_timer_frequency = hv_lapic_frequency;
- pr_info("HyperV: LAPIC Timer Frequency: %#x\n",
+ pr_info("Hyper-V: LAPIC Timer Frequency: %#x\n",
lapic_timer_frequency);
}

@@ -237,7 +243,7 @@ static void __init ms_hyperv_init_platform(void)
}

const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
- .name = "Microsoft HyperV",
+ .name = "Microsoft Hyper-V",
.detect = ms_hyperv_platform,
.init_platform = ms_hyperv_init_platform,
};
--
2.9.4

2017-06-09 13:28:08

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 03/10] x86/hyper-v: make hv_do_hypercall() inline

We have only three call sites for hv_do_hypercall() and we're going to
change HVCALL_SIGNAL_EVENT to doing fast hypercall so we can inline this
function for optimization.

Hyper-V top level functional specification states that r9-r11 registers
and flags may be clobbered by the hypervisor during hypercall and with
inlining this is somewhat important, add the clobbers.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
Changes since v7:
- Use U64_MAX instead of (u64)ULLONG_MAX [Andy Shevchenko]
---
arch/x86/hyperv/hv_init.c | 54 ++++-------------------------------------
arch/x86/include/asm/mshyperv.h | 39 +++++++++++++++++++++++++++++
drivers/hv/connection.c | 2 ++
include/linux/hyperv.h | 1 -
4 files changed, 46 insertions(+), 50 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 5b882cc..691603e 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -75,7 +75,8 @@ static struct clocksource hyperv_cs_msr = {
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};

-static void *hypercall_pg;
+void *hv_hypercall_pg;
+EXPORT_SYMBOL_GPL(hv_hypercall_pg);
struct clocksource *hyperv_cs;
EXPORT_SYMBOL_GPL(hyperv_cs);

@@ -102,15 +103,15 @@ void hyperv_init(void)
guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0);
wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);

- hypercall_pg = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RX);
- if (hypercall_pg == NULL) {
+ hv_hypercall_pg = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RX);
+ if (hv_hypercall_pg == NULL) {
wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
return;
}

rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
hypercall_msr.enable = 1;
- hypercall_msr.guest_physical_address = vmalloc_to_pfn(hypercall_pg);
+ hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);

/*
@@ -170,51 +171,6 @@ void hyperv_cleanup(void)
}
EXPORT_SYMBOL_GPL(hyperv_cleanup);

-/*
- * hv_do_hypercall- Invoke the specified hypercall
- */
-u64 hv_do_hypercall(u64 control, void *input, void *output)
-{
- u64 input_address = (input) ? virt_to_phys(input) : 0;
- u64 output_address = (output) ? virt_to_phys(output) : 0;
-#ifdef CONFIG_X86_64
- u64 hv_status = 0;
-
- if (!hypercall_pg)
- return (u64)ULLONG_MAX;
-
- __asm__ __volatile__("mov %0, %%r8" : : "r" (output_address) : "r8");
- __asm__ __volatile__("call *%3" : "=a" (hv_status) :
- "c" (control), "d" (input_address),
- "m" (hypercall_pg));
-
- return hv_status;
-
-#else
-
- u32 control_hi = control >> 32;
- u32 control_lo = control & 0xFFFFFFFF;
- u32 hv_status_hi = 1;
- u32 hv_status_lo = 1;
- u32 input_address_hi = input_address >> 32;
- u32 input_address_lo = input_address & 0xFFFFFFFF;
- u32 output_address_hi = output_address >> 32;
- u32 output_address_lo = output_address & 0xFFFFFFFF;
-
- if (!hypercall_pg)
- return (u64)ULLONG_MAX;
-
- __asm__ __volatile__ ("call *%8" : "=d"(hv_status_hi),
- "=a"(hv_status_lo) : "d" (control_hi),
- "a" (control_lo), "b" (input_address_hi),
- "c" (input_address_lo), "D"(output_address_hi),
- "S"(output_address_lo), "m" (hypercall_pg));
-
- return hv_status_lo | ((u64)hv_status_hi << 32);
-#endif /* !x86_64 */
-}
-EXPORT_SYMBOL_GPL(hv_do_hypercall);
-
void hyperv_report_panic(struct pt_regs *regs)
{
static bool panic_reported;
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 115a0e2..a004b3b 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -171,6 +171,45 @@ void hv_remove_crash_handler(void);

#if IS_ENABLED(CONFIG_HYPERV)
extern struct clocksource *hyperv_cs;
+extern void *hv_hypercall_pg;
+
+static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
+{
+ u64 input_address = input ? virt_to_phys(input) : 0;
+ u64 output_address = output ? virt_to_phys(output) : 0;
+ u64 hv_status;
+ register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_64
+ if (!hv_hypercall_pg)
+ return U64_MAX;
+
+ __asm__ __volatile__("mov %4, %%r8\n"
+ "call *%5"
+ : "=a" (hv_status), "+r" (__sp),
+ "+c" (control), "+d" (input_address)
+ : "r" (output_address), "m" (hv_hypercall_pg)
+ : "cc", "memory", "r8", "r9", "r10", "r11");
+#else
+ u32 input_address_hi = upper_32_bits(input_address);
+ u32 input_address_lo = lower_32_bits(input_address);
+ u32 output_address_hi = upper_32_bits(output_address);
+ u32 output_address_lo = lower_32_bits(output_address);
+
+ if (!hv_hypercall_pg)
+ return U64_MAX;
+
+ __asm__ __volatile__("call *%7"
+ : "=A" (hv_status),
+ "+c" (input_address_lo), "+r" (__sp)
+ : "A" (control),
+ "b" (input_address_hi),
+ "D"(output_address_hi), "S"(output_address_lo),
+ "m" (hv_hypercall_pg)
+ : "cc", "memory");
+#endif /* !x86_64 */
+ return hv_status;
+}

void hyperv_init(void);
void hyperv_report_panic(struct pt_regs *regs);
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 59c11ff..45e806e 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -32,6 +32,8 @@
#include <linux/hyperv.h>
#include <linux/export.h>
#include <asm/hyperv.h>
+#include <asm/mshyperv.h>
+
#include "hyperv_vmbus.h"


diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index e09fc82..d1ae02d 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1188,7 +1188,6 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
bool fb_overlap_ok);
void vmbus_free_mmio(resource_size_t start, resource_size_t size);
int vmbus_cpu_number_to_vp_number(int cpu_number);
-u64 hv_do_hypercall(u64 control, void *input, void *output);

/*
* GUID definitions of various offer types - services offered to the guest.
--
2.9.4

2017-06-09 13:28:27

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 04/10] x86/hyper-v: fast hypercall implementation

Hyper-V supports 'fast' hypercalls when all parameters are passed through
registers. Implement an inline version of a simpliest of these calls:
hypercall with one 8-byte input and no output.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index a004b3b..f3bedb4 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -211,6 +211,40 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
return hv_status;
}

+#define HV_HYPERCALL_FAST_BIT BIT(16)
+
+/* Fast hypercall with 8 bytes of input and no output */
+static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+{
+ u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
+ register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_64
+ {
+ __asm__ __volatile__("call *%4"
+ : "=a" (hv_status), "+r" (__sp),
+ "+c" (control), "+d" (input1)
+ : "m" (hv_hypercall_pg)
+ : "cc", "r8", "r9", "r10", "r11");
+ }
+#else
+ {
+ u32 input1_hi = upper_32_bits(input1);
+ u32 input1_lo = lower_32_bits(input1);
+
+ __asm__ __volatile__ ("call *%5"
+ : "=A"(hv_status),
+ "+c"(input1_lo),
+ "+r"(__sp)
+ : "A" (control),
+ "b" (input1_hi),
+ "m" (hv_hypercall_pg)
+ : "cc", "edi", "esi");
+ }
+#endif
+ return hv_status;
+}
+
void hyperv_init(void);
void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
--
2.9.4

2017-06-09 13:28:32

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 06/10] x86/hyper-v: implement rep hypercalls

Rep hypercalls are normal hypercalls which perform multiple actions at
once. Hyper-V guarantees to return exectution to the caller in not more
than 50us and the caller needs to use hypercall continuation. Touch NMI
watchdog between hypercall invocations.

This is going to be used for HvFlushVirtualAddressList hypercall for
remote TLB flushing.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 39 +++++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index f3bedb4..ed8107d 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -4,6 +4,7 @@
#include <linux/types.h>
#include <linux/interrupt.h>
#include <linux/clocksource.h>
+#include <linux/nmi.h>
#include <asm/hyperv.h>

/*
@@ -211,7 +212,13 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
return hv_status;
}

+#define HV_HYPERCALL_RESULT_MASK GENMASK_ULL(15, 0)
#define HV_HYPERCALL_FAST_BIT BIT(16)
+#define HV_HYPERCALL_VARHEAD_OFFSET 17
+#define HV_HYPERCALL_REP_COMP_OFFSET 32
+#define HV_HYPERCALL_REP_COMP_MASK GENMASK_ULL(43, 32)
+#define HV_HYPERCALL_REP_START_OFFSET 48
+#define HV_HYPERCALL_REP_START_MASK GENMASK_ULL(59, 48)

/* Fast hypercall with 8 bytes of input and no output */
static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
@@ -245,6 +252,38 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
return hv_status;
}

+/*
+ * Rep hypercalls. Callers of this functions are supposed to ensure that
+ * rep_count and varhead_size comply with Hyper-V hypercall definition.
+ */
+static inline u64 hv_do_rep_hypercall(u16 code, u16 rep_count, u16 varhead_size,
+ void *input, void *output)
+{
+ u64 control = code;
+ u64 status;
+ u16 rep_comp;
+
+ control |= (u64)varhead_size << HV_HYPERCALL_VARHEAD_OFFSET;
+ control |= (u64)rep_count << HV_HYPERCALL_REP_COMP_OFFSET;
+
+ do {
+ status = hv_do_hypercall(control, input, output);
+ if ((status & HV_HYPERCALL_RESULT_MASK) != HV_STATUS_SUCCESS)
+ return status;
+
+ /* Bits 32-43 of status have 'Reps completed' data. */
+ rep_comp = (status & HV_HYPERCALL_REP_COMP_MASK) >>
+ HV_HYPERCALL_REP_COMP_OFFSET;
+
+ control &= ~HV_HYPERCALL_REP_START_MASK;
+ control |= (u64)rep_comp << HV_HYPERCALL_REP_START_OFFSET;
+
+ touch_nmi_watchdog();
+ } while (rep_comp < rep_count);
+
+ return status;
+}
+
void hyperv_init(void);
void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
--
2.9.4

2017-06-09 13:28:58

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

Add Hyper-V tracing subsystem and trace hyperv_mmu_flush_tlb_others().
Tracing is done the same way we do xen_mmu_flush_tlb_others().

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
MAINTAINERS | 1 +
arch/x86/hyperv/mmu.c | 7 +++++++
arch/x86/include/asm/trace/hyperv.h | 38 +++++++++++++++++++++++++++++++++++++
3 files changed, 46 insertions(+)
create mode 100644 arch/x86/include/asm/trace/hyperv.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e3f44fd..001614b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6168,6 +6168,7 @@ M: Stephen Hemminger <[email protected]>
L: [email protected]
S: Maintained
F: arch/x86/include/asm/mshyperv.h
+F: arch/x86/include/asm/trace/hyperv.h
F: arch/x86/include/uapi/asm/hyperv.h
F: arch/x86/kernel/cpu/mshyperv.c
F: arch/x86/hyperv
diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 0712111..cb35453 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -8,6 +8,9 @@
#include <asm/msr.h>
#include <asm/tlbflush.h>

+#define CREATE_TRACE_POINTS
+#include <asm/trace/hyperv.h>
+
/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
struct hv_flush_pcpu {
u64 address_space;
@@ -103,6 +106,8 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
u64 status = U64_MAX;
unsigned long flags;

+ trace_hyperv_mmu_flush_tlb_others(cpus, mm, start, end);
+
if (!pcpu_flush || !hv_hypercall_pg)
goto do_native;

@@ -172,6 +177,8 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
u64 status = U64_MAX;
unsigned long flags;

+ trace_hyperv_mmu_flush_tlb_others(cpus, mm, start, end);
+
if (!pcpu_flush_ex || !hv_hypercall_pg)
goto do_native;

diff --git a/arch/x86/include/asm/trace/hyperv.h b/arch/x86/include/asm/trace/hyperv.h
new file mode 100644
index 0000000..e44ece1
--- /dev/null
+++ b/arch/x86/include/asm/trace/hyperv.h
@@ -0,0 +1,38 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM hyperv
+
+#if !defined(_TRACE_HYPERV_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HYPERV_H
+
+#include <linux/tracepoint.h>
+
+#if IS_ENABLED(CONFIG_HYPERV)
+
+TRACE_EVENT(hyperv_mmu_flush_tlb_others,
+ TP_PROTO(const struct cpumask *cpus, struct mm_struct *mm,
+ unsigned long addr, unsigned long end),
+ TP_ARGS(cpus, mm, addr, end),
+ TP_STRUCT__entry(
+ __field(unsigned int, ncpus)
+ __field(struct mm_struct *, mm)
+ __field(unsigned long, addr)
+ __field(unsigned long, end)
+ ),
+ TP_fast_assign(__entry->ncpus = cpumask_weight(cpus);
+ __entry->mm = mm;
+ __entry->addr = addr,
+ __entry->end = end),
+ TP_printk("ncpus %d mm %p addr %lx, end %lx",
+ __entry->ncpus, __entry->mm, __entry->addr, __entry->end)
+ );
+
+#endif /* CONFIG_HYPERV */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH asm/trace/
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE hyperv
+#endif /* _TRACE_HYPERV_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--
2.9.4

2017-06-09 13:29:02

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 07/10] hyper-v: globalize vp_index

To support implementing remote TLB flushing on Hyper-V with a hypercall
we need to make vp_index available outside of vmbus module. Rename and
globalize.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
arch/x86/hyperv/hv_init.c | 34 +++++++++++++++++++++++++++++++++-
arch/x86/include/asm/mshyperv.h | 24 ++++++++++++++++++++++++
drivers/hv/channel_mgmt.c | 7 +++----
drivers/hv/connection.c | 3 ++-
drivers/hv/hv.c | 9 ---------
drivers/hv/hyperv_vmbus.h | 11 -----------
drivers/hv/vmbus_drv.c | 17 -----------------
drivers/pci/host/pci-hyperv.c | 10 +++++-----
include/linux/hyperv.h | 1 -
9 files changed, 67 insertions(+), 49 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 691603e..e93b9a0 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -26,6 +26,8 @@
#include <linux/mm.h>
#include <linux/clockchips.h>
#include <linux/hyperv.h>
+#include <linux/slab.h>
+#include <linux/cpuhotplug.h>

#ifdef CONFIG_HYPERV_TSCPAGE

@@ -80,6 +82,20 @@ EXPORT_SYMBOL_GPL(hv_hypercall_pg);
struct clocksource *hyperv_cs;
EXPORT_SYMBOL_GPL(hyperv_cs);

+u32 *hv_vp_index;
+EXPORT_SYMBOL_GPL(hv_vp_index);
+
+static int hv_cpu_init(unsigned int cpu)
+{
+ u64 msr_vp_index;
+
+ hv_get_vp_index(msr_vp_index);
+
+ hv_vp_index[smp_processor_id()] = msr_vp_index;
+
+ return 0;
+}
+
/*
* This function is to be invoked early in the boot sequence after the
* hypervisor has been detected.
@@ -95,6 +111,16 @@ void hyperv_init(void)
if (x86_hyper != &x86_hyper_ms_hyperv)
return;

+ /* Allocate percpu VP index */
+ hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index),
+ GFP_KERNEL);
+ if (!hv_vp_index)
+ return;
+
+ if (cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
+ hv_cpu_init, NULL) < 0)
+ goto free_vp_index;
+
/*
* Setup the hypercall page and enable hypercalls.
* 1. Register the guest ID
@@ -106,7 +132,7 @@ void hyperv_init(void)
hv_hypercall_pg = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RX);
if (hv_hypercall_pg == NULL) {
wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
- return;
+ goto free_vp_index;
}

rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
@@ -149,6 +175,12 @@ void hyperv_init(void)
hyperv_cs = &hyperv_cs_msr;
if (ms_hyperv.features & HV_X64_MSR_TIME_REF_COUNT_AVAILABLE)
clocksource_register_hz(&hyperv_cs_msr, NSEC_PER_SEC/100);
+
+ return;
+
+free_vp_index:
+ kfree(hv_vp_index);
+ hv_vp_index = NULL;
}

/*
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ed8107d..7581251 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -284,6 +284,30 @@ static inline u64 hv_do_rep_hypercall(u16 code, u16 rep_count, u16 varhead_size,
return status;
}

+/*
+ * Hypervisor's notion of virtual processor ID is different from
+ * Linux' notion of CPU ID. This information can only be retrieved
+ * in the context of the calling CPU. Setup a map for easy access
+ * to this information.
+ */
+extern u32 *hv_vp_index;
+
+/**
+ * hv_cpu_number_to_vp_number() - Map CPU to VP.
+ * @cpu_number: CPU number in Linux terms
+ *
+ * This function returns the mapping between the Linux processor
+ * number and the hypervisor's virtual processor number, useful
+ * in making hypercalls and such that talk about specific
+ * processors.
+ *
+ * Return: Virtual processor number in Hyper-V terms
+ */
+static inline int hv_cpu_number_to_vp_number(int cpu_number)
+{
+ return hv_vp_index[cpu_number];
+}
+
void hyperv_init(void);
void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index f501ce1..331b314 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -600,7 +600,7 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
*/
channel->numa_node = 0;
channel->target_cpu = 0;
- channel->target_vp = hv_context.vp_index[0];
+ channel->target_vp = hv_cpu_number_to_vp_number(0);
return;
}

@@ -684,7 +684,7 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
}

channel->target_cpu = cur_cpu;
- channel->target_vp = hv_context.vp_index[cur_cpu];
+ channel->target_vp = hv_cpu_number_to_vp_number(cur_cpu);
}

static void vmbus_wait_for_unload(void)
@@ -1220,8 +1220,7 @@ struct vmbus_channel *vmbus_get_outgoing_channel(struct vmbus_channel *primary)
return outgoing_channel;
}

- cur_cpu = hv_context.vp_index[get_cpu()];
- put_cpu();
+ cur_cpu = hv_cpu_number_to_vp_number(smp_processor_id());
list_for_each_safe(cur, tmp, &primary->sc_list) {
cur_channel = list_entry(cur, struct vmbus_channel, sc_list);
if (cur_channel->state != CHANNEL_OPENED_STATE)
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 37ecf51..f41901f 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -96,7 +96,8 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo,
* the CPU attempting to connect may not be CPU 0.
*/
if (version >= VERSION_WIN8_1) {
- msg->target_vcpu = hv_context.vp_index[smp_processor_id()];
+ msg->target_vcpu =
+ hv_cpu_number_to_vp_number(smp_processor_id());
vmbus_connection.connect_cpu = smp_processor_id();
} else {
msg->target_vcpu = 0;
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 2ea1220..8267439 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -234,7 +234,6 @@ int hv_synic_init(unsigned int cpu)
union hv_synic_siefp siefp;
union hv_synic_sint shared_sint;
union hv_synic_scontrol sctrl;
- u64 vp_index;

/* Setup the Synic's message page */
hv_get_simp(simp.as_uint64);
@@ -276,14 +275,6 @@ int hv_synic_init(unsigned int cpu)
hv_context.synic_initialized = true;

/*
- * Setup the mapping between Hyper-V's notion
- * of cpuid and Linux' notion of cpuid.
- * This array will be indexed using Linux cpuid.
- */
- hv_get_vp_index(vp_index);
- hv_context.vp_index[cpu] = (u32)vp_index;
-
- /*
* Register the per-cpu clockevent source.
*/
if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE)
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 1b6a5e0..49569f8 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -229,17 +229,6 @@ struct hv_context {
struct hv_per_cpu_context __percpu *cpu_context;

/*
- * Hypervisor's notion of virtual processor ID is different from
- * Linux' notion of CPU ID. This information can only be retrieved
- * in the context of the calling CPU. Setup a map for easy access
- * to this information:
- *
- * vp_index[a] is the Hyper-V's processor ID corresponding to
- * Linux cpuid 'a'.
- */
- u32 vp_index[NR_CPUS];
-
- /*
* To manage allocations in a NUMA node.
* Array indexed by numa node ID.
*/
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index ed84e96..c7e7d6d 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1451,23 +1451,6 @@ void vmbus_free_mmio(resource_size_t start, resource_size_t size)
}
EXPORT_SYMBOL_GPL(vmbus_free_mmio);

-/**
- * vmbus_cpu_number_to_vp_number() - Map CPU to VP.
- * @cpu_number: CPU number in Linux terms
- *
- * This function returns the mapping between the Linux processor
- * number and the hypervisor's virtual processor number, useful
- * in making hypercalls and such that talk about specific
- * processors.
- *
- * Return: Virtual processor number in Hyper-V terms
- */
-int vmbus_cpu_number_to_vp_number(int cpu_number)
-{
- return hv_context.vp_index[cpu_number];
-}
-EXPORT_SYMBOL_GPL(vmbus_cpu_number_to_vp_number);
-
static int vmbus_acpi_add(struct acpi_device *device)
{
acpi_status result;
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 8493638..9139fa7 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -810,7 +810,8 @@ static void hv_irq_unmask(struct irq_data *data)
params->vector = cfg->vector;

for_each_cpu_and(cpu, dest, cpu_online_mask)
- params->vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu));
+ __set_bit(hv_cpu_number_to_vp_number(cpu),
+ (unsigned long *)params->vp_mask);

hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL);

@@ -903,10 +904,9 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
if (cpumask_weight(affinity) >= 32) {
int_pkt->int_desc.cpu_mask = CPU_AFFINITY_ALL;
} else {
- for_each_cpu_and(cpu, affinity, cpu_online_mask) {
- int_pkt->int_desc.cpu_mask |=
- (1ULL << vmbus_cpu_number_to_vp_number(cpu));
- }
+ for_each_cpu_and(cpu, affinity, cpu_online_mask)
+ __set_bit(hv_cpu_number_to_vp_number(cpu),
+ (unsigned long *)int_pkt->int_desc.cpu_mask);
}

ret = vmbus_sendpacket(hpdev->hbus->hdev->channel, int_pkt,
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2c9a2c8..9591ae7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1174,7 +1174,6 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
resource_size_t size, resource_size_t align,
bool fb_overlap_ok);
void vmbus_free_mmio(resource_size_t start, resource_size_t size);
-int vmbus_cpu_number_to_vp_number(int cpu_number);

/*
* GUID definitions of various offer types - services offered to the guest.
--
2.9.4

2017-06-09 13:29:11

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 08/10] x86/hyper-v: use hypercall for remote TLB flush

Hyper-V host can suggest us to use hypercall for doing remote TLB flush,
this is supposed to work faster than IPIs.

Implementation details: to do HvFlushVirtualAddress{Space,List} hypercalls
we need to put the input somewhere in memory and we don't really want to
have memory allocation on each call so we pre-allocate per cpu memory areas
on boot.

pv_ops patching is happening very early so we need to separate
hyperv_setup_mmu_ops() and hyper_alloc_mmu().

It is possible and easy to implement local TLB flushing too and there is
even a hint for that. However, I don't see a room for optimization on the
host side as both hypercall and native tlb flush will result in vmexit. The
hint is also not set on modern Hyper-V versions.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/hyperv/Makefile | 2 +-
arch/x86/hyperv/hv_init.c | 2 +
arch/x86/hyperv/mmu.c | 135 +++++++++++++++++++++++++++++++++++++
arch/x86/include/asm/mshyperv.h | 3 +
arch/x86/include/uapi/asm/hyperv.h | 7 ++
arch/x86/kernel/cpu/mshyperv.c | 1 +
6 files changed, 149 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/hyperv/mmu.c

diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile
index 171ae09..367a820 100644
--- a/arch/x86/hyperv/Makefile
+++ b/arch/x86/hyperv/Makefile
@@ -1 +1 @@
-obj-y := hv_init.o
+obj-y := hv_init.o mmu.o
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e93b9a0..1a8eb55 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -140,6 +140,8 @@ void hyperv_init(void)
hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);

+ hyper_alloc_mmu();
+
/*
* Register Hyper-V specific clocksource.
*/
diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
new file mode 100644
index 0000000..fe87ffe
--- /dev/null
+++ b/arch/x86/hyperv/mmu.c
@@ -0,0 +1,135 @@
+#include <linux/hyperv.h>
+#include <linux/log2.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include <asm/fpu/api.h>
+#include <asm/mshyperv.h>
+#include <asm/msr.h>
+#include <asm/tlbflush.h>
+
+/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
+struct hv_flush_pcpu {
+ u64 address_space;
+ u64 flags;
+ u64 processor_mask;
+ u64 gva_list[];
+};
+
+/* Each gva in gva_list encodes up to 4096 pages to flush */
+#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)
+
+static struct hv_flush_pcpu __percpu *pcpu_flush;
+
+/*
+ * Fills in gva_list starting from offset. Returns the number of items filled
+ * in gva_list (including offset)
+ */
+static inline int fill_gva_list(u64 gva_list[], int offset,
+ unsigned long start, unsigned long end)
+{
+ int gva_n = offset;
+ unsigned long cur = start, diff;
+
+ do {
+ diff = end > cur ? end - cur : 0;
+
+ gva_list[gva_n] = cur & PAGE_MASK;
+ /*
+ * Lower 12 bits encode the number of additional
+ * pages to flush (in addition to the 'cur' page).
+ */
+ if (diff >= HV_TLB_FLUSH_UNIT)
+ gva_list[gva_n] |= ~PAGE_MASK;
+ else if (diff)
+ gva_list[gva_n] |= (diff - 1) >> PAGE_SHIFT;
+
+ cur += HV_TLB_FLUSH_UNIT;
+ gva_n++;
+
+ } while (cur < end);
+
+ return gva_n;
+}
+
+static void hyperv_flush_tlb_others(const struct cpumask *cpus,
+ struct mm_struct *mm, unsigned long start,
+ unsigned long end)
+{
+ int cpu, vcpu, gva_n, max_gvas;
+ struct hv_flush_pcpu *flush;
+ u64 status = U64_MAX;
+ unsigned long flags;
+
+ if (!pcpu_flush || !hv_hypercall_pg)
+ goto do_native;
+
+ if (cpumask_empty(cpus))
+ return;
+
+ local_irq_save(flags);
+
+ flush = this_cpu_ptr(pcpu_flush);
+
+ if (mm) {
+ flush->address_space = virt_to_phys(mm->pgd);
+ flush->flags = 0;
+ } else {
+ flush->address_space = 0;
+ flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+ }
+
+ flush->processor_mask = 0;
+ if (cpumask_equal(cpus, cpu_present_mask)) {
+ flush->flags |= HV_FLUSH_ALL_PROCESSORS;
+ } else {
+ for_each_cpu(cpu, cpus) {
+ vcpu = hv_cpu_number_to_vp_number(cpu);
+ if (vcpu < 64)
+ __set_bit(vcpu, (unsigned long *)
+ &flush->processor_mask);
+ else
+ goto do_native;
+ }
+ }
+
+ /*
+ * We can flush not more than max_gvas with one hypercall. Flush the
+ * whole address space if we were asked to do more.
+ */
+ max_gvas = (PAGE_SIZE - sizeof(*flush)) / sizeof(flush->gva_list[0]);
+
+ if (end == TLB_FLUSH_ALL) {
+ flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
+ status = hv_do_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE,
+ flush, NULL);
+ } else if (end && ((end - start)/HV_TLB_FLUSH_UNIT) > max_gvas) {
+ status = hv_do_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE,
+ flush, NULL);
+ } else {
+ gva_n = fill_gva_list(flush->gva_list, 0, start, end);
+ status = hv_do_rep_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST,
+ gva_n, 0, flush, NULL);
+ }
+
+ local_irq_restore(flags);
+
+ if (!(status & HV_HYPERCALL_RESULT_MASK))
+ return;
+do_native:
+ native_flush_tlb_others(cpus, mm, start, end);
+}
+
+void hyperv_setup_mmu_ops(void)
+{
+ if (ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED) {
+ pr_info("Hyper-V: Using hypercall for remote TLB flush\n");
+ pv_mmu_ops.flush_tlb_others = hyperv_flush_tlb_others;
+ }
+}
+
+void hyper_alloc_mmu(void)
+{
+ if (ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED)
+ pcpu_flush = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
+}
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 7581251..3710c84 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -309,6 +309,8 @@ static inline int hv_cpu_number_to_vp_number(int cpu_number)
}

void hyperv_init(void);
+void hyperv_setup_mmu_ops(void);
+void hyper_alloc_mmu(void);
void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
void hyperv_cleanup(void);
@@ -316,6 +318,7 @@ void hyperv_cleanup(void);
static inline void hyperv_init(void) {}
static inline bool hv_is_hypercall_page_setup(void) { return false; }
static inline hyperv_cleanup(void) {}
+static inline void hyperv_setup_mmu_ops(void) {}
#endif /* CONFIG_HYPERV */

#ifdef CONFIG_HYPERV_TSCPAGE
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index 432df4b..cc0e70c 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -239,6 +239,8 @@
(~((1ull << HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT) - 1))

/* Declare the various hypercall operations. */
+#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE 0x0002
+#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST 0x0003
#define HVCALL_NOTIFY_LONG_SPIN_WAIT 0x0008
#define HVCALL_POST_MESSAGE 0x005c
#define HVCALL_SIGNAL_EVENT 0x005d
@@ -256,6 +258,11 @@
#define HV_PROCESSOR_POWER_STATE_C2 2
#define HV_PROCESSOR_POWER_STATE_C3 3

+#define HV_FLUSH_ALL_PROCESSORS BIT(0)
+#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES BIT(1)
+#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY BIT(2)
+#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3)
+
/* hypercall status code */
#define HV_STATUS_SUCCESS 0
#define HV_STATUS_INVALID_HYPERCALL_CODE 2
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index f259e01..2a1918d 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -239,6 +239,7 @@ static void __init ms_hyperv_init_platform(void)
* Setup the hook to get control post apic initialization.
*/
x86_platform.apic_post_init = hyperv_init;
+ hyperv_setup_mmu_ops();
#endif
}

--
2.9.4

2017-06-09 13:28:51

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 09/10] x86/hyper-v: support extended CPU ranges for TLB flush hypercalls

Hyper-V hosts may support more than 64 vCPUs, we need to use
HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX/LIST_EX hypercalls in this
case.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
Changes since v7:
- Drop explicit casting from virt_to_phys() [Andy Shevchenko]
- Re-format max_gvas calculus [Andy Shevchenko]
---
arch/x86/hyperv/mmu.c | 130 ++++++++++++++++++++++++++++++++++++-
arch/x86/include/uapi/asm/hyperv.h | 10 +++
2 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index fe87ffe..0712111 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -16,11 +16,25 @@ struct hv_flush_pcpu {
u64 gva_list[];
};

+/* HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressListEx hypercalls */
+struct hv_flush_pcpu_ex {
+ u64 address_space;
+ u64 flags;
+ struct {
+ u64 format;
+ u64 valid_bank_mask;
+ u64 bank_contents[];
+ } hv_vp_set;
+ u64 gva_list[];
+};
+
/* Each gva in gva_list encodes up to 4096 pages to flush */
#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)

static struct hv_flush_pcpu __percpu *pcpu_flush;

+static struct hv_flush_pcpu_ex __percpu *pcpu_flush_ex;
+
/*
* Fills in gva_list starting from offset. Returns the number of items filled
* in gva_list (including offset)
@@ -52,6 +66,34 @@ static inline int fill_gva_list(u64 gva_list[], int offset,
return gva_n;
}

+/* Return the number of banks in the resulting vp_set */
+static inline int cpumask_to_vp_set(struct hv_flush_pcpu_ex *flush,
+ const struct cpumask *cpus)
+{
+ int cpu, vcpu, vcpu_bank, vcpu_offset, nr_bank = 1;
+
+ /*
+ * Some banks may end up being empty but this is acceptable.
+ */
+ for_each_cpu(cpu, cpus) {
+ vcpu = hv_cpu_number_to_vp_number(cpu);
+ vcpu_bank = vcpu / 64;
+ vcpu_offset = vcpu % 64;
+
+ /* valid_bank_mask can represent up to 64 banks */
+ if (vcpu_bank >= 64)
+ return 0;
+
+ __set_bit(vcpu_offset, (unsigned long *)
+ &flush->hv_vp_set.bank_contents[vcpu_bank]);
+ if (vcpu_bank >= nr_bank)
+ nr_bank = vcpu_bank + 1;
+ }
+ flush->hv_vp_set.valid_bank_mask = GENMASK_ULL(nr_bank - 1, 0);
+
+ return nr_bank;
+}
+
static void hyperv_flush_tlb_others(const struct cpumask *cpus,
struct mm_struct *mm, unsigned long start,
unsigned long end)
@@ -120,16 +162,100 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
native_flush_tlb_others(cpus, mm, start, end);
}

+static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
+ struct mm_struct *mm,
+ unsigned long start,
+ unsigned long end)
+{
+ int nr_bank = 0, max_gvas, gva_n;
+ struct hv_flush_pcpu_ex *flush;
+ u64 status = U64_MAX;
+ unsigned long flags;
+
+ if (!pcpu_flush_ex || !hv_hypercall_pg)
+ goto do_native;
+
+ if (cpumask_empty(cpus))
+ return;
+
+ local_irq_save(flags);
+
+ flush = this_cpu_ptr(pcpu_flush_ex);
+
+ if (mm) {
+ flush->address_space = virt_to_phys(mm->pgd);
+ flush->flags = 0;
+ } else {
+ flush->address_space = 0;
+ flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+ }
+
+ flush->hv_vp_set.valid_bank_mask = 0;
+
+ if (!cpumask_equal(cpus, cpu_present_mask)) {
+ flush->hv_vp_set.format = HV_GENERIC_SET_SPARCE_4K;
+ nr_bank = cpumask_to_vp_set(flush, cpus);
+ }
+
+ if (!nr_bank) {
+ flush->hv_vp_set.format = HV_GENERIC_SET_ALL;
+ flush->flags |= HV_FLUSH_ALL_PROCESSORS;
+ }
+
+ /*
+ * We can flush not more than max_gvas with one hypercall. Flush the
+ * whole address space if we were asked to do more.
+ */
+ max_gvas =
+ (PAGE_SIZE - sizeof(*flush) - nr_bank *
+ sizeof(flush->hv_vp_set.bank_contents[0])) /
+ sizeof(flush->gva_list[0]);
+
+ if (end == TLB_FLUSH_ALL) {
+ flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
+ status = hv_do_rep_hypercall(
+ HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX,
+ 0, nr_bank + 2, flush, NULL);
+ } else if (end && ((end - start)/HV_TLB_FLUSH_UNIT) > max_gvas) {
+ status = hv_do_rep_hypercall(
+ HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX,
+ 0, nr_bank + 2, flush, NULL);
+ } else {
+ gva_n = fill_gva_list(flush->gva_list, nr_bank, start, end);
+ status = hv_do_rep_hypercall(
+ HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX,
+ gva_n, nr_bank + 2, flush, NULL);
+ }
+
+ local_irq_restore(flags);
+
+ if (!(status & HV_HYPERCALL_RESULT_MASK))
+ return;
+do_native:
+ native_flush_tlb_others(cpus, mm, start, end);
+}
+
void hyperv_setup_mmu_ops(void)
{
- if (ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED) {
+ if (!(ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED))
+ return;
+
+ if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) {
pr_info("Hyper-V: Using hypercall for remote TLB flush\n");
pv_mmu_ops.flush_tlb_others = hyperv_flush_tlb_others;
+ } else {
+ pr_info("Hyper-V: Using ext hypercall for remote TLB flush\n");
+ pv_mmu_ops.flush_tlb_others = hyperv_flush_tlb_others_ex;
}
}

void hyper_alloc_mmu(void)
{
- if (ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED)
+ if (!(ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED))
+ return;
+
+ if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))
pcpu_flush = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
+ else
+ pcpu_flush_ex = __alloc_percpu(PAGE_SIZE, PAGE_SIZE);
}
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index cc0e70c..9a19272 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -152,6 +152,9 @@
*/
#define HV_X64_DEPRECATING_AEOI_RECOMMENDED (1 << 9)

+/* Recommend using the newer ExProcessorMasks interface */
+#define HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED (1 << 11)
+
/*
* Crash notification flag.
*/
@@ -242,6 +245,8 @@
#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE 0x0002
#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST 0x0003
#define HVCALL_NOTIFY_LONG_SPIN_WAIT 0x0008
+#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX 0x0013
+#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX 0x0014
#define HVCALL_POST_MESSAGE 0x005c
#define HVCALL_SIGNAL_EVENT 0x005d

@@ -263,6 +268,11 @@
#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY BIT(2)
#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3)

+enum HV_GENERIC_SET_FORMAT {
+ HV_GENERIC_SET_SPARCE_4K,
+ HV_GENERIC_SET_ALL,
+};
+
/* hypercall status code */
#define HV_STATUS_SUCCESS 0
#define HV_STATUS_INVALID_HYPERCALL_CODE 2
--
2.9.4

2017-06-09 13:30:11

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH v8 05/10] hyper-v: use fast hypercall for HVCALL_SIGNAL_EVENT

We need to pass only 8 bytes of input for HvSignalEvent which makes it a
perfect fit for fast hypercall. hv_input_signal_event_buffer is not needed
any more and hv_input_signal_event is converted to union for convenience.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
drivers/hv/channel_mgmt.c | 13 ++-----------
drivers/hv/connection.c | 2 +-
include/linux/hyperv.h | 15 +--------------
3 files changed, 4 insertions(+), 26 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0fabd41..f501ce1 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -806,21 +806,12 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr)
/*
* Setup state for signalling the host.
*/
- newchannel->sig_event = (struct hv_input_signal_event *)
- (ALIGN((unsigned long)
- &newchannel->sig_buf,
- HV_HYPERCALL_PARAM_ALIGN));
-
- newchannel->sig_event->connectionid.asu32 = 0;
- newchannel->sig_event->connectionid.u.id = VMBUS_EVENT_CONNECTION_ID;
- newchannel->sig_event->flag_number = 0;
- newchannel->sig_event->rsvdz = 0;
+ newchannel->sig_event = VMBUS_EVENT_CONNECTION_ID;

if (vmbus_proto_version != VERSION_WS2008) {
newchannel->is_dedicated_interrupt =
(offer->is_dedicated_interrupt != 0);
- newchannel->sig_event->connectionid.u.id =
- offer->connection_id;
+ newchannel->sig_event = offer->connection_id;
}

memcpy(&newchannel->offermsg, offer,
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 45e806e..37ecf51 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -408,6 +408,6 @@ void vmbus_set_event(struct vmbus_channel *channel)
if (!channel->is_dedicated_interrupt)
vmbus_send_interrupt(child_relid);

- hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL);
+ hv_do_fast_hypercall8(HVCALL_SIGNAL_EVENT, channel->sig_event);
}
EXPORT_SYMBOL_GPL(vmbus_set_event);
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index d1ae02d..2c9a2c8 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -677,18 +677,6 @@ union hv_connection_id {
} u;
};

-/* Definition of the hv_signal_event hypercall input structure. */
-struct hv_input_signal_event {
- union hv_connection_id connectionid;
- u16 flag_number;
- u16 rsvdz;
-};
-
-struct hv_input_signal_event_buffer {
- u64 align8;
- struct hv_input_signal_event event;
-};
-
enum hv_numa_policy {
HV_BALANCED = 0,
HV_LOCALIZED,
@@ -771,8 +759,7 @@ struct vmbus_channel {
} callback_mode;

bool is_dedicated_interrupt;
- struct hv_input_signal_event_buffer sig_buf;
- struct hv_input_signal_event *sig_event;
+ u64 sig_event;

/*
* Starting with win8, this field will be used to specify
--
2.9.4

2017-06-09 16:04:21

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v8 00/10] Hyper-V: paravirtualized remote TLB flushing and hypercall improvements

On Fri, Jun 9, 2017 at 4:27 PM, Vitaly Kuznetsov <[email protected]> wrote:
> Changes since v7:
> - Minor code style fixes (drop explicit casting, reformat code a bit)
> in PATCH3 and PATCH9 [Andy Shevchenko]
>
> Original description:
>
> Hyper-V supports hypercalls for doing local and remote TLB flushing and
> gives its guests hints when using hypercall is preferred. While doing
> hypercalls for local TLB flushes is probably not practical (and is not
> being suggested by modern Hyper-V versions) remote TLB flush with a
> hypercall brings significant improvement.
>
> To test the series I wrote a special 'TLB trasher': on a 16 vCPU guest I
> was creating 32 threads which were doing 100000 mmap/munmaps each on some
> big file. Here are the results:
>
> Before:
> # time ./pthread_mmap ./randfile
> real 3m33.118s
> user 0m3.698s
> sys 3m16.624s
>
> After:
> # time ./pthread_mmap ./randfile
> real 2m19.920s
> user 0m2.662s
> sys 2m9.948s
>
> This series brings a number of small improvements along the way: fast
> hypercall implementation and using it for event signaling, rep hypercalls
> implementation, hyperv tracing subsystem (which only traces the newly added
> remote TLB flush for now).
>

FWIW,

Reviewed-by: Andy Shevchenko <[email protected]>

to the patches that do not have it yet.

> Vitaly Kuznetsov (10):
> x86/hyper-v: include hyperv/ only when CONFIG_HYPERV is set
> x86/hyper-v: stash the max number of virtual/logical processor
> x86/hyper-v: make hv_do_hypercall() inline
> x86/hyper-v: fast hypercall implementation
> hyper-v: use fast hypercall for HVCALL_SIGNAL_EVENT
> x86/hyper-v: implement rep hypercalls
> hyper-v: globalize vp_index
> x86/hyper-v: use hypercall for remote TLB flush
> x86/hyper-v: support extended CPU ranges for TLB flush hypercalls
> tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()
>
> MAINTAINERS | 1 +
> arch/x86/Kbuild | 2 +-
> arch/x86/hyperv/Makefile | 2 +-
> arch/x86/hyperv/hv_init.c | 90 ++++++------
> arch/x86/hyperv/mmu.c | 268 ++++++++++++++++++++++++++++++++++++
> arch/x86/include/asm/mshyperv.h | 148 +++++++++++++++++++-
> arch/x86/include/asm/trace/hyperv.h | 38 +++++
> arch/x86/include/uapi/asm/hyperv.h | 17 +++
> arch/x86/kernel/cpu/mshyperv.c | 13 +-
> drivers/hv/channel_mgmt.c | 20 +--
> drivers/hv/connection.c | 7 +-
> drivers/hv/hv.c | 9 --
> drivers/hv/hyperv_vmbus.h | 11 --
> drivers/hv/vmbus_drv.c | 17 ---
> drivers/pci/host/pci-hyperv.c | 10 +-
> include/linux/hyperv.h | 17 +--
> 16 files changed, 539 insertions(+), 131 deletions(-)
> create mode 100644 arch/x86/hyperv/mmu.c
> create mode 100644 arch/x86/include/asm/trace/hyperv.h
>
> --
> 2.9.4
>



--
With Best Regards,
Andy Shevchenko

2017-06-09 16:14:57

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH v8 00/10] Hyper-V: paravirtualized remote TLB flushing and hypercall improvements

On Fri, 9 Jun 2017 15:27:26 +0200
Vitaly Kuznetsov <[email protected]> wrote:

> Changes since v7:
> - Minor code style fixes (drop explicit casting, reformat code a bit)
> in PATCH3 and PATCH9 [Andy Shevchenko]
>
> Original description:
>
> Hyper-V supports hypercalls for doing local and remote TLB flushing and
> gives its guests hints when using hypercall is preferred. While doing
> hypercalls for local TLB flushes is probably not practical (and is not
> being suggested by modern Hyper-V versions) remote TLB flush with a
> hypercall brings significant improvement.
>
> To test the series I wrote a special 'TLB trasher': on a 16 vCPU guest I
> was creating 32 threads which were doing 100000 mmap/munmaps each on some
> big file. Here are the results:
>
> Before:
> # time ./pthread_mmap ./randfile
> real 3m33.118s
> user 0m3.698s
> sys 3m16.624s
>
> After:
> # time ./pthread_mmap ./randfile
> real 2m19.920s
> user 0m2.662s
> sys 2m9.948s
>
> This series brings a number of small improvements along the way: fast
> hypercall implementation and using it for event signaling, rep hypercalls
> implementation, hyperv tracing subsystem (which only traces the newly added
> remote TLB flush for now).
>
> Vitaly Kuznetsov (10):
> x86/hyper-v: include hyperv/ only when CONFIG_HYPERV is set
> x86/hyper-v: stash the max number of virtual/logical processor
> x86/hyper-v: make hv_do_hypercall() inline
> x86/hyper-v: fast hypercall implementation
> hyper-v: use fast hypercall for HVCALL_SIGNAL_EVENT
> x86/hyper-v: implement rep hypercalls
> hyper-v: globalize vp_index
> x86/hyper-v: use hypercall for remote TLB flush
> x86/hyper-v: support extended CPU ranges for TLB flush hypercalls
> tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()
>
> MAINTAINERS | 1 +
> arch/x86/Kbuild | 2 +-
> arch/x86/hyperv/Makefile | 2 +-
> arch/x86/hyperv/hv_init.c | 90 ++++++------
> arch/x86/hyperv/mmu.c | 268 ++++++++++++++++++++++++++++++++++++
> arch/x86/include/asm/mshyperv.h | 148 +++++++++++++++++++-
> arch/x86/include/asm/trace/hyperv.h | 38 +++++
> arch/x86/include/uapi/asm/hyperv.h | 17 +++
> arch/x86/kernel/cpu/mshyperv.c | 13 +-
> drivers/hv/channel_mgmt.c | 20 +--
> drivers/hv/connection.c | 7 +-
> drivers/hv/hv.c | 9 --
> drivers/hv/hyperv_vmbus.h | 11 --
> drivers/hv/vmbus_drv.c | 17 ---
> drivers/pci/host/pci-hyperv.c | 10 +-
> include/linux/hyperv.h | 17 +--
> 16 files changed, 539 insertions(+), 131 deletions(-)
> create mode 100644 arch/x86/hyperv/mmu.c
> create mode 100644 arch/x86/include/asm/trace/hyperv.h
>

Looks good.

Reviewed-by: Stephen Hemminger <[email protected]>

2017-06-09 18:04:18

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

On Fri, 9 Jun 2017 15:27:36 +0200
Vitaly Kuznetsov <[email protected]> wrote:

> Add Hyper-V tracing subsystem and trace hyperv_mmu_flush_tlb_others().
> Tracing is done the same way we do xen_mmu_flush_tlb_others().
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> MAINTAINERS | 1 +
> arch/x86/hyperv/mmu.c | 7 +++++++
> arch/x86/include/asm/trace/hyperv.h | 38 +++++++++++++++++++++++++++++++++++++
> 3 files changed, 46 insertions(+)
> create mode 100644 arch/x86/include/asm/trace/hyperv.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e3f44fd..001614b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6168,6 +6168,7 @@ M: Stephen Hemminger <[email protected]>
> L: [email protected]
> S: Maintained
> F: arch/x86/include/asm/mshyperv.h
> +F: arch/x86/include/asm/trace/hyperv.h
> F: arch/x86/include/uapi/asm/hyperv.h
> F: arch/x86/kernel/cpu/mshyperv.c
> F: arch/x86/hyperv
> diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
> index 0712111..cb35453 100644
> --- a/arch/x86/hyperv/mmu.c
> +++ b/arch/x86/hyperv/mmu.c
> @@ -8,6 +8,9 @@
> #include <asm/msr.h>
> #include <asm/tlbflush.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <asm/trace/hyperv.h>
> +
> /* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
> struct hv_flush_pcpu {
> u64 address_space;
> @@ -103,6 +106,8 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
> u64 status = U64_MAX;
> unsigned long flags;
>
> + trace_hyperv_mmu_flush_tlb_others(cpus, mm, start, end);
> +
> if (!pcpu_flush || !hv_hypercall_pg)
> goto do_native;
>
> @@ -172,6 +177,8 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
> u64 status = U64_MAX;
> unsigned long flags;
>
> + trace_hyperv_mmu_flush_tlb_others(cpus, mm, start, end);
> +
> if (!pcpu_flush_ex || !hv_hypercall_pg)
> goto do_native;
>
> diff --git a/arch/x86/include/asm/trace/hyperv.h b/arch/x86/include/asm/trace/hyperv.h
> new file mode 100644
> index 0000000..e44ece1
> --- /dev/null
> +++ b/arch/x86/include/asm/trace/hyperv.h
> @@ -0,0 +1,38 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM hyperv
> +
> +#if !defined(_TRACE_HYPERV_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_HYPERV_H
> +
> +#include <linux/tracepoint.h>
> +
> +#if IS_ENABLED(CONFIG_HYPERV)

Hmm, this is new to me. I know you can use IS_ENABLED() inside C code,
but I've never seen it used for preprocessor directives. This usually
is just:

#ifdef CONFIG_HYPERV

Other than that, this looks fine to me.

-- Steve

> +
> +TRACE_EVENT(hyperv_mmu_flush_tlb_others,
> + TP_PROTO(const struct cpumask *cpus, struct mm_struct *mm,
> + unsigned long addr, unsigned long end),
> + TP_ARGS(cpus, mm, addr, end),
> + TP_STRUCT__entry(
> + __field(unsigned int, ncpus)
> + __field(struct mm_struct *, mm)
> + __field(unsigned long, addr)
> + __field(unsigned long, end)
> + ),
> + TP_fast_assign(__entry->ncpus = cpumask_weight(cpus);
> + __entry->mm = mm;
> + __entry->addr = addr,
> + __entry->end = end),
> + TP_printk("ncpus %d mm %p addr %lx, end %lx",
> + __entry->ncpus, __entry->mm, __entry->addr, __entry->end)
> + );
> +
> +#endif /* CONFIG_HYPERV */
> +
> +#undef TRACE_INCLUDE_PATH
> +#define TRACE_INCLUDE_PATH asm/trace/
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE hyperv
> +#endif /* _TRACE_HYPERV_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>

2017-06-09 18:23:57

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

On Fri, Jun 9, 2017 at 9:04 PM, Steven Rostedt <[email protected]> wrote:
> On Fri, 9 Jun 2017 15:27:36 +0200
> Vitaly Kuznetsov <[email protected]> wrote:


>> +#if IS_ENABLED(CONFIG_HYPERV)
>
> Hmm, this is new to me. I know you can use IS_ENABLED() inside C code,
> but I've never seen it used for preprocessor directives. This usually
> is just:
>
> #ifdef CONFIG_HYPERV
>
> Other than that, this looks fine to me.

That is the magic of IS_ENABLED(), IS_BUILTIN() macros.
So, the code above is fine as is.

--
With Best Regards,
Andy Shevchenko

2017-06-09 18:32:13

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

On Fri, 9 Jun 2017 21:23:52 +0300
Andy Shevchenko <[email protected]> wrote:

> On Fri, Jun 9, 2017 at 9:04 PM, Steven Rostedt <[email protected]> wrote:
> > On Fri, 9 Jun 2017 15:27:36 +0200
> > Vitaly Kuznetsov <[email protected]> wrote:
>
>
> >> +#if IS_ENABLED(CONFIG_HYPERV)
> >
> > Hmm, this is new to me. I know you can use IS_ENABLED() inside C code,
> > but I've never seen it used for preprocessor directives. This usually
> > is just:
> >
> > #ifdef CONFIG_HYPERV
> >
> > Other than that, this looks fine to me.
>
> That is the magic of IS_ENABLED(), IS_BUILTIN() macros.
> So, the code above is fine as is.
>

I'm sure it works, but it just adds one more way of doing the same
thing. I thought that was what perl was always criticized for, and why
people usually prefer python. Don't get me wrong, I prefer oysters over
snakes. But I just wanted to point out the lack of consistency here.

-- Steve


2017-06-09 18:40:11

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

On Fri, Jun 9, 2017 at 9:32 PM, Steven Rostedt <[email protected]> wrote:
> On Fri, 9 Jun 2017 21:23:52 +0300
> Andy Shevchenko <[email protected]> wrote:

> I'm sure it works, but it just adds one more way of doing the same
> thing. I thought that was what perl was always criticized for, and why
> people usually prefer python. Don't get me wrong, I prefer oysters over
> snakes. But I just wanted to point out the lack of consistency here.

Fair enough.

--
With Best Regards,
Andy Shevchenko

2017-06-09 18:54:00

by Paul Bolle

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

On Fri, 2017-06-09 at 14:32 -0400, Steven Rostedt wrote:
> I'm sure it works, but it just adds one more way of doing the same
> thing. I thought that was what perl was always criticized for, and why
> people usually prefer python. Don't get me wrong, I prefer oysters over
> snakes. But I just wanted to point out the lack of consistency here.

A major benefit is that
#if IS_ENABLED(CONFIG_HYPERV)

is shorter than
#if defined(CONFIG_HYPERV) || defined(CONFIG_HYPERV_MODULE)

and less prone to typos.


Paul Bolle

2017-06-09 19:07:12

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

On Fri, 09 Jun 2017 20:53:53 +0200
Paul Bolle <[email protected]> wrote:

> On Fri, 2017-06-09 at 14:32 -0400, Steven Rostedt wrote:
> > I'm sure it works, but it just adds one more way of doing the same
> > thing. I thought that was what perl was always criticized for, and why
> > people usually prefer python. Don't get me wrong, I prefer oysters over
> > snakes. But I just wanted to point out the lack of consistency here.
>
> A major benefit is that
> #if IS_ENABLED(CONFIG_HYPERV)
>
> is shorter than
> #if defined(CONFIG_HYPERV) || defined(CONFIG_HYPERV_MODULE)
>
> and less prone to typos.
>

I don't believe the module version is needed here. Otherwise I would
question the #if altogether. Which now I'm looking at it, why is it
needed?

What includes this header file that wouldn't have that set anyway? The
only place it is included is in:

arch/x86/hyperv/mmu.c

Is that compiled without CONFIG_HYPERV?

-- Steve

2017-06-12 02:56:42

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v8 10/10] tracing/hyper-v: trace hyperv_mmu_flush_tlb_others()

Steven Rostedt <[email protected]> writes:

> On Fri, 09 Jun 2017 20:53:53 +0200
> Paul Bolle <[email protected]> wrote:
>
>> On Fri, 2017-06-09 at 14:32 -0400, Steven Rostedt wrote:
>> > I'm sure it works, but it just adds one more way of doing the same
>> > thing. I thought that was what perl was always criticized for, and why
>> > people usually prefer python. Don't get me wrong, I prefer oysters over
>> > snakes. But I just wanted to point out the lack of consistency here.
>>
>> A major benefit is that
>> #if IS_ENABLED(CONFIG_HYPERV)
>>
>> is shorter than
>> #if defined(CONFIG_HYPERV) || defined(CONFIG_HYPERV_MODULE)
>>
>> and less prone to typos.
>>
>
> I don't believe the module version is needed here. Otherwise I would
> question the #if altogether. Which now I'm looking at it, why is it
> needed?
>
> What includes this header file that wouldn't have that set anyway? The
> only place it is included is in:
>
> arch/x86/hyperv/mmu.c
>
> Is that compiled without CONFIG_HYPERV?

No, it is not but as was already mentioned it is valid and common to have
CONFIG_HYPERV=m (we should've probably done things differently in the past;
CONFIG_HYPERV=y/n should've been used for indicating Hyper-V support and
something like CONFIG_HYPERV_VMBUS=y/m/n to say if we want to have vmbus
as a module but...).

arch/x86/hyperv/mmu.c is compiled in vmlinux when CONFIG_HYPERV is
enabled in any way, this is updated in PATCH8 of this series:

diff --git a/arch/x86/Kbuild b/arch/x86/Kbuild
index 586b786..3e6f640 100644
--- a/arch/x86/Kbuild
+++ b/arch/x86/Kbuild
@@ -8,7 +8,7 @@ obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/

# Hyper-V paravirtualization support
-obj-$(CONFIG_HYPERVISOR_GUEST) += hyperv/
+obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/

# lguest paravirtualization support
obj-$(CONFIG_LGUEST_GUEST) += lguest/

(it was Andy who suggested we use 'subst', not me :-)

So we can't just change IS_ENABLED -> ifdef in this patch. We can, of
course, write " #if defined(CONFIG_HYPERV) ||
defined(CONFIG_HYPERV_MODULE)" but we were replacing it with IF_ENABLED
in the past, not sure we should do that. Dropping the #if altogether is
possible, but why having it when CONFIG_HYPERV=n?

--
Vitaly

2017-06-13 23:21:57

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH v8 07/10] hyper-v: globalize vp_index

On Fri, 9 Jun 2017 15:27:33 +0200
Vitaly Kuznetsov <[email protected]> wrote:

> To support implementing remote TLB flushing on Hyper-V with a hypercall
> we need to make vp_index available outside of vmbus module. Rename and
> globalize.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> Reviewed-by: Andy Shevchenko <[email protected]>

This is correct, but needs to be rebased.
It conflicts with the PCI protocol version 1.2 patches that are
in the PCI tree.

2017-06-14 02:29:32

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v8 07/10] hyper-v: globalize vp_index

Stephen Hemminger <[email protected]> writes:

> On Fri, 9 Jun 2017 15:27:33 +0200
> Vitaly Kuznetsov <[email protected]> wrote:
>
>> To support implementing remote TLB flushing on Hyper-V with a hypercall
>> we need to make vp_index available outside of vmbus module. Rename and
>> globalize.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> Reviewed-by: Andy Shevchenko <[email protected]>
>
> This is correct, but needs to be rebased.
> It conflicts with the PCI protocol version 1.2 patches that are
> in the PCI tree.

:-(

The question is - what do we do? As far as I understand the intent was
to push this through Greg's char-misc tree. If I rebase it to Bjorn's
pci tree patches won't apply to char-misc and Greg won't take them. I
see three possible ways to go:
1) Take them into char-misc and resolve the conflict in merge window
(Linus will hate us all :-( )
2) Ask Greg to merge with Bjorn _now_ so we can send the rebased
version.
3) Postpone these patches to the next kernel release. No guarantee we
won't clash with something else :-(

So I'm a bit lost. With Hyper-V drivers scattered across multiple trees
we're doomed to have such issues with every relatively big series.

--
Vitaly

2017-06-14 04:31:39

by Jork Loeser

[permalink] [raw]
Subject: RE: [PATCH v8 07/10] hyper-v: globalize vp_index

> From: Vitaly Kuznetsov [mailto:[email protected]]
> Sent: Tuesday, June 13, 2017 19:29
>
> Stephen Hemminger <[email protected]> writes:
>
> > On Fri, 9 Jun 2017 15:27:33 +0200
> > Vitaly Kuznetsov <[email protected]> wrote:
> >
> >> To support implementing remote TLB flushing on Hyper-V with a
> >> hypercall we need to make vp_index available outside of vmbus module.
> >> Rename and globalize.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> >> Reviewed-by: Andy Shevchenko <[email protected]>
> >
> > This is correct, but needs to be rebased.
> > It conflicts with the PCI protocol version 1.2 patches that are in the
> > PCI tree.
>
> :-(
>
> The question is - what do we do? As far as I understand the intent was to push
> this through Greg's char-misc tree. If I rebase it to Bjorn's pci tree patches won't
> apply to char-misc and Greg won't take them. I see three possible ways to go:
> 1) Take them into char-misc and resolve the conflict in merge window (Linus will
> hate us all :-( )
> 2) Ask Greg to merge with Bjorn _now_ so we can send the rebased version.
> 3) Postpone these patches to the next kernel release. No guarantee we won't
> clash with something else :-(
>
> So I'm a bit lost. With Hyper-V drivers scattered across multiple trees we're
> doomed to have such issues with every relatively big series.

I would like to see Vitaly's patch-set being integrated shortly (option 1).

In anticipation of this, the PCI protocol version 1.2 patches duplicate the CPU-ID/vCPU-ID mapping. The conflict thus is "just" a re-naming conflict - taking either old or new is fine (one occurrence of conflict). Is this acceptable for conflict management without instilling undue despise?

That said, I am more than happy to help in the resolution. Also, once both changes are merged, I'll remove the duplicated logic.

Regards,
Jork


2017-06-14 16:10:34

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH v8 07/10] hyper-v: globalize vp_index

On Wed, 14 Jun 2017 04:31:32 +0000
Jork Loeser <[email protected]> wrote:

> > From: Vitaly Kuznetsov [mailto:[email protected]]
> > Sent: Tuesday, June 13, 2017 19:29
> >
> > Stephen Hemminger <[email protected]> writes:
> >
> > > On Fri, 9 Jun 2017 15:27:33 +0200
> > > Vitaly Kuznetsov <[email protected]> wrote:
> > >
> > >> To support implementing remote TLB flushing on Hyper-V with a
> > >> hypercall we need to make vp_index available outside of vmbus module.
> > >> Rename and globalize.
> > >>
> > >> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> > >> Reviewed-by: Andy Shevchenko <[email protected]>
> > >
> > > This is correct, but needs to be rebased.
> > > It conflicts with the PCI protocol version 1.2 patches that are in the
> > > PCI tree.
> >
> > :-(
> >
> > The question is - what do we do? As far as I understand the intent was to push
> > this through Greg's char-misc tree. If I rebase it to Bjorn's pci tree patches won't
> > apply to char-misc and Greg won't take them. I see three possible ways to go:
> > 1) Take them into char-misc and resolve the conflict in merge window (Linus will
> > hate us all :-( )
> > 2) Ask Greg to merge with Bjorn _now_ so we can send the rebased version.
> > 3) Postpone these patches to the next kernel release. No guarantee we won't
> > clash with something else :-(
> >
> > So I'm a bit lost. With Hyper-V drivers scattered across multiple trees we're
> > doomed to have such issues with every relatively big series.
>
> I would like to see Vitaly's patch-set being integrated shortly (option 1).
>
> In anticipation of this, the PCI protocol version 1.2 patches duplicate the CPU-ID/vCPU-ID mapping. The conflict thus is "just" a re-naming conflict - taking either old or new is fine (one occurrence of conflict). Is this acceptable for conflict management without instilling undue despise?
>
> That said, I am more than happy to help in the resolution. Also, once both changes are merged, I'll remove the duplicated logic.
>
> Regards,
> Jork
>

There a few other options:
1) Work with Stephen to resolve merge conflict in linux-next. This means any
conflict would get resolved before merge window
2) Figure out how to get enabling code in (maybe duplicate functions) and then
delete the extra later. For example 1-6 could go in now.

3) Just wait. The hypercall patches are optimizations and could be deferred
the pain with this is carrying more patches and managing the backlog gets
to be a real nuisance.