2021-03-01 01:22:57

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 00/10] Refactor arch specific Hyper-V code

To support Linux guests on Hyper-V on multiple architectures, the original
approach factored out all differences between Hyper-V on x86/x64 and
Hyper-V on ARM64 into functions or #defines under arch/x86 and
arch/arm64. Some of these differences are truly related to the
architecture, but others are more properly treated as Linux OS
differences or just quirks in Hyper-V. Feedback from Arnd Bergmann[1]
recommended that differences other than architecture should be
incorporated into the architecture independent Hyper-V code. Each
difference can be handled with conditions specific to the difference
instead of tying it to the broader x86/x64 vs. ARM64. This approach
reduces the amount of code under arch/x86 and arch/arm64 and keeps
the non-architectural differences localized and more easily understood.

This patch set implements the new approach by changing the interface
between the architecture independent code and the architecture dependent
code for x86/x64. The patches move code from arch/x86 to the
architecture independent Hyper-V code whenever possible, and add
architecture independent support needed by other architectures like
ARM64. No functionality is changed for x86/x64. A subsequent patch
set will provide the Hyper-V support code under arch/arm64.

This patch set results in an increase in lines of code (though some
of the increase is additional comments). But the lines needed under
arch/arm64 in the upcoming patch set is significantly reduced, resulting
in a net decrease of about 125 lines.

[1] https://lore.kernel.org/lkml/CAK8P3a1hDBVembCd+6=ENUWYFz=72JBTFMrKYZ2aFd+_Q04F+g@mail.gmail.com/

Changes in v2:
* In patch 9/10, for consistency change the rating on hyperv_cs_msr
along with the rating on hyperv_cs_tsc [Boqun Feng]
* In patch 10/10, add missing call to hv_remove_stimer0_handler()
[Boqun Feng]
* In patch 10/10, add clarifying comment that hv_setup_stimer0_irq()
is not used on x86/x64 [Wei Liu]

Michael Kelley (10):
Drivers: hv: vmbus: Move Hyper-V page allocator to arch neutral code
x86/hyper-v: Move hv_message_type to architecture neutral module
Drivers: hv: Redo Hyper-V synthetic MSR get/set functions
Drivers: hv: vmbus: Move hyperv_report_panic_msg to arch neutral code
Drivers: hv: vmbus: Handle auto EOI quirk inline
Drivers: hv: vmbus: Move handling of VMbus interrupts
clocksource/drivers/hyper-v: Handle vDSO differences inline
clocksource/drivers/hyper-v: Handle sched_clock differences inline
clocksource/drivers/hyper-v: Set clocksource rating based on Hyper-V
feature
clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts

arch/x86/hyperv/hv_init.c | 53 +-------
arch/x86/include/asm/hyperv-tlfs.h | 131 +++++++++----------
arch/x86/include/asm/mshyperv.h | 67 ++--------
arch/x86/kernel/cpu/mshyperv.c | 23 +---
drivers/clocksource/hyperv_timer.c | 258 ++++++++++++++++++++++++++-----------
drivers/hv/hv.c | 93 ++++++++++---
drivers/hv/vmbus_drv.c | 89 +++++++++++--
include/asm-generic/hyperv-tlfs.h | 35 +++++
include/asm-generic/mshyperv.h | 19 ++-
include/clocksource/hyperv_timer.h | 3 +-
10 files changed, 458 insertions(+), 313 deletions(-)

--
1.8.3.1


2021-03-01 01:23:55

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 01/10] Drivers: hv: vmbus: Move Hyper-V page allocator to arch neutral code

The Hyper-V page allocator functions are implemented in an architecture
neutral way. Move them into the architecture neutral VMbus module so
a separate implementation for ARM64 is not needed.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/hyperv/hv_init.c | 22 ----------------------
arch/x86/include/asm/mshyperv.h | 5 -----
drivers/hv/hv.c | 36 ++++++++++++++++++++++++++++++++++++
include/asm-generic/mshyperv.h | 4 ++++
4 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index b81047d..4bdb344 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -54,28 +54,6 @@
u32 hv_max_vp_index;
EXPORT_SYMBOL_GPL(hv_max_vp_index);

-void *hv_alloc_hyperv_page(void)
-{
- BUILD_BUG_ON(PAGE_SIZE != HV_HYP_PAGE_SIZE);
-
- return (void *)__get_free_page(GFP_KERNEL);
-}
-EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
-
-void *hv_alloc_hyperv_zeroed_page(void)
-{
- BUILD_BUG_ON(PAGE_SIZE != HV_HYP_PAGE_SIZE);
-
- return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
-}
-EXPORT_SYMBOL_GPL(hv_alloc_hyperv_zeroed_page);
-
-void hv_free_hyperv_page(unsigned long addr)
-{
- free_page(addr);
-}
-EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
-
static int hv_cpu_init(unsigned int cpu)
{
u64 msr_vp_index;
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ccf60a8..ef6e968 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -233,9 +233,6 @@ static inline struct hv_vp_assist_page *hv_get_vp_assist_page(unsigned int cpu)

void __init hyperv_init(void);
void hyperv_setup_mmu_ops(void);
-void *hv_alloc_hyperv_page(void);
-void *hv_alloc_hyperv_zeroed_page(void);
-void hv_free_hyperv_page(unsigned long addr);
void set_hv_tscchange_cb(void (*cb)(void));
void clear_hv_tscchange_cb(void);
void hyperv_stop_tsc_emulation(void);
@@ -272,8 +269,6 @@ int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
#else /* CONFIG_HYPERV */
static inline void hyperv_init(void) {}
static inline void hyperv_setup_mmu_ops(void) {}
-static inline void *hv_alloc_hyperv_page(void) { return NULL; }
-static inline void hv_free_hyperv_page(unsigned long addr) {}
static inline void set_hv_tscchange_cb(void (*cb)(void)) {}
static inline void clear_hv_tscchange_cb(void) {}
static inline void hyperv_stop_tsc_emulation(void) {};
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index f202ac7..cca8d5e 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -37,6 +37,42 @@ int hv_init(void)
}

/*
+ * Functions for allocating and freeing memory with size and
+ * alignment HV_HYP_PAGE_SIZE. These functions are needed because
+ * the guest page size may not be the same as the Hyper-V page
+ * size. We depend upon kmalloc() aligning power-of-two size
+ * allocations to the allocation size boundary, so that the
+ * allocated memory appears to Hyper-V as a page of the size
+ * it expects.
+ */
+
+void *hv_alloc_hyperv_page(void)
+{
+ BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE);
+
+ if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
+ return (void *)__get_free_page(GFP_KERNEL);
+ else
+ return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
+}
+
+void *hv_alloc_hyperv_zeroed_page(void)
+{
+ if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
+ return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+ else
+ return kzalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
+}
+
+void hv_free_hyperv_page(unsigned long addr)
+{
+ if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
+ free_page(addr);
+ else
+ kfree((void *)addr);
+}
+
+/*
* hv_post_message - Post a message using the hypervisor message IPC.
*
* This involves a hypercall.
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index dff58a3..694b5bc 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -117,6 +117,10 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
/* Sentinel value for an uninitialized entry in hv_vp_index array */
#define VP_INVAL U32_MAX

+void *hv_alloc_hyperv_page(void);
+void *hv_alloc_hyperv_zeroed_page(void);
+void hv_free_hyperv_page(unsigned long addr);
+
/**
* hv_cpu_number_to_vp_number() - Map CPU to VP.
* @cpu_number: CPU number in Linux terms
--
1.8.3.1

2021-03-01 01:25:24

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 03/10] Drivers: hv: Redo Hyper-V synthetic MSR get/set functions

Current code defines a separate get and set macro for each Hyper-V
synthetic MSR used by the VMbus driver. Furthermore, the get macro
can't be converted to a standard function because the second argument
is modified in place, which is somewhat bad form.

Redo this by providing a single get and a single set function that
take a parameter specifying the MSR to be operated on. Fixup usage
of the get function. Calling locations are no more complex than before,
but the code under arch/x86 and the upcoming code under arch/arm64
is significantly simplified.

Also standardize the names of Hyper-V synthetic MSRs that are
architecture neutral. But keep the old x86-specific names as aliases
that can be removed later when all references (particularly in KVM
code) have been cleaned up in a separate patch series.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/hyperv/hv_init.c | 2 +-
arch/x86/include/asm/hyperv-tlfs.h | 102 +++++++++++++++++++++++--------------
arch/x86/include/asm/mshyperv.h | 39 ++++----------
drivers/clocksource/hyperv_timer.c | 26 +++++-----
drivers/hv/hv.c | 37 ++++++++------
drivers/hv/vmbus_drv.c | 2 +-
include/asm-generic/mshyperv.h | 2 +-
7 files changed, 110 insertions(+), 100 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 4bdb344..94d52c5 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -75,7 +75,7 @@ static int hv_cpu_init(unsigned int cpu)
*output_arg = page_address(pg + 1);
}

- hv_get_vp_index(msr_vp_index);
+ msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX);

hv_vp_index[smp_processor_id()] = msr_vp_index;

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 68b38a2..606f5cc 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -156,7 +156,7 @@ enum hv_isolation_type {
#define HV_X64_MSR_HYPERCALL 0x40000001

/* MSR used to provide vcpu index */
-#define HV_X64_MSR_VP_INDEX 0x40000002
+#define HV_REGISTER_VP_INDEX 0x40000002

/* MSR used to reset the guest OS. */
#define HV_X64_MSR_RESET 0x40000003
@@ -165,10 +165,10 @@ enum hv_isolation_type {
#define HV_X64_MSR_VP_RUNTIME 0x40000010

/* MSR used to read the per-partition time reference counter */
-#define HV_X64_MSR_TIME_REF_COUNT 0x40000020
+#define HV_REGISTER_TIME_REF_COUNT 0x40000020

/* A partition's reference time stamp counter (TSC) page */
-#define HV_X64_MSR_REFERENCE_TSC 0x40000021
+#define HV_REGISTER_REFERENCE_TSC 0x40000021

/* MSR used to retrieve the TSC frequency */
#define HV_X64_MSR_TSC_FREQUENCY 0x40000022
@@ -183,50 +183,50 @@ enum hv_isolation_type {
#define HV_X64_MSR_VP_ASSIST_PAGE 0x40000073

/* Define synthetic interrupt controller model specific registers. */
-#define HV_X64_MSR_SCONTROL 0x40000080
-#define HV_X64_MSR_SVERSION 0x40000081
-#define HV_X64_MSR_SIEFP 0x40000082
-#define HV_X64_MSR_SIMP 0x40000083
-#define HV_X64_MSR_EOM 0x40000084
-#define HV_X64_MSR_SINT0 0x40000090
-#define HV_X64_MSR_SINT1 0x40000091
-#define HV_X64_MSR_SINT2 0x40000092
-#define HV_X64_MSR_SINT3 0x40000093
-#define HV_X64_MSR_SINT4 0x40000094
-#define HV_X64_MSR_SINT5 0x40000095
-#define HV_X64_MSR_SINT6 0x40000096
-#define HV_X64_MSR_SINT7 0x40000097
-#define HV_X64_MSR_SINT8 0x40000098
-#define HV_X64_MSR_SINT9 0x40000099
-#define HV_X64_MSR_SINT10 0x4000009A
-#define HV_X64_MSR_SINT11 0x4000009B
-#define HV_X64_MSR_SINT12 0x4000009C
-#define HV_X64_MSR_SINT13 0x4000009D
-#define HV_X64_MSR_SINT14 0x4000009E
-#define HV_X64_MSR_SINT15 0x4000009F
+#define HV_REGISTER_SCONTROL 0x40000080
+#define HV_REGISTER_SVERSION 0x40000081
+#define HV_REGISTER_SIEFP 0x40000082
+#define HV_REGISTER_SIMP 0x40000083
+#define HV_REGISTER_EOM 0x40000084
+#define HV_REGISTER_SINT0 0x40000090
+#define HV_REGISTER_SINT1 0x40000091
+#define HV_REGISTER_SINT2 0x40000092
+#define HV_REGISTER_SINT3 0x40000093
+#define HV_REGISTER_SINT4 0x40000094
+#define HV_REGISTER_SINT5 0x40000095
+#define HV_REGISTER_SINT6 0x40000096
+#define HV_REGISTER_SINT7 0x40000097
+#define HV_REGISTER_SINT8 0x40000098
+#define HV_REGISTER_SINT9 0x40000099
+#define HV_REGISTER_SINT10 0x4000009A
+#define HV_REGISTER_SINT11 0x4000009B
+#define HV_REGISTER_SINT12 0x4000009C
+#define HV_REGISTER_SINT13 0x4000009D
+#define HV_REGISTER_SINT14 0x4000009E
+#define HV_REGISTER_SINT15 0x4000009F

/*
* Synthetic Timer MSRs. Four timers per vcpu.
*/
-#define HV_X64_MSR_STIMER0_CONFIG 0x400000B0
-#define HV_X64_MSR_STIMER0_COUNT 0x400000B1
-#define HV_X64_MSR_STIMER1_CONFIG 0x400000B2
-#define HV_X64_MSR_STIMER1_COUNT 0x400000B3
-#define HV_X64_MSR_STIMER2_CONFIG 0x400000B4
-#define HV_X64_MSR_STIMER2_COUNT 0x400000B5
-#define HV_X64_MSR_STIMER3_CONFIG 0x400000B6
-#define HV_X64_MSR_STIMER3_COUNT 0x400000B7
+#define HV_REGISTER_STIMER0_CONFIG 0x400000B0
+#define HV_REGISTER_STIMER0_COUNT 0x400000B1
+#define HV_REGISTER_STIMER1_CONFIG 0x400000B2
+#define HV_REGISTER_STIMER1_COUNT 0x400000B3
+#define HV_REGISTER_STIMER2_CONFIG 0x400000B4
+#define HV_REGISTER_STIMER2_COUNT 0x400000B5
+#define HV_REGISTER_STIMER3_CONFIG 0x400000B6
+#define HV_REGISTER_STIMER3_COUNT 0x400000B7

/* Hyper-V guest idle MSR */
#define HV_X64_MSR_GUEST_IDLE 0x400000F0

/* Hyper-V guest crash notification MSR's */
-#define HV_X64_MSR_CRASH_P0 0x40000100
-#define HV_X64_MSR_CRASH_P1 0x40000101
-#define HV_X64_MSR_CRASH_P2 0x40000102
-#define HV_X64_MSR_CRASH_P3 0x40000103
-#define HV_X64_MSR_CRASH_P4 0x40000104
-#define HV_X64_MSR_CRASH_CTL 0x40000105
+#define HV_REGISTER_CRASH_P0 0x40000100
+#define HV_REGISTER_CRASH_P1 0x40000101
+#define HV_REGISTER_CRASH_P2 0x40000102
+#define HV_REGISTER_CRASH_P3 0x40000103
+#define HV_REGISTER_CRASH_P4 0x40000104
+#define HV_REGISTER_CRASH_CTL 0x40000105

/* TSC emulation after migration */
#define HV_X64_MSR_REENLIGHTENMENT_CONTROL 0x40000106
@@ -236,6 +236,32 @@ enum hv_isolation_type {
/* TSC invariant control */
#define HV_X64_MSR_TSC_INVARIANT_CONTROL 0x40000118

+/* Register name aliases for temporary compatibility */
+#define HV_X64_MSR_STIMER0_COUNT HV_REGISTER_STIMER0_COUNT
+#define HV_X64_MSR_STIMER0_CONFIG HV_REGISTER_STIMER0_CONFIG
+#define HV_X64_MSR_STIMER1_COUNT HV_REGISTER_STIMER1_COUNT
+#define HV_X64_MSR_STIMER1_CONFIG HV_REGISTER_STIMER1_CONFIG
+#define HV_X64_MSR_STIMER2_COUNT HV_REGISTER_STIMER2_COUNT
+#define HV_X64_MSR_STIMER2_CONFIG HV_REGISTER_STIMER2_CONFIG
+#define HV_X64_MSR_STIMER3_COUNT HV_REGISTER_STIMER3_COUNT
+#define HV_X64_MSR_STIMER3_CONFIG HV_REGISTER_STIMER3_CONFIG
+#define HV_X64_MSR_SCONTROL HV_REGISTER_SCONTROL
+#define HV_X64_MSR_SVERSION HV_REGISTER_SVERSION
+#define HV_X64_MSR_SIMP HV_REGISTER_SIMP
+#define HV_X64_MSR_SIEFP HV_REGISTER_SIEFP
+#define HV_X64_MSR_VP_INDEX HV_REGISTER_VP_INDEX
+#define HV_X64_MSR_EOM HV_REGISTER_EOM
+#define HV_X64_MSR_SINT0 HV_REGISTER_SINT0
+#define HV_X64_MSR_SINT15 HV_REGISTER_SINT15
+#define HV_X64_MSR_CRASH_P0 HV_REGISTER_CRASH_P0
+#define HV_X64_MSR_CRASH_P1 HV_REGISTER_CRASH_P1
+#define HV_X64_MSR_CRASH_P2 HV_REGISTER_CRASH_P2
+#define HV_X64_MSR_CRASH_P3 HV_REGISTER_CRASH_P3
+#define HV_X64_MSR_CRASH_P4 HV_REGISTER_CRASH_P4
+#define HV_X64_MSR_CRASH_CTL HV_REGISTER_CRASH_CTL
+#define HV_X64_MSR_TIME_REF_COUNT HV_REGISTER_TIME_REF_COUNT
+#define HV_X64_MSR_REFERENCE_TSC HV_REGISTER_REFERENCE_TSC
+
/*
* Declare the MSR used to setup pages used to communicate with the hypervisor.
*/
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ef6e968..2590ce5 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -14,41 +14,22 @@ typedef int (*hyperv_fill_flush_list_func)(
struct hv_guest_mapping_flush_list *flush,
void *data);

-#define hv_init_timer(timer, tick) \
- wrmsrl(HV_X64_MSR_STIMER0_COUNT + (2*timer), tick)
-#define hv_init_timer_config(timer, val) \
- wrmsrl(HV_X64_MSR_STIMER0_CONFIG + (2*timer), val)
-
-#define hv_get_simp(val) rdmsrl(HV_X64_MSR_SIMP, val)
-#define hv_set_simp(val) wrmsrl(HV_X64_MSR_SIMP, val)
-
-#define hv_get_siefp(val) rdmsrl(HV_X64_MSR_SIEFP, val)
-#define hv_set_siefp(val) wrmsrl(HV_X64_MSR_SIEFP, val)
-
-#define hv_get_synic_state(val) rdmsrl(HV_X64_MSR_SCONTROL, val)
-#define hv_set_synic_state(val) wrmsrl(HV_X64_MSR_SCONTROL, val)
+static inline void hv_set_register(unsigned int reg, u64 value)
+{
+ wrmsrl(reg, value);
+}

-#define hv_get_vp_index(index) rdmsrl(HV_X64_MSR_VP_INDEX, index)
+static inline u64 hv_get_register(unsigned int reg)
+{
+ u64 value;

-#define hv_signal_eom() wrmsrl(HV_X64_MSR_EOM, 0)
+ rdmsrl(reg, value);
+ return value;
+}

-#define hv_get_synint_state(int_num, val) \
- rdmsrl(HV_X64_MSR_SINT0 + int_num, val)
-#define hv_set_synint_state(int_num, val) \
- wrmsrl(HV_X64_MSR_SINT0 + int_num, val)
#define hv_recommend_using_aeoi() \
(!(ms_hyperv.hints & HV_DEPRECATING_AEOI_RECOMMENDED))

-#define hv_get_crash_ctl(val) \
- rdmsrl(HV_X64_MSR_CRASH_CTL, val)
-
-#define hv_get_time_ref_count(val) \
- rdmsrl(HV_X64_MSR_TIME_REF_COUNT, val)
-
-#define hv_get_reference_tsc(val) \
- rdmsrl(HV_X64_MSR_REFERENCE_TSC, val)
-#define hv_set_reference_tsc(val) \
- wrmsrl(HV_X64_MSR_REFERENCE_TSC, val)
#define hv_set_clocksource_vdso(val) \
((val).vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK)
#define hv_enable_vdso_clocksource() \
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 269a691..c73c127 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -68,14 +68,14 @@ static int hv_ce_set_next_event(unsigned long delta,

current_tick = hv_read_reference_counter();
current_tick += delta;
- hv_init_timer(0, current_tick);
+ hv_set_register(HV_REGISTER_STIMER0_COUNT, current_tick);
return 0;
}

static int hv_ce_shutdown(struct clock_event_device *evt)
{
- hv_init_timer(0, 0);
- hv_init_timer_config(0, 0);
+ hv_set_register(HV_REGISTER_STIMER0_COUNT, 0);
+ hv_set_register(HV_REGISTER_STIMER0_CONFIG, 0);
if (direct_mode_enabled)
hv_disable_stimer0_percpu_irq(stimer0_irq);

@@ -105,7 +105,7 @@ static int hv_ce_set_oneshot(struct clock_event_device *evt)
timer_cfg.direct_mode = 0;
timer_cfg.sintx = stimer0_message_sint;
}
- hv_init_timer_config(0, timer_cfg.as_uint64);
+ hv_set_register(HV_REGISTER_STIMER0_CONFIG, timer_cfg.as_uint64);
return 0;
}

@@ -331,7 +331,7 @@ static u64 notrace read_hv_clock_tsc(void)
u64 current_tick = hv_read_tsc_page(hv_get_tsc_page());

if (current_tick == U64_MAX)
- hv_get_time_ref_count(current_tick);
+ current_tick = hv_get_register(HV_REGISTER_TIME_REF_COUNT);

return current_tick;
}
@@ -352,9 +352,9 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)
u64 tsc_msr;

/* Disable the TSC page */
- hv_get_reference_tsc(tsc_msr);
+ tsc_msr = hv_get_register(HV_REGISTER_REFERENCE_TSC);
tsc_msr &= ~BIT_ULL(0);
- hv_set_reference_tsc(tsc_msr);
+ hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr);
}


@@ -364,10 +364,10 @@ static void resume_hv_clock_tsc(struct clocksource *arg)
u64 tsc_msr;

/* Re-enable the TSC page */
- hv_get_reference_tsc(tsc_msr);
+ tsc_msr = hv_get_register(HV_REGISTER_REFERENCE_TSC);
tsc_msr &= GENMASK_ULL(11, 0);
tsc_msr |= BIT_ULL(0) | (u64)phys_addr;
- hv_set_reference_tsc(tsc_msr);
+ hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr);
}

static int hv_cs_enable(struct clocksource *cs)
@@ -389,14 +389,12 @@ static int hv_cs_enable(struct clocksource *cs)

static u64 notrace read_hv_clock_msr(void)
{
- u64 current_tick;
/*
* Read the partition counter to get the current tick count. This count
* is set to 0 when the partition is created and is incremented in
* 100 nanosecond units.
*/
- hv_get_time_ref_count(current_tick);
- return current_tick;
+ return hv_get_register(HV_REGISTER_TIME_REF_COUNT);
}

static u64 notrace read_hv_clock_msr_cs(struct clocksource *arg)
@@ -439,10 +437,10 @@ static bool __init hv_init_tsc_clocksource(void)
* (which already has at least the low 12 bits set to zero since
* it is page aligned). Also set the "enable" bit, which is bit 0.
*/
- hv_get_reference_tsc(tsc_msr);
+ tsc_msr = hv_get_register(HV_REGISTER_REFERENCE_TSC);
tsc_msr &= GENMASK_ULL(11, 0);
tsc_msr = tsc_msr | 0x1 | (u64)phys_addr;
- hv_set_reference_tsc(tsc_msr);
+ hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr);

hv_set_clocksource_vdso(hyperv_cs_tsc);
clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index cca8d5e..0c1fa69 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -198,34 +198,36 @@ void hv_synic_enable_regs(unsigned int cpu)
union hv_synic_scontrol sctrl;

/* Setup the Synic's message page */
- hv_get_simp(simp.as_uint64);
+ simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
simp.simp_enabled = 1;
simp.base_simp_gpa = virt_to_phys(hv_cpu->synic_message_page)
>> HV_HYP_PAGE_SHIFT;

- hv_set_simp(simp.as_uint64);
+ hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);

/* Setup the Synic's event page */
- hv_get_siefp(siefp.as_uint64);
+ siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
siefp.siefp_enabled = 1;
siefp.base_siefp_gpa = virt_to_phys(hv_cpu->synic_event_page)
>> HV_HYP_PAGE_SHIFT;

- hv_set_siefp(siefp.as_uint64);
+ hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);

/* Setup the shared SINT. */
- hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
+ shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
+ VMBUS_MESSAGE_SINT);

shared_sint.vector = hv_get_vector();
shared_sint.masked = false;
shared_sint.auto_eoi = hv_recommend_using_aeoi();
- hv_set_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
+ hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
+ shared_sint.as_uint64);

/* Enable the global synic bit */
- hv_get_synic_state(sctrl.as_uint64);
+ sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
sctrl.enable = 1;

- hv_set_synic_state(sctrl.as_uint64);
+ hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
}

int hv_synic_init(unsigned int cpu)
@@ -247,32 +249,35 @@ void hv_synic_disable_regs(unsigned int cpu)
union hv_synic_siefp siefp;
union hv_synic_scontrol sctrl;

- hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
+ shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
+ VMBUS_MESSAGE_SINT);

shared_sint.masked = 1;

/* Need to correctly cleanup in the case of SMP!!! */
/* Disable the interrupt */
- hv_set_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
+ hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
+ shared_sint.as_uint64);

- hv_get_simp(simp.as_uint64);
+ simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
simp.simp_enabled = 0;
simp.base_simp_gpa = 0;

- hv_set_simp(simp.as_uint64);
+ hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);

- hv_get_siefp(siefp.as_uint64);
+ siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
siefp.siefp_enabled = 0;
siefp.base_siefp_gpa = 0;

- hv_set_siefp(siefp.as_uint64);
+ hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);

/* Disable the global synic bit */
- hv_get_synic_state(sctrl.as_uint64);
+ sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
sctrl.enable = 0;
- hv_set_synic_state(sctrl.as_uint64);
+ hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
}

+
int hv_synic_cleanup(unsigned int cpu)
{
struct vmbus_channel *channel, *sc;
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 10dce9f..9e63170 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1521,7 +1521,7 @@ static int vmbus_bus_init(void)
* Register for panic kmsg callback only if the right
* capability is supported by the hypervisor.
*/
- hv_get_crash_ctl(hyperv_crash_ctl);
+ hyperv_crash_ctl = hv_get_register(HV_REGISTER_CRASH_CTL);
if (hyperv_crash_ctl & HV_CRASH_CTL_CRASH_NOTIFY_MSG)
hv_kmsg_dump_register();

diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 694b5bc..163d8b0 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -88,7 +88,7 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
* possibly deliver another msg from the
* hypervisor
*/
- hv_signal_eom();
+ hv_set_register(HV_REGISTER_EOM, 0);
}
}

--
1.8.3.1

2021-03-01 01:25:41

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 04/10] Drivers: hv: vmbus: Move hyperv_report_panic_msg to arch neutral code

With the new Hyper-V MSR set function, hyperv_report_panic_msg() can be
architecture neutral, so move it out from under arch/x86 and merge into
hv_kmsg_dump(). This move also avoids needing a separate implementation
under arch/arm64.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/hyperv/hv_init.c | 27 ---------------------------
drivers/hv/vmbus_drv.c | 24 +++++++++++++++++++-----
include/asm-generic/mshyperv.h | 1 -
3 files changed, 19 insertions(+), 33 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 94d52c5..9af4f8a 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -571,33 +571,6 @@ void hyperv_report_panic(struct pt_regs *regs, long err, bool in_die)
}
EXPORT_SYMBOL_GPL(hyperv_report_panic);

-/**
- * hyperv_report_panic_msg - report panic message to Hyper-V
- * @pa: physical address of the panic page containing the message
- * @size: size of the message in the page
- */
-void hyperv_report_panic_msg(phys_addr_t pa, size_t size)
-{
- /*
- * P3 to contain the physical address of the panic page & P4 to
- * contain the size of the panic data in that page. Rest of the
- * registers are no-op when the NOTIFY_MSG flag is set.
- */
- wrmsrl(HV_X64_MSR_CRASH_P0, 0);
- wrmsrl(HV_X64_MSR_CRASH_P1, 0);
- wrmsrl(HV_X64_MSR_CRASH_P2, 0);
- wrmsrl(HV_X64_MSR_CRASH_P3, pa);
- wrmsrl(HV_X64_MSR_CRASH_P4, size);
-
- /*
- * Let Hyper-V know there is crash data available along with
- * the panic message.
- */
- wrmsrl(HV_X64_MSR_CRASH_CTL,
- (HV_CRASH_CTL_CRASH_NOTIFY | HV_CRASH_CTL_CRASH_NOTIFY_MSG));
-}
-EXPORT_SYMBOL_GPL(hyperv_report_panic_msg);
-
bool hv_is_hyperv_initialized(void)
{
union hv_x64_msr_hypercall_contents hypercall_msr;
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 9e63170..7524d71 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1392,22 +1392,36 @@ static void hv_kmsg_dump(struct kmsg_dumper *dumper,
enum kmsg_dump_reason reason)
{
size_t bytes_written;
- phys_addr_t panic_pa;

/* We are only interested in panics. */
if ((reason != KMSG_DUMP_PANIC) || (!sysctl_record_panic_msg))
return;

- panic_pa = virt_to_phys(hv_panic_page);
-
/*
* Write dump contents to the page. No need to synchronize; panic should
* be single-threaded.
*/
kmsg_dump_get_buffer(dumper, false, hv_panic_page, HV_HYP_PAGE_SIZE,
&bytes_written);
- if (bytes_written)
- hyperv_report_panic_msg(panic_pa, bytes_written);
+ if (!bytes_written)
+ return;
+ /*
+ * P3 to contain the physical address of the panic page & P4 to
+ * contain the size of the panic data in that page. Rest of the
+ * registers are no-op when the NOTIFY_MSG flag is set.
+ */
+ hv_set_register(HV_REGISTER_CRASH_P0, 0);
+ hv_set_register(HV_REGISTER_CRASH_P1, 0);
+ hv_set_register(HV_REGISTER_CRASH_P2, 0);
+ hv_set_register(HV_REGISTER_CRASH_P3, virt_to_phys(hv_panic_page));
+ hv_set_register(HV_REGISTER_CRASH_P4, bytes_written);
+
+ /*
+ * Let Hyper-V know there is crash data available along with
+ * the panic message.
+ */
+ hv_set_register(HV_REGISTER_CRASH_CTL,
+ (HV_CRASH_CTL_CRASH_NOTIFY | HV_CRASH_CTL_CRASH_NOTIFY_MSG));
}

static struct kmsg_dumper hv_kmsg_dumper = {
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 163d8b0..70b798d 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -173,7 +173,6 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset,
}

void hyperv_report_panic(struct pt_regs *regs, long err, bool in_die);
-void hyperv_report_panic_msg(phys_addr_t pa, size_t size);
bool hv_is_hyperv_initialized(void);
bool hv_is_hibernation_supported(void);
enum hv_isolation_type hv_get_isolation_type(void);
--
1.8.3.1

2021-03-01 01:29:11

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 05/10] Drivers: hv: vmbus: Handle auto EOI quirk inline

On x86/x64, Hyper-V provides a flag to indicate auto EOI functionality,
but it doesn't on ARM64. Handle this quirk inline instead of calling
into code under arch/x86 (and coming, under arch/arm64).

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 3 ---
drivers/hv/hv.c | 12 +++++++++++-
2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 2590ce5..a6c608d 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -27,9 +27,6 @@ static inline u64 hv_get_register(unsigned int reg)
return value;
}

-#define hv_recommend_using_aeoi() \
- (!(ms_hyperv.hints & HV_DEPRECATING_AEOI_RECOMMENDED))
-
#define hv_set_clocksource_vdso(val) \
((val).vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK)
#define hv_enable_vdso_clocksource() \
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 0c1fa69..afe7a62 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -219,7 +219,17 @@ void hv_synic_enable_regs(unsigned int cpu)

shared_sint.vector = hv_get_vector();
shared_sint.masked = false;
- shared_sint.auto_eoi = hv_recommend_using_aeoi();
+
+ /*
+ * On architectures where Hyper-V doesn't support AEOI (e.g., ARM64),
+ * it doesn't provide a recommendation flag and AEOI must be disabled.
+ */
+#ifdef HV_DEPRECATING_AEOI_RECOMMENDED
+ shared_sint.auto_eoi =
+ !(ms_hyperv.hints & HV_DEPRECATING_AEOI_RECOMMENDED);
+#else
+ shared_sint.auto_eoi = 0;
+#endif
hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
shared_sint.as_uint64);

--
1.8.3.1

2021-03-01 01:30:22

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 06/10] Drivers: hv: vmbus: Move handling of VMbus interrupts

VMbus interrupts are most naturally modelled as per-cpu IRQs. But
because x86/x64 doesn't have per-cpu IRQs, the core VMbus interrupt
handling machinery is done in code under arch/x86 and Linux IRQs are
not used. Adding support for ARM64 means adding equivalent code
using per-cpu IRQs under arch/arm64.

A better model is to treat per-cpu IRQs as the normal path (which it is
for modern architectures), and the x86/x64 path as the exception. Do this
by incorporating standard Linux per-cpu IRQ allocation into the main VMbus
driver, and bypassing it in the x86/x64 exception case. For x86/x64,
special case code is retained under arch/x86, but no VMbus interrupt
handling code is needed under arch/arm64.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 1 -
arch/x86/kernel/cpu/mshyperv.c | 13 +++------
drivers/hv/hv.c | 8 +++++-
drivers/hv/vmbus_drv.c | 63 ++++++++++++++++++++++++++++++++++++-----
include/asm-generic/mshyperv.h | 7 ++---
5 files changed, 70 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index a6c608d..c10dd1c 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -32,7 +32,6 @@ static inline u64 hv_get_register(unsigned int reg)
#define hv_enable_vdso_clocksource() \
vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
#define hv_get_raw_timer() rdtsc_ordered()
-#define hv_get_vector() HYPERVISOR_CALLBACK_VECTOR

/*
* Reference to pv_ops must be inline so objtool
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index e88bc29..41fd84a 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -60,23 +60,18 @@
set_irq_regs(old_regs);
}

-int hv_setup_vmbus_irq(int irq, void (*handler)(void))
+void hv_setup_vmbus_handler(void (*handler)(void))
{
- /*
- * The 'irq' argument is ignored on x86/x64 because a hard-coded
- * interrupt vector is used for Hyper-V interrupts.
- */
vmbus_handler = handler;
- return 0;
}
+EXPORT_SYMBOL_GPL(hv_setup_vmbus_handler);

-void hv_remove_vmbus_irq(void)
+void hv_remove_vmbus_handler(void)
{
/* We have no way to deallocate the interrupt gate */
vmbus_handler = NULL;
}
-EXPORT_SYMBOL_GPL(hv_setup_vmbus_irq);
-EXPORT_SYMBOL_GPL(hv_remove_vmbus_irq);
+EXPORT_SYMBOL_GPL(hv_remove_vmbus_handler);

/*
* Routines to do per-architecture handling of stimer0
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index afe7a62..917b29e 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -16,6 +16,7 @@
#include <linux/version.h>
#include <linux/random.h>
#include <linux/clockchips.h>
+#include <linux/interrupt.h>
#include <clocksource/hyperv_timer.h>
#include <asm/mshyperv.h>
#include "hyperv_vmbus.h"
@@ -214,10 +215,12 @@ void hv_synic_enable_regs(unsigned int cpu)
hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);

/* Setup the shared SINT. */
+ if (vmbus_irq != -1)
+ enable_percpu_irq(vmbus_irq, 0);
shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
VMBUS_MESSAGE_SINT);

- shared_sint.vector = hv_get_vector();
+ shared_sint.vector = vmbus_interrupt;
shared_sint.masked = false;

/*
@@ -285,6 +288,9 @@ void hv_synic_disable_regs(unsigned int cpu)
sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
sctrl.enable = 0;
hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
+
+ if (vmbus_irq != -1)
+ disable_percpu_irq(vmbus_irq);
}


diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 7524d71..51c40d5 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -48,8 +48,10 @@ struct vmbus_dynid {

static void *hv_panic_page;

+static long __percpu *vmbus_evt;
+
/* Values parsed from ACPI DSDT */
-static int vmbus_irq;
+int vmbus_irq;
int vmbus_interrupt;

/*
@@ -1381,7 +1383,13 @@ static void vmbus_isr(void)
tasklet_schedule(&hv_cpu->msg_dpc);
}

- add_interrupt_randomness(hv_get_vector(), 0);
+ add_interrupt_randomness(vmbus_interrupt, 0);
+}
+
+static irqreturn_t vmbus_percpu_isr(int irq, void *dev_id)
+{
+ vmbus_isr();
+ return IRQ_HANDLED;
}

/*
@@ -1496,9 +1504,28 @@ static int vmbus_bus_init(void)
if (ret)
return ret;

- ret = hv_setup_vmbus_irq(vmbus_irq, vmbus_isr);
- if (ret)
- goto err_setup;
+ /*
+ * VMbus interrupts are best modeled as per-cpu interrupts. If
+ * on an architecture with support for per-cpu IRQs (e.g. ARM64),
+ * allocate a per-cpu IRQ using standard Linux kernel functionality.
+ * If not on such an architecture (e.g., x86/x64), then rely on
+ * code in the arch-specific portion of the code tree to connect
+ * the VMbus interrupt handler.
+ */
+
+ if (vmbus_irq == -1) {
+ hv_setup_vmbus_handler(vmbus_isr);
+ } else {
+ vmbus_evt = alloc_percpu(long);
+ ret = request_percpu_irq(vmbus_irq, vmbus_percpu_isr,
+ "Hyper-V VMbus", vmbus_evt);
+ if (ret) {
+ pr_err("Can't request Hyper-V VMbus IRQ %d, Err %d",
+ vmbus_irq, ret);
+ free_percpu(vmbus_evt);
+ goto err_setup;
+ }
+ }

ret = hv_synic_alloc();
if (ret)
@@ -1559,7 +1586,12 @@ static int vmbus_bus_init(void)
err_cpuhp:
hv_synic_free();
err_alloc:
- hv_remove_vmbus_irq();
+ if (vmbus_irq == -1) {
+ hv_remove_vmbus_handler();
+ } else {
+ free_percpu_irq(vmbus_irq, vmbus_evt);
+ free_percpu(vmbus_evt);
+ }
err_setup:
bus_unregister(&hv_bus);
unregister_sysctl_table(hv_ctl_table_hdr);
@@ -2677,6 +2709,18 @@ static int __init hv_acpi_init(void)
ret = -ETIMEDOUT;
goto cleanup;
}
+
+ /*
+ * If we're on an architecture with a hardcoded hypervisor
+ * vector (i.e. x86/x64), override the VMbus interrupt found
+ * in the ACPI tables. Ensure vmbus_irq is not set since the
+ * normal Linux IRQ mechanism is not used in this case.
+ */
+#ifdef HYPERVISOR_CALLBACK_VECTOR
+ vmbus_interrupt = HYPERVISOR_CALLBACK_VECTOR;
+ vmbus_irq = -1;
+#endif
+
hv_debug_init();

ret = vmbus_bus_init();
@@ -2707,7 +2751,12 @@ static void __exit vmbus_exit(void)
vmbus_connection.conn_state = DISCONNECTED;
hv_stimer_global_cleanup();
vmbus_disconnect();
- hv_remove_vmbus_irq();
+ if (vmbus_irq == -1) {
+ hv_remove_vmbus_handler();
+ } else {
+ free_percpu_irq(vmbus_irq, vmbus_evt);
+ free_percpu(vmbus_evt);
+ }
for_each_online_cpu(cpu) {
struct hv_per_cpu_context *hv_cpu
= per_cpu_ptr(hv_context.cpu_context, cpu);
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 70b798d..43dc371 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -92,10 +92,8 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
}
}

-int hv_setup_vmbus_irq(int irq, void (*handler)(void));
-void hv_remove_vmbus_irq(void);
-void hv_enable_vmbus_irq(void);
-void hv_disable_vmbus_irq(void);
+void hv_setup_vmbus_handler(void (*handler)(void));
+void hv_remove_vmbus_handler(void);

void hv_setup_kexec_handler(void (*handler)(void));
void hv_remove_kexec_handler(void);
@@ -103,6 +101,7 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
void hv_remove_crash_handler(void);

extern int vmbus_interrupt;
+extern int vmbus_irq;

#if IS_ENABLED(CONFIG_HYPERV)
/*
--
1.8.3.1

2021-03-01 01:32:46

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 08/10] clocksource/drivers/hyper-v: Handle sched_clock differences inline

While the Hyper-V Reference TSC code is architecture neutral, the
pv_ops.time.sched_clock() function is implemented for x86/x64, but not
for ARM64. Current code calls a utility function under arch/x86 (and
coming, under arch/arm64) to handle the difference.

Change this approach to handle the difference inline based on whether
GENERIC_SCHED_CLOCK is present. The new approach removes code under
arch/* since the difference is tied more to the specifics of the Linux
implementation than to the architecture.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 11 -----------
drivers/clocksource/hyperv_timer.c | 21 +++++++++++++++++++++
2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 4f566db..5433312 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -29,17 +29,6 @@ static inline u64 hv_get_register(unsigned int reg)

#define hv_get_raw_timer() rdtsc_ordered()

-/*
- * Reference to pv_ops must be inline so objtool
- * detection of noinstr violations can work correctly.
- */
-static __always_inline void hv_setup_sched_clock(void *sched_clock)
-{
-#ifdef CONFIG_PARAVIRT
- pv_ops.time.sched_clock = sched_clock;
-#endif
-}
-
void hyperv_vector_handler(struct pt_regs *regs);

static inline void hv_enable_stimer0_percpu_irq(int irq) {}
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 5e5e08aa..38fb396 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -423,6 +423,27 @@ static u64 notrace read_hv_sched_clock_msr(void)
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};

+/*
+ * Reference to pv_ops must be inline so objtool
+ * detection of noinstr violations can work correctly.
+ */
+static __always_inline void hv_setup_sched_clock(void *sched_clock)
+{
+#ifdef CONFIG_GENERIC_SCHED_CLOCK
+ /*
+ * We're on an architecture with generic sched clock (not x86/x64).
+ * The Hyper-V sched clock read function returns nanoseconds, not
+ * the normal 100ns units of the Hyper-V synthetic clock.
+ */
+ sched_clock_register(sched_clock, 64, NSEC_PER_SEC);
+#else
+#ifdef CONFIG_PARAVIRT
+ /* We're on x86/x64 *and* using PV ops */
+ pv_ops.time.sched_clock = sched_clock;
+#endif
+#endif
+}
+
static bool __init hv_init_tsc_clocksource(void)
{
u64 tsc_msr;
--
1.8.3.1

2021-03-01 01:37:14

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 07/10] clocksource/drivers/hyper-v: Handle vDSO differences inline

While the driver for the Hyper-V Reference TSC and STIMERs is architecture
neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code
calls into utility functions under arch/x86 (and coming, under arch/arm64)
to handle the difference.

Change this approach to handle the difference inline based on whether
VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under
arch/* since the difference is tied more to the specifics of the Linux
implementation than to the architecture.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 4 ----
drivers/clocksource/hyperv_timer.c | 10 ++++++++--
2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index c10dd1c..4f566db 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -27,10 +27,6 @@ static inline u64 hv_get_register(unsigned int reg)
return value;
}

-#define hv_set_clocksource_vdso(val) \
- ((val).vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK)
-#define hv_enable_vdso_clocksource() \
- vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
#define hv_get_raw_timer() rdtsc_ordered()

/*
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index c73c127..5e5e08aa 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -372,7 +372,9 @@ static void resume_hv_clock_tsc(struct clocksource *arg)

static int hv_cs_enable(struct clocksource *cs)
{
- hv_enable_vdso_clocksource();
+#ifdef VDSO_CLOCKMODE_HVCLOCK
+ vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
+#endif
return 0;
}

@@ -385,6 +387,11 @@ static int hv_cs_enable(struct clocksource *cs)
.suspend= suspend_hv_clock_tsc,
.resume = resume_hv_clock_tsc,
.enable = hv_cs_enable,
+#ifdef VDSO_CLOCKMODE_HVCLOCK
+ .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
+#else
+ .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
+#endif
};

static u64 notrace read_hv_clock_msr(void)
@@ -442,7 +449,6 @@ static bool __init hv_init_tsc_clocksource(void)
tsc_msr = tsc_msr | 0x1 | (u64)phys_addr;
hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr);

- hv_set_clocksource_vdso(hyperv_cs_tsc);
clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);

hv_sched_clock_offset = hv_read_reference_counter();
--
1.8.3.1

2021-03-01 02:20:58

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 02/10] x86/hyper-v: Move hv_message_type to architecture neutral module

The definition of enum hv_message_type includes arch neutral and
x86/x64-specific values. Ideally there would be a way to put the
arch neutral values in an arch neutral module, and the arch
specific values in an arch specific module. But C doesn't provide
a way to extend enum types. As a compromise, move the entire
definition into an arch neutral module, to avoid duplicating the
arch neutral values for x86/x64 and for ARM64.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Boqun Feng <[email protected]>
---
arch/x86/include/asm/hyperv-tlfs.h | 29 -----------------------------
include/asm-generic/hyperv-tlfs.h | 35 +++++++++++++++++++++++++++++++++++
2 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index e6cd3fe..68b38a2 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -288,35 +288,6 @@ struct hv_tsc_emulation_status {
#define HV_X64_MSR_TSC_REFERENCE_ENABLE 0x00000001
#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12

-
-/* Define hypervisor message types. */
-enum hv_message_type {
- HVMSG_NONE = 0x00000000,
-
- /* Memory access messages. */
- HVMSG_UNMAPPED_GPA = 0x80000000,
- HVMSG_GPA_INTERCEPT = 0x80000001,
-
- /* Timer notification messages. */
- HVMSG_TIMER_EXPIRED = 0x80000010,
-
- /* Error messages. */
- HVMSG_INVALID_VP_REGISTER_VALUE = 0x80000020,
- HVMSG_UNRECOVERABLE_EXCEPTION = 0x80000021,
- HVMSG_UNSUPPORTED_FEATURE = 0x80000022,
-
- /* Trace buffer complete messages. */
- HVMSG_EVENTLOG_BUFFERCOMPLETE = 0x80000040,
-
- /* Platform-specific processor intercept messages. */
- HVMSG_X64_IOPORT_INTERCEPT = 0x80010000,
- HVMSG_X64_MSR_INTERCEPT = 0x80010001,
- HVMSG_X64_CPUID_INTERCEPT = 0x80010002,
- HVMSG_X64_EXCEPTION_INTERCEPT = 0x80010003,
- HVMSG_X64_APIC_EOI = 0x80010004,
- HVMSG_X64_LEGACY_FP_ERROR = 0x80010005
-};
-
struct hv_nested_enlightenments_control {
struct {
__u32 directhypercall:1;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 83448e8..9cf10837 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -220,6 +220,41 @@ enum HV_GENERIC_SET_FORMAT {
#define HV_MESSAGE_PAYLOAD_BYTE_COUNT (240)
#define HV_MESSAGE_PAYLOAD_QWORD_COUNT (30)

+/*
+ * Define hypervisor message types. Some of the message types
+ * are x86/x64 specific, but there's no good way to separate
+ * them out into the arch-specific version of hyperv-tlfs.h
+ * because C doesn't provide a way to extend enum types.
+ * Keeping them all in the arch neutral hyperv-tlfs.h seems
+ * the least messy compromise.
+ */
+enum hv_message_type {
+ HVMSG_NONE = 0x00000000,
+
+ /* Memory access messages. */
+ HVMSG_UNMAPPED_GPA = 0x80000000,
+ HVMSG_GPA_INTERCEPT = 0x80000001,
+
+ /* Timer notification messages. */
+ HVMSG_TIMER_EXPIRED = 0x80000010,
+
+ /* Error messages. */
+ HVMSG_INVALID_VP_REGISTER_VALUE = 0x80000020,
+ HVMSG_UNRECOVERABLE_EXCEPTION = 0x80000021,
+ HVMSG_UNSUPPORTED_FEATURE = 0x80000022,
+
+ /* Trace buffer complete messages. */
+ HVMSG_EVENTLOG_BUFFERCOMPLETE = 0x80000040,
+
+ /* Platform-specific processor intercept messages. */
+ HVMSG_X64_IOPORT_INTERCEPT = 0x80010000,
+ HVMSG_X64_MSR_INTERCEPT = 0x80010001,
+ HVMSG_X64_CPUID_INTERCEPT = 0x80010002,
+ HVMSG_X64_EXCEPTION_INTERCEPT = 0x80010003,
+ HVMSG_X64_APIC_EOI = 0x80010004,
+ HVMSG_X64_LEGACY_FP_ERROR = 0x80010005
+};
+
/* Define synthetic interrupt controller message flags. */
union hv_message_flags {
__u8 asu8;
--
1.8.3.1

2021-03-01 03:11:35

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 10/10] clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts

STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But
because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt
handling machinery is done in code under arch/x86 and Linux IRQs are
not used. Adding support for ARM64 means adding equivalent code
using per-cpu IRQs under arch/arm64.

A better model is to treat per-cpu IRQs as the normal path (which it is
for modern architectures), and the x86/x64 path as the exception. Do this
by incorporating standard Linux per-cpu IRQ allocation into the main
SITMER0 driver code, and bypass it in the x86/x64 exception case. For
x86/x64, special case code is retained under arch/x86, but no STIMER0
interrupt handling code is needed under arch/arm64.

No functional change.

Signed-off-by: Michael Kelley <[email protected]>
---
arch/x86/hyperv/hv_init.c | 2 +-
arch/x86/include/asm/mshyperv.h | 4 -
arch/x86/kernel/cpu/mshyperv.c | 10 +--
drivers/clocksource/hyperv_timer.c | 180 ++++++++++++++++++++++++++-----------
include/asm-generic/mshyperv.h | 5 --
include/clocksource/hyperv_timer.h | 3 +-
6 files changed, 132 insertions(+), 72 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 9af4f8a..9d10025 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -327,7 +327,7 @@ static void __init hv_stimer_setup_percpu_clockev(void)
* Ignore any errors in setting up stimer clockevents
* as we can run with the LAPIC timer as a fallback.
*/
- (void)hv_stimer_alloc();
+ (void)hv_stimer_alloc(false);

/*
* Still register the LAPIC timer, because the direct-mode STIMER is
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 5433312..6d4891b 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -31,10 +31,6 @@ static inline u64 hv_get_register(unsigned int reg)

void hyperv_vector_handler(struct pt_regs *regs);

-static inline void hv_enable_stimer0_percpu_irq(int irq) {}
-static inline void hv_disable_stimer0_percpu_irq(int irq) {}
-
-
#if IS_ENABLED(CONFIG_HYPERV)
extern int hyperv_init_cpuhp;

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 41fd84a..cebed53 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -90,21 +90,17 @@ void hv_remove_vmbus_handler(void)
set_irq_regs(old_regs);
}

-int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void))
+/* For x86/x64, override weak placeholders in hyperv_timer.c */
+void hv_setup_stimer0_handler(void (*handler)(void))
{
- *vector = HYPERV_STIMER0_VECTOR;
- *irq = -1; /* Unused on x86/x64 */
hv_stimer0_handler = handler;
- return 0;
}
-EXPORT_SYMBOL_GPL(hv_setup_stimer0_irq);

-void hv_remove_stimer0_irq(int irq)
+void hv_remove_stimer0_handler(void)
{
/* We have no way to deallocate the interrupt gate */
hv_stimer0_handler = NULL;
}
-EXPORT_SYMBOL_GPL(hv_remove_stimer0_irq);

void hv_setup_kexec_handler(void (*handler)(void))
{
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index cdb8e0c..b2bf5e5 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -18,6 +18,9 @@
#include <linux/sched_clock.h>
#include <linux/mm.h>
#include <linux/cpuhotplug.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/acpi.h>
#include <clocksource/hyperv_timer.h>
#include <asm/hyperv-tlfs.h>
#include <asm/mshyperv.h>
@@ -43,14 +46,13 @@
*/
static bool direct_mode_enabled;

-static int stimer0_irq;
-static int stimer0_vector;
+static int stimer0_irq = -1;
+static long __percpu *stimer0_evt;
static int stimer0_message_sint;

/*
- * ISR for when stimer0 is operating in Direct Mode. Direct Mode
- * does not use VMbus or any VMbus messages, so process here and not
- * in the VMbus driver code.
+ * Common code for stimer0 interrupts coming via Direct Mode or
+ * as a VMbus message.
*/
void hv_stimer0_isr(void)
{
@@ -61,6 +63,16 @@ void hv_stimer0_isr(void)
}
EXPORT_SYMBOL_GPL(hv_stimer0_isr);

+/*
+ * stimer0 interrupt handler for architectures that support
+ * per-cpu interrupts, which also implies Direct Mode.
+ */
+static irqreturn_t hv_stimer0_percpu_isr(int irq, void *dev_id)
+{
+ hv_stimer0_isr();
+ return IRQ_HANDLED;
+}
+
static int hv_ce_set_next_event(unsigned long delta,
struct clock_event_device *evt)
{
@@ -76,8 +88,8 @@ static int hv_ce_shutdown(struct clock_event_device *evt)
{
hv_set_register(HV_REGISTER_STIMER0_COUNT, 0);
hv_set_register(HV_REGISTER_STIMER0_CONFIG, 0);
- if (direct_mode_enabled)
- hv_disable_stimer0_percpu_irq(stimer0_irq);
+ if (direct_mode_enabled && stimer0_irq >= 0)
+ disable_percpu_irq(stimer0_irq);

return 0;
}
@@ -95,8 +107,9 @@ static int hv_ce_set_oneshot(struct clock_event_device *evt)
* on the specified hardware vector/IRQ.
*/
timer_cfg.direct_mode = 1;
- timer_cfg.apic_vector = stimer0_vector;
- hv_enable_stimer0_percpu_irq(stimer0_irq);
+ timer_cfg.apic_vector = HYPERV_STIMER0_VECTOR;
+ if (stimer0_irq >= 0)
+ enable_percpu_irq(stimer0_irq, IRQ_TYPE_NONE);
} else {
/*
* When it expires, the timer will generate a VMbus message,
@@ -169,10 +182,70 @@ int hv_stimer_cleanup(unsigned int cpu)
}
EXPORT_SYMBOL_GPL(hv_stimer_cleanup);

+/*
+ * These placeholders are overridden by arch specific code on
+ * architectures that need special setup of the stimer0 IRQ because
+ * they don't support per-cpu IRQs (such as x86/x64).
+ */
+void __weak hv_setup_stimer0_handler(void (*handler)(void))
+{
+};
+
+void __weak hv_remove_stimer0_handler(void)
+{
+};
+
+/* Called only on architectures with per-cpu IRQs (i.e., not x86/x64) */
+static int hv_setup_stimer0_irq(void)
+{
+ int ret;
+
+ ret = acpi_register_gsi(NULL, HYPERV_STIMER0_VECTOR,
+ ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_HIGH);
+ if (ret < 0) {
+ pr_err("Can't register Hyper-V stimer0 GSI. Error %d", ret);
+ return ret;
+ }
+ stimer0_irq = ret;
+
+ stimer0_evt = alloc_percpu(long);
+ if (!stimer0_evt) {
+ ret = -ENOMEM;
+ goto unregister_gsi;
+ }
+ ret = request_percpu_irq(stimer0_irq, hv_stimer0_percpu_isr,
+ "Hyper-V stimer0", stimer0_evt);
+ if (ret) {
+ pr_err("Can't request Hyper-V stimer0 IRQ %d. Error %d",
+ stimer0_irq, ret);
+ goto free_stimer0_evt;
+ }
+ return ret;
+
+free_stimer0_evt:
+ free_percpu(stimer0_evt);
+unregister_gsi:
+ acpi_unregister_gsi(stimer0_irq);
+ stimer0_irq = -1;
+ return ret;
+}
+
+static void hv_remove_stimer0_irq(void)
+{
+ if (stimer0_irq == -1) {
+ hv_remove_stimer0_handler();
+ } else {
+ free_percpu_irq(stimer0_irq, stimer0_evt);
+ free_percpu(stimer0_evt);
+ acpi_unregister_gsi(stimer0_irq);
+ stimer0_irq = -1;
+ }
+}
+
/* hv_stimer_alloc - Global initialization of the clockevent and stimer0 */
-int hv_stimer_alloc(void)
+int hv_stimer_alloc(bool have_percpu_irqs)
{
- int ret = 0;
+ int ret;

/*
* Synthetic timers are always available except on old versions of
@@ -188,29 +261,37 @@ int hv_stimer_alloc(void)

direct_mode_enabled = ms_hyperv.misc_features &
HV_STIMER_DIRECT_MODE_AVAILABLE;
- if (direct_mode_enabled) {
- ret = hv_setup_stimer0_irq(&stimer0_irq, &stimer0_vector,
- hv_stimer0_isr);
+
+ /*
+ * If Direct Mode isn't enabled, the remainder of the initialization
+ * is done later by hv_stimer_legacy_init()
+ */
+ if (!direct_mode_enabled)
+ return 0;
+
+ if (have_percpu_irqs) {
+ ret = hv_setup_stimer0_irq();
if (ret)
- goto free_percpu;
+ goto free_clock_event;
+ } else {
+ hv_setup_stimer0_handler(hv_stimer0_isr);
+ }

- /*
- * Since we are in Direct Mode, stimer initialization
- * can be done now with a CPUHP value in the same range
- * as other clockevent devices.
- */
- ret = cpuhp_setup_state(CPUHP_AP_HYPERV_TIMER_STARTING,
- "clockevents/hyperv/stimer:starting",
- hv_stimer_init, hv_stimer_cleanup);
- if (ret < 0)
- goto free_stimer0_irq;
+ /*
+ * Since we are in Direct Mode, stimer initialization
+ * can be done now with a CPUHP value in the same range
+ * as other clockevent devices.
+ */
+ ret = cpuhp_setup_state(CPUHP_AP_HYPERV_TIMER_STARTING,
+ "clockevents/hyperv/stimer:starting",
+ hv_stimer_init, hv_stimer_cleanup);
+ if (ret < 0) {
+ hv_remove_stimer0_irq();
+ goto free_clock_event;
}
return ret;

-free_stimer0_irq:
- hv_remove_stimer0_irq(stimer0_irq);
- stimer0_irq = 0;
-free_percpu:
+free_clock_event:
free_percpu(hv_clock_event);
hv_clock_event = NULL;
return ret;
@@ -254,23 +335,6 @@ void hv_stimer_legacy_cleanup(unsigned int cpu)
}
EXPORT_SYMBOL_GPL(hv_stimer_legacy_cleanup);

-
-/* hv_stimer_free - Free global resources allocated by hv_stimer_alloc() */
-void hv_stimer_free(void)
-{
- if (!hv_clock_event)
- return;
-
- if (direct_mode_enabled) {
- cpuhp_remove_state(CPUHP_AP_HYPERV_TIMER_STARTING);
- hv_remove_stimer0_irq(stimer0_irq);
- stimer0_irq = 0;
- }
- free_percpu(hv_clock_event);
- hv_clock_event = NULL;
-}
-EXPORT_SYMBOL_GPL(hv_stimer_free);
-
/*
* Do a global cleanup of clockevents for the cases of kexec and
* vmbus exit
@@ -287,12 +351,17 @@ void hv_stimer_global_cleanup(void)
hv_stimer_legacy_cleanup(cpu);
}

- /*
- * If Direct Mode is enabled, the cpuhp teardown callback
- * (hv_stimer_cleanup) will be run on all CPUs to stop the
- * stimers.
- */
- hv_stimer_free();
+ if (!hv_clock_event)
+ return;
+
+ if (direct_mode_enabled) {
+ cpuhp_remove_state(CPUHP_AP_HYPERV_TIMER_STARTING);
+ hv_remove_stimer0_irq();
+ stimer0_irq = -1;
+ }
+ free_percpu(hv_clock_event);
+ hv_clock_event = NULL;
+
}
EXPORT_SYMBOL_GPL(hv_stimer_global_cleanup);

@@ -454,9 +523,14 @@ static bool __init hv_init_tsc_clocksource(void)
* Hyper-V Reference TSC rating, causing the generic TSC to be used.
* TSC_INVARIANT is not offered on ARM64, so the Hyper-V Reference
* TSC will be preferred over the virtualized ARM64 arch counter.
+ * While the Hyper-V MSR clocksource won't be used since the
+ * Reference TSC clocksource is present, change its rating as
+ * well for consistency.
*/
- if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
+ if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
hyperv_cs_tsc.rating = 250;
+ hyperv_cs_msr.rating = 250;
+ }

hv_read_reference_counter = read_hv_clock_tsc;
phys_addr = virt_to_phys(hv_get_tsc_page());
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 43dc371..69e7fe0 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -183,9 +183,4 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset,
static inline void hyperv_cleanup(void) {}
#endif /* CONFIG_HYPERV */

-#if IS_ENABLED(CONFIG_HYPERV)
-extern int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void));
-extern void hv_remove_stimer0_irq(int irq);
-#endif
-
#endif
diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h
index 34eef083..b6774aa 100644
--- a/include/clocksource/hyperv_timer.h
+++ b/include/clocksource/hyperv_timer.h
@@ -21,8 +21,7 @@
#define HV_MIN_DELTA_TICKS 1

/* Routines called by the VMbus driver */
-extern int hv_stimer_alloc(void);
-extern void hv_stimer_free(void);
+extern int hv_stimer_alloc(bool have_percpu_irqs);
extern int hv_stimer_cleanup(unsigned int cpu);
extern void hv_stimer_legacy_init(unsigned int cpu, int sint);
extern void hv_stimer_legacy_cleanup(unsigned int cpu);
--
1.8.3.1

2021-03-01 03:12:52

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH v2 09/10] clocksource/drivers/hyper-v: Set clocksource rating based on Hyper-V feature

On x86/x64, the TSC clocksource is available in a Hyper-V VM only if
Hyper-V provides the TSC_INVARIANT flag. The rating on the Hyper-V
Reference TSC page clocksource is currently set so that it will not
override the TSC clocksource in this case. Alternatively, if the TSC
clocksource is not available, then the Hyper-V clocksource is used.

But on ARM64, the Hyper-V Reference TSC page clocksource should
override the ARM arch counter, since the Hyper-V clocksource provides
scaling and offsetting during live migrations that is not provided
for the ARM arch counter.

To get the needed behavior for both x86/x64 and ARM64, tweak the
logic by defaulting the Hyper-V Reference TSC page clocksource
rating to a large value that will always override. If the Hyper-V
TSC_INVARIANT flag is set, then reduce the rating so that it will not
override the TSC.

While the logic for getting there is slightly different, the net
result in the normal cases is no functional change.

Signed-off-by: Michael Kelley <[email protected]>
---
drivers/clocksource/hyperv_timer.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 38fb396..cdb8e0c 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -302,14 +302,6 @@ void hv_stimer_global_cleanup(void)
* the other that uses the TSC reference page feature as defined in the
* TLFS. The MSR version is for compatibility with old versions of
* Hyper-V and 32-bit x86. The TSC reference page version is preferred.
- *
- * The Hyper-V clocksource ratings of 250 are chosen to be below the
- * TSC clocksource rating of 300. In configurations where Hyper-V offers
- * an InvariantTSC, the TSC is not marked "unstable", so the TSC clocksource
- * is available and preferred. With the higher rating, it will be the
- * default. On older hardware and Hyper-V versions, the TSC is marked
- * "unstable", so no TSC clocksource is created and the selected Hyper-V
- * clocksource will be the default.
*/

u64 (*hv_read_reference_counter)(void);
@@ -380,7 +372,7 @@ static int hv_cs_enable(struct clocksource *cs)

static struct clocksource hyperv_cs_tsc = {
.name = "hyperv_clocksource_tsc_page",
- .rating = 250,
+ .rating = 500,
.read = read_hv_clock_tsc_cs,
.mask = CLOCKSOURCE_MASK(64),
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
@@ -417,7 +409,7 @@ static u64 notrace read_hv_sched_clock_msr(void)

static struct clocksource hyperv_cs_msr = {
.name = "hyperv_clocksource_msr",
- .rating = 250,
+ .rating = 500,
.read = read_hv_clock_msr_cs,
.mask = CLOCKSOURCE_MASK(64),
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
@@ -455,6 +447,17 @@ static bool __init hv_init_tsc_clocksource(void)
if (hv_root_partition)
return false;

+ /*
+ * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
+ * handles frequency and offset changes due to live migration,
+ * pause/resume, and other VM management operations. So lower the
+ * Hyper-V Reference TSC rating, causing the generic TSC to be used.
+ * TSC_INVARIANT is not offered on ARM64, so the Hyper-V Reference
+ * TSC will be preferred over the virtualized ARM64 arch counter.
+ */
+ if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
+ hyperv_cs_tsc.rating = 250;
+
hv_read_reference_counter = read_hv_clock_tsc;
phys_addr = virt_to_phys(hv_get_tsc_page());

--
1.8.3.1

2021-03-01 11:33:47

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Refactor arch specific Hyper-V code

On Sun, Feb 28, 2021 at 05:15:22PM -0800, Michael Kelley wrote:
> To support Linux guests on Hyper-V on multiple architectures, the original
> approach factored out all differences between Hyper-V on x86/x64 and
> Hyper-V on ARM64 into functions or #defines under arch/x86 and
> arch/arm64. Some of these differences are truly related to the
> architecture, but others are more properly treated as Linux OS
> differences or just quirks in Hyper-V. Feedback from Arnd Bergmann[1]
> recommended that differences other than architecture should be
> incorporated into the architecture independent Hyper-V code. Each
> difference can be handled with conditions specific to the difference
> instead of tying it to the broader x86/x64 vs. ARM64. This approach
> reduces the amount of code under arch/x86 and arch/arm64 and keeps
> the non-architectural differences localized and more easily understood.
>
> This patch set implements the new approach by changing the interface
> between the architecture independent code and the architecture dependent
> code for x86/x64. The patches move code from arch/x86 to the
> architecture independent Hyper-V code whenever possible, and add
> architecture independent support needed by other architectures like
> ARM64. No functionality is changed for x86/x64. A subsequent patch
> set will provide the Hyper-V support code under arch/arm64.
>
> This patch set results in an increase in lines of code (though some
> of the increase is additional comments). But the lines needed under
> arch/arm64 in the upcoming patch set is significantly reduced, resulting
> in a net decrease of about 125 lines.
>
> [1] https://lore.kernel.org/lkml/CAK8P3a1hDBVembCd+6=ENUWYFz=72JBTFMrKYZ2aFd+_Q04F+g@mail.gmail.com/

This series looks good to me.

Given this series touches mostly Hyper-V code. I will be taking this it
via hyperv-next once the last two patches are reviewed.

Wei.

2021-03-01 14:30:12

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v2 08/10] clocksource/drivers/hyper-v: Handle sched_clock differences inline

On 01/03/2021 02:15, Michael Kelley wrote:
> While the Hyper-V Reference TSC code is architecture neutral, the
> pv_ops.time.sched_clock() function is implemented for x86/x64, but not
> for ARM64. Current code calls a utility function under arch/x86 (and
> coming, under arch/arm64) to handle the difference.
>
> Change this approach to handle the difference inline based on whether
> GENERIC_SCHED_CLOCK is present. The new approach removes code under
> arch/* since the difference is tied more to the specifics of the Linux
> implementation than to the architecture.
>
> No functional change.
>
> Signed-off-by: Michael Kelley <[email protected]>
> Reviewed-by: Boqun Feng <[email protected]>
> ---

[ ... ]

> +/*
> + * Reference to pv_ops must be inline so objtool
> + * detection of noinstr violations can work correctly.
> + */
> +static __always_inline void hv_setup_sched_clock(void *sched_clock)
> +{
> +#ifdef CONFIG_GENERIC_SCHED_CLOCK
> + /*
> + * We're on an architecture with generic sched clock (not x86/x64).
> + * The Hyper-V sched clock read function returns nanoseconds, not
> + * the normal 100ns units of the Hyper-V synthetic clock.
> + */
> + sched_clock_register(sched_clock, 64, NSEC_PER_SEC);
> +#else
> +#ifdef CONFIG_PARAVIRT
> + /* We're on x86/x64 *and* using PV ops */
> + pv_ops.time.sched_clock = sched_clock;
> +#endif
> +#endif
> +}
Please refer to:

Documentation/process/coding-style.rst

Section 21)

[ ... ]

Prefer to compile out entire functions, rather than portions of
functions or portions of expressions.

[ ... ]


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-03-02 07:10:48

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts

On 01/03/2021 02:15, Michael Kelley wrote:
> STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But
> because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt
> handling machinery is done in code under arch/x86 and Linux IRQs are
> not used. Adding support for ARM64 means adding equivalent code
> using per-cpu IRQs under arch/arm64.
>
> A better model is to treat per-cpu IRQs as the normal path (which it is
> for modern architectures), and the x86/x64 path as the exception. Do this
> by incorporating standard Linux per-cpu IRQ allocation into the main
> SITMER0 driver code, and bypass it in the x86/x64 exception case. For
> x86/x64, special case code is retained under arch/x86, but no STIMER0
> interrupt handling code is needed under arch/arm64.
>
> No functional change.
>
> Signed-off-by: Michael Kelley <[email protected]>
> ---
> arch/x86/hyperv/hv_init.c | 2 +-
> arch/x86/include/asm/mshyperv.h | 4 -
> arch/x86/kernel/cpu/mshyperv.c | 10 +--
> drivers/clocksource/hyperv_timer.c | 180 ++++++++++++++++++++++++++-----------
> include/asm-generic/mshyperv.h | 5 --
> include/clocksource/hyperv_timer.h | 3 +-
> 6 files changed, 132 insertions(+), 72 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 9af4f8a..9d10025 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -327,7 +327,7 @@ static void __init hv_stimer_setup_percpu_clockev(void)
> * Ignore any errors in setting up stimer clockevents
> * as we can run with the LAPIC timer as a fallback.
> */
> - (void)hv_stimer_alloc();
> + (void)hv_stimer_alloc(false);
>
> /*
> * Still register the LAPIC timer, because the direct-mode STIMER is
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 5433312..6d4891b 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -31,10 +31,6 @@ static inline u64 hv_get_register(unsigned int reg)
>
> void hyperv_vector_handler(struct pt_regs *regs);
>
> -static inline void hv_enable_stimer0_percpu_irq(int irq) {}
> -static inline void hv_disable_stimer0_percpu_irq(int irq) {}
> -
> -
> #if IS_ENABLED(CONFIG_HYPERV)
> extern int hyperv_init_cpuhp;
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 41fd84a..cebed53 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -90,21 +90,17 @@ void hv_remove_vmbus_handler(void)
> set_irq_regs(old_regs);
> }
>
> -int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void))
> +/* For x86/x64, override weak placeholders in hyperv_timer.c */
> +void hv_setup_stimer0_handler(void (*handler)(void))
> {
> - *vector = HYPERV_STIMER0_VECTOR;
> - *irq = -1; /* Unused on x86/x64 */
> hv_stimer0_handler = handler;
> - return 0;
> }
> -EXPORT_SYMBOL_GPL(hv_setup_stimer0_irq);
>
> -void hv_remove_stimer0_irq(int irq)
> +void hv_remove_stimer0_handler(void)
> {
> /* We have no way to deallocate the interrupt gate */
> hv_stimer0_handler = NULL;
> }
> -EXPORT_SYMBOL_GPL(hv_remove_stimer0_irq);
>
> void hv_setup_kexec_handler(void (*handler)(void))
> {
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index cdb8e0c..b2bf5e5 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -18,6 +18,9 @@
> #include <linux/sched_clock.h>
> #include <linux/mm.h>
> #include <linux/cpuhotplug.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/acpi.h>
> #include <clocksource/hyperv_timer.h>
> #include <asm/hyperv-tlfs.h>
> #include <asm/mshyperv.h>
> @@ -43,14 +46,13 @@
> */
> static bool direct_mode_enabled;
>
> -static int stimer0_irq;
> -static int stimer0_vector;
> +static int stimer0_irq = -1;
> +static long __percpu *stimer0_evt;

Why not

static DEFINE_PER_CPU(long, stimer0_evt);

no need of allocation /free ?






--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-03-02 18:26:31

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH v2 07/10] clocksource/drivers/hyper-v: Handle vDSO differences inline

From: Daniel Lezcano <[email protected]> Sent: Monday, March 1, 2021 4:22 AM
>
> On 01/03/2021 02:15, Michael Kelley wrote:
> > While the driver for the Hyper-V Reference TSC and STIMERs is architecture
> > neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code
> > calls into utility functions under arch/x86 (and coming, under arch/arm64)
> > to handle the difference.
> >
> > Change this approach to handle the difference inline based on whether
> > VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under
> > arch/* since the difference is tied more to the specifics of the Linux
> > implementation than to the architecture.
> >
> > No functional change.
>
> A suggestion below
>
>
> > Signed-off-by: Michael Kelley <[email protected]>
> > Reviewed-by: Boqun Feng <[email protected]>
> > ---
> > arch/x86/include/asm/mshyperv.h | 4 ----
> > drivers/clocksource/hyperv_timer.c | 10 ++++++++--
> > 2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index c73c127..5e5e08aa 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -372,7 +372,9 @@ static void resume_hv_clock_tsc(struct clocksource *arg)
> >
> > static int hv_cs_enable(struct clocksource *cs)
>
> static __maybe_unused int hv_cs_enable(struct clocksource *cs)
>
> > {
> > - hv_enable_vdso_clocksource();
> > +#ifdef VDSO_CLOCKMODE_HVCLOCK
> > + vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
> > +#endif
> > return 0;
> > }
> >
> > @@ -385,6 +387,11 @@ static int hv_cs_enable(struct clocksource *cs)
> > .suspend= suspend_hv_clock_tsc,
> > .resume = resume_hv_clock_tsc,
> > .enable = hv_cs_enable,
> > +#ifdef VDSO_CLOCKMODE_HVCLOCK
> > + .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
> > +#else
> > + .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
> > +#endif
>
> #ifdef VDSO_CLOCKMODE_HVCLOCK
> .enable = hv_cs_enable,
> .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
> #else
> .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
> #endif
>

Is there any particular benefit (that I might not be recognizing)
to having the .enable function be NULL vs. a function that
does nothing? I can see the handful of places where the
.enable function is invoked, and there doesn't seem to be
much difference.

In any case, I have no problem with making the change in
a v3 of the patch set.

Michael


2021-03-02 18:29:57

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH v2 10/10] clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts

From: Daniel Lezcano <[email protected]> Sent: Monday, March 1, 2021 10:45 AM
>
> On 01/03/2021 02:15, Michael Kelley wrote:
> > STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But
> > because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt
> > handling machinery is done in code under arch/x86 and Linux IRQs are
> > not used. Adding support for ARM64 means adding equivalent code
> > using per-cpu IRQs under arch/arm64.
> >
> > A better model is to treat per-cpu IRQs as the normal path (which it is
> > for modern architectures), and the x86/x64 path as the exception. Do this
> > by incorporating standard Linux per-cpu IRQ allocation into the main
> > SITMER0 driver code, and bypass it in the x86/x64 exception case. For
> > x86/x64, special case code is retained under arch/x86, but no STIMER0
> > interrupt handling code is needed under arch/arm64.
> >
> > No functional change.
> >
> > Signed-off-by: Michael Kelley <[email protected]>
> > ---
> > arch/x86/hyperv/hv_init.c | 2 +-
> > arch/x86/include/asm/mshyperv.h | 4 -
> > arch/x86/kernel/cpu/mshyperv.c | 10 +--
> > drivers/clocksource/hyperv_timer.c | 180 ++++++++++++++++++++++++++-----------
> > include/asm-generic/mshyperv.h | 5 --
> > include/clocksource/hyperv_timer.h | 3 +-
> > 6 files changed, 132 insertions(+), 72 deletions(-)
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index 9af4f8a..9d10025 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -327,7 +327,7 @@ static void __init hv_stimer_setup_percpu_clockev(void)
> > * Ignore any errors in setting up stimer clockevents
> > * as we can run with the LAPIC timer as a fallback.
> > */
> > - (void)hv_stimer_alloc();
> > + (void)hv_stimer_alloc(false);
> >
> > /*
> > * Still register the LAPIC timer, because the direct-mode STIMER is
> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> > index 5433312..6d4891b 100644
> > --- a/arch/x86/include/asm/mshyperv.h
> > +++ b/arch/x86/include/asm/mshyperv.h
> > @@ -31,10 +31,6 @@ static inline u64 hv_get_register(unsigned int reg)
> >
> > void hyperv_vector_handler(struct pt_regs *regs);
> >
> > -static inline void hv_enable_stimer0_percpu_irq(int irq) {}
> > -static inline void hv_disable_stimer0_percpu_irq(int irq) {}
> > -
> > -
> > #if IS_ENABLED(CONFIG_HYPERV)
> > extern int hyperv_init_cpuhp;
> >
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index 41fd84a..cebed53 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -90,21 +90,17 @@ void hv_remove_vmbus_handler(void)
> > set_irq_regs(old_regs);
> > }
> >
> > -int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void))
> > +/* For x86/x64, override weak placeholders in hyperv_timer.c */
> > +void hv_setup_stimer0_handler(void (*handler)(void))
> > {
> > - *vector = HYPERV_STIMER0_VECTOR;
> > - *irq = -1; /* Unused on x86/x64 */
> > hv_stimer0_handler = handler;
> > - return 0;
> > }
> > -EXPORT_SYMBOL_GPL(hv_setup_stimer0_irq);
> >
> > -void hv_remove_stimer0_irq(int irq)
> > +void hv_remove_stimer0_handler(void)
> > {
> > /* We have no way to deallocate the interrupt gate */
> > hv_stimer0_handler = NULL;
> > }
> > -EXPORT_SYMBOL_GPL(hv_remove_stimer0_irq);
> >
> > void hv_setup_kexec_handler(void (*handler)(void))
> > {
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index cdb8e0c..b2bf5e5 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -18,6 +18,9 @@
> > #include <linux/sched_clock.h>
> > #include <linux/mm.h>
> > #include <linux/cpuhotplug.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/irq.h>
> > +#include <linux/acpi.h>
> > #include <clocksource/hyperv_timer.h>
> > #include <asm/hyperv-tlfs.h>
> > #include <asm/mshyperv.h>
> > @@ -43,14 +46,13 @@
> > */
> > static bool direct_mode_enabled;
> >
> > -static int stimer0_irq;
> > -static int stimer0_vector;
> > +static int stimer0_irq = -1;
> > +static long __percpu *stimer0_evt;
>
> Why not
>
> static DEFINE_PER_CPU(long, stimer0_evt);
>
> no need of allocation /free ?
>

Indeed! I'll make that simplification in v3 of the patch set.

Michael

2021-03-03 03:14:51

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v2 07/10] clocksource/drivers/hyper-v: Handle vDSO differences inline

On 01/03/2021 02:15, Michael Kelley wrote:
> While the driver for the Hyper-V Reference TSC and STIMERs is architecture
> neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code
> calls into utility functions under arch/x86 (and coming, under arch/arm64)
> to handle the difference.
>
> Change this approach to handle the difference inline based on whether
> VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under
> arch/* since the difference is tied more to the specifics of the Linux
> implementation than to the architecture.
>
> No functional change.

A suggestion below


> Signed-off-by: Michael Kelley <[email protected]>
> Reviewed-by: Boqun Feng <[email protected]>
> ---
> arch/x86/include/asm/mshyperv.h | 4 ----
> drivers/clocksource/hyperv_timer.c | 10 ++++++++--
> 2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index c10dd1c..4f566db 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -27,10 +27,6 @@ static inline u64 hv_get_register(unsigned int reg)
> return value;
> }
>
> -#define hv_set_clocksource_vdso(val) \
> - ((val).vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK)
> -#define hv_enable_vdso_clocksource() \
> - vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
> #define hv_get_raw_timer() rdtsc_ordered()
>
> /*
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index c73c127..5e5e08aa 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -372,7 +372,9 @@ static void resume_hv_clock_tsc(struct clocksource *arg)
>
> static int hv_cs_enable(struct clocksource *cs)

static __maybe_unused int hv_cs_enable(struct clocksource *cs)

> {
> - hv_enable_vdso_clocksource();
> +#ifdef VDSO_CLOCKMODE_HVCLOCK
> + vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
> +#endif
> return 0;
> }
>
> @@ -385,6 +387,11 @@ static int hv_cs_enable(struct clocksource *cs)
> .suspend= suspend_hv_clock_tsc,
> .resume = resume_hv_clock_tsc,
> .enable = hv_cs_enable,
> +#ifdef VDSO_CLOCKMODE_HVCLOCK
> + .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
> +#else
> + .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
> +#endif

#ifdef VDSO_CLOCKMODE_HVCLOCK
.enable = hv_cs_enable,
.vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
#else
.vdso_clock_mode = VDSO_CLOCKMODE_NONE,
#endif



> };
>
> static u64 notrace read_hv_clock_msr(void)
> @@ -442,7 +449,6 @@ static bool __init hv_init_tsc_clocksource(void)
> tsc_msr = tsc_msr | 0x1 | (u64)phys_addr;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr);
>
> - hv_set_clocksource_vdso(hyperv_cs_tsc);
> clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
>
> hv_sched_clock_offset = hv_read_reference_counter();
>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-03-03 04:40:26

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v2 09/10] clocksource/drivers/hyper-v: Set clocksource rating based on Hyper-V feature

On 01/03/2021 02:15, Michael Kelley wrote:
> On x86/x64, the TSC clocksource is available in a Hyper-V VM only if
> Hyper-V provides the TSC_INVARIANT flag. The rating on the Hyper-V
> Reference TSC page clocksource is currently set so that it will not
> override the TSC clocksource in this case. Alternatively, if the TSC
> clocksource is not available, then the Hyper-V clocksource is used.
>
> But on ARM64, the Hyper-V Reference TSC page clocksource should
> override the ARM arch counter, since the Hyper-V clocksource provides
> scaling and offsetting during live migrations that is not provided
> for the ARM arch counter.
>
> To get the needed behavior for both x86/x64 and ARM64, tweak the
> logic by defaulting the Hyper-V Reference TSC page clocksource
> rating to a large value that will always override. If the Hyper-V
> TSC_INVARIANT flag is set, then reduce the rating so that it will not
> override the TSC.
>
> While the logic for getting there is slightly different, the net
> result in the normal cases is no functional change.
>
> Signed-off-by: Michael Kelley <[email protected]>

Acked-by: Daniel Lezcano <[email protected]>

[ ... ]

--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-03-04 05:38:20

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH v2 08/10] clocksource/drivers/hyper-v: Handle sched_clock differences inline

From: Daniel Lezcano <[email protected]> Sent: Monday, March 1, 2021 6:25 AM
>
> On 01/03/2021 02:15, Michael Kelley wrote:
> > While the Hyper-V Reference TSC code is architecture neutral, the
> > pv_ops.time.sched_clock() function is implemented for x86/x64, but not
> > for ARM64. Current code calls a utility function under arch/x86 (and
> > coming, under arch/arm64) to handle the difference.
> >
> > Change this approach to handle the difference inline based on whether
> > GENERIC_SCHED_CLOCK is present. The new approach removes code under
> > arch/* since the difference is tied more to the specifics of the Linux
> > implementation than to the architecture.
> >
> > No functional change.
> >
> > Signed-off-by: Michael Kelley <[email protected]>
> > Reviewed-by: Boqun Feng <[email protected]>
> > ---
>
> [ ... ]
>
> > +/*
> > + * Reference to pv_ops must be inline so objtool
> > + * detection of noinstr violations can work correctly.
> > + */
> > +static __always_inline void hv_setup_sched_clock(void *sched_clock)
> > +{
> > +#ifdef CONFIG_GENERIC_SCHED_CLOCK
> > + /*
> > + * We're on an architecture with generic sched clock (not x86/x64).
> > + * The Hyper-V sched clock read function returns nanoseconds, not
> > + * the normal 100ns units of the Hyper-V synthetic clock.
> > + */
> > + sched_clock_register(sched_clock, 64, NSEC_PER_SEC);
> > +#else
> > +#ifdef CONFIG_PARAVIRT
> > + /* We're on x86/x64 *and* using PV ops */
> > + pv_ops.time.sched_clock = sched_clock;
> > +#endif
> > +#endif
> > +}
> Please refer to:
>
> Documentation/process/coding-style.rst
>
> Section 21)
>
> [ ... ]
>
> Prefer to compile out entire functions, rather than portions of
> functions or portions of expressions.
>
> [ ... ]
>

OK. I'll rework the #ifdef in v3 of the patch set. Is the following
the preferred approach?

#ifdef CONFIG_GENERIC_SCHED_CLOCK
static __always_inline void hv_setup_sched_clock(void *sched_clock)
{
sched_clock_register(sched_clock, 64 NSEC_PER_SEC);
}
#else
#ifdef CONFIG_PARAVIRT
static __always_inline void hv_setup_sched_clock(void *sched_clock)
{
pv_ops.time.sched_clock = sched_clock:
}
#else
static __always_inline void hv_setup_sched_clock(void *sched_clock) {}
#endif
#endif

Michael

2021-03-04 06:24:17

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v2 07/10] clocksource/drivers/hyper-v: Handle vDSO differences inline

On 02/03/2021 02:29, Michael Kelley wrote:
> From: Daniel Lezcano <[email protected]> Sent: Monday, March 1, 2021 4:22 AM
>>
>> On 01/03/2021 02:15, Michael Kelley wrote:
>>> While the driver for the Hyper-V Reference TSC and STIMERs is architecture
>>> neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code
>>> calls into utility functions under arch/x86 (and coming, under arch/arm64)
>>> to handle the difference.
>>>
>>> Change this approach to handle the difference inline based on whether
>>> VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under
>>> arch/* since the difference is tied more to the specifics of the Linux
>>> implementation than to the architecture.
>>>
>>> No functional change.
>>
>> A suggestion below
>>
>>
>>> Signed-off-by: Michael Kelley <[email protected]>
>>> Reviewed-by: Boqun Feng <[email protected]>
>>> ---
>>> arch/x86/include/asm/mshyperv.h | 4 ----
>>> drivers/clocksource/hyperv_timer.c | 10 ++++++++--
>>> 2 files changed, 8 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
>>> index c73c127..5e5e08aa 100644
>>> --- a/drivers/clocksource/hyperv_timer.c
>>> +++ b/drivers/clocksource/hyperv_timer.c
>>> @@ -372,7 +372,9 @@ static void resume_hv_clock_tsc(struct clocksource *arg)
>>>
>>> static int hv_cs_enable(struct clocksource *cs)
>>
>> static __maybe_unused int hv_cs_enable(struct clocksource *cs)
>>
>>> {
>>> - hv_enable_vdso_clocksource();
>>> +#ifdef VDSO_CLOCKMODE_HVCLOCK
>>> + vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
>>> +#endif
>>> return 0;
>>> }
>>>
>>> @@ -385,6 +387,11 @@ static int hv_cs_enable(struct clocksource *cs)
>>> .suspend= suspend_hv_clock_tsc,
>>> .resume = resume_hv_clock_tsc,
>>> .enable = hv_cs_enable,
>>> +#ifdef VDSO_CLOCKMODE_HVCLOCK
>>> + .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
>>> +#else
>>> + .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
>>> +#endif
>>
>> #ifdef VDSO_CLOCKMODE_HVCLOCK
>> .enable = hv_cs_enable,
>> .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
>> #else
>> .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
>> #endif
>>
>
> Is there any particular benefit (that I might not be recognizing)
> to having the .enable function be NULL vs. a function that
> does nothing? I can see the handful of places where the
> .enable function is invoked, and there doesn't seem to be
> much difference.
>
> In any case, I have no problem with making the change in
> a v3 of the patch set.

It is just coding style, it allows to remove a #ifdef in the code.



--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-03-04 06:24:59

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 01/10] Drivers: hv: vmbus: Move Hyper-V page allocator to arch neutral code

Michael Kelley <[email protected]> writes:

> The Hyper-V page allocator functions are implemented in an architecture
> neutral way. Move them into the architecture neutral VMbus module so
> a separate implementation for ARM64 is not needed.
>
> No functional change.
>
> Signed-off-by: Michael Kelley <[email protected]>
> Reviewed-by: Boqun Feng <[email protected]>
> ---
> arch/x86/hyperv/hv_init.c | 22 ----------------------
> arch/x86/include/asm/mshyperv.h | 5 -----
> drivers/hv/hv.c | 36 ++++++++++++++++++++++++++++++++++++
> include/asm-generic/mshyperv.h | 4 ++++
> 4 files changed, 40 insertions(+), 27 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index b81047d..4bdb344 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -54,28 +54,6 @@
> u32 hv_max_vp_index;
> EXPORT_SYMBOL_GPL(hv_max_vp_index);
>
> -void *hv_alloc_hyperv_page(void)
> -{
> - BUILD_BUG_ON(PAGE_SIZE != HV_HYP_PAGE_SIZE);
> -
> - return (void *)__get_free_page(GFP_KERNEL);
> -}
> -EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
> -
> -void *hv_alloc_hyperv_zeroed_page(void)
> -{
> - BUILD_BUG_ON(PAGE_SIZE != HV_HYP_PAGE_SIZE);
> -
> - return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
> -}
> -EXPORT_SYMBOL_GPL(hv_alloc_hyperv_zeroed_page);
> -
> -void hv_free_hyperv_page(unsigned long addr)
> -{
> - free_page(addr);
> -}
> -EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
> -
> static int hv_cpu_init(unsigned int cpu)
> {
> u64 msr_vp_index;
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index ccf60a8..ef6e968 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -233,9 +233,6 @@ static inline struct hv_vp_assist_page *hv_get_vp_assist_page(unsigned int cpu)
>
> void __init hyperv_init(void);
> void hyperv_setup_mmu_ops(void);
> -void *hv_alloc_hyperv_page(void);
> -void *hv_alloc_hyperv_zeroed_page(void);
> -void hv_free_hyperv_page(unsigned long addr);
> void set_hv_tscchange_cb(void (*cb)(void));
> void clear_hv_tscchange_cb(void);
> void hyperv_stop_tsc_emulation(void);
> @@ -272,8 +269,6 @@ int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
> #else /* CONFIG_HYPERV */
> static inline void hyperv_init(void) {}
> static inline void hyperv_setup_mmu_ops(void) {}
> -static inline void *hv_alloc_hyperv_page(void) { return NULL; }
> -static inline void hv_free_hyperv_page(unsigned long addr) {}
> static inline void set_hv_tscchange_cb(void (*cb)(void)) {}
> static inline void clear_hv_tscchange_cb(void) {}
> static inline void hyperv_stop_tsc_emulation(void) {};
> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> index f202ac7..cca8d5e 100644
> --- a/drivers/hv/hv.c
> +++ b/drivers/hv/hv.c
> @@ -37,6 +37,42 @@ int hv_init(void)
> }
>
> /*
> + * Functions for allocating and freeing memory with size and
> + * alignment HV_HYP_PAGE_SIZE. These functions are needed because
> + * the guest page size may not be the same as the Hyper-V page
> + * size. We depend upon kmalloc() aligning power-of-two size
> + * allocations to the allocation size boundary, so that the
> + * allocated memory appears to Hyper-V as a page of the size
> + * it expects.
> + */
> +
> +void *hv_alloc_hyperv_page(void)
> +{
> + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE);
> +
> + if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
> + return (void *)__get_free_page(GFP_KERNEL);
> + else
> + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);

PAGE_SIZE and HV_HYP_PAGE_SIZE are known compile-time and in case this
won't change in the future we can probably write this as

#if PAGE_SIZE == HV_HYP_PAGE_SIZE
return (void *)__get_free_page(GFP_KERNEL);
#else
return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
#endif

(not sure if the output is going to be any different with e.g. gcc's '-O2')

> +}
> +
> +void *hv_alloc_hyperv_zeroed_page(void)
> +{
> + if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
> + return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
> + else
> + return kzalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
> +}
> +
> +void hv_free_hyperv_page(unsigned long addr)
> +{
> + if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
> + free_page(addr);
> + else
> + kfree((void *)addr);
> +}
> +
> +/*
> * hv_post_message - Post a message using the hypervisor message IPC.
> *
> * This involves a hypercall.
> diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
> index dff58a3..694b5bc 100644
> --- a/include/asm-generic/mshyperv.h
> +++ b/include/asm-generic/mshyperv.h
> @@ -117,6 +117,10 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
> /* Sentinel value for an uninitialized entry in hv_vp_index array */
> #define VP_INVAL U32_MAX
>
> +void *hv_alloc_hyperv_page(void);
> +void *hv_alloc_hyperv_zeroed_page(void);
> +void hv_free_hyperv_page(unsigned long addr);
> +
> /**
> * hv_cpu_number_to_vp_number() - Map CPU to VP.
> * @cpu_number: CPU number in Linux terms

--
Vitaly

2021-03-04 06:27:35

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v2 08/10] clocksource/drivers/hyper-v: Handle sched_clock differences inline

On 02/03/2021 02:38, Michael Kelley wrote:
> From: Daniel Lezcano <[email protected]> Sent: Monday, March 1, 2021 6:25 AM
>>
>> On 01/03/2021 02:15, Michael Kelley wrote:
>>> While the Hyper-V Reference TSC code is architecture neutral, the
>>> pv_ops.time.sched_clock() function is implemented for x86/x64, but not
>>> for ARM64. Current code calls a utility function under arch/x86 (and
>>> coming, under arch/arm64) to handle the difference.
>>>
>>> Change this approach to handle the difference inline based on whether
>>> GENERIC_SCHED_CLOCK is present. The new approach removes code under
>>> arch/* since the difference is tied more to the specifics of the Linux
>>> implementation than to the architecture.
>>>
>>> No functional change.
>>>
>>> Signed-off-by: Michael Kelley <[email protected]>
>>> Reviewed-by: Boqun Feng <[email protected]>
>>> ---
>>
>> [ ... ]
>>
>>> +/*
>>> + * Reference to pv_ops must be inline so objtool
>>> + * detection of noinstr violations can work correctly.
>>> + */
>>> +static __always_inline void hv_setup_sched_clock(void *sched_clock)
>>> +{
>>> +#ifdef CONFIG_GENERIC_SCHED_CLOCK
>>> + /*
>>> + * We're on an architecture with generic sched clock (not x86/x64).
>>> + * The Hyper-V sched clock read function returns nanoseconds, not
>>> + * the normal 100ns units of the Hyper-V synthetic clock.
>>> + */
>>> + sched_clock_register(sched_clock, 64, NSEC_PER_SEC);
>>> +#else
>>> +#ifdef CONFIG_PARAVIRT
>>> + /* We're on x86/x64 *and* using PV ops */
>>> + pv_ops.time.sched_clock = sched_clock;
>>> +#endif
>>> +#endif
>>> +}
>> Please refer to:
>>
>> Documentation/process/coding-style.rst
>>
>> Section 21)
>>
>> [ ... ]
>>
>> Prefer to compile out entire functions, rather than portions of
>> functions or portions of expressions.
>>
>> [ ... ]
>>
>
> OK. I'll rework the #ifdef in v3 of the patch set. Is the following
> the preferred approach?

Yes but with an indentation and comment to describe the section end.

eg.

#ifdef A
#else
# ifdef B
...
# else
# endif /* B */
#endif /* A */

> #ifdef CONFIG_GENERIC_SCHED_CLOCK
> static __always_inline void hv_setup_sched_clock(void *sched_clock)
> {
> sched_clock_register(sched_clock, 64 NSEC_PER_SEC);
> }
> #else
> #ifdef CONFIG_PARAVIRT
> static __always_inline void hv_setup_sched_clock(void *sched_clock)
> {
> pv_ops.time.sched_clock = sched_clock:
> }
> #else
> static __always_inline void hv_setup_sched_clock(void *sched_clock) {}
> #endif
> #endif

Or

#ifdef CONFIG_GENERIC_SCHED_CLOCK
...
#elif defined CONFIG_PARAVIRT
...
#else /* !CONFIG_GENERIC_SCHED_CLOCK && !CONFIG_PARAVIRT */
...
#endif /* CONFIG_GENERIC_SCHED_CLOCK */



--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2021-03-04 20:27:45

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 01/10] Drivers: hv: vmbus: Move Hyper-V page allocator to arch neutral code

From: Vitaly Kuznetsov <[email protected]> Sent: Tuesday, March 2, 2021 4:57 AM
>
> Michael Kelley <[email protected]> writes:
>
> > The Hyper-V page allocator functions are implemented in an architecture
> > neutral way. Move them into the architecture neutral VMbus module so
> > a separate implementation for ARM64 is not needed.
> >
> > No functional change.
> >
> > Signed-off-by: Michael Kelley <[email protected]>
> > Reviewed-by: Boqun Feng <[email protected]>
> > ---
> > arch/x86/hyperv/hv_init.c | 22 ----------------------
> > arch/x86/include/asm/mshyperv.h | 5 -----
> > drivers/hv/hv.c | 36 ++++++++++++++++++++++++++++++++++++
> > include/asm-generic/mshyperv.h | 4 ++++
> > 4 files changed, 40 insertions(+), 27 deletions(-)
> >

[snip]

> >
> > /*
> > + * Functions for allocating and freeing memory with size and
> > + * alignment HV_HYP_PAGE_SIZE. These functions are needed because
> > + * the guest page size may not be the same as the Hyper-V page
> > + * size. We depend upon kmalloc() aligning power-of-two size
> > + * allocations to the allocation size boundary, so that the
> > + * allocated memory appears to Hyper-V as a page of the size
> > + * it expects.
> > + */
> > +
> > +void *hv_alloc_hyperv_page(void)
> > +{
> > + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE);
> > +
> > + if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
> > + return (void *)__get_free_page(GFP_KERNEL);
> > + else
> > + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
>
> PAGE_SIZE and HV_HYP_PAGE_SIZE are known compile-time and in case this
> won't change in the future we can probably write this as
>
> #if PAGE_SIZE == HV_HYP_PAGE_SIZE
> return (void *)__get_free_page(GFP_KERNEL);
> #else
> return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
> #endif
>
> (not sure if the output is going to be any different with e.g. gcc's '-O2')
>

I looked at the generated code, and the compiler does the right
thing on both x86/x64 and on ARM64. I'd rather leave the code
as is so that both legs of the 'if' statement get checked by the
compiler regardless of whether PAGE_SIZE == HV_HYP_PAGE_SIZE.

Michael

> > +}
> > +
> > +void *hv_alloc_hyperv_zeroed_page(void)
> > +{
> > + if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
> > + return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
> > + else
> > + return kzalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
> > +}
> > +
> > +void hv_free_hyperv_page(unsigned long addr)
> > +{
> > + if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
> > + free_page(addr);
> > + else
> > + kfree((void *)addr);
> > +}
> > +
> > +/*