2022-11-01 17:35:55

by Stanislav Kinsburskii

[permalink] [raw]
Subject: [PATCH 0/4] hyper-v: Introduce TSC page for root partition

This series does some cleanup and precursor changes to the hyper-v clock
source and introduces support for TSC page based clock in root partition.

Hyper-V root partition requires kernel to map the page, specified by the
hypervisor (instead of provide the PFN to the hypervisor like it's done in
guest partition).

The following series implements...

---

Stanislav Kinsburskiy (4):
drivers/clocksource/hyper-v: Introduce a pointer to TSC page
drivers/clocksource/hyper-v: Introduce TSC MSR register structure
drivers/clocksource/hyper-v: Use TSC PFN getter to map vvar page
drivers/clocksource/hyper-v: Add TSC page support for root partition


arch/x86/entry/vdso/vma.c | 7 ++---
arch/x86/hyperv/hv_init.c | 2 +
drivers/clocksource/hyperv_timer.c | 51 ++++++++++++++++++++++++++----------
include/clocksource/hyperv_timer.h | 7 +++++
4 files changed, 49 insertions(+), 18 deletions(-)



2022-11-01 17:51:43

by Stanislav Kinsburskii

[permalink] [raw]
Subject: [PATCH 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

From: Stanislav Kinsburskiy <[email protected]>

It hyper-v root partition guest has to map the page, specified by the
hypervisor (instead of providing the page to the hypervisor like it's done in
the guest partitions).
However, it's too early to map the page when the clock is initialized, so, the
actual mapping is happening later.

Signed-off-by: Stanislav Kinsburskiy <[email protected]>
CC: "K. Y. Srinivasan" <[email protected]>
CC: Haiyang Zhang <[email protected]>
CC: Wei Liu <[email protected]>
CC: Dexuan Cui <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: Dave Hansen <[email protected]>
CC: [email protected]
CC: "H. Peter Anvin" <[email protected]>
CC: Daniel Lezcano <[email protected]>
CC: [email protected]
CC: [email protected]
---
arch/x86/hyperv/hv_init.c | 2 ++
drivers/clocksource/hyperv_timer.c | 34 +++++++++++++++++++++++++---------
include/clocksource/hyperv_timer.h | 1 +
3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index f49bc3ec76e6..89954490af93 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -464,6 +464,8 @@ void __init hyperv_init(void)
BUG_ON(!src);
memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
memunmap(src);
+
+ hv_remap_tsc_clocksource();
} else {
hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 635c14c1e3bf..4118e4bc9194 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
return false;

- if (hv_root_partition)
- return false;
-
/*
* If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
* handles frequency and offset changes due to live migration,
@@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
}

hv_read_reference_counter = read_hv_clock_tsc;
- tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));

/*
- * The Hyper-V TLFS specifies to preserve the value of reserved
- * bits in registers. So read the existing value, preserve the
- * low order 12 bits, and add in the guest physical address
- * (which already has at least the low 12 bits set to zero since
- * it is page aligned). Also set the "enable" bit, which is bit 0.
+ * TSC page mapping works differently in root and guest partitions.
+ * - In guest partition the guest PFN has to be passed to the
+ * hypervisor.
+ * - In root partition it's other way around: the guest has to map the
+ * PFN, provided by the hypervisor.
+ * But it can't be mapped right here as it's too early and MMU isn't
+ * ready yet. So, we only set the enable bit here and will remap the
+ * page later in hv_remap_tsc_clocksource().
*/
tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
+ if (hv_root_partition)
+ tsc_pfn = tsc_msr.pfn;
+ else
+ tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
tsc_msr.enable = 1;
tsc_msr.pfn = tsc_pfn;
hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
@@ -572,3 +575,16 @@ void __init hv_init_clocksource(void)
hv_sched_clock_offset = hv_read_reference_counter();
hv_setup_sched_clock(read_hv_sched_clock_msr);
}
+
+void __init hv_remap_tsc_clocksource(void)
+{
+ if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
+ return;
+
+ if (!hv_root_partition)
+ return;
+
+ tsc_page = memremap(__pfn_to_phys(tsc_pfn), PAGE_SIZE, MEMREMAP_WB);
+ if (!tsc_page)
+ pr_err("Failed to remap Hyper-V TSC page.\n");
+}
diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h
index 3078d23faaea..783701a2102d 100644
--- a/include/clocksource/hyperv_timer.h
+++ b/include/clocksource/hyperv_timer.h
@@ -31,6 +31,7 @@ extern void hv_stimer_global_cleanup(void);
extern void hv_stimer0_isr(void);

extern void hv_init_clocksource(void);
+extern void hv_remap_tsc_clocksource(void);

extern unsigned long hv_get_tsc_pfn(void);
extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);



2022-11-01 18:10:33

by Stanislav Kinsburskii

[permalink] [raw]
Subject: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

From: Stanislav Kinsburskiy <[email protected]>

And rework the code to use it instead of the physical address.
This is a cleanup and precursor patch for upcoming support for TSC page
mapping into hyper-v root partition.

Signed-off-by: Stanislav Kinsburskiy <[email protected]>
CC: "K. Y. Srinivasan" <[email protected]>
CC: Haiyang Zhang <[email protected]>
CC: Wei Liu <[email protected]>
CC: Dexuan Cui <[email protected]>
CC: Daniel Lezcano <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: [email protected]
CC: [email protected]
---
drivers/clocksource/hyperv_timer.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index c4dbf40a3d3e..d447bc99a399 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -367,6 +367,12 @@ static union {
} tsc_pg __aligned(PAGE_SIZE);

static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
+static unsigned long tsc_pfn;
+
+static unsigned long hv_get_tsc_pfn(void)
+{
+ return tsc_pfn;
+}

struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
@@ -408,13 +414,12 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)

static void resume_hv_clock_tsc(struct clocksource *arg)
{
- phys_addr_t phys_addr = virt_to_phys(tsc_page);
union hv_reference_tsc_msr tsc_msr;

/* Re-enable the TSC page */
tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
tsc_msr.enable = 1;
- tsc_msr.pfn = __phys_to_pfn(phys_addr);
+ tsc_msr.pfn = tsc_pfn;
hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
}

@@ -498,7 +503,6 @@ static __always_inline void hv_setup_sched_clock(void *sched_clock) {}
static bool __init hv_init_tsc_clocksource(void)
{
union hv_reference_tsc_msr tsc_msr;
- phys_addr_t phys_addr;

if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
return false;
@@ -523,7 +527,7 @@ static bool __init hv_init_tsc_clocksource(void)
}

hv_read_reference_counter = read_hv_clock_tsc;
- phys_addr = virt_to_phys(hv_get_tsc_page());
+ tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));

/*
* The Hyper-V TLFS specifies to preserve the value of reserved
@@ -534,7 +538,7 @@ static bool __init hv_init_tsc_clocksource(void)
*/
tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
tsc_msr.enable = 1;
- tsc_msr.pfn = __phys_to_pfn(phys_addr);
+ tsc_msr.pfn = tsc_pfn;
hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);

clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);



2022-11-01 18:13:28

by Stanislav Kinsburskii

[permalink] [raw]
Subject: [PATCH 3/4] drivers/clocksource/hyper-v: Use TSC PFN getter to map vvar page

From: Stanislav Kinsburskiy <[email protected]>

Instead of converting the virtual address to physical directly.
This is a precursor patch for the upcoming support for TSC page mapping into
hyper-v root partition, which address will be defined by the hypervisor and
mapped into the kernel.

Signed-off-by: Stanislav Kinsburskiy <[email protected]>
CC: Andy Lutomirski <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: Dave Hansen <[email protected]>
CC: [email protected]
CC: "H. Peter Anvin" <[email protected]>
CC: "K. Y. Srinivasan" <[email protected]>
CC: Haiyang Zhang <[email protected]>
CC: Wei Liu <[email protected]>
CC: Dexuan Cui <[email protected]>
CC: Daniel Lezcano <[email protected]>
CC: [email protected]
CC: [email protected]
---
arch/x86/entry/vdso/vma.c | 7 +++----
drivers/clocksource/hyperv_timer.c | 3 ++-
include/clocksource/hyperv_timer.h | 6 ++++++
3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 311eae30e089..6976416b2c9f 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -210,11 +210,10 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
pgprot_decrypted(vma->vm_page_prot));
}
} else if (sym_offset == image->sym_hvclock_page) {
- struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
+ pfn = hv_get_tsc_pfn();

- if (tsc_pg && vclock_was_used(VDSO_CLOCKMODE_HVCLOCK))
- return vmf_insert_pfn(vma, vmf->address,
- virt_to_phys(tsc_pg) >> PAGE_SHIFT);
+ if (pfn && vclock_was_used(VDSO_CLOCKMODE_HVCLOCK))
+ return vmf_insert_pfn(vma, vmf->address, pfn);
} else if (sym_offset == image->sym_timens_page) {
struct page *timens_page = find_timens_vvar_page(vma);

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index d447bc99a399..635c14c1e3bf 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -369,10 +369,11 @@ static union {
static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
static unsigned long tsc_pfn;

-static unsigned long hv_get_tsc_pfn(void)
+unsigned long hv_get_tsc_pfn(void)
{
return tsc_pfn;
}
+EXPORT_SYMBOL_GPL(hv_get_tsc_pfn);

struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h
index b3f5d73ae1d6..3078d23faaea 100644
--- a/include/clocksource/hyperv_timer.h
+++ b/include/clocksource/hyperv_timer.h
@@ -32,6 +32,7 @@ extern void hv_stimer0_isr(void);

extern void hv_init_clocksource(void);

+extern unsigned long hv_get_tsc_pfn(void);
extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);

static inline notrace u64
@@ -90,6 +91,11 @@ hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
}

#else /* CONFIG_HYPERV_TIMER */
+static inline unsigned long hv_get_tsc_pfn(void)
+{
+ return 0;
+}
+
static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
return NULL;



2022-11-01 18:23:38

by Stanislav Kinsburskii

[permalink] [raw]
Subject: [PATCH 1/4] drivers/clocksource/hyper-v: Introduce a pointer to TSC page

From: Stanislav Kinsburskiy <[email protected]>

Will be used later keep the address of the remapped page for the root
partition.

Signed-off-by: Stanislav Kinsburskiy <[email protected]>
CC: "K. Y. Srinivasan" <[email protected]>
CC: Haiyang Zhang <[email protected]>
CC: Wei Liu <[email protected]>
CC: Dexuan Cui <[email protected]>
CC: Daniel Lezcano <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: [email protected]
CC: [email protected]
---
drivers/clocksource/hyperv_timer.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 11332c82d1af..c4dbf40a3d3e 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -366,9 +366,11 @@ static union {
u8 reserved[PAGE_SIZE];
} tsc_pg __aligned(PAGE_SIZE);

+static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
+
struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
- return &tsc_pg.page;
+ return tsc_page;
}
EXPORT_SYMBOL_GPL(hv_get_tsc_page);

@@ -406,7 +408,7 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)

static void resume_hv_clock_tsc(struct clocksource *arg)
{
- phys_addr_t phys_addr = virt_to_phys(&tsc_pg);
+ phys_addr_t phys_addr = virt_to_phys(tsc_page);
union hv_reference_tsc_msr tsc_msr;

/* Re-enable the TSC page */



2022-11-02 13:32:14

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH 3/4] drivers/clocksource/hyper-v: Use TSC PFN getter to map vvar page

On Tue, Nov 01, 2022 at 05:31:15PM +0000, Stanislav Kinsburskii wrote:
> From: Stanislav Kinsburskiy <[email protected]>
>
> Instead of converting the virtual address to physical directly.
> This is a precursor patch for the upcoming support for TSC page mapping into
> hyper-v root partition, which address will be defined by the hypervisor and

Please use "Microsoft Hypervisor" instead of "hyper-v".

Also, it should be "Hyper-V" not "hyper-v" in the future.

> mapped into the kernel.
>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: Andy Lutomirski <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Borislav Petkov <[email protected]>
> CC: Dave Hansen <[email protected]>
> CC: [email protected]
> CC: "H. Peter Anvin" <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> arch/x86/entry/vdso/vma.c | 7 +++----
> drivers/clocksource/hyperv_timer.c | 3 ++-
> include/clocksource/hyperv_timer.h | 6 ++++++
> 3 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
> index 311eae30e089..6976416b2c9f 100644
> --- a/arch/x86/entry/vdso/vma.c
> +++ b/arch/x86/entry/vdso/vma.c
> @@ -210,11 +210,10 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
> pgprot_decrypted(vma->vm_page_prot));
> }
> } else if (sym_offset == image->sym_hvclock_page) {
> - struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
> + pfn = hv_get_tsc_pfn();
>
> - if (tsc_pg && vclock_was_used(VDSO_CLOCKMODE_HVCLOCK))
> - return vmf_insert_pfn(vma, vmf->address,
> - virt_to_phys(tsc_pg) >> PAGE_SHIFT);
> + if (pfn && vclock_was_used(VDSO_CLOCKMODE_HVCLOCK))
> + return vmf_insert_pfn(vma, vmf->address, pfn);

hv_get_tsc_pfn() can return 0. You will insert PFN 0 into the page
table. I think you should check if the PFN is valid.

> } else if (sym_offset == image->sym_timens_page) {
> struct page *timens_page = find_timens_vvar_page(vma);
>
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index d447bc99a399..635c14c1e3bf 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -369,10 +369,11 @@ static union {
> static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
> static unsigned long tsc_pfn;
>
> -static unsigned long hv_get_tsc_pfn(void)
> +unsigned long hv_get_tsc_pfn(void)
> {
> return tsc_pfn;
> }
> +EXPORT_SYMBOL_GPL(hv_get_tsc_pfn);
>
> struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
> {
> diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h
> index b3f5d73ae1d6..3078d23faaea 100644
> --- a/include/clocksource/hyperv_timer.h
> +++ b/include/clocksource/hyperv_timer.h
> @@ -32,6 +32,7 @@ extern void hv_stimer0_isr(void);
>
> extern void hv_init_clocksource(void);
>
> +extern unsigned long hv_get_tsc_pfn(void);
> extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
>
> static inline notrace u64
> @@ -90,6 +91,11 @@ hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
> }
>
> #else /* CONFIG_HYPERV_TIMER */
> +static inline unsigned long hv_get_tsc_pfn(void)
> +{
> + return 0;
> +}
> +
> static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
> {
> return NULL;
>
>

2022-11-02 13:33:05

by Anirudh Rayabharam

[permalink] [raw]
Subject: Re: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

Please update the patch title since this doesn't introduce the TSC MSR
register structure.

Thanks,
Anirudh.

On Tue, Nov 01, 2022 at 05:31:09PM +0000, Stanislav Kinsburskii wrote:
> From: Stanislav Kinsburskiy <[email protected]>
>
> And rework the code to use it instead of the physical address.
> This is a cleanup and precursor patch for upcoming support for TSC page
> mapping into hyper-v root partition.
>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> drivers/clocksource/hyperv_timer.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index c4dbf40a3d3e..d447bc99a399 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -367,6 +367,12 @@ static union {
> } tsc_pg __aligned(PAGE_SIZE);
>
> static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
> +static unsigned long tsc_pfn;
> +
> +static unsigned long hv_get_tsc_pfn(void)
> +{
> + return tsc_pfn;
> +}
>
> struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
> {
> @@ -408,13 +414,12 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)
>
> static void resume_hv_clock_tsc(struct clocksource *arg)
> {
> - phys_addr_t phys_addr = virt_to_phys(tsc_page);
> union hv_reference_tsc_msr tsc_msr;
>
> /* Re-enable the TSC page */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> tsc_msr.enable = 1;
> - tsc_msr.pfn = __phys_to_pfn(phys_addr);
> + tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> }
>
> @@ -498,7 +503,6 @@ static __always_inline void hv_setup_sched_clock(void *sched_clock) {}
> static bool __init hv_init_tsc_clocksource(void)
> {
> union hv_reference_tsc_msr tsc_msr;
> - phys_addr_t phys_addr;
>
> if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> return false;
> @@ -523,7 +527,7 @@ static bool __init hv_init_tsc_clocksource(void)
> }
>
> hv_read_reference_counter = read_hv_clock_tsc;
> - phys_addr = virt_to_phys(hv_get_tsc_page());
> + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> /*
> * The Hyper-V TLFS specifies to preserve the value of reserved
> @@ -534,7 +538,7 @@ static bool __init hv_init_tsc_clocksource(void)
> */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> tsc_msr.enable = 1;
> - tsc_msr.pfn = __phys_to_pfn(phys_addr);
> + tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
>
> clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
>

2022-11-02 13:41:05

by Anirudh Rayabharam

[permalink] [raw]
Subject: Re: [PATCH 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

On Tue, Nov 01, 2022 at 05:31:20PM +0000, Stanislav Kinsburskii wrote:
> From: Stanislav Kinsburskiy <[email protected]>
>
> It hyper-v root partition guest has to map the page, specified by the
> hypervisor (instead of providing the page to the hypervisor like it's done in
> the guest partitions).
> However, it's too early to map the page when the clock is initialized, so, the
> actual mapping is happening later.
>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Borislav Petkov <[email protected]>
> CC: Dave Hansen <[email protected]>
> CC: [email protected]
> CC: "H. Peter Anvin" <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> arch/x86/hyperv/hv_init.c | 2 ++
> drivers/clocksource/hyperv_timer.c | 34 +++++++++++++++++++++++++---------
> include/clocksource/hyperv_timer.h | 1 +
> 3 files changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index f49bc3ec76e6..89954490af93 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -464,6 +464,8 @@ void __init hyperv_init(void)
> BUG_ON(!src);
> memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
> memunmap(src);
> +
> + hv_remap_tsc_clocksource();
> } else {
> hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index 635c14c1e3bf..4118e4bc9194 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
> if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> return false;
>
> - if (hv_root_partition)
> - return false;
> -
> /*
> * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
> * handles frequency and offset changes due to live migration,
> @@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
> }
>
> hv_read_reference_counter = read_hv_clock_tsc;
> - tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> /*
> - * The Hyper-V TLFS specifies to preserve the value of reserved
> - * bits in registers. So read the existing value, preserve the
> - * low order 12 bits, and add in the guest physical address
> - * (which already has at least the low 12 bits set to zero since
> - * it is page aligned). Also set the "enable" bit, which is bit 0.
> + * TSC page mapping works differently in root and guest partitions.
> + * - In guest partition the guest PFN has to be passed to the
> + * hypervisor.
> + * - In root partition it's other way around: the guest has to map the
> + * PFN, provided by the hypervisor.
> + * But it can't be mapped right here as it's too early and MMU isn't
> + * ready yet. So, we only set the enable bit here and will remap the
> + * page later in hv_remap_tsc_clocksource().
> */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> + if (hv_root_partition)
> + tsc_pfn = tsc_msr.pfn;

Why store the PFN like this? While mapping the page it can be read from the
MSR. Once the tsc page is mapped it can by obtained by
__phys_to_pfn(virt_to_phys(tsc_page)).

> + else
> + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
> tsc_msr.enable = 1;
> tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> @@ -572,3 +575,16 @@ void __init hv_init_clocksource(void)
> hv_sched_clock_offset = hv_read_reference_counter();
> hv_setup_sched_clock(read_hv_sched_clock_msr);
> }
> +
> +void __init hv_remap_tsc_clocksource(void)
> +{
> + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> + return;
> +
> + if (!hv_root_partition)

Perhaps this should WARN()?

Thanks,
Anirudh.

> + return;
> +
> + tsc_page = memremap(__pfn_to_phys(tsc_pfn), PAGE_SIZE, MEMREMAP_WB);
> + if (!tsc_page)
> + pr_err("Failed to remap Hyper-V TSC page.\n");
> +}
> diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h
> index 3078d23faaea..783701a2102d 100644
> --- a/include/clocksource/hyperv_timer.h
> +++ b/include/clocksource/hyperv_timer.h
> @@ -31,6 +31,7 @@ extern void hv_stimer_global_cleanup(void);
> extern void hv_stimer0_isr(void);
>
> extern void hv_init_clocksource(void);
> +extern void hv_remap_tsc_clocksource(void);
>
> extern unsigned long hv_get_tsc_pfn(void);
> extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
>

2022-11-02 19:15:55

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

From: Stanislav Kinsburskii <[email protected]> Sent: Tuesday, November 1, 2022 10:31 AM
>
> And rework the code to use it instead of the physical address.
> This is a cleanup and precursor patch for upcoming support for TSC page
> mapping into hyper-v root partition.
>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> drivers/clocksource/hyperv_timer.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index c4dbf40a3d3e..d447bc99a399 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -367,6 +367,12 @@ static union {
> } tsc_pg __aligned(PAGE_SIZE);
>
> static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
> +static unsigned long tsc_pfn;
> +
> +static unsigned long hv_get_tsc_pfn(void)
> +{
> + return tsc_pfn;
> +}

It makes sense to have the tsc_page global variable so that we can
handle the root partition and guest partition cases with common code,
even though the TSC page memory originates differently in the two cases.

But do we also need a tsc_pfn global variable and getter function? When
the PFN is needed, conversion from the tsc_page virtual address to the PFN
isn't hard, and such a conversion is needed in only a couple of places. To me,
it's simpler to keep a single global variable and getter function (i.e.,
hv_get_tsc_page), and do the conversions where needed. Adding tsc_pfn
and the getter function introduces a fair amount of code churn for not much
benefit. It's a judgment call, but that's my $.02.

I think this may be the same as what Anirudh is saying in his comments on
Patch 4/4 in this series.

Michael

>
> struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
> {
> @@ -408,13 +414,12 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)
>
> static void resume_hv_clock_tsc(struct clocksource *arg)
> {
> - phys_addr_t phys_addr = virt_to_phys(tsc_page);
> union hv_reference_tsc_msr tsc_msr;
>
> /* Re-enable the TSC page */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> tsc_msr.enable = 1;
> - tsc_msr.pfn = __phys_to_pfn(phys_addr);
> + tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> }
>
> @@ -498,7 +503,6 @@ static __always_inline void hv_setup_sched_clock(void
> *sched_clock) {}
> static bool __init hv_init_tsc_clocksource(void)
> {
> union hv_reference_tsc_msr tsc_msr;
> - phys_addr_t phys_addr;
>
> if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> return false;
> @@ -523,7 +527,7 @@ static bool __init hv_init_tsc_clocksource(void)
> }
>
> hv_read_reference_counter = read_hv_clock_tsc;
> - phys_addr = virt_to_phys(hv_get_tsc_page());
> + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> /*
> * The Hyper-V TLFS specifies to preserve the value of reserved
> @@ -534,7 +538,7 @@ static bool __init hv_init_tsc_clocksource(void)
> */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> tsc_msr.enable = 1;
> - tsc_msr.pfn = __phys_to_pfn(phys_addr);
> + tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
>
> clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
>

2022-11-02 19:40:37

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

From: Stanislav Kinsburskii <[email protected]> Sent: Tuesday, November 1, 2022 10:31 AM
>
> It hyper-v root partition guest has to map the page, specified by the
> hypervisor (instead of providing the page to the hypervisor like it's done in
> the guest partitions).
> However, it's too early to map the page when the clock is initialized, so, the
> actual mapping is happening later.
>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Borislav Petkov <[email protected]>
> CC: Dave Hansen <[email protected]>
> CC: [email protected]
> CC: "H. Peter Anvin" <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> arch/x86/hyperv/hv_init.c | 2 ++
> drivers/clocksource/hyperv_timer.c | 34 +++++++++++++++++++++++++---------
> include/clocksource/hyperv_timer.h | 1 +
> 3 files changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index f49bc3ec76e6..89954490af93 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -464,6 +464,8 @@ void __init hyperv_init(void)
> BUG_ON(!src);
> memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
> memunmap(src);
> +
> + hv_remap_tsc_clocksource();
> } else {
> hypercall_msr.guest_physical_address =
> vmalloc_to_pfn(hv_hypercall_pg);
> wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index 635c14c1e3bf..4118e4bc9194 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
> if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> return false;
>
> - if (hv_root_partition)
> - return false;
> -
> /*
> * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
> * handles frequency and offset changes due to live migration,
> @@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
> }
>
> hv_read_reference_counter = read_hv_clock_tsc;
> - tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> /*
> - * The Hyper-V TLFS specifies to preserve the value of reserved
> - * bits in registers. So read the existing value, preserve the
> - * low order 12 bits, and add in the guest physical address
> - * (which already has at least the low 12 bits set to zero since
> - * it is page aligned). Also set the "enable" bit, which is bit 0.
> + * TSC page mapping works differently in root and guest partitions.
> + * - In guest partition the guest PFN has to be passed to the
> + * hypervisor.
> + * - In root partition it's other way around: the guest has to map the
> + * PFN, provided by the hypervisor.
> + * But it can't be mapped right here as it's too early and MMU isn't
> + * ready yet. So, we only set the enable bit here and will remap the
> + * page later in hv_remap_tsc_clocksource().
> */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> + if (hv_root_partition)
> + tsc_pfn = tsc_msr.pfn;
> + else
> + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
> tsc_msr.enable = 1;
> tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> @@ -572,3 +575,16 @@ void __init hv_init_clocksource(void)
> hv_sched_clock_offset = hv_read_reference_counter();
> hv_setup_sched_clock(read_hv_sched_clock_msr);
> }
> +
> +void __init hv_remap_tsc_clocksource(void)
> +{
> + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> + return;
> +
> + if (!hv_root_partition)
> + return;
> +
> + tsc_page = memremap(__pfn_to_phys(tsc_pfn), PAGE_SIZE, MEMREMAP_WB);

Instead of hard-coding PAGE_SIZE here, could this be sizeof(union tsc_pg)?
In the past we sorted out how to make the memory allocated for the TSC page be
a full guest page (not Microsoft Hypervisor page, which could be different) so that
it can be mapped into user space for vDSO. So it seems appropriate to piggyback
on that union definition rather than hardcoding PAGE_SIZE.

> + if (!tsc_page)
> + pr_err("Failed to remap Hyper-V TSC page.\n");
> +}
> diff --git a/include/clocksource/hyperv_timer.h
> b/include/clocksource/hyperv_timer.h
> index 3078d23faaea..783701a2102d 100644
> --- a/include/clocksource/hyperv_timer.h
> +++ b/include/clocksource/hyperv_timer.h
> @@ -31,6 +31,7 @@ extern void hv_stimer_global_cleanup(void);
> extern void hv_stimer0_isr(void);
>
> extern void hv_init_clocksource(void);
> +extern void hv_remap_tsc_clocksource(void);
>
> extern unsigned long hv_get_tsc_pfn(void);
> extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
>

2022-11-02 20:02:08

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

From: Stanislav Kinsburskii <[email protected]> Sent: Tuesday, November 1, 2022 10:31 AM
>
> And rework the code to use it instead of the physical address.
> This is a cleanup and precursor patch for upcoming support for TSC page
> mapping into hyper-v root partition.

Again, a slightly more robust commit message would be good. Avoid a partial
sentence that is a continuation of the commit title (which isn't correct, as
Anirudh already pointed out).

>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> drivers/clocksource/hyperv_timer.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index c4dbf40a3d3e..d447bc99a399 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -367,6 +367,12 @@ static union {
> } tsc_pg __aligned(PAGE_SIZE);
>
> static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
> +static unsigned long tsc_pfn;
> +
> +static unsigned long hv_get_tsc_pfn(void)
> +{
> + return tsc_pfn;
> +}
>
> struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
> {
> @@ -408,13 +414,12 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)
>
> static void resume_hv_clock_tsc(struct clocksource *arg)
> {
> - phys_addr_t phys_addr = virt_to_phys(tsc_page);
> union hv_reference_tsc_msr tsc_msr;
>
> /* Re-enable the TSC page */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> tsc_msr.enable = 1;
> - tsc_msr.pfn = __phys_to_pfn(phys_addr);
> + tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> }
>
> @@ -498,7 +503,6 @@ static __always_inline void hv_setup_sched_clock(void
> *sched_clock) {}
> static bool __init hv_init_tsc_clocksource(void)
> {
> union hv_reference_tsc_msr tsc_msr;
> - phys_addr_t phys_addr;
>
> if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> return false;
> @@ -523,7 +527,7 @@ static bool __init hv_init_tsc_clocksource(void)
> }
>
> hv_read_reference_counter = read_hv_clock_tsc;
> - phys_addr = virt_to_phys(hv_get_tsc_page());
> + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> /*
> * The Hyper-V TLFS specifies to preserve the value of reserved
> @@ -534,7 +538,7 @@ static bool __init hv_init_tsc_clocksource(void)
> */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> tsc_msr.enable = 1;
> - tsc_msr.pfn = __phys_to_pfn(phys_addr);
> + tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
>
> clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
>

2022-11-02 20:02:07

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 1/4] drivers/clocksource/hyper-v: Introduce a pointer to TSC page

From: Stanislav Kinsburskii <[email protected]> Sent: Tuesday, November 1, 2022 10:31 AM
>
> Will be used later keep the address of the remapped page for the root
> partition.

s/later keep/later to keep/

I'd like to see a more robust commit message, and not a partial
sentence that is a continuation of the commit title. Something along the
lines of:

When Linux is running in the root partition of the Microsoft Hypervisor,
the memory for the TSC page is provided by the hypervisor, so the TSC
page address can't be the address of a Linux global variable.

Introduce a global variable to contain the TSC page address. For a guest VM,
it defaults to the address of the Linux global variable. If running in the root
partition, a later patch overrides to be the address of the page provided by
the hypervisor.

>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> drivers/clocksource/hyperv_timer.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index 11332c82d1af..c4dbf40a3d3e 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -366,9 +366,11 @@ static union {
> u8 reserved[PAGE_SIZE];
> } tsc_pg __aligned(PAGE_SIZE);
>
> +static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
> +
> struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
> {
> - return &tsc_pg.page;
> + return tsc_page;
> }
> EXPORT_SYMBOL_GPL(hv_get_tsc_page);
>
> @@ -406,7 +408,7 @@ static void suspend_hv_clock_tsc(struct clocksource *arg)
>
> static void resume_hv_clock_tsc(struct clocksource *arg)
> {
> - phys_addr_t phys_addr = virt_to_phys(&tsc_pg);
> + phys_addr_t phys_addr = virt_to_phys(tsc_page);
> union hv_reference_tsc_msr tsc_msr;
>
> /* Re-enable the TSC page */
>

2022-11-02 20:06:56

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH 3/4] drivers/clocksource/hyper-v: Use TSC PFN getter to map vvar page

On Wed, Nov 02, 2022 at 09:48:47AM -0700, Stanislav Kinsburskiy wrote:
> > > diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
> > > index 311eae30e089..6976416b2c9f 100644
> > > --- a/arch/x86/entry/vdso/vma.c
> > > +++ b/arch/x86/entry/vdso/vma.c
> > > @@ -210,11 +210,10 @@ static vm_fault_t vvar_fault(const struct
> > vm_special_mapping *sm,
> > >
> > pgprot_decrypted(vma->vm_page_prot));
> > > }
> > > } else if (sym_offset == image->sym_hvclock_page) {
> > > - struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
> > > + pfn = hv_get_tsc_pfn();
> > >
> > > - if (tsc_pg && vclock_was_used(VDSO_CLOCKMODE_HVCLOCK))
> > > - return vmf_insert_pfn(vma, vmf->address,
> > > - virt_to_phys(tsc_pg) >>
> > PAGE_SHIFT);
> > > + if (pfn && vclock_was_used(VDSO_CLOCKMODE_HVCLOCK))
> > > + return vmf_insert_pfn(vma, vmf->address, pfn);
> >
> > hv_get_tsc_pfn() can return 0. You will insert PFN 0 into the page
> > table. I think you should check if the PFN is valid.
> >
> >
> I'm confused. Isn't "if (pfn &&" checks for non-zero value?

I misread. The code is fine.

Wei.

2022-11-02 20:46:30

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

From: Stanislav Kinsburskiy <[email protected]> Sent: Wednesday, November 2, 2022 12:26 PM

> ??, 2 ????. 2022 ?. ? 12:07, Michael Kelley (LINUX) <mailto:[email protected]>:
> From: Stanislav Kinsburskii <mailto:[email protected]> Sent: Tuesday, November 1, 2022 10:31 AM
> >
> > And rework the code to use it instead of the physical address.
> > This is a cleanup and precursor patch for upcoming support for TSC page
> > mapping into hyper-v root partition.
> >
> >? drivers/clocksource/hyperv_timer.c |? ?14 +++++++++-----
> >? 1 file changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index c4dbf40a3d3e..d447bc99a399 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -367,6 +367,12 @@ static union {
> >? } tsc_pg __aligned(PAGE_SIZE);
> >
> >? static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
> > +static unsigned long tsc_pfn;
> > +
> > +static unsigned long hv_get_tsc_pfn(void)
> > +{
> > +? ? ?return tsc_pfn;
> > +}
>
> > It makes sense to have the tsc_page global variable so that we can
> > handle the root partition and guest partition cases with common code,
> > even though the TSC page memory originates differently in the two cases.
> >
> > But do we also need a tsc_pfn global variable and getter function?? When
> > the PFN is needed, conversion from the tsc_page virtual address to the PFN
> > isn't hard, and such a conversion is needed in only a couple of places.? To me,
> > it's simpler to keep a single global variable and getter function (i.e.,
> > hv_get_tsc_page), and do the conversions where needed.? ?Adding tsc_pfn
> > and the getter function introduces a fair amount of code churn for not much
> > benefit.? It's a judgment call, but that's my $.02.
>
> As I replied to Anirudh , AFAIK virt_to_phys doesn't work for remapped pages.
> Another option would be to read the MSR each time PFN has to be returned to
> the vvar mapping function (i.e. on every fork), which introduces unnecessary
> performance regression..
> Another modification would be to make pfn a static variable and initialize it
> once in hv_get_tsc_pfn() on first access. But I like this implementation?less.

Valid point about virt_to_phys(). But virt_to_hvpfn() does the right thing. It
distinguishes between kernel addresses in the main linear mapping and
vmalloc() addresses, which is what you get from memremap(). But I haven't
looked through all the places virt_to_hvpfn() would be needed to make sure
it's safe to use.

However, thinking about virt_to_hvpfn(), there's a problem with Anirudh's
earlier patch set that started using __phys_to_pfn(). That won't work correctly
if the guest page size is not 4K because it will return a PFN based on the guest
page size, not based on Hyper-V's expectation that the PFN is based on a
4K page size. So that needs to be fixed either way.

Michael

?

2022-11-02 20:53:55

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 1/4] drivers/clocksource/hyper-v: Introduce a pointer to TSC page

From: Stanislav Kinsburskiy <[email protected]> Sent: Wednesday, November 2, 2022 12:19 PM

> ср, 2 нояб. 2022 г. в 11:56, Michael Kelley (LINUX) <mailto:[email protected]>:
> From: Stanislav Kinsburskii <mailto:[email protected]> Sent: Tuesday, November 1, 2022 10:31 AM
> > >
> > > Will be used later keep the address of the remapped page for the root
> > > partition.
> >
> > s/later keep/later to keep/
>
> I'll fix it, thanks.
>  
> > I'd like to see a more robust commit message, and not a partial
> > sentence that is a continuation of the commit title.  Something along the lines of:
> >
> > When Linux is running in the root partition of the Microsoft Hypervisor,
> > the memory for the TSC page is provided by the hypervisor, so the TSC
> > page address can't be the address of a Linux global variable.
> >
> > Introduce a global variable to contain the TSC page address.  For a guest VM,
> > it defaults to the address of the Linux global variable.  If running in the root
> > partition, a later patch overrides to be the address of the page provided by
> > the hypervisor.
>
> This is a cleanup patch whose goal is to provide a clear separation between the
> actual feature (which comes in the last patch of the series) and other precursor
> changes, making the feature introduction more laconic and clean.
>
> I doubt it needs any additional text to expose the details of the resulting goal.
> Why do you think otherwise?

To me, the additional text clearly answers the "why" question for the patch. Here's
the quote from Documentation/process/submitting-patches.rst:

The explanation body will be committed to the permanent source
changelog, so should make sense to a competent reader who has long since
forgotten the immediate details of the discussion that might have led to
this patch. Including symptoms of the failure which the patch addresses
(kernel log messages, oops messages, etc.) are especially useful for
people who might be searching the commit logs looking for the applicable
patch. The text should be written in such detail so that when read
weeks, months or even years later, it can give the reader the needed
details to grasp the reasoning for **why** the patch was created.

Certainly, it's somewhat a matter of personal style, and I tend to lean toward the
"more explanation is better" approach. But if no one else cares to weigh in on
the topic, it's not a blocker for me.

Michael

2022-11-02 21:30:41

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

From: Stanislav Kinsburskiy <[email protected]> Sent: Wednesday, November 2, 2022 1:58 PM

> ??, 2 ????. 2022 ?. ? 13:30, Michael Kelley (LINUX) <mailto:[email protected]>:
> From: Stanislav Kinsburskiy <mailto:[email protected]>? Sent: Wednesday, November 2, 2022 12:26 PM
>
> > > It makes sense to have the tsc_page global variable so that we can
> > > handle the root partition and guest partition cases with common code,
> > > even though the TSC page memory originates differently in the two cases.
> > >
> > > But do we also need a tsc_pfn global variable and getter function?? When
> > > the PFN is needed, conversion from the tsc_page virtual address to the PFN
> > > isn't hard, and such a conversion is needed in only a couple of places.? To me,
> > > it's simpler to keep a single global variable and getter function (i.e.,
> > > hv_get_tsc_page), and do the conversions where needed.? ?Adding tsc_pfn
> > > and the getter function introduces a fair amount of code churn for not much
> > > benefit.? It's a judgment call, but that's my $.02.
> >
> > As I replied to Anirudh , AFAIK virt_to_phys doesn't work for remapped pages.
> > Another option would be to read the MSR each time PFN has to be returned to
> > the vvar mapping function (i.e. on every fork), which introduces unnecessary
> > performance regression..
> > Another modification would be to make pfn a static variable and initialize it
> > once in hv_get_tsc_pfn() on first access. But I like this implementation?less.

> > Valid point about virt_to_phys().? But virt_to_hvpfn() does the right thing.? It
> > distinguishes between kernel addresses in the main linear mapping and
> > vmalloc() addresses, which is what you get from memremap().? But I haven't
> > looked through all the places virt_to_hvpfn() would be needed to make sure
> > it's safe to use.
>
> Yeah, I guess virt_to_hvpfn() will do.
> But I'm not sure it the current code should be reworked to use it: it would save only a
> few lines of code, but will remove the explicit distinguishment between root and guest
> partitions,?currently reflected in the code.
> Please, let me know if you insist on reworking the series to use virt_to_hvpfn().

Your call. I'm OK with leaving things "as is" due to the additional complexity
of dealing with the vmalloc() address that comes from memremap().
?
> > However, thinking about virt_to_hvpfn(), there's a problem with Anirudh's
> > earlier patch set that started using __phys_to_pfn().? That won't work correctly
> > if the guest page size is not 4K because it will return a PFN based on the guest
> > page size, not based on Hyper-V's expectation that the PFN is based on a
> > 4K page size.? So that needs to be fixed either way.

> Could you elaborate more on the problem??
?
The key is to recognize that PFNs are inherently interpreted in the context
of the page size. Consider Guest Physical Address (GPA) 0x12340000.
If the page size is 4096, the PFN is 0x12340. If the page size is 64K, the PFN
is 0x1234. Hyper-V is always expecting PFNs in the context of a page size
of 4096. But Linux guests running on Hyper-V on ARM64 could have a
guest page size of 64K, even though Hyper-V itself is using a page size
of 4096. For my example, in an ARM64 VM with 64K page size,
__phys_to_pfn(0x12340000) would return 0x1234. If that value is
stored in the PFN field of the MSR, Hyper-V will think the GPA is
0x1234000 when it should be 0x12340000.

Michael
?

2022-11-02 22:53:38

by Stanislav Kinsburskii

[permalink] [raw]
Subject: Re: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

On Wed, Nov 02, 2022 at 09:27:07PM +0000, Michael Kelley (LINUX) wrote:
> From: Stanislav Kinsburskiy <[email protected]> Sent: Wednesday, November 2, 2022 1:58 PM
>
> > ср, 2 нояб. 2022 г. в 13:30, Michael Kelley (LINUX) <mailto:[email protected]>:
> > From: Stanislav Kinsburskiy <mailto:[email protected] Sent: Wednesday, November 2, 2022 12:26 PM
> >
> > > > It makes sense to have the tsc_page global variable so that we can
> > > > handle the root partition and guest partition cases with common code,
> > > > even though the TSC page memory originates differently in the two cases.
> > > >
> > > > But do we also need a tsc_pfn global variable and getter function?  When
> > > > the PFN is needed, conversion from the tsc_page virtual address to the PFN
> > > > isn't hard, and such a conversion is needed in only a couple of places.  To me,
> > > > it's simpler to keep a single global variable and getter function (i.e.,
> > > > hv_get_tsc_page), and do the conversions where needed.   Adding tsc_pfn
> > > > and the getter function introduces a fair amount of code churn for not much
> > > > benefit.  It's a judgment call, but that's my $.02.
> > >
> > > As I replied to Anirudh , AFAIK virt_to_phys doesn't work for remapped pages.
> > > Another option would be to read the MSR each time PFN has to be returned to
> > > the vvar mapping function (i.e. on every fork), which introduces unnecessary
> > > performance regression..
> > > Another modification would be to make pfn a static variable and initialize it
> > > once in hv_get_tsc_pfn() on first access. But I like this implementation less.
>
> > > Valid point about virt_to_phys().  But virt_to_hvpfn() does the right thing.  It
> > > distinguishes between kernel addresses in the main linear mapping and
> > > vmalloc() addresses, which is what you get from memremap().  But I haven't
> > > looked through all the places virt_to_hvpfn() would be needed to make sure
> > > it's safe to use.
> >
> > Yeah, I guess virt_to_hvpfn() will do.
> > But I'm not sure it the current code should be reworked to use it: it would save only a
> > few lines of code, but will remove the explicit distinguishment between root and guest
> > partitions, currently reflected in the code.
> > Please, let me know if you insist on reworking the series to use virt_to_hvpfn().
>
> Your call. I'm OK with leaving things "as is" due to the additional complexity
> of dealing with the vmalloc() address that comes from memremap().
>  

I'll keep as it is then. Thanks.

> > > However, thinking about virt_to_hvpfn(), there's a problem with Anirudh's
> > > earlier patch set that started using __phys_to_pfn().  That won't work correctly
> > > if the guest page size is not 4K because it will return a PFN based on the guest
> > > page size, not based on Hyper-V's expectation that the PFN is based on a
> > > 4K page size.  So that needs to be fixed either way.
>
> > Could you elaborate more on the problem? 
>  
> The key is to recognize that PFNs are inherently interpreted in the context
> of the page size. Consider Guest Physical Address (GPA) 0x12340000.
> If the page size is 4096, the PFN is 0x12340. If the page size is 64K, the PFN
> is 0x1234. Hyper-V is always expecting PFNs in the context of a page size
> of 4096. But Linux guests running on Hyper-V on ARM64 could have a
> guest page size of 64K, even though Hyper-V itself is using a page size
> of 4096. For my example, in an ARM64 VM with 64K page size,
> __phys_to_pfn(0x12340000) would return 0x1234. If that value is
> stored in the PFN field of the MSR, Hyper-V will think the GPA is
> 0x1234000 when it should be 0x12340000.
>

Thank you for the verbose explanation.

Stas

> Michael
>