2022-11-02 22:52:53

by Stanislav Kinsburskii

[permalink] [raw]
Subject: [PATCH v2 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

From: Stanislav Kinsburskiy <[email protected]>

Microsoft Hypervisor root partition has to map the TSC page specified
by the hypervisor, instead of providing the page to the hypervisor like
it's done in the guest partitions.

However, it's too early to map the page when the clock is initialized, so, the
actual mapping is happening later.

Signed-off-by: Stanislav Kinsburskiy <[email protected]>
CC: "K. Y. Srinivasan" <[email protected]>
CC: Haiyang Zhang <[email protected]>
CC: Wei Liu <[email protected]>
CC: Dexuan Cui <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: Dave Hansen <[email protected]>
CC: [email protected]
CC: "H. Peter Anvin" <[email protected]>
CC: Daniel Lezcano <[email protected]>
CC: [email protected]
CC: [email protected]
---
arch/x86/hyperv/hv_init.c | 2 ++
drivers/clocksource/hyperv_timer.c | 37 +++++++++++++++++++++++++++---------
include/clocksource/hyperv_timer.h | 1 +
3 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index f49bc3ec76e6..89954490af93 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -464,6 +464,8 @@ void __init hyperv_init(void)
BUG_ON(!src);
memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
memunmap(src);
+
+ hv_remap_tsc_clocksource();
} else {
hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 635c14c1e3bf..ec76303b2a76 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
return false;

- if (hv_root_partition)
- return false;
-
/*
* If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
* handles frequency and offset changes due to live migration,
@@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
}

hv_read_reference_counter = read_hv_clock_tsc;
- tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));

/*
- * The Hyper-V TLFS specifies to preserve the value of reserved
- * bits in registers. So read the existing value, preserve the
- * low order 12 bits, and add in the guest physical address
- * (which already has at least the low 12 bits set to zero since
- * it is page aligned). Also set the "enable" bit, which is bit 0.
+ * TSC page mapping works differently in root and guest partitions.
+ * - In guest partition the guest PFN has to be passed to the
+ * hypervisor.
+ * - In root partition it's other way around: it has to map the PFN
+ * provided by the hypervisor.
+ * But it can't be mapped right here as it's too early and MMU isn't
+ * ready yet. So, we only set the enable bit here and will remap the
+ * page later in hv_remap_tsc_clocksource().
*/
tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
+ if (hv_root_partition)
+ tsc_pfn = tsc_msr.pfn;
+ else
+ tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
tsc_msr.enable = 1;
tsc_msr.pfn = tsc_pfn;
hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
@@ -572,3 +575,19 @@ void __init hv_init_clocksource(void)
hv_sched_clock_offset = hv_read_reference_counter();
hv_setup_sched_clock(read_hv_sched_clock_msr);
}
+
+void __init hv_remap_tsc_clocksource(void)
+{
+ if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
+ return;
+
+ if (!hv_root_partition) {
+ WARN(1, "%s: attempt to remap TSC page in guest partition\n",
+ __func__);
+ return;
+ }
+
+ tsc_page = memremap(__pfn_to_phys(tsc_pfn), sizeof(tsc_pg), MEMREMAP_WB);
+ if (!tsc_page)
+ pr_err("Failed to remap Hyper-V TSC page.\n");
+}
diff --git a/include/clocksource/hyperv_timer.h b/include/clocksource/hyperv_timer.h
index 3078d23faaea..783701a2102d 100644
--- a/include/clocksource/hyperv_timer.h
+++ b/include/clocksource/hyperv_timer.h
@@ -31,6 +31,7 @@ extern void hv_stimer_global_cleanup(void);
extern void hv_stimer0_isr(void);

extern void hv_init_clocksource(void);
+extern void hv_remap_tsc_clocksource(void);

extern unsigned long hv_get_tsc_pfn(void);
extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);




2022-11-03 01:01:48

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH v2 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

From: Stanislav Kinsburskii <[email protected]> Sent: Wednesday, November 2, 2022 3:08 PM
>
> Microsoft Hypervisor root partition has to map the TSC page specified
> by the hypervisor, instead of providing the page to the hypervisor like
> it's done in the guest partitions.
>
> However, it's too early to map the page when the clock is initialized, so, the
> actual mapping is happening later.
>
> Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> CC: "K. Y. Srinivasan" <[email protected]>
> CC: Haiyang Zhang <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Dexuan Cui <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Borislav Petkov <[email protected]>
> CC: Dave Hansen <[email protected]>
> CC: [email protected]
> CC: "H. Peter Anvin" <[email protected]>
> CC: Daniel Lezcano <[email protected]>
> CC: [email protected]
> CC: [email protected]
> ---
> arch/x86/hyperv/hv_init.c | 2 ++
> drivers/clocksource/hyperv_timer.c | 37 +++++++++++++++++++++++++++---------
> include/clocksource/hyperv_timer.h | 1 +
> 3 files changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index f49bc3ec76e6..89954490af93 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -464,6 +464,8 @@ void __init hyperv_init(void)
> BUG_ON(!src);
> memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
> memunmap(src);
> +
> + hv_remap_tsc_clocksource();
> } else {
> hypercall_msr.guest_physical_address =
> vmalloc_to_pfn(hv_hypercall_pg);
> wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index 635c14c1e3bf..ec76303b2a76 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
> if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> return false;
>
> - if (hv_root_partition)
> - return false;
> -
> /*
> * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
> * handles frequency and offset changes due to live migration,
> @@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
> }
>
> hv_read_reference_counter = read_hv_clock_tsc;
> - tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> /*
> - * The Hyper-V TLFS specifies to preserve the value of reserved
> - * bits in registers. So read the existing value, preserve the
> - * low order 12 bits, and add in the guest physical address
> - * (which already has at least the low 12 bits set to zero since
> - * it is page aligned). Also set the "enable" bit, which is bit 0.
> + * TSC page mapping works differently in root and guest partitions.
> + * - In guest partition the guest PFN has to be passed to the
> + * hypervisor.
> + * - In root partition it's other way around: it has to map the PFN
> + * provided by the hypervisor.
> + * But it can't be mapped right here as it's too early and MMU isn't
> + * ready yet. So, we only set the enable bit here and will remap the
> + * page later in hv_remap_tsc_clocksource().
> */
> tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> + if (hv_root_partition)
> + tsc_pfn = tsc_msr.pfn;
> + else
> + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));

Same problem here with setting tsc_pfn to a guest PFN, which may be
Different from what Hyper-V is expecting as a PFN two lines below. I know
the above line was just carried over from Anirudh's previous patch set,
but I was thinking you would fix this issue. :-)

> tsc_msr.enable = 1;
> tsc_msr.pfn = tsc_pfn;
> hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> @@ -572,3 +575,19 @@ void __init hv_init_clocksource(void)
> hv_sched_clock_offset = hv_read_reference_counter();
> hv_setup_sched_clock(read_hv_sched_clock_msr);
> }
> +
> +void __init hv_remap_tsc_clocksource(void)
> +{
> + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> + return;
> +
> + if (!hv_root_partition) {
> + WARN(1, "%s: attempt to remap TSC page in guest partition\n",
> + __func__);
> + return;
> + }
> +
> + tsc_page = memremap(__pfn_to_phys(tsc_pfn), sizeof(tsc_pg), MEMREMAP_WB);

Note that use of __pfn_to_phys() is at risk of being wrong depending on whether
you decide to keep a guest PFN or a Hyper-V PFN in tsc_pfn.

> + if (!tsc_page)
> + pr_err("Failed to remap Hyper-V TSC page.\n");
> +}
> diff --git a/include/clocksource/hyperv_timer.h
> b/include/clocksource/hyperv_timer.h
> index 3078d23faaea..783701a2102d 100644
> --- a/include/clocksource/hyperv_timer.h
> +++ b/include/clocksource/hyperv_timer.h
> @@ -31,6 +31,7 @@ extern void hv_stimer_global_cleanup(void);
> extern void hv_stimer0_isr(void);
>
> extern void hv_init_clocksource(void);
> +extern void hv_remap_tsc_clocksource(void);
>
> extern unsigned long hv_get_tsc_pfn(void);
> extern struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
>

2022-11-03 17:30:53

by Stanislav Kinsburskii

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

O Wed, Nov 02, 2022 at 11:36:45PM +0000, Michael Kelley (LINUX) wrote:
> From: Stanislav Kinsburskii <[email protected]> Sent: Wednesday, November 2, 2022 3:08 PM
> >
> > Microsoft Hypervisor root partition has to map the TSC page specified
> > by the hypervisor, instead of providing the page to the hypervisor like
> > it's done in the guest partitions.
> >
> > However, it's too early to map the page when the clock is initialized, so, the
> > actual mapping is happening later.
> >
> > Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> > CC: "K. Y. Srinivasan" <[email protected]>
> > CC: Haiyang Zhang <[email protected]>
> > CC: Wei Liu <[email protected]>
> > CC: Dexuan Cui <[email protected]>
> > CC: Thomas Gleixner <[email protected]>
> > CC: Ingo Molnar <[email protected]>
> > CC: Borislav Petkov <[email protected]>
> > CC: Dave Hansen <[email protected]>
> > CC: [email protected]
> > CC: "H. Peter Anvin" <[email protected]>
> > CC: Daniel Lezcano <[email protected]>
> > CC: [email protected]
> > CC: [email protected]
> > ---
> > arch/x86/hyperv/hv_init.c | 2 ++
> > drivers/clocksource/hyperv_timer.c | 37 +++++++++++++++++++++++++++---------
> > include/clocksource/hyperv_timer.h | 1 +
> > 3 files changed, 31 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index f49bc3ec76e6..89954490af93 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -464,6 +464,8 @@ void __init hyperv_init(void)
> > BUG_ON(!src);
> > memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
> > memunmap(src);
> > +
> > + hv_remap_tsc_clocksource();
> > } else {
> > hypercall_msr.guest_physical_address =
> > vmalloc_to_pfn(hv_hypercall_pg);
> > wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index 635c14c1e3bf..ec76303b2a76 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
> > if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> > return false;
> >
> > - if (hv_root_partition)
> > - return false;
> > -
> > /*
> > * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
> > * handles frequency and offset changes due to live migration,
> > @@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
> > }
> >
> > hv_read_reference_counter = read_hv_clock_tsc;
> > - tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
> >
> > /*
> > - * The Hyper-V TLFS specifies to preserve the value of reserved
> > - * bits in registers. So read the existing value, preserve the
> > - * low order 12 bits, and add in the guest physical address
> > - * (which already has at least the low 12 bits set to zero since
> > - * it is page aligned). Also set the "enable" bit, which is bit 0.
> > + * TSC page mapping works differently in root and guest partitions.
> > + * - In guest partition the guest PFN has to be passed to the
> > + * hypervisor.
> > + * - In root partition it's other way around: it has to map the PFN
> > + * provided by the hypervisor.
> > + * But it can't be mapped right here as it's too early and MMU isn't
> > + * ready yet. So, we only set the enable bit here and will remap the
> > + * page later in hv_remap_tsc_clocksource().
> > */
> > tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> > + if (hv_root_partition)
> > + tsc_pfn = tsc_msr.pfn;
> > + else
> > + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
>
> Same problem here with setting tsc_pfn to a guest PFN, which may be
> Different from what Hyper-V is expecting as a PFN two lines below. I know
> the above line was just carried over from Anirudh's previous patch set,
> but I was thinking you would fix this issue. :-)
>

Fair call. I guess Anirudh has addressed it himself, so I'm going to
rebase on his fix.

> > tsc_msr.enable = 1;
> > tsc_msr.pfn = tsc_pfn;
> > hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> > @@ -572,3 +575,19 @@ void __init hv_init_clocksource(void)
> > hv_sched_clock_offset = hv_read_reference_counter();
> > hv_setup_sched_clock(read_hv_sched_clock_msr);
> > }
> > +
> > +void __init hv_remap_tsc_clocksource(void)
> > +{
> > + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> > + return;
> > +
> > + if (!hv_root_partition) {
> > + WARN(1, "%s: attempt to remap TSC page in guest partition\n",
> > + __func__);
> > + return;
> > + }
> > +
> > + tsc_page = memremap(__pfn_to_phys(tsc_pfn), sizeof(tsc_pg), MEMREMAP_WB);
>
> Note that use of __pfn_to_phys() is at risk of being wrong depending on whether
> you decide to keep a guest PFN or a Hyper-V PFN in tsc_pfn.
>

It's Hyperv-V PFN that is stored in the variable (to match the MSR value for the root partition).
I guess this approach will workd regardless of the guest page size.

Stas

2022-11-03 17:33:55

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH v2 4/4] drivers/clocksource/hyper-v: Add TSC page support for root partition

From: Stanislav Kinsburskii <[email protected]> Sent: Thursday, November 3, 2022 10:01 AM
>
> O Wed, Nov 02, 2022 at 11:36:45PM +0000, Michael Kelley (LINUX) wrote:
> > From: Stanislav Kinsburskii <[email protected]> Sent: Wednesday,
> November 2, 2022 3:08 PM
> > >
> > > Microsoft Hypervisor root partition has to map the TSC page specified
> > > by the hypervisor, instead of providing the page to the hypervisor like
> > > it's done in the guest partitions.
> > >
> > > However, it's too early to map the page when the clock is initialized, so, the
> > > actual mapping is happening later.
> > >
> > > Signed-off-by: Stanislav Kinsburskiy <[email protected]>
> > > CC: "K. Y. Srinivasan" <[email protected]>
> > > CC: Haiyang Zhang <[email protected]>
> > > CC: Wei Liu <[email protected]>
> > > CC: Dexuan Cui <[email protected]>
> > > CC: Thomas Gleixner <[email protected]>
> > > CC: Ingo Molnar <[email protected]>
> > > CC: Borislav Petkov <[email protected]>
> > > CC: Dave Hansen <[email protected]>
> > > CC: [email protected]
> > > CC: "H. Peter Anvin" <[email protected]>
> > > CC: Daniel Lezcano <[email protected]>
> > > CC: [email protected]
> > > CC: [email protected]
> > > ---
> > > arch/x86/hyperv/hv_init.c | 2 ++
> > > drivers/clocksource/hyperv_timer.c | 37 +++++++++++++++++++++++++++------
> ---
> > > include/clocksource/hyperv_timer.h | 1 +
> > > 3 files changed, 31 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > > index f49bc3ec76e6..89954490af93 100644
> > > --- a/arch/x86/hyperv/hv_init.c
> > > +++ b/arch/x86/hyperv/hv_init.c
> > > @@ -464,6 +464,8 @@ void __init hyperv_init(void)
> > > BUG_ON(!src);
> > > memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE);
> > > memunmap(src);
> > > +
> > > + hv_remap_tsc_clocksource();
> > > } else {
> > > hypercall_msr.guest_physical_address =
> > > vmalloc_to_pfn(hv_hypercall_pg);
> > > wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> > > diff --git a/drivers/clocksource/hyperv_timer.c
> b/drivers/clocksource/hyperv_timer.c
> > > index 635c14c1e3bf..ec76303b2a76 100644
> > > --- a/drivers/clocksource/hyperv_timer.c
> > > +++ b/drivers/clocksource/hyperv_timer.c
> > > @@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void)
> > > if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> > > return false;
> > >
> > > - if (hv_root_partition)
> > > - return false;
> > > -
> > > /*
> > > * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
> > > * handles frequency and offset changes due to live migration,
> > > @@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void)
> > > }
> > >
> > > hv_read_reference_counter = read_hv_clock_tsc;
> > > - tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
> > >
> > > /*
> > > - * The Hyper-V TLFS specifies to preserve the value of reserved
> > > - * bits in registers. So read the existing value, preserve the
> > > - * low order 12 bits, and add in the guest physical address
> > > - * (which already has at least the low 12 bits set to zero since
> > > - * it is page aligned). Also set the "enable" bit, which is bit 0.
> > > + * TSC page mapping works differently in root and guest partitions.
> > > + * - In guest partition the guest PFN has to be passed to the
> > > + * hypervisor.
> > > + * - In root partition it's other way around: it has to map the PFN
> > > + * provided by the hypervisor.
> > > + * But it can't be mapped right here as it's too early and MMU isn't
> > > + * ready yet. So, we only set the enable bit here and will remap the
> > > + * page later in hv_remap_tsc_clocksource().
> > > */
> > > tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC);
> > > + if (hv_root_partition)
> > > + tsc_pfn = tsc_msr.pfn;
> > > + else
> > > + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page));
> >
> > Same problem here with setting tsc_pfn to a guest PFN, which may be
> > Different from what Hyper-V is expecting as a PFN two lines below. I know
> > the above line was just carried over from Anirudh's previous patch set,
> > but I was thinking you would fix this issue. :-)
> >
>
> Fair call. I guess Anirudh has addressed it himself, so I'm going to
> rebase on his fix.

Sounds good.

>
> > > tsc_msr.enable = 1;
> > > tsc_msr.pfn = tsc_pfn;
> > > hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64);
> > > @@ -572,3 +575,19 @@ void __init hv_init_clocksource(void)
> > > hv_sched_clock_offset = hv_read_reference_counter();
> > > hv_setup_sched_clock(read_hv_sched_clock_msr);
> > > }
> > > +
> > > +void __init hv_remap_tsc_clocksource(void)
> > > +{
> > > + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
> > > + return;
> > > +
> > > + if (!hv_root_partition) {
> > > + WARN(1, "%s: attempt to remap TSC page in guest partition\n",
> > > + __func__);
> > > + return;
> > > + }
> > > +
> > > + tsc_page = memremap(__pfn_to_phys(tsc_pfn), sizeof(tsc_pg), MEMREMAP_WB);
> >
> > Note that use of __pfn_to_phys() is at risk of being wrong depending on whether
> > you decide to keep a guest PFN or a Hyper-V PFN in tsc_pfn.
> >
>
> It's Hyperv-V PFN that is stored in the variable (to match the MSR value for the root
> partition). I guess this approach will workd regardless of the guest page size.

I agree that it's best for tsc_pfn to contain a Hyper-V PFN. But if that's the case,
the above use of __pfn_to_phys() won't work since it is expecting a guest PFN
as input.

Michael