2022-04-05 00:50:04

by Mike Travis

[permalink] [raw]
Subject: [PATCH v3 0/3] x86/platform/uv: UV Kernel support for UV5

Send a second time on 04/04/22, sent first time on 03/18/22.

v2: Delete patch to remove SCRATCH 5 NMI support check for
UV2 and UV3k systems with old NMI support function.

v3: Fix check BIOS NMI support mistake in Patch 1.

Update NMI setup for UV5
Update NMI handler to interface with UV5 hardware. This involves
changing the EVENT_OCCURRED MMR used by the hardware and removes
the check for which NMI function is supported by UV BIOS. The
newer NMI function is assumed supported on UV5 and above.

Update TSC sync check for UV5
Update TSC to not check TSC sync state for uv5+ as it is not
available. It is assumed that TSC will always be in sync for
multiple chassis and will pass the tests for the kernel to
accept it as the clocksource. To disable this check use the
kernel start options tsc=reliable clocksource=tsc.

Log gap hole end size
Show value of gap end in the kernel log which equates to number
of physical address bits used by system. The end address of
the gap holds PA bits 56:26 which gives the range up to 64PB
max size with 64MB of granularity.

Mike Travis (3):
x86/platform/uv: Update NMI Handler for UV5
x86/platform/uv: Update TSC sync state for UV5
x86/platform/uv: Log gap hole end size

arch/x86/kernel/apic/x2apic_uv_x.c | 20 +++++++++++++++-----
arch/x86/platform/uv/uv_nmi.c | 21 +++++++++++----------
2 files changed, 26 insertions(+), 15 deletions(-)

--
2.26.2


2022-04-05 01:30:56

by Mike Travis

[permalink] [raw]
Subject: [PATCH v3 2/3] x86/platform/uv: Update TSC sync state for UV5

Update to not check TSC sync state for uv5+ as it is not available.
It is assumed that TSC will always be in sync for multiple chassis and
will pass the tests for the kernel to accept it as the clocksource.
To disable this check use the kernel start options tsc=reliable
clocksource=tsc.

Signed-off-by: Mike Travis <[email protected]>
Reviewed-by: Dimitri Sivanich <[email protected]>
Reviewed-by: Steve Wahl <[email protected]>
---
v2: Update patch description to be more explanatory.
---
arch/x86/kernel/apic/x2apic_uv_x.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index f5a48e66e4f5..387d6533549a 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -199,10 +199,16 @@ static void __init uv_tsc_check_sync(void)
int mmr_shift;
char *state;

- /* Different returns from different UV BIOS versions */
+ /* UV5+, sync state from bios not available, assumed valid */
+ if (!is_uv(UV2|UV3|UV4)) {
+ pr_debug("UV: TSC sync state for UV5+ assumed valid\n");
+ mark_tsc_async_resets("UV5+");
+ return;
+ }
+
+ /* UV2,3,4, UV BIOS TSC sync state available */
mmr = uv_early_read_mmr(UVH_TSC_SYNC_MMR);
- mmr_shift =
- is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;
+ mmr_shift = is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;
sync_state = (mmr >> mmr_shift) & UVH_TSC_SYNC_MASK;

/* Check if TSC is valid for all sockets */
--
2.26.2

2022-04-05 01:52:00

by Mike Travis

[permalink] [raw]
Subject: [PATCH v3 3/3] x86/platform/uv: Log gap hole end size

Show value of gap end in the kernel log which equates to number of physical
address bits used by system. The end address of the gap holds PA bits 56:26
which gives the range up to 64PB max size with 64MB of granularity.

Signed-off-by: Mike Travis <[email protected]>
Reviewed-by: Steve Wahl <[email protected]>
---
v2: Update patch description to be more explanatory.
---
arch/x86/kernel/apic/x2apic_uv_x.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index 387d6533549a..146f0f63a43b 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1346,7 +1346,7 @@ static void __init decode_gam_params(unsigned long ptr)
static void __init decode_gam_rng_tbl(unsigned long ptr)
{
struct uv_gam_range_entry *gre = (struct uv_gam_range_entry *)ptr;
- unsigned long lgre = 0;
+ unsigned long lgre = 0, gend = 0;
int index = 0;
int sock_min = 999999, pnode_min = 99999;
int sock_max = -1, pnode_max = -1;
@@ -1380,6 +1380,9 @@ static void __init decode_gam_rng_tbl(unsigned long ptr)
flag, size, suffix[order],
gre->type, gre->nasid, gre->sockid, gre->pnode);

+ if (gre->type == UV_GAM_RANGE_TYPE_HOLE)
+ gend = (unsigned long)gre->limit << UV_GAM_RANGE_SHFT;
+
/* update to next range start */
lgre = gre->limit;
if (sock_min > gre->sockid)
@@ -1397,7 +1400,8 @@ static void __init decode_gam_rng_tbl(unsigned long ptr)
_max_pnode = pnode_max;
_gr_table_len = index;

- pr_info("UV: GRT: %d entries, sockets(min:%x,max:%x) pnodes(min:%x,max:%x)\n", index, _min_socket, _max_socket, _min_pnode, _max_pnode);
+ pr_info("UV: GRT: %d entries, sockets(min:%x,max:%x), pnodes(min:%x,max:%x), gap_end(%d)\n",
+ index, _min_socket, _max_socket, _min_pnode, _max_pnode, fls64(gend));
}

/* Walk through UVsystab decoding the fields */
--
2.26.2

2022-04-05 02:58:22

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 0/3] x86/platform/uv: UV Kernel support for UV5

On Mon, Apr 04, 2022 at 12:41:08PM -0500, Mike Travis wrote:
> Send a second time on 04/04/22, sent first time on 03/18/22.

Is this one any different from your submission on 3/18?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-04-05 03:39:14

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] x86/platform/uv: Update TSC sync state for UV5

Mike,

On Mon, Apr 04 2022 at 12:41, Mike Travis wrote:
> Update to not check TSC sync state for uv5+ as it is not available.
> It is assumed that TSC will always be in sync for multiple chassis and
> will pass the tests for the kernel to accept it as the clocksource.
> To disable this check use the kernel start options tsc=reliable
> clocksource=tsc.
...
> ---
> v2: Update patch description to be more explanatory.

after reading the above word salad, I have no urge to go back to V1 to
figure out how bad that changelog was. Let me walk through it:

> Update to not check TSC sync state for uv5+ as it is not available.

That's not a sentence and it does not provide any information about the
background and the why. See:

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog

for further explanation.

> It is assumed that TSC will always be in sync for multiple chassis and
> will pass the tests for the kernel to accept it as the clocksource.

"It is assumed" is a real convincing technical argument - NOT!

Either the hardware guarantees something or not. If the hardware cannot
guarantee it but does not provide a mechanism to verify it then spell it
out:

"UV5 gave up on providing an interface which allows to query the TSC
synchronization state. The hardware/firmware specification defines
that TSC is synchronized accross multiple chassis, but there is
neither a guarantee nor a validation mechanism. This leaves the
kernel with the only option to assume that TSC is synchronized and
TSC_ADJUST might be different between sockets due to UV5+ firmware
synchronization."

> To disable this check use the kernel start options tsc=reliable
> clocksource=tsc.

So now it get's really confusing. Which check? The one in your
explanatory description above:

"Update to not check TSC sync state for uv5+ as it is not available."

or what?

I can assume what are you trying to tell me here, but that's not a
qualification for 'explanatory'

> Signed-off-by: Mike Travis <[email protected]>
> Reviewed-by: Dimitri Sivanich <[email protected]>
> Reviewed-by: Steve Wahl <[email protected]>

Impressive list of Reviewed-by tags. Just for clarification:

Reviewed-by tags include the changelog part.

> ---
> arch/x86/kernel/apic/x2apic_uv_x.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
> index f5a48e66e4f5..387d6533549a 100644
> --- a/arch/x86/kernel/apic/x2apic_uv_x.c
> +++ b/arch/x86/kernel/apic/x2apic_uv_x.c
> @@ -199,10 +199,16 @@ static void __init uv_tsc_check_sync(void)
> int mmr_shift;
> char *state;
>
> - /* Different returns from different UV BIOS versions */
> + /* UV5+, sync state from bios not available, assumed valid */
> + if (!is_uv(UV2|UV3|UV4)) {
> + pr_debug("UV: TSC sync state for UV5+ assumed valid\n");

UV advertises its version all over the place and the call here:

> + mark_tsc_async_resets("UV5+");

emits the reason at info level. So you surely need the pr_debug() above
to figure out that your code works as expected. You just failed to add
the obviuos counterpart:

pr_debug("After calling hello_world()!\n");

Seriously. Neither the changelog nor the comment above the condition
qualify for explanatory or useful. Please try again.

> +
> + /* UV2,3,4, UV BIOS TSC sync state available */
> mmr = uv_early_read_mmr(UVH_TSC_SYNC_MMR);
> - mmr_shift =
> - is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;
> + mmr_shift = is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;

How is this change related to the problem which this patch tries to
solve? Documentation/process/* applies to UV too. You definitely should
know that by now.

Thanks,

tglx