2017-07-18 22:01:03

by Andrew Banman

[permalink] [raw]
Subject: [PATCH] x86/platform/uv/BAU: disable BAU on single hub configurations

The BAU confers no benefit to a UV system running with only one hub/socket.
Permanently disable the BAU driver if there are less than two hubs online
to avoid BAU overhead. We have observed failed boots on single-socket UV4
systems caused by BAU that are avoided with this patch.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Russ Anderson <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/platform/uv/tlb_uv.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 2511a28..88216cc 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -2251,6 +2251,12 @@ static int __init uv_bau_init(void)
}

nuvhubs = uv_num_possible_blades();
+ if (nuvhubs < 2) {
+ pr_crit("UV: BAU disabled - insufficient hub count\n");
+ set_bau_off();
+ nobau_perm = 1;
+ return 0;
+ }

uv_base_pnode = 0x7fffffff;
for (uvhub = 0; uvhub < nuvhubs; uvhub++) {
--
1.8.2.1


2017-07-20 11:47:56

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86/platform/uv/BAU: disable BAU on single hub configurations


* Andrew Banman <[email protected]> wrote:

> The BAU confers no benefit to a UV system running with only one hub/socket.
> Permanently disable the BAU driver if there are less than two hubs online
> to avoid BAU overhead. We have observed failed boots on single-socket UV4
> systems caused by BAU that are avoided with this patch.
>
> Signed-off-by: Andrew Banman <[email protected]>
> Acked-by: Russ Anderson <[email protected]>
> Acked-by: Mike Travis <[email protected]>
> ---
> arch/x86/platform/uv/tlb_uv.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
> index 2511a28..88216cc 100644
> --- a/arch/x86/platform/uv/tlb_uv.c
> +++ b/arch/x86/platform/uv/tlb_uv.c
> @@ -2251,6 +2251,12 @@ static int __init uv_bau_init(void)
> }
>
> nuvhubs = uv_num_possible_blades();
> + if (nuvhubs < 2) {
> + pr_crit("UV: BAU disabled - insufficient hub count\n");
> + set_bau_off();
> + nobau_perm = 1;
> + return 0;
> + }

Yeah, could you structure the error paths in this function in a bit more organized
fashion? It has two similar error handling blocks:


pr_crit("UV: BAU disabled - insufficient hub count\n");
set_bau_off();
nobau_perm = 1;
return 0;

...

set_bau_off();
nobau_perm = 1;
return 0;

which could be consolidated via the usual goto exception construct:

if (nuvhubs < 2) {
pr_crit("UV: BAU disabled - insufficient hub count\n");
goto err_disable_bau;
}
...

if (init_per_cpu(nuvhubs, uv_base_pnode))
pr_crit("UV: BAU disabled - per CPU init failed\n");
goto err_disable_bau;
}

...
return 0;

err_disable_bau:

set_bau_off();
nobau_perm = 1;
return 0;

Note that I added an error message to the second case as well.

Plus, in the error case you might want to use a 'return -EINVAL;' instead of
return 0, or so?

Plus plus, there's probably a (mild) memory leak in the error paths, can the
cpumasks be free_cpumask_var() freed - or are they still required even if BAU is
disabled?

Thanks,

Ingo

2017-07-20 18:56:01

by Andrew Banman

[permalink] [raw]
Subject: Re: [PATCH] x86/platform/uv/BAU: disable BAU on single hub configurations

On Thu, Jul 20, 2017 at 01:47:50PM +0200, Ingo Molnar wrote:
>
> * Andrew Banman <[email protected]> wrote:
>
> > The BAU confers no benefit to a UV system running with only one hub/socket.
> > Permanently disable the BAU driver if there are less than two hubs online
> > to avoid BAU overhead. We have observed failed boots on single-socket UV4
> > systems caused by BAU that are avoided with this patch.
> >
> > Signed-off-by: Andrew Banman <[email protected]>
> > Acked-by: Russ Anderson <[email protected]>
> > Acked-by: Mike Travis <[email protected]>
> > ---
> > arch/x86/platform/uv/tlb_uv.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
> > index 2511a28..88216cc 100644
> > --- a/arch/x86/platform/uv/tlb_uv.c
> > +++ b/arch/x86/platform/uv/tlb_uv.c
> > @@ -2251,6 +2251,12 @@ static int __init uv_bau_init(void)
> > }
> >
> > nuvhubs = uv_num_possible_blades();
> > + if (nuvhubs < 2) {
> > + pr_crit("UV: BAU disabled - insufficient hub count\n");
> > + set_bau_off();
> > + nobau_perm = 1;
> > + return 0;
> > + }
>
> Yeah, could you structure the error paths in this function in a bit more organized
> fashion? It has two similar error handling blocks:
>
>
> pr_crit("UV: BAU disabled - insufficient hub count\n");
> set_bau_off();
> nobau_perm = 1;
> return 0;
>
> ...
>
> set_bau_off();
> nobau_perm = 1;
> return 0;
>
> which could be consolidated via the usual goto exception construct:
>
> if (nuvhubs < 2) {
> pr_crit("UV: BAU disabled - insufficient hub count\n");
> goto err_disable_bau;
> }
> ...
>
> if (init_per_cpu(nuvhubs, uv_base_pnode))
> pr_crit("UV: BAU disabled - per CPU init failed\n");
> goto err_disable_bau;
> }
>
> ...
> return 0;
>
> err_disable_bau:
>
> set_bau_off();
> nobau_perm = 1;
> return 0;
>
> Note that I added an error message to the second case as well.
>
> Plus, in the error case you might want to use a 'return -EINVAL;' instead of
> return 0, or so?

I agree with your suggestions, and I'm happy to make the changes.

>
> Plus plus, there's probably a (mild) memory leak in the error paths, can the
> cpumasks be free_cpumask_var() freed - or are they still required even if BAU is
> disabled?

In the case of nobau_perm=1 they can be freed. I will include that in the next
version.

>
> Thanks,
>
> Ingo

Thank you! I'll have the next version out shortly.

Andrew

2017-07-20 22:07:23

by Andrew Banman

[permalink] [raw]
Subject: [PATCH v2] x86/platform/uv/BAU: disable BAU on single hub configurations

The BAU confers no benefit to a UV system running with only one hub/socket.
Permanently disable the BAU driver if there are less than two hubs online
to avoid BAU overhead. We have observed failed boots on single-socket UV4
systems caused by BAU that are avoided with this patch.

Version 2: Consolidate initialization error blocks with goto err_bau_disable
and free the per_cpu cpumasks to fix a memory leak.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Russ Anderson <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/platform/uv/tlb_uv.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 2511a28..e4a51a6 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -2245,13 +2245,17 @@ static int __init uv_bau_init(void)
else if (is_uv1_hub())
ops = uv1_bau_ops;

+ nuvhubs = uv_num_possible_blades();
+ if (nuvhubs < 2) {
+ pr_crit("UV: BAU disabled - insufficient hub count\n");
+ goto err_bau_disable;
+ }
+
for_each_possible_cpu(cur_cpu) {
mask = &per_cpu(uv_flush_tlb_mask, cur_cpu);
zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cur_cpu));
}

- nuvhubs = uv_num_possible_blades();
-
uv_base_pnode = 0x7fffffff;
for (uvhub = 0; uvhub < nuvhubs; uvhub++) {
cpus = uv_blade_nr_possible_cpus(uvhub);
@@ -2264,9 +2268,8 @@ static int __init uv_bau_init(void)
enable_timeouts();

if (init_per_cpu(nuvhubs, uv_base_pnode)) {
- set_bau_off();
- nobau_perm = 1;
- return 0;
+ pr_crit("UV: BAU disabled - per CPU init failed\n");
+ goto err_bau_disable;
}

vector = UV_BAU_MESSAGE;
@@ -2292,6 +2295,14 @@ static int __init uv_bau_init(void)
}

return 0;
+
+err_bau_disable:
+ for_each_possible_cpu(cur_cpu) {
+ free_cpumask_var(per_cpu(uv_flush_tlb_mask, cur_cpu));
+ }
+ set_bau_off();
+ nobau_perm = 1;
+ return -EINVAL;
}
core_initcall(uv_bau_init);
fs_initcall(uv_ptc_init);
--
1.8.2.1

Subject: [tip:x86/urgent] x86/platform/uv/BAU: Disable BAU on single hub configurations

Commit-ID: 2fe9a5c6ade4dfb53ff1c137cca3828d9d1d0948
Gitweb: http://git.kernel.org/tip/2fe9a5c6ade4dfb53ff1c137cca3828d9d1d0948
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 20 Jul 2017 17:05:51 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 21 Jul 2017 09:56:25 +0200

x86/platform/uv/BAU: Disable BAU on single hub configurations

The BAU confers no benefit to a UV system running with only one hub/socket.
Permanently disable the BAU driver if there are less than two hubs online
to avoid BAU overhead. We have observed failed boots on single-socket UV4
systems caused by BAU that are avoided with this patch.

Also, while at it, consolidate initialization error blocks and fix a
memory leak.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Russ Anderson <[email protected]>
Acked-by: Mike Travis <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/platform/uv/tlb_uv.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index fd87591..3e4bdb4 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -2221,13 +2221,17 @@ static int __init uv_bau_init(void)
else if (is_uv1_hub())
ops = uv1_bau_ops;

+ nuvhubs = uv_num_possible_blades();
+ if (nuvhubs < 2) {
+ pr_crit("UV: BAU disabled - insufficient hub count\n");
+ goto err_bau_disable;
+ }
+
for_each_possible_cpu(cur_cpu) {
mask = &per_cpu(uv_flush_tlb_mask, cur_cpu);
zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cur_cpu));
}

- nuvhubs = uv_num_possible_blades();
-
uv_base_pnode = 0x7fffffff;
for (uvhub = 0; uvhub < nuvhubs; uvhub++) {
cpus = uv_blade_nr_possible_cpus(uvhub);
@@ -2240,9 +2244,8 @@ static int __init uv_bau_init(void)
enable_timeouts();

if (init_per_cpu(nuvhubs, uv_base_pnode)) {
- set_bau_off();
- nobau_perm = 1;
- return 0;
+ pr_crit("UV: BAU disabled - per CPU init failed\n");
+ goto err_bau_disable;
}

vector = UV_BAU_MESSAGE;
@@ -2268,6 +2271,16 @@ static int __init uv_bau_init(void)
}

return 0;
+
+err_bau_disable:
+
+ for_each_possible_cpu(cur_cpu)
+ free_cpumask_var(per_cpu(uv_flush_tlb_mask, cur_cpu));
+
+ set_bau_off();
+ nobau_perm = 1;
+
+ return -EINVAL;
}
core_initcall(uv_bau_init);
fs_initcall(uv_ptc_init);