RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
process. There should be no reason why one would not want this, and there
is no need to worry about real-time latency at this point.
Therefore make it the default.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
Checked on two boxes: synchronize_rcu() was invoked four times before
the boot reached rcu_end_inkernel_boot().
 init/Kconfig        | 13 -------------
 kernel/rcu/update.c |  6 ++----
 2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/init/Kconfig b/init/Kconfig
index b6c9166d878a..fe51bd3bbc61 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -771,19 +771,6 @@ config RCU_NOCB_CPU_ALL
 
 endchoice
 
-config RCU_EXPEDITE_BOOT
-	bool
-	default n
-	help
-	  This option enables expedited grace periods at boot time,
-	  as if rcu_expedite_gp() had been invoked early in boot.
-	  The corresponding rcu_unexpedite_gp() is invoked from
-	  rcu_end_inkernel_boot(), which is intended to be invoked
-	  at the end of the kernel-only boot sequence, just before
-	  init is exec'ed.
-
-	  Accept the default if unsure.
-
 endmenu # "RCU Subsystem"
 
 config BUILD_BIN2C
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index b40d3468ba4e..419ca811bda9 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -130,8 +130,7 @@ bool rcu_gp_is_normal(void)
 }
 EXPORT_SYMBOL_GPL(rcu_gp_is_normal);
 
-static atomic_t rcu_expedited_nesting =
-	ATOMIC_INIT(IS_ENABLED(CONFIG_RCU_EXPEDITE_BOOT) ? 1 : 0);
+static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);
 
 /*
  * Should normal grace-period primitives be expedited? Intended for
@@ -179,8 +178,7 @@ EXPORT_SYMBOL_GPL(rcu_unexpedite_gp);
  */
 void rcu_end_inkernel_boot(void)
 {
-	if (IS_ENABLED(CONFIG_RCU_EXPEDITE_BOOT))
-		rcu_unexpedite_gp();
+	rcu_unexpedite_gp();
 	if (rcu_normal_after_boot)
 		WRITE_ONCE(rcu_normal, 1);
 }
--
2.10.2
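
For context, the mechanism behind this patch is a simple nesting counter:
grace periods are expedited whenever rcu_expedited_nesting is non-zero, so
booting with the counter at 1 expedites everything up to
rcu_end_inkernel_boot(). A minimal sketch of that logic, simplified from
kernel/rcu/update.c (the real rcu_gp_is_expedited() also consults the
rcu_expedited boot/module parameter):

	#include <linux/atomic.h>

	static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);

	/* Expedite normal grace periods while any nesting level is active. */
	bool rcu_gp_is_expedited(void)
	{
		return atomic_read(&rcu_expedited_nesting) != 0;
	}

	/* Enter a section in which normal grace periods are expedited. */
	void rcu_expedite_gp(void)
	{
		atomic_inc(&rcu_expedited_nesting);
	}

	/* Leave such a section; pairs with rcu_expedite_gp(). */
	void rcu_unexpedite_gp(void)
	{
		atomic_dec(&rcu_expedited_nesting);
	}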
On Wed, Nov 02, 2016 at 05:30:02PM +0100, Sebastian Andrzej Siewior wrote:
> RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
> synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
> process. There should be no reason why one would not want this, and there
> is no need to worry about real-time latency at this point.
> Therefore make it the default.
>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Well, it has been a while since I removed a Kconfig parameter.
So why could this be a bad thing?
1. Very large systems might see scalability issues with unconditional
expediting at boot. But if we don't try it, we won't know.
2. People bringing up new hardware might not want quite so many
IPIs. But they can just set rcu_normal to prevent that.
I am therefore queuing it for testing and review. ;-)
Thanx, Paul
> ---
> Checked on two boxes: synchronize_rcu() was invoked four times before
> the boot reached rcu_end_inkernel_boot().
>
> init/Kconfig | 13 -------------
> kernel/rcu/update.c | 6 ++----
> 2 files changed, 2 insertions(+), 17 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index b6c9166d878a..fe51bd3bbc61 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -771,19 +771,6 @@ config RCU_NOCB_CPU_ALL
>
> endchoice
>
> -config RCU_EXPEDITE_BOOT
> - bool
> - default n
> - help
> - This option enables expedited grace periods at boot time,
> - as if rcu_expedite_gp() had been invoked early in boot.
> - The corresponding rcu_unexpedite_gp() is invoked from
> - rcu_end_inkernel_boot(), which is intended to be invoked
> - at the end of the kernel-only boot sequence, just before
> - init is exec'ed.
> -
> - Accept the default if unsure.
> -
> endmenu # "RCU Subsystem"
>
> config BUILD_BIN2C
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index b40d3468ba4e..419ca811bda9 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -130,8 +130,7 @@ bool rcu_gp_is_normal(void)
> }
> EXPORT_SYMBOL_GPL(rcu_gp_is_normal);
>
> -static atomic_t rcu_expedited_nesting =
> - ATOMIC_INIT(IS_ENABLED(CONFIG_RCU_EXPEDITE_BOOT) ? 1 : 0);
> +static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);
>
> /*
> * Should normal grace-period primitives be expedited? Intended for
> @@ -179,8 +178,7 @@ EXPORT_SYMBOL_GPL(rcu_unexpedite_gp);
> */
> void rcu_end_inkernel_boot(void)
> {
> - if (IS_ENABLED(CONFIG_RCU_EXPEDITE_BOOT))
> - rcu_unexpedite_gp();
> + rcu_unexpedite_gp();
> if (rcu_normal_after_boot)
> WRITE_ONCE(rcu_normal, 1);
> }
> --
> 2.10.2
>
On 2016-11-03 09:22:28 [-0700], Paul E. McKenney wrote:
> On Wed, Nov 02, 2016 at 05:30:02PM +0100, Sebastian Andrzej Siewior wrote:
> > RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
> > synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
> > process. There should be no reason why one would not want this, and there
> > is no need to worry about real-time latency at this point.
> > Therefore make it the default.
> >
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
>
> Well, it has been a while since I removed a Kconfig parameter.
>
> So why could this be a bad thing?
>
> 1. Very large systems might see scalability issues with unconditional
> expediting at boot. But if we don't try it, we won't know.
You mean we would make the boot process slower for them instead of
faster?
> 2. People bringing up new hardware might not want quite so many
> IPIs. But they can just set rcu_normal to prevent that.
I wanted to make things simple and not complicated…
> I am therefore queuing it for testing and review. ;-)
Okay thanks.
> Thanx, Paul
Sebastian
On Thu, Nov 03, 2016 at 05:33:27PM +0100, Sebastian Andrzej Siewior wrote:
> On 2016-11-03 09:22:28 [-0700], Paul E. McKenney wrote:
> > On Wed, Nov 02, 2016 at 05:30:02PM +0100, Sebastian Andrzej Siewior wrote:
> > > RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
> > > synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
> > > process. There should be no reason why one would not want this, and there
> > > is no need to worry about real-time latency at this point.
> > > Therefore make it the default.
> > >
> > > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> >
> > Well, it has been a while since I removed a Kconfig parameter.
> >
> > So why could this be a bad thing?
> >
> > 1. Very large systems might see scalability issues with unconditional
> > expediting at boot. But if we don't try it, we won't know.
>
> You mean we would make the boot process slower for them instead of
> faster?
For really big systems, quite possibly, where "really big" means
many hundreds or (more likely) thousands of CPUs.
But there are things that I can do to fix this when and if.
> > 2. People bringing up new hardware might not want quite so many
> > IPIs. But they can just set rcu_normal to prevent that.
>
> I wanted to make things simple and not complicated…
I know that feeling. ;-)
> > I am therefore queuing it for testing and review. ;-)
>
> Okay thanks.
Thanx, Paul
> Sebastian
>
On Thu, 3 Nov 2016 09:59:31 -0700
"Paul E. McKenney" <[email protected]> wrote:
> On Thu, Nov 03, 2016 at 05:33:27PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2016-11-03 09:22:28 [-0700], Paul E. McKenney wrote:
> > > On Wed, Nov 02, 2016 at 05:30:02PM +0100, Sebastian Andrzej Siewior wrote:
> > > > RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
> > > > synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
> > > > process. There should be no reason why one would not want this, and there
> > > > is no need to worry about real-time latency at this point.
> > > > Therefore make it the default.
> > > >
> > > > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > >
> > > Well, it has been a while since I removed a Kconfig parameter.
> > >
> > > So why could this be a bad thing?
> > >
> > > 1. Very large systems might see scalability issues with unconditional
> > > expediting at boot. But if we don't try it, we won't know.
> >
> > You mean we would make the boot process slower for them instead of
> > faster?
>
> For really big systems, quite possibly, where "really big" means
> many hundreds or (more likely) thousands of CPUs.
>
> But there are things that I can do to fix this when and if.
>
> > > 2. People bringing up new hardware might not want quite so many
> > > IPIs. But they can just set rcu_normal to prevent that.
> >
> > I wanted to make things simple and not complicated…
>
> I know that feeling. ;-)
>
I agree, but if this creates a boot time regression in large machines,
it may not be warranted.
I know Linus usually doesn't like options with default y, but this may
be one of those exceptions. Perhaps we should make it on by default and
say in the config "if you have a machine with 100s or 1000s of CPUs,
you may want to disable this".
-- Steve
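
If we went that way, a hypothetical version of the Kconfig entry (an
untested sketch only, reusing the removed option's help text plus the
warning Steve suggests) might read:

	config RCU_EXPEDITE_BOOT
		bool "Expedite RCU grace periods during boot"
		default y
		help
		  This option enables expedited grace periods at boot time,
		  as if rcu_expedite_gp() had been invoked early in boot.
		  The corresponding rcu_unexpedite_gp() is invoked from
		  rcu_end_inkernel_boot(). If you have a machine with 100s
		  or 1000s of CPUs, the extra IPIs may slow boot down
		  instead, so you may want to disable this.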
On 2016-11-07 12:19:39 [-0500], Steven Rostedt wrote:
> I agree, but if this creates a boot time regression in large machines,
> it may not be warranted.
>
> I know Linus usually doesn't like options with default y, but this may
> be one of those exceptions. Perhaps we should make it on by default and
> say in the config "if you have a machine with 100s or 1000s of CPUs,
> you may want to disable this".
The default could change if we know where the limit is. I have access to
a box with approx 140 CPUs so I could check there if it is already bad.
But everything above that / in the 1000 range is a different story.
> -- Steve
Sebastian
On Mon, Nov 07, 2016 at 06:30:30PM +0100, Sebastian Andrzej Siewior wrote:
> On 2016-11-07 12:19:39 [-0500], Steven Rostedt wrote:
> > I agree, but if this creates a boot time regression in large machines,
> > it may not be warranted.
> >
> > I know Linus usually doesn't like options with default y, but this may
> > be one of those exceptions. Perhaps we should make it on by default and
> > say in the config "if you have a machine with 100s or 1000s of CPUs,
> > you may want to disable this".
>
> The default could change if we know where the limit is. I have access to
> a box with approx 140 CPUs so I could check there if it is already bad.
> But everything above that / in the 1000 range is a different story.
Right; if we can characterize what machines it benefits and what
machines it hurts, we can automatically detect and run the appropriate
case with no configuration option needed.
On Mon, Nov 07, 2016 at 10:05:13AM -0800, Paul E. McKenney wrote:
> On Mon, Nov 07, 2016 at 09:35:46AM -0800, Josh Triplett wrote:
> > On Mon, Nov 07, 2016 at 06:30:30PM +0100, Sebastian Andrzej Siewior wrote:
> > > On 2016-11-07 12:19:39 [-0500], Steven Rostedt wrote:
> > > > I agree, but if this creates a boot time regression in large machines,
> > > > it may not be warranted.
> > > >
> > > > I know Linus usually doesn't like options with default y, but this may
> > > > be one of those exceptions. Perhaps we should make it on by default and
> > > > say in the config "if you have a machine with 100s or 1000s of CPUs,
> > > > you may want to disable this".
> > >
> > > The default could change if we know where the limit is. I have access to
> > > a box with approx 140 CPUs so I could check there if it is already bad.
> > > But everything above that / in the 1000 range is a different story.
> >
> > Right; if we can characterize what machines it benefits and what
> > machines it hurts, we can automatically detect and run the appropriate
> > case with no configuration option needed.
>
> I very much like this approach! Anyone have access to large systems on
> which this experiment could be carried out? In the absence of new data,
> I would just set the cutoff at 256 CPUs, as I have done in the past.
One potential issue here: the point where RCU_EXPEDITE_BOOT pessimizes
likely depends on interconnect as much as CPUs. I'd guess that you may
want to set the cutoff based on number of NUMA nodes, rather than number
of CPUs.
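
To illustrate the node-based idea, a hypothetical sketch; both
rcu_expedite_boot_wanted() and the four-node cutoff are placeholders
pending measurements:

	#include <linux/init.h>
	#include <linux/nodemask.h>

	/*
	 * Hypothetical: expedite boot-time grace periods only when the
	 * NUMA-node count suggests the extra IPIs will be cheap. The
	 * four-node cutoff is a guess, not a measured value.
	 */
	static bool __init rcu_expedite_boot_wanted(void)
	{
		return num_online_nodes() <= 4;
	}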
On Mon, Nov 07, 2016 at 09:35:46AM -0800, Josh Triplett wrote:
> On Mon, Nov 07, 2016 at 06:30:30PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2016-11-07 12:19:39 [-0500], Steven Rostedt wrote:
> > > I agree, but if this creates a boot time regression in large machines,
> > > it may not be warranted.
> > >
> > > I know Linus usually doesn't like options with default y, but this may
> > > be one of those exceptions. Perhaps we should make it on by default and
> > > say in the config "if you have a machine with 100s or 1000s of CPUs,
> > > you may want to disable this".
> >
> > The default could change if we know where the limit is. I have access to
> > a box with approx 140 CPUs so I could check there if it is already bad.
> > But everything above that / in the 1000 range is a different story.
>
> Right; if we can characterize what machines it benefits and what
> machines it hurts, we can automatically detect and run the appropriate
> case with no configuration option needed.
I very much like this approach! Anyone have access to large systems on
which this experiment could be carried out? In the absence of new data,
I would just set the cutoff at 256 CPUs, as I have done in the past.
Thanx, Paul
On Mon, Nov 07, 2016 at 10:08:32AM -0800, Josh Triplett wrote:
> On Mon, Nov 07, 2016 at 10:05:13AM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 07, 2016 at 09:35:46AM -0800, Josh Triplett wrote:
> > > On Mon, Nov 07, 2016 at 06:30:30PM +0100, Sebastian Andrzej Siewior wrote:
> > > > On 2016-11-07 12:19:39 [-0500], Steven Rostedt wrote:
> > > > > I agree, but if this creates a boot time regression in large machines,
> > > > > it may not be warranted.
> > > > >
> > > > > I know Linus usually doesn't like options with default y, but this may
> > > > > be one of those exceptions. Perhaps we should make it on by default and
> > > > > say in the config "if you have a machine with 100s or 1000s of CPUs,
> > > > > you may want to disable this".
> > > >
> > > > The default could change if we know where the limit is. I have access to
> > > > a box with approx 140 CPUs so I could check there if it is already bad.
> > > > But everything above that / in the 1000 range is a different story.
> > >
> > > Right; if we can characterize what machines it benefits and what
> > > machines it hurts, we can automatically detect and run the appropriate
> > > case with no configuration option needed.
> >
> > I very much like this approach! Anyone have access to large systems on
> > which this experiment could be carried out? In the absence of new data,
> > I would just set the cutoff at 256 CPUs, as I have done in the past.
>
> One potential issue here: the point where RCU_EXPEDITE_BOOT pessimizes
> likely depends on interconnect as much as CPUs. I'd guess that you may
> want to set the cutoff based on number of NUMA nodes, rather than number
> of CPUs.
Longer term, the solution would be a function that defines the cutoff
point. If an architecture doesn't define the function, it gets the
default cutoff of 256. If it does define the function, it gets whatever
it coded.
Thanx, Paul
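
Something like the following sketch would match that description; the
function names are hypothetical, and __weak lets an architecture supply
its own cutoff:

	#include <linux/compiler.h>
	#include <linux/cpumask.h>
	#include <linux/init.h>

	/* Default CPU cutoff; architectures may override this weak definition. */
	int __weak rcu_expedite_boot_cpu_cutoff(void)
	{
		return 256;
	}

	/* Expedite boot-time grace periods only below the cutoff. */
	static bool __init rcu_expedite_boot_wanted(void)
	{
		return num_possible_cpus() <= rcu_expedite_boot_cpu_cutoff();
	}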