2015-04-16 18:38:24

by Paul E. McKenney

Subject: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

Hello, Ingo,

This series contains a single change that fixes Kconfig asking pointless
questions (https://lkml.org/lkml/2015/4/14/616). This is an RFC pull
because there has not yet been a -next build for April 16th. If you
would prefer to wait until after -next has pulled this, please let me
know and I will redo this pull request after that has happened.

In the meantime, this change is available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo

for you to fetch changes up to 8d7dc9283f399e1fda4e48a1c453f689326d9396:

rcu: Control grace-period delays directly from value (2015-04-14 19:33:59 -0700)

----------------------------------------------------------------
Paul E. McKenney (1):
rcu: Control grace-period delays directly from value

kernel/rcu/tree.c | 16 +++++++++-------
lib/Kconfig.debug | 1 +
2 files changed, 10 insertions(+), 7 deletions(-)


2015-04-18 13:03:48

by Ingo Molnar

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions


* Paul E. McKenney <[email protected]> wrote:

> Hello, Ingo,
>
> This series contains a single change that fixes Kconfig asking pointless
> questions (https://lkml.org/lkml/2015/4/14/616). This is an RFC pull
> because there has not yet been a -next build for April 16th. If you
> would prefer to wait until after -next has pulled this, please let me
> know and I will redo this pull request after that has happened.
>
> In the meantime, this change is available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo
>
> for you to fetch changes up to 8d7dc9283f399e1fda4e48a1c453f689326d9396:
>
> rcu: Control grace-period delays directly from value (2015-04-14 19:33:59 -0700)
>
> ----------------------------------------------------------------
> Paul E. McKenney (1):
> rcu: Control grace-period delays directly from value
>
> kernel/rcu/tree.c | 16 +++++++++-------
> lib/Kconfig.debug | 1 +
> 2 files changed, 10 insertions(+), 7 deletions(-)

Pulled, thanks a lot Paul!

Note, while this fixes Linus's immediate complaint that arose from the
new option, I still think we need to do more fixes in this area.

To demonstrate the current situation I tried the following experiment,
I did a 'make defconfig' on an x86 box and then took the .config and
deleted all 'RCU Subsystem' options not marked as debugging.

Then I did a 'make oldconfig' to see what kinds of questions a user is
facing when trying to configure RCU:

*
* Restart config...
*
*
* RCU Subsystem
*
RCU Implementation
> 1. Tree-based hierarchical RCU (TREE_RCU) (NEW)
choice[1]: 1
Task_based RCU implementation using voluntary context switch (TASKS_RCU) [N/y/?] (NEW)
Consider userspace as in RCU extended quiescent state (RCU_USER_QS) [N/y/?] (NEW)
Tree-based hierarchical RCU fanout value (RCU_FANOUT) [64] (NEW)
Tree-based hierarchical RCU leaf-level fanout value (RCU_FANOUT_LEAF) [16] (NEW)
Disable tree-based hierarchical RCU auto-balancing (RCU_FANOUT_EXACT) [N/y/?] (NEW)
Accelerate last non-dyntick-idle CPU's grace periods (RCU_FAST_NO_HZ) [N/y/?] (NEW)
Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)
#
# configuration written to .config
#

Only TREE_RCU is available on defconfig, so all the other options
marked with '(NEW)' were offered as an interactive prompt.

I don't think that any of the 8 interactive options (!) here are
particularly useful to even advanced users who configure kernels, and
I don't think they should be offered under non-expert settings.

Instead we should pick a preferred RCU configuration based on other
hints (such as CONFIG_NR_CPUS and CONFIG_NO_HZ settings), and if users
or distribution makers find some problem with that, we should address
those specific complaints.

Making everything under the sun configurable, with which non-RCU
experts cannot really do anything anyway, isn't very user friendly -
and results in:

- user confusion and frustration

- possibly messed up configurations

- it also hides inefficiencies that might arise from our defaults:
someone genuinely finding a problem might just tweak the .config,
without ever communicating that bad default to us.

So doing (much!) less is in general the best option for Kconfig driven
UIs.

Ingo

2015-04-18 13:34:53

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Sat, Apr 18, 2015 at 03:03:41PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <[email protected]> wrote:
>
> > Hello, Ingo,
> >
> > This series contains a single change that fixes Kconfig asking pointless
> > questions (https://lkml.org/lkml/2015/4/14/616). This is an RFC pull
> > because there has not yet been a -next build for April 16th. If you
> > would prefer to wait until after -next has pulled this, please let me
> > know and I will redo this pull request after that has happened.
> >
> > In the meantime, this change is available in the git repository at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo
> >
> > for you to fetch changes up to 8d7dc9283f399e1fda4e48a1c453f689326d9396:
> >
> > rcu: Control grace-period delays directly from value (2015-04-14 19:33:59 -0700)
> >
> > ----------------------------------------------------------------
> > Paul E. McKenney (1):
> > rcu: Control grace-period delays directly from value
> >
> > kernel/rcu/tree.c | 16 +++++++++-------
> > lib/Kconfig.debug | 1 +
> > 2 files changed, 10 insertions(+), 7 deletions(-)
>
> Pulled, thanks a lot Paul!
>
> Note, while this fixes Linus's immediate complaint that arose from the
> new option, I still think we need to do more fixes in this area.

Good point!

> To demonstrate the current situation I tried the following experiment,
> I did a 'make defconfig' on an x86 box and then took the .config and
> deleted all 'RCU Subsystem' options not marked as debugging.
>
> Then I did a 'make oldconfig' to see what kinds of questions a user is
> facing when trying to configure RCU:
>
> *
> * Restart config...
> *
> *
> * RCU Subsystem
> *
> RCU Implementation
> > 1. Tree-based hierarchical RCU (TREE_RCU) (NEW)
> choice[1]: 1

Hmmm... Given that there is no choice, I agree that it is a bit silly
to ask...

> Task_based RCU implementation using voluntary context switch (TASKS_RCU) [N/y/?] (NEW)

Agreed, this one should be driven directly off of CONFIG_RCU_TORTURE_TEST
and the tracing use case.

> Consider userspace as in RCU extended quiescent state (RCU_USER_QS) [N/y/?] (NEW)

This should be driven directly off of CONFIG_NO_HZ_FULL, unless
Frederic knows something I don't.

> Tree-based hierarchical RCU fanout value (RCU_FANOUT) [64] (NEW)

Hmmm... I could drop/obscure this one in favor of a boot parameter.

> Tree-based hierarchical RCU leaf-level fanout value (RCU_FANOUT_LEAF) [16] (NEW)

Ditto -- though large configurations really do set this to 64 in combination
with the skew_tick boot parameter. Maybe we need to drive these off of
some large-system parameter, like CONFIG_MAX_SMP.

> Disable tree-based hierarchical RCU auto-balancing (RCU_FANOUT_EXACT) [N/y/?] (NEW)

I should just make this a boot parameter. Absolutely no reason for it to
be a Kconfig parameter.

> Accelerate last non-dyntick-idle CPU's grace periods (RCU_FAST_NO_HZ) [N/y/?] (NEW)

On this one, I have no idea. Its purpose is energy efficiency, but it
does have some downsides, for example, increasing idle entry/exit latency.
I am a bit nervous about having it be a boot parameter because that
would leave an extra compare-branch in the path. This one will require
some thought.

> Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)

Indeed, Linus complained about this one. ;-)

This Kconfig parameter is a stopgap, and needs a real solution. People
with crazy-heavy workloads involving realtime cannot live without it,
but that means that most people don't have to care. I have had solving
this on my list, and this clearly increases its priority.

> Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)

Hmmm... Maybe a boot parameter, but I thought that there was some reason
that this was problematic. I will have to take another look.

Anyway, this one is important to non-NO_HZ_FULL real-time workloads.
In a -rt kernel, making CONFIG_PREEMPT_RT (or whatever it is these
days) drive this one makes a lot of sense.

> #
> # configuration written to .config
> #
>
> Only TREE_RCU is available on defconfig, so all the other options
> marked with '(NEW)' were offered as an interactive prompt.
>
> I don't think that any of the 8 interactive options (!) here are
> particularly useful to even advanced users who configure kernels, and
> I don't think they should be offered under non-expert settings.

Would it make sense to have a CONFIG_RCU_EXPERT setting to hide the
remaining settings? That would reduce the common-case number of
questions to one, which would be a quick and safe improvement.
Especially when combined with the changes I called out above.

> Instead we should pick a preferred RCU configuration based on other
> hints (such as CONFIG_NR_CPUS and CONFIG_NO_HZ settings), and if users
> or distribution makers find some problem with that, we should address
> those specific complaints.
>
> Making everything under the sun configurable, with which non-RCU
> experts cannot really do anything anyway, isn't very user friendly -
> and results in:
>
> - user confusion and frustration
>
> - possibly messed up configurations
>
> - it also hides inefficiencies that might arise from our defaults:
> someone genuinely finding a problem might just tweak the .config,
> without ever communicating that bad default to us.
>
> So doing (much!) less is in general the best option for Kconfig driven
> UIs.

I certainly cannot argue with this point!

Thanx, Paul

2015-04-18 14:32:46

by Ingo Molnar

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions


* Paul E. McKenney <[email protected]> wrote:

> On Sat, Apr 18, 2015 at 03:03:41PM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <[email protected]> wrote:
> >
> > > Hello, Ingo,
> > >
> > > This series contains a single change that fixes Kconfig asking pointless
> > > questions (https://lkml.org/lkml/2015/4/14/616). This is an RFC pull
> > > because there has not yet been a -next build for April 16th. If you
> > > would prefer to wait until after -next has pulled this, please let me
> > > know and I will redo this pull request after that has happened.
> > >
> > > In the meantime, this change is available in the git repository at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo
> > >
> > > for you to fetch changes up to 8d7dc9283f399e1fda4e48a1c453f689326d9396:
> > >
> > > rcu: Control grace-period delays directly from value (2015-04-14 19:33:59 -0700)
> > >
> > > ----------------------------------------------------------------
> > > Paul E. McKenney (1):
> > > rcu: Control grace-period delays directly from value
> > >
> > > kernel/rcu/tree.c | 16 +++++++++-------
> > > lib/Kconfig.debug | 1 +
> > > 2 files changed, 10 insertions(+), 7 deletions(-)
> >
> > Pulled, thanks a lot Paul!
> >
> > Note, while this fixes Linus's immediate complaint that arose from the
> > new option, I still think we need to do more fixes in this area.
>
> Good point!
>
> > To demonstrate the current situation I tried the following experiment,
> > I did a 'make defconfig' on an x86 box and then took the .config and
> > deleted all 'RCU Subsystem' options not marked as debugging.
> >
> > Then I did a 'make oldconfig' to see what kinds of questions a user is
> > facing when trying to configure RCU:
> >
> > *
> > * Restart config...
> > *
> > *
> > * RCU Subsystem
> > *
> > RCU Implementation
> > > 1. Tree-based hierarchical RCU (TREE_RCU) (NEW)
> > choice[1]: 1
>
> Hmmm... Given that there is no choice, I agree that it is a bit silly
> to ask...

To clarify: this doesn't actually ask - it gets skipped by the kconfig
tool. All the rest is an interactive prompt.

> > Task_based RCU implementation using voluntary context switch (TASKS_RCU) [N/y/?] (NEW)
>
> Agreed, this one should be driven directly off of CONFIG_RCU_TORTURE_TEST
> and the tracing use case.

Yeah.

> > Consider userspace as in RCU extended quiescent state (RCU_USER_QS) [N/y/?] (NEW)
>
> This should be driven directly off of CONFIG_NO_HZ_FULL, unless
> Frederic knows something I don't.

Yes.

> > Tree-based hierarchical RCU fanout value (RCU_FANOUT) [64] (NEW)
>
> Hmmm... I could drop/obscure this one in favor of a boot parameter.

Well, what I think might be even better is to make it scale based on
CONFIG_NR_CPUS. Distros already actively manage the 'maximum number of
CPUs we support', so relying on that value makes sense.

So if someone sets CONFIG_NR_CPUS to 1024, it gets scaled accordingly.
If CONFIG_NR_CPUS is set to 2, it gets scaled to a minimal config.
Note that this would exercise and test the affected codepaths better
as well, as we'd get different size setups.

As for the boot option to override it: what would be the usecase for
that?

> > Tree-based hierarchical RCU leaf-level fanout value (RCU_FANOUT_LEAF) [16] (NEW)
>
> Ditto -- though large configurations really do set this to 64 in
> combination with the skew_tick boot parameter. Maybe we need to
> drive these off of some large-system parameter, like CONFIG_MAX_SMP.

Or rather CONFIG_NR_CPUS. CONFIG_MAX_SMP is really a debugging thing,
to configure the system to the silliest high settings that don't
outright crash - but it doesn't make much sense otherwise.

> > Disable tree-based hierarchical RCU auto-balancing (RCU_FANOUT_EXACT) [N/y/?] (NEW)
>
> I should just make this a boot parameter. Absolutely no reason for
> it to be a Kconfig parameter.

Again I'd size this to NR_CPUS - and for the boot parameter, I'd think
about actual usecases.

> > Accelerate last non-dyntick-idle CPU's grace periods (RCU_FAST_NO_HZ) [N/y/?] (NEW)
>
> On this one, I have no idea. Its purpose is energy efficiency, but
> it does have some downsides, for example, increasing idle entry/exit
> latency. I am a bit nervous about having it be a boot parameter
> because that would leave an extra compare-branch in the path. This
> one will require some thought.

Keeping this one configurable, with a good default and a good
explanation makes sense. There's a lot of

> > Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
>
> Indeed, Linus complained about this one. ;-)

:-) Yes, it's an essentially unanswerable question.

> This Kconfig parameter is a stopgap, and needs a real solution.
> People with crazy-heavy workloads involving realtime cannot live
> without it, but that means that most people don't have to care. I
> have had solving this on my list, and this clearly increases its
> priority.

So what value do they use, prio 99? 98? It might be better to offer
this option as a binary choice, and set a given priority. If -rt
people complain then they might help us in solving it properly.

> > Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)
>
> Hmmm... Maybe a boot parameter, but I thought that there was some
> reason that this was problematic. I will have to take another look.
>
> Anyway, this one is important to non-NO_HZ_FULL real-time workloads.
> In a -rt kernel, making CONFIG_PREEMPT_RT (or whatever it is these
> days) drive this one makes a lot of sense.

Ok.

>
> > #
> > # configuration written to .config
> > #
> >
> > Only TREE_RCU is available on defconfig, so all the other options
> > marked with '(NEW)' were offered as an interactive prompt.
> >
> > I don't think that any of the 8 interactive options (!) here are
> > particularly useful to even advanced users who configure kernels, and
> > I don't think they should be offered under non-expert settings.
>
> Would it make sense to have a CONFIG_RCU_EXPERT setting to hide the
> remaining settings? That would reduce the common-case number of
> questions to one, which would be a quick and safe improvement.
> Especially when combined with the changes I called out above.

Yes, that's absolutely sensible - although I'd also do the
CONFIG_NR_CPUS based auto-scaling if it's not set, to make sure
distros don't end up tuning this (inevitably imperfectly) which won't
flow back upstream:

That's the other main problem with widely tunable, numeric settings,
beyond their user hostility: if they are wrong and are corrected in a
distro they don't flow back to upstream, so they are dead end
mechanisms as far as code quality and good defaults are concerned.

Thanks,

Ingo

2015-04-19 02:05:51

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Sat, Apr 18, 2015 at 04:32:38PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <[email protected]> wrote:
>
> > On Sat, Apr 18, 2015 at 03:03:41PM +0200, Ingo Molnar wrote:
> > >
> > > * Paul E. McKenney <[email protected]> wrote:
> > >
> > > > Hello, Ingo,
> > > >
> > > > This series contains a single change that fixes Kconfig asking pointless
> > > > questions (https://lkml.org/lkml/2015/4/14/616). This is an RFC pull
> > > > because there has not yet been a -next build for April 16th. If you
> > > > would prefer to wait until after -next has pulled this, please let me
> > > > know and I will redo this pull request after that has happened.
> > > >
> > > > In the meantime, this change is available in the git repository at:
> > > >
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo
> > > >
> > > > for you to fetch changes up to 8d7dc9283f399e1fda4e48a1c453f689326d9396:
> > > >
> > > > rcu: Control grace-period delays directly from value (2015-04-14 19:33:59 -0700)
> > > >
> > > > ----------------------------------------------------------------
> > > > Paul E. McKenney (1):
> > > > rcu: Control grace-period delays directly from value
> > > >
> > > > kernel/rcu/tree.c | 16 +++++++++-------
> > > > lib/Kconfig.debug | 1 +
> > > > 2 files changed, 10 insertions(+), 7 deletions(-)
> > >
> > > Pulled, thanks a lot Paul!
> > >
> > > Note, while this fixes Linus's immediate complaint that arose from the
> > > new option, I still think we need to do more fixes in this area.
> >
> > Good point!
> >
> > > To demonstrate the current situation I tried the following experiment,
> > > I did a 'make defconfig' on an x86 box and then took the .config and
> > > deleted all 'RCU Subsystem' options not marked as debugging.
> > >
> > > Then I did a 'make oldconfig' to see what kinds of questions a user is
> > > facing when trying to configure RCU:
> > >
> > > *
> > > * Restart config...
> > > *
> > > *
> > > * RCU Subsystem
> > > *
> > > RCU Implementation
> > > > 1. Tree-based hierarchical RCU (TREE_RCU) (NEW)
> > > choice[1]: 1
> >
> > Hmmm... Given that there is no choice, I agree that it is a bit silly
> > to ask...
>
> To clarify: this doesn't actually ask - it gets skipped by the kconfig
> tool. All the rest is an interactive prompt.

Ah, good point!

> > > Task_based RCU implementation using voluntary context switch (TASKS_RCU) [N/y/?] (NEW)
> >
> > Agreed, this one should be driven directly off of CONFIG_RCU_TORTURE_TEST
> > and the tracing use case.
>
> Yeah.

OK, will do.

> > > Consider userspace as in RCU extended quiescent state (RCU_USER_QS) [N/y/?] (NEW)
> >
> > This should be driven directly off of CONFIG_NO_HZ_FULL, unless
> > Frederic knows something I don't.
>
> Yes.

Then unless Frederic objects... ;-)

> > > Tree-based hierarchical RCU fanout value (RCU_FANOUT) [64] (NEW)
> >
> > Hmmm... I could drop/obscure this one in favor of a boot parameter.
>
> Well, what I think might be even better is to make it scale based on
> CONFIG_NR_CPUS. Distros already actively manage the 'maximum number of
> CPUs we support', so relying on that value makes sense.
>
> So if someone sets CONFIG_NR_CPUS to 1024, it gets scaled accordingly.
> If CONFIG_NR_CPUS is set to 2, it gets scaled to a minimal config.
> Note that this would exercise and test the affected codepaths better
> as well, as we'd get different size setups.
>
> As for the boot option to override it: what would be the usecase for
> that?

Well, in normal circumstances, it should be 64 for 64-bit systems and
32 for 32-bit systems, regardless of number of CPUs. But if you had
an odd-sized multisocket system with extremely high socket-to-socket
memory latencies, you might want to select a different value. For
a silly example, suppose your system had 27 hardware threads per socket.
Then you might want to set both RCU_FANOUT_LEAF and RCU_FANOUT to 27.

Or use a boot parameter to do so, as can be done today for RCU_FANOUT_LEAF.
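
[Editorial illustration, not part of the original message.] The fanout arithmetic discussed above can be sketched in C. This is a simplified model, not the kernel's code: the helper names rcu_default_fanout() and rcu_levels_needed() are hypothetical, and the model assumes each leaf rcu_node covers up to fanout_leaf CPUs while each additional level multiplies coverage by fanout.

```c
#include <limits.h>

/*
 * Hypothetical helper: the "normal circumstances" fanout described above,
 * 64 on 64-bit systems and 32 on 32-bit systems, modeled here as the
 * number of bits in a long.
 */
static int rcu_default_fanout(void)
{
	return (int)(sizeof(long) * CHAR_BIT);
}

/*
 * Hypothetical helper: number of rcu_node levels needed to cover nr_cpus
 * CPUs when each leaf handles up to fanout_leaf CPUs and each interior
 * node has up to fanout children. Returns -1 for nonsensical inputs.
 */
static int rcu_levels_needed(unsigned long nr_cpus,
			     unsigned long fanout_leaf,
			     unsigned long fanout)
{
	unsigned long capacity = fanout_leaf;	/* CPUs a one-level tree covers */
	int levels = 1;

	if (fanout_leaf < 1 || fanout < 2)
		return -1;

	while (capacity < nr_cpus) {
		capacity *= fanout;	/* each extra level multiplies coverage */
		levels++;
	}
	return levels;
}
```

Under this model, the default leaf fanout of 16 with a fanout of 64 covers up to 1024 CPUs in two levels, and a four-socket system with 27 hardware threads per socket (108 CPUs) needs two levels when both values are set to 27.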

@@@

> > > Tree-based hierarchical RCU leaf-level fanout value (RCU_FANOUT_LEAF) [16] (NEW)
> >
> > Ditto -- though large configurations really do set this to 64 in
> > combination with the skew_tick boot parameter. Maybe we need to
> > drive these off of some large-system parameter, like CONFIG_MAX_SMP.
>
> Or rather CONFIG_NR_CPUS. CONFIG_MAX_SMP is really a debugging thing,
> to configure the system to the silliest high settings that don't
> outright crash - but it doesn't make much sense otherwise.

Except that setting RCU_FANOUT_LEAF to 64 without also booting with
skew_tick=1 is a really bad idea, as the synchronized scheduling-clock
interrupts will cause ugly levels of lock contention on the rcu_node
->lock. :-(

But perhaps making the default value of sched_skew_tick be 1 if
RCU_FANOUT_LEAF is greater than 16 is the right solution.
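
[Editorial illustration, not part of the original message.] A minimal C sketch of the idea just proposed: default to skewed ticks whenever the leaf fanout exceeds 16, so that CPUs sharing an rcu_node do not all take their scheduling-clock interrupt at the same instant. The function names are hypothetical, and spreading the offsets across half a tick period is an assumption of this sketch rather than something stated in the thread.

```c
#include <stdint.h>

/*
 * Hypothetical default: enable tick skewing when the leaf fanout is
 * larger than 16, as suggested above.
 */
static int default_skew_tick(unsigned int rcu_fanout_leaf)
{
	return rcu_fanout_leaf > 16;
}

/*
 * Hypothetical model of tick skewing: when enabled, stagger each CPU's
 * tick by an equal share of half a tick period (assumption), so that
 * ticks no longer fire simultaneously on every CPU.
 */
static uint64_t tick_offset_ns(int skew, uint64_t tick_period_ns,
			       unsigned int nr_cpus, unsigned int cpu)
{
	if (!skew || nr_cpus == 0)
		return 0;	/* synchronized ticks: no per-CPU offset */
	return (tick_period_ns / 2 / nr_cpus) * cpu;
}
```

With a 1 ms tick period and four CPUs, this model gives CPU 3 an offset of 375000 ns when skewing is on and 0 ns when it is off.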

> > > Disable tree-based hierarchical RCU auto-balancing (RCU_FANOUT_EXACT) [N/y/?] (NEW)
> >
> > I should just make this a boot parameter. Absolutely no reason for
> > it to be a Kconfig parameter.
>
> Again I'd size this to NR_CPUS - and for the boot parameter, I'd think
> about actual usecases.

The intended use case is related to the odd-sized systems mentioned
for RCU_FANOUT. By default, we spread CPUs across the leaf-level
rcu_node structures to reduce lock contention, via RCU_FANOUT_EXACT=n.
Systems with high remote memory latencies might want RCU_FANOUT_EXACT=y
to have full control of the geometry.
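
[Editorial illustration, not part of the original message.] The difference between the balanced default and RCU_FANOUT_EXACT=y can be sketched as follows. This is a simplified model of the leaf level only, with hypothetical helper names; it assumes that balancing divides the CPUs evenly (rounding up) among the minimum number of leaves, while the exact setting fills each leaf to fanout_leaf before starting the next.

```c
/* Hypothetical helper: number of leaf rcu_node structures required. */
static unsigned long leaf_count(unsigned long nr_cpus,
				unsigned long fanout_leaf)
{
	return (nr_cpus + fanout_leaf - 1) / fanout_leaf;	/* round up */
}

/*
 * Hypothetical helper: CPUs assigned per leaf. With exact != 0 every
 * leaf is filled to fanout_leaf; otherwise CPUs are spread evenly over
 * the leaves to reduce contention on any single leaf's lock.
 */
static unsigned long cpus_per_leaf(unsigned long nr_cpus,
				   unsigned long fanout_leaf, int exact)
{
	unsigned long leaves = leaf_count(nr_cpus, fanout_leaf);

	if (exact || leaves == 0)
		return fanout_leaf;
	return (nr_cpus + leaves - 1) / leaves;	/* even spread, rounded up */
}

/* Hypothetical helper: CPUs that actually land in leaf number idx. */
static unsigned long cpus_in_leaf(unsigned long idx, unsigned long nr_cpus,
				  unsigned long fanout_leaf, int exact)
{
	unsigned long spread = cpus_per_leaf(nr_cpus, fanout_leaf, exact);
	unsigned long first = idx * spread;

	if (first >= nr_cpus)
		return 0;	/* this leaf is past the last CPU */
	return (nr_cpus - first < spread) ? nr_cpus - first : spread;
}
```

For 40 CPUs and a leaf fanout of 16, this model places 16, 16, and 8 CPUs in the three leaves with the exact setting, and 14, 14, and 12 with balancing.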

Maybe I should just eliminate this choice, forcing the current default.

> > > Accelerate last non-dyntick-idle CPU's grace periods (RCU_FAST_NO_HZ) [N/y/?] (NEW)
> >
> > On this one, I have no idea. Its purpose is energy efficiency, but
> > it does have some downsides, for example, increasing idle entry/exit
> > latency. I am a bit nervous about having it be a boot parameter
> > because that would leave an extra compare-branch in the path. This
> > one will require some thought.
>
> Keeping this one configurable, with a good default and a good
> explanation makes sense. There's a lot of
>
> > > Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
> >
> > Indeed, Linus complained about this one. ;-)
>
> :-) Yes, it's an essentially unanswerable question.
>
> > This Kconfig parameter is a stopgap, and needs a real solution.
> > People with crazy-heavy workloads involving realtime cannot live
> > without it, but that means that most people don't have to care. I
> > have had solving this on my list, and this clearly increases its
> > priority.
>
> So what value do they use, prio 99? 98? It might be better to offer
> this option as a binary choice, and set a given priority. If -rt
> people complain then they might help us in solving it properly.

I honestly do not remember what priority they were using, it is
not in email, and I don't keep IRC logs that far back. Adding
[email protected] on CC.

> > > Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)
> >
> > Hmmm... Maybe a boot parameter, but I thought that there was some
> > reason that this was problematic. I will have to take another look.
> >
> > Anyway, this one is important to non-NO_HZ_FULL real-time workloads.
> > In a -rt kernel, making CONFIG_PREEMPT_RT (or whatever it is these
> > days) drive this one makes a lot of sense.
>
> Ok.

But in the meantime, it looks like making non-default settings depend on
RCU_EXPERT is the right thing to do.

> > > #
> > > # configuration written to .config
> > > #
> > >
> > > Only TREE_RCU is available on defconfig, so all the other options
> > > marked with '(NEW)' were offered as an interactive prompt.
> > >
> > > I don't think that any of the 8 interactive options (!) here are
> > > particularly useful to even advanced users who configure kernels, and
> > > I don't think they should be offered under non-expert settings.
> >
> > Would it make sense to have a CONFIG_RCU_EXPERT setting to hide the
> > remaining settings? That would reduce the common-case number of
> > questions to one, which would be a quick and safe improvement.
> > Especially when combined with the changes I called out above.
>
> Yes, that's absolutely sensible - although I'd also do the
> CONFIG_NR_CPUS based auto-scaling if it's not set, to make sure
> distros don't end up tuning this (inevitably imperfectly) which won't
> flow back upstream:
>
> That's the other main problem with widely tunable, numeric settings,
> beyond their user hostility: if they are wrong and are corrected in a
> distro they don't flow back to upstream, so they are dead end
> mechanisms as far as code quality and good defaults are concerned.

OK, I will put the surviving options under CONFIG_RCU_EXPERT, and I will
check around to see if I can find any cases of distros setting them to
non-default values.

Thanx, Paul

2015-04-20 16:36:10

by Clark Williams

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Sat, 18 Apr 2015 19:05:42 -0700
"Paul E. McKenney" <[email protected]> wrote:
> > > > Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
> > >
> > > Indeed, Linus complained about this one. ;-)
> >
> > :-) Yes, it's an essentially unanswerable question.
> >
> > > This Kconfig parameter is a stopgap, and needs a real solution.
> > > People with crazy-heavy workloads involving realtime cannot live
> > > without it, but that means that most people don't have to care. I
> > > have had solving this on my list, and this clearly increases its
> > > priority.
> >
> > So what value do they use, prio 99? 98? It might be better to offer
> > this option as a binary choice, and set a given priority. If -rt
> > people complain then they might help us in solving it properly.
>
> I honestly do not remember what priority they were using, it is
> not in email, and I don't keep IRC logs that far back. Adding
> [email protected] on CC.

As I recall, we started out using fifo:1, but when you get heavy
workloads running at higher fifo priorities, we wanted to boost the rcu
worker threads over those workloads.

Currently the irq threads default to fifo:50, so maybe a good
default choice for the rcu threads on RT is fifo:49. That of course
presumes rational behavior on the part of application developers.

I seem to recall that you and I had a discussion about making this
value a runtime knob in /sys but that didn't go anywhere. Do we need to
crank that up again and just use the config as a default/starting
value? If so then we could just default to fifo:1 and let sysadmins
tweak the value to match up with the workload.

Clark



2015-04-20 17:09:10

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, Apr 20, 2015 at 11:35:54AM -0500, Clark Williams wrote:
> On Sat, 18 Apr 2015 19:05:42 -0700
> "Paul E. McKenney" <[email protected]> wrote:
> > > > > Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
> > > >
> > > > Indeed, Linus complained about this one. ;-)
> > >
> > > :-) Yes, it's an essentially unanswerable question.
> > >
> > > > This Kconfig parameter is a stopgap, and needs a real solution.
> > > > People with crazy-heavy workloads involving realtime cannot live
> > > > without it, but that means that most people don't have to care. I
> > > > have had solving this on my list, and this clearly increases its
> > > > priority.
> > >
> > > So what value do they use, prio 99? 98? It might be better to offer
> > > this option as a binary choice, and set a given priority. If -rt
> > > people complain then they might help us in solving it properly.
> >
> > I honestly do not remember what priority they were using, it is
> > not in email, and I don't keep IRC logs that far back. Adding
> > [email protected] on CC.
>
> As I recall, we started out using fifo:1, but when you get heavy
> workloads running at higher fifo priorities, we wanted to boost the rcu
> worker threads over those workloads.
>
> Currently the irq threads default to fifo:50, so maybe a good
> default choice for the rcu threads on RT is fifo:49. That of course
> presumes rational behavior on the part of application developers.
>
> I seem to recall that you and I had a discussion about making this
> value a runtime knob in /sys but that didn't go anywhere. Do we need to
> crank that up again and just use the config as a default/starting
> value? If so then we could just default to fifo:1 and let sysadmins
> tweak the value to match up with the workload.

The sysfs knob might be nice, but as far as I know nobody has been
complaining about it.

Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
So how about if the Kconfig parameter selects either SCHED_OTHER
(the default) or SCHED_FIFO:1, and then the boot parameter can be used
to select other values.
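
[Editorial illustration, not part of the original message.] The proposal above can be modeled as a small decision function: a binary Kconfig choice picks SCHED_OTHER or SCHED_FIFO priority 1, and a boot-time priority, when given, overrides it. All names here are hypothetical, and the clamp to 99 (the usual maximum SCHED_FIFO priority on Linux) is an assumption of this sketch.

```c
/* Hypothetical stand-ins for the two scheduling policies discussed. */
enum sketch_policy {
	SKETCH_SCHED_OTHER,
	SKETCH_SCHED_FIFO,
};

struct sketch_sched {
	enum sketch_policy policy;
	int prio;	/* 0 for SCHED_OTHER, 1..99 for SCHED_FIFO */
};

/*
 * Hypothetical model: kconfig_fifo is the binary Kconfig choice, and
 * boot_prio models a boot-parameter value, with 0 or less meaning
 * "not specified on the command line".
 */
static struct sketch_sched rcu_kthread_sched(int kconfig_fifo, int boot_prio)
{
	struct sketch_sched s = { SKETCH_SCHED_OTHER, 0 };

	if (boot_prio > 0) {
		/* The boot parameter wins; clamp to the assumed maximum. */
		s.policy = SKETCH_SCHED_FIFO;
		s.prio = boot_prio > 99 ? 99 : boot_prio;
	} else if (kconfig_fifo) {
		/* Kconfig asked for real-time: lowest FIFO priority. */
		s.policy = SKETCH_SCHED_FIFO;
		s.prio = 1;
	}
	return s;
}
```

In this model, a default kernel booted with a priority of 49 would run the RCU kthreads at fifo:49, just below the fifo:50 irq threads mentioned earlier in the thread.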

That said, if the lack of a sysfs knob has been causing real problems,
let's make that happen.

Thanx, Paul

2015-04-20 17:59:32

by Clark Williams

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, 20 Apr 2015 10:09:03 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Mon, Apr 20, 2015 at 11:35:54AM -0500, Clark Williams wrote:
> > On Sat, 18 Apr 2015 19:05:42 -0700
> > "Paul E. McKenney" <[email protected]> wrote:
> > > > > > Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
> > > > >
> > > > > Indeed, Linus complained about this one. ;-)
> > > >
> > > > :-) Yes, it's an essentially unanswerable question.
> > > >
> > > > > This Kconfig parameter is a stopgap, and needs a real solution.
> > > > > People with crazy-heavy workloads involving realtime cannot live
> > > > > without it, but that means that most people don't have to care. I
> > > > > have had solving this on my list, and this clearly increases its
> > > > > priority.
> > > >
> > > > So what value do they use, prio 99? 98? It might be better to offer
> > > > this option as a binary choice, and set a given priority. If -rt
> > > > people complain then they might help us in solving it properly.
> > >
> > > I honestly do not remember what priority they were using, it is
> > > not in email, and I don't keep IRC logs that far back. Adding
> > > [email protected] on CC.
> >
> > As I recall, we started out using fifo:1, but when you get heavy
> > workloads running at higher fifo priorities, we wanted to boost the rcu
> > worker threads over those workloads.
> >
> > Currently the irq threads default to fifo:50, so maybe a good
> > default choice for the rcu threads on RT is fifo:49. That of course
> > presumes rational behavior on the part of application developers.
> >
> > I seem to recall that you and I had a discussion about making this
> > value a runtime knob in /sys but that didn't go anywhere. Do we need to
> > crank that up again and just use the config as a default/starting
> > value? If so then we could just default to fifo:1 and let sysadmins
> > tweak the value to match up with the workload.
>
> The sysfs knob might be nice, but as far as I know nobody has been
> complaining about it.
>
> Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> So how about if the Kconfig parameter selects either SCHED_OTHER
> (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> to select other values.

Yeah, that will work.

>
> That said, if the lack of a sysfs knob has been causing real problems,
> let's make that happen.

I'll talk to the other RT-ers and get back to you on that. I suspect
most folks would like it just to not have to reboot while tuning, but
not sure it's worth the extra code.

Clark



2015-04-20 18:01:14

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
>
> The sysfs knob might be nice, but as far as I know nobody has been
> complaining about it.
>
> Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> So how about if the Kconfig parameter selects either SCHED_OTHER
> (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> to select other values.
>
> That said, if the lack of a sysfs knob has been causing real problems,
> let's make that happen.

But then it's too late, because the time of something getting into the kernel
to the time people can use it can be months if not years.

I see no harm in adding one. Pretty much every kernel parameter I added for
ftrace has a sysctl knob for it. (Not a sysfs knob, but a /proc/sys/kernel
knob, which is different.)

-- Steve

2015-04-20 18:09:12

by Ingo Molnar

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions


* Steven Rostedt <[email protected]> wrote:

> On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> >
> > The sysfs knob might be nice, but as far as I know nobody has been
> > complaining about it.
> >
> > Besides, we already have the rcutree.kthread_prio= kernel-boot
> > parameter. So how about if the Kconfig parameter selects either
> > SCHED_OTHER (the default) or SCHED_FIFO:1, and then the boot
> > parameter can be used to select other values.
> >
> > That said, if the lack of a sysfs knob has been causing real
> > problems, let's make that happen.
>
> But then it's too late, because the time of something getting into
> the kernel to the time people can use it can be months if not years.
>
> I see no harm in adding one. Pretty much every kernel parameter I
> added for ftrace has a sysctl knob for it. (Not a sysfs knob, but
> a /proc/sys/kernel knob, which is different.)

So the disadvantage is that if a boot default is wrong, we'll hear
about it eventually and can fix/improve it.

If a sysctl knob is wrong, people will just 'tune' it and forget to
propagate it to the kernel proper (why should they).

Which is fine for something like ftrace and other ad-hoc
instrumentation that is generally very fine tuned to a given bug or
given piece of hardware, but for something like the RCU implementation
of the kernel - even if it's just a RT side thought of it - I'm not so
sure about it.

Thanks,

Ingo

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions



On 04/20/2015 02:59 PM, Clark Williams wrote:
>> > That said, if the lack of a sysfs knob has been causing real problems,
>> > let's make that happen.
> I'll talk to the other RT-ers and get back to you on that. I suspect
> most folks would like it just to not have to reboot while tuning, but
> not sure it's worth the extra code.

I agree that users do not like to reboot the system while tuning
policy:priority. Thus, being able to adjust it without a reboot is a good
option.

Daniel

2015-04-20 18:21:55

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, 20 Apr 2015 20:09:04 +0200
Ingo Molnar <[email protected]> wrote:


> So the disadvantage is that if a boot default is wrong, we'll hear
> about it eventually and can fix/improve it.
>
> If a sysctl knob is wrong, people will just 'tune' it and forget to
> propagate it to the kernel proper (why should they).

My fear is that there is no one true value. One person complains about
it, we change it, then someone else complains about the new value. That
would be even worse.

>
> Which is fine for something like ftrace and other ad-hoc
> instrumentation that is generally very fine tuned to a given bug or
> given piece of hardware, but for something like the RCU implementation
> of the kernel - even if it's just a RT side thought of it - I'm not so
> sure about it.

I would argue that every case is different, and only the sysadmin would
know the right value. Thus, just set it to one, and if that's not good
enough, then the sysadmins can change it to their needs.

-- Steve

2015-04-20 18:28:38

by Ingo Molnar

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions


* Steven Rostedt <[email protected]> wrote:

> On Mon, 20 Apr 2015 20:09:04 +0200
> Ingo Molnar <[email protected]> wrote:
>
>
> > So the disadvantage is that if a boot default is wrong, we'll hear
> > about it eventually and can fix/improve it.
> >
> > If a sysctl knob is wrong, people will just 'tune' it and forget
> > to propagate it to the kernel proper (why should they).
>
> My fear is that there is no one true value. [...]

Do we know that?

> [...] One person complains about it, we change it, then someone else
> complains about the new value. That would be even worse.

At that point we can still add a sysctl, if valid arguments are
offered.

> > Which is fine for something like ftrace and other ad-hoc
> > instrumentation that is generally very fine tuned to a given bug
> > or given piece of hardware, but for something like the RCU
> > implementation of the kernel - even if it's just a RT side thought
> > of it - I'm not so sure about it.
>
> I would argue that every case is different, and only the sysadmin
> would know the right value. Thus, just set it to one, and if that's
> not good enough, then the sysadmins can change it to their needs.

Well, we had really bad experience with sysctls in the past, in
particular in the VM: with various settings exposed and distros
'tuning' them - sometimes radically changing the way the system
worked, confusing everyone involved.

So I'm in general opposed to sysctls for core kernel behavior - except
for cases where we don't know better.

Instrumentation - especially instrumentation that should have been
implemented mostly in user-space, like ftrace ;-) - is another special
case that should stay as flexible as possible via sysctls, obviously.

Thanks,

Ingo

2015-04-20 18:34:49

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, 20 Apr 2015 20:28:32 +0200
Ingo Molnar <[email protected]> wrote:

> Instrumentation - especially instrumentation that should have been
> implemented mostly in user-space, like ftrace ;-) - is another special
> case that should stay as flexible as possible via sysctls, obviously.

I know I used ftrace as an example, but a more appropriate example
would be the sched knobs, as this is more about rcu scheduling than
anything else.

See:

sched_autogroup_enabled sched_rr_timeslice_ms
sched_child_runs_first sched_rt_period_us
sched_domain/ sched_rt_runtime_us
sched_latency_ns sched_shares_window_ns
sched_migration_cost_ns sched_time_avg_ms
sched_min_granularity_ns sched_tunable_scaling
sched_nr_migrate sched_wakeup_granularity_ns

In particular, the sched_rt_* ones.


-- Steve
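
For readers who want to inspect the sched_rt_* knobs listed above, here is a minimal sketch in userspace C that reads a single-valued knob from under /proc/sys/kernel (the location mentioned earlier in this thread). The helper names parse_knob() and read_knob() are hypothetical and exist only for this illustration. The sketch assumes that each knob file holds one decimal integer, possibly negative, followed by a newline.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse one decimal value (possibly negative) followed by optional
 * trailing whitespace.  Returns 0 on success, -1 on malformed input.
 * (Hypothetical helper, for illustration only.)
 */
int parse_knob(const char *text, long *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(text, &end, 10);
	if (end == text || errno != 0)
		return -1;
	while (*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if (*end != '\0')
		return -1;
	*out = v;
	return 0;
}

/*
 * Read a single-valued knob, for example
 * /proc/sys/kernel/sched_rt_runtime_us.  Returns 0 on success, -1 if
 * the file cannot be opened or does not hold a single integer.
 */
int read_knob(const char *path, long *out)
{
	char buf[64];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_knob(buf, out);
	fclose(f);
	return ret;
}
```

Calling read_knob("/proc/sys/kernel/sched_rt_runtime_us", &v) and read_knob("/proc/sys/kernel/sched_rt_period_us", &v) is enough to see the current real-time throttling settings on a running system.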

2015-04-20 20:40:58

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
>
> The sysfs knob might be nice, but as far as I know nobody has been
> complaining about it.
>
> Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> So how about if the Kconfig parameter selects either SCHED_OTHER
> (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> to select other values.

Hmm, what priority is this for, anyway? To change the priority of the boost
value at run time, do we only need to change the priority of the rcub threads?

And the priority of the other rcu threads can change as well with a simple
chrt?

If that's the case, then we don't need a sysctl knob at all.

-- Steve


>
> That said, if the lack of a sysfs knob has been causing real problems,
> let's make that happen.

2015-04-20 21:15:16

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, Apr 20, 2015 at 04:40:49PM -0400, Steven Rostedt wrote:
> On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> >
> > The sysfs knob might be nice, but as far as I know nobody has been
> > complaining about it.
> >
> > Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> > So how about if the Kconfig parameter selects either SCHED_OTHER
> > (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> > to select other values.
>
> Hmm, what priority is this for anyway. To change the priority of the boost
> value at run time, do we only need to change the priority of the rcub threads?
>
> And the priority of the other rcu threads can change as well with a simple
> chrt?
>
> If that's the case, then we don't need a sysctl knob at all.

For the grace-period kthreads and the boost kthread, that is the case.
It is also the case for the per-CPU kthreads that invoke RCU callbacks
for the non-offloaded RCU_BOOST configuration (and that replace all
softirq RCU work in -rt).

So, should I just ditch all of the priority-setting within RCU and tell
users to just use chrt?

Thanx, Paul

2015-04-20 21:50:31

by Clark Williams

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, 20 Apr 2015 14:15:04 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Mon, Apr 20, 2015 at 04:40:49PM -0400, Steven Rostedt wrote:
> > On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> > >
> > > The sysfs knob might be nice, but as far as I know nobody has been
> > > complaining about it.
> > >
> > > Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> > > So how about if the Kconfig parameter selects either SCHED_OTHER
> > > (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> > > to select other values.
> >
> > Hmm, what priority is this for anyway. To change the priority of the boost
> > value at run time, do we only need to change the priority of the rcub threads?
> >
> > And the priority of the other rcu threads can change as well with a simple
> > chrt?
> >
> > If that's the case, then we don't need a sysctl knob at all.
>
> For the grace-period kthreads and the boost kthread, that is the case.
> It is also the case for the per-CPU kthreads that invoke RCU callbacks
> for the non-offloaded RCU_BOOST configuration (and that replace all
> softirq RCU work in -rt).
>
> So, should I just ditch all of the priority-setting within RCU and tell
> users to just use chrt?

Looks to me like all we need to do is tell people if they need a boost
higher than the compiled in default (RCU_KTHREAD_PRIO), then chrt the
priority of the rcub thread to the desired priority.



2015-04-21 01:23:09

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, Apr 20, 2015 at 04:50:07PM -0500, Clark Williams wrote:
> On Mon, 20 Apr 2015 14:15:04 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Mon, Apr 20, 2015 at 04:40:49PM -0400, Steven Rostedt wrote:
> > > On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> > > >
> > > > The sysfs knob might be nice, but as far as I know nobody has been
> > > > complaining about it.
> > > >
> > > > Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> > > > So how about if the Kconfig parameter selects either SCHED_OTHER
> > > > (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> > > > to select other values.
> > >
> > > Hmm, what priority is this for anyway. To change the priority of the boost
> > > value at run time, do we only need to change the priority of the rcub threads?
> > >
> > > And the priority of the other rcu threads can change as well with a simple
> > > chrt?
> > >
> > > If that's the case, then we don't need a sysctl knob at all.
> >
> > For the grace-period kthreads and the boost kthread, that is the case.
> > It is also the case for the per-CPU kthreads that invoke RCU callbacks
> > for the non-offloaded RCU_BOOST configuration (and that replace all
> > softirq RCU work in -rt).
> >
> > So, should I just ditch all of the priority-setting within RCU and tell
> > users to just use chrt?
>
> Looks to me like all we need to do is tell people if they need a boost
> higher than the compiled in default (RCU_KTHREAD_PRIO), then chrt the
> priority of the rcub thread to the desired priority.

There's the rub. They also need to chrt the RCU grace-period kthreads
as well as the per-CPU kthreads (rcuc). Which is a pain and easy to
get wrong.

So at this point, I am leaning towards keeping RCU_KTHREAD_PRIO, but
hiding it behind RCU_EXPERT. Someone in an emergency situation can use
chrt to get RCU going, at least assuming that they had the foresight to
leave a prio-99 shell running somewhere and assuming that they do the
chrt before the system hits OOM. But they have to do all that anyway
if they were to use a sysfs or similar interface. And it is easy to
tell when you have boosted all the necessary kthreads because RCU
grace periods start advancing once again. You don't get that feedback
when you set things up at boot time. ;-)

So again, at least for the moment, I believe that RCU need not provide
a run-time interface for changing RCU kthread priorities, that the
RCU_KTHREAD_PRIO Kconfig parameter should remain, except that it needs
to be hidden behind RCU_EXPERT, and that the rcutree.kthread_prio=
kernel-boot parameter should also remain.

Seem reasonable?

Thanx, Paul
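
The point above, that the boost (rcub), per-CPU (rcuc), and grace-period kthreads all need the same treatment, can be sketched in userspace C roughly as follows. This is only an illustration of what a chrt-style tool does through sched_setscheduler(), not code from the patch under discussion. The helper names are hypothetical, and the grace-period kthread names in the prefix table (rcu_preempt, rcu_sched, rcu_bh) are an assumption that may not match a given kernel version or configuration.

```c
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/*
 * Return 1 if a thread name (the "comm" field) looks like one of the
 * RCU kthreads discussed in this thread, 0 otherwise.  The prefix
 * table is an assumption made for illustration.
 */
int is_rcu_kthread(const char *comm)
{
	static const char *const prefixes[] = {
		"rcub/",	/* priority-boost kthreads */
		"rcuc/",	/* per-CPU callback kthreads */
		"rcu_preempt",	/* grace-period kthreads (assumed names) */
		"rcu_sched",
		"rcu_bh",
	};
	size_t i;

	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++)
		if (strncmp(comm, prefixes[i], strlen(prefixes[i])) == 0)
			return 1;
	return 0;
}

/*
 * Switch one thread to SCHED_FIFO at the given priority, which is
 * roughly what "chrt -f -p <prio> <pid>" does.  Returns 0 on success,
 * -1 on error with errno set (EPERM without sufficient privilege,
 * EINVAL for an out-of-range priority).
 */
int set_fifo_prio(pid_t pid, int prio)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;
	return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

A tuning script would walk the thread list, apply is_rcu_kthread() to each name, and call set_fifo_prio() on every match, so that no one class of RCU kthread is left behind at the old priority.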

2015-04-21 03:37:29

by Mike Galbraith

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, 2015-04-20 at 14:21 -0400, Steven Rostedt wrote:
>
> I would argue that every case is different, and only the sysadmin
> would know the right value. Thus, just set it to one, and if that's
> not good enough, then the sysadmins can change it to their needs.

Agreed. I don't have it turned on in my -rt kernels, because I don't
want to force a knight in shining (priority x) armor on users.

-Mike

2015-04-21 06:42:31

by Ingo Molnar

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions


* Steven Rostedt <[email protected]> wrote:

> On Mon, 20 Apr 2015 20:28:32 +0200
> Ingo Molnar <[email protected]> wrote:
>
> > Instrumentation - especially instrumentation that should have been
> > implemented mostly in user-space, like ftrace ;-) - is another special
> > case that should stay as flexible as possible via sysctls, obviously.
>
> I know I used ftrace as an example, but a more appropriate example
> would be the sched knobs, as this is more about rcu scheduling than
> anything else.
>
> See:
>
> sched_autogroup_enabled sched_rr_timeslice_ms
> sched_child_runs_first sched_rt_period_us
> sched_domain/ sched_rt_runtime_us
> sched_latency_ns sched_shares_window_ns
> sched_migration_cost_ns sched_time_avg_ms
> sched_min_granularity_ns sched_tunable_scaling
> sched_nr_migrate sched_wakeup_granularity_ns

You are comparing apples to oranges.

1)

Many of these are only sysctls if CONFIG_SCHED_DEBUG is enabled, see:

triton:~/tip> git grep const_debug kernel/sched/*.c
kernel/sched/core.c:const_debug unsigned int sysctl_sched_features =
kernel/sched/core.c:const_debug unsigned int sysctl_sched_nr_migrate = 32;
kernel/sched/core.c:const_debug unsigned int sysctl_sched_time_avg = MSEC_PER_SEC;
kernel/sched/core.c:const_debug unsigned int sysctl_timer_migration = 1;
kernel/sched/fair.c:const_debug unsigned int sysctl_sched_migration_cost = 500000UL;

and they turn into 'const' otherwise:

/*
* Tunables that become constants when CONFIG_SCHED_DEBUG is off:
*/
#ifdef CONFIG_SCHED_DEBUG
# include <linux/static_key.h>
# define const_debug __read_mostly
#else
# define const_debug const
#endif

2)

A handful of them are simple on/off knobs, such as the
sched_child_runs_first quirk, or the sched_autogroup_enabled.

3)

There's basically just the three sched_rt_* ones that are 'true'
tunables (not on/off knobs), mostly because of ABI weakness:
setscheduler() has no interface for them.

Note that modern scheduler policies, like SCHED_DEADLINE, get all
their policy parameters from the sched_setparam() user-space ABI, they
are not driven by sysctls.

So my point stands.

Thanks,

Ingo
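
To make point 1) in the message above concrete, here is a small userspace C sketch of the const_debug pattern. It is not kernel code: __read_mostly is stubbed out as an empty macro because the real definition is a kernel section attribute, and set_nr_migrate() is a hypothetical helper added only to show the effect. Building with -DCONFIG_SCHED_DEBUG leaves the variable writable; building without it turns the variable into a constant that cannot be retuned.

```c
/*
 * Userspace stand-in: in the kernel, __read_mostly is a section
 * attribute.  Here it expands to nothing.
 */
#define __read_mostly

#ifdef CONFIG_SCHED_DEBUG
# define const_debug __read_mostly
#else
# define const_debug const
#endif

/* Same shape as the kernel/sched/core.c declaration quoted above. */
const_debug unsigned int sysctl_sched_nr_migrate = 32;

/*
 * Try to retune the knob.  Returns 0 on success, or -1 when the build
 * turned the knob into a constant.  (Hypothetical helper, for
 * illustration only.)
 */
int set_nr_migrate(unsigned int val)
{
#ifdef CONFIG_SCHED_DEBUG
	sysctl_sched_nr_migrate = val;
	return 0;
#else
	(void)val;	/* The variable is const in this build. */
	return -1;
#endif
}
```

In a build without CONFIG_SCHED_DEBUG, any direct assignment to sysctl_sched_nr_migrate is rejected by the compiler, which is why such knobs cannot be exposed as writable sysctls in that configuration.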

2015-04-21 13:12:37

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Mon, 20 Apr 2015 18:22:58 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Mon, Apr 20, 2015 at 04:50:07PM -0500, Clark Williams wrote:
> > On Mon, 20 Apr 2015 14:15:04 -0700
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > On Mon, Apr 20, 2015 at 04:40:49PM -0400, Steven Rostedt wrote:
> > > > On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> > > > >
> > > > > The sysfs knob might be nice, but as far as I know nobody has been
> > > > > complaining about it.
> > > > >
> > > > > Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> > > > > So how about if the Kconfig parameter selects either SCHED_OTHER
> > > > > (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> > > > > to select other values.
> > > >
> > > > Hmm, what priority is this for anyway. To change the priority of the boost
> > > > value at run time, do we only need to change the priority of the rcub threads?
> > > >
> > > > And the priority of the other rcu threads can change as well with a simple
> > > > chrt?
> > > >
> > > > If that's the case, then we don't need a sysctl knob at all.
> > >
> > > For the grace-period kthreads and the boost kthread, that is the case.
> > > It is also the case for the per-CPU kthreads that invoke RCU callbacks
> > > for the non-offloaded RCU_BOOST configuration (and that replace all
> > > softirq RCU work in -rt).
> > >
> > > So, should I just ditch all of the priority-setting within RCU and tell
> > > users to just use chrt?
> >
> > Looks to me like all we need to do is tell people if they need a boost
> > higher than the compiled in default (RCU_KTHREAD_PRIO), then chrt the
> > priority of the rcub thread to the desired priority.
>
> There's the rub. They also need to chrt the RCU grace-period kthreads
> as well as the per-CPU kthreads (rcuc). Which is a pain and easy to
> get wrong.
>
> So at this point, I am leaning towards keeping RCU_KTHREAD_PRIO, but
> hiding it behind RCU_EXPERT. Someone in an emergency situation can use
> chrt to get RCU going, at least assuming that they had the foresight to
> leave a prio-99 shell running somewhere and assuming that they do the
> chrt before the system hits OOM. But they have to do all that anyway
> if they were to use a sysfs or similar interface. And it is easy to
> tell when you have boosted all the necessary kthreads because RCU
> grace periods start advancing once again. You don't get that feedback
> when you set things up at boot time. ;-)
>
> So again, at least for the moment, I believe that RCU need not provide
> a run-time interface for changing RCU kthread priorities, that the
> RCU_KTHREAD_PRIO Kconfig parameter should remain, except that it needs
> to be hidden behind RCU_EXPERT, and that the rcutree.kthread_prio=
> kernel-boot parameter should also remain.
>
> Seem reasonable?
>

Does chrt override the kthread_prio at run time? If so, then great.
Otherwise, the sysadmin should still have a way to control their
priorities of kernel threads (with few exceptions like the migration
thread).

-- Steve

2015-04-21 13:19:04

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Tue, 21 Apr 2015 08:42:23 +0200
Ingo Molnar <[email protected]> wrote:


> Note that modern scheduler policies, like SCHED_DEADLINE, get all
> their policy parameters from the sched_setparam() user-space ABI, they
> are not driven by sysctls.

Right, and when I realized that we can do the same with chrt, I
suggested using that. Clark thought that the prio of the rcu threads
always went back to the hard coded value set at compile time. I looked
at the code and didn't see that and then asked Paul about it. Seems
chrt should work. Before my argument was based on not having any way at
run time to change the rcu priorities, which I feel is bad. But if
there is a way, then that should be what we tell users to use.

-- Steve

2015-04-21 15:17:19

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Tue, Apr 21, 2015 at 09:12:32AM -0400, Steven Rostedt wrote:
> On Mon, 20 Apr 2015 18:22:58 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Mon, Apr 20, 2015 at 04:50:07PM -0500, Clark Williams wrote:
> > > On Mon, 20 Apr 2015 14:15:04 -0700
> > > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > > On Mon, Apr 20, 2015 at 04:40:49PM -0400, Steven Rostedt wrote:
> > > > > On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> > > > > >
> > > > > > The sysfs knob might be nice, but as far as I know nobody has been
> > > > > > complaining about it.
> > > > > >
> > > > > > Besides, we already have the rcutree.kthread_prio= kernel-boot parameter.
> > > > > > So how about if the Kconfig parameter selects either SCHED_OTHER
> > > > > > (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> > > > > > to select other values.
> > > > >
> > > > > Hmm, what priority is this for anyway. To change the priority of the boost
> > > > > value at run time, do we only need to change the priority of the rcub threads?
> > > > >
> > > > > And the priority of the other rcu threads can change as well with a simple
> > > > > chrt?
> > > > >
> > > > > If that's the case, then we don't need a sysctl knob at all.
> > > >
> > > > For the grace-period kthreads and the boost kthread, that is the case.
> > > > It is also the case for the per-CPU kthreads that invoke RCU callbacks
> > > > for the non-offloaded RCU_BOOST configuration (and that replace all
> > > > softirq RCU work in -rt).
> > > >
> > > > So, should I just ditch all of the priority-setting within RCU and tell
> > > > users to just use chrt?
> > >
> > > Looks to me like all we need to do is tell people if they need a boost
> > > higher than the compiled in default (RCU_KTHREAD_PRIO), then chrt the
> > > priority of the rcub thread to the desired priority.
> >
> > There's the rub. They also need to chrt the RCU grace-period kthreads
> > as well as the per-CPU kthreads (rcuc). Which is a pain and easy to
> > get wrong.
> >
> > So at this point, I am leaning towards keeping RCU_KTHREAD_PRIO, but
> > hiding it behind RCU_EXPERT. Someone in an emergency situation can use
> > chrt to get RCU going, at least assuming that they had the foresight to
> > leave a prio-99 shell running somewhere and assuming that they do the
> > chrt before the system hits OOM. But they have to do all that anyway
> > if they were to use a sysfs or similar interface. And it is easy to
> > tell when you have boosted all the necessary kthreads because RCU
> > grace periods start advancing once again. You don't get that feedback
> > when you set things up at boot time. ;-)
> >
> > So again, at least for the moment, I believe that RCU need not provide
> > a run-time interface for changing RCU kthread priorities, that the
> > RCU_KTHREAD_PRIO Kconfig parameter should remain, except that it needs
> > to be hidden behind RCU_EXPERT, and that the rcutree.kthread_prio=
> > kernel-boot parameter should also remain.
> >
> > Seem reasonable?
>
> Does chrt override the kthread_prio at run time? If so, then great.
> Otherwise, the sysadmin should still have a way to control their
> priorities of kernel threads (with few exceptions like the migration
> thread).

Yep, RCU sets the prios only at boot time, so if they are set differently
at runtime, they should stay set differently. Unless chrt refuses to
work on kthreads or something. ;-)

Thanx, Paul
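
Since RCU sets these priorities only at boot time, one way to confirm that a run-time change has stuck is to read the policy and priority back. The following is a minimal userspace C sketch of such a check using sched_getscheduler() and sched_getparam(); the helper name get_sched() is hypothetical. It assumes the caller already knows the PID of the kthread of interest.

```c
#include <sched.h>
#include <sys/types.h>

/*
 * Fetch the scheduling policy and real-time priority of one thread.
 * A pid of 0 means the calling thread.  Returns 0 on success, -1 on
 * error with errno set.  (Hypothetical helper, for illustration only.)
 */
int get_sched(pid_t pid, int *policy, int *prio)
{
	struct sched_param sp;
	int pol = sched_getscheduler(pid);

	if (pol < 0)
		return -1;
	if (sched_getparam(pid, &sp) != 0)
		return -1;
	*policy = pol;
	*prio = sp.sched_priority;
	return 0;
}
```

Checking an RCU kthread some time after a chrt and finding the policy still SCHED_FIFO at the chosen priority would be consistent with the behavior described above.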

2015-04-21 15:50:37

by Steven Rostedt

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Tue, 21 Apr 2015 08:01:22 -0700
"Paul E. McKenney" <[email protected]> wrote:

> > > Seem reasonable?
> >
> > Does chrt override the kthread_prio at run time? If so, then great.
> > Otherwise, the sysadmin should still have a way to control their
> > priorities of kernel threads (with few exceptions like the migration
> > thread).
>
> Yep, RCU sets the prios only at boot time, so if they are set differently
> at runtime, they should stay set differently. Unless chrt refuses to
> work on kthreads or something. ;-)

Great! This all sounds reasonable to me :-)

-- Steve

2015-04-21 16:32:20

by Paul E. McKenney

Subject: Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

On Tue, Apr 21, 2015 at 11:50:27AM -0400, Steven Rostedt wrote:
> On Tue, 21 Apr 2015 08:01:22 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > > > Seem reasonable?
> > >
> > > Does chrt override the kthread_prio at run time? If so, then great.
> > > Otherwise, the sysadmin should still have a way to control their
> > > priorities of kernel threads (with few exceptions like the migration
> > > thread).
> >
> > Yep, RCU sets the prios only at boot time, so if they are set differently
> > at runtime, they should stay set differently. Unless chrt refuses to
> > work on kthreads or something. ;-)
>
> Great! This all sounds reasonable to me :-)

OK, will proceed as planned, then!

Thanx, Paul