2018-06-10 23:46:55

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 1/2] rcutorture: Disable RT throttling for boost tests

From: "Joel Fernandes (Google)" <[email protected]>

Currently rcutorture is not able to torture RCU boosting properly. This
is because the rcutorture's boost threads which are doing the torturing
may be throttled due to RT throttling.

This patch makes rcutorture use the right torture technique (unthrottled
rcutorture boost tasks) for torturing RCU so that the test fails
correctly when no boost is available.

Currently this requires accessing sysctl_sched_rt_runtime directly, but
that should be Ok since rcutorture is test code. Such direct access is
also only possible if rcutorture is used as a built-in so make it
conditional on that.

Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/rcutorture.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 5219e7ccd8a4..00e06349d7bb 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -55,6 +55,7 @@
#include <linux/torture.h>
#include <linux/vmalloc.h>
#include <linux/sched/debug.h>
+#include <linux/sched/sysctl.h>

#include "rcu.h"

@@ -772,6 +773,32 @@ static void rcu_torture_boost_cb(struct rcu_head *head)
smp_store_release(&rbip->inflight, 0);
}

+static int old_rt_runtime = -1;
+
+static void rcu_torture_disable_rt_throttle(void)
+{
+ /*
+ * Disable RT throttling so that rcutorture's boost threads don't get
+ * throttled. Only possible if rcutorture is built-in otherwise the
+ * user should manually do this by setting the sched_rt_period_us and
+ * sched_rt_runtime sysctls.
+ */
+ if (!IS_BUILTIN(CONFIG_RCU_TORTURE_TEST) || old_rt_runtime != -1)
+ return;
+
+ old_rt_runtime = sysctl_sched_rt_runtime;
+ sysctl_sched_rt_runtime = -1;
+}
+
+static void rcu_torture_enable_rt_throttle(void)
+{
+ if (!IS_BUILTIN(CONFIG_RCU_TORTURE_TEST) || old_rt_runtime == -1)
+ return;
+
+ sysctl_sched_rt_runtime = old_rt_runtime;
+ old_rt_runtime = -1;
+}
+
static int rcu_torture_boost(void *arg)
{
unsigned long call_rcu_time;
@@ -1510,6 +1537,7 @@ static int rcutorture_booster_cleanup(unsigned int cpu)
mutex_lock(&boost_mutex);
t = boost_tasks[cpu];
boost_tasks[cpu] = NULL;
+ rcu_torture_enable_rt_throttle();
mutex_unlock(&boost_mutex);

/* This must be outside of the mutex, otherwise deadlock! */
@@ -1526,6 +1554,7 @@ static int rcutorture_booster_init(unsigned int cpu)

/* Don't allow time recalculation while creating a new task. */
mutex_lock(&boost_mutex);
+ rcu_torture_disable_rt_throttle();
VERBOSE_TOROUT_STRING("Creating rcu_torture_boost task");
boost_tasks[cpu] = kthread_create_on_node(rcu_torture_boost, NULL,
cpu_to_node(cpu),
--
2.18.0.rc1.242.g61856ae69a-goog



2018-06-10 23:48:17

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 2/2] rcutorture: Make boost test more robust

From: "Joel Fernandes (Google)" <[email protected]>

Currently, with RCU_BOOST disabled, I get no failures when forcing
rcutorture to test RCU boost priority inversion. The reason seems to be
that we don't check for failures if the callback never ran at all for
the duration of the boost-test loop.

Further, the 'rtb' and 'rtbf' counters seem to be used inconsistently.
'rtb' is incremented at the start of each test and 'rtbf' is incremented
per-cpu on each failure of call_rcu. So its possible 'rtbf' > 'rtb'.

To test the boost with rcutorture, I did following on a 4-CPU x86 machine:

modprobe rcutorture test_boost=2
sleep 20
rmmod rcutorture

With patch:
rtbf: 8 rtb: 12

Without patch:
rtbf: 0 rtb: 2

In summary this patch:
- Increments failed and total test counters once per boost-test.
- Checks for failure cases correctly.

Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/rcutorture.c | 45 +++++++++++++++++++++++++++++++++++------
1 file changed, 39 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 00e06349d7bb..0040cc2b836b 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -799,6 +799,18 @@ static void rcu_torture_enable_rt_throttle(void)
old_rt_runtime = -1;
}

+static bool rcu_torture_boost_failed(unsigned long start, unsigned long end)
+{
+ if (end - start > test_boost_duration * HZ - HZ / 2) {
+ VERBOSE_TOROUT_STRING("rcu_torture_boost boosting failed");
+ n_rcu_torture_boost_failure++;
+
+ return true; /* failed */
+ }
+
+ return false; /* passed */
+}
+
static int rcu_torture_boost(void *arg)
{
unsigned long call_rcu_time;
@@ -819,6 +831,21 @@ static int rcu_torture_boost(void *arg)
init_rcu_head_on_stack(&rbi.rcu);
/* Each pass through the following loop does one boost-test cycle. */
do {
+ /* Track if the test failed already in this test interval? */
+ bool failed = false;
+
+ /* Increment n_rcu_torture_boosts once per boost-test */
+ while (!kthread_should_stop()) {
+ if (mutex_trylock(&boost_mutex)) {
+ n_rcu_torture_boosts++;
+ mutex_unlock(&boost_mutex);
+ break;
+ }
+ schedule_timeout_uninterruptible(1);
+ }
+ if (kthread_should_stop())
+ goto checkwait;
+
/* Wait for the next test interval. */
oldstarttime = boost_starttime;
while (ULONG_CMP_LT(jiffies, oldstarttime)) {
@@ -837,11 +864,10 @@ static int rcu_torture_boost(void *arg)
/* RCU core before ->inflight = 1. */
smp_store_release(&rbi.inflight, 1);
call_rcu(&rbi.rcu, rcu_torture_boost_cb);
- if (jiffies - call_rcu_time >
- test_boost_duration * HZ - HZ / 2) {
- VERBOSE_TOROUT_STRING("rcu_torture_boost boosting failed");
- n_rcu_torture_boost_failure++;
- }
+ /* Check if the boost test failed */
+ failed = failed ||
+ rcu_torture_boost_failed(call_rcu_time,
+ jiffies);
call_rcu_time = jiffies;
}
stutter_wait("rcu_torture_boost");
@@ -849,6 +875,14 @@ static int rcu_torture_boost(void *arg)
goto checkwait;
}

+ /*
+ * If boost never happened, then inflight will always be 1, in
+ * this case the boost check would never happen in the above
+ * loop so do another one here.
+ */
+ if (!failed && smp_load_acquire(&rbi.inflight))
+ rcu_torture_boost_failed(call_rcu_time, jiffies);
+
/*
* Set the start time of the next test interval.
* Yes, this is vulnerable to long delays, but such
@@ -861,7 +895,6 @@ static int rcu_torture_boost(void *arg)
if (mutex_trylock(&boost_mutex)) {
boost_starttime = jiffies +
test_boost_interval * HZ;
- n_rcu_torture_boosts++;
mutex_unlock(&boost_mutex);
break;
}
--
2.18.0.rc1.242.g61856ae69a-goog


2018-06-11 15:19:04

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 1/2] rcutorture: Disable RT throttling for boost tests

On Sun, Jun 10, 2018 at 04:45:43PM -0700, Joel Fernandes wrote:
> From: "Joel Fernandes (Google)" <[email protected]>
>
> Currently rcutorture is not able to torture RCU boosting properly. This
> is because the rcutorture's boost threads which are doing the torturing
> may be throttled due to RT throttling.
>
> This patch makes rcutorture use the right torture technique (unthrottled
> rcutorture boost tasks) for torturing RCU so that the test fails
> correctly when no boost is available.
>
> Currently this requires accessing sysctl_sched_rt_runtime directly, but
> that should be Ok since rcutorture is test code. Such direct access is
> also only possible if rcutorture is used as a built-in so make it
> conditional on that.
>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>

Thank you, Joel! I have queued these for testing and review.
Peter Zijlstra might have some feedback on variable access, but will
test first.

Thanx, Paul

> ---
> kernel/rcu/rcutorture.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 5219e7ccd8a4..00e06349d7bb 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -55,6 +55,7 @@
> #include <linux/torture.h>
> #include <linux/vmalloc.h>
> #include <linux/sched/debug.h>
> +#include <linux/sched/sysctl.h>
>
> #include "rcu.h"
>
> @@ -772,6 +773,32 @@ static void rcu_torture_boost_cb(struct rcu_head *head)
> smp_store_release(&rbip->inflight, 0);
> }
>
> +static int old_rt_runtime = -1;
> +
> +static void rcu_torture_disable_rt_throttle(void)
> +{
> + /*
> + * Disable RT throttling so that rcutorture's boost threads don't get
> + * throttled. Only possible if rcutorture is built-in otherwise the
> + * user should manually do this by setting the sched_rt_period_us and
> + * sched_rt_runtime sysctls.
> + */
> + if (!IS_BUILTIN(CONFIG_RCU_TORTURE_TEST) || old_rt_runtime != -1)
> + return;
> +
> + old_rt_runtime = sysctl_sched_rt_runtime;
> + sysctl_sched_rt_runtime = -1;
> +}
> +
> +static void rcu_torture_enable_rt_throttle(void)
> +{
> + if (!IS_BUILTIN(CONFIG_RCU_TORTURE_TEST) || old_rt_runtime == -1)
> + return;
> +
> + sysctl_sched_rt_runtime = old_rt_runtime;
> + old_rt_runtime = -1;
> +}
> +
> static int rcu_torture_boost(void *arg)
> {
> unsigned long call_rcu_time;
> @@ -1510,6 +1537,7 @@ static int rcutorture_booster_cleanup(unsigned int cpu)
> mutex_lock(&boost_mutex);
> t = boost_tasks[cpu];
> boost_tasks[cpu] = NULL;
> + rcu_torture_enable_rt_throttle();
> mutex_unlock(&boost_mutex);
>
> /* This must be outside of the mutex, otherwise deadlock! */
> @@ -1526,6 +1554,7 @@ static int rcutorture_booster_init(unsigned int cpu)
>
> /* Don't allow time recalculation while creating a new task. */
> mutex_lock(&boost_mutex);
> + rcu_torture_disable_rt_throttle();
> VERBOSE_TOROUT_STRING("Creating rcu_torture_boost task");
> boost_tasks[cpu] = kthread_create_on_node(rcu_torture_boost, NULL,
> cpu_to_node(cpu),
> --
> 2.18.0.rc1.242.g61856ae69a-goog
>


2018-06-13 07:03:25

by kernel test robot

[permalink] [raw]
Subject: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print


FYI, we noticed the following commit (built with gcc-4.9):

commit: 46e26223e39c64763e321f229e324be15179c505 ("rcutorture: Make boost test more robust")
url: https://github.com/0day-ci/linux/commits/Joel-Fernandes/rcutorture-Disable-RT-throttling-for-boost-tests/20180611-074731
base: https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git rcu/next

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+--------------------------------------------------------------------+------------+------------+
| | c71f2f97e9 | 46e26223e3 |
+--------------------------------------------------------------------+------------+------------+
| boot_successes | 0 | 0 |
| boot_failures | 16 | 12 |
| WARNING:suspicious_RCU_usage | 16 | 12 |
| lib/test_rhashtable.c:#suspicious_rcu_dereference_protected()usage | 16 | 12 |
| BUG:workqueue_lockup-pool | 5 | |
| INFO:task_blocked_for_more_than#seconds | 5 | 3 |
| WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print | 0 | 12 |
| EIP:rcu_torture_stats_print | 0 | 12 |
+--------------------------------------------------------------------+------------+------------+



[ 74.158185] WARNING: CPU: 0 PID: 41 at kernel/rcu/rcutorture.c:1324 rcu_torture_stats_print+0x443/0x520
[ 74.160583] CPU: 0 PID: 41 Comm: rcu_torture_sta Not tainted 4.17.0-rc1-00151-g46e2622 #1
[ 74.161981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 74.163331] EIP: rcu_torture_stats_print+0x443/0x520
[ 74.164187] EFLAGS: 00010202 CPU: 0
[ 74.164775] EAX: 7a8dcaa0 EBX: 00000000 ECX: 00000001 EDX: 00000000
[ 74.165755] ESI: 8979df34 EDI: 7a607f32 EBP: 8979df6c ESP: 8979dee0
[ 74.166780] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 74.167668] CR0: 80050033 CR2: 0806e270 CR3: 0ecc6000 CR4: 00000690
[ 74.168721] Call Trace:
[ 74.169145] ? schedule_timeout+0x286/0x6d0
[ 74.169913] rcu_torture_stats+0x34/0x80
[ 74.170576] kthread+0xe0/0x110
[ 74.171105] ? rcu_torture_stats_print+0x520/0x520
[ 74.171900] ? __kthread_bind_mask+0x40/0x40
[ 74.172718] ret_from_fork+0x2e/0x38
[ 74.173313] Code: 04 c7 04 24 f9 1a 5f 7a e8 64 01 ff ff ff 05 28 bc 26 7b 31 c9 ba 01 00 00 00 b8 a0 ca 8d 7a c7 04 24 01 00 00 00 e8 fd 89 05 00 <0f> 0b 31 c9 ba 01 00 00 00 b8 88 ca 8d 7a c7 04 24 01 00 00 00
[ 74.176618] irq event stamp: 138
[ 74.177169] hardirqs last enabled at (137): [<790c173d>] console_unlock+0x49d/0x6a0
[ 74.178459] hardirqs last disabled at (138): [<7a1a2a1c>] common_exception+0x46/0x66
[ 74.179719] softirqs last enabled at (120): [<7a1a4b60>] __do_softirq+0x4b0/0x57d
[ 74.181013] softirqs last disabled at (89): [<7901e76c>] call_on_stack+0x4c/0x60
[ 74.182450] ---[ end trace 834273b866f313c1 ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Xiaolong


Attachments:
(No filename) (3.65 kB)
config-4.17.0-rc1-00151-g46e2622 (112.80 kB)
job-script (3.91 kB)
dmesg.xz (21.47 kB)
Download all attachments

2018-06-13 07:30:58

by Joel Fernandes

[permalink] [raw]
Subject: Re: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print

(I'm actually not working this week, but still thought of replying :))

On Wed, Jun 13, 2018 at 02:57:11PM +0800, kernel test robot wrote:
>
> FYI, we noticed the following commit (built with gcc-4.9):
>
> commit: 46e26223e39c64763e321f229e324be15179c505 ("rcutorture: Make boost test more robust")
> url: https://github.com/0day-ci/linux/commits/Joel-Fernandes/rcutorture-Disable-RT-throttling-for-boost-tests/20180611-074731
> base: https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git rcu/next
>
> in testcase: trinity
> with following parameters:
>
> runtime: 300s
>
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/

I'll try to reproduce this, but it could be because after this patch, the
boost test is actually working.

The reason for the rcutorture test failure could be that the default
kthread_prio for the system's RCU threads is set to 1 (unless overridden by
rcutree.kthread_prio) which is also equal to the priority of the rcutorture's
boost threads. Due to this the rcutorture test could starve the RCU threads
as well and defeat the boosting mechanism. I was able to solve a similar
issue by just passing rcutree.kthread_prio of 50 on the kernel command line.

Paul, would it be ok if we changed the default kthread_prio to something > 1
so that rcutorture can test properly without needing to pass any extra
rcutree.* parameters?

so something like this in kernel/rcu/tree.c ?

static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;

thanks!

- Joel


2018-06-18 16:55:59

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print

On Wed, Jun 13, 2018 at 12:29:48AM -0700, Joel Fernandes wrote:
> (I'm actually not working this week, but still thought of replying :))
>
> On Wed, Jun 13, 2018 at 02:57:11PM +0800, kernel test robot wrote:
> >
> > FYI, we noticed the following commit (built with gcc-4.9):
> >
> > commit: 46e26223e39c64763e321f229e324be15179c505 ("rcutorture: Make boost test more robust")
> > url: https://github.com/0day-ci/linux/commits/Joel-Fernandes/rcutorture-Disable-RT-throttling-for-boost-tests/20180611-074731
> > base: https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git rcu/next
> >
> > in testcase: trinity
> > with following parameters:
> >
> > runtime: 300s
> >
> > test-description: Trinity is a linux system call fuzz tester.
> > test-url: http://codemonkey.org.uk/projects/trinity/
>
> I'll try to reproduce this, but it could be because after this patch, the
> boost test is actually working.

Quite possibly. ;-)

> The reason for the rcutorture test failure could be that the default
> kthread_prio for the system's RCU threads is set to 1 (unless overridden by
> rcutree.kthread_prio) which is also equal to the priority of the rcutorture's
> boost threads. Due to this the rcutorture test could starve the RCU threads
> as well and defeat the boosting mechanism. I was able to solve a similar
> issue by just passing rcutree.kthread_prio of 50 on the kernel command line.
>
> Paul, would it be ok if we changed the default kthread_prio to something > 1
> so that rcutorture can test properly without needing to pass any extra
> rcutree.* parameters?
>
> so something like this in kernel/rcu/tree.c ?
>
> static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;

Would it be possible to also condition this on rcutorture being built
in? Or are they doing modprobes for rcutorture?

Thanx, Paul


2018-06-18 22:27:52

by Joel Fernandes

[permalink] [raw]
Subject: Re: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print

On Mon, Jun 18, 2018 at 09:56:46AM -0700, Paul E. McKenney wrote:
> > The reason for the rcutorture test failure could be that the default
> > kthread_prio for the system's RCU threads is set to 1 (unless overridden by
> > rcutree.kthread_prio) which is also equal to the priority of the rcutorture's
> > boost threads. Due to this the rcutorture test could starve the RCU threads
> > as well and defeat the boosting mechanism. I was able to solve a similar
> > issue by just passing rcutree.kthread_prio of 50 on the kernel command line.
> >
> > Paul, would it be ok if we changed the default kthread_prio to something > 1
> > so that rcutorture can test properly without needing to pass any extra
> > rcutree.* parameters?
> >
> > so something like this in kernel/rcu/tree.c ?
> >
> > static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
>
> Would it be possible to also condition this on rcutorture being built
> in? Or are they doing modprobes for rcutorture?

They seem to be doing built-in rcutorture tests. But I believe the same
problem would occur even if you used modules? I believe the fact that
rcutorture is a module or built-in wouldn't matter to the underlying issue
which is the RCU subsystems's threads are at too low of a priority
(rcutree.kthread_prio = 1).

If you agree with changing the default priority, I have included a patch
below for rcu/dev.

thanks,

- Joel

---8<-----------------------

From b0f4111ef1abd1c481c269fadb3535c83ab43c93 Mon Sep 17 00:00:00 2001
From: "Joel Fernandes (Google)" <[email protected]>
Date: Mon, 18 Jun 2018 15:13:10 -0700
Subject: [PATCH] rcu: Change default RCU kthread priority to 2

The current RT priority of 1 for RCU kthreads makes rcutorture's boost test
fail on systems where rcutree.kthread_prio isn't passed.

The rcutorture boost kthreads have the same priority as well (RT priority of
1). Due to this, the rcutorture kthreads starve the RCU subsystem's kthreads
and causes rcutorture failures. This patch changes the priority of the RCU
subsystem's threads to a default RT priority of 2 so that rcutorture's
threads get preempted by them. Verified that the boost tests will pass with
this change.

Reported-by: Xiaolong Ye (via lkp-robot) <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index deb2508be923..920c39e3f871 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -171,7 +171,7 @@ static void rcu_report_exp_rdp(struct rcu_state *rsp,
static void sync_sched_exp_online_cleanup(int cpu);

/* rcuc/rcub kthread realtime priority */
-static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
+static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
module_param(kthread_prio, int, 0644);

/* Delay in jiffies for grace-period initialization delays, debug only. */
--
2.18.0.rc1.244.gcf134e6275-goog


2018-06-19 01:09:16

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print

On Mon, Jun 18, 2018 at 03:26:47PM -0700, Joel Fernandes wrote:
> On Mon, Jun 18, 2018 at 09:56:46AM -0700, Paul E. McKenney wrote:
> > > The reason for the rcutorture test failure could be that the default
> > > kthread_prio for the system's RCU threads is set to 1 (unless overridden by
> > > rcutree.kthread_prio) which is also equal to the priority of the rcutorture's
> > > boost threads. Due to this the rcutorture test could starve the RCU threads
> > > as well and defeat the boosting mechanism. I was able to solve a similar
> > > issue by just passing rcutree.kthread_prio of 50 on the kernel command line.
> > >
> > > Paul, would it be ok if we changed the default kthread_prio to something > 1
> > > so that rcutorture can test properly without needing to pass any extra
> > > rcutree.* parameters?
> > >
> > > so something like this in kernel/rcu/tree.c ?
> > >
> > > static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
> >
> > Would it be possible to also condition this on rcutorture being built
> > in? Or are they doing modprobes for rcutorture?
>
> They seem to be doing built-in rcutorture tests. But I believe the same
> problem would occur even if you used modules? I believe the fact that
> rcutorture is a module or built-in wouldn't matter to the underlying issue
> which is the RCU subsystems's threads are at too low of a priority
> (rcutree.kthread_prio = 1).

Understood...

> If you agree with changing the default priority, I have included a patch
> below for rcu/dev.

The problem is that without rcutorture, rcutree.kthread_prio=1 is a
legitimate choice, and changing the default globally could be breaking
someone. So it would be far better to up the priority only during known
rcutorture testing.

Thanx, Paul

> thanks,
>
> - Joel
>
> ---8<-----------------------
>
> >From b0f4111ef1abd1c481c269fadb3535c83ab43c93 Mon Sep 17 00:00:00 2001
> From: "Joel Fernandes (Google)" <[email protected]>
> Date: Mon, 18 Jun 2018 15:13:10 -0700
> Subject: [PATCH] rcu: Change default RCU kthread priority to 2
>
> The current RT priority of 1 for RCU kthreads makes rcutorture's boost test
> fail on systems where rcutree.kthread_prio isn't passed.
>
> The rcutorture boost kthreads have the same priority as well (RT priority of
> 1). Due to this, the rcutorture kthreads starve the RCU subsystem's kthreads
> and causes rcutorture failures. This patch changes the priority of the RCU
> subsystem's threads to a default RT priority of 2 so that rcutorture's
> threads get preempted by them. Verified that the boost tests will pass with
> this change.
>
> Reported-by: Xiaolong Ye (via lkp-robot) <[email protected]>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> kernel/rcu/tree.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index deb2508be923..920c39e3f871 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -171,7 +171,7 @@ static void rcu_report_exp_rdp(struct rcu_state *rsp,
> static void sync_sched_exp_online_cleanup(int cpu);
>
> /* rcuc/rcub kthread realtime priority */
> -static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
> +static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
> module_param(kthread_prio, int, 0644);
>
> /* Delay in jiffies for grace-period initialization delays, debug only. */
> --
> 2.18.0.rc1.244.gcf134e6275-goog
>


2018-06-19 01:37:07

by Joel Fernandes

[permalink] [raw]
Subject: Re: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print



On June 18, 2018 6:08:03 PM PDT, "Paul E. McKenney" <[email protected]> wrote:
>On Mon, Jun 18, 2018 at 03:26:47PM -0700, Joel Fernandes wrote:
>> On Mon, Jun 18, 2018 at 09:56:46AM -0700, Paul E. McKenney wrote:
>> > > The reason for the rcutorture test failure could be that the
>default
>> > > kthread_prio for the system's RCU threads is set to 1 (unless
>overridden by
>> > > rcutree.kthread_prio) which is also equal to the priority of the
>rcutorture's
>> > > boost threads. Due to this the rcutorture test could starve the
>RCU threads
>> > > as well and defeat the boosting mechanism. I was able to solve a
>similar
>> > > issue by just passing rcutree.kthread_prio of 50 on the kernel
>command line.
>> > >
>> > > Paul, would it be ok if we changed the default kthread_prio to
>something > 1
>> > > so that rcutorture can test properly without needing to pass any
>extra
>> > > rcutree.* parameters?
>> > >
>> > > so something like this in kernel/rcu/tree.c ?
>> > >
>> > > static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
>> >
>> > Would it be possible to also condition this on rcutorture being
>built
>> > in? Or are they doing modprobes for rcutorture?
>>
>> They seem to be doing built-in rcutorture tests. But I believe the
>same
>> problem would occur even if you used modules? I believe the fact that
>> rcutorture is a module or built-in wouldn't matter to the underlying
>issue
>> which is the RCU subsystems's threads are at too low of a priority
>> (rcutree.kthread_prio = 1).
>
>Understood...
>
>> If you agree with changing the default priority, I have included a
>patch
>> below for rcu/dev.
>
>The problem is that without rcutorture, rcutree.kthread_prio=1 is a
>legitimate choice, and changing the default globally could be breaking
>someone. So it would be far better to up the priority only during
>known
>rcutorture testing.

Oh I see what you're saying. I'll work on a patch along these lines then. Thanks!

- Joel




>
> Thanx, Paul
>
>> thanks,
>>
>> - Joel
>>
>> ---8<-----------------------
>>
>> >From b0f4111ef1abd1c481c269fadb3535c83ab43c93 Mon Sep 17 00:00:00
>2001
>> From: "Joel Fernandes (Google)" <[email protected]>
>> Date: Mon, 18 Jun 2018 15:13:10 -0700
>> Subject: [PATCH] rcu: Change default RCU kthread priority to 2
>>
>> The current RT priority of 1 for RCU kthreads makes rcutorture's
>boost test
>> fail on systems where rcutree.kthread_prio isn't passed.
>>
>> The rcutorture boost kthreads have the same priority as well (RT
>priority of
>> 1). Due to this, the rcutorture kthreads starve the RCU subsystem's
>kthreads
>> and causes rcutorture failures. This patch changes the priority of
>the RCU
>> subsystem's threads to a default RT priority of 2 so that
>rcutorture's
>> threads get preempted by them. Verified that the boost tests will
>pass with
>> this change.
>>
>> Reported-by: Xiaolong Ye (via lkp-robot) <[email protected]>
>> Signed-off-by: Joel Fernandes (Google) <[email protected]>
>> ---
>> kernel/rcu/tree.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> index deb2508be923..920c39e3f871 100644
>> --- a/kernel/rcu/tree.c
>> +++ b/kernel/rcu/tree.c
>> @@ -171,7 +171,7 @@ static void rcu_report_exp_rdp(struct rcu_state
>*rsp,
>> static void sync_sched_exp_online_cleanup(int cpu);
>>
>> /* rcuc/rcub kthread realtime priority */
>> -static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
>> +static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
>> module_param(kthread_prio, int, 0644);
>>
>> /* Delay in jiffies for grace-period initialization delays, debug
>only. */
>> --
>> 2.18.0.rc1.244.gcf134e6275-goog
>>

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2018-06-19 02:37:59

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [lkp-robot] [rcutorture] 46e26223e3: WARNING:at_kernel/rcu/rcutorture.c:#rcu_torture_stats_print

On Mon, Jun 18, 2018 at 06:36:06PM -0700, Joel Fernandes wrote:
>
>
> On June 18, 2018 6:08:03 PM PDT, "Paul E. McKenney" <[email protected]> wrote:
> >On Mon, Jun 18, 2018 at 03:26:47PM -0700, Joel Fernandes wrote:
> >> On Mon, Jun 18, 2018 at 09:56:46AM -0700, Paul E. McKenney wrote:
> >> > > The reason for the rcutorture test failure could be that the
> >default
> >> > > kthread_prio for the system's RCU threads is set to 1 (unless
> >overridden by
> >> > > rcutree.kthread_prio) which is also equal to the priority of the
> >rcutorture's
> >> > > boost threads. Due to this the rcutorture test could starve the
> >RCU threads
> >> > > as well and defeat the boosting mechanism. I was able to solve a
> >similar
> >> > > issue by just passing rcutree.kthread_prio of 50 on the kernel
> >command line.
> >> > >
> >> > > Paul, would it be ok if we changed the default kthread_prio to
> >something > 1
> >> > > so that rcutorture can test properly without needing to pass any
> >extra
> >> > > rcutree.* parameters?
> >> > >
> >> > > so something like this in kernel/rcu/tree.c ?
> >> > >
> >> > > static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 2 : 0;
> >> >
> >> > Would it be possible to also condition this on rcutorture being
> >built
> >> > in? Or are they doing modprobes for rcutorture?
> >>
> >> They seem to be doing built-in rcutorture tests. But I believe the
> >same
> >> problem would occur even if you used modules? I believe the fact that
> >> rcutorture is a module or built-in wouldn't matter to the underlying
> >issue
> >> which is the RCU subsystems's threads are at too low of a priority
> >> (rcutree.kthread_prio = 1).
> >
> >Understood...
> >
> >> If you agree with changing the default priority, I have included a
> >patch
> >> below for rcu/dev.
> >
> >The problem is that without rcutorture, rcutree.kthread_prio=1 is a
> >legitimate choice, and changing the default globally could be breaking
> >someone. So it would be far better to up the priority only during
> >known
> >rcutorture testing.
>
> Oh I see what you're saying. I'll work on a patch along these lines
> then. Thanks!

Looking forward to seeing it!

Thanx, Paul