2019-08-22 00:04:40

by Crystal Wood

Subject: [PATCH RT v2 0/3] RCU fixes

This is a respin of the "Address rcutorture issues" patchset,
minus the actual rcutorture changes.

I still plan to implement detection of bad nesting scenarios, but it's
complicated by the need to distinguish (on a non-RT kernel) between
irq/preempt disabling that would and would not happen on an RT kernel.
Such detection would also have the benefit of catching regular spinlocks
nested inside raw spinlocks on a non-RT debug kernel.  In the meantime I
could send the rcutorture changes as a PREEMPT_RT-only patch, though the
extent of the changes depends on whether my migrate disable patchset is
applied, since it removes a restriction.

Scott Wood (3):
rcu: Acquire RCU lock when disabling BHs
sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep
rcu: Disable use_softirq on PREEMPT_RT

include/linux/rcupdate.h | 4 ++++
include/linux/sched.h | 4 ++--
kernel/rcu/tree.c | 9 ++++++++-
kernel/rcu/tree_plugin.h | 2 +-
kernel/rcu/update.c | 4 ++++
kernel/sched/core.c | 8 ++++++++
kernel/softirq.c | 12 +++++++++---
7 files changed, 36 insertions(+), 7 deletions(-)

--
1.8.3.1


2019-08-22 00:04:40

by Crystal Wood

Subject: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

A plain local_bh_disable() is documented as creating an RCU critical
section, and (at least) rcutorture expects this to be the case. However,
in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
preempt_count() directly. Even if RCU were changed to check
in_softirq(), that wouldn't allow blocked BH disablers to be boosted.

Fix this by calling rcu_read_lock() from local_bh_disable(), and update
rcu_read_lock_bh_held() accordingly.

Signed-off-by: Scott Wood <[email protected]>
---
Another question is whether non-raw spinlocks are intended to create an
RCU read-side critical section due to implicit preempt disable. If they
are, then we'd need to add rcu_read_lock() there as well since RT doesn't
disable preemption (and rcutorture should explicitly test with a
spinlock). If not, the documentation should make that clear.
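
For illustration, here is a minimal sketch (hypothetical names, not taken
from any real driver) of the pattern in question, where a reader relies on
the implicit preempt disable of a non-raw spinlock rather than an explicit
rcu_read_lock():

static DEFINE_SPINLOCK(foo_lock);       /* hypothetical */
static struct foo __rcu *foo_ptr;       /* hypothetical RCU-protected pointer */

static void foo_reader(void)
{
        struct foo *p;

        spin_lock(&foo_lock);
        /*
         * Fine on a non-RT kernel: spin_lock() disables preemption, so this
         * is an implicit RCU-sched reader and blocks a (post-consolidation)
         * grace period.  On RT the lock is a sleeping lock that leaves
         * preemption enabled, so an explicit rcu_read_lock() would be needed
         * if this region is meant to block a grace period.
         */
        p = rcu_dereference_sched(foo_ptr);
        foo_consume(p);                 /* hypothetical */
        spin_unlock(&foo_lock);
}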

include/linux/rcupdate.h | 4 ++++
kernel/rcu/update.c | 4 ++++
kernel/softirq.c | 12 +++++++++---
3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 388ace315f32..d6e357378732 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
static inline void rcu_read_lock_bh(void)
{
local_bh_disable();
+#ifndef CONFIG_PREEMPT_RT_FULL
__acquire(RCU_BH);
rcu_lock_acquire(&rcu_bh_lock_map);
RCU_LOCKDEP_WARN(!rcu_is_watching(),
"rcu_read_lock_bh() used illegally while idle");
+#endif
}

/*
@@ -628,10 +630,12 @@ static inline void rcu_read_lock_bh(void)
*/
static inline void rcu_read_unlock_bh(void)
{
+#ifndef CONFIG_PREEMPT_RT_FULL
RCU_LOCKDEP_WARN(!rcu_is_watching(),
"rcu_read_unlock_bh() used illegally while idle");
rcu_lock_release(&rcu_bh_lock_map);
__release(RCU_BH);
+#endif
local_bh_enable();
}

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 016c66a98292..a9cdf3d562bc 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -296,7 +296,11 @@ int rcu_read_lock_bh_held(void)
return 0;
if (!rcu_lockdep_current_cpu_online())
return 0;
+#ifdef CONFIG_PREEMPT_RT_FULL
+ return lock_is_held(&rcu_lock_map) || irqs_disabled();
+#else
return in_softirq() || irqs_disabled();
+#endif
}
EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);

diff --git a/kernel/softirq.c b/kernel/softirq.c
index d16d080a74f7..6080c9328df1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -115,8 +115,10 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
long soft_cnt;

WARN_ON_ONCE(in_irq());
- if (!in_atomic())
+ if (!in_atomic()) {
local_lock(bh_lock);
+ rcu_read_lock();
+ }
soft_cnt = this_cpu_inc_return(softirq_counter);
WARN_ON_ONCE(soft_cnt == 0);
current->softirq_count += SOFTIRQ_DISABLE_OFFSET;
@@ -151,8 +153,10 @@ void _local_bh_enable(void)
#endif

current->softirq_count -= SOFTIRQ_DISABLE_OFFSET;
- if (!in_atomic())
+ if (!in_atomic()) {
+ rcu_read_unlock();
local_unlock(bh_lock);
+ }
}

void _local_bh_enable_rt(void)
@@ -185,8 +189,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
WARN_ON_ONCE(count < 0);
local_irq_enable();

- if (!in_atomic())
+ if (!in_atomic()) {
+ rcu_read_unlock();
local_unlock(bh_lock);
+ }

current->softirq_count -= SOFTIRQ_DISABLE_OFFSET;
preempt_check_resched();
--
1.8.3.1

2019-08-22 00:07:15

by Paul E. McKenney

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> A plain local_bh_disable() is documented as creating an RCU critical
> section, and (at least) rcutorture expects this to be the case. However,
> in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
> preempt_count() directly. Even if RCU were changed to check
> in_softirq(), that wouldn't allow blocked BH disablers to be boosted.
>
> Fix this by calling rcu_read_lock() from local_bh_disable(), and update
> rcu_read_lock_bh_held() accordingly.

Cool! Some questions and comments below.

Thanx, Paul

> Signed-off-by: Scott Wood <[email protected]>
> ---
> Another question is whether non-raw spinlocks are intended to create an
> RCU read-side critical section due to implicit preempt disable.

Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
consolidation? If not, I don't see why they should do so after that
consolidation in -rt.

> If they
> are, then we'd need to add rcu_read_lock() there as well since RT doesn't
> disable preemption (and rcutorture should explicitly test with a
> spinlock). If not, the documentation should make that clear.

True enough!

> include/linux/rcupdate.h | 4 ++++
> kernel/rcu/update.c | 4 ++++
> kernel/softirq.c | 12 +++++++++---
> 3 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 388ace315f32..d6e357378732 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
> static inline void rcu_read_lock_bh(void)
> {
> local_bh_disable();
> +#ifndef CONFIG_PREEMPT_RT_FULL
> __acquire(RCU_BH);
> rcu_lock_acquire(&rcu_bh_lock_map);
> RCU_LOCKDEP_WARN(!rcu_is_watching(),
> "rcu_read_lock_bh() used illegally while idle");
> +#endif

Any chance of this using "if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL))"?
We should be OK providing a do-nothing __maybe_unused rcu_bh_lock_map
for lockdep-enabled -rt kernels, right?
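
For reference, a minimal sketch of the IS_ENABLED() form being suggested
(assuming a do-nothing __maybe_unused rcu_bh_lock_map is available on
lockdep-enabled -rt builds):

static inline void rcu_read_lock_bh(void)
{
        local_bh_disable();
        if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) {
                __acquire(RCU_BH);
                rcu_lock_acquire(&rcu_bh_lock_map);
                RCU_LOCKDEP_WARN(!rcu_is_watching(),
                                 "rcu_read_lock_bh() used illegally while idle");
        }
}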

> }
>
> /*
> @@ -628,10 +630,12 @@ static inline void rcu_read_lock_bh(void)
> */
> static inline void rcu_read_unlock_bh(void)
> {
> +#ifndef CONFIG_PREEMPT_RT_FULL
> RCU_LOCKDEP_WARN(!rcu_is_watching(),
> "rcu_read_unlock_bh() used illegally while idle");
> rcu_lock_release(&rcu_bh_lock_map);
> __release(RCU_BH);
> +#endif

Ditto.

> local_bh_enable();
> }
>
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 016c66a98292..a9cdf3d562bc 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -296,7 +296,11 @@ int rcu_read_lock_bh_held(void)
> return 0;
> if (!rcu_lockdep_current_cpu_online())
> return 0;
> +#ifdef CONFIG_PREEMPT_RT_FULL
> + return lock_is_held(&rcu_lock_map) || irqs_disabled();
> +#else
> return in_softirq() || irqs_disabled();
> +#endif

And globally.

> }
> EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d16d080a74f7..6080c9328df1 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -115,8 +115,10 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
> long soft_cnt;
>
> WARN_ON_ONCE(in_irq());
> - if (!in_atomic())
> + if (!in_atomic()) {
> local_lock(bh_lock);
> + rcu_read_lock();
> + }
> soft_cnt = this_cpu_inc_return(softirq_counter);
> WARN_ON_ONCE(soft_cnt == 0);
> current->softirq_count += SOFTIRQ_DISABLE_OFFSET;
> @@ -151,8 +153,10 @@ void _local_bh_enable(void)
> #endif
>
> current->softirq_count -= SOFTIRQ_DISABLE_OFFSET;
> - if (!in_atomic())
> + if (!in_atomic()) {
> + rcu_read_unlock();
> local_unlock(bh_lock);
> + }
> }
>
> void _local_bh_enable_rt(void)
> @@ -185,8 +189,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
> WARN_ON_ONCE(count < 0);
> local_irq_enable();
>
> - if (!in_atomic())
> + if (!in_atomic()) {
> + rcu_read_unlock();
> local_unlock(bh_lock);
> + }

The return from in_atomic() is guaranteed to be the same at
local_bh_enable() time as was at the call to the corresponding
local_bh_disable()?

I could have sworn that I ran afoul of this last year. Might these
added rcu_read_lock() and rcu_read_unlock() calls need to check for
CONFIG_PREEMPT_RT_FULL?
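
For concreteness, a contrived example (purely hypothetical, not taken from
any real caller) of the imbalance being asked about, given the quoted RT
implementation:

        local_bh_disable();     /* !in_atomic(): takes bh_lock and rcu_read_lock() */
        preempt_disable();
        local_bh_enable();      /* in_atomic() is now nonzero, so the enable path
                                 * skips rcu_read_unlock() and local_unlock(),
                                 * leaking both */
        preempt_enable();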

> current->softirq_count -= SOFTIRQ_DISABLE_OFFSET;
> preempt_check_resched();
> --
> 1.8.3.1
>

2019-08-22 02:19:35

by Crystal Wood

Subject: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

Besides restoring behavior that used to be default on RT, this avoids
a deadlock on scheduler locks:

[ 136.894657] 039: ============================================
[ 136.900401] 039: WARNING: possible recursive locking detected
[ 136.906146] 039: 5.2.9-rt3.dbg+ #174 Tainted: G E
[ 136.912152] 039: --------------------------------------------
[ 136.917896] 039: rcu_torture_rea/13474 is trying to acquire lock:
[ 136.923990] 039: 000000005f25146d
[ 136.927310] 039: (
[ 136.929414] 039: &p->pi_lock
[ 136.932303] 039: ){-...}
[ 136.934840] 039: , at: try_to_wake_up+0x39/0x920
[ 136.939461] 039:
but task is already holding lock:
[ 136.944425] 039: 000000005f25146d
[ 136.947745] 039: (
[ 136.949852] 039: &p->pi_lock
[ 136.952738] 039: ){-...}
[ 136.955274] 039: , at: try_to_wake_up+0x39/0x920
[ 136.959895] 039:
other info that might help us debug this:
[ 136.965555] 039: Possible unsafe locking scenario:

[ 136.970608] 039: CPU0
[ 136.973493] 039: ----
[ 136.976378] 039: lock(
[ 136.978918] 039: &p->pi_lock
[ 136.981806] 039: );
[ 136.983911] 039: lock(
[ 136.986451] 039: &p->pi_lock
[ 136.989336] 039: );
[ 136.991444] 039:
*** DEADLOCK ***

[ 136.995194] 039: May be due to missing lock nesting notation

[ 137.001115] 039: 3 locks held by rcu_torture_rea/13474:
[ 137.006341] 039: #0:
[ 137.008707] 039: 000000005f25146d
[ 137.012024] 039: (
[ 137.014131] 039: &p->pi_lock
[ 137.017015] 039: ){-...}
[ 137.019558] 039: , at: try_to_wake_up+0x39/0x920
[ 137.024175] 039: #1:
[ 137.026540] 039: 0000000011c8e51d
[ 137.029859] 039: (
[ 137.031966] 039: &rq->lock
[ 137.034679] 039: ){-...}
[ 137.037217] 039: , at: try_to_wake_up+0x241/0x920
[ 137.041924] 039: #2:
[ 137.044291] 039: 00000000098649b9
[ 137.047610] 039: (
[ 137.049714] 039: rcu_read_lock
[ 137.052774] 039: ){....}
[ 137.055314] 039: , at: cpuacct_charge+0x33/0x1e0
[ 137.059934] 039:
stack backtrace:
[ 137.063425] 039: CPU: 39 PID: 13474 Comm: rcu_torture_rea Kdump: loaded Tainted: G E 5.2.9-rt3.dbg+ #174
[ 137.074197] 039: Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
[ 137.084886] 039: Call Trace:
[ 137.087773] 039: <IRQ>
[ 137.090226] 039: dump_stack+0x5e/0x8b
[ 137.093997] 039: __lock_acquire+0x725/0x1100
[ 137.098358] 039: lock_acquire+0xc0/0x240
[ 137.102374] 039: ? try_to_wake_up+0x39/0x920
[ 137.106737] 039: _raw_spin_lock_irqsave+0x47/0x90
[ 137.111534] 039: ? try_to_wake_up+0x39/0x920
[ 137.115910] 039: try_to_wake_up+0x39/0x920
[ 137.120098] 039: rcu_read_unlock_special+0x65/0xb0
[ 137.124977] 039: __rcu_read_unlock+0x5d/0x70
[ 137.129337] 039: cpuacct_charge+0xd9/0x1e0
[ 137.133522] 039: ? cpuacct_charge+0x33/0x1e0
[ 137.137880] 039: update_curr+0x14b/0x420
[ 137.141894] 039: enqueue_entity+0x42/0x370
[ 137.146080] 039: enqueue_task_fair+0xa9/0x490
[ 137.150528] 039: activate_task+0x5a/0xf0
[ 137.154539] 039: ttwu_do_activate+0x4e/0x90
[ 137.158813] 039: try_to_wake_up+0x277/0x920
[ 137.163086] 039: irq_exit+0xb6/0xf0
[ 137.166661] 039: smp_apic_timer_interrupt+0xe3/0x3a0
[ 137.171714] 039: apic_timer_interrupt+0xf/0x20
[ 137.176249] 039: </IRQ>
[ 137.178785] 039: RIP: 0010:__schedule+0x0/0x8e0
[ 137.183319] 039: Code: 00 02 48 89 43 20 e8 0f 5a 00 00 48 8d 7b 28 e8 86 f2 fd ff 31 c0 5b 5d 41 5c c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 41 57 41 56 49 c7 c6 c0 ca 1e 00 41 55 41 89 fd 41 54
[ 137.202498] 039: RSP: 0018:ffffc9005835fbc0 EFLAGS: 00000246
[ 137.208158] 039: ORIG_RAX: ffffffffffffff13
[ 137.212428] 039: RAX: 0000000000000000 RBX: ffff8897c3e1bb00 RCX: 0000000000000001
[ 137.219994] 039: RDX: 0000000080004008 RSI: 0000000000000006 RDI: 0000000000000001
[ 137.227560] 039: RBP: ffff8897c3e1bb00 R08: 0000000000000000 R09: 0000000000000000
[ 137.235126] 039: R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81001fd1
[ 137.242694] 039: R13: 0000000000000044 R14: 0000000000000000 R15: ffffc9005835fcac
[ 137.250259] 039: ? ___preempt_schedule+0x16/0x18
[ 137.254969] 039: preempt_schedule_common+0x32/0x80
[ 137.259846] 039: ___preempt_schedule+0x16/0x18
[ 137.264379] 039: rcutorture_one_extend+0x33a/0x510 [rcutorture]
[ 137.270397] 039: rcu_torture_one_read+0x18c/0x450 [rcutorture]
[ 137.276334] 039: rcu_torture_reader+0xac/0x1f0 [rcutorture]
[ 137.281998] 039: ? rcu_torture_reader+0x1f0/0x1f0 [rcutorture]
[ 137.287920] 039: kthread+0x106/0x140
[ 137.291591] 039: ? rcu_torture_one_read+0x450/0x450 [rcutorture]
[ 137.297681] 039: ? kthread_bind+0x10/0x10
[ 137.301783] 039: ret_from_fork+0x3a/0x50

Signed-off-by: Scott Wood <[email protected]>
---
I think the prohibition on use_softirq can be dropped once RT gets the
latest RCU code, but the question of what use_softirq should default
to on PREEMPT_RT remains.
---
kernel/rcu/tree.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index fc8b00c61b32..12036d33e2f9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -98,9 +98,16 @@ struct rcu_state rcu_state = {
/* Dump rcu_node combining tree at boot to verify correct setup. */
static bool dump_tree;
module_param(dump_tree, bool, 0444);
-/* By default, use RCU_SOFTIRQ instead of rcuc kthreads. */
+/*
+ * By default, use RCU_SOFTIRQ instead of rcuc kthreads.
+ * But, avoid RCU_SOFTIRQ on PREEMPT_RT due to pi/rq deadlocks.
+ */
+#ifdef CONFIG_PREEMPT_RT_FULL
+static bool use_softirq;
+#else
static bool use_softirq = 1;
module_param(use_softirq, bool, 0444);
+#endif
/* Control rcu_node-tree auto-balancing at boot time. */
static bool rcu_fanout_exact;
module_param(rcu_fanout_exact, bool, 0444);
--
1.8.3.1

2019-08-22 05:04:01

by Paul E. McKenney

Subject: Re: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

On Wed, Aug 21, 2019 at 06:19:06PM -0500, Scott Wood wrote:
> Besides restoring behavior that used to be default on RT, this avoids
> a deadlock on scheduler locks:
>
> [ 136.894657] 039: ============================================
> [ 136.900401] 039: WARNING: possible recursive locking detected
> [ 136.906146] 039: 5.2.9-rt3.dbg+ #174 Tainted: G E
> [ 136.912152] 039: --------------------------------------------
> [ 136.917896] 039: rcu_torture_rea/13474 is trying to acquire lock:
> [ 136.923990] 039: 000000005f25146d
> [ 136.927310] 039: (
> [ 136.929414] 039: &p->pi_lock
> [ 136.932303] 039: ){-...}
> [ 136.934840] 039: , at: try_to_wake_up+0x39/0x920
> [ 136.939461] 039:
> but task is already holding lock:
> [ 136.944425] 039: 000000005f25146d
> [ 136.947745] 039: (
> [ 136.949852] 039: &p->pi_lock
> [ 136.952738] 039: ){-...}
> [ 136.955274] 039: , at: try_to_wake_up+0x39/0x920
> [ 136.959895] 039:
> other info that might help us debug this:
> [ 136.965555] 039: Possible unsafe locking scenario:
>
> [ 136.970608] 039: CPU0
> [ 136.973493] 039: ----
> [ 136.976378] 039: lock(
> [ 136.978918] 039: &p->pi_lock
> [ 136.981806] 039: );
> [ 136.983911] 039: lock(
> [ 136.986451] 039: &p->pi_lock
> [ 136.989336] 039: );
> [ 136.991444] 039:
> *** DEADLOCK ***
>
> [ 136.995194] 039: May be due to missing lock nesting notation
>
> [ 137.001115] 039: 3 locks held by rcu_torture_rea/13474:
> [ 137.006341] 039: #0:
> [ 137.008707] 039: 000000005f25146d
> [ 137.012024] 039: (
> [ 137.014131] 039: &p->pi_lock
> [ 137.017015] 039: ){-...}
> [ 137.019558] 039: , at: try_to_wake_up+0x39/0x920
> [ 137.024175] 039: #1:
> [ 137.026540] 039: 0000000011c8e51d
> [ 137.029859] 039: (
> [ 137.031966] 039: &rq->lock
> [ 137.034679] 039: ){-...}
> [ 137.037217] 039: , at: try_to_wake_up+0x241/0x920
> [ 137.041924] 039: #2:
> [ 137.044291] 039: 00000000098649b9
> [ 137.047610] 039: (
> [ 137.049714] 039: rcu_read_lock
> [ 137.052774] 039: ){....}
> [ 137.055314] 039: , at: cpuacct_charge+0x33/0x1e0
> [ 137.059934] 039:
> stack backtrace:
> [ 137.063425] 039: CPU: 39 PID: 13474 Comm: rcu_torture_rea Kdump: loaded Tainted: G E 5.2.9-rt3.dbg+ #174
> [ 137.074197] 039: Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
> [ 137.084886] 039: Call Trace:
> [ 137.087773] 039: <IRQ>
> [ 137.090226] 039: dump_stack+0x5e/0x8b
> [ 137.093997] 039: __lock_acquire+0x725/0x1100
> [ 137.098358] 039: lock_acquire+0xc0/0x240
> [ 137.102374] 039: ? try_to_wake_up+0x39/0x920
> [ 137.106737] 039: _raw_spin_lock_irqsave+0x47/0x90
> [ 137.111534] 039: ? try_to_wake_up+0x39/0x920
> [ 137.115910] 039: try_to_wake_up+0x39/0x920
> [ 137.120098] 039: rcu_read_unlock_special+0x65/0xb0
> [ 137.124977] 039: __rcu_read_unlock+0x5d/0x70
> [ 137.129337] 039: cpuacct_charge+0xd9/0x1e0
> [ 137.133522] 039: ? cpuacct_charge+0x33/0x1e0
> [ 137.137880] 039: update_curr+0x14b/0x420
> [ 137.141894] 039: enqueue_entity+0x42/0x370
> [ 137.146080] 039: enqueue_task_fair+0xa9/0x490
> [ 137.150528] 039: activate_task+0x5a/0xf0
> [ 137.154539] 039: ttwu_do_activate+0x4e/0x90
> [ 137.158813] 039: try_to_wake_up+0x277/0x920
> [ 137.163086] 039: irq_exit+0xb6/0xf0
> [ 137.166661] 039: smp_apic_timer_interrupt+0xe3/0x3a0
> [ 137.171714] 039: apic_timer_interrupt+0xf/0x20
> [ 137.176249] 039: </IRQ>
> [ 137.178785] 039: RIP: 0010:__schedule+0x0/0x8e0
> [ 137.183319] 039: Code: 00 02 48 89 43 20 e8 0f 5a 00 00 48 8d 7b 28 e8 86 f2 fd ff 31 c0 5b 5d 41 5c c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 41 57 41 56 49 c7 c6 c0 ca 1e 00 41 55 41 89 fd 41 54
> [ 137.202498] 039: RSP: 0018:ffffc9005835fbc0 EFLAGS: 00000246
> [ 137.208158] 039: ORIG_RAX: ffffffffffffff13
> [ 137.212428] 039: RAX: 0000000000000000 RBX: ffff8897c3e1bb00 RCX: 0000000000000001
> [ 137.219994] 039: RDX: 0000000080004008 RSI: 0000000000000006 RDI: 0000000000000001
> [ 137.227560] 039: RBP: ffff8897c3e1bb00 R08: 0000000000000000 R09: 0000000000000000
> [ 137.235126] 039: R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81001fd1
> [ 137.242694] 039: R13: 0000000000000044 R14: 0000000000000000 R15: ffffc9005835fcac
> [ 137.250259] 039: ? ___preempt_schedule+0x16/0x18
> [ 137.254969] 039: preempt_schedule_common+0x32/0x80
> [ 137.259846] 039: ___preempt_schedule+0x16/0x18
> [ 137.264379] 039: rcutorture_one_extend+0x33a/0x510 [rcutorture]
> [ 137.270397] 039: rcu_torture_one_read+0x18c/0x450 [rcutorture]
> [ 137.276334] 039: rcu_torture_reader+0xac/0x1f0 [rcutorture]
> [ 137.281998] 039: ? rcu_torture_reader+0x1f0/0x1f0 [rcutorture]
> [ 137.287920] 039: kthread+0x106/0x140
> [ 137.291591] 039: ? rcu_torture_one_read+0x450/0x450 [rcutorture]
> [ 137.297681] 039: ? kthread_bind+0x10/0x10
> [ 137.301783] 039: ret_from_fork+0x3a/0x50
>
> Signed-off-by: Scott Wood <[email protected]>
> ---
> I think the prohibition on use_softirq can be dropped once RT gets the
> latest RCU code, but the question of what use_softirq should default
> to on PREEMPT_RT remains.
> ---
> kernel/rcu/tree.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index fc8b00c61b32..12036d33e2f9 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -98,9 +98,16 @@ struct rcu_state rcu_state = {
> /* Dump rcu_node combining tree at boot to verify correct setup. */
> static bool dump_tree;
> module_param(dump_tree, bool, 0444);
> -/* By default, use RCU_SOFTIRQ instead of rcuc kthreads. */
> +/*
> + * By default, use RCU_SOFTIRQ instead of rcuc kthreads.
> + * But, avoid RCU_SOFTIRQ on PREEMPT_RT due to pi/rq deadlocks.
> + */
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +static bool use_softirq;
> +#else
> static bool use_softirq = 1;
> module_param(use_softirq, bool, 0444);
> +#endif

Save a couple of lines?

static bool use_softirq = !IS_ENABLED(CONFIG_PREEMPT_RT_FULL);

And if I understand your point above, the module_param() might be
able to be the same either way given the new RCU. But if not,
at least we get rid of the #else.
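
A sketch of the combined form, assuming the module_param() can indeed stay
unconditional:

/*
 * By default, use RCU_SOFTIRQ instead of rcuc kthreads, except on
 * PREEMPT_RT, where RCU_SOFTIRQ can deadlock on pi/rq locks.
 */
static bool use_softirq = !IS_ENABLED(CONFIG_PREEMPT_RT_FULL);
module_param(use_softirq, bool, 0444);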

Thanx, Paul

> /* Control rcu_node-tree auto-balancing at boot time. */
> static bool rcu_fanout_exact;
> module_param(rcu_fanout_exact, bool, 0444);
> --
> 1.8.3.1
>

2019-08-22 15:22:20

by Joel Fernandes

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > A plain local_bh_disable() is documented as creating an RCU critical
> > section, and (at least) rcutorture expects this to be the case. However,
> > in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
> > preempt_count() directly. Even if RCU were changed to check
> > in_softirq(), that wouldn't allow blocked BH disablers to be boosted.
> >
> > Fix this by calling rcu_read_lock() from local_bh_disable(), and update
> > rcu_read_lock_bh_held() accordingly.
>
> Cool! Some questions and comments below.
>
> Thanx, Paul
>
> > Signed-off-by: Scott Wood <[email protected]>
> > ---
> > Another question is whether non-raw spinlocks are intended to create an
> > RCU read-side critical section due to implicit preempt disable.
>
> Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> consolidation? If not, I don't see why they should do so after that
> consolidation in -rt.

Maybe I am missing something, but I didn't see the connection between the
consolidation and this patch. AFAICS, this patch is so that
rcu_read_lock_bh_held() works at all on -rt. Did I badly miss something?

> > If they
> > are, then we'd need to add rcu_read_lock() there as well since RT doesn't
> > disable preemption (and rcutorture should explicitly test with a
> > spinlock). If not, the documentation should make that clear.
>
> True enough!
>
> > include/linux/rcupdate.h | 4 ++++
> > kernel/rcu/update.c | 4 ++++
> > kernel/softirq.c | 12 +++++++++---
> > 3 files changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 388ace315f32..d6e357378732 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
> > static inline void rcu_read_lock_bh(void)
> > {
> > local_bh_disable();
> > +#ifndef CONFIG_PREEMPT_RT_FULL
> > __acquire(RCU_BH);
> > rcu_lock_acquire(&rcu_bh_lock_map);
> > RCU_LOCKDEP_WARN(!rcu_is_watching(),
> > "rcu_read_lock_bh() used illegally while idle");
> > +#endif
>
> Any chance of this using "if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL))"?
> We should be OK providing a do-nothing __maybe_unused rcu_bh_lock_map
> for lockdep-enabled -rt kernels, right?

Since this function is small, I prefer that -rt define its own
rcu_read_lock_bh() which just does the local_bh_disable(). That would be way
cleaner IMO. IIRC, -rt does similar things for spinlocks, but it has been
some time since I looked at the -rt patchset.

> > }
> >
> > /*
> > @@ -628,10 +630,12 @@ static inline void rcu_read_lock_bh(void)
> > */
> > static inline void rcu_read_unlock_bh(void)
> > {
> > +#ifndef CONFIG_PREEMPT_RT_FULL
> > RCU_LOCKDEP_WARN(!rcu_is_watching(),
> > "rcu_read_unlock_bh() used illegally while idle");
> > rcu_lock_release(&rcu_bh_lock_map);
> > __release(RCU_BH);
> > +#endif
>
> Ditto.
>
> > local_bh_enable();
> > }
> >
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 016c66a98292..a9cdf3d562bc 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -296,7 +296,11 @@ int rcu_read_lock_bh_held(void)
> > return 0;
> > if (!rcu_lockdep_current_cpu_online())
> > return 0;
> > +#ifdef CONFIG_PREEMPT_RT_FULL
> > + return lock_is_held(&rcu_lock_map) || irqs_disabled();
> > +#else
> > return in_softirq() || irqs_disabled();
> > +#endif
>
> And globally.

And could be untangled a bit as well:

        if (irqs_disabled())
                return 1;

        if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
                return lock_is_held(&rcu_lock_map);

        return in_softirq();
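
Putting that together, a sketch of the whole function (assuming the existing
early checks stay as in mainline):

int rcu_read_lock_bh_held(void)
{
        if (!debug_lockdep_rcu_enabled())
                return 1;
        if (!rcu_is_watching())
                return 0;
        if (!rcu_lockdep_current_cpu_online())
                return 0;
        if (irqs_disabled())
                return 1;
        if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
                return lock_is_held(&rcu_lock_map);
        return in_softirq();
}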

> > }
> > EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
> >
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index d16d080a74f7..6080c9328df1 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -115,8 +115,10 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
> > long soft_cnt;
> >
> > WARN_ON_ONCE(in_irq());
> > - if (!in_atomic())
> > + if (!in_atomic()) {
> > local_lock(bh_lock);
> > + rcu_read_lock();
> > + }
> > soft_cnt = this_cpu_inc_return(softirq_counter);
> > WARN_ON_ONCE(soft_cnt == 0);
> > current->softirq_count += SOFTIRQ_DISABLE_OFFSET;
> > @@ -151,8 +153,10 @@ void _local_bh_enable(void)
> > #endif
> >
> > current->softirq_count -= SOFTIRQ_DISABLE_OFFSET;
> > - if (!in_atomic())
> > + if (!in_atomic()) {
> > + rcu_read_unlock();
> > local_unlock(bh_lock);
> > + }
> > }
> >
> > void _local_bh_enable_rt(void)
> > @@ -185,8 +189,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
> > WARN_ON_ONCE(count < 0);
> > local_irq_enable();
> >
> > - if (!in_atomic())
> > + if (!in_atomic()) {
> > + rcu_read_unlock();
> > local_unlock(bh_lock);
> > + }
>
> The return from in_atomic() is guaranteed to be the same at
> local_bh_enable() time as was at the call to the corresponding
> local_bh_disable()?
>
> I could have sworn that I ran afoul of this last year. Might these
> added rcu_read_lock() and rcu_read_unlock() calls need to check for
> CONFIG_PREEMPT_RT_FULL?

Great point! I think they should be guarded but will let Scott answer that
one.

thanks,

- Joel

2019-08-22 15:28:54

by Joel Fernandes

Subject: Re: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

On Wed, Aug 21, 2019 at 06:19:06PM -0500, Scott Wood wrote:
> Besides restoring behavior that used to be default on RT, this avoids
> a deadlock on scheduler locks:
>
> [ 136.894657] 039: ============================================
> [ 136.900401] 039: WARNING: possible recursive locking detected
> [ 136.906146] 039: 5.2.9-rt3.dbg+ #174 Tainted: G E
> [ 136.912152] 039: --------------------------------------------
> [ 136.917896] 039: rcu_torture_rea/13474 is trying to acquire lock:
> [ 136.923990] 039: 000000005f25146d
> [ 136.927310] 039: (
> [ 136.929414] 039: &p->pi_lock
> [ 136.932303] 039: ){-...}
> [ 136.934840] 039: , at: try_to_wake_up+0x39/0x920
> [ 136.939461] 039:
> but task is already holding lock:
> [ 136.944425] 039: 000000005f25146d
> [ 136.947745] 039: (
> [ 136.949852] 039: &p->pi_lock
> [ 136.952738] 039: ){-...}
> [ 136.955274] 039: , at: try_to_wake_up+0x39/0x920
> [ 136.959895] 039:
> other info that might help us debug this:
> [ 136.965555] 039: Possible unsafe locking scenario:
>
> [ 136.970608] 039: CPU0
> [ 136.973493] 039: ----
> [ 136.976378] 039: lock(
> [ 136.978918] 039: &p->pi_lock
> [ 136.981806] 039: );
> [ 136.983911] 039: lock(
> [ 136.986451] 039: &p->pi_lock
> [ 136.989336] 039: );
> [ 136.991444] 039:
> *** DEADLOCK ***
>
> [ 136.995194] 039: May be due to missing lock nesting notation
>
> [ 137.001115] 039: 3 locks held by rcu_torture_rea/13474:
> [ 137.006341] 039: #0:
> [ 137.008707] 039: 000000005f25146d
> [ 137.012024] 039: (
> [ 137.014131] 039: &p->pi_lock
> [ 137.017015] 039: ){-...}
> [ 137.019558] 039: , at: try_to_wake_up+0x39/0x920
> [ 137.024175] 039: #1:
> [ 137.026540] 039: 0000000011c8e51d
> [ 137.029859] 039: (
> [ 137.031966] 039: &rq->lock
> [ 137.034679] 039: ){-...}
> [ 137.037217] 039: , at: try_to_wake_up+0x241/0x920
> [ 137.041924] 039: #2:
> [ 137.044291] 039: 00000000098649b9
> [ 137.047610] 039: (
> [ 137.049714] 039: rcu_read_lock
> [ 137.052774] 039: ){....}
> [ 137.055314] 039: , at: cpuacct_charge+0x33/0x1e0
> [ 137.059934] 039:
> stack backtrace:
> [ 137.063425] 039: CPU: 39 PID: 13474 Comm: rcu_torture_rea Kdump: loaded Tainted: G E 5.2.9-rt3.dbg+ #174
> [ 137.074197] 039: Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
> [ 137.084886] 039: Call Trace:
> [ 137.087773] 039: <IRQ>
> [ 137.090226] 039: dump_stack+0x5e/0x8b
> [ 137.093997] 039: __lock_acquire+0x725/0x1100
> [ 137.098358] 039: lock_acquire+0xc0/0x240
> [ 137.102374] 039: ? try_to_wake_up+0x39/0x920
> [ 137.106737] 039: _raw_spin_lock_irqsave+0x47/0x90
> [ 137.111534] 039: ? try_to_wake_up+0x39/0x920
> [ 137.115910] 039: try_to_wake_up+0x39/0x920
> [ 137.120098] 039: rcu_read_unlock_special+0x65/0xb0
> [ 137.124977] 039: __rcu_read_unlock+0x5d/0x70
> [ 137.129337] 039: cpuacct_charge+0xd9/0x1e0
> [ 137.133522] 039: ? cpuacct_charge+0x33/0x1e0
> [ 137.137880] 039: update_curr+0x14b/0x420
> [ 137.141894] 039: enqueue_entity+0x42/0x370
> [ 137.146080] 039: enqueue_task_fair+0xa9/0x490
> [ 137.150528] 039: activate_task+0x5a/0xf0
> [ 137.154539] 039: ttwu_do_activate+0x4e/0x90
> [ 137.158813] 039: try_to_wake_up+0x277/0x920
> [ 137.163086] 039: irq_exit+0xb6/0xf0
> [ 137.166661] 039: smp_apic_timer_interrupt+0xe3/0x3a0
> [ 137.171714] 039: apic_timer_interrupt+0xf/0x20
> [ 137.176249] 039: </IRQ>
> [ 137.178785] 039: RIP: 0010:__schedule+0x0/0x8e0
> [ 137.183319] 039: Code: 00 02 48 89 43 20 e8 0f 5a 00 00 48 8d 7b 28 e8 86 f2 fd ff 31 c0 5b 5d 41 5c c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 41 57 41 56 49 c7 c6 c0 ca 1e 00 41 55 41 89 fd 41 54
> [ 137.202498] 039: RSP: 0018:ffffc9005835fbc0 EFLAGS: 00000246
> [ 137.208158] 039: ORIG_RAX: ffffffffffffff13
> [ 137.212428] 039: RAX: 0000000000000000 RBX: ffff8897c3e1bb00 RCX: 0000000000000001
> [ 137.219994] 039: RDX: 0000000080004008 RSI: 0000000000000006 RDI: 0000000000000001
> [ 137.227560] 039: RBP: ffff8897c3e1bb00 R08: 0000000000000000 R09: 0000000000000000
> [ 137.235126] 039: R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81001fd1
> [ 137.242694] 039: R13: 0000000000000044 R14: 0000000000000000 R15: ffffc9005835fcac
> [ 137.250259] 039: ? ___preempt_schedule+0x16/0x18
> [ 137.254969] 039: preempt_schedule_common+0x32/0x80
> [ 137.259846] 039: ___preempt_schedule+0x16/0x18
> [ 137.264379] 039: rcutorture_one_extend+0x33a/0x510 [rcutorture]
> [ 137.270397] 039: rcu_torture_one_read+0x18c/0x450 [rcutorture]
> [ 137.276334] 039: rcu_torture_reader+0xac/0x1f0 [rcutorture]
> [ 137.281998] 039: ? rcu_torture_reader+0x1f0/0x1f0 [rcutorture]
> [ 137.287920] 039: kthread+0x106/0x140
> [ 137.291591] 039: ? rcu_torture_one_read+0x450/0x450 [rcutorture]
> [ 137.297681] 039: ? kthread_bind+0x10/0x10
> [ 137.301783] 039: ret_from_fork+0x3a/0x50
>
> Signed-off-by: Scott Wood <[email protected]>
> ---
> I think the prohibition on use_softirq can be dropped once RT gets the
> latest RCU code, but the question of what use_softirq should default
> to on PREEMPT_RT remains.

Independent of the question of what use_softirq should default to, could we
test RT with the latest RCU code now to check whether the deadlock goes away?
That way, maybe we can find any issues in current RCU that cause scheduler
deadlocks in the situation you pointed out. The reason I am asking is that
recent commits [1] try to prevent this deadlock, and it'd be nice to ensure
that no other conditions are lingering (I don't think they are, but it'd be
nice to be sure).

I am happy to do such testing myself if you want, but what does it take
to apply the RT patchset to the latest mainline? Is it an achievable feat?

thanks,

- Joel

[1]
23634ebc1d94 ("rcu: Check for wakeup-safe conditions")
25102de ("rcu: Only do rcu_read_unlock_special() wakeups if expedited")


> kernel/rcu/tree.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index fc8b00c61b32..12036d33e2f9 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -98,9 +98,16 @@ struct rcu_state rcu_state = {
> /* Dump rcu_node combining tree at boot to verify correct setup. */
> static bool dump_tree;
> module_param(dump_tree, bool, 0444);
> -/* By default, use RCU_SOFTIRQ instead of rcuc kthreads. */
> +/*
> + * By default, use RCU_SOFTIRQ instead of rcuc kthreads.
> + * But, avoid RCU_SOFTIRQ on PREEMPT_RT due to pi/rq deadlocks.
> + */
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +static bool use_softirq;
> +#else
> static bool use_softirq = 1;
> module_param(use_softirq, bool, 0444);
> +#endif
> /* Control rcu_node-tree auto-balancing at boot time. */
> static bool rcu_fanout_exact;
> module_param(rcu_fanout_exact, bool, 0444);
> --
> 1.8.3.1
>

2019-08-22 17:17:07

by Paul E. McKenney

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Thu, Aug 22, 2019 at 09:39:55AM -0400, Joel Fernandes wrote:
> On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > A plain local_bh_disable() is documented as creating an RCU critical
> > > section, and (at least) rcutorture expects this to be the case. However,
> > > in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
> > > preempt_count() directly. Even if RCU were changed to check
> > > in_softirq(), that wouldn't allow blocked BH disablers to be boosted.
> > >
> > > Fix this by calling rcu_read_lock() from local_bh_disable(), and update
> > > rcu_read_lock_bh_held() accordingly.
> >
> > Cool! Some questions and comments below.
> >
> > Thanx, Paul
> >
> > > Signed-off-by: Scott Wood <[email protected]>
> > > ---
> > > Another question is whether non-raw spinlocks are intended to create an
> > > RCU read-side critical section due to implicit preempt disable.
> >
> > Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> > and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> > consolidation? If not, I don't see why they should do so after that
> > consolidation in -rt.
>
> May be I am missing something, but I didn't see the connection between
> consolidation and this patch. AFAICS, this patch is so that
> rcu_read_lock_bh_held() works at all on -rt. Did I badly miss something?

I was interpreting Scott's question (which would be excluded from the
git commit log) as relating to a possible follow-on patch.

The question is "how special can non-raw spinlocks be in -rt?". From what
I can see, they have been treated as sleeplocks from an RCU viewpoint,
so maybe that should continue to be the case. It does deserve some
thought because in mainline a non-raw spinlock really would block a
post-consolidation RCU grace period, even in PREEMPT kernels.

But then again, you cannot preempt a non-raw spinlock in mainline but
you can in -rt, so extending that exception to RCU is not unreasonable.

Either way, we do need to make a definite decision and document it.
If I were forced to make a decision right now, I would follow the old
behavior, so that only raw spinlocks were guaranteed to block RCU grace
periods. But I am not being forced, so let's actually discuss and make
a conscious decision. ;-)

Thanx, Paul

2019-08-22 20:40:58

by Paul E. McKenney

Subject: Re: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

On Thu, Aug 22, 2019 at 09:59:53AM -0400, Joel Fernandes wrote:
> On Wed, Aug 21, 2019 at 06:19:06PM -0500, Scott Wood wrote:
> > Besides restoring behavior that used to be default on RT, this avoids
> > a deadlock on scheduler locks:
> >
> > [ 136.894657] 039: ============================================
> > [ 136.900401] 039: WARNING: possible recursive locking detected
> > [ 136.906146] 039: 5.2.9-rt3.dbg+ #174 Tainted: G E
> > [ 136.912152] 039: --------------------------------------------
> > [ 136.917896] 039: rcu_torture_rea/13474 is trying to acquire lock:
> > [ 136.923990] 039: 000000005f25146d
> > [ 136.927310] 039: (
> > [ 136.929414] 039: &p->pi_lock
> > [ 136.932303] 039: ){-...}
> > [ 136.934840] 039: , at: try_to_wake_up+0x39/0x920
> > [ 136.939461] 039:
> > but task is already holding lock:
> > [ 136.944425] 039: 000000005f25146d
> > [ 136.947745] 039: (
> > [ 136.949852] 039: &p->pi_lock
> > [ 136.952738] 039: ){-...}
> > [ 136.955274] 039: , at: try_to_wake_up+0x39/0x920
> > [ 136.959895] 039:
> > other info that might help us debug this:
> > [ 136.965555] 039: Possible unsafe locking scenario:
> >
> > [ 136.970608] 039: CPU0
> > [ 136.973493] 039: ----
> > [ 136.976378] 039: lock(
> > [ 136.978918] 039: &p->pi_lock
> > [ 136.981806] 039: );
> > [ 136.983911] 039: lock(
> > [ 136.986451] 039: &p->pi_lock
> > [ 136.989336] 039: );
> > [ 136.991444] 039:
> > *** DEADLOCK ***
> >
> > [ 136.995194] 039: May be due to missing lock nesting notation
> >
> > [ 137.001115] 039: 3 locks held by rcu_torture_rea/13474:
> > [ 137.006341] 039: #0:
> > [ 137.008707] 039: 000000005f25146d
> > [ 137.012024] 039: (
> > [ 137.014131] 039: &p->pi_lock
> > [ 137.017015] 039: ){-...}
> > [ 137.019558] 039: , at: try_to_wake_up+0x39/0x920
> > [ 137.024175] 039: #1:
> > [ 137.026540] 039: 0000000011c8e51d
> > [ 137.029859] 039: (
> > [ 137.031966] 039: &rq->lock
> > [ 137.034679] 039: ){-...}
> > [ 137.037217] 039: , at: try_to_wake_up+0x241/0x920
> > [ 137.041924] 039: #2:
> > [ 137.044291] 039: 00000000098649b9
> > [ 137.047610] 039: (
> > [ 137.049714] 039: rcu_read_lock
> > [ 137.052774] 039: ){....}
> > [ 137.055314] 039: , at: cpuacct_charge+0x33/0x1e0
> > [ 137.059934] 039:
> > stack backtrace:
> > [ 137.063425] 039: CPU: 39 PID: 13474 Comm: rcu_torture_rea Kdump: loaded Tainted: G E 5.2.9-rt3.dbg+ #174
> > [ 137.074197] 039: Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
> > [ 137.084886] 039: Call Trace:
> > [ 137.087773] 039: <IRQ>
> > [ 137.090226] 039: dump_stack+0x5e/0x8b
> > [ 137.093997] 039: __lock_acquire+0x725/0x1100
> > [ 137.098358] 039: lock_acquire+0xc0/0x240
> > [ 137.102374] 039: ? try_to_wake_up+0x39/0x920
> > [ 137.106737] 039: _raw_spin_lock_irqsave+0x47/0x90
> > [ 137.111534] 039: ? try_to_wake_up+0x39/0x920
> > [ 137.115910] 039: try_to_wake_up+0x39/0x920
> > [ 137.120098] 039: rcu_read_unlock_special+0x65/0xb0
> > [ 137.124977] 039: __rcu_read_unlock+0x5d/0x70
> > [ 137.129337] 039: cpuacct_charge+0xd9/0x1e0
> > [ 137.133522] 039: ? cpuacct_charge+0x33/0x1e0
> > [ 137.137880] 039: update_curr+0x14b/0x420
> > [ 137.141894] 039: enqueue_entity+0x42/0x370
> > [ 137.146080] 039: enqueue_task_fair+0xa9/0x490
> > [ 137.150528] 039: activate_task+0x5a/0xf0
> > [ 137.154539] 039: ttwu_do_activate+0x4e/0x90
> > [ 137.158813] 039: try_to_wake_up+0x277/0x920
> > [ 137.163086] 039: irq_exit+0xb6/0xf0
> > [ 137.166661] 039: smp_apic_timer_interrupt+0xe3/0x3a0
> > [ 137.171714] 039: apic_timer_interrupt+0xf/0x20
> > [ 137.176249] 039: </IRQ>
> > [ 137.178785] 039: RIP: 0010:__schedule+0x0/0x8e0
> > [ 137.183319] 039: Code: 00 02 48 89 43 20 e8 0f 5a 00 00 48 8d 7b 28 e8 86 f2 fd ff 31 c0 5b 5d 41 5c c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 41 57 41 56 49 c7 c6 c0 ca 1e 00 41 55 41 89 fd 41 54
> > [ 137.202498] 039: RSP: 0018:ffffc9005835fbc0 EFLAGS: 00000246
> > [ 137.208158] 039: ORIG_RAX: ffffffffffffff13
> > [ 137.212428] 039: RAX: 0000000000000000 RBX: ffff8897c3e1bb00 RCX: 0000000000000001
> > [ 137.219994] 039: RDX: 0000000080004008 RSI: 0000000000000006 RDI: 0000000000000001
> > [ 137.227560] 039: RBP: ffff8897c3e1bb00 R08: 0000000000000000 R09: 0000000000000000
> > [ 137.235126] 039: R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81001fd1
> > [ 137.242694] 039: R13: 0000000000000044 R14: 0000000000000000 R15: ffffc9005835fcac
> > [ 137.250259] 039: ? ___preempt_schedule+0x16/0x18
> > [ 137.254969] 039: preempt_schedule_common+0x32/0x80
> > [ 137.259846] 039: ___preempt_schedule+0x16/0x18
> > [ 137.264379] 039: rcutorture_one_extend+0x33a/0x510 [rcutorture]
> > [ 137.270397] 039: rcu_torture_one_read+0x18c/0x450 [rcutorture]
> > [ 137.276334] 039: rcu_torture_reader+0xac/0x1f0 [rcutorture]
> > [ 137.281998] 039: ? rcu_torture_reader+0x1f0/0x1f0 [rcutorture]
> > [ 137.287920] 039: kthread+0x106/0x140
> > [ 137.291591] 039: ? rcu_torture_one_read+0x450/0x450 [rcutorture]
> > [ 137.297681] 039: ? kthread_bind+0x10/0x10
> > [ 137.301783] 039: ret_from_fork+0x3a/0x50
> >
> > Signed-off-by: Scott Wood <[email protected]>
> > ---
> > I think the prohibition on use_softirq can be dropped once RT gets the
> > latest RCU code, but the question of what use_softirq should default
> > to on PREEMPT_RT remains.
>
> Independent of the question of what use_softirq should default to, could we
> test RT with latest RCU code now to check if the deadlock goes away? That
> way, maybe we can find any issues in current RCU that cause scheduler
> deadlocks in the situation you pointed. The reason I am asking is because
> recently additional commits [1] try to prevent deadlock and it'd be nice to
> ensure that other conditions are not lingering (I don't think they are but
> it'd be nice to be sure).
>
> I am happy to do such testing myself if you want, however what does it take
> to apply the RT patchset to the latest mainline? Is it an achievable feat?

I suggest just using the -rt git tree, which I believe lives here:

https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

I would guess that branch linux-5.2.y-rt is the one you want, but I
would ask Scott instead of blindly trusting my guess. ;-)

Thanx, Paul

2019-08-23 07:44:57

by Crystal Wood

Subject: Re: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

On Thu, 2019-08-22 at 09:59 -0400, Joel Fernandes wrote:
> On Wed, Aug 21, 2019 at 06:19:06PM -0500, Scott Wood wrote:
> > I think the prohibition on use_softirq can be dropped once RT gets the
> > latest RCU code, but the question of what use_softirq should default
> > to on PREEMPT_RT remains.
>
> Independent of the question of what use_softirq should default to, could
> we
> test RT with latest RCU code now to check if the deadlock goes away? That
> way, maybe we can find any issues in current RCU that cause scheduler
> deadlocks in the situation you pointed. The reason I am asking is because
> recently additional commits [1] try to prevent deadlock and it'd be nice
> to
> ensure that other conditions are not lingering (I don't think they are but
> it'd be nice to be sure).
>
> I am happy to do such testing myself if you want, however what does it
> take
> to apply the RT patchset to the latest mainline? Is it an achievable feat?

I did run such a test (cherry picking all RCU patches that aren't already in
RT, plus your RFC patch to rcu_read_unlock_special, rather than applying RT
to current mainline) with rcutorture plus a looping kernel build overnight,
and didn't see any splats with or without use_softirq.

-Scott


2019-08-23 09:13:36

by Joel Fernandes

Subject: Re: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

On Thu, Aug 22, 2019 at 02:31:17PM -0500, Scott Wood wrote:
> On Thu, 2019-08-22 at 09:59 -0400, Joel Fernandes wrote:
> > On Wed, Aug 21, 2019 at 06:19:06PM -0500, Scott Wood wrote:
> > > I think the prohibition on use_softirq can be dropped once RT gets the
> > > latest RCU code, but the question of what use_softirq should default
> > > to on PREEMPT_RT remains.
> >
> > Independent of the question of what use_softirq should default to, could
> > we
> > test RT with latest RCU code now to check if the deadlock goes away? That
> > way, maybe we can find any issues in current RCU that cause scheduler
> > deadlocks in the situation you pointed. The reason I am asking is because
> > recently additional commits [1] try to prevent deadlock and it'd be nice
> > to
> > ensure that other conditions are not lingering (I don't think they are but
> > it'd be nice to be sure).
> >
> > I am happy to do such testing myself if you want, however what does it
> > take
> > to apply the RT patchset to the latest mainline? Is it an achievable feat?
>
> I did run such a test (cherry picking all RCU patches that aren't already in
> RT, plus your RFC patch to rcu_read_unlock_special, rather than applying RT
> to current mainline) with rcutorture plus a looping kernel build overnight,
> and didn't see any splats with or without use_softirq.

Cool, that's good to know you didn't see splats!

thanks,

- Joel

2019-08-23 09:38:45

by Joel Fernandes

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Thu, Aug 22, 2019 at 08:27:06AM -0700, Paul E. McKenney wrote:
> On Thu, Aug 22, 2019 at 09:39:55AM -0400, Joel Fernandes wrote:
> > On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > > A plain local_bh_disable() is documented as creating an RCU critical
> > > > section, and (at least) rcutorture expects this to be the case. However,
> > > > in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
> > > > preempt_count() directly. Even if RCU were changed to check
> > > > in_softirq(), that wouldn't allow blocked BH disablers to be boosted.
> > > >
> > > > Fix this by calling rcu_read_lock() from local_bh_disable(), and update
> > > > rcu_read_lock_bh_held() accordingly.
> > >
> > > Cool! Some questions and comments below.
> > >
> > > Thanx, Paul
> > >
> > > > Signed-off-by: Scott Wood <[email protected]>
> > > > ---
> > > > Another question is whether non-raw spinlocks are intended to create an
> > > > RCU read-side critical section due to implicit preempt disable.
> > >
> > > Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> > > and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> > > consolidation? If not, I don't see why they should do so after that
> > > consolidation in -rt.
> >
> > May be I am missing something, but I didn't see the connection between
> > consolidation and this patch. AFAICS, this patch is so that
> > rcu_read_lock_bh_held() works at all on -rt. Did I badly miss something?
>
> I was interpreting Scott's question (which would be excluded from the
> git commit log) as relating to a possible follow-on patch.
>
> The question is "how special can non-raw spinlocks be in -rt?". From what
> I can see, they have been treated as sleeplocks from an RCU viewpoint,
> so maybe that should continue to be the case. It does deserve some
> thought because in mainline a non-raw spinlock really would block a
> post-consolidation RCU grace period, even in PREEMPT kernels.
>
> But then again, you cannot preempt a non-raw spinlock in mainline but
> you can in -rt, so extending that exception to RCU is not unreasonable.
>
> Either way, we do need to make a definite decision and document it.
> If I were forced to make a decision right now, I would follow the old
> behavior, so that only raw spinlocks were guaranteed to block RCU grace
> periods. But I am not being forced, so let's actually discuss and make
> a conscious decision. ;-)

I think non-raw spinlocks on -rt should at least do rcu_read_lock(), so that
any driver or kernel code that depends on this behavior and works on non-RT
also works on -rt. That also removes the chance that a kernel developer misses
the documentation and accidentally forgets that their code may break on -rt.
I am curious how often this design pattern appears in the kernel (a
spin_lock'ed section "intended" as an RCU reader).

Logically speaking, to me anything that disables preemption on non-RT should
do rcu_read_lock() on -rt so that, from RCU's perspective, things keep
working. But I wonder where we would draw the line, and whether the bar
should be actual examples of this usage pattern before making a decision.

Any thoughts?

thanks,

- Joel

2019-08-23 09:51:48

by Crystal Wood

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Wed, 2019-08-21 at 16:33 -0700, Paul E. McKenney wrote:
> On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 388ace315f32..d6e357378732 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
> > static inline void rcu_read_lock_bh(void)
> > {
> > local_bh_disable();
> > +#ifndef CONFIG_PREEMPT_RT_FULL
> > __acquire(RCU_BH);
> > rcu_lock_acquire(&rcu_bh_lock_map);
> > RCU_LOCKDEP_WARN(!rcu_is_watching(),
> > "rcu_read_lock_bh() used illegally while idle");
> > +#endif
>
> Any chance of this using "if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL))"?
> We should be OK providing a do-nothing __maybe_unused rcu_bh_lock_map
> for lockdep-enabled -rt kernels, right?

OK.

> > @@ -185,8 +189,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
> > WARN_ON_ONCE(count < 0);
> > local_irq_enable();
> >
> > - if (!in_atomic())
> > + if (!in_atomic()) {
> > + rcu_read_unlock();
> > local_unlock(bh_lock);
> > + }
>
> The return from in_atomic() is guaranteed to be the same at
> local_bh_enable() time as was at the call to the corresponding
> local_bh_disable()?

That's an existing requirement on RT (which rcutorture currently violates)
due to bh_lock.

> I could have sworn that I ran afoul of this last year. Might these
> added rcu_read_lock() and rcu_read_unlock() calls need to check for
> CONFIG_PREEMPT_RT_FULL?

This code is already under a PREEMPT_RT_FULL ifdef.

-Scott


2019-08-23 09:56:15

by Crystal Wood

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Thu, 2019-08-22 at 09:39 -0400, Joel Fernandes wrote:
> On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > Signed-off-by: Scott Wood <[email protected]>
> > > ---
> > > Another question is whether non-raw spinlocks are intended to create
> > > an
> > > RCU read-side critical section due to implicit preempt disable.
> >
> > Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> > and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> > consolidation? If not, I don't see why they should do so after that
> > consolidation in -rt.
>
> May be I am missing something, but I didn't see the connection between
> consolidation and this patch. AFAICS, this patch is so that
> rcu_read_lock_bh_held() works at all on -rt. Did I badly miss something?

Before consolidation, RT mapped rcu_read_lock_bh_held() to
rcu_read_lock_bh() and called rcu_read_lock() from rcu_read_lock_bh(). This
somehow got lost when rebasing on top of 5.0.

> > > include/linux/rcupdate.h | 4 ++++
> > > kernel/rcu/update.c | 4 ++++
> > > kernel/softirq.c | 12 +++++++++---
> > > 3 files changed, 17 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 388ace315f32..d6e357378732 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
> > > static inline void rcu_read_lock_bh(void)
> > > {
> > > local_bh_disable();
> > > +#ifndef CONFIG_PREEMPT_RT_FULL
> > > __acquire(RCU_BH);
> > > rcu_lock_acquire(&rcu_bh_lock_map);
> > > RCU_LOCKDEP_WARN(!rcu_is_watching(),
> > > "rcu_read_lock_bh() used illegally while idle");
> > > +#endif
> >
> > Any chance of this using "if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL))"?
> > We should be OK providing a do-nothing __maybe_unused rcu_bh_lock_map
> > for lockdep-enabled -rt kernels, right?
>
> Since this function is small, I prefer if -rt defines their own
> rcu_read_lock_bh() which just does the local_bh_disable(). That would be
> way
> cleaner IMO. IIRC, -rt does similar things for spinlocks, but it has been
> sometime since I look at the -rt patchset.

I'll do it whichever way you all decide, though I'm not sure I agree about
it being cleaner (especially while RT is still out-of-tree and a change to
the non-RT version that fails to trigger a merge conflict is a concern).

What about moving everything but the local_bh_disable into a separate
function called from rcu_read_lock_bh, and making that a no-op on RT?
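
Roughly, such a split might look like the sketch below (helper name invented
for illustration; on RT the annotations compile away):

static inline void rcu_bh_lock_acquire(void)    /* hypothetical helper */
{
#ifndef CONFIG_PREEMPT_RT_FULL
        __acquire(RCU_BH);
        rcu_lock_acquire(&rcu_bh_lock_map);
        RCU_LOCKDEP_WARN(!rcu_is_watching(),
                         "rcu_read_lock_bh() used illegally while idle");
#endif
}

static inline void rcu_read_lock_bh(void)
{
        local_bh_disable();
        rcu_bh_lock_acquire();
}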

> > >
> > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > index 016c66a98292..a9cdf3d562bc 100644
> > > --- a/kernel/rcu/update.c
> > > +++ b/kernel/rcu/update.c
> > > @@ -296,7 +296,11 @@ int rcu_read_lock_bh_held(void)
> > > return 0;
> > > if (!rcu_lockdep_current_cpu_online())
> > > return 0;
> > > +#ifdef CONFIG_PREEMPT_RT_FULL
> > > + return lock_is_held(&rcu_lock_map) || irqs_disabled();
> > > +#else
> > > return in_softirq() || irqs_disabled();
> > > +#endif
> >
> > And globally.
>
> And could be untangled a bit as well:
>
>         if (irqs_disabled())
>                 return 1;
>
>         if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
>                 return lock_is_held(&rcu_lock_map);
>
>         return in_softirq();

OK.

-Scott


2019-08-23 15:57:53

by Paul E. McKenney

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Thu, Aug 22, 2019 at 09:50:09PM -0400, Joel Fernandes wrote:
> On Thu, Aug 22, 2019 at 08:27:06AM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 22, 2019 at 09:39:55AM -0400, Joel Fernandes wrote:
> > > On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > > > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > > > A plain local_bh_disable() is documented as creating an RCU critical
> > > > > section, and (at least) rcutorture expects this to be the case. However,
> > > > > in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
> > > > > preempt_count() directly. Even if RCU were changed to check
> > > > > in_softirq(), that wouldn't allow blocked BH disablers to be boosted.
> > > > >
> > > > > Fix this by calling rcu_read_lock() from local_bh_disable(), and update
> > > > > rcu_read_lock_bh_held() accordingly.
> > > >
> > > > Cool! Some questions and comments below.
> > > >
> > > > Thanx, Paul
> > > >
> > > > > Signed-off-by: Scott Wood <[email protected]>
> > > > > ---
> > > > > Another question is whether non-raw spinlocks are intended to create an
> > > > > RCU read-side critical section due to implicit preempt disable.
> > > >
> > > > Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> > > > and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> > > > consolidation? If not, I don't see why they should do so after that
> > > > consolidation in -rt.
> > >
> > > May be I am missing something, but I didn't see the connection between
> > > consolidation and this patch. AFAICS, this patch is so that
> > > rcu_read_lock_bh_held() works at all on -rt. Did I badly miss something?
> >
> > I was interpreting Scott's question (which would be excluded from the
> > git commit log) as relating to a possible follow-on patch.
> >
> > The question is "how special can non-raw spinlocks be in -rt?". From what
> > I can see, they have been treated as sleeplocks from an RCU viewpoint,
> > so maybe that should continue to be the case. It does deserve some
> > thought because in mainline a non-raw spinlock really would block a
> > post-consolidation RCU grace period, even in PREEMPT kernels.
> >
> > But then again, you cannot preempt a non-raw spinlock in mainline but
> > you can in -rt, so extending that exception to RCU is not unreasonable.
> >
> > Either way, we do need to make a definite decision and document it.
> > If I were forced to make a decision right now, I would follow the old
> > behavior, so that only raw spinlocks were guaranteed to block RCU grace
> > periods. But I am not being forced, so let's actually discuss and make
> > a conscious decision. ;-)
>
> I think non-raw spinlocks on -rt should at least do rcu_read_lock() so that
> any driver or kernel code that depends on this behavior and works on non-rt
> also works on -rt. It also removes the chance a kernel developer may miss
> documentation and accidentally forget that their code may break on -rt. I am
> curious to see how much this design pattern appears in the kernel
> (spin_lock'ed section "intended" as an RCU-reader by code sequences).
>
> Logically speaking, anything that disables preemption on non-RT should, to my
> mind, do rcu_read_lock() on -rt so that things keep working from RCU's
> perspective. But I wonder where we would draw the line, and whether the bar
> should be actual examples of such usage patterns before making a decision.
>
> Any thoughts?

Yes. Let's listen to what the -rt guys have to say. After all, they
are the ones who would be dealing with any differences in semantics.
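
For concreteness, the pattern under discussion would have the sleeping
spinlock path also bracket an RCU reader, roughly as in the sketch below
(illustrative only; the wrapper names are placeholders and this is not part
of the posted patches):

/* Placeholder sketch: a non-raw spinlock on -rt also acting as an RCU reader. */
static inline void rt_spin_lock_rcu_sketch(spinlock_t *lock)
{
	rcu_read_lock();   /* what preempt disable provides implicitly on non-RT */
	spin_lock(lock);   /* sleeping lock acquisition on -rt */
}

static inline void rt_spin_unlock_rcu_sketch(spinlock_t *lock)
{
	spin_unlock(lock);
	rcu_read_unlock();
}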

Thanx, Paul

2019-08-23 17:31:16

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Thu, Aug 22, 2019 at 09:36:21PM -0500, Scott Wood wrote:
> On Wed, 2019-08-21 at 16:33 -0700, Paul E. McKenney wrote:
> > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 388ace315f32..d6e357378732 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
> > > static inline void rcu_read_lock_bh(void)
> > > {
> > > local_bh_disable();
> > > +#ifndef CONFIG_PREEMPT_RT_FULL
> > > __acquire(RCU_BH);
> > > rcu_lock_acquire(&rcu_bh_lock_map);
> > > RCU_LOCKDEP_WARN(!rcu_is_watching(),
> > > "rcu_read_lock_bh() used illegally while idle");
> > > +#endif
> >
> > Any chance of this using "if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL))"?
> > We should be OK providing a do-nothing __maybe_unused rcu_bh_lock_map
> > for lockdep-enabled -rt kernels, right?
>
> OK.
>
> > > @@ -185,8 +189,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
> > > WARN_ON_ONCE(count < 0);
> > > local_irq_enable();
> > >
> > > - if (!in_atomic())
> > > + if (!in_atomic()) {
> > > + rcu_read_unlock();
> > > local_unlock(bh_lock);
> > > + }
> >
> > The return from in_atomic() is guaranteed to be the same at
> > local_bh_enable() time as it was at the call to the corresponding
> > local_bh_disable()?
>
> That's an existing requirement on RT (which rcutorture currently violates)
> due to bh_lock.
>
> > I could have sworn that I ran afoul of this last year. Might these
> > added rcu_read_lock() and rcu_read_unlock() calls need to check for
> > CONFIG_PREEMPT_RT_FULL?
>
> This code is already under a PREEMPT_RT_FULL ifdef.

Good enough, then!
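
Going back to the IS_ENABLED() suggestion above, rcu_read_lock_bh() would
then look something like this (a sketch, assuming a do-nothing
rcu_bh_lock_map stub really is provided for lockdep-enabled -rt builds;
sparse's __acquire() annotation may need separate handling):

static inline void rcu_read_lock_bh(void)
{
	local_bh_disable();
	if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) {
		__acquire(RCU_BH);
		rcu_lock_acquire(&rcu_bh_lock_map);
		RCU_LOCKDEP_WARN(!rcu_is_watching(),
				 "rcu_read_lock_bh() used illegally while idle");
	}
}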

Thanx, Paul

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On 2019-08-22 22:23:23 [-0500], Scott Wood wrote:
> On Thu, 2019-08-22 at 09:39 -0400, Joel Fernandes wrote:
> > On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > > Signed-off-by: Scott Wood <[email protected]>
> > > > ---
> > > > Another question is whether non-raw spinlocks are intended to create an
> > > > RCU read-side critical section due to implicit preempt disable.
> > >
> > > Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> > > and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> > > consolidation? If not, I don't see why they should do so after that
> > > consolidation in -rt.
> >
> > Maybe I am missing something, but I didn't see the connection between
> > consolidation and this patch. AFAICS, this patch is so that
> > rcu_read_lock_bh_held() works at all on -rt. Did I badly miss something?
>
> Before consolidation, RT mapped rcu_read_lock_bh_held() to
> rcu_read_lock_bh() and called rcu_read_lock() from rcu_read_lock_bh(). This
> somehow got lost when rebasing on top of 5.0.

so now rcu_read_lock_bh_held() is untouched and in_softirq() reports 1.
So the problem is that we never hold RCU but report 1 as if we did?

Sebastian

Subject: Re: [PATCH RT v2 3/3] rcu: Disable use_softirq on PREEMPT_RT

On 2019-08-21 16:40:18 [-0700], Paul E. McKenney wrote:
> Save a couple of lines?
>
> static bool use_softirq = !IS_ENABLED(CONFIG_PREEMPT_RT_FULL);
>
> And if I understand your point above, the module_param() might be
> able to be the same either way given the new RCU. But if not,
> at least we get rid of the #else.

I *think* we wanted this. And while I took the RCU patches for v5.2 I
forgot to enable it by default…
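
So the combined form in kernel/rcu/tree.c would end up roughly as below (a
sketch; whether the module_param() can stay the same either way is the open
question above):

/* Sketch: use softirq for RCU core processing unless running PREEMPT_RT_FULL. */
static bool use_softirq = !IS_ENABLED(CONFIG_PREEMPT_RT_FULL);
module_param(use_softirq, bool, 0444);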

> Thanx, Paul

Sebastian

2019-08-23 23:37:23

by Crystal Wood

[permalink] [raw]
Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Fri, 2019-08-23 at 18:17 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-08-22 22:23:23 [-0500], Scott Wood wrote:
> > On Thu, 2019-08-22 at 09:39 -0400, Joel Fernandes wrote:
> > > On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > > > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:
> > > > > Signed-off-by: Scott Wood <[email protected]>
> > > > > ---
> > > > > Another question is whether non-raw spinlocks are intended to create an
> > > > > RCU read-side critical section due to implicit preempt disable.
> > > >
> > > > Hmmm... Did non-raw spinlocks act like rcu_read_lock_sched()
> > > > and rcu_read_unlock_sched() pairs in -rt prior to the RCU flavor
> > > > consolidation? If not, I don't see why they should do so after that
> > > > consolidation in -rt.
> > >
> > > Maybe I am missing something, but I didn't see the connection between
> > > consolidation and this patch. AFAICS, this patch is so that
> > > rcu_read_lock_bh_held() works at all on -rt. Did I badly miss
> > > something?
> >
> > Before consolidation, RT mapped rcu_read_lock_bh_held() to
> > rcu_read_lock_bh() and called rcu_read_lock() from
> > rcu_read_lock_bh(). This
> > somehow got lost when rebasing on top of 5.0.
>
> so now rcu_read_lock_bh_held() is untouched and in_softirq() reports 1.
> So the problem is that we never hold RCU but report 1 as if we did?

Yes.

-Scott


2019-08-24 18:11:36

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Thu, Aug 22, 2019 at 10:23:23PM -0500, Scott Wood wrote:
> On Thu, 2019-08-22 at 09:39 -0400, Joel Fernandes wrote:
> > On Wed, Aug 21, 2019 at 04:33:58PM -0700, Paul E. McKenney wrote:
> > > On Wed, Aug 21, 2019 at 06:19:04PM -0500, Scott Wood wrote:

[ . . . ]

> > > > include/linux/rcupdate.h | 4 ++++
> > > > kernel/rcu/update.c | 4 ++++
> > > > kernel/softirq.c | 12 +++++++++---
> > > > 3 files changed, 17 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > > index 388ace315f32..d6e357378732 100644
> > > > --- a/include/linux/rcupdate.h
> > > > +++ b/include/linux/rcupdate.h
> > > > @@ -615,10 +615,12 @@ static inline void rcu_read_unlock(void)
> > > > static inline void rcu_read_lock_bh(void)
> > > > {
> > > > local_bh_disable();
> > > > +#ifndef CONFIG_PREEMPT_RT_FULL
> > > > __acquire(RCU_BH);
> > > > rcu_lock_acquire(&rcu_bh_lock_map);
> > > > RCU_LOCKDEP_WARN(!rcu_is_watching(),
> > > > "rcu_read_lock_bh() used illegally while idle");
> > > > +#endif
> > >
> > > Any chance of this using "if (!IS_ENABLED(CONFIG_PREEMPT_RT_FULL))"?
> > > We should be OK providing a do-nothing __maybe_unused rcu_bh_lock_map
> > > for lockdep-enabled -rt kernels, right?
> >
> > Since this function is small, I prefer if -rt defines its own
> > rcu_read_lock_bh() which just does the local_bh_disable(). That would be way
> > cleaner IMO. IIRC, -rt does similar things for spinlocks, but it has been
> > some time since I looked at the -rt patchset.
>
> I'll do it whichever way you all decide, though I'm not sure I agree about
> it being cleaner (especially while RT is still out-of-tree and a change to
> the non-RT version that fails to trigger a merge conflict is a concern).
>
> What about moving everything but the local_bh_disable into a separate
> function called from rcu_read_lock_bh, and making that a no-op on RT?

That makes a lot of sense to me!
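
A minimal sketch of that split might look like the following (the helper name
rcu_bh_lock_acquire() is invented here for illustration and is not from the
posted patches):

/* Everything except the BH disable itself; compiled out on -rt. */
static inline void rcu_bh_lock_acquire(void)
{
#ifndef CONFIG_PREEMPT_RT_FULL
	__acquire(RCU_BH);
	rcu_lock_acquire(&rcu_bh_lock_map);
	RCU_LOCKDEP_WARN(!rcu_is_watching(),
			 "rcu_read_lock_bh() used illegally while idle");
#endif
}

static inline void rcu_read_lock_bh(void)
{
	local_bh_disable();
	rcu_bh_lock_acquire();
}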

Thanx, Paul

Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On 2019-08-23 14:46:39 [-0500], Scott Wood wrote:
> > > Before consolidation, RT mapped rcu_read_lock_bh_held() to
> > > rcu_read_lock_bh() and called rcu_read_lock() from
> > > rcu_read_lock_bh(). This
> > > somehow got lost when rebasing on top of 5.0.
> >
> > so now rcu_read_lock_bh_held() is untouched and in_softirq() reports 1.
> > So the problem is that we never hold RCU but report 1 as if we did?
>
> Yes.

I understand the part where "rcu_read_lock() becomes part of
local_bh_disable()". But why do you modify rcu_read_lock_bh_held() and
rcu_read_lock_bh()? Couldn't they remain as-is?

> -Scott

Sebastian

2019-08-26 23:23:14

by Crystal Wood

[permalink] [raw]
Subject: Re: [PATCH RT v2 1/3] rcu: Acquire RCU lock when disabling BHs

On Mon, 2019-08-26 at 17:59 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-08-23 14:46:39 [-0500], Scott Wood wrote:
> > > > Before consolidation, RT mapped rcu_read_lock_bh_held() to
> > > > rcu_read_lock_bh() and called rcu_read_lock() from
> > > > rcu_read_lock_bh(). This
> > > > somehow got lost when rebasing on top of 5.0.
> > >
> > > so now rcu_read_lock_bh_held() is untouched and in_softirq() reports 1.
> > > So the problem is that we never hold RCU but report 1 as if we did?
> >
> > Yes.
>
> I understand the part where "rcu_read_lock() becomes part of
> local_bh_disable()". But why do you modify rcu_read_lock_bh_held() and
> rcu_read_lock_bh()? Couldn't they remain as-is?

Yes, it looks like they can. I recall having problems with
rcu_read_lock_bh_held() in an earlier version, which is what prompted the
change, but looking back I don't see what the problem was.
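
That keeps the change confined to the RT-only softirq code, where the disable
side would take the matching rcu_read_lock(), mirroring the
__local_bh_enable_ip() hunk quoted earlier in the thread (a sketch of just
the relevant lines, assuming RT's existing bh_lock local lock):

	/* Sketch: RT-only local BH disable path. */
	if (!in_atomic()) {
		local_lock(bh_lock);
		rcu_read_lock();   /* the BH-disabled region is now an RCU reader */
	}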

-Scott