2018-07-04 17:37:38

by Joe Korty

Subject: [PATCH RT] sample fix for splat in futex_[un]lock_pi for !rt

Balance atomic/!atomic migrate_enable calls in futex_[un]lock_pi.

The clever use of migrate_disable/enable in the rt patch

"futex: workaround migrate_disable/enable in different"

balances atomic/!atomic context only for the rt kernel.
This workaround makes it balanced for both rt and !rt.

The 'solution' presented here is for reference only.
A better solution might be for !rt to go back to using
migrate_enable/disable == preempt_enable/disable.
This patch passes the futex selftests for rt and !rt.
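
For reference, the imbalance being warned about can be pictured with the
following sketch of the -rt debug accounting in migrate_enable(). This is
a simplified, hedged sketch, not the literal 4.14-rt code; the counter
names match the hunks quoted later in this thread.

/*
 * Sketch only: migrate_disable() called while preemptible increments
 * p->migrate_disable.  If the matching migrate_enable() then runs in
 * atomic context it takes the early-return path and decrements
 * p->migrate_disable_atomic instead, so the two counters stop pairing
 * up and the WARN_ON_ONCE() checks fire.
 */
void migrate_enable(void)
{
	struct task_struct *p = current;

	if (in_atomic() || irqs_disabled()) {
		p->migrate_disable_atomic--;	/* the enable side ended up atomic */
		return;
	}
	if (unlikely(p->migrate_disable_atomic)) {
		tracing_off();
		WARN_ON_ONCE(1);		/* splats like the ones below */
	}
	WARN_ON_ONCE(p->migrate_disable <= 0);
	p->migrate_disable--;
}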

Sample kernel splat, edited for brevity. This happens
near the end of boot on a CentOS 7 installation.

WARNING: CPU: 1 PID: 5966 at kernel/sched/core.c:6994 migrate_enable+0x24e/0x2f0
CPU: 1 PID: 5966 Comm: threaded-ml Not tainted 4.14.40-rt31 #1
Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
task: ffff88046b67a6c0 task.stack: ffffc900053a0000
RIP: 0010:migrate_enable+0x24e/0x2f0
RSP: 0018:ffffc900053a3df8 EFLAGS: 00010246

Call Trace:
futex_unlock_pi+0x134/0x210
do_futex+0x13f/0x190
SyS_futex+0x6e/0x150
do_syscall_64+0x6f/0x190
entry_SYSCALL_64_after_hwframe+0x42/0xb7


WARNING: CPU: 1 PID: 5966 at kernel/sched/core.c:6998 migrate_enable+0x75/0x2f0
CPU: 1 PID: 5966 Comm: threaded-ml Tainted: G W 4.14.40-rt31 #1
Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
task: ffff88046b67a6c0 task.stack: ffffc900053a0000
RIP: 0010:migrate_enable+0x75/0x2f0
RSP: 0018:ffffc900053a3df8 EFLAGS: 00010246

Call Trace:
futex_unlock_pi+0x134/0x210
do_futex+0x13f/0x190
SyS_futex+0x6e/0x150
do_syscall_64+0x6f/0x190
entry_SYSCALL_64_after_hwframe+0x42/0xb7

This patch was developed against 4.14.40-rt31. It should be
applicable to all rt releases in which migrate_enable !=
preempt_enable for !rt kernels.

Signed-off-by: Joe Korty <[email protected]>

Index: b/kernel/futex.c
===================================================================
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2838,7 +2838,14 @@ retry_private:
spin_unlock(q.lock_ptr);
ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);
+#ifdef CONFIG_PREEMPT_RT_FULL
migrate_enable();
+#else
+ /* !rt has to force balanced atomic/!atomic migrate_enable/disable uses */
+ preempt_disable();
+ migrate_enable();
+ preempt_enable();
+#endif

if (ret) {
if (ret == 1)
@@ -2998,7 +3005,14 @@ retry:
/* drops pi_state->pi_mutex.wait_lock */
ret = wake_futex_pi(uaddr, uval, pi_state);

+#ifdef CONFIG_PREEMPT_RT_FULL
+ migrate_enable();
+#else
+ /* !rt has to force balanced atomic/!atomic uses */
+ preempt_disable();
migrate_enable();
+ preempt_enable();
+#endif

put_pi_state(pi_state);




Subject: [PATCH RT] sched/migrate_disable: fallback to preempt_disable() instead barrier()

migrate_disable() does nothing for !SMP && !RT. This is bad for two reasons:
- The futex code relies on the fact that migrate_disable() is part of spin_lock().
  There is a workaround for the !in_atomic() case in migrate_disable() which
  works around the different ordering (non-atomic lock and atomic unlock).

- We have a few instances where preempt_disable() is replaced with
  migrate_disable().

For both cases it is bad if migrate_disable() ends up as barrier() instead of
preempt_disable(). Let migrate_disable() fall back to preempt_disable().

Cc: [email protected]
Reported-by: [email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
include/linux/preempt.h | 4 ++--
kernel/sched/core.c | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 043e431a7e8e..d46688d521e6 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -241,8 +241,8 @@ static inline int __migrate_disabled(struct task_struct *p)
}

#else
-#define migrate_disable() barrier()
-#define migrate_enable() barrier()
+#define migrate_disable() preempt_disable()
+#define migrate_enable() preempt_enable()
static inline int __migrate_disabled(struct task_struct *p)
{
return 0;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ac3fb8495bd5..626a62218518 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7326,6 +7326,7 @@ void migrate_disable(void)
#endif

p->migrate_disable++;
+ preempt_disable();
}
EXPORT_SYMBOL(migrate_disable);

@@ -7349,6 +7350,7 @@ void migrate_enable(void)

WARN_ON_ONCE(p->migrate_disable <= 0);
p->migrate_disable--;
+ preempt_enable();
}
EXPORT_SYMBOL(migrate_enable);
#endif
--
2.18.0


2018-07-05 16:20:03

by Joe Korty

Subject: Re: [PATCH RT] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On Thu, Jul 05, 2018 at 05:50:34PM +0200, Sebastian Andrzej Siewior wrote:
> migrate_disable() does nothing !SMP && !RT. This is bad for two reasons:
> - The futex code relies on the fact migrate_disable() is part of spin_lock().
> There is a workaround for the !in_atomic() case in migrate_disable() which
> work-arounds the different ordering (non-atomic lock and atomic unlock).
>
> - we have a few instances where preempt_disable() is replaced with
> migrate_disable().
>
> For both cases it is bad if migrate_disable() ends up as barrier() instead of
> preempt_disable(). Let migrate_disable() fallback to preempt_disable().
>
> Cc: [email protected]
> Reported-by: [email protected]
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> include/linux/preempt.h | 4 ++--
> kernel/sched/core.c | 2 ++
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 043e431a7e8e..d46688d521e6 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -241,8 +241,8 @@ static inline int __migrate_disabled(struct task_struct *p)
> }
>
> #else
> -#define migrate_disable() barrier()
> -#define migrate_enable() barrier()
> +#define migrate_disable() preempt_disable()
> +#define migrate_enable() preempt_enable()
> static inline int __migrate_disabled(struct task_struct *p)
> {
> return 0;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ac3fb8495bd5..626a62218518 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7326,6 +7326,7 @@ void migrate_disable(void)
> #endif
>
> p->migrate_disable++;
> + preempt_disable();
> }
> EXPORT_SYMBOL(migrate_disable);
>
> @@ -7349,6 +7350,7 @@ void migrate_enable(void)
>
> WARN_ON_ONCE(p->migrate_disable <= 0);
> p->migrate_disable--;
> + preempt_enable();
> }
> EXPORT_SYMBOL(migrate_enable);
> #endif
> --
> 2.18.0



Hi Sebastian,
I just verified that this fix does not work for my mix of
config options (smp && preempt && !rt).

Regards,
Joe


2018-07-05 16:25:08

by Steven Rostedt

Subject: Re: [PATCH RT] sched/migrate_disable: fallback to preempt_disable() instead barrier()

[ Added Peter ]

On Thu, 5 Jul 2018 17:50:34 +0200
Sebastian Andrzej Siewior <[email protected]> wrote:

> migrate_disable() does nothing !SMP && !RT. This is bad for two reasons:
> - The futex code relies on the fact migrate_disable() is part of spin_lock().
> There is a workaround for the !in_atomic() case in migrate_disable() which
> work-arounds the different ordering (non-atomic lock and atomic unlock).

But isn't it only part of spin_lock() in the RT case?

>
> - we have a few instances where preempt_disable() is replaced with
> migrate_disable().

What? Really? I thought we only replace preempt_disable() with a
local_lock(). Which gives annotation to why a preempt_disable() exists.
And on non-RT, local_lock() is preempt_disable().

>
> For both cases it is bad if migrate_disable() ends up as barrier() instead of
> preempt_disable(). Let migrate_disable() fallback to preempt_disable().
>

I still don't understand exactly what is "bad" about it.

IIRC, I remember Peter not wanting any open coded "migrate_disable"
calls. It was to be for internal use cases only, and specifically, only
for RT.

Personally, I think making migrate_disable() into preempt_disable() on
NON_RT is incorrect too.

-- Steve



> Cc: [email protected]
> Reported-by: [email protected]
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> include/linux/preempt.h | 4 ++--
> kernel/sched/core.c | 2 ++
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 043e431a7e8e..d46688d521e6 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -241,8 +241,8 @@ static inline int __migrate_disabled(struct task_struct *p)
> }
>
> #else
> -#define migrate_disable() barrier()
> -#define migrate_enable() barrier()
> +#define migrate_disable() preempt_disable()
> +#define migrate_enable() preempt_enable()
> static inline int __migrate_disabled(struct task_struct *p)
> {
> return 0;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ac3fb8495bd5..626a62218518 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7326,6 +7326,7 @@ void migrate_disable(void)
> #endif
>
> p->migrate_disable++;
> + preempt_disable();
> }
> EXPORT_SYMBOL(migrate_disable);
>
> @@ -7349,6 +7350,7 @@ void migrate_enable(void)
>
> WARN_ON_ONCE(p->migrate_disable <= 0);
> p->migrate_disable--;
> + preempt_enable();
> }
> EXPORT_SYMBOL(migrate_enable);
> #endif


Subject: Re: [PATCH RT] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On 2018-07-05 12:23:00 [-0400], Steven Rostedt wrote:
> [ Added Peter ]
>
> On Thu, 5 Jul 2018 17:50:34 +0200
> Sebastian Andrzej Siewior <[email protected]> wrote:
>
> > migrate_disable() does nothing !SMP && !RT. This is bad for two reasons:
> > - The futex code relies on the fact migrate_disable() is part of spin_lock().
> > There is a workaround for the !in_atomic() case in migrate_disable() which
> > work-arounds the different ordering (non-atomic lock and atomic unlock).
>
> But isn't it only part of spin_lock() in the RT case?

That is correct. So in the !RT case, if it remains a barrier then nothing
bad happens, and this should not affect the futex case at all. Let me retry
this (Joe also says that this patch does not fix it).

> >
> > - we have a few instances where preempt_disable() is replaced with
> > migrate_disable().
>
> What? Really? I thought we only replace preempt_disable() with a
> local_lock(). Which gives annotation to why a preempt_disable() exists.
> And on non-RT, local_lock() is preempt_disable().

KVM-arm-arm64-downgrade-preempt_disable-d-region-to-.patch
printk-rt-aware.patch
upstream-net-rt-remove-preemption-disabling-in-netif_rx.patch

> > For both cases it is bad if migrate_disable() ends up as barrier() instead of
> > preempt_disable(). Let migrate_disable() fallback to preempt_disable().
> >
>
> I still don't understand exactly what is "bad" about it.
>
> IIRC, I remember Peter not wanting any open coded "migrate_disable"
> calls. It was to be for internal use cases only, and specifically, only
> for RT.

The futex code locks in !ATOMIC context and unlocks the same lock in
ATOMIC context, which is not balanced on RT. This is the only case where
we do the migrate_disable() magic.
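
Roughly, the pattern looks like this (a hedged sketch, not the literal
futex_lock_pi()/futex_unlock_pi() code; lock names as in the hunks earlier
in this thread):

spin_lock(q.lock_ptr);			/* on RT a sleeping lock, taken !atomic;
					   on RT it also does migrate_disable() */
raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock);	/* now atomic */
spin_unlock(q.lock_ptr);		/* so the unlock, and the migrate_enable()
					   paired with it, runs in atomic context */
/* ... __rt_mutex_start_proxy_lock() / wake_futex_pi() ... */
raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);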

> Personally, I think making migrate_disable() into preempt_disable() on
> NON_RT is incorrect too.

For the three patches I mentioned above, the migrate_disable() was
preempt_disable() for !RT before the RT patch was applied. So nothing
changes here. It should only matter for the case where migrate_disable()
was used explicitly.

> -- Steve

Sebastian

Subject: Re: [PATCH RT] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On 2018-07-05 12:18:07 [-0400], [email protected] wrote:
> Hi Sebastian,
Hi Joe,

> I just verified that this fix does not work for my mix of
> config options (smp && preempt && !rt).

Okay. So for !RT+SMP we keep migrate_disable() around and it does almost
nothing. And it is not referenced anywhere, so it does not matter as long
as it is not used directly.

We could turn migrate_disable() into a nop/barrier but then we have
three uses which do preempt_disable() -> migrate_disable() (see other
thread).
For the futex code it should not matter much because at this point
preemption is disabled due to the spin_lock() (so we would just extend
it past the spin_unlock() or wake_futex_pi() which ends with
preempt_enable()).
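
In other words, assuming migrate_disable()/migrate_enable() were mapped to
preempt_disable()/preempt_enable() on !RT, the nesting in the futex paths
would look roughly like this (sketch only, simplified ordering):

spin_lock(q.lock_ptr);		/* preempt_count 0 -> 1 on !RT            */
migrate_disable();		/* would be preempt_disable(): 1 -> 2     */
/* ... raw wait_lock taken, hb lock dropped ... */
spin_unlock(q.lock_ptr);	/* 2 -> 1, still non-preemptible          */
/* ... wake_futex_pi() / __rt_mutex_start_proxy_lock() ... */
migrate_enable();		/* would be preempt_enable(): 1 -> 0      */

The preempt-off region is merely extended a little past the spin_unlock(),
which is what is meant above.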


> Regards,
> Joe

Sebastian

Subject: [PATCH RT v2] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On SMP + !RT migrate_disable() is still around. It is not part of spin_lock()
anymore, so it has almost no users. However, the futex code has a workaround for
the !in_atomic() part of migrate_disable() which fails because the matching
migrate_disable() is no longer part of spin_lock().

On !SMP + !RT migrate_disable() is reduced to barrier(). This is not optimal
because we have a few spots where a "preempt_disable()" statement was replaced
with "migrate_disable()".

We also used the migrate_disable counter to figure out whether a sleeping lock is
acquired, so that RCU does not complain about schedule() during rcu_read_lock() while
a sleeping lock is held. This has changed: we no longer use it for that, we now have a
sleeping_lock counter for the RCU purpose.

This means we can now:
- for SMP + RT_BASE
  the full migration-disable machinery is kept; nothing changes here

- for !SMP + RT_BASE
  the migration counting is no longer required. It used to ensure that the task
  is not migrated to another CPU and that this CPU remains online. !SMP ensures
  that already.
  Move it to CONFIG_SCHED_DEBUG so the counting is done for debugging purposes
  only.

- for all other cases including !RT
  fall back to preempt_disable(). The only remaining users of migrate_disable()
  are those which were converted from preempt_disable() and the futex
  workaround, which is already inside the preempt_disable() section due to the
  spin_lock() that is held.

Cc: [email protected]
Reported-by: [email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
v1…v2: limit migrate_disable to RT only. Use preempt_disable() for !RT
if migrate_disable() is used.

include/linux/preempt.h | 6 +++---
include/linux/sched.h | 4 ++--
kernel/sched/core.c | 23 +++++++++++------------
kernel/sched/debug.c | 2 +-
4 files changed, 17 insertions(+), 18 deletions(-)

--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -204,7 +204,7 @@ do { \

#define preemptible() (preempt_count() == 0 && !irqs_disabled())

-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)

extern void migrate_disable(void);
extern void migrate_enable(void);
@@ -221,8 +221,8 @@ static inline int __migrate_disabled(str
}

#else
-#define migrate_disable() barrier()
-#define migrate_enable() barrier()
+#define migrate_disable() preempt_disable()
+#define migrate_enable() preempt_enable()
static inline int __migrate_disabled(struct task_struct *p)
{
return 0;
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -645,7 +645,7 @@ struct task_struct {
int nr_cpus_allowed;
const cpumask_t *cpus_ptr;
cpumask_t cpus_mask;
-#if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
int migrate_disable;
int migrate_disable_update;
# ifdef CONFIG_SCHED_DEBUG
@@ -653,8 +653,8 @@ struct task_struct {
# endif

#elif !defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
- int migrate_disable;
# ifdef CONFIG_SCHED_DEBUG
+ int migrate_disable;
int migrate_disable_atomic;
# endif
#endif
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1059,7 +1059,7 @@ void set_cpus_allowed_common(struct task
p->nr_cpus_allowed = cpumask_weight(new_mask);
}

-#if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
int __migrate_disabled(struct task_struct *p)
{
return p->migrate_disable;
@@ -1098,7 +1098,7 @@ static void __do_set_cpus_allowed_tail(s

void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
{
-#if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
if (__migrate_disabled(p)) {
lockdep_assert_held(&p->pi_lock);

@@ -1171,7 +1171,7 @@ static int __set_cpus_allowed_ptr(struct
if (cpumask_test_cpu(task_cpu(p), new_mask) || __migrate_disabled(p))
goto out;

-#if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
if (__migrate_disabled(p)) {
p->migrate_disable_update = 1;
goto out;
@@ -7134,7 +7134,7 @@ const u32 sched_prio_to_wmult[40] = {
/* 15 */ 119304647, 148102320, 186737708, 238609294, 286331153,
};

-#if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)

static inline void
update_nr_migratory(struct task_struct *p, long delta)
@@ -7282,45 +7282,44 @@ EXPORT_SYMBOL(migrate_enable);
#elif !defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
void migrate_disable(void)
{
+#ifdef CONFIG_SCHED_DEBUG
struct task_struct *p = current;

if (in_atomic() || irqs_disabled()) {
-#ifdef CONFIG_SCHED_DEBUG
p->migrate_disable_atomic++;
-#endif
return;
}
-#ifdef CONFIG_SCHED_DEBUG
+
if (unlikely(p->migrate_disable_atomic)) {
tracing_off();
WARN_ON_ONCE(1);
}
-#endif

p->migrate_disable++;
+#endif
+ barrier();
}
EXPORT_SYMBOL(migrate_disable);

void migrate_enable(void)
{
+#ifdef CONFIG_SCHED_DEBUG
struct task_struct *p = current;

if (in_atomic() || irqs_disabled()) {
-#ifdef CONFIG_SCHED_DEBUG
p->migrate_disable_atomic--;
-#endif
return;
}

-#ifdef CONFIG_SCHED_DEBUG
if (unlikely(p->migrate_disable_atomic)) {
tracing_off();
WARN_ON_ONCE(1);
}
-#endif

WARN_ON_ONCE(p->migrate_disable <= 0);
p->migrate_disable--;
+#endif
+ barrier();
}
EXPORT_SYMBOL(migrate_enable);
#endif
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -1030,7 +1030,7 @@ void proc_sched_show_task(struct task_st
P(dl.runtime);
P(dl.deadline);
}
-#if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT_BASE)
P(migrate_disable);
#endif
P(nr_cpus_allowed);

2018-07-06 19:07:53

by Joe Korty

Subject: Re: [PATCH RT v2] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On Fri, Jul 06, 2018 at 12:58:57PM +0200, Sebastian Andrzej Siewior wrote:
> On SMP + !RT migrate_disable() is still around. It is not part of spin_lock()
> anymore so it has almost no users. However the futex code has a workaround for
> the !in_atomic() part of migrate disable which fails because the matching
> migrade_disable() is no longer part of spin_lock().
>
> On !SMP + !RT migrate_disable() is reduced to barrier(). This is not optimal
> because we few spots where a "preempt_disable()" statement was replaced with
> "migrate_disable()".
>
> We also used the migration_disable counter to figure out if a sleeping lock is
> acquired so RCU does not complain about schedule() during rcu_read_lock() while
> a sleeping lock is held. This changed, we no longer use it, we have now a
> sleeping_lock counter for the RCU purpose.
>
> This means we can now:
> - for SMP + RT_BASE
> full migration program, nothing changes here
>
> - for !SMP + RT_BASE
> the migration counting is no longer required. It used to ensure that the task
> is not migrated to another CPU and that this CPU remains online. !SMP ensures
> that already.
> Move it to CONFIG_SCHED_DEBUG so the counting is done for debugging purpose
> only.
>
> - for all other cases including !RT
> fallback to preempt_disable(). The only remaining users of migrate_disable()
> are those which were converted from preempt_disable() and the futex
> workaround which is already in the preempt_disable() section due to the
> spin_lock that is held.
>
> Cc: [email protected]
> Reported-by: [email protected]
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> v1…v2: limit migrate_disable to RT only. Use preempt_disable() for !RT
> if migrate_disable() is used.
>
> include/linux/preempt.h | 6 +++---
> include/linux/sched.h | 4 ++--
> kernel/sched/core.c | 23 +++++++++++------------
> kernel/sched/debug.c | 2 +-
> 4 files changed, 17 insertions(+), 18 deletions(-)


Hi Sebastian,
v2 works for me.

I compiled and booted both smp+preempt+!rt and
smp+preempt+rt kernels, no splats on boot for either.

I ran the futex selftests on both kernels, both passed.

I ran a selection of posix tests from an old version of
the Linux Test Project, both kernels passed all tests.

Regards, and thanks,
Joe

2018-07-11 18:22:59

by Steven Rostedt

Subject: Re: [PATCH RT v2] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On Wed, 11 Jul 2018 17:39:52 +0200
Sebastian Andrzej Siewior <[email protected]> wrote:

> On 2018-07-06 12:58:57 [+0200], To Joe Korty wrote:
> > On SMP + !RT migrate_disable() is still around. It is not part of spin_lock()
> > anymore so it has almost no users. However the futex code has a workaround for
> > the !in_atomic() part of migrate disable which fails because the matching
> > migrade_disable() is no longer part of spin_lock().
> >
> > On !SMP + !RT migrate_disable() is reduced to barrier(). This is not optimal
> > because we few spots where a "preempt_disable()" statement was replaced with
> > "migrate_disable()".
> >
> > We also used the migration_disable counter to figure out if a sleeping lock is
> > acquired so RCU does not complain about schedule() during rcu_read_lock() while
> > a sleeping lock is held. This changed, we no longer use it, we have now a
> > sleeping_lock counter for the RCU purpose.
> >
> > This means we can now:
> > - for SMP + RT_BASE
> > full migration program, nothing changes here
> >
> > - for !SMP + RT_BASE
> > the migration counting is no longer required. It used to ensure that the task
> > is not migrated to another CPU and that this CPU remains online. !SMP ensures
> > that already.
> > Move it to CONFIG_SCHED_DEBUG so the counting is done for debugging purpose
> > only.
> >
> > - for all other cases including !RT
> > fallback to preempt_disable(). The only remaining users of migrate_disable()
> > are those which were converted from preempt_disable() and the futex
> > workaround which is already in the preempt_disable() section due to the
> > spin_lock that is held.
> >
> > Cc: [email protected]
> > Reported-by: [email protected]
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > ---
> > v1…v2: limit migrate_disable to RT only. Use preempt_disable() for !RT
> > if migrate_disable() is used.
>
> If there are no objections I would pick this up for next v4.16.
>

I would still rather have migrate_disable() be a nop (barrier at most)
when !RT, to keep it from being used, and just fix the places that are
at issue. But we can discuss this when we push this to mainline. I'm
fine with adding it to -rt if it fixes a real bug now.

-- Steve

Subject: Re: [PATCH RT v2] sched/migrate_disable: fallback to preempt_disable() instead barrier()

On 2018-07-06 12:58:57 [+0200], To Joe Korty wrote:
> On SMP + !RT migrate_disable() is still around. It is not part of spin_lock()
> anymore so it has almost no users. However the futex code has a workaround for
> the !in_atomic() part of migrate disable which fails because the matching
> migrade_disable() is no longer part of spin_lock().
>
> On !SMP + !RT migrate_disable() is reduced to barrier(). This is not optimal
> because we few spots where a "preempt_disable()" statement was replaced with
> "migrate_disable()".
>
> We also used the migration_disable counter to figure out if a sleeping lock is
> acquired so RCU does not complain about schedule() during rcu_read_lock() while
> a sleeping lock is held. This changed, we no longer use it, we have now a
> sleeping_lock counter for the RCU purpose.
>
> This means we can now:
> - for SMP + RT_BASE
> full migration program, nothing changes here
>
> - for !SMP + RT_BASE
> the migration counting is no longer required. It used to ensure that the task
> is not migrated to another CPU and that this CPU remains online. !SMP ensures
> that already.
> Move it to CONFIG_SCHED_DEBUG so the counting is done for debugging purpose
> only.
>
> - for all other cases including !RT
> fallback to preempt_disable(). The only remaining users of migrate_disable()
> are those which were converted from preempt_disable() and the futex
> workaround which is already in the preempt_disable() section due to the
> spin_lock that is held.
>
> Cc: [email protected]
> Reported-by: [email protected]
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> v1…v2: limit migrate_disable to RT only. Use preempt_disable() for !RT
> if migrate_disable() is used.

If there are no objections I would pick this up for next v4.16.

Sebastian