2016-04-05 12:50:06

by Mike Galbraith

Subject: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug


I hit a problem while testing the shiny new hotplug machinery.

migrate_disable() -> pin_current_cpu() -> hotplug_lock() leads to..
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on));

With hotplug_lock()/hotplug_unlock() now gone, the CPU pinning code no
longer takes a lock of its own, so we are free to pin after lock
acquisition and unpin before release, with no ABBA worries. Doing so
also saves a few cycles if we have to repeat the acquisition loop.

Fixes: e24b142cfb4a ("rt/locking: Reenable migration accross schedule")
Signed-off-by: Mike Galbraith <[email protected]>
---
kernel/locking/rtmutex.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -930,12 +930,12 @@ static inline void rt_spin_lock_fastlock
{
might_sleep_no_state_check();

- if (do_mig_dis)
- migrate_disable();
-
- if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
+ if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
rt_mutex_deadlock_account_lock(lock, current);
- else
+
+ if (do_mig_dis)
+ migrate_disable();
+ } else
slowfn(lock, do_mig_dis);
}

@@ -995,12 +995,11 @@ static int task_blocks_on_rt_mutex(struc
* the try_to_wake_up() code handles this accordingly.
*/
static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock,
- bool mg_off)
+ bool do_mig_dis)
{
struct task_struct *lock_owner, *self = current;
struct rt_mutex_waiter waiter, *top_waiter;
unsigned long flags;
- int ret;

rt_mutex_init_waiter(&waiter, true);

@@ -1008,6 +1007,8 @@ static void noinline __sched rt_spin_lo

if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+ if (do_mig_dis)
+ migrate_disable();
return;
}

@@ -1024,8 +1025,7 @@ static void noinline __sched rt_spin_lo
__set_current_state_no_track(TASK_UNINTERRUPTIBLE);
raw_spin_unlock(&self->pi_lock);

- ret = task_blocks_on_rt_mutex(lock, &waiter, self, RT_MUTEX_MIN_CHAINWALK);
- BUG_ON(ret);
+ BUG_ON(task_blocks_on_rt_mutex(lock, &waiter, self, RT_MUTEX_MIN_CHAINWALK));

for (;;) {
/* Try to acquire the lock again. */
@@ -1039,13 +1039,8 @@ static void noinline __sched rt_spin_lo

debug_rt_mutex_print_deadlock(&waiter);

- if (top_waiter != &waiter || adaptive_wait(lock, lock_owner)) {
- if (mg_off)
- migrate_enable();
+ if (top_waiter != &waiter || adaptive_wait(lock, lock_owner))
schedule();
- if (mg_off)
- migrate_disable();
- }

raw_spin_lock_irqsave(&lock->wait_lock, flags);

@@ -1077,6 +1072,9 @@ static void noinline __sched rt_spin_lo

raw_spin_unlock_irqrestore(&lock->wait_lock, flags);

+ if (do_mig_dis)
+ migrate_disable();
+
debug_rt_mutex_free_waiter(&waiter);
}

@@ -1159,10 +1157,10 @@ EXPORT_SYMBOL(rt_spin_unlock__no_mg);

void __lockfunc rt_spin_unlock(spinlock_t *lock)
{
+ migrate_enable();
/* NOTE: we always pass in '1' for nested, for simplicity */
spin_release(&lock->dep_map, 1, _RET_IP_);
rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock);
- migrate_enable();
}
EXPORT_SYMBOL(rt_spin_unlock);

@@ -1204,12 +1202,11 @@ int __lockfunc rt_spin_trylock(spinlock_
{
int ret;

- migrate_disable();
ret = rt_mutex_trylock(&lock->lock);
- if (ret)
+ if (ret) {
+ migrate_disable();
spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
- else
- migrate_enable();
+ }
return ret;
}
EXPORT_SYMBOL(rt_spin_trylock);
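
For readability, the same ordering change in plain (non-diff) form.
This is a simplified sketch of the post-patch paths, with lockdep and
debug details elided; the diff above is authoritative:

/* Fast path: pin only after the lock is ours. */
static inline void rt_spin_lock_fastlock(struct rt_mutex *lock,
					 void (*slowfn)(struct rt_mutex *lock,
							bool do_mig_dis),
					 bool do_mig_dis)
{
	might_sleep_no_state_check();

	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
		rt_mutex_deadlock_account_lock(lock, current);
		/* We already hold the lock, and pinning takes no lock,
		 * so pinning here cannot deadlock against hotplug. */
		if (do_mig_dis)
			migrate_disable();
	} else
		slowfn(lock, do_mig_dis);
}

/* Unlock side: unpin before release, mirroring the acquisition order. */
void __lockfunc rt_spin_unlock(spinlock_t *lock)
{
	migrate_enable();
	/* NOTE: we always pass in '1' for nested, for simplicity */
	spin_release(&lock->dep_map, 1, _RET_IP_);
	rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock);
}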


2016-04-06 12:00:28

by Mike Galbraith

Subject: Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

It'll seemingly take a hotplug beating as well as any non-rt kernel,
but big box NAKed it due to jitter, which can mean 1.0 things.. duh.

-Mike

2016-04-07 04:37:17

by Mike Galbraith

Subject: Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

On Wed, 2016-04-06 at 14:00 +0200, Mike Galbraith wrote:
> It'll seemingly take a hotplug beating as well as any non-rt kernel,
> but big box NAKed it due to jitter, which can mean 1.0 things.. duh.

FWIW, the below turned big box NAK into ACK. Stressing hotplug
overnight, iteration completion time went from about 2 1/2 hours with
bandaids on the two identified rt sore spots to an hour and 10 minutes
as well, for some reason, but whatever..

There are other things, like doing the downing on the cpu being taken
down, that would likely be a good idea, but I suppose I'll now wait to
see what future devel trees look like. I suspect Thomas will aim his
axe at the annoying lock too (and his makes clean cuts). Meanwhile,
just reverting e24b142cfb4a makes hotplug as safe as it ever was (not
at all), slaughtering the lock seems to put current rt on par with
non-rt (iow, the other changes left not much rt trouble remaining), and
the below is one way to make e24b142cfb4a non-toxic.

-Mike

rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

I hit a problem while testing the shiny new hotplug machinery.

migrate_disable() -> pin_current_cpu() -> hotplug_lock() leads to..
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on));

Unpin before we block, and repin while still in atomic context
after acquisition.

Fixes: e24b142cfb4a ("rt/locking: Reenable migration accross schedule")
Signed-off-by: Mike Galbraith <[email protected]>
---
include/linux/cpu.h | 2 ++
include/linux/preempt.h | 2 ++
kernel/cpu.c | 13 +++++++++++++
kernel/locking/rtmutex.c | 18 +++++++++++-------
kernel/sched/core.c | 7 +++++++
5 files changed, 35 insertions(+), 7 deletions(-)

--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -231,6 +231,7 @@ extern void put_online_cpus(void);
extern void cpu_hotplug_disable(void);
extern void cpu_hotplug_enable(void);
extern void pin_current_cpu(void);
+extern void pin_current_cpu_in_atomic(void);
extern void unpin_current_cpu(void);
#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
#define __hotcpu_notifier(fn, pri) __cpu_notifier(fn, pri)
@@ -250,6 +251,7 @@ static inline void cpu_hotplug_done(void
#define cpu_hotplug_disable() do { } while (0)
#define cpu_hotplug_enable() do { } while (0)
static inline void pin_current_cpu(void) { }
+static inline void pin_current_cpu_in_atomic(void) { }
static inline void unpin_current_cpu(void) { }
#define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
#define __hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -302,9 +302,11 @@ do { \
# define preempt_enable_nort() barrier()
# ifdef CONFIG_SMP
extern void migrate_disable(void);
+ extern void migrate_disable_in_atomic(void);
extern void migrate_enable(void);
# else /* CONFIG_SMP */
# define migrate_disable() barrier()
+# define migrate_disable_in_atomic() barrier()
# define migrate_enable() barrier()
# endif /* CONFIG_SMP */
#else
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -204,6 +204,19 @@ void pin_current_cpu(void)
}

/**
+ * pin_current_cpu_in_atomic - Prevent the current cpu from being unplugged
+ *
+ * The caller is acquiring a lock, and must have a reference before leaving
+ * the preemption disabled region therein.
+ *
+ * Must be called with preemption disabled (preempt_count = 1)!
+ */
+void pin_current_cpu_in_atomic(void)
+{
+ this_cpu_ptr(&hotplug_pcp)->refcount++;
+}
+
+/**
* unpin_current_cpu - Allow unplug of current cpu
*
* Must be called with preemption or interrupts disabled!
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1002,11 +1002,17 @@ static void noinline __sched rt_spin_lo
unsigned long flags;
int ret;

+ mg_off &= (self->migrate_disable == 1 && !self->state);
+ if (mg_off)
+ migrate_enable();
+
rt_mutex_init_waiter(&waiter, true);

raw_spin_lock_irqsave(&lock->wait_lock, flags);

if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
+ if (mg_off)
+ migrate_disable_in_atomic();
raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
return;
}
@@ -1029,8 +1035,11 @@ static void noinline __sched rt_spin_lo

for (;;) {
/* Try to acquire the lock again. */
- if (__try_to_take_rt_mutex(lock, self, &waiter, STEAL_LATERAL))
+ if (__try_to_take_rt_mutex(lock, self, &waiter, STEAL_LATERAL)) {
+ if (mg_off)
+ migrate_disable_in_atomic();
break;
+ }

top_waiter = rt_mutex_top_waiter(lock);
lock_owner = rt_mutex_owner(lock);
@@ -1039,13 +1048,8 @@ static void noinline __sched rt_spin_lo

debug_rt_mutex_print_deadlock(&waiter);

- if (top_waiter != &waiter || adaptive_wait(lock, lock_owner)) {
- if (mg_off)
- migrate_enable();
+ if (top_waiter != &waiter || adaptive_wait(lock, lock_owner))
schedule();
- if (mg_off)
- migrate_disable();
- }

raw_spin_lock_irqsave(&lock->wait_lock, flags);

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3328,6 +3328,13 @@ void migrate_disable(void)
}
EXPORT_SYMBOL(migrate_disable);

+void migrate_disable_in_atomic(void)
+{
+ pin_current_cpu_in_atomic();
+ current->migrate_disable++;
+}
+EXPORT_SYMBOL(migrate_disable_in_atomic);
+
void migrate_enable(void)
{
struct task_struct *p = current;
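
Schematically, the resulting slowpath flow looks like the following.
This is a condensed sketch (waiter enqueue/dequeue, state save/restore
and wakeup handling elided); the diff above is authoritative:

static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock,
						   bool mg_off)
{
	struct task_struct *self = current;
	struct rt_mutex_waiter waiter;
	unsigned long flags;

	/* Drop the cpu pin only if we are the sole pinner and carry
	 * no sleep state that blocking would clobber. */
	mg_off &= (self->migrate_disable == 1 && !self->state);
	if (mg_off)
		migrate_enable();

	raw_spin_lock_irqsave(&lock->wait_lock, flags);

	if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
		/* Repin while still in atomic context, as
		 * pin_current_cpu_in_atomic() requires: ->wait_lock
		 * keeps preemption off until the pin is back. */
		if (mg_off)
			migrate_disable_in_atomic();
		raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
		return;
	}

	/* ... set state, enqueue waiter ... */

	for (;;) {
		if (__try_to_take_rt_mutex(lock, self, &waiter, STEAL_LATERAL)) {
			if (mg_off)
				migrate_disable_in_atomic();
			break;
		}
		/* ... adaptive wait or schedule(), retake ->wait_lock ... */
	}

	/* ... restore state, dequeue, unlock ->wait_lock, free waiter ... */
}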

2016-04-07 16:48

by Sebastian Andrzej Siewior

Subject: Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

On 04/07/2016 06:37 AM, Mike Galbraith wrote:
> On Wed, 2016-04-06 at 14:00 +0200, Mike Galbraith wrote:
>> It'll seemingly take a hotplug beating as well as any non-rt kernel,
>> but big box NAKed it due to jitter, which can mean 1.0 things.. duh.
>
> FWIW, the below turned big box NAK into ACK. Stressing hotplug
> overnight, iteration completion time went from about 2 1/2 hours with
> bandaids on the two identified rt sore spots to an hour and 10 minutes
> as well, for some reason, but whatever..

Just to be clear: do I need #1, #2 from this thread, plus this below,
to make it work?

> -Mike
>

Sebastian

2016-04-07 19:08:17

by Mike Galbraith

Subject: Re: [rfc patch 2/2] rt/locking/hotplug: Fix rt_spin_lock_slowlock() migrate_disable() bug

On Thu, 2016-04-07 at 18:48 +0200, Sebastian Andrzej Siewior wrote:
> On 04/07/2016 06:37 AM, Mike Galbraith wrote:
> > On Wed, 2016-04-06 at 14:00 +0200, Mike Galbraith wrote:
> > > It'll seemingly take a hotplug beating as well as any non-rt kernel,
> > > but big box NAKed it due to jitter, which can mean 1.0 things.. duh.
> >
> > FWIW, the below turned big box NAK into ACK. Stressing hotplug
> > overnight, iteration completion time went from about 2 1/2 hours with
> > bandaids on the two identified rt sore spots to an hour and 10 minutes
> > as well, for some reason, but whatever..
>
> Just to be clear: do I need #1, #2 from this thread, plus this below,
> to make it work?

The followup to 2/2 replaced 2/2.

-Mike