2006-03-26 21:07:18

by Esben Nielsen

Subject: Are ALL_TASKS_PI on in 2.6.16-rt7?

It looks like normal, non-rt tasks are also being boosted.

Esben



2006-03-26 21:23:24

by Esben Nielsen

Subject: 2.6.16-rt7 and deadlock detection.

I don't get any printouts from deadlock detection in many of my tests.
When there is a deadlock, down() simply returns instead of blocking
forever.

Esben



On Sun, 26 Mar 2006, Esben Nielsen wrote:

> It looks like normal, non-rt tasks are also being boosted.
>
> Esben
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2006-03-26 21:30:22

by Ingo Molnar

Subject: Re: Are ALL_TASKS_PI on in 2.6.16-rt7?


* Esben Nielsen <[email protected]> wrote:

> It looks like normal, non-rt tasks are also being boosted.

correct. We'd like to make sure the PI code is correct - and for
PI-futex it makes sense anyway.

Ingo

2006-03-26 21:34:00

by Esben Nielsen

Subject: Re: Are ALL_TASKS_PI on in 2.6.16-rt7?

On Sun, 26 Mar 2006, Ingo Molnar wrote:

>
> * Esben Nielsen <[email protected]> wrote:
>
> > It looks like normal, non-rt tasks are also being boosted.
>
> correct. We'd like to make sure the PI code is correct - and for
> PI-futex it makes sense anyway.
>

It won't work 100% when a task is boosted to a normal, non-rt prio,
since at scheduler_tick()
p->prio = effective_prio(p);
can be executed, overwriting the boost.

> Ingo
>

Esben

2006-03-26 21:38:27

by Ingo Molnar

Subject: Re: Are ALL_TASKS_PI on in 2.6.16-rt7?


* Esben Nielsen <[email protected]> wrote:

> On Sun, 26 Mar 2006, Ingo Molnar wrote:
>
> >
> > * Esben Nielsen <[email protected]> wrote:
> >
> > > It looks like normal, non-rt tasks are also being boosted.
> >
> > correct. We'd like to make sure the PI code is correct - and for
> > PI-futex it makes sense anyway.
> >
>
> It won't work 100% when a task is boosted to a normal, non-rt prio,
> since at scheduler_tick()
> p->prio = effective_prio(p);
> can be executed, overwriting the boost.

yeah - but that's relatively rare, upon expiration of the timeslice.
The following would probably solve it: scheduler_tick() could take the
pi_lock (before taking the rq lock), update normal_prio, and then call
into rt_mutex_getprio() [just like setscheduler does] to set the
priority.

Ingo

2006-03-26 22:05:08

by Esben Nielsen

Subject: Re: 2.6.16-rt7 and deadlock detection.

On Sun, 26 Mar 2006, Esben Nielsen wrote:

> I don't get any printouts from deadlock detection in many of my
> tests.
> When there is a deadlock, down() simply returns instead of blocking
> forever.

rt_mutex_slowlock seems to return -EDEADLK even though the caller didn't
ask for deadlock detection (detect_deadlock=0). That is bad, because the
caller will not check for it. It ought to simply leave the task blocked.

It only happens with CONFIG_DEBUG_RT_MUTEXES. That one also messes up
task->pi_waiters, as reported earlier.

Esben

>
> Esben
>
>
>
> On Sun, 26 Mar 2006, Esben Nielsen wrote:
>
> > It looks like normal, non-rt tasks are also being boosted.
> >
> > Esben

2006-03-26 22:06:59

by Ingo Molnar

Subject: Re: 2.6.16-rt7 and deadlock detection.


* Esben Nielsen <[email protected]> wrote:

> On Sun, 26 Mar 2006, Esben Nielsen wrote:
>
> > I don't get any printouts from deadlock detection in many of my
> > tests.
> > When there is a deadlock, down() simply returns instead of blocking
> > forever.
>
> rt_mutex_slowlock seems to return -EDEADLK even though the caller
> didn't ask for deadlock detection (detect_deadlock=0). That is bad,
> because the caller will not check for it. It ought to simply leave
> the task blocked.
>
> It only happens with CONFIG_DEBUG_RT_MUTEXES. That one also messes up
> task->pi_waiters, as reported earlier.

FYI, here are my current fixes relative to -rt9. The deadlock-detection
issue is not yet fixed, i think, but i found a couple of other PI-related
bugs.

Ingo

Index: linux/kernel/rtmutex.c
===================================================================
--- linux.orig/kernel/rtmutex.c
+++ linux/kernel/rtmutex.c
@@ -360,103 +360,6 @@ static void adjust_pi_chain(struct rt_mu
}

/*
- * Task blocks on lock.
- *
- * Prepare waiter and potentially propagate our priority into the pi chain.
- *
- * This must be called with lock->wait_lock held.
- */
-static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter,
- int detect_deadlock __IP_DECL__)
-{
- int res = 0;
- struct rt_mutex_waiter *top_waiter = waiter;
- LIST_HEAD(lock_chain);
-
- waiter->task = current;
- waiter->lock = lock;
- debug_rt_mutex_reset_waiter(waiter);
-
- spin_lock(&current->pi_lock);
- current->pi_locked_by = current;
- plist_node_init(&waiter->list_entry, current->prio);
- plist_node_init(&waiter->pi_list_entry, current->prio);
-
- /* Get the top priority waiter of the lock: */
- if (rt_mutex_has_waiters(lock))
- top_waiter = rt_mutex_top_waiter(lock);
- plist_add(&waiter->list_entry, &lock->wait_list);
-
- current->pi_blocked_on = waiter;
-
- /*
- * Call adjust_prio_chain, when waiter is the new top waiter
- * or when deadlock detection is requested:
- */
- if (waiter != rt_mutex_top_waiter(lock) &&
- !debug_rt_mutex_detect_deadlock(detect_deadlock))
- goto out;
-
- /* Try to lock the full chain: */
- res = lock_pi_chain(lock, waiter, &lock_chain, 1, detect_deadlock);
-
- if (likely(!res))
- adjust_pi_chain(lock, waiter, top_waiter, &lock_chain);
-
- /* Common case: we managed to lock it: */
- if (res != -EBUSY)
- goto out_unlock;
-
- /* Rare case: we hit some other task running a pi chain operation: */
- unlock_pi_chain(&lock_chain);
-
- plist_del(&waiter->list_entry, &lock->wait_list);
- current->pi_blocked_on = NULL;
- current->pi_locked_by = NULL;
- spin_unlock(&current->pi_lock);
- fixup_rt_mutex_waiters(lock);
-
- spin_unlock(&lock->wait_lock);
-
- spin_lock(&pi_conflicts_lock);
-
- spin_lock(&current->pi_lock);
- current->pi_locked_by = current;
- spin_lock(&lock->wait_lock);
- if (!rt_mutex_owner(lock)) {
- waiter->task = NULL;
- spin_unlock(&pi_conflicts_lock);
- goto out;
- }
- plist_node_init(&waiter->list_entry, current->prio);
- plist_node_init(&waiter->pi_list_entry, current->prio);
-
- /* Get the top priority waiter of the lock: */
- if (rt_mutex_has_waiters(lock))
- top_waiter = rt_mutex_top_waiter(lock);
- plist_add(&waiter->list_entry, &lock->wait_list);
-
- current->pi_blocked_on = waiter;
-
- /* Lock the full chain: */
- res = lock_pi_chain(lock, waiter, &lock_chain, 0, detect_deadlock);
-
- /* Drop the conflicts lock before adjusting: */
- spin_unlock(&pi_conflicts_lock);
-
- if (likely(!res))
- adjust_pi_chain(lock, waiter, top_waiter, &lock_chain);
-
- out_unlock:
- unlock_pi_chain(&lock_chain);
- out:
- current->pi_locked_by = NULL;
- spin_unlock(&current->pi_lock);
- return res;
-}
-
-/*
* Optimization: check if we can steal the lock from the
* assigned pending owner [which might not have taken the
* lock yet]:
@@ -562,6 +465,117 @@ static int try_to_take_rt_mutex(struct r
}

/*
+ * Task blocks on lock.
+ *
+ * Prepare waiter and potentially propagate our priority into the pi chain.
+ *
+ * This must be called with lock->wait_lock held.
+ * return values: 0: waiter queued, 1: got the lock,
+ * -EDEADLK: deadlock detected.
+ */
+static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter,
+ int detect_deadlock __IP_DECL__)
+{
+ struct rt_mutex_waiter *top_waiter = waiter;
+ LIST_HEAD(lock_chain);
+ int res = 0;
+
+ waiter->task = current;
+ waiter->lock = lock;
+ debug_rt_mutex_reset_waiter(waiter);
+
+ spin_lock(&current->pi_lock);
+ current->pi_locked_by = current;
+ plist_node_init(&waiter->list_entry, current->prio);
+ plist_node_init(&waiter->pi_list_entry, current->prio);
+
+ /* Get the top priority waiter of the lock: */
+ if (rt_mutex_has_waiters(lock))
+ top_waiter = rt_mutex_top_waiter(lock);
+ plist_add(&waiter->list_entry, &lock->wait_list);
+
+ current->pi_blocked_on = waiter;
+
+ /*
+ * Call adjust_prio_chain, when waiter is the new top waiter
+ * or when deadlock detection is requested:
+ */
+ if (waiter != rt_mutex_top_waiter(lock) &&
+ !debug_rt_mutex_detect_deadlock(detect_deadlock))
+ goto out_unlock_pi;
+
+ /* Try to lock the full chain: */
+ res = lock_pi_chain(lock, waiter, &lock_chain, 1, detect_deadlock);
+
+ if (likely(!res))
+ adjust_pi_chain(lock, waiter, top_waiter, &lock_chain);
+
+ /* Common case: we managed to lock it: */
+ if (res != -EBUSY)
+ goto out_unlock_chain_pi;
+
+ /* Rare case: we hit some other task running a pi chain operation: */
+ unlock_pi_chain(&lock_chain);
+
+ plist_del(&waiter->list_entry, &lock->wait_list);
+ current->pi_blocked_on = NULL;
+ current->pi_locked_by = NULL;
+ spin_unlock(&current->pi_lock);
+ fixup_rt_mutex_waiters(lock);
+
+ spin_unlock(&lock->wait_lock);
+
+ /*
+ * Here we have dropped all locks, and take the global
+ * pi_conflicts_lock. We have to redo all the work, no
+ * previous information about the lock is valid anymore:
+ */
+ spin_lock(&pi_conflicts_lock);
+
+ spin_lock(&lock->wait_lock);
+ if (try_to_take_rt_mutex(lock __IP__)) {
+ /*
+ * Rare race: against all odds we got the lock.
+ */
+ res = 1;
+ goto out;
+ }
+
+ WARN_ON(!rt_mutex_owner(lock) || rt_mutex_owner(lock) == current);
+
+ spin_lock(&current->pi_lock);
+ current->pi_locked_by = current;
+
+ plist_node_init(&waiter->list_entry, current->prio);
+ plist_node_init(&waiter->pi_list_entry, current->prio);
+
+ /* Get the top priority waiter of the lock: */
+ if (rt_mutex_has_waiters(lock))
+ top_waiter = rt_mutex_top_waiter(lock);
+ plist_add(&waiter->list_entry, &lock->wait_list);
+
+ current->pi_blocked_on = waiter;
+
+ /* Lock the full chain: */
+ res = lock_pi_chain(lock, waiter, &lock_chain, 0, detect_deadlock);
+
+ /* Drop the conflicts lock before adjusting: */
+ spin_unlock(&pi_conflicts_lock);
+
+ if (likely(!res))
+ adjust_pi_chain(lock, waiter, top_waiter, &lock_chain);
+
+ out_unlock_chain_pi:
+ unlock_pi_chain(&lock_chain);
+ out_unlock_pi:
+ current->pi_locked_by = NULL;
+ spin_unlock(&current->pi_lock);
+ out:
+ return res;
+}
+
+/*
* Wake up the next waiter on the lock.
*
* Remove the top waiter from the current tasks waiter list and from
@@ -773,7 +787,7 @@ rt_lock_slowlock(struct rt_mutex *lock _

for (;;) {
unsigned long saved_flags;
- int saved_lock_depth = current->lock_depth;
+ int ret, saved_lock_depth = current->lock_depth;

/* Try to acquire the lock */
if (try_to_take_rt_mutex(lock __IP__))
@@ -783,8 +797,16 @@ rt_lock_slowlock(struct rt_mutex *lock _
* when we have been woken up by the previous owner
* but the lock got stolen by an higher prio task.
*/
- if (unlikely(!waiter.task))
- task_blocks_on_rt_mutex(lock, &waiter, 0 __IP__);
+ if (!waiter.task) {
+ ret = task_blocks_on_rt_mutex(lock, &waiter, 0 __IP__);
+ /* got the lock: */
+ if (ret == 1) {
+ ret = 0;
+ break;
+ }
+ /* deadlock_detect == 0, so return should be 0 or 1: */
+ WARN_ON(ret);
+ }

/*
* Prevent schedule() to drop BKL, while waiting for
@@ -974,10 +996,9 @@ rt_mutex_slowlock(struct rt_mutex *lock,
if (!waiter.task) {
ret = task_blocks_on_rt_mutex(lock, &waiter,
detect_deadlock __IP__);
- if (ret == -EDEADLK)
+ /* got the lock or deadlock: */
+ if (ret == 1 || ret == -EDEADLK)
break;
- if (ret == -EBUSY)
- continue;
}

saved_flags = current->flags & PF_NOSCHED;
@@ -1043,16 +1064,15 @@ rt_mutex_slowtrylock(struct rt_mutex *lo

if (likely(rt_mutex_owner(lock) != current)) {

+ /* FIXME: why is this done here and not above? */
init_lists(lock);

ret = try_to_take_rt_mutex(lock __IP__);
/*
* try_to_take_rt_mutex() sets the lock waiters
- * bit. We might be the only waiter. Check if this
- * needs to be cleaned up.
+ * bit unconditionally. Clean this up.
*/
- if (!ret)
- fixup_rt_mutex_waiters(lock);
+ fixup_rt_mutex_waiters(lock);
}

spin_unlock_irqrestore(&lock->wait_lock, flags);
Index: linux/kernel/rtmutex_common.h
===================================================================
--- linux.orig/kernel/rtmutex_common.h
+++ linux/kernel/rtmutex_common.h
@@ -98,18 +98,18 @@ task_top_pi_waiter(struct task_struct *p
static inline struct task_struct *rt_mutex_owner(struct rt_mutex *lock)
{
return (struct task_struct *)
- ((unsigned long)((lock)->owner) & ~RT_MUTEX_OWNER_MASKALL);
+ ((unsigned long)lock->owner & ~RT_MUTEX_OWNER_MASKALL);
}

static inline struct task_struct *rt_mutex_real_owner(struct rt_mutex *lock)
{
return (struct task_struct *)
- ((unsigned long)((lock)->owner) & ~RT_MUTEX_HAS_WAITERS);
+ ((unsigned long)lock->owner & ~RT_MUTEX_HAS_WAITERS);
}

static inline unsigned long rt_mutex_owner_pending(struct rt_mutex *lock)
{
- return ((unsigned long)((lock)->owner) & RT_MUTEX_OWNER_PENDING);
+ return (unsigned long)lock->owner & RT_MUTEX_OWNER_PENDING;
}

/*

2006-03-26 23:38:05

by Ingo Molnar

Subject: 2.6.16-rt10


* Esben Nielsen <[email protected]> wrote:

> rt_mutex_slowlock seems to return -EDEADLK even though the caller
> didn't ask for deadlock detection (detect_deadlock=0). That is bad,
> because the caller will not check for it. It ought to simply leave
> the task blocked.
>
> It only happens with CONFIG_DEBUG_RT_MUTEXES. That one also messes up
> task->pi_waiters, as reported earlier.

i've released -rt10 with this bug (and other PI-related bugs) fixed.

Ingo

2006-03-28 09:44:16

by Simon Derr

Subject: Re: 2.6.16-rt10



On Mon, 27 Mar 2006, Ingo Molnar wrote:

>
> i've released -rt10


Is anyone working on a port of this patch to the IA64 architecture?


Simon.

2006-03-28 20:52:19

by Ingo Molnar

Subject: Re: 2.6.16-rt10


* Simon Derr <[email protected]> wrote:

> On Mon, 27 Mar 2006, Ingo Molnar wrote:
>
> > i've released -rt10
>
> Is anyone working on a port of this patch to the IA64 architecture?

not that i know of. If someone wants to do that, take a look at the
x86_64 changes (or ppc/mips/i386 changes) to get an idea. These are the
rough steps needed:

- do the raw_spinlock_t/rwlock_t -> __raw_spinlock_t/rwlock_t rename

- change the APIs in asm-ia64/semaphore.h (and arch files) to
compat_up()/down()/etc.

- in the arch Kconfig, turn off RWSEM_XCHGADD_ALGORITHM if PREEMPT_RT.

- add the TIF_NEED_RESCHED_DELAYED logic to thread_info.h and the entry
  assembly code.

- change most/all spinlocks in arch/ia64 to raw_spinlock / RAW_SPINLOCK

- change most/all seqlocks to raw_seqlock / RAW_SEQLOCK

- add smp_send_reschedule_allbutself().

- take a good look at the arch/x86_64/kernel/process.c changes and port
the need_resched_delayed() and __schedule() changes.

that should be at least 95% of what's needed. (the x86_64 port does a
couple of other things too, like latency tracing support, etc., but you
don't need those for the initial ia64 port.)
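For the RWSEM_XCHGADD_ALGORITHM step, the change would plausibly mirror what the i386/x86_64 -rt Kconfigs do. The fragment below is a hedged sketch of what arch/ia64/Kconfig might gain, not a tested patch; exact symbol spellings should be checked against the arch's existing Kconfig:

```kconfig
# Sketch: fall back to the generic, spinlock-based rwsem
# implementation when PREEMPT_RT is enabled.
config RWSEM_XCHGADD_ALGORITHM
	bool
	depends on !PREEMPT_RT
	default y

config RWSEM_GENERIC_SPINLOCK
	bool
	depends on PREEMPT_RT
	default y
```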

Ingo

2006-03-29 07:56:11

by Simon Derr

Subject: Re: 2.6.16-rt10

On Tue, 28 Mar 2006, Ingo Molnar wrote:

> > Is anyone working on a port of this patch to the IA64 architecture?
>
> not that i know of. If someone wants to do that, take a look at the
> x86_64 changes (or ppc/mips/i386 changes) to get an idea. These are the
> rough steps needed:
>
> - do the raw_spinlock_t/rwlock_t -> __raw_spinlock_t/rwlock_t rename
>
> - change the APIs in asm-ia64/semaphore.h (and arch files) to
> compat_up()/down()/etc.
>
> - in the arch Kconfig, turn off RWSEM_XCHGADD_ALGORITHM if PREEMPT_RT.
>
> - add the TIF_NEED_RESCHED_DELAYED logic to thread_info.h and the entry
>   assembly code.
>
> - change most/all spinlocks in arch/ia64 to raw_spinlock / RAW_SPINLOCK
>
> - change most/all seqlocks to raw_seqlock / RAW_SEQLOCK
>
> - add smp_send_reschedule_allbutself().
>
> - take a good look at the arch/x86_64/kernel/process.c changes and port
> the need_resched_delayed() and __schedule() changes.
>
> that should be at least 95% of what's needed. (the x86_64 port does a
> couple of other things too, like latency tracing support, etc., but you
> don't need those for the initial ia64 port.)

Thanks a lot for your reply, Ingo, and for these hints on how to begin.

Simon.