2005-11-26 07:42:29

by David Singleton

[permalink] [raw]
Subject: robust futex heap support patch

There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
robust and priority inheriting
pthread_mutexes on the 'heap'.

The previous patches only supported either file based pthread_mutexes
or mmapped anonymous memory based pthread_mutexes. This patch allows
pthread_mutexes
to be 'malloc'ed while using the PTHREAD_MUTEX_ROBUST_NP attribute
or PTHREAD_PRIO_INHERIT attribute.

The patch can be found at:

http://source.mvista.com/~dsingleton

David


2005-11-26 13:31:34

by Ingo Molnar

[permalink] [raw]
Subject: Re: robust futex heap support patch


* david singleton <[email protected]> wrote:

> There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
> robust and priority inheriting pthread_mutexes on the 'heap'.

we need to go a bit slower. For now i had to remove robust-futexes from
the -rt17 release because they broke normal (non-robust) futex support
in -rt15. A simple mozilla startup would hang... Please send fixes
against -rt16 and i'll try to re-add the robust futexes patch later on.
You can find -rt16 at:

http://people.redhat.com/mingo/realtime-preempt/older/patch-2.6.14-rt16

> The previous patches only supported either file based pthread_mutexes
> or mmapped anonymous memory based pthread_mutexes. This patch allows
> pthread_mutexes to be 'malloc'ed while using the
> PTHREAD_MUTEX_ROBUST_NP attribute or PTHREAD_PRIO_INHERIT attribute.
>
> The patch can be found at:
>
> http://source.mvista.com/~dsingleton

this patch looks much cleaner than the earlier one, but there's one more
step to go: now that we've got the futex_head in every vma, why not hang
all robust futexes to the vma, and thus get rid of ->robust_list and
->robust_sem from struct address_space?

Ingo

2005-11-26 20:51:42

by David Singleton

[permalink] [raw]
Subject: Re: robust futex heap support patch


On Nov 26, 2005, at 5:31 AM, Ingo Molnar wrote:

>
> * david singleton <[email protected]> wrote:
>
>> There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
>> robust and priority inheriting pthread_mutexes on the 'heap'.
>
> we need to go a bit slower. For now i had to remove robust-futexes from
> the -rt17 release because they broke normal (non-robust) futex support
> in -rt15. A simple mozilla startup would hang... Please send fixes
> against -rt16 and i'll try to re-add the robust futexes patch later on.
> You can find -rt16 at:

whoops, sorry.

Here's he piece that broke regular futexes. Futex wake doesn't
need to check to see if the robust list is null or not.

Index: linux-2.6.14/kernel/futex.c
===================================================================
--- linux-2.6.14.orig/kernel/futex.c
+++ linux-2.6.14/kernel/futex.c
@@ -323,10 +323,6 @@ static int futex_wake(unsigned long uadd
ret = get_futex_key(uaddr, &key, &head, &sem);
if (unlikely(ret != 0))
goto out;
- if (head == NULL) {
- ret = -EINVAL;
- goto out;
- }

bh = hash_futex(&key);
spin_lock(&bh->lock);


Let me get the build fixed and add a new test for myself. I'll start
running
this on my desktop to do builds, run mozilla and firefox, and generally
do all my normal work.

David
>
>
> http://people.redhat.com/mingo/realtime-preempt/older/patch-2.6.14-
> rt16
>
>> The previous patches only supported either file based pthread_mutexes
>> or mmapped anonymous memory based pthread_mutexes. This patch allows
>> pthread_mutexes to be 'malloc'ed while using the
>> PTHREAD_MUTEX_ROBUST_NP attribute or PTHREAD_PRIO_INHERIT attribute.
>>
>> The patch can be found at:
>>
>> http://source.mvista.com/~dsingleton
>
> this patch looks much cleaner than the earlier one, but there's one
> more
> step to go: now that we've got the futex_head in every vma, why not
> hang
> all robust futexes to the vma, and thus get rid of ->robust_list and
> ->robust_sem from struct address_space?
>
> Ingo

2005-12-05 22:30:41

by Esben Nielsen

[permalink] [raw]
Subject: Re: robust futex heap support patch

Hi,
I got a little time to look at your current patch (2.6.14-rt21-rf8).
I noticed a problem in futex_wake_robust(). You have a "goto retry" to
solve the following situation:

Task A Task B
takes futex in userspace
Tries to take mutex and sets the
waiting bit in user space
releases futex, notices task B
calls kernel and enters
futex_wake_robust()

retry:
if not owner in rt_mutex
goto retry;
Calls kernel
Makes A owner in rt_mutex
blocks
Leaves retry-loop and
completes the futex wake
operation as normally.


However, if Task A is RT on a UP machine it will go on in the retry loop
forever. Task B will never get the CPU to complete it's kernel-call.

You have probably by manipulating the userspace flag from within the
rt_mutex code :-(

Esben


On Fri, 25 Nov 2005, david singleton wrote:

> There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
> robust and priority inheriting
> pthread_mutexes on the 'heap'.
>
> The previous patches only supported either file based pthread_mutexes
> or mmapped anonymous memory based pthread_mutexes. This patch allows
> pthread_mutexes
> to be 'malloc'ed while using the PTHREAD_MUTEX_ROBUST_NP attribute
> or PTHREAD_PRIO_INHERIT attribute.
>
> The patch can be found at:
>
> http://source.mvista.com/~dsingleton
>
> David
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2005-12-06 04:51:12

by David Singleton

[permalink] [raw]
Subject: Re: robust futex heap support patch


On Dec 5, 2005, at 2:30 PM, Esben Nielsen wrote:

I'm currently trying to close a race condition between futex_wait_robust
and futex_wake_robust that Dave Carlson is seeing on his SMP system.

The scenario is as follows:

Thread A locks an pthread_mutex via the fast path and does not enter
the kernel.

Thread B tries to lock the lock and sees it is already locked. Thread
B sets the
waiters flag in the lock and enters the kernel to lock the lock on
behalf of
thread A and then block on the mutex waiting for it's release.

Thread A unlocks the lock and sees the waiters flag set. Thread A gets
to the futex_wake_robust before Thread B can get to futex_wait_robust.

Thread A sees that it does not own the lock in the kernel and was
returning EINVAL.

patch-2.6.14-rt21-rf8 was a preliminary patch for Dave Carlson to try
and get more information about
the race condition. ( rf8 and rf9 are still returning EAGAIN from
thread B trying to
do the futex_wait_robust and the library should be retrying with
EAGAIN, but it currently isn't).

When I get the race condition closed I'll post the patch and notify
everyone on lkml
and the robustmutexes mailing lists.

David



> Hi,
> I got a little time to look at your current patch (2.6.14-rt21-rf8).
> I noticed a problem in futex_wake_robust(). You have a "goto retry" to
> solve the following situation:
>
> Task A Task B
> takes futex in userspace
> Tries to take mutex and sets the
> waiting bit in user space
> releases futex, notices task B
> calls kernel and enters
> futex_wake_robust()
>
> retry:
> if not owner in rt_mutex
> goto retry;
> Calls kernel
> Makes A owner in rt_mutex
> blocks
> Leaves retry-loop and
> completes the futex wake
> operation as normally.
>
>
> However, if Task A is RT on a UP machine it will go on in the retry
> loop
> forever. Task B will never get the CPU to complete it's kernel-call.
>
> You have probably by manipulating the userspace flag from within the
> rt_mutex code :-(
>
> Esben
>
>
> On Fri, 25 Nov 2005, david singleton wrote:
>
>> There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
>> robust and priority inheriting
>> pthread_mutexes on the 'heap'.
>>
>> The previous patches only supported either file based pthread_mutexes
>> or mmapped anonymous memory based pthread_mutexes. This patch allows
>> pthread_mutexes
>> to be 'malloc'ed while using the PTHREAD_MUTEX_ROBUST_NP attribute
>> or PTHREAD_PRIO_INHERIT attribute.
>>
>> The patch can be found at:
>>
>> http://source.mvista.com/~dsingleton
>>
>> David
>>
>> -
>> To unsubscribe from this list: send the line "unsubscribe
>> linux-kernel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>>
>

2005-12-06 14:06:41

by Esben Nielsen

[permalink] [raw]
Subject: Re: robust futex heap support patch

On Mon, 5 Dec 2005, david singleton wrote:

>
> On Dec 5, 2005, at 2:30 PM, Esben Nielsen wrote:
>
> I'm currently trying to close a race condition between futex_wait_robust
> and futex_wake_robust that Dave Carlson is seeing on his SMP system.
>
> The scenario is as follows:
>
> Thread A locks an pthread_mutex via the fast path and does not enter
> the kernel.
>
> Thread B tries to lock the lock and sees it is already locked. Thread
> B sets the
> waiters flag in the lock and enters the kernel to lock the lock on
> behalf of
> thread A and then block on the mutex waiting for it's release.
>
> Thread A unlocks the lock and sees the waiters flag set. Thread A gets
> to the futex_wake_robust before Thread B can get to futex_wait_robust.
>
> Thread A sees that it does not own the lock in the kernel and was
> returning EINVAL.
>
> patch-2.6.14-rt21-rf8 was a preliminary patch for Dave Carlson to try
> and get more information about
> the race condition. ( rf8 and rf9 are still returning EAGAIN from
> thread B trying to
> do the futex_wait_robust and the library should be retrying with
> EAGAIN, but it currently isn't).
>

*nod*
I just pointed out that you can't make thread A loop the way you do.

What I would do was to do the user space flag checks while having the
raw spinlock of the rt_mutex. That way you are sure that stuff in the
kernel is serialized. But I don't know what to do exactly to do in
there....


> When I get the race condition closed I'll post the patch and notify
> everyone on lkml
> and the robustmutexes mailing lists.

Where can I sign up to that mailing list? I would like to follow the
development, although I don't have much time to contibute.

Esben

>
> David
>
>
>
> > Hi,
> > I got a little time to look at your current patch (2.6.14-rt21-rf8).
> > I noticed a problem in futex_wake_robust(). You have a "goto retry" to
> > solve the following situation:
> >
> > Task A Task B
> > takes futex in userspace
> > Tries to take mutex and sets the
> > waiting bit in user space
> > releases futex, notices task B
> > calls kernel and enters
> > futex_wake_robust()
> >
> > retry:
> > if not owner in rt_mutex
> > goto retry;
> > Calls kernel
> > Makes A owner in rt_mutex
> > blocks
> > Leaves retry-loop and
> > completes the futex wake
> > operation as normally.
> >
> >
> > However, if Task A is RT on a UP machine it will go on in the retry
> > loop
> > forever. Task B will never get the CPU to complete it's kernel-call.
> >
> > You have probably by manipulating the userspace flag from within the
> > rt_mutex code :-(
> >
> > Esben
> >
> >
> > On Fri, 25 Nov 2005, david singleton wrote:
> >
> >> There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
> >> robust and priority inheriting
> >> pthread_mutexes on the 'heap'.
> >>
> >> The previous patches only supported either file based pthread_mutexes
> >> or mmapped anonymous memory based pthread_mutexes. This patch allows
> >> pthread_mutexes
> >> to be 'malloc'ed while using the PTHREAD_MUTEX_ROBUST_NP attribute
> >> or PTHREAD_PRIO_INHERIT attribute.
> >>
> >> The patch can be found at:
> >>
> >> http://source.mvista.com/~dsingleton
> >>
> >> David
> >>
> >> -
> >> To unsubscribe from this list: send the line "unsubscribe
> >> linux-kernel" in
> >> the body of a message to [email protected]
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at http://www.tux.org/lkml/
> >>
> >
>

2005-12-06 15:23:58

by David Singleton

[permalink] [raw]
Subject: Re: robust futex heap support patch


On Dec 6, 2005, at 6:06 AM, Esben Nielsen wrote:

> On Mon, 5 Dec 2005, david singleton wrote:
>
>>
>> On Dec 5, 2005, at 2:30 PM, Esben Nielsen wrote:
>>
>> I'm currently trying to close a race condition between
>> futex_wait_robust
>> and futex_wake_robust that Dave Carlson is seeing on his SMP system.
>>
>> The scenario is as follows:
>>
>> Thread A locks an pthread_mutex via the fast path and does not enter
>> the kernel.
>>
>> Thread B tries to lock the lock and sees it is already locked. Thread
>> B sets the
>> waiters flag in the lock and enters the kernel to lock the lock on
>> behalf of
>> thread A and then block on the mutex waiting for it's release.
>>
>> Thread A unlocks the lock and sees the waiters flag set. Thread A
>> gets
>> to the futex_wake_robust before Thread B can get to futex_wait_robust.
>>
>> Thread A sees that it does not own the lock in the kernel and was
>> returning EINVAL.
>>
>> patch-2.6.14-rt21-rf8 was a preliminary patch for Dave Carlson to try
>> and get more information about
>> the race condition. ( rf8 and rf9 are still returning EAGAIN from
>> thread B trying to
>> do the futex_wait_robust and the library should be retrying with
>> EAGAIN, but it currently isn't).
>>
>
> *nod*
> I just pointed out that you can't make thread A loop the way you do.
>
> What I would do was to do the user space flag checks while having the
> raw spinlock of the rt_mutex. That way you are sure that stuff in the
> kernel is serialized. But I don't know what to do exactly to do in
> there....

Yes, and you are correct. That patch was an attempt at seeing how far
the wake code
had to go to let the waiters in on an SMP machine.
>
>
>> When I get the race condition closed I'll post the patch and notify
>> everyone on lkml
>> and the robustmutexes mailing lists.
>
> Where can I sign up to that mailing list? I would like to follow the
> development, although I don't have much time to contibute.

The mail lists you cc'd will do it, [email protected] and
[email protected].

David

>
> Esben
>
>>
>> David
>>
>>
>>
>>> Hi,
>>> I got a little time to look at your current patch (2.6.14-rt21-rf8).
>>> I noticed a problem in futex_wake_robust(). You have a "goto retry"
>>> to
>>> solve the following situation:
>>>
>>> Task A Task B
>>> takes futex in userspace
>>> Tries to take mutex and sets the
>>> waiting bit in user space
>>> releases futex, notices task B
>>> calls kernel and enters
>>> futex_wake_robust()
>>>
>>> retry:
>>> if not owner in rt_mutex
>>> goto retry;
>>> Calls kernel
>>> Makes A owner in rt_mutex
>>> blocks
>>> Leaves retry-loop and
>>> completes the futex wake
>>> operation as normally.
>>>
>>>
>>> However, if Task A is RT on a UP machine it will go on in the retry
>>> loop
>>> forever. Task B will never get the CPU to complete it's kernel-call.
>>>
>>> You have probably by manipulating the userspace flag from within the
>>> rt_mutex code :-(
>>>
>>> Esben
>>>
>>>
>>> On Fri, 25 Nov 2005, david singleton wrote:
>>>
>>>> There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
>>>> robust and priority inheriting
>>>> pthread_mutexes on the 'heap'.
>>>>
>>>> The previous patches only supported either file based
>>>> pthread_mutexes
>>>> or mmapped anonymous memory based pthread_mutexes. This patch
>>>> allows
>>>> pthread_mutexes
>>>> to be 'malloc'ed while using the PTHREAD_MUTEX_ROBUST_NP attribute
>>>> or PTHREAD_PRIO_INHERIT attribute.
>>>>
>>>> The patch can be found at:
>>>>
>>>> http://source.mvista.com/~dsingleton
>>>>
>>>> David
>>>>
>>>> -
>>>> To unsubscribe from this list: send the line "unsubscribe
>>>> linux-kernel" in
>>>> the body of a message to [email protected]
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> Please read the FAQ at http://www.tux.org/lkml/
>>>>
>>>
>>
>

2005-12-07 01:23:25

by Randy Dunlap

[permalink] [raw]
Subject: Re: robust futex heap support patch

On Tue, 6 Dec 2005 15:06:12 +0100 (MET) Esben Nielsen wrote:

> On Mon, 5 Dec 2005, david singleton wrote:
>
> >
> > On Dec 5, 2005, at 2:30 PM, Esben Nielsen wrote:
> >
> > I'm currently trying to close a race condition between futex_wait_robust
> > and futex_wake_robust that Dave Carlson is seeing on his SMP system.
> >
> > The scenario is as follows:
> >
> > Thread A locks an pthread_mutex via the fast path and does not enter
> > the kernel.
> >
> > Thread B tries to lock the lock and sees it is already locked. Thread
> > B sets the
> > waiters flag in the lock and enters the kernel to lock the lock on
> > behalf of
> > thread A and then block on the mutex waiting for it's release.
> >
> > Thread A unlocks the lock and sees the waiters flag set. Thread A gets
> > to the futex_wake_robust before Thread B can get to futex_wait_robust.
> >
> > Thread A sees that it does not own the lock in the kernel and was
> > returning EINVAL.
> >
> > patch-2.6.14-rt21-rf8 was a preliminary patch for Dave Carlson to try
> > and get more information about
> > the race condition. ( rf8 and rf9 are still returning EAGAIN from
> > thread B trying to
> > do the futex_wait_robust and the library should be retrying with
> > EAGAIN, but it currently isn't).
> >
>
> *nod*
> I just pointed out that you can't make thread A loop the way you do.
>
> What I would do was to do the user space flag checks while having the
> raw spinlock of the rt_mutex. That way you are sure that stuff in the
> kernel is serialized. But I don't know what to do exactly to do in
> there....
>
>
> > When I get the race condition closed I'll post the patch and notify
> > everyone on lkml
> > and the robustmutexes mailing lists.
>
> Where can I sign up to that mailing list? I would like to follow the
> development, although I don't have much time to contibute.

http://lists.osdl.org/mailman/listinfo/robustmutexes/


---
~Randy