LinuxLists.cc - Process-Shared Mutex (futex)

2002-06-06 16:21:02

Subject: Process-Shared Mutex (futex) - What is it good for ?

It is nice to have everything as POSIX says, but how is a process-shared
mutex useful? Imagine two processes using one mutex to lock a shared
memory area. One process takes the lock and then dies (for example, it
goes the SIGSEGV way). The second process could wait for ages (until
reboot?) and would never get lock() on that mutex. Wouldn't it be more
useful to have automatic mutex cleanup after process death? Just clean up
and mark the mutex as 'damaged', so other processes eventually get an
error saying that something went wrong.
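
For comparison, POSIX.1-2008 later standardized "robust" mutexes, which behave much like the cleanup asked for here: the next locker gets EOWNERDEAD instead of blocking forever. The following is only a minimal sketch, assuming a Linux system with a libc that implements pthread_mutexattr_setrobust(); the function name lock_after_owner_death() is made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns what pthread_mutex_lock() reports to a
 * survivor after the previous owner died while holding the lock.
 * Returns -1 if the setup itself fails. */
int lock_after_owner_death(void)
{
    /* The mutex must live in memory that both processes map. */
    pthread_mutex_t *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(m, sizeof *m);
        return -1;
    }
    if (pid == 0) {
        pthread_mutex_lock(m);  /* take the lock ...            */
        _exit(1);               /* ... and die without unlocking */
    }
    waitpid(pid, NULL, 0);

    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* We now hold the lock, but the protected data may be
         * inconsistent. After repairing it, mark the mutex usable. */
        pthread_mutex_consistent(m);
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(m);
    pthread_mutex_destroy(m);
    munmap(m, sizeof *m);
    return rc;
}
```

If the survivor unlocks without calling pthread_mutex_consistent(), later lock attempts fail with ENOTRECOVERABLE, which is close to the 'damaged' marking described above.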

--
Bye,

and have a very nice day !

2002-06-07 08:33:31

by Peter Wächtler

Subject: Re: Process-Shared Mutex (futex) - What is it good for ?

Vladimir Zidar wrote:
> Nice to have everything as POSIX says, but how could process-shared
> mutex be usefull ? Imagine two processes useing one mutex to lock shared
> memory area. One process locks, and then dies (for example, it goes
> sigSEGV way). Second process could wait for ages (untill reboot ?) and
> it won't get lock() on that mutex ever. Wouldn't it be more usefull to
> have automatic mutex cleanup after process death ? Just make a cleanup,
> and mark it as 'damaged', so other processes will eventualy get error
> saying that something went wrong.
>
>

Look at kernel/futex.c in the 2.5 tree.
I vote for killing the "dangling" process, as is done in IRIX.

2002-06-07 20:17:40

by Vladimir Zidar

Subject: Re: Process-Shared Mutex (futex) - What is it good for ?

On Fri, 2002-06-07 at 10:35, Peter Wächtler wrote:
> Vladimir Zidar wrote:
> > Nice to have everything as POSIX says, but how could process-shared
> > mutex be usefull ? Imagine two processes useing one mutex to lock shared
> > memory area. One process locks, and then dies (for example, it goes
> > sigSEGV way). Second process could wait for ages (untill reboot ?) and
> > it won't get lock() on that mutex ever. Wouldn't it be more usefull to
> > have automatic mutex cleanup after process death ? Just make a cleanup,
> > and mark it as 'damaged', so other processes will eventualy get error
> > saying that something went wrong.
> >
> >
>
> Look at kernel/futex.c in 2.5 tree.
> I vote for killing the "dangling" process - like it's done in IRIX.

I don't like killing other processes just for that.

I like the way file locks work, but they have some shortcomings:

1. They work for files only (and consume one file descriptor per lock).
2. They don't work as expected when used from threads (hm, well, what
exactly is expected I don't know, but I don't like how they work).
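
One concrete reading of the thread problem: fcntl() record locks belong to the process, so two contexts inside the same process never exclude each other. A minimal sketch, assuming Linux/POSIX fcntl() locking and a writable /tmp; the function names are made up for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Try to take a non-blocking write lock on the whole file.
 * Returns 0 on success, -1 on failure. */
static int try_write_lock(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 1 if two descriptors in the SAME process are both granted a
 * write lock on the same file, 0 if the second one is refused, and -1
 * on a setup error. */
int same_process_locks_both_granted(void)
{
    char path[] = "/tmp/fcntl-demo-XXXXXX";
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDWR);
    unlink(path);               /* the file lives on until both fds close */
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int first = try_write_lock(fd1);
    int second = try_write_lock(fd2);   /* same process: no conflict */

    close(fd2);
    close(fd1);
    if (first != 0)
        return -1;
    return second == 0 ? 1 : 0;
}
```

Two threads, each with its own descriptor, would see the same result, so the lock cannot serialize them against each other.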

I don't like the way pshared pthread_mutex_t works:
1. They are unnamed.
2. There is no automatic cleanup.

I don't like the way SysV IPC works:
1. The objects are ... well, not exactly *named*, but have some twisted
identifiers generated with ftok() on files, messing with inodes and such,
so that they look like one big kludge.
2. There is a hard limit on how many of them a process can create and use.
3. There is no automatic cleanup.

So I had to invent something myself (or at least pick up the idea from
other OSes (-: ).

Here is what I've implemented so far, for private use. If somebody
is interested, I would be glad to share the source, even to release it
under the GPL.

Process shared, thread shared, named mutexes.

I call them nutexes. Every nutex has a name, creator ownership and
permission bits, much like files, but its name lives somewhere other than
the filesystem namespace.

A nutex works much like a file lock, but it works in a natural way, no
matter which user-space execution context (process, thread, anything
else?) it is called from.

A nutex can hold a read or a write lock, with write locks having higher
priority than read locks.

A nutex is connected to an execution context through one single file
descriptor ("/dev/nutex"), which is opened once from each context at
first access and stored, for example, in a static variable for processes
and with pthread_setspecific()/pthread_getspecific() for threads.
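
The "one descriptor per context" idea can be sketched with thread-specific data. This is only an illustration, not the nutex library: /dev/null stands in for /dev/nutex (which is not generally available), and context_fd() and the other names are hypothetical.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Assumption: stand-in device, since /dev/nutex is not available. */
#define NUTEX_DEVICE "/dev/null"

static pthread_key_t fd_key;
static pthread_once_t fd_key_once = PTHREAD_ONCE_INIT;

/* Runs at thread exit; the slot stores fd + 1 so that fd 0 is not
 * mistaken for "no value" (NULL). */
static void close_context_fd(void *slot)
{
    close((int)(intptr_t)slot - 1);
}

static void make_key(void)
{
    pthread_key_create(&fd_key, close_context_fd);
}

/* Returns this thread's descriptor, opening it on first use.
 * Returns -1 if the device cannot be opened. */
int context_fd(void)
{
    pthread_once(&fd_key_once, make_key);

    void *slot = pthread_getspecific(fd_key);
    if (slot != NULL)
        return (int)(intptr_t)slot - 1;

    int fd = open(NUTEX_DEVICE, O_RDWR);
    if (fd < 0)
        return -1;
    pthread_setspecific(fd_key, (void *)(intptr_t)(fd + 1));
    return fd;
}

static void *report_fd(void *unused)
{
    (void)unused;
    return (void *)(intptr_t)context_fd();
}

/* Helper for demonstration: the descriptor a fresh thread ends up with. */
int context_fd_in_new_thread(void)
{
    pthread_t t;
    void *ret;
    if (pthread_create(&t, NULL, report_fd, NULL) != 0)
        return -1;
    pthread_join(t, &ret);
    return (int)(intptr_t)ret;
}
```

Repeated calls from one thread return the same descriptor, while a new thread opens its own, so the kernel side can tell the contexts apart.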

On that single file descriptor, the caller can open/create as many
nutexes as it likes. There are no hard implementation limits (soft limits
are still to be implemented, today maybe, over a /proc/nutex interface?).

So far, so good.

But now it comes to abnormal program termination. Nutexes do three
different things in three different situations.

When a process terminates, /dev/nutex is automatically closed and all
associated nutexes are automatically unlocked, BUT:

1. If the process was holding a READ lock, nothing special happens.
2. If the process was holding a WRITE lock, the nutex is marked as
'damaged', and every subsequent Lock() from other processes on that nutex
fails with error EPIPE.
3. If the process was the *creator* of the nutex, it is marked as REMOVED,
and all subsequent Lock() attempts from other processes on that nutex
fail with error EIDRM.
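
The three rules above can be written down as a tiny state model. This is a user-space sketch only, not the kernel module: all names are hypothetical, results are returned as negative errno values for brevity, and letting REMOVED win when a dying creator also held a write lock is an assumption the description does not settle.

```c
#include <errno.h>
#include <stdbool.h>

enum nutex_state { NUTEX_OK, NUTEX_DAMAGED, NUTEX_REMOVED };
enum nutex_hold  { HOLD_NONE, HOLD_READ, HOLD_WRITE };

struct nutex_model {
    enum nutex_state state;
};

/* Apply the termination rules for one exiting process. */
void nutex_on_exit(struct nutex_model *n, enum nutex_hold held,
                   bool was_creator)
{
    if (was_creator)
        n->state = NUTEX_REMOVED;               /* rule 3 */
    else if (held == HOLD_WRITE && n->state == NUTEX_OK)
        n->state = NUTEX_DAMAGED;               /* rule 2 */
    /* rule 1: a dying reader changes nothing */
}

/* What a later Lock() from another process would report:
 * 0 on success, -EPIPE for a damaged nutex, -EIDRM for a removed one. */
int nutex_model_lock(const struct nutex_model *n)
{
    switch (n->state) {
    case NUTEX_DAMAGED: return -EPIPE;
    case NUTEX_REMOVED: return -EIDRM;
    default:            return 0;
    }
}

/* Convenience: start from a healthy nutex, let one process die,
 * then report what the next locker sees. */
int lock_result_after_exit(enum nutex_hold held, bool was_creator)
{
    struct nutex_model n = { NUTEX_OK };
    nutex_on_exit(&n, held, was_creator);
    return nutex_model_lock(&n);
}
```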

Nice eh ?

Also, there is an early-stage "/proc/nutex" interface that can show
status much like "/proc/locks" does now.

All of that is implemented as a single kernel module, which registers
/dev/nutex and /proc/nutex at initialization and does all the hard work
over four ioctls.

Anybody interested can contact me for the source by private mail, since
it is not in its final stage yet (two more things to implement) and I
won't post it anywhere today.

So, let me hear the comments... ?

--
Bye,

and have a very nice day !

2002-06-09 09:45:48

by Peter Wächtler

Subject: Re: Process-Shared Mutex (futex) - What is it good for ?

Vladimir Zidar schrieb:
>
> On Fri, 2002-06-07 at 10:35, Peter Wächtler wrote:
> > Vladimir Zidar wrote:
> > > Nice to have everything as POSIX says, but how could process-shared
> > > mutex be usefull ? Imagine two processes useing one mutex to lock shared
> > > memory area. One process locks, and then dies (for example, it goes
> > > sigSEGV way). Second process could wait for ages (untill reboot ?) and
> > > it won't get lock() on that mutex ever. Wouldn't it be more usefull to
> > > have automatic mutex cleanup after process death ? Just make a cleanup,
> > > and mark it as 'damaged', so other processes will eventualy get error
> > > saying that something went wrong.
> > >
> > >
> >
> > Look at kernel/futex.c in 2.5 tree.
> > I vote for killing the "dangling" process - like it's done in IRIX.
>
> I don't like killing other processes just for that.
>

Just for *that*?
Do you write programs that recover from SIGSEGV with sigsetjmp(3)?

> I like the way file locks works. But they have some shortcomings:
>
> 1. they work for files only (and consume one file descriptor per lock)
> 2. they don't work as expected (hm, well, what is exactly expected I
> don't know, but I don't like how they work) when used from threads.
>

Hmh, you don't know what's expected but you like record locks?
Perhaps you mean the silly semantics that all locks are deleted if one
FD is closed?
Well, fcntl is always a system call. The main design goal for futexes
is: if the lock is free, don't enter the kernel. This gives a great
performance benefit on uncontended locks, where only a small number of
threads ever block because the lock is held.
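
The "don't enter the kernel if the lock is free" idea can be sketched as a three-state lock word plus the raw futex system call. This is a minimal sketch for Linux only, assuming atomic_int has the same size and layout as the 32-bit word futex(2) expects; the fmutex names and the syscall counter are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Lock word: 0 = free, 1 = locked, 2 = locked and someone may be waiting. */
typedef struct {
    atomic_int state;
} fmutex;

/* Counts kernel entries, purely to make the fast path visible. */
static atomic_int futex_syscalls;

static long futex(atomic_int *addr, int op, int val)
{
    atomic_fetch_add(&futex_syscalls, 1);
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

void fmutex_lock(fmutex *m)
{
    int c = 0;
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;                     /* fast path: lock was free, no syscall */

    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleep in the kernel only while the word still reads 2. */
        futex(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

void fmutex_unlock(fmutex *m)
{
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        /* The word was 2: there may be waiters, so wake one of them. */
        atomic_store(&m->state, 0);
        futex(&m->state, FUTEX_WAKE, 1);
    }
}

/* Number of futex system calls made by one uncontended lock/unlock. */
int syscalls_for_uncontended_cycle(void)
{
    fmutex m;
    atomic_init(&m.state, 0);
    int before = atomic_load(&futex_syscalls);
    fmutex_lock(&m);
    fmutex_unlock(&m);
    return atomic_load(&futex_syscalls) - before;
}
```

For use between processes the lock word would have to live in shared memory; note that nothing in this sketch notices a dead owner, which is exactly the gap the thread is arguing about.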

Also, POSIX mutexes are designed to work in user space with
PTHREAD_SCOPE_PROCESS. And they are unnamed so that you don't *have* to
provide them in kernel space or use a "lock manager".

The real drawback is: what happens when something goes wrong?
Your system of cooperating threads or processes is screwed up.
Are the programs prepared to deal with that? How could they be?
It reminds me of databases, where you would roll back a transaction... but
there the database can track what is happening.

> I don't like the way pshared pthread_mutex_t works
> 1. they are unnamed
> 2. there is no automatic cleanup
>
> I don't like the way sysv ipc works.
> 1. they are ... well, not exactly *named*, but have some twisted
> identifiers generated with ftok() on files, messing with inodes and such
> that they look like one big kludge.
> 2. theres hard limit on how much of them can process create, use.
> 3. theres no automatic cleanups.
>

SysV semaphores *do* provide cleanup.
Posix semaphores *are* named.
sem_t *sem_open(const char *name, int oflag, ...);
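
The SysV cleanup referred to here is the SEM_UNDO flag of semop(): the kernel reverts a process's semaphore adjustments when that process exits. A minimal sketch, assuming Linux with SysV IPC enabled; the function name is made up for illustration.

```c
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller has to define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* A child takes the semaphore with SEM_UNDO and dies while holding it.
 * Returns the value the parent sees afterwards (1 if the kernel undid
 * the child's decrement), or -1 on a setup error. */
int value_after_holder_death(void)
{
    int id = semget(IPC_PRIVATE, 1, 0600);
    if (id < 0)
        return -1;

    union semun arg = { .val = 1 };
    if (semctl(id, 0, SETVAL, arg) < 0) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    int value = -1;
    pid_t pid = fork();
    if (pid == 0) {
        struct sembuf take = { .sem_num = 0, .sem_op = -1,
                               .sem_flg = SEM_UNDO };
        semop(id, &take, 1);    /* semaphore drops from 1 to 0 */
        _exit(1);               /* die without releasing it    */
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);
        value = semctl(id, 0, GETVAL);
    }

    /* The semaphore set itself stays in the kernel until removed. */
    semctl(id, 0, IPC_RMID);
    return value;
}
```

The undo only restores the count; it gives the next holder no hint that the previous one died in the middle of an update.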

> So I had to invent (or at least to pick idea from other OS-es (-: )
> myself:
>

From which OS do you have your nutex idea?

> Here is what I've implemented so far, for private use, but if somebody
> is interested, I would be glad share the source, even to release it
> under GPL.
>
> Process shared, thread shared, named mutexes.
>
> I call them nutexes. Every nutex has name, creator ownership and
> permission bits much like files, with their names not in fs namespace,
> but rather somewhere else.
>
> Nutex works much like file lock, but it work in natural way, no matter
> from which user-space execution context (process, thread, anything else
> ?) called.
>
> They can hold read or write lock, with write locks having higher
> priority than read ones.
>
> Nutex connection to execution context is over one single file
> descriptor ( "/dev/nutex" ), which is opened once from each context, at
> first access and stored for example, in static variable for processes,
> and for threads with pthread_set/getspecific().
>

Threads share the FDs.
How do unrelated processes access the same nutex?
Do you pass a name via ioctl()?

> On single file descriptor, caller can open/create as much nutexes as it
> likes. There is no hard/implementation limits (soft-limits are to be
> implemented - today maybe, over /proc/nutex interface ?).
>
> So far, so good.
>
> But, now it comes to abnormal program termination. Nutexes do three
> different things in three different situations.
>
> When process terminates, /dev/nutex is automaticaly closed, and all
> associated nutexes are automaticaly unlocked, BUT:
>
> 1. If process was holding READ lock, nothing special happens.
> 2. If process was holding WRITE lock, nutex is marked as 'damaged', and
> every subsequent Lock() from other processes on that nutex will result
> with error EPIPE.
> 3. If process was *creator* of this nutex, it is marked as REMOVED, and
> all subsequent Lock() attempts from other processes on that nutex will
> result with error EIDRM.
>
> Nice eh ?
>

Don't know yet. What's the error message: Something went wrong with the
lock - ask your sysadmin? ;-)

> Also, there is early stage "/proc/nutex" interface, that can show
> status much like "/proc/locks" is doing now.
>
> All that is implemented as single kernel module which registers
> /dev/nutex and /proc/nutex at initialization, and do all hard work over
> four IOCTLs.
>
> Anybody interesed can contact me for source in private mail, since it
> is not in final stage yet (two more things to implement), and I won't
> post it anywhere today.
>

So what should a process do when it discovers that a lock is broken?
Analyse the data and repair it, or give up?
I think the only sensible way is to give up, with a clear
error message that something went severely wrong. I don't like programs
that proceed in the hope that things will heal themselves. Often these
programs tend to fail in such obscure ways that nobody knows what
was going on.

I'm not sure that this offers any advantages over fcntl record locking.
Can you clarify the above issues?

2002-06-10 15:42:55

by Vladimir Zidar

Subject: Re: Process-Shared Mutex (futex) - What is it good for ?

On Sun, 2002-06-09 at 11:49, Peter Wächtler wrote:

> Just for *that*?
> Do you write programs that reveal from sigsegv with sigsetjmp(3)?

No, I do not. But killing the process sounds very much like abnormal
program termination. Can you feel the word 'abnormal'? It is the opposite
of normal, and normal can be as simple as an error condition on a file
descriptor.

> Hmh, you don't know what's expected but you like record locks?
> Perhaps you mean the silly semantics that all locks are deleted if one
> FD is closed?

Yes, it is silly, and just because of that, I cannot implement
two-level file locking within threads and processes.
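
The semantics in question can be observed directly: closing any descriptor for a file drops every fcntl() lock the process holds on that file, even locks taken through a different descriptor. A minimal sketch, assuming Linux/POSIX record locks and a writable /tmp; the function names are made up for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking whole-file write lock. Returns 0 on success. */
static int set_wrlock(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    return fcntl(fd, F_SETLK, &fl);
}

/* Forks a child that tries to lock the file itself.
 * Returns 1 if the child got the lock, 0 if not, -1 on error. */
static int other_process_can_lock(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        _exit((fd >= 0 && set_wrlock(fd) == 0) ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Returns 1 if the lock taken via fd1 was held at first and vanished
 * after an unrelated second descriptor was opened and closed,
 * 0 otherwise, and -1 on a setup error. */
int close_of_second_fd_drops_lock(void)
{
    char path[] = "/tmp/fcntl-close-XXXXXX";
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    int result = -1;
    if (set_wrlock(fd1) == 0) {
        int before = other_process_can_lock(path);    /* lock is held */
        int fd2 = open(path, O_RDONLY);
        if (fd2 >= 0) {
            close(fd2);         /* silently releases the lock from fd1 */
            int after = other_process_can_lock(path);
            if (before >= 0 && after >= 0)
                result = (before == 0 && after == 1);
        }
    }
    close(fd1);
    unlink(path);
    return result;
}
```

This is why a library that opens and closes the same file behind the caller's back can break the caller's locks.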

> Well, fcntl is always a system call. The main design goal for futexes
> is: if the lock is free, don't enter the kernel. This shows
> great performance benefits on uncontended locks where only a smaller
> number of threads block because the lock is held.
>
> Also with posix mutexes they are designed to work in user space with
> PTHREAD_SCOPE_PROCESS. And they are unnamed so that you don't *have* to
> provide them in kernel space or use a "lock manager".

Great for them. They are lucky little bastards, futexes. So what? I
need locking to work nicely between both threads AND processes, and I
don't want to kill any process when something goes wrong. Performance is
not *that* important. More important is a clean implementation of a
healthy idea.
Also, I need them to have *names*, so that separate programs can find
each other.

> The drawback is really: what happens when something went wrong?
> Your system of cooperating threads or processes is skrewed up.
> Are the programs prepared to deal with that? How could they?
> Reminds me of databases where you would rollback a transaction... but
> there the database can track what is happening

When something goes wrong, it is handled the same way you handle an
error condition from a socket or from a file. You do check the return
value of read(), don't you?

>
> SysV semaphores *do* provide cleanup.

But they persist in kernel space until deleted explicitly, or am I wrong?

> Posix semaphores *are* named.
> sem_t *sem_open(const char *name, int oflag, ...);

And they are not implemented under Linux (process-shared), and they also
have a problem when a terminated process was holding them down.

>
> > So I had to invent (or at least to pick idea from other OS-es (-: )
> > myself:
> >
>
> From which OS do you have your nutex idea?

CreateMutex()/WaitForSingleObject()/ReleaseMutex()/CloseHandle() from
win32.
Also I tend to like the idea of named public semaphores in AmigaOS.

> > Nutex connection to execution context is over one single file
> > descriptor ( "/dev/nutex" ), which is opened once from each context, at
> > first access and stored for example, in static variable for processes,
> > and for threads with pthread_set/getspecific().
> >
>
> Threads share the FDs.
> How do unrelated processes access the same nutex?
> Do you pass a name via iotcl()?

Yes. And there is a small library built on top of these few ioctls, so
it basically looks like this:

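    int id;

    /* Open (or create) the nutex by name. */
    id = nutex_open(name, flags..);

    /* Take a read lock; on failure, perror() reports errno. */
    if (nutex_lock(id, nutex_read_lock) < 0) {
        perror("nutex_lock");
    } else {
        /* read ... */
        nutex_unlock(id);
    }

    nutex_close(id);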

> > 1. If process was holding READ lock, nothing special happens.
> > 2. If process was holding WRITE lock, nutex is marked as 'damaged', and
> > every subsequent Lock() from other processes on that nutex will result
> > with error EPIPE.
> > 3. If process was *creator* of this nutex, it is marked as REMOVED, and
> > all subsequent Lock() attempts from other processes on that nutex will
> > result with error EIDRM.
> >
> > Nice eh ?
> >
>
> Don't know yet. What's the error message: Something went wrong with the
> lock - ask your sysadmin? ;-)

At least you get a chance to print a message on the terminal and/or to
log something before you die.

> So what should a process do, when it encounters that a lock is broken?
> Analyse the data and repair it or giving up?
> I think the only sensible way is to give up - with a clear
> error message that something severly went wrong. I don't like programs
> that do proceed in the hope that it will heal itself. Often these
> programs tend to fail in such obscure ways that nobody knows what
> was going on.
>
> I'm not sure if that offers any advantages over fcntl record locking.
> Can you clarify above issues?

The advantage is that it works okay with threads, that it doesn't use
the filesystem namespace, and that it understands how important it is to
report an error condition in the case of a broken write lock.

--
Bye,

and have a very nice day !

2002-06-11 13:17:08

by Peter Wächtler

Subject: Re: Process-Shared Mutex (futex) - What is it good for ?

Vladimir Zidar wrote:
> On Sun, 2002-06-09 at 11:49, Peter Wächtler wrote:
>
>
>>Just for *that*?
>>Do you write programs that reveal from sigsegv with sigsetjmp(3)?
>>
>
> No, I do not. But killing the process sounds much like abnormal
> programm termination. Can you feel the word 'abnormal' ? It is opposite
> of normal - be it simple as error condition on file descriptor.
>

A-prog:                     B-prog:

gets write lock
write some data
                            block on read lock
write some data
crashes
                            wants an error indication to repair data magically

So a crashing A-prog is OK for you, but B should get an indication.
B could catch a signal (SIGLOST?), or the lock call could return -1 with
errno=LOCKBROKEN. That would be possible with futex.

That is a case for writing data to a file - what about linked lists
in memory?

2002-06-11 17:18:03

by Vladimir Zidar

Subject: Re: Process-Shared Mutex (futex) - What is it good for ?

On Tue, 2002-06-11 at 15:19, Peter Wächtler wrote:

>
> A-prog:                     B-prog:
>
> gets write lock
> write some data
>                             block on read lock
> write some data
> crashes
>                             wants an error indication to repair data magically

B-Prog unblocks, but gets -1, with errno=EPIPE.

>
>
> So a crashing A-prog is OK for you, but B should get an indication.
> Could catch a signal (SIGLOST?) returning -1 with errno=LOCKBROKEN
> That would be possible with futex.

Nice if that *would* be possible. But that IS how nutexes already work.

> That is a case for writing data to a file - what about linked lists
> in memory?

Exactly the same. Nutexes are not related to files in any way (other
than the /dev/nutex descriptor, but that's a completely different thing).

--
Bye,

and have a very nice day !