Subject: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

bug description:

flock, lockf, and fcntl do not return even after the signal SIGALRM has
been raised and the signal handler function has been executed.
According to the man pages, the functions should return with
EWOULDBLOCK.


test:

sequence of called functions (start the test in 2 terminal sessions)
1. signal
2. setitimer
3. fopen
4. fileno
5. fcntl with F_WRLCK and F_SETLKW (or flock or lockf)
6. getchar (to hold the lock in the 1st session; now start the 2nd)
In the 2nd session, the file lock function (fcntl) does not return.
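
For readers who want to try the fcntl variant, steps 3 to 5 above might look roughly like the sketch below. The report does not include its source, so the helper name wait_for_write_lock and the choice to lock the whole file are assumptions made for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes fileno() under strict -std= modes */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until a write lock on the whole file
 * behind `fp` is granted (steps 4 and 5 of the sequence).
 * Returns 0 on success, -1 on error with errno set by fcntl().
 */
int wait_for_write_lock(FILE *fp)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;      /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;   /* l_start is relative to the start of the file */
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "through end of file" */

    /* F_SETLKW waits if another process holds a conflicting lock. */
    return fcntl(fileno(fp), F_SETLKW, &fl);
}
```

A caller would fopen() the file, call wait_for_write_lock(), and then wait in getchar() to keep the lock held while the second session is started.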


kernel versions:

2.4.18-64GB-SMP
2.4.21psetlvm
2.6.11.4-21.9-default


please reply or CC to mailto:[email protected]



Dieter Mueller-Wipperfuerth
BOI GmbH.
Spazgasse 4
4040 Linz
Austria


2005-10-12 12:49:01

by Alex Riesen

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

On 10/12/05, "Dieter Müller (BOI GmbH)" <[email protected]> wrote:
> bug description:
>
> flock, lockf, fcntl do not return even after the signal SIGALRM has
> been raised and the signal handler function has been executed
> the functions should return with a return value EWOULDBLOCK as described
> in the man pages

To confirm:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/file.h>
#include <time.h>
#include <signal.h>

/* SIGALRM handler: reports each timer expiry on stderr. */
void alrm(int sig)
{
    write(2, "timeout\n", 8);
}

int main(int argc, char* argv[])
{
    /* First SIGALRM after 10 seconds, then one every 10 seconds. */
    struct itimerval tv = {
        .it_interval = {.tv_sec = 10, .tv_usec = 0},
        .it_value = {.tv_sec = 10, .tv_usec = 0},
    };
    struct itimerval otv;

    signal(SIGALRM, alrm);
    setitimer(ITIMER_REAL, &tv, &otv);
    int fd = open(argv[1], O_RDWR);
    if ( fd < 0 )
    {
        perror(argv[1]);
        return 1;
    }
    printf("locking...\n");
    /* Blocks here if another instance already holds the lock. */
    if ( flock(fd, LOCK_EX) < 0 )
    {
        perror("flock");
        return 1;
    }
    printf("sleeping...\n");
    int ch;
    read(0, &ch, 1);
    close(fd);
    return 0;
}

2005-10-12 13:09:11

by linux-os (Dick Johnson)

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal


On Wed, 12 Oct 2005, Alex Riesen wrote:

> On 10/12/05, "Dieter Müller (BOI GmbH)" <[email protected]> wrote:
>> bug description:
>>
>> flock, lockf, fcntl do not return even after the signal SIGALRM has
>> been raised and the signal handler function has been executed
>> the functions should return with a return value EWOULDBLOCK as described
>> in the man pages
>
> To confirm:
>
> #include <unistd.h>
> #include <sys/time.h>
> #include <sys/file.h>
> #include <time.h>
> #include <signal.h>
>
> void alrm(int sig)
> {
> write(2, "timeout\n", 8);
> }
>
> int main(int argc, char* argv[])
> {
> struct itimerval tv = {
> .it_interval = {.tv_sec = 10, .tv_usec = 0},
> .it_value = {.tv_sec = 10, .tv_usec = 0},
> };
> struct itimerval otv;
>
> signal(SIGALRM, alrm);
> setitimer(ITIMER_REAL, &tv, &otv);
> int fd = open(argv[1], O_RDWR);
> if ( fd < 0 )
> {
> perror(argv[1]);
> return 1;
> }
> printf("locking...\n");
> if ( flock(fd, LOCK_EX) < 0 )
> {
> perror("flock");
> return 1;
> }
> printf("sleeping...\n");
> int ch;
> read(0, &ch, 1);
> close(fd);
> return 0;
> }
> -

Does your 'signal()' implement POSIX or BSD signal semantics? You
don't know. It's whatever the 'C' runtime library was built for. You
need to use sigaction() so you can set the flags that give you the
behavior you intend.
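
As a rough illustration of the point about flags, the sketch below installs a handler through sigaction() with the restart behavior chosen explicitly. The helper name install_handler and its `restart` parameter are made up for this example and do not come from the thread.

```c
#define _DEFAULT_SOURCE   /* exposes struct sigaction under strict -std= modes */
#include <signal.h>
#include <string.h>

/*
 * Hypothetical helper: install `handler` for `signo` with explicit
 * restart semantics instead of whatever signal() defaults to.
 *   restart != 0: SA_RESTART is set, interrupted syscalls are resumed
 *   restart == 0: SA_RESTART is clear, interrupted syscalls fail with EINTR
 * Returns 0 on success, -1 on error with errno set by sigaction().
 */
int install_handler(int signo, void (*handler)(int), int restart)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);             /* block no extra signals in the handler */
    sa.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &sa, NULL);
}
```

With this helper, the test program would call install_handler(SIGALRM, alrm, 0) in place of signal(SIGALRM, alrm).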

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.48 BogoMips).
Warning : 98.36% of all statistics are fiction.

****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [email protected] - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

2005-10-12 14:39:28

by Trond Myklebust

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

On Wed, 12.10.2005 at 14:48 (+0200), Alex Riesen wrote:
> On 10/12/05, "Dieter Müller (BOI GmbH)" <[email protected]> wrote:
> > bug description:
> >
> > flock, lockf, fcntl do not return even after the signal SIGALRM has
> > been raised and the signal handler function has been executed
> > the functions should return with a return value EWOULDBLOCK as described
> > in the man pages

Works for me on a local filesystem.

Desktop$ ./gnurr gnarg
locking...
timeout
timeout
timeout
timeout
timeout

However it is true that it doesn't work over NFSv2/v3. The latter is
probably because we use the synchronous NLM calls which block all
signals during the wait in order to avoid state consistency problems (if
the lock gets granted on server after the client was interrupted, then
the administrator gets to clean up the lock).

We can probably relax this requirement a bit, and rely on the CANCEL
call to get us out of trouble.

Cheers,
Trond

2005-10-12 15:10:29

by Alex Riesen

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

On 10/12/05, Trond Myklebust <[email protected]> wrote:
> On Wed, 12.10.2005 at 14:48 (+0200), Alex Riesen wrote:
> > On 10/12/05, "Dieter Müller (BOI GmbH)" <[email protected]> wrote:
> > > bug description:
> > >
> > > flock, lockf, fcntl do not return even after the signal SIGALRM has
> > > been raised and the signal handler function has been executed
> > > the functions should return with a return value EWOULDBLOCK as described
> > > in the man pages
>
> Works for me on a local filesystem.
>
> Desktop$ ./gnurr gnarg
> locking...
> timeout
> timeout
> timeout
> timeout
> timeout

Doesn't look so. I'd expect "flock: EWOULDBLOCK" and "sleeping" after
the first timeout.

2005-10-12 15:20:39

by linux-os (Dick Johnson)

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal


On Wed, 12 Oct 2005, Alex Riesen wrote:

> On 10/12/05, Trond Myklebust <[email protected]> wrote:
>> On Wed, 12.10.2005 at 14:48 (+0200), Alex Riesen wrote:
>>> On 10/12/05, "Dieter Müller (BOI GmbH)" <[email protected]> wrote:
>>>> bug description:
>>>>
>>>> flock, lockf, fcntl do not return even after the signal SIGALRM has
>>>> been raised and the signal handler function has been executed
>>>> the functions should return with a return value EWOULDBLOCK as described
>>>> in the man pages
>>
>> Works for me on a local filesystem.
>>
>> Desktop$ ./gnurr gnarg
>> locking...
>> timeout
>> timeout
>> timeout
>> timeout
>> timeout
>
> Doesn't look so. I'd expect "flock: EWOULDBLOCK" and "sleeping" after
> the first timeout.

As I told you, you use sigaction(). Also flock() will not block
unless there is another open on the file. The code will run to
your blocking read(), wait 10 seconds, get your "timeout" from
the signal handler, then read() will return with -1 and ERESTARTSYS
in errno as required.

Also, using a 'C' runtime library call like write() in a signal-
handler is a bug.


#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/file.h>
#include <time.h>
#include <signal.h>

void alrm(int sig)
{
    write(2, "timeout\n", 8);
}

int main(int argc, char* argv[])
{
    struct sigaction sa;
    struct itimerval tv = {
        .it_interval = {.tv_sec = 10, .tv_usec = 0},
        .it_value = {.tv_sec = 10, .tv_usec = 0},
    };
    struct itimerval otv;

    /* Start from the current disposition, then set handler and flags. */
    sigaction(SIGALRM, NULL, &sa);
    sa.sa_handler = alrm;
    sa.sa_flags = 0;    /* no SA_RESTART */
    sigaction(SIGALRM, &sa, NULL);

    // signal(SIGALRM, alrm);
    setitimer(ITIMER_REAL, &tv, &otv);
    int fd = open(argv[1], O_RDWR);
    if ( fd < 0 )
    {
        perror(argv[1]);
        return 1;
    }
    printf("locking...\n");
    if ( flock(fd, LOCK_EX) < 0 )
    {
        perror("flock");
        return 1;
    }
    printf("sleeping...\n");
    int ch;
    read(0, &ch, 1);
    close(fd);
    return 0;
}

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.56 BogoMips).
Warning : 98.36% of all statistics are fiction.


2005-10-12 15:37:05

by Michael Kerrisk

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

> From: "linux-os (Dick Johnson)" <[email protected]>
> To: "Alex Riesen" <[email protected]>
> Cc: "Trond Myklebust" <[email protected]>, <[email protected]>,
> "Linux kernel" <[email protected]>
> Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not
> return after timer signal

[...]

> Date: Wed, 12 Oct 2005 11:20:26 -0400
> As I told you, you use sigaction(). Also flock() will not block
> unless there is another open on the file. The code will run to
> your blocking read(), wait 10 seconds, get your "timeout" from
> the signal handler, then read() will return with -1 and ERESTARTSYS
> in errno as required.

I was just trying to write a message to say the same ;-).

> Also, using a 'C' runtime library call like write() in a signal-
> handler is a bug.

But this is not correct. write() is async-signal-safe (POSIX
requires it).

Cheers,

Michael


2005-10-12 15:43:55

by linux-os (Dick Johnson)

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal


On Wed, 12 Oct 2005, Michael Kerrisk wrote:

>> From: "linux-os (Dick Johnson)" <[email protected]>
>> To: "Alex Riesen" <[email protected]>
>> Cc: "Trond Myklebust" <[email protected]>, <[email protected]>,
>> "Linux kernel" <[email protected]>
>> Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not
>> return after timer signal
>
> [...]
>
>> Date: Wed, 12 Oct 2005 11:20:26 -0400
>> As I told you, you use sigaction(). Also flock() will not block
>> unless there is another open on the file. The code will run to
>> your blocking read(), wait 10 seconds, get your "timeout" from
>> the signal handler, then read() will return with -1 and ERESTARTSYS
>> in errno as required.
>
> I was just trying to write a message to say the same ;-).
>
>> Also, using a 'C' runtime library call like write() in a signal-
>> handler is a bug.
>
> But this is not correct. write() is async-signal-safe (POSIX
> requires it).
>

Then tell it to the doom-sayers who always excoriate me when
I use a 'C' runtime library call in test signal code. I have
been told that the __only__ thing you can do in a signal handler
is access global memory and/or execute siglongjmp().

> Cheers,
>
> Michael
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.56 BogoMips).
Warning : 98.36% of all statistics are fiction.


2005-10-12 16:05:15

by Michael Kerrisk

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

> >> Also, using a 'C' runtime library call like write() in a signal-
> >> handler is a bug.
> >
> > But this is not correct. write() is async-signal-safe (POSIX
> > requires it).
>
> Then tell it to the doom-sayers who always excoriate me when
> I use a 'C' runtime library call in test signal code. I have
> been told that the __only__ thing you can do in a signal handler
> is access global memory and/or execute siglongjmp().

Nevertheless, it is not so. What some people complain about is
probably not C runtime library code as such, but the use of
printf() (wrong) instead of write().


From:
http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html#tag_02_04

The following table defines a set of functions that shall
be either reentrant or non-interruptible by signals and
shall be async-signal-safe. Therefore applications may
invoke them, without restriction, from signal-catching
functions:
[...]
write()
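
A minimal sketch of a handler that stays within that table might look as follows. The flag name got_alarm and the errno save/restore are assumptions added for illustration; the handlers posted in this thread only call write().

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler, polled by the main program. */
volatile sig_atomic_t got_alarm = 0;

/*
 * Handler limited to async-signal-safe work: one write() call and a
 * store to a sig_atomic_t flag. printf() is deliberately avoided.
 */
void on_alarm(int sig)
{
    static const char msg[] = "timeout\n";
    int saved_errno = errno;      /* write() may overwrite errno */

    (void)sig;
    if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* nothing safe to do about a failed write here */
    }
    got_alarm = 1;
    errno = saved_errno;          /* keep the interrupted code's errno intact */
}
```

Restoring errno matters when the handler interrupts code that is about to inspect errno after its own failed call.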

Cheers,

Michael


2005-10-12 16:06:41

by Mark Lord

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

linux-os (Dick Johnson) wrote:
> On Wed, 12 Oct 2005, Michael Kerrisk wrote:
>
>>But this is not correct. write() is async-signal-safe (POSIX
>>requires it).
>
> Then tell it to the doom-sayers who always excoriate me when
> I use a 'C' runtime library call in test signal code. I have
> been told that the __only__ thing you can do in a signal handler
> is access global memory and/or execute siglongjmp().

Try "man 2 signal", and read the list of signal-safe functions
given at the bottom of the manpage, from POSIX 1003.1-2003.

write() is included (of course it is, since it is really a
kernel syscall not a library function).

Cheers

2005-10-12 16:37:16

by Trond Myklebust

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

On Wed, 12.10.2005 at 17:10 (+0200), Alex Riesen wrote:

> > Desktop$ ./gnurr gnarg
> > locking...
> > timeout
> > timeout
> > timeout
> > timeout
> > timeout
>
> Doesn't look so. I'd expect "flock: EWOULDBLOCK" and "sleeping" after
> the first timeout.

I would rather expect flock to return with ERESTARTSYS and then for libc
to restart the syscall once the signal handler has finished executing.
A stint with the "strace" utility will show you that this is precisely
what happens.

As Dick and others already pointed out to you, the POSIX function
sigaction() allows you to disable the automatic restarting of the
syscall.
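
To make this concrete, here is a sketch of a lock attempt that gives up after a timeout by installing the SIGALRM handler without SA_RESTART. The helper name lock_file_with_timeout, the millisecond parameter, and the use of O_CREAT are assumptions for illustration only. It is meant for a single-threaded program, and a very short timeout can expire before flock() starts waiting, in which case the call would block indefinitely.

```c
#define _DEFAULT_SOURCE   /* exposes flock() and setitimer() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/file.h>
#include <sys/time.h>
#include <unistd.h>

/* Empty handler: its only job is to interrupt the blocked syscall. */
static void on_timeout(int sig)
{
    (void)sig;
}

/*
 * Hypothetical helper: open `path` and take an exclusive flock() on it,
 * giving up after `timeout_ms` milliseconds (must be > 0).
 * Returns the locked descriptor, or -1 with errno set (EINTR on timeout).
 */
int lock_file_with_timeout(const char *path, long timeout_ms)
{
    struct sigaction sa, old_sa;
    struct itimerval tv;
    int fd, rc, saved_errno;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timeout;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                  /* no SA_RESTART: flock() fails with EINTR */
    sigaction(SIGALRM, &sa, &old_sa);

    memset(&tv, 0, sizeof tv);        /* it_interval stays 0: one-shot timer */
    tv.it_value.tv_sec = timeout_ms / 1000;
    tv.it_value.tv_usec = (timeout_ms % 1000) * 1000;
    setitimer(ITIMER_REAL, &tv, NULL);

    rc = flock(fd, LOCK_EX);
    saved_errno = errno;

    /* Disarm the timer and put the previous handler back. */
    memset(&tv, 0, sizeof tv);
    setitimer(ITIMER_REAL, &tv, NULL);
    sigaction(SIGALRM, &old_sa, NULL);

    if (rc < 0) {
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}
```

Because flock() locks belong to the open file description, a second open() of the same file conflicts with the first even inside one process, which makes the timeout path easy to exercise on a local filesystem.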

Cheers,
Trond

2005-10-12 21:15:59

by Alex Riesen

Subject: Re: blocking file lock functions (lockf,flock,fcntl) do not return after timer signal

linux-os (Dick Johnson), Wed, Oct 12, 2005 17:20:26 +0200:
> >>>> flock, lockf, fcntl do not return even after the signal SIGALRM has
> >>>> been raised and the signal handler function has been executed
> >>>> the functions should return with a return value EWOULDBLOCK as described
> >>>> in the man pages
> >>
> >> Works for me on a local filesystem.
> >>
> >> Desktop$ ./gnurr gnarg
> >> locking...
> >> timeout
> >
> > Doesn't look so. I'd expect "flock: EWOULDBLOCK" and "sleeping" after
> > the first timeout.

It's EINTR, btw.

linux-os (Dick Johnson), Wed, Oct 12, 2005 17:20:26 +0200:
> As I told you, you use sigaction(). Also flock() will not block
> unless there is another open on the file. The code will run to
> your blocking read(), wait 10 seconds, get your "timeout" from
> the signal handler, then read() will return with -1 and ERESTARTSYS
> in errno as required.

Ahh yes, of course. signal(2) places a syscall-restarting handler in glibc.
My bad, sorry.

For the last time:

// everything works as expected, flock returns with EINTR in the
// second instance of the program.
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/file.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>

void alrm(int sig)
{
    write(2, "timeout\n", 8);
}

int main(int argc, char* argv[])
{
    struct itimerval tv = {
        .it_interval = {.tv_sec = 10, .tv_usec = 0},
        .it_value = {.tv_sec = 10, .tv_usec = 0},
    };
    // sa_flags = 0 leaves SA_RESTART off, so flock() is not restarted
    struct sigaction sa = { .sa_handler = alrm, .sa_flags = 0 };
    sigaction(SIGALRM, &sa, NULL);
    setitimer(ITIMER_REAL, &tv, NULL);
    int fd = open(argv[1], O_RDWR);
    if ( fd < 0 ) {
        perror(argv[1]);
        return 1;
    }
    printf("locking...\n");
    if ( flock(fd, LOCK_EX) < 0 ) {
        perror("flock");
        return 1;
    }
    printf("sleeping...\n");
    int ch;
    // the timer keeps firing, so retry read() whenever it is interrupted
    while ( read(0, &ch, 1) < 0 && EINTR == errno )
        ;
    close(fd);
    return 0;
}