2012-06-26 12:10:32

by Anders Kaseorg

[permalink] [raw]
Subject: [PATCH] fifo: Do not restart open() if it already found a partner

If a parent and child process open the two ends of a fifo, and the
child immediately exits, the parent may receive a SIGCHLD before its
open() returns. In that case, we need to make sure that open() will
return successfully after the SIGCHLD handler returns, instead of
throwing EINTR or being restarted. Otherwise, the restarted open()
would incorrectly wait for a second partner on the other end.

The following test demonstrates the EINTR that was wrongly thrown from
the parent’s open(). Change .sa_flags = 0 to .sa_flags = SA_RESTART
to see a deadlock instead, in which the restarted open() waits for a
second reader that will never come.

#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHECK(x) do {if ((x) == -1) {perror(#x); abort();}} while(0)

void handler(int signum) {}

int main()
{
struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
CHECK(sigaction(SIGCHLD, &act, NULL));
CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
for (;;) {
int fd;
pid_t pid;
putc('.', stderr);
CHECK(pid = fork());
if (pid == 0) {
CHECK(fd = open("fifo", O_RDONLY));
_exit(0);
}
CHECK(fd = open("fifo", O_WRONLY));
CHECK(close(fd));
CHECK(waitpid(pid, NULL, 0));
}
}

This is what I suspect was causing the Git test suite to fail in
t9010-svn-fe.sh:
http://bugs.debian.org/678852

Signed-off-by: Anders Kaseorg <[email protected]>
---
fs/fifo.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/fifo.c b/fs/fifo.c
index b1a524d..0d2b53e 100644
--- a/fs/fifo.c
+++ b/fs/fifo.c
@@ -14,7 +14,7 @@
#include <linux/sched.h>
#include <linux/pipe_fs_i.h>

-static void wait_for_partner(struct inode* inode, unsigned int *cnt)
+static bool wait_for_partner(struct inode* inode, unsigned int *cnt)
{
int cur = *cnt;

@@ -23,6 +23,7 @@ static void wait_for_partner(struct inode* inode, unsigned int *cnt)
if (signal_pending(current))
break;
}
+ return cur != *cnt;
}

static void wake_up_partner(struct inode* inode)
@@ -67,8 +68,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
* seen a writer */
filp->f_version = pipe->w_counter;
} else {
- wait_for_partner(inode, &pipe->w_counter);
- if(signal_pending(current))
+ if (!wait_for_partner(inode, &pipe->w_counter))
goto err_rd;
}
}
@@ -90,8 +90,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
wake_up_partner(inode);

if (!pipe->readers) {
- wait_for_partner(inode, &pipe->r_counter);
- if (signal_pending(current))
+ if (!wait_for_partner(inode, &pipe->r_counter))
goto err_wr;
}
break;
--
1.7.11.1


2012-06-26 20:20:20

by Jonathan Nieder

[permalink] [raw]
Subject: Re: [PATCH] fifo: Do not restart open() if it already found a partner

Hi,

Anders Kaseorg wrote:

> The following test demonstrates the EINTR that was wrongly thrown from
> the parent’s open(). Change .sa_flags = 0 to .sa_flags = SA_RESTART
> to see a deadlock instead, in which the restarted open() waits for a
> second reader that will never come.

Nice.

To recap, reading a fifo without a writer (resp. when writing a fifo
without a reader), fifo_open() without O_NONBLOCK waits for the other
end to be opened:

if (!pipe->writers) {
if ((filp->f_flags & O_NONBLOCK)) {
...
} else {
wait_for_partner(inode, &pipe->w_counter);

The wait_for_partner() function waits for the pipe to be opened.
It is interruptible. Inlining a little for clarity[*]:

int cur = pipe->w_counter;

while (cur == pipe->w_counter) {
DEFINE_WAIT(wait);

prepare_to_wait(&pipe->wait, &wait, TASK_INTERRUPTIBLE);
pipe_unlock(pipe);
schedule();
finish_wait(&pipe->wait, &wait);
pipe_lock(pipe);

if (signal_pending(current))
break;
}

In the window while i_mutex is unlocked, it is possible for the fifo
to be opened for writing and closed again. That's fine --- open()
should succeed as long as someone has successfully opened the pipe for
writing. And similarly, if a writer opens the pipe and a signal
arrives, we should let open() succeed. The rest of fifo_open() is
quick and the signal will still be promptly delivered afterwards. So
this looks like a good patch.

Two small details:

1. I wasn't able to get your testcase to reliably fail (on 3.0.36).
Sometimes it would produce EINTR quickly and sometimes it prefers
to spew out dots. Perhaps a note would help avoid confusing people
that want to see if their kernel is affected.

2. The return value convention surprised me a little:

-static void wait_for_partner(struct inode* inode, unsigned int *cnt)
+static bool wait_for_partner(struct inode* inode, unsigned int *cnt)

It would be easier to guess the sense at a glance if it returned an
int (e.g., 0 or -ERESTARTSYS) so the caller could look like

if (wait_for_partner(inode, &pipe->w_counter))
/* wait failed */
...

Documentation/CodingStyle says more about that in chapter 16.

Except for those two details,
Reviewed-by: Jonathan Nieder <[email protected]>

Thanks for a pleasant read.

[*] This is almost the same as

int dummy;
wait_event_interruptible(&pipe->wait, pipe->w_counter != cur, dummy);

except that we keep i_mutex held when placing the current task on the
waitqueue. There's probably some simplification possible, but that's
a story for another day.

2012-06-26 21:04:10

by Anders Kaseorg

[permalink] [raw]
Subject: [PATCH v2] fifo: Do not restart open() if it already found a partner

If a parent and child process open the two ends of a fifo, and the
child immediately exits, the parent may receive a SIGCHLD before its
open() returns. In that case, we need to make sure that open() will
return successfully after the SIGCHLD handler returns, instead of
throwing EINTR or being restarted. Otherwise, the restarted open()
would incorrectly wait for a second partner on the other end.

The following test demonstrates the EINTR that was wrongly thrown from
the parent’s open(). Change .sa_flags = 0 to .sa_flags = SA_RESTART
to see a deadlock instead, in which the restarted open() waits for a
second reader that will never come. (On my systems, this happens
pretty reliably within about 5 to 500 iterations. Others report that
it manages to loop ~forever sometimes; YMMV.)

#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0)

void handler(int signum) {}

int main()
{
struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
CHECK(sigaction(SIGCHLD, &act, NULL));
CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
for (;;) {
int fd;
pid_t pid;
putc('.', stderr);
CHECK(pid = fork());
if (pid == 0) {
CHECK(fd = open("fifo", O_RDONLY));
_exit(0);
}
CHECK(fd = open("fifo", O_WRONLY));
CHECK(close(fd));
CHECK(waitpid(pid, NULL, 0));
}
}

This is what I suspect was causing the Git test suite to fail in
t9010-svn-fe.sh:
http://bugs.debian.org/678852

Signed-off-by: Anders Kaseorg <[email protected]>
---
Changes from v1:
• Change wait_for_partner return from true/false to 0/-ERESTARTSYS.
• Mention that the demo program doesn’t always work for some.
• Remove some unneeded braces from the demo program’s CHECK() macro.

fs/fifo.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/fifo.c b/fs/fifo.c
index b1a524d..cf6f434 100644
--- a/fs/fifo.c
+++ b/fs/fifo.c
@@ -14,7 +14,7 @@
#include <linux/sched.h>
#include <linux/pipe_fs_i.h>

-static void wait_for_partner(struct inode* inode, unsigned int *cnt)
+static int wait_for_partner(struct inode* inode, unsigned int *cnt)
{
int cur = *cnt;

@@ -23,6 +23,7 @@ static void wait_for_partner(struct inode* inode, unsigned int *cnt)
if (signal_pending(current))
break;
}
+ return cur == *cnt ? -ERESTARTSYS : 0;
}

static void wake_up_partner(struct inode* inode)
@@ -67,8 +68,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
* seen a writer */
filp->f_version = pipe->w_counter;
} else {
- wait_for_partner(inode, &pipe->w_counter);
- if(signal_pending(current))
+ if (wait_for_partner(inode, &pipe->w_counter))
goto err_rd;
}
}
@@ -90,8 +90,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
wake_up_partner(inode);

if (!pipe->readers) {
- wait_for_partner(inode, &pipe->r_counter);
- if (signal_pending(current))
+ if (wait_for_partner(inode, &pipe->r_counter))
goto err_wr;
}
break;
--
1.7.11.1

2012-06-26 21:58:30

by Jonathan Nieder

[permalink] [raw]
Subject: Re: [PATCH v2] fifo: Do not restart open() if it already found a partner

Anders Kaseorg wrote:

> Signed-off-by: Anders Kaseorg <[email protected]>
> ---
> Changes from v1:
> • Change wait_for_partner return from true/false to 0/-ERESTARTSYS.
> • Mention that the demo program doesn’t always work for some.
> • Remove some unneeded braces from the demo program’s CHECK() macro.

Yep, I don't have anything left to complain about. Looks obviously
correct to my inexpert eyes.

Reviewed-by: Jonathan Nieder <[email protected]>